Agenda

Agenda is subject to change. Times listed below are in Pacific Time.

Lesson Materials: to be provided closer to the event date

Tuesday, July 29 - Preparation day (virtual)

Pacific time

Session

9:00 AM – 11:00 AM

Preparation Day - Welcome & Orientation    
Andrea Zonca, Lead of Scientific Computing Applications and Chair of the Summer Institute 
  Accounts, Login, Environment, Running Jobs and Logging into Expanse User Portal
Robert Sinkovits, Director of Education and Training, Emeritus
  Q&A wrap up

Monday, August 4

Pacific time

Main Room Session

8:00 AM – 8:30 AM

Check-in & Registration

8:30 AM - 9:30 AM Welcome & Overview
Andrea Zonca, Lead of Scientific Computing Applications and Chair of the Summer Institute 
9:30 AM - 12:00 PM
(break 10:30-10:45 AM)

Data Management: Data Storage, Data Transfers, File Systems

Marty Kandes, Computational and Data Science Research Specialist
Proper data management is essential for the effective use of advanced CI. This session will cover an overview of file systems, data compression, archives (tar files), checksums and MD5 digests, downloading data using wget and curl, and data transfer and long-term storage solutions.
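As a preview of the checksum material, here is a minimal Python sketch (the function name and chunk size are illustrative, not part of the session materials) that computes the MD5 digest of a file, reading in chunks so large files need not fit in memory:

```python
import hashlib

def md5_digest(path, chunk_size=65536):
    """Compute the MD5 hex digest of a file, reading in fixed-size
    chunks so that large files are never loaded whole into memory."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

The resulting hex string can be compared against the output of the `md5sum` command-line tool to verify a transfer.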

12:00 PM - 1:30 PM Lunch
1:30 PM - 3:15 PM

Running Batch and Interactive Jobs 
Mary Thomas, Computational Data Scientist

3:15 PM - 3:30 PM

Break

3:30 PM  - 4:45 PM

Code Migration & Software Environments

Nicole Wolter, Computational and Data Science Research Specialist
Mahidhar Tatineni, Director of User Services

4:45 PM - 5:15 PM Q&A + Wrap-up
5:15 PM - 6:30 PM

Evening Reception

Tuesday, August 5

Pacific time

Main Room Session

8:00 AM – 8:30 AM

Check-in & Light Breakfast

8:30 AM - 10:30 AM Parallel Computing Concepts
Robert Sinkovits, Director of Education and Training, Emeritus
Advanced cyberinfrastructure users, whether they develop their own software or run third-party applications, should understand fundamental parallel computing concepts. Here we cover supercomputer architectures, the differences between threads and processes, implementations of parallelism (e.g., OpenMP and MPI), strong and weak scaling, limitations on scalability (Amdahl’s and Gustafson’s Laws) and benchmarking. We also discuss how to choose the appropriate number of cores, nodes or GPUs when running your applications and, when appropriate, the best balance between threads and processes. This session does not assume any programming experience.
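The scalability limits mentioned above can be made concrete with a short calculation. The sketch below (function names are our own) encodes Amdahl's law for a fixed problem size and Gustafson's law for a problem that grows with the number of workers:

```python
def amdahl_speedup(parallel_fraction, n):
    """Amdahl's law: speedup on n workers when only a fraction of
    the work can be parallelized (fixed problem size)."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n)

def gustafson_speedup(parallel_fraction, n):
    """Gustafson's law: scaled speedup when the problem size grows
    with the number of workers (fixed runtime)."""
    return (1.0 - parallel_fraction) + parallel_fraction * n
```

For example, with 95% of the work parallelizable, Amdahl's law caps the speedup below 20x no matter how many cores are added (1/0.05 = 20), while Gustafson's law predicts near-linear scaled speedup, which is why choosing the right core count depends on how the problem scales.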
10:30 AM  - 10:45 AM

Break

10:45 AM - 12:00 PM High Throughput Computing
Marty Kandes, Computational and Data Science Research Specialist
High-throughput computing (HTC) workloads are characterized by large numbers of small jobs. These frequently involve parameter sweeps where the same type of calculation is done repeatedly with different input values or data processing pipelines where an identical set of operations is applied to many files. This session covers the characteristics and potential pitfalls of HTC, job bundling, the Open Science Grid and the resources available through the Partnership to Advance Throughput Computing (PATh).
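A real HTC workload would go through Slurm or the Open Science Grid, which is beyond a short sketch, but the core pattern of many small, independent jobs over a parameter sweep can be illustrated with Python's standard library (the `simulate` function is a hypothetical stand-in for one real job):

```python
from multiprocessing import Pool

def simulate(x):
    """Hypothetical stand-in for one small, independent job in a
    parameter sweep: same calculation, different input value."""
    return x * x

if __name__ == "__main__":
    parameters = range(8)
    # Each parameter is an independent task, so they can run
    # concurrently with no communication between them.
    with Pool(processes=4) as pool:
        results = pool.map(simulate, parameters)
    print(results)  # one result per input parameter
```

Job bundling on a cluster follows the same logic: group many such small tasks into one batch job so the scheduler handles a few large requests instead of thousands of tiny ones.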

12:00 PM - 1:30 PM

Lunch
1:30 PM – 2:15 PM Getting Help 
Nicole Wolter, Computational and Data Science Research Specialist
Reducing the time and effort needed to address problems related to application performance, batch job submission or data management can minimize frustration and enable users to become more productive. In this session we will cover common problems and best practices for resolving issues.
2:15 PM - 4:30 PM
(break 3:15 PM - 3:30 PM)
Parallel Computing using MPI & OpenMP
Mahidhar Tatineni, Director of User Services
This session is targeted at attendees who are looking for a hands-on introduction to parallel computing using MPI and OpenMP programming. The session will start with an introduction and basic information for getting started with MPI. An overview of the common MPI routines that are useful for beginner MPI programmers, including MPI environment setup, point-to-point communications, and collective communications routines, will be provided. Simple examples illustrating distributed memory computing, with the use of common MPI routines, will be covered. The OpenMP section will provide an overview of constructs and directives for specifying parallel regions, work sharing, synchronization and data scope. Simple examples will be used to illustrate the OpenMP shared-memory programming model and important runtime environment variables. Hands-on exercises for both MPI and OpenMP will be done in C and Fortran.
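The session's exercises are in C and Fortran; as a language-neutral warm-up, the point-to-point send/receive pattern at the heart of MPI can be sketched with Python's standard library (this is an analogy to MPI_Send/MPI_Recv using pipes between processes, not MPI itself; all names are illustrative):

```python
from multiprocessing import Process, Pipe

def worker(conn):
    """Receive a message from the parent (cf. MPI_Recv), transform
    it, and send the result back (cf. MPI_Send)."""
    data = conn.recv()
    conn.send([x * 2 for x in data])
    conn.close()

def exchange(data):
    """One point-to-point round trip between two processes."""
    parent_conn, child_conn = Pipe()
    p = Process(target=worker, args=(child_conn,))
    p.start()
    parent_conn.send(data)   # point-to-point send
    reply = parent_conn.recv()  # point-to-point receive
    p.join()
    return reply

if __name__ == "__main__":
    print(exchange([1, 2, 3]))
```

In real MPI the processes would typically run on different nodes and be addressed by rank, but the send/receive pairing is the same idea.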
4:30 PM - 4:45 PM Q&A + Wrap-up

Wednesday, August 6

Pacific time

Main Room Session

8:00 AM – 8:30 AM

Check-in & Light Breakfast

8:30 AM - 9:30 AM

Knowledge Management 

Subhasis Dasgupta, Computational and Data Researcher
This session will help participants understand knowledge management and how to implement it, specifically within the scientific community. It will also highlight the fundamental shift in the machine learning paradigm and how to incorporate knowledge management into daily processes. This section will cover the basic concepts of knowledge management, from ontology development to document management.

9:30 AM - 12:00 PM
(break 10:30-10:45 AM)
Deep Learning - Part 1
Mai Nguyen, Lead for Data Analytics
Paul Rodriguez, Computational Data Scientist

Deep learning, a subfield of machine learning, has seen tremendous growth and success in the past few years. Deep learning approaches have achieved state-of-the-art performance across many domains, including image classification, speech recognition, and biomedical applications. This session provides an introduction to neural networks and deep learning concepts and approaches. Examples utilizing deep learning will be presented, and hands-on exercises will be covered using Keras. Please note: Knowledge of fundamental machine learning concepts and techniques is required.
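As a taste of what a framework like Keras automates, the sketch below (not Keras code; names are our own) computes the forward pass of a single artificial neuron, the basic unit that deep networks stack into layers:

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of the inputs plus a
    bias term, passed through a sigmoid activation function."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))
```

A layer applies many such neurons to the same inputs, and training consists of adjusting the weights and biases; Keras handles both, along with the gradient computations, automatically.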

12:00 PM - 1:30 PM

Lunch
1:30 PM - 4:30 PM
(break 3:15 PM - 3:30 PM)
Deep Learning – Part 2
Mai Nguyen, Lead for Data Analytics
Paul Rodriguez, Computational Data Scientist

This session continues and extends Deep Learning - Part 1 by going into more advanced examples. Concepts regarding architecture, layers, and applications will be presented. Additionally, more advanced tutorials and hands-on exercises with larger deep convolutional networks and transfer learning will be executed on GPUs. There will also be a chance to learn Keras more in depth and become familiar with building more flexible models.
4:30 PM - 4:45 PM Q&A + Wrap-up

Thursday, August 7

Pacific time

Main Room Session

8:00 AM – 8:30 AM

Check-in & Light Breakfast

8:30 AM - 9:30 AM Best Practices for Scientific Computing 
Fernando Garzon, Computational and Data Science Research Specialist
9:30 AM - 12:00 PM
(break 10:30-10:45 AM)
Performance Tuning
Robert Sinkovits, Director of Education and Training, Emeritus
This session is targeted at attendees who both do their own code development and need their calculations to finish as quickly as possible. We will cover the effective use of cache, loop-level optimizations, strength reduction, optimizing compilers and their limitations, short-circuiting, time-space tradeoffs and more. Exercises will be done mostly in C, but the emphasis will be on general techniques that can be applied in any language.
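Although the exercises are mostly in C, the flavor of loop-level optimization carries over to any language. A minimal sketch in Python (function names are illustrative): hoisting a loop-invariant expression out of the loop and replacing a repeated division with a single multiplication:

```python
import math

def rms_slow(values):
    """Root mean square, recomputing the loop-invariant division
    by len(values) on every iteration."""
    total = 0.0
    for v in values:
        total += v * v / len(values)  # invariant work inside the loop
    return math.sqrt(total)

def rms_fast(values):
    """Same result: the invariant 1/len(values) is hoisted out of
    the loop and the division becomes one multiplication."""
    inv_n = 1.0 / len(values)
    total = 0.0
    for v in values:
        total += v * v
    return math.sqrt(total * inv_n)
```

An optimizing C compiler would often perform this transformation automatically, which is exactly the kind of capability, and limitation, the session examines.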

12:00 PM - 1:30 PM

Lunch
1:30 PM - 4:00 PM
(break 3:15 PM - 3:30 PM)
GPU Computing and Programming
Andreas Goetz, Research Scientist and Principal Investigator
This session introduces massively parallel computing with graphics processing units (GPUs). The use of GPUs is popular across all scientific domains since GPUs can significantly accelerate time to solution for many computational tasks. Participants will be introduced to the essential background of the GPU chip architecture and will learn how to program GPUs via the use of libraries, OpenACC compiler directives, and CUDA programming. The session will incorporate hands-on exercises for participants to acquire the basic skills to use and develop GPU-aware applications.
4:00 PM - 4:15 PM Q&A + Wrap-up
Group Photo

Friday, August 8

Pacific time

Main Room Session

8:00 AM – 8:30 AM

Check-in & Light Breakfast

8:30 AM – 11:00AM

Python for HPC
Andrea Zonca, Lead of Scientific Computing Applications and Chair of the Summer Institute
In this session we will introduce key technologies in the Python ecosystem that provide significant benefits for scientific applications run in supercomputing environments. Previous Python experience is recommended but not required.


(1) First, we will learn how to speed up Python code by compiling it on the fly with numba. (2) Then we will introduce threads, processes and the Global Interpreter Lock, and leverage first numba and then dask to use all available cores on a machine. (3) Finally, we will distribute computations across multiple nodes by launching dask workers in a separate Expanse job.
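The multi-core step can be previewed without numba or dask installed: Python's standard library already offers process-based parallelism that sidesteps the Global Interpreter Lock (the `heavy` function below is a hypothetical stand-in for real CPU-bound work):

```python
from concurrent.futures import ProcessPoolExecutor
import os

def heavy(x):
    """Hypothetical CPU-bound function, standing in for work that
    numba might compile or dask might schedule across workers."""
    return sum(i * i for i in range(x))

if __name__ == "__main__":
    inputs = [10_000] * 8
    # Threads would serialize on the Global Interpreter Lock for
    # pure-Python CPU work; separate processes occupy all cores.
    with ProcessPoolExecutor(max_workers=os.cpu_count()) as ex:
        results = list(ex.map(heavy, inputs))
    print(len(results))
```

dask generalizes this pattern: the same map-over-inputs idea, but with workers that can live on other nodes of the cluster.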

11:00 AM – 11:15 AM Overview of Voyager
Amit Majumdar, Division Director of Data-Enabled Scientific Computing
Voyager provides an innovative system architecture uniquely optimized for deep learning operations using well-established frameworks such as PyTorch and TensorFlow. Voyager comprises 42 training nodes of Supermicro X12 Habana Gaudi Training Servers; each training node contains 8 GAUDI HL-205 training processor cards which have 100 GbE non-blocking, all-to-all connections among the 8 cards within a node; the 42 training nodes are connected via a high-performance, low-latency 400 GbE switch interconnect. Voyager’s architecture has already shown highly scalable AI application performance in various areas such as LLMs (with billions of parameters, such as GPT2-XL and GPT3-XL), convolutional neural network-based image processing, and graph neural network-based high-energy particle physics.
11:15 AM - 11:30 AM

Overview of COSMOS

Mahidhar Tatineni, Director of User Services

11:30 AM - 11:45 AM Overview of the Prototype National Research Platform (PNRP)
Mahidhar Tatineni, Director of User Services
11:45 AM - 12:00 PM

Closing Remarks

Andrea Zonca, Lead of Scientific Computing Applications and Chair of the Summer Institute
Lunch boxes will be provided.