All program content is available in the GitHub repository:
https://github.com/sdsc/sdsc-summer-institute-2023
The agenda is subject to change. All times listed below are Pacific Time.
Wednesday, August 2 (Virtual)
Pacific time |
Session |
9:00 AM – 11:00 AM |
1.0 Preparation Day: Welcome & Orientation; Accounts, Login, Environment, Running Jobs, and Logging into Expanse User Portal; Q&A and wrap-up |
Monday, August 7
Pacific time |
Main Room Session |
8:00 AM – 8:30 AM |
Check-in & Registration |
8:30 AM - 9:30 AM | Welcome |
|
2.1 Parallel Computing Concepts (Robert Sinkovits, Director of Education and Training) |
9:30 AM - 10:15 AM |
2.2 Hardware Overview: All users of advanced cyberinfrastructure (CI) can benefit from a basic understanding of hardware, which helps determine the factors that affect application performance. This session starts with CPUs (processors, cores, hyperthreading, instruction sets), moves on to the anatomy of a compute node (sockets, memory, attached devices, accelerators), and ends with cluster architecture (login and compute nodes, interconnects). We also cover how to obtain hardware information using Linux tools and pseudo-filesystems, along with commonly used tools for monitoring hardware utilization. |
10:15 AM – 10:30 AM |
Break |
10:30 AM – 12:00 PM |
2.3 Intermediate Linux: Effective use of Linux-based compute resources via the command line interface (CLI) can significantly increase researcher productivity. Assuming basic familiarity with the Linux CLI, we cover more advanced concepts with a focus on the Bash shell. Topics include the filesystem hierarchy, file permissions, symbolic and hard links, wildcards and file globbing, finding commands and files, environment variables and modules, configuration files, aliases, history, and tips for effective Bash shell scripting. |
12:00 PM - 1:30 PM | Lunch |
1:30 PM – 2:30 PM |
2.4 Batch Computing |
2:30 PM – 2:45 PM |
Break |
2:45 PM – 3:45 PM |
2.5 Interactive Computing |
3:45 PM - 4:15 PM | Q&A + Wrap-up; SDSC Data Center Tour |
4:30 PM - 6:30 PM | Evening Reception, SDSC Auditorium |
Tuesday, August 8
Pacific time |
Main Room Session |
8:00 AM – 8:30 AM |
Check-in & Light Breakfast |
8:30 AM – 9:00 AM |
3.1 Security |
9:00 AM – 10:00 AM |
3.2 Data Management: Proper data management is essential for the effective use of advanced CI. This session covers file systems, data compression, archives (tar files), checksums and MD5 digests, downloading data using wget and curl, data transfer, and long-term storage solutions. |
10:00 AM – 10:15 AM |
Break |
10:15 AM – 11:00 AM |
3.3 Getting Help |
11:00 AM – 12:00 PM |
3.4 Code Migration |
12:00 PM - 1:30 PM |
Lunch |
1:30 PM – 2:45 PM |
3.5 High Throughput Computing |
2:45 PM - 3:00 PM | Break |
3:00 PM - 4:30 PM |
3.6 Linux Tools for File Processing |
4:30 PM | Q&A + Wrap-up |
Wednesday, August 9
Pacific time |
Main Room Session |
Breakout Room Session |
8:00 AM – 8:30 AM |
Check-in & Light Breakfast |
|
8:30 AM – 10:00 AM |
4.1a Intro to Git & GitHub |
4.1b Advanced Git & GitHub Data Science Research Specialist |
10:00 AM – 10:15 AM |
Break |
|
10:15 AM – 12:30 PM |
4.2a Python for HPC: In this session we will introduce four key technologies in the Python ecosystem that provide significant benefits for scientific applications run in supercomputing environments. Previous Python experience is recommended but not required. |
4.2b Information Visualization Concepts
|
12:30 PM - 2:00 PM | Lunch | |
Group Photo | ||
2:00 PM – 4:30 PM |
4.3a Scientific Visualization for mesh-based data with Unreal Engine 5 |
4.3b Scalable Machine Learning: This session introduces approaches for performing machine learning at scale. It presents tools and procedures for executing machine learning techniques on HPC systems and covers Spark for scalable data analytics and machine learning. Please note: knowledge of fundamental machine learning algorithms and techniques is required. |
4:30 PM | Q&A + Wrap-up |
Thursday, August 10
Pacific time |
Main Room Session |
Breakout Room Session |
8:00 AM – 8:30 AM |
Check-in & Light Breakfast |
|
8:30 AM – 11:00 AM |
5.1a Performance Tuning |
5.1b Deep Learning - Part 1 |
11:00 AM – 11:15 AM |
Break |
|
11:15 AM - 12:15 PM |
5.2 An Introduction to Singularity: Containers for Scientific and High-Performance Computing |
|
12:15 PM - 1:45 PM | Lunch | |
1:45 PM – 4:30 PM |
5.3a GPU Computing and Programming: This session introduces massively parallel computing with graphics processing units (GPUs). GPUs are popular across all scientific domains because they can significantly accelerate time to solution for many computational tasks. Participants will be introduced to essential background on GPU chip architecture and will learn how to program GPUs using libraries, OpenACC compiler directives, and CUDA programming. The session incorporates hands-on exercises so that participants acquire the basic skills to use and develop GPU-aware applications. |
5.3b Deep Learning – Part 2 |
4:30 PM | Q&A + Wrap-up | |
6:00 PM - 8:00 PM |
Evening Dinner |
Friday, August 11
Pacific time |
Main Room Session |
|
8:00 AM – 8:30 AM |
Check-in & Light Breakfast |
|
8:30 AM – 11:30 AM |
6.1a Parallel Computing using MPI & OpenMP |
6.1b A Short Introduction to Data Science and its Applications: The new era of data science is here. Our lives, as well as every field of science, engineering, business, and society, are continuously transformed by our ability to collect meaningful data systematically and turn it into value. These needs not only push for new and innovative capabilities in composable data management and analytical methods that can scale in an anytime, anywhere fashion, but also require methods to bridge the gap between applications and compose such capabilities within solution architectures. |
11:30 AM – 12:15 PM |
6.2 Scaling up Interactive Data Analysis in JupyterLab: From Laptop to HPC |
|
12:15 PM – 12:30 PM |
Closing Remarks (Robert Sinkovits, Director of Education and Training) |