Agenda is subject to change. Times listed below are in Pacific Time.
Lesson Material: https://github.com/ciml-org/ciml-summer-institute-2023
Tuesday, June 20 - Preparation Day (virtual)
9:00 am - 9:15 am | 1.1 Welcome & Orientation |
9:15 am - 9:45 am | 1.2 Accounts, Login, Environment, Running Jobs and Logging into Expanse User Portal |
9:45 am - 10:30 am | Q&A & Wrap-up |
Tuesday, June 27 - HPC, Parallel Concepts
8:00 am - 8:30 am | Light Breakfast & Check-in |
8:30 am - 9:30 am | 2.1 Welcome and Introductions | Mary Thomas, Computational Data Scientist & Director of the CIML Summer Institute |
9:30 am - 9:40 am | Break |
9:40 am - 11:00 am | 2.2 Parallel Computing | Concepts of parallelism (e.g., OpenMP and MPI), strong and weak scaling, limitations on scalability (Amdahl’s and Gustafson’s Laws), and benchmarking. |
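As a companion to the scalability topics above, a minimal illustrative sketch of Amdahl’s Law in Python (not part of the course materials; p is the assumed parallelizable fraction and n the processor count):

    # Illustrative only: Amdahl's Law, speedup = 1 / ((1 - p) + p / n),
    # where p is the parallelizable fraction and n is the number of processors.
    def amdahl_speedup(p: float, n: int) -> float:
        return 1.0 / ((1.0 - p) + p / n)

    # With 95% of the work parallelizable, scaling flattens out quickly:
    for n in (8, 32, 128):
        print(n, round(amdahl_speedup(0.95, n), 1))  # ~5.9, ~12.5, ~17.4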
11:00 am - 11:10 am | Break |
11:10 am - 12:30 pm | 2.3 Running Batch Jobs on SDSC Systems | Marty Kandes, Computational and Data Science Research Specialist. Working with SDSC's high-performance computing (HPC) systems: learning how to interact with them and compose your work into batch jobs. |
12:30 pm - 1:30 pm | Lunch |
1:30 pm - 2:50 pm | 2.4 Data Management and File Systems |
2:50 pm - 3:00 pm | Break |
3:00 pm - 4:30 pm | 2.5 GPU Computing - Hardware Architecture and Software Infrastructure | Andreas Goetz, Research Scientist & Principal Investigator. A brief overview of the massively parallel GPU architecture that enables large-scale deep learning applications, plus access to and use of GPUs on SDSC Expanse for ML applications. |
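A minimal illustrative check (assuming a TensorFlow environment; not part of the course materials) that a GPU is visible from Python on a compute node:

    # Illustrative only: confirm TensorFlow can see the GPU(s) on the current node.
    import tensorflow as tf

    gpus = tf.config.list_physical_devices("GPU")
    print("Visible GPUs:", len(gpus))
    for gpu in gpus:
        print(gpu)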
4:30 pm - 5:00 pm | Q&A, Wrap-up |
5:00 pm - 7:00 pm | Evening Reception - 15th Floor, the Village |
Wednesday, June 28 - Scalable Machine Learning
8:00 am - 8:30 am | Light Breakfast & Check-in |
8:30 am - 8:40 am | 3.1 Quick Welcome |
8:40 am - 10:00 am | 3.2 Introduction to Singularity: Containers for Scientific and High-Performance Computing | Marty Kandes, Computational and Data Science Research Specialist. An introduction to Singularity containers for scientific and high-performance computing. With Singularity you can package complex computational workflows (software applications, libraries, and data) in a simple, portable, and reproducible way that can then be run almost anywhere. |
10:00 am - 10:10 am | Break |
10:10 am - 12:10 pm | 3.3 CONDA Environments and Jupyter Notebook on Expanse: Scalable & Reproducible Data Exploration and ML |
12:10 pm - 1:10 pm | Lunch |
1:10 pm - 1:30 pm | 3.4 Machine Learning (ML) Overview | Mai Nguyen, Lead for Data Analytics |
1:30 pm - 2:25 pm | 3.5 R on HPC |
2:25 pm - 2:35 pm | Break |
2:35 pm - 4:35 pm | 3.6 Spark | Mai Nguyen, Lead for Data Analytics. Introduction to performing machine learning at scale, with hands-on exercises using Spark. |
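For flavor, a minimal illustrative PySpark sketch of fitting a model with Spark's ML library (column names and file path are hypothetical, not taken from the session's exercises):

    # Illustrative only; "x1", "x2", "label", and "data.csv" are placeholder names.
    from pyspark.sql import SparkSession
    from pyspark.ml.feature import VectorAssembler
    from pyspark.ml.classification import LogisticRegression

    spark = SparkSession.builder.appName("ciml-sketch").getOrCreate()
    df = spark.read.csv("data.csv", header=True, inferSchema=True)

    # Assemble numeric feature columns into a single vector, then fit a classifier.
    features = VectorAssembler(inputCols=["x1", "x2"], outputCol="features").transform(df)
    model = LogisticRegression(featuresCol="features", labelCol="label").fit(features)
    print(model.coefficients)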
4:35 pm - 5:00 pm | Q&A, Wrap-up |
Thursday, June 29 - Deep Learning
8:00 am - 8:30 am | Light Breakfast & Check-in |
8:30 am - 8:40 am | 4.1 Quick Welcome |
8:40 am - 10:00 am | 4.2 Introduction to Neural Networks and Convolutional Neural Networks | An overview of the main concepts of neural networks and feature discovery; the basic convolutional neural network for digit recognition using TensorFlow. |
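An illustrative Keras sketch of the kind of small convolutional network used for digit recognition (a generic example, not the session's actual notebook):

    # Illustrative only: a small CNN on the MNIST digits.
    import tensorflow as tf
    from tensorflow.keras import layers, models

    # Load MNIST digits and scale pixel values to [0, 1].
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    x_train = x_train[..., None] / 255.0
    x_test = x_test[..., None] / 255.0

    # conv -> pool -> conv -> pool -> dense softmax
    model = models.Sequential([
        layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    model.fit(x_train, y_train, epochs=1, validation_data=(x_test, y_test))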
10:00 am - 10:10 am | Break |
10:10 am - 11:30 am | 4.3 Practical Guidelines for Training Deep Learning on HPC | Paul Rodriguez, Computational Data Scientist. Guidelines on running deep networks on Expanse, such as using TensorBoard, notebooks, and batch jobs; also some discussion of multi-node execution. |
11:30 am - 12:30 pm | Lunch |
12:30 pm - 1:30 pm | 4.4 Deep Learning Layers and Architectures |
1:30 pm - 1:40 pm | Break |
1:40 pm - 3:10 pm | 4.5 Deep Learning - Transfer Learning | Mai Nguyen, Lead for Data Analytics. Tutorial and hands-on exercises on the use of transfer learning for efficient training of deep learning models. |
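A generic illustrative sketch of the transfer-learning pattern in Keras (freeze a pretrained backbone, train a new classification head); the backbone choice and class count are placeholders, not the tutorial's actual setup:

    # Illustrative only: freeze a pretrained backbone and train a new head.
    import tensorflow as tf
    from tensorflow.keras import layers

    # Pretrained ImageNet backbone with its classification head removed.
    base = tf.keras.applications.MobileNetV2(input_shape=(224, 224, 3), include_top=False, weights="imagenet")
    base.trainable = False  # train only the new head

    # New head for a hypothetical 5-class problem.
    inputs = tf.keras.Input(shape=(224, 224, 3))
    x = tf.keras.applications.mobilenet_v2.preprocess_input(inputs)
    x = base(x, training=False)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(5, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    # model.fit(train_ds, validation_data=val_ds, epochs=5)  # train_ds/val_ds: user-supplied datasets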
3:10 pm - 3:20 pm | Break |
3:20 pm - 4:50 pm | 4.6 Deep Learning - Special Connections | Paul Rodriguez, Computational Data Scientist. Many network architectures use paths and connections in flexible ways; we will review gate, skip, and residual connections and build some intuition about what they are good for. |
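To make the skip/residual connection idea concrete, a minimal illustrative Keras residual block (a generic sketch, not the session's code):

    # Illustrative only: a simple residual (skip) connection, output = relu(F(x) + x).
    import tensorflow as tf
    from tensorflow.keras import layers

    def residual_block(x, filters):
        shortcut = x
        y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        y = layers.Conv2D(filters, 3, padding="same")(y)
        y = layers.Add()([y, shortcut])  # skip connection adds the input back in
        return layers.Activation("relu")(y)

    # Example: apply one residual block to a 32x32 feature map with 64 channels.
    inputs = tf.keras.Input(shape=(32, 32, 64))
    outputs = residual_block(inputs, 64)
    model = tf.keras.Model(inputs, outputs)
    model.summary()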
4:50 pm - 5:00 pm | Q&A, Wrap-up |