Agenda - CIML

The agenda is subject to change. All times listed below are in Pacific Time.

Lesson Material: https://github.com/ciml-org/ciml-summer-institute-2023

Tuesday, June 20 – Preparation Day (virtual)

9:00 am - 9:15 am	1.1 Welcome & Orientation
9:15 am – 9:45 am	1.2 Accounts, Login, Environment, Running Jobs and Logging into Expanse User Portal Robert Sinkovits, Director of Education and Training
9:45 am – 10:30 am	Q&A & Wrap-up

Tuesday, June 27 – HPC, Parallel Concepts

8:00 am - 8:30 am	Light Breakfast & Check-in
8:30 am - 9:30 am	2.1 Welcome and Introductions Mary Thomas, Computational Data Scientist & Director of the CIML Summer Institute
9:30 am - 9:40 am	Break
9:40 am - 11:00 am	2.2 Parallel Computing Concepts Robert Sinkovits, Director of Education and Training We will cover supercomputer architectures, the differences between threads and processes, implementations of parallelism (e.g., OpenMP and MPI), strong and weak scaling, limitations on scalability (Amdahl’s and Gustafson’s Laws) and benchmarking.
11:00 am - 11:10 am	Break
11:10 am - 12:30 pm	2.3 Running Batch Jobs on SDSC Systems Marty Kandes, Computational and Data Science Research Specialist Batch job schedulers are used to manage and fairly distribute the shared resources of high-performance computing (HPC) systems. Learning how to interact with them and compose your work into batch jobs is essential to becoming an effective HPC user.
12:30 pm - 1:30 pm	Lunch
1:30 pm - 2:50 pm	2.4 Data Management and File Systems Mahidhar Tatineni, Director of User Services Managing data efficiently on a supercomputer is important from both the users' and the system's perspectives. We will cover a few basic data management techniques and I/O best practices in the context of the Expanse system at SDSC.
2:50 pm - 3:00 pm	Break
3:00 pm - 4:30 pm	2.5 GPU Computing - Hardware architecture and software infrastructure Andreas Goetz, Research Scientist & Principal Investigator Brief overview of the massively parallel GPU architecture that enables large-scale deep learning applications, and of how to access and use GPUs on SDSC Expanse for ML applications.
4:30 pm - 5:00 pm	Q&A, Wrap-up
5:00 pm - 7:00 pm	Evening Reception - 15th Floor, the Village

Wednesday, June 28 - Scalable Machine Learning

8:00 am - 8:30 am	Light Breakfast & Check-in
8:30 am - 8:40 am	3.1 Quick Welcome
8:40 am - 10:00 am	3.2 Introduction to Singularity: Containers for Scientific and High-Performance Computing Marty Kandes, Computational and Data Science Research Specialist Singularity is an open-source container engine designed to bring operating system-level virtualization to scientific and high-performance computing. With Singularity you can package complex computational workflows (software applications, libraries, and data) in a simple, portable, and reproducible way, which can then be run almost anywhere.
10:00 am - 10:10 am	Break
10:10 am - 12:10 pm	3.3 CONDA Environments and Jupyter Notebook on Expanse: Scalable & Reproducible Data Exploration and ML Peter Rose, Director of Structural Bioinformatics Laboratory Set up reproducible and transferable software environments and scale up calculations to large datasets using parallel computing.
12:10 pm - 1:10 pm	Lunch
1:10 pm - 1:30 pm	3.4 Machine Learning (ML) Overview Mai Nguyen, Lead for Data Analytics Brief review of machine learning concepts.
1:30 pm - 2:25 pm	3.5 R on HPC Paul Rodriguez, Computational Data Scientist A presentation and demo of parallelizing R, plus a case study using several ML tools and R for big data.
2:25 pm - 2:35 pm	Break
2:35 pm - 4:35 pm	3.6 Spark Mai Nguyen, Lead for Data Analytics Introduction to performing machine learning at scale, with hands-on exercises using Spark.
4:35 pm - 5:00 pm	Q&A, Wrap-up

Thursday, June 29 - Deep Learning

8:00 am – 8:30 am	Light Breakfast & Check-in
8:30 am – 8:40 am	4.1 Quick Welcome
8:40 am – 10:00 am	4.2 Introduction to Neural Networks and Convolutional Neural Networks Paul Rodriguez, Computational Data Scientist An overview of the main concepts of neural networks and feature discovery, followed by a basic convolutional neural network for digit recognition using TensorFlow.
10:00 am – 10:10 am	Break
10:10 am – 11:30 am	4.3 Practical Guidelines for Training Deep Learning on HPC Paul Rodriguez, Computational Data Scientist Guidelines on running deep networks on Expanse, such as using TensorBoard, notebooks, and batch jobs, along with some discussion of multinode execution.
11:30 am - 12:30 pm	Lunch
12:30 pm – 1:30 pm	4.4 Deep Learning Layers and Architectures Mai Nguyen, Lead for Data Analytics Overview of deep learning concepts, including layers, architectures, applications, and libraries.
1:30 pm – 1:40 pm	Break
1:40 pm – 3:10 pm	4.5 Deep Learning Transfer Learning Mai Nguyen, Lead for Data Analytics Tutorial and hands-on exercises on the use of transfer learning for efficient training of deep learning models.
3:10 pm – 3:20 pm	Break
3:20 pm – 4:50 pm	4.6 Deep Learning – Special Connections Paul Rodriguez, Computational Data Scientist The architecture of many networks uses paths and connections in flexible ways; we will review gate, skip, and residual connections and build some intuition for what they are good for.
4:50 pm - 5:00 pm	Q&A, Wrap-up

Contact: events@sdsc.edu