Agenda
Lesson material repository: SDSC Summer Institute 2021
Agenda is subject to change. Times listed below are in Pacific Daylight Time.
Wednesday, July 28
Time (PDT) | Session |
---|---|
9:00 - 11:00 AM | Preparation Session: Navigating online tools, account set-up, log-in, and system access |
Monday, August 2
Time (PDT) | Main Room Session | |
8:00 - 9:00 AM | 1.1. Welcome, Orientation, & Introductions (Main Room) Bob Sinkovits, Director of Scientific Computing, SDSC & Director of the Summer Institute |
|
9:00 - 10:00 AM | 1.2. Accessing and Running Jobs on Expanse (Main Room) Mary Thomas, Computational Data Scientist, SDSC This session covers the basics of accessing Expanse, managing the user environment, and compiling and running jobs. It is assumed that you have completed the basic steps of logging on to Expanse and refreshed your Unix skills prior to the event. |
|
10:00 - 10:30 AM | 1.3. Expanse User Portal (Main Room) Subhashini Sivagnanam, Senior Computational and Data Science Specialist, SDSC |
|
10:30 - 10:45 AM | Break |
||
Main Room Session | Breakout Room Session | |
10:45 AM - 12:45 PM |
1.4a. Introduction to version control with git and GitHub
|
1.4b. Advanced GitHub
|
12:45 - 1:15 PM | 30-minute lunch/break |
||
1:15 - 2:00 PM | 1.5. Understanding Performance and Obtaining Hardware Information (Main Room) Bob Sinkovits, Director of Scientific Computing, SDSC & Director of the Summer Institute |
Tuesday, August 3
AM Session | Main Room Session | Breakout Room Session |
8:00 - 10:45 AM
|
2.1a. Python for HPC
In this session we will introduce four key technologies in the Python ecosystem that provide significant benefits for scientific applications run in supercomputing environments. Previous Python experience is recommended but not required. |
2.1b. A Short Introduction to Data Science and its Applications Ilkay Altintas, Chief Data Science Officer, SDSC Shweta Purawat, Computational and Data Researcher, SDSC
The new era of data science is here. Our lives as well as any field of science, engineering, business, and society are continuously transformed by our ability to collect meaningful data in a systematic fashion and turn that into value. These needs not only push for new and innovative capabilities in composable data management and analytical methods that can scale in an anytime anywhere fashion, but also require methods to bridge the gap between applications and compose such capabilities within solution architectures.
|
15-minute break, timing at the instructor's discretion | ||
10:45 - 11:15 AM | 30-minute lunch/break |
||
PM Session | Main Room Session | Breakout Room Session |
11:15 AM - 2:00 PM
|
2.2a. Performance Tuning
|
2.2b. Information Visualization Concepts
This tutorial will provide a ground-up understanding of information visualization concepts and how they can be leveraged to select and use effective visual idioms for different data types (such as spreadsheet data, geospatial, graph, etc.). Example visualization designs and ways to fix problems with existing visualizations will be discussed, along with practical rules of thumb for visualization. |
15-minute break, timing at the instructor's discretion |
Wednesday, August 4
AM Session | Main Room Session | Breakout Room Session |
8:00 - 10:45 AM
|
3.1a. Scientific Visualization for Mesh-Based Data with VisIt
This tutorial will provide a high-level overview of scientific visualization techniques and their applicability to structured mesh-based data (such as rectilinear grids). Attendees will follow along with hands-on exercises that apply different techniques using the VisIt software and will also perform remote visualization on the Expanse cluster. |
3.1b. Scalable Machine Learning Mai Nguyen, Lead for Data Analytics, SDSC Paul Rodriguez, Research Analyst, SDSC
From scientific domains to social media analytics, the data that needs to be analyzed has become massive and complex. This session introduces approaches that can be used to perform machine learning at scale. Tools and procedures for executing machine learning techniques on HPC will be presented. Spark will also be covered. In particular, we will use Spark's machine learning library, MLlib, to demonstrate how distributed computing can be used to provide scalable machine learning. Please note: Knowledge of fundamental machine learning algorithms and techniques is required. |
15-minute break, timing at the instructor's discretion | ||
10:45 - 11:15 AM | 30-minute lunch/break |
||
PM Session | Main Room Session | |
11:15 AM - 2:00 PM |
Group photo, followed by 3.2. Lightning Rounds |
|
15-minute break, timing at the instructor's discretion |
Thursday, August 5
AM Session | Main Room Session | Breakout Room Session |
8:00 - 10:45 AM
|
4.1a. GPU Computing and Programming |
4.1b. Deep Learning (part 1) Mai Nguyen, Lead for Data Analytics, SDSC Paul Rodriguez, Research Analyst, SDSC
Deep learning, a subfield of machine learning, has seen tremendous growth and success in the past few years. Deep learning approaches have achieved state-of-the-art performance across many domains, including image classification, speech recognition, and biomedical applications. Deep learning makes use of models that are composed of many layers of interconnected processing units. The many layers allow for a deep network to learn representations of data at multiple and increasingly complex and task-specific levels of abstraction, leading to automatic feature learning and excellent prediction performance. This session provides an introduction to deep learning concepts and approaches. Case studies utilizing deep learning will be presented, and hands-on exercises will be covered using Keras. Please note: Knowledge of fundamental machine learning concepts and techniques is required.
|
15-minute break, timing at the instructor's discretion | ||
10:45 - 11:15 AM | 30-minute lunch/break |
||
PM Session | Main Room Session | Breakout Room Session |
11:15 AM - 2:00 PM |
4.2a. Parallel Computing using MPI & OpenMP |
4.2b. Deep Learning (part 2) Mai Nguyen, Lead for Data Analytics, SDSC Paul Rodriguez, Research Analyst, SDSC
Deep learning, a subfield of machine learning, has seen tremendous growth and success in the past few years. Deep learning approaches have achieved state-of-the-art performance across many domains, including image classification, speech recognition, and biomedical applications. Deep learning makes use of models that are composed of many layers of interconnected processing units. The many layers allow for a deep network to learn representations of data at multiple and increasingly complex and task-specific levels of abstraction, leading to automatic feature learning and excellent prediction performance. This session provides an introduction to deep learning concepts and approaches. Case studies utilizing deep learning will be presented, and hands-on exercises will be covered using Keras. Please note: Knowledge of fundamental machine learning concepts and techniques is required.
|
15-minute break, timing at the instructor's discretion |
Friday, August 6
Time | Main Room Session | |
8:30 - 9:00 AM |
5.1. An Introduction to Singularity: Containers for Scientific and High-Performance Computing Martin Kandes, Computational & Data Science Research Specialist, SDSC |
|
9:00 - 9:30 AM |
5.2. Data sharing via SeedMeLab Amit Chourasia, Senior Visualization Scientist, SDSC |
|
9:30 - 10:00 AM | 5.3. Open Science Chain, Protecting Data Integrity with Open Science Chain Subhashini Sivagnanam, Senior Computational and Data Science Specialist, SDSC & Manu Shantharam, Senior Computational Scientist, SDSC |
|
10:00 - 11:00 AM |
Introduction to new projects/special topics (30 minutes each):
|
|
11:00 - 11:15 AM | Break |
||
11:15 - 11:45 AM | Introduction to new projects/special topics (30 minutes each):
|
|
11:45 AM - 12:00 PM | Adjourn: Wrap-up, and thank you for joining us! Bob Sinkovits, Director of Scientific Computing, SDSC & Director of the Summer Institute |
events@sdsc.edu