Predicting and Optimizing Runtime Performance of Deep Learning Models
Abstract
In this tutorial, we introduce techniques for easily identifying GPU underutilization and performance bottlenecks in deep-learning (DL) workloads. We then give a brief introduction to CUDA programming as an example of how such bottlenecks and underutilization are currently addressed. We wrap up by introducing Hidet, a new DL compiler (ASPLOS 2023 paper) that allows rapid development of performant tensor programs in a higher-level language such as Python.
Scope and Objectives
This tutorial has three objectives. First, we will demonstrate how to use modern tools to rapidly profile DNN workloads, an approach you can adopt in your day-to-day research or work. Second, we will cover the basics of the CUDA programming model to provide the background that motivates the new DL compiler Hidet. Third, we will introduce Hidet and demonstrate its expressive power relative to CUDA. By the end of this tutorial, you will have everything you need to get started with Hidet and rapidly develop performant tensor programs.
Agenda
To get the most out of the tutorial, please have the following ready when you attend:
- Bring a laptop with Visual Studio Code installed. Install the Remote-SSH extension, which will be used to launch profiling on a remote workstation.
- (Recommended) Have a remote workstation running Linux with an NVIDIA GPU that you can SSH into. You also need Python and CUDA installed.
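As a quick way to check the Python and CUDA prerequisites above, the following minimal sketch looks for two command-line tools on the `PATH`. It assumes it is run on the machine with the GPU, that `nvidia-smi` (shipped with the NVIDIA driver) and `nvcc` (shipped with the CUDA toolkit) are reasonable indicators of a working setup, and the helper name `check_prerequisites` is hypothetical rather than part of the tutorial materials.

```python
import shutil
import sys


def check_prerequisites(tools=("nvidia-smi", "nvcc")):
    """Map each tool name to its resolved path, or None if it is not on PATH.

    The default tool list is an assumption: nvidia-smi comes with the NVIDIA
    driver and nvcc comes with the CUDA toolkit.
    """
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    # Report the Python version running this script.
    print(f"Python {sys.version.split()[0]}")
    # Report where each tool was found, if at all.
    for tool, path in check_prerequisites().items():
        print(f"{tool}: {path or 'not found on PATH'}")
```

A tool reported as not found may still be installed outside the `PATH` (for example, under a CUDA install directory), so treat the output as a hint rather than a definitive check.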
March 25th, 2023
| Time | Topic | Speaker |
| --- | --- | --- |
| 1:40 – 3:00 | | |
| 3:00 – 3:20 | Brief Introduction to CUDA Programming | Yaoyao Ding |
| 3:20 – 3:40 | Coffee Break | |
| 3:40 – 5:00 | Build Tensor Programs with Hidet in Python | Yaoyao Ding |
Slides
Organizers
Gennady Pekhimenko
Assistant Professor at University of Toronto, CEO at CentML Inc.
Yaoyao Ding
PhD student at University of Toronto, Research SDE at CentML Inc.
Yubo Gao
PhD student at University of Toronto, Research SDE at CentML Inc.
Anand Jayarajan
PhD student at University of Toronto, Chief Software Architect at CentML Inc.