Carnegie Mellon University

Webinar - Neocortex: CS-2 Overview

Presented on Tuesday, March 29, 2022, 2:30 - 3:30 pm (ET), by Dr. Natalia Vassilieva, Director of Product, Machine Learning at Cerebras Systems Inc.

This webinar gives an overview of the recent upgrade of Neocortex, an NSF-funded AI supercomputer deployed at PSC, which now features two Cerebras CS-2 systems. To help researchers understand the benefits of the new systems and the changes they bring, we invite you to participate in a virtual overview presentation by Dr. Natalia Vassilieva from Cerebras.

For more information about Neocortex, please visit https://www.cmu.edu/psc/aibd/neocortex/

View slides (will be available soon)

Table of Contents

00:08 - Welcome
01:50 - Code of Conduct
03:17 - CS-2 Overview
06:01 - Cerebras Wafer-Scale Engine 2
07:45 - Cerebras CS-1 and CS-2: Cluster-scale Performance in a Single System
08:48 - The Cerebras Software Platform
10:17 - Execution Mode on CS-1 for DNNs
11:43 - Execution Modes on CS-2 for DNNs
14:15 - Comparing Execution Modes
17:26 - CS-2 advantages for Pipelined
18:08 - Can fit larger models. How much larger?
25:46 - Can fit larger inputs. How much larger?
27:41 - Faster training. How much faster?
32:44 - CS-2 and Weight Streaming advantages
36:45 - Wafer Memory Management
38:56 - No layer partitioning
41:13 - Summary
42:26 - Q&A Session

Please find the recording on the Neocortex Portal

Q&A

Neocortex is now CS-2 only. The storage is on the SDFlex front end, as before.
If the same-sized problem can be decomposed across more processing elements, it will run faster. The larger wafer also makes it possible to run models that could not fit before. Since we do not yet know how usage will change, we cannot predict the timing changes with any certainty. (A rough scaling sketch follows this list of answers.)
Yes, that is right: the software stack handles how the model is mapped, and the additional cores and bandwidth allow this to be done with bigger models.
Yes, that is right.
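
As a rough illustration of the scaling answer above, here is a back-of-the-envelope sketch in Python. The linear-scaling assumption and the unit of "work" are illustrative only; the core counts are the published figures for WSE-1 (~400,000 cores) and WSE-2 (~850,000 cores), and real speedups depend on how well a given model decomposes.

    # Ideal strong scaling: the same-sized problem spread across more
    # processing elements. All workload numbers are illustrative.
    def step_time(total_work, num_elements, rate_per_element=1.0):
        return total_work / (num_elements * rate_per_element)

    work = 1_000_000                    # arbitrary units of compute per step
    t_cs1 = step_time(work, 400_000)    # ~400,000 cores on WSE-1
    t_cs2 = step_time(work, 850_000)    # ~850,000 cores on WSE-2
    print(f"Ideal CS-2 vs CS-1 speedup: {t_cs1 / t_cs2:.2f}x")  # ~2.12x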

In a single-replica setting, weight updates happen at every step (one pass through a batch). In the multi-replica setting, each batch is distributed across all the replicas, and each replica processes its share of the samples sequentially.
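
To make that concrete, here is a minimal NumPy sketch of the two update patterns. This is generic data parallelism, not the Cerebras software stack; the toy linear model and the function names are assumptions for illustration.

    # Single- vs multi-replica updates for one global batch.
    # Generic data parallelism in NumPy, not Cerebras code.
    import numpy as np

    rng = np.random.default_rng(0)
    w = np.zeros(4)                     # toy linear-model weights
    X = rng.normal(size=(8, 4))         # one global batch of 8 samples
    y = rng.normal(size=8)

    def gradient(w, X, y):
        """Mean-squared-error gradient for a linear model."""
        return 2.0 * X.T @ (X @ w - y) / len(y)

    # Single replica: one weight update per step, over the whole batch.
    w_single = w - 0.1 * gradient(w, X, y)

    # Multi-replica: the same batch is split across replicas, each replica
    # processes its shard, and the gradients are averaged so that one
    # global update per step is still applied.
    num_replicas = 4
    grads = [gradient(w, Xs, ys)
             for Xs, ys in zip(np.array_split(X, num_replicas),
                               np.array_split(y, num_replicas))]
    w_multi = w - 0.1 * np.mean(grads, axis=0)

    # With equal shard sizes, the two updates coincide.
    assert np.allclose(w_single, w_multi)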

This question has been answered live: [43:03]

Around 31 million weights.
This question has been answered live: [45:01]
This question has been answered live: [45:51]
This question has been answered live: [47:25]
This question has been answered live: [48:25]
This question has been answered live: [49:50]
This question has been answered live: [51:28]
This question has been answered live: [52:07]
This question has been answered live: [54:40]