Pittsburgh Supercomputing Center › HPC AI and Big Data Group › Neocortex › Webinar - Neocortex Spring 2023 Call for Proposals and System Overview

Webinar - Neocortex Spring 2023 Call for Proposals and System Overview

Presented on Tuesday, February 28, 2023, 2:00 - 3:00 pm (ET), by Paola A. Buitrago, Neocortex, Principal Investigator & Project Director and Director, AI and Big Data, Pittsburgh Supercomputing Center; Claire Zhang, Machine Learning Solutions Engineer at Cerebras Systems Inc.; Dr. Leighton Wilson, HPC Solutions Engineer at Cerebras Systems Inc., and Dr. Dirk Van Essendelft, HPC, AI, and Data Scientist at the National Energy Technology Laboratory.

This webinar presents the upcoming Spring 2023 Call for Proposals (NeocortexSpring2023CFP) and gives a system overview of Neocortex, an AI-specialized NSF-funded supercomputer deployed at PSC/CMU.

For more information about Neocortex, please visit https://www.cmu.edu/psc/aibd/neocortex/.

View Slides

00:00 - Welcome
02:20 - Intro
04:41 - Speakers
05:48 - The Neocortex Program
11:16 - Neocortex System Overview
14:17 - Applications Supported by Neocortex - as of February 2023
19:46 - Spring 2023 Call for Proposals
23:16 - To Learn More and Participate
23:48 - Cerebras CS-2: the AI Compute Engine for Neocortex, Overview
26:26 - Developer Resources
27:29 - CS-2 for Deep Learning
29:11 - ML Software Key Features
30:12 - Topics of interest for ML applications
32:14 - CS-2 for HPC Using the SDK
34:59 - Cerebras SDK
39:40 - Topics of interest for HPC applications
41:02 - Cerebras Recap
42:51 - Using NETL's WFA for Scientific Computing on the WSE2
44:02 - What is the WFA?
47:27 - Near Real Time Scientific Modeling
50:33 - Seeking Beta Testers for Scientific Computing
54:26 - Open Q&A

Q&A

Will the data from users to be processed by cerebras protected?

Data are stored and processed at PSC, and protected like all data on PSC-operated systems. Please contact neocortex@psc.edu for further explanations.

Will the user's data be publicly available? What if I have sensitive data to process and I don't want it to be made public? Does Cerebras keep a copy of the user's data or algorithms that it processes?

All the data and work takes place at PSC, none at Cerebras. Please contact neocortex@psc.edu if you need details about PSC policies.

Do density function theory simulations in computational materials science fall into the general category?

DFT should be possible in the WFA but we do not have kernels built for it. It is on the long-term development list for the WFA.

I can't click on any URLs to save them?

The recording of the webinar and slides will be made available. This is the link to access the CFP webpage.

Great effort! For the model zoo models, are they faithful reproductions of the original models? Have you done experiments to make sure the correctness of those models? Or are they still in some testing phase

We validate all of the models released in the ModelZoo. Sometimes there exist several different implementations of the model, typically we mirror one of those. Please pay attention to the model configurations in yaml files. In our README files we also provide details about model implementation.

Is Neocortex only available for AI projects? E.g., I have a model SMNI that I am ready to apply to fMRI data.

The SDK and WSE presentations will discuss possible non-ML applications.

Would TorchANI work on neocortex https://github.com/aiqm/torchani

It would require some porting in any case, and the ability to port dependst on the types of the models implemented in TorchANI. If the models require kernels outside of the set of existing kernels (typical components of MLPs, transformer-style models), than it won’t be possible to port it today.

If you use Slurm, does your platform permit Arrays?

We have a mainstream Slurm configuration and most of the different features supported by Slurm are permitted in the Neocortex system. There might be some slight differences, but if something doesn’t work as expected, we will be happy to work with you to solve any issues.

Is there more information on how the physical problem is actually mapped onto the machine? We have a problem that had radiative boundary conditions, and is now periodic so octrees that didn’t talk to each other now need to be right next to each other on the machine. How the compilers do this exactly isn’t entirely clear

It's up to you to map the physical problem to the hardware. You have to do that in the SDK or in the lower-level tools in the WFA. Problem mapping is one of the most important parts of how to code the WSE and one thing that is very different than traditional distributed computing.

Does each PE have it's own memory (i.e. SRAM)?

Yes. All of them have 48KB of SRAM.

How can I get or download the slides?

The recording of the webinar and slides will be made available using this webinar page.

I just heard that you only accept Python as a coding language? Does that mean you do not support C (e.g., gcc)?

For the SDK, our *host code* must be Python, at the moment, but C++ support is coming. All the code running on the wafer itself, i.e., the compute kernels, must be written with CSL.

This is for SDK. For ML, you would write code in python and leverage Pytorch or TF.

Are real time control system applications under this architecture better using ML vs HPC?

That depends on what you mean by "better".

s there direct communication between the SRAM of one PE to another SRAM? Or is it a communication between PEs?

The communication is between PEs. Each PE can access its own SRAM and you can program to read data from SRAM of another PE and send it to the target PE via on-chip fabric.

Can one run DFT-based tools like VASP, Quantum espresso etc on this?

Not yet.

Since DFT kernels are not built, do you suggest still applying for access to neocortex?

You could also use the SDK, and in fact, I think many DFT kernels would be a great fit. However, using the SDK will take a significant amount of effort since it’s quite low level.

I assume that any C code I have would have to be rewritten (probably not) or wait until you support C. Would the coming C++ platform be before your next deadline, or when will it likely be available?

Technically, we actually already support C++ host code, but it’s undocumented. By the time we provide system access to the next round of proposals, we will have some documented examples.

We have yet to test integrating into larger, established C++ code bases with other dependencies, etc., so we unfortunately can’t guarantee that existing C++ code bases can integrate CSL/ SDK kernels without running into compilation issues.

Thanks for your answer, if you can point me in the direction of approaches other people have tried I’d appreciate it.

This question has been answered live: [54:52]

Why hex's instead of rectangles?

This question has been answered live: [57:55]

I understand only 32bit floating points are available; are there (non-standard) floating point types that are 32 bits wide but have more mantissa and less exponent available?

This question has been answered live: [58:45]

I have a working Potential Neural Network (to replace DFT) modified from TorchANI and I am looking for a faster computer cluster to train it on, as the size of the data is large.

This question has been answered live: [01:00:32]

What is the recommended way participants are encouraged to allocate their development? (how long is it for)

This question has been answered live: [01:02:37]

HPC AI and Big Data Group

Pittsburgh Supercomputing Center