Wiki
Collection of ideas and relevant topics in connection with Levante
- usage of GPU-nodes, SLURM integration
- compilers and experiences/best practices with programming models for GPUs
- debugging/profiling tools (Nsight, DDT, etc.)
- A100 for visualisation tasks
- AI workload and environment
Levante system layout
HLRE-4 (Levante) will be installed in two phases:
- mid/end 2021: mainly CPU-nodes (2769 CPU-nodes, 4 GPU-nodes)
- early 2022: addition of 56 GPU-nodes
Each GPU-node contains:
- 2x AMD EPYC 7713 CPUs -> https://www.amd.com/de/products/cpu/amd-epyc-7713
- 4x NVIDIA A100 GPUs (56 nodes with 80 GB GPUs and 4 nodes with 40 GB GPUs) -> https://www.nvidia.com/de-de/data-center/a100/
Details on the system are given in the DKRZ documentation pages.
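As a quick cross-check of the layout figures, the total number of GPU-nodes, GPUs and aggregate GPU memory can be tallied from the numbers listed above. This is only a minimal sketch: the variable names are illustrative, and the result is simple arithmetic on the node counts given here, not an official system specification.

```python
# Figures taken from the layout list above; names are illustrative only.
GPUS_PER_NODE = 4

# GPU memory per device in GB -> number of nodes with that GPU type
gpu_nodes = {80: 56, 40: 4}

total_nodes = sum(gpu_nodes.values())
total_gpus = total_nodes * GPUS_PER_NODE
total_gpu_mem_gb = sum(
    mem_gb * n_nodes * GPUS_PER_NODE for mem_gb, n_nodes in gpu_nodes.items()
)

print(f"GPU-nodes: {total_nodes}")
print(f"GPUs: {total_gpus}")
print(f"Aggregate GPU memory: {total_gpu_mem_gb} GB")
```

With the node counts above this gives 60 GPU-nodes in the final configuration, which matches the two installation phases (4 + 56).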
Collection of online resources to get started with GPU programming
- Extensive list of online presentations, videos and documentation by NVIDIA, ranging from "Intro to GPU programming" to programming models and tools: https://www.gpuhackathons.org/technical-resources
- Recordings from the "Directive Based GPU Programming" workshop at CSCS in 2018: https://www.youtube.com/watch?v=oShf5JIpqNc&list=PL1tk5lGm7zvQfbPoMMoMpIjStFkH1LuX7