pub:hpc:hellbender (last modified 2025/09/11 19:57 by bjmfg8)
**Hellbender** is the latest High Performance Computing (HPC) resource available to researchers and students (with sponsorship by a PI) within the UM-System.
  
**Hellbender** consists of 222 mixed x86-64 CPU nodes providing 22,272 cores, as well as 28 GPU nodes with a mix of Nvidia GPUs (see the hardware section for more details). Hellbender is attached to our Research Data Ecosystem ('RDE'), which consists of 8 PB of high-performance and general-purpose research storage. RDE is also accessible from devices outside of Hellbender, creating a single research data location across different computational environments.
  
==== Investment Model ====
  
^ Service                              ^ Rate        ^ Unit         ^ Support        ^
|Hellbender CPU Node | $2,702.00 | Per Node/Year | Year to Year |
|Hellbender GPU Node* | $7,691.38 | Per Node/Year | Year to Year |
|RDE Storage: High Performance | $95.00 | Per TB/Year | Year to Year |
|RDE Storage: General Performance | $25.00 | Per TB/Year | Year to Year |
  
***Update 06/2025**: Additional GPU priority partitions cannot be allocated at this time, as GPU investment has exceeded the 50% threshold. If you require capacity beyond the general pool, we can plan and work with your grant submissions to add additional capacity to Hellbender.
  
  
| Dell C6525 | 112    | 128        | 490 GB        | 1.6 TB          | 14336  | c001-c112  |
  
**The 2025 pricing is: $2,702 per node per year.**
  
==== GPU Node Lease ====
| Dell R750xa | 17     | 64         | 490 GB        | A100 | 80 GB      | 4     | 1.6 TB        | 1088   |
  
***Update 06/2025**: Additional GPU priority partitions cannot be allocated at this time, as GPU investment has exceeded the 50% threshold. If you require capacity beyond the general pool, we can plan and work with your grant submissions to add additional capacity to Hellbender.

**The 2025 pricing is: $7,692 per node per year.**
  
==== Storage: Research Data Ecosystem ('RDE') ====
**__None of the cluster attached storage available to users is backed up in any way by us__**; this means that if you delete something and don't have a copy somewhere else, it is gone. Please note that data stored on cluster attached storage is limited to Data Class 1 and 2 as defined by [[https://www.umsystem.edu/ums/is/infosec/classification-definitions| UM System DCL]]. If you need to store data in DCL3 or DCL4, please contact us so we may find a solution for you.
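Because nothing on cluster attached storage is backed up, it is worth keeping your own copy of anything important off the cluster. A minimal sketch using rsync over SSH (the remote host name and both paths are placeholders for illustration, not actual RSS endpoints):

```shell
# Copy a results directory to an off-cluster machine over SSH.
# "backuphost.example.edu" and both paths are hypothetical -- substitute your own.
rsync -avz --partial \
    /home/$USER/project/results/ \
    backuphost.example.edu:/backups/$USER/project-results/

# Optional: a dry run (-n) afterwards should list nothing left to transfer.
rsync -avzn \
    /home/$USER/project/results/ \
    backuphost.example.edu:/backups/$USER/project-results/
```

Running this periodically (for example from cron on the receiving machine) gives you a recoverable copy if data on the cluster is deleted.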
  
**The 2025 pricing is: General Storage: $25/TB/Year, High Performance Storage: $95/TB/Year**
  
To order storage please fill out our [[https://missouri.qualtrics.com/jfe/form/SV_6zkkwGYn0MGvMyO| RSS Services Order Form]]
==== Software ====
  
Hellbender is built and managed with Puppet. The underlying OS for Hellbender is Alma 8.9. For resource management and scheduling we use the Slurm Workload Manager, version 22.05.11.
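As an illustration of submitting work through Slurm, a minimal batch script might look like the following. The partition name and module name here are assumptions for the sketch; check `sinfo` and `module avail` on Hellbender for the actual values.

```shell
#!/bin/bash
#SBATCH --job-name=example       # job name shown in squeue
#SBATCH --partition=general      # hypothetical partition name
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4        # cores for this task
#SBATCH --mem=8G                 # memory per node
#SBATCH --time=01:00:00          # walltime limit hh:mm:ss

# Load software through the module system, then run the workload.
module load python               # hypothetical module name
srun python my_script.py
```

Save this as a file (e.g. `job.sh`), submit it with `sbatch job.sh`, and monitor it with `squeue -u $USER`.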
  
==== Hardware ====
Dell C6420: 0.5 unit server containing dual 24 core Intel Xeon Gold 6252 CPUs with a base clock of 2.1 GHz. Each C6420 node contains 384 GB DDR4 system memory.
  
Dell R6625: 1 unit server containing dual 128 core AMD EPYC 9754 CPUs with a base clock of 2.25 GHz. Each R6625 node contains 1 TB DDR5 system memory.

Dell R6625: 1 unit server containing dual 128 core AMD EPYC 9754 CPUs with a base clock of 2.25 GHz. Each R6625 node contains 6 TB DDR5 system memory.
  
| **Model**  | **Nodes** | **Cores/Node** | **System Memory** | **CPU**                                  | **Local Scratch**   | **Cores** | **Node Names** |
| Dell R640  | 32        | 40             | 364 GB            | Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz | 100 GB              | 1280      | c113-c145      |
| Dell C6420 | 64        | 48             | 364 GB            | Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz | 1 TB                | 3072      | c146-c209      |
| Dell R6625 | 12        | 256            | 994 GB            | AMD EPYC 9754 128-Core Processor         | 1.5 TB              | 3072      | c210-c221      |
| Dell R6625 | 2         | 256            | 6034 GB           | AMD EPYC 9754 128-Core Processor         | 1.6 TB              | 512       | c222-c223      |
|            |           |                |                   |                                          | Total Cores         | 22272     |                |
  
=== GPU nodes ===
  
| **Model**   | **Nodes** | **Cores/Node** | **System Memory** | **GPU**  | **GPU Memory** | **GPUs** | **Local Scratch** | **Cores** | **Node Names** |
| Dell R750xa | 17        | 64             | 490 GB            | A100     | 80 GB          | 4        | 1.6 TB            | 1088      | g001-g017      |
| Dell XE8640 | 2         | 104            | 2002 GB           | H100     | 80 GB          | 4        | 3.2 TB            | 208       | g018-g019      |
| Dell XE9640 | 1         | 112            | 2002 GB           | H100     | 80 GB          | 8        | 3.2 TB            | 112       | g020           |
Below is the process for setting up a class on the OOD portal.
  
  - Send the class name, the list of students and TAs, and any shared storage requirements to itrss-support@umsystem.edu. This can also be accomplished by filling out our **[[https://missouri.qualtrics.com/jfe/form/SV_6FpWJ3fYAoKg5EO|Hellbender: Course Request Form]]**.
  - We will add the students to the group allowing them access to OOD.
  - If the student does not have a Hellbender account yet, they will be presented with a link to a form to fill out requesting a Hellbender account.
  
**Documentation**: http://docs.nvidia.com/cuda/index.html
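To experiment with CUDA interactively, a GPU can be requested through Slurm and verified from the node. The partition, gres, and module names below are assumptions for illustration, not confirmed Hellbender settings:

```shell
# Request an interactive shell on a GPU node
# (partition and gres names are hypothetical -- check sinfo for real ones).
srun --partition=gpu --gres=gpu:1 --mem=16G --time=00:30:00 --pty bash

# Once on the node, confirm a GPU is visible and check the driver version.
nvidia-smi

# Load the CUDA toolkit through the module system (module name is an assumption).
module load cuda
nvcc --version
```

For batch work, the same `--gres` request goes into an `#SBATCH` line of a job script instead.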

==== RStudio ====

[[https://youtu.be/WuAwXMUYE_Y]]
  
==== Visual Studio Code ====