Hellbender

Request an account

You can request an account on Hellbender by filling out the form found at https://request.itrss.umsystem.edu

System Information

Maintenance

Regular maintenance is scheduled for the 2nd Tuesday of every month. Jobs will run if scheduled to complete before the window begins, and jobs will start once maintenance is complete.

Services

Order RSS Services by filling out the form found at https://missouri.qualtrics.com/jfe/form/SV_6zkkwGYn0MGvMyO

RSS offers:

  • HPC Compute: Compute Node
  • HPC Compute: GPU Node
  • RDE Storage: General storage
  • RDE Storage: High performance storage
  • Software

Add People to Existing Account(s)

Add users to existing compute (SLURM), storage, or software groups by filling out the form found at https://missouri.qualtrics.com/jfe/form/SV_9LAbyCadC4hQdBY

Software

Hellbender was built and is managed with Puppet. The underlying OS for Hellbender is Alma 8.9. For resource management and scheduling we use the Slurm workload manager, version 22.05.11.

Hardware

Management nodes

The head nodes and login nodes are virtual, which is one of the key differences from the previous-generation cluster, Lewis.

Compute nodes

Dell C6525: half-rack-unit (0.5U) servers, each containing dual 64-core AMD EPYC Milan 7713 CPUs with a base clock of 2 GHz and a boost clock of up to 3.675 GHz. Each C6525 node contains 512 GB of DDR4 system memory.

Model       CPU Cores   System Memory   Node Count   Local Scratch   Total Core Count
Dell C6525  128         512 GB          112          1.6 TB          14336

GPU nodes

Model        CPU Cores   System Memory   GPU    GPU Memory   GPU Count   Local Scratch   Node Count
Dell XE8640  104         2048 GB         H100   80 GB        4           3.2 TB          2
Dell R740xa  64          356 GB          A100   80 GB        4           1.6 TB          17

A specially formatted sinfo command can be run on Hellbender to report live information about the nodes and the hardware/features they have.

  sinfo -o "%5D %4c %8m %28f %35G" 

Investment Model

Overview

The newest High Performance Computing (HPC) resource, Hellbender, has been provided through a partnership with the Division of Research Innovation and Impact (DRII) and is intended to work in conjunction with DRII policies and priorities. This outline defines how fairshare, general access, priority access, and researcher contributions are handled for Hellbender. HPC has been identified as a continually growing need for researchers, so DRII has invested in Hellbender as an institutional resource. This investment is intended to increase ease of access to these resources, provide cutting-edge technology, and grow the pool of resources available.

Fairshare

To understand how general access and priority access differ, fairshare must first be defined. Fairshare is an algorithm used by the scheduler to assign priority to jobs in a way that gives every user a fair chance at the available resources. For any given job waiting in the queue, the algorithm considers several metrics, such as job size, wait time, current and recent usage, and individual user priority levels. Administrators can tune the fairshare algorithm to adjust how it determines which jobs run next once resources become available.
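As a rough sketch of the idea (a simplification for illustration, not Slurm's actual multifactor implementation), a classic fairshare factor decays exponentially as a user's recent usage grows relative to their assigned share:

```python
# Simplified sketch of a fairshare-style priority factor, loosely
# modeled on the classic formula F = 2^(-usage/shares).
# Illustration only -- not the scheduler's real implementation.

def fairshare_factor(normalized_usage: float, normalized_shares: float) -> float:
    """Return a factor in [0, 1]: 1.0 for an idle user, 0.5 when usage
    exactly matches the assigned share, and approaching 0 as usage
    far exceeds the share."""
    if normalized_shares <= 0:
        return 0.0
    return 2.0 ** (-normalized_usage / normalized_shares)

# An idle user gets the full factor; a heavy user is deprioritized.
print(fairshare_factor(0.0, 0.25))   # 1.0
print(fairshare_factor(0.25, 0.25))  # 0.5
```

Jobs with a higher factor are favored when the scheduler picks what runs next, which is how heavy recent usage pushes a user's pending jobs back in the queue.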

Resources Available to Everyone: General Access

General access is open to any research or teaching faculty, staff, and students from any UM System campus. General access is defined as open access to all resources available to users of the cluster at an equal fairshare value; all users have the same level of access to the general resource. Research users of the general access portion of the cluster will be given the RDE Standard Allocation to operate from. Larger storage allocations are provided through RDE Advanced Allocations and are independent of HPC priority status.

Hellbender Advanced: Priority Access

When researcher needs are not being met at the general access level, researchers may request an advanced allocation on Hellbender to gain priority access. Priority access gives a research group a limited set of resources that are available to them without competition from general access users. Priority access is provided through a priority partition containing a specific set of hardware; this partition is created for, and limited to, the user and their associated group. These resources also remain in an overlapping pool available to general access users. The pool is administered such that if a priority access user submits jobs to their priority partition, any jobs running on those resources from the overlapping partition are requeued: they begin execution again on another resource in that partition if one is available, or return to wait in the queue. Priority access users retain general access status, and fairshare still moderates their access to the general resource. Fairshare inside a priority partition determines which user's jobs are selected for execution next inside that partition. Jobs running inside a priority partition also affect the user's fairshare calculation for the general access partition, meaning that running a large number of jobs inside a priority partition lowers a user's priority for the general resources as well.

Priority Designation

Hellbender Advanced Allocations are eligible for DRII Priority Designation. This means that DRII has determined the proposed use case (such as a core or grant-funded project) presents a strategic advantage or high priority service to the university. In this case, DRII fully subsidizes the resources used to create the Advanced Allocation.

Traditional Investment

Hellbender Advanced Allocation requests that are not approved for DRII Priority Designation may be treated as traditional investments with the researcher paying for the resources used to create the Advanced Allocation at the defined rate. These rates are subject to change based on the determination of DRII, and hardware costs.

Resource Management

Information Technology Research Support Solutions (ITRSS) will procure, set up, and maintain the resource. ITRSS will work in conjunction with MU Division of Information Technology and Facility Services to provide adequate infrastructure for the resource.

Resource Growth

Priority access resources will generally be made available from existing hardware in the general access pool and the funds will be retained for a future time to allow a larger pool of funds to accumulate for expansion of the resource. This will allow the greatest return on investment over time. If the general availability resources are less than 50% of the overall resource, an expansion cycle will be initiated to ensure all users will still have access to a significant amount of resources. If a researcher or research group is contributing a large amount of funding, it may trigger an expansion cycle if that is determined to be advantageous at the time of the contribution.

Benefits of Investing

The primary benefit of investing is receiving “shares” and a priority access partition for you or your research group. Shares are used to calculate the percentage of the cluster owned by an investor. As long as investors have used less than they own, they can use their shares to get higher priority in the general queue. FairShare is by far the largest factor in queue placement and wait times.

Investors will be granted Slurm accounts to use in order to charge their investment (FairShare). These accounts can contain the same members of a POSIX group (storage group) or any other set of users at the request of the investor.

To use an investor account in an sbatch script, use:

#SBATCH --account=<investor account>
#SBATCH --partition=<investor partition>                          # for CPU jobs
#SBATCH --partition=<investor partition>-gpu --gres=gpu:A100:1    # requests 1 A100 GPU, for GPU jobs

To use a QOS in an sbatch script, use:

#SBATCH --qos=<qos>
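Putting the pieces together, a minimal job script charging an investment might look like the following sketch. The account, partition, and QOS names are hypothetical placeholders; substitute the ones you were given when your investment was set up:

```shell
#!/bin/bash
#SBATCH --job-name=invest_example
#SBATCH --account=rss_lab            # hypothetical investor account name
#SBATCH --partition=rss_lab          # hypothetical investor CPU partition
#SBATCH --qos=rss_lab_qos            # hypothetical QOS, only if one was assigned
#SBATCH --ntasks=4
#SBATCH --time=0-01:00:00

echo "running under the investor account"
```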

HPC Pricing

The HPC Service is available at any time at the following rates for year 2023-2024:

Service                   Rate        Unit           Support
Hellbender HPC Compute    $2,702.00   Per Node/Year  Year to Year
GPU Compute*              $7,691.38   Per Node/Year  Year to Year
High Performance Storage  $95.00      Per TB/Year    Year to Year
General Storage           $25.00      Per TB/Year    Year to Year

*Note: The GPU compute service is no longer active. We have reached 50% of the GPU nodes in the cluster under investment - if you need GPU capacity beyond the general pool we are able to plan and work with your grant submissions to add additional capacity to Hellbender.

Policies

Under no circumstances should your code be running on the login node.

Software and Procurement

Open source software installed cluster-wide must have an open source license (https://opensource.org/licenses) or be obtained through the procurement process, even if there is no cost associated with it.

Licensed software (any software that requires a license or agreement to be accepted) must follow the procurement process to protect users, their research, and the University. Software must be cleared via the ITSRQ. For more information about this process please reach out to us!

For widely used software, RSS can facilitate the sharing of license fees and/or may support the cost, depending on the cost and situation. Otherwise, users are responsible for funding fee-based licensed software, and RSS can handle the procurement process. We require that, if the license does not preclude it and there are no node or other resource limits, the software be made available to all users on the cluster. All licensed software installed on the cluster is to be used following the license agreement. We will do our best to install and support a wide range of scientific software as resources and circumstances dictate, but in general we only support scientific software that will run on RHEL in an HPC cluster environment. RSS may not support software that is implicitly/explicitly deprecated by the community.

Containers, Singularity/Apptainer/Docker

A majority of scientific software and software libraries can be installed in users’ accounts or in group space. We also provide limited support for Singularity for advanced users who require more control over their computing environment. We cannot knowingly assist users to install software that may put them, the University, or their intellectual property at risk.

Storage

None of the cluster-attached storage available to users is backed up in any way by us; if you delete something and don't have a copy somewhere else, it is gone. Please note the data stored on cluster-attached storage is limited to Data Class 1 and 2 as defined by UM System Data Classifications. If you need to store DCL3 or DCL4 data, please contact us so we can find a solution for you.

Storage Type   Location                        Quota        Description
Home           /home/$USER                     50 GB        Available to all users
Pixstor        /home/$USER/data                500 GB       Available to all users
Local Scratch  /local/scratch                  1.6-3.2 TB   Available to all users
Pixstor        /cluster/pixstor, /mnt/pixstor  Varies       For investment, cluster attached
Vast           /cluster/VAST                   Varies       For investment, cluster/instrument attached

Research Network

Research Network DNS: The domain name for the Research Network (RNet) is rnet.missouri.edu and is for research purposes only. All hosts on RNet will have a .rnet.missouri.edu domain. Subdomains and CNAMEs are not permitted. Reverse records will always point to a host in the .rnet.missouri.edu domain.

Partitions

To view partition limits, use scontrol show part partitionname.

Partition            Default Time Limit   Maximum Time Limit   Description
general              1 hour               2 days               For non-investors to run multi-node, multi-day jobs.
requeue              10 minutes           2 days               For non-investor jobs that have been requeued because they landed on an investor-owned node.
gpu                  1 hour               2 days               Acceptable use includes jobs that utilize a GPU for the majority of the run. Composed of Nvidia A100 cards, 4 per node.
interactive          1 hour               2 days               For short interactive testing, interactive debugging, and general interactive jobs. Use this for light testing as opposed to the login node.
logical_cpu          1 hour               2 days               For workloads that can make use of hyperthreaded hardware.
priority partitions  1 hour               28 days              For investors.

Citation

We ask that when you cite any of the RSS clusters in a publication, you send an email to muitrss@missouri.edu and share a copy of the publication with us. To cite the use of any of the RSS clusters in a publication, please use: The computation for this work was performed on the high performance computing infrastructure operated by Research Support Solutions in the Division of IT at the University of Missouri, Columbia MO. DOI: https://doi.org/10.32469/10355/97710

Quick Start

Logging In

SSH (Linux)

Open a terminal and type

 ssh username@hellbender-login.rnet.missouri.edu 

replacing username with your campus SSO username, then enter your SSO password.

Logging in places you on the login node. Under no circumstances should you run your code on the login node.

If you are submitting a batch file, your job will be sent to a compute node for execution.

However, if you are attempting to use a GUI, ensure that you do not run your session on the login node (for example: username@hellbender-login-p1). Use an interactive session to be directed to a compute node to run your software.

 salloc --time=1:00:00 --x11

Putty (Windows)

Open Putty and connect to hellbender-login.rnet.missouri.edu using your campus SSO.

Off Campus Logins

Our off campus logins use public key authentication only; password authentication is disabled for off campus users unless they are connected to the campus VPN. Please send your public key to itrss-support@umsystem.edu with the subject “Hellbender Public Key Access”. After setting up your client to use your key, you still connect to the host hellbender-login.rnet.missouri.edu, but without needing the VPN.
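For convenience, a minimal ~/.ssh/config entry on your local machine lets you connect with just `ssh hellbender`. The username and key path below are placeholders; adjust them to match your SSO name and the key you generated:

```text
Host hellbender
    HostName hellbender-login.rnet.missouri.edu
    User your_sso_username
    IdentityFile ~/.ssh/id_ed25519
```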

Open OnDemand

OnDemand provides an integrated, single access point for all of your HPC resources. The following apps are currently available on Hellbender's Open OnDemand.

  • Jupyter Notebook
  • RStudio Server
  • Virtual Desktop
  • VSCode

SSH Keys

If you want to connect using SSH keys, either to avoid having to type your password or to connect from off campus without a VPN, you can add your SSH public key to Hellbender.

For Windows users, we recommend using MobaXterm https://mobaxterm.mobatek.net. Powershell, git bash, and putty are some other available options. Mac and Linux users can use the default terminal.

Generating an SSH Key

You can generate a new SSH key on your local machine. After you generate the key, you can add the public key to your account on Hellbender.

  • Open terminal
  • Paste the text below, replacing the email used in the example with your University email address.
ssh-keygen -t ed25519 -C "sso@example.com"
Note: If you are using a legacy system that doesn't support the Ed25519 algorithm, use:

ssh-keygen -t rsa -b 4096 -C "sso@example.com"

This creates a new SSH key, using the provided email as a label.

Generating public/private ALGORITHM key pair.

When you're prompted to “Enter a file in which to save the key”, you can press Enter to accept the default file location. Please note that if you created SSH keys previously, ssh-keygen may ask whether to overwrite an existing key, in which case we recommend creating a custom-named SSH key. To do so, type the default file location and replace id_ALGORITHM with your custom key name.

# Windows
Enter file in which to save the key (/c/Users/USERNAME/.ssh/id_ALGORITHM):[Press enter]
# Mac
Enter a file in which to save the key (/Users/USERNAME/.ssh/id_ALGORITHM): [Press enter]
# Linux
Enter a file in which to save the key (/home/USERNAME/.ssh/id_ALGORITHM):[Press enter]
  • At the prompt, type a secure passphrase.
Enter passphrase (empty for no passphrase): [Type a passphrase]
Enter same passphrase again: [Type passphrase again]

Adding your SSH key

You may add your own SSH public key to your Hellbender account. You can also send the key to itrss-support@umsystem.edu and we can add it to your Hellbender account.

  • Copy the contents of your SSH public key, which is written to the file created in the Generating an SSH Key step.
# Windows
Your public key has been saved in /c/Users/USERNAME/.ssh/id_ALGORITHM.pub
# Mac
Your public key has been saved in /Users/USERNAME/.ssh/id_ALGORITHM.pub
# Linux
Your public key has been saved in /home/USERNAME/.ssh/id_ALGORITHM.pub
  • The id_ALGORITHM.pub file contents should look similar to the ones below.
# ed25519
ssh-ed25519 AAAAB3NzaC1yc2EAAAABIwAAAQEAklOUpkDHrfHY17SbrmTIpNLTGK9Tjom/BWDSUGP truman@example.com

# rsa
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAklOUpkDHrfHY17SbrmTIpNLTGK9Tjom/BWDSUGPl+nafzlHDTYW7hdI4yZ5ew18JH4JW9jbhUFrviQzM7xlELEVf4h9lFX5QVkbPppSwg0cda3Pbv7kOdJ/MTyBlWXFCR+HAo3FXRitBqxiX1nKhXpHAZsMciLq8V6RjsNAQwdsdMFvSlVK/7XAt3FaoJoAsncM1Q9x5+3V0Ww68/eIFmb1zuUFljQJKprrX88XypNDvjYNby6vw/Pb0rwert/EnmZ+AW4OZPnTPI89ZPmVMLuayrD2cE86Z/il8b+gw3r3+1nKatmIkjn2so1d01QraTlMqVSsbxNrRFi9wrf+M7Q== truman@example.com
  • Add your public key to your account by appending it to your authorized_keys file on Hellbender
[sso@hellbender-login ~]$ vim /home/sso/.ssh/authorized_keys
  • OR send us your public key.

Submitting a job

Using SLURM, you need to create a submission script to execute on the backend nodes, then use a command-line utility to submit the script to the resource manager. Below are the file contents of a general submission script, complete with comments.

Example Job Script
batch.sub
#!/bin/bash
#SBATCH --job-name=Change_ME 
#SBATCH --ntasks=1
#SBATCH --time=0-00:10:00 
#SBATCH --mail-type=begin,end,fail,requeue 
#SBATCH --export=all 
#SBATCH --out=Hellbender-%j.out 
 
# %j will substitute to the job's id
#now run your executables just like you would in a shell script, Slurm will set the working directory as the directory the job was submitted from. 
#e.g. if you submitted from /home/username/softwaretesting your job would run in that directory.
 
#(executables) (options) (parameters)
echo "this is a general submission script"
echo "I've submitted my first batch job successfully"

Now you need to submit that batch file to the scheduler so that it will run when it is time.

 sbatch batch.sub 

You will see the output of sbatch after the job submission, which gives you the job number. If you would like to monitor the status of your jobs, you may do so with the squeue command. If you submit a job to the requeue partition, you will receive a warning message like:

sbatch: Warning, you are submitting a job the to the requeue partition. There is a chance that your job
will be preempted by priority partition jobs and have to start over from the beginning.
Submitted batch job 167
Common SBATCH Directives
Directive       Valid Values                     Description
--job-name=     string value, no spaces          Sets the job name to something more friendly; useful when examining the queue.
--ntasks=       integer value                    Sets the requested CPUs for the job.
--nodes=        integer value                    Sets the number of nodes you wish to use; useful if you want all your tasks to land on one node.
--time=         D-HH:MM:SS, HH:MM:SS             Sets the allowed run time for the job; accepted formats are listed in the valid values column.
--mail-type=    begin,end,fail,requeue           Sets when the scheduler notifies you about a job. By default no email is sent.
--mail-user=    email address                    Sets the mail-to address for this job.
--export=       ALL, or specific variable names  By default Slurm exports the current environment variables, so all loaded modules are passed to the environment of the job.
--mem=          integer value                    Amount of memory in MB you would like the job to have access to. Each queue has default memory-per-CPU values set, so unless your executable runs out of memory you will likely not need this directive.
--mem-per-cpu=  integer value                    Amount of memory in MB per CPU. Default values vary by queue but are typically 800 MB.
--nice=         integer value                    Lowers a job's priority if you would like other jobs set to a higher priority in the queue; the higher the nice number, the lower the priority.
--constraint=   see the sbatch man page          Constrains your job to run only on resources with specific features; see the next table for valid features to request.
--gres=         name:count                       Reserves additional resources on the node, specifically GPUs on our cluster; e.g. --gres=gpu:2 reserves 2 GPUs on a GPU-enabled node.
-p              partition_name                   Not typically used; if not defined, jobs are routed to the highest-priority partition your user has permission to use. Specify a lower-priority partition if you want its higher resource availability.
Valid Constraints
Feature       Description
intel         Node has Intel CPUs
amd           Node has AMD CPUs
EDR           Node has an EDR (100 Gbit/s) InfiniBand interconnect
gpu           Node has GPU acceleration capabilities
cpucodename*  Node is running the CPU codename you desire, e.g. rome

Note: if some combination of your constraints and requested resources cannot be satisfied, you will get a submission error when you attempt to submit your job.
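As a quick sanity check on memory requests: with --mem-per-cpu, the total memory granted is the per-CPU amount multiplied by the number of CPUs requested. A small sketch, assuming the ~800 MB per-CPU default mentioned in the directives table above:

```python
# Sketch: total memory granted when using --mem-per-cpu.
# Assumes MB units and the ~800 MB/CPU default noted in the table;
# actual defaults vary by queue.

def total_mem_mb(ncpus: int, mem_per_cpu_mb: int = 800) -> int:
    """Total job memory in MB for ncpus CPUs at --mem-per-cpu each."""
    return ncpus * mem_per_cpu_mb

print(total_mem_mb(4))        # 3200 MB with the default per-CPU value
print(total_mem_mb(8, 2048))  # 16384 MB when requesting 2 GB per CPU
```

If a job is killed for exceeding memory, either raise --mem-per-cpu or request more CPUs.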

Monitoring your jobs

 squeue -u username OR squeue --me
JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
  167   requeue fahgpu_v jonesjos  R       2:50      1 compute-41-01

Cancel your job

scancel - Command to cancel a job, user must own the job being cancelled or must be root.

scancel <jobnumber>

Viewing your results

Output from your submission will go into an output file in the submission directory; this will be either slurm-jobnumber.out or whatever you defined in your submission script. In our example script we set this to Hellbender-jobnumber.out. The file is written asynchronously, so for a very short job it may take a moment after the job completes for the file to show up.

Moving Data

Globus

Users can use Globus to move data to and from Hellbender. You can use Globus by going to app.globus.org and logging in with your university account. If you want to migrate data from one collection to another, you will need to connect to both collections. Once connected, you can drag and drop files from one system to the other, and Globus will manage actually moving the data.

  • Hellbender Collection Name: U MO ITRSS RDE
  • Lewis Collection Name: MU RCSS Lewis Home Directories
  • Mill Collection Name: Missouri S&T Mill
  • Foundry Collection Name: Missouri S&T HPC Storage

More detailed information on how to use Globus is available at https://docs.itrss.umsystem.edu/pub/hpc/hellbender#globus1

Data Transfer Node (DTN)

Hellbender has a node designed for high speed transfers. The node has a 100Gb/s link to the internet. The DTN is currently configured to accept SLURM jobs.

Example slurm sbatch script to use rsync to transfer data to Hellbender:

#!/bin/bash
#SBATCH --job-name=[JOBNAME]
#SBATCH --mem=[MEMORY]
#SBATCH --nodes=1
#SBATCH --cpus-per-task=2
#SBATCH --time=[FORMAT DAY-HOUR:MINUTE:SECOND]
#SBATCH --partition=dtn
#SBATCH --mail-user=[YOUR_EMAIL]
#SBATCH --mail-type=ALL

SRC="remote_hostname:/absolute/path/to/src/files/"
DEST="/absolute/path/to/dest/files"

rsync -av --info=progress2 "$SRC" "$DEST"

Interactive Jobs

Some things can't be run with a batch script because they require user input, or you may need to compile some large code and are worried about bogging down the login node. To start an interactive job, simply use the salloc command and your terminal will be running on one of the compute nodes. The hostname command can help you confirm you are no longer on a login node. Now you may run your executable by hand without worrying about impacting other users. By default, salloc uses the partition's defaults, which for the interactive partition are 1 CPU and 800 MB of memory for 1 hour.

Use the cluster interactively on the interactive partition:

[user@hellbender-login ~]$ salloc -p interactive
salloc: Pending job allocation 1902141
salloc: job 1902141 queued and waiting for resources
salloc: job 1902141 has been allocated resources
salloc: Granted job allocation 1902141
salloc: Waiting for resource configuration
salloc: Nodes c067 are ready for job
[user@c001 ~]$

Use the cluster interactively with more time and resources:

# -N number of nodes, -n number of cpus, -t time in hours:minutes:seconds
[user@hellbender-login ~]$ salloc -p interactive -N 2 -n 6 -t 4:00:00

Use a GPU interactively:

[user@hellbender-login ~]$ salloc -p requeue --gres=gpu:A100:1
salloc: Warning, you are submitting a job the to the requeue partition. There is a chance that your job will be preempted by priority partition jobs and have to start over from the beginning.
salloc: Pending job allocation 1902152
salloc: job 1902152 queued and waiting for resources
salloc: job 1902152 has been allocated resources
salloc: Granted job allocation 1902152
salloc: Waiting for resource configuration
salloc: Nodes g001 are ready for job

Teaching Cluster

Hellbender can be used by instructors, TAs, and students for instructional work via the Hellbender Classes Open OnDemand (OOD) portal.

Below is the process for setting up a class on the OOD portal.

  1. Send the class name, the list of students and TAs, and any shared storage requirements to itrss-support@umsystem.edu.
  2. We will add the students to the group allowing them access to OOD.
  3. If the student does not have a Hellbender account yet, they will be presented with a link to a form to fill out requesting a Hellbender account.
  4. We activate the student account and the student will receive an Account Request Complete email.

If desired, instructors can perform step 2 themselves. You may already be able to modify your class groups here: https://netgroups.apps.mst.edu/auth-cgi-bin/cgiwrap/netgroups/netmngt.pl

If the class size is large, we can perform steps 3 and 4.

Software

Anaconda

Anaconda is an open-source package management and environment management system. Conda quickly installs, runs, and updates packages and their dependencies, and easily creates, saves, loads, and switches between environments on your local computer. It was created for Python programs, but it can package and distribute software for any language.

Software URL:https://www.anaconda.org/

Documentation:https://conda.io/en/latest/

By default, Conda stores environments and packages within the folder ~/.conda.

To avoid using up all of your home folder's quota, which can easily happen when using Conda, we recommend placing the following in the file ~/.condarc. You can create the file if it is not already present. You can also choose a different path, as long as it is not in your home folder.

envs_dirs:
- /mnt/pixstor/data/${USER}/miniconda/envs
pkgs_dirs:
- /mnt/pixstor/data/${USER}/miniconda/pkgs

Usage

The version of Anaconda we have available on Hellbender is called “Miniconda”. Miniconda is a version of Anaconda that only provides the conda command.

First, you will want to make sure that you are running in a compute job.

srun -p interactive --mem 8G --pty bash

Then, you need to load the miniconda3 module:

module load miniconda3

After that command completes, you will have the conda command available to you. conda is what you will use to manage your Anaconda environments. To list the Anaconda environments that are installed, run the following:

conda env list

If this is your first time running Anaconda, you will probably only see the “root” environment. This environment is shared between all users of Hellbender and cannot be modified. To create an Anaconda environment that you can modify, do this:

conda create --name my_environment python=3.7

You can use any name you want instead of my_environment. You can also choose other Python versions or add any other packages. Ideally, you should create one environment per project and include all the required packages when you create the environment.

After running the above command, you should see something like this:

The following NEW packages will be INSTALLED:
 
  _libgcc_mutex      pkgs/main/linux-64::_libgcc_mutex-0.1-main
  _openmp_mutex      pkgs/main/linux-64::_openmp_mutex-5.1-1_gnu
  ca-certificates    pkgs/main/linux-64::ca-certificates-2023.08.22-h06a4308_0
  certifi            pkgs/main/linux-64::certifi-2022.12.7-py37h06a4308_0
  ld_impl_linux-64   pkgs/main/linux-64::ld_impl_linux-64-2.38-h1181459_1
  libffi             pkgs/main/linux-64::libffi-3.4.4-h6a678d5_0
  libgcc-ng          pkgs/main/linux-64::libgcc-ng-11.2.0-h1234567_1
  libgomp            pkgs/main/linux-64::libgomp-11.2.0-h1234567_1
  libstdcxx-ng       pkgs/main/linux-64::libstdcxx-ng-11.2.0-h1234567_1
  ncurses            pkgs/main/linux-64::ncurses-6.4-h6a678d5_0
  openssl            pkgs/main/linux-64::openssl-1.1.1w-h7f8727e_0
  pip                pkgs/main/linux-64::pip-22.3.1-py37h06a4308_0
  python             pkgs/main/linux-64::python-3.7.16-h7a1cb2a_0
  readline           pkgs/main/linux-64::readline-8.2-h5eee18b_0
  setuptools         pkgs/main/linux-64::setuptools-65.6.3-py37h06a4308_0
  sqlite             pkgs/main/linux-64::sqlite-3.41.2-h5eee18b_0
  tk                 pkgs/main/linux-64::tk-8.6.12-h1ccaba5_0
  wheel              pkgs/main/linux-64::wheel-0.38.4-py37h06a4308_0
  xz                 pkgs/main/linux-64::xz-5.4.2-h5eee18b_0
  zlib               pkgs/main/linux-64::zlib-1.2.13-h5eee18b_0
 
 
Proceed ([y]/n)?

Press y to continue. Your packages should be downloaded. After the packages are downloaded, the following will be printed:

#
# To activate this environment, use:
# > source activate my_environment
#
# To deactivate an active environment, use:
# > source deactivate
#

Make a note of that because those commands are how to get in and out of the environment you just created. To test it out, run:

[bjmfg8@c067 ~]$ source activate my_environment
(my_environment) [bjmfg8@c067 ~]$ python
Python 3.7.16 (default, Jan 17 2023, 22:20:44)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

You might notice that (my_environment) now appears before your prompt, and that the Python version is the one you specified above (in our example, version 3.7).

Press Ctrl-D to exit Python.

When the environment name appears before your prompt, you are able to install packages with conda. For instance, to install pandas:

(my_environment) [bjmfg8@c067 ~]$ conda install pandas

Now, pandas will be accessible from your environment:

(my_environment) [bjmfg8@c067 ~]$ python
Python 3.7.16 (default, Jan 17 2023, 22:20:44)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas
>>> pandas.__version__
'1.3.5'

Press Ctrl-D to exit Python. To see the list of installed packages in the environment, run:

conda list

To exit your environment:

(my_environment) [bjmfg8@c067 ~]$ conda deactivate
[bjmfg8@c067 ~]$

If you no longer need your environment, you can remove it (after deactivating) with:

conda env remove --name my_environment

Conda Channels

Whenever we use conda create or conda install without specifying a channel name, the Conda package manager searches its default channels for the packages. If you are looking for packages that are not in the default channels, you must name the channels explicitly:

conda create --name env_name --channel channel1 --channel channel2 ... package1 package2 ...

For example, the following creates new_env and installs r-sf, shapely, and bioconductor-biobase from the r, conda-forge, and bioconda channels:

conda create --name new_env --channel r --channel conda-forge --channel bioconda r-sf shapely bioconductor-biobase

Conda Packages

To find the packages you need, visit anaconda.org and search for them to find their full names and corresponding channels. Another option is the conda search command. Note that you need to search the right channel to find packages that are not in the default channels. For example:

conda search --channel bioconda biobase

CUDA

CUDA® is a parallel computing platform and programming model invented by NVIDIA. It enables dramatic increases in computing performance by harnessing the power of the graphics processing unit (GPU). With millions of CUDA-enabled GPUs sold to date, software developers, scientists and researchers are using GPU-accelerated computing for broad-ranging applications.

Software URL: https://developer.nvidia.com/cuda-zone

Documentation: http://docs.nvidia.com/cuda/index.html
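On Hellbender, CUDA is typically used from within a GPU job rather than on a login node. The following is a minimal batch-script sketch; the partition name (gpu) and the CUDA module name are assumptions — check sinfo and module avail cuda for the correct values on the system:

```shell
#!/bin/bash
#SBATCH --partition=gpu        # assumption: name of the GPU partition
#SBATCH --gres=gpu:1           # request one GPU
#SBATCH --time=00:10:00
#SBATCH --mem=8G

module load cuda               # assumption: exact module name may differ

# Confirm the CUDA compiler and the allocated GPU are visible
nvcc --version
nvidia-smi
```

Submit with sbatch and inspect the job's output file to confirm the GPU was allocated as expected.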

Globus

Globus is a file transfer software ecosystem widely used in the science and research community. It offers the ability to transfer large amounts of data in parallel and over secure channels between various “endpoints.”

Getting Started with Globus

https://docs.globus.org/how-to/get-started/

  • Select “University of Missouri System”.
  • Login using your University e-mail and password.
  • Follow the prompts.

Linking Identities (If you already have a globus account)

Link Globus Identities: https://docs.globus.org/how-to/link-to-existing/

  • Please link an organizational identity to your existing Globus account.
  • Select “University of Missouri System” as the identity to link.
  • Log in using your university e-mail username and your University password.
  • Follow the prompts to complete the account linking.

Sharing Data Via Guest Collection

  • Create the desired folder structure on the target system
  • After linking your University of Missouri System identity, you can connect to the mapped collection via Globus.
  • Log in to the Globus application
  • Select File Manager
  • In the Collection field, enter your search target. Available collections currently include the Lewis cluster (“MU RCSS Lewis Home Directories”) and RDE (“U MO ITRSS RDE”).
  • Change the Path field to your target directory.
  • Follow prompts to Share the path and invite users.

Moving Data From Lewis to Hellbender Using Globus

Both Lewis and Hellbender have Globus endpoints, allowing data to be transferred between the two clusters directly from the Globus application.

To begin, log in to the Globus web client and follow the prompts to connect to your account. In the File Manager menu, search for the Lewis endpoint “MU RCSS Lewis Home Directories”.

This will land you in your home directory (the same location you land in by default after logging into Lewis). From here, select the file you would like to transfer to Hellbender. In this example we will move the file “test.txt”.

Next, we need to find the Hellbender endpoint to transfer this file to. In the collection search bar on the right search for the Hellbender/RDE endpoint “U MO ITRSS RDE”. If you are trying to transfer files from your university OneDrive select “U MO ITRSS RDE - M365”. If you do not see the menu on the right - select the “transfer or sync to” option.

After “U MO ITRSS RDE” is selected:

You will land by default at the root directory of the RDE storage system. Use the path box to navigate to the specific path on Hellbender/RDE to which you want to move the data. NOTE: This works the same for group storage as well as personal /data directories. In this example, we are using the personal data directory of user bjmfg8:

Once you have your desired file selected from the Lewis side and your destination selected on the Hellbender/RDE side you are ready to transfer the file. Select the “Start” button on the source (Lewis) side to begin:

You can refresh the folder and you should see the small test.txt file has been successfully transferred:
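The same transfer can also be scripted with the Globus CLI (globus-cli) instead of the web interface, which is convenient for repeated or automated transfers. A sketch, assuming you have already authenticated with globus login; the endpoint UUIDs and paths shown are placeholders you would substitute with your own:

```shell
# Look up the endpoint UUIDs by collection name
globus endpoint search "MU RCSS Lewis Home Directories"
globus endpoint search "U MO ITRSS RDE"

# Submit the transfer (placeholder UUIDs and paths)
globus transfer SOURCE_ENDPOINT_UUID:/home/bjmfg8/test.txt \
    DEST_ENDPOINT_UUID:/data/bjmfg8/test.txt --label "lewis-to-hellbender"
```

The transfer runs asynchronously on Globus's servers, so you can log out after submitting it.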

Matlab

To run MATLAB on a GPU in an OOD Virtual Desktop session, type the following in the Virtual Desktop terminal:

export CUDA_VISIBLE_DEVICES="1"
module load matlab
matlab
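MATLAB can also be run non-interactively through the scheduler rather than the Virtual Desktop. A minimal batch-script sketch; the partition name and the -batch expression are illustrative assumptions:

```shell
#!/bin/bash
#SBATCH --partition=gpu       # assumption: name of the GPU partition
#SBATCH --gres=gpu:1          # request one GPU
#SBATCH --time=00:30:00
#SBATCH --mem=16G

module load matlab

# Run a MATLAB command non-interactively; gpuDevice reports the allocated GPU
matlab -batch "gpuDevice"
```

Replace the gpuDevice call with your own script name to run it as a batch job.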

Visual Studio Code

OpenOnDemand

Visual Studio Code, also commonly referred to as VS Code, is a source-code editor developed by Microsoft for Windows, Linux and macOS. Features include support for debugging, syntax highlighting, intelligent code completion, snippets, code refactoring, and embedded Git.

We require users who want to work with VS Code on Hellbender to use only our interactive application in Open On Demand. Native connections to VS Code spawn resource-intensive processes on the login node, and your session will likely be terminated by a system administrator.

To open a VS Code session in Open On Demand navigate to https://ondemand.rnet.missouri.edu/ in a web browser.

You will see a landing page similar to this:

Next select “Interactive Apps” and choose VS Code Server:

You will see a menu to add resources to your job submission. The defaults should be fine for VS Code (these resources are only for running the VS Code editor itself, not for the jobs you intend to run).

Your job will be submitted to the queue; after a few seconds you should see the option to launch your VS Code window:

You should land in your /data directory by default. You can now use VS Code as you wish.

X Forwarding

Another way to use VS Code is with X forwarding through the regular SLURM scheduler. First, connect to Hellbender with X forwarding enabled.

ssh -X hellbender-login.rnet.missouri.edu

If you are using a Mac, use

ssh -Y hellbender-login.rnet.missouri.edu

Mac users will also need to install the XQuartz package first (https://www.xquartz.org/). A restart of your Mac may be required after installing XQuartz.

Next, you will want to start an interactive session on the cluster.

salloc --x11 --time=1:00:00 --ntasks=2 --mem=2G --nodes=1

Once you get a job, you will want to load the VS Code module with:

module load vscode/1.88.1

To launch VS Code in this job you simply run the command:

code