Services

RSS provides the following services:


Hellbender Investment Model

Overview

The newest High Performance Computing (HPC) resource, Hellbender, has been provided through a partnership with the Division of Research Innovation and Impact (DRII) and is intended to work in conjunction with DRII policies and priorities. This outline defines how fairshare, general access, priority access, and researcher contributions are handled for Hellbender. HPC has been identified as a continually growing need for researchers; accordingly, DRII has invested in Hellbender as an institutional resource. This investment is intended to increase ease of access to these resources, provide cutting-edge technology, and grow the pool of resources available.

Fairshare

To understand how general access and priority access differ, fairshare must first be defined. Fairshare is an algorithm used by the scheduler to assign priority to jobs in a way that gives every user a fair chance at the available resources. For any job waiting in the queue, the algorithm weighs several metrics, such as job size, wait time, current and recent usage, and individual user priority levels. Administrators can tune the fairshare algorithm to adjust how it determines which jobs run next once resources become available.
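
On a Slurm-based cluster such as Hellbender, these priority factors can be inspected directly from the command line. The following is a hedged sketch; the username is a placeholder:

# Show the per-factor priority breakdown (including fairshare) for pending jobs
sprio -l

# Show the weight the scheduler assigns to each priority factor
sprio -w

# Limit the output to a specific user's pending jobs
sprio -u <username>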

Resources Available to Everyone: General Access

General access is open to any research or teaching faculty, staff, and students from any UM System campus. General access is defined as open access to all resources available to users of the cluster at an equal fairshare value, meaning all users have the same level of access to the general resource. Research users of the general access portion of the cluster are given the RDE Standard Allocation to operate from. Larger storage allocations are provided through RDE Advanced Allocations and are independent of HPC priority status.

Hellbender Advanced: Priority Access

When researcher needs are not being met at the general access level, researchers may request an advanced allocation on Hellbender to gain priority access. Priority access gives research groups a limited set of resources that are available to them without competition from general access users. Priority access is provided to a specific set of hardware through a priority partition that contains these resources. This partition is created for, and limited to, the requesting user and their associated group. These resources also belong to an overlapping pool available to general access users. This pool is administered such that if a priority access user submits jobs to their priority partition, any jobs running on those resources from the overlapping partition are requeued and begin execution again on another resource in that partition if one is available, or return to the queue to wait for resources. Priority access users retain general access status, and fairshare still moderates their access to the general resource. Fairshare inside a priority partition determines which user's jobs are selected for execution next inside that partition. Jobs running inside a priority partition also affect a user's fairshare calculation for the general access partition, meaning that running a large number of jobs inside a priority partition lowers a user's priority for the general resources as well.
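
For example, members of a group with a priority partition would direct jobs to it in an sbatch script with a line such as the following; the partition name is a placeholder for the name assigned when the partition is created:

#SBATCH --partition=<priority partition>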

Priority Designation

Hellbender Advanced Allocations are eligible for DRII Priority Designation. This means that DRII has determined the proposed use case (such as a core or grant-funded project) presents a strategic advantage or high priority service to the university. In this case, DRII fully subsidizes the resources used to create the Advanced Allocation.

Traditional Investment

Hellbender Advanced Allocation requests that are not approved for DRII Priority Designation may be treated as traditional investments, with the researcher paying for the resources used to create the Advanced Allocation at the defined rate. These rates are subject to change based on DRII determinations and hardware costs.

Resource Management

Information Technology Research Support Solutions (ITRSS) will procure, set up, and maintain the resource. ITRSS will work in conjunction with MU Division of Information Technology and Facility Services to provide adequate infrastructure for the resource.

Resource Growth

Priority access resources will generally be made available from existing hardware in the general access pool, and the associated funds will be retained to allow a larger pool of funds to accumulate for future expansion of the resource. This approach allows the greatest return on investment over time. If general availability resources fall below 50% of the overall resource, an expansion cycle will be initiated to ensure all users still have access to a significant amount of resources. A large contribution from a researcher or research group may also trigger an expansion cycle if that is determined to be advantageous at the time of the contribution.

Benefits of Investing

The primary benefit of investing is receiving "shares". Shares are used to calculate the percentage of the cluster owned by an investor. As long as an investor has used less than they own, they receive higher priority in the queue. This is called "FairShare" and can be monitored by running sshare. A FairShare value above 0.5 indicates that an investor has used less than they own; conversely, a value below 0.5 indicates the investor has used more than they own. FairShare is by far the largest factor in queue placement and wait times.
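
As a hedged illustration, FairShare values can be checked from a login node with sshare; the account name below is a placeholder:

# Show FairShare for your own associations
sshare

# Show FairShare for all users under a specific investor account
sshare -a -A <investor account>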

Investors will be granted Slurm accounts to use in order to charge their investment (FairShare). These accounts can contain the same members as a POSIX group (storage group) or any other set of users at the request of the investor.

To use an investor account in an sbatch script, use:

#SBATCH --account=<investor account>

To use a QOS in an sbatch script, use:

#SBATCH --qos=<qos>
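
Putting these together, a minimal job script might look like the sketch below. The job name, resource requests, and program are placeholders and should be right-sized to your workload; the account and QOS come from your investment:

#!/bin/bash
#SBATCH --job-name=example_job
# Charge the investment's Slurm account and use its QOS (placeholders)
#SBATCH --account=<investor account>
#SBATCH --qos=<qos>
# Right-size these requests to the workload (placeholder values)
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G
#SBATCH --time=02:00:00

# Replace with the actual commands for your workload
srun ./my_program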

HPC Pricing

The HPC Service is available at any time at the following rates for year 2023:

Service                     Rate         Unit           Support
Hellbender HPC Compute      $2,702.00    Per node       Year to year
GPU Compute                 $7,691.38    Per node       Year to year
High Performance Storage    $95.00       Per TB/year    Year to year
General Storage             $25.00       Per TB/year    Year to year
GPRS Storage                $7.00        Per TB/month   Month to month

Storage

Research Data Ecosystem Storage (RDE) Allocation Model

The Research Data Ecosystem (RDE) is a collection of technical resources supporting research data. These resources are intended to meet researcher needs for physical storage, data movement, and institutional management. They are provided in partnership with the Division of Research Innovation and Impact (DRII) and are intended to work in conjunction with DRII policy drivers and priorities. The Ecosystem brings together storage platforms, data movement, metadata management, data protection, technical and practice expertise, and administration to fully support the research data lifecycle. DRII has invested in RDE as an institutional resource. Details of the specific underlying platforms may change over time, but changes will always be directed towards ease of use, access, and performance to purpose. Throughout May 2023, portions of the RDE are moving into production. These include:

  • On-premises high-performance and general-purpose research storage
  • Globus for advanced data movement
  • Specialized need consultation

These resources work in conjunction with RSS services related to grant support, HPC infrastructure, and data management plan development. Capabilities that are not yet generally available but in development include:

  • Data backup
  • Data archival
  • Data analytics and reporting

Additionally, effective use of some resources may require changes to network architecture, so additional limitations may apply at this time.
We invite researchers needing solutions (including those dependent on resources not yet generally available) to consult with RSS. We may be able to find effective workarounds or make use of pilot projects when appropriate. We are committed to finding solutions supporting your research productivity needs!

To order storage, please fill out a request in our Resource Request Form.

Resources available to all researchers: RDE Standard Allocation

All researchers are eligible for the RDE Standard Allocation. The Standard Allocation provides a base level of storage suitable for use in connection with High Performance Computing clusters, or for general-purpose lab shares (SMB or NFS). The exact capacity is subject to change based on utilization and DRII direction. See the appendices for current specifications.

RDE Advanced Allocation

For needs beyond the RDE Standard Allocation, researchers may request one or more RDE Advanced Allocations. Advanced Allocations can provide for larger or specialized research storage needs. Storage is provided at a per-TB/per-year rate which is subject to change under DRII guidance. See the appendices for current rates and how the cost model is being implemented. RDE Advanced Allocations should be associated with research services or defined projects. Research Cores, Research Centers, labs providing services to other labs, and RSS may be considered research services. Defined projects include sponsored programs or otherwise well-defined initiatives. All RDE allocations must include appropriate data management planning. A plan may be a formal Data Management Plan associated with a grant, or an operational workflow description appropriate for a core or service entity, as long as data protection, transfer, and disposition requirements are documented.
Advanced allocations require consultation with RSS. RSS will work with researchers to match allocated resources with capacity, performance, protection, and connectivity needs.

Priority Designation

RDE Advanced Allocations are eligible for DRII Priority Designation. This means DRII has determined the proposed use case (such as a core or grant-funded project) presents a strategic advantage or high priority service to the University and agrees to subsidize the resources assigned in that designation. DRII is responsible for determination of criteria for Priority Designation.

Traditional Investment

RDE Advanced Allocations that are not approved for DRII Priority Designation, or that otherwise receive funding for storage, may be treated as traditional investments, with the researcher paying for the allocation at the defined rate.

Data Compliance

By default, the RDE environment supports data classified as DCL 1 or 2. It may be possible to arrange for a higher DCL, but this must be vetted and approved by appropriate security and compliance authorities.

Allocation Maintenance

Researchers are expected to ensure allocation information is kept current. Annual confirmation that the allocation is still needed will be required for all Standard Allocations. For lab (group) and Advanced Allocations, annual vetting of group membership will be required, as well as updates to data management planning if changes (duration, disposition, etc.) are needed.

Appendix

Appendix A: RDE Standard Allocation

  • Individual researcher: 500GB (In addition to 50GB home directory space for HPC users)
  • Lab group: 5TB
  • Duration/renewal cycle: Annual

Appendix B: RDE Advanced Allocation

  • Capacities and duration determined in consultation.
  • Cost per TB per year (equipment/licensing): $95 for high performance storage, $25 for general purpose storage*
  • Supplemental services:
    • Snapshotting (pending implementation and potential cost evaluation)
    • Performance optimization
    • Backup (pending implementation and potential cost evaluation)
    • Archival (pending implementation and potential cost evaluation)
    • Globus endpoint

*Note: All currently available storage is high performance. As capacity is consumed, general purpose (lower tier) storage will be added to the hardware environment and data priced as “general purpose” will be subject to automatic migration to the lower tier.

Teaching Cluster

The teaching cluster is meant as a resource for students and instructors for computational coursework. The teaching cluster is a full HPC cluster, and students are allowed to run jobs on the head node. Students must contact instructors for course-related questions and support.

Service Capabilities

  • 12 compute nodes (152 Cores)

Example Use Cases

  1. As a student, I want to learn how to login and run a simple program on an HPC cluster with minimal setup time and effort.
  2. As an instructor, I want a resource to teach my students about the different functions of high performance computing without needing to spend a lot of time to set up accounts and get the students logged in.

Service Policies

For Instructors:

  1. The teaching cluster is provided to all UM students; students must be official UM students with a PawPrint or UM SSO ID.
  2. The environment is "research grade": there are no backups or high availability. Students/TAs are responsible for backing up any data throughout the semester.
  3. Only infrastructure support is provided; there is no student/end-user support. All support requests should come through the instructor or TAs via muitrss@missouri.edu. Support is best effort and provided during regular business hours.
  4. Software is limited to CentOS 7 packages installed via yum that require minimal configuration and a subset of Lewis scientific packages.
  5. We do not support a development environment/IDE. Users need to use sftp or other console-based text editors. For Windows users, we have a site license for MobaXterm.
  6. We take security seriously. We upgrade the entire environment (including rebooting) on a regular basis and without notice. SELinux is enforced.
  7. Students must be made aware of the "Teaching Cluster Policy" and the limitations of the environment.

For Students:

  1. Use of this system is governed by the rules and regulations of the University of Missouri and the University of Missouri System.
  2. Users must be familiar with and abide by the UM System acceptable use policy and the UM System Data Classification System (DCL). Collected Rules and Regulations - Chapter 110, Data Classification System.
  3. Only DCL 1 data is permitted on the cluster. See the Data Classification System - Definitions
  4. This is a shared environment with limited storage, RAM, and CPU, and no quotas. Please be nice.
  5. Data is not backed up, and all data is deleted when students graduate. This policy may be revised.
  6. Students must contact instructors for course related questions and support.

GPU Service

Some scientific workflows are greatly accelerated when they run on one or more Graphical Processing Units (GPUs). Both Hellbender and Lewis have GPU environments as described below.

Hellbender GPU Capabilities

  • GPU Node Definitions
    • Dell R750XA
    • CPU: Intel Xeon Gold 6338 Ice Lake Processor
      • Qty: 2
      • Cores: 32 physical per CPU (64 per node)
      • Base clock: 2.0GHz
      • Turbo clock: 3.2GHz
      • L3 cache: 48MB
    • GPU: NVIDIA Ampere A100
      • Qty: 4
      • RAM: 80GB each (320GB total)
      • CUDA cores: 6912 each (27,648 total)
      • Base clock: 1065MHz
      • Boost clock: 1410MHz
    • RAM: 256 GB DDR4-3200
    • Local scratch: 1.6TB NVMe dedicated
    • Network: HDR-200 InfiniBand connection
      • Bandwidth: 200Gb/s
      • Latency: less than 600 nanoseconds
      • Purpose: MPI communications and access to all network storage.
    • Current total quantity: 17

Lewis GPU Best Practices

Example Use Cases

  • As a researcher I want to train a neural network to classify images, but my project budget does not cover the cost of purchasing and managing the amount of GPU hardware that is required to complete this task.

Service Policies

  1. The GPU partition must only be used for GPU accelerated workflows. Jobs running on the GPU partition that are not utilizing the GPU are subject to cancellation and potential loss of GPU partition access.
  2. The use of srun for active development and debugging is permitted but is limited to allocations of 2 hours or less. Excessive srun session idle time or excessive number of srun sessions is not permitted.
  3. GPU jobs that utilize only 1 GPU should be structured in a way to allow other jobs to share the node. The exclusive SLURM option should NOT be used and CPU cores, memory, and GPU resources need to be 'right-sized' to the workload. Resource requests should match the correct class and quantity of GPUs for the algorithm.
  4. No more than 50% of the partition resources will be available for concurrent use by any single user.
  5. Users who have not invested in gpu4 can run jobs on the Gpu partition (which allows access to all GPU nodes) for up to 2 hours and, by request, jobs of up to 2 days on the gpu3 partition.
  6. Access to the gpu4 partition is limited to investors only, and jobs must be submitted directly to the gpu4 partition (--partition gpu4) with a gpu4 QOS (--qos gpu4) and their GPU account to charge (for example, --account engineering-gpu). The account must not be 'general' or their CPU investment account. See the example directives below.
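
As a sketch of the directives described above, a gpu4 investor job might include the following. The engineering-gpu account is the example named in the policy, and the --gres line is a common Slurm convention for requesting GPUs that is assumed here and should be adjusted to your workload and the cluster's configuration:

#SBATCH --partition=gpu4
#SBATCH --qos=gpu4
#SBATCH --account=engineering-gpu
# Request only the number of GPUs the job will actually use (assumed syntax)
#SBATCH --gres=gpu:1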

High Performance Compute

Hellbender HPC Compute Capabilities

112 Compute Nodes:

  • Compute Node Definitions
    • Dell C6525
    • CPU: AMD EPYC 7713 Milan Processor
      • Qty: 2
      • Cores: 64 physical per CPU (128 per node)
      • Base clock: 2.0GHz
      • Boost clock: 3.675GHz
      • L3 cache: 256MB
    • RAM: 512 GB DDR4-3200
    • Local scratch: 1.6TB NVMe dedicated
    • Network: HDR-200 InfiniBand connection
      • Bandwidth: 200Gb/s
      • Latency: less than 600 nanoseconds
      • Purpose: MPI communications and access to all network storage.

Lewis HPC Compute Capabilities

217 Compute Nodes Spanning 4 Generations:

  • HPC3
    • 19 Nodes (456 Cores)
    • Intel Haswell
  • HPC4/HTC4/HPC4RC
    • 101 Nodes (2828 Cores)
    • Intel Broadwell
  • HPC5
    • 35 Nodes (1400 Cores)
    • Intel Skylake
  • HPC6
    • 62 Nodes (2976 Cores)
    • Intel Cascade Lake

Lewis HPC Compute Best Practices

  • Never run calculations on the home disk
  • Always use SLURM to schedule jobs
  • The login nodes are only for editing files and submitting jobs
  • Do not run calculations interactively on the login nodes

Example Use Cases

  1. I need to run a computational fluid dynamics simulation that requires very fast communication across different logical units.
  2. I need to analyze a large pool of gene expression data that far exceeds the processing capacity of my lab's PCs.
  3. I want to run a simulation on drug interaction and toxicity without involving live subjects.

Service Policies

General Use

  • Use of this system is governed by the rules and regulations of the University of Missouri and the University of Missouri System and by the requirements of any applicable granting agencies. Those found in violation of policy are subject to account termination.
  • Users will follow the UM System acceptable use policy
  • Users are responsible for ensuring that only data classified as DCL1 or DCL2 is stored or processed on the system.

Accounts

  • Faculty of the University of Missouri – Columbia, Kansas City, St. Louis, and S&T may request user accounts for themselves, current students, and current collaborators. Account requests require the use of a UM System email address. The exception is for researchers of the ShowMeCI consortium who are not part of the University of Missouri System; they may apply for accounts for themselves and their students using their organization email address.

Collaborator Accounts

  • Faculty requesting accounts for collaborators must first apply for their collaborator to have a Courtesy Appointment through the faculty member's department using the Personal Action Form. After the Courtesy Appointment is approved, faculty can submit an Account Request for their collaborator. Collaborators must submit account requests for their students using the student's university email address. Collaborators agree to abide by the External Collaborator policy.

External Collaborator Policy

External collaborators must agree to the following:

  1. Follow the University of Missouri's rules and policies listed above and your home institution's policies and rules.
  2. Data on the cluster is restricted to DCL1 or DCL2 as described above.
  3. Data storage and computation is for academic research purposes only; no personal, commercial, or administrative use.
  4. Follow the Research Computing cluster policy.
  5. Under no circumstances may access to the user account be shared or granted to third parties.
  6. As a collaborator you will be assigned different priorities and limits from the rest of the users in the cluster.
  7. Data is not backed up on the cluster, and Research Computing is not responsible for the integrity of the data, data loss, or the accuracy of the calculations performed.
  8. We ask that you give the University of Missouri, Division of IT, Research Computing Support Services acknowledgment for the use of the computing resources.
  9. We ask that you provide us with citations of any publications and/or products that utilized the computing resources.

Account Sharing

  • Direct sharing of account data on the cluster should only be done via a shared group folder. A shared group folder is set up by the faculty adviser or PI. This person is the group owner and can appoint other faculty to be co-owners. The owners and co-owners approve the members of the group and are responsible for all user additions and removals. The use of collaboration tools, such as Git, is encouraged for (indirect) sharing and backup of source data.
  • Sharing of accounts and ssh-keys is strictly prohibited. Sharing of ssh-keys will immediately result in account suspension.

Running Jobs

  • All jobs must be run using the SLURM job scheduler. Long-running or resource-heavy processes on the login node are subject to immediate termination.
  • Normal jobs running on the cluster are limited to two days of running time (see the example below). Jobs of up to 7 days may be run after consultation with the RSS team. Long jobs may occasionally be extended upon request. The number of long jobs running on the cluster is limited to ensure that users can run jobs in a timely manner. All jobs are subject to termination for urgent security or maintenance work or for the stability of the cluster.
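
For example, to stay within the normal two-day limit, the time request in an sbatch script can be written as:

#SBATCH --time=2-00:00:00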

Lewis Investor Policy

  • Investors purchase nodes and, in exchange for idle cycles, Research Computing Support Services provides the space, power, and cooling as well as management of the hardware, operating system, security, and scientific applications at no cost for five years. After five years, the nodes are placed in the Bonus pool for extended life and removed at the discretion of RSS based on operating conditions. Investors get prioritized access to their capacity via the SLURM FairShare scheduling policy, and unused cycles are shared with the community. Investors get 3TB of group storage and help migrating their computational research to the cluster. For large investments (rack scale), we will work with researchers and vendors to test and optimize configurations to maximize performance and value. Information on becoming an investor can be requested via muitrss@missouri.edu.

Acknowledgements

  • We ask that when you cite any of the RSS clusters in a publication to send an email to muitrss@missouri.edu as well as share a copy of the publication with us. To cite the use of any of the RSS clusters in a publication please use: "The computation for this work was performed on the high performance computing infrastructure provided by Research Computing Support Services and in part by the National Science Foundation under grant number CNS-1429294 at the University of Missouri, Columbia MO."

Lewis HPC Storage

HPC Storage is fastest when dealing with "large" files (>100MB). This is because files on HPC Storage are striped, which means they are split across multiple storage devices. In Lustre, file striping involves Object Storage Targets (OSTs). Much of the work Lustre does involves coordinating Object Storage Servers (OSSs) to reassemble a file from the various OSTs. Therefore, workloads that involve many small files create far more work for Lustre than workloads with the same amount of data that deal only with a few files.

The HPC Storage service should be used for loading large datasets into memory before processing them, or writing large output files. When data is too large to fit into memory, please use file streams with large chunk sizes to process files when possible. The HPC Storage service should not be used for millions of small files - such usage will impact performance for all users. To avoid inappropriate usage, consider using formats such as HDF5 or NetCDF to store large collections of related data. If you have any questions about your workflow, please contact us and we will be happy to help!

Our HPC Storage service (/storage/hpc, /data, /group, and /scratch) is a Lustre parallel filesystem ideal for storing data and results.

HPC Storage Capabilities

  • 595 TB of shared high speed parallel storage
  • 2 MDSs and 4 OSSs, serving 2 MDTs and 4 OSTs respectively

Example Use Cases

  1. I need to be able to get gigabytes of data from disk into memory as fast as possible
  2. I have many large files that need to be read by multiple nodes simultaneously. These files are too large to be stored on local scratch
  3. I need a place to store tarballs that will be extracted to local scratch for further processing

HPC Storage Best Practices

  • Store input and output that will be used in the near-term by batch jobs
  • Avoid large metadata operations, such as ls -la
  • Ensure your files are appropriate for Lustre
  • Tune your folders for your workload

Policy

  1. There are no backups of any storage. The RSS team is not responsible for data integrity or data loss. You are responsible for your own data and data backup. RSS recommends the UMKC Researcher Managed Backup Storage.
  2. Group directories on HPC Storage are located at /storage/hpc/group/group_name, and all users belonging to a group have the same access permissions by default. The PI for the group is the only person who can approve additions and removals of users in groups.

Appropriate Use

HPC Storage Should Be Used

  • Input and output, stored in an efficient container
    • Binary formats are usually great
    • CSV and TSV are good if loaded using big chunks
    • Other text formats are okay as long as they minimize random I/O
  • Read-Only Metadata

HPC Storage Should Not Be Used

  • Executable files and source code*
  • Small text files for input
    • Attempt to concatenate as many of these files as possible
  • Files larger than RAM that require random I/O
    • Use HTC Storage, then copy the files to local scratch before processing
  • Log files*
  • Read/Write Metadata*
  • Files intended for human use*

*Use your home directory instead

HPC Storage Must Not Be Used

  • Datasets stored as thousands of files under 1MB
    • Avoid this practice. Get in touch with us if you need help finding other solutions.
  • Files that require locks
    • The home filer supports file locks
  • Files that you are not prepared to lose in the event of a storage failure
    • NO RSS STORAGE SERVICE SHOULD BE USED FOR THESE FILES

Tuning

Unlike many filesystems, Lustre can be tuned by users in userspace. The most important commands to know for Lustre tuning are lfs getstripe and lfs setstripe. These commands show and modify stripe settings on files and folders. The stripe count is the number of Object Storage Targets (OSTs) that a file is stored on. Large files benefit from larger stripe counts, while small files benefit from smaller stripe counts.

Examples

Note: these changes only affect new files.

To get the current stripe information of a file or directory:

lfs getstripe <path>

To set up a directory to be used for small files (mostly <128MB):

lfs setstripe -c 1 <dir>

To set up a directory to be used for both small and large files:

lfs setstripe -c 2 <dir>

To set up a directory to be used for large files:

lfs setstripe -c 4 <dir>
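
As a brief usage sketch (the directory path is a placeholder), a new directory intended for large files can be given a higher stripe count and then verified:

# Create a directory for large files and raise its default stripe count
mkdir -p /storage/hpc/group/<group_name>/large_files
lfs setstripe -c 4 /storage/hpc/group/<group_name>/large_files

# Verify the default striping; new files created here will inherit it
lfs getstripe /storage/hpc/group/<group_name>/large_files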

Lewis HTC Storage

The High Throughput Computing Storage (HTC Storage) service has been designed for researchers with a need for large amounts of long term storage for High Throughput Computing (HTC) with cost as a primary consideration. HTC Storage investments are sold in 10TB increments.

HTC Storage Capabilities

  • 1240 TB of low computational intensity project storage
  • Utilizes the ZFS file system

Example Use Case

  1. I have a large amount of research data that I may want to quickly analyze on Lewis at a later date. Instead of constantly moving my data between sources, it would be nice to be able to have access to cost effective storage.

Service Policies

  1. The storage is only internally accessible on the Lewis cluster compute nodes and externally accessible on the Lewis login nodes and data transfer (DTN) nodes via rsync over ssh or sftp (see the example after this list).
  2. Storage is limited to DCL1 and DCL2 research data. Administrative, commercial, and personal data is prohibited.
  3. Storage is allocated in blocks and each storage block has its own quota mount point.
  4. Storage blocks are prepaid in full. New investments must be at least 50TB.
  5. Existing investments must be incremented in 10TB blocks.
  6. Storage nodes are expanded in large increments (100TB). Depending on available capacity, requests for storage blocks may be delayed until a storage node is expanded.
  7. The storage blocks expire after 5 years. Users must either purchase another storage block and transfer data to the new storage node; transfer data to another system; or do nothing and the data will be destroyed. Users may request a hardship waiver to the CI Council for temporary storage. No automatic data migration services are provided and users are responsible for the data integrity of moved data.
  8. Storage is based on ZFS and provided on a single node and single volume basis with no high availability (HA).
  9. There are no backups of the data. The data storage system is resilient to multiple disk failures (parity) and we do our best to protect the data but are not responsible for any data loss. Snapshots are available and count towards storage block.
  10. Storage is transferable but is not refundable.
  11. By default, all users in a group have the same access permissions.
  12. The group PI is the only one able to approve user additions and removals.
  13. The path to HTC storage is: /storage/htc/group_name
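
As a hedged example of transferring data over ssh, the hostname, username, and group name below are placeholders for your assigned login or DTN node and account:

# Copy a local results directory into HTC storage with rsync over ssh
rsync -av ./results/ <username>@<lewis-login-or-dtn-node>:/storage/htc/<group_name>/results/

# Or transfer files interactively with sftp
sftp <username>@<lewis-login-or-dtn-node>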

Grant Assistance

The RSS team is here to help with grants. We offer consultations and project reviews that include but are not limited to:

  • Security Reviews
  • Vendor Quotes
  • Letters of Support
  • Regional Partnerships
  • Data Management Plans
  • Facilities Description

Getting Started with RSS Grant Assistance

  • Please fill out our initial grant consultation form: the RSS Grant Requirements Form.
  • Once this is received, the RSS team will review the form and, if necessary, coordinate with the PI as well as the appropriate subject matter experts to determine the next steps.