Policies

Following are RSS policies and guidelines for different services and groups:


Software and Procurement Policy

Open Source Software

  1. Software installed cluster-wide must have an open source license (https://opensource.org/licenses) or be obtained through the procurement process, even if there is no cost associated with it.

Licensed Software

  1. Licensed software (any software that requires a license or agreement to be accepted) must follow the procurement process to protect users, their research, and the University. To ensure this, RSS must manage the license and the license server for any licensed software that RSS installs and supports.
  2. For widely used software, RSS can facilitate the sharing of license fees and/or may cover the cost depending on the cost and situation. Otherwise, users are responsible for funding for-fee licensed software, and RSS can handle the procurement process. We require that, if the license does not preclude it and there are no node or other resource limits, the software be made available to all users on the cluster. All licensed software installed on the cluster is to be used following the license agreement. We will do our best to install and support a wide range of scientific software as resources and circumstances dictate, but in general we only support scientific software that will run on CentOS in an HPC cluster environment. RSS may not support software that is implicitly or explicitly deprecated by the community.

Singularity Support

  1. A majority of scientific software and software libraries can be installed in users’ accounts or in group space. We also provide limited support for Singularity for advanced users who require more control over their computing environment. We cannot knowingly assist users to install software that may put them, the University, or their intellectual property at risk.
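
A minimal sketch of such a workflow is shown below, assuming Singularity is available in your environment (on some systems it must first be loaded as a module); the container image and the command run inside it are placeholders only:

    # Pull a container image from a public registry into your own space
    # (the image chosen here is only an example).
    singularity pull ubuntu.sif docker://ubuntu:22.04

    # Run a command inside the container; your home directory is
    # typically bind-mounted automatically.
    singularity exec ubuntu.sif cat /etc/os-release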

Research Network Policy

Research Network DNS

The domain name for the Research Network (RNet) is rnet.missouri.edu and is for research purposes only. All hosts on RNet will have a .rnet.missouri.edu domain. Subdomains and CNAMEs are not permitted. Reverse records will always point to a host in the .rnet.missouri.edu domain.

GPRN

The GPRN is a wired secure networking service of the Research Network for instruments, printers, workstations, and other computing devices that need to communicate with servers on RNet. GPRN provides private IPv4 addresses that are directly routed to the research network, and internet access is provided by the campus network (NAT). Machines inside GPRN can connect to RNet without restriction (all ports), and only the research network can connect to machines in the GPRN, via ssh only (port 22). Devices must use DHCP to connect to the network, and the device's host name will be registered (DDNS) at <hostname>.gprn.rnet.missouri.edu. Static DHCP leases are available for instruments that require IP addresses that do not change. A MoCode for the port fee is required to establish a connection.

Path Rule
Internet to GPRN Closed
RNet to GPRN Port 22/Routed
GPRN to RNet Open/Routed
GPRN to Internet NAT'd
VPN to GPRN Open

BioCompute Group Policy

  1. Access to the BioCompute partition requires a current investment in the Lewis cluster with at least 12 fairshare shares (or an entire machine) and approval of the BioCompute Advisory Committee. Investment must be made within 90 days of access or within 90 days after an investment expires.
  2. Access to the biocommunity Slurm account is on a per-project, time limited basis and needs approval of the BioCompute Advisory Committee and active coordination with the community.
  3. Special time-allocated access to the BioCompute partition and the biocommunity Slurm account is provided as a special allocation with an application and report (see CIC Special Allocation Documentation) and requires approval of the BioCompute Advisory Committee.
  4. Requests for access should be sent to muitrss@missouri.edu.
  5. Users need to specify a Slurm account (#SBATCH -A <account name>) other than general to access the BioCompute partition, as shown in the example below.
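
For illustration only, a job script requesting the BioCompute partition under a non-general Slurm account might look like the sketch below; the account name 'mygroup' and the executable are placeholders for your own values:

    #!/bin/bash
    # BioCompute investor partition; 2-day maximum time limit
    #SBATCH --partition=BioCompute
    #SBATCH --time=1-00:00:00
    # Placeholder account name; equivalent to '#SBATCH -A mygroup'
    #SBATCH --account=mygroup
    #SBATCH --cpus-per-task=4
    #SBATCH --mem=16G

    # Placeholder command
    srun ./my_analysis input.dat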

General Purpose Research Network Policy

The General Purpose Research Network (GPRN) is a wired networking service of the Research Network (RNet) for instruments, printers, workstations, and other computing devices that need to communicate with servers on RNet. GPRN provides private IPv4 addresses that are routed to the research network, and internet access is provided by the campus network (NAT), which provides additional security. Machines inside GPRN can connect to RNet without restriction (all ports), and only the Research Network can connect to machines in the GPRN, via ssh (port 22). Devices must use DHCP to connect to the network and will be named *.gprn.missouri.edu. Static DHCP leases and names are available for instruments and printers that require an IP address that does not change. A MoCode for the port fee is required to establish a connection.

Flow Rule
GPRN to RNet Open/Routed
RNet to GPRN Port 22/Routed
GPRN to Internet Open via NAT
Internet to GPRN Closed (NAT)
VPN to GPRN Open

Teaching Cluster Policy

The teaching cluster is meant as a resource for students and instructors for computational coursework. The teaching cluster is a full HPC cluster, and students are allowed to run jobs on the head node. Students must contact instructors for course-related questions and support.

Teaching Cluster Policy for Instructors

  1. The teaching cluster is provided to all UM students; students must be official UM students with a PawPrint or UM SSO ID.
  2. The environment is "research grade": no backups or high availability. Students/TAs are responsible for backing up any data throughout the semester.
  3. Only infrastructure support is provided; there is no student/end-user support. All support requests should come through the instructor or TAs via muitrss@missouri.edu. Support is best effort and provided during regular business hours.
  4. Software is limited to CentOS 7 packages installed via yum that require minimal configuration and a subset of Lewis scientific packages.
  5. We do not support a development environment/IDE. Users need to use either sftp or console-based text editors such as emacs/vi/nano (see the sftp sketch after this list). For Windows users we have a site license for MobaXterm: https://missouri.app.box.com/rcss-mobaxterm
  6. We take security seriously. We upgrade the entire environment (including rebooting) on a regular basis and without notice. SELinux is enforced.
  7. Students must be made aware of the "Teaching Cluster Policy" and the limitations of the environment.
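
As an illustration only, transferring files with sftp might look like the following session; the host name and file names are placeholders, and the actual login host is provided by your instructor:

    # Replace <teaching-cluster-host> with the login host given by your
    # instructor and <pawprint> with your UM PawPrint.
    sftp <pawprint>@<teaching-cluster-host>
    # At the sftp prompt, upload and download files, then quit:
    sftp> put homework1.py
    sftp> get results.txt
    sftp> quit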

Teaching Cluster Policy for all users

  1. Use of this system is governed by the rules and regulations of the University of Missouri and the University of Missouri System.
  2. Users must be familiar with and abide by the UM System acceptable use policy (CRR 110.005) and the UM System Data Classification System (DCL); see Collected Rules and Regulations - Chapter 110 and the Data Classification System.
  3. Only DCL1 data is permitted on the cluster (see Data Classification System - Definitions).
  4. This is a shared environment with limited storage, RAM, and CPU. We have implemented a 500MB quota for each user. Users will not be able to write or create new items after this limit is met until their total data usage across Clark is lower than this quota.
  5. Data is not backed up, and all data is deleted when students graduate. This policy may be revised.
  6. Students must contact instructors for course related questions and support.

Secure4 Policy

The Secure4 Research Cluster is a secured computational research cluster that hosts data falling within the DCL3 and DCL4 classifications per University guidelines regarding the Data Classification System. New requests to be added to the Secure4 environment are not being accepted at this time. If you have DCL3 or DCL4 data that requires HPC resources, please reach out to muitrss@missouri.edu.

Data Repositories

Data repositories are hosted on the Secure4 HTC storage with HTC storage rates applied. To create a data repository, the main PI must provide the IRB information, data requirements, designated repository administrator, and any other pertinent information specific to the data repository request. The data repository administrator is required to be part of the IRB and is responsible for controlling transfer of data from the repository to the requester. For transfers of data, the data repository administrator should create a transfer folder with a subfolder for the specific transfer request. When data has been put into the subfolder, a ticket should be opened via Cherwell and assigned to RSS ("Research Computing" group under DoIT). When the ticket comes up in the queue, RSS will change the group owner of the subfolder to the requesting group and then close the ticket. Data repositories can only be transferred to Secure4 accounts. All requests are and will be tracked via Cherwell.
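
A possible staging sequence for the repository administrator is sketched below; the repository path and folder names are placeholders, and the group-ownership change itself is performed by RSS once the Cherwell ticket is assigned:

    # Create a transfer folder with a subfolder for this specific request
    # (the repository path shown is a placeholder).
    mkdir -p /storage/htc/myrepository/transfer/request-001

    # Copy only the approved data into the request subfolder.
    cp -a /storage/htc/myrepository/data/approved_subset \
        /storage/htc/myrepository/transfer/request-001/

    # Then open a Cherwell ticket assigned to RSS ("Research Computing"
    # group under DoIT); RSS changes the group owner of request-001
    # to the requesting group and closes the ticket.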

Accounts

Researchers at the University of Missouri or the University of Missouri System may request project workspace and user accounts on Secure4. To request a project workspace and/or user accounts, the PI must contact muitrss@missouri.edu to set up a consultation. Students, staff, and faculty who are part of the University of Missouri or the University of Missouri System may have user accounts created and be associated with projects at the request of the project's PI and pending IRB confirmation. Accounts can be requested for collaborators outside of the University of Missouri and University of Missouri System once the collaborator has a courtesy account and is added to the IRB. It is the PI's responsibility to request that user accounts be removed from projects when a member of the project leaves.

IT Pros

Department IT Pros are responsible for securing, encrypting, and ensuring patch enforcement and management of users' workstations and laptops, and for assisting with verification that Secure4 users have created a passphrase-protected ssh key pair. IT Pros will pass along the user's public ssh key in the Cherwell ticket for the account request.
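
A minimal sketch of creating and verifying a passphrase-protected ssh key pair with OpenSSH follows; the key file name and comment are example choices, not requirements:

    # Generate an ed25519 key pair; enter a non-empty passphrase when prompted.
    ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519_secure4 -C "secure4 access"

    # Verify the private key is passphrase-protected: this prints the
    # public key only after the passphrase is entered correctly.
    ssh-keygen -y -f ~/.ssh/id_ed25519_secure4

    # The public key to pass along in the Cherwell ticket:
    cat ~/.ssh/id_ed25519_secure4.pub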


NSF MRI-RC Policy

  1. Access to the MRI-RC account requires permission, please contact PI Chi-Ren Shyu (shyuc@missouri.edu) for more details.
  2. Users who utilize the MRI-RC account must include the acknowledgement "The computation in this paper was supported by NSF Grant #1429294 and by RSS" in their publications.

NSF MRI-RC Information:

  1. National Science Foundation Grant: CNS-1429294
  2. https://www.nsf.gov/awardsearch/showAward?AWD_ID=1429294&HistoricalAwards=false
  3. http://munews.missouri.edu/news-releases/2014/0926-nsf-grants-1-million-to-mu-to-expand-supercomputer-equipment-and-expertise-for-big-data-analytics-at-mu/

Storage Policy

The Lewis HPC cluster contains six distinct types of storage: Home, HPC, HTC, GPRS, Local Scratch, and UMKC Researcher Managed Backup. The description and appropriate use of each is detailed in this document. For more information on purchasing additional storage, please send a request to muitrss@missouri.edu.

Note: There are no backups of any storage. The RSS team is not responsible for data integrity or data loss. You are responsible for your own data and data backup. RSS recommends the UMKC Researcher Managed Backup Storage.

Core Lewis File Systems

  • Home
    • Location: /home/$USER
    • Quota: 5GB
    • Properties: SSD, compressed
    • Purpose: A good location to store source code, scripts, and other (small) important files. Extremely fast, but very small.
  • HPC
    • Location: /storage/hpc/* (includes /data, /group, and /scratch)
    • Quota: 100GB by default; $15.50/TB/month for quotas larger than 500GB
    • Properties: High Performance Parallel File System (Lustre), compressed
    • Purpose: A good location for storing raw data and results. Very fast - optimized for High Performance Computing (HPC) parallel workflows. Group quotas are applied to the /group folder instead of user quotas if the file permissions are correct.
  • HTC
    • Location: /storage/htc/*
    • Cost: $160/TB/60 months
    • Properties: Economical Large Storage File System
    • Purpose: Large bulk storage of data needed for workflows. Optimized for High Throughput Computing (HTC) workloads, e.g. DNA sequencing or image processing. Not appropriate for highly parallel workflows or for simultaneous access to a single file by many hosts (see HPC storage).
  • Local Scratch
    • Location: /local/scratch/$USER
    • Quota: varies by partition (500 GB - 14 TB)
    • Properties: HDDs or SSDs depending on partition
    • Purpose: Very fast scratch space - only usable for the duration of a job (not persistent). The user is responsible for cleaning up data at the end of each job.

Note: HTC Storage - The minimum investment for HTC Storage is 10TB for 5 years, which at the rate above ($160/TB per 60 months) is 10TB x $160 = $1,600.

Groups located in HPC or HTC storage (/storage/hpc/group/group_name or /storage/htc/group/group_name) are considered flat and all users located within a specific group have the same permissions. The PI for the group is the only person who can approve additions and removals of users in groups.
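
As a rough illustration, usage against these quotas can be checked from a login node with standard tools; the 'lfs' commands assume the Lustre client utilities are installed, and the group name is a placeholder:

    # Home directory usage against the 5GB quota
    du -sh /home/$USER

    # User quota on the HPC (Lustre) file system
    lfs quota -u $USER /storage/hpc

    # Group quota on a group folder (replace 'mygroup' with your group)
    lfs quota -g mygroup /storage/hpc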

Edge Lewis File Systems

These file systems are mounted only on certain non-compute hosts on Lewis, namely the Login and DTN nodes. These are used to move data into and out of the cluster, but never for direct computation.

  • GPRS
    • Location: /gprs/*
    • Cost: $7/TB/mo
    • Properties: Highly reliable; does not go down when Lewis is under maintenance
    • Purpose: Accessible from outside the cluster, including Windows machines. Ideal for attaching storage to instruments (e.g. a DNA sequencing machine) or for users with highly variable data sizes month to month. Not suitable for any computational workflows - archive and transfer only.
  • UMKC Researcher Managed Backup
    • Location: not mounted on Lewis; reached via rsync from the DTNs to UMKC (firewall-permitted)
    • Cost: $2/TB/mo
    • Properties: Off-site backup, similar to HTC storage
    • Purpose: Researcher Managed Backup - an rsync target located in the UMKC data center. Ideal for user-managed backup schemes.

Note: GPRS Storage - Managed by the University Enterprise Storage Team in concert with RSS. See GPRS Storage Info for more info.

UMKC Researcher Managed Backup - Managed by University of Missouri - Kansas City. More information is available at Research Managed Backup Storage. To purchase Research Managed Backup Storage please email umkcisrcss@umkc.edu.
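
As an illustration only, a researcher-managed backup run from a Lewis DTN might look like the rsync invocation below; the remote user, host, and path are placeholders for the target details provided by UMKC when the storage is purchased:

    # Push a project directory to the UMKC rsync target
    # (remote user, host, and path are placeholders).
    rsync -av /storage/hpc/group/mygroup/project/ \
        <user>@<umkc-backup-host>:project/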

Data Retention Policies

Location Policy
/home, /data Subject to removal for inactive users or users no longer associated with the institution associated with their account
/group Subject to removal for inactive or empty groups
/scratch Data can be removed automatically after 10 days.
/local/scratch Data can be removed automatically after job exits.

Note: The RSS team reserves the right to delete anything in /scratch and /local/scratch at any time for any reason.


Partition Policy

Partition Usage Guidelines

  • General is the default partition.
    • default time limit: 2 hours
    • maximum time limit: 4 hours
    • single node jobs only
  • Lewis is designed for longer running multi-core and multi-node jobs.
    • default time limit: 2 hours
    • maximum time limit: 2 days
    • InfiniBand Fabric available for MPI jobs
    • optimized for HPC workloads
    • not suitable for large multi-node (>100 cores) jobs; use a partition listed in the MPI document (e.g. 'hpc5') instead
  • Interactive is designed for short interactive testing, interactive debugging, and general interactive jobs.
    • maximum time limit: 4 hours
    • running production codes is not permitted
  • Gpu is designed for GPU-accelerated workflows.
  • QOS 'long' is designed for non-serial code that needs more than 48 hours to run.
    • requires permission: contact muitrss@missouri.edu to gain access to this Quality of Service (QOS)
    • requires the QOS 'long'
    • multi-core and multi-node jobs only
    • maximum time limit: 7 days
    • works in the 'Lewis' partition and any of the HPC sub-partitions (e.g. 'hpc5'); see the sketch after this list
  • Serial is designed for non-parallel code that needs more than 48 hours to run.
    • requires permission: contact muitrss@missouri.edu to gain access to this partition
    • requires the QOS 'seriallong'
    • single core jobs only
    • maximum time limit: 28 days
    • optimized for HTC workloads
  • Dtn is only for long running data manipulation jobs.
    • requires permission: contact muitrss@missouri.edu to gain access to this partition
    • requires the QOS 'dtn'
    • maximum time limit: 2 days
    • acceptable use includes file transfer (rsync, sftp, wget), checksums (md5sum, sha256sum), compression (tar, zip, gz), and encryption
  • BioCompute is for BioCompute Investors.
    • requires permission
    • maximum time limit: 2 days
    • optimized for HTC workloads
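
For illustration, a multi-core job using the QOS 'long' in the Lewis partition (after access has been granted) might be requested as sketched below; the resource numbers and executable are placeholders:

    #!/bin/bash
    # The QOS 'long' works in the 'Lewis' partition and the HPC sub-partitions
    #SBATCH --partition=Lewis
    # Requires prior approval from RSS
    #SBATCH --qos=long
    # Within the 7-day QOS limit
    #SBATCH --time=5-00:00:00
    #SBATCH --nodes=1
    # Multi-core job; QOS 'long' is not for serial jobs
    #SBATCH --ntasks=8
    #SBATCH --mem=32G

    # Placeholder executable
    srun ./my_parallel_code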

Job Cancellation Policy

While we try to avoid canceling jobs, the following events would merit job cancellation:

  • Emergency Security Event/Patches
    • The cluster is shut down immediately, up to and including killing all active jobs, depending on the severity of the incident.
  • Urgent Security Patching
    • Login nodes will be closed and the cluster will be shut down after all jobs of 2 days or less complete; jobs longer than 2 days will be terminated.
  • Scheduled Maintenance
    • We will communicate the approximate date well in advance and give 2 weeks' notice. The maintenance reservation will ensure that jobs of less than 7 days will complete in time. Any long serial jobs will be terminated.

Available Partitions

Partition Name Time Limit Nodes Cores (per node*) Cores (total) Memory in Gigabytes (per node*) Memory in Gigabytes (total) Processors
General 04:00:00 172 24+ 5636 122+ 46873 Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz, Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
Interactive 04:00:00 5 24+ 144 251+ 1515 Intel(R) Xeon(R) CPU E5-2695 v2 @ 2.40GHz, Intel(R) Xeon(R) CPU E7-4850 v2 @ 2.30GHz
Serial 2-00:00:00 1 64 64 1025 1025 AMD EPYC 7601 32-Core Processor
Dtn 2-00:00:00 2 16+ 36 66+ 188 Intel(R) Xeon(R) CPU X5550 @ 2.67GHz, Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz
Gpu 02:00:00 15 16+ 284 122+ 1837 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz, Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz
BioCompute 2-00:00:00 37 56 2072 509 18853 Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
Lewis 2-00:00:00 135 24+ 3564 122+ 28020 Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz, Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
htc4 2-00:00:00 37 56 2072 509 18853 Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
hpc5 2-00:00:00 33 40 1320 379 12516 Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz
hpc6 2-00:00:00 62 48 2976 379+ 23502 Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz
hpc4rc 2-00:00:00 36 28 1008 251 9055 Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
hpc4 2-00:00:00 45 28 1260 251+ 11318 Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
hpc3 2-00:00:00 54 24 1296 122+ 7646 Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz
z10ph-hpc3 2-00:00:00 32 24 768 122 3920 Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz
s2600tpr-hpc4 2-00:00:00 4 28 112 251 1006 Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
r640-hpc5 2-00:00:00 33 40 1320 379 12516 Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz
r630-htc4 2-00:00:00 37 56 2072 509 18853 Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
r630-hpc4rc 2-00:00:00 36 28 1008 251 9055 Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
r630-hpc4 2-00:00:00 41 28 1148 251 10312 Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
r630-hpc3 2-00:00:00 22 24 528 122+ 3725 Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz

Note:

  • *For 'Cores (per node)' and 'Memory (per node)', a '+' indicates a mixed environment. The number before the plus represents the minimum.