Policies

The following are RCSS policies and guidelines for different services and groups:


Software and Procurement Policy

Open Source Software

  1. Software installed cluster-wide must have an open source license (https://opensource.org/licenses) or be obtained through the procurement process, even if there is no cost associated with it.

Licensed Software

  1. Licensed software (any software that requires a license or agreement to be accepted) must follow the procurement process to protect users, their research, and the University. To ensure this, RCSS must manage the license and the license server for any licensed software that RCSS installs and supports.
  2. For widely used software, RCSS can facilitate the sharing of license fees and/or may cover the cost, depending on the cost and situation. Otherwise, users are responsible for funding fee-based licensed software, and RCSS can handle the procurement process. If the license does not preclude it, and there are no node or other resource limits, we require that the software be made available to all users on the cluster. All licensed software installed on the cluster must be used in accordance with its license agreement. We will do our best to install and support a wide range of scientific software as resources and circumstances dictate, but in general we only support scientific software that will run on CentOS in an HPC cluster environment. RCSS may not support software that is implicitly or explicitly deprecated by the community.

Singularity Support

  1. A majority of scientific software and software libraries can be installed in users’ accounts or in group space. We also provide limited support for Singularity for advanced users who require more control over their computing environment. We cannot knowingly assist users to install software that may put them, the University, or their intellectual property at risk.
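
For example, an advanced user could pull a container image into their own space and run software from it. The sketch below uses a generic public example image, not an RCSS-provided one:

    # Pull a public example container image into your own directory
    singularity pull lolcow.sif docker://godlovedc/lolcow

    # Run a command from inside the container
    singularity exec lolcow.sif cowsay "hello from Lewis"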

Research Network Policy

Research Network DNS

The domain name for the Research Network (RNet) is rnet.missouri.edu and is for research purposes only. All hosts on RNet will have a .rnet.missouri.edu domain. Subdomains and CNAMEs are not permitted. Reverse records will always point to a host in the .rnet.missouri.edu domain.

GPRN

The GPRN is a wired secure networking service of the Research Network for instruments, printers, workstations, and other computing devices that need to communicate with servers on RNet. GPRN provides private IPv4 addresses that are directly routed to the Research Network, and internet access is provided by the campus network (NAT). Machines inside the GPRN can connect to RNet without restriction (all ports), and only the Research Network can connect to machines in the GPRN, via SSH only (port 22); see the example after the table below. Devices must use DHCP to connect to the network, and the device's host name will be registered (DDNS) as <hostname>.gprn.rnet.missouri.edu. Static DHCP leases are available for instruments that require IP addresses that do not change. A MoCode for the port fee is required to establish a connection.

Path                Rule
Internet to GPRN    Closed
RNet to GPRN        Port 22/Routed
GPRN to RNet        Open/Routed
GPRN to Internet    NAT'd
VPN to GPRN         Open
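
For instance, a researcher on an RNet host (or connected to the VPN) could reach a registered GPRN device over SSH; the host name below is a hypothetical example:

    # SSH (port 22) is the only inbound path from RNet/VPN to a GPRN device
    ssh username@myinstrument.gprn.rnet.missouri.edu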

BioCompute Group Policy

  1. Access to the BioCompute partition requires a current investment in the Lewis cluster with at least 12 fairshare shares (or an entire machine) and approval of the BioCompute Advisory Committee. Investment must be made within 90 days of access or within 90 days after an investment expires.
  2. Access to the biocommunity Slurm account is on a per-project, time limited basis and needs approval of the BioCompute Advisory Committee and active coordination with the community.
  3. Special time-allocated access to the BioCompute partition and the biocommunity Slurm account is provided as a special allocation with an application and report (see CIC Special Allocation Documentation) and requires approval of the BioCompute Advisory Committee.
  4. Requests for access should be sent to mudoitrcss@missouri.edu.
  5. Users need to specify a Slurm account (#SBATCH -A <account name>) other than general to access the BioCompute partition; a minimal example follows this list.
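
A minimal batch-script header for the BioCompute partition might look like the sketch below; the account name my-biocompute-account is a placeholder for whatever Slurm account your group has been approved to use:

    #!/bin/bash
    #SBATCH --partition=BioCompute            # BioCompute investor partition
    #SBATCH --account=my-biocompute-account   # placeholder: your approved Slurm account (not 'general')
    #SBATCH --time=1-00:00:00                 # within the partition's 2-day limit
    #SBATCH --ntasks=1

    srun hostname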

General Purpose Research Network Policy

The General Purpose Research Network (GPRN) is a wired networking service of the Research Network (RNet) for instruments, printers, workstations, and other computing devices that need to communicate with servers on RNet. GPRN provides private IPv4 addresses that are routed to the Research Network, and internet access is provided by the campus network (NAT), providing additional security. Machines inside the GPRN can connect to RNet without restriction (all ports), and only the Research Network can connect to machines in the GPRN, via SSH (port 22). Devices must use DHCP to connect to the network and will be named *.gprn.rnet.missouri.edu. Static DHCP leases and names are available for instruments and printers that require an IP address that does not change. A MoCode for the port fee is required to establish a connection.

Flow                Rule
GPRN to RNet        Open/Routed
RNet to GPRN        Port 22/Routed
GPRN to Internet    Open via NAT
Internet to GPRN    Closed (NAT)
VPN to GPRN         Open

Teaching Cluster Policy

The teaching cluster is meant as a resource for students and instructors for scientific computing. The teaching cluster is a full HPC cluster, and students are allowed to run jobs on the head node. Students must contact instructors for course-related questions and support.

Teaching Cluster Policy for Instructors

  1. The teaching cluster is provided to all UM students; students must be official UM students with a PawPrint or UM SSO ID.
  2. The environment is "research grade": no backups or high availability. Students/TAs are responsible for backing up any data throughout the semester.
  3. Only infrastructure support is provided; there is no student/end-user support. All support requests should come through the instructor or TAs via mudoitrcss@missouri.edu. Support is best effort and provided during regular business hours.
  4. Software is limited to CentOS 7 packages installed via yum that require minimal configuration and a subset of Lewis scientific packages.
  5. We do not support a development environment/IDE. Users need to use either sftp or emacs/vi/nano or other console-based text editors. For Windows users we have a site license for MobaXterm: https://missouri.app.box.com/rcss-mobaxterm
  6. We take security seriously. We upgrade the entire environment (including rebooting) on a regular basis and without notice. SELinux is enforced.
  7. Students must be made aware of the "Teaching Cluster Policy" and the limitations of the environment.

Teaching Cluster Policy for All Users

  1. Use of this system is governed by the rules and regulations of the University of Missouri and the University of Missouri System.
  2. Users must be familiar with and abide by the UM System acceptable use policy (Collected Rules and Regulations - Chapter 110, CRR 110.005) and the UM System Data Classification System (DCL).
  3. Only DCL 1 data is permitted on the cluster (see Data Classification System - Definitions).
  4. This is a shared environment with limited storage, RAM, and CPU. We have implemented a 500 MB quota for each user. Users will not be able to write or create new items after this limit is met until their total data usage across Clark is lower than the quota (see the example after this list for checking usage).
  5. Data is not backed up, and all data is deleted when students graduate. This policy may be revised.
  6. Students must contact instructors for course-related questions and support.
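
A simple way to check your usage against the 500 MB quota is to total your home directory; whether a dedicated quota command is available on Clark is not covered here, so this sketch just sums with du:

    # Report total size of your home directory (compare against the 500 MB quota)
    du -sh ~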

Teaching VM Policy

The web-based scientific computing teaching VM is meant as a resource for courses. Students must contact instructors for course-related questions and support.

Web Based Scientific Computing Teaching VM Policy

  1. The teaching virtual machine is provided for a single course and will be removed, and all data deleted, at the end of the semester (after grades are due).
  2. The environment is "research grade": no backups or high availability. Students/TAs are responsible for backing up any data throughout the semester.
  3. Only infrastructure support is provided; there is no student/end-user support. All support requests should come through TAs/DBAs to mudoitrcss@missouri.edu. Support is best effort and provided during regular business hours.
  4. Software is limited to CentOS 7/Fedora 25 packages installed via yum that require minimal configuration (see the example after this list).
  5. We only support PHP at its latest version and stock PHP modules/functions from CentOS 7/Fedora 25 (i.e., only a yum install php-package / dnf install php-package).
  6. We support PostgreSQL, SQLite3, and MariaDB. We will install CentOS 7 database drivers for any languages, as well as the command-line clients.
  7. We do not support a development environment/IDE. Users need to use either sftp or emacs/vi or other console-based text editors. For Windows users we have a site license for MobaXterm.
  8. Access to web pages is limited by authentication, and passwords/access must not be made public.
  9. We take security seriously. We upgrade the entire environment (including rebooting) on a regular basis and without notice. SELinux is enforced.
  10. Only DCL 1 data is permitted on the VM (see Data Classification System - Definitions).
  11. This is a shared environment with no storage, RAM, or CPU quotas/limits.
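
For instance, a software request handled this way would amount to installing stock packages from the distribution repositories; the package names below are common CentOS 7 names and are shown only as an illustration:

    # Stock PHP plus a PostgreSQL driver and command-line client from the CentOS 7 repositories
    sudo yum install php php-pgsql postgresql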

Secure4 Policy

The Secure4 Research Cluster is a secured computational research cluster that hosts data falling within the DCL3 and DCL4 classifications per University guidelines regarding the Data Classification System. Requests to use the Secure4 Research Cluster must be emailed to mudoitrcss@missouri.edu and should be sent by the primary PI. Projects must be IRB approved (although exceptions may be made on a per-project basis). All projects will be processed per the specifics of the data repository requirements and processes.

Data Repositories

Data repositories are hosted on the Secure4 HTC storage, with HTC storage rates applied. To create a data repository, the main PI must provide the IRB information, data requirements, designated repository administrator, and any other pertinent information specific to the data repository request. The data repository administrator is required to be part of the IRB and is responsible for controlling transfer of data from the repository to the requester. For transfers of data, the data repository administrator should create a transfer folder with a subfolder for the specific transfer request. When data has been placed in the subfolder, a ticket should be opened via Cherwell and assigned to RCSS ("Research Computing" group under DoIT). When the ticket comes up in the queue, RCSS will change the group owner of the subfolder to the requesting group and then close the ticket. Data repositories can only be transferred to Secure4 accounts. All requests are and will be tracked via Cherwell.
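
The shell sketch below illustrates the transfer-folder layout described above; the repository path, file, and group name are placeholders rather than real Secure4 paths:

    # Repository administrator: stage data for a specific transfer request
    mkdir -p /secure4/repository/transfer/request-001      # placeholder path
    cp data_subset.tar.gz /secure4/repository/transfer/request-001/

    # After the Cherwell ticket is processed, RCSS changes the group owner of the
    # subfolder to the requesting group, conceptually:
    #   chgrp -R requesting-group /secure4/repository/transfer/request-001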

Accounts

Researchers at the University of Missouri or the University of Missouri System may request project workspace and user accounts on Secure4. To request a project workspace and/or user accounts, the PI must contact mudoitrcss@missouri.edu to set up a consultation. Students, staff, and faculty who are part of the University of Missouri or the University of Missouri System may have user accounts created and be associated with projects at the request of the project's PI and pending IRB confirmation. Accounts can be requested for collaborators outside of the University of Missouri and University of Missouri System once the collaborator has a courtesy account and is added to the IRB. It is the PI's responsibility to request that user accounts be removed from projects when a member leaves the project.

IT Pros

Department IT Pros are responsible for securing, encrypting, and ensuring patch enforcement and management of users' workstations and laptops, and for assisting with verification of passphrase-protected SSH key pair creation for Secure4 users. IT Pros will pass along the user's public SSH key in the Cherwell ticket for the account request.
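
A passphrase-protected key pair can be generated roughly as shown below; the key type and file name are illustrative, and only the public half (the .pub file) should be included in the Cherwell ticket:

    # Generate a key pair; enter a passphrase when prompted (do not leave it empty)
    ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519_secure4 -C "secure4 access"

    # Only the public key is shared with the IT Pro / RCSS
    cat ~/.ssh/id_ed25519_secure4.pub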


NSF MRI-RC Policy

  1. Access to the MRI-RC account requires permission; please contact PI Chi-Ren Shyu (shyuc@missouri.edu) for more details.
  2. Users who utilize the MRI-RC account must acknowledge that "The computation in this paper was supported by NSF Grant #1429294 and by RCSS" in their publications.

NSF MRI-RC Information:

  1. National Science Foundation Grant: CNS-1429294
  2. https://www.nsf.gov/awardsearch/showAward?AWD_ID=1429294&HistoricalAwards=false
  3. http://munews.missouri.edu/news-releases/2014/0926-nsf-grants-1-million-to-mu-to-expand-supercomputer-equipment-and-expertise-for-big-data-analytics-at-mu/

Storage Policy

The Lewis HPC cluster contains six distinct types of storage: Home, HPC, HTC, GPRS, Local Scratch, and UMKC Researcher Managed Backup. The description and appropriate use of each is detailed in this document. For more information on purchasing additional storage, please send a request to mudoitrcss@missouri.edu.

Note: There are no backups of any storage. The RCSS team is not responsible for data integrity and data loss. You are responsible for your own data and data backup. RCSS recommends the UMKC Researcher Managed Backup Storage.

Core Lewis File Systems

  • Home
    • Location: /home/$USER
    • Quota: 5 GB
    • Properties: SSD, compressed
    • Purpose: A good location to store source code, scripts, and other (small) important files. Extremely fast, but very small.
  • HPC
    • Location: /storage/hpc/* (includes /data, /group, and /scratch)
    • Quota: 100 GB by default, up to 500 GB by request; $15.50/TB/month for quotas larger than 500 GB
    • Properties: High Performance Parallel File System (Lustre), compressed
    • Purpose: A good location for storing raw data and results. Very fast - optimized for High Performance Computing (HPC) parallel work flows. Group quotas will be applied to the /group folder instead of user quotas if the file permissions are correct.
  • HTC
    • Location: /storage/htc/*
    • Quota: $160/TB/60 months
    • Properties: Economical Large Storage File System
    • Purpose: Large bulk storage of data needed for work flows. Optimized for High Throughput Computing (HTC) workloads, e.g. DNA sequencing or image processing. Not appropriate for highly parallel work flows or simultaneous access to a single file by many hosts (see HPC storage).
  • Local Scratch
    • Location: /local/scratch/$USER
    • Quota: varies by partition (500 GB - 14 TB)
    • Properties: HDDs or SSDs depending on partition
    • Purpose: Very fast scratch space - only usable for the duration of a job (not persistent). The user is responsible for cleaning up data at the end of each job.

Note: HTC Storage - The minimum investment for HTC Storage is 10 TB for 5 years, which at $160/TB for 60 months comes to 10 TB x $160/TB = $1,600 (roughly $2.67/TB/month).

Groups located in HPC or HTC storage (/storage/hpc/group/group_name or /storage/htc/group/group_name) are considered flat, and all users within a specific group have the same permissions. The PI for the group is the only person who can approve additions and removals of users in groups.
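
Because group quotas on /group depend on files carrying the correct group ownership and permissions (see the HPC entry above), a minimal sketch of fixing ownership, with a placeholder group name and project folder, is:

    # Ensure files in the group folder belong to the group and are group-accessible
    # ('my_group' and 'projectA' are placeholders)
    chgrp -R my_group /storage/hpc/group/my_group/projectA
    chmod -R g+rwX /storage/hpc/group/my_group/projectA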

Edge Lewis File Systems

These file systems are mounted only on certain non-compute hosts on Lewis, namely the Login and DTN nodes. These are used to move data into and out of the cluster, but never for direct computation.

  • GPRS
    • Location: /gprs/*
    • Quota: $7/TB/month
    • Properties: Highly reliable; does not go down when Lewis is under maintenance
    • Purpose: Accessible from outside the cluster, including Windows machines. Ideal for attaching storage to instruments (e.g. a DNA sequencing machine) or for users with highly variable data sizes month to month. Not suitable for any computational work flows - archive and transfer only.
  • UMKC Researcher Managed Backup
    • Location: Not mounted on Lewis; reached via rsync from the DTNs to UMKC (the path preferred by the firewall)
    • Quota: $2/TB/month
    • Properties: Off-site backup, similar to HTC storage
    • Purpose: Researcher Managed Backup - an rsync target located in the UMKC data center. Ideal for user-managed backup schemes.

Note: GPRS Storage - Managed by the University Enterprise Storage Team in concert with RCSS. See GPRS Storage Info for more info.

UMKC Researcher Managed Backup - Managed by University of Missouri - Kansas City. More information is available at Research Managed Backup Storage. To purchase Research Managed Backup Storage please email umkcisrcss@umkc.edu.
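
A user-managed backup to the UMKC target would be run from a DTN and would look roughly like the sketch below; the destination host and path are placeholders, so consult RCSS/UMKC for the actual target details:

    # From a Lewis DTN: push a project directory to the UMKC rsync target
    rsync -avh --progress /storage/htc/group/my_group/projectA/ \
        username@umkc-backup.example.edu:/backup/my_group/projectA/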

Data Retention Policies

Location          Policy
/home, /data      Subject to removal for inactive users or users no longer associated with the institution associated with their account
/group            Subject to removal for inactive or empty groups
/scratch          Data can be removed automatically after 10 days
/local/scratch    Data can be removed automatically after the job exits

Note: The RCSS team reserves the right to delete anything in /scratch and /local/scratch at any time for any reason.
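
Because /local/scratch does not persist past the job, a common pattern is to stage data in, compute, and copy results back before the job ends; a minimal sketch (paths and the analysis command are placeholders) is:

    #!/bin/bash
    #SBATCH --partition=Lewis
    #SBATCH --time=0-08:00:00

    SCRATCH=/local/scratch/$USER/$SLURM_JOB_ID
    mkdir -p "$SCRATCH"

    cp /storage/hpc/data/$USER/input.dat "$SCRATCH"/    # stage input onto fast local disk
    cd "$SCRATCH"
    my_analysis input.dat > results.out                 # placeholder command

    cp results.out /storage/hpc/data/$USER/             # copy results back before the job ends
    rm -rf "$SCRATCH"                                   # clean up local scratch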


Partition Policy

Partition Usage Guidelines

  • General is the default partition.
    • default time limit: 2 hours
    • maximum time limit: 4 hours
    • single node jobs only
  • Lewis is designed for longer running multi-core and multi-node jobs (an example submission header is shown after this list).
    • default time limit: 2 hours
    • maximum time limit: 2 days
    • InfiniBand Fabric available for MPI jobs
    • optimized for HPC workloads
    • not suitable for large multi-node (>100 cores) jobs; use a partition listed in the MPI document (e.g. 'hpc5') instead
  • Interactive is designed for short interactive testing, interactive debugging, and general interactive jobs.
    • maximum time limit: 4 hours
    • acceptable use: running production codes is not permitted
  • Gpu is designed for GPU-accelerated workflows.
  • QOS 'long' is designed for non-serial code that needs more than 48 hours to run.
    • requires permission: contact mudoitrcss@missouri.edu to gain access to this Quality of Service (QOS)
    • requires the QOS 'long'
    • multi-core and multi-node jobs only
    • maximum time limit: 7 days
    • works in the 'Lewis' partition and any of the HPC sub-partitions (e.g. 'hpc5')
  • Serial is designed for non-parallel code that needs more than 48 hours to run.
    • requires permission: contact mudoitrcss@missouri.edu to gain access to this partition
    • requires the QOS 'seriallong'
    • single core jobs only
    • maximum time limit: 28 days
    • optimized for HTC workloads
  • Dtn is only for long running data manipulation jobs.
    • requires permission: contact mudoitrcss@missouri.edu to gain access to this partition
    • requires the QOS 'dtn'
    • maximum time limit: 2 days
    • acceptable use includes file transfer (rsync, sftp, wget), checksums (md5sum, sha256sum), compression (tar, zip, gz), and encryption
  • BioCompute is for BioCompute Investors.
    • requires permission
    • maximum time limit: 2 days
    • optimized for HTC workloads
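
As an illustration of how these limits map onto a job script, the header below requests a multi-core job in the Lewis partition; the commented QOS line shows where the 'long' QOS would be added once access has been granted (node/core counts and the executable are placeholders):

    #!/bin/bash
    #SBATCH --partition=Lewis
    #SBATCH --nodes=2
    #SBATCH --ntasks-per-node=24
    #SBATCH --time=2-00:00:00       # Lewis maximum: 2 days
    ##SBATCH --qos=long             # uncomment only after being granted the 'long' QOS (up to 7 days)

    srun ./my_mpi_program           # placeholder executable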

Job Cancellation Policy

While we try to avoid canceling jobs, the following events would merit job cancellation:

  • Emergency Security Event/Patches
    • The cluster is shut down immediately, up to and including killing all active jobs, depending on the severity of the incident.
  • Urgent Security Patching
    • Login nodes will be closed and the cluster will be shut down after all jobs of 2 days or less complete; jobs of 2 days or longer will be terminated.
  • Scheduled Maintenance
    • We will communicate the approximate date well in advance and give 2 weeks' notice. The maintenance reservation will ensure that jobs of less than 7 days will complete in time. Any long serial jobs will be terminated.

Available Partitions

Partition Name | Time Limit | Nodes | Cores (per node*) | Cores (total) | Memory MB (per node*) | Memory MB (total) | Processors
General        | 04:00:00   | 172   | 24+               | 5636          | 122384+               | 46873788          | Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz, Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
Interactive    | 04:00:00   | 5     | 24+               | 144           | 251550+               | 1515828           | Intel(R) Xeon(R) CPU E5-2695 v2 @ 2.40GHz, Intel(R) Xeon(R) CPU E7-4850 v2 @ 2.30GHz
Serial         | 2-00:00:00 | 1     | 64                | 64            | 1025401               | 1025401           | AMD EPYC 7601 32-Core Processor
Dtn            | 2-00:00:00 | 2     | 16+               | 36            | 66170+                | 188682            | Intel(R) Xeon(R) CPU X5550 @ 2.67GHz, Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz
Gpu            | 02:00:00   | 15    | 16+               | 284           | 122512+               | 1837776           | Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz, Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz
BioCompute     | 2-00:00:00 | 37    | 56                | 2072          | 509560                | 18853720          | Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
Lewis          | 2-00:00:00 | 135   | 24+               | 3564          | 122384+               | 28020068          | Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz, Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
htc4           | 2-00:00:00 | 37    | 56                | 2072          | 509560                | 18853720          | Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
hpc5           | 2-00:00:00 | 33    | 40                | 1320          | 379284                | 12516372          | Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz
hpc6           | 2-00:00:00 | 62    | 48                | 2976          | 379067+               | 23502193          | Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz
hpc4rc         | 2-00:00:00 | 36    | 28                | 1008          | 251528                | 9055008           | Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
hpc4           | 2-00:00:00 | 45    | 28                | 1260          | 251527+               | 11318756          | Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
hpc3           | 2-00:00:00 | 54    | 24                | 1296          | 122384+               | 7646304           | Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz
z10ph-hpc3     | 2-00:00:00 | 32    | 24                | 768           | 122530                | 3920960           | Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz
s2600tpr-hpc4  | 2-00:00:00 | 4     | 28                | 112           | 251527                | 1006108           | Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
r640-hpc5      | 2-00:00:00 | 33    | 40                | 1320          | 379284                | 12516372          | Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz
r630-htc4      | 2-00:00:00 | 37    | 56                | 2072          | 509560                | 18853720          | Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
r630-hpc4rc    | 2-00:00:00 | 36    | 28                | 1008          | 251528                | 9055008           | Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
r630-hpc4      | 2-00:00:00 | 41    | 28                | 1148          | 251528                | 10312648          | Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
r630-hpc3      | 2-00:00:00 | 22    | 24                | 528           | 122384+               | 3725344           | Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz

Note:

  • *For 'Cores (per node)' and 'Memory (per node)', a '+' indicates a mixed environment. The number before the plus represents the minimum.