Milgram
Milgram is a HIPAA-aligned cluster intended for use on projects that may involve sensitive data. This applies to both storage and computation. If you have any questions about this policy, please contact us.
Milgram is named for Dr. Stanley Milgram, a psychologist who researched the behavioral motivations behind social awareness in individuals and obedience to authority figures. He conducted several famous experiments during his professorship at Yale University including the lost-letter experiment, the small-world experiment, and the Milgram experiment.
Milgram Usage Policies
Users wishing to use Milgram must agree to the following:
- All Milgram users must have fulfilled and be current with Yale's HIPAA training requirement.
- Since Milgram's resources are limited, we ask that you only use Milgram for work on and storage of sensitive data, and that you do your other high performance computing on our other clusters.
Operating System Upgrade"
Milgram was upgraded to RHEL 8 during the February 6-8, 2024 maintenance window. For more information, see our Milgram Operating System Upgrade documentation.
Access the Cluster
Once you have an account, the cluster can be accessed via ssh or through the Open OnDemand web portal.
Info
Connections to Milgram can only be made from the Yale VPN (access.yale.edu), even if you are already on campus (YaleSecure or ethernet). See our VPN page for setup instructions.
System Status and Monitoring
For system status messages and the schedule for upcoming maintenance, please see the system status page. For a current node-level view of job activity, see the cluster monitor page (VPN only).
Partitions and Hardware
Milgram is made up of several kinds of compute nodes. We group them into (sometimes overlapping) Slurm partitions meant to serve different purposes. By combining the `--partition` and `--constraint` Slurm options you can more finely control which nodes your jobs can run on.
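For example, here is a minimal sketch of a batch script that combines the two options to pin a job to Cascade Lake nodes in the day partition (the node feature comes from the tables below; the payload line is illustrative):

```bash
#!/bin/bash
#SBATCH --partition=day            # public batch partition (described below)
#SBATCH --constraint=cascadelake   # node feature taken from the partition tables below
#SBATCH --time=01:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1

# Illustrative payload: report which node the job landed on.
hostname
```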
Job Submission Limits
- You are limited to 4 interactive app instances (of any type) at one time. Additional instances will be rejected until you delete older open instances. For OnDemand jobs, closing the window does not terminate the interactive app job. To terminate the job, click the "Delete" button in your "My Interactive Apps" page in the web portal.
- Job submissions are limited to 200 jobs per hour. See the Rate Limits section in the Common Job Failures page for more info. One way to stay under this limit is to group similar jobs into a single job array, as sketched below.
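A minimal job-array sketch (the array range and the way the task index is used are illustrative); because the whole array is submitted with a single sbatch call, it avoids making many separate submissions:

```bash
#!/bin/bash
#SBATCH --partition=day
#SBATCH --array=1-100              # 100 tasks submitted as one job array
#SBATCH --time=01:00:00
#SBATCH --cpus-per-task=1

# Each array task receives its own index in SLURM_ARRAY_TASK_ID.
echo "Processing chunk ${SLURM_ARRAY_TASK_ID}"
```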
Interactive Partition Name Change
The `interactive` and `psych_interactive` partitions have been renamed to `devel` and `psych_devel`, respectively. Please adjust your job submissions accordingly.
Public Partitions
See each tab below for more information about the available common use partitions.
Use the day partition for most batch jobs. This is the default if you don't specify one with `--partition`.
Request Defaults
Unless specified, your jobs will run with the following options to `salloc` and `sbatch` for this partition.
--time=01:00:00 --nodes=1 --ntasks=1 --cpus-per-task=1 --mem-per-cpu=5120
Job Limits
Jobs submitted to the day partition are subject to the following limits:
Limit | Value |
---|---|
Maximum job time limit | 1-00:00:00 |
Maximum CPUs per user | 324 |
Available Compute Nodes
Requests for `--cpus-per-task` and `--mem` can't exceed what is available on a single compute node.
Count | CPU Type | CPUs/Node | Memory/Node (GiB) | Node Features |
---|---|---|---|---|
9 | 6240 | 36 | 181 | cascadelake, avx512, 6240, nogpu, standard, common, bigtmp, oldest |
Use the devel partition for jobs that require ongoing interaction, such as exploratory analyses or debugging compilations.
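For example, a minimal sketch of requesting an interactive shell on a devel node, staying within the limits listed below (adjust time, CPUs, and memory as needed):

```bash
# Request an interactive shell with 2 CPUs and 8 GiB of memory for 2 hours on the devel partition.
salloc --partition=devel --time=02:00:00 --cpus-per-task=2 --mem=8G
```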
Request Defaults
Unless specified, your jobs will run with the following options to `salloc` and `sbatch` for this partition.
--time=01:00:00 --nodes=1 --ntasks=1 --cpus-per-task=1 --mem-per-cpu=5120
Job Limits
Jobs submitted to the devel partition are subject to the following limits:
Limit | Value |
---|---|
Maximum job time limit | 06:00:00 |
Maximum CPUs per user | 4 |
Maximum memory per user | 32G |
Maximum running jobs per user | 1 |
Maximum submitted jobs per user | 1 |
Available Compute Nodes
Requests for `--cpus-per-task` and `--mem` can't exceed what is available on a single compute node.
Count | CPU Type | CPUs/Node | Memory/Node (GiB) | Node Features |
---|---|---|---|---|
2 | 6240 | 36 | 181 | cascadelake, avx512, 6240, nogpu, standard, common, bigtmp, oldest |
Use the week partition for jobs that need a longer runtime than day allows.
Request Defaults
Unless specified, your jobs will run with the following options to `salloc` and `sbatch` for this partition.
--time=01:00:00 --nodes=1 --ntasks=1 --cpus-per-task=1 --mem-per-cpu=5120
Job Limits
Jobs submitted to the week partition are subject to the following limits:
Limit | Value |
---|---|
Maximum job time limit | 7-00:00:00 |
Maximum CPUs per user | 72 |
Available Compute Nodes
Requests for `--cpus-per-task` and `--mem` can't exceed what is available on a single compute node.
Count | CPU Type | CPUs/Node | Memory/Node (GiB) | Node Features |
---|---|---|---|---|
4 | 6240 | 36 | 181 | cascadelake, avx512, 6240, nogpu, standard, common, bigtmp, oldest |
Use the gpu partition for jobs that make use of GPUs. You must request GPUs explicitly with the `--gpus` option in order to use them. For example, `--gpus=rtx5000:2` would request 2 RTX 5000 GPUs per node.
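A minimal sketch of a batch script requesting a single GPU on this partition (the `nvidia-smi` call is just an illustrative payload that lists the GPUs visible to the job):

```bash
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --gpus=rtx5000:1           # one RTX 5000, the GPU type available in this partition
#SBATCH --cpus-per-task=2
#SBATCH --time=04:00:00

# Confirm the job can see the GPU it was allocated.
nvidia-smi
```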
Request Defaults
Unless specified, your jobs will run with the following options to `salloc` and `sbatch` for this partition.
--time=01:00:00 --nodes=1 --ntasks=1 --cpus-per-task=1 --mem-per-cpu=5120
GPU jobs need GPUs!
Jobs submitted to this partition do not request a GPU by default. You must request one with the `--gpus` option.
Job Limits
Jobs submitted to the gpu partition are subject to the following limits:
Limit | Value |
---|---|
Maximum job time limit | 2-00:00:00 |
Maximum GPUs per user | 4 |
Available Compute Nodes
Requests for `--cpus-per-task` and `--mem` can't exceed what is available on a single compute node.
Count | CPU Type | CPUs/Node | Memory/Node (GiB) | GPU Type | GPUs/Node | vRAM/GPU (GB) | Node Features |
---|---|---|---|---|---|---|---|
2 | 5222 | 8 | 181 | rtx5000 | 4 | 16 | cascadelake, avx512, 5222, doubleprecision, common, bigtmp, rtx5000 |
Use the scavenge partition to run preemptable jobs on more resources than normally allowed. For more information about scavenge, see the Scavenge documentation.
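Because scavenge jobs can be preempted at any time, one common pattern (a sketch, assuming your job can safely restart from the beginning or from its own checkpoints) is to ask Slurm to requeue the job after preemption:

```bash
#!/bin/bash
#SBATCH --partition=scavenge
#SBATCH --requeue                  # put the job back in the queue if it is preempted
#SBATCH --time=12:00:00
#SBATCH --cpus-per-task=4

# Illustrative payload; run_analysis.sh is a placeholder for your actual workload,
# which should ideally checkpoint its progress periodically.
./run_analysis.sh
```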
Request Defaults
Unless specified, your jobs will run with the following options to `salloc` and `sbatch` for this partition.
--time=01:00:00 --nodes=1 --ntasks=1 --cpus-per-task=1 --mem-per-cpu=5120
GPU jobs need GPUs!
Jobs submitted to this partition do not request a GPU by default. You must request one with the `--gpus` option.
Job Limits
Jobs submitted to the scavenge partition are subject to the following limits:
Limit | Value |
---|---|
Maximum job time limit | 1-00:00:00 |
Available Compute Nodes
Requests for `--cpus-per-task` and `--mem` can't exceed what is available on a single compute node.
Count | CPU Type | CPUs/Node | Memory/Node (GiB) | GPU Type | GPUs/Node | vRAM/GPU (GB) | Node Features |
---|---|---|---|---|---|---|---|
20 | 6342 | 48 | 478 | | | | icelake, avx512, 6342, bigtmp, nogpu, standard, pi |
1 | 6326 | 32 | 497 | a40 | 4 | 48 | icelake, avx512, pi, 6326, singleprecision, bigtmp, a40 |
17 | 6240 | 36 | 181 | | | | cascadelake, avx512, 6240, nogpu, standard, common, bigtmp, oldest |
10 | 6240 | 36 | 372 | rtx2080ti | 4 | 11 | cascadelake, avx512, 6240, singleprecision, pi, bigtmp, rtx2080ti, oldest |
2 | 5222 | 8 | 181 | rtx5000 | 4 | 16 | cascadelake, avx512, 5222, doubleprecision, common, bigtmp, rtx5000 |
Private Partitions
With few exceptions, jobs submitted to private partitions are not considered when calculating your group's Fairshare. Your group can purchase additional hardware for private use, which we will make available as a `pi_groupname` partition. These nodes are purchased by you, but supported and administered by us. After vendor support expires, we retire compute nodes. Compute nodes can range from $10K to upwards of $50K depending on your requirements. If you are interested in purchasing nodes for your group, please contact us.
PI Partitions
Request Defaults
Unless specified, your jobs will run with the following options to `salloc` and `sbatch` for this partition.
--time=01:00:00 --nodes=1 --ntasks=1 --cpus-per-task=1 --mem-per-cpu=5120
GPU jobs need GPUs!
Jobs submitted to this partition do not request a GPU by default. You must request one with the `--gpus` option.
Available Compute Nodes
Requests for `--cpus-per-task` and `--mem` can't exceed what is available on a single compute node.
Count | CPU Type | CPUs/Node | Memory/Node (GiB) | GPU Type | GPUs/Node | vRAM/GPU (GB) | Node Features |
---|---|---|---|---|---|---|---|
1 | 6326 | 32 | 497 | a40 | 4 | 48 | icelake, avx512, pi, 6326, singleprecision, bigtmp, a40 |
Request Defaults
Unless specified, your jobs will run with the following options to `salloc` and `sbatch` for this partition.
--time=01:00:00 --nodes=1 --ntasks=1 --cpus-per-task=1 --mem-per-cpu=5120
Job Limits
Jobs submitted to the psych_day partition are subject to the following limits:
Limit | Value |
---|---|
Maximum job time limit | 1-00:00:00 |
Maximum CPUs per group | 500 |
Maximum memory per group | 2500G |
Maximum CPUs per user | 350 |
Maximum memory per user | 1750G |
Available Compute Nodes
Requests for `--cpus-per-task` and `--mem` can't exceed what is available on a single compute node.
Count | CPU Type | CPUs/Node | Memory/Node (GiB) | Node Features |
---|---|---|---|---|
19 | 6342 | 48 | 478 | icelake, avx512, 6342, bigtmp, nogpu, standard, pi |
Request Defaults
Unless specified, your jobs will run with the following options to `salloc` and `sbatch` for this partition.
--time=01:00:00 --nodes=1 --ntasks=1 --cpus-per-task=1 --mem-per-cpu=5120
Job Limits
Jobs submitted to the psych_devel partition are subject to the following limits:
Limit | Value |
---|---|
Maximum job time limit | 06:00:00 |
Maximum CPUs per user | 4 |
Maximum memory per user | 32G |
Maximum running jobs per user | 1 |
Maximum submitted jobs per user | 1 |
Available Compute Nodes
Requests for `--cpus-per-task` and `--mem` can't exceed what is available on a single compute node.
Count | CPU Type | CPUs/Node | Memory/Node (GiB) | Node Features |
---|---|---|---|---|
1 | 6342 | 48 | 478 | icelake, avx512, 6342, bigtmp, nogpu, standard, pi |
Request Defaults
Unless specified, your jobs will run with the following options to `salloc` and `sbatch` for this partition.
--time=01:00:00 --nodes=1 --ntasks=1 --cpus-per-task=1 --mem-per-cpu=5120
GPU jobs need GPUs!
Jobs submitted to this partition do not request a GPU by default. You must request one with the `--gpus` option.
Job Limits
Jobs submitted to the psych_gpu partition are subject to the following limits:
Limit | Value |
---|---|
Maximum job time limit | 7-00:00:00 |
Maximum GPUs per user | 20 |
Available Compute Nodes
Requests for `--cpus-per-task` and `--mem` can't exceed what is available on a single compute node.
Count | CPU Type | CPUs/Node | Memory/Node (GiB) | GPU Type | GPUs/Node | vRAM/GPU (GB) | Node Features |
---|---|---|---|---|---|---|---|
10 | 6240 | 36 | 372 | rtx2080ti | 4 | 11 | cascadelake, avx512, 6240, singleprecision, pi, bigtmp, rtx2080ti, oldest |
Request Defaults
Unless specified, your jobs will run with the following options to `salloc` and `sbatch` for this partition.
--time=01:00:00 --nodes=1 --ntasks=1 --cpus-per-task=1 --mem-per-cpu=5120
GPU jobs need GPUs!
Jobs submitted to this partition do not request a GPU by default. You must request one with the `--gpus` option.
Job Limits
Jobs submitted to the psych_scavenge partition are subject to the following limits:
Limit | Value |
---|---|
Maximum job time limit | 7-00:00:00 |
Available Compute Nodes
Requests for `--cpus-per-task` and `--mem` can't exceed what is available on a single compute node.
Count | CPU Type | CPUs/Node | Memory/Node (GiB) | GPU Type | GPUs/Node | vRAM/GPU (GB) | Node Features |
---|---|---|---|---|---|---|---|
20 | 6342 | 48 | 478 | | | | icelake, avx512, 6342, bigtmp, nogpu, standard, pi |
10 | 6240 | 36 | 372 | rtx2080ti | 4 | 11 | cascadelake, avx512, 6240, singleprecision, pi, bigtmp, rtx2080ti, oldest |
Request Defaults
Unless specified, your jobs will run with the following options to `salloc` and `sbatch` for this partition.
--time=01:00:00 --nodes=1 --ntasks=1 --cpus-per-task=1 --mem-per-cpu=5120
Job Limits
Jobs submitted to the psych_week partition are subject to the following limits:
Limit | Value |
---|---|
Maximum job time limit | 7-00:00:00 |
Maximum CPUs per group | 350 |
Maximum memory per group | 2000G |
Maximum CPUs per user | 250 |
Maximum memory per user | 1500G |
Maximum CPUs in use | 448 |
Available Compute Nodes
Requests for `--cpus-per-task` and `--mem` can't exceed what is available on a single compute node.
Count | CPU Type | CPUs/Node | Memory/Node (GiB) | Node Features |
---|---|---|---|---|
20 | 6342 | 48 | 478 | icelake, avx512, 6342, bigtmp, nogpu, standard, pi |
Storage
`/gpfs/milgram` is Milgram's primary filesystem, where home, project, and scratch60 directories are located. For more details on the different storage spaces, see our Cluster Storage documentation.
You can check your current storage usage and limits by running the `getquota` command. Note that the per-user usage breakdown updates only once daily.
For information on data recovery, see the Backups and Snapshots documentation.
Warning
Files stored in `scratch60` are purged if they are older than 60 days. You will receive an email alert one week before they are deleted. Artificial extension of scratch file expiration is forbidden without explicit approval from the YCRC. Please purchase storage if you need additional longer-term storage.
Partition | Root Directory | Storage | File Count | Backups | Snapshots |
---|---|---|---|---|---|
home | `/gpfs/milgram/home` | 125GiB/user | 500,000 | Yes | >=2 days |
project | `/gpfs/milgram/project` | 1TiB/group, increase to 4TiB on request | 5,000,000 | Yes | >=2 days |
scratch60 | `/gpfs/milgram/scratch60` | 20TiB/group | 15,000,000 | No | No |