Milgram

Milgram is a HIPAA-aligned cluster intended for use on projects that may involve sensitive data. This applies to both storage and computation. If you have any questions about this policy, please contact us.

Milgram is named for Dr. Stanley Milgram, a psychologist who researched the behavioral motivations behind social awareness in individuals and obedience to authority figures. He conducted several famous experiments during his professorship at Yale University including the lost-letter experiment, the small-world experiment, and the Milgram experiment.

Info

Connections to Milgram can only be made from the Yale VPN (access.yale.edu), even if you are already on campus (YaleSecure or ethernet). See our VPN page for setup instructions. If your group has a workstation (see list), you can connect using one of those.


System Status and Monitoring

For system status messages and the schedule for upcoming maintenance, please see the system status page. For a current node-level view of job activity, see the cluster monitor page (VPN only).

Partitions and Hardware

Milgram is made up of several kinds of compute nodes. We group them into (sometimes overlapping) Slurm partitions meant to serve different purposes. By combining the --partition and --constraint Slurm options you can more finely control what nodes your jobs can run on.
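
For example, the following submission restricts a job to the Cascade Lake nodes in the day partition; the constraint value comes from the Node Features column in the tables below, and job.sh is a placeholder batch script.

sbatch --partition=day --constraint=cascadelake job.sh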

Job Submission Rate Limits

Job submissions are limited to 200 jobs per hour. See the Rate Limits section in the Common Job Failures page for more info.
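
If you have many similar tasks to run, a Slurm job array lets you submit them all with a single sbatch call rather than one submission per task. The sketch below uses a placeholder program and placeholder input files; see the linked page for guidance on whether this fits your workflow.

#!/bin/bash
#SBATCH --partition=day
#SBATCH --array=1-100            # 100 tasks from one sbatch invocation
#SBATCH --time=01:00:00
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=5120
# each task selects its own input via the array index
./my_analysis --input data_${SLURM_ARRAY_TASK_ID}.txt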

Psychology Partition Name Changes

To bring the Psychology private partitions into agreement with existing YCRC naming conventions, the partitions have been renamed as follows. These changes make it clearer to all Milgram users which partitions are for common use and which are reserved for Psychology users.

  • gpu -> psych_gpu
  • scavenge -> psych_scavenge

We have also consolidated the short, long, and verylong partitions into two partitions. As with the previous paradigm, all non-GPU nodes are available to these partitions, but psych_week usage is limited to 1/3 of the total core count.

  • psych_day: for jobs with a walltime of up to 24 hours
  • psych_week: for jobs with a walltime of up to 7 days

Starting on April 8th, your jobs are redirected to the appropriate new partition regardless of which name you use, old or new (as seen when running squeue). At a later date, we plan to deactivate the old partition names, so we recommend that you update your scripts to use the new names at your earliest convenience.
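
For example, a job script that previously requested the long or verylong partition would now request psych_week; everything below other than the partition and time lines is a placeholder.

#!/bin/bash
#SBATCH --partition=psych_week   # formerly long / verylong
#SBATCH --time=5-00:00:00        # psych_week allows up to 7 days
./run_analysis.sh                # placeholder command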

Public Partitions

See the sections below for more information about the available common-use partitions.

day

Use the day partition for most batch jobs. This is the default if you don't specify one with --partition.

Request Defaults

Unless specified, your jobs will run with the following srun and sbatch options for this partition.

--time=01:00:00 --nodes=1 --ntasks=1 --cpus-per-task=1 --mem-per-cpu=5120
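
Any of these defaults can be overridden on the command line or with #SBATCH directives. As a hypothetical example, the request below asks for more time, cores, and memory than the defaults while staying within the day limits listed next (job.sh is a placeholder script).

sbatch --partition=day --time=1-00:00:00 --cpus-per-task=8 --mem-per-cpu=10G job.sh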

Job Limits

Jobs submitted to the day partition are subject to the following limits:

Limit                   Value
Maximum job time limit  1-00:00:00
Maximum CPUs per user   324

Available Compute Nodes

Requests for --cpus-per-task and --mem can't exceed what is available on a single compute node.

Count  CPU Type  CPUs/Node  Memory/Node (GiB)  Node Features
18     6240      36         181                cascadelake, avx512, 6240, nogpu, standard, common

interactive

Use the interactive partition for jobs that require ongoing interaction, for example exploratory analyses or debugging a compilation.
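
One common way to start an interactive shell with Slurm is sketched below; this is a generic pattern rather than a Milgram-specific recipe, so see the YCRC documentation on interactive jobs for the recommended command.

srun --pty --partition=interactive --time=02:00:00 --cpus-per-task=2 --mem-per-cpu=5G bash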

Request Defaults

Unless specified, your jobs will run with the following srun and sbatch options for this partition.

--time=01:00:00 --nodes=1 --ntasks=1 --cpus-per-task=1 --mem-per-cpu=5120

Job Limits

Jobs submitted to the interactive partition are subject to the following limits:

Limit                            Value
Maximum job time limit           06:00:00
Maximum CPUs per user            4
Maximum memory per user          32G
Maximum running jobs per user    1
Maximum submitted jobs per user  1

Available Compute Nodes

Requests for --cpus-per-task and --mem can't exceed what is available on a single compute node.

Count  CPU Type  CPUs/Node  Memory/Node (GiB)  Node Features
2      6240      36         181                cascadelake, avx512, 6240, nogpu, standard, common

week

Use the week partition for jobs that need a longer runtime than day allows.

Request Defaults

Unless specified, your jobs will run with the following srun and sbatch options for this partition.

--time=01:00:00 --nodes=1 --ntasks=1 --cpus-per-task=1 --mem-per-cpu=5120

Job Limits

Jobs submitted to the week partition are subject to the following limits:

Limit                   Value
Maximum job time limit  7-00:00:00

Available Compute Nodes

Requests for --cpus-per-task and --mem can't exceed what is available on a single compute node.

Count  CPU Type  CPUs/Node  Memory/Node (GiB)  Node Features
4      6240      36         181                cascadelake, avx512, 6240, nogpu, standard, common

gpu

Use the gpu partition for jobs that make use of GPUs. You must request GPUs explicitly with the --gres option in order to use them. For example, --gres=gpu:gtx1080ti:2 would request 2 GeForce GTX 1080Ti GPUs per node.
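
As a concrete sketch, the request below asks for one of the RTX 5000 GPUs listed for this partition (gpu_job.sh is a placeholder script):

sbatch --partition=gpu --gres=gpu:rtx5000:1 --cpus-per-task=4 --mem-per-cpu=10G gpu_job.sh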

Request Defaults

Unless specified, your jobs will run with the following srun and sbatch options for this partition.

--time=01:00:00 --nodes=1 --ntasks=1 --cpus-per-task=1 --mem-per-cpu=5120

GPU jobs need GPUs!

Jobs submitted to this partition do not request a GPU by default. You must request one with the --gres option.

Job Limits

Jobs submitted to the gpu partition are subject to the following limits:

Limit                   Value
Maximum job time limit  2-00:00:00
Maximum GPUs per user   4

Available Compute Nodes

Requests for --cpus-per-task and --mem can't exceed what is available on a single compute node.

Count  CPU Type  CPUs/Node  Memory/Node (GiB)  GPU Type   GPUs/Node  vRAM/GPU (GB)  Node Features
2      5222      8          181                rtx5000    4          16             cascadelake, avx512, 5222, doubleprecision, common

scavenge

Use the scavenge partition to run preemptable jobs on more resources than normally allowed. For more information about scavenge, see the Scavenge documentation.
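
Because scavenge jobs can be preempted at any time, one general Slurm pattern (not a Milgram-specific requirement; see the Scavenge documentation for the recommended approach) is to make the job requeueable so it returns to the queue after preemption:

#!/bin/bash
#SBATCH --partition=scavenge
#SBATCH --requeue                # put the job back in the queue if it is preempted
#SBATCH --time=1-00:00:00
./run_analysis.sh                # placeholder; the job should be safe to rerun or resume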

Request Defaults

Unless specified, your jobs will run with the following srun and sbatch options for this partition.

--time=01:00:00 --nodes=1 --ntasks=1 --cpus-per-task=1 --mem-per-cpu=5120

GPU jobs need GPUs!

Jobs submitted to this partition do not request a GPU by default. You must request one with the --gres option.

Job Limits

Jobs submitted to the scavenge partition are subject to the following limits:

Limit                   Value
Maximum job time limit  1-00:00:00

Available Compute Nodes

Requests for --cpus-per-task and --mem can't exceed what is available on a single compute node.

Count  CPU Type  CPUs/Node  Memory/Node (GiB)  GPU Type   GPUs/Node  vRAM/GPU (GB)  Node Features
18     6240      36         181                -          -          -              cascadelake, avx512, 6240, nogpu, standard, common
2      5222      8          181                rtx5000    4          16             cascadelake, avx512, 5222, doubleprecision, common

Private Partitions

With few exceptions, jobs submitted to private partitions are not considered when calculating your group's Fairshare. Your group can purchase additional hardware for private use, which we will make available as a pi_groupname partition. These nodes are purchased by you but supported and administered by us, and they are retired when their vendor support expires. Compute nodes range from $10K to upwards of $50K, depending on your requirements. If you are interested in purchasing nodes for your group, please contact us.
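
Members of a group with dedicated nodes submit to their partition by name; in the hypothetical example below, pi_doe stands in for your group's actual pi_groupname partition and job.sh is a placeholder script.

sbatch --partition=pi_doe job.sh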

PI Partitions

psych_day

Request Defaults

Unless specified, your jobs will run with the following srun and sbatch options for this partition.

--time=01:00:00 --nodes=1 --ntasks=1 --cpus-per-task=1 --mem-per-cpu=5120

Job Limits

Jobs submitted to the psych_day partition are subject to the following limits:

Limit                     Value
Maximum job time limit    1-00:00:00
Maximum CPUs per group    500
Maximum memory per group  2500G
Maximum CPUs per user     350
Maximum memory per user   1750G

Available Compute Nodes

Requests for --cpus-per-task and --mem can't exceed what is available on a single compute node.

Count  CPU Type    CPUs/Node  Memory/Node (GiB)  Node Features
46     E5-2660_v4  28         247                broadwell, E5-2660_v4, nogpu, standard, pi, oldest

psych_gpu

Request Defaults

Unless specified, your jobs will run with the following srun and sbatch options for this partition.

--time=01:00:00 --nodes=1 --ntasks=1 --cpus-per-task=1 --mem-per-cpu=5120

GPU jobs need GPUs!

Jobs submitted to this partition do not request a GPU by default. You must request one with the --gres option.

Job Limits

Jobs submitted to the psych_gpu partition are subject to the following limits:

Limit                   Value
Maximum job time limit  7-00:00:00

Available Compute Nodes

Requests for --cpus-per-task and --mem can't exceed what is available on a single compute node.

Count  CPU Type  CPUs/Node  Memory/Node (GiB)  GPU Type   GPUs/Node  vRAM/GPU (GB)  Node Features
10     6240      36         372                rtx2080ti  4          11             cascadelake, avx512, 6240, singleprecision, pi

psych_scavenge

Request Defaults

Unless specified, your jobs will run with the following srun and sbatch options for this partition.

--time=01:00:00 --nodes=1 --ntasks=1 --cpus-per-task=1 --mem-per-cpu=5120

GPU jobs need GPUs!

Jobs submitted to this partition do not request a GPU by default. You must request one with the --gres option.

Job Limits

Jobs submitted to the psych_scavenge partition are subject to the following limits:

Limit                   Value
Maximum job time limit  7-00:00:00

Available Compute Nodes

Requests for --cpus-per-task and --mem can't exceed what is available on a single compute node.

Count  CPU Type    CPUs/Node  Memory/Node (GiB)  GPU Type   GPUs/Node  vRAM/GPU (GB)  Node Features
10     6240        36         372                rtx2080ti  4          11             cascadelake, avx512, 6240, singleprecision, pi
48     E5-2660_v4  28         247                -          -          -              broadwell, E5-2660_v4, nogpu, standard, pi, oldest

psych_week

Request Defaults

Unless specified, your jobs will run with the following srun and sbatch options for this partition.

--time=01:00:00 --nodes=1 --ntasks=1 --cpus-per-task=1 --mem-per-cpu=5120

Job Limits

Jobs submitted to the psych_week partition are subject to the following limits:

Limit                   Value
Maximum job time limit  7-00:00:00
Maximum CPUs in use     448

Available Compute Nodes

Requests for --cpus-per-task and --mem can't exceed what is available on a single compute node.

Count  CPU Type    CPUs/Node  Memory/Node (GiB)  Node Features
46     E5-2660_v4  28         247                broadwell, E5-2660_v4, nogpu, standard, pi, oldest

Storage

/gpfs/milgram is Milgram's primary filesystem where home, project, and scratch60 directories are located. For more details on the different storage spaces, see our Cluster Storage documentation.

You can check your current storage usage and limits by running the getquota command. Note that the per-user usage breakdown only updates once daily.
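
For example, run the command with no arguments (output omitted here, since the exact format may change):

getquota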

Warning

Files stored in scratch60 are purged if they are older than 60 days. You will receive an email alert one week before they are deleted.

Partition  Root Directory           Storage     File Count  Backups
home       /gpfs/milgram/home       20GiB/user  500,000     Yes
project    /gpfs/milgram/project    varies      varies      No
scratch60  /gpfs/milgram/scratch60  varies      5,000,000   No
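
To see which of your scratch60 files are at risk of being purged soon, a standard find invocation like the sketch below can help; replace the placeholder path component with your own scratch60 directory.

find /gpfs/milgram/scratch60/<your_directory> -type f -mtime +50   # files not modified in the last 50 days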

Last update: April 9, 2021