Python

Python is a programming language and free software distribution used for websites, system administration, security testing, and scientific computing, to name a few. On the Yale Clusters there are a couple of ways to set up Python environments. The default Python is the minimal install of Python 3.8 that comes with Red Hat Enterprise Linux 8. We strongly recommend that you use one of the methods below to set up your own Python environment.

The Python Module

We provide Python as a software module. Each module includes frozen versions of many common packages used for scientific computing.

Find and Load Python

Find the available versions of Python version 3 with:

module avail Python/3

To load version 3.8.6:

module load Python/3.8.6-foss-2020b

To show installed Python packages and their versions for the Python/3.8.6-foss-2020b module:

module help Python/3.8.6-foss-2020b
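The same information can also be checked from inside Python once a module is loaded. The following is a minimal sketch using only the standard library (importlib.metadata, available in Python 3.8 and later); the function name installed_packages is a hypothetical helper, not something the module provides.

```python
from importlib import metadata


def installed_packages():
    """Return sorted (name, version) pairs for packages visible to this interpreter."""
    pairs = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata lacks a name
            pairs.add((name, dist.version))
    return sorted(pairs, key=lambda pair: pair[0].lower())


if __name__ == "__main__":
    for name, version in installed_packages():
        print(f"{name}=={version}")
```

Running this with the module's python shows what that interpreter can actually import, which may include packages installed to your home directory as well as those shipped with the module.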

Install Packages

We recommend against installing Python packages with pip after loading the Python module. Doing so installs them to your home directory, where other Python installations cannot tell which environment the packages belong to. Instead, we recommend using virtualenv or Conda environments. We like Conda because of all the additional pre-compiled software it makes available.
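To see where a pip install would be likely to land before you run it, you can ask the interpreter directly. This is a rough sketch using the standard sys and site modules; describe_install_target is a hypothetical helper name, and the prefix comparison only detects venv/virtualenv environments (a Conda environment is identified by its interpreter path instead).

```python
import site
import sys


def describe_install_target():
    """Summarize which interpreter is running and where user-level installs go."""
    # In a venv/virtualenv, sys.prefix points at the environment while
    # sys.base_prefix points at the Python it was created from.
    in_virtualenv = sys.prefix != sys.base_prefix
    return {
        "interpreter": sys.executable,
        "in_virtualenv": in_virtualenv,
        # Directory used by "pip install --user"; normally under your home directory.
        "user_site": site.getusersitepackages(),
        "user_site_enabled": bool(site.ENABLE_USER_SITE),
    }


if __name__ == "__main__":
    for key, value in describe_install_target().items():
        print(f"{key}: {value}")
```

If in_virtualenv is False and the interpreter path belongs to the Python module, packages installed with pip are likely to end up in the user_site directory shown.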

Warning

Grace's login nodes have a newer architecture than the oldest nodes on the cluster. If you do install packages with pip, do so in an interactive job submitted with the -C oldest Slurm flag to ensure your code will work on all generations of the compute nodes.

Conda-based Python Environments

You can easily set up multiple Python installations side-by-side using the conda command. With Conda you can manage your own packages and dependencies for Python, R, etc. See our guide for more detailed instructions.

# install once
module load miniconda
conda create -n py3_env python=3 numpy scipy matplotlib ipython jupyter jupyterlab
# use later
module reset && module load miniconda
conda activate py3_env
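After activating the environment, a quick check from Python confirms that the interpreter you are running belongs to it. The sketch below assumes the CONDA_DEFAULT_ENV and CONDA_PREFIX variables that conda activate sets; active_environment is a hypothetical helper name.

```python
import os
import sys


def active_environment():
    """Report the active Conda environment (if any) and the running interpreter."""
    prefix = os.environ.get("CONDA_PREFIX")
    return {
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),  # e.g. py3_env; None if unset
        "conda_prefix": prefix,
        "interpreter": sys.executable,
        "python_version": "{}.{}.{}".format(*sys.version_info[:3]),
        # True only when the running python lives inside the active environment.
        "interpreter_in_env": bool(prefix) and sys.executable.startswith(prefix),
    }


if __name__ == "__main__":
    for key, value in active_environment().items():
        print(f"{key}: {value}")
```

If conda_env is None, or interpreter_in_env is False, the environment was probably not activated in the current shell.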

Run Python

We will kill Python processes on the login nodes that use excessive resources. To be a good cluster citizen, launch your computation in jobs. See our Slurm documentation for more detailed information on submitting jobs.

Interactive Job

To run Python interactively, first launch an interactive job on a compute node. If your Python session will need up to 10 GiB of RAM and up to 4 hours, you would submit your job with:

salloc --mem=10G -t 4:00:00

Once your interactive session starts, you can load the appropriate module or Conda environment (see above) and start python or ipython at your command prompt. Once you are happy with your Python commands, save them to a file, which can then be submitted and run as a batch job.

Batch Mode

To run Python in batch mode, create a plain-text batch script to submit. In that script, you call your Python script. In this example, myscript.py is in the same directory as the batch script, whose contents are shown below.

#!/bin/bash
#SBATCH -J my_python_program
#SBATCH --mem=10G
#SBATCH -t 4:00:00

module load miniconda
conda activate py3_env
python myscript.py

To submit the job, run sbatch my_py_job.sh, where my_py_job.sh is the file in which you saved the batch script above.
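For illustration, a script such as myscript.py might read a few of the environment variables Slurm sets inside a job so that its output records where it ran. This is only a sketch: the computation is a placeholder, allocated_cpus is a hypothetical helper, and SLURM_CPUS_PER_TASK is only set when the job requests CPUs per task, which the batch script above does not, so the default of 1 applies there.

```python
import os


def allocated_cpus(default=1):
    """Return the CPU count Slurm allocated per task, or a default outside a job."""
    value = os.environ.get("SLURM_CPUS_PER_TASK", "")
    if value.isdigit() and int(value) > 0:
        return int(value)
    return default


def main():
    job_id = os.environ.get("SLURM_JOB_ID", "not running under Slurm")
    print(f"job id: {job_id}")
    print(f"cpus available to this task: {allocated_cpus()}")
    # Placeholder computation; replace with your own analysis.
    total = sum(i * i for i in range(1000))
    print(f"sum of squares below 1000: {total}")


if __name__ == "__main__":
    main()
```

Reading the allocation from the environment, rather than hard-coding a worker count, keeps the script consistent with whatever resources the batch script requests.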

Jupyter Notebooks

You can run Jupyter notebooks and JupyterLab by submitting your notebook server as a job. See our page dedicated to Jupyter for more information.

Last update: January 30, 2025