Python
Python is a programming language and free software distribution used for websites, system administration, security testing, and scientific computing, to name a few. On the Yale Clusters there are a couple of ways to set up Python environments. The default Python provided is the minimal install of Python 3.8 that comes with Red Hat Enterprise Linux 8. We strongly recommend that you use one of the methods below to set up your own Python environment.
The Python Module
We provide Python as a software module. It includes frozen versions of many common packages used for scientific computing.
Find and Load Python
Find the available versions of Python version 3 with:
module avail Python/3
To load version 3.8.6:
module load Python/3.8.6-foss-2020b
To show installed Python packages and their versions for the Python/3.8.6-foss-2020b module:
module help Python/3.8.6-foss-2020b
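The same information can also be checked from inside a running interpreter. The following is a minimal sketch, not part of the module itself, that assumes Python 3.8 or newer so that the standard-library importlib.metadata is available. The function name installed_packages is hypothetical.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_packages():
    """Return a sorted list of (name, version) pairs visible to this interpreter."""
    found = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries with missing or broken metadata
            found.add((name, dist.version))
    return sorted(found, key=lambda item: item[0].lower())


if __name__ == "__main__":
    for name, version in installed_packages():
        print(f"{name}=={version}")
```

Running this after loading the module lists what that interpreter can see, which may include packages installed to your home directory as well as those shipped with the module.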
Install Packages
We recommend against installing Python packages with pip after having loaded the Python module.
Doing so installs them to your home directory, where other Python installations cannot tell which environment the packages belong to.
Instead we recommend using virtualenv or Conda environments.
We like conda because of all the additional pre-compiled software it makes available.
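To confirm which kind of environment a given interpreter is running in, a short check like the one below can help. It is a sketch under stated assumptions: it relies on sys.base_prefix differing from sys.prefix inside a virtual environment (true for venv and recent virtualenv releases) and on the CONDA_DEFAULT_ENV variable that conda activate sets. The function name describe_environment is hypothetical.

```python
import os
import sys


def describe_environment(environ=None):
    """Summarize where this interpreter lives and which environment is active."""
    environ = os.environ if environ is None else environ
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # In a virtual environment, sys.prefix points at the environment
        # while sys.base_prefix points at the Python it was created from.
        "in_virtualenv": sys.prefix != getattr(sys, "base_prefix", sys.prefix),
        # Set by `conda activate`; None when no Conda environment is active.
        "conda_env": environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```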
Warning
Grace's login nodes have newer architecture than the oldest nodes on the cluster.
If you do install packages with pip, do so in an interactive job submitted with the -C oldest Slurm flag if you want to ensure your code will work on all generations of the compute nodes.
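One way to see how node generations differ is to compare the CPU feature flags that each node reports. The sketch below assumes a Linux x86 node where /proc/cpuinfo has a "flags" line. Treating instruction-set differences as the reason for the warning is our assumption, not something stated above. The function name cpu_flags is hypothetical.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no "flags" line found


if __name__ == "__main__":
    with open("/proc/cpuinfo") as handle:  # Linux only
        flags = cpu_flags(handle.read())
    # A few vector instruction sets that vary between CPU generations.
    for feature in ("avx", "avx2", "avx512f"):
        print(feature, feature in flags)
```

Running it on a login node and again inside a job on an older node shows whether the two report different features.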
Conda-based Python Environments
You can easily set up multiple Python installations side-by-side using the conda command. With Conda you can manage your own packages and dependencies for Python, R, etc.
See our guide for more detailed instructions.
# install once
module load miniconda
conda create -n py3_env python=3 numpy scipy matplotlib ipython jupyter jupyterlab
# use later
module purge && module load miniconda
conda activate py3_env
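After activating the environment, a quick sanity check is to ask Python whether the packages you requested can be found. This sketch uses importlib.util.find_spec, which locates a top-level module without importing it. The function name missing_modules and the list of names checked are illustrative assumptions; note that the ipython package is imported as IPython.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names for some of the packages requested in the conda create command.
    wanted = ["numpy", "scipy", "matplotlib", "IPython"]
    missing = missing_modules(wanted)
    if missing:
        print("Not found in this environment:", ", ".join(missing))
    else:
        print("All requested modules were found.")
```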
Run Python
We will kill Python jobs on the login nodes that are using excessive resources. To be a good cluster citizen, launch your computation in jobs. See our Slurm documentation for more detailed information on submitting jobs.
Interactive Job
To run Python interactively, first launch an interactive job on a compute node. If your Python session will need up to 10 GiB of RAM and up to 4 hours, you would submit your job with:
salloc --mem=10G -t 4:00:00
Once your interactive session starts, you can load the appropriate module or Conda environment (see above) and start python or ipython on your command prompt.
If you are happy with your Python commands, save them to a file which can then be submitted and run as a batch job.
Batch Mode
To run Python in batch mode, create a plain-text batch script to submit. In that script, you call your Python script.
In this case myscript.py is in the same directory as the batch script. The batch script contents are shown below.
#!/bin/bash
#SBATCH -J my_python_program
#SBATCH --mem=10G
#SBATCH -t 4:00:00
module load miniconda
conda activate py3_env
python myscript.py
To actually submit the job, run sbatch my_py_job.sh, where the batch script above was saved as my_py_job.sh.
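For illustration, a myscript.py run this way might start by recording which job it is running in. The sketch below is a hypothetical example rather than a required layout. It reads the SLURM_JOB_ID and SLURM_CPUS_PER_TASK environment variables that Slurm sets inside a job, and it falls back to defaults when they are absent, for instance when no CPUs-per-task value was requested.

```python
import os


def slurm_context(environ=None):
    """Collect a few Slurm job details, with fallbacks for use outside a job."""
    environ = os.environ if environ is None else environ
    return {
        "job_id": environ.get("SLURM_JOB_ID", "not-in-a-job"),
        # Only set when CPUs per task were requested; assume 1 otherwise.
        "cpus": int(environ.get("SLURM_CPUS_PER_TASK", "1")),
    }


def main():
    context = slurm_context()
    print(f"Slurm job: {context['job_id']}, CPUs per task: {context['cpus']}")
    # Your actual computation goes here.


if __name__ == "__main__":
    main()
```

Printing the job ID at the top of the output makes it easier to match a log file to the job that produced it.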
Jupyter Notebooks
You can run Jupyter notebooks and JupyterLab by submitting your notebook server as a job. See our page dedicated to Jupyter for more info.