Python is a language and free software distribution used for, among other things, websites, system administration, security testing, and scientific computing. On the Yale clusters there are a couple of ways you can set up Python environments. The default python provided is the minimal install of Python 2.7 that ships with Red Hat Enterprise Linux 7. Unless your scripts use only Python's standard library, you will probably want to use one of the methods below to set up your own Python environment.
The Python Module
We provide Python as a software module, which includes frozen versions of many packages commonly used for scientific computing.
Find and Load Python
Find the available versions of Python 3 with:
module avail Python/3
To load version 3.7.0:
module load Python/3.7.0-foss-2018b
To show the installed Python packages and their versions for the module, run:
module help Python/3.7.0-foss-2018b
We recommend against installing Python packages with pip after loading the Python module. Doing so installs them to your home directory in a way that does not make clear to other Python installs which environment the packages belong to. Instead, we recommend using virtualenv or Conda environments. We like Conda because of all the additional pre-compiled software it makes available.
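As a sketch of the virtualenv approach (the environment path below is just an example), packages installed this way land inside the environment's own directory, so it is always clear which install they belong to:

```shell
# Create and use an isolated virtual environment; the path is an example.
# Assumes a Python 3 is on your PATH (e.g. after loading a Python module).
python3 -m venv "$HOME/envs/myproject"       # create the environment
source "$HOME/envs/myproject/bin/activate"   # activate it in this shell
python -c 'import sys; print(sys.prefix)'    # prints the environment's path
deactivate                                   # leave the environment
```

While the environment is active, pip installs go into that directory rather than your home directory's shared user site.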
Grace's login nodes have a newer architecture than the oldest nodes on the cluster. If you pip install packages, do so in an interactive job submitted with the -C oldest Slurm flag if you want to ensure your code will work on all generations of the compute nodes.
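For example, a sketch of such an install session (the environment and package names are illustrative, and these commands only run on the cluster itself):

```shell
# Sketch only: assumes you start from a cluster login node.
salloc -C oldest       # interactive job constrained to the oldest nodes
module load miniconda  # then set up the environment you will use
conda activate py3_env # example environment from the Conda section below
pip install cython     # example package that compiles native extensions
```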
Conda-based Python Environments
You can easily set up multiple Python installations side by side using the conda command. With Conda you can manage your own packages and dependencies for Python, R, etc. See our guide for more detailed instructions.
# install once
module load miniconda
conda create -n py3_env python=3 numpy scipy matplotlib ipython jupyter jupyterlab
# use later
module purge && module load miniconda
conda activate py3_env
We will kill Python jobs on the login nodes that are using excessive resources. To be a good cluster citizen, launch your computation in jobs. See our Slurm documentation for more detailed information on submitting jobs.
To run Python interactively, first launch an interactive job on a compute node. If your Python session will need up to 10 GiB of RAM for up to 4 hours, you would submit your job with:
salloc --mem=10G -t 4:00:00
Once your interactive session starts, load the appropriate module or Conda environment (see above) and start ipython at the command prompt. Once you are happy with your Python commands, save them to a file, which can then be submitted and run as a batch job.
To run Python in batch mode, create a plain-text batch script and call your Python script from it. In this example, myscript.py is in the same directory as the batch script, whose contents are shown below.
#!/bin/bash
#SBATCH -J my_python_program
#SBATCH --mem=10G
#SBATCH -t 4:00:00

module load miniconda
conda activate py3_env
python myscript.py
To actually submit the job, run sbatch my_py_job.sh, where my_py_job.sh is the name under which you saved the batch script above.
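Once submitted, you can check on your job's status from the login node; for example (cluster-only commands):

```shell
sbatch my_py_job.sh    # prints the job ID assigned by Slurm
squeue -u $USER        # lists your queued and running jobs
```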
You can run Jupyter notebooks & JupyterLab by submitting your notebook server as a job. See our page dedicated to Jupyter for more info.