For researchers whose Python or R package requirements go beyond the most common packages (e.g. NumPy, SciPy, pandas), we recommend using Anaconda. With Anaconda's Conda package manager, you can create and manage packages and environments. Environments let you switch easily between versions of Python or R libraries and applications for different projects.
Many other software applications have also started to use Conda as a package manager. It has become a popular choice for managing pipelines that involve several tools, especially with multiple languages.
The Miniconda Module
For your convenience, we provide a relatively recent version of Miniconda (a minimal set of Anaconda libraries) as a module. It serves to bootstrap your personal environments. By using this module, you do not need to download your own copy of Conda, which avoids unnecessary file and storage usage in your directories.
Note: If you are on Milgram and run out of space in your home directory for Conda, you can either reinstall your environment in your project space (see below) or contact us at firstname.lastname@example.org for help with your home quota.
Set Up Your Environment
Load the Miniconda Module
# Grace, Omega
module load Tools/miniconda
# All other clusters
module load miniconda
You can save this to your default module collection by using module save. See our module documentation for more details.
Install to Your Project Directory
Conda looks in the directories listed in the environment variable CONDA_ENVS_PATH for places to find and install environments. If you want your environments stored in a directory where your quotas are higher, for example ~/project/conda_envs, add that directory to the front of this variable. We set this by default for you on Grace, Farnam and Ruddle.
To match this behavior on Milgram:
echo "export CONDA_ENVS_PATH=~/project/conda_envs:$CONDA_ENVS_PATH" >> ~/.bashrc
source ~/.bashrc
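Because the order of directories in CONDA_ENVS_PATH decides where new environments are saved, it can help to print the list one entry per line. The following is a minimal sketch; list_env_dirs and the example value are hypothetical names used only for illustration.

```shell
# Hypothetical helper: print each directory of a colon-separated list on its
# own line, in the order the entries appear in the variable.
list_env_dirs() {
    printf '%s\n' "$1" | tr ':' '\n'
}

# Illustrative value only; on the cluster, pass "$CONDA_ENVS_PATH" instead.
example_path="$HOME/project/conda_envs:$HOME/.conda/envs"
list_env_dirs "$example_path"
```

The first line printed is the location where a newly created environment would be saved.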
To create an environment (saved to the first location in $CONDA_ENVS_PATH or to ~/.conda/envs) use the conda create command. Give your environments names that are meaningful to you, so you can more easily keep track of which serves which project or purpose. You can also use environments to manage groups of packages that have conflicting prerequisites.
Because dependency resolution is hard and messy, we find that specifying as many packages as possible at environment creation time helps minimize broken dependencies. Although it is often unavoidable for Python, we also recommend against heavily mixing conda and pip to install applications. If needed, try to get as much installed with conda as you can, then use pip to get the rest of the way to your desired environment.
For added reproducibility and control, specify versions of packages to be installed using packagename=version syntax, e.g. python=2.7.
For example, if you have a legacy application that needs Python 2 and OpenBLAS:
conda create -n legacy_application python=2.7 openblas
If you want a good starting point for interactive development of scientific Python scripts:
conda create -n py37_dev python=3.7 numpy scipy pandas matplotlib ipython jupyter
For a basic R environment:
conda create -n r_env r-essentials r-base
You could use the Conda Forge channel to install Brian2:
conda create -n brian2 --channel conda-forge brian2
Bioconda provides recent versions of various bioinformatics tools, for example:
conda create -n bioinfo --channel bioconda biopython bedtools bowtie2 repeatmasker
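One way to keep the full package list and pinned versions in one place is to write them to an environment file before creating the environment. The sketch below reuses the py37_dev package list from the earlier example; the file name is an assumption, and the conda step is left as a comment because it requires the miniconda module to be loaded.

```shell
# Write an illustrative environment file. The name and package list mirror
# the py37_dev example and are not a recommendation.
spec_file="py37_dev_environment.yml"
cat > "$spec_file" <<'EOF'
name: py37_dev
dependencies:
  - python=3.7
  - numpy
  - scipy
  - pandas
EOF

# With the miniconda module loaded, the environment could then be built with:
#   conda env create -f py37_dev_environment.yml
cat "$spec_file"
```

Other packages can be pinned in the file with the same packagename=version syntax.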
Using Your Environment
To use the applications in your environment, make sure you have the miniconda module loaded, then run the following:
source activate env_name
Your conda environments will not follow you into job allocations, so make sure to activate them after your interactive job begins.
In a Job Script
To run in your project environment from a submission script, include the following lines before any other commands or scripts (but after your Slurm directives):
#!/bin/bash
#SBATCH --partition=general
#SBATCH --job-name=my_conda_job
#SBATCH --cpus-per-task 4
#SBATCH --mem-per-cpu=6000
# Grace, Omega
module load Tools/miniconda
# All other clusters
module load miniconda
source activate env_name
python analyses.py
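A job that starts without the module loaded fails only when it reaches the activate line, so it may be worth checking for conda explicitly. This is a minimal sketch of such a guard; require_command is a hypothetical helper, not something provided by the module or by Slurm.

```shell
# Hypothetical helper: succeed only if the named command is on PATH.
require_command() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "error: '$1' not found; is the miniconda module loaded?" >&2
        return 1
    fi
}

# In a job script this would follow the module load line, for example:
#   require_command conda || exit 1
#   source activate env_name
```

Stopping early this way keeps the rest of the script from running outside the intended environment.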
Find and Install Additional Packages
You can search Anaconda Cloud for any packages you would like to install. Once in your conda environment, you can install any additional packages using
conda install numpy
All R packages are prefixed with r-, for example:
conda install r-ggplot2
If you get a permission denied error while trying to conda or pip install a package, make sure you have created an environment or activated an existing one first.
"-bash: activate: No such file or directory"
If you get the above error, it is likely that you don't have the necessary module file loaded. Try loading the appropriate module and rerunning your source activate env_name command.
"could not find environment:"
This error means that the version of Anaconda/Miniconda you have loaded doesn't recognize the environment name you have supplied. Make sure you have the Miniconda module loaded (and not a different Python module) and have previously created this environment. You can see a list of previously created environments by running:
conda info --envs
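If you want a script to check for an environment before activating it, the listing can be filtered by name. The sketch below assumes the usual listing layout (comment lines starting with #, then one environment per line with the name in the first column); env_listed and the sample text are hypothetical and only illustrate the idea.

```shell
# Hypothetical helper: read `conda info --envs` style output on stdin and
# succeed if an environment with the given name is listed.
env_listed() {
    awk -v name="$1" '!/^#/ && $1 == name { found = 1 } END { exit !found }'
}

# Illustrative sample of the listing format; real paths will differ.
sample='# conda environments:
#
base                  *  /path/to/miniconda
py37_dev                 /path/to/conda_envs/py37_dev'

if printf '%s\n' "$sample" | env_listed py37_dev; then
    echo "py37_dev is available"
fi
```

On the cluster, the same check would read from the real command, as in conda info --envs | env_listed env_name.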
Additional Conda Commands
List Installed Packages
source activate env_name
conda list
Delete a Conda Environment
conda env remove --name env_name
Share your Conda Environment
If you want to share or back up a conda environment, you can export it to a file. To do so, run the following, replacing env_name with the desired environment.
source activate env_name
conda env export > env_name_environment.yml
# on another machine or account, run
conda env create -f env_name_environment.yml