Python with Conda
For researchers who have Python (or R; see the bottom of this page) package requirements beyond the most common packages (e.g. NumPy, SciPy, pandas), we recommend using Anaconda. With Anaconda's Conda package manager, you can create and manage packages and environments. Environments allow you to easily switch between versions of Python libraries and applications for different projects.
Many other software applications have also started to use Conda as a package manager. It has become a popular choice for managing pipelines that involve several tools, especially with multiple languages.
The Miniconda Module
For your convenience, we provide a relatively recent version of Miniconda (a minimal set of Anaconda libraries) as a module. It serves to bootstrap your personal environments. By using this module, you do not need to download your own copy of Conda, which will prevent unnecessary file and storage usage in your directories.
Note: If you are on Milgram and run out of space in your home directory for Conda, you can either reinstall your environment in your project space (see below) or contact us at email@example.com for help with your home quota.
Set Up Your Environment
Load the Miniconda Module
module load miniconda
You can save this to your default module collection by using module save. See our module documentation for more details.
Default Install Locations
By default on all clusters, we set the CONDA_ENVS_PATH and CONDA_PKGS_DIRS environment variables to locations in your project directory (CONDA_PKGS_DIRS points to conda_pkgs), where more quota is available. Conda will install to and search in these directories for environments and cached packages.
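To see what these variables resolve to in your current shell, something like the following sketch may help. It only reads the environment and changes nothing; the function name show_conda_dirs and the placeholder text are illustrative choices, not part of the cluster setup.

```shell
# Print the Conda-related locations visible in the current shell.
# If a variable is unset, a placeholder is printed instead.
show_conda_dirs() {
    echo "CONDA_ENVS_PATH=${CONDA_ENVS_PATH:-<not set>}"
    echo "CONDA_PKGS_DIRS=${CONDA_PKGS_DIRS:-<not set>}"
}

show_conda_dirs
```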
To create an environment (saved to the first location in $CONDA_ENVS_PATH or to ~/.conda/envs), use the conda create command. Give your environments names that are meaningful to you, so you can more easily keep track of which serves which project or purpose. You can also use environments to manage groups of packages that have conflicting prerequisites.
Because dependency resolution is hard and messy, we find that specifying as many packages as possible at environment creation time can help minimize broken dependencies. Although it is often unavoidable for Python, we also recommend against heavily mixing conda and pip installs in one environment. If needed, try to get as much installed with conda, then use pip to get the rest of the way to your desired environment.
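The conda-first, pip-last order described above could be sketched as a small shell function. The environment name my_project and the package some_pypi_only_pkg are hypothetical placeholders (the latter stands for anything you could not find through conda); the package list is only an example.

```shell
# Hypothetical names: my_project (environment) and some_pypi_only_pkg
# (a package assumed to be unavailable through conda).
create_mixed_env() {
    # 1. Let conda resolve everything it can in a single step.
    #    Including pip here gives the environment its own pip.
    conda create --yes -n my_project python=3.7 numpy scipy pandas pip &&
    # 2. Activate the new environment so that pip installs into it.
    source activate my_project &&
    # 3. Use pip only for what conda could not provide.
    pip install some_pypi_only_pkg
}
```

With the miniconda module loaded, running create_mixed_env performs the three steps in order and stops at the first one that fails.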
For added reproducibility and control, specify versions of packages to be installed using packagename=version syntax.
For example, if you have a legacy application that needs Python 2 and OpenBLAS:
conda create -n legacy_application python=2.7 openblas
If you want a good starting point for interactive development of scientific Python scripts:
conda create -n py37_dev python=3.7 numpy scipy pandas matplotlib ipython jupyter
There are also community-led collections of unofficial packages, called channels, that you can use with conda. A few popular examples are Conda Forge and Bioconda. See the conda docs for more info about managing channels.
You could use the Conda Forge channel to install Brian2:
conda create -n brian2 --channel conda-forge brian2
Bioconda provides recent versions of various bioinformatics tools, for example:
conda create -n bioinfo --channel conda-forge --channel bioconda biopython bedtools bowtie2 repeatmasker
Channel priority decreases from left to right: the first --channel argument has higher priority than the second.
Using Your Environment
To use the applications in your environment, make sure you have the miniconda module loaded, then run the following:
source activate env_name
We do not recommend putting source activate commands in your .bashrc file. This can lead to issues in interactive or batch jobs. If you do have issues with an environment in an interactive or batch job, try re-entering the environment by calling source deactivate before rerunning source activate env_name.
Your conda environments will not follow you into job allocations. Make sure to activate them after your interactive job begins.
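One way to confirm which environment is active once a job has started is to inspect CONDA_DEFAULT_ENV, which Conda's activate script exports. The sketch below assumes the environment was activated by name; require_env is a hypothetical helper and env_name is a placeholder.

```shell
# Succeed only when the currently active Conda environment is the expected one.
# CONDA_DEFAULT_ENV is exported by Conda's activate script.
require_env() {
    local expected="$1"
    if [ "${CONDA_DEFAULT_ENV:-}" != "$expected" ]; then
        echo "expected environment '$expected' but found '${CONDA_DEFAULT_ENV:-none}'" >&2
        return 1
    fi
}

# Report the result instead of exiting, so this is safe to paste into a live shell.
if require_env env_name; then
    echo "env_name is active"
else
    echo "run: source activate env_name"
fi
```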
In a Job Script
To make sure that your job runs in your project environment, include the module load and source activate lines shown below in your submission script, after your Slurm directives but before any other commands or scripts:
#!/bin/bash
#SBATCH --partition=general
#SBATCH --job-name=my_conda_job
#SBATCH --cpus-per-task 4
#SBATCH --mem-per-cpu=6000

module load miniconda
source activate env_name
python analyses.py
Find and Install Additional Packages
You can search Anaconda Cloud for any packages you would like to install. Once in your conda environment, you can install any additional packages using
conda install numpy
If you get a permission denied error while trying to conda or pip install a package, make sure you have created an environment and activated it or activated an existing one first.
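A quick pre-install check along these lines can tell you whether an environment you own is active. It relies on CONDA_PREFIX, which Conda's activate script sets to the active environment's directory; can_install_here is a hypothetical helper name.

```shell
# Succeed when an environment is active and its directory is writable.
# CONDA_PREFIX is set by Conda's activate script to the active environment's path.
can_install_here() {
    [ -n "${CONDA_PREFIX:-}" ] && [ -w "$CONDA_PREFIX" ]
}

if can_install_here; then
    echo "installs will go to: $CONDA_PREFIX"
else
    echo "no writable environment is active; create or activate one first"
fi
```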
"-bash: activate: No such file or directory"
If you get the above error, it is likely that you don't have the necessary module file loaded. Try loading the miniconda module and rerunning your source activate env_name command.
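To check for this condition before activating, you could test whether conda is visible on your PATH. This sketch assumes the miniconda module places conda and its activate script on PATH together; conda_available is a hypothetical helper name.

```shell
# Return success if a conda executable is visible on PATH.
conda_available() {
    command -v conda >/dev/null 2>&1
}

if conda_available; then
    echo "conda found at: $(command -v conda)"
else
    echo "conda not found; try: module load miniconda"
fi
```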
"could not find environment:"
This error means that the version of Anaconda/Miniconda you have loaded doesn't recognize the environment name you have supplied. Make sure you have the miniconda module loaded (and not a different Python module) and have previously created this environment. You can see a list of previously created environments by running:
conda info --envs
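If you want to test for a specific environment in a script rather than reading the list by eye, a small filter over that listing is one option. The sketch assumes the usual layout of the listing (comment lines starting with #, then one environment per line with its name in the first column); env_listed is a hypothetical helper and env_name a placeholder.

```shell
# Read an environment listing on stdin and succeed if the given name appears
# in the first column. Header lines starting with '#' are skipped.
env_listed() {
    awk -v name="$1" '!/^#/ && $1 == name { found = 1 } END { exit !found }'
}

# Typical use, once the miniconda module is loaded:
#   conda info --envs | env_listed env_name && echo "env_name exists"
```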
Additional Conda Commands
List Installed Packages
source activate env_name
conda list
Delete a Conda Environment
conda env remove --name env_name
Share your Conda Environment
If you want to share or back up a conda environment, you can export it to a file. To do so, run the following, replacing env_name with the desired environment.
source activate env_name
conda env export > env_name_environment.yml

# on another machine or account, run
conda env create -f env_name_environment.yml
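The exported file ends with a prefix: line that records the environment's path on the exporting machine, which is rarely meaningful elsewhere. If you would rather not share that path, a sketch like the following writes a copy without it. strip_prefix and the output file name are hypothetical choices.

```shell
# Write a copy of an exported environment file without its 'prefix:' line.
# Arguments: input file, output file (both names are up to you).
strip_prefix() {
    grep -v '^prefix:' "$1" > "$2"
}

# Example, after exporting env_name_environment.yml:
#   strip_prefix env_name_environment.yml env_name_shareable.yml
```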
Conda for R
Conda can also be used under certain circumstances to install and manage R packages. All R package names are prefixed with r- (for example, r-ggplot2).
Create new environment:
conda create -n r_env r-essentials r-base
Install additional packages:
conda install r-ggplot2