Anaconda

Conda, the tool at the core of Anaconda, is an open source package management system and environment management system. It quickly installs, runs, and updates packages and their dependencies, and it creates, saves, loads, and switches between environments on your local computer. It was created for Python programs, but it can package and distribute software for any language.

This is the topic for RSS "Virtual Environments" training.

  • Training presentation (pdf)

To be notified about upcoming training, please subscribe to our Announcement List.


By default, Conda stores environments and packages within the folder ~/.conda.

To avoid using up all of your home folder's quota, which can easily happen when using Conda, we recommend placing the following within the file ~/.condarc. You can create the file if it is not already present. You can also choose a different path, so long as it is not in your home folder.

envs_dirs:
  - /storage/hpc/data/${USER}/miniconda/envs
pkgs_dirs:
  - /storage/hpc/data/${USER}/miniconda/pkgs

These settings can also be changed when creating the Conda environment. See Sharing Environments with Other Lewis Users for an example.
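
The ~/.condarc settings above can also be written from the shell. The following is a minimal sketch, not an official RSS script: the function name write_condarc is hypothetical, and it assumes you pass a folder outside your home folder (for example /storage/hpc/data/${USER}/miniconda). It refuses to overwrite an existing configuration file.

```shell
#!/bin/bash
# Hypothetical helper: write a Conda configuration that keeps environments
# and packages out of the home folder.
write_condarc() {
    local condarc="$1"     # file to write, normally ~/.condarc
    local conda_root="$2"  # folder that will hold envs/ and pkgs/

    # Refuse to overwrite an existing configuration.
    if [ -e "$condarc" ]; then
        echo "Not overwriting existing $condarc" >&2
        return 1
    fi

    # Create the target folders so Conda can write to them.
    mkdir -p "$conda_root/envs" "$conda_root/pkgs" || return 1

    cat > "$condarc" <<EOF
envs_dirs:
  - $conda_root/envs
pkgs_dirs:
  - $conda_root/pkgs
EOF
}
```

For instance, write_condarc ~/.condarc "/storage/hpc/data/${USER}/miniconda" would produce the file shown above with your user name filled in. After loading the miniconda3 module, conda config --show envs_dirs pkgs_dirs should list the new folders.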

Usage

The version of Anaconda available on Lewis is called "Miniconda". Miniconda is a minimal version of Anaconda that provides the conda command and Python, without the large set of packages preinstalled in the full Anaconda distribution.

First, make sure that you are running inside a compute job rather than on a login node:

[user@lewis4-r710-login-node223 ~]$ srun --partition Interactive --qos interactive --pty /bin/bash

Then, load the miniconda3 module:

[user@lewis4-c8k-hpc2-node279 ~]$ module load miniconda3

After that command completes, you will have the conda command available to you. conda is what you will use to manage your Anaconda environments. To list the Anaconda environments that are installed, run the following:

[user@lewis4-c8k-hpc2-node279 ~]$ conda env list

If this is your first time running Anaconda, you will probably only see the "root" environment (called "base" in newer Conda versions). This environment is shared between all users of Lewis and cannot be modified. To create an Anaconda environment that you can modify, do this:

[user@lewis4-c8k-hpc2-node279 ~]$ conda create --name my_environment python=3.7

You can use any name you want instead of my_environment. You can also choose other Python versions or add any other packages. Ideally, you should create one environment per project and include all the required packages when you create the environment.
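
As a sketch of the one-environment-per-project advice, the helper below creates an environment with all of its packages in a single conda create call, and skips creation if the name already appears in conda env list. The function name create_project_env and the environment name my_project are hypothetical, and the sketch assumes the miniconda3 module is already loaded.

```shell
#!/bin/bash
# Hypothetical helper: create one named environment per project, listing
# every required package up front so Conda resolves them together.
create_project_env() {
    local name="$1"
    shift  # the remaining arguments are package specifications

    # "conda env list" prints one environment per line, name first.
    if conda env list | awk '{print $1}' | grep -qxF -- "$name"; then
        echo "Environment $name already exists; nothing to do."
        return 0
    fi

    conda create --yes --name "$name" "$@"
}
```

For instance, create_project_env my_project python=3.7 pandas would create my_project with Python 3.7 and pandas in one step.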

After running the above command, you should see something like this:

Fetching package metadata ...........
Solving package specifications: .

Package plan for installation in environment `/storage/hpc/data/$USER/miniconda/envs/my_environment`:

The following NEW packages will be INSTALLED:

    ca-certificates: 2019.1.23-0
    certifi:         2018.11.29-py37_0
    libedit:         3.1.20181209-hc058e9b_0
    libffi:          3.2.1-hd88cf55_4
    libgcc-ng:       8.2.0-hdf63c60_1
    libstdcxx-ng:    8.2.0-hdf63c60_1
    ncurses:         6.1-he6710b0_1
    openssl:         1.1.1b-h7b6447c_0
    pip:             19.0.3-py37_0
    python:          3.7.2-h0371630_0
    readline:        7.0-h7b6447c_5
    setuptools:      40.8.0-py37_0
    sqlite:          3.26.0-h7b6447c_0
    tk:              8.6.8-hbc83047_0
    wheel:           0.33.1-py37_0
    xz:              5.2.4-h14c3975_4
    zlib:            1.2.11-h7b6447c_3

Proceed ([y]/n)? y

Press y to continue. Conda then downloads and installs the packages and, when it finishes, prints the following:

#
# To activate this environment, use:
# > source activate my_environment
#
# To deactivate an active environment, use:
# > source deactivate
#

Make a note of these commands, because they are how you enter and leave the environment you just created. To test it out, run:

[user@lewis4-c8k-hpc2-node279 ~]$ source activate my_environment
(my_environment) [user@lewis4-c8k-hpc2-node279 ~]$ python
Python 3.7.2 (default, Dec 29 2018, 06:19:36)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

You might notice that (my_environment) now appears before your prompt, and that the Python version is the one you specified above (in our example, version 3.7).

Press Ctrl-D to exit Python.
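
In a script there is no prompt to look at, so a job may want to confirm which environment is active before it runs Python. The following is a minimal sketch that relies on the CONDA_DEFAULT_ENV variable, which Conda sets when an environment is activated; the function name require_env is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the expected environment is active.
require_env() {
    local expected="$1"

    # CONDA_DEFAULT_ENV holds the name of the active environment, if any.
    if [ "${CONDA_DEFAULT_ENV:-}" = "$expected" ]; then
        echo "Environment $expected is active."
    else
        echo "Expected $expected but found '${CONDA_DEFAULT_ENV:-none}'." >&2
        return 1
    fi
}
```

A script could call require_env my_environment || exit 1 right after source activate my_environment to stop early if activation failed.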

When the environment name appears before your prompt, you are able to install packages with conda. For instance, to install pandas:

(my_environment) [user@lewis4-c8k-hpc2-node279 ~]$ conda install pandas

Now, pandas will be accessible from your environment:

(my_environment) [user@lewis4-c8k-hpc2-node279 ~]$ python
Python 3.7.2 (default, Dec 29 2018, 06:19:36)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas
>>> pandas.__version__
'0.24.1'

Press Ctrl-D to exit Python. To see the list of installed packages in the environment, run:

conda list
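
To check for a single package rather than read the whole list, conda list accepts a package name, and its --full-name flag restricts the match to that exact name. The following is a small sketch; the function name has_package is hypothetical, and it checks whichever environment is currently active.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the named package is installed in the
# active environment.
has_package() {
    # Header lines in "conda list" output start with '#'; any remaining
    # non-empty line is a matching package.
    conda list --full-name "$1" | grep -v '^#' | grep -q .
}
```

For instance, has_package pandas && echo "pandas is installed" prints the message only when pandas is present.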

To exit your environment:

(my_environment) [user@lewis4-c8k-hpc2-node279 ~]$ source deactivate
[user@lewis4-c8k-hpc2-node279 ~]$

If you no longer need your environment, you can remove it after deactivating it:

conda env remove --name my_environment

Sharing Environments with Other Lewis Users

To create an environment that can be shared with other users, you will need to manually specify a folder to create the environment in using the --prefix option.

[user@lewis4-c8k-hpc2-node279 rcsslab]$ conda create --prefix /storage/hpc/group/<YOUR GROUP FOLDER>/shared_environment python=3.7

The command to activate the environment will be printed after the environment installs:

#
# To activate this environment, use:
# > source activate /storage/hpc/group/<YOUR GROUP FOLDER>/shared_environment
#
# To deactivate an active environment, use:
# > source deactivate
#

To use the environment, other users will need to load the module and run the command above:

[friend@lewis4-c8k-hpc2-node279 rcsslab]$ module load miniconda3
[friend@lewis4-c8k-hpc2-node279 rcsslab]$ source activate /storage/hpc/group/<YOUR GROUP FOLDER>/shared_environment
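
Other users can only activate a shared environment if they are able to read its files. The following is a minimal sketch, assuming the environment folder belongs to your group and that its permissions are not already managed for you on Lewis; the function name share_with_group is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: give the owning group read access to an environment.
share_with_group() {
    local env_dir="$1"

    # g+r lets group members read files; the capital X adds execute/search
    # permission only to directories and to files that are already executable.
    chmod -R g+rX "$env_dir"
}
```

For instance, share_with_group "/storage/hpc/group/<YOUR GROUP FOLDER>/shared_environment" would open the environment created above to the rest of the group without making it writable.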

Conda Channels

Whenever you run conda create or conda install without naming a channel, the Conda package manager searches its default channels for the packages. If you need packages that are not in the default channels, name the channels explicitly:

conda create --name env_name --channel channel1 --channel channel2 ... package1 package2 ...

For example, the following creates new_env and installs r-sf, shapely, and bioconductor-biobase from the r, conda-forge, and bioconda channels:

conda create --name new_env --channel r --channel conda-forge --channel bioconda r-sf shapely bioconductor-biobase

Conda Packages

To find the required packages, you can visit anaconda.org and search for packages to find their full names and the corresponding channels. Another option is the conda search command. Note that you need to search the right channel to find packages that are not in the default channels. For example:

conda search --channel bioconda biobase

SBATCH Example

To use Anaconda from an SBATCH script, load the module and activate the environment. The following script creates my-env in the directory set by MYLOCAL (by default, the data directory in your home folder) if it is not already present.

#!/bin/bash
######################### Batch Headers #########################
#SBATCH -p Lewis            # use the Lewis partition
#SBATCH -J conda_test       # give the job a custom name
#SBATCH -o results-%j.out   # give the job output a custom name
#SBATCH -t 0-02:00          # two hour time limit
#SBATCH -N 1                # number of nodes
#SBATCH -n 1                # number of cores (AKA tasks)
#SBATCH --mem 16G           # 16G of memory
#################################################################

export MYLOCAL=/home/$USER/data   # change this directory if needed
module load miniconda3

# Create my-env if it does not already exist
if [ ! -d "$MYLOCAL/my-env" ]; then
    conda create --yes --prefix "$MYLOCAL/my-env" pandas
fi

source activate "$MYLOCAL/my-env"

# Now you can run Python scripts that use the packages in your environment
python -c 'import pandas; print(pandas.__version__)'
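
Assuming the script above is saved as conda_test.sh (a hypothetical filename), the sketch below submits it with sbatch and prints the name of the output file to watch, following the results-%j.out pattern from the batch headers. It relies on the --parsable flag of sbatch, which prints the job ID instead of the usual message; the function name submit_job is also hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: submit a batch script and report its output file.
submit_job() {
    local script="$1"
    local job_id

    # --parsable prints "jobid" or "jobid;cluster" instead of a sentence.
    job_id=$(sbatch --parsable "$script") || return 1
    job_id=${job_id%%;*}   # drop the optional ";cluster" suffix

    # %j in "#SBATCH -o results-%j.out" expands to the job ID.
    echo "results-${job_id}.out"
}
```

For instance, submit_job conda_test.sh prints the results file name for the new job, which you can read once the job has finished.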

Non-HPC Usage

To set up an Anaconda environment on Windows without administrator rights, see "Anaconda as a user".

To create a Python virtual environment with Anaconda Navigator, see "Anaconda Navigator".