Anaconda
Conda, the tool at the core of Anaconda, is an open source package management system and environment management system. It quickly installs, runs, and updates packages and their dependencies, and it creates, saves, loads, and switches between environments on your local computer. It was created for Python programs, but it can package and distribute software for any language.
- Software URL: https://www.anaconda.org/
- Documentation: https://conda.io/en/latest/
This is the topic for RSS "Virtual Environments" training.
- Training presentation (pdf)
To be notified about upcoming training sessions, please subscribe to our Announcement List.
Recommended Configuration
By default, Conda stores environments and packages within the folder ~/.conda.
To avoid using up all of your home folder's quota, which can easily happen when
using Conda, we recommend placing the following within the file ~/.condarc.
You can create the file if it is not already present. You can also choose a
different path, so long as it is not in your home folder.
envs_dirs:
- /storage/hpc/data/${USER}/miniconda/envs
pkgs_dirs:
- /storage/hpc/data/${USER}/miniconda/pkgs
These settings can also be changed when creating the Conda environment. See Sharing Environments with Other Lewis Users for an example.
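As a rough sketch of setting this up from the shell, the helper below writes the two settings to a file and refuses to overwrite one that already exists. The function name write_condarc and its two arguments are assumptions made for this illustration; they are not part of Conda or of the Lewis environment. Unlike the snippet above, it writes the resolved path rather than the literal ${USER} placeholder.

```shell
# write_condarc is a hypothetical helper used only in this sketch.
# It writes envs_dirs and pkgs_dirs entries that point below a base directory.
write_condarc() {
    local target="$1"   # path of the .condarc file to create
    local base="$2"     # directory that will hold envs/ and pkgs/
    if [ -e "$target" ]; then
        echo "Refusing to overwrite existing $target" >&2
        return 1
    fi
    cat > "$target" <<EOF
envs_dirs:
  - ${base}/envs
pkgs_dirs:
  - ${base}/pkgs
EOF
}
```

For example, write_condarc ~/.condarc "/storage/hpc/data/${USER}/miniconda" would produce a file equivalent to the configuration shown above, with your username filled in.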
Usage
The version of Anaconda we have available on Lewis is called "Miniconda".
Miniconda is a minimal version of Anaconda that provides the conda command
without the large set of packages bundled with the full Anaconda distribution.
First, you will want to make sure that you are running in a compute job.
[user@lewis4-r710-login-node223 ~]$ srun --partition Interactive --qos interactive --pty /bin/bash
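If you are unsure whether your shell is already inside a job, one possible check is to look for the SLURM_JOB_ID variable, which Slurm sets for processes it launches. This is only a sketch; the function name in_slurm_job is made up for this example.

```shell
# in_slurm_job is a hypothetical helper name.
# It succeeds when SLURM_JOB_ID is set and non-empty.
in_slurm_job() {
    [ -n "${SLURM_JOB_ID:-}" ]
}

if in_slurm_job; then
    echo "Running inside Slurm job ${SLURM_JOB_ID}"
else
    echo "Not inside a Slurm job; start one with srun first" >&2
fi
```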
Then, you need to load the miniconda3 module:
[user@lewis4-c8k-hpc2-node279 ~]$ module load miniconda3
After that command completes, you will have the conda command available to you.
conda is what you will use to manage your Anaconda environments. To list the
Anaconda environments that are installed, run the following:
[user@lewis4-c8k-hpc2-node279 ~]$ conda env list
If this is your first time running Anaconda, you will probably only see the "root" environment. This environment is shared between all users of Lewis and cannot be modified. To create an Anaconda environment that you can modify, do this:
[user@lewis4-c8k-hpc2-node279 ~]$ conda create --name my_environment python=3.7
You can use any name you want instead of my_environment. You can also
choose other Python versions or add any other packages. Ideally, you should create one environment per
project and include all the required packages when you create the environment.
After running the above command, you should see something like this:
Fetching package metadata ...........
Solving package specifications: .
Package plan for installation in environment `/storage/hpc/data/$USER/miniconda/envs/my_environment`:
The following NEW packages will be INSTALLED:
ca-certificates: 2019.1.23-0
certifi: 2018.11.29-py37_0
libedit: 3.1.20181209-hc058e9b_0
libffi: 3.2.1-hd88cf55_4
libgcc-ng: 8.2.0-hdf63c60_1
libstdcxx-ng: 8.2.0-hdf63c60_1
ncurses: 6.1-he6710b0_1
openssl: 1.1.1b-h7b6447c_0
pip: 19.0.3-py37_0
python: 3.7.2-h0371630_0
readline: 7.0-h7b6447c_5
setuptools: 40.8.0-py37_0
sqlite: 3.26.0-h7b6447c_0
tk: 8.6.8-hbc83047_0
wheel: 0.33.1-py37_0
xz: 5.2.4-h14c3975_4
zlib: 1.2.11-h7b6447c_3
Proceed ([y]/n)? y
Press y to continue. Your packages should be downloaded. After the packages
are downloaded, the following will be printed:
#
# To activate this environment, use:
# > source activate my_environment
#
# To deactivate an active environment, use:
# > source deactivate
#
Make a note of that because those commands are how to get in and out of the environment you just created. To test it out, run:
[user@lewis4-c8k-hpc2-node279 ~]$ source activate my_environment
(my_environment) [user@lewis4-c8k-hpc2-node279 ~]$ python
Python 3.7.2 (default, Dec 29 2018, 06:19:36)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
You might notice that (my_environment) now appears before your prompt,
and that the Python version is the one you specified above
(in our example, version 3.7).
Press Ctrl-D to exit Python.
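Inside a script there is no prompt to look at. A minimal sketch of an alternative check, assuming your Conda version exports the CONDA_DEFAULT_ENV variable when an environment is activated, is shown below. The function name active_env is hypothetical.

```shell
# active_env is a hypothetical helper name.
# It prints the value of CONDA_DEFAULT_ENV, or "none" when it is unset or empty.
active_env() {
    echo "${CONDA_DEFAULT_ENV:-none}"
}

echo "Active Conda environment: $(active_env)"
```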
When the environment name appears before your prompt, you are able to
install packages with conda. For instance, to install pandas:
(my_environment) [user@lewis4-c8k-hpc2-node279 ~]$ conda install pandas
Now, pandas will be accessible from your environment:
(my_environment) [user@lewis4-c8k-hpc2-node279 ~]$ python
Python 3.7.2 (default, Dec 29 2018, 06:19:36)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas
>>> pandas.__version__
'0.24.1'
Press Ctrl-D to exit Python. To see the list of installed packages in the environment, run:
conda list
To exit your environment:
(my_environment) [user@lewis4-c8k-hpc2-node279 ~]$ source deactivate
[user@lewis4-c8k-hpc2-node279 ~]$
If you no longer need your environment, you can remove it after deactivating it:
conda env remove --name my_environment
Sharing Environments with Other Lewis Users
To create an environment that can be shared with other users, you will need
to manually specify a folder to create the environment in using the --prefix option.
[user@lewis4-c8k-hpc2-node279 rcsslab]$ conda create --prefix /storage/hpc/group/<YOUR GROUP FOLDER>/shared_environment python=3.7
The command to activate the environment will be printed after the environment installs:
#
# To activate this environment, use:
# > source activate /storage/hpc/group/<YOUR GROUP FOLDER>/shared_environment
#
# To deactivate an active environment, use:
# > source deactivate
#
To use the environment, other users will need to load the module and run the command above:
[friend@lewis4-c8k-hpc2-node279 rcsslab]$ module load miniconda3
[friend@lewis4-c8k-hpc2-node279 rcsslab]$ source activate /storage/hpc/group/<YOUR GROUP FOLDER>/shared_environment
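Other users can only activate the environment if they can read its files. Whether group folders on Lewis already grant this is not stated here, so the following is only a sketch for the case where they do not. The function name share_with_group is made up for this example.

```shell
# share_with_group is a hypothetical helper name.
# It grants the owning group read access to everything under a directory,
# plus execute on directories and on files that are already executable
# (that is what the capital X in g+rX means).
share_with_group() {
    local env_dir="$1"
    chmod -R g+rX "$env_dir"
}
```

It would be called with the environment path used in the conda create --prefix command above.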
Conda Channels
Whenever we use conda create or conda install without mentioning a channel name,
the Conda package manager searches its default channels to install the packages.
If you are looking for specific packages that are not in the default channels,
you have to name the channels explicitly:
conda create --name env_name --channel channel1 --channel channel2 ... package1 package2 ...
For example, the following creates new_env and installs r-sf, shapely and
bioconductor-biobase from the r, conda-forge and bioconda channels:
conda create --name new_env --channel r --channel conda-forge --channel bioconda r-sf shapely bioconductor-biobase
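When the channel and package lists get long, it can help to keep them in shell arrays and assemble the command from them. The sketch below only builds and prints the command for the new_env example so you can review it before running it; the variable names are illustrative.

```shell
# Channels and packages from the new_env example, kept as bash arrays.
channels=(r conda-forge bioconda)
packages=(r-sf shapely bioconductor-biobase)

# Turn each channel into a "--channel NAME" pair.
channel_args=()
for ch in "${channels[@]}"; do
    channel_args+=(--channel "$ch")
done

# Assemble the full command; each array element stays a separate argument.
create_cmd=(conda create --name new_env "${channel_args[@]}" "${packages[@]}")

# Print the command for review.
echo "${create_cmd[@]}"
```

To actually run the assembled command, you could execute "${create_cmd[@]}" after loading the miniconda3 module.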
Conda Packages
To find the required packages, we can visit anaconda.org and search for packages to find
their full name and the corresponding channel. Another option is the conda search
command. Note that we need to
search the right channel to find packages that are not in the default channels. For example:
conda search --channel bioconda biobase
SBATCH Example
To use Anaconda from an SBATCH script, you simply need to load the module
and activate the environment. The following script will create my-env
in the directory named by MYLOCAL if it is not present.
#!/bin/bash
######################### Batch Headers #########################
#SBATCH -p Lewis # use the Lewis partition
#SBATCH -J conda_test # give the job a custom name
#SBATCH -o results-%j.out # give the job output a custom name
#SBATCH -t 0-02:00 # two hour time limit
#SBATCH -N 1 # number of nodes
#SBATCH -n 1 # number of cores (AKA tasks)
#SBATCH --mem 16G # 16G of memory
#################################################################
export MYLOCAL=/home/$USER/data/ # you may update this directory
module load miniconda3
# Create my-env only if it does not already exist
if [ ! -d "$MYLOCAL/my-env" ]; then
    conda create --yes --prefix "$MYLOCAL/my-env" pandas
fi
source activate "$MYLOCAL/my-env"
# Now you can run Python scripts that use the packages in your environment
python -c 'import pandas; print(pandas.__version__)'
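As a sketch of submitting the script and locating its output, the snippet below assumes the script above was saved as conda_test.sh (a hypothetical file name). It uses sbatch --parsable, which prints the job ID, optionally followed by a semicolon and the cluster name. The helper results_file is made up for this example and mirrors the "-o results-%j.out" header.

```shell
# results_file is a hypothetical helper name.
# It maps a Slurm job ID to the file named by "#SBATCH -o results-%j.out".
results_file() {
    printf 'results-%s.out\n' "$1"
}

# Submit only when sbatch is available and the script file exists.
if command -v sbatch >/dev/null 2>&1 && [ -f conda_test.sh ]; then
    job_id=$(sbatch --parsable conda_test.sh)
    job_id=${job_id%%;*}   # drop an optional ";cluster" suffix
    echo "Job ${job_id} will write to $(results_file "$job_id")"
fi
```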
Non-HPC Usage
- To set up an Anaconda environment on Windows without needing admin: Anaconda as a user
- Using Anaconda Navigator to create a Python virtual environment: Anaconda Navigator