Here I discuss some of the most useful conda commands and comment on related topics that come up when working with conda.
Table of Contents
- Introduction
- Command Syntax
- Managing Environments
- Managing Packages
- Sharing Environments
- Adding Conda envs to Jupyter
- Installing LALSuite under Conda environments
Introduction
Conda is a package and environment manager. It lets you install Python (and other software) packages, and create environments that isolate all the software dependencies, avoiding incompatibilities with other installed software.

conda < miniconda < Anaconda

Miniconda includes conda plus Python, pip and a few more packages. Anaconda includes Miniconda plus a lot of scientific packages.
Useful links:
- Introduction to Conda for (data) Scientists. Very easy to read and good for understanding the basics.
- Official Conda Docs. Extensive documentation of conda commands and of how to manage packages and environments (Tasks section).
Command Syntax
The structure of a conda command can be decomposed as follows:

conda + command + argument + --options
Sometimes the order does not matter and you can put the options before the argument. (Disclaimer: I call "options" whatever is added with `--something`, but for some commands they are not optional, so they are in fact arguments.)
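For instance (the environment name and version here are just illustrative):

```bash
# conda + command + argument + --options
conda install numpy=1.20 --name myenv --channel conda-forge
# "install" is the command, "numpy=1.20" the argument,
# "--name myenv" and "--channel conda-forge" are options
```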
Managing Environments
- `conda create --name name-of-env packages-of-env` creates an environment named `name-of-env` with `packages-of-env` installed. E.g. `conda create --name igwn-py38 python numpy pandas`. The version of each package can be specified with `=` or `==`: `conda create --name igwn-py38 python=3.8 numpy=1.20 pandas=1.3`. Typically the third version number (build number) is left free, so there is room for bugfixes and for avoiding some incompatibilities between dependencies. The problem with pinning versions is that you need to know whether they are compatible; if they are not, conda will throw an error and will not create the environment.
- `conda create --clone original-env --name copy-env` clones a conda environment under a new name.
- `conda (de)activate name-of-env` activates/deactivates the given environment.
- `conda activate` (without arguments) activates the base environment, which is always available with miniconda/anaconda.
- By default the environment will be located under `~/.conda/envs/name-of-env`. To specify the location use `--prefix /path/to/folder`. E.g. `conda create --prefix ./env python numpy pandas`.
- The `--prefix` option cannot be combined with `--name`, and the "name" of the env will be the whole path. However, the prompt can be shortened by doing `conda config --set env_prompt '({name})'` (see the example after this list). This will create or modify your `.condarc` file and will set the displayed name to the last item of the path given in `--prefix`, in this case `env`. But the environment will still have no real name.
- `--name` and `--prefix` can be used in other commands to refer to a specific environment. If neither is present, the command applies to the currently active environment.
- `conda env list` shows the list of available environments.
- `conda remove --name name-of-env --all` (or `--prefix /path/to/env`) removes the whole environment (you can also `rm -rf` the env folder). But be careful not to delete the environment you are currently working in, and not to run it while inside the env folder.
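Putting these commands together, a typical session with a `--prefix` environment might look like this (names and versions are illustrative):

```bash
conda create --prefix ./env python=3.8 numpy pandas   # env lives in ./env
conda config --set env_prompt '({name})'              # prompt shows "(env)", not the whole path
conda activate ./env                                  # --prefix envs are activated by path
conda env list                                        # the active environment is marked with '*'
conda deactivate
conda remove --prefix ./env --all                     # delete the environment when done
```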
Managing Packages
- `conda (un)install package-name(=version.number)` installs/uninstalls a package in the current environment, but you can specify the target environment with the `--name` or `--prefix` options. E.g. `conda install --name name-of-env pandas=1.3`. You can specify several packages on the same line.
- Don't install packages in the base environment! Always leave it clean and create another environment for your project.
- `conda search package-name` lists the different versions available for that package.
- `conda list (--name env-name / --prefix /path)` shows the packages installed in the current (or specified) environment.
- Extra info: a conda package is a tarball file with the structure `bin/`; `info/`; `lib/python3.x/site-packages/folders-of-python-packages`.
- Conda packages can be obtained from different repositories or "channels". Some channels may have more up-to-date packages, or packages that do not exist in other channels. Examples of channels: main (or defaults), conda-forge, bioconda, pytorch, etc.
- `conda install scipy=1.6 --channel conda-forge` specifies the channel to install from. Add a hierarchy of channels with `... --channel conda-forge --channel bioconda ...`.
- `conda search package-name --channel conda-forge` shows the versions of this package available in the conda-forge channel.
- We can specify different channels for different packages, both when installing a package and when creating the environment, with this syntax: `conda create --name name-of-env conda-forge::python=3.6 pytorch::pytorch=1.1`.
- If the package is not available in any channel, we can use `pip`, but only as a last resort. Don't use your system pip or the (ana/mini)conda base pip; always install `pip` in your conda env (`conda install pip`) and use that one to avoid conflicts. Then do `python -m pip install scipy==1.2` (notice the double `==` now).
- IMPORTANT WARNING: try not to install anything with `pip --user`, or delete your `~/.local` folder. For some weird reason, conda looks for Python packages first under `~/.local/lib/python3.8/site-packages` and only then at the environment-specific `~/.conda/envs/your-env/lib/python3.8/site-packages`. This is very counterintuitive and quite annoying, since we lose the whole point of an environment. I would understand using the system's packages as a last resort, if they are not in the conda environment... By doing `export PYTHONNOUSERSITE=True` you will skip `.local`; however, this will also apply when you are not using conda and are just using the system's Python (all pip-installed packages go to `~/.local` when using the system's pip). There must be a way of setting this environment variable only when activating a conda environment, e.g. with `.condarc` (I haven't found it yet; all the issues have been open for years or closed without a solution...). I have just modified my `activate_conda` alias, which activates CVMFS's conda, and added `export PYTHONNOUSERSITE=True` there. There are other aliases to set and unset `PYTHONNOUSERSITE`, but it is easier to open a new terminal. There is another alias (`sys-path`) to check where Python is looking for packages.
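One mechanism that should achieve this per environment (I have not tested it in this setup) is conda's activation hooks: any script placed under `$CONDA_PREFIX/etc/conda/activate.d/` is sourced on `conda activate`, and scripts under `deactivate.d/` on `conda deactivate`. A minimal sketch:

```bash
# Run once with the target environment active
mkdir -p "$CONDA_PREFIX/etc/conda/activate.d" "$CONDA_PREFIX/etc/conda/deactivate.d"

# Sourced automatically on `conda activate`: ignore ~/.local packages
echo 'export PYTHONNOUSERSITE=True' > "$CONDA_PREFIX/etc/conda/activate.d/no-user-site.sh"

# Sourced automatically on `conda deactivate`: restore normal behaviour
echo 'unset PYTHONNOUSERSITE' > "$CONDA_PREFIX/etc/conda/deactivate.d/no-user-site.sh"
```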
Sharing Environments
- Environments can be created from, and exported to, YAML files. These have a simple syntax and use indentation to indicate nesting.
- We can indicate the name, the channels and the packages to be installed. We can also install packages with pip (which must itself be a dependency), or include a `requirements.txt` file with the pip packages; that file is just a column of package names with versions, e.g. `numpy>=1.20`.
- It is good practice to version control the `environment.yml` files.
- Example:

```yaml
name: name-of-env
channels:
  - pytorch
  - defaults
dependencies:
  - ipython
  - matplotlib=3.1
  - pytorch=1.1
  - pip=19.1
  - pip:
    - kaggle==1.5
    # or: - -r file:requirements.txt
```
When using the `--file` option we have to add `env` to the `conda create` command:
- `conda env create --file environment.yml` creates an env with the name specified in the yml. With `--prefix` it will not have a name.
- `conda env export (--name name-of-env / --prefix ...) --file my-env.yml` exports the current (or specified) env to a file. If no `--file` option is given, it defaults to `environment.yml`.
- The previous command adds all the dependencies to the file. To export only the original packages that you used to create the environment, do `conda env export --from-history --file my-env.yml`.
- `conda env update --prefix ./env --file myenv.yml --prune` updates the env after the yml has been modified. `--prune` is supposed to remove dependencies that are no longer necessary, but if you delete a dependency from the file it will not actually be removed.
- `conda env create --prefix ./env --file myenv.yml --force` rebuilds the environment from scratch.
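A typical round trip with these commands might be (commit message illustrative):

```bash
conda env export --from-history --file environment.yml   # only explicitly requested packages
git add environment.yml                                   # version control the spec

# On another machine, or after editing the yml:
conda env create --file environment.yml                   # first time
conda env update --file environment.yml --prune           # after later edits
```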
Adding Conda envs to Jupyter
We do not need to install Jupyter in every conda environment; we can make our Jupyter installation aware of our conda environments:
- Install `ipykernel` in your environment.
- Activate the environment (if it was not already active).
- `python -m ipykernel install --user --name env-name --display-name "Env-name"`. This command creates a kernel spec file in JSON format to be used by Jupyter. The `--user` option puts this file under `~/.local/share/jupyter/kernels/env-name/kernel.json`. Notice that the `--name env-name` argument is used internally by Jupyter and does not need to match the actual name of the environment (which may not even have one, if it was created with `--prefix`).
- Then you can open Jupyter (from your local installation, or from another conda env which has it) and it will show the new kernel in the dropdown menu with the "Env-name" given in `--display-name`. Deleting the `~/.local/share/jupyter/kernels/env-name` folder removes the kernel.
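The whole sequence, with illustrative names:

```bash
conda activate my-env         # or: conda activate ./env for a --prefix environment
conda install ipykernel
python -m ipykernel install --user --name my-env --display-name "My Env"
jupyter kernelspec list       # verify the new kernel spec is registered
```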
Installing LALSuite under Conda environments
The igwn-pyXY environments already provide LALSuite. If you need to install your own LAL:
- Activate the igwn environment (or better, a copy; these environments are ~12 GB).
- `./00boot` as usual.
- `configure --prefix=$CONDA_PREFIX --options`. On Hawk it was necessary to add `CFLAGS=-Wno-error`. My usual command for LALSimulation only is: `configure --prefix=$CONDA_PREFIX --disable-lalinference --disable-lalinspiral --disable-lalapps --disable-lalpulsar --disable-lalburst`.
- `make -j; make install` as usual.
This will add and replace several folders, files and binaries under `$CONDA_PREFIX`. Inside this location the structure is something like:
- `etc/`
- `lib/` (`*`, `pkgconfig`, `python3.8/site-packages`)
- `include/`
- `share/`
- ...

The `make install` distributes the LAL files in that folder structure.
Multiple LAL installations under one Conda environment
It is advisable to have one conda environment per LAL installation. However, the `igwn` environments are 12 GB in size, so it is not practical to have several of them. What I would advise is to create smaller environments where you install just what you need to compile LAL (in my case, only LALSimulation); see the next section.
Nevertheless, if you need to demonstrate that your LAL code works in an official `igwn` environment, here is what you need to do.
- Compile your LAL as in the previous section, but adding an extra subfolder: `--prefix=$CONDA_PREFIX/lalsuite-new`.
- If you now source this LAL with `source $CONDA_PREFIX/lalsuite-new/etc/lalsuite-user-env.sh`, it will pick up the correct LAL installation, branch, etc. Check with the aliases `check-lal` and `sys-path`. However, importing `lalsimulation` will not work, due to a SWIG-related error.
- The reason seems to be that the LAL files live in their own `etc/`, `bin/`, `include/`, ... structure, and are not shared with the files in conda's structure. So the obvious solution seems to be to move the LAL files into the conda structure.
- The script `switch-conda-lal.sh` does this (a sketch of the idea appears after this list). Instead of moving the files, it creates symlinks in conda's structure pointing to the files in the `new-lal` structure, and it backs up in `lalsuite-default` those files that will be replaced.
- Before creating the symlinks, it checks whether the files in conda's structure already exist in the folder of the `old-lal` installation. This `old-lal` is the currently active LAL installation: it corresponds to `lalsuite-default` if that has already been created, or to whatever other LAL installation was sourced before. It then removes from conda's structure those files that are already present in `old-lal`.
- Finally, it sources the `new-lal` installation: `source $CONDA_PREFIX/lalsuite-new/etc/lalsuite-user-env.sh`. In the default LAL installation there is no such file; instead, we source all the `etc/lal*-user-env.sh` scripts (`lalsuite-user-env.sh` is just a pointer to them). When switching to the default LAL, the environment variables `LAL_PREFIX` and `PYTHONPATH` are unset: the former because the script uses it to detect the default installation, the latter to keep `sys-path` clean and avoid mixing several installations.
- The script needs to be run with `. switch-conda-lal.sh` or `source switch-conda-lal.sh`. Running it with `bash` will not make the environment variables persist. I created an alias for an easy run: `switch-conda-lal`.
. -
To test that it worked you can again use the alias
check-lal
,sys-path
and now you canimport lalsimulation
. You can also run some of the tests in LALSimulation e.g.:python lalsimulation/test/pythont/test_phenomX.py
. I tested also in Jupyter notebooks and it worked. - Further tests:
- Check bilby
Done by runningpython fast_tutorial.py
from the bilby examples. Correct LAL picked in the logs. - HTCondor, getenv=true
- Check bilby
- EXTRA: there might be a more elegant solution. The XLAL functions in the main package `lal` could be accessed through SWIG just by doing the source. It is also true that `lalsimulation` appears in some other places in the file structure where `lal` is not present, so it is difficult to work out a solution without an understanding of the SWIG interface. The file causing the import problem seemed to be `lalsimulation.py`; modifying it could fix the problem, but bear in mind that this file is "automatically generated by the swig interface" and I do not know whether that happens during compilation or at run time.
- WARNING: I tried to pip install pesummary to update the version, but ran into problems with "too many levels of symbolic links".
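For reference, a heavily simplified, hypothetical sketch of the symlink-and-backup idea behind `switch-conda-lal.sh` (the real script also handles the `old-lal` checks and the default-LAL case described above):

```bash
# Must be sourced (". switch-conda-lal.sh") so the environment variables persist
NEW_LAL="$CONDA_PREFIX/lalsuite-new"
BACKUP="$CONDA_PREFIX/lalsuite-default"

cd "$NEW_LAL" || return 1
find etc bin include lib share -type f 2>/dev/null | while read -r f; do
    target="$CONDA_PREFIX/$f"
    if [ -e "$target" ] && [ ! -L "$target" ]; then
        mkdir -p "$BACKUP/$(dirname "$f")"
        mv "$target" "$BACKUP/$f"          # back up the conda-provided file
    fi
    mkdir -p "$(dirname "$target")"
    ln -sf "$NEW_LAL/$f" "$target"         # conda's tree now points at new-lal
done

source "$NEW_LAL/etc/lalsuite-user-env.sh"
```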
Light Conda environment for LALSimulation
This section describes the minimal requirements to install a light version of LALSuite, basically consisting of LALSimulation, under a conda environment.
The requirements consist of the libraries: FFT, GSL, numpy, framel, metaio and h5py.
- Create a custom conda environment: `conda create --name mylal-branch python=3.9 mamba`. We chose Python 3.9 and installed `mamba`, which is a "fast conda".
- Install in this environment the libraries:
  - `numpy`
  - `gsl` (for some reason the global GSL installation was not picked up during the configure)
  - `h5py` (the configure will work without this, but then `make` will complain during the surrogate and ROM steps)
- Install the libraries `framel` and `metaio`:
  - Instructions can be found on this LIGO website.
  - Basically: download the library, untar, configure, make, make install.
  - What I did:
    - `curl http://software.igwn.org/lscsoft/source/metaio-8.5.1.tar.gz > ~/Utilities/metaio-8.5.1.tar.gz`
    - `curl http://software.igwn.org/lscsoft/source/framel-8.39.2 > ~/Utilities/framel-8.39.2`
    - `tar -xf` both of them
    - `cd framel*`; `cmake CMakeLists.txt`; `sudo make install`
    - `cd metaio*`; `./configure`; `make`; `sudo make install`
  - This installed the libraries globally on my system (`/usr/local/lib`), but presumably adding `--prefix=$CONDA_PREFIX` to the configure would have installed them only into the conda environment; see the sketch below.
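Following that idea, an untested variant that would keep everything inside the environment (`-DCMAKE_INSTALL_PREFIX` is cmake's equivalent of `--prefix`):

```bash
# metaio: autotools build into the conda env (no sudo needed)
cd ~/Utilities/metaio-8.5.1
./configure --prefix="$CONDA_PREFIX"
make && make install

# framel: cmake build into the conda env
cd ~/Utilities/framel-8.39.2
cmake -DCMAKE_INSTALL_PREFIX="$CONDA_PREFIX" .
make && make install
```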
- Compile LALSuite. Now we proceed with the general recipe outlined at the beginning of the main section:
  - cd into the lalsuite source folder; `git clean -xdf`; `./00boot`
  - cd into the build directory; `configure --prefix=$CONDA_PREFIX --disable-non-lalsimulation-stuff` (i.e. with the `--disable-*` flags shown earlier)
  - `make -j`; `make install`
- All these steps were done for my lalsuite-ceci-phX20 environment, and the alias `check-lal` responded correctly. Later I installed other stuff like `matplotlib`, `pandas` and `scipy`. After installing `ipykernel` I was able to generate a TD waveform for XHM within a Jupyter notebook (opened with another conda environment).
- Install more packages as needed: `pesummary`, `pycbc`, `bilby`, etc.