PyFAMSA
Cython bindings and Python interface to FAMSA, an algorithm for ultra-scale multiple sequence alignments.
Overview
FAMSA is a method published in 2016 by Deorowicz et al. for large-scale multiple sequence alignments. It uses state-of-the-art time and memory optimizations as well as a fast guide tree heuristic to reach very high performance and accuracy.
pyfamsa is a Python module that provides bindings to FAMSA using the Cython language. It implements a user-friendly, Pythonic interface to align protein sequences using different parameters and access results directly. It interacts with the FAMSA library interface, which has the following advantages:
- Just add pyfamsa as a pip or conda dependency, no need for the FAMSA binary or any external dependency.
- Create input Sequence objects programmatically through the Python API.
- Retrieve alignments as dedicated Alignment objects using the compressed gap representation from FAMSA.
- Run computations in parallel using the FAMSA threading model built on POSIX threads.
- Get the same results as the latest FAMSA version (2.2.0).
- Access all the features of the original CLI through the Python API.
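The points above can be illustrated with a minimal sketch. It assumes that Aligner, Sequence, and the id and sequence attributes of aligned rows behave as described in the pyfamsa API reference; align_records and to_fasta are hypothetical helper names introduced here for illustration and are not part of pyfamsa.

```python
def align_records(records, threads=0, guide_tree="sl"):
    """Align (name, residues) string pairs; return (name, gapped residues) pairs.

    align_records is a hypothetical helper, not part of pyfamsa.
    """
    # Imported inside the function so this module still loads where pyfamsa
    # is not installed.
    from pyfamsa import Aligner, Sequence

    # Sequence objects are built from bytes: an identifier and the residues.
    sequences = [
        Sequence(name.encode(), residues.encode()) for name, residues in records
    ]

    # Assumption: threads=0 lets the library choose the number of threads, and
    # guide_tree="sl" selects single-linkage guide trees. Check the API
    # reference for the values supported by your installed version.
    aligner = Aligner(threads=threads, guide_tree=guide_tree)
    alignment = aligner.align(sequences)

    # Each aligned row exposes its identifier and gapped residues as bytes.
    return [(row.id.decode(), row.sequence.decode()) for row in alignment]


def to_fasta(rows):
    """Format (name, residues) pairs as FASTA text (hypothetical helper)."""
    return "".join(f">{name}\n{residues}\n" for name, residues in rows)
```

With pyfamsa installed, passing a list such as [("Sp8", "GLGKVIVYGIVLGTKSDQFSNWVVWLFPWNGLQIHMMGII"), ("Sp10", "DPAVLFVIMLGTITKFSSEWFFAWLGLEINMMVII")] to align_records and the result to to_fasta should give FASTA text in which both rows are padded with gap characters to a common length.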
Setup
Run pip install pyfamsa in a shell to download the latest release and all its dependencies from PyPI, or have a look at the Installation page to find other ways to install pyfamsa.
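To confirm that the installation succeeded, one option is to query the package metadata from Python. The sketch below uses only the standard library; installed_version is a hypothetical helper name, and "pyfamsa" is assumed to be the distribution name used on PyPI.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution="pyfamsa"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None
```

Calling installed_version() returns the version string when pyfamsa is available in the active environment, and None otherwise.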
Library
License
This library is provided under the GNU General Public License v3.0.
FAMSA is developed by the REFRESH Bioinformatics Group and is distributed under the terms of the GPLv3 as well.
This project is in no way affiliated with, sponsored by, or otherwise endorsed by the original FAMSA authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.