PyFAMSA#

Cython bindings and Python interface to FAMSA, an algorithm for ultra-scale multiple sequence alignments.

Overview#

FAMSA is a method published in 2016 by Deorowicz et al. for large-scale multiple sequence alignments. It uses state-of-the-art time and memory optimizations as well as a fast guide tree heuristic to reach very high performance and accuracy.

pyfamsa is a Python module that provides bindings to FAMSA using the Cython language. It implements a user-friendly, Pythonic interface to align protein sequences using different parameters and access results directly. It interacts with the FAMSA library interface, which has the following advantages:

Batteries-included

Just add pyfamsa as a pip or conda dependency, no need for the FAMSA binary or any external dependency.

Flexible

Create input Sequence objects programmatically through the Python API.

Practical

Retrieve alignments as dedicated Alignment objects using the compressed gap representation from FAMSA.

Parallel

Run computations in parallel using the FAMSA threading model built on POSIX threads.

Consistent

Get the same results as the latest FAMSA version (2.2.0).

Feature-complete

Access all the features of the original CLI through the Python API.
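To give an intuition for the "compressed gap representation" mentioned under Practical, here is a conceptual sketch in plain Python. It stores an aligned row as its ungapped residues plus a count of the gaps in front of each residue. The names `GapCompressed`, `compress` and `expand` are hypothetical and exist only for this illustration; this is not the actual data layout used by FAMSA or pyfamsa, only one simple way such a scheme can work.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GapCompressed:
    """Illustrative container, not a pyfamsa class."""
    symbols: str            # residues with all gaps removed
    gaps_before: List[int]  # gaps_before[i] precedes symbols[i]; last entry = trailing gaps


def compress(row: str, gap: str = "-") -> GapCompressed:
    """Turn a gapped alignment row into residues plus gap counts."""
    symbols = []
    gaps = [0]
    for ch in row:
        if ch == gap:
            gaps[-1] += 1      # extend the current run of gaps
        else:
            symbols.append(ch)
            gaps.append(0)     # start counting gaps before the next residue
    return GapCompressed("".join(symbols), gaps)


def expand(compressed: GapCompressed, gap: str = "-") -> str:
    """Rebuild the gapped row from the compressed form."""
    parts = []
    for i, ch in enumerate(compressed.symbols):
        parts.append(gap * compressed.gaps_before[i])
        parts.append(ch)
    parts.append(gap * compressed.gaps_before[-1])
    return "".join(parts)


if __name__ == "__main__":
    c = compress("--AC-G-")
    print(c.symbols, c.gaps_before)
    print(expand(c))
```

The appeal of a scheme like this is that long gap runs, which are common in very large alignments, cost one integer each instead of one character per column.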

Setup#

Run pip install pyfamsa in a shell to download the latest release and all its dependencies from PyPI, or have a look at the Installation page to find other ways to install pyfamsa.
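Once the package is installed, a short alignment can serve as a smoke test. The sketch below assumes the `Aligner` and `Sequence` classes exported by pyfamsa, with sequence identifiers and residues passed as bytes. The helper name `smoke_test` and the example sequences are only illustrative, and the function returns None instead of failing when pyfamsa cannot be imported.

```python
def smoke_test():
    """Align a few short protein sequences; return None if pyfamsa is missing."""
    try:
        from pyfamsa import Aligner, Sequence
    except ImportError:
        return None

    # Identifiers and residues are given as bytes.
    sequences = [
        Sequence(b"Sp8", b"GLGKVIVYGIVLGTKSDQFSNWVVWLFPWNGLQIHMMGII"),
        Sequence(b"Sp10", b"DPAVLFVIMLGTITKFSSEWFFAWLGLEINMMVII"),
        Sequence(b"Sp6", b"ASGAILTLGIYLFTLCAVISVSWYLAWLGLEINMMAII"),
        Sequence(b"Sp17", b"FAYTAPDLLLIGFLLKTVATFGDTWFQLWQGLDLNKMPVF"),
    ]

    aligner = Aligner()             # default parameters
    msa = aligner.align(sequences)  # returns an Alignment object

    # Each aligned row exposes its id and gapped sequence as bytes.
    return [(row.id.decode(), row.sequence.decode()) for row in msa]


if __name__ == "__main__":
    rows = smoke_test()
    if rows is None:
        print("pyfamsa is not installed")
    else:
        for name, aligned in rows:
            print(name.ljust(10), aligned)
```

If the installation worked, every printed row should have the same length, since all rows of a multiple sequence alignment share the same number of columns.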

License#

This library is provided under the GNU General Public License v3.0.

FAMSA is developed by the REFRESH Bioinformatics Group and is distributed under the terms of the GPLv3 as well.

This project is in no way affiliated with, sponsored, or otherwise endorsed by the original FAMSA authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.