Aligner

class pyfamsa.Aligner

A single FAMSA aligner.

scoring_matrix

The scoring matrix used for scoring alignments.

Type: ScoringMatrix

New in version 0.4.0: The scoring_matrix attribute.

__init__(*, threads=0, guide_tree='sl', tree_heuristic=None, medoid_threshold=0, n_refinements=100, keep_duplicates=False, refine=None, scoring_matrix=None)

Create a new aligner with the given configuration.

Keyword Arguments:
  • threads (int) – The number of threads to use for parallel computations. If 0 is given (the default), os.cpu_count() is used to spawn one thread per CPU on the host machine.

  • guide_tree (str) – The method for building the guide tree. Supported values are: sl for MST+Prim single linkage, slink for SLINK single linkage, upgma for UPGMA, nj for neighbour joining.

  • tree_heuristic (str or None) – The heuristic to use for constructing the tree. Supported values are: medoid for medoid trees, part for part trees, or None to disable heuristics.

  • medoid_threshold (int) – The minimum number of sequences a set must contain for medoid trees to be used, if enabled with tree_heuristic.

  • n_refinements (int) – The number of refinement iterations to run.

  • keep_duplicates (bool) – Set to True to avoid discarding duplicate sequences before building trees or alignments.

  • refine (bool or None) – Set to True to force refinement, False to disable refinement, or leave as None to disable refinement automatically for sets of more than 1000 sequences.

  • scoring_matrix (ScoringMatrix or str) – The scoring matrix to use for scoring alignments. By default, the MIQS matrix by Yamada & Tomii (2014) is used, as in the original FAMSA implementation.

New in version 0.4.0: The scoring_matrix argument.

align(sequences)

Align sequences together.

Parameters:

sequences (iterable of Sequence) – An iterable yielding the digitized sequences to align.

Returns:

Alignment – The input sequences aligned together, as gapped sequences.

align_profiles(profile1, profile2)

Align two profiles together.

Profile-profile alignment computes a new alignment using sequences from the two input alignments while preserving the columns of each profile.

Parameters:
  • profile1 (Alignment) – The first profile to align.

  • profile2 (Alignment) – The second profile to align.

Returns:

Alignment – The resulting profile-profile alignment.

New in version 0.5.0.

build_tree(sequences)

Build a tree from the given sequences.

Parameters:

sequences (iterable of Sequence) – An iterable yielding the digitized sequences to build a tree from.

Returns:

GuideTree – The guide tree obtained from the sequences.