markschl/seq_consensus

Consensus sequences from multiple alignments

seq_consensus is a simple Python 3 library for calculating consensus sequences from multiple sequence alignments. Ambiguity codes in the input are handled as well. NumPy is used under the hood. Currently, DNA/RNA sequences are supported.

The package additionally offers a small utility (cons_tool) for calculating consensus sequences on the command line.

How is the consensus calculated?

The method is identical to the approach used by Geneious and very similar to the function ConsensusSequence from the DECIPHER R package (whose options differ slightly). The API documentation describes it in more detail.
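To give a rough idea of what a threshold-based consensus looks like, here is a minimal pure-Python sketch. It is not the library's NumPy implementation, and the names sketch_consensus, IUPAC and TO_CODE are hypothetical. It assumes that ambiguity codes contribute fractional counts to the bases they represent, that symbols are added from most to least frequent until their combined frequency reaches the threshold, and that a mix of gaps and bases is reported as '?'. Tie handling is simplified (ties are broken alphabetically), so results may differ from seq_consensus in edge cases.

```python
from collections import Counter

# IUPAC DNA codes mapped to the bases they stand for
IUPAC = {
    'A': 'A', 'C': 'C', 'G': 'G', 'T': 'T',
    'R': 'AG', 'Y': 'CT', 'S': 'CG', 'W': 'AT', 'K': 'GT', 'M': 'AC',
    'B': 'CGT', 'D': 'AGT', 'H': 'ACT', 'V': 'ACG', 'N': 'ACGT',
}
# Reverse lookup: set of bases -> IUPAC code
TO_CODE = {frozenset(bases): code for code, bases in IUPAC.items()}


def sketch_consensus(seqs, threshold=0.6):
    """Illustrative threshold consensus for aligned DNA sequences."""
    result = []
    for column in zip(*seqs):
        weights = Counter()
        for letter in column:
            if letter == '-':
                weights['-'] += 1.0
            else:
                # An ambiguity code is split evenly among its bases
                bases = IUPAC[letter.upper()]
                for base in bases:
                    weights[base] += 1.0 / len(bases)
        # Add symbols from most to least frequent until the
        # cumulative frequency reaches the threshold
        chosen, cumulative = set(), 0.0
        ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
        for symbol, weight in ranked:
            chosen.add(symbol)
            cumulative += weight / len(column)
            if cumulative >= threshold:
                break
        if chosen == {'-'}:
            result.append('-')
        elif '-' in chosen:
            result.append('?')  # assumption: gap mixed with bases
        else:
            result.append(TO_CODE[frozenset(chosen)])
    return ''.join(result)
```

Raising the threshold forces more symbols into each column's set, so the consensus contains more ambiguity codes; lowering it favours the single most frequent symbol.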

Documentation

A complete user guide and API documentation are available. Below are some small examples for demonstration:

Usage example

from seq_consensus import consensus

seqs = [
    'ATTGC',
    'AT-CC',
    'RT-C-'
]

consensus(seqs, threshold=0.6)

This returns:

'AT-CC'

Command-line tool example

The script cons_tool provides the same functionality on the command line. An especially useful feature is the ability to group sequences by an arbitrary regular expression pattern matched in the sequence headers:

cons_tool -k 'p:\w+' input.fasta

Example output (given that taxonomic annotations are present in the headers):

>p:Evosea consensus (n=124)
TACKATTTA--RTATTGAC-?TWA?-GKTACTAAAGCATGGGKA-T?AAA?AGGATTAGAGACCCTYGTA
>p:Chordata consensus (n=7065)
TWAYTTTA?--WAW-YWAY-YTGAA-YCCACGAAAGCTAAGAMA-CAAACTGGGATTAGATACCCCACTA
>p:Mollusca consensus (n=843)
TWAWTWTAW--WAW?WWAY-TTGAA-KYYAYGAAAKCTWRGRWA-YAAACTAGGATTAGATACCCTAYTA
>p:Chordata consensus (n=8509)
TWAYTTTA?--WAW-YMAC-TTGAA-CCCACGAAAGCTARGAMA-CAAACTGGGATTAGATACCCCACTA
>p:Platyhelminthes_ consensus (n=130)
TWAWTWTAA--WDW?TKWY-YTGAA-KYYACGAAAGYTAKGWTA-YAAACTGGGATTAGATACCCCATTA
>p:Ascomycota consensus (n=280)
TTAWTWTAA--WAA?TDAC-TTGAR-K??ACGAAAGCTWRGRWA-CAAACTAGGATTAGATACCCYABTA
>p:Streptophyta consensus (n=269)
TWAWTWTAW--WAW?TRAY-TTGAR-KY?ACGAAAGCTTRGRKA-CAAACTAGGATTAGATACCCTAKTA
(...)
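
The grouping step can be approximated in plain Python as follows. This is a sketch of the idea only, not the code behind cons_tool: the helper group_by_header and the example headers are hypothetical, and skipping records whose header does not match the pattern is an assumption made here for simplicity.

```python
import re
from collections import defaultdict


def group_by_header(records, pattern):
    """Group (header, sequence) pairs by the first regex match in the header."""
    regex = re.compile(pattern)
    groups = defaultdict(list)
    for header, seq in records:
        match = regex.search(header)
        if match:  # records without a match are skipped in this sketch
            groups[match.group(0)].append(seq)
    return dict(groups)


# Hypothetical aligned records with taxonomic annotations in the headers
records = [
    ('seq1 k:Metazoa,p:Chordata', 'ATTGC'),
    ('seq2 k:Metazoa,p:Chordata', 'AT-CC'),
    ('seq3 k:Metazoa,p:Mollusca', 'RT-C-'),
]

groups = group_by_header(records, r'p:\w+')
for key, seqs in groups.items():
    print(f'>{key} (n={len(seqs)})')
```

Each list of sequences in groups could then be passed to consensus(seqs, threshold=...) from seq_consensus to obtain one consensus per group.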
