GeneSpeak 🧬

A library to encode text as DNA and decode DNA to text.

GeneSpeak encodes regular text as DNA using the four bases (A, T, G, C) and converts the DNA back to the original text. Both ASCII and UTF-8 text are supported; the strategy keyword argument selects which one is used. The encoding schema can be any combination of A, T, G, C (for example, "ATCG").
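To make the idea concrete, here is a minimal sketch of a 2-bit-per-base encoding. It is not taken from the GeneSpeak source: `encode_ascii` is a hypothetical helper, and the mapping (each base's position in the schema is its 2-bit value, with bytes read from the most significant pair down) is an assumption inferred from the sample output in the Usage section, which it reproduces for the "ATCG" schema.

```python
def encode_ascii(text: str, schema: str = "ATCG") -> str:
    """Hypothetical helper (not the library API): 4 bases per ASCII byte."""
    bases = []
    for byte in text.encode("ascii"):
        # Assumption: read the byte as four 2-bit pairs, most significant first,
        # and use each pair's value (0-3) as an index into the schema.
        for shift in (6, 4, 2, 0):
            bases.append(schema[(byte >> shift) & 0b11])
    return "".join(bases)


print(encode_ascii("Hi"))  # → TACATCCT
```

Under this assumption, changing the schema string only changes which base stands for which 2-bit value; the length of the encoded DNA stays at four bases per character.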

Installation 📜

You can install the library via pip or conda.

Install with pip

pip install genespeak

Install with conda

conda install -c conda-forge genespeak

Quickstart ⚡

See the quickstart guide here.

The quickstart is available on the following services:

  • Colab
  • Binder
  • SageMaker Studio Lab
  • Deepnote
  • Kaggle

Demo App ✨

You can try GeneSpeak in this Streamlit app: https://tinyurl.com/genespeak-demo

Usage ✋

import genespeak as gp
print(f'{gp.__name__} version: {gp.__version__}')

schema = "ATCG"  # encoding schema: a combination of A, T, C, G
strategy = "ascii"  # encoding strategy: "ascii" or "utf-8"
text = "Hello World!"

dna = gp.text_to_dna(text, schema=schema)
text_from_dna = gp.dna_to_text(dna, schema=schema)
print(f'Text: {text}\nEncoded DNA: {dna}\nDecoded Text: {text_from_dna}\nSuccess: {text == text_from_dna}')

Output

genespeak version: 0.0.5
Text: Hello World!
Encoded DNA: TACATCTTTCGATCGATCGGACAATTTGTCGGTGACTCGATCTAACAT
Decoded Text: Hello World!
Success: True

Documentation 📚

The genespeak docs are maintained here.

License 📑

The library is available under the MIT license.

Citation 🔖

You may cite this library as follows.

@software{ray2022genespeak,
    author = {Ray, Sugato},
    title = {GeneSpeak - A library to encode text as DNA and decode DNA to text},
    url = {https://github.com/sugatoray/genespeak},
    doi = {10.5281/zenodo.5885777},
    month = {1},
    year = {2022}
}

GeneSpeak Thumbprint 👍

Let's have some fun! ✨ The following is a GeneSpeak thumbprint of genespeak itself.

schema strategy thumbprint
ATCG ascii TCTGTCTTTCGCTCTTTGAGTGAATCTTTCATTCCG
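
The thumbprint can be read back by hand. The sketch below is a hypothetical `decode_ascii` helper, not the library's `dna_to_text`; it assumes each base's position in the schema is its 2-bit value and that every four bases form one ASCII byte. Under that assumption, the thumbprint above decodes to the package name.

```python
def decode_ascii(dna: str, schema: str = "ATCG") -> str:
    """Hypothetical helper (not the library API): 4 bases back to 1 ASCII byte."""
    codes = []
    for start in range(0, len(dna), 4):
        byte = 0
        for base in dna[start:start + 4]:
            # Assumption: the base's index in the schema is its 2-bit value;
            # shift the running byte left and append those two bits.
            byte = (byte << 2) | schema.index(base)
        codes.append(byte)
    return bytes(codes).decode("ascii")


print(decode_ascii("TCTGTCTTTCGCTCTTTGAGTGAATCTTTCATTCCG"))  # → genespeak
```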

Repository Health Metrics 💟

Includes health and security badges from:

  • Sonarcloud
  • OSSF Code Quality