StevenSong/steml

STEML: Spatial Transcriptomics Enhanced Machine Learning

steml is a Python module with tools to load, transform, and analyze spatial transcriptomics data. The spatial transcriptomics technology currently supported is 10x Genomics Visium.

To install the module, clone the repo, create and activate the conda environment defined in env.yml, and then run pip install . from the root of the repo. After any change to the codebase, reinstall the module by running pip install . again.
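After installing or reinstalling, one quick way to confirm that the active environment can see the module is to ask Python's import machinery. The sketch below is only an illustration: the helper name is_importable is hypothetical and not part of steml, and it assumes the package is importable under the name steml.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be found in the active environment.

    Hypothetical helper for illustration; it is not part of steml.
    """
    # find_spec returns None when a top-level module cannot be located.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: the installed package is importable as "steml".
    print("steml importable:", is_importable("steml"))
```

If this reports False, the conda environment is probably not activated or the pip install step has not been run in it.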

The repo is divided into several subdirectories to organize code, scripts, and experiments:

  • Useful scripts which call external software are contained within scripts (e.g. running spaceranger count to process the raw FastQ data).
  • The machine learning pipeline source code is contained within the steml subdirectory. This is the actual python module which is divided into submodules:
    • The recipes submodule contains the code for functions which should be called directly by a user of the module. These include data preprocessing and model training recipes.
    • The data submodule contains the code for the data loading tools. Note that the data preprocessing tools are not contained here but in the recipes submodule. Further data preprocessing tools are provided as Jupyter notebooks in the notebooks subdirectory at the root of the repo.
    • The models submodule contains the code for instantiating machine learning models. The model architecture currently implemented is ResNet18 with the capability to predict categorical outputs or regress continuous values.
    • The plots submodule contains the code for some limited plotting functionality for modeling results. Many more plots are currently generated by notebooks in the notebooks subdirectory at the root of the repo.
  • notebooks contains several Jupyter notebooks that preprocess data, perform axis alignment, and plot results from the machine learning pipeline.
  • experiments contains the Python scripts that call the steml pipeline. Each script is named after its experiment ID. Exact details for each experiment are recorded in the table in experiments.csv.
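Because each experiment script is named after its experiment ID, the matching row in experiments.csv can be looked up from the script name. The following is a minimal sketch under stated assumptions: the column name experiment_id, the example ID exp001, and the helper find_experiment are all hypothetical, since the actual layout of experiments.csv is not described here.

```python
import csv
from pathlib import Path
from typing import Dict, Iterable, Optional


def find_experiment(
    lines: Iterable[str],
    experiment_id: str,
    id_column: str = "experiment_id",  # assumed column name, not confirmed
) -> Optional[Dict[str, str]]:
    """Return the first CSV row whose ID column equals experiment_id.

    `lines` is any iterable of text lines, such as an open file handle.
    Returns None when no row matches or the column is absent.
    """
    for row in csv.DictReader(lines):
        if row.get(id_column) == experiment_id:
            return dict(row)
    return None


def experiment_id_from_script(script_path: str) -> str:
    """Derive the experiment ID from a script file name (the name without .py)."""
    return Path(script_path).stem


if __name__ == "__main__":
    # Hypothetical script name; the real IDs live in the experiments directory.
    exp_id = experiment_id_from_script("experiments/exp001.py")
    with open("experiments.csv", newline="") as handle:
        print(find_experiment(handle, exp_id))
```

Passing the column name as a parameter keeps the sketch usable once the real header of experiments.csv is known.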

This repo depends on code and data from the following sources:
