GNUMAP: Genomic Next-generation Universal MAPper version 4.0

Installation

  1. Download the source code from the repository, or clone it with git clone https://github.com/byucsl/gnumap.git.
  2. In the gnumap/ directory run make.
  3. Optional: add the path to the GNUMAP binary (./bin/gnumap) to your PATH variable.
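The optional PATH step can be sketched as follows. This is a minimal example that assumes a POSIX-style shell and that the current directory is the top of the gnumap/ checkout after make has finished; the variable name GNUMAP_HOME is only illustrative and is not used by GNUMAP itself.

```shell
# Assumption: run this from inside the gnumap/ directory after `make`.
GNUMAP_HOME="$(pwd)"                  # illustrative variable name, not part of GNUMAP
export PATH="$GNUMAP_HOME/bin:$PATH"  # put the bin/ directory first on PATH

# Check whether the shell can now find the binary.
command -v gnumap || echo "gnumap not found on PATH; has make been run?"
```

The change lasts only for the current shell session. To make it permanent, the export line (with an absolute path in place of the variable) would typically go in the shell's startup file.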

Mac Installation

To install on a Mac, clone the repository with the following command:

git clone -b mac https://github.com/byucsl/gnumap.git

and follow the regular installation instructions.

Quick Start

  1. Run ./bin/gnumap to view a help menu explaining how to run GNUMAP.
  2. To perform an alignment using the example dataset, run the following command:

./bin/gnumap -g examples/Cel_gen.fa -o gnumap.out -a .9 examples/Cel_gen.reads.1.fq

  • The -g parameter specifies the genome in FASTA format.
  • The -o parameter gives the path and file name of the output, which is written in SAM format.
  • The -a parameter takes a percentage (in the form of a floating-point number, such as the .9 above) and specifies the minimum alignment score that will be accepted for mapped reads.
  • The last parameter is the FASTQ file containing the reads to be mapped. Note: multiple read files may be listed, and GNUMAP will map the reads from each of them.
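As the note on read files says, several FASTQ files can be given after the options. The sketch below only prints such a command rather than running it, so it can be tried without a built binary. The helper name gnumap_cmd and the file names sample_A.fq and sample_B.fq are placeholders, not files shipped with GNUMAP.

```shell
# Hypothetical helper: prints (does not run) a GNUMAP command line
# that maps every FASTQ file passed as an argument.
gnumap_cmd() {
  printf '%s' "./bin/gnumap -g examples/Cel_gen.fa -o gnumap.out -a .9"
  for reads in "$@"; do
    printf ' %s' "$reads"   # each read file is appended as its own argument
  done
  printf '\n'
}

# sample_A.fq and sample_B.fq are placeholder names; substitute real read files.
gnumap_cmd sample_A.fq sample_B.fq
```

The printed line can be copied and run from the gnumap/ directory once the placeholder names are replaced.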

Common Parameters

Here are some of the commonly used parameters. For a complete list, refer to the file ./docs/DOCUMENTATION.

  • -g, --genome=STRING Genome .fa file(s)
  • -o, --output=STRING Output file
  • -v, --verbose=INT Verbose (default=0)
  • -c, --num_proc=INT Number of processors to run on
  • -m, --mer_size=INT Mer size (default=0)
  • -j, --jump=INT The number of bases to jump in the sequence indexing (default: mer_size)
  • -k, --num_seed=INT The total number of seed hits that must match to a location before it is considered for alignment (default: 2)
  • -h, --max_kmer=INT Kmers that occur more than this many times in the reference genome will not be used in read mapping
  • --no_nw Disables the Needleman-Wunsch alignments and uses only the hit count as the basis for alignment. The score is calculated by summing the number of hits for a position
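To give a feel for what the mer size, seed hit count, and hit-count scoring refer to, here is a toy illustration of seed hit counting in general. It is not GNUMAP's code: it uses a plain awk lookup table rather than the FM-index named in the reference, and the function name seed_hits, the sequences, and the voting scheme are all assumptions made for the example. Each mer of the read that also occurs in the reference votes for the reference position where the read would start, and positions with too few votes are dropped.

```shell
# Toy sketch (not GNUMAP's implementation): count seed hits per candidate position.
# Arguments: reference, read, mer size (cf. -m), minimum hits (cf. -k).
seed_hits() {
  awk -v ref="$1" -v read="$2" -v m="$3" -v min_hits="$4" 'BEGIN {
    # Index every m-mer of the reference by its 1-based start position.
    n = length(ref) - m + 1
    for (p = 1; p <= n; p++) {
      kmer = substr(ref, p, m)
      idx[kmer] = idx[kmer] " " p
    }
    # Each read m-mer at offset o votes for the implied read start, p - o + 1.
    rn = length(read) - m + 1
    for (o = 1; o <= rn; o++) {
      cnt = split(idx[substr(read, o, m)], pos, " ")
      for (i = 1; i <= cnt; i++) {
        start = pos[i] - o + 1
        if (start >= 1) hits[start]++
      }
    }
    # Report positions whose summed hit count reaches the threshold.
    for (s = 1; s <= length(ref); s++)
      if (hits[s] >= min_hits) print s, hits[s]
  }'
}

# Made-up sequences: the read occurs at reference positions 5 and 13.
seed_hits ACGTACGTTAGCACGTTAGC ACGTTAGC 4 2
```

With these inputs the sketch prints the two lines "5 5" and "13 5". Position 1 shares only one mer with the read and falls below the threshold of 2. In this toy setup, a larger mer size makes seeds more specific, and a higher minimum hit count discards more weakly supported positions.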

Reference

@article{fujimoto2018gnumap,
  title={GNUMAP 4.0: Space and Time Efficient NGS Read Mapping Using the FM-Index},
  author={Fujimoto, MS and Lyman, CA and Bodily, PM and others},
  journal={Insights Bioinform},
  volume={1},
  number={1},
  pages={1--8},
  year={2018}
}

GNUMAP is made by the Computational Sciences Laboratory at Brigham Young University.