LexicMap: efficient sequence alignment against millions of prokaryotic genomes

Indexing GlobDB

Info:

Steps:

# download data
wget https://fileshare.lisc.univie.ac.at/globdb/globdb_r220/globdb_r220_genome_fasta.tar.gz

tar -zxf globdb_r220_genome_fasta.tar.gz

# file list
find globdb_r220_genome_fasta/ -name "*.fa.gz" > files.txt

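# index with lexicmap
#   -S: skip the input file check; -X: read input paths from files.txt
#   -O: output index directory; -g: maximum genome size (here 50 Mb)
#   --log: write the log to a separate file, not to the index directory path
# elapsed time: 3h:40m:38s
# peak rss: 87.15 GB
lexicmap index -S -X files.txt -O globdb_r220.lmi --log globdb_r220.lmi.log -g 50000000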
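
Because the indexing command is run with -S, which skips LexicMap's own input file check, it can be worth validating files.txt before starting a multi-hour run. The following is a minimal sketch, not part of LexicMap: `check_file_list` is a hypothetical helper name, and it only checks that every listed path exists and is non-empty.

```shell
# Hypothetical helper (not a LexicMap command): validate a genome file list.
# Prints the number of listed files on stdout, reports missing or empty
# files on stderr, and returns non-zero if the list is empty or any entry
# fails the check.
check_file_list() {
    list="$1"
    total=0
    bad=0
    # The "|| [ -n "$f" ]" part also handles a last line without a newline.
    while IFS= read -r f || [ -n "$f" ]; do
        [ -n "$f" ] || continue          # skip blank lines
        total=$((total + 1))
        if [ ! -s "$f" ]; then           # missing or zero-length file
            echo "missing or empty: $f" >&2
            bad=$((bad + 1))
        fi
    done < "$list"
    echo "$total"
    [ "$total" -gt 0 ] && [ "$bad" -eq 0 ]
}
```

For example, `check_file_list files.txt && lexicmap index ...` would start indexing only if every path in the list points to a non-empty file. It does not verify that the files are valid gzip or FASTA.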