Motivation
- BLASTN does not scale to millions of bacterial genomes: it is slow and has a high memory footprint. For example, aligning a 2-kb gene sequence against all 2.34 million prokaryotic genomes in GenBank and RefSeq requires >2000 GB of memory.
- Large-scale sequence search tools only report which genomes a query matches (color); they cannot return positional information.