Searching benchmarks

Details of the software, datasets, and commands used.

Software

GTDB r202 representative genomes are used for tests:

  • file size: 46.26 GB
  • files: 47,894
  • bases: 151.94 Gb

KMCP vs COBS

All k-mers are indexed and searched.

Database size and building time:

                  COBS        KMCP
database size     86.96 GB    55.15 GB
building time     29m55s      24m52s
temporary files   160.76 GB   1.19 TB

Searching with bacterial genomes or short reads (~1M reads).

KMCP vs Mash and Sourmash

Only sketches are indexed and searched: FracMinHash (Scaled MinHash) sketches for Sourmash and KMCP (scale=1000), and MinHash sketches for Mash (scale=3400).
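To illustrate what the scale parameter does, here is a minimal FracMinHash sketch in Python. It is a simplified illustration, not the implementation used by Sourmash or KMCP: the hash function (BLAKE2b truncated to 64 bits), the default k-mer size of 21, and the helper names `revcomp`, `hash64`, and `frac_minhash` are all assumptions made for this example. The idea is that a k-mer is kept only if its hash falls below 2^64 / scale, so roughly 1 in `scale` distinct k-mers ends up in the sketch.

```python
import hashlib

MAX_HASH = 2**64  # size of the 64-bit hash space

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence (ACGT only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def hash64(kmer):
    """Stand-in 64-bit hash (assumption: not the hash used by any real tool)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def frac_minhash(seq, k=21, scale=1000):
    """Return the set of k-mer hashes that fall below MAX_HASH / scale."""
    threshold = MAX_HASH // scale
    sketch = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Use the canonical k-mer so both strands give the same sketch.
        canonical = min(kmer, revcomp(kmer))
        h = hash64(canonical)
        if h < threshold:
            sketch.add(h)
    return sketch
```

With `scale=1` every k-mer hash is kept. With `scale=1000` the sketch shrinks to about a thousandth of the distinct k-mers, which is consistent with the sketch-based databases being much smaller than the all-k-mer databases in the previous section.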

Database size and building time:

                  Mash      Sourmash   KMCP
database size     743 MB    5.19 GB    1.52 GB
building time     11m39s    89m59s     7m02s
temporary files   -         -          3.41 GB

Searching with bacterial genomes.

Result