Searching benchmarks
Details of the software, datasets, and commands.
Software
GTDB r202 representative genomes are used for tests:
- file size: 46.26 GB
- files: 47,894
- bases: 151.94 Gb
KMCP vs COBS
All k-mers are indexed and searched.
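As a rough illustration of what "all k-mers" means here, the sketch below enumerates every k-mer of a sequence. It is not taken from KMCP or COBS: the function name `canonical_kmers`, the value `k=21`, and the use of canonical k-mers (the lexicographically smaller of a k-mer and its reverse complement) are assumptions made for the example.

```python
def canonical_kmers(seq: str, k: int = 21) -> set[str]:
    """Return the set of canonical k-mers in seq.

    k=21 and canonicalization are assumptions for illustration,
    not settings confirmed by the benchmark.
    """
    complement = str.maketrans("ACGT", "TGCA")
    seq = seq.upper()
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) - set("ACGT"):
            continue  # skip windows containing ambiguous bases such as N
        reverse_complement = kmer.translate(complement)[::-1]
        # Keep one representative per k-mer / reverse-complement pair.
        kmers.add(min(kmer, reverse_complement))
    return kmers
```

Indexing every k-mer in this way keeps the full content of each genome, which is why the databases in this comparison are far larger than sketch-based ones.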
Database size and building time:
| | cobs | kmcp |
|---|---|---|
| database size | 86.96 GB | 55.15 GB |
| building time | 29m55s | 24m52s |
| temporary files | 160.76 GB | 1.19 TB |
Searching with bacterial genomes or short reads (~1M reads).
KMCP vs Mash and Sourmash
Only sketches are indexed and searched: FracMinHash (Scaled MinHash) sketches with scale=1000 for Sourmash and KMCP, and MinHash sketches with scale=3400 for Mash.
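The difference between the two sketch types can be sketched in a few lines. This is a minimal illustration, not the code used by Mash, Sourmash, or KMCP: the hash function (`blake2b` truncated to 64 bits) and the helper names are stand-ins, and treating the Mash value of 3400 as a fixed sketch size is an assumption about how that parameter is used.

```python
import hashlib

MAX_HASH = 2**64  # size of the 64-bit hash space


def hash64(kmer: str) -> int:
    # Stand-in 64-bit hash; the real tools use their own hash functions.
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def frac_minhash(kmers, scale: int = 1000) -> set[int]:
    """Keep every hash below MAX_HASH / scale (about 1/scale of all k-mers)."""
    threshold = MAX_HASH // scale
    return {h for h in map(hash64, kmers) if h < threshold}


def minhash(kmers, size: int = 3400) -> list[int]:
    """Keep the `size` smallest distinct hashes (fixed-size bottom sketch)."""
    return sorted({hash64(k) for k in kmers})[:size]
```

A FracMinHash sketch grows in proportion to the number of distinct k-mers in a genome, whereas a MinHash sketch has a fixed upper size regardless of genome length.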
Database size and building time:
| | mash | sourmash | kmcp |
|---|---|---|---|
| database size | 743 MB | 5.19 GB | 1.52 GB |
| building time | 11m39s | 89m59s | 7m02s |
| temporary files | - | - | 3.41 GB |
Searching with bacterial genomes.