Compose control counttables for `kevlar novel` step #275

standage · 2018-06-18T20:38:17Z

After counting k-mers for each control sample, we should investigate composing the counttables into a single nodetable before running `kevlar novel`. This should have a couple of synergistic benefits.

- We have a single table instead of 2 (or more), reducing the time spent on k-mer abundance queries
- A nodetable takes 1/8 the memory of a counttable with the same number of buckets (1 bit per bucket instead of 8)

The cost is, of course, another pass over the "data". But it should be possible to build a nodetable directly from the underlying counttables themselves without iterating over the reads again. So the "data" here is just the tables, which should be quite small and manageable.
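To make the idea concrete, here is a rough sketch in plain Python. It does not use khmer at all: the count tables are toy `bytearray`s with one 8-bit counter per bucket, and `compose_presence` and `is_set` are hypothetical helper names made up for illustration. It also assumes every table has the same number of buckets and was filled with the same hash function, and it ignores the fact that real tables may be built from several sub-tables.

```python
# Toy stand-ins for count tables and a presence/absence table.
# None of these names are khmer API; they are assumptions for illustration.

def compose_presence(count_tables):
    """OR together the nonzero buckets of equally sized count tables.

    Each count table is a bytes-like object with one 8-bit counter per
    bucket. The result is a bytearray holding one bit per bucket.
    """
    n_buckets = len(count_tables[0])
    if any(len(table) != n_buckets for table in count_tables):
        raise ValueError("all tables must have the same number of buckets")
    bits = bytearray((n_buckets + 7) // 8)  # 1 bit per bucket, rounded up
    for table in count_tables:
        for bucket, count in enumerate(table):
            if count > 0:
                bits[bucket // 8] |= 1 << (bucket % 8)
    return bits


def is_set(bits, bucket):
    """Return True if the given bucket is marked present."""
    return bool(bits[bucket // 8] & (1 << (bucket % 8)))


# Two toy control tables with 16 buckets each.
control_a = bytearray(16)
control_b = bytearray(16)
control_a[3] = 5   # bucket 3 seen in control A
control_b[3] = 1   # ...and in control B
control_b[9] = 2   # bucket 9 seen only in control B

combined = compose_presence([control_a, control_b])
```

The pass touches only the buckets, never the reads, and the 16 one-byte counters collapse into 2 bytes. One caveat of this approach: OR-ing can only add set bits, so the false-positive rate of the composed table is at least as high as that of either input.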


standage · 2018-06-18T20:39:30Z

Most of this would be implemented in khmer-land. See dib-lab/khmer#1379 and dib-lab/khmer#1392 for relevant threads in that project.

standage · 2018-06-18T23:16:49Z

Investigating this in dib-lab/khmer#1874.

standage added the optimization label Jun 18, 2018

standage mentioned this issue Jun 18, 2018

[Meta] Memory and runtime performance improvements #272

