Compose control counttables for kevlar novel step #275

Open
standage opened this issue Jun 18, 2018 · 2 comments

standage commented Jun 18, 2018

After counting k-mers for each control sample, we should investigate composing the counttables into a single nodetable before running kevlar novel. This should have a couple of synergistic benefits.

  • We have a single table instead of 2 (or more), reducing the time spent on k-mer abundance queries
  • A nodetable consumes 1/8 the memory of a counttable with the same number of buckets (1 bit per bucket instead of 8)

The cost is, of course, another pass over the "data". But it should be possible to build a nodetable directly from the underlying counttables themselves without iterating over the reads again, so the "data" here is just the counttables, which should be quite small and manageable.
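
To make the idea concrete, here is a minimal toy sketch in plain Python. It does not use the khmer API: `ToyCounttable`, `ToyNodetable`, `compose`, `kmer_hash`, the table sizes, and the `threshold` parameter are all hypothetical names invented for illustration. It assumes every control counttable was built with identical table sizes and the same hash function, so that bucket `j` of table `i` means the same thing in each sample.

```python
import hashlib

# Hypothetical table sizes (distinct primes); every table must share them.
TABLE_SIZES = (997, 1009, 1013)


def kmer_hash(kmer):
    """Toy stand-in for a k-mer hash function (not khmer's)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "little")


class ToyCounttable:
    """Count-Min-style sketch with one byte per bucket."""

    def __init__(self, sizes=TABLE_SIZES):
        self.sizes = tuple(sizes)
        self.tables = [bytearray(n) for n in self.sizes]

    def add(self, kmer):
        h = kmer_hash(kmer)
        for table, n in zip(self.tables, self.sizes):
            if table[h % n] < 255:  # saturate instead of overflowing
                table[h % n] += 1

    def get(self, kmer):
        h = kmer_hash(kmer)
        return min(t[h % n] for t, n in zip(self.tables, self.sizes))


class ToyNodetable:
    """Bloom-filter-style sketch with one bit per bucket."""

    def __init__(self, sizes=TABLE_SIZES):
        self.sizes = tuple(sizes)
        self.tables = [bytearray((n + 7) // 8) for n in self.sizes]

    def set_bit(self, i, j):
        self.tables[i][j >> 3] |= 1 << (j & 7)

    def get(self, kmer):
        h = kmer_hash(kmer)
        return all(
            (t[(h % n) >> 3] >> ((h % n) & 7)) & 1
            for t, n in zip(self.tables, self.sizes)
        )


def compose(counttables, threshold=0):
    """Build one nodetable from several counttables, bucket by bucket.

    A bit is set when the matching bucket exceeds `threshold` in any
    input table. No reads are revisited; only the buckets are scanned.
    """
    sizes = counttables[0].sizes
    if any(ct.sizes != sizes for ct in counttables):
        raise ValueError("all counttables must share the same table sizes")
    nodetable = ToyNodetable(sizes)
    for i, n in enumerate(sizes):
        for j in range(n):
            if any(ct.tables[i][j] > threshold for ct in counttables):
                nodetable.set_bit(i, j)
    return nodetable
```

Under these assumptions, any k-mer whose estimated abundance exceeds `threshold` in at least one control is reported as present by the composed table, so no control k-mer is missed. One caveat of the bucket-wise union: bits contributed by different samples can combine, so the composed table may report somewhat more false positives than querying each counttable separately.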

@standage

Most of this would be implemented in khmer-land. See dib-lab/khmer#1379 and dib-lab/khmer#1392 for relevant threads in that project.

@standage

Investigating this in dib-lab/khmer#1874.
