Using vcfanno to annotate SV calls with GNOMAD SV frequencies #106
Comments
I have pushed a fix for this. Would you try the attached binary? |
you'll need to set
and then you'll probably want |
Thanks @brentp! This binary works. However, even with the -permissive-overlap option on, annotation happens only when POS in the query is the same as POS in GNOMAD. When I tried to annotate an SV [158,000-166,000] against the GNOMAD entry [157,000-166,000], it was not annotated. Sergey |
don't use `self`.
|
Thanks @brentp! I've got it: when working with intervals, it is better to annotate a VCF file against a BED file rather than against another VCF. My new config:
Before using gnomad_sv for annotation, we probably have to 'normalize' the gnomad_SV database like we did for small variants. However, SVs are large, and there are rare deletions that overlap with frequent duplications in the population. What is your take on this problem? Would you suggest splitting the gnomad bed file into several files by type (DEL, DUP) and annotating with each separately? A test of sensitivity of vcfanno:
Do you think that the last overlap is too permissive, and that something like a required % reciprocal overlap could be added to vcfanno, as in bedtools? I'd be happy to contribute here, just point me to the right place. Another suggestion might be a flag to use an additional INFO field when matching (i.e. SVTYPE in the vcf and a certain column in the BED). Or maybe it is possible to let users run a custom calculation on fields before annotation happens, i.e. to define what counts as a match with lua, as already happens post-annotation? Thanks! |
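For the idea of splitting the gnomad bed file by type before annotating, here is a minimal Python sketch of the grouping step. It is not part of vcfanno. The function name `split_by_svtype`, the default `type_col=4` (0-based column holding the SV type), and the records below are all assumptions made up for illustration, so check the column layout of the actual BED file before using anything like this.

```python
from collections import defaultdict


def split_by_svtype(bed_lines, type_col=4):
    """Group tab-separated BED lines by the SV type in column `type_col` (0-based).

    `type_col=4` is an assumed layout (chrom, start, end, name, svtype);
    adjust it to match the real file.
    """
    groups = defaultdict(list)
    for line in bed_lines:
        line = line.rstrip("\n")
        # Skip blank lines and header/comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        groups[fields[type_col]].append(line)
    return dict(groups)


# Made-up records for illustration only (not real gnomAD entries).
records = [
    "#chrom\tstart\tend\tname\tsvtype",
    "1\t157000\t166000\texample_dup_1\tDUP",
    "1\t160000\t161000\texample_del_1\tDEL",
    "1\t200000\t250000\texample_dup_2\tDUP",
]

for svtype, lines in split_by_svtype(records).items():
    print(svtype, len(lines))
```

Each group could then be written to its own sorted, bgzipped and indexed file and used as a separate annotation source, so that a rare DEL in the query is never given the frequency of a common DUP that happens to overlap it.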
Hi, if I understand correctly, you are describing what appear to be bugs in vcfanno--variants that should be getting annotated, but are not. Is that correct? Are you sure both files are sorted and that the file you are annotating with is indexed correctly? |
Hi @brentp! I see two issues. In both cases, annotation does happen.
Command:
It works the same way with and without -permissive-overlap, in both vcfanno 0.3.1 and 0.3.2. Sergey |
I guess the problem is related to finding reciprocal overlap, and currently most of the existing methods don't deal with it. Has this been resolved yet in vcfanno? Eagerly waiting for it. |
I am having a similar issue. Does anyone have an update on this? |
hi, I won't have time to work on this. I think the machinery is in place inside vcfanno for the annotation which can be followed up with some post-processing to require a certain amount of overlap. |
Hello, Brent!
Thanks for the very useful vcfanno tool!
I'd like to start using it for annotation of structural variants (SV) in the same way it works for small variant vcf files.
Recently GNOMAD released its SV frequencies, and I gave it a try:
crg.sv.vcfanno.conf:
no lua
https://storage.googleapis.com/gnomad-public/papers/2019-sv/gnomad_v2_sv.sites.vcf.gz
https://storage.googleapis.com/gnomad-public/papers/2019-sv/gnomad_v2_sv.sites.vcf.gz.tbi
test.vcf.gz
The query contains a fake SV DUP 1:158000-163000 which overlaps with an SV in gnomad_sv.vcf.gz:
1 157000 gnomAD_v2_DUP_1_8 N <DUP> 999 PASS END=166000
vcfanno crg.sv.vcfanno.conf test.vcf.gz | bgzip -c > test.annotated.vcf.gz
There is a result file: test.annotated.vcf.gz, but it does not contain the gnomad_sv_popmax_af annotation.
What would be the algorithm of SV matching in vcfanno? For small variants it is just chr:pos:ref:alt, but for SVs it is different: min 50% reciprocal overlap between features (END-START) or similar.
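To make the "min 50% reciprocal overlap" criterion concrete, here is a minimal Python sketch. It is not how vcfanno matches records; the function names `reciprocal_overlap` and `matches` are made up for illustration, and lengths are taken as END-START as written above.

```python
def reciprocal_overlap(a_start, a_end, b_start, b_end):
    """Return the smaller of the two overlap fractions (0.0 if disjoint).

    Lengths are computed as END-START, which is an assumption of this
    sketch about how the coordinates should be interpreted.
    """
    a_len = a_end - a_start
    b_len = b_end - b_start
    if a_len <= 0 or b_len <= 0:
        raise ValueError("intervals must have positive length")
    shared = min(a_end, b_end) - max(a_start, b_start)
    if shared <= 0:
        return 0.0
    # Reciprocal: the shared span must cover enough of BOTH features.
    return min(shared / a_len, shared / b_len)


def matches(a_start, a_end, b_start, b_end, min_fraction=0.5):
    """True if the reciprocal overlap reaches `min_fraction`."""
    return reciprocal_overlap(a_start, a_end, b_start, b_end) >= min_fraction


# Query DUP 1:158000-163000 vs gnomAD_v2_DUP_1_8 at 1:157000-166000
frac = reciprocal_overlap(158000, 163000, 157000, 166000)
print(round(frac, 3), matches(158000, 163000, 157000, 166000))
# prints: 0.556 True
```

For this pair the shared span is 5,000 bp, which is 100% of the query and 5,000/9,000 (about 56%) of the gnomad record, so it would pass a 50% reciprocal threshold.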
Sergey