8000 Normalize complex variants to meet the requirements of HGVS · Issue #1929 · samtools/bcftools · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content
Normalize complex variants to meet the requirements of HGVS #1929
Closed
@user-tq

Description

@user-tq

Recently, I encountered a complex variant with an IGV image of
image
Simply put, this mutation is

GGAATTAAGAGAAG
G---------GAAC

I called it using mutet2, and ultimately it was represented as

#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO
7	55242465	.	GGAATTAAGA	G	.	.	.
7	55242478	.	G	C	.	.	.

This seems to follow the left alignment rule

But according to the 3'rule of hgvs

(I confirmed the direction of my focus on transcript NM_005228)
this mutation is

GGAATTAAGAGAAG
GGAA---------C
#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO
7	55242469	.	TTAAGAGAAG	C	.	.	.

I also annotated the final results using VEP. Obviously, VEP does not merge p.hgvs variants.

7	55242465	.	GGAATTAAGA	G	.	.	CSQ=-|inframe_deletion|MODERATE|EGFR|1956|Transcript|NM_001346897.2|protein_coding|18/26||NM_001346897.2:c.2104_2112del|NP_001333826.1:p.Leu702_Glu704del|2362-2370/3848|2101-2109/3276|701-703/1091|ELR/-|GAATTAAGA/-|rs121913436|1||1|||deletion|EntrezGene|||||NP_001333826.1||||||||||||3||||||||||||||||||drug_response||1||||||,-|inframe_deletion|MODERATE|EGFR|1956|Transcript|NM_001346898.2|protein_coding|19/27||NM_001346898.2:c.2239_2247del|NP_001333827.1:p.Leu747_Glu749del|2497-2505/3983|2236-2244/3411|746-748/1136|ELR/-|GAATTAAGA/-|rs121913436|1||1|||deletion|EntrezGene|||||NP_001333827.1||||||||||||3||||||||||||||||||drug_response||1||||||,-|inframe_deletion|MODERATE|EGFR|1956|Transcript|NM_001346899.2|protein_coding|18/27||NM_001346899.2:c.2104_2112del|NP_001333828.1:p.Leu702_Glu704del|2362-2370/9770|2101-2109/3498|701-703/1165|ELR/-|GAATTAAGA/-|rs121913436|1||1|||deletion|EntrezGene|||||NP_001333828.1||||||||||||3||||||||||||||||||drug_response||1||||||,-|inframe_deletion|MODERATE|EGFR|1956|Transcript|NM_001346900.2|protein_coding|19/28||NM_001346900.2:c.2080_2088del|NP_001333829.1:p.Leu694_Glu696del|2268-2276/9676|2077-2085/3474|693-695/1157|ELR/-|GAATTAAGA/-|rs121913436|1||1|||deletion|EntrezGene|||||NP_001333829.1||||||||||||3||||||||||||||||||drug_response||1||||||,-|inframe_deletion|MODERATE|EGFR|1956|Transcript|NM_001346941.2|protein_coding|13/22||NM_001346941.2:c.1438_1446del|NP_001333870.1:p.Leu480_Glu482del|1696-1704/9104|1435-1443/2832|479-481/943|ELR/-|GAATTAAGA/-|rs121913436|1||1|||deletion|EntrezGene|||||NP_001333870.1||||||||||||3||||||||||||||||||drug_response||1||||||,-|inframe_deletion|MODERATE|EGFR|1956|Transcript|NM_005228.5|protein_coding|19/28||NM_005228.5:c.2239_2247del|NP_005219.2:p.Leu747_Glu749del|2497-2505/9905|2236-2244/3633|746-748/1210|ELR/-|GAATTAAGA/-|rs121913436|1||1||1|deletion|EntrezGene||YES|||NP_005219.2||||||||||||3||||||||||||||||||drug_response||1||||||,-|downstream_gene_variant|MODIFIER|EGFR|1956|Transcript|NM_201284.2|protein_coding||||||||||rs121913436|1|3736|1|||deletion|EntrezGene|||||NP_958441.1||||||||||||||||||||||||||||||drug_response||1||||||,-|downstream_gene_variant|MODIFIER|EGFR-AS1|100507500|Transcript|NR_047551.1|lncRNA||||||||||rs121913436|1|4969|-1|||deletion|EntrezGene||YES|||||||||||||||||||||||||||||||||drug_response||1||||||
7	55242478	.	G	C	.	.	CSQ=C|missense_variant|MODERATE|EGFR|1956|Transcript|NM_001346897.2|protein_coding|18/26||NM_001346897.2:c.2113G>C|NP_001333826.1:p.Ala705Pro|2374/3848|2113/3276|705/1091|A/P|Gca/Cca|rs121913229&COSV51830861|1||1|||SNV|EntrezGene|||||NP_001333826.1|||||||||tolerated(0.24)|probably_damaging(0.942)||||||||||||||||||||likely_pathogenic&uncertain_significance|0&1|1&1|25157968&15737014&22622260|||||,C|missense_variant|MODERATE|EGFR|1956|Transcript|NM_001346898.2|protein_coding|19/27||NM_001346898.2:c.2248G>C|NP_001333827.1:p.Ala750Pro|2509/3983|2248/3411|750/1136|A/P|Gca/Cca|rs121913229&COSV51830861|1||1|||SNV|EntrezGene|||||NP_001333827.1|||||||||tolerated(0.26)|probably_damaging(0.995)||||||||||||||||||||likely_pathogenic&uncertain_significance|0&1|1&1|25157968&15737014&22622260|||||,C|missense_variant|MODERATE|EGFR|1956|Transcript|NM_001346899.2|protein_coding|18/27||NM_001346899.2:c.2113G>C|NP_001333828.1:p.Ala705Pro|2374/9770|2113/3498|705/1165|A/P|Gca/Cca|rs121913229&COSV51830861|1||1|||SNV|EntrezGene|||||NP_001333828.1|||||||||tolerated(0.19)|probably_damaging(0.998)||||||||||||||||||||likely_pathogenic&uncertain_significance|0&1|1&1|25157968&15737014&22622260|||||,C|missense_variant|MODERATE|EGFR|1956|Transcript|NM_001346900.2|protein_coding|19/28||NM_001346900.2:c.2089G>C|NP_001333829.1:p.Ala697Pro|2280/9676|2089/3474|697/1157|A/P|Gca/Cca|rs121913229&COSV51830861|1||1|||SNV|EntrezGene|||||NP_001333829.1|||||||||tolerated(0.21)|probably_damaging(0.942)||||||||||||||||||||likely_pathogenic&uncertain_significance|0&1|1&1|25157968&15737014&22622260|||||,C|missense_variant|MODERATE|EGFR|1956|Transcript|NM_001346941.2|protein_coding|13/22||NM_001346941.2:c.1447G>C|NP_001333870.1:p.Ala483Pro|1708/9104|1447/2832|483/943|A/P|Gca/Cca|rs121913229&COSV51830861|1||1|||SNV|EntrezGene|||||NP_001333870.1|||||||||tolerated(0.2)|probably_damaging(0.986)||||||||||||||||||||likely_pathogenic&uncertain_significance|0&1|1&1|25157968&15737014&22622260|||||,C|missense_variant|MODERATE|EGFR|1956|Transcript|NM_005228.5|protein_coding|19/28||NM_005228.5:c.2248G>C|NP_005219.2:p.Ala750Pro|2509/9905|2248/3633|750/1210|A/P|Gca/Cca|rs121913229&COSV51830861|1||1||1|SNV|EntrezGene||YES|||NP_005219.2|||||||||tolerated(0.18)|probably_damaging(0.942)||||||||||||||||||||likely_pathogenic&uncertain_significance|0&1|1&1|25157968&15737014&22622260|||||,C|downstream_gene_variant|MODIFIER|EGFR|1956|Transcript|NM_201284.2|protein_coding||||||||||rs121913229&COSV51830861|1|3748|1|||SNV|EntrezGene|||||NP_958441.1||||||||||||||||||||||||||||||likely_pathogenic&uncertain_significance|0&1|1&1|25157968&15737014&22622260|||||,C|downstream_gene_variant|MODIFIER|EGFR-AS1|100507500|Transcript|NR_047551.1|lncRNA||||||||||rs121913229&COSV51830861|1|4965|-1|||SNV|EntrezGene||YES|||||||||||||||||||||||||||||||||likely_pathogenic&uncertain_significance|0&1|1&1|25157968&15737014&22622260|||||
7	55242469	.	TTAAGAGAAG	C	.	.	CSQ=C|protein_altering_variant|MODERATE|EGFR|1956|Transcript|NM_001346897.2|protein_coding|18/26||NM_001346897.2:c.2104_2113delinsC|NP_001333826.1:p.Leu702_Ala705delinsPro|2365-2374/3848|2104-2113/3276|702-705/1091|LREA/P|TTAAGAGAAGca/Cca|rs727504278&COSV51765099|1||1|||indel|EntrezGene|||||NP_001333826.1||||||||||||||||||||||||||||||drug_response|0&1|1&1|24033266|||||,C|protein_altering_variant|MODERATE|EGFR|1956|Transcript|NM_001346898.2|protein_coding|19/27||NM_001346898.2:c.2239_2248delinsC|NP_001333827.1:p.Leu747_Ala750delinsPro|2500-2509/3983|2239-2248/3411|747-750/1136|LREA/P|TTAAGAGAAGca/Cca|rs727504278&COSV51765099|1||1|||indel|EntrezGene|||||NP_001333827.1||||||||||||||||||||||||||||||drug_response|0&1|1&1|24033266|||||,C|protein_altering_variant|MODERATE|EGFR|1956|Transcript|NM_001346899.2|protein_coding|18/27||NM_001346899.2:c.2104_2113delinsC|NP_001333828.1:p.Leu702_Ala705delinsPro|2365-2374/9770|2104-2113/3498|702-705/1165|LREA/P|TTAAGAGAAGca/Cca|rs727504278&COSV51765099|1||1|||indel|EntrezGene|||||NP_001333828.1||||||||||||||||||||||||||||||drug_response|0&1|1&1|24033266|||||,C|protein_altering_variant|MODERATE|EGFR|1956|Transcript|NM_001346900.2|protein_coding|19/28||NM_001346900.2:c.2080_2089delinsC|NP_001333829.1:p.Leu694_Ala697delinsPro|2271-2280/9676|2080-2089/3474|694-697/1157|LREA/P|TTAAGAGAAGca/Cca|rs727504278&COSV51765099|1||1|||indel|EntrezGene|||||NP_001333829.1||||||||||||||||||||||||||||||drug_response|0&1|1&1|24033266|||||,C|protein_altering_variant|MODERATE|EGFR|1956|Transcript|NM_001346941.2|protein_coding|13/22||NM_001346941.2:c.1438_1447delinsC|NP_001333870.1:p.Leu480_Ala483delinsPro|1699-1708/9104|1438-1447/2832|480-483/943|LREA/P|TTAAGAGAAGca/Cca|rs727504278&COSV51765099|1||1|||indel|EntrezGene|||||NP_001333870.1||||||||||||||||||||||||||||||drug_response|0&1|1&1|24033266|||||,C|protein_altering_variant|MODERATE|EGFR|1956|Transcript|NM_005228.5|protein_coding|19/28||NM_005228.5:c.2239_2248delinsC|NP_005219.2:p.Leu747_Ala750delinsPro|2500-2509/9905|2239-2248/3633|747-750/1210|LREA/P|TTAAGAGAAGca/Cca|rs727504278&COSV51765099|1||1||1|indel|EntrezGene||YES|||NP_005219.2||||||||||||||||||||||||||||||drug_response|0&1|1&1|24033266|||||,C|downstream_gene_variant|MODIFIER|EGFR|1956|Transcript|NM_201284.2|protein_coding||||||||||rs727504278&COSV51765099|1|3739|1|||indel|EntrezGene|||||NP_958441.1||||||||||||||||||||||||||||||drug_response|0&1|1&1|24033266|||||,C|downstream_gene_variant|MODIFIER|EGFR-AS1|100507500|Transcript|NR_047551.1|lncRNA||||||||||rs727504278&COSV51765099|1|4965|-1|||indel|EntrezGene||YES|||||||||||||||||||||||||||||||||drug_response|0&1|1&1|24033266|||||

I am looking for tools to adjust the location of del occurrence based on the direction of the transcript of interest to meet the final HGVS requirements.
It seems that bcftool cannot achieve this either,
Can corresponding functions be added? Thank you for any suggestions

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions

      0