Update Mus musculus default GTF · Issue #1579 · nf-core/rnaseq · GitHub
Update Mus musculus default GTF #1579
Open
@Z-Zen

Description of the bug

Hi nf-core team,

We recently analyzed Mus musculus RNA-seq data with the pipeline's default parameters and found differentially expressed genes that are now deprecated or retired in the Ensembl database. On investigation, the GTF used by the pipeline appears to be outdated. We used gffcompare to compare it against the Ensembl GTF (Mus_musculus.GRCm38.102.chr.gtf.gz), making sure that the pipeline side of the comparison was the genome.filtered.gtf produced by the pipeline.
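For anyone who wants to see which gene IDs are affected, a gene-level check can complement the gffcompare run. The following is a minimal sketch, not part of the pipeline: the helper names (`gene_ids`, `compare_gene_sets`) are hypothetical, and it assumes both files are standard tab-separated GTFs carrying a `gene_id "..."` attribute in the ninth column.

```python
import gzip
import re

# Matches the gene_id attribute in the GTF attribute column (assumed format).
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')


def open_text(path):
    """Open a plain or gzip-compressed file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def gene_ids(path):
    """Collect the set of gene_id values found in a GTF file."""
    ids = set()
    with open_text(path) as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip header/comment lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 9:
                continue  # skip malformed records
            match = GENE_ID_RE.search(fields[8])
            if match:
                ids.add(match.group(1))
    return ids


def compare_gene_sets(pipeline_gtf, ensembl_gtf):
    """Return gene IDs unique to each GTF and those shared by both."""
    old = gene_ids(pipeline_gtf)
    new = gene_ids(ensembl_gtf)
    return {
        "only_in_pipeline": old - new,
        "only_in_ensembl": new - old,
        "shared": old & new,
    }
```

Calling `compare_gene_sets("genome.filtered.gtf", "Mus_musculus.GRCm38.102.chr.gtf.gz")` would return three sets. IDs in `only_in_pipeline` are candidates for genes that have since been retired, though each would still need to be checked against Ensembl.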

# gffcompare v0.12.9 | Command line was:
#./gffcompare-0.12.9.Linux_x86_64/gffcompare -r genome.filtered.gtf -o gtfcmp_new genome.filtered_updated.gtf
#

#= Summary for dataset: genome.filtered_updated.gtf
#     Query mRNAs :  142604 in   53448 loci  (115576 multi-exon transcripts)
#            (20372 multi-transcript loci, ~2.7 transcripts per locus)
# Reference mRNAs :  109160 in   44274 loci  (89147 multi-exon)
# Super-loci w/ reference transcripts:    43571
#-----------------| Sensitivity | Precision  |
        Base level:    99.0     |    79.4    |
        Exon level:    98.7     |    85.2    |
      Intron level:    99.5     |    88.6    |
Intron chain level:    97.9     |    75.5    |
  Transcript level:    79.9     |    61.2    |
       Locus level:    59.9     |    49.4    |

     Matching intron chains:   87258
       Matching transcripts:   87258
              Matching loci:   26524

          Missed exons:     924/377955  (  0.2%)
           Novel exons:   35793/447306  (  8.0%)
        Missed introns:     562/253399  (  0.2%)
         Novel introns:   18019/284759  (  6.3%)
           Missed loci:     348/44274   (  0.8%)
            Novel loci:    9842/53448   ( 18.4%)

 Total union super-loci across all input datasets: 53413
142604 out of 142604 consensus transcripts written in gtfcmp_new.annotated.gtf (0 discarded as redundant)
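The percentages in the summary can be reproduced from the raw counts. The sketch below recomputes the missed/novel percentages directly from the fractions printed above. It also checks the transcript-level sensitivity and precision, assuming these are matching transcripts divided by reference and query transcript counts respectively; that reading is an assumption that happens to agree with the reported values, not a statement of how gffcompare computes them internally. The function and variable names are illustrative only.

```python
def pct(part, total):
    """Percentage of part in total, rounded to one decimal place."""
    return round(100.0 * part / total, 1)


# (count, total) pairs copied from the gffcompare summary above.
counts = {
    "missed_exons": (924, 377955),
    "novel_exons": (35793, 447306),
    "missed_introns": (562, 253399),
    "novel_introns": (18019, 284759),
    "missed_loci": (348, 44274),
    "novel_loci": (9842, 53448),
}

percentages = {name: pct(part, total) for name, (part, total) in counts.items()}

# Assumed definitions for the transcript-level row:
# sensitivity = matching transcripts / reference transcripts
# precision   = matching transcripts / query transcripts
matching_transcripts = 87258
reference_transcripts = 109160
query_transcripts = 142604

transcript_sensitivity = pct(matching_transcripts, reference_transcripts)
transcript_precision = pct(matching_transcripts, query_transcripts)

for name, value in percentages.items():
    print(f"{name}: {value}%")
print(f"transcript sensitivity: {transcript_sensitivity}%")
print(f"transcript precision: {transcript_precision}%")
```

With the pipeline GTF as the reference (`-r`), the 18.4% novel loci are loci present in the query annotation but absent from the pipeline's GTF.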

Is it possible to update the default GTF for Mus musculus (and perhaps other species too)?

Command used and terminal output

Relevant files

No response

System information

No response

Labels: bug
