How to deal with libraries that do not have UMIs? · Issue #75 · timoast/sinto · GitHub

How to deal with libraries that do not have UMIs? #75

Open
gabrielnegreira opened this issue May 8, 2025 · 0 comments
Hi,

I am trying to run sinto barcode on a FASTQ file from a custom single-cell DNA (not RNA) library. This library does not have UMIs. The cell barcode occupies the first 45 nt of read 2 and is followed by the genomic insert. The structure is:

NNNNNNNNAGGANNNNNNNNACTCNNNNNNNNAAGGNNNNNNNNT-Genomic Insert
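
To make the layout concrete, here is a minimal Python sketch that splits the first 45 nt of a read into its four 8-nt variable blocks. The pattern is read directly off the structure above (four variable blocks, the fixed linkers AGGA, ACTC and AAGG, and a trailing T). The names `BARCODE_PATTERN` and `split_barcode` are hypothetical and not part of sinto.

```python
import re

# Layout assumed from the structure above: four 8-nt variable blocks
# separated by the fixed linkers AGGA, ACTC, AAGG, ending with a single T.
# 8 + 4 + 8 + 4 + 8 + 4 + 8 + 1 = 45 nt in total.
BARCODE_PATTERN = re.compile(
    r"^([ACGTN]{8})AGGA([ACGTN]{8})ACTC([ACGTN]{8})AAGG([ACGTN]{8})T"
)


def split_barcode(read_seq):
    """Return the four variable blocks if the read starts with the
    expected layout, otherwise None."""
    match = BARCODE_PATTERN.match(read_seq)
    return match.groups() if match else None


# First whitelist entry from this issue, followed by a made-up 8-nt insert.
example = "GTAATGCCAGGATACAGCAGACTCTACAACCGAAGGGTAACCGAT" + "ACGTACGT"
print(split_barcode(example))
```

Reads for which `split_barcode` returns `None` do not carry the linkers at the expected positions, for example because of an indel inside the barcode region.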

I am using sinto barcode simply to add the cell barcode to the read identifiers, with the following command:

sinto barcode --barcode_fastq "$r2" \
--read1 "$r1" \
--read2 "$r2" \
--bases 45 \
--whitelist "$WHITELIST" \
--suffix "$LIB_PREFIX"

Where $r2 points to the read2 fastq file, $r1 points to the read 1 file, and $WHITELIST points to a text file with the whitelist of known barcodes.

Unfortunately, after running for some time, I get the following error:

Function run_barcode called with the following arguments:

barcode_fastq   /scratch/antwerpen/205/vsc20542/atrandi_scDNA/input_fastq/fastp/scDNA_AT_01_R2_clean.fastq
read1   /scratch/antwerpen/205/vsc20542/atrandi_scDNA/input_fastq/fastp/scDNA_AT_01_R1_clean.fastq
read2   /scratch/antwerpen/205/vsc20542/atrandi_scDNA/input_fastq/fastp/scDNA_AT_01_R2_clean.fastq
bases   45
prefix
suffix  scDNA_AT_01
whitelist       /scratch/antwerpen/205/vsc20542/atrandi_scDNA/whitelist.tsv
func    <function run_barcode at 0x154c53258360>
Traceback (most recent call last):
  File "/data/antwerpen/205/vsc20542/python_lib/bin/sinto", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/data/antwerpen/205/vsc20542/python_lib/lib/python3.12/site-packages/sinto/arguments.py", line 555, in main
    options.func(options)
  File "/data/antwerpen/205/vsc20542/python_lib/lib/python3.12/site-packages/sinto/utils.py", line 24, in wrapper
    func(args)
  File "/data/antwerpen/205/vsc20542/python_lib/lib/python3.12/site-packages/sinto/cli.py", line 105, in run_barcode
    addbarcodes.addbarcodes(
  File "/data/antwerpen/205/vsc20542/python_lib/lib/python3.12/site-packages/sinto/addbarcodes.py", line 101, in addbarcodes
    barcodes = correct_barcodes(barcodes, whitelist)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/antwerpen/205/vsc20542/python_lib/lib/python3.12/site-packages/sinto/addbarcodes.py", line 49, in correct_barcodes
    for entry in clusterer(counts, threshold=1):
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/antwerpen/205/vsc20542/python_lib/lib/python3.12/site-packages/umi_tools/network.py", line 368, in __call__
    assert max(len_umis) == min(len_umis), (
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: not all umis are the same length(!):  43 - 56

Is this due to the lack of UMIs after the barcode sequence? If so, is there a way to make sinto bypass the UMI detection?
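
The assertion only says that the sequences being clustered differ in length (43 to 56), so one thing worth checking is whether every read in the trimmed FASTQ actually has 45 nt available. The sketch below is a diagnostic under assumptions, not a reproduction of what sinto does internally: it assumes an uncompressed FASTQ with strict 4-line records, and the function name `prefix_length_counts` is hypothetical.

```python
from collections import Counter
from itertools import islice


def prefix_length_counts(fastq_lines, bases=45):
    """Tally the length of the first `bases` nt of every read.

    Assumes 4-line FASTQ records, so the sequence is the 2nd line
    of each record (index 1, then every 4th line).
    """
    counts = Counter()
    for seq in islice(fastq_lines, 1, None, 4):
        counts[len(seq.strip()[:bases])] += 1
    return counts


# Made-up records: one 60-nt read and one 40-nt read (shorter than 45).
demo = [
    "@read1", "ACGT" * 15, "+", "I" * 60,
    "@read2", "ACGT" * 10, "+", "I" * 40,
]
print(prefix_length_counts(demo))
```

On a real file, the same function can be called as `with open(path) as fh: prefix_length_counts(fh)`. Any key below 45 in the result means some reads are shorter than the requested barcode length.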

In case it helps, here is the head of my whitelist.tsv file:

$ head whitelist.tsv
GTAATGCCAGGATACAGCAGACTCTACAACCGAAGGGTAACCGAT
GTAATGCCAGGATACAGCAGACTCTACAACCGAAGGTCCTCAACT
GTAATGCCAGGATACAGCAGACTCTACAACCGAAGGTGGTCTCAT
GTAATGCCAGGATACAGCAGACTCTACAACCGAAGGGTCCGATTT
GTAATGCCAGGATACAGCAGACTCTACAACCGAAGGTTGACCACT
GTAATGCCAGGATACAGCAGACTCTACAACCGAAGGTCCAGGATT
GTAATGCCAGGATACAGCAGACTCTACAACCGAAGGGACAGCATT
GTAATGCCAGGATACAGCAGACTCTACAACCGAAGGGATGGTCTT
GTAATGCCAGGATACAGCAGACTCTACAACCGAAGGCATACCGTT
GTAATGCCAGGATACAGCAGACTCTACAACCGAAGGCGGTTGATT
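
The head only shows ten entries, so a quick tally of entry lengths across the whole whitelist may also be informative. This is a minimal sketch; `whitelist_length_counts` is a hypothetical helper, and the third demo line (with a tab and an extra column) is invented purely to show how a stray column would appear in the counts.

```python
from collections import Counter


def whitelist_length_counts(lines):
    """Tally the length of each non-empty whitelist line,
    ignoring only the trailing newline."""
    counts = Counter()
    for line in lines:
        entry = line.rstrip("\n")
        if entry:
            counts[len(entry)] += 1
    return counts


demo = [
    "GTAATGCCAGGATACAGCAGACTCTACAACCGAAGGGTAACCGAT\n",
    "GTAATGCCAGGATACAGCAGACTCTACAACCGAAGGTCCTCAACT\n",
    # Invented line: a barcode followed by a tab and an extra column.
    "GTAATGCCAGGATACAGCAGACTCTACAACCGAAGGTGGTCTCAT\tx\n",
]
print(whitelist_length_counts(demo))
```

If the real file yields anything other than a single key of 45, the whitelist itself contains entries of mixed length.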