does pybedtools closest support multiple databases like bedtools? · Issue #163 · daler/pybedtools · GitHub

does pybedtools closest support multiple databases like bedtools? #163

Open
xguse opened this issue Mar 11, 2016 · 3 comments
xguse commented Mar 11, 2016

I am trying to run something like this:

$ cat a.bed
chr1  10  20  a1  1 -

$ cat b1.bed
chr1  5   6   b1.1  1 -
chr1  30  40  b1.2  2 +

$ cat b2.bed
chr1  0   1   b2.1  1 -
chr1  21  22  b2.2  2 +



$ bedtools closest -a a.bed -b b1.bed b2.bed -mdb each -d
chr1  10  20  a1  1 - 1 chr1  5   6   b1.1  1 - 5
chr1  10  20  a1  1 - 2 chr1  21  22  b2.2  2 + 2
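
To make the expected behaviour concrete, here is a minimal pure-Python sketch of what `-mdb each -d` reports for the toy files above: the single nearest feature from each database, with its distance. The function names `bed_distance` and `closest_each` are made up for illustration and are not part of bedtools or pybedtools. The sketch assumes one chromosome and ignores strand, ties, and `-k`.

```python
def bed_distance(a, b):
    """Distance between two (start, end) BED intervals on the same chromosome.

    Overlapping intervals get 0; otherwise the gap plus one, which
    reproduces the distances in the bedtools output shown above.
    """
    a_start, a_end = a
    b_start, b_end = b
    if b_end <= a_start:      # b lies entirely before a
        return a_start - b_end + 1
    if b_start >= a_end:      # b lies entirely after a
        return b_start - a_end + 1
    return 0                  # intervals overlap


def closest_each(query, databases):
    """Return (database index, feature name, distance) for the nearest
    feature in EACH database, numbering databases from 1."""
    results = []
    for db_index, features in enumerate(databases, start=1):
        name, start, end = min(
            features, key=lambda f: bed_distance(query, (f[1], f[2]))
        )
        results.append((db_index, name, bed_distance(query, (start, end))))
    return results


# Toy data copied from a.bed, b1.bed and b2.bed above (name, start, end).
a1 = (10, 20)
b1 = [("b1.1", 5, 6), ("b1.2", 30, 40)]
b2 = [("b2.1", 0, 1), ("b2.2", 21, 22)]

print(closest_each(a1, [b1, b2]))
```

The two tuples correspond to the two output rows: database 1 contributes b1.1 at distance 5, and database 2 contributes b2.2 at distance 2.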

This is what I am doing:

k_nearest = snp_bed.closest(
    [gene_model_subtracted_bed, genes_only_sorted_bed],
    k=k_number,
    names=['novel_mapped_tx', 'official_annotations'],
    D='ref',     # Include SIGNED distances from SNP based on the ref genome
    t='all',     # Return all members of a distance "tie"
    mdb='each',  # Return `k_number` neighbors for EACH of `names`
)

This is the error I am getting:

pybedtools.helpers.BEDToolsError:
Command was:

        bedtools closest -t all -names novel_mapped_tx official_annotations -mdb each -k 10 -b /tmp/pybedtools.eieijdz9.tmp -a snp_bed.bed -D ref

Error message was:

***** ERROR: Number of database name tags given does not match number of databases. *****
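
The mismatch is visible in the command itself: two tags follow `-names`, but only one path follows `-b`. As a rough sketch of the argument list that bedtools would need, the helper below assembles the command without running it. `closest_argv` is a hypothetical name, and the two `.bed` paths in the example are placeholders rather than files from this issue. Only flags that already appear in this thread are used.

```python
def closest_argv(a_path, b_paths, names, k, mdb="each"):
    """Build (but do not run) a `bedtools closest` command line with one
    -b path per database and one -names tag per path."""
    if len(names) != len(b_paths):
        # Same condition bedtools complains about in the error above.
        raise ValueError("need exactly one name per database file")
    return [
        "bedtools", "closest",
        "-a", a_path,
        "-b", *b_paths,
        "-names", *names,
        "-mdb", mdb,
        "-k", str(k),
        "-D", "ref",
        "-t", "all",
    ]


# Placeholder paths, for illustration only.
argv = closest_argv(
    "snp_bed.bed",
    ["novel_mapped_tx.bed", "official_annotations.bed"],
    ["novel_mapped_tx", "official_annotations"],
    k=10,
)
print(" ".join(argv))
```

In the failing command, the two BedTool objects were collapsed into a single temporary file, so the name count (2) no longer matched the database count (1).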

Is it possible to give multiple "B" files in pybedtools? It seems to enforce that only a single *arg is passed.

Any help would be awesome!

Gus

PS: I also tried passing in a single string containing both file paths, but it tried to open the whole string as one path, which borks of course.

Owner
daler commented Mar 11, 2016

As of v0.7.5 this works (also see #156), but only if the list contains filename strings rather than BedTool objects. I will fix this so it detects BedTool objects as well and, while I'm working on it, might as well support mixes of BedTool objects and string filenames.

In the meantime, try modifying your example to:

k_nearest = snp_bed.closest(
    [
        gene_model_subtracted_bed.fn,
        genes_only_sorted_bed.fn,
    ],
    k=k_number,
    names=['novel_mapped_tx', 'official_annotations'],
    D='ref',     # Include SIGNED distances from SNP based on the ref genome
    t='all',     # Return all members of a distance "tie"
    mdb='each',  # Return `k_number` neighbors for EACH of `names`
)

I'll keep this issue open until I add support for lists of BedTool objects.
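
Until that support lands, the `.fn` workaround can be wrapped in a small helper that accepts a mix of path strings and BedTool-like objects. This is only a sketch: `as_filenames` and the stand-in class `FakeBedTool` are hypothetical names, and the helper assumes each object's `.fn` attribute holds a path string. It raises an error when that is not the case, rather than guessing.

```python
def as_filenames(items):
    """Convert a mix of path strings and objects with a string `.fn`
    attribute into a plain list of path strings."""
    paths = []
    for item in items:
        fn = item if isinstance(item, str) else getattr(item, "fn", None)
        if not isinstance(fn, str):
            # Assumption: such an object must be written to disk first.
            raise TypeError(
                f"cannot get a filename from {item!r}; save it to a file first"
            )
        paths.append(fn)
    return paths


class FakeBedTool:
    """Stand-in for a BedTool, used only so this example runs without
    pybedtools or bedtools installed."""

    def __init__(self, fn):
        self.fn = fn


mixed = [FakeBedTool("/tmp/novel_mapped_tx.bed"), "official_annotations.bed"]
print(as_filenames(mixed))
```

With real BedTool objects, the call would then read `snp_bed.closest(as_filenames([gene_model_subtracted_bed, genes_only_sorted_bed]), ...)`, keeping the other keyword arguments unchanged.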

Author
xguse commented Mar 11, 2016

You rock.

Thanks as always for the speedy and helpful response.

Gus

@SHuang-Broad

I second this feature request.
