Unable to run NEAT examples in the README · Issue #89 · ncsa/NEAT · GitHub

Unable to run NEAT examples in the README #89

Closed
NuriaQueralt opened this issue Oct 31, 2023 · 10 comments · Fixed by #111
Labels
duplicate This issue or pull request already exists

Comments

@NuriaQueralt

Describe the bug
Running the 'whole genome simulation' example from the README raises this error:
raise ValueError("Bad mode %r" % mode)
ValueError: Bad mode 'xt'

To Reproduce
Steps to reproduce the behavior:

1. Download the hg19.fa:
```
wget ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/GRCh38_reference_genome/GRCh38_full_analysis_set_plus_decoy_hla.fa
```
2. Create a neat_config.yml file with the following content:
```
reference: hg19.fa
read_len: 126
produce_bam: True
produce_vcf: True
paired_ended: True
fragment_mean: 300
fragment_st_dev: 30
```
3. Execute neat on the command line:
```
neat read-simulator \
        -c neat_config.yml \
        -o /home/your path/simulated_reads
```
4. See error:
raise ValueError("Bad mode %r" % mode)
ValueError: Bad mode 'xt'

**Expected behavior**
NEAT should produce the synthetic data as three separate files (FASTQ, BAM, and VCF), as stated in the README.

**Desktop (please complete the following information):**
 - OS: Ubuntu 20.04
 - Browser: Firefox
 - Version: NEAT v4.0


@joshfactorial
Collaborator

Apologies, this is a bug we're currently working through.

@joshfactorial added the duplicate label Oct 31, 2023
@NuriaQueralt
Author

Thank you for the fast reply! Looking forward to rerunning it once this is fixed.

@alabarga

The error can be 'skipped' temporarily by adding

overwrite_output: True

to the config.yml

also there is a typo in the README for the Targeted region simulation

[contents of neat_config.yml]
reference: hg19.fa
read_len: 126
produce_bam: True
produce_vcf: True
paired_ended: True
fragment_mean: 300
fragment_st_dev: 30
targed_bed: hg19_exome.bed

should be

target_bed: hg19_exome.bed

however it looks like NEAT will generate data for all the reference, not only for the bed file, am I correct?
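
A misspelled key such as `targed_bed` can be easy to miss by eye. The following is a minimal sketch of a key checker, not part of NEAT: `KNOWN_KEYS` contains only the option names that appear in this thread (it is not the full list of NEAT options), and `find_suspect_keys` is a hypothetical helper that reads simple `key: value` lines without a YAML library.

```python
import difflib

# Option names mentioned in this thread only; NOT the complete NEAT option list.
KNOWN_KEYS = sorted([
    "reference", "read_len", "produce_bam", "produce_vcf", "paired_ended",
    "fragment_mean", "fragment_st_dev", "target_bed", "overwrite_output",
    "avg_seq_error", "off_target_scalar",
])


def find_suspect_keys(config_text):
    """Map each unrecognised key to its closest known key (or None)."""
    suspects = {}
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not a 'key: value' line.
        if not line or line.startswith("#") or ":" not in line:
            continue
        key = line.split(":", 1)[0].strip()
        if key not in KNOWN_KEYS:
            matches = difflib.get_close_matches(key, KNOWN_KEYS, n=1)
            suspects[key] = matches[0] if matches else None
    return suspects


config = """\
reference: hg19.fa
read_len: 126
targed_bed: hg19_exome.bed
"""

for bad, suggestion in find_suspect_keys(config).items():
    print(f"unknown key {bad!r}, did you mean {suggestion!r}?")
```

Running a check like this before `neat read-simulator` would flag the README typo instead of leaving the option silently unused.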

@alabarga

if I set the reference to a FASTA file for the region in the bed file, it will run but I will get another error

self.errors = err_model.get_sequencing_errors(self.length, self.reference_segment,
File ".venv/lib/python3.10/site-packages/neat/models/models.py", line 413, in get_sequencing_errors
  if self.rng.random() < self.quality_score_error_rate[quality_scores[i]]:
KeyError: <generator object bin_scores at 0x7f7941f53530>
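
The `KeyError: <generator object bin_scores ...>` message suggests that a generator object itself, rather than the values it yields, was used as a dictionary key. The sketch below reproduces that pattern in isolation. It is an assumption about the failure mode, not NEAT's actual code: `bin_scores` here is a hypothetical stand-in and the error rates are made-up illustrative values.

```python
def bin_scores(bins, scores):
    """Hypothetical stand-in: lazily snap each score to the nearest allowed bin."""
    return (min(bins, key=lambda b: abs(b - s)) for s in scores)


# Illustrative quality-score -> error-rate table (values are invented).
error_rate = {2: 0.63, 20: 0.01, 40: 0.0001}
raw_scores = [3, 19, 38]

# Using the generator object as a key fails: generators are hashable,
# so the lookup raises KeyError rather than TypeError.
gen = bin_scores(error_rate.keys(), raw_scores)
try:
    error_rate[gen]
except KeyError as exc:
    print("KeyError:", exc)

# Materialising the generator first gives usable integer keys.
binned = list(bin_scores(error_rate.keys(), raw_scores))
rates = [error_rate[q] for q in binned]
print(binned, rates)
```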

@joshfactorial
Collaborator

> however it looks like NEAT will generate data for all the reference, not only for the bed file, am I correct?

By default the variants will be concentrated in the bed file areas, but there will still be some in the background (as well as sequencing errors). You can use the parameter off_target_scalar to adjust this. If you want no variants outside the bed, then you can set this to 0.0.

> if I set the reference to a FASTA file for the region in the bed file, it will run but I will get another error

Yeah, same, that's why this bug fix is taking me a minute.

@joshfactorial
Collaborator

In the meantime you can try version 3.2. Apologies for the broken release...

@alabarga
alabarga commented Nov 1, 2023

ok, thanks, just for your info, I managed to skip the previous error by adding

avg_seq_error: 0

to the config.yml but then I get a different error,

2023-10-31 18:53:43,982:ERROR:neat:read-simulator failed, see the traceback below
Traceback (most recent call last):
  File "/home/alabarga/BSC/code/synthetic-genomes/.venv/lib/python3.10/site-packages/neat/cli/cli.py", line 133, in main
    cmd(args)
  File "/home/alabarga/BSC/code/synthetic-genomes/.venv/lib/python3.10/site-packages/neat/cli/commands/read_simulator.py", line 47, in execute
    read_simulator_runner(arguments.config, arguments.output)
  File "/home/alabarga/BSC/code/synthetic-genomes/.venv/lib/python3.10/site-packages/neat/read_simulator/runner.py", line 353, in read_simulator_runner
    generate_reads(local_reference,
  File "/home/alabarga/BSC/code/synthetic-genomes/.venv/lib/python3.10/site-packages/neat/read_simulator/utils/generate_reads.py", line 589, in generate_reads
    read1.finalize_read_and_write(error_model_1, fq1, options.produce_fastq)
  File "/home/alabarga/BSC/code/synthetic-genomes/.venv/lib/python3.10/site-packages/neat/read_simulator/utils/read.py", line 278, in finalize_read_and_write
    self.quality_array = err_model.get_quality_scores(len(self.reference_segment))
  File "/home/alabarga/BSC/code/synthetic-genomes/.venv/lib/python3.10/site-packages/neat/models/models.py", line 498, in get_quality_scores
    self.rng.normal(self.quality_score_probabilities[i][0],
IndexError: index 151 is out of bounds for axis 0 with size 151

input_read_length length is 162 but quality_score_probabilities length is 151
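
The mismatch described above (a 162-base read against a quality table with 151 positions) can be illustrated with a small standalone sketch. This is not NEAT's code and not the fix that was eventually merged: the table contents are invented, and clamping the position to the last available row is just one possible guard, shown here as an assumption.

```python
import random

MODEL_POSITIONS = 151   # positions covered by the quality model (from the report)
READ_LENGTH = 162       # requested read length (from the report)

# Hypothetical per-position (mean, st_dev) table; values are illustrative only.
quality_score_probabilities = [(30.0, 2.0)] * MODEL_POSITIONS


def get_quality_scores(table, read_length, rng):
    """Draw one score per read position, reusing the last row past the table end."""
    last = len(table) - 1
    scores = []
    for i in range(read_length):
        mean, st_dev = table[min(i, last)]  # clamp instead of indexing past the end
        scores.append(round(rng.gauss(mean, st_dev)))
    return scores


# Unguarded access at position 151 fails, mirroring the reported IndexError.
try:
    quality_score_probabilities[MODEL_POSITIONS]
except IndexError as exc:
    print("IndexError:", exc)

scores = get_quality_scores(quality_score_probabilities, READ_LENGTH, random.Random(0))
print(len(scores))
```

An alternative design would be to reject the configuration up front when the read length exceeds the number of positions the model covers.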

@joshfactorial
Collaborator
joshfactorial commented Nov 1, 2023 via email

@joshfactorial
Collaborator

I pushed a fix to the Develop branch. If you have time, can you test your code on that branch?

@joshfactorial
Collaborator

This should now be fixed on the main branch and in the current release. Please check out the newest version and open a new ticket if you have any further issues.

@joshfactorial joshfactorial linked a pull request May 26, 2024 that will close this issue