Description
Currently, when querying an indexed database, the script uses the total number of records for the chromosome as the total for the progress bar built via tqdm. It should be fairly easy to get a better estimate of the number of reads in a region by using the start position of the bin containing the region start and the last position of the bin containing the region end. This could be achieved by querying the index with something like: DataBase.get_chromosome(bytes_chromosome_label).index[end_position].
Sadly, the __getitem__ method of ChromosomeIndex currently only returns the start position of a bin, instead of both start/end. This should be expanded, which requires some minor refactoring.
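A rough sketch of what the expanded lookup could look like is given here. The real ChromosomeIndex internals are not shown in this issue, so the constructor arguments (bin_starts, byte_offsets, end_byte) and the storage layout are assumptions made purely for illustration; only the idea of returning both the start and end byte position of a bin comes from the description above.

```python
from bisect import bisect_right


class ChromosomeIndex:
    """Hypothetical index mapping a genomic position to the byte range of its bin.

    The attribute names and layout are assumptions, not the project's actual code.
    """

    def __init__(self, bin_starts, byte_offsets, end_byte):
        # bin_starts: sorted genomic start position of each bin
        # byte_offsets: byte offset at which each bin begins in the file
        # end_byte: byte offset just past the last record of the chromosome
        self.bin_starts = list(bin_starts)
        self.byte_offsets = list(byte_offsets)
        self.end_byte = end_byte

    def __getitem__(self, position):
        # Find the bin with the largest start position that is <= position.
        i = bisect_right(self.bin_starts, position) - 1
        if i < 0:
            raise KeyError(position)
        start_byte = self.byte_offsets[i]
        # A bin ends where the next bin starts, or at the chromosome end.
        if i + 1 < len(self.byte_offsets):
            end_byte = self.byte_offsets[i + 1]
        else:
            end_byte = self.end_byte
        return start_byte, end_byte
```

Returning a (start, end) tuple changes the return type, so existing callers that expect a single start position would need to unpack the result or take its first element; that is presumably the minor refactoring mentioned above.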
The number of records would then be (end_byte_position - start_byte_position) / record_size_in_bytes.
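As a minimal sketch of that calculation, assuming fixed-size records and byte positions already obtained from the index, the estimate could be computed as below. The function name estimate_record_count is hypothetical; the parameter names follow the formula above.

```python
def estimate_record_count(start_byte_position, end_byte_position, record_size_in_bytes):
    """Estimate how many fixed-size records lie between two byte positions.

    Hypothetical helper; assumes every record occupies record_size_in_bytes.
    """
    if record_size_in_bytes <= 0:
        raise ValueError("record_size_in_bytes must be positive")
    span = end_byte_position - start_byte_position
    if span < 0:
        raise ValueError("end_byte_position must not precede start_byte_position")
    # Integer division: a partial trailing record is not counted.
    return span // record_size_in_bytes


# Example: the bin containing the region start begins at byte 4096, the bin
# containing the region end finishes at byte 12288, and records are 16 bytes.
estimated_total = estimate_record_count(4096, 12288, 16)

# The estimate could then be passed to the progress bar, e.g.:
#     from tqdm import tqdm
#     progress = tqdm(total=estimated_total)
```

Because the byte range covers whole bins rather than the exact region, this is an upper bound on the number of records in the region, but it should be much closer than the per-chromosome total.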