cogent3 is a mature Python library for the analysis of genomic sequence data. We endeavour to provide a first-class experience within Jupyter notebooks, but the algorithms also support parallel execution on compute systems with thousands of processors.
Migration to new type core objects ‼️
We are changing the migration strategy from old type to new type cogent3 core classes. At present we have old type and new type implementations for sequences, sequence collections, alignments, molecular types, alphabets and genetic codes. Users can select the new classes by passing new_type=True to functions such as make_aligned_seqs() or load_aligned_seqs(). Alternatively, you can do this across all objects by setting the COGENT3_NEW_TYPE environment variable.
We have established that it is not viable to support both old and new types simultaneously. Therefore, the first release after July 1st 2025 will remove all of the old type classes! Arguments specific to the old type classes will be deprecated at that point. While this is a major change, we have been using the new classes ourselves consistently and feel confident that the disruption to users should be small. However, we strongly advise all users to migrate now and report any errors. To do this, add the following statements to the top of your scripts.
import os
os.environ["COGENT3_NEW_TYPE"] = "1"
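As a rough illustration of the migration step, the sketch below sets the variable before importing cogent3 and then builds a small alignment. The sequence names and data are invented for the example, and the import is guarded so the snippet still runs where cogent3 is not installed. Printing the module of the resulting class is one way to see which implementation was selected.

```python
import os

# Set the flag before importing cogent3, as advised above.
os.environ["COGENT3_NEW_TYPE"] = "1"

try:
    import cogent3
except ImportError:  # cogent3 is not installed in this environment
    cogent3 = None

# Toy data, invented for this example.
data = {"seq1": "ACGTACGT", "seq2": "ACGAACGT", "seq3": "ACGTACGA"}

if cogent3 is not None:
    aln = cogent3.make_aligned_seqs(data, moltype="dna")
    # The module path shows which implementation was selected.
    print(type(aln).__module__, type(aln).__name__)
```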
Major advances in our progress towards a fully plugin-based architecture!
We have implemented the infrastructure to support alternative sequence storage plugins. These provide the backend storage for the new type sequence collections. We have implemented a proof-of-principle plugin cogent3-h5seqs for sequence storage based on the HDF5 format. This allows efficient storage of very large sequence collections (aligned or unaligned). See the readme for that project on how to use it.
We have implemented the infrastructure to support third-party provision of every bioinformatician's favourite game -- parsing / writing the multitude of sequence file formats. All builtin format parsers / writers are implemented as plugins. We use third-party versions by default.
We have implemented the infrastructure to support hook-style plugins. We have defined a single hook so far -- the new type Alignment.quick_tree() method checks for an external plugin for calculation. The developers of piqtree have made the rapid-NJ algorithm available for this hook! Once installed, it is used as aln.quick_tree(use_hook="piqtree").
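The hook could be wrapped so that a script works with or without the plugin. This is a minimal sketch under stated assumptions: quick_tree_with_fallback is a hypothetical helper name (not part of cogent3), and it assumes the plugin is importable under the module name piqtree. It relies only on the hook call shown above and on the default aln.quick_tree() call.

```python
from importlib.util import find_spec


def quick_tree_with_fallback(aln):
    """Build a quick tree, using the piqtree hook only when it is installed.

    `aln` is expected to be a new type cogent3 Alignment.
    """
    # Assumption: the plugin is importable as the module "piqtree".
    if find_spec("piqtree") is not None:
        return aln.quick_tree(use_hook="piqtree")
    # Otherwise fall back to the built-in calculation.
    return aln.quick_tree()
```

With an alignment loaded as aln, calling quick_tree_with_fallback(aln) returns whichever tree the available backend produces.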
Note: For assistance in writing your own plugins, contact us via the cogent3 discussions page.
Now distributed with sample data!
We have added sample data sets for quick testing of different features. Check out cogent3.available_datasets() to see the available datasets. You can load one using cogent3.get_dataset(name).
cogent3 is unique in providing numerous non-stationary Markov models for modelling sequence evolution, including codon models. cogent3 also includes an extensive collection of time-reversible models (again including novel codon models). We have done more than just invent these new methods; we have established the most robust algorithms for their implementation and their suitability for real data. Additionally, there are novel signal processing methods focussed on statistical estimation of integer period signals.
🎬 Demo non-reversible substitution model
cogent3-demo-composable.mp4
Beyond our novel methods, cogent3 provides an extensive suite of capabilities for manipulating and analysing sequence data. You can manipulate sequences by their annotations, e.g.
🎬 Demo sequences with annotations
cogent3-demo-new-ann.mp4
Plus, you can read standard tabular and biological data formats, perform multiple sequence alignment using any cogent3 substitution model, carry out phylogenetic reconstruction and tree manipulation, manipulate tabular data, visualise phylogenies and much more.
Our cogent3.app module provides a very different approach to using the library capabilities. Expertise in structural programming concepts is not essential!
🎬 Demo friendly coding
cogent3-demo-composable.mp4
For most users we recommend
$ pip install "cogent3[extra]"
which installs support for data visualisation and jupyter notebooks.
If you're running on a high-performance computing system we recommend
$ pip install cogent3
which skips the data visualisation and notebook support.
To install the development version directly from GitHub
$ pip install git+https://github.com/cogent3/cogent3.git@develop#egg=cogent3
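After installing, one quick way to confirm which version is active is to query the package metadata. The sketch below uses only the Python standard library; installed_version is a hypothetical helper name, not part of cogent3.

```python
from importlib import metadata


def installed_version(package="cogent3"):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("cogent3 is not installed in this environment")
else:
    print(f"cogent3 {version} is installed")
```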
cogent3 is released under the BSD-3 license. Documentation is at cogent3.org and the cogent3 code is on GitHub. If you would like to contribute (and we hope you do!), we have created a companion c3dev GitHub repo which provides details on how to contribute and some useful tools for doing so.
cogent3 is a descendant of PyCogent. While there is much in common with PyCogent, the amount of change has been substantial, motivating the name change to cogent3. This name was chosen because cogent was always the import name (dating back to PyEvolve in 2004) and the library is Python 3 only.
Given this history, we are grateful to the multitude of individuals who have made contributions over the years. Many of these contributors were also co-authors on the original PyEvolve and PyCogent publications. Individual contributions can be seen by using "view git blame" on individual lines of code on GitHub, through git log in the terminal, and more recently the changelog.
Cogent3 has received funding support from the Australian National University and an Essential Open Source Software for Science Grant from the Chan Zuckerberg Initiative.