Locus tag question #101
Comments
We see that the original locus_tag is in the existing entries, so it should be possible to update like this? https://www.ebi.ac.uk/ena/browser/api/embl/CAC21482.1?lineLimit=1000
We also notice that this entry was updated at the NCBI.
I edited the first comment to provide some context.
Locus_tag is EVEN defined as a stable identifier.
Please advise how to continue here.
Previous submissions of the fission yeast genome have used "locus_tag" for the systematic identifier (since before 2002).
We are now being asked to use "old_locus_tag" because the PomBase locus_tag includes a "." (period), e.g. SPCC18B5.03.
This change shouldn't be forced on existing IDs. We should not change systematic identifiers unnecessarily (this is contrary to FAIR data principles). In addition, according to the documentation, we can only use "old_locus_tag" if we also provide a current "locus_tag", so we are a bit stuck.
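To make the constraint under discussion concrete, here is a minimal sketch in Python. It assumes the rule is simply that a locus_tag may contain only letters, digits and underscores. That pattern and both function names are assumptions made for illustration and are not the actual INSDC or ENA validation logic.

```python
import re

# ASSUMPTION (illustration only): a locus_tag may contain letters, digits
# and underscores. This is not the official INSDC/ENA validator.
ASSUMED_LOCUS_TAG_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def passes_assumed_rule(identifier: str) -> bool:
    """Return True if the whole identifier matches the assumed pattern."""
    return ASSUMED_LOCUS_TAG_PATTERN.fullmatch(identifier) is not None


def offending_characters(identifier: str) -> list[str]:
    """Return the distinct characters that the assumed pattern rejects."""
    return sorted(
        {ch for ch in identifier if not ASSUMED_LOCUS_TAG_PATTERN.fullmatch(ch)}
    )


# The systematic identifier quoted above is rejected only because of the period.
print(passes_assumed_rule("SPCC18B5.03"))
print(offending_characters("SPCC18B5.03"))
```

Under this assumed rule, the period is the only character in SPCC18B5.03 that is flagged, which is why the existing systematic identifiers cannot pass such a check without being changed.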
The fission yeast systematic identifiers stored in locus_tag are probably used in 4,000 to 6,000 publications and in many thousands of genome-wide functional genomics datasets. This label is also used by downstream databases and pipelines (e.g. UniProtKB). The S. pombe systematic identifiers currently in "locus_tag" will never be deprecated because they are used in every large dataset for fission yeast, and provide our only unique and constant identifier for every gene. Studied genes and most conserved genes are given a "standard name", but completely unstudied genes often have no "standard name" assigned. Standard names may change in exceptional circumstances (i.e. to resolve conflicts, or to adopt universal nomenclature).
Systematic identifiers (i.e. the current locus_tag in INSDC) will therefore continue to provide the only unique identifier for functional genomics datasets, because this label will never change (unless a gene merges or splits, in which case one or both IDs will become synonyms and are therefore still trackable when referencing the gene history in a model organism database). I assume this is the case for most Model Organism Databases (as far as I'm aware there is no other mechanism to provide a recognisable unique ID for a locus genome-wide).
Is there another label that can be used for a unique "orf name"?
Thanks