CA2874492A1 - Nano46 genes and methods to predict breast cancer outcome - Google Patents

Nano46 genes and methods to predict breast cancer outcome

Publication number
CA2874492A1
Authority
CA
Canada
Prior art keywords
tat
gat
tct
expression
intrinsic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CA2874492A
Other languages
French (fr)
Other versions
CA2874492C (en
Inventor
Sean M. Ferree
Joel S. Parker
James Justin STORHOFF
Charles M. Perou
Matthew J. Ellis
Philip S. Bernard
Torsten O. Nielsen
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
British Columbia Cancer Agency BCCA
University of North Carolina at Chapel Hill
University of Utah Research Foundation UURF
Washington University in St Louis WUSTL
Nanostring Technologies Inc
Original Assignee
British Columbia Cancer Agency BCCA
University of North Carolina at Chapel Hill
University of Utah Research Foundation UURF
Washington University in St Louis WUSTL
Nanostring Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by British Columbia Cancer Agency BCCA, University of North Carolina at Chapel Hill, University of Utah Research Foundation UURF, Washington University in St Louis WUSTL, Nanostring Technologies Inc
Publication of CA2874492A1
Application granted
Publication of CA2874492C
Active
Anticipated expiration

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/118 Prognosis of disease development
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/158 Expression markers

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Engineering & Computer Science (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Wood Science & Technology (AREA)
  • Physics & Mathematics (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Hospice & Palliative Care (AREA)
  • Biophysics (AREA)
  • Oncology (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The present invention provides methods for classifying and for evaluating the prognosis of a subject having breast cancer. The methods include prediction of breast cancer subtype using a supervised algorithm trained to stratify subjects on the basis of breast cancer intrinsic subtype. The prediction model is based on the gene expression profile of the intrinsic genes listed in Table 1. Further provided are compositions and methods for predicting outcome or response to therapy of a subject diagnosed with or suspected of having breast cancer. These methods are useful for guiding or determining treatment options for a subject afflicted with breast cancer. Methods of the invention further include means for evaluating gene expression profiles, including microarrays and quantitative polymerase chain reaction assays, as well as kits comprising reagents for practicing the methods of the invention.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS
[01] This application claims the benefit of U.S. Provisional Application No. 61/650,209, filed May 22, 2012 and U.S. Provisional Application No. 61/753,673, filed January 17, 2013. The contents of each of these applications are incorporated herein by reference in their entireties.
FIELD OF THE INVENTION
[02] This disclosure relates generally to the field of cancer biology, and specifically, to the fields of detection and identification of specific cancer cell phenotypes and correlation with appropriate therapies.
BACKGROUND OF THE INVENTION
[03] Current approaches to treating early breast cancer, including adjuvant therapy, have indeed improved survival and reduced recurrence. However, the risk of recurrence may be underestimated in some patients, but overestimated in others.
[04] While the risk of recurrence does diminish somewhat over time, ongoing risk has been observed in many studies, some of them involving tens of thousands of patients with breast cancer. In fact, some of the patients who experienced recurrence after five years in these studies had previously been considered "low risk" - for example, their cancer had not spread to the lymph nodes at the time of their initial diagnosis, or their estrogen receptor status was positive. In one of these studies, a substantial number of recurrences occurred more than five years post-treatment. Thus, there is a need in the art to determine risk of recurrence and determine therapies which reduce that risk and improve overall survival.
SUMMARY OF THE INVENTION
[05] The present invention provides a method of predicting outcome in a subject having breast cancer comprising: providing a tumor sample from the subject; determining the expression of the genes in the NANO46 intrinsic gene list of Table 1 in the tumor sample; measuring the similarity of the tumor sample to an intrinsic subtype based on the expression of the genes in the NANO46 intrinsic gene list, wherein the intrinsic subtype consists of at least Basal-like, Luminal A, Luminal B or HER2-enriched; determining a proliferation score based on the expression of a subset of proliferation genes in the NANO46 intrinsic gene list; determining the size of the tumor; calculating a risk of recurrence score using a weighted sum of said intrinsic subtype, proliferation score and tumor size; and determining whether the subject has a low or high risk of recurrence based on the recurrence score. In one embodiment, a low score indicates a more favorable outcome and a high score indicates a less favorable outcome.
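As a rough illustration of the weighted-sum step in [05], the Python sketch below combines per-subtype similarity values, a proliferation score, and an encoded tumor size into one score and then applies a cutoff. The weights, the cutoff, the function names, and the choice to feed one similarity value per subtype into the sum are all hypothetical placeholders; this passage does not state the actual coefficients or thresholds.

```python
from typing import Dict

# Hypothetical weights for illustration only; not taken from the disclosure.
EXAMPLE_WEIGHTS: Dict[str, float] = {
    "Basal-like": 0.05,
    "HER2-enriched": 0.10,
    "Luminal A": -0.20,
    "Luminal B": 0.15,
    "proliferation": 0.30,
    "tumor_size": 0.10,
}
EXAMPLE_CUTOFF = 0.0  # hypothetical boundary between low and high risk


def risk_of_recurrence(similarity: Dict[str, float],
                       proliferation: float,
                       tumor_size: float,
                       weights: Dict[str, float] = EXAMPLE_WEIGHTS) -> float:
    """Weighted sum of subtype similarities, proliferation score and tumor size."""
    score = sum(weights[subtype] * value for subtype, value in similarity.items())
    score += weights["proliferation"] * proliferation
    score += weights["tumor_size"] * tumor_size
    return score


def risk_group(score: float, cutoff: float = EXAMPLE_CUTOFF) -> str:
    """Map a score to 'low' or 'high' using a single (assumed) cutoff."""
    return "high" if score > cutoff else "low"


# Toy input values, invented for the example.
example_similarity = {
    "Basal-like": -0.4,
    "HER2-enriched": -0.1,
    "Luminal A": 0.8,
    "Luminal B": 0.2,
}
example_score = risk_of_recurrence(example_similarity, proliferation=-0.5, tumor_size=0.0)
print(example_score, risk_group(example_score))
```

With real data, the weights and cutoff would come from a model fitted to outcome data rather than from constants like these.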
[06] The methods of the present invention can include determining the expression of at least one of, a combination of, or each of, the NANO46 intrinsic genes recited in Table 1. In some embodiments, the methods of the present invention can include determining the expression of at least one of, a combination of, or each of, the NANO46 intrinsic genes selected from ANLN, CCNE1, CDC20, CDC6, CDCA1, CENPF, CEP55, EXO1, KIF2C, KNTC2, MELK, MKI67, ORC6L, PTTG1, RRM2, TYMS, UBE2C and/or UBE2T. The expression of the members of the NANO46 intrinsic gene list can be determined using the nanoreporter code system (nCounter Analysis system).
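One simple way to turn the 18 genes named in [06] into a single proliferation score is to average their normalized expression values. The sketch below does that in Python; taking the arithmetic mean, the function name, and the shape of the input dictionary are assumptions made for illustration, since the passage only says the score is based on the expression of this gene subset.

```python
from statistics import mean
from typing import Dict

# The 18 genes listed in paragraph [06].
PROLIFERATION_GENES = (
    "ANLN", "CCNE1", "CDC20", "CDC6", "CDCA1", "CENPF", "CEP55", "EXO1",
    "KIF2C", "KNTC2", "MELK", "MKI67", "ORC6L", "PTTG1", "RRM2", "TYMS",
    "UBE2C", "UBE2T",
)


def proliferation_score(expression: Dict[str, float]) -> float:
    """Mean expression over the proliferation genes (assumed aggregation)."""
    missing = [gene for gene in PROLIFERATION_GENES if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {', '.join(missing)}")
    return mean(expression[gene] for gene in PROLIFERATION_GENES)


# Toy expression values, invented for the example.
example_expression = {gene: 1.0 for gene in PROLIFERATION_GENES}
example_expression["MKI67"] = 19.0
print(proliferation_score(example_expression))
```

Raising on missing genes, rather than silently averaging over fewer values, keeps scores comparable between samples.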
[07] The methods of the present invention can include determining at least one of, a combination of, or each of, the following: tumor size, tumor grade, nodal status, intrinsic subtype, estrogen receptor expression, progesterone receptor expression, and expression
[08] The sample can be a sampling of cells or tissues. The sample can be a tumor. The tissue can be obtained from a biopsy. The sample can be a sampling of bodily fluids. The bodily fluid can be blood, lymph, urine, saliva or nipple aspirate.
[09] While the disclosure has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the disclosure, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.
[10] The patent and scientific literature referred to herein establishes the knowledge that is available to those with skill in the art. All United States patents and published or unpublished United States patent applications cited herein are incorporated by reference.
All published foreign patents and patent applications cited herein are hereby incorporated by reference.
Genbank and NCBI submissions indicated by accession number cited herein are hereby incorporated by reference. All other published references, documents, manuscripts and scientific literature cited herein are hereby incorporated by reference.
[11] While this disclosure has been particularly shown and described with references to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the disclosure encompassed by the appended claims.
BRIEF DESCRIPTION OF THE DRAWINGS
[12] Figure 1 is a heatmap of the breast cancer intrinsic subtypes and the intrinsic genes of Table 1.
[13] Figure 2 shows Kaplan-Meier survival curves from a cohort of untreated breast cancer patients.
[14] Figure 3 shows Kaplan-Meier survival curves from a cohort of node-negative, ER+ breast cancer patients treated with tamoxifen.
[15] Figure 4 shows the 10-year event probability as a function of ROR score in ER+, node-negative breast cancer patients treated with tamoxifen. The graph shows the sub-population subtyped as Luminal A or B within this population. RFS = recurrence-free survival; DSS = disease-specific survival.
[16] Figure 5 is a schematic of the breast cancer intrinsic subtyping assay.
[17] Figure 6 is a schematic of the algorithm process.
[18] Figure 7 is an illustration showing the hybridization of the CodeSet to mRNA.
[19] Figure 8 is an illustration showing the removal of excess reporters.
[20] Figure 9 is an illustration showing the binding of the reporters to the surface of a cartridge.
[21] Figure 10 is an illustration showing the immobilization and alignment of a reporter.
[22] Figure 11 is an illustration of data collection.
[23] Figure 12 is an illustration of the nCounter analysis system breast cancer test assay process.
[24] Figure 13 is an illustration of the nCounter Prep Station.
[25] Figure 14 is an illustration of the nCounter Digital Analyzer.
DETAILED DESCRIPTION OF THE INVENTION
[26] The disclosure presents a method of predicting outcome in a subject having breast cancer comprising: providing a tumor sample from the subject; determining the expression of the genes in the NANO46 intrinsic gene list of Table 1 in the tumor sample; determining the intrinsic subtype of the tumor sample based on the expression of the genes in the NANO46 intrinsic gene list, wherein the intrinsic subtype consists of at least Basal-like, Luminal A, Luminal B or HER2-enriched; determining a proliferation score based on the expression of a subset of proliferation genes in the NANO46 intrinsic gene list; determining the size of the tumor; calculating a risk of recurrence score using a weighted sum of said intrinsic subtype, proliferation score and tumor size; and determining whether the subject has a low or high risk of recurrence based on the recurrence score. In one embodiment, a low score indicates a more favorable outcome and a high score indicates a less favorable outcome.
[27] Intrinsic genes are statistically selected to have low variation in expression between biological sample replicates from the same individual and high variation in expression across samples from different individuals. Thus, intrinsic genes are used as classifier genes for breast cancer classification. Although clinical information was not used to derive the breast cancer intrinsic subtypes, this classification has proved to have prognostic significance. Intrinsic gene screening can be used to classify breast cancers into five molecularly distinct intrinsic subtypes: Luminal A (LumA), Luminal B (LumB), HER2-enriched (Her-2-E), Basal-like, and Normal-like (Perou et al. Nature, 406(6797):747-52 (2000); Sorlie et al. PNAS, 98(19):10869-74 (2001)).
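The selection criterion in [27] (low variation between replicates from one individual, high variation across individuals) can be expressed as a ratio of between-individual to within-individual variance. The Python sketch below is one possible way to score a gene on that basis; the ratio itself, the function name, and the toy numbers are assumptions for illustration and are not the statistic used in the cited studies.

```python
from statistics import mean, pvariance
from typing import Dict, List


def intrinsic_score(replicates_by_individual: Dict[str, List[float]]) -> float:
    """Between-individual variance divided by mean within-individual variance.

    Higher values suggest a gene that is stable within an individual but
    differs between individuals (an assumed, simplified criterion).
    """
    within = mean(pvariance(values) for values in replicates_by_individual.values())
    between = pvariance([mean(values) for values in replicates_by_individual.values()])
    if within == 0:
        return float("inf")
    return between / within


# Toy expression values, invented for the example.
stable_within_gene = {"p1": [1.0, 1.2], "p2": [5.0, 5.2], "p3": [9.0, 9.2]}
noisy_within_gene = {"p1": [1.0, 9.0], "p2": [2.0, 8.0], "p3": [3.0, 7.0]}

print(intrinsic_score(stable_within_gene))
print(intrinsic_score(noisy_within_gene))
```

Genes would then be ranked by this score and the top-ranked ones kept as candidate classifier genes.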
[28] A NANO46 gene expression assay, as described herein, can identify intrinsic subtype from a biological sample, e.g., a standard formalin fixed paraffin embedded tumor tissue. The methods utilize a supervised algorithm to classify subject samples according to breast cancer intrinsic subtype. This algorithm, referred to herein as the NANO46 classification model, is based on the gene expression profile of a defined subset of intrinsic genes that has been identified herein as superior for classifying breast cancer intrinsic subtypes. The subset of genes, along with target-specific primer sequences utilized for their detection, is provided in Table 1. Table 1A provides the sequences of target-specific probe sequences for detecting each gene utilized in Table 1. The sequences provided in Table 1A are merely representative and are not meant to limit the invention. The skilled artisan can utilize any target sequence-specific probe for detecting any of (or each of) the genes in Table 1.
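To make the idea of supervised subtype classification concrete, the Python sketch below assigns a sample to the subtype whose reference expression profile it correlates with most strongly. The nearest-centroid approach, the use of Pearson correlation, the four-gene panel, and every centroid value are assumptions chosen for a small runnable example; this passage does not specify the similarity measure or the trained reference profiles.

```python
from math import sqrt
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def classify(sample: Dict[str, float],
             centroids: Dict[str, Dict[str, float]],
             genes: List[str]) -> Tuple[str, Dict[str, float]]:
    """Return the best-matching subtype and the similarity to every subtype."""
    profile = [sample[g] for g in genes]
    similarity = {
        subtype: pearson(profile, [centroid[g] for g in genes])
        for subtype, centroid in centroids.items()
    }
    return max(similarity, key=similarity.get), similarity


# Hypothetical gene panel and centroid values, invented for the example.
GENES = ["ESR1", "ERBB2", "KRT5", "MKI67"]
TOY_CENTROIDS = {
    "Luminal A": {"ESR1": 2.0, "ERBB2": 0.0, "KRT5": -1.0, "MKI67": -1.0},
    "Luminal B": {"ESR1": 1.5, "ERBB2": 0.0, "KRT5": -1.0, "MKI67": 1.0},
    "HER2-enriched": {"ESR1": -1.0, "ERBB2": 2.0, "KRT5": -0.5, "MKI67": 1.0},
    "Basal-like": {"ESR1": -2.0, "ERBB2": -0.5, "KRT5": 2.0, "MKI67": 1.5},
}
toy_sample = {"ESR1": 1.8, "ERBB2": 0.1, "KRT5": -0.9, "MKI67": -1.2}

best_subtype, similarities = classify(toy_sample, TOY_CENTROIDS, GENES)
print(best_subtype)
```

Returning the full similarity dictionary alongside the winning label lets a downstream step reuse the per-subtype values, for example as inputs to a risk score.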
[29] Table 1
Columns: GENE; REPRESENTATIVE GENBANK ACCESSION NUMBER; FORWARD PRIMER; SEQ ID NO:; REVERSE PRIMER; SEQ ID NO:
NM_020445 AAAGATTCCTGGG TGGGGCAGTTCTGTA

ACTR3B NM_001040135 ACCTGA TTACTTC
ACAGCCACTTTCA CGATGGTTTTGTACA

ANLN NM_018685 GAAGCAAG AGATTTCTC
CTGGAAGAGTTGA GCAAATCCTTGGGC

BAG1 NM_004323 ATAAAGAGC AGA

_ GCTGGCTGAGCAG TTCCTCCATCAAGAG

BLVRA AAAG TTCAACA
GGCCAAAATCGAC GGGTCTGCACAGAC

CTGTCTGAGTGCC TCCTTGTAATGGGGA

GTAAATCACCTTC ACTTGGGATATGTGA

CDC6 NM_001254 TGAGCCT ATAAGACC

CDCA1 NM_031423 AACCAG TTTCCA
GACAAGGAGAAT ACTGTCTGGGTCCAT

GTGGCAGCAGATC GGATTTCGTGGTGGG

CENPF NM_016343 ACAA TTC
CCTCACGAATTGC CCACAGTCTGTGATA

CATGAAATAGTGC CCATCAACATTCTCT

ACACAGAATCTAT ATCAACTCCCAAAC

EGFR NM_005228 ACCCACCAGAGT GGTCAC
GCTGGCTCTCACA GCCCTTACACATCGG

ERBB2 NM_001005862 CTGATAG AGAAC
GCAGGGAGAGGA GACTTCAGGGTGCTG

ESR1 NM_001122742 GTTTGT GAC
CCCATCCATGTGA TGTGAAGCCAGCAA

EXO1 NM_130398 GGAAGTATAA TATGTATC
CTTCTTGGACCTT TATTGGGAGGCAGG

GCTACTACGCAGA CTGAGTTCATGTTGC

FOXA1 NM_004496 CACG TGACC
GATGTTCGAGTCA GACAGCTACTATTCC

FOXC1 NM_001453 CAGAGG CGTT
TTCGGCTGGAAGG TATGTGAGTAAGCTC

(UBE2T) NM_014176 CTCCAAA TGGAG
TGGGTCGTGTCAG CACCGCTGGAAACT

KIF2C NM_006845 GAAAC GAAC

CGCAGTCATCCAG CGTGCACATCCATGA

KNTC2 NM_006101 AGATGTG CCTT
ACTCAGTACAAGA GAGGAGATGACCTT

GTTGGACCAGTCA GCCATAGCCACTGCC

TGTGGCTCATTAG CTTCGACTGGACTCT

GACTCCAAGCGCG CAGACATGTTGGTAT

MAPT NM_001123066 AAAAC TGCACATT
CCACAAAATATTC AGGCGATCCTGGGA

CCAGTAGCATTGT CCCATTTGTCTGTCT
MELK NM_014791 CCGAG TCAC
GTCTCTGGTAATG CTGATGGTTGAGGCT
GTGGAATGCCTGC CGCACTCCAGCACCT
MKI67 NM_002417 TGACC AGAC
AGGGGTGCCCTCT TCACAGGGTCAAAC
MLPH NM_024101 GAGAT TTCCAGT
CGAGATCGCCAAG GATGGTAGAGTTCC
MMP11 NM_005940 ATGTT AGTGATT
AGCCTCGAACAAT ACACAGATGATGGA
MYC NM_002467 TGAAGA GATGTC
ATCGACTGTGTAA

TTTAAGAGGGCAA CGGATTTTATCAACG

ORC6L NM_014321 ATGGAAGG ATGCAG
TGCCGCAGAACTC CATTTGCCGTCCTTC

PGR NM_000926 ACTTG ATCG
CCTCAGATGATGC GCAGGTCAAAACTC

CAGCAAGCGATGG AGCGGGCTTCTGTAA

AATGCCACCGAAG GCCTCAGATTTCAAC

TCGAACTGAAGGC CTGCTGAGAATCAA

GTCGAAGCCGCAA GGAACAAACTGCTC

SLC39A6 NM_012319 TTAGG TGCCA
CAAACGTGTGTTC ACAGCTCTTTAGCAT

TGCCCTGTATGAT GGGACTATCAATGTT

Table 1A. Probes for detecting NANO46 genes
Gene Name  RefSeq Accession  Target Sequence  SEQ ID NO:
ACTR3B  NM_001040135.1  CCAGAAGAAGTTTGTTATAGACGTTGGTTACGAAAGATTCCTGGGACCTGAAATATTCTTTCACCCGGAGTTTGCCAACCCAGACTTTATGGAGTCCATC  140
ANLN  NM_018685.2  CGTGCCAGGCGAGAGAATCTTCAGAGAAAAATGGCTGAGAGGCCCACAGCAGCTCCAAGGTCTATGACTCATGCTAAGCGAGCTAGACAGCCACTTTCAG  141
BAG1  NM_004323.3  CTTCATGTTACCTCCCAGCAGGGCAGCAGTGAACCAGTTGTCCAAGACCTGGCCCAGGTTGTTGAAGAGGTCATAGGGGTTCCACAGTCTTTTCAGAAAC  142
BCL2  NM_000633.2  CCAAGCACCGCTTCGTGTGGCTCCACCTGGATGTTCTGTGCCTGTAAACATAGATTCGCTTTCCATGTTGTTGGCCGGATCACCATCTGAAGAGCAGACG  143
BLVRA  NM_000712.3  TTCCTGAAAAAAGAAGTGGTGGGGAAAGACCTGCTGAAAGGGTCGCTCCTCTTCACAGCTGGCCCGTTGGAAGAAGAGCGGTTTGGCTTCCCTGCATTCA  144
CCNE1  NM_001238.1  GAGAACTGTGTCAAGTGGATGGTTCCATTTGCCATGGTTATAAGGGAGACGGGGAGCTCAAAACTGAAGCACTTCAGGGGCGTCGCTGATGAAGATGCAC  145
CDC20  NM_001255.1  CCCGAGTGGGCTCCCTAAGCTGGAACAGCTATATCCTGTCCAGTGGTTCACGTTCTGGCCACATCCACCACCATGATGTTCGGGTAGCAGAACACCATGT  146
CDC6  NM_001254.3  GGGGAAGTTATATGAAGCCTACAGTAAAGTCTGTCGCAAACAGCAGGTGGCGGCTGTGGACCAGTCAGAGTGTTTGTCACTTTCAGGGCTCTTGGAAGCC  147
CDCA1  NM_145697.1  GCCTGGCGGTGTTTTCGTCGTGCTCAGCGGTGGGAGGAGGCGGAAGAAACCAGAGCCTGGGAGATTAACAGGAAACTTCCAAGATGGAAACTTTGTCTIT  148
CDH3  NM_001793.3  CCCTCGACCGTGAGGATGAGCAGTTTGTGAGGAACAACATCTATGAAGTCATGGTCTTGGCCATGGACAATGGAAGCCCTCCCACCACTGGCACGGGAAC  149
CENPF  NM_016343.3  AGAAAATCTTGCAGAGTCCTCCAAACCAACAGCTGGTGGCAGCAGATCACAAAAGGTCAAAGTTGCTCAGCGGAGCCCAGTAGATTCAGGCACCATCCTC  150
CEP55  NM_018131.3  GTACTACCGCATTGCTTGAACAGCTGGAAGAGACAACGAGAGAAGGAGAAAGGAGGGAGCAGGTGTTGAAAGCCTTATCTGAAGAGAAAGACGTATTGAA  151
CXXC5  NM_016463.5  AGCTGCCCTCTCCGTGCAATGTCACTGCTCGTGTGGTCTCCAGCAAGGGATTCGGGCGAAGACAAACGGATGCACCCGTCTTTAGAACCAAAAATATTCT  152
EGFR  NM_005228.3  GCAGCCAGGAACGTACTGGTGAAAACACCGCAGCATGTCAAGATCACAGATTTTGGGCTGGCCAAACTGCTGGGTGCGGAAGAGAAAGAATACCATGCAG  153
ERBB2  NM_004448.2  TGAAGGTGCTTGGATCTGGCGCTTTTGGCACAGTCTACAAGGGCATCTGGATCCCTGATGGGGAGAATGTGAAAATTCCAGTGGCCATCAAAGTGTTGAG  154
ESR1  NM_000125.2  AGGAACCAGGGAAAATGTGTAGAGGGCATGGTGGAGATCTTCGACATGCTGCTGGCTACATCATCTCGGTTCCGCATGATGAATCTGCAGGGAGAGGAGT  155
EXO1  NM_006027.3  TGGCCCACAAAGTAATTAAAGCTGCCCGGTCTCAGGGGGTAGATTGCCTCGTGGCTCCCTATGAAGCTGATGCGCAGTTGGCCTATCTTAACAAAGCGGG  156
FGFR4  NM_002011.3  CCCACATCCAGTGGCTGAAGCACATCGTCATCAACGGCAGCAGCTTCGGAGCCGACGGTTTCCCCTATGTGCAAGTCCTAAAGACTGCAGACATCAATAG  157
FOXA1  NM_004496.2  TGGATGGTTGTATTGGGCAGGGTGGCTCCAGGATGTTAGGAACTGTGAAGATGGAAGGGCATGAAACCAGCGACTGGAACAGCTACTACGCAGACACGCA  158
FOXC1  NM_001453.1  TTCGAGTCACAGAGGATCGGCTTGAACAACTCTCCAGTGAACGGGAATAGTAGCTGTCAAATGGCCTTCCCTTCCAGCCAGTCTCTGTACCGCACGTCCG  159
GPR160  NM_014373.1  GGATTTCAGTCCTTGCTTATGTTTTGGGAGACCCAGCCATCTACCAAAGCCTGAAGGCACAGAATGCTTATTCTCGTCACTGTCCTTTCTATGTCAGCAT  160
UBE2T  NM_014176.1  GTGTCAGCTCAGTGCATCCCAGGCAGCTCTTAGTGTGGAGCAGTGAACTGTGTGTGGTTCCTTCTACTTGGGGATCATGCAGAGAGCTTCACGTCTGAAG  161
KIF2C  NM_006845.2  GTTGTCTACAGGTTCACAGCAAGGCCACTGGTACAGACAATCTTTGAAGGTGGAAAAGCAACTTGTTTTGCATATGGCCAGACAGGAAGTGGCAAGACAC  162
KNTC2  NM_006101.1  AAAAGGTCATAAGCATGAAGCGCAGTTCAGTTTCCAGCGGTGGTGCTGGCCGCCTCTCCATGCAGGAGTTAAGATCCCAGGATGTAAATAAACAAGGCCT  163
KRT14  NM_000526.3  GCAGTCATCCAGAGATGTGACCTCCTCCAGCCGCCAAATCCGCACCAAGGTCATGGATGTGCACGATGGCAAGGTGGTGTCCACCCACGAGCAGGTCCTT  164
KRT17  NM_000422.1  CTGACTCAGTACAAGAAAGAACCGGTGACCACCCGTCAGGTGCGTACCATTGTGGAAGAGGTCCAGGATGGCAAGGTCATCTCCTCCCGCGAGCAGGTCC  165
KRT5  NM_000424.2  CTGGTTCTCTTGCTCCACCAGGAACAAGCCACCATGTCTCGCCAGTCAAGTGTGTCCTTCCGGAGCGGGGGCAGTCGTAGCTTCAGCACCGCCTCTGCCA  166
MAPT  NM_016835.3  GCCGGGTCCCTCAACTCAAAGCTCGCATGGTCAGTAAAAGCAAAGACGGGACTGGAAGCGATGACAAAAAAGCCAAGACATCCACACGTTCCTCTGCTAA  167
MDM2  NM_006878.2  GGTGAGGAGCAGGCAAATGTGCAATACCAACATGTCTGTACCTACTGATGGTGCTGTAACCACCTCACAGATTCCAGCTTCGGAACAAGAGACCCTGGTT  168
MELK  NM_014791.2  AGAGACAGCCAACAAAATATTCATGGTTCTTGAGTACTGCCCTGGAGGAGAGCTGTTTGACTATATAATTTCCCAGGATCGCCTGTCAGAAGAGGAGACC  169
MIA  NM_006533.1  CCGGGGCCAAGTGGTGTATGTCTTCTCCAAGCTGAAGGGCCGTGGGCGGCTCTTCTGGGGAGGCAGCGTTCAGGGAGATTACTATGGAGATCTGGCTGCT  170
MKI67  NM_002417.2  GCTTCCAGCAGCAAATCTCAGACAGAGGTTCCTAAGAGAGGAGGAGAAAGAGTGGCAACCTGCCTTCAAAAGAGAGTGTCTATCAGCCGAAGTCAACATG  171
MLPH  NM_024101.4  GAGGAAGTCAAACCTCCCGATATTTCTCCCTCGAGTGGCTGGGAAACTTGGCAAGAGACCAGAGGACCCAAATGCAGACCCTTCAAGTGAGGCCAAGGCA  172
MMP11  NM_005940.3  AGCAGCCAAGGCCCTGATGTCCGCCTTCTACACCTTTCGCTACCCACTGAGTCTCAGCCCAGATGACTGCAGGGGCGTTCAACACCTATATGGCCAGCCC  173
MYC  NM_002467.3  CACCGAGGAGAATGTCAAGAGGCGAACACACAACGTCTTGGAGCGCCAGAGGAGGAACGAGCTAAAACGGAGCTTTTTTGCCCTGCGTGACCAGATCCCG  174
NAT1  NM_000662.4  AGCACTTCCTCATAGACCTTGGATGTGGGAGGATTGCATTCAGTCTAGTTCCTGGTTGCCGGCTGAAATAACCTGAATTCAAGCCAGGAAGAAGCAGCAA  175
ORC6L  NM_014321.2  GACTGTGTAAACAACTAGAGAAGATTGGACAGCAGGTCGACAGAGAACCTGGAGATGTAGCTACTCCACCACGGAAGAGAAAGAAGATAGTGGTTGAAGC  176
PGR  NM_000926.2  GGGATGAAGCATCAGGCTGTCATTATGGTGTCCTTACCTGTGGGAGCTGTAAGGTCTTCTTTAAGAGGGCAATGGAAGGGCAGCACAACTACTTATGTGC  177
PHGDH  NM_006623.2  GCGACGGCTTCGATGAAGGACGGCAAATGGGAGCGGAAGAAGTTCATGGGAACAGAGCTGAATGGAAAGACCCTGGGAATTCTTGGCCTGGGCAGGATTG  178
PTTG1  NM_004219.2  CACCAGCCTTACCTAAAGCTACTAGAAAGGCTTTGGGAACTGTCAACAGAGCTACAGAAAAGTCTGTAAAGACCAAGGGACCCCTCAAACAAAAACAGCC  179
RRM2  NM_001034.1  TTCCTTTTGGACCGCCGAGGAGGTTGACCTCTCCAAGGACATTCAGCACTGGGAATCCCTGAAACCCGAGGAGAGATATTTTATATCCCATGTTCTGGCT  180
SFRP1  NM_003012.3  GTGGGTCACACACACGCACTGCGCCTGTCAGTAGTGGACATTGTAATCCAGTCGGCTTGTTCTTGCAGCATTCCCGCTCCCTTCCCTCCATAGCCACGCT  181
SLC39A6  NM_012319.2  GATCGAACTGAAGGCTATTTACGAGCAGACTCACAAGAGCCCTCCCACTTTGATTCTCAGCAGCCTGCAGTCTTGGAAGAAGAAGAGGTCATGATAGCTC  182
TMEM45B  NM_138788.3  CTGGCTGCCCTCAGCATTGTGGCCGTCAACTATTCTCTTGTTTACTGCCTTTTGACTCGGATGAAGAGACACGGAAGGGGAGAAATCATTGGAATTCAGA  183
TYMS  NM_001071.1  TGCTAAAGAGCTGTCTTCCAAGGGAGTGAAAATCTGGGATGCCAATGGATCCCGAGACTTTTTGGACAGCCTGGGATTCTCCACCAGAGAAGAAGGGGAC  184
UBE2C  NM_007019.2  GTCTGCCCTGTATGATGTCAGGACCATTCTGCTCTCCATCCAGAGCCTTCTAGGAGAACCCAACATTGATAGTCCCTTGAACACACATGCTGCCGAGCTC  185
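A quick consistency check on listed sequences is to ask whether a primer falls inside a probe target region, on either strand. The Python sketch below does this for ACTR3B, using the forward and reverse primers from Table 1 and the SEQ ID NO: 140 target sequence, each joined from its line-wrapped pieces. The helper names are hypothetical, and the check is plain exact string matching, not a model of hybridization.

```python
from typing import Optional, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def locate_primer(primer: str, target: str) -> Optional[Tuple[str, int]]:
    """Return (strand, 0-based offset) of an exact primer match, or None."""
    offset = target.find(primer)
    if offset != -1:
        return ("forward", offset)
    offset = target.find(reverse_complement(primer))
    if offset != -1:
        return ("reverse", offset)
    return None


# ACTR3B sequences as listed in the tables, joined from their wrapped pieces.
ACTR3B_TARGET = (
    "CCAGAAGAAGTTTGTTATAGACGTTGGTTACG"
    "AAAGATTCCTGGGACCTGAAATATTCTTTCAC"
    "CCGGAGTTTGCCAACCCAGACTTTATGGAGTC"
    "CATC"
)
ACTR3B_FORWARD_PRIMER = "AAAGATTCCTGGG" + "ACCTGA"
ACTR3B_REVERSE_PRIMER = "TGGGGCAGTTCTGTA" + "TTACTTC"

print(locate_primer(ACTR3B_FORWARD_PRIMER, ACTR3B_TARGET))
print(locate_primer(ACTR3B_REVERSE_PRIMER, ACTR3B_TARGET))
```

Here the forward primer lies within the probe target region, while the reverse primer has no exact match in this 100-nucleotide window on either strand.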
[30] Table 2 provides select sequences for the NANO46 genes of Table 1.
[31] Table 2
Columns: GENBANK ACCESSION NUMBER; SEQUENCE; SEQ ID NO:
NM_020445 GGGCGCTCTCGGGCTGCCGGCGGGGCCGAGCGCCGCGCGICCCGAGCAIGGCAGGCICCCIGCCICCCIG
CGT GGT GGAC T GT GGCACCGGGTATACCAAGC T TGGCTACGCAGGCAACACTGAGCCCCAGT TCAT TAT
T
CCT TCATGTAT TGCCATCAGAGAGTCAGCAAAGGTAGT T GACCAAGCTCAAAGGAGAGT GT TGAGGGGAG
T TGATGACCT TGACT TTTTCATAGGAGATGAAGCCATCGATAAACCTACATATGCTACAAAGTGGCCGAT
ACGACATGGAATCAT TGAAGACTGGGATCT TAT GGAAAGGT TCATGGAGCAAGTGGT T T T TAAATATCT
T
CGAGCTGAACCTGAGGACCAT TAT TTTT TAATGACAGAACCTCCACTCAATACACCAGAAAACAGAGAGT
ATCT TGCAGAAAT TATGT T TGAATCAT T TAACGTACCAGGACTCTACAT TGCAGT TCAGGCAGTGCTGGC

CT TGGCGGCATCT TGGACATCTCGACAAGTGGGTGAACGTACGT TAACGGGGATAGTCAT TGACAGCGGA
GAT GGAGTCACCCAT GT TATCCCAGTGGCAGAAGGT TAT GTAAT TGGAAGCTGCATCAAACACATCCCGA
T TGCAGGTAGAGATAT TACGTAT T TCAT TCAACAGC T GC TAAGGGAGAGGGAGGT
GGGAATCCCTCCTGA
GCAGICACIGGAGACCGCAAAAGCCAT TAAGGAGAAATAC T GT TACAT TIGCCCCGATATAGICAAGGAA
T T T GCCAAGTAT GAT GT GGATCCCCGGAAGT GGATCAAACAGTACACGGGTAT CAAT
GCGATCAACCAGA
AGAAGT T T GT TATAGACGT TGGT TACGAAAGAT TCCTGGGACCTGAAATAT TCT T TCACCCGGAGT T
T GC
CAACCCAGACT T TAT GGAGTCCATC T CAGAT GT T GT T GAT GAAGTAATACAGAAC T
GCCCCATCGAT GT G
CGGCGCCCGC T GTATAAGAAT GTCGTAC TC TCAGGAGGC TCCACCAT GT TCAGGGAT
TTCGGACGCCGAC
TGCAGAGGGAT T T GAAGAGAGT GGT GGAT GC TAGGC T GAGGCTCAGCGAGGAGC T CAGC
GGCGGGAGGAT
CAAGCCGAAGCCIGIGGAGGTCCAGGIGGICACGCATCACATGCAGCGCTACGCCGIGIGGITCGGAGGC
T CCAT GC T GGCCT CGACTCCCGAGT ICI T ICAGGIC T GCCACACCAAGAAGGAC TAT
GAAGAGTACGGGC
CCAGCATCTGCCGCCACAACCCCGTCT T T GGAGTCAT GT CCTAGTGTC T GCCT GAACGCGTCGT
TCGATG
GT GTCACGT TGGGGAACAAGTGTCCT TCAGAACCCAGAGAAGGCCGCCGT T CT GTAAATAGCGACGTCGG
TGT TGC TGCCCAGCAGCGT GCT TGCAT TGCCGGTGCATGAGGCGCGGCGCGGGCCCT TCAGTAAAAGCCA
T T TATCCGT GTGCCGACCGC T GTC T GCCAGCCT CC TCC T
TCTCCCGCCCTCCTCACCCTCGCTCTCCCTC
CICCICCICC TCCGAGC T GC TAGC T GACAAATACAAT IC T GAAGGAATCCAAAT =GAO T T
TGAAAAT TG
T TAGAGAAAACAACAT TAGAAAAIGGCGCAAAATCGITAGGICCCAGGAGAGAATGIGGGGGCGCAAACC
CTTT TCC TCCCAGCC TAT TTTT GTAAATAAAAT GT T TAAACT T GAAATACAAATCGAT GT T
TATAT T TCC
TATCAT T T TGTAT T T TATGGTAT T TGGTACAACTGGCTGATACTAAGCACGAATAGATAT TGATGT
TATG
GAGTGCTGTAATCCAAAGT T T T TAAT TGTGAGGCATGT T CTGATAT GT T
TATAGGCAAACAAATAAAACA
GCAAACTTTTT TGCCACAT GT T TGCTAGAAAAT GAT TATACT T TAT TGGAGTGACATGAAGT T
TGAACAC
TAAACAGTAATGTATGAGAAT TACTACAGATACATGTAT CT T T TAGT TTTTTT TGT T TGAACT T
TCTGGA
GCTGT T T TATAGAAGATGATGGT T TGT TGTCGGTGAGTGT TGGATGAAATACT TCCT TGCACCAT
TGTAA
TAAAAGCTGT TAGAATAT T TGTAAATATC
NM_00104013 GGGCGCTCTCGGGCTGCCGGCGGGGCCGAGCGCCGCGCGICCCGAGCATGGCAGGCICCCIGCCICCCIG
CGTGGTGGACTGT GGCACCGGGTATACCAAGCT TGGCTACGCAGGCAACACTGAGCCCCAGT TCAT TAT T
CCT TCATGTAT TGCCATCAGAGAGTCAGCAAAGGTAGT T GACCAAGCTCAAAGGAGAGT GT TGAGGGGAG
T TGATGACCT TGACT TTTTCATAGGAGATGAAGCCATCGATAAACCTACATATGCTACAAAGTGGCCGAT
ACGACATGGAATCAT TGAAGACTGGGATCT TAT GGAAAGGT TCATGGAGCAAGTGGT T T T TAAATATCT
T
CGAGCTGAACCTGAGGACCAT TAT TTTT TAATGACAGAACCTCCACTCAATACACCAGAAAACAGAGAGT
ATCT TGCAGAAAT TATGT T TGAATCAT T TAACGTACCAGGACTCTACAT TGCAGT TCAGGCAGTGCTGGC

CT TGGCGGCATCT TGGACATCTCGACAAGTGGGTGAACGTACGT TAACGGGGATAGTCAT TGACAGCGGA
GATGGAGTCACCCATGT TATCCCAGTGGCAGAAGGT TAT GTAAT TGGAAGCTGCATCAAACACATCCCGA
T TGCAGGTAGAGATAT TACGTAT T TCAT TCAACAGCTGCTAAGGGAGAGGGAGGTGGGAATCCCTCCTGA
GCAGTCACTGGAGACCGCAAAAGCCAT TAAGGAGAAATACTGT TACAT T TGCCCCGATATAGTCAAGGAA

T T T GCCAAGTAT GAT GT GGATCCCCGGAAGT GGATCAAACAGTACACGGGTAT CAAT
GCGATCAACCAGA
AGAAGT T T GT TATAGACGT TGGT TACGAAAGAT TCCTGGGACCTGAAATAT TCT T TCACCCGGAGT T
T GC
CAACCCAGACT T TATGGAGTCCATCTCAGATGT TGT TGATGAAGTAATACAGAACTGCCCCATCGATGTG
CGGCGCCCGCTGTATAAGCCCGAGT T CT T TCAGGTCTGCCACACCAAGAAGGACTATGAAGAGTACGGGC
CCAGCATCTGCCGCCACAACCCCGICITIGGAGICATGICCTAGIGICTGCCIGAACGCGICGTICGAIG
GTGTCACGT TGGGGAACAAGTGTCCT TCAGAACCCAGAGAAGGCCGCCGT T CT GTAAATAGCGACGTCGG
TGT TGCTGCCCAGCAGCGTGCT TGCAT TGCCGGTGCATGAGGCGCGGCGCGGGCCCT TCAGTAAAAGCCA
T T TATCCGTGTGCCGACCGCTGTCTGCCAGCCTCCTCCT TCTCCCGCCCTCCTCACCCTCGCTCTCCCTC
CICCICCICCTCCGAGCTGCTAGCTGACAAATACAAT IC TGAAGGAATCCAAATGIGAC T T TGAAAAT TG
T TAGAGAAAACAACAT TAGAAAAIGGCGCAAAATCGITAGGICCCAGGAGAGAATGIGGGGGCGCAAACC
CT T T TCCTCCCAGCCTAT T T T TGTAAATAAAAT GT T TAAACT TGAAATACAAATCGATGT T
TATAT T TCC
TATCAT T T TGTAT T T TATGGTAT T TGGTACAACTGGCTGATACTAAGCACGAATAGATAT TGATGT
TATG
GAGTGCTGTAATCCAAAGT T T T TAAT TGTGAGGCATGT T CTGATAT GT T
TATAGGCAAACAAATAAAACA
GCAAACTTTTT TGCCACAT GT T TGCTAGAAAAT GAT TATACT T TAT TGGAGTGACATGAAGT T
TGAACAC
TAAACAGTAATGTATGAGAAT TACTACAGATACATGTAT CT T T TAGT T T T T TT TGT T TGAACT T
TCTGGA
GCTGT T T TATAGAAGATGATGGT T TGT TGTCGGTGAGTGT TGGATGAAATACT TCCT TGCACCAT
TGTAA
TAAAAGCTGT TAGAATAT T TGTAAATATC
NM_018685 CTCGGCGCTGAAAT TCAAAT T

CAGCTGGITGIGGGAGAGT TCCCCCGCCICAGACTCCIGGI TIT TT CCAGGAGACACAC TGAGCTGAGAC
=ACT T T =ICI T CCTGAAT T TGAACCACCGT T TCCATCGICICGTAGICCGACGCCTGGGGCGATGGAT
CCGT T TACGGAGAAACTGCTGGAGCGAACCCGTGCCAGGCGAGAGAATCT TCAGAGAAAAATGGCT GAGA
GGCCCACAGCAGC TCCAAGGTCTATGACTCATGCTAAGCGAGCTAGACAGCCACT T TCAGAAGCAAGTAA
CCAGCAGCCCCTCTCTGGTGGTGAAGAGAAATCT TGTACAAAACCATCGCCATCAAAAAAACGCTGT TCT
GACAACACTGAAGTAGAAGITICTAACTIGGAAAATAAACAACCAGITGAGTCGACATCTGCAAAATCTI
GT TCTCCAAGTCCTGTGTCTCCTCAGGTGCAGCCACAAGCAGCAGATACCATCAGTGAT TCTGT TGCTGT
CCCGGCATCACTGCTGGGCATGAGGAGAGGGCTGAACTCAAGAT TGGAAGCAACTGCAGCCTCCTCAGT T
AAAACACGTATGCAAAAACT TGCAGAGCAACGGCGCCGT TGGGATAATGAT GATATGACAGATGACAT IC
CTGAAAGCTCACT CT ICICACCAATGCCATCAGAGGAAAAGGCTGC T TCCCCT CCCAGACCICTGC T T IC

AAATGCCTCGGCAACTCCAGT TGGCAGAAGGGGCCGTCT GGCCAAT CT TGCTGCAACTAT T T GC TCC T
GG
GAAGAT GAT GTAAATCAC T CAT T TGCAAAACAAAACAGTGTACAAGAACAGCCTGGTACCGCT T GT T
TAT
CCAAAT T T TCC TC T GCAAGT GGAGCATC T GC TAGGATCAATAGCAGCAGT GT TAAGCAGGAAGC
TACAT T
C T GT TCCCAAAGGGAT GGCGAT GCC T CT T T GAATAAAGCCC TATCC TCAAGTGC T GAT GAT
GCGTC T T TG
GT TAATGCCTCAAT T TCCAGC TC T GT GAAAGC TAC T TC T CCAGT GAAATC TAC TACATC
TATCAC T GAT G
CTAAAAGT T GT GAGGGACAAAATCC T GAGC TAC T TCCAAAAAC TCC TAT TAGT CC TC T
GAAAACGGGGGT
ATCGAAACCAAT T GT GAAGTCAAC T T TATCCCAGACAGT TCCATCCAAGGGAGAAT TAAGTAGAGAAAT
T
T GTO T GCAATC TCAATC TAAAGACAAATC TACGACACCAGGAGGAACAGGAAT TAAGCCT T
TCCTGGAAC
GC T T T GGAGAGCGT TGICAAGAACATAGCAAAGAAAGICCAGC TCGTAGCACACCCCACAGAACCCCCAT
TAT TACTCCAAATACAAAGGCCATCCAAGAAAGAT TAT T CAAGCAAGACACAT CT =AT C TAC TACCCAT

T TAGCACAACAGCTCAAGCAGGAACGTCAAAAAGAACTAGCATGTCT TCGTGGCCGAT T TGACAAGGGCA
ATATAT GGAGTGCAGAAAAAGGCGGAAAC T CAAAAAGCAAACAAC TAGAAACCAAACAGGAAAC T CAC T G

ICAGAGCACTCCCCICAAAAAACACCAAGGIGT TICAAAAACICAGICACT TCCAGTAACAGAAAAGGIG
ACCGAAAACCAGATACCAGCCAAAAAT TCTAGTACAGAACCTAAAGGT T TCACTGAATGCGAAATGACGA
AATCTAGCCCT T TGAAAATAACAT T GT T T T TAGAAGAGGACAAATCCT
TAAAAGTAACATCAGACCCAAA
GGT TGAGCAGAAAAT TGAAGTGATACGTGAAAT T GAGAT GAGT GT GGAT GATGAT GATATCAATAGT
TCG
AAAGTAAT TAATGACCTCT TCAGT GAT GTCC TAGAGGAAGGT GAAC TAGATAT GGAGAAGAGCCAAGAGG

AGATGGATCAAGCAT TAGCAGAAAGCAGCGAAGAACAGGAAGAT GCAC T GAATATC TCC TCAAT GT CT T
T
ACT TGCACCAT TGGCACAAACAGT T GGT GT GGTAAGTCCAGAGAGT T TAGT GT CCACACC TAGAC T
GGAA
T T GAAAGACACCAGCAGAAGT GAT GAAAGTCCAAAACCAGGAAAAT TCCAAAGAACTCGTGTCCCTCGAG
CTGAATCTGGTGATAGCCT TGGT TCTGAAGATCGTGATCT TCT T TACAGCAT T GAT GCATATAGAT C
TCA
AAGAT T CAAAGAAACAGAACGTCCAT CAATAAAGCAGGT GAT T GT T CGGAAGGAAGAT GT TACT
TCAAAA
CTGGATGAAAAAAATAATGCCT T TCCT TGTCAAGT TAATATCAAACAGAAAATGCAGGAACTCAATAACG
AAATAAATATGCAACAGACAGTGATCTATCAAGCTAGCCAGGCTCT TAAC T GC T GT GT T GAT
GAAGAACA
TGGAAAAGGGTCCCTAGAAGAAGCTGAAGCAGAAAGACT TCT TCTAAT TGCAACTGGGAAGAGAACACT T
T T GAT T GAT GAAT TGAATAAAT TGAAGAACGAAGGACCTCAGAGGAAGAATAAGGCTAGTCCCCAAAGTG
AAT T TAT GCCATCCAAAGGATCAGT TACT T T GT CAGAAATCCGC T TGCCTCTAAAAGCAGAT T T
TGTCTG
CAGTACGGT TCAGAAACCAGATGCAGCAAAT TAC TAT TACT TAAT TATACTAAAAGCAGGAGCTGAAAAT
AT GGTAGCCACACCAT TAGCAAGTACT TCAAACTCTCT TAACGGT GAT GC T CT GACAT T CAC TAC
TACAT
T TACTCTGCAAGATGTATCCAATGACT T TGAAATAAATAT TGAAGT T TACAGCT TGGTGCAAAAGAAAGA
TCCCTCAGGCCT T GATAAGAAGAAAAAAACATCCAAGTCCAAGGC TAT TACTCCAAAGCGACTCCTCACA
IC TATAACCACAAAAAGCAACAT =AT ICI ICAGICAIGGCCAGICCAGGAGGIC T TAGIGCTGIGCGAA
CCAGCAACTTCGCCCT T GT TGGATCT TACACAT TATCAT TGTCT TCAGTAGGAAATACTAAGT T T GT
TCT
GGACAAGGTCCCCTTTT TATCT TCT T TGGAAGGTCATAT T TAT T TAAAAATAAAATGTCAAGTGAAT
TCC
AGT GT TGAAGAAAGAGGT T T TCTAACCATAT T T GAAGAT GT TAGTGGT T T
TGGTGCCTGGCATCGAAGAT
GGT GT GT TCT T TCTGGAAACTGTATATCT TAT TGGACT TATCCAGAT GAT
GAGAAACGCAAGAATCCCAT
AGGAAGGATAAATCTGGCTAAT TGTACCAGTCGTCAGATAGAACCAGCCAACAGAGAAT T T T GT GCAAGA
CGCAACACTTTTGAAT TAAT TAC T GT CCGACCACAAAGAGAAGAT GACCGAGAGAC TC T
TGTCAGCCAAT
GCAGGGACACAC T C T GT GT TACCAAGAACTGGCTGTCTGCAGATACTAAAGAAGAGCGGGATCTCTGGAT
GCAAAAACTCAATCAAGT T CT T GT TGATAT TCGCC TC T GGCAACC T GAT GC T T GC
TACAAACC TAT TGGA

AAGCCTTAAACCGGGAAATTTCCATGCTATCTAGAGGTTTTTGATGTCATCTTAAGAAACACACTTAAGA
GCATCAGATTTACTGATTGCATTTTATGCTTTAAGTACGAAAGGGTTTGTGCCAATATTCACTACGTATT
ATGCAGTATTTATATCTTTTGTATGTAAAACTTTAACTGATTTCTGTCATTCATCAATGAGTAGAAGTAA
ATACAT TATAGT T GAT T T T GCTAAAT CT TAAT T TAAAAGCCTCATT T TCCTAGAAATCTAAT
TAT T CAGT
TAT TCATGACAATAT T T T T T TAAAAGTAAGAAAT TCTGAGT TGTCT
TCTTGGAGCTGTAGGTCTTGAAGC
AGCAACGTCTTTCAGGGGT TGGAGACAGAAACCCATTCTCCAATCTCAGTAGT TTTTTCGAAAGGCTGTG
ATCATT TAT TGAT CGTGATATGACT T GT TACTAGGGTAC TGAAAAAAATGT CTAAGGCC T T
TACAGAAAC
AT T T T TAGTAATGAGGATGAGAACT T TTTCAAATAGCAAATATATATTGGCTTAAAGCATGAGGCTGTCT
TCAGAAAAGTGATGTGGACATAGGAGGCAATGTGTGAGACTTGGGGGTTCAATATTTTATATAGAAGAGT
TAATAAGCACATGGT T TACAT T TACT CAGCTAC TATATATGCAGTGTGGTGCACAT T T T CACAGAAT
TCT
GGCT TCAT TAAGATCAT TAT T T T TGC TGCGTAGCT TACAGACT TAGCATAT TAGTTTTT
TCTACTCCTAC
AAGTGTAAATTGAAAAATCTTTATAT TAAAAAAGTAAAC TGT TATGAAGCT GC TATGTACTAATAATACT
TTGCTTGCCAAAGTGTTTGGGTTTTGTTGTTGTTTGTTTGTTTGTTTGTTTTTGGTTCATGAACAACAGT
GTCTAGAAACCCAT T T TGAAAGTGGAAAAT TAT TAAGTCACCTATCACCTT TAAACGCCTTTTTTTAAAA
TTATAAAATATTGTAAAGCAGGGTCTCAACTTT TAAATACACT T TGAACT T CT TCTCTGAAT TAT TAAAG

T TCT T TATGACCT CAT T TATAAACAC TAAAT TC TGTCACCTCCTGT CAT T T TAT T T T T
TAT TCAT T CAAA
TGTAT T T T T TCT T GTGCATAT TATAAAAATATAT T T TAT GAGCTCT
TACTCAAATAAATACCTGTAAATG
IC TAAAGGAAAAAAAAAAAAAAAAAA
NM_004323 GGTCAACAAGTGCGGGCCTGGCTCAGCGCGGGGGGGCGCGGAGACCGCGAGGCGACCGGGAGCGGCTGGG
TTCCCGGCTGCGCGCCCTTCGGCCAGGCCGGGAGCCGCGCCAGTCGGAGCCCCCGGCCCAGCGTGGTCCG
CCTCCCTCTCGGCGTCCACCTGCCCGGAGTACTGCCAGCGGGCATGACCGACCCACCAGGGGCGCCGCCG
CCGGCGCTCGCAGGCCGCGGATGAAGAAGAAAACCCGGCGCCGCTCGACCCGGAGCGAGGAGTTGACCCG
GAGCGAGGAGTTGACCCTGAGTGAGGAAGCGACCTGGAGTGAAGAGGCGACCCAGAGTGAGGAGGCGACC
CAGGGCGAAGAGATGAATCGGAGCCAGGAGGTGACCCGGGACGAGGAGTCGACCCGGAGCGAGGAGGTGA
CCAGGGAGGAAATGGCGGCAGCTGGGCTCACCGTGACTGTCACCCACAGCAATGAGAAGCACGACCTTCA
TGTTACCTCCCAGCAGGGCAGCAGTGAACCAGTTGTCCAAGACCTGGCCCAGGTTGTTGAAGAGGTCATA
GGGGTTCCACAGTCTTTTCAGAAACTCATATTTAAGGGAAAATCTCTGAAGGAAATGGAAACACCGTTGT
CAGCACTTGGAATACAAGATGGTTGCCGGGTCATGTTAATTGGGAAAAAGAACAGTCCACAGGAAGAGGT
TGAACTAAAGAAGTTGAAACATTTGGAGAAGTCTGTGGAGAAGATAGCTGACCAGCTGGAAGAGTTGAAT
AAAGAGCTTACTGGAATCCAGCAGGGTTTTCTGCCCAAGGATTTGCAAGCTGAAGCTCTCTGCAAACTTG
ATAGGAGAGTAAAAGCCACAATAGAGCAGTTTATGAAGATCTTGGAGGAGATTGACACACTGATCCTGCC
AGAAAATTTCAAAGACAGTAGATTGAAAAGGAAAGGCTTGGTAAAAAAGGTTCAGGCATTCCTAGCCGAG
TGTGACACAGTGGAGCAGAACATCTGCCAGGAGACTGAGCGGCTGCAGTCTACAAACTTTGCCCTGGCCG
AGTGAGGTGTAGCAGAAAAAGGCTGTGCTGCCCTGAAGAATGGCGCCACCAGCTCTGCCGTCTCTGGAGC
GGAATTTACCTGATTTCTTCAGGGCTGCTGGGGGCAACTGGCCATTTGCCAATTTTCCTACTCTCACACT
GGTTCTCAATGAAAAATAGTGTCTTTGTGATTTTGAGTAAAGCTCCTATCTGTTTTCTCCTTCTGTCTCT
GTGGTTGTACTGTCCAGCAATCCACCTTTTCTGGAGAGGGCCACCTCTGCCCAAATTTTCCCAGCTGTTT
GGACCTCTGGGTGCTTTCTTTGGGCTGGTGAGAGCTCTAATTTGCCTTGGGCCAGTTTCAGGTTTATAGG
CCCCCTCAGTCTTCAGATACATGAGGGCTTCTTTGCTCTTGTGATCGTGTAGTCCCATAGCTGTAAAACC
AGAATCACCAGGAGGTTGCACCTAGTCAGGAATATTGGGAATGGCCTAGAACAAGGTGTTTGGCACATAA
GTAGACCACT TAT CCCICAT IGIGACCIAAT ICCAGAGCATCIGGCTGGGI I= IGGGI ICIAGAC I I
TG
TCCTCACCTCCCAGTGACCCTGACTAGCCACAGGCCATGAGATACCAGGGGGCCGTTCCTTGGATGGAGC
CTGTGGTTGATGCAAGGCTTCCTTGTCCCCAAGCAAGTCTTCAGAAGGTTAGAACCCAGTGTTGACTGAG
ICIGIGCTIGAAACCAGGCCAGAGCCAIGGATTAGGAAGGGCAAAGAGAAGGCACCAGAATGAGTAAAGC
AGGCAGGTGGTGAAGCCAACCATAAACT TCTCAGGAGTGACATGTGCT TCC T T CAAAGGCAT T T T T GT
TA
ACCATATCCTTCTGAGTTCTATGTTTCCTTCACAGCTGT TCTATCCAT T T T GT GGACTGTCCCCCACCCC
CACCCCATCATTGTTTTTAAAAAATTAAGGCCTGGCGCAGCAGCTCATGCCTATAATCCCAGCACTTTGG
GAGGCTGAGGCGGGCGGATCACTTGAGGCCAGGAGTTTGAGACCAGCCCAGGCAACATAGCAAAACCCCA
T TCTGC T T TAAAAAAAAAAAAAAAAAAAAT TAGCT TGGCGTAGTGGCATGT GCCTATAATCCCAGCTACT
GGGGAGGCTGAGGCACAAGAATCATTTGAACCTGGGAGGTAGAGGTTGCTGTGAGCCGAGATTACGCCCC
TGCACTCCAGCCTGGGTCACAGAGTGAGACTCCATCTCAGAAAAAAAAAAAATTGAGTCAGGTGCAGTAG
=CT I CCTGIAGICCCAGCTACT TGGGAGGCT GAGGCTAGAGGAT CACI I GAGCCCAGGAGT I TGAGIC
TAGTCTGGGCAACATAGCAAGACCCCATCTCTAAAATTTAAGTAAGTAAAAGTAGATAAATAAAAAGAAA
AAAAAACTGTTTATGTGCTCATCATAAAGTAGAAGAGTGGTTTGCT T T T T T TT T T T T T T T
TGGAT TAATG
AGGAAATCATTCTGTGGCTCTAGTCATAATTTATGCTTAATAACAT TGATAGTAGCCCT TTGCGCTATAA
CTCTACCTAAAGACTCACATCAT T TGGCAGAGAGAGAGT CGT TGAAGTCCCAGGAAT TCAGGACTGGGCA
GGTTAAGACCTCAGACAAGGTAGTAGAGGTAGACTTGTGGACAAGGCTCGGGTCCCAGCCCACCGCACCC
CACTI TAATCAGAGTGGT TCACTAT TGAICIAT I I I IGIGIGATAGCIGT GT GGCGTGGGCCACAACAT

T TAATGAGAAGT TACTGTGCACCAAACTGCCGAACACCAT TCTAAACTAT T CATATATAT TAGTCAT T TA

AT TCT TACATAAC T TGAGAGGTAGACAGATATCCT TAT T TTAGAGATGAGGAAACCAAGAGAACTTAGGT
CAT TAGCGCAAGGT TGTAGAGTAAGCGGCAAAGCCAAGACACAAAGCTGGGTGGT T TGGT T TCAGAGCCA
GIGOT I I ICCCCI CIACIGTACTGCCTCICAACCAACACAGGGI TGCACAGGCCCAT IC ICTGAT I I
TIT
TCCTCT TGTCCTC TGCCTC TCCCTCTAGCTCCCACT TCCTCTCTGC TCTAGT T CAT T T T CT T
TAGAGCAG
CCCGAGIGATCAT GAAGIGCAAATCT IGCCATGICAGICCCCIGCT TAGAACCCICCAAIGGCTCACT I I
CT= I TAGGCAAAAGICT I TACCCCATGCCT IC ICCCAT CTCATCT CAACCCCCTCAT I
IGTIGGCTGIC
TGCTGTCAGCCACTCTTCT TTCAGGTCCTCAGATGCACTGCACCCTCTCCTGCCTGGGGGTCTTTGCTCC
TGCTACTACCTCTGCTTGAACAGCTCCTCACCT TCCTTCCTCCAACCCTACCCTTGTATAGGTGACTTTT

GT TCATCCT TCAGAAT TCAACTCACATGTCTCT TGCATGGAGAACCCTCACCTACTGTGT TGAGACCCTG
TCCAGCCCCCAGGTGGGATCCTCTCTCGACT TCCCATACAT T TCT T TCACAGCAT T TACATAGTCCATGA
TAGT T TACT TGTGGGAT TAT T TGGT TAATCT T TGCCT T TAACACCAGGGT T CC T
TGGGTGAAGGAGCT TC
T T TATCT TGGTAACAGCAT TAT T TCAAGCATAACT TGTAATATAGT TATAT
TACATATATAACATATATA
TATATAACATAACATATATAACATATATAACAAGCATAACT TGT TATATAGTCT TGTATATAGTAAGACC
T CAAT AAAT AT II GGAGAACAAAAAAAAAAAAAAA
NM_000633 T T TCTGTGAAGCAGAAGTCTGGGAATCGATCTGGAAATCCTCCTAAT T T T

CCTGAT TCAT TGGGAAGT T TCAAATCAGCTATAACTGGAGAGTGCTGAAGATTGATGGGATCGT TGCCT T
ATGCAT T TGT T T TGGT T T TACAAAAAGGAAACT TGACAGAGGATCATGCTGTACT
TAAAAAATACAACAT
CACAGAGGAAGTAGACTGATAT TAACAATACT TACTAATAATAACGIGCCTCATGAAATAAAGATCCGAA
AGGAAT TGGAATAAAAAT T TCCTGCATCTCATGCCAAGGGGGAAACACCAGAATCAAGT GT TCCGCGTGA
T TGAAGACACCCCCTCGTCCAAGAATGCAAAGCACATCCAATAAAATAGCTGGAT TATAACTCCTCT TCT
TTCTCTGGGGGCCGTGGGGTGGGAGCTGGGGCGAGAGGTGCCGTTGGCCCCCGTTGCTTTTCCTCTGGGA
AGGATGGCGCACGCTGGGAGAACAGGGTACGATAACCGGGAGATAGTGATGAAGTACATCCATTATAAGC
TGTCGCAGAGGGGCTACGAGTGGGATGCGGGAGATGTGGGCGCCGCGCCCCCGGGGGCCGCCCCCGCACC
GGGCATCTTCTCCTCCCAGCCCGGGCACACGCCCCATCCAGCCGCATCCCGGGACCCGGTCGCCAGGACC
TCGCCGCTGCAGACCCCGGCTGCCCCCGGCGCCGCCGCGGGGCCTGCGCTCAGCCCGGTGCCACCTGTGG
TCCACCTGACCCTCCGCCAGGCCGGCGACGACTTCTCCCGCCGCTACCGCCGCGACTTCGCCGAGATGTC
CAGCCAGCTGCACCTGACGCCCTTCACCGCGCGGGGACGCTTTGCCACGGTGGTGGAGGAGCTCTTCAGG
GACGGGGTGAACTGGGGGAGGATTGTGGCCTTCTTTGAGTTCGGTGGGGTCATGTGTGTGGAGAGCGTCA
ACCGGGAGATGTCGCCCCTGGTGGACAACATCGCCCTGTGGATGACTGAGTACCTGAACCGGCACCTGCA
CACCTGGATCCAGGATAACGGAGGCTGGGATGCCTTTGTGGAACTGTACGGCCCCAGCATGCGGCCTCTG
TTTGATTTCTCCTGGCTGTCTCTGAAGACTCTGCTCAGTTTGGCCCTGGTGGGAGCTTGCATCACCCTGG
GTGCCTATCTGGGCCACAAGTGAAGTCAACATGCCTGCCCCAAACAAATATGCAAAAGGTTCACTAAAGC
AGTAGAAATAATATGCATTGTCAGTGATGTACCATGAAACAAAGCTGCAGGCTGTTTAAGAAAAAATAAC
ACACATATAAACATCACACACACAGACAGACACACACACACACAACAAT TAACAGTCT TCAGGCAAAACG
TCGAATCAGCTAT T TACTGCCAAAGGGAAATAT CAT T TAT TTTT TACAT TAT TAAGAAAAAAAGAT T
TAT
T TAT T TAAGACAGTCCCATCAAAACTCCTGTCT T TGGAAATCCGACCACTAAT TGCCAAGCACCGCTICG
TGTGGCTCCACCTGGATGT TCTGTGCCTGTAAACATAGAT TCGCT T TCCAT GT TGT TGGCCGGATCACCA
TCTGAAGAGCAGACGGATGGAAAAAGGACCTGATCAT TGGGGAAGCTGGCT TTCTGGCTGCTGGAGGCTG
GGGAGAAGGTGT T CAT TCACT TGCAT T TCT T TGCCCTGGGGGCTGTGATAT TAACAGAGGGAGGGT
TCCT
GTGGGGGGAAGTCCATGCCTCCCTGGCCTGAAGAAGAGACTCT T TGCATATGACTCACATGATGCATACC
TGGTGGGAGGAAAAGAGT TGGGAACT TCAGATGGACCTAGTACCCACTGAGAT T TCCACGCCGAAGGACA
GCGATGGGAAAAATGCCCT TAAATCATAGGAAAGTAT TTTTT TAAGCTACCAAT TGTGCCGAGAAAAGCA
T T T TAGCAAT T TATACAATATCATCCAGTACCT TAAGCCCTGAT TGTGTATAT TCATATAT T T
TGGATAC
GCACCCCCCAACT CCCAATACTGGCT CTGTCTGAGTAAGAAACAGAATCCT CT GGAACT TGAGGAAGTGA
ACAT T TCGGTGACT TCCGCATCAGGAAGGCTAGAGT TACCCAGAGCATCAGGCCGCCACAAGTGCCTGCT
TT TAGGAGACCGAAGICCGCAGAACCIGCCIGIGICCCAGCTIGGAGGCCTGGICCIGGAACTGAGCCGG
GGCCCT CACTGGCCTCCTCCAGGGAT GATCAACAGGGCAGTGTGGT CTCCGAATGTCTGGAAGCTGATGG
AGCTCAGAAT TCCACTGTCAAGAAAGAGCAGTAGAGGGGTGTGGCTGGGCCTGTCACCCTGGGGCCCTCC
AGGTAGGCCCGT T T ICACGIGGAGCAIGGGAGCCACGACCCT ICI TAAGACAT GIATCACIGTAGAGGGA
AGGAACAGAGGCCCTGGGCCCT TCCTATCAGAAGGACATGGTGAAGGCTGGGAACGTGAGGAGAGGCAAT
GGCCACGGCCCAT T T TGGC TGTAGCACATGGCACGT T GGC T GT GT GGCC T T GGCCCACC T GT
GAGT T TAA
AGCAAGGCTT TAAATGACT T TGGAGAGGGTCACAAATCCTAAAAGAAGCAT TGAAGTGAGGTGTCATGGA
T TAAT TGACCCCTGTCTATGGAAT TACATGTAAAACAT TATCT TGTCACTGTAGT T TGGT T T TAT T
TGAA
AACCTGACAAAAAAAAAGT TCCAGGTGTGGAATATGGGGGT TATCTGTACATCCTGGGGCAT TAAAAAAA
AAATCAAT GGT GGGGAAC TATAAAGAAGTAACAAAAGAAGT GACAT CI TCAGCAAATAAAC TAGGAAAT T

TTTTTT TCT TCCAGT T TAGAATCAGCCT TGAAACAT TGATGGAATAACTCT GT GGCAT TAT TGCAT
TATA
TACCAT T TATCTGTAT TAACT T TGGAATGTACTCTGT TCAATGT T TAATGCTGTGGT TGATAT T
TCGAAA
GCTGCT T TAAAAAAATACATGCATCTCAGCGT TTTTT TGT T T T TAAT TGTATT TAGT
TATGGCCTATACA
CTATTTGTGAGCAAAGGTGATCGTTTTCTGTTTGAGATTTTTATCTCTTGATTCTTCAAAAGCATTCTGA
GAAGGTGAGATAAGCCCTGAGTCTCAGCTACCTAAGAAAAACCTGGATGTCACTGGCCACTGAGGAGCT T
TGT T TCAACCAAGTCATGTGCAT T TCCACGTCAACAGAAT TGT T TAT TGTGACAGT TATATCTGT
TGTCC
CT T TGACCT TGT T TCT TGAAGGT T TCCTCGTCCCTGGGCAAT TCCGCAT T TAAT TCATGGTAT
TCAGGAT
TACATGCATGT T TGGT TAAACCCATGAGAT TCAT TCAGT TAAAAATCCAGATGGCAAATGACCAGCAGAT
TCAAATCTATGGTGGT T TGACCT T TAGAGAGT TGCT T TACGTGGCCTGT T
TCAACACAGACCCACCCAGA
GCCCICCIGCCCT CCT TCCGCGGGGGCT T ICICAIGGCT GICCITCAGGGI CT TCCTGAAATGCAGTGGT
GCTIACGCTCCACCAAGAAAGCAGGAAACCIGIGGIATGAAGCCAGACCICCCCGGCGGGCCICAGGGAA
CAGAATGATCAGACCT T TGAATGAT TCTAAT T T T TAAGCAAAATAT TAT T T TATGAAAGGT T
TACAT TGT
CAAAGT GATGAATATGGAATATCCAATCCTGTGCTGCTATCCTGCCAAAAT CAT T T TAATGGAGTCAGT T
TGCAGTATGCTCCACGTGGTAAGATCCTCCAAGCTGCT T TAGAAGTAACAATGAAGAACGTGGACGT T T T
TAATATAAAGCCT GT T T TGTCT T T TGT TGT TGT TCAAACGGGAT TCACAGAGTAT T
TGAAAAATGTATAT
ATAT TAAGAGGTCACGGGGGCTAAT TGCTGGCTGGCTGCCT T T TGCTGTGGGGT T T TGT TACCTGGT T
T T
AATAACAGTAAATGTGCCCAGCCTCT TGGCCCCAGAACTGTACAGTAT TGTGGCTGCACT TGCTCTAAGA
GTAGT TGATGT TGCAT T T TCCT TAT T GT TAAAAACATGT
TAGAAGCAATGAATGTATATAAAAGCCTCAA
CTAGTCAT TTTTT TCTCCT CT TCT TTTTTT TCAT TATATCTAAT TAT T T TGCAGT
TGGGCAACAGAGAAC
CATCCC TAT T T TGTAT TGAAGAGGGAT TCACAT CTGCAT CT TAACTGCTCT
TTATGAATGAAAAAACAGT
CCTCTGTATGTACTCCTCT T TACACTGGCCAGGGTCAGAGT TAAATAGAGTATATGCACT T TCCAAAT TG

GGGACAAGGGCTCTAAAAAAAGCCCCAAAAGGAGAAGAACATCTGAGAACCTCCTCGGCCCTCCCAGTCC
CTCGCTGCACAAATACTCCGCAAGAGAGGCCAGAATGACAGCTGACAGGGTCTATGGCCATCGGGTCGTC
TCCGAAGAT TIGGCAGGGGCAGAAAACTCTGGCAGGCTIAAGAT TIGGAATAAAGICACAGAAT TAAGGA
AGCACCTCAAT T TAGT TCAAACAAGACGCCAACAT TCTCTCCACAGCTCACTTACCTCTCTGTGT TCAGA
TGTGGCCT TCCAT T TATATGTGATCT T TGT T T TAT TAGTAAATGCT
TATCATCTAAAGATGTAGCTCTGG
CCCAGTGGGAAAAAT TAGGAAGTGAT TATAAATCGAGAGGAGT TATAATAATCAAGAT TAAATGTAAATA
ATCAGGGCAATCCCAACACATGTCTAGCT T TCACCTCCAGGATCTAT TGAGTGAACAGAAT TGCAAATAG
TCTCTAT T TGTAAT TGAACT TATCCTAAAACAAATAGT T TATAAATGTGAACT TAAACTCTAAT TAAT
TC
CAACTGTACT T T TAAGGCAGTGGCTGT T T T TAGACT T TCT TATCACT TATAGT
TAGTAATGTACACCTAC
TCTATCAGAGAAAAACAGGAAAGGCTCGAAATACAAGCCAT TCTAAGGAAATTAGGGAGTCAGT TGAAAT
TCTAT TCTGATCT TAT TCT GTGGTGT CT T T TGCAGCCCAGACAAATGTGGT TACACACT T T T
TAAGAAAT
ACAAT TCTACAT TGTCAAGCT TATGAAGGT TCCAATCAGATCT T TAT TGT TAT TCAAT T TGGATCT
T TCA
GGGAT TTTTTTTT TAAAT TAT TATGGGACAAAGGACAT T TGT TGGAGGGGTGGGAGGGAGGAAGAAT T T
T
TAAATGTAAAACAT TCCCAAGT T TGGATCAGGGAGT TGGAAGT T T TCAGAATAACCAGAACTAAGGGTAT
GAAGGACCTGTAT TGGGGTCGATGTGATGCCTCTGCGAAGAACCT TGTGTGACAAATGAGAAACAT T T TG
AAGT T TGTGGTACGACCT T TAGAT TCCAGAGACATCAGCATGGCTCAAAGTGCAGCTCCGT T TGGCAGTG
CAATGGTATAAAT T TCAAGCTGGATATGTCTAATGGGTAT T TAAACAATAAATGTGCAGT T T TAACTAAC
AGGATAT T TAATGACAACCT TCTGGT TGGTAGGGACATCTGT T TCTAAATGTT TAT TAT
GTACAATACAG
AAAAAAAT T T TATAAAAT TAAGCAATGTGAAACTGAAT TGGAGAGTGATAATACAAGTCCT T TAGT CT
TA
CCCAGTGAATCAT TCTGT TCCATGTCT T TGGACAACCATGACCT TGGACAATCATGAAATATGCATCTCA
CTGGATGCAAAGAAAATCAGATGGAGCATGAATGGTACTGTACCGGT TCAT CT GGACTGCCCCAGAAAAA
TAACT TCAAGCAAACATCCTATCAACAACAAGGT TGT TCTGCATACCAAGCTGAGCACAGAAGATGGGAA
CACTGGTGGAGGATGGAAAGGCTCGCTCAATCAAGAAAAT TCTGAGACTAT TAATAAATAAGACTGTAGT
GTAGATACTGAGTAAATCCATGCACCTAAACCT T T TGGAAAATCTGCCGTGGGCCCTCCAGATAGCTCAT
T TCAT TAAGT T T T TCCCTCCAAGGTAGAAT T TGCAAGAGTGACAGTGGAT TGCAT T TCT T T
TGGGGAAGC
TTTCTTTTGGTGGTTTTGTTTATTATACCTTCTTAAGTTTTCAACCAAGGTTTGCTTTTGTTTTGAGTTA
CTGGGGT TAT T T T TGT T T TAAATAAAAATAAGTGTACAATAAGTGT T T T TGTAT TGAAAGCT T
T TGT TAT
CAAGAT T T TCATACT T T TACCT TCCATGGCTCT T T T TAAGAT TGATACT T T
TAAGAGGTGGCTGATAT TC
TGCAACACTGTACACATAAAAAATACGGTAAGGATACT T TACATGGT TAAGGTAAAGTAAGTCTCCAGT T
GGCCACCAT TAGCTATAATGGCACT T TGT T TGT GT TGT TGGAAAAAGTCACAT TGCCAT TAAACT T
TCCT
TGTCTGTCTAGT TAATAT TGTGAAGAAAAATAAAGTACAGTGTGAGATACTG

AT TTTTT TCACT TAACGT T CAT TATGTGATAGGAGT T T TCCATCCTAT
TATACCGCTGTGCGATCTGATC
T TGGGCACGT TAACCAACCTCT TGT TGCCTCGAT T T TCTCACCTGTAAAAGTGGGGGTAATCATAATGCT
TACT TAGTAGGATAGCCCTGAAGAATAAGTGACT TAGCGAACATAAATAGCTTACAATAGGGT T T TCAGC
AIGGGAAGGATICAGTAAATGITAGCTGICATCATCACCACCIACAAAGGAAGCAATACTGIGCTGAAAG
TTTT TCCATCAT TAATGTAAT T TCTATAGTACGAT TCCCAAGAAGATAT TAAAAT TATGGAAATAAAGGT
AT TGGTATAT TCCTAAT TAT T TCCTAAAAGAT TGTAT TGATAAATATGCTCATCCT TCCCT
TAACGGGAT

AAT TACAATAGGGTCTCTAAT TATACT TCAACT TTTT TAGGAATAAT TCTCAGTGTGT T T TCCCACAT
T T
CATATGTAAT TTTTTTTTTTTTTTTTTTT TGAGACAGAGCCTCGCCCTGTCACCAGGCTGGAGTACAGTG
GCGCGATCTCGGCTCACTGCAACT TCCACCTGCTGGGT TCAAGCAAT TCT TCTGACCTCAGGTGATCCAC
CCGCCTCGGCCTCCCAAAGTGCTGGGAT TATAACAGGCGTGGCATGAGTCACCGCGCCCGGCCGAT CT T T
ACT T TT T TAT ICI TIGTACCCCCIGCCIATCCAGITAGCATGIGAT TAAAGICAAAGAT TIGCCACTITG

GGCCACATCTAT TAAT T T TCATCT T T GT TATAAT TGTAT T TAGT T T T TGATCTACACTGCT
TAT TACTCC
CAGTCAT TTTT TATAGAACTGAAAATCTGGTAAAATACTCAAAAT TGCACTGACT TCTATGTAGAGGCGA
CACTCCATCAGAACCGTGGGCTGACAGGGAATCCCACTGTGCAGGAGCTGCGCGCAT T T TCAT T TCTGAT
TCTCT T TGGCGTATCCAGGACTCTGATGACATGATCATATAT T TAT CAGTAGTAACAGGT TGGGCCAT T T

GT TTTT TGTGGTAAATCATATAT T TAAGAT T T TAGAAATAAGT TGATAGCCATGTAT T T TGGAAT T
TGAA
AAAGACAT TGCAT TACTCAGCT TCAAAT TAAGCT T TAATCAAATAGTGAAACT T TCCAT
TAATGGACAGT
GTATACCT T T T TGTGTAT T TAAAAAAAAAAACACTGAATATAGTGCCT T TGTGACAGGGGAGCT TGGT
TC
CTGACAATGTCCT CT TGAGCCT TTTTTTTTTTTT TGAGATGGAGTC TCACT GT GTCACCCAGGCTGGAGT
GCAGTGGCGCCAT CT TGGCTCACTGCAACCTCCGCCCCCTGGGT TCAAGTGAT TCTCAT TCCTCAGCT TC
CTAAGTAGCTGGGAT TACAGGCACGCACCACCATGACCAGCTAAT T TT TATACTIT TAGTAGAGACAGGG
T T T TGCCATGT TGGCTAGGT TGGTCT CGAACTCCTGACCTCAAGTAATCCACCCACCAT GGCCTCCCCAA
AGTGCT GGGAT TACAGGCGTGAGCCAT T TCACCCGGCCT =CT TCCGICT T TGAGCTGT GAGGAAATAGC

TACAT TACATGAGCTGCTAGATCTGCCT TATGGTCAGAAATGAAGGT TGAACTCTCAGGAACAGTGACAT
ATATACACACTGATAT T TCCAAAGTACAATGCCCCAAAT TGATCCACAAAGGAAT TAAGGTCAT T TGCAA
CAAAAT CACAGAATAGTAACAAATAAATAGAAGATAAATATGGCCAGGGAT GC TGCAAACTGATATACTG
CCAAGT T TATCAGT TGGGAATCCCAACAGTGAAAAGCATAAAAATGAAAGGAAT T T TAAGGAGACT TTTT
ATAGAAGAGTGGGAAGGAT TGGAGGAGCCAACAAGTGATGGTGAGGCACACAGGGAAGAGCT TCAGTGGG
CACCATCCCCTCTCTGGT T TGAAGGGGTAGGGAGGGGACCAGAGCTGGGAGGAGGGGGCTGGAATACTGC
IGGAGGAGCCACTCCCT TCCAGACCT GC T GIGGCCATCACAGAAT GCAGCCAC TGCCAGAGCAGCAGCCC
GAGGAACCAGGCAGGGGGAGCACAAGTACCCTAGCCTCT CTCT T TCTGT T T CT TGCCTGCCGATCTCCTC
CAC T GGC TAAACCCAGC T GGAT GC TAAGAGTACAGT CAGCC T GCC T GC T
GAGGAGGGACCACCAGGGACC
ACCATCAGCAAGGGATCCAATGTCT T TCTGCCTCTGCAGAATGAAGGT TGGGGCGCGGGGGGCGCTCTAC
T TCT TAGGGATAT TGTGGGAATAAAAGGAAATAGGCAAAAAATGT T T T TGAAAAACAAAGCACATACTGC
GCACCCGTGGGCCACTACTGCT T T TGACCCCTGGCTCTGT T TCATGAAGTAATGTCGTGTCAT TCT CT T
T

T TAGGTGCTACAGGAT T TCT T TAGGT T TGT T T TCTGTCCACCATAT T TCAACTCATGTGTGCTGT
T TGT T
GTGCTAAAACAAATAT T TGCTGATGCCTGAGTGAATAGT TGAATAT T T TATATAAGTCAAAT T TATACGT

AATGAT T T T TCT TGTAACT TAGCCGT T TCTCT T T TACAAACTCAGAAAACCTCAGACT T
TGAAAAGGCCT
TGAAGT TCCTCACCTGAAATCTGAGAACT TGGAGCGCCT TAAAAAATCTAAAGGAAAACAAAACAGTGAA
AGAACATGATATAGTCAGTGTAGAGAATAAAAT TAT T TATGTAAT TAATAT TGAGGATGCAGATAACACA
T TGTGAAATCT TGCT TGTAAAAAATC TCGATCT GCTGAAGAAAGAT GT TCT CT CTAGAGATCT T
TGAAAG
CATAAT TAT TGAGCT T T TAAAATGT TAGAAACAAAAGT TAGACCCACACATAT TCTGGCGTGTGGAAGAT

T TGCAT TCCT TCCCCTGCCCGCCCCGCCCCCACACT TGTGAGT TGTGCCTGTGTACGCAGT TCCTGTAGC
ACTCGGCTGGGCAGAAATCATCT T TCAGCACTAAGGGAACATAGT TATGAT CT GGACCT ICIGGGAGIGG
TCAGTGCCCAAGAACAGGTATGGGACTCCAGAAAGT TCTGCTCTCAACCCTAT T T TGAAATAGAGT TACA
CAT TGT TCTACAAT TAT T TGAGT TAATAAGCAGCTCT T T TCAAACGTGAT TAT GCCCT TCCAAGT
T TAAA
TACACTAGACT T TAGTGAAAGTAAT TGACCTCATCTCAT T TCTCTCCTGT TATAT TAAGATCACT T
TCAG
TAAAAGGTAGAAGCT II TGAAGTGGTGAGGAGGAGGTAGAGGAGGGACATAGAGCAGATAGGGGCTGGAA
AGTGGGGTGAGGAAGAGAGTGGCT TCTCT T TGGCAGAGTACCAAGGAAAAGCCCTATCTGTACAGAACCT
T TGTGCCTGGGAACT TGATGGCTGCAACCTGAGCCTCAACCTAGT T TGCT TGCGGAGCCAGAAGAGAAGC
TAAAAACCT TCAGT TAACCAAGCCAGACACCAAGAAAGT TAAACCGAAAGAGAACCCCCCACCCCCCGCA
AAAAAAAGAAGTAAAGT GGGT TAAAGT GATAT CAT GT TAGCACAGAAAGAGAACATAAGGGT CAT C
TAAG
T TCATCTGCCCCCTCT TCTAT T TCAAGGTGCAGAAACTAAGGCACAAGGGACCCCGTGTCCTGCTCT TGA
TCACATAGCTAGT GGGTGCCAAGCCAGGTCTAGAACTCT GT TCTCT GGGGTCACAGGCT GGCTCT TCATC
CCTCTAGAGAGATAGCTCATCTGTGTGCACCTGAGCCCGT TGTGT T TCGGAGTCAAAGCAAATAAAGGCT
CAAACTCCAAGACTGT T T TGCAGACCGGCTGCAGTAGATATGGGGGGAGGAGAAACCTGCT T TAAAT TGC
T TCAAGCAAGT TGT T TCTGCAAAGGT GT TGACT TTTT TCT T TCAACT T
TCTAGTGAGTCACTGCAGCCTG
AGCTGT TAT T TGT CAT TAT GCAATAAT TCAGGAACTAACTCAAGAT TCT TCTTTT TAAAT TAT T
TGT T TA
T T TAGAGACAGAGTCT TGCTCTGT TGCCCAGGCTGGAGTGCAGTGGTGTGATCTCGGCTCACTGCAGCCT
CIGCCTCCIGGGI ICAAGCAATICICATGICICAGCCICCCGAATAGCTGGIAT TGCAGGCTCGTGCCAC
CACCCCCTGCTAAT T T T TGTAAT T T TAGTGGAGACACGGT T TCGCCATGT TGGCCGGGCTCGTCT
TGAGC
TCCTGGCCTCAGGTGATCCGCCCGCC TCGGCCT CCCAAAGTGCTGGGAT TGCAGCCGTGAGCCTCCACAC
CCGGCC TAT T TAT T TAT T T T TAAAT T GGCTGCT CT TAGAAAGGCATACCAT GT T
TCTGGATGGGAAGGCT
TAT TAAT TCACCCTAAT T TAATGTATAAAT T TGATGCAATCATAGTCACAGTCCCAGTGGAAT TTTT TAA

CT TGGTAAGATGT TCTAAAAT TAATGAGAGAACT TGAAT TACCAGGTAT TGAAACACTGTAAAGCCACAA
TCATGTAAACAGTATGT TA TAACCAT GGGAATAGAGGTC TGTGATACAGCAGAAAAAAGTGAAAAAAAGA
ATAACTGTAT TCATAAAAAT T TAAATGTGGAGTCACTGGGGGAAAGGAT TAAATAT TCGATAATGTAGAA
ACAACTCAACTAT T TGGAGAAATGTAAAT T TAGAGCCT TATCTCATGCCATATACCAAAATACTAT T TAG
All T GAT TAAAAAATAAAAAAAAAAAAAAAAAAA

CGGCGCCGCCGCCGCCACT GCCGTCGCCGCCGC CGCC TGCCGGGAC IGGAGCGCGCCGT CCGCCGCGGAC
AAGACC C T GGCC T CAGGCC GGAGCAGCCCCAT CAT GCCGAGGGAGC GCAGGGAGCGGGAT
GCGAAGGAGC
GGGACACCATGAAGGAGGACGGCGGCGCGGAGT ICICGGCTCGCTCCAGGAAGAGGAAGGCAAACGTGAC
CGT TTTTT TGCAGGATCCAGATGAAGAAATGGCCAAAAT CGACAGGACGGCGAGGGACCAGT GT GGGAGC
CAGCCT IGGGACAATAATGCAGICIGIGCAGACCCCIGCTCCCIGATCCCCACACCTGACAAAGAAGAIG
ATGACCGGGT T TACCCAAACICAACGIGCAAGCCTCGGAT TAT TGCACCATCCAGAGGCTCCCCGCTGCC
TGTACTGAGCTGGGCAAATAGAGAGGAAGTCTGGAAAATCATGT TAAACAAGGAAAAGACATACT TAAGG
GATCAGCACT T TCT TGAGCAACACCCTCT TCTGCAGCCAAAAATGCGAGCAAT TCT TCTGGAT TGGT TAA

TGGAGGTGTGTGAAGTCTATAAACT TCACAGGGAGACCT T T TACT TGGCACAAGAT T TCT T
TGACCGGTA
TATGGCGACACAAGAAAAT GT TGTAAAAACTCT T T TACAGCT TAT TGGGAT TTCATCT T TAT T TAT
TGCA
GCCAAACT TGAGGAAATCTATCCTCCAAAGT TGCACCAGT T TGCGTATGTGACAGATGGAGCT TGT TCAG
GAGATGAAAT TCTCACCATGGAAT TAATGAT TATGAAGGCCCT TAAGTGGCGT T TAAGTCCCCTGACTAT
TGTGTCCTGGCTGAATGTATACATGCAGGT TGCATATCTAAATGACT TACATGAAGTGCTACTGCCGCAG
TATCCCCAGCAAATCT T TATACAGAT TGCAGAGCTGT TGGATCTCTGTGTCCTGGATGT TGACTGCCT TG
AAT T TCCT TATGGTATACT TGCTGCT TCGGCCT TGTATCAT T TCTCGTCAT CT GAAT
TGATGCAAAAGGT
T TCAGGGTATCAGTGGTGCGACATAGAGAACTGTGTCAAGTGGATGGT TCCAT T TGCCATGGT TATAAGG
GAGACGGGGAGCTCAAAACTGAAGCACT T CAGGGGC G T C GC T GAT GAAGAT
GCACACAACATACAGACCC
ACAGAGACAGCT T GGAT T T GCTGGACAAAGCCCGAGCAAAGAAAGC CAT= TGICTGAACAAAATAGGGC
T TCTCCTCTCCCCAGTGGGCTCCTCACCCCGCCACAGAGCGGTAAGAAGCAGAGCAGCGGGCCGGAAATG
GCGTGACCACCCCATCCT TCTCCACCAAAGACAGT TGCGCGCCTGCTCCACGT TCTCT TCTGTCTGT TGC
AGCGGAGGCGTGCGT T TGCT T T TACAGATATCTGAATGGAAGAGTGT T TCT TCCACAACAGAAGTAT T
TC
TGTGGATGGCATCAAACAGGGCAAAGTGT TTTT TAT TGAATGCT TATAGGT TT T T T T
TAAATAAGTGGGT
CAAGTACACCAGCCACCTCCAGACACCAGIGCGIGCTCCCGATGCTGCTATGGAAGGIGCTACTIGACCT
AAGGGACTCCCACAACAACAAAAGCT TGAAGCTGTGGAGGGCCACGGTGGCGTGGCTCTCCTCGCAGGTG
T TCTGGGCTCCGT TGTACCAAGTGGAGCAGGTGGT TGCGGGCAAGCGT TGTGCAGAGCCCATAGCCAGCT
GGGCAGGGGGCTGCCCTCTCCACAT TATCAGT TGACAGTGTACAATGCCT T TGATGAACTGT T T TGTAAG
TGCTGCTATATCTATCCAT TTTT TAA TAAAGAT AATACT GT T T T
TGAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

GCGITCGAGAGIGACCIGCACICGCTGCTICAGCIGGAIGCACCCATCCCCAAIGCACCCCCIGCGCGCT
GGCAGCGCAAAGCCAAGGAAGCCGCAGGCCCGGCCCCCT CACCCAT GC GGGCC GCCAAC CGAT CCCACAG
CGCCGGCAGGACTCCGGGCCGAACTCCTGGCAAATCCAGTICCAAGGITCAGACCACTCCTAGCAAACCT
GGCGGTGACCGCTATATCCCCCATCGCAGTGCTGCCCAGATGGAGGTGGCCAGCT TCCTCCTGAGCAAGG

AGAACCAGCCTGAAAACAGCCAGACGCCCACCAAGAAGGAACATCAGAAAGCCTGGGCT T TGAACCTGAA
CGGT T T TGAIGTAGAGGAAGCCAAGATCCITCGGCTCAGIGGAAAAACCACAAAAATGCGCCAGAGGGIT
ATCACGAACAGAC T GAAAGTACTCTACAGCCAAAAGGCCACTCCT GGCTCCAGCCGGAAGACCTGCCGT T
TACAT TCCT TCCCTGCCAAGACCGTATCCTGGAIGCGCCTGAAATCGAAT GAC TAT TAACT GAACC T GIG

GGAC TGGCAGTCCGGGGAAT GT CCGGGCCGGGCCACGGC CACGAGG T GT T CCG T GT GGAGT
GCAAGC T GG
GACACACCGTGCCGCT T GT GCACAGGGCCACGCGGGGAAATAAT CC CGGGGCGCGCAAAGCGGCAC TGGC
GAGAGCCGCACGGGCCGGT GC T GGGGGT GGTACAACAGGCCAAAACAACACACAAGGCCAACAAGACATA
CGCGCGCTGACACCACGGT GCAAAGCGCTCAGACGAGTAGTAACCGGCACT GI GGT T GC TGCCTCCCCAC
C TC T CC CGC T C T CAGCGTAAGATAAAAGAAAGAAGAGCAAAAAGCAAAGAAAGAAGAC
GAGACGAGACAC
ACAGGAACGAACAGTAAAGCAAGC TAAAGCAAACGCAAGACCAGACAACAGAAATAGAAAGAACCAACAG
AGAGGAGACAGAACAGGAC GCCAGCAACAT AGCAACAAACGAACAGAAGAGAG CAC TAAACAAAAGCAGC
AGCAAGACGAGACAGGAGAGAAGGAGGAAGGAGGGCCGAGCGAGCAGGGAGCGCGAGCAGCGAGGCGAAG
CAGCAGACAAGGGCAGGCGAAGGGCAACGAGAGGAGGCACCACACAAAAAGGAGAGGGGACAGGAGAAGC
AGCGAGAGAAGCGGAGGAGCAACAAGAGGAAGAAAAGGAGAGGGAGAGGAGGGAGAGAGCGGAAGGAGGA
AGAAACAGCACGAGGCGAC GAAGGGG GGAGACGC GGGGGCAGGAAAAGACACAGGAAGGCAGCGCGGAGG
AGGAGAAGGGGAAGCAGGAAGGAGAC GGAAGGAGAAGAGGGAGAGGACAGCGCAAGAGAGCGCGCGCGGC
GACAGCGAGGGACGGAGCGAGAGAGAGGAAACGGAAAGC GAGAGGGAAGAGGAGAGGCAACGCAGCGAAC
CAACCGAAAACAGCAGAAAGAGAGGAGAAGGAC GC GCAAAGAGGCAAGCGCAAGACGACAGGAAAC GAG
CGAGAGACGAGAAGCCGGT GACGAGCAGGAGAAAGGGAAGGCAGGAGACAGGACAGGCGGAAGAGAGACA
C G C GAGAC G CAAA GAG T GA G CAGAAC GAAG C GAAGAG CAAC G CAC GAGAGAAA C GAC
NM_001254 GAGCGCGGCTGGAGT T T GC T GCT GCCGCT GT GCAGT T T GT TCAGGGGCT

CT GCGT GT GAGAGACGT GAGAAGGAT CCT GCAC T GAGGAGGT GGAAAGAAGAGGAT T GC
TCGAGGAGGCC
IGGGGI =GT GAGGCAGCGGAGCT GGGTGAAGGCT GCGGGT TCCGGCGAGGCC TGAGCT GIGOT GT CGTC

AT GCCT CAAACCCGATCCCAGGCACAGGCTACAATCAGT T T TCCAAAAAGGAAGCTGTCTCGGGCAT T GA
ACAAAGC TAAAAAC T CCAGT GAT GCCAAAC TAGAACCAACAAAT GT CCAAACC GTAACC T GT
TCTCCTCG
T GTAAAAGCCCT GCCTCTCAGCCCCAGGAAACGTCT GGGCGAT GACAACCTAT GCAACACTCCCCAT T TA
CCTCCT T GT TCTCCACCAAAGCAAGGCAAGAAAGAGAAT GGTCCCCCTCAC TCACATACACT TAAGGGAC
GAAGAT TGGTAT T TGACAATCAGCTGACAAT TAAGTCTCCTAGCAAAAGAGAACTAGCCAAAGT TCACCA
AAACAAAATACT T ICI ICAGT TAGAAAAAGICAAGAGAT CACAACAAAT IC TGAGCAGAGAIGICCACT G

AAGAAAGAATCT GCAT GT GT GAGACTAT TCAAGCAAGAAGGCACT TGCTACCAGCAAGCAAAGCTGGTCC
T GAACACAGCT GT CCCAGATCGGCT GCCT GCCAGGGAAAGGGAGAT GGAT GTCATCAGGAAT T TCT T
GAG
GGAACACATCT GT GGGAAAAAAGCT GGAAGCCT T TACCT T TCTGGTGCTCCTGGAACTGGAAAAACTGCC
TGCT TAAGCCGGAT TCTGCAAGACCTCAAGAAGGAACTGAAAGGCT T TAAAACTATCATGCTGAAT TGCA
TGTCCT TGAGGACTGCCCAGGCTGTAT TCCCAGCTAT TGCTCAGGAGAT T T GT CAGGAAGAGGTAT CCAG

GCCAGC T GGGAAGGACAT GAT GAGGAAAT T GGAAAAACATAT GAC T GCAGAGAAGGGCC CCAT GAT
T GT G
T TGGTAT TGGACGAGATGGATCAACTGGACAGCAAAGGCCAGGATGTAT TGTACACGCTAT T TGAATGGC
CAT GGC TAAGCAAT TCTCACT T GGT GCT GAT TGGTAT TGCTAATACCCTGGATCTCACAGATAGAAT
TCT
ACCTAGGCT TCAAGCTAGAGAAAAAT GTAAGCCACAGCT GT TGAACT TCCCACCT TATACCAGAAATCAG
ATAGTCACTAT T T TGCAAGATCGACT TAATCAGGTATCTAGAGATCAGGT T CT GGACAAT GCT GCAGT
TC
AAT TCT GT GCCCGCAAAGT CTCT GCT GT T TCAGGAGAT GT TCGCAAAGCAC TGGAT GT T
TGCAGGAGAGC
TAT TGAAAT TGTAGAGTCAGATGTCAAAAGCCAGACTAT TCTCAAACCACT GT CT GAAT GTAAATCACCT
TCT GAGCCTCT GAT TCCCAAGAGGGT TGGTCT TAT TCACATATCCCAAGTCATCTCAGAAGT T GAT
GGTA
ACAGGATGACCT TGAGCCAAGAAGGAGCACAAGAT TCCT TCCCTCT TCAGCAGAAGATCT TGGT T TGCTC
TT T GAT GCTCT TGATCAGGCAGT T GAAAATCAAAGAGGT CACICIGGGGAAGT TATATGAAGCCTACAGT

AAAGTC T GTCGCAAACAGCAGGT GGCGGCT GT GGACCAGTCAGAGT GT T TGTCACT T TCAGGGCTCT
TGG
AAGCCAGGGGCAT T T TAGGAT TAAAGAGAAACAAGGAAACCCGT T T GACAAAGGT GT T T T
TCAAGAT T GA
AGAGAAAGAAATAGAACAT GCTCT GAAAGATAAAGCT T TAAT T GGAAATAT CT TAGCTACTGGAT
TGCCT
TAAAT ICI ICI= TACACCCCACCCGAAAGTAT TCAGCTGGCAT T TAGAGAGC TACAGT CT =AT T T
TAG
TGCT T TACACAT TCGGGCCTGAAAACAAATATGACCT TTTT TACT TGAAGCCAATGAAT T T TAATC
TATA
GAT TCT T TAATAT TAGCACAGAATAATATCT T TGGGTCT TACTAT T T T
TACCCATAAAAGTGACCAGGTA
GACCCT T T T TAAT TACAT TCACTACT TCTACCACT T GT GTATCTCTAGCCAAT GT GCT
TGCAAGTGTACA
GATCT GT GTAGAGGAAT GT GT GTATAT T TACCT CT TCGT T TGCTCAAACATGAGTGGGTAT T T T
T T T GT T
T GT T T T T T T T GT T GT T GT T GT T T T T GAGGCGCGTCTCACCCT GT
TGCCCAGGCTGGAGTGCAATGGCGCG
T TCTCTGCTCACTACAGCACCCGCT TCCCAGGT T GAAGT GAT TCTCT TGCCTCAGCCTCCCGAGTAGCTG
GGAT TACAGGTGCCCACCACCGCGCCCAGCTAAT TTTT TAAT T T T TAGTAGAGACAGGGT T T TACCAT
GT
T GGCCAGGCT GGT CT TGAACTCCTGACCCTCAAGT GATC T GCCCACCT TGGCCTCCCTAAGTGCTGGGAT

TATAGGCGTGAGCCACCATGCTCAGCCAT TAAGGIATIT 1= TAAGAACT T TAAGTT TAGGGTAAGAAGA
AT GAAAAT GATCCAGAAAAAT GCAAGCAAGTCCACAT GGAGAT T TGGAGGACACTGGT TAAAGAAT T
TAT
T TCT T T GTATAGTATACTAT GT TCATGGTGCAGATACTACAACAT T GT GGCAT T T TAGACTCGT
TGAGT T
TCT TGGGCACTCCCAAGGGCGT TGGGGTCATAAGGAGACTATAACTCTACAGAT T GT GAATATAT T TAT T

T TCAAGT TGCAT T CT T T GT CT T T T TAAGCAATCAGAT T TCAAGAGAGCTCAAGCT T
TCAGAAGTCAAT GT
GAAAAT TCCT TCCTAGGCTGTCCCACAGTCT T TGCTGCCCT TAGATGAAGCCACT T GT T
TCAAGATGACT
ACT T TGGGGT TGGGT T T TCATCTAAACACAT T T T TCCAGTCT TAT TAGATAAAT
TAGTCCATATGGT TGG
T TAATCAAGAGCCT TCTGGGT T TGGT T TGGTGGCAT TAAATGG
NM_031423 GCGGAATGGGGCGGGACT TCCAGTAGGAGGCGGCAAGT T T GAAAAGT GAT GACGGT

TTTTGACT T TGCT TGTAGCTGCTCCCCGAACTCGCCGTCT TCCT GT CGGCGGCCGGCAC T GTAGAT
TAAC
AGGAAACT TCCAAGATGGAAACT T TGTCT T TCCCCAGATATAATGTAGCTGAGAT T GT GAT TCATAT
TCG
CAATAAGATCT TAACAGGAGCT GAT GGTAAAAACCTCACCAAGAAT GATCT TTATCCAAATCCAAAGCCT

GAAGTCT TGCACATGATCTACATGAGAGCCT TACAAATAGTATATGGAAT TCGACTGGAACAT T T T TACA
I GAIGCCAGIGAACIC I GAAGICAT GTATCCACAT I TAAT GGAAGGC I ICI TACCAT TCAGCAAT T
TAGT
TACTCATCTGGACTCAT TTTTGCCTATCTGCCGGGTGAATGACT T TGAGACTGCTGATAT TC TAT GTCCA
AAAGCAAAACGGACAAGTCGGT T T T TAAGTGGCAT TATCAACT T TAT TCACTTCAGAGAAGCATGCCGTG

AAACGTATATGGAAT T TCT T T GGCAATATAAAT CC TC T GCGGACAAAAT GCAACAGT
TAAACGCCGCACA
CCAGGAGGCAT TAAT GAAAC I GGAGAGAC I I GAT I C I GT I CCAGT I GAAGAGCAAGAAGAGT
I CAAGCAG
CT T TCAGATGGAAT TCAGGAGCTACAACAATCACTAAATCAGGAT T T TCAT CAAAAAACGATAGT GC T
GC
AAGAGGGAAAT TCCCAAAAGAAGTCAAATAT T TCAGAGAAAACCAAGCGT T TGAATGAACTAAAAT TGTC
GGTGGT T TCT T TGAAAGAAATACAAGAGAGT T TGAAAACAAAAAT T GT GGAT T C TCCAGAGAAGT
TAAAG
AAT TAT AAAGAAAAAAT GAAAGATAC GG T CCAGAAGC T TAAAAATGCCAGACAAGAAGTGGTGGAGAAAT

AT GAAATC TAT GGAGAC TCAGT TGACTGCCTGCCT TCATGTCAGT TGGAAGTGCAGT TATATCAAAAGAA

AATACAGGACCT T TCAGATAATAGGGAAAAAT TAGCCAGTATCT TAAAGGAGAGCCTGAACT TGGAGGAC
CAAAT T GAGAGT GAT GAGT CAGAAC T GAAGAAAT TGAAGACTGAAGAAAAT TCGT TCAAAAGAC T
GAT GA
T T GT GAAGAAGGAAAAAC T TGCCACAGCACAAT TCAAAA TAAATAAGAAGCAT GAAGAT GI
TAAGCAATA
CAAACGCACAGTAAT TGAGGAT TGCAATAAAGT TCAAGAAAAAAGAGGT GC TGTC TAT GAACGAGTAACC
ACAAT TAAT CAAGAAAT CCAAAAAAT TAAAC T T GGAAT T CAACAAC TAAAAGAT GC T GC T
GAAAGG GAGA
AACTGAAGTCCCAGGAAATAT T TCTAAACT T GAAAAC T GC T T TGGAGAAATACCACGACGGTAT
TGAAAA
GGCAGCAGAGGAC TCC TAT GC TAAGATAGAT GAGAAGACAGC T GAAC T GAAGAGGAAGAT GT
TCAAAATG
TCAACC T GAT TAACAAAAT TACAT GT CT T T T T GTAAAT GGC T TGCCATCT T TTAAT T T
TC TAT T TAGAAA
GAAAAGT T GAAGCGAAT GGAAGTATCAGAAGTACCAAATAAT GT TGGCT TCATCAGT T T T
TATACACTCT
CATAAGTAGT TAATAAGATGAAT T TAATGTAGGCT T T TAT TAAT T TATAAT TAAAATAACT T GT
GCAGC T
AT TCATGTCTCTACTCTGCCCCT T GT TGTAAATAGT T TGAGTAAAACAAAACTAGT TACCT T
TGAAATAT
ATATATTTTTTTCTGTTACTATC

CC T GC T TCAAAGCT T T GGGATAACAGCGCC TCCGGGGGATAAT GAAT GCGGAGCC TCCGT T T
TCAGTCGA
CT TCAGAT GT GTC TCCAC T T T T T TCCGCTGTAGCCGCAAGGCAAGGAAACATT TCTCT T
CCCGTAC T GAG
GAGGC T GAGGAGT GCAC T GGGT GT TCT T T
TCTCCTCTAACCCAGAACTGCGAGACAGAGGCTGAGTCCCT
GTAAAGAACAGCTCCAGAAAAGCCAGGAGAGCGCAGGAGGGCATCCGGGAGGCCAGGAGGGGT TCGCTGG
GGCC TCAACCGCACCCACAT CGGTCCCACCTGCGAGGGGGCGGGACCTCGT GGCGCT GGACCAAT CAGCA
CCCACC TGCGCTCACCTGGCCT CCTCCCGCTGGCTCCCGGGGGCT GCGGT GCT CAAAGGGGCAAGAGCTG
AGCGGAACACCGGCCCGCCGTCGCGGCAGC I GC I TCACCCCT C IC I CTGCAGCCAIGGGGCTCCCTCGIG

GAO= I CGCGTC TCTCCT COT ICTCCAGGI I I GC TGGC TGCAGT GCGCGGCC
TCCGAGCCGTGCCGGGC
GGTCT TCAGGGAGGCTGAAGTGACCT I GGAGGCGGGAGGCGCGGAGCAGGAGCCCGGCCAGGCGCT GGGG
AAAGTAT TCAT GGGCTGCCC T GGGCAAGAGCCAGC TC T GT T TAGCAC T GATAAT GAT GAC T
TCAC T GT GC
GGAATGGCGAGACAGTCCAGGAAAGAAGGTCACTGAAGGAAAGGAATCCAT TGAAGATCT TCCCATCCAA
ACGTAT C T TACGAAGACACAAGAGAGAT IGGGI GGT I GC ICCAATAICIGT
CCCIGAAAAIGGCAAGGGI
CCCITCCCCCAGAGACTGAATCAGCTCAAGTCTAATAAAGATAGAGACACCAAGAT TTTCTACAGCATCA
CGGGGCCGGGGGCAGACAGCCCCCCT GAGGGTGTCT TCGCTGTAGAGAAGGAGACAGGCTGGT T GT T GT T
GAATAAGCCACTGGACCGGGAGGAGAT TGCCAAGTATGAGCTCT T T GGCCACGCT GT GT CAGAGAAT GGT

GCC TCAGTGGAGGACCCCAT GAACAT C TCCATCATAGT GACCGACCAGAAT GACCACAAGCCCAAGT T TA

CCCAGGACACC I I CCGAGGGAGIGIC I TAGAGGGAGTCC TACCAGGTAC I I CT GIGAIGCAGAT
GACAGC
CACAGATGAGGAT GATGCCATCTACACCTACAATGGGGT GGT T GC T TACTCCATCCATAGCCAAGAACCA
AAGGACCCACACGACCTCAIGTICACAATICACCGGAGCACAGGCACCATCAGCGICATCTCCAGIGGCC
TGGACCGGGAAAAAGTCCC I GAG TACACAC I GACCATCCAGGCCACAGACATGGAT GGG GACGGCT CCAC

CACCACGGCAGTGGCAGTAGTGGAGATCCT T GAT GCCAAT GACAAT GCTCCCAT GT T TGACCCCCAGAAG

TACGAGGCCCATG TGCC TGAGAAT GCAGTGGGC CAT GAGGTGCAGAGGCTGAC GGTCAC I GATC
TGGACG
CCCCCAACTCACCAGCGTGGCGTGCCACCTACC T TAT CAT GGGCGG T GACGAC GGGGAC CAT T T
TACCAT
CACCACCCACCCT GAGAGCAAC CAGGGCAT C C I GACAAC CAGGAAGGGT T TGGAT II
TGAGGCCAAAAAC
CAGCACACCCTGTACGT T GAAGT GAC CAACGAGGCCCC T T T T GT GC
TGAAGCICCCAACCTCCACAGCCA
CCATAGTGGTCCACGIGGAGGAIGIGAAT GAGGCACCIGIGT I I= CCCACCC TCCAAAGICGT I GAG=
CCAGGAGGGCATCCCCAC T GGGGAGCC T GT GT GT GTC TACAC T
GCAGAAGACCCTGACAAGGAGAATCAA
AAGAT CAGC TACCGCATCC T GAGAGACCCAGCAGGGT GGC TAGCCAT GGACCCAGACAG T GGGCAGGT
CA
CAGCTGTGGGCACCCTCGACCGTGAGGATGAGCAGT T T GT GAGGAACAACATC TAT GAAGT CAT GG TC
T T
GGCCAT GGACAAT GGAAGCCCTCCCACCACTGGCACGGGAACCCT T C T GC TAACAC T GAT T GAT GT
CAAC
GACCAT GGCCCAGTCCCTGAGCCCCGTCAGATCACCATC TGCAACCAAAGCCC T GT GCGCCAGGT GC T GA

ACATCACGGACAAGGACCT GT CTCCCCACACCT CCCC T T TCCAGGCCCAGCTCACAGAT GACTCAGACAT
CTACTGGACGGCAGAGGTCAACGAGGAAGGTGACACAGTGGTCT TGTCCCT GAAGAAGT TCCTGAAGCAG
GATACATAT GACGT GCACC I I I CT= GTO I GACCATGGCAACAAAGAGCAGCT GACGGT
GATCAGGGCCA
=GT= GCGAC I GCCAT GGCCAT GTCGAAACC T GCCCIGGACCCIGGAAAGGAGGI I TCATCCTCCCT GT

GC T GGGGGCTGTCC TGGC T CT GC T GT TCC TCCT GCTGGT GC T GC T T T T GT
TGGTGAGAAAGAAGCGGAAG
ATCAAGGAGCCCC TCC TAC TCCCAGAAGATGACACCCGT GACAACGTC T TC TAC TAT
GGCGAAGAGGGGG
GT GGCGAAGAGGACCAGGAC TATGACATCACCCAGCTCCACCGAGGTC T GGAGGCCAGGCCGGAGGT GGT
ICICCGCAATGACGIGGCACCAACCATCATCCCGACACCCAIGTACCGTCCTAGGCCAGCCAACCCAGAT
GAAATCGGCAACT I TATAA I I GAGAACC I GAAGGCGGC TAACACAGACCCCACAGCCCC GCCC TAC
GACA
CCCTCTTGGTGTTCGACTATGAGGGCAGCGGCTCCGACGCCGCGTCCCTGAGCTCCCTCACCTCCTCCGC
CTCCGACCAAGACCAAGATTACGATTATCTGAACGAGTGGGGCAGCCGCTTCAAGAAGCTGGCAGACATG
TACGGTGGCGGGGAGGACGACTAGGCGGCCTGCCTGCAGGGCTGGGGACCAAACGTCAGGCCACAGAGCA
TCTCCAAGGGGTCTCAGTTCCCCCTTCAGCTGAGGACTTCGGAGCTTGTCAGGAAGTGGCCGTAGCAACT

T GGCGGAGACAGGC TAT GAGT C T GACGT TAGAGTGGT T GC T T CC T TAGCCT T T CAGGAT
GGAGGAAT GT G
GGCAGT ITGACTICAGCACTGAAAACCICICCACCIGGGCCAGGGI TGCCTCAGAGGCCAAGT T TCCAGA
AGCCTCT TACC T GCCGTAAAAT GC T CAACCCTGT GT CC T GGGCC T GGGCC T GC T GT GAC T
GACC TACAGT
GGAC T T TC TC TC T GGAAT GGAACC T T CT TAGGCCTCCTGGTGCAACT TAAT TT T T T T T
T T TAAT GC TAT C
TTCAAAACGTTAGAGAAAGTTCTTCAAAAGTGCAGCCCAGAGCTGCTGGGCCCACTGGCCGTCCTGCATT
TCTGGTTTCCAGACCCCAATGCCTCCCATTCGGATGGATCTCTGCGTTTTTATACTGAGTGTGCCTAGGT
TGCCCCTTATTTTTTATTTTCCCTGTTGCGTTGCTATAGATGAAGGGTGAGGACAATCGTGTATATGTAC
TAGAACTTTTT TAT TAAAGAAACTTTTCCCAAAAAAAAAAAAAAAA
NM_016343 GAGACCAGAAGCGGGCGAATTGGGCACCGGTGGCGGCTGCGGGCAGTTTGAAT
CCCGCCGAAGCCGCGCCAGAACTGTACTCTCCGAGAGGTCGTTTTCCCGTCCCCGAGAGCAAGTTTATTT
ACAAATGTTGGAGTAATAAAGAAGGCAGAACAAAATGAGCTGGGCTTTGGAAGAATGGAAAGAAGGGCTG
CCTACAAGAGCTCTTCAGAAAATTCAAGAGCTTGAAGGACAGCTTGACAAACTGAAGAAGGAAAAGCAGC
AAAGGCAGTTTCAGCTTGACAGTCTCGAGGCTGCGCTGCAGAAGCAAAAACAGAAGGTTGAAAATGAAAA
AACCGAGGGTACAAACCTGAAAAGGGAGAATCAAAGATTGATGGAAATATGTGAAAGTCTGGAGAAAACT

AAGCAGAAGATTTCTCATGAACTTCAAGTCAAGGAGTCACAAGTGAATTTCCAGGAAGGACAACTGAATT
CAGGCAAAAAACAAATAGAAAAACTGGAACAGGAACTTAAAAGGTGTAAATCTGAGCTTGAAAGAAGCCA
ACAAGCTGCGCAGTCTGCAGATGTCTCTCTGAATCCATGCAATACACCACAAAAAATTTTTACAACTCCA
CTAACACCAAGTCAATATTATAGTGGTTCCAAGTATGAAGATCTAAAAGAAAAATATAATAAAGAGGTTG

AAGAACGAAAAAGATTAGAGGCAGAGGTTAAAGCCTTGCAGGCTAAAAAAGCAAGCCAGACTCTTCCACA
AGCCACCATGAATCACCGCGACATTGCCCGGCATCAGGCTTCATCATCTGTGTTCTCATGGCAGCAAGAG
AAGACCCCAAGTCATCTTTCATCTAATTCTCAAAGAACTCCAATTAGGAGAGATTTCTCTGCATCTTACT
TTTCTGGGGAACAAGAGGTGACTCCAAGTCGATCAACTTTGCAAATAGGGAAAAGAGATGCTAATAGCAG
TTTCTTTGACAATTCTAGCAGTCCTCATCTTTTGGATCAATTAAAAGCGCAGAATCAAGAGCTAAGAAAC
AAGATTAATGAGTTGGAACTACGCCTGCAAGGACATGAAAAAGAAATGAAAGGCCAAGTGAATAAGTTTC
AAGAACTCCAACTCCAACTGGAGAAAGCAAAAGIGGAAT TAT TGAAAAAGAGAAAGIT T TGAACAAAT G
TAGGGATGAACTAGTGAGAACAACAGCACAATACGACCAGGCGTCAACCAAGTATACTGCATTGGAACAA
AAACTGAAAAAATTGACGGAAGATTTGAGTTGTCAGCGACAAAATGCAGAAAGTGCCAGATGTTCTCTGG
AACAGAAAATTAAGGAAAAAGAAAAGGAGTTTCAAGAGGAGCTCTCCCGTCAACAGCGTTCTTTCCAAAC
ACTGGACCAGGAGTGCATCCAGATGAAGGCCAGACTCACCCAGGAGTTACAGCAAGCCAAGAATATGCAC
AACGTCCTGCAGGCTGAACTGGATAAACTCACATCAGTAAAGCAACAGCTAGAAAACAATTTGGAAGAGT
TTAAGCAAAAGTTGTGCAGAGCTGAACAGGCGTTCCAGGCGAGTCAGATCAAGGAGAATGAGCTGAGGAG
AAGCATGGAGGAAATGAAGAAGGAAAACAACCTCCTTAAGAGTCACTCTGAGCAAAAGGCCAGAGAAGTC
TGCCACCTGGAGGCAGAACTCAAGAACATCAAACAGTGTTTAAATCAGAGCCAGAATTTTGCAGAAGAAA
TGAAAGCGAAGAATACCTCTCAGGAAACCATGTTAAGAGATCTTCAAGAAAAAATAAATCAGCAAGAAAA
CTCCTTGACTTTAGAAAAACTGAAGCTTGCTGTGGCTGATCTGGAAAAGCAGCGAGATTGTTCTCAAGAC
CTTTTGAAGAAAAGAGAACATCACATTGAACAACTTAATGATAAGTTAAGCAAGACAGAGAAAGAGTCCA
AAGCCTTGCTGAGTGCTTTAGAGTTAAAAAAGAAAGAATATGAAGAATTGAAAGAAGAGAAAACTCTGTT
TTCTTGTTGGAAAAGTGAAAACGAAAAACTTTTAACTCAGATGGAATCAGAAAAGGAAAACTTGCAGAGT
AAAATTAATCACTTGGAAACTTGTCTGAAGACACAGCAAATAAAAAGTCATGAATACAACGAGAGAGTAA
GAACGCTGGAGATGGACAGAGAAAACCTAAGTGTCGAGATCAGAAACCTTCACAACGTGTTAGACAGTAA
GTCAGTGGAGGTAGAGACCCAGAAACTAGCTTATATGGAGCTACAGCAGAAAGCTGAGTTCTCAGATCAG
AAACATCAGAAGGAAATAGAAAATATGTGTTTGAAGACTTCTCAGCTTACTGGGCAAGTTGAAGATCTAG
AACACAAGCTTCAGTTACTGTCAAATGAAATAATGGACAAAGACCGGTGTTACCAAGACTTGCATGCCGA
ATATGAGAGCCTCAGGGATCTGCTAAAATCCAAAGATGCTTCTCTGGTGACAAATGAAGATCATCAGAGA
AGTCTTTTGGCTTTTGATCAGCAGCCTGCCATGCATCATTCCTTTGCAAATATAATTGGAGAACAAGGAA
GCATGCCTTCAGAGAGGAGTGAATGTCGTTTAGAAGCAGACCAAAGTCCGAAAAATTCTGCCATCCTACA
AAATAGAGTTGATTCACTTGAATTTTCATTAGAGTCTCAAAAACAGATGAACTCAGACCTGCAAAAGCAG
T GI GAAGAGT T GGT GCAAAT CAAAGGAGAAATAGAAGAAAAT C T CAT GAAAGCAGAACAGAT GOAT
CAAA
GT T T T GT GGC T GAAACAAGT CAGC GOAT TAG TAAGT TACAGGAAGACACT T CT GOT CAC
CAGAAT GT T GI
T GC T GAAACC T TAAGTGCCCT TGAGAACAAGGAAAAAGAGCTGCAACT T T
TAAATGATAAGGTAGAAACT
GAGCAGGCAGAGAT TCAAGAAT TAAAAAAGAGCAAC CAT C TACT TGAAGACTCTCTAAAGGAGCTACAAC
ITT TAT CCGAAAC CC TAAGC T TGGAGAAGAAAGAAATGAGT T COAT CAT T T CT C
TAAATAAAAGGGAAAT
TGAAGAGCTGACCCAAGAGAATGGGACTCT TAAGGAAAT TAT GOAT CC T TAAATCAAGAGAAGATGAAC
T TAATCCAGAAAAGTGAGAGT T T TGCAAACTATATAGATGAAAGGGAGAAAAGCAT T TCAGAGT TAT C T
G
AT CAGTACAAGCAAGAAAAAC T TAT T T TAO TACAAAGAT GI GAAGAAACCGGAAAT GCATAT
GAGGAT C T
TAGTCAAAAATACAAAGCAGCACAGGAAAAGAAT TCTAAAT TAGAAT GOT T GC TAAAT GAT GCAC TAG
T
CT T T GI GAAAATAGGAAAAAT GAGT TGGAACAGCTAAAGGAAGCAT T TGCAAAGGAACACCAAGAAT
TOT
TAACAAAAT TAGCAT T T GC T GAAGAAAGAAAT CAGAAT C T GAT GC TAGAGT
TGGAGACAGTGCAGCAAGC
TCTGAGATCTGAGATGACAGATAACCAAAACAAT TCTAAGAGCGAGGCTGGTGGT T TAAAGCAAGAAATC
AT GAO TI TAAAGGAAGAACAAAACAAAAT GCAAAAGGAAGT TAT GACT TAT TACAAGAGAAT GAACAGC

T GAT GAAGGTAAT GAAGAC TAAACAT GAAT GT CAAAAT C TAGAAT CAGAAC CAAT TAGGAAC T C
T GT GAA
AGAAAGAGAGAGIGAGAGAAATCAATGIAATTI TAAACCICAGAIGGATCT TGAAGT TAAAGAAAT T TOT
C TAGATAGT TATAATGCGCAGTIGGIGCAAT TAGAAGC TAT GC TAAGAAATAAGGAAT TAAAACT T
CAGG
AAAGTGAGAAGGAGAAGGAGTGCCTGCAGCATGAAT TACAGACAAT TAGAGGAGATCT TGAAACCAGCAA
T T TGCAAGACATGCAGTCACAAGAAAT TAGTGGCCT TAAAGAC T GT GAAATAGAT GCGGAAGAAAAGTAT

AT T T CAGGGCC T CAT GAGT T GT CAACAAGT CAAAACGACAAT GCACACC T T CAGT GC TC TC
T GCAAACAA
CAAT GAACAAGC T GAAT GAGC TAGAGAAAATAT GT GAAATAC T GCAGGC T GAAAAGTAT GAAC T
CGTAAC
T GAGC T GAT GAT T CAAGG T CAGAAT GTAT CACAGCAAC TAGGAAAAT GGCAGAAGAGG
TAGGGAAAC TA

CTAAATGAAGT TAAAATAT TAAAT GAT GACAGT GGT CT T CT CCAT GGT GAGT TAGT
GGAAGACATACCAG
GAGGTGAAT T T GGT GAACAACCAAAT GAACAGCACCC T GT GT CT T T GGC T CCAT
TGGACGAGAGTAAT TC
CTACGAGCACT TGACAT T GT CAGACAAAGAAGT TCAAATGCACT T TGCCGAAT TGCAAGAGAAAT TCT
TA
TCTT TACAAAGTGAACACAAAAT T T TACAT GAT CAGCAC T GT CAGAT GAGC TC TAAAAT GT
CAGAGC T GC
AGACC TAT GT T GAC T CAT TAAAGGCCGAAAAT T TGGTCT T GT CAACGAAT C TGAGAAAC T T
T CAAGGT GA
CT TGGTGAAGGAGATGCAGCTGGGCT TGGAGGAGGGGCTCGT T CCAT CCC T GT CAT CC T CT T GT
GT GCC T
GACAGCTCTAGTCT TAGCAGT T T GGGAGAC T CC T CC T T T
TACAGAGCTCTTTTAGAACAGACAGGAGATA
T GT CTC T T T T GAGTAAT T TAGAAGGGGC T GT T TCAGCAAACCAGTGCAGTGTAGATGAAGTAT T
T TGCAG
CAGTCTGCAGGAGGAGAATCTGACCAGGAAAGAAACCCCT T CGGCCCCAGCGAAGGGT GT TGAAGAGCT T
GAGT CCC T C T GT GAGGT GTACCGGCAGTCCC T CGAGAAGC TAGAAGAGAAAAT GGAAAGT
CAAGGGAT TA
TGAAAAATAAGGAAAT TCAAGAGCTCGAGCAGT TAT TAAGT TCTGAAAGGCAAGAGCT TGACTGCCT TAG
GAAGCAGTAT T T GT CAGAAAAT GAACAGT GGCAACAGAAGC T GACAAGCGT GAC T C T GGAGAT
GGAGT CC
AAGT T GGCGGCAGAAAAGAAACAGACGGAACAAC T GT CAC T TGAGCTGGAAGTAGCACGACTCCAGCTAC
AAGGTCTGGACT TAAGT T C T CGGT CT T T GC T TGGCATCGACACAGAAGATGCTAT T
CAAGGCCGAAAT GA
GAGC T GT GACATAT CAAAAGAACATAC T T CAGAAAC TACAGAAAGAACACCAAAGCAT GAT GT T
CAT CAG
AT T T GT GATAAAGAT GC T CAGCAGGACC T CAAT C TAGACAT T GAGAAAATAAC T GAGAC T
GGT GCAGT GA
AACCCACAGGAGAGT GC T C T GGGGAACAGT CCCCAGATACCAAT TAT GAGCCT
CCAGGGGAAGATAAAAC
CCAGGGCTCTTCAGAATGCAT TTCTGAAT T GT CAT T T TC T GGT CC TAAT GC T T T GGTACC
TAT GGAT T TC
CTGGGGAATCAGGAAGATATCCATAATCT TCAACTGCGGGTAAAAGAGACATCAAATGAGAAT T TGAGAT
TACT T CAT GT GATAGAGGACCGT GACAGAAAAGT TGAAAGT T T GC TAAAT GAAAT GAAAGAAT
TAGACTC
AAAACTCCAT T TACAGGAGGTACAACTAATGACCAAAAT TGAAGCATGCATAGAAT TGGAAAAAATAGT T
GGGGAACT TAAGAAAGAAAACTCAGAT T TAAGTGAAAAAT TGGAATAT T T T TCT T GT GAT
CACCAGGAGT
TACTCCAGAGAGTAGAAACTTCTGAAGGCCTCAAT T C T GAT T TAGAAAT GCAT GCAGATAAAT CAT
CACG
TGAAGATAT T GGAGATAAT GT GGCCAAGGT GAAT GACAGC T GGAAGGAGAGAT T TCT T GAT GT
GGAAAAT
GAGCTGAGTAGGATCAGATCGGAGAAAGCTAGCAT TGAGCATGAAGCCCTCTACCTGGAGGCTGACT TAG
AGGTAGT T CAAACAGAGAAGC TAT GT T TAGAAAAAGACAATGAAAATAAGCAGAAGGT TAT T GT C T
GCC T
TGAAGAAGAACTCTCAGTGGTCACAAGTGAGAGAAACCAGCT TCGTGGAGAAT TAGATAC TAT GT CAAAA
AAAACCACGGCACTGGATCAGT T GT C T GAAAAAAT GAAGGAGAAAACACAAGAGC T T GAGT C T CAT
CAAA
GT GAGT GT C T CCAT TGCAT T CAGGT GGCAGAGGCAGAGGT GAAGGAAAAGACGGAAC T CC T
TCAGACT T T
GT CC T C T GAT GT GAGT GAGC T GT TAAAAGACAAAAC T CAT C T CCAGGAAAAGC T
GCAGAGT T TGGAAAAG
GAC T CACAGGCAC T GT CT T T GACAAAAT GT GAGC T GGAAAACCAAAT
TGCACAACTGAATAAAGAGAAAG
AAT T GC T T GT CAAGGAAT C T GAAAGCC T GCAGGCCAGAC T GAGT GAAT CAGAT TAT
GAAAAGC T GAAT GT
CTCCAAGGCCT TGGAGGCCGCACTGGTGGAGAAAGGTGAGT TCGCAT TGAGGCTGAGCTCAACACAGGAG
GAAGTGCATCAGCTGAGAAGAGGCATCGAGAAACTGAGAGT TCGCAT TGAGGCCGATGAAAAGAAGCAGC
T GCACAT CGCAGAGAAAC T GAAAGAACGCGAGCGGGAGAAT GAT T CAC T TAAGGATAAAGT
TGAGAACCT
TGAAAGGGAAT TGCAGAIGICAGAAGAAAACCAGGAGCTAGIGATTCTIGATGCCGAGAATICCAAAGCA
GAAGTAGAGACTCTAAAAACACAAATAGAAGAGATGGCCAGAAGCCTGAAAGT TTTTGAAT TAGACCT TG
TCACGT TAAGGTCTGAAAAAGAAAATCTGACAAAACAAATACAAGAAAAACAAGGTCAGT T GI CAGAAC T
AGACAAGT TAC TC TCT T CAT T TAAAAGT C T GT
TAGAAGAAAAGGAGCAAGCAGAGATACAGATCAAAGAA
GAAT C TAAAAC T GCAGT GGAGAT GC T TCAGAATCAGT TAAAGGAGCTAAATGAGGCAGTAGCAGCCT T
GT
GT GGT GACCAAGAAAT TAT GAAGGCCACAGAACAGAGT C TAGACCCACCAA TAGAGGAAGAGCAT CAGC
T
GAGAAATAGCAT T GAAAAGC T GAGAGCCCGCC TAGAAGC T GAT GAAAAGAAGCAGC T C T GT GT C
T TACAA
CAAC T GAAGGAAAGT GAGCAT CAT GCAGAT T TACT TAAGGGIAGAGIGGAGAACCT T GAAAGAGAGC
TAG
AGATAGCCAGGACAAACCAAGAGCATGCAGCTCTTGAGGCAGAGAAT TCCAAAGGAGAGGTAGAGACCCT
AAAAGCAAAAATAGAAGGGATGACCCAAAGTCTGAGAGGTCTGGAAT TAGATGT T GT TAC TATAAGGT CA
GAAAAAGAAAATCTGACAAATGAAT TACAAAAAGAGCAAGAGCGAATATCTGAAT TAGAAATAATAAAT T
CAT CAT T TGAAAATAT T T T GCAAGAAAAAGAGCAAGAGAAAGTACAGAT GAAAGAAAAAT CAAGCAC T
GC
CAT GGAGAT GC T TCAAACACAAT TAAAAGAGC T CAAT GAGAGAGT GGCAGC CC T GCATAAT
GACCAAGAA
GCCTGTAAGGCCAAAGAGCAGAATCT TAGTAGT CAAGTAGAGT GT C T TGAACT TGAGAAGGCTCAGT T
GC
TACAAGGCCT T GAT GAGGCCAAAAATAAT TATAT T GT T T TGCAATCT T CAGTGAAT GGCC T CAT
TCAAGA
AGTAGAAGATGGCAAGCAGAAACTGGAGAAGAAGGATGAAGAAATCAGTAGACTGAAAAATCAAAT T CAA
GACCAAGAGCAGCT T GT C T C TAAAC T GT CCCAGGT GGAAGGAGAGCACCAACT T
TGGAAGGAGCAAAACT
TAGAACTGAGAAATCTGACAGTGGAAT T GGAGCAGAAGAT CCAAGT GC TACAAT CCAAAAAT GCC T CT
T T
GCAGGACACAT TAGAAGT GC T GCAGAGT TCT TACAAGAATCTAGAGAATGAGCT TGAAT TGACAAAAATG

GACAAAAT GT CC T T T GT T GAAAAAGTAAACAAAAT GAC T GCAAAGGAAAC T GAGC T
GCAGAGGGAAAT GC
AT GAGA T GGCACAGAAAACAGCAGAGC T GCAAGAAGAAC T CAG T GGAGAGAAAAATAGGC TAGC T
GGAGA
GT TGCAGT TAC T GT TGGAAGAAATAAAGAGCAGCAAAGATCAAT TGAAGGAGCTCACACTAGAAAATAGT
GAAT TGAAGAAGAGCCTAGAT TGCATGCACAAAGACCAGGTGGAAAAGGAAGGGAAAGTGAGAGAGGAAA
TAGCTGAATATCAGCTACGGCT T CAT GAAGC T GAAAAGAAACACCAGGC T T TGCTTTTGGACACAAACAA

ACAGTATGAAGTAGAAATCCAGACATACCGAGAGAAAT TGACT T C TAAAGAAGAAT GT C T CAGT
TCACAG
AAGCTGGAGATAGACCT T T TAAAGTCTAGTAAAGAAGAGCTCAATAAT T CAT T GAAAGC TAC TAC T
CAGA
TTTTGGAAGAAT TGAAGAAAACCAAGATGGACAATCTAAAATATGTAAATCAGT T GAAGAAGGAAAAT GA
ACGT GC CCAGGGGAAAAT GAAGT T GT T GAT CAAAT CC T G TAAACAGC T
GGAAGAGGAAAAGGAGATAC T G
CAGAAAGAACTCT CT CAC T TCAAGC T GCACAGGAGAAGCAGAAAACAGGTAC T GI TAT
GGATACCAAGG
TCGATGAAT TAACAAC T GAGAT CAAAGAAC T GAAAGAAAC TCT T GAAGAAAAAACCAAGGAGGCAGAT
GA
ATACT T GGATAAG TACT GI T CC T T GC T TATAAGCCATGAAAAGT TAGAGAAAGC TAAAGAGAT
GI TAGAG
ACACAAGT GGCCCAT C T GT GT TCACAGCAATCTAAACAAGAT T CCCGAGGGTC T CC T T T GC
TAGGT CCAG
T T GT T CCAGGACCAT C T CCAAT CCC T TCT GT TACTGAAAAGAGGT TAT CAT CT
GGCCAAAATAAAGC T TC

AGGCAAGAGGCAAAGATCCAGT GGAATAT GGGAGAAT GGTAGAGGACCAACACC T GC TACCCCAGAGAGC
TT T IC TAAAAAAAGCAAGAAAGCAGT CAT GAGT GGIAT T CACCCIGCAGAAGACACGGAAGGIAC T
GAG T
T TGAGCCAGAGGGACT TCCAGAAGT TGTAAAGAAAGGGT T TGCTGACATCCCGACAGGAAAGACTAGCCC
ATATATCCIGCGAAGAACAACCAIGGCAACTCGGACCAGCCCCCGCCTGGCTGCACAGAAGITAGCGCTA
TCCCCACTGAGTCTCGGCAAAGAAAATCT TGCAGAGTCCTCCAAACCAACAGCTGGTGGCAGCAGATCAC
AAAAGGTCAAAGT T GC TCAGCGGAGCCCAGTAGAT TCAGGCACCATCCTCCGAGAACCCACCACGAAATC
CGTCCCAGICAATAATCITCCTGAGAGAAGICCGACTGACAGCCCCAGAGAGGGCCIGAGGGTCAAGCGA
GGCCGACT TGTCCCCAGCCCCAAAGCTGGACTGGAGTCCAACGGCAGTGAGAACTGTAAGGTCCAGTGAA
GGCACT T T GT GT GTCAGTACCCC T GGGAGGT GCCAGTCAT T GAATAGATAAGGC T GT GCC
TACAGGAC T T
CTCT T TAGTCAGGGCAT GC T T TAT TAGTGAGGAGAAAACAAT TCCT TAGAAGTCT TAAATATAT
TGTACT
CT T TAGATC TCCCAT GT GTAGGTAT TGAAAAAGT T T GGAAGCAC T GATCACCT GT TAGCAT
TGCCAT TCC
TC TAC T GCAAT GTAAATAGTATAAAGC TAT GTATATAAAGC T T T T T GGTAATAT GT TACAAT
TAAAAT GA
CAAGCAC TATATCACAATC TC T GT T T GTAT GT GGGT T T TACACTAAAAAAATGCAAAACACAT T
T TAT TC
T TCTAAT TAACAGC TCC TAGGAAAAT GTAGAC T T T T GC T T TAT GATAT TC TAT C T
GTAGTAT GAGGCAT G
GAATAGT T T TGTATCGGGAAT T TC TCAGAGC T GAGTAAAAT GAAGGAAAAGCAT GT TAT GT GT T
T T TAAG
GAAAAT GT GCACACATATACAT GTAGGAGT GT T TATCT T TCTCT TACAATCTGT T T TAGACATCT
T T GC T
TAT GAAACC T GTACATAT GT GT GT GT GGGTAT GT GT T TAT T TCCAGTGAGGGCTGCAGGCT
TCCTAGAGG
TGTGCTATACCATGCGTCTGTCGTTGTGCTTTT TTCTGT TTTTAGACCAAT TT TTTACAGTTCTTTGGTA
AGCAT T GTCGTAT C T GGT GAT GGAT TAACATATAGCCT T T GT T T TCTAATAAAATAGTCGCCT
TCGT T T T
CIGTAAAAAAAAAAAAAAAAAAAAAA

AGGGTGGCGAGGGGCGGCCAGGACCCGCAGCCCCGGGGCCGGGCCGGTCCGGACCGCCAGGGAGGGCAGG
TCAGTGGGCAGATCGCGTCCGCGGGATTCAATCTCTGCCCGCTCTGATAACAGTCCTTTTCCCTGGCGCT
CACTTCGTGCCTGGCACCCGGCTGGGCGCCTCAAGACCGTTGTCTCTTCGATCGCTTCTTTGGACTTGGC

GACCATTTCAGAGATGTCTTCCAGAAGTACCAAAGATTTAATTAAAAGTAAGTGGGGATCGAAGCCTAGT
AACTCCAAATCCGAAACTACATTAGAAAAATTAAAGGGAGAAATTGCACACTTAAAGACATCAGTGGATG
AAATCACAAGTGGGAAAGGAAAGCTGACTGATAAAGAGAGACACAGACTTTTGGAGAAAATTCGAGTCCT
TGAGGCTGAGAAGGAGAAGAATGCTTATCAACTCACAGAGAAGGACAAAGAAATACAGCGACTGAGAGAC
CAACTGAAGGCCAGATATAGTACTACCGCATTGCTTGAACAGCTGGAAGAGACAACGAGAGAAGGAGAAA
GGAGGGAGCAGGT GT TGAAAGCCT TATCTGAAGAGAAAGACGTAT TGAAACAACAGT T GTC T GC T
GCAAC
CTCACGAAT T GC T GAAC T TGAAAGCAAAACCAATACACTCCGT T TATCACAGAC T GT GGC
TCCAAAC T GC
T T CAAC T CAT CAATAAATAATAT T CAT GAAAT GGAAATACAGC T GAAAGAT GC T C T
GGAGAAAAAT CAGC
AGT GGC TCGT GTAT GATCAGCAGCGGGAAGTC TAT GTAAAAGGAC T T T TAGCAAAGATCT T TGAGT
TGGA
AAAGAAAACGGAAACAGC T GC TCAT T CAC TCCCACAGCAGACAAAAAAGCC TGAATCAGAAGGT TATCT
T
CAAGAAGAGAAGCAGAAAT GT TACAACGATCTCT TGGCAAGTGCAAAAAAAGATCT TGAGGT TGAACGAC
AAACCATAACTCAGCTGAGT T T TGAACTGAGTGAAT T TCGAAGAAAATATGAAGAAACCCAAAAAGAAGT
TCACAAT T TAAAT CAGC T GT TGTAT T CACAAAGAAGGGCAGAT GT GCAACATC T GGAAGAT
GATAGGCAT
AAAACAGAGAAGATACAAAAACTCAGGGAAGAGAATGATAT T GC TAGGGGAAAAC T TGAAGAAGAGAAGA
AGAGATCCGAAGAGCTCT TATCTCAGGTCCAGT T TCT T TACACATC TC T GC
TAAAGCAGCAAGAAGAACA
AACAAGGGTAGC T C T GT TGGAACAACAGATGCAGGCATGTACT T TAGACT T
TGAAAATGAAAAACTCGAC
CGTCAACAT GT GCAGCATCAAT TGCATGTAAT TCT TAAGGAGCTCCGAAAAGCAAGAAATCAAATAACAC
AGT TGGAATCCT TGAAACAGCT TCATGAGT T TGCCATCACAGAGCCAT TAGTCACT T TCCAAGGAGAGAC

TGAAAACAGAGAAAAAGT TGCCGCCTCACCAAAAAGTCCCACTGCTGCACTCAATGAAAGCCTGGTGGAA
T GTCCCAAGT GCAATATACAGTATCCAGCCAC T GAGCAT CGCGATC T GC T T GT CCAT GT
GGAATAC T GT T
CAAAGTAGCAAAATAAGTAT T T GT T T TGATAT TAAAAGAT TCAATACTGTATT T TC T GT TAGCT
T GT GGG
CAT T T TGAAT TATATAT T TCACAT T T TGCATAAAACTGCCTATCTACCT T T GACAC TCCAGCAT
GC TAGT
GAATCATGTATCT T T TAGGC T GC T GT GCAT T TCTCT TGGCAGTGATACCTCCCTGACATGGT
TCATCATC
AGGC T GCAAT GACAGAAT GT GGT GAGCAGCGTC TAC T GAGAC TAC TAACAT TT
TGCACTGTCAAAATACT
TGGTGAGGAAAAGATAGCTCAGGT TAT T GC TAAT GGGT TAATGCACCAGCAAGCAAAATAT T T TAT GT
T T
TGGGGGT T TGAAAAATCAAAGATAAT TAACCAAGGATCT TAAC T GT GT TCGCAT TTTT
TATCCAAGCACT
TAGAAAACCTACAATCCTAAT T T T GAT GTCCAT T GT TAAGAGGT GGT GATAGATAC TAT
TTTTTTTT TCA
TAT TGTATAGCGGT TAT TAGAAAAGT TGGGGAT T T TCT TGATCT T TAT T GC TGC T TACCAT
TGAAACT TA
ACCCAGC T GT GT T CCCCAAC TC T GT T C T GCGCACGAAACAGTATC T GT T TGAGGCATAATCT
TAAGTGGC
CACACACAAT GT T T TCTCT TAT GT TATCTGGCAGTAACTGTAACT TGAAT TACAT TAGCACAT TC T
GC T T
AGCTAAAAT T GT TAAAATAAACT T TAATAAACCCAT GTAGCCC TC T CAT T T GAT TGACAGTAT T
T TAGT T
AT TTTTGGCAT TCT TAAAGCTGGGCAATGTAATGATCAGATCT T T GT T T GT CT GAACAGGTAT T T
T TATA
CATGCTTTTTGTAAACCAAAAACTTTTAAATTTCTTCAGGTTTTCTAACATGCTTACCACTGGGCTACTG
TAAATGAGAAAAGAATAAAAT TAT T TAAT GT T T TAAAAAAAAAAAAAAA

CCGAGCCGAGCGAGAAGAGCGGCAGAGCCT TAT CCCCIGAAGCCGGGCCCCGCGICCCAGCCCIGCCCAG
CCCGCGCCCAGCCATGCGCGCCGCCT GC T GAGT CCGGGCGCCGCACGCTGAGCCCTCCGCCCGCGAGCCG
CGCTCAGCTCGGGGGT GAT TAGTTGC T T T T T GT T GT TTTT
TAATTTGGGCCGCGGGGAGGGGGAGGAGGG
GCAGGT GC TGCAGGCT CCCCCCCCTCCCCGCC T CGGGCCAGCCGCGGCGGCGCGACTCGGGCTCCGGACC
C GGGCAC T GC T GGCGGC T GGAGCGGAGCGCACC GCGGCGGT GGT GC CCAGAGC GGAGCGCAGC T
CC C T GC
CCCGCC CC TCCCCCTCGGCCTCGCGGCGACGGCGGCGGT GGCGGCT TGGACGACTCGGAGAGCCGAGT GA
AGACAT ITCCACCTGGACACCTGACCAIGTGCCTGCCCT GAGCAGCGAGGCCCACCAGGCAT C IC T GT TG
IGGGCAGCAGGGCCAGGICCIGGICIGIGGACCCTCGGCAGTIGGCAGGCTCCCTCTGCAGIGGGGICIG
GGCC TCGGCCCCACCAT GT CGAGCC T CGGCGGT GGC TCCCAGGAT GCCGGCGGCAGTAGCAGCAGCAGCA

CCAATGGCAGCGGTGGCAGTGGCAGCAGTGGCCCAAAGGCAGGAGCAGCAGACAAGAGTGCAGTGGTGGC
TGCCGCCGCACCAGCCICAGIGGCAGATGACACACCACCCCCCGAGCGICGGAACAAGAGCGGIAT CAT C
AGT GAGCCCCTCAACAAGAGCC T GCGCCGCTCCCGCCCGCTCTCCCACTAC TC TTCTTTTGGCAGCAGTG
GT GGTAGT GGCGG T GGCAGCAT GAT GGGCGGAGAGTCTGC T GACAAGGCCACT GCGGCT GCAGCCGC
T GC
C TCCC T GI IGGCCAAIGGGCATGACCTGGCGGCGGCCAT GGCGGTGGACAAAAGCAACCCTACCTCAAAG
CACAAAAGIGGIGC T GTGGCCAGCCT GC TGAGCAAGGCAGAGCGGGCCACGGAGCTGGCAGCCGAGGGAC
AGCTGACGCTGCAGCAGT T T GCGCAG T CCACAGAGAT GC TGAAGCGCGTGGTGCAGGAGCATCTCCCGCT
GAT GAGCGAGGCGGGTGC T GGCCT GC CTGACAT GGAGGC TGIGGCAGGIGCCGAAGCCCTCAAIGGCCAG
TCCGAC TTCCCCTACCTGGGCGCT T T CCCCAT CAACCCAGGCC T CT TCAT TAT GACCCCGGCAGGT
GT GT
T CC TGGCCGAGAGCGCGCT GCACATGGCGGGCC TGGCTGAGTACCCCATGCAGGGAGAGCTGGCCT C T GC
CATCAGCTCCGGCAAGAAGAAGCGGAAACGCTGCGGCAT GT GCGCGCCC T GCCGGCGGCGCATCAAC T GC
GAGCAGTGCAGCAGT TGTAGGAATCGAAAGACTGGCCATCAGAT T T GCAAA T T CAGAAAAT GTGAGGAAC

TCAAAAAGAAGCCT TCCGC T GC TC T GGAGAAGGT GAT GC T TCCGACGGGAGCCGCCT TCCGGTGGT
T TCA
GTGACGGCGGCGGAACCCAAAGC TGCCCICICCGIGCAAT =CAC T GC TCGIGT GGICT CCAGCAAGGGA
T TCGGGCGAAGACAAACGGATGCACCCGTCT T TAGAACCAAAAATAT TCTCTCACAGAT T TCAT TCC T
GT
T T T TATATATATAT T T T T T GT TGTCGT T T
TAACATCTCCACGTCCCTAGCATAAAAAGAAAAAGAAAAAA
AT T TAAAC T GC T T T T
TCGGAAGAACAACAACAAAAAAGAGGTAAAGACGAATCTATAAAGTACCGAGACT
TCCTGGGCAAAGAATGGACAATCAGT T TCCT TCC T GT GT CGAT GTCGAT GT TGTC T GT
GCAGGAGAT GCA
GT T T T T GT GTAGAGAAT GTAAAT T T T CT GTAACC T T T T GAAATC TAGT
TACTAATAAGCACTACTGTAAT
T TAGCACAGT T TAACTCCACCCTCAT T TAAACT TCCT T T GAT TCT T
TCCGACCATGAAATAGTGCATAGT
T TGCCTGGAGAATCCACTCACGT TCATAAAGAGAAT GT T GAT GGCGCCGTGTAGAAGCCGC TC T
GTATCC
ATCCACGCGIGCAGAGCTGCCAGCAGGGAGCTCACAGAAGGGGAGGGAGCACCAGGCCAGCTGAGCTGCA
CCCACAGTCCCGAGAC T GGGATCCCCCACCCCAACAGT GAT T T TGGAAAAAAAAATGAAAGT TC T GT
TCG
T T TATCCAT TGCGATCTGGGGAGCCCCATCTCGATAT T TCCAATCCTGGCTACT T T TCT
TAGAGAAAATA
AGTCCTTTTTT TCTGGCCT TGCTAAT GGCAACAGAAGAAAGGGC T T CT T T GCGT GGTCCCC T GC T
GGT GG
GGGT GGGTCCCCAGGGGGCCCCC T GCGGCC T GGGCCCCCC T GCCCACGGCCAGC T TCC T GC T GAT
GAACA
T GC T GT T TGTAT T GT T T TAGGAAACCAGGC T GT T T T GT GAATAAAACGAAT GCAT GT T
T GT GTCACGAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
NM_005228 CCCCGGCGCAGCGCGGCCGCAGCAGCCTCCGCCCCCCGCACGGTGT

CCGGAGICCCGAGC TAGCCCCGGCGGCCGCCGCCGCCCAGACCGGACGACAGGCCACC ICGICGGC =CC
GCCCGAGTCCCCGCCTCGCCGCCAACGCCACAACCACCGCGCACGGCCCCCTGACTCCGTCCAGTAT T GA
T CGGGAGAGCCGGAGCGAGCTCT TCGGGGAGCAGCGATGCGACCCT CCGGGAC GGCCGGGGCAGCGC T CC
TGGCGC TGCTGGCTGCGCTCTGCCCGGCGAGICGGGCTCIGGAGGAAAAGAAAGIT T GC CAAGGCACGAG
TAACAAGCTCACGCAGT TGGGCACTTTTGAAGATCAT T T TC TCAGCC TCCAGAGGAT GT TCAATAAC T
GT
GAGGTGGTCCT TGGGAAT T TGGAAAT TACC TAT GT GCAGAGGAAT TAT GAT CT T TCC T T CT
TAAAGACCA
TCCAGGAGGTGGCTGGT TAT GTCC TCAT T GCCC TCAACACAGT GGAGCGAAT T CC T T
TGGAAAACCTGCA
GATCATCAGAGGAAATATGTACTACGAAAAT TCC TAT GCC T TAGCAGTCT TAT C TAAC TAT GAT
GCAAAT
AAAACCGGACTGAAGGAGCTGCCCATGAGAAAT T TACAGGAAATCCTGCATGGCGCCGTGCGGT TCAGCA
ACAACCCTGCCC T GT GCAACGTGGAGAGCATCCAGT GGCGGGACATAGTCAGCAGT GAC T T TCTCAGCAA

CAT GTCGAT GGAC T TCCAGAACCACC TGGGCAGC TGCCAAAAGT GT GATCCAAGC T GTCCCAAT
GGGAGC
T GC T GGGGT GCAGGAGAGGAGAAC T GCCAGAAAC TGACCAAAAT CAT C T GT
GCCCAGCAGTGCTCCGGGC
GC T GCC GTGGCAAGT CCCCCAGT GAC T GC TGCCACAACCAGT GT GC
TGCAGGCTGCACAGGCCCCCGGGA
GAGCGACTGCCTGGTCTGCCGCAAAT T CCGAGACGAAGCCACGT GCAAGGACACC T GCCCCCCACT CAT G
CTCTACAACCCCACCACGTACCAGAT GGAT GT GAACCCCGAGGGCAAATACAGC T T TGGTGCCACCTGCG
TGAAGAAGTGTCCCCGTAAT TAT GT GGT GACAGAT CACGGCT CGTGCGT CC GAGCC T GT
GGGGCCGACAG
C TAT GAGAT GGAGGAAGACGGCGTCCGCAAGT GTAAGAAGT GCGAAGGGCC T T GCCGCAAAGT GT
GTAAC
GGAATAGGTAT TGGTGAAT T TAAAGACTCAC TC TCCATAAAT GC TACGAATAT TAAACACT
TCAAAAACT
GCACCTCCATCAGTGGCGATCTCCACATCCTGCCGGTGGCAT T TAGGGGTGACTCCT TCACACATACTCC
TCCTCTGGATCCACAGGAACTGGATAT TCTGAAAACCGTAAAGGAAATCACAGGGT T T T T GC T GAT
TCAG
GC T TGGCCTGAAAACAGGACGGACCTCCATGCCT T TGAGAACCTAGAAATCATACGCGGCAGGACCAAGC

GATAAGT GAT GGAGAT GT GATAAT T TCAGGAAACAAAAAT T T GT GC TAT GCAAATACAATAAAC T
GGAAA
AAAC T GT T TGGGACCTCCGGTCAGAAAACCAAAAT TATAAGCAACAGAGGTGAAAACAGCTGCAAGGCCA
CAGGCCAGGTCTGCCATGCCT T GT GC TCCCCCGAGGGC T GC T GGGGCCCGGAGCCCAGGGAC
TGCGTCTC
T TGCCGGAATGTCAGCCGAGGCAGGGAATGCGTGGACAAGTGCAACCTTCTGGAGGGTGAGCCAAGGGAG
T T TGTGGAGAACTCTGAGTGCATACAGTGCCACCCAGAGTGCCTGCCTCAGGCCATGAACATCACCTGCA
CAGGACGGGGACCAGACAACTGTATCCAGTGTGCCCACTACAT T GACGGCCCCCACTGC GT CAAGACCT G
CCCGGCAGGAGT CAT GGGAGAAAACAACACCCT GGTC T GGAAGTAC GCAGACGCCGGCCAT GT GT
GCCAC
C TGTGCCAT CCAAAC TGCACC TACGGAT GCACT GGGCCAGGT C T TGAAGGC
TGTCCAACGAATGGGCCTA
AGAT CC CGTCCAT CGCCACTGGGATGGTGGGGGCCCTCC TC T T GC T GC T GG TGGT GGCC C T
GGGGAT CGG
CC TC T T CAT GCGAAGGCGCCACAT CG T T CGGAAGCGCAC GC T GCGGAGGC T GC T
GCAGGAGAGGGAGC T T
GIGGAGCCICT TACACCCAGTGGAGAAGCTCCCAACCAAGCICICT TGAGGATCT TGAAGGAAACTGAAT
TCAAAAAGATCAAAGTGCTGGGCTCCGGTGCGT TCGGCACGGTGTATAAGGGACTCTGGATCCCAGAAGG
TGAGAAAGT TAAAAT TCCCGICGCTATCAAGGAAT TAAGAGAAGCAACATCTCCGAAAGCCAACAAGGAA
ATCCTCGATGAAGCCIACGTGAIGGCCAGCGIGGACAACCCCCACGIGIGCCGCCIGCTGGGCATCTGCC
T CACCT CCACCGT GCAGCT CAT CACGCAGC T CAT GCCC T T CGGCT GCC T CC TGGAC TAT GT
CCGGGAACA
CAAAGACAATATIGGCTCCCAGTACCTGCTCAACIGGIGIGIGCAGATCGCAAAGGGCATGAACTACTIG
GAGGACCGTCGCT TGGTGCACCGCGACCTGGCAGCCAGGAACGTACTGGTGAAAACACCGCAGCATGTCA

AGATCACAGAT T T TGGGCTGGCCAAACTGCTGGGTGCGGAAGAGAAAGAATACCATGCAGAAGGAGGCAA
AGTGCCTATCAAGTGGATGGCAT TGGAATCAAT T T TACACAGAATC TATACCCACCAGAGT GAT GT CT
GG
AGCTACGGGGTGACCGT T TGGGAGT T GAT GACC T T TGGATCCAAGCCATATGACGGAATCCCTGCCAGCG

AGATCTCCICCATCCIGGAGAAAGGAGAACGCCICCCICAGCCACCCATATGIACCATCGAIGICTACAT
GATCAT GGTCAAGT GCT GGAT GATAGACGCAGATAGTCGCCCAAAGT TCCGTGAGT TGATCATCGAAT TC
TCCAAAAIGGCCCGAGACCCCCAGCGCTACCT T =CAT TCAGGGGGATGAAAGAATGCAT T TGCCAAGIC
CTACAGACTCCAACT TCTACCGTGCCCT GAT GGAT GAAGAAGACAT GGACGACGT GGT GGAT GCCGACGA

GTACCICATCCCACAGCAGGGCTTCT ICAGCAGCCCCICCACGICACGGACTCCCCICC TGAGCTC ICI G
AGIGCAACCAGCAACAATICCACCGIGGCTTGCATTGATAGAAAIGGGCTGCAAAGCTGICCCATCAAGG
AAGACAGCTTCT TGCAGCGATACAGCTCAGACCCCACAGGCGCCT TGACTGAGGACAGCATAGACGACAC
CT T CC T CCCAGT GCCTGAATACATAAACCAGTCCGT TCCCAAAAGGCCCGC TGGCTCTGTGCAGAAT CC
T
GTCTATCACAATCAGCCTC T GAAC CC CGCGCCCAGCAGAGACCCACAC TAC CAGGACCCCCACAGCAC T G

CAGIGGGCAACCCCGAGTATCICAACACTGICCAGCCCACCTGIGT CAACAGCACATTCGACAGCCCTGC
CCAC T GGGCCCAGAAAGGCAGCCACCAAAT TAGCCTGGACAACCCT GACTACCAGCAGGACT ICI T TCCC
AAGGAAGCCAAGCCAAATGGCATCT T TAAGGGCTCCACAGCTGAAAATGCAGAATACCTAAGGGTCGCGC
CACAAAGCAGT GAAT T TAT T GGAGCAT GACCAC GGAGGA TAGTAT GAGCCC TAAAAATCCAGACTC
TT TC
GATACCCAGGACCAAGCCACAGCAGGTCCTCCATCCCAACAGCCATGCCCGCAT TAGCT CT TAGACCCAC
AGACTGGT T T TGCAACGT T TACACCGACTAGCCAGGAAGTACT TCCACCTCGGGCACAT T T TGGGAAGT
T
GCAT TCCT T TGTCT TCAAACT GT GAAGCAT T TACAGAAACGCATCCAGCAAGAATAT TGTCCCT T
TGAGC
AGAAAT T TATCT T TCAAAGAGGTATAT T T GAAAAAAAAAAAAAGTATAT GT GAGGAT T T T TAT T
GAT TGG
GGATCT TGGAGT T T T TCAT TGTCGCTAT T GAT T T T TACT TCAATGGGCTCT
TCCAACAAGGAAGAAGCT T
GCTGGTAGCACT TGCTACCCTGAGT T CATCCAGGCCCAACT GT GAGCAAGGAGCACAAGCCACAAGTCT T
CCAGAGGATGCT T GAT TCCAGTGGT T CT GCT TCAAGGCT TCCACTGCAAAACACTAAAGATCCAAGAAGG

COT ICAIGGCCCCAGCAGGCCGGATCGGIACT GIATCAAGICAT GGCAGGIACAGTAGGATAAGCCACTC
TGTCCCT TCCTGGGCAAAGAAGAAACGGAGGGGATGGAAT TCT TCCT TAGACT TACT T T T GTAAAAAT
GT
CCCCACGGTACT TACTCCCCACT GAT GGACCAGT GGT T TCCAGTCATGAGCGT TAGACTGACT T GT T
T GT
CT TCCAT TCCAT T GT T T TGAAACTCAGTATGCTGCCCCTGTCT T GC T
GTCATGAAATCAGCAAGAGAGGA
TGACACATCAAATAATAACTCGGAT TCCAGCCCACAT TGGAT TCATCAGCATT TGGACCAATAGCCCACA
GCT GAGAAT GT GGAATACC TAAGGATAGCACCGCT T T T GT TCTCGCAAAAACGTATCTCCTAAT T
TGAGG
CTCAGATGAAATGCATCAGGTCCT T T GGGGCATAGATCAGAAGACTACAAAAAT GAAGC T GCTCT GAAAT
CTCCT T TAGCCATCACCCCAACCCCCCAAAAT TAGT T T GT GT TACT TAT GGAAGATAGT T T
TCTCCTTTT
ACT TCACT TCAAAAGCT T T T TACTCAAAGAGTATAT GT TCCCTCCAGGTCAGCTGCCCCCAAACCCCCTC

CT TACGCT T T GTCACACAAAAAGT GT CTCT GCC T TGAGTCATCTAT TCAAGCACT
TACAGCTCTGGCCAC
AACAGGGCAT T T TACAGGTGCGAATGACAGTAGCAT TAT GAGTAGT GT GGAAT TCAGGTAGTAAATAT
GA
AACTAGGGT T TGAAAT TGATAATGCT T TCACAACAT T T GCAGAT GT T T TAGAAGGAAAAAAGT
TCCT TCC
TAAAATAAT T TCTCTACAAT TGGAAGAT TGGAAGAT TCAGCTAGT TAGGAGCCCACCT TTTT TCCTAATC

T GT GT GT GCCCT GTAACCT GACT GGT TAACAGCAGTCCT T T GTAAACAGT GT T T
TAAACTCTCCTAGTCA
ATATCCACCCCATCCAAT T TATCAAGGAAGAAAIGGITCAGAAAATAT T T ICAGCCIACAGT TAT GT ICA

GICACACACACATACAAAAT GT TCCT T TIGCTI T TAAAGTAAT TIT T GACT
CCCAGATCAGICAGAGCCC
CTACAGCAT T GT TAAGAAAGTAT T T GAT TTTTGTCTCAATGAAAATAAAACTATAT TCAT T
TCCACTCTA
AAAAAAAAAAAAAAAA
NM_00100586 GTTCCCGGATTTTTGTGGGCGCCTGCCCCGCCCCTCGTCCCCCTGCTGTGT

GAT CT TTTTTGAGT
CGCAAT T GAAGTACCACCTCCCGAGGGT GAT TGCT TCCCCATGCGGGGTAGAACCT T T GCT GTCCT GT
TC
ACCACT CTACCTCCAGCACAGAAT T T GGCT TAT GCCIAC ICAAIGT GAAGATGAT GAGGAT
GAAAACCT T
TGT GAT GATCCAC T TCCACTIAATGAATGGIGGCAAAGCAAAGCTATATICAAGACCACATGCAAAGCTA
CTCCCT GAGCAAAGAGTCACAGATAAAAC GGGGGCACCAG TAGAAT GGCCAGGACAAAC GCAGT GCAGCA
CAGAGACTCAGACCCT GGCAGCCAT GCCTGCGCAGGCAGT GAT GAGAGTGACAT GTACT GT TGTGGACAT
GCACAAAAGTGAGT GT GCACCGGCACAGACAT GAAGCT GCGGCTCCCT GCCAGTCCCGAGACCCACCTGG
ACAT GC ICCGCCACCICTACCAGGGC T GCCAGGIGGIGCAGGGAAACCIGGAACICACCTACCTGCCCAC
CAAT GCCAGCCTGT CC T TCCTGCAGGATATCCAGGAGGTGCAGGGCTACGT GC TCATCGCTCACAACCAA
GT GAGGCAGGT CCCAC T GCAGAGGCT GCGGAT T GT GCGAGGCACCCAGC TC T T T GAGGACAAC
TAT GCCC
T GGCCGT GC TAGACAAT GGAGACCCGCT GAACAATACCACCCCTGT CACAGGGGCCTCCCCAGGAGGCC T
GCGGGAGCTGCAGCT TCGAAGCCTCACAGAGAT CT TGAAAGGAGGGGTCT T GATCCAGCGGAACCCCCAG
=CT GC TACCAGGACAC GAT T T TGIGGAAGGACATCT TCCACAAGAACAAC CAGCTGGC T C T
CACACT GA
TAGACACCAACCGCTCTCGGGCCIGCCACCCCT GT ICICCGAIGIGTAAGGGCTCCCGC T GCT GGGGAGA
GAGT TC TGAGGAT TGICAGAGCCTGACGCGCACTGICIGTOCCGGIGGCTGIGCCCGCTGCAAGGGGCCA
CIGCCCACTGACTGCTGCCATGAGCAGIGIGCTGCCGGCTGCACGGGCCCCAAGCACTCTGACTGCCTGG
CCTGCCICCACTICAACCACAGIGGCATCTGIGAGCTGCACTGCCCAGCCCTGGICACCIACAACACAGA
CACGT T TGAGTCCATGCCCAATCCCGAGGGCCGGTATACAT TCGGCGCCAGCT GT GT GACT GCCT GTCCC

TACAACTACCITICTACGGACGIGGGATCCTGCACCCTCGICTGCCCCCTGCACAACCAAGAGGIGACAG
CAGAGGAIGGAACACAGCGGIGTGAGAAGTOCAGCAAGCCCIGTGCCCGAGIGIGCTATGGICIGGGCAT
GGAGCACT TGCGAGAGGTGAGGGCAGT TACCAGTGCCAATATCCAGGAGT T TGCTGGCTGCAAGAAGATC
T T TGGGAGCCTGGCAT T TCTGCCGGAGAGCT T T GAT GGGGACCCAGCCTCCAACACT
GCCCCGCTCCAGC
CAGAGCAGCTCCAAGT GT T T GAGACT CT GGAAGAGATCACAGGT TACCTATACATCTCAGCATGGCCGGA
CAGCCT GCCTGACCICAGCGICT TCCAGAACCT GCAAGTAATCCGGGGACGAAT ICI GCACAATGGCGCC
TACTCGCTGACCCTGCAAGGGCTOGGCATCAGCTGGCTGGGGCTGCGCTCACTGAGGGAACTGGGCAGIG
GACTGGCCCTCATCCACCATAACACCCACCTCTGCT TCGTGCACACGGTGCCCTGGGACCAGCTCT T TCG

GAACCCGCACCAAGC TC T GC TCCACAC T GCCAACCGGCCAGAGGACGAGT GTGT GGGCGAGGGCC T
GGCC
T GCCACCAGC T GT GCGCCCGAGGGCAC T GC T GGGGICCAGGGCCCACCCAGTGT GTCAAC T
GCAGCCAGT
=CT TCGGGGCCAGGAGTGCGTGGAGGAAT GCCGAGTAC T GCAGGGGCTCCCCAGGGAGTAIGTGAAT GC
CAGGCAC T GT T T GCCGTGCCACCCT GAGT GTCAGCCCCAGAAT GGC TCAGT GACCT GT T T
TGGACCGGAG
GC T GACCAGT GT GT GGCC T GTGCCCAC TATAAGGACCCTCCC T
TCTGCGTGGCCCGCTGCCCCAGCGGTG
TGAAACCTGACCTCTCCTACATGCCCATCTGGAAGT T TCCAGATGAGGAGGGCGCATGCCAGCCTTGCCC
CATCAAC T GCACCCAC TCC T GT GT GGACC T GGAT GACAAGGGC T
GCCCCGCCGAGCAGAGAGCCAGCCC T
CIGACGICCATCATCTC I GCGGIGGI I GGCAT I =GC I GGICGIGGICT TGGGGGIGGI CT I
IGGGATCC
ICATCAAGCGACGGCAGCAGAAGATCCGGAAGTACACGAIGCGGAGAC I GC TGCAGGAAACGGAGC I GGT
GGAGCCGC T GACACC TAGCGGAGCGATGCCCAACCAGGCGCAGAT GCGGAT CC T GAAAGAGACGGAGC T
G
AGGAAGGIGAAGGIGC I IGGAICIGGCGC Till' GGCACAGICIACAAGGGCAT CTGGATCCCT GAT GGGG

AGAAT GT GAAAAT TCCAGT GGCCAT CAAAGT GT TGAGGGAAAACACATCCCCCAAAGCCAACAAAGAAAT
CT TAGACGAAGCATACGT GAT GGCT GGT GT GGGC TCCCCATAT GTC TCCCGCC T
TCTGGGCATCTGCCTG
ACATCCACGGIGCAGCIGGIGACACAGC I TAIGCCCIAT GGC I GCC ICI TAGACCAIGT CCGGGAAAACC

GCGGACGCCTGGGC TCCCAGGACCTGC I GAACT GGIGTAIGCAGAT TGCCAAGGGGATGAGCTACCTGGA
GGAT GT GCGGC TCGTACACAGGGAC T T GGCCGC TCGGAACGT GC T GGTCAAGAGTCCCAACCAT GT
CAAA
AT TACAGACT TCGGGC T GGCTCGGC T GC T GGACAT
TGACGAGACAGAGTACCATGCAGATGGGGGCAAGG
TGCCCATCAAGTGGATGGCGCTGGAGTCCAT TCTCCGCCGGCGGT T CACCCACCAGAGT GAT GT GT GGAG
I TAIGGIGIGAC I GIGIGGGAGC I GAT GAC I I I IGGGGCCAAACCI
TACGATGGGATCCCAGCCCGGGAG
ATCCC T GACC T GC T GGAAAAGGGGGAGCGGC T GCCCCAGCCCCCCATC T GCACCAT T GAT GTC
TACAT GA
TCAT GGTCAAAT GT T GGAT GAT TGACTCTGAATGTCGGCCAAGAT TCCGGGAGT TGGTGTCTGAAT
TCTC
CCGCATGGCCAGGGACCCCCAGCGCT T T GT GGT CATCCAGAAT GAGGAC T TGGGCCCAGCCAGTCCCT
TG
GACAGCACCT TC TACCGCT CAC T GC T GGAGGACGATGACAIGGGGGACC T GGT GGATGC
TGAGGAGTATC
TGGTACCCCAGCAGGGCTTCT TC T GT CCAGACCCTGCCCCGGGCGC T GGGGGCAT GGTCCACCACAGGCA
CCGCAGCTCATCTACCAGGAGTGGCGGTGGGGACCTGACACTAGGGCTGGAGCCCTCTGAAGAGGAGGCC
CCCAGGTCTCCACTGGCACCCTCCGAAGGGGCTGGCTCCGATGTAT T T GAT GGT GACC T GGGAATGGGGG
CAGCCAAGGGGCT GCAAAGCCTCCCCACACATGACCCCAGCCCTCTACAGCGGTACAGT GAGGACCCCAC
AGTACCCCTGCCC TCTGAGACTGATGGCTACGT TGCCCCCCTGACC TGCAGCCCCCAGCCTGAATAT GT G
AACCAGCCAGATGT TCGGCCCCAGCCCCCT T CGCCCCGAGAGGOCCCT C TGCC TGCTGCCCGACC T GC T
G
GT GCCACT C T GGAAAGGCCCAAGACT CTCT CCCCAGGGAAGAATGGGGTCGTCAAAGACGT T T T
TGCCT T
TGGGGGTGCCGTGGAGAACCCCGAGTACT TGACACCCCAGGGAGGAGCTGCCCCTCAGCCCCACCCTCCT
CCTGCCT TCAGCCCAGCCT TCGACAACC TC TAT TACTGGGACCAGGACCCACCAGAGCGGGGGGC I CCAC
CCAGCACC I TCAAAGGGACACCTACGGCAGAGAACCCAGAGTACCT GGGTC TGGACGT GCCAGIGT GAAC
CAGAAGGCCAAGTCCGCAGAAGCCCT GAIGIGT CC TCAGGGAGCAGGGAAGGCCTGAC I TC T GC T
GGCAT
CAAGAGGTGGGAGGGCCCTCCGACCACT TCCAGGGGAACCTGCCAT GCCAGGAACC T GT CC TAAGGAACC
TICCTICCIGCTIGAGITCCCAGATGGCTGGAAGGGGICCAGCCTCGTIGGAAGAGGAACAGCACIGGGG
AGTCT T T GT GGAT TCTGAGGCCCTGCCCAATGAGACTCTAGGGTCCAGTGGATGCCACAGCCCAGCT TGG
CCCT T TCCT TCCAGATCCTGGGTACTGAAAGCCT TAGGGAAGCTGGCCTGAGAGGGGAAGCGGCCCTAAG
GGAGTGTCTAAGAACAAAAGCGACCCAT TCAGAGACT GT CCCT GAAACCTAGTACT GCCCCCCAT GAGGA
AGGAACAGCAATGGTGTCAGTATCCAGGCT T TGTACAGAGTGCT T T TCT GT TTAGT T T T TACT
TTTTTTG
T T T T GT TTTTT TAAAGAT GAAATAAAGACCCAGGGGGAGAAT GGGT GT T GTAT GGGGAGGCAAGT
GT GGG
GGGTCCT TCTCCACACCCACT T TGTCCAT T TGCAAATATAT T T TGGAAAACAGCTA
NM_00112274 ATGGTCATAACAGCCTCCTGTCTACCGACTCAGAACGGATTT

TCTATAGCATAAGAAGA
CAGTCT CT GAGT GATAATC T TCTCT T CAAGAAGAAGAAAAC TAGGAAGGAG TAAGCACAAAGATCT CT
TC
ACAT TCTCCGGGACTGCGGTACCAAATATCAGCACAGCACT TCT TGAAAAAGGATGTAGAT T T TAATCTG
AACT T TGAACCATCACTGAGGTGGCCCGCCGGT T TCTGAGCCTTCT GCCCT GC GGGGACACGGT C T
GCAC
CC TGCC CGCGGCCACGGAC CATGACCATGACCC T CCACACCAAAGCAT CTGGGATGGCCCTACT GCAT CA

GATCCAAGGGAACGAGCTGGAGCCCCTGAACCGTCCGCAGCTCAAGATCCCCCTGGAGCGGCCCCT GGGC
GAGGT G TACC T GGACAGCAGCAAGCC CGCCG T G TACAAC TACCCCGAGGGC GC CGCC TACGAG T
TCAACG
CCGCGGCCGCCGCCAACGCGCAGGT C TACGGTCAGACCGGCCTCCCCTACGGCCCCGGGT C T GAGGC T GC
GGCGTTCGGCTCCAACGGCCTGGGGGGT T T CCCCCCACT CAACAGCGTGT C TCCGAGCC CGC T GAT GC
TA
CTGCACCCGCCGCCGCAGCTGTCGCCT T TCCTGCAGCCCCACGGCCAGCAGGTGCCCTACTACCTGGAGA
ACGAGCCCAGCGGCTACACGGIGCGCGAGGCCGGCCCGCCGGCAT I CTACAGGCCAAAT TCAGATAATCG
ACGCCAGGGT GGCAGAGAAAGAT IGGCCAGTACCAATGACAAGGGAAGTAT GGC TAIGGAATC I GCCAAG
GAGACTCGCTACT GT GCAGT GT GCAAT GACTAT GCT TCAGGCTACCAT TAT GGAGTCT GGTCCT GT
GAGG
GCTGCAAGGCCT ICI TCAAGAGAAGTAT ICAAGGACATAACGACTATAIGT GT CCAGCCACCAACCAGT G
CACCAT TGATAAAAACAGGAGGAAGAGCTGCCAGGCCIGCCGGCTCCGCAAATGCTACGAAGIGGGAATG
AT GAAAGGT GGGA TACGAAAAGACCGAAGAGGAGGGAGAAT GT T GAAACACAAGCGCCAGAGAGAT GAT G

GGGAGGGCAGGGGIGAAGIGGGGTCTGCTGGAGACATGAGAGCTGCCAACCIT TGGCCAAGCCCGCTCAT
GATCAAACGCTCTAAGAAGAACAGCCIGGCCT T =COT GACGGCCGACCAGAIGGICAGT GCCT T GT TG
GAT GCT GAGCCCCCCATAC TCTAT TCCGAGTATGATCCTACCAGACCCTTCAGTGAAGCT TCGAT GAT GG
GCTIACTGACCAACCTGGCAGACAGGGAGCTGGITCACATGATCAACIGGGCGAAGAGGGIGCCAGGCTI
T GT GGAT T TGACCCTCCATGATCAGGTCCACCT TCTAGAAT GT GCC T GGCTAGAGATCC T GAT GAT
TGGT
CTCGTC T GGCGCT CCAT GGAGCACCCAGGGAAGCTACT GT T TGCTCCTAACTTGCTCT TGGACAGGAACC

AGGGAAAAT GT GTAGAGGGCAT GGT GGAGATCT TCGACATGCTGCTGGCTACATCATCTCGGT TCCGCAT
GAT GAATCT GCAGGGAGAGGAGT T T GT GT GCCT CAAATC TAT TAT T T TGCT TAAT
TCTGGAGTGTACACA
T T TCTGTCCAGCACCCTGAAGTCTCTGGAAGAGAAGGACCATATCCACCGAGTCCTGGACAAGATCACAG

ACACT T TGATCCACCTGATGGCCAAGGCAGGCCTGACCCTGCAGCAGCAGCACCAGCGGCTGGCCCAGCT
COTO= CATO= ICCCACATCAGGCACATGAGTAACAAAGGCAIGGAGCAIC IGTACAGCAIGAAGIGC
AAGAACGTGGTGCCCCTCTATGACCTGCTGCTGGAGATGCTGGACGCCCACCGCCTACATGCGCCCACTA
GCCGTGGAGGGGCATCCGTGGAGGAGACGGACCAAAGCCACT TGGCCACTGCGGGCTCTACTTCATCGCA
T T CCT T GCAAAAG TAT TACATCACGGGGGAGGCAGAGGGT T TCCCTGCCACGGTCTGAGAGCTCCCTGGC

TCCCACACGGT TCAGATAATCCCTGCTGCAT T T TACCCTCATCATGCACCACT T TAGCCAAAT TCTGTCT
CCTGCATACACTCCGGCATGCATCCAACACCAATGGCT T TCTAGATGAGTGGCCAT TCAT T TGCT TGCTC
AGT TCT TAGTGGCACATCT TCTGTCT TCTGT TGGGAACAGCCAAAGGGAT TCCAAGGCTAAATCT T TGTA

ACAGCTCTCT T TCCCCCT TGCTATGT TACTAAGCGTGAGGAT TCCCGTAGCTCTTCACAGCTGAACTCAG
TCTATGGGT TGGGGCTCAGATAACTCTGTGCAT T TAAGC TACT TGTAGAGACCCAGGCCTGGAGAGTAGA
CATTTTGCCTCTGATAAGCACTTTTTAAATGGCTCTAAGAATAAGCCACAGCAAAGAATTTAAAGTGGCT
COT I TAAT IGGIGACT TGGAGAAAGC TAGGICAAGGGI I TAT TATAGCACCCT CT IGTAT ICCIAT
GGCA
ATGCATCCT T T TATGAAAGTGGTACACCT TAAAGCT T T TATATGACTGTAGCAGAGTATCTGGTGAT TGT

CAAT TCAT TCCCCCTATAGGAATACAAGGGGCACACAGGGAAGGCAGATCCCCTAGT TGGCAAGAC TAT T
T TAACT TGATACACTGCAGAT TCAGATGTGCTGAAAGCTCTGCCTCTGGCT TTCCGGTCATGGGT TCCAG
T TAAT TCATGCCTCCCATGGACCTATGGAGAGCAGCAAGT TGATCT TAGT TAAGTCTCCCTATATGAGGG
ATAAGTTCCTGATTTTTGTTTTTATTTTTGTGTTACAAAAGAAAGCCCTCCCTCCCTGAACTTGCAGTAA
GGICAGCTICAGGACCIGT ICCAGIGGGCACIGTACT IGGAICT ICCCGGCGT =GT= GC= TACACAG
GGGTGAACTGT TCACTGTGGTGATGCATGATGAGGGTAAATGGTAGT TGAAAGGAGCAGGGGCCCTGGTG
I TGCAT I TAGCCCTGGGGCATGGAGC TGAACAGTACT IGIGCAGGAT IGT I GT GGCTAC
TAGAGAACAAG
AGGGAAAGTAGGGCAGAAACIGGATACAGT TOT GAGGCACAGCCAGACT TGCT CAGGGT GGCCCTGCCAC
AGGCTGCAGCTACCIAGGAACAT ICC I TGCAGACCCCGCAT TGCCC I I IGGGGGIGCCCTGGGATCCCTG
GGGIAGICCAGCT CT ICI I CAT I ICCCAGCGTGGCCCTGGI IGGAAGAAGCAGCIGTCACAGCTGC IGIA

GACAGCTGTGT TCCTACAAT TGGCCCAGCACCCTGGGGCACGGGAGAAGGGTGGGGACCGT TGCTGTCAC
TACTCAGGCTGACTGGGGCCTGGTCAGAT TACGTATGCCCTTGGTGGT T TAGAGATAATCCAAAATCAGG
GT T TGGT T TGGGGAAGAAAATCCTCCCCCT TCCTCCCCCGCCCCGT TCCCTACCGCCTCCACTCCTGCCA
GCTCAT I TOOT TCAAT I TOOT I TGACCIATAGGCTAAAAAAGAAAGGCTCAT I
CCAGCCACAGGGCAGCC
TTCCCTGGGCCTTTGCTTCTCTAGCACAATTATGGGTTACTTCCTTTTTCTTAACAAAAAAGAATGTTTG
AT T TCCTCTGGGTGACCT TAT TGTCTGTAAT TGAAACCC TAT TGAGAGGTGAT GTCTGT GT
TAGCCAATG
ACCCAGGTGAGCTGCTCGGGCTTCTCTTGGTATGTCTTGTTTGGAAAAGTGGATTTCATTCATTTCTGAT
T GT CCAGT TAAGT GAT CAC CAAAGGAC T GAGAAT C T GGGAGGGCAAAAAAAAAAAAAAAGT T T T
TAT GT G
CACTTAAATTTGGGGACAATTTTATGTATCTGTGTTAAGGATATGTTTAAGAACATAATTCTTTTGTTGC
TGT T TGT T TAAGAAGCACCT TAGT T T GT T TAAGAAGCACCT TATATAGTATAATATATAT
TTTTTTGAAA
TTACATTGCTTGTTTATCAGACAATTGAATGTAGTAATTCTGTTCTGGATTTAATTTGACTGGGTTAACA
TGCAAAAACCAAGGAAAAATAT T TAGT TTTTTTTTTTTTTTT TGTATACT T TTCAAGCTACCT TGTCATG
TATACAGTCATTTATGCCTAAAGCCTGGTGATTATTCATTTAAATGAAGATCACATTTCATATCAACTTT
TGTATCCACAGTAGACAAAATAGCAC TAATCCAGATGCC TAT TGT TGGATACTGAATGACAGACAATCT T
ATGTAGCAAAGAT TATGCCTGAAAAGGAAAAT TAT TCAGGGCAGCTAAT T T TGCT T T
TACCAAAATATCA
GTAGTAATAT T T T TGGACAGTAGCTAATGGGTCAGTGGGT TCT T T T TAATGTT TATACT TAGAT T
T TCT T
T TAAAAAAAT TAAAATAAAACAAAAAAAAAT TI C TAGGAC TAGAC GAT G TAAT AC CAGC
TAAAGCCAAAC
AAT TATACAGTGGAAGGT T T TACAT TAT TCATCCAATGT GT T TCTAT TCAT GT
TAAGATACTACTACAT T
TGAAGTGGGCAGAGAACATCAGATGAT TGAAAT GT TCGCCCAGGGGTCTCCAGCAACT T TGGAAATCTCT
T TGTAT T T T TACT TGAAGTGCCACTAATGGACAGCAGATAT T T TCT GGCTGAT GT TGGTAT
TGGGTGTAG
GAACAT GAT T TAAAAAAAAACTCT TGCCTCTGCT T TCCCCCACTCTGAGGCAAGT TAAAATGTAAAAGAT
GTGAT T TATCTGGGGGGCTCAGGTATGGTGGGGAAGTGGAT TCAGGAATCTGGGGAATGGCAAATATAT T
AAGAAGAGTAT TGAAAGTAT T TGGAGGAAAATGGT TAAT TCTGGGTGTGCACCAGGGT TCAGTAGAGTCC
ACT TCTGCCCTGGAGACCACAAATCAACTAGCTCCAT T TACAGCCAT T TCTAAAATGGCAGCT TCAGT TC
TAGAGAAGAAAGAACAACATCAGCAGTAAAGTCCATGGAATAGCTAGTGGT CT GTGT T T CT T T TCGCCAT

TGCCTAGCT TGCCGTAATGAT TCTATAATGCCATCATGCAGCAAT TATGAGAGGCTAGGTCATCCAAAGA
GAAGACCCTATCAATGTAGGT TGCAAAATCTAACCCCTAAGGAAGT GCAGT CT T TGAT T TGAT T
TCCCTA
GTAACC I IGCAGATAIGT I TAACCAAGCCATAGCCCATGCCT I I TGAGGGC TGAACAAATAAGGGACT
TA
CTGATAAT T TACT T T TGATCACAT TAAGGTGT TCTCACCT TGAAAT CT TATACACTGAAATGGCCAT
TGA
T T TAGGCCACTGGCT TAGAGTACTCCT TCCCCTGCATGACACTGAT TACAAATACT T TCCTAT TCATACT

T TCCAAT TATGAGATGGACTGTGGGTACTGGGAGTGATCACTAACACCATAGTAATGTCTAATAT TCACA
GGCAGAICIGCT I GGGGAAGCTAGT TAIGIGAAAGGCAAATAGAGICATACAGTAGCTCAAAAGGCAACC
ATAAT TCTCT T TGGTGCAGGTCT TGGGAGCGTGATCTAGAT TACACTGCACCAT TCCCAAGT TAATCCCC
TGAAAACT TACTCTCAACTGGAGCAAATGAACT I IGGICCCAAATATCCAT CT T T TCAGTAGCGT TAAT
T
ATGCTCTGT T TCCAACTGCAT T TCCT T TCCAAT TGAAT TAAAGTGTGGCCTCGT T T T TAGTCAT T
TAAAA
T TGT T T TCTAAGTAAT TGCTGCCTCTAT TATGGCACT TCAAT T T TGCACTGTCT T T TGAGAT
TCAAGAAA
AAT T TC TAT TCT TTTTTTTGCATCCAAT TGTGCCTGAACT T T TAAAATATGTAAATGCTGCCATGT
TCCA
AACCCATCGTCAGTGTGTGTGT T TAGAGCTGTGCACCCTAGAAACAACATATTGTCCCATGAGCAGGTGC
CTGAGACACAGACCCCT T TGCAT TCACAGAGAGGTCAT TGGT TATAGAGACTTGAAT TAATAAGTGACAT
TATGCCAGT T TCT GT TCTCTCACAGGTGATAAACAATGCTTTT TGTGCACTACATACTCT TCAGTGTAGA
GCTCT T GT T T TAT GGGAAAAGGCTCAAATGCCAAAT TGT GT T TGATGGAT TAATATGCCCT T T
TGCCGAT
GCATAC TAT TACT GATGTGACTCGGT T T TGTCGCAGCT T TGCT T TGT T TAATGAAACACACT
TGTAAACC
TCT T T TGCACT T TGAAAAAGAATCCAGCGGGATGCTCGAGCACCTGTAAACAAT T T TCTCAACCTAT T
TG
AT GT TCAAATAAAGAAT TAAAC TAAA
NM_130398 AAAT TGAAAGGTCAGCCT T TCGCGCGCTGTGTAGGCAAGT TACCCGTGT TCTGCGT

GC TC T GGCCACAGT GAGT TAGGGGCGTCGGAGCGGGT T TCTCCAACCGCAATCGGCTCCGCTCAAGGGGA
GGAGGAGAGICCCTICICGGAAGGCCIAAGGAAACGIGICGICIGGAAIGGGCTIGGGGGCCACGCCIGC
ACATC T CCGCGAGACAGAGGGATAAAGT GAAGAT GGT GC T GT TAT T GT
TACCTCGAGTGCCACATGCGAC
CTCTGAGATATGTACACAGTCAT TCT TACTATCGCACTCAGCCAT TCT TACTACGCTAAAGAAGAAATAA
T TAT TCGAGGATAT T TGCCTGGCCCAGAAGAAACT TAT GTAAAT T T CAT GAAC TAT TATATCCGT
T T TCC
TCGGAGTGAGAGAAAACTCTTTT TAGATATCATCTGAGAGAACTAGTGAATCCCAGTCACTGAGTGGAGT
TGAGAGTCTAAGAACCTCTGAAAT T T GAGAAC T GC T GGACCAGAGCC T T
TAGAGCTCTGATAAGGTGTCA
ACAGGGTAGT TAAT T TGGCACCATGGGGATACAGGGAT T GC TACAAT T TAT CAAAGAAGC T
TCAGAACCC
ATCCAT GT GAGGAAGTATAAAGGGCAGGTAGTAGC T GT GGATACATAT T GC TGGC T TCACAAAGGAGC
TA
T T GC T T GT GC T GAAAAAC TAGCCAAAGGT GAACC TAC T GATAGGTAT GTAGGAT T T
TGTATGAAAT T T GT
AAATAT GT TAC TATC TCAT GGGATCAAGCC TAT TCTCGTAT T T GAT GGAT GTAC T T TACCT
TCTAAAAAG
GAAGTAGAGAGATCTAGAAGAGAAAGACGACAAGCCAATCT TCT TAAGGGAAAGCAACT TCT TCGTGAGG
GGAAAGTC TCGGAAGC TCGAGAGT GT T TCACCCGGTCTATCAATATCACACATGCCATGGCCCACAAAGT
AAT TAAAGCTGCCCGGTCTCAGGGGGTAGAT T GCC TCGT GGC TCCC TAT GAAGC T GAT GCGCAGT
TGGCC
TATCT TAACAAAGCGGGAAT T GT GCAAGCCATAAT TACAGAGGAC T CGGAT CT CC TAGC T T T T
GGC T GTA
AAAAGGTAAT T T TAAAGATGGACCAGT T TGGAAATGGACT TGAAAT T GATCAAGC TCGGC TAGGAAT
GT G
CAGACAGCT TGGGGATGTAT TCACGGAAGAGAAGT T TCGT TACAT GT GTAT TCT T TCAGGT T GT
GAC TAC
CTGTCATCACTGCGTGGGAT TGGAT TAGCAAAGGCATGCAAAGTCCTAAGACTAGCCAATAATCCAGATA
TAGTAAAGGT TAT CAAGAAAAT TGGACAT TATCTCAAGATGAATATCACGGTACCAGAGGAT TACATCAA
CGGGT T TAT TCGGGCCAACAATACCT ICC= TATCAGC TAGT TITT GATCCCATCAAAAGGAAAC T TAT
T
CC TC T GAACGCC TAT GAAGAT GAT GT T GATCC T GAAACAC TAAGC TACGCT GGGCAATAT GT
T GAT GAT T
CCATAGCTCT TCAAATAGCACT TGGAAATAAAGATATAAATACTTTTGAACAGATCGATGACTACAATCC
AGACAC T GC TAT GCCTGCCCAT ICAAGAAGICATAGT TGGGAT GACAAAACAT GTCAAAAGTCAGC
TAAT
GT TAGCAGCAT T T GGCATAGGAAT TACICICCCAGACCAGAGICGGGIAC T GT TICAGATGCCCCACAAT

TGAAGGAAAATCCAAGTACTGIGGGAGIGGAACGAGIGAT TAGTACTAAAGGGITAAATCTCCCAAGGAA
ATCATCCAT T GT GAAAAGACCAAGAAGT GCAGAGC T GTCAGAAGAT GACC T GT TGAGTCAGTAT
TCTCT T
TCAT T TACGAAGAAGACCAAGAAAAATAGC TC T GAAGGCAATAAAT CAT TGAGCT T T TC T GAAGT
GT T TG
TGCCTGACCTGGTAAATGGACCTACTAACAAAAAGAGTGTAAGCACTCCACCTAGGACGAGAAATAAAT T
TGCAACAT T T T TACAAAGGAAAAATGAAGAAAGTGGTGCAGT T GT GGT TCCAGGGACCAGAAGCAGGT T
T
TTTTGCAGT TCAGAT TC TAC T GAC T GT GTATCAAACAAAGT GAGCATCCAGCC TC T GGAT GAAAC
T GC T G
TCACAGATAAAGAGAACAATCTGCATGAATCAGAGTATGGAGACCAAGAAGGCAAGAGACTGGT TGACAC
AGAIGTAGCACGTAATICAAGIGATGACATICCGAATAATCATATICCAGGIGATCATATICCAGACAAG
GCAACAGIGIT TACAGAT GAAGAGICCIACTC T TT T GAGAGCAGCAAAT T TACAAGGACCAT
TICACCAC
CCACT T TGGGAACACTAAGAAGT T GT T T TAGT T GGTCTGGAGGTCT TGGAGAT T T T
TCAAGAACGCCGAG
CCCCTCTCCAAGCACAGCAT TGCAGCAGT TCCGAAGAAAGAGCGAT TCCCCCACCTCT T TGCCTGAGAAT
AATAT GTC T GAT GT GTCGCAGT TAAAGAGCGAGGAGTCCAGTGACGATGAGTCTCATCCCT TACGAGAAG
AGGCAT GT TCT TCACAGTCCCAGGAAAGTGGAGAAT TC T CAC T GCAGAGT TCAAATGCATCAAAGCT T
TC
TCAGT GC TC TAGTAAGGAC TC T GAT T CAGAGGAATC T GAT TGCAATAT TAAGT TACT
TGACAGTCAAAGT
GACCAGACCTCCAAGCTACGT T TATCTCAT T TCTCAAAAAAAGACACACCTCTAAGGAACAAGGT T CC T G

GGC TATATAAGICCAGT IC T GCAGAC =ICI T T C TACAACCAAGAT CAAACCT C TAGGACC
TGCCAGAGC
CAGTGGGCTGAGCAAGAAGCCGGCAAGCATCCAGAAGAGAAAGCATCATAATGCCGAGAACAAGCCGGGG
T TACAGATCAAACTCAATGAGCTCTGGAAAAACT T TGGAT T TAAAAAAGAT TCTGAAAAGCT TCC T CC
T T
GTAAGAAACCCCTGTCCCCAGTCAGAGATAACATCCAACTAACTCCAGAAGCGGAAGAGGATATAT T TAA
CAAACC T GAAT GT GGCCGT GT TCAAAGAGCAATAT TCCAGTAAATGCAGACTGCTGCAAAGCTTTTGCCT
GCAAGAGAATCTGATCAAT T T GAAGT CCC T GT T TGGGAATGAGGCACT TAT CAGCAT GAAGAAT
TTTT TC
TCAT TC T GT GCCAT T T TAAAAATAGAATACAT T T TGTATAT TAACT T TATAAT TGGGT T GT
GGT TTTTTT
GC TCAGC T T T T TATAT T T T TATAAGAAGCTAAATAGAAGAATAAT TGTATCTCTGACAGGT
TTTTGGAGG
T T T TAGT GT TAAT TGGGAAAATCCTCTGGAGT T TATAAAAGTCTACTCTAAATAT T TC T GTAAT
GT TGTC
AAGTAGAAAGATAGTAAAT GGAGAAAC TACAAAAAAAAAAAAAAAAAA

GAGGAGGT GT GATAGGC T T CCCACGCAGGGTAGATCCAGAGACACCAGT GCCACCCATAGGCCCC TAGGA
CTGCAGTGGTCACCCGAT T CC T T T GT CCCAGC T GAGAC T CAGT TC T GAGT GT T C TAT T
T TGGGGAACAGA
GGCGTCCT TGGTAGCAT T T GGAAGAGGATAGCCAGCTGGGGT GT GT GTACATCACAGCC T
GACAGTAACA
GCATCCGAACCAGAGGTGACTGGCTAAGGGCAGACCCAGGGCAACAGGT TAACCGT TCTAGGGCCGGGCA
CAGGGAGGAGAACAT TCCAACACTC T GIGIGCCCAGTGCCGACGCACGT IC IC ICI T T
TATCCTCAAAAC
AGTCC TAT GAGGATATAAGCCAGAGAGAGACAGAGACAAGGAAT TACAAGT TGGTGAGAGTCAGGAT T TG
AACT TGGCTCTGGCAGATGGAAAAT TAGGGTCTGTAT TCT T TACAAAACCGTGT GT GCC TCAGAT
GGAGT
IGGIGCATAACAAGCAGAGGIATCCAGGGICGCGGICCT GC T TGCCACGGAAGGGGCCGCC T TGICAGT T
GT GACCACCCAGCCC T GGAAAT GTCAGTAAT GC T GTAAGGAGT GGGGATCGGATCAGAT
GCCATCCAGAT
GCTGAAGTTTGACCTTGTGTCATTTTTCACTTTCTTTTTTGGCTCTTCTGCAATCAATTCATTTATTTAG
CAAAAAAGAAAT TAT GT GT GCCGAGAGCAT GCAGAAGATAT GTC TCCGT TC TC T GC T
TCCCTCCAAAAAA
GAATCCCAAAAC T GC T T TC T GT GAACGT GT GCCAGGGTCCCAGCAGGAC
TCAGGGAGAGCAGGAAGCCCA
GCCCAGACCCCTT GCACAACCTACCGTGGGGAGGCCT TAGGCTCTGGCTACTACAGAGCTGGT TCCAGTC
TGCACTGCCACAGCCTGGCCAGGGACT T GGACACATC T GC TGGCCAC T TCCTGTCTCAGT T TCCT
TATCT
GCAAAATAAGGGAAAAGCCCCCACAAAGGIGCACGIGTAGCAGGAGC IC T T TT CCCTCCC TAT T T
TAGGA
AGGCAGT TGGTGGGAAGTCCAGCT T GGGTCCCT GAGAGC T GT GAGAAGGAGAT GCGGC T GC T GC T
GGCCC
TGT IGGGGGICCT GC T GAGIGIGCCT GGOCCTCCAGIC T TGICCCT GGAGGCC IC T
GAGGAAGIGGAGC T
TGGTATGGCT TC T GAGGT GGGAGAGGGT GGCAGGGGT GGGAAGAGT GGGCACCAGGAGGGGGC T GC T
GGG

CTGAGCAAAGCTGGAAAGGATCCT TGCCCAGGCCCTGAGAAGGTGGCGGCAGGGCAGGGCTCAACCACTG
AGACTCAGTCAGTGCCTGGCT TCCAGCAAGCAT TCATCTATCACTGTGTCTGCGAGAGAGGACTGGCCT T
GCAGGGCGCAGGGCCCTAAGCTGGGC TGCAGAGCTGGTGGTGAGCT CCT TGCCTGGGTGTGTGTGCGTGT
=GT= =GT TCTGTGCAC IGGGIGT GIGACCIAGGAGGICCAGGCAGCAT GT GIGGIATAAGCAT TATG
AGGGTGATATGCCCCGGTGCAGCATGACCCTGTATGTGGCACCAACAGCAT GT GCCT TGTGTGTGTGTGT
GICCGTAIGIGIGIGIGIGTAIGCGT GIGIGIGIGIGIGIGIGIGT =CT I GGCCACIGICATGIGCACT
AAATGCTGTGTGTGTGACATGCCCCAAGAGTGTGGCAT T TGCCCTGGGTGTGGCATCCGCAGCATGTGGC
TGTGTGGGTGTCAAGGAGTGGTGGCTCCT TCAGCATGCGT TGCGAAGTGCT TGTGCCCTGCATGTGCGGT
GIGT IC ICIGTACACAGGAGGCTGCC ICAGAIGGGGCTGCGGGGIC IGCTGACCTCTGCCCICIGCCCAC
AGAGCCCTGCCTGGCTCCCAGCCTGGAGCAGCAAGAGCAGGAGCTGACAGTAGCCCTTGGGCAGCCTGTG
CGGCTGTGCTGTGGGCGGGCTGAGCGTGGTGGCCACTGGTACAAGGAGGGCAGTCGCCTGGCACCTGCTG
GCCGTG TACGGGGC TGGAGGGGCCGCC TAGAGAT TGCCAGCT ICC TACC TGAGGAT GC T
GGCCGCTACC T
CTGCCTGGCACGAGGCTCCATGATCGTCCTGCAGAATCTCACCT TGAT TACAGGTGACTCCTTGACCTCC
AGCAACGATGATGAGGACCCCAAGTCCCATAGGGACCTC TCGAATAGGCACAGT TACCCCCAGCAAGGTC
AGTAGGICICCAAGGACT I GIGICCCCGCTGCTGCTCAT CIGATCACTGAGAAGAGGAGGCCTGIGIGGG
AACACACGGTCAT ICIAGGGGCCTTCCCCTGCCCTCCAGCACCCIACIGGACACACCCCCAGCGCAIGGA
GAAGAAACTGCATGCAGTACCIGCGGGGAACACCGICAAGITCCGCTGTCCAGCTGCAGGCAACCCCACG
CCCACCATCCGCT GGCT TAAGGAIGGACAGGCCT T =AT GGGGAGAACCGCAT T GGAGGCAT TCGGCTGC
GCCATCAGCAC IGGAGIC T CGT GAIGGAGAGCGIGGIGC CC TCGGACCGCGGCACATACACCTGCC TGGT
AGAGAACGCTGTGGGCAGCATCCGT TATAAC TACC T GC TAGAT GT GCTGGAGC GGT
CCCCGCACCGGCCC
AT CC TGCAGGCCGGGCTCCCGGCCAACACCACAGCCGT GGT GGGCAGCGACGT GGAGCT GC T GT
GCAAGG
T GTACAGCGATGCCCAGCCCCACAT C CAGT GGC TGAAGCACAT CGT CATCAACGGCAGCAGCT TCGGAGC

CGACGGT T TCCCCTATGTGCAAGTCCTAAAGACTGCAGACATCAATAGCTCAGAGGTGGAGGTCCTGTAC
CIGCGGAACGIGT CAGCCGAGGACGCAGGCGAGTACACC TGCCTCGCAGGCAAT =CAT CGGCCTC ICCI
ACCAGICTGCCIGGCTCACGGIGCTGCCAGGIGAGCACCIGAAGGGCCAGGAGAIGCTGCGAGAIGCCCC
TCTGGGCCAGCAGTGGGGGCTGTGGCCTGT TGGGTGGTCAGTCTCT GT TGGCCTGTGGGGTCTGGCCTGG
GGGGCAGIGIGIGGAT I IGIGGGI I I GAGCTGIATGACAGCCCCIC IGIGCCI CICCACACGIGGCCGTC
CATGTGACCGTCT GCTGAGGTGTGGGTGCCTGGGACTGGGCATAAC TACAGCT TCCTCCGTGTGTGTCCC
CACATAIGTIGGGAGCIGGGAGGGACTGAGT TAGGGIGCACGGGGCGGCCAGT CICACCACTGACCAGT I
TGTCTGTCTGTGTGTGTCCATGTGCGAGGGCAGAGGAGGACCCCACATGGACCGCAGCAGCGCCCGAGGC
CAGGTATACGGACAT CAT C C T GTACGCGTCGGGCTCCCT GGCCT TGGC T GT GC TCCTGC
TGCTGGCCAGG
CTGTAT CGAGGGCAGGCGC TCCACGGCCGGCACCCCCGCCCGCCCGCCACT GI GCAGAAGCTCTCCCGCT
TCCCTC TGGCCCGACAGTT CTCCCTGGAGTCAGGCTCTTCCGGCAAGTCAAGC TCATCCCTGGTACGAGG
CGTGCGTCTCTCCTCCAGCGGCCCCGCCT T GC T CGCCGGCCTCGTGAGTCTAGATCTACCTCTCGACCCA
CIAIGGGAGT ICCCCCGGGACAGGCTGGIGCT TGGGAAGCCCCTAGGCGAGGGCTGCT I TGGCCAGGTAG
TACGTGCAGAGGCCTT TGGCATGGACCCTGCCCGGCCTGACCAAGCCAGCACT GTGGCCGTCAAGATGCT
CAAAGACAACGCC TCTGACAAGGACC TGGCCGACCTGGTCTCGGAGATGGAGGTGATGAAGCTGAT CGGC
CGACACAAGAACAT CAT CAACCTGC T TGGTGTCTGCACCCAGGAAGGGCCCCTGTACGT GAT CGTGGAGT
GCGCCGCCAAGGGAAACCTGCGGGAGT T CC T GC GGGCCCGGCGCCCCCCAGGCCCCGAC C T CAGCCCCGA

CGGTCCTCGGAGCAGTGAGGGGCCGC TCTCCT TCCCAGTCCTGGTCTCCTGCGCCTACCAGGTGGCCCGA
GGCATGCAGIATCIGGAGICCCGGAAGIGTATCCACCGGGACCTGGCTGCCCGCAATGIGCTGGIGACTG
AGGACAATGTGATGAAGAT TGCTGACT T TGGGCTGGCCCGCGGCGTCCACCACAT TGACTACTATAAGAA
AACCAGCAACGGCCGCCIGCCIGIGAAGIGGAT GGCGCCCGAGGCCT I= I TGACCGGGTGTACACACAC
CAGAGT GACGIGT GGICT I I IGGGAT CCIGCTAIGGGAGAICT ICACCCICGGGGGCTCCCCGTAT CCTG

GCATCCCGGTGGAGGAGCT GT TCTCGCTGCTGCGGGAGGGACATCGGATGGACCGACCCCCACACT GCCC
CCCAGAGCTGIACGGGCTGAIGCGIGAGIGCTGGCACGCAGCGCCCTCCCAGAGGCCTACCTTCAAGCAG
CIGGIGGAGGCGCTGGACAAGGICCTGCTGGCCGTCTCTGAGGAGTACCTCGACCICCGCCIGACCTTCG
GACCCIATICCCCCICIGGIGGGGACGCCAGCAGCACCIGCTCCTCCAGCGAT TCTGTCT TCAGCCACGA
CCCCCT GCCAT TGGGATCCAGCTCCT TCCCCT TCGGGTCTGGGGTGCAGACATGAGCAAGGCTCAAGGCT
GTGCAGGCACATAGGCTGGTGGCCT TGGGCCT TGGGGCTCAGCCACAGCCTGACACAGTGCTCGACCT TG
ATAGCAIGGGGCCCCIGGCCCAGAGT IGCTGIGCCGIGICCAAGGGCCGIGCCCITGCCCITGGAGCTGC
CGTGCC IGIGICC TGAIGGCCCAAAT GICAGGGI ICIGC TCGGCT I CT TGGACCT TGGCGCT
TAGTCCCC
ATCCCGGGT T TGGCTGAGCCTGGCTGGAGAGCT GCTATGCTAAACC TCCTGCCTCCCAATACCAGCAGGA
GGT ICI GGGCCTC TGAACCCCCT I ICCCCACACCTCCCCCIGCTGCTGCTGCCCCAGCGICT TGACGGGA
GCATIGGCCCCIGAGCCCAGAGAAGCTGGAAGCCTGCCGAAAACAGGAGCAAATGGCGT I I TATAAAT TA
TTTTTTTGAAAT
NM_004496 TAAGAT CCACATCAGCTCAACTGCAC I IGCCICGCAGAGGCAGCCCGCTCACT

CCGGCGCCGCGCTCCGCGGCAGCCGCCTGCCCCCGGCGCTGCCCCCGCCCGCCGCGCCGCCGCCGCCGCC
GCGCACGCCGCGCCCCGCAGCTCTGGGCT TCCT CT TCGCCCGGGTGGCGT T GGGCCCGCGCGGGCGCTCG
GGIGACTGCAGCTGCTCAGCTCCCCTCCCCCGCCCCGCGCCGCGCGGCCGCCCGTCGCT TCGCACAGGGC
TGGATGGT TGTAT TGGGCAGGGTGGCTCCAGGATGT TAGGAACTGT GAAGATGGAAGGGCATGAAACCAG
CGACTGGAACAGCTACTACGCAGACACGCAGGAGGCCTACTCCTCCGTCCCGGTCAGCAACATGAACTCA
GGCC T GGGC T C CA T GAAC T CCAT GAACACC TACAT GACCAT GAACACCAT GAC
TACGAGCGGCAACAT GA
CCCCGGCGTCCT T CAACAT =CC TAT GCCAACCCGGGCC TAGGGGCCGGCC TGAGICCCGGCGCAG TAGC
CGGCAT GCCGGGGGGCT CGGCGGGCGCCAT GAACAGCAT GACTGCGGCCGGCG T GACGGCCATGGGTACG
GCGCTGAGCCCGAGCGGCATGGGCGCCATGGGT GCGCAGCAGGCGGCCTCCAT GAATGGCCTGGGCCCCT
ACGCGGCCGCCAT GAACCC GT GCAT GAGCCCCATGGCGTACGCGCCGTCCAAC C TGGGC CGCAGCCGCGC
GGGCGGCGGCGGCGACGCCAAGACGT TCAAGCGCAGCTACCCGCACGCCAAGCCGCCCTACTCGTACATC

TCGCTCATCACCATGGCCATCCAGCAGGCGCCCAGCAAGATGCTCACGCTGAGCGAGATCTACCAGTGGA
ICAIGGACCICTICCCCIATTACCGGCAGAACCAGCAGCGCTGGCAGAACTCCATCCGCCACTCGCTGIC
CT T CAAT GAC IGO I I CGT CAAGGTGGCACGCTCCCCGGACAAGCCGGGCAAGGGCT CCT AC TGGAC
GC T G
CACCCGGACTCCGGCAACAT GT TCGAGAACGGCTGCTAC T TGCGCCGCCAGAAGCGCT T CAAGT GC GAGA

AGCAGCCGGGGGCCGGCGGCGGGGGCGGGAGCGGAAGCGGGGGCAGCGGCGCCAAGGGCGGCCCTGAGAG
CCGCAAGGACCCC TCTGGCGCCTCTAACCCCAGCGCCGACTCGCCCCTCCATC GGGGT GTGCACGGGAAG
ACCGGCCAGCTAGAGGGCGCGCCGGCCCCCGGGCCCGCCGCCAGCCCCCAGAC TCTGGACCACAGTGGGG
CGACGGCGACAGGGGGCGCCTCGGAGT TGAAGACTCCAGCCTCCTCAACTGCGCCCCCCATAAGCTCCGG
GCCCGGGGCGCTGGCCTCTGTGCCCGCCTCTCACCCGGCACACGGC T TGGCACCCCACGAGTCCCAGCTG
CACCTGAAAGGGGACCCCCACTACTCCT TCAACCACCCGT TCTCCATCAACAACCTCAT GTCCTCCTCGG
AGCAGCAGCATAAGCTGGACT T CAAGGCATACGAACAGGCAC T GCAATAC T CGCCT TACGGCTCTACGT T

GCCCGCCAGCC TGCC T C TAGGCAGCGCCTCGGT GACCACCAGGAGC CCCAT CGAGCCC I CAGCCCT
GGAG
CCGGCGTACTACCAAGGIGIGTAT TCCAGACCC GT CC TAAACACT T CC TAGCT CCCGGGACTGGGGGGT
T
T GT C T GGCATAGC CAT GCT GGTAGCAAGAGAGAAAAAAT CAACAGCAAACAAAACCACACAAACCAAACC

GICAACAGCATAATAAAATCCCAACAACTATIT T TAT T T CAT TT T T CATGCACAACCT T
TCCCCCAGTGC
AAAAGACTGT TAC T T TAT TAT TGTAT TCAAAAT TCATTGTGTATAT
TACTACAAAGACAACCCCAAACCA
AT T T T T T TCCTGCGAAGT T TAATGAT CCACAAGTGTATATATGAAAT TCTCCT CCT TCC T
TGCCCCCCTC
TCTTTCTTCCCTCTTTCCCCTCCAGACATTCTAGTTTGTGGAGGGT TAT T TAAAAAAACAAAAAAGGAAG
ATGGTCAAGTTTGTAAAATATTTGTT TGTGCTT TTTCCCCCTCCTTACCTGACCCCCTACGAGTTTACAG
GTCTGTGGCAATACTCTTAACCATAAGAATTGAAATGGTGAAGAAACAAGTATACACTAGAGGCTCTTAA
AAGTAT TGAAAGACAATACTGCTGTTATATAGCAAGACATAAACAGATTATAAACATCAGAGCCAT TTGC
TTCTCAGTTTACATTTCTGATACATGCAGATAGCAGATGTCTTTAAATGAAATACATGTATATTGTGTAT
GGACTTAATTATGCACATGCTCAGATGTGTAGACATCCTCCGTATATTTACATAACATATAGAGGTAATA
GATAGGTGATATACATGATACATTCTCAAGAGT TGCTTGACCGAAAGTTACAAGGACCCCAACCCCTTTG
TCCTCTCTACCCACAGATGGCCCTGGGAATCAATTCCTCAGGAATTGCCCTCAAGAACTCTGCTTCTTGC
TTTGCAGAGTGCCATGGTCATGTCAT TCTGAGGTCACATAACACATAAAAT TAGTTTCTATGAGTGTATA
CCATTTAAAGAAT TTTTTTTTCAGTAAAAGGGAATATTACAATGTTGGAGGAGAGATAAGTTATAGGGAG
CTGGAT TTCAAAACGTGGTCCAAGAT TCAAAAATCCTAT TGATAGTGGCCATT TTAATCATTGCCATCGT
GTGCT T GT T TCAT CCAGTGT TATGCACT T TCCACAGT TGGACATGGTGT TAGTATAGCCAGACGGGT
T TC
ATTATTATTTCTCTTTGCTTTCTCAATGTTAATTTATTGCATGGTTTATTCTTTTTCTTTACAGCTGAAA
TTGCTT TAAATGATGGTTAAAATTACAAATTAAATTGTTAATTTTTATCAATGTGATTGTAATTAAAAAT
AT T T TGAT T TAAATAACAAAAATAATACCAGAT T T TAAGCCGTGGAAAATGT T CT TGAT CAT T
TGCAGT T
AAGGAC II T AAA T AAA T CAAA T GT T AACAAAAAAAAAAAAAAAA
NM_001453 GC TAC TACCGCGCGGCGGCCGCGGCGGCCGGGGGCGGC T ACACCGCCATGCCGGCCCCCATGAGCG IGTA
CTCGCACCCTGCGCACGCCGAGCAGTACCCGGGCGGCAT GGCCCGCGCCTACGGGCCCTACACGCCGCAG
CCGCAGCCCAAGGACATGGTGAAGCCGCCCTATAGCTACATCGCGCTCATCACCATGGCCATCCAGAACG
CCCCGGACAAGAAGAT CAC CC TGAACGGCAT C T ACCAGT T CAT CAT GGACCGC T TCCCC T T C
TACCGGGA
CAACAAGCAGGGC IGGCAGAACAGCATCCGCCACAACCT CICGCICAACGAGT GC T TCGICAAGGTGCCG
CGCGACGACAAGAAGCCGGGCAAGGGCAGCTAC TGGACGCTGGACCCGGAC TCCTACAACATGTTCGAGA
ACGGCAGCT T CC T GCGGCGGCGGCGGCGCT TCAAGAAGAAGGACGCGGTGAAGGACAAGGAGGAGAAGGA
CAGGCTGCACCTCAAGGAGCCGCCCCCGCCCGGCCGCCAGCCCCCGCCCGCGCCGCCGGAGCAGGCCGAC
GGCAACGCGCCCGGTCCGCAGCCGCCGCCCGTGCGCATCCAGGACATCAAGACCGAGAACGGTACGTGCC
CCTCGCCGCCCCAGCCCCTGTCCCCGGCCGCCGCCCTGGGCAGCGGCAGCGCCGCCGCGGTGCCCAAGAT
C GAGAGCCCCGACAGCAGCAGCAGCAGCC T GT C CAGCGGGAGCAGC CCCCC GGGCAGCC T GCCGT C
GGC G
CGGCCGCTCAGCCTGGACGGTGCGGAT TCCGCGCCGCCGCCGCCCGCGCCCTCCGCCCCGCCGCCGCACC
ATAGCCAGGGCT T CAGCGT GGACAACAT CAT GACGT CGC T GCGGGGGT CGC CGCAGAGC
GCGGCCGCGGA
GCTCAGCTCCGGCCT TCTGGCCTCGGCGGCCGCGTCCTCGCGCGCGGGGATCGCACCCCCGCTGGCGCTC
GGCGCCTACTCGCCCGGCCAGAGCTCCCTCTACAGCTCCCCCTGCAGCCAGACCTCCAGCGCGGGCAGCT
C GGGCGGCGGCGGCGGCGGCGCGGGGGCCGCGGGGGGCGCGGGCGGCGCCGGGACC TAC CAC T GCAACC T
GCAAGC CAT GAGC C T GTAC GCGGCCGGCGAGCGCGGGGGCCAC T TGCAGGGCGCGCCCGGGGGCGCGGGC

GGCTCGGCCGTGGACGACCCCCTGCCCGACTACTCTCTGCCTCCGGTCACCAGCAGCAGCTCGTCGTCCC
TGAGTCACGGCGGCGGCGGCGGCGGCGGCGGGGGAGGCCAGGAGGCCGGCCACCACCCTGCGGCCCACCA
AGGCCGCCTCACC TCGT GGTACCTGAACCAGGCGGGCGGAGACCTGGGCCACT TGGCGAGCGCGGCGGCG
GCGGCGGCGGCCGCAGGCTACCCGGGCCAGCAGCAGAAC TICCACTCGGTGCGGGAGAT GI TCGAGTCAC
AGAGGATCGGCT T GAACAACTCT CCAGT GAACGGGAATAGTAGC T G T CAAATGGCC T T C CC T
TCCAGCCA
GTCTCT GTACCGCACGTCCGGAGCT T TCGTCTACGACTGTAGCAAGTTTTGACACACCCTCAAAGCCGAA
C T AAA T C GAACCC CAAAGCAGGAAAA GC T AAAG GAACCCA I CAAGGCAAAATC
GAAACTAAAAAAAAAAA
AT C CAA T TAAAAAAAACCCC T GAGAA TAT T CAC CACACCAGC GAACAGAAT AT C CC T C
CAAAAAT T CAGC
TCACCAGCACCAGCACGAAGAAAACT CTAT T T T CT TAACCGAT TAAT TCAGAGCCACCT CCACT T T
GCCT
TGTCTAAATAAACAAACCCGTAAACT GT T T TATACAGAGACAGCAAAATCT TGGT T TAT TAAAGGACAGT

GT TACT CCAGATAACACGTAAGT T TC T TCT TGC T T T TCAGAGACCT GCT T T CCCCTCCT
CCCGTCT CCCC
TCTCTTGCCTTCT TCCT TGCCTCTCACCTGTAAGATAT TAT T T TAT CCTAT GT
TGAAGGGAGGGGGAAAG
TCCCCGTTTATGAAAGTCGCTTTCTT T T TAT TCATGGAC T TGT T T TAAAAT GTAAAT
TGCAACATAGTAA
T T TAT T TTTAATT TGTAGT TGGATGTCGTGGACCAAACGCCAGAAAGTGTTCCCAAAACCTGACGT TAAA
TTGCCTGAAACTT TAAAT T GTGCT T T T T T TCTCAT TATAAAAAGGGAAACT GTAT TAAT CT TAT
TC TATC
CTCTTT TCTTTCT TTTTGT TGAACATAT TCAT T GT T TGT T TAT TAATAAAT
TACCATTCAGTTTGAATGA
GACCTATATGTCT GGATAC T T TAATAGAGCT T TAAT TAT TACGAAAAAAGATT
TCAGAGATAAAACACTA

GAAGT TACCTAT TCTCCACCTAAATCTCTGAAAAATGGAGAAACCCTCTGACTAGTCCATGTCAAAT T T T
ACTAAAAGTCT T T T TGT T TAGAT T TAT T T TCCTGCAGCATCT
TCTGCAAAATGTACTATATAGTCAGCT T
GCT T TGAGGCTAGTAAAAAGATAT T T T TCTAAACAGAT TGGAGT TGGCATATAAACAAATACGT T T
TCTC
ACTAATGACAGTCCATGAT TCGGAAAT T T TAAGCCCATGAATCAGCCGCGGTCT TACCACGGTGATGCCT
GTGTGCCGAGAGATGGGACTGTGCGGCCAGATATGCACAGATAAATAT T TGGCT TGTGTAT TCCATATAA
AAT TGCAGTGCATAT TATACATCCCTGTGAGCCAGATGCTGAATAGATAT T TTCCTAT TAT T TCAGTCCT
T TATAAAAGGAAAAATAAACCAGT T T T TAAATGTATGTATATAAT TCTCCCCCAT T TACAATCCT
TCATG
TAT TACATAGAAGGAT TGCTTTTT TAAAAATATACTGCGGGT TGGAAAGGGATAT T TAATCT T TGAGAAA

CTAT T T TAGAAAATATGT T TGTAGAACAAT TAT T T T TGAAAAAGAT T
TAAAGCAATAACAAGAAGGAAGG
CGAGAGGAGCAGAACAT T T TGGTCTAGGGTGGT T TCT T T T TAAACCAT T T T TT CT TGT TAAT
T TACAGT T
AAACCTAGGGGACAATCCGGAT TGGCCCTCCCCCT T T TGTAAATAACCCAGGAAATGTAATAAAT T CAT T
ATCT TAGGGTGATCTGCCCTGCCAATCAGACT T TGGGGAGATGGCGAT T TGAT TACAGACGT TCGGGGGG
GTGGGGGGCT TGCAGT T TGT T T TGGAGATAATACAGT T TCCTGCTATCTGCCGCTCCTATCTAGAGGCAA

CACT TAAGCAGTAAT TGCT GT TGCT T GT TGTCAAAAT T TGATCAT T GT TAAAGGAT
TGCTGCAAATAAAT
ACACT T TAAT T TCAGTCAAAAA

GCAGGT TCGGCTGGAAGGAACCGCTC TCGCT TCGICCIACACT TGCGCAAATGICICCGAGCT TAC ICAO
ATAGCATAT T GGTATAT CAAAAT GAAAT GCAAGGAACCAAAAATAACATAAT T GAAGGCAGTAAAAGT GA

AAT TAAATAGGAAGATCAT CAGTCAAGGAAGACCCACTGGAGAGGACAGAAAATGAAGCAGTGT T T TATC
ATGTGTAT T TCAGCAGGTCT TCT TGAAAT T TAACTAAAAATATGACTGCTCTCTCT TCAGAGAACTGCTC
T T T TCAGTACCAGT TACGTCAAACAAACCAGCCCCTAGACGT TAAC TATCT GC TAT TCT
TGATCATACT T
GGGAAAATAT TAT TAAATATCCT TACACTAGGAATGAGAAGAAAAAACACCTGTCAAAAT T T TATGGAAT
AT TTTTGCAT T TCACTAGCAT TCGT TGATCT T T TACT T T TGGTAAACAT T TCCAT TATAT
TGTAT T TCAG
GGAT T T TGTACT T T TAAGCAT TAGGT TCACTAAATACCACATCTGCCTAT T TACTCAAAT TAT T
TCCT T T
ACT TAT GGCT T T T TGCAT TATCCAGT T T TCCTGACAGCT TGTATAGAT TAT TGCCTGAAT T
TCTCTAAAA
CAACCAAGCT T TCAT T TAAGTGTCAAAAAT TAT T T TAT T TCT T TACAGTAATT T TAAT T
TGGAT T TCAGT
CCT TGCT TATGT T T TGGGAGACCCAGCCATCTACCAAAGCCTGAAGGCACAGAATGCT TAT TCTCGTCAC
TGTCCT T TCTATGTCAGCAT TCAGAGT TACTGGCTGTCAT T T T TCATGGTGAT GAT T T TAT T
TGTAGCT T
TCATAACCTGT TGGGAAGAAGT TACTACT T TGGTACAGGCTATCAGGATAACT TCCTATATGAATGAAAC
TATCT TATAT T T TCCT T T T TCATCCCACTCCAGT TATACTGTGAGATCTAAAAAAATAT TCT
TATCCAAG
CTCAT TGTCTGT T T TCTCAGTACCTGGT TACCAT T TGTACTACT TCAGGTAAT CAT TGT T T TACT
TAAAG
T TCAGAT TCCAGCATATAT TGAGATGAATAT TCCCTGGT TATACT T TGTCAATAGT T T TCTCAT
TGCTAC
AGTGTAT TGGT T TAAT TGTCACAAGCT TAAT T TAAAAGACAT TGGAT TACCTT TGGATCCAT T
TGTCAAC
TGGAAGTGCTGCT TCAT TCCACT TACAAT TCCTAATCT TGAGCAAAT TGAAAAGCCTATATCAATAATGA
T T TGT TAATAT TAT TAAT TAAAAGT TACAGCTGTCATAAGATCATAAT T T TAT GAACAGAAAGAAC
TCAG
GACATAT TAAAAAATAAACTGAACTAAAACAACT T T TGCCCCCTGACTGATAGCAT T TCAGAATGTGTCT
TTTGAAGGGCTATACCAGTTATTAAATAGTGTTTTATTTTAAAAACAAAATAATTCCAAGAAGTTTTTAT
AGTTATTCAGGGACACTATATTACAAATATTACTTTGTTATTAACACAAAAAGTGATAAGAGTTAACATT
TGGCTATACTGAT GT T TGT GT TACTCAAAAAAACTACTGGATGCAAACTGT TATGTAAATCTGAGAT T TC

ACTGACAACT T TAAGATATCAACCTAAACAT T T T TAT TAAATGT TCAAATGTAAGCAAGAAAAAAAAAA
NM_014176 AGTCAGAGGTCGCGCAGGCGCTGGTACCCCGT TGGTCCGCGCGT

AGTGCATCCCAGGCAGCTC T TAGIGT GGAGCAGTGAACT GIGIGIGGI ICC TT CTACT T
GGGGATCATGC
AGAGAGCT TCACGTCTGAAGAGAGAGCTGCACATGT TAGCCACAGAGCCACCCCCAGGCATCACAT GT TG
GCAAGATAAAGACCAAATGGATGACCTGCGAGCTCAAATAT TAGGTGGAGCCAACACACCT TATGAGAAA
GGTGT T T T TAAGCTAGAAGT TATCAT TCCTGAGAGGTACCCAT T TGAACCT CC TCAGAT CCGAT T
TCTCA
CICCAATITATCATCCAAACAT TGAT ICTGCTGGAAGGAT TIGICIGGAIGTICICAAAT TGCCACCAAA
AGGTGCT TGGAGACCATCCCTCAACATCGCAACTGTGT TGACCTCTAT TCAGCTGCTCATGTCAGAACCC
AACCCIGATGACCCGCTCATGGCTGACATATCCICAGAATITAAATATAATAAGCCAGCCTICCICAAGA
ATGCCAGACAGTGGACAGAGAAGCAT GCAAGACAGAAACAAAAGGC TGATGAGGAAGAGATGCT TGATAA
TCTACCAGAGGCTGGTGACTCCAGAGTACACAACTCAACACAGAAAAGGAAGGCCAGTCAGCTAGTAGGC
ATAGAAAAGAAAT T TCATCCTGATGT T TAGGGGACT TGTCCTGGT TCATCT TAGT TAATGTGT TCT T
TGC
CAAGGTGATCTAAGT TGCCTACCT TGAAT TTTTTTT TAAATATAT T TGATGACATAAT T T T
TGTGTAGT T
TAT T TATCT TGTACATATGTAT T T TGAAATCT T T TAAACCTGAAAAATAAATAGTCAT T TAATGT
TGAAA
AAAAAAAAAAAAAAAAAAAAAAAAA
NM_006845 ACGCT TGCGCGCGGGAT T TAAACTGCGGCGGT T TACGCGGCGT TAAGACT

AGGI T T CT IGGIAT TGCGCGT T IC= T TCCT TGCTGACT CICCGAAIGGCCAT GGACTCGICGCT T
CAGG
CCCGCCTGT T TCCCGGTCTCGCTATCAAGATCCAACGCAGTAATGGT T TAATTCACAGTGCCAATGTAAG
GACTGTGAACT TGGAGAAATCCTGTGT T TCAGTGGAATGGGCAGAAGGAGGTGCCACAAAGGGCAAAGAG
AT TGAT T T TGATGATGTGGCTGCAATAAACCCAGAACTCT TACAGCT TCT T CCCT
TACATCCGAAGGACA
ATCTGCCCITGCAGGAAAAIGTAACAATCCAGAAACAAAAACGGAGATCCGICAACTCCAAAATICCIGC
TCCAAAAGAAAGT CT TCGAAGCCGCTCCACTCGCATGTCCACTGTCTCAGAGCT TCGCATCACGGC TCAG
GAGAAT GACATGGAGGTGGAGCTGCCTGCAGCT GCAAAC TCCCGCAAGCAGT T T TCAGT TCCTCCTGCCC
CCACTAGGCCT TCCTGCCCTGCAGTGGCTGAAATACCAT TGAGGATGGTCAGCGAGGAGATGGAAGAGCA
AGICCATICCATCCGAGGCAGCTCTICTGCAAACCCIGTGAACICAGTICGGAGGAAATCATGICT TGIG
AAGGAAGTGGAAAAAATGAAGAACAAGCGAGAAGAGAAGAAGGCCCAGAACTCTGAAATGAGAATGAAGA
GAGCTCAGGAGTATGACAGTAGT T T TCCAAACTGGGAAT T TGCCCGAATGATTAAAGAAT T TCGGGCTAC
T T TGGAATGTCATCCACT TACTATGACTGATCCTATCGAAGAGCACAGAATATGTGTCTGTGT TAGGAAA
CGCCCACTGAATAAGCAAGAAT TGGCCAAGAAAGAAAT TGATGTGAT T TCCAT TCCTAGCAAGTGTCTCC

TCT TGGTACATGAACCCAAGT TGAAAGTGGACT TAACAAAGTATCTGGAGAACCAAGCAT TC T GC T T T
GA
CT T T GCAT T T GAT GAAACAGC T TCGAATGAAGT TGTCTACAGGT
TCACAGCAAGGCCACTGGTACAGACA
ATCT T TGAAGGTGGAAAAGCAACT T GT T T T GCATAT GGCCAGACAGGAAGT GGCAAGACACATAC
TAT GG
GCGGAGACC TC TC T GGGAAAGCCCAGAAT GCAT CCAAAGGGATC TAT GCCATGGCC TCCCGGGACGTC
T T
CC TCC T GAAGAAT CAACCC T GC TACCGGAAGT T GGGCC T GGAAGTC TAT GT GACAT TCT
TCGAGATCTAC
AAT GGGAAGC T GT T T GACC T GC TCAACAAGAAGGCCAAGC T GCGCGT GC T
GGAGGACGGCAAGCAACAGG
TGCAAGTGGTGGGGCTGCAGGAGCATCTGGT TAACTCT GC T GAT GAT GTCATCAAGAT GATCGACAT GGG

CAGCGCCTGCAGAACCTCTGGGCAGACAT T TGCCAACTCCAAT TCC TCCCGCT CCCACGCGT GC T TCCAA

AT TAT I C I TCGAGC TAAAGGGAGAAT GOAT GGCAAGT IC ICI I I GGIAGAT CT GGCAGGGAAT
GAGCGAG
GCGCGGACACT TCCAGT GC TGACCGGCAGACCCGCAT GGAGGGCGCAGAAATCAACAAGAGTC TC T TAGC
CC T GAAGGAGT GCATCAGGGCCCTGGGACAGAACAAGGC TCACACCCCGT TCCGTGAGAGCAAGCTGACA
CAGGTGC I GAGGGACICCI =AT I GGGGAGAAC ICIAGGACITGCAT GAT I GCCACGAT C
TCACCAGGCA
TAAGCT CC T GT GAATATAC T T TAAACACCCTGAGATATGCAGACAGGGTCAAGGAGCTGAGCCCCCACAG
TGGGCCCAGTGGAGAGCAGT T GAT TCAAAT GGAAACAGAAGAGAT GGAAGCCT GC TC TAACGGGGCGCT
G
AT TCCAGGCAAT T TATCCAAGGAAGAGGAGGAACTGTCT TCCCAGATGTCCAGCT T TAACGAAGCCAT GA
C TCAGATCAGGGAGC T GGAGGAGAAGGC TAT GGAAGAGC TCAAGGAGATCA TACAGCAAGGACCAGAC T
G
GC T T GAGC TCTC T GAGAT GACCGAGCAGCCAGAC TAT GACC T GGAGACC T T
TGTGAACAAAGCGGAATCT
GCICIGGCCCAGCAAGCCAAGCAT I I CICAGCCCIGCGAGAIGICATCAAGGCCI I GCGCCIGGCCAT GC
AGO T GGAAGAGCAGGCTAGCAGACAAATAAGCAGCAAGAAACGGCCCCAGT GACGACIGCAAATAAAAAT
C T GT T TGGT T TGACACCCAGCCTCT TCCCTGGCCCTCCCCAGAGAACT T
TGGGTACCTGGTGGGTCTAGG
CAGGGTCTGAGCTGGGACAGGT TC T GGTAAAT GCCAAGTAT GGGGGCATC T GGGCCCAGGGCAGC T
GGGG
AGGGGGTCAGAGT GACAT GGGACAC T CC T T T TC T GT TCCTCAGT
TGTCGCCCTCACGAGAGGAAGGAGCT
CT TAGT TACCC T T T T GT GT T GCCC T T CT T TCCATCAAGGGGAAT GT TCTCAGCATAGAGCT
T TCTCCGCA
GCATCC T GCC T GCGT GGAC T GGC T GC TAAT GGAGAGC TCCC T GGGGT
TGTCCTGGCTCTGGGGAGAGAGA
CGGAGCCT T TAGTACAGC TATC T GC T GGC TC TAAACC T TCTACGCCT T
TGGGCCGAGCACTGAATGTCT T
GTACT T TAAAAAAAT GT T T CT GAGACC TC T T TCTACT T
TACTGTCTCCCTAGAGATCCTAGAGGATCCCT
AC T GT T T TC T GT T T TAT GT GT T TATACAT T GTAT GTAACAATAAAGAGAAAAAATAAAT
CAGC T GT T TAA
GT GT GT GGAAAAAAAAAAAAAAAAAA
NM_006101 AC T GCGCGCGTCGT GCGTAAT GACGT CAGCGCCGGCGGAGAAT T TCAAAT

GAGGAAGGACC T GGT GT T T T GAT GACCGC T GTCC T GTC TAGCAGATAC T TGCACGGT T
TACAGAAAT TCG
=COT GGGTCGT GICAGGAAACIGGAAAAAAGGICATAAGCAT GAAGCGCAGT TCAGT T TCCAGCGGTG
GIGO TGGCCGCCT =CAT GCAGGAGT TAAGAT CCCAGGAIGTAAATAAACAAGGCCIC TATACCCCICA
AACCAAAGAGAAACCAACC I I IGGAAAGT I GAGTATAAACAAACCGACATC TGAAAGAAAAGICTCGC TA
T T TGGCAAAAGAACTAGTGGACATGGATCCCGGAATAGTCAACT TGGTATATT T TCCAGT TCTGAGAAAA
TCAAGGACCCGAGACCACT TAATGACAAAGCAT TCAT TCAGCAGTGTAT TCGACAAC TC T GT GAGT

TACAGAAAATGGT TAT GCACATAAT GT GTCCAT GAAATC TC TACAAGC TCCCT C T GT TAAAGACT
T CC T G
AAGATCT TCACAT T TCT T TAT GGC T T CC T GT GCCCC TCATACGAAC T TCCTGACACAAAGT T
TGAAGAAG
AGGT TCCAAGAATCT T TAAAGACCT TGGGTATCCTTTTGCACTATCCAAAAGCTCCATGTACACAGTGGG
GGCTCCTCATACATGGCCTCACAT T GT GGCAGCC T TAGT T TGGCTAATAGACTGCATCAAGATACATACT
GCCATGAAAGAAAGCTCACCT T TAT T T GAT GAT GGGCAGCC T TGGGGAGAAGAAACTGAAGATGGAAT
TA
TGCATAATAAGT T GT T T T T GGAC TACACCATAAAAT GC TAT GAGAGT T T
TATGAGTGGTGCCGACAGCT T
T GAT GAGAT GAAT GCAGAGC T GCAGT CAAAAC T GAAGGAT T TAT T TAAT GT GGAT GC T T
T TAAGCTGGAA
T CAT TAGAAGCAAAAAACAGAGCAT TGAATGAACAGAT TGCAAGAT TGGAACAAGAAAGAGAAAAAGAAC
CGAATCGTCTAGAGTCGT TGAGAAAACTGAAGGCT TCCT TACAAGGAGAT GT T CAAAAGTATCAGGCATA
CAT GAGCAAT T T GGAGTC T CAT TCAGCCAT TCT TGACCAGAAAT TAAATGGTCTCAATGAGGAAAT T
GC T
AGAG TAGAAC TAGAAT G T GAAACAAT AAAACAGGAGAACAC T C GAC TACAGAA TAT CAT
TGACAACCAGA
AGTACTCAGT TGCAGACAT TGAGCGAATAAATCATGAAAGAAATGAAT T GCAGCAGAC TAT TAATAAAT T
AACCAAGGACCTGGAAGCTGAACAACAGAAGT T GT GGAAT GAGGAGT TAAAATATGCCAGAGGCAAAGAA
GCGAT TGAAACACAAT TAGCAGAGTATCACAAAT TGGCTAGAAAAT TAAAACT TAT TCC TAAAGGT GC T
G
AGAAT TCCAAAGGT TAT GAC T T TGAAAT TAAGT T TAATCCCGAGGCTGGTGCCAACTGCCT
TGTCAAATA
CAGGGCTCAAGT T TAT GTACC TC T TAAGGAAC T CC T GAAT GAAAC T GAAGAAGAAAT
TAATAAAGCCC TA
AATAAAAAAATGGGT T TGGAGGATACT T TAGAACAAT TGAATGCAATGATAACAGAAAGCAAGAGAAGTG
TGAGAACTCTGAAAGAAGAAGT TCAAAAGCTGGATGATCT T TACCAACAAAAAAT TAAGGAAGCAGAGGA
AGAGGAT GAAAAAT GT GCCAGT GAGC T TGAGTCCT T GGAGAAACACAAGCACC T GC TAGAAAGTAC
T GT T
AACCAGGGGC TCAGT GAAGC TAT GAAT GAAT TAGAT GC T GT TCAGCGGGAATACCAACTAGT T GT
GCAAA
CCACGACTGAAGAAAGACGAAAAGTGGGAAATAACT T GCAACGTC T GT TAGAGATGGT T GC TACACAT
GT
TGGGTCTGTAGAGAAACATCT TGAGGAGCAGAT T GC TAAAGT TGATAGAGAATATGAAGAATGCATGTCA
GAAGATCTCTCGGAAAATAT TAAAGAGAT TAGAGATAAGTATGAGAAGAAAGCTACTCTAAT TAAGTCT T
C T GAAGAAT GAAGATAAAAT GT T GAT CAT GTATATATAT CCATAGT GAATAAAAT
TGTCTCAGTAAAGTG
TAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

GGCAICGGGGGCGGCAICGGGGGCGGCTCCAGCCGCATCTCCICCGICCIGGCCGGAGGGICCIGCCGCG
CCCCCAGCACCTACGGGGGCGGCCTGTCTGTCTCATCCTCCCGCT T C TCC T CT GGGGGAGCC TAT GGGT
T
GGGGGGCGGCTATGGCGGTGGCTTCAGCAGCAGCAGCAGCAGCTTTGGTAGTGGCTTTGGGGGAGGATAT
GGTGGTGGCCT T GGT GC TGGC T TGGGTGGTGGCT T TGGTGGTGGCT T T GC T GGT GGT GAT
GGGC T T CT GG
T GGGCAGT GAGAAGGT GACCAT GCAGAACC TCAACGACCGCC T GGCC TCC TACC T GGACAAGGT
GCGT GC
TC T GGAGGAGGCCAACGCCGACC T GGAAGT GAAGATCCGTGAC T GGTACCAGAGGCAGCGGCCT GC T
GAG
ATCAAAGACTACAGTCCCTACT TCAAGACCAT TGAGGACCTGAGGAACAAGAT TCTCACAGCCACAGTGG

ACAAT GCCAAT GT CC T TCTGCAGAT TGACAATGCCCGTCTGGCCGCGGATGACT TCCGCACCAAGTAT GA

GACAGAGT T GAACC T GCGCAT GAGT GT GGAAGCCGACAT CAAT GGCC T GCGCAGGGT GC T
GGACGAAC T G
ACCCTGGCCAGAGCTGACCTGGAGATGCAGAT TGAGAGCCTGAAGGAGGAGCTGGCCTACCTGAAGAAGA
ACCACGAGGAGGAGAT GAATGCCC T GAGAGGCCAGGT GGGT GGAGAT GTCAAT GT GGAGAT GGACGC T
GC
ACCTGGCGT GGACC T GAGC CGCAT TO T GAACGAGAT GCG T GACCAG TAT GAGAAGAT
GGCAGAGAAGAAC
CGCAAGGATGCCGAGGAAT GGT TOT TCACCAAGACAGAGGAGCTGAACCGCGAGGIGGCCACCAACAGCG
AGCTGGTGCAGAGCGGCAAGAGCGAGATCTCGGAGCTCCGGCGCACCATGCAGAACCTGGAGAT TGAGCT
GCAGTCCCAGCTCAGCATGAAAGCAT CCCTGGAGAACAGCCTGGAG GAGAC CAAAGGTC GC TAC T GCAT G

CAGCTGGCCCAGATCCAGGAGAT GAT T GGCAGC GT GGAGGAGCAGC TGGCCCAGC T CCGC TGCGAGAT
GG
AGCAGCAGAAC CAGGAGTACAAGAT C C T GC T GGAC G T GAAGACGCG GC T GGAGCAGGAGAT
CGCCAC C TA
CCGCCGCCT GC TGGAGGGCGAGGACGCCCACCT C T CC T C C TCCCAG T TCTCCTCTGGAT CGCAGT
CAT CC
AGAGATGTGACCT CCTCCAGCCGCCAAATCCGCACCAAGGTCATGGAT GT GCACGAT GGCAAGGT GGTGT
CCACCCACGAGCAGGTCCT TCGCACCAAGAACTGAGGCTGCCCAGCCCCGC ICAGGCCIAGGAGGCCCCC
CGTGTGGACACAGATCCCAC TGGAAGAT CCCC T C ICC TGCCCAAGCACT ICACAGCTGGACCCIGC T T
CA
CCCTCACCCCCTCCTGGCAATCAATACAGCT TCAT TATCTGAGT TGCATAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

CCTGGGC T GCCCICCCCICCICCGGGAC T GC IC TGGAC T GACAC T GC ICAGGI TCGGAT
TCCCICAAAGA
CT T T GGGAGACAAGAC T T GGTCCCCC T T T TACAAACAAGGGAACGGAGGCT CTAGAAC T GAC T
ICC T GAA
AGGCT TGGATCCAAAGCTCCCTCAGT TCAGCGGCCACGT C TAT T TCCCTCAGACACAGGGATCCT TGAAC
C T GT GGGC T GTAT C TCCCCGCGGAC T
TGGAAGAATCCCAAGAGAGTGGGGCTCCCACAGGCTGGAGTGCA
AIGGIGT GATC TCGGCTCACTGCAACCICCACC TCCCAGGT TCAAGC TAT T CT CCTGCC TCAGCCT
CCTG
AGTAGCTGGGAT TACAGAT CCIGGIGGC TGT GGTCGGTAAT TCCAGC T =GIGO IGGCTACAGGTGGAIG
AT GCCCACCTGGC TGCCGAT GAO= T GCACCAAGTGAGGC T GGGT CICIGGAGCTGCCCCAGGGGC T GG

ACAAGCTGACCCTGGCCGGGGCCAACCTGGAGATGCAGAT TGAGAACCTCAAGGAGGACCTGGTCTACCT
GAAGAAGAACCACAAGCAGGAAAT GAACGT CC T T T GAGG T CAGGT GGAT GAGGAT GT CAGT GT
GAAGAT G
GACAC T GT GCCTGGAGT GAACC T GAGC T GCATCC T GAAT GAGAT GC GT GAO
CAGGACAAGACAT TGGTGG
AGAAGAGC T GCAAGGAT GCCGAGGGC IGGI ICI TCAGCAT GGT GGGT GGCCGT GCGTAAGCAGGT GT
GTA
CACGT GT GGGCACAT GT GC T GCAT GC T GGT GCAGC T GGAGCAC T GGCAGAT CCACAGGC T
GTCCCAGT TG
GAAGGACTTTTGGAAACCAGT T GGAC CAGCCCC T CAT GT TI TAGAT GTAAAAC GT GAGGC T
CAGAGAGGA
CTCAAGCTCACACAGCCCT TCAC T GT GGCCT GCAAAATAGATCCAGGTC TC TACAAGTC T GGTC T
TGGGT
T TCCACCACAGC T GT T TACAGGAT GT GCGTAT T TGAATACATATGTATACCCT T GGCAAGCACAGGC
T GA
=ICI CCGGIAT CCIAGGGACAGCAACAGGCGCAAAAGAATAACACCCAGIGCCIGIC T T TGAGGIGCT
GCAGT T CAG TAGGAAAAAGAAAT GCAAAT GACC GCAGAG CAGGC T GAAT T C CT C CAAG T
TCCAATGTGGG
TGCAGAGGCICICIGIGIGCAGAAAGAGGGGCTGAACTGCGAGGIGGCCACCAACACAGAGGCCCIGCAG
AGT GGC T GGATAGAGATAT GGAGC TC TACGTC T C T GT GCAGAACCT GAGCCGTCCCAGC
TCAGCAAGAAA
GCATCGC T GGAGGGCAGCC T GGT GGAGAT GGAGGT GT GT
TACAGGACCCTGCCGGCCCAGCTGCAGGGGC
T TAACAGAAGCAT GGAGCAGCAGC T G T GCGAGC TOT GOT
GCGACACGGAGCACCAGGACCACAAGCACAG
GTCC T T CT GGACGT GAAGACGT GGC T GGAGCAGGAGATCGCCACCTACCGCCGC T TGCTGGAGGT
TGAGG
ACGCCCAGAGGT GATAC T GACGAT GCAGGCT GGAGTC T GGC T GAGGAGCCT TGAATGCCAAGT
TAAAGCG
TCTGGACTAGATCACGTAGGCAATGGGGAGCCATGGAGGGAT T T GGAGCAGGAGAGT GAAAT GAACAT CA
AGAGAT TI TAGAACAT T CAC TC T GGC TGCAGAGGGAGAAATGGATCAGAGGGGTCAGGGCGGGGCCAGAG

AGAT GT GT CAGGGGGC T GGAGCAGGGAGT C T GGCCAGAGAAGT CCCGTGCGGT
GGTGGGTAGTGGGGCAG
GGGAAGGAAGGTGGTGCACGCAGAAGAGAGGT TATAGCTCAAAACAGCGGGACTGGATGCCTGGATCTCG
GGGTAAGCAT GGC T CACAGT CAGGAC T CAGTAAGT GT CGGGAGAACACAT GAAGGAGCAGGCAT T
GAT GG
CCCTGGGT T TCTGGT TC T GAT GAC T GT GT GAGT GGT GAAGAGCAAGGT GGGTGGT GGT TGGGT
T TGCAGT
T GGGAAGGGT GAT CAGGCC T TCAGC T GAGAGT GTCCCGGAGTC TCCAT GC T TAGTCACACGT
TGCAGCT T
T T T GC T CCCCGGAAAT GGT GAAGTCCATC TATAGTC TAACAACAGT C TC TCCT GC T T TAAT
T GGGT C TAT
T T GT TGGGCCCTCTGGGT TAT GGAAAAACCAC T T GC TCAGC T TCTCCT TGTAAAT
TCCTGGTGAGTAGCC
ACAGAGT GCCGCCAGACC TAC T GC T GT GC T GT T TCTTTT TCT TCT T CC T GC TGT GC T
GAACCCC TGCCC T
T TCAT TCT TGGGCCTGCGCTAAT T TC T GT GCAT TCCCAAC T GT GAT T T T TCACCAAT T
TAGGGGAACCTC
CTCTGCCAGGGCCTACT TC TCCCCAGCAGT GC T T GCAGGT GCC T GGGC T GGCT GGCATCCCTGGGC
T GAT
GGGTGC T IC TCTCCCTGCAGGC TGGCCACTCAGTAC ICC T TGTCCC TGGCC
TCGCAGCCCACCCGGGAAG
CCACAG T GACCAGCCACCAGGT GT GCCATCGT GGAGGAAGT CCAGG T TGGAGAGGT GGT CT TCT TC
T GIG
AGCAGGTCCACTTCTCCACCCACTGAGACCCCT T TC T GT CTGCGACAGCCCCACCTCGAGGGCCACGGCA
CAGCCATCAGCTCCAGCTCCCAGCATGCTACTGCCACGCCCCGAGIGTCCGTCIGGGCCCCGGTGCAIGG
CCIGT T GT= T IC IGTATC TACT T IC T GCAGCCCCTCAC TGAGGAGGCCTCCT GGGT T T
GTCCAGT GCC T
AC TAT TAAAGCT T T GC TCCAAGT TC

CT T GC T GCCAAGAGAICAGIGC TGCAAGGCAAGGT TAT T IC TAAC T
GAGCAGAGCCTGCCAGGAAGAAAG
CGT T TGCACCCCACACCAC TGIGCAGGIGTGACCGGIGAGCTCACAGCTGCCCCCCAGGCATGCCCAGCC
CAC T TAATCAT TCACAGCT CGACAGC TCTCTCGCCCAGCCCAGT TC TGGAAGGGATAAAAAGGGGGCATC
ACCGT T CC IGGGIAACAGAGCCACCT IC T GCGT CCTGCT GAGC T C T GT T C T CT CCAGCACC
TCCCAACCC
AC TAGT GCCT GGT TCTCT T GC TCCACCAGGAACAAGCCACCATGTC TCGCCAGTCAAGT GTGTCC T
TCCG
GAGCGGGGGCAGT CGTAGC T TCAGCACCGCC IC T GCCAT CACCCCGTC 1= CT CCCGCACCAGC T T
CACC
TCCGIGICCCGGICCGGGGGIGGCGGTGGIGGIGGCTTCOGCAGGGICAGCCT T GCGGGT GC T 1= GGAG
T GGGT GGC TAT GGCAGCCGGAGCC TC TACAACC T GGGGGGC TCCAAGAGGATATCCATCAGCAC
TAGAGG

AGGCAGCT TCAGGAACCGGT T T GGT GC T GGT GC T GGAGGCGGC TAT GGC T T
TGGAGGTGGTGCCGGTAGT
GGAT T TGGT T TCGGCGGTGGAGCTGGTGGTGGCT T TGGGCTCGGTGGCGGAGCTGGCT T TGGAGGTGGCT

TCGGTGGCCCTGGCT T T CC T GT C TGCCCTCC T GGAGGTAT CCAAGAGGTCACT GT CAACCAGAGT
C T CC T
GAC T CC CC T CAAC C I GCAAAT CGACC CCAGCAT CCAGAGGGTGAGGACCGAGGAGCGCGAGCAGAT
CAAG
ACCCTCAACAATAAGT T TGCCTCCT TCATCGACAAGGTGCGGT T CC TGGAGCAGCAGAACAAGGT T CTGG

ACACCAAGT GGACCC T GC T GCAGGAGCAGGGCACCAAGAC T GT GAGGCAGAACCTGGAGCCGT T GT
TOGA
GCAGTACATCAACAACCICAGGAGGCACCIGGACAGCATCGIGGGGGAACGGGGCCGCCTGGACICAGAG
CTGAGAAACATGCAGGACCTGGTGGAAGACT T CAAGAACAAG TAT GAGGAT GAAAT CAACAAGCG T ACCA

C T GC T GAGAAT GAGT T T GT GAT GC T GAAGAAGGAT GTAGAT GC T GCC TACATGAACAAGGT
GGAGC TGGA
GGCCAAGGT T GAT GCAC T GAT GGAT GAGAT TAACT T CAT GAAGAT GT TCTT TGAT GCGGAGC
T GT CCCAG
AT GCAGACGCAT GT= I GACACCICAGIGGICCICICCAIGGACAACAACCGCAACCT GGACCIGGATA
GCATCATCGCTGAGGTCAAGGCCCAGTATGAGGAGAT TGCCAACCGCAGCCGGACAGAAGCCGAGT CC T G
=AT CAGACCAAG TAT GAGGAGCT GCAGCAGACAGC I GGCCGGCAT GGCGATGACC I CC GCAACAC
CAAG
CAT GAGAT CACAGAGAT GAACCGGAT GAT CCAGAGGC T GAGAGCCGAGAT I GACAAT GT
CAAGAAACAGT
GCGCCAAT C TGCAGAACGCCAT TGCGGAT GCCGAGCAGCGTGGGGAGC T GGCCCT CAAGGAT GCCAGGAA

CAAGCT GGCCGAGC TGGAGGAGGCCC TGCAGAAGGCCAAGCAGGACATGGCCCGGC I GC I GCGTGAGTAC
CAGGAGC I CAT GAACACCAAGCTGGCCC I GGACGTGGAGAT CGCCAC I TACCGCAAGC I GOT
GGAGGGCG
AGGAATGCAGACTCAGTGGAGAAGGAGT T GGACCAGT CAACAT CTC T GT T GTCACAAGCAGT GT T T
CC T C
IGGATAIGGCAGT GGCAGT GGC TAIGGCGGTGGCCTCGGIGGAGGI C I I GGCGGCGGCC ICGGIGGAGGI

CT T GCCGGAGGTAGCAGT GGAAGC TAC TAC TCCAGCAGCAGT GGGGGT GT CGGCC TAGGT GGT
GGGC T CA
GT GT GGGGGGC T C T GGC T TCAGTGCAAGCAGTGGCCGAGGGCTGGGGGTGGGCT T
TGGCAGTGGCGGGGG
TAGCAGCTCCAGCGTCAAAT T T GT C T CCACCACCTCCTCCTCCCGGAAGAGCT ICAAGAGCTAAGAACCI
GC I GCAAGTCAC I GCC I ICCAAGIGCAGCAACCCAGCCCATGGAGAT I GCC IC I ICIAGGCAGT I
GC I CA
AGCCAT GT T T TAT CC T T T T CTGGAGAGTAGTCTAGACCAAGCCAAT TGCAGAACCACAT
TCTTTGGT T CC
CAGGAGAGCCCCAT TCCCAGCCCCTGGTCTCCCGTGCCGCAGT TCTATAT T CT GC T TCAAATCAGCCT TC

AGGI I ICCCACAGCAIGGCCCCIGC I GACACGAGAACCCAAAGT I I I CCCAAAT C TAAAT CAT
CAAAACA
GAAT CCCCACCCCAAT CCCAAAT I I I GT I I IGGI ICIAAC TACCICCAGAAIGIGT I
CAATAAAAT GC I I
T TATAATAT
NM_00112306 GGACGGCCGAGCGGCAGGGCGC T CGCGCGCGCCCAC TAGT GGCCGGAGGAGAAGGC T

GCGCCGCCCGCCGGCC ICA
GGAACGCGCCCTCTTCGCCGGCGCGCGCCCTCGCAGTCACCGCCACCCACCAGCTCCGGCACCAACAGCA
GCGCCGCTGCCACCGCCCACCTTCTGCCGCCGCCACCACAGCCACCTTCTCCTCCTCCGCTGTCCTCTCC
CGTCCTCGCCTCTGTCGAC TAT CAGGT GAAC T T I GAACCAGGATGGC I GAGCCCCGCCAGGAGT I
CGAAG
T GAT GGAAGAT CACGCTGGGACGTACGGGT TGGGGGACAGGAAAGATCAGGGGGGCTACACCATGCACCA
AGACCAAGAGGGTGACACGGACGCTGGCCTGAAAGAATCTCCCCIGCAGACCCCCACTGAGGACGGATCT
GAGGAACCGGGC I CTGAAACCTC I GATGC TAAGAGCACT CCAACAGCGGAAGATGT GACAGCACCC T
TAG
TGGATGAGGGAGCTCCCGGCAAGCAGGCTGCCGCGCAGCCCCACACGGAGATCCCAGAAGGAACCACAGC
TGAAGAAGCAGGCAT T GGAGACACCC CCAGCC T GGAAGACGAAGC T GC T GG T CACG T
GACCCAAGAGCC T
GAAAGT GGIAAGGIGGICCAGGAAGGCT TCCTCCGAGAGCCAGGCCCCCCAGGIC I GAGCCACCAGCTCA
T GT CCGGCAT GCC T GGGGC TCCCCTCCTGCCTGAGGGCCCCAGAGAGGCCACACGCCAACCT TCGGGGAC
AGGACC T GAGGACACAGAGGGCGGCC GCCACGC CCC T GAGC T GC T CAAGCACCAGC T TC
TAGGAGACCTG
CACCAGGAGGGGC CGCC GC T GAAGGGGGCAGGGGGCAAAGAGAGGC CGGGGAGCAAGGAGGAGG T GGAT G

AAGACCGCGACGTCGATGAGTCCTCCCCCCAAGACTCCCCTCCCTCCAAGGCCTCCCCAGCCCAAGATGG
GCGGCCTCCCCAGACAGCCGCCAGAGAAGCCACCAGCATCCCAGGCTTCCCAGCGGAGGGTGCCATCCCC
CTCCCTGTGGATTTCCTCTCCAAAGTTTCCACAGAGATCCCAGCCTCAGAGCCCGACGGGCCCAGTGTAG
GGCGGGCCAAAGGGCAGGATGCCCCCCTGGAGTTCACGTTTCACGTGGAAATCACACCCAACGTGCAGAA
GGAGCAGGCGCACTCGGAGGAGCATTTGGGAAGGGCTGCATTTCCAGGGGCCCCTGGAGAGGGGCCAGAG
GCCCGGGGCCCCTCTTTGGGAGAGGACACAAAAGAGGCTGACCTTCCAGAGCCCTCTGAAAAGCAGCCTG
CTGCTGCTCCGCGGGGGAAGCCCGTCAGCCGGGTCCCTCAACTCAAAGCTCGCATGGTCAGTAAAAGCAA
AGACGGGACTGGAAGCGAT GACAAAAAAGCCAAGACAT CCACACGT T CC= TGC TAAAACCT I GAAAAAT
AGGCCT TGCCT TAGCCCCAAACACCCCACT CC T GGTAGC T CAGACCC T C T GAT
CCAACCCTCCAGCCC T G
C T GT GT GCCCAGAGCCACC T T CCTCT CC TAAATACGT CT CT TCT GT CAC T
TCCCGAACTGGCAGTTCTGG
AGCAAAGGAGAT GAAAC T CAAGGGGGC T GAT GG TAAAAC GAAGAT C GCCACAC CGCGGG GAGCAGC
CCC T
CCAGGCCAGAAGGGCCAGGCCAACGCCACCAGGAT TCCAGCAAAAACCCCGCCCGCTCCAAAGACACCAC
CCAGCT CTGCGAC TAAGCAAGTCCAGAGAAGACCACCCCCTGCAGGGCCCAGATCTGAGAGAGGTGAACC
TCCAAAATCAGGGGATCGCAGCGGCTACAGCAGCCCCGGCTCCCCAGGCACTCCCGGCAGCCGCTCCCGC
ACCCCGTCCCTTCCAACCCCACCCACCCGGGAGCCCAAGAAGGTGGCAGTGGTCCGTACTCCACCCAAGT
CGCCGTCTTCCGCCAAGAGCCGCCTGCAGACAGCCCCCGTGCCCATGCCAGACCTGAAGAATGTCAAGTC

CAAGATCGGCTCCACTGAGAACCTGAAGCACCAGCCGGGAGGCGGGAAGGTGCAGATAAT TAATAAGAAG
CTGGATCT TAGCAACGT CCAGT CCAAGT GT GGC TCAAAGGATAATAT CAAACACGT CCCGGGAGGCGGCA

GT GT GCAAATAGT C TACAAACCAGT T GACC T GAGCAAGGT GACC T CCAAGT GT GGCT CAT
TAGGCAACAT
CCAT CA TAAACCAGGAGG T GGCCAGG T GGAAG T AAAAT C T GAGAAGC T TGACT
TCAAGGACAGAGTCCAG
I CGAAGAT I GGGT CCC I GGACAATAT CACCCAC GT CCC I GGCGGAGGAAATAAAAAGAT I
GAAACCCACA
AGCTGACCT T CCGCGAGAACGCCAAAGCCAAGACAGACCACGGGGCGGAGA TC GT GTACAAGT CGCCAGT
GGIGTCIGGGGACACGICICCACGGCATCICAGCAAIGICTCCTCCACCGGCAGCATCGACAIGGTAGAC
I CGCCCCAGCTCGCCACGC TAGC I GACGAGGIGICIGCC TCCCTGGCCAAGCAGGGT I I =GAT CAGGCC

CCTGGGGCGGICAATAAT T GT GGAGAGGAGAGAAT GAGAGAGT GT GGAAAAAAAAAGAA TAAT GACCCGG

CCCCCGCCC T C T GCCCCCAGC T GC T CC T CGCAGT TCGGT TAAT TGGT TAAT CAC T TAACC
T GC T T T T GT C

ACTCGGCT T TGGCTCGGGACT TCAAAATCAGTGATGGGAGTAAGAGCAAAT TTCATCT T TCCAAAT TGAT
GGGTGGGCTAGTAATAAAATAT T TAAAAAAAAACAT TCAAAAACATGGCCACATCCAACAT T TCCTCAGG
CAAT TCCT T T TGAT TCT TTTT TCT TCCCCCTCCATGTAGAAGAGGGAGAAGGAGAGGCTCTGAAAGCTGC

T TCTGGGGGAT T TCAAGGGACTGGGGGTGCCAACCACCTCTGGCCCTGT TGTGGGGGTGTCACAGAGGCA
GT GGCAGCAACAAAGGAT T TGAAACT T GGT GT G T TCGTGGAGCCACAGGCAGACGAT GT CAACCT T
GT GT
GAGIGT GACGGGGGITGGGGIGGGGCGGGAGGCCACGGGGGAGGCCGAGGCAGGGGCTGGGCAGAGGGGA
GAGGAAGCACAAGAAGIGGGAGIGGGAGAGGAAGCCACGTGCTGGAGAGTAGACATCCCCCTCCT TGCCG
CIGGGAGAGCCAAGGCCIATGCCACCTGCAGCGTCTGAGCGGCCGCCIGICCI TGGTGGCCGGGGGTGGG
GGCCTGCTGIGGGICAGIGIGCCACCCTCTGCAGGGCAGCCTGIGGGAGAAGGGACAGCGGGIAAAAAGA
GAAGGCAAGCTGGCAGGAGGGTGGCACT TCGTGGATGACCTCCT TAGAAAAGACTGACCT TGATGT CT TG
AGAGCGCTGGCCT CT TCCTCCCTCCCTGCAGGGTAGGGGGCCTGAGT TGAGGGGCT TCCCTCTGCTCCAC
AGAAACCCIGT I I TAT TGAGT ICTGAAGGI IGGAACTGCTGCCATGAT I I I GGCCACT I
TGCAGACCTGG
GACT I TAGGGCTAACCAGT TOT= I I GTAAGGACT IGIGCCTCT IGGGAGACGTCCACCCGT I
TCCAAGC
CIGGGCCACIGGCATCICIGGAGIGIGIGGGGGICIGGGAGGCAGGICCCGAGCCCCCTGICCTICCCAC
GGCCACTGCAGTCACCCCGTCTGCGCCGCTGTGCTGT TGTCTGCCGTGAGAGCCCAATCACTGCCTATAC
CCCTCATCACACGTCACAAIGICCCGAATICCCAGCCICACCACCCCITCTCAGTAATGACCCIGGITGG
T TGCAGGAGGTACCTACTCCATACTGAGGGTGAAAT TAAGGGAAGGCAAAGTCCAGGCACAAGAGTGGGA
CCCCAGCCTCTCACTCTCAGT TCCACTCATCCAACTGGGACCCTCACCACGAATCTCATGATCTGAT TCG

GCCT TGT TGACATGGAGAGAGCCCT T T COCO TGAGAAGGCC T GGCCCCT T C CT
GIGCTGAGCCCACAGCA
GCAGGCTGGGIGT CT IGGI T =CA= GGIGGCACCAGGAIGGAAGGGCAAGGCACCCAGGGCAGGCCCAC
AGTCCCGC T GT CC CCCAC T TGCACCC TAGCT TGTAGCTGCCAACCTCCCAGACAGCCCAGCCCGCT
GCTC
AGCTCCACATGCATAGTAT CAGCCCT CCACACCCGACAAAGGGGAACACACCCCCT IGGAAAIGGI ICI I
T TCCCCCAGTCCCAGCTGGAAGCCAT GCTGTCT GT TCTGCTGGAGCAGCTGAACATATACATAGAT GT TG
CCCTGCCCTCCCCATCTGCACCCTGT TGAGT TGTAGT TGGAT T TGTCTGT T TATGCT TGGAT
TCACCAGA
GTGACTAT GATAGTGAAAAGAAAAAAAAAAAAAAAAAAGGACGCAT GTATC T T GAAATGCT TGTAAAGAG
GT I ICIAACCCACCCICACGAGGIGT CICICACCCCCACACTGGGACTCGT GT GGCCIGIGIGGIGCCAC
CCTGCTGGGGCCTCCCAAGT T T TGAAAGGCT T TCCTCAGCACCTGGGACCCAACAGAGACCAGCT TCTAG
CAGCTAAGGAGGCCGT TCAGCTGTGACGAAGGCCTGAAGCACAGGAT TAGGACTGAAGCGATGATGTCCC
=COO TACT ICCCCI IGGGGCTCCC IGIGTCAGGGCACAGACTAGGICT I GT GGCIGGICIGGCT TGCG
GCGCGAGGAIGGI ICI= IGGICATAGCCCGAAGICICAIGGCAGICCCAAAGGAGGC I TACAAC TOOT
GOAT CACAAGAAAAAGGAAGC CAC I G C CAGC T GGGGGGA I C I GCAGC T CCCAGAAGC T C
CGT GAGC C T CA
GCCACCCCTCAGACTGGGT T CC IC IC CAAGC T CGCCC IC
TGGAGGGGCAGCGCAGCCTCCCACCAAGGGC
CCTGCGACCACAGCAGGGAT TGGGAT GAAT T GCC T GT CC TGGATCT GC T C
TAGAGGCCCAAGCTGCCTGC
CTGAGGAAGGATGACT TGACAAGICAGGAGACAC T GT T C CCAAAGC C T T GACCAGAGCACC
TCAGCCCGC
TGACCT IGCACAAACICCATCTGCTGCCATGAGAAAAGGGAAGCCGCCI I I GCAAAACAT TGCTGCCTAA
AGAAACICAGCAGCCICAGGCCCAAT TCTGCCACT TCTGGT T TGGGTACAGTTAAAGGCAACCCTGAGGG
ACT TGGCAGTAGAAATCCAGGGCCTCCCCTGGGGCTGGCAGCT TCGTGTGCAGCTAGAGCT T TACCTGAA
AGGAAGICICIGGGCCCAGAACICTCCACCAAGAGCCTCCCIGCCGT =GC TGAGICCCAGCAAT I CTCC
TAAGT I GAAGGGATCTGAGAAGGAGAAGGAAAT GIGGGGTAGAT I I GGIGGIGGI TAGAGATAIGCCCCC
CTCAT TACTGCCAACAGT I TCGGCTGCAT I ICI ICACGCACCICGGI TOOT CT ICCIGAAGT ICI I
GTGC
CCTGCT CT ICAGCACCAIGGGCCI IC I TATACGGAAGGCTCIGGGAICICCCCCI IGIGGGGCAGGCICT
IGGGGCCAGCCIAAGATCAIGGI I TAGGGIGAT CAGTGC TGGCAGATAAAT TGAAAAGGCACGCTGGCT I
GTGATCT TAAATGAGGACAATCCCCCCAGGGCTGGGCACTCCTCCCCTCCCCTCACT TCTCCCACCTGCA
GAGCCAGIGICCI IGGGIGGGCTAGATAGGATATACIGTAIGCCGGCTCCI ICAAGCTGCTGACICACT I
TATCAATAGT TCCAT T TAAAT TGACT TCAGTGGTGAGACTGTATCCTGT T T GC TAT TGCT TGT
TGTGCTA
TGGGGGGAGGGGGGAGGAATGTGTAAGATAGT TAACATGGGCAAAGGGAGATCT TGGGGTGCAGCACT TA
AACTGCCICGTAACCCI I I TCATGAT I TCAACCACAT I I GCTAGAGGGAGGGAGCAGCCACGGAGT
TAGA
GGCCCT TGGGGT T TCTCT T T TCCACTGACAGGCT T TCCCAGGCAGCTGGCTAGT
TCATTCCCTCCCCAGC
CAGGTGCAGGCGTAGGAATATGGACATCTGGT TGCT T TGGCCTGCTGCCCT CT T TCAGGGGTCCTAAGCC
CACAATCATGCCTCCCTAAGACCT TGGCATCCT TCCCTCTAAGCCGT TGGCACCTCTGTGCCACCTCTCA
CACIGGCTCCAGACACACAGCCIGIGCT I I IGGAGCTGAGATCACTCGCT I CACCCICC TCATCT I I=
CICCAAGTAAAGCCACGAGGICGGGGCGAGGGCAGAGGIGATCACCTGCGIGTCCCATCTACAGACCIGC
AGCT TCATAAAACT TCTGAT T TCTCT TCAGCT T TGAAAAGGGITACCCIGGGCACTGGCCIAGAGCCTCA
CCTCCTAATAGACT TAGCCCCATGAGT T TGCCATGT TGAGCAGGAC TAT T T CT GGCACT
TGCAAGTCCCA
TGAT T T CT TCGGTAAT TCTGAGGGTGGGGGGAGGGACATGAAATCATCT TAGCT TAGCT T
TCTGTCTGTG
AATGTCTATATAGTGTAT TGTGTGT T T TAACAAATGAT T TACACTGACTGT TGCTGTAAAAGTGAAT T
TG
GAAATAAAGT TAT TACTCT GAT TAAA

GAGCCCGAGGGGC GGCCGCGACCCC I C T GACCGAGATCC TGC T GC T T
TCGCAGCCAGGAGCACCGICCCT
CCCCGGAT TAGTGCGTACGAGCGCCCAGTGCCCTGGCCCGGAGAGT GGAAT GATCCCCGAGGCCCAGGGC
GTCGT GC T TCCGCAGTAGT CAGTCCCCGTGAAGGAAACTGGGGAGT CT TGAGGGACCCCCGACTCCAAGC
GCGAAAACCCCGGAIGGIGAGGAGCAGGCAAAT GTGCAATACCAACAT GTO TG TACCTAC T GAIGGIGC T
GTAACCACCTCACAGAT TCCAGCT ICGGAACAAGAGACCCIGGI TAGACCAAAGCCAT I GCT I I TGAAGT

TAT TAAAGTCTGT TGGTGCACAAAAAGACACT TATACTATGAAAGAGGT TC TT T T T TAT CT
TGGCCAGTA
TAT TAT GACTAAACGAT TATATGATGAGAAGCAACAACATAT TGTATAT TGTTCAAATGATCT TCTAGGA
GAT T TGT T TGGCGTGCCAAGCT TCTCTGTGAAAGAGCACAGGAAAATATATACCATGATCTACAGGAACT
TGGTAGTAGTCAATCAGCAGGAATCATCGGACTCAGGTACATCTGTGAGTGAGAACAGGTGTCACCT TGA

AGGTGGGAGTGATCAAAAGGACCT TGTACAAGAGCT TCAGGAAGAGAAACCTTCATCT TCACAT T TGGT T
TCTAGACCATCTACCTCATCTAGAAGGAGAGCAAT TAGTGAGACAGAAGAAAAT TCAGATGAAT TATCTG
GTGAACGACAAAGAAAACGCCACAAATCTGATAGTAT T TCCCT T TCCT T TGATGAAAGCCTGGCTCTGTG
TGTAATAAGGGAGATATGT TGTGAAAGAAGCAGTAGCAGTGAATCTACAGGGACGCCATCGAATCCGGAT
CT TGATGCTGGTGTAAGTGAACAT TCAGGTGAT TGGT TGGATCAGGAT TCAGT T TCAGATCAGT T
TAGTG
TAGAAT T TGAAGT TGAATCTCTCGACTCAGAAGAT TATAGCCT TAGTGAAGAAGGACAAGAACTCTCAGA
TGAAGATGATGAGGTATATCAAGT TACTGTGTATCAGGCAGGGGAGAGTGATACAGAT T CAT T TGAAGAA
GATCCTGAAAT T TCCT TAGCTGACTAT TGGAAATGCACT TCATGCAATGAAATGAATCCCCCCCT TCCAT
CACAT TGCAACAGATGT TGGGCCCT TCGTGAGAAT TGGCT TCCTGAAGATAAAGGGAAAGATAAAGGGGA
AATCTC TGAGAAAGCCAAACTGGAAAACTCAACACAAGC TGAAGAGGGCT T TGATGT TCCTGAT TGTAAA
AAAACTATAGTGAATGAT T CCAGAGAGTCATGT GT TGAGGAAAATGATGATAAAAT TACACAAGCT TCAC
AATCACAAGAAAGTGAAGACTAT TCTCAGCCATCAACT TCTAGTAGCAT TAT T TATAGCAGCCAAGAAGA
TGTGAAAGAGT T TGAAAGGGAAGAAACCCAAGACAAAGAAGAGAGTGTGGAATCTAGT T TGCCCCT TAAT
GCCAT TGAACCT TGTGTGAT T TGTCAAGGTCGACCTAAAAATGGT TGCAT T GT CCATGGCAAAACAGGAC

ATCT TATGGCCTGCT T TACATGTGCAAAGAAGCTAAAGAAAAGGAATAAGCCCTGCCCAGTATGTAGACA
ACCAAT TCAAATGAT TGTGCTAACT TAT T TCCCCTAGT TGACCTGTCTATAAGAGAAT TATATAT T
IOTA
ACTATATAACCCTAGGAAT T TAGACAACCTGAAAT T TAT TCACATATATCAAAGTGAGAAAATGCCTCAA
T TCACATAGAT T T CT TCTCT T TAGTATAAT TGACCTACT T TGGTAGTGGAATAGTGAATACT
TACTATAA
T T TGACT TGAATATGTAGCTCATCCT T TACACCAACTCCTAAT T T TAAATAAT T TCTACTCTGTCT
TAAA
TGAGAAGTACT TGGT TTTTTTTT TCT TAAATATGTATATGACAT T TAAATGTAACT TAT TAT TTTTTT
TG
AGACCGAGTCT TGCTCTGT TACCCAGGCTGGAGTGCAGT GGGTGAT CT TGGCTCACTGCAAGCTCTGCCC
TCCCCGGGT TCGCACCAT TCTCCTGCCTCAGCCTCCCAAT TAGCT TGGCCTACAGTCATCTGCCACCACA
CCTGGCTAAT TITTIGTACTIT TAGTAGAGACAGGGITICACCGIGITAGCCAGGAIGGICTCGATCTCC
TGACCTCGTGATCCGCCCACCTCGGCCTCCCAAAGTGCTGGGAT TACAGGCATGAGCCACCG
NM_014791 GAGAT T TGAT TCCCT TGGCGGGCGGAAGCGGCCACAACCCGGCGATCGAAAAGAT TCT

ACCAGCCGCGTCTCTCAGGACAGCAGGCCCCTGTCCT TCTGTCGGGCGCCGCTCAGCCGTGCCCTCCGCC
CCICAGGI ICI T T TICTAAT TCCAAATAAACT TGCAAGAGGACTATGAAAGAT TATGATGAACT IC
ICAA
ATAT TATGAAT TACATGAAACTAT TGGGACAGGTGGCT T TGCAAAGGTCAAACT TGCCTGCCATATCCT T
ACTGGAGAGATGGTAGCTATAAAAAT CATGGATAAAAACACACTAGGGAGT GAT T TGCCCCGGATCAAAA
CGGAGAT TGAGGCCT TGAAGAACCTGAGACATCAGCATATATGTCAACTCTACCATGTGCTAGAGACAGC
CAACAAAATAT TCATGGT T CT TGAGTACTGCCCTGGAGGAGAGCTGT T TGACTATATAAT T TCCCAGGAT

CGCCTGTCAGAAGAGGAGACCCGGGT TGTCT TCCGTCAGATAGTAT CTGCT GT TGCT TATGTGCACAGCC
AGGGCTATGCTCACAGGGACCTCAAGCCAGAAAAT T TGCTGT T TGATGAATATCATAAAT TAAAGCTGAT
TGACT T IGGICICIGIGCAAAACCCAAGGGIAACAAGGAT TACCATCTACAGACATGCTGIGGGAGICIG
GCT TAT GCAGCACCTGAGT TAATACAAGGCAAATCATAT CT TGGATCAGAGGCAGATGT T TGGAGCATGG
GCATACTGT TATATGT TCT TATGTGTGGAT T TCTACCAT T TGATGATGATAATGTAATGGCT T
TATACAA
GAAGAT TATGAGAGGAAAATATGATGT TCCCAAGTGGCTCTCTCCCAGTAGCAT TCTGCT TCT TCAACAA
ATGCTGCAGGTGGACCCAAAGAAACGGAT T TCTATGAAAAATCTAT TGAACCATCCCTGGATCATGCAAG
AT TACAACTATCCTGT TGAGTGGCAAAGCAAGAATCCT T T TAT TCACCTCGATGATGAT TGCGTAACAGA
ACT T TCTGTACATCACAGAAACAACAGGCAAACAATGGAGGAT T TAAT T TCACTGTGGCAGTATGATCAC
CTCACGGCTACCTATCT TCTGCT TCTAGCCAAGAAGGCTCGGGGAAAACCAGT TCGT T TAAGGCT T TCT T

CT TICTCCIGIGGACAAGCCAGIGCTACCCCAT TCACAGACATCAAGTCAAATAAT TGGAGTCTGGAAGA
TGTGACCGCAAGTGATAAAAAT TATGTGGCGGGAT TAATAGACTAT GAT TGGTGTGAAGATGAT T TATCA
ACAGGTGCTGCTACTCCCCGAACATCACAGT T TACCAAGTACTGGACAGAATCAAATGGGGTGGAATCTA
AATCAT TAACTCCAGCCT TATGCAGAACACCTGCAAATAAAT TAAAGAACAAAGAAAATGTATATACTCC
TAAGTCTGCTGTAAAGAATGAAGAGTACT T TAT GT T TCCTGAGCCAAAGACTCCAGT TAATAAGAACCAG
CATAAGAGAGAAATACTCACTACGCCAAATCGT TACACTACACCCT CAAAAGC TAGAAACCAGTGCCT GA
AAGAAACTCCAAT TAAAATACCAGTAAAT TCAACAGGAACAGACAAGT TAATGACAGGTGTCAT TAGCCC
TGAGAGGCGGTGCCGCTCAGTGGAAT IGGATCTCAACCAAGCACATAIGGAGGAGACICCAAAAAGAAAG
GGAGCCAAAGIGT TIGGGAGCCT TGAAAGGGGGT IGGATAAGGI TATCACT GT GCTCACCAGGAGCAAAA
GGAAGGGT TCTGCCAGAGACGGGCCCAGAAGACTAAAGCT TCACTATAACGTGACTACAACTAGAT TAGT
GAATCCAGATCAACTGT TGAATGAAATAATGTC TAT TCT TCCAAAGAAGCATGT TGACT T TGTACAAAAG
GGT TATACACTGAAGTGTCAAACACAGTCAGAT T T TGGGAAAGTGACAATGCAAT T TGAAT TAGAAGTGT
GCCAGCT TCAAAAACCCGATGTGGTGGGTATCAGGAGGCAGCGGCT TAAGGGCGATGCCTGGGT T TACAA
AAGAT TAGTGGAAGACATCCTATCTAGCTGCAAGGTATAAT TGATGGAT TCTTCCATCCTGCCGGATGAG
TGTGGGTGTGATACAGCCTACATAAAGACTGT TATGATCGCT T TGAT T T TAAAGT TCAT TGGAACTACCA

ACT TGT T TCTAAAGAGCTATCT TAAGACCAATATCTCT T TGT T T T TAAACAAAAGATAT TAT T T
TGTGTA
TGAATCTAAATCAAGCCCATCTGTCAT TATGT TACTGTCTTTTT TAATCAT GT GGT T T TGTATAT
TAATA
AT TGT TGACT T TCT TAGAT TCACT TCCATATGTGAATGTAAGCTCT TAACTATGTCTCT T
TGTAATGTGT
AAT T TCT T TCTGAAATAAAACCAT T TGTGAATATAG

CT ICICCGGACCIGGIGICAGGGGIGGICCIAT GCCCAAGCTGGCT GACCGGAAGCTGT GTGCGGACCAG
GAGTGCAGCCACCCTATCT CCATGGC TGTGGCCCT TCAGGACTACATGGCCCCCGACTGCCGAT TCCTGA
CCAT TCACCGGGGCCAAGTGGTGTATGTCT TCTCCAAGCTGAAGGGCCGTGGGCGGCTCT TCTGGGGAGG
CAGCGT TCAGGGAGAT TAC TATGGAGATCTGGC TGCTCGCCTGGGC TAT T TCCCCAGTAGCAT TGTCCGA

GAGGACCAGACCC TGAAACCTGGCAAAGTCGAT GT GAAGACAGACAAAT GGGA T T TC TAC T GCCAG T
GAG
CTCAGCCTACCGCTGGCCCTGCCGT T TCCCCICCTIGGGIT TATGCAAATACAATCAGCCCAGTGCAAAA
AAAAAAAAAAAAAAAAAAAACT TCGGAGAAGAGATAGCAACAAAAGGCCGCTTGTGTGAAGGCGCCAAAA

GT T T TCGCCCAAGAGACCT T CGGCC T CCCCCAGGGCGCGCGCAAAGGCGCC T T GT T T
TGACAACCTCTTG
GACAACCGGAGGGGC TACCGCCCGGAGACCCC T GT GGTGGACCCCCCGGGCAACCCGGT GT GACAGGGTA
CTCACCCCCACGGCT T TGT CGGGGGT CCCACCAAAGGCCCCAAAGAGGCT OTT T CAAGGCAC TAT T CC
T T
GT TGTAGACCT T GT GT GT GCCACAGGCGCCAAAGAAACC TCGGGGGGC TAACAAACGCACGT GC T
TGGCA
GC T CCGAGAAGGC ICICICCCACCCGAGGGGIGGACGCAACAGGGGGAAIGGGCCATCATAT 1= T GCCC
CCGGIGGOCACCAACICTITTICCCCCATAGAGAGGCCT TAGCACACTAIGIGGGGCACGT TAT TGCCGC
CTAGAGAAACCGAGCGCCAGAAAAT T T CGAAGGGGGGGGCGC T TC T CAT CAT T T
TGCGCAAAACCCCCT T
GIGGGAGTAT GCCCCGAAC T CCICIGGAACACACAAGCGACAC T TGCGOGGGGIC T GCAAAAAACC T CC
T
GT TGGGAAGCCGGCT TCACN
NM_002417 TACCGGGCGGAGGT GAGCGCGGCGCCGGC T CC T CC T GCGGCGGAC T T

GT T CGACAAGIGGCCT T GCGGGCCGGAT CGICCCAGIGGAAGAGT T GTAAAT T TGC T IC TGGCC I
TCCCC
TACGGAT TATACCTGGCCT TCCCCTACGGAT TATACTCAACT TAC T GT T TAGAAAAT GT
GGCCCACGAGA
CGCCTGGT TAC TAT CAAAAGGAGCGGGGT CGACGGT CCCCAC T T T CCCC T GAGCC T CAGCACCT
GC T T GT
T TGGAAGGGGTAT T GAAT GT GACAT CCGTAT CCAGC T T CC T GT T GT GT CAAAACAACAT
TGCAAAAT T GA
AATCCATGAGCAGGAGGCAATAT TACATAAT T TCAGT T CCACAAAT CCAACACAAGTAAAT GGGT C T
GT T
AT T GAT GAGCC T GTACGGC TAAAACAT GGAGAT GTAATAAC TAT TAT T GAT CGT T CC T
TCAGGTATGAAA
AT GAAAGT C T TCAGAATGGAAGGAAGTCAACTGAAT T TCCAAGAAAAATACGTGAACAGGAGCCAGCACG
T CGT GT C T CAAGAT C TAGC T TC TC T T C T GACCC T GAT GAGAAAGC T CAAGAT T
CCAAGGCC TAT TCAAAA
AT CAC T GAAGGAAAAGT T T CAGGAAAT CC TCAGGTACATAT CAAGAAT GT
CAAAGAAGACAGTACCGCAG
AT GAC T CAAAAGACAGT GT T GC T CAGGGAACAAC TAAT GT T CAT T CC T CAGAACAT GC T
GGACGTAAT GG
CAGAAAT GCAGC T GAT CCCAT TTCTGGGGAT T T TAAAGAAAT T TCCAGCGT TAAAT
TAGTGAGCCGT TAT
GGAGAAT T GAAGT C T GT T CCCAC TACACAAT GT C T
TGACAATAGCAAAAAAAATGAATCTCCCTTTTGGA
AGCT T TAT GAGT CAGT GAAGAAAGAGT T GGAT GTAAAAT CACAAAAAGAAAAT GT CC TACAGTAT
TGTAG
AAAATCTGGAT TACAAAC T GAT TACGCAACAGAGAAAGAAAGT GC T GAT GGT T
TACAGGGGGAGACCCAA
C T GT T GGT C T CGCGTAAGT CAAGACCAAAAT C T GGT GGGAGCGGCCACGC T GT
GGCAGAGCCTGC T T CAC
CTGAACAAGAGCT T GACCAGAACAAGGGGAAGGGAAGAGACGTGGAGT C T GT T CAGAC T
CCCAGCAAGGC
T GT GGGCGCCAGC T T T CC T C TC TAT GAGCCGGC TAAAAT GAAGACCCC T GTACAATAT
TCACAGCAACAA
AAT ICI CCACAAAAACATAAGAACAAAGACCIGTATAC TACIGGIAGAAGAGAAT CT= GAAT CIGGGIA
AAAGTGAAGGCT TCAAGGCTGGTGATAAAACTCT TACTCCCAGGAAGCT T TCAACTAGAAATCGAACACC
AGCTAAAGT TGAAGATGCAGCTGACTCTGCCACTAAGCCAGAAAATCTCTCTTCCAAAACCAGAGGAAGT
AT T CC TACAGAT GT GGAAGT TCTGCCTACGGAAACTGAAAT TCACAATGAGCCAT T T T TAAC T C
T GT GGC
T CAC T CAAGT TGAGAGGAAGATCCAAAAGGAT TCCCTCAGCAAGCCTGAGAAAT TGGGCACTACAGCTGG
ACAGAT GT GC T C T GGGT TACCTGGTCT TAGT TCAGT TGATATCAACAACT T TGGT GAT TCCAT
TAAT GAG
AGTGAGGGAATACCT T T GAAAAGAAGGCGT GT GT CC T T T GGT GGGCACC TAAGACC T GAAC
TAT T T GAT G
AAAACT T GCC T CC TAATACGCC T C T CAAAAGGGGAGAAGCCCCAACCAAAAGAAAGT C T C T
GGTAAT GCA
CAC T CCACC T GT CC T GAAGAAAAT CAT CAAGGAACAGCC TCAACCAT CAGGAAAACAAGAGT
CAGGT T CA
GAAAT CCAT GT GGAAGT GAAGGCACAAAGC T TGGT TATAAGCCCT CCAGC T CC TAGT CC
TAGGAAAAC T C
CAGT T GCCAGT GAT CAACGCCGTAGGT CC T GCAAAACAGCCCCTGC T
TCCAGCAGCAAATCTCAGACAGA
GGT T CC TAAGAGAGGAGGGAGAAAGAGT GGCAACC T GCC T T CAAAGAGAGT GT C TAT
CAGCCGAAGT CAA
CAT GATAT T T TACAGAT GATAT GT T CCAAAAGAAGAAGT GGT GC T T CGGAAGCAAAT C T GAT
T GT TGCAA
AAT CAT GGGCAGAT GTAGTAAAAC T T GGT GCAAAACAAACACAAAC TAAAG TCATAAAACAT GGT C
C T CA
AAGGT CAAT GAACAAAAGGCAAAGAAGACC T GC TAC T CCAAAGAAGCC T GT GGGCGAAGT T
CACAGT CAA
T T TAGTACAGGCCACGCAAAC T C T CC T TGTACCATAATAATAGGGAAAGCTCATACTGAAAAAGTACATG

T GCC T GC T CGACCCTACAGAGT GC T CAACAAC T T CAT T TCCAACCAAAAAATGGACT T
TAAGGAAGATCT
T T CAGGAATAGC T GAAAT GT TCAAGACCCCAGTGAAGGAGCAACCGCAGT T GACAAGCACAT GT
CACAT C
GC TAT T TCAAAT TCAGAGAAT T T GC T TGGAAAACAGT T T CAAGGAAC T GAT
TCAGGAGAAGAACC T C T GC
TCCCCACCTCAGAGAGT T T T GGAGGAAAT GT GT ICI ICAGIGCACAGAAT GCAGCAAAACAGCCAT C
TGA
TAAAT GC T CT GCAAGCCC T CCC T TAAGACGGCAGTGTAT
TAGAGAAAATGGAAACGTAGCAAAAACGCCC
AGGAACACCTACAAAATGACTTCTCTGGAGACAAAAACT TCAGATACTGAGACAGAGCCT TCAAAAACAG
TAT CCAC T GCAAACAGGT CAGGAAGGT C TACAGAGT T CAGGAATATACAGAAGC TACC T GT
GGAAAGTAA
GAGTGAAGAAACAAATACAGAAAT T GT T GAGT GCAT CC TAAAAAGAGGT CAGAAGGCAACAC TAC
TACAA
CAAAGGAGAGAAGGAGAGATGAAGGAAATAGAAAGACCTTTTGAGACATATAAGGAAAATAT TGAAT TAA
AAGAAAAC GAT GAAAAGAT GAAAGCAAT GAAGAGAT CAAGAAC T T GGGGGCAGAAAT GT GCACCAAT
GT C
T GACC T GACAGACCTCAAGAGC T TGCCTGATACAGAACT CAT GAAAGACAC GGCACGTGGCCAGAAT C
T C
CTCCAAACCCAAGAT CAT GCCAAGGCACCAAAGAGTGAGAAAGGCAAAAT CAC TAAAAT GCCCT GCCAG T
CAT TACAACCAGAACCAATAAACACCCCAACACACACAAAACAACAGT TGAAGGCATCCCTGGGGAAAGT
AGGT GT GAAAGAAGAGC T CC TAGCAGTCGGCAAGT TCACACGGACGT CAGGGGAGACCAC GCACACGCAC

AGAGAGCCAGCAGGAGAT GGCAAGAGCAT CAGAACGT T TAAGGAGT CTCCAAAGCAGAT CCTGGACCCAG
CAGCCCGT GTAAC T GGAAT GAAGAAGTGGCCAAGAACGCC TAAGGAAGAGGCC CAGT CAC TAGAAGACC
T
GGC T GGC T TCAAAGAGC T C T TCCAGACACCAGGICCCTC T GAGGAAT CAAT GAC T GAT
GAGAAAAC TACC
AAAATAGCCIGCAAATCTCCACCACCAGAATCAGIGGACACTCCAACAAGCACAAAGCAAIGGCCIAAGA
GAAGTCTCAGGAAAGCAGATGTAGAGGAAGAAT TCT TAGCACTCAGGAAACTAACACCATCAGCAGGGAA
AGCCAT GC T TACGCCCAAACCAGCAGGAGGTGAT GAGAAAGACAT TAAAGCAT ITAIGGGAACTCCAGTG
CAGAAACIGGACCTGGCAGGAACTITACCIGGCAGCAAAAGACAGCTACAGACTCCIAAGGAAAAGGCCC
AGGCTCTAGAAGACCIGGCTGGCT T TAAAGAGC ICI T CCAGACTCC TGGICACACCGAGGAAT TAGTGGC
T GC T GGTAAAACCAC TAAAATACCC T GCGAC T C T CCACAGT CAGACCCAGT
GGACACCCCAACAAGCACA
AAGCAACGACCCAAGAGAAGTATCAGGAAAGCAGAIGTAGAGGGAGAACTCTIAGCGTGCAGGAATCTAA
T GCCAT CAGCAGGCAAAGCCAT GCACACGCC TAAACCAT CAGTAGGT GAAGAGAAAGACAT CAT CATAT
T

T GI GGGAAC T CCAGT GCAGAAAC T GGACC T GACAGAGAAC T
TAACCGGCAGCAAGAGACGGCCACAAACT
CC TAAGGAAGAGGCCCAGGC IC TGGAAGACC TGAC TGGC T T TAAAGAGCICIT
CCAGACCCCIGGICATA
C TGAAGAAGCAGT GGCTGC TGGCAAAACTAC TAAAAT GC CC TGCGAAT C T T CT
CCACCAGAATCAGCAGA
CACCCCAACAAGCACAAGAAGGCAGCCCAAGACACCT T TGGAGAAAAGGGACGTACAGAAGGAGCT C T CA
GCCCTGAAGAAGC T CACACAGACATCAGGGGAAAC CACACACACAGATAAAGTAC CAGGAGGT GAG GATA
AAAGCAT CAACGC GT T TAGGGAAACTGCAAAACAGAAAC TGGACCCAGCAGCAAGTGTAACTGGTAGCAA
GAGGCACCCAAAAACTAAGGAAAAGGCCCAACCCCTAGAAGACCTGGCTGGCT TGAAAGAGCICIT CCAG
ACACCAGTATGCACTGACAAGCCCACGACTCACGAGAAAACTACCAAAATAGCCTGCAGATCACAACCAG
ACCCAGTGGACACACCAACAAGCTCCAAGCCACAGTCCAAGAGAAGTCTCAGGAAAGTGGACGTAGAAGA
AGAAT TOT T CGCAC T CAGGAAAC GAACAC CATCAGCAGGCAAAGCCAT GCACACACCCAAAC
CAGCAGTA
AGTGGT GAGAAAAACATCTACGCAT T TAT GGGAAC T CCAGTGCAGAAAC T GGACCTGACAGAGAAC T T
AA
CTGGCAGCAAGAGACGGCTACAAACT CC TAAGGAAAAGGCCCAGGC TCTAGAAGACCTGGCTGGCT T TAA
AGAGC T CT T CCAGACACGAGGT CACAC T GAGGAAT CAAT GAC TAAC GATAAAAC T
GCCAAAGTAGC C T GC
AAATCT TCACAACCAGACCCAGACAAAAACCCAGCAAGC TCCAAGCGACGGCT CAAGACATCCCTGGGGA
AAGT GGGCGT GAAAGAAGAGCT CC TAGCAGT TGGCAAGC T CACACAGACAT CAGGAGAGACTACACACAC

ACACACAGAGCCAACAGGAGATGGTAAGAGCAT GAAAGCAT T TAT GGAGT C TCCAAAGCAGATCT TAGAC
TCAGCAGCAAGTC TAAC T GGCAGCAAGAGGCAGC T GAGAAC T CC TAAGGGAAAGT C T GAAGT CCC
T GAAG
ACCTGGCCGGCT T CAT CGAGC T C T T CCAGACACCAAGT CACAO TAAGGAAT
CAATGACTAACGAAAAAAC
TACCAAAGTAT CC TACAGAGCT ICACAGCCAGACCIAGIGGACACCCCAACAAGCTCCAAGCCACAGCCC
AAGAGAAGTCTCAGGAAAGCAGACACTGAAGAAGAAT T T T TAGCAT T TAGGAAACAAACGCCATCAGCAG
GCAAAGCCATGCACACACCCAAACCAGCAGTAGGTGAAGAGAAAGACATCAACACGT T T T T GGGAAC T CC
AGTGCAGAAAC T GGACCAGCCAGGAAAT T TACO T GGCAGCAATAGACGGC T AC AAAC
TCGTAAGGAAAAG
GCCCAGGCTCTAGAAGAAC TGACTGGCT TCAGAGAGCTTTTCCAGACACCATGCACTGATAACCCCACGA
C T GAT GAGAAAAC TACCAAAAAAATACTCTGCAAATCTCCGCAATCAGACCCAGCGGACACCCCAACAAA
CACAAAGCAACGGCCCAAGAGAAGCC TCAAGAAAGCAGACGTAGAGGAAGAAT T T T TAG CAT TCAGGAAA
C TAACACCATCAGCAGGCAAAGCCAT GCACACGCC TAAAGCAGCAG TAG= GAAGAGAAAGACATCAACA
CAT T TGIGGGGACTCCAGT GGAGAAAC T GGACC T GC TAGGAAAT T
TACCTGGCAGCAAGAGACGGCCACA
AAC T CC TAAAGAAAAGGCCAAGGCTC TAGAAGATCTGGC TGGCT TCAAAGAGC TCTTCCAGACACCAGGT
CACAO T GAGGAAT CAAT GACCGAT GACAAAAT CACAGAAGTAT CC T GCAAATC
TCCACAACCAGACCCAG
TCAAAACCCCAACAAGCTCCAAGCAACGACTCAAGATAT CC T T GGGGAAAG TAGGT GT GAAAGAAGAGGT
CC TAC CAGTCGGCAAGC T CACACAGACGT CAGGGAAGAC CACACAGACACACAGAGAGACAGCAGGAGAT
GGAAAGAGCATCAAAGCGT T TAAGGAAT C T GCAAAGCAGAT GC T GGACCCAGCAAAC TAT GGAAC T
GGGA
TGGAGAGGTGGCCAAGAACACCTAAGGAAGAGGCCCAAT CAC TAGAAGACC TGGCCGGC T TCAAAGAGCT
C T T CCAGACAC CAGAC CACAC T GAGGAAT CAACAAC T GAT GACAAAAC TAC CAAAATAGCC T
GCAAAT C T
C CAC CACCAGAAT CAAT GGACACT CCAACAAGCACAAGGAGGCGGCCCAAAACACC T T T
GGGGAAAAGGG
ATATAGTGGAAGAGCTCTCAGCCCTGAAGCAGC TCACAC AGACCACACACACAGACAAAGTAC CAG GAGA
T GAGGATAAAGGCAT CAAC GT GT TCAGGGAAAC T GCAAAACAGAAAC T
GGACCCAGCAGCAAGTGTAACT
GGTAGCAAGAGGCAGC CAAGAAC T CC TAAGGGAAAAGCCCAACCCC TAGAAGACT TGGC TGGCT TGAAAG

AGCTCT TCCAGACACCAATATGCACT GACAAGC CCAC GAC T CAT GAGAAAACTAC CAAAATAGCC T
GCAG
AT C T CCACAACCAGACCCAGT GGGTACCCCAACAAT C T T CAAGCCACAGTCCAAGAGAAGTCTCAGGAAA

GCAGACGTAGAGGAAGAAT CC T TAGCACTCAGGAAAC GAACAC CAT CAGTAGG GAAAGC TAT
GGACACAC
C CAAAC CAGCAGGAGGT GAT GAGAAAGACAT GAAAGCAT T TAT GGGAAC T C CAGTGCAGAAAT
TGGACCT
GCCAGGAAAT T TACCTGGCAGCAAAAGATGGCCACAAAC T CC TAAG GAAAAGGCCCAGGC T C
TAGAAGAC
CTGGCT GGCT TCAAAGAGCTCTTCCAGACACCAGGCAC T GACAAGCCCACGAC T GAT GAGAAAAC TAC
CA
AAATAGCC TGCAAAT CT CCACAAC CAGACCCAGTGGACACCCCAGCAAGCACAAAGCAACGGCCCAAGAG
AAACCT CAGGAAAGCAGACGTAGAGGAAGAAT T T T TAGCAC TCAGGAAAC GAACAC CAT
CAGCAGGCAAA
GCCAIGGACACACCAAAACCAGCAGTAAGIGAT GAGAAAAATATCAACACATT T GT GGAAAC T CCAGT GC
AGAAAC T GGACCT GC TAGGAAAT T TACCTGGCAGCAAGAGACAGCCACAGACT CC TAAG GAAAAGGC T
GA
GGCTCTAGAGGACCTGGT T GGCT TCAAAGAACT CT TCCAGACACCAGGTCACACTGAGGAATCAAT GACT
GAT GACAAAAT CACAGAAG TAT CC T G TAAAT C T CCACAGCCAGAGT CAT TCAAAACCTCAAGAAGC
TCCA
AGCAAAGGC T CAAGATACC CC T GGT GAAAGT GGACATGAAAGAAGAGCCCC TAGCAGTCAGCAAGC T
CAC
ACGGACAT CAGGGGAGACTACGCAAACACACAC AGAGCCAACAGGAGATAG TAAGAGCAT CAAAGC GT II
AAGGAGTC TCCAAAGCAGAT CC TGGACCCAGCAGCAAGT GTAAC IGGIAGCAG GAGGCAGC T GAGAAC T
C
GTAAGGAAAAGGC CCGT GC TCTAGAAGACCTGGT TGACT TCAAAGAGCTCT TCTCAGCACCAGGTCACAC
T GAAGAGT CAATGAC TAT T GACAAAAACACAAAAAT T CCC T GCAAAT C T CC CCCACCAGAAC
TAACAGAC
AC T GCCAC GAGCACAAAGAGAT GCCCCAAGACACGICCCAGGAAAGAAGTAAAAGAGGAGC T C T CAGCAG

T TGAGAGGC T CAC GCAAAC A T CAGGGCAAAGCACACACACACACAAAGAAC CAGCAAGC GG T GAT
GAGGG
CAT CAAAGTAT TGAAGCAACGTGCAAAGAAGAAACCAAACCCAGTAGAAGAGGAACCCAGCAGGAGAAGG
CCAAGAGCACCIAAGGAAAAGGCCCAACCCCIGGAAGACCTGGCCGGCT TCACAGAGCTCTCTGAAACAT
CAGGICACACTCAGGAATCACTGACT GCTGGCAAAGCCAC TAAAATACCC T GC GAAT C T CCCCCAC
TAGA
AGIGGIAGACACCACAGCAAGCACAAAGAGGCATCICAGGACACGIGIGCAGAAGGIACAAGTAAAAGAA
GAGCCT TCAGCAGTCAAGT ICACACAAACATCAGGGGAAACCACGGATGCAGACAAAGAACCAGCAGGIG
AAGATAAAGGCAT CAAAGCAT TGAAGGAATCTGCAAAACAGACACCGGCTCCAGCAGCAAGTGTAACTGG
CAGCAGGAGACGGCCAAGAGCACCCAGGGAAAGIGCCCAAGCCATAGAAGACCTAGCTGGCT TCAAAGAC
CCAGCAGCAGGTCACACTGAAGAATCAATGACT GATGACAAAACCAC TAAAAT ACCC T GCAAAT CAT CAC
CAGAAC TAGAAGACACCGCAACAAGC ICAAAGAGACGGCCCAGGACACGIGCCCAGAAAGTAGAAGIGAA
GGAGGAGC T GT TAGCAGT T GGCAAGC T CACACAAACCTCAGGGGAGACCAC GCACACCGACAAAGAGCCG

GTAGGT GAGGGCAAAGGCACGAAAGCAT T TAAGCAACCT GCAAAGCGGAAGCT GGACGCAGAAGAT GTAA

T TGGCAGCAGGAGACAGCCAAGAGCACCTAAGGAAAAGGCCCAACCCCTGGAAGATCTGGCCAGCT TCCA
AGAGC T C TC TCAAACACCAGGCCACAC T GAGGAAC T GGCAAAT GGT GC T GC TGATAGC T T
TACAAGCGCT
CCAAAGCAAACACC TGACAGT GGAAAACC TC TAAAAATATCCAGAAGAGT T CT TCGGGCCCCTAAAGTAG
AACCCGIGGGAGACGIGGIAAGCACCAGAGACCCIGTAAAATCACAAAGCAAAAGCAACAC I TCCC TGCC
CCCACTGCCCTICAAGAGGGGAGGTGGCAAAGAIGGAAGCGICACGGGAACCAAGAGGCTGCGCTGCATG
CCAGCACCAGAGGAAAT T GT GGAGGAGC T GCCAGCCAGCAAGAAGCAGAGGGT T GC TCCCAGGGCAAGAG

GCAAATCATCCGAACCCGTGGTCATCATGAAGAGAAGT T TGAGGACT TCTGCAAAAAGAAT T GAACC T GC
GGAAGAGCTGAACAGCAACGACATGAAAACCAACAAAGAGGAACACAAAT TACAAGACTCGGICCCIGAA
AATAAGGGAATAT CCCT GCGC TCCAGACGCCAAAATAAGAC T GAGGCAGAACAGCAAATAAC T GAG= T
T TGTAT TAGCAGAAAGAAT AGAAATAAACAGAAAT GAAAAGAAGCCCAT GAAGACC TCCCCAGAGAT GGA
CAT T CAGAAT CCAGAT GAT GGAGCCC GGAAACC CATACC TAGAGACAAAGT CAC T GAGAACAAAAG
G T GC
ITGAGGICTGCTAGACAGAATGAGAGCTCCCAGCCIAAGGIGGCAGAGGAGAGCGGAGGGCAGAAGAGIG
CGAAGGT TCTCATGCAGAATCAGAAAGGGAAAGGAGAAGCAGGAAAT TCAGAC TCCAT GT GCCTGAGAT C
AAGAAAGACAAAAAGCCAGCCTGCAGCAAGCACT T T GGAGAGCAAATC T GT GCAGAGAG TAACGCGGAGT
GTCAAGAGGT GT GCAGAAAATCCAAAGAAGGC T GAGGACAAT GT GT GT GTCAAGAAAAT
AAGAACCAGAA
GTCATAGGGACAGTGAAGATAT T TGACAGAAAAATCGAACTGGGAAAAATATAATAAAGT TAGT T T T GT G

ATAAGT TCTAGTGCAGT TTTTGTCATAAAT TACAAGTGAAT TC T GTAAGTAAGGC T GTCAGTC T GC T
TAA
GGGAAGAAAACT T TGGAT T T GC T GGGTC T GAAT CGGC T T CATAAAC TCCAC TGGGAGCACT
GC T GGGC TC
=GAO T GAGAATAGT TGAACACCGGGGGCT T IGTGAAGGAGICIGGGCCAAGGIT TGCCCICAGCTITG
CAGAATGAAGCCT T GAG= CIGICACCACCCACAGCCACCCIACAGCAGCC TTAAC T GT GACAC T T
GCCA
CAC T GT GTCGTCGT T T GT T T GCC TAT GTCCTCCAGGGCACGGTGGCAGGAACAAC TATCC TCGTC
T GTCC
CAACAC T GAGCAGGCAC TCGGTAAACACGAAT GAAT GGAT GAGCGCACGGATGAAT GGAGC T
TACAAGAT
CTGTCT T TCCAATGGCCGGGGGCAT T TGGTCCCCAAAT TAAGGC TAT TGGACATCTGCACAGGACAGTCC
TAT T T T T GAT GTCC T T TCCT T TCTGAAAATAAAGT T T T GT GC T T
TGGAGAATGACTCGTGAGCACATCT T
TAGGGACCAAGAGTGACT T TCTGTAAGGAGTGACTCGTGGCT TGCCT T GGT CT C T TGGGAATACT T T
TCT
AACTAGGGT T GC T C TCACC T GAGACAT TC TCCACCCGCGGAATC TCAGGGT CCCAGGC T GT
GGGCCATCA
CGACCTCAAACTGGCTCCTAATCTCCAGCT T TCCTGTCAT TGAAAGCT TCGGAAGT T TAC T GGCTC T
GC T
CCCGCC 1= T T IC T T IC T GACTC TAT CIGGCAGCCCGAT GCCACCCAGTACAGGAAGT
GACACCAGTAC T
CIGTAAAGCATCATCATCCTIGGAGAGACTGAGCACTCAGCACCT T CAGCCACGAT T TCAGGATCGC T IC
CT T GT GAGCCGC T GCC TCCGAAATC T CC T T TGAAGCCCAGACATCT T TCTCCAGCT TCAGACT
TGTAGAT
ATAACTCGT TCATC T =AT T TAC T T T CCAC T T T GCCCCC T GTO= ICI= GT
TCCCCAAATCAGAGAAT
AGCCCGCCATCCCCCAGGICACCIGICTGGAT TCCTCCCCAT ICACCCACCITGCCAGGIGCAGGIGAGG
AT GGTGCACCAGACAGGGTAGC T GTCCCCCAAAAT GT GCCC T GT GCGGGCAGT GCCC T GTC
TCCACGT T T
GT T TCCCCAGT GT C T GGCGGGGAGCCAGGT GACATCATAAATAC T T GC T GAAT GAAT
GCAGAAATCAGCG
GTACTGACT TGTACTATAT TGGCTGCCATGATAGGGT TCTCACAGCGTCATCCATGATCGTAAGGGAGAA
TGACAT TC T GC T TGAGGGAGGGAATAGAAAGGGGCAGGGAGGGGACATCTGAGGGCT TCACAGGGCTGCA
AAGGGTACAGGGAT T GCACCAGGGCAGAACAGGGGAGGGT GT TCAAGGAAGAGTGGCTCT TAGCAGAGGC
ACT T T GGAAGGT GT GAGGCATAAAT GC T TCCT TCTACGTAGGCCAACCTCAAAACT T
TCAGTAGGAAT GT
T GC TAT GATCAAGT T GT TCTAACACT T TAGACT TAGTAGTAAT TAT GAACC TCACATAGAAAAAT
T TCAT
CCAGCCATAT GCC T GT GGAGT GGAATAT TC T GT T TAGTAGAAAAAT CC T T TAGAGT
TCAGCTCTAACCAG
AAATCT T GC T GAAGTAT GT CAGCACC T T T TCTCACCCTGGTAAGTACAGTATT
TCAAGAGCACGCTAAGG
GT GGT T T TCAT T T TACAGGGC T GT T GAT GAT GGGT TAAAAAT GT TCAT T TAAGGGC
TACCCCCGT GT T TA
ATAGATGAACACCACT TCTACACAACCCTCCT TGGTACTGGGGGAGGGAGAGATCTGACAAATACTGCCC
AT TCCCCTAGGCTGACTGGAT T TGAGAACAAATACCCACCCAT T TCCACCATGGTATGGTAACT TCTCTG
AGCTTCAGT T TCCAAGTGAAT T TCCATGTAATAGGACAT TCCCAT TAAATACAAGCT GT T T T TACT
TTTT
CGCCTCCCAGGGCCT GT GGGATCT GGTCCCCCAGCCTCT CT TGGGCT T TCT TACACTAACTCTGTACCTA

CCATCTCCTGCCTCCCT TAGGCAGGCACCTCCAACCACCACACACT CCCT GCT GT T T TCCCTGCCTGGAA
CT T TCCCTCCTGCCCCACCAAGATCAT T TCATCCAGTCCTGAGCTCAGCT TAAGGGAGGCT ICI IGO=
IGGGI T CCCTCACCCCCAT GCCIGICCICCAGGCTGGGGCAGGI IC T TAGT T T GCCTGGAAT TGT T
=GT
ACC TC T T T GTAGCACGTAGT GT T GT GGAAAC TAAGCCAC TAAT TGAGT T
TCTGGCTCCCCTCCTGGGGT T
GTAAGT T T T GT =AT =AT GAGGGCCGAC T GCAT T ICC T GGT
TACICTATCCCAGTGACCAGCCACAGGA
GAT GTCCAATAAAGTAT GT GAT GAAAT GGTC T TAAAAAAAAAAAAAA
NM_024101 GCGCCGGGACGTGGCCAGT

GCCCTGCT TGCCCCCAT TATCCAGCCITGCCCCGGCGCCCIGACCIGACGCCCIGGCCIGACGCCCIGCT
TCGTCGCCTCCTT TCTCTCCCAGGT GCT GGACCAGGGAC T GAGCGTCCCCCGGAGAGGGTCCGGT GT GAC
CCCGACAAGAAGCAGAAATGGGGAAGAAACTGGATCT II CCAAGCTCACTGAT GAAGAGGCCCAGCAT GT
CT TGGAAGT T GT TCAACGAGAT T T TGACCTCCGAAGGAAAGAAGAGGAACGGCTAGAGGCGTTGAAGGGC
AAGAT TAAGAAGGAAAGCT CCAAGAGGGAGC T GC T T TCCGACAC T GCCCAT CT GAACGAGACCCAC
T GCG
CCCGCTGCCTGCAGCCCTACCAGCTGCT T GT GAATAGCAAAAGGCAGT GCC TGGAAT GT GGCCTCT TCAC

C TGCAAAAGC T GT GGCCGCGTCCACCCGGAGGAGCAGGGC T GGATC T GTGACCCC T GCCATC T
GGCCAGA
GTCGT GAAGATCGGCTCAC TGGAGTGGTAC TAT GAGCAT GT GAAAGCCCGC T T CAAGAGGT
TCGGAAGTG
CCAAGGTCATCCGGTCCCTCCACGGGCGGC T GCAGGGT GGAGC T GGGCCTGAAC T GATATC T
GAAGAGAG
AAGTGGAGACAGCGACCAGACAGATGAGGATGGAGAACCIGGCTCAGAGGCCCAGGCCCAGGCCCAGCCC
ITIGGCAGCAAAAAAAAGCGCCTCCICICCGICCACGACTTCGACTTCGAGGGAGACTCAGATGACTCCA
CICAGCCTCAAGGICACTCCCTGCACCTGICCTCAGTCCCIGAGGCCAGGGACAGCCCACAGICCCTCAC
AGAT GAGT CC T GC T CAGAGAAGGCAGCCCCTCACAAGGC T GAGGGCCTGGAGGAGGCTGATACTGGGGCC

ICIGGGTGCCACTCCCATCCGGAAGAGCAGCCGACCAGCATCICACCTICCAGACACGGCGCCCIGGCTG
AGC TC T GCCCGCC T GGAGGC TCCCACAGGAT GGCCC T GGGGAC T GC T GC T GCAC
TCGGGTCGAAT GTCAT
CAGGAATGAGCAGCTGCCCCTGCAGTACT T GGCCGAT GT GGACACC TCT GATGAGGAAAGCATCCGGGCT
CACGT GAT GGCCTCCCACCAT TCCAAGCGGAGAGGCCGGGCGTCT T CT GAGAGTCAGAT CT T T GAGCT
GA
ATAAGCATAT T TCAGCTGT GGAAT GCCTGCT GACCTACCIGGAGAACACAGIT GIGO= CCCT T GGCCAA

GGGTCTAGGTGCTGGAGTGCGCACGGAGGCCGATGTAGAGGAGGAGGCCCTGAGGAGGAAGCTGGAGGAG
C TGACCAGCAACGT CAGT GACCAGGAGACC T CG T C C GAGGAGGAGGAAGCCAAGGAC
GAAAAGGCAGAGC
CCAACAGGGACAAATCAGT TGGGCCTCTCCCCCAGGCGGACCCGGAGGTGGGCACGGCTGCCCATCAAAC
CAACAGACAGGAAAAAAGC CCCCAGGACCC T GGGGACCC CGT CCAG TACAACAGGACCACAGAT GAGGAG
CT GTCAGAGCT GGAGGACAGAGT GGCAGT GACGGCCTCAGAAGTCCAGCAGGCAGAGAGCGAGGT T TCAG
ACAT TGAATCCAGGAT TGCAGCCCTGAGGGCCGCAGGGCTCACGGIGAAGCCCTCGGGAAAGCCCCGGAG
GAAGICAAACCTCCCGATAT TICTCCCTCGAGTGGCTGGGAAACTIGGCAAGAGACCAGAGGACCCAAAT
GCAGACCCT TCAAGT GAGGCCAAGGCAAT GGCT GT GCCC TATCT TCTGAGAAGAAAGT TCAGTAAT
TCCC
T GAAAAGTCAAGGTAAAGAT GAT GAT TCT T T TGATCGGAAATCAGTGTACCGAGGCTCGCTGACACAGAG
AAACCCCAACGCGAGGAAAGGAAIGGCCAGCCACACCITCGCGAAACCIGTGGIGGCCCACCAGTCCIAA
CGGGACAGGACAGAGAGACAGAGCAGCCCTGCACT GT T T TCCCTCCACCACAGCCATCCTGTCCCT CAT T
GGCTCT GIGOT T TCCACTATACACAGTCACCGTCCCAATGAGAAACAAGAAGGAGCACCCTCCACATGGA
CTCCCACCTGCAAGTGGACAGCGACAT TCAGTCCTGCACTGCTCACCTGGGTT TACT GATGACTCC T GGC
T GCCCCACCATCC TCTCT GATCT GT GAGAAACAGCTAAGCT GCT GT GACT TCCCT T TAGGACAAT
GT T GT
GTAAAT CT T T GAAGGACACACCGAAGACCT T TATACT GT GATCT T T TACCCCT T =ACT CT
TGGCT =CT
TAT GT TGCT T TCATGAATGGAATGGAAAAAAGATGACTCAGT TAAGGCACCAGCCATAT GT GTAT T CT
TG
AT GGTC TATATCGGGGT GT GAGCAGAT GT T TGCGTAT T T CT T GT GGGTGTGACT GGATAT
TAGACATCCG
GACAAGTGACT GAACTAAT GATCT GC T GAATAAT GAAGGAGGAATAGACACCCCAGICCCCACCCTACGT
GCACCCGCTCTGCAAGT TCCCAT GT GATCT GTAGACCAGGGGAAAT TACACTGCGGTCAAGGGCAGAGCC
TGCACATGACAGCAAGTGAGCAT T TGATAGAT GC TCAGAT GC TAGT GCAGAGAGCC TGC
TGGGAGACGAA
GAGACAGCAGGCAGAGCTCCAGATGGGCAAGGAAGAGGCT TGGT TC TAGCC TGGCTCT GCCCCTCACT GC
AGT GGAT CCAGT GGGGCAGAGGACAGAGGGTCACAACCAAT GAGGGAT GTC TGCCAAGGAT GGGGGT GCA

GAGGCCACAGGAGTCAGCT TGCCACTCGCCCAT TGGT TACATAGAT GATCT CT CAGACAGGCTGGGACTC
AGAGT TAT TICCIAGTATCGGIGIGCCCCATCCAGTITTAAGIGGAGCCCICCAAGACTCTCCAGAGCTG
CCT T T GAACATCC TAACAGTAATCACATCICACCCICCC T GAG= T CACI T
TAGACAGGACCCAATGGCT
GCACTGCCT T T GT CAGAGGGGGT GCT GAGAGGAGT GGCT TCT T T
TAGAATCAAACAGTAGAGACAAGAGT
CAAGCCT T GT GTC T TCAAGCAT TGACCAAGT TAAGT GT T TCCT TCCCTCTCTCAATAAGACACT
TCCAGG
AGCT T TCCAATCTCTCACT TAAAACTAAGGT T T GAATCT CAAAGT GT
TGCTGGGAGGCTGATACTCCTGC
AACT TCAGGAGACCT GT GAGCACACAT TAGCAGCT GT T TCTCTGACTCCT T GT
GGCATCAGATAAAAACG
TGGGAGT T T T TCCATATAAT TCCCAGCCT TACT TATAAAT TCTAT T CT T TGAAAAAAT TAT
TCAGGCTAG
GTAAGGTGGCTCATACCTATAATCCCAGCCCT T TGAGAGGCCAAGGTGGGAGAAT TGCT TGAGGCCAGGA
GT T TGAGACCTCCTGGGCAACATAGTGAGATCCCATCTCTACAAAAAACAAAACAAAAAAAT TACCCAAG
CAT GAT GGTATAT GCCT GTAGTCGTACCTACT TACT TAGGAGGCTGAGGCAGGAGGATCACT TGAGCCCT
GGAGGT TGGGGCTGCAGTGAGCCATGATCGCATCACTATACTCGAGCCTGGGCAACAGAGTGAGACCT TG
TCTCT TAAAAAAAT TAATAATAAATAAATGAAAATAAT T CT TCAGAAAAAAAAAAAAAAAA
NM_005940 CCCCCGAT GCTGC TGCTGC T GCTCCAGCCGCCGCCGCT GCTGGCCCGGGCT CT GCCGCCGGACGCCCACC
ACCTCCAT GCCGAGAGGAGGGGGCCACAGCCCT GGCAT GCAGCCCT GCCCAGTAGCCCGGCACCTGCCCC
T GCCACGCAGGAAGCCCCCCGGCCTGCCAGCAGCC T CAGGCCTCCCCGCTG TGGCGTGCCCGACCCATC T
GAT GGGCT GAGTGCCCGCAACCGACAGAAGAGG T TCGTGCT T TCTGGCGGGCGCTGGGAGAAGACGGACC
TCACCTACAGGATCCTTCGGT TCCCATGGCAGT IGGIGCAGGAGCAGGTGCGGCAGACGAIGGCAGAGGC
CCIAAAGGIAIGGAGCGATGIGACGCCACTCACCIT TAC T GAGGTGCACGAGGGCCGT GCT GACAT CAT G
ATCGACTICGCCAGGIACIGGCATGGGGACGACCIGCCGIT TGAIGGGCCTGGGGGCATCCIGGCCCATG

TGACCAGGGCACAGACCTGCTGCAGGTGGCAGCCCATGAAT T T GGCCACGT GC T GGGGCTGCAGCACACA
ACAGCAGCCAAGGCCCTGAIGICCGC CT IC TACACC T T TCGCTACCCACTGAGICTCAGCCCAGATGACT
GCAGGGGCGT TCAACACCTATATGGCCAGCCCTGGCCCACTGTCACCTCCAGGACCCCAGCCCTGGGCCC
CCAGGC T GGGATAGACACCAAT GAGAT TGCACCGCT GGAGCCAGACGCCCCGCCAGAT GCCTGTGAGGCC
T CC T T T GACGCGGTCTCCACCATCCGAGGCGAGCTCTTT TTCT TCAAAGCGGGCT T T GT GT
GGCGCCTCC
GT GGGGGCCAGC T GCAGCCCGGCTACCCAGCAT TGGCCT CTCGCCACTGGCAGGGACTGCCCAGCCCTGT
GGACGCTGCCT TCGAGGATGCCCAGGGCCACAT T IGGI T CT TCCAAGGIGCTCAGTACTGGGIGTACGAC
GGT GAAAAGCCAGTCCTGGGCCCCGCACCCCTCACCGAGCT GGGCC T GGT GAGGT TCCCGGTCCAT GCTG
CCT TGGTCTGGGGTCCCGAGAAGAACAAGATCTACT TCT TCCGAGGCAGGGACTACTGGCGT T TCCACCC
CAGCACCCGGCGIGTAGACAGICCCGIGCCCCGCAGGGCCACTGACTGGAGAGGGGIOCCCICTGAGATC
GACGCTGCCT TCCAGGAT GCT GAT GGCTATGCCTACT TCCTGCGCGGCCGCCTCTACTGGAAGT T TGACC
CT GT GAAGGT GAAGGCTCT GGAAGGC T TCCCCCGTCTCGTGGGTCCTGACT TCT T T GGC T GT
GCCGAGCC
T GCCAACACT T TCCICT GACCAT GGC T IGGATGCCCICAGGGGT GC T GACCCC
TGCCAGGCCACGAATAT
CAGGCTAGAGACCCAIGGCCATCTITGIGGCTGIGGGCACCAGGCATGGGACTGAGCCCATGICTCCICA
GGGGGATGGGGTGGGGTACAACCACCATGACAACTGCCGGGAGGGCCACGCAGGTCGTGGTCACCTGCCA
GCGACTGTCTCAGACTGGGCAGGGAGGCT T TGGCATGACT TAAGAGGAAGGGCAGTCT TGGGCCCGCTAT
GCAGGICCIGGCAAACCIGGCTGCCCIGICICCATCCCIGICCCTCAGGGIAGCACCATGGCAGGACIGG
GGGAACTGGAGTGTCCT T GCT GTATCCCT GT T GT GAGGT TCCT
TCCAGGGGCTGGCACTGAAGCAAGGGT
GCTGGGGCCCCATGGCCT TCAGCCCTGGCTGAGCAACTGGGCTGTAGGGCAGGGCCACT TCCTGAGGTCA
GGTCT TGGTAGGTGCCTGCATCTGTCTGCCT TC T GGCT GACAATCC T GGAAATCT GT
TCTCCAGAATCCA
GGCCAAAAAGT TCACAGTCAAATGGGGAGGGGTAT TCT T CAT GCAGGAGACCCCAGGCCCT GGAGGCT GC
AACATACC TCAAT CC T GTCCCAGGCCGGATCC T CC T GAAGCCC T T T TCGCAGCAC T GC TATCC
TCCAAAG
CCAT T GTAAAT GT GT GTACAGT GT GTATAAACC T TCT TCT TC T T T T T T T T T TT T
TAAACTGAGGAT TGTC
NM_002467 GACCCCCGAGC T GT GC T GC TCGCGGCCGCCACCGCCGGGCCCCGGCCGTCCCT GGC

GAGAAGGGCAGGGCT TCTCAGAGGCT TGGCGGGAAAAAGAACGGAGGGAGG GAT CGCGC TGAG TAT AAAA
GCCGGT T T TCGGGGCT T TAT C TAAC ICGCTGIAGTAAT T
CCAGCGAGAGGCAGAGGGAGCGAGCGGGCGG
CCGGCTAGGGTGGAAGAGCCGGGCGAGCAGAGC TGCGCT GCGGGCGTCCTGGGAAGGGAGATCCGGAGCG
AATAGGGGGCT IC GCCTCT GGCCCAGCCCTCCCGC T GAT CCCCCAGCCAGCGGICCGCAACCCITGCCGC
AT C CAC GAAAC T T TGCCCATAGCAGCGGGCGGGCACT T T GCACTGGAACT
TACAACACCCGAGCAAGGAC
GCGACTCTCCCGACGCGGGGAGGCTAT TCTGCCCAT T TGGGGACAC T TCCCCGCCGCTGCCAGGACCCGC
T TCTCT GAAAGGCTCTCCT T GCAGC T GC T TAGACGCTGGAT TTTTT
TCGGGTAGTGGAAAACCAGCAGCC
TCCCGCGACGATGCCCCTCAACGT TAGCT TCACCAACAGGAAC TAT GACCTCGAC TACGACTCGGT GCAG
CCGTAT T TCTACT GCGACGAGGAGGAGAACT TC TACCAGCAGCAGCAGCAGAGCGAGCTGCAGCCCCCGG
CGCCCAGCGAGGATAICIGGAAGAAAT TCGAGC TGCTGCCCACCCCGCCCCTGICCCCIAGCCGCCGCTC
CGGGCT C T GC TCGCCCT CC TACGT T GCGGT CACACCCT T CTCCCT T
CGGGGAGACAACGACGGCGGTGGC
GGGAGC I IC TCCACGGCCGACCAGC I GGAGAIGGIGACCGAGC I GC I GGGAGGAGACAT GGT
GAACCAGA
GT T T CAT C IGCGACCCGGACGACGAGACC T T CAT CAAAAACAT CAT CATCCAGGAC T GTAT GT
GGAGCGG
CTTCTCGGCCGCCGCCAAGCTCGTCT CAGAGAAGC T GGCCT CC TAC CAGGC TGCGCGCAAAGACAGCGGC
AGCCCGAACCCCGCCCGCGGCCACAGCGT C T GC TCCACCTCCAGCT T GTAC CT GCAGGAT C T
GAGCGCCG
CCGCCTCAGAGTGCATCGACCCCTCGGTGGTCT TCCCCTACCC IC T CAACGACAGCAGC I CGCCCAAGT C
CTGCGCCTCGCAAGACTCCAGCGCCT T C T C T CC GT CCTCGGAT TCT CTGC T CT CCTCGACGGAGT
C C T CC
CCGCAGGGCAGCCCCGAGCCCCT GGT GC T CCAT GAGGAGACACCGC CCACCACCAGCAGCGAC IC T
GAGG
AGGAACAAGAAGAT GAGGAAGAAATCGAIGT I= I ICIGIGGAAAAGAGGCAGGCTCCT GGCAAAAGGTC
AGAGTCTGGATCACCT TC T GC T GGAGGCCACAGCAAACC TCCICACAGCCCAC T GGTCC TCAAGAGGT
GC
CAC= TCCACACATCAGCACAAC TACGCAGCGCCTCCC TCCAC TCGGAAGGAC TATCC TGCTGCCAAGA
GGGICAAGITGGACAGIGICAGAGICCTGAGACAGATCAGCAACAACCGAAAATGCACCAGCCCCAGGIC
CTCGGACACCGAGGAGAATGTCAAGAGGCGAACACACAACGTCT T GGAGCGCCAGAGGAGGAACGAGC TA
AAACGGAGCTTTTTTGCCCTGCGTGACCAGATCCCGGAGT TGGAAAACAATGAAAAGGCCCCCAAGGTAG
T TATCCT TAAAAAAGCCACAGCATACATCC T GT CCGTCCAAGCAGAGGAGCAAAAGC TCAT T TCTGAAGA

GGACT T GT TGCGGAAACGACGAGAACAGT TGAAACACAAACT T GAACAGC TACGGAAC T CT T GT
GCGTAA
GGAAAAGTAAGGAAAACGAT TCCT TCTAACAGAAATGTCCTGAGCAATCACCTATGAACT T GT T TCAAAT
GCATGATCAAATGCAACCTCACAACCT TGGCTGAGTCT TGAGACTGAAAGATT TAGCCATAATGTAAACT
GCCTCAAAT TGGACT T TGGGCATAAAAGAACTTTTT TAT GC T TACCATC T T TT T T T T T T CT
T TAACAGAT
T TGTAT T TAAGAAT T GT T T T TAAAAAAT T T TAAGAT T TACACAAT GT T TC T CT
GTAAATAT TGCCAT TAA
AT GTAAATAAC T T TAATAAAACGT T TATAGCAGT TACACAGAAT T
TCAATCCTAGTATATAGTACCTAGT
AT TATAGGTACTATAAACCCTAAT TTTTTT TAT T TAAGTACAT T T T GC T T T T TAAAGT T GAT
TTTTT TCT
AT T GT T T T TAGAAAAAATAAAATAAC T GGCAAA TATAT CAT T GAGC CAAAT CT
TAAAAAAAAAAAAAAA

AAGTGACT TAAGTCAGGT T CCCCCAAACCAGACACCAAGACAAGAATCCAT GT GT GT GT GAC T
GAAGGAA
GT GC T GGGAGAGC CCCAGC TGCAGCC T GGAT GT GAAC T GCAACTCCAAAGT GT GTCCAGAC
TCAAGGCAA
GGGCACTAGGCT T TCCAGACCTCCTACTAAGTCAT T GAT CCAGCAC TGCCC TGCCAGGACATAAAT CCC
T
GGCACCTCT T GC T C TCTGCAAAGGAGGGCAAAGCAGC T TCAGGAGCCCT TGGGAGTCCTCCAAAGAGAGT

CTAGGGTACAGGTCCGAAAGTAGAAGAACACAGAAGGCAGGCCAGGGGCACTGTGAGATGGTAAAAGAGA
IC I GAAGGGATCCAGAAT I CAAGCCAGGAAGAAGCAGCAATCIGIC I ICIGGAT TAAAAC I
GAAGATCAA
CC TAC T T TCAACT TACTAAGAAAGGGGATCATGGACAT TGAAGCATATCT TGAAAGAAT TGGCTATAAGA

AGTCTAGGAACAAAT TGGACT TGGAAACAT TAACTGATAT TCT TCAACACCAGATCCGAGC T GT TCCCT
T
TGAGAACCT TAACATCCAT T GT GGGGAT GCCAT GGAC T TAGGCT TAGAGGCCAT TTTTGATCAAGT T
GT G
AGAAGAAATCGGGGTGGATGGTGTCTCCAGGTCAATCATCT TC T GTAC T GGGC TC T GACCAC TAT
TGGT T
I I GAGACCACGAT GT IGGGAGGGIAT GT I TACAGCAC TCCAGCCAAAAAATACAGCACT GGCAT GAT
ICA
CC T TCTCCTGCAGGTGACCAT T GAT GGCAGGAAC TACAT T GTCGAT GC T GGGT T
TGGACGCTCATACCAG
AT GT GGCAGCC TC T GGAGT TAAT T TCTGGGAAGGATCAGCCTCAGGTGCCT TGTGTCT TCCGT T
TGACGG
AAGAGAATGGAT TCTGGTATCTAGACCAAATCAGAAGGGAACAGTACAT TCCAAATGAAGAAT T TCT TCA
T TCTGATCTCCTAGAAGACAGCAAATACCGAAAAATCTACTCCT T TACTCT TAAGCCTCGAACAAT TGAA
GAT T T T GAGTC TAT GAATACATACC T GCAGACATC TCCATCATC T GT GT T TACTAGTAAATCAT
T T T GT T
CC T TGCAGACCCCAGATGGGGT TCAC T GT T TGGTGGGCT TCACCCTCACCCATAGGAGAT TCAAT
TATAA
GGACAATACAGATCTAATAGAGT T CAAGAC T C T GAG T GAGGAAGAAATAGAAAAAG T GC T GAAAAA
TATA
T T TAATAT T TCCT TGCAGAGAAAGCT T GT GCCCAAACAT GGT GATAGAT TTTT TAC TAT T
TAGAATAAGG
AGTAAAACAATCT T GTC TAT T TGTCATCCAGCTCACCAGT TATCAACTGACGACCTATCATGTATCT TCT
=COO I TAO= TAT I I I GAAGAAAATCCIAGACATCAAATCAT I I CACC TATAAAAAT
GICATCATATA
TAAT TAAACAGCTTTT TAAAGAAACATAACCACAAACCT II TCAAATAATAATAATAATAATAATAATAA
AT GTC T T T TAAAGAT GGCC T GT GGT TATCT TGGAAAT T GGT GAT T TAT GC TAGAAAGC T
T T TAAT GT TGG
T T TAT T GT TGAAT TCCTAGAAAAGT T T TAT GGGTAGAT GAGTAAATAAAATAT TGTAAAAAAACT
TAT TG
TCTATAAAGTATAT TAAAACAT T GT TGGCTAATATAAAAAAAAAAAAAA
NM_014321 GCGCGCGGGT T TCGT TGACCCGCGGCGT TCACGGGAAT T GT TCGCT T

AGC T GAT CGGGCGCC TAGCCCCGCGC C TGGGCC TCGCCGAGCCCGACAT GC
TGAGGAAAGCAGAGGAGTA
CT T GCGCC T GTCCCGGGT GAAGT GT G T CGGCCT CTCCGCACGCACCACGGAGACCAGCAGT GCAGT
CAT G
TGCCTGGACCT TGCAGCT T CC T GGAT GAAGT GCCCC T TGGACAGGGCT TAT TTAAT TAAACT T
TCTGGT T
T GAACAAGGAGACATATCAGAGC T GT C T TAAAT CT T T T GAGT GT T TACTGGGCCTGAAT
TCAAATAT TGG
AATAAGAGACCTAGCTGTACAGT T TAGCTGTATAGAAGCAGTGAACATGGCTTCAAAGATACTAAAAAGC
TAT GAGTCCAGTC T TCCCCAGACACAGCAAGTGGATCT TGACT TAT CCAGGCCAC T T T T CAC T TC
T GC T G
CAC T GC T T TCAGCATGCAAGAT TC TAAAGC T GAAAGT GGATAAAAACAAAATGGTAGCCACATCCGGT
GT
AAAAAAAGCTATAT T T GAT CGAC T GT GTAAACAAC TAGAGAAGAT T
GGACAGCAGGTCGACAGAGAACC T
GGAGATGTAGCTACTCCACCACGGAAGAGAAAGAAGATAGTGGT TGAAGCCCCAGCAAAGGAAATGGAGA
AGGIAGAGGAGATGCCACATAAACCACAGAAAGATGAAGATCTGACACAGGAT TAT GAAGAAT GGAAAAG
AAAAAT T T T GGAAAAT GC T GCCAGT GC TCAAAAGGC TACAGCAGAGT GAT T TCAGCT
TCCAAACTGGTAT
ACAT TCCAAACTGATAGTACAT TGCCATCTCCAGGAAGACT TGACGGCT T TGGGAT T T T GT T
TAAACT T T
TATAATAAGGATCC TAAGAC T GT TGCCT T TAAATAGCAAAGCAGCCTACCTGGAGGCTAAGTCTGGGCAG
T GGGC T GGCCCC T GGT GT GAGCAT TAGACCAGCCACAGT GCC T GAT TGGTATAGCCT TAT GT
GC T T TOOT
ACAAAATGGAAT TGGAGGCCGGGCGCAGTGGCTCACGCC TGTAATCCCAGCACT T TGGGAGGCCAAGGTG
GGIGGATCACCIGAGGICAGGAGC TCGAGACCAGCCT GGCCAACAT GGT GAAACCCCAT OTC TAC TAAAA
ATACAAAAAT TAGCCAGGT GT GAIGGIGCATGCC TGTAATCCCAGC TCCTCAGTAGGC I GAGACAGGAGC
ATCACT TGAACGTGGGAGGCAGAGGT TGCAGTGAGCCGAGAT TGCACCACCGCACTCCAGCCTGGGTGAC
AGAGCGAGACTIATCICATAAATAAATAGATAGATACTCCAGCCIGGGTGACAGAGCGAGACTIATAGAT
AGATAGATAGATAGATGGATAGATAGATAGATAGATAGATAGATAGATAAACGGAAT TGGAGCCAT TTTG
CT T TAAGTGAATGGCAGTCCCT TGTCT TAT TCAGAATATAAAAT TCAGTCTGAATGGCATCT TACAGAT T

T TACT TCAAT T T T T GT GTACGGTAT TTTT TAT T TGACTAAATCAATATAT TGTACAGCCTAAGT
TAATAA
AT GT TAT T TATAT AT GCAAAAAAAAAAAAAAAAA
NM_000926 AGTCCACAGC T GT CAC TAATCGGGGTAAGCC T T GT TGTAT T T GT GCGT GT

AC TAGC T TCACT TGTCAT T TGAGTGAAATCTACAACCCGAGGCGGCTAGTGCTCCCGCACTACTGGGATC
TGAGATCT T CGGAGAT GAO T GT CGCCCGCAGTACGGAGCCAGCAGAAGT CC GACCCT TCC T GGGAAT
GGG
CTGTACCGAGAGGTCCGACTAGCCCCAGGGT T T TAGT GAGGGGGCAGT GGAAC TCAGCGAGGGAC T GAGA

GC T TCACAGCATGCACGAGT T T GAT GCCAGAGAAAAAG T C GGGAGA TAAAGGAGCCGCG T G T
CAC T AAAT
TGCCGTCGCAGCCGCAGCCACTCAAGTGCCGGACT T GT GAGTAC TC T GCGT CT CCAGTCC
TCGGACAGAA
GT TGGAGAACTCTCTTGGAGAACTCCCCGAGT TAGGAGACGAGATCTCCTAACAAT TACTACTTTT TCT T
GCGCTCCCCACT T GCCGCTCGCTGGGACAAACGACAGCCACAGT TCCCCTGACGACAGGATGGAGGCCAA
GGGCAGGAGCTGACCAGCGCCGCCCTCCCCCGCCCCCGACCCAGGAGGTGGAGATCCCICCGGICCAGCC
ACAT TCAACACCCACT T TCTCCTCCCTCTGCCCCTATAT TCCCGAAACCCCCT CC T CCT TCCCT T T
TCCC
T CC T CC TGGAGAC GGGGGAGGAGAAAAGGGGAGTCCAGT CGTCATGACTGAGCTGAAGGCAAAGGGTCCC
CGGGCTCCCCACGTGGCGGGCGGCCCGCCCTCCCCCGAGGTCGGAT CCCCACT GC T GT G T CGCCCAGCCG
CAGGTCCGTTCCCGGGGAGCCAGACCTCGGACACCTTGCCTGAAGT T T CGGCCATACC TAT CTCCC TGGA
CGGGCTACICITCCCTCGGCCCTGCCAGGGACAGGACCCCTCCGACGAAAAGACGCAGGACCAGCAGICG
CTGTCGGACGTGGAGGGCGCATAT T CCAGAGC T GAAGC TACAAGGGGT GOT GGAGGCAGCAGT TCTAGTC

CCCCAGAAAAGGACAGCGGAC T GC T GGACAGT GTC T T GGACAC TC T GT
TGGCGCCCTCAGGTCCCGGGCA
GAGCCAACCCAGCCCTCCCGCCTGCGAGGT CAC CAGC TC T T GGT GC C T GT T TGGCCCCGAACT
TCCCGAA
GATCCACCGGCTGCCCCCGCCACCCAGCGGGTGT TGTCCCCGCTCATGAGCCGGTCCGGGTGCAAGGT TG
GAGACAGCTCCGGGACGGCAGCTGCCCATAAAG T GC TGCCCCGGGGCC T GT CACCAGCCCGGCAGC T GC
T
GCT CCCGGCCTCT GAGAGCCCT CAC T GGTCCGGGGCCCCAGTGAAGCCGTC TCCGCAGGCCGCTGCGGTG
GAGGT T GAGGAGGAGGATGGCTCTGAGTCCGAGGAGTCTGCGGGTCCGCT T CT GAAGGGCAAACCT CGGG
CTCTGGGTGGCGCGGCGGC TGGAGGAGGAGCCGCGGCTGTCCCGCCGGGGGCGGCAGCAGGAGGCGTCGC
CCTGGTCCCCAAGGAAGAT TCCCGCT TCTCAGCGCCCAGGGTCGCCCTGGT GGAGCAGGACGCGCC GAT G
GCGCCCGGGCGCTCCCCGCTGGCCACCACGGIGAIGGAT TICATCCACGIGCCTATCCTGCCTCTCAATC
ACGCCT TAT T GGCAGCCCGCACTCGGCAGCT GC TGGAAGACGAAAGTTACGACGGCGGGGCCGGGGCTGC
CAGCGC CT T T GCCCCGCCGCGGAGT T CACCC T G T GCCTCGTCCACCCCGGT
CGCTGTAGGCGACTTCCCC
GACTGCGCGTACCCGCCCGACGCCGAGCCCAAGGACGACGCGTACCCTCTC TATAGCGACT TCCAGCCGC
CCGCTC TAAAGATAAAGGAGGAGGAGGAAGGCGCGGAGGCCTCCGCGCGCT CCCCGCGT T CCTACC T T GT
GGCCGGTGCCAACCCCGCAGCCTTCCCGGAT T T CCCGT T GGGGCCACCGCCCCCGCT GC CGCCGCGAGCG
ACCCCATCCAGACCCGGGGAAGCGGCGGT GACGGCCGCACCCGCCAGTGCC TCAGTCTCGTCTGCGT CC T
CCTCGGGGTCGACCCTGGAGTGCATCCTGTACAAAGCGGAGGGCGCGCCGCCCCAGCAGGGCCCGT TOGO
GCCGCCGCCCTGCAAGGCGCCGGGCGCGAGCGGCTGCCTGCTCCCGCGGGACGGCCTGCCCTCCACCTCC
GCCTCTGCCGCCGCCGCCGGGGCGGCCCCCGCGCTCTACCCTGCACTCGGCCTCAACGGGCTCCCGCAGC
TCGGCTACCAGGCCGCCGTGCTCAAGGAGGGCCTGCCGCAGGTCTACCCGCCC TATC T CAAC TACO T GAG
GCCGGAT T CAGAAGCCAGCCAGAGCC CACAATACAGC T T CGAGT CAT TACCTCAGAAGAT T T GT T
TAATC
T GT GGGGAT GAAGCATCAGGC T GTCAT TAT GGT GTCC T TACC T GT GGGAGC TGTAAGGT C T
TCT T TAAGA
GGGCAATGGAAGGGCAGCACAACTACT TAT GT GOT GGAAGAAAT GAC T GCATC GT T GATAAAAT CC
GCAG
AAAAAAC T GCCCAGCAT GT CGCC T TAGAAAGT GC T GTCAGGC T GGCAT GGT CC T
TGGAGGTCGAAAAT T T
AAAAAGT TCAATAAAGTCAGAGT T GT GAGAGCAC T GGAT GC T GT T GC TC TCCCACAGCCAGT
GGGCGT TC
CAAATGAAAGCCAAGCCCTAAGCCAGAGAT T CAC T T T T T CACCAGGT CAAGACATACAGT T GAT
TCCACC
ACT GAT CAACCIGT TAATGAGCAT T GAACCAGAT GT GAT C TAT GCAGGACATGACAACACAAAACC T
GAO
ACC TCCAGT TCT T T GC T GACAAGTC T TAATCAACTAGGCGAGAGGCAACT T CT T
TCAGTAGTCAAGTGGT
C TAAAT CAT TGCCAGGT T T TCGAAACT TACATAT T GAT GACCAGATAAC TC TCAT TCAGTAT
TCT TGGAT
GAGCT TAAT GGT GT T T GGT C TAGGAT GGAGATCC TACAAACACGTCAGT GGGCAGAT GC T GTAT
T T TGCA
CC T GAT C TAATAC TAAAT GAACAGCGGAT GAAAGAATCATCAT TC TAT TCAT TAT GCC T
TACCAT GT GGC
AGATCCCACAGGAGT T T GT CAAGC T TCAAGT TAGCCAAGAAGAGT T CC TC T GTAT GAAAGTAT T
GT TACT
TCT TAATACAAT T CC T T TGGAAGGGCTACGAAGTCAAACCCAGT T TGAGGAGATGAGGTCAAGCTACAT
T
AGAGAGCTCATCAAGGCAAT TGGT T TGAGGCAAAAAGGAGT T GT GT CGAGC TCACAGCGT T TC TAT
CAAC
T TACAAAACT TCT TGATAACT TGCATGATCT TGTCAAACAACT TCATC T GTAC T GC T TGAATACAT
T TAT
CCAGTCCCGGGCAC T GAGT GT TGAAT T TCCAGAAAT GAT GTC T GAAGT TAT TGCTGCACAAT
TACCCAAG
ATATTGGCAGGGATGGTGAAACCCCT TCTCTTTCATAAAAAGTGAATGTCATCTTTTTCTTTTAAAGAAT
TAAATT TTGTGGTATGTCT TTTTGTT TTGGTCAGGATTATGAGGTCTTGAGTT TTTATAATGTTCT TCTG
AAAGCCTTACATT TATAACATCATAGTGTGTAAAT T TAAAAGAAAAAT TGT GAGGT TCTAAT TAT T
TTCT
TTTATAAAGTATAATTAGAATGTTTAACTGTTT TGTTTACCCATAT TTTCT TGAAGAAT TTACAAGATTG
AAAAAGTACTAAAATTGTTAAAGTAAACTATCT TATCCATAT TAT T TCATACCATGTAGGTGAGGATTTT
TAACTT T TGCATC TAACAAATCATCGACT TAAGAGAAAAAATCT TACATGTAATAACACAAAGCTAT TAT
ATGT TAT T TCTAGGTAACT CCCT T TGTGTCAAT TATATT TCCAAAAATGAACCTTTAAAATGGTATGCAA

AATTTTGTCTATATATATT TGTGTGAGGAGGAAATTCATAACTTTCCTCAGAT TTTCAAAAGTATT T T TA
ATGCAAAAAATGTAGAAAGAGT T TAAAACCACTAAAATAGAT TGAT GT TCT TCAAACTAGGCAAAACAAC
TCATAT GT TAAGACCAT T T TCCAGAT TGGAAACACAAATCTCTTAGGAAGT TAATAAGTAGATTCATATC
AT TATGCAAATAGTAT TGT GGGT T T T GTAGGT T TTTAAAATAACCT
TTTTTGGGGAGAGAATTGTCCTCT
AATGAGGTATTGCGAGTGGACATAAGAAATCAGAAGATTATGGCCTAACTGTACTCCTTACCAACTGTGG
CATGCT GAAAGT TAGTCAC TCT TACT GAT TCTCAAT TCT CTCACCT TTGAAAGTAGTAAAATATCT
TTCC
TGCCAAT TGCTCC T T TGGGTCAGAGC T TAT TAACATCT T
TTCAAATCAAAGGAAAGAAGAAAGGGAGAGG
AGGAGGAGGGAGGTATCAATTCACATACCTTTCTCCTCT TTATCCTCCACTATCATGAATTCATAT TATG
TTTCAGCCATGCAAATCTT TTTACCATGAAATT TCTTCCAGAATTT TCCCCCT TTGACACAAATTCCATG
CATGTT TCAACCT TCGAGACTCAGCCAAATGTCAT T TCT GTAAAAT CT TCCCT GAGTCT
TCCAAGCAGTA
AT T TGCCT TCTCC TAGAGT T TACCTGCCAT T T T GTGCACAT T TGAGT TACAGTAGCATGT TAT
T T TACAA
T TGTGACTCTCCT GGGAGT CTGGGAGCCATATAAAGTGGTCAATAGTGT T T GC TGACTGAGAGT TGAATG

ACATTT TCTCTCTGTCTTGGTATTACTGTAGAT T TCGAT CAT TCT T TGGTTACATTTCTGCATATT TCTG

TACCCATGACTTTATCACT T TCT TCT CCCATGC T T TATC TCCATCAAT TAT CT TCAT TACT T T
TAAAT T T
TCCACCTTTGCTTCCTACT TTGTGAGATCTCTCCCTTTACTGACTATAACATAGAAGAATAGAAGTGTAT
TTTATGTGTCTTAAGGACAATACTTTAGATTCCTTGTTCTAAGTTT T TAAACT GAATGAATGGAATAT TA
TTTCTCTCCCTAAGCAAAATTCCACAAAACAAT TAT T TC T TATGT T TATGTAGCCTTAAATTGTTT TGTA

CIGTAAACCICAGCATAAAAACT T IC T =AT T T CTAAT T =AT TCAACAAATAT TGAT T GAATACC
IGGI
AT TAGCACAAGAAAAATGT GCTAATAAGCCT TATGAGAAT T TGGAGCTGAAGAAAGACATATAACT CAGG
AAAGTTACAGTCCAGTAGTAGGTATAAATTACAGTGCCTGATAAATAGGCATT TTAATATTTGTACACTC
AACGTATACTAGGTAGGTGCAAAACATTTACATATAATT TTACTGATACCCATGCAGCACAAAGGTACTA
ACT T TAAATAT TAAATAACACCT T TATGTGTCAGTAAT T CAT T TGCAT TAAAT CT TAT T
GAAAAGGCT T T
CAATATAT T T TCCCCACAAATGTCAT CCCAAGAAAAAAGTAT T T T TAACAT CT CCCAAATATAATAGT
TA
CAGGAAATCTACC TCTGTGAGAGTGACACCTCT CAGAAT GAACTGT GTGACACAAGAAAATGAATGTAGG
TCTATCCAAAAAAAACCCCAAGAAACAAAAACAATAT TAT TAGCCC T T TAT GC T TAAGT GATGGAC
TCAG
GGAACAGT TGATGT TGTGATCAT T T TAT TATCT GAT TCT TGT TACT
TTGAATTAAACCAATATTTTGATG
ATATAAATCATTTCCACCAGCATATATTTAATT TCCATAATAACTT TAAAATT TTCTAATTTCACTCAAC
TATGAGGGAATAGAATGTGGTGGCCACAGGTTTGGCTTT TGTTAAAATGTT TGATATCT TCGATGT TGAT
CTCTGTCTGCAATGTAGATGTCTAAACACTAGGATTTAATATTTAAGGCTAAGCTTTAAAAATAAAGTAC
CT T T T TAAAAAGAATATGGCT TCACCAAATGGAAAATACCTAAT T T CTAAATC T T T T TC
TCTACAAAGTC
CTATCTACTAATGTCTCCATTACTAT T TAGTCATCATAACCAT TAT CT TCAT T TTACATGTCGTGT TCTT

TCTGGTAGCTCTAAAATGACACTAAATCATAAGAAGACAGGTTACATATCAGGAAATACTTGAAGGTTAC
TGAAATAGATTCT TGAGTTAATGAAAATATTTTCTGTAAAAAGGTT TGAAAAGCCATTTGAGTCTAAAGC
AT TATACCTCCAT TATCAGTAGT TAT GTGACAAT TGTGT GTGTGT T TAATGTT
TAAAGATGTGGCACTTT
TTAATAAGGCAATGCTATGCTATTTTTTCCCAT T TAACAT TAAGATAAT T TAT TGCTATACAGATGATAT
GGAAATATGATGAACAATATTTTTTT TGCCAAAACTATGCCTTGTAAGTAGCCATGGAATGTCAACCTGT
AACTTAAATTATCCACAGATAGTCATGTGTTTGATGATGGGCACTGTGGAGATAACTGACATAGGACTGT
GCCCCCCTTCTCTGCCACT TACTAGC TGGATGAGAT TAAGCAAGTCAT T TAAC TGCTCT GAT TAAACCTG

CCTTTCCCAAGTGCTTTGTAATGAATAGAAATGGAAACCAAAAAAAACGTATACAGGCCTTCAGAAATAG
TAATTGCTACTAT TTTGTT TTCATTAAGCCATAGTTCTGGCTATAATTTTATCAAACTCACCAGCTATAT
TCTACAGTGAAAGCAGGAT TCTAGAAAGTCTCACTGTTT TAT T TAT GTCACCATGTGCTATGATATAT T T
GGTTGAATTCATT TGAAAT TAGGGCT GGAAGTAT TCAAGTAAT T TC T TCTGCT GAAAAAATACAGT GT
T T
TGAGTT TAGGGCCTGTTTTATCAAAGTTCTAAAGAGCCTATCACTCTTCCATTGTAGACATTTTAAAATA
ATGACACTGATTT TAACAT TTTTAAGTGTCTTT TTAGAACAGAGAGCCTGACTAGAACACAGCCCCTCCA
AAAACCCATGCTCAAAT TAT T T T TAC TATGGCAGCAAT T CCACAAAAGGGAACAATGGGT T TAGAAAT
TA
CAATGAAGTCATCAACCCAAAAAACATCCCTATCCCTAAGAAGGTTATGATATAAAATGCCCACAAGAAA
TCTATGTCTGCTT TAATCT GTCT T T TAT TGCT T TGGAAGGATGGCTATTACAT
TTTTAGTTTTTGCTGTG
AATACC TGAGCAGT T TCTC TCATCCATACT TAT CCT TCACACATCAGAAGT CAGGATAGAATATGAATCA

TTTTAAAAACTTT TACAACTCCAGAGCCATGTGCATAAGAAGCATTCAAAACT TGCCAAAACATACATTT
TTTTTCAAATTTAAAGATACTCTATT TTTGTAT TCAATAGCTCAACAACTGTGGTCCCCACTGATAAAGT
GAAGTGGACAAGGAGACAAGTAATGGCATAAGT TTGTTT TTCCCAAAGTATGCCTGTTCAATAGCCATTG
GATGTGGGAAATT TCTACATCTCTTAAAATTTTACAGAAAATACATAGCCAGATAGTCTAGCAAAAGTTC
ACCAAGTCCTAAAT TGCT TATCCT TACT TCACTAAGTCATGAAATCAT T T TAATGAAAAGAACATCACCT
AGGTTT TGTGGTT TCT T T T T T TCT TAT TCATGGCTGAGT GAAAACAACAAT CT CTGT T T
CTCCCTAGCAT
CTGTGGACTAT T TAATGTACCAT TAT TCCACACTCTATGGTCCTTACTAAATACAAAAT TGAACAAAAAG
CAGTAAAACAACTGACTCT TCACCCATATTATAAAATATAATCCAAGCCAGAT TAGTCAACATCCATAAG
ATGAAT CCAAGCT GAACTGGGCCTAGAT TAT TGAGT TCAGGT TGGATCACATCCCTAT T TAT
TAATAAAC
T TAGGAAAGAAGGCCT TACAGACCAT CAGT TAGCTGGAGCTAATAGAACCTACACT TCTAAAGT TCGGCC
TAGAAT CAATGIGGCCT TAAAAGCTGAAAAGAAGCAGGAAAGAACAGT T T T CT ICAATAATTIGICCACC
CTGTCACTGGAGAAAATTTAAGAATT TGGGGGT GT TGGTAGTAAGT TAAACACAGCAGCTGTTCATGGCA
GAAAT TAT TCAATACATACCT TCTCT GAATATCCTATAACCAAAGCAAAGAAAAACACCAAGGGGT TTGT

TCTCCTCCTTGGAGTTGACCTCATTCCAAGGCAGAGCTCAGGTCACAGGCACAGGGGCTGCGCCCAAGCT
TGICCGCAGCCITATGCAGCTGIGGAGICIGGAAGACTGITGCAGGACTGCTGGCCIAGICCCAGAATGT
CAGCCT CAT T T TCGAT T TACTGGCTC T TGT TGC TGTATGTCATGCT GACCT TAT TGT
TAAACACAGGT T T
GT T TGC T T T T T T T CCACTCATGGAGACATGGGAGAGGCAT TAT T T T
TAAGCTGGTTGAAAGCTTTAACCG
ATAAAGCAT T T T TAGAGAAATGTGAATCAGGCAGCTAAGAAAGCATACTCT GT CCAT TACGGTAAAGAAA
ATGCACAGAT TAT TAACTCTGCAGTGTGGCATTAGTGTCCTGGTCAATATTCGGATAGATATGAATAAAA
TAT T TAAATGGTAT TGTAAATAGT T T TCAGGACATATGCTATAGCT TAT T T T TAT TATC T T T
TGAAAT TG
CTCTTAATACATCAAATCCTGATGTATTCAATT TATCAGATATAAAT TAT T CTAAATGAAGCCCAGT TAA
ATGTTT TTGTCTTGTCAGT TATATGT TAAGTTTCTGATCTCTTTGTCTATGACGTTTACTAATCTGCATT
T T TACT GT TATGAAT TAT T T TAGACAGCAGTGGT T TCAAGCT T T T T GCCAC TAAAAATACCT
T T TAT T T T
CTCCTCCCCCAGAAAAGTCTATACCT TGAAGTATCTATCCACCAAACTGTACT TCTATTAAGAAATAGTT
AT TGTGT T T TCT TAATGT T T TGT TAT
TCAAAGACATATCAATGAAAGCTGCTGAGCAGCATGAATAACAA
TTATATCCACACAGATTTGATATATT TTGTGCAGCCTTAACTTGATAGTATAAAATGTCATTGCTT T T TA
AATAATAGTTAGTCAATGGACTTCTATCATAGCTTTCCTAAACTAGGTTAAGATCCAGAGCTTTGGGGTC
ATAATATATTACATACAAT TAAGTTATCTTTTTCTAAGGGCTTTAAAATTCATGAGAATAACCAAAAAAG
GTATGTGGAGAGT TAATACAAACATACCATAT T CT TGT T GAAACAGAGATGTGGCTCTGCT TGT TC
TCCA
TAAGGTAGAAATACT T TCCAGAAT T T GCCIAAACTAGTAAGCCCIGAAT T T GC TATGAT
TAGGGATAGGA
AGAGAT TTTCACATGGCAGACTTTAGAATTCTTCACTTTAGCCAGTAAAGTATCTCCTT TTGATCT TAGT
AT TCTGTGTAT T T TAACTT T TCTGAGT TGTGCATGT T TATAAGAAAAATCAGCACAAAGGGT T
TAAGT TA
AAGCCT T T T TACT GAAAT T TGAAAGAAACAGAAGAAAATATCAAAGTTCTT TGTATTTTGAGAGGATTAA

ATATGATTTACAAAAGTTACATGGAGGGCTCTCTAAAACATTAAAT TAAT TAT TTTTTGTTGAAAAGTCT
TACT T TAGGCATCAT T T TAT TCCTCAGCAACTAGCTGTGAAGCCT T TACTGTGCTGTATGCCAGTCACTC

TGCTAGATTGTGGAGATTACCAGTGT TCCCGTCTTCTCCGAGCTTAGAGTTGGATGGGGAATAAAGACAG
GTAAACAGATAGC TACAATAT TGTAC TGTGAAT GCT TAT GCTGGAGGAAGTACAGGGAACTAT TGGAGCA
CCTAAGAGGAGCACCTACCTTGAATT TAGGGGT TAGCAGAGGCATCCTGAAAAAAGTCAAAGCTAAGCCA
CAATCTATAAGCAGTTTAGGAATTAGCAGAACGTGCGTGGTGAGGAGATGCCAAAGGCAAGAAGAGAAGA
GTATTCCAAACAGGAGGGATTCCAAAGAGAGAAGAGTATCCCAAACAACAT TTGCACAAACCTGATGGGG
AGAGAGAATGTGGGGTGGGGATGGATGATGAGACTGAAGAAGAAAGCCAGGTCTAGATAATCAGTGGCCT
TGTACACCATGTTAAAGAGTGTAGACTTGATTCTGTTGTAAACAGGAAAGCAGCACAAT TCATATGAATA
T T T TAGAAGACTCCCACTGGAATATGGAGAATAAAGT TGGAGATGACTAAT CC TGGAAGCAGGGAGAACA
TTTTTGAGGAAGT TGCACTATTTTGGTGAAAATGATGATCATAAACATGAAGAATTGTAGGTGATCATGA
CCTCCTCTCTAAT TTTCCAGAAGGGT TTTGGAAGATATAACATAGGAACAT TGACAGGACTGACGAAAGG
AGATGAAATACACCATATAAAT TGTCAAACACAAGGCCAGATGTCTAAT TAT T TTGCTTATGTGTTGAAA
TTACAAATTTTTCATCAGGAAACCAAAAACTACAAAACT TAGTTTTCCCAAGTCCCAGAATTCTATCTGT
CCAAACAATCIGTACCACT CCACCIATATCCCIACCT T T GOAT= TGICCAACCICAAAGICCAGGICT
ATACACACGGGTAAGACTAGAGCAGT TCAAGTT TCAGAAAATGAGAAAGAGGAACTGAGTTGTGCTGAAC
CCATACAAAATAAACACAT TCT T TGTATAGAT T CT TGGAACCTCGAGAGGAAT TCACCTAACTCATAGGT
AT T TGATGGTATGAATCCATGGCTGGGCTCGGC T T T TAAAAAGCCT TATCTGGGATTCCTTCTATGGAAC
CAAGTTCCATCAAAGCCCATTTAAAAGCCTACATTAAAAACAAAAT TCTTGCTGCATTGTATACAAATAA
TGATGT CATGATCAAATAATCAGATGCCAT TAT CAAGTGGAAT TACAAAAT GGTATACCCACTCCAAAAA
AAAAAAAAAAGCTAAATTCTCAGTAGAACATTGTGACTTCATGAGCCCTCCACAGCCTTGGAGCTGAGGA
GGGAGCACTGGTGAGCAGTAGGTTGAAGAGAAAACTTGGCGCTTAATAATCTATCCATGTTTTTTCATCT
AAAAGAGCCTTCT TTTTGGATTACCT TAT TCAAT T TCCATCAAGGAAAT TGT TAGT TCCACTAACCAGAC

AGCAGCTGGGAAGGCAGAAGCT TACT GTATGTACATGGTAGCTGTGGGAAGGAGGT T TC T T TCTCCAGGT
CCTCAC TGGCCATACACCAGTCCCT T GT TAGT TATGCCT GGTCATAGACCCCCGT TGCTATCATCT CATA

T T TAAGTCT T TGGCT TGTGAAT T TAT CTAT TCT
TTCAGCTTCAGCACTGCAGAGTGCTGGGACTTTGCTA
ACT TCCAT T TCT T GCTGGC T TAGCACAT TCCTCATAGGCCCAGCTC T T T TC
TCATCTGGCCCTGCT GTGG
AGTCACCT TGCCCCT TCAGGAGAGCCATGGCT TACCACTGCCTGCTAAGCC TCCACTCAGCTGCCACCAC
ACTAAATCCAAGCTTCTCTAAGATGT TGCAGAC T T TACAGGCAAGCATAAAAGGCT TGATCT TCCT GGAC
TTCCCT T TACT TGTCTGAATCTCACC TCCT TCAACT T TCAGTCTCAGAATGTAGGCAT T
TGTCCTCTTTG
CCCTACATCTTCCTTCTTCTGAATCATGAAAGCCTCTCACTTCCTCTTGCTATGTGCTGGAGGCTTCTGT
CAGGTT TTAGAATGAGTTCTCATCTAGTCCTAGTAGCTT TTGATGCTTAAGTCCACCTT TTAAGGATACC
ITTGAGATITAGACCAIGT ITTICGCTIGAGAAAGCCCIAATCTCCAGACT TGCCT T IC TGIGGAT =CA
AAGACCAACTGAGGAAGTCAAAAGCT GAATGT T GACT T T CT T TGAACAT T T CCGCTATAACAAT
TCCAAT
TCTCCT CAGAGCAATATGCCTGCCTCCAACTGACCAGGAGAAAGGT CCAGT GC CAAAGAGAAAAACACAA
AGAT TAAT TAT T T CAGT TGAGCACATACT T TCAAAGTGGT T TGGGTAT TCATATGAGGT
TTTCTGTCAAG
AGGGTGAGACTCT TCATCTATCCATGTGTGCCTGACAGT TCTCCTGGCACTGGCTGGTAACAGATGCAAA
ACTGTAAAAATTAAGTGATCATGTAT T T TAACGATATCATCACATACT TAT TT TCTATGTAATGTT TTAA
AT T TCCCCTAACATACT T T GACTGT T T TGCACATGGTAGATAT TCACAT T T TT T TGTGT
TGAAGTTGATG
CAATCT TCAAAGT TATCTACCCCGT T GCT TAT TAGTAAAACTAGIGT TAATAC T
IGGCAAGAGATGCAGG
GAATCT T ICICAT GACICACGCCCIAT T TAGT TAT TAAT GCTACTACCCIATT
ITGAGTAAGTAGTAGGI
CCCTAAGTACATTGTCCAGAGTTATACTTTTAAAGATAT TTAGCCCCATATACTTCTTGAATCTAAAGTC
ATACACCTTGCTCCTCATT TCTGAGTGGGAAAGACATTTGAGAGTATGTTGACAATTGT TCTGAAGGTTT
TTGCCAAGAAGGTGAAACTGTCCTTTCATCTGTGTATGCCTGGGGCTGGGTCCCTGGCAGTGATGGGGTG
ACAATGCAAAGCTGTAAAAACTAGGTGCTAGTGGGCACCTAATATCATCATCATATACT TAT T T TCAAGC
TAATATGCAAAATCCCATCTCTGTTT TTAAACTAAGTGTAGATTTCAGAGAAAATATTT TGTGGTTCACA
TAAGAAAACAGTCTACTCAGCTTGACAAGTGTT TTATGT TAAATTGGCTGGTGGTTTGAAATGAATCATC
TTCACATAATGTT T TCT T TAAAAATAT TGTGAAT T TAAC TCTAAT T CT TGT TAT TCTGT
GTGATAATAAA

GAATAAACTAAT T IOTA

T TACAT T TACT T TGTCCATAT T TGCTCCTATGCTCTAGGCTCGTGCACAACAAACACAGTGTGGGCCCT T

ACCCTAGAAGCCAACT TCTCATGACCT T TCTCTATCTCCAGAATCCATGCAGTGGGAATGAAGGTAAAAG
AAGGT T T TCATGGGATCCAGCTGAGAGCTCTACGGGGAAAATGGATCTGAGGAGCCATGTGCTCCATCTC
T T T TAT T T TACAGGIAGAGACTAGGGGIATAGAGIGAGGIGAAT TACCGCAGT GACCCACACAT TGT
TGG
CAGACCTAGGAT TAGAACTCTGTCT TCCTGGT TCCCAGCT TGGTGCT T T TGAAAGCATACT TGCTGCT T
T
CT TACCGGCCTGGTGTCTGCCACT T TGGGACAGAGTGTGGACT TGCTCACCTGCCCCAT T TCT TAGGGAT
TCTCAT TCTGTGT T TGAGCAAGAATAT TCT TAT TCTGGAAAGAACCACATACCACAGGAT TCTGGGTGAG
CATAAGGAAGAT TGTCT TGGGGATCTGACT TAGCTCACGTATAGTGGCTATGATGAAT TCAGTGTCT TAT
TTTT TGCATATGTATAT T T T TAGTCTAATAT TGCCTGGGTGTCTGAGCAAGTCTAGATGAAT T TAAT
TGC
TCTCAT TTTTCCCCTGCCCCTCTTCCTTTGGTCTCTCTT TTAGGAAATGTT TT TCTTTCAACATTCGTTT
CAT TCAT TAT T TACTCATTCGGCCAACCAACAT T TAT TGAGTGCCT TCCCTGTATCAGGGACAGGGGCT
T
ACAAAGTAGAAT T TGATCCCACCICT GCCCICAGTAGCT CAGIGIC TAAIGGAGGIAGT GAIGT =AT TA
AGCGTCGCCAGATACTGTGCTAGGTGCTGTGCCTGT TCTCTCTCGCT TGT T CC TCACACACT TGAGAAGG
CCGAAGCTGAT TCATAGCT TGGAAGGCAGGGGCCT TGGAT T TGAACCCAGGCCTGACCAATGGCAGAACC
TATCAGAIGIGIGGACAGATGACAT TGCCT T IC T T ICI T TGGATATATCAAAATCAGCCAGCAGGCAGGA

ACTCCCAT T T TGAGCAAGCAATGTGCAGGAATGATAGGGTATACAGAGAGGAACAGGAGATGGCCCCTGA
CT TCCAGCATGTGTCTGAT GGACATCCAGGCTGCAGGCATCATGGT GCTGT CTAGAGAGATGAGCCAGGT
GCCCAGAGCCCATGGGCCAATGCTGCCCT T TCT TGAGCATGCCAAACAAAGCGGT TGGTGTGT TAGAGGC
ACAGTC TCCTCCACTCTAAGTAAAAATCAGCAT GAGTCC TAGCCCACAT T TCCCTAGTGAGTACACCAAA
GATATCTATGAACTGGCAGICATCAGIGACT TCCIAAGGI TCCGGAAATGCAT CT= TACTCAGGAGTAA
GCAATGAIGIGCCTGCGGC T T TACGAGT ICICACAGAAT GACT T IC IGGACCCAAAIGT T TT T ICI
GCT T
CAGGACTGIGAAGGCCT TAT 1= TCGCTCTGCCACCAAGGIGACCGCTGAT GT CATCAACGCAGCT GAGA
AACTCCAGGTGGTGGGCAGGGCTGGCACAGGTGTGGACAATGTGGATCTGGAGGCCGCAACAAGGAAGGG
CATCT TGGT TATGAACACCCCCAATGGGAACAGCCTCAGTGCCGCAGAACTCACT TGTGGAATGATCATG
TGCCTGGCCAGGCAGAT TCCCCAGGCGACGGCT TCGATGAAGGACGGCAAATGGGAGCGGAAGAAGT T CA
TGGGAACAGAGCT GAATGGAAAGACCCTGGGAATTCTTGGCCTGGGCAGGAT T GGGAGAGAGGTAGCTAC
CCGGATGCAGTCCT T TGGGATGAAGACTATAGGGTATGACCCCATCAT T TCCCCAGAGGTCTCGGCCTCC
T T TGGT GT TCAGCAGCTGCCCCTGGAGGAGATCTGGCCTCTCTGTGAT T TCATCACTGTGCACACTCCTC
TCCTGCCCTCCACGACAGGCT TGCTGAATGACAACACCT T TGCCCAGTGCAAGAAGGGGGTGCGTGTGGT
GAACTGTGCCCGTGGAGGGATCGTGGACGAAGGCGCCCT GCTCCGGGCCCT GCAGTCTGGCCAGTGTGCC
GGGGCTGCACTGGACGTGT T TACGGAAGAGCCGCCACGGGACCGGGCCT TGGTGGACCATGAGAATGTCA
TCAGCTGTCCCCACCTGGGTGCCAGCACCAAGGAGGCTCAGAGCCGCTGTGGGGAGGAAAT TGCTGT TCA
GT TCGTGGACATGGTGAAGGGGAAAT CTCTCACGGGGGT TGTGAATGCCCAGGCCCT TACCAGTGCCT TC
ICICCACACACCAAGCCTIGGATIGGICIGGCAGAAGCTCIGGGGACACTGATGCGAGCCTGGGCTGGGI
CCCCCAAAGGGACCATCCAGGTGATAACACAGGGAACATCCCTGAAGAATGCTGGGAACTGCCTAAGCCC
CGCAGT CAT TGICGGCCTCCIGAAAGAGGCTTCCAAGCAGGCGGATGIGAACT TGGTGAACGCTAAGCTG
CIGGIGAAAGAGGCTGGCCICAATGICACCACCTCCCACAGCCCIGCTGCACCAGGGGGGCAAGGCTICG
GGGAATGCCTCCTGGCCGTGGCCCIGGCAGGCGCCCCTTACCAGGCTGIGGGCTTGGTCCAAGGCACIAC
ACC T GT AC TGCAGGGGCTCAAT GGAGC TGT C T TCAGGCCAGAAGTGCCTCT CC
GCAGGGACCTGCCCCT G
C T CC TA T T CCGGAC T CAGACC T C T GACCC T GCAAT GC TGCC TACCA T GAT T
GGCCTCCT GGCAGAGGCAG
GCGT GC GGCTGCT GT CCTACCAGAC T TCACTGGTGTCAGATGGGGAGACCTGGCACGTCATGGGCATCTC
CTCCT TGCTGCCCAGCCTGGAAGCGTGGAAGCAGCATGTGACTGAAGCCTTCCAGT TCCACT TCTAACCT
IGGAGCTCACIGGICCCTGCCICIGGGGCTITICTGAAGAAACCCACCCACTGIGATCAATAGGGAGAGA
AAATCCACAT TCT TGGGCTGAACGCGAGCCTCTGACACTGCT TACACTGCACTCTGACCCTGTAGTACAG
CAATAACCGTCTAATAAAGAGCCTACCCCC

CCTGCCTCAGATGATGCCTATCCAGAAATAGAAAAAT TCT T TCCCT TCAAT CC TCTAGACT T TGAGAGT
T
T TGACCTGCCTGAAGAGCACCAGAT TGCGCACCTCCCCT TGAGTGGAGTGCCTCTCATGATCCT TGACGA
GGAGAGAGAGCT TGAAAAGCTGT T TCAGCTGGGCCCCCCT TCACCTGTGAAGATGCCCTCTCCACCATGG
GAATCCAATCIGT TGCAGT =CT TCAAGCAT T CTGTCGACCCIGGAIGT T GAAT TGCCACCIGIT TGCT
GTGACATAGATAT T TAAAT T TCT TAGTGCT TCAGAGTCTGTGTGTAT T TGTAT TAATAAAGCAT TCT
T TA
ACAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGGGGGGGGAGACACAAAAA
GAAT TCCCCAAGAGGGGGCCACAAGATAATCAGAGGATATCACACAAGATC IC TCGGCGCACCAAC GAC G
GGGGCC CCAAATAAGGGAGAGACCCAGAAT CACAACAGC CAAGACAC GG T G GACACGAC GGAAACAAACA

CACAGC CCAGACACGGGGG CAAACAC GCGCGCACACCGC GGACACCAT GGGACAAAGCAGACAC CACC CA
CAAAACAACACCGCGGAGGGGGAAGAACAACAAAACAAG T GCGCAAACAGAACACAACCACAGAAAGAGA
AAAAT TAAAACGGCCCCCAAGACGGCGACAACACAACAAAACAACCACTACAGAGCGCT CAACAGC C GAG
TAAAAACACAACAACGGACAAC TAACACACAAAGGAAT GAAACAAAGCGGGGC CACACACCGACAC CGGA
AAT CCGGCGAACAACTCACACCGAGCGAGGGT CCCAGACAACAAATACACAGACAACGAAACCGAGAAAC
AAGACCAGCAAGACGAGCAGGCAAAAGACAAACAAGACAGAGGAGACGACGAC GAACGCAAAGGACAAGA
GGACACAACGACGCGAGGAGCGAGAGCGAGAGGAAGAGACAACAAAAAGACACAAAAGAACAACAAGCAA
GCAGCGAAGAACGACACACAAC CACAO GAGACAGCAGGAGCAGAGGC GGAGAAAACACAAC GAGCAAGC C
AAGAC CAAGAGAGGAGAACAAAATAAAAAAATACGAGAG CAGGCGGACGAGAG CAC GAGAC GAACAGACA
AACGGGAAT CAGAAGCATAACGAT CC GCGACGC GAACAACN

TCCGTGTCCCGCTCGCGCCCATCACGGACCCGCAGCAGCTGCAGCTCTCGCCGCTGAAGGGGCTCAGCT T

GGTCGACAAGGAGAACACGCCGCCGGCCCTGAGCGGGACCCGCGTCCTGGCCAGCAAGACCGCGAGGAGG
ATCT TCCAGGAGAAAACCCCCGCCGCT T TGTCATCT TCCCCATCGAGTACCATGATATCTGGCAGATGTA
TAAGAAGGCAGAGGCTTCCT T T TGGACCGCCGAGGAGGTGGACCTCTCCAAGGACAT TCAGCACTGGGAA
TCCCTGAAACCCGAGGAGAGATAT T T TATATCCCATGT TCTGGCT T TCT T TGCAGCAAGCGATGGCATAG

TAAATGAAAACTIGGIGGAGCGATITAGCCAAGAAGTICAGAT TACAGAAGCCCGCTGT T TCTATGGCT T
CCAAAT TGCCATGGAAAACATACAT TCTGAAATGTATAGTCT TCT TAT TGACACT TACATAAAAGATCCC

TGCGCTGGAT TGGGGACAAAGAGGCTACCTATGGTGAACGTGT TGTAGCCT TTGCTGCAGTGGAAGGCAT
T T TCT T T TCCGGT TCT T T TGCGTCGATAT TCTGGCTCAAGAAACGAGGACTGATGCCTGGCCTCACAT
T T
TCTAATGAACT TAT TAGCAGAGATGAGGGT T TACACTGT GAT T T TGCT TGCCTGATGT
TCAAACACCTGG
TACACAAACCATCGGAGGAGAGAGTAAGAGAAATAAT TAT CAAT GC T GT TCGGATAGAACAGGAGT TOOT
CACTGAGGCCT TGCCTGTGAAGCTCAT TGGGATGAAT TGCACTCTAATGAAGCAATACAT TGAGT T TGTG
GCAGACAGACT TATGCTGGAACTGGGT T T TAGCAAGGT T T TCAGAGTAGAGAACCCAT T TGACT T
TATGG
AGAATAT T TCACTGGAAGGAAAGACTAACT TCT T TGAGAAGAGAGTAGGCGAG TAT CAGAGGAT GGGAGT

GATGTCAAGTCCAACAGAGAAT TCT T T TACCT TGGATGCTGACT TCTAAATGAACTGAAGATGTGCCCT T
ACT TGGCTGAT TTTTTTTTT TCCATCTCATAAGAAAAATCAGCTGAAGTGT TACCAACTAGCCACACCAT
GAAT TGICCGTAAIGT =AT TAACAGCATCT T TAAAACT GIGTAGC TACCT CACAACCAGICCTGT =GT
T TATAGTGCTGGTAGTATCACCT T T TGCCAGAAGGCCTGGCTGGCTGTGACTTACCATAGCAGTGACAAT
GGCAGT CT TGGCT T TAAAGTGAGGGGTGACCCT T TAGTGAGCT TAGCACAGCGGGAT TAAACAGTCCT T
T
AACCAGCACAGCCAGT TAAAAGATGCAGCCICACTGCT T CAACGCAGAT T T TAATGT T TACT
TAAATATA
AACCTGGCACT T TACAAACAAATAAACAT TGT T TGTACTCACAAGGCGATAATAGCT TGAT T TAT T
TGGT
T TCTACACCAAATACAT TCTCCTGACCACTAATGGGAGCCAAT TCACAAT TCACTAAGTGACTAAAGTAA
GT TAAACT TGTGTAGACTAAGCATGTAAT T T T TAAGT T T TAT T T TAATGAATTAAAATAT T TGT
TAACCA
ACT T TAAAGTCAGTCCTGTGTATACCTAGATAT TAGTCAGT TGGTGCCAGATAGAAGACAGGT TGT GT T T

T TATCCTGTGGCT TGTGTAGTGTCCTGGGAT TCTCTGCCCCCTCTGAGTAGAGTGT TGTGGGATAAAGGA
ATCTCTCAGGGCAAGGAGCT TCT TAAGT TAAATCACTAGAAAT T TAGGGGTGATCTGGGCCT TCATATGT
GTGAGAAGCCGT T TCAT T T TAT T TCTCACTGTAT T T TCCTCAACGTCTGGT TGATGAGAAAAAAT T
CT TG
AAGAGT T T TCATATGTGGGAGCTAAGGTAGTAT TGTAAAAT T TCAAGTCAT CC T
TAAACAAAATGATCCA
CCTAAGATCT TGCCCCTGT TAAGTGGTGAAATCAACTAGAGGTGGT TCCTACAAGT TGT TCAT TCTAGT T
T TGT T TGGTGTAAGTAGGT TGTGTGAGT TAAT T CAT T TATAT T TAC TATGT CT GT
TAAATCAGAAAT T T T
T TAT TATCTATGT TCT TCTAGAT T T TACCTGTAGT TCATACT TCAGTCACCCAGTGTCT TAT
TCTGGCAT
TGTCTAAATCTGAGCAT TGTCTAGGGGGATCT TAAACT T TAGTAGGAAACCATGAGCTGT TAATACAGT T
TCCAT TCAAATAT TAAT T TCAGAATGAAACATAAT TTTTTTTTTTTTTTTT TGAGATGGAGTCTCGCTCT
GT TGCCCAGGCTGGAGTGCAGTGGCGCGAT T T TGGCTCACTGTAACCTCCATCTCCTGGGT TCAAGCAAT
TCTCCTGTCTCAGCCTCCCTAGTAGCTGGGACTGCAGGTATGTGCTACCACACCTGGCTAAT TTTTGTAT
T T T TAGTAGAGATGGAGT T TCACCATAT TGGTCAGGCTGGTCT TGAACTCCTGACCTCAGGTGATCCACC
CACCTCGGCCTCCCAAAGTGCTGGGAT TGCAGGCGTGATAAACAAATAT TCTTAATAGGGCTACT T TGAA
T TAATCTGCCTT TATGT T TGGGAGAAGAAAGCTGAGACAT TGCATGAAAGATGATGAGAGATAAAT GT TG
ATCT T T TGGCCCCAT T TGT TAAT TGTAT TCAGTAT T TGAACGTCGTCCTGT T TAT TGT TAGT T
T TCT TCA
TCAT T TAT TGTATAGACAAT T T T TAAATCTCTGTAATATGATACAT T T TCCTATCT T T TAAGT
TAT TGT T
ACCTAAAGT TAATCCAGAT TATATGGTCCT TATATGTGTACAACAT TAAAATGAAAGGCT T TGTCT TGCA
T TGTGAGGTACAGGCGGAAGT TGGAATCAGGT T T TAGGAT TCTGTCTCTCATTAGCTGAATAATGTGAGG
AT TAACTICTGCCAGCTCAGACCAT T TCCTAATCAGT TGAAAGGGAAACAAGTAT T TCAGTCTCAAAAT T
GAATAATGCACAAGTCT TAAGTGAT TAAAATAAAACTGT TCT TATGTCAGT TT

CGCCGCACCICCGGGAGCCGGGGCGCACCCAGCCCGCAGCGCCGCCTCCCCGCCCGCGCCGCCICCGACC
GCAGGC CGAGGGC C GCCAC T GGCCGGGGGGACC GGGCAGCAGC T T GCGGCC GC GGAGCC GGGCAAC
GC T G
GGGACTGCGCCTTTTGTCCCCGGAGGTCCCTGGAAGT II GCGGCAGGACGCGCGCGGGGAGGCGGCGGAG
GCAGCCCCGACGT C GC GGAGAACAGGGCGCAGAGCCGGCAT GGGCAT CGGGCGCAGCGAGGGGGGC CGC C
GCGGGGCAGCCCTGGGCGT GC T GC TGGCGCTGGGCGCGGCGC T T C T GGCCG
TGGGCTCGGCCAGCGAGTA
CGACTACGTGAGCTTCCAGTCGGACATCGGCCCGTACCAGAGCGGGCGCT T CT ACACCAAGCCACC TCAG
TGCGTGGACATCCCCGCGGACCTGCGGCTGTGCCACAACGTGGGCTACAAGAAGATGGT GCTGCCCAACC
I GC TGGAGCACGAGACCAT GGC GGAGGT GAAGCAGCAGG CCAGCAGC IGGGIGCCCCTGCTCAACAAGAA
CTGCCACGCCGGCACCCAGGTCT TCCTCTGCTCGCTCT TCGCGCCCGTCTGCC TGGACCGGCCCAT CTAC
CCGTGTCGCTGGC TCTGCGAGGCCGT GCGCGACTCGTGCGAGCCGGTCATGCAGT TCTTCGGCT TC TACT
GGCCCGAGATGCT TAAGTGTGACAAGT TCCCCGAGGGGGACGTCTGCATCGCCATGACGCCGCCCAATGC
CACCGAAGCCICCAAGCCCCAAGGCACAACGGIGIGICCTCCCIGTGACAACGAGITGAAATCTGAGGCC
AT CAT TGAACATCTCTGTGCCAGCGAGT T TGCACTGAGGATGAAAATAAAAGAAGTGAAAAAAGAAAATG
GCGACAAGAAGAT TGICCCCAAGAAGAAGAAGCCCCIGAAGTIGGGGCCCATCAAGAAGAAGGACCIGAA
GAAGCT TGIGCTGIACCTGAAGAATGGGGCTGACTGICCCIGCCACCAGCTGGACAACCICAGCCACCAC
T TCCTCATCATGGGCCGCAAGGTGAAGAGCCAGTACT TGCTGACGGCCATCCACAAGTGGGACAAGAAAA
ACAAGGAGT TCAAAAACT T CAT GAAGAAAAT GAAAAACCAT GAGT GCCCCACC T T TCAGTCCGTGT T
TAA
GTGAT TCTCCCGGGGGCAGGGTGGGGAGGGAGCCTCGGGTGGGGTGGGAGCGGGGGGGACAGTGCCCCGG
GAACCCGGTGGGTCACACACACGCAC TGCGCCTGTCAGTAGTGGACAT T TAAT CCAGTCGGCT TGT TCT T
GCAGCAT TCCCGCTCCCTTCCCTCCATAGCCACGCTCCAAACCCCAGGGTAGCCATGGCCGGGTAAAGCA
AGGGCCAT T TAGAT TAGGAAGGT III TAAGATCCGCAATGIGGAGCAGCAGCCACTGCACAGGAGGAGGI
GACAAAC CAT T T C CAACAG CAACACAGC CAC TAAAACACAAAAAGGGGGAT TGGGCGGAAAGT
GAGAGCC
AGCAGCAAAAACTACAT T T TGCAACT TGT TGGTGTGGATCTAT TGGCTGATCTATGCCT T TCAACTAGAA

AAT TCTAATGAT TGGCAAGTCACGT T GT T T TCAGGTCCAGAGTAGT T TCT T TCTGTCTGCT T
TAAATGGA
AACAGACTCATACCACACT TACAAT TAAGGTCAAGCCCAGAAAGTGATAAGTGCAGGGAGGAAAAGTGCA
AGTCCAT TAIGTAATAGIGACAGCAAAGGGACCAGGGGAGAGGCAT TGCCT IC ICTGCCCACAGIC T TIC
CGTGTGAT TGTCT T TGAATCTGAATCAGCCAGTCTCAGATGCCCCAAAGT T TCGGT TCCTATGAGCCCGG
GGCATGATCTGAT CCCCAAGACATGT GGAGGGGCAGCCT GTGCCTGCCT T T GT GTCAGAAAAAGGAAACC
ACAGTGAGCCTGAGAGAGACGGCGAT II TCGGGCTGAGAAGGCAGTAGT II TCAAAACACATAGT TAAAA
AAGAAACAAAT GAAAAAAAT T T TAGAACAGT CCAGCAAAT T GC TAG T CAGG GT GAAT T GT
GAAAT T GGGT
GAAGAGCT TACGAT TCTAATCTCATGT TTTT TCCT T T TCACAT T T T
TAAAAGAACAATGACAAACACCCA
CT TAT T T T TCAAGGT T T TAAAACAGTCTACAT TGAGCAT T
TGAAAGGTGTGCTAGAACAAGGTCTCCTGA
TCCGTCCGAGGCTGCT TCCCAGAGGAGCAGCTCTCCCCAGGCAT T TGCCAAGGGAGGCGGAT T TCCCTGG
TAGTGTAGCTGTGTGGCT T TCCT TCCTGAAGAGTCCGTGGT TGCCCTAGAACCTAACACCCCCTAGCAAA
ACTCACAGAGCT T TCCGT TTTTT TCT T TCCTGTAAAGAAACAT T TCCT T TGAACT TGAT
TGCCTATGGAT
CAAAGAAAT TCAGAACAGCCTGCCTGTCCCCCCGCACT T T T TACATATAT T TGT T TCAT T
TCTGCAGATG
GAAAGT TGACATGGGTGGGGTGTCCCCATCCAGCGAGAGAGT T TAAAAAGCAAAACATCTCTGCAGT T T T
TCCCAAGIGCCCIGAGATACTICCCAAAGCCCT TATGT T TAATCAGCGATGTATATAAGCCAGT TCACT T
AGACAACT T TACCCT ICI T GTCCAAT GTACAGGAAGTAGT ICTAAAAAAAATGCATAT TAAT T ICI
TCCC
CCAAAGCCGGAT T CT TAAT TCTCTGCAACACT T TGAGGACAT T TAT GAT
TGTCCCTCTGGGCCAATGCT T
ATACCCAGTGAGGATGCTGCAGTGAGGCTGTAAAGTGGCCCCCTGCGGCCC TAGCCTGACCCGGAGGAAA
GGATGGTAGAT TCTGT TAACTCT TGAAGACTCCAGTATGAAAATCAGCATGCCCGCCTAGT TACCTACCG
GAGAGT TATCCTGATAAAT TAACCTCTCACAGT TAGTGATCCTGTCCT T T TAACACCT TTTT TGTGGGGT

TCTCTCTGACCT T TCATCGTAAAGTGCTGGGGACCT TAAGTGAT T TGCCTGTAAT T T TGGATGAT
TAAAA
AATGTGTATATATAT TAGCTAAT TAGAAATAT TCTACT TCTCTGT TGTCAAACTGAAAT TCAGAGCAAGT
TCCTGAGTGCGTGGATCTGGGTCT TAGT TCTGGT TGAT TCACTCAAGAGT TCAGTGCTCATACGTATCTG
CTCAT T T TGACAAAGTGCCTCATGCAACCGGGCCCTCTCTCTGCGGCAGAGTCCT TAGTGGAGGGGT T TA
CCTGGAACAT TAGTAGT TACCACAGAATACGGAAGAGCAGGTGACT GTGCT GT GCAGCT CTCTAAATGGG
AAT TCTCAGGTAGGAAGCAACAGCT TCAGAAAGAGCTCAAAATAAAT T GGAAAT GT GAAT CGCAGC T GT
G
GGT T T TACCACCGICIGIC =AGA= CCCAGGACCT TGAGIGICAT TAGT TAC T T TAT TGAAGGIT T
TAG
ACCCATAGCAGCT T TGTCTCTGTCACATCAGCAAT T TCAGAACCAAAAGGGAGGCTCTCTGTAGGCACAG
AGCTGCACTATCACGAGCCT T TGT T T T TCTCCACAAAGTATCTAACAAAACCAATGTGCAGACTGAT TGG
CCTGGT CAT TGGTCTCCGAGAGAGGAGGT T TGCCTGTGAT T TCCTAAT TAT CGCTAGGGCCAAGGT
GGGA
T T TGTAAAGCT T TACAATAATCAT TCTGGATAGAGTCCTGGGAGGTCCT TGGCAGAACTCAGT TAAATCT
T TGAAGAATAT T TGTAGT TATCT TAGAAGATAGCATGGGAGGTGAGGAT TCCAAAAACAT T T TAT T T
T TA
AAATATCCTGTGTAACACT TGGCTCT TGGTACCTGTGGGT TAGCATCAAGT TCTCCCCAGGGTAGAAT TC
AATCAGAGCTCCAGT T TGCAT TIGGAIGIGTAAAT TACAGTAATCCCAT T T CC CAAACC TAAAATC 1=
T
T T TCTCATCAGACTCTGAGTAACTGGT TGCTGTGTCATAACT TCATAGATGCAGGAGGCTCAGGTGATCT
GT T TGAGCAGAGCACCCTAGGCAGCCTGCAGGGAATAACATACTGGCCGT T CT GACCTGT TGCCAGCAGA
TACACAGGACATGGATGAAAT TCCCGT T TCCTCTAGT T T CT TCCTGTAGTACTCCTCT T T
TAGATCCTAA
GTCTCT TACAAAAGCT T TGAATACTGTGAAAAT GT T T TACAT TCCAT T TCATT TGTGT T GT
TTTTT TAAC
TGCAT T T TACCAGATGT T T TGATGT TATCGCT TATGT TAATAGTAAT TCCCGTACGTGT TCAT T T
TAT T T
T CAT GC T T T T TCAGCCATGTATCAATAT TCACT TGAC TAAAAT CAC TCAAT
TAATCAAAAAAAAAAAAAA
AA
NM_012319 AGTCCTGGGCGAAGGGGGCGGTGGT TCCCCGCGGCGCTGCGCGCGGCGGTAAT TAGTGAT

CT TCGCGAAGGCTAGGGGCGCGGCTGCCGGGIGGCTGCGCGGCGCTGCCCCCGGACCGAGGGGCAGCCAA
CCCAAT GAAACCACCGCGT GI TCGCGCCTGGTAGAGAT T TCTCGAAGACACCAGTGGGCCCGTTCCGAGC
CCTCIGGACCGCCCGTGIGGAACCAAACCIGCGCGCGIGGCOGGGCCGIGGGACAACGAGGCCGCGGAGA
CGAAGGCGCAATGGCGAGGAAGT TAT CTGTAAT CT TGATCCTGACCT T TGCCCTCTCTGTCACAAATCCC
CT TCATGAACTAAAAGCAGCTGCT T TCCCCCAGACCACTGAGAAAAT TAGTCCGAAT TGGGAATCTGGCA
T TAATGT TGACT TGGCAAT T TCCACACGGCAATATCATCTACAACAGCT T T TCTACCGCTATGGAGAAAA

TAAT IC T T TGICAGT TGAAGGGT TCAGAAAAT TACT TCAAAATATAGGCATAGATAAGAT
TAAAAGAATC
CATATACACCAT GACCACGAC CAT CAC T CAGAC CACGAGCAT CAC T CAGAC CA T GAGCG T CAC
T CAGACC
ATGAGCATCACTCAGACCACGAGCAT CACTCTGACCATGATCATCACTCTCACCATAAT CATGCTGCT TC
IGGIAAAAATAAGCGAAAAGCTCTITGCCCAGACCATGACICAGATAGTICAGGIAAAGATCCIAGAAAC
AGCCAGGGGAAAGGAGCTCACCGACCAGAACAT GCCAGT GGTAGAAGGAAT GI CAAGGACAGIGT TAGTG
C TAGTGAAGTGACCTCAAC TGTGTACAACACTGTCTCTGAAGGAAC TCACT TTCTAGAGACAATAGAGAC
TCCAAGACCTGGAAAACTCT TCCCCAAAGATGTAAGCAGCTCCACT CCACCCAGTGTCACATCAAAGAGC
CGGGTGAGCCGGC TGGCTGGTAGGAAAACAAAT GAATCT GTGAGTGAGCCCCGAAAAGGCT T TATGTAT T
CCAGAAACACAAATGAAAATCCTCAGGAGTGT T TCAATGCATCAAAGCTACTGACATCTCATGGCATGGG
CATCCAGGT TCCGCTGAATGCAACAGAGT TCAACTATCTCTGTCCAGCCATCATCAACCAAAT TGATGCT
AGATCT TGTCTGAT TCATACAAGTGAAAAGAAGGCTGAAATCCCTCCAAAGACCTAT TCAT TACAAATAG
CCTGGGT TGGTGGT T T TATAGCCAT T TCCATCATCAGT T TCCTGTCTCTGCTGGGGGT TATCT
TAGTGCC
TCTCATGAATCGGGTGT T T T TCAAAT T TCTCCTGAGT T TCCT TGTGGCACTGGCCGT TGGGACT T
TGAGT
GGTGAT GOT TITT TACACC T ICI TCCACAT ICI CATGCAAGICACCACCATAGICATAGCCATGAAGAAC

CAGCAATGGAAATGAAAAGAGGACCACT T T TCAGTCATCTGTCT TCTCAAAACATAGAAGAAAGTGCCTA
T T T TGAT TCCACGTGGAAGGGTCTAACAGCTCTAGGAGGCCTGTAT T TCAT GT T TCT TGT
TGAACATGTC
C T CACAT T GAT CAAACAAT T TAAAGATAAGAAGAAAAAGAAT CAGAAGAAACC T GAAAAT GAT GAT
GAT G
TGGAGAT TAAGAAGCAGT T GTCCAAG TAT GAAT CTCAAC T T
TCAACAAATGAGGAGAAAGTAGATACAGA
TGATCGAACTGAAGGCTAT T TACGAGCAGACTCACAAGAGCCCTCCCACT T TGAT TCTCAGCAGCCTGCA
GTCT TGGAAGAAGAAGAGGTCATGATAGCTCATGCTCATCCACAGGAAGTCTACAATGAATATGTACCCA

GAGGGTGCAAGAATAAATGCCAT TCACAT T TCCACGATACACTCGGCCAGTCAGACGATCTCAT TCACCA
CCATCATGACTACCATCATATICICCATCATCACCACCACCAAAACCACCATCCICACAGICACAGCCAG
CGCTAC TCTCGGGAGGAGC TGAAAGATGCCGGCGTCGCCACTCTGGCCTGGAT GGTGATAATGGGT GATG
GCCTGCACAAT T TCAGCGATGGCCTAGCAAT TGGTGCTGCT T T TACTGAAGGCT TATCAAGTGGT T
TAAG
TACT TCTGT TGCTGTGT TCTGTCATGAGT TGCCTCATGAAT TAGGTGACT T TGCTGT TCTACTAAAGGCT

GGCATGACCGT TAAGCAGGCTGTCCT T TATAATGCAT TGTCAGCCATGCTGGCGTATCT TGGAATGGCAA
CAGGAAT T T TCAT TGGTCAT TATGCTGAAAATGT T TCTATGTGGATAT T TGCACT TACT GCTGGCT
TAT T
CATGTATGT TGCTCTGGT TGATATGGTACCTGAAATGCTGCACAATGATGCTAGTGACCATGGATGTAGC
CGCTGGGGGTAT T TCT T T T TACAGAATGCTGGGATGCT T T TGGGT T T TGGAAT TATGT TACT
TAT T TCCA
TAT T TGAACATAAAATCGT GT T TCGTATAAAT T TCTAGT TAAGGT T TAAAT GC TAGAGTAGCT
TAAAAAG
T TGTCATAGT T TCAGTAGGTCATAGGGAGATGAGT T TGTATGCTGTACTATGCAGCGT T TAAAGT TAGTG

GGT T T TGTGAT T T T TGTAT TGAATAT TGCTGTCTGT TACAAAGTCAGT TAAAGGTACGT T T
TAATAT T TA
AGT TAT TCTATCT TGGAGATAAAATCTGTATGTGCAAT TCACCGGTAT TACCAGT T TAT TATGTAAACAA

GAGAT T TGGCATGACATGT TCTGTAT GT T TCAGGGAAAAATGTCT T TAATGCT T T T
TCAAGAACTAACAC
AGT TAT TCCTATACTGGAT T T TAGGTCTCTGAAGAACTGCTGGTGT T TAGGAATAAGAATGTGCATGAAG
CC TAAAATACCAAGAAAGC T TATAC T GAAT T TAAGCAAAGAAATAAAGGAGAAAAGAGAAGAAT C T
GAGA
AT TGGGGAGGCATAGAT TCT TATAAAAATCACAAAAT T T GT TGTAAAT TAGAGGGGAGAAAT T
TAGAAT T
AAGTATAAAAAGGCAGAAT TAGTATAGAGTACAT TCAT TAAACAT T T T TGTCAGGAT TAT T
TCCCGTAAA
AACGTAGTGAGCACT T T TCATATACTAAT T TAGT TGTACAT T TAACT T
TGTATAATACAGAAATCTAAAT
ATAT T TAATGAAT TCAAGCAATATATCACT TGACCAAGAAAT TGGAAT T TCAAAATGT TCGTGCGGGTAT

ATACCAGATGAGTACAGTGAGTAGT T T TATGTATCACCAGACTGGGT TAT TGCCAAGT TATATATCACCA
AAAGCTGTATGACTGGATGT TCTGGT TACCTGGT T TACAAAAT TAT CAGAGTAGTAAAACT T TGATATAT

ATGAGGATAT TAAAACTACACTAAGTATCAT T T GAT TCGAT TCAGAAAGTACT T TGATATCTCTCAGTGC

T TCAGTGCTATCAT TGTGAGCAAT TGTCT T T TATATACGGTACTGTAGCCATACTAGGCCTGTCTGTGGC
AT TCTCTAGATGT T TCT TTTT TACACAATAAAT TCCT TATATCAGCT TGAAAAAAAAAAAAAAAAAA

T T ICAAGGGCCACGCGCT TCCAGGGAGT T ICI ICCIGAT CAT IGGGCTGIGIT
GGICAGIGAAGTACCCG
CTGAAGTACT T TAGCCACACGCGGAAGAACAGCCCACTACAT TACTATCAGCGTCTCGAGATCGTCGAAG
CCGCAAT TAGGACT T TGT T T TCCGTCACTGGGATCCTGGCAGAGCAGT T TGTTCCGGATGGGCCCCACCT

GCACCICTACCATGAGAACCACIGGATAAAGT TAATGAATIGGCAGCACAGCACCAIGTACCTAT ICI T T
GCAGTCTCAGGAAT TGT TGACATGCTCACCTATCTGGTCAGCCACGT TCCCTTGGGGGTGGACAGACTGG
T TATGGCTGTGGCAGTAT TCATGGAAGGT T TCCTCT TCTACTACCACGTCCACAACCGGCCTCCGCTGGA
CCAGCACATCCAC TCACTCCTGCTGTATGCTCT GT TCGGAGGGTGT GT TAGTATCTCCCTAGAGGTGATC
T TCCGGGACCACAT TGTGCTGGAACT T T TCCGAACCAGTCTCATCAT TCT TCAGGGAACCTGGT TCTGGC

AGAT IGGGI T 1= GCTGT TCCCACCT T T TGGAACACCCGAATGGGACCAGAAGGATGATGCCAACCTCAT
GT TCATCACCATGTGCT TCTGCTGGCACTACCTGGCTGCCCTCAGCAT TGTGGCCGTCAACTAT TCTCT T
GT T TACTGCCT T T TGACTCGGATGAAGAGACACGGAAGGGGAGAAATCAT T GGAAT TCAGAAGCTGAAT
T
CAGATGACACTIACCAGACCGCCCICTIGAGIGGCTCAGATGAGGAATGAGCCGAGATGCGGAGGGCGCA
GATGTCCCACTGCACAGCTGGAATGAATGGAGT TCATCCCCTCCACCTGAATGCCTGCTGTGGTCTGATC
T TAAGGGTCTATATAT T TGCACCICC =AT ICAACACAGGGCTGGAGGI IC TACAACAGGAAATCAGGCC
TACAGCATCCTGTGTATCT TGCAGT TGGGAT T T T TAAACATACTATAAAGT CT GTGT
TGGTATAGTACCC
T TCATAAGGAAAAATGAAGTAATGCCTATAAGTAGCAGGCCT T TGTGCCTCAGTGTCAAGAGAAATCAAG
AGATGCTAAAAGCT T TACAATGGAAGTGGCCTCATGGATGAATCCGGGGTATGAGCCCAGGAGAACGTGC
TGCT T T TGGTAACT TATCCCT T T T TCTCT TAAGAAAGCAGGTACT T TCT TAT TAGAAATATGT
TAGAATG
TGTAAGCAAACGACAGTGCCT T TAGAAT TACAAT TCTAACT TACATAT T T T TT GAAAGTAAAATAAT
TCA
CAAGCT T TGGTAT T T TAAAAT TAT TGT TAAACATATCATAACTAATCATACCAGGGTACTGCAATACCAC

TGT T TATAAGTGACAAAAT TAGGCCAAAGGTGAT TTTTTTT TAAATCAGGAAGCTGGT TACTGGCTCTAC
TGAGAGT TGGAGCCCTGAT GT TCTGAT TCT TCAAAGTCACCCTAAAAGAAGATCTGACAGGAAAGCTGTA
TAATGAGATAGAAAAACGTCAGGTATGGAAGGCT T TCAGT T T TAATATGGCTGAAAGCAAAGGATAACGA
AT TCAGAAT TAGTAATGTAAAATCT TGATACCCTAATCT TGCT TCTGGATCTGT TCT TTTTT TAAAAAAA

CT TCCT TCACCGCGCCTATAATCCTAGCACT T TGGGAGGCCGAGGCAGGCAGATCACGGGGTCAGGAGAT
CAAGACCATCCTGGCTAACATGGTGAAACCCCGTCTCTACTGAAAATACAAAAAAT TAGCCGGGTGTGGT
GGCGGGCGCCTGTAGT TCCAGCTACT CGGGAGGCTGAGGCAAGAGAATGGCAT GAACCCGGTAGGGGAGC
T TGCAGTGAGCCCAGATCATGCCACTGTACTCCAGCCTAGGTGACAGAGCAAGACTCTGTCTCAAAAACA
AGCAAACAGACT TCCT TCAACAAATAT T TAT TAAATATCCACT T TGCAACAGCACTGAAATGGCTGTAAG
GACTCCIGAGATAIGIGICCAGCAAGGAGTITACAGICAAACAGGAGAGACATGCCIGTAGITACATCCA
GTGTGATGGGTGCTGAGAGGCAAGTACAAACCACGATG

CCGCGCCGGCCCT I GCCCCCCGCCGCACAGGAGCGGGACGCCGAGCCGCGI CCGCCGCACGGGGAGC I GC
AGTACC TGGGGCAGATCCAACACATCCTCCGCT GCGGCGTCAGGAAGGACGCCCGCCCGGGCACCGGTAC
CCIGCCGGIATICGGCATOCAGGCGCGCTACAGCCTGAGAGATGAATICCCTCTGCTGACAACCAAACGT
=GT ICTGGAACGGIGCT TCGGAGGAGCTGCTGIGGCT TATCAAGGGATCCACAAACGCTATAGACCIGT
CT TCCCCGGCAGCGAAAAT CICGGGATGCCACT GGATCCCGACACT CICIGGACACCCT GGGAT IC TCCA
CCAGAGAAGAACGCGACT T GGGCCCAGT T T GT GGC T CTCAGCGGAGGCC T C CT GT
GGCAGAATACATACA
TT I CCAAT CAGAT CAC I ICCCGGACACGGACCNIGACCAGCCTGCCAAAAAGT GGAT I
ICCCCCCACCCC
AGAACCCANCCCCTGACGCACAGAAACCAACCCAT ICGT I= TGCCGCCI I GC GAACCCCAACCAGAAT C
T C T CCCCCCTGGCCGGCGCGCCTGCC GCTGCCAAT GCCC C TAT GGCGGCC T CT
TGGCCCGCACCTTCCAA
T TGGTCGCCCTGCGCAACCAGCGAGAAAACACTGGCCCGCCCGTCTCCCCCCCGCTCCGCCTACCCCACT

TAATGCGCCTCCGTGGCATGACGCACGCGT T TGGTGTCCGCCGCCGTCTCATGTCCGCGCGGTGTGGACC
CCCT T T ICICICGCGGCACATCCCCCCIAT TCCCT TGCCCI I IGGGGGGCACCCCCICTAGACCCGCGCT
TCTCT TCTCGTCCGGTGGGGGACATTGGT T TGCCTGCCGCGGCGGGGGCGNTAAAAATAAAAACAGCCTG
T TAGCCCGGCCCAGTACCCCCCCCCGGCCGGGGCCGCCT TNCGT T TGCAT T TATACCCCAACCCATAAAG
CCGCGCCCCT T TAGCNCCNTAACTT T T GT GGTGTGGCCTCCCCCCT T T T
TCCCGGGGAGCAGCAACGGAC
ATCTGTACACTAATGCTGGCCCCGACCT T TCCCAAAAACCCCCCGCCCGTGTCCCGTATAAAT T TGGTGC
CAANCCTGACGNGTICICCCCCGCCCICGCCCCGTIGGCCGCCCGT ITAAAGCCCCCCCGGIGGITGCGC
CGCCCAACGAGTCCACCTATAGT TAANTCCACCAACACCCCCACCT T T TCCTCCCCGCCGCATCT TCCCC
ACGTACCCCCT T T IGICGCGAGAIGGCCACTCCCCCCCCCCTGI I I GT I TAAAACAACGAGAAT
GGTGCT
IIIGCCAACGCTGGTC TCCCCCCCCGGACCGCGACCGCCAGGGGGAATACGTACCATAAGCCCCCGCGCC
CNC= I TIT TCCCCCCTCCCCGCCAATCAAGAT CCGCCGTCCAT TAGACGTAT TAT I I I
TCCCGCGATAC
ACGAAAAAACAGGGCCGCCCAT I TATAACTAAAT TCCCGTCGCCGCCGCGCGGATAIGT T TCCCAAAATA
CCACCCCCCCCCCCCCAT T T TCT T TGCCCCCAACTCCTGCGCACCGGTGT TCACCAGCCTCGCGCCGC

CCAACGCCGCCCGGATGGC I ICCCAAAACCGCGACCCAGCCGCCAC TAGCGICGCCGCCGCCCGTAAAGG
AGCTGAGCCGAGCGGGGGCGCCGCCCGGGGTCCGGTGGGCAAAAGGCTACAGCAGGAGC TGATGACCCIC
AIGGIGAGIGAT TAAGTGCCCAGAACCCCAGCCTICCATCCAATIT ICAGTAGCCICCT T T TT TCCGTCA
GCTITIT TGCTAGACATAGGGGTAATGTAAT TIGCICCCTCCIGGGAAAGAAGTICATACACCCCACCIA
CACCAT I ICI TCCAGCAGT CCCICCI CCCAAT I CCATCCCCCCACACGAAGITAICICGAACACT TCCCT

GAAGICATACAAGACCCICCCIATCCAGIGIGICCCIACTICCIAGCCCCAACCAAGCT T TACCCACACC
CAACTCCCCGCCCT TCTTGGTAT T TCTAGCCTATGAAT T TGGT T GC T T TAT TT
TGGATCAGAGTGATGAG
AT TAAGGGGAGGCTGGGCGCGGTAGCTCACACCT TATAATCCCAAAGTGCTGGGAT TACAGGCGTGAGCC
ACCGCGCCCGGCCAGCAACTAATAT TCTAAT TGAACTAAAGCACAGGATGCCAAT T TACAATCCT TAGAC
CAAAGAGTCACTGAIGICICCACCAGATAAGAGGAAAGCATCAGGCTAGGCATAGIGGCTCACACCIGTA
ATCTCAGCACT T TGGGAGGCTGAGGCAGGCAGATCACATGAGCCCAGGAGT TTGAGACTGGCCTGGGCAA
CAT GGT GAAACCC TGTCTC TAAAATAAAAAC TAAAC TAAAAAAACT T T T
TAAAAAGGCAGTGGGGAGCAT
CAGAACCAGCTCAACAGT T TGTCTAC TGTCCGGTCCCAGAGAAACT CAAGAT T CTAGCAAGCCCCT T GT
G
IGGGGCTIGGGITGGGACATGAGGCTGCTGCTGGAGCTTACICTGCAACTGIT TCTCCAAATGCCAGGTA
TAT GAAGACCTGAGGTATAAGCTCTCGCTAGAGT TCCCCAGTGGCTACCCT TACAAT GCGCCCACAGT GA
AGT ICC ICACGCCCIGCTATCACCCCAACGIGGACACCCAGGGIAACATAT GCCT GGACATCCTGAAGGA
AAAGT GGTCTGCCCT GTAT GAT GTCAGGACCAT ICTGCTCTCCATCCAGAGCCTICTAGGAGAACCCAAC
AT TGATAGTCCCT TGAACACACATGCTGCCGAGCTCTGGAAAAACCCCACAGCT T T TAAGAAGTACCTGC
AAGAAACCTACTCAAAGCAGGTCACCAGCCAGGAGCCCT GACCCAGGCTGCCCAGCCTGTCCT T GT GTCG
TCTTTTTAATTTTTCCTTAGATGGTCTGTCCTTTTTGTGATTTCTGTATAGGACTCTTTATCTTGAGCTG
TGGTAT T T T TGT T T TGT T T T TGTCT T T TAAAT TAAGCCTCGGT TGAGCCCT TGTATAT
TAAATAAATGCA
TTTT TGTCCT TTTT TAAAAAAAAAAT AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
A
[32] At least 40, at least 41, at least 42, at least 43, at least 44, at least 45 or all 46 of the genes in Table 1 can be utilized in the methods of the present invention. Preferably, the expression of each of the 46 genes is determined in a biological sample. The prototypical gene expression profiles (i.e., centroids) of the four intrinsic subtypes were pre-defined from a training set of FFPE breast tumor samples using hierarchical clustering analysis of gene expression data. A heatmap of the centroids of these four subtypes is shown in Figure 1, where the level of expression is indicated by color. Table 3 shows the actual values.
[33] Table 3. Tumor Subtype Centroids for Comparison to a Sample

Target Gene   Basal-like   Her2-enriched   Luminal A   Luminal B
ACTR3B         -0.2052       -0.7965        -0.2790     -0.4380
ANLN            1.0227        0.5006        -0.7289      0.1149
BAG1           -0.4676       -0.3132         0.4716      0.5879
BCL2           -0.7365       -0.7237         0.7234      0.6363
BLVRA          -0.8761        0.2270         0.1628      0.7138
CCNE1           1.3100        0.2201        -0.6231     -0.2729
CDC20           1.0995        0.1445        -1.0518     -0.1173
CDC6            0.5817        0.6601        -0.7032      0.3134
CDCA1           0.9367        0.1623        -0.4509      0.2692
CDH3            0.7639        0.0144        -0.0502     -1.0229
CENPF           1.0222        0.2944        -0.5657      0.2437
CEP55           1.0442        0.4881        -0.6365      0.2921
CXXC5          -0.9732        0.1866         0.5687      0.9463
EGFR            0.3352       -0.1326        -0.0011     -0.9755
ERBB2          -0.7045        1.4182         0.2420      0.1978
ESR1           -1.1847       -0.4926         0.7177      1.0101
EXO1            1.0546        0.4317        -0.7259      0.2559
FGFR4          -0.2073        1.4562         0.1707     -0.2223
FOXA1          -1.3590        0.5726         0.7131      0.7963
FOXC1           1.0666       -0.7362        -0.4078     -0.9877
GPR160         -1.0540        0.5524         0.6032      0.7305
KIF2C           0.9242        0.1104        -1.1001     -0.2771
KNTC2           1.1373        0.2266        -0.7593      0.1656
KRT14           0.4759       -0.5269         0.8187     -0.8879
KRT17           0.6863       -0.3777         0.6149     -1.1415
KRT5            0.7136       -0.4146         0.5832     -0.9462
MAPT           -1.1343       -0.2711         1.0957      0.8372
MDM2           -0.7498       -0.4855        -0.1788      0.2397
MELK            1.0209        0.2678        -0.8016      0.1012
MIA             1.2408       -0.5475         0.3289     -0.6320
MKI67           1.0446        0.4630        -0.6717      0.3161
MLPH           -1.4150        0.4842         0.8829      0.8194
MMP11          -0.1295        0.5220         0.3402      0.5653
MYC             0.5639       -0.9904        -0.3015     -0.2791
NAT1           -0.9711       -0.2708         1.2256      0.9576
ORC6L           1.0086        0.5152        -1.0385     -0.0336
PGR            -0.9216       -0.5755         1.2061      0.9278
PHGDH           0.9192        0.0322        -0.5194     -0.5371
PTTG1           0.9541        0.2079        -1.1207      0.1052
RRM2            0.7895        0.6336        -0.8099      0.3228
SFRP1           0.7694       -0.8271         0.2617     -1.0846
SLC39A6        -0.9992       -0.4573         0.6607      0.9222
TMEM45B        -1.0721        0.7926         0.3190      0.2016
TYMS            0.9823       -0.0960        -0.8593      0.1827
UBE2C           0.8294        0.3358        -1.0141      0.0608
UBE2T           0.6258        0.0617        -0.8652     -0.0487

[34] After performing the Breast Cancer Intrinsic Subtyping test with a test breast cancer tumor sample and the reference sample provided as part of the test kit, a computational algorithm based on a Pearson's correlation compares the normalized and scaled gene expression profile of the NANO46 intrinsic gene set of the test sample to the prototypical expression
signatures of the four breast cancer intrinsic subtypes. The intrinsic subtype is determined from the expression of the NANO50 set of genes (the NANO46 set of genes plus MYBL2, BIRC5, GRB7 and CCNB1), and the risk of recurrence ("ROR") is determined using the NANO46 set of genes. Specifically, the intrinsic subtype is identified by comparing the expression of the NANO50 set of genes in the biological sample with the expected expression profiles for the four intrinsic subtypes. The subtype with the most similar expression profile is assigned to the biological sample. The ROR score is an integer value on a 0-100 scale that is related to an individual patient's probability of distant recurrence within 10 years for the defined intended use population. The ROR score is calculated by comparing the expression profiles of the NANO46 genes in the biological sample with the expected profiles for the four intrinsic subtypes, as described above, to calculate four different correlation values. These correlation values are then combined with a proliferation score (and optionally one or more clinicopathological variables, such as tumor size) to calculate the ROR score. Preferably, the ROR score is calculated by comparing only the expression profiles of the NANO46 genes.
[35] Figure 6 provides a schematic of the specific algorithm transformations. The tumor sample is assigned the subtype with the largest positive correlation to the sample. Kaplan-Meier survival curves generated from a training set of untreated breast cancer patients demonstrate that the intrinsic subtypes are a prognostic indicator of recurrence-free survival (RFS) in this test population, which includes both estrogen receptor positive/negative and HER2 positive/negative patients (Figure 2).
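The correlation-based assignment described above can be illustrated with a minimal Python sketch. It is not the algorithm of Figure 6: it uses only five of the 46 genes in Table 3 to stay short, it assumes the sample values have already been normalized and scaled, and the names `CENTROIDS`, `pearson` and `assign_subtype` are hypothetical.

```python
import math

# Column order of the centroid tuples below.
SUBTYPES = ("Basal-like", "Her2-enriched", "Luminal A", "Luminal B")

# Five-gene excerpt of the Table 3 centroids (illustration only; the
# full comparison would use every gene in the set).
CENTROIDS = {
    "ERBB2": (-0.7045, 1.4182, 0.2420, 0.1978),
    "ESR1": (-1.1847, -0.4926, 0.7177, 1.0101),
    "FOXC1": (1.0666, -0.7362, -0.4078, -0.9877),
    "KRT5": (0.7136, -0.4146, 0.5832, -0.9462),
    "MKI67": (1.0446, 0.4630, -0.6717, 0.3161),
}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def assign_subtype(sample):
    """Return (best_subtype, correlations) for a dict of gene -> expression."""
    genes = sorted(CENTROIDS)
    profile = [sample[g] for g in genes]
    correlations = {}
    for column, subtype in enumerate(SUBTYPES):
        centroid = [CENTROIDS[g][column] for g in genes]
        correlations[subtype] = pearson(profile, centroid)
    # The subtype with the largest correlation is assigned to the sample.
    best = max(correlations, key=correlations.get)
    return best, correlations
```

The four correlation values returned alongside the assigned subtype are the same quantities that feed into the risk score as variables A through D.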
[36] Independent testing on a cohort of node-negative, estrogen receptor positive patients treated with tamoxifen shows predominantly Luminal A and B subtype patients, with Luminal A patients exhibiting better outcome than Luminal B patients (Figure 3). The outcome of Luminal A patients is expected to improve even further in clinical trial specimens from patients who received more modern treatment regimens (i.e. aromatase inhibitors) and had better adherence to therapy.
[37] The training set of FFPE breast tumor samples, which had well-defined clinical characteristics and clinical outcome data, was used to establish a continuous Risk of Recurrence (ROR) score. The score is calculated using coefficients from a Cox model that includes correlation to each intrinsic subtype, a proliferation score (mean gene expression of a subset of 18 of the 46 genes), and tumor size (Table 4).

Table 4. Coefficients to calculate ROR-PT (equation 1)

Test Variables                              Coefficient
Basal-like Pearson's correlation (A)         -0.0067
Her2-enriched Pearson's correlation (B)       0.4317
Luminal A Pearson's correlation (C)          -0.3172
Luminal B Pearson's correlation (D)           0.4894
Proliferation Score (E)                       0.1981
Tumor Size (F)                                0.1133

[38] The test variables in Table 4 are multiplied by the corresponding coefficients and summed to produce a risk score ("ROR-PT").
[39] ROR-PT = -0.0067*A + 0.4317*B - 0.3172*C + 0.4894*D + 0.1981*E + 0.1133*F
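As a worked illustration of the weighted sum, the following Python sketch evaluates the ROR-PT expression using the Table 4 coefficients. The dictionary keys and the function name `ror_pt` are hypothetical, and the sketch returns only the raw weighted sum: this section does not specify how tumor size is encoded or how the result is mapped onto the 0-100 integer scale, so neither step is attempted here.

```python
# Table 4 coefficients, keyed by hypothetical variable names (A-F).
ROR_PT_COEFFICIENTS = {
    "basal_correlation": -0.0067,   # A
    "her2_correlation": 0.4317,     # B
    "luma_correlation": -0.3172,    # C
    "lumb_correlation": 0.4894,     # D
    "proliferation_score": 0.1981,  # E
    "tumor_size": 0.1133,           # F
}


def ror_pt(variables):
    """Multiply each test variable by its coefficient and sum the products."""
    return sum(
        coefficient * variables[name]
        for name, coefficient in ROR_PT_COEFFICIENTS.items()
    )
```

For example, a sample with all six variables equal to 1 yields the sum of the coefficients, 0.9086.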
[40] In previous studies, the ROR score provided a continuous estimate of the risk of recurrence for ER-positive, node-negative patients who were treated with tamoxifen for 5 years (Nielsen et al. Clin. Cancer Res., 16(21):5222-5232 (2009)). This result was verified on ER-positive, node-negative patients from the same cohort (Figure 4). The ROR score also exhibited a statistically significant improvement over a clinical model in determining RFS within this test population, providing further evidence of the improved accuracy of this decision-making tool when compared to traditional clinicopathological measures (Nielsen et al. Clin. Cancer Res., 16(21):5222-5232 (2009)).
[41] The gene set contains many genes that are known markers for proliferation. The methods of the present invention provide for the determination of subsets of genes that provide a proliferation signature. The methods of the present invention can include determining the expression of at least one of, a combination of, or each of, an 18-gene subset of the NANO46 intrinsic genes selected from ANLN, CCNE1, CDC20, CDC6, CDCA1, CENPF, CEP55, EXO1, KIF2C, KNTC2, MELK, MKI67, ORC6L, PTTG1, RRM2, TYMS, UBE2C and/or UBE2T. Preferably, the expression of each of the 18-gene subset of the NANO46 gene set is determined to provide a proliferation score. The expression of one or more of these genes may be determined and a proliferation signature index can be generated by averaging the normalized expression estimates of one or more of these genes in a sample. The sample can be assigned a high proliferation signature, a moderate/intermediate proliferation signature, a low proliferation signature or an ultra-low proliferation signature. Methods of determining a proliferation signature from a biological sample are as described in Nielsen et al. Clin. Cancer Res., 16(21):5222-5232 (2009) and supplemental online material (these documents are incorporated herein, by reference, in their entireties).
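The averaging step for the proliferation signature index can be sketched in Python as follows. This is a minimal illustration, assuming the expression values are already normalized; the function name `proliferation_score` is hypothetical, gene symbols are written in their standard form (e.g. EXO1), and the cut-offs separating high, intermediate, low and ultra-low signatures are not given in this section, so no classification is attempted.

```python
# The 18-gene proliferation subset named in the text.
PROLIFERATION_GENES = (
    "ANLN", "CCNE1", "CDC20", "CDC6", "CDCA1", "CENPF",
    "CEP55", "EXO1", "KIF2C", "KNTC2", "MELK", "MKI67",
    "ORC6L", "PTTG1", "RRM2", "TYMS", "UBE2C", "UBE2T",
)


def proliferation_score(expression):
    """Average the normalized expression of the measured proliferation genes.

    `expression` maps gene symbol -> normalized expression estimate.
    Genes outside the 18-gene subset are ignored; at least one of the
    subset genes must be present.
    """
    values = [expression[g] for g in PROLIFERATION_GENES if g in expression]
    if not values:
        raise ValueError("no proliferation genes found in the sample")
    return sum(values) / len(values)
```

Averaging over whichever subset genes are present mirrors the text's allowance for "one or more of these genes"; the preferred embodiment would supply all 18.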
[42] Description of Intrinsic Subtype Biology
[43] Luminal subtypes: The most common subtypes of breast cancer are the luminal subtypes, Luminal A and Luminal B. Prior studies suggest that luminal A comprises approximately 30% to 40% and luminal B approximately 20% of all breast cancers, but they represent over 90% of hormone receptor positive breast cancers (Nielsen et al. Clin. Cancer Res., 16(21):5222-5232 (2009)). The gene expression pattern of these subtypes resembles the luminal epithelial component of the breast. These tumors are characterized by high expression of estrogen receptor (ER), progesterone receptor (PR), and genes associated with ER activation, such as LIV1, GATA3, and cyclin D1, as well as expression of luminal cytokeratins 8 and 18 (Lisa Carey & Charles Perou (2009). Gene Arrays, Prognosis, and Therapeutic Interventions. Jay R. Harris et al. (4th ed.), Diseases of the breast (pp. 458-472). Philadelphia, PA: Lippincott Williams & Wilkins).
[44] Luminal A: Luminal A (LumA) breast cancers exhibit low expression of genes associated with cell cycle activation and the ERBB2 cluster resulting in a better prognosis than Luminal B. The Luminal A subgroup has the most favorable prognosis of all subtypes and is enriched for endocrine therapy-responsive tumors.
[45] Luminal B: Luminal B (LumB) breast cancers also express ER and ER-associated genes. Genes associated with cell cycle activation are highly expressed and this tumor type can be HER2(+) (~20%) or HER2(-). The prognosis is unfavorable (despite ER expression) and endocrine therapy responsiveness is generally diminished relative to LumA.
[46] HER2-enriched: The HER2-enriched subtype is generally ER-negative and is HER2-positive in the majority of cases, with high expression of the ERBB2 cluster, including ERBB2 and GRB7. Genes associated with cell cycle activation are highly expressed and these tumors have a poor outcome.
[47] Basal-like: The Basal-like subtype is generally ER-negative, is almost always clinically HER2-negative and expresses a suite of "basal" biomarkers including the basal epithelial cytokeratins (CK) and epidermal growth factor receptor (EGFR).
Genes associated with cell cycle activation are highly expressed.
[48] Clinical variables
[49] The NANO46 classification model described herein may be further combined with information on clinical variables to generate a continuous risk of recurrence (ROR) predictor. As described herein, a number of clinical and prognostic breast cancer factors are known in the art and are used to predict treatment outcome and the likelihood of disease recurrence.

Such factors include, for example, lymph node involvement, tumor size, histologic grade, estrogen and progesterone hormone receptor status, HER-2 levels, and tumor ploidy. In one embodiment, risk of recurrence (ROR) score is provided for a subject diagnosed with or suspected of having breast cancer. This score uses the NAN046 classification model in combination with clinical factors of lymph node status (N) and tumor size (T).
Assessment of clinical variables is based on the American Joint Committee on Cancer (AJCC) standardized system for breast cancer staging. In this system, primary tumor size is categorized on a scale of 0-4 (TO: no evidence of primary tumor; Tl : <2 cm;
T2: > 2 cm - <
cm; T3 : > 5 cm; T4: tumor of any size with direct spread to chest wall or skin). Lymph node status is classified as N0-N3 (NO: regional lymph nodes are free of metastasis; N1 :
metastasis to movable, same-side axillary lymph node(s); N2: metastasis to same-side lymph node(s) fixed to one another or to other structures; N3: metastasis to same-side lymph nodes beneath the breastbone). Methods of identifying breast cancer patients and staging the disease are well known and may include manual examination, biopsy, review of patient's and/or family history, and imaging techniques, such as mammography, magnetic resonance imaging (MRI), and positron emission tomography (PET).
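As an illustration only, the tumor size scale quoted above can be expressed as a small lookup. The function name and arguments are hypothetical, and the handling of the exact 2 cm and 5 cm boundaries (assigned to the lower category) is an assumption of this sketch, not a statement of the staging rules.

```python
def tumor_size_category(size_cm, direct_spread=False):
    """Return a T category (0-4) for a primary tumor.

    size_cm: largest tumor dimension in centimeters, or None when there
             is no evidence of a primary tumor (hypothetical convention).
    direct_spread: True when the tumor extends directly to the chest
                   wall or skin, which gives T4 regardless of size.
    """
    if direct_spread:
        return 4
    if size_cm is None or size_cm <= 0:
        return 0
    if size_cm <= 2:   # assumed: exactly 2 cm falls in T1
        return 1
    if size_cm <= 5:   # assumed: exactly 5 cm falls in T2
        return 2
    return 3


if __name__ == "__main__":
    for size in (None, 1.5, 3.0, 6.2):
        print(size, "-> T%d" % tumor_size_category(size))
```

The integer returned here is the kind of value that could serve as the variable T in a risk score that includes tumor size.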
[50] Sample Source
[51] In one embodiment of the present disclosure, breast cancer subtype is assessed through the evaluation of expression patterns, or profiles, of the intrinsic genes listed in Table 1 in one or more subject samples. For the purpose of discussion, the term subject, or subject sample, refers to an individual regardless of health and/or disease status. A subject can be a subject, a study participant, a control subject, a screening subject, or any other class of individual from whom a sample is obtained and assessed in the context of the disclosure.
Accordingly, a subject can be diagnosed with breast cancer, can present with one or more symptoms of breast cancer, or a predisposing factor, such as a family (genetic) or medical history (medical) factor, for breast cancer, can be undergoing treatment or therapy for breast cancer, or the like. Alternatively, a subject can be healthy with respect to any of the aforementioned factors or criteria. It will be appreciated that the term "healthy" as used herein, is relative to breast cancer status, as the term "healthy" cannot be defined to correspond to any absolute evaluation or status. Thus, an individual defined as healthy with reference to any specified disease or disease criterion, can in fact be diagnosed with any other one or more diseases, or exhibit any other one or more disease criterion, including one or more cancers other than breast cancer. However, the healthy controls are preferably free of any cancer.
[52] In particular embodiments, the methods for predicting breast cancer intrinsic subtypes include collecting a biological sample comprising a cancer cell or tissue, such as a breast tissue sample or a primary breast tumor tissue sample. By "biological sample"
is intended any sampling of cells, tissues, or bodily fluids in which expression of an intrinsic gene can be detected. Examples of such biological samples include, but are not limited to, biopsies and smears. Bodily fluids useful in the present disclosure include blood, lymph, urine, saliva, nipple aspirates, gynecological fluids, or any other bodily secretion or derivative thereof.
Blood can include whole blood, plasma, serum, or any derivative of blood. In some embodiments, the biological sample includes breast cells, particularly breast tissue from a biopsy, such as a breast tumor tissue sample. Biological samples may be obtained from a subject by a variety of techniques including, for example, by scraping or swabbing an area, by using a needle to aspirate cells or bodily fluids, or by removing a tissue sample (i.e., biopsy). Methods for collecting various biological samples are well known in the art. In some embodiments, a breast tissue sample is obtained by, for example, fine needle aspiration biopsy, core needle biopsy, or excisional biopsy. Fixative and staining solutions may be applied to the cells or tissues for preserving the specimen and for facilitating examination.
Biological samples, particularly breast tissue samples, may be transferred to a glass slide for viewing under magnification. In one embodiment, the biological sample is a formalin-fixed, paraffin-embedded breast tissue sample, particularly a primary breast tumor sample. In various embodiments, the tissue sample is obtained from a pathologist-guided tissue core sample.
[53] Expression Profiling
[54] In various embodiments, the present disclosure provides methods for classifying, prognosticating, or monitoring breast cancer in subjects. In these embodiments, data obtained from analysis of intrinsic gene expression is evaluated using one or more pattern recognition algorithms. Such analysis methods may be used to form a predictive model, which can be used to classify test data. For example, one convenient and particularly effective method of classification employs multivariate statistical analysis modeling, first to form a model (a "predictive mathematical model") using data ("modeling data") from samples of known subtype (e.g., from subjects known to have a particular breast cancer intrinsic subtype:
LumA, LumB, Basal-like, HER2-enriched, or normal-like), and second to classify an unknown sample (e.g., "test sample") according to subtype. Pattern recognition methods have been used widely to characterize many different types of problems ranging, for example, over linguistics, fingerprinting, chemistry and psychology. In the context of the methods described herein, pattern recognition is the use of multivariate statistics, both parametric and non-parametric, to analyze data, and hence to classify samples and to predict the value of some dependent variable based on a range of observed measurements. There are two main approaches. One set of methods is termed "unsupervised" and these simply reduce data complexity in a rational way and also produce display plots which can be interpreted by the human eye. However, this type of approach may not be suitable for developing a clinical assay that can be used to classify samples derived from subjects independent of the initial sample population used to train the prediction algorithm.
[55] The other approach is termed "supervised" whereby a training set of samples with known class or outcome is used to produce a mathematical model which is then evaluated with independent validation data sets. Here, a "training set" of intrinsic gene expression data is used to construct a statistical model that predicts correctly the "subtype" of each sample. The model is then tested with independent data (referred to as a test or validation set) to determine its robustness. These models are sometimes termed "expert systems," but may be based on a range of different mathematical procedures.
Supervised methods can use a data set with reduced dimensionality (for example, the first few principal components), but typically use unreduced data, with all dimensionality. In all cases the methods allow the quantitative description of the multivariate boundaries that characterize and separate each subtype in terms of its intrinsic gene expression profile. It is also possible to obtain confidence limits on any predictions, for example, a level of probability to be placed on the goodness of fit. The robustness of the predictive models can also be checked using cross-validation, by leaving out selected samples from the analysis.
[56] The NANO46 classification model described herein is based on the gene expression profile for a plurality of subject samples using the intrinsic genes listed in Table 1. The plurality of samples includes a sufficient number of samples derived from subjects belonging to each subtype class. By "sufficient samples" or "representative number" in this context is intended a quantity of samples derived from each subtype that is sufficient for building a classification model that can reliably distinguish each subtype from all others in the group. A supervised prediction algorithm is developed based on the profiles of objectively-selected prototype samples for "training" the algorithm. The samples are selected and subtyped using an expanded intrinsic gene set according to the methods disclosed in International Patent Publication WO 2007/061876 and US Patent Publication No. 2009/0299640, each of which is herein incorporated by reference in its entirety. Alternatively, the samples can be subtyped according to any known assay for classifying breast cancer subtypes. After stratifying the training samples according to subtype, a centroid-based prediction algorithm is used to construct centroids based on the expression profile of the intrinsic gene set described in Table 1.
[57] In one embodiment, the prediction algorithm is the nearest centroid methodology related to that described in Narasimhan and Chu (2002) PNAS 99:6567-6572, which is herein incorporated by reference in its entirety. In the present disclosure, the method computes a standardized centroid for each subtype. This centroid is the average gene expression for each gene in each subtype (or "class") divided by the within-class standard deviation for that gene.
Nearest centroid classification takes the gene expression profile of a new sample, and compares it to each of these class centroids. Subtype prediction is done by calculating the Spearman's rank correlation of each test case to the five centroids, and assigning a sample to a subtype based on the nearest centroid.
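The nearest centroid procedure described above can be sketched as follows. This is a minimal illustration on made-up data, not the trained model of the disclosure: the gene values, the two-class setup, and all function names are hypothetical, and the within-class standard deviation is computed separately per class as a simplifying assumption.

```python
from statistics import mean, stdev


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def standardized_centroid(profiles):
    """Per-gene class mean divided by that gene's standard deviation
    within the class (assumed: computed per class, sample estimate)."""
    return [mean(gene) / stdev(gene) for gene in zip(*profiles)]


def classify(sample, centroids):
    """Assign the subtype whose centroid has the highest rank correlation."""
    scores = {name: spearman(sample, c) for name, c in centroids.items()}
    return max(scores, key=scores.get), scores


# Hypothetical training profiles: rows are samples, columns are four genes.
training = {
    "LumA": [[5.0, 1.0, 3.0, 2.0], [6.0, 2.0, 4.0, 3.0]],
    "Basal": [[1.0, 6.0, 2.0, 4.0], [2.0, 7.0, 3.0, 5.0]],
}
centroids = {name: standardized_centroid(p) for name, p in training.items()}

if __name__ == "__main__":
    label, scores = classify([5.5, 1.2, 3.1, 2.8], centroids)
    print(label, scores)
```

Because Spearman correlation depends only on rank order, the "nearest" centroid here is the one with the largest correlation, which corresponds to the smallest correlation-based distance.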
[58] Detection of intrinsic gene expression
[59] Any methods available in the art for detecting expression of the intrinsic genes listed in Table 1 are encompassed herein. By "detecting expression" is intended determining the quantity or presence of an RNA transcript or its expression product of an intrinsic gene.
Methods for detecting expression of the intrinsic genes of the disclosure, that is, gene expression profiling, include methods based on hybridization analysis of polynucleotides, methods based on sequencing of polynucleotides, immunohistochemistry methods, and proteomics-based methods. The methods generally detect expression products (e.g., mRNA) of the intrinsic genes listed in Table 1. In preferred embodiments, PCR-based methods, such as reverse transcription PCR (RT-PCR) (Weis et al., TIG 8:263-64, 1992), and array-based methods such as microarray (Schena et al., Science 270:467-70, 1995) are used. By "microarray" is intended an ordered arrangement of hybridizable array elements, such as, for example, polynucleotide probes, on a substrate. The term "probe" refers to any molecule that is capable of selectively binding to a specifically intended target biomolecule, for example, a nucleotide transcript or a protein encoded by or corresponding to an intrinsic gene. Probes can be synthesized by one of skill in the art, or derived from appropriate biological preparations. Probes may be specifically designed to be labeled. Examples of molecules that can be utilized as probes include, but are not limited to, RNA, DNA, proteins, antibodies, and organic molecules.
[60] Many expression detection methods use isolated RNA. The starting material is typically total RNA isolated from a biological sample, such as a tumor or tumor cell line, and corresponding normal tissue or cell line, respectively. If the source of RNA
is a primary tumor, RNA (e.g., mRNA) can be extracted, for example, from frozen or archived paraffin-embedded and fixed (e.g., formalin-fixed) tissue samples (e.g., pathologist-guided tissue core samples).
[61] General methods for RNA extraction are well known in the art and are disclosed in standard textbooks of molecular biology, including Ausubel et al., ed., Current Protocols in Molecular Biology, John Wiley & Sons, New York 1987-1999. Methods for RNA extraction from paraffin embedded tissues are disclosed, for example, in Rupp and Locker, Lab Invest. 56:A67, (1987); and De Andres et al. Biotechniques 18:42-44, (1995). In particular, RNA isolation can be performed using a purification kit, a buffer set and protease from commercial manufacturers, such as Qiagen (Valencia, CA), according to the manufacturer's instructions. For example, total RNA from cells in culture can be isolated using Qiagen RNeasy mini-columns. Other commercially available RNA isolation kits include MASTERPURE™ Complete DNA and RNA Purification Kit (Epicentre, Madison, Wis.) and Paraffin Block RNA Isolation Kit (Ambion, Austin, TX). Total RNA from tissue samples can be isolated, for example, using RNA Stat-60 (Tel-Test, Friendswood, TX). Total RNA from FFPE can be isolated, for example, using High Pure FFPE RNA Microkit, Cat No. 04823125001 (Roche Applied Science, Indianapolis, IN). RNA prepared from a tumor can be isolated, for example, by cesium chloride density gradient centrifugation. Additionally, large numbers of tissue samples can readily be processed using techniques well known to those of skill in the art, such as, for example, the single-step RNA isolation process of Chomczynski (U.S. Pat. No. 4,843,155).
[62] Isolated RNA can be used in hybridization or amplification assays that include, but are not limited to, PCR analyses and probe arrays. One method for the detection of RNA
levels involves contacting the isolated RNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length cDNA, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 60, 100, 250, or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to an intrinsic gene of the present disclosure, or any derivative DNA or RNA. Hybridization of an mRNA with the probe indicates that the intrinsic gene in question is being expressed.
[63] In one embodiment, the mRNA is immobilized on a solid surface and contacted with a probe, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative embodiment, the probes are immobilized on a solid surface and the mRNA is contacted with the probes, for example, in an Agilent gene chip array. A skilled artisan can readily adapt known mRNA
detection methods for use in detecting the level of expression of the intrinsic genes of the present disclosure.
[64] An alternative method for determining the level of intrinsic gene expression product in a sample involves the process of nucleic acid amplification, for example, by RT-PCR (U.S. Pat. No. 4,683,202), ligase chain reaction (Barany, PNAS USA 88:189-93, (1991)), self-sustained sequence replication (Guatelli et al., Proc. Natl. Acad. Sci. USA 87:1874-78, (1990)), transcriptional amplification system (Kwoh et al., Proc. Natl. Acad. Sci. USA 86:1173-77, (1989)), Q-Beta Replicase (Lizardi et al., Bio/Technology 6:1197, (1988)), rolling circle replication (U.S. Pat. No. 5,854,033), or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques well known to those of skill in the art. These detection schemes are especially useful for the detection of nucleic acid molecules if such molecules are present in very low numbers.
[65] In particular aspects of the disclosure, intrinsic gene expression is assessed by quantitative RT-PCR. Numerous different PCR or QPCR protocols are known in the art and exemplified herein below and can be directly applied or adapted for use using the presently-described compositions for the detection and/or quantification of the intrinsic genes listed in Table 1. Generally, in PCR, a target polynucleotide sequence is amplified by reaction with at least one oligonucleotide primer or pair of oligonucleotide primers. The primer(s) hybridize to a complementary region of the target nucleic acid and a DNA polymerase extends the primer(s) to amplify the target sequence. Under conditions sufficient to provide polymerase-based nucleic acid amplification products, a nucleic acid fragment of one size dominates the reaction products (the target polynucleotide sequence which is the amplification product).
The amplification cycle is repeated to increase the concentration of the single target polynucleotide sequence. The reaction can be performed in any thermocycler commonly used for PCR. However, preferred are cyclers with real time fluorescence measurement capabilities, for example, SMARTCYCLER (Cepheid, Sunnyvale, CA), ABI PRISM 7700 (Applied Biosystems, Foster City, Calif.), ROTOR-GENE™ (Corbett Research, Sydney, Australia), LIGHTCYCLER (Roche Diagnostics Corp, Indianapolis, Ind.), ICYCLER (Biorad Laboratories, Hercules, Calif.) and MX4000 (Stratagene, La Jolla, Calif.).
[66] In another embodiment of the disclosure, microarrays are used for expression profiling. Microarrays are particularly well suited for this purpose because of the reproducibility between different experiments. DNA microarrays provide one method for the simultaneous measurement of the expression levels of large numbers of genes.
Each array consists of a reproducible pattern of capture probes attached to a solid support. Labeled RNA or DNA is hybridized to complementary probes on the array and then detected by laser scanning. Hybridization intensities for each probe on the array are determined and converted to a quantitative value representing relative gene expression levels. See, for example, U.S. Pat. Nos. 6,040,138, 5,800,992, 6,020,135, 6,033,860, and 6,344,316. High-density oligonucleotide arrays are particularly useful for determining the gene expression profile for a large number of RNAs in a sample.
[67] In a preferred embodiment, the nCounter Analysis system is used to detect intrinsic gene expression. The basis of the nCounter Analysis system is the unique code assigned to each nucleic acid target to be assayed (International Patent Application Publication No. WO 08/124847, US Patent No. 8,415,102 and Geiss et al. Nature Biotechnology. 2008. 26(3):317-325; the contents of which are each incorporated herein by reference in their entireties). The code is composed of an ordered series of colored fluorescent spots which create a unique barcode for each target to be assayed. A pair of probes is designed for each DNA or RNA target: a biotinylated capture probe and a reporter probe carrying the fluorescent barcode. This system is also referred to, herein, as the nanoreporter code system.
[68] Specific reporter and capture probes are synthesized for each target. Briefly, sequence-specific DNA oligonucleotide probes are attached to code-specific reporter molecules. Preferably, each sequence-specific reporter probe comprises a target-specific sequence capable of hybridizing to no more than one NANO46 gene of Table 1 and optionally comprises at least two, at least three, or at least four label attachment regions, said attachment regions comprising one or more label monomers that emit light. Capture probes are made by ligating a second sequence-specific DNA oligonucleotide for each target to a universal oligonucleotide containing biotin. Reporter and capture probes are all pooled into a single hybridization mixture, the "probe library". Preferably, the probe library comprises a probe pair (a capture probe and reporter) for each of the NANO46 genes in Table 1.
[69] The relative abundance of each target is measured in a single multiplexed hybridization reaction. The method comprises contacting a biological sample with a probe library, the library comprising a probe pair for the NANO46 genes in Table 1, such that the presence of the target in the sample creates a probe pair-target complex.
The complex is then purified. More specifically, the sample is combined with the probe library, and hybridization occurs in solution. After hybridization, the tripartite hybridized complexes (probe pairs and target) are purified in a two-step procedure using magnetic beads linked to oligonucleotides complementary to universal sequences present on the capture and reporter probes. This dual purification process allows the hybridization reaction to be driven to completion with a large excess of target-specific probes, as they are ultimately removed, and, thus, do not interfere with binding and imaging of the sample. All post hybridization steps are handled robotically on a custom liquid-handling robot (Prep Station, NanoString Technologies).
[70] Purified reactions are deposited by the Prep Station into individual flow cells of a sample cartridge, bound to a streptavidin-coated surface via the capture probe, electrophoresed to elongate the reporter probes, and immobilized. After processing, the sample cartridge is transferred to a fully automated imaging and data collection device (Digital Analyzer, NanoString Technologies). The expression level of a target is measured by imaging each sample and counting the number of times the code for that target is detected.
Data is output in simple spreadsheet format listing the number of counts per target, per sample.
[71] This system can be used along with nanoreporters. Additional disclosure regarding nanoreporters can be found in International Publication Nos. WO 07/076129 and WO 07/076132, and US Patent Publication Nos. 2010/0015607 and 2010/0261026, the contents of which are incorporated herein in their entireties. Further, the terms nucleic acid probes and nanoreporters can include the rationally designed (e.g., synthetic) sequences described in International Publication No. WO 2010/019826 and US Patent Publication No. 2010/0047924, each incorporated herein by reference in its entirety.
[72] Data processing
[73] It is often useful to pre-process gene expression data, for example, by addressing missing data, translation, scaling, normalization, weighting, etc.
Multivariate projection methods, such as principal component analysis (PCA) and partial least squares analysis (PLS), are so-called scaling sensitive methods. By using prior knowledge and experience about the type of data studied, the quality of the data prior to multivariate modeling can be enhanced by scaling and/or weighting. Adequate scaling and/or weighting can reveal important and interesting variation hidden within the data, and therefore make subsequent multivariate modeling more efficient. Scaling and weighting may be used to place the data in the correct metric, based on knowledge and experience of the studied system, and therefore reveal patterns already inherently present in the data.
[74] If possible, missing data, for example gaps in column values, should be avoided. However, if necessary, such missing data may be replaced or "filled" with, for example, the mean value of a column ("mean fill"); a random value ("random fill"); or a value based on a principal component analysis ("principal component fill").
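A "mean fill" can be sketched in a few lines. The representation of a missing value as None and the function name are assumptions made for this illustration.

```python
def mean_fill(column):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("column has no observed values to average")
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in column]


if __name__ == "__main__":
    print(mean_fill([1.0, None, 3.0]))
```

Mean filling leaves the column mean unchanged but reduces its variance, which is one reason to avoid missing data where possible.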
[75] "Translation" of the descriptor coordinate axes can be useful. Examples of such translation include normalization and mean centering. "Normalization" may be used to remove sample-to-sample variation. For microarray data, the process of normalization aims to remove systematic errors by balancing the fluorescence intensities of the two labeling dyes. The dye bias can come from various sources including differences in dye labeling efficiencies, heat and light sensitivities, as well as scanner settings for scanning two channels.
Some commonly used methods for calculating a normalization factor include: (i) global normalization that uses all genes on the array; (ii) housekeeping genes normalization that uses constantly expressed housekeeping/invariant genes; and (iii) internal controls normalization that uses known amounts of exogenous control genes added during hybridization (Quackenbush Nat. Genet. 32 (Suppl.), 496-501 (2002)). In one embodiment, the intrinsic genes disclosed herein can be normalized to control housekeeping genes. For example, the housekeeping genes described in U.S. Patent Publication 2008/0032293, which is herein incorporated by reference in its entirety, can be used for normalization. Exemplary housekeeping genes include MRPL19, PSMC4, SF3A1, PUM1, ACTB, GAPD, GUSB, RPLP0, and TFRC. It will be understood by one of skill in the art that the methods disclosed herein are not bound by normalization to any particular housekeeping genes, and that any suitable housekeeping gene(s) known in the art can be used.
[76] Many normalization approaches are possible, and they can often be applied at any of several points in the analysis. In one embodiment, microarray data is normalized using the LOWESS method, which is a global locally weighted scatter plot smoothing normalization function. In another embodiment, qPCR data is normalized to the geometric mean of a set of multiple housekeeping genes.
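Normalization to the geometric mean of several housekeeping genes might look like the sketch below. It assumes linear-scale, strictly positive expression values (for example, counts or relative quantities rather than log-scale cycle thresholds); the numbers and function names are hypothetical.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values, computed in log space."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalize_to_housekeepers(expression, housekeepers):
    """Divide each non-housekeeping gene by the geometric mean of the
    housekeeping genes measured in the same sample."""
    reference = geometric_mean([expression[g] for g in housekeepers])
    return {
        gene: value / reference
        for gene, value in expression.items()
        if gene not in housekeepers
    }


if __name__ == "__main__":
    # Hypothetical measurements for one sample.
    sample = {"ESR1": 400.0, "ERBB2": 50.0, "ACTB": 100.0, "GUSB": 400.0}
    print(normalize_to_housekeepers(sample, ["ACTB", "GUSB"]))
```

A geometric mean is less dominated by a single highly expressed reference gene than an arithmetic mean, which is one common reason for preferring it with multiple housekeepers.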
[77] "Mean centering" may also be used to simplify interpretation. Usually, for each descriptor, the average value of that descriptor for all samples is subtracted. In this way, the mean of a descriptor coincides with the origin, and all descriptors are "centered" at zero. In "unit variance scaling," data can be scaled to equal variance. Usually, the value of each descriptor is scaled by 1/StDev, where StDev is the standard deviation for that descriptor for all samples. "Pareto scaling" is, in some sense, intermediate between mean centering and unit variance scaling. In pareto scaling, the value of each descriptor is scaled by 1/sqrt(StDev), where StDev is the standard deviation for that descriptor for all samples. In this way, each descriptor has a variance numerically equal to its initial standard deviation.
The pareto scaling may be performed, for example, on raw data or mean centered data.
[78] "Logarithmic scaling" may be used to assist interpretation when data have a positive skew and/or when data spans a large range, e.g., several orders of magnitude.
Usually, for each descriptor, the value is replaced by the logarithm of that value. In "equal range scaling,"
each descriptor is divided by the range of that descriptor for all samples. In this way, all descriptors have the same range, that is, 1. However, this method is sensitive to presence of outlier points. In "autoscaling," each data vector is mean centered and unit variance scaled.
This technique is very useful because each descriptor is then weighted equally, and large and small values are treated with equal emphasis. This can be important for genes expressed at very low, but still detectable, levels.
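The scaling operations described in the preceding paragraphs can be written compactly for a single descriptor (one gene measured across samples). This sketch assumes the sample standard deviation is meant; the function names are illustrative.

```python
from statistics import mean, stdev


def mean_center(values):
    """Subtract the descriptor's mean so it is centered at zero."""
    m = mean(values)
    return [v - m for v in values]


def unit_variance_scale(values):
    """Scale by 1/StDev so the descriptor has unit variance."""
    s = stdev(values)
    return [v / s for v in values]


def pareto_scale(values):
    """Scale by 1/sqrt(StDev); the resulting variance equals the
    descriptor's original standard deviation."""
    s = stdev(values)
    return [v / s ** 0.5 for v in values]


def equal_range_scale(values):
    """Divide by the range so max - min equals 1 (sensitive to outliers)."""
    r = max(values) - min(values)
    return [v / r for v in values]


def autoscale(values):
    """Mean center, then unit variance scale."""
    return unit_variance_scale(mean_center(values))


if __name__ == "__main__":
    x = [2.0, 4.0, 6.0, 8.0]
    print(autoscale(x))
    print(pareto_scale(x))
```

The pareto property follows directly: dividing by sqrt(StDev) divides the variance by StDev, leaving StDev^2 / StDev = StDev.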
[79] In one embodiment, data is collected for one or more test samples and classified using the NANO46 classification model described herein. When comparing data from multiple analyses (e.g., comparing expression profiles for one or more test samples to the centroids constructed from samples collected and analyzed in an independent study), it will be necessary to normalize data across these data sets. In one embodiment, Distance Weighted Discrimination (DWD) is used to combine these data sets together (Benito et al. (2004) Bioinformatics 20(1): 105-114, incorporated by reference herein in its entirety). DWD is a multivariate analysis tool that is able to identify systematic biases present in separate data sets and then make a global adjustment to compensate for these biases; in essence, each separate data set is a multi-dimensional cloud of data points, and DWD takes two point clouds and shifts one such that it more optimally overlaps the other.
[80] The methods described herein may be implemented and/or the results recorded using any device capable of implementing the methods and/or recording the results.
Examples of devices that may be used include but are not limited to electronic computational devices, including computers of all types. When the methods described herein are implemented and/or recorded in a computer, the computer program that may be used to configure the computer to carry out the steps of the methods may be contained in any computer readable medium capable of containing the computer program. Examples of computer readable media that may be used include but are not limited to diskettes, CD-ROMs, DVDs, ROM, RAM, and other memory and computer storage devices. The computer program that may be used to configure the computer to carry out the steps of the methods and/or record the results may also be provided over an electronic network, for example, over the internet, an intranet, or other network.
[81] Calculation of risk of recurrence
[82] Provided herein are methods for predicting breast cancer outcome within the context of the intrinsic subtype and optionally other clinical variables. Outcome may refer to overall or disease-specific survival, event-free survival, or outcome in response to a particular treatment or therapy. In particular, the methods may be used to predict the likelihood of long-term, disease-free survival. "Predicting the likelihood of survival of a breast cancer patient" is intended to assess the risk that a patient will die as a result of the underlying breast cancer. "Long-term, disease-free survival" is intended to mean that the patient does not die from or suffer a recurrence of the underlying breast cancer within a period of at least five years, or at least ten or more years, following initial diagnosis or treatment.
[83] In one embodiment, outcome is predicted based on classification of a subject according to subtype. In addition to providing a subtype assignment, the bioinformatics model provides a measurement of the similarity of a test sample to all four subtypes which is translated into a Risk of Recurrence (ROR) score that can be used in any patient population regardless of disease status and treatment options. The intrinsic subtypes and ROR also have value in the prediction of pathological complete response in women treated with, for example, neoadjuvant taxane and anthracycline chemotherapy (Rouzier et al., J Clin Oncol 23:8331-9 (2005), incorporated herein by reference in its entirety). Thus, in various embodiments of the present disclosure, a risk of recurrence (ROR) model is used to predict outcome. Using these risk models, subjects can be stratified into low, medium, and high risk of recurrence groups. Calculation of ROR can provide prognostic information to guide treatment decisions and/or monitor response to therapy.
[84] In some embodiments described herein, the prognostic performance of the defined intrinsic subtypes and/or other clinical parameters is assessed utilizing a Cox Proportional Hazards Model Analysis, which is a regression method for survival data that provides an estimate of the hazard ratio and its confidence interval. The Cox model is a well-recognized statistical technique for exploring the relationship between the survival of a patient and particular variables. This statistical method permits estimation of the hazard (i.e., risk) of individuals given their prognostic variables (e.g., intrinsic gene expression profile with or without additional clinical factors, as described herein). The "hazard ratio" is the risk of death at any given time point for patients displaying particular prognostic variables. See generally Spruance et al., Antimicrob. Agents & Chemo. 48:2787-92 (2004).
[85] The NANO46 classification model described herein can be trained for risk of recurrence using subtype distances (or correlations) alone, or using subtype distances with clinical variables as discussed supra. In one embodiment, the risk score for a test sample is calculated using intrinsic subtype distances alone using the following equation:
[86] ROR = 0.05*Basal + 0.11*Her2 + -0.25*LumA + 0.07*LumB + -0.11*Normal, where the variables "Basal," "Her2," "LumA," "LumB," and "Normal" are the distances to the centroid for each respective classifier when the expression profile from a test sample is compared to centroids constructed using the gene expression data deposited with the Gene Expression Omnibus (GEO).
[87] Risk score can also be calculated using a combination of breast cancer subtype and the clinical variables tumor size (T) and lymph nodes status (N) using the following equation:
ROR (full) = 0.05*Basal + 0.1*Her2 + -0.19*LumA + 0.05*LumB + -0.09*Normal + 0.16*T + 0.08*N, again when comparing test expression profiles to centroids constructed using the gene expression data deposited with GEO as accession number GSE2845.
[88] In yet another embodiment, risk score for a test sample is calculated using intrinsic subtype distances alone using the following equation:
[89] ROR-S = 0.05*Basal + 0.12*Her2 + -0.34*LumA + 0.23*LumB, where the variables "Basal," "Her2," "LumA," and "LumB" are as described supra and the test expression profiles are compared to centroids constructed using the gene expression data deposited with GEO as accession number GSE2845. In yet another embodiment, risk score can also be calculated using a combination of breast cancer subtype and the clinical variable tumor size (T) using the following equation (where the variables are as described supra):
ROR-C = 0.05*Basal + 0.11*Her2 + -0.23*LumA + 0.09*LumB + 0.17*T.
[90] In yet another embodiment, risk score for a test sample is calculated using intrinsic subtype distances in combination with the proliferation signature ("Prolif") using the following equation:
[91] ROR-P = -0.001*Basal + 0.7*Her2 + -0.95*LumA + 0.49*LumB + 0.34*Prolif, where the variables "Basal," "Her2," "LumA," "LumB" and "Prolif" are as described supra and the test expression profiles are compared to centroids constructed using the gene expression data deposited with GEO as accession number GSE2845.
[92] In yet another embodiment, risk score can also be calculated using a combination of breast cancer subtype, proliferation signature and the clinical variable tumor size (T) using the ROR-PT described in conjunction with Table 3 supra.
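Each of the risk score equations in paragraphs [86] to [91] is a weighted sum of subtype distances and, in some variants, clinical or proliferation terms. The following is a minimal sketch of that arithmetic with the coefficients transcribed from those equations. The dictionary layout, the model labels such as "ROR-full", and the function name risk_score are illustrative assumptions, and the input values used in any example are hypothetical.

```python
# Coefficients transcribed from the equations in paragraphs [86]-[91].
# The table layout and the names below are illustrative assumptions.
ROR_MODELS = {
    "ROR": {"Basal": 0.05, "Her2": 0.11, "LumA": -0.25,
            "LumB": 0.07, "Normal": -0.11},
    "ROR-full": {"Basal": 0.05, "Her2": 0.10, "LumA": -0.19,
                 "LumB": 0.05, "Normal": -0.09, "T": 0.16, "N": 0.08},
    "ROR-S": {"Basal": 0.05, "Her2": 0.12, "LumA": -0.34, "LumB": 0.23},
    "ROR-C": {"Basal": 0.05, "Her2": 0.11, "LumA": -0.23,
              "LumB": 0.09, "T": 0.17},
    "ROR-P": {"Basal": -0.001, "Her2": 0.7, "LumA": -0.95,
              "LumB": 0.49, "Prolif": 0.34},
}


def risk_score(model, inputs):
    """Return the weighted sum for the named model.

    `inputs` maps each variable name used by the model to its value
    (subtype distance, tumor size, nodal status or proliferation score).
    """
    weights = ROR_MODELS[model]
    missing = sorted(set(weights) - set(inputs))
    if missing:
        raise KeyError(f"missing inputs for {model}: {missing}")
    # Extra keys in `inputs` are ignored so one record can feed several models.
    return sum(weight * inputs[name] for name, weight in weights.items())
```

A call such as risk_score("ROR-C", {"Basal": 0.2, "Her2": 0.1, "LumA": 0.6, "LumB": 0.3, "T": 1}) evaluates the ROR-C equation for one hypothetical sample; the ROR-PT variant is omitted because its coefficients are given with Table 3 rather than in this passage.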
[93] Detection of Subtypes
[94] Immunohistochemistry for estrogen receptor (ER), progesterone receptor (PgR), HER2, and Ki67 was performed concurrently on serial sections with the standard streptavidin-biotin complex method with 3,3'-diaminobenzidine as the chromogen. Staining for ER and PgR, and its interpretation, can be performed as described previously (Cheang et al., Clin Cancer Res. 2008;14(5):1368-1376); however, any method known in the art may be used.
[95] For example, a Ki67 antibody (clone SP6; ThermoScientific, Fremont, CA) can be applied at a 1:200 dilution for 32 minutes, by following the Ventana Benchmark automated immunostainer (Ventana, Tucson, AZ) standard Cell Conditioner 1 (CC1, a proprietary buffer) protocol at 98 °C for 30 minutes. An ER antibody (clone SP1; ThermoFisher Scientific, Fremont, CA) can be used at a 1:250 dilution with a 10-minute incubation, after an 8-minute microwave antigen retrieval in 10 mM sodium citrate (pH 6.0). Ready-to-use PR antibody (clone 1E2; Ventana) can be used by following the CC1 protocol as above. HER2 staining can be done with an SP3 antibody (ThermoFisher Scientific) at a 1:100 dilution after antigen retrieval in 0.05 M Tris buffer (pH 10.0) with heating to 95 °C in a steamer for 30 minutes. For the HER2 fluorescent in situ hybridization (FISH) assay, slides can be hybridized with probes to LSI (locus-specific identifier) HER2/neu and to centromere 17 by use of the PathVysion HER-2 DNA Probe kit (Abbott Molecular, Abbott Park, IL) according to the manufacturer's instructions, with modifications to pretreatment and hybridization as previously described (Brown LA, Irving J, Parker R, et al. Amplification of EMSY, a novel oncogene on 11q13, in high grade ovarian surface epithelial carcinomas. Gynecol Oncol. 2006;100(2):264-270). Slides can then be counterstained with 4',6-diamidino-2-phenylindole, the stained material visualized on a Zeiss Axioplan epifluorescent microscope, and the signals analyzed with a Metafer image acquisition system (Metasystems, Altlussheim, Germany). Biomarker expression from immunohistochemistry assays can then be scored by two pathologists who are blinded to the clinicopathological characteristics and outcome and who use previously established and published criteria for biomarker expression levels developed on other breast cancer cohorts.
[96] Tumors were considered positive for ER or PR if immunostaining was observed in more than 1% of tumor nuclei, as described previously. Tumors were considered positive for HER2 if immunostaining was scored as 3+ according to HercepTest criteria, with an amplification ratio for fluorescent in situ hybridization of 2.0 or more being the cut point that was used to segregate immunohistochemistry equivocal tumors (scored as 2+) (Yaziji, et al., JAMA, 291(16):1972-1977 (2004)). Ki67 was visually scored for percentage of tumor cell nuclei with positive immunostaining above the background level by two pathologists.
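The positivity rules in paragraph [96] can be sketched as two small functions. This is a sketch under stated assumptions: the function names are hypothetical, the IHC score is taken to be an integer from 0 to 3, and treating scores of 0 and 1+ as HER2-negative is an assumption based on common practice rather than something stated in the text.

```python
def hormone_receptor_positive(percent_positive_nuclei):
    """ER or PR is positive when more than 1% of tumor nuclei stain."""
    if not 0 <= percent_positive_nuclei <= 100:
        raise ValueError("percentage must be between 0 and 100")
    return percent_positive_nuclei > 1.0


def her2_positive(ihc_score, fish_ratio=None):
    """Combine the IHC score with the FISH amplification ratio.

    3+ is positive outright; 2+ is equivocal and is resolved by a FISH
    ratio of 2.0 or more; 0 and 1+ are treated as negative (assumption).
    """
    if ihc_score not in (0, 1, 2, 3):
        raise ValueError("IHC score must be 0, 1, 2 or 3")
    if ihc_score == 3:
        return True
    if ihc_score == 2:
        if fish_ratio is None:
            raise ValueError("IHC 2+ is equivocal; a FISH ratio is required")
        return fish_ratio >= 2.0
    return False
```

Raising an error for an equivocal 2+ score without a FISH ratio, instead of guessing a status, reflects the role of FISH as the tie-breaker in the text.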
[97] Other methods can also be used to detect subtypes. These techniques include ELISA, Western blots, Northern blots, or FACS analysis.
[98] Kits
[99] The present disclosure also describes kits useful for classifying breast cancer intrinsic subtypes and/or providing prognostic information to identify risk of recurrence. These kits comprise a set of capture probes and/or primers specific for the intrinsic genes listed in Table 1. The kit may further comprise a computer readable medium.
[100] In one embodiment of the present disclosure, the capture probes are immobilized on an array. By "array" is intended a solid support or a substrate with peptide or nucleic acid probes attached to the support or substrate. Arrays typically comprise a plurality of different capture probes that are coupled to a surface of a substrate in different, known locations. The arrays of the disclosure comprise a substrate having a plurality of capture probes that can specifically bind an intrinsic gene expression product. The number of capture probes on the substrate varies with the purpose for which the array is intended. The arrays may be low-density arrays or high-density arrays and may contain 4 or more, 8 or more, 12 or more, 16 or more, 32 or more addresses, but will minimally comprise capture probes for the 46 intrinsic genes listed in Table 1.
[101] Techniques for the synthesis of these arrays using mechanical synthesis methods are described in, e.g., U.S. Patent No. 5,384,261, incorporated herein by reference in its entirety for all purposes. The array may be fabricated on a surface of virtually any shape or even a multiplicity of surfaces. Arrays may be probes (e.g., nucleic-acid binding probes) on beads, gels, polymeric surfaces, fibers such as fiber optics, glass or any other appropriate substrate, see U.S. Pat. Nos. 5,770,358, 5,789,162, 5,708,153, 6,040,193 and 5,800,992, each of which is hereby incorporated in its entirety for all purposes. Arrays may be packaged in such a manner as to allow for diagnostics or other manipulation on the device. See, for example, U.S. Pat. Nos. 5,856,174 and 5,922,591 herein incorporated by reference.
[102] In another embodiment, the kit comprises a set of oligonucleotide primers sufficient for the detection and/or quantitation of each of the intrinsic genes listed in Table 1. The oligonucleotide primers may be provided in a lyophilized or reconstituted form, or may be provided as a set of nucleotide sequences. In one embodiment, the primers are provided in a microplate format, where each primer set occupies a well (or multiple wells, as in the case of replicates) in the microplate. The microplate may further comprise primers sufficient for the detection of one or more housekeeping genes as discussed infra. The kit may further comprise reagents and instructions sufficient for the amplification of expression products from the genes listed in Table 1.
[103] In order to facilitate ready access, e.g., for comparison, review, recovery, and/or modification, the molecular signatures/expression profiles are typically recorded in a database. Most typically, the database is a relational database accessible by a computational device, although other formats, e.g., manually accessible indexed files of expression profiles as photographs, analogue or digital imaging readouts, spreadsheets, etc. can be used.
Regardless of whether the expression patterns initially recorded are analog or digital in nature, the expression patterns, expression profiles (collective expression patterns), and molecular signatures (correlated expression patterns) are stored digitally and accessed via a database. Typically, the database is compiled and maintained at a central facility, with access being available locally and/or remotely.
[104] Devices and Tests
[105] General - The NanoString nCounter Analysis System delivers direct, multiplexed measurements of gene expression through digital readouts of the relative abundance of hundreds of mRNA transcripts. The nCounter Analysis System uses gene-specific probe pairs (Figure 7) that are mixed together to form a single reagent called a CodeSet. The probe pairs hybridize directly to the mRNA sample in solution, eliminating any enzymatic reactions that might introduce bias in the results.
[106] After hybridization, all of the sample processing steps are automated on the nCounter Prep Station. First, excess capture and reporter probes are removed (Figure 8) followed by binding of the probe-target complexes to random locations on the surface of the nCounter cartridge via a streptavidin-biotin linkage (Figure 9).
[107] Finally, probe/target complexes are aligned and immobilized (Figure 10) in the nCounter Cartridge. The Reporter Probe carries the fluorescent signal; the Capture Probe allows the complex to be immobilized for data collection. Up to 800 pairs of probes, each specific to a particular gene, can be combined with a series of internal controls to form a CodeSet.
[108] After sample processing has completed, cartridges are placed in the nCounter Digital Analyzer for data collection. Each target molecule of interest is identified by the "color code"
generated by six ordered fluorescent spots present on the reporter probe. The Reporter Probes on the surface of the cartridge are then counted and tabulated for each target molecule (Figure 11).
[109] Reagents and Test Components - The Breast Cancer test will simultaneously measure the expression levels of NANO46 plus eight housekeeping genes in a single hybridization reaction using an nCounter CodeSet designed specifically to those genes. Each assay also includes positive assay controls comprised of a linear titration of in vitro transcribed RNA transcripts and corresponding probes, and a set of probes with no sequence homology to human RNA sequences which are used as negative controls. Each assay run includes a reference sample consisting of in vitro transcribed RNAs of the targets and housekeeping genes for normalization purposes. The normalized gene expression profile of a breast tumor sample is correlated to prototypical gene expression profiles of the four breast cancer intrinsic subtypes (Luminal A, Luminal B, HER2-enriched, or Basal-like) that were identified from a training set of breast tumors. The gene expression profile, in combination with selected clinical variables, is used as part of a trained algorithm as a prognostic indicator of risk of distant recurrence of breast cancer.
[110] Figure 12 outlines the assay processes associated with the nCounter Analysis System Breast Cancer Test.
[111] FFPE Tissue Extraction - The Breast Cancer Test will use RNA extracted from Formalin-fixed, Paraffin-embedded (FFPE) tissue that has been diagnosed as invasive carcinoma of the breast. A pathologist first performs an H & E stain of a tumor section mounted onto a slide to identify the region of viable invasive breast carcinoma containing tumor content above a minimum threshold. The pathologist circles the region on the H & E slide. The pathologist then mounts unstained tissue sections onto slides and marks the area of the slides containing invasive tumor. For larger tumors (>100 mm² of viable invasive carcinoma on the H & E slide), the test requires only a single 10 µm section. For smaller tumors (<100 mm²), the test requires 3 sections. The identified region of viable invasive breast carcinoma containing sufficient tumor content on the slides is macro-dissected prior to RNA extraction. Procedures for shipping FFPE tissue slides from the collection site to a testing site will be defined as part of the procedure.
[112] Following extraction of total RNA and removal of genomic DNA, the optical density is measured at wavelengths of 260 nm and 280 nm to determine both yield and purity. The assay procedure requires an input range of 125-500ng of total RNA for the subsequent hybridization step. NanoString plans to validate that this input range of RNA
is sufficient to reproducibly perform the assay on the nCounter Analysis System. Additionally, the RNA
quality will be measured using an OD 260/280 reading, with a target ratio of no less than 1.7 with an upper limit of 2.5. Procedures for storing RNA will be provided to the user so that downstream processing can be performed at a later point in time if desired.
[113] Requirements for Spectrophotometer to measure yield and purity post RNA extraction - RNA isolations from the FFPE sample result in a final sample volume of 30 µL. This volume is too low for the quantitation of nucleic acid abundance using absorbance measurements in a cuvette-type UV-Vis spectrophotometer; therefore, NanoString's protocol includes a step for quantitating total RNA using a low volume spectrophotometer such as the NanoDrop™ spectrophotometer. NanoString will define performance specifications for the spectrophotometer so that the range of RNA input recommended for the test is above the limit of detection of the low volume spectrophotometer and is reproducibly measurable.
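The yield and purity criteria described above (an input of 125-500 ng of total RNA, an OD 260/280 ratio between 1.7 and 2.5, and a 30 µL final volume) can be checked with a few lines of arithmetic. The sketch below assumes the widely used conversion of 40 ng/µL of RNA per absorbance unit at 260 nm for a 1 cm path length; that factor, the function name assess_rna, and the returned field names are assumptions introduced for illustration and are not specified in the text.

```python
# Assumption: standard conversion for RNA at a 1 cm path length.
RNA_NG_PER_UL_PER_A260 = 40.0

# Values taken from the text above.
MIN_INPUT_NG = 125.0
MAX_INPUT_NG = 500.0
MIN_RATIO_260_280 = 1.7
MAX_RATIO_260_280 = 2.5
ELUTION_VOLUME_UL = 30.0


def assess_rna(a260, a280, volume_ul=ELUTION_VOLUME_UL):
    """Estimate yield and purity from path-length-normalized absorbances."""
    if a260 < 0 or a280 <= 0:
        raise ValueError("absorbance readings must be positive")
    concentration = a260 * RNA_NG_PER_UL_PER_A260  # ng/uL
    total_yield = concentration * volume_ul        # ng in the eluate
    ratio = a260 / a280
    return {
        "concentration_ng_per_ul": concentration,
        "total_yield_ng": total_yield,
        "ratio_260_280": ratio,
        "purity_ok": MIN_RATIO_260_280 <= ratio <= MAX_RATIO_260_280,
        # Enough material to reach the lower end of the input range.
        "enough_rna": total_yield >= MIN_INPUT_NG,
    }
```

The upper bound MAX_INPUT_NG is kept as a named constant because it limits how much of the eluate is pipetted into the hybridization, not whether the extraction itself is acceptable.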
[114] Hybridization - For each set of up to 10 RNA samples, the user will pipette the specified amount of RNA into separate tubes within a 12 reaction strip tube and add the CodeSet and hybridization buffer. A reference sample is pipetted into the remaining two tubes with CodeSet and hybridization buffer. The CodeSet consists of probes for each gene that is targeted, additional probes for endogenous "housekeeping"
normalization genes and positive and negative controls. The probes within the CodeSet pertaining to each of these genes within the four groups (target genes, housekeeping genes, and positive and negative controls) are each assigned a unique code and are therefore individually identifiable within each run. The reference sample consists of in vitro transcribed RNA for the targeted genes and housekeeping genes. Once the hybridization reagents are added to the respective tubes, the user transfers the strip tube into a heated-lid heatblock for a specified period of time at a set temperature.
[115] Requirement for Heat block with heated lid for hybridization step - The nCounter assay includes an overnight hybridization under isothermal conditions. Because the overnight hybridization is performed in a small volume at elevated temperature, care must be taken to avoid evaporation. Many commercial PCR thermocyclers are equipped with heated lids that will prevent the evaporation of small volumes of liquid. Because the assay does not require any fine control of temperature ramping, any heat block with a programmable heated lid and a block with dimensions that fit the NanoString tubes will work with the NanoString assay. NanoString plans to provide specifications for heat blocks that meet the assay requirements.
[116] Purification and Binding on the Prep Station - Upon completing hybridization, the user will then transfer the strip tube containing the set of 10 assays and 2 reference samples into the nCounter Prep Station along with the required prepackaged reagents and disposables described in Table 1. The Prep Plates contain the necessary reagents for purification of excess probes and binding to the cartridge (see section IIIC below for detailed description of purification process). The prep plates are centrifuged in a swinging bucket centrifuge prior to placement on the deck of the Prep Station. An automated purification process then removes excess capture and reporter probe through two successive hybridization-driven magnetic bead capture steps. The nCounter Prep Station then transfers the purified target/probe complexes into an nCounter cartridge for capture to a glass slide. Following completion of the run, the user removes the cartridge from the Prep Station and seals it with an adhesive film.
[117] Imaging and Analysis on the Digital Analyzer - The sealed cartridge is then inserted into the nCounter Digital Analyzer which counts the number of probes captured on the slide for each gene, which corresponds to the amount of target in solution.
Automated software then checks thresholds for the housekeeping genes, reference sample, and positive and negative controls to qualify each assay and ensure that the procedure was performed correctly. The housekeeping genes provide a measure of RNA integrity, and the thresholds indicate when a tested RNA sample is too degraded to be analyzed by the test due to improper handling or storage of tissue or RNA (e.g. improper tumor fixation, FFPE block storage, RNA storage, RNA handling introducing RNase). The positive and negative assay controls indicate a failure of the assay process (e.g. error in assay setup such as sample mixing with CodeSet, or sample processing such as temperature). The signals of each sample are next normalized using the housekeeping genes to control for input sample quality. The signals are then normalized to the reference sample within each run to control for run-to-run variations. The resulting normalized data is entered in the Breast Cancer Intrinsic Subtyping algorithm to determine tumor intrinsic subtype, risk of relapse score, and risk classification.
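The two normalization steps described above (first to the housekeeping genes, then to the reference sample in the same run) can be sketched as follows. The text does not state the exact arithmetic, so this sketch assumes scaling by the geometric mean of the housekeeping counts and reporting a log2 ratio against the reference sample; both choices, and all function and gene names, are assumptions made for illustration.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def housekeeping_normalize(counts, housekeeping_genes):
    """Scale target-gene counts by the housekeeping geometric mean (assumed)."""
    housekeeping = set(housekeeping_genes)
    scale = geometric_mean(counts[gene] for gene in housekeeping)
    return {gene: count / scale
            for gene, count in counts.items() if gene not in housekeeping}


def normalize_to_reference(sample_counts, reference_counts, housekeeping_genes):
    """Log2 ratio of the sample to the in-run reference sample (assumed)."""
    sample = housekeeping_normalize(sample_counts, housekeeping_genes)
    reference = housekeeping_normalize(reference_counts, housekeeping_genes)
    return {gene: math.log2(sample[gene] / reference[gene]) for gene in sample}
```

Dividing by a housekeeping-derived scale controls for the amount and quality of input RNA, and taking the ratio to a reference processed in the same run controls for run-to-run variation, which mirrors the two purposes named in the text.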
[118] Instrumentation - The nCounter Analysis System is comprised of two instruments, the nCounter Prep Station used for post-hybridization processing, and the Digital Analyzer used for data collection and analysis.
[119] nCounter Prep Station - The nCounter Prep Station (Figure 13) is an automated fluid handling robot that processes samples post-hybridization to prepare them for data collection on the nCounter Digital Analyzer. Prior to processing on the Prep Station, total RNA
extracted from FFPE (Formalin-Fixed, Paraffin-Embedded) tissue samples is hybridized with the NanoString Reporter Probes and Capture Probes according to the nCounter protocol described above.
[120] Hybridization to the target RNA is driven by excess NanoString probes.
To accurately analyze these hybridized molecules they are first purified from the remaining excess probes in the hybridization reaction. The Prep Station isolates the hybridized mRNA
molecules from the excess Reporter and Capture probes using two sequential magnetic bead purification steps. These affinity purifications utilize custom oligonucleotide-modified magnetic beads that retain only the tripartite complexes of mRNA molecules that are bound to both a Capture probe and a Reporter probe.
[121] Next, this solution of tripartite complexes is washed through a flow cell in the NanoString sample cartridge. One surface of this flow cell is coated with a polyethylene glycol (PEG) hydrogel that is densely impregnated with covalently bound streptavidin. As the solution passes through the flow cell, the tripartite complexes are bound to the streptavidin in the hydrogel through biotin molecules that are incorporated into each Capture probe. The PEG hydrogel acts not only to provide a streptavidin-dense surface onto which the tripartite complexes can be specifically bound, but also inhibits the non-specific binding of any remaining excess reporter probes.
[122] After the complexes are bound to the flow cell surface, an electric field is applied along the length of each sample cartridge flow cell to facilitate the optical identification and order of the fluorescent spots that make up each reporter probe. Because the reporter probes are charged nucleic acids, the applied voltage imparts a force on them that uniformly stretches and orients them along the electric field. While the voltage is applied, the Prep Station adds an immobilization reagent that locks the reporters in the elongated configuration after the field is removed. Once the reporters are immobilized the cartridge can be transferred to the nCounter Digital Analyzer for data collection. All consumable components and reagents required for sample processing on the Prep Station are provided in the nCounter Master Kit. These reagents are ready to load on the deck of the nCounter Prep Station which can process up to 10 samples and 2 reference samples per run in approximately 2.5 hours.
[123] nCounter Digital Analyzer - The nCounter Digital Analyzer (Figure 14) collects data by taking images of the immobilized fluorescent reporters in the sample cartridge with a CCD camera through a microscope objective lens. Because the fluorescent Reporter Probes are small, single molecule barcodes with features smaller than the wavelength of visible light, the Digital Analyzer uses high magnification, diffraction limited imaging to resolve the sequence of the spots in the fluorescent barcodes.
[124] The Digital Analyzer captures hundreds of consecutive fields-of-view (FOV) that can each contain hundreds or thousands of discrete Reporter Probes. Each FOV is a combination of four monochrome images captured at different wavelengths. The resulting overlay can be thought of as a four-color image in blue, green, yellow, and red. Each 4-color FOV is processed in real time to provide a "count" for each fluorescent barcode in the sample.
Because each barcode specifically identifies a single mRNA molecule, the resultant data from the Digital Analyzer is a precise measure of the relative abundance of each mRNA of interest in a biological sample.
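The counting step can be illustrated with a short sketch: each resolved reporter is read as an ordered string of six spots in four colors and looked up in a codebook that maps color codes to genes. The single-letter color encoding, the codebook contents, and the function name are hypothetical; the actual image processing and code assignments are not described in the text.

```python
from collections import Counter

COLORS = {"B", "G", "Y", "R"}  # blue, green, yellow, red
SPOTS_PER_CODE = 6             # six ordered spots per reporter


def count_barcodes(observed_codes, codebook):
    """Tally reporters per gene, skipping unreadable or unknown codes."""
    counts = Counter()
    for code in observed_codes:
        if len(code) != SPOTS_PER_CODE or not set(code) <= COLORS:
            continue  # incomplete or malformed read
        gene = codebook.get(code)
        if gene is not None:
            counts[gene] += 1
    return counts
```

With a hypothetical codebook such as {"BGYRGB": "GENE_A", "RGBYRG": "GENE_B"}, the function returns one digital count per gene, which is the form of data passed on to normalization.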
[125] Software - The Prep Station and the Digital Analyzer are stand-alone units that do not require connection to an external PC, but must be networked to one another using a Local Area Network (LAN). The nCounter System software securely manages operations through user accounts and permissions. Both instruments use setup and process wizards on an embedded touch screen user interface to guide the user through the sample processing and data collection steps of the assay. The user is led through the procedure by step-by-step instructions on the Prep Station and Digital Analyzer. The instrument touch screen uses a pressure sensitive method for controlling operations and enables the user to interact with the system by touching a selection on the screen. Because the touchscreen provides a limited human interface for data entry, the system also hosts a web-based application for user accounts management, sample batch definition, and sample status tracking.
[126] When samples are processed, the system software tracks the user account and reagent lots for each sample in a centralized data repository. After expression data for a sample is acquired by the Digital Analyzer, it is first analyzed to ensure that all pre-specified quality control metrics are met. The qualified data are then processed through a locked PAM50 algorithm to generate a report containing intrinsic subtype and risk of recurrence (ROR) score. The sample report is transferred to the central repository where it can be securely accessed for download by a user with the correct permissions.
[127] The Breast Cancer Intrinsic Subtyping Algorithm - The nCounter system will be used to identify the intrinsic subtype of an excised invasive carcinoma of the breast using a 50 gene classifier algorithm originally named the PAM50 (Parker J. S., et al. Supervised Risk Predictor of Breast Cancer Based on Intrinsic Subtypes. Journal of Clinical Oncology, 27:1160-1167 (2009)). The gene expression profile will assign a breast cancer to one of four molecular classes or intrinsic subtypes: Basal-like, Luminal A, Luminal B, and HER2-enriched. A brief description of each subtype is provided below.
[128] Luminal subtypes: The most common subtypes of breast cancer are the luminal subtypes in the hormone-receptor positive population, Luminal A and Luminal B. Prior studies suggest that luminal A comprises approximately 30% to 40% and luminal B approximately 20% of breast cancers, and that the luminal subtypes together account for over 90% of hormone receptor-positive breast cancers. The gene expression pattern of these subtypes resembles the luminal epithelial component of the breast (Nielsen, TO et al. A comparison of PAM50 intrinsic subtyping with immunohistochemistry and clinical prognostic factors in tamoxifen-treated estrogen receptor positive breast cancer. Clinical Cancer Research, 16:5222-5232 (2010)). These tumors are characterized by high expression of estrogen receptor (ER), progesterone receptor (PR), and genes associated with ER activation such as LIV1, GATA3, and cyclin D1, as well as expression of luminal cytokeratins 8 and 18.
[129] Luminal A: Luminal A (LumA) breast cancers exhibit low expression of genes associated with cell cycle activation and the ERBB2 cluster resulting in a better prognosis than luminal B. The Luminal A subgroup has the most favorable prognosis of all subtypes and is enriched for endocrine therapy-responsive tumors.
[130] Luminal B: Luminal B (LumB) breast cancers express ER and ER-associated genes, but to a lower extent than LumA. Genes associated with cell cycle activation are highly expressed and this tumor type can be HER2(+) or HER2(-). The prognosis is unfavorable (despite ER expression) and endocrine therapy responsiveness is generally diminished relative to LumA.
[131] Basal-like: The Basal-like subtype is generally ER-negative, is almost always clinically HER2-negative and expresses a suite of "basal" biomarkers including the basal epithelial cytokeratins (CK) and epidermal growth factor receptor (EGFR).
Genes associated with cell cycle activation are highly expressed.
[132] HER2-enriched: The HER2-enriched subtype is generally ER-negative and is HER2-positive in the majority of cases with high expression of the ERBB2 cluster, including ERBB2 and GRB7. Genes associated with cell cycle activation are highly expressed and these tumors have a poor outcome.
[133] Cutoffs for the intrinsic subtyping algorithm are pre-defined from training sets that defined the following: 1) intrinsic subtype centroids (i.e. the prototypical gene expression profile of each subtype), 2) coefficients for Risk of Recurrence (ROR) score, and 3) risk classification (Low/Intermediate/High). The intrinsic subtype centroids (Luminal A, Luminal B, Her2-enriched, Basal-like) were trained using a clinically representative set of archived FFPE breast tumor specimens collected from multiple sites. Hierarchical clustering analysis of gene expression data from the FFPE breast tumor samples was combined with breast tumor biology (i.e. gene expression of previously defined intrinsic subtypes) to define the prototypical expression profile (i.e. centroid) of each subtype. A
computational algorithm correlates the normalized 50 gene expression profile of an unknown breast cancer tumor sample to each of the prototypical expression signatures of the four breast cancer intrinsic subtypes. The tumor sample is assigned the subtype with the largest positive correlation to the sample.
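The assignment rule described above, in which a sample takes the subtype whose centroid correlates most strongly with its normalized profile, can be sketched as follows. The sketch uses the Pearson correlation, consistent with the correlation named for the ROR score later in the text; the function names are hypothetical, and any centroid values used with it are toy numbers, not the trained centroids.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


def assign_subtype(profile, centroids):
    """Return the best-correlated subtype and all correlation values."""
    correlations = {name: pearson(profile, centroid)
                    for name, centroid in centroids.items()}
    best = max(correlations, key=correlations.get)
    return best, correlations
```

Returning all four correlation values alongside the winning subtype is a deliberate choice in this sketch, because the same values feed the ROR calculation described next.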
[134] 304 unique tumor samples with well-defined clinical characteristics and clinical outcome data were used to establish the ROR score. The ROR score is calculated using coefficients from a Cox model that includes the Pearson correlation (R) to each intrinsic subtype, a proliferation score (P), and tumor size (T), as shown in the equation below.
ROR = a*R_LumA + b*R_LumB + c*R_Her2E + d*R_Basal + e*P + f*T
[135] To classify tumor samples into specific risk groups (Low Risk/Intermediate Risk/High Risk) based on their calculated ROR score, cutoffs were set based on probability of recurrence free survival in a patient population consisting of hormone receptor positive, post-menopausal patients treated with endocrine therapy alone.
[136] Anticipated Use of NanoString Breast Cancer Test in Clinical Practice - Oncologists currently use a series of tests to develop a treatment protocol for breast cancer patients. Included in these are the IHC / FISH tests such as ER/PR IHC and HER2 IHC / FISH, and the Agendia MammaPrint assay and the Genomic Health Oncotype Dx test. These tests offer the oncologist additional information regarding the patient's prognosis and recommended treatment regimens.
[137] These tests, however, have limitations. ER, PgR, and Her2 testing is done locally by pathologists and reference labs, but the challenges with widespread standardization of IHC and FISH testing are well documented (Lester, J et al. Assessment of Tissue Estrogen and Progesterone Receptor Levels: A Survey of Current Practice, Techniques, and Quantitation Methods. The Breast Journal, 6:189-196 (2000); Wolff, AC et al. American Society of Clinical Oncology / College of American Pathologists Guideline Recommendations for Human Epidermal Growth Factor Receptor 2 Testing in Breast Cancer. Archives of Pathology and Laboratory Medicine, 131:18-43 (2007)). The MammaPrint test is FDA cleared for use only with frozen or fresh-preserved tissue samples, yet most of the tumor samples collected in the United States are FFPE rather than fresh-frozen. This test is also not distributed and is only available through the Agendia reference labs. The Oncotype Dx test can be used to predict the risk of relapse for stage I/II, node negative, estrogen receptor-positive patients receiving adjuvant Tamoxifen therapy as well as response to cyclophosphamide/methotrexate/5-fluorouracil (CMF) chemotherapy. However, this test is only offered as a lab-developed test (LDT) through Genomic Health's CLIA laboratory and is not FDA cleared for prognostic use, or FDA approved for predicting chemotherapy response.
[138] NanoString envisions a model that would have the Breast Cancer Test used in conjunction with other sources of clinical data currently available to oncologists for breast cancer prognosis in selected patient segments. The Breast Cancer Test would be an additional source of prognostic information adding significant value to established clinical parameters (i.e., tumor size, nodal status) used by oncologists in managing a patient with breast cancer.
[139] Methods, Assays and Kits
[140] The methods, assays and kits of the present invention include a series of quality control metrics that are automatically applied to each sample during analysis. These metrics evaluate the performance of the assay to determine whether the results fall within expected values. Upon successful analysis of these quality control metrics, the Assay gives the following results:
Result | Output Values
The Intrinsic Subtype of the Breast Cancer Specimen | Luminal A, Luminal B, HER2-Enriched, Basal-Like
Individual Estimate of the Probability of Distant Recurrence within 10 years | 0-100%
Risk of Recurrence (ROR) Score | Integer value on a 0-100 scale
Risk Category | Low, Intermediate, High
[141] Intrinsic Subtypes
[142] The Intrinsic Subtype of a breast cancer tumor has been shown to be related to prognosis in Early Stage Breast Cancer. On average, patients with a Luminal A tumor have significantly better outcomes than patients with Luminal B, HER2-Enriched, or Basal-like tumors.
[143] The Intrinsic Subtype is identified by comparing the gene expression profile of 50 genes in an unknown sample with the expected expression profiles for the four intrinsic subtypes. The subtype with the most similar profile is assigned to the unknown sample.
[144] The most common subtypes of breast cancer are the luminal subtypes, Luminal A
(LumA) and Luminal B (LumB). Prior studies suggest that Luminal A comprises approximately 30% to 40% and Luminal B approximately 20% of breast cancers.
However, greater than 90% of hormone-receptor positive patients have luminal tumors.
The gene expression pattern of these subtypes resembles the luminal epithelial component of the breast tissue. These tumors are characterized by high expression of estrogen receptor (ER), progesterone receptor (PR), and genes associated with ER activation, such as LIV1, GATA3, and cyclin D1, as well as expression of luminal cytokeratins 8 and 18. Luminal A breast cancers exhibit lower expression of genes associated with cell cycle activation when compared to Luminal B breast cancers resulting in a better prognosis.
[145] Prior studies suggest that the HER2-Enriched subtype (Her2E) comprises approximately 20% of breast cancers. However, HER2-Enriched tumors are generally ER-negative, so only 5% of the tested ER-positive patient population was found to have HER2-Enriched breast cancer. Regardless of ER-status, HER2-Enriched tumors are HER2-positive in the majority of cases with high expression of the ERBB2 cluster, including ERBB2 and GRB7. Genes associated with cell cycle activation are also highly expressed.
[146] Published data suggest that the Basal-like subtype comprises approximately 20% of breast cancers. However, Basal-like tumors are generally ER-negative, so only 1% of hormone receptor-positive patients have Basal-like breast cancer. The Basal-like subtype is almost always clinically HER2-negative and expresses a suite of "basal"
biomarkers including the basal epithelial cytokeratins (CK) and epidermal growth factor receptor (EGFR). Genes associated with cell cycle activation are highly expressed.
[147] ROR Score [148] The ROR score is an integer value on a 0-100 scale that is related to an individual patient's probability of distant recurrence within 10 years for the defined intended use population. The ROR score is calculated by comparing the expression profiles of 46 genes in an unknown sample with the expected profiles for the four intrinsic subtypes, as described above, to calculate four different correlation values. These correlation values are then combined with a proliferation score and the tumor size to calculate the ROR
score.
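As a rough illustration of the weighted combination described in [148], the sketch below evaluates the linear expression recited in claim 5 of this document (ROR-PT). The input values are hypothetical, the coding of the tumor size variable is an assumption, and the rescaling onto the 0-100 integer scale is not given in this section, so the function returns only the unscaled sum.

```python
# Coefficients as recited in claim 5 (ROR-PT).
ROR_PT_WEIGHTS = {
    "Basal": -0.0067,
    "Her2": 0.4317,
    "LumA": -0.3172,
    "LumB": 0.4894,
    "ProliferationScore": 0.1981,
    "TumorSize": 0.1133,
}

def ror_pt_unscaled(inputs):
    """Weighted sum of the four subtype correlations, the proliferation
    score and the tumor size term. No 0-100 rescaling is applied."""
    return sum(weight * inputs[name] for name, weight in ROR_PT_WEIGHTS.items())

# Hypothetical inputs: a sample that correlates best with Luminal A.
example_inputs = {
    "Basal": -0.5,
    "Her2": -0.2,
    "LumA": 0.8,
    "LumB": 0.1,
    "ProliferationScore": -0.3,
    "TumorSize": 1,  # assumed coding; the source does not define it here
}

example_value = ror_pt_unscaled(example_inputs)
print(round(example_value, 5))
```

The signs of the weights agree with the prose: similarity to Luminal A lowers the sum, while similarity to Luminal B or HER2-Enriched, a higher proliferation score, and a larger tumor size term raise it.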
[149] Probability of 10-Year Distant Recurrence [150] The ROR scores for a cohort of post-menopausal women with hormone receptor-positive early stage breast cancer were compared to distant recurrence-free survival following surgery and treatment with 5 years of adjuvant endocrine therapy followed by 5 years of observation. This study resulted in a model relating the ROR score to the probability of distant recurrence in this tested patient population including a 95%
confidence interval.
[151] Risk Classification [152] Risk classification is also provided to allow interpretation of the ROR
score by using cutoffs related to clinical outcome in tested patient populations.
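The cutoffs listed in the risk classification table at [153] can be written as a small lookup. This is only a sketch: the function name and the nodal status keys are hypothetical, and the ranges are copied from that table.

```python
# Upper bounds (inclusive) of the Low and Intermediate ranges, per nodal status.
RISK_CUTOFFS = {
    "node_negative": (40, 60),
    "node_positive_1_3": (15, 40),
}

def risk_category(ror_score, nodal_status):
    """Map an integer ROR score (0-100) to Low, Intermediate or High."""
    if not 0 <= ror_score <= 100:
        raise ValueError("ROR score must be between 0 and 100")
    low_max, intermediate_max = RISK_CUTOFFS[nodal_status]
    if ror_score <= low_max:
        return "Low"
    if ror_score <= intermediate_max:
        return "Intermediate"
    return "High"

print(risk_category(41, "node_negative"))
print(risk_category(41, "node_positive_1_3"))
```

The same score can fall into different categories depending on nodal status: a score of 41 is Intermediate for a node-negative patient and High for a patient with 1-3 positive nodes.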
[153] Risk classification by ROR range and nodal status
Nodal Status                 ROR Range   Risk Classification
Node-Negative                0-40        Low
Node-Negative                41-60       Intermediate
Node-Negative                61-100      High
Node-Positive (1-3 nodes)    0-15        Low
Node-Positive (1-3 nodes)    16-40       Intermediate
Node-Positive (1-3 nodes)    41-100      High
[154] Quality Control [155] Each lot of the Assay components is tested using predetermined specifications. All kit-level items are lot tracked, and the critical components contained within each kit are tested together and released as a Master Lot.
[156] The assay kit includes a series of internal controls that are used to assess the quality of each run set as a whole and each sample individually. These controls are listed below.
[157] Batch Control Set: In vitro transcribed RNA Reference Sample [158] A synthetic RNA Reference Sample is included as a control within the Assay kit.
The reference sample is comprised of in-vitro transcribed RNA targets from the 50 algorithm and 8 housekeeping genes. The Reference Sample is processed in duplicate in each assay run along with a set of up to 10 unknown breast tumor RNA samples in a 12 reaction strip tube.
The signal from the Reference Sample is analyzed against pre-defined thresholds to qualify the run.
[159] The signal from each of the 50 algorithm genes of the breast tumor RNA
sample is normalized to the corresponding genes of the Reference Sample.
[160] Positive Control Set: in vitro transcribed RNA targets and corresponding Capture and Reporter Probes [161] Synthetic RNA targets are used as positive controls (PCs) for the assay.
The PC
target sequences are derived from the External RNA Control Consortium (ERCC) DNA
sequence library. The RNA targets are in-vitro transcribed from DNA plasmids.
Six RNA
targets are included within the assay kit in a 4-fold titration series (128 to 0.125 fM final concentration in the hybridization reaction) along with the corresponding Capture and Reporter Probes. The PCs are added to each breast tumor RNA sample and Reference RNA
Sample tested with the Prosigna Assay. A sample will be disqualified from further analysis if the signal intensities from the PCs do not meet pre-defined thresholds.
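The positive control concentrations follow directly from the description in [161]: six targets in a 4-fold series starting at 128 fM. The short sketch below (function and parameter names are hypothetical) reproduces the series and confirms that the sixth value is the stated lower bound of 0.125 fM.

```python
def titration_series(top_fm=128.0, fold=4.0, n_targets=6):
    """Final concentrations (fM) of a serial dilution starting at top_fm."""
    return [top_fm / fold ** step for step in range(n_targets)]

pc_concentrations = titration_series()
print(pc_concentrations)  # [128.0, 32.0, 8.0, 2.0, 0.5, 0.125]
```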
[162] Negative control set: exogenous probes without targets [163] Negative control (NC) target sequences are derived from the ERCC DNA
sequence library. The probes designed to detect these target sequences are included as part of the assay kit without the corresponding target sequence. The negative controls (NCs) are added to each breast tumor RNA sample and Reference Sample tested with the Prosigna Assay as a quality control measure. The sample will be disqualified from further analysis if the signal intensities from the NCs do not meet pre-defined thresholds.
[164] RNA Integrity Control Set: Housekeeping genes [165] Capture and Reporter Probes designed to detect 8 housekeeping genes and algorithm genes are included as part of the kit. The expression levels of the 8 housekeeping genes are analyzed to determine the quality of RNA extracted from the FFPE
tissue sample and input into the assay. The sample will be disqualified from further analysis if the expression level of the housekeeping genes falls below pre-defined thresholds.
[166] The housekeeping genes are also used to normalize for any differences in the intact RNA amount in a sample prior to Reference Sample normalization.
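The two normalization steps mentioned in [159] and [166] can be sketched as follows. The order (housekeeping first, then Reference Sample) is taken from the text; everything else is an assumption made for illustration, including the use of a geometric mean as the housekeeping scale factor, the log2 ratio output, the function names, and the counts.

```python
from math import exp, log, log2

def geometric_mean(values):
    """Geometric mean of positive numbers."""
    return exp(sum(log(v) for v in values) / len(values))

def housekeeping_normalize(counts, housekeeping):
    """Scale target-gene counts by the geometric mean of the housekeeping
    genes (assumed scale factor), dropping the housekeeping genes."""
    scale = geometric_mean([counts[g] for g in housekeeping])
    return {g: c / scale for g, c in counts.items() if g not in housekeeping}

def normalize_to_reference(sample, reference, housekeeping):
    """Housekeeping-normalize both profiles, then express each target gene
    as a log2 ratio of sample to Reference Sample."""
    s = housekeeping_normalize(sample, housekeeping)
    r = housekeeping_normalize(reference, housekeeping)
    return {g: log2(s[g] / r[g]) for g in s}

# Hypothetical counts; gene symbols are taken from lists in this document.
hk_genes = {"ACTB", "GUSB"}
sample_counts = {"MKI67": 400, "CCNE1": 100, "ACTB": 200, "GUSB": 50}
reference_counts = {"MKI67": 100, "CCNE1": 100, "ACTB": 100, "GUSB": 100}

normalized = normalize_to_reference(sample_counts, reference_counts, hk_genes)
print({g: round(v, 3) for g, v in normalized.items()})
```

In this toy case MKI67 comes out four-fold above the reference (log2 ratio of 2) and CCNE1 equal to it (log2 ratio of 0) once both profiles have been scaled by their housekeeping genes.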
[167] Definitions [168] For the purposes of the present disclosure, "breast cancer" includes, for example, those conditions classified by biopsy or histology as malignant pathology. The clinical delineation of breast cancer diagnoses is well known in the medical arts. One of skill in the art will appreciate that breast cancer refers to any malignancy of the breast tissue, including, for example, carcinomas and sarcomas. Particular embodiments of breast cancer include ductal carcinoma in situ (DCIS), lobular carcinoma in situ (LCIS), or mucinous carcinoma.
Breast cancer also refers to infiltrating ductal (IDC) or infiltrating lobular carcinoma (ILC).
In most embodiments of the disclosure, the subject of interest is a human patient suspected of or actually diagnosed with breast cancer.
[169] The article "a" and "an" are used herein to refer to one or more than one (i.e., to at least one) of the grammatical object of the article. By way of example, "an element" means one or more element.
[170] Throughout the specification the word "comprising," or variations such as "comprises" or "comprising," will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.
EXAMPLES
Example 1. NANO46 Subtyping Test [171] Figure 5 outlines the assay processes associated with the Breast Cancer Intrinsic Subtyping test. Following RNA isolation, the test will simultaneously measure the expression levels of 46 target genes plus eight housekeeping genes in a single hybridization reaction using an nCounter CodeSet designed specifically for those genes. For example, the housekeeping genes described in U.S. Patent Publication 2008/0032293, which is herein incorporated by reference in its entirety, can be used for normalization.
Exemplary housekeeping genes include MRPL19, PSMC4, SF3A1, PUM1, ACTB, GAPD, GUSB, RPLP0, and TFRC. The housekeeping genes are used to normalize the expression of the tumor sample. Each assay run also includes a reference sample consisting of in vitro transcribed RNAs of the 58 targets for normalization purposes.
[172] FFPE Tissue Review/Procurement and RNA Extraction: The Breast Cancer Intrinsic Subtyping Test will use RNA extracted from Formalin-fixed, Paraffin-embedded (FFPE) tissue that has been diagnosed as invasive carcinoma of the breast. A
Pathologist reviews an H & E stained slide to identify the tissue area containing sufficient tumor tissue content for the test. Unstained slide mounted tissue sections are processed by macro-dissecting the identified tumor area on each slide to remove any adjacent normal tissue. RNA
is then isolated from the tumor tissue, and DNA is removed from the sample.
[173] Assay Setup and Initiation of Hybridization: For each batch of up to 10 RNA samples isolated from a breast tumor, the user will set up a run using the nCounter Analysis x5 system software, which tracks sample processing, reagent lots, and results for each sample. To initiate the assay, the user will pipette the specified amount of RNA into separate tubes within a 12 reaction strip tube and add the CodeSet and hybridization buffer. A
reference sample is pipetted into the remaining two tubes with CodeSet and hybridization buffer.
The CodeSet consists of probes for each gene that is targeted, additional probes for endogenous "housekeeping" normalization genes and positive and negative controls that are spiked into the assay. The reference sample consists of in vitro transcribed RNA for the targeted genes and housekeeping genes. Once the hybridization reagents are added to the respective tubes, the user transfers the strip tube into a heated-lid heatblock for a specified period of time at a set temperature.
[174] Purification and Binding on the Prep Station: Upon completing hybridization, the user will transfer the strip tube containing the set of 10 assays and 2 reference samples onto the nCounter Prep Station along with the required prepackaged reagents and disposables. An automated purification process then removes excess capture and reporter probe through two successive hybridization-driven magnetic bead capture steps. The nCounter Prep Station then transfers the purified target/probe complexes into an nCounter cartridge for capture to a glass slide. Following completion of the run, the user removes the cartridge from the Prep Station and seals it with an adhesive film.
[175] Imaging and Analysis on the Digital Analyzer: The cartridge is then sealed and inserted into the nCounter Digital Analyzer which counts the number of probes captured on the slide for each gene, which corresponds to the amount of target in solution. Automated software will then check thresholds for the housekeeping genes, reference sample, and positive and negative controls to qualify each assay and ensure that the procedure was performed correctly. The signals of each sample are next normalized using the housekeeping genes to control for input sample quality. The signals are then normalized to the reference sample within each run to control for run-to-run variations. The resulting normalized data is entered in the Breast Cancer Intrinsic Subtyping algorithm to determine tumor intrinsic subtype and risk of recurrence score.
[176] Example 2: Clinical Validation of the NANO46 risk of recurrence (ROR) score for predicting residual risk of distant recurrence (DR) after endocrine therapy in postmenopausal women with HR+ early breast cancer (EBC): An ABCSG Study.
[177] The aim of the study is to assess the performance of the ROR score in predicting distant recurrence for postmenopausal patients with hormone receptor positive early breast cancer (HR+ EBC) treated with tamoxifen or tamoxifen followed by anastrozole when the NANO46 test is performed in a routine hospital pathology lab. Does the ROR
score add prognostic information (Distant RFS) beyond the Clinical Treatment Score in all patients (CTS includes: nodes, grade, tumor size, age, treatment)? Do the ROR-based risk groups add prognostic information (Distant RFS) beyond the Clinical Treatment Score in all patients?
[178] Study Overview: 3,714 patients were enrolled in ABCSG8. Patients were postmenopausal women with HR+ EBC (node negative and node positive), grade one or two, with no prior treatment. 1,671 patients re-consented for long-term follow-up or are deceased.
The median follow-up was 11 years. 1,620 FFPE blocks were collected. Of these, 25 had insufficient cancer in the block on pathology review, 73 had insufficient RNA, and 44 failed QC specifications for the NanoString device. 1,478 patients (91.2%) passed the NANO46 analysis.
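The sample accounting in [178] can be checked with a few lines of arithmetic. The figures are those stated in the paragraph; the variable names and the short reason labels are only for this illustration.

```python
blocks_collected = 1620
exclusions = {
    "insufficient cancer on pathology review": 25,
    "insufficient RNA": 73,
    "failed device QC": 44,
}

passed = blocks_collected - sum(exclusions.values())
pass_rate_percent = round(100 * passed / blocks_collected, 1)

print(passed, pass_rate_percent)  # 1478 91.2
```

The three exclusion counts account exactly for the difference between the 1,620 collected blocks and the 1,478 patients who passed, and the pass rate agrees with the reported 91.2%.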
[179] Methods: Three unstained 10 micron sections and 1 H&E slide for each patient were sent to an independent academic pathology laboratory at BCCA where tissue review, manual micro-dissection and RNA extraction were performed. NANO46 analysis was then conducted on 250 ng of the extracted RNA using the NanoString nCounter Analysis System;
both intrinsic subtype and ROR score were calculated.
[180] Results: The ROR Score adds statistically significant prognostic information (Distant RFS) beyond CTS in all patients (likelihood ratio test, ΔLR χ² = 53.5, p < 0.0001). The ROR-based risk groups add statistically significant prognostic information (Distant RFS) beyond CTS in all patients (likelihood ratio test, ΔLR χ² = 34.1, p < 0.0001).
Differentiation between Luminal A and Luminal B adds statistically significant prognostic information (Distant RFS) beyond CTS in all patients (Luminal B vs. A: HR = 2.38, 95% CI: 1.69-3.35, p < 0.0001). Results in the node-negative and node-positive subgroups are similar to the results for all patients that are reported in the study.
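As a plausibility check on the likelihood ratio results reported in [180], the sketch below converts the two statistics (53.5 and 34.1) into p-values using closed-form chi-square tail probabilities. The degrees of freedom are not stated in the text and are assumed here: 1 for the continuous ROR score and 2 for the three risk groups. The helper name is hypothetical.

```python
from math import erfc, exp, sqrt

def chi2_upper_tail(statistic, df):
    """Upper-tail probability of a chi-square distribution.
    Closed forms are used, so only df = 1 and df = 2 are supported."""
    if df == 1:
        return erfc(sqrt(statistic / 2))
    if df == 2:
        return exp(-statistic / 2)
    raise ValueError("only df = 1 or df = 2 are supported in this sketch")

# Assumed degrees of freedom; the source reports only the statistics.
p_ror_score = chi2_upper_tail(53.5, df=1)
p_risk_groups = chi2_upper_tail(34.1, df=2)

print(p_ror_score < 1e-4, p_risk_groups < 1e-4)
```

Under these assumptions both p-values fall far below 0.0001, which agrees with the reported significance levels.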
[181] Conclusions: The results show that both the ROR score and the ROR-based risk groups add statistically significant prognostic information beyond the Clinical Treatment Score. The results demonstrate that a complex, multi-gene-expression test can be performed in a hospital pathology laboratory and meet the same quality metrics as a central reference laboratory. The results of the TransATAC and ABCSG8 studies together provide Level 1 evidence for the clinical validity of the NANO46 test for predicting the risk of distant recurrence in postmenopausal women with HR+ EBC treated with endocrine therapy alone.
The results also show that Luminal A subtypes have better outcomes than Luminal B
subtypes in postmenopausal women with HR+ EBC treated with endocrine therapy alone.

Claims (15)

1. A method of predicting outcome in a subject having breast cancer comprising:
providing a tumor sample from the subject;
determining the expression of at least the genes in the NANO46 intrinsic gene list of Table 1 in the tumor sample;
determining the intrinsic subtype of the tumor sample, wherein the intrinsic subtype is selected from the group consisting of at least Basal-like, Luminal A, Luminal B or HER2-enriched;
determining a proliferation score based on the expression of a subset of proliferation genes in the NANO46 intrinsic gene list;
calculating a risk of recurrence score using a weighted sum of said intrinsic subtype, proliferation score and optionally one or more clinicopathological variables such as tumor size, nodal status or histological grade; and
determining whether the subject has a low or high risk of recurrence based on the risk of recurrence score.
2. The method of claim 1, wherein determining a proliferation signature based on the expression of a subset of proliferation genes in the NANO46 intrinsic gene list comprises determining the expression of each of the NANO46 intrinsic genes selected from ANLN, CCNE1, CDC20, CDC6, CDCA1, CENPF, CEP55, EXO1, KIF2C, KNTC2, MELK, MKI67, ORC6L, PTTG1, RRM2, TYMS, UBE2C and UBE2T.
3. The method of claim 1, further comprising determining at least one of the following:
tumor grade, tumor ploidy, nodal status, estrogen receptor expression, progesterone receptor expression, and HER2/ERBB2 expression.
4. The method of claim 1, further comprising determining each of the following: tumor grade, tumor ploidy, nodal status, estrogen receptor expression, progesterone receptor expression, and HER2/ERBB2 expression.
5. The method of claim 1, wherein the risk of recurrence score is calculated using the following equation:
ROR-PT = -0.0067*Basal + 0.4317*Her2 + -0.3172*LumA + 0.4894*LumB + 0.1981*ProliferationScore + 0.1133*TumorSize.
6. The method of claim 1, wherein the outcome is breast cancer specific survival, event-free survival or response to therapy.
7. The method of claim 1, wherein the expression of the members of the intrinsic gene list is determined using the nanoreporter code system (nCounter® Analysis system).
8. A kit comprising a plurality of probes for determining the expression of at least the genes in the NANO46 intrinsic gene list of Table 1 in a tumor sample for use in a method of predicting outcome in a subject having breast cancer.
9. The kit of claim 8, wherein the kit comprises a plurality of probes of Table 1A.
10. The kit of claim 9, wherein the kit comprises each of the probes of Table 1A.
11. The kit of claim 8, comprising probes for determining the expression of each of the NANO46 intrinsic genes selected from ANLN, CCNE1, CDC20, CDC6, CDCA1, CENPF, CEP55, EXO1, KIF2C, KNTC2, MELK, MKI67, ORC6L, PTTG1, RRM2, TYMS, UBE2C
and UBE2T.
12. The kit of claim 8, wherein each probe in the plurality of probes comprises a target specific sequence capable of hybridizing to no more than one NANO46 intrinsic gene listed in Table 1, and optionally comprises at least two label attachment regions, said label attachment regions comprising one or more label monomers that emit light.
13. The kit of claim 9, wherein the plurality of probes comprises a probe pair to detect the NANO46 intrinsic genes listed in Table 1, wherein each probe in the probe pair comprises a target specific sequence capable of hybridizing to no more than one NANO46 intrinsic gene listed in Table 1 and wherein the target specific sequences bind to different regions of the same NANO46 intrinsic gene.
14. The kit of claim 13, wherein one probe of the probe pair further comprises at least two label attachment regions, said label attachment regions comprising one or more label monomers that emit light.
15. The kit of claim 8, further comprising one or more reagents for determining one or more clinicopathological variables of the tumor sample such as tumor size, tumor grade, tumor ploidy, nodal status, estrogen receptor expression, progesterone receptor expression, and HER2/ERBB2 expression.
CA2874492A 2012-05-22 2013-05-22 Nano46 genes and methods to predict breast cancer outcome Active CA2874492C (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US201261650209P 2012-05-22 2012-05-22
US61/650,209 2012-05-22
US201361753673P 2013-01-17 2013-01-17
US61/753,673 2013-01-17
PCT/US2013/042157 WO2013177245A2 (en) 2012-05-22 2013-05-22 Nano46 genes and methods to predict breast cancer outcome

Publications (2)

Publication Number Publication Date
CA2874492A1 true CA2874492A1 (en) 2013-11-28
CA2874492C CA2874492C (en) 2021-10-19

Family

ID=49624503

Family Applications (1)

Application Number Title Priority Date Filing Date
CA2874492A Active CA2874492C (en) 2012-05-22 2013-05-22 Nano46 genes and methods to predict breast cancer outcome

Country Status (12)

Country Link
US (3) US20130337444A1 (en)
EP (1) EP2852689B1 (en)
JP (1) JP6325530B2 (en)
CN (2) CN104704128A (en)
AU (1) AU2013266419B2 (en)
BR (1) BR112014029300A2 (en)
CA (1) CA2874492C (en)
ES (1) ES2763931T3 (en)
IL (1) IL235795B (en)
IN (1) IN2014MN02418A (en)
MX (1) MX369628B (en)
WO (1) WO2013177245A2 (en)

Family Cites Families (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5800992A (en) 1989-06-07 1998-09-01 Fodor; Stephen P.A. Method of detecting nucleic acids
US6040138A (en) 1995-09-15 2000-03-21 Affymetrix, Inc. Expression monitoring by hybridization to high density oligonucleotide arrays
DE69217497T2 (en) 1991-09-18 1997-06-12 Affymax Tech Nv METHOD FOR SYNTHESISING THE DIFFERENT COLLECTIONS OF OLIGOMERS
CA2124087C (en) 1991-11-22 2002-10-01 James L. Winkler Combinatorial strategies for polymer synthesis
US5384261A (en) 1991-11-22 1995-01-24 Affymax Technologies N.V. Very large scale immobilized polymer synthesis using mechanically directed flow paths
US5856174A (en) 1995-06-29 1999-01-05 Affymetrix, Inc. Integrated nucleic acid diagnostic device
EP0880598A4 (en) 1996-01-23 2005-02-23 Affymetrix Inc Nucleic acid analysis techniques
AU1287799A (en) 1997-10-31 1999-05-24 Affymetrix, Inc. Expression profiles in adult and fetal organs
US6020135A (en) 1998-03-27 2000-02-01 Affymetrix, Inc. P53-regulated genes
US6750015B2 (en) * 2000-06-28 2004-06-15 Kathryn B. Horwitz Progesterone receptor-regulated gene expression and methods related thereto
US20020168639A1 (en) * 2000-09-22 2002-11-14 Muraca Patrick J. Profile array substrates
JP4680898B2 (en) * 2003-06-24 2011-05-11 ジェノミック ヘルス, インコーポレイテッド Predicting the likelihood of cancer recurrence
US20050071087A1 (en) * 2003-09-29 2005-03-31 Anderson Glenda G. Systems and methods for detecting biological features
CA2574447A1 (en) 2004-07-15 2006-01-26 University Of Utah Research Foundation Housekeeping genes and methods for identifying the same
CN101061480A (en) * 2004-09-22 2007-10-24 三路影像公司 Methods and computer program products for analysis and optimization of marker candidates for cancer prognosis
WO2007061876A2 (en) 2005-11-23 2007-05-31 University Of Utah Research Foundation Methods and compositions involving intrinsic genes
ATE525482T1 (en) 2005-12-23 2011-10-15 Nanostring Technologies Inc NANOREPORTERS AND METHOD FOR THE PRODUCTION AND USE THEREOF
ES2402939T3 (en) 2005-12-23 2013-05-10 Nanostring Technologies, Inc. Compositions comprising immobilized and oriented macromolecules and methods for their preparation
AU2008237018B2 (en) 2007-04-10 2014-04-03 Bruker Spatial Biology, Inc. Methods and computer systems for identifying target-specific sequences for use in nanoreporters
WO2008150512A2 (en) * 2007-06-04 2008-12-11 University Of Louisville Research Foundation, Inc. Methods for identifying an increased likelihood of recurrence of breast cancer
CA2698569A1 (en) * 2007-09-06 2009-09-03 Mark G. Erlander Tumor grading and cancer prognosis
ES2457534T3 (en) * 2008-05-30 2014-04-28 The University Of North Carolina At Chapel Hill Gene expression profiles to predict outcomes in breast cancer
EP2331704B1 (en) 2008-08-14 2016-11-30 Nanostring Technologies, Inc Stable nanoreporters
UA110790C2 (en) * 2010-03-31 2016-02-25 Сівідон Діагностікс Гмбх Method for breast cancer recurrence prediction under endocrine treatment

Also Published As

Publication number Publication date
CN104704128A (en) 2015-06-10
EP2852689A4 (en) 2016-05-11
AU2013266419B2 (en) 2018-09-27
MX369628B (en) 2019-11-14
EP2852689B1 (en) 2019-12-11
EP2852689A2 (en) 2015-04-01
CN111500718A (en) 2020-08-07
IL235795B (en) 2020-02-27
CA2874492C (en) 2021-10-19
AU2013266419A1 (en) 2014-12-11
US20230272476A1 (en) 2023-08-31
IN2014MN02418A (en) 2015-08-14
US20130337444A1 (en) 2013-12-19
US20200332368A1 (en) 2020-10-22
JP2015518724A (en) 2015-07-06
ES2763931T3 (en) 2020-06-01
JP6325530B2 (en) 2018-05-16
BR112014029300A2 (en) 2017-07-25
IL235795A0 (en) 2015-01-29
WO2013177245A2 (en) 2013-11-28
WO2013177245A3 (en) 2015-01-29
MX2014014275A (en) 2015-07-06

Legal Events

Date Code Title Description
EEER Examination request

Effective date: 20180410