CA2805862A1 - Fibronectin cradle molecules and libraries thereof
- Publication number
- CA2805862A1
- Authority
- CA
- Canada
- Prior art keywords
- cradle
- amino acid
- loop
- library
- seq
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Links
- 108010067306 Fibronectins Proteins 0.000 title description 37
- 102000016359 Fibronectins Human genes 0.000 title description 3
- 230000027455 binding Effects 0.000 claims abstract description 319
- 238000009739 binding Methods 0.000 claims abstract description 313
- 108090000765 processed proteins & peptides Proteins 0.000 claims abstract description 285
- 102000004196 processed proteins & peptides Human genes 0.000 claims abstract description 265
- 229920001184 polypeptide Polymers 0.000 claims abstract description 261
- 238000000034 method Methods 0.000 claims abstract description 142
- 238000012216 screening Methods 0.000 claims abstract description 24
- 230000002255 enzymatic effect Effects 0.000 claims abstract description 7
- 150000001413 amino acids Chemical class 0.000 claims description 563
- 102000002090 Fibronectin type III Human genes 0.000 claims description 474
- 108050009401 Fibronectin type III Proteins 0.000 claims description 474
- 125000003275 alpha amino acid group Chemical group 0.000 claims description 175
- 238000006467 substitution reaction Methods 0.000 claims description 152
- 210000004027 cell Anatomy 0.000 claims description 54
- 108010047041 Complementarity Determining Regions Proteins 0.000 claims description 38
- 238000012217 deletion Methods 0.000 claims description 37
- 230000037430 deletion Effects 0.000 claims description 37
- 238000003780 insertion Methods 0.000 claims description 36
- 230000037431 insertion Effects 0.000 claims description 36
- 102000040430 polynucleotide Human genes 0.000 claims description 36
- 108091033319 polynucleotide Proteins 0.000 claims description 36
- 239000002157 polynucleotide Substances 0.000 claims description 36
- 229910052799 carbon Inorganic materials 0.000 claims description 30
- 240000004808 Saccharomyces cerevisiae Species 0.000 claims description 24
- 238000002703 mutagenesis Methods 0.000 claims description 23
- 231100000350 mutagenesis Toxicity 0.000 claims description 23
- 239000000126 substance Substances 0.000 claims description 23
- 125000001165 hydrophobic group Chemical group 0.000 claims description 19
- 238000004458 analytical method Methods 0.000 claims description 15
- 101001027128 Homo sapiens Fibronectin Proteins 0.000 claims description 13
- 102000008100 Human Serum Albumin Human genes 0.000 claims description 11
- 108091006905 Human Serum Albumin Proteins 0.000 claims description 11
- 108010043121 Green Fluorescent Proteins Proteins 0.000 claims description 10
- 102000004144 Green Fluorescent Proteins Human genes 0.000 claims description 10
- 239000005090 green fluorescent protein Substances 0.000 claims description 10
- 108010014251 Muramidase Proteins 0.000 claims description 8
- 108010062010 N-Acetylmuramoyl-L-alanine Amidase Proteins 0.000 claims description 8
- 239000004325 lysozyme Substances 0.000 claims description 8
- 229960000274 lysozyme Drugs 0.000 claims description 8
- 235000010335 lysozyme Nutrition 0.000 claims description 8
- 102000014400 SH2 domains Human genes 0.000 claims description 7
- 108050003452 SH2 domains Proteins 0.000 claims description 7
- 108010003723 Single-Domain Antibodies Proteins 0.000 claims description 6
- 108091008874 T cell receptors Proteins 0.000 claims description 6
- 102000016266 T-Cell Antigen Receptors Human genes 0.000 claims description 6
- GOJUJUVQIVIZAV-UHFFFAOYSA-N 2-amino-4,6-dichloropyrimidine-5-carbaldehyde Chemical group NC1=NC(Cl)=C(C=O)C(Cl)=N1 GOJUJUVQIVIZAV-UHFFFAOYSA-N 0.000 claims description 5
- 241000894006 Bacteria Species 0.000 claims description 5
- 102000044159 Ubiquitin Human genes 0.000 claims description 5
- 108090000848 Ubiquitin Proteins 0.000 claims description 5
- 230000008859 change Effects 0.000 claims description 5
- 101000665140 Homo sapiens Scm-like with four MBT domains protein 2 Proteins 0.000 claims description 4
- 239000000178 monomer Substances 0.000 claims description 4
- 101000628899 Homo sapiens Small ubiquitin-related modifier 1 Proteins 0.000 claims description 3
- 102100036294 Polycomb protein SCMH1 Human genes 0.000 claims description 3
- 102100026940 Small ubiquitin-related modifier 1 Human genes 0.000 claims description 3
- 102100038691 Scm-like with four MBT domains protein 2 Human genes 0.000 claims description 2
- 102000054929 human SFMBT2 Human genes 0.000 claims description 2
- 101710102548 Polycomb protein SCMH1 Proteins 0.000 claims 2
- 102000016943 Muramidase Human genes 0.000 claims 1
- 239000011230 binding agent Substances 0.000 abstract description 11
- 235000001014 amino acid Nutrition 0.000 description 633
- 229940024606 amino acid Drugs 0.000 description 542
- 108090000623 proteins and genes Proteins 0.000 description 123
- 102000004169 proteins and genes Human genes 0.000 description 92
- 235000018102 proteins Nutrition 0.000 description 76
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 46
- 102000002669 Small Ubiquitin-Related Modifier Proteins Human genes 0.000 description 46
- 108010043401 Small Ubiquitin-Related Modifier Proteins Proteins 0.000 description 46
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 38
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-Amino-3-hydroxypropanoic acid Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 37
- 238000002965 ELISA Methods 0.000 description 34
- 102100037362 Fibronectin Human genes 0.000 description 34
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 34
- 230000003993 interaction Effects 0.000 description 33
- 125000000539 amino acid group Chemical group 0.000 description 32
- 239000000203 mixture Substances 0.000 description 30
- 230000004048 modification Effects 0.000 description 30
- 238000012986 modification Methods 0.000 description 30
- 102220580964 Induced myeloid leukemia cell differentiation protein Mcl-1_P44Y_mutation Human genes 0.000 description 29
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 29
- 238000009826 distribution Methods 0.000 description 29
- 239000003446 ligand Substances 0.000 description 29
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 25
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 23
- 238000013461 design Methods 0.000 description 23
- 229910052739 hydrogen Inorganic materials 0.000 description 23
- 238000002823 phage display Methods 0.000 description 23
- 238000002198 surface plasmon resonance spectroscopy Methods 0.000 description 23
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 22
- 125000003295 alanine group Chemical group N[C@@H](C)C(=O)* 0.000 description 22
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 21
- 102000001708 Protein Isoforms Human genes 0.000 description 21
- 108010029485 Protein Isoforms Proteins 0.000 description 21
- 239000013078 crystal Substances 0.000 description 21
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 20
- 229910052757 nitrogen Inorganic materials 0.000 description 20
- 108020004705 Codon Proteins 0.000 description 19
- 102220580976 Induced myeloid leukemia cell differentiation protein Mcl-1_G41Y_mutation Human genes 0.000 description 19
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 19
- 230000035772 mutation Effects 0.000 description 19
- 125000000174 L-prolyl group Chemical group [H]N1C([H])([H])C([H])([H])C([H])([H])[C@@]1([H])C(*)=O 0.000 description 18
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Chemical compound CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 18
- 238000006243 chemical reaction Methods 0.000 description 18
- 230000000694 effects Effects 0.000 description 18
- 102220577161 Density-regulated protein_D67Y_mutation Human genes 0.000 description 17
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 17
- 238000013459 approach Methods 0.000 description 17
- 239000011324 bead Substances 0.000 description 17
- 238000003752 polymerase chain reaction Methods 0.000 description 17
- 239000013598 vector Substances 0.000 description 17
- 241000588724 Escherichia coli Species 0.000 description 15
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 14
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 14
- 230000006870 function Effects 0.000 description 14
- 102000051619 SUMO-1 Human genes 0.000 description 13
- 239000000872 buffer Substances 0.000 description 13
- 238000002898 library design Methods 0.000 description 13
- 150000007523 nucleic acids Chemical group 0.000 description 13
- 108010032595 Antibody Binding Sites Proteins 0.000 description 12
- 102000014914 Carrier Proteins Human genes 0.000 description 12
- 108010070675 Glutathione transferase Proteins 0.000 description 12
- 102000005720 Glutathione transferase Human genes 0.000 description 12
- ODKSFYDXXFIFQN-BYPYZUCNSA-N L-arginine Chemical compound OC(=O)[C@@H](N)CCCN=C(N)N ODKSFYDXXFIFQN-BYPYZUCNSA-N 0.000 description 12
- 239000000427 antigen Substances 0.000 description 12
- 102000036639 antigens Human genes 0.000 description 12
- 108091007433 antigens Proteins 0.000 description 12
- 108091008324 binding proteins Proteins 0.000 description 12
- 238000002474 experimental method Methods 0.000 description 12
- 230000002209 hydrophobic effect Effects 0.000 description 12
- 238000000338 in vitro Methods 0.000 description 12
- 239000000243 solution Substances 0.000 description 12
- 230000010741 sumoylation Effects 0.000 description 12
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 11
- 238000007792 addition Methods 0.000 description 11
- 239000003112 inhibitor Substances 0.000 description 11
- 108020004999 messenger RNA Proteins 0.000 description 11
- 102000039446 nucleic acids Human genes 0.000 description 11
- 108020004707 nucleic acids Proteins 0.000 description 11
- 238000002818 protein evolution Methods 0.000 description 11
- 238000003556 assay Methods 0.000 description 10
- 210000005253 yeast cell Anatomy 0.000 description 10
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 9
- 102100037204 Sal-like protein 1 Human genes 0.000 description 9
- 108010090804 Streptavidin Proteins 0.000 description 9
- 230000009824 affinity maturation Effects 0.000 description 9
- 230000015572 biosynthetic process Effects 0.000 description 9
- 230000021615 conjugation Effects 0.000 description 9
- 230000004927 fusion Effects 0.000 description 9
- 238000002702 ribosome display Methods 0.000 description 9
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 9
- 239000004474 valine Substances 0.000 description 9
- 108010038447 Chromogranin A Proteins 0.000 description 8
- 102100031186 Chromogranin-A Human genes 0.000 description 8
- 238000005481 NMR spectroscopy Methods 0.000 description 8
- 108091006629 SLC13A2 Proteins 0.000 description 8
- 108700038981 SUMO-1 Proteins 0.000 description 8
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 8
- 235000004279 alanine Nutrition 0.000 description 8
- 239000011616 biotin Substances 0.000 description 8
- 229960002685 biotin Drugs 0.000 description 8
- 239000003153 chemical reaction reagent Substances 0.000 description 8
- 239000013604 expression vector Substances 0.000 description 8
- 229960000310 isoleucine Drugs 0.000 description 8
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 8
- 230000005291 magnetic effect Effects 0.000 description 8
- 238000013507 mapping Methods 0.000 description 8
- CPTIBDHUFVHUJK-NZYDNVMFSA-N mitopodozide Chemical compound C1([C@@H]2C3=CC=4OCOC=4C=C3[C@H](O)[C@@H](CO)[C@@H]2C(=O)NNCC)=CC(OC)=C(OC)C(OC)=C1 CPTIBDHUFVHUJK-NZYDNVMFSA-N 0.000 description 8
- 230000014616 translation Effects 0.000 description 8
- 239000004475 Arginine Substances 0.000 description 7
- 108020004414 DNA Proteins 0.000 description 7
- 102100038912 E3 SUMO-protein ligase RanBP2 Human genes 0.000 description 7
- 101710198453 E3 SUMO-protein ligase RanBP2 Proteins 0.000 description 7
- 102000004190 Enzymes Human genes 0.000 description 7
- 108090000790 Enzymes Proteins 0.000 description 7
- 239000004471 Glycine Substances 0.000 description 7
- 101000693367 Homo sapiens SUMO-activating enzyme subunit 1 Proteins 0.000 description 7
- 101000684495 Homo sapiens Sentrin-specific protease 1 Proteins 0.000 description 7
- XUJNEKJLAYXESH-REOHCLBHSA-N L-Cysteine Chemical compound SC[C@H](N)C(O)=O XUJNEKJLAYXESH-REOHCLBHSA-N 0.000 description 7
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 7
- 102100033468 Lysozyme C Human genes 0.000 description 7
- 102100025809 SUMO-activating enzyme subunit 1 Human genes 0.000 description 7
- 102100023653 Sentrin-specific protease 1 Human genes 0.000 description 7
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 7
- 230000001580 bacterial effect Effects 0.000 description 7
- 235000020958 biotin Nutrition 0.000 description 7
- 210000004899 c-terminal region Anatomy 0.000 description 7
- 150000001875 compounds Chemical class 0.000 description 7
- 229940088598 enzyme Drugs 0.000 description 7
- 238000001943 fluorescence-activated cell sorting Methods 0.000 description 7
- 125000003630 glycyl group Chemical group [H]N([H])C([H])([H])C(*)=O 0.000 description 7
- 238000004519 manufacturing process Methods 0.000 description 7
- 238000005259 measurement Methods 0.000 description 7
- 239000002773 nucleotide Substances 0.000 description 7
- 125000003729 nucleotide group Chemical group 0.000 description 7
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 7
- 229910052717 sulfur Inorganic materials 0.000 description 7
- 238000003786 synthesis reaction Methods 0.000 description 7
- 238000011144 upstream manufacturing Methods 0.000 description 7
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 6
- 102220580972 Induced myeloid leukemia cell differentiation protein Mcl-1_P44V_mutation Human genes 0.000 description 6
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 6
- 125000000205 L-threonino group Chemical group [H]OC(=O)[C@@]([H])(N([H])[*])[C@](C([H])([H])[H])([H])O[H] 0.000 description 6
- 102220498087 Lipoma-preferred partner_Y76S_mutation Human genes 0.000 description 6
- 239000004698 Polyethylene Substances 0.000 description 6
- 239000004473 Threonine Substances 0.000 description 6
- 239000007983 Tris buffer Substances 0.000 description 6
- 230000004075 alteration Effects 0.000 description 6
- 229940009098 aspartate Drugs 0.000 description 6
- 230000008901 benefit Effects 0.000 description 6
- 238000010367 cloning Methods 0.000 description 6
- 230000029180 desumoylation Effects 0.000 description 6
- 238000010586 diagram Methods 0.000 description 6
- 239000000284 extract Substances 0.000 description 6
- 108020001507 fusion proteins Proteins 0.000 description 6
- 102000037865 fusion proteins Human genes 0.000 description 6
- 230000007246 mechanism Effects 0.000 description 6
- 230000009149 molecular binding Effects 0.000 description 6
- 239000013612 plasmid Substances 0.000 description 6
- 108010000222 polyserine Proteins 0.000 description 6
- 230000008569 process Effects 0.000 description 6
- 239000000047 product Substances 0.000 description 6
- 238000000746 purification Methods 0.000 description 6
- 238000001228 spectrum Methods 0.000 description 6
- 239000006228 supernatant Substances 0.000 description 6
- 238000013519 translation Methods 0.000 description 6
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 6
- 238000005406 washing Methods 0.000 description 6
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 5
- 101000879203 Caenorhabditis elegans Small ubiquitin-related modifier Proteins 0.000 description 5
- 102220556561 Delta and Notch-like epidermal growth factor-related receptor_Q44W_mutation Human genes 0.000 description 5
- ONIBWKKTOPOVIA-BYPYZUCNSA-N L-Proline Chemical compound OC(=O)[C@@H]1CCCN1 ONIBWKKTOPOVIA-BYPYZUCNSA-N 0.000 description 5
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophan Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 5
- 239000004472 Lysine Substances 0.000 description 5
- 108091034117 Oligonucleotide Proteins 0.000 description 5
- 229920001213 Polysorbate 20 Polymers 0.000 description 5
- ONIBWKKTOPOVIA-UHFFFAOYSA-N Proline Natural products OC(=O)C1CCCN1 ONIBWKKTOPOVIA-UHFFFAOYSA-N 0.000 description 5
- 108700012920 TNF Proteins 0.000 description 5
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 5
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 5
- 102100040247 Tumor necrosis factor Human genes 0.000 description 5
- 230000002378 acidificating effect Effects 0.000 description 5
- 230000003321 amplification Effects 0.000 description 5
- 229960001230 asparagine Drugs 0.000 description 5
- 235000009582 asparagine Nutrition 0.000 description 5
- 230000000295 complement effect Effects 0.000 description 5
- 238000010276 construction Methods 0.000 description 5
- 229910052805 deuterium Inorganic materials 0.000 description 5
- 210000003527 eukaryotic cell Anatomy 0.000 description 5
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 5
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 5
- 238000000126 in silico method Methods 0.000 description 5
- 230000005764 inhibitory process Effects 0.000 description 5
- BPHPUYQFMNQIOC-NXRLNHOXSA-N isopropyl beta-D-thiogalactopyranoside Chemical compound CC(C)S[C@@H]1O[C@H](CO)[C@H](O)[C@H](O)[C@H]1O BPHPUYQFMNQIOC-NXRLNHOXSA-N 0.000 description 5
- 238000003199 nucleic acid amplification method Methods 0.000 description 5
- 229910052698 phosphorus Inorganic materials 0.000 description 5
- 239000000256 polyoxyethylene sorbitan monolaurate Substances 0.000 description 5
- 235000010486 polyoxyethylene sorbitan monolaurate Nutrition 0.000 description 5
- 230000004044 response Effects 0.000 description 5
- 238000005070 sampling Methods 0.000 description 5
- 239000002904 solvent Substances 0.000 description 5
- 230000009870 specific binding Effects 0.000 description 5
- 239000000758 substrate Substances 0.000 description 5
- 125000001493 tyrosinyl group Chemical group [H]OC1=C([H])C([H])=C(C([H])=C1[H])C([H])([H])C([H])(N([H])[H])C(*)=O 0.000 description 5
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 5
- 229910052727 yttrium Inorganic materials 0.000 description 5
- 101100539164 Caenorhabditis elegans ubc-9 gene Proteins 0.000 description 4
- 102100025698 Cytosolic carboxypeptidase 4 Human genes 0.000 description 4
- 101000932590 Homo sapiens Cytosolic carboxypeptidase 4 Proteins 0.000 description 4
- 108060003951 Immunoglobulin Proteins 0.000 description 4
- 102220498084 Lipoma-preferred partner_Y76L_mutation Human genes 0.000 description 4
- 101001033003 Mus musculus Granzyme F Proteins 0.000 description 4
- 102220643165 Polycystic kidney disease 2-like 1 protein_T39E_mutation Human genes 0.000 description 4
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 4
- 125000004429 atom Chemical group 0.000 description 4
- 230000003197 catalytic effect Effects 0.000 description 4
- 239000003795 chemical substances by application Substances 0.000 description 4
- 238000003776 cleavage reaction Methods 0.000 description 4
- 230000009260 cross reactivity Effects 0.000 description 4
- 235000018417 cysteine Nutrition 0.000 description 4
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 description 4
- 238000013480 data collection Methods 0.000 description 4
- 239000012634 fragment Substances 0.000 description 4
- 229930195712 glutamate Natural products 0.000 description 4
- 102000018358 immunoglobulin Human genes 0.000 description 4
- 238000001727 in vivo Methods 0.000 description 4
- 238000011534 incubation Methods 0.000 description 4
- 230000002401 inhibitory effect Effects 0.000 description 4
- UEGPKNKPLBYCNK-UHFFFAOYSA-L magnesium acetate Chemical compound [Mg+2].CC([O-])=O.CC([O-])=O UEGPKNKPLBYCNK-UHFFFAOYSA-L 0.000 description 4
- 239000011654 magnesium acetate Substances 0.000 description 4
- 235000011285 magnesium acetate Nutrition 0.000 description 4
- 229940069446 magnesium acetate Drugs 0.000 description 4
- 210000004962 mammalian cell Anatomy 0.000 description 4
- 230000001404 mediated effect Effects 0.000 description 4
- 239000013642 negative control Substances 0.000 description 4
- 238000012856 packing Methods 0.000 description 4
- 229910052700 potassium Inorganic materials 0.000 description 4
- 238000011084 recovery Methods 0.000 description 4
- 230000007017 scission Effects 0.000 description 4
- 238000002864 sequence alignment Methods 0.000 description 4
- 239000000725 suspension Substances 0.000 description 4
- 230000008685 targeting Effects 0.000 description 4
- 238000004448 titration Methods 0.000 description 4
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 3
- 108091026890 Coding region Proteins 0.000 description 3
- IAZDPXIOMUYVGZ-UHFFFAOYSA-N Dimethylsulphoxide Chemical compound CS(C)=O IAZDPXIOMUYVGZ-UHFFFAOYSA-N 0.000 description 3
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 3
- 125000000010 L-asparaginyl group Chemical group O=C([*])[C@](N([H])[H])([H])C([H])([H])C(=O)N([H])[H] 0.000 description 3
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 3
- 125000000510 L-tryptophano group Chemical group [H]C1=C([H])C([H])=C2N([H])C([H])=C(C([H])([H])[C@@]([H])(C(O[H])=O)N([H])[*])C2=C1[H] 0.000 description 3
- 102220639700 Leptin_R78L_mutation Human genes 0.000 description 3
- 241001465754 Metazoa Species 0.000 description 3
- 102220567169 Ornithine decarboxylase antizyme 1_Y72R_mutation Human genes 0.000 description 3
- 108010076504 Protein Sorting Signals Proteins 0.000 description 3
- 238000012300 Sequence Analysis Methods 0.000 description 3
- 108060008682 Tumor Necrosis Factor Proteins 0.000 description 3
- 102220550275 Usher syndrome type-1C protein-binding protein 1_N77Q_mutation Human genes 0.000 description 3
- 238000002835 absorbance Methods 0.000 description 3
- 150000001408 amides Chemical class 0.000 description 3
- 230000000903 blocking effect Effects 0.000 description 3
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 3
- 239000002299 complementary DNA Substances 0.000 description 3
- 230000001419 dependent effect Effects 0.000 description 3
- 238000011161 development Methods 0.000 description 3
- 230000018109 developmental process Effects 0.000 description 3
- 230000029087 digestion Effects 0.000 description 3
- 239000003814 drug Substances 0.000 description 3
- 238000005516 engineering process Methods 0.000 description 3
- MHMNJMPURVTYEJ-UHFFFAOYSA-N fluorescein-5-isothiocyanate Chemical compound O1C(=O)C2=CC(N=C=S)=CC=C2C21C1=CC=C(O)C=C1OC1=CC(O)=CC=C21 MHMNJMPURVTYEJ-UHFFFAOYSA-N 0.000 description 3
- 230000005714 functional activity Effects 0.000 description 3
- 230000036541 health Effects 0.000 description 3
- 239000001257 hydrogen Substances 0.000 description 3
- RAXXELZNTBOGNW-UHFFFAOYSA-N imidazole Natural products C1=CNC=N1 RAXXELZNTBOGNW-UHFFFAOYSA-N 0.000 description 3
- 239000000543 intermediate Substances 0.000 description 3
- 239000006166 lysate Substances 0.000 description 3
- 229930182817 methionine Natural products 0.000 description 3
- 230000007935 neutral effect Effects 0.000 description 3
- 239000002245 particle Substances 0.000 description 3
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 description 3
- 210000002729 polyribosome Anatomy 0.000 description 3
- 210000001236 prokaryotic cell Anatomy 0.000 description 3
- 239000011541 reaction mixture Substances 0.000 description 3
- 238000010839 reverse transcription Methods 0.000 description 3
- 230000028327 secretion Effects 0.000 description 3
- 238000012360 testing method Methods 0.000 description 3
- 238000013518 transcription Methods 0.000 description 3
- 230000035897 transcription Effects 0.000 description 3
- PIEPQKCYPFFYMG-UHFFFAOYSA-N tris acetate Chemical compound CC(O)=O.OCC(N)(CO)CO PIEPQKCYPFFYMG-UHFFFAOYSA-N 0.000 description 3
- 102220504260 17-beta-hydroxysteroid dehydrogenase type 6_T39V_mutation Human genes 0.000 description 2
- KIUMMUBSPKGMOY-UHFFFAOYSA-N 3,3'-Dithiobis(6-nitrobenzoic acid) Chemical compound C1=C([N+]([O-])=O)C(C(=O)O)=CC(SSC=2C=C(C(=CC=2)[N+]([O-])=O)C(O)=O)=C1 KIUMMUBSPKGMOY-UHFFFAOYSA-N 0.000 description 2
- AWGBKZRMLNVLAF-UHFFFAOYSA-N 3,5-dibromo-n,2-dihydroxybenzamide Chemical compound ONC(=O)C1=CC(Br)=CC(Br)=C1O AWGBKZRMLNVLAF-UHFFFAOYSA-N 0.000 description 2
- VXPSQDAMFATNNG-UHFFFAOYSA-N 3-[2-(2,5-dioxopyrrol-3-yl)phenyl]pyrrole-2,5-dione Chemical compound O=C1NC(=O)C(C=2C(=CC=CC=2)C=2C(NC(=O)C=2)=O)=C1 VXPSQDAMFATNNG-UHFFFAOYSA-N 0.000 description 2
- 102220492731 6-phosphogluconate dehydrogenase, decarboxylating_T39Q_mutation Human genes 0.000 description 2
- 102220583875 AMP deaminase 1_S80G_mutation Human genes 0.000 description 2
- IJGRMHOSHXDMSA-UHFFFAOYSA-N Atomic nitrogen Chemical compound N#N IJGRMHOSHXDMSA-UHFFFAOYSA-N 0.000 description 2
- 108090001008 Avidin Proteins 0.000 description 2
- 102220606581 C-reactive protein_S84K_mutation Human genes 0.000 description 2
- 101000665495 Caenorhabditis elegans Ran GTPase-activating protein 2 Proteins 0.000 description 2
- 108091035707 Consensus sequence Proteins 0.000 description 2
- 102220563778 DALR anticodon-binding domain-containing protein 3_S81G_mutation Human genes 0.000 description 2
- 102220525519 DNA polymerase delta catalytic subunit_Y73P_mutation Human genes 0.000 description 2
- 230000033616 DNA repair Effects 0.000 description 2
- 230000006820 DNA synthesis Effects 0.000 description 2
- BWGNESOTFCXPMA-UHFFFAOYSA-N Dihydrogen disulfide Chemical compound SS BWGNESOTFCXPMA-UHFFFAOYSA-N 0.000 description 2
- 101001099946 Drosophila melanogaster Ran GTPase-activating protein Proteins 0.000 description 2
- 102220568831 Dual specificity mitogen-activated protein kinase kinase 1_Y70V_mutation Human genes 0.000 description 2
- 102220473546 ELAV-like protein 1_S81K_mutation Human genes 0.000 description 2
- 238000012286 ELISA Assay Methods 0.000 description 2
- 102220523133 Eukaryotic translation initiation factor 4E-binding protein 2_T37E_mutation Human genes 0.000 description 2
- 239000004606 Fillers/Extenders Substances 0.000 description 2
- 238000012413 Fluorescence activated cell sorting analysis Methods 0.000 description 2
- 102220480513 GSK3B-interacting protein_N42I_mutation Human genes 0.000 description 2
- 102220581630 Haptoglobin-related protein_N42H_mutation Human genes 0.000 description 2
- 102220577800 Hepatocyte nuclear factor 3-alpha_D80E_mutation Human genes 0.000 description 2
- 241000238631 Hexapoda Species 0.000 description 2
- 102220624186 Histamine N-methyltransferase_S81A_mutation Human genes 0.000 description 2
- 241000282412 Homo Species 0.000 description 2
- 102220539938 Ileal sodium/bile acid cotransporter_T39K_mutation Human genes 0.000 description 2
- 102220599926 Inositol 1,4,5-trisphosphate receptor-interacting protein-like 1_N42G_mutation Human genes 0.000 description 2
- 102220624297 Interferon alpha-8_S81D_mutation Human genes 0.000 description 2
- 102220564708 Killer cell immunoglobulin-like receptor 2DL2_E70Q_mutation Human genes 0.000 description 2
- SRBFZHDQGSBBOR-HWQSCIPKSA-N L-arabinopyranose Chemical compound O[C@H]1COC(O)[C@H](O)[C@H]1O SRBFZHDQGSBBOR-HWQSCIPKSA-N 0.000 description 2
- 125000000241 L-isoleucino group Chemical group [H]OC(=O)[C@@]([H])(N([H])[*])[C@@](C([H])([H])[H])(C(C([H])([H])[H])([H])[H])[H] 0.000 description 2
- 125000003290 L-leucino group Chemical group [H]OC(=O)[C@@]([H])(N([H])[*])C([H])([H])C(C([H])([H])[H])([H])C([H])([H])[H] 0.000 description 2
- SBANPBVRHYIMRR-UHFFFAOYSA-N Leu-Ser-Pro Natural products CC(C)CC(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O SBANPBVRHYIMRR-UHFFFAOYSA-N 0.000 description 2
- 241001490312 Lithops pseudotruncatella Species 0.000 description 2
- 101710175625 Maltose/maltodextrin-binding periplasmic protein Proteins 0.000 description 2
- 238000007476 Maximum Likelihood Methods 0.000 description 2
- 102220494379 Methylmalonyl-CoA mutase, mitochondrial_N42T_mutation Human genes 0.000 description 2
- 102220556035 Methylthioribulose-1-phosphate dehydratase_S84D_mutation Human genes 0.000 description 2
- 102220610330 Musculin_N42A_mutation Human genes 0.000 description 2
- 102220608182 Myosin-binding protein H_S84N_mutation Human genes 0.000 description 2
- 102220609390 N-alpha-acetyltransferase 60_K79Q_mutation Human genes 0.000 description 2
- 108091007491 NSP3 Papain-like protease domains Proteins 0.000 description 2
- 102220526023 Nectin-1_S84Y_mutation Human genes 0.000 description 2
- 102220509145 PDZ domain-containing protein 11_S81Y_mutation Human genes 0.000 description 2
- 108091000080 Phosphotransferase Proteins 0.000 description 2
- 229920002594 Polyethylene Glycol 8000 Polymers 0.000 description 2
- 102220511108 Probable aminopeptidase NPEPL1_S84A_mutation Human genes 0.000 description 2
- 102000001253 Protein Kinase Human genes 0.000 description 2
- 102220640338 Proton channel OTOP1_S81C_mutation Human genes 0.000 description 2
- 102220473688 Ras-related protein Rab-5A_S84E_mutation Human genes 0.000 description 2
- 102220644164 Rho guanine nucleotide exchange factor 10-like protein_T49W_mutation Human genes 0.000 description 2
- 108091030425 SUMO family Proteins 0.000 description 2
- 102000039405 SUMO family Human genes 0.000 description 2
- 229940124639 Selective inhibitor Drugs 0.000 description 2
- 229920002684 Sepharose Polymers 0.000 description 2
- 102220590324 Spindlin-1_D80A_mutation Human genes 0.000 description 2
- 102220601470 Succinate dehydrogenase [ubiquinone] cytochrome b small subunit, mitochondrial_S84P_mutation Human genes 0.000 description 2
- QAOWNCQODCNURD-UHFFFAOYSA-N Sulfuric acid Chemical compound OS(O)(=O)=O QAOWNCQODCNURD-UHFFFAOYSA-N 0.000 description 2
- 108010076818 TEV protease Proteins 0.000 description 2
- FQPDRTDDEZXCEC-SVSWQMSJSA-N Thr-Ile-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O FQPDRTDDEZXCEC-SVSWQMSJSA-N 0.000 description 2
- 102000005789 Vascular Endothelial Growth Factors Human genes 0.000 description 2
- 108010019530 Vascular Endothelial Growth Factors Proteins 0.000 description 2
- 239000000370 acceptor Substances 0.000 description 2
- 239000002253 acid Substances 0.000 description 2
- 238000001042 affinity chromatography Methods 0.000 description 2
- 210000004102 animal cell Anatomy 0.000 description 2
- 125000000637 arginyl group Chemical group N[C@@H](CCCNC(N)=N)C(=O)* 0.000 description 2
- 125000003118 aryl group Chemical group 0.000 description 2
- SRBFZHDQGSBBOR-UHFFFAOYSA-N beta-D-Pyranose-Lyxose Natural products OC1COC(O)C(O)C1O SRBFZHDQGSBBOR-UHFFFAOYSA-N 0.000 description 2
- 230000006287 biotinylation Effects 0.000 description 2
- 238000007413 biotinylation Methods 0.000 description 2
- 102220350046 c.242G>A Human genes 0.000 description 2
- 102220417859 c.242G>C Human genes 0.000 description 2
- 102220427842 c.250T>A Human genes 0.000 description 2
- 102220370181 c.75C>G Human genes 0.000 description 2
- 238000004113 cell culture Methods 0.000 description 2
- 150000005829 chemical entities Chemical class 0.000 description 2
- 210000000349 chromosome Anatomy 0.000 description 2
- 230000001268 conjugating effect Effects 0.000 description 2
- 239000000470 constituent Substances 0.000 description 2
- 238000007796 conventional method Methods 0.000 description 2
- 239000003431 cross linking reagent Substances 0.000 description 2
- 125000000151 cysteine group Chemical group N[C@@H](CS)C(=O)* 0.000 description 2
- 238000010790 dilution Methods 0.000 description 2
- 239000012895 dilution Substances 0.000 description 2
- 238000010494 dissociation reaction Methods 0.000 description 2
- 230000005593 dissociations Effects 0.000 description 2
- 239000000386 donor Substances 0.000 description 2
- 230000001747 exhibiting effect Effects 0.000 description 2
- 238000000684 flow cytometry Methods 0.000 description 2
- 239000000499 gel Substances 0.000 description 2
- 238000010353 genetic engineering Methods 0.000 description 2
- 125000000291 glutamic acid group Chemical group N[C@@H](CCC(O)=O)C(=O)* 0.000 description 2
- 125000000404 glutamine group Chemical group N[C@@H](CCC(N)=O)C(=O)* 0.000 description 2
- 230000013595 glycosylation Effects 0.000 description 2
- 238000006206 glycosylation reaction Methods 0.000 description 2
- 230000012010 growth Effects 0.000 description 2
- 239000001963 growth medium Substances 0.000 description 2
- 210000004408 hybridoma Anatomy 0.000 description 2
- 230000001900 immune effect Effects 0.000 description 2
- 238000003018 immunoassay Methods 0.000 description 2
- 230000006872 improvement Effects 0.000 description 2
- 230000006698 induction Effects 0.000 description 2
- 230000002452 interceptive effect Effects 0.000 description 2
- 150000002500 ions Chemical class 0.000 description 2
- 238000002955 isolation Methods 0.000 description 2
- 238000002372 labelling Methods 0.000 description 2
- 230000000670 limiting effect Effects 0.000 description 2
- 125000003588 lysine group Chemical group [H]N([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])(N([H])[H])C(*)=O 0.000 description 2
- 239000002609 medium Substances 0.000 description 2
- MYWUZJCMWCOHBA-VIFPVBQESA-N methamphetamine Chemical compound CN[C@@H](C)CC1=CC=CC=C1 MYWUZJCMWCOHBA-VIFPVBQESA-N 0.000 description 2
- 238000005065 mining Methods 0.000 description 2
- 238000002156 mixing Methods 0.000 description 2
- 238000010369 molecular cloning Methods 0.000 description 2
- 238000012544 monitoring process Methods 0.000 description 2
- 108010087904 neutravidin Proteins 0.000 description 2
- 230000006780 non-homologous end joining Effects 0.000 description 2
- 238000002515 oligonucleotide synthesis Methods 0.000 description 2
- 102000020233 phosphotransferase Human genes 0.000 description 2
- DCWXELXMIBXGTH-UHFFFAOYSA-N phosphotyrosine Chemical compound OC(=O)C(N)CC1=CC=C(OP(O)(O)=O)C=C1 DCWXELXMIBXGTH-UHFFFAOYSA-N 0.000 description 2
- 239000004033 plastic Substances 0.000 description 2
- 229920003023 plastic Polymers 0.000 description 2
- 230000003389 potentiating effect Effects 0.000 description 2
- 108020001580 protein domains Proteins 0.000 description 2
- 230000006916 protein interaction Effects 0.000 description 2
- 108060006633 protein kinase Proteins 0.000 description 2
- 102000005962 receptors Human genes 0.000 description 2
- 108020003175 receptors Proteins 0.000 description 2
- 108091008146 restriction endonucleases Proteins 0.000 description 2
- 238000003757 reverse transcription PCR Methods 0.000 description 2
- 102200139563 rs104893651 Human genes 0.000 description 2
- 102220226098 rs1064793491 Human genes 0.000 description 2
- 102220005682 rs121913116 Human genes 0.000 description 2
- 102220053204 rs138241615 Human genes 0.000 description 2
- 102220223196 rs146238336 Human genes 0.000 description 2
- 102220289587 rs1554918207 Human genes 0.000 description 2
- 102220279306 rs1555054109 Human genes 0.000 description 2
- 102220283010 rs1555527000 Human genes 0.000 description 2
- 102220074265 rs180177184 Human genes 0.000 description 2
- 102220058795 rs200514222 Human genes 0.000 description 2
- 102200145334 rs2274084 Human genes 0.000 description 2
- 102220005398 rs33964317 Human genes 0.000 description 2
- 102200118205 rs33990858 Human genes 0.000 description 2
- 102220004811 rs34571024 Human genes 0.000 description 2
- 102200082923 rs34703513 Human genes 0.000 description 2
- 102220005324 rs34703513 Human genes 0.000 description 2
- 102220275424 rs398124184 Human genes 0.000 description 2
- 102220045232 rs587781939 Human genes 0.000 description 2
- 102220085548 rs745327804 Human genes 0.000 description 2
- 102220297185 rs758076073 Human genes 0.000 description 2
- 102220060031 rs779297339 Human genes 0.000 description 2
- 102220077770 rs797045029 Human genes 0.000 description 2
- 102200155054 rs80358230 Human genes 0.000 description 2
- 102220202090 rs864622596 Human genes 0.000 description 2
- 102220103278 rs878854729 Human genes 0.000 description 2
- 102220134669 rs886054796 Human genes 0.000 description 2
- 239000000523 sample Substances 0.000 description 2
- 238000010187 selection method Methods 0.000 description 2
- 238000012163 sequencing technique Methods 0.000 description 2
- 150000003384 small molecules Chemical class 0.000 description 2
- 239000007787 solid Substances 0.000 description 2
- PFNFFQXMRSDOHW-UHFFFAOYSA-N spermine Chemical compound NCCCNCCCCNCCCN PFNFFQXMRSDOHW-UHFFFAOYSA-N 0.000 description 2
- 238000010561 standard procedure Methods 0.000 description 2
- 238000012916 structural analysis Methods 0.000 description 2
- 235000011149 sulphuric acid Nutrition 0.000 description 2
- 230000002194 synthesizing effect Effects 0.000 description 2
- 230000009897 systematic effect Effects 0.000 description 2
- 230000001225 therapeutic effect Effects 0.000 description 2
- 108010061238 threonyl-glycine Proteins 0.000 description 2
- FLCQLSRLQIPNLM-UHFFFAOYSA-N (2,5-dioxopyrrolidin-1-yl) 2-acetylsulfanylacetate Chemical compound CC(=O)SCC(=O)ON1C(=O)CCC1=O FLCQLSRLQIPNLM-UHFFFAOYSA-N 0.000 description 1
- HGUIGIFJDSEBMS-YZUVYHPZSA-N (3s)-3-[[2-[[(2s)-2-[[2-[[(2s,3r)-2-[[(2s)-2-[[(2s)-2-[[(2s)-2-amino-3-(4-hydroxyphenyl)propanoyl]amino]propanoyl]amino]-3-methylbutanoyl]amino]-3-hydroxybutanoyl]amino]acetyl]amino]-5-(diaminomethylideneamino)pentanoyl]amino]acetyl]amino]-4-[[(1s)-1-carb Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)CNC(=O)[C@H](CCCN=C(N)N)NC(=O)CNC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 HGUIGIFJDSEBMS-YZUVYHPZSA-N 0.000 description 1
- SGVWDRVQIYUSRA-UHFFFAOYSA-N 1-[2-[2-(2,5-dioxopyrrol-1-yl)ethyldisulfanyl]ethyl]pyrrole-2,5-dione Chemical compound O=C1C=CC(=O)N1CCSSCCN1C(=O)C=CC1=O SGVWDRVQIYUSRA-UHFFFAOYSA-N 0.000 description 1
- 238000004461 1H-15N HSQC Methods 0.000 description 1
- 102220492355 2'-5'-oligoadenylate synthase 3_R30A_mutation Human genes 0.000 description 1
- 102220492454 2'-5'-oligoadenylate synthase 3_T35L_mutation Human genes 0.000 description 1
- KMEMIMRPZGDOMG-UHFFFAOYSA-N 2-cyanoethoxyphosphonamidous acid Chemical compound NP(O)OCCC#N KMEMIMRPZGDOMG-UHFFFAOYSA-N 0.000 description 1
- NEWKHUASLBMWRE-UHFFFAOYSA-N 2-methyl-6-(phenylethynyl)pyridine Chemical compound CC1=CC=CC(C#CC=2C=CC=CC=2)=N1 NEWKHUASLBMWRE-UHFFFAOYSA-N 0.000 description 1
- JMUAKWNHKQBPGJ-UHFFFAOYSA-N 3-(pyridin-2-yldisulfanyl)-n-[4-[3-(pyridin-2-yldisulfanyl)propanoylamino]butyl]propanamide Chemical compound C=1C=CC=NC=1SSCCC(=O)NCCCCNC(=O)CCSSC1=CC=CC=N1 JMUAKWNHKQBPGJ-UHFFFAOYSA-N 0.000 description 1
- 102220643830 39S ribosomal protein L9, mitochondrial_N40Q_mutation Human genes 0.000 description 1
- QLPHBNRMJLFRGO-YDHSSHFGSA-N 5-[(3as,4s,6ar)-2-oxo-1,3,3a,4,6,6a-hexahydrothieno[3,4-d]imidazol-4-yl]-n-[6-[3-(pyridin-2-yldisulfanyl)propanoylamino]hexyl]pentanamide Chemical compound C([C@H]1[C@H]2NC(=O)N[C@H]2CS1)CCCC(=O)NCCCCCCNC(=O)CCSSC1=CC=CC=N1 QLPHBNRMJLFRGO-YDHSSHFGSA-N 0.000 description 1
- 102220561382 5-hydroxytryptamine receptor 2B_Q45E_mutation Human genes 0.000 description 1
- 102220497093 5-hydroxytryptamine receptor 3B_K68A_mutation Human genes 0.000 description 1
- 102220496088 5-hydroxytryptamine receptor 3B_N47A_mutation Human genes 0.000 description 1
- 102220497079 5-hydroxytryptamine receptor 3B_Q33A_mutation Human genes 0.000 description 1
- 102220476540 60S ribosomal protein L26-like 1_K86F_mutation Human genes 0.000 description 1
- VVIAGPKUTFNRDU-UHFFFAOYSA-N 6S-folinic acid Natural products C1NC=2NC(N)=NC(=O)C=2N(C=O)C1CNC1=CC=C(C(=O)NC(CCC(O)=O)C(O)=O)C=C1 VVIAGPKUTFNRDU-UHFFFAOYSA-N 0.000 description 1
- 102220546915 ADP-ribosylation factor 3_D35L_mutation Human genes 0.000 description 1
- 102220491264 ADP-ribosylation factor 6_Y31V_mutation Human genes 0.000 description 1
- 102220472850 ADP-ribosylation factor-like protein 13B_T35N_mutation Human genes 0.000 description 1
- 102220491410 ADP-ribosylation factor-like protein 16_T37N_mutation Human genes 0.000 description 1
- 102220554193 APC membrane recruitment protein 1_E38A_mutation Human genes 0.000 description 1
- 102220553838 APC membrane recruitment protein 1_K79A_mutation Human genes 0.000 description 1
- 102220511153 APC membrane recruitment protein 1_N75D_mutation Human genes 0.000 description 1
- 102220510697 APC membrane recruitment protein 1_T28A_mutation Human genes 0.000 description 1
- 102220554063 APC membrane recruitment protein 1_T41A_mutation Human genes 0.000 description 1
- 102220579265 ARF GTPase-activating protein GIT1_T35H_mutation Human genes 0.000 description 1
- 102220485323 ATP-dependent DNA/RNA helicase DHX36_N77G_mutation Human genes 0.000 description 1
- 102220492601 ATPase GET3_K86D_mutation Human genes 0.000 description 1
- 101100295756 Acinetobacter baumannii (strain ATCC 19606 / DSM 30007 / JCM 6841 / CCUG 19606 / CIP 70.34 / NBRC 109757 / NCIMB 12457 / NCTC 12156 / 81) omp38 gene Proteins 0.000 description 1
- 102220529180 Activated RNA polymerase II transcriptional coactivator p15_K68G_mutation Human genes 0.000 description 1
- 102220635346 Adenylate kinase isoenzyme 6_V29K_mutation Human genes 0.000 description 1
- JBVSSSZFNTXJDX-YTLHQDLWSA-N Ala-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@H](C)N JBVSSSZFNTXJDX-YTLHQDLWSA-N 0.000 description 1
- MCKSLROAGSDNFC-ACZMJKKPSA-N Ala-Asp-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MCKSLROAGSDNFC-ACZMJKKPSA-N 0.000 description 1
- LSMDIAAALJJLRO-XQXXSGGOSA-N Ala-Thr-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O LSMDIAAALJJLRO-XQXXSGGOSA-N 0.000 description 1
- 241001504639 Alcedo atthis Species 0.000 description 1
- 102220469561 Aldo-keto reductase family 1 member D1_A24Y_mutation Human genes 0.000 description 1
- 102220591721 Amphiregulin_A26R_mutation Human genes 0.000 description 1
- 102220613787 Angiotensin-converting enzyme 2_K68D_mutation Human genes 0.000 description 1
- 102220613801 Angiotensin-converting enzyme 2_Q42L_mutation Human genes 0.000 description 1
- 102220615939 Apolipoprotein E_R79T_mutation Human genes 0.000 description 1
- SGYSTDWPNPKJPP-GUBZILKMSA-N Arg-Ala-Arg Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O SGYSTDWPNPKJPP-GUBZILKMSA-N 0.000 description 1
- LRPZJPMQGKGHSG-XGEHTFHBSA-N Arg-Ser-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCCN=C(N)N)N)O LRPZJPMQGKGHSG-XGEHTFHBSA-N 0.000 description 1
- UVTGNSWSRSCPLP-UHFFFAOYSA-N Arg-Tyr Natural products NC(CCNC(=N)N)C(=O)NC(Cc1ccc(O)cc1)C(=O)O UVTGNSWSRSCPLP-UHFFFAOYSA-N 0.000 description 1
- PFOYSEIHFVKHNF-FXQIFTODSA-N Asn-Ala-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O PFOYSEIHFVKHNF-FXQIFTODSA-N 0.000 description 1
- WONGRTVAMHFGBE-WDSKDSINSA-N Asn-Gly-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N WONGRTVAMHFGBE-WDSKDSINSA-N 0.000 description 1
- MYCSPQIARXTUTP-SRVKXCTJSA-N Asn-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC(=O)N)N MYCSPQIARXTUTP-SRVKXCTJSA-N 0.000 description 1
- DJIMLSXHXKWADV-CIUDSAMLSA-N Asn-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(N)=O DJIMLSXHXKWADV-CIUDSAMLSA-N 0.000 description 1
- YRTOMUMWSTUQAX-FXQIFTODSA-N Asn-Pro-Asp Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O YRTOMUMWSTUQAX-FXQIFTODSA-N 0.000 description 1
- PQKSVQSMTHPRIB-ZKWXMUAHSA-N Asn-Val-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O PQKSVQSMTHPRIB-ZKWXMUAHSA-N 0.000 description 1
- NJIKKGUVGUBICV-ZLUOBGJFSA-N Asp-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC(O)=O NJIKKGUVGUBICV-ZLUOBGJFSA-N 0.000 description 1
- DWOGMPWRQQWPPF-GUBZILKMSA-N Asp-Leu-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O DWOGMPWRQQWPPF-GUBZILKMSA-N 0.000 description 1
- BPAUXFVCSYQDQX-JRQIVUDYSA-N Asp-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC(=O)O)N)O BPAUXFVCSYQDQX-JRQIVUDYSA-N 0.000 description 1
- 101100136076 Aspergillus oryzae (strain ATCC 42149 / RIB 40) pel1 gene Proteins 0.000 description 1
- 102220548849 B-cell linker protein_Y72F_mutation Human genes 0.000 description 1
- 102220567469 BICD family-like cargo adapter 2_A26Q_mutation Human genes 0.000 description 1
- 102220567512 BICD family-like cargo adapter 2_R33A_mutation Human genes 0.000 description 1
- 102220567627 BICD family-like cargo adapter 2_R79A_mutation Human genes 0.000 description 1
- 102220567494 BICD family-like cargo adapter 2_V75G_mutation Human genes 0.000 description 1
- 102000008836 BTB/POZ domains Human genes 0.000 description 1
- 101100018944 Bacillus subtilis (strain 168) iolA gene Proteins 0.000 description 1
- 101100286998 Bacillus subtilis (strain 168) iolC gene Proteins 0.000 description 1
- 101100397068 Bacillus subtilis (strain 168) iolD gene Proteins 0.000 description 1
- 101100179883 Bacillus subtilis (strain 168) iolF gene Proteins 0.000 description 1
- 101100341057 Bacillus subtilis (strain 168) iolG gene Proteins 0.000 description 1
- 102220518579 Baculoviral IAP repeat-containing protein 6_T43E_mutation Human genes 0.000 description 1
- 102220520088 Barrier-to-autointegration factor_K54E_mutation Human genes 0.000 description 1
- 102220608040 Beta-defensin 1_R30T_mutation Human genes 0.000 description 1
- 102220477367 Bis(5'-nucleosyl)-tetraphosphatase [asymmetrical]_K79M_mutation Human genes 0.000 description 1
- 102220617295 CPX chromosomal region candidate gene 1 protein_A83F_mutation Human genes 0.000 description 1
- 102220617294 CPX chromosomal region candidate gene 1 protein_A83I_mutation Human genes 0.000 description 1
- 102220617178 CPX chromosomal region candidate gene 1 protein_A83W_mutation Human genes 0.000 description 1
- 102220546641 Cadherin-1_V27Q_mutation Human genes 0.000 description 1
- 101710167800 Capsid assembly scaffolding protein Proteins 0.000 description 1
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 1
- 102220565767 Carboxylesterase 4A_K68Y_mutation Human genes 0.000 description 1
- 102220571118 Carboxypeptidase M_V29R_mutation Human genes 0.000 description 1
- 102220634221 Casein kinase I isoform alpha-like_T43Q_mutation Human genes 0.000 description 1
- 102220534615 Caspase recruitment domain-containing protein 9_D66R_mutation Human genes 0.000 description 1
- 102220534607 Caspase recruitment domain-containing protein 9_R35E_mutation Human genes 0.000 description 1
- 102220581427 Cell cycle regulator of non-homologous end joining_N40D_mutation Human genes 0.000 description 1
- 102220582871 Cellular tumor antigen p53_A39P_mutation Human genes 0.000 description 1
- 102220584248 Cellular tumor antigen p53_A78V_mutation Human genes 0.000 description 1
- 102220584298 Cellular tumor antigen p53_P82S_mutation Human genes 0.000 description 1
- 102220474811 Chemerin-like receptor 2_D66K_mutation Human genes 0.000 description 1
- 241000251730 Chondrichthyes Species 0.000 description 1
- 102220503654 Chymotrypsin-C_D35H_mutation Human genes 0.000 description 1
- 102100036444 Clathrin interactor 1 Human genes 0.000 description 1
- 102220579680 Claudin-1_A39R_mutation Human genes 0.000 description 1
- 102220603374 Claudin-9_V45N_mutation Human genes 0.000 description 1
- 102220525561 Coiled-coil domain-containing protein 200_A24N_mutation Human genes 0.000 description 1
- 101710169694 Core protein VP8 Proteins 0.000 description 1
- 241000699802 Cricetulus griseus Species 0.000 description 1
- 102220553754 Cyclic GMP-AMP synthase_T35E_mutation Human genes 0.000 description 1
- 102220614195 Cyclin-dependent kinase 2_N40E_mutation Human genes 0.000 description 1
- 102220503704 Cyclin-dependent kinase 8_V27L_mutation Human genes 0.000 description 1
- 102220602507 Cyclin-dependent kinases regulatory subunit 1_V85Y_mutation Human genes 0.000 description 1
- 102220603755 D(1A) dopamine receptor_T37P_mutation Human genes 0.000 description 1
- 102220555786 DDB1- and CUL4-associated factor 4_W22C_mutation Human genes 0.000 description 1
- 230000005778 DNA damage Effects 0.000 description 1
- 231100000277 DNA damage Toxicity 0.000 description 1
- 108050008316 DNA endonuclease RBBP8 Proteins 0.000 description 1
- 102220614076 DNA ligase 3_D35E_mutation Human genes 0.000 description 1
- 102220520404 DNA polymerase beta_K68R_mutation Human genes 0.000 description 1
- 238000001712 DNA sequencing Methods 0.000 description 1
- 102220555200 DNA topoisomerase 3-alpha_D23R_mutation Human genes 0.000 description 1
- 102100026662 Delta and Notch-like epidermal growth factor-related receptor Human genes 0.000 description 1
- 102220545976 Dihydropyrimidinase_Y70A_mutation Human genes 0.000 description 1
- 102220623947 DnaJ homolog subfamily A member 3, mitochondrial_N75Y_mutation Human genes 0.000 description 1
- 102220499619 Dysferlin_R79D_mutation Human genes 0.000 description 1
- 108091005942 ECFP Proteins 0.000 description 1
- 102220466999 EH domain-binding protein 1-like protein 1_G33K_mutation Human genes 0.000 description 1
- 102220540124 ER membrane protein complex subunit 6_E38D_mutation Human genes 0.000 description 1
- 102220626670 Electrogenic sodium bicarbonate cotransporter 1_T49D_mutation Human genes 0.000 description 1
- 102220487437 Electron transfer flavoprotein subunit alpha, mitochondrial_Y70M_mutation Human genes 0.000 description 1
- 102220498147 Electron transfer flavoprotein subunit beta_Q42P_mutation Human genes 0.000 description 1
- 102220517319 Electron transfer flavoprotein-ubiquinone oxidoreductase, mitochondrial_A26H_mutation Human genes 0.000 description 1
- 102220574090 Endoglin_G52V_mutation Human genes 0.000 description 1
- 102220584607 Endonuclease 8-like 1_K54L_mutation Human genes 0.000 description 1
- 102220597087 Essential MCU regulator, mitochondrial_S85W_mutation Human genes 0.000 description 1
- 102220614871 Extracellular calcium-sensing receptor_S53P_mutation Human genes 0.000 description 1
- 102220538052 FAD-AMP lyase (cyclizing)_Y31F_mutation Human genes 0.000 description 1
- 102220571590 Fatty acid hydroxylase domain-containing protein 2_K79N_mutation Human genes 0.000 description 1
- 241000724791 Filamentous phage Species 0.000 description 1
- MPJKWIXIYCLVCU-UHFFFAOYSA-N Folinic acid Natural products NC1=NC2=C(N(C=O)C(CNc3ccc(cc3)C(=O)NC(CCC(=O)O)CC(=O)O)CN2)C(=O)N1 MPJKWIXIYCLVCU-UHFFFAOYSA-N 0.000 description 1
- 102000020897 Formins Human genes 0.000 description 1
- 108091022623 Formins Proteins 0.000 description 1
- 102220621869 G-protein coupled estrogen receptor 1_K54I_mutation Human genes 0.000 description 1
- 102220480535 GSK3B-interacting protein_K86M_mutation Human genes 0.000 description 1
- 102220605911 GTPase HRas_V29A_mutation Human genes 0.000 description 1
- 102220642257 Gap junction alpha-8 protein_Q33H_mutation Human genes 0.000 description 1
- 102220590637 Gap junction beta-1 protein_S85F_mutation Human genes 0.000 description 1
- 102220606778 Gap junction beta-1 protein_V37M_mutation Human genes 0.000 description 1
- 102220589915 Glial fibrillary acidic protein_N77K_mutation Human genes 0.000 description 1
- DTMLKCYOQKZXKZ-HJGDQZAQSA-N Gln-Arg-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DTMLKCYOQKZXKZ-HJGDQZAQSA-N 0.000 description 1
- LFIVHGMKWFGUGK-IHRRRGAJSA-N Gln-Glu-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N LFIVHGMKWFGUGK-IHRRRGAJSA-N 0.000 description 1
- FQCILXROGNOZON-YUMQZZPRSA-N Gln-Pro-Gly Chemical compound NC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O FQCILXROGNOZON-YUMQZZPRSA-N 0.000 description 1
- UCZXXMREFIETQW-AVGNSLFASA-N Glu-Tyr-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O UCZXXMREFIETQW-AVGNSLFASA-N 0.000 description 1
- 102220563887 Glucagon receptor_D80S_mutation Human genes 0.000 description 1
- 102220563884 Glucagon receptor_T76M_mutation Human genes 0.000 description 1
- 102220519818 Glutathione synthetase_A26D_mutation Human genes 0.000 description 1
- XCLCVBYNGXEVDU-WHFBIAKZSA-N Gly-Asn-Ser Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O XCLCVBYNGXEVDU-WHFBIAKZSA-N 0.000 description 1
- XLFHCWHXKSFVIB-BQBZGAKWSA-N Gly-Gln-Gln Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O XLFHCWHXKSFVIB-BQBZGAKWSA-N 0.000 description 1
- MBOAPAXLTUSMQI-JHEQGTHGSA-N Gly-Glu-Thr Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MBOAPAXLTUSMQI-JHEQGTHGSA-N 0.000 description 1
- KMSGYZQRXPUKGI-BYPYZUCNSA-N Gly-Gly-Asn Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC(N)=O KMSGYZQRXPUKGI-BYPYZUCNSA-N 0.000 description 1
- LLZXNUUIBOALNY-QWRGUYRKSA-N Gly-Leu-Lys Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN LLZXNUUIBOALNY-QWRGUYRKSA-N 0.000 description 1
- YYXJFBMCOUSYSF-RYUDHWBXSA-N Gly-Phe-Gln Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O YYXJFBMCOUSYSF-RYUDHWBXSA-N 0.000 description 1
- POJJAZJHBGXEGM-YUMQZZPRSA-N Gly-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)CN POJJAZJHBGXEGM-YUMQZZPRSA-N 0.000 description 1
- 102220598073 Granulocyte colony-stimulating factor receptor_E38P_mutation Human genes 0.000 description 1
- 102220552715 Group IIE secretory phospholipase A2_N40W_mutation Human genes 0.000 description 1
- 102220571078 Growth arrest and DNA damage-inducible protein GADD45 gamma_A83R_mutation Human genes 0.000 description 1
- 102220520009 Guanylyl cyclase-activating protein 1_Y73I_mutation Human genes 0.000 description 1
- 102220480123 H/ACA ribonucleoprotein complex subunit DKC1_R35A_mutation Human genes 0.000 description 1
- 102220467477 HEPACAM family member 2_K86T_mutation Human genes 0.000 description 1
- 102220467563 HLA class II histocompatibility antigen, DR beta 5 chain_D66N_mutation Human genes 0.000 description 1
- 102220493757 HLA class II histocompatibility antigen, DRB1 beta chain_P40D_mutation Human genes 0.000 description 1
- 102220493761 HLA class II histocompatibility antigen, DRB1 beta chain_P40S_mutation Human genes 0.000 description 1
- 102220511663 Heme oxygenase 1_Y70F_mutation Human genes 0.000 description 1
- 102220558720 Hemogen_D66G_mutation Human genes 0.000 description 1
- 102220554203 Hemogen_T49V_mutation Human genes 0.000 description 1
- 108010054147 Hemoglobins Proteins 0.000 description 1
- 102000001554 Hemoglobins Human genes 0.000 description 1
- HTTJABKRGRZYRN-UHFFFAOYSA-N Heparin Chemical compound OC1C(NC(=O)C)C(O)OC(COS(O)(=O)=O)C1OC1C(OS(O)(=O)=O)C(O)C(OC2C(C(OS(O)(=O)=O)C(OC3C(C(O)C(O)C(O3)C(O)=O)OS(O)(=O)=O)C(CO)O2)NS(O)(=O)=O)C(C(O)=O)O1 HTTJABKRGRZYRN-UHFFFAOYSA-N 0.000 description 1
- 102220577733 Hepatocyte nuclear factor 3-alpha_A78E_mutation Human genes 0.000 description 1
- 102220577735 Hepatocyte nuclear factor 3-alpha_D81A_mutation Human genes 0.000 description 1
- 102220491788 High mobility group protein B1_S53E_mutation Human genes 0.000 description 1
- 102220631643 Histone H1.8_S85P_mutation Human genes 0.000 description 1
- 102220492800 Histone H4 transcription factor_T71R_mutation Human genes 0.000 description 1
- 102220471962 Histone deacetylase 7_S55W_mutation Human genes 0.000 description 1
- 101000851951 Homo sapiens Clathrin interactor 1 Proteins 0.000 description 1
- 101001038321 Homo sapiens Leucine-rich repeat protein 1 Proteins 0.000 description 1
- 101001093152 Homo sapiens Polycomb protein SCMH1 Proteins 0.000 description 1
- 101100202522 Homo sapiens SCMH1 gene Proteins 0.000 description 1
- 108010001336 Horseradish Peroxidase Proteins 0.000 description 1
- 241000243251 Hydra Species 0.000 description 1
- 102000004157 Hydrolases Human genes 0.000 description 1
- 108090000604 Hydrolases Proteins 0.000 description 1
- 241000370541 Idia Species 0.000 description 1
- 102100029199 Iduronate 2-sulfatase Human genes 0.000 description 1
- UWLHDGMRWXHFFY-HPCHECBXSA-N Ile-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N1CCC[C@@H]1C(=O)O)N UWLHDGMRWXHFFY-HPCHECBXSA-N 0.000 description 1
- GLYJPWIRLBAIJH-FQUUOJAGSA-N Ile-Lys-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N GLYJPWIRLBAIJH-FQUUOJAGSA-N 0.000 description 1
- GLYJPWIRLBAIJH-UHFFFAOYSA-N Ile-Lys-Pro Natural products CCC(C)C(N)C(=O)NC(CCCCN)C(=O)N1CCCC1C(O)=O GLYJPWIRLBAIJH-UHFFFAOYSA-N 0.000 description 1
- YBKKLDBBPFIXBQ-MBLNEYKQSA-N Ile-Thr-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(=O)O)N YBKKLDBBPFIXBQ-MBLNEYKQSA-N 0.000 description 1
- DGTOKVBDZXJHNZ-WZLNRYEVSA-N Ile-Thr-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N DGTOKVBDZXJHNZ-WZLNRYEVSA-N 0.000 description 1
- QHUREMVLLMNUAX-OSUNSFLBSA-N Ile-Thr-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)O)N QHUREMVLLMNUAX-OSUNSFLBSA-N 0.000 description 1
- 108010021625 Immunoglobulin Fragments Proteins 0.000 description 1
- 102000008394 Immunoglobulin Fragments Human genes 0.000 description 1
- 108700005091 Immunoglobulin Genes Proteins 0.000 description 1
- 108010065920 Insulin Lispro Proteins 0.000 description 1
- 102220546673 Insulin-like growth factor-binding protein complex acid labile subunit_Y76A_mutation Human genes 0.000 description 1
- 102100034343 Integrase Human genes 0.000 description 1
- 102220499975 Interferon-induced GTP-binding protein Mx2_T28L_mutation Human genes 0.000 description 1
- 102220470890 Interferon-induced protein with tetratricopeptide repeats 5_T37V_mutation Human genes 0.000 description 1
- 102220471647 Interleukin-10 receptor subunit alpha_Q42F_mutation Human genes 0.000 description 1
- 102220471658 Interleukin-10 receptor subunit alpha_Q42N_mutation Human genes 0.000 description 1
- 102220465628 Interleukin-17A_R78V_mutation Human genes 0.000 description 1
- 102100024319 Intestinal-type alkaline phosphatase Human genes 0.000 description 1
- 101710184243 Intestinal-type alkaline phosphatase Proteins 0.000 description 1
- 102220487762 Isoaspartyl peptidase/L-asparaginase_V45D_mutation Human genes 0.000 description 1
- 102100037157 Keratin, type I cytoskeletal 40 Human genes 0.000 description 1
- 102220564706 Killer cell immunoglobulin-like receptor 2DL2_E70C_mutation Human genes 0.000 description 1
- 125000000570 L-alpha-aspartyl group Chemical group [H]OC(=O)C([H])([H])[C@]([H])(N([H])[H])C(*)=O 0.000 description 1
- 125000002059 L-arginyl group Chemical group O=C([*])[C@](N([H])[H])([H])C([H])([H])C([H])([H])C([H])([H])N([H])C(=N[H])N([H])[H] 0.000 description 1
- HNDVDQJCIGZPNO-YFKPBYRVSA-N L-histidine Chemical compound OC(=O)[C@@H](N)CC1=CN=CN1 HNDVDQJCIGZPNO-YFKPBYRVSA-N 0.000 description 1
- FFFHZYDWPBMWHY-VKHMYHEASA-N L-homocysteine Chemical compound OC(=O)[C@@H](N)CCS FFFHZYDWPBMWHY-VKHMYHEASA-N 0.000 description 1
- 125000001176 L-lysyl group Chemical group [H]N([H])[C@]([H])(C(=O)[*])C([H])([H])C([H])([H])C([H])([H])C(N([H])[H])([H])[H] 0.000 description 1
- 125000000773 L-serino group Chemical group [H]OC(=O)[C@@]([H])(N([H])*)C([H])([H])O[H] 0.000 description 1
- 125000003798 L-tyrosyl group Chemical group [H]N([H])[C@]([H])(C(=O)[*])C([H])([H])C1=C([H])C([H])=C(O[H])C([H])=C1[H] 0.000 description 1
- 102220595510 Lanosterol 14-alpha demethylase_Y72N_mutation Human genes 0.000 description 1
- 241000880493 Leptailurus serval Species 0.000 description 1
- VCSBGUACOYUIGD-CIUDSAMLSA-N Leu-Asn-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O VCSBGUACOYUIGD-CIUDSAMLSA-N 0.000 description 1
- CQGSYZCULZMEDE-UHFFFAOYSA-N Leu-Gln-Pro Natural products CC(C)CC(N)C(=O)NC(CCC(N)=O)C(=O)N1CCCC1C(O)=O CQGSYZCULZMEDE-UHFFFAOYSA-N 0.000 description 1
- DZQMXBALGUHGJT-GUBZILKMSA-N Leu-Glu-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O DZQMXBALGUHGJT-GUBZILKMSA-N 0.000 description 1
- WIDZHJTYKYBLSR-DCAQKATOSA-N Leu-Glu-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WIDZHJTYKYBLSR-DCAQKATOSA-N 0.000 description 1
- HRTRLSRYZZKPCO-BJDJZHNGSA-N Leu-Ile-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O HRTRLSRYZZKPCO-BJDJZHNGSA-N 0.000 description 1
- AIQWYVFNBNNOLU-RHYQMDGZSA-N Leu-Thr-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O AIQWYVFNBNNOLU-RHYQMDGZSA-N 0.000 description 1
- BGGTYDNTOYRTTR-MEYUZBJRSA-N Leu-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC(C)C)N)O BGGTYDNTOYRTTR-MEYUZBJRSA-N 0.000 description 1
- 102100040249 Leucine-rich repeat protein 1 Human genes 0.000 description 1
- 102220637111 Lipoma-preferred partner_K54S_mutation Human genes 0.000 description 1
- 102220558628 Low affinity immunoglobulin gamma Fc region receptor III-B_A78D_mutation Human genes 0.000 description 1
- 102220496736 Lymphocyte function-associated antigen 3_V37K_mutation Human genes 0.000 description 1
- QUYCUALODHJQLK-CIUDSAMLSA-N Lys-Asp-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O QUYCUALODHJQLK-CIUDSAMLSA-N 0.000 description 1
- VEGLGAOVLFODGC-GUBZILKMSA-N Lys-Glu-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O VEGLGAOVLFODGC-GUBZILKMSA-N 0.000 description 1
- XREQQOATSMMAJP-MGHWNKPDSA-N Lys-Ile-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XREQQOATSMMAJP-MGHWNKPDSA-N 0.000 description 1
- QBHGXFQJFPWJIH-XUXIUFHCSA-N Lys-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCCN QBHGXFQJFPWJIH-XUXIUFHCSA-N 0.000 description 1
- QVTDVTONTRSQMF-WDCWCFNPSA-N Lys-Thr-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CCCCN QVTDVTONTRSQMF-WDCWCFNPSA-N 0.000 description 1
- 102220535072 Lysophospholipase_D76K_mutation Human genes 0.000 description 1
- 102220574679 Lysosomal alpha-mannosidase_V27T_mutation Human genes 0.000 description 1
- 241000124008 Mammalia Species 0.000 description 1
- 102220494258 Methylmalonyl-CoA mutase, mitochondrial_E83A_mutation Human genes 0.000 description 1
- 102220509916 Methylosome protein 50_S48I_mutation Human genes 0.000 description 1
- 102220476321 Mis18-binding protein 1_S55G_mutation Human genes 0.000 description 1
- 102220637403 Mitochondrial genome maintenance exonuclease 1_G79K_mutation Human genes 0.000 description 1
- 101100043441 Mus musculus Srsf12 gene Proteins 0.000 description 1
- 102000008300 Mutant Proteins Human genes 0.000 description 1
- 108010021466 Mutant Proteins Proteins 0.000 description 1
- 101001059625 Mytilus californianus Fibronectin type III domain-containing protein Proteins 0.000 description 1
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 1
- AUEJLPRZGVVDNU-UHFFFAOYSA-N N-L-tyrosyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 AUEJLPRZGVVDNU-UHFFFAOYSA-N 0.000 description 1
- 102220506264 N-alpha-acetyltransferase 50_Y31A_mutation Human genes 0.000 description 1
- 102220523821 NADH dehydrogenase [ubiquinone] 1 alpha subcomplex subunit 13_G79H_mutation Human genes 0.000 description 1
- 102220498805 NBAS subunit of NRZ tethering complex_Q44E_mutation Human genes 0.000 description 1
- 102220532674 NEDD8-conjugating enzyme UBE2F_D81R_mutation Human genes 0.000 description 1
- 102220476547 NF-kappa-B inhibitor alpha_D35A_mutation Human genes 0.000 description 1
- 102220577739 Natural killer cell receptor 2B4_T76K_mutation Human genes 0.000 description 1
- 206010028980 Neoplasm Diseases 0.000 description 1
- 102220574719 Netrin receptor UNC5C_T35I_mutation Human genes 0.000 description 1
- 102220606690 Neurotrimin_D23N_mutation Human genes 0.000 description 1
- 108091060545 Nonsense suppressor Proteins 0.000 description 1
- 102220484661 Norrin_K54N_mutation Human genes 0.000 description 1
- 102220484662 Norrin_V45M_mutation Human genes 0.000 description 1
- 102220482008 Nuclear cap-binding protein subunit 2_T76R_mutation Human genes 0.000 description 1
- 108091028043 Nucleic acid sequence Proteins 0.000 description 1
- BZQFBWGGLXLEPQ-UHFFFAOYSA-N O-phosphoryl-L-serine Natural products OC(=O)C(N)COP(O)(O)=O BZQFBWGGLXLEPQ-UHFFFAOYSA-N 0.000 description 1
- 108700020796 Oncogene Proteins 0.000 description 1
- 102220567357 Ornithine decarboxylase antizyme 1_K79D_mutation Human genes 0.000 description 1
- 241000283973 Oryctolagus cuniculus Species 0.000 description 1
- 102220581197 Oxidized purine nucleoside triphosphate hydrolase_G37F_mutation Human genes 0.000 description 1
- 238000012408 PCR amplification Methods 0.000 description 1
- 101150086423 PIAS2 gene Proteins 0.000 description 1
- 229910019142 PO4 Inorganic materials 0.000 description 1
- 240000005373 Panax quinquefolius Species 0.000 description 1
- 235000003140 Panax quinquefolius Nutrition 0.000 description 1
- 102220479493 Pantetheinase_R33N_mutation Human genes 0.000 description 1
- 102220514015 Pecanex-like protein 1_D81N_mutation Human genes 0.000 description 1
- 239000004105 Penicillin G potassium Substances 0.000 description 1
- 102000035195 Peptidases Human genes 0.000 description 1
- 108091005804 Peptidases Proteins 0.000 description 1
- 102220473285 Peroxisomal membrane protein 11B_K68T_mutation Human genes 0.000 description 1
- 102220645720 Phosphoglucomutase-1_K68M_mutation Human genes 0.000 description 1
- 102220520241 Phospholipase A2 group XV_V85S_mutation Human genes 0.000 description 1
- 108010001441 Phosphopeptides Proteins 0.000 description 1
- 102220633875 Phytanoyl-CoA hydroxylase-interacting protein-like_R33E_mutation Human genes 0.000 description 1
- 102220633874 Phytanoyl-CoA hydroxylase-interacting protein-like_R33K_mutation Human genes 0.000 description 1
- 102220539588 Piwi-like protein 1_K79I_mutation Human genes 0.000 description 1
- 206010035226 Plasma cell myeloma Diseases 0.000 description 1
- 102220502464 Polyadenylate-binding protein 2_R79E_mutation Human genes 0.000 description 1
- 102220502878 Polyadenylate-binding protein 2_R79I_mutation Human genes 0.000 description 1
- 102220508362 Prefoldin subunit 1_D66T_mutation Human genes 0.000 description 1
- 102220641232 Pregnancy-specific beta-1-glycoprotein 1_T43P_mutation Human genes 0.000 description 1
- 102220542872 Presenilins-associated rhomboid-like protein, mitochondrial_T69D_mutation Human genes 0.000 description 1
- 241000288906 Primates Species 0.000 description 1
- OOLOTUZJUBOMAX-GUBZILKMSA-N Pro-Ala-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O OOLOTUZJUBOMAX-GUBZILKMSA-N 0.000 description 1
- FKLSMYYLJHYPHH-UWVGGRQHSA-N Pro-Gly-Leu Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O FKLSMYYLJHYPHH-UWVGGRQHSA-N 0.000 description 1
- HAEGAELAYWSUNC-WPRPVWTQSA-N Pro-Gly-Val Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O HAEGAELAYWSUNC-WPRPVWTQSA-N 0.000 description 1
- FKYKZHOKDOPHSA-DCAQKATOSA-N Pro-Leu-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O FKYKZHOKDOPHSA-DCAQKATOSA-N 0.000 description 1
- JLMZKEQFMVORMA-SRVKXCTJSA-N Pro-Pro-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 JLMZKEQFMVORMA-SRVKXCTJSA-N 0.000 description 1
- RCYUBVHMVUHEBM-RCWTZXSCSA-N Pro-Pro-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O RCYUBVHMVUHEBM-RCWTZXSCSA-N 0.000 description 1
- WVXQQUWOKUZIEG-VEVYYDQMSA-N Pro-Thr-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O WVXQQUWOKUZIEG-VEVYYDQMSA-N 0.000 description 1
- RMJZWERKFFNNNS-XGEHTFHBSA-N Pro-Thr-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O RMJZWERKFFNNNS-XGEHTFHBSA-N 0.000 description 1
- 102220511110 Probable aminopeptidase NPEPL1_K82A_mutation Human genes 0.000 description 1
- 102220511156 Probable aminopeptidase NPEPL1_R78A_mutation Human genes 0.000 description 1
- 102220511102 Probable aminopeptidase NPEPL1_S85A_mutation Human genes 0.000 description 1
- 101710130420 Probable capsid assembly scaffolding protein Proteins 0.000 description 1
- 102220539295 Programmed cell death 1 ligand 2_T37Y_mutation Human genes 0.000 description 1
- XBDQKXXYIPTUBI-UHFFFAOYSA-M Propionate Chemical compound CCC([O-])=O XBDQKXXYIPTUBI-UHFFFAOYSA-M 0.000 description 1
- 239000004365 Protease Substances 0.000 description 1
- 102220544203 Proteasome maturation protein_Q45A_mutation Human genes 0.000 description 1
- 102000006010 Protein Disulfide-Isomerase Human genes 0.000 description 1
- 102220587748 Protein FAM102A_Y73N_mutation Human genes 0.000 description 1
- 102220537320 Protein NDRG2_G31P_mutation Human genes 0.000 description 1
- 102220633356 Protein PALS2_P40A_mutation Human genes 0.000 description 1
- 102220633318 Protein PALS2_V37G_mutation Human genes 0.000 description 1
- 102220513293 Protein VAC14 homolog_G52E_mutation Human genes 0.000 description 1
- 102220630212 Protein amnionless_R30E_mutation Human genes 0.000 description 1
- 102220631068 Protein amnionless_S80T_mutation Human genes 0.000 description 1
- 102220466316 Protein jagged-1_G33V_mutation Human genes 0.000 description 1
- 102220579432 Protein mono-ADP-ribosyltransferase PARP3_S80Y_mutation Human genes 0.000 description 1
- 102220540010 Protein phosphatase 1 regulatory subunit 1A_T35D_mutation Human genes 0.000 description 1
- 102220543111 Protein pitchfork_V27N_mutation Human genes 0.000 description 1
- 102220543892 Protocadherin-10_A39I_mutation Human genes 0.000 description 1
- 102220543868 Protocadherin-10_A39T_mutation Human genes 0.000 description 1
- 102220547641 Protocadherin-10_S55L_mutation Human genes 0.000 description 1
- 102220543831 Protoporphyrinogen oxidase_G40E_mutation Human genes 0.000 description 1
- 101100084022 Pseudomonas aeruginosa (strain ATCC 15692 / DSM 22644 / CIP 104116 / JCM 14847 / LMG 12228 / 1C / PRS 101 / PAO1) lapA gene Proteins 0.000 description 1
- 102220612295 Putative coiled-coil domain-containing protein 144 N-terminal-like_S55D_mutation Human genes 0.000 description 1
- 102220495216 Putative translationally-controlled tumor protein-like protein TPT1P8_V45T_mutation Human genes 0.000 description 1
- 102220492545 Pyruvate dehydrogenase protein X component, mitochondrial_N72G_mutation Human genes 0.000 description 1
- 102220530405 Pyruvate kinase PKLR_S80P_mutation Human genes 0.000 description 1
- 102220614638 Queuine tRNA-ribosyltransferase accessory subunit 2_T41R_mutation Human genes 0.000 description 1
- 102220615001 RELT-like protein 2_A83L_mutation Human genes 0.000 description 1
- 238000010802 RNA extraction kit Methods 0.000 description 1
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 1
- 102220539213 Ras suppressor protein 1_K86S_mutation Human genes 0.000 description 1
- 102220518020 Ras-related protein Rab-35_K82Q_mutation Human genes 0.000 description 1
- 102220518026 Ras-related protein Rab-35_K82Y_mutation Human genes 0.000 description 1
- 102220494497 Ras-related protein Rab-5C_S85E_mutation Human genes 0.000 description 1
- 102220551728 Ras-related protein Rab-7L1_T71E_mutation Human genes 0.000 description 1
- 102220573260 Ras-related protein Ral-A_E38R_mutation Human genes 0.000 description 1
- 102220528786 Receptor expression-enhancing protein 5_T49K_mutation Human genes 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 102220531763 Retinoid-inducible serine carboxypeptidase_Y73H_mutation Human genes 0.000 description 1
- 102220611645 Retinoschisin_Q42A_mutation Human genes 0.000 description 1
- 102220521659 Ribosome biogenesis protein NSA2 homolog_K86A_mutation Human genes 0.000 description 1
- 241000283984 Rodentia Species 0.000 description 1
- 102220490364 S-adenosylhomocysteine hydrolase-like protein 1_S80A_mutation Human genes 0.000 description 1
- 102000000583 SNARE Proteins Human genes 0.000 description 1
- 102100035250 SUMO-activating enzyme subunit 2 Human genes 0.000 description 1
- 241000235070 Saccharomyces Species 0.000 description 1
- 101710204410 Scaffold protein Proteins 0.000 description 1
- 102220492407 Selenoprotein V_T41P_mutation Human genes 0.000 description 1
- BTPAWKABYQMKKN-LKXGYXEUSA-N Ser-Asp-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BTPAWKABYQMKKN-LKXGYXEUSA-N 0.000 description 1
- LQESNKGTTNHZPZ-GHCJXIJMSA-N Ser-Ile-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O LQESNKGTTNHZPZ-GHCJXIJMSA-N 0.000 description 1
- CKDXFSPMIDSMGV-GUBZILKMSA-N Ser-Pro-Val Chemical compound [H]N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O CKDXFSPMIDSMGV-GUBZILKMSA-N 0.000 description 1
- VFWQQZMRKFOGLE-ZLUOBGJFSA-N Ser-Ser-Cys Chemical compound C([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)O)N)O VFWQQZMRKFOGLE-ZLUOBGJFSA-N 0.000 description 1
- CUXJENOFJXOSOZ-BIIVOSGPSA-N Ser-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CO)N)C(=O)O CUXJENOFJXOSOZ-BIIVOSGPSA-N 0.000 description 1
- XJDMUQCLVSCRSJ-VZFHVOOUSA-N Ser-Thr-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O XJDMUQCLVSCRSJ-VZFHVOOUSA-N 0.000 description 1
- SDFUZKIAHWRUCS-QEJZJMRPSA-N Ser-Trp-Glu Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CO)N SDFUZKIAHWRUCS-QEJZJMRPSA-N 0.000 description 1
- 102220529495 Serine/threonine-protein kinase Chk2_K54A_mutation Human genes 0.000 description 1
- 102220594891 Serine/threonine-protein kinase PLK1_K82M_mutation Human genes 0.000 description 1
- 102220471540 Single-stranded DNA cytosine deaminase_Y31H_mutation Human genes 0.000 description 1
- 102220590337 Spindlin-1_G77D_mutation Human genes 0.000 description 1
- 102220596535 Splicing factor 1_W22A_mutation Human genes 0.000 description 1
- 102220596575 Splicing factor 1_W22F_mutation Human genes 0.000 description 1
- 102220515195 Steroid receptor-associated and regulated protein_G52W_mutation Human genes 0.000 description 1
- 102220602971 Store-operated calcium entry-associated regulatory factor_Y70T_mutation Human genes 0.000 description 1
- 102220600132 Succinate dehydrogenase [ubiquinone] cytochrome b small subunit, mitochondrial_Q42R_mutation Human genes 0.000 description 1
- QAOWNCQODCNURD-UHFFFAOYSA-L Sulfate Chemical compound [O-]S([O-])(=O)=O QAOWNCQODCNURD-UHFFFAOYSA-L 0.000 description 1
- 102220623859 Sulfotransferase 2B1_N72W_mutation Human genes 0.000 description 1
- 102220581570 Sulfotransferase 6B1_V75D_mutation Human genes 0.000 description 1
- 102220600906 Syndecan-4_E47L_mutation Human genes 0.000 description 1
- 102220600905 Syndecan-4_E47Q_mutation Human genes 0.000 description 1
- 102220600947 Syndecan-4_E47R_mutation Human genes 0.000 description 1
- 102220617491 Syndecan-4_V75A_mutation Human genes 0.000 description 1
- 102220559638 T-cell-interacting, activating receptor on myeloid cells protein 1_D66E_mutation Human genes 0.000 description 1
- 102220521930 THAP domain-containing protein 1_N75I_mutation Human genes 0.000 description 1
- 102220601764 Terminal nucleotidyltransferase 5D_N72A_mutation Human genes 0.000 description 1
- 102220601769 Terminal nucleotidyltransferase 5D_N72H_mutation Human genes 0.000 description 1
- 102220484130 Testis-specific serine/threonine-protein kinase 6_K54M_mutation Human genes 0.000 description 1
- 239000004098 Tetracycline Substances 0.000 description 1
- DCLBXIWHLVEPMQ-JRQIVUDYSA-N Thr-Asp-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 DCLBXIWHLVEPMQ-JRQIVUDYSA-N 0.000 description 1
- QQWNRERCGGZOKG-WEDXCCLWSA-N Thr-Gly-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O QQWNRERCGGZOKG-WEDXCCLWSA-N 0.000 description 1
- JKGGPMOUIAAJAA-YEPSODPASA-N Thr-Gly-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O JKGGPMOUIAAJAA-YEPSODPASA-N 0.000 description 1
- GXUWHVZYDAHFSV-FLBSBUHZSA-N Thr-Ile-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GXUWHVZYDAHFSV-FLBSBUHZSA-N 0.000 description 1
- WYLAVUAWOUVUCA-XVSYOHENSA-N Thr-Phe-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O WYLAVUAWOUVUCA-XVSYOHENSA-N 0.000 description 1
- NDXSOKGYKCGYKT-VEVYYDQMSA-N Thr-Pro-Asp Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O NDXSOKGYKCGYKT-VEVYYDQMSA-N 0.000 description 1
- VTMGKRABARCZAX-OSUNSFLBSA-N Thr-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)[C@@H](C)O VTMGKRABARCZAX-OSUNSFLBSA-N 0.000 description 1
- NHQVWACSJZJCGJ-FLBSBUHZSA-N Thr-Thr-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NHQVWACSJZJCGJ-FLBSBUHZSA-N 0.000 description 1
- COYHRQWNJDJCNA-NUJDXYNKSA-N Thr-Thr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O COYHRQWNJDJCNA-NUJDXYNKSA-N 0.000 description 1
- XGFYGMKZKFRGAI-RCWTZXSCSA-N Thr-Val-Arg Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N XGFYGMKZKFRGAI-RCWTZXSCSA-N 0.000 description 1
- QNXZCKMXHPULME-ZNSHCXBVSA-N Thr-Val-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N)O QNXZCKMXHPULME-ZNSHCXBVSA-N 0.000 description 1
- 102220603018 Transcription factor Sp9_A24K_mutation Human genes 0.000 description 1
- 102220636841 Transforming protein RhoA_T28N_mutation Human genes 0.000 description 1
- 102220501003 Triadin_N75A_mutation Human genes 0.000 description 1
- PNHABSVRPFBUJY-UMPQAUOISA-N Trp-Arg-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O PNHABSVRPFBUJY-UMPQAUOISA-N 0.000 description 1
- PXQPYPMSLBQHJJ-WFBYXXMGSA-N Trp-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N PXQPYPMSLBQHJJ-WFBYXXMGSA-N 0.000 description 1
- 102220470555 Tryptase delta_D80T_mutation Human genes 0.000 description 1
- 102220470551 Tryptase delta_K79L_mutation Human genes 0.000 description 1
- 102100040403 Tumor necrosis factor receptor superfamily member 6 Human genes 0.000 description 1
- KDGFPPHLXCEQRN-STECZYCISA-N Tyr-Arg-Ile Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KDGFPPHLXCEQRN-STECZYCISA-N 0.000 description 1
- SGFIXFAHVWJKTD-KJEVXHAQSA-N Tyr-Arg-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SGFIXFAHVWJKTD-KJEVXHAQSA-N 0.000 description 1
- NZBSVMQZQMEUHI-WZLNRYEVSA-N Tyr-Thr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N NZBSVMQZQMEUHI-WZLNRYEVSA-N 0.000 description 1
- JHDZONWZTCKTJR-KJEVXHAQSA-N Tyr-Thr-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 JHDZONWZTCKTJR-KJEVXHAQSA-N 0.000 description 1
- GPLTZEMVOCZVAV-UFYCRDLUSA-N Tyr-Tyr-Arg Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)C1=CC=C(O)C=C1 GPLTZEMVOCZVAV-UFYCRDLUSA-N 0.000 description 1
- 102220563326 Tyrosine-protein kinase BTK_P25E_mutation Human genes 0.000 description 1
- 102220573253 Tyrosine-protein kinase Fer_Y73L_mutation Human genes 0.000 description 1
- 102220539699 Ubiquitin-like modifier-activating enzyme 5_Y73D_mutation Human genes 0.000 description 1
- 102220616530 Uncharacterized protein C19orf84_N75E_mutation Human genes 0.000 description 1
- 102220512942 Uncharacterized protein KIAA0087_S85N_mutation Human genes 0.000 description 1
- 102220476885 Vacuolar protein sorting-associated protein 16 homolog_K54D_mutation Human genes 0.000 description 1
- VMRFIKXKOFNMHW-GUBZILKMSA-N Val-Arg-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CO)C(=O)O)N VMRFIKXKOFNMHW-GUBZILKMSA-N 0.000 description 1
- ISERLACIZUGCDX-ZKWXMUAHSA-N Val-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N ISERLACIZUGCDX-ZKWXMUAHSA-N 0.000 description 1
- XBJKAZATRJBDCU-GUBZILKMSA-N Val-Pro-Ala Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O XBJKAZATRJBDCU-GUBZILKMSA-N 0.000 description 1
- YTNGABPUXFEOGU-SRVKXCTJSA-N Val-Pro-Arg Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O YTNGABPUXFEOGU-SRVKXCTJSA-N 0.000 description 1
- USLVEJAHTBLSIL-CYDGBPFRSA-N Val-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)C(C)C USLVEJAHTBLSIL-CYDGBPFRSA-N 0.000 description 1
- LTTQCQRTSHJPPL-ZKWXMUAHSA-N Val-Ser-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)O)C(=O)O)N LTTQCQRTSHJPPL-ZKWXMUAHSA-N 0.000 description 1
- GBIUHAYJGWVNLN-UHFFFAOYSA-N Val-Ser-Pro Natural products CC(C)C(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O GBIUHAYJGWVNLN-UHFFFAOYSA-N 0.000 description 1
- NZYNRRGJJVSSTJ-GUBZILKMSA-N Val-Ser-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O NZYNRRGJJVSSTJ-GUBZILKMSA-N 0.000 description 1
- DLRZGNXCXUGIDG-KKHAAJSZSA-N Val-Thr-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O DLRZGNXCXUGIDG-KKHAAJSZSA-N 0.000 description 1
- SSKKGOWRPNIVDW-AVGNSLFASA-N Val-Val-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N SSKKGOWRPNIVDW-AVGNSLFASA-N 0.000 description 1
- WBPFYNYTYASCQP-CYDGBPFRSA-N Val-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C(C)C)N WBPFYNYTYASCQP-CYDGBPFRSA-N 0.000 description 1
- 102220634597 Vitamin K epoxide reductase complex subunit 1_R35P_mutation Human genes 0.000 description 1
- 102220514519 Vitronectin_T69E_mutation Human genes 0.000 description 1
- 102220514520 Vitronectin_T76E_mutation Human genes 0.000 description 1
- 102220479618 Voltage-dependent L-type calcium channel subunit beta-2_N40A_mutation Human genes 0.000 description 1
- 102220470257 Voltage-dependent L-type calcium channel subunit beta-2_P25V_mutation Human genes 0.000 description 1
- 102220470309 Voltage-dependent L-type calcium channel subunit beta-2_R30Q_mutation Human genes 0.000 description 1
- 102220469732 Voltage-dependent L-type calcium channel subunit beta-2_S53A_mutation Human genes 0.000 description 1
- 102220469724 Voltage-dependent L-type calcium channel subunit beta-2_S53D_mutation Human genes 0.000 description 1
- 102220469723 Voltage-dependent L-type calcium channel subunit beta-2_S53K_mutation Human genes 0.000 description 1
- 102220546632 Voltage-dependent L-type calcium channel subunit beta-2_S55N_mutation Human genes 0.000 description 1
- 238000002441 X-ray diffraction Methods 0.000 description 1
- 102220636115 Zinc finger and BTB domain-containing protein 34_G46R_mutation Human genes 0.000 description 1
- LIPOUNRJVLNBCD-UHFFFAOYSA-N acetyl dihydrogen phosphate Chemical compound CC(=O)OP(O)(O)=O LIPOUNRJVLNBCD-UHFFFAOYSA-N 0.000 description 1
- 150000007513 acids Chemical class 0.000 description 1
- 102000019997 adhesion receptor Human genes 0.000 description 1
- 108010013985 adhesion receptor Proteins 0.000 description 1
- 238000003314 affinity selection Methods 0.000 description 1
- 239000011543 agarose gel Substances 0.000 description 1
- 239000000910 agglutinin Substances 0.000 description 1
- 230000002776 aggregation Effects 0.000 description 1
- 238000004220 aggregation Methods 0.000 description 1
- 238000012867 alanine scanning Methods 0.000 description 1
- 108010087924 alanylproline Proteins 0.000 description 1
- 125000001931 aliphatic group Chemical group 0.000 description 1
- 125000003368 amide group Chemical group 0.000 description 1
- 229960000723 ampicillin Drugs 0.000 description 1
- 238000010171 animal model Methods 0.000 description 1
- 230000001745 anti-biotin effect Effects 0.000 description 1
- 230000009833 antibody interaction Effects 0.000 description 1
- 230000006907 apoptotic process Effects 0.000 description 1
- 101150042295 arfA gene Proteins 0.000 description 1
- 108010036533 arginylvaline Proteins 0.000 description 1
- 125000006615 aromatic heterocyclic group Chemical group 0.000 description 1
- 230000004888 barrier function Effects 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 238000007622 bioinformatic analysis Methods 0.000 description 1
- 238000003766 bioinformatics method Methods 0.000 description 1
- 230000004071 biological effect Effects 0.000 description 1
- 238000009835 boiling Methods 0.000 description 1
- KGBXLFKZBHKPEV-UHFFFAOYSA-N boric acid Chemical compound OB(O)O KGBXLFKZBHKPEV-UHFFFAOYSA-N 0.000 description 1
- 239000004327 boric acid Substances 0.000 description 1
- 125000001246 bromo group Chemical group Br* 0.000 description 1
- 102220415904 c.104G>A Human genes 0.000 description 1
- 102220349968 c.110C>T Human genes 0.000 description 1
- 102220131335 c.110C>T Human genes 0.000 description 1
- 102220355161 c.110G>A Human genes 0.000 description 1
- 102220385477 c.113A>G Human genes 0.000 description 1
- 102220358890 c.116C>A Human genes 0.000 description 1
- 102200108120 c.116C>T Human genes 0.000 description 1
- 102220350010 c.119C>A Human genes 0.000 description 1
- 102220356527 c.122C>A Human genes 0.000 description 1
- 102220362554 c.127A>G Human genes 0.000 description 1
- 102220361798 c.133C>A Human genes 0.000 description 1
- 102220363868 c.134A>T Human genes 0.000 description 1
- 102220360781 c.158C>A Human genes 0.000 description 1
- 102220415795 c.161A>G Human genes 0.000 description 1
- 102220366036 c.196G>T Human genes 0.000 description 1
- 102220359241 c.211A>C Human genes 0.000 description 1
- 102220370165 c.214T>C Human genes 0.000 description 1
- 102220350260 c.223G>C Human genes 0.000 description 1
- 102220407719 c.224T>A Human genes 0.000 description 1
- 102220422268 c.233G>A Human genes 0.000 description 1
- 102220357741 c.234G>C Human genes 0.000 description 1
- 102220346653 c.235A>G Human genes 0.000 description 1
- 102200089577 c.239G>T Human genes 0.000 description 1
- 102220405661 c.249G>C Human genes 0.000 description 1
- 102220354772 c.254G>T Human genes 0.000 description 1
- 102220351961 c.43G>T Human genes 0.000 description 1
- 102220363431 c.52G>A Human genes 0.000 description 1
- 102220414042 c.68A>G Human genes 0.000 description 1
- 102220353639 c.76T>C Human genes 0.000 description 1
- 102220362819 c.77C>T Human genes 0.000 description 1
- 102220363778 c.97C>G Human genes 0.000 description 1
- 102220347737 c.98G>T Human genes 0.000 description 1
- 239000001201 calcium disodium ethylene diamine tetra-acetate Substances 0.000 description 1
- 244000309466 calf Species 0.000 description 1
- 201000011510 cancer Diseases 0.000 description 1
- 150000001718 carbodiimides Chemical class 0.000 description 1
- 150000001720 carbohydrates Chemical class 0.000 description 1
- 235000014633 carbohydrates Nutrition 0.000 description 1
- 125000002915 carbonyl group Chemical group [*:2]C([*:1])=O 0.000 description 1
- 230000003915 cell function Effects 0.000 description 1
- 239000006285 cell suspension Substances 0.000 description 1
- 230000001413 cellular effect Effects 0.000 description 1
- 230000033077 cellular process Effects 0.000 description 1
- 238000005119 centrifugation Methods 0.000 description 1
- 238000012512 characterization method Methods 0.000 description 1
- 230000000973 chemotherapeutic effect Effects 0.000 description 1
- 238000013377 clone selection method Methods 0.000 description 1
- 230000000052 comparative effect Effects 0.000 description 1
- 238000012790 confirmation Methods 0.000 description 1
- 230000001276 controlling effect Effects 0.000 description 1
- 239000013256 coordination polymer Substances 0.000 description 1
- 230000008878 coupling Effects 0.000 description 1
- 239000007822 coupling agent Substances 0.000 description 1
- 238000010168 coupling process Methods 0.000 description 1
- 238000005859 coupling reaction Methods 0.000 description 1
- 239000002577 cryoprotective agent Substances 0.000 description 1
- 239000011549 crystallization solution Substances 0.000 description 1
- 238000002447 crystallographic data Methods 0.000 description 1
- 239000012228 culture supernatant Substances 0.000 description 1
- 238000012258 culturing Methods 0.000 description 1
- 210000000805 cytoplasm Anatomy 0.000 description 1
- 210000004292 cytoskeleton Anatomy 0.000 description 1
- 230000034994 death Effects 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 230000001627 detrimental effect Effects 0.000 description 1
- 229950006137 dexfosfoserine Drugs 0.000 description 1
- 238000002050 diffraction method Methods 0.000 description 1
- 238000009792 diffusion process Methods 0.000 description 1
- 239000000539 dimer Substances 0.000 description 1
- WDRWZVWLVBXVOI-QTNFYWBSSA-L dipotassium;(2s)-2-aminopentanedioate Chemical compound [K+].[K+].[O-]C(=O)[C@@H](N)CCC([O-])=O WDRWZVWLVBXVOI-QTNFYWBSSA-L 0.000 description 1
- 239000006185 dispersion Substances 0.000 description 1
- 231100000673 dose–response relationship Toxicity 0.000 description 1
- 238000009510 drug design Methods 0.000 description 1
- 239000000975 dye Substances 0.000 description 1
- 238000001962 electrophoresis Methods 0.000 description 1
- 238000004520 electroporation Methods 0.000 description 1
- 238000010828 elution Methods 0.000 description 1
- ZMMJGEGLRURXTF-UHFFFAOYSA-N ethidium bromide Chemical compound [Br-].C12=CC(N)=CC=C2C2=CC=C(N)C=C2[N+](CC)=C1C1=CC=CC=C1 ZMMJGEGLRURXTF-UHFFFAOYSA-N 0.000 description 1
- 229960005542 ethidium bromide Drugs 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 230000007717 exclusion Effects 0.000 description 1
- 230000002349 favourable effect Effects 0.000 description 1
- 239000012530 fluid Substances 0.000 description 1
- VVIAGPKUTFNRDU-ABLWVSNPSA-N folinic acid Chemical compound C1NC=2NC(N)=NC(=O)C=2N(C=O)C1CNC1=CC=C(C(=O)N[C@@H](CCC(O)=O)C(O)=O)C=C1 VVIAGPKUTFNRDU-ABLWVSNPSA-N 0.000 description 1
- 235000008191 folinic acid Nutrition 0.000 description 1
- 239000011672 folinic acid Substances 0.000 description 1
- 238000002523 gel filtration Methods 0.000 description 1
- 230000002068 genetic effect Effects 0.000 description 1
- 239000011521 glass Substances 0.000 description 1
- 235000013922 glutamic acid Nutrition 0.000 description 1
- 239000004220 glutamic acid Substances 0.000 description 1
- 108010078144 glutaminyl-glycine Proteins 0.000 description 1
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 1
- 229960002897 heparin Drugs 0.000 description 1
- 229920000669 heparin Polymers 0.000 description 1
- 239000000833 heterodimer Substances 0.000 description 1
- 238000000990 heteronuclear single quantum coherence spectrum Methods 0.000 description 1
- 108010040030 histidinoalanine Proteins 0.000 description 1
- 108010025306 histidylleucine Proteins 0.000 description 1
- 239000000710 homodimer Substances 0.000 description 1
- 238000009396 hybridization Methods 0.000 description 1
- 239000000852 hydrogen donor Substances 0.000 description 1
- GPRLSGONYQIRFK-UHFFFAOYSA-N hydron Chemical compound [H+] GPRLSGONYQIRFK-UHFFFAOYSA-N 0.000 description 1
- 230000005661 hydrophobic surface Effects 0.000 description 1
- 125000002887 hydroxy group Chemical group [H]O* 0.000 description 1
- 125000002883 imidazolyl group Chemical group 0.000 description 1
- 230000002163 immunogen Effects 0.000 description 1
- 230000009851 immunogenic response Effects 0.000 description 1
- 238000005462 in vivo assay Methods 0.000 description 1
- 238000012750 in vivo screening Methods 0.000 description 1
- 238000010348 incorporation Methods 0.000 description 1
- 238000002347 injection Methods 0.000 description 1
- 239000007924 injection Substances 0.000 description 1
- 229940030980 inova Drugs 0.000 description 1
- 238000007689 inspection Methods 0.000 description 1
- 238000004573 interface analysis Methods 0.000 description 1
- 239000004313 iron ammonium citrate Substances 0.000 description 1
- JEIPFZHSYJVQDO-UHFFFAOYSA-N iron(III) oxide Inorganic materials O=[Fe]O[Fe]=O JEIPFZHSYJVQDO-UHFFFAOYSA-N 0.000 description 1
- YOBAEOGBNPPUQV-UHFFFAOYSA-N iron;trihydrate Chemical compound O.O.O.[Fe].[Fe] YOBAEOGBNPPUQV-UHFFFAOYSA-N 0.000 description 1
- QRXWMOHMRWLFEY-UHFFFAOYSA-N isoniazide Chemical compound NNC(=O)C1=CC=NC=C1 QRXWMOHMRWLFEY-UHFFFAOYSA-N 0.000 description 1
- 125000001909 leucine group Chemical group [H]N(*)C(C(*)=O)C([H])([H])C(C([H])([H])[H])C([H])([H])[H] 0.000 description 1
- 229960001691 leucovorin Drugs 0.000 description 1
- 230000029226 lipidation Effects 0.000 description 1
- 150000002632 lipids Chemical class 0.000 description 1
- 239000007788 liquid Substances 0.000 description 1
- 238000009630 liquid culture Methods 0.000 description 1
- 238000011068 loading method Methods 0.000 description 1
- 229920002521 macromolecule Polymers 0.000 description 1
- 239000006249 magnetic particle Substances 0.000 description 1
- 230000014759 maintenance of location Effects 0.000 description 1
- 239000000463 material Substances 0.000 description 1
- 239000012528 membrane Substances 0.000 description 1
- 229910052751 metal Inorganic materials 0.000 description 1
- 239000002184 metal Substances 0.000 description 1
- 239000002991 molded plastic Substances 0.000 description 1
- 230000009456 molecular mechanism Effects 0.000 description 1
- 239000001788 mono and diglycerides of fatty acids Substances 0.000 description 1
- 235000013919 monopotassium glutamate Nutrition 0.000 description 1
- 239000012452 mother liquor Substances 0.000 description 1
- 238000002887 multiple sequence alignment Methods 0.000 description 1
- 231100000219 mutagenic Toxicity 0.000 description 1
- 230000003505 mutagenic effect Effects 0.000 description 1
- 201000000050 myeloid neoplasm Diseases 0.000 description 1
- 210000004897 n-terminal region Anatomy 0.000 description 1
- 238000001208 nuclear magnetic resonance pulse sequence Methods 0.000 description 1
- 230000025308 nuclear transport Effects 0.000 description 1
- 238000001668 nucleic acid synthesis Methods 0.000 description 1
- 101150087557 omcB gene Proteins 0.000 description 1
- 101150115693 ompA gene Proteins 0.000 description 1
- 239000011022 opal Substances 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 210000001672 ovary Anatomy 0.000 description 1
- 238000004091 panning Methods 0.000 description 1
- 230000005298 paramagnetic effect Effects 0.000 description 1
- 230000037361 pathway Effects 0.000 description 1
- 101150040383 pel2 gene Proteins 0.000 description 1
- 101150050446 pelB gene Proteins 0.000 description 1
- 239000008188 pellet Substances 0.000 description 1
- 210000001322 periplasm Anatomy 0.000 description 1
- 102000013415 peroxidase activity proteins Human genes 0.000 description 1
- 108040007629 peroxidase activity proteins Proteins 0.000 description 1
- ISWSIDIOOBJBQZ-UHFFFAOYSA-N phenol group Chemical group C1(=CC=CC=C1)O ISWSIDIOOBJBQZ-UHFFFAOYSA-N 0.000 description 1
- 101150009573 phoA gene Proteins 0.000 description 1
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 1
- 239000010452 phosphate Substances 0.000 description 1
- BZQFBWGGLXLEPQ-REOHCLBHSA-N phosphoserine Chemical compound OC(=O)[C@@H](N)COP(O)(O)=O BZQFBWGGLXLEPQ-REOHCLBHSA-N 0.000 description 1
- USRGIUJOYOXOQJ-GBXIJSLDSA-N phosphothreonine Chemical compound OP(=O)(O)O[C@H](C)[C@H](N)C(O)=O USRGIUJOYOXOQJ-GBXIJSLDSA-N 0.000 description 1
- 238000007747 plating Methods 0.000 description 1
- 102000015585 poly-pyrimidine tract binding protein Human genes 0.000 description 1
- 108010054442 polyalanine Proteins 0.000 description 1
- 239000001955 polyclycerol esters of fatty acids Substances 0.000 description 1
- 229920000573 polyethylene Polymers 0.000 description 1
- 229920000642 polymer Polymers 0.000 description 1
- 108010033356 polyvaline Proteins 0.000 description 1
- 230000004481 post-translational protein modification Effects 0.000 description 1
- 238000002360 preparation method Methods 0.000 description 1
- 108010029020 prolylglycine Proteins 0.000 description 1
- 230000000644 propagated effect Effects 0.000 description 1
- 238000000159 protein binding assay Methods 0.000 description 1
- 108020003519 protein disulfide isomerase Proteins 0.000 description 1
- 230000004853 protein function Effects 0.000 description 1
- 230000002797 proteolythic effect Effects 0.000 description 1
- 230000005855 radiation Effects 0.000 description 1
- 230000002829 reductive effect Effects 0.000 description 1
- 230000001105 regulatory effect Effects 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 230000008672 reprogramming Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 230000000717 retained effect Effects 0.000 description 1
- 210000003705 ribosome Anatomy 0.000 description 1
- 102200139519 rs104893652 Human genes 0.000 description 1
- 102200139525 rs104893653 Human genes 0.000 description 1
- 102220004796 rs104893939 Human genes 0.000 description 1
- 102200016464 rs104894278 Human genes 0.000 description 1
- 102200145341 rs104894403 Human genes 0.000 description 1
- 102200083940 rs104894416 Human genes 0.000 description 1
- 102200108092 rs104894539 Human genes 0.000 description 1
- 102200108123 rs104894540 Human genes 0.000 description 1
- 102200104421 rs104894658 Human genes 0.000 description 1
- 102200153349 rs104894823 Human genes 0.000 description 1
- 102200078051 rs1057517661 Human genes 0.000 description 1
- 102200011002 rs1057517674 Human genes 0.000 description 1
- 102220195756 rs1057518124 Human genes 0.000 description 1
- 102200065164 rs1057519025 Human genes 0.000 description 1
- 102220196890 rs1057519097 Human genes 0.000 description 1
- 102220198443 rs1057520007 Human genes 0.000 description 1
- 102220226093 rs1060502034 Human genes 0.000 description 1
- 102220224023 rs1060502096 Human genes 0.000 description 1
- 102220223203 rs1060502131 Human genes 0.000 description 1
- 102220257837 rs1060503556 Human genes 0.000 description 1
- 102220214967 rs1060503560 Human genes 0.000 description 1
- 102200012531 rs111033829 Human genes 0.000 description 1
- 102220003133 rs116840793 Human genes 0.000 description 1
- 102200148733 rs116840794 Human genes 0.000 description 1
- 102220245580 rs1178507750 Human genes 0.000 description 1
- 102220017540 rs118203356 Human genes 0.000 description 1
- 102200042452 rs121909605 Human genes 0.000 description 1
- 102200042453 rs121909606 Human genes 0.000 description 1
- 102200042538 rs121909607 Human genes 0.000 description 1
- 102200055059 rs121909790 Human genes 0.000 description 1
- 102200055075 rs121909797 Human genes 0.000 description 1
- 102200015465 rs121912304 Human genes 0.000 description 1
- 102200000822 rs121912606 Human genes 0.000 description 1
- 102200044889 rs121913413 Human genes 0.000 description 1
- 102220198148 rs121913413 Human genes 0.000 description 1
- 102200081460 rs121913596 Human genes 0.000 description 1
- 102220253683 rs121913596 Human genes 0.000 description 1
- 102200115907 rs121918081 Human genes 0.000 description 1
- 102200012887 rs121918638 Human genes 0.000 description 1
- 102220297584 rs1232229010 Human genes 0.000 description 1
- 102220315619 rs1247249384 Human genes 0.000 description 1
- 102200024635 rs132630289 Human genes 0.000 description 1
- 102220277378 rs1332272921 Human genes 0.000 description 1
- 102220344489 rs1332969540 Human genes 0.000 description 1
- 102220317889 rs1333247214 Human genes 0.000 description 1
- 102200080930 rs137852221 Human genes 0.000 description 1
- 102200090484 rs137852478 Human genes 0.000 description 1
- 102200093466 rs137853247 Human genes 0.000 description 1
- 102220272013 rs1381395730 Human genes 0.000 description 1
- 102220311640 rs1382779104 Human genes 0.000 description 1
- 102220011102 rs139592595 Human genes 0.000 description 1
- 102220103676 rs141319800 Human genes 0.000 description 1
- 102220251543 rs141774369 Human genes 0.000 description 1
- 102220014106 rs142059019 Human genes 0.000 description 1
- 102220309147 rs1431217461 Human genes 0.000 description 1
- 102220147136 rs143999954 Human genes 0.000 description 1
- 102220342029 rs144089645 Human genes 0.000 description 1
- 102220224385 rs144221071 Human genes 0.000 description 1
- 102220176396 rs146274964 Human genes 0.000 description 1
- 102220181032 rs146698039 Human genes 0.000 description 1
- 102220322567 rs147013097 Human genes 0.000 description 1
- 102220014330 rs147406419 Human genes 0.000 description 1
- 102220219968 rs148269473 Human genes 0.000 description 1
- 102220072180 rs148562366 Human genes 0.000 description 1
- 102220294274 rs149101812 Human genes 0.000 description 1
- 102220095974 rs149101834 Human genes 0.000 description 1
- 102200038856 rs150565592 Human genes 0.000 description 1
- 102220254393 rs150888506 Human genes 0.000 description 1
- 102220247071 rs1553259811 Human genes 0.000 description 1
- 102220276981 rs1553350100 Human genes 0.000 description 1
- 102220257199 rs1553408408 Human genes 0.000 description 1
- 102220277542 rs1553619417 Human genes 0.000 description 1
- 102220277593 rs1553640212 Human genes 0.000 description 1
- 102220317537 rs1553646295 Human genes 0.000 description 1
- 102200072476 rs1553721236 Human genes 0.000 description 1
- 102220262380 rs1554458350 Human genes 0.000 description 1
- 102220308430 rs1554488388 Human genes 0.000 description 1
- 102220244174 rs1554813788 Human genes 0.000 description 1
- 102220268351 rs1555201383 Human genes 0.000 description 1
- 102220280430 rs1555280111 Human genes 0.000 description 1
- 102220268461 rs1555280330 Human genes 0.000 description 1
- 102220280407 rs1555280344 Human genes 0.000 description 1
- 102220280978 rs1555280382 Human genes 0.000 description 1
- 102220285635 rs1555280395 Human genes 0.000 description 1
- 102220283700 rs1555526664 Human genes 0.000 description 1
- 102220283702 rs1555526673 Human genes 0.000 description 1
- 102220287616 rs1555570422 Human genes 0.000 description 1
- 102220328262 rs1555575436 Human genes 0.000 description 1
- 102220276093 rs1555932427 Human genes 0.000 description 1
- 102200086451 rs16948978 Human genes 0.000 description 1
- 102200118166 rs16951438 Human genes 0.000 description 1
- 102200038988 rs17313469 Human genes 0.000 description 1
- 102220250322 rs181287533 Human genes 0.000 description 1
- 102200120791 rs183974372 Human genes 0.000 description 1
- 102200079914 rs193302884 Human genes 0.000 description 1
- 102220050493 rs193921030 Human genes 0.000 description 1
- 102200097288 rs199472827 Human genes 0.000 description 1
- 102200097057 rs199472848 Human genes 0.000 description 1
- 102200133005 rs199473360 Human genes 0.000 description 1
- 102220008235 rs199476326 Human genes 0.000 description 1
- 102220008236 rs199476327 Human genes 0.000 description 1
- 102220008237 rs199476328 Human genes 0.000 description 1
- 102220008239 rs199476334 Human genes 0.000 description 1
- 102220031772 rs199501657 Human genes 0.000 description 1
- 102220101231 rs199976372 Human genes 0.000 description 1
- 102220265047 rs199976372 Human genes 0.000 description 1
- 102220130942 rs200176406 Human genes 0.000 description 1
- 102220147072 rs200207198 Human genes 0.000 description 1
- 102220340189 rs200279736 Human genes 0.000 description 1
- 102220041350 rs200343185 Human genes 0.000 description 1
- 102220270193 rs200548009 Human genes 0.000 description 1
- 102220127772 rs201176284 Human genes 0.000 description 1
- 102200114508 rs201382018 Human genes 0.000 description 1
- 102200114509 rs201382018 Human genes 0.000 description 1
- 102200108155 rs201717599 Human genes 0.000 description 1
- 102200108158 rs201717599 Human genes 0.000 description 1
- 102220144360 rs201940585 Human genes 0.000 description 1
- 102200052146 rs2230178 Human genes 0.000 description 1
- 102200158159 rs2285944 Human genes 0.000 description 1
- 102220024419 rs267607524 Human genes 0.000 description 1
- 102220021176 rs273898675 Human genes 0.000 description 1
- 102220245302 rs273898675 Human genes 0.000 description 1
- 102220267747 rs276174818 Human genes 0.000 description 1
- 102200067434 rs281864823 Human genes 0.000 description 1
- 102220005369 rs281864859 Human genes 0.000 description 1
- 102220011222 rs281865133 Human genes 0.000 description 1
- 102200115837 rs28933979 Human genes 0.000 description 1
- 102220011282 rs312262813 Human genes 0.000 description 1
- 102220005397 rs33926206 Human genes 0.000 description 1
- 102220005270 rs33932981 Human genes 0.000 description 1
- 102220005286 rs33932981 Human genes 0.000 description 1
- 102220005320 rs33945546 Human genes 0.000 description 1
- 102200158856 rs33947112 Human genes 0.000 description 1
- 102220010318 rs33950507 Human genes 0.000 description 1
- 102220005475 rs33960522 Human genes 0.000 description 1
- 102220005551 rs33960522 Human genes 0.000 description 1
- 102220005403 rs33964507 Human genes 0.000 description 1
- 102220005449 rs33977363 Human genes 0.000 description 1
- 102220005473 rs33977363 Human genes 0.000 description 1
- 102220005501 rs33977363 Human genes 0.000 description 1
- 102220005258 rs33990858 Human genes 0.000 description 1
- 102220005395 rs33991223 Human genes 0.000 description 1
- 102220005454 rs33991223 Human genes 0.000 description 1
- 102220259718 rs34120878 Human genes 0.000 description 1
- 102220005409 rs34879587 Human genes 0.000 description 1
- 102200082903 rs35140348 Human genes 0.000 description 1
- 102220108975 rs35196441 Human genes 0.000 description 1
- 102200089531 rs35460768 Human genes 0.000 description 1
- 102200082934 rs35474880 Human genes 0.000 description 1
- 102220119638 rs35689779 Human genes 0.000 description 1
- 102220005466 rs35816645 Human genes 0.000 description 1
- 102220005420 rs35934411 Human genes 0.000 description 1
- 102200018234 rs36035373 Human genes 0.000 description 1
- 102220237891 rs367797765 Human genes 0.000 description 1
- 102220297573 rs368094683 Human genes 0.000 description 1
- 102220293228 rs368152787 Human genes 0.000 description 1
- 102200067724 rs369125667 Human genes 0.000 description 1
- 102220278142 rs369819304 Human genes 0.000 description 1
- 102220232156 rs370120266 Human genes 0.000 description 1
- 102220040126 rs371657037 Human genes 0.000 description 1
- 102220267156 rs371871714 Human genes 0.000 description 1
- 102220031036 rs3740912 Human genes 0.000 description 1
- 102220031426 rs374523166 Human genes 0.000 description 1
- 102220065897 rs374826256 Human genes 0.000 description 1
- 102220137029 rs376615998 Human genes 0.000 description 1
- 102220198468 rs377034865 Human genes 0.000 description 1
- 102220028352 rs386352374 Human genes 0.000 description 1
- 102200050844 rs386833793 Human genes 0.000 description 1
- 102200004091 rs387906857 Human genes 0.000 description 1
- 102200071874 rs387907029 Human genes 0.000 description 1
- 102220019639 rs397507881 Human genes 0.000 description 1
- 102220089991 rs397507881 Human genes 0.000 description 1
- 102220019720 rs397507902 Human genes 0.000 description 1
- 102220020469 rs397508332 Human genes 0.000 description 1
- 102220020471 rs397508335 Human genes 0.000 description 1
- 102220022237 rs397509313 Human genes 0.000 description 1
- 102200109794 rs398123562 Human genes 0.000 description 1
- 102220029647 rs398123613 Human genes 0.000 description 1
- 102220074096 rs45471099 Human genes 0.000 description 1
- 102220257861 rs45510294 Human genes 0.000 description 1
- 102200089548 rs5030802 Human genes 0.000 description 1
- 102200108161 rs534447939 Human genes 0.000 description 1
- 102220272203 rs534447939 Human genes 0.000 description 1
- 102220171407 rs540737897 Human genes 0.000 description 1
- 102220086571 rs544441867 Human genes 0.000 description 1
- 102220316960 rs553076085 Human genes 0.000 description 1
- 102200069083 rs55826713 Human genes 0.000 description 1
- 102220102325 rs564013964 Human genes 0.000 description 1
- 102220292582 rs567335712 Human genes 0.000 description 1
- 102220215485 rs572063023 Human genes 0.000 description 1
- 102220258648 rs572063023 Human genes 0.000 description 1
- 102200076366 rs57590980 Human genes 0.000 description 1
- 102200076358 rs58732244 Human genes 0.000 description 1
- 102220253434 rs587710840 Human genes 0.000 description 1
- 102220047071 rs587777891 Human genes 0.000 description 1
- 102220041125 rs587778627 Human genes 0.000 description 1
- 102220277390 rs587779008 Human genes 0.000 description 1
- 102220036771 rs587780037 Human genes 0.000 description 1
- 102220044582 rs587781412 Human genes 0.000 description 1
- 102220227724 rs587781412 Human genes 0.000 description 1
- 102220044627 rs587781441 Human genes 0.000 description 1
- 102220045670 rs587782296 Human genes 0.000 description 1
- 102220046369 rs587782872 Human genes 0.000 description 1
- 102200076359 rs59285727 Human genes 0.000 description 1
- 102200076360 rs59285727 Human genes 0.000 description 1
- 102200076361 rs59285727 Human genes 0.000 description 1
- 102200081810 rs5952410 Human genes 0.000 description 1
- 102200076355 rs59793293 Human genes 0.000 description 1
- 102200076362 rs59793293 Human genes 0.000 description 1
- 102220024416 rs59793293 Human genes 0.000 description 1
- 102220055234 rs6045440 Human genes 0.000 description 1
- 102200030476 rs615942 Human genes 0.000 description 1
- 102220046717 rs61754444 Human genes 0.000 description 1
- 102200004972 rs62638629 Human genes 0.000 description 1
- 102200004971 rs62638630 Human genes 0.000 description 1
- 102200158819 rs63751148 Human genes 0.000 description 1
- 102200101769 rs67939655 Human genes 0.000 description 1
- 102220032013 rs67939655 Human genes 0.000 description 1
- 102220032142 rs68031618 Human genes 0.000 description 1
- 102200145330 rs72474224 Human genes 0.000 description 1
- 102220054109 rs72474224 Human genes 0.000 description 1
- 102220032016 rs72554316 Human genes 0.000 description 1
- 102200101767 rs72554331 Human genes 0.000 description 1
- 102220053527 rs727504249 Human genes 0.000 description 1
- 102200013377 rs730880259 Human genes 0.000 description 1
- 102220056927 rs730880580 Human genes 0.000 description 1
- 102220057219 rs730881152 Human genes 0.000 description 1
- 102220057226 rs730881156 Human genes 0.000 description 1
- 102220270180 rs730881156 Human genes 0.000 description 1
- 102220058298 rs730881402 Human genes 0.000 description 1
- 102220057450 rs730881811 Human genes 0.000 description 1
- 102220057451 rs730881811 Human genes 0.000 description 1
- 102220112904 rs73928330 Human genes 0.000 description 1
- 102220224782 rs74315446 Human genes 0.000 description 1
- 102220215121 rs745338799 Human genes 0.000 description 1
- 102220072464 rs747347778 Human genes 0.000 description 1
- 102220065721 rs750676165 Human genes 0.000 description 1
- 102220224416 rs750825686 Human genes 0.000 description 1
- 102220329859 rs750825686 Human genes 0.000 description 1
- 102220294053 rs751463286 Human genes 0.000 description 1
- 102220274005 rs751542188 Human genes 0.000 description 1
- 102220097966 rs753204096 Human genes 0.000 description 1
- 102220067424 rs757120802 Human genes 0.000 description 1
- 102220146061 rs758345818 Human genes 0.000 description 1
- 102200063470 rs758432471 Human genes 0.000 description 1
- 102220123717 rs759057581 Human genes 0.000 description 1
- 102220095970 rs761681478 Human genes 0.000 description 1
- 102220097971 rs764750609 Human genes 0.000 description 1
- 102220330793 rs764750609 Human genes 0.000 description 1
- 102220224491 rs765799649 Human genes 0.000 description 1
- 102220118568 rs76784312 Human genes 0.000 description 1
- 102220319377 rs769687105 Human genes 0.000 description 1
- 102220097964 rs769819013 Human genes 0.000 description 1
- 102220092171 rs770694213 Human genes 0.000 description 1
- 102220260401 rs77196282 Human genes 0.000 description 1
- 102220095999 rs774521832 Human genes 0.000 description 1
- 102200055156 rs775863165 Human genes 0.000 description 1
- 102220282270 rs776471760 Human genes 0.000 description 1
- 102220094403 rs776643257 Human genes 0.000 description 1
- 102220143207 rs779977931 Human genes 0.000 description 1
- 102200011314 rs782290433 Human genes 0.000 description 1
- 102220062178 rs786202417 Human genes 0.000 description 1
- 102220059220 rs786202787 Human genes 0.000 description 1
- 102220062499 rs786204003 Human genes 0.000 description 1
- 102200055157 rs786205675 Human genes 0.000 description 1
- 102220065931 rs794726938 Human genes 0.000 description 1
- 102220068510 rs794727508 Human genes 0.000 description 1
- 102220072092 rs794728290 Human genes 0.000 description 1
- 102220075246 rs796052426 Human genes 0.000 description 1
- 102200076449 rs797044573 Human genes 0.000 description 1
- 102220077288 rs797045007 Human genes 0.000 description 1
- 102200115851 rs79977247 Human genes 0.000 description 1
- 102200027014 rs80356663 Human genes 0.000 description 1
- 102220008961 rs80356663 Human genes 0.000 description 1
- 102220020884 rs80356880 Human genes 0.000 description 1
- 102220020885 rs80356880 Human genes 0.000 description 1
- 102220016329 rs80358622 Human genes 0.000 description 1
- 102220082602 rs863224271 Human genes 0.000 description 1
- 102220083084 rs863224616 Human genes 0.000 description 1
- 102220083085 rs863224620 Human genes 0.000 description 1
- 102220082984 rs863224701 Human genes 0.000 description 1
- 102220276603 rs863224701 Human genes 0.000 description 1
- 102220085270 rs864309506 Human genes 0.000 description 1
- 102220086129 rs864622425 Human genes 0.000 description 1
- 102220086323 rs864622699 Human genes 0.000 description 1
- 102220088156 rs869025540 Human genes 0.000 description 1
- 102200063467 rs869312822 Human genes 0.000 description 1
- 102220097785 rs876658527 Human genes 0.000 description 1
- 102220095971 rs876658611 Human genes 0.000 description 1
- 102220095185 rs876660243 Human genes 0.000 description 1
- 102220095194 rs876660470 Human genes 0.000 description 1
- 102220102036 rs878853622 Human genes 0.000 description 1
- 102220099568 rs878853751 Human genes 0.000 description 1
- 102220102836 rs878854688 Human genes 0.000 description 1
- 102220105250 rs879254386 Human genes 0.000 description 1
- 102220114823 rs886038966 Human genes 0.000 description 1
- 102220287154 rs886040428 Human genes 0.000 description 1
- 102220121132 rs886042750 Human genes 0.000 description 1
- 102220122329 rs886043062 Human genes 0.000 description 1
- 102220165594 rs886048231 Human genes 0.000 description 1
- 102220266295 rs886048231 Human genes 0.000 description 1
- 102220192630 rs886057625 Human genes 0.000 description 1
- 102220149728 rs886061035 Human genes 0.000 description 1
- 102220160907 rs886062986 Human genes 0.000 description 1
- 102220258020 rs919338576 Human genes 0.000 description 1
- 102220278166 rs937875134 Human genes 0.000 description 1
- 102200075246 rs974712040 Human genes 0.000 description 1
- 150000003839 salts Chemical class 0.000 description 1
- 239000012723 sample buffer Substances 0.000 description 1
- 238000009738 saturating Methods 0.000 description 1
- 238000007423 screening assay Methods 0.000 description 1
- 238000010845 search algorithm Methods 0.000 description 1
- 238000013207 serial dilution Methods 0.000 description 1
- 230000035939 shock Effects 0.000 description 1
- 238000002922 simulated annealing Methods 0.000 description 1
- IBKZNJXGCYVTBZ-IDBHZBAZSA-M sodium;1-[3-[2-[5-[(3as,4s,6ar)-2-oxo-1,3,3a,4,6,6a-hexahydrothieno[3,4-d]imidazol-4-yl]pentanoylamino]ethyldisulfanyl]propanoyloxy]-2,5-dioxopyrrolidine-3-sulfonate Chemical compound [Na+].O=C1C(S(=O)(=O)[O-])CC(=O)N1OC(=O)CCSSCCNC(=O)CCCC[C@H]1[C@H]2NC(=O)N[C@H]2CS1 IBKZNJXGCYVTBZ-IDBHZBAZSA-M 0.000 description 1
- 239000007790 solid phase Substances 0.000 description 1
- 125000006850 spacer group Chemical group 0.000 description 1
- 229940063675 spermine Drugs 0.000 description 1
- 235000000346 sugar Nutrition 0.000 description 1
- 150000008163 sugars Chemical class 0.000 description 1
- 102220481885 tRNA methyltransferase 10 homolog A_P82Q_mutation Human genes 0.000 description 1
- 102220535570 tRNA wybutosine-synthesizing protein 5_D66A_mutation Human genes 0.000 description 1
- 238000005382 thermal cycling Methods 0.000 description 1
- 125000003396 thiol group Chemical group [H]S* 0.000 description 1
- 108010033670 threonyl-aspartyl-tyrosine Proteins 0.000 description 1
- 230000036962 time dependent Effects 0.000 description 1
- 238000000954 titration curve Methods 0.000 description 1
- 230000009466 transformation Effects 0.000 description 1
- 239000013638 trimer Substances 0.000 description 1
- 125000000430 tryptophan group Chemical group [H]N([H])C(C(=O)O*)C([H])([H])C1=C([H])N([H])C2=C([H])C([H])=C([H])C([H])=C12 0.000 description 1
- 108010073969 valyllysine Proteins 0.000 description 1
- 108010009962 valyltyrosine Proteins 0.000 description 1
- 238000012800 visualization Methods 0.000 description 1
- 238000002424 x-ray crystallography Methods 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/78—Connective tissue peptides, e.g. collagen, elastin, laminin, fibronectin, vitronectin or cold insoluble globulin [CIG]
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K16/00—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies
- C07K16/18—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from animals or humans
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K16/00—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies
- C07K16/40—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against enzymes
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K16/00—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies
- C07K16/42—Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against immunoglobulins
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B40/00—Libraries per se, e.g. arrays, mixtures
- C40B40/04—Libraries containing only organic compounds
- C40B40/10—Libraries containing peptides or polypeptides, or derivatives thereof
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2317/00—Immunoglobulins specific features
- C07K2317/30—Immunoglobulins specific features characterized by aspects of specificity or valency
- C07K2317/34—Identification of a linear epitope shorter than 20 amino acid residues or of a conformational epitope defined by amino acid residues
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2318/00—Antibody mimetics or scaffolds
- C07K2318/20—Antigen-binding scaffold molecules wherein the scaffold is not an immunoglobulin variable region or antibody mimetics
Landscapes
- Chemical & Material Sciences (AREA)
- Health & Medical Sciences (AREA)
- Organic Chemistry (AREA)
- Life Sciences & Earth Sciences (AREA)
- Genetics & Genomics (AREA)
- Molecular Biology (AREA)
- Biochemistry (AREA)
- Medicinal Chemistry (AREA)
- General Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Immunology (AREA)
- Zoology (AREA)
- Engineering & Computer Science (AREA)
- Toxicology (AREA)
- Gastroenterology & Hepatology (AREA)
- Biomedical Technology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biotechnology (AREA)
- General Engineering & Computer Science (AREA)
- Wood Science & Technology (AREA)
- Plant Pathology (AREA)
- Microbiology (AREA)
- Physics & Mathematics (AREA)
- Chemical Kinetics & Catalysis (AREA)
- General Chemical & Material Sciences (AREA)
- Peptides Or Proteins (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
- Preparation Of Compounds By Using Micro-Organisms (AREA)
- Enzymes And Modification Thereof (AREA)
- Investigating Or Analysing Biological Materials (AREA)
- Apparatus Associated With Microorganisms And Enzymes (AREA)
Abstract
The invention described herein discloses a combination of a top and bottom loop binder library using the CD and the FG loops of a number of FnIII domains (e.g., FnIII7, FnIII10 and FnIII14) together with the surface-exposed residues of the beta-sheet. The invention also pertains to a method of forming a library of FnIII domain polypeptides useful in screening for the presence of one or more polypeptides having a selected binding or enzymatic activity.
Description
FIBRONECTIN CRADLE MOLECULES AND LIBRARIES THEREOF
Cross-Reference To Related Applications
[0001] This application is related to U.S. Provisional Patent Application No. 61/369,160 filed on July 30, 2010, U.S. Provisional Patent Application No. 61/369,203 filed on July 30, 2010, U.S. Provisional Patent Application No. 61/369,222 filed on July 30, 2010, U.S. Provisional Patent Application No. 61/474,632 filed on April 12, 2011, and U.S. Provisional Patent Application No. 61/474,648 filed on April 12, 2011. Each of the foregoing applications is hereby incorporated by reference in its entirety.
Statement Of Rights To Inventions Made Under Federally Sponsored Research
[0002] Part of this invention was made with government support under contract GM72688 and U54 GM74946 awarded by the National Institutes of Health to the University of Chicago. The government has certain rights in part of the invention.
Reference to Sequence Listing Submitted Via EFS-Web
[0003] The entire content of the following electronic submission of the sequence listing via the USPTO EFS-WEB server, as authorized and set forth in MPEP 1730 II.B.2(a)(C), is incorporated herein by reference in its entirety for all purposes. The sequence listing is identified on the electronically filed text file as follows:
File Name: 6360921093405eq1ist.txt
Date of Creation: July 25, 2011
Size (bytes): 335,575
Technical Field
[0004] The present application relates to novel Fibronectin Type III domain (FnIII) polypeptides and the methods of making and using such FnIII polypeptides. More specifically, the present invention relates to a library of FnIII polypeptides using the CD and the FG loops of a number of FnIII domains (e.g., FnIII7, FnIII10 and FnIII14) together with the surface exposed residues of the beta-sheet.
Background Art
[0005] Scaffold based binding proteins are becoming legitimate alternatives to antibodies in their ability to bind specific ligand targets. These scaffold binding proteins share the quality of having a stable framework core that can tolerate multiple substitutions in the ligand binding regions. Some scaffold frameworks have immunoglobulin like protein domain architecture with loops extending from a beta sandwich core. A scaffold framework core can be synthetically engineered and used to form a library comprising different sequence variants.
The sequence diversity of such libraries is typically concentrated in the exterior surfaces of the proteins such as loop structures or other exterior surfaces that can serve as ligand binding regions.
[0006] The fibronectin type III domain (FnIII) has been established as an effective non-antibody "alternative" scaffold for the generation of novel binding proteins.
A member of the immunoglobulin superfamily, FnIII has three surface exposed loops at one end of the molecule which are analogous to antibody complementarity determining regions (CDRs).
Engineering strategies using this scaffold are based on combinatorial libraries created by diversifying both the length and amino acid sequence of these surface loops. From such libraries, FnIII variants capable of binding to a target of interest can be isolated using various selection methods. The FnIII scaffold offers many advantages compared to conventional antibodies or fragments thereof because it lacks disulfide bonds, can be readily and highly expressed in bacterial systems, and is relatively small. However, a need exists for improved FnIII
based polypeptides and methods of producing libraries of such polypeptides.
Summary of the Invention
[0007] The present invention is based on the unexpected discovery that modifications to a beta sheet of an FnIII polypeptide, in addition to modifications to at least one loop region of the FnIII based polypeptide, result in an FnIII based binding molecule with improved binding ability for a target molecule. The improved binding is a result of increased surface area available for binding to a target molecule by using amino acid residues in the beta sheet to form part of the binding surface and to bind to a target molecule. Modifications to the beta sheets can also be used to distinguish targets. The invention pertains to modifications in the beta strand and loop of all FnIII molecules, e.g., FnIII7, FnIII10 and FnIII14. In particular, the invention pertains to modifications in F and/or C beta strands and modifications in the FG loop and the CD loop of FnIII molecules, e.g., FnIII7, FnIII10 and FnIII14.
[0008] Accordingly, in one aspect, the invention pertains to an FnIII domain-based cradle polypeptide comprising one or more amino acid substitutions in at least a loop region and at least a non-loop region.
[0009] In some embodiments, the cradle polypeptide may comprise amino acid substitutions in both the beta strands in conjunction with substitutions in the AB loop, the BC loop, the CD
loop, the DE loop, and/or the FG loop of FnIII. In some embodiments the cradle polypeptide may comprise amino acid substitution in beta strand C, beta strand D, beta strand F and/or beta strand G. In some embodiments the cradle polypeptide may comprise one or more amino acid substitutions in two loop regions and/or two non-loop regions, wherein the non-loop regions may be the beta strands C and F, and the loop regions may be the CD and FG
loops. In some embodiments the one or more amino acid substitutions may be introduced to the cradle residues in the beta strands. In some embodiments the cradle polypeptide may further comprise an insertion and/or deletion of at least one amino acid in at least one loop and/or non-loop region.
In some embodiments the cradle polypeptide may further comprise an insertion and/or deletion of at least one amino acid in two loop regions and/or two non-loop regions, wherein the non-loop regions may be the beta strands C and F, and the loop regions may be the CD and FG
loops. In some embodiments the FnIII domain may be the 1st, 2nd, 3rd, 4th, 5th, 6th, 7th, 8th, 9th, 10th, 11th, 12th, 13th, 14th, 15th or 16th FnIII domain of human fibronectin.
In some embodiments, the one or more amino acid substitutions in the non-loop region may not change the structure of the FnIII domain scaffold and/or the shape of the loop regions. In some embodiments, the one or more amino acid substitutions in the non-loop region may exclude the non-cradle residues.
[0010] In some embodiments, loop CD may be about 3-11, about 4-9, or 5 residues in length, wherein loop FG may be about 1-10, 5 or 6 residues in length. Position 1 of the FG loop may be a Gly residue, position 2 may be a Leu, Val, or Ile residue, position 3 may be a charged or polar residue, position 4 may be a Pro residue, position 5 may be a Gly residue, and position 6 may be a polar residue. In some embodiments, positions 3 and/or 5 of the loop may be a Gly residue.
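The positional preferences for a six-residue FG loop can be sketched as a simple check. This is an illustration only: the function name `matches_fg_pattern` is hypothetical, and the residue sets used for "charged or polar" and "polar" are assumed groupings, since the text does not enumerate them.

```python
# Assumed residue classes (one-letter codes); the text does not define them.
CHARGED = set("DEKRH")
POLAR = set("STNQYC")

# Allowed residues at positions 1-6 of a six-residue FG loop, following the text.
FG_PATTERN = [
    set("G"),          # position 1: Gly
    set("LVI"),        # position 2: Leu, Val, or Ile
    CHARGED | POLAR,   # position 3: charged or polar (assumed sets)
    set("P"),          # position 4: Pro
    set("G"),          # position 5: Gly
    POLAR,             # position 6: polar (assumed set)
]


def matches_fg_pattern(loop: str) -> bool:
    """Return True if a six-residue loop satisfies every positional preference."""
    if len(loop) != len(FG_PATTERN):
        return False
    return all(res in allowed for res, allowed in zip(loop.upper(), FG_PATTERN))
```

A check of this kind could be used to filter candidate loop sequences before library construction; the alternative embodiment in which positions 3 and/or 5 are Gly is not modeled here.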
[0011] In some embodiments, the beta strand lengths may be about 6-14, about 8-11, or 9 residues for beta strand C and for beta strand F about 8-13, about 9-11, or 10 residues. In some embodiments, the residue at positions 2, 4, and 6 of the C beta strand may be a hydrophobic residue, and positions 1, 3, 5, and 7-9 of the C beta strand may be altered relative to the wild type sequence, wherein the residue at position 1 of the C beta strand may be selected from the group consisting of Ala, Gly, Pro, Ser, Thr, Asp, Glu, Asn, Gln, His, Lys, and Arg. The residue at position 3 of the C beta strand may be a hydrophobic residue, or may be selected from the group consisting of Ile, Val, Arg, Leu, Thr, Glu, Lys, Ser, Gln, and His.
Position 5, 7, 8, and 9 of the C beta strand may be selected from the group consisting of Ala, Gly, Pro, Ser, Thr, Asp, Glu, Asn, Gln, His, Lys, and Arg.
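To give a sense of scale, the theoretical diversity implied by these C beta strand preferences can be computed as a product of per-position choices. The sketch below assumes a nine-residue strand, holds the hydrophobic positions 2, 4, and 6 fixed, and uses the listed ten-residue group for position 3. These choices and the names `C_STRAND_CHOICES` and `theoretical_diversity` are illustrative assumptions, not a design stated in the text.

```python
from math import prod

# Residue groups as listed in the text for the C beta strand.
GROUP_12 = ["Ala", "Gly", "Pro", "Ser", "Thr", "Asp",
            "Glu", "Asn", "Gln", "His", "Lys", "Arg"]
GROUP_POS3 = ["Ile", "Val", "Arg", "Leu", "Thr",
              "Glu", "Lys", "Ser", "Gln", "His"]

# Varied positions only; positions 2, 4, and 6 are assumed to stay fixed.
C_STRAND_CHOICES = {
    1: GROUP_12,
    3: GROUP_POS3,
    5: GROUP_12,
    7: GROUP_12,
    8: GROUP_12,
    9: GROUP_12,
}


def theoretical_diversity(choices: dict) -> int:
    """Number of distinct sequences: product of allowed residues per position."""
    return prod(len(residues) for residues in choices.values())
```

Under these assumptions the count is 12^5 x 10 distinct strand sequences, before any loop diversity is added.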
[0012] In some embodiments, the residue at positions 1, 3, 5, and 10 of the F beta strand may be altered relative to the wild type sequence, wherein the residues at positions 1, 3, 5, and 10 of the F beta strand may be individually selected from the group consisting of Ala, Gly, Pro, Ser, Thr, Asp, Glu, Asn, Gln, His, Lys, and Arg. The residue at positions 2, 4, and 6 of the F
beta strand may be a hydrophobic residue. The residue at position 7 of the F
beta strand may be a hydrophobic residue, or may be selected from the group consisting of Arg, Tyr, Ala, Thr, and Val. The residue at position 8 of the F beta strand may be selected from the group consisting of Ala, Gly, Ser, Val, and Pro. The residue at position 9 of the F beta strand may be selected from the group consisting of Val, Leu, Glu, Arg, and Ile.
[0013] In some embodiments the cradle polypeptide may comprise a substitution that corresponds to a substitution in one or more of the amino acids at positions 30, 31, 33, 35, 37-39, 40-45, 47, 49, 67, 69, 71, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, and/or 86 of SEQ ID NO:1. In some embodiments the cradle polypeptide may comprise amino acid substitution in one or more of the amino acids at positions 33, 35, 37, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 70, 72, 74, 76, 79, 80, 81, 82, 83, 84 and/or 85 of SEQ ID NO:97. In some embodiments the cradle polypeptide may comprise amino acid substitution in one or more of the amino acids at positions 31, 33, 35, 37, 39, 40, 41, 42, 43, 44, 66, 68, 70, 72, 75, 76, 77, 78, 79, 80 and/or 81 of SEQ ID NO:129. In some embodiments the cradle polypeptide may comprise an amino acid sequence set forth in SEQ ID NOs: 468, 469 and 470. In some embodiments the cradle polypeptide may be modified by inserting or deleting 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20 or more amino acids, or any range derivable therein, in an FnIII loop. In some embodiments, substitutions in loops AB, CD, and EF may be specifically excluded, either individually or in various combinations. In some embodiments modifications in the bottom loop(s) may be limited to 1, 2, 3, 4, or 5 or fewer substitutions, insertions, and/or deletions. In some embodiments the amino acid substitutions may contribute to the binding specificity of the cradle polypeptide.
[0014] Also provided herein is a chimeric cradle polypeptide comprising one or more amino acid substitutions in at least a loop region and at least a non-loop region, wherein part of the cradle polypeptide is replaced by a non-FnIII domain polypeptide that enhances the binding affinity of the cradle polypeptide for a target molecule. In some embodiments the chimeric cradle polypeptide may comprise all or part of a complementarity determining region (CDR) of an antibody or a T-cell receptor, wherein the CDR may be a CDR1, CDR2 or CDR3 of a single domain antibody. In some embodiments the single domain antibody may be a nanobody. In some embodiments the CDR may replace part or all of the AB, BC, CD, DE, EF or FG loop.
[0015] Further provided herein is a multispecific cradle polypeptide comprising multiple copies of one or more monomer cradle polypeptides disclosed herein, wherein the monomer cradle polypeptides may be linked by a linker sequence. In some embodiments the linker sequence may be selected from the group consisting of GGGGSGGGGS (SEQ ID NO:
471), GSGSGSGSGS (SEQ ID NO: 472), PSTSTST (SEQ ID NO: 473) and EIDKPSQ (SEQ ID NO:
474).
[0016] In another aspect, the present invention provides a cradle library comprising a plurality of cradle polypeptides having amino acid substitutions in both the beta strands in conjunction with substitutions in the AB loop, the BC loop, the CD loop, the DE loop, and/or the FG loop of FnIII. In some embodiments the cradle polypeptides may comprise one or more amino acid substitutions corresponding to amino acid positions 30, 41, 42, 43, 44, 45, 76, 77, 78, 79, 80, 81, 82, 83, 84 and/or 85 of SEQ ID NO:1. In some embodiments the cradle polypeptides may further comprise one or more amino acid substitutions corresponding to amino acid positions 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 32, 34, 35, 36, 37, 38, 39, 40, 46, 48, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 72, 74, 75, 86, 87, 88, 89, 90, 91, 92, 93 and/or 94 of SEQ ID NO:1. In some embodiments the cradle polypeptides may be at least 50%, 60%, 70%, 80%, or 90% identical to SEQ ID NO:1. In some embodiments the cradle polypeptides may further comprise an insertion of at least 1, 2, or about 2-25 amino acids in at least one loop region. In some embodiments the cradle polypeptides may comprise a deletion of at least 1, 2 or about 2-10 amino acids in at least one loop region. In some embodiments the cradle polypeptides may comprise a deletion of at least 2 amino acids in two loop regions. In some embodiments the cradle polypeptides may comprise at least 1 amino acid insertion and 1 amino acid deletion in at least one loop region. In some embodiments the cradle polypeptides may comprise an insertion and deletion of at least 1 amino acid in the same loop region. In some embodiments the cradle library may be pre-selected to bind a target molecule. In some embodiments the cradle polypeptides may comprise an amino acid sequence selected from the group consisting of SEQ ID NOs: 3, 79, 86 and 468-470. In some embodiments the cradle library may contain 10, 100, 1000, 10^4, 10^5, 10^6, 10^7, 10^8, 10^9, 10^10, 10^11, 10^12, 10^13, 10^14, 10^15 or more different polypeptide variants, including all values and ranges there between. In some embodiments, the amino acid sequence of the FnIII domain from which the library is generated is derived from the wild type amino acid sequences of the 1st, 2nd, 3rd, 4th, 5th, 6th, 7th, 8th, 9th, 10th, 11th, 12th, 13th, 14th, 15th or 16th FnIII domain of human fibronectin. In some embodiments, the cradle polypeptide sequences may comprise a loop FG comprising 5 or 6 residues, a loop CD comprising 3 to 11 residues, a beta strand C comprising 6 to 14 residues, a beta strand F comprising 8 to 13 residues, or a combination of the loops and strands.
[0017] Further provided herein are polynucleotides encoding one or more cradle polypeptides described herein. In some embodiments the polynucleotide may be an expression cassette or an expression construct. In some embodiments the expression construct may be capable of expressing the encoded polypeptide in a host cell, such as a prokaryotic or eukaryotic cell line or strain. In some embodiments the expression construct may be functional in one or more polypeptide expression systems known in the art. In some embodiments the expression construct may be functional in bacteria, yeast, insect cells, mammalian cells or the like.
[0018] Further provided herein is a method of producing a cradle polypeptide by: a) expressing a polynucleotide encoding a cradle polypeptide disclosed herein in a host cell; and b) isolating and/or purifying the expressed cradle polypeptide. In some embodiments the method may include the engineering of various amino acid substitutions, deletions, and/or insertions described herein.
[0019] In a further aspect, provided herein is a method of forming a cradle library of FnIII domain polypeptides useful in screening for the presence of one or more polypeptides having a selected binding or enzymatic activity, which comprises: (i) aligning loops FG and CD, and preferably beta strands C and F amino acid sequences in a collection of native FnIII domain polypeptides, (ii) segregating the aligned loop and beta strand sequences according to length, (iii) for a selected loop, beta strand, and length from step (ii), performing positional amino acid frequency analysis to determine the frequencies of amino acids at each position, (iv) for each loop, beta strand, and length analyzed in step (iii), identifying at each position a conserved or selected semi-conserved consensus amino acid and other natural-variant amino acids, (v) for at least one selected loop, beta strand, and length, forming: (1) a library of mutagenesis sequences expressed by a library of coding sequences that encode, at each loop position, the consensus amino acid, and if the consensus amino acid has an occurrence frequency equal to or less than a selected threshold frequency of at least 50%, a single common target amino acid and any co-produced amino acids, or (2) a library of natural-variant combinatorial sequences expressed by a library of coding sequences that encode at each position, a consensus amino acid and, if the consensus amino acid has a frequency of occurrence equal to or less than a selected threshold frequency of at least 50%, other natural variant amino acids, including semi-conserved amino acids and variable amino acids whose occurrence rate is above a selected minimum threshold occurrence at that position, or their chemical equivalents, (vi) incorporating the library of coding sequences into framework FnIII coding sequences to form an FnIII expression library, and (vii) expressing the FnIII polypeptides of the expression library.
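Steps (ii) through (v) of this method can be sketched computationally as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the function names are hypothetical, the input is assumed to be already-extracted loop or strand sequences in one-letter code, and the defaults of 50% for the consensus threshold and 10% for the minimum variant occurrence are example values.

```python
from collections import Counter, defaultdict


def segregate_by_length(seqs):
    """Step (ii): group loop or strand sequences by their length."""
    groups = defaultdict(list)
    for seq in seqs:
        groups[len(seq)].append(seq)
    return dict(groups)


def positional_frequencies(seqs):
    """Step (iii): amino acid frequencies at each position (equal-length input)."""
    n = len(seqs)
    return [
        {aa: count / n for aa, count in Counter(column).items()}
        for column in zip(*seqs)
    ]


def consensus_and_variants(freqs, threshold=0.5, min_occurrence=0.1):
    """Steps (iv)-(v): consensus residue per position, plus natural variants.

    Variants are kept only when the consensus frequency is at or below the
    threshold, and only if they occur above the minimum occurrence.
    """
    design = []
    for position in freqs:
        consensus, top = max(position.items(), key=lambda kv: kv[1])
        variants = []
        if top <= threshold:
            variants = sorted(
                aa for aa, f in position.items()
                if aa != consensus and f > min_occurrence
            )
        design.append((consensus, variants))
    return design


# Hypothetical FG-loop sequences, for illustration only.
loops = ["GLKPGV", "GLEPGS", "GVKPGT", "GIRPGS", "GDY"]
by_length = segregate_by_length(loops)
design = consensus_and_variants(positional_frequencies(by_length[6]))
```

Each entry of `design` pairs a consensus residue with the variants that would be encoded at that position; fully conserved positions carry an empty variant list.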
[0020] In some embodiments, the cradle library may include cradle polypeptides which may comprise: (a) regions A, AB, B, C, CD, D, E, EF, F, and G having wildtype amino acid sequences of a selected native FnIII polypeptide, and (b) loop regions FG, CD, and/or beta strands C and F, having selected lengths, wherein at least one selected loop and/or beta strand region of a selected length contains a library of mutagenesis sequences expressed by a library of coding sequences that encode, at each loop position, a conserved or selected semi-conserved consensus amino acid and, if the consensus amino acid has an occurrence frequency equal to or less than a selected threshold frequency of at least 50%, a single common target amino acid and any co-produced amino acids.
[0021] In some embodiments, the cradle polypeptides may comprise: (a) regions A, AB, B, C, CD, D, E, EF, F, and G having wildtype amino acid sequences of a selected native FnIII
polypeptide, and (b) loop regions FG, CD, and non-loop regions C and F having selected lengths, where at least one selected loop and/or beta strand region of a selected length contains a library of natural-variant combinatorial sequences expressed by a library of coding sequences that encode at each position, a conserved or selected semi-conserved consensus amino acid and, if the consensus amino acid has a frequency of occurrence equal to or less than a selected threshold frequency of at least 50%, other natural variant amino acids, including semi-conserved amino acids and variable amino acids whose occurrence rate is above a selected minimum threshold occurrence at that position, or their chemical equivalents.
[0022] In some embodiments, the cradle library may have a given threshold of 100%, unless the amino acid position contains only one dominant and one variant amino acid, and the dominant and variant amino acids are chemically similar amino acids, in which case the given threshold may be 90%. In some embodiments, the cradle library may contain all natural variants or their chemical equivalents having at least some reasonable occurrence frequency, e.g., 10%, in the selected loop, beta strand, and position.
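The threshold rule in this paragraph can be expressed as a small function. The sketch is illustrative: the text does not define "chemically similar", so the similarity groups below are an assumption, and the names `chemically_similar` and `select_threshold` are hypothetical.

```python
# Assumed similarity classes (one-letter codes); not defined in the text.
SIMILAR_GROUPS = [
    set("ILVM"),  # aliphatic hydrophobic
    set("FYW"),   # aromatic
    set("KRH"),   # basic
    set("DE"),    # acidic
    set("ST"),    # hydroxyl
    set("NQ"),    # amide
]


def chemically_similar(a: str, b: str) -> bool:
    """True if both residues fall in the same assumed similarity class."""
    return any(a in group and b in group for group in SIMILAR_GROUPS)


def select_threshold(residues_at_position: dict) -> float:
    """Pick the threshold for one position from its residue -> frequency map.

    Returns 0.90 when the position holds exactly one dominant and one variant
    residue that are chemically similar, and 1.00 otherwise.
    """
    if len(residues_at_position) == 2:
        a, b = residues_at_position
        if chemically_similar(a, b):
            return 0.90
    return 1.00
```

For example, a position observed as Leu or Ile would receive the 90% threshold under these assumed groups, while a Leu or Asp position would keep the 100% threshold.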
[0023] In some embodiments, the cradle library may have one or more of loops FG and CD and/or beta-strands C and F which comprise beneficial mutations identified by screening a natural-variant combinatorial library containing amino acid variants in the loops, beta strands, or combinations thereof. In some embodiments, one or more members of the library may then be isolated from other members of the library and analyzed. In some embodiments, the cradle library may be pre-selected to bind a target and those preselected members are then further diversified at selected amino acid positions to generate a targeted cradle library that is subsequently screened for a particular characteristic or property.
[0024] Further provided herein is a method of identifying a cradle polypeptide having a desired binding affinity to a target molecule, comprising: a) reacting a cradle library of FnIII
domain polypeptides disclosed herein with the target molecule, and b) screening the cradle library of FnIII domain polypeptides to select those having a desired binding affinity to the target molecule. In some embodiments, after conducting the binding assay(s) one or more cradle polypeptides may be selected that have a particular property, such as binding specificity and/or binding affinity to a target. In some embodiments, the amino acid or nucleic acid sequence of one or more of the selected library members may be determined using conventional methods. The sequence of the selected FnIII polypeptide(s) may then be used to produce a second cradle library that introduces further substitution of the selected sequences. The second cradle library may then be screened for FnIII polypeptides having a particular property. The process can be repeated 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more times.
Additional iterations may enrich the cradle library as well as potentially include other variants.
[0025] In some embodiments, the method may further comprise conducting a first screen of a cradle library having amino acid substitutions in only FnIII loops or only FnIII beta strands and conducting a second screen using substitutions in only FnIII loops or only FnIII beta strands. In some embodiments, the first screen may use only substitutions in the FnIII loops and the second screen may use only substitutions in the FnIII beta-strands. In some embodiments, the second screen may use substitutions in both FnIII loops and beta-strands.
In some embodiments, the FnIII amino acid residues varied in the first screen may or may not be varied in the second screen.
[0026] Also provided herein is a method of detecting a target molecule which comprises contacting a sample containing the target with an FnIII binding domain that specifically binds the target. Further provided herein is a method of producing an FnIII variant comprising: (a) expressing a polypeptide comprising an amino acid sequence; and (b) isolating and/or purifying the expressed variant FnIII domain from a host cell expressing the variant FnIII.
[0027] Further provided herein is a cradle polypeptide selected using the method of identifying a cradle polypeptide having a desired binding affinity to a target molecule disclosed herein. In some embodiments, the cradle polypeptide may comprise an amino acid sequence selected from the group consisting of SEQ ID NOs:4-78, 80-85, 87-96, 98, 99, 101-128, 130-141, 143, 145-147, 149-159, 161-199, 201-238 and 240-277.
[0028] In still a further aspect, the present invention provides a kit comprising a plurality of cradle polypeptides as described herein. Also provided herein is a kit comprising a plurality of polynucleotides encoding the FnIII cradle polypeptides as disclosed herein.
Further provided herein is a kit comprising a cradle library and/or the polynucleotides encoding the cradle library as disclosed herein.
Brief Description of the Drawings
[0029] Figure 1 is a schematic diagram illustrating the method for constructing FnIII cradle libraries using computer assisted genetic database biomining and delineation of beta-scaffold and loop structures.
[0030] Figure 2A shows the structure and sequence of the wild type FnIII10 (SEQ ID NO:1).
Loops CD and FG along with the specific residues of beta strands C and F are shown in black in the structure and boldface black in the sequence.
[0031] Figure 2B shows the binding surface formed by the Cradle. Loops CD and FG are black, sheet C is light gray, and sheet F is dark gray.
[0032] Figure 3A shows the placement of the cradle residues on the ribbon diagrams of the FnIII domains. The structure for the FnIII 7 domain is from RCSB code 1FNF, the structure for the FnIII10 domain is from 1FNA, and the structure for the FnIII14 domain is from 1FNH.
[0033] Figure 3B is a ball and stick representation of the location of the cradle on the FnIII10 molecule, where cradle residues are black and other residues are white.
[0034] Figures 4A-D show the length distribution calculated from mammalian FnIII
domains found in the PFAM family PF00041. Figure 4A: Sheet C; Figure 4B: Loop CD; Figure 4C: Sheet F; Figure 4D: Loop FG.
[0035] Figures 5A-H show the positional amino acid distribution found in the structural elements of the Cradle. The conservation of the top 5 amino acids for each position in each Cradle is shown. Figure 5A: Sheet C, length 9; Figure 5B: Sheet C, length 10;
Figure 5C: Loop CD, length 4; Figure 5D: Loop CD, length 5; Figure 5E: Loop CD, length 6;
Figure 5F: Sheet F, length 10; Figure 5G: Loop FG, length 5; Figure 5H: Loop FG, length 6.
[0036] Figures 6A-F show the binding surface comparison of the FnIII10 Cradle, the FnIII10 Top Side and the FnIII10 Bottom Side binding sites. Figure 6A: Cradle shown in white on a ribbon diagram of FnIII10; Figure 6B: Surface representation of the Cradle;
Figure 6C: Top Side shown in white on a ribbon diagram of FnIII10; Figure 6D: Surface representation of the Top Side; Figure 6E: Bottom Side shown in white on a ribbon diagram of FnIII10;
Figure 6F: Surface representation of the Bottom Side.
[0037] Figures 7A-B show the conservation of amino acid type in sheets C and F of FnIII10 and indicate which residues are varied in the Cradle molecule and which ones were left as wild type. Figure 7A: Ribbon diagram of FnIII10 with the amino acids in sheets C and F numbered 1-19. The Cα for each amino acid is shown as a sphere and the Cβ is shown as a stick to indicate the direction of the amino acid R group. Varied amino acids are colored gray and unvaried are colored white. Figure 7B: Table showing the amino acid type conservation for each position in sheets C and F.
[0038] Figure 8 is the amino acid distribution in the varied residues of the Cradle and CDR-H3 domains known to bind antigens.
[0039] Figures 9A-F describe the design of the Cradle library on FnIII 7, FnIII10, and FnIII14. Figure 9A: Length distribution of loop CD; Figure 9B: Length distribution of loop FG;
Figure 9C: Amino acid distribution in the sheets and loops. Figure 9D:
Alignment of cradle residues for FnIII 7, FnIII10 and FnIII14 (SEQ ID NOs: 468-470). Beta sheets are shown as white residues on a black background and loops are shown as black text. Cradle residues are shown in bold with X representing the amino acid distribution for the beta sheets and Y
representing the amino acid distribution for the loops with the loop length range given as a subscript.
Figure 9E: Alignment of FnIII 7, FnIII10 and FnIII14 (SEQ ID NOs:97, 100, 129) illustrating the cradle residues in beta sheets C and F and loops CD and FG. Beta sheets are shown as white residues on a black background and loops are shown as black text with Cradle residues shown in bold. Figure 9F: Shown are the FnIII structural element residue ranges and FnIII cradle residues ranges.
[0040] Figures 10A-C show a shared epitope for a monobody and a SIM peptide and conservation of the SIM binding site in SUMO proteins. Figure 10A: The structures of the ySMB-1 monobody bound to ySUMO (left) and a SIM peptide bound to hSUMO-1 (PDB ID 1Z55) (right) are shown. Because there are no structures of natural SIM peptides in complex with ySUMO, the structure of a SIM peptide bound to hSUMO-1 is shown for comparison.
Figure 10B: An alignment of SIM binding site/ySMB-1 epitope residues in ySUMO and two human homologs, hSUMO-1 and hSUMO-2, is shown (top left) (SEQ ID NOs:297-299).
Residues are ranked according to their conservation score: residues identical in all 3 SUMO
proteins (i), similar with a conservation score of 9 (ii), and similar with a conservation score of 8 (iii). Conservation scores were calculated using methods outlined by Livingstone and Barton, Comput. Appl. Biosci. (1993) 9:745-756 using the Jalview program (Clamp, et al., Bioinformatics (2004) 20:426-427). These scores reflect conservation of chemical and structural properties of side chains. The structure of ySUMO is shown (top right). Figure 10C:
A full sequence alignment of ySUMO, hSUMO-1 and hSUMO-2 is shown (SEQ ID NOs:300-302).
[0041] Figures 11A-C illustrate the design of a SUMO-targeted cradle library.
Figure 11A:
Shown is the structure of the ySMB-1/ySUMO interface. The ySUMO structure is shown as a surface with epitope residues shown as sticks. Figure 11B: Shown is a sequence alignment (top) of ySMB-1 epitope residues in ySUMO and equivalent residues in hSUMO-1 and hSUMO-2 (SEQ ID NOs:297-299). Below, the residues of ySMB-1 varied in the SUMO-targeted library are listed. Interactions between ySMB-1 residues and ySUMO
residues in the ySMB-1/ySUMO structure are indicated by lines. Below each ySMB-1 residue, the amino acids allowed at that position in the SUMO-targeted library are listed along with the degenerate codon used to introduce them in parentheses. Figure 11C: Shown is a cartoon of the ySMB-1 structure with the positions varied in the SUMO-targeted library indicated.
[0042] Figures 12A-C illustrate the selection and characterization of monobodies from the SUMO-targeted library. Figure 12A: ySUMO (left) and hSUMO-1 (right) are shown with the ySMB-1 paratope structure modeled as if binding to each target. Monobody residue positions are shown as spheres corresponding to Cα atoms of indicated residue numbers. FG loop residues (75, 76, 78, 79, 80, 81, 82, 83, 84, and 85) and scaffold residues (31, 33, and 73) are indicated. In the center, a table is shown listing the amino acid diversities introduced at monobody positions in the SUMO-targeted library. Wildtype residues at each position are indicated in brackets. Figure 12B: Shown are the amino acid sequences of monobodies recovered against ySUMO and hSUMO-1 as well as representative SPR binding traces (SEQ ID NOs:303-318). Estimated dissociation constants from SPR are given for all clones. Figure 12C: Shown are the sequence logo representations of 40 ySUMO monobodies and 44 hSUMO-1 monobodies. The wild-type sequence of ySMB-1 is shown above (SEQ ID NO:303).
In this depiction, the relative height of individual letters reflects how frequently that amino acid is recovered at that position, the letters stacked at a given position are ordered from more frequently occurring to less frequently occurring, and the overall height of an individual stack reflects the overall level of sequence conservation at that position. Figures generated using WebLogo (Crooks, et al., Genome Res. (2004) 14:1188-1190; Schneider and Stephens, Nucleic Acids Res. (1990) 18:6097-6100).
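The letter and stack heights described in the Figure 12C legend can be illustrated with a minimal sketch. It assumes the standard information-content formulation of Schneider and Stephens (stack height = log2(20) minus the Shannon entropy of the column, letter height = frequency times stack height) and omits the small-sample correction term. The function name `logo_column` and the example column are hypothetical and are not taken from the figures.

```python
import math
from collections import Counter


def logo_column(residues, alphabet_size=20):
    """Compute sequence-logo heights for one alignment column.

    residues: iterable of one-letter amino acid codes observed at a position.
    Returns (stack_height_bits, [(amino_acid, letter_height_bits), ...]) with
    letters ordered from most to least frequent.
    Assumption: no small-sample correction is applied.
    """
    counts = Counter(residues)
    total = sum(counts.values())
    freqs = {aa: n / total for aa, n in counts.items()}
    # Shannon entropy of the observed distribution, in bits.
    entropy = -sum(f * math.log2(f) for f in freqs.values())
    # Overall stack height reflects conservation at this position.
    stack = math.log2(alphabet_size) - entropy
    # Each letter's height is its share of the stack.
    letters = sorted(((aa, f * stack) for aa, f in freqs.items()),
                     key=lambda item: -item[1])
    return stack, letters


if __name__ == "__main__":
    # Hypothetical column: three clones with Y and one with S at this position.
    stack, letters = logo_column("YYYS")
    print(round(stack, 3), [(aa, round(h, 3)) for aa, h in letters])
```

A fully conserved column reaches the maximum stack height of log2(20) bits, while a column with mixed residues gives a shorter stack whose letters are ordered by frequency.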
[0043] Figure 13 shows the rationale for scaffold residue preferences in ySUMO and hSUMO-1 monobodies. Contacts made by scaffold residues in the ySMB-1/ySUMO complex (left) and in a modeled ySMB-1/hSUMO-1 complex (right) are shown. A potential steric and electrostatic clash between R33 of the monobody scaffold and K25 of hSUMO-1 is circled.
[0044] Figures 14A-D show the specificity in ySUMO and hSUMO-1 binding monobodies.
Figure 14A: Amino acid sequences of two nearly identical ySUMO and hSUMO-1 monobodies are shown (SEQ ID NOs:303, 319-320). Monobody A was recovered as an hSUMO-1 binder, and monobody B as a ySUMO binder. Figure 14B: Phage ELISA data is shown for binding of Monobody A and B to ySUMO and hSUMO-1. Both ySUMO and hSUMO-1 were produced as GST fusion proteins. Binding to GST is shown as a negative control. Figure 14C: Phage ELISA data for the binding of 32 hSUMO-1 monobodies to ySUMO, hSUMO-1, and hSUMO-2 is shown. All SUMO proteins were produced as GST fusions. Binding to GST is shown as a negative control. Figure 14D: Sequence alignments of monobodies specific for hSUMO-1 and cross-reacting with ySUMO are shown in sequence logo format (see Figure 13 legend for explanation of sequence logos). The wild-type ySMB-1 sequence is shown above. Clone numbers 1, 3, 4, 5, 6, 8, 9, 12, 13, 17, 18, 20, 21, 23, 31 and 32 in Figure 25C were classified as cross-reactive. The remaining 16 clones were classified as specific.
[0045] Figures 15A-C show the representative binding data for monobodies generated from the cradle libraries. Phage ELISA signals of selected clones are shown.
Figures 15A, B and C show clones selected from the BL1 library with hSUMO1, human ubiquitin and Abl SH2 as a target, respectively. ELISA wells were coated with the cognate target. The left bars show data in the absence of a soluble target (which serves as a competitor), and the middle bars show data in the presence of a soluble competitor (100 nM for hSUMO1 and 200 nM for the others). The right bars show binding to wells containing no target (negative control).
[0046] Figures 16A-E show the sequences and properties of ySUMO-binding monobodies.
Figure 16A: yeast SUMO (ySUMO) structure shaded by conservation score among ySUMO
and hSUMO isoforms (Livingstone and Barton, supra, 1993). Figure 16B:
Schematic of the FnIII scaffold with beta strands A-G labeled and surface loops (BC loop, DE
loop, and FG loop) diversified in monobody libraries. Figure 16C: Amino acid sequences of variable loops of ySUMO-binding monobodies with Kd values from SPR (SEQ ID NOs:321-329). Figure 16D:
SPR traces for ySMB-1 and ySMB-2 binding to ySUMO with kinetic parameters calculated from a best fit (solid line) of the raw data (dashed line) to a 1:1 binding model. Figure 16E:
Epitopes of ySMB-1 and ySMB-2 mapped from NMR chemical shift perturbation shown on the ySUMO structure.
[0047] Figure 17 shows the sequences and affinities of ySUMO-binding monobodies (SEQ
ID NOs:321-415). Amino acid sequences of the variable loops of all ySUMO-binding monobodies recovered in our laboratory. If available, Kd values from SPR are given.
Monobodies originated from one of three libraries: a binary Tyr/Ser library in which loop lengths and sequences were varied using a combination of 50% Y and 50% S
(Koide, A., et al., Proc. Natl. Acad. Sci. USA (2007) 104:6632-6637), a "YSX" library which used a combination of 40% Y, 20% S, 10% G, and 5% each of R, L, H, D, N, A (Olsen, et al., Nature (2010) 463:906-912), or a "YSGW" library which used a combination of 30% Y, 15% S, 10% G, 5%
each of W, F and R, and 2.5% each of all other amino acids except cysteine in the BC and FG
loops and 50% Gly, 25% Tyr and 25% Ser at position 52, and a 50/50 mixture of Tyr and Ser at positions 53-55 in the DE loop (Wojcik, et al., supra, 2010).
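As an illustration of how a biased loop composition such as the "YSX" mixture described in the Figure 17 legend might be represented and sampled in silico, a minimal sketch follows. The percentages are those given for the YSX library; the function names, the loop length, and the sampling procedure are hypothetical and do not describe how the physical libraries were constructed.

```python
import random

# YSX composition as stated in the legend: 40% Y, 20% S, 10% G,
# and 5% each of R, L, H, D, N, A.
YSX = {"Y": 0.40, "S": 0.20, "G": 0.10,
       "R": 0.05, "L": 0.05, "H": 0.05,
       "D": 0.05, "N": 0.05, "A": 0.05}


def check_composition(comp, tol=1e-9):
    """Return True if the amino acid fractions sum to 1 within tolerance."""
    return abs(sum(comp.values()) - 1.0) <= tol


def sample_loop(comp, length, rng):
    """Draw one loop sequence of the given length from the composition.

    Assumption: positions are sampled independently with identical weights.
    """
    amino_acids = list(comp)
    weights = [comp[aa] for aa in amino_acids]
    return "".join(rng.choices(amino_acids, weights=weights, k=length))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    assert check_composition(YSX)
    # Hypothetical loop length of 8 residues, chosen only for illustration.
    for _ in range(3):
        print(sample_loop(YSX, 8, rng))
```

Checking that the fractions sum to 1 before sampling is a cheap way to catch a mistyped composition table.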
[0048] Figure 18 shows the epitope mapping ELISA of ySUMO-binding monobodies.
Binding of 34 phage-displayed ySUMO-binding monobodies measured by ELISA in the presence and absence of 1 µM ySMB-1 competitor. Clone numbers correspond to those of the format ySMB-X in Figure 28.
[0049] Figures 19A-B show the specificity of ySUMO-binding monobodies. Figure 19A:
Binding of eight ySUMO-binding monobodies to ySUMO, hSUMO1 and hSUMO2 assayed using phage ELISA. Clone numbers are of the format ySMB-X in Figures 27C and 28. Figure 19B: Equilibrium SPR measurements of ySMB-1 (left column) and ySMB-9 (right column) binding to ySUMO, hSUMO1 and hSUMO2. Equilibrium responses at multiple concentrations (left panels) were fit with a simple 1:1 binding model (right panels).
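The simple 1:1 binding model referred to in the Figure 19B legend is commonly written as R_eq = Rmax * C / (Kd + C), where C is the analyte concentration. The sketch below evaluates that expression and picks a Kd from a list of candidates by least squares. It is an illustration only: the function names, the candidate grid, and all numerical values are hypothetical, and the fitting procedure actually used for the figures is not stated here.

```python
def equilibrium_response(conc, kd, rmax):
    """Equilibrium response for a 1:1 binding model: Rmax * C / (Kd + C)."""
    return rmax * conc / (kd + conc)


def fit_kd(concs, responses, rmax, candidates):
    """Return the candidate Kd with the smallest sum of squared residuals.

    Assumption: Rmax is known and only Kd is estimated, over a fixed grid.
    """
    def sse(kd):
        return sum((equilibrium_response(c, kd, rmax) - r) ** 2
                   for c, r in zip(concs, responses))
    return min(candidates, key=sse)


if __name__ == "__main__":
    # Hypothetical concentrations (nM) and responses simulated with Kd = 100 nM.
    concs = [10.0, 100.0, 1000.0]
    responses = [equilibrium_response(c, 100.0, 50.0) for c in concs]
    print(fit_kd(concs, responses, 50.0, [10.0, 50.0, 100.0, 500.0]))
```

At C equal to Kd the model gives half of Rmax, which is why Kd can be read off an equilibrium titration as the concentration producing a half-maximal response.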
[0050] Figures 20A-C show the crystal structure of the monobody ySMB-1/ySUMO
complex. Figure 20A: Top: ySUMO and ySMB-1 are shown with monobody paratope residues shown as sticks; FG loop residues and scaffold residues are indicated. ySUMO
is shown in the same orientation as in Figure 16E. Bottom: An alternative view with the monobody paratope depicted as a surface. Figure 20B: Close-up of the ySMB-1/ySUMO interface.
ySUMO
(surface/sticks) is shown with residues comprising the hydrophobic center of the epitope and the charged/polar rim. Monobody paratope residues are shown as in (A). Figure 20C:
Left and Middle: comparison of the binding modes of ySMB-1 to ySUMO and the SIM of RanBP2 to hSUMO1. Both form intermolecular beta sheets with their SUMO targets (expanded box).
Right: Overlay of the RanBP2 SIM and SIM mimicking monobody residues with the ySUMO
surface shown.
[0051] Figures 21A-B show the ySMB-1/ySUMO interface analysis. Figure 21A:
Buried surface area contributed by each residue in the ySMB-1 paratope. Figure 21B:
Percent of total ySMB-1 and ySUMO buried surface area contributed by each amino acid type.
[0052] Figures 22A-F show the hSUMO1-binding monobodies from the SUMO-targeted library. Figure 22A: Design of the SUMO-targeted cradle library. Left: ySMB-1 paratope residues (backbone sticks/spheres) are shown with FG loop and scaffold residues indicated. ySUMO (surface) is shown with ySMB-1 epitope residues as sticks. ySUMO residues F37, K38, K40, T43, L48, and R55 are completely conserved, H23, I35, I39, R47 and A51 are conservative substitutions and N25, E34, F36, E50, and K54 are non-conservative substitutions, according to conservation between ySUMO and hSUMOs. The residue types at each position in hSUMO1 and hSUMO2/3 are shown in parentheses. Right: Amino acid diversity used in the SUMO-targeted library. The wild-type ySMB-1 residue is in brackets. Figure 22B: Amino acid sequences of hSUMO1-binding monobodies from the SUMO-targeted library. Kd values from SPR are also shown. Representative SPR traces are shown. At bottom, sequences of an additional hSUMO1-binding monobody (hS1MB-22) and a very similar ySUMO-binding monobody (ySMB-ST6) recovered from the SUMO-targeted library are shown (SEQ ID NOs:303, 416-427). Figure 22C: Epitope of hS1MB-4 mapped from chemical shift perturbation shown on the hSUMO1 structure. Data are represented using the same scheme as in Figure 16E.
Figure 22D: (SEQ ID NO:303) Sequence conservation of ySUMO- and hSUMO1-binding monobodies shown as sequence logos (Schneider and Stephens, supra, 1990; Crooks, et al., supra, 2004). The height of individual letters reflects how frequently that amino acid was recovered at that position, the letters stacked at a position are ordered from most to least frequently occurring and the overall height of a stack reflects the overall conservation level at that position. Figure 22E: Binding of hS1MB-22 and ySMB-ST6 to ySUMO and hSUMO1 measured by phage ELISA. Figure 22F: Contacts made by scaffold residues in the ySMB-1/ySUMO complex (left) and in a modeled ySMB-1/hSUMO1 complex (right).
Monobody (sticks) and SUMO residues (surface/sticks) are indicated. The ySMB-1/hSUMO1 complex was modeled by superposition of the ySUMO portion of the ySMB-1 complex with the hSUMO1 structure.
[0053] Figure 23 shows the ySUMO-binding monobodies isolated from the SUMO-targeted library. Shown are the amino acid sequences of monobodies recovered against ySUMO from the SUMO-targeted library with Kd values from SPR (SEQ ID NOs:303, 428-432) and representative SPR traces.
[0054] Figure 24 shows the epitope mapping ELISA of hSUMO1-binding monobodies.
Binding of sixteen phage-displayed hSUMO1-binding monobodies to hSUMO1 measured by ELISA in the presence or absence of 1 µM hS1MB-4 competitor. Clone numbers correspond to those of the format hS1MB-X in Figures 22B and 26.
[0055] Figures 25A-B show the specificity of hSUMO1-binding monobodies.
Figure 25A: Binding curves derived from phage ELISA of six hSUMO1-binding monobodies binding to ySUMO, hSUMO1 and hSUMO2. Data for additional monobodies are shown in Figure 26A.
Serial dilutions of phage-containing culture supernatant (titer ~10^8) were used. Absorbance values were scaled to 1 cm path-length. Figure 25B: Equilibrium SPR measurements of hS1MB-4 binding to ySUMO, hSUMO1 and hSUMO2. Equilibrium responses at multiple concentrations (left panels) were fit to a simple 1:1 binding model (right panels).
[0056] Figures 26A-B show the selectivity of hSUMO1-binding monobodies. Figure 26A:
Binding curves derived from phage ELISA of 10 hSUMO-binding monobodies binding to ySUMO, hSUMO1 and hSUMO2. Data for six additional monobodies are shown. Figure 26B:
The amino acid sequences of 16 hSUMO1-binding monobodies are shown (SEQ ID NOs:433-448) and grouped according to their specificity factor for hSUMO1 over ySUMO.
The specificity factor is the ratio of apparent affinity measured for hSUMO1 to that for ySUMO in the titration phage ELISA experiment shown in Figure 25A and Figure 26A.
[0057] Figures 27A-C show the effects of hSUMO1-specific monobodies on SUMO/SIM interactions and SUMOylation. Figure 27A: Left: Schematic of SIM-containing RanBP2's interaction with SUMOylated RanGAP (modified with hSUMO1). Right: Binding of RanBP2 to SUMO1-RanGAP in the presence of monobody hS1MB-4 in ELISA. Figure 27B:
Schematic of the E1- and E2-dependent steps in the SUMOylation cascade. Covalently linked intermediates are formed sequentially between SUMO and E1 (SAE1/2) and E2 (Ubc9). Figure 27C: SDS-PAGE of SUMOylation reactions carried out in the presence of hS1MB-4 (lanes 3-5) and hS1MB-5 (lanes 6-8). Lanes 1 and 2 are negative controls with ySMB-1 and without a monobody, respectively. All reactions contained SAE1/2, Ubc9 and both hSUMO1 and hSUMO3 as substrates. Bands corresponding to the SAE2-SUMO and Ubc9-SUMO covalent intermediates for each isoform are indicated. His6-tagged SUMO3 (H6-SUMO3) was used to distinguish hSUMO1 from hSUMO3 on the gel.
[0058] Figure 28 shows the proposed mechanism for monobody inhibition of hSUMO1 conjugation. A modeled structure of a ySMB-1-like monobody bound to an E1-hSUMO1 complex (PDB ID 3KYD; Olsen, et al., supra, 2010) is shown. The trajectory of a long loop of SAE1 that is disordered in the crystal structure is illustrated by a dashed line.
[0059] Figures 29A-B show the monobody effects on deSUMOylation. Figure 29A: Schematic of the deSUMOylation assay in which a YFP-hSUMO1-ECFP fusion protein is cleaved by SENP1 at the hSUMO1 C-terminal di-glycine sequence. Figure 29B: SDS-PAGE analysis of deSUMOylation reactions carried out in the presence of hS1MB-4 (lanes 6-8) or hS1MB-5 (lanes 9-11). Controls are also shown without SENP1 or a monobody (lane 1) or with SENP1 cleavage carried out in the presence of the ySUMO-specific ySMB-1 (lanes 2-5). Bands corresponding to the YFP-hSUMO1-ECFP fusion and the YFP-hSUMO1 and ECFP cleavage products are indicated, as well as the band corresponding to the monobodies.
[0060] Figures 30A-E show monobody library design. Figure 30A: A comparison of the VHH scaffold (left) and the FnIII scaffold (right). The two beta sheet regions are colored in cyan and blue, respectively. The CDR regions of the VHH and the corresponding loops in FnIII are colored and labeled. The beta strands of FnIII are labeled with A-G. Figure 30B: The structure of a monobody bound to its target, maltose-binding protein (Gilbreth, R. N., et al., J Mol Biol (2008) 381:407-418). The monobody is depicted in the same manner as in A. Only a portion of maltose-binding protein is shown as a surface model. Figure 30C: The structure of a monobody bound to the Abl SH2 domain depicted as in B (Wojcik, et al., supra, 2010). Figure 30D: The locations of diversified residues in the cradle library shown as spheres on the FnIII structure. Figure 30E: The locations of diversified residues in the cradle library.
[0061] Figures 31A-D show monobody library designs and generated clones. Amino acid sequences of monobodies generated from the new cradle library (Figure 31A) (SEQ ID NOs:449-457) and the "loop only" library (Figure 31B) (SEQ ID NOs:458-467). "X" denotes a mixture of 30% Tyr, 15% Ser, 10% Gly, 5% Phe, 5% Trp and 2.5% each of all the other amino acids except for Cys; "B", a mixture of Gly, Ser and Tyr; "J", a mixture of Ser and Tyr; "O", a mixture of Asn, Asp, His, Ile, Leu, Phe, Tyr and Val; "U", a mixture of His, Leu, Phe and Tyr; "Z", a mixture of Ala, Glu, Lys and Thr. Figure 31C: Binding measurements by yeast surface display of representative monobodies. The mean fluorescence intensities of yeast cells displaying a monobody are plotted as a function of the concentration of the target as indicated in panel A. Figure 31D: SPR sensorgrams for target binding of representative monobodies. The thin lines show the best global fit of the 1:1 binding model. The insets show dose-dependence analysis of the sensorgrams and the best fit of the 1:1 binding model.
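The amino acid mixture codes listed above can be represented as weighted residue sets. The following Python sketch is illustrative only: the code for the eight-residue mixture is written here as the letter "O", the equal proportions used for the B, J, O, U and Z mixtures are an assumption (the text states proportions only for X), and the names MIXTURES and sample_position are hypothetical.

```python
import random

# One-letter codes for the 20 standard amino acids.
ALL_AA = "ACDEFGHIKLMNPQRSTVWY"


def mixture_x():
    """"X": 30% Tyr, 15% Ser, 10% Gly, 5% Phe, 5% Trp and 2.5% each of
    all the other amino acids except Cys (weights in percent)."""
    weights = {"Y": 30.0, "S": 15.0, "G": 10.0, "F": 5.0, "W": 5.0}
    for aa in ALL_AA:
        if aa != "C" and aa not in weights:
            weights[aa] = 2.5
    return weights


def equal_mixture(residues):
    """Equal proportions are an assumption; the text gives none."""
    return {aa: 100.0 / len(residues) for aa in residues}


MIXTURES = {
    "X": mixture_x(),
    "B": equal_mixture("GSY"),       # Gly, Ser, Tyr
    "J": equal_mixture("SY"),        # Ser, Tyr
    "O": equal_mixture("NDHILFYV"),  # Asn, Asp, His, Ile, Leu, Phe, Tyr, Val
    "U": equal_mixture("HLFY"),      # His, Leu, Phe, Tyr
    "Z": equal_mixture("AEKT"),      # Ala, Glu, Lys, Thr
}


def sample_position(code, rng):
    """Draw one residue for a diversified position with the given code."""
    weights = MIXTURES[code]
    residues = list(weights)
    return rng.choices(residues, weights=[weights[r] for r in residues], k=1)[0]


rng = random.Random(0)
example_residue = sample_position("X", rng)
```

The X weights sum to 100%: the five named residues account for 65%, and the remaining 14 non-Cys residues contribute 2.5% each.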
[0062] Figures 32A-D show the crystal structures of monobodies originating from the two libraries. The structures are shown with the monobodies in similar orientations. Figure 32A: The structure of the SH13 monobody bound to the Abl SH2 domain depicted as in Figure 30C. Figure 32B: NMR-based epitope mapping of the SH13/Abl SH2 complex. The spheres show residues of Abl SH2 whose amide resonances were strongly affected (shift of >1.5 peak width), weakly affected (shift of 0.5-1.5 peak width) and minimally affected (shift of <0.5 peak width) by monobody binding, respectively. Figure 32C: The crystal structures of the ySMB-1/ySUMO complex (left) and ySMB-9/hSUMO1 complex (right). Figure 32D: The two monobodies bound to equivalent epitopes on the targets using distinct modes. The left panel shows a comparison of the two crystal structures shown in C with ySUMO and hSUMO1 superimposed. The right panel shows ySUMO and hSUMO1 in equivalent orientations with the epitopes for the indicated monobodies.
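The three categories used for the NMR-based epitope mapping above can be expressed as a simple threshold rule on the amide resonance shift, measured in peak widths. The sketch below is illustrative; the function name is hypothetical, and assigning the boundary values 0.5 and 1.5 to the "weakly affected" class is an assumption consistent with the stated ranges.

```python
def classify_shift(shift_in_peak_widths):
    """Classify an amide resonance shift given in units of peak width."""
    if shift_in_peak_widths > 1.5:
        return "strongly affected"
    if shift_in_peak_widths >= 0.5:
        return "weakly affected"
    return "minimally affected"


# Hypothetical shift values, for illustration only.
labels = [classify_shift(s) for s in (2.0, 1.0, 0.2)]
```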
[0063] Figure 33 shows that mutations of residues in the C-strand abolished target binding.
Residues 30, 31 and 33 of monobody GS5 (see Figure 31A for its sequence) were mutated back to their respective wild-type amino acids. The mean fluorescence intensities of yeast cells displaying the GS5 monobody (filled circles) and the mutant (open circles) are plotted as a function of the concentration of the target, GFP.
Detailed Description of the Invention
I. Definitions
[0064] The terms below have the following meanings unless indicated otherwise in the specification:
[0065] The term "fibronectin type III domain" or "FnIII domain" refers to a domain (region) from a wild-type fibronectin from any organism. In one specific embodiment, the FnIII domain is selected from the group consisting of FnIII 1, FnIII 2, FnIII 3, FnIII 4, FnIII 5, FnIII 6, FnIII 7, FnIII 8, FnIII 9, FnIII10, FnIII11, FnIII12, FnIII13, FnIII14, FnIII15, and FnIII16 and the like. In another embodiment, the FnIII domain is selected from the group consisting of FnIII 7, FnIII10, and FnIII14. In another embodiment, the FnIII domain is FnIII 7. In another embodiment, the FnIII domain is FnIII10. In another embodiment, the FnIII domain is FnIII14.
[0066] The term "FnIII domain variant" or "variant FnIII domain" refers to a polypeptide region in which modifications have been made to the wildtype FnIII domain. Modifications include one or more amino acid substitutions, deletions, and/or insertions as compared to the amino acid sequence of a wildtype FnIII domain. In one embodiment, the FnIII variant or FnIII variant domain has an alteration with respect to specifically the human tenth domain of the FnIII domain sequence (SEQ ID NO:1). In one embodiment, the FnIII variant or FnIII variant domain has an alteration with respect to specifically the human seventh domain of the FnIII domain sequence (SEQ ID NO:97). In one embodiment, the FnIII variant or FnIII variant domain has an alteration with respect to specifically the human fourteenth domain of the FnIII domain sequence (SEQ ID NO:129). The term "substitutional variant" includes the replacement of one or more amino acids in a peptide sequence with a conservative or non-conservative amino acid(s). In some embodiments, the FnIII domain variant has increased binding properties compared to the wildtype FnIII domain relative to a particular target. In some embodiments, the FnIII domain variant has an increased surface area available for binding to a target molecule compared with the wildtype FnIII domain.
[0067] The term "FnIII domain polypeptide" refers to a polypeptide that includes at least one FnIII domain. A "variant FnIII domain polypeptide" or "FnIII domain-based polypeptide"
refers to a polypeptide that includes at least one FnIII domain variant. It is contemplated that such polypeptides are capable of specifically binding a target polypeptide or protein. "FnIII
domain-based molecule" refers to a molecule having an amino acid sequence of an FnIII
domain or FnIII variant domain.
[0068] A "β sheet" or "beta sheet" is a form of regular secondary structure in proteins. Beta sheets consist of beta strands connected laterally by at least two or three backbone hydrogen bonds, forming a generally twisted, pleated sheet. A "beta strand" or "β strand" is a stretch of polypeptide chain typically 3 to 10 amino acids long with backbone in an almost fully extended conformation. The term beta strand A, also referred to as sheet A, refers to the amino acids preceding the AB loop. The term beta strand B, also referred to as sheet B, refers to the amino acids connecting the AB and BC loops. The term beta strand C, also referred to as sheet C or β1, refers to the amino acids connecting the BC and CD loops, e.g., amino acid positions 31-39 of SEQ ID NO:1. The term beta strand D, also referred to as sheet D or β2, refers to the amino acids connecting the CD and DE loops, e.g., amino acid positions 44-51 of SEQ ID NO:1. The term beta strand E, also referred to as sheet E, refers to the amino acids connecting the DE and EF loops. The term beta strand F, also referred to as sheet F or β3, refers to the amino acids connecting the EF and FG loops, e.g., amino acid positions 67-75 of SEQ ID NO:1. The term beta strand G, also referred to as sheet G or β4, refers to the amino acids after the FG loop, e.g., amino acid positions 85-94 of SEQ ID NO:1.
[0069] A loop is a less ordered, flexible stretch of amino acids (as compared to alpha helices and beta sheets) that typically connects other structural elements of a protein. In the context of FnIII, the loops are designated by the beta-strands they connect, for example the loop connecting beta-strand A and beta-strand B is the AB loop. The term BC loop refers to the amino acids corresponding to amino acids 22 to 30 of SEQ ID NO:1. The term CD loop refers to the amino acids corresponding to amino acids 39 to 45 of SEQ ID NO:1. The term DE loop refers to the amino acids corresponding to amino acids 51 to 55 of SEQ ID NO:1. The term FG loop refers to the amino acids corresponding to amino acids 76 to 87 of SEQ ID NO:1. The term "non-loop region" refers to parts of the polypeptide sequence that do not form a loop, which include, but are not limited to, the beta sheets and beta strands. In the context of FnIII, the non-loop regions include beta strands A, B, C, D, E, F and G.
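The residue ranges given in the two preceding paragraphs for SEQ ID NO:1 can be collected into a small lookup table. The Python sketch below is illustrative only; the names REGIONS and regions_at are hypothetical. Because the stated ranges overlap at some boundaries (for example, position 39 falls in both beta strand C and the CD loop), the lookup returns a list of every region containing a position rather than a single label. Regions for which no residue range is stated (strands A, B and E, and the AB and EF loops) are omitted.

```python
# Inclusive residue ranges (SEQ ID NO:1 numbering) as stated in the text.
REGIONS = {
    "BC loop": (22, 30),
    "beta strand C": (31, 39),
    "CD loop": (39, 45),
    "beta strand D": (44, 51),
    "DE loop": (51, 55),
    "beta strand F": (67, 75),
    "FG loop": (76, 87),
    "beta strand G": (85, 94),
}


def regions_at(position):
    """Return all listed regions whose range includes the given position."""
    return [
        name
        for name, (start, end) in REGIONS.items()
        if start <= position <= end
    ]


in_bc = regions_at(25)       # a position inside the BC loop only
at_boundary = regions_at(39)  # a position shared by two stated ranges
```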
[0070] The term "library" refers to a collection (e.g., to a plurality) of polypeptides having different amino acid sequences and different protein binding properties. In some embodiments there is a variant FnIII domain library comprising polypeptides having different variations of the FnIII domain. Unless otherwise noted, the library is an actual physical library of polypeptides or nucleic acids encoding the polypeptides. In further embodiments, there is a database that comprises information about a library that has been generated or a theoretical library that can be generated. This information may be a compound database comprising descriptions or structures of a plurality of potential variant FnIII domains.
[0071] The term "specifically binds" or "specific binding" refers to the measurable and reproducible ability of an FnIII domain variant to bind another molecule (such as a target), that is determinative of the presence of the target molecule in the presence of a heterogeneous population of molecules including biological molecules. For example, an FnIII
domain variant that specifically or preferentially binds to a target is a polypeptide that binds this target with greater affinity, avidity, more readily, and/or with greater duration than it binds to most or all other molecules. "Specific binding" does not necessarily require (although it can include) exclusive binding.
[0072] A polypeptide that specifically binds to a target does so with an affinity of at least 1 x 10^-6 M at room temperature under physiological salt and pH conditions, as measured by surface plasmon resonance. An example of such a measurement is provided in the Example section.
[0073] The term "target" refers to a peptide, antigen or epitope that specifically binds to an FnIII-based binding molecule or monobody described herein. Targets include, but are not limited to, epitopes present on proteins, peptides, carbohydrates, and/or lipids.
[0074] The term "non-natural amino acid residue" refers to an amino acid residue that is not present in the naturally occurring FnIII domain in a mammal, such as a human.
[0075] The terms "tag", "epitope tag" or "affinity tag" are used interchangeably herein, and usually refer to a molecule or domain of a molecule that is specifically recognized by an antibody or other binding partner. The term also refers to the binding partner complex as well.
Thus, for example, biotin and a biotin/avidin complex are both regarded as an affinity tag. In addition to epitopes recognized in epitope/antibody interactions, affinity tags also comprise "epitopes" recognized by other binding molecules (e.g., ligands bound by receptors), ligands bound by other ligands to form heterodimers or homodimers, His6 bound by Ni-NTA, biotin bound by avidin, streptavidin, or anti-biotin antibodies, and the like.
[0076] The term "conjugate" in the context of an FnIII domain variant refers to a chemical linkage between the FnIII domain variant and a non-FnIII domain variant. It is specifically contemplated that this excludes a regular peptide bond found between amino acid residues under physiologic conditions in some embodiments of the invention.
[0077] The terms "inhibiting," "reducing," or "preventing," or any variation of these terms, when used in the claims and/or the specification includes any measurable decrease or complete inhibition to achieve a desired result.
[0078] As used herein the term "cradle molecule" or "FnIII-based cradle molecule" refers to an FnIII domain that has been altered to contain one or more modifications in at least one beta strand and at least one loop region, wherein the loop region is a top loop region selected from the group consisting of BC, DE, and FG. In one embodiment, the cradle molecule refers to an FnIII domain that has been altered to contain one or more modifications in at least one beta strand and at least one loop region, wherein the loop region is a bottom loop region selected from the group consisting of AB, CD, and EF. In one embodiment, the cradle molecule refers to an FnIII domain that has been altered to contain one or more modifications in at least one beta strand and at least one top loop region selected from the group consisting of BC, DE, and FG and at least one bottom loop region selected from the group consisting of AB, CD, and EF.
It is understood that not all three loops from the top or bottom region need to be used for binding the target molecule. In one embodiment, the cradle molecule refers to an FnIII domain that has been altered to contain one or more modifications in at least one beta strand and the top FG loop region and the bottom CD loop region.
[0079] In one embodiment, the cradle molecule refers to an FnIII domain that has been altered to contain one or more modifications in at least one beta strand selected from the group consisting of sheet A, sheet B, sheet C, sheet D, sheet E, sheet F and sheet G
and at least one loop region, wherein the loop region is a top loop region selected from the group consisting of BC, DE, and FG. In one embodiment, the cradle molecule refers to an FnIII
domain that has been altered to contain one or more modifications in at least one beta strand selected from the group consisting of sheet A, sheet B, sheet C, sheet D, sheet E, sheet F and sheet G and at least one loop region, wherein the loop region is a bottom loop region selected from the group consisting of AB, CD, and EF. In one embodiment, the cradle molecule refers to an FnIII domain that has been altered to contain one or more modifications in at least one beta strand selected from the group consisting of sheet A, sheet B, sheet C, sheet D, sheet E, sheet F and sheet G and at least one top loop region selected from the group consisting of BC, DE, and FG and at least one bottom loop region selected from the group consisting of AB, CD, and EF. In one embodiment, the cradle molecule refers to an FnIII domain that has been altered to contain one or more modifications in beta strand C and the top FG loop region and the bottom CD loop region. In one embodiment, the cradle molecule refers to an FnIII domain that has been altered to contain one or more modifications in beta strand F and the top FG loop region and the bottom CD loop region.
[0080] In a further embodiment, two or more cradle molecules are linked together. Such molecules are referred to herein as "multispecific cradle molecules".
[0081] The cradle molecules can be linked together (e.g., in a pearl-like fashion) to form a multispecific cradle molecule that comprises, for example, at least two cradle molecules that are linked together. In some embodiments, this multispecific cradle molecule binds to different target regions of the same target molecule (e.g., Target A). For example, one cradle molecule of the multispecific cradle molecule can bind to a first target region of Target A and another cradle molecule of the multispecific cradle molecule can bind to a second target region of Target A.
This can be used to increase avidity of the multispecific cradle molecule for the target molecule.
In another embodiment, the multispecific cradle molecule binds to multiple target molecules.
For example, one cradle molecule of the multispecific cradle molecule can bind to Target A and another cradle molecule of the multispecific cradle molecule can bind to Target B (e.g., a half-life extender). In yet another embodiment, the multispecific cradle molecule comprises at least two cradle molecules that bind to different target regions of Target A and at least two cradle molecules that bind to different target regions of Target B. The skilled artisan will appreciate that any number of cradle molecules can be linked in this fashion to create a multispecific cradle molecule that is able to bind to different target regions of the same target molecule or different target molecules. In one embodiment, the C-terminal region of one cradle molecule is linked to the N-terminal region of another cradle molecule.
[0082] The term "complementarity determining region (CDR)" refers to a hypervariable loop from an antibody variable domain or from a T-cell receptor. The positions of CDRs within an antibody variable region have been defined (see, e.g., Kabat, E.A., et al., Sequences of Proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, 1991; MacCallum et al., J. Mol. Biol. 262, 732-745, 1996; Al-Lazikani et al., J. Mol. Biol. 273, 927-948, 1997; Lefranc et al., Dev. Comp. Immunol. 27(1):55-77, 2003; Honegger and Plückthun, J. Mol. Biol. 309(3):657-70, 2001; and Chothia, C. et al., J. Mol. Biol. 196:901-917, 1987, which are incorporated herein by reference).
[0083] The term "non-FnIII moiety" refers to a biological or chemical entity that imparts additional functionality to a molecule to which it is attached. In a particular embodiment, the non-FnIII moiety is a polypeptide, e.g., human serum albumin (HSA), or a chemical entity, e.g., polyethylene glycol (PEG), which increases the half-life of the FnIII-based binding molecule in vivo.
[0084] The term "cradle library" refers to an FnIII polypeptide library in which amino acid diversity in at least one beta strand and at least one top loop selected from the group consisting of BC, DE, and FG and/or at least one bottom loop selected from the group consisting of AB, CD, and EF loop regions is determined by or reflects the amino acid variants present in a collection of known FnIII sequences.
[0085] The term "universal N+/- binding library" or "N+/- libraries" refers to a more sophisticated or fine-tuned library in which the most frequent amino acids surrounding a fixed amino acid are determined in the library design. These N+/- libraries are constructed with variations in the beta strands (e.g., sheet A, sheet B, sheet C, sheet D, sheet E, sheet F and sheet G, in particular sheets C and F), the bottom loops (AB, CD, and EF), the top loops (BC, DE, and FG), or any combination of the beta strands (e.g., sheet A, sheet B, sheet C, sheet D, sheet E, sheet F and sheet G, in particular sheets C and F) and top and bottom loops. For "N+/- libraries," N is the most predominant amino acid at a particular position and amino acids upstream or downstream are designated +N or -N, respectively. For example, N+3 is an amino acid 3 positions upstream of N, while N-3 is an amino acid 3 positions downstream of N in a 3D structure of FnIII.
Likewise, N+2 and N+1 are amino acids at positions 2 and 1 upstream of N, respectively, while N-2 and N-1 are amino acids at positions 2 and 1 downstream of N, respectively. By altering N
from the most predominantly abundant amino acid to a less abundant amino acid, the effect of that modification can be assessed on the abundance of amino acids at 1, 2, or 3 positions away from N. In designing such a library, the frequency and abundance of amino acids surrounding the fixed N position are determined. These differences can be used to generate universal fibronectin bottom-side binding domain libraries, top-side binding domain libraries, or a combination of both bottom-side and top-side binding domain libraries.
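The N+/- analysis described above amounts to fixing the residue at one position and tallying residue frequencies at neighboring positions. The following Python sketch illustrates that tally on a toy alignment. The function names and the sequences are hypothetical, and the offsets are plain index offsets along the aligned sequence; mapping them onto the +N and -N designations used in the text is left to the reader and is not implied by the code.

```python
from collections import Counter


def predominant_residue(sequences, pos):
    """Most frequent residue at a 0-based position of aligned sequences."""
    return Counter(seq[pos] for seq in sequences).most_common(1)[0][0]


def neighbor_frequencies(sequences, pos, fixed_residue, max_offset=3):
    """Residue frequencies at index offsets 1..max_offset on either side of
    `pos`, computed only over sequences carrying `fixed_residue` at `pos`."""
    selected = [s for s in sequences if s[pos] == fixed_residue]
    result = {}
    for offset in range(-max_offset, max_offset + 1):
        idx = pos + offset
        if offset == 0 or idx < 0:
            continue
        column = [s[idx] for s in selected if idx < len(s)]
        if not column:
            continue
        counts = Counter(column)
        result[offset] = {aa: n / len(column) for aa, n in counts.items()}
    return result


# Hypothetical aligned loop sequences, for illustration only.
seqs = ["GSYAT", "GSYST", "GTYAT", "GSWAT"]
n_residue = predominant_residue(seqs, 2)
freqs_predominant = neighbor_frequencies(seqs, 2, n_residue, max_offset=1)
freqs_alternative = neighbor_frequencies(seqs, 2, "W", max_offset=1)
```

Comparing freqs_predominant with freqs_alternative shows how the neighboring distributions change when the fixed position is switched from the predominant residue to a less abundant one.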
[0086] The term "conserved amino acid residue" or "fixed amino acid" refers to an amino acid residue determined to occur with a frequency that is high, typically at least 50% or more (e.g., at about 60%, 70%, 80%, 90%, 95%, or 100%), for a given residue position. When a given residue is determined to occur at such a high frequency, i.e., above a threshold of about 50%, it may be determined to be conserved and thus represented in the libraries of the invention as a "fixed" or "constant" residue, at least for that amino acid residue position in the loop region being analyzed.
[0087] The term "semi-conserved amino acid residue" refers to amino acid residues determined to occur with a high frequency, for 2 to 3 residues, at a given residue position. When 2-3 residues, preferably 2 residues, together are represented at a frequency of about 40% or higher (e.g., 50%, 60%, 70%, 80%, 90% or higher), the residues are determined to be semi-conserved and are thus represented in the libraries of the invention as "semi-fixed," at least for that amino acid residue position in the loop region being analyzed.
Typically, an appropriate level of nucleic acid mutagenesis/variability is introduced for a semi-conserved amino acid (codon) position such that the 2 to 3 residues are properly represented.
Thus, each of the 2 to 3 residues can be said to be "semi-fixed" for this position. A "selected semi-conserved amino acid residue" is a selected one of the 2 or more semi-conserved amino acid residues, typically, but not necessarily, the residue having the highest occurrence frequency at that position.
[0088] The term "variable amino acid residue" refers to amino acid residues determined to occur with a lower frequency (less than 20%) for a given residue position.
When many residues appear at a given position, the residue position is determined to be variable and thus represented in the libraries of the invention as variable at least for that amino acid residue position in the loop region being analyzed. Typically, an appropriate level of nucleic acid mutagenesis/variability is introduced for a variable amino acid (codon) position such that an accurate spectrum of residues is properly represented. Of course, it is understood that, if desired, the consequences or variability of any amino acid residue position, i.e., conserved, semi-conserved, or variable, can be represented, explored or altered using, as appropriate, any of the mutagenesis methods disclosed herein. A lower threshold frequency of occurrence of variable amino acids may be, for example, 5-10% or lower. Below this threshold, variable amino acids may be omitted from the natural-variant amino acids at that position.
[0089] The term "natural-variant amino acids" includes conserved, semi-conserved, and variable amino acid residues observed, in accordance with their occurrence frequencies, at a given position in a selected loop of a selected length. The natural-variant amino acids may be substituted by chemically equivalent amino acids, and may exclude variable amino acid residues below a selected occurrence frequency, e.g., 5-10%, or amino acid residues that are chemically equivalent to other natural-variant amino acids.
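The conserved, semi-conserved, and variable categories defined above can be illustrated with a short classifier. This is a sketch under stated assumptions rather than the disclosed procedure: the function name `classify_position`, the order in which the rules are applied, the use of exactly two residues for the semi-conserved test, and the default thresholds (50%, 40% combined, and a 5% floor) are illustrative choices drawn from the example values in the definitions.

```python
def classify_position(freqs, conserved=0.50, semi=0.40, floor=0.05):
    """Classify one residue position from {residue: frequency}.

    Returns (label, residues). Rule order is an assumption:
    conserved is tested first, then semi-conserved, else variable.
    """
    ranked = sorted(freqs.items(), key=lambda kv: kv[1], reverse=True)
    if ranked and ranked[0][1] >= conserved:
        return "conserved", [ranked[0][0]]
    if len(ranked) >= 2 and ranked[0][1] + ranked[1][1] >= semi:
        return "semi-conserved", [ranked[0][0], ranked[1][0]]
    # Variable position: keep only residues at or above the floor frequency.
    kept = [aa for aa, f in ranked if f >= floor]
    return "variable", kept

# Hypothetical frequency tables for three positions.
label_a, res_a = classify_position({"G": 0.8, "A": 0.2})
label_b, res_b = classify_position({"A": 0.45, "S": 0.30, "T": 0.25})
label_c, res_c = classify_position(
    {"A": 0.15, "S": 0.15, "T": 0.15, "G": 0.15,
     "D": 0.15, "E": 0.15, "K": 0.07, "R": 0.03}
)
```

With these defaults, a position at which every residue occurs below 20% cannot reach the 40% combined threshold with two residues, so it falls through to the variable category, in line with the definitions above.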
[0090] The term "library of mutagenesis sequences" refers to a library of sequences within a selected FnIII loop and loop length which is expressed by a library of coding sequences that encode, at each loop position, a conserved or selected semi-conserved consensus amino acid and, if the consensus amino acid has an occurrence frequency equal to or less than a selected threshold frequency of at least 50%, a single common target amino acid and any co-produced amino acids. Thus, for each target amino acid, the library of sequences within a given loop will contain the target amino acid at all combinations of one to all positions within the loop at which the consensus amino acid has an occurrence frequency equal to or less than the given threshold frequency. If this threshold frequency is set at 100%, each position in the loop will contain the target amino acid in at least one library member. The "library of mutagenesis sequences" can be generated from the Tables and Figures disclosed herein using commercial vendors such as Geneart or DNA2.0.
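The combinatorial rule in this definition (the target amino acid placed at all combinations of one to all eligible positions) can be sketched at the protein level as follows. The function name `mutagenesis_library`, the toy loop, and its frequencies are hypothetical, and co-produced amino acids arising from codon choice are deliberately ignored in this simplified illustration.

```python
from itertools import combinations

def mutagenesis_library(consensus, frequencies, target, threshold=0.5):
    """Enumerate loop sequences carrying `target` at every combination of
    one to all positions whose consensus frequency is <= threshold."""
    eligible = [i for i, f in enumerate(frequencies) if f <= threshold]
    library = set()
    for r in range(1, len(eligible) + 1):
        for combo in combinations(eligible, r):
            seq = list(consensus)
            for i in combo:
                seq[i] = target
            library.add("".join(seq))
    return sorted(library)

# Hypothetical 5-residue loop with per-position consensus frequencies.
toy_loop = "GRGDS"
toy_freqs = [0.9, 0.3, 0.8, 0.4, 0.2]
members = mutagenesis_library(toy_loop, toy_freqs, target="K")
```

Three positions fall at or below the 50% threshold in this toy case, so the library holds 2^3 - 1 = 7 members. Setting `threshold=1.0` makes every position eligible, matching the 100% case described in the text.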
[0091] The term "library of natural-variant combinatorial sequences" refers to a library of sequences within a selected FnIII beta strand and FnIII loop which is expressed by a library of coding sequences that encode at each loop position, a conserved or selected semi-conserved consensus amino acid and, if the consensus amino acid has a frequency of occurrence equal to or less than a selected threshold frequency of at least 50%, other natural variant amino acids, including semi-conserved amino acids and variable amino acids whose occurrence rate is above a selected minimum threshold occurrence at that position, or their chemical equivalents. Thus, for each amino acid position in a selected beta strand or loop, the library of natural variant combinatorial sequences will contain the consensus amino acid at that position plus other amino acid variants identified as having at least some minimum frequency at that position, e.g., at least 5-10% frequency, or chemically equivalent amino acids. In addition, natural variants may be substituted or dropped if the coding sequence for that amino acid produces a significant number of co-produced amino acids, via codon degeneracy.
[0092] The term "variability profile" or "VP" refers to the cataloguing of amino acids and their respective frequency rates of occurrence present at a particular beta strand or loop position.
The beta strand and loop positions are derived from an aligned fibronectin dataset.
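A variability profile of this kind can be computed directly from an alignment. The sketch below assumes equal-length, pre-aligned sequences in one-letter code; the function name `variability_profile` and the toy data are hypothetical, and gap characters, if present, are simply counted as another symbol.

```python
from collections import Counter

def variability_profile(aligned):
    """Return a list with one {residue: frequency} dict per aligned position."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for i in range(length):
        counts = Counter(s[i] for s in aligned)
        profile.append({aa: c / len(aligned) for aa, c in counts.items()})
    return profile

# Hypothetical toy alignment of a 4-residue loop.
toy_loops = ["GRGD", "GKGD", "GRGE", "GRAD"]
vp = variability_profile(toy_loops)
```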
[0093] Other objects, features and advantages of the present invention will become apparent from the following detailed description. It should be understood, however, that the detailed description and the specific examples, while indicating specific embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.
II. FnIII Polypeptides
[0094] Fibronectin Type III (FnIII) polypeptides refer to a group of proteins composed of monomeric subunits having an FnIII structure or motif made up of seven beta strands with six connecting loops (three at the top and three at the bottom). Beta strands A, B, and E form one half of the beta sandwich and beta strands C, D, F, and G form the other half; the domain has a length of about 94 amino acids and a molecular weight of about 10 kDa. The overall fold of the FnIII domain is closely related to that of the immunoglobulin domains, and the three loops near the N-terminus of FnIII, named BC, DE, and FG, can be considered structurally analogous to the antibody variable heavy (VH) domain complementarity-determining regions, CDR1, CDR2, and CDR3, respectively. The top and bottom loops of FnIII have typically been thought to confer structural stability rather than being used for binding targets.
However, the methods of the invention demonstrate that the top and bottom loops, as well as the beta sheets, can indeed be used for binding targets. Libraries of FnIII binding molecules can also be generated that use the top loops, the bottom loops or any combination of the top and bottom loops and the surface exposed residues of the beta-sheets for binding.
[0095] In one embodiment, the FnIII polypeptide is FnIII10 with the following amino acid sequence:
Val Ser Asp Val Pro Arg Asp Leu Glu Val Val Ala Ala Thr Pro Thr Ser Leu Leu Ile Ser Trp Asp Ala Pro Ala Val Thr Val Arg Tyr Tyr Arg Ile Thr Tyr Gly Glu Thr Gly Gly Asn Ser Pro Val Gln Glu Phe Thr Val Pro Gly Ser Lys Ser Thr Ala Thr Ile Ser Gly Leu Lys Pro Gly Val Asp Tyr Thr Ile Thr Val Tyr Ala Val Thr Gly Arg Gly Asp Ser Pro Ala Ser Ser Lys Pro Ile Ser Ile Asn Tyr Arg Thr (SEQ ID NO:1).
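For downstream analysis it is often convenient to hold this sequence in one-letter code. The following sketch converts the three-letter listing of SEQ ID NO:1 given above; the names `THREE_TO_ONE`, `to_one_letter`, and `SEQ_ID_NO_1` are illustrative choices, and only the 20 standard amino acids are handled.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

def to_one_letter(three_letter_seq):
    """Convert a whitespace-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[token] for token in three_letter_seq.split())

# SEQ ID NO:1 as listed in the text.
SEQ_ID_NO_1 = to_one_letter("""
Val Ser Asp Val Pro Arg Asp Leu Glu Val Val Ala Ala Thr Pro Thr Ser Leu Leu Ile
Ser Trp Asp Ala Pro Ala Val Thr Val Arg Tyr Tyr Arg Ile Thr Tyr Gly Glu Thr Gly
Gly Asn Ser Pro Val Gln Glu Phe Thr Val Pro Gly Ser Lys Ser Thr Ala Thr Ile Ser
Gly Leu Lys Pro Gly Val Asp Tyr Thr Ile Thr Val Tyr Ala Val Thr Gly Arg Gly Asp
Ser Pro Ala Ser Ser Lys Pro Ile Ser Ile Asn Tyr Arg Thr
""")
```

An unrecognized token raises a `KeyError`, which is a useful guard against transcription errors in a long listing.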
[0096] In another embodiment, the FnIII polypeptide is FnIII7 with the following amino acid sequence:
Pro Leu Ser Pro Pro Thr Asn Leu His Leu Glu Ala Asn Pro Asp Thr Gly Val Leu Thr Val Ser Trp Glu Arg Ser Thr Thr Pro Asp Ile Thr Gly Tyr Arg Ile Thr Thr Thr Pro Thr Asn Gly Gln Gln Gly Asn Ser Leu Glu Glu Val Val His Ala Asp Gln Ser Ser Cys Thr Phe Asp Asn Leu Ser Pro Gly Leu Glu Tyr Asn Val Ser Val Tyr Thr Val Lys Asp Asp Lys Glu Ser Val Pro Ile Ser Asp Thr Ile Ile Pro (SEQ ID NO: 97)
[0097] In another embodiment, the FnIII polypeptide or scaffold is FnIII14 with the following amino acid sequence:
Asn Val Ser Pro Pro Arg Arg Ala Arg Val Thr Asp Ala Thr Glu Thr Thr Ile Thr Ile Ser Trp Arg Thr Lys Thr Glu Thr Ile Thr Gly Phe Gln Val Asp Ala Val Pro Ala Asn Gly Gln Thr Pro Ile Gln Arg Thr Ile Lys Pro Asp Val Arg Ser Tyr Thr Ile Thr Gly Leu Gln Pro Gly Thr Asp Tyr Lys Ile Tyr Leu Tyr Thr Leu Asn Asp Asn Ala Arg Ser Ser Pro Val Val Ile Asp Ala Ser Thr (SEQ ID NO: 129)
III. FnIII Cradle Molecules
[0098] The present invention pertains to methods and compositions for generating FnIII
cradle molecules and libraries containing the same.
[0099] The cradle molecules included in the methods set forth herein are variants in that they comprise a wild type FnIII domain that has been altered by substitution, insertion and/or deletion of one or more amino acids. The cradle molecules set forth herein may demonstrate a selective and/or specific binding affinity for particular target molecules or portions thereof.
[0100] In some embodiments, the cradle molecule is a fusion polypeptide that includes a variant FnIII domain linked at the N- or C-terminus to a second peptide or polypeptide. In other embodiments, the cradle molecule comprises a linker interposed between the variant FnIII
domain and the second peptide or polypeptide sequence. Linkers are discussed in greater detail in the specification below.
[0101] Furthermore, the cradle molecules set forth herein may comprise a sequence of any number of additional amino acid residues at either the N-terminus or C-terminus of the amino acid sequence that includes the variant FnIII domain. For example, there may be an amino acid sequence of about 3 to about 1,000 or more amino acid residues at either the N-terminus, the C-terminus, or both the N-terminus and C-terminus of the amino acid sequence that includes the variant FnIII domain.
[0102] The cradle molecule may include the addition of an antibody epitope or other tag, to facilitate identification, targeting, and/or purification of the polypeptide.
The use of 6xHis and GST (glutathione S transferase) as tags is well known. Inclusion of a cleavage site at or near the fusion junction will facilitate removal of the extraneous cradle molecule after purification.
Other amino acid sequences that may be included in the cradle molecule include functional domains, such as active sites from enzymes such as a hydrolase, glycosylation domains, cellular targeting signals or transmembrane regions. The cradle molecule may further include one or more additional tissue-targeting moieties.
[0103] Cradle molecules may possess deletions and/or substitutions of amino acids relative to the native sequence. Sequences with amino acid substitutions are contemplated, as are sequences with a deletion, and sequences with a deletion and a substitution.
In some embodiments, these cradle molecules may further include insertions or added amino acids.
[0104] Substitutional or replacement variants typically contain the exchange of one amino acid for another at one or more sites within the protein and may be designed to modulate one or more properties of the cradle molecule, particularly to increase its efficacy or specificity.
Substitutions of this kind may or may not be conservative substitutions. A conservative substitution is one in which an amino acid is replaced with one of similar shape and charge. Because the libraries of variant FnIII domains serve to provide a diversity of amino acid sequences and binding selectivity, conservative substitutions are not required. However, if used, conservative substitutions are well known in the art and include, for example, the changes of: alanine to serine; arginine to lysine; asparagine to glutamine or histidine; aspartate to glutamate; cysteine to serine; glutamine to asparagine; glutamate to aspartate; glycine to proline; histidine to asparagine or glutamine; isoleucine to leucine or valine; leucine to valine or isoleucine; lysine to arginine; methionine to leucine or isoleucine; phenylalanine to tyrosine, leucine or methionine; serine to threonine; threonine to serine; tryptophan to tyrosine; tyrosine to tryptophan or phenylalanine; and valine to isoleucine or leucine. Changes other than those discussed above are generally considered not to be conservative substitutions. It is specifically contemplated that one or more of the conservative substitutions above may be included as embodiments. In other embodiments, such substitutions are specifically excluded. Furthermore, in additional embodiments, substitutions that are not conservative are employed in the variants.
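The conservative changes enumerated above can be captured in a simple lookup table. The sketch below transcribes that list into one-letter code; the names `CONSERVATIVE` and `is_conservative` are illustrative. The table is directional, exactly as the list is written (for example, alanine to serine is listed, but serine to alanine is not), and proline has no entry because the list gives none.

```python
# Conservative changes as enumerated in the text (original -> allowed replacements).
CONSERVATIVE = {
    "A": "S",   "R": "K",   "N": "QH",  "D": "E",  "C": "S",
    "Q": "N",   "E": "D",   "G": "P",   "H": "NQ", "I": "LV",
    "L": "VI",  "K": "R",   "M": "LI",  "F": "YLM", "S": "T",
    "T": "S",   "W": "Y",   "Y": "WF",  "V": "IL",
}

def is_conservative(original, replacement):
    """True if original -> replacement appears in the enumerated list."""
    return replacement in CONSERVATIVE.get(original, "")
```

A library design that excludes conservative substitutions, as contemplated in some embodiments, could filter candidate changes with `not is_conservative(a, b)`.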
[0105] In addition to a deletion or substitution, the cradle molecules may possess an insertion of one or more residues.
[0106] The variant FnIII domain may be structurally equivalent to the native counterparts.
For example, the variant FnIII domain forms the appropriate structure and conformation for binding targets, proteins, or peptide segments.
[0107] The following is a discussion based upon changing of the amino acids of a cradle molecule to create a library of cradle molecules or a second-generation cradle molecule. For example, certain amino acids may be substituted for other amino acids in a cradle molecule without appreciable loss of function, such as ability to interact with a target peptide sequence.
Since it is the interactive capacity and nature of a cradle molecule that defines that cradle molecule's functional activity, certain amino acid substitutions can be made in a cradle molecule sequence and nevertheless produce a cradle molecule with like properties.
[0108] In making such changes, the hydropathic index of amino acids may be considered.
The importance of the hydropathic amino acid index in conferring interactive function on a protein is generally understood in the art (Kyte and Doolittle, J Mol Biol.
(1982) 157(1):105-32). It is accepted that the relative hydropathic character of the amino acid contributes to the secondary structure of the resultant protein, which in turn defines the interaction of the protein with other molecules, for example, enzymes, substrates, receptors, DNA, antibodies, antigens, and the like.
[0109] It also is understood in the art that the substitution of like amino acids can be made effectively on the basis of hydrophilicity. U.S. Patent No. 4,554,101, incorporated herein by reference, states that the greatest local average hydrophilicity of a protein, as governed by the hydrophilicity of its adjacent amino acids, correlates with a biological property of the protein.
As detailed in U.S. Patent No. 4,554,101, the following hydrophilicity values have been assigned to amino acid residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0 ± 1); glutamate (+3.0 ± 1); serine (+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0); threonine (-0.4); proline (-0.5 ± 1); alanine (-0.5); histidine (-0.5); cysteine (-1.0); methionine (-1.3); valine (-1.5); leucine (-1.8); isoleucine (-1.8); tyrosine (-2.3); phenylalanine (-2.5); tryptophan (-3.4).
[0110] It is understood that an amino acid can be substituted for another having a similar hydrophilicity value and still produce a biologically equivalent and immunologically equivalent protein. In such changes, the substitution of amino acids whose hydrophilicity values are within ±2 is preferred, those that are within ±1 are particularly preferred, and those within ±0.5 are even more particularly preferred.
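The preference tiers described above can be expressed as a small helper. The sketch uses the hydrophilicity values listed in the preceding paragraph; for the entries given with a tolerance (aspartate, glutamate, proline), only the central value is used, which is a simplifying assumption. The names `HYDROPHILICITY` and `preference_tier` are illustrative.

```python
# Hydrophilicity values as listed in the text (central values only).
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3,
    "N": 0.2, "Q": 0.2, "G": 0.0, "T": -0.4, "P": -0.5,
    "A": -0.5, "H": -0.5, "C": -1.0, "M": -1.3, "V": -1.5,
    "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}

def preference_tier(a, b):
    """Return the tightest window (0.5, 1.0 or 2.0) containing the
    hydrophilicity difference between residues a and b, or None."""
    diff = abs(HYDROPHILICITY[a] - HYDROPHILICITY[b])
    for window in (0.5, 1.0, 2.0):
        if diff <= window:
            return window
    return None
```

For example, leucine and isoleucine share the same value and fall in the tightest window, while arginine and tryptophan differ by far more than 2 and return `None`.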
[0111] As outlined above, amino acid substitutions generally are based on the relative similarity of the amino acid side-chain substituents, for example, their hydrophobicity, hydrophilicity, charge, size, and the like. However, in some aspects a non-conservative substitution is contemplated. In some embodiments a random substitution is also contemplated.
Exemplary substitutions that take into consideration the various foregoing characteristics are well known to those of skill in the art and include: arginine and lysine;
glutamate and aspartate;
serine and threonine; glutamine and asparagine; and valine, leucine and isoleucine.
[0112] FnIII polypeptides can be modified by inserting or deleting 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20 or more amino acids, or any range derivable therein, in an FnIII loop or beta strand.
Variants of the loop region are discussed in U.S. Patent No. 6,673,901 and U.S. Patent Publication No. 20110038866, which are hereby incorporated by reference.
[0113] In some embodiments the one or more amino acid substitution in beta strand C may be one or more amino acid substitution corresponding to position 30, 31, 32, 33, 34, 35, 36, 37, 38 and/or 39 of SEQ ID NO:1. In some embodiments, the amino acid substitution in beta strand C may correspond to position 31, 33, 35, 37, 38 and/or 39 of SEQ ID NO:1. In some embodiments, the amino acid substitution in beta strand C may correspond to position 31 and/or 33 of SEQ ID NO:1.
[0114] In some embodiments the one or more amino acid substitution in CD loop may be one or more amino acid substitution corresponding to position 40, 41, 42, 43, 44 and/or 45 of SEQ ID NO:1.
[0115] In some embodiments, the one or more amino acid substitution in beta strand D may be one or more amino acid substitution corresponding to position 44, 45, 46, 47, 48, 49, 50 or 51 of SEQ ID NO:1. In some embodiments, the amino acid substitution in beta strand D may correspond to position 44, 45, 47, or 49 of SEQ ID NO:1.
[0116] In still a further aspect, the one or more amino acid substitution in beta strand F may be one or more amino acid substitution corresponding to position 67, 68, 69, 70, 71, 72, 73, 74, 75 and/or 76 of SEQ ID NO:1. In some embodiments, the amino acid substitution in beta strand F may correspond to position 67, 69, 71, 73 and/or 76 of SEQ ID NO:1. In some embodiments, the amino acid substitution in beta strand F may correspond to position 71, 73, 75 and/or 76 of SEQ ID NO:1.
[0117] In some embodiments the one or more amino acid substitution in FG loop may be one or more amino acid substitution corresponding to position 76, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86 of SEQ ID NO: 1.
[0118] In some embodiments, the one or more amino acid substitution in beta strand G may be one or more amino acid substitution corresponding to position 85, 86, 87, 88, 89, 90, 91, 92, 93, or 94 of SEQ ID NO:1. In some embodiments, the amino acid substitution in beta strand G may correspond to position 84 or 85 of SEQ ID NO:1.
[0119] The cradle molecules can include amino acid substitutions corresponding to one or more of positions 31, 33, 47, 49, 73, and/or 75 of SEQ ID NO:1. In some embodiments, the cradle molecule may further comprise an amino acid substitution corresponding to amino acid position 30 of SEQ ID NO:1. In some embodiments, the cradle molecule may comprise one or more amino acid substitution corresponding to amino acid position 41, 42, 43, 44, or 45 of SEQ ID NO:1. The cradle molecule can further comprise one or more amino acid substitution corresponding to amino acid position 76, 77, 78, 79, 80, 81, 82, 83, 84, or 85 of SEQ ID NO:1. In one embodiment, the substitution may be in at least one beta strand.
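The recited positions can be mapped onto SEQ ID NO:1 with a short helper. The sketch below holds SEQ ID NO:1 in one-letter code (written in blocks of ten residues for easy checking against the three-letter listing) and uses 1-based numbering as in the text. The function names `residues_at` and `substitute`, and the alanine replacement in the example, are hypothetical.

```python
# SEQ ID NO:1 in one-letter code, in blocks of ten residues.
SEQ1 = (
    "VSDVPRDLEV" "VAATPTSLLI" "SWDAPAVTVR" "YYRITYGETG" "GNSPVQEFTV"
    "PGSKSTATIS" "GLKPGVDYTI" "TVYAVTGRGD" "SPASSKPISI" "NYRT"
)

def residues_at(positions, seq=SEQ1):
    """Map 1-based positions to the residues found there."""
    return {p: seq[p - 1] for p in positions}

def substitute(changes, seq=SEQ1):
    """Apply {1-based position: new residue} substitutions and return the variant."""
    out = list(seq)
    for pos, new in changes.items():
        out[pos - 1] = new
    return "".join(out)

# Wild-type residues at the beta strand positions named in the text.
cradle_core = residues_at([31, 33, 47, 49, 73, 75])
# Hypothetical example: replace positions 31 and 33 with alanine.
variant = substitute({31: "A", 33: "A"})
```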
[0120] In some embodiments, the cradle molecule can comprise 1, 2, 3, 4 or more insertions and/or deletions of amino acids corresponding to amino acids of SEQ ID NO:1. Insertions can include, but are not limited to, stretches of poly-serine, poly-alanine, poly-valine, poly-threonine, or polymers of any other of the 20 amino acids, that are subsequently mutagenized or diversified for generating a combinatorial cradle molecule library. Diversification of these inserted residues can include alteration to 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, or 19 of the other natural amino acids. In some embodiments 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 or more contiguous amino acids are inserted into one or more of the beta strands (e.g., C and/or F) or the AB, BC, CD, DE, EF, FG loops of an FnIII domain cradle molecule. In some embodiments, the cradle molecule can comprise an insertion, a deletion, or both an insertion and a deletion. The insertion and deletion need not be located at the same position and may be located at sites distal or proximal to each other. The insertion and/or deletion can be in a loop or beta strand of the FnIII domain polypeptide. In some embodiments, at least one loop region of FnIII may comprise an insertion of at least 2 amino acids. In some embodiments, at least one region of FnIII may comprise an insertion of 2 to 25 amino acids in at least one loop region. In some embodiments at least 2, 3, or more loop regions comprise an insertion. In some embodiments, at least one loop region of FnIII may comprise a deletion of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids, including all values and ranges there between. In some embodiments, at least 2, 3, or 4 loop or beta strands, portions, or regions comprise a deletion of at least 1 amino acid. In some embodiments, the cradle molecule may comprise at least one insertion and one deletion in at least one loop and at least one beta strand. In some embodiments, the cradle molecule may comprise an insertion and a deletion in the same loop or beta strand region. In some embodiments, the cradle molecule may comprise at least one insertion or deletion in at least one beta strand. In some embodiments, the cradle molecule may comprise at least one insertion or deletion in at least one beta strand and at least one loop region.
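At the sequence level, the insertions and deletions described in this paragraph amount to simple string operations. The following sketch shows a homopolymer insertion (for example, a poly-serine stretch that could later be diversified) and a deletion, both using 1-based positions. The function names `insert_stretch` and `delete_residues` and the toy sequence are hypothetical.

```python
def insert_stretch(seq, after_position, residue="S", length=2):
    """Insert `length` copies of `residue` after the given 1-based position."""
    if not 0 <= after_position <= len(seq):
        raise ValueError("after_position is outside the sequence")
    return seq[:after_position] + residue * length + seq[after_position:]

def delete_residues(seq, start, count=1):
    """Delete `count` residues beginning at the given 1-based position."""
    if start < 1 or start + count - 1 > len(seq):
        raise ValueError("deletion range is outside the sequence")
    return seq[:start - 1] + seq[start - 1 + count:]

# Hypothetical toy loop, not a real FnIII loop.
toy = "GRGDS"
with_insert = insert_stretch(toy, after_position=3, residue="S", length=2)
with_deletion = delete_residues(toy, start=2, count=2)
```

Because an insertion or deletion shifts the numbering of all later residues, applying several changes from the C-terminal end toward the N-terminal end keeps the original position numbers valid for each step.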
Insertions can include, but are not limited to stretches of poly-serine, poly-alanine, poly-valine, poly-threonine, or polymers of any other of the 20 amino acids, that is subsequently mutagenized or diversified for generating a combinatorial cradle molecule library. Diversification of these inserted residues can include alteration to 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, or 19 of the other natural amino acids. In some embodiments 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 or more contiguous amino acids are inserted into one or more of the beta strands (e.g., C and/or F) or AB, BC, CD, DE, EF, FG
loops of an FnIII
domain cradle molecule. In some embodiments, the cradle molecule can comprise an insertion, a deletion, or both an insertion and a deletion. The insertion and deletion need not be located at the same position and may be located at sites distal or proximal to each other. The insertion and/or deletion can be in a loop or beta strands of the FnIII domain polypeptide. In some embodiments, at least one loop region of FnIII may comprise an insertion of at least 2 amino acids. In some embodiments, at least one region of FnIII may comprise an insertion of 2 to 25 amino acids in at least one loop region. In some embodiments at least 2, 3, or more loop regions comprise an insertion. In some embodiments, the cradle molecule has at least one loop region of FnIII may comprise a deletion of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids, including all values and ranges there between. In some embodiments, at least 2, 3, or 4 loop or beta strands, portions, or regions comprise a deletion of at least 1 amino acid. In some embodiments, the cradle molecule may comprise at least one insertion and one deletion in at least one loop and at least one beta strand. In some embodiments, the cradle molecule may comprise an insertion and a deletion in the same loop or beta strand region.
[0121] In some embodiments, variants in any one or more of positions that correspond with amino acid position 15, 16, 17, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 38, 39, 40, 41, 42, 43, 44, 45, 51, 52, 53, 54, 55, 56, 60, 61, 62, 63, 64, 65, 66, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 93, 95, and/or 96, including all ranges there between, can be specifically included in the claimed embodiments. In other embodiments, variants in any one or more of positions that correspond with amino acid position 15, 16, 17, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 38, 39, 40, 41, 42, 43, 44, 45, 51, 52, 53, 54, 55, 56, 60, 61, 62, 63, 64, 65, 66, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 93, 95, and/or 96, including all ranges there between, can be specifically excluded. In other embodiments, variants in any one or more of positions that correspond with amino acid position 32, 34, 36, 68, 70, 72, 74, and/or 75, including all ranges there between, can be specifically excluded. It will be understood that these recited positions are based on the sequence of the tenth domain in human FnIII (SEQ ID NO:1). In some embodiments, the FnIII
domain may be the 1st, 2nd, 3rd, 4th, 5th, 6th, 7th, 8th, 9th, 10th, 11th, 12th, 13th, 14th, 15th or 16th FnIII domain of human fibronectin. In some embodiments, the FnIII domain may be the 7th, 10th or 14th FnIII domain of human fibronectin. In some embodiments, the FnIII variants in other organisms are also contemplated based on their alignment with human FnIII.
[0122] In some embodiments the one or more amino acid substitution in beta strand C may be one or more amino acid substitution corresponding to position 33, 35, 37, 39, 40 and/or 41 of SEQ ID NO:97. In some embodiments the one or more amino acid substitution in CD loop may be one or more amino acid substitution corresponding to position 42, 43, 44, 45, 46, 47 and/or 48 of SEQ ID NO:97. In some embodiments, the amino acid substitution in beta strand F may correspond to position 70, 72, 74, 76 and/or 79 of SEQ ID NO:97. In some embodiments the one or more amino acid substitution in FG loop may be one or more amino acid substitution corresponding to position 80, 81, 82, 83, 84 and/or 85 of SEQ ID NO:97.
[0123] In some embodiments the one or more amino acid substitution in beta strand C may be one or more amino acid substitution corresponding to position 31, 33, 35, 37 and/or 39 of SEQ ID NO:129. In some embodiments the one or more amino acid substitution in CD loop may be one or more amino acid substitution corresponding to position 40, 41, 42, 43 and/or 44 of SEQ ID NO:129. In some embodiments, the amino acid substitution in beta strand F may correspond to position 66, 68, 70, 72 and/or 75 of SEQ ID NO:129. In some embodiments the one or more amino acid substitution in FG loop may be one or more amino acid substitution corresponding to position 76, 77, 78, 79, 80 and/or 81 of SEQ ID NO:129.
[0124] In other embodiments, variants in any one or more of positions that correspond with amino acid position 34, 36, 38, 71, 73, 75, 77 and/or 78 of SEQ ID NO:97, including all ranges there between, can be specifically excluded. In other embodiments, variants in any one or more of positions that correspond with amino acid position 32, 34, 36, 38, 67, 69, 71, 73 and/or 74 of SEQ ID NO:129, including all ranges there between, can be specifically excluded.
[0125] In some embodiments, one or more of the altered or variant amino acids may correspond to position 30, 31, 33, 49, 47, 75, 76, 84, and/or 85 of SEQ ID NO:1. In some embodiments, the variant FnIII domains may comprise an insertion or deletion of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, or 25 amino acids in at least one beta strand and at least one loop region of FnIII. In some embodiments, the variant FnIII domains may comprise an amino acid insertion in loop CD, FG and/or a combination of CD and FG loops and at least one beta strand with a substitution, deletion or addition. In some embodiments, the variant FnIII domains may comprise an amino acid insertion in loop BC, FG and/or a combination of BC and FG loops and at least one beta strand with a substitution, deletion, or addition. In some embodiments, the polypeptide may be at least 50%, 60%, 70%, 80%, or 90%, including all values and ranges there between, identical to SEQ ID NOs: 1, 97 and 129.
[0126] In some embodiments, FnIII cradle molecules may have at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or more amino acid substitutions that may include, but are not limited to, the following FnIII residue substitutions (corresponding to SEQ ID NO:1): R30A, R30N, R30D, R30C, R30Q, R30E, R30G, R30H, R30I, R30L, R30K, R30M, R30F, R30P, R30S, R30T, R30W, R30Y, R30V, Y31A, Y31R, Y31N, Y31D, Y31C, Y31Q, Y31E, Y31G, Y31H, Y31I, Y31L, Y31K, Y31M, Y31F, Y31P, Y31S, Y31T, Y31W, Y31V, R33A, R33N, R33D, R33C, R33Q, R33E, R33G, R33H, R33I, R33L, R33K, R33M, R33F, R33P, R33S, R33T, R33W, R33Y, R33V, T35A, T35R, T35N, T35D, T35C, T35Q, T35E, T35G, T35H, T35I, T35L, T35K, T35M, T35F, T35P, T35S, T35W, T35Y, T35V, G37A, G37N, G37R, G37D, G37C, G37Q, G37E, G37H, G37I, G37L, G37K, G37M, G37F, G37P, G37S, G37T, G37W, G37Y, G37V, E38A, E38N, E38R, E38D, E38C, E38Q, E38G, E38H, E38I, E38L, E38K, E38M, E38F, E38P, E38S, E38T, E38W, E38Y, E38V, T39A, T39N, T39R, T39D, T39C, T39Q, T39E, T39G, T39H, T39I, T39L, T39K, T39M, T39F, T39P, T39S, T39W, T39Y, T39V, G40A, G40N, G40R, G40D, G40C, G40Q, G40E, G40H, G40I, G40L, G40K, G40M, G40F, G40P, G40S, G40T, G40W, G40Y, G40V, G41A, G41R, G41N, G41D, G41C, G41Q, G41E, G41H, G41I, G41L, G41K, G41M, G41F, G41P, G41S, G41T, G41W, G41Y, G41V, N42A, N42R, N42D, N42C, N42Q, N42E, N42G, N42H, N42I, N42L, N42K, N42M, N42F, N42P, N42S, N42T, N42W, N42Y, N42V, S43A, S43R, S43N, S43D, S43C, S43Q, S43E, S43G, S43H, S43I, S43L, S43K, S43M, S43F, S43P, S43T, S43W, S43Y, S43V, P44A, P44R, P44N, P44D, P44C, P44Q, P44E, P44G, P44H, P44I, P44L, P44K, P44M, P44F, P44S, P44T, P44W, P44Y, P44V, V45A, V45R, V45N, V45D, V45C, V45Q, V45E, V45G, V45H, V45I, V45L, V45K, V45M, V45F, V45P, V45S, V45T, V45W, V45Y, E47A, E47R, E47N, E47D, E47C, E47Q, E47G, E47H, E47I, E47L, E47K, E47M, E47F, E47P, E47S, E47T, E47W, E47Y, E47V, T49A, T49R, T49N, T49D, T49C, T49Q, T49E, T49G, T49H, T49I, T49L, T49K, T49M, T49F, T49P, T49S, T49W, T49Y, T49V, V50A, V50R, V50N, V50D, V50C, V50Q, V50E, V50G, V50H, V50I, V50L, V50K, V50M, V50F, V50P, V50S, V50T, V50W, V50Y, D67A, D67R, D67N, D67C, D67Q, D67E, D67G, D67H, D67I, D67L, D67K, D67M, D67F, D67P, D67S, D67T, D67W, D67Y, D67V, T69A, T69R, T69N, T69D, T69C, T69Q, T69E, T69G, T69H, T69I, T69L, T69K, T69M, T69F, T69P, T69S, T69W, T69Y, T69V, T71A, T71R, T71N, T71D, T71C, T71Q, T71E, T71G, T71H, T71I, T71L, T71K, T71M, T71F, T71P, T71S, T71W, T71Y, T71V, Y73A, Y73R, Y73N, Y73D, Y73C, Y73Q, Y73E, Y73G, Y73H, Y73I, Y73L, Y73K, Y73M, Y73F, Y73P, Y73S, Y73T, Y73W, Y73V, V75A, V75R, V75N, V75D, V75C, V75Q, V75E, V75G, V75H, V75I, V75L, V75K, V75M, V75F, V75P, V75S, V75T, V75W, V75Y, T76A, T76R, T76N, T76D, T76C, T76Q, T76E, T76G, T76H, T76I, T76L, T76K, T76M, T76F, T76P, T76S, T76W, T76Y, T76V, G77A, G77R, G77N, G77D, G77C, G77Q, G77E, G77H, G77I, G77L, G77K, G77M, G77F, G77P, G77S, G77T, G77W, G77Y, G77V, R78A, R78N, R78D, R78C, R78Q, R78E, R78G, R78H, R78I, R78L, R78K, R78M, R78F, R78P, R78S, R78T, R78W, R78Y, R78V, G79A, G79R, G79N, G79D, G79C, G79Q, G79E, G79H, G79I, G79L, G79K, G79M, G79F, G79P, G79S, G79T, G79W, G79Y, G79V, D80A, D80R, D80N, D80C, D80Q, D80E, D80G, D80H, D80I, D80L, D80K, D80M, D80F, D80P, D80S, D80T, D80W, D80Y, D80V, S81A, S81R, S81N, S81D, S81C, S81Q, S81E, S81G, S81H, S81I, S81L, S81K, S81M, S81F, S81P, S81T, S81W, S81Y, S81V, P82A, P82R, P82N, P82D, P82C, P82Q, P82E, P82G, P82H, P82I, P82L, P82K, P82M, P82F, P82S, P82T, P82W, P82Y, P82V, A83R, A83N, A83D, A83C, A83Q, A83E, A83G, A83H, A83I, A83L, A83K, A83M, A83F, A83P, A83S, A83T, A83W, A83Y, A83V, S84A, S84R, S84N, S84D, S84C, S84Q, S84E, S84G, S84H, S84I, S84L, S84K, S84M, S84F, S84P, S84T, S84W, S84Y, S84V, S85A, S85R, S85N, S85D, S85C, S85Q, S85E, S85G, S85H, S85I, S85L, S85K, S85M, S85F, S85P, S85T, S85W, S85Y, S85V, K86A, K86R, K86N, K86D, K86C, K86Q, K86E, K86G, K86H, K86I, K86L, K86M, K86F, K86P, K86S, K86T, K86W, K86Y, and K86V.
[0127] In still further embodiments other amino acid substitutions can be introduced before, during, or after introduction of those amino acid substitutions listed above.
The other substitutions (corresponding to SEQ ID NO:1) may include, but are not limited to, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 of W22A, W22R, W22N, W22D, W22C, W22Q, W22E, W22G, W22H, W22I, W22L, W22K, W22M, W22F, W22P, W22S, W22T, W22Y, W22V, D23A, D23R, D23N, D23C, D23Q, D23E, D23G, D23H, D23I, D23L, D23K, D23M, D23F, D23P, D23S, D23T, D23W, D23Y, D23V, A24R, A24N, A24D, A24C, A24Q, A24E, A24G, A24H, A24I, A24L, A24K, A24M, A24F, A24P, A24S, A24T, A24W, A24Y, A24V, P25A, P25R, P25N, P25D, P25C, P25Q, P25E, P25G, P25H, P25I, P25L, P25K, P25M, P25F, P25S, P25T, P25W, P25Y, P25V, A26R, A26N, A26D, A26C, A26Q, A26E, A26G, A26H, A26I, A26L, A26K, A26M, A26F, A26P, A26S, A26T, A26W, A26Y, A26V, V27A, V27R, V27N, V27D, V27C, V27Q, V27E, V27G, V27H, V27I, V27L, V27K, V27M, V27F, V27P, V27S, V27T, V27W, V27Y, T28A, T28R, T28N, T28D, T28C, T28Q, T28E, T28G, T28H, T28I, T28L, T28K, T28M, T28F, T28P, T28S, T28W, T28Y, T28V, V29A, V29R, V29N, V29D, V29C, V29Q, V29E, V29G, V29H, V29I, V29L, V29K, V29M, V29F, V29P, V29S, V29T, V29W, V29Y, G52A, G52R, G52N, G52D, G52C, G52Q, G52E, G52H, G52I, G52L, G52K, G52M, G52F, G52P, G52S, G52T, G52W, G52Y, G52V, S53A, S53R, S53N, S53D, S53C, S53Q, S53E, S53G, S53H, S53I, S53L, S53K, S53M, S53F, S53P, S53T, S53W, S53Y, S53V, K54A, K54R, K54N, K54D, K54C, K54Q, K54E, K54G, K54H, K54I, K54L, K54M, K54F, K54P, K54S, K54T, K54W, K54Y, K54V, S55A, S55R, S55N, S55D, S55C, S55Q, S55E, S55G, S55H, S55I, S55L, S55K, S55M, S55F, S55P, S55T, S55W, S55Y, or S55V.
[0128] In some embodiments, FnIII cradle molecules may have at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or more amino acid substitutions that may include, but are not limited to, the following FnIII residue substitutions (corresponding to SEQ ID NO:97): G33A, G33N, G33R, G33D, G33C, G33Q, G33E, G33H, G33I, G33L, G33K, G33M, G33F, G33P, G33S, G33T, G33W, G33Y, G33V, R35A, R35N, R35D, R35C, R35Q, R35E, R35G, R35H, R35I, R35L, R35K, R35M, R35F, R35P, R35S, R35T, R35W, R35Y, R35V, T37A, T37R, T37N, T37D, T37C, T37Q, T37E, T37G, T37H, T37I, T37L, T37K, T37M, T37F, T37P, T37S, T37W, T37Y, T37V, T39A, T39N, T39R, T39D, T39C, T39Q, T39E, T39G, T39H, T39I, T39L, T39K, T39M, T39F, T39P, T39S, T39W, T39Y, T39V, P40A, P40R, P40N, P40D, P40C, P40Q, P40E, P40G, P40H, P40I, P40L, P40K, P40M, P40F, P40S, P40T, P40W, P40Y, P40V, T41A, T41R, T41N, T41D, T41C, T41Q, T41E, T41G, T41H, T41I, T41L, T41K, T41M, T41F, T41P, T41S, T41W, T41Y, T41V, N42A, N42R, N42D, N42C, N42Q, N42E, N42G, N42H, N42I, N42L, N42K, N42M, N42F, N42P, N42S, N42T, N42W, N42Y, N42V, G43A, G43N, G43R, G43D, G43C, G43Q, G43E, G43H, G43I, G43L, G43K, G43M, G43F, G43P, G43S, G43T, G43W, G43Y, G43V, Q44A, Q44R, Q44N, Q44D, Q44C, Q44E, Q44G, Q44H, Q44I, Q44L, Q44K, Q44M, Q44F, Q44P, Q44S, Q44T, Q44W, Q44Y, Q44V, Q45A, Q45R, Q45N, Q45D, Q45C, Q45E, Q45G, Q45H, Q45I, Q45L, Q45K, Q45M, Q45F, Q45P, Q45S, Q45T, Q45W, Q45Y, Q45V, G46A, G46R, G46N, G46D, G46C, G46Q, G46E, G46H, G46I, G46L, G46K, G46M, G46F, G46P, G46S, G46T, G46W, G46Y, G46V, N47A, N47R, N47D, N47C, N47Q, N47E, N47G, N47H, N47I, N47L, N47K, N47M, N47F, N47P, N47S, N47T, N47W, N47Y, N47V, S48A, S48R, S48N, S48D, S48C, S48Q, S48E, S48G, S48H, S48I, S48L, S48K, S48M, S48F, S48P, S48T, S48W, S48Y, S48V, E70A, E70R, E70N, E70D, E70C, E70Q, E70G, E70H, E70I, E70L, E70K, E70M, E70F, E70P, E70S, E70T, E70W, E70Y, E70V, N72A, N72R, N72D, N72C, N72Q, N72E, N72G, N72H, N72I, N72L, N72K, N72M, N72F, N72P, N72S, N72T, N72W, N72Y, N72V, S74A, S74R, S74N, S74D, S74C, S74Q, S74E, S74G, S74H, S74I, S74L, S74K, S74M, S74F, S74P, S74T, S74W, S74Y, S74V, Y76A, Y76R, Y76N, Y76D, Y76C, Y76Q, Y76E, Y76G, Y76H, Y76I, Y76L, Y76K, Y76M, Y76F, Y76P, Y76S, Y76T, Y76W, Y76V, K79A, K79R, K79N, K79D, K79C, K79Q, K79E, K79G, K79H, K79I, K79L, K79M, K79F, K79P, K79S, K79T, K79W, K79Y, K79V, D80A, D80R, D80N, D80C, D80Q, D80E, D80G, D80H, D80I, D80L, D80K, D80M, D80F, D80P, D80S, D80T, D80W, D80Y, D80V, D81A, D81R, D81N, D81C, D81Q, D81E, D81G, D81H, D81I, D81L, D81K, D81M, D81F, D81P, D81S, D81T, D81W, D81Y, D81V, K82A, K82R, K82N, K82D, K82C, K82Q, K82E, K82G, K82H, K82I, K82L, K82M, K82F, K82P, K82S, K82T, K82W, K82Y, K82V, E83A, E83R, E83N, E83D, E83C, E83Q, E83G, E83H, E83I, E83L, E83K, E83M, E83F, E83P, E83S, E83T, E83W, E83Y, E83V, S84A, S84R, S84N, S84D, S84C, S84Q, S84E, S84G, S84H, S84I, S84L, S84K, S84M, S84F, S84P, S84T, S84W, S84Y, S84V, V85A, V85R, V85N, V85D, V85C, V85Q, V85E, V85G, V85H, V85I, V85L, V85K, V85M, V85F, V85P, V85S, V85T, V85W, and V85Y.
[0129] In some embodiments, FnIII cradle molecules may have at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or more amino acid substitutions that may include, but are not limited to, the following FnIII residue substitutions (corresponding to SEQ ID NO:129): G31A, G31N, G31R, G31D, G31C, G31Q, G31E, G31H, G31I, G31L, G31K, G31M, G31F, G31P, G31S, G31T, G31W, G31Y, G31V, Q33A, Q33R, Q33N, Q33D, Q33C, Q33E, Q33G, Q33H, Q33I, Q33L, Q33K, Q33M, Q33F, Q33P, Q33S, Q33T, Q33W, Q33Y, Q33V, D35A, D35R, D35N, D35C, D35Q, D35E, D35G, D35H, D35I, D35L, D35K, D35M, D35F, D35P, D35S, D35T, D35W, D35Y, D35V, V37A, V37R, V37N, V37D, V37C, V37Q, V37E, V37G, V37H, V37I, V37L, V37K, V37M, V37F, V37P, V37S, V37T, V37W, V37Y, A39N, A39R, A39D, A39C, A39Q, A39E, A39G, A39H, A39I, A39L, A39K, A39M, A39F, A39P, A39S, A39T, A39W, A39Y, A39V, N40A, N40R, N40D, N40C, N40Q, N40E, N40G, N40H, N40I, N40L, N40K, N40M, N40F, N40P, N40S, N40T, N40W, N40Y, N40V, G41A, G41R, G41N, G41D, G41C, G41Q, G41E, G41H, G41I, G41L, G41K, G41M, G41F, G41P, G41S, G41T, G41W, G41Y, G41V, Q42A, Q42R, Q42N, Q42D, Q42C, Q42E, Q42G, Q42H, Q42I, Q42L, Q42K, Q42M, Q42F, Q42P, Q42S, Q42T, Q42W, Q42Y, Q42V, T43A, T43R, T43N, T43D, T43C, T43Q, T43E, T43G, T43H, T43I, T43L, T43K, T43M, T43F, T43P, T43S, T43W, T43Y, T43V, P44A, P44R, P44N, P44D, P44C, P44Q, P44E, P44G, P44H, P44I, P44L, P44K, P44M, P44F, P44S, P44T, P44W, P44Y, P44V, D66A, D66R, D66N, D66C, D66Q, D66E, D66G, D66H, D66I, D66L, D66K, D66M, D66F, D66P, D66S, D66T, D66W, D66Y, D66V, K68A, K68R, K68N, K68D, K68C, K68Q, K68E, K68G, K68H, K68I, K68L, K68M, K68F, K68P, K68S, K68T, K68W, K68Y, K68V, Y70A, Y70R, Y70N, Y70D, Y70C, Y70Q, Y70E, Y70G, Y70H, Y70I, Y70L, Y70K, Y70M, Y70F, Y70P, Y70S, Y70T, Y70W, Y70V, Y72A, Y72R, Y72N, Y72D, Y72C, Y72Q, Y72E, Y72G, Y72H, Y72I, Y72L, Y72K, Y72M, Y72F, Y72P, Y72S, Y72T, Y72W, Y72V, N75A, N75R, N75D, N75C, N75Q, N75E, N75G, N75H, N75I, N75L, N75K, N75M, N75F, N75P, N75S, N75T, N75W, N75Y, N75V, D76A, D76R, D76N, D76C, D76Q, D76E, D76G, D76H, D76I, D76L, D76K, D76M, D76F, D76P, D76S, D76T, D76W, D76Y, D76V, N77A, N77R, N77D, N77C, N77Q, N77E, N77G, N77H, N77I, N77L, N77K, N77M, N77F, N77P, N77S, N77T, N77W, N77Y, N77V, A78R, A78N, A78D, A78C, A78Q, A78E, A78G, A78H, A78I, A78L, A78K, A78M, A78F, A78P, A78S, A78T, A78W, A78Y, A78V, R79A, R79N, R79D, R79C, R79Q, R79E, R79G, R79H, R79I, R79L, R79K, R79M, R79F, R79P, R79S, R79T, R79W, R79Y, R79V, S80A, S80R, S80N, S80D, S80C, S80Q, S80E, S80G, S80H, S80I, S80L, S80K, S80M, S80F, S80P, S80T, S80W, S80Y, S80V, S81A, S81R, S81N, S81D, S81C, S81Q, S81E, S81G, S81H, S81I, S81L, S81K, S81M, S81F, S81P, S81T, S81W, S81Y, and S81V.
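The substitution lists recited above follow a regular notation: the one-letter code of the wild-type residue, its position in the reference sequence, and the one-letter code of the replacement residue (e.g., R30A). The following minimal Python sketch illustrates how such labels could be enumerated and applied to a sequence. The function names `single_substitutions` and `apply_substitution` are hypothetical and are not part of the specification, and the sketch enumerates all 19 alternative residues, whereas the recited lists are non-limiting and need not contain every one of them.

```python
# Illustrative sketch only; function names are hypothetical and not from the specification.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"  # one-letter codes of the 20 natural amino acids


def single_substitutions(wild_type: str, position: int) -> list[str]:
    """Return labels such as 'R30A' for replacing the wild-type residue at
    `position` with each of the other 19 natural amino acids."""
    if len(wild_type) != 1 or wild_type not in AMINO_ACIDS:
        raise ValueError(f"unknown amino acid: {wild_type!r}")
    return [f"{wild_type}{position}{aa}" for aa in AMINO_ACIDS if aa != wild_type]


def apply_substitution(sequence: str, label: str) -> str:
    """Apply a label such as 'R30A' to `sequence`, using 1-based positions."""
    wild_type, position, replacement = label[0], int(label[1:-1]), label[-1]
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(f"expected {wild_type} at position {position}")
    return sequence[:position - 1] + replacement + sequence[position:]
```

For example, `single_substitutions("R", 30)` yields the 19 labels beginning with R30A, and `apply_substitution` checks that the reference sequence actually carries the stated wild-type residue before making the change.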
[0130] The cradle molecule can further comprise a second FnIII domain that may or may not have been selected for affinity to a particular target. The second FnIII domain may or may not contain additional amino acid variations or diversification. In other aspects, the cradle molecule can further comprise a non-FnIII polypeptide that enhances the FnIII polypeptide binding affinity for a target molecule. The non-FnIII polypeptide may include additional variations or diversification that enhances or increases the cradle molecule binding affinity for another target molecule such as a half-life extender, e.g., HSA. The non-FnIII polypeptide can include, but is not limited to, domains involved in phospho-tyrosine binding (e.g., SH2, PTB), phospho-serine binding (e.g., UIM, GAT, CUE, BTB/POZ, VHS, UBA, RING, HECT, WW, 14-3-3, Polo-box), phospho-threonine binding (e.g., FHA, WW, Polo-box), proline-rich region binding (e.g., EVH1, SH3, GYF), acetylated lysine binding (e.g., Bromo), methylated lysine binding (e.g., Chromo, PHD), apoptosis (e.g., BIR, TRAF, DED, Death, CARD, BH), cytoskeleton modulation (e.g., ADF, GEL, DH, CH, FH2), or other cellular functions (e.g., EH, CC, VHL, TUDOR, PUF Repeat, PAS, MH1, LRR, IQ, HEAT, GRIP, TUBBY, SNARE, TPR, TIR, START, SOCS Box, SAM, RGS, PDZ, PB1, LIM, F-BOX, ENTH, EF-Hand, SHADOW, ARM, ANK).
Multispecific FnIII Domain Cradle Molecules
[0131] In another aspect, the invention provides multispecific cradle molecules which comprise two or more individual cradle molecules linked together (e.g., genetically or chemically). The multispecific cradle molecules comprise at least one cradle molecule that uses at least one beta strand to bind to a target.
[0132] In one embodiment, the multispecific cradle molecule comprises two or more individual cradle molecules linked, in pearl-like fashion, wherein each individual cradle molecule binds to a specific target. Such targets can be present on the same molecule or on different molecules, such that the different molecules become juxtaposed by the binding of the multispecific cradle molecule. The targets can also be identical, such that the multispecific cradle molecule is able to cluster target molecules, in a similar way to an antibody. Avidity is also increased by binding to the same target molecule with two binding sites on the multispecific cradle molecule capable of independently binding to different regions of the target molecule.
[0133] A number of individual cradle molecules can be incorporated into the multispecific cradle molecules, for example, at least 2, 3, 4, 5, 6, 7, 8, 9, 10, or more individual cradle molecules.
[0134] Multispecific cradle molecules can be produced using art-recognized methods. For example, cradle molecules may be linked genetically, such that multispecific cradle molecules are expressed as a single polypeptide. This linkage may be direct or conferred by an additional amino acid "linker" sequence. Suitable non-limiting methods and linkers are described, for example, in U.S. Patent Publication No. 20060286603 and Patent Cooperation Treaty Publication No. WO04041862A2. Exemplary polypeptide linkers include, but are not limited to, GS linkers, such as GGGGSGGGGS (SEQ ID NO: 471), GSGSGSGSGS (SEQ ID NO: 472), PSTSTST (SEQ ID NO: 473), and EIDKPSQ (SEQ ID NO: 474), and multimers thereof.
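As an illustration of genetic linkage, in which individual cradle molecules are joined directly or through a linker and expressed as a single polypeptide, the following minimal Python sketch concatenates amino acid sequences with one of the exemplary linkers recited above. The function name `fuse_domains`, the `repeats` parameter (standing in for linker multimers), and the short placeholder domain strings are assumptions made for illustration only; they are not actual cradle molecule sequences.

```python
# Illustrative sketch only; fuse_domains and the placeholder domains are hypothetical.
EXEMPLARY_LINKERS = {
    "SEQ ID NO: 471": "GGGGSGGGGS",
    "SEQ ID NO: 472": "GSGSGSGSGS",
    "SEQ ID NO: 473": "PSTSTST",
    "SEQ ID NO: 474": "EIDKPSQ",
}


def fuse_domains(domains: list[str], linker: str = "", repeats: int = 1) -> str:
    """Join domain sequences into a single polypeptide sequence.

    An empty linker models direct linkage; repeats > 1 models a linker multimer.
    """
    if not domains:
        raise ValueError("at least one domain sequence is required")
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return (linker * repeats).join(domains)


# Placeholder strings stand in for two individual cradle molecule sequences.
fused = fuse_domains(["AAAA", "CCCC"], EXEMPLARY_LINKERS["SEQ ID NO: 471"])
```

Passing more than two sequences produces the pearl-like arrangement described above, with the same linker between each adjacent pair.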
[0135] The multispecific cradle molecules generated using linker sequences have an improved steric hindrance for binding to target molecules, thus enabling shorter linker sequences to be used to link two or more individual cradle molecules together. Shorter linker sequences cause fewer immunogenic responses and are less likely to be cleaved.
[0136] Alternatively, multispecific cradle molecules can be prepared by chemically conjugating the individual cradle molecules using methods known in the art. A variety of coupling or cross-linking agents can be used for covalent conjugation. Examples of cross-linking agents include, e.g., protein A, carbodiimide, N-succinimidyl-S-acetyl-thioacetate (SATA), 5,5'-dithiobis(2-nitrobenzoic acid) (DTNB), o-phenylenedimaleimide (oPDM), N-succinimidyl-3-(2-pyridyldithio)propionate (SPDP), and sulfosuccinimidyl 4-(N-maleimidomethyl)cyclohexane-1-carboxylate (sulfo-SMCC) (see, e.g., Karpovsky et al. (1984) J. Exp. Med. 160:1686; Liu, MA et al. (1985) Proc. Natl. Acad. Sci. U.S.A. 82:8648). Other methods include those described in Paulus (1985) Behring Ins. Mitt. No. 78:118-132; Brennan et al. (1985) Science 229:81-83; and Glennie et al. (1987) J. Immunol. 139:2367-2375. Preferred conjugating agents are SATA and sulfo-SMCC, both available from Pierce Chemical Co. (Rockford, IL). Cysteine residues can be introduced into the FnIII domain variants at specific positions and then crosslinked with sulfhydryl-reactive reagents such as DPDPB or DTME (available from Pierce) to link two individual cradle molecules together to form a multispecific cradle molecule.
Methods for Grafting CDRs onto FnIII Cradle Molecules
[0137] In one aspect, the present invention features an FnIII cradle molecule altered compared to the wild-type FnIII domain to contain all or a portion of a complementarity determining region (CDR) of an antibody or a T-cell receptor.
[0138] The CDR regions of any antibody or T-cell receptor variable region, or antigen binding fragments thereof, are suitable for grafting. The CDRs can be obtained from the antibody or T-cell receptor repertoire of any animal including, but not limited to, rodents, primates, camelids or sharks. In one embodiment, the CDRs are obtained from CDR1, CDR2 and CDR3 of a single domain antibody, for example a nanobody. In a more specific embodiment, CDR1, 2 or 3 of a single domain antibody, such as a nanobody, are grafted into any of the AB, BC, CD, DE, EF or FG loops of an FnIII domain, thereby providing target binding specificity of the original nanobody to the cradle molecule. In one embodiment, the CDR is heavy chain CDR3. In one embodiment, the CDR is grafted into the FG loop. Engineered libraries of camelid antibodies and antibody fragments are commercially available, for example, from Ablynx, Ghent, Belgium. The antibody repertoire can be from animals challenged with one or more antigens or from naive animals that have not been challenged with antigen. Additionally or alternatively, CDRs can be obtained from antibodies, or antigen binding fragments thereof, produced by in vitro or in vivo library screening methods, including, but not limited to, in vitro polysome or ribosome display, phage display or yeast display techniques. This includes antibodies not originally generated by in vitro or in vivo library screening methods but which have subsequently undergone mutagenesis or one or more affinity maturation steps using in vitro or in vivo screening methods. Examples of such in vitro or in vivo library screening methods or affinity maturation methods are described, for example, in U.S. Patent Nos. 7,195,880; 6,951,725; 7,078,197; 7,022,479; 5,922,545; 5,830,721; 5,605,793; 5,830,650; 6,194,550; 6,699,658; 7,063,943; 5,866,344 and Patent Cooperation Treaty Publication No. WO06023144.
[0139] Methods to identify antibody CDRs are well known in the art (see Kabat et al., U.S. Dept. of Health and Human Services, "Sequences of Proteins of Immunological Interest" (1983); Chothia et al., (1987) J. Mol. Biol. 196:901-917; MacCallum et al., (1996) J. Mol. Biol. 262:732-745). The nucleic acid encoding a particular antibody can be isolated and sequenced, and the CDR sequences deduced by inspection of the encoded protein with regard to the established antibody sequence nomenclature. Methods for grafting hypervariable regions or CDRs into FnIII include, for example, genetic engineering, de novo nucleic acid synthesis or PCR-based gene assembly (see, e.g., U.S. Patent No. 5,225,539).
[0140] The above techniques allow for the identification of a suitable loop for selection and presentation of a hypervariable region or CDR, e.g., the FG loop. However, additional metrics can be invoked to further improve the fit and presentation of the hypervariable region based on structural modeling of the FnIII domain and the donor antibody.
[0141] In one aspect, specific amino acid residues in any of the beta-strands of an FnIII domain are mutated to allow the CDR loops to adopt a conformation that retains or improves binding to antigen. This procedure can be performed in a way analogous to CDR grafting into a heterologous antibody framework, using a combination of structural modeling and sequence comparison. In one embodiment, the FnIII domain residues adjacent to a CDR are mutated in a similar manner to that performed by Queen et al. (see U.S. Patent Nos. 6,180,370; 5,693,762; 5,693,761; 5,585,089; 7,022,500). In another embodiment, FnIII domain residues within one Van der Waals radius of CDR residues are mutated in a similar manner to that performed by Winter et al. (see U.S. Patent Nos. 6,548,640; 6,982,321). In another embodiment, FnIII domain residues that are non-adjacent to CDR residues but are predicted, based upon structural modeling of the FnIII domain and the donor antibody, to modify the conformation of CDR residues are mutated in a similar manner to that performed by Carter et al. or Adair et al. (see U.S. Patent Nos. 6,407,213; 6,639,055; 5,859,205; 6,632,927).
IV. FnIII Cradle Libraries
[0142] The ability to generate novel binding proteins capable of interacting with other proteins with high-affinity and specificity is important in biotechnology, medicine and molecular biology. Such designed binding proteins can be used in numerous applications.
They can be used to bind a target protein, label a protein of interest for detection and visualization, to purify a target protein from a complex mixture or to functionally perturb a target by blocking a functional site.
[0143] Combinatorial methods are effective platforms for the production of novel binding proteins. In these methods, large libraries of protein variants are created by introducing a large amount of sequence diversity and sometimes structural diversity into a contiguous surface in a protein scaffold. The central idea in combinatorial approaches is to create a sufficiently diverse repertoire of candidate binding surfaces that vary in shape and chemical character. Variants capable of binding a target of interest can then be isolated using various selection methods.
[0144] Though combinatorial systems are powerful, a significant limitation is their limited sampling capacity. For instance, phage display libraries are generally limited to approximately 10^10 members. Considering a small binding surface consisting of 15 positions in a protein scaffold, if all 15 positions are varied to all 20 amino acids, this gives 20^15 or approximately 3 x 10^19 theoretical sequence combinations. Thus, only a very small percentage of the possible binding site configurations would actually be sampled in the library. Since discovery of a binding surface suitable for a given target is already likely to be a rare event, the sampling limitations of combinatorial methods make isolation of functional binding proteins a difficult and unlikely task.
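By way of illustration only, the sampling arithmetic of the preceding paragraph can be reproduced with a short calculation. The Python sketch below is not part of the disclosed methods; the function name `sampled_fraction` is hypothetical, and the library size of 10^10 is the approximate phage display limit stated above.

```python
from fractions import Fraction

def sampled_fraction(positions: int, alphabet: int = 20,
                     library_size: int = 10**10) -> Fraction:
    """Upper bound on the fraction of theoretical sequence space that a
    library of `library_size` members can cover (assumes no duplicates)."""
    theoretical = alphabet ** positions
    return Fraction(min(library_size, theoretical), theoretical)

theoretical = 20 ** 15            # 15 positions, 20 amino acids each
fraction = sampled_fraction(15)   # best case for a 10^10-member library

print(f"theoretical combinations: {float(theoretical):.2e}")
print(f"largest fraction sampled: {float(fraction):.2e}")
```

Under these assumptions, fewer than one in a billion of the theoretical sequences could be present in the library, even before accounting for redundant clones.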
[0145] Several strategies have been explored for combating the sampling problem in combinatorial libraries. The most widely used approach is to couple simple library selection with so-called affinity maturation strategies. Usually, these strategies involve introduction of additional sequence diversity into the protein population at various stages during the selection process to effectively increase sampling capacity. The idea in such approaches is to first recover hits from an under-sampled library, then introduce point mutations to gradually optimize these clones for increased affinity. These approaches have been used successfully in a variety of systems. However, in most cases, the introduced mutations are random in terms of their positions and amino acid types. Thus, while this strategy has proven effective, the likelihood of accumulating productive mutations is very low. As a result, such methods often require several rounds of additional selection for affinity maturation after initial hits are recovered and effective binders are not always produced.
[0146] Another type of strategy for combating the sampling problem in combinatorial methods involves focusing the sequence and structural properties of the binding site library toward those likely to be useful for binding to a target-type of interest.
These strategies are based on structural information (both primary and tertiary) of existing binding molecules. This approach has been explored in synthetic antibodies with the creation of peptide-targeted and small molecule hapten targeted libraries (Cobaugh, et al., J. Mol. Biol.
(2008) 378:622-633;
Persson, et al., J. Mol. Biol. (2006) 357:607-620). In each of these examples, antibody complementarity determining region (CDR) lengths were chosen that are frequently observed in peptide- or small molecule-binding antibodies. These structural features are pre-encoded in the antibody binding site, and then sequence diversity is introduced in this context using amino acid types frequently observed in antibodies recognizing the target-type of interest. In this way, a proven useful architecture is simply "reprogrammed" to recognize another molecule with similar characteristics.
[0147] In embodiments discussed herein, an FnIII domain is used as a basis for generating a combinatorial library of protein binding domains.
[0148] Artificial antibody scaffolds that bind specific ligands are becoming legitimate alternatives to antibodies generated using traditional techniques, in part because antibodies can be difficult and expensive to produce. The limitations of antibodies have spurred the development of alternative binding proteins based on immunoglobulin-like folds or other protein topologies. These non-antibody scaffolds share the general quality of having a structurally stable framework core that is tolerant to multiple substitutions in other parts of the protein.
[0149] The present invention provides a library of FnIII cradle molecules that use the CD
and the FG loops of FnIII domains together with the surface exposed residues of the beta-strands. The proposed library, referred to as the "cradle library" herein, will increase the surface area available for binding over the traditional previously disclosed top and bottom side libraries. Furthermore, loops FG and CD are highly variable in naturally occurring fibronectins and can be randomized without restrictions both in composition and loop length. This will enable a highly diverse library without generating unstable molecules, which should overcome some of the restrictions in the traditional libraries previously disclosed.
Additionally, surface exposed beta sheet residues will also be randomized to generate a large cradle-like surface to be available for binding to target proteins.
[0150] By creating artificial diversity, the library size can be controlled so that the library can be readily screened using, for example, high throughput methods to obtain new therapeutics. The FnIII cradle library with bottom and top side loop regions and the surface exposed residues of the beta-sheets can be screened using positive physical clone selection by FACS, phage panning or selective ligand retention. These in vitro screens bypass the standard and tedious methodology inherent in generating an antibody hybridoma library and supernatant screening.
[0151] Furthermore, the FnIII cradle library with the bottom and top loop regions (CD and FG, respectively) and the surface exposed residues of the beta-sheets has the potential to recognize any target, as the constituent amino acids in the target binding loop are created by in vitro diversity techniques. This provides the significant advantages of control over library diversity size and the capacity to recognize self antigens. Still further, the FnIII cradle library with the bottom and top side loop regions (CD and FG) and the surface exposed residues of the beta-sheets can be propagated and re-screened to discover additional fibronectin binding domains against other desired targets.
[0152] A combinatorial library is a collection of diverse compounds generated by either chemical synthesis or biological synthesis by combining a number of chemical "building blocks." For example, a linear combinatorial chemical library such as a polypeptide (e.g., mutein or variant) library is formed by combining a set of chemical building blocks called amino acids in every possible way for a given compound length. Millions of compounds can be synthesized through such combinatorial mixing of chemical building blocks. For example, one commentator has observed that the systematic, combinatorial mixing of 100 interchangeable chemical building blocks results in the theoretical synthesis of 100 million tetrameric compounds or 10 billion pentameric compounds (Gallop, et al., J. Med. Chem.
(1994) 37:1233-1250).
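For illustration, the figures quoted above follow from raising the number of interchangeable building blocks to the power of the compound length. The following Python sketch simply checks that arithmetic; the function name `library_size` is hypothetical.

```python
def library_size(building_blocks: int, length: int) -> int:
    """Number of linear compounds of the given length when every position
    may independently hold any one of the building blocks."""
    return building_blocks ** length

print(library_size(100, 4))  # tetramers from 100 building blocks
print(library_size(100, 5))  # pentamers from 100 building blocks
print(library_size(20, 5))   # five-residue peptides from 20 amino acids
```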
[0153] Embodiments of the invention are directed to a combinatorial library of FnIII
domains. In some embodiments, polypeptides of the library include variations of amino acid sequence in one or more of the beta strands of the FnIII domains. In some embodiments, the library includes variations of amino acid sequences in one or more loops of the FnIII domains.
In some embodiments, the library includes variation in both loops and beta strands of the FnIII
domain. Libraries can be generated using (i) a directed approach; and (ii) a random approach, both of which are illustrated in the Examples.
Universal Mutagenesis Cradle Libraries
[0154] The present invention pertains to a mutagenesis cradle library of FnIII
domain polypeptides useful in screening for the presence of one or more polypeptides having a selected binding or enzymatic activity. The library polypeptides include (a) regions A, AB, B, BC, C, CD, D, E, EF, F, FG, and G having wildtype amino acid sequences of a selected native FnIII
domain polypeptide or polypeptides, (b) beta-strands C and F and loop regions CD and FG
having one or more selected lengths. At least one selected beta-strand or loop region of a selected length contains a library of sequences encoded by a library of coding sequences that encode, at each beta-strand or loop position, a conserved or selected semi-conserved consensus amino acid and, if the consensus amino acid has an occurrence frequency equal to or less than a selected threshold frequency of at least 50%, a single common target amino acid and any co-produced amino acids (amino acids produced by the coding sequences at a given position as a result of codon degeneracy).
[0155] In constructing a library within a given loop/strand of a given loop/strand length, the variability profile is used to define a sequence of fixed and "variable"
positions, i.e., positions at which a target amino acid can be introduced. The number of fixed positions will depend on the selected threshold frequency for the consensus amino acid at each position.
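The division of a loop or strand into fixed and variable positions can be sketched as follows. This is a minimal Python illustration, assuming a variability profile is available as a per-position table of amino acid frequencies; the function name, the data layout and the example frequencies are hypothetical and are not taken from the disclosure.

```python
from typing import Dict, List, Tuple

def classify_positions(profile: List[Dict[str, float]],
                       threshold: float = 0.5) -> List[Tuple[int, str, str]]:
    """Return (position, consensus residue, status) for each position.

    A position is 'variable' when the consensus amino acid occurs at a
    frequency equal to or less than the threshold; otherwise it is 'fixed'.
    """
    result = []
    for index, freqs in enumerate(profile, start=1):
        consensus, freq = max(freqs.items(), key=lambda item: item[1])
        status = "variable" if freq <= threshold else "fixed"
        result.append((index, consensus, status))
    return result

# Hypothetical frequencies for a three-position stretch (illustration only).
example_profile = [
    {"S": 0.40, "D": 0.30, "G": 0.30},
    {"G": 0.85, "S": 0.15},
    {"G": 0.50, "E": 0.30, "R": 0.20},
]
for row in classify_positions(example_profile, threshold=0.5):
    print(row)
```

Raising the threshold in this sketch opens more positions to variation, which is consistent with the statement that the number of fixed positions depends on the selected threshold frequency.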
[0156] Once the beta-strand and loop sequences are selected, a library of coding-sequence oligonucleotides encoding all of the identified sequences is constructed, making codon substitutions as shown that are effective to preserve the existing consensus amino acid, but also encode the selected target amino acid, and any other co-product amino acids encoded by degenerate codons.
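The notion of co-produced amino acids can be illustrated by expanding a degenerate codon into the individual codons it represents and translating each one. The Python sketch below uses the standard genetic code and the standard IUPAC nucleotide ambiguity codes; the function name `encoded_amino_acids` and the choice of example codons are assumptions made for illustration and do not reflect the codon substitutions actually used in the Examples.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def encoded_amino_acids(degenerate_codon: str) -> set:
    """Set of one-letter amino acids ('*' = stop) encoded by a degenerate codon."""
    codons = product(*(IUPAC[base] for base in degenerate_codon.upper()))
    return {CODON_TABLE["".join(codon)] for codon in codons}

# Example: keep a consensus Gly while also encoding a target Asn.
encoded = encoded_amino_acids("RRT")   # AAT, AGT, GAT, GGT
co_products = encoded - {"G", "N"}     # residues produced as a side effect
print(sorted(encoded), sorted(co_products))
```

In this example the codon RRT preserves Gly and introduces Asn, and Ser and Asp are co-produced because the two degenerate bases vary independently.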
[0157] The library of coding sequences for the beta strands and loops is added to the framework sequences, to construct the library of coding sequences for the polypeptide libraries.
The library of polypeptides may be encoded by an expression library format that includes a ribosome display library, a polysome display library, a phage display library, a bacterial expression library, or a yeast display library.
[0158] The libraries may be used in a method of identifying a polypeptide having a desired binding affinity, in which the natural-variant combinatorial library is screened to select for an FnIII domain having a desired binding affinity. The same methodology can be used to generate FnIII libraries using any combination of beta sheets and top and bottom loop regions.
Natural-Variant Combinatorial Cradle Library
[0159] Further provided is a natural-variant combinatorial cradle library of FnIII
polypeptides useful in screening for the presence of one or more polypeptides having a selected binding or enzymatic activity. The cradle library polypeptides include (a) regions A, AB, B, BC, C, CD, D, DE, E, EF, F, FG and G having wildtype amino acid sequences of a selected native FnIII polypeptide or polypeptides, and (b) beta-strands C and F and loop regions CD and FG having selected lengths. At least one selected beta strand or loop region of a selected length contains a library of natural-variant combinatorial sequences expressed by a library of coding sequences that encode at each loop position, a conserved or selected semi-conserved consensus amino acid and, if the consensus amino acid has a frequency of occurrence equal to or less than a selected threshold frequency of at least 50%, other natural variant amino acids, including semi-conserved amino acids and variable amino acids whose occurrence rate is above a selected minimum threshold occurrence at that position, or their chemical equivalents.
[0160] In constructing a natural-variant combinatorial cradle library for a given loop/sheet and loop/sheet length, the variability profile is used to define a sequence of fixed and "variable"
positions, i.e., positions at which amino acid variations can be introduced.
In the cradle libraries, the number of fixed positions will depend on the selected threshold frequency for the consensus amino acid at each position. If, for example, the selected frequency threshold was set at about 60%, the conserved or semi-conserved residues and natural-variant substitutions would not be made at these positions. Conversely, if the threshold frequency is set at 100%, all positions would be considered open to variation, recognizing that a single amino acid with a frequency of 100% at a loop position would not be substituted, and a position that had one very dominant amino acid, e.g., with a frequency of 90%, might be substituted only if the low-frequency variant(s) were chemically dissimilar to the dominant amino acid.
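One way to picture the selection of natural-variant amino acids at a single position is given below. This Python sketch is an illustration under stated assumptions only: the function name, the default thresholds and the example frequencies are hypothetical, and the sketch does not model chemical equivalents or the chemical-dissimilarity consideration mentioned above.

```python
from typing import Dict, List

def natural_variants(freqs: Dict[str, float],
                     consensus_threshold: float = 0.5,
                     min_occurrence: float = 0.1) -> List[str]:
    """Amino acids to encode at one position.

    The consensus residue is always kept. If its frequency is equal to or
    less than `consensus_threshold`, natural variants whose frequency is
    above `min_occurrence` are added, most frequent first.
    """
    ranked = sorted(freqs.items(), key=lambda item: (-item[1], item[0]))
    consensus, top_freq = ranked[0]
    if top_freq > consensus_threshold:
        return [consensus]                      # fixed position
    return [consensus] + [aa for aa, f in ranked[1:] if f > min_occurrence]

# Hypothetical frequencies at two positions (illustration only).
print(natural_variants({"G": 0.45, "S": 0.30, "E": 0.20, "W": 0.05}))
print(natural_variants({"P": 0.90, "A": 0.10}))
```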
[0161] From the amino acid profile for a given loop/sheet and loop/sheet length, and knowing which of the positions will be held fixed and which will admit variations, the amino acid substitutions at each variable position can be selected. In general, the number of variations that are selected (including co-produced amino acids) will depend on the number of variable substitution positions in the loop/sheet and the average number of variations per substituted loop/sheet position. Of course, if natural-variant substitutions are introduced into a single loop only, many more variations per position can be accommodated.
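The trade-off between the number of variable positions and the number of variations per position can be made concrete with a short calculation. The Python sketch below is illustrative only; the function names are hypothetical, and the budget of 10^10 members is borrowed from the approximate phage display limit mentioned earlier rather than from any limit stated for these libraries.

```python
from math import prod
from typing import Sequence

def library_diversity(variants_per_position: Sequence[int]) -> int:
    """Theoretical number of sequences: product of the per-position counts."""
    return prod(variants_per_position)

def max_variants_per_position(n_positions: int, budget: int = 10**10,
                              alphabet: int = 20) -> int:
    """Largest uniform number of amino acids per position (up to `alphabet`)
    whose theoretical diversity still fits within `budget`."""
    variants = 1
    while variants < alphabet and (variants + 1) ** n_positions <= budget:
        variants += 1
    return variants

print(library_diversity([5, 5, 5, 5, 5]))   # five positions, five choices each
for n in (5, 10, 15):
    print(n, max_variants_per_position(n))
```

Under these assumptions, a single five-position loop can be varied to all 20 amino acids at every position, whereas spreading variation over 15 positions leaves room for only about four amino acids per position.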
[0162] The particular natural variant amino acids that are selected for each position will generally include the amino acids having the highest frequencies, while limiting the number of co-produced amino acids, and secondarily, preserving chemical diversity at each site. Once the natural-variant loop/sheet sequences are selected, a library of coding-sequence oligonucleotides encoding all of the identified natural-variant sequences is constructed, making codon substitutions that are effective to preserve the existing consensus amino acid, and encode the selected variant amino acids, including variants encoded by degenerate codons.
[0163] The library of coding sequences for the natural-variant loops/sheets is added to the framework sequences, to construct the library of coding sequences for the natural-variant polypeptide libraries. In some embodiments, the coding library includes coding sequences for a pair of AB/CD, AB/EF, CD/EF or CD/FG loops, where each loop in the pair has one selected length. In another embodiment, the coding library includes coding sequences for any combination of all five loops, AB, BC, CD, EF and FG. In yet another embodiment, the coding library includes coding sequences for the C and F sheets. In still another embodiment, the coding library includes coding sequences for any combination of all beta sheets.
N+/- Libraries
[0164] In addition, the methods of the invention also provide other libraries referred to as the "N+/- libraries." These N+/- libraries are constructed with variations in bottom loops, AB, CD, and EF, the top loops, BC, DE, FG, or any combination of top and bottom loops, and any combination of the beta strands (e.g., C and/or F). For "N+/- libraries," N is the most predominant amino acid at a particular position and amino acids upstream or downstream are designated +N or -N, respectively. For example, N+3 is an amino acid 3 positions upstream of N, while N-3 is an amino acid 3 positions downstream of N in a 3D structure of FnIII.
Likewise, N+2 and N+1 are amino acids at positions 2 and 1 upstream of N, respectively, while N-2 and N-1 are amino acids at positions 2 and 1 downstream of N, respectively. By altering N
from the most predominantly abundant amino acid to a less abundant amino acid, the effect of that modification can be assessed on the abundance of amino acids at 1, 2, or 3 positions away from N. In designing such a library, the frequency and abundance of amino acids surrounding the fixed N position are determined. These differences can be used to generate FnIII cradle libraries.
[0165] For illustrative purposes only, the consensus sequence in the CD/5 loop is SGGEW
(SEQ ID NO:278) at loop positions 1, 2, 3, 4, and 5, with G being the predominant amino acid.
Using the N+/- theory, if G in loop position 3 is fixed as it is the predominant amino acid (N), then the structural and microenvironmental effect of G on loop position 1 (N-2), loop position 2 (N-1), loop position 4 (N+1), and loop position 5 (N+2) is determined. The amino acid frequency of each position N-2, N-1, N+1, N+2 in the context of a fixed G at position N is calculated. Then, if G at loop position 3 is changed to S, the effect of S on positions N-2, N-1, N+1, N+2 (i.e., loop positions 1, 2, 4 and 5) is determined, and so forth.
After all possible combinations are calculated the information yielded is an amino acid distribution (N-2, N-1, N, N+1, N+2) of a given position N within a predetermined loop region in the context of a specific amino acid at this position N. This information can then be used to generate a library.
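The tabulation described for the CD/5 loop can be sketched as a conditional frequency count over a set of aligned loop sequences. The Python sketch below is a hedged illustration: the function name and the input sequences are hypothetical (only SGGEW is taken from the text), and real input would be an alignment of natural FnIII loop sequences of the same length.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable

def conditional_distributions(sequences: Iterable[str], n_index: int,
                              fixed_residue: str) -> Dict[int, Dict[str, int]]:
    """Count residues at each offset from position N (1-based `n_index`),
    using only sequences that carry `fixed_residue` at position N."""
    tables = defaultdict(Counter)
    for seq in sequences:
        if seq[n_index - 1] != fixed_residue:
            continue
        for pos, residue in enumerate(seq, start=1):
            offset = pos - n_index          # -2, -1, +1, +2 for a CD/5 loop
            if offset != 0:
                tables[offset][residue] += 1
    return {offset: dict(counts) for offset, counts in sorted(tables.items())}

# Hypothetical aligned five-residue loops; only SGGEW comes from the text.
loops = ["SGGEW", "SGGEW", "DSGEP", "GESTL", "TRSAV"]
with_g = conditional_distributions(loops, n_index=3, fixed_residue="G")
with_s = conditional_distributions(loops, n_index=3, fixed_residue="S")
print(with_g)
print(with_s)
```

Repeating the count for each amino acid observed at position N yields, for every neighbouring offset, a distribution conditioned on the residue at N, which is the kind of information the paragraph above describes as the basis for generating a library.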
[0166] In another illustration, the consensus sequence of sheet C is GYIVEYREK (SEQ ID NO:279) at sheet positions 1, 2, 3, 4, 5, 6, 7, 8 and 9, respectively, of sheet C. Using the N+/- theory, if Y in position 2 is kept fixed as it is the predominant amino acid (N), then the structural and local microenvironmental effect on G at position 1 (N-1), I at position 3 (N+1), V at position 4 (N+2), E at position 5 (N+3), Y at position 6 (N+4), R at position 7 (N+5), E at position 8 (N+6) and K at position 9 (N+7) is determined. Moreover, if Y at position 2 is changed to V, then the effect of this change on positions N-1, N+1, N+2, N+3, N+4, N+5, N+6 and N+7 (i.e., sheet positions 1, 3, 4, 5, 6, 7, 8 and 9) is determined.
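The position labels used in the two illustrations above follow directly from the difference between each position and the position chosen as N. A minimal Python sketch, with the hypothetical function name `offset_labels`, is given for illustration only.

```python
from typing import List, Tuple

def offset_labels(sequence: str, n_position: int) -> List[Tuple[int, str, str]]:
    """Label each residue of a loop or sheet relative to the 1-based position N."""
    labels = []
    for pos, residue in enumerate(sequence, start=1):
        offset = pos - n_position
        label = "N" if offset == 0 else f"N{offset:+d}"
        labels.append((pos, residue, label))
    return labels

# Sheet C consensus with N at position 2, and the CD/5 loop with N at position 3.
for row in offset_labels("GYIVEYREK", 2):
    print(row)
print([label for _, _, label in offset_labels("SGGEW", 3)])
```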
[0167] The FnIII cradle molecules in a cradle library may be represented by a sequence set forth in SEQ ID NOs: 468-470. Cradle residues are shown in bold with X
representing the amino acid substitution for the beta strands and Y representing the amino acid substitution for the loops with the loop length range given as a subscript. Any substitutions, including natural or engineered amino acids, or other molecules are contemplated. In some embodiments, any of the 19 amino acids other than the native residue can be substituted for the cradle residues.
Substitutions may include, but are not limited to, conservative substitutions that have little or no effect on the overall net charge, polarity, or hydrophobicity of the protein.
Substitutions may also include an insertion and a deletion of one or more amino acids. FnIII
cradle molecules can include alanine substitutions at one or more amino acid positions.
[0168] In some embodiments, the FG loop may be about 1-10 residues in length.
In some embodiments, the FG loop may be about 5 or 6 residues in length. In some embodiments, the FG loop may be five residues in length. In some embodiments, positions 3 and/or 5 of the FG
loop are a Gly residue. In some embodiments, position 1 of the FG loop is an Ala, Gly, Ser, Asn or Asp residue, position 2 of the FG loop is an Ala, Lys, Gly, Val or Gln residue, position 3 of the FG loop is a Gly, Leu, Val, Arg or Tyr residue, position 4 of the FG
loop is a Glu, Leu, Asp, Tyr or Pro residue, and position 5 of the FG loop is a Gly, Ser, Thr, Asn or His residue. In some embodiments, the FG loop may be six amino acids in length. In some embodiments, position 1 of the FG loop is a Gly residue, position 2 of the FG loop is a Leu, Val or Ile residue, position 3 of the FG loop is a charged or polar residue, position 4 of the FG
loop is a Pro residue, position 5 of the FG loop is a Gly residue, and position 6 of the FG
loop is a polar residue. In some embodiments, position 1 of the FG loop is a Gly, Glu, Asp, Ser or Ala residue, position 2 of the FG loop is an Ala, Gly, Tyr, Val or Asn residue, position 3 of the FG loop is a Gly, Gln, Lys, Arg or Glu residue, position 4 of the FG loop is an Arg, Glu, Val, Ile or Leu residue, position 5 of the FG loop is a Ser, Gly, Val, Thr or Leu residue, and position 6 of the FG loop is a Glu, Gly, Lys, Ser or Pro residue.
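As an illustration of how per-position residue choices translate into a set of loop sequences, the Python sketch below enumerates the combinations for an FG loop of five residues. It assumes that the first per-position list recited above (Ala/Gly/Ser/Asn/Asp at position 1 through Gly/Ser/Thr/Asn/His at position 5) applies to the five-residue embodiment; the variable and function names are hypothetical.

```python
from itertools import islice, product
from typing import Iterator, Sequence

# One-letter codes for the residues recited at FG loop positions 1 to 5
# (assumed here to describe the five-residue FG loop embodiment).
FG5_POSITIONS = ["AGSND", "AKGVQ", "GLVRY", "ELDYP", "GSTNH"]

def enumerate_loops(position_sets: Sequence[str]) -> Iterator[str]:
    """Yield every loop sequence obtainable by picking one residue per position."""
    for combo in product(*position_sets):
        yield "".join(combo)

total = 1
for allowed in FG5_POSITIONS:
    total *= len(allowed)

print(total)                                             # number of sequences
print(list(islice(enumerate_loops(FG5_POSITIONS), 3)))   # first few sequences
```

With five choices at each of five positions, this design space is small enough to be sampled exhaustively by the display formats mentioned earlier.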
[0169] In some embodiments, the CD loop may be about 3-11 residues in length.
In some embodiments, the CD loop may be about 4-9 residues in length. In some embodiments, the CD
loop may be four residues in length. In some embodiments, position 1 of the CD
loop is an Asp, Gly, Glu, Ser or Asn residue, position 2 of the CD loop is a Gly, Ala, Asp, Asn or Glu residue, position 3 of the CD loop is a Gln, Glu, Arg, Gly or Thr residue, position 4 of the CD loop is a Pro, Thr, Glu, Ser or Gln residue. In some embodiments, the CD loop may be five amino acids in length. In some embodiments, position 1 of the CD loop is a Ser, Asp, Gly, Glu or Thr residue, position 2 of the CD loop is a Gly, Ser, Arg, Glu or Thr residue, position 3 of the CD
loop is a Gly, Glu, Arg, Lys or Thr residue, position 4 of the CD loop is a Glu, Trp, Ala, Ser or Thr residue, position 5 of the CD loop is a Trp, Pro, Leu, Val or Thr residue.
In some embodiments, the CD loop may be six amino acids in length. In some embodiments, position 1 of the CD loop is a Gly, Asn, Asp, Glu or Lys/Ser residue, position 2 of the CD loop is a Gly, Ser, Lys, Thr or Ala residue, position 3 of the CD loop is a Glu, Pro, Asp, Thr or Asn residue, position 4 of the CD loop is a Gly, Glu, Leu, Arg or Ser residue, position 5 of the CD loop is a Trp, Glu, Asp, Pro or Arg residue, and position 6 of the CD loop is a Glu, Val, Thr, Pro or Ala residue.
[0170] In some embodiments, the beta strand C may be about 6-14 residues in length. In some embodiments, the beta strand C may be about 8-11 residues in length. In some embodiments, the beta strand C may be 9 residues in length. In some embodiments, positions 2, 4 and 6 of the beta strand C are a hydrophobic residue. In some embodiments, positions 1, 3, 5 and 7-9 of the beta strand C are altered relative to the wild type sequence.
In some embodiments, position 1 of the beta strand C is selected from the group consisting of Ala, Gly, Pro, Ser, Thr, Asp, Glu, Asn, Gln, His, Lys and Arg. In some embodiments, position 3 of the beta strand C is a hydrophobic residue. In some embodiments, position 3 of the beta strand C is selected from the group consisting of Ile, Val, Arg, Leu, Thr, Glu, Lys, Ser, Gln and His. In some embodiments, positions 5 and 7-9 of the beta strand C are selected from the group consisting of Ala, Gly, Pro, Ser, Thr, Asp, Glu, Asn, Gln, His, Lys and Arg.
[0171] In some embodiments, the beta strand F may be about 8-13 residues in length. In some embodiments, the beta strand F may be about 9-11 residues in length. In some embodiments, the beta strand F may be 10 residues in length. In some embodiments, positions 1, 3, 5 and 10 of the beta strand F are altered relative to the wild type sequence. In some embodiments, positions 1, 3, 5 and 10 of the beta strand F are selected from the group consisting of Ala, Gly, Pro, Ser, Thr, Asp, Glu, Asn, Gln, His, Lys and Arg.
In some embodiments, positions 2, 4 and 6 of the beta strand F are a hydrophobic residue. In some embodiments, position 7 of the beta strand F is a hydrophobic residue. In some embodiments, position 7 of the beta strand F is selected from the group consisting of Arg, Tyr, Ala, Thr and Val. In some embodiments, position 8 of the beta strand F is selected from the group consisting of Ala, Gly, Ser, Val and Pro. In some embodiments, position 9 of the beta strand F is selected from the group consisting of Val, Leu, Glu, Arg and Ile.
[0172] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 30 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 31, 33, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 47, 49, 50, 67, 69, 71, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86.
[0173] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 31 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 33, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 47, 49, 50, 67, 69, 71, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86.
[0174] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 33 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 47, 49, 50, 67, 69, 71, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86.
[0175] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 35 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 37, 38, 39, 40, 41, 42, 43, 44, 45, 47, 49, 50, 67, 69, 71, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86.
[0176] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 37 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 38, 39, 40, 41, 42, 43, 44, 45, 47, 49, 50, 67, 69, 71, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86.
[0177] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 38 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 39, 40, 41, 42, 43, 44, 45, 47, 49, 50, 67, 69, 71, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86.
[0178] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 39 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 40, 41, 42, 43, 44, 45, 47, 49, 50, 67, 69, 71, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86.
[0179] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 40 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 41, 42, 43, 44, 45, 47, 49, 50, 67, 69, 71, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86.
[0180] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 41 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 40, 42, 43, 44, 45, 47, 49, 50, 67, 69, 71, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86.
[0181] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 42 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 40, 41, 43, 44, 45, 47, 49, 50, 67, 69, 71, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86.
[0182] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 43 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 40, 41, 42, 44, 45, 47, 49, 50, 67, 69, 71, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86.
[0183] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 44 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 40, 41, 42, 43, 45, 47, 49, 50, 67, 69, 71, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86.
[0184] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 45 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 40, 41, 42, 43, 44, 47, 49, 50, 67, 69, 71, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86.
[0185] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 47 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 49, 50, 67, 69, 71, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86.
[0186] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 49 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 47, 50, 67, 69, 71, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86.
[0187] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 50 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 47, 49, 67, 69, 71, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86.
[0188] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 67 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 47, 49, 50, 69, 71, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86.
[0189] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 69 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 47, 49, 50, 67, 71, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86.
[0190] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 71 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 47, 49, 50, 67, 69, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86.
[0191] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 73 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 47, 49, 50, 67, 69, 71, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86.
[0192] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 75 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 47, 49, 50, 67, 69, 71, 73, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86.
[0193] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 76 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 47, 49, 50, 67, 69, 71, 73, 75, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86.
[0194] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 77 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 47, 49, 50, 67, 69, 71, 73, 75, 76, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86.
[0195] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 78 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 47, 49, 50, 67, 69, 71, 73, 75, 76, 77, 79, 80, 81, 82, 83, 84, 85 and/or 86.
[0196] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 79 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 47, 49, 50, 67, 69, 71, 73, 75, 76, 77, 78, 80, 81, 82, 83, 84, 85 and/or 86.
[0197] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 80 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 47, 49, 50, 67, 69, 71, 73, 75, 76, 77, 78, 79, 81, 82, 83, 84, 85 and/or 86.
[0198] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 81 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 47, 49, 50, 67, 69, 71, 73, 75, 76, 77, 78, 79, 80, 82, 83, 84, 85 and/or 86.
[0199] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 82 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 47, 49, 50, 67, 69, 71, 73, 75, 76, 77, 78, 79, 80, 81, 83, 84, 85 and/or 86.
[0200] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 83 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 47, 49, 50, 67, 69, 71, 73, 75, 76, 77, 78, 79, 80, 81, 82, 84, 85 and/or 86.
[0201] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 84 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 47, 49, 50, 67, 69, 71, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 85 and/or 86.
[0202] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 85 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 47, 49, 50, 67, 69, 71, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84 and/or 86.
[0203] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 86 of SEQ ID NO:1 in combination with one or more residue corresponding to amino acid 30, 31, 33, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 47, 49, 50, 67, 69, 71, 73, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84 and/or 85.
[0204] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 33 of SEQ ID NO:97 in combination with one or more residue corresponding to amino acid 35, 37, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 70, 72, 74, 76, 79, 80, 81, 82, 83, 84 and/or 85.
[0205] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 35 of SEQ ID NO:97 in combination with one or more residue corresponding to amino acid 33, 37, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 70, 72, 74, 76, 79, 80, 81, 82, 83, 84 and/or 85.
[0206] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 37 of SEQ ID NO:97 in combination with one or more residue corresponding to amino acid 33, 35, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 70, 72, 74, 76, 79, 80, 81, 82, 83, 84 and/or 85.
[0207] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 39 of SEQ ID NO:97 in combination with one or more residue corresponding to amino acid 33, 35, 37, 40, 41, 42, 43, 44, 45, 46, 47, 48, 70, 72, 74, 76, 79, 80, 81, 82, 83, 84 and/or 85.
[0208] In some embodiments, the cradle library may comprise a variation in an amino acid corresponding to amino acid 40 of SEQ ID NO:97 in combination with one or more residue corresponding to amino acid 33, 35, 37, 39, 41, 42, 43, 44, 45, 46, 47, 48, 70, 72, 74, 76, 79, 80, 81, 82, 83, 84 and/or 85.
[0209] In some embodiments, the cradle library may comprise a variation in an amino acid corresponding to amino acid 41 of SEQ ID NO:97 in combination with one or more residue corresponding to amino acid 33, 35, 37, 39, 40, 42, 43, 44, 45, 46, 47, 48, 70, 72, 74, 76, 79, 80, 81, 82, 83, 84 and/or 85.
[0210] In some embodiments, the cradle library may comprise a variation in an amino acid corresponding to amino acid 42 of SEQ ID NO:97 in combination with one or more residue corresponding to amino acid 33, 35, 37, 39, 40, 41, 43, 44, 45, 46, 47, 48, 70, 72, 74, 76, 79, 80, 81, 82, 83, 84 and/or 85.
[0211] In some embodiments, the cradle library may comprise a variation in an amino acid corresponding to amino acid 43 of SEQ ID NO:97 in combination with one or more residue corresponding to amino acid 33, 35, 37, 39, 40, 41, 42, 44, 45, 46, 47, 48, 70, 72, 74, 76, 79, 80, 81, 82, 83, 84 and/or 85.
[0212] In some embodiments, the cradle library may comprise a variation in an amino acid corresponding to amino acid 44 of SEQ ID NO:97 in combination with one or more residue corresponding to amino acid 33, 35, 37, 39, 40, 41, 42, 43, 45, 46, 47, 48, 70, 72, 74, 76, 79, 80, 81, 82, 83, 84 and/or 85.
[0213] In some embodiments, the cradle library may comprise a variation in an amino acid corresponding to amino acid 45 of SEQ ID NO:97 in combination with one or more residue corresponding to amino acid 33, 35, 37, 39, 40, 41, 42, 43, 44, 46, 47, 48, 70, 72, 74, 76, 79, 80, 81, 82, 83, 84 and/or 85.
[0214] In some embodiments, the cradle library may comprise a variation in an amino acid corresponding to amino acid 46 of SEQ ID NO:97 in combination with one or more residue corresponding to amino acid 33, 35, 37, 39, 40, 41, 42, 43, 44, 45, 47, 48, 70, 72, 74, 76, 79, 80, 81, 82, 83, 84 and/or 85.
[0215] In some embodiments, the cradle library may comprise a variation in an amino acid corresponding to amino acid 47 of SEQ ID NO:97 in combination with one or more residue corresponding to amino acid 33, 35, 37, 39, 40, 41, 42, 43, 44, 45, 46, 48, 70, 72, 74, 76, 79, 80, 81, 82, 83, 84 and/or 85.
[0216] In some embodiments, the cradle library may comprise a variation in an amino acid corresponding to amino acid 48 of SEQ ID NO:97 in combination with one or more residue corresponding to amino acid 33, 35, 37, 39, 40, 41, 42, 43, 44, 45, 46, 47, 70, 72, 74, 76, 79, 80, 81, 82, 83, 84 and/or 85.
[0217] In some embodiments, the cradle library may comprise a variation in an amino acid corresponding to amino acid 70 of SEQ ID NO:97 in combination with one or more residue corresponding to amino acid 33, 35, 37, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 72, 74, 76, 79, 80, 81, 82, 83, 84 and/or 85.
[0218] In some embodiments, the cradle library may comprise a variation in an amino acid corresponding to amino acid 72 of SEQ ID NO:97 in combination with one or more residue corresponding to amino acid 33, 35, 37, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 70, 74, 76, 79, 80, 81, 82, 83, 84 and/or 85.
[0219] In some embodiments, the cradle library may comprise a variation in an amino acid corresponding to amino acid 74 of SEQ ID NO:97 in combination with one or more residue corresponding to amino acid 33, 35, 37, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 70, 72, 76, 79, 80, 81, 82, 83, 84 and/or 85.
[0220] In some embodiments, the cradle library may comprise a variation in an amino acid corresponding to amino acid 76 of SEQ ID NO:97 in combination with one or more residue corresponding to amino acid 33, 35, 37, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 70, 72, 74, 79, 80, 81, 82, 83, 84 and/or 85.
[0221] In some embodiments, the cradle library may comprise a variation in an amino acid corresponding to amino acid 79 of SEQ ID NO:97 in combination with one or more residue corresponding to amino acid 33, 35, 37, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 70, 72, 74, 76, 80, 81, 82, 83, 84 and/or 85.
[0222] In some embodiments, the cradle library may comprise a variation in an amino acid corresponding to amino acid 80 of SEQ ID NO:97 in combination with one or more residue corresponding to amino acid 33, 35, 37, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 70, 72, 74, 76, 79, 81, 82, 83, 84 and/or 85.
[0223] In some embodiments, the cradle library may comprise a variation in an amino acid corresponding to amino acid 81 of SEQ ID NO:97 in combination with one or more residue corresponding to amino acid 33, 35, 37, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 70, 72, 74, 76, 79, 80, 82, 83, 84 and/or 85.
[0224] In some embodiments, the cradle library may comprise a variation in an amino acid corresponding to amino acid 82 of SEQ ID NO:97 in combination with one or more residue corresponding to amino acid 33, 35, 37, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 70, 72, 74, 76, 79, 80, 81, 83, 84 and/or 85.
[0225] In some embodiments, the cradle library may comprise a variation in an amino acid corresponding to amino acid 83 of SEQ ID NO:97 in combination with one or more residue corresponding to amino acid 33, 35, 37, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 70, 72, 74, 76, 79, 80, 81, 82, 84 and/or 85.
[0226] In some embodiments, the cradle library may comprise a variation in an amino acid corresponding to amino acid 84 of SEQ ID NO:97 in combination with one or more residue corresponding to amino acid 33, 35, 37, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 70, 72, 74, 76, 79, 80, 81, 82, 83 and/or 85.
[0227] In some embodiments, the cradle library may comprise a variation in an amino acid corresponding to amino acid 85 of SEQ ID NO:97 in combination with one or more residue corresponding to amino acid 33, 35, 37, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 70, 72, 74, 76, 79, 80, 81, 82, 83 and/or 84.
[0228] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 31 of SEQ ID NO:129 in combination with one or more residue corresponding to amino acid 33, 35, 37, 39, 40, 41, 42, 43, 44, 66, 68, 70, 72, 75, 76, 77, 78, 79, 80 and/or 81.
[0229] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 31 of SEQ ID NO:129 in combination with one or more residue corresponding to amino acid 33, 35, 37, 39, 40, 41, 42, 43, 44, 66, 68, 70, 72, 75, 76, 77, 78, 79, 80 and/or 81.
[0230] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 33 of SEQ ID NO:129 in combination with one or more residue corresponding to amino acid 31, 35, 37, 39, 40, 41, 42, 43, 44, 66, 68, 70, 72, 75, 76, 77, 78, 79, 80 and/or 81.
[0231] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 35 of SEQ ID NO:129 in combination with one or more residue corresponding to amino acid 31, 33, 37, 39, 40, 41, 42, 43, 44, 66, 68, 70, 72, 75, 76, 77, 78, 79, 80 and/or 81.
[0232] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 37 of SEQ ID NO:129 in combination with one or more residue corresponding to amino acid 31, 33, 35, 39, 40, 41, 42, 43, 44, 66, 68, 70, 72, 75, 76, 77, 78, 79, 80 and/or 81.
[0233] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 39 of SEQ ID NO:129 in combination with one or more residue corresponding to amino acid 31, 33, 35, 37, 40, 41, 42, 43, 44, 66, 68, 70, 72, 75, 76, 77, 78, 79, 80 and/or 81.
[0234] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 40 of SEQ ID NO:129 in combination with one or more residue corresponding to amino acid 31, 33, 35, 37, 39, 41, 42, 43, 44, 66, 68, 70, 72, 75, 76, 77, 78, 79, 80 and/or 81.
[0235] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 41 of SEQ ID NO:129 in combination with one or more residue corresponding to amino acid 31, 33, 35, 37, 39, 40, 42, 43, 44, 66, 68, 70, 72, 75, 76, 77, 78, 79, 80 and/or 81.
[0236] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 42 of SEQ ID NO:129 in combination with one or more residue corresponding to amino acid 31, 33, 35, 37, 39, 40, 41, 43, 44, 66, 68, 70, 72, 75, 76, 77, 78, 79, 80 and/or 81.
[0237] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 43 of SEQ ID NO:129 in combination with one or more residue corresponding to amino acid 31, 33, 35, 37, 39, 40, 41, 42, 44, 66, 68, 70, 72, 75, 76, 77, 78, 79, 80 and/or 81.
[0238] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 44 of SEQ ID NO:129 in combination with one or more residue corresponding to amino acid 31, 33, 35, 37, 39, 40, 41, 42, 43, 66, 68, 70, 72, 75, 76, 77, 78, 79, 80 and/or 81.
[0239] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 66 of SEQ ID NO:129 in combination with one or more residue corresponding to amino acid 31, 33, 35, 37, 39, 40, 41, 42, 43, 44, 68, 70, 72, 75, 76, 77, 78, 79, 80 and/or 81.
[0240] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 68 of SEQ ID NO:129 in combination with one or more residue corresponding to amino acid 31, 33, 35, 37, 39, 40, 41, 42, 43, 44, 66, 70, 72, 75, 76, 77, 78, 79, 80 and/or 81.
[0241] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 70 of SEQ ID NO:129 in combination with one or more residue corresponding to amino acid 31, 33, 35, 37, 39, 40, 41, 42, 43, 44, 66, 68, 72, 75, 76, 77, 78, 79, 80 and/or 81.
[0242] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 72 of SEQ ID NO:129 in combination with one or more residue corresponding to amino acid 31, 33, 35, 37, 39, 40, 41, 42, 43, 44, 66, 68, 70, 75, 76, 77, 78, 79, 80 and/or 81.
[0243] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 75 of SEQ ID NO:129 in combination with one or more residue corresponding to amino acid 31, 33, 35, 37, 39, 40, 41, 42, 43, 44, 66, 68, 70, 72, 76, 77, 78, 79, 80 and/or 81.
[0244] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 76 of SEQ ID NO:129 in combination with one or more residue corresponding to amino acid 31, 33, 35, 37, 39, 40, 41, 42, 43, 44, 66, 68, 70, 72, 75, 77, 78, 79, 80 and/or 81.
[0245] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 77 of SEQ ID NO:129 in combination with one or more residue corresponding to amino acid 31, 33, 35, 37, 39, 40, 41, 42, 43, 44, 66, 68, 70, 72, 75, 76, 78, 79, 80 and/or 81.
[0246] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 78 of SEQ ID NO:129 in combination with one or more residue corresponding to amino acid 31, 33, 35, 37, 39, 40, 41, 42, 43, 44, 66, 68, 70, 72, 75, 76, 77, 79, 80 and/or 81.
[0247] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 79 of SEQ ID NO:129 in combination with one or more residue corresponding to amino acid 31, 33, 35, 37, 39, 40, 41, 42, 43, 44, 66, 68, 70, 72, 75, 76, 77, 78, 80 and/or 81.
[0248] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 80 of SEQ ID NO:129 in combination with one or more residue corresponding to amino acid 31, 33, 35, 37, 39, 40, 41, 42, 43, 44, 66, 68, 70, 72, 75, 76, 77, 78, 79 and/or 81.
[0249] In some embodiments, the cradle library comprises a variation in an amino acid corresponding to amino acid 81 of SEQ ID NO:129 in combination with one or more residue corresponding to amino acid 31, 33, 35, 37, 39, 40, 41, 42, 43, 44, 66, 68, 70, 72, 75, 76, 77, 78, 79 and/or 80.
V. Computer-Assisted FnIII Cradle Library Construction
[0250] Further provided herein are methods of making a cradle library of FnIII domain variants based on sequence information obtained through, e.g., bioinformatics and/or structural analysis. The first step in building a fibronectin library of the invention is selecting sequences that meet certain predetermined criteria. PFAM, ProSite and similar databases were searched for sequences containing FnIII domains (Figure 1). These electronic databases contain catalogued expressed fibronectin and fibronectin-like protein sequences and can be queried for those FnIII domains and similar sequences (e.g., using the BLAST search algorithm). The FnIII domain sequences can then be grouped according to predefined criteria such as domain subclasses, sequence similarity or originating organism(s).
[0251] The choice of FnIII domains based on the criteria of the invention dictates both the loop sizes and the initial amino acid sequence diversity to be introduced. By bioinformatics-led design, the loop regions are flexible for insertion into multiple FnIII domains. By specific targeted loop substitutions, overall scaffold stability is maximized while concurrently, non-immunogenic substitutions are minimized. Additionally, the library can be size-tailored so that the overall diversity can be readily screened in different systems. Furthermore, the representative diversity of the designed loops is still capable of binding a number of pre-defined targets. Moreover, the systematic design of the loops still allows subsequent affinity maturation of recovered binding clones.
[0252] FnIII domain sequences are then delineated, whereupon the intervening beta strand and loop regions and their constituent amino acids are identified. This determines the length of the existing loops and beta strands and the amino acid profiles for each loop length and beta strand length, and hence the physical size and amino acid diversity that can be accommodated within these frameworks. Once the loops and beta strands are identified, sequences within each loop and beta strand are aligned, and the aligned sequences are then split into groups according to loop length and beta strand length. The distributions of beta strand lengths for strands A-G and loop lengths for the AB, BC, CD, EF and FG loops were identified (see, e.g., Figure 4). Using this information, the most common beta strand lengths and loop sizes are selected. In some embodiments, the selected loop lengths are CD/4, CD/5, CD/6, FG/5 and FG/6, and the selected beta strand lengths are 9 residues for beta strand C and 10 residues for beta strand F.
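The grouping of delineated loops by length described in this paragraph can be sketched as follows. This is a minimal illustration only, assuming the loop sequences have already been extracted from aligned FnIII domains; the function names `group_by_length` and `most_common_lengths` and the toy CD-loop sequences are hypothetical and are not taken from the specification.

```python
from collections import Counter, defaultdict

def group_by_length(loop_seqs):
    """Group delineated loop sequences by their residue count."""
    groups = defaultdict(list)
    for seq in loop_seqs:
        groups[len(seq)].append(seq)
    return dict(groups)

def most_common_lengths(loop_seqs, top_n=3):
    """Return the top_n most frequent loop lengths, most frequent first."""
    counts = Counter(len(seq) for seq in loop_seqs)
    return [length for length, _ in counts.most_common(top_n)]

# Hypothetical CD-loop sequences (made up for illustration).
cd_loops = ["GSKS", "TGGNS", "PGSEG", "GDSPAS", "EGSKT", "NGQS"]

print(most_common_lengths(cd_loops, top_n=2))
print(sorted(group_by_length(cd_loops)))
```

Each length group can then be aligned and profiled separately, which is why the selected lengths are reported per loop (for example CD/4, CD/5 and CD/6).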
[0253] For each beta strand, one can determine the preferred loop acceptor sites based on both comparative structural and sequence analysis. For example, one can use a structural overlay comparison of the overall loop and beta strand scaffolds between FnIII7, FnIII10, FnIII14, or any of the other known FnIII domains. By identifying precise loop positions, the above step greatly reduces the number of loop diversity mutations that would not result in functional ligand binding specificity.
[0254] Once loop lengths are selected, a positional amino acid frequency analysis is performed at each loop position, to determine the frequency of occurrence, in a set of native FnIII domains. This method may include a frequency analysis and the generation of the corresponding variability profiles (VP) of existing loop sequences (see Example 6). In addition, the outward facing amino acids of sheets C and F were analyzed to determine the frequency of occurrence, in a set of native FnIII domains (Figure 7B). Amino acids 1, 3, 5, 7-9 in beta strand C and amino acids 1, 3, 5, 7, and 10 in beta strand F are intended for use in the cradle library.
High frequency (e.g., >50%) positions are considered conserved or fixed.
Moderately high frequency or "semi-conserved" amino acids (which, when 2 or 3 are combined, account for >40%) are chosen as "wildtype" at other positions. These wildtype amino acids are then systematically altered using mutagenesis, e.g., walk-through mutagenesis, to generate the cradle library.
"Variable" positions are those where, typically, no one amino acid accounts for more than 20% of the represented set.
[0255] A variability profile analysis of the FnIII domain databases allows identification of loop/beta strand amino acid residue positions that fall within three categories, e.g., 1) positions that should be conserved or "fixed," 2) semi-conserved, and/or 3) variable positions that are suitable for diversity generation. A variability profile analysis is performed and a threshold frequency is used to identify the most favorable sequences to be used in designating the overall loop/sheet diversity.
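The three-way classification of positions can be sketched as a small frequency analysis over aligned loop sequences of equal length. This is an illustrative sketch under stated assumptions: the default thresholds (>50% conserved, top 2 or 3 residues combined >40% semi-conserved, no residue above 20% variable) follow the example values given above, while the function names, the order in which the rules are applied, the fallback for positions that meet none of the rules, and the toy sequences are all hypothetical.

```python
from collections import Counter

def classify_position(residues, fixed=0.50, semi=0.40, variable=0.20):
    """Classify one aligned position from the residues observed there."""
    total = len(residues)
    freqs = sorted((n / total for n in Counter(residues).values()), reverse=True)
    if freqs[0] > fixed:
        return "conserved"
    if freqs[0] <= variable:
        return "variable"
    if sum(freqs[:3]) > semi:
        return "semi-conserved"
    # Assumption: positions meeting none of the rules are treated as variable.
    return "variable"

def variability_profile(aligned_loops):
    """Classify every column of a set of equal-length loop sequences."""
    return [classify_position(column) for column in zip(*aligned_loops)]

# Hypothetical aligned three-residue loops (made up for illustration).
loops = ["GDA", "GDC", "GDE", "GSF", "GSH", "GSK", "GTL", "GTM", "SNP", "SQR"]

print(variability_profile(loops))
```

In this toy set the first column is dominated by one residue, the second is shared among a few residues, and the third has no preferred residue, so the three columns fall into the three categories in turn.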
[0256] The conserved or a selected semi-conserved sequence (typically the most frequent amino acid in the semi-conserved residues) is considered the "wild type" or "consensus" residue in the loop sequence. This "consensus" or "frequency" approach identifies those particular amino acids under high selective pressure that occur most frequently at a particular position.
[0257] Accordingly, these residue positions are typically fixed, with diversity being introduced into remaining amino acid positions (taking into account the identified preference for certain amino acids to be present at these positions). The threshold for occurrence frequency at which amino acid variation will be introduced can vary between selected levels as low as 40%, preferably 50% to as high as 100%. At the 100% threshold frequency, mutagenesis of amino acids can be introduced at all positions of the loop, and the only constraints on natural-variant amino acids will be the total number of variants and whether chemical equivalents are available.
[0258] When designing the diversity for any of the above-mentioned loops and beta strands, modified amino acid residues, for example, residues outside the traditional 20 amino acids used in most polypeptides, e.g., homocysteine, can be incorporated into the loops as desired. This is carried out using art recognized techniques which typically introduce stop codons into the polynucleotide where the modified amino acid residue is desired. The technique then provides a modified tRNA linked to the modified amino acid to be incorporated (a so-called suppressor tRNA of, e.g., the stop codon amber, opal, or ochre) into the polypeptide (see, e.g., Rohrer, et al., PNAS (2001) 98:14310-14315).
[0259] The FnIII cradle libraries of the invention are constructed with the benefit of sequence and structural information such that the potential for generating improved FnIII cradle molecules is increased. Structural molecular replacement modeling information can also be used to guide the selection of amino acid diversity to be introduced into the defined beta strand and loop regions. Still further, actual results obtained with the FnIII cradle molecules of the invention can guide the selection (or exclusion), e.g., affinity maturation, of subsequent FnIII cradle molecules to be made and screened in an iterative manner.
[0260] Further provided herein is a method for selecting a protein binding domain specific for a target, comprising (a) detecting target-specific binding of one or more members of a cradle library comprising a plurality of FnIII domain polypeptides having amino acid substitutions that correspond to at least amino acid position 31, 33, 47, 49, 71, 73, and/or 75 of SEQ ID NO:1; and (b) selecting the protein binding domain that specifically binds the target. In some embodiments, the method may further comprise first preparing the plurality of FnIII domain polypeptide variants described herein, e.g., FnIII domains having amino acid substitutions that correspond to at least amino acid position 31, 33, 47, 49, 71, 73, and/or 75 of SEQ ID NO:1. In some embodiments, a polypeptide identified as exhibiting a particular characteristic may be isolated. In some embodiments, the method may further comprise determining the nucleic acid and/or the amino acid sequence of the selected protein binding domain. In some embodiments, the selected protein binding domain may be synthesized or expressed.
[0261] In some embodiments, in silico modeling is used to eliminate the production of any FnIII cradle molecules predicted to have poor or undesired structure and/or function. In this way, the number of FnIII cradle molecules to be produced can be sharply reduced thereby increasing signal-to-noise in subsequent screening assays. In another particular embodiment, the in silico modeling is continually updated with additional modeling information, from any relevant source, e.g., from gene and protein sequence and three-dimensional databases and/or results from previously tested FnIII cradle molecules, so that the in silico database becomes more precise in its predictive ability (Figure 1).
[0262] In yet another embodiment, the in silico database is provided with the assay results, e.g., binding affinity/avidity of previously tested FnIII cradle molecules, and categorizes the FnIII cradle molecules, based on the assay criterion or criteria, as responders or nonresponders, e.g., as FnIII cradle molecules that bind well or poorly. In this way, the affinity maturation of the invention can equate a range of functional responses with particular sequence and structural information and use such information to guide the production of future FnIII cradle molecules to be tested. The method is especially suitable for screening FnIII cradle molecules for a particular binding affinity to a target ligand using, e.g., a Biacore™ assay.
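The responder/nonresponder categorization can be illustrated with a short sketch. It assumes, purely for illustration, that each tested clone has a measured dissociation constant in nM and that a single cutoff separates the two classes; the cutoff value, the clone names, the sequences and the function names are hypothetical, and the specification does not prescribe any particular criterion.

```python
from collections import Counter

def categorize(assay_kd_nm, cutoff_nm=100.0):
    """Split clones into responders (KD at or below cutoff) and nonresponders."""
    responders = sorted(c for c, kd in assay_kd_nm.items() if kd <= cutoff_nm)
    nonresponders = sorted(c for c, kd in assay_kd_nm.items() if kd > cutoff_nm)
    return responders, nonresponders

def residue_tally(sequences, clones, position):
    """Count the residues seen at one 0-based position among the given clones."""
    return Counter(sequences[c][position] for c in clones)

# Hypothetical diversified-region sequences and assay results (made up).
sequences = {"c1": "GYSW", "c2": "GYAW", "c3": "GDSK", "c4": "GYSK"}
kd_nm = {"c1": 12.0, "c2": 45.0, "c3": 900.0, "c4": 80.0}

responders, nonresponders = categorize(kd_nm)
print(responders, nonresponders)
print(residue_tally(sequences, responders, position=1))
```

A tally of this kind links an assay outcome back to sequence features, which is the sort of information that could guide which variants are made in the next round.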
[0263] Accordingly, mutagenesis of noncontiguous residues within a loop region or a beta-strand can be desirable if it is known, e.g., through in silico modeling, that certain residues in the region will not participate in the desired function. The coordinate structure and spatial interrelationship between the defined regions, e.g., the functional amino acid residues in the defined regions of the FnIII cradle molecules, e.g., the diversity that has been introduced, can be considered and modeled. Such modeling criteria include, e.g., amino acid residue side group chemistry, atom distances, crystallography data, etc. Accordingly, the number of FnIII cradle molecules to be produced can be intelligently minimized.
[0264] In some embodiments, one or more of the above steps are computer-assisted. In a particular embodiment, the computer-assisted step comprises, e.g., mining the NCBI, Genbank, PFAM, and ProSite databases and, optionally, cross-referencing the results against the PDB structural database, whereby certain criteria of the invention are determined and used to design the desired loop diversity (Figure 1). The method is also amenable to being carried out, in part or in whole, by a device, e.g., a computer-driven device. For example, database mining, fibronectin domain sequence selection, diversity design, oligonucleotide synthesis, PCR-mediated assembly of the foregoing, and expression and selection of candidate FnIII cradle molecules that bind a given target can be carried out, in part or entirely, by interlaced devices.
In addition, instructions for carrying out the method, in part or in whole, can be conferred to a medium suitable for use in an electronic device for carrying out the instructions. In sum, the methods of the invention are amenable to a high-throughput approach comprising software (e.g., computer-readable instructions) and hardware (e.g., computers, robotics, and chips).
[0265] Further details regarding fibronectin and FnIII sequence classification, identification, and analysis may be found, e.g., in PFAM; "A program to screen aligned nucleotide and amino acid sequences," Johnson, G., Methods Mol. Biol. (1995) 51:1-15; and Wu, et al., "Clustering of highly homologous sequences to reduce the size of large protein databases," Bioinformatics (2001) 17:282-283. Databases and search and analysis programs include the PFAM database at the Sanger Institute (pfam.sanger.ac.uk); the ExPASy PROSITE database (expasy.ch/prosite/); SBASE web (hydra.icgeb.trieste.it/sbase/); BLAST (located on the World Wide Web at ncbi.nlm.nih.gov/BLAST/); CD-HIT (bioinformatics.ljcrf.edu/cd-hi/); EMBOSS (hgmp.mrc.ac.uk/Software/EMBOSS/); PHYLIP (evolution.genetics.washington.edu/phylip.html); and FASTA (fasta.bioch.virginia.edu).
[0266] The bioinformatic analysis focuses on FnIII domain genes for descriptive purposes, but it will be understood that genes for other Fn domains and other scaffold proteins are similarly evaluated.
VI. Synthesizing FnIII Cradle Libraries
[0267] The cradle library of polypeptides may be encoded by an expression library that has the format of a ribosome display library, a polysome display library, a phage display library, a bacterial expression library, or a yeast display library.
[0268] In some embodiments, the FnIII cradle libraries of the invention are generated for screening by synthesizing individual oligonucleotides that encode the defined region of the polypeptide and have no more than one codon for the predetermined amino acid.
This is accomplished by incorporating, at each codon position within the oligonucleotide, either the codon required for synthesis of the wild-type polypeptide or a codon for the predetermined amino acid, and is referred to as look-through mutagenesis (LTM) (see, e.g., U.S. Patent Publication No. 20050136428).
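The look-through scheme, in which each oligonucleotide carries no more than one codon for the predetermined amino acid, can be sketched as a simple enumeration. This is a minimal illustration only: the function name `look_through_oligos`, the three-codon wild-type region and the choice of TAT (tyrosine) as the predetermined codon are hypothetical, and positions whose wild-type codon already equals the predetermined codon are skipped as a simplifying assumption.

```python
def look_through_oligos(wild_type_codons, target_codon):
    """Return one oligonucleotide per position, each with a single substitution.

    Every returned sequence keeps the wild-type codon at all positions except
    one, where the codon for the predetermined amino acid is placed.
    """
    oligos = []
    for i, wt_codon in enumerate(wild_type_codons):
        if wt_codon == target_codon:
            continue  # nothing to substitute at this position
        variant = list(wild_type_codons)
        variant[i] = target_codon
        oligos.append("".join(variant))
    return oligos

# Hypothetical wild-type region Gly-Ser-Ala and predetermined amino acid Tyr.
wild_type = ["GGT", "TCT", "GCT"]
for oligo in look_through_oligos(wild_type, "TAT"):
    print(oligo)
```

For a region of n codons this yields at most n oligonucleotides per predetermined amino acid, which keeps the resulting library small relative to full randomization.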
[0269] In some embodiments, when diversity at multiple amino acid positions is required, walk-through mutagenesis (WTM) can be used (see, e.g., U.S. Patent Nos. 6,649,340; 5,830,650; and 5,798,208; and U.S. Patent Publication No. 20050136428). In another embodiment, diversity can be created using the methods available from commercial vendors such as DNA2.0 and Geneart by providing information about the loop lengths of the AB, BC, CD, EF and FG loops, the positional distribution of amino acids at each position of the loop, and the abundance of the top 7 amino acids at each position of the loop.
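One way to picture how a walk-through design places both the wild-type and the predetermined amino acid at a position is to mix, at each base of the codon, the wild-type base and the target base. The sketch below assumes that base-mixture scheme, which this paragraph does not itself spell out; the function name and the example codons are hypothetical. The codon table is the standard genetic code written in TCAG order.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code; codons are enumerated with the third base varying fastest.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def walk_through_codon(wt_codon, target_codon):
    """Return the per-base mixtures and the set of amino acids they encode."""
    mixtures = [sorted({w, t}) for w, t in zip(wt_codon, target_codon)]
    encoded = {CODON_TABLE["".join(c)] for c in product(*mixtures)}
    return mixtures, encoded

# Hypothetical position: wild-type Gly (GGT), predetermined amino acid Tyr (TAT).
mixtures, encoded = walk_through_codon("GGT", "TAT")
print(mixtures)
print(sorted(encoded))
```

In this example the mixed codon encodes the wild-type and target residues plus two co-products (Asp and Cys), which illustrates why the encoded set should be checked when the diversity at several positions is designed together.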
[0270] The mixture of oligonucleotides for generation of the library can be synthesized readily by known methods for DNA synthesis. The preferred method involves use of solid phase beta-cyanoethyl phosphoramidite chemistry (see, e.g., U.S. Pat. No. 4,725,677). For convenience, an instrument for automated DNA synthesis can be used containing specified reagent vessels of nucleotides. The polynucleotides may also be synthesized to contain restriction sites or primer hybridization sites to facilitate the introduction or assembly of the polynucleotides representing, e.g., a defined region, into a larger gene context.
[0271] The synthesized polynucleotides can be inserted into a larger gene context, e.g., a single scaffold domain using standard genetic engineering techniques. For example, the polynucleotides can be made to contain flanking recognition sites for restriction enzymes (see, e.g., U.S. Pat. No. 4,888,286). The recognition sites can be designed to correspond to recognition sites that either exist naturally or are introduced in the gene proximate to the DNA
encoding the region. After conversion into double stranded form, the polynucleotides are ligated into the gene or gene vector by standard techniques. By means of an appropriate vector (including, e.g., phage vectors, plasmids) the genes can be introduced into a cell-free extract, phage, prokaryotic cell, or eukaryotic cell suitable for expression of the fibronectin binding domain molecules.
[0272] When partially overlapping polynucleotides are used in the gene assembly, a set of degenerate nucleotides can also be directly incorporated in place of one of the polynucleotides.
The appropriate complementary strand is synthesized during the extension reaction from a partially complementary polynucleotide from the other strand by enzymatic extension with a polymerase. Incorporation of the degenerate polynucleotides at the stage of synthesis also simplifies cloning where more than one domain or defined region of a gene is mutagenized or engineered to have diversity.
[0273] In another approach, the fibronectin binding domain is present on a single stranded plasmid. For example, the gene can be cloned into a phage vector or a vector with a filamentous phage origin of replication that allows propagation of single-stranded molecules with the use of a helper phage. The single-stranded template can be annealed with a set of degenerate polynucleotides representing the desired mutations and elongated and ligated, thus incorporating each analog strand into a population of molecules that can be introduced into an appropriate host (see, e.g., Sayers, J. R., et al., Nucleic Acids Res. (1988) 16:791-802). This approach can circumvent multiple cloning steps where multiple domains are selected for mutagenesis.
[0274] Polymerase chain reaction (PCR) methodology can also be used to incorporate polynucleotides into a gene, for example, loop diversity into beta strand framework regions.
For example, the polynucleotides themselves can be used as primers for extension. In this approach, polynucleotides encoding the mutagenic cassettes corresponding to the defined region (or portion thereof) are complementary to each other, at least in part, and can be extended to form a large gene cassette (e.g., a fibronectin binding domain) using a polymerase, e.g., using PCR amplification.
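The extension of partially complementary polynucleotides into a larger cassette can be modeled at the sequence level. The sketch below is a string-level illustration only, not a simulation of PCR: it assumes two oligonucleotides written 5' to 3', one on each strand, whose 3' ends are complementary over at least `min_overlap` bases. The function names, the minimum overlap and the example sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def overlap_extend(sense_oligo, antisense_oligo, min_overlap=6):
    """Join two partially complementary oligos into one sense-strand sequence."""
    downstream = reverse_complement(antisense_oligo)
    longest = min(len(sense_oligo), len(downstream))
    # Try the longest possible overlap first, then shorter ones.
    for k in range(longest, min_overlap - 1, -1):
        if sense_oligo[-k:] == downstream[:k]:
            return sense_oligo + downstream[k:]
    raise ValueError("oligos do not share a sufficient overlap")

# Hypothetical oligos sharing a 6-base complementary region.
sense = "ATGGCTAGCGGTTCT"
antisense = "TTAAGCATAAGAACC"
print(overlap_extend(sense, antisense))
```

Repeating the same join across a chain of overlapping oligonucleotides gives the full-length cassette that a polymerase would produce by extension.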
[0275] The size of the library will vary depending upon the loop/sheet length and the amount of sequence diversity that needs to be represented using mutagenesis methods. For example, the library is designed to contain fewer than 10^15, 10^14, 10^13, 10^12, 10^11, 10^10, 10^9, 10^8, 10^7, or 10^6 fibronectin binding domains.
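The relationship between the number of diversified positions and library size can be checked with simple arithmetic: the theoretical diversity is the product of the number of amino acids allowed at each position. The sketch below is illustrative only; the function names and the two example designs are hypothetical.

```python
from math import prod

def theoretical_diversity(choices_per_position):
    """Number of distinct sequences, given the options allowed at each position."""
    return prod(choices_per_position)

def fits_limit(choices_per_position, limit):
    """True if the design contains fewer than `limit` distinct sequences."""
    return theoretical_diversity(choices_per_position) < limit

# Hypothetical designs: 7 positions with 7 amino acids each,
# and 12 positions with all 20 amino acids each.
small_design = [7] * 7
large_design = [20] * 12

print(theoretical_diversity(small_design), fits_limit(small_design, 10**6))
print(theoretical_diversity(large_design), fits_limit(large_design, 10**15))
```

A calculation of this kind shows quickly whether a proposed design can be covered by a given display format before any oligonucleotides are ordered.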
[0276] The description above has centered on representing fibronectin binding domain diversity by altering the polynucleotide that encodes the corresponding polypeptide. It is understood, however, that the scope of the invention also encompasses methods of representing the fibronectin binding domain diversity disclosed herein by direct synthesis of the desired polypeptide regions using protein chemistry. In carrying out this approach, the resultant polypeptides still incorporate the features of the invention except that the use of a polynucleotide intermediate can be eliminated.
[0277] For the libraries described above, whether in the form of polynucleotides and/or corresponding polypeptides, it is understood that the libraries may be also attached to a solid support, such as a microchip, and preferably arrayed, using art recognized techniques.
[0278] The method of this invention is especially useful for modifying candidate fibronectin binding domain molecules by way of affinity maturation. Alterations can be introduced into the loops and/or into the beta strand framework (constant) region of a fibronectin binding domain.
Modification of the beta sheets and loop regions can produce fibronectin binding domains with better ligand binding properties and, if desired, catalytic properties.
Modification of the beta strand framework region can also lead to the improvement of chemo-physical properties, such as solubility or stability, which are especially useful, for example, in commercial production, bioavailability, and affinity for the ligand. Typically, the mutagenesis will target the loop region(s) of the fibronectin binding domain, i.e., the structure responsible for ligand-binding activity, which can be made up of the three loop regions. In a preferred embodiment, an identified candidate binding molecule is subjected to affinity maturation to increase the affinity/avidity of the binding molecule to a target ligand. In one embodiment, modifications to at least one loop and at least one beta sheet produce an FnIII cradle molecule with an increased surface area available for binding to a target molecule. In one embodiment, modifications to at least one top loop, at least one bottom loop, and at least one beta sheet produce an FnIII cradle molecule with an increased surface area available for binding to a target molecule. In one embodiment, modifications to the FG and CD loops and the C and/or F beta sheet produce an FnIII cradle molecule with an increased surface area available for binding to a target molecule.
In one embodiment, modifications to at least one loop and at least one beta sheet produce an FnIII cradle molecule that can bind to different target molecules.
[0279] In general, the practice of the present invention employs, unless otherwise indicated, conventional techniques of chemistry, molecular biology, recombinant DNA technology, PCR technology, immunology (especially, e.g., antibody technology), expression systems (e.g., cell-free expression, phage display, ribosome display, and Profusion™), and any necessary cell culture that are within the skill of the art and are explained in the literature. See, e.g., Sambrook, Fritsch and Maniatis, Molecular Cloning: Cold Spring Harbor Laboratory Press (1989); DNA Cloning, Vols. 1 and 2, (D.N. Glover, Ed. 1985); Oligonucleotide Synthesis (M.J. Gait, Ed. 1984); PCR Handbook Current Protocols in Nucleic Acid Chemistry, Beaucage, Ed., John Wiley & Sons (1999); Oxford Handbook of Nucleic Acid Structure, Neidle, Ed., Oxford Univ Press (1999); PCR Protocols: A Guide to Methods and Applications, Innis, et al., Academic Press (1990); PCR Essential Techniques: Essential Techniques, Burke, Ed., John Wiley & Son Ltd (1996); The PCR Technique: RT-PCR, Siebert, Ed., Eaton Pub. Co. (1998); Current Protocols in Molecular Biology, eds. Ausubel, et al., John Wiley & Sons (1992); Large-Scale Mammalian Cell Culture Technology, Lubiniecki, A., Ed., Marcel Dekker, Pub., (1990); Phage Display: A Laboratory Manual, C. Barbas (Ed.), CSHL Press, (2001); Antibody Phage Display, P. O'Brien (Ed.), Humana Press (2001); Boder, et al., "Yeast surface display for screening combinatorial polypeptide libraries," Nature Biotechnology (1997) 15:553-557; Boder, et al., "Yeast surface display for directed evolution of protein expression, affinity, and stability," Methods Enzymol. (2000) 328:430-444; ribosome display as described by Pluckthun, et al., in U.S. Pat. No. 6,348,315; Profusion™ as described by Szostak, et al., in U.S. Pat. Nos. 6,258,558; 6,261,804; and 6,214,553; and bacterial periplasmic expression as described in U.S. Patent Publication No. 20040058403A1.
VII. Expression and Screening Systems
[0280] Libraries of polynucleotides generated by any of the above techniques or other suitable techniques can be expressed and screened to identify FnIII cradle molecules having desired structure and/or activity. Expression of the FnIII cradle molecules can be carried out using cell-free extracts (e.g., ribosome display), phage display, prokaryotic cells, or eukaryotic cells (e.g., yeast display).
[0281] In some embodiments, the polynucleotides are engineered to serve as templates that can be expressed in a cell free extract. Vectors and extracts as described, for example in U.S.
Pat. Nos. 5,324,637; 5,492,817; 5,665,563, can be used and many are commercially available.
Ribosome display and other cell-free techniques for linking a polynucleotide (i.e., a genotype) to a polypeptide (i.e., a phenotype) can be used, e.g., Profusion™ (see, e.g., U.S. Pat. Nos. 6,348,315; 6,261,804; 6,258,558; and 6,214,553).
[0282] Alternatively, the polynucleotides of the invention can be expressed in a convenient E. coli expression system, such as that described by Pluckthun, Meth. Enzymol.
(1989) 178:476-515; and Skerra, et al. Biotechnology (1991) 9:273-278. The mutant proteins can be expressed for secretion in the medium and/or in the cytoplasm of the bacteria, as described by Better and Horwitz Meth. Enzymol. (1989) 178:476. In some embodiments, the FnIII cradle molecules are attached to the 3' end of a sequence encoding a signal sequence, such as the ompA, phoA or pelB signal sequence (Lei, et al., J. Bacteriol. (1987) 169:4379). These gene fusions are assembled in a dicistronic construct, so that they can be expressed from a single vector, and secreted into the periplasmic space of E. coli where they will refold and can be recovered in active form (Skerra, et al., Biotechnology (1991) 9:273-278).
[0283] In some embodiments, the FnIII cradle molecule sequences are expressed on the membrane surface of a prokaryote, e.g., E. coli, using a secretion signal and lipidation moiety as described, e.g., in U.S. Patent Publication Nos. 20040072740A1; 20030100023A1; and 20030036092A1.
[0284] In some embodiments, the polynucleotides can be expressed in eukaryotic cells such as yeast using, for example, yeast display as described, e.g., in U.S. Pat. Nos. 6,423,538; 6,331,391; and 6,300,065. In this approach, the FnIII cradle molecules of the library are fused to a polypeptide that is expressed and displayed on the surface of the yeast.
[0285] Higher eukaryotic cells can also be used for expression of the FnIII cradle molecules of the invention, such as mammalian cells, for example myeloma cells (e.g., NS/0 cells), hybridoma cells, or Chinese hamster ovary (CHO) cells. Typically, the FnIII cradle molecules, when expressed in mammalian cells, are designed to be secreted into the culture medium or expressed on the surface of such a cell. The FnIII cradle molecules can be produced, for example, as a single individual domain or as multimeric chains comprising dimers or trimers, which can be composed of the same domain or of different FnIII variant domain types (FnIII10-FnIII10: homodimer; FnIII10-FnIII08: heterodimer; FnIII10-FnIII10n, where n is an integer from 1-20 or wild type or variant FnIII domains; FnIII07-FnIII07n, where n is an integer from 1-20 or wild type or variant FnIII domains; FnIII14-FnIII14n, where n is an integer from 1-20 or wild type or variant FnIII domains; FnIII07-FnIII10n, where n is an integer from 1-20 or wild type or variant FnIII domains; FnIII10-FnIII14n, where n is an integer from 1-20 or wild type or variant FnIII domains; FnIII07-FnIII14n, where n is an integer from 1-20 or wild type or variant FnIII domains; FnIII07-FnIII10-FnIII14n, where n is an integer from 1-20 or wild type or variant FnIII domains; FnIII08-FnIII09-FnIII10n, where n is an integer from 1-20 or wild type or variant FnIII domains; and the like).
[0286] The screening of the expressed FnIII cradle molecules (or FnIII cradle molecules produced by direct synthesis) can be done by any appropriate means. For example, binding activity can be evaluated by standard immunoassay and/or affinity chromatography. Screening of the FnIII cradle molecules of the invention for catalytic function, e.g., proteolytic function, can be accomplished using a standard hemoglobin plaque assay as described, for example, in U.S. Pat. No. 5,798,208. The ability of candidate FnIII cradle molecules to bind therapeutic targets can be assayed in vitro using, e.g., a Biacore™ instrument, which measures binding rates of an FnIII cradle molecule to a given target or ligand, or using the methods disclosed herein. In vivo assays can be conducted using any of a number of animal models, and candidates can subsequently be tested, as appropriate, in humans.
[0287] The FnIII cradle library is transfected into the recipient bacterial/yeast hosts using standard techniques as described in the Examples. Yeast can readily accommodate library sizes up to 10^7, with 10^3-10^5 copies of each FnIII fusion protein being displayed on each cell surface. Yeast cells are easily screened and separated using flow cytometry and fluorescence-activated cell sorting (FACS) or magnetic beads. The eukaryotic secretion system and glycosylation pathways of yeast also allow FnIII type molecules to be displayed with N- and O-linked sugars on the cell surface. Details of yeast display are outlined in the Examples section.
[0288] In another embodiment, the yeast display system utilizes the a-agglutinin yeast adhesion receptor to display proteins on the cell surface. The proteins of interest, in this case, FnIII libraries, are expressed as fusion partners with the Aga2 protein.
[0289] These fusion proteins are secreted from the cell and become disulfide linked to the Aga1 protein, which is attached to the yeast cell wall (see Invitrogen, pYD1 Yeast Display product literature). The plasmid, e.g., pYD1, prepared from an E. coli host by plasmid purification (Qiagen), is digested with the restriction enzymes BamHI and NotI and terminally dephosphorylated with calf intestinal alkaline phosphatase. Ligation of the pYD1 and CR products libraries, E. coli (DH5α) transformation and selection on LB-ampicillin (50 mg/ml) plates are performed using standard molecular biology protocols to amplify the libraries before electroporation into yeast cell hosts.
[0290] Selection of expressed FnIII library variants having substantially higher affinities for target ligands (e.g., TNF, VEGF, VEGF-R, etc.), relative to the reference wild type FnIII domain, can be accomplished as follows.
[0291] Candidate test ligands (e.g., TNF, VEGF, VEGF-R, etc.) are fluorescently labeled (either directly or indirectly via a biotin-streptavidin linkage as described above). Those library clones that efficiently bind the labeled antigens are then enriched by FACS. This population of yeast cells is then re-grown and subjected to subsequent rounds of selection using increased levels of stringency to isolate a smaller subset of clones that recognize the target with higher specificity and affinity. The libraries are readily amenable to high-throughput formats, using, e.g., FITC labeled anti-Myc-tag FnIII binding domain molecules and FACS analysis for quick identification and confirmation. In addition, carboxyl terminal tags are included, which can be utilized to monitor expression levels and/or normalize binding affinity measurements.
[0292] To check for the display of the Aga2-FnIII fusion protein, an aliquot of yeast cells (8x10^5 cells in 40 µl) from the culture medium is centrifuged for 5 minutes at 2300 rpm. The supernatant is aspirated and the cell pellet is washed with 200 µl of ice-cold PBS/BSA buffer (PBS/BSA 0.5% w/v). The cells are re-pelleted and the supernatant is removed before re-suspending in 100 µl of buffer containing the biotinylated TNFα (200 nM). The cells are left to bind the TNFα at 20°C for 45 minutes, after which they are washed twice with PBS/BSA buffer before the addition of, and incubation with, streptavidin-FITC (2 mg/L) for 30 minutes on ice. Another round of washing in buffer is performed before final re-suspension in a volume of 400 µl of PBS/BSA. The cells are then analyzed on a FACScan™ (Becton Dickinson) using CellQuest software as per the manufacturer's directions.
[0293] To generate a library against TNFα, kinetic selections of the yeast-displayed TNF-α fibronectin binding domain libraries involve initial labeling of cells with biotinylated TNF-α ligand followed by a time-dependent chase in the presence of a large excess of un-biotinylated TNF-α ligand. Clones with slower dissociation kinetics can be identified by streptavidin-PE labeling after the chase period and sorted using a high-speed FACS sorter. After Aga2-FnIII induction, the cells are incubated with biotinylated TNFα at saturating concentrations (400 nM) for 3 hours at 25°C under shaking. After washing the cells, a 40 hour cold chase is performed using unlabelled TNFα (1 µM) at 25°C. The cells are then washed twice with PBS/BSA buffer, labeled with streptavidin-PE (2 mg/ml) and anti-HIS-FITC (25 nM) for 30 minutes on ice, washed and re-suspended, and then analyzed on a FACS ARIA sorter.
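By way of illustration only, the discrimination achieved by the chase step can be sketched in Python. The sketch assumes simple first-order dissociation and no rebinding of labeled ligand during the chase; the function name and the three off-rate values are hypothetical and are not taken from the experiments described herein. Only the 40 hour chase duration comes from the text.

```python
import math


def fraction_retained(k_off_per_s, chase_hours):
    """Fraction of initially bound labeled ligand still bound after the chase.

    Assumes first-order dissociation (exp(-k_off * t)) and that the excess
    unlabeled ligand prevents any rebinding of labeled ligand.
    """
    return math.exp(-k_off_per_s * chase_hours * 3600.0)


CHASE_HOURS = 40  # chase duration stated in the text

# Hypothetical dissociation rate constants (1/s) for three clones
clones = {"clone_A": 1e-4, "clone_B": 1e-5, "clone_C": 1e-6}

for name, k_off in clones.items():
    retained = fraction_retained(k_off, CHASE_HOURS)
    print(f"{name}: k_off={k_off:.0e} 1/s, retained fraction={retained:.3g}")
```

Under these assumptions, clones with slower dissociation retain more labeled ligand after the chase and therefore show a higher streptavidin-PE signal, which is the basis on which they can be sorted.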
[0294] Library screening can be conducted in order to select FnIII variants that bind to specific ligands or targets. Combinatorial screening can easily produce and screen a large number of variants, which is not feasible with specific mutagenesis ("rational design") approaches. Amino acid variants at various amino acid positions in FnIII can be generated using a degenerate nucleotide sequence. FnIII variants with desired binding capabilities can be selected in vitro, recovered and amplified. The amino acid sequence of a selected clone can be identified readily by sequencing the nucleic acid encoding the selected FnIII.
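As a non-limiting illustration of how a degenerate nucleotide sequence maps to amino acid diversity, the following Python sketch enumerates the codons encoded by a degenerate triplet. The NNK scheme is an assumption chosen for the example (the text does not name a particular scheme), and the function names are hypothetical. The 10^7 capacity is the yeast library size noted earlier in this description.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[i] for i, (a, b, c) in enumerate(product(BASES, repeat=3))
}

# Subset of IUPAC ambiguity codes sufficient for this example
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "K": "GT"}


def summarize(degenerate_codon):
    """Return (codon count, distinct amino acids, stop codons) for a triplet."""
    codons = ["".join(c) for c in product(*(IUPAC[x] for x in degenerate_codon))]
    residues = [CODON_TABLE[c] for c in codons]
    amino_acids = set(residues) - {"*"}
    return len(codons), len(amino_acids), residues.count("*")


def max_fully_sampled_positions(capacity, n_amino_acids=20):
    """Largest number of randomized positions whose protein-level diversity
    (n_amino_acids ** positions) does not exceed the library capacity."""
    positions = 0
    while n_amino_acids ** (positions + 1) <= capacity:
        positions += 1
    return positions


print(summarize("NNK"))
print(max_fully_sampled_positions(10**7))
```

The comparison of theoretical diversity against library capacity is a rough guide only; it ignores codon bias, stop codons, and the oversampling needed for good coverage.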
[0295] In some embodiments, a particular FnIII cradle molecule has an affinity for a target that is at least 2-fold greater than the affinity of the polypeptide prior to substitutions discussed herein. In some embodiments, the affinity is, is at least, or is at most about 2-, 3-, 4-, 5-, 6-, 7-, 8-, 9-, 10-, 15-, 20-, 25-, 30-, 35-, 40-, 45-, 50-, 60-, 70-, 80-, 90-, 100-fold increased compared to another FnIII cradle molecule.
[0296] Further provided herein is a cradle polypeptide selected using the method of identifying a cradle polypeptide having a desired binding affinity to a target molecule disclosed herein. In some embodiments, the cradle polypeptide may comprise an amino acid sequence selected from the group consisting of SEQ ID NOs:4-78, 80-85, 87-96, 98, 99, 101-128, 130-141, 143, 145-147, 149-159, 161-199, 201-238 and 240-277.
Analysis and Screening of FnIII Libraries for Function
[0297] FnIII libraries can also be used to screen for FnIII proteins that possess functional activity. The study of proteins has revealed that certain amino acids play a crucial role in their structure and function. For example, it appears that only a discrete number of amino acids participate in the functional event of an enzyme. Protein libraries generated by any of the above techniques or other suitable techniques can be screened to identify variants of desired structure or activity.
[0298] By comparing the properties of a wild-type protein and the variants generated, it is possible to identify individual amino acids or domains of amino acids that confer binding and/or functional activity. Usually, the region studied will be a functional domain of the protein, such as a binding domain. For example, the region can be the AB, BC, CD, DE, EF and FG loop binding regions or the beta sheets of the FnIII domain. The screening can be done by any appropriate means. For example, activity can be ascertained by suitable assays for substrate conversion, and binding activity can be evaluated by standard immunoassay and/or affinity chromatography.
[0299] From the chemical properties of the side chains, it appears that only a selected number of natural amino acids preferentially participate in a catalytic event. These amino acids belong to the group of polar and neutral amino acids such as Ser, Thr, Asn, Gln, Tyr, and Cys, the group of charged amino acids, Asp and Glu, Lys and Arg, and especially the amino acid His. Typical polar and neutral side chains are those of Cys, Ser, Thr, Asn, Gln and Tyr. Gly is also considered to be a borderline member of this group. Ser and Thr play an important role in forming hydrogen bonds. Thr has an additional asymmetry at the beta carbon; therefore only one of the stereoisomers is used. The acid amides Gln and Asn can also form hydrogen bonds, the amido groups functioning as hydrogen donors and the carbonyl groups functioning as acceptors. Gln has one more CH2 group than Asn, which renders the polar group more flexible and reduces its interaction with the main chain. Tyr has a very polar hydroxyl group (phenolic OH) that can dissociate at high pH values. Tyr behaves somewhat like a charged side chain; its hydrogen bonds are rather strong.
[0300] Histidine (His) has a heterocyclic aromatic side chain with a pK value of 6.0. In the physiological pH range, its imidazole ring can be either uncharged or charged, after taking up a hydrogen ion from the solution. Since these two states are readily available, His is quite suitable for catalyzing chemical reactions. It is found in most of the active centers of enzymes.
[0301] Asp and Glu are negatively charged at physiological pH. Because of their short side chain, the carboxyl group of Asp is rather rigid with respect to the main chain. This may be the reason why the carboxyl group in many catalytic sites is provided by Asp and not by Glu. Charged acids are generally found at the surface of a protein.
[0302] Therefore, several different regions or loops of an FnIII protein domain can be mutagenized simultaneously. This enables the evaluation of amino acid substitutions in conformationally related regions, such as the regions which, upon folding of the protein, are associated to make up a functional site or the binding site. This method provides a way to create modified or completely new binding sites. The two loop regions and two beta sheets of FnIII, which can be engineered to confer target ligand binding, can be mutagenized simultaneously, or separately within the CD and FG loops or C and F sheets, to assay for contributing binding functions at this binding site. Therefore, the introduction of additional "functionally important" amino acids into a ligand binding region of a protein may result in de novo improved binding activity toward the same target ligand.
[0303] Hence, new FnIII cradle molecules can be built on the natural "scaffold" of an existing FnIII polypeptide by mutating only relevant regions by the method of this invention. The method of this invention is suited to the design of de novo improved binding proteins as compared to the isolation of naturally occurring FnIIIs.
VII. Kits
[0304] Kits are also contemplated as being made or used in some embodiments of the present invention. For instance, a polypeptide or nucleic acid of the present invention can be included in a kit or in a library provided in a kit. A kit can be included in a sealed container. Non-limiting examples of containers include a microtiter plate, a bottle, a metal tube, a laminate tube, a plastic tube, a dispenser, a pressurized container, a barrier container, a package, a compartment, or other types of containers such as injection or blow-molded plastic containers into which the dispersions or compositions or desired bottles, dispensers, or packages are retained. Other examples of containers include glass or plastic vials or bottles. The kit and/or container can include indicia on its surface. The indicia, for example, can be a word, a phrase, an abbreviation, a picture, or a symbol.
[0305] The containers can dispense or contain a pre-determined amount of a composition of the present invention. The composition can be dispensed as a liquid, a fluid, or a semi-solid. A kit can also include instructions for using the kit and/or compositions. Instructions can include an explanation of how to use and maintain the compositions.
VIII. Examples
[0306] The following examples are offered to illustrate but not to limit the invention.
Example 1 Phage Display Library and Selection
[0307] An FnIII10 gene template was constructed (Koide, A., et al., J Mol Biol (1998) 284:1141-1151). A library can be created using a "shaved" template containing a polyserine sequence at locations to be diversified (Koide, et al., supra, 2007 and Wojcik, et al., supra, 2010). A synthetic DNA fragment that encodes the signal sequence of DsbA (Steiner, et al., Nat. Biotechnol. (2006) 24:823-831) was fused to the gene for the template, and the fusion gene was cloned into the phage display vector pAS38 (Koide, A., et al., supra, 1998). A phage-display combinatorial library was constructed by introducing codons for amino acid variation into the FnIII10 gene. Library construction procedures have previously been described (Koide, A., and Koide, S., Methods Mol. Biol. (2007) 352:95-109).
[0308] Phagemid particles can be prepared by growing XL1-Blue cells transfected with the phagemid library in the presence of 0.2 mM IPTG and helper phage (Lo Conte, et al., J. Mol. Biol. (1999) 285:2177-2198; Fellouse, et al., J. Mol. Biol. (2005) 348:1153-1162). Phagemid library selection can be performed as follows. In the first round, 0.5 µM of a target protein modified with EZ-Link Sulfo-NHS-SS-Biotin (Sulfosuccinimidyl 2-(biotinamido)-ethyl-1,3-dithiopropionate; Pierce) is mixed with a sufficient amount of streptavidin-conjugated magnetic beads (Streptavidin MagneSphere Paramagnetic Particles; Promega, Z5481/2) in TBS (50 mM Tris-HCl buffer pH 7.5, 150 mM NaCl) containing 0.5% Tween20 (TBST). To this target solution, 10^12-10^13 phagemids suspended in 1 ml TBST plus 0.5% BSA are added, and the solution is mixed and incubated for 15 min at room temperature. After washing the beads twice with TBST, the bead suspension containing bound phagemids is added to fresh E. coli culture. Phagemids are amplified as described before (Fellouse, et al., supra, 2005). In a second round, phagemids are incubated with 0.1 µM target in TBST plus 0.5% BSA, and then captured by streptavidin-conjugated magnetic beads. Phagemids bound to the target protein are eluted from the beads by cleaving the linker within the biotinylation reagent with 100 mM DTT in TBST. The phagemids are washed and recovered as described above. After amplification, the third round of selection is performed using 0.02 µM target. Phage display is an established technique for generating binding members and has been described in detail in many publications such as Kontermann & Dübel (ed.), In: Antibody Engineering: Miniantibodies, 637-647, Springer-Verlag, (2001) and WO92/01047, each of which is incorporated herein by reference in its entirety.
Example 2 Yeast Surface Display
[0309] Yeast surface experiments are performed according to Boder, E. T., and Wittrup, K. D., Methods Enzymol. (2000) 328:430-444 with minor modifications. The Express-tag in the yeast display vector, pYD1 (Invitrogen), was removed because it cross-reacts with anti-FLAG antibodies (Sigma). The genes for cradle molecules in the phagemid library after three rounds of selection are amplified using PCR and mixed with the modified pYD1 cut with EcoRI and XhoI, and yeast EBY100 cells are transformed with this mixture. The transformed yeast cells are grown in the SD-CAA media at 30°C for two days, and then monobody expression is induced by growing the cells in the SG-CAA media at 30°C for 24 h.
[0310] Sorting of monobody-displaying yeast cells is performed as follows. The yeast cells are incubated with a biotinylated target (50 nM) and mouse anti-V5 antibody (Sigma), then after washing incubated with anti-mouse antibody-FITC conjugate (Sigma) and NeutrAvidin-PE conjugate (Invitrogen). The stained cells are sorted based on the FITC and PE intensities. Typically, cells exhibiting the top ~1% PE intensity and top 10% FITC intensity are recovered.
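For illustration only, the recovery criterion described above can be sketched in Python as a pair of percentile cutoffs applied to per-cell intensities. The function names, the event layout, and the synthetic data are hypothetical; in practice, sort gates are typically drawn on the two-dimensional FITC versus PE plot in the sorter software, so this is a simplified stand-in rather than the procedure actually used.

```python
def cutoff_for_top_fraction(values, top_fraction):
    """Intensity at or above which roughly `top_fraction` of events fall."""
    ranked = sorted(values, reverse=True)
    n_keep = max(1, round(len(ranked) * top_fraction))
    return ranked[n_keep - 1]


def sort_gate(events, pe_top=0.01, fitc_top=0.10):
    """Keep events in the top PE fraction (target binding) that are also
    in the top FITC fraction (display level via the V5 tag)."""
    pe_cut = cutoff_for_top_fraction([e["pe"] for e in events], pe_top)
    fitc_cut = cutoff_for_top_fraction([e["fitc"] for e in events], fitc_top)
    return [e for e in events if e["pe"] >= pe_cut and e["fitc"] >= fitc_cut]


# Hypothetical events: 1000 cells whose PE and FITC signals rise together
events = [{"cell": i, "pe": float(i), "fitc": float(i)} for i in range(1000)]
selected = sort_gate(events)
print(len(selected), "of", len(events), "events pass the gate")
```

Requiring both cutoffs means a cell must display the fusion protein well and bind the target strongly before it is recovered.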
[0311] After FACS sorting, individual clones are analyzed. Approximate Kd values are determined from a titration curve by FACS analysis (Boder and Wittrup, supra, 2000). Amino acid sequences are deduced from DNA sequencing.
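As a hedged sketch of how an approximate Kd might be extracted from such a titration curve, the following Python code fits a single-site binding isotherm, signal = background + Fmax x [L] / (Kd + [L]), by scanning a grid of candidate Kd values and solving for Fmax and background by linear least squares at each one. The single-site model, the assumption that ligand is not depleted and binding is at equilibrium, the function name, and the synthetic data are all assumptions made for the example and are not taken from the cited protocol.

```python
def fit_kd(conc_nM, signal, kd_grid=None):
    """Grid-search fit of a single-site isotherm; returns best-fit parameters."""
    if kd_grid is None:
        # Log-spaced candidates from 0.01 nM to 10 uM, 50 points per decade
        kd_grid = [10 ** (-2 + i / 50) for i in range(6 * 50 + 1)]
    n = len(conc_nM)
    mean_y = sum(signal) / n
    best = None
    for kd in kd_grid:
        # Fractional occupancy predicted for this candidate Kd
        x = [c / (kd + c) for c in conc_nM]
        mean_x = sum(x) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        if sxx == 0:
            continue
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, signal))
        fmax = sxy / sxx
        background = mean_y - fmax * mean_x
        sse = sum((background + fmax * xi - yi) ** 2 for xi, yi in zip(x, signal))
        if best is None or sse < best[0]:
            best = (sse, kd, fmax, background)
    return {"kd_nM": best[1], "fmax": best[2], "background": best[3]}


# Synthetic, noise-free titration generated with Kd = 10 nM (hypothetical)
concentrations = [0.1, 0.3, 1, 3, 10, 30, 100, 300, 1000]
signals = [50 + 1000 * c / (10 + c) for c in concentrations]
fit = fit_kd(concentrations, signals)
print(fit)
```

With real FACS data, the signal would be the mean fluorescence of the displaying population at each target concentration, and the resolution of the estimate is limited by the spacing of the Kd grid.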
[0312] Effects of E. coli lysate on monobody-target interaction are tested by comparing binding in the presence and absence of E. coli lysate prepared from cell suspension with OD600 of 50.
Example 3 Protein Expression and Purification
[0313] Nucleic acids encoding any targets are cloned in an appropriate expression vector. In one example, genes for monobodies are cloned in the expression vector pHFT2, which is a derivative of pHFT1 (Huang, et al., supra, 2006) in which the His-6 tag has been replaced with a His-10 tag. Protein expression and purification can be performed as described previously (Huang, et al., supra, 2006).
[0314] An expression vector comprising cDNA encoding an FnIII polypeptide or a target molecule is introduced into Escherichia coli, yeast, an insect cell, an animal cell or the like for expression to obtain the polypeptide. Polypeptides used in the present invention can be produced, for example, by expressing a DNA encoding the polypeptide in a host cell using a method described in Molecular Cloning, A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press (1989), Current Protocols in Molecular Biology, John Wiley & Sons (1987-1997) or the like. A recombinant vector is produced by inserting a cDNA downstream of a promoter in an appropriate expression vector. The vector is then introduced into a host cell suitable for the expression vector. The host cell can be any cell so long as it can express the gene of interest, and includes bacteria (e.g., E. coli), an animal cell and the like. The expression vector can be one that replicates autonomously in the host cell to be used, or one that can be integrated into a chromosome, and comprises an appropriate promoter at such a position that the DNA encoding the polypeptide can be transcribed.
Example 4 Ribosome Display
[0315] Ribosome display utilizes cell-free in vitro coupled transcription/translation machinery to produce protein libraries. The FnIII library genes are inserted upstream of a kappa light chain immunoglobulin gene that lacks a stop codon, causing the ribosome to stall, but not release, when it reaches the end of the mRNA. Additionally, the kappa domain spacer serves to physically distance the FnIII protein from the ribosome complex so that the FnIII binding domain has better accessibility to recognize its cognate ligand. The mRNA library is introduced into either S30 E. coli ribosome extract preparations (Roche) or rabbit reticulocyte lysate (Promega). In either case, the 5' end of the nascent mRNA can bind to ribosomes and undergo translation. During translation, the ligand-binding protein remains non-covalently attached to the ribosome along with its mRNA progenitor in a macromolecular complex.
[0316] The functional FnIII proteins can then bind to a specific ligand that is either attached to magnetic beads or microtiter well surface. During the enrichment process, non-specific variants are washed away before the specific FnIII binders are eluted. The bound mRNA is detected by RT-PCR using primers specific to the 5' FnIII and 3' portion of the kappa gene respectively. The amplified double stranded cDNA is then cloned into an expression vector for sequence analysis and protein production.
[0317] For prokaryotic translation reactions, the reaction mix can contain 0.2 M potassium glutamate, 6.9 mM magnesium acetate, 90 mg/ml protein disulfide isomerase (Fluka), 50 mM Tris acetate (pH 7.5), 0.35 mM each amino acid, 2 mM ATP, 0.5 mM GTP, 1 mM cAMP, 30 mM acetyl phosphate, 0.5 mg/ml E. coli tRNA, 20 mg/ml folinic acid, 1.5% PEG 8000, 40 ml S30 E. coli extract and 10 mg mRNA in a total volume of 110 ml. Translation can be performed at 37°C for 7 min, after which ribosome complexes can be stabilized by 5-fold dilution in ice-cold selection buffer (50 mM Tris acetate (pH 7.5), 150 mM NaCl, 50 mM magnesium acetate, 0.1% Tween 20, 2.5 mg/ml heparin).
Affinity selection for target ligands
[0318] Stabilized ribosome complexes can be incubated with biotinylated hapten (50 nM fluorescein-biotin (Sigma)) or antigen (100 nM biotinylated IL-13 (Peprotech)) as appropriate at 4°C for 1-2 h, followed by capture on streptavidin-coated M280 magnetic beads (Dynal). Beads were then washed to remove non-specifically bound ribosome complexes. For prokaryotic selections, five washes in ice-cold selection buffer can be performed. For eukaryotic selections, three washes in PBS containing 0.1% BSA and 5 mM magnesium acetate were performed, followed by a single wash in PBS alone. Eukaryotic complexes can then be incubated with U DNAse I in 40 mM Tris-HCl, 6 mM MgCl2, 10 mM NaCl, 10 mM CaCl2 for 25 min at 37°C, followed by three further washes with PBS, 5 mM magnesium acetate, 1% Tween 20.
Recovery of mRNA from Selected Ribosome Complexes
[0319] For analysis of mRNA recovery without a specific disruption step, ribosome complexes bound to magnetic beads can be processed directly into the reverse transcription reaction. For recovery of mRNA from prokaryotic selections by ribosome complex disruption, selected complexes can be incubated in EB20 [50 mM Tris acetate (pH 7.5), 150 mM NaCl, 20 mM EDTA, 10 mg/ml Saccharomyces cerevisiae RNA] for 10 min at 4°C. To evaluate the efficiency of the 20 mM EDTA for recovery of mRNA from eukaryotic selections, ribosome complexes can be incubated in PBS20 (PBS, 20 mM EDTA, 10 mg/ml S. cerevisiae RNA) for min at 4°C. mRNA can be purified using a commercial kit (High Pure RNA Isolation Kit, Roche). For prokaryotic samples, the DNAse I digestion option of the kit was performed; however, this step is not required for eukaryotic samples, as DNAse I digestion was performed during post-selection washes. Reverse transcription can be performed on either 4 ml of purified RNA or 4 ml of immobilized, selected ribosome complexes (i.e., a bead suspension).
[0320] For prokaryotic samples, reactions contained 50 mM Tris-HCl (pH 8.3), 75 mM KCl, 3 mM MgCl2, 10 mM DTT, 1.25 primer, 0.5 mM PCR nucleotide mix (Amersham Pharmacia), 1 U RNasin (Promega) and 5 U SuperScript II (Invitrogen) and were performed by incubation at 50°C for 30 min. For eukaryotic samples, reactions contained 50 mM Tris-HCl (pH 8.3), 50 mM KCl, 10 mM MgCl2, 0.5 mM spermine, 10 mM DTT, 1.25 mM RT primers, 0.5 mM PCR nucleotide mix, 1 U RNasin and 5 U AMV reverse transcriptase (Promega) and can be performed by incubation at 48°C for 45 min.
PCR of Selection Outputs
[0321] End-point PCR can be performed to visualize amplification of the full-length construct. A 5 ml sample of each reverse transcription reaction can be amplified with 2.5 U Taq polymerase (Roche) in 20 mM Tris-HCl (pH 8.4), 50 mM KCl, 1 mM MgCl2, 5% DMSO, containing 0.25 mM PCR nucleotide mix, 0.25 mM forward primer (T7B or T7KOZ for prokaryotic or eukaryotic experiments, respectively) and 0.25 mM RT primer. Thermal cycling comprised 94°C for 3 min, then 30 cycles of 94°C for 30 s, 50°C for 30 s and 72°C for 1.5 min, with a final step at 72°C for 5 min. PCR products were visualized by electrophoresis on ethidium bromide-stained agarose gels. The isolated PCR products can then be sub-cloned into a bacterial pBAD expression vector for soluble protein production.
Bacterial Expression and Production
[0322] Competent E. coli host cells are prepared as per the manufacturer's instructions (Invitrogen pBAD expression system). Briefly, 40 ul LMG194 competent cells and 0.5 ul pBAD FnIII constructs (approximately 1 ug DNA) can be incubated together on ice for 15 minutes, after which a one-minute 42°C heat shock is applied. The cells are then allowed to recover for 10 minutes at 37°C in SOC media before plating onto LB-Amp plates and growth overnight at 37°C. Single colonies are picked the next day for small-scale liquid cultures to initially determine optimal L-arabinose induction concentrations for FnIII production. Replicates of each clone, after reaching an OD600=0.5, can be test induced with serial (1:10) titrations of L-arabinose (0.2% to 0.00002% final concentration) after overnight growth at room temperature. Test cultures (1 ml) can be collected and pelleted, and 100 ul 1x BBS buffer (10 mM, 160 mM NaCl, 200 mM boric acid, pH=8.0) added to resuspend the cells before the addition of 50 ul of lysozyme solution for 1 hour (37°C). Cell supernatants from the lysozyme digestions can be collected after centrifugation, and MgSO4 can be added to a final concentration of 40 mM. This solution can be applied to PBS pre-equilibrated Ni-NTA columns. His-tagged bound FnIII samples are washed twice with PBS buffer, after which elution can be accomplished with the addition of 250 mM imidazole. Purity of the soluble FnIII expression can be examined by SDS-PAGE.
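The serial (1:10) L-arabinose titration described above can be enumerated with a short Python sketch. The function name `serial_dilution` and the rounding are illustrative assumptions and are not part of the described protocol.

```python
def serial_dilution(start_percent, fold, end_percent):
    """Return concentrations from start down to end, dividing by `fold` each step."""
    series = []
    conc = start_percent
    # Small tolerance so floating-point error does not drop the last point.
    while conc >= end_percent * 0.999:
        series.append(round(conc, 8))
        conc /= fold
    return series


# Final L-arabinose concentrations (percent) for the test inductions.
arabinose = serial_dilution(0.2, 10, 0.00002)
for percent in arabinose:
    print(f"{percent:.5f}% L-arabinose")
```

Under these assumptions the series contains five induction conditions, one per tenfold step between 0.2% and 0.00002%.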
Example 5 Design of a Cradle Library Based on the FnIII Domain, Exemplified with the FnIII07, FnIII10 and FnIII14 Domains
[0323] In this example, universal CD and FG loop sequences, along with the outward-facing residues of the 3 beta strands between the loops, are identified and selected for fibronectin binding domain library sequences using bioinformatics and the criteria of the invention. A generalized schematic of this process is presented in Figure 1.
Sequences
[0324] 7th FnIII domain (FnIII07), FINC_HUMAN(1173-1265):
PLSPPTNLHLEANPDTGVLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEVVHADQSSCTFDN
LSPGLEYNVSVYTVKDDKESVPISDTIIP (SEQ ID NO:97)
[0325] 10th FnIII domain (FnIII10), FINC_HUMAN(1447-1542):
VSDVPRDLEVVAATPTSLLISWDAPAVTVRYYRITYGETGGNSPVQEFTVPGSKSTATISGLKP
GVDYTITVYAVTGRGDSPASSKPISINYRTEI (SEQ ID NO:280)
[0326] 14th FnIII domain (FnIII14), FINC_HUMAN(1813-1901):
NVSPPRRARVTDATETTITISWRTKTETITGFQVDAVPANGQTPIQRTIKPDVRSYTITGLQPG
TDYKIYLYTLNDNARSSPVVIDAST (SEQ ID NO:129)
Alignment
[0327] Below is the sequence alignment of FnIII repeats 7, 10, and 14 (SEQ ID NOs: 97, 280, 129 respectively). The structurally conserved hydrophobic core residues are shown in bold.
                                             BC
FnIII07 PLSPPTNLHL-EA | NPDTG | VLTVSWE | RSTTPDIT | GYRITTTPT
FnIII10 VSDVPRDLEVVAA | T--PT | SLLISWD | APAV-TVR | YYRITYGET

            CD               DE               EF
FnIII07 | NGQQGNS | LEEVVH | ADQ | SSCTFD | NLSPGL | EYNVSVYTVK
FnIII10 | -GGNSPV | QEFTVP | GSK | STATIS | GLKPGV | DYTITVYAVT
FnIII14 | -NGQT-P | IQRTIK | PDV | RSYTIT | GLQPGT | DYKIYLYTLN

              FG
FnIII07 | D--DKE--SVP | ISDTIIP--
FnIII10 | GRGDSPASSKP | ISINYRTEI
FnIII14 | D-NA---RSSP | VVIDAST--
Definition of the Cradle
[0328] The portion of the fibronectin deemed the cradle comprises loops CD and FG, along with the outward-facing amino acids in the 3 beta strands between the loops (Figures 2 and 3). Cradle residues are highlighted in bold in the alignment shown below (SEQ ID NOs:97, 280, 129 respectively).
FnIII07 PLSPPTNLHL-EANPDTGVLTVSWERSTTPDITGYRITTTPT
FnIII10 VSDVPRDLEVVAAT--PTSLLISWDAPAV-TVRYYRITYGET

            CD
FnIII07 | NGQQGNS | LEEVVHADQSSCTFDNLSPGLEYNVSVYTVK
FnIII10 | -GGNSPV | QEFTVPGSKSTATISGLKPGVDYTITVYAVT
FnIII14 | -NGQT-P | IQRTIKPDVRSYTITGLQPGTDYKIYLYTLN

              FG
FnIII07 | D--DKE--SVP | ISDTIIP--
FnIII10 | GRGDSPASSKP | ISINYRTEI
FnIII14 | D-NA---RSSP | VVIDAST--
Distribution of the Cradle Loops
[0329] The fibronectin family alignment, PF00041.full, was downloaded from PFAM in Stockholm 1.0 format (located on the World Wide Web at pfam.sanger.ac.uk/family/PF00041#tabview=2). The FG loop was truncated in PF00041.full, and the data from US20090176654(A1) was used instead. There, the FG loop was defined to include the sequence TGRGDSPASSKPI, whereas the terminal T and I are not defined as part of the FG loop in the cradle. As a result, the distribution data for the FG loop from U.S. Patent Publication No. 20090176654(A1) is amended by subtracting 2 from each loop length in the distribution.
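The amendment described above, subtracting 2 from each FG loop length, amounts to shifting the keys of a length distribution. The sketch below illustrates this under stated assumptions: the function name `shift_loop_lengths` is hypothetical, and the counts are invented purely for illustration and are not data from the cited publication.

```python
def shift_loop_lengths(distribution, offset=-2):
    """Shift every loop length in a {length: count} mapping by `offset`."""
    shifted = {}
    for length, count in distribution.items():
        new_length = length + offset
        if new_length < 0:
            raise ValueError(f"loop length {length} cannot be shifted by {offset}")
        shifted[new_length] = shifted.get(new_length, 0) + count
    return shifted


# Hypothetical counts, for illustration only.
example_fg = {13: 120, 11: 45, 10: 30}
cradle_fg = shift_loop_lengths(example_fg)
print(cradle_fg)

# The 13-residue definition TGRGDSPASSKPI becomes the 11-residue GRGDSPASSKP.
print(len("TGRGDSPASSKPI"), len("TGRGDSPASSKPI"[1:-1]))
```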
[0330] The BC loop was calculated for PF00041.full based on the definition in U.S. Patent Publication No. 20090176654(A1) of the BC loop as the sequence starting 1 amino acid before the conserved W and running up to, but not including, the conserved Y. The BC loop corresponded to columns 125 to 248 in PF00041.full. The BC loops were extracted and the gaps were removed. The length of each loop was determined, and the range of loop lengths was found to be 1 to 26. The distribution of loop length was determined.
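The loop extraction described above (select alignment columns, remove gaps, tabulate lengths) can be sketched in Python as follows. This is a minimal illustration and not the code used for the analysis: the function names are hypothetical, columns are assumed to be 1-based and inclusive, and both "-" and "." are assumed to denote gaps.

```python
from collections import Counter

GAP_CHARS = "-."


def read_stockholm(path):
    """Read a Stockholm 1.0 file into {name: aligned_sequence}.

    Interleaved blocks, if present, are concatenated per sequence name.
    """
    rows = {}
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line or line.startswith("#") or line.startswith("//"):
                continue  # skip blanks, markup lines and the terminator
            parts = line.split()
            if len(parts) != 2:
                continue
            name, aligned = parts
            rows[name] = rows.get(name, "") + aligned
    return rows


def loop_length_distribution(rows, first_col, last_col):
    """Count ungapped lengths in columns first_col..last_col (1-based, inclusive)."""
    lengths = Counter()
    for aligned in rows.values():
        segment = aligned[first_col - 1:last_col]
        loop = "".join(ch for ch in segment if ch not in GAP_CHARS)
        lengths[len(loop)] += 1
    return lengths


# Toy alignment rows (hypothetical), standing in for the PFAM alignment.
toy_rows = {"seq1": "AB--CDE", "seq2": "A-----E", "seq3": "ABCDCDE"}
toy_dist = loop_length_distribution(toy_rows, 2, 6)
for length in sorted(toy_dist):
    print(length, toy_dist[length])
```

With a real alignment, a call such as `loop_length_distribution(read_stockholm("PF00041.full"), 125, 248)` would correspond to the BC loop columns given above, assuming the file is in the working directory.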
[0331] The output was captured as BC_Loop.txt and formatted in Excel to generate the table and graph, which were saved as BC_Loop.xlsx and are shown in Figure 4A. Following is the sequence of FnIII10 (SEQ ID NO:280) with the cradle loops/sheets in bold.
VSDVPRDLEVVAATPTSLLISWDAPAVTVRYYRITYGET
    CD
| GGNSPV | QEFTVPGSKSTATISGLKPGVDYTITVYAVT
      FG
| GRGDSPASSKP | ISINYRTEI
[0332] The sheet before the CD loop was designated β1 (also referred to as sheet C) and included the sequence YYRITYGET (residues 31-39 of SEQ ID NO:280). The CD loop was the sequence GGNSPV (residues 40-45 of SEQ ID NO:280). The sheet directly following the CD loop was β2 (also referred to as sheet D) and included the sequence QEFTV (residues 46-50 of SEQ ID NO:280). The sheet before the FG loop was β3 (also referred to as sheet F) and included the sequence DYTITVYAVT (residues 67-76 of SEQ ID NO:280). The FG loop was the sequence GRGDSPASSKP (residues 77-87 of SEQ ID NO:280). β1 corresponded to columns 236 to 271. The CD loop was columns 271 to 317. β2 was columns 318 to 323. β3 was columns 400 to 415.
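The residue ranges quoted above can be checked directly against SEQ ID NO:280 with a few lines of Python. This is an illustrative sketch only; the helper name `residues` and the segment labels are assumptions, and residue numbering is taken as 1-based and inclusive.

```python
# FnIII10 sequence (SEQ ID NO:280) as given in this example.
FNIII10 = ("VSDVPRDLEVVAATPTSLLISWDAPAVTVRYYRITYGETGGNSPVQEFTVPGSKSTATISGLKP"
           "GVDYTITVYAVTGRGDSPASSKPISINYRTEI")

# (label, first residue, last residue, expected sequence), 1-based inclusive.
CRADLE_SEGMENTS = [
    ("beta1 / sheet C", 31, 39, "YYRITYGET"),
    ("CD loop", 40, 45, "GGNSPV"),
    ("beta2 / sheet D", 46, 50, "QEFTV"),
    ("beta3 / sheet F", 67, 76, "DYTITVYAVT"),
    ("FG loop", 77, 87, "GRGDSPASSKP"),
]


def residues(sequence, first, last):
    """Return residues first..last of a sequence using 1-based numbering."""
    return sequence[first - 1:last]


for label, first, last, expected in CRADLE_SEGMENTS:
    found = residues(FNIII10, first, last)
    print(f"{label}: residues {first}-{last} = {found} (matches: {found == expected})")
```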
[0333] The distribution for each sheet or loop was calculated with the same Python code as for the BC loop, using the appropriate columns. The length distribution showed that the sheets, β1 - β3, have a high degree of length conservation, which correlates well with the structural duties of the sheets within the fibronectin molecule (Figures 4A and 4C). The CD and FG loops of the cradle accept a wide range of loop lengths (Figures 4B and 4D).
Sequence Conservation of the Beta Sheets in the Cradle
[0334] Amino acid sequences of β1 (length 9), β2 (length 4), and β3 (length 10) were analyzed for sequence conservation.
[0335] The same code was used to calculate the distributions of β2 and β3. In β1, positions 1, 3, 5, 7, and 9 are of interest for design of the cradle library and showed moderately low sequence conservation, as shown in Figure 5A. Position 2 is known to be a highly conserved tyrosine, which has an important packing role with a tryptophan from the opposite sheet of the beta sandwich (Figure 5A). Positions 4 and 6 also have beta sandwich packing roles in the structure and show high sequence conservation (Figure 5A). In β2, positions 2 and 4 are of interest in the cradle library and showed low conservation (Figure 5C). Overall, β2 is not as highly conserved as β1 or β3 (Figure 5B). As in β1, the odd positions of β3 are intended to be used in the library and show low to moderate conservation, whereas the even positions, which have structural support functions, show high sequence conservation.
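The per-position conservation analysis described above can be sketched as follows. This is a minimal illustration, not the code used for Figure 5: the function name `position_conservation` is hypothetical, and the three input segments are the sheet C (β1) residues of FnIII07, FnIII10 and FnIII14 read from the alignments in this example, whereas the analysis in the text used the full family alignment.

```python
from collections import Counter


def position_conservation(segments):
    """For equal-length ungapped segments, return (most common residue, fraction) per position."""
    if not segments:
        return []
    width = len(segments[0])
    if any(len(seg) != width for seg in segments):
        raise ValueError("all segments must have the same length")
    result = []
    for i in range(width):
        counts = Counter(seg[i] for seg in segments)
        residue, count = counts.most_common(1)[0]
        result.append((residue, count / len(segments)))
    return result


# Sheet C (beta1) segments of FnIII07, FnIII10 and FnIII14.
beta1 = ["GYRITTTPT", "YYRITYGET", "GFQVDAVPA"]
conservation = position_conservation(beta1)
for position, (residue, fraction) in enumerate(conservation, start=1):
    print(f"position {position}: {residue} ({fraction:.0%})")
```

Even in this three-sequence toy input, position 2 is tyrosine in two of the three domains, in line with the conserved tyrosine noted above.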
[0336] The conservation was mapped onto FnIII07, FnIII10, and FnIII14, where the cartoon and balls/sticks on β1 - β3 are shown in Figure 6, colored according to conservation (white = high conservation, gray = moderate conservation, and black = low conservation).
Area of Binding Surface
[0337] A distinct advantage that the cradle library provides over a top or bottom side library is an increase in the surface area of the binding surface on the fibronectin, as shown in Figure 6. The top side binding fibronectin library consists of the BC, DE, and FG loops. The bottom side library consists of the AB, CD, and EF loops. The cradle consists of the CD and FG loops along with three beta strands.
[0338] The following alignment (SEQ ID NOs:97, 280, 129 respectively) shows top-side residues in bold.
FnIII07 PLSPPTNLHL-EANPDTGVLTVSWERSTTPDITGYRITTTPT
FnIII10 VSDVPRDLEVVAAT--PTSLLISWDAPAV-TVRYYRITYGET

FnIII07 NGQQGNSLEEVVHADQSSCTFDNLSPGLEYNVSVYTVK
FnIII10 -GGNSPVQEFTVPGSKSTATISGLKPGVDYTITVYAVT
FnIII14 -NGQT-PIQRTIKPDVRSYTITGLQPGTDYKIYLYTLN

FnIII07 D--DKE--SVPISDTIIP--
FnIII10 GRGDSPASSKPISINYRTEI
FnIII14 D-NA---RSSPVVIDAST--
[0339] The following alignment (SEQ ID NOs: 97, 280, 129 respectively) shows bottom-side residues in bold.
FnIII07 PLSPPTNLHL-EANPDTGVLTVSWERSTTPDITGYRITTTPT
FnIII10 VSDVPRDLEVVAAT--PTSLLISWDAPAV-TVRYYRITYGET

FnIII07 NGQQGNSLEEVVHADQSSCTFDNLSPGLEYNVSVYTVK
FnIII10 -GGNSPVQEFTVPGSKSTATISGLKPGVDYTITVYAVT
FnIII14 -NGQT-PIQRTIKPDVRSYTITGLQPGTDYKIYLYTLN

FnIII07 D--DKE--SVPISDTIIP--
FnIII10 GRGDSPASSKPISINYRTEI
FnIII14 D-NA---RSSPVVIDAST--
[0340] The following alignment (SEQ ID NOs: 97, 280, 129 respectively) shows cradle residues in bold.
FnIII07 PLSPPTNLHL-EANPDTGVLTVSWERSTTPDITGYRITTTPT
FnIII10 VSDVPRDLEVVAAT--PTSLLISWDAPAV-TVRYYRITYGET

FnIII07 NGQQGNSLEEVVHADQSSCTFDNLSPGLEYNVSVYTVK
FnIII10 -GGNSPVQEFTVPGSKSTATISGLKPGVDYTITVYAVT
FnIII14 -NGQT-PIQRTIKPDVRSYTITGLQPGTDYKIYLYTLN

FnIII07 D--DKE--SVPISDTIIP--
FnIII10 GRGDSPASSKPISINYRTEI
FnIII14 D-NA---RSSPVVIDAST--
Analysis Summary
[0341] The following Table 1 (and Fig. 2B) shows that the cradle offers approximately two times the binding surface of the top or bottom side loops and has less conservation of the residues intended for library design than the top or bottom side libraries.
Table 1 Surface Area of the loops
Protein: Total Area, Top Side Loops, Bottom Side Loops, Cradle
FnIII07: 9178 Å², 2116 Å², 1600 Å²
FnIII10: 8804 Å², 2001 Å², 1453 Å²
FnIII14: 8716 Å², 1962 Å², 1194 Å²
[0342] Additionally, of the loops in the fibronectin molecule, only the CD and FG loops have a large variation in allowed loop length. The length variation may indicate that these loops will tolerate more variation than the top or bottom side loops, such as the EF loop, which has >90% conservation of loop length 6, as defined with the FnIII10 sequence GLKPGV (residues 61-66 of SEQ ID NO:280), along with >95% sequence conservation of the leucine in position 2. Although the cradle contains 3 beta strands of the largest beta sheet, it offers more amino acid residues to modify than the top or bottom loops. The alignment below shows sequences for FnIII07, FnIII10, and FnIII14 (SEQ ID NOs:97, 280, 129 respectively) where the top loops are in italics, the bottom side loops are underlined, and cradle residues are in bold. Only residues that are amenable to library design are marked.
FnIII07 PLSPPTNLHL-EANPDTGVLTVSWERSTTPDITGYRITTTPT
FnIII10 VSDVPRDLEVVAAT--PTSLLISWDAPAV-TVRYYRITYGET

FnIII07 NGQQGNSLEEVVHADQSSCTFDNLSPGLEYNVSVYTVK
FnIII10 -GGNSPVQEFTVPGSKSTATISGLKPGVDYTITVYAVT
FnIII14 -NGQT-PIQRTIKPDVRSYTITGLQPGTDYKIYLYTLN

FnIII07 D--DKE--SVPISDTIIP--
FnIII10 GRGDSPASSKPISINYRTEI
FnIII14 D-NA---RSSPVVIDAST--
[0343] The top side, bottom side, and cradle contain 19 to 23, 12 to 14, and 19 to 24 residues, respectively, that can be used when loop length variation is not applied. The top and bottom sides each contain only one loop whose length can be aggressively changed, whereas the cradle contains both of them.
Example 6 Creating a Diverse Mammalian FnIII Domain Alignment
[0344] A profile was created using the FnIII domains found in the human fibronectin protein, UniProt entry FINC_HUMAN (P02751).
[0345] The list below shows the profile members.
FnIII01: FINC_HUMAN(607-699)
FnIII02: FINC_HUMAN(720-808)
FnIII03: FINC_HUMAN(811-898)
FnIII04: FINC_HUMAN(908-995)
FnIII05: FINC_HUMAN(996-1083)
FnIII06: FINC_HUMAN(1087-1172)
FnIII07: FINC_HUMAN(1173-1265)
FnIII08: FINC_HUMAN(1266-1356)
FnIII09: FINC_HUMAN(1357-1445)
FnIII10: FINC_HUMAN(1447-1540)
FnIII11: FINC_HUMAN(1541-1630)
FnIII12: FINC_HUMAN(1631-1720)
FnIII13: FINC_HUMAN(1723-1810)
FnIII14: FINC_HUMAN(1813-1901)
FnIII15: FINC_HUMAN(1902-1991)
[0346] The FASTA sequence for each profile member was derived from the UniProt entry. The sequences were aligned with Clustal X 2.0.11. The crystal structure of FnIII10, RCSB entry 1fna, was used to highlight secondary structure and define regions on the alignment for later analysis. The sheets on the fibronectin were designated A to G, named from N-terminal to C-terminal in the protein.
[0347] The loops were labeled according to the sheets they lie between. For example, loop CD lies between sheet C and sheet D.
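The naming convention can be written out in a few lines of Python. This sketch only restates the convention described above; the variable names are illustrative, and the grouping into top side and bottom side loops follows the definitions given in Example 5.

```python
SHEETS = "ABCDEFG"

# A loop is named after the two consecutive sheets it connects.
loops = [SHEETS[i] + SHEETS[i + 1] for i in range(len(SHEETS) - 1)]
print(loops)  # ['AB', 'BC', 'CD', 'DE', 'EF', 'FG']

# Loops alternate between the two ends of the beta sandwich.
bottom_side = loops[0::2]  # AB, CD, EF
top_side = loops[1::2]     # BC, DE, FG
cradle_loops = ["CD", "FG"]
print(bottom_side, top_side, cradle_loops)
```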
Alignment Profile
[0348] The following alignment (FnIII_template.aln) (SEQ ID NOs:100, 97, 129, respectively) was loaded into Clustal X as Profile 1.
CLUSTAL 2.0.11 multiple sequence alignment
1fna SS *********A****AB **B*****B C********C******C D****D****DE**
FnIII10 VSDVPRDLEVVAATPT--SLLISWDAP-AVTVRYYRITYGETGGN-SPVQEFTVPGSKST
FnIII07 PLSPPTNLHLEANPDTG-VLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEVVHADQSS
FnIII14 NVSPPRRARVTDATET--TITISWRTKTET-ITGFQVDAVPANGQ--TPIQRTIKPDVRS
FnIII08 AVPPPTDLRFTNIGPD--TMRVTWAPPPSIDLTNFLVRYSPVKNE-EDVAELSISPSDNA
FnIII13 --PAPTDLKFTQVTPT--SLSAQWTPP-NVQLTGYRVRVTPKEKT-GPMKEINLAPDSSS
FnIII04 --PSPRDLQFVEVTDV--KVTIMWTPP-ESAVTGYRVDVIPVNLP-GEHGQRLPISRNTF
FnIII05 KLDAPTNLQFVNETDS--TVLVRWTPP-RAQITGYRLTVGLTR-R-GQPRQYNVGPSVSK
FnIII09 GLDSPTGIDFSDITAN--SFTVHWIAP-RATITGYRIRHHPEHFS-GRPREDRVPHSRNS
FnIII15 AIDAPSNLRFLATTPN--SLLVSWQPP-RARITGYIIKYEKPGSP-PREVVPRPRPGVTE
FnIII12 NIDRPKGLAFTDVDVD--SIKIAWESP-QGQVSRYRVTYSSPEDG-IHELFPAPDGEEDT
FnIII02 -PLVATSESVTEITAS--SFVVSWVSA-SDTVSGFRVEYELSEEG-DEPQYLDLPSTATS
FnIII03 -PDAPPDPTVDQVDDT--SIVVRWSRP-QAPITGYRIVYSPSVEG-S-STELNLPETANS
FnIII11 EIDKPSQMQVTDVQDN--SISVKWLPSSSP-VTGYRVTTTPKNGP-GPTKTKTAGPDQTE
FnIII06 -PGSSIPPYNTEVTET--TIVITWTPA PRIGFKLGVRPSQGG EAPREVTSDSGS
FnIII01 SSSGPVEVFITETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSV-GRWKEATIPGHLNS
*
1fna SS *E** EF******F**********FG*******G***
FnIII10 ATISGLKPGVDYTITVYAVTGRGDSPASSKPISINYRT
FnIII07 CTFDNLSPGLEYNVSVYTVKDDKES----VPISDTIIP
FnIII14 YTITGLQPGTDYKIYLYTLNDNARS----SPVVIDAST
FnIII08 VVLTNLLPGTEYVVSVSSVYEQHES----TPLRGRQKT
FnIII13 VVVSGLMVATKYEVSVYALKDTLTS----RPAQGVVTT
FnIII04 AEVTGLSPGVTYYFKVFAVSHGRES----KPLTAQQTT
FnIII05 YPLRNLQPASEYTVSLVAIKGNQES PKATGVFT
FnIII09 ITLTNLTPGTEYVVSIVALNGREES PLLIGQQS
FnIII15 ATITGLEPGTEYTIYVIALKNNQKS----EPLIGRKKT
FnIII12 AELQGLRPGSEYTVSVVALHDDMES----QPLIGTQST
FnIII02 VNIPDLLPGRKYIVNVYQISEDGEQ----SLILSTSQT
FnIII03 VTLSDLQPGVQYNITIYAVEENQES----TPVVIQQET
FnIII11 MTIEGLQPTVEYVVSVYAQNPSGES----QPLVQTAVT
FnIII06 IVVSGLTPGVEYVYTIQVLRDGQER---DAPIVNKVVT
FnIII01 YTIKGLKPGVVYEGQLISIQQYGHQ----EVTRFDFTT
. .* *
Mammalian Fibronectin Sequences
[0349] The FnIII domain alignment was obtained from PFAM and saved as PF00041.full. The PFAM alignment was truncated, lacking the C-terminal portion of the FG loop and all of sheet G. The alignment was in Stockholm 1.0 format.
[0350] The file fn3.in contained 1985 unique mammalian sequences and was loaded into Clustal X as Profile 2. All the sequences in Profile 2 were aligned to Profile 1. Outliers were then removed, and the remaining sequences in Profile 2 were realigned to Profile 1. The final alignment for Profile 2, fn3_final.aln, contained 1750 sequences.
[0351] The file fn3_final.aln was reformatted so that each protein occupied 1 line, all amino acids past the C-terminus of FnIII10 were trimmed, and a header was added to the top. The file was saved as fibronectins.aln and was the base alignment for further analysis.
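The reformatting step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the function names are hypothetical, the input is assumed to be a standard interleaved Clustal file, trimming is assumed to cut at the last non-gap column of the FnIII10 row, and the header format is not specified in the text, so none is written here. The sequence `seqX` in the toy input is invented.

```python
def clustal_to_single_lines(text):
    """Join interleaved Clustal blocks into one aligned string per sequence name."""
    rows = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("CLUSTAL") or line[0].isspace():
            continue  # skip blanks, the header, and conservation lines
        parts = line.split()
        if len(parts) < 2:
            continue
        name, chunk = parts[0], parts[1]
        rows[name] = rows.get(name, "") + chunk
    return rows


def trim_to_reference(rows, reference):
    """Cut every row after the last non-gap column of the reference sequence."""
    end = len(rows[reference].rstrip("-"))
    return {name: seq[:end] for name, seq in rows.items()}


# Toy two-block alignment; seqX is a hypothetical sequence.
example = """CLUSTAL 2.0.11 multiple sequence alignment

FnIII10    VSDVPRDLEV
seqX       -SDAPTNLQV

FnIII10    VAAT--
seqX       TNETDS
"""

rows = trim_to_reference(clustal_to_single_lines(example), "FnIII10")
for name, seq in rows.items():
    print(name, seq)
```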
Amino Acid Distribution
[0352] The SWISS_PROT release current as of July 15, 2010 was downloaded in FASTA format. The release contained 518,415 non-redundant sequences totaling 182,829,264 amino acids. The amino acid distribution in the SWISS_PROT release was calculated as a random-occurrence reference.
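A background distribution of the kind described in [0352] can be computed by counting residues over all records of a FASTA file. The sketch below assumes the input is plain FASTA text already read into memory; the function name and the tiny example records are hypothetical, and non-standard residue codes are simply ignored.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_distribution(fasta_text):
    """Return the percent occurrence of each standard amino acid in FASTA text."""
    counts = Counter()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            continue  # header line, not sequence
        counts.update(ch for ch in line.strip().upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


# Hypothetical miniature input: six residues in two records.
toy_fasta = ">rec1\nAAGG\n>rec2\nAW\n"
background = amino_acid_distribution(toy_fasta)
print(background["A"], background["G"], background["W"])
```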
Table 2 Table of amino acid distribution (% occurrence)
                    Sheets                                     Loops
AA Random FnIII     A     B     C     D     E     F     G    AB    BC    CD    DE    EF    FG
A     8.3   5.8   6.4   5.6   3.6   5.1   4.9   8.6   7.4   4.1   5.1   4.5   5.8   2.3   7.0
C     1.4   0.8   0.7   1.0   1.2   1.1   1.6   1.2   0.7   0.7   0.4   0.4   1.4   0.1   0.8
D     5.5   4.5   4.8   0.7   3.2   3.5   1.7   1.9   2.5  10.3   7.4   8.4   6.7   5.5   4.7
E     6.8   6.6   6.2   3.0   8.1   7.1   5.1   5.3   7.2   6.1   5.9  10.2   7.4   8.8   8.0
F     3.9   2.6   2.4   1.8   3.8   2.6   5.0   5.7   3.4   0.3   2.0   0.8   0.4   0.9   1.5
G     7.1   7.4   2.3   1.5   5.7   2.3   1.5   5.2   3.6   6.0   9.3  14.3   9.3  10.0  25.5
H     2.3   1.8   2.1   1.7   1.8   2.1   2.6   2.0   1.0   2.0   1.7   2.2   2.4   1.4   1.5
I     6.0   4.8   7.4   7.7   6.2   5.0   8.4   4.5   5.3   0.3   5.9   2.0   2.7   1.8   3.0
K     5.9   4.7   4.3   2.3   7.3   6.4   3.8   4.5   3.5   5.2   3.8   5.0   5.1   6.2   5.3
L     9.7   7.4  13.1  18.2   6.4   6.5  11.4   5.6   5.0   1.1   4.7   3.8   4.1  20.8   3.7
M     2.4   1.2   1.3   2.1   1.2   1.7   1.6   1.6   1.0   0.5   0.7   1.1   0.6   0.9   0.9
N     4.1   3.9   3.2   2.0   2.4   4.9   1.8   5.6   1.7   7.3   3.8   4.7   5.3   4.0   3.2
P     4.7   7.3   3.7   0.1   2.7   6.7   1.0   0.7  12.4   6.2  14.0   6.4   6.2  13.1   3.1
Q     3.9   3.8   2.9   2.8   4.4   5.3   3.1   3.2   3.2   2.5   4.0   5.6   4.8   3.5   4.3
R     5.5   5.2   4.3   3.3   7.0   6.2   4.0   7.5   3.3   4.8   3.4   5.7   6.1   4.0   5.7
S     6.5   8.8   6.5  21.2   5.5   6.2   8.3   5.6  13.6  18.3   6.7   8.6   9.3   6.0  10.8
T     5.3   8.4   9.2  10.5   4.7   9.1  13.3   7.9  12.1  21.4   6.0   6.4  15.4   7.3   3.2
V     6.9   8.9  18.1  13.2   9.3  15.2  15.1  11.5  10.6   1.1   5.2   3.5   5.9   2.4   4.9
W     1.1   2.1   0.5   0.4   1.6   1.2   0.8   0.7   1.1   0.9   8.8   5.7   0.2   0.4   0.8
Y     2.9   4.0   0.7   0.9  13.9   1.9   5.1  11.1   1.6   0.9   1.1   0.8   0.9   0.6   2.3
Mammalian FnIII Domain Motif
[0353] The mammalian FnIII domain motif is shown below. Key: H = hydrophilic, P = polar, B = Basic, A = Acid, C = Charged, X = no preference; sheets are in bold, specific amino acids are underlined, and subscripts indicate length variations (without %) or percent occurrence (with %).
A AB B
[H] [C/P] [H] [C/P] [X] [H] [P] [C/P] [P/A] [P] [H] [P] [H] [P]
BC c [W] [P/A] [P] [P/P] [P] 3-8 [H] [D/P] [P] [Yin [HIP] [B] [P/C] [B] [B/P] [C/P]
[P] [P ] 15% [VP] 5%
CD
[P/A] [P/C] [P/C] [C50%/B30%/P25%] [H50%/P25%/C25%] 55% [P/H] 25%
D DE E
[C/P] [H/A] [H] [X] [H] [P] [P] [P/C] [P/A] [P] [P] [H] [P] [H] [P]
EF F
[G/P][L][C/P][P/P][G/P][P][C/P][Y][C/P][H][B/P][H][P/H][A/P][H][P]
FG G
[P/A] [P/H] [G/P] [C/H] [G/S] [C/P] 60% IC/13 l [Y] [C/P] [B] [B/C] [B] [B/B]
[A/P] [B] [P]
Cradle Library Description
[0354] The cradle library was originally defined as sheets C, D, and F with loops CD and FG. Sheet D is on the outside of the fibronectin molecule and unlikely to significantly contribute to binding residues. The definition of the cradle library is refined to be sheets C and F with loops CD and FG, and various combinations thereof.
[0355] The cradle library mapped onto the human fibronectin sequences (SEQ ID NOs:292, 288, 289, 283, 284, 291, 97, 281, 285, 100, 290, 287, 282, 129, and 286, respectively) is shown below. Cradle residues are shown in bold.
1fna SS *********A****AB **B*****B C********C *****C D****D****DE**
FnIII01 SSSGPVEVFITETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSV-GRWKEATIPGHLNS
FnIII02 -PLVATSESVTEITAS--SFVVSWVSA-SDTVSGFRVEYELSEEG-DEPQYLDLPSTATS
FnIII03 -PDAPPDPTVDQVDDT--SIVVRWSRP-QAPITGYRIVYSPSVEG-S-STELNLPETANS
FnIII04 --PSPRDLQFVEVTDV--KVTIMWTPP-ESAVTGYRVDVIPVNLP-GEHGQRLPISRNTF
FnIII05 KLDAPTNLQFVNETDS--TVLVRWTPP-RAQITGYRLTVGLTR-R-GQPRQYNVGPSVSK
FnIII06 -PGSSIPPYNTEVTET--TIVITWTPA PRIGFKLGVRPSQGG---EAPREVTSDSGS
FnIII07 PLSPPTNLHLEANPDTG-VLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEVVHADQSS
FnIII08 AVPPPTDLRFTNIGPD--TMRVTWAPPPSIDLTNFLVRYSPVKNE-EDVAELSISPSDNA
FnIII09 GLDSPTGIDFSDITAN--SFTVHWIAP-RATITGYRIRHHPEHFS-GRPREDRVPHSRNS
FnIII10 VSDVPRDLEVVAATPT--SLLISWDAP-AVTVRYYRITYGETGGN-SPVQEFTVPGSKST
FnIII11 EIDKPSQMQVTDVQDN--SISVKWLPSSSP-VTGYRVTTTPKNGP-GPTKTKTAGPDQTE
FnIII12 NIDRPKGLAFTDVDVD--SIKIAWESP-QGQVSRYRVTYSSPEDG-IHELFPAPDGEEDT
FnIII13 --PAPTDLKFTQVTPT--SLSAQWTPP-NVQLTGYRVRVTPKEKT-GPMKEINLAPDSSS
FnIII14 NVSPPRRARVTDATET--TITISWRTKTET-ITGFQVDAVPANGQ--TPIQRTIKPDVRS
FnIII15 AIDAPSNLRFLATTPN--SLLVSWQPP-RARITGYIIKYEKPGSP-PREVVPRPRPGVTE
1fna SS *E** EF******F**********FG*******G***
FnIII01 YTIKGLKPGVVYEGQLISIQQYGHQ----EVTRFDFTT
FnIII02 VNIPDLLPGRKYIVNVYQISEDGEQ----SLILSTSQT
FnIII03 VTLSDLQPGVQYNITIYAVEENQES----TPVVIQQET
FnIII04 AEVTGLSPGVTYYFKVFAVSHGRES----KPLTAQQTT
FnIII05 YPLRNLQPASEYTVSLVAIKGNQES PKATGVFT
FnIII06 IVVSGLTPGVEYVYTIQVLRDGQER---DAPIVNKVVT
FnIII07 CTFDNLSPGLEYNVSVYTVKDDKES----VPISDTIIP
FnIII08 VVLTNLLPGTEYVVSVSSVYEQHES----TPLRGRQKT
FnIII09 ITLTNLTPGTEYVVSIVALNGREES PLLIGQQS
FnIII10 ATISGLKPGVDYTITVYAVTGRGDSPASSKPISINYRT
FnIII11 MTIEGLQPTVEYVVSVYAQNPSGES----QPLVQTAVT
FnIII12 AELQGLRPGSEYTVSVVALHDDMES----QPLIGTQST
FnIII13 VVVSGLMVATKYEVSVYALKDTLTS----RPAQGVVTT
FnIII14 YTITGLQPGTDYKIYLYTLNDNARS----SPVVIDAST
FnIII15 ATITGLEPGTEYTIYVIALKNNQKS----EPLIGRKKT
[0356] The top side library is loops BC, DE, and FG and the bottom library is loops AB, CD, and EF. The AB loop has a length constraint of 3 amino acids with position 1 being Threonine or Serine 67.4% of the time. The BC loop allows for a large variation of loop length with no individual length conserved more than 28.8%. However, the BC loop does contain a critical Tryptophan residue at position 1 which is conserved at > 92% in all BC lengths and is necessary for proper folding of the fibronectin. Additionally, position 4 in the BC loop is a Proline with >33% conservation and the N-1 position is a hydrophobic residue.
[0357] The CD loop has a Poisson distribution of length centered at 5 amino acids. The most abundant amino acid in the CD loop is Glycine in positions 2 and N-2, and when there is a position 5, it is a Tryptophan 30% of the time. The DE loop has a length constraint of 4 amino acids. The EF loop has a length constraint of 6 amino acids with a high amount of sequence conservation:
Position 1: Glycine conserved at 44.2%
Position 2: Leucine, Valine, or Isoleucine conserved at 97.8%
Position 3: Charged or polar amino acid
Position 4: Proline conserved at 57.5%
Position 5: Glycine conserved at 44.4%
Position 6: Tends to be a polar amino acid
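Loop statistics of the kind quoted above (length distribution and per-position conservation) can be tabulated with a short sketch. The five six-residue windows used as input are taken from the FnIII01 to FnIII05 rows of the human alignment, but their boundaries are an assumption made for illustration, and the function names are hypothetical. The percentages this toy input yields are not the mammalian-wide values reported in the text.

```python
from collections import Counter


def length_distribution(loops):
    """Percent of loops having each length."""
    counts = Counter(len(s) for s in loops)
    return {length: 100.0 * c / len(loops) for length, c in sorted(counts.items())}


def position_conservation(loops, length, position):
    """Percent occurrence of each residue at a 1-based position,
    restricted to loops of the given length."""
    column = [s[position - 1] for s in loops if len(s) == length]
    counts = Counter(column)
    return {aa: 100.0 * c / len(column) for aa, c in counts.most_common()}


# Assumed EF-loop windows for FnIII01..FnIII05 (boundaries are illustrative).
ef_loops = ["GLKPGV", "DLLPGR", "DLQPGV", "GLSPGV", "NLQPAS"]

ef_lengths = length_distribution(ef_loops)
ef_pos2 = position_conservation(ef_loops, length=6, position=2)
ef_pos5 = position_conservation(ef_loops, length=6, position=5)
print(ef_lengths, ef_pos2, ef_pos5)
```

Restricting each column to loops of a single length mirrors the way the text reports conservation "in all BC lengths" or for "sheet C, length 9", since positions are only comparable between loops of equal length.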
[0358] The FG loop length is either 5 (32.6%) or 6 (62.1%) with positions 3 and 5 having a Glycine >43% of the time. The remaining positions have conservation of <22% for any given amino acid.
[0359] The top side library is limited by the length constraint of the DE loop and the conserved amino acids in the BC loop. The bottom side library is limited by the 3 amino acid length constraint on the AB loop and the high degree of conservation throughout the EF loop.
[0360] The amino acids that participate most in hydrophobic packing of the fibronectin are BC loop position 1 (W), sheet C position 2 (Y/F), EF loop position 2 (L), and sheet F position 2 (Y). Both the top and bottom side libraries contain loops that have important packing residues, which may hinder effective variation. The cradle library loops do not contain structurally necessary amino acids. In addition, the cradle library utilizes the outward-facing amino acids of sheets C and F to expand the binding surface.
[0361] The cradle library beta strands C and F are the two longest in the fibronectin molecule and interact extensively to form an anti-parallel beta sheet which may stabilize the protein when changes to the outward facing amino acids are made. Amino acids 1, 3, 5, 7-9 in sheet C and amino acids 1, 3, 5, 7, and 10 in sheet F are intended for use in the cradle library (Figure 7A).
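The position lists in [0361] can be applied to a strand sequence as in the sketch below. The two strand strings are windows of the FnIII10 sequence chosen for illustration; treating them as the exact boundaries of sheets C and F is an assumption (the authoritative residue ranges are those of Figure 9F), and the function name is hypothetical.

```python
# 1-based positions intended for the cradle library, as listed in the text.
SHEET_C_CRADLE_POSITIONS = (1, 3, 5, 7, 8, 9)
SHEET_F_CRADLE_POSITIONS = (1, 3, 5, 7, 10)


def cradle_residues(strand, positions):
    """Return (position, residue) pairs for the selected 1-based positions."""
    return [(p, strand[p - 1]) for p in positions if p <= len(strand)]


# Assumed strand windows from FnIII10 (length 9 and length 10).
sheet_c = "YYRITYGET"
sheet_f = "DYTITVYAVT"

c_cradle = cradle_residues(sheet_c, SHEET_C_CRADLE_POSITIONS)
f_cradle = cradle_residues(sheet_f, SHEET_F_CRADLE_POSITIONS)
print(c_cradle)
print(f_cradle)
```

With these assumed windows, the positions left out of the cradle (2, 4, and 6 in both strands) are the inward-facing hydrophobic residues, and sheet F position 8 is an Alanine.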
[0362] The residues in sheets C and F were analyzed using a simplified amino acid type scheme in which A/G/P/S/T are considered small and flexible, D/E/N/Q/H/K/R are considered polar/charged, F/Y/W/I/L/V/M are considered hydrophobic, and C is considered disulfide-forming.
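The simplified scheme amounts to a lookup from residue to class. The sketch below uses the class labels SF, CP, H, and C from the Figure 7B legend; the function name and the example strand (a nine-residue window of FnIII10, assumed here to stand in for sheet C) are hypothetical.

```python
# Residue groups of the simplified amino acid type scheme.
SIMPLIFIED_TYPES = {
    "SF": "AGPST",    # small and flexible
    "CP": "DENQHKR",  # charged/polar
    "H": "FYWILVM",   # hydrophobic
    "C": "C",         # cysteine (disulfide)
}

# Invert the groups into a residue -> class lookup.
RESIDUE_TO_TYPE = {
    aa: label for label, group in SIMPLIFIED_TYPES.items() for aa in group
}


def simplify(sequence):
    """Map each residue to its simplified class; unknown symbols become '?'."""
    return [RESIDUE_TO_TYPE.get(aa, "?") for aa in sequence.upper()]


simplified_c = simplify("YYRITYGET")
print(simplified_c)
```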
[0363] Figure 7B shows the simplified positional distribution for sheet C length 9 (SF = amino acids A/G/P/S/T; CP = D/E/N/Q/H/K/R; H = F/Y/W/I/L/V/M; C = C). Positions 2, 4, and 6 showed a clear preference for hydrophobic amino acids and point inward toward the core of the fibronectin. Position 3 had a weaker preference for hydrophobic amino acids and points outward toward solvent. The top 10 amino acids by conservation in sheet C, length 9, position 3 were:
I, 17.2%
V, 15.4%
R, 13.2%
L, 10.9%
T, 10.2%
E, 8.5%
K, 5.6%
S, 3.9%
Q, 3.2%
H, 3.0%
[0364] Positions 1, 5, 7, 8, and 9 showed a clear preference for small flexible or charged/polar amino acids and pointed outward toward solvent.
[0365] Figure 7B shows the simplified positional distribution for sheet F length 10. Positions 2, 4, and 6 showed a clear preference for hydrophobic amino acids and pointed inward toward the core of the fibronectin. Position 7 had a weaker preference for hydrophobic amino acids and points outward toward solvent. The top 5 amino acids by conservation in sheet F, length 10, position 7 were:
R 17.1%
= 15.6%
A 7.8%
= 7.4%
/ 7.1%
[0366] Positions 1, 3, 5, and 10 showed a clear preference for small flexible or charged/polar amino acids and pointed outward toward solvent. Position 9 had a preference for hydrophobic amino acids. The top 5 amino acids by conservation in sheet F, length 10, position 9 were:
/ 18.7%
= 13.0%
= 12.7%
= 8.3%
7.8%
[0367] Position 8 contained a highly conserved Alanine residue with Alanine or Glycine at 80% conservation. The top 5 amino acids by conservation in sheet F, length 10, position 8 were:
A 68.7%
G 11.5%
6.1%
/ 2.9%
= 2.5%
Binding Surface Comparison
[0368] Shown below is the cradle library for FnIII07, FnIII10, and FnIII14 (SEQ ID NOs:97, 100, 129, respectively).
1fna SS *********A****AB **B*****B C********C *****C D****D****DE**
FnIII07 PLSPPTNLHLEANPDTG-VLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEVVHADQSS
FnIII10 VSDVPRDLEVVAATPT--SLLISWDAP-AVTVRYYRITYGETGGN-SPVQEFTVPGSKST
FnIII14 NVSPPRRARVTDATET--TITISWRTKTET-ITGFQVDAVPANGQ--TPIQRTIKPDVRS
1fna SS *E** EF******F**********FG*******G***
FnIII07 CTFDNLSPGLEYNVSVYTVKDDKES----VPISDTIIP
FnIII10 ATISGLKPGVDYTITVYAVTGRGDSPASSKPISINYRT
FnIII14 YTITGLQPGTDYKIYLYTLNDNARS----SPVVIDAST
[0369] Shown below is the top side library for FnIII07, FnIII10, and FnIII14 (SEQ ID NOs:97, 100, 129, respectively).
1fna SS *********A****AB **B*****B C********C *****C D****D****DE**
FnIII07 PLSPPTNLHLEANPDTG-VLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEVVHADQSS
FnIII10 VSDVPRDLEVVAATPT--SLLISWDAP-AVTVRYYRITYGETGGN-SPVQEFTVPGSKST
FnIII14 NVSPPRRARVTDATET--TITISWRTKTET-ITGFQVDAVPANGQ--TPIQRTIKPDVRS
1fna SS *E** EF******F**********FG*******G***
FnIII07 CTFDNLSPGLEYNVSVYTVKDDKES----VPISDTIIP
FnIII10 ATISGLKPGVDYTITVYAVTGRGDSPASSKPISINYRT
FnIII14 YTITGLQPGTDYKIYLYTLNDNARS----SPVVIDAST
[0370] Shown below is the bottom side library for FnIII07, FnIII10, and FnIII14 (SEQ ID NOs:97, 100, 129, respectively).
1fna SS *********A****AB **B*****B C********C *****C D****D****DE**
FnIII07 PLSPPTNLHLEANPDTG-VLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEVVHADQSS
FnIII10 VSDVPRDLEVVAATPT--SLLISWDAP-AVTVRYYRITYGETGGN-SPVQEFTVPGSKST
FnIII14 NVSPPRRARVTDATET--TITISWRTKTET-ITGFQVDAVPANGQ--TPIQRTIKPDVRS
1fna SS *E** EF******F**********FG*******G***
FnIII07 CTFDNLSPGLEYNVSVYTVKDDKES----VPISDTIIP
FnIII10 ATISGLKPGVDYTITVYAVTGRGDSPASSKPISINYRT
FnIII14 YTITGLQPGTDYKIYLYTLNDNARS----SPVVIDAST
Table 3 Binding Surface of each library
Domain     Top Side Library   Bottom Side Library   Cradle Library
FnIII07    1769 Å²            1382 Å²               2345 Å²
FnIII10    1834 Å²            1140 Å²               2457 Å²
FnIII14    1700 Å²            1088 Å²               1949 Å²
Table 4 Binding Surface of each library relative to the Top Side library.
Domain     Top Side Library   Bottom Side Library   Cradle Library
FnIII07    100%               78%                   133%
FnIII10    100%               62%                   216%
FnIII14    100%               64%                   115%
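The relative values in Table 4 follow from Table 3 by dividing each surface area by the Top Side value for the same domain. The sketch below repeats that arithmetic using the Table 3 areas; the dictionary layout, the normalized domain labels, and the function name are hypothetical. Recomputed this way, every entry matches Table 4 except the FnIII10 cradle value, which comes out near 134% rather than the 216% printed in the table, so one of the two printed FnIII10 figures may contain a transcription error.

```python
# Surface areas in square angstroms, as listed in Table 3.
SURFACE_AREA = {
    "FnIII07": {"top": 1769, "bottom": 1382, "cradle": 2345},
    "FnIII10": {"top": 1834, "bottom": 1140, "cradle": 2457},
    "FnIII14": {"top": 1700, "bottom": 1088, "cradle": 1949},
}


def relative_to_top(areas):
    """Express each library's surface as a whole-number percent of the top side."""
    return {lib: round(100.0 * area / areas["top"]) for lib, area in areas.items()}


relative = {domain: relative_to_top(areas) for domain, areas in SURFACE_AREA.items()}
for domain, row in relative.items():
    print(domain, row)
```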
Cradle Library Summary
[0371] The cradle library, consisting of beta strand C, beta strand F, loop CD, and loop FG of the FnIII domain, offers better stability and a larger available binding surface than the directional loop-based Top and Bottom Side libraries.
[0372] Figure 8 shows the amino acid distribution of the residues and amino acid variation compared with the distribution on CDR-H3 domains known to bind antigens. Figure 9C shows the biased amino acid distribution desired for the Cradle residues marked X and Y.
[0373] Figures 9D-9F show the mapping of the cradle library definition on the sequences of FnIII07, FnIII10, and FnIII14. Figure 9D: Alignment of cradle residues for FnIII07, FnIII10, and FnIII14. Beta sheets are shown as white residues on a black background and loops are shown as black text. Cradle residues are shown in bold with X representing the amino acid distribution for the beta sheets and Y representing the amino acid distribution for the loops with the loop length range given as a subscript. Figure 9E: Alignment of FnIII07, FnIII10, and FnIII14 illustrating the cradle residues in beta sheets C and F and loops CD and FG. Beta sheets are shown as white residues on a black background and loops are shown as black text with Cradle residues shown in bold. Figure 9F: Shown are the FnIII structural element residue ranges and FnIII cradle residue ranges.
Example 7 Cradle Molecules Binding to Lysozyme, Fc, and Human Serum Albumin (HSA)
[0374] This Example demonstrates proof of principle for generating cradle molecules that bind to target molecules using calculated design libraries. Using the approaches described above, cradle binders were created against three targets (lysozyme, human Fc, and HSA) with FnIII07, FnIII10, and FnIII14.
[0375] FnIII07 hits for lysozyme (SEQ ID NOs:97-99, respectively):
PLSPPTNLHLEANPDTGVLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEVVHADQSSCTFDNLSPGLEYNVSVYTVKDDKES-VPISDTIIP
PLSPPTNLHLEANPDTGVLTVSWERSTTPDITAYYTYTYKSDKTRY--LEEVVHADQSSCTFDNLSPGLYYGVGAVATVRPHPTAGPISDTIIP
PLSPPTNLHLEANPDTGVLTVSWERSTTPDITHYLIYTYG-HHSAG--LEEVVHADQSSCTFDNLSPGLGYSVYVNTVAYK--TMGPISDTIIP
[0376] FnIII10 hits for lysozyme (SEQ ID NOs:100-128, respectively):
VSDVPRDLEVVAATPTSLLISWDAPAVTVR-YYRITYGETGGNSPV----QEFTVPGSKSTATISGLKPG-VDYTITVYAVTGRGDSPASSKPISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVR-TYYIMYSLWQHYVTNAL--QEFTVPGSKSTATISGLKPG-VFYGILVYAVSWWS R W PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVR-SYYIKYSTCSHYVRSGVG-QEFTVPGSKSTATISGLKPG-VDYMIDVNAVLSEG-RGD PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVR-GYTT YS YRDS QEFTVPGSKSTATISGLKPGVIYNILVSAVSEWW K Y PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVR-YYEIWYESYFY VLWQEFTVPGSKSTATISGLKPG VSYEITVSAVYWH YAY PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVR-LYAIMYTAYEYRVMDAKLYQEFTVPGSKSTATISGLKPG-VSYYINVAAVYLHR-YFY PISINYRT
VSDVPRDLEVVAATPTSLLIPWDAPAVTVR-GYKIDYVVQTW AYYQEFTVPGSKSTATISGLKPG VSYAITVLAVYRW YYS PISINYRT
VQYDIYVGAVETYV YAR PISINYRT
VSDVPRDLEVVAATPTSLLIS-DAPAVTVR-SYYTYY--YDYDG GSVQEFTVPGSKSTATISGLKPG VSYVISVAAVWYAA YRY PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVR-NYLIDYGYKNYSI AG QEFTVPGSKSIATISGLKPG VFYAILVAAVRYFW YF PISINYRT
ISDVPRDLEVVAATPTSLLISWDAPAVTVR-GYSIHYYY--YSF TG QEFTVPGSKSTATISGLKPG VSYWIRVWAVRFWE YLP PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVR-GYDIAYGVNYYYY SY QEFTVPGSKSTATISGLKPG VVYGIYVAAVRYWH YLF PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVR-IYSIGS QEFTVPGSKSTATISGLKPG VWYWIYVAAVRAWS YWH PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVR-EYYIYYGSSQE TEGQEFTVPGSKSTATISGLKPG VNYSIGVAAVQNIY TYY PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVR-SYEIGYEYIYLQY SQEFTVPGSKSTATISGLKPG VMYSIVVYAVNKVY SYF PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVR-TYSISY--FDYLHL YSQEFTVQGSKSTATISGLKPG VYYAIYVWAVG WW LAD PISINYRT
VSDVPRDLGVVAATPTSLLISWDAPAVTVR-KYMISYTLMGHLHYG--ASQEFTVPGSKSTATISGLKPGVVYYGIYVLAVSEYQ-VAS PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVR-SYNISYSKYHYSPA YQEFTVPGSKSTATISGLKPG VQYYISVSAVHAHN VAG PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRGGYGIGYAKAGSVDA YQEFTVPGSKSTTTISGLKPG VXYYIYVRAVFAH PAY PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVR-KYQISYG--YYSNT DQEFTVPGSKSTATISGLKPG VDYWIYVSAVAWQA DQG PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVR-TYSISYR--YGKWS GQEFTVPGSKSTATISGLKPG VYYDIGVTAVTSVV SG PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRQVYVIAYR--YYVRSW GQEFTVPGSKSTATISGLKPG VYYSINVLAVYYRT WR PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVR-SYDISYNGMAYTKTL VQEFTVPGSKSTATISGLKPG
VSDVPRDLEVVAATPTSLLISWDAPAVTVR-NYAISYQ DDSPY VQEFTVPGSKSTATISGLKPG VNYDISVTAVGWWR SGM PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVR-TYDIGYSSFNSSTLY VQEFTVPGSKSTATISGLKPG VNYDISVTAVRLQE SQR PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVR-EYDIYYVDSYYYFEGQYPHQEFTVPGSKSTATISDLKPG-VTYDIGVKAVYNGSRIVE PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVR-VYEISYYSSESYL PGQEFTVPGSKSTATISGLKPG VTYDIHVSAVAYRG AS PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVR-EYLIGYAV TEYGDRQEFTVPGSKSTATISGLKPG VLYDIRVLAVYARW-PK PISINYRT
VSDVPRDLEVVAATPTSLPISWDAPAVTVR-YYSIWYYHY YPYAQEFTVPGSKSTATISSLKQG VRYFIDVLAVAWVR WAY PISINYRT
[0377] FnIII14 hits for lysozyme (SEQ ID NOs:129-141, respectively):
NVSPPRRARVTDATETTITISWRTKTETITGF-QVDAVPANGQ-
NVSPPRRARVTDATETTITISWRTKTETITSF-WVWAKPYSYYWGSIQRTIKPDVRSYTITGLQPGTWYAINLYTLT-YRFWGDPVVIDAST
NVSPPRRARVTDATETTITISWRTKTETITYFGDVSAGPSSTYIESIQRTIKPDVRSYTITGLQPGTWYNIVLQTLYSWSYW--PVVIDAST
NVSPPRRARVTDATETTITISWSTKTETITSF-VVGARP--YYYPYIQRTIKPDVRSYTITGLQPGTVYGIWLQTLR-YYYGYTPVVIDAST
NVSPPRRARVTDATETTITISWRTKTETITAF-EVVAHP--NYDYYIQRTIKPDVRSYTITGLQPGTSYWIYLYTL--YSRRYLPVVIDAST
NVSPPRRARVTDATETTITISWRTKTETITSF-SVIAFPLRERAATIQRTIKPDVRSYTITGLQPGTLYSIILNTL--WRYYPIPVVIDAST
NVSPPHRARVTDATETTITISWRTKTETITNF-LVYAYP--TEHVRIQRTIKPDVRSYTITGLQPGTKYWIYLYTLIYNMYY--PVVIDAST
NVSPPRRARVTDATETTITISWRTKTETITGF-SVWAQP--GYLEEIQRTIKPDVRSYTITGLQPGTSYDSIALSTLGRYRWSDPVVIDAST
NVSPPRRARVTDATETTITISWRTKTETITQF-HVTAGP--HWVGRIQRTIKPDVRSYTITGLQPGTAYLIYALSTLRSYRYQWPVVIDAST
NVSPPRRARVTDATETTITISWRTKTETITYF-HVSALP-LVYGSYIQRTIKPDVRSYTITGLQPGTTYDIYLSTLN-SHWLTAPVVIDAST
NVSPPRRARVTDATETTITISWRTKTETITRF-YVEATPSAAANTSIQRTIKPDVRSYTITGLQPGTMYQIWLATLS-YYASHYPVVIDAST
NVSPPRRARVTDATETTITISWRTKTETITSF-GVTAKP-VWSWGSIQRTIKPDVRSYTITGLQPGTGYAISLYTLLRYWYRYYPVVIDAST
NVSPPRRARVTDATETTITISWRTKTETITAF-YVQAYP--YSDHSIQRTIKPDVRSYTITGLQPGTYYDITLSTLR--SYYYRPVVIDAST
[0378] FnIII07 hit for human Fc (SEQ ID NOs:142-143, respectively):
PLSPPTNLHLEANPDTGVLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEVVHADQSSCTFDNLSPGLEYNVSVYTVKDDKES-VPISDTIIP
PLSPPTNLHLEANPDTGVLTVSWERSTTPDITGYTIVAVSYSFYYY-LEEVVHADQSSCTFDNLSPGLSYDEVYVVTVAYKSHGVPISDTAPS
[0379] FnIII10 hits for human Fc (SEQ ID NOs:144-147, respectively):
VSDVPRDLEVVAATPTSLLISWDAPAVTVRYYRITYGETGGNSPVQEFTVPGSKSTATISGLKPGVDYTITVYAVTGRGDSPASSKPISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRTYQIGYG-YNRGTS-QEFTVPGSKSTATISGLKPGVSYGIYVYAVYE WSYS--PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRYYSITYTYYQAFG-TQEFTVPGSKSTATISGLKPGVGYYIQVYAVGDRVS---NGGPISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRYYGIYYS-MSSYG-RQEFTVPGSKSTATISGLKPGVTYQIYMSAVDN WGVG-YPISINYRT
[0380] FnIII14 hits for human Fc (SEQ ID NOs:148-159, respectively):
NVSPPRRARVTDATETTITISWRTKTETITGFQVDAVPANGQ----TPIQRTIKPDVRSYTITGLQPGTDYKIYLYTLNDNARS--SPVVIDAST
NVSPPRRARVTDATETTITISWRTKTETITSFTVWASP RSYTH--IQRTIKPDVRSYTITGLQPGTYYR-IYLYTLY-NTYFS-PVVIDAST
NVSPPRHARVTDATETTITISWRTKTETITSFRVWAAP---TMYQYLYIQRTIKPDVRSYTITGLQPGTYYQAIILGTLS-TSNTPSPVVIDAST
NVSPPRRARVTDATETTITISWRTKTETITSFFVQAYP YGELYIQRTIKPDVRSYTITGLQPGTSYG-IRLSTLI-DSDSYGPVVIDAST
NVSPPRRARVTDATETTITISWRTKTETITRFTVVAHP GYPGYIQRTIKPDVRSYTITGLQPGTYYS-IDLRTLA-YAQGYSPVVIDAST
NVSPPRRARVTDATETTITISWRTKTETITRFTVTADP WYWYGIQRTIKPDVRSYTITGLQPGTYYSGIVLDTLS-WVSGGYPVVIDAST
NVSPPRRARVTDATETTITISWRTKTETITNFSVQAGPSI YYGYYIQRTIKPDVRSYTITGLQPGTQYS-ISLRTLWRWYGTYWPVVIDAST
NVSPPRRARVTDATETTITISWRTKTETITGFLVNAWP HWANVIQRTIKPDVRSYTITGLQPGTFYV-IYLATLQ-YSSVYSPVVIDAST
NVSPPRRARVTDATETTITISWRTKTETITSFAVHAQP VYANWIQRTIKPDVRSYTITGLQPGTYYG-INLATL--YGPNYWPVVIDAST
NVSPPRRARVTDATETTITISWRTKTETITYFSVFAYPES-GAYNIQRTIKPDVRSYTITGLQPGTAYD-IKLDTLL-SSYWYHPVVIDAST
NVSPPRRARVTDATETTITISWRTKTETITTFGVYAMHPEEGGYYY--IQRTIKPDVRSYTITGLQPGTWYG-IGLDTLY-SVHDERPVVIDAST
NVSPPRRARVTDATETTITISWRTKTETITRFYVTDALPG-DAYRYHRIQRTIKPDVRSYTITGLQPGTLYG-ISLTTLY-YAS-AIPVVIDAST
[0381] FnIII07 hits for HSA (SEQ ID NOs:160-199, respectively):
PLSPPTNLHLEANPDTGVLTVSWERS TTPDI TGYRITTTPTNGQQG--NSLEEVVEADQS SCTFDNL
SPGLE¨YNV¨SVYTVKDDKESVP I SDT I IP
PLSPPTNLHLEANPDTGVLTVSWERSTTRDITTYGIETEYDHSV GLEEVVHADQS SCTFDNL
SPGLN¨YDV¨EVVTVGWGVYQRP I SDT I IP
PLSPPTNLHLEANPDTGVLTVSWERSTTPDITTYVISTVTSHTGP RLEEVVHADQS SCTFDNL
SXGLC¨YDV¨YVYTVTDTAYTTP I SDT I IP
SL SPPTNLHLEANPDTGVLTVGWERS TTPGI T SYS IDTAKDDVPY LEEVVHADQS SCTFDNL
SPGLN¨YTV¨VVATVGWS¨VDGP I SDT I IP
PL SPPTNLHLEANPDTGVLTVSWERS TTPDI TYYE INTTGYYGFYPG--GLEEVVHADQS SCTFDNL
SPGLY¨YQV¨TVQTVVYSMWYHP I SDT I IP
PL SPPTNLHLEANPDTGVLTVSWERS TTPDI TYYGIWTLTWLQYYSYRWGLEEAVHADQS SCTFDNL
SPGLV¨YLV¨YVGTVRSP¨MARP I SDT I IP
PL SPPTNLHLEANPDTGVLTVSWERS TTPDI TWYWIGTWY¨SGYMV--GLEEVVHADQS SCTFDNL
SPGLT¨YWV¨LVGTVVRSP SRRP I SDT I IP co co PLSPPTNLHLEANPDTGVLTVSWERSTTPDVTTYS IYTYGYWDSHYM--SLEEVVHADQSRCTFDNL
SPGLY¨YSV¨EVYTVYYGLYVVP I SDT I IP co PLSPPTNLHLEANPDTGVLTVSWERSTTPDITTYGIETQTVEWVY YLEEVVHADQS SCTFDNL
SPGLY¨YNV¨TVGTVMLD¨AAYP I SDT I IP
PL SPPTNLHLEANPDTGVLTVSWERS TTPDI TDYYI I TRS RW G YLEEVVHADQS SCTFDNL
SPGLR¨YHV¨YVWTVGH¨Y¨RDP I SDT I IP 0 PLSPPTNLHLEANPDTGVLTVSWERSTTPDITNYL IQTDYFAF IK¨G--VLEEVVHADQS SCTFDNL
SPGLY¨YYV¨GVDTVSVPSH¨GP I SDT I IP
PL SPPTNLHLEANPDTGVLTVSWERS TTPDI TYYT IATADYTYSY¨A--HLEEVVHADQS SCTFDNL
SPGLN¨YEV¨GVGTVSVYSYI GP I SDT I IP o PLSPPTNLHLEANPDTGVLTVSWERSTTPDITYYS IDTWT¨FGQW GLEEVVHADQS SCTFDNL
SPGLY¨YYV¨EVVTVYEWAYSYP I SDT I IP
PL SPPTNLHLEANPDTGVLTVSWERS TTPDI TMYAITTYEYSRAW¨Q--YLEEVVHADQS SCTFDNL
SPGLT¨YYV¨EVYTVRYT¨WSDP I SDT I IP
PL SPPTNLHLEANPDTGVLTVSWERS TTPDI TDYNI S TWLYT SVSVYT¨ELEEVVHAGQS SCTFDNL
SPGLA¨YVVYVWS TVWEHFYP SP I SDT I IP
PLSPPTNLHLEANPDTGVLTVSWERSTTPDITWYWINTSLANVRM SLEEVVHADQSGCTFDNL
SPGLY¨YDV¨QVRTVSAA¨EGYP I SDT I IP
PLSPPTNLHLEANPDTGVLTVSWERSTTPDITKYI I--YTGYGAS ¨Y--DLEEVVHADQS SCTFDNL
SPGLK¨YTV¨TVWTVSYA¨SQVP I SDT I IP
PLSPPTNLHLEANPDTGVLTVSWERSTTPDITMYS IYTYDYTRNY VLEEAVHADQS SCTFDNL
SPGLYGYYV¨GVGTVTGA¨GWHP I SDT I IP
PL SPPTNLHLEVNPDTGVLTVSWERS TTPGI TQYDIATL SYGGRS ¨G--GLEEVVHADQS SCTFDNL SPGL
S ¨YVV¨SVS TVT SNEYSAP I SDT I IP
PL SPPTNLHLEANPDTGVLTVSWERS TTPDI TMYDIKT TYYKAYY¨Y--GLEEVVHADQS SCTFDNL
SPGLY¨YFV¨GVVTVERP¨RYYP I SDT I IP 1-3 PL SPPTNLHLEANPDTGVLTVSWERS TTPDI TYYYIDTNG¨G¨YW¨S --YLEEVVHADQS SCTFDNL
SPGLG¨YPVGYVRTVYAGWLKGP I SDT I IP
PLSPPTNLHLEANPDTGVLTVSWERSTTPDITYYYIGTYQ¨GTTY¨E--HLEEVVHADQSSCTFDNLSPGL I
¨YLV¨YVS TVYWDSMS SP I SDT I IP
PLSPPANLRLEANPDTGVLTVSWERSTTPDITRYVIATGYGGSWY HLEEVVHADQSRCTFDNL
SPGLA¨YYV¨DVYTVTPGEKHSP I SDT I IP
PLSPPTNLHLEANPDTGVLTVSWERSTTPDITKYI I S TYVDYGGY LEEVVHADQS SCTFDNL
SPGLG¨YSV¨TVS TVSAG¨WDSP I SDT I IP
PLSPPTNLHLEANPDTGVLTVSWERSTTPDITSYRISTEWRWRYT GLEEVVHADQSSCTFDNLSPGL I
¨YGV¨GVS TVWKHNSQAP I SDT I IP
PLSPPTNLHLEANPDTGVLTVSWERSTTPDITMYYISTGGSSYKPD
RLEEVVHADQSSCTFDNLSPGLD¨YMV¨YVRTVMY¨YNRSPISDTIIP
PLSPPTNLHLEANPDTGVLTVSWERSTTPDITGYSIATYLTYSNLV
GLEEVVHADQSSCTFDNLSPGLS¨YKV¨SVYTVYGY¨SYGPISDTIIP 0 PLSPPTNLHLEANPDTGVLTVSWERSTTSDITKYYIATWFGDYGY
SLEEVVHADQSSCTFDNLSPGLQ¨YGV¨SVATVKGGQAHYPISDTIIP =
PLSPPTNLHLEANPDTGVLTVSWERSTTPDITKYYILTSG--YWG¨G--GLEEVVHADQSSCTFDNLSPGLT¨YLV¨SVWTVTH¨YAGYPISDTIIP
PLSPPTNLHLEANPDTGVLTVSWERSTTPDITYYSITTSF¨Y--Y¨S--ELEEVVHADQSSCTFDNLSPGLK¨YMV¨SVSTVSYS¨VGSPISDTIIP
PLSPPTNLHLEANPDTGVLTVSWERSTTPDITTYYISTQGQDERG¨Y--VLEEVVHADQSSCTFDKLSPGLI¨YXV¨IVWTVDDN¨RYDPISDTIIP
PLSPPTNLHLEANPDTGVLTVSWERSTTPDITRYYIRTSYVRHGR
LEEVVHADQSSCTFDNLSPGLY¨YNV¨SVSTVGYY¨YMLPISDTIIP
PLSPPTNLHLEANPDTGVLTVSRERSTTPDITTYSIYTHS
GALYVLEEVVHADQSSCTFDNPSPGLN¨YNV¨SVSTVHSRWRYGPISDTIIP
PLSPPTNLHLEANPDTGVLTVSWERSTTPDITMYGIVTIY--TRYY
SLEEVVHADQSSCTFDNLSPGLI¨YWV¨YVLTVYY¨SWYRPISDTIIP
PLSPPTNLHLEANPDTGVLTVSWERSTTPDITTYVIDTGA--AVNY
VLEEVVHADQSSCTFDNLSPGLQ¨YSV¨DVVVTVWYSWYMPISDTIIP
PLSPPTNPHLEANPDTGVLTVSWERSTTPDITTYWIGTYY
SADERLEEVVHADQSSCTFDNLSPGLY¨YAV¨VVGTVGVWYRVAPISDTIIP
PLSPPTNLHLEANPDTGVLTVSWERSTTPDITYYYIHTYY¨WKHWQ
SLEEVVHADQSSCTFDNLSPGLK¨YGV¨WVSTVYRV¨VYYPISDTIIP
PLSPPTNLHLEASPDTGVLTVSWERSTTPDITTYLILTYLGYSR
VLEEVVHADQSSCTFDNLSPGLW¨YMV¨YVDTVGRVPYIGPISDTIIP
PLSPPTNLHLEANPDTGVLTVSWERSTTPDITVYYIYTYT¨YNADL
ILEEVVHADQSSCTFDNLSPGLI¨YSV¨YVGTVAS¨DDGRPISDTIIP
PLSPPTNLHLEANPDTGVLTVSWERSTTPDITAYVI
YTYSESDGRVLEEVVHADQSSCTFDNLSPGLR¨YSV¨KVSTVY¨YSYAYPISDTIIP co co
[0382] FnIII10 hits for HSA (SEQ ID NOs:200-238, respectively):
VSDVPRDLEVVAATPTSLLISWDAPAVTVRYYRITYGETGGN---SPVQEFTVPGSKSTATISGLKPGVDY¨TITVYAVTGRGDSPASSKPISINYRT
(1) VSDVPRDLEVVAATPTSLLISWDAPAVTVRGYYISYYYHSTRD SQEFTVPGSKSTATISGLKPGVSY YVGVGAV
WKKDYYF PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRYYYILYGDYNAYMD¨YAGQEFTVPGSKSTATISGLKPGVGYVEIDVYAV
¨RTSEEQ PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRAYQIRYAY¨YSVG RQEFTVPGSKSTATISGLRPGVKY HISVYAV
NGGMVTD PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRYYWIIYWEAWEYVQ AQEFTVPGSKSTATISGLKPGVHY GIMVSAV
SGEQPWY PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRIYSIYYYSYVMRGYY--FQEFTVLGSKSTATISGLKPGVNY¨DINVQAV¨YHRGWRY PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVAVRAYSIDYY¨HDNGDG TQEFTVPGSKSTATISGLKPGVTY GILVYAV
VS NMGI PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRSYYIGYSAYDEYGG RQEFTVPGSKSTATISGLKPGVSY SINVFAV
YTMTGRA PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRKYSIYYFSSYSGI AQEFTVPGSKSTATISGLKPGVYY GIYVEAV
YH HYSP PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRNYYIQYM¨VNYND TQEFTVPGSKSTATISGLKPGVYY DIKVAAV
YV AEDR PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRKYYITYYRGRSG NQEFTVPGSKSTATISGLKPGVKY HILVSAV
KYPFRRL PISINYRT o VSDVPRDLEVVAATPTSLLISWDAPAVTVRTYWIVY¨YRSVYSN GQEFTVPGSKSTATISGLKPGVIY SIRVIAV
SYYYYG PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRTYSISYSFGLDYEY DQEFTVPGSKSTATISGLKPGVQY YIVVDAV
AGWQYY PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRGYSIKYG¨ST¨ISA DQEFTVPGSKSTATISGLKPGVFY VIMVWAV
YYAYANY PISINYRT o VSDVPRDLEVVAATPTSLLISWDAPAVTVRSYHIYDYYNVHSYY GQEFTVPGSKSTATISGLKPGVSY AIYVGAV
NE KQLG PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRTYVISYMSYDAQGG¨Q¨NQEFTVPGSKSTATISGLKPGVAY¨NIIVSAV
¨GGGQQAV PISINYRT 0 VSDVPRDLEVVAATPTSLLISWDAPAVTVREYSIYHSWTLVYR RQEFTVPGSKSTATISGLKPGVNY YIYVGAV
DNGYGPD PISINYRT =
VSDVPRDLEVVAATPTSLLISWDAPAVTLRYYEIKYSGSSLY VQXFTVPGSKSTATIXGLKPGVSY NIGVSXV
WQAFWPV PISINYRT
VSDVPRDLGVVAATPTSLRISWDAPAVTVRSYDIYYWYTTGG SQEFTVPGSKSTATISGLKPGVMY NIYVTAV
DA DVGG PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRRYYIGYNWQSPAW NQEFTVPGSKSTATISGLKPGVYY
QIYVAAVLRYGDY A PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRSYSIGYF¨GAYRNW VQEFTVPGSKSTATISGLKPGVTY YIEVYAV
YS NPVY PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRKYQIYYYYSAYVKE SQEFTVPGSKSTATISGLKPGVSY NIAVYAV
SKSRYQP PISINYRT
VSDVPRDLEVVAATPTSLLTSWDAPAVTVRNYAIYYYDD--DTG RQEFTVPGSKSTATISGLKPGVDY YIGVEAV
WY WVSS PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRTYTIWYV QRYAY SQEFTVPGSKSTATISGLKPGVSY SISVRAV
STDRYY PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRYYSIAYW QLYLP VQEFTVPGSKSTATISGLKPGVSY GITVEAV
MSGYSIY PISINYRT
VSDVPRDLEVVAAAPTSLLISWDAPAVTVRKYYIWYGYSY¨FVAYSSYQEFTVPGSKSTATISGLKPGVRY¨YIGVLAV
¨KYPGDYY PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRYHSIGYNY YGMYQEFTVPGSKSTATISGLKPGVYY YIYVRAV
TGREAA PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRTYQIEYV SSYYRWTQEFTVPGSKSTATISGLKPGVVY FIYVAAV
RDGPN D PISINYRT co VSDVPRDLEVVAATPTSLLISWDAPAVTVRSYKISYYGY HWVYQEFTVPGSKSTATISGLKPGVSY LISVSAV
DY YGVL PISINYRT co VSDVPRDLEVVAATPTSLLISWDAPAVTVRTYYIGYGMYT YGQEFTVPGSKSTTTISGLKPGVVY DIYVWAV
GFGRYVD PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRNYYIGYRY TVANWCYQEFTVPGSKSTATISGLKPGVSY WITAKAV
VSDVPRDLEVVAATPTSLLISWDAPAVTVRKYYIGYKL QVMEPDQEFTVPGSKSTATISGLKPGVEY WIGVDAV
SYYWGFD PISINYRT (1) VSDVPRDLEVVAATPTSLLISWDAPAVTVRGYGIYYGDT GDTQEFTVPGSKSTATISGLKPGVMY SIVVFAV
EW YMWQ PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRYYWIQYYI YYSRGTQEFTVPGSKSTATISGLKPGVNY SIGVQAV
QAYFGE PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRSYRIMYS--GYYAWEYSRQEFTVPGSKSTATISGLKPGVIY¨AIHVSAV¨VT¨NWEG PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRDYWIYYRYS WPYGSQEFTVPGSKSTATISGLKPGVTY DIQVEAV
YG SESG PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRYYYGIYYAGKAGGDYFITQEFTVPGSKSTATISGLKPGVEY¨RIYVAAV
¨GY¨HYTP PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRNYSIKYK YIPYVSHQEFTVPGSKSTATISGLKPGVTY SIRVQAV
YYLIERY PISINYRT
VSDVPRDLEVVAATPTSLLISWDAPAVTVRIYYIAYGY¨YPGWGRAGSQEFTVPGSKSTATISGLKPGVTY¨GISVSAV
¨EE¨RRKV PISINYRT
[0383] FnIII14 hits for HSA (SEQ ID NOs:239-277, respectively):
NVSPPRRARVTDATETTITISWRIKTETITGFQVDAVPANGQ
TPIQRTIKPDVRSYTITGLQPGTDYKIYLYTLNDNARS--SPVVIDAST o NVSPPRRARVTDATETTITISWRTKTETITDFEVAALPMVST
GIQRTIKPDVRSYTITGLQPGTTYYISLYTLDDDGPG--TPVVIDAST
NVSPPRRARVTDATETTITISWRTKTETITSFNVVAYPS¨SQDG
IQRTIKPDVRSYTITGLQPGTGYQIHLTTLG¨HLSF--SPVVIDAST o NVSPPRRARVTDATETT I T I SWRTKTET I TYFTVDAAP S LVVD NIQRT IKPDVRSYT I
TGLQPGTYYI ILLYTLYNYDA LPVVI DAS T
NVSPPRRARVTDATETT I T I SWRTKTET I TLFEVYADPQVSNGT YIQRT IKPDVRSYT I
n.) NVSPPRRARVTDATETT I T I SWRTKTET I TRFFVSAVPF ETG T IQRT IKPDVRSYT I
TGLQPGTAYDIALYTLF¨GYYY--YPVVI DAS T =
1¨, n.) NVSPPRRARVTDATETT I T I SWRTKTET I TDFGVVASPY LGQ GIQRT IKPDVRSYT I
TGLQPGTAYS IKLHTLH¨VHDY--YPVVI DAS T
-a-, NVSPPRRARVTDATETT I T I SWRTKTET I TYFYVAADPTEDG KIQRT IKPDVRSYT I
TGLQPGTYYT IHLRTLYYLVA VPVVI DAS T cA
n.) .6.
NVSPPRRARVTDATETT I T I SWRTKTET I TYFDVAANP SYLG AIQRT IKPDVRSYT I
TGLQPGTAYDIALGTL EXYVSGPVVI DAS T un NVSPPRRARVTDATETT I T I SWRTKTET I TYFGVGADPA¨MYIEYP YIQRT IKPDVRSYT I
TGLQPGTQYGIYLTTL S ¨QASD--YPVVI DAS T
NVSPPRRARVTDATETT I T I SWRTKTET I TYFGVRAYPTYRS S IQRT IKPDVRSYT I
TGLQPGTLYRI SLYTLDSAG¨Y--NPVVI DAS T
NVSPPRRARVTDATETT I T I SWRTKTET I TQF SVYAYPARSKYH IQRT IKPDVRSYT I
TGLQPGTGYRIYLQTLG¨GYSD--EPVVI DAS T
NVSPPRRARVTDATETT I T I SWRTKTET I TEFDVGADPG KGH AIQRT IKPDVRSYT I TGLQPGT
SYL IGLRTLN¨RVLH--YPVVI DAS T
NVSPPRRARVTDATETT I T I SWRTKTET I T SFRVDAGPGVAG S IQRT IKPDVRSYT I
TGLQPGTYYQIQLAALAYGY YPVVI DAS T
NVSPPRRARVTDATETT I T I SWRTKTET I TRFYVSAQPRFYYYN IQRT IKPDVRSYT I
TGLQPGTDYT IGLYTLG¨VYMH--YPVVI DAS T
n NVSPPRRARVTDATETT I T I SWRTKTET I TYF SVEAYPRWYAL IQRT IKPDVRSYT I
TGLQPGT SYYIYLWTLMMDT S SPVVI DAS T
o NVSPPRRARVTDATETT I T I SWRTKTET I TEFFVMAEP--YYGEGY YIQRT IKPDVRSYT I TGLQPGT
SYS INLYTLK¨RPYL--YPVVI DAS T n.) co NVSPPRRARVTDATETT I T I SWRTKTET I T SFYVMAQPTNYYGQS T YIQRT IKPDVRSYT I
TGLQPGTYYGIQLYTLMYRAS APVVI DAS T o in co NVSPPRRARVTDATETT I T I SWRTKTET I TTFDVYAYPG¨YGGSYW S IQRT IKPDVRSYT I
TGLQPGT SYE IELETLH¨YSHA--YPVVI DAS T o) c...) tv I TGLQPGTGYRIFL S TLR¨WYYG--MPVVI DAS T
n.) NVSPPRRARVTDATETT I T I SWRTKTET I TYF SVYANPMYPFY IQRT IKPDVRSYT I
TGLQPGTYYE IYLGTLYYFAT YPVVI DAS T H
CA
NVSPPRRARVTDATETT I T I SWRTKTET I TYFYVSAYPYYVAY DIQRT IKPDVRSYT I
TGLQPGTYYDINL S TL SYSDN SPVVI DAS T
oI
NVSPPRRARVTDATETT I T I SWRTKTET I TYFKVRAYPA¨YNYGGW S IQRT IKPDVRSYT I
TGLQPGTYYS IYLDTLYLGAYW--YPVVI DAS T '7 H
NVSPPRSARVTDATETT I T I SWRTKTET I TYFVVGAFPAYSAHV DIQRT IKPDVRSYT I
TGLQPGTGYI INLETLINATG YPVVI DAS T o) NVSPPRRARVTDATETT I T I SWRTKTET I TQFWVLAGP SVWTGRM S IQRT IKPDVRSYT I
TGLQPGTTYYIGLYTLQYYEY SPVVI DAS T
NVSPPRRARVTDATETT I T I SWRTKTET I TTFRVWARPYLYYW IQRT IKPDVRSYT I
TGLQPGTHYDIGL S TL S ¨S TWY--YPVVI DAS T
NVSPPRRARVTDATETT I T I SWRTKTET I TYFHVNAQP S SPP WIQRT IKPDVRSYT I
TGLQPGTYYGI SLYTL SWRGEY--HPVVI DAS T
NVSPPRRARVTDATETT I T I SWRTKTET I TRF SVLAYP S ¨KRTTYT P IQRT IKPDVRSYT I
TGLQPGTGYT IRLYTL SPYYWV--YPVVI DAS T
NVSPPRRARVTDATETT I T I SWRTKTET I TWFYVSAFPL¨LVDG IQRT IKPDVRSYT I
TGLQPGTYYGINLYTL S S YPVVI DAS T
IV
NVSPPRRARVTDATETT I T I SWRTKTET I TYFYVYAKPRYIN S IQRT IKPDVRSYT I
TGLQPGTDYS IYLDTLYWGGEY--GPVVI DAS T n ,-i NVSPPRRARVTDATETT I T I SWRTKTET I TAFNVYASPEYWRYGYFR F IQRT IKPDVRSYT I
TGLQPGTGYYIYLYTLYHKYGY--YPVVI DAS T
ci) NVSPPRRARVTDATETT I T I SWRTKTET I TAFYVHAVPMLWVVNG IQRT IKPDVRSYT I
TGLQPGT SYT INLETLRMS SHY--YPVVI DAS T n.) o NVSPPRRARVTDATETT I T I SWRTKTET I T SFYVRALPVSAW P IQRT IKPDVRSYT I
TGLQPGTGYNIGLVTLYYGASY--VPVVI DAS T 1¨, NVSPPRRARVTDATETT I T I SWRTKTET I TAFYVGAHPWYNL E IQRT IKPDVRSYT I
TGLQPGTGYVI SLYTLWHHNE APVVI DAS T -a-, .6.
c7, NVSPPRRARVTDATETT I T I SWRTKTET I T SFWVHAYP SGASGG IQRT IKPDVRSYT I
TGLQPGTNYGIALATLTHYYTY--SPVVI DAS T
cA
o NVSPPRRARVTDATETT I T I SWRTKTET I TGFHVFASPWYSGSQ S IQRT IKPDVRSYT I
TGLQPGTTYYIGLNTLYIPGHE--PPVVI DAS T
NVSPPRRARVTDATETTITISWRTKTETITSFYVDAGP
WYRPDAYEYIQRTIKPDVRSYTITGLQPGTGYSIQLYTLYAYAYL--YPVVIDAST
NVSPPRRARVTDATETTITISWRTKTETITLFYVYAYPR¨YYPG
IQRTIKPDVRSYTITGLQPGTSYSIYLSTLW¨DTKG--YPVVIDAST 0 n.) NVSPPRRARVTDATETTIT I SWRTKTET ITTFMVVAYPM¨FQYR
IQRTIKPDVRSYTITGLQPGTSYTIYLQTLG¨YASW--YPVVIDAST =
Example 8 Proof of Principle with Small Ubiquitin-Like Modifier (SUMO) Using Structure-Guided Design
[0384] The inventors used SUMO as a non-limiting model for demonstration of the methods described herein. The following description is given for the purpose of illustrating various embodiments of the invention and is not meant to limit the present invention in any fashion.
One skilled in the art will appreciate that the present invention is well adapted to carry out the objects mentioned, as well as those objects, ends and advantages inherent herein. SUMO is used to represent the general embodiments for the purpose of proof of principle and is not intended to limit the scope of the invention. The described methods and compositions can be used with respect to a plethora of target molecules and are not limited solely to SUMO.
Changes therein and other uses which are encompassed within the spirit of the invention as defined by the scope of the claims will occur to those skilled in the art.
[0385] SUMOs are structurally similar to ubiquitin and are post-translationally conjugated to other proteins, resulting in a variety of functional modulations. In humans, there are four SUMO isoforms (SUMO1-4) (Gareau and Lima, Nature Rev. (2010) 11:861-871). SUMO1 and SUMO2 share 41% sequence identity (72% similarity) but are functionally distinct (Figure 10B, bottom) (Saitoh and Hinchey, J. Biolog. Chem. (2000) 275:6252-6258; Vertegaal, et al., Mol. Cell Proteomics (2006) 5:2298-2310). SUMO2 and SUMO3, collectively referred to as SUMO2/3, share 97% sequence identity and are assumed to be functionally identical (Gareau and Lima, supra, 2010; Johnson, Annual Rev. Biochem. (2004) 73:355-382). SUMO4's relevance as a post-translational modification is not clear (Bohren, et al., Protein Express. Purif. (2007) 54:289-294; Owerbach, et al., Biochemical Biophysical Res. Comm. (2005) 337:517-520). Thus, most studies in SUMO biology have focused on SUMO1 and SUMO2/3. SUMOylation plays important roles in regulating diverse cellular processes including DNA repair, transcription, nuclear transport and chromosome dynamics (Gareau and Lima, supra, 2010; Johnson, supra, 2004). The dominant mechanism by which SUMOylation alters protein function appears to be through SUMO-mediated interactions with other proteins containing a short peptide motif known as a SUMO-interacting motif (SIM) (Johnson, supra, 2004; Kerscher, EMBO Repts. (2007) 8:550-555; Song, et al., Proc. Natl. Acad. Sci. USA (2004) 101:14373-14378).
[0386] The scarcity of inhibitors of SUMO/SIM interactions limits the ability to finely dissect SUMO biology and provides an ideal model system to demonstrate the effectiveness of the methods and compositions described herein. In the only reported example of such an inhibitor, a SIM-containing linear peptide was used to inhibit SUMO/SIM interactions, establishing their importance in coordinating DNA repair by non-homologous end joining (NHEJ) (Li, et al., Oncogene (2010) 29:3509-3518). This peptide sensitized cancer cells to radiation and chemotherapeutic-induced DNA damage, illustrating a therapeutic potential for SUMO/SIM inhibitors. These findings clearly establish the utility of SUMO/SIM inhibitors, but the peptide inhibitor suffers from two significant shortcomings. First, the peptide binds equally well to SUMO1 and SUMO2/3, making it impossible to differentiate the roles of each isoform. Second, the peptide has low affinity for SUMO (Kd ~5 µM) (Song, et al., supra, 2004). As a result, high concentrations of the peptide are required for inhibition. Most natural SIM peptides exhibit similarly low affinities and discriminate individual SUMO isoforms by ~10-fold or less (Kerscher, supra, 2007; Chang, et al., J. Biological Chem. (2010) 285:5266-5273; Hecker, et al., J. Biol. Chem. (2006) 281:16117-16127; Sekiyama, et al., J. Biological Chem. (2008) 283:35966-35975; Zhu, et al., J. Biological Chem. (2008) 283:29405-29415). Higher affinity reagents capable of selectively inhibiting the SIM interactions of individual SUMO isoforms could be powerful tools for better defining the functions of each isoform and potentially more potent therapeutics. However, the development of such highly selective inhibitors presents a formidable challenge, as the SIM binding site is highly conserved among SUMO isoforms (Figures 16A and 10B, bottom) (Chupreta, et al., Molec. Cell. Biol. (2005) 25:4272-4282). The development of a SUMO/SIM inhibitor or affinity agent that distinguishes between certain isoforms of SUMO can be used as a model system to demonstrate the ability to design and produce such affinity agents using the methods and compositions described herein.
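To make concrete why a Kd of about 5 µM implies high working concentrations, the sketch below estimates target occupancy under a simple 1:1 equilibrium binding model with the inhibitor in large excess over SUMO. This model and the function names are assumptions for illustration; the text does not state which binding model applies, and competition with endogenous SIM-containing partners is ignored.

```python
def fraction_bound(ligand_conc_um: float, kd_um: float) -> float:
    """Fraction of target occupied at equilibrium for simple 1:1 binding.

    Assumes the free ligand concentration equals the total ligand
    concentration (ligand in large excess over target).
    """
    return ligand_conc_um / (kd_um + ligand_conc_um)


def conc_for_occupancy(target_fraction: float, kd_um: float) -> float:
    """Ligand concentration (µM) needed to reach a given fractional occupancy."""
    if not 0.0 < target_fraction < 1.0:
        raise ValueError("target_fraction must be strictly between 0 and 1")
    return kd_um * target_fraction / (1.0 - target_fraction)


KD_PEPTIDE_UM = 5.0  # approximate Kd of the SIM peptide quoted in the text

print(fraction_bound(5.0, KD_PEPTIDE_UM))      # occupancy when [L] = Kd
print(conc_for_occupancy(0.9, KD_PEPTIDE_UM))  # concentration for 90% occupancy
```

Under these assumptions, half of the target is occupied when the peptide concentration equals Kd, and 90% occupancy requires roughly nine times Kd, i.e. about 45 µM.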
FnIII Cradle Library
[0387] Libraries have been designed and constructed in which positions in the beta-strand regions of the FnIII scaffold, in addition to loop positions, are diversified. Two libraries are described herein; they differ in that positions in the CD loop (residues 41-45) were diversified in library BL1 but not in library BL2. The libraries were constructed in the phage display format following procedures that have been published (Wojcik, et al., supra, 2010) and described herein. The BL1 and BL2 libraries were estimated to contain 6x101 and 1x101 unique sequences, respectively.
[0388] Selection of cradle molecules from the libraries was performed as described previously (Koide, A., et al., 2009). The following targets, in the form of poly-histidine tagged proteins, were used: human SUMO1, human ubiquitin, human Abl SH2 domain, human SFMBT2 domain, human SCMH1 domain, and green fluorescent protein. Multiple clones were identified for most of the targets. Representative binding data are shown in Figure 15. The amino acid sequences of monobody clones are given in Table 6.
Table 5. Amino acid diversity used for the cradle libraries.

Library   Position   Diversity
BL1       30(R)      D, F, H, I, L, N, V, or Y
          31(Y)      F, H, L, or Y
          33(R)      D, F, H, I, L, N, V, or Y
          41-45      5 or 6 residues of [Y(30%), S(15%), G(10%), F(5%), W(5%), all others except for C (2.5% each)]
          47(E)      A, E, K, or T
          49(T)      A, E, K, or T
          75-85      7-13 residues of [Y(30%), S(15%), G(10%), F(5%), W(5%), all others except for C (2.5% each)]
BL2       30(R)      D, F, H, I, L, N, V, or Y
          31(Y)      F, H, L, or Y
          33(R)      D, F, H, I, L, N, V, or Y
          47(E)      A, E, K, or T
          49(T)      A, E, K, or T
          75-85      7-13 residues of [Y(30%), S(15%), G(10%), F(5%), W(5%), all others except for C (2.5% each)]
ST1       31(Y)      D, H, N, or Y
          33(R)      A, D, E, G, H, K, N, P, Q, R, S, or T
          73(Y)      A, D, F, H, I, L, N, P, S, T, V, or Y
          75(V)      D, F, H, I, L, N, V, or Y
          76(T)      D, H, N, or Y
          77(G)      S
          78(R)      A, D, E, G, H, I, K, L, M, N, P, Q, R, S, T, V, or Y
          79(G)      D, E, H, K, N, Q, or Y
          80(D)      A, D, F, H, I, L, N, P, S, T, V, or Y
          81(S)      F, I, L, or V
          82(P)      A, D, F, H, I, L, N, P, S, T, V, or Y
          83(A)      D, F, H, I, L, N, V, or Y
          84(S)      A or S
          85(S)      D, F, H, I, L, N, V, or Y
          88(I)      S

Wild-type residues are shown in parentheses.
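The design in Table 5 can be checked arithmetically. The sketch below, offered only as an illustration, multiplies the number of allowed residues at each ST1 position to obtain a theoretical amino-acid-level diversity, and rebuilds the biased loop mixture to confirm that its percentages sum to 100. The dictionary and function names are hypothetical; the calculation ignores codon-level effects and says nothing about the number of clones actually obtained in a constructed library.

```python
from math import prod

# Allowed residues per diversified ST1 position, transcribed from Table 5.
ST1_CHOICES = {
    31: "DHNY",
    33: "ADEGHKNPQRST",
    73: "ADFHILNPSTVY",
    75: "DFHILNVY",
    76: "DHNY",
    77: "S",
    78: "ADEGHIKLMNPQRSTVY",
    79: "DEHKNQY",
    80: "ADFHILNPSTVY",
    81: "FILV",
    82: "ADFHILNPSTVY",
    83: "DFHILNVY",
    84: "AS",
    85: "DFHILNVY",
    88: "S",
}


def theoretical_diversity(choices: dict) -> int:
    """Number of distinct amino acid sequences the design can encode."""
    return prod(len(set(residues)) for residues in choices.values())


def loop_mix() -> dict:
    """Biased residue mixture (percent) used for the BL1/BL2 loop positions."""
    mix = {"Y": 30.0, "S": 15.0, "G": 10.0, "F": 5.0, "W": 5.0}
    for aa in "ACDEFGHIKLMNPQRSTVWY":
        if aa != "C" and aa not in mix:
            mix[aa] = 2.5  # all other residues except Cys at 2.5% each
    return mix


print(theoretical_diversity(ST1_CHOICES))
print(sum(loop_mix().values()), len(loop_mix()))
```

For the ST1 design this gives about 1.6 x 10^11 possible sequences, and the loop mixture covers 19 residue types (Cys excluded) summing to 100%.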
Table 6. Amino acid sequences of cradle molecules generated from cradle libraries.

Sequences are grouped according to their binding target. "x" designates a diversified position in the libraries. Because the lengths of the CD and FG loops were varied in the BL1 and BL2 libraries, the numbers of "x"s shown for these libraries are for guidance only and they do not accurately reflect the actual numbers of residues.

BL1 Library (SEQ ID NO:3)
Library
VSSVPTKLEVVAATPTSLLISWDAPAVTVxxYxITYGETG-xxxxxxQxFxVPGSKSTATISGLSPGVDYTITVYAxxxxxxxxxxxxxxSPISINYRT
Alignment ruler (columns not preserved): 10 20 30 40 [CD loop] 50 60 70 [FG loop] 90

human SUMO1 (SEQ ID NOS:4-11, respectively)
VSSVPTKLEVVAATPTSLLISWDAPAVTVDHYVITYGETG-SYSSYGQEFAVPGSKSTATISGLSPGVDYTITVYAY---EFQFEMYMSYSPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDYYVITYGETG-G-VYGPQEFEVPGSKSTATISGLSPGVDYTITVYAW-F-YQQAYEHYVSSPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVLFYHITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYAYYS-DYTY SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDHYVITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYAWYD--YSWG-YYGYSPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDFYVITYGETG-GN-SPVQEFEVPGSKSTATISGLSPGVDYTITVYAW IYS-DSVYSASPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDHYVITYGETGGYAYSASQEFEVPGSKSTATISGLSPGVDYTITVYAY---ESYYWGFAGYSPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDYYVITYGETG-VFGAGPQEFEVPGSKSTATISGLSPGVDYTITVYAY-H
VSSVPTKLEVVAATPTSLLISWDAPAVTVDHYHITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYAYWE-AFSGDLYYSSSPISINYRT
human ubiquitin (SEQ ID NOS:12-27, respectively)
VSSVPTKLEVVAATPTSLLISWDAPAVTVDHYNITYGETG-AFWHYVQAFTVPGSKSTATISGLSPGVDYTITVYAEW--DQYVVG SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDYYVITYGETG-GGYYSFQAFEVPGSKSTATISGLSPGVDYTITVYAFWP-DDYYYGGSEYSPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDHYHITYGETG-GSWSGYQEFTVPGSKSTATISGLSPGVDYTITVYANS SWYWYNPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDHYVITYGETG-AHYYYFQEFEVPGSKSTATISGLSPGVDYTITVYAVSH-GTDGNKLYFFSPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDFYVITYGETG-GWWYGVQAFTVPGSKSTATISGLSPGVDYTITVYAEDS GGRHSISPISXNYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDYYVITYGETG-WY-SPPQEFTVPGSKSTATISGLSPGVDYTITVYAWNW--SAG LQSPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDYYVITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYAWS WKYWYHGSPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDFYIITYGETG-GGYYSYQTFTVPGSKSTATISGLSPGVDYTITVYAN---EFGKSYPYTMNPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVLYYVITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYATDY-GPGYPY ESPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDLYHITYGETG-GVWSGYQEFTVPGSKSTATISGLSPGVDYTITVYAVQH---QEIWPYYYSPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDHYFITYGETG-GSWSYYQEFAVPGSKSTATISGLSPGVDYTITVYAYSY EPYYYYNPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDFYVITYGETG-SF-SPPQEFTVPGSKSTATISGLSPGVDYTITVYAMMW--GWEYYDYNISPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDLYIITYGETG-SYHGW-QTFTVPGSKSTATISGLSPGVDYTITVYADSS TWPYWYYSSPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDFYVITYGETG-SF-SPPQEFTVPGSKSTATISGLSPGVDYTITVYAMMW--GWEYYDYNISPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDHYVITYGETG-GVWYGYQEFTVPGSKSTATISGLSPGVDYTITVYAMTS YFQEYWSPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDFYVITYGETG-SF-SPPQEFTVPGSKSTATISGLSPGVDYTITVYAMMW--GWEYYDYNISPISINYRT
human Abl SH2 domain (SEQ ID NOS:28-44, respectively)
VSSVPTKLEVVAATPTSLLISWDAPAVTVDHYVITYGETG-GYPSPVQTFTVPGSKSTATISGLSPGVDYTITVYAWD YDW--YAIGSPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVVYYVITYGETG-GN-SPVQEFTVPGSKSTATISGL
VSSVPTKLEVVAATPTSLLISWDAPAVTVDFYFITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYAWDNWD-DYYY SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVVFYVITYGETG-SYSGW-QEFEVPGSKSTATISGLSPGVDYTITVYAY YYQNPE -S YYSPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDLYFITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYAWY YGYYGPQYT SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDYYYITYGETG-GN-SPVQEFTVPGSKSTATISGL
VSSVPTKLEVVAATPTSLLISWDAPAVTVHYYVITYGETG-WW-GPVQEFTVPGSKSTATISGLSPGVDYTITVYAY WKYSYKYSPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDYYVITYGETG-AFGSG-QEFEVPGSKSTATISGLSPGVDYTITVYA KWMYS-YMYN-PISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVVYYFITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYAWSYE LTGDYLQQF -SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVVYYNITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYAWY--EYGGYME I D-SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDYYVITYGETG-VPYYGWQEFEVPGSKSTATISGLSPGVDYTITVYAY-P -GSNWFYDWW-SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDFYVITYGETG-SYGSYPQAFEVPGSKSTATISGLSPGVDYTITVY TESEGYISS--SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVYHYVYLITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYA KWKYSYQY--SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDYYYITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYA-WYWNDYYMS SM--SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDYYVITYGETG-GN-SPVQEFEVPGSKSTATISGLSPGVDYTITVYA--TYGDAYWHYYY-SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVVHYHITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYA DWQYSYMY--SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVDFYVYLITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVY GYS DSWNWPY-SPISINYRT ("SH13")

SFMBT2 (SEQ ID NO:45)
VSSVPTKLEVVAATPTSLLISWDAPAVTVDYYVITYGETG-FSFGSSQTFKVPGSKSTATISGLSPGVDYTITVYA F YWSKYY--SPISINYRT

SCMH1 (SEQ ID NOS:46-47, respectively)
VSSVPTKLEVVAATPTSLLISWDAPAVDLYVYLITYGETG-VASWGYQEFTVPGSKSTATISGLSPGVDYTITVYA YGGNYWY--SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVHYYVYLITYGETG-YYSYG-QEFEVPGSKSTATISGLSPGVDYTITVYA YNGSGWMVQ-SPISINYRT
Green fluorescent protein (SEQ ID NOS:48-78, respectively)
VSSVPTKLEVVAATPTSLLISWDAPAVTVDHYYITYGETG--AYWYSQAFTVPGSKSTATISGLSPGVDYTITVYA S TKFNQY--SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDYYHITYGETG--HYWYYQAFAVPGSKSTATISGLSPGVDYTITVYA SS I DYMY--SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDYYYITYGETG-GY-WFPSTFTVPGSKSTATISGLSPGVDYTITVYA SMSPSGYFYSPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDFYFITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYA YGEWDWWSW-SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDYYFITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYA-YHVSF PS DEEGM-SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDLYYITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITIYA F GS YHYWEH--SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDFYVITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYA YGEYKWWSY--SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDLYVITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYA YGGYEYWYY--SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDFYVITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYA RGYFKWWEY--SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDYYFITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYA--GMVYYGWERE S-SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVVHYVITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYALYEGGQHF GYSF S-SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDFYVITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYA YGS YS YWMY--SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDYYVITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYAGYVEWQSAKNVH--SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDHYNITYGETG-GSWYAYQTFEVPGSKSTATISGLSPGVDYTITVYA--SF SGDMYYYY--SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDYYVITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYAGYVAF DYYWRGGY-SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVHYYYITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYA SLWDWYSS SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDYYFITYGETG-GYFSSWQEFTVPGSKSTATISGLSPGVDYTITVYA-GYAGSFP SYE SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDLYVITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYA YGDYYYWLY--SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDFYVITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYA YGEFGWWRY--SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDLYHITYGETG-GPWWGYQTFAVPGSKSTATISGLSPGVDYTITVYT SSHHPGWW--SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDHYVITYGETG-YYAYSYQTFTVPGSKSTATISGLSPGVDYT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDLYHITYGETG-GPWWGYQTFAVPGSKSTATISGLSPGVDYTITVYT S SHHPGWWS --SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDYYHITYGETG-SYWHY-QAFEVPGSKSTATISGLSPGVDYTITVYA QTRNRYME --SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDLYVITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYA YGDFMYWKY--SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDFYVITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYA YGGYS YWLH--SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDHYHITYGETG-SHYWSYQKFTVPGSKSTATISGLSPGVDYTITVYA-SPEGRGS YYGW--SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDYYNITYGETG-VWFPY-QTFTVPGSKSTATISGLSPGVDYTITVYA SMVDYEYWW--SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVVHYLITYGETG-GAGSSYQTFAVPGSKSTATISGLSPGVDYTITVYA YMSNYYSY--SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDLYHITYGETG-GSGWGYQAFAVPGSKSTATISGLSPGVDYTITVYA SS DYLKYY--SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDFYVITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYA-YD I GWFPAHYG--SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVVFYLITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYA-YS TGGSYKSQ SPISINYRT
BL2 Library (SEQ ID NO:79)
Library
VSSVPTKLEVVAATPTSLLISWDAPAVTVxxYxITYGETG-GN-SPVQxFxVPGSKSTATISGLSPGVDYTITVYAxxxxxxxxxxxxxxSPISINYRT

human Abl SH2 domain (SEQ ID NOS:80-85, respectively)
VSSVPTKLEVVAATPTSLLISWDAPAVTVVHYVITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYA L LS SSHWVYE-SPISINYRT ("GG3")
VSSVPTKLEVVAATPTSLLISWDAPAVTVDLYLITYGETG-GN-SPVQEFKVPGSKSTATISGLSPGVDYTITVYAGSDYYYYYQGAYW-SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDFYVITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYA NWAYS YRY-SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVFYYVITYGETG-GN-SPVQEFEVPGSKSTATISGLSPGVDYTITVYA NYPYS YMY-SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDYYLITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYA WDPYWDVM-SPISINYRT
VSSVPTKLEVVAATPTSLLISWDAPAVTVDLYVITYGETG-GN-SPVQEFTVPGSKSTATISGLSPGVDYTITVYAGWGNWE LGYSWS --SPISINYRT
ST1 Library (SEQ ID NO:86)
Library
VSSVPTKLEVVAATPTSLLISWDASSSSVSxYxITYGETGGNSPVQEFTVPGSSSTATISGLSPGVDYTITVxAxxSxxxxxxxxSPSSINYRT

human SUMO1 (SEQ ID NOS:87-96, respectively)
VSSVPTKLEVVAATPTSLLISWDASSSSVSHYHITYGETGGNSPVQEFTVPGSSSTATISGLSPGVDYTITVYAFYSDDDLYFAFSPSSINYRT
VSSVPTKLEVVAATPTSLLISWDASSSSVSHYGITYGETGGNSPVQEFTVPGSSSTATISGLSPGVDYTITVYAYHSYDDIYYALSPSSINYRT
VSSVPTKLEVVAATPTSLLISWDASSSSVSHYAITYGETGGNSPVQEFTVPGSSSTATISGLSPGVDYTITVYAYHSYDDIFLADSPSSINYRT
VSSVPTKLEVVAATPTSLLISWDASSSSVSHYEITYGETGGNSPVQEFTVPGSSSTATISGLSPGVDYTITVYAYYSHEDIFYAVSPSSINYRT
VSSVPTKLEVVAATPTSLLISWDASSSSVSHYEITYGETGGNSPVQEFTVPGSSSTATISGLSPGVDYTITVAAYHSYHDIFYAVSPSSINYRT
VSSVPTKLEVVAATPTSLLISWDASSSSVSHYEITYGETGGNSPVQEFTVPGSSSTATISGLSPGVDYTITVTAYDSYYDIYIAYSPSSINYRT
VSSVPTKLEVVAATPTSLLISWDASSSSVSYYEITYGETGGNSPVQEFTVPGSSSTATISGLSPGVDYTITVIAFYSHD
VSSVPTKLEVVAATPTSLLISWDASSSSVSHYAITYGETGGNSPVQEFTVPGSSSTATISGLSPGVDYTITVYAYYSYDDLYVSDSPSSINYRT
VSSVPTKLEVVAATPTSLLISWDASSSSVSHYAITYGETGGNSPVQEFTVPGSSSTATISGLSPGVDYTITVFAYYSYDDIYYAYSPSSINYRT
VSSVPTKLEVVAATPTSLLISWDASSSSVSHYDITYGETGGNSPVQEFTVPGSSSTATISGLSPGVDYTITVHAYYSYDDIYVAISPSSINYRT

Structure-Guided Library Design
[0389] Many proteins are members of large structurally conserved families.
Binding proteins that can specifically target functional sites on individual members of highly related protein families are valuable tools for studying the unique functions of these molecules.
Proteins in such families often exhibit high levels of sequence similarity in addition to conserved structural features, making it difficult to generate binding proteins that effectively discriminate individual family members. This problem is more pronounced when targeting a functional site that is particularly highly conserved among family members.
Taken together, these factors make the production of such reagents a challenge.
[0390] In recent studies, the structure of an FnIII domain variant (ySMB-1) bound to yeast small ubiquitin-like modifier protein (ySUMO) was determined. ySMB-1 bound to ySUMO at a functional site normally used to interact with short peptide motifs known as SUMO-interacting motifs (SIMs) (Figure 10A). The SIM binding site constitutes the most conserved surface among SUMO family proteins (Figure 10B) (Hecker, et al., J. Biol. Chem. (2006) 281:16117-16127). Despite this high level of conservation, ySMB-1 was shown to effectively discriminate ySUMO from two closely related human homologs, hSUMO-1 and hSUMO-2.
[0391] Specific, high-affinity cradle molecules binding to the SIM binding sites of SUMO family proteins could potentially be used as inhibitors of SUMO/SIM interactions. Since the roles of these interactions in different SUMO proteins are not well understood, such cradle molecules could be valuable tools in studying SUMO biology. However, FnIII domain variants to hSUMO-1 and hSUMO-2 have not been identified in combinatorial libraries in which loops of the FnIII domain are diversified, with the exception of a single hSUMO-1 binding clone that cross-reacts with ySUMO and hSUMO-2. These difficulties suggest that an alternative approach is required to obtain FnIII domain variants to these targets. Cradle molecules that bind to the SIM binding site of hSUMO-1 were generated by making a structure-guided library based on the architecture of the ySUMO-binding FnIII domain variant ySMB-1. The idea behind this strategy was to maintain the useful binding mode of ySMB-1 and its recognition of the SIM binding site, while allowing sufficient alteration in the cradle molecule binding surface to accommodate sequence differences in the predicted epitopes on other SUMO proteins.
[0392] Cradle molecules were isolated that specifically target individual human SUMO isoforms as well as the yeast homolog of SUMO (ySUMO), which has about 45% sequence identity (about 67% similarity) with human SUMOs (hSUMOs) (Figures 10B, bottom, and 16A). Numerous cradle molecules to ySUMO with mid-nM Kd values were successfully isolated (Figures 16C, D and 19).
[0393] To further improve the design of a cradle library, the crystal structure of a ySUMO-binding FnIII domain variant bound to ySUMO was determined, which revealed the structural basis for targeting ySUMO. Guided by this structural information, a "SUMO-targeted" cradle library that produced isoform-specific cradle molecules to hSUMO1 was developed. Cradle molecules that bound to the SIM-binding site of human SUMO1 with Kd values of ~100 nM but bound to SUMO2 400 times more weakly were obtained from this library. Functional studies also demonstrated that these cradle molecules are highly selective inhibitors of hSUMO1/SIM interactions and also of hSUMO1 conjugation.
Structure-Guided Design of a SUMO-Targeted Phage Display Library
[0394] To guide the library design, the residues in the ySMB-1 epitope on ySUMO were compared with the equivalent residues in hSUMO-1 and hSUMO-2. The ySMB-1 paratope residues that contacted or were near each of these epitope positions were then identified (Figures 11A, 11B). In the ySMB-1/ySUMO structure, ySMB-1 forms a binding surface using an engineered FG loop and a portion of the undiversified FnIII scaffold. In the SUMO-targeted library, both of these surfaces are diversified (Figure 11C). The SUMO-targeted library was designed by introducing amino acid diversity at each ySMB-1 paratope position that included the wild-type ySMB-1 residue and other amino acid types that might allow effective complementation of any of the three SUMO targets (Figure 11B). For example, polar amino acids and amino acids with complementary charge were included at positions expected to contact a charged residue in one or more SUMO isoforms; hydrophobic amino acids were included at positions expected to contact a hydrophobic surface; and small amino acid residues were included at positions that may have steric clashes with larger side chains in some of the SUMO proteins.
[0395] All residues of the FG loop were varied except one, S77, which did not contact ySUMO in the ySMB-1/ySUMO crystal structure and did not appear to be capable of direct participation in any similar interface. Y76 was varied to D, H, N and Y because, although it did not directly contact ySUMO in the ySMB-1 interface, it was suspected that this position may be capable of interacting with the conserved R55 in all SUMOs (Figure 11A). Leucine 81 of ySMB-1 is buried in a pocket in the ySUMO surface that is conserved across all SUMO isoforms, and an equivalent "anchor" leucine or valine is conserved in all SIM/SUMO complexes for which there are structures. As a result, amino acid diversity at this position was restricted to F, L, I and V. E47 and S86 of the FnIII scaffold made very minimal contact in the ySMB-1 interface and were not varied. Though P87 of the scaffold did make significant contact in the ySMB-1 interface, it was held constant to avoid perturbing the turn structure it introduces, which would likely change the overall positioning of the FG loop. The total number of encoded sequences in the SUMO-targeted library was 1.6 x 10^11, and the actual size of the phage library produced was 2.0 x 10^9.
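The relationship between the encoded diversity and the physical library size can be illustrated with a short calculation. The following is only a sketch: the dictionary lists just the two positions whose allowed residues are spelled out in the text (the complete per-position design is in Figure 11B and is not reproduced here), and the function names are hypothetical.

```python
from math import prod

# Residue choices per diversified position. Only these two positions are
# spelled out in the text; the full design is in Figure 11B, so this
# dictionary is deliberately incomplete (an assumption of this sketch).
ALLOWED = {
    "Y76": "DHNY",  # varied to D, H, N and Y
    "L81": "FLIV",  # anchor position restricted to F, L, I and V
}


def encoded_diversity(allowed):
    """Encoded sequences = product of the number of choices at each position."""
    return prod(len(set(choices)) for choices in allowed.values())


def library_coverage(actual_size, encoded_size):
    """Upper bound on the fraction of encoded sequences present in the library."""
    return min(1.0, actual_size / encoded_size)


# Contribution of the two listed positions only.
print(encoded_diversity(ALLOWED))
# Library sizes quoted in the text: 2.0 x 10^9 actual, 1.6 x 10^11 encoded.
print(f"{library_coverage(2.0e9, 1.6e11):.2%}")
```

With the sizes quoted above, the phage library can contain at most a small percentage of the encoded sequences, so the selections sample the design space rather than exhaust it.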
Selection of Reprogrammed SUMO-Binding Cradle Molecules
[0396] Using the SUMO-targeted library described above, four rounds of selection against hSUMO-1, hSUMO-2 and ySUMO were conducted. The enrichment ratio is defined as the number of phage recovered in the presence of target divided by the number recovered in the absence of target, and it generally reflects the number and affinity level of functional binders in the sorted phage population. After four rounds of selection, good enrichment ratios were observed for both ySUMO and hSUMO-1 (~20 and 50, respectively). Thirty-two random clones for each target were assayed for binding activity using phage ELISA, and 100% of clones tested positive for binding in the cases of ySUMO and hSUMO-1.
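The enrichment ratio defined above is a simple quotient, sketched below. The function name and the phage counts are hypothetical and chosen only for illustration; the text reports the resulting ratios, not the underlying titers.

```python
def enrichment_ratio(recovered_with_target, recovered_without_target):
    """Phage recovered with target divided by phage recovered without target."""
    if recovered_without_target <= 0:
        raise ValueError("background recovery must be positive")
    return recovered_with_target / recovered_without_target


# Hypothetical titers (not from the source), chosen for illustration only.
with_target = 5_000_000
without_target = 100_000
print(enrichment_ratio(with_target, without_target))
```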
[0397] Five random ySUMO clones and 10 random hSUMO-1 clones were expressed as soluble proteins and assessed for binding activity via surface plasmon resonance (SPR). Consistent with phage ELISA results, all clones produced binding signals. For ySUMO, Kd estimates ranged from 39 nM to 3.3 µM. Similarly, for hSUMO-1, Kd estimates ranged from 145 nM to 3.6 µM (Figure 12B). Thus, the SUMO-targeted library succeeded in producing functional cradle molecules to both ySUMO and hSUMO-1, and the library performed similarly against both of these targets.
Sequence Profiles of ySUMO and hSUMO-1 Binding Cradle Molecules
[0398] Sequencing revealed that all 10 SPR-tested hSUMO-1 cradle molecules contained mutations away from wild-type residues at position 33 in the FnIII scaffold, and all but one clone contained mutations at position 31. At position 73, the wild-type tyrosine was recovered ~50% of the time. FG loop sequences in hSUMO-1 monobodies all bore clear resemblance to ySMB-1, suggesting that the ySMB-1 binding mode was maintained in these monobodies as intended.
[0399] Interestingly, the wild-type scaffold residues were recovered in all SPR-tested ySUMO clones except for one, which contained a Y to F mutation at position 73 (Figures 12A, 12B). FG loop sequences of ySUMO cradle molecules also bore resemblance to ySMB-1, though they were somewhat more divergent than those of hSUMO-1 monobodies.
[0400] To further examine the sequence properties of ySUMO and hSUMO-1 binding cradle molecules, an additional 34 clones for hSUMO-1 and an additional 35 clones for ySUMO were sequenced. All of these clones tested positive for target binding by phage ELISA. Overall sequence profiles for both ySUMO and hSUMO-1 cradle molecules showed close relation to the ySMB-1 sequence and to each other, suggesting that, as designed, the cradle molecules from this library maintain a ySMB-1-like binding mode to both targets (Figures 12A, 12C). However, a sharp difference was observed between ySUMO and hSUMO-1 binding cradle molecules at beta strand residues. Out of 40 ySUMO binding clones, 39 contained wild-type residues at positions 31 and 33. Out of 44 hSUMO-1 binding clones, none contained the wild-type arginine at position 33 and most did not contain the wild-type tyrosine at position 31. Cradle molecules to both targets show a tendency toward tyrosine at position 73. The strong departure from wild-type beta strand residues in hSUMO-1 cradle molecules suggests that mutations in the FnIII beta strand are necessary to bind hSUMO-1 using a ySMB-1-like binding mode. Thus, modification of the beta strand residues can enhance the selectivity and/or affinity of a cradle molecule or cradle library for a target.
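The per-position comparison described above amounts to tallying residues across sequenced clones. The sketch below shows one way to do this; the wild-type identities (Y31, R33, Y73) come from the text, while the four toy clones and the function names are hypothetical and do not reproduce the actual sequences in Figure 12.

```python
from collections import Counter

# Wild-type scaffold residues named in the text.
WILD_TYPE = {31: "Y", 33: "R", 73: "Y"}

# Hypothetical clones (position -> residue); NOT the sequences from Figure 12.
TOY_CLONES = [
    {31: "H", 33: "A", 73: "Y"},
    {31: "H", 33: "E", 73: "Y"},
    {31: "Y", 33: "A", 73: "F"},
    {31: "H", 33: "E", 73: "Y"},
]


def residue_counts(clones, position):
    """Tally which residues occur at one scaffold position."""
    return Counter(clone[position] for clone in clones)


def wild_type_fraction(clones, position):
    """Fraction of clones that retain the wild-type residue at a position."""
    return residue_counts(clones, position)[WILD_TYPE[position]] / len(clones)


for pos in sorted(WILD_TYPE):
    print(pos, dict(residue_counts(TOY_CLONES, pos)), wild_type_fraction(TOY_CLONES, pos))
```

Applied to the real data, the same tally would give 39/40 wild-type clones at positions 31 and 33 for ySUMO binders and 0/44 at position 33 for hSUMO-1 binders, as stated above.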
[0401] The dominant amino acids at position 33 in hSUMO-1 cradle molecules were alanine and glutamic acid, representing a truncation and an inversion of charge, respectively, compared to the wild-type arginine (Figure 12C). In a modeled ySMB-1/hSUMO-1 structure, the wild-type arginine residue of the FnIII scaffold exhibits a potential steric and electrostatic clash with K23 of hSUMO-1 (Figure 13). This clash could be resolved by the observed mutations in the hSUMO-1-binding cradle molecules. An explanation for the strong preference for histidine over the wild-type tyrosine at position 31 in the FnIII beta strand is not clear from the modeled structure.
[0402] In contrast to the beta strand residues, the FG loop sequences of cradle molecules recovered against ySUMO and hSUMO-1 exhibit similar amino acid preferences at most positions (75, 76, 78, 80, 81 and 84). This similarity is consistent with FG loop residues contacting predominantly those positions that are conserved between ySUMO and hSUMO-1 (Fig. 11A). Interestingly, at positions 80 and 81, hSUMO-1 cradle molecules show more pronounced preferences than ySUMO cradle molecules, suggesting stronger selective pressure at these positions in the hSUMO-1 interface. Also, at position 79, cradle molecules to hSUMO-1 have a significant preference for aspartate (Fig. 12C).
[0403] Based on the modeled structure of the ySMB-1/hSUMO-1 complex, positions 79 and 80 in hSUMO-1 cradle molecules would be close to two lysine residues of hSUMO-1, one of which is an arginine in ySUMO (Figure 12A). In ySMB-1, Y79 forms a stacking interaction with R47 of ySUMO (Figure 11A). The lysine residue at position 47 in hSUMO-1 may be better accommodated by aspartate. The basic residues in this region of hSUMO-1 normally interact with a conserved acidic stretch in SIMs, and the "DD" motif in hSUMO-1 cradle molecules marks a shift toward a more SIM-like sequence. Interestingly, these two acidic residues are not strongly conserved in ySUMO cradle molecules, suggesting that reliance on these contacts for binding may not be as strong. hSUMO-1 cradle molecules exhibit a significant preference for isoleucine over leucine at the "anchor" position in the SIM interface. In hSUMO-1, a core residue in the SIM binding surface, I39 (ySUMO), is truncated to valine (Figure 12A). This mutation results in a deeper pocket at the "anchor" position (Chupreta, et al., supra. (2005)), which may explain a strong preference for a bulkier side chain in hSUMO-1 cradle molecules.
[0404] Despite some differences, FG loop sequence preferences are similar overall in cradle molecules to ySUMO and hSUMO-1, suggesting that similar FG loop sequences can effectively mediate binding to both targets. In one extreme example of this similarity, one pair of cradle molecules (one to ySUMO and one to hSUMO-1) have FG loop sequences that differ by only one amino acid, at position 84. The ySUMO cradle molecule contained alanine at this position while the hSUMO-1 cradle molecule contained serine (Figure 14A). Since alanine occurs at this position in most hSUMO-1 binders (Figure 12C), it is highly unlikely that the alanine mutation in the ySUMO cradle molecule would significantly alter binding activity to hSUMO-1. Interestingly, beta strand residues in these two cradle molecules are different, suggesting that in this instance beta strand residues alone may dictate which target these cradle molecules bind.
[0405] Binding of these two clones to ySUMO and hSUMO-1 was assessed by phage ELISA. The hSUMO-1 cradle molecule bound to both hSUMO-1 and ySUMO. However, as expected, the ySUMO cradle molecule, which contained wild-type FnIII scaffold residues, bound only to ySUMO (Figure 14B). These results show that an effectively identical FG loop can be used to recognize both ySUMO and hSUMO-1, but mutations in the FnIII scaffold are necessary to bind hSUMO-1. These results also provide clear evidence supporting the beta strand-based mechanism for specificity in cradle molecules.
Specificity of Selected Cradle Molecules
[0406] Because the SUMO-targeted library was designed based on the binding mode of a cradle molecule to ySUMO, it is quite possible that ySUMO binding activity is maintained in the recovered hSUMO-1 cradle molecules. To examine this, the binding of hSUMO-1 cradle molecules to ySUMO was assessed using phage ELISA; cross-reactivity with hSUMO-2 was also tested. No hSUMO-1 cradle molecules showed binding to hSUMO-2; however, approximately 50% of hSUMO-1 cradle molecules showed significant binding to ySUMO (Figure 14C). Interestingly, sequence analysis revealed no obvious differences in amino acid preferences between hSUMO-1-specific and cross-reactive cradle molecules (Figure 14D). These results suggest different origins of specificity in different cradle molecules and that specific binding to hSUMO-1 likely requires multiple and varying mutations that exploit subtle differences in the amino acid preferences of ySUMO and hSUMO-1 in the binding interface.
[0407] An alternative explanation for the similar sequences of specific and non-specific cradle molecules is that phage ELISA using GST-fusions of target proteins can produce strong binding signals even for weak interactions. The low resolution of affinity in this assay may produce false positives for cross-reactivity. If many of the hSUMO-1 binding cradle molecules classified as cross-reactive are actually specific, this could explain the similarity in sequence profiles between these two groups. Notably, the phage ELISA data are unlikely to produce false negatives for cross-reactivity, since even weak binding produces a significant signal. Thus, classification of hSUMO-1 cradle molecules as specific is likely to be accurate. However, Kd measurements for these cradle molecules are necessary to thoroughly and quantitatively assess cross-reactivity.
Diverse FnIII Domain Variants Recognize the SIM-Binding Site of ySUMO and Discriminate ySUMO from hSUMOs
[0408] To understand how FnIII domain variants recognize ySUMO, the epitopes of two of the highest affinity ySUMO-binding FnIII domain variants, ySMB-1 and ySMB-2 (Figures 16C, D and 17), were mapped using NMR chemical shift perturbation. Despite distinct amino acid sequences in their variable loops (Figure 16C), both FnIII domain variants bound to similar epitopes centered on the SIM binding site (Figure 16E). Binding of 33 other ySUMO FnIII domain variants was inhibited by ySMB-1, indicating that they too bound to the SIM-binding site (Figure 18). Like ySMB-1, most ySUMO-binding FnIII domain variants have polyserine sequences in the BC and DE loops that originate from incomplete mutagenesis of the template vector in library construction, suggesting that these loops do not contribute to binding (Figures 16C and 17). Furthermore, many of these FnIII domain variants have an 11-residue FG loop with a centrally located acidic residue and flanking aromatic and hydrophobic residues (Figures 16C and 17). Together, these results suggest that essentially all the ySUMO-binding FnIII domain variants recognize the SIM-binding site using a similar mode of interaction.
[0409] Most ySUMO-binding FnIII domain variants exhibited negligible levels of binding to hSUMO1 or hSUMO2 in phage ELISA assays (Figure 19A). Such high selectivity was unexpected, because the SIM-binding site is the most highly conserved surface between ySUMO and hSUMO proteins (Figure 16A). SPR measurements showed that ySMB-1 (selective for ySUMO in ELISA) bound to ySUMO with an 82 nM Kd and to hSUMO1 with a ~54 µM Kd and exhibited no detectable binding to hSUMO2 (Figure 21B), discriminating ySUMO from hSUMOs by more than 600-fold in affinity. ySMB-9 (non-selective in ELISA) bound to all three SUMO proteins. Although ySMB-9 bound to hSUMO1 with higher affinity (68 nM Kd) than either ySUMO or hSUMO2 (Figure 19B), it discriminated hSUMO2 by only ~70-fold, making it about 10-fold less selective than ySMB-1. Notably, ySMB-9 does not have polyserine BC and DE loops like most other ySUMO-binding FnIII domain variants, and it also has a significantly shorter FG loop (Figures 16B and 17). Although competition data suggested that ySMB-9 binds to the SIM-binding site (Figure 17), its distinct sequence features suggest that it employs a different mode of interaction than most ySUMO-binding FnIII domain variants, leading to its lower specificity. Together, these findings demonstrate that the binding mode of most ySUMO-binding FnIII domain variants is particularly effective in discriminating ySUMO and hSUMOs despite binding to the highly conserved SIM-binding site. Thus, it was expected that generating FnIII domain variants that bind to hSUMOs in a mode similar to the ySUMO-binding FnIII domain variants would yield clones with higher isoform selectivity toward hSUMOs.
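The fold-selectivity figures quoted above follow from the ratio of the two dissociation constants. A minimal sketch, assuming selectivity is defined as Kd(off-target) / Kd(target) with both values in nM; the function name is hypothetical.

```python
def fold_selectivity(kd_off_target_nM, kd_target_nM):
    """How many times weaker the off-target interaction is (ratio of Kd values)."""
    if kd_target_nM <= 0:
        raise ValueError("Kd must be positive")
    return kd_off_target_nM / kd_target_nM


# ySMB-1 values quoted in the text: 82 nM for ySUMO, ~54 uM (54,000 nM) for hSUMO1.
ysmb1_fold = fold_selectivity(54_000, 82)
print(f"ySMB-1 discriminates ySUMO from hSUMO1 by about {ysmb1_fold:.0f}-fold")

# Comparison with the ~70-fold discrimination quoted for ySMB-9.
print(f"ySMB-1 is about {ysmb1_fold / 70:.1f} times more selective than ySMB-9")
```

The first ratio is consistent with the "more than 600-fold" statement, and the second with ySMB-9 being roughly 10-fold less selective.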
Crystal Structure of the ySMB-1/ySUMO Complex.
[0410] To understand the structural basis for the isoform-selective recognition of the SIM-binding site, the crystal structure of ySMB-1 in complex with ySUMO was determined at 2.4 Å resolution (structural statistics in Table 7). Consistent with the NMR epitope mapping data, ySMB-1 bound to the SIM binding site (Figures 20A and 16E). The FnIII domain variant formed the binding surface using a single variable loop (FG loop) and residues from the invariant FnIII beta strands (Figure 20A). As inferred from their polyserine sequences, the BC and DE loops of ySMB-1 were not involved in direct contacts with ySUMO.
[0411] Residues 78-85 of the ySMB-1 FG loop form a beta hairpin that provides 84% of the FnIII domain variant binding surface, with non-loop residues contributing the remainder (Figures 20B and 21A). The edge of this hairpin docks along the hydrophobic center of the SIM-binding site, forming an intermolecular beta sheet with ySUMO and closely mimicking the interaction mode of SIMs (Figures 20B, C) (Kerscher, supra, 2007; Reverter, D., and Lima, C. D., Nature (2005) 435:687-692; Song, et al., J. Biological Chem. (2005) 280:40122-40129). SIMs generally contain a stretch of hydrophobic residues flanked by a stretch of acidic residues, e.g., DVLIVY (SEQ ID NO:296) in RanBP2 and TLDIVD (SEQ ID NO:294) in PIASx (Song, et al., supra, 2004; Li, et al., supra, 2010; Minty, et al., J. Biological Chem. (2000) 275:36316-36323). In ySMB-1, this motif is mimicked by the FG loop sequence DLYYSY (SEQ ID NO:295) (residues 80-85) (Figures 16C, and 20B, C). D80 of the FnIII domain variant aligns with the "top" basic portion of the SIM binding site in a similar orientation as a conserved acidic stretch in SIMs (Kerscher, supra, 2007; Song, et al., supra, 2005), and Tyr residues line the hydrophobic tract where aliphatic residues are usually found in SIMs.
[0412] The crystal structure suggests a structural basis for the isoform selectivity of ySUMO-binding FnIII domain variants and for the difficulties in generating FnIII domain variants to hSUMOs. Only five of the sixteen residues in the ySMB-1 epitope are poorly conserved between ySUMO and hSUMOs (positions 25, 34, 36, 50 and 54) (Figure 10B, bottom). Three of these residues (N25, E34 and F36) form a cluster at one side of the interface that is highly buried, comprising 23% (147 Å²) of the total ySUMO surface buried by the FnIII domain variant (Figures 20B and 22A). hSUMO1 contains N25K and F36H. hSUMO2 contains E34V and F36Q. Thus, any FnIII domain variant that forms an interface similar to ySMB-1 is not likely to bind tightly to hSUMO1 or hSUMO2/3. Notably, this cluster is contacted in large part by scaffold residues in ySMB-1 (Y31, R33 and Y73) (Figures 20C and 22A). Because these beta strand residues were not varied in the library and are anchored in a conformationally rigid beta sheet, non-conservative substitutions in the cluster in hSUMOs could not have been accommodated, making the generation of ySMB-1-like FnIII domain variants for hSUMOs impossible. These structural restraints would eliminate a potentially very large number of ySMB-1-like FnIII domain variants that have an FG loop otherwise capable of binding to hSUMOs. Thus, these observations strongly suggest that residues within the FnIII beta strands serve as both positive design elements favoring ySUMO binding and negative design elements disfavoring binding to hSUMOs.
Table 7 Crystallographic Information And Refinement Statistics For The Structure Of The ySMB-1/ySUMO Complex (PDB ID: 3QHT)
Data Collection*
Beamline: APS 21-ID-F
Space Group: P21212
Cell Parameters: a = 59.64 Å, b = 175.46 Å, c = 52.83 Å; α = β = γ = 90°
Wavelength: 0.97872 Å
Resolution: 50.00-2.40 Å (2.49-2.40 Å)
Unique Reflections: 22,586
Rmerge†: 0.085 (0.643)
Completeness: 100.0% (99.5%)
Redundancy: 7.1 (6.6)
I/σ(I): 18.9 (2.2)
Refinement Statistics
Resolution Range: 20.00-2.40 Å (2.46-2.40 Å)
Unique Reflections, Working Set: 21,341
Unique Reflections, Free Set: 1,151
R‡: 0.223
Rfree§: 0.272
Overall Mean B Values: 49.82 Å²
Number of Amino Acid Residues: 338
Number of Water Molecules: 85
Matthews Coefficient: 3.20 (Water Content 61.6%)
RMSD From Ideal Values, Bonds/Angle: 0.02 Å / 1.9°
Estimated Overall Coordinate Error Based on Maximum Likelihood: 0.2 Å
Estimated Overall Error for B Values Based on Maximum Likelihood: 14.4 Å²
Ramachandran Plot Statistics
Residues in Most Favored Regions: 87.8% (258)
Residues in Additionally Allowed Regions: 9.2% (27)
Residues in Generously Allowed Regions: 1.7% (5)
Residues in Disallowed Regions: 1.4% (4)
*Values for highest resolution shell shown in parentheses.
†$R_{\mathrm{merge}} = \sum_{hkl}\sum_{i} |I(hkl)_i - \langle I(hkl)\rangle| \,/\, \sum_{hkl}\sum_{i} \langle I(hkl)_i\rangle$ over i observations of a reflection hkl.
‡$R = \sum \big|\,|F(\mathrm{obs})| - |F(\mathrm{calc})|\,\big| \,/\, \sum |F(\mathrm{obs})|$.
§Rfree is R with 5% of reflections sequestered before refinement.
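The water content listed with the Matthews coefficient in Table 7 can be cross-checked with the commonly used approximation that the solvent fraction is 1 - 1.23/V_M, where V_M is in Å³/Da. This is a sketch under that assumption (the constant 1.23 corresponds to a typical protein partial specific volume); the patent does not state how the reported value was computed, and the function name is hypothetical.

```python
def solvent_fraction(matthews_coefficient, protein_term=1.23):
    """Approximate solvent fraction from V_M (A^3/Da) as 1 - 1.23 / V_M."""
    if matthews_coefficient <= 0:
        raise ValueError("Matthews coefficient must be positive")
    return 1.0 - protein_term / matthews_coefficient


# Matthews coefficient from Table 7.
print(f"{solvent_fraction(3.20):.1%}")  # 61.6%
```

The result agrees with the 61.6% water content reported in the table.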
Table 8 Interface Statistics For FnIII Domain Variant ySMB-1 And SIM Peptides
Columns: ySMB-1; Average SIM Peptide*
Buried Surface: 670 Å²
Sc Value: 0.72; 0.77 ± 0.02
% Neutral and Non-Polar Atoms in Interface: 64
*Values reported are the average for 5 SUMO/SIM complexes (PDB IDs 1WYW, 1Z5S, 2ASQ, 2KQS and 2RPQ). The standard deviations in these values across all five complexes are given. Buried surface and % composition values were calculated using the PROTORP server (Reynolds, et al., Bioinformatics (Oxford, England) (2009) 25:413-414). Sc values were calculated using the sc program in the CCP4 suite (The CCP4 Suite, Acta Cryst. (1994) D50:760-763; Lawrence and Colman, J. Molec. Biol. (1993) 234:946-950).
Structure-Guided Design of a SUMO-Targeted Cradle Library.
domain variant (Figures 20B and 22A). hSUMO1 contains N25K and F36H. hSUMO2 contains E34V and F36Q. Thus, any FnIII domain variant that forms an interface similar to that of ySMB-1 is not likely to bind tightly to hSUMO1 or hSUMO2/3. Notably, this cluster is contacted in large part by scaffold residues in ySMB-1 (Y31, R33 and Y73) (Figures 20C and 22A). Because these beta strand residues were not varied in the library and are anchored in a conformationally rigid beta sheet, non-conservative substitutions in the cluster in hSUMOs could not have been accommodated, making the generation of ySMB-1-like FnIII domain variants for hSUMOs impossible. These structural restraints would eliminate a potentially very large number of ySMB-1-like FnIII domain variants that have an FG loop otherwise capable of binding to hSUMOs. Thus, these observations strongly suggest that residues within the FnIII beta strands serve as both positive design elements favoring ySUMO binding and negative design elements disfavoring binding to hSUMOs.
Table 7 Crystallographic Information And Refinement Statistics For The Structure Of The ySMB-1/ySUMO Complex (PDB ID: 3QHT)

Data Collection*
Beamline: APS 21-ID-F
Space Group: P2₁2₁2
Cell Parameters: a = 59.64 Å, b = 175.46 Å, c = 52.83 Å; α = β = γ = 90°
Wavelength: 0.97872 Å
Resolution: 50.00 - 2.40 Å (2.49 - 2.40 Å)
Unique Reflections: 22,586
Rmerge†: 0.085 (0.643)
Completeness: 100.0% (99.5%)
Redundancy: 7.1 (6.6)
I/σ(I): 18.9 (2.2)

Refinement Statistics
Resolution Range: 20.00 - 2.40 Å (2.46 - 2.40 Å)
Unique Reflections, Working Set: 21,341
Unique Reflections, Free Set: 1,151
R‡: 0.223
Rfree§: 0.272
Overall Mean B Values: 49.82 Å²
Number of Amino Acid Residues: 338
Number of Water Molecules: 85
Matthews Coefficient: 3.20 (Water Content 61.6%)
RMSD From Ideal Values, Bonds/Angles: 0.02 Å / 1.9°
Estimated Overall Coordinate Error Based on Maximum Likelihood: 0.2 Å
Estimated Overall Error for B Values Based on Maximum Likelihood: 14.4 Å²

Ramachandran Plot Statistics
Residues in Most Favored Regions: 87.8% (258)
Residues in Additionally Allowed Regions: 9.2% (27)
Residues in Generously Allowed Regions: 1.7% (5)
Residues in Disallowed Regions: 1.4% (4)

*Values for the highest resolution shell are shown in parentheses.
†$R_{merge} = \sum_{hkl}\sum_{i} |I(hkl)_i - \langle I(hkl)\rangle| \,/\, \sum_{hkl}\sum_{i} \langle I(hkl)_i\rangle$ over i observations of a reflection hkl.
‡$R = \sum \big|\,|F(obs)| - |F(calc)|\,\big| \,/\, \sum |F(obs)|$.
§Rfree is R with 5% of reflections sequestered before refinement.
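Several entries in Table 7 can be cross-checked against one another with simple arithmetic. The following is a minimal sketch, not part of the original analysis: it assumes the standard Matthews approximation (solvent fraction = 1 - 1.23/V_M) and uses hypothetical function names chosen only for illustration.

```python
# Values transcribed from Table 7.
MATTHEWS_COEFFICIENT = 3.20  # in cubic angstroms per dalton
WORKING_SET = 21_341
FREE_SET = 1_151
RAMACHANDRAN_COUNTS = {
    "most favored": 258,
    "additionally allowed": 27,
    "generously allowed": 5,
    "disallowed": 4,
}


def solvent_fraction(vm, protein_constant=1.23):
    """Matthews approximation: fraction of the crystal volume that is solvent."""
    return 1.0 - protein_constant / vm


def free_set_fraction(working, free):
    """Fraction of refinement reflections sequestered for Rfree."""
    return free / (working + free)


def ramachandran_percentages(counts):
    """Convert residue counts per region into percentages of the total."""
    total = sum(counts.values())
    return {region: round(100.0 * n / total, 1) for region, n in counts.items()}


if __name__ == "__main__":
    print(f"solvent content: {100 * solvent_fraction(MATTHEWS_COEFFICIENT):.1f}%")
    print(f"free set: {100 * free_set_fraction(WORKING_SET, FREE_SET):.1f}%")
    print(ramachandran_percentages(RAMACHANDRAN_COUNTS))
```

Under these assumptions the computed solvent content, free-set fraction and Ramachandran percentages agree with the tabulated 61.6%, 5% and 87.8/9.2/1.7/1.4% values.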
Table 8 Interface Statistics For FnIII Domain Variant ySMB-1 And SIM Peptides

                                              ySMB-1    Average SIM Peptide*
Buried Surface                                670 Å²
Sc Value                                      0.72      0.77 ± 0.02
% Neutral and Non-Polar Atoms in Interface    64

*Values reported are the average for 5 SUMO/SIM complexes (PDB IDs 1WYW, 1Z5S, 2ASQ, 2KQS and 2RPQ). The standard deviations in these values across all five complexes are given. Buried surface and % composition values were calculated using the PROTORP server (Reynolds, et al., Bioinformatics (Oxford, England) (2009) 25:413-414). Sc values were calculated using the sc program in the CCP4 suite (The CCP4 Suite, Acta Cryst (1994) D50:760-763; Lawrence and Colman, J. Molec. Biol. (1993) 234:946-950).
Structure-Guided Design of a SUMO-Targeted Cradle Library.
[0413] Based on the theory that the binding mode of ySMB-1 could be used as a template for designing isoform-specific cradle inhibitors of hSUMO/SIM interactions, a library was designed to "reprogram" ySMB-1 for binding to hSUMO proteins. Amino acid diversity was introduced at each ySMB-1 paratope position; this diversity included the wild-type amino acid and other amino acid types that might allow effective complementation of any of the three SUMO proteins (Figure 22A) (SI Methods). Notably, this library included diversity at previously invariant beta strand positions that participated in ySUMO binding. The number of independent clones in the constructed phage-display library was 2.0 x 10^9, giving reasonable coverage of the theoretical size of the design (1.6 x 10^11).
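The relationship between the number of independent clones and the theoretical design size can be made concrete with a short calculation. This is an illustrative sketch only: the Poisson estimate assumes every designed sequence is equally likely to be sampled, which is an assumption not stated in the text, and the function names are hypothetical.

```python
import math

N_CLONES = 2.0e9           # independent clones in the constructed library
THEORETICAL_SIZE = 1.6e11  # theoretical size of the design


def clones_per_variant(n_clones, theoretical_size):
    """Average number of clones per designed sequence."""
    return n_clones / theoretical_size


def expected_fraction_sampled(n_clones, theoretical_size):
    """Expected fraction of distinct designed sequences present at least once,
    assuming uniform, independent sampling (Poisson approximation)."""
    return 1.0 - math.exp(-n_clones / theoretical_size)


if __name__ == "__main__":
    ratio = clones_per_variant(N_CLONES, THEORETICAL_SIZE)
    sampled = expected_fraction_sampled(N_CLONES, THEORETICAL_SIZE)
    print(f"clones per designed variant: {ratio:.4f}")
    print(f"expected fraction of design sampled: {sampled:.2%}")
```

Under the uniform-sampling assumption, roughly 1% of the designed sequence space is expected to be physically represented in the library.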
Selection of Cradle Molecules from the SUMO-Targeted Cradle Library.
[0414] After four rounds of library sorting against hSUMO1, hSUMO2, and ySUMO, randomly chosen clones for each target were assayed for binding activity using phage ELISA. All clones tested positive for binding in the cases of ySUMO and hSUMO1 but none bound to hSUMO2. Five ySUMO-binding and 10 hSUMO1-binding cradle molecules were expressed as soluble proteins and assessed using SPR, all of which produced binding signals (Figure 22B), consistent with phage ELISA results. For ySUMO, the cradle molecules exhibited Kd values similar to those of FnIII domain variants from the previous naïve library (39 nM to 3.3 µM) (Figures 17, 23 and 16C). For hSUMO1, Kd estimates ranged from 118 nM to 3.6 µM (Figure 22B). Thus, unlike the original library, the SUMO-targeted library readily produced cradle molecules with good affinity to both ySUMO and hSUMO1.
[0415] NMR chemical shift perturbation assays validated that a newly generated hSUMO1-binding cradle molecule, hS1MB-4, targeted the SIM-binding site (Figure 22C). Binding of 15 other hSUMO1-binding cradle molecules was inhibited by hS1MB-4 as tested in ELISA, strongly suggesting that all these hSUMO1-binding cradle molecules targeted the SIM-binding site as intended (Figure 24).
[0416] The amino acid sequences of 44 hSUMO1-binding clones and 40 ySUMO-binding clones revealed that cradle molecules to both targets contained FG loop sequences highly similar to that of ySMB-1 (Figure 22D), suggesting that a ySMB-1-like binding mode was maintained in these cradle molecules and that ySMB-1-like FG loop sequences are effective for binding to both ySUMO and hSUMO1. In contrast, beta strand residues were sharply different in cradle molecules to the two targets (Figure 22D). The wild-type ySMB-1 beta strand residues were highly conserved among ySUMO-binding cradle molecules, but in hSUMO1-binding cradle molecules the wild-type amino acid was never recovered at position 33 and only infrequently recovered at position 31. These results strongly support the inventor's position that isoform selectivity in ySUMO-binding cradle molecules arises from contacts made by the non-loop regions of the cradle scaffold. Consistent with this mechanism, in a pair of cradle molecules with nearly identical FG loop sequences, hS1MB-22 and ySMB-ST6, it was observed that ySMB-ST6, containing the wild-type scaffold residues, bound only to ySUMO, while hS1MB-22, containing altered scaffold residues, bound to both ySUMO and hSUMO1 (Figures 22A and 22E). Taken together, these results illustrate the importance of altering non-loop residues in the FnIII domain in order to facilitate binding to hSUMO1.
[0417] Modeling a ySMB-1 interface with hSUMO1 provides a clear rationale for the observed mutations at non-loop residues in the hSUMO1-binding cradle molecules. N25K and F36H substitutions in hSUMO1 with respect to ySUMO result in a likely electrostatic and steric clash between R33 of the FnIII domain and K25 of hSUMO1 as well as a loss of a close, edge-plane aromatic interaction between Y73 of the FnIII domain and F36 in ySUMO (Figure 22F). Notably, the most favored amino acid types at position 33 in hSUMO1-binding cradle molecules were Ala and Glu, either of which should resolve a clash with K25, supporting this molecular mechanism for binding specificity.
hSUMO1-Binding Cradle Molecules Are Isoform Specific.
[0418] hSUMO1-binding cradle molecules had varied ability to discriminate between hSUMO1 and ySUMO as assessed by phage ELISA (Figures 25A and 26A). There were several clones (e.g., hS1MB-7, 16 and 23) that showed no detectable binding to ySUMO, representing at least 100-fold weaker binding to ySUMO than to hSUMO1 (Figures 25A and 26A). The difference in the affinity of hS1MB-4 to ySUMO and hSUMO1, as measured by SPR, was ~20-fold, in reasonable agreement with the ~10-fold difference observed in the phage ELISA experiment (Figures 25B and 26B). No distinct features were evident between the sequences of clones that did and did not discriminate ySUMO (Figure 26B), suggesting that the mechanism of ySUMO/hSUMO1 discrimination is complex, likely involving several positions, and varied across different clones. As expected from the failure of our library to generate cradle molecules to hSUMO2, the hSUMO1-binding cradle molecules showed no measurable binding to hSUMO2 in phage ELISA (Figures 25A and 26A), and the affinity of hS1MB-4 to hSUMO2 determined by SPR was very weak (Kd = 43 µM; Figure 25B), corresponding to 360-fold discrimination between hSUMO1 and hSUMO2. Taken together, these data demonstrate that the SUMO-targeted cradle library has the capacity to generate diverse cradle molecules that have high affinity and high specificity to hSUMO1.
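The fold discrimination quoted above is a ratio of dissociation constants. The sketch below reproduces that arithmetic under one explicit assumption: the Kd of hS1MB-4 for hSUMO1 is taken to be 118 nM, the low end of the reported range, because the text does not state the clone-specific value here. The free-energy conversion additionally assumes 25 °C. Function names are illustrative only.

```python
import math

KD_HSUMO2_M = 43e-6   # hS1MB-4 binding to hSUMO2, from the text
KD_HSUMO1_M = 118e-9  # ASSUMPTION: low end of the reported hSUMO1 Kd range

GAS_CONSTANT_KCAL = 1.987e-3  # kcal / (mol K)
TEMPERATURE_K = 298.15        # ASSUMPTION: 25 degrees C


def fold_discrimination(kd_off_target, kd_target):
    """Ratio of off-target to on-target dissociation constants."""
    return kd_off_target / kd_target


def delta_delta_g(kd_off_target, kd_target, temperature=TEMPERATURE_K):
    """Difference in binding free energy (kcal/mol) implied by the Kd ratio."""
    ratio = fold_discrimination(kd_off_target, kd_target)
    return GAS_CONSTANT_KCAL * temperature * math.log(ratio)


if __name__ == "__main__":
    fold = fold_discrimination(KD_HSUMO2_M, KD_HSUMO1_M)
    ddg = delta_delta_g(KD_HSUMO2_M, KD_HSUMO1_M)
    print(f"fold discrimination: {fold:.0f}")
    print(f"ddG: {ddg:.2f} kcal/mol")
```

With these inputs the ratio is about 364, consistent with the roughly 360-fold figure in the text.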
New Cradle Molecules Inhibit the SUMO1/SIM Interaction and SUMO1 Conjugation.
[0419] To investigate the potential utility of hSUMO1-specific cradle molecules as tools for studying SUMO biology, their effects on three major processes (SUMO/SIM interactions, SUMOylation, and deSUMOylation) were examined. hS1MB-4 completely inhibited the SIM-mediated interaction between SUMO1-RanGAP and RanBP2 (Johnson, supra, 2004; Mahajan, et al., Cell (1997) 88:97-107; Matunis, et al., J. Cell Biol. (1996) 135:1457-1470) in a dose-dependent manner (Figure 27A), further validating that these cradle molecules bind to the SIM-binding site as intended and demonstrating their efficacy as inhibitors of SUMO/SIM interactions.
[0420] The effects of cradle molecules on SUMOylation were then examined by monitoring the in vitro formation of covalent complexes between SUMOs and the SUMO E1-activating (SAE1/SAE2) and E2-conjugating (Ubc9) enzymes of the SUMO conjugation cascade (Figure 27B). In this assay, both hSUMO1 and hSUMO3 were present as substrates, enabling the direct assessment of the isoform specificity of the cradle molecules. In the absence of a cradle molecule or in the presence of the ySUMO-specific ySMB-1 cradle molecule, E1 and E2 were conjugated with both hSUMO1 and hSUMO3 (Figure 27B, lanes 1 and 2). In contrast, in the presence of either hS1MB-4 or hS1MB-5, conjugation of hSUMO1 was inhibited at the E1-dependent step, while hSUMO3 conjugation was enhanced (Figure 27B, lanes 3-8). Because hSUMO1 and hSUMO3 compete for the same E1-activating enzyme, the enhancement of hSUMO3 conjugation is most likely because hSUMO1 was effectively eliminated as a competitor and thus the E1 enzyme was more available to hSUMO3. The potent inhibition of hSUMO1 conjugation by the cradle molecules was remarkable, because a SIM-based peptide inhibitor did not inhibit this process (Li, et al., supra, 2010).
[0421] Superposition of the ySMB-1/ySUMO complex structure with the crystal structure of the E1/hSUMO1 complex (Olsen, et al., supra, 2010) suggests that a cradle molecule binding to hSUMO1 in a manner similar to ySMB-1 would not cause steric clashes with the structurally well-defined regions of E1. Rather, the cradle molecule would be positioned in the trajectory of a long disordered loop in the SAE1 subunit (residues ~175-205) (Figure 28). As a result, it is contemplated that steric clashes between the cradle molecule and the SAE1 loop prevent binding of a cradle molecule/hSUMO1 complex to E1, thus inhibiting SUMOylation at the E1-dependent step. The previously reported inhibitor based on a SIM peptide is much smaller and would not likely cause such a steric hindrance, explaining why it did not inhibit SUMOylation (Li, et al., supra, 2010). hS1MB-4 was significantly more effective than hS1MB-5 in inhibiting SUMOylation (Figure 27B), although their Kd values for hSUMO1 differ by only ~2-fold and their sizes are essentially identical (Figure 22B). This difference in inhibition efficacy could be explained by subtle variations in the spatial arrangement of the two cradle molecules when bound to hSUMO1, consistent with the proposed mechanism.
[0422] Neither hS1MB-4 nor hS1MB-5 affected deSUMOylation as assayed in vitro by monitoring SENP1 cleavage (Tatham and Hay, Methods Mol. Biol. (2009) 497:253-268) at the hSUMO1 C-terminal di-glycine sequence (Figure 29). Superposition of the ySMB-1/ySUMO structure with the structure of hSUMO1 bound to SENP1 (Shen, et al., Nat. Struct. Mol. Biol. (2006) 13:1069-1077) suggests no apparent clashes between the cradle molecule and the protease, and that a cradle molecule binding similarly to ySMB-1 would not inhibit the SENP1/hSUMO1 interaction.
Materials and Methods
[0423] Phage Display Library Construction. The SUMO-targeted phage display library was prepared as previously described (Koide, A., et al., supra, 2007) incorporating recent optimizations (Wojcik, et al., supra, 2010). The library was created using a "shaved" template containing polyserine sequences in the BC, DE, and FG loops. Amino acid diversity was introduced at FG loop and scaffold positions using degenerate codons as indicated in Figure 11B and high-efficiency Kunkel mutagenesis.
[0424] Phage Display Selection. For use in selection, ySUMO, hSUMO-1 and hSUMO-2 were expressed as C-terminal fusions to an engineered GST (glutathione-S-transferase) variant devoid of cysteine residues (C to S mutations) except for a single cysteine near the N-terminus. This was accomplished by cloning the genes into a previously reported vector (Wojcik, et al., supra, 2010). In the case of hSUMO-1 a C52A mutant was used and in the case of hSUMO-2 a C47S mutant was used. The GST fusion targets were modified with a redox-cleavable biotin moiety using EZ-link Biotin HPDP (Pierce). For phage amplification, E. coli XL1-Blue cells transformed with the LacI-containing plasmid pMCSG21 (termed XL21 cells) were used to maintain transcriptional silence until IPTG addition. Phage displaying cradle molecules were prepared by growing XL21 cells transfected with the phagemid library in the presence of 0.2 mM IPTG and helper phage K07 (Koide, A., and Koide, S., supra, 2007; Sidhu, S. S., et al., Methods Enzymol. (2000) 328:333-363). In the first round of library selection, 50 nM biotinylated GST-target was mixed with a sufficient amount of streptavidin-conjugated magnetic beads (Streptavidin MagneSphere Paramagnetic Particles; Promega, Z5481/2) in TBS (50 mM Tris-HCl buffer, pH 7.5, 150 mM NaCl) containing 0.05% Tween 20 (TBST). Beads were blocked with a 5 µM solution of biotin in TBST. To this target solution, 10^11 to 10^12 phage suspended in 0.5 ml TBST + 0.5% BSA were added, and the solution was mixed and incubated for 15 min at room temperature. After washing the beads twice with TBST, the bead suspension containing bound phage was added to a fresh XL21 culture. Phages were amplified as described before (Sidhu, et al., supra, 2000). In the second round, phage were preincubated in TBST + 0.5% BSA with 500 nM unbiotinylated GST competitor to remove GST binders from the population. Target-binding phages were then captured by streptavidin-conjugated magnetic beads loaded with 10 nM GST-target. Phages bound to the target protein were eluted from the beads by cleaving the linker within the biotinylation reagent with 100 mM DTT in 50 mM Tris, pH 8.0. The phagemids were washed and recovered as described above. After amplification, the third and fourth rounds of selection were performed using 1 nM and 0.1 nM target, respectively.
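The four-round selection described above tightens stringency by lowering the target concentration each round. The following sketch simply encodes the stated concentrations as data and computes the step-wise and overall reduction; the dictionary layout and function names are hypothetical, and `None` marks conditions the text does not state for a given round.

```python
# Target and competitor concentrations per round, as stated in the text.
# None means the condition is not specified in the text for that round.
SELECTION_ROUNDS = [
    {"round": 1, "target_nM": 50.0, "gst_competitor_nM": None},
    {"round": 2, "target_nM": 10.0, "gst_competitor_nM": 500.0},
    {"round": 3, "target_nM": 1.0, "gst_competitor_nM": None},
    {"round": 4, "target_nM": 0.1, "gst_competitor_nM": None},
]


def stepwise_fold_reduction(rounds):
    """Fold decrease in target concentration between consecutive rounds."""
    concentrations = [r["target_nM"] for r in rounds]
    return [prev / nxt for prev, nxt in zip(concentrations, concentrations[1:])]


def overall_fold_reduction(rounds):
    """Fold decrease in target concentration from the first to the last round."""
    return rounds[0]["target_nM"] / rounds[-1]["target_nM"]


if __name__ == "__main__":
    print("step-wise fold reduction:", stepwise_fold_reduction(SELECTION_ROUNDS))
    print("overall fold reduction:", overall_fold_reduction(SELECTION_ROUNDS))
```

The target concentration drops 5-fold, then 10-fold twice, for an overall 500-fold reduction between the first and fourth rounds.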
[0425] Protein Expression and Purification. GST-fusion proteins were produced by cloning genes into a previously described vector (Wojcik, et al., supra, 2010). All other proteins were expressed by cloning genes into the pHFT2 vector. pHFT2 is a pHFT1 derivative containing a 10-His tag instead of 6-His. Unless otherwise noted, all proteins were expressed by growing BL21(DE3) cells harboring the appropriate pHFT2 vector in ZYP-5052 autoinduction media according to the methods of Studier, et al., Protein Express. Purif. (2005) 41:207-234. Proteins were purified using Ni-Sepharose columns (GE Healthcare), or His-Mag magnetic particles (Novagen) in conjunction with a Kingfisher instrument (Thermo).
[0426] Surface Plasmon Resonance. Cradle molecules purified as described above were immobilized via His-tag to an NTA surface using a Biacore™ 2000 instrument so that the theoretical maximum response (Rmax) from target binding was 100-200 RU. Target protein at varying concentrations was then flowed over the surface at a flow rate of 30 µL/min and the binding signal was recorded. Fitting of kinetic traces was carried out using the BIAevaluation software. For equilibrium experiments, the equilibrium binding response was recorded for multiple target concentrations and fit to a simple 1:1 saturation binding curve.
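A simple 1:1 saturation binding curve has the form R = Rmax x C / (Kd + C). The sketch below shows one way such an equilibrium fit could be carried out; it is not the BIAevaluation procedure used in the source. It assumes the sum of squared errors has a single minimum over log10(Kd) (so a golden-section search is adequate), and the demonstration data are synthetic and noise-free, not measurements from this work.

```python
import math


def fraction_bound(conc, kd):
    """Fractional occupancy for simple 1:1 binding at equilibrium."""
    return conc / (kd + conc)


def best_rmax(concs, responses, kd):
    """Closed-form least-squares Rmax for a fixed Kd."""
    f = [fraction_bound(c, kd) for c in concs]
    return sum(r * fi for r, fi in zip(responses, f)) / sum(fi * fi for fi in f)


def sum_sq_error(concs, responses, kd):
    """Sum of squared residuals at a given Kd, using the optimal Rmax."""
    rmax = best_rmax(concs, responses, kd)
    return sum((r - rmax * fraction_bound(c, kd)) ** 2
               for c, r in zip(concs, responses))


def fit_one_to_one(concs, responses, kd_lo=1e-10, kd_hi=1e-3, iterations=100):
    """Golden-section search over log10(Kd); returns (Kd, Rmax)."""
    inv_phi = (math.sqrt(5) - 1) / 2
    lo, hi = math.log10(kd_lo), math.log10(kd_hi)
    for _ in range(iterations):
        left = hi - inv_phi * (hi - lo)
        right = lo + inv_phi * (hi - lo)
        if (sum_sq_error(concs, responses, 10 ** left)
                < sum_sq_error(concs, responses, 10 ** right)):
            hi = right
        else:
            lo = left
    kd = 10 ** ((lo + hi) / 2)
    return kd, best_rmax(concs, responses, kd)


# Hypothetical, noise-free demonstration data (molar concentrations, RU).
TRUE_KD, TRUE_RMAX = 1.0e-7, 150.0
CONCS = [1e-8, 3e-8, 1e-7, 3e-7, 1e-6, 3e-6]
RESPONSES = [TRUE_RMAX * fraction_bound(c, TRUE_KD) for c in CONCS]

kd_fit, rmax_fit = fit_one_to_one(CONCS, RESPONSES)
print(f"Kd = {kd_fit:.3e} M, Rmax = {rmax_fit:.1f} RU")
```

Because Rmax enters the model linearly, it is solved in closed form for each trial Kd, leaving a one-dimensional search; real data with noise would normally be fitted with a dedicated nonlinear least-squares routine and reported with uncertainty estimates.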
[0427] Phage ELISA. For phage amplification, E. coli XL1-Blue cells transformed with the LacI-containing plasmid pMCSG21 (Stols, et al., Protein Express. Purif. (2007) 53:396-403) (termed "XL21" hereafter) were used. Phage displaying cradle molecules were prepared by growing XL21 cells transfected with phagemid of individual clones in the presence of 0.2 mM IPTG and helper phage K07 (Koide, A., and Koide, S., supra, 2007; Sidhu, et al., supra, 2000). Cultures were then centrifuged and the phage-containing supernatant was used for ELISA assays. All incubations were at room temperature. In all instances except for the phage titration experiment used to test hSUMO1-binding cradle molecule specificity (Figure 26), wells of a 96-well Microlon (Greiner) ELISA plate were treated with a 2 ng/mL solution of a GST-fusion of the appropriate target protein, or GST alone, in 50 mM Tris-Cl buffer containing 150 mM NaCl, pH 7.5 (TBS) and incubated for 1 hour, followed by blocking with 0.5% BSA in TBS for 1 hour. In the hSUMO1 binder specificity experiment, wells were coated with 2 µg/mL NeutrAvidin in TBS, followed by blocking with 0.5% BSA and addition of a 50 nM solution of His-tagged ySUMO, hSUMO1 or hSUMO2 in complex with the BT-Tris NTA compound, which non-covalently links a biotin moiety to a His-tag (Koide, A., et al., supra, 2007; Reichel, et al., Analytical Chem. (2007) 79:8590-8600), and incubated for 30 minutes. In epitope mapping competition experiments, wells coated with GST-target were then incubated with either 1 µM ySMB-1 or 1 µM hS1MB-4 in TBS, or with TBS only, for one hour. In other experiments this step was not performed. After washing the wells with TBS + 0.1% Tween 20 (TBST), 50 µL of a 30% solution of phage supernatant in TBS + 0.5% BSA was added to the wells and incubated for 30 minutes. In the phage titration experiment, serial 5-fold dilutions of this 30% solution were also tested. In competition experiments, 1 µM ySMB-1 or 1 µM hS1MB-4 was included in the binding mixture. Bound phages were then detected using an anti-M13 antibody conjugated to horseradish peroxidase (GE Healthcare) in conjunction with the Ultra TMB ELISA colorimetric substrate (Pierce). Reactions were quenched after 5 minutes by addition of H2SO4 and phage binding was quantified by absorbance measured at 450 nm.
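The phage titration described above starts from the 30% phage supernatant solution and dilutes it serially 5-fold. A minimal sketch of the resulting concentration series follows; the number of dilution steps is not given in the text, so the value used here (four points) is purely an assumption for illustration.

```python
def serial_dilution(start_percent, fold, points):
    """Phage supernatant concentration (% v/v) at each point of a serial dilution.

    The first point is the undiluted starting solution.
    """
    return [start_percent / fold ** i for i in range(points)]


if __name__ == "__main__":
    # ASSUMPTION: four titration points; the text does not state how many were used.
    for percent in serial_dilution(30.0, 5, 4):
        print(f"{percent:.2f}% phage supernatant")
```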
[0428] NMR Epitope Mapping. NMR epitope mapping was performed by comparing chemical shifts in the 1H,15N-HSQC spectra of labeled ySUMO and hSUMO1 in the presence and absence of excess unlabeled cradle molecule. Uniformly 15N-labeled ySUMO and hSUMO1 were produced by culturing BL21(DE3) cells harboring a pHFT2 derivative containing the ySUMO or hSUMO1 gene in M9 media with 15NH4Cl as the sole nitrogen source. pHFT2 is a pHFT1 (Huang, et al., supra, 2006) derivative containing a 10-His tag instead of 6-His. An hSUMO1 mutant containing the C52A mutation was used. Protein expression was induced by the addition of 1 mM IPTG. Proteins were purified using a Ni-Sepharose column (GE Healthcare). After cleaving the N-terminal tag sequence with TEV protease, the proteins were concentrated and dissolved in 50 mM phosphate, 100 mM NaCl, pH 6.5. 1H,15N-HSQC spectra were collected on a Varian (Palo Alto, CA) INOVA spectrometer using pulse sequences provided by the manufacturer. All ySUMO spectra were recorded at 20 °C. All hSUMO1 spectra were recorded at 17 °C. ySUMO resonances were assigned using previously reported assignments (Sheng and Liao, Protein Sci. (2002) 11:1482-1491). hSUMO1 resonances were assigned using previously reported assignments (Macauley, et al., J. Biological Chem. (2004) 279:49131-49137). Spectra were collected for free [15N]-ySUMO (380 µM), free [15N]-hSUMO1 (228 µM), [15N]-ySUMO (100 µM) in complex with unlabeled cradle molecule (200 µM) and [15N]-hSUMO1 (242 µM) in complex with unlabeled cradle molecule (484 µM) in the above buffer. Residues affected by cradle molecule binding were identified by comparing the free and cradle molecule-bound spectra. Amide cross-peaks were classified into five categories: strongly affected (a shift of greater than two peak widths), moderately affected (a shift of between one and two peak widths), weakly affected (a shift of approximately one peak width), unaffected (a shift of less than one peak width), and excluded (resonances that could not be unambiguously assigned) (Farmer, et al., Nature Struct. Biol. (1996) 3:995-997; Huang, et al., J. Molec. Biol. (1998) 281:61-67).
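The five-way classification of amide cross-peaks described above can be written as a small decision function. This is an illustrative sketch only: the text does not define how close to one peak width counts as "approximately" one, so the tolerance used here is an assumption, as is the convention that the shift is already expressed in units of peak width.

```python
def classify_cross_peak(shift_in_peak_widths, assigned=True, tolerance=0.1):
    """Classify an amide cross-peak by its shift, measured in peak widths.

    tolerance is an ASSUMED half-width of the band around one peak width
    that is treated as "approximately one peak width".
    """
    if not assigned:
        return "excluded"
    if shift_in_peak_widths > 2.0:
        return "strongly affected"
    if shift_in_peak_widths > 1.0 + tolerance:
        return "moderately affected"
    if shift_in_peak_widths >= 1.0 - tolerance:
        return "weakly affected"
    return "unaffected"


if __name__ == "__main__":
    # Hypothetical shifts (in peak widths), not data from the source.
    for shift in (2.5, 1.5, 1.0, 0.3):
        print(shift, classify_cross_peak(shift))
    print("unassigned", classify_cross_peak(3.0, assigned=False))
```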
[0429] X-ray Crystallography. ySMB-1 and ySUMO proteins were expressed and purified as described above. After removal of the tag sequence with TEV protease, the two proteins were mixed in a 1:1 molar ratio, concentrated to a total protein concentration of 4.9 mg/mL and dissolved in 10 mM Tris, 50 mM NaCl, pH 8.0. The formation and monodispersity of the complex were confirmed by gel filtration. The ySMB-1/ySUMO complex was crystallized in 14% PEG 8000, 16% glycerol at 19 °C using the hanging drop vapor diffusion method. Crystals were frozen in a mixture of 80% mother liquor and 20% glycerol as a cryoprotectant. Diffraction data were collected at APS beamline 21-ID-F (Advanced Photon Source, Argonne National Laboratory). Crystal and data collection information are reported in Table 4. X-ray diffraction data were processed and scaled with HKL2000 (Otwinowski, Z., and Minor, W., Methods in Enz. (1997) 276:307-326). The structure was determined by molecular replacement using sequential search with two different models with the program MOLREP in CCP4 ((1994) The CCP4 suite). The ySUMO structure (residues 1013-1098 of chain C, PDB ID code 2EKE) was used as a search model, along with the FnIII structure with the variable loop regions deleted (PDB ID code 1FNA) (Dickinson, C. D., et al., J. Mol. Biol. (1994) 236:1079-1092; Duda, et al., J. Molec. Biol. (2007) 369:619-630). Rigid body refinement was carried out with REFMAC5 (Murshudov, G. N., et al., Acta Crystallogr D Biol Crystallogr (1997) 53:240-255). Model building and the search for water molecules were carried out using the Coot program (Emsley, P., and Cowtan, K., Acta Crystallogr D Biol Crystallogr (2004) 60:2126-2132). Simulated annealing was performed in CNS1.1 (Brunger, A. T., et al., Acta Crystallogr D Biol Crystallogr (1998) 54:905-921). The TLS (Translation/Libration/Screw) and bulk solvent parameters, restrained temperature factor and final positional refinement were completed with REFMAC5 (Murshudov, et al., supra, 1997). Molecular graphics were generated using PyMOL (located on the World Wide Web at pymol.org).
[0430] Design of the SUMO-targeted Cradle Library. The choice of positions and diversity in the SUMO-targeted library followed this rationale. All residues of the FG loop were varied except one, S77, which did not contact ySUMO in the ySMB-1/ySUMO crystal structure and did not appear to be capable of direct participation in any similar interface. The inventors varied Y76 to D, H, N and Y because, although it did not directly contact ySUMO in the ySMB-1 interface, it was suspected that this position may be capable of interacting with R55, which is conserved in all SUMOs (Figure 22A). Leucine 81 of ySMB-1 is buried in a pocket in the ySUMO surface that is conserved across all SUMO isoforms, and an equivalent "anchor" leucine or valine is conserved in all SIM/SUMO complexes for which there are structures. As a result, the amino acid diversity at this position was restricted to F, L, I and V. E47 and S86 made minimal contact in the ySMB-1 interface and were not varied. Though P87 of the FG loop did make significant contact in the ySMB-1 interface, it was held constant to avoid perturbing the turn structure it introduces, which would likely change the overall positioning of the FG loop.
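Restricted amino acid sets such as those above are commonly encoded with a single degenerate codon. The sketch below expands a degenerate codon under the standard genetic code and lists the amino acids it encodes. The codons NAT and NTT are assumptions chosen for illustration because they encode exactly the sets D/H/N/Y and F/L/I/V; the codons actually used in the library are not stated here.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# IUPAC nucleotide ambiguity codes (subset)
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "N": "ACGT",
}


def encoded_amino_acids(degenerate_codon):
    """Return the sorted set of amino acids encoded by a degenerate codon."""
    choices = (IUPAC[base] for base in degenerate_codon.upper())
    codons = ("".join(p) for p in product(*choices))
    return sorted({CODON_TABLE[codon] for codon in codons})


print(encoded_amino_acids("NAT"))  # candidate codon for position 76
print(encoded_amino_acids("NTT"))  # candidate codon for position 81
```

The same function can be used to check whether a candidate codon introduces unwanted residues or stop codons (shown as `*`).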
[0431] Cradle Molecule Effects on SUMO/SIM Interactions. Wells of a Microlon (Greiner) ELISA plate were coated with 2 µg/mL GST-RanBP2 for 1 hour at room temperature. This IR1-M-IR2 construct of RanBP2 has been described previously (Tatham, et al., Nat. Struct. Mol. Biol. (2005) 12:67-74). A complex was pre-formed between His-tagged SUMOylated RanGAP (modified with SUMO1) and the BT-Tris-NTA reagent, which non-covalently attaches a biotin moiety to a His-tag (Koide, A., et al., Protein Eng. Des. Sel. (2009) 22:685-690; Reichel, et al., supra, 2007). This complex was incubated with varying concentrations of hS1MB-4 for 1 hour, and the mixture was then added to the ELISA plate and incubated for 30 min. Bound SUMO-RanGAP was then detected using a streptavidin-horseradish peroxidase conjugate in conjunction with the Ultra TMB ELISA reagent (Pierce). The reaction was quenched with 2 M H2SO4, and the absorbance at 450 nm was measured.
[0432] Cradle Molecule Effects on SUMOylation. A mixture of hSUMO1 and His6-SUMO3 (24 µM each) was combined with either cradle molecule hS1MB-4 or hS1MB-5 at varying concentrations and incubated for 1 hour. A mixture of E1 (SAE1/2, 1.7 µM), E2 (Ubc9, 13.7 µM), and ATP (5.5 mM) was then added and the SUMOylation reaction was allowed to proceed for 10 min at 37 °C. The reaction was then quenched by addition of SDS-PAGE loading dye, and the reaction mixture was analyzed by SDS-PAGE.
[0433] Cradle Molecule Effects on DeSUMOylation. YFP-hSUMO1-ECFP fusion protein (63 µg/mL) was mixed with varying concentrations of cradle molecule hS1MB-4, hS1MB-5 or ySMB-1 as a control and incubated at room temperature for 30 min. SENP1 was then added at a final concentration of 32 nM and the mixture was incubated for 15 min at 37 °C. The reaction was stopped by placing the reaction containers on ice, adding SDS-PAGE sample buffer and then boiling for 5 min. The reaction mixture was then analyzed by SDS-PAGE.
Example 9
Cradle Molecules Constructed Using Alternative Surfaces of the FnIII Beta Sheets
Library Design
[0434] The FnIII domain has two beta sheets (Figure 30A), one constituted by beta strands A, B and E, and the other by beta strands C, D, F and G. The crystal structure of a cradle molecule in complex with its target, the Abl SH2 domain, revealed extensive interactions made by residues in the CDFG beta sheet region of the FnIII domain that were not diversified in our library (Figure 30C) (Wojcik, et al., supra, 2010). Alanine scanning mutagenesis experiments demonstrated the energetic importance of these residues in binding (Wojcik, et al., supra, 2010). Similar use of these beta sheet surfaces was observed in a cradle molecule that bound to yeast small ubiquitin-like modifier (ySUMO) (Gilbreth, R. N., et al., Proc Natl Acad Sci USA (2011) 108:7751-7756). These observations suggest that it may be possible to construct a target-binding surface that is distinct from the conventional design relied upon in antibody-mimic engineering, i.e., interactions dominated by loops equivalent to the antibody complementarity-determining regions (CDRs). The surface of the CDFG beta sheet is slightly concave, suggesting that it is suitable for producing a recognition surface complementary to the convex surfaces found in most globular macromolecules.
[0435] To explore the efficacy of such alternative FnIII library designs and how their performance compares to conventional loop-focused FnIII engineering strategies, two distinct cradle molecule libraries were constructed. One library, which is called the cradle library, utilizes residues in beta strands C (residues 31 and 33) and D (residues 47 and 49) as well as residues in the FG and CD loops (Figures 30E and 31A) to present a concave binding surface as described above. The other library, which is called the loop-only library, constitutes the conventional FnIII library design utilizing positions in the BC, DE and FG
loops with no residues diversified in the beta sheet regions (Figures 30D and 31B).
[0436] In both libraries, highly biased amino acid diversity and various loop lengths for the FG loop were used, as described previously (Wojcik, et al., supra, 2010). In the loop-only library, this diversity was also used for positions in the BC loop, which was also varied in length.
The DE loop in the loop-only library was fixed in length and diversified only to Tyr or Ser, with Gly also included at position 52 (Figure 31B). In the cradle library, codons that exclude Pro and Gly, amino acids that are likely detrimental to the structural integrity of the FnIII domain, were used for positions in beta strand C, an internal beta strand. For beta strand D, an edge strand, a small subset of amino acids, Ala, Glu, Lys and Thr, was used. Ala and Thr were included to avoid large side chains that might prevent target binding, and Glu and Lys were included as negative design elements to prevent aggregation mediated by the formation of an intermolecular beta sheet (Richardson, J. S., and Richardson, D. C., Proc Natl Acad Sci USA
(2002) 99:2754-2759). Tyr73 was not diversified, in the hope that this Tyr would always contribute to target interaction, as Tyr is highly suitable for making a protein-interaction interface (Koide, S., and Sidhu, S. S., ACS Chem Biol (2009) 4:325-334). Both libraries were constructed in the phage display format with estimated numbers of independent sequences of 2.0x10^10 and 1.5x10^10, respectively.
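As a rough illustration of how the beta-strand positions contribute to library diversity, the sketch below multiplies the number of amino acids allowed at each diversified strand position (residues 31 and 33 in strand C; residues 47 and 49 in strand D). It assumes that all 18 amino acids other than Pro and Gly are allowed at the strand C positions, which is an assumption made for illustration only; the loop positions, whose diversity is biased and length-dependent, are not included.

```python
from math import prod

# Number of amino acids allowed at each diversified beta-strand position.
# Strand C: 20 standard amino acids minus Pro and Gly (assumed: all 18 allowed).
# Strand D: Ala, Glu, Lys and Thr.
allowed = {
    "C31": 18,
    "C33": 18,
    "D47": 4,
    "D49": 4,
}

strand_combinations = prod(allowed.values())
print(strand_combinations)  # 18 * 18 * 4 * 4 = 5184
```

Under these assumptions the strand positions alone account for only a few thousand combinations, so most of the sequence diversity in the library would come from the FG and CD loops.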
High-Affinity FnIII Cradle Molecules from the Cradle Libraries
[0437] The performance of the new cradle library was compared with that of the conventional "loop only" library (Wojcik, et al., supra, 2010) using three targets: Abl SH2, human small ubiquitin-like modifier 1 (hSUMO1), and green fluorescent protein (GFP). The molecular platform for these libraries was identical, except for the locations of the diversified residues.
These libraries contained similar numbers of independent sequences. For each combination of target and library, the following steps to generate cradle molecules were performed. First, cradle molecules from the phage display libraries were enriched. The N-terminal segment and C-terminal segment were then "shuffled" among cradle molecule clones in the enriched population for a given target, with a junction in the E strand to create a second-generation library in the yeast surface display format (Koide, et al., supra, 2007). The gene shuffling step was incorporated to increase the sequence space beyond that sampled in the starting, phage-display library. Finally, the yeast surface display library was sorted using flow cytometry.
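The effect of the shuffling step on sequence space can be sketched as a recombination of N-terminal and C-terminal segments at a fixed junction. The function name, the toy strings and the junction index below are hypothetical placeholders, not real cradle molecule sequences or the actual junction position in the E strand.

```python
from itertools import product


def shuffle_segments(clones, junction):
    """Recombine every N-terminal segment with every C-terminal segment.

    `clones` is a list of equal-length sequences; `junction` is the index at
    which each sequence is split (a stand-in for the junction in the E strand).
    """
    n_segments = {clone[:junction] for clone in clones}
    c_segments = {clone[junction:] for clone in clones}
    return sorted(n + c for n, c in product(n_segments, c_segments))


# Toy "clones": upper-case half = N-terminal segment, lower-case = C-terminal
enriched = ["NNNNcccc", "MMMMdddd", "PPPPeeee"]
second_generation = shuffle_segments(enriched, junction=4)
print(len(enriched), "->", len(second_generation))
```

Three parent clones with distinct halves give nine recombinants, six of which were not present in the starting population, which is the sense in which shuffling samples sequence space beyond the enriched phage-display pool.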
[0438] FnIII domain variants to all the targets from both cradle and "loop only" libraries were generated. Many FnIII domain variants exhibited high affinity with Kd values in the low nM range as measured in the yeast display format (Figures 31A-31C). These Kd values were in good agreement with those determined using purified FnIII domain variants and surface plasmon resonance (Figure 31D). As in previously generated FnIII domain variants, residues in the FG loop were mutated in all the FnIII domain variants selected from both libraries, suggesting the central importance of the FG loop residues in target recognition of FnIII domain variants. Some of the cradle molecules originating from the cradle libraries contained the wild-type CD loop and the D strand, suggesting either that these residues are not involved in target recognition in these cradle molecules or that substitutions of the wild-type residues did not confer affinity improvement. In contrast, the diversified positions in the C
strand were mutated in all of the selected cradle molecules, and cradle molecules to different targets exhibited different amino acid sequences (Figure 31A). When the C strand of a GFP-binding cradle molecule, GS#2, was changed back to the wild type sequence, the mutant completely lost binding to GFP (Figure 33). Together, these results support the importance of the C strand positions in target binding of cradle molecules derived from the cradle libraries.
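For orientation, a Kd value can be related to the fraction of displayed binder occupied at a given target concentration using a simple 1:1 binding isotherm. The sketch below assumes that the 1:1 model applies and that the target is in excess; the fitting procedure actually used for the yeast display titrations is not described here, and the function name is illustrative.

```python
def fraction_bound(target_conc_nM, kd_nM):
    """Fraction of binder occupied under a 1:1 binding model.

    Assumes target in excess, so free target ~ total target concentration.
    """
    if target_conc_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return target_conc_nM / (kd_nM + target_conc_nM)


# Illustrative values only: a binder with Kd = 10 nM
for conc in (1, 10, 100):
    print(conc, "nM ->", round(fraction_bound(conc, 10), 3))
```

At a target concentration equal to Kd the binder is half occupied, which is why titrations are usually centered on the expected Kd.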
[0439] It appears that the two libraries performed differently against different targets. For GFP, the cradle library clones had higher affinity than their counterparts from the loop-only library, but for hSUMO1 the trend was the opposite. High-affinity FnIII domain variants were obtained from both libraries for Abl SH2. These results suggest that, whereas both libraries are capable of generating FnIII domain variants to these diverse targets, the use of two distinct libraries increases the likelihood of generating highly functional FnIII domain variants to a broader range of targets.
[0440] This work produced several loop-only FnIII domain variants with good affinity for hSUMO1, whereas in previous work an FnIII domain variant termed ySMB-9 was the only hSUMO1 binder that was recovered with a sub-µM Kd (Gilbreth, et al., supra, 2011). One notable difference between the two studies is the inclusion of a loop-shuffling step in the present study. As shown below, the ySMB-9/hSUMO1 crystal structure strongly suggests that residues from all three loops are important for binding in ySMB-9, and the same is likely true for the highly homologous cradle molecules isolated in the present work. Thus, consistent with previous studies (Hackel, et al., supra, 2008), loop shuffling expands the sequence space that can be searched and thus increases the probability of generating high-affinity FnIII domain variants.
Crystal Structures of Cradle Molecule-Target Complexes Confirm Library Designs
[0441] The crystal structure of a cradle molecule isolated from the cradle library, termed SH13, in complex with its target, the SH2 domain of Abl kinase, was determined at a resolution of 1.83 Å (Figures 31A and 32A; Table 9). SH13 was among the initial cradle molecule clones generated directly from the phage-display libraries without loop shuffling and yeast display screening. Accordingly, it has low affinity, with a Kd value of ~4 µM. The SH13 cradle molecule maintained the FnIII scaffold structure, as evidenced by its minimal deviation from a previously determined cradle molecule structure (Cα RMSD < 0.7 Å, excluding mutated residues) (Wojcik, et al., supra, 2010; Gilbreth, et al., supra, 2008). The overall structure of the Abl SH2 domain is likewise in good agreement with a previously published crystal structure of the Abl SH2 domain in complex with another cradle molecule (Cα RMSD < 0.5 Å) (Wojcik, et al., supra, 2010). The phospho-Tyr binding pocket of the SH2 domain contained electron density consistent with a sulfate ion, which was present in the crystallization solution.
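The Cα RMSD values quoted above measure the root-mean-square distance between matched Cα atoms of two structures. A minimal sketch of that calculation follows; it assumes the two coordinate lists are already superposed and paired residue by residue (for example, with mutated residues excluded), and it does not perform the superposition itself. The function name and toy coordinates are illustrative.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD (same units as input, e.g. Angstrom) between paired C-alpha atoms.

    Assumes the structures are already superposed and the lists are matched.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy coordinates, not taken from any deposited structure
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model = [(0.0, 0.0, 0.5), (1.0, 0.0, -0.5)]
print(ca_rmsd(reference, model))
```

In practice the superposition would be done first with a structure-alignment tool, and only then would the paired coordinates be passed to a function like this.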
Table 9. Data collection and refinement statistics (molecular replacement)

                           SH13/Abl1 SH2 complex (3NKI)   ySMB9/hSUMO1 complex (3RZW)
Data collection*
  Space group              P21212                         C2221
  Cell dimensions
    a, b, c (Å)            65.55, 49.18, 60.95            93.35, 97.83, 96.58
    α, β, γ (°)            90, 90, 90                     90, 90, 90
  Beamline                 APS 24 ID-E                    APS 21 ID-F
  Wavelength (Å)           0.97917                        0.97872
  Resolution (Å)           1.83 (1.90-1.83)               2.15 (2.19-2.15)
  Rsym or Rmerge†          11.5 (52.9)                    8.2 (52.2)
  I/σI                     24.4 (2.6)                     17.9 (2.1)
  Completeness (%)         87.4 (83.0)                    98.2 (96.6)
  Redundancy               4.0 (3.1)                      6.7 (6.0)
Refinement
  Resolution (Å)           1.83                           2.15
  No. reflections          15,737                         22,699
  Rwork / Rfree            0.188/0.237                    0.186/0.237
  No. atoms                1,753                          2,807
    Protein                1,614                          2,657
    Ligand/ion             5                              12
    Water                  134                            138
  B-factors
    Protein                25.6                           29.51
    Ligand/ion             37.4                           64.29
    Water                  33.3                           32.24
  R.m.s. deviations
    Bond lengths (Å)       0.012                          0.019
    Bond angles (°)        1.337                          1.828
  Ramachandran values      98.5% favored                  97.2% favored
                           1.5% allowed                   2.2% allowed
                           0% outliers                    0.6% outliers

APS, Advanced Photon Source.
*Values for the highest resolution shell are shown in parentheses.
†Rmerge = $\sum_{hkl}\sum_i |I(hkl)_i - \langle I(hkl)\rangle| / \sum_{hkl}\sum_i \langle I(hkl)_i\rangle$ over i observations of a reflection hkl.
R = $\sum ||F(obs)| - |F(calc)|| / \sum |F(obs)|$.
Rfree is R with 5% of reflections sequestered before refinement.
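The Rmerge and R definitions in the footnotes to Table 9 can be sketched as follows. The function names, the dictionary layout and the toy numbers are illustrative assumptions; the values in Table 9 were produced by the data processing and refinement programs, not by this code. In the Rmerge denominator, the sum over observations of the mean intensity for a reflection equals the sum of the observed intensities, so the latter is used.

```python
def r_merge(observations):
    """Rmerge from repeated intensity measurements.

    `observations` maps a reflection index (h, k, l) to a list of measured
    intensities I(hkl)_i.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean) for i in intensities)
        denominator += sum(intensities)  # equals n * <I(hkl)>
    return numerator / denominator


def r_factor(f_obs, f_calc):
    """R = sum(| |F(obs)| - |F(calc)| |) / sum(|F(obs)|)."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    return numerator / sum(abs(o) for o in f_obs)


# Toy data for illustration only
toy_observations = {(1, 0, 0): [90.0, 110.0], (0, 1, 0): [50.0, 50.0]}
print(round(r_merge(toy_observations), 4))
print(round(r_factor([10.0, 20.0], [9.0, 22.0]), 4))
```

Rwork and Rfree use the same R expression, evaluated over the working set and over the 5% of reflections sequestered before refinement, respectively.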
[0442] In accordance with the design of the cradle library, the SH13 cradle molecule binds to the target chiefly using the cradle surfaces (Figure 32A). The mode of interaction observed in the crystal structure is consistent with the epitope mapped using NMR chemical shift perturbation (Figure 32B). The concave surface presented by the cradle molecule effectively complements a convex surface of the Abl SH2 domain. The total surface area buried at the interaction interface is nearly 2000 Å², with the SH13 cradle molecule contributing ~1030 Å² and the Abl SH2 domain 960 Å². Notably, of the cradle molecule surface area buried in the interface, ~90% is contributed by residues at positions that were diversified in the generation of the library. Similarly, out of 21 cradle molecule residues that are within 5 Å of an SH2 atom, 15 were located at positions that were diversified in the cradle library. All but one of these 15 residues are directly involved in target recognition. The extensive contributions of diversified positions to the interface suggest that the library design is effective in concentrating amino acid diversity at positions that are capable of making direct contacts with a target. These characteristics also provide additional support for the utility of this face of the FnIII beta sheets for constructing protein-interaction interfaces.
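The interface figures quoted above can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated in the paragraph; the variable names are illustrative.

```python
# Buried surface areas stated in the text (square Angstroms, approximate)
cradle_buried_A2 = 1030.0
sh2_buried_A2 = 960.0
total_buried_A2 = cradle_buried_A2 + sh2_buried_A2

# Cradle molecule residues within 5 Angstroms of an SH2 atom
residues_within_5A = 21
residues_at_diversified_positions = 15
diversified_fraction = residues_at_diversified_positions / residues_within_5A

print(f"total buried area: {total_buried_A2:.0f} A^2")
print(f"contact residues at diversified positions: {diversified_fraction:.0%}")
```

The two contributions sum to 1990 Å², consistent with "nearly 2000 Å²", and about 71% of the residues near the target sit at diversified positions, while those positions account for ~90% of the buried cradle molecule surface by area.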
[0443] The epitope of the Abl SH2 domain recognized by SH13 is distinct from the phosphopeptide-binding interface that a previously reported cradle molecule recognizes (Wojcik, et al., supra, 2010). However, the SH13 epitope is also a known functionally important surface of the SH2 domain. In the context of the full-length Abl kinase, this surface, centered on the αA helix, mediates interactions with the C-lobe of the kinase domain that help to keep the kinase in an inactive conformation. Almost half (~475 Å²) of the epitope for the SH13 cradle molecule is contributed by a linear segment including the entire αA helix and residues immediately adjacent to this helix (Figure 32B). The concave paratope of the SH13 cradle molecule seems suitable for recognizing the convex surface presented by this helix. It is unlikely that a cradle molecule with a convex paratope shape, typically observed in cradle molecules with exclusively loop-based binding surfaces, would be able to recognize this surface.
[0444] To better compare and contrast the structural basis for target recognition in FnIII domain variants isolated from the two distinct types of libraries, the crystal structure of the FnIII domain variant ySMB-9 bound to hSUMO1 was determined at 2.15 Å resolution (Figure 32C; Table 9). The ySMB-9 FnIII domain variant was recovered from the same "loop only" phage-display library using a slightly different selection scheme (Hogrefe, H. H., et al., Gene (1993) 128:119-126) and shows close homology to the new hSUMO1 cradle molecules recovered in this study (Figure 31B). Thus, the structure of the ySMB-9/hSUMO1 complex provides a good example of how "loop only" FnIII variants recognize their targets. The structure showed that ySMB-9 binds to hSUMO1 in a "head-on" fashion, using all three loops to form a contiguous binding surface in precisely the manner envisioned in typical loop-based FnIII library designs (Figure 32C). The BC, DE and FG loops contribute 54%, 6% and 40% of the total FnIII variant buried surface area, respectively, with no buried surface contributed by the beta sheet regions of the FnIII domain.
[0445] The mode of interaction exhibited by the ySMB-9 FnIII variant stands in stark contrast to the "cradle" surface employed by SH13 in binding to Abl SH2 (Figure 32A) and is also distinct from that previously observed for a yeast SUMO (ySUMO)-binding FnIII variant, ySMB-1 (Figures 32C and 32D) (Gilbreth, et al., supra, 2011). The FnIII variant ySMB-1 used the FG loop and the wild-type FnIII scaffold to form a side-and-loop mode of interaction similar to that exhibited by SH13. Interestingly, ySMB-1 and ySMB-9 bind to structurally equivalent, highly conserved epitopes in ySUMO and hSUMO1, respectively (Figure 32D).
Thus, this pair of FnIII variants demonstrates that both the "loop only" and "cradle" binding modes can be used to successfully recognize essentially the same target surface, and further supports the validity of both library design strategies. Furthermore, the epitope recognized by ySMB-9 is flat in shape, demonstrating that, although loop-only binding surfaces tend to have convex shapes that would seem unsuitable for recognizing flat surfaces, it is possible to effectively produce binders to a flat epitope using a loop-only FnIII variant library.
[0446] A new type of FnIII cradle molecule library was developed in which positions for amino acid diversification are distinct from those of conventional FnIII
variant libraries. The new cradle library is effective in generating high-affinity cradle molecules, and its performance, compared with that of a conventional "loop only" library, appears different for different target molecules. Furthermore, the crystal structure of a cradle molecule from the new cradle library presents a concave paratope (Figure 32A), which is distinctly different from flat or convex paratopes often observed in FnIII variants from "loop only" libraries (Figure 32C). The ability of the new library to produce concave paratopes is likely to be critical in using cradle molecules to inhibit protein-protein interaction interfaces, as the majority of protein surfaces range from flat to convex in shape. The SH13 structure showed that residues in the beta sheet region of the FnIII domain underwent minimal backbone movements upon target binding. Thus, a small entropic penalty incurred by these residues upon binding may favorably contribute to achieving high affinity. Together, these results clearly illustrate that the single FnIII domain can be used to produce diverse types of binding surfaces that collectively are capable of recognizing epitopes with distinct topography. This expands the utility of the FnIII
domain for producing synthetic binding interfaces.
[0447] In structural comparison of the FnIII and immunoglobulin variable domains, the "DCFG" beta sheet of the FnIII domain used for constructing a new binding site in this work corresponds to the beta sheet of the immunoglobulin variable domain that mediates heterodimerization between the variable domains of the heavy and light chains (Amzel, L. M., and Poljak, R. J., Annu Rev Biochem (1979) 48:961-997). Therefore, the immunoglobulin domains utilize this beta sheet surface for specific protein-protein interaction but not for recognizing foreign molecules. In the camelid single-domain antibodies (VHHs), the equivalent beta sheet contains several mutations, with respect to the conventional variable domain, that prevent heterodimerization (Desmyter, A., et al., Nat Struct Biol (1996) 3:803-811; Hamers-Casterman, C., et al., Nature (1993) 363:446-448). Although the paratopes of most camelid VHHs reported to date are made with the three CDR loops and have convex topography (De Genst, et al., Proc Natl Acad Sci USA (2006) 103:4586-4591), rare examples of VHH that use a binding mode equivalent to the "side and loop" mode have been identified (Kirchhofer, A., et al., Nat Struct Mol Biol (2010) 17:133-138). These examples suggest that the VHH scaffold can also be used in the same manner as the FnIII domain to generate such "side binders". The rarity of such VHH molecules is likely to originate from the manner by which their amino acid diversity is generated in the natural immune system.
The gene recombination mechanism underlying the generation of immunoglobulin sequence diversity focuses on the CDRs (Wu, T. T., et al., Proteins: Struct Funct Genet (1993) 16:1-7).
Consequently, the "side" positions on the beta sheet are not extensively diversified in the natural immune repertoire, limiting the chance of generating "side binder" VHH
molecules.
[0448] Whereas the FnIII variants have been viewed as close mimics of antibodies due to their structural similarity, the design of the cradle library represents a departure from this "antibody mimic" mindset. It should be emphasized that structural characterization of cradle molecule-target complexes was instrumental in identifying the unanticipated mode of cradle molecule-target interactions and the potential utility of the beta sheet surface for target recognition. Unlike immunoglobulin libraries derived from natural sources, cradle molecule libraries are generated using in vitro mutagenesis, affording full control over the choice of locations for amino acid diversification in a library. This freedom is an obvious but important advantage of synthetic scaffold systems. A similar approach should be effective in identifying distinct surfaces useful for constructing binding interfaces in other scaffolds. This design strategy gives general insights into the design of molecular recognition interfaces.
Materials and Methods
[0449] Protein production and modification. Target proteins (Abl SH2, hSUMO1, and GFP) and cradle molecules were produced as His10-tag proteins using the pHFT2 vector (Koide, et al., supra, 2007), and purified as previously described (Gilbreth, et al., supra, 2011; Koide, A., et al., supra, 2009). The hSUMO1 sample used in this work contained the C52A mutation that prevents dimer formation (Tatham, M. H., et al., J Biol Chem (2001) 276:35368-35374). Isotope-enriched samples were prepared as described previously (Pham, T-N, and Koide, S., J Biomol NMR (1998) 11:407-414). For SPR experiments, the His-tag segment of the targets was cleaved using the TEV protease. For crystallization, the His-tag segment was removed from both the targets and cradle molecules.
[0450] Target proteins used for yeast display were biotinylated using EZ-Link Biotin (Thermo Fisher Scientific). Typically, 0.3-0.6 mg/ml of a target protein was incubated with 60 µM reagent for 30 min, and the reaction was quenched by adding Tris-Cl (pH 8) to a final concentration of 0.1 M. Excess biotinylation reagent was removed by dialysis against 20 mM Tris-Cl buffer, pH 8, containing 100 mM NaCl and 1 mM EDTA. The level of biotinylation was determined to be ~1 per molecule using MALDI-TOF mass spectroscopy.
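The reagent-to-protein molar ratio implied by these conditions depends on the molecular weight of the target, which is not stated here. The following is a minimal sketch of the unit conversion, assuming a hypothetical molecular weight of 12 kDa chosen purely for illustration; only the 0.3-0.6 mg/ml range and the 60 µM reagent concentration come from the text.

```python
def mg_per_ml_to_micromolar(mg_per_ml, mw_da):
    """Convert a mass concentration (mg/ml) to a molar concentration (uM)."""
    # 1 mg/ml equals 1 g/l; dividing by g/mol gives mol/l, and 1e6 converts to uM.
    return mg_per_ml / mw_da * 1e6


ASSUMED_MW_DA = 12_000.0  # hypothetical molecular weight, NOT taken from the source
REAGENT_UM = 60.0         # reagent concentration stated in the text

for mg_ml in (0.3, 0.6):
    protein_um = mg_per_ml_to_micromolar(mg_ml, ASSUMED_MW_DA)
    ratio = REAGENT_UM / protein_um
    print(f"{mg_ml} mg/ml -> {protein_um:.1f} uM protein, reagent:protein = {ratio:.2f}")
```

Under this assumed molecular weight the reagent is in modest molar excess across the stated protein range; a different molecular weight changes the ratio proportionally.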
[0451] Phage and yeast display, library construction and selection. The "loop only" library has been described (Wojcik, et al., supra, 2010). The cradle library was constructed using the Kunkel mutagenesis method as described previously (Koide, A., and Koide, S., supra, 2007; Sidhu, et al., supra, 2000). Phage display selection was performed according to the methods previously described (Fellouse, F. A., et al., J Mol Biol (2007) 373:924-940; Koide, A., et al., supra, 2009). The His-tagged target proteins were incubated with an equimolar concentration of BTtrisNTA, a high-affinity Ni-NTA compound containing a biotin moiety, for 30 min to form a BTtrisNTA/His-tagged protein complex, and the complex was incubated with cradle molecule phage-display libraries. The target concentrations used for rounds 1, 2 and 3 were 100, 100 and 50 nM for Abl SH2 and 100, 50 and 50 nM for GFP, respectively, and 100 nM throughout for hSUMO1. Cradle molecule-displaying phages bound to the BTtrisNTA/target complexes were captured using streptavidin (SAV)-coated magnetic beads. The captured phages were eluted with 10 mM EDTA solution, which disrupts the linkage between the targets and BTtrisNTA. The recovered phages were amplified in the presence of 0.2 mM IPTG to induce the expression of cradle molecule-p3 fusion genes.
[0452] After three rounds of the phage-display library selection, the genes of selected cradle molecules were transferred to a yeast-display vector to make yeast libraries, using homologous recombination in yeast (Swers, J. S., et al., Nucleic Acids Res (2004) 32:e36). Gene shuffling during the construction of yeast-display libraries was incorporated as follows. A linearized yeast display vector, pGalAgaCamR (Koide, A., et al., J Mol Biol (2007) 373:941-953), was prepared using NcoI and XhoI digestion. Cradle molecule gene segments encoding residues 1-74 and residues 54-94 were amplified separately using PCR from the enriched pool after the phage selection. Yeast strain EBY100 was then transformed using a mixture of the three DNA fragments. Correctly recombined clones contained the fusion gene for Aga2-cradle molecule-V5 tag. The transformants were selected in tryptophan-deficient media, and the Aga2-cradle molecule fusion protein was expressed as previously described (Koide, et al., supra, 2007; Boder and Wittrup, supra, 2000).
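The shuffling step can be pictured as joining the N-terminal segment of one enriched clone to the C-terminal segment of another, with the junction falling somewhere in the region the two PCR fragments share (residues 54-74). The sketch below is a toy, residue-level illustration only: the function names, the uniform choice of crossover point, and the placeholder sequences are assumptions for illustration, and the actual recombination occurs at the DNA level in yeast.

```python
import random

N_FRAGMENT = (1, 74)    # residues encoded by the first PCR fragment (from the text)
C_FRAGMENT = (54, 94)   # residues encoded by the second PCR fragment (from the text)
OVERLAP = (C_FRAGMENT[0], N_FRAGMENT[1])  # residues 54-74 are present in both fragments


def recombine(n_donor, c_donor, crossover):
    """Join residues 1..crossover of n_donor to residues crossover+1..94 of c_donor."""
    if len(n_donor) != C_FRAGMENT[1] or len(c_donor) != C_FRAGMENT[1]:
        raise ValueError("both parents must be 94 residues long")
    if not OVERLAP[0] <= crossover <= OVERLAP[1]:
        raise ValueError("crossover must fall inside the 54-74 overlap")
    return n_donor[:crossover] + c_donor[crossover:]


def shuffle_pool(pool, n_children, rng):
    """Draw random donor pairs and crossover points (hypothetical uniform model)."""
    children = []
    for _ in range(n_children):
        n_donor = rng.choice(pool)
        c_donor = rng.choice(pool)
        crossover = rng.randint(OVERLAP[0], OVERLAP[1])
        children.append(recombine(n_donor, c_donor, crossover))
    return children


# Placeholder parents: real clones would be 94-residue cradle molecule sequences.
parent_a = "A" * 94
parent_b = "B" * 94
child = recombine(parent_a, parent_b, 60)
shuffled = shuffle_pool([parent_a, parent_b], 5, random.Random(0))
```

In this picture, diversified positions located before residue 54 always come from the N-terminal donor and those after residue 74 from the C-terminal donor, so mutations in the two halves of the gene can be combined in new ways.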
[0453] The yeast display libraries were sorted using 30 nM biotinylated Abl SH2, 10 nM biotinylated hSUMO1, and 3 nM biotinylated GFP as described previously (Koide, et al., supra, 2007). The surface-displayed cradle molecules were detected with anti-V5 antibody (Sigma). NeutrAvidin (NAV)-PE (InvitroGen) or SAV-PE (InvitroGen) and Alexa Fluor-647 chicken anti-rabbit IgG (InvitroGen) were used as the secondary detection reagents for biotinylated protein and anti-V5 antibody, respectively. A total of two rounds of library sorting were performed for Abl SH2 and hSUMO1, and one round for GFP.
[0454] Affinity measurements using yeast display. Individual clones from sorted libraries were isolated on agar plates and grown in liquid media as described previously (Koide, et al., supra, 2007; Boder and Wittrup, supra, 2000). Fifty thousand yeast cells for each clone were incubated with various concentrations of biotinylated target in a final volume of 20 µl in BSS buffer (50 mM Tris-Cl, 150 mM NaCl, pH 8, 1 mg/ml BSA) in the wells of a polypropylene 96-well plate (Greiner 650201) on ice for 30 min with shaking. The wells of a 96-well filter plate (MultiScreenHTS HV, 0.45 µm pore size; Millipore) were washed by adding 100 µl BSS and then removing the liquid by applying a vacuum. The cell suspensions from the binding reactions were transferred to the washed wells of the 96-well filter plate. The binding solution was removed by vacuum filtration. The yeast cells in the wells were washed twice with 100 µl of BSST (BSS buffer containing 0.1% Tween 20) in the same manner. Next, 20 µl of µg/ml NAV-PE (InvitroGen) in BSS was added to each of the wells. After incubation on ice with shaking for 30 min, the cells were washed with BSST once. The cells were suspended in 300 µl BSS and analyzed using a Guava EasyCyte 6/L flow cytometer (Millipore). The Kd values were determined from plots of the mean PE fluorescence intensity versus target concentration by fitting the 1:1 binding model using the KaleidaGraph program (Synergy Software).
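A 1:1 binding model for such a titration is commonly written as F = F_min + (F_max - F_min) * [target] / (Kd + [target]), assuming the target is in excess over the displayed molecules. The sketch below illustrates one simple way to fit this model in pure Python: for each candidate Kd on a grid the model is linear in the remaining two parameters, so they are obtained by ordinary least squares. This is not the algorithm used by KaleidaGraph; the grid, the function name, and the synthetic data are assumptions for illustration.

```python
def fit_one_to_one(conc, signal, kd_grid):
    """Grid search over Kd; baseline and amplitude are solved by linear least squares.

    Model: signal = f_min + (f_max - f_min) * c / (kd + c)
    Returns (kd, f_min, f_max) for the grid point with the smallest squared error.
    """
    n = len(conc)
    mean_y = sum(signal) / n
    best = None
    for kd in kd_grid:
        x = [c / (kd + c) for c in conc]  # fraction bound at each concentration
        mean_x = sum(x) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, signal))
        slope = sxy / sxx                  # estimate of f_max - f_min
        intercept = mean_y - slope * mean_x  # estimate of f_min
        sse = sum((intercept + slope * xi - yi) ** 2 for xi, yi in zip(x, signal))
        if best is None or sse < best[0]:
            best = (sse, kd, intercept, intercept + slope)
    _, kd, f_min, f_max = best
    return kd, f_min, f_max


# Synthetic, noise-free titration with made-up parameters (not data from the source).
TRUE_KD_NM, TRUE_F_MIN, TRUE_F_MAX = 10.0, 5.0, 105.0
conc_nm = [0.1, 0.3, 1.0, 3.0, 10.0, 30.0, 100.0, 300.0]
signal = [TRUE_F_MIN + (TRUE_F_MAX - TRUE_F_MIN) * c / (TRUE_KD_NM + c) for c in conc_nm]

# Logarithmically spaced candidate Kd values from 0.1 nM to 1000 nM.
kd_grid = [10 ** (i / 50) for i in range(-50, 151)]
kd_fit, f_min_fit, f_max_fit = fit_one_to_one(conc_nm, signal, kd_grid)
```

With real data, the titration should span concentrations well below and well above the expected Kd so that both the baseline and the plateau are constrained.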
[0455] Surface plasmon resonance. All SPR measurements were carried out on a Biacore™ 2000 instrument. For kinetic experiments, Abl SH2 was immobilized on a CM5 chip using amine coupling following methods provided by the manufacturer. Cradle molecules at varying concentrations were flowed over the surface at a flow rate of 100 µl/min and the binding signal was recorded. Quintuplicate data sets were processed and fit with a bimolecular model including mass transport using the Scrubber2 program (BioLogic Software, Campbell, Australia). The presence of mass transport was confirmed using varying flow rates. Equilibrium experiments were performed as described previously (Gilbreth, et al., supra, 2011). Duplicate data sets were processed in Scrubber2 and saturation curves were fit with a 1:1 binding model using the Origin software (OriginLab, Northampton, MA).
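For orientation, the simplest bimolecular (1:1 Langmuir) description of an SPR sensorgram relates the association and dissociation rate constants to the equilibrium dissociation constant by Kd = k_off / k_on. The sketch below deliberately omits the mass transport term used in the fits described above, and the rate constants, Rmax, and function names are hypothetical values chosen for illustration, not results from this work.

```python
import math


def equilibrium_response(conc, kd_eq, r_max):
    """Steady-state response of a 1:1 interaction at analyte concentration conc."""
    return r_max * conc / (kd_eq + conc)


def association_response(t, conc, k_on, k_off, r_max):
    """Response at time t after analyte injection (no mass transport limitation)."""
    k_obs = k_on * conc + k_off  # observed pseudo-first-order rate constant
    r_eq = equilibrium_response(conc, k_off / k_on, r_max)
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation_response(t, r_start, k_off):
    """Response at time t after the analyte is replaced by buffer."""
    return r_start * math.exp(-k_off * t)


# Hypothetical parameters for illustration only.
K_ON = 1e5     # 1/(M s)
K_OFF = 1e-3   # 1/s
R_MAX = 100.0  # response units
KD = K_OFF / K_ON  # equilibrium dissociation constant in M (about 10 nM here)

late_response = association_response(1e4, 1e-8, K_ON, K_OFF, R_MAX)
```

At an analyte concentration equal to Kd the plateau is half of Rmax, which is the same relationship exploited by the equilibrium saturation curves.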
[0456] Crystallization and structure determination. The SH13/Abl SH2 domain complex was purified with a Superdex 75 column (GE Lifesciences). The complex was concentrated to ~10 mg/ml and crystallized in 0.2 M magnesium chloride, 0.1 M Bis-Tris-Cl, pH 5.5, and 25% PEG 3350 at 19 °C by the hanging-drop vapor-diffusion method. The ySMB-9 and hSUMO1 proteins were mixed in a 1:1 molar ratio, concentrated to a total protein concentration of ~10 mg/ml and dissolved in 10 mM Tris-Cl, 50 mM NaCl, pH 8.0. The complex was crystallized in 24% PEG-8000, 0.1 M imidazole, pH 8.0, at 19 °C using the hanging-drop vapor-diffusion method. Crystals were frozen in a mixture of 80% mother liquor and 20% glycerol as a cryoprotectant. Crystal and data collection information are reported in Table 1.
[0457] X-ray diffraction data were collected at the Advanced Photon Source beamlines (Argonne National Laboratory). Crystal and data collection information are reported in Supplementary Table 1. X-ray diffraction data were processed and scaled with (Otwinowski, Z., supra, 1997). The SH13/Abl SH2 structure was determined by molecular replacement using Phaser in the CCP4 program suite (The CCP4 Suite, supra, 1994; Potterton, E., et al., Acta Crystallogr D Biol Crystallogr (2003) 59:1131-1137). A multicopy search was performed with the Abl SH2 domain and the FnIII scaffold, without the loop regions, as the search models (PDB IDs 2ABL and 1FNA, respectively). Simulated annealing, energy-minimization, B-factor refinement and map building were carried out using CNS (Brunger, A. T., et al., supra, 1998; Brunger, A. T., Nature Protocols (2007) 2:2728-2733). The ySMB-9/hSUMO1 structure was determined by molecular replacement using sequential search with two different models with the program MOLREP in CCP4 (The CCP4 Suite, supra, 1994). The hSUMO1 structure (residues 20-92 of chain B, PDB ID code 1Z55) was used as a search model, along with the FnIII structure with the variable loop regions deleted (PDB ID code 1FNA) (Dickinson, et al., supra, 1994; Reverter, D., supra, 2005). Model building and the search for water molecules were carried out using the Coot program (Emsley, P., supra, 2004). TLS (Translation/Libration/Screw) and bulk solvent parameters, restrained temperature factor and final positional refinement were completed with REFMAC5 (Murshudov, et al., supra, 1997). Molecular graphics were generated using PyMOL (located on the World Wide Web at pymol.org). Surface area calculations were performed using the PROTORP protein-protein interaction server (Reynolds, et al., supra, 2009).
[0458] NMR spectroscopy. The following suite of spectra was taken on a uniformly 13C/15N-enriched Abl SH2 domain (~200 µM) in 10 mM sodium phosphate buffer, pH 7.4, containing 150 mM NaCl, 50 µM EDTA and 0.005% sodium azide prepared in 90% H2O and 10% D2O, using a Varian (Palo Alto, CA) INOVA 600 NMR spectrometer equipped with a cryogenic probe using pulse sequences provided by the manufacturer: 1H,15N-HSQC, HNCO, CBCACONH, HNCACB, CCONH, HN(CA)CO. NMR data were processed and analyzed using NMRPipe and NMRView software (Delaglio, F., et al., J Biomol NMR (1995) 6:277-293; Johnson, B. A., et al., J Biomol NMR (1994) 4:603-614). Resonance assignments were obtained using the PINE server (Bahrami, A., et al., PLoS Computational Biology (2009) 5:e1000307) and verified by visual inspection in NMRView. For epitope mapping, the 1H,15N-HSQC spectra of the 15N-enriched Abl SH2 domain (~60 µM) in the absence and presence of a 1.25-fold molar excess of unlabeled SH13 cradle molecule were recorded. The 1H,15N-HSQC cross peaks were classified according to the degree of migration upon SH13 binding as described previously (Koide, et al., supra, 2007).
Claims (27)
1. A fibronectin type III (FnIII) domain-based cradle polypeptide comprising one or more amino acid substitutions in at least a loop region and at least a non-loop region.
2. The cradle polypeptide of claim 1, wherein the non-loop region is selected from the group consisting of a beta strand C, a beta strand D, a beta strand F, and a beta strand G.
3. The cradle polypeptide according to claim 1 or 2, wherein the loop region is selected from the group consisting of an AB loop, a BC loop, a CD loop, a DE loop, an EF loop and an FG loop.
4. The cradle polypeptide according to any one of claims 1-3, comprising one or more amino acid substitutions in two loop regions and/or two non-loop regions.
5. The cradle polypeptide of claim 4, wherein the non-loop regions are the beta strands C and F, and the loop regions are the CD and FG loops.
6. The cradle polypeptide according to any one of claims 1-5, wherein the one or more amino acid substitutions are introduced to the cradle residues in the beta strands.
7. The cradle polypeptide according to any one of claims 1-6, further comprising an insertion and/or deletion of at least one amino acid in at least one loop and/or non-loop region.
8. The cradle polypeptide of claim 7, which comprises an insertion and/or deletion of at least one amino acid in two loop regions and/or two non-loop regions.
9. The cradle polypeptide of claim 8, wherein the non-loop regions are the beta strands C and F, and the loop regions are the CD and FG loops.
10. The cradle polypeptide according to any one of claims 1-9, wherein the one or more amino acid substitutions in the non-loop region do not change the structure of the FnIII domain scaffold and/or the shape of the loop regions.
11. The cradle polypeptide according to any one of claims 1-10, wherein the one or more amino acid substitutions in the non-loop region exclude the non-cradle residues.
12. The cradle polypeptide according to any one of claims 1-11, wherein the FnIII domain is the 1st, 2nd, 3rd, 4th, 5th, 6th, 7th, 8th, 9th, 10th, 11th, 12th, 13th, 14th, 15th or 16th FnIII domain of human fibronectin.
13. The cradle polypeptide of claim 12, wherein the FnIII domain is the 7th, 10th or 14th FnIII domain of human fibronectin.
14. The cradle polypeptide of claim 13, wherein the FnIII domain is the 7th FnIII domain of human fibronectin (FnIII07).
15. The cradle polypeptide of claim 13, wherein the FnIII domain is the 10th FnIII domain of human fibronectin (FnIII10).
16. The cradle polypeptide of claim 13, wherein the FnIII domain is the 14th FnIII domain of human fibronectin (FnIII14).
17. The cradle polypeptide according to any one of claims 1-16, wherein the CD loop is about 3-11 residues in length.
18. The cradle polypeptide of claim 17, wherein the CD loop is about 4-9 residues in length.
19. The cradle polypeptide of claim 18, wherein the CD loop is 5 residues in length.
20. The cradle polypeptide according to any one of claims 1-19, wherein the FG loop is about 1-10 residues in length.
21. The cradle polypeptide of claim 20, wherein the FG loop is about 5 or 6 residues in length.
22. The cradle polypeptide of claim 21, wherein the FG loop is 5 residues in length.
23. The cradle polypeptide of claim 21, wherein the FG loop is 6 residues in length.
24. The cradle polypeptide of claim 23, wherein position 1 of the FG loop is a Gly residue, position 2 of the FG loop is a Leu, Val or Ile residue, position 3 of the FG loop is a charged or polar residue, position 4 of the FG loop is a Pro residue, position 5 of the FG loop is a Gly residue, and position 6 of the FG loop is a polar residue.
25. The cradle polypeptide of claim 22, wherein positions 3 and/or 5 of the FG loop are a Gly residue.
26. The cradle polypeptide according to any one of claims 1-25, wherein the beta strand C is about 6-14 residues in length and the beta strand F is about 8-13 residues in length.
27. The cradle polypeptide of claim 26, wherein the beta strand C is about 8-residues in length.
28. The cradle polypeptide of claim 27, wherein the beta strand C is 9 residues in length.
29. The cradle polypeptide according to any one of claims 26-28, wherein positions 2, 4, and 6 of the beta strand C are a hydrophobic residue.
30. The cradle polypeptide according to any one of claims 26-29, wherein positions 1, 3, 5 and 7-9 of the beta strand C are altered relative to the wild type sequence.
31. The cradle polypeptide according to any one of claims 26-30, wherein position 1 of the beta strand C is selected from the group consisting of Ala, Gly, Pro, Ser, Thr, Asp, Glu, Asn, Gln, His, Lys and Arg.
32. The cradle polypeptide according to any one of claims 26-31, wherein position 3 of the beta strand C is a hydrophobic residue.
33. The cradle polypeptide according to any one of claims 26-32, wherein position 3 of the beta strand C is selected from the group consisting of Ile, Val, Arg, Leu, Thr, Glu, Lys, Ser, Gln and His.
34. The cradle polypeptide according to any one of claims 26-33, wherein positions and 7-9 of the beta strand C are selected from the group consisting of Ala, Gly, Pro, Ser, Thr, Asp, Glu, Asn, Gln, His, Lys and Arg.
35. The cradle polypeptide according to any one of claims 26-34, wherein the beta strand F is about 9-11 residues in length.
36. The cradle polypeptide of claim 35, wherein the beta strand F is 10 residues in length.
37. The cradle polypeptide according to claim 35 or 36, wherein positions 1, 3, 5 and of the beta strand F are altered relative to the wild type sequence.
38. The cradle polypeptide according to any one of claims 35-37, wherein positions 1, 3, 5 and 10 of the beta strand F are selected from the group consisting of Ala, Gly, Pro, Ser, Thr, Asp, Glu, Asn, Gln, His, Lys and Arg.
39. The cradle polypeptide according to any one of claims 35-38, wherein positions 2, 4 and 6 of the beta strand F are a hydrophobic residue.
40. The cradle polypeptide according to any one of claims 35-39, wherein position 7 of the beta strand F is a hydrophobic residue.
41. The cradle polypeptide according to any one of claims 35-40, wherein position 7 of the beta strand F is selected from the group consisting of Arg, Tyr, Ala, Thr and Val.
42. The cradle polypeptide according to any one of claims 35-41, wherein position 8 of the beta strand F is selected from the group consisting of Ala, Gly, Ser, Val and Pro.
43. The cradle polypeptide according to any one of claims 35-42, wherein position 9 of the beta strand F is selected from the group consisting of Val, Leu, Glu, Arg and Ile.
44. The cradle polypeptide of claim 15, wherein the amino acid substitutions comprise positions 30, 31, 32, 33, 34, 35, 36, 37, 38 and/or 39 of SEQ ID NO:1.
45. The cradle polypeptide of claim 44, wherein the amino acid substitutions comprise positions 31, 33, 35, 37, 38 and/or 39 of SEQ ID NO:1.
46. The cradle polypeptide of claim 45, wherein the amino acid substitutions comprise positions 31 and/or 33 of SEQ ID NO:1.
47. The cradle polypeptide of claim 15, wherein the amino acid substitutions comprise positions 44, 45, 46, 47, 48, 49, 50 and/or 51 of SEQ ID NO:1.
48. The cradle polypeptide of claim 47, wherein the amino acid substitutions comprise positions 44, 45, 47 and/or 49 of SEQ ID NO:1.
49. The cradle polypeptide of claim 15, wherein the amino acid substitutions comprise positions 40, 41, 42, 43, 44 and/or 45 of SEQ ID NO:1.
50. The cradle polypeptide of claim 15, wherein the amino acid substitutions comprise positions 67, 68, 69, 70, 71, 72, 73, 74, 75 and/or 76 of SEQ ID NO:1.
51. The cradle polypeptide of claim 50, wherein the amino acid substitutions comprise positions 67, 69, 71, 73 and/or 76 of SEQ ID NO:1.
52. The cradle polypeptide of claim 50, wherein the amino acid substitutions comprise positions 71, 73, 75 and/or 76 of SEQ ID NO:1.
53. The cradle polypeptide of claim 15, wherein the amino acid substitutions comprise positions 76, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86 of SEQ ID NO:1.
54. The cradle polypeptide of claim 53, wherein the amino acid substitutions comprise positions 84 and/or 85 of SEQ ID NO:1.
55. The cradle polypeptide of claim 15, wherein the amino acid substitutions comprise positions 85, 86, 87, 88, 89, 90, 91, 92, 93 and/or 94 of SEQ ID NO:1.
56. The cradle polypeptide of claim 15, wherein the amino acid substitutions comprise positions 31, 33, 47, 49, 73 and/or 75 of SEQ ID NO:1.
56. The cradle polypeptide of claim 14, wherein the amino acid substitutions comprise positions 33, 35, 37, 39, 40 and/or 41 of SEQ ID NO:97.
57. The cradle polypeptide of claim 14, wherein the amino acid substitutions comprise positions 42, 43, 44, 45, 46, 47 and/or 48 of SEQ ID NO:97.
58. The cradle polypeptide of claim 14, wherein the amino acid substitutions comprise positions 70, 72, 74, 76 and/or 79 of SEQ ID NO:97.
59. The cradle polypeptide of claim 14, wherein the amino acid substitutions comprise positions 80, 81, 82, 83, 84 and/or 85 of SEQ ID NO:97.
60. The cradle polypeptide of claim 16, wherein the amino acid substitutions comprise positions 31, 33, 35, 37 and/or 39 of SEQ ID NO:129.
61. The cradle polypeptide of claim 16, wherein the amino acid substitutions comprise positions 40, 41, 42, 43 and/or 44 of SEQ ID NO:129.
62. The cradle polypeptide of claim 16, wherein the amino acid substitutions comprise positions 66, 68, 70, 72 and/or 75 of SEQ ID NO:129.
63. The cradle polypeptide of claim 16, wherein the amino acid substitutions comprise positions 76, 77, 78, 79, 80 and/or 81 of SEQ ID NO:129.
64. The cradle polypeptide of claim 14, comprising an amino acid sequence set forth in SEQ ID NO:468.
65. The cradle polypeptide of claim 15, comprising an amino acid sequence set forth in SEQ ID NO:469.
66. The cradle polypeptide of claim 16, comprising an amino acid sequence set forth in SEQ ID NO:470.
67. A chimeric cradle polypeptide according to any one of claims 1-66, wherein part of the cradle polypeptide is replaced by a non-FnIII domain polypeptide that enhances the binding affinity of the cradle polypeptide for a target molecule.
68. The chimeric cradle polypeptide of claim 67, wherein the non-FnIII domain polypeptide is all or part of a complementarity determining region (CDR) of an antibody or a T-cell receptor.
69. The chimeric cradle polypeptide of claim 68, wherein the CDR is a CDR1, CDR2 or CDR3 of a single domain antibody.
70. The chimeric cradle polypeptide of claim 69, wherein the single domain antibody is a nanobody.
71. The chimeric cradle polypeptide of claim 68, wherein the CDR replaces part or all of the AB, BC, CD, DE, EF or FG loop.
72. A multispecific cradle polypeptide comprising multiple copies of one or more monomer cradle polypeptides according to any one of claims 1-71.
73. The multispecific cradle polypeptide of claim 72, wherein the monomer cradle polypeptides are linked by a linker sequence.
74. The multispecific cradle polypeptide of claim 73, wherein the linker sequence is selected from the group consisting of GGGGSGGGGS (SEQ ID NO: 471), GSGSGSGSGS (SEQ ID NO: 472), PSTSTST (SEQ ID NO: 473) and EIDKPSQ (SEQ ID NO: 474).
75. A cradle library comprising a plurality of cradle polypeptides according to any one of claims 1-74.
76. The cradle library of claim 75, wherein the cradle polypeptides comprise one or more amino acid substitutions corresponding to amino acid positions 30, 41, 42, 43, 44, 45, 76, 77, 78, 79, 80, 81, 82, 83, 84 and/or 85 of SEQ ID NO:1.
77. The cradle library of claim 76, wherein the cradle polypeptides further comprise one or more amino acid substitutions corresponding to amino acid positions 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 32, 34, 35, 36, 37, 38, 39, 40, 46, 48, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 72, 74, 75, 86, 87, 88, 89, 90, 91, 92, 93 and/or 94 of SEQ ID NO:1.
78. The cradle library of claim 75, wherein the cradle polypeptides are at least 50%, 60%, 70%, 80%, or 90% identical to SEQ ID NO:1.
79. The cradle library according to any one of claims 75-78, wherein the cradle polypeptides further comprise an insertion of at least 1 amino acid in at least one loop region.
80. The cradle library of claim 79, wherein the cradle polypeptides comprise an insertion of at least 2 amino acids in at least one loop region.
81. The cradle library of claim 80, wherein the cradle polypeptides comprise an insertion of about 2-25 amino acids in at least one loop region.
82. The cradle library according to any one of claims 79-81, wherein the cradle polypeptides comprise a deletion of at least 1 amino acid in at least one loop region.
83. The cradle library of claim 82, wherein the cradle polypeptides comprise a deletion of at least 2 amino acids in at least one loop region.
84. The cradle library of claim 83, wherein the cradle polypeptides comprise a deletion of about 2-10 amino acids in at least one loop region.
85. The cradle library according to any one of claims 79, 83 and 84, wherein the cradle polypeptides comprise a deletion of at least 2 amino acids in two loop regions.
86. The cradle library according to any one of claims 79-85, wherein the cradle polypeptides comprise at least 1 amino acid insertion and 1 amino acid deletion in at least one loop region.
87. The cradle library of claim 86, wherein the cradle polypeptides comprise an insertion and deletion of at least 1 amino acid in the same loop region.
88. The cradle library according to any one of claims 79-87, which is pre-selected to bind a target molecule.
89. The cradle library of claim 75, wherein the cradle polypeptides comprise an amino acid sequence selected from the group consisting of SEQ ID NOs: 3, 79, 86 and 468-470.
90. The cradle library of claim 75, wherein the cradle library contains at least 10, 100, 1000, 10^4, 10^5, 10^6, 10^7, 10^8, 10^9, 10^10, 10^11, 10^12, 10^13, 10^14, 10^15 or more different cradle polypeptides.
91. A polynucleotide encoding a cradle polypeptide according to any one of claims 1-74.
92. The polynucleotide of claim 91, wherein the polynucleotide is an expression construct.
93. The polynucleotide of claim 92, wherein the expression construct is capable of expressing the encoded polypeptide in a bacterium, yeast or mammalian system.
94. A polynucleotide library encoding a cradle library according to any one of claims 75-90.
95. A method of producing a cradle polypeptide comprising:
a) expressing a polynucleotide encoding a cradle polypeptide according to any one of claims 91-93 in a host cell; and
b) isolating and/or purifying the expressed cradle polypeptide.
96. A method of forming a cradle library of FnIII domain polypeptides useful in screening for the presence of one or more polypeptides having a selected binding or enzymatic activity, comprising:
(i) aligning beta strands C and F amino acid sequences and loops CD and FG amino acid sequences in a collection of native FnIII domain polypeptides,
(ii) segregating the aligned beta strand and loop sequences according to length,
(iii) for a selected beta strand or loop of the length from step (ii), performing positional amino acid frequency analysis to determine the frequencies of amino acids at each beta strand or loop position,
(iv) for each beta strand, loop, and length analyzed in step (iii), identifying at each position a conserved or selected semi-conserved consensus amino acid and other natural-variant amino acids,
(v) for at least one selected beta strand, loop, and length, forming:
(1) a cradle library of mutagenesis sequences expressed by a library of coding sequences that encode, at each beta strand or loop position, the consensus amino acid, and, if the consensus amino acid has an occurrence frequency equal to or less than a selected threshold frequency of at least 50%, a single common target amino acid and any co-produced amino acids, or
(2) a cradle library of natural-variant combinatorial sequences expressed by a library of coding sequences that encode, at each beta strand or loop position, a consensus amino acid and, if the consensus amino acid has a frequency of occurrence equal to or less than a selected threshold frequency of at least 50%, other natural variant amino acids, including semi-conserved amino acids and variable amino acids whose occurrence rate is above a selected minimum threshold occurrence at that position, or their chemical equivalents,
(vi) incorporating the library of coding sequences into framework FnIII coding sequences to form an FnIII expression library, and
(vii) expressing the FnIII domain polypeptides of the expression library.
97. The method of claim 96, wherein the selected threshold frequency is 100%.
98. The method of claim 96, wherein the selected frequency is about 50-95%.
99. The method according to any one of claims 96-98, wherein step (ii) segregates the loops and loop lengths into the CD loop at about 3-11 residues in length.
100. The method of claim 99, wherein the CD loop is about 3-9 residues in length.
101. The method of claim 99, wherein the CD loop is about 4-9 residues in length.
102. The method of claim 99, wherein the CD loop is 5 residues in length.
103. The method according to any one of claims 96-102, wherein step (ii) segregates the loops and loop lengths into the FG loop at about 1-10 residues in length.
104. The method of claim 103, wherein the FG loop is about 5 or 6 residues in length.
105. The method of claim 103, wherein the FG loop is 5 residues in length.
106. The method of claim 103, wherein the FG loop is 6 residues in length.
107. The method of claim 106, wherein position 1 of the FG loop is a Gly residue, position 2 of the FG loop is a Leu, Val or Ile residue, position 3 of the FG loop is a charged or polar residue, position 4 of the FG loop is a Pro residue, position 5 of the FG loop is a Gly residue, and position 6 of the FG loop is a polar residue.
108. The method according to claim 105, wherein positions 3 and/or 5 of the FG loop are a Gly residue.
109. The method according to any one of claims 96-108, wherein step (ii) segregates the beta strands into the beta strand C at about 6-14 residues in length and the beta strand F at about 8-13 residues in length.
110. The method of claim 109, wherein the beta strand C is about 8-11 residues in length.
111. The method of claim 109, wherein the beta strand C is 9 residues in length.
112. The method of claim 111, wherein positions 2, 4 and 6 of the beta strand C are a hydrophobic residue.
113. The method of claim 111, wherein positions 1, 3, 5 and 7-9 of the beta strand C are altered relative to the wild type sequence.
114. The method of claim 113, wherein position 1 of the beta strand C is selected from the group consisting of Ala, Gly, Pro, Ser, Thr, Asp, Glu, Asn, Gln, His, Lys and Arg.
115. The method of claim 113, wherein position 3 of the beta strand C is a hydrophobic residue.
116. The method of claim 113, wherein position 3 of the beta strand C is selected from the group consisting of Ile, Val, Arg, Leu, Thr, Glu, Lys, Ser, Gln and His.
117. The method of claim 113, wherein positions 5 and 7-9 of the beta strand C are selected from the group consisting of Ala, Gly, Pro, Ser, Thr, Asp, Glu, Asn, Gln, His, Lys and Arg.
118. The method of claim 109, wherein the beta strand F is about 9-11 residues in length.
119. The method of claim 109, wherein the beta strand F is 10 residues in length.
120. The method of claim 119, wherein positions 1, 3, 5 and 10 of the beta strand F are altered relative to the wild type sequence.
121. The method of claim 120, wherein positions 1, 3, 5 and 10 of the beta strand F are selected from the group consisting of Ala, Gly, Pro, Ser, Thr, Asp, Glu, Asn, Gln, His, Lys and Arg.
122. The method of claim 119, wherein positions 2, 4 and 6 of the beta strand F are a hydrophobic residue.
123. The method of claim 119, wherein position 7 of the beta strand F is a hydrophobic residue.
124. The method of claim 119, wherein position 7 of the beta strand F is selected from the group consisting of Arg, Tyr, Ala, Thr and Val.
125. The method of claim 119, wherein position 8 of the beta strand F is selected from the group consisting of Ala, Gly, Ser, Val and Pro.
126. The method of claim 119, wherein position 9 of the beta strand F is selected from the group consisting of Val, Leu, Glu, Arg and Ile.
127. The method according to any one of claims 96-126, wherein the native FnIII domain polypeptides comprise the wild type amino acid sequences of the 1st, 2nd, 3rd, 4th, 5th, 6th, 7th, 8th, 9th, 10th, 11th, 12th, 13th, 14th, 15th and/or 16th FnIII domain of human fibronectin.
128. The method of claim 127, wherein the native FnIII domain polypeptides comprise the wild type amino acid sequences of the 7th, 10th and/or 14th FnIII domain of human fibronectin.
129. A cradle library useful in screening for the presence of one or more cradle polypeptides having a selected binding or enzymatic activity designed using a method of forming a cradle library of FnIII domain polypeptides according to any one of claims 96-128.
130. A method of identifying a cradle polypeptide having a desired binding affinity to a target molecule, comprising:
a) reacting a cradle library of FnIII domain polypeptides according to any one of claims 75-90 and 129 with the target molecule, and
b) screening the cradle library of FnIII domain polypeptides to select those having a desired binding affinity to the target molecule.
131. The method of claim 130, wherein the method further comprises the step of identifying the polynucleotides that encode the selected polypeptides.
132. The method according to claim 130 or 131, wherein the target molecule is selected from the group consisting of lysozyme, human Fc, human serum albumin (HSA), human small ubiquitin-like modifier 1 (SUMO1), human ubiquitin, human Abl SH2 domain, human Scm-like with four mbt domains 2 (SFMBT2), sex comb on midleg homolog 1 (SCMH1) domain and green fluorescent protein (GFP).
133. A cradle polypeptide selected using the methods according to any one of claims 130-132.
134. The cradle polypeptide of claim 133 comprising an amino acid sequence selected from the group consisting of SEQ ID NOs:4-78, 80-85, 87-96, 98, 99, 101-128, 130-141, 143, 145-147, 149-159, 161-199, 201-238 and 240-277.
135. A kit comprising, in a container, a cradle polypeptide according to any one of claims 1-74, 133 and 134.
136. A kit comprising, in a container, a cradle library according to any one of claims 75-90 and 129.
137. A kit comprising, in a container, a polynucleotide encoding a cradle polypeptide according to any one of claims 91-93.
138. A kit comprising, in a container, a polynucleotide library encoding a cradle library according to any one of claims 75-90 and 129.
28. The cradle polypeptide of claim 27, wherein the beta strand C is 9 residues in length.
29. The cradle polypeptide according to any one of claims 26-28, wherein positions 2, 4, and 6 of the beta strand C are a hydrophobic residue.
30. The cradle polypeptide according to any one of claims 26-29, wherein positions 1, 3, 5 and 7-9 of the beta strand C are altered relative to the wild type sequence.
31. The cradle polypeptide according to any one of claims 26-30, wherein position 1 of the beta strand C is selected from the group consisting of Ala, Gly, Pro, Ser, Thr, Asp, Glu, Asn, Gln, His, Lys and Arg.
32. The cradle polypeptide according to any one of claims 26-31, wherein position 3 of the beta strand C is a hydrophobic residue.
33. The cradle polypeptide according to any one of claims 26-32, wherein position 3 of the beta strand C is selected from the group consisting of Ile, Val, Arg, Leu, Thr, Glu, Lys, Ser, Gln and His.
34. The cradle polypeptide according to any one of claims 26-33, wherein positions 5 and 7-9 of the beta strand C are selected from the group consisting of Ala, Gly, Pro, Ser, Thr, Asp, Glu, Asn, Gln, His, Lys and Arg.
35. The cradle polypeptide according to any one of claims 26-34, wherein the beta strand F is about 9-11 residues in length.
36. The cradle polypeptide of claim 35, wherein the beta strand F is 10 residues in length.
37. The cradle polypeptide according to claim 35 or 36, wherein positions 1, 3, 5 and 10 of the beta strand F are altered relative to the wild type sequence.
38. The cradle polypeptide according to any one of claims 35-37, wherein positions 1, 3, 5 and 10 of the beta strand F are selected from the group consisting of Ala, Gly, Pro, Ser, Thr, Asp, Glu, Asn, Gln, His, Lys and Arg.
39. The cradle polypeptide according to any one of claims 35-38, wherein positions 2, 4 and 6 of the beta strand F are a hydrophobic residue.
40. The cradle polypeptide according to any one of claims 35-39, wherein position 7 of the beta strand F is a hydrophobic residue.
41. The cradle polypeptide according to any one of claims 35-40, wherein position 7 of the beta strand F is selected from the group consisting of Arg, Tyr, Ala, Thr and Val.
42. The cradle polypeptide according to any one of claims 35-41, wherein position 8 of the beta strand F is selected from the group consisting of Ala, Gly, Ser, Val and Pro.
43. The cradle polypeptide according to any one of claims 35-42, wherein position 9 of the beta strand F is selected from the group consisting of Val, Leu, Glu, Arg and Ile.
44. The cradle polypeptide of claim 15, wherein the amino acid substitutions comprise positions 30, 31, 32, 33, 34, 35, 36, 37, 38 and/or 39 of SEQ ID NO:1.
45. The cradle polypeptide of claim 44, wherein the amino acid substitutions comprise positions 31, 33, 35, 37, 38 and/or 39 of SEQ ID NO:1.
46. The cradle polypeptide of claim 45, wherein the amino acid substitutions comprise positions 31 and/or 33 of SEQ ID NO:1.
47. The cradle polypeptide of claim 15, wherein the amino acid substitutions comprise positions 44, 45, 46, 47, 48, 49, 50 and/or 51 of SEQ ID NO:1.
48. The cradle polypeptide of claim 47, wherein the amino acid substitutions comprise positions 44, 45, 47 and/or 49 of SEQ ID NO:1.
49. The cradle polypeptide of claim 15, wherein the amino acid substitutions comprise positions 40, 41, 42, 43, 44 and/or 45 of SEQ ID NO:1.
50. The cradle polypeptide of claim 15, wherein the amino acid substitutions comprise positions 67, 68, 69, 70, 71, 72, 73, 74, 75 and/or 76 of SEQ ID NO:1.
51. The cradle polypeptide of claim 50, wherein the amino acid substitutions comprise positions 67, 69, 71, 73 and/or 76 of SEQ ID NO:1.
52. The cradle polypeptide of claim 50, wherein the amino acid substitutions comprise positions 71, 73, 75 and/or 76 of SEQ ID NO:1.
53. The cradle polypeptide of claim 15, wherein the amino acid substitutions comprise positions 76, 77, 78, 79, 80, 81, 82, 83, 84, 85 and/or 86 of SEQ ID NO:1.
54. The cradle polypeptide of claim 53, wherein the amino acid substitutions comprise positions 84 and/or 85 of SEQ ID NO:1.
55. The cradle polypeptide of claim 15, wherein the amino acid substitutions comprise positions 85, 86, 87, 88, 89, 90, 91, 92, 93 and/or 94 of SEQ ID NO:1.
56. The cradle polypeptide of claim 15, wherein the amino acid substitutions comprise positions 31, 33, 47, 49, 73 and/or 75 of SEQ ID NO:1.
56. The cradle polypeptide of claim 14, wherein the amino acid substitutions comprise positions 33, 35, 37, 39, 40 and/or 41 of SEQ ID NO:97.
57. The cradle polypeptide of claim 14, wherein the amino acid substitutions comprise positions 42, 43, 44, 45, 46, 47 and/or 48 of SEQ ID NO:97.
58. The cradle polypeptide of claim 14, wherein the amino acid substitutions comprise positions 70, 72, 74, 76 and/or 79 of SEQ ID NO:97.
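The positional frequency analysis recited in steps (ii) through (v) of claim 96 can be illustrated with a minimal sketch. It is not taken from the application: the function names, the toy loop sequences and the default thresholds are all hypothetical, and the sketch covers only the natural-variant option (v)(2), ignoring chemical equivalents and codon design.

```python
from collections import Counter, defaultdict


def segregate_by_length(seqs):
    """Step (ii): group pre-aligned strand or loop sequences by length."""
    groups = defaultdict(list)
    for s in seqs:
        groups[len(s)].append(s)
    return dict(groups)


def positional_frequencies(seqs):
    """Step (iii): per-position amino acid frequencies for equal-length sequences."""
    n = len(seqs)
    return [
        {aa: count / n for aa, count in Counter(s[i] for s in seqs).items()}
        for i in range(len(seqs[0]))
    ]


def design_positions(freqs, threshold=0.5, min_occurrence=0.1):
    """Steps (iv) and (v)(2): pick a consensus residue per position and, when its
    frequency is at or below `threshold`, also allow natural variants whose
    frequency is above `min_occurrence`. Both defaults are illustrative only."""
    design = []
    for pos_freq in freqs:
        # Sorting first makes tie-breaking deterministic (alphabetical).
        consensus, f = max(sorted(pos_freq.items()), key=lambda kv: kv[1])
        allowed = [consensus]
        if f <= threshold:
            allowed += sorted(
                aa for aa, p in pos_freq.items()
                if aa != consensus and p > min_occurrence
            )
        design.append((consensus, allowed))
    return design


# Made-up loop sequences, used only to exercise the functions.
toy_loops = ["GRGDS", "GKGDS", "GRGDT", "ARGDS", "GLPGSE", "GVPGTE"]
groups = segregate_by_length(toy_loops)
freqs = positional_frequencies(groups[5])
design = design_positions(freqs, threshold=0.75)
```

With the toy data, the three positions whose consensus residue occurs in 75% of the length-5 sequences are diversified when the threshold is set to 0.75, while the two invariant positions keep only their consensus residue.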
Applications Claiming Priority (11)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US36920310P | 2010-07-30 | 2010-07-30 | |
US36916010P | 2010-07-30 | 2010-07-30 | |
US36922210P | 2010-07-30 | 2010-07-30 | |
US61/369,222 | 2010-07-30 | ||
US61/369,160 | 2010-07-30 | ||
US61/369,203 | 2010-07-30 | ||
US201161474632P | 2011-04-12 | 2011-04-12 | |
US201161474648P | 2011-04-12 | 2011-04-12 | |
US61/474,648 | 2011-04-12 | ||
US61/474,632 | 2011-04-12 | ||
PCT/US2011/046160 WO2012016245A2 (en) | 2010-07-30 | 2011-08-01 | Fibronectin cradle molecules and libraries thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
CA2805862A1 (en) | 2012-02-02 |
Family
ID=44504247
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA2805862A Abandoned CA2805862A1 (en) | 2010-07-30 | 2011-08-01 | Fibronectin cradle molecules and libraries thereof |
Country Status (12)
Country | Link |
---|---|
US (1) | US9512199B2 (en) |
EP (3) | EP2598529A2 (en) |
JP (2) | JP6092773B2 (en) |
KR (1) | KR20130136443A (en) |
CN (2) | CN103403027A (en) |
AU (1) | AU2011283646B2 (en) |
CA (1) | CA2805862A1 (en) |
IL (1) | IL224220A (en) |
MX (1) | MX345300B (en) |
RU (1) | RU2013108962A (en) |
SG (1) | SG187225A1 (en) |
WO (1) | WO2012016245A2 (en) |
Families Citing this family (48)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TW200531979A (en) | 2003-12-05 | 2005-10-01 | Compound Therapeutics Inc | Inhibitors of type 2 vascular endothelial growth factor receptors |
US8470332B2 (en) | 2006-11-22 | 2013-06-25 | Bristol-Myers Squibb Company | Targeted therapeutics based on engineered proteins for tyrosine kinases receptors, including IGF-IR |
AU2009213141A1 (en) | 2008-02-14 | 2009-08-20 | Bristol-Myers Squibb Company | Targeted therapeutics based on engineered proteins that bind EGFR |
EP2274331B1 (en) | 2008-05-02 | 2013-11-06 | Novartis AG | Improved fibronectin-based binding molecules and uses thereof |
CN102099373A (en) | 2008-05-22 | 2011-06-15 | 百时美施贵宝公司 | Multivalent fibronectin based scaffold domain proteins |
PL2356269T3 (en) | 2008-10-31 | 2016-12-30 | | Fibronectin type iii domain based scaffold compositions, methods and uses |
TWI496582B (en) | 2008-11-24 | 2015-08-21 | 必治妥美雅史谷比公司 | Bispecific egfr/igfir binding molecules |
US9139825B2 (en) | 2009-10-30 | 2015-09-22 | Novartis Ag | Universal fibronectin type III bottom-side binding domain libraries |
WO2011137319A2 (en) | 2010-04-30 | 2011-11-03 | Centocor Ortho Biotech Inc. | Stabilized fibronectin domain compositions, methods and uses |
TW201138808A (en) | 2010-05-03 | 2011-11-16 | Bristol Myers Squibb Co | Serum albumin binding molecules |
WO2011150133A2 (en) | 2010-05-26 | 2011-12-01 | Bristol-Myers Squibb Company | Fibronectin based scaffold proteins having improved stability |
WO2012135284A2 (en) | 2011-03-28 | 2012-10-04 | Cornell University | Targeted protein silencing using chimeras between antibodies and ubiquitination enzymes |
WO2012142515A2 (en) | 2011-04-13 | 2012-10-18 | Bristol-Myers Squibb Company | Fc fusion proteins comprising novel linkers or arrangements |
WO2012158678A1 (en) | 2011-05-17 | 2012-11-22 | Bristol-Myers Squibb Company | Methods for maintaining pegylation of polypeptides |
WO2012158739A1 (en) | 2011-05-17 | 2012-11-22 | Bristol-Myers Squibb Company | Improved methods for the selection of binding proteins |
MX358827B (en) | 2011-09-27 | 2018-09-05 | Janssen Biotech Inc | Fibronectin type iii repeat based protein scaffolds with alternative binding surfaces. |
US9522951B2 (en) | 2011-10-31 | 2016-12-20 | Bristol-Myers Squibb Company | Fibronectin binding domains with reduced immunogenicity |
IL270110B (en) | 2012-09-13 | 2022-08-01 | Bristol Myers Squibb Co | Fibronectin based scaffold domain proteins that bind to myostatin |
US20150361159A1 (en) | 2013-02-01 | 2015-12-17 | Bristol-Myers Squibb Company | Fibronectin based scaffold proteins |
EP3406629B1 (en) | 2013-02-06 | 2020-06-24 | Bristol-Myers Squibb Company | Fibronectin type iii domain proteins with enhanced solubility |
WO2014126871A1 (en) | 2013-02-12 | 2014-08-21 | Bristol-Myers Squibb Company | Tangential flow filtration based protein refolding methods |
ES2870802T3 (en) | 2013-02-12 | 2021-10-27 | Bristol Myers Squibb Co | High pH Protein Refolding Methods |
WO2014165093A2 (en) | 2013-03-13 | 2014-10-09 | Bristol-Myers Squibb Company | Fibronectin based scaffold domains linked to serum albumin or a moiety binding thereto |
CN115322253A (en) | 2014-03-20 | 2022-11-11 | 百时美施贵宝公司 | Stabilized fibronectin based scaffold molecules |
MX2016011580A (en) | 2014-03-20 | 2016-11-29 | Bristol Myers Squibb Co | Serum albumin-binding fibronectin type iii domains. |
CA2968357A1 (en) | 2014-11-21 | 2016-05-26 | Bristol-Myers Squibb Company | Antibodies against cd73 and uses thereof |
CA2968961A1 (en) | 2014-11-25 | 2016-06-02 | Bristol-Myers Squibb Company | Methods and compositions for 18f-radiolabeling of biologics |
CN108290941A (en) | 2015-09-23 | 2018-07-17 | 百时美施贵宝公司 | The seralbumin associativity fibronectin type III domain of fast dissociation rate |
EP3353198B1 (en) | 2015-09-23 | 2020-06-17 | Bristol-Myers Squibb Company | Glypican-3binding fibronectin based scafflold molecules |
RU2605326C1 (en) * | 2015-12-07 | 2016-12-20 | Федеральное государственное бюджетное научное учреждение "Научно-исследовательский институт биохимии" (НИИ биохимии) | RECOMBINANT CHIMERIC PROTEIN SUMO3-apoA-I FOR PRODUCING MATURE HUMAN APOLIPOPROTEIN A-I, Pichia pastoris YEAST STRAIN - PRODUCER OF RECOMBINANT CHIMERIC PROTEIN SUMO3-apoA-I AND METHOD OF PRODUCING MATURE HUMAN APOLIPOPROTEIN A-I |
US10994033B2 (en) | 2016-06-01 | 2021-05-04 | Bristol-Myers Squibb Company | Imaging methods using 18F-radiolabeled biologics |
WO2018111978A1 (en) | 2016-12-14 | 2018-06-21 | Janssen Biotech, Inc. | Cd137 binding fibronectin type iii domains |
WO2018111976A1 (en) | 2016-12-14 | 2018-06-21 | Janssen Biotech, Inc. | Pd-l1 binding fibronectin type iii domains |
AU2017378226A1 (en) | 2016-12-14 | 2019-06-20 | Janssen Biotech, Inc. | CD8A-binding fibronectin type III domains |
CN110997921A (en) * | 2017-07-31 | 2020-04-10 | 国立大学法人东京大学 | Super-versatile method for presenting cyclic peptides on protein structures |
US11680091B2 (en) | 2018-02-23 | 2023-06-20 | The University Of Chicago | Methods and composition involving thermophilic fibronectin type III (FN3) monobodies |
WO2019178604A1 (en) * | 2018-03-16 | 2019-09-19 | Cornell University | Broad-spectrum proteome editing with an engineered bacterial ubiquitin ligase mimic |
WO2021076574A2 (en) | 2019-10-14 | 2021-04-22 | Aro Biotherapeutics Company | Fn3 domain-sirna conjugates and uses thereof |
US11628222B2 (en) | 2019-10-14 | 2023-04-18 | Aro Biotherapeutics Company | CD71 binding fibronectin type III domains |
CN111217903B (en) * | 2020-02-25 | 2022-11-15 | 芜湖天明生物技术有限公司 | Recombinant human fibronectin III 1-C and preparation method and application thereof |
JP2023515633A (en) | 2020-02-28 | 2023-04-13 | ブリストル-マイヤーズ スクイブ カンパニー | Radiolabeled fibronectin-based scaffolds and antibodies and their theranostic uses |
US12049483B2 (en) * | 2020-12-05 | 2024-07-30 | New York University | Selective and noncovalent inhibitors of oncogenic RAS mutants |
EP4323409A2 (en) | 2021-04-14 | 2024-02-21 | ARO Biotherapeutics Company | Cd71 binding fibronectin type iii domains |
CN113527525B (en) * | 2021-09-14 | 2021-12-17 | 美慕(北京)科技有限公司 | Recombinant protein and construction method and application thereof |
CN113527526B (en) * | 2021-09-14 | 2021-12-17 | 美慕(北京)科技有限公司 | Recombinant protein and construction method and application thereof |
TW202340235A (en) * | 2021-12-17 | 2023-10-16 | 美商戴納立製藥公司 | Polypeptide engineering, libraries, and engineered cd98 heavy chain and transferrin receptor binding polypeptides |
WO2023145861A1 (en) * | 2022-01-27 | 2023-08-03 | 公益財団法人東京都医学総合研究所 | Anti-ubiquitin synthetic antibody |
CN115976031B (en) * | 2022-07-18 | 2023-06-23 | 烟台市华昕生物医药科技有限公司 | Recombinant fibronectin and application thereof |
Family Cites Families (48)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4554101A (en) | 1981-01-09 | 1985-11-19 | New York Blood Center, Inc. | Identification and preparation of epitopes on antigens and allergens on the basis of hydrophilicity |
DE3329892A1 (en) | 1983-08-18 | 1985-03-07 | Köster, Hubert, Prof. Dr., 2000 Hamburg | METHOD FOR PRODUCING OLIGONUCLEOTIDES |
US4888286A (en) | 1984-02-06 | 1989-12-19 | Creative Biomolecules, Inc. | Production of gene and protein analogs through synthetic gene design using double stranded synthetic oligonucleotides |
US5225539A (en) | 1986-03-27 | 1993-07-06 | Medical Research Council | Recombinant altered antibodies and methods of making altered antibodies |
US6548640B1 (en) | 1986-03-27 | 2003-04-15 | Btg International Limited | Altered antibodies |
US5530101A (en) | 1988-12-28 | 1996-06-25 | Protein Design Labs, Inc. | Humanized immunoglobulins |
US5859205A (en) | 1989-12-21 | 1999-01-12 | Celltech Limited | Humanised antibodies |
ES2078518T3 (en) | 1990-04-05 | 1995-12-16 | Roberto Crea | COMPLETE DISPLACEMENT MUTAGENESIS. |
GB9015198D0 (en) | 1990-07-10 | 1990-08-29 | Brien Caroline J O | Binding substance |
US7063943B1 (en) | 1990-07-10 | 2006-06-20 | Cambridge Antibody Technology | Methods for producing members of specific binding pairs |
US5843701A (en) | 1990-08-02 | 1998-12-01 | Nexstar Pharmaceticals, Inc. | Systematic polypeptide evolution by reverse translation |
EP0940468A1 (en) | 1991-06-14 | 1999-09-08 | Genentech, Inc. | Humanized antibody variable domain |
WO1993007287A1 (en) | 1991-10-11 | 1993-04-15 | Promega Corporation | Coupled transcription and translation in eukaryotic cell-free extract |
US5492817A (en) | 1993-11-09 | 1996-02-20 | Promega Corporation | Coupled transcription and translation in eukaryotic cell-free extract |
US5665563A (en) | 1991-10-11 | 1997-09-09 | Promega Corporation | Coupled transcription and translation in eukaryotic cell-free extract |
US20030036092A1 (en) | 1991-11-15 | 2003-02-20 | Board Of Regents, The University Of Texas System | Directed evolution of enzymes and antibodies |
US5866344A (en) | 1991-11-15 | 1999-02-02 | Board Of Regents, The University Of Texas System | Antibody selection methods using cell surface expressed libraries |
US5922545A (en) | 1993-10-29 | 1999-07-13 | Affymax Technologies N.V. | In vitro peptide and antibody display libraries |
US5605793A (en) | 1994-02-17 | 1997-02-25 | Affymax Technologies N.V. | Methods for in vitro recombination |
US6699658B1 (en) | 1996-05-31 | 2004-03-02 | Board Of Trustees Of The University Of Illinois | Yeast cell surface display of proteins and uses thereof |
US6300065B1 (en) | 1996-05-31 | 2001-10-09 | Board Of Trustees Of The University Of Illinois | Yeast cell surface display of proteins and uses thereof |
US6261804B1 (en) | 1997-01-21 | 2001-07-17 | The General Hospital Corporation | Selection of proteins using RNA-protein fusions |
JP3692542B2 (en) | 1997-01-21 | 2005-09-07 | ザ ジェネラル ホスピタル コーポレーション | Protein selection using RNA-protein fusions |
JP4086325B2 (en) | 1997-04-23 | 2008-05-14 | プリュックテュン,アンドレアス | Method for identifying a nucleic acid molecule encoding a (poly) peptide that interacts with a target molecule |
CA2293632C (en) * | 1997-06-12 | 2011-11-29 | Research Corporation Technologies, Inc. | Artificial antibody polypeptides |
IL143238A0 (en) | 1998-12-02 | 2002-04-21 | Phylos Inc | Dna-protein fusions and uses thereof |
ATE551352T1 (en) | 1999-07-27 | 2012-04-15 | Bristol Myers Squibb Co | PEPTIDE ACCEPTOR BINDING METHOD |
US7022479B2 (en) | 2000-01-24 | 2006-04-04 | Compound Therapeutics, Inc. | Sensitive, multiplexed diagnostic assays for protein analysis |
EP2385067A1 (en) * | 2000-07-11 | 2011-11-09 | Research Corporation Technologies, Inc. | Artificial antibody polypeptides |
AU2002213251B2 (en) * | 2000-10-16 | 2007-06-14 | Bristol-Myers Squibb Company | Protein scaffolds for antibody mimics and other binding proteins |
US7094571B2 (en) | 2000-10-27 | 2006-08-22 | The Board Of Regents Of The University Of Texas System | Combinatorial protein library screening by periplasmic expression |
US7598352B2 (en) * | 2000-11-17 | 2009-10-06 | University Of Rochester | Method of identifying polypeptide monobodies which bind to target proteins and use thereof |
US20030157561A1 (en) | 2001-11-19 | 2003-08-21 | Kolkman Joost A. | Combinatorial libraries of monomer domains |
US6951725B2 (en) | 2001-06-21 | 2005-10-04 | Compound Therapeutics, Inc. | In vitro protein interaction detection systems |
AU2003243436A1 (en) | 2002-06-06 | 2003-12-22 | Shohei Koide | Reconstituted polypeptides |
WO2004041865A2 (en) | 2002-11-08 | 2004-05-21 | Ablynx N.V. | Stabilized single domain antibodies |
RU2005140664A (en) | 2003-06-27 | 2007-08-27 | Байорен, Инк. (Us) | VIEWING MUTAGENESIS |
BRPI0513155B1 (en) | 2004-07-06 | 2021-07-20 | Bioren, Inc. | METHOD OF DISTINGUISHING ONE OR MORE FUNCTIONAL AMINO ACID RESIDUES FROM NON-FUNCTIONAL AMINO ACID RESIDUES IN A DEFINED REGION WITHIN A POLYPEPTID |
AR056857A1 (en) | 2005-12-30 | 2007-10-24 | U3 Pharma Ag | DIRECTED ANTIBODIES TO HER-3 (RECEIVER OF THE HUMAN EPIDERMAL GROWTH FACTOR-3) AND ITS USES |
US20100055093A1 (en) | 2006-06-12 | 2010-03-04 | Receptor Biologix Inc. | Pan-cell surface receptor-specific therapeutics |
CN201113075Y (en) | 2007-07-10 | 2008-09-10 | 富士康(昆山)电脑接插件有限公司 | Power supply connector |
WO2009023184A2 (en) * | 2007-08-10 | 2009-02-19 | Protelix, Inc. | Universal fibronectin type iii binding-domain libraries |
WO2009023266A1 (en) | 2007-08-14 | 2009-02-19 | Ludwig Institute For Cancer Research | Generation of antibodies to cell-surface receptors and cancer-associated proteins including egfr family members |
EP2205629A2 (en) | 2007-10-16 | 2010-07-14 | Symphogen A/S | Compositions comprising optimized her1 and her3 multimers and methods of use thereof |
EP2274331B1 (en) | 2008-05-02 | 2013-11-06 | Novartis AG | Improved fibronectin-based binding molecules and uses thereof |
TWI496582B (en) * | 2008-11-24 | 2015-08-21 | 必治妥美雅史谷比公司 | Bispecific egfr/igfir binding molecules |
US9139825B2 (en) * | 2009-10-30 | 2015-09-22 | Novartis Ag | Universal fibronectin type III bottom-side binding domain libraries |
PT3351558T (en) | 2009-11-13 | 2020-04-09 | Daiichi Sankyo Europe Gmbh | Material and methods for treating or preventing her-3 associated diseases |
2011
- 2011-08-01 CN CN2011800474318A patent/CN103403027A/en active Pending
- 2011-08-01 EP EP11746713.4A patent/EP2598529A2/en not_active Withdrawn
- 2011-08-01 MX MX2013001144A patent/MX345300B/en active IP Right Grant
- 2011-08-01 CA CA2805862A patent/CA2805862A1/en not_active Abandoned
- 2011-08-01 EP EP17167946.7A patent/EP3248987A1/en not_active Withdrawn
- 2011-08-01 EP EP16187306.2A patent/EP3222631A3/en not_active Withdrawn
- 2011-08-01 RU RU2013108962/10A patent/RU2013108962A/en not_active Application Discontinuation
- 2011-08-01 CN CN201710585677.XA patent/CN107903321A/en active Pending
- 2011-08-01 SG SG2013006580A patent/SG187225A1/en unknown
- 2011-08-01 JP JP2013523251A patent/JP6092773B2/en not_active Expired - Fee Related
- 2011-08-01 KR KR1020137005001A patent/KR20130136443A/en not_active Application Discontinuation
- 2011-08-01 AU AU2011283646A patent/AU2011283646B2/en not_active Ceased
- 2011-08-01 US US13/813,409 patent/US9512199B2/en active Active
- 2011-08-01 WO PCT/US2011/046160 patent/WO2012016245A2/en active Application Filing
2013
- 2013-01-14 IL IL224220A patent/IL224220A/en active IP Right Grant
2016
- 2016-06-23 JP JP2016124152A patent/JP2016192973A/en active Pending
Also Published As
Publication number | Publication date |
---|---|
IL224220A (en) | 2017-07-31 |
CN107903321A (en) | 2018-04-13 |
MX345300B (en) | 2017-01-24 |
EP3248987A1 (en) | 2017-11-29 |
CN103403027A (en) | 2013-11-20 |
AU2011283646A1 (en) | 2013-03-07 |
RU2013108962A (en) | 2014-09-10 |
MX2013001144A (en) | 2013-06-05 |
EP3222631A3 (en) | 2017-11-22 |
EP2598529A2 (en) | 2013-06-05 |
US20140057807A1 (en) | 2014-02-27 |
JP2013539362A (en) | 2013-10-24 |
KR20130136443A (en) | 2013-12-12 |
WO2012016245A3 (en) | 2012-08-09 |
SG187225A1 (en) | 2013-02-28 |
JP2016192973A (en) | 2016-11-17 |
US9512199B2 (en) | 2016-12-06 |
AU2011283646B2 (en) | 2015-07-09 |
EP3222631A2 (en) | 2017-09-27 |
WO2012016245A2 (en) | 2012-02-02 |
JP6092773B2 (en) | 2017-03-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9512199B2 (en) | | Fibronectin cradle molecules and libraries thereof |
Koide et al. | | Teaching an old scaffold new tricks: monobodies constructed using alternative surfaces of the FN3 scaffold |
US10253313B2 (en) | | Universal fibronectin type III bottom-side binding domain libraries |
Koide et al. | | Exploring the capacity of minimalist protein interfaces: interface energetics and affinity maturation to picomolar KD of a single-domain antibody with a flat paratope |
CA2900745C (en) | | Scaffold proteins derived from plant cystatins |
US20110124527A1 (en) | | Universal fibronectin type iii binding-domain libraries |
Banner et al. | | Mapping the conformational space accessible to BACE2 using surface mutants and cocrystals with Fab fragments, Fynomers and Xaperones |
Ravindran et al. | | Improvement of the crystallizability and expression of an RNA crystallization chaperone |
EP1773994A2 (en) | | Polypeptide |
Nishi et al. | | Ligation-based assembly for constructing mouse synthetic scFv libraries by chain shuffling with in vivo-amplified VH and VL fragments |
Venegas | | Generating and Characterizing Phosphospecific Affinity Reagents Using the Forkhead-Associated 1 Domain |
Huang | | Design of leaps in protein function and implications for protein evolution |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| EEER | Examination request | Effective date: 20160729 |
| FZDE | Discontinued | Effective date: 20200113 |