[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to main content

Longest Motifs with a Functionally Equivalent Central Block

  • Conference paper
String Processing and Information Retrieval (SPIRE 2004)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 3246))

Included in the following conference series:

  • 753 Accesses

Abstract

This paper presents a generalization of the notion of longest repeats with a block of k don’t care symbols introduced by [8] (for k fixed) to longest motifs composed of three parts: a first and last that parameterize match (that is, match via some symbol renaming, initially unknown), and a functionally equivalent central block. Such three-part motifs are called longest block motifs. Different types of functional equivalence, and thus of matching criteria for the central block are considered, which include as a subcase the one treated in [8] and extend to the case of regular expressions with no Kleene closure or complement operation. We show that a single general algorithmic tool that is a non-trivial extension of the ideas introduced in [8] can handle all the various kinds of longest block motifs defined in this paper. The algorithm complexity is, in all cases, in O(n log n).

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
£29.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
GBP 19.95
Price includes VAT (United Kingdom)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
GBP 35.99
Price includes VAT (United Kingdom)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
GBP 44.99
Price includes VAT (United Kingdom)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. Amir, A., Aumann, Y., Cole, R., Lewenstein, M., Porat, E.: Function matching: Algorithms, applications, and a lower bound. In: Baeten, J.C.M., Lenstra, J.K., Parrow, J., Woeginger, G.J. (eds.) ICALP 2003. LNCS, vol. 2719, pp. 929–942. Springer, Heidelberg (2003)

    Chapter  Google Scholar 

  2. Baker, B.S.: Parameterized pattern matching: Algorithms and applications. J. Comput. Syst. Sci. 52(1), 28–42 (1996)

    Article  MATH  Google Scholar 

  3. Baker, B.S.: Parameterized duplication in strings: Algorithms and an application to software maintenance. SIAM J. Computing 26(5), 1343–1362 (1997)

    Article  MATH  Google Scholar 

  4. Brazma, A., Jonassen, I., Eidhammer, I., Gilbert, D.: Approaches to the automatic discovery of patterns in biosequences. J. of Computational Biology 5, 277–304 (1997)

    Google Scholar 

  5. Brodal, G.S., Lyngsø, R.B., Pederson, C.N.S., Stoye, J.: Finding maximal pairs with bounded gaps. J. of Discrete Algorithms 1(1), 1–27 (2000)

    Google Scholar 

  6. Chan, H.S., Dill, K.A.: Compact polymers. Macromolecules 22, 4559–4573 (1989)

    Article  Google Scholar 

  7. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms, 2nd edn. MIT Press, Cambridge (1998)

    Google Scholar 

  8. Crochemore, M., Iliopoulos, C.S., Mohamed, M., Sagot, M.-F.: Longest repeated motif with a block of don’t cares. In: Farach-Colton, M. (ed.) LATIN 2004. LNCS, vol. 2976, pp. 271–278. Springer, Heidelberg (2004)

    Chapter  Google Scholar 

  9. Gusfield, D.: Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge University Press, Cambridge (1997)

    Book  MATH  Google Scholar 

  10. Harel, D., Tarjan, R.E.: Fast algorithms for finding nearest common ancestors. SIAM J. on Computing 13, 338–355 (1984)

    Article  MATH  MathSciNet  Google Scholar 

  11. Karlin, S., Ghandour, G.: Multiple-alphabet amino acid sequence comparisons of the immunoglobulin kappa-chain constant domain. Proc. Natl. Acad. Sci. USA 82(24), 8597–8601 (1985)

    Article  Google Scholar 

  12. Kolpakov, R., Kucherov, G.: Finding repeats with fixed gaps. In: Proc. of SPIRE 2002, pp. 162–168 (2002)

    Google Scholar 

  13. Li, T., Fan, K., Wang, J., Wang, W.: Reduction of protein sequence complexity by residue grouping. Protein Eng. (5), 323–330 (2003)

    Article  Google Scholar 

  14. Liu, X., Liu, D., Qi, J., Zheng, W.M.: Simplified amino acid alphabets based on deviation of conditional probability from random background. Phys. Rev. E 66, 1–9 (2002)

    Google Scholar 

  15. Lothaire, M.: Combinatorics on Words. Cambridge University Press, Cambridge (1997)

    Book  MATH  Google Scholar 

  16. Lothaire, M.: Algebraic Combinatorics on Words. Cambridge University Press, Cambridge (2002)

    MATH  Google Scholar 

  17. Lothaire, M.: Applied Combinatorics on Words. in preparation (2004), http://igm.univ-mlv.fr/~berstel/Lothaire/index.html

  18. McCreight, E.M.: A space economical suffix tree construction algorithm. J. of ACM 23, 262–272 (1976)

    Article  MATH  MathSciNet  Google Scholar 

  19. Murphy, L.R., Wallqvist, A., Levy, R.M.: Simplified amino acid alphabets for protein fold recognition and implications for folding. Protein. Eng. 13, 149–152 (2000)

    Article  Google Scholar 

  20. Schieber, B., Vishkin, U.: On finding lowest common ancestors: Simplification and parallelization. Siam J. on Computing 17, 1253–1262 (1988)

    Article  MATH  MathSciNet  Google Scholar 

  21. Spitzer, M., Fuellen, G., Cullen, P., Lorkowsk, S.: Viscose: Visualisation and comparison of consensus sequences. Bioinformatics (2004) (to appear)

    Google Scholar 

  22. Wang, J., Wang, W.: A computational approach to simplifying the protein folding alphabet. Nat. Struct. Biol. 11, 1033–1038 (1999)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Crochemore, M., Giancarlo, R., Sagot, MF. (2004). Longest Motifs with a Functionally Equivalent Central Block. In: Apostolico, A., Melucci, M. (eds) String Processing and Information Retrieval. SPIRE 2004. Lecture Notes in Computer Science, vol 3246. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30213-1_42

Download citation

  • DOI: https://doi.org/10.1007/978-3-540-30213-1_42

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23210-0

  • Online ISBN: 978-3-540-30213-1

  • eBook Packages: Springer Book Archive

Publish with us

Policies and ethics