Abstract
We consider a class of probabilistic-logic models that extends the expressive power of hidden Markov models (HMMs) and stochastic context-free grammars (SCFGs) from regular and context-free languages to, in principle, Turing-complete languages. In general, such models are computationally far too complex for direct use, so optimization by pruning and approximation is needed. We take the first steps towards a methodology for optimizing such models by approximation, either using auxiliary models for preprocessing or splitting them into submodels. Evaluating such approximating models is challenging because authoritative test data may be sparse. On the other hand, the original complex models can generate artificial evaluation data by efficient sampling, although this does not constitute a foolproof test procedure. The models and evaluation processes are illustrated in the PRISM system, developed by other authors, and we discuss their applicability and limitations.
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Christiansen, H., Torp Lassen, O. (2009). Preprocessing for Optimization of Probabilistic-Logic Models for Sequence Analysis. In: Hill, P.M., Warren, D.S. (eds) Logic Programming. ICLP 2009. Lecture Notes in Computer Science, vol 5649. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02846-5_11
DOI: https://doi.org/10.1007/978-3-642-02846-5_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02845-8
Online ISBN: 978-3-642-02846-5
eBook Packages: Computer Science, Computer Science (R0)