Abstract
This paper considers the problem of quantifying literary style and examines several variables that may serve as stylistic “fingerprints” of a writer. It then reviews work on the statistical analysis of “change over time” in literary style and closes with a specific application area: the authorship of Biblical texts.
Author information
David Holmes is a Principal Lecturer in Statistics at the University of the West of England, Bristol, with specific responsibility for co-ordinating the research programmes in the Department of Mathematical Sciences. He has taught literary style analysis to humanities students since 1983 and has published articles on the statistical analysis of literary style in the Journal of the Royal Statistical Society, History and Computing, and Literary and Linguistic Computing. He presented papers at the ACH/ALLC conferences in 1991 and 1993.
Cite this article
Holmes, D.I. Authorship attribution. Comput Hum 28, 87–106 (1994). https://doi.org/10.1007/BF01830689
DOI: https://doi.org/10.1007/BF01830689