Abstract
Data for network-analytic projects may be contained, explicitly or implicitly, in natural-language texts that are unstructured or semi-structured. In such cases, relation extraction methods make it possible to obtain or enrich network data. The following examples illustrate application areas for this family of methods: Analysts in business and public administration draw information about the composition, efficiency, and development of organizations from reports by and about those organizations (Corman et al. 2002; Krackhardt 1987). Cognitive and social scientists use interviews to study who raises which topics and how they connect them (Carley and Palmquist 1991; Collins and Loftus 1975). Journalists and analysts search news reports and archives for the participants, subject, cause, course, place, time, and interrelations of events (Gerner et al. 1994; van Cuilenburg et al. 1986). Market researchers analyze customer reviews to find out which brands and products evoke which sentiments (Wiebe 2000). Internet researchers trace the actor-related diffusion of topics on the internet (Adar and Adamic 2005; Kleinberg 2003). Users send search engines queries whose answers require information from more than one web page (Berners-Lee et al. 2001; Brin 1999). What all these tasks have in common is that they can be solved by identifying the relevant pieces of information (nodes) and their connections (edges) in texts, representing them, and evaluating them with network-analytic methods (McCallum 2005). In this chapter we explain under which conditions extracting relational data from texts is worthwhile and which methods are available for doing so, and we point out limitations and as yet unsolved problems of the methodology.
5 References
Adar, Eytan and Lada A. Adamic, 2005: Tracking Information Epidemics in Blogspace. Proc. of IEEE/WIC/ACM International Conference on Web Intelligence, September 2005, Compiègne, France: 207–214.
Allen, James F. and Alan M. Frisch, 1982: What's in a semantic network? Proc. of 20th Annual Meeting of the Association for Computational Linguistics, Toronto, Canada: 19–27.
Baker, Wayne E. and Robert R. Faulkner, 1993: The Social Organization of Conspiracy: Illegal Networks in the Heavy Electrical Equipment Industry. American Sociological Review 58(6): 837–860.
Berelson, Bernard, 1952: Content analysis in communication research. Glencoe, Ill: Free Press.
Bernard, H. Russell and Gery W. Ryan, 1998: Text analysis: Qualitative and quantitative methods. Pp. 595–646 in: H. Russell Bernard (ed.), Handbook of methods in cultural anthropology. Walnut Creek: Altamira Press.
Berners-Lee, Tim, James Hendler and Ora Lassila, 2001: The Semantic Web. Scientific American 284(5): 34–43.
Brin, Sergey, 1999: Extracting Patterns and Relations from the World Wide Web. WebDB Workshop at 6th International Conference on Extending Database Technology (EDBT), March 1998, Valencia, Spain: 172–183.
Bunescu, Razvan and Raymond J. Mooney, 2007: Statistical Relational Learning for Natural Language Information Extraction. Pp. 535–552 in: Lise Getoor and Ben Taskar (eds.), Statistical Relational Learning. Cambridge: MIT Press.
Burt, Ronald and Nan Lin, 1977: Network Time Series from Archival Records. Pp. 224–254 in: David R. Heise (ed.), Sociological Methodology. San Francisco, CA: Jossey-Bass.
Buzan, Tony, 1984: Make the Most of Your Mind. New York, NY: Simon and Schuster.
Cafarella, Michael J., Michele Banko and Oren Etzioni, 2006: Relational web search. Proc. of World Wide Web Conference (WWW), May 2006, Edinburgh, UK.
Carley, Kathleen M., 1997: Network text analysis: The network position of concepts. Pp. 79–100 in: Carl W. Roberts (ed.), Text analysis for the social sciences: Methods for drawing statistical inferences from texts and transcripts. Mahwah, NJ: Lawrence Erlbaum Associates.
Carley, Kathleen M., Jana Diesner, Jeffrey Reminga and Maksim Tsvetovat, 2007: Toward an interoperable dynamic network analysis toolkit. Decision Support Systems 43(3): 1324–1347.
Carley, Kathleen M. and Michael Palmquist, 1991: Extracting, Representing, and Analyzing Mental Models. Social Forces 70(3): 601–636.
Central Intelligence Agency. World Factbook: Available from: www.cia.gov/library/publications/the-world-factbook/.
Chomsky, Noam, 1956: Three models for the description of language. IRE Transactions on Information Theory 2(3): 113–124.
Collins, Allan M. and Elizabeth F. Loftus, 1975: A spreading-activation theory of semantic processing. Psychological Review 82: 407–428.
Corman, Stephen R., Timothy Kuhn, Robert D. McPhee and Kevin J. Dooley, 2002: Studying Complex Discursive Systems: Centering Resonance Analysis of Communication. Human Communication Research 28: 157–206.
Culotta, Aron, Andrew McCallum and Jonathan Betz, 2006: Integrating probabilistic extraction models and data mining to discover relations and patterns in text. Proc. Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics (HLT-NAACL), June 2006, New York, NY.
Danowski, James A., 1993: Network Analysis of Message Content. Progress in Communication Sciences 12: 198–221.
Diesner, Jana and Kathleen M. Carley, 2005: Revealing Social Structure from Texts: Meta-Matrix Text Analysis as a novel method for Network Text Analysis. Pp. 81–108 in: V. K. Narayanan and Deborah J. Armstrong (eds.), Causal Mapping for Information Systems and Technology Research: Approaches, Advances, and Illustrations. Harrisburg, PA: Idea Group Publishing.
Diesner, Jana and Kathleen M. Carley, 2008: Conditional Random Fields for Entity Extraction and Ontological Text Coding. Journal of Computational and Mathematical Organization Theory 14(3): 248–262.
Diesner, Jana and Kathleen M. Carley, 2009a: WYSIWII - What You See Is What It Is: Informed Approximation of Relational Data from Texts. Presentation, General Online Research (GOR), April 2009, Vienna, Austria.
Diesner, Jana and Kathleen M. Carley, 2009b: He says, she says. Pat says, Tricia says. How much reference resolution matters for entity extraction, relation extraction, and social network analysis. Proc. of IEEE Symposium on Computational Intelligence for Security and Defence Applications (CISDA), July 2009, Ottawa, Canada.
Diesner, Jana, Kathleen M. Carley and Harald Katzmair, 2007: The morphology of a breakdown. How the semantics and mechanics of communication networks from an organization in crises relate. Presentation, XXVII Sunbelt Social Network Conference, May 2007, Corfu, Greece.
Diesner, Jana, Terrill L. Frantz and Kathleen M. Carley, 2005: Communication Networks from the Enron Email Corpus “It's Always About the People. Enron is no Different”. Journal of Computational and Mathematical Organization Theory 11(3): 201–228.
Dietterich, Thomas G., 2002: Machine Learning for Sequential Data: A Review. Proc. of Joint IAPR International Workshops SSPR 2002 and SPR 2002, August 2002, Windsor, ON, Canada: 15–33.
Doddington, George, Alexis Mitchell, Mark Przybocki, Lance Ramshaw, Stephanie Strassel and Ralph Weischedel, 2004: The Automatic Content Extraction (ACE) Program: Tasks, Data, and Evaluation. Proc. of Language Resources and Evaluation Conference (LREC), May 2004, Lisbon, Portugal: 837–840.
Doerfel, Marya, 1998: What Constitutes Semantic Network Analysis? A Comparison of Research and Methodologies. Connections 21(2): 16–26.
Doerfel, Marya and George A. Barnett, 1999: A Semantic Network Analysis of the International Communication Association. Human Communication Research 25(4): 589–603.
Fellbaum, Christiane, 1998: WordNet: An electronic lexical database. Cambridge MA: MIT Press.
Fillmore, Charles J., 1982: Frame Semantics. Pp. 111–137 in: The Linguistic Society of Korea (ed.), Linguistics in the morning calm. Seoul, South Korea: Hanshin Publishing Co.
Fillmore, Charles J., 1968: The Case for Case. Pp. 1–88 in: Emmon Bach and Robert T. Harms (eds.), Universals in Linguistic Theory. New York: Holt, Rinehart and Winston.
Frank, Ove, 2004: Network sampling and model fitting. Pp. 31–56 in: Peter J. Carrington, John Scott and Stanley Wasserman (eds.), Models and methods in social network analysis. New York: Cambridge University Press.
Franzosi, Roberto, 1989: From words to numbers: A generalized and linguistics-based coding procedure for collecting textual data. Sociological Methodology 19: 225–257.
Gerner, Deborah, Phillip A. Schrodt, Ronald A. Francisco and Judith L. Weddle, 1994: Machine Coding of Event Data Using Regional and International Sources. International Studies Quarterly 38(1): 91–119.
Glaser, B. and A. Strauss, 1967: The Discovery of Grounded Theory: Strategies for Qualitative Research. New York, NY: Aldine.
Grishman, Ralph and Beth Sundheim, 1996: Message understanding conference - 6: A brief history. Proc. of 16th International Conference on Computational Linguistics, Copenhagen, Denmark, June 1996.
Hartley, Roger and John Barnden, 1997: Semantic networks: visualizations of knowledge. Trends in Cognitive Sciences 1(5): 169–175.
Howard, Ronald A., 1989: Knowledge maps. Management Science 35(8): 903–922.
Janas, Jürgen and Camilla Schwind, 1979: Extensional Semantic Networks. Pp. 267–302 in: Nicholas V. Findler (ed.), Associative Networks. Representation and Use of Knowledge by Computers. New York: Academic Press.
Johnson-Laird, Phil N., 2005: The history of mental models. Pp. 179–212 in: Ken Manktelow and Man C. Chung (eds.), Psychology of Reasoning: Theoretical and Historical Perspectives. London: Psychology Press.
Jurafsky, Daniel and James H. Martin, 2000: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Upper Saddle River, NJ: Prentice Hall.
King, Gary and Will Lowe, 2003: An Automated Information Extraction Tool for International Conflict Data with Performance as Good as Human Coders: A Rare Events Evaluation Design. International Organization 57(3): 617–642.
Kleene, Stephen, 1956: Representation of events in nerve nets and finite automata. Pp. 3–41 in: Claude Shannon and John McCarthy (eds.), Automata Studies. Princeton, NJ: Princeton University Press.
Kleinberg, Jon, 2003: Bursty and Hierarchical Structure in Streams. Data Mining and Knowledge Discovery 7(4): 373–397.
Krackhardt, David, 1987: Cognitive social structures. Social Networks 9: 109–134.
Krebs, Valdis E., 2002: Mapping networks of terrorist cells. Connections 24(3): 43–52.
Krippendorff, Klaus, 2004: Content analysis: An introduction to its methodology. Thousand Oaks CA: Sage.
Lafferty, John, Andrew McCallum and Fernando Pereira, 2001: Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. Proc. of 18th International Conference on Machine Learning, June 2001, Williamstown, MA: 282–289.
Lewins, Ann and Christina Silver, 2007: Using software in qualitative research: a step-by-step guide. London: Sage.
McCallum, Andrew, 2005: Information extraction: distilling structured data from unstructured text. ACM Queue 3(9): 48–57.
Miller, Scott, Heidi Fox, Lance Ramshaw and Ralph Weischedel, 2000: A novel use of statistical parsing to extract information from text. Proc. of 1st Conference of North American chapter of the Association for Computational Linguistics (NAACL), Seattle, WA: 226–233.
Minsky, Marvin, 1974: A Framework for Representing Knowledge. MIT-AI Laboratory Memo 306.
Mitchell, Tom, 1997: Machine Learning. New York: McGraw-Hill.
Mohr, John W., 1998: Measuring Meaning Structures. Annual Review of Sociology 24(1): 345–370.
Norvig, Peter and Stuart Russell, 1995: Artificial Intelligence: A Modern Approach. Upper Saddle River: Pearson Education.
Novak, Joseph D. and Alberto Cañas, 2008: The Theory Underlying Concept Maps and How to Construct Them. Florida Institute for Human and Machine Cognition, Report No. IHMC CmapTools Rev 01–2008.
Osgood, Charles E., 1959: The representational model and relevant research methods. Pp. 33–88 in: Ithiel de Sola Pool (ed.), Trends in content analysis. Urbana, IL: University of Illinois Press.
Pearl, Judea, 1988: Probabilistic reasoning in intelligent systems: networks of plausible inference. San Francisco: Morgan Kaufmann.
Petri, Carl Adam, 1962: Kommunikation mit Automaten. University of Bonn, Ph.D. dissertation.
Richards, Tom, 2002: An intellectual history of NUD*IST and NVivo. International Journal of Social Research Methodology 5(3): 199–214.
Roberts, Carl W., 1997: A Generic Semantic Grammar for Quantitative Text Analysis: Applications to East and West Berlin Radio News Content from 1979. Sociological Methodology 27: 89–129.
Roberts, Carl W., 2000: A Conceptual Framework for Quantitative Text Analysis. Quality and Quantity 34(3): 259–274.
Rumelhart, David E., 1981: Schemata: The building blocks of cognition. Comprehension and teaching: Research reviews: 3–26.
Schrodt, Phillip A., Ömür Yilmaz, Deborah J. Gerner and Dennis Hermick, 2008: Coding Sub-State Actors using the CAMEO (Conflict and Mediation Event Observations) Actor Coding Framework. Presentation, Annual Meeting of the International Studies Association, March 2008, San Francisco, CA.
Seibel, Wolfgang and Jörg Raab, 2003: Verfolgungsnetzwerke. Kölner Zeitschrift für Soziologie und Sozialpsychologie 55(2): 197–230.
Shapiro, Stuart C., 1971: A net structure for semantic information storage, deduction and retrieval. Proc. of Second International Joint Conference on Artificial Intelligence: 512–523.
Smith, Andrew E. and Michael S. Humphreys, 2006: Evaluation of unsupervised semantic mapping of natural language with Leximancer concept mapping. Behavior Research Methods 38(2): 262–279.
Sowa, John F., 1992: Semantic Networks. Pp. 1493–1511 in: Stuart C. Shapiro (ed.), Encyclopedia of Artificial Intelligence. New York: Wiley and Sons.
Tesnière, Lucien, 1959: Éléments de syntaxe structurale. Paris: Klincksieck.
Tversky, Amos and Itamar Gati, 1982: Similarity, separability, and the triangle inequality. Psychological Review 89(2): 123–154.
Van Atteveldt, Wouter, 2008: Semantic network analysis: Techniques for extracting, representing, and querying media content. Charleston: BookSurge Publishers.
van Cuilenburg, Jan J., Jan Kleinnijenhuis and Jan A. de Ridder, 1986: A Theory of Evaluative Discourse: Towards a Graph Theory of Journalistic Texts. European Journal of Communication 1(1): 65–96.
White, Harrison C., 1993: Canvases and careers: institutional change in the French painting world. Chicago: University of Chicago Press.
Wiebe, Janyce M., 2000: Learning Subjective Adjectives from Corpora. Proc. of 17th National Conference on Artificial Intelligence (AAAI) 2000, July 2000, Austin, TX: 735–741.
Woods, William A., 1975: What's in a link: Foundations for semantic networks. Pp. 35–82 in: Daniel G. Bobrow and Allan Collins (eds.), Representation and Understanding: Studies in Cognitive Science. New York: Academic Press.
Yang, Yiming and Jan O. Pedersen, 1997: A comparative study on feature selection in text categorization. Proc. 14th International Conference on Machine Learning (ICML), Nashville, TN.
Zelenko, Dmitry, Chinatsu Aone and Anthony Richardella, 2003: Kernel methods for relation extraction. Journal of Machine Learning Research 3(2): 1083–1106.
Zipf, George K., 1949: Human Behavior and the Principle of Least Effort: An Introduction to Human Ecology. Cambridge, MA: Addison-Wesley Press.
Züll, Cornelia and Melina Alexa, 2001: Automatisches Codieren von Textdaten. Ein Überblick über neue Entwicklungen. Pp. 303–317 in: Werner Wirth and Edmund Lauf (eds.), Inhaltsanalyse - Perspektiven, Probleme, Potenziale. Cologne: Herbert von Halem.
© 2010 VS Verlag für Sozialwissenschaften | Springer Fachmedien Wiesbaden GmbH
About this chapter
Diesner, J., Carley, K.M. (2010). Extraktion relationaler Daten aus Texten. In: Stegbauer, C., Häußling, R. (eds) Handbuch Netzwerkforschung. VS Verlag für Sozialwissenschaften. https://doi.org/10.1007/978-3-531-92575-2_44
DOI: https://doi.org/10.1007/978-3-531-92575-2_44
Publisher Name: VS Verlag für Sozialwissenschaften
Print ISBN: 978-3-531-15808-2
Online ISBN: 978-3-531-92575-2
eBook Packages: Humanities, Social Science (German Language)