jeremistderechte/ParlaMind

ParlaMind

ParlaMind is a project that analyzes speeches from the German Bundestag from 1949 to 2025. It applies advanced Natural Language Processing (NLP) techniques to extract insights from political discourse, including sentiment analysis, topic modeling, and party classification using BERT.

Features

  • Sentiment Analysis: Determines the sentiment (positive, neutral, or negative) of speeches.
  • Topic Modeling: Identifies key themes and trends over time.
  • Party Classification: Uses BERT to classify and attribute speeches to specific parties.
  • Historical Insights: Tracks linguistic and ideological changes across decades.

Technologies Used

  • Python (Core language)
  • PyTorch & Hugging Face Transformers (For BERT-based NLP tasks)
  • Polars & Pandas (For data processing)
  • Poetry (For dependency management)

Installation

poetry install

Usage

Running Analysis

To analyze speeches, run:

poetry run python main.py

Before running, download speeches.csv and factions.csv from https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/FIKIBO and place them in /data/raw/OpenDiscourse/. The script then downloads the XML files and converts them into a parquet file (Polars DataFrame).

The XML download requires a Bundestag API key. Create a .secrets.toml file containing api_key = "your_api_key". You can get a current API key from https://dip.bundestag.de/%C3%BCber-dip/hilfe/api.

When the run finishes, the resulting ParlaMind.parquet is in /data/formated/parquet/.

Dataset

The dataset consists of Bundestag speeches from 1949–2025, preprocessed and stored in parquet format.
