GitHub - riju-talk/SpanBERT-CRF: Fine-tuned SpanBERT model with a CRF layer for improved span-based Question Answering, designed for datasets like SQuAD v2.0 with precise span boundary prediction.
SpanBERT-CRF for Question Answering

This repository implements a SpanBERT-based Question Answering model enhanced with a Conditional Random Field (CRF) layer for improved answer span prediction on the SQuAD v2.0 dataset.


🚀 Features

  • Fine-tuned SpanBERT: Optimized for span-based QA tasks
  • CRF Enhancement: Improved boundary detection for answer spans
  • SQuAD v2.0 Support: Handles both answerable and unanswerable questions
  • Reproducibility: Fully functional in Jupyter Notebook/Google Colab
  • Custom Tooling: Specialized evaluation and prediction pipelines
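The README does not describe how the CRF layer decodes a span, so the following is only a minimal sketch of the general idea, assuming a B/I/O tagging scheme (B = first answer token, I = inside the answer, O = outside). The tag set, the transition table, and the function name `viterbi_decode` are illustrative assumptions, not the repository's code. It shows how a transition constraint (here, forbidding O followed by I) lets Viterbi decoding pick a well-formed span where per-token argmax would not.

```python
# Assumed tag set (not confirmed by the README).
TAGS = ["O", "B", "I"]
NEG_INF = float("-inf")

# Hypothetical transition scores, indexed as TRANSITIONS[prev][curr].
# A trained CRF would learn these; here only O -> I is forbidden,
# so every span has to open with a B tag.
TRANSITIONS = {
    "O": {"O": 0.0, "B": 0.0, "I": NEG_INF},
    "B": {"O": 0.0, "B": 0.0, "I": 0.0},
    "I": {"O": 0.0, "B": 0.0, "I": 0.0},
}


def viterbi_decode(emissions):
    """Return the highest-scoring tag sequence.

    emissions: list with one dict per token, mapping tag -> score.
    """
    if not emissions:
        return []
    # A sequence may not start inside a span.
    score = {t: emissions[0][t] + (NEG_INF if t == "I" else 0.0) for t in TAGS}
    backpointers = []
    for emit in emissions[1:]:
        new_score, pointers = {}, {}
        for curr in TAGS:
            # Best previous tag for reaching `curr` at this position.
            best_prev = max(TAGS, key=lambda p: score[p] + TRANSITIONS[p][curr])
            new_score[curr] = score[best_prev] + TRANSITIONS[best_prev][curr] + emit[curr]
            pointers[curr] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Follow the backpointers from the best final tag.
    path = [max(TAGS, key=lambda t: score[t])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Per-token argmax here would give O, I, I, O, which has no opening B.
example = [
    {"O": 2.0, "B": 0.0, "I": 0.0},
    {"O": 0.0, "B": 1.0, "I": 1.5},
    {"O": 0.0, "B": 0.0, "I": 2.0},
    {"O": 2.0, "B": 0.0, "I": 0.0},
]
print(viterbi_decode(example))
```

Under this scheme an all-O sequence would naturally stand for an unanswerable question, which is one way a tagging model could cover the SQuAD v2.0 no-answer case.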

🗂️ Project Structure

├── spanbert-crf.ipynb          # Main training/evaluation notebook
├── data/                       # Optional local copy of the dataset
├── models/                     # Saved model checkpoints (weights & tokenizer)
├── outputs/                    # Training logs & prediction outputs
├── README.md                   # Project documentation
└── requirements.txt            # Python dependencies

🛠️ Setup Instructions

  1. Clone the repository:

    git clone https://github.com/riju-talk/SpanBERT-CRF.git
    cd SpanBERT-CRF
  2. Install dependencies:

    pip install -r requirements.txt
  3. Run the notebook:

    • Local Execution:
      jupyter notebook spanbert-crf.ipynb
    • Google Colab: Upload spanbert-crf.ipynb and run interactively

📊 Evaluation Metrics

  • Base model: 90% Exact Match (EM) on the dev set, trained with the Hugging Face Trainer
  • SpanBERT-CRF model: 57% Exact Match (EM) on the dev set, trained with the Hugging Face Trainer
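For reference, Exact Match can be computed roughly as follows. This is a sketch modeled on the usual SQuAD normalization (lowercase, strip punctuation and English articles, collapse whitespace); the notebook's own evaluation code may differ, and the function names and the dictionary layout of `predictions` and `references` are assumptions made for illustration.

```python
import re
import string


def normalize_answer(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold_answers):
    """Return 1 if the prediction matches any gold answer after normalization."""
    # SQuAD v2.0 convention: an unanswerable question has no gold answers,
    # and the correct prediction is the empty string.
    if not gold_answers:
        gold_answers = [""]
    pred = normalize_answer(prediction)
    return int(any(pred == normalize_answer(gold) for gold in gold_answers))


def exact_match_score(predictions, references):
    """Percentage of questions answered exactly.

    predictions: dict of question id -> predicted string
    references:  dict of question id -> list of gold answer strings
    """
    hits = sum(exact_match(predictions[qid], golds) for qid, golds in references.items())
    return 100.0 * hits / len(references)


predictions = {"q1": "lazy dog", "q2": "fox"}
references = {"q1": ["the lazy dog"], "q2": []}  # q2 is unanswerable
print(exact_match_score(predictions, references))
```

In this toy example the second prediction counts as wrong because the model returned text for a question with no answer.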

🤖 Inference Example

Predict answers from context/question pairs:

from inference import predict_answer

context = "The quick brown fox jumps over the lazy dog."
question = "What does the fox jump over?"
answer = predict_answer(context, question)

print(f"Predicted Answer: {answer}")  # Output: "the lazy dog"
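The README does not show how `predict_answer` turns model output into text. One plausible final step, assuming the model emits one B/I/O tag per token, is to join the tokens of the first tagged span. The helper `tags_to_answer` below is hypothetical, and the whitespace tokenization is a simplification; a real SpanBERT pipeline would map subword tokens back to character offsets in the context.

```python
def tags_to_answer(tokens, tags):
    """Join the tokens of the first B/I-tagged span.

    Returns an empty string when no span is tagged, which can stand for
    an unanswerable question.
    """
    span = []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            if span:  # a second span starts; keep only the first
                break
            span.append(token)
        elif tag == "I" and span:
            span.append(token)
        elif span:  # first tag after the span ends
            break
    return " ".join(span)


tokens = "The quick brown fox jumps over the lazy dog .".split()
tags = ["O"] * 6 + ["B", "I", "I", "O"]
print(tags_to_answer(tokens, tags))
```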

📌 Roadmap

  • Hyperparameter tuning experiments
  • CLI/API interface for model serving
  • FastAPI deployment setup
  • Cross-dataset evaluation (HotpotQA, Natural Questions)

Open in Colab
Click the badge for one-click Colab execution
