
Repository for replicating the results outlined in the paper: Large Language Models Enable Textual Interpretation of Image-Based Astronomical Transient Classifications


Large Language Models Enable Textual Interpretation of Image-Based Astronomical Transient Classifications - Code Repository

This repository contains the code and notebooks associated with the research paper:

Paper: Large Language Models Enable Textual Interpretation of Image-Based Astronomical Transient Classifications (Research Square preprint, DOI: 10.21203/rs.3.rs-5723428/v1)

Authors: Fiorenzo Stoppa¹, Turan Bulmus², Steven Bloemen³, Stephen J. Smartt¹⁶, Paul J. Groot³⁴⁵, Paul Vreeswijk³, Ken W. Smith¹⁶ (affiliations omitted for brevity; see the paper)

Overview

Modern astronomical surveys generate vast amounts of transient candidate detections. Distinguishing genuine astrophysical signals (e.g., supernovae, variable stars) from imaging artifacts ("bogus" sources) is a critical challenge. While Convolutional Neural Networks (CNNs) have shown success, their "black box" nature limits interpretability.

This work demonstrates the application of Large Language Models (LLMs), specifically Google's Gemini 1.5 Pro (gemini-1.5-pro-002), to classify astronomical transients using triplet images (New, Reference, Difference). Using few-shot learning (15 examples per dataset) and carefully engineered prompts, the LLM achieves high accuracy (average 93%, improved to ~96.7% via iteration on MeerLICHT) comparable to CNNs, while simultaneously providing human-readable textual explanations for its classifications. This enhances transparency and aligns with scientific reasoning.

We also showcase a novel approach where an LLM evaluates the coherence and consistency of classifications generated by another LLM instance, enabling targeted refinement and identification of problematic cases.

The code primarily demonstrates the workflow using the MeerLICHT dataset, but the paper discusses applications to Pan-STARRS and ATLAS as well.

Key Features

  • Few-shot classification of astronomical transients (Real/Bogus) using Gemini 1.5 Pro.
  • Generation of detailed, human-readable textual explanations for each classification.
  • Assignment of an "interest score" (No/Low/High) based on classification and features.
  • Prompt engineering techniques tailored for multimodal LLMs in astronomical image analysis.
  • Batch processing pipeline leveraging Google Cloud Vertex AI Batch Prediction and BigQuery.
  • Evaluation of classification performance (Accuracy, Precision, Recall, Confusion Matrix).
  • Demonstration of LLM-based evaluation ("LLM judging LLM") for coherence assessment.
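The classification, textual explanation, and interest score listed above are returned together by the model for each candidate. A minimal sketch of parsing one such response; the field names below are illustrative assumptions, the real schema is defined by the prompts in the notebooks:

```python
import json

# Illustrative (assumed) response from the classifier LLM; the actual
# schema is set by the prompt in 01_LLM_Classification_Transients.ipynb.
raw_response = """
{
  "classification": "Real",
  "explanation": "Point-like residual in the Difference image coincides with a source in the New image and is absent from the Reference, consistent with a genuine transient.",
  "interest_score": "High"
}
"""

record = json.loads(raw_response)
assert record["classification"] in {"Real", "Bogus"}
assert record["interest_score"] in {"No", "Low", "High"}
print(record["classification"], record["interest_score"])  # Real High
```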

Repository Structure

spacehack/
├── data/                                    # Placeholder for data files (created/downloaded by scripts)
│   ├── pics/                                # Generated image files for prompts/analysis
│   ├── prompt_pics/                         # Images used in few-shot examples
│   ├── new_data.npy                         # Downloaded image triplets
│   ├── new_labels.csv                       # Downloaded labels
│   ├── predictions_results.csv              # Prediction results from 01_...ipynb
│   └── predictions_with_Coherence.csv       # Example output from 02_...ipynb
├── prompts/                                 # Placeholder for saved prompt text files (created by scripts)
├── 01_LLM_Classification_Transients.ipynb   # Notebook for initial classification & explanation generation
├── 02_LLM_Judging_LLM_Classifications.ipynb # Notebook for LLM evaluation of the first notebook's outputs
├── Repeatability_Analysis.ipynb             # Notebook for the statistical analysis in Appendix C of the paper
├── CONTRIBUTING.md                          # Instructions for contributing to the repository
├── helper_functions.py                      # Utility functions for data handling, prompts, GCP interaction, etc.
├── LICENSE                                  # Software license
├── requirements.txt                         # Python package dependencies
└── README.md                                # This file

Setup

  1. Prerequisites:

    • Python 3.10+ (tested with 3.12.3)
    • Google Cloud Platform (GCP) Account:
      • If you don't have a GCP account, create one at https://cloud.google.com/. A credit card is required to create an account, but new accounts may be eligible for free credits.
      • Billing: Ensure that billing is enabled for your GCP project. You can enable billing and link a billing account in the GCP console.
    • GCP Project
    • Enabled GCP APIs:
      • Vertex AI API (aiplatform.googleapis.com)
      • BigQuery API (bigquery.googleapis.com)
    • GCP Permissions:
      • Your user account or service account needs the following IAM roles to interact with Vertex AI and BigQuery:
        • roles/aiplatform.user
        • roles/bigquery.user
        • roles/storage.objectViewer
      • You can grant these roles in the GCP console under "IAM & Admin" or using the gcloud command-line tool.
    • GCP Authentication configured for your environment (e.g., run gcloud auth application-default login locally, or use Colab's built-in authentication).
    • A BigQuery Dataset created within your GCP project (the default name spacehack is used in the notebooks, update if necessary).
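The API, IAM, authentication, and dataset steps above can all be done from the command line. A sketch using the gcloud and bq CLIs; replace YOUR_PROJECT_ID and the member address with your own values:

```shell
# Enable the required APIs
gcloud services enable aiplatform.googleapis.com bigquery.googleapis.com \
    --project=YOUR_PROJECT_ID

# Grant the IAM roles listed above to your user account
for role in roles/aiplatform.user roles/bigquery.user roles/storage.objectViewer; do
  gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
      --member="user:you@example.com" --role="$role"
done

# Configure Application Default Credentials for local runs
gcloud auth application-default login

# Create the BigQuery dataset expected by the notebooks (default name: spacehack)
bq mk --dataset YOUR_PROJECT_ID:spacehack
```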
  2. Clone Repository:

    git clone https://github.com/turanbulmus/spacehack.git # Or your repo URL
    cd spacehack
  3. Install Dependencies:

    pip install -r requirements.txt

    (Note: If running in Colab, package installation might be handled differently, potentially requiring a kernel restart after installation as shown in the notebook comments).

Data Acquisition

The MeerLICHT dataset used in this project consists of two files:

  • new_data.npy: A NumPy array containing the image triplets (New, Reference, Difference).
  • new_labels.csv: A CSV file containing the labels for each image triplet.

Create the data directory if it doesn't already exist, then download both files from Zenodo:

mkdir -p data
wget "https://zenodo.org/records/14714279/files/MeerLICHT_images.npy?download=1" -O data/new_data.npy
wget "https://zenodo.org/records/14714279/files/MeerLICHT_labels.csv?download=1" -O data/new_labels.csv

Alternatively, the files are mirrored on Google Drive and can be fetched with gdown (pip install gdown), using ID 1EZZyK_E99H--7yrTumYnuogrpd8F-KOv for new_data.npy and ID 11qLdAGY-_v8wC4IzE9CFOIywapSFWr7g for new_labels.csv.
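Once downloaded, the triplets can be inspected directly. A minimal sketch, assuming new_data.npy stacks the New, Reference, and Difference panels along the last axis; check the array's actual shape after loading and adapt if it differs:

```python
import numpy as np

def split_triplet(triplets: np.ndarray, index: int):
    """Return the (New, Reference, Difference) panels of one candidate.

    Assumes shape (n_candidates, height, width, 3), with the three
    panels stacked on the last axis -- an illustrative assumption.
    """
    new, ref, diff = (triplets[index, :, :, k] for k in range(3))
    return new, ref, diff

# Synthetic stand-in with the assumed layout (10 candidates, 100x100 pixels);
# in practice: triplets = np.load("data/new_data.npy")
triplets = np.random.default_rng(0).normal(size=(10, 100, 100, 3))
new, ref, diff = split_triplet(triplets, index=0)
print(new.shape, ref.shape, diff.shape)  # (100, 100) (100, 100) (100, 100)
```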

Usage

  1. Configure GCP Variables:

    • Open the notebooks (.ipynb files).
    • In the initial cells, update the PROJECT_ID, LOCATION, and DATASET_ID variables to match your specific GCP environment setup.
  2. Run Notebooks:

    • Execute the cells sequentially within a Jupyter or Google Colab environment.
    • 01_LLM_Classification_Transients.ipynb:
      • Downloads the MeerLICHT dataset (.npy images, .csv labels) using gdown if not found locally in the data/ directory.
      • Prepares image files (.png) for the LLM.
      • Constructs the prompt with few-shot examples.
      • Sets up and runs a Vertex AI Batch Prediction job to classify all transients.
      • Retrieves results from BigQuery.
      • Evaluates performance and generates a confusion matrix.
      • Saves results to data/predictions_results.csv.
    • 02_LLM_Judging_LLM_Classifications.ipynb:
      • Requires the output data/predictions_results.csv from the first notebook.
      • Uses an LLM ("judge") to evaluate the coherence score and interest score validity for each classification made by the first LLM.
      • Runs a second Vertex AI Batch Prediction job for the evaluation task.
      • Retrieves evaluation results from BigQuery.
      • Saves evaluated results to data/predictions_with_Coherence.csv.

    TODO: Add what Round2_MeerLICHT.ipynb does
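Conceptually, each row of a Vertex AI batch prediction job pairs the engineered prompt with one candidate's images. A hedged sketch of assembling such a request payload; the layout follows the Gemini batch-request convention of a `contents` list of role/parts messages, but the exact schema used by the notebooks' helper functions may differ:

```python
import base64
import json

def build_request(prompt_text, images):
    """Assemble one batch-prediction request: the engineered prompt
    followed by the candidate's PNG panels as inline base64 data.

    `images` is a list of raw PNG byte strings (New, Reference, Difference).
    """
    parts = [{"text": prompt_text}]
    for png_bytes in images:
        parts.append({
            "inline_data": {
                "mime_type": "image/png",
                "data": base64.b64encode(png_bytes).decode("ascii"),
            }
        })
    return {"request": {"contents": [{"role": "user", "parts": parts}]}}

# One JSONL line per candidate, as consumed by a Vertex AI batch job
row = build_request("Classify this transient as Real or Bogus.", [b"\x89PNG..."])
line = json.dumps(row)
```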

  3. Important Notes:

    • Running these notebooks executes jobs on Google Cloud Platform (Vertex AI, BigQuery) which will incur costs.
    • API usage is subject to GCP quotas. Large batch jobs may take significant time to complete.
    • The notebooks rely on specific data file IDs from Google Drive for download via gdown. If these links become invalid, the data needs to be obtained and placed in the data/ directory manually.

Citation

If you use this code or methodology in your research, please cite the paper (currently a Research Square preprint):

@article{Stoppa2025LLMTransient,
  title={Large Language Models Enable Textual Interpretation of Image-Based Astronomical Transient Classifications},
  author={Stoppa, Fiorenzo and Bulmus, Turan and Bloemen, Steven and Smartt, Stephen J. and Groot, Paul J. and Vreeswijk, Paul and Smith, Ken W.},
  year={2025},
  journal={Research Square},
  doi={10.21203/rs.3.rs-5723428/v1},
  url={https://doi.org/10.21203/rs.3.rs-5723428/v1},
  note={Preprint}
}
License
This project is licensed under the Apache License 2.0. See the LICENSE file (or the header in the source files) for details.
