AI-powered Research and Review Agents [ICLR 2025 / ACL 2025]

[Badges: GitHub stars · Python 3.8+ · arXiv · OpenReview · Homepage]

[Figure: AI Research Ecosystem]

Update:

[04/26/2025] We hosted the AI Co-scientist Discussion at ICLR 2025; over 300 people attended!

[04/06/2025] We have collected 400 papers related to AI Scientists in our Awesome-AI-Scientist GitHub repository. If you're interested in this field, don't miss out!

[03/22/2025] We've just rolled out an exciting new feature for https://ai-researcher.net! 🎉 Now you can directly read arXiv papers with unprecedented ease! 📚✨

Simply swap the domain in any arXiv link: https://arxiv.org/abs/2503.08569 -> https://ai-researcher.net/abs/2503.08569
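
A minimal Python sketch of this rewrite, for scripted use:

# Rewrite an arXiv abstract URL to its ai-researcher.net reader equivalent
arxiv_url = "https://arxiv.org/abs/2503.08569"
reader_url = arxiv_url.replace("https://arxiv.org/", "https://ai-researcher.net/")
print(reader_url)  # -> https://ai-researcher.net/abs/2503.08569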

🔍 Overview

CycleResearcher is a comprehensive open-source ecosystem for AI-powered academic research and review. Our system features three integrated components:

  • CycleResearcher: Generates high-quality research papers
  • CycleReviewer: Provides detailed academic reviews
  • DeepReviewer: Delivers multi-perspective review simulations with self-verification

By creating a complete feedback loop between research generation and evaluation, we aim to:

  • 🤖 Automate academic research processes
  • 📝 Provide rigorous, multi-perspective research reviews
  • 🔄 Establish research-review feedback loops
  • 🚀 Accelerate scientific discovery

[Figure: CycleResearcher Architecture]

🚀 Getting Started

Installation

pip install ai_researcher

Using CycleResearcher

# Import necessary libraries
from ai_researcher import CycleResearcher
from ai_researcher.utils import print_paper_summary

# Initialize CycleResearcher with the default 12B model
researcher = CycleResearcher(model_size="12B")

# Load references from BibTeX file
with open('cycleresearcher_references.bib', 'r') as f:
    references_content = f.read()

# Generate a paper with specific references
generated_papers = researcher.generate_paper(
    topic = "AI Researcher",
    references = references_content,
    n = 1  # Generate a single paper
)

# Print summary of generated paper
print_paper_summary(generated_papers[0])
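
The exact structure of a generated paper depends on the installed version; here is a minimal sketch for inspecting and saving the result, assuming each entry is a dict with 'title', 'abstract', and 'latex' keys (verify the real keys against print_paper_summary's output):

# Hypothetical field names ('title', 'abstract', 'latex') -- check them against
# the dict actually returned by generate_paper in your installation
paper = generated_papers[0]
print(paper.get("title", "<no title>"))
print(paper.get("abstract", "<no abstract>")[:300])

# Save the generated source to disk for later review or compilation
with open("generated_paper.tex", "w") as f:
    f.write(paper.get("latex", ""))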

Using CycleReviewer

# Import necessary libraries
from ai_researcher import CycleReviewer

# Initialize CycleReviewer with the default 8B model
reviewer = CycleReviewer(model_size="8B")

# Load the paper to review (LaTeX or plain text); the filename here is just an example
with open('paper_to_review.tex', 'r') as f:
    paper_text = f.read()

# Review the paper
review_results = reviewer.evaluate(paper_text)

# Print review results
print(f"Average score: {review_results[0]['avg_rating']}")
print(f"Decision: {review_results[0]['paper_decision']}")

Using DeepReviewer

# Import necessary libraries
from ai_researcher import DeepReviewer

# Initialize DeepReviewer with 14B model
deep_reviewer = DeepReviewer(model_size="14B")

# Review a paper with multiple simulated reviewers in Standard Mode
# (paper_text holds the paper content, loaded as in the CycleReviewer example above)
review_results = deep_reviewer.evaluate(
    paper_text,
    mode="Standard Mode",  # Options: "Fast Mode", "Standard Mode", "Best Mode"
    reviewer_num=4         # Simulate 4 different reviewers
)

# Print review results
for i, review in enumerate(review_results[0]['reviews']):
    print(f"Reviewer {i+1} Rating: {review.get('rating', 'N/A')}")
    print(f"Reviewer {i+1} Summary: {review.get('summary', 'N/A')[:100]}...")

[Demo: Launching DeepReviewer Best Mode]

Using OpenScholar

OpenScholar is a retrieval-augmented generation (RAG) based academic question-answering system. For detailed usage instructions, please refer to the OpenScholar directory.

Quick Start Guide for OpenScholar
  1. Apply for a Semantic Scholar API key: visit the Semantic Scholar API page

  2. Start Model Services:

    # For Linux/Mac users
    cd OpenScholar
    chmod +x start_models.sh
    ./start_models.sh
  3. Start API Service:

    python openscholar_api.py \
        --s2_api_key YOUR_SEMANTIC_SCHOLAR_API_KEY \
        --reranker_path OpenSciLM/OpenScholar_Reranker
  4. Using the API:

    import requests
    
    # Send questions to OpenScholar API
    response = requests.post("http://localhost:38015/batch_ask", json={
        "questions": ["How do retrieval-augmented LMs perform in knowledge-intensive tasks?"]
    })
    
    result = response.json()
    print("OpenScholar Answer:", result["results"][0]["output"])

Best Mode

DeepReviewer's Best Mode provides the most comprehensive review experience, including background knowledge search, multi-reviewer simulation, and self-verification:

# Use Best Mode for in-depth review
review_results = deep_reviewer.evaluate(
    paper_text,
    mode="Best Mode",      # Most comprehensive review mode
    reviewer_num=6,        # Simulate 6 different reviewers
    enable_search=True,    # Enable background knowledge search
    self_verification=True # Enable self-verification
)
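
The per-reviewer structure mirrors the Standard Mode example above; a small follow-up sketch that averages the simulated reviewers' ratings (it assumes the 'rating' values are numeric, which may not hold for every output):

# Aggregate the simulated reviewers' scores, skipping reviews without a numeric rating
reviews = review_results[0]['reviews']
ratings = [r['rating'] for r in reviews if isinstance(r.get('rating'), (int, float))]
if ratings:
    print(f"Mean rating across {len(ratings)} reviewers: {sum(ratings) / len(ratings):.2f}")
else:
    print("No numeric ratings returned.")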

[Figure: DeepReviewer Architecture]

📊 Model Evaluation

CycleResearcher

[Figure: CycleResearcher Evaluation]

CycleResearcher-12B achieves an average score of 5.36, approaching the 5.69 average for conference-accepted papers and surpassing AI Scientist's score of 4.31.

CycleReviewer

[Figure: CycleReviewer Evaluation]

CycleReviewer outperforms both proprietary systems and human experts, with a 48.77% reduction in Proxy MSE and a 26.89% reduction in Proxy MAE compared to human reviewers. With a decision accuracy of 74.24%, our model demonstrates a significant lead over closed-source systems.
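
The exact Proxy MSE/MAE definitions are given in the paper; as a rough illustration only, both metrics compare predicted review scores against reference scores (the numbers below are made up):

# Illustrative only: mean squared / absolute error between predicted and reference ratings
predicted = [5.5, 6.0, 3.5, 4.5]   # hypothetical model-assigned ratings
reference = [6.0, 6.5, 3.0, 5.0]   # hypothetical ground-truth ratings

mse = sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(predicted)
mae = sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)
print(f"Proxy MSE: {mse:.3f}, Proxy MAE: {mae:.3f}")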

DeepReviewer

[Figure: DeepReviewer Evaluation]

DeepReviewer provides multi-perspective simulation with self-verification, enabling more comprehensive and balanced feedback. It offers three review modes (Fast, Standard, and Best) to accommodate different use cases.

🧠 Models & Datasets

Models Overview

CycleResearcher Models

| Model Name | Pre-training Language Model | HF Link |
| --- | --- | --- |
| CycleResearcher-ML-12B | Mistral-Nemo-Instruct-2407 | 🤗 link |
| CycleResearcher-ML-72B | Qwen2.5-72B-Instruct | 🤗 link |
| CycleResearcher-ML-123B | Mistral-Large-2 | 🤗 link |

CycleReviewer Models

| Model Name | Pre-training Language Model | HF Link |
| --- | --- | --- |
| CycleReviewer-ML-Llama3.1-8B | Llama3.1-8B-Instruct | 🤗 link |
| CycleReviewer-ML-Llama3.1-70B | Llama3.1-70B-Instruct | 🤗 link |
| CycleReviewer-ML-Pro-123B | Mistral-Large-2 | 🤗 link |

DeepReviewer Models

| Model Name | Parameters | HF Link |
| --- | --- | --- |
| DeepReviewer-7B | 7B | 🤗 link |
| DeepReviewer-14B | 14B | 🤗 link |
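
If you prefer to work with the weights directly rather than through the ai_researcher wrapper, the checkpoints should load as ordinary causal LMs; a sketch with a placeholder repo id (take the real id from the HF link in the table above):

from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: substitute the repo id from the table's HF link
model_id = "<hf-repo-id-from-table>"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")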

Datasets

Datasets Overview

| Dataset Name | Train Data | Test Data | Description | HF Link |
| --- | --- | --- | --- | --- |
| Review-5K | 4,189 | 781 | Peer review dataset for CycleReviewer training | 🤗 link |
| Research-14K | 12,696 | 802 | Research paper dataset for CycleResearcher training | 🤗 link |
| DeepReview-13K | 13,378 | 1,286 | Multi-perspective review dataset for DeepReviewer training | 🤗 link |
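
The datasets can likewise be pulled with the Hugging Face datasets library; a sketch with a placeholder repo id (use the HF link in the table for the real one):

from datasets import load_dataset

# Placeholder: substitute the repo id from the table's HF link
dataset = load_dataset("<hf-repo-id-from-table>")
print(dataset)              # splits and their sizes
print(dataset["train"][0])  # inspect one training example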

💡 Features

DeepReviewer Review Modes

DeepReviewer offers three distinct review modes to accommodate different use cases:

🏃‍♂️ Fast Mode

Quick review generation for rapid feedback. Provides essential evaluation without multi-reviewer simulation.

🔄 Standard Mode

Default mode that simulates multiple reviewers and includes self-verification to ensure reliable assessments.

⭐ Best Mode

Most comprehensive mode with background knowledge search, multi-reviewer simulation, and self-verification for in-depth analysis.
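
All three modes are selected through the same mode argument of evaluate shown earlier; a compact sketch that runs one paper through each mode, reusing deep_reviewer and paper_text from the "Using DeepReviewer" example and assuming the return structure matches it:

# Compare the three review modes on the same paper (runtime grows from Fast to Best)
for mode in ["Fast Mode", "Standard Mode", "Best Mode"]:
    results = deep_reviewer.evaluate(paper_text, mode=mode)
    n_reviews = len(results[0].get('reviews', []))
    print(f"{mode}: {n_reviews} simulated review(s)")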

AI Detection

Detect if content was generated by AI models:

from ai_researcher import AIDetector

# Initialize AI detector
detector = AIDetector(device='cpu')

# Analyze a generated paper (e.g., generated_papers[0] from the CycleResearcher example above)
paper = generated_papers[0]
detection_result = detector.analyze_paper(paper)

print("Detection Results:")
print(f"Probability of AI generation: {detection_result['probability'] * 100:.2f}%")
print(f"Confidence Level: {detection_result['confidence_level']}")

📚 Tutorials and Demos

We have prepared comprehensive tutorials to help users understand and use our models.

📄 License

The code and model weights are provided under the CycleResearcher-License. See the LICENSE.md file for details.

📚 Citation

If CycleResearcher is helpful to your work, please cite our paper:

@inproceedings{weng2025cycleresearcher,
    title={CycleResearcher: Improving Automated Research via Automated Review},
    author={Yixuan Weng and Minjun Zhu and Guangsheng Bao and Hongbo Zhang and Jindong Wang and Yue Zhang and Linyi Yang},
    booktitle={The Thirteenth International Conference on Learning Representations},
    year={2025},
    url={https://openreview.net/forum?id=bjcsVLoHYs}
}

If DeepReviewer is helpful to your work, please cite our paper:

@misc{zhu2025deepreviewimprovingllmbasedpaper,
    title={DeepReview: Improving LLM-based Paper Review with Human-like Deep Thinking Process},
    author={Minjun Zhu and Yixuan Weng and Linyi Yang and Yue Zhang},
    year={2025},
    eprint={2503.08569},
    archivePrefix={arXiv},
    primaryClass={cs.CL},
    url={https://arxiv.org/abs/2503.08569}
}

📮 Contact
