xyizko/xo-gis

xo-gis

Simple Github Issues and PR Scraper

  1. 💻 Compatibility
  2. 🎥 Demo
    1. 🍬 Features
  3. 🤔 What
  4. 💽 Setup
    1. 😿 Common Problems
  5. 🎩 License

💻 Compatibility

Env Status
My Skills

🎥 Demo

🍬 Features

✅ Captures screenshots and saves the scraped headings to .txt and .json

✅ Set any User-Agent via config/useragent.txt

✅ Fast, since it runs on uv (https://docs.astral.sh/uv/)
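
To illustrate the first two features, here is a minimal sketch of taking screenshots with a custom User-Agent through playwright-python. It is not taken from xo.py: the helper names `page_urls` and `capture`, the screenshot file names, and the `org_repo` directory naming are all assumptions made for the example.

```python
from pathlib import Path


def page_urls(repo):
    """Return the first-page issues and pull-request URLs for 'org/repo'."""
    base = f"https://github.com/{repo.strip('/')}"
    return {"issues": f"{base}/issues", "prs": f"{base}/pulls"}


def capture(repo, user_agent, out_dir="reports"):
    """Screenshot both pages with the given User-Agent (hypothetical helper)."""
    # Imported here so page_urls() works even without Playwright installed.
    from playwright.sync_api import sync_playwright

    # Assumed layout: reports/<org>_<repo>/issues.png and prs.png
    target = Path(out_dir) / repo.replace("/", "_")
    target.mkdir(parents=True, exist_ok=True)

    with sync_playwright() as p:
        browser = p.chromium.launch()
        # The User-Agent is set once per browser context.
        context = browser.new_context(user_agent=user_agent)
        page = context.new_page()
        for name, url in page_urls(repo).items():
            page.goto(url)
            page.screenshot(path=str(target / f"{name}.png"), full_page=True)
        browser.close()
```

Calling `capture("org1/repo1", "my-agent/1.0")` would need Playwright and its browser binaries installed, plus network access.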

🤔 What

Research tool that quickly scrapes the first page of the GitHub issues and PRs of a given GitHub repo. It takes a screenshot of each page and stores the headings of the issues and PRs as .txt and .json
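
As a rough idea of what storing the headings as .txt and .json can look like, here is a small sketch. The function name `save_headings`, the one-heading-per-line text layout, and the flat JSON list are assumptions for illustration; the files xo.py actually writes may be shaped differently.

```python
import json
from pathlib import Path


def save_headings(headings, out_dir, stem):
    """Write headings to <stem>.txt (one per line) and <stem>.json (a list)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    txt_path = out / f"{stem}.txt"
    json_path = out / f"{stem}.json"

    # Plain text: one heading per line, with a trailing newline.
    txt_path.write_text("\n".join(headings) + "\n", encoding="utf-8")
    # JSON: the same headings as an array, keeping non-ASCII characters readable.
    json_path.write_text(
        json.dumps(headings, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return txt_path, json_path
```

For example, `save_headings(["Bug: crash on start"], "reports/org1_repo1", "issues")` would produce `issues.txt` and `issues.json` in that directory.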

💽 Setup

  1. Download and install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
  2. Download this repo

  3. Run setup.sh

  • Bash script that installs the required Python libraries.
  • Note: this uses playwright-python and will also install its dependencies.
  4. Enter the repos to be scraped in config/repos.txt in the following format
org1/repo1
org2/repo2
.
.
  5. Optionally, set a specific user agent.
  • Create a new file config/useragent.txt containing the required user agent; otherwise the default is used.
# Default user agent
default_user_agent = "Mozilla/5.0 (Linux; Android 10; SM-G975F) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.93 Mobile Safari/537.36"
  6. Execute
uv run xo.py
  • A new reports directory will be created, with one subdirectory per repo containing its scraped assets
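
The two config files from the steps above could be read along these lines. This is only a sketch: `read_repos` and `read_user_agent` are hypothetical names, and skipping lines that are not in `org/repo` form is an assumption, not documented behaviour of xo.py. The default User-Agent string is the one quoted in the setup steps.

```python
from pathlib import Path

DEFAULT_USER_AGENT = (
    "Mozilla/5.0 (Linux; Android 10; SM-G975F) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/88.0.4324.93 Mobile Safari/537.36"
)


def read_repos(path="config/repos.txt"):
    """Return the org/repo entries, skipping blank and malformed lines."""
    repos = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        entry = line.strip()
        parts = entry.split("/")
        # Keep only entries with exactly two non-empty parts: org and repo.
        if len(parts) == 2 and all(parts):
            repos.append(entry)
    return repos


def read_user_agent(path="config/useragent.txt"):
    """Return the configured User-Agent, or the default if none is set."""
    file = Path(path)
    if file.is_file():
        text = file.read_text(encoding="utf-8").strip()
        if text:
            return text
    return DEFAULT_USER_AGENT
```

With this shape, a missing or empty config/useragent.txt simply falls back to the default, which matches the behaviour described in the optional user agent step.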

😿 Common Problems

Some repositories may not be scraped properly if they use pinned issues.

🎩 License

Lic
