GitHub - kendalled/GetScraped: Don't worry about it!

kendalled/GetScraped

GetScraped v3 🚒

Don't worry about it! But if you'd like to know, it scrapes email addresses from websites listed in CSV files. The program pulls each page's HTML, finds addresses with a regular-expression match, and removes duplicates.
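The regex-and-dedupe step described above can be sketched roughly as follows. This is a minimal illustration, not the code in getscrapedall.py: the function name `extract_emails` and the pattern `EMAIL_RE` are hypothetical, and the pattern is a simple approximation that will miss some valid addresses.

```python
import re

# Hypothetical pattern: a simple approximation of an email address,
# not the expression used by GetScraped itself.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(html):
    """Return the unique email addresses found in html, in first-seen order."""
    seen = set()
    result = []
    for match in EMAIL_RE.findall(html):
        email = match.lower()  # treat Info@Example.com and info@example.com as one
        if email not in seen:
            seen.add(email)
            result.append(email)
    return result


if __name__ == "__main__":
    sample = '<a href="mailto:Info@Example.com">info@example.com</a> sales@example.org.'
    print(extract_emails(sample))
```

Lowercasing before the duplicate check is a design choice made for this sketch; it assumes addresses differing only in case should count as the same.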

Installation

Use the package manager pip to install dependencies (see below).

git clone https://github.com/kendalled/GetScraped.git

Usage

Put one or more CSV files containing URLs in the src/Data folder. Then install the dependencies:

pip3 install pandas
pip3 install lxml
pip3 install unicodecsv

Then run the following:

python3 getscrapedall.py
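For orientation, here is one way the URLs might be gathered from the CSV files in src/Data. This is a sketch under assumptions, not the script's actual logic: the function name `collect_urls` is hypothetical, it uses the standard-library `csv` module rather than pandas, and it assumes a URL is any cell starting with `http://` or `https://`, since the expected column layout is not documented here.

```python
import csv
from pathlib import Path


def collect_urls(data_dir):
    """Collect every cell that looks like a URL from all CSV files in data_dir."""
    urls = []
    # Sorted so the files are processed in a predictable order.
    for csv_path in sorted(Path(data_dir).glob("*.csv")):
        with open(csv_path, newline="", encoding="utf-8") as handle:
            for row in csv.reader(handle):
                for cell in row:
                    value = cell.strip()
                    # Assumption: a URL is any cell with an http(s) scheme.
                    if value.startswith(("http://", "https://")):
                        urls.append(value)
    return urls


if __name__ == "__main__":
    for url in collect_urls("src/Data"):
        print(url)
```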

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update issues as appropriate.

License

MIT
