GitHub - goncaloadolfo/keywords-extraction

Exercise 1

Run exercise-1.py in src/exercises/.
It applies the system shown in the image in the mod directory to an example document from the file system.

Run exercise-2.py in src/exercises/.
It applies the same system with variants that try to improve the results.
The results of exercise 1 and this exercise are recorded in a txt file located in the results directory; they were obtained on a dataset of 1500 abstract documents.

The Tornado package is required.
An internet connection is required to fetch the RSS file.
Run exercise-4.py in src/exercises/ to start the web server.
Open a web browser and request "localhost:8888" or, if you changed the port, "localhost:newPort".
This request returns an HTML page that shows every sports article and its key phrases. The page also has filter mechanisms.
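
The RSS step above could look roughly like the sketch below. It is not the code from exercise-4.py: the sample feed, the function names parse_rss_items and filter_items, and the substring filter are all assumptions made for illustration, and it uses only the Python standard library instead of fetching a live feed.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline feed so the sketch runs offline; the real exercise
# downloads an RSS file from the internet.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Sports</title>
    <item>
      <title>Cup final ends in penalties</title>
      <description>The final was decided by a penalty shootout.</description>
    </item>
    <item>
      <title>Marathon record broken</title>
      <description>A new course record was set on Sunday.</description>
    </item>
  </channel>
</rss>"""


def parse_rss_items(rss_text):
    """Return a list of {'title', 'description'} dicts, one per <item>."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default="").strip(),
            "description": item.findtext("description", default="").strip(),
        })
    return items


def filter_items(items, term):
    """Keep items whose title or description contains term (case-insensitive).

    This is an assumed, very simple filter; the page's actual filter
    mechanisms are not described in this README.
    """
    term = term.lower()
    return [
        item for item in items
        if term in item["title"].lower() or term in item["description"].lower()
    ]


if __name__ == "__main__":
    for article in filter_items(parse_rss_items(SAMPLE_RSS), "record"):
        print(article["title"])
```

Each parsed article could then be passed to the key phrase extraction system before being rendered in the HTML page.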

During development, we generated a pickle file of word vectors that exceeds the maximum size allowed for a single file. The big_files_url.txt file contains a URL to download that file. Save it into src/files/ and keep its name unchanged, or update the name in the code.
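
As a minimal sketch of how the downloaded file might be loaded, assuming it is a regular pickle: the file name word_vectors.pkl and the function load_word_vectors are hypothetical, so replace the name with the actual name of the downloaded file.

```python
import pickle
from pathlib import Path

# Hypothetical file name; use the real name of the file downloaded from
# the URL in big_files_url.txt.
VECTORS_PATH = Path("src/files") / "word_vectors.pkl"


def load_word_vectors(path=VECTORS_PATH):
    """Load the pickled word vectors, failing early with a clear message."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(
            f"Word vectors not found at {path}; download the file using the "
            "URL in big_files_url.txt and save it into src/files/."
        )
    with path.open("rb") as handle:
        return pickle.load(handle)
```

Only unpickle files from a source you trust, since loading a pickle can execute arbitrary code.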
