pauloamed/GloVePhrases: GloVe model for distributed word representation that allows computing of phrase embeddings

GloVe: Global Vectors for Word Representation ~ Phrase support

Extension for handling phrases. The words of a phrase are joined with SEP_CHAR, and the phrase is also terminated with SEP_CHAR. A phrase needs to be marked like this:

// SEP_CHAR = '\1'
hot dog => hot\1dog\1
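
As an illustration of the marking convention above, a corpus could be preprocessed with a small script along these lines. This is only a sketch and not part of the repository: the helper name `mark_phrases`, the greedy longest-match strategy, and the assumption that SEP_CHAR is the single character with code 1 (`"\x01"` in Python) are all choices made for this example.

```python
# Hypothetical preprocessing helper (not part of this repository).
# Assumption: SEP_CHAR is the control character with code 1, i.e. '\1' in C.
SEP_CHAR = "\x01"


def mark_phrases(tokens, phrases, sep=SEP_CHAR):
    """Replace known multi-word phrases in a token list with marked tokens.

    Each word of a matched phrase is followed by `sep`, so
    ["hot", "dog"] becomes the single token "hot<sep>dog<sep>".
    Matching is greedy: the longest phrase starting at a position wins.
    Single-word entries in `phrases` are ignored.
    """
    phrase_set = {tuple(p.split()) for p in phrases}
    max_len = max((len(p) for p in phrase_set), default=1)

    out = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate first, down to two-word phrases.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            if tuple(tokens[i:i + n]) in phrase_set:
                out.append("".join(tok + sep for tok in tokens[i:i + n]))
                i += n
                matched = True
                break
        if not matched:
            out.append(tokens[i])
            i += 1
    return out


if __name__ == "__main__":
    marked = mark_phrases("i ate a hot dog today".split(), ["hot dog"])
    # repr() makes the otherwise invisible separator visible.
    print(repr(" ".join(marked)))
```

The marked tokens can then be written back out space-separated, so that the rest of the pipeline sees each phrase as one token.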

Phrase adaptation

No methodological adaptation was needed; only the token count (vocab_count.c) and the co-occurrence count (cooccur.c) were modified. Some unsupported code from the original GloVe paper had to be removed for repository consistency.

Testing

Simple code tests of the implemented modifications were performed.

Train word vectors on a new corpus

You can train word vectors on your own corpus by adapting demo.sh and running it:

$ ./demo.sh
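
Once training has finished, the phrase embeddings can be read back like any other token. The sketch below assumes that this fork keeps the text output format of the original GloVe code (one token per line followed by its space-separated vector components) and that phrase tokens appear in it with their SEP_CHAR markers intact. Neither assumption is confirmed by this README, and the helper name `load_vectors` is hypothetical.

```python
# Hypothetical loader (not part of this repository).
# Assumptions: text output with "<token> <v1> <v2> ..." per line,
# and SEP_CHAR is the control character with code 1.
SEP_CHAR = "\x01"


def load_vectors(lines, sep=SEP_CHAR):
    """Parse vector lines into a dict keyed by readable token.

    Marked phrases such as "hot<sep>dog<sep>" are keyed as "hot dog";
    ordinary words keep their token unchanged.
    """
    vectors = {}
    for line in lines:
        # rstrip() only removes whitespace, so a trailing `sep` is kept.
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip empty or malformed lines
        token, values = parts[0], parts[1:]
        if token.endswith(sep):
            # Drop the terminating separator, then turn inner ones into spaces.
            key = token[:-len(sep)].replace(sep, " ")
        else:
            key = token
        vectors[key] = [float(v) for v in values]
    return vectors


if __name__ == "__main__":
    sample = ["hot\x01dog\x01 0.1 0.2\n", "cat 0.3 0.4\n"]
    print(load_vectors(sample))
```

In practice `lines` would be an open file handle on the vectors file produced by the training script.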

License

All work contained in this package is licensed under the Apache License, Version 2.0. See the included LICENSE file.
