Extension for handling phrases, to be separated with SEP_CHAR.
A phrase must be marked as follows:
// SEP_CHAR = '\1'
hot dog => hot\1dog\1
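The marking shown above could be applied to a corpus before training with a small preprocessing script. The following is only a minimal sketch: the helper names `mark_phrase` and `mark_corpus_line` are hypothetical and not part of this package, the phrase list is assumed to be supplied by you, and the separator value is assumed to match the `SEP_CHAR = '\1'` setting shown above.

```python
import re

# Assumed to match SEP_CHAR = '\1' (byte value 1) from the example above.
SEP_CHAR = "\x01"


def mark_phrase(phrase, sep=SEP_CHAR):
    """Join the words of a phrase, appending sep after every word.

    Follows the README example: "hot dog" -> "hot\\1dog\\1".
    """
    return "".join(word + sep for word in phrase.split())


def mark_corpus_line(line, phrases, sep=SEP_CHAR):
    """Replace each known phrase in a line with its marked form.

    Hypothetical helper: phrases are matched only when bounded by
    whitespace or the line edges, and longer phrases are tried first.
    """
    if not phrases:
        return line
    ordered = sorted(phrases, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\S)(?:" + "|".join(re.escape(p) for p in ordered) + r")(?!\S)"
    )
    return pattern.sub(lambda m: mark_phrase(m.group(0), sep), line)


if __name__ == "__main__":
    # repr() makes the otherwise invisible separator byte visible.
    print(repr(mark_corpus_line("i ate a hot dog today", ["hot dog"])))
```

Matching on whitespace boundaries avoids marking a phrase inside a longer token, for example "hot dog" inside "shot dog".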
No methodological adaptation was needed; only the token counting (vocab_count.c) and the co-occurrence counting (cooccur.c) were modified. Some unsupported code from the original GloVe paper had to be removed for repository consistency.
Simple code testing of the implemented modifications was performed.
You can train word vectors on your own corpus. Adapt demo.sh to do so.
$ ./demo.sh
All work contained in this package is licensed under the Apache License, Version 2.0. See the included LICENSE file.