-
sentence-transformers Public
Forked from UKPLab/sentence-transformers
State-of-the-Art Text Embeddings
Python Apache License 2.0 Updated May 8, 2025
-
datasets Public
Forked from huggingface/datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Python Apache License 2.0 Updated Feb 27, 2025
-
pytorch-ood Public
Forked from kkirchheim/pytorch-ood
👽 Out-of-Distribution Detection with PyTorch
Python Apache License 2.0 Updated Feb 25, 2025
-
setfit Public
Forked from huggingface/setfit
Efficient few-shot learning with Sentence Transformers
Jupyter Notebook Apache License 2.0 Updated Jan 13, 2025
-
reach Public
Load embeddings and featurize your sentences.
-
fast-sentence-transformers Public
Forked from davidberenstein1957/fast-sentence-transformers
Simply, faster, sentence-transformers
Python MIT License Updated Aug 27, 2024
-
torchic Public
Simple linear thing in Torch, with a scikit-learn compatible API.
-
transformers Public
Forked from huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Python Apache License 2.0 Updated Jun 14, 2023
-
argilla Public
Forked from argilla-io/argilla
✨ Argilla: Open-source data platform for LLMs and Human Feedback
Python Apache License 2.0 Updated Apr 23, 2023
-
mteb Public
Forked from embeddings-benchmark/mteb
MTEB: Massive Text Embedding Benchmark
Python Apache License 2.0 Updated Apr 11, 2023
-
hashing_split Public
Stable train/test splits using hashing
-
unitoken Public
Tokenization across languages. Useful as preprocessing for subword tokenization.
-
somber Public
Recursive Self-Organizing Map/Neural Gas.
-
piecelearn Public
Learning BPE embeddings by first learning a segmentation model and then training word2vec
-
quickumls_pred Public
Predict semantic types using QuickUMLS
-
orst Public
A pixel sorting program, written in python 3.x.
-
conch Public
Forked from clips/conch
Unsupervised concept extraction from clinical text
Python GNU General Public License v3.0 Updated Apr 11, 2020
-
trnsps Public
transpose words
-
spacy_conll Public
Forked from BramVanroy/spacy_conll
Parse text with spaCy and print the output in CoNLL-U format
Python BSD 2-Clause "Simplified" License Updated Jul 21, 2019
-
diora Public
Forked from iesl/diora
Deep Inside-Outside Recursive Autoencoder
Python Apache License 2.0 Updated Jul 4, 2019
-
lrec2018 Public
Code for the experiments in the LREC 2018 paper "WordKit: a Python Package for Orthographic and Phonological Featurization"
Python MIT License Updated Apr 24, 2018
-
OpenDutchWordnet Public
Forked from cltl/OpenDutchWordnet
This repo provides a python module to work with Open Dutch WordNet. It was created using python 3.4.
-
ruly Public
A short script to generate stuff based on binary cellular automata.