Scaled Cognition
- New York, NY
- aryamccarthy.github.io
Stars
Pre-trained variant of Yoyodyne, a small-vocabulary neural sequence-to-sequence generation engine
The easiest tool for fine-tuning LLM models, synthetic data generation, and collaborating on datasets.
Pyright language server integration for PyCharm Professional
For processing text exports from ProQuest, which contain more use data than other exports but don't come in an easy-to-use format
Pinafore group papers accepted for publication. Includes LaTeX, code to generate figures, etc.
Repository for the IJCNLP paper "Deriving Consensus for Multi-Parallel Corpora: an English Bible Study"
Python port of Moses tokenizer, truecaser and normalizer
https://sharedtask.duolingo.com
Dead simple games made with word vectors.
Shepherding experimental jobs locally or on clusters
An English lexical database from the Big 🍎, let's go Mets baby love da Mets
Reference BLEU implementation that auto-downloads test sets and reports a version string to facilitate cross-lab comparisons
JavaScript version of the map offset correction algorithm for converting Earth coordinates (WGS-84) to Mars coordinates (GCJ-02)
Codebase for testing whether hidden states of neural networks encode discrete structures.
A tool for holistic analysis of language generation systems
An extended package for clustering similarity
Introduction to Nonparametric Bayes, Infinite Mixture Models, and the Dirichlet Process (+ McDonald's)
Open Source Neural Machine Translation and (Large) Language Models in PyTorch
Family planning through machine learning.
Indexing & querying large assembly graphs -- in space, no one can hear you miao!
Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Related Algorithms - R package
📚 Find your next book to read!
A library for debugging/inspecting machine learning classifiers and explaining their predictions