matbahasa

Hiroki Nomoto matbahasa

Tokyo University of Foreign Studies
http://www.tufs.ac.jp/ts/personal/nomoto/

Starred repositories

UMxYTL-AI-Labs / MalayMMLU

[MalayMMLU] This is the first-ever Bahasa Melayu multitask benchmark designed to elevate the performance of Large Language Models (LLMs) and Large Vision Language Models (LVLMs).

Python 34 4 Updated Dec 26, 2024

antonisa / lang2vec

A simple library for querying the URIEL typological database.

Python 90 16 Updated Apr 8, 2024

sarahjuan / sarawakmalay

Sarawak Malay speech and text data for speech technology research. The data was collected by the Faculty of Computer Science and Information Technology, Universiti Malaysia Sar…

Python 6 Updated Sep 19, 2024

linggrads / latex

A repository for LaTeX resources for linguists

TeX 1 Updated Apr 25, 2022

human-ai-lab / struct_amb_ind

The first Indonesian structurally ambiguous utterances corpus

1 Updated Feb 3, 2024

aisingapore / BHASA

16 3 Updated Dec 12, 2024

osekilab / JBLiMP

5 Updated Jul 20, 2023

alexwarstadt / blimp

The Benchmark of Linguistic Minimal Pairs

Python 149 13 Updated Dec 13, 2022

gederajeg / database-verba-bahasa-indonesia

VerbInd: A corpus-based database of Indonesian verbs.

HTML 2 1 Updated Apr 27, 2025

gauthierdmn / question_generation

Neural Question Generation using the SQuAD and NewsQA datasets

Python 110 39 Updated Dec 8, 2022

aisingapore / seacorenlp-data

2 Updated May 25, 2024

vncorenlp / VnCoreNLP

A Vietnamese natural language processing toolkit (NAACL 2018)

Java 619 150 Updated Feb 12, 2023

UCREL / Indonesian-TreeTagger-Docker-Build

Docker file to build the Indonesian TreeTagger.

1 Updated Jan 31, 2022

langsci / 231

Kroeger, Paul: Analyzing meaning

TeX 6 1 Updated Aug 30, 2022

stockmarkteam / bert-book

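Support page for the book "Introduction to Natural Language Processing with BERT: Practical Programming Using Transformers" (original title: BERTによる自然言語処理入門: Transformersを使った実践プログラミング)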

Jupyter Notebook 262 78 Updated Feb 13, 2024

ariaghora / mpstemmer

Stemmer and lemmatizer for Indonesian (Bahasa Indonesia)

Python 37 3 Updated Aug 14, 2023

gentaiscool / indonesian-nlp

A curated list of research papers and resources on Indonesian languages

39 3 Updated Mar 21, 2024

IndoNLP / nusa-crowd

A collaborative project to collect datasets in Indonesian languages.

Jupyter Notebook 269 61 Updated Jun 2, 2024

IndoNLP / nusax

High-quality parallel resource on sentiment analysis for 10 low-resource Indonesian languages, English, and Indonesian (Outstanding Paper at EACL 2023)

Jupyter Notebook 99 10 Updated May 8, 2023

google / cld3

C++ 830 120 Updated May 24, 2023

IndoNLP / indonlu

The first-ever large-scale natural language processing benchmark for the Indonesian language. We provide multiple downstream tasks, pre-trained IndoBERT models, and starter code! (AACL-IJCNLP 2020)

Jupyter Notebook 606 205 Updated Nov 16, 2024

IndoNLP / indonlg

The first-ever vast natural language generation benchmark for Indonesian, Sundanese, and Javanese. We provide multiple downstream tasks, pre-trained IndoGPT and IndoBART models, and a starter code!…

Python 73 14 Updated Nov 16, 2024

ir-nlp-csui / kethu

A constituency treebank that conforms to the Penn Treebank format

2 Updated Jul 7, 2022

cysouw / pandoc-ling

Pandoc Lua filter for linguistic examples

Lua 39 7 Updated May 22, 2025

goodmami / rumi-jawi-web

Rumi-Jawi conversion web app

HTML 5 2 Updated May 3, 2021

neocl / speach

🐍🍑 Python 3 library for managing, annotating, and converting natural language corpora using popular formats (CoNLL, ELAN, Praat, CSV, JSON, SQLite, VTT, Audacity, TTL, TIG, ISF, etc.)

Python 17 6 Updated Jun 26, 2024

bahasa-csui / aksara

Aksara is an Indonesian morphological analyzer that conforms to the UD v2 annotation guidelines

Python 6 3 Updated Sep 19, 2021

w4okubo / Roberts2012-jpn

TeX 1 Updated Apr 22, 2021

konlpy / konlpy

Python package for Korean natural language processing.

Python 1,452 334 Updated Aug 28, 2023

multilingual-dh / nlp-resources

Natural language processing resources for multiple languages, with an eye towards use for digital humanities.
