Stars
A large-scale multilingual dataset for Information Retrieval. Thorough human-annotations across 18 diverse languages.
A simple REST client library for Yandex.Translate
Provides a common interface to many IR ranking datasets.
Code for CEDR: Contextualized Embeddings for Document Ranking, accepted at SIGIR 2019.
WinSCP is a popular free file manager for Windows supporting SFTP, FTP, FTPS, SCP, S3, WebDAV and local-to-local file transfers. A powerful tool to enhance your productivity with a user-friendly in…
CINO: Pre-trained Language Models for Chinese Minority Languages
Cross-Lingual Entity Alignment via Joint Attribute-Preserving Embedding, ISWC 2017
Wikipedia tools (for Humans): easily extract data from Wikipedia, Wikidata, and other MediaWikis
Must-read papers on entity alignment published in recent years
"A Fused Gromov-Wasserstein Framework for Unsupervised Knowledge Graph Entity Alignment" in ACL 2023
Source code and datasets for ACL 2020 paper: Neighborhood Matching Network for Entity Alignment.
Source code and datasets for IJCAI 2019 paper: Relation-Aware Entity Alignment for Heterogeneous Knowledge Graphs.
coastalcph / GCN-Align
Forked from 1049451037/GCN-Align. Code of the paper: Cross-lingual Knowledge Graph Alignment via Graph Convolutional Networks.
This is an open-source toolkit for Heterogeneous Graph Neural Networks (OpenHGNN) based on DGL.
Toolkit for creating, sharing and using natural language prompts.
"Large Language Models" (《大语言模型》), a book by Xin Zhao, Junyi Li, Kun Zhou, Tianyi Tang, and Jirong Wen
A modular graph-based Retrieval-Augmented Generation (RAG) system
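Li Bai 👤 As an outstanding poet of the Tang dynasty, Li Bai holds an important place in the history of Chinese literature. In recent years, with the rapid development of digital technology and artificial intelligence, the ways in which traditional culture is popularized and promoted have also faced innovation and change. Although research on Li Bai's poetry in China and abroad is already quite thorough, its digital and AI-assisted popularization still falls short. This project therefore aims to build a Li Bai knowledge graph and, combined with large-model training, produce a specialized AI agent that promotes Li Bai's culture through a generative dialogue application.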
Neo4j graph construction from unstructured data
[WWW 2025] A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System.
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)