weizeming (Zeming Wei) / Starred · GitHub
171 starred repositories

A very simple GRPO implementation for reproducing R1-like LLM thinking.

Python 1,112 88 Updated Apr 3, 2025

Dataset and code for "JailbreaksOverTime: Detecting Jailbreak Attacks Under Distribution Shift"

Jupyter Notebook 5 Updated Apr 24, 2025
Python 13 Updated Mar 20, 2025

A survey on harmful fine-tuning attack for large language model

182 6 Updated Jun 13, 2025
Jupyter Notebook 34 5 Updated Nov 12, 2024

To Think or Not to Think: Exploring the Unthinking Vulnerability in Large Reasoning Models

Python 30 Updated May 21, 2025
Python 30 2 Updated Mar 11, 2025

Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025]

Shell 312 30 Updated Jan 23, 2025
Python 29 1 Updated May 21, 2025

Awesome-Jailbreak-on-LLMs is a collection of state-of-the-art, novel, exciting jailbreak methods on LLMs. It contains papers, codes, datasets, evaluations, and analyses.

741 66 Updated Jun 10, 2025
Jupyter Notebook 4,248 1,231 Updated Jul 9, 2024

A curated list of retrieval-augmented generation (RAG) in large language models

279 21 Updated Feb 14, 2025

"LightRAG: Simple and Fast Retrieval-Augmented Generation"

Python 17,420 2,405 Updated Jun 11, 2025

Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses (NeurIPS 2024)

Jupyter Notebook 61 10 Updated Jan 11, 2025

V2C-CBM: Building Concept Bottlenecks with Vision-to-Concept Tokenizer (AAAI 2025)

Python 44 1 Updated Feb 24, 2025

[NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning"

Python 128 17 Updated Apr 12, 2025

The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.

7,718 468 Updated Jul 28, 2024

AmpleGCG: Learning a Universal and Transferable Generator of Adversarial Attacks on Both Open and Closed LLM

Python 65 7 Updated Nov 3, 2024

Agent Security Bench (ASB)

Python 85 5 Updated May 3, 2025

Official Repository for The Paper: Safety Alignment Should Be Made More Than Just a Few Tokens Deep

Python 132 10 Updated Apr 23, 2025

An Open-Ended Embodied Agent with Large Language Models

JavaScript 6,175 583 Updated Apr 3, 2024
5 Updated Oct 17, 2024

[ICLR 2025] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral)

Jupyter Notebook 78 Updated Oct 23, 2024

[NeurIPS 2024] Fight Back Against Jailbreaking via Prompt Adversarial Tuning

Python 10 1 Updated Oct 29, 2024

A resource repository for representation engineering in large language models

124 5 Updated Nov 14, 2024

Code for paper 'Batch-ICL: Effective, Efficient, and Order-Agnostic In-Context Learning'

Python 16 1 Updated Apr 19, 2024

Improved techniques for optimization-based jailbreaking on large language models (ICLR2025)

Python 105 7 Updated Apr 7, 2025