weizeming (Zeming Wei) / Starred · GitHub
- (no description; Python, 13 stars, updated Mar 20, 2025)
- A survey on harmful fine-tuning attacks for large language models (179 stars, 6 forks, updated Jun 7, 2025)
- (no description; Jupyter Notebook, 34 stars, 5 forks, updated Nov 12, 2024)
- To Think or Not to Think: Exploring the Unthinking Vulnerability in Large Reasoning Models (Python, 30 stars, updated May 21, 2025)
- (no description; Python, 30 stars, 2 forks, updated Mar 11, 2025)
- Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025] (Shell, 313 stars, 29 forks, updated Jan 23, 2025)
- (no description; Python, 29 stars, updated May 21, 2025)
- Awesome-Jailbreak-on-LLMs is a collection of state-of-the-art, novel, exciting jailbreak methods on LLMs. It contains papers, code, datasets, evaluations, and analyses. (727 stars, 65 forks, updated May 23, 2025)
- (no description; Jupyter Notebook, 4,218 stars, 1,220 forks, updated Jul 9, 2024)
- A curated list of retrieval-augmented generation (RAG) in large language models (278 stars, 21 forks, updated Feb 14, 2025)
- "LightRAG: Simple and Fast Retrieval-Augmented Generation" (Python, 17,200 stars, 2,371 forks, updated Jun 6, 2025)
- Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses (NeurIPS 2024) (Jupyter Notebook, 61 stars, 9 forks, updated Jan 11, 2025)
- V2C-CBM: Building Concept Bottlenecks with Vision-to-Concept Tokenizer (AAAI 2025) (Python, 44 stars, 1 fork, updated Feb 24, 2025)
- [NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning" (Python, 128 stars, 17 forks, updated Apr 12, 2025)
- The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al. (7,700 stars, 463 forks, updated Jul 28, 2024)
- AmpleGCG: Learning a Universal and Transferable Generator of Adversarial Attacks on Both Open and Closed LLMs (Python, 64 stars, 7 forks, updated Nov 3, 2024)
- Agent Security Bench (ASB) (Python, 81 stars, 5 forks, updated May 3, 2025)
- Official repository for the paper "Safety Alignment Should Be Made More Than Just a Few Tokens Deep" (Python, 127 stars, 9 forks, updated Apr 23, 2025)
- An Open-Ended Embodied Agent with Large Language Models (JavaScript, 6,163 stars, 582 forks, updated Apr 3, 2024)
- (no description; 5 stars, updated Oct 17, 2024)
- [ICLR 2025] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral) (Jupyter Notebook, 78 stars, updated Oct 23, 2024)
- [NeurIPS 2024] Fight Back Against Jailbreaking via Prompt Adversarial Tuning (Python, 10 stars, 1 fork, updated Oct 29, 2024)
- A resource repository for representation engineering in large language models (123 stars, 5 forks, updated Nov 14, 2024)
- Code for the paper "Batch-ICL: Effective, Efficient, and Order-Agnostic In-Context Learning" (Python, 16 stars, 1 fork, updated Apr 19, 2024)
- Improved techniques for optimization-based jailbreaking on large language models (ICLR 2025) (Python, 103 stars, 7 forks, updated Apr 7, 2025)
- [ACL 2024] CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion (Python, 43 stars, 6 forks, updated Oct 25, 2024)
- HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal (Jupyter Notebook, 655 stars, 90 forks, updated Aug 16, 2024)