I remember us working on this together when the paper first came out. Anyway, you might be interested in my implementation, which uses baukit instead of TransformerLens. The advantage is that you only need the layer names, not a mapping of the weights, which makes exporting the model easier.
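A minimal sketch of the layer-name idea, using plain PyTorch forward hooks rather than baukit itself (as I understand it, baukit's `TraceDict` wraps essentially this pattern). The toy `nn.Sequential` model and its layer names `"0"`/`"2"` are stand-ins; on a real transformer you'd use names like `"model.layers.5.mlp"`:

```python
import torch
from torch import nn

# Toy stand-in for a transformer; real layer names would look like "model.layers.5.mlp".
model = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 2))

cache = {}

def hook_by_name(name):
    # Capture the module's output under its dotted name.
    def hook(module, inputs, output):
        cache[name] = output.detach()
    return hook

# Address layers purely by name -- no remapping of weight tensors needed.
layer_names = ["0", "2"]
handles = [
    model.get_submodule(n).register_forward_hook(hook_by_name(n))
    for n in layer_names
]
model(torch.randn(3, 4))
for h in handles:
    h.remove()

print(sorted(cache))  # ['0', '2']
```

Because the hooks only reference names, the underlying model can still be saved/exported unchanged.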
Also, the bottleneck here is often storing large activations in memory. Here's a nice way to cache them to disk, so that you can abliterate with larger datasets: https://github.com/wassname/activation_store
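A minimal sketch of the caching idea with a plain NumPy memmap rather than the activation_store package itself (whose exact API I won't reproduce here): write each batch of activations into a disk-backed array as it's produced, then memory-map the file read-only so rows are paged in on demand. The shapes are made up for illustration:

```python
import os
import tempfile
import numpy as np

# Hypothetical sizes: 2 batches of 4 samples each, hidden size 8.
hidden, batches, batch_size = 8, 2, 4
path = os.path.join(tempfile.mkdtemp(), "acts.npy")

# Preallocate a disk-backed .npy so activations never all sit in RAM at once.
store = np.lib.format.open_memmap(
    path, mode="w+", dtype=np.float32, shape=(batches * batch_size, hidden)
)
for b in range(batches):
    # Stand-in for one batch of model activations.
    acts = np.random.randn(batch_size, hidden).astype(np.float32)
    store[b * batch_size:(b + 1) * batch_size] = acts
store.flush()

# Later: memory-map read-only; only the rows you index get loaded.
cached = np.load(path, mmap_mode="r")
print(cached.shape)  # (8, 8)
```

Computing the refusal direction then just means streaming means over `cached` instead of holding every activation in memory.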
https://github.com/wassname/abliterator