
I use quantized LLMs in production and can't say I've ever found the quantized models to be less censored than the originals.

For unlearning reinforced behaviour, the abliteration [1] technique seems to be much more powerful.

[1] https://huggingface.co/blog/mlabonne/abliteration
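For anyone unfamiliar: the core of abliteration is a difference-of-means "refusal direction" computed from residual-stream activations on harmful vs. harmless prompts, which then gets projected out of the weights that write into the residual stream. A rough sketch of the idea in PyTorch (the function names and single-layer setup are mine for illustration, not the blog post's exact code):

    import torch

    def refusal_direction(harmful_acts, harmless_acts):
        # harmful_acts / harmless_acts: (n_prompts, d_model) residual-stream
        # activations captured at one layer for each prompt set
        d = harmful_acts.mean(dim=0) - harmless_acts.mean(dim=0)
        return d / d.norm()

    def orthogonalize(weight, d):
        # weight: (d_model, d_in), a matrix writing into the residual stream
        # (e.g. attention/MLP output projection). Remove the refusal
        # direction so the layer can no longer write along it:
        #   W <- W - d d^T W
        return weight - torch.outer(d, d @ weight)

Because the edit is baked into the weights, it works with a plain forward pass afterwards, no inference-time hooks needed.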



Were you using models that had been unlearned using gradient ascent specifically?
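(By gradient ascent I mean the usual unlearning setup of maximizing the LM loss on a forget set. Roughly, assuming a HuggingFace-style model that returns a `.loss`:

    import torch

    def unlearn_step(model, forget_batch, optimizer):
        # compute the normal LM loss on the forget set, then step in the
        # direction that *increases* it, i.e. gradient ascent on forget data
        loss = model(**forget_batch).loss
        (-loss).backward()
        optimizer.step()
        optimizer.zero_grad()

Negating the loss and running a standard optimizer step is equivalent to ascending the original loss.)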



