

cgdl 5 months ago \| on: Gemma 3 270M: Compact model for hyper-efficient AI

Very cool. For the INT4 QAT model, what is the recommended precision for the activations, and for the keys and values stored in the KV cache?

hnuser123456 5 months ago

For keys, you probably want at least q5 or q6; for values, q4 is fine.
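If you want to check this on your own tensors rather than take it on faith, here is a rough NumPy sketch. Everything in it is an assumption for illustration: `fake_quantize` is a plain symmetric per-block round-to-nearest scheme (not the exact q4/q5/q6 formats of any particular runtime), the block size of 32 is arbitrary, and the random Q/K/V stand in for real cached activations. It only shows how you might compare attention output error for different key and value bit widths.

```python
import numpy as np

def fake_quantize(x, bits, block=32):
    """Symmetric per-block round-to-nearest quantization, then dequantization.

    Hypothetical scheme for illustration; x.size must be divisible by `block`.
    """
    flat = x.reshape(-1, block)
    qmax = 2 ** (bits - 1) - 1                      # e.g. 7 for 4 bits, 31 for 6 bits
    scale = np.abs(flat).max(axis=1, keepdims=True) / qmax
    scale[scale == 0] = 1.0                         # avoid dividing by zero on all-zero blocks
    q = np.clip(np.round(flat / scale), -qmax, qmax)
    return (q * scale).reshape(x.shape)

def quant_rmse(x, bits):
    """Root-mean-square error introduced by fake_quantize at a given bit width."""
    return float(np.sqrt(np.mean((x - fake_quantize(x, bits)) ** 2)))

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)           # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def attention(q, k, v):
    """Single-head scaled dot-product attention."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def output_rmse(q, k, v, key_bits, value_bits):
    """Attention output error when K and V are stored at the given bit widths."""
    exact = attention(q, k, v)
    approx = attention(q, fake_quantize(k, key_bits), fake_quantize(v, value_bits))
    return float(np.sqrt(np.mean((exact - approx) ** 2)))

# Random stand-ins for one head's query and cached keys/values.
rng = np.random.default_rng(0)
seq_len, head_dim = 128, 64
q = rng.standard_normal((1, head_dim))
k = rng.standard_normal((seq_len, head_dim))
v = rng.standard_normal((seq_len, head_dim))

for key_bits, value_bits in [(4, 4), (6, 4), (4, 6), (6, 6)]:
    err = output_rmse(q, k, v, key_bits, value_bits)
    print(f"K={key_bits}-bit V={value_bits}-bit  output RMSE={err:.5f}")
```

Random Gaussian data will not necessarily reproduce the keys-versus-values asymmetry; the point is to swap in K and V dumped from the actual model and see which combination stays within an acceptable error.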
