I'm making an app where literally all I want to do with an LLM is generate tags....

coder543 · 2025-08-15T12:24:50 1755260690

I'm pretty sure you're supposed to fine-tune the Gemma 3 270M model to actually get good results out of it: https://ai.google.dev/gemma/docs/core/huggingface_text_full_...

Use a large model to generate outputs that you're happy with, then use those input/output pairs (with the same prompt you plan to use in production) as training data to teach 270M what you want from it.
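A rough sketch of that data-collection step, in case it helps. Everything named here is an assumption for illustration: `PROMPT_TEMPLATE`, `build_distillation_records`, and `write_jsonl` are hypothetical helpers, and `teacher` stands in for whatever call you make to the large model. The prompt/completion JSONL layout is a common convention, but check what your fine-tuning tool actually expects.

```python
import json
from typing import Callable, Dict, Iterable, List

# Hypothetical prompt; use the exact prompt you plan to send to the small model.
PROMPT_TEMPLATE = "Generate comma-separated tags for the following text:\n\n{text}"


def build_distillation_records(
    texts: Iterable[str],
    teacher: Callable[[str], str],
    template: str = PROMPT_TEMPLATE,
) -> List[Dict[str, str]]:
    """Pair each prompt with the large ("teacher") model's output.

    `teacher` is an assumed callable that takes a prompt string and returns
    the large model's text response.
    """
    records = []
    for text in texts:
        prompt = template.format(text=text)
        completion = teacher(prompt).strip()
        if not completion:
            # Skip empty outputs so they don't end up in the training set.
            continue
        records.append({"prompt": prompt, "completion": completion})
    return records


def write_jsonl(records: Iterable[Dict[str, str]], path: str) -> None:
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
```

You'd call `build_distillation_records(your_texts, your_teacher_call)`, review the outputs by hand until you're happy with them, then pass the result to `write_jsonl` and feed that file to the fine-tuning run.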

deepsquirrelnet · 2025-08-14T18:15:32 1755195332

Oof. I also had it refuse a completely harmless instruction for “safety”. So that’s another dimension of issues with operationalizing it.

thegeomaster · 2025-08-14T20:41:36 1755204096

Well, Gemini Flash Lite is at least one, and likely two, orders of magnitude larger than this model.

dismalaf · 2025-08-14T21:28:36 1755206916

That's fair, but one can dream of simply running a useful LLM on CPU on one's own server to simplify the app and save costs...