EmilStenstrom 9 months ago | on: PDF to Text, a challenging problem

I think using Gemma3 in vision mode could be a good use-case for converting PDF to text. It’s downloadable and runnable on a local computer, with decent memory requirements depending on which size you pick. Did anyone try it?

ljlolel 9 months ago

Mistral OCR has best-in-class document understanding: https://mistral.ai/news/mistral-ocr

CaptainFever 9 months ago

Kind of unrelated, but Gemma 3's weights are unfree, so perhaps LLaVA (https://ollama.com/library/llava) would be a good alternative.
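
For anyone who wants to try the local route discussed in this thread, here is a rough sketch of what sending one page image to a locally running vision model could look like. It assumes the model is served through Ollama's REST API on its default port, that the model tags `gemma3` or `llava` have already been pulled, and that each PDF page has already been rendered to PNG bytes by some other tool. The prompt text and the helper names `build_ocr_request` and `page_to_text` are made up for illustration, and nothing here says how good the resulting transcription is.

```python
import base64
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_ocr_request(page_png: bytes, model: str = "gemma3") -> dict:
    """Build the JSON payload for one page image (hypothetical helper)."""
    return {
        "model": model,  # e.g. "gemma3" or "llava"
        # Illustrative prompt; tune it for your documents.
        "prompt": "Transcribe all text on this page. Output plain text only.",
        # The API expects images as base64-encoded strings.
        "images": [base64.b64encode(page_png).decode("ascii")],
        # Ask for a single JSON response instead of a token stream.
        "stream": False,
    }


def page_to_text(page_png: bytes, model: str = "gemma3") -> str:
    """Send one rendered page to the local model and return its text output."""
    body = json.dumps(build_ocr_request(page_png, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Swapping in LLaVA would just mean calling `page_to_text(png_bytes, model="llava")`. Rendering the PDF pages to images is deliberately left out, since that step depends on whichever PDF library you prefer.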
