Hacker News

I think Gemma 3 in vision mode could be a good fit for converting PDFs to text. It's downloadable and runs on a local computer, with reasonable memory requirements depending on which model size you pick. Has anyone tried it?
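
For anyone who wants to try it, here is a minimal sketch, not a tested pipeline. It assumes the model is served locally through Ollama on its default port, that it is pulled under the tag `gemma3`, and that each PDF page has already been rendered to a PNG by some other tool. The prompt wording, the helper names `build_ocr_request` and `transcribe_page`, and the file name `page-1.png` are all made up for illustration.

```python
import base64
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_ocr_request(image_bytes: bytes, model: str = "gemma3") -> dict:
    """Build the JSON body for a single-image transcription request.

    The model tag and the prompt are assumptions; adjust them to whatever
    vision model you actually have pulled.
    """
    return {
        "model": model,
        "prompt": "Transcribe all text on this page as plain text.",
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        # Ask for one JSON object instead of a stream of chunks.
        "stream": False,
    }


def transcribe_page(image_path: str, model: str = "gemma3") -> str:
    """Send one rendered page image to the local server and return its text."""
    with open(image_path, "rb") as f:
        body = build_ocr_request(f.read(), model)
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `print(transcribe_page("page-1.png"))` would print the model's transcription of that page. How accurate it is on real documents is exactly the open question.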


Mistral OCR has best-in-class document understanding: https://mistral.ai/news/mistral-ocr


Somewhat unrelated, but Gemma 3's weights are not under a free license, so LLaVA (https://ollama.com/library/llava) might be a good alternative.



