I think GPT-4o is probably doing some OCR as preprocessing. It's not really contr...

thomasahle · on July 11, 2024

If so, it's better than any other OCR on the market.

I think they just train it on a bunch of text.

Maybe counting squares in a grid just wasn't considered important enough to train for.

_flux · on July 11, 2024

Why do you think it's probable? The much smaller LLaVA that I can run on my consumer GPU can also do "OCR", yet I don't believe anyone has hidden an OCR engine inside llama.cpp.