
You’re aware that PDFs are containers that can hold various formats, which can be layered on top of each other, interleaved throughout the file, or combined in unexpected and unspecified ways that aren’t “parsable,” right?

I would wager that they’re using OCR/LLM in their pipeline.
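
To make the "container" point concrete, here is a minimal sketch (not anyone's actual pipeline) that counts the stream filter names appearing in a PDF's raw bytes. The filter names themselves come from the PDF specification; the helper name `count_stream_filters` and the hint table are my own, made up for illustration. It assumes the filter names are visible in the raw bytes, so anything tucked inside compressed object streams will be missed.

```python
import re
from collections import Counter

# Filter names from the PDF specification, with a rough note on what each
# usually wraps. The descriptions are informal hints, not guarantees.
FILTER_HINTS = {
    b"DCTDecode": "JPEG image data",
    b"JPXDecode": "JPEG 2000 image data",
    b"CCITTFaxDecode": "fax-style bilevel scans",
    b"JBIG2Decode": "JBIG2 bilevel scans",
    b"FlateDecode": "zlib-compressed data (page content, fonts, images)",
}

def count_stream_filters(pdf_bytes: bytes) -> Counter:
    """Count known /...Decode filter names found in the raw bytes.

    Hypothetical helper: a crude byte scan, not a real PDF parser.
    """
    names = re.findall(rb"/(\w+Decode)\b", pdf_bytes)
    return Counter(name for name in names if name in FILTER_HINTS)

if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as fh:
        for name, count in count_stream_filters(fh.read()).most_common():
            print(f"{name.decode()}: {count}  ({FILTER_HINTS[name]})")
```

A file that reports mostly `DCTDecode` or `JBIG2Decode` streams is probably scanned pages, where there is no text layer to parse and OCR is the only way in. A file with only `FlateDecode` tells you much less, since that filter wraps almost anything.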



Could be. But the conversion is free, which leads me to believe LLMs are not involved.



