I have no actual info on this, but I always assumed they'd compute multimodal embeddings of the screenshots and then retrieve the semantically relevant ones by text query? And yeah, they'd have to do it with on-device models, which doesn't seem out of reach?
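To make the guess concrete: the retrieval step being described would look something like this. This is a minimal sketch, assuming a CLIP-style model maps both screenshots and text queries into one shared embedding space (the embedding model itself is not shown; the toy vectors below stand in for its outputs). The actual system, whatever it is, may work differently.

```python
import numpy as np

def cosine_top_k(query_vec: np.ndarray, doc_matrix: np.ndarray, k: int = 3):
    """Rank stored embeddings (rows of doc_matrix) against a query embedding
    by cosine similarity and return the top-k indices and scores."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    scores = d @ q                      # cosine similarity per screenshot
    order = np.argsort(scores)[::-1][:k]
    return order, scores[order]

# Toy stand-ins: in the speculated pipeline, each screenshot would be
# embedded once (offline, on-device) and the text query at search time.
screenshot_embeddings = np.array([
    [1.0, 0.0],   # e.g. a screenshot about topic A
    [0.0, 1.0],   # a screenshot about topic B
    [0.9, 0.1],   # mostly topic A
])
text_query_embedding = np.array([1.0, 0.0])  # "show me topic A"

idx, sims = cosine_top_k(text_query_embedding, screenshot_embeddings, k=2)
print(idx.tolist())  # nearest screenshots first
```

The heavy lifting is all in the (assumed) multimodal encoder; once everything lives in one vector space, retrieval reduces to this nearest-neighbor lookup, which is cheap enough to run on-device.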