Hacker News
Google will label fake images created with its A.I (cnbc.com)
25 points by mfiguiere on May 10, 2023 | hide | past | favorite | 12 comments


Am I the only one who saw the first picture in the article and immediately thought: hell yeah, that's a fake CEO! ?


We need this for all ML-generated content. Especially for text.

There must be some industry labeling standard?


I want to see a classifier that takes text and can check the sources to see if the information is accurate. And another model that takes text and rates how well-written it is: bonus points for terseness and clarity.

That way either the AI generated text gets flagged by the classifiers, or the AI generated text is genuinely high-quality and I want it.


Not sure why you're getting downvoted; the idea makes sense.


How do you enforce that with a copy paste of ASCII text? I'm not against disclosure/attribution, but it'd be voluntary.


https://arxiv.org/abs/2301.10226

> We propose a watermarking framework for proprietary language models. The watermark can be embedded with negligible impact on text quality, and can be detected using an efficient open-source algorithm without access to the language model API or parameters.

Alpaca uses this already.
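The scheme from that paper can be sketched with a toy vocabulary of integer token ids. Everything here (vocab size, hash choice, the "always pick a green token" generator) is illustrative, not taken from the paper's code; the core idea is just that the previous token seeds a pseudorandom "green list", generation is biased toward green tokens, and detection is a z-test on the green-token count:

```python
import hashlib
import math
import random

VOCAB_SIZE = 1000   # toy vocabulary of integer token ids (assumption)
GAMMA = 0.5         # fraction of the vocab placed on the "green list" each step

def green_list(prev_token):
    """Seed an RNG with a hash of the previous token and pick the green set."""
    seed = int(hashlib.sha256(str(prev_token).encode()).hexdigest(), 16)
    rng = random.Random(seed)
    ids = list(range(VOCAB_SIZE))
    rng.shuffle(ids)
    return set(ids[: int(GAMMA * VOCAB_SIZE)])

def watermarked_sample(length, start=0):
    """Toy 'generator' that always emits a green-listed token (a hard bias)."""
    tokens, prev = [], start
    rng = random.Random(42)
    for _ in range(length):
        tok = rng.choice(sorted(green_list(prev)))
        tokens.append(tok)
        prev = tok
    return tokens

def detect(tokens, start=0):
    """z-score on the green-token count; a high z means likely watermarked."""
    hits, prev = 0, start
    for tok in tokens:
        if tok in green_list(prev):
            hits += 1
        prev = tok
    n = len(tokens)
    return (hits - GAMMA * n) / math.sqrt(GAMMA * (1 - GAMMA) * n)
```

Note that detection only needs the hashing scheme, not the model itself, which is the property the abstract advertises.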


Zero-width spaces :)


So what you're saying is the second tool in everyone's pipeline will be one that removes hidden characters?
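For Unicode-based marks that second tool is trivial; a minimal sketch (the code-point list below is illustrative, not exhaustive):

```python
import re

# Common zero-width / invisible code points sometimes used to hide marks in text
ZERO_WIDTH = re.compile(
    "[\u200b\u200c\u200d\u2060\ufeff]"  # ZWSP, ZWNJ, ZWJ, word joiner, BOM
)

def strip_hidden(text: str) -> str:
    """Remove zero-width characters, defeating this kind of watermark."""
    return ZERO_WIDTH.sub("", text)
```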


And then use AI to remove the watermark. The perfect circle.


> Google’s approach is to label the images when they come out of the AI system, instead of trying to determine whether they’re real later on. Google said Shutterstock and Midjourney would support this new markup approach. Google developer documentation says the markup will be able to categorize images as trained algorithmic media, which was made by an AI model; a composite image that was partially made with an AI model; or algorithmic media, which was created by a computer but isn’t based on training data.

Can it store at least (1) the prompt and (2) the model that purportedly generated the image, within said markup specification? Is it schema.org JSON-LD?

It's IPTC: https://developers.google.com/search/docs/appearance/structu...

If IPTC-to-RDF mappings (i.e./e.g. to schema:ImageObject, a schema:CreativeWork: https://schema.org/ImageObject) are complete, it would be possible to sign IPTC metadata with W3C Verifiable Credentials (and e.g. W3C DIDs) just like any other [JSON-LD,] RDF. But is there an IPTC schema extension for appending signatures, and/or an IPTC graph normalization step that generates output equivalent to a (web-standardized) JSON-LD normalization function?

/? IPTC jsonschema: https://github.com/ihsn/nada/blob/master/api-documentation/s...

/? IPTC schema.org RDFS

IPTC extension schema: https://exiv2.org/tags-xmp-iptcExt.html

[ Examples of input parameters & hyperparameters: from e.g. the screenshot in the README.md of stablediffusion-webui or text-generation-webui: https://github.com/AUTOMATIC1111/stable-diffusion-webui ]

How should input parameters and e.g. LLM model version & signed checksum and model hyperparameters be stored next to a generated CreativeWork? filename.png.meta.jsonld.json or similar?
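As a sketch of what such a sidecar file might look like: only @context, @type, and contentUrl below are actual schema.org terms; the sha256, prompt, model, and parameters keys are hypothetical, since there is no standard vocabulary for them yet.

```python
import hashlib
import json
import pathlib

def write_sidecar(image_path, prompt, model, params):
    """Write a hypothetical `<image>.meta.jsonld.json` sidecar describing
    how an image was generated."""
    p = pathlib.Path(image_path)
    doc = {
        "@context": "https://schema.org",
        "@type": "ImageObject",          # schema.org term
        "contentUrl": p.name,            # schema.org term
        "sha256": hashlib.sha256(p.read_bytes()).hexdigest(),  # illustrative key
        "prompt": prompt,      # assumption: no standard term for this yet
        "model": model,        # e.g. model name + version + weights checksum
        "parameters": params,  # sampler, steps, seed, etc.
    }
    sidecar = p.with_name(p.name + ".meta.jsonld.json")
    sidecar.write_text(json.dumps(doc, indent=2))
    return sidecar
```

A signature over the canonicalized JSON-LD could then be appended per the Verifiable Credentials model discussed above.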


If an LLM passes the Turing test ("The Imitation Game") - i.e. has output indistinguishable from a human's output - does that imply that it is not possible to stylometrically fingerprint its outputs without intentional watermarking?

https://en.wikipedia.org/wiki/Turing_test


Implicit in the Turing test is the entity doing the evaluation. It's quite possible that a human evaluator could be tricked but a tool-assisted human, or an AI itself, could not be. Or some humans could simply be better at not being tricked than others.
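As an illustration of "tool-assisted", a toy stylometric fingerprint: function-word relative frequencies plus average word length, compared by cosine similarity. Real attribution systems use far richer feature sets; this is only a sketch of the idea.

```python
import math
import re
from collections import Counter

# A handful of common English function words (illustrative feature set)
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "is", "it", "for"]

def style_vector(text):
    """Toy stylometric fingerprint of a text sample."""
    words = re.findall(r"[a-z']+", text.lower())
    total = max(len(words), 1)
    counts = Counter(words)
    vec = [counts[w] / total for w in FUNCTION_WORDS]
    vec.append(sum(len(w) for w in words) / total / 10)  # scaled avg word length
    return vec

def cosine(a, b):
    """Cosine similarity between two feature vectors (1.0 = identical style)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0
```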



