Can someone explain again how an ML system scanning and training on a copyrighted work is different from a highly skilled artist doing the same?
There are artists that can study a painting for a few minutes and then recreate it from memory. There are artists who study a particular body of work so long that they can create more works indistinguishable in style. If an artist recreates a copyrighted work or creates a derivative too close to the original, then that new work is potentially copyright infringement.
That is, we focus on the output of the process to determine infringement with living artists and ignore the training. But with ML, everyone focuses on the training.
It seems an ML tool could add a filter to the output and refuse to output a work that too closely resembles one or more works under copyright. Isn't that basically what legitimate professional artists do as well?
Thousands of artists are capable of infringement, but we don't take away their brushes based on capability.
> Can someone explain again how an ML system scanning and training on a copyrighted work is different from a highly skilled artist doing the same?
It's legally different because the human brain is not considered a fixed medium under copyright law. A human experiencing and learning something is therefore not making what is potentially a copy or derivative work under copyright law, and so is not exercising, potentially without permission, one of the exclusive rights granted to the copyright holder.
> There are artists that can study a painting for a few minutes and then recreate it from memory
Right, and those artists violate copyright at the moment they recreate it in a fixed medium without permission, but encoding it into computer storage media is already a copy, so a machine (or, rather, the person using the machine) hits that threshold before creating an actual visual output.
> That is, we focus on the output of the process to determine infringement
No, we focus on what is set in a fixed medium. If that is done at an intermediate step of the process, rather than being an output, it can still be infringement.
It's just that a human doesn't always need to make a copy in what is legally a fixed medium until the output, but that is not the same as categorically treating only the output of a process as legally relevant.
People betting on the claim that all “lossy compression” is equivalent will lose. It's a lame argument. Can you encode images to be basically 1:1 with embeddings? Absolutely.
Is that what these embeddings are doing? No.
If artists try to argue that all copies are equivalent, but are unable to recreate their works from the embeddings, their argument will fall flat.
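To put rough numbers on why a typical embedding can't be storing the image 1:1, here's a back-of-envelope Python sketch (the 768-dimension, CLIP-style embedding size is my assumption; exact sizes vary by model):

    # Back-of-envelope: raw pixels vs. a CLIP-style embedding.
    # 768 dims is an assumption; exact sizes vary by model.
    pixels = 512 * 512 * 3          # 786,432 values in a 512x512 RGB image
    embedding_dims = 768            # floats in a typical image embedding
    print(pixels / embedding_dims)  # ~1024x fewer numbers than pixels

An embedding that small can work as a lossy summary, but it doesn't have room to be a 1:1 copy of the image.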
This argument also only applies to sharing models, which is doubly dumb because we want open source models, not closed source models. It's a harmful status quo to try to enforce.
> If an artist recreates a copyrighted work or creates a derivative too close to the original, then that new work is potentially copyright infringement.
I see no reason the same standard cannot be applied to ML generated content. If the evaluation is being performed on the end result, then that is all that matters. The same judges that decide these things for human generated content can continue to do so for ML generated ones.
Even the people submitting and responding to the copyright claims will still be human (with briefs generated by ML…).
What will be more interesting is when the judges themselves get replaced with an “objective” AI to quantify similarity for copyright purposes. If that ever happens, it'll trigger an arms race to hit the razor's edge without going over.
> how an ML system scanning and training on a copyrighted work is different from a highly skilled artist doing the same?
You just answered it? One is an ML system and one is a human?
I'm really, really baffled why people keep using this argument. Like you guys know machines are not humans, right? ...right?
Humans are special cases in law. Always have been and always will be (until AGI). A pedestrian is treated differently in law than a driver is. The fact that a pair of legs and a car both move you from point A to point B doesn't make them the same. Selling human livers at your local market is very different from selling cow livers, even though biologically they are all organic tissue.
Let me say it again: humans are special cases. AI learning copyrighted materials might be illegal or legal, but it has little to do with "what if a human being does the same".
Isn't copyright about the final product, not how it was arrived at?
If I independently come up with a song called "Let it Be" that has the same lyrics as the Beatles song and publish it without the permission of the copyright holders, I will have violated their copyright.
It doesn't matter if I heard the song before or not. It doesn't matter if I did it myself or used a computer to do it. What matters is the final product and my publishing of a song close enough to the one that was copyrighted.
AI image generators are just tools, like Photoshop is a tool. Nobody cares if you used a paint brush or Photoshop to create something that looks like a copyrighted image, why should AI image generators be any different?
If the final image is similar enough to a copyrighted work and I publish that image without permission of the copyright holder, then that's a copyright violation.
If the final image is different enough, then it's not.
That an AI was used and how the AI was trained are completely separate issues.
Patents do work this way, but copyright doesn't reach independent creation. In practice, though, if you reproduce a popular work exactly, it's so incredibly unlikely to be actual independent creation that the court will assume it's not.
Probably not even then, at least not initially. While some people conflate AGI with personhood, consciousness, qualia, etc. we've got at least 22 different[0] ideas of what consciousness is and no idea how to even determine whether or not a mind has qualia — and even if we did, I see no specific reason to require any of them, as a P-zombie[1] AGI doesn't seem to me like a contradiction in terms.
In law there is such a thing as a legal person as opposed to a natural person. When it comes to commercial law, its provisions tend to relate to legal persons.
Legal personhood (such as corporate personhood) is a tool/framework that facilitates certain economic actions, such as the creation of charters for specific initiatives.
Like any useful force amplifier, legal personhood has effectively been co-opted to benefit those entities which stand to gain from it (e.g., corporations, large-money political donors, etc).
Thus, it probably won't surprise anyone when AIs are granted legal personhood, to bypass the detrimental effects they will inevitably have on natural persons.
That said, it is likely a good thing to keep meatspace in some kind of privileged legal category, lest the rights of legal persons outweigh those of natural persons (which seems to be the inevitable conclusion, and goal, of all this).
I agree with you somewhat, in that human rights should be protected and law is just a tool, one that I feel should be used to better humanity.
There is a philosophical thought behind the whole idea of copyright; I suspect it'd be best to use that as a starting point, as I'm sure it would weigh the pros and cons of your point against other factors. It is fundamentally designed to strike a balance between a creator and wider society.
It's a matter of scale. No human being can ingest ALL existing images. If the average human artist were able to replicate any other work without effort, we would probably have seen two effects: first, much less art (because the gains would have been eliminated, so why bother), and second, much more restrictive copyright law. This is exactly what we should do now: avoid applying a law designed for human beings, and create a new, more specific, much more restrictive law. Otherwise future art created by actual human beings will suffer greatly (not to mention the loss of work and human abilities), to the economic benefit of a very small set of monopolistic players.
What you propose is just a different small set of monopolistic players.
Copyright has always been a trade-off between the creator and society.
It should be enforced the exact same way as it currently is. Fair use is fair use.
By your same logic, what is the difference between an AI or a very productive human? Where do you draw the line?
Fair enough. If that is so, there's even less of a case to be made. The person inputting variables (i.e., a prompt) into the model that prints out non-fair-use outputs would be at fault. The end work would be the actual object subject to litigation. It's just a tool then; the current system still stands.
Arguing that a trained model is infringing would then be like arguing the manufacturers of my monitor were infringing because it contains the very same RGB values as the artist used.
> Fair use by a machine is a concept which has hardly been studied by legislators.
I mean, that's true in the sense that "fair use by a pencil is a concept which has hardly been studied by legislators" is true. Use (fair or not) isn't by the tool, it's by the person using the tool.
The only way we end up with monopolistic players being the only ones that benefit from AI is if we misapply the outdated concept of copyright here. Open source models like SD are the path away from monopolization.
Let me change the argument around: Why is it assumed that because an artwork is freely available on the internet, you are allowed to train a machine to reproduce it, being in its totality or just details that are used in the creation of new works?
I.e., why couldn't an artist say: hey, I'm letting you see this painting, but you are not allowed to sit down with a canvas and learn how to reproduce it? Because you can do that in galleries - no photos, no reproductions.
So actually building a machine there, under the cover of darkness, that learns from your work so you can produce new work, why is that allowed in the first place? Certainly wouldn't be at a museum.
The key thing here is - if you want artists' data, you should ask for it. They didn't. This would be equivalent of training a Github CoPilot on every available piece of code in existence, ever, instead of what they had available. Why should that be allowed? So if I built some toy code in 1996, and happened to post it on usenet, and it's a great implementation of X, why the heck is CoPilot allowed to read it? It's my property.
> why isn't it that an artist could say, hey I'm letting you see this painting, but you are not allowed to sit down with a canvas and learn how to reproduce it? Because you can do that in galleries - no photos, no reproductions.
But you can't stop people from sitting and studying your painting and then painting stuff similar to it.
One of the core assertions being decided in this case is whether there is any actual reproduction here. Does a model contain a reproduction of every image it was trained on? Can the model actually create a reproduction of any image it was trained on?
If it turns out that there is no reproduction here, then it comes down to how much legal control we give copyright owners to regulate access.
A gallery can reasonably ban cameras and canvases, but it becomes a lot less reasonable if they try to ban artists.
Let's imagine that this isn't just specifically tuned ML but proper General AI that can learn new skills. Is your argument that this AI would be legally prohibited from viewing any images it doesn't have a specific license for?
I think that drawing hard lines around what kind of processing can be done on publicly available images is going to become problematic. It's better to regulate around what can be done with the results of the processing than that processing itself. That's how our existing laws work. Making a reproduction, even just from memory, of a copyrighted work is restricted. Memorizing a copyrighted work is not.
I find the whole comparison "it's just like a person learning" to be a tiring trope. It's demonstrably not.
Like I said to another poster - you've probably seen a Picasso. Can you make me a copy?
Because a Diffusion model can. But you can't. Why not?
Your denial that there is a demonstrable difference between human and machine attention is part of the core obfuscation these companies are using to win this battle, so I reject it entirely. That difference creates the whole issue. If you don't recognise it, then answer me - Why can't you paint me a Picasso? You're saying the Machine is just like a human, yet a simple question of reproduction tells you it's not like a human in any way. It's a machine, and it produces machine reproductions. It learns faster and more accurately than any human, and its purpose is to produce derivative works. If the machine didn't need human data to do this, this discussion would be academic. But it does.
So the whole future of the Arts will be decided by investigating what the machine actually does, not the simplistic idea that it's just like a human.
You have to evaluate the machine's abilities and its impact on the world. And that's the tough part. But just saying "hihih it's just a person" while it produces superhuman output is not a solution; it's just a lie invented by the people profiting from these models.
>Is your argument that this AI would be legally prohibited from viewing any images it doesn't have a specific license for?
This isn't a given; it is something that has yet to be decided in this case. (Edit: if you look at studies that search for examples of SD reproductions, the best examples are still similar to what a human trying to reproduce the image from memory would create.)
> Your denial that there is a demonstrable difference between human and machine attention
There are demonstrable differences between different intelligent systems. I have yet to see any demonstration that shows you can't reproduce human attention with a machine. (Though we can't do it yet.)
> If you don't recognise it, then answer me - Why can't you paint me a Picasso?
I haven't studied painting or Picasso. There are many people who can paint a Picasso as well as, if not better than, any ML model we have today. There are people who you could take to a gallery show who could go home and reproduce both style and individual works at an equivalent level.
> It learns faster and more accurately than any human,
The word, "faster", here is doing a lot of work. Machine learning can be "faster" in that it can happen in parallel and be scaled to take less time. However humans currently also learn "faster" because they require fewer repetitions or examples to learn. As such, the "learning" derived from a human viewing an image is arguably currently larger.
> Yes. You pay for access.
Is this good faith? I already stipulated that the images are publicly accessible. Are you suggesting that somehow artists should be able to block the AI from viewing a properly licensed instance of a copyrighted image? That pretty much results in a ban on general ai.
>Is this good faith? I already stipulated that the images are publicly accessible. Are you suggesting that somehow artists should be able to block the AI from viewing a properly licensed instance of a copyrighted image? That pretty much results in a ban on general ai.
No, it results in a ban on general AI that doesn't compensate rights owners.
So you have a general AI, it sees someone wearing a t-shirt containing a licensed, copyrighted image.
This AI now needs to pay the copyright holder of that image?
How I think it should work is the same as for any other intelligent system. Systems can view publicly available images, memorize them, and even reproduce them for certain fair uses. The systems have to pay for a license from rights holders for the non-fair uses of reproductions.
You're saying it sees someone as if AI is walking around and this all happened by random chance.
No. What happened was AI scientists deliberately built a giant corpus of training data based off unlicensed imagery that was conveniently pre-tagged - Artstation and other sites of the same type. And it was trained to deliberately create images of the exact same type as it was ingesting. It wasn't "randomly learning about the world" and it certainly did not "stumble upon" these images.
The fact that there was a large corpus of artistic imagery already tagged just revealed itself to be too appetizing for AI training, so a few companies did it in secret, without asking anyone for permission, hoping to make enough money and raise enough VC funding that they could defeat any challenge in court.
So yes, those people who made the original images should get paid.
By that same logic, would the manufacturer of my monitor be liable for displaying the unlicensed image at my request?
It's a tool. The end user, inputting variables into the model is generating that image.
> You're saying it sees someone as if AI is walking around and this all happened by random chance.
It's called a thought experiment. You are claiming that processing imagery that is publicly available for viewing constitutes an IP violation, and I'm taking that to its logical conclusion.
> based off unlicensed imagery
The imagery on Artstation is licensed. Artstation has a license to display those images publicly. If Artstation did not have that license, they would be the infringing party.
> so a few companies did it in secret
I'm not sure what the basis for this assertion is; it was not done in secret. Stable Diffusion was trained using https://laion.ai/blog/laion-5b/
This lawsuit isn't against the people who trained the model, but those who distribute it. The Stable Diffusion model was not released by a for-profit company; it was released as open source by a university research group which received funding from VC-backed companies.
The IP model you are pushing for is a huge expansion of the already problematic copyright system. It will curtail research and the training of publicly available models.
It is already easy as can be to copy images with a computer or digital camera. Stable Diffusion doesn't make it easier to reproduce these images and washing an image through a ML model won't protect you from that reproduction being an infringement.
I think it makes far more sense to use our existing restrictions to regulate usage of the output than to make new restrictions on the types of processing that are allowed on publicly available content.
If I am a decent human artist who's looked at many Picasso works and spent time studying how to reproduce his style?
Of course I can.
Artists copy each other's styles all the time, dude. You can literally go to Deviant Art (or wherever), scroll through, and point out examples of style copying, often much more glaring than anything I've seen out of a diffusion model.
If the model didn't learn anything important from Picasso, it wouldn't be in the training data.
This whole argument of "ah, but it doesn't really need it" doesn't hold up. If the model didn't need it, it wouldn't have used it in the first place.
Same thing with Artstation. It was of course propitious for AI scientists to find such a lovely database of high-quality imagery, all so helpfully tagged into categories.
> If the model didn't learn anything important from Picasso, it wouldn't be in the training data.
> This whole argument of "ah, but it doesn't really need it" doesn't hold up. If the model didn't need it, it wouldn't have used it in the first place.
I haven't seen anyone making this argument. There's a pretty clear difference between learning something from an image and memorizing it.
There also isn't anything illegal about memorizing an image and painting a reproduction. What you aren't allowed to do is sell or distribute that reproduction without a license.
I think it makes more sense to restrict what people are allowed to do with ML tools than to restrict what ML tools can do.
Of course it learned, that's the point of training.
You claimed the model can reproduce an image from that training data. That's false, and it's what the judge dismissed.
“none of the Stable Diffusion output images provided in response to a particular Text Prompt is likely to be a close match for any specific image in the training data.”
“I am not convinced that copyright claims based a derivative theory can survive absent ‘substantial similarity’ type allegations,” the ruling stated.
Whether using copyrighted data to train a model is fair use or not is a different discussion.
If you don't mind it being as bad as the result of a Stable Diffusion image being passed on to a half-trained robot arm, sure: extra limbs, even more of a David Cronenberg vibe than Picasso at his weirdest, mixed with mis-attributed ideas from other images that I've associated with the same labels…
When you publish it, you lose some property rights. While it is under copyright, there is a short list of things that others are prohibited from doing (reproduce, distribute, etc.). And you lose all your rights once the copyright expires.
The question is about copyright law.
You can raise other legal theories or ask congress to create an entirely new class of intellectual property law. Sure.
The lawsuit is about whether copyright applies, it seems to me.
This is the typical intentionally misleading argument in favor of AI: comparing software to a human artist while conveniently forgetting that a real artist cannot create millions of pieces every hour. That difference alone makes any direct comparison laughable, because that threshold was an absolute, immutable constant for all of human history until very recently, and it shaped, among many other things, the incentives artists had to pursue that career instead of any other. And then there are the societal problems that displacing so many jobs entails.
The issue is that ml companies are using exact copies of the work in question, and using it to make massive for-profit systems without permission or compensation. Individual artists don't threaten the market in the same way.
They download them from the public internet. The issue is that digital artists are basically required to maintain a public portfolio to get work. Their doing that is not implicit permission to use that work for whatever the fuck you want, as it says at the bottom of every image on Google Images.
> Can someone explain again how an ML system scanning and training on a copyrighted work is different from a highly skilled artist doing the same?
Because (fortunately) human thoughts can't be subject to copyright law yet. So when we talk about copying and making derivative works, if you have this
artistic works -> neural network weights
The end result may or may not be copyrightable (that's for the courts to decide), but this transformation itself involves fixing copies of the works in a medium, which is exactly the step copyright regulates.
I don't know. It seems pretty shitty that these systems are literally leveraging artists' work against them. It also seems shitty that we're trying to automate cultural expression. Even if it's not explicitly illegal, the AI art guys are still assholes.
Totally agree.
As someone who hates how commoditization has already permeated everything in the West, I think we're sadly in for many years of cheap regurgitated knockoffs in practically every domain of human output imaginable: pictures, music, literature.
All the while being told that what we were doing in these domains is 'just' this and 'only' that.
At the same time it is fascinating from a scientific perspective; when it lands in the same boat you're sailing, it becomes quite unsettling.
> Can someone explain again how an ML system scanning and training on a copyrighted work is different from a highly skilled artist doing the same?
Three things immediately spring to mind: scale (1), accountability (2), and profit (3).
1. An automated system can train on data at huge volume, in a way that no single human is capable of doing. Setting aside the issue that training an ML model and artists learning by copying techniques of other artists are, I would argue, fundamentally different acts, _even if we take them to be the same_, we have to acknowledge that in a single human lifetime one person can only "train" on so many works. Automated systems have no such limitation.
2. If an artist violates copyright or oversteps norms around artistic professional practice, they can be held accountable. Companies which violate this by using automated systems so far hide behind those systems ("the AI is doing it/did it") so aren't held responsible (it should be: the company has built the system, and therefore is responsible for how it is used, and what it does). By building up this false sense of agency on the part of systems (which the marketing term "AI" is designed to bolster), lack of accountability is laundered into the actions being taken at scale.
3. Automated systems are, due to their scale, very profitable. I can generate hundreds or thousands of copyright-violating works that dilute the market for artists, and it is incredibly cheap to do so. Fighting those copyright violations in court has to be done more or less on an individual basis (especially if actions like the one in the original article continue to fail), which is extremely slow and expensive. If the cost of violating copyright is tiny, and the cost of enforcing it is huge, then it ceases to be a useful tool except for the most well-resourced organizations.
> It seems an ML tool could add a filter to the output and refuse to output a work that too closely resembles one or more work under copyright. Isn't that basically what legitimate professional artists do as well?
No, because copyright is more complicated than "these two things look a lot alike", and legitimate professional artists don't run into this issue, because they aren't constantly trying to skirt the line of "as close as possible to copyright violation while still getting away with it".
> Thousands of artists are capable of infringement, but we don't take away their brushes based on capability.
But they do get sued when they infringe! Enforcement happens, because (for now) it is still possible for independent artists to enforce their copyrights. The argument being made by artists with regard to these ML models is that _they are already infringing copyright_, not that they hypothetically may in the future.
This lawsuit was always weird because it was a much, much weaker case than the GitHub Copilot lawsuit by the same firm: at least with text you can point out exact infringement, but the Stable Diffusion lawsuit (https://stablediffusionlitigation.com/) seems mostly based on inaccurate technical memes like "diffusion is just compression", without examples.
Having done way more corporate court than I want (patents, mergers, liquidation), I'm increasingly convinced that the judicial system is fundamentally flawed.
The reality is that the law in 2023 US is so obscure and opaque that how judges come to their rulings seems to be total whim, with no actual philosophy other than maintenance of the system.
Further, I'm extremely unimpressed with the competence on display from the vast majority of judges, such that contempt should be the starting position.
The fact that this is how laws are actually made (precedent in application will always beat the letter) means that nobody without a warchest will be able to actually use the system coherently.
As with everything now, courts are ruled by those with the most money.
Not to be too rude, but you’re not an attorney and couldn’t be more wrong.
The law has never been more transparent. The public has nearly complete access to every docket in the country. Moreover, the level of jurisprudence has never been higher.
Moreover, I’ve lost a case or two in my time, but it was never because of a lack of a warchest.
There has been a steady march forward in the quality of judicial output. There are still dumb judges and crooked judges, but overall, my experience is that younger judges are more willing to listen and be educated than judges of yesteryear.
Additionally, Westlaw is hated by all for its pricing, but it’s almost impossible to comprehend how significantly it’s improved, and equalized, legal research.
In which a person who professionally practices law, has been to law school, has a higher-than-average IQ, and has a decade of experience describes how easy it is to interpret the legal system.
Hey, quick question: my good friend, let's call him Doug, is a high-school-equivalency graduate, has a few felonies, and currently works as a road flagger.
How does this complete access to every docket in the country help him?
You have explained my point better than I could have. Thanks.
The final decision made by this article is one I agree with, with or without money, and I have no skin in the game.
Every piece of creation you and I make is the sum total of our experiences, and that includes copyrighted work. Holding an LLM guilty for that is like holding the human brain guilty for memorizing copyrighted work.
And yet these systems are incapable of genuine creativity. If they were, they would be taught rules & techniques and set off to their own devices to draw, like humans. But, they can't and they're not. LLMs and humans don't learn or create in the same way. Moreover, there's no reason we should grant LLMs the full rights and privileges of humans.
And yet humans plagiarize all the time. And how do you measure genuine creativity? See the Chinese Room thought experiment. Also, we generally don't make our tools able to "set off to their own devices", because that would be silly. I agree with the substance that software operates differently from humans and that we should currently maintain a distinction between software rights and human rights. But I do not believe copyright is a fundamental right, merely a legislative one (and one, I might add, that has been stolen from currently living humans: I'm in midlife and can't creatively revamp works that are twice as old as I am. How is it fair that I can't commercially use nostalgia from my childhood as an adult?).
I don't follow. Which standard is that? Many (most?) artists do in fact fail, particularly if they're unable to find a creative way to differentiate themselves. Society generally shuns plagiarism. We call out things that are deemed as "knock-offs", whether they're bands, video games, movies, books, or clothing.
Given the terse reply, I'm guessing you disagree that it's even a concept. So, to round out the inevitable circular discussion I've gone ahead and asked ChatGPT for you:
Genuine creativity, often attributed to humans, is the ability to generate, imagine, or invent something new and original that has value or meaning. It involves thinking beyond existing boundaries, making connections between different pieces of information, and coming up with innovative solutions to problems.
Creativity can manifest in many forms, including:
Artistic creativity: This is often the first thing people think of when they hear the word "creativity". It includes creating visual art, music, literature, and more.
Inventive creativity: This involves coming up with new products, technologies, or methods that solve problems in novel ways.
Conceptual creativity: This involves developing new theories, models, or ways of understanding the world.
Problem-solving creativity: This involves finding unique solutions to challenges or problems.
Genuine creativity is often characterized by originality, expressiveness, and the ability to transform or redefine existing ideas or norms. It's a complex process that involves both conscious and unconscious thinking, and it's influenced by a person's knowledge, experiences, personality, and environment.
I went a step further and asked if it's capable of genuine creativity and received:
As an AI, I don't possess creativity in the human sense. I don't have feelings, thoughts, or experiences, and I don't generate ideas or concepts spontaneously. However, I can generate unique combinations of information based on the vast amount of data I've been trained on. This can sometimes appear as "creativity", but it's important to note that it's a result of complex algorithms and computations, not genuine creative thought.
> Given the terse reply, I'm guessing you disagree that it's even a concept.
My terse reply actually indicates an understanding that we completely lack a formal definition of "genuine creativity", and therefore any such claims are vague intuitions at best.
> I don't have feelings, thoughts, or experiences
This implicitly assumes we have a mechanistic understanding of feelings, thoughts, or experiences. We don't, therefore we can make no such definitive claims about how machine learning and human cognitive processes differ. ChatGPT has specifically been trained to give this response, despite agreeing with an argument that suggests it could indeed have mental states:
> As an AI, I don't possess creativity in the human sense. I don't have feelings, thoughts, or experiences, and I don't generate ideas or concepts spontaneously
Define "spontaneously". If you mean that humans act without apparent cause, that does not entail there is no cause. If there is a cause, then that cause can be modeled as an input to a pure function. ChatGPT and other ML systems are pure functions that can also mix concepts and generate new and unique outputs from their learned state space given such inputs. Humans are still more complex than such systems, so the mystique can hide in the perceived complexity, but don't mistake this for a different kind of process. Which isn't to say that it is the same process; I'm saying there's no real basis for either claim.
I think there's a lot of sloppy thinking going on when comparing human brains and ML, particularly ascribing some sort of exceptionalism to humans. There's a long, incorrect history of that.
Define “understanding”. Please provide a formal definition for “vague intuition”.
These sorts of clipped sentences without any supporting context are understood to be delivered in bad faith. If you want to discuss in good faith, then elaborate. From here it sure looks like you wanted to score points by derailing the discussion on what you’ve deemed to be poor word choice.
I've elaborated plenty in the post you literally just responded to. Nothing further needs to be said. Suffice it to say that your original claim was unjustifiable given what we currently know.
You didn’t elaborate on the points I asked about. You literally argued with the response from a chat bot. You can’t just say “QED, I win”. Until I get satisfactory definitions the rest of your response is unjustifiable.
That’s quite alright. I’ve written amply about the topic of creativity in this thread in order to delineate it from “creative” meaning “something that creates”. If you don’t know what “genuine” means my Webster’s definition isn’t going to help anything. I’m not keen to have a debate over adjectives when I’ve more than explained my stance. I never claimed to be introducing some new term of art. If that somehow means I’m bluffing, so be it.
I viewed the curt demand I provide a definition as rude and a cheap tactic intended to disrupt a discussion. And the ensuing discussion with that particular person did nothing to convince me otherwise. Fortunately, I don’t owe them a reply and you don’t have to believe me. It’s all good.
How do you know the answer isn't a hallucination? Maybe it made the answer up creatively and is lying to you about its own capabilities.
The fact that it knows the definition of creativity makes its answer suspect. It's like saying: "I am not capable of speaking or understanding English; what you see here is just a statistical prediction of the next most likely words. I do not in actuality understand or speak English."
We don’t know the associations they form. But, we know how they’re built and that they’re not reprogramming themselves. That establishes a very strong bound on what they’re doing.
No it doesn't. The model we use to create these things is general enough that it can be applied to ALL forms of intelligence.
Basically it's a best fit curve in N-dimensional space. The entire human brain can be modeled this way. In practice what we end up doing is using a bunch of math tricks to try to poke and prod at this curve to get some sort of "fit" on the data.
There are an infinite number of possible curves that can fit within this data. One of these curves is the LLM. The other curve is the human brain.
Here's a better way to put it. Your entire OS can be modeled under this idea. We can use a bunch of training data and basically recreate your operating system under ML. Just feed in current state and train it until it will output correct state.
But understanding the OS from this ML perspective is a far cry from understanding the OS from the perspective of source code.
We do NOT fully understand what's going on. In fact, that the LLM even had ChatGPT's capabilities was not predicted or foreseen.
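To make the curve-fitting point concrete, here's a minimal numpy sketch (illustrative only; the data points are made up): many different curves are consistent with the same finite training set, and they disagree everywhere in between.

    import numpy as np

    # Five "training" points sampled from some unknown process.
    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
    y = np.array([0.0, 0.8, 0.9, 0.1, -0.8])

    # Two different models, both consistent with the same data:
    # a degree-4 polynomial passes through every point exactly,
    exact_fit = np.polynomial.Polynomial.fit(x, y, deg=4)
    # while a degree-2 fit approximates the same points differently.
    rough_fit = np.polynomial.Polynomial.fit(x, y, deg=2)

    # Between the training points the two curves disagree,
    # even though both were "trained" on identical data.
    print(exact_fit(0.5), rough_fit(0.5))

The LLM and the human brain are, in this analogy, two very different curves drawn through overlapping data.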
Which is precisely why I said “genuine creativity”, in hopes of avoiding pedantry around word etymology. It's hard to have these discussions when people are deliberately being obtuse. By that broad definition, nearly everything is creative, making any discussion about it meaningless. So, let's use the connotative meaning.
Ok, yeah, let's avoid that bullshit then. The problem is there's enough leeway between the words that I could honestly say everything LLMs generate is representative of what most humans define as "genuine creativity", and I bet that if I showed you human art and computer art side by side, you'd be incapable of knowing which one is genuinely creative.
I hate pedantic vocabulary just as much as you do, and this is not the direction I want to take it. But the test I outlined above literally points out that there is no difference: what the LLMs output fits our definition of genuine creativity because you can't tell the difference.
In fact, the term itself is the ludicrous thing here. You just made it up to differentiate AI art from human art, but in reality there is no differentiator; it's one category with zero recognizable difference... the only actual difference is "what" created it.
The systems in question require learning from art generated by humans. If they didn't, they could avoid all of this IP mess by learning how to draw. The supposition I'm pushing back on is that humans only generate art by regurgitating what came before them and I don't see any basis for that claim. We have art formed by completely isolated societies in very distinct styles. Children draw all sorts of fanciful creatures that they've never seen in the wild or in other art. Artists have developed different techniques for capturing their work and it can resonate with people. We have prodigies capable of creating symphonies before they can do much else in the world.
Sure, there's incremental evolution. But, we also have breakthrough artists inventing new techniques and art forms. It doesn't matter that a computer program can clone a power chord structure and create something that sorta sounds like Nirvana. That's not proof of creativity. Yes, it created something, so it's "creative" in an entirely mechanical sense. Just like solar flares will flip bits in my computer and "create" things as well.
We can argue all day about what art means. It can get really philosophical really quick. I contend people can dream up new ideas and execute on them in a way that resonates with others and that they don't need to copy everyone else to do that. That given the basics (here's some paper and colored pencils) people can develop skills and invent wholly unique ways to represent themselves and the world around them. That they're able to do that in isolation and without education. I point to the entirety of human civilization as my supporting evidence. I think it's reductive to claim we're just statistical models consuming media and shuffling things around.
It seems to me this whole argument hinges on saying humans and these image generation tools work the same way. If they do, then teach one of these programs what it means to draw, give them a sensor network to the outside world, and let's see what they generate. That would be hugely compelling and would sidestep this whole discussion about IP whitewashing. But, that's not what's happening. Whether because of convenience or because it's the only practical way to generate art, these systems only work by training on art created by humans. That they're able to generate a final product that looks like something else made by a human shouldn't be shocking -- that's the whole basis of copying.
Take a look at the image on the lower right hand corner. That is indisputably original. The badge doesn't exist anywhere else and the alien form with one entire leg jutting out of the torso has never been done before. We know that this one legged creature is entirely original because it doesn't exist in any of the shows.
This is the key to how you know LLMs aren't regurgitating stuff. It's trying to reproduce something from a flawed understanding of reality. A regurgitation would get things truly correct, but a one-legged human is a creative error due to a lack of understanding. The LLM doesn't understand reality as completely and cohesively as we do, but it understands an aspect of it well enough to produce art that mostly works. The LLM is definitely creating stuff from pure thought. These things are not copies, and that's what you don't get.
The hype for LLMs is so over the top that it looks like the latest outrage from something that occurred on social media. What you and other people are missing is that we crossed a certain AI threshold here. This isn't mere regurgitation.
>We have art formed by completely isolated societies in very distinct styles
prompt: Draw art in the style of a society or civilization that has never existed. Make the art very distinct in style such that the style is very divergent from anything that has been seen before.
>It seems to me this whole argument hinges on saying humans and these image generation tools work the same way. If they do, then teach one of these programs what it means to draw, give them a sensor network to the outside world, and let's see what they generate.
No. The argument hinges on something far more insidious. Suppose I showed you two pieces of art side by side: one was AI-generated in seconds, and the other was created through pure passion and hours and hours of hard work and toil. If you can't tell which one was AI-generated, then all that toil and passion is useless.
Why? Just observe how art and content is perceived before and after you tell people it was made by AI. I think it's an unfounded bias.
You probably remember the AI art piece that won an art contest? It was perceived as better than the rest, obviously, or it wouldn't have won - until it was revealed it was made by AI.
Now, IMHO, that was not fair and if it was disqualified, that was absolutely the right move, but that's not the point.
The same can be observed when people talk about (AI) art. I've seen people comment "Awesome! There is just something about [this artwork] that AI can't reproduce, human art has soul!" even though the artwork they commented on* was made by AI.
After it is revealed to be AI made, it is suddenly "soulless" and not genuine anymore.
I remember there was an online mental health care platform that (without revealing this) introduced AI therapists. According to their analytics, the AI therapists, on average, got higher scores than the human therapists. Then, word got out they're using AI and suddenly, people rated the service a lot worse than before. It was obviously wrong to not disclose this.
There's nothing wrong with valuing craftsmanship and human work higher than machine work. We as humans do and I do, too - but generally, nowadays, digital watercolor art made in Photoshop (and co) is not inherently considered "not creative" or "soulless" because it wasn't painted with real watercolor... but I'm pretty sure that was not always the case. I'm sure artists discussed about how digital art is not genuine, creative or soulless back then, too.
Can anyone that was there at the time tell me if there were similarities in sentiment when digital art and Photoshop (and co) became popular?
What I'm trying to say is that there is definitely a bias at work and people generally can't tell the difference if they are not told about the (possibility of) involvement of AI.
I also fear that currently, we're just the lame adults that don't like the new thing and eventually in 10 to 20 years after AI art is normal and accepted by the generation that grew up with it, something new will come out that that generation will think is lame and not genuine, and so on.
* That is how I remember it. I'm relatively sure, but take it with a grain of salt - memories are, as we know, not reliable.
This argument gets repeated often enough that it implies that there are a significant number of people who actually believe it. This is pretty depressing, as the only way you could think that a human is not fundamentally more capable of creativity than an LLM is if you are incapable of imagining anything other than a life of ‘consuming content’.
Or maybe it’s depressing to you because of your vested belief in some kind of exceptionalism for humanity. We’re just big biological machines - we run on electricity and chemicals. Just because the materials might be a bit different between a human brain and an LLM, doesn’t mean that fundamentally the process of gathering, digesting, and regurgitating data works any different.
We see things, we recognize patterns, and we reproduce those patterns to the best of our abilities - that is the human creative process in a nutshell. It’s the process that lead to the creation of LLMs in the first place - why should our creation not follow the same methodology in order to create the same way we do? Why should a computer looking at a hundred pieces of art and producing something objectively similar to what it’s seen be any different than a human being doing the same thing?
Even if this is true, the sum of a human's total experiences is far more vast than what LLMs are trained on and is mostly non-copyrighted material. Your eyes are open ~16 hours a day, constantly exposing your brain to various non-copyrighted stuff like trees, rocks, animals, the insides of your house, the crap on your desk, other people's faces, traffic, etc.
Human society is built around humans, so there is a reason I'm allowed to store art and stories in my brain: I and the vast majority of people can do jack all with it, would pose little threat to the owners if we could, and would draw attention if we did, and it's not worthwhile or feasible to sue an entire population of content consumers for consuming content. So people act smart and pick their battles, and society can function.
LLMs and diffusion models don’t “consume content” and the content they produce doesn’t come from inspiration, greed, boredom or any number of human traits. They don’t remember the Target logo to go shop there or a college class they took to apply for a job so they can be a tax paying member of society.
>LLMs and diffusion models don’t “consume content” and the content they produce doesn’t come from inspiration, greed, boredom or any number of human traits.
No, LLMs use a cold and calculating algorithm to produce content that is equivalent to, and at times better than, anything a human can produce. When a human consumes content produced by an LLM, it is indistinguishable from content that came from inspiration, greed, boredom, or any number of human traits. You can't tell. That is the future. We can identify flaws now, but we all know those flaws will be rapidly disappearing.
What the LLM tells us, what it teaches us, is that the human condition is trivial. We are biological machines made out of wetware, and the LLM is a solid-state machine made out of silicon. Two machines that make content. We like to pretend that the stuff that comes from inspiration, greed, or boredom has deep intrinsic meaning in the universe. No. It's a lie we tell ourselves. We make up the meaning, and so does the LLM.
Society is fucking changing. Creativity, art, works of inspiration... all that stuff will become as common as water and plastic cups. That's just the way things are; accept it or don't.
But here's the thing. Don't pull justice into the mix. Don't say that a crime was committed because an LLM produced something better and faster than a human could. Don't call obviously original writing and original art a COPY when it is obviously NOT. These are the same tactics used by patent trolls.
This is a twisting of justice because people are afraid. No different from how a patent troll twists justice because people want profit.
“Don't say that a crime was committed because an LLM produced something better and faster than a human could. Don't call obviously original writing and original art a COPY when it is obviously NOT. These are the same tactics used by patent trolls.”
The law is arbitrary, written by humans, not by some platonic ideal inherent in the universe. It doesn’t need to be applied equally to people and tools.
The people producing these models just want profit as much as any patent troll and like them will interpret the law in such a way that benefits them or fight to change it so they can realize that profit.
It’s easier to go after the models and prevent copyright for AI generated works than to take issue with all of their users which is what we’re seeing in the courts.
“Society is fucking changing”
I agree with you there.
I'm not fundamentally opposed to AI-generated anything, actually, but the training is problematic and I don't see society as ready for it.
It would be an easier pill to swallow if we had a shortage of writers/artists/etc and this was the solution to that.
"Experiences" and "content" are not equivalent. To take a very simple example: Go outside. Look at some geraniums. There are probably other examples, but red geraniums are pretty common and cheap in temperate climates. If you've been looking at screens for a long time, geranium petals look impossibly red. Red roses also work. Why is that? Because they're a real physical object whose colour properties cannot be accurately reproduced in an image using standard RGB colour. Even after billions of dollars of R&D and decades of research, the best sensors and screens in the world cannot still accurately reproduce the simple act of looking at a flower. And yet here you are claiming that these models that've been fed nothing but digitised facsimiles that we know can't even accurately represent common situations are equivalent to the experiences of living beings?
Here's a common experience most people will have had: Think about what it feels like to slip and fall on a wet road and skin your palms on some gritty tarmac. Think about how many senses that involves, and how it's a basic experience most people have had. And now consider how we aren't even remotely close to the process of beginning to digitise such a basic experience.
> Because they're a real physical object whose colour properties cannot be accurately reproduced in an image using standard RGB colour.
'Real physical objects' don't have 'colour properties', as grouping together certain wavelength ranges of EM radiation and then assigning them a 'colour' label is a human invention.
For example, even if every human disappeared from Earth tomorrow, objects will still emit EM radiation, but won't emit human invented labels.
Isn't the more severe offense, and the one relevant here, more about distributing a recording than making a recording? Nobody will care if your brain has a copyrighted melody saved in it, but they'll care if you start putting it (or something substantially equivalent) all over the place where others can obtain it.
Copyright holders aren't complaining about the training per se, they are complaining about the distribution and outputs of the models which in many cases directly regurgitate the training data.
In most cases it won't regurgitate the training data. What happens is that the model essentially fits a continuous curve between the training data points. The number of points on that curve is 9999999x more than the training data, and that 9999999 is not an exaggeration; it's likely too small a number.
I disagree, the size of the models are a lot smaller than the training data.
Just because I make an algorithm that linearly interpolates between two (copyrighted) values doesn't mean that it is creative or holds the wisdom between them.
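To illustrate that point with a minimal Python sketch (an analogy, not a claim about how diffusion models actually work): linear interpolation yields infinitely many "new" values between two fixed endpoints, yet every one of them is fully determined by those endpoints.

    # Linear interpolation: infinitely many outputs between two
    # fixed values, but no information beyond the endpoints.
    def lerp(a: float, b: float, t: float) -> float:
        return a + (b - a) * t

    # Every t in [0, 1] yields a distinct in-between value,
    # yet all of them derive entirely from a and b.
    print([lerp(10.0, 20.0, t / 4) for t in range(5)])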
You seem to have missed the part where the judge said that the only things that can be claimed as copyrighted are those that were submitted to the USPTO for specific, narrow copyright.
"The other problem for plaintiffs is that it is simply not plausible that every Training Image used to train Stable Diffusion was copyrighted (as opposed to copyrightable), or that all DeviantArt users’ Output Images rely upon (theoretically) copyrighted Training Images, and therefore all Output images are derivative images"
This displays either ignorance as to how artists work and the extent to which they are involved or can be involved in the legal copyright system, or reflects incoherence around the copyright system.
In this case the Judge chose to say, in effect: unless you have explicitly copyrighted it, it's fair use
That is now a new precedent that negatively impacts individual artists who have no power in the market, and protects giant corporate interests which have tons of power in the market
You're arguing something slightly different. The artist has copyright as soon as the work is published, whether they register it or not. You can't sue unless you've registered.
> You seem to have missed the part where the judge said that the only things that can be claimed as copyrighted are those that were submitted to the USPTO for specific, narrow copyright
The judge didn't say that (for one thing, copyrights aren't handled by the Patent and Trademark Office.)
What he did say is a bit bizarre, because copyrightable works are copyrighted automatically when set in fixed (including digital) form. So, if it existed to be trained on and was copyrightable, it was copyrighted.
The judge may have been using sloppy language to refer to registration, which legally must be done before pursuing most copyright claims in court. (This isn't about being copyrighted, and registration can happen after the alleged infringement without invalidating the claim, but it does have to happen before filing a lawsuit; it is a procedural requirement that is black and white in the law.)
> Except for an action brought for a violation of the rights of the author under section 106A(a), and subject to the provisions of subsection (b),[1] no civil action for infringement of the copyright in any United States work shall be instituted until preregistration or registration of the copyright claim has been made in accordance with this title.
[NB: 106A(a) is right of attribution].
It's the law, as enacted by Congress (even though it is probably a violation of the TRIPS Agreement, an international treaty signed by the US).
But note there is nothing preventing you from registering your copyright well after you first published the material, although the work has to be registered before infringement if you want statutory damages (the big $$$$) instead of just actual damages.
"there is nothing preventing you from registering your copyright well after you first published the material"
Your point is exactly what people in power want. IT IS CODIFIED IN LAW that unless you go through a kafkaesque process, your work can be reused and you get no compensation for it.
That is precisely the OPPOSITE of "just" and the law was written by capitalists for capitalists.
Just look at the original precedent case:
https://casetext.com/case/vacheron-constantin-le-coultre-wat...
"two successive applications for a certificate of registration were refused by the Register of Copyrights upon the ground that the subject matter was not a work of art within the requirements of the Act"
Perfect - so we set up a kafkaesque process that is opaque and up to a handful of unelected elites to make rulings on art (oh please), and if you don't do this then it's Legal and therefore right.
So why are you angry at the judge and not Congress?
(Also, fwiw, your copyright registration being denied is sufficient to bring action. You just now have to plead why you have copyright despite the US Copyright Office disagreeing.)
> Your work is under copyright protection the moment it is created and fixed in a tangible form that it is perceptible either directly or with the aid of a machine or device.
> Do I have to register with your office to be protected?
> No. In general, registration is voluntary. Copyright exists from the moment the work is created. You will have to register, however, if you wish to bring a lawsuit for infringement of a U.S. work. See Circular 1, Copyright Basics, section “Copyright Registration.”
I’m surprised to see this reasoning (well, all billion images can’t have been registered, so they can’t sue). I wonder how much it would cost in copyright fees to register all of LAION.
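For a rough sense of scale, here's a back-of-envelope sketch. Both numbers are assumptions on my part (LAION-5B's roughly 5.85 billion image-text pairs, and a $45 single-work electronic registration fee), so treat the output as an order-of-magnitude guess:

    # Back-of-envelope only; both figures are assumptions, not official
    # quotes from the Copyright Office or LAION.
    images = 5_850_000_000   # approx. image-text pairs in LAION-5B
    fee_per_work = 45        # USD, assumed single-work e-filing fee
    print(f"${images * fee_per_work:,}")  # -> $263,250,000,000

On those assumptions, registering all of LAION would cost on the order of a quarter of a trillion dollars, before counting the paperwork.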
> I’m surprised to see this reasoning (well all billion images can’t have been registered, so they can’t sue).
Well, here's the actual text from the decision:
> Each defendant argues that McKernan and Ortiz’s copyright claims must be dismissed because neither of them has registered their images with the Copyright Office. They also move to “limit” Anderson’s copyright claim to infringement based only on the 16 collections of works that she has registered. See, e.g., Declaration of Paul M. Schoenhard (Dkt. No. 51-1), ¶¶ 5-6; see also Compl. ¶ 28 & Exs. 1-16.3
> In opposition, plaintiffs do not address, much less contest, McKernan or Ortiz’s asserted inability to pursue Copyright Act claims. At oral argument, plaintiffs’ counsel clarified that they are not asserting copyright claims on behalf of these two plaintiffs. July 19, 2023 Transcript (Tr.), pg. 17:1-5. As such, McKernan and Ortiz’s copyright act claims are DISMISSED WITH PREJUDICE.
> Likewise, plaintiffs do not address or dispute that Anderson’s copyright claims should be limited to the collections Anderson has registered. The scope of Anderson’s Copyright Act claims are limited to the collections which she has registered.
TL;DR: plaintiffs didn't attempt to argue that the copyright claims should be construed broadly, defendants argued they should be narrowed, so defendants win at the motion-to-dismiss stage. The defendants actually lost their argument that, because Anderson didn't identify the specific registered works, the entire case should be thrown out; the judge said there's enough specificity to let the case go to discovery to figure out which registered works may have been infringed.
For me this argument will hold water when we can put LLMs in jail if they commit a criminal act. Until then, an LLM is not a human and not entitled to be treated like one.
Moreover, at least in the case of music, people have been successfully sued when their song strongly resembles another copyrighted work. Thus "holding the human brain guilty for memorizing copyrighted work" is actually the status quo.
Can somebody explain how this will not kill any incentive to publish anything?
Why would any human spend their limited lifespan to create a piece of work that will be grabbed without permission, approximated algorithmically (at least on the surface) and reused in infinite possible small variations without any attribution or remuneration whatsoever?
This feels like a reversion to medieval times with minimal trade between regions as thieves would ambush traders and steal any goods.
I despise the underlying belief here: the belief that the only reason people create art is a desire for fame and monetary gain. Which is so demonstrably incorrect that it boggles my mind.
This is where centuries of copyright law have gotten us, brainwashing people into thinking ideas are property ("intellectual property") and should be treated as rivalrous goods, and that the only reason to be an artist is to profit. Brainwashing people into thinking the only way we'll have art in the world is if we maximize the profits of commercial artists.
Take one brief look at the internet, music, video, podcasts, a museum, the walls and refrigerators in people's homes, a kindergarten classroom, an art class, or hell, this very forum. And you'll see that it's universally true that people like creating stuff because people like creating stuff. For free. Because it's fun and stimulating. That's inherent in us. We do not need laws to prop up an artificial business model for humans to maintain our drive to create.
... but you do have rent to pay, don't you? You know that artists also have to live somewhere?
Why do we have stars on GitHub? Maybe not for fame, but they are a good indicator of how good somebody is. Fame is important. We do not make GitHub repositories for "stars", but I think they are a good motivator for people to continue what they're doing.
If there is an author who has spent years of his life producing some piece of music, shouldn't there be some kind of law protecting his work from theft?
I agree in part with the parent comment: art can be done as a hobby or as work. If the work part is being replaced by generators, then art will be confined to a hobby, and maybe that will be better for everyone. It would put a stop to the money-laundering, multimillion-dollar "art" racket.
Btw, regarding copyright use, I don't see how AIs are not in the "fair use" category: they take human-generated content and apply a transformation to it, generating new images. The only problem is that generating images is a million times quicker and cheaper.
If a 'work' was created for viewing only, and a "big tech company" sends employees to spy on and capture anything 'work'-related during human training, with the mischievous intent to copy everything without permission or compensation, so that the author cannot use it anymore, then NO, I do not find that ridiculous.
What a shit opinion. As most of us are professional programmers here, surely we can understand the benefits of professionalization? We as a society lose something if the economy can't support the mastery of these skills, including the ability to create the training data these models depend on.
- The vast majority of art created is done by people who don't have a full-time job creating art. So why does art have to pay the rent?
- Lots of art still pays the rent in ways that aren't threatened by AI. I haven't seen any convincing arguments that AI will be the death of commercial art; it seems more like a new tool that many commercial artists will use.
- Business models change. There have been countless industries and skillsets that have been made obsolete by various technological innovations that have automated what previously required skilled workers to do. Society adapts. Why should we make this particular technological progress illegal, just so we can freeze a particular business model in time? We don't do that with other professions. Why can't artists adapt and find new ways to make money?
- Similarly, why do methods of getting famous for art, or being motivated for art, need to stay consistent and unchanged over time? Why can't artists figure out new ways to get famous or get motivated? Why should we freeze or outlaw technology just so artists don't have to change from what's always worked?
- Copyright violation is not theft. Theft is when you take a rivalrous good from someone else who owns it. If you grow an apple, and I take it, that is theft, and it's wrong because you no longer have that apple. Copyright violation is not theft. If you draw a picture, and I copy it, that picture has not been stolen, and you have not been deprived of it. This is a crime very different than theft, and the laws behind it are very different than the laws behind theft, as is the reasoning behind those laws. So it would behoove you to stop equating it to theft.
I didn't get that from the poster you're replying to. I agree with you that people will create because people are people and want to create.
But people also have to eat, need shelter, want kids, have to take care of health issues, and so on. For that they need money. If they can't get money from their creations they'll spend less time creating and more time engaging in activity that generates returns.
> If they can't get money from their creations they'll spend less time creating and more time engaging in activity that generates returns.
On the other hand, artists who adapt AI into their workflows will have vastly improved productivity, so the amount of time they need to create new works can also be much smaller for the same amount of output.
There will still be some artists who can avoid using AI and make a living, but like all skilled industries that have already been disrupted by automation, the market for that kind of work will become smaller and more exclusive, and that’s probably OK. Live musicians were the only way to get music on the radio, or in the cinema, or at a party, until audio recordings decimated the industry[0]. You can still buy hand-made furniture, or clothing, or cars, but most people today get them from a factory.
I do think there are serious issues with large players attempting to corner the market and extract all the wealth for themselves, but this is an old problem[1], and I think on balance these lawsuits perpetuate, rather than attempt to solve, the root problem: a missing social safety net and no guarantee that people will not have to scramble to survive every time a disruptive technology emerges.
This part of the discussion is mainly about economics, not expression.
The parent argument was that AI will force artists to spend more time engaging in non-artistic pursuits because the amount of money the market will pay for artwork will go down. The counterpoint is that the amount of time the artist needs to spend on creating artwork is also reduced. In other words, if the market value for creating one artwork goes down by 50%, but the amount of time it takes also goes down by 50%, then there is no change except that the artist gets to double the amount of work they produce for the same amount of time.
Regarding the natural human desire to create art, unless you’re famous, making money as a creative involves mostly shitwork. Most commercial projects are just not very fun or interesting. Tools like generative AI can help to automate away a lot of the shitwork, so when the really interesting projects come along, they can be given more energy. This, to me, also seems like a win.
You are going off on an irrelevant tangent (whether people enjoy being creative - which is obviously true, at least for some) instead of answering a very clear and simple question: how, in your evolved and less broken universe, will talented people dedicate their lives to producing something that society does not acknowledge or reward but simply appropriates?
Why do we as a society need to answer that question?
When automobiles were popularized, entire generations of families and people devoted to horsecraft suddenly found their business model obsolete. Did we as a society need to come together to ask how talented people in the horse industry might be able to continue to profitably dedicate their lives to that task? No, we just let the market figure it out, and allowed change to occur.
Or take an alternate thought experiment. Imagine a world where recipes were patentable. The first person to make mac-and-cheese could patent that, claim ownership of mac-and-cheese, and ban every other person and restaurant from making and selling mac-and-cheese. There would, of course, be an entire industry devoted to this. There would be restaurant chains that own the idea of pizza, burgers, etc., and stop any and everyone else from making that. There would be small and boutique recipe crafters creating and profiting from their unique recipes, too. And of course, people in this industry would ferociously sue anyone who "stole" the recipes that they "owned." And they would ask the same questions you're asking: "What ever would we do in a world where recipe owners don't get to uniquely monopolize their creations because they're allowed to be appropriated by others?" And the answer is: that business model simply wouldn't exist, and the world would be just fine without it.
It's not clear to me why any particular profession or business model needs to be protected into existence. The world will be just fine if some business model that always worked gradually ceases to become viable. I have sympathy for the people in those professions, but it happens all the time, and is a necessary consequence of technological innovation and progress.
In the 1800s, the Luddites smashed up factory equipment and tried to make it illegal, because they wanted to protect labor jobs. Thankfully they lost, and now we have a world with better jobs, which they could not have imagined.
If a copy of someone's work is used to create a product (in this case, to train the GenAI), then it makes sense for the creator to receive some form of credit.
Your car/horse example makes no sense in this context because the existence of cars is not predicated on the existence of horses. On the other hand, GenAI is not possible without the art that forms the training data.
That's not how it works, never has been how it works, and should not be how it works.
For thousands of years, human workers and artists have trained themselves by looking at copies of other people's work, and used the ideas and inspiration they gain from that material to produce new works. Artists will literally sit down and copy another artist's work as part of training. This happens as a matter of course. Sometimes the new works produced after training are fresh and original. And just as often, they are derivative copycats (see: fan fiction, most clothing, most music, half the stuff on DeviantArt, etc.).
The law is only concerned with a creator's output, not the input.
If you produce and distribute something that's an obvious copy of someone else's work, that could be a copyright violation. However, if what you produce is sufficiently original, it's fair game, regardless of who you were inspired or trained by. The law does not require compensating or crediting those who inspired you or helped with training. Again, the law is only concerned with output, not with training, not with inspiration, not with input.
What you are advocating for is a huge change. Would this change apply to humans who are learning/training by looking at and copying others' work? If so, that would be a nightmare. If not, why not? Why only apply it to AI? Because the AI is more efficient at it than humans?
If that's your reason, could you justify it? New technology is always more efficient than the status quo. That's the entire point of new technology. Making laws to stop it accomplishes nothing except limiting technology in order to protect old jobs and business models. Why should we do that?
Yes, of course I can justify applying a different standard to AI than to humans when it comes to the generation of art.
Despite everything you said about humans copying the work of other humans, the fact is that humans are capable of producing new art without reference to prior art. You can prove this by considering the emergence of the very first cave paintings.
On the other hand, GenAI like Stable Diffusion is capable of outputting art only if the following two conditions are met:
1. there is a prompt (and perhaps special configuration) made by the artist using the GenAI
2. there is training data (i.e. art made by prior artists) that makes the AI capable of translating the prompt into an image.
Of course it is possible to create programs that output something that may be considered art without reference to the work of other people. However, without (2) GenAI specifically cannot exist, and so the people that make (2) possible are right to demand credit if their work is used in this context.
Note that this says nothing about technological progress. If you train GenAI on material you have a right to use (e.g. art that is in the public domain, or copyrighted art that you have a license to use), then it's all good. The problem arises when you use a copy of someone's work in a process without their permission, when the law clearly says that the creator of that work has the legal right to decide how copies of that work are used (with the exception of "fair use").
Your point still doesn't make sense to me. Real humans today are creating art based on other artists' work. Their art would have absolutely zero chance of being created if not for the training, copying, and inspiration they're drawing from other works. Cavemen were not creating the art that we see people creating today, because people today rely on learning from and copying others' examples.
None of the targeted artists are being compensated or credited. Nor are you demanding they be.
Why not?
If I train myself by copying your work, by your logic, shouldn't you be compensated or credited? The fact that my neurons could've theoretically created some other type of art from scratch even if I hadn't trained on your work, doesn't change the situation that I did in fact train on your work.
When you train a GenAI on an image, legally speaking you're training it on a COPY of that image.
If you intend to train a GenAI on a copy of an image I own the rights to (perhaps because I created it), then you have to negotiate with me. If you don't, then you are violating my copyright.
That's not how copyright works. Copyright law limits creating copies of a work, not interacting with copies of a work lol. Which means it doesn't apply to training.
What you're describing is some totally new law that doesn't exist anywhere on earth, because it would make no sense. Imagine if an aspiring artist who's training by studying a copy of another artist's drawing, or by listening to a copy of another artist's music, gets charged with violating that artist's copyright because they trained on a copy without permission.
Again, think of the implications if the laws you're dreaming up actually applied to people today.
When you are downloading someone's copyrighted music to train your GenAI with, the act of downloading it creates a copy of the work that now exists on your computer.
The essence of copyright law is to give the owner the right to decide how copies of their work are used, which you have violated if you did the above without their consent. Incidentally, this is why pirating music is a copyright violation.
Your argument is based on falsely equating the process of a human getting inspired by prior art to training GenAI on the same. If a human goes to an art gallery and gets inspired by some copyrighted work, he did not have to create a copy of that work for the inspiration to take place.
On the other hand, with GenAI training, the act of creating a copy is unavoidable (e.g. by taking a picture of a painting, which is uploaded to your computer, which is piped into a training algorithm).
The human's mental representation of the painting does not count as a copy, but painting_photo.png does and the owner's rights fully extend to it.
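As a minimal sketch of that claim (the filename and loader here are hypothetical, not any particular trainer's code): whatever the model does afterwards, the training step only ever consumes bytes that already exist as a stored copy.

    # Hypothetical sketch: training consumes a stored copy of the image.
    from PIL import Image
    import numpy as np

    def load_training_example(path: str) -> np.ndarray:
        # painting_photo.png is already a fixed copy on local storage
        img = Image.open(path).convert("RGB")
        # normalize pixels to [0, 1] before feeding a training step
        return np.asarray(img, dtype=np.float32) / 255.0

    x = load_training_example("painting_photo.png")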
Copyright law only gives owners limited power to decide how copies are used. Not unlimited power. They have the sole right to make, distribute, perform, or display copies of their work… but that's it. They can't forbid others from learning or training by viewing copies of their work. And there are also strong exceptions, notably Fair Use, which outlines situations that allow copyrighted works to be used without anyone's permission.
I'm not sure if you're a web programmer or not, but all images, audio, and video on the web work by copying. When you navigate to a website that displays images/audio/video, this works because your browser is automatically downloading those files from the server to your computer, i.e. creating copies.
Importantly, this process is identical, whether it's a human browsing the web or GenAI browsing the web. It is 100% unavoidable for both humans and AI to download copies of media that their browsers come across on the internet. 100% of us end up with painting_photo.png on our computers. That's just how browsers work. And that download counts as legal under Fair Use in every single country.
Furthermore, once that download has occurred, we all have the legal right to observe these files, to learn from them, to peek into them however we want. There is no law that makes this illegal for humans, nor for GenAI. It is theoretically possible to make a GenAI that illegally makes additional copies of files that the browser has downloaded, but that's entirely unnecessary, as it could simply train on the already-downloaded copy.
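To make the mechanics concrete (the URL below is hypothetical): the scripted fetch issues the same HTTP GET a browser issues when a person views the page, and either way an identical painting_photo.png ends up on local storage, available to be read.

    # Hypothetical sketch: the same HTTP GET a browser performs,
    # leaving the same local copy either way.
    import requests

    resp = requests.get("https://example.com/painting_photo.png")
    resp.raise_for_status()
    with open("painting_photo.png", "wb") as f:
        f.write(resp.content)  # the downloaded copy now exists on disk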
You have two options:
1. Re-interpret current copyright law in a way that would make it illegal to look at files automatically downloaded when browsing the web. This would make GenAI training methods illegal, but would also make humans browsing the web illegal, since they are identical.
2. Add a new law to the books that would specifically target GenAI and make it illegal for it to do something that is not currently illegal or in violation of any laws. In this case, I would ask you: why on earth would you want to go out of your way to create this unnecessary law that the world is just fine without?
I can't think of any reason, except wanting to protect the business model of current artists. Which is lame, imo. We should not be outlawing useful new technologies to protect the profits of people in any industry, including artists.
Yes, I am a web programmer, and I understand that anytime you are using the internet you are downloading a copy of whatever content it is that you are accessing. You don't even have to invoke fair use here, as the copyright holder gives you the right to access a copy by putting it on the open internet.
That does not mean that you automatically have a right to do whatever you please with this copy. If you want to redistribute the content, or use it as part of some commercial process, then you absolutely have to discuss this with the copyright holder, and they have every right to deny you if you cannot agree on the terms. That is the meaning of "all rights reserved".
Not all copies are equal. Feel free to enjoy a copy of "Oops!... I Did It Again" that you paid for on iTunes. But download the same file using P2P, and the record company might just have the book thrown at you. The two files are identical copies, but the difference is that the copyright holder gave you the right to listen to the first (in exchange for money) but not the second. Furthermore, downloading the song from iTunes gives you the right to listen to it, but not to play it on loudspeakers at the next conference that you are organizing. To obtain the right to use it for that purpose, you have to negotiate with the record label.
Note once again that these restrictions do not apply for works that are e.g. in the public domain, which you are free to use in any way you want. I am in no way arguing for "outlawing useful new technologies", anymore than I am for outlawing cameras because they can be used to create bootleg copies of the latest blockbuster movie. Feel free to train GenAI on copies of works that you have a right to use for that purpose: either because they are in the public domain, or because the copyright owner explicitly gave everyone the right to use a copy of the work to train a GenAI, or because you negotiated a licensing agreement with the copyright owner.
I agree that copyright owners have rights. But what you're missing in your response is the fact that these rights are limited. And their limits are very specifically outlined by the law.
When it comes to their works, copyright owners have the right to block reproduction (e.g. making a copy), adaptation (e.g. turning a book into a movie version), distribution (e.g. selling copies), public performance, and public display. That's it. And there are even some important exceptions for Fair Use, too.
They can't block other things. There is no law granting them the power to do so.
For example, they can't stop me from talking about a copy of their work. They can't stop me from analyzing the copy. They can't stop me from learning from it. From describing it. From memorizing it. From looking at it whenever I want. From extracting lessons from it, or extracting patterns from it. Or from hiring a friend or a company to do these things for me. Or from writing a computer program that does these things. Etc.
It's perfectly legal for me to browse the web, look at copyrighted images that my browser has downloaded, study them, consider the patterns and techniques within them, and then re-use these same or similar techniques to create new art, so long as it's not so close to the original that it counts as a reproduction or adaptation. There is no power granted to copyright holders to limit my ability to do this, nor should there be.
This kind of inspired learning is the engine of societal creativity. Copyright was not created to kill this. Copyright was created to help artists profit from their exact work and its likeness, not to capture, own, and profit from inspiration that springs forth from the rest of society.
Yes, I get why it sucks for artists that the technology now exists for computers to also engage in the above process. But I still don't think we should be modifying or creating laws to stop that. Artists can adapt their business models, just like people in every other field when tech starts to compete. The world will be fine.
Should I be able to dedicate my life to some obscure thing that nobody cares to buy as a commodity?
That argument is absurd, you don't HAVE to be an artist. You can be a talented person and dedicate your life to something else.
Or you can be like everyone else and do it as a hobby in your time off, because people don't find what you do creatively to be valuable. I demand to get paid for playing video games; I think it's valuable, I dedicate my time to it, and I deserve to be paid.
I appreciate the sentiment, but AI is not a panacea for copyright law; it's a way to hoard ideas, whether they're protected or not.
And we still have copyright laws. Corporations are still hoarding IP. So rules for thee, not for me. The harder you make it for artists to get paid, the more people you get promoting Raid Shadow Legends
>Why would any human spend their limited lifespan to create a piece of work that will be grabbed without permission, approximated algorithmically (at least on the surface) and reused in infinite possible small variations without any attribution or remuneration whatsoever?
Because they enjoy it? Or do you see artists as some type of corporate drone who hates the very act of making art?
That's like asking why anyone would contribute to MIT or Apache licensed open source.
Excuse me? What moral and economic planet are you living on? Enjoyment is an important drive behind any creative person, so much is true. But at least part of the enjoyment comes from other people appreciating, acknowledging and, yes, remunerating that creative work.
The idea that authors, artists and other creatives will keep pumping original work as part-time love affairs so that AI bros can grab it and mint a dime is... strange.
You seem to have a very limited view of artists. The vast vast majority of artists are not professionals and make no money from their art. The vast majority of them also get very little publicity for their arts. Yet they still create.
There'd be less art, and probably lower-quality art. However, to think people would outright stop making art for any reason is strange (to use your words).
> The idea that authors, artists and other creatives will keep pumping original work as part-time love affairs so that AI bros can grab it and mint a dime is... strange.
I make a painting. I display it. My neighbor sees the painting, studies it for a while, goes home, and makes a painting based on what he learned from mine. People still enjoy my painting, still credit me with making it, and if they like it might still pay for a copy. My neighbor, having devised a way to make paintings really quickly, sells paintings for cheap. Can I send him a bill because he's making money based on something he learned from me?
Granted, a lot of this boils down to whether AI learns or copies/remixes; if it only creates what copyright law would consider derivative works, then that's another matter.
> Granted, a lot of this boils down to whether AI learns or copys/remixes; [...]
I mean, it certainly isn't learning in the same way humans learn. Human learning involves some degree of understanding, which is wholly absent in AIs. There is, to my mind, no good reason to treat them the same because of this difference.
> The idea that authors, artists and other creatives will keep pumping original work as part-time love affairs so that AI bros can grab it and mint a dime is... strange.
The entire software industry is built on open source software written for free by other software devs
The sustainability of open source projects that are not corporate-backed is, famously, a major issue.
By analogy, if creative production were paid for by a benevolent patron who doesn't care whether the work is then released into the public domain, the economic equation changes. But that is not how things work at present.
People like to create things regardless of profit motive, like art, who woulda thunk it?
I don't understand how people think this will suddenly make human art vanish; that is just ridiculous and naive. People will spend their limited lifespan making art, because that's what humans just do. Cavemen weren't being paid to paint on the walls.
Your viewpoint is honestly insane. The people against AI art are bonkers.
In your infinite sanity, you have not answered my question of how a creative person will dedicate their life (starting with long studies) to producing something that society will not reward in any way.
In their free time? Do you not have any? Should society reward people who want to dedicate themselves to something that has no inherent survival benefit? Should that not be something done in one's own time?
Why is society obligated to pay artists? It isn't.
I actually had a discussion on this topic with a peer of mine who at the time was doing art studies. She was extremely angry and devastated that there weren't enough jobs in her field for everyone who majored in it. That was a real complaint; for her, the state should step in and guarantee a well-paid job for everybody in the exact field they choose, like "I want to be a painter," and the day after, they give you a place to work on your paintings.
I mean, I would love to live in a world where UBI was a thing, and people were free to pursue their interests without fear of homelessness or starvation. Capitalism is a system that requires an underbelly of people exploited through the coercion inherent to the inequality of the world.
That is a utopia, however. The reality is that most people - yours truly included - sell their labor to make ends meet. It certainly was not my lifelong dream to do backend development for a financial institution.
Your friend (and many others in this thread) operate under a naive and entitled assumption that their creative output should be enshrined as something special, whereas throughout history technology always replaced human capacity, and we are better for it.
What? Have you ever paid for a book? A subscription to a publishing medium of any sort, a movie, or a piece of music? Have you noticed a copyright sign somewhere? Is that arrangement of recognizing and rewarding those who create something worthwhile part of the real world or not? Not even a hallucinating AI would be so incongruous.
> but automation replaced humans in all those industries.
So? To the extent that people are still involved, they are typically paid - unless it is slave labor, that is.
This is not an automation vs manual labor debate. Creators will use any technology that helps them create something worthwhile. It may even involve algorithms in various ways. Generative art was a thing way before the AI bro invasion.
The question is about provenance, attribution and remuneration of whatever human work and creativity is involved in producing unique pieces of work. Work that is used as input for the algorithmic production of infinite variations and replicas.
The AI crowd simply wants to devalue that human input (while presumably charging for APIs or whatever comes out the other side of the meat grinder).
Which I suppose might happen in an increasingly dystopic world, but they seem to also believe that people will keep embarking on literary studies, film or music studies, etc., as some sort of non-remunerative hobby, just to keep producing useful inputs for the AI models.
It's not going to happen. The golden goose will be dead before you can spell "AI".
Because most people prefer to spend their free time consuming art instead of staring at a wall. You just forgot that art embraces games, music, movies, TV shows, comics, books, and even some YouTube.
You are applying an outdated mental framework to AI, not to mention a fallacious view of the relation between copyright protection and the incentive for art. Keep in mind that millions of humans have historically produced enormous amounts of art with no copyright protection of any kind. The protections you are trying to defend are a relatively recent phenomenon (~200 years); arguably, some of our best art was created well before these protections existed!
Additionally, the beauty of contemporary AI is that it's much more similar to the mechanism of inspiration and learning that humans employ than, let's say, the literal copying of a photo. I think it's reasonable for an artist to limit the visibility of their work and prohibit their images from being shared online and used for AI training - but this must apply across the board. If their image is public in a way that anyone could see it, and be inspired by it, then they need to accept that the AI could be equally "inspired" by it.
If you want an easy, concrete example of the process of inspiration and copying taking place among humans, just look at anime. Styles are imitated by humans left and right, with no concern about original artists losing their livelihood. Human anime artists copy their idols when learning to draw, oftentimes producing literal imitations for years before starting to produce their own original work, which, to be honest, usually greatly resembles the source of inspiration (PS: I like anime, but most of it is very similar in terms of artistic style).
Do humans ever really "invent" any art? Or are the artistic innovators simply a mix of influence of existing art, the natural images of nature/life plus a spice of randomness? Because that's pretty much how AI art functions.
I am not trying to defend copyright. I am trying to defend the incentives and ability of humans to dedicate their lives to something that might be innate to all to some degree but only comes to fruition after long years of dedication.
Older societies did not have copyright but they, manifestly, had ways to sustain creatives.
People wax philosophical about paradigm changes and other vacuities, yet refuse to answer a simple question: how will society reward human creativity that takes a lifetime of cultivation to flourish?
They actually didn't have such a great way of sustaining creatives. The poor artist is a staple of civilization. You hear about the winners, but most artisans, painters, etc. had to fight hard for income.
AI won't make art go away, because you still need to tell the AI what to do. But the new art won't require as much skill with a paintbrush. For instance, I'm terrible at drawing, but I believe I have creative ideas. AI allows me to be an artist too.
The problem is these discussions are being had by STEM/tech people who don't respect or value art or the effort behind it, not by artists. They simply do not get the concerns that artists have.
It truly boggles the mind that people equate machines that can output thousands and thousands of images in short time spans in any ingested style... with humans who have to hone styles and can only produce a result every so often.
This is how technology works. Bulldozers effectively replaced people with shovels. Excel effectively replaced accounting clerks. Generative AI effectively replaces artists (in some capacity).
Most people care only about the output of a system, not about who the system replaces.
We are able to create bulldozers without using any shovels. We can create Excel without accounting clerks. But we cannot create generative models without using existing artwork. Because existing artwork is so critical to the inventions replacing artists, it’s more exploitative.
If we required workers' consent to directly use their works, we would be able to build bulldozers and Excel, but not Stable Diffusion. It’s very different that way.
I fail to see how so, outside of some ulterior desire to see "art" as something inherently superior. Manual labor was used to build bulldozers that replaced manual labor, to give an example.
Using existing artwork to train the model is just the process by which generative AI came to be.
If you’re talking about the labor needed to build a bulldozer, it is paid consenting labor. That’s the point. It isn’t that art is superior, it’s that the required labor to create these things is going uncompensated and without consent.
If you had a conversation with a STEM person, they'd probably say everything is 100% fine, society and all. If you had a conversation with an artist, they'd probably say AI is pure evil theft and society is collapsing.
If you solely listen to either side, you'll be blinded by madness. On that note, how many people respect or value the effort behind software? Most don't, and most artists don't either. That is the nature of life.
> Why would any human spend their limited lifespan to create a piece of work that will be grabbed without permission, approximated algorithmically (at least on the surface) and reused in infinite possible small variations without any attribution or remuneration whatsoever?
Because many people make art for art's sake.
Besides, popular artworks have always essentially had this, yet people still made them. The difference is in scale, not kind.
Or to put it in another context - why would anyone work on an open source project when their work can be reused without (explicit) permission, cloned and reused in infinite small variations without any remuneration and essentially no credit? (When was the last time you actually looked at the CREDITS file in an open source project? Have you ever?)
Why does it matter if some artist uses ChatGPT to knock-off your style indirectly rather than directly?
I mean, sure, ChatGPT is better at it than most artists. Is that the problem? The quality of knock-offs is too good now?
Take any new song - and any music head can list several songs it is just like. Take any new movie - and most screenwriters could go on for hours how it's almost exactly 10 different movies. Etc.
Agreed, working with generative AI tools at work and independently over the past year has only made me want to make weirder, more personal paintings/drawings, with less interest in mass appeal or professional artistic viability. It's felt unexpectedly liberating and inspiring.
In your universe, apparently, "creative people" don't need to spend a lifetime of study honing their art, don't need food and shelter every single day, etc.
It's amazing how callous tech people have become as they salivate for their unicorns or whatever they are pursuing.
It's a delightful irony that the comment directly above yours is from an artist that has used generative art for those commercial purposes you think are so essential, and felt liberated and inspired in their personal time to be even more creative in ways that are not commercially viable.
I don't see how this could be any different than, say, a person reading a book, then using what they learned from that book to write another one. Sure, if they're literally copying material from one book and adding it as their own work into their book, that could be against copyright.
This decision isn't "a reversion to medieval times", whatever your opinion on the legal status of these images. This is an entirely procedural decision in which the judge ruled that the specific claims of copyright infringement are invalid because the plaintiffs never filed for copyright:
> Orrick spends the rest of his ruling explaining why he found the artists’ complaint defective, which includes various issues, but the big one being that two of the artists — McKernan and Ortiz, did not actually file copyrights on their art with the U.S. Copyright Office.
In other words, the story here is mostly that the lawyers screwed up badly in pursuing a copyright lawsuit before ensuring copyright had been filed.
It's a parlor trick, but I'm going to use it anyway: your own comment refutes your point.
You put work into posting this comment: thought about the situation and crafted sentences you wanted to publish. I've absorbed them, learned from them, they'll inform my own output in the future. And respectfully, I won't remember your name or give credit.
So why did you publish your comment? People can't avoid creating data. We do it passively. And you'll continue doing it, for your entire limited lifespan, even if you get neither laid nor paid for it.
It's just not that different from people seeing works and learning or being inspired, so how do you "ban AI" without adding more crazy DRM/DMCA stuff for legitimate use?
Um, it's extremely comparable. You basically just described the human process, but used bigger numbers lol.
Go on any art site, whether it's drawings or writing, and a massive amount of the human learning and inspiration you see is just copying from others, too. People drawing their favorite characters, writing fan fiction that copies from their favorite books, etc.
This is so extremely comparable to AI it's weird to me that you don't see it.
Look, I consider myself an artist, and am very creative. But the vast majority of art and artists are following after others' previous work. It's really not that unique, and frankly that's why AI can generate some things so well - because everybody's doing it!
Artists who focus on truly new art aren't "in some AI", because their work is unique and not easily reduced to model weights.
The economics of all this are challenging for graphic artists and illustrators for sure!
I agree that the scale is different, but even the way you’ve phrased it sounds pretty comparable to how humans learn and produce art. Besides scale, how is it different?
Plenty of people create with minimal profit motive. Outstanding creation happens with minimal profit all the time.
To profit from creation you have to publish - what alternative do you propose?
So I don't understand this idea that anything will "kill publishing". Copyright changes the economic math around publishing, sure - and currently, most of the time, not for the better. That will keep evolving, but there is no risk of killing creation or publishing.
Oh I can explain it: what you’re describing is already happening, it’s been happening since far far before medieval times, it’s literally how human creativity works: novel recombinations of existing works.
If anything, it will lower the barrier for creation, to allow a greater variety of art by a greater variety of artists. Adding a new tool to an artist’s kit is an incredible step forward for art.
Not at this rate, speed, and reliability. Even setting aside the morality and legality of the matter, let's please stop with the fiction that what these computerised systems do is the same as what other humans do. Scale matters. If someone said "I'm worried about the consequences of machine guns being sold at convenience stores", it is not a sensible response to say "humans have always been able to kill other humans with knives and handguns".
People have been putting up with some theft because they could still eke out a living.
This attitude has all the coherency of "some people are thieves, we cannot catch them all, so let's make theft legal".
Unless I hear some sensible argument why this slippery road won't destroy a good fraction of the economy I am assuming that regression to kleptocracy is the shape of things to come.
> Orrick also dismissed McKernan and Ortiz's copyright infringement claims entirely.
Well, duh. The judge is helping out the plaintiffs in this case. A jury would have been easily convinced by the defense that no images produced by Stability's systems are visually derivative.
The key is indeed what follows:
> The judge allowed Andersen to continue pursuing her key claim that Stability's alleged use of her work to train Stable Diffusion infringed her copyrights.
So unless there is some kind of summary judgement I would wager that this becomes the focus of both sides as this heads towards trial.
But that's it. As predicted by commentary from legal scholars, the outputs of Stable Diffusion are distinct from the model and are not infringing on copyright... at least for this complaint!
No, he dismissed McKernan and Ortiz because they didn't register their images with the U.S. Copyright Office, which is a foundational prerequisite for any copyright lawsuit (in the U.S.).
EDIT: reading the linked PDF further, it appears that McK and O's legal counsel stated that the two weren't asserting the copyright claims at all, which is why they were dismissed with prejudice. That means that they can't re-join the case by filing for copyrights for their images... Their lawyer fucked up pretty badly, and if I were either of them I'd be filing a malpractice lawsuit.
I'll check PACER and read the actual ruling when I'm at work tomorrow, but yeah I'm interpreting "dismissed entirely" as "dismissed with prejudice".
You're entirely correct that if it was dismissed without prejudice the complaints on copyright infringement on the outputs could be amended and refiled.
> In opposition, plaintiffs do not address, much less contest, McKernan or Ortiz’s asserted inability to pursue Copyright Act claims. At oral argument, plaintiffs’ counsel clarified that they are not asserting copyright claims on behalf of these two plaintiffs. July 19, 2023 Transcript (Tr.), pg. 17:1-5. As such, McKernan and Ortiz’s copyright act claims are DISMISSED WITH PREJUDICE.
Another interpretation is that the plaintiffs were well aware of how weak their case was with regards to the outputs and basically planned on abandoning it from the start.
There's been more than a bit of showmanship from the plaintiff's counsel so I'm not surprised that the actual legal tactics differ from the rhetoric of the blog posts. It's also common to stack the complaint so that when the judge does start focusing on the key issues that maybe a little more ends up at trial than otherwise.
There's winning in the court of public opinion and then there's winning in a Federal court.
This is consistent with historically intellectual property being a construct that benefits owners of capital and not actual innovators. That's why I think it should be abolished, this is yet another mechanism to monopolize a space to profit through some kind of rent-seeking procedure.
What is an intellectual right? I respect authorship, with the obvious consideration that no intellectual activity happens in a vacuum; as Isaac Newton said: "if I have seen further, it is by standing on the shoulders of giants." I believe that I should never be financially hurt or sent to prison because I used another person's thoughts.
> I believe that I should never be able to get financially hurt or go to prison because I used other person's thoughts
"Use another person's thoughts" is obfuscating the reality of the situation so far as to be disingenuous. The way society ensures new works are created is to guarantee a temporary monopoly over certain narrow types of ideas to their creators. Why would anybody be an author if everyone could download free copies of any new book that came out?
I think patents are granted too liberally, and that copyright lasts at least twice as long as it should, but to argue that intellectual property can't even in theory be beneficial to protect is silly.
Do you suggest that you need intellectual property for new ideas to be created and propagated? THAT'S silly! For most of the history of this species there were no intellectual property rights and people still were sharing their thoughts and ideas for various reasons.
For most of the history of this species there also weren't books, people making a living writing books, computers, or importantly, the ability to endlessly copy books using computers. If we want people to write books, those people need to be able to feed themselves, and that's harder if nobody has any reason to pay for those books.
We can try some kind of libertarian socialist system, either with decentralized planning or with anti-capitalist markets, to deal with it.
Consider the existence of services like Patreon, where people support creators they respect before they are even able to get to know the author's work - no intellectual property rights are needed here.
Realistically, you're talking about a fantasy world.
Less harshly, if we want to have that kind of world, we're not going to get there by abandoning creatives and removing intellectual property protections as the first step. Within the current society, intellectual property law does more good than harm.
But besides that, I don't think a universal patreon-like support system is even ideal. I don't want to provide ongoing support for any creation/creator I appreciate. I like being able to just buy a book.
This will be the greatest act of Intellectual Property theft in history.
All because judges will be befuddled about what to do after hearing terms like “training data” and “compression”. We will, of course, get the emails in 10-20 years showing that it’s all lies and that the CEOs of these companies knew exactly what they were doing.
If this continues, AI will be the greatest inequality machine in history. Take data from 1,000,000 individuals, train your AI to replace them, compensate no one.
You can do this in every area: driving, farming, cooking. Music. Just dispossess everyone of all their property by training an AI to copy all their work! What could be easier (and less morally right)…
Intellectual property never really existed. Copyright is something we made up to extend the logic of commodities to the full value chain for books, which made sense 200 years ago. But it makes no sense to apply the logic of commodities to digitally produced and distributed media. The production of culture has been slowly becoming more distorted as cultural assets that should be, and historically were, held in common accumulate under the umbrella of massive intellectual property holders (Disney, Universal, etc.), after we took a legal concept that was meant to apply to a much narrower context and applied it broadly. They benefit disproportionately more from intellectual property than individual artists do. The (recent) past dominating the present, being ruled by abstractions, and all that.
Is it bad that this will be used to displace individual artists/creatives in the value chain of media production? Of course it is. But we shouldn't be responding to that by clinging harder to schemes that have outlived their usefulness, we should be developing new models for funding production.
> Copyright is something we made up to extend the logic of commodities to the full value chain for books, which made sense 200 years ago. But it makes no sense to apply the logic of commodities to digitally produced and distributed media.
Great! So given that your articles are effectively in the public domain on your website, I can make millions off them without giving you a cent or any direct credit or sourcing, and can claim them all as my own, then.
If someone ran a print shop and printed out my articles into a book called "Collected works by Trey on HN" or something and didn't give me a cent then yeah I'd be thrilled, because it's a validation of my work. I already published the articles, they're doing all the work to put them into print, what right do I really have to claim part of the sales?
But if they claimed that they wrote the content then they would be defrauding their customers, since saying they wrote them would just be lying. You don't need intellectual property for fraud (as in "lying for material gain") to exist and be bad. At the very least it would be dishonest academically speaking and they should be criticized for it.
You would throw out hundreds of years of copyright law on a whim. This is economic suicide for independent creatives. The most tragic part is that creatives themselves, immersed in creativity, see no rhyme or reason to stop the flow. Over time, those who grow old or weak are discarded with no rights to their own work. Yes, it is that bad.
Have you listened to artists and creatives talk about copyright recently? Especially ones that publish on newer, digital distribution platforms like YouTube? There's numerous more thorough critiques of copyright law and calls to abolish it from across the political spectrum.
You're mischaracterizing me by suggesting that it's "economic suicide", as I said in my original comment in this thread:
> But we shouldn't be responding to that by clinging harder to schemes that have outlived their usefulness, we should be developing new models for funding production.
Sure, it would be disruptive if we snapped our fingers and said "no more IP starting tomorrow"; there should be a gradual phasing-out of these unfair protections, and effort put into sustainable pro-creative models.
That's bad, but also mostly unrelated to the main point I'm making about copyright. If anything it's supportive, since part of the risk of account deletion/restriction is running afoul of their content ID system, which only exists because of strong-arm corporate copyright holders.
This argument kind of glosses over how you will make the millions when the author has not.
The hypothetical of lost revenue needs to be validated by the evidence of actual revenue being made in this way. Right now, I see lots of interest in paying for the _tool_, but almost none in paying human wages for the _output_ of generative models.
(I am not proposing that my above distinction is a legal test. Just pointing out that all these arguments would be more credible if actual ai generated works were being sold by AI companies)
I don't disagree at all. The OP I was responding to mentioned non-public-domain works like a blog.
My point was that we haven't yet seen the products of GenAI stuff really making money yet. People are paying for the tool, and people are paying for work that is being done using the tool, but nobody buys a book, image, movie, or similar from OpenAI or Google directly.
I don't think you've really thought this through. How would someone make millions off of that? If someone tried, the original author is still offering it for free. Why would someone pay for it?
A lot of people are very myopic about generative AI, thinking that large Hollywood studios are going to steal people's work and put everyone out of a job. Hollywood studios won't exist soon enough because anyone will be able to put together a movie that rivals existing expensive productions. In fact, generative AI is democratizing, as long as it's not gatekept by a few large corporations, which is exactly what trying to misapply copyright here would do.
> I don't think you've really thought this through. How would someone make millions off of that? If someone tried, the original author is still offering it for free. Why would someone pay for it?
Tens of millions of fans of a living celebrity would pay for it, and the fans do not care about the original author as long as it is the celebrity's name on it and the celebrity can claim it as their own.
They don't need to give credit or sources for whatever they're selling to generate millions.
Given a generation or so, a celebrity mostly won't have tens of millions of fans, because there will be tens of millions of celebrities. It's happening already, but they are just called 'influencers, podcasters, youtubers' at the moment, with the line getting more blurred every day.
My point still stands: a celebrity with millions or tens of millions of fans can sell them, say, a book with the same text, and claim it as their own.
Either way, they can make millions out of it without needing to give you credit, even if you are the original author.
> I can make millions out of it without given you a cent or direct credit and sources without paying you and can claim it all as my own then.
This describes how ideas work - and ideas are rarely (if ever) IP-locked by a first originator - because there usually isn't one.
Everything by everyone, everywhere is built on the output of predecessors. Progress is a shared effort made up of minuscule increments or slight reorderings - which are typically done several times before they catch on.
IP exists to hinder this process by preventing 99.99999% of potential people from advancing ideas.
Anyone who programs for a living should be making whatever preparations are possible for being replaced by an AI. If AIs are good at art they'll be better at coding.
I have sympathy for the artists, but frankly this is progress and it can't be stopped. The economics are so lop-sided in favour of silicon that the law won't be able to hold it back without crippling society at large. Artists aren't the only ones affected and they may not even be the profession most impacted.
While the conclusion doesn't follow from the premise through the mere application of logic, I would note that when I was a kid all the high performance stuff was done in assembly "because compilers can't optimise properly", and yet since sometime around when I went to university (± a few years) we've all had compilers that are in almost all cases better at this than their operators.
Also, GPT-3.5 is already a better coder than a few humans whose mistakes I've had to fix. 3.5 is nowhere near the best, yet it's already eating at the bottom rungs despite being free.
At the moment I'm in the situation where having AI tools means I'm willing to try things I've never tried before; including having actually considered hiring artists for the first time ever.
Sadly, I think most artists would bite my head off right this minute? So I guess I'm going to have to wait until the storm blows over.
That's only true if artists cling to the old ways and reject new tools. Their income can increase if they embrace the new empowering tools becoming available.
Some places have government grants funding the arts.
On the other hand, I grew up in one of those places, the UK, and there were a lot of people moaning that the TV License (which funds the BBC) was an abominable stealth tax or words to that effect.
I'm now in Germany, where everyone has to pay the equivalent even if they don't own a TV or watch live over the internet (unlike the UK where not doing that means you don't need to pay); I've not heard anyone complain so far… but I don't know if that's because they genuinely don't, or if I'm peacefully oblivious by never having been suckered into reading a German-language comments section.
If you memorize all of harry potter word for word, or some famous solo vocal track from memory, are you committing a copyright violation? Or only if you then recreate it and try to redistribute your copy?
The scenario where AI training is locked down doesn't result in 1,000,000 individuals getting paid. (What would they get paid, and by whom?) It results in Disney, Adobe, etc.—massive companies with existing licenses to use content just about however they want—training their own models and locking everyone else out of the large AI model training game, until AI gets good enough to start generating human-quality creative work on its own (the same kind of progression as alphago/lee to alphago/zero), perhaps with the addition of a small set of purely copyright-free material.
Excluding all copyrighted material would be tying an AI model's metaphorical hands behind its back, since humans, although capable of producing great works through much iterative effort in isolation, all rely on having learned from some copyrighted work. Find an author who hasn't read plenty of recent books as well as older classics, or a musician (other than classical) who hasn't listened to plenty of modern music, or a director or editor who hasn't watched tons of movies and films. Recall Newton, "[I]f I have seen further, it is by standing on the shoulders of giants." Many of those "shoulders" are copyrighted.
Where is this idea that copyrighted material should be excluded from training data coming from?
My understanding is that people want to be compensated when their intellectual property is used as training data for a machine. That strikes me as an entirely reasonable expectation.
One person memorizing Harry Potter for their own amusement, even if they make money doing public appearances where they recite sections of the work verbatim for the amusement of the audience is not in any way similar to the process of training an LLM or of that LLM's output. The scale alone is so vastly different that it renders the comparison useless and misleading.
Yes, and you know how humans acquire works to learn from?
They pay for it.
They buy the books. They buy tickets to the theatre. They buy entrance to the gallery.
The trick being pulled now is: "hey, we don't have to pay, since it's not a person" (to the creator), but "hey, it's just like a person when it learns!" (to the legal system).
If AI models require human training data, then they should pay for it. Easy.
False. Libraries exist. Borrowing books from neighborhood libraries or friends exists. Watching movies and TV with friends exists. Listening to music on the radio (yes, those free electromagnetic thingies) still exists. There are many, many, many free performances or accessible copies of all kinds of copyrighted content, plenty to train either a neural net or a human brain on.
Books3 has separate legal concerns, but Google has a legally acquired corpus of tons of books, which they've mostly cleaned up from scans (probably far better than IA has), and have probably used to train Bard on. Their lawyers must be biting their nails waiting to see how these lawsuits turn out, though.
Until AGI arrives, or some other method of training LLMs from the ground up on sparse examples by incrementally building on structural knowledge of language.... training on ridiculous amounts of copyrighted content is required. Not because anyone wants to copy those works, but because training that way fills in for a lack of real-world experience that every child gets, which includes consuming and interacting with a bunch of copyrighted content that isn't tracked because it's not practical to do so.
You could train an LLM only on Project Gutenberg, and the LLM would churn out stilted English and the occasional iambic pentameter. That's great if you want works that seem like they were written over a century ago, but nearly useless otherwise.
Libraries exist?
Do you think books fly onto library shelves for free? As far as I know, someone bought them.
Your neighbour or friend also bought the stuff. I suspect you're not being straight here, I just have to ignore this whole line of reasoning since it seems so absurd.
> Until AGI arrives, or some other method of training LLMs from the ground up on sparse examples by incrementally building on structural knowledge of language.... training on ridiculous amounts of copyrighted content is required.
That's not my problem. Those AI model folk should just compensate the people they're using training data from, and they should ask for permission.
> You could train an LLM only on Project Gutenberg, and the LLM would churn out stilted English and the occasional iambic pentameter. That's great if you want works that seem like they were written over a century ago, but nearly useless otherwise.
Not my problem. Why are the problems of the wonderful AI developers suddenly human, global problems that we all have to find a way to fix?
If they want access to training data - they should pay for the privilege.
> Libraries exist? Do you think books fly onto library shelves for free? As far as I know, someone bought them. Your neighbour or friend also bought the stuff. I suspect you're not being straight here, I just have to ignore this whole line of reasoning since it seems so absurd.
Second: I, as a user of the service, who learns things, still pay nothing.
Should I be required to directly pay for the things I learned from, or is it sufficient that someone is? Because if the latter, then picking up a book from a normal (non-deposit) library, showing it to an OCR system, and having an AI learn from that, would involve just as much payment as I ever made to read a library book (with the possible exception of late return fines, I can't remember if I ever had any of those).
Is it fascinating? It seems pretty obvious to me: you've got things that are good and things that are bad, and you want to find the dividing line between them. Why pick something central to either group, which doesn't illuminate the boundary between them?
I think humans should pay for permission to learn. Heaven forbid copyright holders don't get paid for all the material they've put out that humans are using (often stealing) to learn from in order to become useful members of society!
Physical library books are governed by the doctrine of first sale. That's why Google has one of the largest (maybe excluding l-bg-n and IA) corpora of books on the internet. They might have the cleanest corpus of OCR'd book content of anyone, since IA uses commercial or open-source OCR and that's it, while Google for a long time used reCAPTCHA to check OCR results.
For physical books, the cost per read of a library book is an order of magnitude smaller than the cost per read of privately purchased books. How can you tolerate the economic model of libraries when the net effect is a theft of maybe 80%-95% from the author and publisher? Libraries subsidize books that nobody wanted to read, but steal from authors and publishers whose books are read multiple times per physical copy.
Even libraries' onerous ebook licenses are not commercial retail ebook pricing. They're just closer to retail pricing than the publishers could ever manage with physical books, because there's no pesky right of first sale which turns physical book libraries into piracy havens.
I would prefer to get away from OpenAI and Facebook and all the other people using potentially tainted sources like books3. The obvious legal question for them isn't whether training was legal, but whether the acquisition of the training data was legal. That's a straightforward copyright issue, or at least as straightforward as fair use determinations can ever be. Whether or not we agree with copyright law as it stands, it's certain that copyright applies when books3 is transferred around the internet. The only remaining questions are whether the transformativeness of the use, the effect of transferring books3 on the market, and the other two factors make those actions fair use.
The training aspect is where all the difference of opinion lies:
What is your position on Google using its corpus of books (legally acquired and possessed, as the content behind google books) to train a LLM? Do they need to acquire additional rights from copyright holders? Why, and under what legal theory?
How would they get permission ahead of time? How would they agree to a pricing model? Would they spend tens or hundreds of millions of dollars training a model, and only then negotiate with rights holders to find out whether the license fees they want will be economically viable? We all know that most major rights holders would never grant a one-time license fee. It would be perpetual rent-seeking from AI output. I don't see how any of these LLM or image generation models would be economical if rights holders had their way. The rights holders wouldn't mind. They're notoriously slow to adopt tech, but if they did anything, they'd hire AI experts, build their own models, and license the models back to Google and Microsoft.
I had a free school education (including Shakespeare and Ethan Frome, both of which are out of copyright now though only the former when I studied it); several free libraries; and with the exception of my final year even my university tuition was free[0]; after graduation the museums I went to were also free; I watched free educational videos from Apple Developer and YouTube, and listened to free podcasts; I have learned things from reading Wikipedia; and I have done free online courses in both natural languages and programming languages.
This doesn't cover everything: I did, indeed, also buy books on HTML and JS, and my first C compiler, and a licence to REALbasic[1]. But that doesn't refute the fact that I did learn a lot for free.
> If AI models require human training data, then they should pay for it. Easy.
You can do that if you like, but that won't stop any of the economic issues that arise. The cost of running Stable Diffusion is so low that even if you had literal slaves, and you were spending only the UN extreme poverty threshold on keeping them alive and housed, the pro-rata cost of keeping them alive for long enough to type in the prompt dominates the total cost of making images.
Right now these models are still, despite their impressiveness, flawed: while an artist can use them to great effect, most of us will have our generations easily spotted by some flaw we have never trained ourselves to notice. If the models become good enough to fully replace all artists, the only way the profession called "artist" isn't going to go the same way as the profession called "computer" is if the arts are to humans as fancy tails are to peacocks: the effort being the point, extravagantly wasting effort to show you're fit enough to manage fine despite the penalty.
[0] UK rules at the time, thanks to my dad's early retirement and therefore "low income" status
If we let this idea that "AI training data usage requires no compensation for rights owners" become ensconced in the legal system, then all human endeavour will become fair game to be acquired by someone to make a Machine Intelligence out of, removing you completely from the profit loop of your own work.
This will happen in every industry and occupation, one by one.
Is this what you think is desirable?
The alternative is perversely simple: PAY for the right to use training data.
> This will happen in every industry and occupation, one by one.
And still will even if they (for any value of "they") do pay.
Unless… do you want them to pay the entire future economic value that, say, all programmers including myself might have added if we weren't about to be made redundant by the next coding LLM?
Because the historical analogy there is getting Raspberry Pi to pay out the entire global GDP for each Pi Zero, on the grounds that the chip can do arithmetic as fast as the entire world could, even if everyone in the world had been paid to work in the obsolete job role of "computer" after being trained to operate reliably at the speed of the current world record holder.
The main Stable Diffusion models themselves are licensed under Creative ML OpenRAIL-M, and freely available third-party models exist; OpenAI gets stick for not making their models downloadable (at least GPT-x does, though nobody seems to care too much about them keeping the DALL•E 2/3 models private), other players are making and distributing other LLMs of varying quality.
That said, we do also have actual reasons to be concerned about this remaining anarchic (though a lot of people very loudly scoff at them), so who knows what the future will bring on that axis.
Learning is not theft and never has been. I don't care whether it's a human or machine doing it. AI will benefit everyone enormously, even if it won't be equally distributed. The real issue here is that some skills are increasingly becoming obsolete and people have a hard time coping with that. Instead of demanding compensation, which would really be impractical to implement anyways, why not focus on developing new skills?
No that is not what people are upset about. They are upset that their life's work is being used without even asking permission, for someone else to get insanely rich.
That's what they're upset about.
If there were no use for 2D artists, then Stability Ai wouldn't be making an AI to replace them.
Key word here is: replace. 2D artists are not becoming obsolete - they're being replaced by a machine that was trained on their works without permission.
If you want to make an AI that does amazing paintings and doesn't use human training data, more power to you. I can't compete with that. But if you use MY WORK to make a machine that's going to replace me, and you do it under the cover of darkness and without permission - yeah, I'll get pretty mad.
What happened to visual artists was more like Logitech announcing Logitech CoPilot and revealing they've extracted code from keylogging for the past 20 years.
Your job will only be replaced if you're unwilling to adapt. It's like saying that C made assembly programmers obsolete. There's a lot fewer people programming in assembly, but the correct view is that there's a lot more programmers now than there were in the days that you had to know assembly.
"Under the cover of darkness and without permission" implies quite a bit, but you're coming at it from the wrong angle, which this court ruling affirms. Try thinking of it as a new tool that will act as a force multiplier for your work.
This is a completely misguided interpretation of events: once my work is automated at the scale of generative AI, there won't be any work left for humans.
This "but progress" argument is tiring. The Industrial Revolution was a complete failure if you measure quality of life in its first decades. It got turned around by people who you know, fought back, created unions, and demanded fair pay.
But progress is part of the foundational purpose of copyright:
> To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.
Good. That's the goal, and we can stop inventing stupid make-work for people when that happens. Jobs aren't the goal, making robots do all of the work is the goal. You're falling into the myopic trap of assuming that the current way of things won't change along with generative AI becoming commonplace.
My fear is that in between the world we have now and the world where robots do everything, we have a giant valley of madness where robots take all the jobs, all the profits, and a lot of people are left with nothing.
That's whose goal? It certainly isn't the goal of billionaires and the people that run the country. By and large, their goal is to be rich and powerful: to have power over other people. You don't get that by eliminating the need to work.
It would take an incredible amount of altruism to share your goal, and you don't become a billionaire by being altruistic.
AI is just leveraging value in the data that wasn't used before. In the same way we can automate jobs by watching how the human workers do the job, we can automate writing text or producing a picture. As we drive to work, we do so to get to work, but cameras on other vehicles watch us drive and take all the judgement and learning we put into driving to try to create automated drivers. Should all drivers be compensated when their driving style is used to train an autonomous vehicle? It's the same scenario you mention: the work of humans is used to automate away that same work (this is pretty much how we automated everything; AI is just more explicit about it).

It seems to me that these lawsuits are really about trying to stop the automation of certain creative jobs. And perhaps this is where we disagree, since stopping the automation of jobs doesn't seem like a noble end to me. Indeed, if we'd had this attitude about automating other jobs (like farming and the manufacture of goods), creative jobs wouldn't exist, since everyone would be spending all their time producing food and tools.
I'm not sure what your comment about "Logitech CoPilot" is getting at. GitHub Copilot exists, was trained on code from the past 20 years, and by and large developers enjoy the additional automation and are looking for ways to leverage it to be more productive. I would think artists and writers would adopt a similar approach. Experience tells me that fighting automation is a waste of time. Best to stay ahead of it.
It sounds like you want to expand intellectual property rights to include the details of any action people take that could be observed by a machine so it can learn how to do the same thing. That's an untenable strategy: the ROI of the existing IP regime is already doubtful, and expanding it won't improve it, it'll simply expand all those problems to new areas.
> No that is not what people are upset about. They are upset that their life's work is being used without even asking permission, for someone else to get insanely rich.
That's what they're upset about.
This is kind of petulant then, respectfully. They got paid to produce that work, and they sold their services for that paycheck and the knowledge that they were putting their work into the world. The fruits of their labor were already being used to make Disney, etc. massively wealthy.
But this is even another step of indirection from that totally fine and reasonable situation: people (or machines) are learning from the work and producing their own. They might as well be angry at interns who learned their style to get junior jobs at Disney and who are now up and coming, replacing them.
Where did you get this idea that Disney or someone paid for all these works to be produced?
A lot of the works were just ripped off ArtStation and similar websites. It's likely a very large number was never "paid for".
A lot of that work was never paid for to begin with. It was just spec work by artists who (a) might be too young, (b) might be between jobs, or (c) are just doing passion work.
Again, a lot of people on HN don't really seem to know the story very well, or how artists are compensated.
I just assumed Disney paid their employees, though maybe they’re volunteers or something. I read Ed Catmull’s Creativity, Inc. a while back and my takeaway (in addition to seeing a glimpse into how the management worked) was that all workers at Pixar were compensated for their work. Not totally orthogonal because Disney did end up acquiring Pixar. But I have a hard time believing a movie like Frozen was created without paying for it to be created.
You're absolutely correct that Disney pays its employees.
However, most of the images that are being used for training data don't come from Disney. Most of the imagery used for Stable Diffusion and Midjourney was from portfolio websites and other sources, afaik. The ArtStation website, which is a popular portfolio site, was one of the main targets, alongside many others, including DeviantArt and Flickr. You can check using this website: https://haveibeentrained.com
A lot of people placed their passion work, personal work, and unpaid work on ArtStation, DeviantArt, and Flickr. Some of those people are brilliant. Stability AI used their work as training data, without announcing it or asking for permission. The result is that many of the more prolific and famous artists can see a lot of generated work that seems to directly reference their own, copying their style.
A lot of the work on these websites is quite simply not commercial work. The only commercial application in fact, so far, is these image engines.
Let's say I have a band. We make a record, and we sell some records. We get paid for that.
Now someone wants to use one of the songs in an ad. We "got paid" to produce the work, but the people making the ad still have to pay us for that use of the song, because it's a different use than the one we got paid for.
People who do the work should be compensated when their work is used to generate income for someone else. This is the way it's worked for as long as I can remember: I get paid to do some work, the company then gets my output.
What we have here is someone hoovering up copyrighted and protected content, from all over the internet, and using it to generate income. They then are turning around and claiming that they do not need to pay for these protected works because... Well, it's always something hand-wavy like "machine learn like people" but it boils down to "I want to keep the money."
It's convenient to refer to the training of the machine as learning but let's not lose sight of the fact that "machine learning" is not at all the same thing as people "learning". Pretending they mean the same thing in this context, in my opinion, is dishonest.
I also take issue with the assumption that AI will "benefit everyone enormously, even if it won't be equally distributed"; I don't see any factual basis for this assumption. On the contrary, it seems much more likely that AI will be used to concentrate wealth even further. Given the high cost, I find it hard to believe it will ever be "equally distributed".
For as long as I can remember big corporations have been merciless in their preservation of "intellectual property". I didn't love it then and I don't love it now. OTOH, the idea that Microsoft can train their LLM on code I've written and then sell access to that LLM for money (sharing no dollars with me) strikes me as outright theft.
Just because it's on twitter doesn't mean it's true. I think a court setting where things are contemplated in a rigorous and reasoned way has a somewhat better chance of arriving at something resembling the truth.
We've been here before several times: Silhouette painting, Photography, Airbrushing, Pianos, Synthesizers, Sampling, Photoshop, Ray Tracing, and many more. "It's not real art", "they're stealing from us", "we'll go hungry!" .
Some of these are already quite old. For others, I've actually been asked the question back when I was in school: "Are you really making music if your instrument has a microprocessor in it?". Um, yes, yes I claim I am making music thank you very much.
First people complain, then they adapt, and then they end up making awesome art with the new tools and/or instruments. Which isn't to say historically it was all rainbows and roses, but it was never the end of the world either. Seeing the newer generation of AI tools and how the tools end up getting integrated into regular workflows, it seems to be going the same direction.
To quote the song, I think it's "all just little bits of history repeating".
I guess enjoy rejecting modern life. We're going to see an explosion of new works from people using these new tools in creative ways. I'm sure you'll say that you don't want anything to do with AI-generated art, but that will soon mean avoiding just about every new work.
It's the usage of unlicensed images and text in their training data that I have an issue with. If you wrote a book or made a painting, GPT-4 should pay for the rights to read or look at it.
What you're wishing for won't achieve what you want. You'd then be complaining because OpenAI bought one copy of every book, and authors can't survive on one sale alone.
It doesn't really matter though. The models that exist today aren't going away, and future work will just be built on top of their output, tainting every model forevermore from your perspective. I suggest you adapt to the new way of things and enjoy the ride.
Rights holders would be in the wrong to be dissatisfied. This entire issue has been done to death in the courts over scanned library books and dozens of other media used after first sale.
Copyright is archaic and holding humanity, and especially creatives, back.
> If you put your "work" out in the world, anyone who views it, is automatically training their brains on it. Viewing is training.
A perfectly reasonable view for humans, since you shouldn't be able to copyright a brain.
Not at all a reasonable view for a computer, until we also get to freely use all the copyrighted works ourselves. The problem here is that AI training is asymmetric: the people training an AI use works in violation of their licenses, but don't let their own works be used in the same way. For instance, Microsoft uses code on GitHub to train Copilot, but you still don't get to freely use the source code of Windows or GitHub.
I am absolutely in favor of eliminating copyright and patent law. I am not in favor of keeping it around while letting AI become a laundering mechanism to get around it. AI training should not get to uniquely ignore copyright; copyright should cease to exist.
It's funny to me that we haven't reached AGI or anywhere near it, but when we talk about training Diffusion models, suddenly they're "just like a person" for legal reasons.
Same thing that happened on the construction of the "corporation as a person", built on top of rulings made to protect African Americans.
That's like saying "it's funny to me that this baby can't drive a car, but we still call it a 'person'".
We've achieved AGI, it's just yet a baby. Only a few years ago, if you wanted a model that could complete a task, you'd have to train a model specific to that task. GPT-4 can do a large variety of tasks without being specifically trained on them, i.e. it's suitable for a wide range of general tasks. Yeah, it sucks at some of them, but focusing on that is short-sighted. Look at the progress made from just a few years ago and where it's going, especially with multimodality and increased compute power.
My point is this is still a machine. Running analogies on top of metaphors, you can justify everything.
This thing is a machine intelligence. My point is that making a machine intelligence equivalent to humans under every law is curiously never proposed, except when it is advantageous to the owners of said machine.
Or are we saying that we can't turn off GPT-4 at this point, because following your metaphor, we would be committing murder or abortion?
Again, you can't reason using metaphors and inferences and analogies. It's not helpful.
It's pretty obvious that we should be considering whether or not it's ethical to turn off AI. Even if most people think it's OK right now, that will likely change as these models get more capable, much like the differing opinions on abortion.
> except when it is advantageous to the owners of said machine.
That's just not true. You're ignoring vast amounts of discussion on the topic, particularly the discussions had when Blake Lemoine claimed that LaMDA was sentient. There's no reason to dismiss all that just because you're dead set on assuming everyone involved in AI is up to no good.
No two people are exactly the same, but we group them together in various ways. The question to ask is "for a particular purpose, should they be treated with the same reasoning?".
For the purpose of creating new art from experiencing previously created art, why should we treat them differently?
The purpose of copyright law is to reward authors with property rights for their work, protecting them against theft and giving them an incentive by making it possible to earn a living from that work. That does not happen if someone can take the work, feed it as extra data into a statistics machine, and earn money from it without their permission.
Whether AI training should be fair use has nothing to do with how it works (search engines fall under fair use regardless of whether they think or not) but with the effects it has on the original author.
So when people talk about humans and AIs "doing the same thing": AIs are not legal entities like humans are to begin with, so the entire premise is moot.
Don't be silly. This was a level-headed ruling that avoids retarding the progress of science and the useful arts.
I'd really like to see people drop the inequality argument. If you actually cared about that instead of virtue signaling, you'd push for a mandatory GPL-style license that forces models to be available to anybody that uses them. That would avoid trying to unsuccessfully put the genie back in the bottle, while also preventing a few companies from benefiting at the expense of everyone else. Just like OpenAI's original mission of making AI available for everyone, before they got dollar signs in their eyes.
Is this supposed to be sarcastic? Because this is impossible, and you're arguing a strawman. You could also say electricity enabled this great inequality, so we have got to stop electricity.
Google “expert witness”! Courts are also known to hire their own experts who mediated between the experts on either side.
Also, those emails seem very likely to be ordered to be produced during discovery.
This thing could really go either way at this point but I feel like Stability has the upper hand.
Imagine training a model without any of the plaintiffs' images, then using that side by side with the model that does include them. This could then be used to show the jury that those individual works are of no importance to the system, if the images are of the same quality.
They will probably argue that the individual expressions of each work are not copied, rather the abstract ideas of two-dimensional representations present across any and all images.
Expect lots of side-by-side pictures as Exhibits from both sides! Grandma and her fellows have to weigh in on this one!
And then the defense reminds the court that copyright is about specific works. Clearing themselves of any specific intent to copy a specific work is pretty key if they want to argue idea/expression or fair use or whatever their strategy may be.
And yes, the nearly identical copies will definitely be presented to the jury.
That's a gross oversimplification, and an argument made in bad faith that has now spread like a meme.
The gist was that if you overtrain a model, then try to recreate an exact image by prompting very similar things and running it several thousand times, you can recreate an image. If you seriously crank up the overtraining, it's even easier. But normal use of the models does not just pump out recreations of training data.
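To make that concrete, the kind of extraction probe being described might look roughly like the sketch below. To be clear, this is a hypothetical illustration: the model ID, the prompt, the file name, and the near-duplicate threshold are all assumptions, not anyone's actual methodology.

```python
# Hypothetical extraction probe: sample one prompt thousands of times and
# flag outputs that are near-identical to a suspected training image.
import numpy as np
import torch
from PIL import Image
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# Reference image we suspect was memorized (hypothetical file name).
reference = np.asarray(
    Image.open("suspected_training_image.png").convert("RGB").resize((512, 512)),
    dtype=np.float32,
)

hits = []
for seed in range(5000):  # "run it several thousand times"
    out = pipe(
        "a caption very close to the suspected training caption",
        generator=torch.Generator().manual_seed(seed),
    ).images[0]
    sample = np.asarray(out.resize((512, 512)), dtype=np.float32)
    if np.mean((sample - reference) ** 2) < 100.0:  # crude near-duplicate test
        hits.append(seed)

print(f"{len(hits)} near-duplicates out of 5000 samples")
```

Even a probe like this only turns something up on heavily duplicated or overtrained images, which is the point being made above.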
The dismissal of Deviant was inappropriate given that the case hasn't reached discovery yet. The dismissal was granted based on a substantive evaluation of the defendant's assertions, which is inappropriate at this early procedural stage of the case (see e.g. page 10, where the judge evaluates the "plausibility" of alleged facts, and page 12, where he says "I am not convinced" about the plaintiff's theory, even though in an MTD this is not a determination he is supposed to make pre-discovery).
Moreover, even if the plaintiff's language was "unclear", the appropriate procedure is to require them to amend their claim and dismiss Deviant if they do not amend, not to dismiss a defendant and give the plaintiff leave to amend their claims.
With respect to Midjourney, the Plaintiffs failed to plead sufficient factual allegations to support their claim, so that dismissal was appropriate. (Pre-discovery, it's okay for the alleged/pleaded "facts" to be wrong, you just need to allege sufficient "facts" that you have a legal basis for a court case. Note that "facts" in the MTD context doesn't mean real world facts, it is a legal term of art that actually refers to an allegation of a fact that will later be determined to be true or false at the actual legal proceeding on the merits.)
Interesting. How do you see the Getty v stability lawsuit going? That looks much worse for stability. Do you think they will just settle and stability will pay them some licensing fee?
Getty has a much stronger case, given that warped versions of the Getty logo have shown up in a number of SD-generated images, so it's obvious that there was impermissible copying.
I'm not sure Stability will agree to a licensing fee, since part of the rationale for the last version of SD was to remove the infringing images from their training sets going forward.
> warped versions of the Getty logo have shown up in a number of SD-generated images.
If you create art that shows a Pepsi logo on a depicted vending machine, etc., Pepsi has no copyright claim on your art, does it? All it shows is that the art was made with knowledge of the logo and that the logo was included as an element inside the art.
When logos are shown in a context that may cause confusion (about who made a product, etc.), there may be trademark infringement, but trademark infringement is not being claimed here, so why would warped logos matter?
The issue at hand isn't actually directly about how the output images contained the Getty logo; the lawsuit isn't saying "you're showing our logo on your output, which isn't a Getty image, and we take issue with that". It's whether Getty images can be ingested into the training set without consent or compensation to Getty.
The reason the distorted logos matter is because they make it much more difficult to claim that Getty images were not ingested and used for training — if they weren't, then how come the outputs have those logos? And similarly, they make it much more difficult to claim that these source images were only used as "inspiration" for the generative algorithm and thus fall under fair use — if they're only used for "inspiration", how come they generate/copy easily-recognizable parts of the original images (i.e. the logo) as-is?
> they make it much more difficult to claim that Getty images were not ingested and used for training
Was that claim put forth? Why then does making this difficult matter?
> they make it much more difficult to claim that these source images were only used as "inspiration" for the generative algorithm and thus fall under fair use — if they're only used for "inspiration", how come they generate/copy easily-recognizable parts of the original images
If artists created works of art containing warped logos etc. as elements in their art, would they be infringing copyright because of those warped logos? And if an artist uses a computer to create the same art instead of real paint, does that become infringement? Does copyright depend on the method of production, not just the produced result?
I'm only guessing here, but my thought is that a Pepsi logo just indicates that the input image was an ad, while a Getty Images logo means that the input images were owned by Getty and likely used without their consent.
A watermark showing up indicates clear, undeniable ownership of those images by a massive company that seems to make most of its money licensing its images.
In the early stuff, Stable Diffusion 1 (not XL 1!) and such, if you prompted for stock-photo-style images, you regularly (~30% of the time) got something resembling the Getty stock photo watermark in the lower right.
It was quite annoying, but adding "public domain, Creative Commons" to the prompt usually got rid of it (the model knows that public domain images have no watermark :-). Since SD 2.0 I haven't seen this happen at all.
I've generated thousands of images using SD 1.5-based models and I've never got the Getty watermark. That makes me think that some dishonest lawyer used img2img.
> Two of the three artists who filed the lawsuit have dropped their infringement claims because they didn’t register their work with the copyright office before suing. The copyright claims will be limited to artist Sarah Anderson’s works, which she has registered.
I’m impressed that their legal team was incompetent enough that they didn’t bring this up as an issue before filing the lawsuit.
> Two of the three artists who filed the lawsuit have dropped their infringement claims because they didn’t register their work with the copyright office before suing. The copyright claims will be limited to artist Sarah Anderson’s works, which she has registered.
The lawsuit is moving forward, but only on the registered work. This is (not yet) a story.
I’m so confused about American copyright law. I was always under the impression that copyright is granted automatically and you didn’t need to “register” it, unlike a trademark, which must be registered and is only valid for its specific industry.
That was my belief too, but: "Copyright exists from the moment the work is created. You will have to register, however, if you wish to bring a lawsuit for infringement of a U.S. work."
It can take several months to register a copyright, so they may have started the process and took a gamble on how slow the court would move, knowing they could fallback on the works that were registered if it didn't work out.
I'm really looking forward to the EU framework around "AI". It's definitely a better approach than having individual artists sue and get dismissed on technicalities (which don't even apply in most of the EU - e.g. in France, if you release something, by default you get copyright on it, so the judge's reasoning couldn't apply there) and judges deciding based on their interpretation of vague laws crafted in an age when "AI" was little more than niche science fiction, if that.
> I'm really looking forward to the EU framework around "AI"
After GDPR and the cookie pop-ups, my expectations for things coming out of the EU are quite low. Every company I have worked at has a different and often conflicting interpretation of GDPR, some places use it to play politics, and the governments of individual EU countries are not doing their part to clarify how things should be interpreted. It's a dumpster fire IMO.
The plaintiffs apparently failed to plead sufficient factual allegations to support their infringement claim against the motion to dismiss (MTD), which is a rookie mistake.
Factual allegations at this point don't have to be correct (that's what discovery is for), but they do have to at least satisfy the legal requirements for each prong of a legal claim. In many legal pleadings, the plaintiffs will state, "upon information and belief, we [assert X factual allegation]" since they don't yet have the discovery to support a more specific factual allegation.
For that count specifically, Stability was directly involved with creating and funding the LAION dataset, whereas Midjourney and DeviantArt were not.
The DeviantArt direct claim is because of how DeviantArt has been using Stable Diffusion for their DreamUp system, but the plaintiffs have been less clear about whether the direct claim against Midjourney targets Midjourney's use of Stable Diffusion in one model (beta/test/testp) or its use of training data (like LAION).
Basically, the judge said the idea that AI-generated images inherently infringe copyright is so stupid it was thrown out.
The other part of the case is whether the artists' copyright was violated when training the AI, and they have only claimed that Stability used their art to train.
My perspective is there are two different main issues about AI (especially Stable Diffusion).
One is how it squares with the current law. An ML model is basically a highly lossy compressed data format. If you collect millions of copyrighted images, merge them into one super big image, then compress it into a .jpg, are you allowed to redistribute this .jpg file?
To me, it mostly depends on how lossy (low-quality) your .jpg is.
(Note that the fact that human brains also store lossily compressed data is completely irrelevant here: you can only compare machine to machine, algorithm to algorithm. You can't say that because a human has the right to do X, a machine has the same right to do X.)
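To make the "how lossy" knob concrete, here's a minimal sketch (the input file name is hypothetical): the same image saved at decreasing JPEG quality settings, trading file size against fidelity.

```python
# Minimal illustration of "how lossy is the .jpg": save one image at
# several JPEG quality settings and compare the resulting sizes.
import io
from PIL import Image

original = Image.open("merged_copyrighted_images.png").convert("RGB")  # hypothetical file
for quality in (95, 50, 5):
    buf = io.BytesIO()
    original.save(buf, format="JPEG", quality=quality)
    print(f"quality={quality}: {len(buf.getvalue())} bytes")
```

At high quality the .jpg plainly contains the originals; at very low quality much less so, and that dial is exactly what the analogy above is pointing at.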
But this line of thinking, while consistent to me, is dangerous, because it means open models like Stable Diffusion are more likely to be illegal than closed ones like Midjourney, being closer to the source materials. If closed models end up being legal but open models don't, it would be a big loss for our society as a whole.
Compression implies the input can be reconstructed from the output (lossy or not). In the case of these ML models, the input is the training data and the output is the model. You can't reconstruct even a fraction of that training data using the model alone; therefore it is not compression, even in the most lossy sense.
The model produced, though, can act as an efficient compressor/decompressor, producing a lossy output image when given a prompt and/or image as input.
All that aside, the whole human/machine thing is a dumb argument. It's humans that are using the tool. The question shouldn't be whether a machine has the right to do X, but rather whether humans have the right to build and use such tools.
> Compression implies the input can be reconstructed from the output (lossy or not). In the case of these ML models, the input is the training data and the output is the model. You can't reconstruct even a fraction of that training data using the model alone; therefore it is not compression, even in the most lossy sense.
It's already proven that you can reconstruct at least a small fraction of the training set from diffusion models. It's something quite well known, so could we not die on this hill?
So just to be sure: the list of URLs + metadata that gets used for stable diffusion is several terabytes. Not the images. Just the list of URLs alone (and a bit of other metadata).
Stable diffusion itself is just 6+ GB, and fits comfortably on my USB stick.
That's one heck of a lossy compression algorithm, sir!
> So just to be sure: the list of URLs + metadata that gets used for stable diffusion is several terabytes. Not the images. Just the list of URLs alone (and a bit of other metadata).
> Stable diffusion itself is just 6+ GB, and fits comfortably on my USB stick.
Thanks for sharing this info, which I'm aware of. However, this fact is not as significant as it might sound in terms of whether it's a lossy compression algorithm.
In most lossy compression algorithms, the compression rate is arbitrary. For example, with an algorithm based on the Fourier transform, you can choose to keep only the first sine wave, or the first 1000 (a bit of an oversimplification here).
So yes, SD is small. Quite miraculously small, and its size alone implies some important insights into how humans see and read artworks. But this fact doesn't change whether I see it as a lossy compression. (In my previous comment I stated that human brains store lossily compressed data too, so you can see I'm using a broad definition of "lossy compression".)
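A minimal sketch of that Fourier point, under the same oversimplification: keep only the k strongest frequency components and reconstruct, so k acts as the arbitrary compression-rate knob.

```python
# Keep only the k strongest frequency components of a signal, then
# reconstruct it; larger k = less lossy, smaller k = more lossy.
import numpy as np

rng = np.random.default_rng(0)
signal = rng.random(4096)
spectrum = np.fft.rfft(signal)

def reconstruct(k):
    kept = np.zeros_like(spectrum)
    strongest = np.argsort(np.abs(spectrum))[-k:]  # indices of k largest components
    kept[strongest] = spectrum[strongest]
    return np.fft.irfft(kept, n=signal.size)

for k in (1, 10, 1000):
    err = np.mean((signal - reconstruct(k)) ** 2)
    print(f"k={k}: mean squared error {err:.6f}")
```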
It's not just "lossy compression" though if you can generate images that were never in the source material. I get your point, but it's a somewhat misleading analogy.
That's a difficult question because the boundaries of similarity/derived works for copyright purposes are determined by judges and juries based on their intuitions. There's no mathematical similarity testing, and trying to formulate such a thing would be challenging.
What's similar enough to a pop music theme, which has a grand total of a few lines of unique music, to be a copyright violation? How many bars have to be copied, and what kinds of minor variances do or don't avoid a violation? If you're inspired by a haiku and change 5 of 17 syllables, is that still a copyright violation? Who knows.
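To see why formulating such a test is hard, consider the naive end of the spectrum: something like an 8x8 average hash, sketched below (file names are hypothetical). It flags near-duplicate images, but it is blind to exactly the partial, transposed, or stylistic similarity that judges and juries actually weigh.

```python
# Naive "mathematical similarity": an 8x8 average hash. A small Hamming
# distance catches near-duplicates; a reworded haiku or its visual
# equivalent would sail right past it.
from PIL import Image

def average_hash(path, size=8):
    img = Image.open(path).convert("L").resize((size, size))
    pixels = list(img.getdata())
    mean = sum(pixels) / len(pixels)
    return [int(p > mean) for p in pixels]

def hamming(h1, h2):
    return sum(a != b for a, b in zip(h1, h2))

print(hamming(average_hash("work_a.png"), average_hash("work_b.png")))
```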
> That's a difficult question because the boundaries of similarity/derived works for copyright purposes are determined by judges and juries based on their intuitions.
I believe that's why DALL-E bans some keywords related to living artists: to show they have "no intention to violate copyrights".
And that's why I'm so worried that we're heading toward a future where open, uncensored models are illegal and closed-source AI-as-a-service offerings are legal. It's not fearmongering: right now, you can't use GPL code in your closed-source apps, but you can use GPL code on your server running a service that provides the exact same functions. I believe this has already hugely undermined the original intent of the GPL (written in an era before SaaS became popular).
Some AI proponents say ML is the biggest invention since the steam engine. I don't know if that's true, but if we end up stuck in a situation where open models are illegal while AI-as-a-service is legal, then it's the biggest step toward a dystopia since the steam engine.
To the extent that Stable Diffusion models are "lossy compression", the main one is somewhere between 1 and 10 bytes per image depending on whose answer I use for the question "how many images was it trained on?" (I assume the cause is 1.5, 2.0 and SDXL having different answers and the reporters conflating them). The geometric mean of those is ~three bytes, which is only enough for one single RGB pixel per image.
For all the legal issues — and the artistic flaws — I still find it quite remarkable how good it is at such a small size.
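For anyone who wants to check the arithmetic, here's a back-of-envelope version (the training-set sizes below are spread to cover the conflicting reported figures, not verified counts):

```python
# Bytes-per-image for a ~6 GB checkpoint across plausible answers to
# "how many images was it trained on?".
model_bytes = 6 * 1024**3  # ~6 GB model file

for n_images in (600_000_000, 2_000_000_000, 5_000_000_000):
    print(f"{n_images:,} images -> {model_bytes / n_images:.2f} bytes/image")
```

That spans roughly 1 to 10 bytes per image, with a geometric mean of about three bytes: one RGB pixel.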
> the main one is somewhere between 1 and 10 bytes per image depending on whose answer I use for the question "how many images was it trained
Here's a catch, though: it's just several bytes on average. We can't tell whether some images practically contribute 0 bits to the final result while others contribute more.
(I know this "contribute" wording is a little nonsensical in the context of ML. But existing lossy compression algorithms are not that different in this sense: if you compress 1M frames produced by a 3D renderer into an .mpeg video, each frame doesn't contribute the same number of bytes to the final result.)
I'm amused by the idea that under close examination it might turn out that one of the people suing on copyright grounds, made literally zero bits of difference to the model.
Machines do not have rights. The question is whether a human with a specific machine has a certain right, as opposed to a human with a different machine.
I assume the entire client+server system constitutes the "machine" in this case, correct? So does "human with" refer to the end user (client side) or the sysadmin (server side)? Maybe one is an accomplice? The machine isn't going to infringe without certain prompting by the end user, just as an inkjet printer isn't going to do so.
> An ML model is basically a highly lossy compressed data format
This is a pretty incendiary statement for those opposed to generative models, but more importantly, it's not a good interpretation, because the intent is not to store a compressed format for restoring the same image, nor can it do so.
"The contentious issue of whether AI art generators violent copyright — since they are by and large trained on human artists’ work, in many cases without their direct affirmative consent, compensation, or even knowledge — has taken a step forward to being settled in the U.S. today."
I’m not sure where you are reading that. Have you read the article?
The copyright infringement claim (for training) is left intact. It’s the other claims that had no basis in existing law (e.g. no copyright was registered, etc) that have been thrown out.
It's really not: copyright is and always has been incredibly political, and it ebbs and flows with the whims of the existing power structures. And that's mostly to do with the fact that copyright, unlike say murder, is much less well defined, has as many interpretations as there are people, and is designed to be overbroad and selectively enforced, such that by the letter everyone in the world is constantly violating it.
So this is pretty much expected, and if it swings the other way, the US/EU are going to hobble themselves in the face of any locality that gives zero shits about copyright. It's less about the art and more that the art enables these models to do real useful work, and they're better at it for having access to more data.
Orrick dismissed McKernan and Ortiz's copyright claims because they had not registered their images with the U.S. Copyright Office, a requirement for bringing a copyright lawsuit.
Hasn't this always been a precarious road? With say fair use for instance.
Not only that, but I really wish we could just redo copyright to be more flexible while ultimately empowering the creator: conclusive licenses for others to use the work (in AI, other creative work, streaming, etc.), with the creator paid either monthly or per generated image/song.
It seems like they focused too much on the details of how the model works and how data is encoded by the model.
"In his dismissal of infringement claims, Orrick wrote that plaintiffs’ theory is “unclear” as to whether there are copies of training images stored in Stable Diffusion that are utilized by DeviantArt and Midjourney. He pointed to the defense’s arguments that it’s impossible for billions of images “to be compressed into an active program,” like Stable Diffusion."
Perhaps future litigation will be more successful if they treat the model as a black box. Could an argument be made that a person's intellectual property was used to train the model without compensation and _that_ is the illegal act? From there one would only have to demonstrate that the output from the model is similar to a person's body of work.
Maybe the data for training should even be opt-in; then at least this case would have been easier to resolve. The outputs are another story - I can agree to training, but I'm not eager to see knock-offs of my work being output and spread.
> Orrick spends the rest of his ruling explaining why he found the artists’ complaint defective, which includes various issues, but the big one being that two of the artists — McKernan and Ortiz, did not actually file copyrights on their art with the U.S. Copyright Office. [...] The other problem for plaintiffs is that it is simply not plausible that every Training Image used to train Stable Diffusion was copyrighted (as opposed to copyrightable)
What? I thought everything was copyrighted by default under the Berne Convention?
That's the reason for the existence of CC0 [0], after all. Their FAQ says: "Copyright and other laws throughout the world automatically extend copyright protection to works of authorship and databases, whether the author or creator wants those rights or not."
Intellectual property shouldn’t be a thing. If you still have it after I’ve supposedly stolen it from you, then it’s not real property. The easiest test of consistency is simply to ask about both piracy and AI training data. If you support IP in one case but not the other then you’re a hypocrite. There is no third option where your support of something depends not on what it is but who it benefits.
Say someone takes your written work (say your online comments, articles, blogs, etc.) and claims it as their own. You still have a copy of your work, but now, to your audience, the authorship is in doubt. Would you be against this happening to you? What are your thoughts about plagiarism? How is this different from "copyright"?
That’s a massive reading comprehension fail on your part. The context of a statement is critical in understanding its meaning.
It’s obvious both from the statement and the preceding ones that what’s being argued is the legality of specific actions. Quoting a tiny fragment of text and reacting to it in isolation is reacting to imaginary ideas disconnected from what was actually said.
Also, the idea of a law preventing some type of action is a rather silly premise; people constantly break laws. Laws moderate behavior, so at best they prevent specific incidents instead of totally eliminating specific behaviors.
Alphanullmeric said "ip shouldn't be a thing".
Then CZL asked about the situation where "authorship is in doubt".
To which I answered: You don't need no copyright to be a liar and a fraud.
To which you answered: copyright is "the only thing preventing you"
To which I then answered: No, there are other forces guiding humans as well.
that was the "obvious" context.
But I doubt this will lead anywhere, tbh.
Have a good evening/morning (wherever you are located)
Alphanullmeric is talking about IP in a legal context. They want the laws to change, rather than, say, killing off humanity, which would also eventually eliminate IP.
Legality is thus the basis for the discussion.
Also, changing the text when you quote someone is dishonest and disrespectful. Swapping in ‘ip’ for ‘Intellectual property’ may seem meaningless to you, but they typed it out for a reason.
The creation of information is a divine thing; information lasts until humanity itself goes extinct. The very first concept created by our caveman ancestors we still use today. Copying is easy. Creating is hard. Even something as simple as creating an original name is really hard, let alone making entire movies and video games. I actually think intellectual property is the single best thing humanity has done, precisely because otherwise there would be no movies and no games; why would there be? Although I agree it shouldn't last forever.
In case I haven’t already made it clear enough in the comment you replied to - whether I believe in something or not doesn’t depend on who would benefit from it.
But to answer the question, you use proprietary software protected by means other than government force every single day.
You do realise that if people's intellectual work is not protected there won't be any intellectual work left, right? Why would I create something knowing you can just grab it and use it? Communism did the same to physical property, where you didn't own much and everything belonged to everyone. That didn't end particularly well, because people inherently want to own things, especially the output of their own creation. Sure, you can use it, but according to the terms and conditions of the owner. Same goes for owning objects. You can use my car if I let you use my car.
Thank god copyright came along and gave us Shakespeare, Bach, da Vinci, Chaucer, Beethoven…
And can you imagine life without the wheel? Too bad we didn’t invent patents earlier so that we could’ve gotten a head start inventing it and the spear.
Copyright activists really lost a lot of respect because of those silly music industry lawsuits, the Mickey Mouse Protection Act, software patents on trivial stuff, and the like.
This lawsuit is even sillier than the previous ones.
And if there was no slavery we wouldn’t have any pyramids. I don’t care. You don’t have the right to an idea, a sound or a particular arrangement of pixels. That’s not communism because nothing is being taken from you. I don’t owe you any terms and conditions to something you don’t own.
Do you think the creator of a piece of art has any rights whatsoever? You're basically endorsing the idea that if you stumble across some original work that you're able to make a copy of, there's nothing wrong with falsely declaring yourself the author of it and collecting money from anyone you can trick into believing this.
You have the right to protect it with means that are not government force. Force is only justified in response to force, and you don’t get to ransom anyone that “steals” your thoughts and pixels. Do you believe copying is force? That’s a yes or no question, and if I don’t get a yes or no answer then I’ll answer for you.
> You have the right to protect it with means that are not government force. Force is only justified in response to force, and you don’t get to ransom anyone that “steals” your thoughts and pixels.
This just sounds like your opinion; why do you think this? I really like that people can profitably write books and make movies, and I think some force to allow that to happen seems reasonable.
I reject your whole premise here, as many thefts can occur without force, and many legal remedies are imposed without force. You're just ducking the question; it seems you're saying that 'no, you don't have any rights that you can enforce in court.' You'd probably object to extralegal enforcement on the basis that it violates the NAP or some other glibertarian trope.
I’ll answer for you then. Yes, you’re claiming that copying is force.
I am defining force as literal force. What theft can occur without physically touching the thing being stolen?
I didn’t duck the question. I answered it. I even answered it before it was asked (minus the irrelevant comment about identity fraud) - I’m against IP. Copy whatever you want.
> I am defining force as literal force. What theft can occur without physically touching the thing being stolen?
This is like saying you work 24-7 because breathing involves physical motion. Insisting on your personal definitions of well-understood terms while disregarding how everyone else uses them is childish. But even if we use this, ah, special definition, legal remedies in tort cases typically don't involve force. You get a judgement of liability from the court and are ordered to write a check. You can complain about government force, but by your criteria your injury is wholly imaginary.
Meanwhile it seems clear that you do not consider authors of creative works to have any rights whatsoever in their output. I hear this a lot from people with no creative abilities of their own.
My definition of force is the definition of force. The definition I gave you was literally "literal force", and you understood that to mean something different from your interpretation of force. You are the only one trying to change definitions here. If the court orders me to do something, that's force. If you think it's somehow voluntary, then I have a few examples I'd like to ask you about. I have no idea what injury you're referring to.
Yes, we've already established that you don't get the rights to pixels, sounds and ideas, and my creative abilities won't change that.
Pathetic display. There is not only one definition of any word, as you can readily ascertain by consulting a good dictionary, legal or otherwise.
> Yes, we've already established that you don't get the rights to pixels, sounds and ideas
We've established that that's what you want, and that you're unwilling to even acknowledge the concept of authorship. Your efforts at rhetorical browbeating are clumsy at best.
But it doesn't matter what your definition is. If your definition of force includes copying, then we're done. That's what I wanted to hear you say. If not, but you want to go after people that copy, then you don't believe that force is only justified in response to force. If you want to tell me about how court orders aren't force, then I'll ask you if they're equally voluntary in a couple situations where they don't benefit you.
That's right, I do not want people to own pixels, sounds and ideas. I don't acknowledge the concept of ownership of something that isn't property. Problem?
Copyright is literally the mechanism that allows creators to obtain a legal remedy for such actions. I have a lot of problems with the state of copyright, but am OK with the basic concept.
GP is arguing against any sort of IP rights, so these questions are reasonable. Lying isn't illegal; if there are no IP rights in a created work, then anyone can legally claim authorship.
Well, by law, I do own them, and you do owe me royalties. Hopefully these people hire better lawyers next time. Also, people tend to get upset when their ownership of things is violated, pixels or otherwise.
The concept of intellectual property came about at the same time that slavery was fading away, essentially at the time when people progressed into something better than basic primates that thought enslaving one another was a good idea. So, technically speaking, advocates for the abolition of copyright are projecting a reversal of progress. I want actual AI that actually learns and that doesn’t rely on brute-forcing simulated intelligence using a clever mix of people’s ideas. By tolerating this fakery you are really just handing over the bastardisation of the holy grail of tech, AI, to a handful of grifters whose success relies on taking what’s mine for free and giving it to others for a fee. At scale and with government protection.
This notion that the creators and artists are raking in the wealth generated by people consuming their creations is obviously wrong. Intellectual property in general, but especially copyright, has been a colossal failure -- rent-seeking middlemen have emerged to swallow up the financial dividends of the creative. The idea that someone can sell access to your work without you being rewarded is ALREADY how things work. Copyright as it stands is just a way to give rent-seeking middlemen a moat, not a protection for creatives.
I find this whole exchange silly, but I would point out that you said:
> The concept of intellectual property came about at the same time that slavery was fading away. Essentially at the time when people progressed into something better than basic primates that thought enslaving one another is a good idea.
Wouldn't that suggest that at the point in time where we move past IP we are _also_ progressing to a new stage in humanity?
Irrelevant. You said “it’s the law”. I responded with “it’s the law”. You lack consistency, more at 12.
I already made it clear in my original comment that unlike you, my position on IP holds universally and not just in situations that benefit me. The fact that you continue to try to argue how beneficial it would be to have IP makes me believe that you simply do not care about being consistent, so I’d like to hear it from you personally. You believe in IP when it comes to stopping AI but not when it prevents you from pirating, agree or disagree? If I don’t get a one word answer I’ll answer for you.
I don’t care about what holds universally “true” in some people’s minds about _my_ property. That’s communism. I am an individual, therefore I care about _my_ ownership. Communism wanted to make everything belong to everyone. Philosophical Darwinism proved that concept wrong. What’s mine is mine, period.
I never pirate software, and no, I don’t think piracy is a good idea either. I believe in open source and that human knowledge should be free in the libre sense, but I equally believe that it should be free according to terms and conditions. So I, as a human, would like to read and understand your idea, but if you don’t wish to grant me the right to monetise it - sure.
Pixels, sounds and ideas aren't your property. That has nothing to do with communism. Force is only justified in response to force, and you've still yet to prove that my copying of your ideas is forceful enough to warrant government intervention.
"people inherently want to own things, especially the output of their own creation". That is the founding idea of communism indeed. I'm not sure you understand anything about it.
This is how moneyheads think the world works: that everything is a series of monetary incentives to be linked together to make an end result. Most humans don't actually think this way, and in particular a LOT of creative work is made without calculating exactly what the profit is going to be. This doesn't mean that artists don't want to be paid, but that artists focus on making their work first and monetizing it later.
What copyright actually protects is creative industry. By assigning individualized monopolies over copying and reproduction, the publishing industry can persistently lowball the shit out of artists (who themselves undervalue their work, see above) and then reap the profits for themselves. Since the vast majority of creative work would never see market interest, it's cheaper to pay billions of dollars to the handful of known, recognizable, and marketable mega-successes than to pay smaller amounts to a far larger pool of mid-list or unknown artists. This is why unions exist in basically every creative industry: otherwise, nobody below the talent line[0] gets paid.
To put a finer point on it: right now, the unions are doing a way better job of protecting human artists against AI art than copyright is. The argument for training AI being infringing is very weak in the general case where there's no obvious regurgitation. I mean, where does your copyrighted material even 'live' in the model, if the model can't even reproduce it? However, unions can very easily just say "you can't force us to cut corners by using this tool" in their negotiations and actually get that result. Furthermore, those agreements only bind publishers that hire artists. The artists themselves can still use AI when it makes sense in their workflow, rather than when publishers think they can cheap out on shit.
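As a rough back-of-the-envelope sketch of that "where does it live" question (the checkpoint and dataset sizes below are ballpark assumptions for a Stable-Diffusion-class model trained on a LAION-scale dataset, not measured figures):

```python
# Ballpark assumptions only: a Stable-Diffusion-class checkpoint is on the
# order of a couple of GB, trained on roughly two billion images.
checkpoint_bytes = 2 * 1024**3       # ~2 GB of weights (assumed)
training_images = 2_000_000_000      # ~2B training images (assumed)

bytes_per_image = checkpoint_bytes / training_images
print(f"{bytes_per_image:.2f} bytes of weights per training image")  # ~1.07
```

Under those assumptions there is roughly one byte of weights per training image, which is why wholesale memorization is implausible in the general case, even though regurgitation can still happen for heavily duplicated images.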
The failures of Soviet communism are complicated, but if you had to boil it down to one factor, I would not summarize it as "communal ownership bad" or "collectivism bad". Collective action has its place. Furthermore, the analogy you're making between copyright and physical property is flawed[1]. The reason why physical property ownership even exists is because of scarcity - the reason why I need permission to use your car is because you can't use your car if I'm also using it.
The irony of your communism analogy is that copyright is specifically used to erode ownership in private property in a way that makes the communism haters cry communism. There's a novel form of copyright misuse as a business model in which you put software in a thing that used to not require software, call it "smart", and then use the software to enforce your own idea of what "owning" the product means, backed up by the same laws that make it illegal to copy DVDs. There are a LOT of people who would like to go back to owning their cars and computers again, and that requires rolling back copyright, not strengthening it.
[0] Hollywood-ism for "people whose contribution to the work is not marketable"
[1] And, I suspect, a by-product of having read a bunch of Ayn Rand nonsense
Yeah, people need to pay bills and such. I know in communism that may seem unnecessary, but it is. AI-fueled techno-communism will fail like all other flavours of the same ideology. You can't take from the few and give to the many, even if that's a digital product. What's mine is mine.
Imagine you have a blob of seemingly random data. Nothing in the data contains anything recognizable as illegal or in violation of copyright.
Now imagine that, after a transformation operation, the right input suddenly turns the data into illegal or infringing material. And not just a single unique input, such as a password, which would clearly represent a mapping function between two sets of data.
But imagine there were seemingly infinite possible inputs, each of which transformed the data into a different infringing blob. Suppose these inputs exactly represented the novel, copyrightable, or illegal aspects, while the blob itself remained inert.
What should be illegal here? The blob, which by itself is free of any questionable bits of data, or the inputs which transform it into something tangible? Both? Neither?
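A minimal sketch of that thought experiment, with everything invented for illustration: a fixed, inert blob plus an open-ended space of inputs, where each input deterministically yields a different output and no output is recoverable from the blob alone.

```python
import hashlib

# Hypothetical illustration: BLOB stands in for an inert pile of bytes
# (think model weights). On its own it contains nothing recognizable.
BLOB = bytes(range(256)) * 1024

def transform(blob: bytes, user_input: str, length: int = 32) -> bytes:
    """Deterministically derive an output from the blob and the input together."""
    out = b""
    counter = 0
    while len(out) < length:
        h = hashlib.sha256(blob + user_input.encode() + counter.to_bytes(4, "big"))
        out += h.digest()
        counter += 1
    return out[:length]

# Seemingly infinite possible inputs, each mapping the same inert blob to a
# different output; the content only exists after the transformation.
print(transform(BLOB, "input A").hex())
print(transform(BLOB, "input B").hex())
```

The blob by itself is indistinguishable from structured noise; anything questionable only exists in the pairing of blob and input, which is exactly the ambiguity the question above probes.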
Well, it has never been illegal to draw or paint something representing CSAM, for example. And it has never been illegal to draw or paint Mickey Mouse in your own home.
What's often illegal is publishing said data. Ignoring the free speech debate around artificially produced CSAM, publishing it is already illegal in many territories. Likewise, in many countries it is illegal to publish material that violates copyright.
What's interesting is that it is not illegal to trace a drawing and hang it up on your wall instead of buying the real drawing from its rights-holder. It's also not illegal to reproduce a tracing done by a friend. But the recording and film industries have been more successful in convincing us that it should be illegal to do the same for a song or film: that you should not be able to "trace" the data at home, that you should not be able to share it with me, and that I should not be able to trace over your tracing and bring home a copy for myself.
I can understand, and support, a copyright system which regulates the publishing of copyrighted material. Even copyleft paradigms lean on regulation for enforcement. But the film and music industries actively try to restrict individual freedoms in the name of corporate profits, while still screwing over their clients and employees with respect to profit-sharing.
Back to the point: that blob should never be illegal. The activation functions should never be illegal. That is a basic extension of free speech. But publishing is a different story, and we already have laws offering such protections with respect to both illegally produced and copyrighted content. Any attempt to regulate what kind of model I am allowed to run at home is a massive infringement on my rights as an individual, and is borne either out of gross ignorance of current copyright law, from the same people crying, "But think of the copyrights!", or out of direct, insidious corporate greed.
You can adjust this thought experiment so that, instead of dealing with a magic blob, we are dealing with a program that makes it really easy to produce illegal or copyrighted works after a bit of human interaction. Is there a claim here now? Are we basing the law on how much human involvement was needed to create the output? We've faced similar arguments around technological leaps such as the printing press or the mechanical loom. Did we, as a society, reject those advances in technology in order to protect loom workers and scribes?
Bottom line. You can pry my models out of my cold, dead or handcuffed hands. Times like these really shine a light on who is complicit in the system, and who suffers from it.
If you are in the creative industry, you need to understand how things are going to change. As an engineer with decades of investment into my craft, I also have to face the rude awakening ahead in my own industry, as automation creates a gap between highly skilled professionals and newcomers. Being a paid software engineer might become as hard as becoming a famous professional artist: lots of connections, insane specialization, and a lifetime devoted to the craft. A lot of people in school for engineering right now might struggle to find employment in 20 years or less if they cannot cross this gap in time. Artists aren't the only tribe experiencing a huge industry shake-up over a technology that will one day be so ubiquitous that it's inside your toaster.
This feels like a bit of a naive interpretation of the situation. At its core — regardless of specific lawsuits, etc — the questions here are (1) should copyright laws be adapted to the new reality of generative AI, (2) should artists be able to control how their work is used given generative AI is a reality, and (3) do we as a society think people should be able to make a living as artists, and what are the implications of that either way when it comes to AI models and their use.
Until this point, an artist who has developed their own personal, recognizable style could be somewhat confident that it is difficult for someone else to generate a new piece of art exactly mimicking their style. That is to say, it was never impossible — there have certainly always been other artists out there who are capable of taking artwork and creating something new in that style — but there were some barriers to getting there, including that those artists aren’t easily and instantaneously accessible to every human being on the planet, that they generally don’t work for free, and that they would need some time to produce their work. The combination of these factors resulted in a system wherein, for the most part, if you really wanted to create something in the style of a specific artist, you would need to commission them, thereby supporting their ability to live and continue creating art. And/or they sold merchandise with their art, or collections, etc.
Now, on the other hand, it is incredibly easy to go to an image generator and have it generate art in the style of a specific (sufficiently well-established) artist quickly, easily, and freely. The barriers have, overnight, gone from being reasonably protective to pretty much nonexistent. As a result, artists are asking themselves how they can continue to live and create art. This is something a sufficiently well-established professional artist used to be able to do before generative AI came into the picture, because other than the odd copycat (which again took time and effort and an actual human with the right ability), they were the only ones who could produce images in their own styles, and this ability was thus a valuable resource that people paid for. If anyone can now produce identical images independently and for free, then this ability may no longer be a resource other people will pay for.
Part of what these court cases are trying to determine is exactly whether any copyright does apply to generated images. You wrote that “publishing, that is a different story, and we already have laws offering such protections both with respect to illegally-produced or copyrighted content”, but those laws are exactly what’s being tested here: artists (and organizations like Getty) are seeing what they claim are AI-generated copies of their copyrighted works in use out in the world (so these have been “published” by some definition — they are not only being printed out and hung in people’s garages for them and their friends to look at in private), and are suing to stop that.
But aside from that, I think there is a real philosophical discussion here. If you’ve trained as an artist your entire life, have worked hard to develop a unique style, and are one of the relatively few artists who have been successful doing so — should a company be able to wait until you became popular, then just take all of your work, and use it to train a model that can produce works exactly in your style easily and without any effort, which it can then provide to people freely or for a subscription?
This also isn’t as much about the output, as about how the output was obtained. If the model did not actually ingest your images, but someone wrote a prompt that involved a super-detailed description of what made your style unique, going into color palettes, line thicknesses, art styles, influences, etc etc, and you would have to get all of that right in order to generate something that looked like your art, then I think most folks would be generally ok with that. But when (1) your prompt can just be “give me art that looks like soulofmischief made it” and it’ll give you just that, and (2) you know that your art was used to train the model in order for it to be able to do that, then there is a question of whether fair use laws should be adjusted to prohibit this behavior and protect your ability to live off of your work.
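To make that contrast concrete, a purely hypothetical sketch (generate_image is an invented stand-in for any text-to-image API, not a real library call):

```python
# Illustrative stub standing in for any text-to-image API (hypothetical).
def generate_image(prompt: str) -> str:
    return f"<image generated from: {prompt!r}>"

# (1) Invoking the artist by name: only possible because their work was ingested.
img_by_name = generate_image("art that looks like soulofmischief made it")

# (2) Describing the style from scratch: relies on no ingested work at all.
img_by_description = generate_image(
    "thin ink outlines, muted teal-and-rust palette, "
    "flat cel shading, art nouveau framing"
)

print(img_by_name)
print(img_by_description)
```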
I also think that regardless of the outcome of these lawsuits, no one is really coming for your own models and your ability to tinker in your garage. It may not be legal today to duplicate a copyrighted image and hang it in your office, but no one will ever know (or care enough to do anything about it) if you do. Similarly, even if this use becomes infringing, nothing will practically stop you from building your own large model that includes any copyrighted images you want, for your own personal use, in your own garage. But if you then turn around and try to profit off of that model, or if you want someone else to produce a model (thus stepping more into the publishing realm), that’s where a line may be drawn. I personally think that’d be fair.
Finally, zooming all the way out, I believe that it should be possible to make a living as an artist, and I think when we have discussions like these, we should keep reminding ourselves to think about how our technical or legal arguments affect that outcome.