“All watched over by llamas of loving grace”

It’s been a busy few weeks in the AI and copyright beat, and while I’ve been following all of the developments closely, I haven’t had the time to react to everything in the blog. However, with two first decisions handed down in the US, and one in the UK expected soon, it is a perfect time to start looking at the subject from a long-term perspective. A lot of the work from copyright lawyers and researchers who look at AI tends to propose solutions based on the traditional models, namely licensing and remuneration. I have been arguing for a while that the nature of AI training and of the technology does not fit perfectly into the existing frameworks, and that we should be looking to either reform the system or adjust business models, and copyright policy should reflect those changes. I’ll try to explain the problems, and I’ll begin exploring new ways of looking at copyright in this context, although I do not claim to have any answers.

The traditional copyright model

The way copyright has worked traditionally in the creative industries is well-documented, but for the purpose of this post this is roughly how it is supposed to operate. In an ideal world, creators generate a work worthy of copyright protection; this then gets distributed to the public, who pay for it, and royalties flow back from that payment to the author. The reality is evidently infinitely more complex than that: creatives may not own the copyright, there are intermediaries and licensing deals with distributors to consider, and the result is a complex picture of moving parts where often just a small percentage of the money goes back to the person who created the work.

The term ‘creative industries’ itself covers a vast and varied array of sectors. It’s not just the traditional realms of publishing and music, but also film and television production, theatre, design, fashion, advertising, and the ever-expanding world of video games and digital media. Each of these operates as its own distinct ecosystem, with unique commercial pressures and conventions for how creative work is funded, produced, and sold. Authors often have to operate in a world of literary agents and publishing houses, and this provides a different set of challenges to a freelance graphic designer licensing their work for a commercial campaign, or a musician trying to make sense of streaming royalties. This diversity is key to understanding why a single, simple model of copyright rarely fits the bill.

So while the model still relies on individual creators, the reality is a system of publishers, agents, collecting societies, distributors, and licensing bodies. But even with this complexity the idea is simple: work → licensing → remuneration. Create a work, sell it or licence it, profit.

But this model also relies on several cogs being maintained that are already under threat. There has to be a product that can be sold or licensed to an audience, and a royalty system in place that sends the money back, often operated by these intermediaries and collecting societies. But digitisation and the rise of online streaming have changed many of those premises. More often than not nowadays, creators are employees in large conglomerates that churn out content and keep ownership of all that is produced. Moreover, streaming models often translate into less money from royalties as the system is propped up by subscription fees, which often generate less than what the traditional product model used to accrue. So there has been a concentration of power at the top, and less money trickling down to the creators who actually make the copyright works. This is coupled with the fact that new generations of creatives are starting to opt out of the traditional model to engage with the content-creation economy that is more direct: influencers, streamers, and YouTubers do not operate within the traditional copyright framework at all.

The problem with the traditional model and AI

The model described above, while it has been suffering, still produces a sizeable amount of returns, which is the reason why it remains in place for the most part. So when generative AI started becoming more popular, it was assumed by most people in the industry, as well as some legal commentators, that this would be a similar way to move forward with AI. I’ve read several opinions that AI should be dealt with in roughly a similar way as the copyright industries handle remuneration nowadays, but there are a few problems in implementing this model to AI, and we are starting to see these play out in practice.

The ideal application of the model would work something like this: author creates a work protected by copyright, the AI developer purchases a licence to use the work as training data. Rinse and repeat.

But this is not how things have proceeded in reality. At first the data used for training was the largest repository of works in existence, and that is the public web. Earlier language models, and most image ones, used content already found online for training. So from the start, large amounts of data already available for decades on the open web were the source, and that usually meant bypassing any sort of negotiation with content owners. Evidently, creators cried foul, and in many instances proceeded to sue for copyright infringement. At first it looked like this would be an easy win for authors, but things haven’t been so clear-cut, and this appears to be confusing quite a lot of people who were told that copyright would be the silver bullet that would kill generative AI.

Here is what I think is happening.

The first problem has been the fact that creators have been playing catch-up from the outset. A couple of years ago I gave a lecture at the London Book Fair on generative AI and copyright infringement; it was a full house and everyone from the copyright industry was there. It was the copyright maximalist Woodstock. I was a bit nervous as someone who is famously a minimalist, but I think that the talk went quite well. My main message was that the copyright industry had been asleep at the wheel, and that they had already lost the battle. Today that message would have gone down very differently, but at the time these people needed a wake-up call. My argument was simple: the models are already trained, you could get rid tomorrow of every single generative AI company, and the models would still exist. Moreover, other countries with no copyright enforcement would start getting in on the act. My advice at the time was to try to start negotiating immediately; I wasn’t sure if that would work, but it was a start. Soon after, the lawsuits started flying, but the assessment remains. The models are already trained, and there is no putting the genie back in the bottle, so the copyright industry has been trying to catch up with a runaway technology from the start. See, it would have been easier to get all negotiations done beforehand, but they were mostly worried about the link tax, but I digress…

The second issue is about the models themselves. Traditional copyright remuneration systems work because there is a product that faces a consumer; in other words, there is a publication of a work that gets distributed and communicated to the public. AI models take millions of works and do not publish them in any sense of the word—they’re not facing the public. You wouldn’t go to ChatGPT or Gemini to read ‘The Hobbit’; you would go there to know the plot of ‘The Hobbit’, but that is the same as Wikipedia. Training an AI means accessing a copy to extract data from it, but that is not an adaptation and it’s not a publication. Sure, there is a reproduction, but at least in some systems that could fall under existing exceptions and limitations. There may still be an infringement, but the lack of publication will mean that damages for the most part may end up being negligible; in most countries that do not have statutory damages, that is. But even if these companies get hit with billions in damages for those copies, you end up with the first problem.

The third problem for a remuneration model was recently made evident in the Anthropic fair use decision by Judge Alsup. An interesting detail in the discussion that came to light was that Anthropic had purchased books for scanning, something that I have been expecting would eventually happen. The issue is that copyright forbids people from making a reproduction of a work without authorisation, which is what most training from the web is. But what if someone purchases a legal copy of a work, and trains from that? I think that this is fine; training an AI is not an exclusive right of the author, and extracting information from a purchased work is not an infringement of copyright, otherwise reading a book would be actionable. The problem here is that, with the exception of evidently infringing outputs, for the most part AI outputs cannot be considered to be either a communication to the public of a work, a publication, or an adaptation, all exclusive rights of the author. So if there is no unauthorised reproduction, there is no cause of action. This may be solved by eventually making AI training something that requires authorisation from the author, but for now, that is not the case.

And the final problem for establishing a remuneration model is one of scale. I’m often baffled by how people simply seem to ignore the gargantuan amount of data that goes into the training of a model such as an LLM, but also what effect that has on any sort of licensing deals, as well as any viable remuneration model. For the most part, a model such as ChatGPT requires billions of tokens (a unit of text such as part of a work which is used by language models to process and generate language). The sheer number of tokens, coupled with the fact that you only need to pay for one copy of a work, means that the value of a licensing market is negligible. For there to be a traditional licensing system, there has to be a value to each individual work that is worth the transaction costs, but if what you are going to get is 0.00000001% of an already low transaction cost, then most creators would not expect to see any money at all. Imagine the Spotify remuneration model, but a thousand times less money going to creators.
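A rough sketch of that arithmetic, in Python. Every figure here is made up purely for illustration (the size of the licensing pool, the token count of one work, and the size of the training corpus are all assumptions, not numbers from any real deal or model), and the pro-rata split by token count is just one hypothetical way such a pool could be divided.

```python
def per_work_payout(pool: float, work_tokens: int, corpus_tokens: int) -> float:
    """Pro-rata share of a licensing pool for one work, split by token count."""
    if corpus_tokens <= 0:
        raise ValueError("corpus_tokens must be positive")
    return pool * work_tokens / corpus_tokens


# Hypothetical figures, chosen only to show the order of magnitude.
POOL = 10_000_000.0                  # total licensing pool, in any currency
WORK_TOKENS = 100_000                # one novel-length work (assumed)
CORPUS_TOKENS = 1_000_000_000_000    # one trillion training tokens (assumed)

payout = per_work_payout(POOL, WORK_TOKENS, CORPUS_TOKENS)
share_percent = 100 * WORK_TOKENS / CORPUS_TOKENS

print(f"Share of corpus: {share_percent:.5f}%")
print(f"Payout for this work: {payout:.2f}")
```

Under these assumed numbers a single novel-length work accounts for 0.00001% of the corpus, and a pool of ten million pays it one unit of currency, before any transaction costs are taken out. Changing the assumptions moves the result, but the corpus size in the denominator dominates everything else.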

What next?

I’m tempted to just add the shrug emoji here. Heck, I don’t know, that is way above my pay grade. I’m just diagnosing the problem; I have no idea how to come up with solutions. I know people in the creative industries will not like my analysis above, but I think that it is about time that people stopped listening to those peddling 19th-century models, and at least recognised that the current copyright system is just not suited to the challenges posed by AI. We are here because people have been assuming that the traditional models will still hold, but that is not going to happen for various reasons, and you will continue getting people being surprised, hurt, and disappointed when court decisions don’t go their way. The problem is that a lot of people were sold lies about AI and about copyright. They were told things were certain, but they rarely are. Copyright is messy, costly, and you almost never get the result you wanted. This is because copyright also has to have a system of exceptions and limitations, otherwise even running a game on your computer or browsing the web would be infringing. (Notice the two em-dashes here, I’m trying to make em-dashes happen).

So at the very least stop trying to make copyright do something that it’s not meant to do. Some of the issues that are being discussed are about societal challenges. They are also about competition, funding to the arts, capitalism, corporate greed, and inequality. So reform the system if you must, establish grants for creators, tax Big Tech developers up to the gills and give the money to creators; anything but the assumption that the traditional remuneration system has any hope of surviving.

And if you’re a copyright holder, time is not on your side. The lawsuits in the US will keep going for years and years, and by that time more models will have come online, and more countries will have joined the AI race. I’d argue that the bulk of the generative AI revolution took place between 2014 and 2017. We’re living in the aftermath of that momentous time, which most people missed. AI is here to stay, so perhaps start acting accordingly.

Concluding

It’s been a weird decade; 10 years ago I published my first generative AI copyright blog post. So much has happened since that I don’t know where to begin. I got very sick, and I got better. I had a short NFT copyright obsession, and I got better. I liked ‘The Last Jedi’, and I got better. But one thing has remained clear. AI tools have continued to improve, and like it or not, we are becoming more reliant on them. If you don’t like that statement, that’s fine, but copyright is not the answer.

I don’t know what the answer is, I’m just a guy cosplaying as a llama online.


5 Comments

Alberto Gonzalez · July 9, 2025 at 11:19 pm

Hi Andres, as a Costa Rican aiming to create original IP in the form of entertainment material for kids and hopefully make a living while at it, the whole AI vs. copyright thing goes straight to the core of my concerns. This being a behemoth of a subject, I think it is best to address it in bite-size pieces.

Copyright, as from an author’s perspective, implies two basic rights for any creator: to prove the IP credited to the author’s name is effectively and legally theirs, and the implicit right for the author to make use and / or profit off it as s/he/they see fit. (I know corporations are also copyright holders but let’s keep things simple here) Fair enough, right?

However, the assumed control of creators over their own material does dilute greatly when they have to deal with powerful distribution channels online and offline, as you point out, and original creators end up earning the least in this whole system, instead of the logical opposite. And that didn’t start with AI.

Creators have long been given the shaft in terms of legal rights, and learning to stand our ground has taken a long time also. To illustrate in light of the new Superman movie: the original creators of the series, Jerry Siegel and Joe Shuster, did it as any other such job in the 1930s: they got paid for the story and the art pages, and that was it. When their creation became a multi-million-dollar success, they both were kicked out of the character copyright by DC and ended up living in poverty. Maybe that’s why IP ownership (and not giving it away foolishly) strikes a strong chord with me.

As for AI: Yes, if AI companies wanted to play fair they shouldn’t have scraped the whole Internet for their LLM training. But they did, because no one stopped them, and because they were posing as “non-profits” only to be valued at billions of dollars on Wall Street later. Now the genie is out of the bottle and, short of destroying all LLM databases and undoing AI as a whole, it won’t go away. But I also think shrugging the whole thing off isn’t fair either, especially for millions of indie and small creators and publishing shops that will never have the resources to sue anyone out of existence like Disney does. AI companies are forced to make big profits in order to keep going. We can agree the existing model is broken and isn’t really working save for those with really deep pockets, but a fair, working alternative isn’t coming out yet either.

Anonymous · July 21, 2025 at 3:42 pm

Really enjoyed reading your piece, for both its content and the vulnerability of it. Dr MP 🙂

Sharon L · July 29, 2025 at 7:45 pm

Really enjoyed this contribution. As someone who has been engaged in research in traditional knowledge protection within the context of an IP inspired system, and more recently looking at AI and IP, the trick is to find a “work around” to the challenges posed by an IP system based on principles and models that do not neatly translate, and figuring out what that “work around” might look like.

Analysis of the Voss Report: Technical, Economic and Legal Concerns for European AI Policy | C4C · July 9, 2025 at 12:04 pm

[…] of meaningful individual creator compensation. As technology law expert Andrés Guadamuz observes, “for there to be a traditional licensing system, there has to be a value to each individual […]

July Copyright Reads – Open Research · July 15, 2025 at 11:49 am

[…] How AI is breaking traditional remuneration models […]
