Rachael: Do you mind if I smoke?
Deckard: It won’t affect the test. All right, I’m going to ask you a series of questions. Just relax and answer them as simply as you can. — It’s your birthday. Someone gives you a calfskin wallet.
Rachael: I wouldn’t accept it. Also, I’d report the person who gave it to me to the police.
Deckard: You’ve got a little boy. He shows you his butterfly collection plus the killing jar.
Rachael: I’d take him to the doctor.
Deckard: You’re watching television. Suddenly you realize there’s a wasp crawling on your arm.
Rachael: I’d kill it.
Deckard: You’re reading a magazine. You come across a full-page nude photo of a girl.
Rachael: Is this testing whether I’m a replicant or a lesbian, Mr. Deckard?
Deckard: Just answer the questions, please — You show it to your husband. He likes it so much he hangs it on your bedroom wall.
Rachael: I wouldn’t let him.

The above is from the movie Blade Runner, and it shows the use of the Voight-Kampff test, one of the most famous AI-detection tests in science fiction. The objective is, as you may gather from the title of this blog post, to find out if a subject is a replicant. In the movie, replicants are artificial beings that are basically indistinguishable from humans, except that they’re engineered to be stronger and more beautiful, and they live considerably shorter lives by design. Deckard, who is a Blade Runner, specialises in performing this test and dispatching rogue replicants. The whole point of the film is to make us feel empathy towards these artificial humans, particularly with the “tears in rain” monologue.

I suspect that modern-day replicant equivalents do not get the same level of sympathy from some quarters. While we’re increasingly using AI tools in everyday life, our capacity to detect them is nowhere near efficient, or even remotely viable. Large language models like Bard and ChatGPT took the world by storm, and almost at the same time a number of detectors came to market, supposedly able to tell when AI had been used, but since then such detectors, both for images and for text, have been plagued by accusations of inaccuracy, false positives, and false negatives.

Why, you may ask, do we need to detect AI in the first place? The detection of AI use in everyday life is becoming increasingly important for a multitude of reasons. Firstly, transparency and trust are foundational to the relationship between technology and its users. As AI tools become more integrated into our daily routines, from personalised content recommendations to LLMs, it’s crucial for users to be aware of when and how these systems are making decisions on their behalf. Knowing when AI is at play can help individuals better understand the rationale behind certain outcomes, whether it’s a movie suggestion on a streaming platform or financial advice from a robo-advisor.

Secondly, the ethical implications of AI are vast and varied. By detecting AI use, we can ensure that these systems are being employed in ways that align with societal values and norms, and with the user’s own acceptance of AI. For instance, in areas like hiring or lending, where decisions can have profound impacts on individuals’ lives, it’s essential to know if an AI is involved, as it could be perpetuating biases or making uninformed decisions. Furthermore, as concerns about privacy and data security grow, being aware of AI’s presence can help individuals protect their personal information and avoid potential misuse. In essence, detecting AI use promotes accountability, ensuring that these powerful tools are used responsibly and ethically in our ever-evolving digital landscape.

Thirdly, if we’re going to continue using essays as a means of assessment, we may need to find out for certain whether AI is being used. There are evidently ways around this that don’t require AI detection systems, such as using innovative questions, oral examinations, or in-person written exams, but this could prove to be a difficult issue to tackle. As of today many universities have forbidden the use of AI in essays or any other form of formal assessment, but this requirement hasn’t been matched with the capacity to detect its use. So markers have two options: to guess, or to assume that no AI has been used and mark things at face value. Given the unreliability of detectors, there is only one realistic option at the moment.

So what else can we do? Perhaps detectors are the wrong way of looking at things, and what we should be looking for instead is AI authenticators and identifiers at the point of origin: if you use a language model or an image generator, the output will carry some sort of watermark that denotes it as AI-generated. Google has recently announced the deployment of a watermark system that can’t be edited out, with the intention of making AI works identifiable.
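To give a flavour of how point-of-origin watermarking for text can work, here is a minimal sketch in Python of the “green list” idea described in the academic literature. Google has not published the internals of its own system, so this is only an illustration of the general technique, not a description of any real product: a hash of the previous token picks a “green” subset of the vocabulary, the generator is nudged towards green tokens, and the detector measures what share of the text lands on those green lists.

```python
# Toy sketch of "green list" text watermarking. This is NOT Google's system
# (whose internals are not public); it only illustrates the general idea.
import hashlib
import random

GREEN_FRACTION = 0.5  # share of the vocabulary marked "green" at each step


def green_list(prev_token, vocabulary):
    """Deterministically derive the 'green' subset of the vocabulary from prev_token."""
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    k = int(len(vocabulary) * GREEN_FRACTION)
    return set(rng.sample(sorted(vocabulary), k))


def green_share(tokens, vocabulary):
    """Fraction of tokens that fall in the green list seeded by their predecessor."""
    pairs = list(zip(tokens, tokens[1:]))
    if not pairs:
        return 0.0
    hits = sum(1 for prev, tok in pairs if tok in green_list(prev, vocabulary))
    return hits / len(pairs)


def looks_watermarked(tokens, vocabulary, threshold=0.75):
    """Unwatermarked text should sit near GREEN_FRACTION; watermarked text well above it."""
    return green_share(tokens, vocabulary) >= threshold
```

Real systems bias the model’s sampling distribution during generation rather than checking tokens after the fact, and they use more careful statistics, but the gist is the same: detection becomes a statistical test applied at the point of origin rather than guesswork applied afterwards.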

Another option is regulation that makes it a legal requirement for AI output to be identified at the point of origin. At the time of writing there has been chatter about including some sort of identifier, and maybe one will even be included in the forthcoming AI Act, but at the moment there is nothing specific. I’m slightly sceptical of top-down approaches such as this, as identifiers could be easy to beat if the regulation doesn’t also include some strict technical requirements. Users could also simply turn to tools created outside the jurisdiction that requires such identifiers. An identification requirement also presupposes that users will want to comply, and short of making it a criminal offence not to identify an AI, I can’t see how this would ever be enforced. Unless…

Taking a page from Blade Runner, we could see Blade Runners, an anti-AI police that will knock down your door if you use AI. Life imitating art again?


4 Comments


Amanda Horzyk · September 10, 2023 at 9:35 am

Watermarking AI output with both a regulatory requirement and a technical specification will be key to accountability, verification, and building trust when releasing more advanced models that will decide on legal or similarly important matters affecting our lives. Watermarking Gen-AI images seems to be at the forefront of the discussion, and yet watermarking Gen-AI text and music is arguably more technologically challenging but equally important. Unfortunately, these conversations about AI ethics, accountability and watermarking often stop at the feet of sociologists, legal professionals or scholars, and a few Big Tech companies. There is a great need to bridge the gap between law and the AI developers who are able to develop and implement most of these solutions. Best wishes Andres! I really enjoy reading your blogs ~ your former student Amanda


Just Some Citizen · September 21, 2023 at 6:35 pm

Come on. Every supposed solution involving the use of so-called watermarks always fails, and this particular suggestion is even worse than usual. It would be but a moment until tools would appear (perhaps even using lesser forms of AI themselves) that could easily take a piece of watermarked AI-generated content and regenerate it, keeping all its major semantic aspects, minus the watermark. Also, the big AI providers would simply start offering some kind of premium service where high-paying customers would get access to AI-generated content unencumbered by watermarks, leaving those for the rest of us who would be using their services for free. (Such a discrepancy already exists: consider the well-known case of AI chatbots refusing to assist in the construction of dangerous devices, and such; if you’re a paying customer, your premium version of the chatbot will gladly answer those same “dangerous questions” without any qualms.)
