As businesses invest more heavily in AI, what intellectual property challenges are they likely to encounter? How do we avoid AI having a "chilling" effect on human creativity, and how can rightsholders guard against their intellectual creations being scraped to train AI without their consent? This briefing looks at the IP rights that subsist in AI systems and in AI-generated content, the potential infringement risks that can arise from training AI and from its outputs, and the UK Government's response to these issues.

  1. Introduction
  2. IP in the AI system
  3. IP in AI-generated works
  4. Infringement risk
  5. Conclusion

Introduction

In our earlier briefing, we looked at the UK's approach to regulating AI and how it compares with the EU's AI Act. Both regimes look to strike a balance between encouraging innovation, on the one hand, and the responsible use of AI, on the other, but they go about it in very different ways. Intellectual property law also has a significant part to play in creating that balance: offering protection for AI systems and their outputs in order to incentivise innovation and investment, while setting parameters within which AI can lawfully be developed and used without impinging upon the rights of others.

The UK Intellectual Property Office (IPO) conducted a consultation in 2020 and then again in 2021 to review whether changes needed to be made to the laws on copyright and patents to accommodate the use of AI but concluded that "the use of AI is still in its early stages. As such, a proper evaluation of the options is not possible, and any changes could have unintended consequences". Of course, these consultations took place before ChatGPT and its like burst into the public consciousness, leaving a trail of wonder and anxiety (some would say hysteria) in their wake. The report to the Government presented by Sir Patrick Vallance, "Pro-innovation Regulation of Technologies Review - Digital Technologies", in March 2023 urged the Government to announce a clear policy position on the relationship between intellectual property law and generative AI:

'To increase confidence and accessibility of protection to copyright holders of their content as permitted by law, we recommend that the government requires the IPO to provide clearer guidance to AI firms as to their legal responsibilities, to coordinate intelligence on systematic copyright infringement by AI, and to encourage development of AI tools to help enforce IP rights'

IP in the AI system

What IP rights subsist in the AI system itself? Most AI systems are implemented as software running on off-the-shelf computer hardware, so the rights in those systems will be those that arise when developing any other type of software: primarily copyright, rights in confidential information/trade secrets and, potentially, patents. While the categories of IP rights may be the same as for other software, establishing that those rights subsist in an AI system, and who owns them, is more complicated. This is because the underlying logic of an AI system is often developed not by a human but by the system itself, and the law, as it stands, only recognises humans as authors or inventors. These issues are considered further below in section 3, "IP in AI-generated works".

There is no copyright in an algorithm itself, but the source code of an AI system is protected by copyright as a literary work. Copyright prevents copying and, while this extends to more than just the text of the code and includes its structure, sequence and organisation, it does not protect the functionality achieved by the AI system – in other words, it does not guard against someone (or something) independently creating different code to produce the same output.

A patent (a registered right, rather than one arising automatically like copyright) protects functionality and offers stronger protection than copyright: it prevents others from subsequently making, using or selling another system that falls within its scope, even if that other system was created independently. There are, however, various conditions to be met and exclusions that make patenting AI challenging: for example, a computer program, an algorithm or a "method for performing a mental act" is not patentable "as such". A non-obvious AI invention that produces a technical effect solving a technical problem is nonetheless eligible for patent protection. In September 2022 the UK Intellectual Property Office produced guidance on the patentability of AI inventions.

In addition to the patentability hurdles to cross and the time and cost of obtaining a patent, the quid pro quo for patent protection is the requirement publicly to disclose how the system works as part of the application process, such that the patent claims are available to view on the patents register. Some AI developers may instead prefer to keep key aspects of their system secret. In these circumstances, the "black box" nature of AI – the fact that its inner workings are not revealed by its output – is an advantage (from the owner's perspective) because it makes it possible to rely on the common law of confidential information, along with the statutory protection that sits alongside it through the Trade Secrets (Enforcement) Regulations 2018, to protect various aspects of the AI system e.g. the algorithm, the functionality of the AI and its training data. The disadvantage of this (for the rest of the world) is that it clashes with the "responsible AI" principle that systems that have potentially far-reaching implications for people's rights and freedoms should be transparent.

Other considerations

Of course, considerations that apply generally to IP rights in software may also be relevant to AI systems. For example, the development and training of the AI will often involve a collaboration between various parties, so it's important to establish which party (or parties) owns what. It's also important to ensure that there is appropriate licensing of third-party IP rights and to remain vigilant about the use of open-source software.

This briefing considers the rules only from an English law perspective but bear in mind that, while the development and use of AI may not respect national borders, IP rights are territorial with different intellectual property rules applying in different jurisdictions. To give a couple of examples: in the UK there's specific provision for the protection of computer-generated works (see section 3 below), but the same is not true of EU jurisdictions, and the rules on "fair dealing" in the UK are narrower than the equivalent "fair use" exceptions in the US.

IP in AI-generated works

If a work is made with the assistance of AI (i.e. AI is just a tool) but involving human creativity, then it will be protected like any other work: the first owner will be that human creator. With the arrival of ChatGPT and similarly sophisticated AI tools, we're seeing increasing levels of automation, so what happens when the level of human involvement diminishes?

In the UK, we benefit from section 9(3) of the Copyright, Designs and Patents Act 1988 (CDPA), which expressly provides for computer-generated works. It provides that, where a work is generated by a computer in circumstances where there is no human author, the author is "the person by whom the arrangements necessary for the creation of the work are undertaken". Protection is shorter than for other literary works in that it lasts for 50 years from the date the work is made.

While this express provision for computer-generated works helps plug a gap, for the types of generative AI that we're seeing today, it presents some areas of uncertainty:

  • S9(3) only applies where there is no human author, whereas for generative AI, like ChatGPT, humans do have a role to play, whether it's in providing the prompts or in creating the content on which the AI draws.
  • There is also an absence of significant case law on the meaning of undertaking the "arrangements necessary" for creation. Would this catch the original developer or the user of the AI, who may further train the AI and provide prompts? Presumably the analysis would be fact-specific, depending on the level of their respective input. OpenAI seeks to clarify the position in its terms of use, at least as between itself and the user, by transferring to the user OpenAI's rights in content the system generates in response to user prompts.
  • The grey areas do not end there: next comes the "originality" conundrum. For literary, dramatic, musical or artistic works, copyright can only arise if the work is the author's own intellectual creation, which means it must not be copied but must result from the author's free and creative choices and show their personal touch. How do these concepts apply to AI-generated works that have little human involvement? If the developer of an AI system is the author, then they are unlikely to have any direct creative input into the form of the work generated. If the author is the prompter, giving instructions, then they may have more direct influence over the output (depending on the specificity of the prompts). Moreover, if, as we discuss in section 4, the output is little more than a "copy and paste" of training data, then (quite apart from the infringement risk) it's difficult to see how the generated content can be "original" such that copyright subsists in it. These questions are yet to be considered by a court.

What's the Government's position?

Following the 2021 consultation, the Government announced that it had decided not to make changes in relation to the protection of computer-generated works, nor to the duration of protection of such works.

Infringement risk

Scraping copyright-protected material without permission as part of training an AI system may constitute copyright infringement (or, if it's a database in which database rights subsist, infringement of database rights).

Getty v Stability AI

Getty is suing Stability AI in the High Court, alleging that Stability AI trained its AI system Stable Diffusion, which generates images from text prompts, on millions of images, including images in which Getty owns copyright, and that such use infringes Getty's copyright.

Owing to the black-box nature of AI, it is usually difficult to link an AI system's output to particular training data, which makes it harder for rightsholders to enforce their rights. In this instance, however, Getty is able to point to approximations of its watermark on numerous images generated by Stable Diffusion.

The CDPA contains an existing exception permitting computational analysis for non-commercial research. As currently framed, this exception is unlikely to apply to scraping for the purpose of training generative AI: the exception requires lawful access to the material and, as to the non-commercial requirement, there has been a shift towards commercialising foundation models.

What's the Government's position?

The Government announced in July 2022 that it intended to extend this text and data mining (TDM) exception in respect of both copyright and database rights so that it would apply for any purpose (including commercial purposes), with no ability for rightsholders to opt out or impose additional licence fees. The requirement for lawful access would remain, enabling rightsholders to protect their content (for example, they could still choose if and how to make their works available, and could put their content behind a paywall). This would be broader than the EU equivalent. However, there has been a backlash from the creative industries against a TDM exception without an opt-out, and the Government currently looks unlikely to proceed with those proposals.

While the Government's March 2023 White Paper on AI, "A pro-innovation approach to AI regulation", said little about IP rights, the Government has promised a code of practice clarifying the parameters around the use of copyright works as training data and aiming to make licences for data mining more available. The Government has said that the final code of practice will be voluntary but, if it is not adopted or agreement is not reached, legislation could be considered.

Contrast this with the more direct approach recently proposed by the European Parliament in the EU's draft AI Act. It plans to require generative AI providers publicly to disclose "a sufficiently detailed summary" of copyright works used in training. It is not clear how providers, using billions of works to train their models, will be able to comply with this in practice.

The infringement risk isn't confined to training data: it can also arise in relation to the AI's output, if that output substantially copies the material on which the system has been trained. The risk of a system "copying and pasting" training data is potentially higher where the task it is asked to perform is niche and the pool of relevant data on which the system has been trained is small.

Infringement claim relating to AI's output

There's a class action in the US against Microsoft, its subsidiary GitHub and OpenAI in relation to GitHub Copilot, an AI tool that provides auto-complete functionality, suggesting lines of code as a software developer types. The allegation is that, in addition to copyright infringement in the course of training the system, Copilot reproduces verbatim snippets of the code on which it was trained.

In addition, if material used in training is open-source software subject to a "copyleft" licence, this creates a further infringement risk, as "copyleft" licences require that derivative works (which may include the AI's output) are in turn licensed on terms that are no less restrictive.

Other defences against scraping

As well as technical tools that aim to prevent data scraping (for example "Glaze", which applies subtle changes to artwork designed to confuse AI so that it cannot copy the artist's style), another key defence against data being scraped is to ensure that websites include terms and conditions that explicitly prohibit scraping. The difficulty often lies in ensuring that those terms are adequately incorporated into a contract with the scraper. In a case before the Court of Justice of the EU, Ryanair was able to enforce its click-wrap terms expressly prohibiting screen-scraping even though it could not demonstrate that it had copyright or database rights in the data concerned.

Conclusion

There is a recognition that some existing intellectual property concepts (authorship and originality, as well as some of the statutory exclusions and exceptions) are not a comfortable fit for the types of generative AI that we are seeing today, and this gives rise to areas of uncertainty. The Government, taking a similar non-interventionist, wait-and-see approach to the one we saw in its White Paper on the regulatory framework for AI, has so far concluded that there is no cause to change intellectual property legislation, other than in relation to the TDM exception. Even there, its proposals have foundered in the face of resistance from rightsholders. We will monitor developments in relation to the code of practice on copyright and AI, on which the Government started work in June 2023.

While uncertainty over the application of these intellectual property concepts remains, it is all the more important to deal with them fully in contracts (covering ownership, assignment and licensing of IP rights), not only in relation to the AI system itself, but also in relation to the works that it generates.

The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.