Zoryana Matyushenko's blog: Who owns AI developments and data for model training?

Zoryana Matyushenko, IT lawyer at Legal IT Group.

AI is rapidly gaining popularity, provoking admiration and concern in equal measure, as it makes it much easier to produce an illustration, article, or promotional text on request.

However, if we think one step further, who is the author of the works created by AI? Is it legal to use the copyrighted works of others to train AI models? I propose to consider these and other issues in this article.

Rights to objects created by AI

At first glance, ChatGPT, Midjourney, and DALL-E seem to be great helpers for content creators, copywriters, developers, and many others. Nevertheless, once you start working with them, the question arises: can the resulting material really be considered your own?

As is often the case in the legal world, there is no definitive answer, because it depends, as they say. Technology is moving much faster than legislation, so not all issues are clearly regulated. However, some regulation does exist, so let’s consider it.

ChatGPT itself says that in most cases, the authorship of the output belongs to the developers of the algorithms and technologies used for its operation. However, if the input materials were provided by the user, he or she may have a copyright claim to the created content. In addition, different countries have different legal rules and requirements regarding authorship, so this may also affect who is the author of the generated content.

And although ChatGPT often resorts to so-called “hallucinations”, when it invents information that does not exist, in this case it is close to the truth.

In the United States, the U.S. Copyright Office is already receiving applications for registration of works that include AI-generated content. On 16 March 2023, the Copyright Office issued the guideline “Copyright Registration Guidance: Works Containing Material Generated by Artificial Intelligence”.

According to the Guidance, the key criterion for deciding whether to register AI-generated content is the presence of “human authorship”. Thus, when assessing works containing AI-generated content, the Copyright Office will consider whether the work is essentially one of human authorship, with the computer serving only as an auxiliary tool. In other words, the question is whether the AI’s contributions are the result of mechanical reproduction or of the author’s own original mental conception. Material created by AI and substantially modified by a human afterwards can also be protected by copyright. If human participation is limited to a description or a “prompt” for the AI, the result will not be subject to registration.

As you might have guessed, there are no universal criteria for determining the degree of such creative human participation. The decision is made individually in each situation.

It should also be borne in mind that ChatGPT does not indicate the sources from which it takes its text, so one cannot be sure that the model is using them legally. There is therefore a risk that AI-generated text is plagiarised or otherwise violates the rights of third parties.

In addition, no one is immune from similarity between the content generated at your request and the content generated for someone else. While counting on uniqueness and a “simpler life”, users in different parts of the world may end up with five identical or very similar pieces of generated content. Keep in mind, too, that AI-generated content is not necessarily accurate, owing to the aforementioned “hallucinations”.

In Ukrainian law, we observe the same principle, according to which only a human can be an author and will have rights to content only if he or she has contributed to its creation.

British law defines the author of a computer-generated work as the person by whom the arrangements necessary for the creation of the work are undertaken (Section 9(3) of the Copyright, Designs and Patents Act 1988 (CDPA)). The CDPA also defines a computer-generated work as one generated by a computer in circumstances such that there is no human author. Thus, in the United Kingdom, the absence of a human author does not prevent a computer-generated work from being protected by copyright. This is a rather specific regulation compared to global trends.

There is no “one size fits all”, and therefore in each situation authorship will be determined individually, depending on the circumstances and applicable law. However, the key principle is that the degree of creative involvement of the user in the creation of the AI work is crucial in determining authorship.

Rights to data for training AI models

Another aspect worthy of attention in relation to AI is the use of third-party copyrighted objects to train AI models.

Recently, artists Sarah Andersen, Kelly McKernan, and Karla Ortiz filed a lawsuit in the US against the creators of the AI-based art generators Stable Diffusion, Midjourney, and DreamUp. They claim that the organisations violated the rights of millions of artists by training their AI models on five billion images collected from the Internet without the consent of the original artists.

In addition, the stock photo platform Getty Images filed a lawsuit against Stability AI for copyright infringement in AI training.

Potentially, infringement claims can be directed not only against developers of AI programs but also against anyone who tries to use the results generated by them in commercial activities.

In 2019, in a comment submitted to the United States Patent and Trademark Office, OpenAI stated that the use of machine learning algorithms that analyse copyrighted data for training should fall under fair use. The company’s representatives also stressed that the opposite position would have “catastrophic consequences”, “could lead to the transfer of innovations to a foreign jurisdiction” and “seriously hinder creative research in the field of AI”. In fact, fair use is the main argument of companies developing AI products in favour of using third-party copyrighted works to train their programs.

Indeed, the US has a fair use doctrine that allows the use of a copyrighted work without the permission of the copyright owner for purposes such as criticism, comment, news reporting, teaching, scholarship, or research.

However, the line between the lawful use of third-party works under the fair use doctrine and copyright infringement is rather thin and blurred, and therefore the position of different parties may be subject to counter-arguments. There are many complicating factors in cases involving the fair use of third-party works by AI, including the location of the company (legal jurisdiction) and the purpose of such use. The amount of protected material used is also important. That is why only a court, after a comprehensive review of a particular case, will be able to conclude whether the use of works copyrighted by third parties for training AI models can be considered fair use.

If we compare such use by non-profit and for-profit companies, non-profit companies are more likely to be able to argue that their case does indeed fall under fair use. A commercial purpose significantly reduces the chances of protection under the fair use doctrine. But again, this is decided on a case-by-case basis 🙂

At the same time, the EU has the Directive of the European Parliament and of the Council on Copyright and Related Rights in the Digital Single Market (Directive (EU) 2019/790), Article 4 of which allows the reproduction and extraction of lawfully accessible works and other subject matter for the purposes of text and data mining. Thus, this provision can be used to defend such use of third-party works for training AI models in the EU.

However, one should not relax here either. Firstly, court practice may interpret this provision differently and create new precedents. Secondly, the AI Act, which has not yet been adopted but is being actively discussed, contains a provision that would oblige developers of generative AI to disclose any copyrighted material used for AI development, including training. This, in turn, can radically change the situation.

The other side of the claims against AI art generators can be seen in a class action lawsuit against Microsoft, GitHub, and OpenAI. The lawsuit claims that these companies violated copyright law by knowingly using protected open-source code to train their AI. What distinguishes this lawsuit from others is that it does not contain direct allegations of copyright infringement, but relies on a provision of the Digital Millennium Copyright Act that prohibits the removal of “copyright management information” from a work, which may include information about authorship and licence.

Case law in this area is only beginning to take shape, so it is difficult to predict what future court decisions will be. It can be assumed that the first decisions will set the trend for similar cases. At the same time, each case will be considered on its individual circumstances, and therefore decisions may be made without reference to earlier ones in similar cases.

The use of artificial intelligence in law

Lawyers are also finding applications for artificial intelligence in their work. First of all, AI can help with legal research. If you need to find information in a completely unfamiliar area, you can turn to AI for navigation. Of course, you need to verify what it suggests and should not expect all the work to be done for you 🙂

AI can process large amounts of information extremely quickly, which can be useful when performing certain time-consuming tasks. In addition, AI tools can be useful in automating document management and billing, and even preparing document drafts.

However, you should be careful with the resources used and keep in mind data security.

The use of AI in the legal industry.
Source: https://technative.io/how-legal-ai-became-more-accurate-than-lawyers/

Conclusion

To summarise, generated content, although it carries risks, is certainly a very progressive tool. Given the rapid development of AI technologies, we can expect more detailed legal regulation and court practice to emerge in the near future.

The boundary for determining the authorship of such content is rather blurred, and there is a possibility of infringement of third parties’ copyrights at the stage of creating AI art, text or other output. Therefore, AI should be used responsibly to ensure that it is truly effective.

For example, depending on the situation and applicable jurisdiction, content created by AI may belong to:

the developer or owner of the AI;
the user of the application;
the developer/owner of the AI and the user jointly, the developer/owner of the AI and a third party jointly, or some other combination of these three, depending on who contributed to the creation of the final work.

In addition, the generated content may not only contain part of a third party’s creative work, but also infringe that party’s copyright in it.

The legality of using third-party copyrighted works without the consent of the authors to train AI models also remains controversial. The US has a fair use doctrine, which, however, does not establish one clear rule for all cases. A set of circumstances, such as the purpose of use, jurisdiction, volume and nature of the content, and the share of third-party material in the final results, therefore has to be assessed together.

It will not be possible to transfer all work tasks to artificial intelligence, but it can certainly make life easier. In the future, AI may become an everyday tool that improves work efficiency. Of course, it’s important to keep security and privacy in mind and take a responsible approach to interacting with artificial intelligence.

In the EU, regulation is currently more lenient on the authorship of AI-generated works, but active work is underway on the new AI Act, which may become a new global standard in this area. Therefore, we keep our finger on the pulse and follow the dynamic development of legislation.

Source: https://serpstat.com/uk/blog/komu-nalezhat-napratcyuvannya-shtuchnogo-intelektu/.
