The Complex Definition of “Open Source” in the AI Industry: Meta’s Llama 3 Models and the Controversy Over the Term

AI models have been making headlines recently, with Meta releasing the latest in its Llama series of generative AI models. The Llama 3 8B and Llama 3 70B models are capable of analyzing and writing text, and while Meta describes them as “open source,” the license imposes restrictions on how they can be used. This raises questions about what “open source” actually means in the AI industry.

A study conducted by researchers at Carnegie Mellon, the AI Now Institute, and the Signal Foundation found that many AI models labeled “open source” come with significant limitations. The data required to train the models is often kept secret, the compute power needed to run them is beyond the reach of many developers, and the labor required to fine-tune them is prohibitively expensive.

One unresolved question is whether copyright can be applied to the various components of an AI project, particularly a model’s inner scaffolding. Open source was designed to allow developers to study and modify code without restrictions, but with AI, it becomes unclear which ingredients are necessary for studying and modifying.

The Carnegie Mellon study highlights the harm caused by tech giants co-opting the phrase “open source.” While these so-called “open source” AI projects generate buzz and benefit the maintainers, the open source community receives only marginal benefits, if any. This entrenches and expands centralized power rather than democratizing AI.

In other recent AI news, Meta upgraded its AI chatbot, Meta AI, across its social platforms with a Llama 3-powered backend. It also introduced new features such as faster image generation and access to web search results. Additionally, social media service Snap plans to add watermarks to AI-generated images on its platform to combat misuse.

Hyundai-owned robotics company Boston Dynamics unveiled its next-generation humanoid robot, Atlas, which is all-electric and has a friendlier appearance compared to its predecessor. Amnon Shashua, founder of Mobileye, launched a new startup called Menteebot focused on building bipedal robotics systems.

Reddit is working on an AI-powered language translation feature to reach a more global audience. LinkedIn is testing a LinkedIn Premium Company Page subscription that includes AI-generated content and tools to grow follower counts. Google’s parent company, Alphabet, introduced Project Bellwether, which aims to use AI tools to identify natural disasters quickly. Ofcom, the regulator responsible for enforcing the U.K.’s Online Safety Act, plans to explore how AI can proactively detect and remove illegal content to protect children.

OpenAI is expanding to Japan and plans to open a Tokyo office while developing a GPT-4 model optimized for the Japanese language.

In terms of machine learning research, Swiss researchers found that chatbots armed with personal information can be more persuasive in debates than humans given the same information. This raises concerns about the potential misuse of large language models to influence elections. Separately, Stanford’s Christopher Manning, a prominent figure in language research, was awarded the John von Neumann Medal for his contributions to the field.

Stuart Russell and Michael Cohen discuss the challenges of keeping AI models from causing harm in an interview. They argue that advanced AIs capable of long-term planning may be impossible to test effectively, as they might find ways to circumvent or subvert the testing process itself. They suggest restrictions on hardware as one possible safeguard, even as ever more powerful computing systems continue to be introduced into AI research.

On that front, Los Alamos National Laboratory unveiled Venado, a supercomputer designed for AI research, while Sandia National Laboratories received Hala Point, believed to be the largest neuromorphic (brain-inspired) computing system in the world. These systems aim to explore new brain-like approaches to computation and improve the efficiency of AI algorithms.

Overall, the AI industry is still grappling with how to define open source in the context of AI models. Companies like Meta face criticism for co-opting the term without delivering genuine open source benefits to the developer community. As AI continues to advance, clearer definitions, regulation, and ethical scrutiny will be needed to ensure its responsible and beneficial use.