This report was published by 451 Research on October 30, 2023
While some of this month's funding rounds were relatively small compared with others tracked this year, a mega round from OpenAI loomed large, with media reports claiming the company is organizing a tender offer whereby employees will sell existing shares to investors at a valuation of $86 billion. To put that in context, the company's previous funding round in April valued the company at $29 billion. We previously pondered why there were no large pure-play AI companies around. Now we have not just one, but many contenders in the form of OpenAI's rival foundation model providers.
OpenAI, Adept AI and Baidu Inc. are launching new multi-modal features for their chatbots. OpenAI's suggestion that ChatGPT can now "see, hear and speak" is explicitly presented as a drive toward artificial general intelligence (AGI) — a contested and theoretical state of AI where models can learn to solve any problem. An aligned trend is the use of generative AI techniques to train robots to address tasks or environments they have not been explicitly trained to address, with Google an early pioneer. Regardless of one's position on the viability of true AGI, the trajectory of generative AI is toward multi-modal models, making them far better able to engage with the physical world.
Product releases and updates
OpenAI launched new voice and image capabilities for premium versions of ChatGPT, enabling users to prompt the model using their voice or by uploading images, in addition to typing text prompts. These multi-modal features enable use cases such as showing ChatGPT the contents of your fridge and having it suggest a meal, or troubleshooting why something doesn't work. The voice feature can create synthetic voices from text and a few seconds of sample speech, turning ChatGPT into a voice-based assistant. OpenAI worked with professional actors on the voices, and says the feature cannot be used to create fake versions of real people speaking words they never said. The image input feature was tested by red teamers, which OpenAI says enabled it to "align on a few key details for responsible usage," although it didn't specify what they were.
Baidu unveiled version 4 of its ERNIE AI chatbot, and CEO Robin Li demonstrated the product at Baidu World 2023. Li claimed ERNIE can do all that OpenAI's GPT-4 product can do and more — in particular, multi-modal capabilities. During the presentation, the bot, which operates mainly in Mandarin Chinese, apparently produced a car commercial from text prompts, created a plot for a martial arts novel and solved complex geometry problems.
French startup Mistral AI has launched its first model, Mistral 7B, under the Apache 2.0 open-source license. It is a 7.3-billion-parameter model that the company claims outperforms Meta Platforms Inc.'s Llama 2 13B on all benchmarks and its Llama 1 34B on many benchmarks. Mistral also unveiled a chatbot based on Mistral 7B called Mistral 7B Instruct, which it fine-tuned using publicly available datasets on Hugging Face, but cautioned that it has no moderation mechanism (no guardrails against misuse) and is meant mainly as an example of how the model could be fine-tuned. Mistral thanked CoreWeave for its "24/7 help in marshalling our cluster," indicating the model was trained on a CoreWeave on-demand GPU cluster.
Google announced Google-Extended, a flag that can ensure a website will be included in search but will not be used by Google to train Bard and Vertex AI generative APIs. The control is seen as a response to a "public discussion" the business kicked off in July around the use of web content to train models. Google suggests it will be releasing more controls for web publishers. These flags are important (Figure 1), as general information search is the most popular generative AI use case with the general public.
Figure 1: General public most likely to use generative AI for information search
Source: 451 Research's VoCUL: Connected Customer, Quantifying the Customer Experience 2023.
Q. Which of the following generative AI use cases, if any, have you used or are potentially interested in using? Please select all that apply.
Base: All respondents that use or plan to use generative AI in the next 12 months (n=2,553).
© 2023 S&P Global.
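In practice, Google-Extended is expressed as a user-agent token in a site's robots.txt file. The sketch below uses Python's standard urllib.robotparser to evaluate a hypothetical robots.txt that opts out of Google-Extended while leaving ordinary crawling open. The file contents, the is_allowed helper and the example.com URL are illustrative assumptions; the parser only shows how the rules read, not how Google applies them internally.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of Google-Extended, allow everything else.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/article"  # placeholder URL
    for agent in ("Google-Extended", "Googlebot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, page))
```

Under these rules the Google-Extended token is disallowed for every path, while the search crawler token falls through to the wildcard entry and remains allowed, which mirrors the "in search but not in training" split described above.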
Google DeepMind announced the creation of general-purpose robotics dataset Open X-Embodiment, a partnership with 33 academic labs. The dataset is being made available for researchers, alongside the RT-1-X model, a robotics transformer model. Google DeepMind is leveraging transformer architectures to move away from task-specific training of robots. For example, its vision-language-action model, RT-2, is driving toward emergent skills, where robots can engage with objects or tasks they have not been specifically trained to address.
New generative AI offerings were announced by Dell Technologies Inc., with an emphasis on on-premises deployment and customer data foundations. "Dell Validated Design for Generative AI" supports the tuning and inferencing of generative AI models by offering hardware and pretrained models, and a new "data lakehouse" for AI workloads is planned for the first half of 2024. Dell Professional Services for Generative AI includes services around implementation, data preparation and educational services.
Graphic design platform Canva announced Magic Studio, a package of its AI tools for images, animations and text. The company announced its first generative AI capabilities in March, suggesting it was bringing together foundation models from OpenAI and Stable Diffusion, as well as developing its own. AI video generator Runway is partnering with Canva to make its Gen-2 model accessible directly in the Magic Media app.
Adept AI announced Fuyu-8B, a small version of the multi-modal model that drives its upcoming product, which the company describes as a "generally intelligent copilot for knowledge workers." Fuyu-8B is being released as open source via Hugging Face. Adept says its Fuyu models are decoder-only transformer models that can handle text and images without a separate image encoder. Adept is planning a product that can analyze images and data charts, and perform other tasks that many knowledge workers face.
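To illustrate how a decoder-only transformer can take images and text in a single input sequence, the sketch below cuts an image into patches, maps each flattened patch to the model's embedding width with one linear projection, and places the results alongside text token embeddings. This is a toy illustration in plain Python, not Adept's implementation: the function names, the tiny 4x4 image, the patch size and the projection weights are all assumptions made for the example.

```python
def patchify(image, patch):
    """Split an H x W grid (list of rows) into flattened patches, row-major."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image dimensions must be multiples of the patch size")
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            patches.append([
                image[r][c]
                for r in range(top, top + patch)
                for c in range(left, left + patch)
            ])
    return patches


def project(vector, weights):
    """Linear projection: one output value per row of the weight matrix."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def build_sequence(image, text_embeddings, weights, patch):
    """Return image-patch embeddings followed by text embeddings."""
    image_tokens = [project(p, weights) for p in patchify(image, patch)]
    return image_tokens + text_embeddings


if __name__ == "__main__":
    image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    weights = [[1, 0, 0, 0], [0, 0, 0, 1]]  # hypothetical 2-dim embedding
    text = [[0.5, 0.5]]                      # one hypothetical text token
    for token in build_sequence(image, text, weights, patch=2):
        print(token)
```

The point of the design is that image patches and text tokens end up as vectors of the same width in one sequence, so a single decoder stack can attend over both.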
Adobe Inc. announced generative AI capabilities at its Adobe MAX conference, including new models and an improved feature set for modifying them. The models included Firefly Image 2 — positioned as offering photorealistic quality and better human rendering. The company also released a model for generating vector graphics and a model for text-to-template design.
Researchers from Stanford, MIT and Princeton announced a transparency scoring system for foundation models called the Foundation Model Transparency Index. It measures 100 different aspects of transparency and assigns a score out of 100. The top-rated model — Meta's Llama 2 — only scored 54%, and OpenAI's GPT-4 only 48%, with AWS' Titan Text model at 12%. Transparency factors considered include model access (true open source ones get 100% on that score), information on training data, the labor needed to create the model, compute usage, energy usage, and mitigations for privacy and copyright, among others. The researchers also produced a 100-page paper detailing their findings.
Funding and M&A
One month after accepting an investment of up to $4 billion from AWS, foundation model provider Anthropic agreed to an investment of up to $2 billion from Alphabet Inc., on top of the stake Alphabet already owned. Google has invested $500 million and is committed to a further $1.5 billion over an unspecified period of time. Google paid $700 million for about 10% of Anthropic back in February, and participated in its series C round in May. Anthropic has raised $7.7 billion in total funding. Although AWS has been designated as Anthropic's "primary cloud provider," none of these minority investments appears to tie Anthropic irrevocably to one cloud provider or another.
Chinese startup Baichuan Intelligence has raised $300 million from investors including Alibaba Group Holding Ltd., Xiaomi Corp. and Tencent Holdings Ltd. This is the second round announced by the AI startup, which was founded in 2023; the first was a $50 million angel round. It has released four open-source LLMs. The most recent, Baichuan2-53B, is a model trained in English and Chinese with 53 billion parameters.
Chinese AI startup Beijing Knowledge Atlas Technology, which does business as ZhipuAI, raised $343 million in a series B round. Investors included Tencent and Hangzhou Alibaba Venture Capital Management. The company develops code- and text-generating technologies that support Chinese and English.
Generally Intelligent, newly renamed as Imbue, announced $12 million in funding from AWS' Alexa Fund and Eric Schmidt, the former CEO of Google and former executive chairman of Alphabet. This brings the company's series B round to $212 million. The company's objective is to build foundation models capable of reasoning, with the first AI agents designed to support software engineering.
Visa Inc. plans to invest $100 million in startups developing generative AI technologies for fintech, payments and commerce use cases. The investment fund will be deployed through Visa Ventures, which expects to primarily cut checks in the several-million-dollar range given its focus on early-stage startups and the formative nature of GenAI in fintech.
Italian synthetic data startup Aindo closed a €6 million series A round, bringing total funding to $9.5 million. The company is one of 11 known specialist tabular synthetic data startups headquartered in EMEA, a fast-growing generative AI market segment.
TabbyML, which has developed an eponymous open-source self-hosted code-generation tool, announced $3.2 million in seed funding. A key feature of the tool is its support for consumer-grade GPUs. Roadmap items for Q4 2023 include extending its use beyond code completion to chatbot capabilities.
AI developer platform Preemo, which does business as Gradient, has received $10 million in seed funding. The ability for customers to build their own private LLMs is an important component of the company's value proposition. The business primarily targets healthcare, finance and legal companies.
Move AI, which develops software that can generate 3D animated objects from photos and videos, announced $10 million in seed funding. Investors in the round include Warner Music Group Corp.
Politics and regulations
The Biden-Harris administration issued an executive order aimed at promoting the US as the world's leading AI regulator while setting constraints on how AI systems are developed. The order calls for developers of the most powerful models to share safety test results with the US government. Companies developing foundation models that pose serious risk to economic or national security, or public health, are instructed to notify the federal government during the training process. It also directs the National Institute of Standards and Technology to set standards for red-team testing. Other provisions in the order include protecting the privacy of American citizens through various internal government actions. President Biden called on Congress to pass federal privacy legislation to augment these efforts. This appears unlikely in the near future: the American Data Privacy and Protection Act became the first federal online privacy bill to pass committee, in July 2022, but it did not receive a floor vote in Congress. The order also promises a report on how AI will affect labor markets and how the government could mitigate any negative effects, as well as provisions to ensure the government's own responsible use of AI.
Fumio Kishida, the Prime Minister of Japan, announced that G7 leaders are likely to establish international guiding principles and a code of conduct around AI by year-end. The statement came as European Commission VP for Values and Transparency Vera Jourova suggested that the EU and Japan are deepening cooperation around AI and chips.
China announced its Global AI Governance Initiative, which calls for a robust testing and assessment system to evaluate AI risk level, the development of AI governance frameworks and norms based on widespread agreement, and a new international body to oversee AI. These proposals are broadly in line with similar messages from the US, EU and UK. One key point of difference is its call to "oppose drawing ideological lines or forming exclusive groups to obstruct other countries from developing AI," which could be seen as a direct reference to the US Chips Act and the restrictions placed on companies selling technology to China. It is not clear whether any other countries have signed up to the initiative yet.
The tentative agreement between the Writers Guild of America and the Alliance of Motion Picture and Television Producers, which still must be ratified by WGA members, includes a section on AI specifying that material written or re-written by AI can't be considered "source material" and so can't be used to undermine a writer's credit or separated rights. Writers can use generative AI in their work with their employer's consent, but they cannot be forced to use it by employers, who also must tell writers if any material provided to them includes any AI-generated material.
Kansas and California announced new statewide generative AI guidelines. While 10 states have regulations related to AI set to go into effect this year, only a few have policies around generative AI, including New York, Pennsylvania and Rhode Island. Kansas' outlined policy requires any outputs produced from GenAI to be reviewed by human operators for accuracy and privacy before being disseminated, and to not be solely relied on when making decisions. By March 2024, California plans to have infrastructure ready to conduct GenAI pilots, namely around improving citizen experience with government services and supporting state employees.
Universal Music Group (UMG) and other publishers have sued Anthropic, alleging that it distributes copyrighted lyrics through its Claude 2 LLM. The complaint states that when a user prompts Claude 2 to provide the lyrics to certain songs, it responds with some or all of the lyrics. That may seem logical, but UMG and the others point out that websites that do that today license those lyrics from the publishers, whereas Anthropic does not. By using the lyrics as an input to train its models and as an output in response to users' prompts, Anthropic is directly infringing the publishers' exclusive rights as copyright holders, the complaint alleges.
The BBC set out its principles for using generative AI ahead of launching numerous projects. The first is that it will always act in the best interests of the public, meaning delivering value while strengthening its mission as a public service broadcaster. Second, it will prioritize talent and creativity, and will "always consider the rights of artists and rights holders," when it does use the technology. Third, the BBC will be open and transparent in its use of GenAI, will always point out when it has been used, and will never rely on it solely to generate content. The BBC has blocked crawlers such as Common Crawl and OpenAI's web crawler from accessing its websites and scraping data from them to train models.