This report was published by 451 Research on January 2, 2024
Introduction
After November's wild week at OpenAI, tensions between those who believe in accelerating AI development and those who want to apply the brakes, and between advocates of open source and those who supposedly favor closed proprietary models, reached the awareness of a bemused public. The dust now appears to be settling. With Sam Altman's reinstatement, onlookers need not read tea leaves to see that the accelerators are in the driving seat. AWS, Google and Microsoft Corp. made notable announcements, and a number of startups announced sizeable funding rounds. The open/closed debate appears set to continue, with an AI alliance forming around a more open ecosystem and new open-source models released over the past few weeks.
The Take
Many of December's announcements dovetail with trends that emerged early in 2023. With AWS following Google and Microsoft in prioritizing generative AI and rapidly rolling out announcements during its re:Invent conference keynote, the hyperscalers are set to play key roles in this space. The increasing alignment of these three companies around an integrated development stack for generative AI, supporting an array of models and providers, underlines the strength of that strategy. Similarly, the attention received by AMD Inc.'s AI accelerator announcements illustrates the appetite for competition in datacenter AI accelerators, a critical layer of the generative AI stack.
December also showcased emerging features that set the stage for what we are likely to see next year. One such trend is the shift from generating single content types toward multi-step workflow automation; an example is the integration between AWS Step Functions and Amazon Bedrock that Amazon showcased at re:Invent to orchestrate task chains. The erosion of barriers between generative media, with many technology vendors pushing multimodal capabilities, is illustrated by the ever-broadening ambitions of former generative-image specialist Stability AI in its December product updates.
Product releases and updates
AWS announced a wide array of generative AI features and products at AWS re:Invent, from new chips to a new generative AI assistant, Amazon Q. While much of the media attention surrounded Amazon Q, a seeming competitor to Microsoft's Copilot offering, many other updates were announced for AI development platforms Amazon Bedrock and Amazon SageMaker.
AMD launched its AI accelerator, the MI300X, at the company's "Advancing AI" event. The accelerator, which pairs eight 12-Hi stacks of HBM3 memory with 304 AMD CDNA 3 architecture compute units, was first previewed in January 2023. Explicitly positioned as a generative AI accelerator, the MI300X is claimed by AMD to outperform NVIDIA Corp.'s H100 for inference and to compete in model training. NVIDIA disputes these claims, but organizations clearly want greater competition at a critical layer of the generative AI stack.
Google announced the launch of its new AI model, Gemini, a family of multimodal large language models developed by Google DeepMind. Gemini is said to be natively multimodal, meaning it was pretrained on text, images, video, code and other formats, and can process and reason across these inputs. Google made strong performance claims for the most performant version of Gemini, claiming it beats OpenAI's GPT-4 in 30 of 32 academic benchmarks widely used in large language model (LLM) research. Gemini is being released in three sizes: Gemini Nano, Gemini Pro and Gemini Ultra. Gemini Nano is designed for on-device applications and will be available on Google Pixel phones, starting with the Pixel 8 Pro. Gemini Pro powers Google's Bard AI chatbot and is available now in Google Cloud's Vertex AI and AI Studio. Gemini Ultra, described as the most capable model, is set for release in Q1 2024 after completing its current phase of testing.
In terms of benchmarked performance, Gemini Ultra scored 90.0% on the Massive Multitask Language Understanding (MMLU) benchmark, which measures knowledge across 57 subjects, beating both the human expert level of 89.8% and GPT-4's 86.4%. On the newer Massive Multi-discipline Multimodal Understanding benchmark, a set of questions about images across six disciplines that require college-level knowledge to solve, Gemini scored 59.4% versus GPT-4's 56.8%; Google pointed out that it did not use an external OCR engine to extract text from images for further processing.
Stability AI made a number of announcements in December, including a major monetization move in the form of a "Stability AI Membership" model. A no-cost tier continues to exist for personal use and research, but two paid tiers have been added: Professional, which allows self-hosted and commercial use as well as access to a Discord community, and Enterprise, for which the company promises new enterprise features and custom billing. New models introduced in December include Stable Zero123, for 3D object generation from single images; Stable LM Zephyr 3B, a smaller version of the Stable LM language model; generative video model Stable Video Diffusion; and "real-time text-to-image" model SDXL Turbo.
IBM Corp., Oracle Corp., AMD, Dell Technologies Inc., Intel Corp., and Meta Platforms Inc. were among the names announced as part of a new AI Alliance. The focus of the alliance appears to be openness, with six priorities ranging from benchmarking and standards to fostering a "vibrant" AI hardware accelerator ecosystem. Other founding members of the AI Alliance include Cerebras, Hugging Face, Red Hat, ServiceNow Inc., Stability AI and Sony Group Corp.
Apple Inc. released MLX, a machine-learning framework designed for Apple silicon. The company notes that MLX is inspired by the popular frameworks PyTorch, JAX and ArrayFire, and LLM inference is an area explicitly outlined in the documentation. The framework suggests Apple is looking to ensure generative AI models can run effectively on its hardware. A research paper on digital avatar generation, published on Apple's Machine Learning Research page a few days after the MLX announcement, indicates the company is also making generative AI software investments.
Microsoft Research announced more details about small language model Phi-2 and its performance. The model, at 2.7 billion parameters, is much smaller than the language models commonly applied to such tasks; for comparison, Llama 2 comes in 7 billion-, 13 billion- and 70 billion-parameter sizes. Microsoft claims Phi-2 can "match or outperform models up to 25 times larger" and that it outperforms Google's Gemini Nano 2 model. The model is now available in the Azure AI Studio model catalog.
Funding and M&A
Paris-based GenAI startup Mistral AI closed its series A round, raising €385 million at a reported $2 billion valuation. The round was led by Andreessen Horowitz, with participation from original investor Lightspeed Venture Partners, BNP Paribas, General Catalyst, Salesforce Inc. and NVIDIA. The company raised a €105 million seed round in May, led by Lightspeed. It also launched its Mixtral 8x7B LLM under the Apache 2.0 open-source license, the same license it used for its 7B model. Mistral describes Mixtral as its "small" model and 7B as its "tiny" one.
A filing with the US securities regulator suggests X.AI Corp. is seeking to raise up to $1 billion in an equity offering. As well as indicating the ambitions of Elon Musk's AI startup, the filing suggests the company has already secured just under $135 million in equity financing.
Liquid AI emerged from stealth in early December with $37.6 million in seed capital. The company's four founders were researchers at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). The startup claims to be developing a "new generation of foundation models" that are more efficient and environmentally responsible.
Together Computer, a California-headquartered startup that does business as Together AI, announced a $102.5 million series A round. The company, which pitches itself as the fastest cloud platform for generative AI development, was incorporated in July 2022 and completed a $20 million seed round in May 2023. The round, led by Kleiner Perkins Caufield & Byers, was joined by a number of new investors, including NVIDIA. Together AI draws attention to its NVIDIA H100 and A100 GPU training clusters as part of its training offering.
Diffblue Limited, a UK-headquartered Java code-generation startup, announced strategic investment from Citi Institutional Strategic Investments. Diffblue positions its differentiation around autonomous operations, suggesting its tool can write high volumes of Java unit test code without human intervention. Few details have been released about the transaction.
HeyGen, which generates avatars and voices to produce videos at scale, announced a $5.6 million venture round, bringing the company to $14.6 million in total disclosed funding and a post-money valuation of $75 million. The transaction was led by new investor Conviction Partners.
Image generator Leonardo Interactive received AUD 47 million ($31 million) in a series A round. The company, incorporated in May 2022, has a general-purpose image and 3D texture generation offering.
Indian startup Sarvam AI raised a $41 million joint seed and series A round to build Indian language-focused LLMs. Lightspeed led the series A round and co-led the seed with Peak XV Partners. Peak XV and Khosla Ventures also participated in the series A funding.
Yseop, a startup whose offerings include generative AI capabilities for life sciences and finance, announced the close of an investment round in early December. Few details were available at the time of writing beyond a note that Novartis was a new investor.
Politics and regulations
The final version of the EU AI Act was agreed upon by European Parliament negotiators and the Council presidency on Dec. 9, 2023. The Act is expected to be adopted in early 2024, followed by a transition period of at least 18 months before it becomes fully enforceable across the 27 member countries. The fraught 36 hours of final negotiations centered on open source, foundation models, governance, and the use of AI by law enforcement and for military purposes. The result is an act banning a set of applications deemed to pose an unacceptable risk, including social scoring, systems that exploit vulnerabilities, and bulk scraping of facial images. Applications deemed high risk (e.g., AI in education) or limited risk (e.g., chatbots) have different levels of obligations imposed on them.
The GenAI explosion generated plenty of debate during 2023, and pressure from European foundation model providers, with the outcome being requirements to comply with European copyright law and to publish details about the training data used, along with other technical documentation. The law applies to anyone making or selling AI systems into the EU, regardless of where the companies involved are headquartered. Fines for noncompliance are calculated as a percentage of global revenue; the highest tier, applying to banned AI applications, is €35 million or 7% of global turnover, whichever is higher.
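The top-tier penalty rule described above is a simple maximum of a fixed floor and a revenue share. The short sketch below illustrates how that cap plays out; the function name and the turnover figures are hypothetical, chosen purely for illustration.

```python
def max_fine_banned_use(global_turnover_eur: float) -> float:
    """Illustrative EU AI Act top-tier penalty for banned applications:
    EUR 35 million or 7% of global annual turnover, whichever is higher.
    (Hypothetical helper; turnover figures below are made up.)"""
    return max(35_000_000.0, 0.07 * global_turnover_eur)

# A company with EUR 1B in turnover: 7% (EUR 70M) exceeds the EUR 35M floor.
print(max_fine_banned_use(1_000_000_000))  # 70000000.0
# A smaller firm with EUR 100M in turnover falls back to the EUR 35M floor.
print(max_fine_banned_use(100_000_000))  # 35000000.0
```

The "whichever is higher" structure means the fixed floor bites only for companies whose global turnover is below €500 million; above that, the 7% share dominates.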
The UK Competition and Markets Authority (CMA) has launched an investigation into the partnership between Microsoft and OpenAI to determine whether it "has resulted in an acquisition of control — that is, where it results in one party having material influence, de facto control or more than 50% of the voting rights over another entity — or change in the nature of control by one entity over another." Microsoft is believed to own 49% of OpenAI's capped-profit arm (as opposed to its nonprofit parent), and OpenAI's generative AI work is done exclusively on Microsoft's Azure cloud platform as part of the partnership. All interested parties are invited to comment until Jan. 3, 2024. The CMA has also promised an update to its research into foundation models, having published its initial report in September 2023.
OpenAI reached an agreement with publisher Axel Springer that may start to set a road map for how technology and media companies navigate conflict around generative AI. OpenAI has compensated Axel Springer to incorporate content from its titles, which include Business Insider and Politico, into its model training set, and agreements with other publishers are reportedly set to follow. The tension between media and AI companies is pronounced: publisher Helena World Chronicle filed a class-action lawsuit over the past few weeks, alleging that Google's AI investments are worsening the "siphoning off" of press content by the technology company. The suit identifies Google Bard as a technology that included news content in its training set, part of a trend it says will "discourage end users from visiting" press websites.