This report was published by 451 Research on August 25, 2023
It is summertime in the northern hemisphere, traditionally the slowest part of the year for announcements and updates. Yet with Amazon Web Services announcing a number of updates at the AWS Summit in New York, OpenAI making its first acquisition and Google releasing a vision-to-action model, recent weeks have continued the steady stream of generative AI activity. To keep pace, these digests are supplemented with reports on vendor product releases, available as part of our Market Insight service.
The Take
Apple Inc. is one of the few major technology companies without a clear story to tell around generative AI. This appears set to change, but if, as appears likely, the strategy involves running generative AI models on mobile devices, the challenge for the business will be addressing their compute and storage requirements. A new theme within Google's more expansive generative AI story relates to robotics, extending transformer architecture and extensive multimodal training to vision-to-action models. As the umbrella of generative AI grows, the narratives underpinning generative AI strategies have greater opportunity to diverge. With so many early vendors offering similar supposed competitive differentiators (e.g., the ability to take a proprietary or open-source foundation model and fine-tune it on a corporation's own data, with assurances that none of that data is used to train the original model), differentiation around capability, be that edge deployment or integration with robotics, may prove valuable.
Product releases and updates
Google announced Robotics Transformer 2 (RT-2), a vision-to-action model. Trained on text and images, the model can direct robots to perform tasks they have not been specifically trained for. DeepMind debuted RT-1 in 2022, training that model on 130,000 demonstrations. The shift with RT-2 is the addition of image and text datasets from the web, which Google claims almost doubled the efficacy of robots faced with unseen scenarios in trials.
AWS announced Bedrock Agents, a managed service to accelerate the development of conversational agents based on foundation models that can break down complex tasks into simple steps, collect additional information if needed and then complete a task. AWS also announced AWS Entity Resolution to match and link records from multiple applications, channels and data sources. Entity Resolution is generally available; Bedrock Agents is in preview.
Apple's generative AI investments have drawn interest, with a chatbot built on a proprietary large language model (LLM) framework reportedly in use internally. A statement from Tim Cook to Reuters suggests that generative AI has been an area of active investment for the business. The Financial Times, based on an analysis of job ads, suggests that the company is looking to compress language models so that they can run efficiently on mobile devices.
Developer online community company Stack Overflow announced OverflowAI in late July. Rather than a specific product, OverflowAI appears to be a collection of initiatives, including upgrading search functionality and providing generative responses. For the company's enterprise offering, Stack Overflow for Teams, enterprise knowledge ingestion is being introduced to personalize and refine insight.
IBM Corp. announced the release of a geospatial foundation model co-developed with NASA on the Hugging Face platform. IBM says it is the first open-source foundation model NASA has collaborated on, and that it can analyze geospatial data up to four times faster than current deep learning models, with half as much labeled data. IBM fine-tuned the model, enabling users to accurately map historical US floods and wildfires, which could help predict future areas at risk. IBM says the model could be refined further to undertake tasks such as monitoring deforestation, forecasting crop yields, and identifying and tracking greenhouse gas emissions. The model uses a portion of NASA's Harmonized Landsat Sentinel-2 dataset, which captures a comprehensive view of Earth every two to three days. IBM plans to make a commercial version of the model available later this year as part of its IBM Environmental Intelligence Suite. Separately, IBM will host Meta Platforms Inc.'s Llama 2-chat 70-billion-parameter model on its watsonx platform.
VMware Inc. and NVIDIA Corp. — celebrating a decade as partners — announced VMware Private AI Foundation with NVIDIA, an integrated stack with accelerators from NVIDIA running on top of VMware Cloud Foundation. The two are aiming it at enterprises that want accelerated IT infrastructure to build and train their own AI models. The offering, which will be sold by VMware's sales force, includes NVIDIA's AI Enterprise software platform, as well as its BlueField-3 data processing unit and L40S graphics processing unit (GPU).
Plurilock Security Inc. announced PromptGuard, which will be integrated into the Plurilock AI platform. The offering is designed as a cloud access security broker to ensure that sensitive data is not being released to third-party AI systems. The capability is available in early access to invited beta-testing organizations.
South Korean startup Upstage claims to have the "world's best LLM." The model reached a score of 72.3 on the Hugging Face Open LLM Leaderboard at the time of launch in early August. The leaderboard ranks open-source LLMs based on four key benchmarks. Upstage's model was still in the top three at the time of writing this digest, a few weeks later.
Tokyo-based Sakana AI was founded by Llion Jones, one of the co-authors on the 2017 research paper "Attention Is All You Need," which outlined the architecture of transformers used in large language models. He left Google in July (the last of the eight co-authors of the paper to leave Google) and is joined at Sakana by David Ha, previously the head of research at Stability AI and head of research at Google Japan. The company plans to take a novel approach to generative AI by building models that mimic the way systems in nature are formed, such as beehives and schools of fish, to make models that are small, flexible and able to collaborate like a swarm.
Observability and application security vendor Dynatrace announced Davis CoPilot in late July. Davis CoPilot supports observability workflows using natural language and code-generation capabilities. The business positions the release as part of the first hypermodal AI for unified observability and security, in that generative AI is combined with the company's "causal AI" and "predictive AI" technologies. Core to the tool's hypermodality is support for auto-prompting, where prompts are pre-populated with causal or predictive insight, improving the quality of responses beyond standard user inputs. Davis CoPilot can be used to generate suggested remediation actions for users, for example, or to support data queries and the development of dashboards in natural language.
Meta unveiled SeamlessM4T, a new multimodal model that can translate and transcribe across almost 100 languages using inputs of both speech and text. The model can output speech-to-text, speech-to-speech, text-to-speech and text-to-text translations. The open-source model was trained on four million hours of publicly available text and speech from the web. The speech encoder is a new self-supervised encoder called w2v-BERT 2.0, while the text encoder is based on Meta's No Language Left Behind text-to-text translation model unveiled in 2022.
Funding and M&A
Anthropic, not long after its $450 million funding in May, announced another $100 million in funding from returning investor SK Telecom Co. Ltd., bringing its total raised to $1.66 billion. As part of the announcement, SK Telecom noted that the companies were partnering to create a multilingual large language model designed for telecom operators, and that the model would be shared with the Global Telco AI Alliance launched in July. SK Telecom is a founding member of this alliance, alongside Deutsche Telekom AG, Singapore Telecommunications Ltd. and United Arab Emirates telecom operator e&.
DynamoFL, which targets privacy-preserving and regulation-compliant generative AI, raised $15.1 million in a series A funding round co-led by new investor Canapi Ventures and returning investor Nexus Venture Partners. The company, founded in 2021, sees opportunity in addressing privacy-critical datasets with a federated learning model.
Canadian AI chip developer Tenstorrent raised $100 million in a convertible note funding round led by Hyundai Motor Group and Samsung Catalyst Fund, the venture arm of Samsung Electronics Co. Ltd. Other participants included Fidelity Ventures, Eclipse Ventures, Epiq Capital and Maverick Capital. The round brings the total raised by seven-year-old Tenstorrent to $342 million, according to S&P Capital IQ. Tenstorrent builds low-powered chips for AI workloads based on the RISC-V open-source instruction set. Jim Keller has led the company since January 2023, having joined as CTO in December 2020. Keller's long career includes leading Tesla Inc.'s Autopilot chip platform, helping turn around Advanced Micro Devices Inc.'s processor business and creating and leading what became Apple's custom processor business, as well as stints at Broadcom Inc., PA Semiconductor and Digital Equipment Corp., starting in the 1980s.
CoreWeave, a cloud provider specializing in AI acceleration, has raised $2.3 billion in a nonconvertible debt financing facility led by Magnetar Capital Partners, a returning investor, and Blackstone Tactical Opportunities Advisors, a new one. CoreWeave claims its GPU Cloud offering has the broadest selection of high-end NVIDIA GPUs. The company suggests part of the funding will be used to open new datacenters. As Figure 1 illustrates, the opportunity for AI accelerator providers is sizeable — 37% of respondents to 451 Research's AI/machine-learning infrastructure survey see accelerators in the cloud as a requirement to improve workload performance.
Figure 1: AI infrastructure requirements to improve performance wide-ranging
Source: 451 Research's Voice of the Enterprise: AI & Machine Learning, Infrastructure 2023.
Q. Which infrastructure resources does your organization need in order to improve the performance of its AI/ML production workloads? Select all that apply. Base: All respondents, abbreviated fielding (n=434).
Q. Which of the following infrastructure resources is most essential to improve the performance of your organization's AI/ML production workloads? Base: Organizations that need infrastructure resources in order to improve the performance of AI/ML production workloads, abbreviated fielding (n=425).
© 2023 S&P Global.
OpenAI made its first acquisition, the "acquihire" of the team from Global Illumination, a startup that had been working on Biomes, an online games platform, among other things. The company was founded by Thomas Dimson, Taylor Gordon and Joey Flynn, and the entire Global Illumination team is joining OpenAI to work on its products, including ChatGPT. Dimson wrote the original content-ranking algorithm at Instagram and stayed at its parent Meta until 2020 before starting Global Illumination.
Theai, better known as Inworld AI, announced a $50 million funding round in early August. The company supports the development of nonplayer characters for video games, with generative AI used to create more realistic conversations. The round was led by Lightspeed Venture Partners. Other known investors included Stanford University and Samsung NEXT. Returning investors included M12, First Spark Ventures and LG Technology Ventures.
Politics and regulations
A failure to secure regulatory approval in China has led to the abandonment of Intel Corp.'s proposed $5.4 billion acquisition of Tower Semiconductor Ltd. The Chinese competition regulator, the State Administration for Market Regulation, may have been influenced by the challenges Chinese companies are facing in accessing chips due to US export controls. While Tower Semiconductor is listed in Israel, the acquisition had to be assessed by Chinese regulatory bodies under rules requiring that deal participants generating more than $55 million in revenue in China file the deal for approval.
The Australian Medical Association (AMA) released a policy statement on the use of AI for decision-making and large language models in healthcare. The AMA's concerns around large language models center on the privacy of patient and practitioner data. The inclusion of LLMs is likely a reflection of headlines generated in Australia after the CEO of Perth's South Metropolitan Health Service asked staff not to use ChatGPT to generate medical records, following reports that one doctor had used it to develop a discharge summary.
The US Federal Election Commission unanimously voted in August to advance a petition seeking to regulate political ads that use deepfakes. In June, a number of articles appeared in the US press suggesting that Governor Ron DeSantis of Florida had spread three false, AI-generated images of his political opponent Donald Trump.
The new regulation for generative AI in China, outlined in our previous digest, took effect on August 15. The Cyberspace Administration of China is one of seven ministries with oversight of adherence to the 24 announced guidelines. Generative AI technology providers will need to have their products registered, and undergo a security assessment if they are public facing.