Meta and Qualcomm team up to run big A.I. models on phones

Qualcomm Incorporated President and Chief Executive Officer Cristiano Amon speaks at the Milken Institute Global Conference on May 2, 2022 in Beverly Hills, California.

Patrick T. Fallon | AFP | Getty Images

Qualcomm and Meta will enable the social networking company’s new large language model, Llama 2, to run on Qualcomm chips in phones and PCs starting in 2024, the companies announced today.

So far, LLMs have run mainly in large server farms on Nvidia graphics processors because of the technology’s massive appetite for computing power and data. That demand has boosted Nvidia’s stock price, which is up more than 220% this year. But the AI boom has largely missed companies like Qualcomm, which makes leading processors for phones and PCs. Its shares have risen about 10% so far in 2023, trailing the Nasdaq’s 36% gain.

Tuesday’s announcement suggests that Qualcomm wants to position its processors as well suited for artificial intelligence, but “at the edge,” or “on-device,” rather than “in the cloud.” If large language models can run on phones instead of in massive data centers, that could dramatically reduce the cost of running AI models and potentially lead to better, faster voice assistants and other applications.

Qualcomm will make Meta’s open-source Llama 2 model available on Qualcomm devices, which it believes will enable applications such as intelligent virtual assistants. Meta’s Llama 2 can do many of the same things as ChatGPT, but it can be packaged in a smaller program, which allows it to run on mobile phones.
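For a rough sense of why model size matters on a phone, the short sketch below estimates how much storage a model's weights need: the number of parameters times the bits used to store each one. The parameter counts are Llama 2's published sizes. The bit widths and the helper name `weight_gigabytes` are illustrative assumptions, not figures or code from Meta or Qualcomm, and the estimate ignores the extra memory a model needs while running.

```python
# Back-of-the-envelope estimate of weight storage:
# parameters x bits per parameter.
# Parameter counts are Llama 2's published sizes; the bit widths below
# are illustrative assumptions, not figures from Meta or Qualcomm.
LLAMA2_PARAMS = {"7B": 7e9, "13B": 13e9, "70B": 70e9}


def weight_gigabytes(num_params: float, bits_per_param: int) -> float:
    """Approximate storage for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    for name, params in LLAMA2_PARAMS.items():
        for bits in (16, 8, 4):
            size = weight_gigabytes(params, bits)
            print(f"Llama 2 {name} at {bits}-bit: ~{size:.1f} GB")
```

Under these assumptions, the smallest Llama 2 model needs about 14 GB for its weights at 16 bits per parameter and about 3.5 GB at 4 bits, which hints at why smaller, more compactly stored models are the ones likely to fit on a handset.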

Qualcomm’s chips include a “tensor processor unit” (TPU) that is well suited to the kinds of calculations AI models require. However, the processing power available on mobile devices pales in comparison to data centers equipped with cutting-edge GPUs.

What makes Meta’s Llama notable is that Meta publishes its “weights,” a set of numbers that help control how a particular AI model works. Doing so allows researchers, and ultimately commercial enterprises, to use the AI models on their own computers without asking for permission or paying a fee. Other well-known LLMs, such as OpenAI’s GPT-4 or Google’s Bard, are closed source, and their weights are kept strictly confidential.

Qualcomm has worked closely with Meta in the past, especially on chips for the Quest virtual reality headset. It has also demonstrated some AI models running slowly on its chips, such as the open-source image generator Stable Diffusion.
