Elon Musk’s xAI is pushing to build so-called world models, joining rivals such as Meta and Google in the race to develop artificial intelligence systems that can navigate and design physical environments.
The San Francisco-based start-up hired specialists from Nvidia over the summer to work on these next-generation AI models, which train on videos and data from robots to understand the real world.
World models could push the capabilities of AI beyond those of the large language models, trained on text, that underpin popular AI tools such as ChatGPT and xAI’s Grok.
Two people familiar with the plans said the company was building world models with a view to applying them in gaming, where they could be used to generate interactive 3D environments. One of the people added that they could be applied to AI systems for robots.
xAI has hired Zeeshan Patel and Ethan He, two AI researchers from Nvidia with experience in world models. Nvidia has been a leader in developing this technology with its Omniverse platform, which creates and runs simulations.
Some tech groups have vast expectations of world models, which could unlock uses for AI beyond software and computers in physical products such as humanoid robots. Last month, Nvidia told the Financial Times that the potential market for world models could be almost the size of the present global economy.
xAI would release a “great AI-generated game before the end of next year”, Musk said in a post on X, confirming a target the billionaire set last year.
On Tuesday, xAI launched its latest image and video generation model, which it said had “massive upgrades” and was free to use.
Current video generation models, such as OpenAI’s Sora, generate frames of images for videos by predicting patterns learned from training data.
World models would be a big advance as they would have a causal understanding of physics and how objects interact in different environments in real time.
The company is advertising for technical staff in both image and video generation to join its “omni team”, which “creates magical AI experiences beyond text, enabling understanding and generation of content across various modalities, including image, video and audio”.
Salaries for these jobs range from $180,000 to $440,000. It also has an open position for a “video games tutor”, who will train Grok to produce video games and enable “users to explore AI-assisted game design”, for $45 to $100 an hour.
Musk’s xAI follows other leading AI labs, such as Google and Meta, which are also working on these systems.
However, world models remain a huge technical challenge. Finding sufficient data to simulate the real world and to train such models has proved difficult and costly.
Michael Douse, head of publishing at Larian Studios, which develops the video game Baldur’s Gate 3, said on X this week that AI could not solve the “big problem” for the games industry, which is “leadership [and] vision”.
He added that the industry did not need “more mathematically produced, psychologically trained gameplay loops [but] rather more expressions of worlds that folks are engaged with, or want to engage with”.
xAI, Patel and He did not respond to requests for comment.
Additional reporting by Hannah Murphy in San Francisco