You can now create 2D games with Google's new AI model, Google Genie: here's how it works


The Google Genie is out of the bottle and ready to take the world of AI by storm. After text, images, videos, and music, generative AI can now create games in seconds. The new AI model comes from search giant Google and can convert any image into a playable 2D world. Let’s get into the specifics.

Google’s new AI model is designed to create an endless array of 2D platformer video games. Tim Rocktäschel, Open-Endedness Team Lead at Google DeepMind, described Genie as an action-controllable world model trained without supervision on video game data.

According to Rocktäschel, Genie stands out for its ability to generate action-controllable 2D worlds from image prompts. Its exclusive focus on generating video games sets it apart from other generative models and marks a significant milestone in the AI landscape.

Currently, Genie remains a research model and is not available to the public. Details of how users would interact with it have yet to be disclosed, leaving questions about its handling of text prompts and other technical aspects unanswered.

The AI model was trained without supervision on an extensive 200,000 hours of internet videos. The accompanying research paper describes Genie as the first generative interactive environment trained in an unsupervised manner from unlabelled internet videos. It can generate action-controllable virtual worlds from prompts given as text, synthetic images, photographs, and sketches.

Genie, with its 11 billion parameters, is considered a foundation world model. It comprises three components: a spatiotemporal video tokenizer, an autoregressive dynamics model, and a scalable latent action model. Together, they let users interact with the generated environments on a frame-by-frame basis.
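To make the frame-by-frame interaction more concrete, here is a minimal Python sketch of how the three components might fit together in a play loop. It is a toy illustration, not Genie's code: the function names (`tokenize`, `dynamics`, `detokenize`, `play`), the vocabulary size, the number of latent actions, and the arithmetic inside each function are all assumptions made for this example, and simple stand-ins replace the trained neural networks.

```python
from typing import List, Sequence

Tokens = List[int]

VOCAB_SIZE = 16          # assumed token vocabulary size for this toy
NUM_LATENT_ACTIONS = 8   # assumed number of discrete latent actions


def tokenize(frame: Sequence[int]) -> Tokens:
    """Stand-in for the video tokenizer: map raw pixel values to discrete tokens."""
    return [pixel % VOCAB_SIZE for pixel in frame]


def detokenize(tokens: Tokens) -> List[int]:
    """Stand-in for the tokenizer's decoder: map tokens back to a frame."""
    return list(tokens)


def dynamics(history: List[Tokens], action: int) -> Tokens:
    """Stand-in for the dynamics model: predict the next frame's tokens
    from the token history and the chosen latent action.
    A real model would attend over the whole history; this toy only
    shifts the most recent frame's tokens by the action index."""
    if not 0 <= action < NUM_LATENT_ACTIONS:
        raise ValueError(f"unknown latent action: {action}")
    return [(token + action) % VOCAB_SIZE for token in history[-1]]


def play(prompt_frame: Sequence[int], actions: Sequence[int]) -> List[List[int]]:
    """Generate one new frame per user action, starting from a prompt image."""
    history = [tokenize(prompt_frame)]
    frames = []
    for action in actions:
        next_tokens = dynamics(history, action)   # predict the next frame
        history.append(next_tokens)               # feed it back in (autoregressive)
        frames.append(detokenize(next_tokens))
    return frames


frames = play(prompt_frame=[1, 2, 3], actions=[1, 0, 2])
```

The point of the sketch is the shape of the loop: the prompt image is tokenized once, and every user action produces exactly one new frame that is appended to the history before the next prediction.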

Remarkably, Genie achieves this without the need for ground-truth action labels or other domain-specific requirements typically found in world model literature.
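One way to picture learning actions without labels is to look only at how consecutive frames differ and snap each change to the closest entry in a small set of discrete codes. The Python sketch below does exactly that on toy data. It assumes a nearest-neighbour quantization step in the spirit of vector quantization; the hand-written `codebook`, the helper names, and the use of raw frame differences are all hypothetical simplifications, whereas a real model would learn both the encoding and the codes from video.

```python
from typing import List, Sequence


def frame_change(prev: Sequence[float], nxt: Sequence[float]) -> List[float]:
    """Describe a transition by the element-wise difference between two frames."""
    return [b - a for a, b in zip(prev, nxt)]


def nearest_code(vec: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the codebook entry closest to vec (squared distance)."""
    def sqdist(u: Sequence[float], v: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(u, v))
    return min(range(len(codebook)), key=lambda i: sqdist(vec, codebook[i]))


def infer_latent_actions(video: Sequence[Sequence[float]],
                         codebook: Sequence[Sequence[float]]) -> List[int]:
    """Assign one discrete latent action to each pair of consecutive frames.
    No action labels are used: only the frames themselves."""
    return [nearest_code(frame_change(a, b), codebook)
            for a, b in zip(video, video[1:])]


# Toy data: each "frame" is a 2-number position; the codebook is hand-written
# here, whereas a trained model would learn it.
codebook = [[0, 0], [1, 0], [0, 1]]        # stay, move along x, move along y
video = [[0, 0], [1, 0], [1, 1], [1, 1]]
latent_actions = infer_latent_actions(video, codebook)
```

Here the inferred indices carry no predefined meaning such as "jump" or "left"; they are simply consistent labels for recurring kinds of change between frames.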

The learned latent action space opens up possibilities for training agents to imitate behaviors from unseen videos, paving the way for the development of future generalist agents. Although Genie is not yet publicly accessible, its introduction showcases the potential of AI to create dynamic, interactive virtual environments and underlines the continuing advances in unsupervised learning from diverse internet video data.

