An autonomous vehicle drives down a deserted stretch of highway. Suddenly, a huge tornado appears in the distance. What does the driverless vehicle do next?
This is one of the scenarios that Waymo can simulate in the “ultra-realistic” virtual world it just created with the help of Google DeepMind. Waymo’s world model is built on Google’s new AI world model, Genie 3, which can generate interactive virtual spaces from text or image prompts. But Genie 3 isn’t just for making rough imitations of Nintendo games; it can also create photorealistic, interactive 3D environments “adapted to the rigors of the driving domain,” Waymo says.
Simulation is a key component of autonomous vehicle development, enabling developers to test their vehicles in a variety of settings and scenarios — many of which may only be encountered on rare occasions — without any physical risk of harm to passengers or pedestrians. AV companies use these virtual environments to put their vehicles through batteries of tests, driving millions or even billions of miles in the process, in hopes of better preparing them for any “edge cases” they might encounter in the real world.
What types of edge cases is Waymo testing? In addition to the aforementioned tornadoes, the company can also simulate a snow-covered Golden Gate Bridge, a flooded suburban cul-de-sac with floating furniture, a neighborhood engulfed in flames, or even an encounter with a rogue elephant. In each scenario, the Waymo robotaxi’s lidar sensors generate a 3D rendering of the surrounding environment, including obstacles in the road.
“The Waymo World Model can generate almost any scene – from regular, day-to-day driving to rare, long-tail scenarios – across multiple sensor modalities,” the company says in a blog post.
Waymo says Genie 3 is ideal for creating virtual worlds for its robotaxis, citing three unique control mechanisms: driving action control, scene layout control, and language control. Driving action controls allow developers to simulate “what if” counterfactuals, while scene layout controls enable adjustments to the road layout, traffic signals, and the behavior of other road users. Waymo describes language control as its “most flexible” mechanism, allowing changes to the time of day and weather conditions. This is especially helpful when developers want to simulate low-light or high-glare situations, in which the vehicle’s various sensors may have difficulty perceiving the road ahead.
The company says the Waymo World Model can also take real-world dashcam footage and transform it into a simulated environment for “the highest degree of realism and factuality” in virtual testing. And it can sustain simulated scenes over long stretches of time — including playback at 4x speed — without compromising image quality or straining computing resources.
“By simulating the ‘impossible,’ we proactively prepare Waymo drivers for some of the most rare and complex scenarios,” the company says in its blog post.
