Google Launches Gemini 3.1 Flash TTS and Robotics-ER 1.6

Google has launched Gemini 3.1 Flash TTS and Gemini Robotics-ER 1.6, enhancing text-to-speech and robotics capabilities, along with a new Gemini app for macOS.

Google has launched Gemini 3.1 Flash TTS, an updated text-to-speech synthesis model based on the Gemini 3 generation. It features improved sound quality, expressiveness, and more precise control, supporting over 70 languages.

This neural network enables developers, companies, and everyday users to create applications with voice AI interfaces.

Gemini 3.1 Flash TTS is now available:

for developers — in preview mode via the Gemini API and Google AI Studio;
for enterprises — in Vertex AI;
for Workspace users — through the Google Vids service.

Enhanced Speech Quality and Control

The model scored 1211 points in the Artificial Analysis TTS ranking, based on preferences from thousands of respondents who participated in blind audio quality testing.

According to Artificial Analysis, the model is among the most attractive solutions due to its combination of high-quality speech synthesis and low cost.

This LLM stands out for its ability to generate natural dialogues involving multiple speakers.

New Audio Tags

The 3.1 Flash TTS version introduces audio tags, a tool for controlling style, pace, and manner of speech.

“Early developers and corporate testers are already seeing the results of 3.1 Flash TTS, noting its impressive control and expressiveness. They told us how audio tags provide a new level of creative precision, transforming simple text into high-quality voice performances,” the company stated in its blog.

AI Model for Robotics

Alongside Gemini 3.1 Flash TTS, Google has introduced Gemini Robotics-ER 1.6. This AI model aims to enable robots to perform complex tasks in real-world conditions through enhanced cognitive functions and “embodied” thinking.

The neural network specializes in spatial perception, action planning, and assessing their success. It shows significant improvements over its predecessor and Gemini 3.0 Flash in tasks related to spatial and physical reasoning.

Gemini Robotics-ER 1.6 can interpret data from complex measuring instruments and observe metrics through viewing windows. This capability was developed by Google DeepMind in collaboration with Boston Dynamics for industrial applications.

“Such capabilities allow for autonomous vision, understanding, and reaction to real-world challenges,” commented Marco da Silva, Vice President of the Spot project at Boston Dynamics.

In security threat detection tests, the new model outperformed Gemini 3.0 Flash by 6% in text scenarios and by 10% in video analysis.

Integration of LLM into real workflows has already begun: Boston Dynamics has incorporated Gemini and Gemini Robotics-ER 1.6 into its Orbit AIVI-Learning platform.

Gemini on macOS

Additionally, Google has released a native Gemini application for macOS, accessible via Option + Space. Among its features is the ability to share a window for instant context transfer.

The app supports image generation with Nano Banana, video creation with Veo, and other familiar tools.

As a reminder, in April, Google introduced Gemma 4, a new family of open AI models for advanced reasoning and agent-based workflows.

Google's New Releases: AI Models for Text-to-Speech, Robotics, and Gemini on macOS

Enhanced Speech Quality and Control

New Audio Tags

AI Model for Robotics

Gemini on macOS