Copilot Vision Brings Multimodal AI to Windows
Microsoft has launched Copilot Vision, a major upgrade that lets the assistant analyze visual context from your screen, answer questions, perform tasks, and offer suggestions—all in real time. It supports voice interaction and multimodal input, bringing Copilot closer to a truly contextual, ambient AI presence on Windows 11 and upcoming Windows 12 devices.

This feature puts Microsoft squarely in competition with Google’s Gemini Live, and reflects its broader push to make Copilot the core of a new “AI-first desktop.”

Copilot Goes Military-Grade
In a major institutional milestone, Microsoft confirmed it’s developing a secure version of Copilot specifically for the U.S. Department of Defense. This version will run inside the GCC High environment, compliant with strict federal regulations for data security. Deployment is expected by summer 2025.

Copilot Tuning: AI for Every Enterprise
At Build 2025, Microsoft introduced Copilot Tuning, a no-code customization platform that allows businesses to create specialized Copilot agents using internal documents, APIs, and knowledge bases. Think of it as “Copilot-as-a-Service”—tailored for different workflows, from healthcare to finance.

Microsoft also continues to integrate third-party models like Meta’s Llama 3, Mistral, and xAI’s Grok into Azure, creating a flexible backend that goes beyond OpenAI’s models.

Google Gemini: An Operating Layer for AI Everywhere

Gemini Scheduled Actions and Workspace Enhancements
Gemini’s assistant capabilities are becoming smarter and more proactive. The Scheduled Actions feature allows Gemini Pro and Ultra users to automate daily tasks—summarizing inboxes, generating morning briefings, or scheduling calendar items. It positions Gemini as not just a reactive tool, but a proactive daily planner.

In Google Workspace, Gemini now summarizes PDFs and Google Forms responses, and assists with document creation through a “Help Me Create” button rolling out this summer. These additions aim to embed AI deeply into Google’s productivity layer, streamlining workflows with minimal manual input.

Gemini 2.5 and Deep Think Mode
The newly upgraded Gemini 2.5 Flash model is now the default for developers and consumers. For power users, Gemini 2.5 Pro introduces a Deep Think mode that allows the model to explore multiple reasoning paths before responding—mimicking strategic decision-making. This feature is currently in limited testing but marks a significant leap in AI planning capability.

Read the full article at 9Meters.