Google's EmbeddingGemma: A New Contender for On-Device RAG

I usually default to OpenAI for embeddings, but Google’s new EmbeddingGemma model is a noteworthy development. It’s not just another model; it’s a strategic move that shows real promise for improving Retrieval-Augmented Generation (RAG) pipelines, especially in on-device and edge applications.

What is EmbeddingGemma?

Google has released EmbeddingGemma as a lightweight, efficient, and multilingual embedding model. At just 308M parameters, it’s designed for high performance in resource-constrained environments. This isn’t just about making a smaller model; it’s about making a capable small model. ...

5 September, 2025 · 2 min · 375 words · Yury Akinin

Grok's Public Chats: A Predictable AI Privacy Failure

It’s a classic story at this point. We saw it recently with OpenAI’s ChatGPT, and now it’s Grok’s turn. Elon Musk’s xAI has inadvertently published hundreds of thousands of its users’ private conversations, making them fully searchable on Google. This wasn’t a sophisticated hack; it was a fundamental product design flaw.

The Feature That Became a Bug

The mechanism was simple and naive. When a Grok user hit the “share” button to send a conversation to a colleague or friend, the system generated a unique URL. However, instead of being a private link, this URL was made public and available for search engines to index. In effect, “sharing” meant “publishing to the open web” without any warning or disclaimer. ...

22 August, 2025 · 2 min · 350 words · Yury Akinin

OpenAI's Priorities: Fix the Leaks Before Encrypting the AI

OpenAI’s proposal to encrypt AI is a commendable headline, but it sidesteps a more fundamental issue. Before we debate the complex philosophy of encrypting artificial intelligence, we should ask a simpler, more urgent question: have they patched the basic vulnerabilities in their existing systems? It’s easy to forget, but OpenAI has a history of security lapses, most notably the incident that leaked private user chat histories across the internet. This wasn’t a failure of advanced cryptography; it was a foundational security bug. They created a vulnerability and, as a result, exposed their clients’ private conversations. ...

19 August, 2025 · 2 min · 260 words · Yury Akinin

China's AI Progress: Why 'Good Enough' Hardware Is a Game-Changer

The recent success of DeepSeek’s new AI model is more than just another headline. It’s a clear signal of a major shift in the global tech landscape. While the West has focused on restricting access to cutting-edge hardware, China has been playing a different game: achieving component independence by making good enough hardware work exceptionally well. Many are surprised that a company could develop a leading AI model without the latest NVIDIA chips, but this outcome was predictable. China is strategically leveraging its core advantages: a massive domestic market and a deep pool of highly skilled, cost-effective software engineers. The core of their strategy isn’t just about building better hardware; it’s about optimizing software to extract maximum performance from the hardware they can produce domestically. ...

19 August, 2025 · 2 min · 288 words · Yury Akinin

AI Memory Isn't the End Goal—It's the Beginning of a Knowledge Marketplace

Anthropic’s recent release of a “memory” function for its Claude chatbot is being framed as another move in the AI arms race to increase user stickiness. The feature allows the AI to reference past conversations when prompted, keeping projects and context continuous. While a useful feature, I believe this points to a much more fundamental shift in the AI landscape. Everything is moving toward the accumulation of user interaction data into isolated, private memory volumes. This isn’t just about convenience; it’s about creating a foundation where knowledge itself becomes private and proprietary. ...

13 August, 2025 · 2 min · 217 words · Yury Akinin

Loyalty Over Billions: Why Meta's Raid on Murati's AI Startup Failed

Mark Zuckerberg’s recent attempt to acquire Mira Murati’s new startup, Thinking Machines Lab, wasn’t just a standard M&A play. When Murati, OpenAI’s former CTO, declined the offer, Meta switched tactics to a full-scale talent raid—and failed spectacularly. This isn’t just industry gossip; it’s a critical signal about where the real value lies in the AI talent war. Meta reportedly approached the startup’s employees with staggering offers. Co-founder and leading researcher Andrew Tulloch was allegedly offered a compensation package worth as much as $1.5 billion over six years. Other offers to researchers ranged from $200 million to a reported $1 billion for a single individual. ...

13 August, 2025 · 2 min · 323 words · Yury Akinin

Perplexity's 'Imperfect' Launch: The Right Strategy for the AI Era

Perplexity CEO Aravind Srinivas’s launch of the Comet web browser is a critical case study in product strategy for the current AI landscape. He launched it knowing the underlying models weren’t ready for his full vision of an “operating system for the AI era.” This wasn’t a mistake; it was the entire point.

The New Go-to-Market: Build for the Future Model

The core insight here is a fundamental shift in product development. As Srinivas states, “You’ve got to position your product and your technology with the assumption that the models are eventually going to be great and also going to be affordable.” ...

13 August, 2025 · 3 min · 499 words · Yury Akinin

OpenAI's GPT-OSS: A Major Step Back Towards 'Open'

OpenAI just made a significant move by releasing GPT-OSS, its first truly open-source large language model family since GPT-2. With a permissive Apache 2.0 license, this isn’t just a minor release; it’s a fundamental shift that puts real power back into the hands of developers. The family includes two Mixture-of-Experts (MoE) models, gpt-oss-20b and gpt-oss-120b, designed for high-performance inference with strong reasoning capabilities.

Why This Is a Game-Changer

For years, the most powerful models from OpenAI have been locked behind APIs. This meant dealing with rate limits, opaque pricing, and sending potentially sensitive data to a third party. GPT-OSS changes that equation entirely. ...

13 August, 2025 · 2 min · 418 words · Yury Akinin

My Take on GPT-5, OpenAI's Strategy, and the Dawn of 'AI Time'

A recent Forbes article by John Sviokla put a name to something many of us in the AI space have been feeling: the shift to AI Time. It’s the idea that the tempo of innovation and organizational operations is no longer dictated by human speed, but by the near-instantaneous cycle of silicon intelligence. OpenAI’s GPT-5 launch is a masterclass in this new reality. It wasn’t a simple model update; it was a multi-front strategic deployment that reshapes the competitive landscape. I see it as a “quadruple play” that establishes a new baseline for the industry. ...

13 August, 2025 · 3 min · 582 words · Yury Akinin

OpenAI's Hand Was Forced: Why the AI Race is No Longer Won in Secret

For years, the AI frontier was defined by closed doors and proprietary models. That era is officially over. OpenAI’s recent pivot to open-source isn’t just a strategic shift; it’s a direct response to a new reality: the center of AI innovation has gone public, and China is leading the charge.

The Open-Source Tipping Point

The catalyst was the surprise release of high-performance models by Chinese startup DeepSeek. As a recent Fortune article aptly pointed out, this move exposed a critical vulnerability in the “closed-garden” strategy of Western AI labs. By making powerful AI openly accessible, DeepSeek didn’t just win goodwill; it ignited an explosion of development across China. Companies from Baidu to Alibaba quickly followed suit, creating a tidal wave of open innovation. ...

13 August, 2025 · 3 min · 447 words · Yury Akinin