<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>#EdgeAI on Home</title>
    <link>https://yakinin.com/en/tags/%23edgeai/</link>
    <description>Recent content in #EdgeAI on Home</description>
    <generator>Hugo -- 0.148.2</generator>
    <language>en</language>
    <lastBuildDate>Fri, 05 Sep 2025 19:54:00 +0000</lastBuildDate>
    <atom:link href="https://yakinin.com/en/tags/%23edgeai/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Google&#39;s EmbeddingGemma: A New Contender for On-Device RAG</title>
      <link>https://yakinin.com/en/posts/20250905-google-embeddinggemma-on-device-rag/</link>
      <pubDate>Fri, 05 Sep 2025 19:54:00 +0000</pubDate>
      <guid>https://yakinin.com/en/posts/20250905-google-embeddinggemma-on-device-rag/</guid>
      <description>&lt;p&gt;I usually default to OpenAI for embeddings, but Google&amp;rsquo;s new EmbeddingGemma model is a noteworthy development. It&amp;rsquo;s not just another model; it&amp;rsquo;s a strategic move that shows real promise for improving Retrieval-Augmented Generation (RAG) pipelines, especially in on-device and edge applications.&lt;/p&gt;
&lt;h2 id=&#34;what-is-embeddinggemma&#34;&gt;What is EmbeddingGemma?&lt;/h2&gt;
&lt;p&gt;Google has released EmbeddingGemma as a lightweight, efficient, and multilingual embedding model. At just 308M parameters, it&amp;rsquo;s designed for high performance in resource-constrained environments. This isn&amp;rsquo;t just about making a smaller model; it&amp;rsquo;s about making a &lt;em&gt;capable&lt;/em&gt; small model.&lt;/p&gt;
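&lt;p&gt;To make that concrete, here is a minimal retrieval sketch: embed a query and a few documents, then rank them by cosine similarity. It assumes the &lt;code&gt;sentence-transformers&lt;/code&gt; API and the &lt;code&gt;google/embeddinggemma-300m&lt;/code&gt; checkpoint id, both of which are my assumptions rather than details confirmed in this post.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-python&#34;&gt;# Minimal sketch: rank documents against a query by cosine similarity.
# The model id &#34;google/embeddinggemma-300m&#34; is assumed for illustration.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer(&#34;google/embeddinggemma-300m&#34;)

query = &#34;Which embedding models run well on-device?&#34;
docs = [
    &#34;EmbeddingGemma is a 308M-parameter multilingual embedding model.&#34;,
    &#34;RSS is an XML-based format for publishing web feeds.&#34;,
]

# encode() returns one vector for a string, a 2-D array for a list.
query_emb = model.encode(query)
doc_embs = model.encode(docs)

# Cosine similarity: a higher score means a closer match to the query.
scores = doc_embs @ query_emb / (
    np.linalg.norm(doc_embs, axis=1) * np.linalg.norm(query_emb)
)
print(scores)  # the on-device embedding document should rank first
&lt;/code&gt;&lt;/pre&gt;</description>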
    </item>
    <item>
      <title>DeepSeek-V3: A Quiet Release with Impressive Local Performance</title>
      <link>https://yakinin.com/en/posts/20250801-deepseek-v3-local-performance/</link>
      <pubDate>Thu, 27 Mar 2025 11:22:11 +0000</pubDate>
      <guid>https://yakinin.com/en/posts/20250801-deepseek-v3-local-performance/</guid>
      <description>&lt;div style=&#34;display: flex; justify-content: center; gap: 1em; flex-wrap: wrap;&#34;&gt;
  &lt;img src=&#34;https://yakinin.com/img/20250801-deepseek-v3-local-performance-0.jpg&#34; style=&#34;max-width: 350px; width: 100%;&#34; /&gt;
&lt;/div&gt;
&lt;p&gt;DeepSeek has once again followed its &amp;ldquo;quiet release&amp;rdquo; strategy, making its new DeepSeek-V3-0324 model available on Hugging Face without any major announcements. Instead of marketing hype, they&amp;rsquo;ve simply delivered the model for the community to evaluate.&lt;/p&gt;
&lt;p&gt;I tested the model locally on a Mac Studio equipped with an M3 Ultra chip and saw impressive performance, generating over 20 tokens per second. This is a significant step for running capable models on local hardware, making local inference a viable option for developers.&lt;/p&gt;
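&lt;p&gt;For anyone who wants to reproduce a similar setup, below is a minimal generation sketch using the MLX stack on Apple silicon. The post does not say which runtime was used, so the &lt;code&gt;mlx-lm&lt;/code&gt; library and the quantized checkpoint id are assumptions for illustration.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-python&#34;&gt;# Minimal sketch: local text generation with mlx-lm on Apple silicon.
# The checkpoint id below is assumed; any MLX conversion of
# DeepSeek-V3-0324 would follow the same load/generate pattern.
from mlx_lm import load, generate

model, tokenizer = load(&#34;mlx-community/DeepSeek-V3-0324-4bit&#34;)

prompt = &#34;Summarize the trade-offs of running large language models locally.&#34;

# verbose=True streams tokens and reports a tokens-per-second figure.
text = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
print(text)
&lt;/code&gt;&lt;/pre&gt;</description>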
    </item>
  </channel>
</rss>
