Streaming Gemini API Responses in Rust + Tauri — Real-Time Token Display

If this is useful, a ❤️ helps others find it.

All tests run on an 8-year-old MacBook Air.

Waiting 5 seconds for an AI response with no feedback feels broken. Streaming fixes this — tokens appear as they’re generated, just like ChatGPT.

Here’s how to wire Gemini streaming into a Tauri app so the response appears word by word in your UI.

The API endpoint

Replace :generateContent with :streamGenerateContent, and add alt=sse so the response arrives as server-sent events:

POST /v1beta/models/gemini-2.5-flash-preview:streamGenerateContent?alt=sse&key=API_KEY

Each event is a data: line carrying one JSON object with a batch of tokens. (Without alt=sse, the API streams a single JSON array split across chunks, which is much messier to parse.)
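
A streamed chunk looks roughly like this (extra fields trimmed); the nested candidates path is exactly what the Rust code below walks:

data: {"candidates":[{"content":{"parts":[{"text":"Hello wor"}],"role":"model"}}]}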

Rust: reading the stream

use reqwest::Client;
use futures_util::StreamExt;
use tauri::Emitter;

#[tauri::command]
pub async fn stream_gemini(
    prompt: String,
    api_key: String,
    window: tauri::Window,
) -> Result<(), String> {
    let client = Client::new();
    // alt=sse gives us server-sent events: one `data: {...}` line per chunk,
    // which is far easier to parse than the default streamed JSON array.
    let url = format!(
        "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-preview:streamGenerateContent?alt=sse&key={}",
        api_key
    );

    let body = serde_json::json!({
        "contents": [{"parts": [{"text": prompt}]}]
    });

    let mut stream = client
        .post(&url)
        .json(&body)
        .send()
        .await
        .map_err(|e| e.to_string())?
        .bytes_stream();

    // A network chunk can end mid-line, so buffer bytes until we have
    // complete lines before parsing.
    let mut buf = String::new();

    while let Some(chunk) = stream.next().await {
        let bytes = chunk.map_err(|e| e.to_string())?;
        buf.push_str(&String::from_utf8_lossy(&bytes));

        // Drain every complete line currently in the buffer.
        while let Some(pos) = buf.find('\n') {
            let line = buf[..pos].trim().to_string();
            buf.drain(..=pos);

            // SSE payload lines look like `data: {json}`; parse the JSON
            // and pull out the generated text.
            if let Some(payload) = line.strip_prefix("data: ") {
                if let Ok(json) = serde_json::from_str::<serde_json::Value>(payload) {
                    if let Some(token) =
                        json["candidates"][0]["content"]["parts"][0]["text"].as_str()
                    {
                        // Emit each token batch to the frontend
                        window.emit("ai-token", token).ok();
                    }
                }
            }
        }
    }

    window.emit("ai-done", ()).ok();
    Ok(())
}
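
One wiring detail that's easy to forget: the command must be registered with the Tauri builder, or the frontend invoke call fails at runtime. A minimal sketch, assuming stream_gemini is in scope in main.rs:

fn main() {
    tauri::Builder::default()
        .invoke_handler(tauri::generate_handler![stream_gemini])
        .run(tauri::generate_context!())
        .expect("error while running tauri application");
}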

React: receiving tokens

import { listen } from '@tauri-apps/api/event';
import { invoke } from '@tauri-apps/api/core';
import { useState, useEffect, useRef } from 'react';

export function StreamingDiagnosis() {
  const [response, setResponse] = useState('');
  const [isStreaming, setIsStreaming] = useState(false);
  const unlistenRef = useRef<(() => void) | null>(null);

  const startStream = async (prompt: string) => {
    setResponse('');
    setIsStreaming(true);

    // Drop any listener left over from a previous run so tokens
    // don't get appended twice
    unlistenRef.current?.();

    // Listen for tokens (typed so event.payload is a string)
    unlistenRef.current = await listen<string>('ai-token', (event) => {
      setResponse(prev => prev + event.payload);
    });

    // Listen for completion
    const unlistenDone = await listen('ai-done', () => {
      setIsStreaming(false);
      unlistenDone();
    });

    await invoke('stream_gemini', { prompt, apiKey: 'YOUR_KEY' });
  };

  // Cleanup on unmount
  useEffect(() => {
    return () => { unlistenRef.current?.(); };
  }, []);

  return (
    <div>
      <pre className="response">
        {response}
        {isStreaming && <span className="cursor">▌</span>}
      </pre>
      <button onClick={() => startStream('Analyze this error...')}>
        Diagnose
      </button>
    </div>
  );
}

The blinking cursor

A small CSS detail that makes streaming feel polished:

.cursor {
  animation: blink 1s step-end infinite;
}

@keyframes blink {
  0%, 100% { opacity: 1; }
  50% { opacity: 0; }
}

The cursor span renders while isStreaming is true and disappears when ai-done fires.

Result

5-second wait with no feedback → tokens appearing immediately, word by word. Same content, completely different feel.

Hiyoko PDF Vault → https://hiyokoko.gumroad.com/l/HiyokoPDFVault
X → @hiyoyok
