[Sep] ML Community — Highlights and Achievements
Let’s explore the highlights and accomplishments of the vast Google Machine Learning communities over the past month. We appreciate all the activities and commitment of our community members. Without further ado, here are the key highlights!
Featured Stories
New Gemma models activities
Fact Check with DataGemma by AI/ML GDE Ertuğrul Demir (Türkiye) explored DataGemma, a new tool that integrates LLMs with the Data Commons knowledge base to improve the accuracy of factual claims, especially those involving numbers. His tests showed that DataGemma provided detailed, factual answers to complex queries where a standard LLM either refused to answer or hallucinated.
Target ends the illusion? Introduction to DataGemma by AI/ML GDE Liu Yu-Wei (Taiwan) introduced DataGemma and explained how it reduces hallucinations by grounding responses in real-world statistics.
Pre-alpha of Gemma-o1 by AI/ML GDE Rabimba Karanjai (US) introduced Gemma-o1 and demonstrated its capacity to tackle complex riddles and challenges.
Highlights by products
Gemini
Multimodal Document Processing (video | code) by AI/ML GDE Sascha Heyer (Germany) explains document processing with Gemini’s multimodality. He processed 10,251 documents for just $1 within 15 minutes, without any training or labeling. The article received 110+ claps on Medium.
The Missing Link in RAG Systems: Bridging Text and Visuals in PDFs for Better Results by AI/ML GDE Daniel Gwerzman (UK) explores how to enhance RAG systems to handle PDFs containing text and visual elements. Using Gemini 1.5 Flash, he outlined a method for storing Base64-encoded PDF pages in a vector database, preserving the full context of text and visuals. He showed how this approach improves response quality in tasks like Q&A, document summarization, and data extraction.
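The approach can be sketched roughly as follows: keep a Base64-encoded page next to its embedding, then decode the retrieved page and send it to Gemini at query time. This sketch assumes the google-generativeai Python SDK; the vector store, page splitting, and retrieval logic are simplified, and the file name and question are placeholders rather than anything from Daniel’s article.

```python
import base64
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # assumption: a Gemini API key is available
model = genai.GenerativeModel("gemini-1.5-flash")

# Store each PDF page as a Base64 string next to its text embedding
# (the vector database itself is out of scope for this sketch).
with open("report_page_3.pdf", "rb") as f:  # hypothetical single-page PDF
    page_b64 = base64.b64encode(f.read()).decode()

# At query time, decode the retrieved page and pass it to Gemini as inline
# data, so charts and layout are preserved alongside the user's question.
response = model.generate_content([
    {"mime_type": "application/pdf", "data": base64.b64decode(page_b64)},
    "Using this page, including any figures or tables, what was Q3 revenue?",
])
print(response.text)
```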
Ampy: Greece’s First AI-Powered Energy Advisor by AI/ML GDE Konstantinos Kechagias (Greece) is an AI-driven energy advisor designed to help households and businesses in Greece find the most cost-effective energy plans. He used Gemini and Gemma to analyze individual energy consumption data and to help people make informed decisions and reduce energy costs.
Function Calling with Gemini and the Rick And Morty API (Colab Notebook | video) by AI/ML GDE Nathaly Alarcon Torrico (Bolivia) explains how to create a “Rick And Morty Expert” you can ask for factual information about the TV series.
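As a rough illustration of the pattern (not Nathaly’s exact notebook), the public Rick and Morty REST API can be wrapped as a Python function and handed to Gemini via the google-generativeai SDK’s automatic function calling:

```python
import requests
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # assumption: a Gemini API key is available

def get_character(name: str) -> dict:
    """Look up a Rick and Morty character in the public REST API."""
    resp = requests.get(
        "https://rickandmortyapi.com/api/character/", params={"name": name}
    )
    resp.raise_for_status()
    first = resp.json()["results"][0]
    return {"name": first["name"], "status": first["status"], "species": first["species"]}

# Passing a plain Python function as a tool lets the SDK call it automatically
# whenever the model decides it needs factual data about the show.
model = genai.GenerativeModel("gemini-1.5-flash", tools=[get_character])
chat = model.start_chat(enable_automatic_function_calling=True)
print(chat.send_message("Is Birdperson still alive, and what species is he?").text)
```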
Google Gemini Nano in Chrome provides OFFLINE AI by Angular GDE Muhammad Ahsan Ayaz (Sweden) explains how to run Gemini Nano in Chrome. After introducing the latest Gemini models, Gemini Pro, Flash, and Nano, he shows how to activate Gemini Nano in the Chrome Canary and Dev builds by updating Chrome flags and enabling the on-device model via chrome://components. He also shared a tutorial using the Gemini API, Build Your Own AI Sentiment Analyzer in Angular!, with a code repo and demo app.
Img2Anime Photo Journal using Stable Diffusion and Gemini API (Colab Notebook) by AI/ML GDE Nitin Tiwari (India) is a photo journal maker that transforms pictures into anime-style illustrations using Gemini and Stable Diffusion XL Turbo.
The Eras of LLMs: Gemini & Beyond by TFUG Bhubaneswar focused on Gen AI using Gemini & Gemma. Over 400 participants engaged across 4 events, including workshops, seminars, and hands-on sessions. In a hands-on workshop by Saswat Samal (India), participants explored the Gemini Chat API and built a chat agent in a Colab notebook, focusing on prompts and settings. AI/ML GDE Tarun R Jain (India) discussed applications built with Gemini, such as function calling, long-video prompting, and RAG over its 2M-token context window for agentic workflows.
What Using LLMs To Filter Television News EPG Show Names For “News/Not News” Teaches Us About The Severe Limits Of GenAI by AI/ML GDE Kalev Leetaru (US) shows how he examined 1,418 unique show titles using Gemini 1.5 Pro and ChatGPT 4o.
Gemma
Building Digital Twins using OSS LLM models by AI/ML GDE Rabimba Karanjai (US) explored how open-source LLMs can be leveraged to create digital twins in healthcare, enhancing patient care and operational efficiency. He focused on integrating Gemma with Kubernetes Engine for scalable deployment.
Gemma2 and RAG a great combination for your Gen AI apps (Spanish) (Colab Notebook) by AI/ML GDE Nathaly Alarcon Torrico (Bolivia) is a tutorial on implementing RAG with Gemma over a set of documents containing domain-specific information.
Unleashing the Power of Ollama and Gemma: A Python Playground by AI/ML GDE Linda Lawton (US) shows how to use Gemma locally with Python and Ollama. Having a local AI model for testing allows for efficient experimentation and troubleshooting without relying on external APIs.
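A minimal sketch of that local setup, assuming the Ollama server is running and a Gemma 2 model has been pulled (the model tag and prompt are placeholders):

```python
# pip install ollama   # and run `ollama pull gemma2:2b` once beforehand
import ollama

response = ollama.chat(
    model="gemma2:2b",
    messages=[{"role": "user", "content": "Explain LoRA fine-tuning in two sentences."}],
)
print(response["message"]["content"])
```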
OpenAI-Translator by AI/ML GDE Jingtian Peng (China) is an AI-powered multilingual translation tool for books in PDF or Markdown format. It translates between any pair of languages using LLMs such as Gemma 2, Llama 3.1, and GPT-4o-mini. The repo has received 1,100+ stars on GitHub.
Colab
Immersive Stories (Colab Notebook) by AI/ML GDE Pedro Lourenço (Brazil) is an app for building immersive story videos of around 60 seconds. He combined several tools, such as LoRA, Flux, fal.ai, and Luma, to create the videos, and used Gemini in particular to write the script for all the scenes and the image descriptions used to illustrate each one.
SEMAC — XXXIV Computer Week Workshop (Colab Notebook) by AI/ML GDE Vinicius F. Caridá (Brazil) was a 3-hour, hands-on workshop for students built around examples in the notebook. The session covered everything from basic ML concepts to the latest Gen AI technologies, including Gemini and Gemma, combining theoretical insights with practical exercises.
Keras
Fine-Tune Gemma 2 2b with Keras and LoRA (Part 3) (Kaggle notebook) by AI/ML GDE Ruqiya Bin Safi (Saudi Arabia) explores how to fine-tune the Gemma 2 9B model for Arabic using KerasNLP and the LoRA technique. It covers how to set up the environment, load the model, make the necessary modifications, and train it using model parallelism to distribute parameters across multiple accelerators. The article is a continuation of Fine-Tune Gemma 2 2b Using Transformers and qLoRA (Part 2) (Colab).
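The core KerasNLP + LoRA recipe looks roughly like this; it is a sketch under stated assumptions (Gemma weights accessible via Kaggle, a toy training list) rather than Ruqiya’s exact notebook, and the Arabic dataset and model-parallel layout are omitted:

```python
# pip install keras keras-nlp   # Kaggle credentials are needed to download Gemma weights
import keras
import keras_nlp

# Load a Gemma 2 preset and enable LoRA so that only small rank-decomposition
# matrices are trained instead of the full weights.
gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma2_2b_en")
gemma_lm.backbone.enable_lora(rank=4)

gemma_lm.preprocessor.sequence_length = 256
gemma_lm.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=keras.optimizers.Adam(learning_rate=5e-5),
    weighted_metrics=[keras.metrics.SparseCategoricalAccuracy()],
)

# Placeholder instruction/response strings stand in for the real dataset.
train_texts = ["Instruction: Translate 'hello' to Arabic.\nResponse: مرحبا"]
gemma_lm.fit(train_texts, epochs=1, batch_size=1)
```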
Keras Community Day (video) by TFUG Lesotho aimed to engage the local community in learning and sharing knowledge about Keras, deep learning, and AI. Focusing on KerasNLP, they explored multimodal models such as Gemini (Colab notebook), fine-tuning open-source LLMs, and hands-on sessions using Kaggle.
Sequential model with Keras 3.0 for binary classification by Yara Armel Desire (Canada) is a model for binary image classification. Following François Chollet’s tutorial on sequential models, he built his own dataset, applied a model to it, and published a notebook on GitHub. He also spoke at Keras 3.0: Sequential model for images classification (video), a session hosted by ML Abidjan.
Kaggle
ISIC 2024 – Skin Cancer Detection with 3D-TBP
ISIC 2024 — Skin Cancer Detection — 13th Place Solution by AI/ML GDE Ertuğrul Demir (Türkiye) describes his gold-medal-winning solution in the Kaggle featured competition. He provides a comprehensive overview of the solution, including the techniques and strategies employed to reach a gold-medal ranking, offering insights for similar ML challenges.
Gemma 2 2B learns how to tutor in AI/ML (Kaggle notebook) by AI/ML and Kaggle GDE Luca Massaron (Italy) explores the potential of small language models (SLMs) like Gemma 2 2B, emphasizing their ability to efficiently perform specialized tasks while being highly customizable for local use. He highlights the advantages of these models over larger counterparts, particularly in terms of accessibility, flexibility, and privacy.
Climb Kaggle Leaderboard with AutoGluon: Titanic Challenge Breakdown by Kaggle GDE Karnika Kapoor (UAE) is a video demonstrating how to use AutoGluon to effortlessly manage feature engineering, hyperparameter tuning, and ensemble modeling to excel in Kaggle competitions.
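For reference, the AutoGluon workflow she demonstrates boils down to a few lines; this is a generic sketch with placeholder CSV paths, not the exact code from the video:

```python
# pip install autogluon.tabular
from autogluon.tabular import TabularDataset, TabularPredictor

# Titanic CSVs from the Kaggle competition (paths are placeholders).
train_data = TabularDataset("train.csv")
test_data = TabularDataset("test.csv")

# A single fit() call handles feature engineering, model selection,
# hyperparameter tuning, and ensembling.
predictor = TabularPredictor(label="Survived").fit(train_data, presets="best_quality")
predictions = predictor.predict(test_data)
print(predictor.leaderboard())
```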
Fine-tuning Gemma 2 Model Using LoRA and Keras by Kaggle GDE Gabriel Preda (Romania) is a notebook that shows how to fine-tune the Gemma 2 model with LoRA and build a specialized class for querying Kaggle features, and presents results from querying Kaggle Docs, building on various sources and prior work.
JAX
JORA: JAX Tensor-Parallel LoRA Library for Retrieval Augmented Fine-Tuning by AI/ML GDE Anique Tahir (US), Lu Cheng and Huan Liu is a framework for PEFT-compatible fine-tuning of GPT models, leveraging distributed training. It utilizes JAX’s just-in-time (JIT) compilation and tensor-sharding for efficient resource management, enabling accelerated fine-tuning with reduced memory requirements. They improved the scalability and feasibility of fine-tuning LLMs for complex RAG applications, even on systems with limited GPU resources.
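JORA’s own API isn’t shown here, but the LoRA computation it parallelizes can be illustrated in plain JAX; this toy sketch omits tensor sharding and uses made-up dimensions:

```python
import jax
import jax.numpy as jnp

def init_lora(key, d_in, d_out, rank=8):
    """Frozen base weight plus trainable low-rank factors A and B."""
    k_w, k_a = jax.random.split(key)
    return {
        "W": jax.random.normal(k_w, (d_in, d_out)) * 0.02,  # frozen base weight
        "A": jax.random.normal(k_a, (d_in, rank)) * 0.02,   # trainable
        "B": jnp.zeros((rank, d_out)),                       # trainable, zero-initialized
    }

@jax.jit  # JIT compilation, which JORA also relies on for its training steps
def lora_forward(params, x, alpha=16.0, rank=8):
    # y = xW + (alpha / rank) * xAB; only A and B receive gradient updates
    return x @ params["W"] + (alpha / rank) * (x @ params["A"]) @ params["B"]

params = init_lora(jax.random.PRNGKey(0), d_in=512, d_out=512)
x = jnp.ones((4, 512))
print(lora_forward(params, x).shape)  # (4, 512)
```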
Multimodal Fusion with Jax/Flax (Kaggle notebook) by AI/ML GDE Taha Bouhsine (US) was a talk that bridged the gap between theory and practice in multimodal model implementation with a focus on aerospace engineering applications. He covered efficient data loading techniques, explaining how to organize and load data batches tailored to specific model architectures. A key highlight was the in-depth look at creating “Connectors” — crucial components that fuse embeddings from various encoders, forming the backbone of effective multimodal models.
Vertex AI
Harnessing Google’s Gemini and Imagen Models for High-Quality Image Generation and Refinement by AI/ML GDE Yucheng Wang (China) explains how he used Gemini and Imagen 3 on Vertex AI to generate images and how he edited and customized them with Imagen 2.
Vertex AI Function Calling (video | code) by AI/ML GDE Sascha Heyer (Germany) introduces how to turn LLMs into reasoning engines with capabilities like web search and calling external APIs. He shows, step by step, how to call external APIs using Gemini with function calling. The article is part of his Generative AI Livestream series, in which Sascha live-codes Gen AI projects, and it received 115+ claps on Medium. He also shared Reranking (video | code), which uses a language model to compute a relevance score for a document given a query.
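A minimal sketch of function calling on Vertex AI, assuming the google-cloud-aiplatform SDK; the project, function, and schema below are placeholders rather than Sascha’s actual example:

```python
# pip install google-cloud-aiplatform
import vertexai
from vertexai.generative_models import FunctionDeclaration, GenerativeModel, Tool

vertexai.init(project="your-project", location="us-central1")  # placeholders

# Declare an external API the model is allowed to request.
get_weather = FunctionDeclaration(
    name="get_weather",
    description="Get the current weather for a city",
    parameters={
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
)

model = GenerativeModel(
    "gemini-1.5-flash", tools=[Tool(function_declarations=[get_weather])]
)
response = model.generate_content("What's the weather in Berlin right now?")

# The model answers with a structured function call; your code executes the real
# API and returns the result to the model in a follow-up turn.
print(response.candidates[0].content.parts[0].function_call)
```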
Instantly Change Product Backgrounds Using AI Prompts with Google Vertex AI | Imagegen Tutorial (Colab Notebook) by AI/ML GDE Bhavesh Bhatt (India) is a tutorial on how to effortlessly change product background images using Vertex AI, Imagegen, and simple prompts.
Smarter than a 5th Grader? Using LLM Comparator on Google Vertex AI to find out (repository) by AI/ML GDE Jigyasa Grover (US) is designed to offer a comprehensive evaluation of LLMs, allowing you to compare their ability to handle various tasks with clarity and thoroughness.
Spring-Vertex Connect by TFUG Durg explored the latest tech in AI and web development. Adarsh Gupta led a workshop on “Vertex AI with RESTful APIs using Spring Boot”, and AI/ML GDE Kartikey Rawat (India) gave a talk, “Introduction to Vector Search using Vertex AI”.
Build generative AI agents with Vertex AI Agent Builder for Mobile by TFUG Hyderabad focused on building Gen AI agents with Vertex AI Agent Builder, covering how to customize an agent for a specific purpose and integrate it into a mobile application. TFUG Hyderabad also hosted a 4-week ML Study Jam (YouTube playlist) with Kaggle Learn for beginners; 260+ participants joined the campaign.
ODML — LiteRT, MediaPipe, Firebase
YOLOv10 to LiteRT: Object Detection on Android with Google AI Edge (Colab notebook) by AI/ML GDE Nitin Tiwari (India) shares a comprehensive tutorial on converting and quantizing the latest YOLOv10 object detection model from Ultralytics into LiteRT format, running inference on the resulting LiteRT model, and deploying it on Android for real-time detection.
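The conversion step he walks through can be sketched with the Ultralytics export API and a quick interpreter check; this is a simplified outline with illustrative paths, not the full tutorial:

```python
# pip install ultralytics tensorflow
import tensorflow as tf
from ultralytics import YOLO

# Export a YOLOv10 checkpoint to the LiteRT (formerly TFLite) flatbuffer format,
# with INT8 quantization to shrink the model for on-device inference.
model = YOLO("yolov10n.pt")
model.export(format="tflite", int8=True)  # output file name depends on the Ultralytics version

# Sanity-check the exported model before deploying it to Android.
interpreter = tf.lite.Interpreter(model_path="yolov10n_int8.tflite")  # illustrative path
interpreter.allocate_tensors()
print(interpreter.get_input_details()[0]["shape"])
```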
Implement LiteRT for a segmentation task utilizing the FastSAM model by Ultralytics (Colab notebook) by AI/ML GDE George Soloupis (Greece) explains how to implement LiteRT for real-time segmentation using the FastSAM model. FastSAM enables efficient segmentation tasks, and when combined with LiteRT, it supports fast inference on edge devices like mobile phones.
Overview of MediaPipe Studio by Md. Sajjadur Rahman (Bangladesh) is a video introducing key features of MediaPipe Studio and how to easily visualize, evaluate, and benchmark AI and ML solutions directly in your browser. He also shared a tutorial on Customizing MediaPipe Model Maker for Image Classification.
Firebase Genkit for Github Models by AI/ML GDE Xavier Portilla Edo (Spain) is a community plugin, built and maintained by Xavier, that lets you use GitHub Models in Firebase Genkit through their official APIs. He also shares a collection of resources related to the Firebase Genkit ecosystem in his repository, Awesome Firebase Genkit.
Others
How AI sees the world: similarities and differences with human perception? by AI/ML GDE Radostin Cholakov (US) showed, with visual examples, how humans and AI perceive the world in completely different ways. The talk included findings from recent research on adversarial examples and adversarial attacks.
How to Write an Effective ML Research Paper by TFUG Islamabad was designed to give participants practical guidance on writing an ML research paper, covering topics such as research paper structure, formulating a strong research question, and conducting a literature review. Students and early-career researchers participated in the event to improve their research writing skills.