[Oct] ML Community — Highlights and Achievements
Let’s explore highlights and accomplishments of the vast Google Machine Learning communities over the month. We appreciate all the activities and commitment by the community members. Without further ado, here are the key highlights!
Product Highlights
Gemini
Use Superintelligent Tabot Please! (repository) by AI/ML GDE Xiaohu Zhu (China) is a Gemini-powered Chrome extension that lets you discuss the content with an intelligent bot in your browser tabs.
Yes, Gemini can do object detection (Colab Notebook | Demo on Hugging Face) by AI/ML GDE Nitin Tiwari (India) is a blog post demonstrating how to implement zero-shot object detection for any object using the Gemini API.
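To make the detection output concrete, here is a minimal sketch of the post-processing step such a pipeline typically needs. It assumes the model has been prompted to return each box as `[ymin, xmin, ymax, xmax]` on a 0-1000 scale, which is the convention described in the Gemini API documentation; the helper name `to_pixel_box` and the sample box are hypothetical, and the notebook linked above may handle this differently.

```python
def to_pixel_box(box, image_width, image_height):
    """Convert a [ymin, xmin, ymax, xmax] box on a 0-1000 scale
    into pixel coordinates (left, top, right, bottom)."""
    ymin, xmin, ymax, xmax = box
    left = round(xmin * image_width / 1000)
    top = round(ymin * image_height / 1000)
    right = round(xmax * image_width / 1000)
    bottom = round(ymax * image_height / 1000)
    return left, top, right, bottom


# Hypothetical model output for one detected object (not real API output).
example_box = [250, 100, 750, 900]
pixel_box = to_pixel_box(example_box, image_width=640, image_height=480)
print(pixel_box)
```

The resulting tuple can be passed to a drawing routine to overlay the box on the original image.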
GEMINI 002! AMAZING improvements in Price, Speed and Intelligence — Tutorial! (Spanish) (Colab Notebook) by AI/ML GDE Carlos Alarcon (Colombia) dives deep into improvements in Gemini Pro and Flash including a 50% cost reduction for tasks under 128,000 tokens, increased rate limits, speed, and performance.
Green City Finder: Helps you find sustainable destinations of your choice using LLMs (Colab Notebook) by AI/ML GDE Ashmi Banerjee (Germany) is a city recommendation model for tourists that factors in sustainability goals, local community interests, and visitor satisfaction. She used Gemini 1.5 Pro and Flash via VertexAI to generate sustainable travel recommendations.
ReceiptNinja: Using Google Gemini to extract information from Retail Receipts (repository) by AI/ML GDE Ankit Sachan (India) explains how he built ReceiptNinja, an intelligent web application leveraging Gemini and the open-source OCR model, Doctr, to automatically extract key information from various types of receipts, including physical, digital, images, and PDFs.
LLM for Music Generation by AI/ML GDE Bao-Dai Nguyen-Hoang (Singapore) showed how Gemini can be used for music composition and production. He covered the process of feeding structured musical data into LLMs using TF and VertexAI.
TFUG Mumbai October Meetup by TFUG Mumbai presented the latest advances in language models including PaliGemma and Gemma 2. Attendees learned about the architecture and efficiency of these models, along with hands-on demonstrations that showcased their practical applications.
Gemma
From ML to LLM: On-Device AI in the Browser (slides) by AI/ML GDE Nico Martin (Switzerland) explains how to run ML solutions directly on your device. He introduces WebAssembly, WebGPU, TensorFlow.js, and other tools that make this possible. He also shared his “AI in the browser” projects: [md.edit], a web-based markdown editor, and Ask my PDF, a web app that uses RAG and LLMs to interact with a PDF file in the browser.
Using LLM to draw geometry diagram for an IMO problem (Colab Notebook) by AI/ML GDE TJ Wei (Taiwan) is a video exploring the use of LLMs to assist in solving and visualizing a complex geometry problem from the 2024 International Mathematical Olympiad. He compares two AI models: ChatGPT-4 and Gemma 2 9B. He emphasizes the importance of verifying AI-generated results and discusses strategies for effectively using these tools in mathematical problem-solving and visualization tasks.
Ignore All Previous Instructions by AI/ML GDE Martin Andrews (Singapore) was a follow-up talk to his AI Sprint project, “Gemma Fine-tuning with ablations” (slides arranged vertically). The talk covered fine-tuning LLMs and simple reinforcement learning ideas.
Make your Vision Model using Gemma 2 by AI/ML GDE Joan Santoso (Indonesia) and Machine Learning Surabaya Core Team is a short tutorial on creating a simple multimodal LLM with Gemma 2. The case study is inspired by Mini-GPT4 models but uses Gemma.
Gemma Models — DevFest Chennai 2024 (slides) by Navaneeth Malingan (India) was a talk covering the technical aspects of Gemma models, exploring the underlying architecture, application development strategies, and advanced fine-tuning techniques. He covered practical implementation methods, focusing on efficient integration of Gemma models into various application frameworks and APIs.
Keras
No PhD Required: Generative AI with Keras by AI/ML GDE Ahirton Lopes (Brazil) discussed how Keras simplifies the development of Gen AI models, allowing both scientists and developers, even without a PhD, to explore and create advanced AI solutions. He demonstrated some new features of Keras 3.0, emphasizing how the library abstracts complex algorithms like Transformers (demo repository). This talk was part of the LATAM Google for Developers Community Summit. He also gave the same talk (slides) at DevFest BH and conducted a hands-on demonstration using a Colab notebook.
Keras Community Day by TFUG Colombia was a comeback event to revive the community, covering key concepts in the philosophy of AI, LLMs, computer vision, multimodal systems, and the basics of Gen AI.
JAX
Building a Breast Cancer Classification Pipeline with Flax NNX, Kubeflow Pipelines and Vertex AI (Colab Notebook | repository) by AI/ML GDE David Cardozo (Canada) demonstrates how to build an ML pipeline for breast cancer classification using Kubeflow Pipelines and Vertex AI. The pipeline leverages the Curated Breast Imaging Subset of DDSM (CBIS-DDSM) dataset and utilizes TensorFlow Datasets for data processing and a custom Flax CNN model for training.
Deep Learning with JAX by AI/ML GDE Grigory Sapunov (UK) is a newly published book that shows how to build effective neural networks with JAX and introduces the features unique to JAX.
JAX Implementation of Black Forest Labs’ Flux.1 family of models by AI/ML GDEs Aritra Roy Gosthipaty (India) and Saurav Maheshkar (India) is a port from the official PyTorch implementation into JAX, using the flax.nnx framework.
Kaggle
AI/ML GDE Rishiraj Acharya (India) shared two Kaggle notebooks for featured competitions: 1) Fine-Tuning Gemma 2 for Bengali Poetry Generation — demonstrates the process of fine-tuning Gemma 2 to generate Bengali poetry based on specific poets, titles, and categories. 2) AI-Powered Legal Assistant for India’s New Criminal Laws — demonstrates the power of Gemini 1.5’s vast context window and generative capabilities to help judges, lawyers, and police officers efficiently analyze, interpret, and apply the new laws in real-time scenarios. He also explores how Gemini’s 2 million token capacity allows it to process entire legal documents (thousands of pages long) while analyzing case-specific facts. Both notebooks received Silver medals.
Next-Gen Conversational AI: Local Gemma 2 Chatbot by Kaggle GDE Gabriel Preda (Romania) is an article about how to use Kaggle Models, FastAPI, and Streamlit to build a chatbot powered by Gemma 2 and run it on your local computer.
Leveraging Kaggle: Building AI Collaboration by Kaggle GDE Karnika Kapoor (UAE), presented at DevFest AI Connect MENA, shared her insights on using Kaggle to build smarter, more insightful AI solutions, along with tips and tricks for the platform.
VertexAI
Prompt Translation: The Way to Switch Between LLMs Without Losing Performance by AI/ML GDE Gad Benram (Portugal) introduces “Prompt Translation,” an analytics-based method published by Google researchers to optimize prompts using benchmarks and Gen AI, and explores GCP’s offering to train these models with reinforcement learning on Vertex AI.
Prompts multimodales con Generative AI en Vertex AI (Colab Notebook) by AI/ML GDE Lesly Zerna (Bolivia) was a workshop on how multimodal prompting works and how to set up the Gemini API and Vertex AI.
Exploring Function Calling in Gemini and Langchain Reasoning Engine on Vertex AI: Uncovering Advantages and Differences by AI/ML GDE Bukempas (Türkiye) compares Gemini Function Calling and Langchain Reasoning Engine, highlighting their advantages, differences, and potential applications.
Deploying a Tourist Guide Chatbot with Gemma 2 and Vertex AI Model Garden by AI/ML GDE Bilguun Jargalsaikhan (Mongolia) explains how he deployed a Gemma 2 powered chatbot for tourists exploring Mongolia.
Generative Image with Imagen3 on VertexAI | AI Image Generator by AI/ML GDE Juan Guillermo Gomez Torres (Mexico) is a repository containing a Python application that leverages advanced AI models to perform various image-related tasks. It integrates models such as Imagen3, Imagen2 on VertexAI, Ollama, and Gemma to provide functionalities such as image generation, captioning, question answering, editing, and interactive dialog.
Google Cloud Next Tokyo ’24 Lightning Talk: “What is vector search?” by AI/ML GDE Rio Kurihara (Japan) was an overview of vector search, a tutorial on vector search with BigQuery, and more.
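For readers new to the idea, the core of vector search can be sketched in a few lines: items are stored as embedding vectors, and a query returns the items whose vectors are most similar to the query vector. The sketch below uses brute-force cosine similarity over hand-written toy vectors; it is not the BigQuery workflow from the talk, and the names `cosine_similarity`, `top_k`, and `toy_vectors` are illustrative assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k=2):
    """Return the names of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Toy 3-dimensional "embeddings"; real embeddings come from a model
# and have hundreds of dimensions.
toy_vectors = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}
nearest = top_k([1.0, 0.0, 0.0], toy_vectors, k=2)
print(nearest)
```

Production systems replace the exhaustive scan with an approximate nearest-neighbor index so that queries stay fast over millions of vectors.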
ODML
Using Audio in a Multimodal Prompt inside android for the Gemini API with Vertex AI on Firebase by AI/ML GDE George Soloupis (Greece) demonstrates how to incorporate audio into a multimodal prompt, using an example that combines text and audio.
A Guide to Vertex AI and Genkit for Generative AI Practitioners by AI/ML GDE Ismael Chaile (Spain) is about how to navigate the vast amount of documentation on Vertex AI and Firebase Genkit. It includes a tutorial on the Genkit plugin and some videos about it.
Stop Toxic Comments with AI! Sentiment Analysis in Angular (Google Gemini) by Angular GDE Muhammad Ahsan Ayaz (Sweden) is a video tutorial covering how to set up Gemini in your Angular app, build a custom sentiment analysis component, disable comment submission for toxic content, and apply best practices for using AI in comment moderation.
Build AI-powered apps faster with Firebase (slides) by TFUG Prayagraj [ML Prayagraj] explored new features and updates in Firebase for building, deploying, and optimizing production-ready Gen AI apps. It also covered Google’s Gen AI ecosystem for a broader understanding of AI applications.
Cloud and NotebookLM
RAG API — 30 lines of code is all you need for RAG (video | repository) by AI/ML GDE Sascha Heyer (Germany) introduces an easy way to get started with RAG. He uses Google’s RAG API to retrieve documents similar to his query and combines them with Gemini to answer it. This post received 276+ claps 👏 on Google Cloud Medium.
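The retrieve-then-generate flow described here can be illustrated without any cloud service. The following is a minimal sketch under stated assumptions: retrieval is a toy word-overlap ranking rather than Google’s RAG API, no model is called, and the function names (`score`, `retrieve`, `build_prompt`) and sample documents are hypothetical. In a real setup, the assembled prompt would be sent to a model such as Gemini.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def score(query, document):
    """Count the distinct query words that also appear in the document."""
    return len(tokenize(query) & tokenize(document))


def retrieve(query, documents, k=1):
    """Return the k documents with the highest word overlap with the query."""
    ranked = sorted(documents, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]


def build_prompt(query, context_docs):
    """Combine retrieved documents and the question into a single prompt."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


documents = [
    "Gemma is a family of open models from Google.",
    "Kubeflow Pipelines orchestrates ML workflows.",
]
query = "What is Gemma?"
prompt = build_prompt(query, retrieve(query, documents, k=1))
print(prompt)
```

Grounding the prompt in retrieved text is what lets the model answer from your own documents instead of relying only on what it learned during training.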
Building a Dynamic Podcast Generator Inspired by Google’s NotebookLM and Illuminate (video | repository) by AI/ML GDE Sascha Heyer (Germany) walks you through his journey of building this dynamic podcast generator, explaining the code and technologies involved. This post received 239+ claps 👏 on Google Cloud Medium.
NotebookLM one of the BEST AI products from Google 🤯 (Spanish) by AI/ML GDE Carlos Alarcon (Colombia) is a step-by-step tutorial on creating podcasts with AI easily and quickly. He introduces NotebookLM and shows how it lets you generate podcasts from your documents, links, and even YouTube videos.
Responsible AI
AI for Good: Ethical AI in Action: Advanced Fine-Tuning (advanced) by AI/ML GDE David Cardozo (Canada) was a workshop that delved into advanced fine-tuning strategies using Gemma and VertexAI while addressing critical ethical considerations in AI development. Through hands-on exploration of methods such as LoRA with Keras and distributed tuning using Keras and JAX/Flax, participants learned how to create performant, scalable models from tuning guidelines. Topics included dataset bias mitigation and efficient resource usage (TPUs), ensuring AI solutions are both innovative and responsible.
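As background on LoRA, one of the methods mentioned above: instead of updating a full weight matrix W, LoRA trains two small matrices B and A of rank r and uses W + (alpha / r) · B·A as the effective weight. The sketch below shows only that arithmetic in plain Python with toy numbers; it is not the workshop’s Keras code, and the helper names and values are illustrative assumptions.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the adapter rank."""
    rank = len(a)  # A has shape (r, d_in)
    scale = alpha / rank
    delta = matmul(b, a)  # (d_out, r) @ (r, d_in) -> (d_out, d_in)
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


# Frozen 4x4 base weight (identity, for readability).
w = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
# Rank-1 adapter: A is (1, 4) and B is (4, 1), so 8 trainable values
# stand in for the 16 values of a full update.
a = [[1.0, 2.0, 3.0, 4.0]]
b = [[0.5], [0.0], [0.0], [0.0]]

merged = lora_effective_weight(w, a, b, alpha=2.0)
trainable = len(a) * len(a[0]) + len(b) * len(b[0])
print(merged[0], trainable)
```

The savings grow with layer size: for a d×d layer the adapter holds 2·r·d values instead of d², which is why LoRA makes fine-tuning large models feasible on modest hardware.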
Responsible AI and Gemma Scope by AI/ML GDE Xiaohu Zhu (China) was about Google’s Responsible AI approach and related RAI toolkit. He also gave an introduction to Gemma Scope, a mechanistic interpretability tool for LLMs and showcased Gemma Scope examples.
ML Ecosystem
1000+ graduated ML Bootcamp Korea
A total of 1,061 developers out of 1,638 have graduated from ML Bootcamp Korea, marking a significant milestone in the program’s impact on AI talent development since 2020. Also, a 2023 graduate recently joined Google as a SWE! Graduates completed a curriculum on Google AI products, including the Coursera DL Specialization covering TensorFlow and Kaggle. The 2024 cohort of 334 graduates also implemented 211 Gemma projects!
[Oct] ML Community — Highlights and Achievements was originally published in Google Developer Experts on Medium, where people are continuing the conversation by highlighting and responding to this story.