[ML Story] Leveraging VertexAI and “Clever” Prompts for Textual Data Insights


A step-by-step tutorial on using the VertexAI Python APIs for restaurant review analysis.

Problem Statement

Imagine you’re the owner of a popular restaurant with many reviews. You’d like to gain valuable insights from these reviews, such as identifying the most popular menu items, understanding what customers appreciate most about your restaurant, and pinpointing areas for improvement.

Instead of manually sifting through hundreds of reviews, you can harness the power of generative AI to extract insights from the data.

Let’s explore how to achieve this using VertexAI Python APIs.

Before we proceed, it’s essential to clarify two fundamental concepts that recur throughout this tutorial:

  • Prompt Engineering: Crafting well-defined prompts that instruct the AI model and guide it toward meaningful responses to your queries or directives. It is pivotal to successful generative AI analysis.
  • Token Limit Challenges: VertexAI models accept only a finite number of input tokens per request.
    ⚠️ Combining ALL reviews in one request ➡️ exceeds the model’s input token limit.
    ⚠️ Sending EACH review individually to the API ➡️ is inefficient, as each call uses only a small fraction of the token limit.

To address this constraint, we will process the reviews in manageable batches (chunks), each sized to fit within the token limit.
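To make the trade-off concrete, here is a minimal sketch of the budget arithmetic. It assumes a rough rule of thumb of about four characters per token and a hypothetical limit of 8,192 input tokens; neither number comes from this article, so check the documented limit of the model you actually use.

```python
from typing import Dict, List

# Illustrative assumptions, not values taken from the article.
CHARS_PER_TOKEN = 4
INPUT_TOKEN_LIMIT = 8192


def estimate_tokens(text: str) -> int:
    """Crude estimate: roughly one token per four characters."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def budget_report(prompt: str, reviews: List[str]) -> Dict[str, object]:
    """Compare the two extremes: one giant request vs. one request per review."""
    prompt_tokens = estimate_tokens(prompt)
    review_tokens = [estimate_tokens(r) for r in reviews]
    all_in_one = prompt_tokens + sum(review_tokens)
    largest_single = prompt_tokens + max(review_tokens, default=0)
    return {
        # Everything in a single request: may blow past the limit.
        "all_in_one_tokens": all_in_one,
        "fits_in_one_request": all_in_one <= INPUT_TOKEN_LIMIT,
        # One review per request: fits, but wastes most of the budget.
        "largest_single_request_tokens": largest_single,
        "single_request_utilization": largest_single / INPUT_TOKEN_LIMIT,
    }
```

If `fits_in_one_request` is false while `single_request_utilization` is tiny, batching is the sensible middle ground.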

Original Workflow Proposal: Sending all reviews through the API.

Step 1: Prepare the Dataset

To follow along with this tutorial, you can use a subsample of the Yelp Business data as a stand-in for your restaurant’s reviews.
You can find the sample data here.

review-analysis-genai/data/yelp_academic_dataset_businesses_reviews_sample.csv at main · ashmibanerjee/review-analysis-genai

This dataset is already cleaned and ready for analysis. 🎉
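As a rough sketch of loading the reviews, the snippet below reads a CSV with Python’s standard csv module. The column name `text` and the sample rows are assumptions made up for illustration; check the header of the actual CSV file and adjust accordingly.

```python
import csv
import io
from typing import List, TextIO


def load_reviews(csv_file: TextIO, text_column: str = "text") -> List[str]:
    """Return the non-empty review texts from an open CSV file.

    `text_column` is an assumed column name; adjust it to the real header.
    """
    reader = csv.DictReader(csv_file)
    reviews = []
    for row in reader:
        text = (row.get(text_column) or "").strip()
        if text:  # skip rows without review text
            reviews.append(text)
    return reviews


# Made-up stand-in for the real file, so the sketch runs on its own.
sample_csv = io.StringIO(
    "business_id,text\n"
    'b1,"Great tacos, slow service."\n'
    "b1,\n"
    'b2,"Loved the ramen."\n'
)
reviews = load_reviews(sample_csv)

# With the downloaded file you would instead do something like:
# with open("path/to/sample.csv", newline="", encoding="utf-8") as f:
#     reviews = load_reviews(f)
```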

Our tech stack: sample reviews from the Yelp Business dataset [1], Python, and VertexAI

Step 2: Set Up VertexAI on Google Cloud

To access VertexAI locally from your Jupyter notebooks, you first need to set up VertexAI on Google Cloud. You can follow the setup tutorial here.

[ML Story] Exploring the Future of Text Generation: A Deep Dive into Vertex AI using Jupyter…
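Once the setup is done, a single call to a text model might look like the sketch below. It assumes the `google-cloud-aiplatform` package is installed, that you are authenticated, and that a PaLM text model such as `text-bison` is available in your project. The project ID, region, model name, and generation parameters are all placeholders rather than values from this article, and model availability changes over time.

```python
def generate_text(prompt: str,
                  project: str = "your-gcp-project-id",    # placeholder
                  location: str = "us-central1",           # placeholder region
                  model_name: str = "text-bison") -> str:  # assumed model name
    """Send one prompt to a Vertex AI text model and return its text reply."""
    # Imported lazily so this function can be defined without the SDK installed.
    import vertexai
    from vertexai.language_models import TextGenerationModel

    vertexai.init(project=project, location=location)
    model = TextGenerationModel.from_pretrained(model_name)
    # temperature and max_output_tokens are illustrative choices.
    response = model.predict(prompt, temperature=0.2, max_output_tokens=512)
    return response.text
```

A function with this shape (prompt in, text out) is convenient because the batching logic can treat the model as a plain callable.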

Step 3: Process Reviews in Batches

Now, let’s start processing the restaurant reviews in batches. We’ll use Python and the VertexAI API to accomplish this. Here’s a breakdown of the code, followed by its explanation.

https://medium.com/media/40bd4746bee886a8f0ad99864dbe4c21/href

  1. The code divides the text data into smaller chunks or batches based on the estimated token counts. It ensures that each chunk doesn’t exceed the token limit when combined with the prompt.
  2. The prompt, a guiding statement or question, initiates the text generation. It’s designed to effectively elicit meaningful responses from the AI model.
  3. The code iterates through the text data, adding individual items to a chunk until it approaches the token limit. Once the limit is reached, the chunk is sent to the VertexAI model for text generation.
  4. The responses from the model for each chunk are collected and concatenated to create a cohesive output.
  5. If the concatenated response still exceeds the token limit, the process is repeated recursively, ensuring that the token constraints are maintained.
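The steps above can be sketched roughly as follows. This is not the article’s code: the function names, the injected `generate` and `count_tokens` callables, and the `max_rounds` safety cap are assumptions introduced for illustration, and the model call is passed in so the logic can be tried without any API access.

```python
from typing import Callable, List


def chunk_items(items: List[str], prompt: str, token_limit: int,
                count_tokens: Callable[[str], int]) -> List[List[str]]:
    """Greedily pack items into chunks that, with the prompt, fit the limit."""
    budget = token_limit - count_tokens(prompt)
    if budget <= 0:
        raise ValueError("the prompt alone exceeds the token limit")
    chunks, current, used = [], [], 0
    for item in items:
        n = count_tokens(item)
        if current and used + n > budget:
            chunks.append(current)  # chunk is full: close it
            current, used = [], 0
        # Note: a single oversized item still ends up in a chunk of its own.
        current.append(item)
        used += n
    if current:
        chunks.append(current)
    return chunks


def summarize(items: List[str], prompt: str,
              generate: Callable[[str], str], token_limit: int,
              count_tokens: Callable[[str], int], max_rounds: int = 5) -> str:
    """Run the prompt over each chunk, then recurse on the responses if needed."""
    chunks = chunk_items(items, prompt, token_limit, count_tokens)
    responses = [generate(prompt + "\n" + "\n".join(chunk)) for chunk in chunks]
    combined = "\n".join(responses)
    too_long = count_tokens(prompt) + count_tokens(combined) > token_limit
    # max_rounds guards against responses that never get shorter.
    if too_long and len(chunks) > 1 and max_rounds > 1:
        return summarize(responses, prompt, generate, token_limit,
                         count_tokens, max_rounds - 1)
    return combined
```

Passing `generate` and `count_tokens` as arguments keeps the chunking logic independent of any particular model or tokenizer.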

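The tiktoken library estimates token counts locally, without making API requests, so chunks can be sized before anything is sent to the model. Because tiktoken implements OpenAI’s tokenizers, its counts should be treated as estimates for VertexAI models.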
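A token counter along these lines could be built as in the sketch below. The `cl100k_base` encoding is an assumption (the article does not say which encoding it uses), and the four-characters-per-token fallback is only a rough rule of thumb for when tiktoken is not available.

```python
from typing import Callable


def approx_count(text: str) -> int:
    """Fallback estimate: roughly one token per four characters (an assumption)."""
    return max(1, len(text) // 4)


def make_token_counter(use_tiktoken: bool = True,
                       encoding_name: str = "cl100k_base") -> Callable[[str], int]:
    """Return a local token counter, preferring tiktoken when it is usable."""
    if use_tiktoken:
        try:
            import tiktoken  # third-party: pip install tiktoken
            encoding = tiktoken.get_encoding(encoding_name)
            return lambda text: len(encoding.encode(text))
        except Exception:
            # tiktoken is missing, or its encoding files could not be loaded.
            pass
    return approx_count
```

Either counter runs entirely on your machine, so you can size every chunk before spending a single API call.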

Step 4: Utilizing “Clever” Prompt Engineering

You can now call get_final_response_prompt_engg(prompt, items) with your desired prompt and restaurant reviews to generate insights based on your questions.
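As an illustration of what such prompts might look like, the sketch below builds one prompt per question from the problem statement. The wording, the `build_prompt` helper, and the output-format instruction are hypothetical examples, not the prompts used in this article; the commented-out call shows where `get_final_response_prompt_engg` would come in.

```python
def build_prompt(question: str, max_points: int = 5) -> str:
    """Combine a role, a task, and an output format into one instruction."""
    return (
        "You are analyzing customer reviews of a restaurant.\n"
        f"Task: {question}\n"
        f"Answer with at most {max_points} short bullet points, "
        "based only on the reviews provided below.\n"
        "Reviews:"
    )


# One question per insight mentioned in the problem statement.
QUESTIONS = {
    "popular_items": "Which menu items are mentioned most often, and how are they rated?",
    "strengths": "What do customers appreciate most about the restaurant?",
    "improvements": "Which problems or complaints come up repeatedly?",
}

prompts = {name: build_prompt(q) for name, q in QUESTIONS.items()}

# With the article's function and a list of review strings in `items`:
# insights = {name: get_final_response_prompt_engg(p, items)
#             for name, p in prompts.items()}
```

Stating the role, the task, and the expected output format explicitly tends to make responses from different chunks easier to combine.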

If you’d like to delve deeper into prompt engineering strategies, you can explore more here.

So, let your creativity flow and craft prompts that spark innovation, allowing you to unearth previously undiscovered revelations within your data.
The possibilities are boundless, and the insights are waiting to be discovered! 🚀

The source code on GitHub can be accessed here.
The references and further readings on this topic have been summarized here.

If you like the article, please subscribe to receive my latest ones.
To get in touch, contact me on LinkedIn or via ashmibanerjee.com.


[ML Story] Leveraging VertexAI and “Clever” Prompts for Textual Data Insights was originally published in Google Developer Experts on Medium, where people are continuing the conversation by highlighting and responding to this story.
