Multi-modal RAG
This notebook demonstrates using LangChain, Astra DB Serverless, and a Google Gemini Pro Vision model to perform multi-modal Retrieval-Augmented Generation (RAG).
Prerequisites
You will need an vector-enabled Astra DB Serverless database and a Google Cloud Platform account.
-
Create an Astra vector database.
-
Within your database, create an Astra DB Access Token with Database Administrator permissions.
-
Get your Astra DB Serverless API Endpoint:
-
Create a GCP account and a service account. Download your service account’s GCP credentials as a JSON file. For more on service account authorization, see the GCP documentation.
-
Install the following dependencies:
pip install google-cloud-aiplatform ragstack-ai --upgrade
See the Notebook Prerequisites page for more details.
Configure Astra DB Serverless and GCP credentials
Export these values in the terminal where you’re running this application. If you’re running this in Google Colab, you’ll be prompted for these values in the Colab environment.
export ASTRA_DB_APPLICATION_TOKEN=AstraCS: ...
export ASTRA_DB_API_ENDPOINT=https://<ASTRA_DB_ID>-<ASTRA_DB_REGION>.apps.astra.datastax.com
export GOOGLE_APPLICATION_CREDENTIALS=<path to your GCP credentials JSON file>
Use Gemini Pro Vision to identify an item
Let’s see if Gemini Pro Vision can identify a part to an espresso machine and tell you where to get a replacement.
-
Download a picture of the espresso machine’s part. If you were running this in Colab, the picture would be saved in the Google Colab filesystem. Here, you’ll save it locally.
import requests source_img_data = requests.get( 'https://drive.google.com/uc?export=view&id=15ddcn-AIxpvRdWcFGvIr77XLWdo4Maof', timeout=30, ).content with open('coffee_maker_part.png', 'wb') as handler: handler.write(source_img_data)
-
Ask Gemini Pro Vision to identify the part in the picture.
-
Python
-
Result
from langchain.chat_models import ChatVertexAI from langchain.schema.messages import HumanMessage import os llm = ChatVertexAI(model_name="gemini-pro-vision") from vertexai.vision_models import MultiModalEmbeddingModel, Image image_message = { "type": "image_url", "image_url": {"url": "coffee_maker_part.png"}, } text_message = { "type": "text", "text": "What's in the image? Please share a link to purchase a replacement", } message = HumanMessage(content=[text_message, image_message]) output = llm([message]) print(output.content)
This is a bottomless portafilter basket. It is used to hold the ground coffee in a portafilter. You can purchase a replacement here: https://www.amazon.com/Bottomless-Portafilter-Basket-Compatible-Machines/dp/B09752K44C/
-
-
Gemini correctly identified the part! However, it returned an out-of-date link to purchase a replacement. Similarly, if the part was newer than the LLM, the LLM would not have been able to identify it. Fortunately, with Retreival Augmented Generation, you can address both of these problems!
Load a coffee parts product catalog into the vector store
To solve this problem, you need to be able to search your own database for pictures that look similar to the one provided by the user. To do that, create a database and download the referenced images. If you were running this in Colab, the images would be saved in the Google Colab filesystem. Here, you’ll be downloading them locally. Don’t worry, there are only ~15 PNG files.
-
Create a pandas dataframe with the product name, url, price, and image url.
-
Python
-
Result
import pandas as pd d = {'name': ["Saucer", "Saucer Ceramic", "Milk Jug Assembly", "Handle Steam Wand Kit (New Version From 0735 PDC)", "Spout Juice Small (From 0637 to 1041 PDC)", "Cleaning Steam Wand", "Jug Frothing", "Spoon Tamping 50mm", "Collar Grouphead 50mm", "Filter 2 Cup Dual Wall 50mm", "Filter 1 Cup 50mm", "Water Tank Assembly", "Portafilter Assembly 50mm", "Milk Jug Assembly", "Filter 2 Cup 50mm" ], 'url': ["https://www.breville.com/us/en/parts-accessories/parts/sp0014946.html?sku=SP0014946", "https://www.breville.com/us/en/parts-accessories/parts/sp0014914.html?sku=SP0014914", "https://www.breville.com/us/en/parts-accessories/parts/sp0011391.html?sku=SP0011391", "https://www.breville.com/us/en/parts-accessories/parts/sp0010719.html?sku=SP0010719", "https://www.breville.com/us/en/parts-accessories/parts/sp0010718.html?sku=SP0010718", "https://www.breville.com/us/en/parts-accessories/parts/sp0003247.html?sku=SP0003247", "https://www.breville.com/us/en/parts-accessories/parts/sp0003246.html?sku=SP0003246", "https://www.breville.com/us/en/parts-accessories/parts/sp0003243.html?sku=SP0003243", "https://www.breville.com/us/en/parts-accessories/parts/sp0003232.html?sku=SP0003232", "https://www.breville.com/us/en/parts-accessories/parts/sp0003231.html?sku=SP0003231", "https://www.breville.com/us/en/parts-accessories/parts/sp0003230.html?sku=SP0003230", "https://www.breville.com/us/en/parts-accessories/parts/sp0003225.html?sku=SP0003225", "https://www.breville.com/us/en/parts-accessories/parts/sp0003216.html?sku=SP0003216", "https://www.breville.com/us/en/parts-accessories/parts/sp0001875.html?sku=SP0001875", "https://www.breville.com/us/en/parts-accessories/parts/sp0000166.html?sku=SP0000166"], 'price': ["10.95", "4.99", "14.95", "8.95", "10.95", "6.95", "24.95", "8.95", "6.95", "12.95", "12.95", "14.95", "10.95", "16.95", "11.95"], 'image': ["https://www.breville.com/content/dam/breville/us/catalog/products/images/sp0/sp0014946/tile.jpg", "https://www.breville.com/content/dam/breville/us/catalog/products/images/sp0/sp0014914/tile.jpg", "https://www.breville.com/content/dam/breville/us/catalog/products/images/sp0/sp0011391/tile.jpg", "https://www.breville.com/content/dam/breville/ca/catalog/products/images/sp0/sp0010719/tile.jpg", "https://www.breville.com/content/dam/breville/ca/catalog/products/images/sp0/sp0010718/tile.jpg", "https://www.breville.com/content/dam/breville/ca/catalog/products/images/sp0/sp0003247/tile.jpg", "https://assets.breville.com/cdn-cgi/image/width=400,format=auto/Spare+Parts+/Espresso+Machines/BES250/SP0003246/SP0003246_IMAGE1_400X400.jpg", "https://assets.breville.com/cdn-cgi/image/width=400,format=auto/Spare+Parts+/Espresso+Machines/ESP8/SP0003243/SP0003243_IMAGE1_400X400.jpg", "https://assets.breville.com/cdn-cgi/image/width=400,format=auto/Spare+Parts+/Espresso+Machines/ESP8/SP0003232/SP0003232_IMAGE1_400x400.jpg", "https://www.breville.com/content/dam/breville/au/catalog/products/images/sp0/sp0003231/tile.jpg", "https://www.breville.com/content/dam/breville/au/catalog/products/images/sp0/sp0003230/tile.jpg", "https://www.breville.com/content/dam/breville/ca/catalog/products/images/sp0/sp0003225/tile.jpg", "https://www.breville.com/content/dam/breville/ca/catalog/products/images/sp0/sp0003216/tile.jpg", "https://www.breville.com/content/dam/breville/au/catalog/products/images/sp0/sp0001875/tile.jpg", "https://www.breville.com/content/dam/breville/us/catalog/products/images/sp0/sp0000166/tile.jpg"]} df = pd.DataFrame(data=d) print(df)
0 Saucer https://www.breville.com/us/en/parts-accessori... 10.95 https://www.breville.com/content/dam/breville/... 1 Saucer Ceramic https://www.breville.com/us/en/parts-accessori... 4.99 https://www.breville.com/content/dam/breville/... 2 Milk Jug Assembly https://www.breville.com/us/en/parts-accessori... 14.95 https://www.breville.com/content/dam/breville/... 3 Handle Steam Wand Kit (New Version From 0735 PDC) https://www.breville.com/us/en/parts-accessori... 8.95 https://www.breville.com/content/dam/breville/... 4 Spout Juice Small (From 0637 to 1041 PDC) https://www.breville.com/us/en/parts-accessori... 10.95 https://www.breville.com/content/dam/breville/... 5 Cleaning Steam Wand https://www.breville.com/us/en/parts-accessori... 6.95 https://www.breville.com/content/dam/breville/... 6 Jug Frothing https://www.breville.com/us/en/parts-accessori... 24.95 https://assets.breville.com/cdn-cgi/image/widt... 7 Spoon Tamping 50mm https://www.breville.com/us/en/parts-accessori... 8.95 https://assets.breville.com/cdn-cgi/image/widt... 8 Collar Grouphead 50mm https://www.breville.com/us/en/parts-accessori... 6.95 https://assets.breville.com/cdn-cgi/image/widt... 9 Filter 2 Cup Dual Wall 50mm https://www.breville.com/us/en/parts-accessori... 12.95 https://www.breville.com/content/dam/breville/... 10 Filter 1 Cup 50mm https://www.breville.com/us/en/parts-accessori... 12.95 https://www.breville.com/content/dam/breville/... 11 Water Tank Assembly https://www.breville.com/us/en/parts-accessori... 14.95 https://www.breville.com/content/dam/breville/... 12 Portafilter Assembly 50mm https://www.breville.com/us/en/parts-accessori... 10.95 https://www.breville.com/content/dam/breville/... 13 Milk Jug Assembly https://www.breville.com/us/en/parts-accessori... 16.95 https://www.breville.com/content/dam/breville/... 14 Filter 2 Cup 50mm https://www.breville.com/us/en/parts-accessori... 11.95 https://www.breville.com/content/dam/breville/...
-
-
Create vector embeddings of each of the product images using Google’s Multi-Modal Embedding Model and save the data in AstraDB.
-
Python
-
Result
import json, requests from vertexai.preview.vision_models import MultiModalEmbeddingModel, Image from astrapy.db import AstraDB model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding@001") # Initialize the vector db astra_db = AstraDB(token=os.getenv("ASTRA_DB_APPLICATION_TOKEN"), api_endpoint=os.getenv("ASTRA_DB_API_ENDPOINT")) collection = astra_db.create_collection(collection_name="coffee_shop_ecommerce", dimension=1408) for i in range(len(df)): name = df.loc[i, "name"] image = df.loc[i, "image"] price = df.loc[i, "price"] url = df.loc[i, "url"] # Download this product's image and save it to the Colab filesystem. # In a production system this binary data would be stored in Google Cloud Storage img_data = requests.get(image, timeout=30).content with open(f'{name}.png', 'wb') as handler: handler.write(img_data) # load the image from filesystem and compute the embedding value img = Image.load_from_file(f'{name}.png') embeddings = model.get_embeddings(image=img, contextual_text=name) try: # add to the AstraDB Vector Database collection.insert_one({ "_id": i, "name": name, "image": image, "url": url, "price": price, "$vector": embeddings.image_embedding, }) except Exception as error: # if you've already added this record, skip the error message error_info = json.loads(str(error)) if error_info[0]['errorCode'] == "DOCUMENT_ALREADY_EXISTS": print(f"Document {name} already exists in the database. Skipping.")
Cleaning Steam Wand.png Collar Grouphead 50mm.png Filter 1 Cup 50mm.png Filter 2 Cup 50mm.png Filter 2 Cup Dual Wall 50mm.png Handle Steam Wand Kit (New Version From 0735 PDC).png Jug Frothing.png Milk Jug Assembly.png Portafilter Assembly 50mm.png Saucer Ceramic.png Saucer.png Spoon Tamping 50mm.png Spout Juice Small (From 0637 to 1041 PDC).png Water Tank Assembly.png coffee_maker_part.png
-
Create a multi-modal RAG chain
-
Ask the LLM the same question, but this time you’ll perform a vector search against Astra DB Serverless using the same image to supply the LLM with relevant products in the prompt.
-
Python
-
Result
# Embed the similar item img = Image.load_from_file('coffee_maker_part.png') embeddings = model.get_embeddings(image=img, contextual_text="A espresso machine part") # Perform the vector search against AstraDB Vector documents = collection.vector_find( embeddings.image_embedding, limit=3, ) related_products_csv = "name, image, price, url\n" for doc in documents: related_products_csv += f"{doc['name']}, {doc['image']}, {doc['price']}, {doc['url']},\n" image_message = { "type": "image_url", "image_url": {"url": "coffee_maker_part.png"}, } text_message = { "type": "text", "text": f"""Given this image, please choose a possible replacement. Include link and price. Here are possible replacements: {related_products_csv}""", } message = HumanMessage(content=[text_message, image_message]) output = llm([message]) print(output.content)
Filter 2 Cup 50mm, https://www.breville.com/content/dam/breville/us/catalog/products/images/sp0/sp0000166/tile.jpg, 11.95, https://www.breville.com/us/en/parts-accessories/parts/sp0000166.html?sku=SP0000166
-
-
Presto! Gemini correctly identified the part, and now provides a current link to purchase a replacement. This is because the LLM was able to use vector search on fresh data to find the most similar product in the catalog.
Cleanup
To delete the created coffee_shop_ecommerce
collection and its data, run the following command with your Astra token and endpoint:
-
Curl
-
Result
curl -v -s --location \
--request POST "https://<ASTRA_DB_ID>-<ASTRA_DB_REGION>.apps.astra.datastax.com/api/json/v1/default_keyspace" \
--header "X-Cassandra-Token: AstraCS: ..." \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
"deleteCollection": {
"name": "coffee_shop_ecommerce"
}
}'
* Connection #0 to host https://<ASTRA_DB_ID>-<ASTRA_DB_REGION>.apps.astra.datastax.com left intact
{"status":{"ok":1}}%
Complete code example
Python
import os
import json
import requests
import pandas as pd
from getpass import getpass
from langchain.chat_models import ChatVertexAI
from langchain.schema.messages import HumanMessage
from vertexai.preview.vision_models import MultiModalEmbeddingModel, Image
from astrapy.db import AstraDB
source_img_data = requests.get(
'https://drive.google.com/uc?export=view&id=15ddcn-AIxpvRdWcFGvIr77XLWdo4Maof',
timeout=30,
).content
with open('coffee_maker_part.png', 'wb') as handler:
handler.write(source_img_data)
llm = ChatVertexAI(model_name="gemini-pro-vision")
image_message = {
"type": "image_url",
"image_url": {"url": "coffee_maker_part.png"},
}
text_message = {
"type": "text",
"text": "What's in the image? Please share a link to purchase a replacement",
}
message = HumanMessage(content=[text_message, image_message])
output = llm([message])
print(output.content)
d = {'name': ["Saucer", "Saucer Ceramic", "Milk Jug Assembly", "Handle Steam Wand Kit (New Version From 0735 PDC)", "Spout Juice Small (From 0637 to 1041 PDC)", "Cleaning Steam Wand", "Jug Frothing", "Spoon Tamping 50mm", "Collar Grouphead 50mm", "Filter 2 Cup Dual Wall 50mm", "Filter 1 Cup 50mm", "Water Tank Assembly", "Portafilter Assembly 50mm", "Milk Jug Assembly", "Filter 2 Cup 50mm"],
'url': ["https://www.breville.com/us/en/parts-accessories/parts/sp0014946.html?sku=SP0014946", "https://www.breville.com/us/en/parts-accessories/parts/sp0014914.html?sku=SP0014914", "https://www.breville.com/us/en/parts-accessories/parts/sp0011391.html?sku=SP0011391", "https://www.breville.com/us/en/parts-accessories/parts/sp0010719.html?sku=SP0010719", "https://www.breville.com/us/en/parts-accessories/parts/sp0010718.html?sku=SP0010718", "https://www.breville.com/us/en/parts-accessories/parts/sp0003247.html?sku=SP0003247", "https://www.breville.com/us/en/parts-accessories/parts/sp0003246.html?sku=SP0003246", "https://www.breville.com/us/en/parts-accessories/parts/sp0003243.html?sku=SP0003243", "https://www.breville.com/us/en/parts-accessories/parts/sp0003232.html?sku=SP0003232", "https://www.breville.com/us/en/parts-accessories/parts/sp0003231.html?sku=SP0003231", "https://www.breville.com/us/en/parts-accessories/parts/sp0003230.html?sku=SP0003230", "https://www.breville.com/us/en/parts-accessories/parts/sp0003225.html?sku=SP0003225", "https://www.breville.com/us/en/parts-accessories/parts/sp0003216.html?sku=SP0003216", "https://www.breville.com/us/en/parts-accessories/parts/sp0001875.html?sku=SP0001875", "https://www.breville.com/us/en/parts-accessories/parts/sp0000166.html?sku=SP0000166"],
'price': ["10.95", "4.99", "14.95", "8.95", "10.95", "6.95", "24.95", "8.95", "6.95", "12.95", "12.95", "14.95", "10.95", "16.95", "11.95"],
'image': ["https://www.breville.com/content/dam/breville/us/catalog/products/images/sp0/sp0014946/tile.jpg", "https://www.breville.com/content/dam/breville/us/catalog/products/images/sp0/sp0014914/tile.jpg", "https://www.breville.com/content/dam/breville/us/catalog/products/images/sp0/sp0011391/tile.jpg", "https://www.breville.com/content/dam/breville/ca/catalog/products/images/sp0/sp0010719/tile.jpg", "https://www.breville.com/content/dam/breville/ca/catalog/products/images/sp0/sp0010718/tile.jpg", "https://www.breville.com/content/dam/breville/ca/catalog/products/images/sp0/sp0003247/tile.jpg", "https://assets.breville.com/cdn-cgi/image/width=400,format=auto/Spare+Parts+/Espresso+Machines/BES250/SP0003246/SP0003246_IMAGE1_400X400.jpg", "https://assets.breville.com/cdn-cgi/image/width=400,format=auto/Spare+Parts+/Espresso+Machines/ESP8/SP0003243/SP0003243_IMAGE1_400X400.jpg", "https://assets.breville.com/cdn-cgi/image/width=400,format=auto/Spare+Parts+/Espresso+Machines/ESP8/SP0003232/SP0003232_IMAGE1_400x400.jpg", "https://www.breville.com/content/dam/breville/au/catalog/products/images/sp0/sp0003231/tile.jpg", "https://www.breville.com/content/dam/breville/au/catalog/products/images/sp0/sp0003230/tile.jpg", "https://www.breville.com/content/dam/breville/ca/catalog/products/images/sp0/sp0003225/tile.jpg", "https://www.breville.com/content/dam/breville/ca/catalog/products/images/sp0/sp0003216/tile.jpg", "https://www.breville.com/content/dam/breville/au/catalog/products/images/sp0/sp0001875/tile.jpg", "https://www.breville.com/content/dam/breville/us/catalog/products/images/sp0/sp0000166/tile.jpg"]}
df = pd.DataFrame(data=d)
print(df)
model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding@001")
# Initialize the vector db
astra_db = AstraDB(token=os.getenv("ASTRA_DB_APPLICATION_TOKEN"), api_endpoint=os.getenv("ASTRA_DB_API_ENDPOINT"))
collection = astra_db.create_collection(collection_name="coffee_shop_ecommerce", dimension=1408)
for i in range(len(df)):
name = df.loc[i, "name"]
image = df.loc[i, "image"]
price = df.loc[i, "price"]
url = df.loc[i, "url"]
# Download this product's image and save it to your local filesystem.
# In a production system this binary data would be stored in Google Cloud Storage
img_data = requests.get(image, timeout=30).content
with open(f'{name}.png', 'wb') as handler:
handler.write(img_data)
# load the image from filesystem and compute the embedding value
img = Image.load_from_file(f'{name}.png')
embeddings = model.get_embeddings(image=img, contextual_text=name)
try:
# add to the AstraDB Vector Database
collection.insert_one({
"_id": i,
"name": name,
"image": image,
"url": url,
"price": price,
"$vector": embeddings.image_embedding,
})
except Exception as error:
# if you've already added this record, skip the error message
error_info = json.loads(str(error))
if error_info[0]['errorCode'] == "DOCUMENT_ALREADY_EXISTS":
print(f"Document {name} already exists in the database. Skipping.")
img = Image.load_from_file('coffee_maker_part.png')
embeddings = model.get_embeddings(image=img, contextual_text="A espresso machine part")
documents = collection.vector_find(
embeddings.image_embedding,
limit=3,
)
related_products_csv = "name, image, price, url\n"
for doc in documents:
related_products_csv += f"{doc['name']}, {doc['image']}, {doc['price']}, {doc['url']},\n"
image_message = {
"type": "image_url",
"image_url": {"url": "coffee_maker_part.png"},
}
text_message = {
"type": "text",
"text": f"""Given this image, please choose a possible replacement. Include link and price. Here are possible replacements: {related_products_csv}""",
}
message = HumanMessage(content=[text_message, image_message])
output = llm([message])
print(output.content)