Multi-modal RAG

This notebook demonstrates using LangChain, Astra DB Serverless, and a Google Gemini Pro Vision model to perform multi-modal Retrieval-Augmented Generation (RAG).

Prerequisites

You will need a vector-enabled Astra DB Serverless database and a Google Cloud Platform account.

See the Notebook Prerequisites page for more details.

Configure Astra DB Serverless and GCP credentials

Export these values in the terminal where you’re running this application. If you’re running this in Google Colab, you’ll be prompted for these values in the Colab environment.

export ASTRA_DB_APPLICATION_TOKEN=AstraCS: ...
export ASTRA_DB_API_ENDPOINT=https://<ASTRA_DB_ID>-<ASTRA_DB_REGION>.apps.astra.datastax.com
export GOOGLE_APPLICATION_CREDENTIALS=<path to your GCP credentials JSON file>
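
Before running the Python steps, it can help to confirm that these variables are actually visible to the Python process. The following is a minimal sketch and not part of the original notebook; the helper name `missing_env_vars` is hypothetical, and only the three variable names come from the exports above.

```python
import os

# The three variables exported in the step above.
REQUIRED_VARS = (
    "ASTRA_DB_APPLICATION_TOKEN",
    "ASTRA_DB_API_ENDPOINT",
    "GOOGLE_APPLICATION_CREDENTIALS",
)


def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Return the names from `names` that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Set these variables before continuing: " + ", ".join(missing))
```

Checking up front gives a clearer message than a failed connection later in the notebook.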

Use Gemini Pro Vision to identify an item

Let’s see if Gemini Pro Vision can identify a part of an espresso machine and tell you where to get a replacement.

  1. Download a picture of the espresso machine’s part. If you were running this in Colab, the picture would be saved in the Google Colab filesystem. Here, you’ll save it locally.

    import requests
    
    source_img_data = requests.get('https://drive.google.com/uc?export=view&id=15ddcn-AIxpvRdWcFGvIr77XLWdo4Maof').content
    with open('coffee_maker_part.png', 'wb') as handler:
      handler.write(source_img_data)
  2. Ask Gemini Pro Vision to identify the part in the picture.

    • Python

    • Result

    import os
    
    from langchain.chat_models import ChatVertexAI
    from langchain.schema.messages import HumanMessage
    
    # Gemini Pro Vision accepts both text and image inputs
    llm = ChatVertexAI(model_name="gemini-pro-vision")
    
    # Build one multi-modal message from an image part and a text part
    image_message = {
        "type": "image_url",
        "image_url": {"url": "coffee_maker_part.png"},
    }
    text_message = {
        "type": "text",
        "text": "What's in the image? Please share a link to purchase a replacement",
    }
    message = HumanMessage(content=[text_message, image_message])
    
    output = llm([message])
    print(output.content)
    This is a bottomless portafilter basket. It is used to hold the ground coffee in a portafilter. You can purchase a replacement here: https://www.amazon.com/Bottomless-Portafilter-Basket-Compatible-Machines/dp/B09752K44C/
  3. Gemini correctly identified the part! However, it returned an out-of-date link to purchase a replacement. Similarly, if the part were newer than the LLM's training data, the LLM would not have been able to identify it. Fortunately, with Retrieval-Augmented Generation, you can address both of these problems!
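
The image message in step 2 passes a local file path as the image URL. Whether the chat model accepts a local path, a base64 data URI, or only a Cloud Storage URI depends on the installed version of the integration, so treat the following as a sketch under that assumption. The helper name `image_to_data_uri` is hypothetical; it uses only the Python standard library.

```python
import base64
import mimetypes


def image_to_data_uri(path):
    """Read a local image file and return it as a base64 data URI."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None:
        # Fall back to a generic type when the extension is unknown.
        mime_type = "application/octet-stream"
    with open(path, "rb") as image_file:
        encoded = base64.b64encode(image_file.read()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Possible usage with the message from step 2 (assumes the file exists):
# image_message = {
#     "type": "image_url",
#     "image_url": {"url": image_to_data_uri("coffee_maker_part.png")},
# }
```

Embedding the bytes in the message removes any dependence on where the code runs, at the cost of a larger request.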

Load a coffee parts product catalog into the vector store

To solve this problem, you need to be able to search your own database for pictures that look similar to the one provided by the user. To do that, create a database and download the referenced images. If you were running this in Colab, the images would be saved in the Google Colab filesystem. Here, you’ll be downloading them locally. Don’t worry, there are only ~15 PNG files.
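
To make "looks similar" concrete: each image is turned into a vector, and similar images have vectors that point in nearly the same direction. The sketch below ranks a toy catalog by cosine similarity in plain Python. It is illustrative only: the vectors are made up, the function names are hypothetical, and cosine similarity is assumed here as the metric (the metric a real collection uses is configurable). In the notebook itself, Astra DB performs this search for you.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, catalog, k=3):
    """Return the names of the k catalog entries most similar to `query`."""
    ranked = sorted(
        catalog.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Toy 2-dimensional "embeddings"; real image embeddings have many more dimensions.
toy_catalog = {
    "filter": [1.0, 0.0],
    "saucer": [0.0, 1.0],
    "jug": [0.7, 0.7],
}
print(top_k([0.9, 0.1], toy_catalog, k=2))
```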

  1. Create a pandas dataframe with the product name, URL, price, and image URL.

    • Python

    • Result

    import pandas as pd
    
    d = {'name': ["Saucer", "Saucer Ceramic", "Milk Jug Assembly", "Handle Steam Wand Kit (New Version From 0735 PDC)", "Spout Juice Small (From 0637 to 1041 PDC)", "Cleaning Steam Wand", "Jug Frothing", "Spoon Tamping 50mm", "Collar Grouphead 50mm", "Filter 2 Cup Dual Wall 50mm", "Filter 1 Cup 50mm", "Water Tank Assembly", "Portafilter Assembly 50mm", "Milk Jug Assembly", "Filter 2 Cup 50mm" ],
         'url': ["https://www.breville.com/us/en/parts-accessories/parts/sp0014946.html?sku=SP0014946", "https://www.breville.com/us/en/parts-accessories/parts/sp0014914.html?sku=SP0014914", "https://www.breville.com/us/en/parts-accessories/parts/sp0011391.html?sku=SP0011391", "https://www.breville.com/us/en/parts-accessories/parts/sp0010719.html?sku=SP0010719", "https://www.breville.com/us/en/parts-accessories/parts/sp0010718.html?sku=SP0010718", "https://www.breville.com/us/en/parts-accessories/parts/sp0003247.html?sku=SP0003247", "https://www.breville.com/us/en/parts-accessories/parts/sp0003246.html?sku=SP0003246", "https://www.breville.com/us/en/parts-accessories/parts/sp0003243.html?sku=SP0003243", "https://www.breville.com/us/en/parts-accessories/parts/sp0003232.html?sku=SP0003232", "https://www.breville.com/us/en/parts-accessories/parts/sp0003231.html?sku=SP0003231", "https://www.breville.com/us/en/parts-accessories/parts/sp0003230.html?sku=SP0003230", "https://www.breville.com/us/en/parts-accessories/parts/sp0003225.html?sku=SP0003225", "https://www.breville.com/us/en/parts-accessories/parts/sp0003216.html?sku=SP0003216", "https://www.breville.com/us/en/parts-accessories/parts/sp0001875.html?sku=SP0001875", "https://www.breville.com/us/en/parts-accessories/parts/sp0000166.html?sku=SP0000166"],
         'price': ["10.95", "4.99", "14.95", "8.95", "10.95", "6.95", "24.95", "8.95", "6.95", "12.95", "12.95", "14.95", "10.95", "16.95", "11.95"],
         'image': ["https://www.breville.com/content/dam/breville/us/catalog/products/images/sp0/sp0014946/tile.jpg", "https://www.breville.com/content/dam/breville/us/catalog/products/images/sp0/sp0014914/tile.jpg", "https://www.breville.com/content/dam/breville/us/catalog/products/images/sp0/sp0011391/tile.jpg", "https://www.breville.com/content/dam/breville/ca/catalog/products/images/sp0/sp0010719/tile.jpg", "https://www.breville.com/content/dam/breville/ca/catalog/products/images/sp0/sp0010718/tile.jpg", "https://www.breville.com/content/dam/breville/ca/catalog/products/images/sp0/sp0003247/tile.jpg", "https://assets.breville.com/cdn-cgi/image/width=400,format=auto/Spare+Parts+/Espresso+Machines/BES250/SP0003246/SP0003246_IMAGE1_400X400.jpg", "https://assets.breville.com/cdn-cgi/image/width=400,format=auto/Spare+Parts+/Espresso+Machines/ESP8/SP0003243/SP0003243_IMAGE1_400X400.jpg", "https://assets.breville.com/cdn-cgi/image/width=400,format=auto/Spare+Parts+/Espresso+Machines/ESP8/SP0003232/SP0003232_IMAGE1_400x400.jpg", "https://www.breville.com/content/dam/breville/au/catalog/products/images/sp0/sp0003231/tile.jpg", "https://www.breville.com/content/dam/breville/au/catalog/products/images/sp0/sp0003230/tile.jpg", "https://www.breville.com/content/dam/breville/ca/catalog/products/images/sp0/sp0003225/tile.jpg", "https://www.breville.com/content/dam/breville/ca/catalog/products/images/sp0/sp0003216/tile.jpg", "https://www.breville.com/content/dam/breville/au/catalog/products/images/sp0/sp0001875/tile.jpg", "https://www.breville.com/content/dam/breville/us/catalog/products/images/sp0/sp0000166/tile.jpg"]}
    df = pd.DataFrame(data=d)
    print(df)
    0                                              Saucer  https://www.breville.com/us/en/parts-accessori...  10.95  https://www.breville.com/content/dam/breville/...
    1                                      Saucer Ceramic  https://www.breville.com/us/en/parts-accessori...   4.99  https://www.breville.com/content/dam/breville/...
    2                                   Milk Jug Assembly  https://www.breville.com/us/en/parts-accessori...  14.95  https://www.breville.com/content/dam/breville/...
    3   Handle Steam Wand Kit (New Version From 0735 PDC)  https://www.breville.com/us/en/parts-accessori...   8.95  https://www.breville.com/content/dam/breville/...
    4           Spout Juice Small (From 0637 to 1041 PDC)  https://www.breville.com/us/en/parts-accessori...  10.95  https://www.breville.com/content/dam/breville/...
    5                                 Cleaning Steam Wand  https://www.breville.com/us/en/parts-accessori...   6.95  https://www.breville.com/content/dam/breville/...
    6                                        Jug Frothing  https://www.breville.com/us/en/parts-accessori...  24.95  https://assets.breville.com/cdn-cgi/image/widt...
    7                                  Spoon Tamping 50mm  https://www.breville.com/us/en/parts-accessori...   8.95  https://assets.breville.com/cdn-cgi/image/widt...
    8                               Collar Grouphead 50mm  https://www.breville.com/us/en/parts-accessori...   6.95  https://assets.breville.com/cdn-cgi/image/widt...
    9                         Filter 2 Cup Dual Wall 50mm  https://www.breville.com/us/en/parts-accessori...  12.95  https://www.breville.com/content/dam/breville/...
    10                                  Filter 1 Cup 50mm  https://www.breville.com/us/en/parts-accessori...  12.95  https://www.breville.com/content/dam/breville/...
    11                                Water Tank Assembly  https://www.breville.com/us/en/parts-accessori...  14.95  https://www.breville.com/content/dam/breville/...
    12                          Portafilter Assembly 50mm  https://www.breville.com/us/en/parts-accessori...  10.95  https://www.breville.com/content/dam/breville/...
    13                                  Milk Jug Assembly  https://www.breville.com/us/en/parts-accessori...  16.95  https://www.breville.com/content/dam/breville/...
    14                                  Filter 2 Cup 50mm  https://www.breville.com/us/en/parts-accessori...  11.95  https://www.breville.com/content/dam/breville/...
  2. Create a vector embedding for each product image using Google’s Multi-Modal Embedding Model and save the data in Astra DB.

    • Python

    • Result

    import json, requests
    from vertexai.preview.vision_models import MultiModalEmbeddingModel, Image
    from astrapy.db import AstraDB
    
    model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding@001")
    
    # Initialize the vector db
    astra_db = AstraDB(token=os.getenv("ASTRA_DB_APPLICATION_TOKEN"), api_endpoint=os.getenv("ASTRA_DB_API_ENDPOINT"))
    collection = astra_db.create_collection(collection_name="coffee_shop_ecommerce", dimension=1408)
    
    for i in range(len(df)):
      name = df.loc[i, "name"]
      image = df.loc[i, "image"]
      price = df.loc[i, "price"]
      url = df.loc[i, "url"]
    
      # Download this product's image and save it to your local filesystem.
      # In a production system this binary data would be stored in Google Cloud Storage
      img_data = requests.get(image).content
      with open(f'{name}.png', 'wb') as handler:
        handler.write(img_data)
    
      # load the image from filesystem and compute the embedding value
      img = Image.load_from_file(f'{name}.png')
      embeddings = model.get_embeddings(image=img, contextual_text=name)
    
      try:
        # add to the AstraDB Vector Database
        collection.insert_one({
            "_id": i,
            "name": name,
            "image": image,
            "url": url,
            "price": price,
            "$vector": embeddings.image_embedding,
          })
      except Exception as error:
        # if you've already added this record, print a notice and continue;
        # re-raise anything else instead of silently dropping it
        error_info = json.loads(str(error))
        if error_info[0]['errorCode'] == "DOCUMENT_ALREADY_EXISTS":
          print(f"Document {name} already exists in the database. Skipping.")
        else:
          raise
    Cleaning Steam Wand.png
    Collar Grouphead 50mm.png
    Filter 1 Cup 50mm.png
    Filter 2 Cup 50mm.png
    Filter 2 Cup Dual Wall 50mm.png
    Handle Steam Wand Kit (New Version From 0735 PDC).png
    Jug Frothing.png
    Milk Jug Assembly.png
    Portafilter Assembly 50mm.png
    Saucer Ceramic.png
    Saucer.png
    Spoon Tamping 50mm.png
    Spout Juice Small (From 0637 to 1041 PDC).png
    Water Tank Assembly.png
    coffee_maker_part.png
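
Note that the catalog lists "Milk Jug Assembly" twice, so saving each image as `{name}.png` writes both products to the same file; the listing above shows only one `Milk Jug Assembly.png`. One possible way around this, sketched below, is to name each file after the SKU in the product URL and keep the extension from the image URL. The helper name `local_image_name` is hypothetical, and the sketch assumes every product URL carries a `sku` query parameter, as the ones in this catalog do.

```python
import os
from urllib.parse import parse_qs, urlparse


def local_image_name(product_url, image_url):
    """Build a unique local filename from the product SKU and image extension."""
    sku_values = parse_qs(urlparse(product_url).query).get("sku")
    if not sku_values:
        raise ValueError(f"No sku parameter in {product_url}")
    # Keep the extension of the remote file (for example .jpg).
    extension = os.path.splitext(urlparse(image_url).path)[1].lower()
    return f"{sku_values[0]}{extension}"


print(local_image_name(
    "https://www.breville.com/us/en/parts-accessories/parts/sp0011391.html?sku=SP0011391",
    "https://www.breville.com/content/dam/breville/us/catalog/products/images/sp0/sp0011391/tile.jpg",
))
```

With this naming, the two "Milk Jug Assembly" entries would be stored as `SP0011391.jpg` and `SP0001875.jpg`.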

Create a multi-modal RAG chain

  1. Ask the LLM the same question, but this time you’ll perform a vector search against Astra DB Serverless using the same image to supply the LLM with relevant products in the prompt.

    • Python

    • Result

    # Embed the picture of the part you want to replace
    img = Image.load_from_file('coffee_maker_part.png')
    embeddings = model.get_embeddings(image=img, contextual_text="An espresso machine part")
    
    # Perform the vector search against Astra DB and keep the three closest products
    documents = collection.vector_find(
        embeddings.image_embedding,
        limit=3,
    )
    
    related_products_csv = "name, image, price, url\n"
    for doc in documents:
      related_products_csv += f"{doc['name']}, {doc['image']}, {doc['price']}, {doc['url']}\n"
    
    image_message = {
        "type": "image_url",
        "image_url": {"url": "coffee_maker_part.png"},
    }
    text_message = {
        "type": "text",
        "text": f"""Given this image, please choose a possible replacement. Include link and price. Here are possible replacements: {related_products_csv}""",
    }
    message = HumanMessage(content=[text_message, image_message])
    output = llm([message])
    print(output.content)
    Filter 2 Cup 50mm, https://www.breville.com/content/dam/breville/us/catalog/products/images/sp0/sp0000166/tile.jpg, 11.95, https://www.breville.com/us/en/parts-accessories/parts/sp0000166.html?sku=SP0000166
  2. Presto! Gemini correctly identified the part, and now provides a current link to purchase a replacement. This works because the vector search retrieved the most similar products from your up-to-date catalog and the prompt supplied them to the LLM, which then chose the best match.
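
The prompt above builds the product list by joining fields with commas. Some of the image URLs in this catalog contain commas themselves (for example `width=400,format=auto`), which makes the columns ambiguous. A possible alternative, sketched here with Python's standard `csv` module, quotes such fields automatically. The function name `products_to_csv` is hypothetical; the field names match the documents inserted earlier.

```python
import csv
import io


def products_to_csv(documents, fields=("name", "image", "price", "url")):
    """Serialize documents to CSV text, quoting any field that contains a comma."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(fields)
    for doc in documents:
        writer.writerow([doc[field] for field in fields])
    return buffer.getvalue()


sample_documents = [
    {
        "name": "Jug Frothing",
        "image": "https://assets.breville.com/cdn-cgi/image/width=400,format=auto/Spare+Parts+/Espresso+Machines/BES250/SP0003246/SP0003246_IMAGE1_400X400.jpg",
        "price": "24.95",
        "url": "https://www.breville.com/us/en/parts-accessories/parts/sp0003246.html?sku=SP0003246",
    }
]
print(products_to_csv(sample_documents))
```

The returned string could be used in place of `related_products_csv` in the prompt.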

Cleanup

To delete the created coffee_shop_ecommerce collection and its data, run the following command with your Astra token and endpoint:

  • Curl

  • Result

curl -v -s --location \
--request POST "https://<ASTRA_DB_ID>-<ASTRA_DB_REGION>.apps.astra.datastax.com/api/json/v1/default_keyspace" \
--header "X-Cassandra-Token: AstraCS: ..." \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
  "deleteCollection": {
    "name": "coffee_shop_ecommerce"
  }
}'
* Connection #0 to host https://<ASTRA_DB_ID>-<ASTRA_DB_REGION>.apps.astra.datastax.com left intact
{"status":{"ok":1}}
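
If you prefer to stay in Python, the same request can be prepared with the standard library. This is a sketch under assumptions: the function name `build_delete_collection_request` is hypothetical, and the URL path, headers, and payload are copied from the curl command above rather than taken from a client library. The function only builds the request; sending it is shown in the trailing comment.

```python
import json
import urllib.request


def build_delete_collection_request(api_endpoint, token, collection_name,
                                    keyspace="default_keyspace"):
    """Prepare (but do not send) the deleteCollection request."""
    payload = {"deleteCollection": {"name": collection_name}}
    return urllib.request.Request(
        url=f"{api_endpoint}/api/json/v1/{keyspace}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-Cassandra-Token": token,
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


# To send the request, read the endpoint and token from the environment:
# import os
# request = build_delete_collection_request(
#     os.environ["ASTRA_DB_API_ENDPOINT"],
#     os.environ["ASTRA_DB_APPLICATION_TOKEN"],
#     "coffee_shop_ecommerce",
# )
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```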

Complete code example

Python
import os
import json
import requests
import pandas as pd
from langchain.chat_models import ChatVertexAI
from langchain.schema.messages import HumanMessage
from vertexai.preview.vision_models import MultiModalEmbeddingModel, Image
from astrapy.db import AstraDB

source_img_data = requests.get('https://drive.google.com/uc?export=view&id=15ddcn-AIxpvRdWcFGvIr77XLWdo4Maof').content
with open('coffee_maker_part.png', 'wb') as handler:
    handler.write(source_img_data)

llm = ChatVertexAI(model_name="gemini-pro-vision")

image_message = {
    "type": "image_url",
    "image_url": {"url": "coffee_maker_part.png"},
}
text_message = {
    "type": "text",
    "text": "What's in the image? Please share a link to purchase a replacement",
}
message = HumanMessage(content=[text_message, image_message])

output = llm([message])
print(output.content)

d = {'name': ["Saucer", "Saucer Ceramic", "Milk Jug Assembly", "Handle Steam Wand Kit (New Version From 0735 PDC)", "Spout Juice Small (From 0637 to 1041 PDC)", "Cleaning Steam Wand", "Jug Frothing", "Spoon Tamping 50mm", "Collar Grouphead 50mm", "Filter 2 Cup Dual Wall 50mm", "Filter 1 Cup 50mm", "Water Tank Assembly", "Portafilter Assembly 50mm", "Milk Jug Assembly", "Filter 2 Cup 50mm"],
     'url': ["https://www.breville.com/us/en/parts-accessories/parts/sp0014946.html?sku=SP0014946", "https://www.breville.com/us/en/parts-accessories/parts/sp0014914.html?sku=SP0014914", "https://www.breville.com/us/en/parts-accessories/parts/sp0011391.html?sku=SP0011391", "https://www.breville.com/us/en/parts-accessories/parts/sp0010719.html?sku=SP0010719", "https://www.breville.com/us/en/parts-accessories/parts/sp0010718.html?sku=SP0010718", "https://www.breville.com/us/en/parts-accessories/parts/sp0003247.html?sku=SP0003247", "https://www.breville.com/us/en/parts-accessories/parts/sp0003246.html?sku=SP0003246", "https://www.breville.com/us/en/parts-accessories/parts/sp0003243.html?sku=SP0003243", "https://www.breville.com/us/en/parts-accessories/parts/sp0003232.html?sku=SP0003232", "https://www.breville.com/us/en/parts-accessories/parts/sp0003231.html?sku=SP0003231", "https://www.breville.com/us/en/parts-accessories/parts/sp0003230.html?sku=SP0003230", "https://www.breville.com/us/en/parts-accessories/parts/sp0003225.html?sku=SP0003225", "https://www.breville.com/us/en/parts-accessories/parts/sp0003216.html?sku=SP0003216", "https://www.breville.com/us/en/parts-accessories/parts/sp0001875.html?sku=SP0001875", "https://www.breville.com/us/en/parts-accessories/parts/sp0000166.html?sku=SP0000166"],
     'price': ["10.95", "4.99", "14.95", "8.95", "10.95", "6.95", "24.95", "8.95", "6.95", "12.95", "12.95", "14.95", "10.95", "16.95", "11.95"],
     'image': ["https://www.breville.com/content/dam/breville/us/catalog/products/images/sp0/sp0014946/tile.jpg", "https://www.breville.com/content/dam/breville/us/catalog/products/images/sp0/sp0014914/tile.jpg", "https://www.breville.com/content/dam/breville/us/catalog/products/images/sp0/sp0011391/tile.jpg", "https://www.breville.com/content/dam/breville/ca/catalog/products/images/sp0/sp0010719/tile.jpg", "https://www.breville.com/content/dam/breville/ca/catalog/products/images/sp0/sp0010718/tile.jpg", "https://www.breville.com/content/dam/breville/ca/catalog/products/images/sp0/sp0003247/tile.jpg", "https://assets.breville.com/cdn-cgi/image/width=400,format=auto/Spare+Parts+/Espresso+Machines/BES250/SP0003246/SP0003246_IMAGE1_400X400.jpg", "https://assets.breville.com/cdn-cgi/image/width=400,format=auto/Spare+Parts+/Espresso+Machines/ESP8/SP0003243/SP0003243_IMAGE1_400X400.jpg", "https://assets.breville.com/cdn-cgi/image/width=400,format=auto/Spare+Parts+/Espresso+Machines/ESP8/SP0003232/SP0003232_IMAGE1_400x400.jpg", "https://www.breville.com/content/dam/breville/au/catalog/products/images/sp0/sp0003231/tile.jpg", "https://www.breville.com/content/dam/breville/au/catalog/products/images/sp0/sp0003230/tile.jpg", "https://www.breville.com/content/dam/breville/ca/catalog/products/images/sp0/sp0003225/tile.jpg", "https://www.breville.com/content/dam/breville/ca/catalog/products/images/sp0/sp0003216/tile.jpg", "https://www.breville.com/content/dam/breville/au/catalog/products/images/sp0/sp0001875/tile.jpg", "https://www.breville.com/content/dam/breville/us/catalog/products/images/sp0/sp0000166/tile.jpg"]}
df = pd.DataFrame(data=d)
print(df)

model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding@001")

# Initialize the vector db
astra_db = AstraDB(token=os.getenv("ASTRA_DB_APPLICATION_TOKEN"), api_endpoint=os.getenv("ASTRA_DB_API_ENDPOINT"))
collection = astra_db.create_collection(collection_name="coffee_shop_ecommerce", dimension=1408)

for i in range(len(df)):
    name = df.loc[i, "name"]
    image = df.loc[i, "image"]
    price = df.loc[i, "price"]
    url = df.loc[i, "url"]

    # Download this product's image and save it to your local filesystem.
    # In a production system this binary data would be stored in Google Cloud Storage
    img_data = requests.get(image).content
    with open(f'{name}.png', 'wb') as handler:
        handler.write(img_data)

    # load the image from filesystem and compute the embedding value
    img = Image.load_from_file(f'{name}.png')
    embeddings = model.get_embeddings(image=img, contextual_text=name)

    try:
        # add to the AstraDB Vector Database
        collection.insert_one({
            "_id": i,
            "name": name,
            "image": image,
            "url": url,
            "price": price,
            "$vector": embeddings.image_embedding,
        })
    except Exception as error:
        # if you've already added this record, print a notice and continue;
        # re-raise anything else instead of silently dropping it
        error_info = json.loads(str(error))
        if error_info[0]['errorCode'] == "DOCUMENT_ALREADY_EXISTS":
            print(f"Document {name} already exists in the database. Skipping.")
        else:
            raise

img = Image.load_from_file('coffee_maker_part.png')
embeddings = model.get_embeddings(image=img, contextual_text="An espresso machine part")
documents = collection.vector_find(
    embeddings.image_embedding,
    limit=3,
)

related_products_csv = "name, image, price, url\n"
for doc in documents:
    related_products_csv += f"{doc['name']}, {doc['image']}, {doc['price']}, {doc['url']}\n"

image_message = {
    "type": "image_url",
    "image_url": {"url": "coffee_maker_part.png"},
}
text_message = {
    "type": "text",
    "text": f"""Given this image, please choose a possible replacement. Include link and price. Here are possible replacements: {related_products_csv}""",
}
message = HumanMessage(content=[text_message, image_message])
output = llm([message])
print(output.content)

© 2024 DataStax | Privacy policy | Terms of use
