Integrate Semantic Kernel with Astra DB Serverless
Microsoft Semantic Kernel is an open-source SDK that simplifies the creation of AI agents.
Semantic Kernel provides capabilities for managing contextual conversations, including chat and prompt history. It also provides planners for multi-step functions and connectors (plugins) for third-party APIs, enabling Retrieval-Augmented Generation (RAG) over enterprise data.
Take advantage of the integration between Semantic Kernel and Astra DB Serverless to:
- Define plugins and chain them together with your code.
- Define your goals and have a Large Language Model (LLM) orchestrate the plugins to achieve those goals.
- Build RAG applications with extended contextual conversations.
As a key component of this integration, DataStax has contributed the Astra DB connector in Python. The connector enables Astra DB Serverless to function as a vector database within the Semantic Kernel orchestration framework: it stores embeddings and runs semantic searches over them. This matters for developers building RAG applications who want to use Semantic Kernel's framework features for contextual conversations or intelligent agents, or who are targeting the Microsoft AI and Azure ecosystem.
See the Python examples below to walk through the integration.
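For orientation before the full walkthrough, the sketch below shows the connector's role in isolation: create a collection, upsert a record, and run a nearest-neighbor search. It is a minimal sketch, not part of the guide's script: the method names follow Semantic Kernel's MemoryStoreBase interface as implemented by the Astra DB connector in the 0.x Python SDK (signatures can differ across releases), the demo collection name and five-dimensional placeholder vectors are illustrative, and app_token, db_id, region, and keyspace stand in for your own credentials.

import numpy as np
from semantic_kernel.connectors.memory.astradb.astradb_memory_store import (
    AstraDBMemoryStore,
)
from semantic_kernel.memory.memory_record import MemoryRecord

async def demo(app_token: str, db_id: str, region: str, keyspace: str) -> None:
    # A five-dimensional store with cosine similarity; the constructor
    # arguments mirror the ones used later in this guide.
    store = AstraDBMemoryStore(app_token, db_id, region, keyspace, 5, "cosine")
    await store.create_collection("demo")

    # MemoryRecord.local_record wraps a text chunk and its embedding.
    record = MemoryRecord.local_record(
        id="doc1",
        text="Astra DB can serve as a Semantic Kernel memory store.",
        description=None,
        additional_metadata=None,
        embedding=np.array([0.1, 0.2, 0.3, 0.4, 0.5]),  # placeholder vector
    )
    await store.upsert("demo", record)

    # Nearest-neighbor search against the same placeholder vector.
    match, score = await store.get_nearest_match(
        "demo", np.array([0.1, 0.2, 0.3, 0.4, 0.5]), min_relevance_score=0.7
    )
    print(match.text, score)

Once the prerequisites below are in place, you could run this with asyncio.run(demo(...)) and your own credentials. In practice you rarely call the store directly; the walkthrough instead registers it with the kernel, which handles embedding generation for you.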
Prerequisites
This guide requires the following:
- An active Astra account
- An active Serverless (Vector) database
- An application token with the Database Administrator role
- Python 3.8 or later
- pip 23.0 or later
- The required Python packages, installed with the following command (a sample .env file for the credentials the script reads is shown after this list):

pip install semantic-kernel python-dotenv
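The script reads its credentials from a .env file in the working directory through Semantic Kernel's dotenv helpers. Below is a minimal sketch of that file, assuming the variable names read by the 0.x SDK's openai_settings_from_dot_env and astradb_settings_from_dot_env helpers; names can differ across releases, so check your installed version. All values are placeholders.

OPENAI_API_KEY="sk-..."
OPENAI_ORG_ID="org-..."
ASTRADB_APP_TOKEN="AstraCS:..."
ASTRADB_ID="your-database-id"
ASTRADB_REGION="us-east-2"
ASTRADB_KEYSPACE="default_keyspace"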
Set up the environment
- Create a Python script file named semantic-kernel.py.
- Set up imports and establish Astra DB Serverless as the vector store:

semantic-kernel.py
import asyncio
from typing import Tuple

import semantic_kernel as sk
import semantic_kernel.connectors.ai.open_ai as sk_oai
from semantic_kernel.connectors.memory.astradb.astradb_memory_store import (
    AstraDBMemoryStore,
)
from semantic_kernel.memory.memory_record import MemoryRecord
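These import paths, and the kernel APIs used below (kernel.memory, create_semantic_function, KernelContext), come from the 0.x Python SDK that first shipped the Astra DB connector. Semantic Kernel 1.x reorganized the connectors and memory APIs, so pin your semantic-kernel version accordingly if you follow this walkthrough verbatim.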
- Populate the memory store. Each save_information call embeds the text with the registered embedding service and writes it to the aboutMe collection in Astra DB:

semantic-kernel.py
async def populate_memory(kernel: sk.Kernel) -> None:
    # Add some documents to the semantic memory
    await kernel.memory.save_information("aboutMe", id="info1", text="My name is Andrea")
    await kernel.memory.save_information("aboutMe", id="info2", text="I currently work as a tour guide")
    await kernel.memory.save_information("aboutMe", id="info3", text="I've been living in Seattle since 2005")
    await kernel.memory.save_information("aboutMe", id="info4", text="I visited France and Italy five times since 2015")
    await kernel.memory.save_information("aboutMe", id="info5", text="My family is from New York")
- Search the populated memory store. Each question is embedded and matched against the stored records, and the top result is printed:

semantic-kernel.py
async def search_memory_examples(kernel: sk.Kernel) -> None:
    questions = [
        "what's my name",
        "where do I live?",
        "where's my family from?",
        "where have I traveled?",
        "what do I do for work",
    ]
    for question in questions:
        print(f"Question: {question}")
        result = await kernel.memory.search("aboutMe", question)
        print(f"Answer: {result[0].text}\n")
- Create a chat application that uses the populated Astra DB Serverless vector store as context for queries:

semantic-kernel.py
async def setup_chat_with_memory(
    kernel: sk.Kernel,
) -> Tuple[sk.KernelFunction, sk.KernelContext]:
    sk_prompt = """
    ChatBot can have a conversation with you about any topic.
    It can give explicit instructions or say 'I don't know' if
    it does not have an answer.

    Information about me, from previous conversations:
    - {{$fact1}} {{recall $fact1}}
    - {{$fact2}} {{recall $fact2}}
    - {{$fact3}} {{recall $fact3}}
    - {{$fact4}} {{recall $fact4}}
    - {{$fact5}} {{recall $fact5}}

    Chat:
    {{$chat_history}}
    User: {{$user_input}}
    ChatBot: """.strip()

    chat_func = kernel.create_semantic_function(sk_prompt, max_tokens=200, temperature=0.8)

    context = kernel.create_new_context()
    context["fact1"] = "what is my name?"
    context["fact2"] = "where do I live?"
    context["fact3"] = "where's my family from?"
    context["fact4"] = "where have I traveled?"
    context["fact5"] = "what do I do for work?"

    context[sk.core_plugins.TextMemoryPlugin.COLLECTION_PARAM] = "aboutMe"
    context[sk.core_plugins.TextMemoryPlugin.RELEVANCE_PARAM] = "0.8"

    context["chat_history"] = ""

    return chat_func, context
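In this prompt, each {{recall ...}} call invokes the recall function of the TextMemoryPlugin registered in the main function, which searches the Astra DB memory store at render time. COLLECTION_PARAM selects the collection to query (aboutMe here), and RELEVANCE_PARAM sets the minimum similarity score a record must reach before it is injected into the prompt.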
- Chat with the memory store. Each turn appends the exchange to chat_history, so later prompts keep the running conversation:

semantic-kernel.py
async def chat(kernel: sk.Kernel, chat_func: sk.KernelFunction, context: sk.KernelContext) -> bool:
    try:
        user_input = input("User:> ")
        context["user_input"] = user_input
    except KeyboardInterrupt:
        print("\n\nExiting chat...")
        return False
    except EOFError:
        print("\n\nExiting chat...")
        return False

    if user_input == "exit":
        print("\n\nExiting chat...")
        return False

    answer = await kernel.run(chat_func, input_vars=context.variables)
    context["chat_history"] += f"\nUser:> {user_input}\nChatBot:> {answer}\n"

    print(f"ChatBot:> {answer}")
    return True
- Run the main function to perform the operations.
This step demonstrates the core of the Semantic Kernel and Astra DB Serverless integration: registering Astra DB Serverless as the memory store that backs RAG.

semantic-kernel.py
async def main() -> None:
    kernel = sk.Kernel()

    api_key, org_id = sk.openai_settings_from_dot_env()
    kernel.add_chat_service("chat-gpt", sk_oai.OpenAIChatCompletion("gpt-3.5-turbo", api_key, org_id))
    kernel.add_text_embedding_generation_service(
        "ada", sk_oai.OpenAITextEmbedding("text-embedding-ada-002", api_key, org_id)
    )

    # Original volatile (in-memory) store, replaced by Astra DB below:
    # kernel.register_memory_store(memory_store=sk.memory.VolatileMemoryStore())
    # kernel.import_plugin(sk.core_plugins.TextMemoryPlugin(), "TextMemoryPlugin")

    app_token, db_id, region, keyspace = sk.astradb_settings_from_dot_env()
    # text-embedding-ada-002 produces 1536-dimensional vectors, so the
    # store must be created with a matching embedding dimension.
    astra_store = AstraDBMemoryStore(app_token, db_id, region, keyspace, 1536, "cosine")
    kernel.register_memory_store(memory_store=astra_store)
    kernel.import_plugin(sk.core_plugins.TextMemoryPlugin(), "TextMemoryPlugin")

    print("Populating memory...")
    await populate_memory(kernel)

    print("Asking questions... (manually)")
    await search_memory_examples(kernel)

    print("Setting up a chat (with memory!)")
    chat_func, context = await setup_chat_with_memory(kernel)

    print("Begin chatting (type 'exit' to exit):\n")
    chatting = True
    while chatting:
        chatting = await chat(kernel, chat_func, context)

if __name__ == "__main__":
    asyncio.run(main())
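Run the script with python semantic-kernel.py. It populates the aboutMe collection in your Serverless (Vector) database, runs the sample searches, and then starts the interactive chat loop; type exit (or press Ctrl+C) to quit.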