Integrate Semantic Kernel with Astra DB Serverless

15 min

Microsoft Semantic Kernel is an open-source SDK that simplifies the creation of AI agents.

Semantic Kernel provides capabilities for managing contextual conversations, including previous chats and prompt history. It also provides planners for multi-step functions and connections (plugins) for third-party APIs, which enable Retrieval-Augmented Generation (RAG) over enterprise data.

Take advantage of the integration between Semantic Kernel and Astra DB Serverless to:

  • Define plugins and chain them together with your code.

  • Define your goals and have a Large Language Model (LLM) orchestrate the plugins to achieve those goals.

  • Build RAG applications with extended contextual conversations.

As a key component of this integration, DataStax has contributed the Astra DB connector in Python. The connector enables Astra DB Serverless to function as a vector database within the Semantic Kernel orchestration framework: it stores embeddings and runs semantic searches over them. This matters for developers who build RAG applications and want to use Semantic Kernel’s framework features for contextual conversations or intelligent agents, and for developers who target the Microsoft AI and Azure ecosystem.
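To make the idea of storing embeddings and searching them semantically more concrete, here is a minimal pure-Python sketch of cosine-similarity search. It is an illustration only, not how the Astra DB connector or Astra DB itself implements search. The names `cosine_similarity` and `search` and the toy 3-dimensional vectors are assumptions made for this example; real embeddings have many more dimensions.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Return the cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(
    store: Dict[str, List[float]],
    query: List[float],
    limit: int = 1,
    min_relevance: float = 0.0,
) -> List[Tuple[str, float]]:
    """Rank stored vectors by similarity to the query and keep the best matches."""
    scored = [(key, cosine_similarity(vector, query)) for key, vector in store.items()]
    hits = [(key, score) for key, score in scored if score >= min_relevance]
    hits.sort(key=lambda item: item[1], reverse=True)
    return hits[:limit]


# Toy vectors standing in for real embeddings (illustrative values only).
store = {
    "info1": [1.0, 0.0, 0.0],
    "info2": [0.0, 1.0, 0.0],
    "info3": [0.7, 0.7, 0.0],
}

top_hits = search(store, [1.0, 0.1, 0.0], limit=2)
```

In this sketch, `limit` and `min_relevance` play the same role as a result count and a relevance threshold in a vector search: raising the threshold trades recall for precision.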

The Python examples below walk through the integration.

Prerequisites

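This guide requires the following:

  • An active Astra DB Serverless database, along with its application token, database ID, region, and keyspace name.

  • An OpenAI API key.

  • A Python environment with the semantic-kernel package installed.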
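A `.env` file in the working directory supplies the credentials that the script later loads with `sk.openai_settings_from_dot_env()` and `sk.astradb_settings_from_dot_env()`. As a minimal sketch, the following check reports which settings are missing before you run anything. The key names in `REQUIRED_KEYS` are assumptions, not confirmed by this guide, so verify them against your installed Semantic Kernel version. `parse_dotenv` and `missing_keys` are hypothetical helpers written for this example.

```python
from typing import Dict, List, Sequence

# Assumed key names; confirm them against your installed semantic-kernel version.
REQUIRED_KEYS = (
    "OPENAI_API_KEY",
    "ASTRADB_APP_TOKEN",
    "ASTRADB_ID",
    "ASTRADB_REGION",
    "ASTRADB_KEYSPACE",
)


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_keys(text: str, required: Sequence[str] = REQUIRED_KEYS) -> List[str]:
    """Return the required keys that are absent or empty, in declaration order."""
    values = parse_dotenv(text)
    return [key for key in required if not values.get(key)]
```

To use it, read the file contents (for example with `pathlib.Path(".env").read_text()`) and pass them to `missing_keys`; an empty list means every assumed setting is present.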

Set up the environment

  1. Create a Python script file.

  2. Set up imports and establish Astra DB Serverless as the vector store:

    semantic-kernel.py
    import asyncio
    from typing import Tuple
    
    import semantic_kernel as sk
    import semantic_kernel.connectors.ai.open_ai as sk_oai
    from semantic_kernel.connectors.memory.astradb.astradb_memory_store import (
        AstraDBMemoryStore
    )
    from semantic_kernel.memory.memory_record import MemoryRecord
  3. Populate the memory store:

    semantic-kernel.py
    async def populate_memory(kernel: sk.Kernel) -> None:
        # Add some documents to the semantic memory
        await kernel.memory.save_information("aboutMe", id="info1", text="My name is Andrea")
        await kernel.memory.save_information("aboutMe", id="info2", text="I currently work as a tour guide")
        await kernel.memory.save_information("aboutMe", id="info3", text="I've been living in Seattle since 2005")
        await kernel.memory.save_information("aboutMe", id="info4", text="I visited France and Italy five times since 2015")
        await kernel.memory.save_information("aboutMe", id="info5", text="My family is from New York")
  4. Search the populated memory store:

    semantic-kernel.py
    async def search_memory_examples(kernel: sk.Kernel) -> None:
        questions = [
            "what's my name",
            "where do I live?",
            "where's my family from?",
            "where have I traveled?",
            "what do I do for work",
        ]
    
        for question in questions:
            print(f"Question: {question}")
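            result = await kernel.memory.search("aboutMe", question)
            # The search can return an empty list when nothing is relevant enough.
            answer = result[0].text if result else "(no match found)"
            print(f"Answer: {answer}\n")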
  5. Create a chat application that uses the populated Astra DB Serverless vector store as context for queries:

    semantic-kernel.py
    async def setup_chat_with_memory(
        kernel: sk.Kernel,
    ) -> Tuple[sk.KernelFunction, sk.KernelContext]:
        sk_prompt = """
        ChatBot can have a conversation with you about any topic.
        It can give explicit instructions or say 'I don't know' if
        it does not have an answer.
    
        Information about me, from previous conversations:
        - {{$fact1}} {{recall $fact1}}
        - {{$fact2}} {{recall $fact2}}
        - {{$fact3}} {{recall $fact3}}
        - {{$fact4}} {{recall $fact4}}
        - {{$fact5}} {{recall $fact5}}
    
        Chat:
        {{$chat_history}}
        User: {{$user_input}}
        ChatBot: """.strip()
    
        chat_func = kernel.create_semantic_function(sk_prompt, max_tokens=200, temperature=0.8)
    
        context = kernel.create_new_context()
        context["fact1"] = "what is my name?"
        context["fact2"] = "where do I live?"
        context["fact3"] = "where's my family from?"
        context["fact4"] = "where have I traveled?"
        context["fact5"] = "what do I do for work?"
    
        context[sk.core_plugins.TextMemoryPlugin.COLLECTION_PARAM] = "aboutMe"
        context[sk.core_plugins.TextMemoryPlugin.RELEVANCE_PARAM] = "0.8"
    
        context["chat_history"] = ""
    
        return chat_func, context
  6. Chat with the memory store:

    semantic-kernel.py
    async def chat(kernel: sk.Kernel, chat_func: sk.KernelFunction, context: sk.KernelContext) -> bool:
        try:
            user_input = input("User:> ")
            context["user_input"] = user_input
        except (KeyboardInterrupt, EOFError):
            # Ctrl+C or end of input ends the chat.
            print("\n\nExiting chat...")
            return False
    
        if user_input == "exit":
            print("\n\nExiting chat...")
            return False
    
        answer = await kernel.run(chat_func, input_vars=context.variables)
        context["chat_history"] += f"\nUser:> {user_input}\nChatBot:> {answer}\n"
    
        print(f"ChatBot:> {answer}")
        return True
  7. Run the main function to perform the operations.

    This step shows the core of the Semantic Kernel and Astra DB Serverless integration: configuring RAG with Astra DB Serverless as the memory store.

    semantic-kernel.py
    async def main() -> None:
        kernel = sk.Kernel()
    
        api_key, org_id = sk.openai_settings_from_dot_env()
        kernel.add_chat_service("chat-gpt", sk_oai.OpenAIChatCompletion("gpt-3.5-turbo", api_key, org_id))
        kernel.add_text_embedding_generation_service(
            "ada", sk_oai.OpenAITextEmbedding("text-embedding-ada-002", api_key, org_id)
        )
        # Original in-memory store, replaced by Astra DB Serverless below:
        # kernel.register_memory_store(memory_store=sk.memory.VolatileMemoryStore())
        # kernel.import_plugin(sk.core_plugins.TextMemoryPlugin(), "TextMemoryPlugin")
    
        app_token, db_id, region, keyspace = sk.astradb_settings_from_dot_env()
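        # The embedding dimension must match the embedding model:
        # text-embedding-ada-002 returns 1536-dimensional vectors.
        astra_store = AstraDBMemoryStore(app_token, db_id, region, keyspace, 1536, "cosine")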
    
        kernel.register_memory_store(memory_store=astra_store)
        kernel.import_plugin(sk.core_plugins.TextMemoryPlugin(), "TextMemoryPlugin")
    
        print("Populating memory...")
        await populate_memory(kernel)
    
        print("Asking questions... (manually)")
        await search_memory_examples(kernel)
    
        print("Setting up a chat (with memory!)")
        chat_func, context = await setup_chat_with_memory(kernel)
    
        print("Begin chatting (type 'exit' to exit):\n")
        chatting = True
        while chatting:
            chatting = await chat(kernel, chat_func, context)
    
    
    if __name__ == "__main__":
        asyncio.run(main())
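The prompt in step 5 mixes two kinds of placeholders: `{{$fact1}}` inserts a context variable, and `{{recall $fact1}}` passes that variable to the memory plugin and inserts what it finds. The sketch below imitates that substitution with a regular expression so you can see how the prompt looks once it is filled in. It is a simplified stand-in, not Semantic Kernel's template engine. `render`, `fake_recall`, and `FACTS` are hypothetical names, and the lookup is a plain dictionary rather than a vector search.

```python
import re
from typing import Callable, Dict

# Matches {{$name}} and {{recall $name}}; group 1 is set only for recall calls.
PLACEHOLDER = re.compile(r"\{\{\s*(recall\s+)?\$(\w+)\s*\}\}")


def render(template: str, variables: Dict[str, str], recall: Callable[[str], str]) -> str:
    """Fill variable placeholders and recall placeholders in a prompt template."""

    def substitute(match: "re.Match[str]") -> str:
        is_recall, name = match.group(1), match.group(2)
        value = variables.get(name, "")  # unknown variables render as empty text
        return recall(value) if is_recall else value

    return PLACEHOLDER.sub(substitute, template)


# Hypothetical stand-in for the memory store: exact-match lookup, no embeddings.
FACTS = {"what is my name?": "My name is Andrea"}


def fake_recall(question: str) -> str:
    return FACTS.get(question, "")


rendered = render(
    "- {{$fact1}} {{recall $fact1}}",
    {"fact1": "what is my name?"},
    fake_recall,
)
print(rendered)
```

In the real application the recall step is a semantic search against the `aboutMe` collection, so the question does not need to match the stored text word for word.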

© 2025 DataStax | Privacy policy | Terms of use