Conva.AI Overview

This page gives a quick overview of the core concepts behind Conva.AI and what it can be used for.

Overview

Conva.AI is a versatile low-code platform for creating, maintaining, and integrating AI Assistants using Meta Prompts. This full-stack platform provides everything needed to seamlessly add AI Assistants to your apps.

Conva.AI Assistants are software components that process unstructured text inputs to generate structured outputs, triggering actions within the app. These inputs can come from users or the app itself. The actions can be:

  • Visual Actions: Navigate screens, display search results, or show AI-generated content.

  • Direct Actions: Provide responses through speech or text.

  • Combined Actions: Mix of visual and direct responses.

Assistants can appear as interactive visual elements (e.g., a Conversational Overlay) or work behind the scenes, with outputs integrated into the app's existing screens.

Conva.AI simplifies the creation, integration, and maintenance of Assistants by breaking them into modular Capabilities. Each Capability is a fundamental unit of an Assistant and acts like an AI function, processing input strings into well-defined JSON outputs.

There are two key steps in the journey of a Conva.AI Assistant:

  1. Assistant Creation: Using Meta Prompts to define Capabilities.

  2. Assistant Integration: Easily integrating these Capabilities into your app.

More about Capabilities later.

Components of Conva.AI

Conva.AI comprises three core components:

  1. Magic Studio

  2. Conva.AI SDK

  3. Conva.AI orchestration service

Magic Studio

A web-based platform, where developers can create, test, update and maintain AI Assistants and their Capabilities.

Magic Studio offers:

  • A simple way for developers (or product managers) to define the scope and specification (the output) of each Capability using Meta Prompts.

  • Pipeline management for compiling Capabilities (which generates the system prompts).

  • A playground to test the Assistant or specific Capabilities.

  • A log viewer.

  • Customization options to control the Copilot (more on this later) and Inline experience of the Assistant.

Conva.AI SDK

Platform-specific SDKs that offer simple but flexible ways for app developers to integrate the Assistant into their app. The SDK comes in two variants:

Core SDK

Allows developers to trigger Capabilities and process responses, suitable for headless integration or custom UI creation. This is available in streaming and non-streaming variants.

// Pass an input string to Conva.AI and let it determine the 
// relevant Capability to invoke and return the output params
// after that Capability is executed
val response = ConvaAI.invokeCapability(
    input = "book a bus ticket from bangalore to chennai tomorrow at 2pm",
)

// response = {
//    "capability_name" : "ticket_booking",
//    "message" : "Showing you buses from Bangalore to Chennai for tommorrow around 2pm"
//    "params" : {
//         "source": "BLR",
//         "destination" : "MAS",
//         "date" : "6/7/2024",
//         "time" : "14:00"
//         "mode" : "bus"
//     }
// }
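
For the streaming variant mentioned above, partial responses could be consumed through a callback. The sketch below is a minimal illustration; the listener type and method names are assumptions, not the documented API.

// Hypothetical sketch of the streaming variant: partial responses
// arrive through a callback until isFinal is true. The listener type
// and method names below are assumptions, not the documented API.
ConvaAI.invokeCapability(
    input = "book a bus ticket from bangalore to chennai tomorrow at 2pm",
    listener = object : ConvaAIResponseListener {
        override fun onResponse(response: ConvaAIResponse, isFinal: Boolean) {
            // Update the UI incrementally; isFinal marks the last chunk
        }

        override fun onError(error: Throwable) {
            // Handle failures such as network or routing errors
        }
    }
)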

Copilot SDK

Provides an intuitive Conversational Overlay UI with integrated ASR and TTS, supported only for mobile platforms like Android and iOS. The Copilot UI interacts directly with end-users and notifies the app upon receiving responses from Capabilities.

It also provides APIs to theme the UI, control its elements, and receive notifications when the user interacts with them.

// Register handlers for the Copilot
val options = ConvaAIOptions.Builder()
    .setCapabilityHandler(object : ConvaAIAction {
        override fun onCapability(
            response: ConvaAIResponse, 
            interactionData: ConvaAIInteraction, 
            isFinal: Boolean
        ) {
            // Handle the response from the assistant
        }
    })
    .build()
ConvaAICopilot.setup(options)

// Start the Copilot
ConvaAICopilot.startConversation()

Conva.AI orchestration service

The orchestration service operates within Conva.AI's cloud, communicating with the SDK to manage and route queries effectively.

Key functions

  1. Context Maintenance: Rewrites the incoming query to maintain context with previous queries, e.g., rewriting "show me in red" to "show me red shirts" if the previous query was "blue shirts" (see the sketch after this list).

  2. Capability Routing: Routes the request to the right Capability, for example,

    • Given a travel Assistant with two Capabilities called "Flight Ticket Booking" and "Train Ticket Booking", for the query "book a flight ticket from Bangalore to Chennai", the router will pick the "Flight Ticket Booking" Capability to process the input.

    • If the input query is unrelated to the app's context, it is routed to a special built-in Capability called "Unsupported".

  3. Runtime Knowledge Augmentation: Enhances Capability prompts with Retrieval Augmented Generation (RAG).

  4. LLM Selection: Chooses the relevant Large Language Model (LLM) for executing the Capability.

  5. Response Processing: Processes and returns responses from the selected Capability to the SDK.

  6. Semantic Caching: Implements a semantic cache to improve speed and efficiency.
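
To illustrate context maintenance, consider two back-to-back invocations; the rewrite shown in the comments happens inside the orchestration service, not in your code:

// Illustration of context maintenance across consecutive queries.
// The first query establishes the context
val first = ConvaAI.invokeCapability(input = "blue shirts")

// The orchestration service rewrites the follow-up query using that
// context before routing, i.e. "show me in red" -> "show me red shirts"
val second = ConvaAI.invokeCapability(input = "show me in red")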

Assistant Capabilities

The role of an AI Assistant is to convert unstructured input into structured outputs that the app can use to perform actions. This process is managed by the Capabilities configured for the Assistant.

Assistant Creators can create a Capability by writing a short description of its intended function, called the Capability outline, in Magic Studio.

Magic Studio then generates a "Capability Blueprint" from the outline: a structured specification of the Capability.

The blueprint contains the following sections:

Capability Configuration

This section includes details for compiling the prompts for the Capability. It contains the following fields:

  • Capability Name - Display name of this Capability as shown in Studio.

  • Developer Name - Name of this Capability when used in code during integration.

  • Knowledge - Data to help ground the Capability's output (More about knowledge later).

  • Specification - Output fields to be generated by the Capability.

  • Groups - The Capability Group(s) this Capability belongs to. Users can add it to an existing group or create a new one.

  • Special Instructions (Advanced) - Special instructions for the Capability, typically not modified by most users.

Router Configuration

This section instructs the Conva.AI router on the Capability's function to ensure correct routing based on input. It contains:

  • Capability Description: Brief description of the Capability.

  • Capability Scope: Phrases that inform the LLM about the various topics the Capability can handle.

  • Example Queries: Sample queries that should route to this Capability.

Typically, the "Router Configuration" section does not need to be edited. If routing issues arise, however, it can be updated to improve routing accuracy.
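
As an illustration, a Router Configuration for the flight-booking example above might look like the following, shown as annotated JSON in the style of the earlier examples. The field names and values are illustrative, not Magic Studio's exact format.

// Hypothetical Router Configuration for "Flight Ticket Booking";
// field names and values are illustrative, not Magic Studio's format.
// {
//    "capability_description" : "Books flight tickets between two cities",
//    "capability_scope" : ["flight booking", "air travel", "airfares"],
//    "example_queries" : [
//         "book a flight ticket from Bangalore to Chennai",
//         "find me flights to Delhi tomorrow"
//    ]
// }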

Capability Specifications (Output Parameters)

The goal of every Capability is to generate a set of well-formed output parameters that will be consumed by the app via the Conva.AI SDK.

Magic Studio automatically generates these parameters from the Capability outline, though users can also specify parameters explicitly.

Creators can edit this specification directly or by editing the outline and regenerating the Capability blueprint.

Conva.AI currently supports only the following data types for the output parameters (an illustrative specification follows the list):

  • String

  • Integer

  • Boolean

  • Float

  • Array

  • Object
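
For example, a ticket-booking Capability might declare a specification like the one sketched below. The layout is illustrative; Magic Studio's actual blueprint format may differ.

// Hypothetical output specification for a ticket_booking Capability;
// the layout is illustrative and may differ from Magic Studio's format.
// {
//    "source"      : String   (departure city or code),
//    "destination" : String   (arrival city or code),
//    "date"        : String   (travel date),
//    "passengers"  : Integer  (number of travellers),
//    "round_trip"  : Boolean  (whether a return leg is needed),
//    "max_price"   : Float    (optional budget cap),
//    "stops"       : Array    (intermediate stops, if any)
// }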

Required fields

Sometimes Conva.AI (or rather, the underlying LLM) might not generate the expected parameter even if the input contains the required details. To force the parameter to be generated, you can mark that parameter as "Required".

Another use of the "Required" setting is to get Conva.AI to prompt the user for missing input, e.g., if a "date" field is required for booking a ticket but the user hasn't provided it, Conva.AI will automatically ask for the date.
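
For example, if "date" is marked Required for the ticket-booking Capability and the user omits it, the interaction might look like this. The response shape follows the earlier example; the exact message is illustrative.

val response = ConvaAI.invokeCapability(
    input = "book a bus ticket from bangalore to chennai"
)

// Hypothetical response when the Required "date" parameter is missing;
// the message prompting the user is illustrative.
// {
//    "capability_name" : "ticket_booking",
//    "message" : "Sure! What date would you like to travel on?",
//    "params" : {
//         "source": "BLR",
//         "destination" : "MAS",
//         "mode" : "bus"
//     }
// }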

Default Parameters

Every Capability by default comes with two default parameters:

  • message: Contains the answer to the user query, or a status message (when the answer is carried in other parameters).

  • related_queries: Contains AI-suggested follow-on queries based on the original query.

When the Assistant is used in Copilot mode (details further below), related_queries populates the suggestion chips by default. Developers can override the suggestion chips with other fields if needed.
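
An illustrative response showing both default parameters; the capability name and values are made up for this example.

// Illustrative response with the two default parameters; the
// capability name and values are made up for this example.
// {
//    "capability_name" : "faq",
//    "message" : "You can cancel a bus ticket up to 4 hours before departure.",
//    "related_queries" : [
//         "How do I get a refund?",
//         "Can I reschedule my ticket instead?"
//    ]
// }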

Capability Groups

Multiple Capabilities can be grouped together as a logical unit and given a name. This is called a Capability Group.

The Conva.AI router will restrict its choice of Capabilities to those within the selected group.

This is useful in scenarios where only specific Capabilities are relevant based on the app context. For example, in a travel app, there might be two Capabilities: "Flight FAQ" and "Bus FAQ". To avoid conflicting data, you can ensure that only the relevant Capability is active based on whether the user is in the flight booking or bus booking section of the app.

By default, all Capabilities are added to the built-in group called "default".

// Limit the Capabilities to be invoked to a specific group
val response = ConvaAI.invokeCapability(
    input = "Hello, how are you?",
    capabilityGroup = "general_conversation"
)

Capability patterns

Capabilities provide a flexible and easy way to define various use cases. Here are some common patterns for building capabilities:

  • Classification - Classify the given input into one or more buckets. E.g. Navigation Capabilities ("show my account statement")

  • Extraction - Extract relevant attributes from the given input. E.g. Search Capabilities ("show me buses from Bangalore to Chennai for 1st December" -> {"source": "Bangalore", "destination": "Chennai", "date": "1/12/2024"})

  • Transformation - Convert extracted attributes into a different output. E.g. replace location names with codes ("Bangalore to Chennai tomorrow" -> {"source": "BLR", "destination": "MAA", "date": "10/10/2024"})

  • Generation - Generate new content based on the given input ("What is your return policy?" -> "Amazon.in allows you to return your products within 15 days...")

And of course, it's possible to combine these patterns: a single Capability can have different parameters that are classified, extracted, transformed, and even generated, as illustrated below.
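
For instance, the ticket-booking output from the Core SDK example earlier combines several patterns in one response (the annotations are illustrative):

// One Capability output combining several patterns (illustrative):
// {
//    "mode" : "bus",                      <- classified from the input
//    "date" : "6/7/2024",                 <- extracted from "tomorrow"
//    "source" : "BLR",                    <- transformed (city name -> code)
//    "message" : "Showing you buses..."   <- generated content
// }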

Capability Knowledge

When generating an output, Conva.AI typically relies on the underlying LLM's internal knowledge, guided by the Capability outline and blueprint. However, apps may have internal knowledge that can help the LLM improve output accuracy. Conva.AI allows Capability creators to add this knowledge to their Capabilities.

Knowledge Groups

Knowledge is added to Capabilities by grouping sources into Knowledge Groups. Creators can give each group a name and a description.

Each group can contain either Static or Smart knowledge.

Static knowledge

  • Purpose: Provides the entire text as context to the LLM.

  • Limitations: The recommended size is around 4 KB to avoid runtime errors, since the entire text from this knowledge group is sent as-is to the underlying LLM.

  • Formats: Text blob or HTML URL.

Conva.AI does not technically restrict this size, but if you upload more than the underlying LLM's context window can handle, it will throw an error at runtime.

Static knowledge only uses the textual information of HTML content. Any images or videos will not be processed.

Smart Knowledge

This is the default type of knowledge.

  • Purpose: Retrieves relevant portions of larger data sources during runtime.

  • Method: Stores data as vector embeddings and extracts the chunks that best suit the input given to the Capability.

  • Formats: HTML URLs, PDF, CSV, TSV, and JSON (coming soon), API-based connectors (coming soon).

This type of knowledge retrieval is commonly known as Retrieval Augmented Generation (RAG).
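
Conceptually, the retrieval step works as sketched below. This is a generic illustration of RAG, not Conva.AI's internal implementation; the function names are made up.

import kotlin.math.sqrt

// Generic illustration of RAG-style retrieval; not Conva.AI's
// internal implementation. Function names are made up.
fun cosine(a: FloatArray, b: FloatArray): Float {
    var dot = 0f; var na = 0f; var nb = 0f
    for (i in a.indices) { dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i] }
    return dot / (sqrt(na) * sqrt(nb))
}

// Rank stored chunks by similarity to the query embedding and
// prepend the best matches to the prompt as grounding context
fun augmentPrompt(
    query: String,
    chunks: List<Pair<String, FloatArray>>,  // chunk text + its embedding
    embed: (String) -> FloatArray,
    topK: Int = 3
): String {
    val queryVec = embed(query)
    val context = chunks
        .sortedByDescending { (_, vec) -> cosine(queryVec, vec) }
        .take(topK)
        .joinToString("\n") { (text, _) -> text }
    return "Context:\n$context\n\nQuery: $query"
}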

Parameter grounding

A unique aspect of Conva.AI's knowledge-based grounding is that creators can limit a knowledge group's influence to specific parameters. By default, the knowledge group can affect any or all parameters (based on the input and the context).

Knowledge Grounding Policies

When providing knowledge sources, creators can set one of the following Grounding Policies:

  • Strict: Conva.AI relies solely on the provided knowledge sources when the Capability is invoked.

  • Best Effort: Conva.AI uses the knowledge source when possible but falls back on the underlying LLM's internal knowledge if necessary.

Capability Compilation

Once Capabilities are created, they are in an abstracted form that is easy for Assistant Creators to understand but not yet in a format the underlying LLM can consume. The Magic Studio Compiler converts these into prompts optimized for the LLM.

Magic Studio Compiler is an AI agent built on a collaborative agent architecture. It comprises three smaller agents:

  • Prompt Compiler - Uses the Capability blueprint and the app details to generate the first set of prompts.

  • Prompt Evaluator - Generates a test plan to evaluate the generated prompts in the context of the app, analyzes the test-plan results, and provides feedback for optimization, which the Compiler uses in the next iteration.

  • Testplan Executor - Runs the test plan and produces the results.

The above agents work together iteratively until they converge on an optimal metric, roughly as sketched below.
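
The loop can be pictured roughly as follows; the interfaces and the convergence check are illustrative assumptions, not actual Conva.AI APIs.

// Rough sketch of the collaborative compile-evaluate loop; the
// interfaces and convergence threshold are illustrative assumptions.
interface PromptCompiler { fun compile(blueprint: String, feedback: String?): String }
interface PromptEvaluator {
    fun createTestPlan(prompts: String): String
    fun analyze(results: String): Pair<String, Double>  // feedback + quality score
}
interface TestplanExecutor { fun run(plan: String): String }

fun compileCapability(
    blueprint: String,
    compiler: PromptCompiler,
    evaluator: PromptEvaluator,
    executor: TestplanExecutor,
    targetScore: Double = 0.95
): String {
    var feedback: String? = null
    while (true) {
        val prompts = compiler.compile(blueprint, feedback)
        val results = executor.run(evaluator.createTestPlan(prompts))
        val (nextFeedback, score) = evaluator.analyze(results)
        if (score >= targetScore) return prompts  // converged on the metric
        feedback = nextFeedback
    }
}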

Consuming the Assistant

Conva.AI Assistants can be added to apps in two ways:

An interactive Copilot

As an interactive Copilot that end-users can interact with via voice or text. Conva.AI offers a Copilot SDK for mobile platforms, which comes with a themeable Conversational Overlay UI and other components like ASR and TTS.

Inline Component

As an inline component that fits into the app's workflow. Developers can use the Conva.AI SDK to build these experiences with their own UI to consume the output, as sketched below.
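
A minimal sketch of such an inline integration, reusing the Core SDK call from earlier. The params accessor and the showSearchResults helper are app-side assumptions, not SDK APIs.

// Minimal sketch of an inline integration; the params accessor and
// showSearchResults are app-side assumptions, not SDK APIs.
val response = ConvaAI.invokeCapability(
    input = "show me buses from Bangalore to Chennai"
)

// Feed the structured output into the app's own search-results screen
showSearchResults(
    source = response.params["source"],
    destination = response.params["destination"]
)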

Now that we have an overview of the key concepts of Conva.AI, let's jump right in, build an Assistant, and see how to integrate it into an app.

Let the fun begin!
