Conva.AI Overview

This page gives a quick overview of the core concepts behind Conva.AI and what it can be used for.

Overview

Conva.AI is a low-code, Meta Prompt based AI Assistant Platform.

It's a full stack platform that provides everything required to create, maintain, update, and integrate an AI Assistant into apps.

Conva.AI Assistants are software components that sit inside the app, take unstructured textual input, and generate structured textual output in a form that can trigger actions inside the app.

The input can come from an end-user talking to the app or from the app itself.

The action triggered by the Conva.AI Assistant can be one of the below -

  • Trigger some visual action inside the app, e.g. the app navigates to a particular screen, shows some search results, or displays AI-generated textual content

  • The Assistant directly completes the action by speaking out (or, in some cases, just showing) the expected response, or

  • A combination of the two, i.e. speak out the answer and also trigger the appropriate action inside the app

The Assistant itself can have an explicit visual form that end-users interact with directly, using natural language via voice or text (like a Conversational Overlay that appears as a bottom sheet on top of the app's screen). Alternatively, it can be consumed only programmatically by the app, with the Assistant's output appearing inline inside the app's existing screens.

Conva.AI makes Assistants easy to create, integrate, and maintain by letting Assistant creators break down the scope of the Assistant into various Capabilities.

There are two key steps in the journey of a Conva.AI Assistant

  • Assistant Creation

  • Assistant Integration

Conva.AI Assistants are composed of one or more Capabilities, the fundamental unit of an Assistant. A Capability is like an AI function that takes an input string (usually unstructured) and generates an output in a well-defined form (JSON).

A core idea behind Conva.AI Assistants is how easily it lets Assistant Creators build these Capabilities using Meta Prompts, and how easily Assistant Integrators can integrate them into their app and trigger various actions inside it.

More about Capabilities later.

Components of Conva.AI

It includes 3 core components -

  1. Magic Studio

  2. Conva.AI SDK

  3. Conva.AI orchestration service

Magic Studio

A web-based platform where developers can create, test, update, and maintain AI Assistants and their Capabilities.

Magic Studio takes care of the following -

  • Offers a simple way for developers (or rather PMs) to define the scope and specification (the output) of each Capability using Meta prompts.

  • Manages the pipelines required for compilation of the Capabilities (which generates the system prompts)

  • Offers a playground to test the Assistant or a specific Capability

  • Provides a log viewer

  • Offers other customization options to control the Copilot and Inline experiences of the Assistant

Conva.AI SDK

Platform-specific SDKs that offer simple but flexible ways for app developers to integrate the Assistant into their app. The SDK comes in two flavors -

Core SDK

Allows developers to trigger Capabilities and process the response from the Capability. This is useful when the developer wants to integrate Conva.AI in a headless fashion or when they want to create their own UI. It comes with streaming and non-streaming variants of these APIs.

// Pass an input string to Conva.AI and let it determine the 
// relevant Capability to invoke and return the output params
// after that Capability is executed
val response = ConvaAI.invokeCapability(
    input = "book a bus ticket from bangalore to chennai tomorrow at 2pm",
)

// response = {
//    "capability_name": "ticket_booking",
//    "message": "Showing you buses from Bangalore to Chennai for tomorrow around 2pm",
//    "params": {
//         "source": "BLR",
//         "destination": "MAS",
//         "date": "6/7/2024",
//         "time": "14:00",
//         "mode": "bus"
//     }
// }
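
Once the response is available, the app typically dispatches on the Capability name and uses the output parameters to drive the action. Below is a minimal sketch of that pattern; the accessor names on the response object (capabilityName, params) and the helper functions (showBusSearchResults, showFallbackMessage) are assumptions for illustration, not the actual SDK surface.

// Hypothetical response handling - property and helper names are
// assumptions for illustration only
when (response.capabilityName) {
    "ticket_booking" -> {
        val source = response.params["source"] as? String
        val destination = response.params["destination"] as? String
        val date = response.params["date"] as? String
        // Navigate to the search results screen with the extracted values
        showBusSearchResults(source, destination, date)
    }
    "unsupported" -> {
        // The input was unrelated to the app's Capabilities
        showFallbackMessage()
    }
}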

Copilot SDK

The Copilot SDK comes with Conva.AI's intuitive Conversational Overlay style UI (with integrated ASR and TTS). The Copilot UI, when triggered, directly interacts with the end user and informs the app when there is a response from the appropriate Capability. It also provides APIs to theme the UI, control various elements, and get notified when the user interacts with various elements in the UI.

This is supported only for mobile platforms like Android and iOS.

// Register handlers for the Copilot
val options = ConvaAIOptions.Builder()
    .setCapabilityHandler(object : ConvaAIAction {
        override fun onCapability(
            response: ConvaAIResponse, 
            interactionData: ConvaAIInteraction, 
            isFinal: Boolean
        ) {
            // Handle the response from the assistant
        }
    })
    .build()
ConvaAICopilot.setup(options)

// Start the Copilot
ConvaAICopilot.startConversation()

Conva.AI orchestration service

The orchestration service runs inside Conva.AI's cloud and is what the SDK talks to. It decides which Capability to invoke (based on the input from the SDK) and returns the response back to the SDK. It does the following things -

  1. Rewrites the incoming query to maintain context with the previous queries. E.g. if the first query was "blue shirts" and the second one is "show me in red", it will rewrite the second query as "show me red shirts" (illustrated in the sketch after this list)

  2. Routes the request to the right Capability. E.g. if the Assistant has two Capabilities called "Flight Ticket Booking" and "Train Ticket Booking", and the input query is "book a flight ticket from Bangalore to Chennai", the router will pick the "Flight Ticket Booking" Capability to process this input. If the input query is unrelated to the context of the app, it will be routed to a special built-in Capability called "Unsupported"

  3. Augments the Capability prompts with any runtime knowledge (what is called RAG or Retrieval Augmented Generation in the AI world)

  4. Picks the relevant LLM to execute the Capability

  5. Processes the response from the Capability and passes it to the SDK

  6. Maintains a semantic cache to speed things up
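
To make the rewriting and routing behavior concrete, here is an illustrative sequence of calls using the invokeCapability API shown earlier; the routing and rewriting noted in the comments are example behavior, not actual service output.

// First query establishes the context
val first = ConvaAI.invokeCapability(
    input = "book a flight ticket from Bangalore to Chennai"
)
// Routed to the "Flight Ticket Booking" Capability

// The follow-up query is rewritten using the previous context, e.g. to
// "book a flight ticket from Bangalore to Chennai tomorrow morning",
// and routed to the same Capability
val followUp = ConvaAI.invokeCapability(
    input = "make it tomorrow morning"
)

// An unrelated query is routed to the built-in "Unsupported" Capability
val unrelated = ConvaAI.invokeCapability(
    input = "what is the capital of France?"
)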

Assistant Capabilities

As mentioned above, the job of an AI Assistant is to take unstructured input and generate structured outputs that the app can consume and perform some action. This generation is controlled by the actual Capability that handles that input.

Assistant Creators can create a Capability by just writing a short description of what they want the Capability to do in Magic Studio.

Magic Studio will then take this outline and generate what is called a "Capability Blueprint", which is a structured specification of what the Capability should do.

The blueprint contains the following sections -

Capability Configuration

This section contains the details that are used to compile the prompts for the Capability itself. It in turn contains the following fields -

  • Capability Name - This is the display name of this Capability when it's shown in Studio

  • Developer Name - The name of this Capability when it's referred in code

  • Knowledge - The data that will help the Capability ground its output. More about knowledge later

  • Specification - These are the output fields that need to be generated by this Capability

  • Groups - The Capability Group this belongs to. Users can add it to an existing group or create a new one

  • Special Instructions (Advanced) - A place to put any special instructions for the Capability. It is not expected to be modified by most users.

Router Configuration

This section contains the details used to tell the Conva.AI router what this Capability is about. The router uses this to trigger the Capability that matches the given input. It contains the following fields -

  • Capability Description - A brief description of what the Capability is about

  • Capability Scope - A set of phrases that tell the LLM the various topics this Capability can handle

  • Example Queries - A set of potential sample queries that should be routed to this Capability

Typically, the "Router Configuration" section should not need to be updated by the user. But if there are any issues with routing, the developer or the PM can use this section to improve the routing.

Capability Specifications (output parameters)

The goal of every Capability is to generate a set of well-formed output parameters that will be consumed by the app via the Conva.AI SDK.

Magic Studio will automatically generate the parameters from the Capability outline. The Capability creator can also more explicitly instruct Magic Studio about the parameters they are looking for.

Creators can edit this specification directly or by editing the outline and regenerating the Capability blueprint.

Note that Conva.AI currently supports only the following data types for the output parameters

Required fields

Sometimes Conva.AI (or rather the underlying LLM) might not generate the expected parameter even if the input contains the required details. To force the parameter to be generated, you can mark that parameter as "Required".

Another use of the "Required" setting is to get Conva.AI to prompt the user to provide missing input. E.g. when trying to book a ticket, if the Capability expects a "date" field and the user has not provided it, Conva.AI will automatically prompt the user to provide it.
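
For example, if "date" is marked as Required for a ticket booking Capability, an input without a date will cause the Assistant to ask for it. The snippet below is illustrative; the response values in the comments are examples, not actual service output.

// Input is missing the Required "date" parameter
val response = ConvaAI.invokeCapability(
    input = "book a bus ticket from Bangalore to Chennai"
)

// Illustrative response - Conva.AI prompts the user for the missing detail
// {
//    "capability_name": "ticket_booking",
//    "message": "Sure! For what date would you like to book the ticket?",
//    "params": {
//         "source": "BLR",
//         "destination": "MAS"
//     }
// }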

Default Parameters

Every Capability by default comes with two default parameters.

  • message - This is the field that contains either the answer to the user query or some sort of a status message (where the actual answer is inside other parameters)

  • related_queries - This is the field that contains the list of AI-suggested follow-on queries based on the original query.

When the Assistant is used in Copilot mode (details further below), related_queries is what populates the suggestion chips by default. The developer can override the suggestion chips with other fields if required.
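
Here is an illustrative response showing the two default parameters alongside the Capability's own parameters; the values are examples only.

val response = ConvaAI.invokeCapability(
    input = "show me buses from Bangalore to Chennai"
)

// Illustrative response containing the default parameters
// {
//    "capability_name": "ticket_booking",
//    "message": "Showing you buses from Bangalore to Chennai",
//    "related_queries": [
//        "show only AC buses",
//        "buses leaving after 6 pm"
//    ],
//    "params": { "source": "BLR", "destination": "MAS", "mode": "bus" }
// }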

Capability Groups

Multiple Capabilities can be grouped together as a logical unit and given a name. This is called Capability Group.

Conva.AI's router will restrict its choice of Capabilities to those that are part of this group.

This is useful when the user is in a particular part of the app where only some of the Capabilities make sense. For example, continuing with the travel theme, say there are two Capabilities called "Flight FAQ" and "Bus FAQ". They might both have potentially conflicting data, so you want to ensure that only one of the Capabilities is active at any point, based on whether the user is in the flight booking or bus booking part of the app.

By default all Capabilities are added to the built-in group called "default".

// Limit the Capability to be invoked to a specific group
val response = ConvaAI.invokeCapability(
    input = "Hello, how are you?",
    capabilityGroup = "general_conversation"
)

Capability patterns

Capabilities offer a very flexible and easy way to define various use cases. Here are some of the most common patterns that users build Capabilities for -

  • Classification - To classify the given input into one or more buckets. E.g. to build Navigation Capabilities ("show my account statement")

  • Extraction - To extract relevant attributes from the given input. E.g. to build Search Capabilities ("show me buses from Bangalore to Chennai for 1st December" -> {"source": "Bangalore", "destination": "Chennai", "date": "1/12/2024"})

  • Transformation - To convert some extracted attributes into a different output. E.g. to replace location names with codes ("Bangalore to Chennai tomorrow" -> {"source": "blr", "destination": "mas", "date": "10/10/2024"})

  • Generation - To generate whole new content based on the given input ("what is your return policy?" -> "Amazon.in allows you to return your products within 15 days...")

And of course, it's possible to combine these patterns. A single Capability can have different parameters that are classified, extracted, transformed, and even generated.
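
For instance, a single travel Capability could classify the mode of transport, extract the route, transform city names into codes, and generate a human-readable message in one response. The example below is purely illustrative; the parameter names are hypothetical.

// Input: "get me from Bangalore to Chennai tomorrow by bus"
//
// Illustrative combined output (parameter names are hypothetical):
// {
//    "mode": "bus",                  // classified
//    "source_city": "Bangalore",     // extracted
//    "destination_city": "Chennai",  // extracted
//    "source": "blr",                // transformed
//    "destination": "mas",           // transformed
//    "message": "Showing you buses from Bangalore to Chennai for tomorrow"  // generated
// }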

Capability Knowledge

Normally, when generating the output for a Capability, Conva.AI relies on the underlying LLM's internal knowledge (guided by the Capability outline and the blueprint). In many cases, the app has internal knowledge that can help the LLM generate the right output. Conva.AI offers Capability creators the ability to add knowledge to their Capabilities.

Knowledge Groups

Knowledge is added to Capabilities by grouping it into Knowledge Groups. Creators can give each group a name and a description.

Every Knowledge Group can contain one of two types of knowledge -

Static knowledge

Use this type if you want Conva.AI to send the entire text of this knowledge to the underlying LLM as context to help it generate the output. Because the entire text of the knowledge group is sent as-is to the underlying LLM, it's recommended that the Capability creator keep the size of this knowledge small - typically around 4K bytes.

Note that Conva.AI does not technically put any restrictions on this size, but if you upload more than what can be handled by the underlying LLM's context window, it will throw an error at runtime.

This knowledge can be given either as a blob of text or as an HTML URL (the textual parts of the content will be fetched and used as the knowledge)

Note that knowledge only uses the textual information of HTML content. Any images or videos will not be processed.

Smart Knowledge

This is the default type of knowledge. Use it when you have larger data sources and want Conva.AI to smartly pick up only the relevant portions, or chunks, of the knowledge when interacting with the underlying LLM. Conva.AI will ingest this data, store it internally as vector embeddings, and at runtime extract the chunks of this data that best suit the input given to the Capability.

This type of knowledge retrieval is what is usually called RAG, or Retrieval Augmented Generation.
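
The retrieval step can be pictured with the generic sketch below. This is not Conva.AI's actual implementation; it only illustrates the idea of selecting the knowledge chunks whose embeddings are closest to the query embedding and passing them to the LLM as context.

import kotlin.math.sqrt

// Generic illustration of RAG-style retrieval (not Conva.AI internals):
// knowledge is stored as embedded chunks, and the chunks most similar to
// the query embedding are selected as context for the LLM.
data class Chunk(val text: String, val embedding: DoubleArray)

fun cosineSimilarity(a: DoubleArray, b: DoubleArray): Double {
    val dot = a.indices.sumOf { a[it] * b[it] }
    val normA = sqrt(a.indices.sumOf { a[it] * a[it] })
    val normB = sqrt(b.indices.sumOf { b[it] * b[it] })
    return dot / (normA * normB)
}

// Pick the top-k chunks that best match the query embedding
fun retrieveRelevantChunks(query: DoubleArray, chunks: List<Chunk>, k: Int = 3): List<Chunk> =
    chunks.sortedByDescending { cosineSimilarity(query, it.embedding) }.take(k)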

Creators can add different types of knowledge sources here.

  • HTML files

  • PDF, CSV, TSV and JSON (coming soon)

  • API based connectors (coming soon)

Parameter grounding

A unique aspect of the knowledge-based grounding provided by Conva.AI is the ability for the creator to limit it to specific parameters. By default, a knowledge group will impact any or all parameters (based on the input and the context).

Knowledge Grounding Policies

When providing knowledge sources, creators can set one of the following Grounding Policies -

  • Strict - This means that Conva.AI should only rely on the provided knowledge sources when the Capability is invoked.

  • Best Effort - This means that Conva.AI should use the knowledge source when possible but also fall back on the underlying LLM's internal knowledge to give the answer.

Capability Compilation

Once the Capabilities have been created, they are still in an abstracted form that is easy for Assistant Creators to work with, but not yet in a form that the underlying LLM can consume. The underlying LLMs ultimately require everything to be in the form of a prompt, and in a format that generates the most optimal results. This is what the Magic Studio Compiler does.

Magic Studio Compiler itself is an AI agent that is built using a collaborative agent architecture. It comes with 3 smaller agents

  • Prompt Compiler - This uses the Capability blueprint and the App details to generate the first set of prompts

  • Prompt Evaluator - This generates a testplan to evaluate the generated prompts (again in the context of the app). It also takes the result of the testplan execution and generates feedback that will be used by the Compiler in the next phase.

  • Testplan Executor - This takes the testplan, runs it, and generates the output

The above agents work together until they converge on a metric.

Consuming the Assistant

Conva.AI Assistants can be added to apps as -

An interactive Copilot

As an interactive Copilot that end-users can interact with via voice or text. Conva.AI offers a Copilot SDK for mobile platforms, which comes with a themeable Conversational Overlay UI and other components like ASR and TTS.

Inline Component

As an inline component that fits into the workflow of the app. Developers can use the Core SDK to build these experiences and build out their own UI to consume the output.

Now that we have an overview of the key concepts of Conva.AI, let's jump right in, build an Assistant, and see how to integrate it into an app.

Let the fun begin!
