Large Language Models
Large Language Models (LLMs) are the core component of any generative AI application, whether for answering a user's question from retrieved information or for designing and triggering a set of actions to accomplish a user's task.
Supported LLMs
The C3 Generative AI Suite is designed to integrate with the best-of-breed LLMs from all major providers, as well as with proprietary fine-tuned models hosted on C3 AI's Model Inference Service. As of the 4.0 release, the following models are supported by default:
- AWS/Claude: Claude v2, Claude v3 Haiku
- Azure/Gpt: Gpt 3.5 Turbo, Gpt 4
- Google/Gemini: Gemini (gemini-pro)
- MIS: various (Llama, Narwhal, etc.)
To view all available models, execute Genai.UnstructuredQuery.Engine.ModelConfig.listConfigs().collect() in the static console.
Adding support for new LLMs
When a new model is released by one of the existing providers, then as long as its calling signature (i.e., client library function, parameters, and return type) is consistent with currently supported models, integrating it is as simple as creating a new Genai.UnstructuredQuery.Engine.ModelConfig with the appropriate model identifier specified in the config's llmKwargs. If the new model does not have the same calling signature, it will require code changes before it can be used.
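As a rough sketch, registering such a model might look like the snippet below. Note that the make()/upsert() creation pattern, the config name, and the llmKwargs field names shown here are illustrative assumptions, not confirmed API; consult the ModelConfig Type documentation for the actual schema.

```javascript
// Hypothetical sketch: register a new model under an existing provider.
// The make()/upsert() pattern and the llmKwargs keys are assumptions;
// check the ModelConfig Type for the real field names.
var config = Genai.UnstructuredQuery.Engine.ModelConfig.make({
  name: 'awsClaude3Sonnet',  // config key later used by forConfigKey()
  llmKwargs: {
    // Provider-side model identifier (example value only)
    model: 'anthropic.claude-3-sonnet-20240229-v1:0',
  },
});
config.upsert();

// Verify the new config responds before wiring it into the application.
Genai.UnstructuredQuery.Engine.ModelConfig.forConfigKey('awsClaude3Sonnet')
  .generateText({ prompt: 'Hi, who are you?' });
```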
Setting up and testing an LLM
Whitelisting model
Coordinate with your application administrator to ensure that the endpoint for the models you intend to use is whitelisted in the cluster you are operating in.
Configuring credentials
To use one of the LLMs in your application, the key step is to ensure that credentials are set on the appropriate provider config: Genai.Llm.OpenAI.Config, Genai.Llm.Gcp.Config, or Genai.Llm.AwsBedrock.Config. For example:
Genai.Llm.OpenAI.Config.setSecretValue('apiKey', MY_AZURE_OPENAI_KEY);
Genai.Llm.OpenAI.Config.setConfigValue('apiBase', MY_AZURE_OPENAI_BASE_URL);
Testing the LLM integration
The easiest way to test the LLM integration and configuration is to execute a quick completion call. For example:
Genai.UnstructuredQuery.Engine.ModelConfig.forConfigKey('azureGpt4o').generateText({ prompt: 'Hi, who are you?' });
("Hello! I'm an AI developed by OpenAI, designed to provide information and answer questions to the best of my ability based on the data I've been trained on. How can I assist you today?");
Genai.ConfigUtil.queryEngineModelConfig('gemini').generateText({
  prompt: 'Count from 1 to 10',
});
Creating an LLM batch job
For LLMs that support batch jobs, it is also possible to use a Genai.UnstructuredQuery.Engine.ModelConfig to pass a batch of prompts directly to the LLM's batch API.
For more information on LLM batch APIs, see
- https://platform.openai.com/docs/guides/batch
- https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/batch-prediction-gemini
- https://www.anthropic.com/news/message-batches-api
job = Genai.ConfigUtil.queryEngineModelConfig('gemini').generateTextBatch({
"prompts": ["Count from 1 to 10", "What is the color of the sky"],
})
// Wait until the job is complete
while (!job.completed()) {
  console.log(`Batch job ${job.id} has not completed as of ${DateTime.now()}. Sleeping for 10 seconds.`)
  Thread.sleep(10000);
}
// Read the responses
job.readResponses()
// Remove the job if it is no longer needed
job.remove()
Which LLMs are suited to which tasks?
For the LLMs served by the major cloud providers (AWS/Claude, Azure/Gpt, Google/Gemini), the general pattern is that newer models tend to be more capable and support larger contexts, but this frequently comes with a corresponding decrease in speed and increase in cost. For more details on the differences between those models, please refer to each provider's documentation.
While MIS can host many different types of models, the main ones used in the GenAi application are:
- Narwhal - a C3 AI fine-tuned LLM for answering questions about C3 AI code and Types
- Zephyr - a more general-purpose model designed to stand in for Gpt, Claude, or Gemini in air-gapped deployments