Manage the Query Engine Lifecycle
Many of the core components of C3 Generative AI execute within Engines. The two main query handlers in the base app, Genai.UnstructuredQuery.Engine and Genai.Agent.QueryOrchestrator, are themselves subtypes of Engine.Py.
Why Engines?
Fundamentally, C3 AI engines (including Engine.Py) use a persistent process to allow internal state to remain across independent calls. While the C3 Generative AI query engines are stateless with respect to user queries (chat history is maintained in a combination of the client and DB), they do require many subcomponents to be loaded, many of which require a substantial amount of time. The persistent state that engines provide allows the platform to load all of the data, libraries, and configuration necessary to process user requests at the beginning of the engine's lifecycle instead of on every call, which reduces the latency of normal interaction with the engines.
Engine lifecycle
There are a few key stages of the GenAI engines' lifecycle that admin users should know.
Initialization
The first step to using an engine is initializing it. While this is done automatically on the first user query if the target engine hasn't been initialized yet, many admin users will want to 'pre-initialize' their application's engines so that the first user requests aren't slow. Both Genai.UnstructuredQuery.Engine and Genai.Agent.QueryOrchestrator have an initialize method that can be used to force initialization prior to the first user queries. initialize can be called without parameters, in which case the default Config for the respective engine will be used for configuration, or with a specific configuration.
Runtime installation
The GenAI engines are executed in Python ImplLanguage.RuntimeDeps runtimes. A more detailed description of runtimes can be found in Python Runtimes on C3 AI Platform, but fundamentally they are backed by conda runtimes that have to be installed once on each node. While not specific to the C3 Generative AI application or engines themselves, runtime installation therefore must occur before the GenAI engines can function, though as with initialization, runtime installation is performed automatically when necessary (that is, when the implementing runtime for an action hasn't been installed yet). Runtime installation can be slow (~10-15 minutes per runtime at the extreme), and the GenAI application requires several runtimes to be installed, so admin users should be aware of this expected latency. Runtime installation progress can be checked using Action.dump().
Engine statuses
Both Genai.UnstructuredQuery.Engine and Genai.Agent.QueryOrchestrator have an isInitialized function that returns whether the called instance has been initialized. Note that each handler (if multiple have been configured) has its own status.
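The pre-initialization flow described above can be sketched as a small helper that calls initialize only on engines that report they are not yet initialized. This is a minimal sketch, not the platform's own tooling: preInitialize is a hypothetical name, and it assumes each object passed in exposes isInitialized() and initialize() as described in the text. Whether those methods are called on the type itself or on an instance is also an assumption.

```javascript
// Hypothetical helper (not part of the C3 API): `engines` maps a label to an
// object that exposes isInitialized() and initialize().
function preInitialize(engines) {
  const status = {};
  for (const [label, engine] of Object.entries(engines)) {
    if (!engine.isInitialized()) {
      // Called without parameters, so the engine's default Config is used.
      engine.initialize();
    }
    // Record the status reported after the (possible) initialization call.
    status[label] = engine.isInitialized();
  }
  return status;
}

// In a C3 console this might be invoked as follows (receiver is an assumption):
// preInitialize({
//   unstructured: Genai.UnstructuredQuery.Engine,
//   orchestrator: Genai.Agent.QueryOrchestrator,
// });
```

To initialize with a specific configuration instead of the default, call initialize on the individual engine with that configuration, since each engine has its own Config.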
Restarting and shutting down
While not necessary in a running C3 Generative AI application, admin users may want to shut down or restart some or all of their application's engines.
Genai.PyUtil.terminateAllEngines(true) terminates all engines of all types in the current App (on all nodes), while Genai.PyUtil.restartAllEngines(true) calls Engine#restart on each running Engine in the App.
Configure the Engine Thread Pools
Engine thread pools are assigned to Genai.Agent.Config for assistant configuration.
An engine deploy specification defines:
- Minimum number of threads in the pool. Default is 2.
- Initial number of threads in the pool. Default is 2.
- Maximum number of threads in the pool. Default is 2.
- Name of the node pool. Default is empty.
Example:
engineDeploySpec = {
  threadPool: {
    minThreads: 2,
    initialThreads: 2,
    maxThreads: 2,
    actionQueueCapacity: 10,
  },
  nodePools: [],
};
You can update these values as needed. You can also set actionQueueCapacity, the number of tasks that can be queued before the thread pool starts rejecting tasks by throwing RejectedExecutionException. Setting it also causes the thread pool to start new threads, up to maxThreads.
By default, actionQueueCapacity is not set, which means that all tasks beyond maxThreads wait in the queue and no scale-up happens.
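The queueing behavior described above can be illustrated with a simplified model of what happens to a newly submitted task. This is a sketch under stated assumptions, not the platform's implementation: admitTask and the state fields (threads, busy, queued) are hypothetical, and the ordering (queue first, then scale up to maxThreads, then reject) assumes semantics similar to Java's ThreadPoolExecutor, which the RejectedExecutionException in the text suggests.

```javascript
// Simplified, hypothetical model of task admission for an engine thread pool.
// `spec` has the shape of engineDeploySpec; `state` describes the pool now:
//   threads: live threads, busy: threads running a task, queued: waiting tasks.
function admitTask(spec, state) {
  const { maxThreads, actionQueueCapacity } = spec.threadPool;

  // An idle thread is available, so the task runs immediately.
  if (state.busy < state.threads) return 'run';

  // No capacity configured: the queue is unbounded, so tasks always wait
  // and the pool never scales up.
  if (actionQueueCapacity == null) return 'queue';

  // Capacity configured and the queue still has room: the task waits.
  if (state.queued < actionQueueCapacity) return 'queue';

  // Queue is full: start a new thread if the pool is below maxThreads.
  if (state.threads < maxThreads) return 'scale-up';

  // Queue is full and the pool is at maxThreads: the task is rejected
  // (the text describes this as a RejectedExecutionException).
  return 'reject';
}
```

For example, with maxThreads of 4 and actionQueueCapacity of 10, a task submitted while 2 threads are busy and 10 tasks are queued would cause a scale-up in this model. With the example specification above (minThreads, initialThreads, and maxThreads all 2), the same task would be rejected.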
You can also set the threadCount using Genai.QuickStart.SetupSpec#threadPoolSize.
Use the Genai.Agent.Config#deploySpec attribute to set the engine deploy specification. Note that the engines must be terminated first so that they pick up the new configuration.
Setting the specification for the query orchestrator:
Genai.Agent.QueryOrchestrator.terminate();
Genai.Agent.QueryOrchestrator.inst.config().setConfigValue('deploySpec', engineDeploySpec);