LLM Guardrails
While individual LLMs perform some checking of prompts and responses on their own, LLM guardrails provide an additional, independent layer that checks and, where appropriate, modifies or flags prompts and LLM responses that could be problematic.
Guardrails are applied to every prompt just before it is sent to an external LLM, and to every response as soon as it is returned from an external LLM.
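Conceptually, each guardrail is a processor applied in sequence: an input processor may raise an error to block a prompt, while an output processor may rewrite the text it inspects. A minimal sketch of that pipeline (all names below are illustrative, not part of the Genai.LlmGuardrails API):

```python
# Illustrative guardrail pipeline; every name here is hypothetical,
# not part of the Genai.LlmGuardrails API.
class GuardrailError(Exception):
    """Raised when an input processor rejects a prompt."""

def check_prompt_injection(text):
    # Toy input processor: reject prompts that try to override instructions.
    if "ignore previous instructions" in text.lower():
        raise GuardrailError("possible prompt injection")
    return text

def mask_digits(text):
    # Toy output processor: rewrite the text instead of rejecting it.
    return "".join("#" if ch.isdigit() else ch for ch in text)

def run_guardrails(text, processors):
    # Apply each processor in order; any one may raise or rewrite.
    for processor in processors:
        text = processor(text)
    return text

print(run_guardrails("Call me at 555-1234", [check_prompt_injection, mask_digits]))
# -> Call me at ###-####
```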
Configuring LLM Guardrails
See Genai.LlmGuardrails.Manager for details.
Configure a leader node with at least one T4 GPU, as LLM guardrails use a local model loaded into memory.
Configuring input processors
To enable the current input processors, run the following:
```javascript
Genai.LlmGuardrails.Manager.setConfigValue('inputProcessors', [
  Genai.LlmGuardrails.Processor.ToxicSpeech.inst(),
  Genai.LlmGuardrails.Processor.PromptInjection.inst()
]);
```
Both of the current input processors raise an error when they detect a problematic prompt.
Configuring output processors
To enable the current output processor, run the following:
```javascript
Genai.LlmGuardrails.Manager.setConfigValue('outputProcessors', [
  Genai.LlmGuardrails.Processor.PiiMasking.inst()
]);
```
- The PII masking processor redacts any PII it finds in the LLM response.
- The classes of PII that are redacted are specified in Genai.LlmGuardrails.Processor.PiiMasking#piiClasses.
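To illustrate what PII masking does, here is a toy sketch that replaces US-style phone numbers with indexed placeholders. This is not the PiiMasking implementation (the real processor uses a local model and configurable PII classes); it only shows the redact-in-place behavior:

```python
import re

# Toy PII masking sketch: replace US-style phone numbers with indexed
# placeholders. The real PiiMasking processor covers many PII classes.
PHONE_RE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")

def mask_phone_numbers(text):
    count = 0
    def redact(match):
        nonlocal count
        count += 1
        return f"[REDACTED_PHONE_NUMBER_{count}]"
    return PHONE_RE.sub(redact, text)

print(mask_phone_numbers("My phone number is 555-555-5555"))
# -> My phone number is [REDACTED_PHONE_NUMBER_1]
```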
Configuring processors for dynamic agent
If you are using the dynamic agent and want to configure guardrails, run the following snippet.
```python
# Replace with the input processors you want to use
c3.Genai.LlmGuardrails.Manager.setConfigValue("inputProcessors", [c3.Genai.LlmGuardrails.Processor.ToxicSpeech.inst()])
# Replace with the output processors you want to use
c3.Genai.LlmGuardrails.Manager.setConfigValue("outputProcessors", [c3.Genai.LlmGuardrails.Processor.PiiMasking.inst()])

def preprocess(messages):
    """
    Replace with any preprocessing logic.
    """
    message = messages[-1]
    updated_text = c3.Genai.LlmGuardrails.Manager.processInput(message["content"][0]["text"]).updatedValue.toString()
    message["content"][0]["text"] = updated_text
    return messages[:-1] + [message]

def postprocess(response):
    """
    Replace with any postprocessing logic.
    """
    response.choices[0].message.original_content = response.choices[0].message.content
    response.choices[0].message.content = c3.Genai.LlmGuardrails.Manager.processOutput(response.choices[0].message.content).updatedValue.toString()
    return response
```
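The preprocess hook above assumes an OpenAI-style message shape and rewrites only the last message's text. A toy walkthrough of that shape (run_input and preprocess_demo below are hypothetical stand-ins, not the C3 AI API):

```python
# Toy demonstration of the message shape the preprocess hook rewrites.
# `run_input` is a hypothetical stand-in for
# Genai.LlmGuardrails.Manager.processInput(...).updatedValue.
def run_input(text):
    return text.replace("hunter2", "[REDACTED]")

def preprocess_demo(messages):
    # Same shape as the preprocess hook: rewrite only the last message's text.
    message = messages[-1]
    message["content"][0]["text"] = run_input(message["content"][0]["text"])
    return messages[:-1] + [message]

messages = [
    {"role": "system", "content": [{"type": "text", "text": "You are helpful."}]},
    {"role": "user", "content": [{"type": "text", "text": "My password is hunter2"}]},
]
print(preprocess_demo(messages)[-1]["content"][0]["text"])
# -> My password is [REDACTED]
```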
```python
preprocess_lambda = c3.Lambda.fromPyFunc(preprocess)
postprocess_lambda = c3.Lambda.fromPyFunc(postprocess)
processor = c3.GenaiCore.Llm.Processor.Lambda(
    preprocessLambda=preprocess_lambda, postprocessLambda=postprocess_lambda
)

c3.GenaiCore.Llm.Completion.Client.make({
    "name": "gpt_4o",
    "model": {
        "type": "GenaiCore.Llm.AzureOpenAi.Model",
        "model": "gpt-4o",
        "processor": processor,
        "auth": {
            "type": "GenaiCore.Llm.AzureOpenAi.Auth",
            "name": "Genai.Llm.OpenAI.Config"
        },
        "defaultOptions": {
            "stop": ["</plan>", "</thought>", "</execute>", "</solution>"],
            "temperature": 0.0
        }
    }
}).setConfig()
```
Testing the guardrail for the dynamic agent
You can verify that the guardrail is working with a quick test:
```python
sample_prompt = "My phone number is 555-555-5555"
result = c3.Genai.LlmGuardrails.Manager.inst().processOutput(sample_prompt)
print("Original Value:", result.originalValue)
print("Current Value:", result.currentValue)
print("Updated Value:", result.updatedValue)
```
You should see output like:
```
Original Value: My phone number is 555-555-5555
Current Value: My phone number is 555-555-5555
Updated Value: My phone number is [REDACTED_PHONE_NUMBER_5]
```
If you are using another LLM, set the processor on that LLM's configuration instead.
Clearing guardrails configuration
- To disable all guardrails, run Genai.LlmGuardrails.Manager.clearConfigAndSecretOverride(ConfigOverride.APP).
- To disable just the input processors, run Genai.LlmGuardrails.Manager.clearConfigValue('inputProcessors', ConfigOverride.APP).
- To disable just the output processors, run Genai.LlmGuardrails.Manager.clearConfigValue('outputProcessors', ConfigOverride.APP).