C3 AI Documentation Home

LLM Guardrails

While individual LLMs do provide some checking of prompts and responses, instead of relying on that, we provide LLM guardrails to check and, if appropriate, either modify or flag both prompts and LLM responses that could be problematic.

The guardrails are applied to all prompts just before they are sent to an external LLM and are applied to all responses as soon as they are returned from an external LLM.

Configuring LLM Guardrails

See Genai.LlmGuardrails.Manager for details.

Configure a leader node with at least one T4 GPU, as LLM guardrails use a local model loaded into memory.

Configuring input processors

To enable the current input processors, run the following:

Text
Genai.LlmGuardrails.Manager.setConfigValue('inputProcessors', [
  Genai.LlmGuardrails.Processor.ToxicSpeech.inst(),
  Genai.LlmGuardrails.Processor.PromptInjection.inst()
]);

Both of the current input processors will raise errors if they detect a prompt that is problematic.

Configuring output processors

To enable the current output processor, run the following:

Text
Genai.LlmGuardrails.Manager.setConfigValue('outputProcessors', [Genai.LlmGuardrails.Processor.PiiMasking.inst()]);

Configuring processors for dynamic agent

If you are using dynamic agent and want to configure guardrails you can run the following snippet.

Python
c3.Genai.LlmGuardrails.Manager.setConfigValue("inputProcessors", [c3.Genai.LlmGuardrails.Processor.ToxicSpeech.inst()]) # replace with the input processors you want to use
c3.Genai.LlmGuardrails.Manager.setConfigValue("outputProcessors", [c3.Genai.LlmGuardrails.Processor.PiiMasking.inst()]) # replace with the output processors you want to use

def preprocess(messages):
    """
    Replace with any preprocessing logic
    """
    message = messages[-1]
    updated_text = c3.Genai.LlmGuardrails.Manager.processInput(message["content"][0]["text"]).updatedValue.toString()
    message["content"][0]["text"] = updated_text
    return messages[:-1] + [message]


def postprocess(response):
    """
    Replace with any postprocessing logic
    """
    response.choices[0].message.original_content = response.choices[0].message.content
    response.choices[0].message.content = c3.Genai.LlmGuardrails.Manager.processOutput(response.choices[0].message.content).updatedValue.toString()
    return response

preprocess_lambda = c3.Lambda.fromPyFunc(preprocess)
postprocess_lambda = c3.Lambda.fromPyFunc(postprocess)

processor = c3.GenaiCore.Llm.Processor.Lambda(
    preprocessLambda=preprocess_lambda, postprocessLambda=postprocess_lambda
)

c3.GenaiCore.Llm.Completion.Client.make({
  "name": "gpt_4o",
  "model": {
    "type": "GenaiCore.Llm.AzureOpenAi.Model",
    "model": "gpt-4o",
    "processor": processor,
    "auth": {
      "type": "GenaiCore.Llm.AzureOpenAi.Auth",
      "name": "Genai.Llm.OpenAI.Config"
    },
    "defaultOptions": {
      "stop": ["</plan>", "</thought>", "</execute>", "</solution>"],
      "temperature": 0.0
    }
  }
}).setConfig()

Test the guardrail for the dynamic agent

In a test, you can make sure that the guardrail is working.

Python
sample_prompt = "My phone number is 555-555-5555"

result = c3.Genai.LlmGuardrails.Manager.inst().processOutput(sample_prompt)

print("Original Value:", result.originalValue)
print("Current Value:", result.currentValue)
print("Updated Value:", result.updatedValue)

You will see

Text
Original Value: My phone number is 555-555-5555
Current Value: My phone number is 555-555-5555
Updated Value: My phone number is [REDACTED_PHONE_NUMBER_5]

If you are using another LLM, set the processor for the respective config.

Clearing guardrails configuration

  • To disable all guardrails, run Genai.LlmGuardrails.Manager.clearConfigAndSecretOverride(ConfigOverride.APP).
  • To disable just the input processors, run Genai.LlmGuardrails.Manager.clearConfigValue('inputProcessors', ConfigOverride.APP).
  • To disable just the output processors, run Genai.LlmGuardrails.Manager.clearConfigValue('outputProcessors', ConfigOverride.APP).
Was this page helpful?