LLM Guardrails

While individual LLMs do provide some checking of prompts and responses, instead of relying on that, we provide LLM guardrails to check and, if appropriate, either modify or flag both prompts and LLM responses that could be problematic.

The guardrails are applied to all prompts just before they are sent to an external LLM and are applied to all responses as soon as they are returned from an external LLM.

Configuring LLM Guardrails

See Genai.LlmGuardrails.Manager for details.

Configure a leader node with at least one T4 GPU, as LLM guardrails use a local model loaded into memory.

Configuring input processors

To enable the current input processors, run the following:

Text

Genai.LlmGuardrails.Manager.setConfigValue('inputProcessors', [
  Genai.LlmGuardrails.Processor.ToxicSpeech.inst(),
  Genai.LlmGuardrails.Processor.PromptInjection.inst()
]);

Both of the current input processors will raise errors if they detect a prompt that is problematic.

Configuring output processors

To enable the current output processor, run the following:

Text

Genai.LlmGuardrails.Manager.setConfigValue('outputProcessors', [Genai.LlmGuardrails.Processor.PiiMasking.inst()]);

The PII masking processor redacts PII it finds in the response from the LLM.
The classes of PII that are redacted are specified in Genai.LlmGuardrails.Processor.PiiMasking#piiClasses.

Configuring processors for dynamic agent

If you are using dynamic agent and want to configure guardrails you can run the following snippet.

Python

c3.Genai.LlmGuardrails.Manager.setConfigValue("inputProcessors", [c3.Genai.LlmGuardrails.Processor.ToxicSpeech.inst()]) # replace with the input processors you want to use
c3.Genai.LlmGuardrails.Manager.setConfigValue("outputProcessors", [c3.Genai.LlmGuardrails.Processor.PiiMasking.inst()]) # replace with the output processors you want to use

def preprocess(messages):
    """
    Replace with any preprocessing logic
    """
    message = messages[-1]
    updated_text = c3.Genai.LlmGuardrails.Manager.processInput(message["content"][0]["text"]).updatedValue.toString()
    message["content"][0]["text"] = updated_text
    return messages[:-1] + [message]


def postprocess(response):
    """
    Replace with any postprocessing logic
    """
    response.choices[0].message.original_content = response.choices[0].message.content
    response.choices[0].message.content = c3.Genai.LlmGuardrails.Manager.processOutput(response.choices[0].message.content).updatedValue.toString()
    return response

preprocess_lambda = c3.Lambda.fromPyFunc(preprocess)
postprocess_lambda = c3.Lambda.fromPyFunc(postprocess)

processor = c3.GenaiCore.Llm.Processor.Lambda(
    preprocessLambda=preprocess_lambda, postprocessLambda=postprocess_lambda
)

c3.GenaiCore.Llm.Completion.Client.make({
  "name": "gpt_4o",
  "model": {
    "type": "GenaiCore.Llm.AzureOpenAi.Model",
    "model": "gpt-4o",
    "processor": processor,
    "auth": {
      "type": "GenaiCore.Llm.AzureOpenAi.Auth",
      "name": "Genai.Llm.OpenAI.Config"
    },
    "defaultOptions": {
      "stop": ["</plan>", "</thought>", "</execute>", "</solution>"],
      "temperature": 0.0
    }
  }
}).setConfig()

Test the guardrail for the dynamic agent

In a test, you can make sure that the guardrail is working.

Python

sample_prompt = "My phone number is 555-555-5555"

result = c3.Genai.LlmGuardrails.Manager.inst().processOutput(sample_prompt)

print("Original Value:", result.originalValue)
print("Current Value:", result.currentValue)
print("Updated Value:", result.updatedValue)

You will see

Text

Original Value: My phone number is 555-555-5555
Current Value: My phone number is 555-555-5555
Updated Value: My phone number is [REDACTED_PHONE_NUMBER_5]

If you are using another LLM, set the processor for the respective config.

Clearing guardrails configuration

To disable all guardrails, run Genai.LlmGuardrails.Manager.clearConfigAndSecretOverride(ConfigOverride.APP).
To disable just the input processors, run Genai.LlmGuardrails.Manager.clearConfigValue('inputProcessors', ConfigOverride.APP).
To disable just the output processors, run Genai.LlmGuardrails.Manager.clearConfigValue('outputProcessors', ConfigOverride.APP).

Copy link to this sectionConfiguring LLM Guardrails

Copy link to this sectionConfiguring input processors

Copy link to this sectionConfiguring output processors

Copy link to this sectionConfiguring processors for dynamic agent

Copy link to this sectionTest the guardrail for the dynamic agent

Copy link to this sectionClearing guardrails configuration

Configuring LLM Guardrails

Configuring input processors

Configuring output processors

Configuring processors for dynamic agent

Test the guardrail for the dynamic agent

Clearing guardrails configuration