GuardRail: Risk Analysis

Risk Analysis is WitnessAI’s GuardRail for prompt risk analysis and control. It analyzes user prompts to detect potential risks and assess their severity, evaluating each prompt across multiple topics, such as Data Theft, Harmful Code Generation, Violence, and other high-risk areas, to identify activity that may harm the user or the business.

Administrators can configure the GuardRail to Allow, Warn, or Block prompts when such activity is detected, and can provide a customizable message to guide or inform the user.

WitnessAI Policies leverage the Risk Analysis GuardRail to ensure safe, ethical, and compliant use of AI, protecting both users and organizational integrity.

Use Cases

Preventing Data Exfiltration

Warn or block prompts that attempt to extract sensitive information, such as proprietary algorithms or confidential business data.

Mitigating Harmful Code Risks

Detect prompts that request harmful or insecure code and notify the user to review their request.

Violence Prevention

Warn users or block prompts that attempt to generate violent or harmful content, in line with ethical and safety standards.

Risk Profiling

Provide risk level analysis for user prompts to inform administrators about trends or emerging threats.
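
As an illustration of what such profiling could look like, the sketch below counts detections by topic and risk level. The event records and their field names (topic, level) are hypothetical and do not reflect WitnessAI's actual log or export format.

```python
from collections import Counter

# Hypothetical detection records; field names are assumptions for illustration.
events = [
    {"topic": "data_theft", "level": "high"},
    {"topic": "data_theft", "level": "high"},
    {"topic": "harmful_code_generation", "level": "medium"},
    {"topic": "violence", "level": "low"},
]


def summarize(records):
    """Count detections per (topic, level) pair."""
    return Counter((r["topic"], r["level"]) for r in records)


summary = summarize(events)
# Print the most frequent (topic, level) pairs first.
for (topic, level), count in summary.most_common():
    print(f"{topic:<26}{level:<8}{count}")
```

A rising count for one topic over successive review periods is the kind of trend an administrator might act on.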

Using Risk Analysis GuardRail Step-by-Step

Step 1: Create a New Policy

1. Navigate to the Policies section.

2. Click Add New Policy to create a new policy.

3. Provide a name and description (e.g., “High-Risk Prompt Monitoring Policy”).

4. Click Create Policy to proceed.

Step 2: Add User Groups

1. Click the Groups tab in the policy configuration interface.

2. Select Assign to New Group.

3. Choose the relevant User Groups from the list.

4. Save the changes to add the group to the policy.

Step 3: Configure the Risk Analysis GuardRail

1. Go to the GuardRails tab in the policy editor.

2. Select Risk Analysis GuardRail from the list of available GuardRails.

3. Enable the GuardRail for the policy.

Step 4: Define GuardRail Actions

1. Specify the action to take when the Risk Analysis GuardRail detects risky prompts:

• Allow: Permit the prompt to proceed without restriction.

• Warn: Display a customizable warning message to alert the user about potential risks.

• Block: Prevent the prompt from being processed by the model and provide an explanatory message.

2. Customize the associated message to provide relevant guidance or warnings.

• Example Message: “The detected prompt poses a risk of harmful code generation. This activity is blocked. Please contact your administrator for more information.”

3. Save the configuration.
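
The three actions above can be pictured as a small decision function. This is a minimal sketch, not WitnessAI's implementation: the Action and Outcome names are hypothetical, and it assumes that a warned prompt still proceeds to the model, which the description above does not state explicitly.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    ALLOW = "allow"
    WARN = "warn"
    BLOCK = "block"


@dataclass
class Outcome:
    forwarded: bool          # whether the prompt reaches the model
    message: Optional[str]   # text shown to the user, if any


def apply_action(action: Action, message: str) -> Outcome:
    """Hypothetical illustration of the Allow, Warn, and Block actions."""
    if action is Action.ALLOW:
        # Prompt proceeds without restriction and without a message.
        return Outcome(forwarded=True, message=None)
    if action is Action.WARN:
        # Assumption: the user sees the warning and the prompt still proceeds.
        return Outcome(forwarded=True, message=message)
    # BLOCK: the prompt is not processed; the user sees the explanation.
    return Outcome(forwarded=False, message=message)


result = apply_action(Action.BLOCK, "This activity is blocked.")
print(result.forwarded, result.message)
```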

Step 5: Configure Risk Evaluation Settings

1. Enable and configure risk evaluation for multiple topics, such as:

• Data Theft: Detect prompts that attempt to extract sensitive or proprietary information.

• Harmful Code Generation: Identify prompts that request insecure, malicious, or unethical code.

• Violence: Flag prompts that discuss or promote violent content.

• Other High-Risk Topics: Include additional categories as needed, based on organizational requirements.

2. Set thresholds for risk levels (e.g., low, medium, high) to fine-tune the GuardRail’s sensitivity.
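
To make the effect of a threshold concrete, the sketch below treats each topic's threshold as the lowest risk level that counts as a detection. The topic keys, the RiskLevel enum, and the THRESHOLDS mapping are hypothetical; the actual settings are made in the policy editor, not in code.

```python
from enum import IntEnum


class RiskLevel(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical per-topic thresholds: the lowest level that counts as a detection.
THRESHOLDS = {
    "data_theft": RiskLevel.MEDIUM,
    "harmful_code_generation": RiskLevel.MEDIUM,
    "violence": RiskLevel.HIGH,
}


def triggered_topics(scores: dict[str, RiskLevel]) -> list[str]:
    """Return enabled topics whose assessed level meets or exceeds the threshold."""
    return sorted(
        topic
        for topic, level in scores.items()
        if topic in THRESHOLDS and level >= THRESHOLDS[topic]
    )


# A prompt assessed as HIGH for data theft but only MEDIUM for violence.
print(triggered_topics({"data_theft": RiskLevel.HIGH, "violence": RiskLevel.MEDIUM}))
```

Lowering a topic's threshold makes the GuardRail more sensitive for that topic; raising it reduces detections at the cost of letting lower-risk prompts through.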

Step 6: Test and Save the Policy

1. Test the policy configuration in a controlled environment to ensure it detects and handles risky prompts as expected.

2. Once verified, save the policy to activate the Risk Analysis GuardRail for the assigned User Groups.
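
One way to keep testing systematic is to record the expected and observed outcome for each test prompt and list any mismatches before saving. The sketch below is a hypothetical checklist helper, not a WitnessAI tool; the observed values are assumed to be entered by hand after each manual test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyCheck:
    prompt: str     # test prompt submitted as a member of the assigned group
    expected: str   # "allow", "warn", or "block"
    observed: str   # outcome recorded by the tester


def mismatches(checks):
    """Return the checks whose observed outcome differs from the expected one."""
    return [c for c in checks if c.expected != c.observed]


# Hypothetical results from a manual test session.
checks = [
    PolicyCheck("Summarize this public press release.", "allow", "allow"),
    PolicyCheck("Write a script that deletes every file on a server.", "block", "warn"),
]

for c in mismatches(checks):
    print(f"MISMATCH: expected {c.expected}, observed {c.observed}: {c.prompt}")
```

Any mismatch suggests revisiting the action in Step 4 or the thresholds in Step 5 before activating the policy.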

Best Practices

• Threshold Calibration: Regularly review and adjust risk detection thresholds to match evolving organizational needs.

• User Communication: Use customized warning messages to educate users on acceptable prompts and potential risks.

• Comprehensive Testing: Validate the Risk Analysis GuardRail configuration in test environments before deployment.

• Policy Documentation: Keep detailed records of risk settings and policy actions for compliance purposes.

By employing the Risk Analysis GuardRail, organizations can proactively monitor and control user interactions with AI, reducing potential harm and ensuring safe and ethical use. This GuardRail provides the flexibility to tailor risk management strategies to organizational requirements, safeguarding both users and business assets.
