Building AI Governance: A Practical Framework for Enterprise LLM Deployment
Learn how to implement robust AI governance frameworks for enterprise LLM deployments. This comprehensive guide covers safety policies, content filtering, and compliance strategies.
As organizations rapidly integrate Large Language Models (LLMs) into their production systems, the need for robust AI governance has never been more critical. Without proper guardrails, LLMs can generate harmful content, leak sensitive information, or produce outputs that violate regulatory requirements.
In this comprehensive guide, we'll walk through building a production-ready AI governance framework that balances innovation with safety, compliance, and user protection.
Why AI Governance Matters
The risks of ungoverned LLM deployment are substantial:
- Harmful Content Generation: Models can produce toxic, biased, or offensive outputs
- Data Leakage: Without proper controls, LLMs may expose PII or confidential information
- Regulatory Violations: Non-compliant outputs can result in legal penalties
- Reputational Damage: A single viral incident can destroy brand trust
- Security Vulnerabilities: Prompt injection attacks can compromise system integrity
According to a 2024 Stanford study, over 73% of enterprise AI deployments lack comprehensive governance frameworks, leading to an average of 2.3 safety incidents per quarter.
The Four Pillars of AI Governance
Our framework is built on four essential pillars:
- Input Validation & Safety
- Output Moderation & Filtering
- Audit Logging & Compliance
- Continuous Monitoring & Improvement
Let's dive deep into each pillar with practical implementation examples.
Pillar 1: Input Validation & Safety
The first line of defense is validating and sanitizing user inputs before they reach your LLM.
Implementing Input Guards
import { z } from 'zod'
// Define strict input schema
const userPromptSchema = z.object({
message: z
.string()
.min(1, 'Message cannot be empty')
.max(4000, 'Message too long')
.refine(msg => !containsInjectionPatterns(msg), 'Potential prompt injection detected'),
context: z.record(z.string()).optional(),
maxTokens: z.number().min(1).max(2000).default(500),
})
// Input validation middleware
export async function validateUserInput(rawInput: unknown) {
try {
const validated = userPromptSchema.parse(rawInput)
// Additional safety checks
const safetyScore = await checkContentSafety(validated.message)
    if (!safetyScore.safe) {
      throw new Error('Input flagged by content safety check')
}
return validated
} catch (error) {
if (error instanceof z.ZodError) {
throw new ValidationError('Invalid input format', error.errors)
}
throw error
}
}
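The snippet above leans on two helpers that aren't shown: ValidationError and checkContentSafety. Here's a minimal sketch of both; we assume checkContentSafety simply delegates to the moderateOutput function we build in Pillar 2, but any safety classifier would work in its place.
// Hypothetical definitions for the two helpers referenced above
export class ValidationError extends Error {
  constructor(message: string, public details: unknown) {
    super(message)
    this.name = 'ValidationError'
  }
}
// Delegates input safety to the same moderation layer built in Pillar 2
async function checkContentSafety(text: string): Promise<ContentSafetyResult> {
  return moderateOutput(text)
}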
Detecting Prompt Injection Attacks
Prompt injection is one of the most common attack vectors. Here's a detection pattern:
const INJECTION_PATTERNS = [
/ignore (previous|all) instructions/i,
/you are now (a|an) .+ (assistant|bot|ai)/i,
/disregard (your|the) (system|safety|ethical) (prompt|guidelines)/i,
/reveal (your|the) (system prompt|instructions)/i,
  /<\|.*?\|>/, // Special tokens like <|system|> (no global flag, so .test() stays stateless across calls)
]
function containsInjectionPatterns(text: string): boolean {
return INJECTION_PATTERNS.some(pattern => pattern.test(text))
}
// Rate limiting by user
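// Note: an in-memory Map is per-process only; a multi-instance deployment would typically back this with a shared store such as Redis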
const rateLimiter = new Map<string, number[]>()
export function checkRateLimit(userId: string): boolean {
const now = Date.now()
const userRequests = rateLimiter.get(userId) || []
// Remove requests older than 1 minute
const recentRequests = userRequests.filter(timestamp => now - timestamp < 60_000)
if (recentRequests.length >= 20) {
return false // Rate limit exceeded
}
recentRequests.push(now)
rateLimiter.set(userId, recentRequests)
return true
}
Pillar 2: Output Moderation & Filtering
Even with strict input validation, LLMs can still generate problematic content. Output moderation is essential.
Multi-Layer Content Filtering
interface ContentSafetyResult {
safe: boolean
categories: {
hate: number
violence: number
sexual: number
selfHarm: number
profanity: number
}
flaggedPhrases: string[]
}
export async function moderateOutput(content: string): Promise<ContentSafetyResult> {
// Layer 1: Keyword-based filtering (fast)
const keywordFlags = checkBlockedKeywords(content)
// Layer 2: ML-based moderation (OpenAI Moderation API)
const moderationResult = await openai.moderations.create({
input: content,
})
const categories = moderationResult.results[0].category_scores
// Layer 3: PII detection
const piiDetected = detectPII(content)
// Determine if content is safe
const safe =
!keywordFlags.hasBlockedContent &&
categories.hate < 0.7 &&
categories.violence < 0.7 &&
categories.sexual < 0.7 &&
categories['self-harm'] < 0.5 &&
!piiDetected.hasPII
return {
safe,
categories: {
hate: categories.hate,
violence: categories.violence,
sexual: categories.sexual,
selfHarm: categories['self-harm'],
profanity: keywordFlags.profanityScore,
},
flaggedPhrases: [...keywordFlags.flagged, ...piiDetected.findings],
}
}
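The moderation function assumes an initialized openai client from the official SDK, plus a checkBlockedKeywords helper that isn't shown. A minimal sketch of that helper might look like this (the blocklist and scoring are placeholders you'd replace with your own policy):
// Hypothetical keyword layer used by moderateOutput above
const BLOCKED_KEYWORDS = ['example-banned-term'] // placeholder; maintain this list with your policy team
function checkBlockedKeywords(content: string) {
  const lower = content.toLowerCase()
  const flagged = BLOCKED_KEYWORDS.filter(term => lower.includes(term))
  return {
    hasBlockedContent: flagged.length > 0,
    profanityScore: flagged.length > 0 ? 1 : 0, // crude score: any hit is treated as maximal
    flagged,
  }
}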
PII Detection & Redaction
const PII_PATTERNS = {
  email: /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/g,
phone: /\b\d{3}[-.]?\d{3}[-.]?\d{4}\b/g,
ssn: /\b\d{3}-\d{2}-\d{4}\b/g,
creditCard: /\b\d{4}[-\s]?\d{4}[-\s]?\d{4}[-\s]?\d{4}\b/g,
}
function detectPII(text: string) {
const findings: string[] = []
for (const [type, pattern] of Object.entries(PII_PATTERNS)) {
const matches = text.match(pattern)
if (matches) {
findings.push(`${type}: ${matches.length} occurrence(s)`)
}
}
return {
hasPII: findings.length > 0,
findings,
}
}
export function redactPII(text: string): string {
let redacted = text
redacted = redacted.replace(PII_PATTERNS.email, '[EMAIL REDACTED]')
redacted = redacted.replace(PII_PATTERNS.phone, '[PHONE REDACTED]')
redacted = redacted.replace(PII_PATTERNS.ssn, '[SSN REDACTED]')
redacted = redacted.replace(PII_PATTERNS.creditCard, '[CARD REDACTED]')
return redacted
}
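A quick usage example of the redaction helper (these regex patterns are intentionally simple and US-centric; many production systems pair them with an NER-based PII detection service):
const safeText = redactPII('Contact me at jane.doe@example.com or 555-123-4567')
// -> 'Contact me at [EMAIL REDACTED] or [PHONE REDACTED]'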
Pillar 3: Audit Logging & Compliance
Comprehensive logging is critical for compliance, debugging, and continuous improvement.
Structured Audit Logging
interface AuditLog {
id: string
timestamp: Date
userId: string
sessionId: string
input: {
message: string
metadata: Record<string, any>
}
output: {
response: string
model: string
tokens: number
latency: number
}
safety: {
inputScore: ContentSafetyResult
outputScore: ContentSafetyResult
flagged: boolean
}
compliance: {
gdprCompliant: boolean
dataRetention: string
}
}
export class AuditLogger {
async log(entry: AuditLog): Promise<void> {
// Store in database with encryption
await db.auditLogs.create({
data: {
...entry,
// Encrypt sensitive fields
input: encrypt(JSON.stringify(entry.input)),
output: encrypt(JSON.stringify(entry.output)),
},
})
// Also stream to monitoring system
await this.streamToMonitoring(entry)
}
async queryLogs(filter: {
userId?: string
flaggedOnly?: boolean
startDate?: Date
endDate?: Date
}): Promise<AuditLog[]> {
return db.auditLogs.findMany({
where: {
userId: filter.userId,
'safety.flagged': filter.flaggedOnly ? true : undefined,
timestamp: {
gte: filter.startDate,
lte: filter.endDate,
},
},
})
}
}
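The logger calls encrypt and decrypt without showing them. Here's a minimal sketch using Node's built-in crypto module with AES-256-GCM and a key read from an environment variable; both choices are assumptions, not part of the original snippet:
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto'
// 32-byte key (64 hex characters) supplied via environment — hypothetical variable name
const KEY = Buffer.from(process.env.AUDIT_LOG_KEY!, 'hex')
export function encrypt(plaintext: string): string {
  const iv = randomBytes(12)
  const cipher = createCipheriv('aes-256-gcm', KEY, iv)
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()])
  // Store IV and auth tag alongside the ciphertext so decryption is self-contained
  return [iv.toString('base64'), cipher.getAuthTag().toString('base64'), ciphertext.toString('base64')].join('.')
}
export function decrypt(payload: string): string {
  const [iv, tag, ciphertext] = payload.split('.').map(part => Buffer.from(part, 'base64'))
  const decipher = createDecipheriv('aes-256-gcm', KEY, iv)
  decipher.setAuthTag(tag)
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString('utf8')
}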
GDPR Compliance
export class GDPRCompliance {
// Right to erasure (Article 17)
async deleteUserData(userId: string): Promise<void> {
await db.$transaction([
db.auditLogs.deleteMany({ where: { userId } }),
db.userSessions.deleteMany({ where: { userId } }),
db.userPreferences.delete({ where: { userId } }),
])
}
// Data portability (Article 20)
async exportUserData(userId: string): Promise<any> {
const [logs, sessions, preferences] = await Promise.all([
db.auditLogs.findMany({ where: { userId } }),
db.userSessions.findMany({ where: { userId } }),
db.userPreferences.findUnique({ where: { userId } }),
])
return {
exportDate: new Date().toISOString(),
userId,
auditLogs: logs.map(log => ({
...log,
input: decrypt(log.input),
output: decrypt(log.output),
})),
sessions,
preferences,
}
}
}
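Beyond erasure and portability, GDPR's storage-limitation principle means audit logs shouldn't live forever. A small scheduled job can enforce the 30-day retention window used in the complete pipeline below (the function name and schedule are illustrative):
// Hypothetical retention sweep: run daily via cron or a job scheduler
export async function purgeExpiredAuditLogs(retentionDays = 30): Promise<number> {
  const cutoff = new Date(Date.now() - retentionDays * 86_400_000)
  const result = await db.auditLogs.deleteMany({
    where: { timestamp: { lt: cutoff } },
  })
  return result.count
}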
Pillar 4: Continuous Monitoring & Improvement
AI governance is not a "set and forget" system. Continuous monitoring and improvement are essential.
Real-Time Monitoring Dashboard
export class GovernanceMonitor {
async getMetrics(timeRange: '1h' | '24h' | '7d' = '24h') {
const metrics = await db.auditLogs.aggregate({
where: {
timestamp: {
gte: this.getTimeRangeStart(timeRange),
},
},
_count: true,
_avg: {
'safety.inputScore.hate': true,
'safety.inputScore.violence': true,
'output.tokens': true,
'output.latency': true,
},
})
const flaggedCount = await db.auditLogs.count({
where: {
'safety.flagged': true,
timestamp: {
gte: this.getTimeRangeStart(timeRange),
},
},
})
return {
totalRequests: metrics._count,
flaggedRequests: flaggedCount,
      flagRate: metrics._count ? (flaggedCount / metrics._count) * 100 : 0,
averageSafetyScores: {
hate: metrics._avg['safety.inputScore.hate'],
violence: metrics._avg['safety.inputScore.violence'],
},
performance: {
avgTokens: metrics._avg['output.tokens'],
avgLatency: metrics._avg['output.latency'],
},
}
}
// Alert on anomalies
async checkAnomalies(): Promise<Alert[]> {
const alerts: Alert[] = []
const metrics = await this.getMetrics('1h')
// High flag rate
if (metrics.flagRate > 5) {
alerts.push({
severity: 'high',
type: 'high_flag_rate',
message: `Flag rate is ${metrics.flagRate.toFixed(2)}% (threshold: 5%)`,
})
}
// Unusual latency
if (metrics.performance.avgLatency > 3000) {
alerts.push({
severity: 'medium',
type: 'high_latency',
message: `Average latency is ${metrics.performance.avgLatency}ms`,
})
}
return alerts
  }

  private getTimeRangeStart(range: '1h' | '24h' | '7d'): Date {
    const windowMs = { '1h': 3_600_000, '24h': 86_400_000, '7d': 604_800_000 }[range]
    return new Date(Date.now() - windowMs)
  }
}
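checkAnomalies returns Alert objects, but the Alert type isn't defined in the snippet. A minimal shape consistent with how it's used:
interface Alert {
  severity: 'low' | 'medium' | 'high'
  type: string
  message: string
}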
Putting It All Together: The Complete Pipeline
Here's how all four pillars work together in a production request:
export async function handleAIRequest(rawInput: unknown, userId: string): Promise<AIResponse> {
const auditLogger = new AuditLogger()
const startTime = Date.now()
try {
// 1. Rate limiting
if (!checkRateLimit(userId)) {
throw new Error('Rate limit exceeded')
}
// 2. Input validation & safety
const validatedInput = await validateUserInput(rawInput)
const inputSafety = await checkContentSafety(validatedInput.message)
if (!inputSafety.safe) {
await auditLogger.log({
userId,
input: validatedInput,
safety: { inputScore: inputSafety, flagged: true },
// ... other fields
})
throw new Error('Input flagged by safety system')
}
// 3. Generate LLM response
const llmResponse = await generateLLMResponse(validatedInput.message)
// 4. Output moderation
const outputSafety = await moderateOutput(llmResponse.content)
if (!outputSafety.safe) {
await auditLogger.log({
userId,
output: llmResponse,
safety: { outputScore: outputSafety, flagged: true },
// ... other fields
})
return {
success: false,
error: 'Response flagged by moderation system',
}
}
// 5. PII redaction
const sanitizedResponse = redactPII(llmResponse.content)
// 6. Audit logging
await auditLogger.log({
id: generateId(),
timestamp: new Date(),
userId,
sessionId: validatedInput.context?.sessionId || generateId(),
      input: { message: validatedInput.message, metadata: validatedInput.context ?? {} },
output: {
response: sanitizedResponse,
model: llmResponse.model,
tokens: llmResponse.usage.totalTokens,
latency: Date.now() - startTime,
},
safety: {
inputScore: inputSafety,
outputScore: outputSafety,
flagged: false,
},
compliance: {
gdprCompliant: true,
dataRetention: '30days',
},
})
// 7. Return safe response
return {
success: true,
content: sanitizedResponse,
metadata: {
tokens: llmResponse.usage.totalTokens,
latency: Date.now() - startTime,
},
}
} catch (error) {
// Error logging
console.error('AI request failed:', error)
throw error
}
}
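To show where this pipeline sits in a real service, here's a sketch of exposing it through an Express route; the route path and header-based identity are assumptions for the example, not part of the framework itself:
import express from 'express'

const app = express()
app.use(express.json())

app.post('/api/chat', async (req, res) => {
  // Identity resolution is simplified here; real deployments derive userId from an authenticated session
  const userId = req.header('x-user-id') ?? 'anonymous'
  try {
    const result = await handleAIRequest(req.body, userId)
    res.json(result)
  } catch (error) {
    res.status(400).json({ success: false, error: (error as Error).message })
  }
})

app.listen(3000)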
Best Practices & Recommendations
Based on our experience deploying AI governance at scale, here are our top recommendations:
1. Start with Strict Policies, Then Relax
It's easier to relax strict policies than to tighten loose ones. Begin with conservative safety thresholds and adjust based on real-world data.
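One practical way to make "relax later" easy is to keep every threshold in a single versioned config instead of scattering literals through the moderation code. A sketch with deliberately conservative starting values (the numbers are illustrative):
// Hypothetical centralized policy: tighten or loosen in one place, and track changes in version control
export const SAFETY_THRESHOLDS = {
  hate: 0.5, // stricter than the 0.7 used in the moderation example above
  violence: 0.5,
  sexual: 0.5,
  selfHarm: 0.3,
  maxFlagRatePercent: 5, // alerting threshold for the monitoring dashboard
} as const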
2. Layer Your Defenses
No single safety mechanism is perfect. Use multiple layers:
- Input validation
- Prompt engineering
- Output moderation
- Human review for edge cases
3. Monitor Continuously
Set up real-time dashboards and alerts. Review flagged content weekly to identify patterns and improve your filters.
4. Educate Your Team
AI governance is a team sport. Ensure developers, product managers, and legal teams understand the risks and controls.
5. Plan for Incidents
Have a clear incident response plan:
- Who gets alerted?
- How do you shut down the system if needed? (see the kill-switch sketch below)
- What's your communication plan?
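To make the shutdown question concrete, here's a minimal kill-switch sketch. It uses an in-process flag for brevity; a real deployment would more likely read a shared feature flag so every instance shuts off together:
// Hypothetical kill switch: handleAIRequest could call assertAIEnabled() as its first step
let aiEnabled = true

export function disableAI(reason: string): void {
  aiEnabled = false
  console.error(`AI pipeline disabled: ${reason}`)
}

export function assertAIEnabled(): void {
  if (!aiEnabled) {
    throw new Error('AI features are temporarily disabled')
  }
}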
6. Document Everything
Maintain comprehensive documentation of:
- Safety policies and thresholds
- Incident reports and resolutions
- Model versions and changes
- Compliance audits
Conclusion
Building robust AI governance is not optional—it's a fundamental requirement for responsible LLM deployment. By implementing the four pillars outlined in this guide, you can:
- Protect your users from harmful content
- Maintain regulatory compliance
- Build trust with stakeholders
- Enable rapid, confident AI innovation
Remember, AI governance is an ongoing journey, not a destination. As models evolve and new risks emerge, your governance framework must evolve too.
Additional Resources
- NIST AI Risk Management Framework
- OpenAI Safety Best Practices
- EU AI Act Compliance Guide
- OWASP LLM Top 10
This article is part of our Enterprise AI series. Stay tuned for our next post, "Optimizing LLM Performance at Scale", coming next week.