1
LLM Security
7 items
Prompt injection defenses in place
System prompts are isolated from user input. Validation and guardrails block attempts to override model instructions, exfiltrate data, or hijack behavior.
Critical
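One layer of the defenses above can be sketched as a deny-list check combined with strict role separation. The patterns below are illustrative assumptions, not an exhaustive list, and pattern matching alone is never sufficient on its own:

```python
import re

# Hypothetical deny-list of common instruction-override phrases.
# This is one layer among several: role isolation, output filtering,
# and least-privilege tool access all still apply.
INJECTION_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"you are now",
    r"system prompt",
]

def looks_like_injection(user_input: str) -> bool:
    """Return True if the input matches a known override pattern."""
    lowered = user_input.lower()
    return any(re.search(p, lowered) for p in INJECTION_PATTERNS)

def build_messages(system_prompt: str, user_input: str) -> list[dict]:
    """Keep the system prompt in its own role; never concatenate it
    with user text into a single string."""
    if looks_like_injection(user_input):
        raise ValueError("potential prompt injection blocked")
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_input},
    ]
```

The structural separation (distinct `system` and `user` roles) matters more than the deny-list, which attackers can evade with paraphrasing or encoding.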
Jailbreak resistance tested
The model is hardened against techniques that bypass its safety rails — including role-play attacks, encoding tricks, and multi-turn escalation.
Critical
Hallucination mitigation strategy
Responses are grounded in verified data (RAG). Temperature is tuned low. Outputs include source citations and confidence disclaimers where needed.
Critical
Bias and fairness audits conducted
Model outputs are regularly tested across demographics. Diverse test scenarios ensure the system does not discriminate by age, gender, ethnicity, or language.
High
Content moderation filters active
Input and output filters catch toxic, harmful, illegal, or off-topic content. Abuse reporting mechanisms are available to end users.
High
Knowledge base integrity protected
Write access to RAG sources is restricted. All changes are version-controlled. A review process prevents data poisoning of the knowledge base.
High
Red-team testing performed
Adversarial testing simulates real-world attacks including prompt injection, jailbreaking, and data extraction before every major release.
High
2
Privacy & Compliance
6 items
GDPR / data protection compliance
No PII is stored without consent. Data processing agreements exist with all LLM providers. A processing register (Art. 30 GDPR) is maintained.
Critical
Sensitive data handling policy
Health, financial, or other special-category data has explicit consent flows, encryption at rest and in transit (TLS 1.2+), and access restrictions.
Critical
EU AI Act transparency requirements met
Users are clearly informed they are interacting with an AI system. Documentation meets the EU AI Act obligations that apply to the system's risk tier.
High
Copyright and liability safeguards
Outputs are sourced from owned/licensed content only. Liability disclaimers are displayed. Legal review processes are in place for generated text.
High
Audit logging implemented
All API calls, errors, and model interactions are logged in a structured, PII-safe format. Log retention policies are defined. Central aggregation is set up.
Medium
Data retention and deletion policy
Conversation data has defined TTLs. Users can request deletion. Automated purge jobs ensure compliance with right-to-erasure requests.
Medium
3
Application Security
5 items
XSS protection for AI outputs
All model-generated content is HTML-escaped before rendering. CSP headers block inline scripts. Sanitization libraries handle markdown/HTML outputs.
Critical
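The escaping step above can be as simple as the following sketch. It assumes plain-text rendering; if markdown or HTML output is rendered, a dedicated sanitizer library should be used instead:

```python
import html

def render_model_output(text: str) -> str:
    """Escape model-generated text before inserting it into HTML.
    For markdown/HTML rendering pipelines, run the result through a
    sanitizer rather than trusting the model's markup."""
    return html.escape(text, quote=True)
```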
CSRF and SSRF protections active
Anti-CSRF tokens protect state-changing requests. Outbound connections are restricted to allow-listed endpoints. No user-controlled URLs are fetched server-side.
High
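The outbound allow-list described above might look like this minimal sketch; the host names are placeholders, and a real deployment would also block redirects and DNS rebinding:

```python
from urllib.parse import urlparse

# Hypothetical allow-list; in production this would come from config.
ALLOWED_HOSTS = {"api.openai.com", "internal-rag.example.com"}

def is_allowed_outbound(url: str) -> bool:
    """Permit only https URLs whose host is explicitly allow-listed,
    which also blocks cloud metadata endpoints like 169.254.169.254."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.hostname in ALLOWED_HOSTS
```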
Secure session management
Sessions use cryptographically strong IDs, short TTLs, HttpOnly + Secure + SameSite cookies, and protection against session fixation.
Medium
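A minimal sketch of the session properties listed above, framework-agnostic (the TTL value is an assumption):

```python
import secrets

def new_session_id() -> str:
    """Cryptographically strong, URL-safe session identifier.
    Issue a fresh ID on login to prevent session fixation."""
    return secrets.token_urlsafe(32)

def session_cookie_header(session_id: str, ttl_seconds: int = 1800) -> str:
    """Build a Set-Cookie value with the hardening attributes above."""
    return (
        f"session={session_id}; Max-Age={ttl_seconds}; "
        "HttpOnly; Secure; SameSite=Lax; Path=/"
    )
```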
Input validation on all AI-facing routes
Request payloads are validated for size, type, and format before reaching the model. Malformed or oversized inputs are rejected early.
Medium
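The early-rejection step can be sketched as follows. The field name and size limit are assumptions to be tuned per use case:

```python
MAX_PROMPT_CHARS = 8_000  # hypothetical limit; tune per use case

def validate_chat_payload(payload: dict) -> str:
    """Reject malformed or oversized requests before any model call,
    so invalid input never consumes tokens."""
    message = payload.get("message")
    if not isinstance(message, str):
        raise ValueError("message must be a string")
    if not message.strip():
        raise ValueError("message must not be empty")
    if len(message) > MAX_PROMPT_CHARS:
        raise ValueError("message exceeds maximum length")
    return message
```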
Zip bomb / decompression bomb defense
Uploaded archives are validated before extraction. Limits on decompressed size, nesting depth, and file count prevent malicious archives from exhausting server memory, disk, or CPU.
High
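The limits above can be checked against archive metadata before extracting a single byte. This sketch covers declared size, file count, and compression ratio (the thresholds are assumptions); note that declared sizes can lie, so extraction itself should also enforce limits while streaming, and nested-archive depth needs a separate check:

```python
import io
import zipfile

MAX_TOTAL_UNCOMPRESSED = 100 * 1024 * 1024  # 100 MB, hypothetical
MAX_FILE_COUNT = 1_000
MAX_COMPRESSION_RATIO = 100  # flag suspiciously dense entries

def check_archive(data: bytes) -> None:
    """Validate a zip archive's metadata before extraction.
    Raises ValueError on anything that looks like a decompression bomb."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        infos = zf.infolist()
        if len(infos) > MAX_FILE_COUNT:
            raise ValueError("too many files in archive")
        total = sum(i.file_size for i in infos)
        if total > MAX_TOTAL_UNCOMPRESSED:
            raise ValueError("declared uncompressed size too large")
        for i in infos:
            if i.compress_size and i.file_size / i.compress_size > MAX_COMPRESSION_RATIO:
                raise ValueError(f"suspicious compression ratio: {i.filename}")
```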
4
Infrastructure & Operations
5 items
API keys and secrets secured
No credentials in client-side code or public repos. Keys are stored in environment variables or secret managers. Regular rotation is automated.
Critical
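A minimal sketch of the environment-variable pattern above; the variable name is a placeholder, and in production a secret manager (Vault, AWS Secrets Manager, etc.) would typically back the lookup:

```python
import os

def require_secret(name: str) -> str:
    """Fetch a credential from the environment at startup; never
    hard-code it in source or commit it to a repo. Failing fast on a
    missing secret beats a cryptic error at first API call."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"missing required secret: {name}")
    return value
```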
Rate limiting and DoS protection
Per-IP and per-session rate limits are enforced. WAF or CDN edge protection is active. CAPTCHAs trigger on suspicious patterns.
High
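The per-session limit can be sketched as a token bucket. This in-memory version is illustrative only; a real deployment would use a shared store such as Redis so limits hold across app instances:

```python
import time

class TokenBucket:
    """Minimal per-client token bucket: `capacity` burst requests,
    refilled at `rate_per_sec`. One bucket per IP or session key."""

    def __init__(self, rate_per_sec: float, capacity: int):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated = time.monotonic()

    def allow(self) -> bool:
        """Consume one token if available; False means rate-limited."""
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```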
Availability and fallback strategy
Graceful degradation when the LLM provider is down. Retry logic with exponential backoff. Health-check endpoints and status pages are available.
Medium
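The retry behavior above can be sketched as exponential backoff with jitter (the attempt count and base delay are assumptions; in practice only transient errors such as timeouts and 429/5xx responses should be retried):

```python
import random
import time

def call_with_backoff(fn, max_attempts: int = 4, base_delay: float = 0.5):
    """Retry a flaky call with exponentially growing, jittered delays.
    Re-raises the last error once attempts are exhausted."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            # Full range: base * 2^attempt, scaled by 0.5-1.0 jitter.
            delay = base_delay * (2 ** attempt) * (0.5 + random.random() / 2)
            time.sleep(delay)
```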
Vendor lock-in mitigated
An abstraction layer sits between the app and the LLM API. Alternative providers have been evaluated. SLAs and exit strategies are documented.
Medium
Least-privilege IAM policies
Service accounts use minimal permissions. Network segmentation isolates AI workloads. Access is audited and reviewed periodically.
Medium
5
LLM Monitoring & Observability
5 items
Latency and throughput tracking
P50/P95/P99 response times are monitored. Alerts fire when latency exceeds thresholds. Dashboards show real-time and historical trends.
High
Token usage and cost monitoring
Input/output token counts are logged per request. Daily and monthly spend is tracked. Anomaly detection catches unexpected usage spikes.
High
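Per-request spend tracking reduces to multiplying logged token counts by unit prices. The prices below are placeholders; real rates vary by model and provider:

```python
# Hypothetical per-1K-token prices; substitute the provider's rates.
PRICE_PER_1K = {"input": 0.005, "output": 0.015}

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate per-request spend from logged token counts.
    Summing these per day/month feeds the spend dashboards."""
    return (input_tokens / 1000) * PRICE_PER_1K["input"] \
         + (output_tokens / 1000) * PRICE_PER_1K["output"]
```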
Response quality scoring
Automated evaluations score response relevance, groundedness, and safety. Feedback loops (thumbs up/down) provide continuous quality signals.
Medium
Drift and anomaly detection
Statistical monitoring detects shifts in model behavior, topic distribution, or refusal rates. Alerts trigger when patterns deviate from baselines.
Medium
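One simple form of the statistical monitoring above: compare an observed refusal rate against the baseline using a normal approximation to the binomial. The z-threshold is an assumption, and this sketch ignores multiple-testing effects across many monitored metrics:

```python
import math

def refusal_rate_alert(baseline_rate: float, refusals: int, total: int,
                       z_threshold: float = 3.0) -> bool:
    """Flag when the observed refusal rate deviates from baseline by
    more than z_threshold standard errors."""
    observed = refusals / total
    std_err = math.sqrt(baseline_rate * (1 - baseline_rate) / total)
    return abs(observed - baseline_rate) > z_threshold * std_err
```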
Error rate and failure mode tracking
API errors, timeouts, and content filter rejections are tracked separately. Root cause dashboards help distinguish model issues from infra problems.
Medium
6
Model Governance
5 items
Model versioning and changelog
Every model swap or prompt change is version-tagged. A changelog documents what changed and why, enabling rollback to any previous state.
High
Rollback plan documented and tested
A one-click rollback to the previous model version is available. The process is documented and has been tested in a staging environment.
High
Evaluation benchmarks defined
A standardized test suite measures accuracy, safety, and task completion rates. Benchmarks run automatically before any model or prompt update ships.
Medium
Model card or system documentation
A model card documents the model used, its capabilities, known limitations, intended use cases, and ethical considerations.
Medium
A/B testing framework for model changes
New models or prompt versions can be tested on a subset of traffic before full rollout, with quality metrics compared side by side.
Low
7
Cost Management
4 items
Budget caps and spending alerts
Hard and soft spending limits are configured at the provider level. Alerts fire at 50%, 80%, and 100% of monthly budget thresholds.
High
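The 50/80/100% alerts above can be made idempotent by comparing consecutive spend readings, so each threshold fires exactly once per month:

```python
ALERT_THRESHOLDS = (0.5, 0.8, 1.0)  # 50%, 80%, 100% of monthly budget

def crossed_thresholds(previous_spend: float, current_spend: float,
                       monthly_budget: float) -> list[float]:
    """Return thresholds newly crossed between two spend readings;
    a threshold already passed at the previous reading is not repeated."""
    return [t for t in ALERT_THRESHOLDS
            if previous_spend < t * monthly_budget <= current_spend]
```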
Per-request token budgets
Max input and output token limits are enforced per request. Runaway conversations are capped. Context window usage is optimized.
Medium
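Capping runaway conversations means trimming history to a context budget before each call. The budget and the chars-per-token heuristic below are rough assumptions; the provider's tokenizer should be used for accurate counts:

```python
MAX_CONTEXT_TOKENS = 4_000  # hypothetical per-request budget

def estimate_tokens(text: str) -> int:
    """Crude heuristic (~4 chars per token); replace with the
    provider's tokenizer for accurate counts."""
    return max(1, len(text) // 4)

def trim_history(messages: list[str],
                 budget: int = MAX_CONTEXT_TOKENS) -> list[str]:
    """Keep the most recent messages that fit within the budget,
    dropping the oldest first and preserving order."""
    kept, used = [], 0
    for msg in reversed(messages):
        cost = estimate_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))
```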
Cost attribution per feature
Token usage is tagged by feature or endpoint, enabling granular cost visibility. Teams can identify which features drive disproportionate spend.
Medium
Cost-efficiency optimization reviewed
Prompt compression, caching, smaller models for simple tasks, and batch inference have been evaluated. Unit economics are documented.
Low
8
UX & Responsible AI
4 items
Human escalation paths defined
Users with urgent or complex issues are seamlessly routed to human agents. The AI clearly communicates its limitations and offers alternatives.
High
AI disclosure visible to users
Users are clearly informed they are interacting with AI, not a human. The disclosure is persistent and not hidden in fine print.
High
User feedback mechanism in place
Users can report inaccurate, offensive, or unhelpful responses. Feedback is routed to the team for continuous improvement.
Medium
Accessibility standards met
The AI interface meets WCAG 2.1 AA. Screen readers can parse AI responses. Keyboard navigation works fully across the chat experience.
Medium