AI Safety — Unsafe Mode Detection

The AI Safety dashboard page monitors unsafe AI tool usage across all endpoints. It covers three distinct detection mechanisms:

  1. Config Guard compliance — Is the Config Guard file present and unmodified on every endpoint?
  2. Unsafe config events — Did any AI tool have dangerous config settings enabled (e.g. Cursor dangerouslyAllowArbitraryCode)?
  3. Unsafe process attempts — Did a developer try to start an AI tool with an unsafe CLI flag?

Process-level detection

Sielum monitors running processes for dangerous CLI flags:

| Tool | Flag | Risk |
|---|---|---|
| Claude Code | --dangerously-skip-permissions | Bypasses all permission checks |
| Claude Code | --allowedTools * | Grants access to every tool without restriction |
| Claude Code | --allowedTools all | Same as above |

No process killing

Sielum does not kill processes. When Claude Code's Managed Settings are configured, Claude Code itself rejects the --dangerously-skip-permissions flag before executing. Sielum only logs the attempt as an audit trail.

Every detected attempt is logged with its timestamp, endpoint, process name, PID, and the exact flag used. A high-severity alert is also created on the Alerts page.
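
The flag matching and audit record described above can be sketched roughly as follows. This is a minimal illustration, not Sielum's actual implementation: the function names `find_unsafe_flags` and `build_audit_record`, the record field names, and the exact matching rules are assumptions made for the example. Only the flags themselves and the logged fields come from this page.

```python
from datetime import datetime, timezone

# Flags taken from the table on this page; the matching logic is an assumption.
STANDALONE_FLAGS = {"--dangerously-skip-permissions"}
WILDCARD_TOOL_VALUES = {"*", "all"}


def find_unsafe_flags(argv):
    """Return the dangerous flags found in a process argument list."""
    findings = []
    for i, arg in enumerate(argv):
        if arg in STANDALONE_FLAGS:
            findings.append(arg)
        elif arg == "--allowedTools" and i + 1 < len(argv):
            # Only the wildcard values are treated as unsafe here.
            if argv[i + 1] in WILDCARD_TOOL_VALUES:
                findings.append(f"{arg} {argv[i + 1]}")
    return findings


def build_audit_record(endpoint, process_name, pid, flag):
    """Build a log entry with the fields listed above (hypothetical field names)."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "endpoint": endpoint,
        "process_name": process_name,
        "pid": pid,
        "flag": flag,
        "severity": "high",
    }
```

Note that the sketch only reports what it finds. Consistent with the behavior described above, it never touches the process itself.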

Cursor config enforcement

Sielum scans ~/.cursor/mcp.json on every tick and remediates dangerous settings automatically:

| Config | Risk | Sielum action |
|---|---|---|
| dangerouslyAllowArbitraryCode: true (root-level) | Allows arbitrary code execution by MCP servers | Set to false |
| dangerouslyAllowArbitraryCode: true (per-server) | Same, scoped to one MCP server | Set server disabled: true and flag to false |

Remediation is atomic (write-to-temp + rename) to avoid partial-write corruption. Each finding is reported to the server as an Unsafe Config Event with remediated: true.
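
To make the write-to-temp + rename pattern concrete, here is a minimal sketch of a scan-and-remediate pass. It is an illustration under stated assumptions, not Sielum's actual code: the function names, the shape of the returned findings, and the assumption that per-server entries live under a top-level `mcpServers` object are all hypothetical choices for the example.

```python
import json
import os
import tempfile

FLAG = "dangerouslyAllowArbitraryCode"


def remediate(config):
    """Neutralize dangerous settings in place and return one finding per fix."""
    findings = []
    if config.get(FLAG) is True:
        config[FLAG] = False
        findings.append({"scope": "root", "remediated": True})
    # Assumption: per-server settings sit under a top-level "mcpServers" object.
    for name, server in config.get("mcpServers", {}).items():
        if isinstance(server, dict) and server.get(FLAG) is True:
            server[FLAG] = False
            server["disabled"] = True
            findings.append({"scope": f"server:{name}", "remediated": True})
    return findings


def write_atomically(path, config):
    """Write to a temp file in the same directory, then rename over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(config, tmp, indent=2)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace swaps the file in one step, so readers never see a partial write.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def scan_and_remediate(path):
    """Load the config, fix dangerous settings, and rewrite only if something changed."""
    with open(path, encoding="utf-8") as f:
        config = json.load(f)
    findings = remediate(config)
    if findings:
        write_atomically(path, config)
    return findings
```

The temp file is created in the same directory as the target so the rename stays on one filesystem, which is what keeps the replacement atomic.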

Dashboard — AI Safety page

Navigate to AI Safety in the sidebar. The page has three tabs:

| Tab | What it shows |
|---|---|
| Config Guard | Compliance summary for all endpoints (compliant/non-compliant/tamper type) |
| Unsafe Config Events | All remediated dangerous config settings, with timestamp, endpoint, tool, path, and violation |
| Unsafe Process Attempts | All detected dangerous CLI flags, with timestamp, endpoint, process name, PID, and flag |

All tabs update live via WebSocket.

24h event badge

A red badge in the page header shows the total number of unsafe events (config and process combined) from the last 24 hours.
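
The badge value amounts to a count over a rolling 24-hour window. A rough sketch, assuming each event is represented by a timezone-aware timestamp and that the window boundaries are inclusive (the function name `badge_count` and both assumptions are illustrative, not taken from the product):

```python
from datetime import datetime, timedelta, timezone


def badge_count(config_events, process_events, now=None):
    """Count config and process events whose timestamp falls in the last 24 hours."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=24)
    all_events = list(config_events) + list(process_events)
    return sum(1 for ts in all_events if cutoff <= ts <= now)
```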