n8n-io/n8n

github.com/n8n-io/n8n · audited 2026-06-04 · commit fe87a5d

51% ERI composite

Unlike a marketing monorepo, n8n-io/n8n contains the actual product — the workflow engine, the CLI/server (packages/cli), the core (packages/core), and the editor. So its 51% composite is a meaningful read on a real, shipping platform, not a surface layer. The picture is that of a well-built product that grew enterprise features deliberately, with a few platform-control dimensions still maturing.

Where it’s strong

The enterprise-access story is genuinely good. Identity & Access (81%) — federated SSO is properly wired for both SAML (ACS endpoints) and OIDC (callback flow), not a bolted-on isAdmin flag. API & Extensibility (64%) is backed by a checked-in, machine-readable contract (packages/cli/src/public-api/v1/openapi.yml) that the server actually serves and enforces. Implementation & Customization (94%, small sample) shows real feature-flag/entitlement gating (centralized evaluation via PostHog) instead of per-customer forks. IP & OSS Hygiene (68%), Deployability (65%), and Reliability Primitives (64%) round out a solid execution tier.

Where the gaps are

The weak dimensions cluster around multi-tenant data governance. Audit / Governance / Residency (10%): there’s an EventMessageAudit / MessageEventBus.sendAuditEvent spine, but it’s far from a queryable, append-only audit store with residency controls. Tenancy Isolation (17%): no default-scoped query mechanism — isolation depends on explicit scoping at each call site rather than a database-enforced default, which is fragile as the surface grows. Procurement Readiness (11%) and Reporting & Data Export (28%) are similarly thin on the data-subject export / region-pinning controls that enterprise procurement asks for.

The read

n8n is the inverse of a typical OSS profile: the access and extensibility layers an acquirer worries about are in good shape, while the gaps are in the data-isolation and governance plumbing that’s harder to retrofit. None are fatal, but tenancy and audit would be the first line items in a 100-day enterprise-hardening plan. The dimension breakdown below is scored against the audited commit, evidence linked inline.

T1 Thesis Viability

AI / Data Foundation

Versioned data pipelines, pinned model versions, and a real vector or feature store — not scattered cron jobs and model="latest".

61% 15/19 scored
  • Declarative, tested transformations 140%
    7/5 expected sites
  • Data quality validation / contracts 50%
    1/2 expected sites
  • Raw / immutable source layer 0%
    0/1 expected sites
  • Data + pipeline versioning 0%
    0/2 expected sites not present
  • Vector / embedding store 17%
    1/4 expected sites
  • Model version pinning 100%
    4/4 expected sites
  • Prompt / model-call management 89%
    3/3 expected sites
  • Reproducibility / determinism 0%
    0/3 expected sites
  • AI output validation 100%
    3/3 expected sites
  • Grounding / wrongness check 100%
    3/3 expected sites
  • Self-correction / feedback loop 0%
    0/1 expected sites not present
  • Evaluation harness + scoring 100%
    3/3 expected sites
  • Runnable correctness checks 67%
    1/1 expected sites
  • Positive confirmation 67%
    1/1 expected sites
  • Machine-readable contracts 83%
    4/4 expected sites
Declarative, tested transformations 140%

A clear instance of declarative, tested transformations exists in the workflow template import path: `templateTransforms.ts` provides centralized transformation functions for credential overriding and resourceLocator scrubbing, and `templateTransforms.test.ts` contains unit tests (including boundary/empty and non-mutation checks). The import orchestration (`templateActions.ts`) correctly delegates to this transformation layer.

  • high

    Extend this transformation-layer pattern to any other template-related data reshaping currently done inline (if present). As a check, ensure any future transformation helpers have adjacent unit tests and are invoked by the import orchestration rather than reimplemented at call sites.

  • med

    If additional transformation responsibilities emerge (e.g., further template normalization beyond credentials and resourceLocator), consider expanding the existing `templateTransforms.ts` module rather than creating one-off helpers next to orchestration logic, to keep transforms declarative and testable as a single governed layer.

Orchestrated pipelines N/A

No codebase-wide “Orchestrated pipelines” primitive was found in the sense of a governed DAG/asset definition with explicit declared dependencies, retry behavior, and reproducible, observable pipeline runs. What was found related to “orchestration” is worker-status polling in the frontend, which does not constitute a pipeline orchestration layer with a dependency graph and execution governance.

  • high

    Identify where data/AI pipelines are executed (backend worker/services layer) and ensure they are defined declaratively as a DAG (assets/manifests) with explicit dependencies and retry/success/failure handling surfaced as structured run records (reproducible inputs + run metadata).

Data quality validation / contracts 50%

The repo externalizes data/contract validation using Zod-based schemas at tool boundaries (notably in the agents integration tool layer) and also provides a reusable schema-resolution/validation helper in the workflow SDK. This indicates a real data-quality/contract primitive, though only a small number of ingestion-boundary sites were confirmed in this audit slice.

  • high

    Extend this audit to all ingestion boundaries (HTTP handlers/DTO parsing, workflow input ingestion, node parameter parsing, webhook payloads). For each, require an explicit schema gate (Zod/JSON-schema-to-Zod) with a quarantine/error path instead of only downstream type assumptions.

  • med

    Add/verify standardized error routing for schema-validation failures (e.g., consistent error codes, capture invalid payloads, and ensure invalid inputs never propagate into execution).
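
The schema gate with a quarantine path recommended above can be sketched without any library. This is a minimal illustration under stated assumptions, not n8n code: `WebhookPayload`, `gateWebhookPayload`, the `quarantine` array, and the `SCHEMA_VALIDATION_FAILED` code are all hypothetical names, and a real boundary would more likely express the same contract as a Zod schema, as the audited tool layer does.

```typescript
// Hypothetical payload shape for one ingestion boundary (not taken from n8n).
type WebhookPayload = { workflowId: string; items: unknown[] };

type GateResult<T> =
  | { ok: true; value: T }
  | { ok: false; code: "SCHEMA_VALIDATION_FAILED"; issues: string[] };

// Invalid payloads are captured here instead of flowing into execution.
const quarantine: Array<{ payload: unknown; issues: string[] }> = [];

function gateWebhookPayload(payload: unknown): GateResult<WebhookPayload> {
  const issues: string[] = [];
  if (typeof payload !== "object" || payload === null) {
    issues.push("payload must be an object");
  } else {
    const candidate = payload as Record<string, unknown>;
    if (typeof candidate.workflowId !== "string" || candidate.workflowId === "") {
      issues.push("workflowId must be a non-empty string");
    }
    if (!Array.isArray(candidate.items)) {
      issues.push("items must be an array");
    }
  }
  if (issues.length > 0) {
    // Quarantine path: keep the rejected payload and a consistent error code.
    quarantine.push({ payload, issues });
    return { ok: false, code: "SCHEMA_VALIDATION_FAILED", issues };
  }
  const valid = payload as WebhookPayload;
  return { ok: true, value: { workflowId: valid.workflowId, items: valid.items } };
}
```

Returning a tagged result rather than throwing keeps the error code uniform across boundaries and makes it hard for a caller to use the payload without checking `ok` first.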

Raw / immutable source layer 0%

An explicit raw landing-layer concept exists (`InsightsRaw`), and compaction stages read from it. However, the compaction process deletes the processed raw rows from the source table, so the raw layer is not immutable and is not safely recoverable for auditing/reprocessing after transforms.

  • high

    Make `insights_raw` append-only/immutable by removing the destructive delete. Replace `DELETE FROM ${sourceTableName} ...` with an approach that preserves raw rows (e.g., mark-processed without deletion, move to an archive table, or store immutable raw snapshots keyed by run/batch).

  • med

    Add explicit auditability guarantees: store lineage metadata for each compaction run (source batch identifiers, time window, and counts) so auditors can reproduce aggregates from immutable raw data.
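
One way to read the two recommendations above together is a compaction step that records which raw rows it consumed instead of deleting them. The sketch below uses in-memory arrays as stand-ins for the `insights_raw` table and a run log; `appendRaw`, `compact`, `replay`, and the `CompactionRun` shape are hypothetical and only illustrate the pattern.

```typescript
// Hypothetical in-memory stand-ins for the raw table and the compaction run log.
type RawRow = Readonly<{ id: number; metric: string; value: number }>;
type CompactionRun = {
  runId: number;
  sourceIds: number[]; // lineage: exactly which raw rows fed this run
  rowCount: number;
  totals: Record<string, number>;
};

const rawRows: RawRow[] = []; // append-only: rows are never updated or deleted
const compactedIds = new Set<number>();
const runs: CompactionRun[] = [];

function appendRaw(metric: string, value: number): RawRow {
  const row: RawRow = { id: rawRows.length + 1, metric, value };
  rawRows.push(row);
  return row;
}

function sumByMetric(rows: readonly RawRow[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const row of rows) {
    totals[row.metric] = (totals[row.metric] ?? 0) + row.value;
  }
  return totals;
}

// Compacts rows not yet covered by a run and records them as processed
// instead of deleting them from the raw layer.
function compact(): CompactionRun {
  const pending = rawRows.filter((row) => !compactedIds.has(row.id));
  for (const row of pending) compactedIds.add(row.id);
  const run: CompactionRun = {
    runId: runs.length + 1,
    sourceIds: pending.map((row) => row.id),
    rowCount: pending.length,
    totals: sumByMetric(pending),
  };
  runs.push(run);
  return run;
}

// Audit path: recompute a run's aggregates from the preserved raw rows.
function replay(runId: number): Record<string, number> {
  const run = runs.find((candidate) => candidate.runId === runId);
  if (run === undefined) throw new Error(`unknown compaction run ${runId}`);
  const ids = new Set(run.sourceIds);
  return sumByMetric(rawRows.filter((row) => ids.has(row.id)));
}
```

Because the run record keeps the source row identifiers and counts, an auditor can call `replay` for any past run and compare the result with the stored totals.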

Data + pipeline versioning 0%

No clear implementation of “Data + pipeline versioning” (data state captured in immutable, versioned snapshots and linked to specific pipeline logic releases) was found. The codebase has evaluation dataset syncing and mock/pin-data generation, but dataset state is updated (and new examples get random UUIDs) rather than being snapshot-versioned and tied to a specific pipeline release for guaranteed reproducibility.

  • high

    Add explicit, release-tied dataset snapshotting for evaluation inputs (e.g., store generated/synced scenario inputs + splits as immutable artifacts/versions, then reference the snapshot id in evaluation runs). Ensure the snapshot is created deterministically from repo content + pipeline version/commit, not only by diffing current filesystem state.

  • med

    Record and persist pin-data generation provenance/versioning (generator code version/commit + schema resolution strategy + input workflow hash) and store the resulting pin data as versioned artifacts (or ensure the evaluation can fetch a prior immutable pin-data version).

  • low

    If using an external system (e.g., LangSmith) as the data store, introduce a governed “dataset/model/pipeline version” field in example metadata and enforce that evaluation runs pin to a specific dataset snapshot/version rather than relying on “current sync”.

Data lineage / provenance N/A

No explicit data lineage / provenance primitive (e.g., OpenLineage/Marquez/DataHub/Amundsen, or equivalent lineage/provenance emission + governance artifacts) was found in this codebase via repository-wide searches. No schema/config/artifact for provenance emission and no lineage/provenance implementation points were located.

  • high

    Add an explicit, machine-queryable lineage/provenance emission layer in the pipeline/execution path (record dataset identifiers, source dataset(s), and transformation/derivation edges with timestamps + run identifiers). Ensure it is persisted and queryable (DB tables/events) and covered by automated tests that validate lineage correctness end-to-end.

  • med

    Adopt or integrate a standard lineage model (e.g., OpenLineage) or define an equivalent internal schema and publish a machine-readable contract (schema) for lineage events.

Feature management N/A

I did not find any feature-management primitive (e.g., a centralized, versioned feature definition/feature store used as a single source of truth by both training and serving). The codebase appears to define feature-like schemas inline in application code rather than externalizing them into a governed feature layer suitable for avoiding training/serving skew.

  • high

    Introduce a centralized feature-definition artifact (feature store + versioned contracts/feature manifests) and route both training and serving to read from the same generated/compiled feature definitions.

Vector / embedding store 17%

This codebase clearly implements a vector/embedding-store primitive via n8n LangChain vector store nodes (including external providers like PGVector) and a shared createVectorStoreNode dispatcher. However, the audited implementation shown for the in-memory store is explicitly ephemeral (lost on restart) and the memory manager metadata does not demonstrate any governance tying vectors to the embeddings model version and embedded content version.

Model version pinning 100%

Model version pinning exists and is implemented well in `packages/@n8n/ai-workflow-builder.ee/src/llm-config.ts`, where model factories construct LangChain chat models using explicit versioned model IDs for OpenAI and Anthropic. Additionally, integration-test fixtures include explicit versioned model identifiers to keep test behavior deterministic. However, general runtime node templates (e.g., the OpenAI-compatible example node and the LMChatOpenAi node) appear to accept model IDs from node parameters without enforcing version pinning.

Prompt / model-call management 89%

The codebase does have a managed/centralized prompt layer for at least the AI Workflow Builder and related evaluators (e.g., packages/@n8n/ai-workflow-builder.ee/src/prompts/* with re-exported builders). Core LLM calls use these prompt builders and, in key flows like the Planner Agent, enforce a structured output schema with validation and a bounded retry loop—matching the intended prompt/model-call management primitive.

  • high

    Ensure every other agent/evaluator model call site in this repo (outside the planner/responder examples) follows the same pattern: (1) prompt text built from the centralized prompts module, and (2) output parsed/validated against an explicit schema gate before the result is used.

  • med

    Add/strengthen automated tests that fail if a call site inlines a prompt literal or bypasses the centralized prompt builders (e.g., unit tests asserting the prompt builder function is used, or snapshot tests tied to prompt builder outputs).

Reproducibility / determinism 0%

The repo contains at least one strong determinism pattern: deterministic workflow-builder node ID generation that is explicitly implemented and unit-tested. However, at higher-level evaluation/execution run boundaries (the parts that would need exact reproducibility of datasets/prompting/LLM sampling), the observed harness/config wiring does not show explicit capture of determinism controls such as RNG seed or pinned model sampling parameters; the execution flow also uses randomUUID for run IDs.

  • high

    Add a determinism configuration object at evaluation-run boundaries (e.g., in the harness runner config): capture and persist (1) RNG/seed value(s), (2) LLM sampling parameters (temperature/top_p/max_tokens), (3) pinned model IDs/versions, and (4) relevant environment/config versions. Persist it alongside the evaluation run artifacts (transcript/output/score).

  • high

    Eliminate or quarantine non-deterministic identifiers used inside evaluation execution, or ensure they are explicitly recorded as non-reproducible metadata while the actual determinism controls are captured separately (e.g., record the seed + model parameters instead of relying on deterministic outputs only).

  • med

    Ensure CI/provenance capture includes the determinism-critical items, not just CI source (ci/local) and GH run IDs. Either embed commit SHA/branch in the metadata artifact or ensure the run artifact stores them directly (instead of relying on LangSmith auto-tracking).
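
The determinism configuration object described above might look like the following sketch. Everything here is an assumption for illustration: the `RunManifest` fields, the `sampleExamples` helper, and the choice of a mulberry32-style generator are not taken from the audited harness. The point is only that every source of randomness is driven by values that are persisted with the run.

```typescript
// Hypothetical manifest persisted next to each evaluation run's artifacts.
type RunManifest = {
  seed: number;
  modelId: string; // a pinned, versioned model identifier
  temperature: number;
  topP: number;
  maxTokens: number;
  commitSha: string;
};

// mulberry32-style seedable PRNG, used here only so sampling is repeatable.
function seededRandom(seed: number): () => number {
  let state = seed | 0;
  return () => {
    state = (state + 0x6d2b79f5) | 0;
    let t = Math.imul(state ^ (state >>> 15), state | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Fisher-Yates shuffle driven by the manifest seed, then take the first `count`.
function sampleExamples<T>(examples: readonly T[], count: number, manifest: RunManifest): T[] {
  const random = seededRandom(manifest.seed);
  const shuffled = [...examples];
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
  }
  return shuffled.slice(0, count);
}
```

Run IDs can remain random as long as they are treated as labels; what makes a run repeatable is that the seed, sampling parameters, model ID, and commit in the manifest are enough to regenerate the same inputs.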

AI output validation 100%

The codebase contains a strong, schema-governed AI output validation primitive for structured outputs. LLM text is parsed and validated against a declared Zod schema before any result is accepted, and when auto-fix is enabled, failures trigger a bounded retry loop that re-checks the corrected output against the same schema.

  • high

    Search for any other LLM call sites whose outputs are consumed without going through a structured output parser (e.g., raw `content`/string outputs returned to workflow execution). Add/route them through a schema gate like `N8nStructuredOutputParser` to ensure consistent rejection/retry behavior.

  • med

    If additional structured output formats exist beyond the current Zod-from-JSON-schema path, factor them into the same parsing/validation interface so all formats share identical error messages and retry semantics.

Grounding / wrongness check 100%

The codebase contains a concrete grounding/wrongness check primitive implemented as LLM-as-judge correctness evaluation plus robust judge-output parsing (with multiple output-format fallbacks). This enables verdicts (pass/fail) derived from comparing generated output against expected context, rather than surfacing raw model text without verification.

  • high

    Extend the wrongness-check coverage from offline evaluations to any production paths where AI outputs are acted upon (e.g., auto-executed workflow edits or direct user-facing factual claims). Ensure there is a deterministic verdict gate (schema-validated verdict + bounded retries/fallback) between generation and action.

  • med

    Add explicit tests asserting end-to-end that judge verdict parsing fails safely (e.g., returns undefined/throws) and cannot be interpreted as a pass when parsing fails.
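
A fail-safe verdict gate of the kind these recommendations describe can be sketched as follows. The two accepted output formats and the names `parseVerdict` and `isApproved` are assumptions for illustration; the audited judge parser may accept different formats.

```typescript
type Verdict = "pass" | "fail";

// Returns undefined whenever the judge output cannot be read unambiguously.
function parseVerdict(raw: string): Verdict | undefined {
  // Format 1: a JSON object such as {"verdict": "pass"}.
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null) {
      const verdict = (parsed as Record<string, unknown>).verdict;
      if (verdict === "pass" || verdict === "fail") return verdict;
    }
  } catch {
    // Not JSON; fall through to the text format.
  }
  // Format 2: a standalone line such as "Verdict: PASS".
  const match = /^\s*verdict\s*:\s*(pass|fail)\s*$/im.exec(raw);
  if (match === null) return undefined;
  return match[1].toLowerCase() === "pass" ? "pass" : "fail";
}

// Gate between generation and action: only an explicit "pass" counts,
// so a parsing failure can never be mistaken for approval.
function isApproved(raw: string): boolean {
  return parseVerdict(raw) === "pass";
}
```

Matching the verdict only on its own line avoids treating free text such as "this does not pass" as a pass.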

Self-correction / feedback loop 0%

I did not find a closed self-correction feedback loop that takes a specific check/validation error, injects it into the next model attempt, and re-checks with bounded retries. The closest pattern is a bounded retry loop in the checklist verifier, but it retries without feeding back the failure details into the prompt.

  • high

    Implement a closed feedback loop in the checklist verifier: when agent.generate throws or when parsed.results is missing/empty, capture the exact error (e.g., exception message, structuredOutput parsing failure reason, or validation mismatch) and append it to the next attempt’s user message/instructions (e.g., an additional section like “Previous attempt failure: … Fix accordingly”). Keep MAX_VERIFY_ATTEMPTS and ensure a safe fallback still returns an empty result if all attempts fail.

  • med

    Add a targeted test that simulates structuredOutput/JSON schema failures and asserts the next attempt includes the prior failure message and that parsing succeeds on retry (or safely falls back after MAX_VERIFY_ATTEMPTS).
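
The closed loop described above can be sketched as a bounded retry that carries the previous failure into the next prompt. This is an illustration under assumptions: `verifyWithFeedback` and the `Attempt` record are hypothetical, the value of `MAX_VERIFY_ATTEMPTS` is assumed, and the model call is kept synchronous to keep the sketch small (a real call would be asynchronous).

```typescript
type Attempt = { prompt: string; error?: string };

const MAX_VERIFY_ATTEMPTS = 3; // name from the audit text; the value is an assumption

function verifyWithFeedback(
  basePrompt: string,
  generate: (prompt: string) => string, // stand-in for the model call
  parse: (raw: string) => string[], // throws when the output is invalid
): { results: string[]; attempts: Attempt[] } {
  const attempts: Attempt[] = [];
  let lastError: string | undefined;
  for (let i = 0; i < MAX_VERIFY_ATTEMPTS; i++) {
    // Feed the previous failure back into the next attempt.
    const prompt =
      lastError === undefined
        ? basePrompt
        : `${basePrompt}\n\nPrevious attempt failure: ${lastError}\nFix accordingly.`;
    try {
      const results = parse(generate(prompt));
      if (results.length === 0) throw new Error("results is missing or empty");
      attempts.push({ prompt });
      return { results, attempts };
    } catch (error) {
      lastError = error instanceof Error ? error.message : String(error);
      attempts.push({ prompt, error: lastError });
    }
  }
  // Safe fallback: all attempts failed, so return an empty result.
  return { results: [], attempts };
}
```

Returning the attempt log alongside the result makes the targeted test straightforward: it can assert that the second prompt contains the first attempt's error text.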

Evaluation harness + scoring 100%

The repo includes a full evaluation harness with scoring and artifact output: evaluators run via a central harness runner, harness-level score aggregation is implemented in a dedicated score-calculator module, and evaluation results are persisted to disk (including summary.json) for offline regression measurement. Implementation quality is strong and appears to support both local and LangSmith modes with reusable, testable components.

Runnable correctness checks 67%

The repository contains a runnable correctness-check primitive in the form of the `packages/testing/code-health` CLI, which runs rule checks and returns clear process exit codes (0 on pass; 1 on violations; 2 on internal errors). I did not find evidence of a root-level documented single command or CI workflow entrypoint wiring it, so the completeness of the “one entrypoint” story is only partial.

  • high

    Add/confirm a top-level, documentation-backed command (e.g., a repo root `pnpm` script or Make/Just target) that runs `packages/testing/code-health/src/cli.ts` with the intended arguments, so agents can discover a standard one-command pass/fail check without spelunking.

    • packages/testing/code-health/src/cli.ts:1-105: the CLI itself has the pass/fail semantics, but no root-level `package.json` or CI workflow entrypoint was located (via code-graph queries) in this audit environment to confirm the standard command wiring.

  • med

    Document the exact CLI invocation contract (supported commands/flags, expected env vars like `CODE_HEALTH_CHANGED_FILES`, and what constitutes pass/fail) in a README at `packages/testing/code-health/` so the correctness signal is externally governed.

Actionable diagnostics N/A

The codebase contains a governed diagnostics primitive via custom ESLint rules in `packages/@n8n/eslint-config`. The rules emit structured, rule-specific messages and often include autofixers, turning lint failures into actionable “what/where/how to fix” diagnostics.

  • high

    Add or verify repo-level documentation for a runnable lint check (e.g., `npm run lint` / `pnpm lint`) and ensure it surfaces ESLint rule IDs + file/line locations in CI logs, so diagnostics are actionable outside of local development.

  • med

    For the most important rules, ensure each rule uses `meta.messages`/`messageId` consistently (and includes an autofix where safe), extending the existing pattern shown by `no-plain-errors` and `no-json-parse-json-stringify`.

Positive confirmation 67%

The codebase has an explicit positive confirmation mechanism in the AI workflow evaluation CLI: successful completion ends with exit code 0, while exceptions end with exit code 1. However, the inline comment indicates pass/fail is treated as informational rather than mapped to the exit code, limiting the strength of “correctness” signaling.

  • high

    If the intended primitive is “confirm correct (green) vs wrong (fail) so agents/CI can safely stop,” update the CLI to map evaluation pass/fail (based on the computed `summary` / score vs threshold) to the process exit code (e.g., exit 0 only when pass, exit 2 or 1 when fail) instead of always exiting 0 on successful completion.
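
Mapping the evaluation outcome to an exit code could be as small as the sketch below. The `EvaluationSummary` and `RunOutcome` shapes and the `exitCodeFor` name are hypothetical, and the 0/1/2 convention is an assumption borrowed from the code-health CLI described earlier in this report.

```typescript
// Hypothetical summary shape; the real CLI's `summary` may differ.
type EvaluationSummary = { averageScore: number; totalExamples: number };

type RunOutcome =
  | { kind: "completed"; summary: EvaluationSummary }
  | { kind: "crashed"; error: string };

// 0 = evaluation passed, 1 = evaluation ran but failed, 2 = harness error.
function exitCodeFor(outcome: RunOutcome, passThreshold: number): 0 | 1 | 2 {
  if (outcome.kind === "crashed") return 2;
  // An empty run is not evidence of correctness, so it does not pass.
  if (outcome.summary.totalExamples === 0) return 1;
  return outcome.summary.averageScore >= passThreshold ? 0 : 1;
}
```

The CLI entrypoint would then set its process exit code from this function rather than exiting 0 whenever the run completes.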

Machine-readable contracts 83%

This codebase contains strong machine-readable contract artifacts, primarily via exported Zod schemas (extension and package manifests) and explicit JSON-schema-driven tool input validation in the agent layer. These contracts are treated as source-of-truth for validation and are supported by automated tests.

  • high

    Identify the production call sites that consume the exported schemas (beyond the schema definitions/tests) and ensure there is an explicit, documented path for agents/tools to retrieve the contract artifacts (e.g., stable imports/entrypoints or generated schema outputs).

  • med

    For workspace manifests, consider migrating/adding an explicit Zod schema (or equivalent JSON-schema artifact) alongside the parser to make the contract more uniformly machine-readable for downstream tooling.

Not applicable to this codebase: Orchestrated pipelines, Data lineage / provenance, Feature management, Actionable diagnostics.

Tenancy Isolation

A tenant_id on every business table, row-level security in the database, and tests that prove a cross-tenant request returns 403.

17% 10/12 scored
  • Default-scoped queries 0%
    0/1 expected sites not present
  • Tenant context at the boundary 0%
    0/1 expected sites not present
  • Cache key namespacing 0%
    0/3 expected sites not present
  • Object/blob partitioning 0%
    0/4 expected sites not present
  • Tenant context in async work 0%
    0/3 expected sites not present
  • Per-tenant resource limits 0%
    0/2 expected sites not present
  • Tenant-scoped key management 0%
    0/3 expected sites not present
  • Admin / role scoping 67%
    3/3 expected sites
  • Uniform not-found vs. forbidden 100%
    3/3 expected sites
  • Cross-tenant isolation tests 0%
    0/3 expected sites not present
Tenant key on every record N/A

No tenant-key-on-every-record primitive was found. The database entities inspected (e.g., WorkflowEntity, Agent) do not include a tenant/organization/workspace discriminator column, and the shared base entity does not implement any tenant-key mixin or default tenant scoping mechanism. This suggests the codebase is not enforcing multi-tenant row ownership via a required tenant key on every record.

  • high

    If this product is intended to be multi-tenant, introduce a tenant identifier column (tenantId/orgId/workspaceId) on all writeable business tables and enforce it via database constraints and/or default ORM scoping. Start by identifying the canonical tenant key used in this system (if any) and then backfill migrations + entity/repository updates.

  • med

    Add an automated check (lint/test) that fails CI if new/changed writeable entities or migrations omit the required tenant discriminator column, and add integration tests that attempt cross-tenant reads/lists/exports and assert denial.

Database-enforced isolation N/A

This codebase does not appear to implement a database-enforced tenancy isolation safety net (e.g., Postgres RLS policies forced at the DB layer, or schema/database-per-tenant). While the project has “workspaceId/tenant-like” concepts, the concrete scoping mechanisms observed are implemented in application-layer components (e.g., scoped filesystem/workspace and instance security settings), not in the database layer as a default-enforced filter.

  • high

    Confirm whether n8n’s data model is truly multi-tenant at the DB level (workspace/instance boundaries) and, if so, implement defense-in-depth in the database: add RLS policies (or schema-per-tenant / DB-per-tenant) that restrict reads/writes by the server-resolved tenant/workspace identifier, and ensure policies are FORCED even for table owners.

  • med

    Add integration tests that attempt cross-workspace (cross-tenant) read/list/export actions and assert denial at the DB boundary (not only application 403s).

Default-scoped queries 0%

I did not find evidence of a default-scope mechanism that automatically applies tenant/project isolation to queries. Instead, the code relies on explicit scoping in some methods, and at least one critical 'existing record' lookup (`findOneBy({ id: threadId })`) is not scoped by project/workspace, which is consistent with this primitive being absent at the data-access layer.

  • high

    Introduce (and enforce) a data-layer default scope for tenant/project identifiers so that all repository/ORM reads are automatically filtered. Concretely, ensure `findOrCreateInSerializableTransaction` cannot load a thread without the correct tenant/project scope (e.g., by making the base repository/EntityManager inject `projectId`/tenant constraints for all `find*` calls).

  • med

    Audit other repository/entity queries for the same pattern: any `findOneBy({ id: ... })` / `findOne(...)` / query builder reads that do not include the tenant/project discriminator should be converted to default-scoped behavior or explicitly constrained at the repository boundary.
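
A default-scoped data-access layer, as recommended above, binds the project scope when the repository is created so that no individual read can forget it. The sketch below is an in-memory illustration; `ProjectScopedRepository` and the `Thread` shape are hypothetical and do not reflect n8n's actual repositories or ORM.

```typescript
type ProjectOwned = { id: string; projectId: string };
type Thread = ProjectOwned & { title: string };

// Every read is filtered by the project bound at construction time,
// so call sites cannot issue an unscoped lookup by id.
class ProjectScopedRepository<T extends ProjectOwned> {
  private readonly rows: readonly T[];
  private readonly projectId: string;

  constructor(rows: readonly T[], projectId: string) {
    this.rows = rows;
    this.projectId = projectId;
  }

  findOneById(id: string): T | undefined {
    return this.rows.find((row) => row.id === id && row.projectId === this.projectId);
  }

  findAll(): T[] {
    return this.rows.filter((row) => row.projectId === this.projectId);
  }
}
```

With this shape, the equivalent of `findOneBy({ id: threadId })` for a thread in another project simply returns `undefined`, without the call site having to remember the scope.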

Tenant context at the boundary 0%

The codebase establishes an authenticated principal (`req.user`) at the request boundary via `AuthService.createAuthMiddleware`, but I did not find any implementation of “tenant context at the boundary” (i.e., tenant/workspace/org context derived once from the verified identity/JWT before business logic). Therefore, this tenancy isolation primitive appears absent in the request-entry layer.

  • high

    Implement tenant/workspace/org context resolution in the entry auth middleware (same layer as `createAuthMiddleware`), deriving tenant context from the verified principal/session (e.g., from user membership/claims), storing it in a trusted request-scoped field (e.g., `req.tenant`/`req.workspaceId`), and ensuring downstream services use this trusted value rather than client-supplied IDs.
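
Resolving tenant context once, right after authentication, might look like this sketch. The `Principal`, `TenantContext`, and `IncomingRequest` shapes and the `attachTenantContext` function are hypothetical; in particular, how workspace membership is attached to the verified principal is an assumption.

```typescript
type Principal = { userId: string; workspaceIds: string[] };
type TenantContext = { userId: string; workspaceId: string };
type IncomingRequest = {
  user?: Principal; // set by the auth middleware after verification
  requestedWorkspaceId?: string; // client-supplied hint, never trusted on its own
  tenant?: TenantContext; // trusted, request-scoped result
};

class TenantResolutionError extends Error {}

// Runs directly after authentication and before any business logic.
function attachTenantContext(req: IncomingRequest): TenantContext {
  if (req.user === undefined) throw new TenantResolutionError("unauthenticated");
  const { userId, workspaceIds } = req.user;
  const requested = req.requestedWorkspaceId ?? workspaceIds[0];
  // The client's hint is honored only if the verified principal is a member.
  if (requested === undefined || !workspaceIds.includes(requested)) {
    throw new TenantResolutionError("no accessible workspace");
  }
  const tenant: TenantContext = Object.freeze({ userId, workspaceId: requested });
  req.tenant = tenant;
  return tenant;
}
```

Downstream services would read `req.tenant` only, so a workspace ID in a body or query string never decides the scope by itself.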

Cache key namespacing 0%

The codebase does implement a Redis cache key prefix, but it is based on global configuration (`globalConfig.redis.prefix` + `globalConfig.cache.redis.prefix`) rather than tenant-aware key namespacing. Concrete cache keys like `mfa:enforce` and `roles:scope-map` are also not tenant-prefixed, and there is no visible default tenant scoping in the cache layer.

  • high

    Enforce tenant-aware cache key construction in the cache layer (`CacheService`) so that every get/set/delete automatically prefixes keys with a tenant identifier (e.g., `tenant:{tenantId}:...`) derived from trusted tenant context (request/auth/session), not client input.

  • high

    Audit and update all constant/bare cache keys (e.g., `mfa:enforce`, `roles:scope-map`) to either (a) rely on the new cache-layer tenant prefix automatically, or (b) include an explicit tenant component in the composed key where the cache layer cannot infer it.

  • med

    Add an integration test that attempts cross-tenant cache access (set in tenant A, get in tenant B) and asserts it is isolated/denied (or returns miss), to prevent silent cache-based leaks.
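
Tenant-aware key construction in the cache layer can be sketched as a thin wrapper that always composes the key itself. `TenantCache` is a hypothetical stand-in backed by a `Map`, not n8n's `CacheService`; the `tenant:{tenantId}:...` layout follows the example in the recommendation above.

```typescript
// Hypothetical wrapper: callers pass a tenant id and a bare key,
// and never build the full cache key themselves.
class TenantCache {
  private readonly store = new Map<string, string>();
  private readonly globalPrefix: string;

  constructor(globalPrefix: string) {
    this.globalPrefix = globalPrefix;
  }

  private fullKey(tenantId: string, key: string): string {
    // Reject ids that could collide with the key separator.
    if (tenantId === "" || tenantId.includes(":")) throw new Error("invalid tenant id");
    return `${this.globalPrefix}:tenant:${tenantId}:${key}`;
  }

  set(tenantId: string, key: string, value: string): void {
    this.store.set(this.fullKey(tenantId, key), value);
  }

  get(tenantId: string, key: string): string | undefined {
    return this.store.get(this.fullKey(tenantId, key));
  }

  delete(tenantId: string, key: string): boolean {
    return this.store.delete(this.fullKey(tenantId, key));
  }
}
```

A bare key such as `mfa:enforce` then resolves to a different entry per tenant, and a read from another tenant is a miss.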

Object/blob partitioning 0%

No evidence was found of tenant-scoped object/blob partitioning in the storage layer: the S3 node and S3 request helper appear to operate on caller-provided `bucketName` and `fileKey`/`path` without automatically adding a tenant/workspace namespace or enforcing tenant-scoped access at the blob/object addressing level.

  • high

    Enforce tenant/workspace-scoped object key partitioning at the lowest storage boundary (e.g., S3 request helper or an unconditional wrapper): require the caller to supply only a tenant-scoped identifier, and automatically prefix object keys (e.g., `${workspaceId}/...`) or select per-tenant buckets.

  • high

    Harden the S3 node object operations (download/upload/delete/copy/list) so that `fileKey`/`destinationPath` are transformed/validated into tenant-scoped object identifiers by default (and do not permit reading/writing arbitrary global keys).
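
Key partitioning at the storage boundary can be illustrated with a small pure function that turns a caller-supplied `fileKey` into a workspace-scoped object key. `tenantObjectKey`, `assertOwnedBy`, and the workspace ID pattern are assumptions for this sketch, not part of the audited S3 helper.

```typescript
// Assumed format for workspace ids; adjust to the real identifier scheme.
const WORKSPACE_ID_PATTERN = /^[A-Za-z0-9_-]+$/;

// Builds the object key from trusted tenant context plus a caller-supplied path.
function tenantObjectKey(workspaceId: string, fileKey: string): string {
  if (!WORKSPACE_ID_PATTERN.test(workspaceId)) throw new Error("invalid workspace id");
  const segments = fileKey.split("/").filter((segment) => segment !== "" && segment !== ".");
  if (segments.length === 0) throw new Error("empty object key");
  // Refuse parent-directory segments so a key cannot escape its partition.
  if (segments.includes("..")) throw new Error("path traversal is not allowed");
  return `${workspaceId}/${segments.join("/")}`;
}

// Check for keys that arrive already composed (e.g., from a listing).
function assertOwnedBy(workspaceId: string, objectKey: string): void {
  if (!objectKey.startsWith(`${workspaceId}/`)) {
    throw new Error("object is outside the caller's partition");
  }
}
```

For example, `tenantObjectKey("ws1", "/reports//q1.csv")` yields `ws1/reports/q1.csv`, while a key containing `..` is rejected before any request is sent.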

Tenant context in async work 0%

I did not find a concrete “tenant context in async work” primitive implemented in the async layers. The core queued execution worker (`JobProcessor`) consumes a job payload (`JobData`) that does not include tenant/workspace/org/account identifiers, and it immediately queries execution data by `executionId` without any tenant context being established from the job message. This means tenant isolation in async processing appears to rely on something outside the payload (or is missing entirely at the primitive level).

  • high

    Introduce mandatory tenant context for queued execution jobs: extend `JobData` to include a verified tenant/workspace/org identifier (or include enough data for the worker to derive it from a trusted source) and update enqueue sites accordingly.

  • high

    Update the worker entry point to re-establish tenant context before touching data. Concretely, ensure `JobProcessor.processJob` sets/loads tenant context (from trusted job payload data or trusted execution/workflow lookup) and that repository calls cannot proceed without tenant scoping.

  • med

    Apply the same tenant-context rule to other async message boundaries (task broker / runner messaging). Ensure handler paths either carry tenant context in message payloads or derive it in a way that enforces tenant-scoped access by default.
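
Carrying tenant context through the queue can be sketched as a job payload that includes the workspace and a worker that refuses to proceed unless the loaded execution matches it. The `JobData` and `Execution` shapes here are simplified, hypothetical stand-ins for the real types, and `processJob` is a free function rather than the actual `JobProcessor` method.

```typescript
// Simplified, hypothetical shapes; the real JobData has other fields.
type JobData = { executionId: string; workspaceId: string };
type Execution = { id: string; workspaceId: string; status: "new" | "running" | "done" };

// Worker entry point: tenant context is re-established before any data is used.
function processJob(job: JobData, executions: Map<string, Execution>): Execution {
  if (job.workspaceId === "") throw new Error("job is missing tenant context");
  const execution = executions.get(job.executionId);
  // A workspace mismatch is treated exactly like a missing execution.
  if (execution === undefined || execution.workspaceId !== job.workspaceId) {
    throw new Error(`execution ${job.executionId} not found`);
  }
  execution.status = "running";
  return execution;
}
```

The enqueue side would populate `workspaceId` from the trusted request context, so a forged or stale `executionId` alone is not enough to run another tenant's work.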

Per-tenant resource limits 0%

The codebase contains rate limiting, but the implemented buckets are keyed by IP or user id / request-body field value. There is no evidence of per-tenant (workspace/account/org) quota/rate-limit buckets, so the “per-tenant resource limits” primitive is not present as specified.

  • high

    Add a tenant-scoped rate-limiter strategy: extend the rate-limit decorator + RateLimitService so limiter keys are namespaced by a trusted tenant/workspace identifier (e.g., `tenant:${tenantId}:...`) resolved from the authenticated principal, and update middleware wiring so it is enforced by default for tenant-affecting endpoints.

  • med

    Update/extend integration tests to verify isolation: create two tenants/workspaces, drive one tenant to exhaust its limit, and assert requests from the other tenant are not throttled (and that 429 behavior is tied to the tenant bucket, not only user/IP).
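
A tenant-keyed limiter bucket can be sketched with a fixed-window counter. `TenantRateLimiter` is hypothetical and in-memory; it is not the audited `RateLimitService`, and the fixed-window algorithm is chosen only for brevity. The clock is passed in so the behavior is easy to test.

```typescript
// Hypothetical fixed-window limiter keyed by tenant rather than by IP or user.
class TenantRateLimiter {
  private readonly windows = new Map<string, { windowStart: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    if (limit < 1 || windowMs <= 0) throw new Error("limit and window must be positive");
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed, false if it should receive a 429.
  tryAcquire(tenantId: string, nowMs: number): boolean {
    const key = `tenant:${tenantId}`;
    const current = this.windows.get(key);
    if (current === undefined || nowMs - current.windowStart >= this.windowMs) {
      // First request, or the previous window has expired: start a new one.
      this.windows.set(key, { windowStart: nowMs, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}
```

Because the bucket key is the tenant, one workspace exhausting its quota leaves other workspaces unaffected, which is the property the isolation test above is meant to assert.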

Tenant-scoped key management 0%

No tenant-scoped key management is implemented. Encryption keys are managed and selected at the instance/global level (via `instanceSettings.encryptionKey` and a single `DeploymentKey` active key), and the `DeploymentKey` entity does not include any tenant/workspace identifier to support per-tenant keys, BYOK/CMEK per tenant, or per-tenant crypto-erase.

  • high

    Add tenant scoping to the key model: introduce a tenant/workspace identifier column to `DeploymentKey` (and migrations), enforce uniqueness/activation invariants per tenant (e.g., one active key per tenant per type), and ensure all key-manager queries filter by tenant.

  • high

    Update the key management API to resolve keys per tenant: change `KeyManagerService` methods (e.g., `getActiveKey`, `getKeyById`, `rotateKey`, `listKeys`) to accept a tenant identifier and apply it in `DeploymentKeyRepository` queries.

  • high

    Update the crypto layer to use tenant context when selecting keys/envelope wrapping: modify `Cipher`/encryption callers so the correct tenant-scoped envelope key (or per-tenant deployment key id) is used instead of the global `instanceSettings.encryptionKey`.

  • med

    Add isolation tests for crypto boundaries: integration tests that (1) create/activate keys for multiple tenants/workspaces, (2) encrypt identical payloads under each tenant, and (3) assert that decrypting with the wrong tenant’s key material fails/returns incorrect data, plus crypto-erase behavior for a tenant.

Admin / role scoping 67%

This codebase implements admin/elevated-role scoping via a role model that distinguishes `global` vs `project` roles and anchors project-scoped elevated assignments to a per-`projectId` membership table (`ProjectRelation`). There is no sign (from the examined authz/data-model wiring) of a simplistic global `isAdmin` boolean gating all elevated behavior; instead, elevated privileges flow from role scopes and roleType, with project-scoped role assignments computed through project relations.

  • high

    Add/extend integration tests that explicitly attempt elevated access across projects (e.g., assign a project-scoped admin/editor role in project A and verify the same principal cannot administer resources in project B).

  • med

    Audit authorization entrypoints to confirm that every permission check uses the membership-resolved role/scopes (not only a principal’s derived role scopes without ensuring the resource’s project context matches).
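A minimal sketch of the check these recommendations describe, assuming a membership table shaped like `ProjectRelation`. The `ProjectMembership` type, the `hasProjectScope` helper, and the scope strings are illustrative assumptions. The point is that the scope is resolved from the membership row for the resource's own project, never from the principal's roles in general.

```typescript
// Hypothetical membership row: one entry per (user, project) pair.
interface ProjectMembership {
  userId: string;
  projectId: string;
  scopes: string[];
}

// Resolve scopes through the membership for the resource's project.
// No membership in that project means no access, whatever the user holds elsewhere.
function hasProjectScope(
  memberships: ProjectMembership[],
  userId: string,
  resourceProjectId: string,
  requiredScope: string,
): boolean {
  const membership = memberships.find(
    m => m.userId === userId && m.projectId === resourceProjectId,
  );
  return membership !== undefined && membership.scopes.includes(requiredScope);
}

const memberships: ProjectMembership[] = [
  { userId: "u1", projectId: "project-a", scopes: ["workflow:read", "project:update"] },
  { userId: "u1", projectId: "project-b", scopes: ["workflow:read"] },
];
```

An integration test for the first recommendation would exercise exactly this shape: a principal who is an admin in project A attempts an admin action on a resource in project B and is refused.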

Uniform not-found vs. forbidden 100%

The codebase contains the uniform-not-found pattern for at least the executions resource: when an execution lookup is blocked due to insufficient permissions (including cross-workspace/tenant access), the service returns NotFoundError (404) rather than ForbiddenError (403). However, ForbiddenError (403) exists broadly, and this primitive is only confirmed at the execution-related sites reviewed here.

  • high

    Audit every per-resource fetch/action endpoint for workspace/tenant-scoped data to ensure access-denied is mapped to the same response as not-found (404/uniform error), not to ForbiddenError (403). Start with endpoints analogous to /executions/:id (getOne, stop, retry, delete, update, and any 'findOne' repository wrappers).

  • med

    Add an explicit cross-tenant/workspace isolation integration test for each resource fetch/action that asserts identical error responses for a non-existent ID and an inaccessible-but-existing ID (no 403 distinction, no different message/timing).
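The uniform-response rule can be sketched as a lookup that collapses "does not exist" and "exists but is not yours" into one shared result. `ExecutionRecord`, `findExecution`, and the message text are assumptions made for illustration; they are not the names used in the audited service.

```typescript
interface ExecutionRecord {
  id: string;
  projectId: string;
}

type LookupResult =
  | { status: 200; body: ExecutionRecord }
  | { status: 404; message: string };

// One shared result object, so both failure modes are indistinguishable by construction.
const NOT_FOUND: LookupResult = { status: 404, message: "Execution not found" };

function findExecution(
  store: Map<string, ExecutionRecord>,
  id: string,
  callerProjectIds: string[],
): LookupResult {
  const record = store.get(id);
  // A missing record and an inaccessible record take the same branch.
  if (record === undefined || !callerProjectIds.includes(record.projectId)) {
    return NOT_FOUND;
  }
  return { status: 200, body: record };
}

const executions = new Map<string, ExecutionRecord>([
  ["e1", { id: "e1", projectId: "project-a" }],
]);
const missing = findExecution(executions, "nope", ["project-b"]);
const hidden = findExecution(executions, "e1", ["project-b"]);
const visible = findExecution(executions, "e1", ["project-a"]);
```

This covers status and message only. The timing equivalence mentioned in the recommendation depends on the real query path and would need its own measurement.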

Cross-tenant isolation tests 0%

The repository contains multiple isolation-related tests, including cross-project access control and endpoint/security-header isolation. However, there are no clear integration/security tests that specifically attempt cross-tenant read/write/list/export and async access and assert denial—so the 'Cross-tenant isolation tests' primitive is not implemented as defined.

  • high

    Add a dedicated cross-tenant security test suite that creates resources under Tenant A and attempts (from Tenant B identities) cross-tenant reads, writes, listing, export/download, and any async/background retrieval paths; assert uniform denial (preferably the same not-found/forbidden behavior policy used elsewhere).

  • high

    Extend existing isolation suites (Playwright/API and CLI integration) to include tenant boundary checks for every major data operation: list, get-by-id, update, create, delete, export, and async/background processing. Keep tenant context resolution tied to verified identity (JWT/session) rather than client-supplied tenant IDs.

  • med

    Identify the system’s tenant model (e.g., organization/instance/workspace concept) and mirror the cross-project tests using that tenant boundary in test setup utilities so future tests consistently exercise the correct isolation boundary.

Not applicable to this codebase: Tenant key on every record, Database-enforced isolation.

Identity & Access

SAML/OIDC libraries, SCIM provisioning endpoints, and a real roles/permissions schema — not a hard-coded isAdmin boolean.

81% 11/11 scored
  • Federated SSO (SAML/OIDC) 100%
    5/5 expected sites
  • Directory provisioning (SCIM) 0%
    0/2 expected sites not present
  • RBAC modeled as data 83%
    4/4 expected sites
  • Centralized authorization 22%
    1/3 expected sites
  • No hardcoded privilege shortcuts 100%
    1/1 expected sites
  • Deny-by-default 150%
    3/2 expected sites
  • AuthN before AuthZ at the boundary 100%
    3/3 expected sites
  • MFA / step-up auth 100%
    3/3 expected sites
  • Session & token hygiene 83%
    5/6 expected sites
  • Scoped machine credentials 100%
    5/5 expected sites
  • IP allowlists / network constraints 50%
    1/2 expected sites
Federated SSO (SAML/OIDC) 100%

Federated SSO is present and appears well-wired for both SAML and OIDC. The codebase exposes SAML ACS endpoints and an OIDC callback flow that both delegate authentication to protocol-specific services and perform verification steps (SAML XML/metadata validation; OIDC signed state/nonce verification and authorization code/token processing via openid-client). After successful authentication, it establishes an authenticated session using the auth service’s cookie issuance.

  • high

    Review whether SAML response signature verification (beyond XML schema validation) is fully enforced inside samlService.handleSamlLogin() for all configurations (e.g., metadata-based key retrieval, signature validation, and audience/recipient checks). Add explicit tests for signature failure, tampered assertions, and wrong Recipient/Audience.

  • med

    Ensure OIDC logout/session revocation semantics are consistent: verify cookie/token lifetimes, whether logout invalidates server-side sessions or refresh tokens (if any), and whether role/provisioning changes take effect promptly.

  • low

    Add/expand observability around SSO failures: correlate IdP error causes (state/nonce invalid, token exchange failure, claim/userinfo fetch failure) with structured logs and user/session identifiers to speed incident response.

Directory provisioning (SCIM) 0%

Directory provisioning via SCIM is not implemented in this codebase. The repository contains an SSO provisioning configuration/controller and related role-mapping based provisioning service logic, but no SCIM 2.0 endpoints (/scim/v2/Users, /scim/v2/Groups) or SCIM-style PATCH/deactivation handlers were found.

  • high

    Add a SCIM 2.0 HTTP surface under the identity/provisioning backend (e.g., /scim/v2/Users and /scim/v2/Groups) including request/response models and schema support.

  • high

    Implement SCIM lifecycle operations with explicit deprovisioning: handle deactivation (e.g., PATCH active=false) and delete flows so that access is revoked (e.g., remove memberships/relations and invalidate any active sessions/tokens tied to the user).

  • med

    Map SCIM identity attributes (externalId, userName, emails, groups) to the existing internal role/permission data model, ensuring all provisioning mutations flow through a single centralized authorization/provisioning policy layer.
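As a rough sketch of the deprovisioning step, the function below applies a SCIM 2.0 PATCH request to a user record and revokes access when the request sets `active` to `false`. The `PatchOp` schema URN and the case-insensitive `op` value come from RFC 7644. `UserRecord`, `applyScimPatch`, and the choice to clear memberships and sessions in one step are assumptions, not existing n8n behavior.

```typescript
interface ScimPatchOp {
  op: string;
  path?: string;
  value?: unknown;
}

interface ScimPatchRequest {
  schemas: string[];
  Operations: ScimPatchOp[];
}

// Hypothetical internal user shape.
interface UserRecord {
  id: string;
  active: boolean;
  memberships: string[];
  sessions: string[];
}

const PATCH_SCHEMA = "urn:ietf:params:scim:api:messages:2.0:PatchOp";

// IdPs send deactivation either with path "active" or as a pathless object value.
function setsInactive(op: ScimPatchOp): boolean {
  if (op.op.toLowerCase() !== "replace") return false;
  if (op.path === "active") return op.value === false;
  if (op.path === undefined && typeof op.value === "object" && op.value !== null) {
    return (op.value as { active?: unknown }).active === false;
  }
  return false;
}

function applyScimPatch(user: UserRecord, request: ScimPatchRequest): UserRecord {
  if (!request.schemas.includes(PATCH_SCHEMA)) {
    throw new Error("unsupported SCIM schema");
  }
  let next: UserRecord = { ...user };
  for (const op of request.Operations) {
    if (setsInactive(op)) {
      // Deactivation revokes access immediately: memberships and sessions are cleared.
      next = { ...next, active: false, memberships: [], sessions: [] };
    }
  }
  return next;
}
```

Other operations are ignored here for brevity; a full handler would also map `userName`, `emails`, and group changes onto the role model.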

RBAC modeled as data 83%

RBAC modeled as data is present. The codebase persists roles/scopes via `AuthRolesService` (DB Role/Scope entities synchronized from `@n8n/permissions` definitions) and enforces authorization through centralized scope/role policy logic (`userHasScopes`) rather than scattered hardcoded privilege booleans. Frontend also uses a scoped RBAC model (store + middleware) for permission evaluation.

  • high

    Audit remaining backend endpoints/handlers to confirm they all call the centralized RBAC policy (`userHasScopes`/`@ProjectScope` style decorators or equivalent) and do not introduce scattered per-handler privilege checks.

  • med

    Verify the backend also has explicit deny-by-default behavior at the routing/middleware boundary (i.e., endpoints without a RBAC guard are not silently accessible).

Centralized authorization 22%

This codebase contains evidence of centralized authorization/policy chokepoints in the CLI backend: workflow execution authorization is routed through dedicated checker services (SubworkflowPolicyChecker and CredentialsPermissionChecker), and resource access is abstracted behind AccessService. However, controller-level policy enforcement could not be fully verified from the sampled files, so the implementation appears solid on some execution paths but has not been confirmed to cover every public API entry point.

  • high

    Audit all backend REST controllers for consistency: ensure every entry point that returns protected resources uses the same centralized authorization services/checkers (AccessService and permission checkers) and does not rely on scattered per-endpoint role/permission logic.

  • med

    Confirm that authorization (not just authentication) is policy-driven in MCP: after getAuthMiddleware attaches req.user, ensure MCP handlers call an explicit permission/policy layer for tool/workflow access and log denials.

  • med

    Ensure centralized authorization is uniformly applied to runtime credential surfaces via their dedicated access services, with deny-by-default behavior and consistent audit/telemetry on denials.

No hardcoded privilege shortcuts 100%

In the UI command bar, authorization is permission-model driven (via `getResourcePermissions` / `hasPermission`). However, the user store still defines privilege shortcut booleans like `isAdmin`/`isAdminOrOwner` by directly comparing the user role, which violates the “no hardcoded privilege shortcuts” requirement that privilege must flow through the role/permission model without boolean shortcut flags.

  • high

    Remove/avoid `isAdmin`/`isAdminOrOwner` boolean privilege shortcuts in `users.store.ts`. Instead, derive UI gating from the standardized permission layer (e.g., the same `getResourcePermissions` / `hasPermission` approach used elsewhere) so privilege decisions are centrally auditable and role-driven.

  • med

    Search for other `isAdmin` / `isRoot`-style booleans used for access control and refactor them to permission checks against the role/permission model (not direct role equality).

Deny-by-default 150%

Deny-by-default is present for the Public API (v1): the `publicApiScope` middleware rejects requests by default (403) when no token grant exists or when the required scope is not explicitly present. Key public route handlers for workflows and executions apply this middleware, preventing silent-open endpoints.

  • high

    For any new Public API v1 endpoint, ensure it is wired with `publicApiScope(<required-scope>)` (and any additional scope/tag/project middleware) rather than relying on downstream checks.

  • med

    Add/extend tests similar to `global.middleware.test.ts` to cover at least one representative new handler per resource type (workflows, executions, credentials, etc.), asserting 403 when scope is missing.
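The deny-by-default behavior described above can be illustrated with a guard that rejects unless the required scope is explicitly present, plus a router that cannot register a handler without one. `requireScope` is a hypothetical stand-in for a middleware like `publicApiScope`; `GuardedRouter`, `ApiRequest`, and the scope strings are likewise invented for the sketch.

```typescript
interface ApiRequest {
  tokenScopes?: string[];
}

type Handler = (req: ApiRequest) => { status: number; body?: string };

// Reject by default: no token grant, or a grant without the scope, yields 403.
function requireScope(scope: string, handler: Handler): Handler {
  return req => {
    if (!req.tokenScopes || !req.tokenScopes.includes(scope)) {
      return { status: 403 };
    }
    return handler(req);
  };
}

// The only way to register a route is with a scope, so a new endpoint
// cannot be added in a silently open state.
class GuardedRouter {
  private routes = new Map<string, Handler>();

  register(path: string, scope: string, handler: Handler): void {
    this.routes.set(path, requireScope(scope, handler));
  }

  dispatch(path: string, req: ApiRequest): { status: number; body?: string } {
    const handler = this.routes.get(path);
    return handler ? handler(req) : { status: 404 };
  }
}

const router = new GuardedRouter();
router.register("/workflows", "workflow:list", () => ({ status: 200, body: "[]" }));
```

Making the scope a required argument of `register` moves the first recommendation from a review convention to something the type checker enforces.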

AuthN before AuthZ at the boundary 100%

This codebase applies the “AuthN before AuthZ at the boundary” primitive correctly at key HTTP entry points: REST requests use a centralized middleware that verifies the auth cookie/JWT before setting `req.user`, and MCP HTTP endpoints use a dedicated middleware that verifies the Bearer token before the controller uses `req.user` for access-relevant decisions.

  • med

    Audit additional public HTTP entry-point handlers beyond the MCP controller and verify each is wired with the appropriate authentication middleware before any handler logic reads `req.user` for gating/authorization.

MFA / step-up auth 100%

MFA enforcement/step-up-like gating is present and centralized: the auth middleware checks whether MFA was used during token/session creation (`usedMfa`) and blocks access when instance policy enforces MFA, unless the route explicitly allows skipping (`allowSkipMFA`). MFA setup endpoints are configured to bypass this gate so users can enroll/verify. However, what’s implemented is closer to a global “require MFA for access” control than a fine-grained per-action step-up flow (the code shows enforcement gating + route bypass, but not an obvious per-sensitive-action second-factor prompt/flow beyond the `usedMfa` session attribute).

  • high

    Confirm there is a truly fine-grained step-up mechanism for *sensitive actions* (not just a global enforcement gate). If needed, introduce a dedicated step-up requirement per endpoint/action (e.g., a separate `requireStepUp`/`minMfaAssurance` route metadata) that triggers an additional second-factor challenge for existing sessions that were created without MFA.

    • packages/cli/src/auth/auth.service.ts:70-133 — Enforcement is based on `mfaEnforced` and session `usedMfa`, with only `allowSkipMFA` as a bypass. There is no separate per-action step-up prompt/requirement shown in the enforcement logic.
  • med

    Audit all sensitive endpoints for `allowSkipMFA` usage and ensure only MFA enrollment/verification and explicitly safe endpoints can bypass. Add tests that verify the MFA gate behavior for representative protected routes.

  • low

    Document the intended semantics of `usedMfa` and how it maps to “step-up” assurance in your threat model (global enforcement vs. per-action step-up), so future route additions follow the right policy.
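To make the distinction between global enforcement and per-action step-up concrete, here is a sketch of a gate that handles both. `usedMfa`, `allowSkipMFA`, and `mfaEnforced` mirror the names in the finding; `requireStepUp` follows the recommendation's suggested route metadata; `lastMfaAt`, the five-minute window, and `mfaGate` itself are assumptions.

```typescript
interface Session {
  userId: string;
  usedMfa: boolean;
  lastMfaAt?: number; // hypothetical: epoch ms of the most recent second-factor check
}

interface RouteMeta {
  allowSkipMFA?: boolean;
  requireStepUp?: boolean; // hypothetical per-route flag for sensitive actions
}

type MfaDecision = "allow" | "mfa_required" | "step_up_required";

function mfaGate(
  session: Session,
  route: RouteMeta,
  mfaEnforced: boolean,
  now: number,
  stepUpWindowMs: number = 300000,
): MfaDecision {
  // Enrollment and verification routes bypass the gate entirely.
  if (route.allowSkipMFA) return "allow";
  // Global policy: the session must have been created with MFA.
  if (mfaEnforced && !session.usedMfa) return "mfa_required";
  // Per-action step-up: a recent second factor is required even for MFA sessions.
  if (route.requireStepUp) {
    const fresh =
      session.lastMfaAt !== undefined && now - session.lastMfaAt <= stepUpWindowMs;
    if (!fresh) return "step_up_required";
  }
  return "allow";
}
```

The third branch is the part the finding reports as not evident in the current enforcement logic: a session that passed MFA at login can still be challenged again before a sensitive action.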

Session & token hygiene 83%

Session & token hygiene is implemented in multiple places with short-lived expiry, server-side invalidation/rotation, and revocation semantics. OAuth consent-session cookies are JWT time-bounded and explicitly cleared on errors/invalid sessions. MCP OAuth access tokens are short-lived and are invalidated server-side via DB presence checks; refresh tokens are rotated and both access/refresh tokens support explicit revoke endpoints via DB deletes. Additionally, the token-exchange subsystem adds bounded lifetimes and replay protection via JTI consumption (server-side).

  • high

    Confirm and document the logout/revoke call chain for each token type (OAuth consent session cookie, MCP access/refresh tokens, and token-exchange access tokens). Where logout exists, ensure it triggers clearCookie() (consent session) and DB revocation/delete (access/refresh) rather than only client-side removal.

  • med

    Ensure minted token scopes/claims are consistently scoped (aud/resource/iss/sub + any scope claim) and that authorization decisions depend on server-resolved identity/role state rather than solely token-embedded claims where role changes must take effect immediately.

Scoped machine credentials 100%

Scoped machine credentials are implemented: API keys are stored as non-human credentials with persisted `scopes` and `audience` (`user_api_keys`), existing keys are backfilled with role-derived scopes, and authentication strategies verify tokens/credentials and attach scoped identities to requests for least-privilege authorization. A separate scoped JWT strategy for token-exchange also cryptographically verifies and maps to role scopes.

  • med

    Add/confirm end-to-end revocation semantics for API-key/JWT credentials (e.g., ensuring server-side invalidation for rotated/deleted keys is enforced in the strategies and token-exchange paths).

    • packages/cli/src/services/api-key-auth.strategy.ts:1-101 — This strategy verifies and then authorizes by looking up the API key record by `apiKey` and `audience`, which is good for revocation by deletion; however, a dedicated check for server-side invalidation/blacklisting is not shown in the reviewed snippet.

IP allowlists / network constraints 50%

This codebase implements IP allowlists for the Webhook trigger: it supports per-node `ipWhitelist` entries (single IPs and CIDRs) and enforces them before any authentication/authorization logic runs. No additional, clearly wired per-tenant or network-level constraint was found at higher levels (e.g., instance-wide middleware). The Form flow appears to support IP allowlist helpers, but the specific boundary enforcement site for the Form handler was not identified in the inspected slices.

  • med

    Confirm whether the Form trigger (and other public entrypoints) should enforce the same IP allowlist boundary check, and wire it at the request entry before any authentication validation. Add/verify a clear `if (!isIpAllowed(...)) { 403/401 }` boundary near the handler entry for the Form POST/GET paths.
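For reference, a boundary check of the kind described might look like the following. It is an IPv4-only sketch written from scratch, not n8n's helper: `isIpAllowed` here is a local function that borrows the name used in the recommendation, and the fail-closed handling of malformed entries is a design assumption.

```typescript
// Convert dotted-quad IPv4 to an integer, or null if malformed.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  let result = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    result = result * 256 + octet;
  }
  return result;
}

// Entries are single addresses ("192.168.1.5") or CIDR ranges ("10.0.0.0/8").
function isIpAllowed(ip: string, allowlist: string[]): boolean {
  const address = ipv4ToInt(ip);
  if (address === null) return false; // fail closed on unparseable input
  return allowlist.some(entry => {
    const slash = entry.indexOf("/");
    const base = slash === -1 ? entry : entry.slice(0, slash);
    const bitsText = slash === -1 ? "32" : entry.slice(slash + 1);
    if (!/^\d{1,2}$/.test(bitsText)) return false; // malformed prefix never matches
    const bits = Number(bitsText);
    const baseInt = ipv4ToInt(base);
    if (baseInt === null || bits > 32) return false;
    // Two addresses share a /bits network when they agree after dropping the host bits.
    const blockSize = 2 ** (32 - bits);
    return Math.floor(address / blockSize) === Math.floor(baseInt / blockSize);
  });
}
```

Plain arithmetic is used instead of bitwise operators because JavaScript bitwise operations work on signed 32-bit values, which complicates addresses at or above `128.0.0.0`. Behind a reverse proxy, the address passed in has to come from a trusted forwarding header rather than the socket peer.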

Compliance Code Patterns

Envelope encryption, enforced TLS, validated inputs, and zero secrets anywhere in the full git history.

49% 10/11 scored
  • Encryption in transit 0%
    0/1 expected sites not present
  • Encryption at rest 100%
    3/3 expected sites
  • Centralized key management 92%
    5/4 expected sites
  • Secrets management 100%
    2/2 expected sites
  • Input validation at boundaries 0%
    0/3 expected sites
  • Injection-safe data access 33%
    1/3 expected sites
  • Data classification & PII handling 73%
    4/5 expected sites
  • Access logging on protected routes 0%
    0/2 expected sites not present
  • Retention & secure deletion 93%
    5/5 expected sites
  • Secure defaults / hardening 0%
    0/2 expected sites
Encryption in transit 0%

TLS is supported (HTTPS is created when `protocol === 'https'`), but encryption-in-transit is not forced. When the config is not set to HTTPS, the server starts an `http.createServer`, allowing plaintext traffic; no unconditional HTTPS redirect/HSTS enforcement was evidenced in the inspected server bootstrap code.

  • high

    Force HTTPS unconditionally for the inbound edge: remove/disable the plaintext `http.createServer` path (or make it redirect all requests to HTTPS with HSTS). Ensure this behavior is correct behind proxies/load balancers (e.g., correct `trust proxy` usage and redirect conditions).

  • med

    Add/verify transport hardening headers: ensure `Strict-Transport-Security` (HSTS) and redirect behavior are set on all relevant paths (including behind any reverse proxy).
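The redirect-plus-HSTS behavior recommended above reduces to a small decision function. The sketch below is framework-free and entirely hypothetical (`TransportRequest`, `enforceHttps`, and the `trustProxy` flag are not n8n names); it assumes the forwarded-protocol header is honored only when the deployment explicitly trusts its proxy.

```typescript
interface TransportRequest {
  socketEncrypted: boolean; // true when this process terminated TLS itself
  forwardedProto?: string; // raw X-Forwarded-Proto value, if any
  host: string;
  url: string;
}

interface TransportDecision {
  redirectTo?: string;
  headers: Record<string, string>;
}

function enforceHttps(req: TransportRequest, trustProxy: boolean): TransportDecision {
  // The header may carry a comma-separated chain; the first entry is the client-facing hop.
  const firstProto = (req.forwardedProto ?? "").split(",")[0].trim().toLowerCase();
  const secureViaProxy = trustProxy && firstProto === "https";

  if (!req.socketEncrypted && !secureViaProxy) {
    // Plaintext request: redirect, and do not send HSTS over an insecure hop.
    return { redirectTo: `https://${req.host}${req.url}`, headers: {} };
  }
  return {
    headers: { "Strict-Transport-Security": "max-age=31536000; includeSubDomains" },
  };
}
```

Ignoring the header when `trustProxy` is false matters: otherwise any client could send `X-Forwarded-Proto: https` over plain HTTP and skip the redirect.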

Encryption at rest 100%

Encryption-at-rest is present and centrally implemented via `packages/core/src/encryption/cipher.ts` (AES-256-CBC payload encryption plus AES-256-GCM authenticated DEK wrapping, with optional key-rotation via `EncryptionKeyProxy`). Sensitive external secrets configuration is handled encrypted during migration: decrypted from the encrypted `settings` blob and then re-encrypted into `secrets_provider_connection.encryptedSettings`.

  • high

    Verify that encrypted-at-rest coverage extends to backup/export/derived data paths (e.g., database dumps, object storage backups, log/trace exports) and that no plaintext sensitive fields are written by any repository/serialization layer outside the cipher abstraction.

  • med

    Audit all DB columns that store credential material (e.g., credentials data blobs and any external secrets settings columns) to ensure they are always written via `encryptWithInstanceKey`/`encryptV2` and never persisted in plaintext on any code path (including tests/migrations).

Centralized key management 92%

This codebase has centralized encryption key management for data encryption keys (KeyManagerService + EncryptionKeyProxy + cipher encryptV2/decryptV2). Keys are stored centrally (DeploymentKeyRepository), there is a rotation mechanism (rotateKey/addKey + active/inactive statuses), and ciphertexts can reference specific key ids for correct decryption. However, the implementation does not clearly provide the full 'managed key store with scheduled rotation and emergency revocation procedure' model: markInactive() is not fully enforced (TODO), and no explicit emergency revocation flow is evidenced in the reviewed code paths.

Secrets management 100%

This codebase has a working secrets-management primitive via the External Secrets enterprise (EE) module. It provides runtime integration with external secret managers (HashiCorp Vault and Azure Key Vault), authenticates using configured connection settings, fetches secret values at connect/update time, caches them in memory, and serves secrets through getters for expression-based references. No evidence was found of plaintext secrets being hardcoded into the provider implementation paths themselves.

No secrets in git history N/A

The primitive does not hold for this codebase: the full-history secret scan found committed secret-like values, and the repo contains a fixture file with credential material (API keys, AWS credentials, a GitHub token, and a private key block).

  • high

    Remove the committed credential material from git history (not just the current tree). Delete/replace the fixture contents with non-secret test vectors, then rewrite history (e.g., filter-repo/bfg) and rotate any credentials that could have been real.

Input validation at boundaries 0%

Schema validation libraries (notably Zod-style APIs via `parse/safeParse`) are present and DTOs are test-covered. However, in the specific controller boundary handlers reviewed, incoming boundary values (`req.params.id`, `req.query.transferId`) are read and used without a visible schema check in the handler code that rejects invalid input. The codebase may validate inputs via framework/decorator plumbing, but that wiring was not verified in the slices inspected here.
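An explicit boundary check for the two values named above could look like this. To stay dependency-free the sketch uses a hand-written validator that returns a `safeParse`-style result instead of calling Zod; the `parseTransferRequest` name and the identifier format (1 to 64 URL-safe characters) are assumptions, since the real ID format was not part of the reviewed evidence.

```typescript
type Parsed<T> = { ok: true; value: T } | { ok: false; error: string };

interface TransferRequest {
  id: string;
  transferId?: string;
}

// Assumed identifier format; adjust to the real ID scheme.
const ID_PATTERN = /^[A-Za-z0-9_-]{1,64}$/;

function parseTransferRequest(
  params: Record<string, unknown>,
  query: Record<string, unknown>,
): Parsed<TransferRequest> {
  const id = params["id"];
  if (typeof id !== "string" || !ID_PATTERN.test(id)) {
    return { ok: false, error: "invalid id" };
  }

  let transferId: string | undefined;
  const rawTransferId = query["transferId"];
  if (rawTransferId !== undefined) {
    // Query values can arrive as arrays or objects; anything but a clean string is rejected.
    if (typeof rawTransferId !== "string" || !ID_PATTERN.test(rawTransferId)) {
      return { ok: false, error: "invalid transferId" };
    }
    transferId = rawTransferId;
  }

  return { ok: true, value: { id, transferId } };
}
```

A handler would call this first and respond with 400 on `ok: false`, so that nothing downstream ever sees an unvalidated value.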

Injection-safe data access 33%

The codebase has some good injection-safe data access patterns (notably TypeORM QueryBuilder with bound parameters in `FolderService.getFolderTree`). However, there are also clear injection-safety deviations: database migrations build SQL strings using template-literal interpolation and pass them to `queryRunner.query(...)` (string-built SQL), which does not satisfy the primitive’s requirement of avoiding string concatenation for query construction.

  • high

    Eliminate template-literal SQL construction in migrations where possible. Instead, use the ORM’s identifier-safe facilities for dynamic identifiers (e.g., properly escaped identifiers) and parameter binding for variable data; avoid passing fully string-built SQL with interpolated fragments into `queryRunner.query(...)`.

  • med

    For complex QueryBuilder conditions that embed subqueries (e.g., `folder.id IN ${subQuery}`), verify (and add tests) that all user-controlled values are represented only via bound parameters from QueryBuilder, not via direct string interpolation of runtime values.
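The difference between string-built SQL and the recommended approach can be shown with a small builder that keeps values out of the SQL text entirely. `buildUpdate` and `quoteIdent` are hypothetical helpers, the `$1`/`$2` placeholders are PostgreSQL-style, and the strict identifier pattern is an assumption that trades flexibility for safety.

```typescript
interface SqlQuery {
  text: string;
  params: unknown[];
}

// Identifiers cannot be bound as parameters, so they are validated against a strict pattern.
const IDENTIFIER = /^[A-Za-z_][A-Za-z0-9_]*$/;

function quoteIdent(name: string): string {
  if (!IDENTIFIER.test(name)) {
    throw new Error(`unsafe identifier: ${name}`);
  }
  return `"${name}"`;
}

// Values travel in `params`; only validated identifiers are interpolated into the text.
function buildUpdate(
  table: string,
  column: string,
  value: unknown,
  idColumn: string,
  id: unknown,
): SqlQuery {
  return {
    text: `UPDATE ${quoteIdent(table)} SET ${quoteIdent(column)} = $1 WHERE ${quoteIdent(idColumn)} = $2`,
    params: [value, id],
  };
}

const renameQuery = buildUpdate("workflow_entity", "name", "Robert'); DROP TABLE x;--", "id", 42);
```

In a TypeORM migration the result would be passed as `queryRunner.query(renameQuery.text, renameQuery.params)`, so the hostile string above reaches the driver as data and is never parsed as SQL.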

Data classification & PII handling 73%

This codebase includes an end-to-end data redaction concept and enforces it in key sensitive data flow paths. Execution node output and flatted runData pushed to the frontend are redacted via a proxy redaction service with fail-closed behavior (skip/empty on redaction failure). Separately, the MCP browser redaction layer detects secret/sensitive patterns and applies redactions to both text and structured tool results using marker-based replacement.

  • med

    Verify that the @Redactable() decorator on log-streaming/audit event relay methods is actually wired to anonymize/redact payload fields (e.g., user/email/name) for every enabled path (including when redaction policy/enforcement flags differ). Add tests that assert sensitive fields are masked in emitted audit events.

Access logging on protected routes 0%

No clear implementation of the primitive “access logging on protected routes” was found. While the codebase has an audit-event concept (with payload fields like userId/userEmail) and protected-route scope checks exist, the inspected middleware/bootstrapping code does not emit a request access/audit log for authenticated/protected route accesses with a unique actor identifier on every path.

  • high

    Add a centralized request middleware on the server (or public-api router) that runs for all authenticated/protected endpoints and emits an access/audit log entry for each request, including a unique actor identifier (e.g., req.user.id or API key subject) and request context (route, method, outcome).

  • high

    Ensure access/audit logging is emitted on every protected-route authorization outcome (authorized, forbidden/404 for scope failures, auth failures), not only in select business logic. Wire the logging at the same layer as scope enforcement (or as a finalizer after it).

  • med

    Reuse the existing audit-event infrastructure (EventMessageAudit payload fields like userId/userEmail) to standardize actor attribution, and add tests that assert an audit/access entry is produced for representative protected routes.
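A centralized access log of the kind recommended here can be sketched as a wrapper applied at the same layer as scope enforcement. Everything in the block is hypothetical (`withAccessLog`, `AccessLogEntry`, the in-memory `sink`); in the audited codebase the entry would presumably be emitted through the existing audit-event infrastructure instead.

```typescript
interface AccessLogEntry {
  actorId: string;
  method: string;
  route: string;
  outcome: number; // HTTP status returned to the caller
  at: string; // ISO 8601 timestamp
}

interface AccessRequest {
  method: string;
  route: string;
  user?: { id: string };
}

type RouteHandler = (req: AccessRequest) => number; // returns the HTTP status

// Wraps a handler so that every outcome, including failures, produces exactly one entry.
function withAccessLog(
  handler: RouteHandler,
  sink: AccessLogEntry[],
  now: () => Date = () => new Date(),
): RouteHandler {
  return req => {
    let outcome = 500; // recorded if the handler throws
    try {
      outcome = handler(req);
      return outcome;
    } finally {
      sink.push({
        actorId: req.user?.id ?? "anonymous",
        method: req.method,
        route: req.route,
        outcome,
        at: now().toISOString(),
      });
    }
  };
}
```

Logging in `finally` is the design point: denied and crashed requests are the ones an auditor most wants to see, and they are the ones a log call placed inside business logic tends to miss.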

Retention & secure deletion 93%

The codebase contains an implemented retention-and-deletion control for execution data: a leader-only scheduled pruning service performs rolling soft-deletion (marking eligible executions with deletedAt) and periodic hard deletion that removes execution entities plus associated binary/FS execution bundles. It also has retention mechanisms for workflow history (pruning older history while preserving current/active versions), with additional scheduled compaction/trimming for auto-saved histories.

  • high

    Add/confirm an explicit “secure deletion” policy statement for how binary/DB data are disposed (e.g., whether DB hard deletes are acceptable as “secure disposal”, and what “cryptographic wipe” means in this system). If cryptographic wipe is required by your compliance standard, document and implement field/volume crypto handling for execution_data/fs bundles beyond simple rm/delete.

  • med

    Verify backup/derived-data coverage for execution retention: document that pruning reaches all relevant storage backends (db rows, execution_data, filesystem bundles, and any configured external storage). Add tests that run pruning against each configured storage mode.

  • low

    Create or link a single retention-policy document enumerating each dataset (executions, execution_data binaries, workflow history), their retention windows, soft vs hard delete stages, and safety exclusions; reference the enforcing pruning services/repositories for auditability.
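The two-stage pruning described in this finding (soft delete by setting `deletedAt`, then a later hard delete) can be modeled in a few lines, which is useful as a reference when writing the per-storage-mode tests. The function names, the in-memory rows, and the safety buffer between the two stages are assumptions for illustration.

```typescript
interface PrunableExecution {
  id: string;
  stoppedAt: number; // epoch ms
  deletedAt?: number; // set by the soft-delete stage
}

// Stage 1: mark executions older than the retention window. Returns how many were marked.
function softDeleteExpired(rows: PrunableExecution[], now: number, maxAgeMs: number): number {
  let marked = 0;
  for (const row of rows) {
    if (row.deletedAt === undefined && now - row.stoppedAt > maxAgeMs) {
      row.deletedAt = now;
      marked++;
    }
  }
  return marked;
}

// Stage 2: drop rows whose soft delete is older than the buffer. Returns the surviving rows.
function hardDeleteExpired(
  rows: PrunableExecution[],
  now: number,
  bufferMs: number,
): PrunableExecution[] {
  return rows.filter(row => row.deletedAt === undefined || now - row.deletedAt > bufferMs === false);
}

const HOUR_MS = 3600000;
const prunable: PrunableExecution[] = [
  { id: "old", stoppedAt: 0 },
  { id: "recent", stoppedAt: 90 * HOUR_MS },
];
// With a 72-hour window evaluated at hour 100, only "old" is marked.
const markedCount = softDeleteExpired(prunable, 100 * HOUR_MS, 72 * HOUR_MS);
```

In the real system the hard-delete stage also has to remove the associated binary data and filesystem bundles, which is where the storage-backend coverage question in the recommendations comes in.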

Secure defaults / hardening 0%

Secure defaults/hardening is partially implemented: the server constructs a `helmet`-based security headers middleware (CSP, X-Frame-Options, and conditional HSTS) and applies it when serving the UI `index.html` via history-API fallback. However, the hardening middleware is not clearly applied globally to all HTTP paths (notably non-UI/API responses), and HSTS is conditional on n8n directly terminating TLS, which may leave an unenforced gap when TLS is terminated by a reverse proxy.

  • high

    Apply the `helmet`/security-headers middleware broadly (e.g., `this.app.use(securityHeadersMiddleware)` or equivalent) for all relevant routes/responses, not only inside the history-API handler.

    • packages/cli/src/server.ts:392-456 — Hardened middleware is only executed inside `historyApiHandler` before sending `index.html` (not applied to general API/non-UI responses in the shown code).
  • med

    Ensure HSTS expectations are met across deployments behind a reverse proxy: document and/or automatically support `X-Forwarded-Proto`/proxy-aware HSTS enabling so that HTTPS is consistently enforced on every hop.

    • packages/cli/src/server.ts:414-426 — HSTS is enabled only when n8n handles TLS directly (`globalConfig.protocol === 'https'` and `sslKey`/`sslCert` present), otherwise `strictTransportSecurity: false`.
  • low

    Verify debug/verbose error exposure is disabled in production for all environments where `inProduction` is true (e.g., ensure error handlers and logging don’t return stack traces to clients).

    • packages/cli/src/server.ts:1-120 — This file imports `inDevelopment`/`inProduction`, but the shown hardening implementation focuses on `helmet`; a dedicated production error/debug guard was not verified in the code slices reviewed.

Not applicable to this codebase: No secrets in git history.

Audit, Governance, Residency

An append-only audit_events table, a queryable audit API, and per-region infrastructure keyed on each tenant’s region.

10% 8/10 scored
  • Dedicated audit event store 0%
    0/2 expected sites
  • Append-only / tamper-evidence 17%
    1/4 expected sites
  • Comprehensive event coverage 61%
    5/6 expected sites
  • Queryable, provable audit access 0%
    0/1 expected sites not present
  • No cross-region leakage 0%
    0/3 expected sites not present
  • Data-subject rights (export & erase) 0%
    0/3 expected sites not present
  • Customer-controlled keys 0%
    0/2 expected sites
  • Sub-processor / data-flow transparency 0%
    0/2 expected sites not present
Dedicated audit event store 0%

A dedicated “audit event” mechanism exists in the form of `EventMessageAudit` + `MessageEventBus.sendAuditEvent`, which persists structured audit messages to a dedicated event-bus log writer (JSON lines). However, the implementation quality for a “dedicated audit event store” is weak for governance needs: the persistence layer clearly rotates/removes log files and provides no visible immutability/tamper-evidence or integrity verification, and the code evidence reviewed does not demonstrate a tenant-scoped, queryable/exportable audit trail interface with actor/tenant/action/resource/context/timestamp semantics.

  • high

    Confirm and document the audit-event schema requirements (actor, tenant/org, action, resource type+id, context, timestamp) are actually populated on `EventMessageAudit.payload` at emission time (not just the interface exists). Then enforce missing fields at compile/runtime (e.g., validators).

  • high

    Strengthen audit-store immutability and tamper-evidence: replace/augment the rotating file-log approach with an append-only store that prevents in-place modification and supports integrity validation (e.g., hash-chaining/signing per record). Ensure administrative deletion/retention changes cannot silently rewrite history.

  • med

    Provide tenant-scoped audit read and export endpoints specifically for the audit event store (with pagination and verifiable export formats). The current public audit handler is for generating security audit reports, not for querying the audit event records.

Append-only / tamper-evidence 17%

The codebase does implement an append-only style event log writer for audit/event messages (JSON lines appended to `.log` files). However, the same component deletes old logs (`rmSync`) and rotates them via renaming, and there is no evidence in code of integrity validation (e.g., hash chaining or signatures) that would make tampering detectable across records/files. Overall, the append-only aspect exists, but the tamper-evidence governance/evidence-chain requirements are not met.

  • high

    Implement tamper-evidence for the event/audit evidence store: compute a per-record integrity link (e.g., hash-chain using the previous record hash, or Merkle tree, or signed records with an append-only verifier) and persist the integrity metadata alongside each record so that any alteration breaks validation for all subsequent records.

  • high

    Replace/augment destructive retention with audited, recoverable archival semantics for audit evidence (restrict deletion; ensure archival copies are immutable; require integrity validation before purging). Remove or tightly control `rmSync`-based evidence deletion for audit evidence.

  • med

    Ensure the tamper-evidence chain spans rotations across files: rotation should carry forward the last-record hash (or equivalent) into the next log file, so verifiers can validate continuity across renames/archives.

  • med

    Add a public/internal audit-verification interface that can validate the integrity chain over a requested time range and export verifiable evidence bundles (record hashes + chain tip + verifier output).
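A hash chain that survives log rotation, as recommended above, can be sketched with Node's built-in `crypto` module. The record layout, the all-zero genesis value, and the function names are assumptions; the existing event-bus log writer does not do this. Each record stores the hash of its predecessor, and a new file starts from the last hash of the previous one.

```typescript
import { createHash } from "node:crypto";

interface ChainedRecord {
  payload: string;
  prevHash: string;
  hash: string;
}

const GENESIS_HASH = "0".repeat(64);

// hash = SHA-256(prevHash + "\n" + payload), hex-encoded.
function chainRecord(prevHash: string, payload: string): ChainedRecord {
  const hash = createHash("sha256")
    .update(prevHash)
    .update("\n")
    .update(payload)
    .digest("hex");
  return { payload, prevHash, hash };
}

// Append to a log file. An empty file starts from `carriedTip`, which on
// rotation is the last hash of the previous file.
function appendRecord(
  file: ChainedRecord[],
  payload: string,
  carriedTip: string = GENESIS_HASH,
): void {
  const prevHash = file.length > 0 ? file[file.length - 1].hash : carriedTip;
  file.push(chainRecord(prevHash, payload));
}

// Verify continuity across all files in rotation order.
function verifyChain(files: ChainedRecord[][]): boolean {
  let expectedPrev = GENESIS_HASH;
  for (const file of files) {
    for (const record of file) {
      const recomputed = chainRecord(record.prevHash, record.payload).hash;
      if (record.prevHash !== expectedPrev || recomputed !== record.hash) return false;
      expectedPrev = record.hash;
    }
  }
  return true;
}
```

A chain like this detects edits, reordering, and dropped records, but not a wholesale rewrite by someone who recomputes every hash. For that, the chain tip has to be signed or anchored somewhere the log writer cannot modify.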

Comprehensive event coverage 61%

This codebase implements a dedicated audit-event emission pipeline using `MessageEventBus.sendAuditEvent()` and persists audit messages via the event-bus log writer. Audit events are emitted for sensitive auth and many workflow/user/role-governance events in `log-streaming.event-relay.ts`. However, at least one sensitive export path (`/n8n-packages/export` and its service export logic) does not show evidence of emitting an audit event in the controller/service layers reviewed, which creates a likely audit-coverage gap for data exports.

  • high

    Add/ensure an audit event is emitted for the workflows/credentials export endpoint (`/n8n-packages/export`) including actor identity and the set of exported workflow IDs (and possibly whether credentials were included). Ensure the audit event is emitted at the point where export contents are determined (service), not only at transport/controller level.

  • med

    Verify that permission/role changes and other sensitive permission-adjacent actions are fully covered by audit events (beyond token-exchange/role-mapping). If permission/role management exists elsewhere, ensure it routes through the same audit emission mechanism (`sendAuditEvent`) and is included in the log-streaming relay/listeners model.

  • low

    Document how timestamps/order are ensured across hosts for audit reconstruction (event log writer + delivery retry) and confirm audit timeline integrity for customer/auditor consumption.

Queryable, provable audit access 0%

The repository includes a public API endpoint for `generateAudit` that produces a security audit report (risk categories over workflows). However, there is no queryable, provable audit *access* for a tenant-scoped audit trail with pagination and an exportable, independently verifiable evidence trail (including integrity/cryptographic assurance). Therefore this primitive is absent in the required form.

  • high

    Implement (or wire in) a dedicated, append-only, structured audit event store and expose it via a tenant-scoped, paginated public API for auditors/customers/support, plus a separate export endpoint that outputs verifiable evidence (identity assertion, policy/policy-version state, and cryptographic integrity such as hash-chaining/signatures).

  • med

    Ensure the implementation logs and persists the required audit context fields (actor identity, tenant/project scope, event type, resource identifiers, policy state/version, timestamp) into the dedicated evidence store at the time of each sensitive action; do not rely on generated reports or debug/log streaming.

Audit retention & separation of duties N/A

No dedicated “audit retention & separation of duties” primitive is implemented or wired in this codebase. The code contains a security-audit *report generator* (CLI + public API handler) that computes findings from workflows, but there is no persisted audit/event evidence store with an enforced retention window, restricted tamper controls, or audited log deletion/purging.

  • high

    If n8n is expected to provide compliance-grade audit trails, introduce a dedicated structured audit/event store (separate from app logs) and implement: (1) retention window configuration, (2) immutable/append-only write semantics, (3) separation of duties so system admins cannot shorten retention or modify prior audit records, and (4) retention purge jobs whose execution and deletions are themselves audited.

  • med

    Add verifiable operational artifacts: a retention/purge job definition (config + scheduler), audit-store mutation safeguards (no UPDATE/DELETE on evidence rows except via controlled purge), and an audited record of purge actions (who/what/when/how many).

Data residency / region pinning N/A

This codebase does not show a data-residency / region-pinning primitive: there is no evidence of a tenant-level region attribute driving in-region data/compute placement or region-keyed routing. Where 'region' appears, it is used for unrelated concepts (e.g., provider userLocation metadata) or generic request routing, not residency enforcement.

  • high

    If n8n is deployed as a multi-tenant service with any EU/region residency obligations, introduce an explicit tenant data residency model (tenant.region) plus region-keyed routing so workflow execution and all persistence/sync/side-effects are constrained to the tenant’s chosen region.

  • med

    Audit and document every data sink (primary DB, caches, queues/event bus, backups/snapshots, analytics/export pipelines, and third-party integrations) and enforce region pinning across them; the primitive must guarantee no cross-region 'shadow data' escapes.

No cross-region leakage 0%

I did not find an implementation of “No cross-region leakage” that enforces residency for all data sinks (including derived/backup/analytics/exports/relays). The pubsub scaling components route via Redis prefixes/hostId without any region-scoped enforcement, and the export service writes out exported artifacts without any visible region pinning/blocking.

  • high

    Introduce and enforce tenant/organization region scoping across all data sinks that can propagate or materialize data outside the primary store (including pubsub/relay paths and export outputs). Ensure routing/placement is keyed by tenant region and that cross-region destinations are blocked by policy checks close to the sink (not only at the primary DB).

  • high

    Add explicit residency enforcement to export flows: validate tenant region vs. configured export destination region (or storage bucket/endpoint region), and block or reroute exports that would place data out-of-region.

  • med

    Audit secondary sinks comprehensively (backups/snapshots/replication/analytics pipelines and any third-party syncs). For each sink, add region pinning plus an automated test that attempts an out-of-region sink configuration and verifies the sync/export is blocked.

Data-subject rights (export & erase) 0%

No dedicated data-subject rights (export & erase) primitive was found. While the codebase contains (a) an entity-export service that appears to export from all tables and (b) a user deletion endpoint that deletes user-related primary records, neither is demonstrated as a GDPR/CCPA-style DS request mechanism that exports all data for a specific subject and performs auditable erasure with required cascade coverage (backups/derived stores).

  • high

    Implement a dedicated DS rights module with public API endpoints for (1) export-by-subject and (2) erase-by-subject, including DS request identity verification, tenant scoping, and response evidence artifacts.

  • high

    Add an erase handler that performs comprehensive cascade deletion beyond the primary rows, explicitly covering backups and derived stores (and document/verify backup-safe strategy), then make the erase action itself auditable in a structured, immutable audit store.

  • med

    Add explicit wiring from DS request endpoints to underlying data deletion/export services so that the DS primitive is not merely “admin actions,” but a governed workflow with auditable checkpoints.

Customer-controlled keys 0%

The codebase includes a central “encryption key manager” with endpoints to list keys and rotate/create a new active data-encryption key. However, it does not present evidence of the primitive’s core requirement: customer-controlled, per-tenant/customer-managed keys with customer-driven import, scheduled rotation, and explicit revoke (crypto-shred) semantics. The visible API appears instance/global and permissioned for admins, not tenants supplying their own keys.

  • high

    Add a tenant-scoped BYOK interface: implement per-tenant key reference + customer-provided key import (or KMS reference import) and persist it against tenant scope, not globally. Evidence target: current controller is global and lacks import/revoke flows.

  • high

    Implement explicit revoke semantics for crypto-shred: expose an API that transitions a tenant key to inactive/revoked in a way that ensures derived/encrypted-at-rest data becomes unreadable (or triggers a designed key-eraser workflow). Wire it end-to-end (API -> service -> repository -> storage).

  • med

    Provide scheduled rotation and prove enforcement: add rotation scheduling per tenant/customer (or policy-driven rotation) and persist rotation/activation history with tenant scoping; ensure only the tenant/customer can rotate/revoke within their scope.

Sub-processor / data-flow transparency 0%

The codebase provides an authenticated 'third-party licenses' endpoint and frontend client for retrieving a THIRD_PARTY_LICENSES.md file, but there is no in-repo, versioned sub-processor/data-flow inventory (or an equivalent API) that would allow verifiable mapping of which third parties touch data. Therefore, this primitive is absent.

  • high

    Add a versioned, in-repo sub-processor/data-flow inventory artifact (e.g., SUBPROCESSORS.md or a machine-readable JSON) that is maintained alongside node/vendor changes, and that lists each third party that receives data (for each relevant data-flow: model/provider nodes, analytics, telemetry, etc.). Ensure entries have version/history and an explicit 'last reviewed' timestamp.

  • high

    Implement/extend an API endpoint to serve the sub-processor inventory with an auditable, verifiable backing file (and ensure the file exists in-repo). Reuse the authentication model but make the content explicitly about sub-processors and data flows (not licenses).

  • med

    Cross-check the declared inventory against actual third-party SDK usage in code (e.g., vendor node transports). Add a lightweight CI check to prevent undocumented third-party recipients from being added without updating the inventory.

Not applicable to this codebase: Audit retention & separation of duties, Data residency / region pinning.

T2 Execution Velocity

Performance Primitives

A caching layer, an async job runtime, connection pooling, and indexes on the columns that actually need them.

50% 11/11 scored
  • Redundant work in loops 0%
    0/2 expected sites
  • Bounded interfaces 50%
    1/2 expected sites
  • Memoization / caching 89%
    3/3 expected sites
  • Resource reuse / pooling 0%
    0/2 expected sites
  • Off-critical-path execution 100%
    2/2 expected sites
  • Lookup data structures 0%
    0/1 expected sites
  • Batching round-trips 0%
    0/2 expected sites
  • Shared-state synchronization 78%
    3/3 expected sites
  • Bounded concurrency / backpressure 100%
    3/3 expected sites
  • Lazy / minimal computation 100%
    1/1 expected sites
  • Streaming over buffering 33%
    2/2 expected sites
Redundant work in loops 0%

The primitive is not applied at the expected sites: there are at least two clear instances where expensive work is repeated inside loops. One is repeated lodash `get(...)` path resolution in a hot sort/comparator path; the other is a per-node `await fetch(...)` in a loop that can cause many sequential network calls. I did not find any cases where the expensive work is hoisted, batched, or memoized at these specific sites.

  • high

    In `Sort.node.ts`, precompute per-item derived values for each `sortFields` entry (including optional lowercasing) once before validation/sorting, and have the comparator read those cached values rather than calling `get(...)` repeatedly during comparator invocations.

  • high

    In `executions.utils.ts`, refactor the per-node `fetch` inside the loop to avoid N sequential calls: (a) collect unique `testUrl`s and fetch them in parallel with bounded concurrency, and/or (b) skip the fetch entirely when the result can be decided from already-available state, and/or (c) memoize results by `testUrl` within the function call.
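
A minimal sketch of the first recommendation (precomputing sort keys once per item). The `readPath` helper stands in for lodash `get`, and `SortField` and `sortByFields` are hypothetical names, not the actual `Sort.node.ts` code.

```typescript
type Item = Record<string, unknown>;

interface SortField {
  path: string;
  descending?: boolean;
  ignoreCase?: boolean;
}

// Minimal dotted-path reader standing in for lodash `get`.
function readPath(item: Item, path: string): unknown {
  let current: unknown = item;
  for (const key of path.split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

function compareValues(a: unknown, b: unknown): number {
  const aMissing = a === undefined || a === null;
  const bMissing = b === undefined || b === null;
  if (aMissing || bMissing) return aMissing === bMissing ? 0 : aMissing ? 1 : -1;
  if (typeof a === "number" && typeof b === "number") return a - b;
  const x = String(a);
  const y = String(b);
  return x < y ? -1 : x > y ? 1 : 0;
}

function sortByFields(items: Item[], fields: SortField[]): Item[] {
  // Resolve every key once per item, before sorting starts.
  const decorated = items.map((item) => ({
    item,
    keys: fields.map((field) => {
      const value = readPath(item, field.path);
      return field.ignoreCase && typeof value === "string" ? value.toLowerCase() : value;
    }),
  }));

  // The comparator only reads the cached keys; no path resolution happens here.
  decorated.sort((a, b) => {
    for (let i = 0; i < fields.length; i++) {
      const result = compareValues(a.keys[i], b.keys[i]);
      if (result !== 0) return fields[i].descending ? -result : result;
    }
    return 0;
  });

  return decorated.map((entry) => entry.item);
}
```

Path resolution drops from once per comparison (roughly n log n comparator calls) to once per item and field.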

Bounded interfaces 50%

Bounded interfaces are implemented partially via the `ListProjectsQueryDto` pagination contract (with `take` capped to `MAX_ITEMS_PER_PAGE`). However, the code intentionally supports unbounded collection retrieval when callers omit pagination (`take` defaults to `undefined` and the controller returns a bare array). Client code also calls `GET /projects` without pagination (`getAllProjects()`), creating genuine unbounded collection surfaces—so the primitive is present but not correctly enforced end-to-end.

  • high

    Remove/disable the backward-compat path that allows omitting `take` to return all projects. Make `take` required or apply a safe default server-side limit when `take` is absent (and always return an envelope with `count`/`data` if that’s the desired bounded contract).

  • high

    Fix frontend/API wrapper(s) that call collection endpoints without pagination parameters (e.g., `getAllProjects`) by requiring `take/skip` (or at least sending a default `take`) from the client.

  • med

    Enforce bounded behavior consistently at the repository/query layer by ensuring `applyPagination` always sets a `take` (server-side) even if callers omit it.
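
A server-side default can be sketched as below. The constant values and the `normalizePagination` / `paginate` names are assumptions for illustration; only `take`, `skip`, `MAX_ITEMS_PER_PAGE`, and the `count`/`data` envelope come from the finding above.

```typescript
// Assumed values; the real cap lives in n8n's own constants.
const MAX_ITEMS_PER_PAGE = 50;
const DEFAULT_TAKE = 25;

interface PageQuery {
  skip?: number;
  take?: number;
}

interface Page<T> {
  count: number;
  data: T[];
}

// Always returns a concrete, capped `take`, even when the caller omits it.
function normalizePagination(query: PageQuery): { skip: number; take: number } {
  const skip =
    typeof query.skip === "number" && Number.isInteger(query.skip) && query.skip > 0
      ? query.skip
      : 0;
  const requested =
    typeof query.take === "number" && Number.isInteger(query.take) && query.take > 0
      ? query.take
      : DEFAULT_TAKE;
  return { skip, take: Math.min(requested, MAX_ITEMS_PER_PAGE) };
}

// In-memory stand-in for the repository query, returning the bounded envelope.
function paginate<T>(rows: readonly T[], query: PageQuery): Page<T> {
  const { skip, take } = normalizePagination(query);
  return { count: rows.length, data: rows.slice(skip, skip + take) };
}
```

With this shape, omitting `take` yields the default page size instead of the full collection, and the response is always an envelope.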

Memoization / caching 89%

The codebase has a strong and correct caching/memoization implementation: (1) a per-instance memoized DB read for instance version history, and (2) a centralized cache abstraction (`CacheService`) built on `cache-manager` with Redis and memory backends, including TTL and hit/miss/refresh behavior.

  • high

    Audit other hot-path deterministic functions for repeated expensive calls and route them through `CacheService.get/getHash/getHashValue` with stable cache keys and explicit invalidation/TTL strategy (the repo already has the infrastructure; the main risk is missing keying/invalidation at call sites).

  • med

    For memoized values like `_cache` in `InstanceVersionHistoryService`, confirm invalidation correctness under leader changes and any external updates to the underlying repository table (current logic memoizes until re-init).

Resource reuse / pooling 0%

The codebase shows an intent to use Oracle pooling (via `PooledOracleEmbeddings` borrowing connections from a pool). However, the pool initialization path appears to run on each embeddings/list call (`configureOracleDB.call(...)` is invoked inside per-call functions), so the expensive pooling handle is not clearly created once and reused across the component lifetime.

Off-critical-path execution 100%

The codebase uses a Bull-backed queue to offload workflow execution work from the main/initiating path to worker consumers. The decision to enqueue vs. run inline is made in `WorkflowRunner.run`, and the actual heavy processing happens inside Bull's `queue.process` handler in `ScalingService.setupWorker`.

  • high

    Search for any remaining queueing decision points or execution paths that still call `runMainProcess(...)` for queue mode; ensure workflow execution (and other potentially slow/failable steps) are consistently routed through `enqueueExecution` so the hot path stays free.

  • med

    Audit `enqueueExecution(...)` / job handler logic for idempotency and retry semantics (e.g., ensuring job processing is safe to retry after failures). The offload exists, but correctness depends on retry safety.

Lookup data structures 0%

The codebase does contain a correct lookup data structure implementation (an LRU cache built on `Map`). However, at least one hot per-record spot (pairwise output generation) repeatedly uses linear `Array.find` over the same collection (`r.feedback`) rather than building a lookup index for O(1) metric access, so the anti-pattern appears in that location.

  • high

    In `writeOutputs`, build a per-record lookup (e.g., `const feedbackByMetric = new Map(r.feedback.map(f => [f.metric, f.score]))`) once per record, then replace the three `r.feedback.find(...)` calls with O(1) `feedbackByMetric.get(metric)` reads.

  • med

    If `feedback` is stable and large across the entire run, consider pre-indexing once at ingestion time (rather than per output pass) to avoid repeated O(n) searches each time metrics are emitted.
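
A minimal sketch of the per-record index. The `Feedback` and `EvalRecord` shapes and the `toOutputRow` name are assumptions; only `feedback`, `metric`, and `score` appear in the finding.

```typescript
interface Feedback {
  metric: string;
  score: number;
}

interface EvalRecord {
  id: string;
  feedback: Feedback[];
}

function toOutputRow(record: EvalRecord, metrics: readonly string[]): Record<string, string | number | null> {
  // Build the index once per record. The first entry per metric wins, which
  // matches what `Array.find` would have returned.
  const scoreByMetric = new Map<string, number>();
  for (const entry of record.feedback) {
    if (!scoreByMetric.has(entry.metric)) scoreByMetric.set(entry.metric, entry.score);
  }

  const row: Record<string, string | number | null> = { id: record.id };
  for (const metric of metrics) {
    row[metric] = scoreByMetric.get(metric) ?? null; // O(1) per metric
  }
  return row;
}
```

If duplicates per metric are possible, note that `new Map(array.map(...))` keeps the last entry rather than the first, so the explicit loop is the safer drop-in for `find`.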

Batching round-trips 0%

The codebase does contain a well-implemented batching pattern at the I/O boundary for bulk workflow setting updates (chunked DB reads/updates in `bulkSetAvailableInMCP`). However, at least two node execution paths (Rocketchat and Brandfetch) still perform outbound API calls inside per-item loops, which are exactly the round-trip anti-patterns for this primitive (and were not found to use a batching strategy at those call sites).

  • high

    For Rocketchat node execution, avoid calling `rocketchatApiRequest.call(...)` once per item. Implement batching if the Rocketchat API supports multi-message endpoints, or restructure to send fewer grouped requests (e.g., collect messages per channel/resource and call a bulk endpoint, or add concurrency controls + explicit chunk size).

  • high

    For Brandfetch node execution, avoid `brandfetchApiRequest.call(...)` once per item index `i`. If Brandfetch supports bulk logo/color/company retrieval, introduce a batched fetch strategy (or group domains and call a bulk endpoint); otherwise implement bounded chunking to cap round-trips and avoid unbounded linear growth.

  • med

    Add/extend a shared batching helper (at the I/O boundary) used by nodes that call external services, to standardize chunk sizing and to prevent accidental per-item request fan-out.
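
The shared helper suggested above might look like this. `chunk` and `processInBatches` are hypothetical names, and `sendBatch` is a caller-supplied function; whether Rocketchat or Brandfetch offer bulk endpoints is not established by this audit.

```typescript
// Split a list into fixed-size chunks.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Issue one round-trip per chunk instead of one per item.
async function processInBatches<T, R>(
  items: readonly T[],
  batchSize: number,
  sendBatch: (batch: T[]) => Promise<R[]>,
): Promise<R[]> {
  const results: R[] = [];
  for (const batch of chunk(items, batchSize)) {
    results.push(...(await sendBatch(batch)));
  }
  return results;
}
```

For N items this caps the number of outbound calls at ceil(N / batchSize), provided the remote API accepts grouped requests.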

Shared-state synchronization 78%

The primitive exists and is well-implemented in `DbLockService`, which synchronizes shared in-process lock state using a FIFO queue plus ownership tokens, and relies on Postgres transaction-scoped advisory locks for cross-process correctness. Additional synchronization is applied at key mutation boundaries in the workflow dependency repository (row-level locking) and in the scoped task runner (promise-chain serialization per scope).

  • high

    Review `ScopedMemoryTaskRunner` shared mutable arrays/maps for logical invariants under concurrency (e.g., `capturedErrors` max-size enforcement and `inFlightTasks.delete(info.id)` timing relative to overlapping `runTask` calls). If invariants matter strictly, consider encapsulating these mutations behind per-scope serialization or an internal single-flight queue.

  • med

    Confirm the coverage of `WorkflowDependencyRepository.acquireLockAndCheckForExistingData`: ensure every concurrent writer path for workflow dependency mutations uses the same locking strategy (especially if there are other methods performing inserts/updates without calling this helper).

  • low

    Add targeted comments/tests for the in-process mutex edge cases already handled (stale release, transfer-before-resolve ordering) to prevent future refactors from regressing microtask/window safety.

Bounded concurrency / backpressure 100%

The codebase has a correct, explicit bounded-concurrency/backpressure primitive implemented as `ConcurrencyQueue` + `ConcurrencyControlService`. Capacity is enforced via a queue of awaiters: when the cap is hit, new work is blocked until capacity is released. This primitive is applied at the evaluation test-runner fan-out boundary, with additional abort-aware eviction/release handling to prevent capacity leaks.

  • med

    Audit other fan-out-heavy execution paths to ensure they consistently use `ConcurrencyControlService.throttle/release` (or equivalent) rather than spawning unbounded per-item async work. If found, route those call sites through the concurrency control.
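
For call sites that cannot easily reach `ConcurrencyControlService`, the same idea can be sketched with a small worker-lane helper. `mapWithConcurrency` is a hypothetical name and is not part of n8n.

```typescript
// Run `worker` over `items` with at most `limit` calls in flight at once.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results = new Array<R>(items.length);
  let next = 0;

  // Each lane pulls the next unclaimed index until the list is exhausted.
  async function runLane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const lanes = Array.from({ length: Math.min(limit, items.length) }, () => runLane());
  await Promise.all(lanes);
  return results;
}
```

Results keep input order. If a worker rejects, the returned promise rejects, although lanes that are already running finish their current item first.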

Lazy / minimal computation 100%

The primitive exists in `InstanceVersionHistoryService`: DB work (fetching all version entries) is deferred until first use and then cached for subsequent consumer methods, avoiding unnecessary repeated computation and data transfer.

  • high

    Audit other modules that build potentially large result sets or expensive derived data (e.g., caches, computed selectors, “history”/“timeline” style queries) and ensure the fetch/compute is guarded behind a “first use” check like `_cache === null` (and that partial consumer methods don’t force full recomputation).

Streaming over buffering 33%

The codebase contains streaming implementations (notably SSE event writing and incremental processing of async stream chunks). However, the 'constant memory regardless of input size' primitive is violated in places where streams are converted into full strings (e.g., accumulating entire text from an agent stream) and where workflow code is fully inlined/assembled in memory (local import resolution).

  • high

    Replace `collectNativeStreamText` with a bounded/streaming alternative: expose an `AsyncIterable`/stream of text deltas to the consumer, or cap/roll up content (e.g., limit retained text, store only recent tail, or persist incrementally to storage). Avoid `deltaText += ...` / `messageText += ...` over unbounded streams.

  • high

    Change `resolveLocalImports` to avoid assembling the entire inlined bundle in memory. Instead, stream/emit chunks incrementally to the downstream builder (or write to a temp file / bounded buffer). If a single string is required by the API, enforce strict size limits and/or incremental truncation.

  • med

    Audit other stream-to-buffer conversions for the same issue pattern (look for functions that return `string`, `Buffer`, or `Array` after consuming an async/stream input). Prefer iterators/chunked outputs and bounded retention.
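
One bounded alternative to accumulating the full text is to pass deltas through as an async iterable and retain only a capped tail. This is a sketch under assumed shapes: the `{ delta?: string }` chunk type and the `textDeltas` / `collectTail` names are illustrative and do not reflect the actual `collectNativeStreamText` signature.

```typescript
// Pass text deltas through without accumulating them.
async function* textDeltas(chunks: AsyncIterable<{ delta?: string }>): AsyncGenerator<string> {
  for await (const chunk of chunks) {
    if (chunk.delta) yield chunk.delta;
  }
}

// Keep only the last `maxChars` characters, plus a running total.
async function collectTail(
  deltas: AsyncIterable<string>,
  maxChars: number,
): Promise<{ tail: string; totalChars: number }> {
  if (!Number.isInteger(maxChars) || maxChars < 1) {
    throw new RangeError("maxChars must be a positive integer");
  }
  let tail = "";
  let totalChars = 0;
  for await (const delta of deltas) {
    totalChars += delta.length;
    tail = (tail + delta).slice(-maxChars); // retained text never exceeds maxChars
  }
  return { tail, totalChars };
}
```

Retained text is bounded by `maxChars` regardless of stream length; each step briefly holds one extra delta before trimming.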

Reliability Primitives

Retries, circuit breakers, idempotency keys, health checks, and a runbook for each service.

64% 11/11 scored
  • Timeouts 67%
    2/3 expected sites
  • Retry with backoff + jitter 0%
    0/2 expected sites
  • Idempotency 83%
    2/2 expected sites
  • Circuit breaking / fail-fast 100%
    1/1 expected sites
  • Graceful degradation / fallback 89%
    3/3 expected sites
  • Error handling & propagation 56%
    2/3 expected sites
  • Deterministic resource cleanup 100%
    1/1 expected sites
  • Atomicity / all-or-nothing 0%
    0/1 expected sites
  • Input / boundary validation 100%
    1/1 expected sites
  • Failure isolation / bulkheading 0%
    0/1 expected sites (not present)
  • Graceful shutdown 111%
    4/3 expected sites
Timeouts 67%

Timeouts support exists in this codebase (undici Agent/ProxyAgent timeouts via `proxyFetch`, and `AbortSignal.timeout()` in the n8n evaluation client). However, at least one critical unbounded boundary remains: `N8nClient.callWebhook()` calls `fetch()` directly with no timeout, and many higher-level `N8nClient` methods call the internal `fetch()` without passing `timeoutMs`, making the timeout facility dependent on callers.
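
One way to make the timeout a default rather than a caller responsibility is a small wrapper that always supplies an abort signal. `callWithTimeout` is a hypothetical helper, not part of `N8nClient`; it only helps when the wrapped operation honors the signal, as `fetch` does.

```typescript
// Run an abortable operation with a hard deadline.
async function callWithTimeout<T>(
  run: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(
    () => controller.abort(new Error(`timed out after ${timeoutMs} ms`)),
    timeoutMs,
  );
  try {
    return await run(controller.signal);
  } finally {
    clearTimeout(timer); // never leave the timer pending after completion
  }
}
```

A webhook call could then be written as `callWithTimeout((signal) => fetch(url, { method: "POST", signal }), 10_000)`, so no request path is left unbounded.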

Retry with backoff + jitter 0%

The codebase contains a retry utility with deterministic backoff (linear/exponential, capped at 30s) but it does not implement jitter. Additionally, there is at least one concrete HTTP retry loop (fetchNodeTypesJsonWithRetry) that uses deterministic sleep between attempts and does not implement jitter or exponential backoff with a clearly defined capped budget. As a result, the primitive 'retry_backoff_jitter' is only partially present and is not correctly applied to the transient-failure retry sites found.

  • high

    Upgrade `packages/@n8n/utils/src/retry.ts` to add jitter to the computed delay (e.g., full jitter or equal jitter) while preserving the existing capped exponential backoff budget. Ensure the delay computation is centralized and consistently used by call sites.

    • packages/@n8n/utils/src/retry.ts:1-52 — Backoff is implemented deterministically via interval * attempt or Math.pow(2, attempt-1) * interval (capped at 30s), with no randomization/jitter in the delay.
  • high

    Refactor `fetchNodeTypesJsonWithRetry` to use the updated jitter-capable retry helper, and switch to an exponential backoff strategy with a capped maximum delay budget and jitter.
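
A minimal sketch of capped exponential backoff with full jitter, the technique both recommendations point to. This is not the current `retry.ts` API: `backoffDelay`, `retryWithJitter`, and the option names are assumptions, and the `random` and `sleep` parameters are injectable only to keep the example testable.

```typescript
interface RetryOptions {
  maxAttempts: number;
  baseMs: number;
  maxMs: number; // e.g. 30_000 to match the existing 30s cap
  random?: () => number;
  sleep?: (ms: number) => Promise<void>;
}

// Full jitter: pick uniformly in [0, min(maxMs, baseMs * 2^(attempt - 1))).
function backoffDelay(
  attempt: number,
  baseMs: number,
  maxMs: number,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(maxMs, baseMs * 2 ** (attempt - 1));
  return Math.floor(random() * ceiling);
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function retryWithJitter<T>(fn: () => Promise<T>, options: RetryOptions): Promise<T> {
  const { maxAttempts, baseMs, maxMs, random = Math.random, sleep = defaultSleep } = options;
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      // Sleep only between attempts, never after the final failure.
      if (attempt < maxAttempts) await sleep(backoffDelay(attempt, baseMs, maxMs, random));
    }
  }
  throw lastError;
}
```

The sketch retries on every error; a production helper would usually retry only failures classified as transient.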

Idempotency 83%

Idempotency mechanisms are present, notably (1) DB-level execution dedup via a unique `execution_entity.deduplicationKey` index and (2) single-flight dedup for background task spawning using `BackgroundTaskManager`’s `dedupeKey`. Additionally, workflow statistics writes are made idempotent via `ON CONFLICT ... DO UPDATE` upserts. However, the audit did not confirm idempotency behavior in all retryable execution-creation call paths (e.g., whether retries always supply the dedup key and correctly handle duplicate-insert errors) within the sampled failure/duplication wiring.

  • high

    Verify end-to-end idempotency for execution creation: ensure every retryable code path that persists a new `ExecutionEntity` supplies the correct `deduplicationKey`, and that duplicate-insert errors are caught and converted into a safe 'already exists / skip' outcome (rather than re-running). Cross-check where `deduplicationKey` is set (Schedule Trigger) and where it is passed into `WorkflowExecutionService.runWorkflow` / execution persistence.

  • med

    For idempotent upserts with conflict handling, audit the error branch and concurrency comments: specifically the SQLite branch in `upsertWorkflowStatistics` uses a naive post-query approach for determining insert vs update. Ensure that retries still remain correct for the side effects (counter increments) and that the classification logic does not trigger any additional write.

  • low

    Add/extend tests covering idempotency under retries for background tasks: validate that repeated spawn attempts with identical `dedupeKey` return the duplicate result and do not start an additional run even when the first attempt is still running.
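
The behavior the last recommendation wants covered by tests can be illustrated with a small single-flight map. This is a sketch of the technique only: `SingleFlight` is a hypothetical class and does not reproduce `BackgroundTaskManager`'s interface.

```typescript
// Deduplicate concurrent runs that share a key.
class SingleFlight<T> {
  private readonly inFlight = new Map<string, Promise<T>>();

  run(key: string, task: () => Promise<T>): { promise: Promise<T>; duplicate: boolean } {
    const existing = this.inFlight.get(key);
    if (existing) return { promise: existing, duplicate: true };

    // Remove the entry once the task settles so later calls can run again.
    const promise = task().finally(() => {
      this.inFlight.delete(key);
    });
    this.inFlight.set(key, promise);
    return { promise, duplicate: false };
  }
}
```

A repeated spawn with the same key while the first run is pending returns the existing promise and flags itself as a duplicate; after settlement, the key is free for a new run.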

Circuit breaking / fail-fast 100%

A circuit breaker implementation exists (packages/cli/src/utils/circuit-breaker.ts) and it is applied correctly to the log streaming message destination: receiveFromEventBus is wrapped with circuitBreakerInstance.execute(...) so the system will fail-fast (OPEN) and probe in HALF_OPEN with concurrency limiting.

  • high

    Audit other external dependency call sites (network/db/HTTP/event bus send paths) for missing circuit breaker wrapping. Concretely, search for direct calls to external send/request functions that lack circuit breaker protection and add CircuitBreaker.execute(...) around those unhappy-path boundaries similar to message-event-bus-destination.ee.ts.

  • med

    Ensure callers that catch CircuitBreakerOpen either (a) treat it as a normal fast-fail and stop further retries, or (b) propagate context upstream without re-triggering new retries that would defeat fail-fast behavior.
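
For readers unfamiliar with the pattern, the three states can be sketched as below. This is a simplified illustration, not the implementation in `circuit-breaker.ts`: `SimpleCircuitBreaker` and `CircuitOpenError` are hypothetical names, and the sketch omits the half-open concurrency limiting that the audited code applies.

```typescript
type BreakerState = "CLOSED" | "OPEN" | "HALF_OPEN";

class CircuitOpenError extends Error {
  constructor() {
    super("circuit is open");
    this.name = "CircuitOpenError";
  }
}

class SimpleCircuitBreaker {
  private state: BreakerState = "CLOSED";
  private consecutiveFailures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly resetAfterMs: number;
  private readonly now: () => number;

  constructor(failureThreshold: number, resetAfterMs: number, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.resetAfterMs = resetAfterMs;
    this.now = now;
  }

  getState(): BreakerState {
    return this.state;
  }

  async execute<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === "OPEN") {
      // Fail fast until the cool-down has elapsed, then allow a probe.
      if (this.now() - this.openedAt < this.resetAfterMs) throw new CircuitOpenError();
      this.state = "HALF_OPEN";
    }
    try {
      const result = await fn();
      this.state = "CLOSED";
      this.consecutiveFailures = 0;
      return result;
    } catch (error) {
      this.consecutiveFailures++;
      if (this.state === "HALF_OPEN" || this.consecutiveFailures >= this.failureThreshold) {
        this.state = "OPEN";
        this.openedAt = this.now();
      }
      throw error;
    }
  }
}
```

Callers can distinguish a fast-fail from a downstream failure by checking for `CircuitOpenError` and skipping further retries in that case.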

Graceful degradation / fallback 89%

The codebase does implement graceful degradation/fallback behavior. Notably, webhook cache lookup failures fall back to DB lookups, Redis cache-manager skips cache failures for non-cacheable values, and Postgres connection setup supports a fallback handler for pool acquisition/connection setup. Overall quality is strong, with correct error-branch handling and explicit continuation on non-critical failures.

  • high

    For the Postgres transport fallback path, verify that all fallback-related failure branches still return a usable connection (or a clearly-defined error) and do not allow the fallback itself to throw unhandled exceptions; if there are unguarded operations inside `fallBackHandler`, wrap them with context and ensure the caller can continue or fail predictably.

  • med

    Standardize fallback staleness/explicitness: where cache is used (e.g., webhook cache), consider returning/recording a flag or timestamp indicating that results came from DB after cache failure, so callers can treat the result as 'non-cached' or 'stale-by-definition'.

Error handling & propagation 56%

Error handling & propagation is present and generally well-applied in the @n8n/agents runtime: delegation errors are captured and returned as structured failed tool output, and streaming runtime failures are caught with cleanup and client-facing error signaling. However, at least some fallible write/emit operations appear to use local suppression (e.g., swallowing writer-write rejections), which is acceptable only if intentionally non-critical; overall quality is good but not perfect.

  • high

    Review stream-related error branches for any intentionally-swallowed failures (e.g., `writer.write(...).catch(() => {})`). Ensure that if the stream write failure is meaningful, it is either propagated to the stream termination path or at least logged/recorded—avoid silent loss of error context.

  • med

    Audit the helper `closeStreamWithError` call chain to ensure that any failures during cleanup (`cleanupRun`) or error writes (`writer.write`, `writer.close`) are also handled in a non-silent way, preserving the original error context.

  • low

    Standardize error stringification/representation across tool delegation and runtime streaming so upstream callers get consistent error shapes (e.g., message + original error string/metadata when available).

Deterministic resource cleanup 100%

Deterministic resource cleanup is present in the codebase: when an episodic-memory task lock is acquired, the code releases it in a finally block so the lock is freed even if the task throws.

  • med

    Extend this audit pattern to other resource acquisitions (e.g., file streams, network connections/clients, DB handles) by explicitly checking whether their corresponding release/close happens in finally/defer/with/RAII at each acquisition site.
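
The acquire/release shape to look for at each site can be captured in a small wrapper. `Lock` and `withLock` are hypothetical names used only to show the pattern.

```typescript
interface Lock {
  release(): Promise<void> | void;
}

// Acquire, run the task, and release on every exit path.
async function withLock<T>(acquire: () => Promise<Lock>, task: () => Promise<T>): Promise<T> {
  const lock = await acquire();
  try {
    return await task();
  } finally {
    await lock.release(); // runs whether the task resolved or threw
  }
}
```

The same wrapper shape applies to file handles, clients, and DB connections: anything acquired before the `try` is released in the `finally`.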

Atomicity / all-or-nothing 0%

The codebase has some atomic/all-or-nothing mechanisms: frontend cache persistence uses real IndexedDB transactions, and a data-table create+CSV-import flow uses compensating rollback (delete the created table) when row insertion fails. However, at least one compound CSV import operation into an existing table lacks rollback on failure (no catch around insertRows), making partial writes observable and therefore a should-be atomicity site.

  • high

    Add an error-handling/rollback strategy to `importCsvToExistingTable` so that row insertion into the existing table is all-or-nothing (preferably a DB transaction; if not feasible, implement compensating deletion of any rows inserted during the failed import, with safeguards to avoid deleting pre-existing rows).
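
Where a DB transaction is not available, the compensating variant can be sketched as follows. `RowStore` and `importAllOrNothing` are hypothetical; the sketch assumes the store returns an id for each inserted row so that rollback touches only rows created by this import.

```typescript
interface RowStore<Row> {
  insert(row: Row): Promise<string>; // resolves to the new row's id
  deleteByIds(ids: string[]): Promise<void>;
}

// Insert all rows, or remove the ones this call inserted if any insert fails.
async function importAllOrNothing<Row>(store: RowStore<Row>, rows: readonly Row[]): Promise<string[]> {
  const insertedIds: string[] = [];
  try {
    for (const row of rows) {
      insertedIds.push(await store.insert(row));
    }
    return insertedIds;
  } catch (error) {
    // Compensate using tracked ids only, so pre-existing rows are never deleted.
    await store.deleteByIds(insertedIds);
    throw error;
  }
}
```

A real transaction remains preferable: with compensation, other readers can still observe the partial rows briefly, and a failure inside `deleteByIds` would need its own handling.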

Input / boundary validation 100%

Input/boundary validation is present in the codebase. In `get-node-parameter.tool.ts`, the tool’s `input: unknown` is validated with a Zod schema at the handler boundary, and Zod validation failures are handled explicitly by returning an error response. For this audit, only this concrete required should-be site was identified and it is correctly implemented.

  • high

    Repeat this boundary-validation pattern across other tool/handler entry points that accept `unknown`/raw request/serialized workflow input: define a Zod (or equivalent) schema per boundary, call `schema.parse(...)` (or `safeParse`), and in the failure branch return a structured error response without performing any side effects or deep lookups.

Failure isolation / bulkheading 0%

I did not find a clear, explicit failure-isolation/bulkheading implementation in the code I examined. For example, the Oracle embeddings node borrows connections from a shared pool for each call; while it does correctly close connections in a `finally` block, it does not appear to partition or cap resources per independent workload in a way that would prevent one workload from exhausting shared capacity.

  • high

    Introduce bulkheading around the shared Oracle connection pool usage. Options include: (1) separate pools per workload class (e.g., per model / per node instance / per embedding operation type), (2) semaphore-based concurrency limits scoped to this node/subsystem, and/or (3) time-bounded acquisition with a fast-fail fallback when the limit is reached—so one embedding workload can’t starve others sharing the same pool capacity.

Graceful shutdown 111%

Graceful shutdown is present. The task-runner entry point has robust signal handling with a forced-timeout, draining/stop calls (runner + healthcheck + Sentry), and guarded repeated signals. The engine server closes the HTTP listener on SIGINT/SIGTERM. The MCP browser server awaits connection.shutdown() before exiting, though it doesn’t explicitly close the HTTP server listener in the provided shutdown path.

  • high

    For packages/@n8n/mcp-browser/src/server.ts, extend the shutdown handler to also stop/close the underlying HTTP server(s) (when transportType is 'http') so the process truly stops accepting new work during SIGTERM/SIGINT, not only the MCP connection.

  • med

    In packages/@n8n/engine/src/serve.ts, consider adding a shutdown timeout/forced-exit similar to the task-runner to avoid hanging indefinitely if server.close never completes (e.g., stuck keep-alive connections).

API & Extensibility

A checked-in OpenAPI spec, versioned routes, a webhook system with retries and signing, and tenant-scoped rate limits.

64% 10/10 scored
  • Machine-readable API contract 100%
    3/3 expected sites
  • Versioning & backward compatibility 22%
    1/3 expected sites
  • Programmatic auth with scopes 100%
    6/6 expected sites
  • Per-tenant rate limiting 83%
    2/2 expected sites
  • Idempotent writes 0%
    0/4 expected sites (not present)
  • Consistent pagination & filtering 100%
    5/5 expected sites
  • Outbound events / webhooks 0%
    0/1 expected sites (not present)
  • Consistent errors & status codes 17%
    1/4 expected sites
  • Sandbox / test mode 67%
    2/2 expected sites
  • Extension points / plugins 150%
    3/2 expected sites
Machine-readable API contract 100%

This codebase has a checked-in, machine-readable API contract: `packages/cli/src/public-api/v1/openapi.yml`. The server serves this spec to consumers and enforces it at runtime using `express-openapi-validator` with API-spec validation enabled, strongly reducing drift between implementation and contract.

Versioning & backward compatibility 22%

The codebase contains a versioning mechanism for the CLI “Public API”: it loads versioned `v*` modules, mounts routes under `/${publicApiEndpoint}/${version}`, and provides per-version OpenAPI + Swagger UI. However, the audit did not find an explicit, public deprecation/sunset policy (headers + migration links) or other visible backward-compat governance across versions; the cross-cutting middleware/error handling also doesn’t show such compatibility signaling.

  • high

    Add a standardized deprecation/sunset response policy for versioned Public API endpoints (e.g., `Deprecation`, `Sunset`, and/or `Link` headers with migration URLs) and ensure it is applied consistently via the public API middleware/error pipeline.

  • med

    Introduce contract compatibility testing across API versions (e.g., snapshot/contract tests that ensure older versions remain valid and that schema evolution is add-only).

  • low

    Extend the global public-api middleware to attach version-compat metadata (when relevant) in a single place to avoid endpoint-specific behavior.
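
As a rough illustration of the header policy recommended above, a single helper could produce the deprecation signals for a given API version. The sketch assumes the `Deprecation` header format from RFC 9745 (an `@` followed by Unix seconds) and the `Sunset` header from RFC 8594 (an HTTP-date); `DeprecationPolicy` and `deprecationHeaders` are hypothetical names, not part of the n8n codebase.

```typescript
// Hypothetical per-version policy record.
interface DeprecationPolicy {
  deprecatedAt: Date; // when the version was (or will be) deprecated
  sunsetAt: Date; // when the version stops responding
  migrationUrl: string; // documentation describing the migration path
}

function deprecationHeaders(policy: DeprecationPolicy): Record<string, string> {
  return {
    // RFC 9745: structured-field date, "@" followed by Unix seconds.
    Deprecation: `@${Math.floor(policy.deprecatedAt.getTime() / 1000)}`,
    // RFC 8594: HTTP-date, which Date#toUTCString produces.
    Sunset: policy.sunsetAt.toUTCString(),
    // Link relation pointing integrators at the migration documentation.
    Link: `<${policy.migrationUrl}>; rel="deprecation"`,
  };
}
```

Attaching these in one middleware keyed by the route's version prefix would keep the signaling uniform instead of leaving it to individual handlers.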

Programmatic auth with scopes 100%

This codebase implements scoped, server-managed public API credentials. API keys are stored with per-key `scopes` (and `lastUsedAt`), authenticated via `x-n8n-api-key`, and enforced per endpoint through middleware that checks `req.tokenGrant.apiKeyScopes`. There are also dedicated key-management endpoints (create/list/update/delete/scopes) guarded by `apiKey:*` scopes, supporting revocation/rotation workflows.

  • high

    Verify consistency across the entire public API surface: ensure every public-api v1 handler uses the scope enforcement helpers (e.g., `publicApiScope` / `projectScope` / `apiKeyHasScopeWithGlobalScopeFallback`) and that all endpoints depend on `req.tokenGrant.apiKeyScopes` rather than bypassing enforcement.

  • med

    Confirm rotation/revocation behavior is fully documented for external integrators (how to generate a new key with narrower scopes, how to revoke/delete old keys, and how last-used tracking is surfaced/queried).

Per-tenant rate limiting 83%

Per-tenant/per-consumer rate limiting is present as a route-level capability: `ControllerRegistry` can attach rate limit middleware based on decorator metadata, and `RateLimitService` supports user-keyed buckets (keyed by `req.user.id`). However, enforcement is only enabled on routes that explicitly declare `keyedRateLimit`, and the shown limiter implementation does not demonstrate the expected public signaling contract (standard rate-limit headers, 429 retry guidance) required for third-party integrators.

  • high

    Audit the full set of public API routes to ensure user/tenant-keyed rate limiting (`keyedRateLimit: { source: 'user' }`) is consistently applied at the controller/route registration layer (not only selectively). Any endpoints that are currently missing rate limiting should be updated to declare the correct keyed limiter configuration.

  • high

    Verify and standardize the public 429 response contract for the rate limiter: ensure headers like `Retry-After` (and, if used, `X-RateLimit-*` or equivalent) are emitted consistently and include retry guidance, aligned with the project’s other API response conventions.

  • med

    Ensure the keying aligns with 'tenant' semantics rather than just 'user'. If n8n public API keys/clients are team/project-scoped (or have an API key tenant identifier distinct from user ID), update `createUserKeyedRateLimitMiddleware` (or add a dedicated 'client/tenant' keyed mode) to bucket by the correct consumer identifier.
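
The signaling contract described above can be sketched with a small fixed-window limiter whose decision carries the response headers. This is an illustration under stated assumptions, not the behavior of `RateLimitService`: the class name, the fixed-window algorithm, and the `X-RateLimit-*` header names are all placeholders, and the key could be a user, API key, or project identifier depending on what "tenant" means for the route.

```typescript
interface RateLimitDecision {
  allowed: boolean;
  status: 200 | 429;
  headers: Record<string, string>;
}

// Hypothetical in-memory limiter: one fixed window per consumer key.
class KeyedFixedWindowLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  check(key: string, nowMs: number): RateLimitDecision {
    let win = this.windows.get(key);
    if (!win || nowMs - win.start >= this.windowMs) {
      win = { start: nowMs, count: 0 }; // start a fresh window for this key
      this.windows.set(key, win);
    }

    if (win.count >= this.limit) {
      const retryAfterSec = Math.ceil((win.start + this.windowMs - nowMs) / 1000);
      return {
        allowed: false,
        status: 429,
        headers: {
          "Retry-After": String(retryAfterSec),
          "X-RateLimit-Limit": String(this.limit),
          "X-RateLimit-Remaining": "0",
        },
      };
    }

    win.count += 1;
    return {
      allowed: true,
      status: 200,
      headers: {
        "X-RateLimit-Limit": String(this.limit),
        "X-RateLimit-Remaining": String(this.limit - win.count),
      },
    };
  }
}
```

Returning the headers alongside the decision makes it hard for a route to reject a request without also telling the client when to retry.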

Idempotent writes 0%

Idempotent writes (HTTP retry-safe mutations via an idempotency key) are not implemented as a public, consumable contract on the inspected public write endpoints (workflows + credentials). The codebase does include a separate “data deduplication” mechanism for execution/runtime purposes, but it is not the HTTP mutation idempotency-key pattern required for safe client retries.
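
For contrast with the execution-time deduplication mentioned above, the HTTP idempotency-key pattern can be sketched as follows. Everything here is illustrative: `IdempotencyStore` is a hypothetical in-memory stand-in for a persistent table, the key is assumed to arrive in an `Idempotency-Key` request header, and the 422 response for a reused key with a different payload is one common convention rather than an n8n behavior.

```typescript
interface HandlerResult {
  status: number;
  body: unknown;
}

// Hypothetical store: remembers the first response for each idempotency key.
class IdempotencyStore {
  private readonly entries = new Map<string, { fingerprint: string; result: HandlerResult }>();

  execute(key: string, requestBody: unknown, handler: () => HandlerResult): HandlerResult {
    // Simplified fingerprint; a real implementation would hash a canonical form.
    const fingerprint = JSON.stringify(requestBody);
    const existing = this.entries.get(key);

    if (existing) {
      if (existing.fingerprint !== fingerprint) {
        // Same key, different payload: reject instead of replaying.
        return { status: 422, body: { message: "Idempotency-Key reused with a different payload" } };
      }
      return existing.result; // retry of the same request: replay, do not re-run
    }

    const result = handler();
    this.entries.set(key, { fingerprint, result });
    return result;
  }
}
```

With this in front of a create endpoint, a client that times out and retries with the same key gets the original response back instead of creating a second workflow or credential.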

Consistent pagination & filtering 100%

The codebase has a strong, reusable pagination + cursor-filtering convention for public API v1 list endpoints: a shared pagination DTO (bounded page size), a `validCursor` middleware to normalize cursor queries, and a shared `encodeNextCursor` response contract. The main public list handlers (workflows, data-tables, projects, credentials, executions) apply this consistently, including returning `nextCursor` and enforcing bounded limits where applicable.

Outbound events / webhooks 0%

The codebase contains webhook *inbound* handling (server/controller/service for receiving HTTP webhook requests), and it also has an *outbound* webhook-like sender for log streaming (Axios POST to a configured URL). However, there is no implemented outbound events/webhooks primitive matching the required contract: no evidence of a subscription/delivery worker with HMAC-signed versioned payloads, exponential-backoff retries with a cap + flag/alert, idempotent redelivery, and a documented event catalog. Therefore this primitive is absent as a coherent, integrator-consumable outbound-events system.

  • high

    Introduce a first-class outbound events/webhooks delivery pipeline: (1) store webhook subscriptions per tenant/credential, (2) create a delivery worker that reads pending deliveries, (3) emit versioned payloads with HMAC (shared secret) signatures, (4) retry with exponential backoff capped at a limit, then flag-and-alert, and (5) implement idempotent delivery (e.g., delivery-id/request-id persisted) to prevent duplicates on retries.

  • med

    Add (and check in) a documented event catalog + webhook payload schema/versioning policy (e.g., AsyncAPI/OpenAPI-like or a dedicated event registry file) so integrators can build without contacting the maintainers.

    • packages/@n8n/api-types/src/push/webhook.ts:1-18 — There are webhook-related push message types, but no evidence (from the reviewed webhook destination/server code) of a public, versioned outbound event catalog and payload signing/retry contract.
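
The signing step (3) of the recommended pipeline could be sketched as below, using Node's built-in `crypto` module. The function names, the `timestamp.body` signing scheme, and the idea of carrying the result in a signature header are assumptions for illustration; the audit found no such contract in the repository.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Signs "<timestamp>.<rawBody>" so a receiver can also reject stale replays.
function signPayload(secret: string, timestamp: number, rawBody: string): string {
  return createHmac("sha256", secret).update(`${timestamp}.${rawBody}`).digest("hex");
}

// Receiver-side check using a constant-time comparison.
function verifySignature(
  secret: string,
  timestamp: number,
  rawBody: string,
  signatureHex: string,
): boolean {
  const expected = Buffer.from(signPayload(secret, timestamp, rawBody), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

A versioned payload would carry its own `id` and `version` fields inside `rawBody`, so the same delivery can be retried with an identical signature and deduplicated by the receiver.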
Consistent errors & status codes 17%

The codebase has a partially consistent REST error contract: errors are classified and serialized through a centralized pipeline (classifyHttpError -> serializeInternalRestError -> sendErrorResponse). However, evidence also shows endpoint-specific ad-hoc error responses (e.g., MCP consent controller) that do not follow the shared envelope. Additionally, the audited error serializers/responder do not demonstrate a required correlation id field on every error, and the status-code mapping requirements (400/401/403/409/422/429 with correct semantics) are not confirmed in the shared serializer layer.

  • high

    Enforce the shared error-envelope for all public REST endpoints by removing/rewriting ad-hoc controller-level {status, message} responses (like MCP consent) to use the centralized sendErrorResponse() / send() wrapper path.

  • high

    Add and propagate a correlation/request id on every error response in the shared serializer layer (e.g., include correlationId/requestId in serializeInternalRestError/serializePublicApiError and ensure it is present for all descriptor kinds).

  • med

    Verify and correct status-code mapping across the shared classifier/serializers to explicitly cover 400, 401, 403, 409, 422, 429, and ensure 5xx is only used for faults; encode these mappings in classifyHttpError() or ResponseError types so they are uniform across endpoints.
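
The three recommendations above converge on one shared envelope. A minimal sketch, assuming a hypothetical `ErrorKind` classification and a `correlationId` supplied by request middleware (neither is taken from `classifyHttpError` or the existing serializers):

```typescript
// Hypothetical error classification covering the status codes listed above.
type ErrorKind =
  | "validation"
  | "unauthenticated"
  | "forbidden"
  | "conflict"
  | "unprocessable"
  | "rate_limited"
  | "internal";

const STATUS_BY_KIND: Record<ErrorKind, number> = {
  validation: 400,
  unauthenticated: 401,
  forbidden: 403,
  conflict: 409,
  unprocessable: 422,
  rate_limited: 429,
  internal: 500, // 5xx reserved for server faults only
};

interface ErrorEnvelope {
  status: number;
  code: ErrorKind;
  message: string;
  correlationId: string;
}

function toErrorEnvelope(kind: ErrorKind, message: string, correlationId: string): ErrorEnvelope {
  return {
    status: STATUS_BY_KIND[kind],
    code: kind,
    // Do not leak internal details on server faults.
    message: kind === "internal" ? "Internal server error" : message,
    correlationId,
  };
}
```

Because `STATUS_BY_KIND` is typed as `Record<ErrorKind, number>`, adding a new kind without a status mapping fails at compile time, which is the uniformity the last recommendation asks for.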

Sandbox / test mode 67%

The codebase contains an internal sandbox/test-mode mechanism for the instance-ai evaluation harness: sandbox base URL + test/API keys are resolved from env vars (`resolveSandboxConfig`), and an eval HTTP client authenticates against a provided sandbox baseUrl using test credentials. However, this appears to be harness-oriented rather than a clearly documented, third-party consumable “sandbox contract” for external integrators.

  • high

    Make the sandbox primitive integrator-facing: add/confirm a documented sandbox base URL and test credential acquisition/rotation process, with a stable contract (where to send requests, which accounts/keys to use, data isolation guarantees, and lifecycle). Evidence: sandbox selection + required keys exist, but there’s no sign (in this audit) of a public, stable integrator contract beyond the eval harness.

  • med

    Add an explicit “sandbox/test mode” section to the relevant integration docs and/or repository-level docs that maps sandbox env vars to integration behavior (including which providers are supported and what test data isolation means), so a third party can integrate without reading internal harness code.

Extension points / plugins 150%

This codebase has real extension points: (1) CLI external lifecycle hooks loaded from configured external files and invoked through a central `ExternalHooks` runner, and (2) a documented workflow-builder plugin registry in the workflow SDK (validators/composite handlers/serializers) with a singleton registry for easy integration.

  • high

    Add first-class, versioned documentation/specs for the External Hooks contract (the expected module export shape and available hook names/parameters), including stability guarantees and error-handling semantics for hook failures.

  • med

    Ensure plugin registry extension points are accompanied by public SDK-facing docs that explain how external packages should create/register plugins, what lifecycle guarantees exist, and how priority conflicts are resolved.

Integration Depth

Per-system adapters behind one shared interface with bi-directional sync — not per-customer scripts held together with spreadsheets.

55% 8/10 scored
  • Metadata-driven mappings 78%
    3/3 expected sites
  • Per-integration reliability 17%
    1/2 expected sites
  • Sync state & reconciliation 100%
    1/1 expected sites
  • Inbound validation & normalization 83%
    2/2 expected sites
  • Per-tenant integration credentials 25%
    2/4 expected sites
  • Per-integration observability 0%
    0/3 expected sites not present
  • Connector breadth for the category 67%
    1/1 expected sites
  • Build-vs-buy posture 73%
    5/5 expected sites
Shared integration abstraction N/A

No “Shared integration abstraction” for external-system connectors (one common interface plus canonical entities such as account/contact/invoice across multiple distinct external integrations) was found in the code areas inspected. The repo does contain shared adapter-style abstractions (e.g., CRDT sync provider wiring and various agent/tool/provider interfaces), but those are generic infrastructure, not the primitive this dimension measures: connector adapters that map distinct external systems onto stable canonical domain entities.

  • high

    Identify the product’s customer-facing external-system connectors (the dirs and runtime paths that implement SaaS integrations). Then implement/verify a shared connector interface + canonical entity model contract that every connector maps to (e.g., canonical Account/Contact/Invoice entities). Ensure each connector implements the shared interface rather than duplicating parsing/mapping/writes.

  • med

    Add “integration adapter contract” enforcement: require connectors to implement the shared interface and produce normalized canonical entities (with validation/dedup at the boundary). Add compile-time typing/tests to prevent one-off/snowflake mappings from bypassing the canonical layer.

Bidirectional sync N/A

The codebase contains a well-architected bidirectional sync primitive in the CRDT module (BaseSyncProvider), which synchronizes state between peers by applying incoming updates and sending outgoing updates back over a transport. The audit did not find this primitive exposed as an external-system integration feature (read+write connector sync) in the sampled connector/node paths.

  • high

    If the audit intent is specifically external-system integrations (connectors), expand the connector inventory beyond string/path heuristics (integration/adapter/sync/import/export) and directly inspect the sync execution paths for representative connectors that should require write-back, then verify presence of read+write behavior, stored cursors, idempotent upserts, and failure handling.

  • med

    Clarify scope: determine whether the dimension should treat CRDT peer synchronization as “bidirectional sync” for this audit, or only consider customer-facing external connectors. If only external connectors count, mark this primitive as N/A for the integration-depth portion.

    • packages/@n8n/crdt/src/transports/types.ts:1-30 — Defines sync transport as a “dumb pipe” moving binary updates between CRDT documents, reinforcing that this sync is peer-to-peer within the product rather than an external integration connector.
Metadata-driven mappings 78%

The codebase contains a clear metadata-driven mapping primitive in the EE provisioning/SSO role-mapping area. Mapping rules (expression + role + scope/type + ordering) are stored as config and interpreted at runtime by RoleResolverService via Expression.resolveWithoutWorkflow, selecting the first enabled rule that evaluates to true.

  • high

    Add/verify end-to-end tests that demonstrate tenant-scoped behavior is driven purely by persisted mapping config (enabled rules, projectId selection, expression evaluation order) rather than environment- or tenant-specific code branches.

  • med

    Audit mapping-rule enablement and fallback semantics (instanceRoleRules vs fallbackInstanceRole; projectRoleRules with matched.has(projectId)) to ensure the expected determinism when multiple rules evaluate true.
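
The first-match semantics described above (ordered rules, disabled rules skipped, fallback when nothing matches) can be illustrated with a small resolver. This is a simplified sketch: the audited `RoleResolverService` evaluates stored expressions, whereas the hypothetical `MappingRule` here uses a plain predicate function as a stand-in, and `resolveRole` is not a name from the codebase.

```typescript
// Hypothetical rule shape; `matches` stands in for a stored expression.
interface MappingRule<C> {
  order: number;
  enabled: boolean;
  role: string;
  matches: (claims: C) => boolean;
}

// Returns the role of the first enabled rule (by ascending order) that matches.
function resolveRole<C>(rules: MappingRule<C>[], claims: C, fallbackRole: string): string {
  const ordered = rules.filter((r) => r.enabled).sort((a, b) => a.order - b.order);
  for (const rule of ordered) {
    if (rule.matches(claims)) return rule.role;
  }
  return fallbackRole;
}
```

Sorting by an explicit `order` field, rather than relying on storage order, is what makes the outcome deterministic when several rules evaluate to true.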

Per-integration reliability 17%

Per-integration reliability is only partially present. There is a retry-with-exponential-backoff manager for external secrets, but the implementation does not include a dead-letter/holding queue for failures after retries. The AMQP sender node (an external integration) does not apply retry-with-backoff + DLQ behavior when per-item message sending fails; errors are surfaced (or thrown) without parking undeliverable records.

  • high

    Add retry-with-backoff for per-item AMQP publish failures and introduce a dead-letter queue/quarantine mechanism for messages that still fail after the retry budget is exhausted. Ensure failures are observable (metrics/alerts) and that undeliverable items are parked rather than only returned as errors or thrown.

  • high

    Extend the ExternalSecretsRetryManager to support a dead-letter/holding area for operations that fail beyond configured retry attempts (and emit/record alerting for DLQ events). Right now it schedules retries indefinitely via backoff but has no 'give up + park' mechanism visible in this service.
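
The "give up and park" behavior both recommendations ask for can be sketched as a bounded retry loop that hands exhausted items to a dead-letter queue. All names here (`onFailure`, `DeadLetterQueue`, `processWithRetry`) are hypothetical, and to keep the example deterministic the driver records the delays it would wait instead of actually sleeping.

```typescript
type FailureAction = { kind: "retry"; delayMs: number } | { kind: "park" };

// Decides what to do after the given (1-based) attempt has failed.
function onFailure(attempt: number, maxAttempts: number, baseMs = 1000, capMs = 60_000): FailureAction {
  if (attempt >= maxAttempts) return { kind: "park" };
  return { kind: "retry", delayMs: Math.min(capMs, baseMs * 2 ** (attempt - 1)) };
}

class DeadLetterQueue<T> {
  private readonly parked: Array<{ item: T; attempts: number; lastError: string }> = [];

  park(item: T, attempts: number, lastError: string): void {
    this.parked.push({ item, attempts, lastError });
  }

  size(): number {
    return this.parked.length;
  }
}

// Illustrative driver: a real worker would sleep for each delay.
function processWithRetry<T>(
  item: T,
  send: (item: T) => void,
  dlq: DeadLetterQueue<T>,
  maxAttempts: number,
): { delivered: boolean; delaysMs: number[] } {
  const delaysMs: number[] = [];
  for (let attempt = 1; ; attempt += 1) {
    try {
      send(item);
      return { delivered: true, delaysMs };
    } catch (err) {
      const action = onFailure(attempt, maxAttempts);
      if (action.kind === "park") {
        dlq.park(item, attempt, err instanceof Error ? err.message : String(err));
        return { delivered: false, delaysMs };
      }
      delaysMs.push(action.delayMs);
    }
  }
}
```

The size of the dead-letter queue is then a natural metric to alert on, since any non-zero value means an item exhausted its retry budget.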

Sync state & reconciliation 100%

The codebase contains a strong instance of the “Sync state & reconciliation” primitive in the n8n-memory adapter. It persists per-scope cursors (watermarks), uses idempotent upserts for derived memory entries (contentHash-based), and performs drift repair by dropping/superseding entries and copying/superseding sources according to a normalized reconciliation plan. Other sync-state mechanisms exist (e.g., CRDT peer sync), but the adapter-style cursor+reconciliation is clearly implemented at n8n-memory.

  • med

    Add/verify explicit drift detection telemetry/visibility for this adapter (e.g., counters/logging for dropped vs superseded vs upserted counts per reconciliation run) so reconciliation correctness issues don’t remain silent.

  • low

    Ensure concurrency safety around cursor updates for the same observation scope (e.g., confirm locking or monotonic cursor update semantics across distributed instances).

    • packages/cli/src/modules/agents/integrations/n8n-memory.ts:940-1045 — Cursor updates use an insert-or-ignore plus an update conditioned on lastIndexedObservationCreatedAt/lastIndexedObservationId ordering, which helps monotonicity, but verifying end-to-end locking (task locks) would complete the reconciliation correctness story under concurrent writers.
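
The monotonic cursor update noted in the evidence above can be illustrated in isolation. This sketch is an in-memory stand-in for the conditional database update, with hypothetical names (`Cursor`, `CursorStore`, `advance`); it shows only the ordering rule, not the locking behavior the recommendation asks to verify.

```typescript
interface Cursor {
  createdAt: number; // timestamp of the last indexed observation
  id: string; // tie-breaker when timestamps are equal
}

// Orders cursors by (createdAt, id).
function isAfter(candidate: Cursor, current: Cursor): boolean {
  if (candidate.createdAt !== current.createdAt) return candidate.createdAt > current.createdAt;
  return candidate.id > current.id;
}

class CursorStore {
  private readonly cursors = new Map<string, Cursor>();

  // Advances the cursor only forward; a stale writer is ignored.
  advance(scope: string, candidate: Cursor): boolean {
    const current = this.cursors.get(scope);
    if (current && !isAfter(candidate, current)) return false;
    this.cursors.set(scope, candidate);
    return true;
  }

  get(scope: string): Cursor | undefined {
    return this.cursors.get(scope);
  }
}
```

In a database the same rule is typically expressed as an `UPDATE ... WHERE` on the stored ordering columns, so two concurrent writers cannot move the watermark backwards even without an explicit lock.
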
Inbound validation & normalization 83%

The codebase does implement inbound validation/normalization at key API/config boundaries using Zod (DTO validation for “import-workflow-from-url” and Zod schemas for agent integration settings with strictness, custom refinement, and normalization/dedup). I did not identify evidence here of a full “dedup + bad record quarantine” pipeline beyond schema-layer handling for the sampled boundaries; deeper boundary-to-storage ingestion behavior would need confirmation in the handlers/repositories that persist these inputs.

  • high

    For inbound workflow import flows, ensure the handler that consumes `ImportWorkflowFromUrlDto` performs idempotency/dedup (e.g., dedupe by (projectId, sourceUrl) or content hash) and quarantines failed/invalid imports to a dedicated failure store rather than allowing raw/unvalidated data to reach workflow tables.

  • med

    For agent integration config, confirm that Zod validation failures are consistently mapped to a canonical error/quarantine mechanism (e.g., rejecting invalid integration configs before they’re stored or used) and that any normalization (like allowedUsers deduping) is reflected in the canonical persisted model.

Per-tenant integration credentials 25%

The codebase contains credential handling mechanisms consistent with per-tenant/per-credential isolation and refresh. For external secret providers, ExternalSecretsManager loads encrypted per-provider-connection settings, sets up the provider, and starts/stops periodic refresh (supporting refresh and revocation). For OAuth, OAuth service persists token data to the specific credential record (encrypt-and-save) and the OAuth client supports refresh-token based refreshing. However, from the evidence inspected, it is not fully demonstrated that tenant isolation is enforced specifically via a secret manager boundary for all OAuth credential refresh paths (the secret-manager requirement appears explicitly for external secrets, not necessarily for the core OAuth token lifecycle).

  • high

    Verify tenant/workspace isolation for OAuth token persistence: trace from getCredentialForUpdate/findCredentialForUser (permission scoping) through credentialsRepository.update(credential.id) and confirm credential.id is tenant-scoped (not globally shared) and that revocation clears/deletes refresh tokens within the same tenant boundary.

    • packages/cli/src/oauth/oauth.service.ts:200-320 — encryptAndSaveData updates the credential by credential.id; need to confirm credential.id belongs to the requesting tenant/workspace via the upstream finder/permission logic (not shown in the snippet).
  • high

    Confirm secret-manager-based refresh for “integration credentials” beyond ExternalSecrets EE: locate the component(s) that call token refresh in the runtime and check whether refresh-token material is fetched from per-tenant secret storage (or whether it is stored encrypted in the DB). If it’s DB-encrypted, document that it still satisfies the intended isolation requirement, or extend secret-manager integration if required.

  • med

    For external secrets, confirm per-tenant connection scoping: inspect SecretsProviderConnectionRepository queries/filters to ensure providerKey/settings are constrained to the tenant (not just “providerKey”).

Per-integration observability 0%

I did not find any implementation of per-integration observability (per connector/platform health, throughput, failures, and last-sync/last-run surfaced to ops/customers). There is an observability provider for the expression engine, and integration code (e.g., AgentChatBridge) logs errors and posts messages, but there are no per-integration metrics/status updates (success/failure rates, latency histograms, last-sync/run state) visible in the integration execution paths.

  • high

    Add per-integration metrics and status reporting keyed by integration connection identity (e.g., integrationConnectionId + integration.type): counters for successes/failures, histogram for latency, and a gauge/status record for last successful/failed run time + error code. Emit these at the start/end of executeAndStream and in each error handler.

  • high

    Instrument platform-specific posting lifecycles (streaming + buffered) with the same per-integration metric labels, so ops can distinguish failures caused by the agent stream vs. failures caused by the external platform post.

  • med

    Surface last-sync/last-run status to an ops-visible channel (and, if applicable, to customers) by persisting a lightweight integration status record (timestamp, outcome, last error code/message) in an existing status store or monitoring table used by the product.
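
A lightweight status record of the kind described above might be kept per integration connection as follows. The names (`IntegrationMetrics`, `RunOutcome`) are hypothetical, the store is in-memory for illustration, and a running latency total stands in for the histogram a real metrics backend would provide.

```typescript
type RunOutcome =
  | { ok: true; latencyMs: number }
  | { ok: false; latencyMs: number; errorCode: string };

interface IntegrationStatus {
  successes: number;
  failures: number;
  totalLatencyMs: number;
  lastRunAt: number | null;
  lastErrorCode: string | null;
}

class IntegrationMetrics {
  private readonly byKey = new Map<string, IntegrationStatus>();

  // Records one run, keyed by integration type and connection identity.
  record(connectionId: string, type: string, outcome: RunOutcome, nowMs: number): void {
    const key = `${type}:${connectionId}`;
    const status: IntegrationStatus = this.byKey.get(key) ?? {
      successes: 0,
      failures: 0,
      totalLatencyMs: 0,
      lastRunAt: null,
      lastErrorCode: null,
    };
    if (outcome.ok) {
      status.successes += 1;
      status.lastErrorCode = null;
    } else {
      status.failures += 1;
      status.lastErrorCode = outcome.errorCode;
    }
    status.totalLatencyMs += outcome.latencyMs;
    status.lastRunAt = nowMs;
    this.byKey.set(key, status);
  }

  // Returns a copy so callers cannot mutate the stored record.
  snapshot(connectionId: string, type: string): IntegrationStatus | undefined {
    const status = this.byKey.get(`${type}:${connectionId}`);
    return status ? { ...status } : undefined;
  }
}
```

Calling `record` in both the success path and every error handler of an integration run gives ops a per-connection view of failure rate and last-run state without changing the integrations themselves.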

Connector breadth for the category 67%

The codebase contains a concrete connector-breadth surface for the agents/LLM-provider category: `PROVIDER_CREDENTIAL_SCHEMAS` enumerates many external provider systems and their credential requirements. However, this appears to be provider-focused (auth/capability inputs) rather than a broader “integration catalog” across all table-stakes connector types for the vertical (identity/CRM/data warehouse/etc.); I did not find a single comprehensive breadth matrix for the whole product category in the limited evidence gathered.

  • high

    Confirm whether there is a higher-level connector catalog that maps *external systems covered* vs *table-stakes expectations* for this market/vertical (beyond LLM provider credentials). If not, create/extend one (e.g., a single catalog or data-room document + runtime registry) so connector breadth is measurable and gaps are explicit.

  • med

    If breadth is intended to be measured for additional connector types (CRM/identity/warehouse/etc.), add analogous enumerations/registries for those connector families (or document intentional omissions), using the same “coverage is explicit” pattern as `PROVIDER_CREDENTIAL_SCHEMAS`.

Build-vs-buy posture 73%

The codebase uses an architected first-party integration contract (AgentChatIntegration) while outsourcing connector depth to embedded third-party adapter packages loaded from @chat-adapter/* (Slack/Telegram/Linear/etc.). This is consistent across multiple connectors and avoids spaghetti-style per-connector snowflakes, but the overall integration depth (API connectivity/adapter internals) is clearly ‘bought’ rather than fully built.

  • high

    Confirm that the connector depth delegated to adapters is intentionally bounded: document which responsibilities remain first-party (credential extraction, lifecycle hooks, normalization, UI metadata) and which are delegated to @chat-adapter/* (API calls, protocol details). Add/keep a short README alongside the integrations module to prevent future connectors from drifting into bespoke wiring.

  • med

    Add consistency checks/tests ensuring new connectors follow the same adapter-boundary pattern (extract credentials → load adapter → createAdapter) and register in ChatIntegrationRegistry, so build-vs-buy posture remains uniform as connector count grows.

Not applicable to this codebase: Shared integration abstraction, Bidirectional sync.

Deployability

CI/CD as code, infrastructure as code, per-environment isolation, and a one-command local boot.

65% 11/11 scored
  • Reproducible one-command build 0%
    0/1 expected sites not present
  • Automated CI pipeline 100%
    4/4 expected sites
  • Automated deployment (CD) 0%
    0/2 expected sites not present
  • Infrastructure as code 67%
    4/4 expected sites
  • Environment isolation 0%
    0/2 expected sites not present
  • Local/production parity 100%
    3/3 expected sites
  • Config & secrets externalized per env 122%
    4/3 expected sites
  • Decouple deploy from release 83%
    4/4 expected sites
  • Reversibility / rollback 67%
    2/3 expected sites
  • Delivery cadence (DORA proxy) 92%
    4/4 expected sites
  • Deploy-tooling ownership 89%
    3/3 expected sites
Reproducible one-command build 0%

I did not find a true “reproducible one-command build” for the main n8n app: the repo’s local bootstrap relies on multi-step devcontainer commands (corepack/pnpm install and build via lifecycle hooks) rather than a single documented command that (a) starts from a clean clone, (b) deterministically builds/boots using pinned dependencies (lockfile), and (c) is the first-class local boot workflow for contributors.

  • high

    Add a repo-root, one-command bootstrap+boot entry point (e.g., `./scripts/boot-dev.sh` or `make dev` / `just dev`) that: (1) installs dependencies from a committed lockfile, (2) builds, and (3) starts the n8n server, with all environment requirements checked and documented. Ensure the command works without requiring devcontainer lifecycle hooks.

  • med

    Ensure determinism by wiring dependency installation to a committed lockfile and using it in the one-command script (no unpinned installs, no implicit network-resolved version ranges).

  • low

    If keeping the devcontainer, make its `postCreateCommand` and `postAttachCommand` delegate to the same one-command script so there is one source of truth for local reproducible boot.

Automated CI pipeline 100%

The automated CI pipeline primitive is clearly present. The repo has dedicated GitHub Actions workflows for master pushes and for pull requests/merge queues. The PR workflow runs build, unit tests, typecheck, lint, and broader checks (DB/e2e/dev-server-smoke/performance/security/workflow scripts) and includes a required-checks gate to validate that the expected jobs pass.

  • low

    Consider documenting (in the workflow or README) exactly how branch protection is configured to require the 'required-checks' job (and whether additional jobs are required directly).

Automated deployment (CD) 0%

Automated deployment (CD) to production is not present as a distinct, runnable pipeline path in this repository. The codebase has CI workflows for building and publishing release artifacts (npm, Docker Hub, GitHub Releases), but no corresponding production deployment/rollout stage is wired into the workflow definitions inspected.

  • high

    Add a production deployment/rollout workflow/job wired into the existing release pipeline (e.g., triggered from release-publish or called via workflow_call). Make the deploy step target a specific GitHub Actions environment (environment: production) and include the actual deploy mechanism (Kubernetes apply/Helm, Terraform apply, serverless deploy, etc.), plus rollout/health checks.

  • med

    If production deployment exists outside this repo, codify the production deploy trigger contract here (e.g., call a separate deploy workflow, or add deploy instructions/scripts and invoke them from CI). Ensure multiple team members can run it via the pipeline (not a one-person script).

Infrastructure as code 67%

Infrastructure-as-code exists in the repository, but it appears scoped to benchmark infrastructure under packages/@n8n/benchmark/infra (Terraform/azurerm). The IaC is implemented as versioned Terraform configuration and a reusable VM module with pinned providers and concrete Azure resource definitions. However, the evidence suggests IaC may not cover the primary n8n production deployment path (infrastructure-as-code primitives were only detected in the benchmark infra subtree).

  • high

    Extend/replicate the Terraform IaC pattern beyond the benchmark-only subtree so that production/staging infrastructure is defined in versioned code (single golden path), not only for benchmark environments.

  • med

    Harden IaC outputs/secret handling: avoid emitting SSH private keys as direct Terraform outputs; prefer storing keys in a secret manager and output only references or public endpoints.

  • low

    Ensure the IaC is fully parameterized and documented with a repeatable “apply” entrypoint (e.g., README + tfvars templates) so new environments can be created reproducibly without tribal knowledge.

Environment isolation 0%

I did not find evidence that this codebase enforces true environment isolation (separate dev/staging/prod environments with isolated data and credentials/accounts). The code mostly provides runtime environment detection (NODE_ENV) and occasional runtime switching between staging/production endpoints using hardcoded URLs, but that is not the same as isolated deployments with segregated state.

  • high

    Replace hardcoded per-environment endpoints (e.g., staging vs production URLs) with environment-specific configuration that is supplied from versioned per-environment config/infra (dev/staging/prod stacks), and ensure each environment uses its own credentials/accounts/data store.

  • high

    Ensure the application’s configuration layer wires separate database connections/secrets per stage (dev/staging/prod) rather than relying on a single binary/runtime with only NODE_ENV checks.

  • med

    Add/verify a versioned per-environment configuration pattern (e.g., config templates and required variables) for staging/prod, and confirm that secrets are not shared between environments (separate secret sets, not just different endpoint constants).

Local/production parity 100%

Local/production parity is present and well implemented via a VS Code devcontainer: local development runs n8n in containers (with Compose), includes a containerized Postgres dependency, and uses the repo’s Docker image/runtime conventions (production `NODE_ENV` and entrypoint).

  • med

    Confirm that the `.devcontainer/docker-compose.yml` `build.dockerfile: Dockerfile` maps to the same production runtime Dockerfile you intend to mirror (e.g., verify whether it resolves to `docker/images/n8n/Dockerfile` or a different Dockerfile at repo root) and, if different, update the devcontainer to build from the same Dockerfile used for the production image.

    • .devcontainer/docker-compose.yml:1-25 — The dev stack builds using `dockerfile: Dockerfile`, but we also see a production runtime image Dockerfile at `docker/images/n8n/Dockerfile`; ensure these are consistent for full parity.
Config & secrets externalized per env 122%

This codebase uses an explicit configuration layer that externalizes env-dependent settings: key modules under `packages/@n8n/config/src/configs/*` define config fields via `@Env(...)` decorators and read environment variables (including legacy compatibility) at runtime. A `.env.example` template exists for agent API keys, supporting the intended secrets/config injection workflow.

  • high

    Do a targeted scan for any remaining environment-specific production literals (e.g., hardcoded URLs/endpoints/keys) outside the config layer (search for patterns like `http://`, `https://`, `API_KEY`, `SECRET`, `process.env` usage without the config decorators) and refactor them into the `packages/@n8n/config/src/configs/` pattern using `@Env(...)`.

  • med

    Ensure every environment-specific secret used by agents/workers (provider keys, external service tokens) has a corresponding `.env.example` entry or documented env mapping, and that production values are never committed.

Decouple deploy from release 83%

This codebase includes real feature-flag plumbing that decouples deploy from release: PostHog flag evaluation happens at runtime, and server-side env-var overrides (`N8N_ENV_FEAT_*`) provide an operator escape hatch. The frontend also gates behavior based on evaluated flag variants. Implementation is solid but the evidence inspected focuses mainly on the flag evaluation/consumption layer rather than verifying percentage/canary rollout at specific production code paths.

  • high

    Trace a few concrete production user journeys where features should be gated (e.g., the specific flags mentioned in `@/modules/.../feature-flag` imports) and confirm the code paths are actually guarded by flag checks (and not merely evaluated/telemetry-only).

  • med

    Confirm rollout behavior beyond boolean enable/disable: identify whether evaluated flags support percentage/canary variants (e.g., variant types or numeric rollouts) and ensure they are used to progressively expose code rather than all-or-nothing activation per deploy.

  • low

    Document the intended operational model for flag governance (where flags are defined, how long they live, and how rollbacks are performed) so the team treats this as a release mechanism rather than a permanent switch.
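
The precedence described above (an operator override wins, otherwise the remotely evaluated flag applies) can be sketched as follows. Only the `N8N_ENV_FEAT_` prefix comes from the audit; the key transform, the accepted `"true"`/`"false"` values, and the names `overrideKey` and `resolveFlag` are assumptions made for illustration.

```typescript
// Illustrative only: the key transform and accepted values below are
// assumptions for this sketch, not n8n's actual override semantics.
type FlagValue = boolean | string; // boolean flag or a named variant

const OVERRIDE_PREFIX = "N8N_ENV_FEAT_";

function overrideKey(flagName: string): string {
  // "new-canvas" -> "N8N_ENV_FEAT_NEW_CANVAS" (assumed convention)
  return OVERRIDE_PREFIX + flagName.replace(/[^A-Za-z0-9]+/g, "_").toUpperCase();
}

function resolveFlag(
  flagName: string,
  env: Record<string, string | undefined>,
  remoteFlags: Record<string, FlagValue | undefined>,
): FlagValue {
  const raw = env[overrideKey(flagName)];
  if (raw === "true") return true; // operator forces the feature on
  if (raw === "false") return false; // operator forces the feature off
  // No usable override: fall back to the remotely evaluated flag, default off.
  return remoteFlags[flagName] ?? false;
}
```

Passing the environment in as a parameter, rather than reading it globally, keeps the precedence rule testable and makes the rollback path (set the override to `false`) explicit.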

Reversibility / rollback 67%

Reversibility/rollback for database migrations is implemented and enforced. The migration system distinguishes reversible vs irreversible migrations via types (`down` required or forbidden), `wrapMigration` standardizes execution of both `up` and `down`, and test utilities provide an actual rollback mechanism (`undoLastSingleMigration` calling `undoLastMigration`). I did not identify a production-facing rollback *command/job* in the code slices read, but rollback readiness at the migration layer (including undo testing) is strong.

  • high

    Confirm (and if missing, add) an operational rollback entry point that triggers `undoLastMigration`/migration rollback in production when a deploy is reverted (e.g., a CLI/job used by the release process). Right now we verified the undo capability and tests, but not the production rollback orchestration.

  • med

    Audit a sample of real migrations to ensure all `ReversibleMigration` implementations truly restore backward-compatible schema/data (not just syntactically having `down`). Your evidence so far shows the contract and helper wiring; the next step is to validate actual migration correctness for representative backward-compatibility cases.
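
The "`down` required or forbidden" contract can be expressed directly in the type system. The types below are a simplified sketch, not n8n's actual definitions: the `*Sketch` names and `undoLast` are hypothetical, and the code is synchronous for brevity where real migrations are asynchronous.

```typescript
// Simplified illustration of a "down required or forbidden" contract.
// These are sketch types, not n8n's actual definitions; real migrations
// are asynchronous and receive a query runner instead of a string log.
interface MigrationContext {
  log: string[]; // stand-in for a database handle; records executed steps
}

interface ReversibleSketch {
  up(ctx: MigrationContext): void;
  down(ctx: MigrationContext): void; // required
}

interface IrreversibleSketch {
  up(ctx: MigrationContext): void;
  down?: never; // forbidden: supplying `down` fails type-checking
}

type MigrationSketch = ReversibleSketch | IrreversibleSketch;

// Hypothetical rollback entry point: undoes the most recent migration and
// refuses (returns false) when there is none or it has no `down`.
function undoLast(applied: MigrationSketch[], ctx: MigrationContext): boolean {
  const last = applied[applied.length - 1];
  if (last === undefined || last.down === undefined) return false;
  last.down(ctx);
  applied.pop();
  return true;
}
```

An operational rollback job would need the same refusal behavior: reverting a deploy past an irreversible migration has to stop and escalate rather than silently skip it.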

Delivery cadence (DORA proxy) 92%

Delivery cadence (DORA proxy) appears healthy: git history indicates frequent integration and sustained tagging/release activity, and the codebase includes well-automated CI on master and PRs plus automated release publishing and container build/publish workflows. These are consistent with low release friction and frequent small-batch delivery.

  • med

    Verify that merged changes on main/master also trigger an automated deploy to at least a staging environment (not just build/test + publish). If deployment-to-staging is not wired on merges, add a CD workflow to reduce lead time from commit to production-like environments.

Deploy-tooling ownership 89%

Deploy-tooling ownership exists and appears healthy: the repo’s CI workflows and Terraform IaC live as versioned code under .github/workflows and IaC directories, and git-history authorship across these deploy/infra paths is highly distributed (235 authors; top author share ~0.156), reducing the single-engineer CI/CD time-bomb risk.

  • med

    Keep ownership distributed by ensuring new CI/CD and IaC changes are reviewed by at least 1-2 contributors outside the original author set (especially for high-impact workflows like Docker build/push and release workflows).
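
For readers who want to reproduce the concentration figures, the "top author share" is the fraction of commits on the audited paths made by the single most active author. A minimal sketch, assuming one author identifier per commit has already been extracted from git history; the function and field names are hypothetical.

```typescript
// Sketch: concentration of authorship over a set of deploy/infra paths.
// Input is one author identifier per commit; how identities are normalized
// (email vs. name) is an assumption left to the caller.
interface AuthorConcentration {
  distinctAuthors: number;
  topAuthor: string | null;
  topAuthorShare: number; // commits by the top author / all commits, in [0, 1]
}

function authorConcentration(commitAuthors: string[]): AuthorConcentration {
  const counts = new Map<string, number>();
  for (const author of commitAuthors) {
    counts.set(author, (counts.get(author) ?? 0) + 1);
  }
  let topAuthor: string | null = null;
  let topCount = 0;
  counts.forEach((count, author) => {
    if (count > topCount) {
      topCount = count;
      topAuthor = author;
    }
  });
  const total = commitAuthors.length;
  return {
    distinctAuthors: counts.size,
    topAuthor,
    topAuthorShare: total === 0 ? 0 : topCount / total,
  };
}
```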

T3 Exit Cleanliness

Engineering Org Resilience

No single-author critical paths: git-blame concentration, CODEOWNERS coverage, and reviewer diversity across the codebase.

52% 9/10 scored
  • Critical-path bus factor 67%
    2/3 expected sites
  • Review diversity 83%
    2/2 expected sites
  • Ownership clarity 89%
    3/3 expected sites
  • Retained vs. departed knowledge 67%
    3/3 expected sites
  • Documentation density ("why") 100%
    1/1 expected sites
  • Operational runbooks 0%
    0/4 expected sites not present
  • Onboarding reproducibility 67%
    2/2 expected sites
  • Tests as executable knowledge 0%
    0/3 expected sites
  • Decision history legibility 0%
    0/1 expected sites
Critical-path bus factor 67%

The critical-path bus factor primitive is effectively present. Git-history signals (bus_factor) across critical directories (packages/core, packages/cli, packages/workflow, packages/frontend, packages/nodes-base, packages/@n8n) show many distinct contributors and no indication of bus-factor-1 gravity wells in the aggregated critical directories. Additionally, org ownership manifests (CODEOWNERS/OWNERS) cover critical backend/security areas, and core workflow execution is protected by substantial executable tests—together mitigating single-person knowledge concentration.

  • high

    Validate that CODEOWNERS entries for the most business-critical runtime paths (packages/core/* execution engine, workflow execution, and CLI auth/webhooks) are backed by sustained commit activity from at least 2 human authors over time (not just code ownership labels).

    • .github/OWNERS:1-233 — Ownership coverage exists, but it should be cross-checked against historical commit concentration to ensure no single-person gravity wells remain within those directories.
  • med

    Confirm there are meaningful tests (non-smoke) specifically for any other critical-path modules that are not as extensively covered as packages/core execution-engine (e.g., scheduling/queueing, webhook handling, and DB transaction/migration integrations).

Single-author hotspots N/A

The repo shows high-churn files, but the git-history hotspots signal did not identify any “danger=true” files (i.e., no files that are both high-churn and touched by only one or two distinct lifetime authors). Therefore, there are no confirmed single-author hotspot sites to audit further, and the primitive appears absent.

  • med

    Re-run the hotspots check with a longer lookback window (e.g., since: 24-36 months) to catch hotspots that are not currently within the default 12-month window.

  • low

    If you introduce a new frequently-changing subsystem, ensure ownership is shared via CODEOWNERS and that behavior is encoded in tests/docs so it cannot become a gravity well even if one author dominates commits temporarily.
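
The hotspot rule described above (high churn combined with only one or two distinct lifetime authors) reduces to a simple filter. This is a sketch under stated assumptions: the `FileHistory` shape is hypothetical and the default thresholds are illustrative, not the audit tool's actual settings.

```typescript
interface FileHistory {
  path: string;
  commitsInWindow: number; // churn within the lookback window
  lifetimeAuthors: string[]; // authors over the file's whole history
}

// Thresholds are illustrative assumptions, not the audit tool's settings.
function dangerHotspots(files: FileHistory[], minChurn = 20, maxAuthors = 2): string[] {
  return files
    .filter(
      (f) => f.commitsInWindow >= minChurn && new Set(f.lifetimeAuthors).size <= maxAuthors,
    )
    .map((f) => f.path);
}
```

Widening the lookback window only changes `commitsInWindow`; the author count is deliberately taken over the file's lifetime so that a file with many past contributors is not flagged.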

Review diversity 83%

The repo shows strong “review diversity” characteristics at the process level (high PR-based landing: pr_referenced_share=0.713; many human integrators merged changes: distinct_mergers_human=22). In-repo governance also supports this via ownership manifests (OWNERS and CODEOWNERS) that route review to a wider set of owner teams rather than a single gatekeeper.

  • high

    Ensure branch-protection / required-review settings (if present) mandate multiple approvers or rotating required reviewers for critical paths (e.g., core/workflow/db/auth). Ownership manifests help, but required-review rules are the enforcement layer that maximizes diversity.

  • med

    Audit that CODEOWNERS/OWNERS entries align with what actually changes most (hot/high-impact modules) and that owners in those entries remain active contributors; if not, update ownership routing so review context stays distributed.

    • .github/OWNERS:1-24 — OWNERS provides the mechanism for review routing; its effectiveness depends on the active presence of named owners across the highest-change modules.
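
A figure like `pr_referenced_share` can be approximated from commit subjects alone. The sketch below assumes that a commit counts as PR-referenced when its subject ends with a squash-merge suffix such as `(#123)` or starts with GitHub's default merge-commit prefix; the audit's exact definition is not stated, so treat both patterns as assumptions.

```typescript
// Assumed markers of a PR-landed commit: a trailing "(#123)" from squash
// merges, or GitHub's default "Merge pull request #123" merge subject.
const PR_REFERENCE = /\(#\d+\)\s*$|^Merge pull request #\d+/;

function prReferencedShare(subjects: string[]): number {
  if (subjects.length === 0) return 0;
  const referenced = subjects.filter((subject) => PR_REFERENCE.test(subject)).length;
  return referenced / subjects.length;
}
```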
Ownership clarity 89%

This codebase has an explicit ownership clarity primitive implemented via .github/OWNERS and .github/CODEOWNERS. The manifest declares ownership for key critical areas (including core/workflow and database migrations) and provides a default catch-all owner to avoid unowned paths.

  • high

    Verify that each critical-team entry used in .github/OWNERS/.github/CODEOWNERS corresponds to an actual team with >=2 active humans, and that those humans regularly commit in the declared areas (cross-check with bus-factor/history per critical path).

    • .github/OWNERS:1-200 — Manifest contains team-based ownership entries for critical paths; this should be validated against actual people/activity to ensure knowledge is not concentrated in a single individual.
  • med

    Add/ensure CODEOWNERS/OWNERS coverage for any remaining high-critical directories not explicitly listed (if any exist), and ensure the catch-all default aligns with the org’s intended escalation/ownership model.

    • .github/OWNERS:1-200 — Catch-all ownership exists, but completeness for every critical directory should be verified against the repo’s actual critical components.
Retained vs. departed knowledge 67%

The codebase shows evidence of the retained-vs-departed knowledge primitive via an explicit ownership/coverage mechanism (.github/OWNERS and .github/CODEOWNERS). This helps ensure critical areas are not trapped as single-person knowledge. However, history-based signals indicate substantial “departed authorship share” (recency-based), so while ownership coverage exists, additional durability artifacts (e.g., runbooks/ADRs) are absent and could leave operational/decision context vulnerable.

  • high

    Add operational runbooks and migration/upgrade runbooks for the DB/migrations and deployment workflows (since artifacts-mode reported runbook/adr categories as absent). Ensure they are referenced from codeowners/OWNERS areas and updated alongside changes.

    • .github/CODEOWNERS:1-6 — Migrations and workflows are covered by CODEOWNERS, but there is no corresponding runbook/ADR category present in tracked org-doc artifacts (per artifacts scan), which is a typical gap for retained knowledge durability.
  • med

    For the most critical packages (those under packages/core, packages/workflow, packages/@n8n/db, packages/@n8n/utils, packages/@n8n/engine-like components), add decision records (ADRs) for major architectural choices so knowledge doesn’t reside only in authors’ heads.

    • .github/OWNERS:1-40 — OWNERS maps many critical packages to owner groups, but artifacts-mode indicated ADRs are absent; ADRs are the durable complement to ownership for retained-vs-departed knowledge.
  • low

    Periodically audit ownership mappings for “gravity well” risk: check that each critical package has multiple distinct lifetime owners (not just one group alias) and that owners actually touch the code (not only review assignment).

    • .github/OWNERS:1-60 — OWNERS provides the manifest to audit; performing a regular validation reduces the chance that departed authorship concentrates in a way the manifest doesn’t reflect.
Documentation density ("why") 100%

The repo contains strong architecture/“why” documentation in at least one critical area (the @n8n/agents Agent Runtime). However, the core @n8n/engine v2 package currently lacks the same level of durable architecture/why documentation (it describes itself as a scaffold), indicating an opportunity to add architecture rationale as the engine evolves.

  • high

    Create/expand durable architecture (“why”) docs for packages/@n8n/engine v2: execution-loop design, invariants/assumptions, key tradeoffs, lifecycle/state machine, and operational intent (what correctness means for the engine).

    • packages/@n8n/engine/README.md:1-15 — Identifies @n8n/engine as the workflow execution engine (v2) but states it is currently a scaffold with public API/core interfaces still being defined—i.e., the architectural “why” documentation is not yet present at the same depth.
  • med

    Add cross-links from engine v2 public entry points to the architecture/why doc sections (eventing/state machine/constraints) so rationale stays discoverable for new contributors and operators.

Operational runbooks 0%

Operational runbooks do not exist anywhere in the repository as tracked org-doc runbook artifacts (the `runbook` bucket is absent). While ownership manifests exist, the codebase lacks written, service-specific operational runbooks covering deploy, incident response, and recovery—creating a risk of a gravity well during outages.

  • high

    Create a runbook document set under the repo’s ops/doc structure for each critical service area (at minimum: core execution, workflow layer, database/migrations, and CLI deployment/operations). Each runbook should include: (1) how to deploy, (2) incident triage/checklist, (3) recovery + rollback steps, (4) links to dashboards/logs/alerts and exact commands.

    • .github/OWNERS:18-32 — Critical areas are explicitly owned here (core/workflow/db/cli), indicating where operational runbooks should exist.
  • high

    Assign runbook CODEOWNERS (or equivalent) that match the critical ownership set, and ensure at least two people actively update each runbook after incidents or operational changes (to reduce knowledge concentration).

    • .github/OWNERS:1-6 — Ownership governance exists (OWNERS/CODEOWNERS). Use it to make runbooks maintainable by more than one person.
  • med

    Add lightweight smoke verification for runbooks: e.g., a monthly checklist test that validates the documented recovery commands against a staging environment (or verifies links/commands are still correct).

    • CONTRIBUTING.md:1-120 — The repo has an established contribution process and operational documentation culture; use it to formalize runbook verification steps.
Onboarding reproducibility 67%

Onboarding reproducibility is present primarily via CONTRIBUTING.md, which documents requirements, a clean-clone development setup (pnpm install → pnpm build → pnpm start), and a reproducible dev workflow for iterating and validating changes. Implementation is good and concrete, but there isn’t evidence (from on-graph/code checks) of a single canonical “one command from clean clone” script; it’s documented as an explicit command sequence.

  • high

    Add/ensure a single canonical bootstrap command (e.g., `pnpm install && pnpm build && pnpm start` wrapped into one script) and document it prominently at the top of the onboarding section, so the ramp-up loop can be executed with minimal command coordination.

    • CONTRIBUTING.md:110-170 — Current onboarding reproducibility relies on a documented sequence: `pnpm install` → `pnpm build` → `pnpm start`, rather than a single one-command bootstrap.
  • med

    Add a ‘fresh clone checklist’ section that cross-links prerequisites and environment setup (devcontainer, Node/pnpm versions, and `.env.local` handling) and explicitly names the expected state at the end (e.g., backend running + frontend accessible).

    • CONTRIBUTING.md:1-120 — Requirements and dev container guidance exist, and environment-variable instructions exist later, but the doc would benefit from an explicit end-to-end checklist tying them together into a single reproducibility story.
Tests as executable knowledge 0%

The primitive is present: the repo contains substantial test suites that function as executable specifications, including security-critical execution-context behavior and core workflow execution correctness. These tests are meaningfully asserted (not just smoke/import checks), indicating strong “executable knowledge” usage on critical paths.

  • high

    Ensure any newly added or refactored critical execution/agent paths always include intent-pinning tests (edge cases + invariants), especially around encryption/decryption and workflow execution ordering/output shape.

  • med

    For the workflow execution engine and agent orchestration modules, require PRs to reference/extend the existing executable tests when changing behavior, so behavior changes remain discoverable in test diffs.

Decision history legibility 0%

History legibility is partially supported: git history shows a low rate of low-effort commit subjects and a non-trivial share of commits with explanatory bodies, and some modules include rationale in docs. However, durable decision records (ADRs) appear absent across the repo (artifacts mode reported 0 ADR files), so the primitive is not fully implemented where it should be most critical (major architectural decisions).

  • high

    Introduce ADRs (or an ADR-equivalent decision record format) for major architectural milestones (e.g., Engine 2.0 wiring decisions). Ensure each ADR links to the relevant PRs/commits so that the 'why' can be recovered from history even if commit archaeology is needed.

    • packages/@n8n/engine/README.md:1-15 — Engine 2.0 rationale is present, but there is no visible durable decision record artifact (ADRs absent repo-wide per artifacts scan), which is the key gap for decision_history_legibility.
  • med

    For changes that replace/retire architectural components, enforce commit/body guidelines (e.g., require a short 'why/impact' section for non-trivial design changes) to complement the lack of ADRs and increase recoverability from git history.

    • CONTRIBUTING.md:1-120 — Contribution guidance exists; extend/enforce guidance specifically for design-change commit messages/bodies so intent is consistently captured in history.

Not applicable to this codebase: Single-author hotspots.

IP & OSS License Hygiene

An SBOM in CI, no AGPL/GPLv3 in the dependency tree, CVEs triaged by severity, and no outside-contributor commits without IP assignment.

68% 9/12 scored
  • License compliance 0%
    0/2 expected sites
  • Known-vulnerability scan 0%
    0/3 expected sites not present
  • Dependency usage & reachability 92%
    4/4 expected sites
  • Dependency freshness 56%
    2/3 expected sites
  • Upstream maintenance 100%
    1/1 expected sites
  • Remediation velocity 100%
    2/2 expected sites
  • Supply-chain integrity 133%
    4/3 expected sites
  • Dependency-confusion resistance 67%
    3/3 expected sites
  • IP ownership / provenance 67%
    2/2 expected sites
Software bill of materials N/A

No evidence was found in the codebase for generation/publication of an SBOM (e.g., via syft, cyclonedx/cdxgen) as part of CI/release. While the repo clearly maintains lockfiles (pnpm-lock.yaml and multiple uv.lock files) indicating dependency pinning, this audit did not locate any SBOM-generation step wired into GitHub workflows or release scripts.

  • high

    Add an SBOM generation step to the release/CI pipeline (preferably producing a CycloneDX or SPDX artifact during build). Ensure it includes transitive dependencies and is uploaded as a build artifact and/or published alongside release assets.

    • pnpm-lock.yaml:1-40 — Lockfile pinning exists (pnpm lockfile present), which is necessary for an accurate SBOM, but this file alone is not proof that an SBOM is generated in CI/release.
  • med

    Add an SBOM freshness check: fail the pipeline if an SBOM artifact is missing or does not match the current lockfile resolution (or at least ensure the SBOM is regenerated on every release build).

    • renovate.json:1-95 — Dependency-update automation is present (Renovate config), but there is no indication here of SBOM generation/verification as part of the release process.
License compliance 0%

License compliance is not satisfactorily applied for this codebase’s proprietary SaaS risk profile: the transitive dependency license scan flags at least one network-copyleft dependency (@zone-eu/mailsplit, EUPL-1.1+) and one strong-copyleft dependency (jszip, GPL-3.0-or-later). The required mitigation/justification steps for these specific dependencies are not evidenced in the inspected lockfile contents.

  • high

    Remove or replace the network-copyleft dependency @zone-eu/mailsplit@5.4.8 (EUPL-1.1+ OR MIT) from the dependency graph (preferably via direct dependency changes/upstream alternatives). If it must remain, produce a deal-legal justification package (license texts/NOTICE handling, distribution/SaaS risk analysis, and internal policy sign-off).

  • high

    Remove or replace the strong-copyleft dependency jszip@3.10.1 (GPL-3.0-or-later OR MIT) from the dependency graph. If relying on the MIT alternative, document and validate that the distributed artifacts are actually under MIT terms (and that the GPL path is not applicable).

  • med

    Add/verify CI enforcement that fails the build on strong/network-copyleft transitive dependencies unless an explicit allowlist + justification exists. Ensure the compliance report (SBOM/license inventory + NOTICE artifacts) is generated for release artifacts.
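
Both flagged packages carry dual-license expressions (`... OR MIT`), which is why the recommendations distinguish between removal and a documented election of the MIT alternative. A CI gate could encode that distinction roughly as follows. This is a minimal sketch: the deny-list prefixes and verdict names are assumptions, and it handles only flat `A OR B` SPDX expressions, deferring anything with `AND`, `WITH`, or parentheses to manual review.

```typescript
type LicenseVerdict = "allowed" | "allowed-by-election" | "blocked" | "needs-review";

// Illustrative deny-list prefixes; a real policy would come from legal review.
const COPYLEFT_PREFIXES = ["GPL-", "AGPL-", "EUPL-"];

function isCopyleft(licenseId: string): boolean {
  return COPYLEFT_PREFIXES.some((prefix) => licenseId.startsWith(prefix));
}

function evaluateLicense(expression: string): LicenseVerdict {
  // Compound expressions are out of scope for this sketch.
  if (/\b(AND|WITH)\b|[()]/.test(expression)) return "needs-review";
  const alternatives = expression.split(/\s+OR\s+/).map((id) => id.trim());
  const copyleft = alternatives.filter(isCopyleft);
  if (copyleft.length === 0) return "allowed";
  // An OR expression lets the licensee pick one alternative, so a permissive
  // option is usable, but the election should be recorded explicitly.
  if (copyleft.length < alternatives.length) return "allowed-by-election";
  return "blocked";
}
```

Treating `allowed-by-election` as a separate verdict lets the pipeline pass while still requiring an allowlist entry that names the elected license.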

Known-vulnerability scan 0%

A dependency vulnerability scan over the repo’s lockfiles (OSV/GHSA via osv-scanner) reports many HIGH/CRITICAL findings (e.g., handlebars 4.7.8, @babel/traverse 7.21.4), but I did not find evidence in the checked-in code (via the accessible repo entrypoints) of a corresponding CI “known-vulnerability scan” primitive that triages/remediates these findings. Based on the limited wiring evidence, treat implementation as missing/unclear.

  • high

    Add/ensure a CI job that runs a lockfile vulnerability scan (OSV via osv-scanner or equivalent), fails the build on untriaged HIGH/CRITICAL issues, and requires explicit triage/exception records tied to the exact pinned lockfile versions.

    • package.json:1-120 — No top-level CI/dependency vulnerability scan script is evident in the repository’s primary scripts entrypoint.
  • high

    Create triage workflow rules for each HIGH/CRITICAL finding: either upgrade the vulnerable dependency to a fixed version in the relevant workspace lockfile(s), or document an exception with justification and (where possible) demonstrate reachability for the vulnerable API.

    • pnpm-lock.yaml:1-60 — Lockfile-based anchoring is required for this primitive; vulnerabilities are reported per pinned version and must be resolved/triaged against those pins.
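
The gate described above (fail on untriaged HIGH/CRITICAL findings, with exceptions tied to exact pinned versions) can be sketched as a pure function over scan results. The `VulnFinding` and `TriageRecord` shapes are hypothetical and do not mirror osv-scanner's output format; a real job would map the scanner's report into them.

```typescript
type Severity = "LOW" | "MODERATE" | "HIGH" | "CRITICAL";

// Hypothetical shapes for this sketch, not osv-scanner's output schema.
interface VulnFinding {
  id: string; // advisory identifier
  packageName: string;
  version: string; // exact version pinned in the lockfile
  severity: Severity;
}

interface TriageRecord {
  id: string;
  packageName: string;
  version: string; // an exception only covers this exact pinned version
  justification: string;
}

function triageKey(id: string, packageName: string, version: string): string {
  return `${id}|${packageName}@${version}`;
}

// Returns the findings that should fail the build.
function untriagedBlockers(findings: VulnFinding[], triage: TriageRecord[]): VulnFinding[] {
  const triaged = new Set(triage.map((t) => triageKey(t.id, t.packageName, t.version)));
  return findings.filter(
    (f) =>
      (f.severity === "HIGH" || f.severity === "CRITICAL") &&
      !triaged.has(triageKey(f.id, f.packageName, f.version)),
  );
}
```

Because the key includes the version, bumping a dependency to another vulnerable release invalidates the old exception and forces a fresh triage decision.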
Known-exploited CVEs N/A

The “known-exploited CVEs” primitive is satisfied at the dependency level: the osv-scanner check against the known-exploited CVE set reports known_exploited_count = 0 for this repo (i.e., no vulnerability in the pinned dependency graph appears in the set of CVEs known to be actively exploited). However, I did not independently anchor any specific dependency entries to exact lockfile line ranges via code_read, so the implementation-quality grade is not perfect.

  • high

    Confirm the result by pinpointing in each lockfile the exact pinned versions (and verify none match the known-exploited aliases) for the top CVE candidates returned by osv-scanner, using code_read to cite the specific lockfile lines.

  • med

    Ensure CI enforces this primitive (e.g., a step that runs osv-scanner with a known-exploited/“known vulnerability scan” gate) rather than relying on ad-hoc manual scans.

Dependency usage & reachability 92%

For this primitive, dependency reachability is evidenced on key call paths: the public API middleware layer uses `express` and swagger UI helpers, and the HTTP/request utilities and OAuth2 identifier resolvers actively call into `axios` rather than merely importing it for types. This indicates the codebase correctly exercises important dependencies in runtime flows (no clear 'imported but never called' anti-patterns observed in the sampled high-risk spots).

  • high

    Run/extend the dependency reachability checks across the full manifest-vs-imports set: identify declared-but-never-imported and phantom imports, then confirm call-site reachability for any libraries with known high impact (HTTP, auth, crypto, templating).

    • packages/cli/src/public-api/index.ts:1-120 — Example of a high-risk dependency reachability site already exercised (express + swagger UI). This pattern should be systematically checked for the rest of the dependency set.
  • med

    For each externally imported high-risk dependency (e.g., `express`, `axios`, `swagger-ui-express`), record the most important call-site files/functions and use them to rank CVE remediation blast-radius (hot-path first).

Dependency freshness 56%

This codebase has the two key components of dependency freshness hygiene: (1) a committed pnpm lockfile that pins exact versions with integrity hashes, and (2) a Renovate configuration that enables OSV vulnerability alerts to drive regular dependency updates. However, the lockfile also contains at least one explicitly deprecated package entry (superagent), and a large transitive set has vulnerability findings, so freshness remediation appears to be ongoing rather than fully caught up.

  • high

    Prioritize upgrading transitive dependencies with the highest severities (especially CRITICAL/HIGH) from the OSV scan results, and confirm via reachability/usage that vulnerable code paths are exercised where applicable. Start with the most central UI/build dependencies (e.g., handlebars and babel traverse) that are heavily reused.

    • pnpm-lock.yaml:20000-20080 — Lockfile contains an explicitly deprecated dependency entry (superagent), demonstrating that at least some staleness/deprecation remediation is still pending.
  • high

    Add/verify CI enforces freshness controls: ensure there is a repeatable security/dependency freshness job (e.g., osv-scanner run against lockfiles, and/or a SBOM generation check) that fails when the OSV vulnerability set crosses a defined threshold.

    • renovate.json:1-95 — Renovate alerts are enabled, but freshness in CI still needs an enforced gate (not just notification). This file is the mechanism currently present.
  • med

    Review lockfile maintenance settings and Renovate grouping so that major/minor updates don’t stall in review queues; ensure the update cadence is sustained across all workspaces (root plus nested pnpm-lock.yaml files).

    • renovate.json:1-95 — Renovate is configured with grouping and disabled lockFileMaintenance; tuning may be needed to keep multi-workspace pnpm dependencies current.
Upstream maintenance 100%

Upstream maintenance is implemented via an active Renovate dependency-update mechanism (including vulnerability alerting). However, this audit run does not provide direct “upstream abandoned/deprecated” flags from dependency metadata, so the finding is based on the presence of an ongoing update mechanism rather than proving every critical upstream is still actively maintained.

  • high

    Add/verify an explicit “deprecated/abandoned upstream” signal in CI or dependency scanning results (e.g., fail the build or create tracked issues when osv-scanner flags deprecated packages). Current evidence shows vulnerability alerting and dependency updating, but not a dedicated deprecated-upstream replacement gate.

    • renovate.json:1-95 — Renovate configuration shown, but it is oriented to updates and vulnerability alerts; there is no evidence here of an automated abandoned/deprecated-upstream policy.
  • med

    Ensure the dependency-updater actually keeps moving for the relevant ecosystems by periodically reviewing merged dependency-update PR rates (remediation velocity) and scheduling policy.

    • renovate.json:1-95 — Mechanism exists; continue monitoring merged PR velocity to confirm upstream maintenance is operational, not just configured.
Remediation velocity 100%

Remediation velocity is implemented: Renovate is configured for automated dependency updates with vulnerability alerting (renovate.json). Additionally, the repo includes a PR approval + auto-merge workflow used to ensure dependency update PRs can be merged when CI checks pass (util-approve-and-set-automerge.yml). Off-graph git-history evidence indicates the dependency-update mechanism is active with non-zero recent merged update activity.

  • high

    Add/verify an explicit CI gate or documentation that ties Renovate-created dependency PRs to the approve-and-automerge path (so the velocity signal remains reliable over time).

  • med

    Ensure vulnerability alerts are treated as merge-blocking (for HIGH/CRITICAL) or have a tracked SLA, so the velocity mechanism translates into timely remediation outcomes.

    • renovate.json:1-95 — Vulnerability alerts are enabled (including OSV alerts), but this should be paired with an explicit operational policy/SLA in CI or workflows.
Supply-chain integrity 133%

Supply-chain integrity is present: the repo commits pnpm lockfiles with `integrity` hashes, and CI installs with `--frozen-lockfile` (preventing lockfile drift). Python environments also use uv.lock files with explicit sha256 hashes. Implementation quality is solid overall, but full coverage across all workflows/sites can’t be proven from the evidence read so far (only specific workflows/lockfiles were directly inspected).

  • high

    Verify that every CI/CD job that installs Node dependencies (pnpm) uses `--frozen-lockfile` and points to the intended lockfile (root vs per-package). Sample-check other install workflows (not just `release-update-pointer-tag.yml`).

  • med

    Confirm Python install steps for every uv.lock consumer use hash-verified installation (and don’t allow resolution to regenerate lockfiles without review).

  • low

    Optionally add explicit supply-chain provenance checks in CI (e.g., verifying install logs against lockfile, or enabling package manager integrity enforcement flags) if not already globally configured.
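
The workflow sample-check recommended above can be automated with a line-based scan of workflow files. This is a rough sketch that assumes the workflow YAML is available as text and that install commands sit on a single line; it does not parse YAML, and the function name is hypothetical.

```typescript
// Matches "pnpm install" and the "pnpm i" shorthand, but not e.g. "pnpm import".
const PNPM_INSTALL = /\bpnpm\s+(install|i)\b/;

// Returns "path:line" locations of pnpm installs lacking --frozen-lockfile.
function unfrozenInstalls(workflowPath: string, yamlText: string): string[] {
  const hits: string[] = [];
  yamlText.split("\n").forEach((text, index) => {
    const trimmed = text.trim();
    if (trimmed.startsWith("#")) return; // skip YAML comment lines
    if (PNPM_INSTALL.test(trimmed) && !trimmed.includes("--frozen-lockfile")) {
      hits.push(`${workflowPath}:${index + 1}`);
    }
  });
  return hits;
}
```

Installs performed inside composite actions or shell scripts called from a workflow will not be seen by this scan and need a separate pass.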

Dependency-confusion resistance 67%

Dependency-confusion resistance is implemented at the dependency-resolution tooling layer: the repo pins pnpm as the package manager, includes a committed pnpm lockfile, and blocks npm installs via a preinstall script. This provides meaningful (but not fully proven here) resistance against slopsquatting/confusion by ensuring resolution is deterministic and uses the intended installer.

  • high

    Also verify that the lockfile contains integrity hashes for resolved packages (and that CI uses the lockfile with a frozen/immutable install), so name-to-version resolution is truly deterministic for all workspaces.

    • pnpm-lock.yaml:1-11 — Lockfile presence is confirmed; however, deterministic integrity pinning should be confirmed by reading specific sections where integrity/resolved hashes are stored.
  • med

    Audit all workspace manifests for any unscoped private package names (and ensure private packages are always namespace-scoped like `@org/...`), since slopsquatting risk is highest for unscoped/private-like names.

    • package.json:1-14 — Manifest-level enforcement exists for the package manager, but an explicit scan/read of all package manifests for unscoped private names is required to fully close the dependency-confusion gap.
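
The manifest audit suggested above amounts to listing workspace packages that are marked private yet carry an unscoped name. A minimal sketch, assuming each `package.json` has already been parsed into an object; treating `private: true` as the marker of an internal package is an assumption of this example.

```typescript
// Only the two manifest fields this check needs.
interface ManifestSketch {
  name?: string;
  private?: boolean;
}

// Returns names of private packages that are not namespace-scoped ("@org/...").
function unscopedPrivateNames(manifests: ManifestSketch[]): string[] {
  const names: string[] = [];
  for (const manifest of manifests) {
    const name = manifest.name;
    if (manifest.private === true && typeof name === "string" && !name.startsWith("@")) {
      names.push(name);
    }
  }
  return names;
}
```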
IP ownership / provenance 67%

The primitive is present: n8n documents and provides a Contributor License Agreement, stating that PRs can only be merged after the CLA is signed. However, based on the evidence collected so far, I can confirm the existence of the CLA artifact and how it is described in contributing docs; I did not find (in the snippets reviewed) the concrete bot/workflow enforcement code that triggers CLA comments and blocks merges (so the implementation certainty is slightly below perfect). Git history shows many authors, but without an employee roster in this audit run, individual unassigned-IP candidates cannot be conclusively separated from properly CLA-covered contributors.

  • high

    Confirm the CLA enforcement mechanism is actually wired in CI/GitHub (e.g., a specific GitHub App/bot or workflow that posts the CLA request and blocks merge until signature). Read the relevant workflow(s) or bot configuration to verify enforcement is operational, not only documented.

    • CONTRIBUTING.md:520-526 — States enforcement intent (bot comment + merge gating), which should be verified by finding the actual workflow/bot configuration doing it.
  • med

    Run an authorship-to-ownership validation with a provided roster (current employees/contractors emails) to identify any off-roster human contributors and then verify that CLA coverage exists for those contributors (or that an employment/assignment agreement exists).

    • N/A (tool output):N/A — git_dep_provenance authorship listing is available but a roster was not provided, so unassigned-IP candidates cannot be reliably flagged in this run.
AI-coding-tool provenance N/A

No AI-coding-tool provenance tracking was identified in the codebase (e.g., no file/headers/markers for AI-generated code, no Co-authored-by / provenance trailers, and no AI-usage/provenance policy artifacts found via targeted searches for provenance-related naming).

  • high

    Add an explicit, repo-wide AI provenance convention and enforcement: e.g., require a machine-readable provenance trailer or header (with tool/model + run id + prompt hash) for any AI-generated/assisted code, and document it in a policy file; update CI to validate markers on PRs.
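
A minimal sketch of the CI-side check this recommendation describes, assuming a hypothetical commit trailer named `AI-Provenance` with `tool`, `run`, and `prompt-sha256` fields. Neither the trailer name nor its fields exist in the repo today; they are illustrative only.

```typescript
// Hypothetical convention (not present in the repo): AI-assisted commits carry
//   AI-Provenance: tool=<name>; run=<id>; prompt-sha256=<64 hex chars>

interface AiProvenance {
  tool: string;
  run: string;
  promptSha256: string;
}

type TrailerCheck =
  | { status: "absent" }
  | { status: "valid"; provenance: AiProvenance }
  | { status: "invalid"; reason: string };

const TRAILER_PREFIX = "AI-Provenance:";

function checkProvenanceTrailer(commitMessage: string): TrailerCheck {
  const line = commitMessage
    .split(/\r?\n/)
    .map((l) => l.trim())
    .find((l) => l.startsWith(TRAILER_PREFIX));
  if (line === undefined) return { status: "absent" };

  // Parse "key=value; key=value" pairs after the prefix.
  const fields = new Map<string, string>();
  for (const part of line.slice(TRAILER_PREFIX.length).split(";")) {
    const eq = part.indexOf("=");
    if (eq === -1) continue;
    fields.set(part.slice(0, eq).trim(), part.slice(eq + 1).trim());
  }

  const tool = fields.get("tool") ?? "";
  const run = fields.get("run") ?? "";
  const promptSha256 = fields.get("prompt-sha256") ?? "";

  if (tool === "" || run === "") {
    return { status: "invalid", reason: "tool and run are required" };
  }
  if (!/^[0-9a-f]{64}$/.test(promptSha256)) {
    return { status: "invalid", reason: "prompt-sha256 must be 64 lowercase hex characters" };
  }
  return { status: "valid", provenance: { tool, run, promptSha256 } };
}
```

A CI job could run this over every commit in a PR and fail on `invalid`. Whether `absent` should also fail is a policy decision, for example only when the PR is declared AI-assisted.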

Not applicable to this codebase: Software bill of materials, Known-exploited CVEs, AI-coding-tool provenance.

Implementation & Customization

Configuration over per-customer branches: no "if customer_id == 12345", no pricing literals scattered outside the billing module.

94% 4/10 scored
  • Feature gating via flags, not forks 92%
    4/4 expected sites
  • Documented extension interface 83%
    4/4 expected sites
  • Customization isolation & upgrade safety 100%
    2/2 expected sites
  • Onboarding-by-configuration cost 100%
    3/3 expected sites
Configuration over code branches N/A

No clear implementation of “configuration over code branches” for tenant/customer/brand variation was found in this codebase. Queries for per-customer/per-tenant code directories and customization override directories under typical paths (customers/, tenants/, overrides/, custom/) returned no matches, suggesting customer-specific behavior is not being handled via a tenant-driven config layer in this repo (at least not in the indexed code/data paths).

  • high

    Re-run this audit including the repo’s runtime “instance/tenant” configuration sources (e.g., environment-based config, database-driven settings, node/workflow type registry, or brand/EE gating). The current scan focused on directory patterns for multi-tenant customization and only surfaced generic, app-internal mappings/constants—not tenant config layers.

  • med

    Search for actual tenant/customer identifiers used to gate behavior (then verify whether behavior branches on identifiers vs attributes/entitlements/settings retrieved from config/DB). If behavior is attribute-driven, the primitive may exist under different naming than the “tenants/customers/overrides” directory patterns.

No hardcoded customer branching N/A

I did not find any hardcoded customer/tenant/org/account identity branching in the inspected areas. For example, where an identity field like `customerId` is used (e.g., constructing Stripe API endpoints), it is treated as a passed-in/customer-provided parameter rather than a branch key (i.e., no special-casing like `if customerId === <literal>` was observed in the checked code).

Centralized pricing/plan logic N/A

I did not find any dedicated, centralized “pricing/plan logic” module in this codebase. The frontend consumes plan/limit data from cloud APIs (e.g., `/admin/cloud-plan` and `/cloud/limits`) and then computes remaining usage for display, but there is no single local module where pricing/discount/plan rules/constants are defined and reused across the system. There also appears to be no in-repo billing/pricing module (e.g., no `/billing`, `/pricing`, `/subscriptions` code directories detected), so the expected centralized pricing logic primitive is effectively absent from this repository.

  • high

    If pricing/plan rules (discounts, tiers, entitlements, limits) are expected to vary by plan, consolidate them into a single billing/pricing module in-repo (e.g., `/packages/<...>/pricing/` or `/packages/<...>/billing/`) and ensure controllers/UI consume derived results from that module instead of duplicating calculations.

Metering decoupled from pricing model N/A

I did not find an implementation of “metering decoupled from the pricing model” anywhere in this codebase. While the repo contains token usage/cost concepts for the agents runtime (e.g., token usage types and `computeCost()` converting usage to USD using model cost data), the code does not show a generic usage/metering capture layer that is later mapped to charges by an independent billing/pricing module. In other words, usage and pricing are implemented together rather than via a decoupled metering→billing mapping boundary.

  • high

    If the product requires customer billing/plan-based metering: introduce a dedicated metering layer that records generic usage events (e.g., token counts or execution counters) independent of any plan/price model, then implement a separate billing/pricing mapper that turns those usage events into charges. Keep the execution engine unaware of USD conversion/pricing constants.
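
The boundary described above can be sketched as follows. This is an illustration under stated assumptions, not n8n's code: `UsageLedger`, `RateCard`, and `priceUsage` are hypothetical names, and the rates in the usage example are made up.

```typescript
// Metering side: records neutral usage facts. It knows nothing about plans or prices.
const METRICS = ["input_tokens", "output_tokens", "executions"] as const;
type Metric = (typeof METRICS)[number];

interface UsageEvent {
  subjectId: string; // hypothetical: whichever instance, project, or account is billed
  metric: Metric;
  quantity: number;
}

type UsageTotals = Record<Metric, number>;

class UsageLedger {
  private readonly events: UsageEvent[] = [];

  record(event: UsageEvent): void {
    this.events.push(event);
  }

  totalsFor(subjectId: string): UsageTotals {
    const totals: UsageTotals = { input_tokens: 0, output_tokens: 0, executions: 0 };
    for (const e of this.events) {
      if (e.subjectId === subjectId) totals[e.metric] += e.quantity;
    }
    return totals;
  }
}

// Pricing side: a separate mapper from usage totals to a charge.
// Rates are integer micro-USD per unit so the arithmetic stays exact.
type RateCard = Record<Metric, number>;

function priceUsage(totals: UsageTotals, rates: RateCard): number {
  let microUsd = 0;
  for (const metric of METRICS) microUsd += totals[metric] * rates[metric];
  return microUsd;
}
```

Because the ledger stores only quantities, the same recorded usage can be priced under two different rate cards, or re-priced after a plan change, without touching the execution path.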

Feature gating via flags, not forks 92%

This codebase uses feature gating via flags/entitlements in multiple key layers (backend module initialization, centralized flag evaluation via PostHog with env overrides, and frontend route gating/variant checks). The gating is done by enabling/disabling the same code paths via flags, rather than forking plan/tenant-specific implementations.

  • med

    Where feature checks appear inline in route handlers/components, ensure they ultimately depend on a centralized flag/entitlement source (e.g., the same flag resolution pipeline used elsewhere) to keep flag governance consistent and retire flags cleanly.
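
A minimal sketch of what "one resolution pipeline" could look like, assuming environment overrides take precedence over the remote flag value. `RemoteFlagSource`, `FlagResolver`, and the `FLAG_<NAME>` override naming are hypothetical; they are not PostHog's or n8n's actual API.

```typescript
// Stand-in for the remote flag client (the audit names PostHog).
// This interface is an assumption made for the example.
interface RemoteFlagSource {
  isEnabled(flag: string, subjectId: string): boolean;
}

type EnvLike = Readonly<Record<string, string | undefined>>;

class FlagResolver {
  private readonly remote: RemoteFlagSource;
  private readonly env: EnvLike;

  constructor(remote: RemoteFlagSource, env: EnvLike) {
    this.remote = remote;
    this.env = env;
  }

  isEnabled(flag: string, subjectId: string): boolean {
    // Hypothetical override naming: FLAG_<UPPER_SNAKE_NAME>=true|false
    const key = `FLAG_${flag.toUpperCase().replace(/[^A-Z0-9]/g, "_")}`;
    const override = this.env[key];
    if (override === "true") return true;
    if (override === "false") return false;
    return this.remote.isEnabled(flag, subjectId);
  }
}

// Inline checks in handlers and components delegate here instead of reading
// environment variables or the remote client directly.
function assertFeature(resolver: FlagResolver, flag: string, subjectId: string): void {
  if (!resolver.isEnabled(flag, subjectId)) {
    throw new Error(`feature "${flag}" is not enabled for ${subjectId}`);
  }
}
```

With every check routed through one resolver, retiring a flag becomes a search for a single call signature rather than an audit of ad-hoc conditionals.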

Documented extension interface 83%

This codebase contains documented extension interfaces, most clearly via the `@ContextEstablishmentHook()` / `IContextEstablishmentHook` contract (a self-describing, version-ready hook interface with decorator-based discovery and DI registration). There is also an extension mechanism for expression runtime functions (`extend` + `ExtensionMap`) and a config-driven entry point for external lifecycle hooks (`EXTERNAL_HOOK_FILES`). Overall, extension boundaries exist and appear upgrade-safe, but not all extension points are equally “public plugin contracts” (expression runtime looks more internalized than the context hook system).

  • high

    Verify and document the full external lifecycle hook loading path for `EXTERNAL_HOOK_FILES` (where files are loaded, how hook contracts are validated, and how versions/compatibility are handled) so this config surface becomes a truly stable documented extension interface rather than just a config knob.

  • med

    For expression extensions, ensure there is a clearly documented public contract for third-party/partner extension authors (how to register new `ExtensionMap` entries, how docs/metadata are provided, and how compatibility is maintained across upgrades).

  • low

    Add explicit versioning/deprecation guidance to the documented hook interface documentation for context-establishment hooks (if not already present in the runtime registry) to strengthen upgrade safety for external implementers.

Customization isolation & upgrade safety 100%

The codebase contains clear customization isolation boundaries via (1) the expression runtime extension interface (extend/extendOptional with security blocklisting) and (2) a workflow-builder plugin registry abstraction. Both funnel custom behavior through stable, bounded contracts rather than per-customer code forks, supporting upgrade safety.

  • high

    Document and enforce the stability guarantees of the expression extension boundary and plugin registry (e.g., versioning expectations, compatibility rules, and deprecation workflow) so third-party/custom code is upgraded predictably alongside core.

  • med

    Add automated regression tests that simulate “custom plugin/extension + core upgrade” scenarios (e.g., plugin registration, resolution, serializer lookup order, and extension security constraints) to prove isolation over time.

Theming / white-label as config N/A

I did not find a theming/white-label system that is driven by tenant/customer configuration (i.e., serving the next brand via config rows rather than code changes). The front-end contains UI/editor theming implemented as code-defined themes (e.g., CodeMirror and ag-grid theme objects using CSS variables), which indicates styling configuration via build/runtime CSS variables rather than a white-label “theme selection + variant data” layer.

Tenant-configurable behavior surface N/A

I did not find a tenant-configurable behavior surface implemented as a customer-facing settings/rules model that governs business behavior (workflows/fields/rules/limits) per tenant. The only clear “tenant” concept in the inspected code is infrastructure/config for licensing (e.g., tenantId env var), not a data-driven behavior extension surface for customers.

  • high

    Do a targeted repo-wide check for a tenant-scoped settings/rules model (e.g., DB tables or config loaders for per-tenant feature rules/limits/workflow behaviors). If none exists, confirm whether this platform is intentionally single-tenant (per instance) and document the expected customization boundary; otherwise, introduce a formal configuration surface rather than behavior branching in code.

Onboarding-by-configuration cost 100%

This codebase supports low-touch, configuration-driven onboarding for SSO provisioning/role management via a dedicated provisioning module. Admins can patch provisioning config through an API, the service validates and persists config into a settings store, and startup bootstrapping loads provisioning config without code edits—indicating marginal onboarding cost is primarily configuration, not engineering.

Not applicable to this codebase: Configuration over code branches, No hardcoded customer branching, Centralized pricing/plan logic, Metering decoupled from pricing model, Theming / white-label as config, Tenant-configurable behavior surface.

Procurement Code Readiness

Data-export and data-subject erase/export endpoints, region pinning, and DPA-mapped controls that survive enterprise procurement.

11% 6/10 scored
  • Self-serve trust documentation 0%
    0/1 expected sites
  • Controls-to-contract mapping 0%
    0/1 expected sites not present
  • Data export mechanism 0%
    0/4 expected sites not present
  • Deletion / erase-on-request 67%
    1/1 expected sites
  • Enterprise access controls 0%
    0/2 expected sites not present
  • Reliability / SLA evidence 0%
    0/2 expected sites not present
Self-serve trust documentation 0%

The repo contains a committed security document (SECURITY.md), but it is not a self-serve trust documentation set. It only covers reporting a security vulnerability and does not package the broader trust/compliance artifacts prospects typically need for procurement diligence (certifications/attestations, DPA/sub-processor transparency, pen-test summaries, and ongoing control/status evidence).

  • high

    Expand self-serve trust documentation beyond vulnerability disclosure: create/maintain a trust-center-style set (either in-repo under a docs/trust path and/or a published trust page) that includes versioned DPA/privacy/legal commitments, a current sub-processor list with last-updated timestamps, certification/attestation summaries (or clear pointers to SOC 2/ISO artifacts), pen-test summary cadence, and operational control/status information kept current.

    • SECURITY.md:1-4 — Current SECURITY.md content is only vulnerability disclosure reporting, not procurement-grade self-serve trust evidence packaging.
  • med

    Ensure the trust artifacts are maintained as an intentional product surface (not ad-hoc responses): add explicit “last updated” dates, versioning for sub-processors and attestations, and a single canonical URL that procurement can cite in reviews.

    • SECURITY.md:1-4 — No versioned trust-center elements (last-updated, attestations, sub-processors, control status) are present in the current artifact.
Questionnaire response library N/A

This codebase does not contain a data-room “Questionnaire response library” artifact (CAIQ/SIG/VSA response bank). The repository scan shows the `questionnaire` category is absent (expected for data-room materials that are typically not committed to source control).

  • high

    Request the current Questionnaire response library (e.g., CAIQ/SIG/VSA response set) directly from the seller’s Security/GRC team or GC for packaging review. Ask for (1) the latest version date, (2) mapping to dominant frameworks/standards, and (3) versioned ownership/maintenance evidence.

Controls-to-contract mapping 0%

No controls-to-contract mapping artifact (DPA/MSA commitments mapped to implemented controls + audit evidence) was found in the code-adjacent materials. The only discovered trust doc (SECURITY.md) is limited to vulnerability disclosure and does not package the required deal-closing traceability.

  • high

    Create and version a “Controls-to-Contract mapping” document under a predictable repo path (e.g., docs/security/) that explicitly maps each DPA/MSA commitment (encryption, retention, breach notice, data residency) to (a) the implemented mechanism in the system and (b) the corresponding audit evidence reference (e.g., SOC 2 control IDs / test reports). Ensure the doc is maintained and date/version stamped.

    • SECURITY.md:1-4 — Shows current trust doc scope is insufficient; it’s the place where buyers expect links/traceability artifacts.
  • high

    Publish/maintain the DPA/MSA itself (or at least the committed security exhibit) in a buyer-accessible location and cross-link it from SECURITY.md so procurement can validate commitments against the mapping document.

    • SECURITY.md:1-4 — Current doc contains no DPA/MSA references; procurement traceability is missing.
  • med

    Add a short, maintained index page (e.g., SECURITY.md section) that lists the traceability artifacts (controls-to-contract mapping, trust documentation, and sub-processor inventory version) so reviewers do not have to re-derive evidence from scratch.

Data export mechanism 0%

No code-visible, complete tenant-scoped “get ALL your data out” export mechanism was found. The codebase includes workflow/data-table export helpers used for source-control export, but they are scoped to caller-provided IDs and write selected resources to a local work directory rather than providing a full tenant-wide, async portable export handler.

Deletion / erase-on-request 67%

The codebase contains a user deletion endpoint (`DELETE /users/:id`) that performs a tenant-scoped cascade of primary application data (workflows, credentials, auth identities, and the personal project + user record). However, within the reviewed code, there is no packaged evidence that the deletion request also propagates to backups or other derived/async data stores, which weakens procurement readiness for 'erase-on-request, verifiably'.

  • high

    Provide/implement (and link to auditable evidence) a true erase-on-request cascade that explicitly covers derived/async data and backup retention. Concretely: identify what data stores exist beyond the direct DB entities deleted here (execution logs, binary/object storage, search indexes, event/audit streams, backups/replicas) and ensure the user-deletion flow triggers their cleanup for the user/tenant scope.

  • med

    Add an auditable deletion job record and completion verification (e.g., a durable job with status + counts of deleted items per subsystem) tied to the erase request. This should be produced/retained for the customer/data-subject and for internal compliance review.

  • low

    Clarify contract semantics: the endpoint appears to be admin-driven user deletion with optional transferee migration. If the procurement contract requires 'data subject self-serve erase', document and/or add a self-serve erase path that maps a data-subject request to the same cascade deletion mechanism.
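
The auditable deletion record recommended above could take roughly this shape. It is a sketch only: the subsystem names, the `Eraser` signature, and `runErasure` are assumptions, and persistence of the job record is left out.

```typescript
type Subsystem = "database" | "binary_storage" | "execution_logs" | "event_stream";

interface ErasureStep {
  subsystem: Subsystem;
  deletedCount: number;
}

interface ErasureJob {
  requestId: string;
  userId: string;
  status: "running" | "completed" | "failed";
  steps: ErasureStep[];
  failure?: { subsystem: Subsystem; message: string };
}

// Each eraser removes one subsystem's data for the user and reports how many items it removed.
type Eraser = (userId: string) => Promise<number>;

async function runErasure(
  requestId: string,
  userId: string,
  erasers: ReadonlyArray<readonly [Subsystem, Eraser]>,
): Promise<ErasureJob> {
  const job: ErasureJob = { requestId, userId, status: "running", steps: [] };
  for (const [subsystem, erase] of erasers) {
    try {
      job.steps.push({ subsystem, deletedCount: await erase(userId) });
    } catch (err) {
      // Stop at the first failure so the record shows exactly what was and was not erased.
      job.status = "failed";
      job.failure = { subsystem, message: err instanceof Error ? err.message : String(err) };
      return job;
    }
  }
  job.status = "completed";
  return job;
}
```

The returned record carries per-subsystem counts, which is the evidence a data subject or compliance reviewer would ask for. A failed job names the subsystem that still holds data, so it can be retried rather than silently dropped.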

Data residency commitment N/A

No code-visible implementation of a tenant data-residency commitment was found. Region values exist only as generic infrastructure/provider configuration (e.g., S3 bucket region), not as a tenant attribute used to route data/compute to a pinned residency region. Because this repository appears to be the n8n product code (primarily self-hostable) rather than a multi-tenant hosted service with residency pinning, this primitive is not applicable as a code-derivable control in this codebase.

  • high

    If you are auditing the *hosted* n8n service for residency commitments (EU/Canada/India), request the current residency enforcement architecture and the tenant-region-to-routing enforcement evidence from the seller (e.g., region-pinning data model + routing layer + deployment topology). This repo alone does not provide the required tenant-scoped enforcement artifacts.

Enterprise access controls 0%

I found IP allowlisting enforcement in two places: (1) outbound SSRF protection driven by environment-configured CIDR allow/block lists, and (2) a per-Webhook-node ipWhitelist option that blocks access (403) when a request IP is not allowed. However, I did not find evidence of the *enterprise access controls* primitive as defined (tenant-scoped network restriction enforced at an edge/boundary, plus an admin UI/control surface to manage it). Therefore, this primitive is treated as absent in this codebase.

  • high

    Confirm whether there is an enterprise/tenant-level ingress IP allowlisting feature elsewhere in the codebase (e.g., reverse proxy/ingress middleware, instance-level config, or an enterprise settings module) and that it supports multi-tenant scoping. If not present, implement a boundary enforcement layer that applies a tenant-configured allowlist/CIDR to inbound requests uniformly.

  • high

    Add or surface an admin UI (and corresponding API) that allows security teams to manage the allowlist per tenant/enterprise plan, and ensure the enforcement reads from that tenant configuration (not from ad-hoc per-node settings or unrelated SSRF settings).

  • med

    Package deal-close evidence: provide a documented control description and code-to-control traceability for how the enterprise allowlist is enforced at the network boundary (middleware/edge) and how it is administered (UI/API).
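
A minimal sketch of the boundary check these recommendations point at: one function that evaluates a tenant-configured CIDR allowlist for every inbound request. It is IPv4-only, and `TenantAllowlists` and `isRequestAllowed` are hypothetical names; a production version would also need IPv6 and trusted-proxy handling.

```typescript
// Convert dotted-quad IPv4 to an integer in 0 .. 2^32 - 1, or null if malformed.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

function inCidr(ip: string, cidr: string): boolean {
  const slash = cidr.indexOf("/");
  if (slash === -1) return false;
  const base = ipv4ToInt(cidr.slice(0, slash));
  const addr = ipv4ToInt(ip);
  const bitsText = cidr.slice(slash + 1);
  if (base === null || addr === null || !/^\d{1,2}$/.test(bitsText)) return false;
  const bits = Number(bitsText);
  if (bits > 32) return false;
  // Compare the network prefix by dividing away the host portion.
  const hostSize = 2 ** (32 - bits);
  return Math.floor(addr / hostSize) === Math.floor(base / hostSize);
}

// Hypothetical per-tenant configuration, e.g. loaded from an admin-managed settings store.
type TenantAllowlists = Record<string, readonly string[] | undefined>;

function isRequestAllowed(tenantId: string, clientIp: string, config: TenantAllowlists): boolean {
  const cidrs = config[tenantId] ?? [];
  // Assumption: a tenant with no configured list is unrestricted.
  if (cidrs.length === 0) return true;
  return cidrs.some((cidr) => inCidr(clientIp, cidr));
}
```

Treating an empty list as "unrestricted" is a policy choice made for the example; a stricter deployment might fail closed instead.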

Sub-processor transparency N/A

I did not find a public, current, versioned sub-processor list artifact (e.g., docs/subprocessors or SUBPROCESSORS.md) that would transparently back the DPA sub-processor clause. The repo-adjacent “subprocessors” scan returned code files unrelated to a maintained processor inventory, and there was no clear code-visible control or imported inventory matching a declared list.

  • high

    Create/maintain an explicit, versioned sub-processor inventory in a repo-adjacent location (e.g., docs/subprocessors or SUBPROCESSORS.md) and ensure it is clearly tied to the DPA sub-processor clause. The artifact must be current and include a notification flow/versioning when changes occur.

  • med

    Cross-check the declared sub-processor list against actual third-party data sinks used by the product (SDK imports / integrations) and update the inventory to include any third party the system sends data to (analytics, monitoring, LLM providers, messaging, etc.).

Compliance attestation readiness N/A

No compliance attestation readiness evidence (e.g., a current SOC 2 Type II report and a control-to-code traceability package) was found in the data-room compliance_reports bucket. This primitive is data-room follow-up (not code-visible), so its absence is expected from a repo scan and should not be scored as a code gap.

  • high

    Request the current SOC 2 Type II report (and/or an equivalent ISO 27001 or pen-test attestation if applicable) plus control-to-code traceability from the seller, ensuring the mapping matches the implemented Dim 5 audit evidence set used by your procurement and legal teams.

  • med

    Ask for the specific version/date of the attestation and the scope boundary (services, regions, and subprocessors) that matches the codebase deployment you intend to procure.

Reliability / SLA evidence 0%

No procurement-grade Reliability / SLA evidence is packaged in this repo. The only relevant artifacts discovered are: (1) a SECURITY.md focused on vulnerability disclosure, and (2) UptimeRobot integration code, which does not function as published SLA/status terms or incident/postmortem evidence.

  • high

    Provide (or link to) procurement-ready Reliability/SLA evidence: published status page URL(s), documented SLA/availability commitments, and a versioned incident/postmortem history (or a link to the incident archive). If these are maintained externally, add them to a trust/security page in this repository.

  • high

    Create a dedicated doc (e.g., STATUS/SLA/RELIABILITY.md or update SECURITY.md) that explicitly includes: (a) uptime targets, (b) maintenance window policy, (c) definitions/exclusions, (d) escalation/support process, and (e) incident review/postmortem links.

Not applicable to this codebase: Questionnaire response library, Data residency commitment, Sub-processor transparency, Compliance attestation readiness.

Reporting & Data Export

Customer-accessible export endpoints (CSV, Parquet, JSON), scheduled exports, and a documented map of emitted events.

28% 9/10 scored
  • On-demand data export 0%
    0/1 expected sites
  • Export completeness & fidelity 0%
    0/3 expected sites
  • Large / async export handling 0%
    0/3 expected sites not present
  • Warehouse sync / reverse-ETL 0%
    0/2 expected sites not present
  • In-product reporting / analytics 72%
    6/6 expected sites
  • Event stream completeness 78%
    3/3 expected sites
  • Documented export / event schema 100%
    1/1 expected sites
  • Export access control & audit 0%
    0/2 expected sites not present
  • Exit portability / no lock-in 0%
    0/2 expected sites
On-demand data export 0%

A portable export mechanism exists in code (ExportService that serializes DB tables to encrypted JSONL and compresses into entities.zip). However, in the code we inspected, there is no evidence of an on-demand, tenant-scoped export/download handler that is permission-gated and audited for customer data egress. As a result, the primitive is only partially implemented at the data-export layer and does not yet clearly satisfy the “portable or hostage” customer on-demand export requirement end-to-end.

  • high

    Find (or add) the actual export/download HTTP/API handler(s) that trigger ExportService for a customer request, and ensure they are (1) tenant-scoped, (2) permission-checked, and (3) audit-logged. The handler should return a downloadable, portable archive (e.g., entities.zip) for the requesting tenant/account only.

  • med

    Document and align completeness: confirm which entity categories are included/excluded in exportEntities (it currently iterates through entityMetadatas and optionally includes data table rows). Add explicit inclusion/exclusion rules so the customer can rely on a complete export.

  • med

    Harden export paging/fidelity for large tenants: exportEntities uses LIMIT/OFFSET paging based on offset increment by pageEntities.length. Review correctness and performance at scale; consider keyset pagination if available to avoid unstable pagination under concurrent writes.
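
To illustrate the keyset alternative mentioned in the last recommendation: instead of skipping `OFFSET` rows, each page asks for rows whose key is greater than the last key already exported. The sketch below uses an in-memory array as a stand-in for the table; `fetchPageAfter`, `exportAll`, and the `onPage` hook (which exists only to simulate a concurrent write) are hypothetical.

```typescript
interface Row {
  id: number; // assumed unique and ordered
  payload: string;
}

// Stand-in for: SELECT ... WHERE id > :afterId ORDER BY id LIMIT :limit
function fetchPageAfter(table: readonly Row[], afterId: number, limit: number): Row[] {
  return table
    .filter((r) => r.id > afterId)
    .sort((a, b) => a.id - b.id)
    .slice(0, limit);
}

function exportAll(table: Row[], pageSize: number, onPage?: () => void): Row[] {
  const exported: Row[] = [];
  let cursor = Number.NEGATIVE_INFINITY;
  for (;;) {
    const page = fetchPageAfter(table, cursor, pageSize);
    if (page.length === 0) break;
    for (const row of page) {
      exported.push(row);
      cursor = row.id; // the cursor is the last key seen, not a row count
    }
    onPage?.(); // test hook: lets a caller mutate the table between pages
  }
  return exported;
}
```

If a row is deleted between pages, an offset-based reader shifts its window and can skip a row that was never exported. The keyset reader is unaffected because its position is tied to a key value rather than to a count of rows.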

Export completeness & fidelity 0%

The codebase has a real export implementation (SourceControlExportService) that writes workflow/variable/data-table/folder resources to JSON files for source-control-style exports. But for this primitive (“Export completeness & fidelity”), the evidence indicates the export surface is not a single tenant/account-scoped “export ALL customer data categories” handler; it appears candidate- and context-scoped with early returns and selective coverage. Therefore, portability is present for specific resource types, but completeness across the full customer data model is not demonstrated.

  • high

    Identify (or implement) a single, tenant/account-scoped bulk export handler for reporting/data egress that enumerates every customer-exportable entity category and validates coverage against the data model (no silent omissions). The existing SourceControlExportService methods are candidate-scoped; the completeness primitive requires a full-account export surface.

  • high

    Ensure the bulk export endpoint (not just internal services) is tenant-scoped, permission-gated, and audited. Add explicit authorization checks and audit logging around the top-level export request/job creation.

  • med

    Add automated “export completeness” tests that compare the exported file contents against expected counts/sets for each critical data category (including operational/history/analytics where applicable).

Large / async export handling 0%

I did not find any code-visible implementation of the “Large / async export handling” primitive (async job + streamed export that does not buffer the full dataset in memory). While there is export functionality that returns a `Readable` tar stream, the tar writer buffers all entry contents in memory, and the workflow export command serializes the entire export dataset in-process.

  • high

    Replace/upgrade the tar writer to be true streaming: do not store all entry `Buffer`s in `entries`; instead, write each tar entry to the pack as data becomes available (or pipe from upstream streams). Ensure backpressure is respected and the export does not hold the full dataset in RAM.

  • high

    Move large export endpoints to an async job model: enqueue export work, persist progress/status, and provide a download endpoint that streams the completed artifact (or streams while the job runs). This avoids long in-request execution and improves reliability at volume.

  • med

    Audit other bulk “export*” flows for the same memory-buffering pattern (collect-to-array + JSON.stringify / Buffer accumulation). Convert hot loops to iterator-based pagination/streaming reads from the DB.
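
One way to picture the iterator-based alternative to collect-to-array export is an async generator that reads a page, yields its rows as JSONL lines, and only then reads the next page. This is a sketch under assumptions: `PageReader` stands in for the real query layer, which the audit does not show, and ids are assumed to be positive.

```typescript
interface Entity {
  id: number;
  name: string;
}

// Stand-in for a paged database read keyed on the last id seen.
type PageReader = (afterId: number, limit: number) => Promise<Entity[]>;

// Yields one JSONL line at a time; only the current page is held in memory.
async function* exportJsonl(readPage: PageReader, pageSize: number): AsyncGenerator<string> {
  let cursor = 0; // assumes ids start above 0
  for (;;) {
    const page = await readPage(cursor, pageSize);
    if (page.length === 0) return;
    for (const entity of page) {
      yield JSON.stringify(entity) + "\n";
      cursor = entity.id;
    }
  }
}
```

In Node.js, an async iterable like this can be wrapped with `Readable.from(...)` and piped to a response or archive writer, so the consumer's backpressure controls how fast pages are read. Memory use is then bounded by the page size rather than by the size of the export.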

Scheduled / recurring exports N/A

I did not find any scheduled/recurring *exports* mechanism in the codebase (schedule store + runner that batches and delivers tenant-scoped exported data to a destination, with retry/DLQ, etc.). The codebase does include cron/scheduling infrastructure for executing workflows/tasks, but not a scheduled data-export primitive.

  • high

    If you expect a customer data export product feature, add/implement a dedicated scheduled_exports subsystem: (1) tenant-scoped schedule persistence, (2) a runner that materializes full export batches, (3) retry/DLQ semantics, (4) destination abstraction (S3/warehouse/webhook/drive/etc.), and (5) tenant+permission checks plus audit logging on every export job execution.

Warehouse sync / reverse-ETL 0%

No real “warehouse sync / reverse-ETL” primitive (incremental, managed sync to customer warehouses/BI destinations) was found. The codebase contains warehouse destination integrations (e.g., Snowflake and BigQuery nodes) that execute user-driven SQL/insert/update operations, but not a dedicated sync/reverse-ETL layer with incremental syncing and sync-job orchestration.

  • high

    Confirm whether n8n’s intended “reverse-ETL” is implemented via workflow-driven nodes only; if the product promises managed warehouse sync/incremental behavior, add a dedicated sync layer (sync state, incremental cursoring, scheduling/runner, retries/DLQ) rather than relying solely on ad-hoc executeQuery/insert/update node actions.

  • med

    If warehouse sync configs/connectors are supposed to exist (e.g., dbt/airbyte/fivetran/singer-style), ensure they are represented by maintained configuration/projects and not just credential/node metadata; add or document a concrete sync configuration path with actual sync execution wiring.

In-product reporting / analytics 72%

This codebase contains a real in-product analytics/reporting primitive: the /insights REST controller plus corresponding public API handler expose reporting endpoints (summary, by-workflow, by-time) backed by an InsightsService that computes dashboard-ready aggregates. Access is gated by authorization scope decorators and license checks, and request date ranges are validated and checked against license retention/granularity constraints.

  • high

    Confirm end-to-end tenant/project scoping and authorization for the underlying data queries (projectId filtering and whether the authenticated user is restricted to their own project). The controller passes projectId through to the service/repositories, but we should verify that the repository queries enforce ownership/tenant boundaries.

  • med

    Check the repository query implementations used by insightsService.getInsightsSummary/getInsightsByWorkflow/getInsightsByTime to ensure they are performant for real datasets (indexes/aggregation strategy) and that no unbounded history/range slips past license filtering.

  • low

    Verify frontend/UI integration exists for these endpoints (tables/charts) and that the client is using the same reporting API contracts, not duplicating reporting logic or relying on internal-only routes.

Event stream completeness 78%

Event-stream completeness (as a reporting/event coverage primitive) exists in the codebase as a typed event system built around EventService (TypedEmitter) plus relay layers (TelemetryEventRelay and LogStreamingEventRelay) that register handlers for specific event-name catalogs using the shared EventRelay.setupListeners wiring. Evidence shows substantial coverage via large eventName→handler maps. However, I did not find an off-graph, documented “event catalog” specifically for this primitive to diff against the actual emitted/handled set (git_doc_scan surfaced other schema/config categories but no clear dedicated documented event catalog for drift comparison).

  • high

    Add/locate the documented external event catalog for this primitive (the “what we claim to emit” list) and then implement an automated diff test: (declared catalog events) vs (actually registered relay handlers / emitted event names) to detect drift.

  • med

    Create completeness assertions per relay layer (TelemetryEventRelay and LogStreamingEventRelay): ensure every intended RelayEventMap event name has a corresponding handler entry in the setupListeners map for that output.

  • low

    Extend or add tests that assert registered listeners include a representative set of RelayEventMap keys (and fail when keys are added to RelayEventMap but omitted from relay maps).
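
The drift test recommended above reduces to a set difference in both directions. A minimal sketch follows, with the declared catalog passed in as a plain list and the relay's handler map as an object. `diffEventCatalog` is a hypothetical helper, and the event names in the usage example are made up.

```typescript
interface CatalogDrift {
  undocumented: string[]; // handled in code, missing from the declared catalog
  unhandled: string[]; // declared in the catalog, no handler registered
}

function diffEventCatalog(
  declared: readonly string[],
  handlers: Readonly<Record<string, unknown>>,
): CatalogDrift {
  const declaredSet = new Set(declared);
  const handled = Object.keys(handlers);
  const handledSet = new Set(handled);
  return {
    undocumented: handled.filter((name) => !declaredSet.has(name)).sort(),
    unhandled: declared.filter((name) => !handledSet.has(name)).sort(),
  };
}

// Usage with made-up event names:
const exampleDrift = diffEventCatalog(
  ["workflow-created", "user-deleted", "credentials-shared"],
  {
    "workflow-created": () => {},
    "user-deleted": () => {},
    "workflow-archived": () => {},
  },
);
// exampleDrift.undocumented lists "workflow-archived";
// exampleDrift.unhandled lists "credentials-shared".
```

A CI test would assert that both arrays are empty for each relay, so adding an event on either side without the other fails the build.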

Documented export / event schema 100%

Documented payload schemas for export/reporting-style data are present in-repo (e.g., maintained Zod schemas under packages/@n8n/api-types/src/schemas). However, the repo-wide “documented export/event schema” coverage for the full event catalog (e.g., telemetry/audit/workflow events) is not fully evidenced as a single maintained asyncapi/openapi/event-catalog-style contract in the code we inspected—so overall documentation quality/coverage appears partial rather than comprehensive.

  • high

    Add/maintain a single, consumer-facing event catalog/spec (asyncapi/openapi or an EVENTS.md/schema doc) that enumerates the supported event names AND links each event name to a versioned JSON schema for its payload—then ensure code payloads are validated/serialized against those schemas to prevent drift.

  • med

    For each major event group (telemetry/audit/workflow execution/queue/etc.), ensure there is a corresponding documented payload schema (and add tests that fail when emitted payloads diverge from the documented schema).

Export access control & audit 0%

The codebase includes an export endpoint for n8n packages (workflows/credentials) but does not implement an explicit “Export access control & audit” primitive on the export path: there is no visible permission/tenant authorization check and no visible audit/access-log write associated with the export request.

  • high

    Add a permission + tenant/project scoping authorization check for the export endpoint (before export work begins) and ensure it is enforced for every requested workflowId/credential requirement. Record a denial reason consistently.

  • high

    Write an audit/access-log event for each successful export request (and optionally each failure) that includes tenant/account/project identifiers, exporting user id, the exported entity counts/ids (redacted as needed), and the request correlation id.

  • med

    Ensure exporter dependencies (WorkflowExporter/CredentialExporter) cannot accidentally broaden access: validate that their internal queries are scoped to the requesting user’s tenant/project and permission set, not merely “user object present”.
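
Taken together, the three recommendations describe a single gate in front of the export work: check permission, check that every requested workflow belongs to the caller's project, and write an audit entry either way. A sketch under assumptions follows; `AccessPolicy`, `AuditEntry`, and `authorizeExport` are hypothetical and do not reflect n8n's actual permission model.

```typescript
interface ExportRequest {
  userId: string;
  projectId: string;
  workflowIds: string[];
  correlationId: string;
}

interface AuditEntry {
  action: "export.allowed" | "export.denied";
  userId: string;
  projectId: string;
  workflowIds: string[];
  correlationId: string;
  reason?: string;
}

// Hypothetical lookups standing in for the real permission and ownership queries.
interface AccessPolicy {
  canExport(userId: string, projectId: string): boolean;
  workflowProject(workflowId: string): string | undefined;
}

function authorizeExport(req: ExportRequest, policy: AccessPolicy, audit: AuditEntry[]): boolean {
  const base = {
    userId: req.userId,
    projectId: req.projectId,
    workflowIds: req.workflowIds,
    correlationId: req.correlationId,
  };

  let reason: string | undefined;
  if (!policy.canExport(req.userId, req.projectId)) {
    reason = "missing export permission";
  } else {
    // Every requested workflow must belong to the caller's project.
    const foreign = req.workflowIds.filter((id) => policy.workflowProject(id) !== req.projectId);
    if (foreign.length > 0) reason = `workflows outside project: ${foreign.join(", ")}`;
  }

  if (reason !== undefined) {
    audit.push({ action: "export.denied", ...base, reason });
    return false;
  }
  audit.push({ action: "export.allowed", ...base });
  return true;
}
```

The exporter itself would run only after this returns `true`. Keeping the check outside the exporter means a new export format cannot bypass authorization or auditing by accident.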

Exit portability / no lock-in 0%

This codebase has a real export capability for selected workflows (and related credentials) packaged as a downloadable gzip/tar artifact (`/n8n-packages/export`). However, for the specific primitive of exit portability/no lock-in, the code-visible export path evidenced here appears *not* to be a full-account export covering all customer data needed to leave; the handler/service are driven by `workflowIds`. The repo scan did not find any exit-terms/contract clause documents in source (hand-off required to a buyer GC).

  • high

    Implement or document a true full-account export endpoint (tenant/account-scoped) that includes all relevant customer data beyond workflow definitions—e.g., credentials, global variables, data tables, settings/config, and other exported entities required for leaving—packaged in a portable format (with completeness guarantees).

  • high

    Ensure the full-account export path is explicitly permission-gated and auditable (write an audit-log entry on export initiation/completion, and confirm tenant scoping).

  • med

    Buyer-GC hand-off (contract half): confirm the MSA/termination/data-portability clause explicitly honors the customer’s right to export their complete data before lock-in/termination restrictions take effect.

    • N/A (tool output):N/A — git_doc_scan reported the `exit_terms` category as absent in the repo (no contractual clause artifacts to cite). This is a hand-off item to the buyer's GC, not a code gap.

Not applicable to this codebase: Scheduled / recurring exports.