System context
Canary LP is a loss-prevention analytics platform for Square-POS merchants. The user-facing product is an AI-assisted dashboard that watches every transaction, refund, void, cash-drawer event, discount, and timecard change in real time — and surfaces plain-English alerts when a pattern looks wrong.
The platform is built on a canonical data model (the CRDM) that descends from three generations of enterprise retail LP systems. Square is the first POS integration; the canonical layer is source-agnostic by design, so additional POS integrations are a parser and a field-mapping rather than a rewrite.
It is not a payment processor. It is not a POS replacement. It is not a rule-authoring engine that replaces human judgement. It is a layer on top of the POS that reads everything Square emits, applies decades of LP pattern knowledge, and writes back nothing. Canary is a lens.
Component map
At the highest level, Canary is a Python 3.12 Flask app with 29 blueprints, 12 MCP servers, a four-stage stream pipeline, and a PostgreSQL+Valkey data plane. It runs as a Docker Compose stack locally and behind a Cloudflare tunnel in hosted dev.
Traffic flows in three broad directions:
- Square → Canary — webhooks hit the TSP ingress at `/webhooks`. Four subscribers (Sub1 Seal, Sub2 Parse, Sub3 Merkle, Sub4 Chirp) process each event through append-only Postgres writes and Valkey streams. Atlas category P maps every step.
- Merchant → Canary — the dashboard and Owl chat run on Flask + Jinja2 + Alpine. Frontend reads go via the `/api/*` and `/bff/*` blueprints. Owl accepts natural language and translates it to structured queries over the CRDM.
- Canary → Square — outbound is narrow: OAuth token refresh (`/oauth/*`) and the merchant reset flow. Canary never writes merchant-facing POS data back to Square.
Data architecture
One Postgres database named `canary` with three schemas:

- `app` — identity, config, detection rules, alerts, Fox cases, Owl memory, webhooks, audit logs
- `sales` — transactions, tenders, line items, refund links, evidence records, cash drawer events, employee timecards, disputes, gift cards, loyalty events, inventory adjustments
- `metrics` — daily/period metrics, hourly metrics, risk scores, baselines, scorecards
The split is enforced through `search_path` at session open; foreign keys cross schemas freely. The Field Registry catalogs every column in every table across all three schemas.
Three tiers of immutability keep the evidence chain honest:
- Financial ledger (append-only). `sales.transactions`, `sales.refund_links`, `sales.cash_drawer_events`. No UPDATE, no DELETE. Enforced by the seed-time trigger `prevent_mutation()`.
- Evidentiary (insert-only). `app.fox_evidence`, `app.fox_evidence_access_log`, `sales.evidence_records`, `sales.event_inscriptions`. If the platform accuses someone, the evidence chain must be unbroken.
- Audit trail (hash-chained). Every write to `app.audit_log` and `app.fox_evidence` gets a SHA-256 chain hash linking it to the previous row. Tamper with one entry and the chain breaks downstream.
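The hash-chaining mechanic can be sketched in a few lines. This is a minimal sketch, not the production trigger logic: the `prev_hash`/`chain_hash` field names and the genesis sentinel are illustrative assumptions, but the scheme (each row's hash covers the previous row's hash plus its own canonical content) matches the description above.

```python
import hashlib
import json

def chain_hash(prev_hash: str, row: dict) -> str:
    """Link a new audit row to its predecessor: SHA-256(prev_hash || canonical row)."""
    payload = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()

def verify_chain(rows: list[dict]) -> bool:
    """Recompute every link; a tampered row breaks every hash downstream of it."""
    prev = "0" * 64  # genesis sentinel for the first row (illustrative)
    for row in rows:
        expected = chain_hash(prev, row["data"])
        if row["hash"] != expected:
            return False
        prev = expected
    return True
```

Editing any earlier row's `data` invalidates that row's hash and, transitively, every row after it, which is exactly the tamper-evidence property the audit trail relies on.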
Row-Level Security is enabled on every merchant-scoped table. Every session sets `canary.current_merchant_id`; every query is scoped to that merchant by Postgres, not by application code.
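The per-session setup can be sketched with a plain DB-API cursor. This is a hypothetical helper, not the app's actual DB layer: the function name and exact statements are assumptions, though `set_config()` is the standard parameterizable way to set a Postgres session variable like `canary.current_merchant_id`.

```python
def scope_session(cur, merchant_id: str) -> None:
    """Pin a DB session to one merchant and the three canonical schemas.

    `cur` is any DB-API cursor (e.g. psycopg). Once the variable is set,
    RLS policies filter every query by canary.current_merchant_id --
    no WHERE merchant_id clause needed in application code.
    """
    cur.execute("SET search_path TO app, sales, metrics")
    # set_config(..., false) keeps the value for the whole session
    cur.execute(
        "SELECT set_config('canary.current_merchant_id', %s, false)",
        (merchant_id,),
    )
```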
The detection pipeline — TSP
TSP stands for Triple Subscriber Pipeline (the name predates the fourth subscriber). Webhooks arrive at `POST /webhooks/square`; the HMAC signature is validated; the raw payload is sealed into Postgres as an append-only evidence record and published to the `canary:events` Valkey stream. Four consumer groups read from that stream and do non-overlapping work:
- Sub 1 — Hash & Seal. Validates fields, recomputes SHA-256, writes the immutable `sales.evidence_records` entry with a per-merchant chain hash. Never transforms or parses. If the hash doesn't match, the event goes to the dead-letter stream.
- Sub 2 — Parse & Route. Parses the JSON payload, maps it to CRDM columns, writes to the domain tables (`sales.transactions`, `sales.transaction_tenders`, `sales.cash_drawer_events`, etc.), and publishes routing messages for detection.
- Sub 3 — Merkle Batcher. Accumulates events for Bitcoin inscription. Every 100 events or every hour, it constructs a Merkle root over the evidence records and writes to `sales.inscription_pool`. Inscription is optional per merchant.
- Sub 4 — Chirp Detection. Evaluates each event against the relevant detection rules (filtered by rule category and source table) and writes `app.alerts` + `app.alert_history`. This is where the 37 rules in the Detection Catalog run.
Detection runs in three execution tiers. Tier 1 (stateless) fires from a single webhook payload with zero lookups — microsecond evaluation, the fast path. Tier 2 (stateful) needs shift-level or session-level aggregation, such as employee refund rate. Tier 3 (composite) correlates multiple primary-rule hits into higher-order patterns. All three tiers share the same `detection_rules` catalog and the merchant-specific thresholds in `merchant_rule_config`.
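A Tier 1 rule is essentially a pure predicate over one payload. A hypothetical sketch, assuming a rule record carries a field path, an operator, and a merchant-specific threshold (the rule shape here is illustrative, not the `detection_rules` schema):

```python
def eval_tier1(rule: dict, event: dict) -> bool:
    """Tier 1: decide from the single webhook payload, with zero lookups.

    `rule` is an illustrative record: a dotted field path into the payload,
    an operator name, and a threshold (as would come from merchant config).
    """
    value = event
    for key in rule["field"].split("."):  # e.g. "refund.amount_money.amount"
        value = value.get(key)
        if value is None:  # field absent -> rule does not fire
            return False
    ops = {
        "gt": lambda v, t: v > t,
        "eq": lambda v, t: v == t,
    }
    return ops[rule["op"]](value, rule["threshold"])
```

Because nothing is fetched, evaluation is a handful of dict lookups, which is what makes the microsecond fast path possible.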
Investigation lifecycle
An alert is the beginning of a narrative, not the end. A critical-severity rule (C-104 After-Hours Drawer, C-204 Untendered Order, C-301 Off-Clock Transaction, C-502 Post-Void) auto-opens a Fox case. Non-critical alerts sit in the Alert Queue until a merchant reviews them.
A Fox case has a case number (CASE-2026-NNNNN), a status lifecycle (open → investigating → escalated → closed | dismissed), an assigned reviewer, an evidence locker, and a hash-chained timeline. Evidence types include screenshots, register tapes, CCTV links, employee statements, and the triggering alerts themselves.
- Every evidence upload writes an immutable `fox_evidence` row with chain-hash linkage and a SHA-256 of the file content.
- Every access to evidence logs a row to `fox_evidence_access_log` — who read it, when, from where.
- Every status change writes a `fox_case_timeline` event with actor, description, and timestamp.
- Closing a case records the total loss in cents and a resolution note.
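The status lifecycle is small enough to express as a transition table. A sketch under stated assumptions: the path open → investigating → escalated → closed | dismissed comes from above, but exactly which shortcut transitions (e.g. dismissing straight from open) are legal is an assumption here, not documented behavior.

```python
# Illustrative transition table for the Fox case lifecycle:
# open -> investigating -> escalated -> closed | dismissed
TRANSITIONS = {
    "open": {"investigating", "dismissed"},
    "investigating": {"escalated", "closed", "dismissed"},
    "escalated": {"closed", "dismissed"},
    "closed": set(),      # terminal
    "dismissed": set(),   # terminal
}

def advance(status: str, new_status: str) -> str:
    """Validate a status change before it is written to the case timeline."""
    if new_status not in TRANSITIONS[status]:
        raise ValueError(f"illegal transition {status} -> {new_status}")
    return new_status
```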
The lifecycle diagram is Atlas figure L-03. Field-level detail for every Fox table is in the Field Registry under the App domain.
Search & metrics
Owl is the AI assistant layer. It accepts natural-language questions ("What employees have the highest refund rate this week?") and translates them to structured queries over the CRDM. Owl uses the Field Registry as a typed schema, applies tenant-scoped RLS, and returns an answer with citations back to the underlying rows.
The Risk Dictionary is a curated set of 21 predefined z-score outlier queries — the drill-down entry points merchants see on the dashboard. Each entry maps a question to a drill order (`by_employee` / `by_location` / `by_day`) and a set of canonical filters. There is no pre-built SQL; the drill engine constructs each query at runtime.
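Each entry ultimately reduces to a z-score against a stored per-merchant baseline (mean and stddev, as held in `metrics.metric_baselines`). A minimal sketch; the function names and the ≥ 2σ default are illustrative assumptions:

```python
def z_score(value: float, mean: float, stddev: float) -> float:
    """How many standard deviations `value` sits from the merchant baseline."""
    if stddev == 0:
        return 0.0  # degenerate baseline: nothing can be an outlier
    return (value - mean) / stddev

def outliers(rows: list[dict], mean: float, stddev: float,
             threshold: float = 2.0) -> list[dict]:
    """Drill-down entry point: flag entities whose metric is a high outlier."""
    return sorted(
        (r for r in rows if z_score(r["value"], mean, stddev) >= threshold),
        key=lambda r: r["value"],
        reverse=True,
    )
```

The drill order then decides what `rows` are: the same computation runs per employee, per location, or per day.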
Metrics are computed in the `metrics` schema via daily and period aggregation:

- `metrics.daily_metrics` — per-location, per-day rollups (sales, transactions, refunds, voids, discounts, shrink indicators)
- `metrics.employee_daily_metrics` — per-employee, per-day (with a `risk_score_snapshot`)
- `metrics.period_metrics` — weekly + monthly rollups for dashboard trend lines
- `metrics.metric_baselines` — per-merchant baselines (mean + stddev) used by the Risk Dictionary's z-score queries
- `metrics.entity_risk_scores` — current risk score per employee / card / subject
The MCP surface
Every bounded service domain exposes an MCP server at its own URL prefix — /owl, /chirp, /fox, /alert, /analytics, /identity, /tsp, /raas, /bff, /condor, /atlas, plus ALX for institutional memory. Each server follows the same contract:
- `GET /<domain>/manifest` — describes the server, tools, and schemas
- `GET /<domain>/tools` — lists tool names + argument signatures
- `POST /<domain>/tools/<name>` — invokes a tool with JSON arguments
- `GET /<domain>/health` — liveness probe
Auth is JWT on tool invocation; manifest, tools, and health are public. The shared base kit in `canary/mcp/` stamps these endpoints via `create_mcp_blueprint()`, so adding a new domain is a model + a handler file, not a blueprint scaffold.
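The contract can be approximated by a tiny in-process registry. This is a framework-free sketch, not the real `create_mcp_blueprint()` (which stamps Flask routes); the class, decorator, and the sample `open_case` tool are all illustrative.

```python
import json
from typing import Callable

class McpServer:
    """Minimal in-process model of one MCP domain: manifest, tool list, invoke."""

    def __init__(self, domain: str):
        self.domain = domain
        self._tools: dict[str, Callable] = {}

    def tool(self, name: str):
        """Decorator: register a handler under a tool name."""
        def register(fn: Callable) -> Callable:
            self._tools[name] = fn
            return fn
        return register

    def manifest(self) -> dict:                      # ~ GET /<domain>/manifest
        return {"domain": self.domain, "tools": sorted(self._tools)}

    def invoke(self, name: str, args_json: str) -> dict:  # ~ POST /<domain>/tools/<name>
        return self._tools[name](**json.loads(args_json))

fox = McpServer("fox")

@fox.tool("open_case")
def open_case(alert_id: str) -> dict:
    # Hypothetical tool body; case-number format mirrors CASE-2026-NNNNN
    return {"case": "CASE-2026-00001", "alert": alert_id}
```

The same shape is what makes the surface uniform for Owl, ALX, and the QA Agent: every domain answers the same four questions the same way.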
This is the same contract the Agent SDK uses. Owl, ALX, and the Ops Console's QA Agent all talk to these servers as tool consumers — no internal-only API, no backdoors.
Deployment shape
All environments run the same Docker Compose stack:
- `canary_flask` — Gunicorn, port 5001 inside the container, served via `--reload` in dev
- `canary_localhost_tsp_sub1..sub4` — four TSP consumer containers, one per subscriber
- `growdirect_postgres` — PostgreSQL 17 with `pgvector` (port 5432, shared across Canary + Cove + memory)
- `growdirect_valkey` — Valkey 8 (port 6379); DB 0 = sessions/cache, DB 4 = TSP streams
- `growdirect_ollama` — Ollama serving `qwen3-embedding:8b` for 1024-dim semantic vectors
- `growdirect_pgadmin` — DB admin UI on port 5050
- `canary_localhost_qa_agent` — Claude-powered QA assistant sidecar
In dev, dev.growdirect.app routes to the Flask container through a Cloudflare tunnel. Production target is canary.growdirect.app on a Mac Mini behind a second tunnel (same compose, no nginx — Cloudflare handles TLS). No AWS. No Kubernetes. The architecture is deliberately one founder's operational surface area.
Where to go next
You now have the shape. For depth:
- The Mermaid Atlas — 52 diagrams across pipeline, lifecycle, orchestration, infrastructure, decision, journey, protocol, and legacy research.
- The CRDM — provenance of the data model (Tesco TDS → Walmart SMART → Canary), seven canonical sources, Square coverage analysis.
- The Field Registry — every table, every column, every searchable field, every i18n key. The metadata source of truth.
- The Detection Catalog — all 37 rules, generated from source.
- The Risk Dictionary — 21 z-score queries mapped to drill paths.
- API Reference — OpenAPI 3.0 spec, Redoc-rendered.