Build Explainable AI with AmberTrace AI

Worked examples

Eight domains, one workflow.

Each example was built end-to-end against the live platform: a description, a dataset, a verified build, and queries that return proof-carrying Amber Reports. The forecasting examples use FRED and Coinbase data, either bundled as a panel or fetched live through a connector.

Decision domains · verified & proof-carrying

verified

Loan approval

Age, debt-to-income, credit-score tiers

# Rules in plain English become an ontology
domain = api.domains.create(name="Loan Approval Assessment",
    description="Applicants must be 18 or older; debt-to-income "
                "must not exceed 43%; minimum credit score per loan "
                "tier; employment verification required.")
ds = api.datasets.upload(domain_id=domain["id"], file_path="loan_applications.csv")  # keep the handle: ds["id"] is used below
api.domains.build_ontology(domain["id"])
platform = api.platforms.create(domain_id=domain["id"], dataset_id=ds["id"],
    verified_profile=True, verified_min_confidence=0.6)

report = api.platforms.query(platform["id"],
    query="Can a 17-year-old with no income apply for a £3,000 loan?")

Live result >>> Check Applicant Age Below Threshold fired · proof_checked: true
“Decision independently certified against the trusted kernel: 1 rule fired, 0 facts derived from 1 input fact.”

View full demo · 13_loan_assessment.py →

verified

Insurance fraud

Policy limits, early claims, high-frequency, collusion

domain = api.domains.create(name="Insurance Claims Fraud Detection",
    description="Claims exceeding the policy limit must be denied; "
                "claims filed within 90 days of policy start flagged; "
                "claimants with 4+ prior claims flagged high-frequency...")

report = api.platforms.query(platform["id"],
    query="A property flood claim for £55,000 on a £50,000 policy. Approve?")

Live result >>> Check Policy Limit Exceeds Threshold (+2 rules) · proof_checked: true
“The claim of £55,000 exceeds the policy limit of £50,000 — it should not be approved in full.”

View full demo · 11_fraud_detection.py →

verified

Clinical prescribing

Interactions, contraindications, allergy cross-reactions — derived

# The platform DERIVES the clinical knowledge from primitive facts
domain = api.domains.create(name="Clinical Prescribing Safety",
    description="From the drug class, eGFR, concurrent medication and "
                "allergy, derive drug interactions, contraindications and "
                "allergy cross-reactions; then block, require a dose "
                "adjustment, flag, or permit.")

report = api.platforms.query(platform["id"],
    query="Prescribe ibuprofen to an 82-year-old with chronic kidney disease?")

Live result >>> Is Contraindicated (derived) → block · proof_checked: true
“Decision independently certified against the trusted kernel: 3 rules fired, 2 facts derived from 7 input facts.” The same NSAID is permitted at normal kidney function: the contraindication is derived, not column-matched.

View full demo · 12_clinical_safety.py →

verified

Recruitment compliance

Age, salary bands, right-to-work, bias detection

domain = api.domains.create(name="HR Recruitment Compliance",
    description="Applicants must be 18+; salary within the role band; "
                "right to work required; references must pass; flag "
                "qualified candidates rejected without interview as bias.")

report = api.platforms.query(platform["id"],
    query="Can we hire a 16-year-old for a junior developer role?")

Live result >>> Check Age Below Threshold fired · proof_checked: true
“There is a rule flagging applicants below 18 — a potential compliance issue for this hire.”

View full demo · 14_recruitment_compliance.py →

verified

Environmental compliance

Regulatory / permit breaches, protected zones

domain = api.domains.create(name="Environmental Regulatory Compliance",
    description="Readings must not exceed the regulatory or permit "
                "limit; facilities need an active permit; coastal-protected "
                "zones have stricter limits; breaches require enforcement.")

report = api.platforms.query(platform["id"],
    query="NO2 recorded at 44.8 ug/m3 against a 40.0 limit. In breach?")

Live result >>> Calculate Measurement Value To Regulatory Limit Ratio · proof_checked: true
“Riverside Chemical Works is in breach — NO2 of 44.8 ug/m3 exceeds the 40.0 regulatory limit.”
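
The name of the fired rule suggests the breach test is a ratio of the reading to its limit. A minimal sketch of that arithmetic in plain Python follows. The function name and the choice that a ratio strictly above 1.0 counts as a breach are assumptions for illustration, not the platform's rule definition.

```python
def limit_ratio(measured: float, limit: float) -> float:
    """Return the reading as a fraction of its regulatory limit."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return measured / limit


# Values from the query above: NO2 at 44.8 ug/m3 against a 40.0 limit.
ratio = limit_ratio(44.8, 40.0)
# Assumption: "must not exceed" means strictly above the limit is a breach.
in_breach = ratio > 1.0
print(f"ratio={ratio:.2f} in_breach={in_breach}")  # ratio=1.12 in_breach=True
```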

View full demo · 15_environmental_compliance.py →

verified · 20k rows

Access-governance PDP

Classify-then-conclude policy chain — permit / deny, every decision proof-carrying

# 20k access requests. The policy is a chain: classify raw fields into named
# conditions (device trust, privilege, restricted zone), then conclude permit or
# deny — including the cross-field rule "deny when clearance < target sensitivity".
# The request is supplied as structured facts — the facts ARE the certified base.
domain = api.domains.create(name="Access Governance PDP", description="…")
platform = api.platforms.create(domain_id=domain["id"], dataset_id=ds["id"],
    verified_profile=True, verified_min_confidence=0.6)

report = api.platforms.query(platform["id"], query="Should this be permitted?",
    facts={"clearance_level": 2, "target_sensitivity": 3,
           "access_type": "write", "mfa_passed": True})

Live result Decision: DENY · proof_checked: true
“Decision: Deny — access denied due to insufficient clearance level (clearance 2 is below the target sensitivity 3).”
Held-out: 100% decision accuracy, 100% certification across 50 requests.
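
The classify-then-conclude chain described above can be pictured in plain Python. This is only an illustration of the two-step shape, not the platform's kernel: the condition names and the MFA rule are hypothetical, and only the cross-field clearance rule comes from the example.

```python
from typing import Any, Dict, List, Tuple


def classify(facts: Dict[str, Any]) -> List[str]:
    """Step 1: turn raw fields into named conditions."""
    conditions = []
    # Cross-field rule from the example: clearance below target sensitivity.
    if facts["clearance_level"] < facts["target_sensitivity"]:
        conditions.append("insufficient_clearance")
    # Hypothetical extra condition, added only to show a second classifier.
    if not facts.get("mfa_passed", False):
        conditions.append("mfa_missing")
    return conditions


def conclude(conditions: List[str]) -> Tuple[str, List[str]]:
    """Step 2: deny if any blocking condition holds, otherwise permit."""
    decision = "DENY" if conditions else "PERMIT"
    return decision, conditions


facts = {"clearance_level": 2, "target_sensitivity": 3,
         "access_type": "write", "mfa_passed": True}
decision, reasons = conclude(classify(facts))
print(decision, reasons)  # DENY ['insufficient_clearance']
```

Keeping classification separate from the conclusion means each denial can name the condition that caused it, which is the property the report text relies on.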

View full demo · 18_access_governance.py →

Command & control · verified at scale

verified · 20k rows

Air-track identification & triage

Recognized air picture — clear / monitor / escalate, human in the loop

# 20k tracks; the request is supplied as structured facts (the facts ARE
# the certified base). Policy stated in the description — incl. the safety rule:
# "emergency tracks must always be escalated to a human operator."
domain = api.domains.create(name="Air Track Triage", description="…")
ds = api.datasets.upload(domain_id=domain["id"], file_path="air_tracks.csv")  # keep the handle: ds["id"] is used below
platform = api.platforms.create(domain_id=domain["id"], dataset_id=ds["id"],
    verified_profile=True, verified_min_confidence=0.6)

report = api.platforms.query(platform["id"], query="Triage this track.",
    facts={"iff_mode": "emergency", "squawk_emergency": True,
           "in_restricted_zone": False, "flight_plan_correlated": False})

Live result Triage: ESCALATE · proof_checked: true — “Decision independently certified against the trusted kernel.”
Held-out: 98% three-way accuracy, 98% certification across 50 tracks; emergency tracks always escalate to an operator.

View full demo · 19_air_track_triage.py →

verified · ISR standards

Triage on standardised ISR sensor data

Same policy, 26-field ASTERIX/MISB schema — the platform picks the drivers

# The SAME clear / monitor / escalate policy as the air-track demo above —
# but over a 26-field ISR feed (ASTERIX Cat 021/062 + MISB ST 0601). You
# describe the policy once; the platform finds the few fields that decide it.
domain = api.domains.create(name="Air Track Triage (High-Spec ISR)", description="…")
ds = api.datasets.upload(domain_id=domain["id"], file_path="air_tracks_hispec.csv")  # keep the handle: ds["id"] is used below
platform = api.platforms.create(domain_id=domain["id"], dataset_id=ds["id"],
    verified_profile=True, verified_min_confidence=0.6)

report = api.platforms.query(platform["id"], query="Triage this track.",
    facts={"position_source": "fused", "iff_mode": "emergency",
           "emergency_squawk": True, "in_restricted_zone": True,
           "track_confidence": 0.93})

Live result Triage: ESCALATE · proof_checked: true — “Decision independently certified against the trusted kernel.”
Standardised ISR schema — ASTERIX Cat 021/062 surveillance fields, MISB ST 0601 sensor metadata, per-sensor confidence; synthetic, reproducible ISR data. 26 input columns, but only a handful drive the verdict — the platform does the feature selection from a wide, sparse schema, no feature engineering required. Emergency tracks always escalate to a human operator.

View full demo · 24_air_track_isr_hispec.py →

Forecasting · explainable macro models

FRED · explainable

10-year Treasury yield — system-selected drivers

24 macro series in, autoregression off — the platform picks what explains the yield

# Upload a broad, neutral 24-series US macro panel (bundled, public-domain FRED)
api.datasets.upload(domain_id=domain["id"], file_path="gs10_macro_panel.csv")

# Train with autoregression OFF — the model must explain via macro drivers, not momentum
cfg = api.predictions.create_config(platform["id"], target_field="GS10",
    time_index_field="date", horizon=1, frequency="monthly",
    autoregressive="none")

# What did the system select? Feature importance + readable WHEN-THEN rules
base = api.predictions.predict(platform["id"], prediction_config_id=cfg["id"])
sf = api.predictions.symbolic_forecast(platform["id"], prediction_config_id=cfg["id"])

System-selected drivers From 24 candidates the platform kept corporate credit (AAA, BAA), the short end (FEDFUNDS, GS1), and real activity (retail sales, building permits, housing starts) — credit conditions + the real economy, surfaced from a neutral list, not hand-picked.

Fit Level-space R² ≈ 0.97 with 0% own-history reliance (411 monthly rows, 1992–2026). The month-to-month change R² is negative — and we say so rather than hide behind the flattering level number.
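
A high level-space R² alongside a negative change R² is not a contradiction. The sketch below uses a synthetic trending series (not FRED data, and not the platform's model) to show how a forecast that stays close to the level can still be worse than the mean at predicting the month-to-month move. The way the predicted change is measured here, from the previous actual level, is an assumption.

```python
import math


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Synthetic, trending "yield" series (illustrative only).
actual = [2.0 + 0.05 * t + 0.2 * math.sin(t / 3) for t in range(60)]
# A forecast that is always within 0.15 of the actual level.
predicted = [a + 0.15 * (-1) ** t for t, a in enumerate(actual)]

level_r2 = r_squared(actual, predicted)

# Month-to-month changes, both measured from the previous actual level.
actual_chg = [actual[t] - actual[t - 1] for t in range(1, 60)]
predicted_chg = [predicted[t] - actual[t - 1] for t in range(1, 60)]
change_r2 = r_squared(actual_chg, predicted_chg)

print(f"level R2 = {level_r2:.2f}, change R2 = {change_r2:.2f}")
```

The level variance is dominated by the trend, so a small level error barely dents the level R². The same error is large relative to the small monthly changes, which pushes the change R² below zero.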

Rules 41 readable WHEN→THEN driver rules with contributions and hit-rates — e.g. "WHEN the front end falls AND real activity weakens → yield down". Rules you can read, challenge, and govern.

View full demo · 20_bond_yield_forecast.py →

FRED · neuro-symbolic

Neuro-symbolic 10-year yield forecast

A tight fit, a what-if you drive, and an honest neural-vs-symbolic head-to-head

# 1) Conditional forecast — YOU supply the macro view; the gate shows which rules fire
f = api.predictions.symbolic_forecast(platform["id"], prediction_config_id=cfg["id"],
    feature_overrides={"FEDFUNDS": 6.0})   # "what if the Fed stays at 6%?"
# f["forecast"] -> 4.45%  ;  f["why"] -> the rules, with fired_on_latest_row=True for the 3 that fire

# 2) Does the symbolic layer earn its place? Preview the discovered rules read-only.
cmp = api.predictions.neurosymbolic_comparison(platform["id"], prediction_config_id=cfg["id"],
    include_pending=True)   # what-if preview of accepted-but-unapproved rules

Backtest fit The symbolic forecaster tracks the 10-year yield closely one step ahead: corr 0.98 over a walk-forward backtest on real FRED data (1959–2026).

What-if — you supply the macro view Plug your own driver expectations via feature_overrides: setting fed funds to 6% lifts the implied 10-year from 4.32% → 4.45%, and the gate names the 3 rules that fired (e.g. “WHEN CPI is rising AND fed funds is rising → yield up 0.13”). A mid-range reading fires nothing and the forecast holds at baseline — conditional and honest.

Neural vs neuro-symbolic Forecasting 6 months out, the discovered correction rules are A/B-tested and held for expert approval; previewed read-only they lift R² 0.54 → 0.69 and cut RMSE 18%. The chart above tracks all three series month by month — the correction rule fires on ~25% of months (the amber dots); where it fires the neuro-symbolic line is pulled toward actual, otherwise the forecast equals the neural model. At a 1-month horizon the neural model is near-ceiling and discovery keeps no rules — the layer only adds rules that beat the backtest.
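
The gating behaviour described above (adjust the forecast only where a correction rule fires, otherwise keep the neural value) can be sketched as follows. All numbers, the firing pattern, and the adjustment size are made up for illustration; none are taken from the demo.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def apply_correction(neural, fires, adjustment):
    """Add the rule's adjustment only on rows where its condition fires."""
    return [n + adjustment if f else n for n, f in zip(neural, fires)]


# Made-up monthly values, not results from the platform.
actual = [4.0, 4.1, 4.3, 4.2, 4.6, 4.5, 4.4, 4.8]
neural = [4.0, 4.1, 4.1, 4.2, 4.4, 4.5, 4.4, 4.6]
# Hypothetical rule condition: True on the months where the rule fires.
fires = [False, False, True, False, True, False, False, True]

corrected = apply_correction(neural, fires, adjustment=0.15)

print(f"neural RMSE    = {rmse(actual, neural):.3f}")
print(f"corrected RMSE = {rmse(actual, corrected):.3f}")
```

On months where the rule does not fire, the corrected series is identical to the neural one, so any change in error comes only from the rows the rule touched.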

View full demo · 26_neurosymbolic_bond_yield.py →

FRED · explainable

S&P 500 — system-selected macro drivers

25 macro series in, autoregression off — the platform picks what explains the index

# Live-fetch a 25-series monthly macro panel from FRED (rates, inflation, labour, ...)
panel_csv = fetch_fred_panel(api_key)   # resamples daily+monthly to clean monthly
api.datasets.upload(domain_id=domain["id"], file_path=panel_csv)

# Train with autoregression OFF — explain the index through macro, not momentum
cfg = api.predictions.create_config(platform["id"], target_field="SP500",
    time_index_field="date", horizon=1, frequency="monthly",
    autoregressive="none")

# What did the system select?
base = api.predictions.predict(platform["id"], prediction_config_id=cfg["id"])
sf = api.predictions.symbolic_forecast(platform["id"], prediction_config_id=cfg["id"])

System-selected drivers From 25 candidates the platform kept housing starts and corporate credit spreads ahead of the 2-year yield, consumer sentiment, CPI and money supply — a real-economy-and-credit story, not the "rates and the VIX" a hand-picked model would have started with.

Fit Level-space R² ≈ 0.79 with 0% own-history reliance (~118 monthly rows). Honest: the month-to-month change is almost unpredictable, and the small sample is stated. The process is the product.

Rules Readable WHEN→THEN macro rules with contributions — rules you can challenge and govern, not a single number from a black box.

[Chart: S&P 500, macro drivers the platform kept]

View full demo · 22_sp500_macro_forecast.py →

FRED + Coinbase · explainable

Bitcoin — system-selected crypto + macro drivers

ETH + 25 macro series in, autoregression off — is Bitcoin more macro than crypto?

# Upload a bundled monthly panel: BTC + ETH (Coinbase) + 25 US macro series (FRED)
api.datasets.upload(domain_id=domain["id"], file_path="btc_macro_panel.csv")

# Train with autoregression OFF — explain BTC through the panel, not momentum
cfg = api.predictions.create_config(platform["id"], target_field="BTC_USD",
    time_index_field="date", horizon=1, frequency="monthly",
    autoregressive="none")

# What did the system select?
base = api.predictions.predict(platform["id"], prediction_config_id=cfg["id"])
sf = api.predictions.symbolic_forecast(platform["id"], prediction_config_id=cfg["id"])

System-selected drivers From 26 candidates (ETH + 25 macro) the platform kept FEDFUNDS, business loans (BUSLOANS), corporate credit (BAA), ETH, capacity utilisation (TCU), housing starts, and industrial production. The Fed and credit rank ABOVE Ethereum. Bitcoin is more macro than crypto.

AR actually hurts With BTC's own history included (AR=full), R² = 0.57. Without it (AR=none), R² = 0.62. On this sample, Bitcoin's own momentum adds noise that degrades the fit; the macro drivers carry the signal.

Rules 47 readable WHEN→THEN rules — e.g. "WHEN industrial production falls AND consumer credit tightens → BTC up" (flight-to-alternative). 119 monthly rows, 2016–2026, 0% own-history reliance. Discovery honestly accepts 0 correction rules — the macro drivers already explain Bitcoin.

[Chart: Bitcoin, macro drivers the platform kept]

[Chart: Bitcoin, autoregression off improves fit]

View full demo · 21_bitcoin_macro_forecast.py →

FRED · symbolic

GBP/USD — readable rules that beat the random walk

UK + euro + US macro in; a SYMBOLIC-only model out — no neural net, and on cable it beats persistence

# Pull 18 UK + euro + US macro series LIVE from the FRED connector — one call, merged by date
ds = api.datasets.fetch(domain_id=domain["id"], connector_type="fred",
    config={"series_ids": PANEL, "api_key": key, "start_date": "1999-01-01"})

# Forecast cable from a SYMBOLIC-ONLY model: induced WHEN->THEN rules, no neural network
sf = api.predictions.symbolic_forecast(platform["id"], prediction_config_id=cfg["id"])
sf["skill_vs_persistence"]   # +0.19 — beats a last-value baseline on a walk-forward backtest
sf["why"]                    # 20 readable rules: contribution, hit-rate, support

Symbolic skill The rules-only model backtests at +0.19 skill vs persistence on monthly cable — it beats the "next month ≈ this month" random walk that FX almost never lets you beat. The neural model on the same panel scores −0.21. Here the readable symbolic layer isn't just more explainable — it's more accurate.
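
The page does not define how skill vs persistence is computed. One common definition is 1 - MSE_model / MSE_persistence, where persistence forecasts next month as this month's value. A minimal sketch under that assumption, with made-up GBP/USD levels and forecasts:

```python
def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def skill_vs_persistence(series, forecasts):
    """Skill of one-step forecasts relative to a last-value baseline.

    forecasts[i] is the forecast for series[i + 1], made at time i.
    Assumed definition: 1 - MSE_model / MSE_persistence.
    Positive means the model beats "next month = this month".
    """
    targets = series[1:]
    persistence = series[:-1]
    return 1.0 - mse(targets, forecasts) / mse(targets, persistence)


# Made-up monthly levels and one-step forecasts, for illustration only.
series = [1.30, 1.32, 1.29, 1.31, 1.27, 1.28]
forecasts = [1.31, 1.30, 1.30, 1.28, 1.28]

skill = skill_vs_persistence(series, forecasts)
print(f"skill vs persistence = {skill:+.2f}")
```

Under this definition a score of 0 means the model is exactly as good as persistence, and a negative score, like the −0.21 quoted for the neural model, means it is worse.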

What drives sterling From 18 candidates the rules lean on two UK signals — business confidence and the OECD UK leading indicator — with cameos from US & euro 10-year yields and UK industrial production. Deteriorating UK momentum → sterling weaker; improving → stronger. Surfaced from a neutral panel, not hand-picked.

# a sample of the 20 induced rules — each carries a contribution, hit-rate and support count
WHEN UK confidence falling AND UK leading indicator falling   → GBP/USD down 0.022  (hit 0.70, n=33)
WHEN UK confidence 3m-change low AND 12m-deviation low         → GBP/USD down 0.020  (hit 0.77, n=47)
WHEN UK confidence jumps from a depressed level               → GBP/USD up 0.017   (hit 0.86, n=7)
WHEN UK industrial production running above trend              → GBP/USD up 0.006   (hit 0.63, n=65)
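
The hit-rate and n values beside each rule can be read as "how often the move went the rule's way" and "how many months the rule fired". The sketch below computes both for one rule on toy rows. The field names, the data, and this exact definition of hit-rate are assumptions, not the platform's implementation.

```python
def rule_stats(rows, condition, expected_sign):
    """Return (support, hit_rate) for one WHEN -> THEN rule.

    support  = number of rows where the WHEN condition holds
    hit_rate = share of those rows where the next change has the
               direction the rule predicts (assumed definition)
    """
    fired = [r for r in rows if condition(r)]
    support = len(fired)
    if support == 0:
        return 0, None
    hits = sum(1 for r in fired if r["next_change"] * expected_sign > 0)
    return support, hits / support


# Toy monthly rows with hypothetical field names.
rows = [
    {"conf_chg": -0.4, "lead_chg": -0.2, "next_change": -0.02},
    {"conf_chg": -0.1, "lead_chg": -0.3, "next_change": 0.01},
    {"conf_chg": 0.3, "lead_chg": 0.1, "next_change": 0.02},
    {"conf_chg": -0.2, "lead_chg": -0.1, "next_change": -0.01},
    {"conf_chg": 0.2, "lead_chg": -0.2, "next_change": -0.03},
]

# WHEN confidence falling AND leading indicator falling -> GBP/USD down
support, hit_rate = rule_stats(
    rows,
    condition=lambda r: r["conf_chg"] < 0 and r["lead_chg"] < 0,
    expected_sign=-1,
)
print(f"n={support}, hit={hit_rate:.2f}")  # n=3, hit=0.67
```

A high hit-rate on a small n, such as the 0.86 on n=7 above, deserves more caution than a lower hit-rate with broad support.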

Honest by design The point forecast sits at the persistence baseline; the rules form the adjustment band. When no rule's condition is met — as at the latest reading — the forecast holds rather than inventing a move. Skill is a walk-forward backtest over 329 months (1999–2026), not a live trading signal.

View full demo · gbpusd_macro_forecast.py →
