§ 01 · Composite score

The five-layer composite score

src/lib/weighting/composite-score.ts:87-281

When the home page says "5 layers", it means the math that turns four raw agent scores (fundamentals, technical, sentiment, news) into a single composite for ranking. It is not a count of agent stages — those are a different concept entirely. Each layer is a pure function and runs in order. The output is a single number per candidate plus a trace[] that the future "explain this trade" UI can replay.

L0 — Sector z-score

sector-zscore.ts

Each candidate's raw 0–100 agent scores are z-scored against the other tickers in the same sector at the same moment. A "75 fundamentals" score in a sector where the median is 80 becomes a slightly negative z; a 75 where the median is 50 becomes a strong positive z.

Why this exists: during a tech rally every tech name's fundamentals score creeps up. Without L0 a generic "tech is up" lift would dominate the composite and the ranker would just shuffle the sector. Z-scoring against the cohort cancels the regime tide and surfaces the names that are relatively stronger. When the cohort has fewer than three tickers the layer is skipped (composite-score.ts:95-97).
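The cohort z-score and the small-cohort skip can be sketched as follows. This is a minimal illustration, not the contents of sector-zscore.ts: the function name, the use of the cohort mean with a population standard deviation, and the `null` return for a skipped layer are all assumptions.

```typescript
// Hypothetical sketch of the L0 idea. Names and shapes are assumptions.
const MIN_COHORT = 3; // below this the layer is skipped, per the text

/** Z-score one raw 0-100 agent score against its sector cohort. */
function sectorZScore(raw: number, cohort: number[]): number | null {
  // Too few tickers: signal "skipped" so the caller keeps the raw score.
  if (cohort.length < MIN_COHORT) return null;

  const mean = cohort.reduce((sum, x) => sum + x, 0) / cohort.length;
  const variance =
    cohort.reduce((sum, x) => sum + (x - mean) ** 2, 0) / cohort.length;
  const std = Math.sqrt(variance);

  // Every ticker scored the same: nobody is relatively stronger.
  if (std === 0) return 0;
  return (raw - mean) / std;
}
```

A 75 against a cohort of [70, 80, 90] comes out negative, while the same 75 against [40, 50, 60] comes out strongly positive, which is the detrending effect the layer is after.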

L1 — Sector × regime profile

sector-profile.ts

Looks up the weight vector trained for this (sector, regime) cell — e.g. "in biotech during a RISK_OFF regime, fundamentals matter more than sentiment". The lookup falls back to a coarser cell when the exact (sector, regime) row is missing.

Why this exists: the predictive value of each agent dim is regime-dependent. Sentiment agents predict well in momentum regimes and badly in panic regimes. L1 pulls the right prior off the shelf before personalisation.
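A lookup with a coarser fallback might look like the sketch below. The source does not say what the coarser cell is, so the chain used here (exact cell, then sector-only, then a global default) is an assumption, as are the key format, the table name, and every weight value.

```typescript
// Hypothetical sketch of an L1-style lookup. The fallback chain, key format
// and all numbers are assumptions, not the contents of sector-profile.ts.
type Weights = {
  fundamentals: number;
  technical: number;
  sentiment: number;
  news: number;
};

// Keys are "sector|regime"; "*" marks a coarser wildcard cell.
const PROFILE_TABLE = new Map<string, Weights>([
  ["biotech|RISK_OFF", { fundamentals: 0.5, technical: 0.2, sentiment: 0.1, news: 0.2 }],
  ["biotech|*", { fundamentals: 0.4, technical: 0.2, sentiment: 0.2, news: 0.2 }],
  ["*|*", { fundamentals: 0.25, technical: 0.25, sentiment: 0.25, news: 0.25 }],
]);

function lookupProfile(sector: string, regime: string): Weights {
  // Try the exact cell first, then progressively coarser cells.
  const keys = [`${sector}|${regime}`, `${sector}|*`, "*|*"];
  for (const key of keys) {
    const hit = PROFILE_TABLE.get(key);
    if (hit !== undefined) return hit;
  }
  throw new Error("profile table is missing the global default cell");
}
```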

L2 — Bayesian shrinkage

bayesian-shrinkage.ts

Blends each user's own posterior weights (learned from their own closed-trade outcomes) toward the L1 prior. The mix weight is α = nTrades / (nPrior + nTrades) with nPrior = 10. A brand-new user with zero trades sits at α=0 (pure prior). A user with 30 closed trades sits at α=0.75 (mostly their own taste). An optional federated posterior — the average of the whole user base — can be mixed in as a third anchor.

Why this exists: cold starts are noise. Without shrinkage, a user's first three closed trades would yank the weights to extremes and lock in random taste. The prior absorbs the noise, then steps out of the way as evidence accumulates.
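The mix weight is simple enough to check by hand. The sketch below uses the α formula and nPrior = 10 from the text; the linear blend, the function names, and the array representation of the weights are assumptions, and the optional federated anchor is left out.

```typescript
// Sketch of the shrinkage blend. alpha and N_PRIOR come from the text;
// the linear blend and all names are assumptions.
const N_PRIOR = 10;

function shrinkageAlpha(nTrades: number, nPrior: number = N_PRIOR): number {
  return nTrades / (nPrior + nTrades);
}

/** Blend the user's posterior weights toward the L1 prior. */
function shrinkWeights(
  prior: number[],
  userPosterior: number[],
  nTrades: number,
): number[] {
  const alpha = shrinkageAlpha(nTrades);
  // alpha = 0 returns the prior untouched; alpha near 1 returns the user's weights.
  return prior.map((p, i) => (1 - alpha) * p + alpha * (userPosterior[i] ?? p));
}
```

With zero trades `shrinkageAlpha(0)` is 0 and the prior is returned unchanged; with 30 closed trades it is 30 / 40 = 0.75.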

L3 — Black-Litterman tilt

black-litterman.ts

Tilts the L2 posterior by each agent's rolling accuracy in this (sector, regime, horizon) cell. Parameters: τ = 30, maxTilt = 0.30 — no agent can be tilted by more than ±30% even if it has been amazing or terrible recently.

Why this exists: the static prior assumes all agents are equally calibrated. In reality one specific agent may be hitting 65% on biotech in RISK_ON regimes while another is stuck at 48%. L3 amplifies the one that is currently working and damps the one that isn't — capped, so a hot streak can't take over the vote.
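The capped tilt can be illustrated as below. The source gives only τ = 30 and maxTilt = 0.30 and does not say how τ enters the formula, so this sketch assumes τ acts as a pseudo-count that damps the tilt on thin samples. The accuracy shape, the 50% baseline, and the function name are likewise assumptions, not a transcription of black-litterman.ts.

```typescript
// Hypothetical sketch of a capped accuracy tilt. Only TAU and MAX_TILT come
// from the text; the way they combine here is an assumption.
const TAU = 30;
const MAX_TILT = 0.3;

interface AgentAccuracy {
  hitRate: number; // rolling accuracy in [0, 1] for one (sector, regime, horizon) cell
  nObs: number; // number of scored outcomes behind hitRate
}

/** Multiplier applied to one agent's weight; 1 means "no tilt". */
function tiltFactor(acc: AgentAccuracy | undefined): number {
  // Cold cell: no accuracy rows yet, so the layer contributes nothing.
  if (acc === undefined || acc.nObs === 0) return 1;

  const edge = acc.hitRate - 0.5; // ASSUMPTION: 50% is the neutral baseline
  const confidence = acc.nObs / (acc.nObs + TAU); // ASSUMPTION: tau as pseudo-count
  const tilt = Math.max(-MAX_TILT, Math.min(MAX_TILT, 2 * edge * confidence));
  return 1 + tilt;
}
```

Whatever the real combination is, the clamp is the important part: even a perfect recent record cannot push the multiplier beyond 1.30 or below 0.70.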

L4 — Exposure caps + regime tilt

exposure-caps.ts

Multiplies the composite by a soft regime-aware tilt. If the per-sector hard cap is already breached, the composite is also multiplied by 0.5. The candidate isn't filtered out (that decision stays with the caller), but its score is halved. Sells/closes bypass the cap check via bypassCap.

Why this exists: a high-composite name in an already-overweight sector is a worse trade than the same name in an underweight sector. L4 routes that judgement into the score itself rather than relying on a downstream veto. Macro and news dims are explicitly zeroed at the final multiplication step (composite-score.ts:184-198) — they are display-only chips on the dashboard, not vote-carriers.
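The penalty logic reads naturally as a small pure function. In the sketch below the 0.5 multiplier and the bypassCap flag come from the text; the input shape, the field names, and the idea that the regime tilt arrives as a ready-made multiplier are assumptions.

```typescript
// Hypothetical sketch of the L4 step. Field names are assumptions.
interface ExposureInput {
  composite: number;
  regimeTilt: number; // soft multiplier, assumed to be computed upstream
  sectorCapBreached: boolean;
  bypassCap?: boolean; // sells/closes set this to skip the cap check
}

const CAP_BREACH_MULTIPLIER = 0.5; // from the text

function applyExposureCaps(input: ExposureInput): number {
  let score = input.composite * input.regimeTilt;
  // Penalise rather than filter: the caller still sees the candidate.
  if (input.sectorCapBreached && !input.bypassCap) {
    score *= CAP_BREACH_MULTIPLIER;
  }
  return score;
}
```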

L4.5 — Conditional crowding penalty

crowding/conditional-penalty.ts

Optional, AQR/QMJ-style. When crowding signals are supplied (raw crowding, quality z-score, valuation stretch, trend persistence) the composite is multiplied by (1 − MAX_PENALTY × raw_penalty) with MAX_PENALTY = 0.30. For shorts the quality term inverts — a high-quality issuer amplifies squeeze risk rather than damping it.

Why this exists: the scariest names to buy are the ones everyone already owns. L4.5 is the late-stage check that says "yes this scored 92, but everyone is already long, valuation is stretched, momentum is stale, so knock it down by up to 30%". When the inputs aren't available it's a no-op (fail open).
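The multiplication and the fail-open behaviour can be sketched in a few lines. MAX_PENALTY = 0.30 and the formula come from the text; how raw_penalty is derived from the four crowding signals is not described, so this sketch takes it as an already-computed input and assumes it lives in [0, 1].

```typescript
// Hypothetical sketch of the L4.5 multiplier. rawPenalty is assumed to be
// precomputed and normalised to [0, 1]; that range is an assumption.
const MAX_PENALTY = 0.3;

function applyCrowdingPenalty(composite: number, rawPenalty?: number): number {
  // Fail open: with no crowding inputs the score passes through untouched.
  if (rawPenalty === undefined || Number.isNaN(rawPenalty)) return composite;

  const bounded = Math.min(1, Math.max(0, rawPenalty));
  return composite * (1 - MAX_PENALTY * bounded);
}
```

At the maximum penalty a composite of 92 drops to about 64.4; with no signals supplied it stays at 92.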

§ 02 · Agent pipeline

Eighteen agent roles, eight stages

Read the code

src/lib/agents/pipeline-stages.ts:84-112

Two separate pipelines run in production: discovery (find a new name, debate it, execute) and reeval (rescore everything we already hold). They share some agent names (bull_researcher, bear_researcher, consensus, execution_agent) but run on different triggers. The swap (rotation) pipeline is a third path that composes elements of both. Roles are dispatched by canStartAgent(): stage N cannot begin until every required agent in stage N−1 has reported complete.

Discovery — 11 roles across 8 stages

triggered by: scout cycle

1 · market_scout · Screens the universe and picks the candidate. Enforces sector blocks, earnings blackout, and cooldown.
2 ∥ · fundamentals · P/E, growth, debt, cash runway. Runs in parallel with the other three analysts.
2 ∥ · news_analyst · Catalysts, headlines, days-to-event labels (display only; never votes).
2 ∥ · technical · RSI, SMA crossovers, volume surge, chart patterns.
2 ∥ · sentiment_analyst · Social sentiment, short interest, unusual options flow.
3 ∥ · bull_researcher · Argues the upside in plain prose. Parallel with bear.
3 ∥ · bear_researcher · Argues the downside. Both researchers read the full analyst dossier.
4 · consensus · Judges the debate, declares winner + conviction.
5 · trader · Position sizing, entry structure, stock vs option selection.
6 ⚖ · risk_neutral · 15% position cap, sector concentration, macro cash floor. Can veto.
7 ⚖ · portfolio_manager · Sole final authority. Approves or rejects with conviction + horizon. Can veto.
8 · execution_agent · Routes PM-approved orders to Alpaca. Auto-submits profit ladder + stop orders.

⚖ marks veto stages. ∥ marks parallel-within-stage. Stage gating in canStartAgent() ensures no agent in stage N starts before stage N−1 fully reports.
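The gating rule can be sketched as a predicate over a roster and a status map. The real canStartAgent() signature is not shown on this page, so the function below carries a different name on purpose; the types, the `required` flag, and the status values are assumptions.

```typescript
// Hypothetical sketch of stage gating. Not the real canStartAgent().
type RunState = "pending" | "running" | "complete";

interface StageAgent {
  role: string;
  stage: number;
  required: boolean;
}

function canStartSketch(
  agent: StageAgent,
  roster: StageAgent[],
  states: Map<string, RunState>,
): boolean {
  if (agent.stage <= 1) return true; // the first stage has nothing to wait for
  // Every required agent in the previous stage must have reported complete.
  return roster
    .filter((a) => a.stage === agent.stage - 1 && a.required)
    .every((a) => states.get(a.role) === "complete");
}

// Example roster covering the first three discovery stages.
const discoveryRoster: StageAgent[] = [
  { role: "market_scout", stage: 1, required: true },
  { role: "fundamentals", stage: 2, required: true },
  { role: "technical", stage: 2, required: true },
  { role: "bull_researcher", stage: 3, required: true },
];
```

Agents in the same stage never block each other under this rule, which is what lets the four analysts and the two researchers run in parallel.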

Reeval — 4 additional roles

triggered by: cron + manual

The reeval pipeline rescores everything currently held and proposes trims, exits, and rotations. Source: src/lib/reeval-pipeline.ts.

a · macro_analyst · Threat-level scan (LOW → CRITICAL). Cached between runs; refreshes on a fixed interval.
b · position_sentinel · Per-holding catalyst scan + deep-dive on the flagged ones. Emits the forward-return forecast (see §04).
c · briefing_officer · Per-holding briefing dossier. Surfaces rotation candidates when a held position weakens.
d · reeval_pm · Final reevaluation verdict: HOLD / TRIM / EXIT / ROTATE, with execution timing (immediate, 1h, 4h, EOD, next session).

Trim debates re-use bull_researcher, bear_researcher, consensus; exits route through the same execution_agent.

Add the three swap-only roles — head_to_head_pm, risk_neutral_swap, risk_validator — and you get the full count of eighteen distinct agent roles across the production runtime. See §05 for how those three compose with the discovery agents to execute a rotation.

§ 03 · PM cockpit

The PM bot — your chat-driven cockpit

Read the code

src/lib/pm-chat/

The PM is not a chatbot wrapped around a portfolio. It is the operator interface to the whole desk. Twenty-six named action tools place real orders, drive cycles, change configuration, swap LLM models, and schedule rotations. Thirteen math tools answer quantitative questions deterministically (no LLM-faking-numbers). When neither category fits, a Python execution fallback runs arbitrary calculations in a sandbox. Every destructive action returns a requiresConfirmation envelope so the operator gets a second look before anything fires.
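One way to picture the confirmation envelope is as a discriminated union that the chat layer must resolve before anything fires. Only the requiresConfirmation field name comes from this page; the other fields, the dispatcher, and the choice of which tools count as destructive are hypothetical.

```typescript
// Hypothetical sketch of a confirmation envelope. Only the field name
// requiresConfirmation is taken from the text.
type ActionResult =
  | { requiresConfirmation: true; summary: string; confirmToken: string }
  | { requiresConfirmation: false; result: string };

// ASSUMPTION: an illustrative subset of tools treated as destructive.
const DESTRUCTIVE_TOOLS = new Set(["buy", "sell", "trim", "kill_engine"]);

function dispatchAction(tool: string, confirmed: boolean): ActionResult {
  if (DESTRUCTIVE_TOOLS.has(tool) && !confirmed) {
    // First pass: describe what would happen and wait for the operator.
    return {
      requiresConfirmation: true,
      summary: `About to run ${tool}`,
      confirmToken: `${tool}:pending`,
    };
  }
  return { requiresConfirmation: false, result: `${tool} executed` };
}
```

Because the union is discriminated on requiresConfirmation, a caller cannot read `result` without first checking that no confirmation is pending.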

26 action tools

action-tools/

Trade · 5

buy · sell · trim · cancel_order · suggest_buy_size

Cycle · 6

pause_engine · unpause_engine · run_cycle · kill_engine · force_restart_cycle · analyze_candidate

Config · 8

set_max_position_pct · set_stop_loss · set_instrument · set_sector_focus · set_conviction_gate · toggle_options_engine · set_options_allocation_pct · set_options_style

Model · 2

switch_cycle_model · switch_pm_model

Rotation · 5

propose_swap · schedule_reeval · cancel_scheduled_reeval · place_swap_with_tranches · execute_pending_rotation

13 math tools

math-tools/

Pure functions. The LLM emits the call; deterministic code produces the number; the model only narrates. Prevents the "what's my Sharpe" hallucination problem.

calc_sharpe · risk-adjusted return
calc_sortino · downside-only Sharpe
calc_drawdown · peak-to-trough
trade_stats · win-rate, avg R
position_sizing · qty from $ + price
concentration_metrics · top-N weight
sector_exposure · % per sector
correlation_matrix · cross-symbol ρ
portfolio_beta · weighted β vs SPY
var_cvar · tail-loss expectation
tax_exposure · realised + unrealised
compounding_projection · forward curve
reconcile_equity · book vs broker
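To show what "deterministic code produces the number" means in practice, here is a minimal peak-to-trough drawdown in the spirit of calc_drawdown. It is a sketch under assumptions: the real tool's name, inputs, and return shape are not documented on this page, and the fractional return value is a choice made here.

```typescript
// Hypothetical sketch of a pure math tool: maximum peak-to-trough drawdown.
// Input is an equity curve; output is a fraction (0.25 means a 25% drawdown).
function maxDrawdown(equityCurve: number[]): number {
  let peak = -Infinity;
  let worst = 0;
  for (const value of equityCurve) {
    if (value > peak) peak = value; // new high-water mark
    if (peak > 0) {
      worst = Math.max(worst, (peak - value) / peak);
    }
  }
  return worst;
}
```

For the curve [100, 120, 90, 130, 117] the worst drop is from 120 to 90, so the function returns 0.25; the later dip from 130 to 117 is only 0.10.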

Python execution fallback

python-exec/

When the operator asks something the 13 named math tools don't cover — "given my last 50 trades, fit an exponential to the cumulative R curve and tell me when I'll double" — the PM drops to Python. Primary path: Anthropic's hosted code-execution tool (anthropic-code-exec.ts). Fallback: E2B sandbox (e2b-fallback.ts). Output is returned as a tool-result block the model can quote.

Operator memories

Long-term preferences ("never trade XOM", "default size for biotech = 3 % of NAV", "always check macro threat before raising the conviction gate") are stored per-user and prepended to every chat. The PM applies them implicitly — no need to re-state your taste every session.

Example prompts

"Buy 2.5 % of NAV in NVDA, but only if your concentration check passes."
→ suggest_buy_size → concentration_metrics → buy (with requiresConfirmation)
"What's my Sortino over the last 90 days and which trade hurt me most?"
→ calc_sortino + trade_stats, the PM narrates the worst-R trade with rationale + risk-summary pulled from the audit row.
"Pause the engine until tomorrow's open, then run one cycle and switch the PM to Sonnet for it."
→ pause_engine → schedule_reeval + switch_pm_model → run_cycle (queued).
"Propose a swap from AXS to V — risk-aware, account for my margin."
→ propose_swap spins up the head-to-head PM, the LLM risk-neutral stage, and the deterministic risk-validator (see §05). 1:1 swaps with net-concentration-delta ≤ 0 pass even from over-cap positions.
"If I keep compounding at this rate, how big is the book at year end? Fit a curve."
→ compounding_projection for the baseline, then Python fallback for the curve fit.

§ 04 · Position sentinel

Forward-return forecast (μ, σ, conf, bias)

Read the code

src/lib/forecast/forward-return.ts

The line on the dashboard that reads 30D forecast: +4.8 % ±4.5 % · conf 0.74 is not vibes. It's a deterministic synthesis from four primitives (composite score, direction, recent residual mean, and confidence), extrapolated along a 30-trading-day anchor and √t-scaled to any other horizon. Pure function, no I/O, returns a discriminated { ok: true | false } so callers fail-soft on missing inputs rather than fabricating.

The math (preserved verbatim from synthesize-distribution.ts)

μ_anchor = ((compositeScore − 50) / 50) × 10
           + 1  if direction = up   |   −1  if direction = down
           + 0.5 × clamp(recentResidualMeanPct, −2, +2)

σ_anchor = 6 − 2 × confidence              (range 4 % – 6 %)

μ_h      = μ_anchor · (h / 30)             linear in horizon h
σ_h      = σ_anchor · √(h / 30)            √t-scaled to horizon h

μ (mid): the expected return percentage. Composite drives the bulk of it; direction adds a ±1 % nudge; recent residual error adds a tiny mean-reversion bias.
σ (spread): shrinks as input confidence rises, but never below 4 % at the 30-day anchor.
Output confidence: re-derived separately from the synthesizer's input confidence. It's the mean of per-signal availability flags, so the UI can suppress forecasts on stub inputs (no price history, no fundamentals, no catalyst).
Bias: the directional sign applied to μ_anchor — visible on the dashboard so the operator can see which way the forecast is leaning before reading the number.
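The formulas above translate almost line for line into a pure function. The sketch below follows the stated math for μ and σ; the type names, the field names on the result, the clamp of confidence to [0, 1], and the wording of the failure reason are assumptions, and the separately derived output confidence is not modelled.

```typescript
// Sketch of the stated forecast math. Types and names are assumptions.
type Direction = "up" | "down";

interface ForecastInput {
  compositeScore: number; // 0-100
  direction: Direction;
  recentResidualMeanPct: number;
  confidence: number; // input confidence, assumed to lie in [0, 1]
}

type Forecast =
  | { ok: true; muPct: number; sigmaPct: number }
  | { ok: false; reason: string };

const ANCHOR_DAYS = 30;
const clampTo = (x: number, lo: number, hi: number): number =>
  Math.min(hi, Math.max(lo, x));

function synthesizeForecast(
  input: Partial<ForecastInput>,
  horizonDays: number = ANCHOR_DAYS,
): Forecast {
  const { compositeScore, direction, recentResidualMeanPct, confidence } = input;
  if (
    compositeScore === undefined ||
    direction === undefined ||
    recentResidualMeanPct === undefined ||
    confidence === undefined
  ) {
    return { ok: false, reason: "missing input" }; // fail soft, never fabricate
  }
  const muAnchor =
    ((compositeScore - 50) / 50) * 10 +
    (direction === "up" ? 1 : -1) +
    0.5 * clampTo(recentResidualMeanPct, -2, 2);
  const sigmaAnchor = 6 - 2 * clampTo(confidence, 0, 1); // stays within 4-6
  const t = horizonDays / ANCHOR_DAYS;
  return { ok: true, muPct: muAnchor * t, sigmaPct: sigmaAnchor * Math.sqrt(t) };
}
```

As a worked case, a composite of 69 with direction up, zero residual, and input confidence 0.74 gives μ = 3.8 + 1 = 4.8 and σ = 6 − 1.48 = 4.52 at the 30-day anchor.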

Until 2026-05-18 this math only ran on ranked discovery candidates. The Position Sentinel now rescores every held position too, so the same primitives drive the per-holding cards on the dashboard.

§ 05 · Swap pipeline

Head-to-head PM → risk-neutral → risk-validator → execution

Read the code

src/lib/agents/rotation-orchestrator.ts:893-1708

A swap is not a sell-then-buy. It's a single atomic proposal — "exit sell-leg, enter buy-leg" — that has to pass two risk gates back-to-back, with the PM deciding first instead of last. The ordering inverts the discovery pipeline because the question being asked is different: discovery asks "is this name worth holding", rotation asks "is this name worth holding more than the one we already hold".

Order

1. market_scout · screens the buy-leg as a replacement candidate (same fundamentals/news/technical/sentiment fan-in).
2. bull_researcher + bear_researcher + consensus · bull-vs-bear review on the swap, not on the buy-leg in isolation.
3. trader · sizes the swap (1:1 by default, but trims/adds are allowed).
4. head_to_head_pm · decides first. Reads both legs side-by-side; returns SWAP / KEEP / SWAP_DIFFERENT_BUY with a reasoned verdict. Can veto.
5. risk_neutral_swap · LLM risk pass mirroring the discovery risk_neutral stage. Soft veto: REJECT halts the swap.
6. risk_validator · deterministic. Concentration policy, sector caps, margin awareness. Hard veto: ABORT halts the swap. Must approve.
7. execution_agent · executes in tranches, market-hours-aware, verify-or-cancel at next open.

Concentration policy (the AXS → V fix)

A 1:1 swap from an over-cap position has zero net concentration delta. Treating it as a fresh entry would block the swap permanently. The policy in concentration-policy.ts buckets the request: new entry → hard cap; rebalance with netDelta ≤ 0 → approve; rebalance with netDelta > 0 → size-down recommendation. Margin context is honoured: when marginContext.enabled && multiplier ≥ 2, the effective cap uses buying power as a second, looser bound.
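The three buckets map onto a very small decision function. The sketch below mirrors the bucketing described above; the type names and verdict labels are hypothetical, and the margin-aware effective cap is deliberately left out because its exact formula is not given here.

```typescript
// Hypothetical sketch of the request bucketing. Names are assumptions and
// the margin-aware bound is omitted.
type PolicyVerdict = "HARD_CAP" | "APPROVE" | "SIZE_DOWN";

interface ConcentrationRequest {
  kind: "new_entry" | "rebalance";
  netDelta: number; // change in concentration if the request executes
}

function bucketRequest(req: ConcentrationRequest): PolicyVerdict {
  if (req.kind === "new_entry") return "HARD_CAP"; // fresh entries face the hard cap
  // A rebalance that does not add concentration passes, even from over-cap.
  return req.netDelta <= 0 ? "APPROVE" : "SIZE_DOWN";
}
```

A 1:1 swap arrives as a rebalance with netDelta of 0, so it is approved rather than being measured against the cap like a new position.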

Tranches + market hours + verify-or-cancel

Large swaps split into tranches via rotation-tranches.ts. When the market is closed, the rotation is queued; when it opens, the executor places the first tranche, then verifies the fill at the next price-check. Unfilled orders cancel rather than chase the tape.

§ 06 · Honest gaps

What's NOT real

Four surfaces are wired but not fully fed. The page above is the fiction-free version; this section is the rest of it.

Correlation matrix needs price history
The correlation_matrix math tool computes pairwise ρ across symbols you supply, but a fully populated per-symbol price series isn't routinely cached. When history is missing the tool fails soft and the PM tells you so. Wiring an opportunistic backfill is on the list.
L3 Black-Litterman runs as a no-op on cold cells
Until enough agent_accuracy_rolling rows accumulate for a given (sector, regime, horizon) cell, the L3 tilt is zero. The layer is still in the pipeline — it just contributes nothing to the composite for those cells. As the trade log grows, more cells light up. Cited at composite-score.ts:134-156.
Federated posterior is opt-in
The L2 shrinkage layer accepts an optional third anchor — the average posterior across the whole user base. The plumbing works; the feeder that aggregates per-user posteriors into a federated row runs offline. Cold-start users currently see the sector-profile prior only.
L0 z-score skips small cohorts
When a sector has fewer than three tickers in the cohort at decision time, the L0 layer is skipped (composite-score.ts:95-97) and the composite uses raw 0–100 scores into L1. This is intentional — a 1-ticker z-score is meaningless — but it does mean thinly-covered sectors briefly lose the regime-detrend.

If any of the above changes status, this page must move first. The code is the contract; this page is its mirror.