§ 01 · Composite score
The five-layer composite score
Read the code: src/lib/weighting/composite-score.ts:87-281
When the home page says "5 layers", it means the math that turns four raw agent scores (fundamentals, technical, sentiment, news) into a single composite for ranking. It is not a count of agent stages — those are a different concept entirely. Each layer is a pure function and runs in order. The output is a single number per candidate plus a trace[] that the future "explain this trade" UI can replay.
L0 — Sector z-score
sector-zscore.ts
Each candidate's raw 0–100 agent scores are z-scored against the other tickers in the same sector at the same moment. A "75 fundamentals" score in a sector where the median is 80 becomes a slightly negative z; a 75 where the median is 50 becomes a strong positive z.
Why this exists: during a tech rally every tech name's fundamentals score creeps up. Without L0 a generic "tech is up" lift would dominate the composite and the ranker would just shuffle the sector. Z-scoring against the cohort cancels the regime tide and surfaces the names that are relatively stronger. When the cohort has fewer than three tickers the layer is skipped (composite-score.ts:95-97).
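A minimal sketch of the L0 idea, assuming a population standard deviation and one agent dimension at a time. The function name, the constant name, and the zero-variance handling are illustrative assumptions, not the contents of sector-zscore.ts.

```typescript
// Hypothetical sketch of L0. Only the "fewer than three tickers" skip rule comes from the page.
const MIN_COHORT_SIZE = 3;

/** Z-scores one agent dimension across a sector cohort; returns null when L0 is skipped. */
function sectorZScores(rawScores: number[]): number[] | null {
  if (rawScores.length < MIN_COHORT_SIZE) return null; // caller keeps the raw 0-100 scores
  const mean = rawScores.reduce((sum, x) => sum + x, 0) / rawScores.length;
  const variance =
    rawScores.reduce((sum, x) => sum + (x - mean) * (x - mean), 0) / rawScores.length;
  const std = Math.sqrt(variance);
  if (std === 0) return rawScores.map(() => 0); // assumption: identical scores carry no relative signal
  return rawScores.map((x) => (x - mean) / std);
}

// The same raw 75 lands on opposite sides of zero depending on the cohort.
const strongCohort = sectorZScores([75, 80, 85]); // 75 sits below the cohort mean
const weakCohort = sectorZScores([75, 50, 25]);   // 75 sits above the cohort mean
console.log(strongCohort, weakCohort);
```

Returning null rather than an array of zeros keeps "layer skipped" distinguishable from "everyone scored the same", which matters if the trace[] is meant to be replayed later.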
L1 — Sector × regime profile
sector-profile.ts
Looks up the weight vector trained for this (sector, regime) cell, e.g. "in biotech during a RISK_OFF regime, fundamentals matter more than sentiment". The lookup falls back to a coarser cell when the exact (sector, regime) row is missing.
Why this exists: the predictive value of each agent dim is regime-dependent. Sentiment agents predict well in momentum regimes and badly in panic regimes. L1 pulls the right prior off the shelf before personalisation.
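The fallback lookup can be sketched as follows. The page only says the lookup falls back to "a coarser cell"; the specific order used here (exact cell, then sector-only, then a global default), the key format, and the weight values are all assumptions made for illustration.

```typescript
type WeightVector = Record<string, number>;

// Hypothetical fallback order: exact (sector, regime), then sector-only, then global.
function lookupProfile(
  profiles: Map<string, WeightVector>,
  sector: string,
  regime: string,
): WeightVector | undefined {
  const candidateKeys = [`${sector}|${regime}`, `${sector}|*`, "*|*"];
  for (const key of candidateKeys) {
    const row = profiles.get(key);
    if (row !== undefined) return row; // first (most specific) hit wins
  }
  return undefined; // no prior available at any granularity
}

// Illustrative weights only; they are not trained values.
const profileTable = new Map<string, WeightVector>([
  ["biotech|RISK_OFF", { fundamentals: 0.5, technical: 0.3, sentiment: 0.2 }],
  ["biotech|*", { fundamentals: 0.4, technical: 0.3, sentiment: 0.3 }],
]);

console.log(lookupProfile(profileTable, "biotech", "RISK_ON")); // falls back to the sector-only row
```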
L2 — Bayesian shrinkage
bayesian-shrinkage.ts
Blends each user's own posterior weights (learned from their own closed-trade outcomes) toward the L1 prior. The mix weight is α = nTrades / (nPrior + nTrades) with nPrior = 10. A brand-new user with zero trades sits at α = 0 (pure prior). A user with 30 closed trades sits at α = 0.75 (mostly their own taste). An optional federated posterior (the average of the whole user base) can be mixed in as a third anchor.
Why this exists: cold starts are noise. Without shrinkage, a user's first three closed trades would yank the weights to extremes and lock in random taste. The prior absorbs the noise, then steps out of the way as evidence accumulates.
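The α formula above is small enough to check by hand. The sketch below reproduces it and applies a plain linear blend per weight; the linear blend and the function names are assumptions, and the optional federated anchor is left out.

```typescript
const N_PRIOR = 10; // from the text above

/** Mix weight: alpha = nTrades / (nPrior + nTrades). */
function shrinkageAlpha(nTrades: number, nPrior: number = N_PRIOR): number {
  return nTrades / (nPrior + nTrades);
}

/** Assumed linear blend of one weight: alpha * own + (1 - alpha) * prior. */
function shrinkWeight(priorWeight: number, ownWeight: number, nTrades: number): number {
  const alpha = shrinkageAlpha(nTrades);
  return alpha * ownWeight + (1 - alpha) * priorWeight;
}

console.log(shrinkageAlpha(0));  // 0: pure prior
console.log(shrinkageAlpha(30)); // 0.75: mostly the user's own posterior
console.log(shrinkWeight(0.25, 0.45, 30)); // a prior of 0.25 pulled most of the way to 0.45
```

Note how slowly the prior lets go early on: after three closed trades α is only 3 / 13 ≈ 0.23, which is the cold-start damping the paragraph above describes.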
L3 — Black-Litterman tilt
black-litterman.ts
Tilts the L2 posterior by each agent's rolling accuracy in this (sector, regime, horizon) cell. Parameters: τ = 30, maxTilt = 0.30, so no agent can be tilted by more than ±30% even if it has been amazing or terrible recently.
Why this exists: the static prior assumes all agents are equally calibrated. In reality one specific agent may be hitting 65% on biotech in RISK_ON regimes while another is stuck at 48%. L3 amplifies the one that is currently working and damps the one that isn't — capped, so a hot streak can't take over the vote.
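Only the cap is sketched here. How the raw tilt is derived from rolling accuracy, and the role τ plays in that, are not described on this page, so the raw tilt is taken as an input. Treating the tilt as a multiplicative adjustment is also an assumption.

```typescript
const MAX_TILT = 0.30; // from the text above

/**
 * Applies a capped tilt to one agent weight.
 * rawTilt is null for cold cells, where L3 contributes nothing.
 */
function applyCappedTilt(weight: number, rawTilt: number | null): number {
  if (rawTilt === null) return weight; // no accuracy history yet: leave the L2 weight alone
  const tilt = Math.min(MAX_TILT, Math.max(-MAX_TILT, rawTilt)); // clamp to [-0.30, +0.30]
  return weight * (1 + tilt); // assumption: the tilt scales the weight multiplicatively
}

console.log(applyCappedTilt(0.4, 0.9));  // a hot streak is capped at +30%
console.log(applyCappedTilt(0.4, null)); // cold cell: unchanged
```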
L4 — Exposure caps + regime tilt
exposure-caps.ts
Multiplies the composite by a soft regime-aware tilt. If the per-sector hard cap is already breached, the composite is also multiplied by 0.5 (the candidate isn't filtered out; the caller decides, but the score is heavily penalised). Sells/closes bypass the cap check via bypassCap.
Why this exists: a high-composite name in an already-overweight sector is a worse trade than the same name in an underweight sector. L4 routes that judgement into the score itself rather than relying on a downstream veto. Macro and news dims are explicitly zeroed at the final multiplication step (composite-score.ts:184-198) — they are display-only chips on the dashboard, not vote-carriers.
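The multiplication described above can be sketched like this. The input shape and field names are hypothetical (only bypassCap and the 0.5 multiplier are named on the page), and the regime tilt is taken as a precomputed number because its derivation is not shown here.

```typescript
// Hypothetical input shape for the L4 step.
interface ExposureInput {
  composite: number;
  regimeTilt: number;         // soft regime-aware multiplier, computed elsewhere
  sectorCapBreached: boolean; // is the per-sector hard cap already breached?
  bypassCap?: boolean;        // true for sells/closes
}

const CAP_BREACH_MULTIPLIER = 0.5; // from the text above

function applyExposureCaps(input: ExposureInput): number {
  const tilted = input.composite * input.regimeTilt;
  const penalise = input.sectorCapBreached && input.bypassCap !== true;
  // The candidate is never dropped here; the caller decides what to do with a penalised score.
  return penalise ? tilted * CAP_BREACH_MULTIPLIER : tilted;
}

console.log(applyExposureCaps({ composite: 80, regimeTilt: 1, sectorCapBreached: true }));
console.log(applyExposureCaps({ composite: 80, regimeTilt: 1, sectorCapBreached: true, bypassCap: true }));
```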
L4.5 — Conditional crowding penalty
crowding/conditional-penalty.ts
Optional, AQR/QMJ-style. When crowding signals are supplied (raw crowding, quality z-score, valuation stretch, trend persistence) the composite is multiplied by (1 − MAX_PENALTY × raw_penalty) with MAX_PENALTY = 0.30. For shorts the quality term inverts: a high-quality issuer amplifies squeeze risk rather than damping it.
Why this exists: the scariest names to buy are the ones everyone already owns. L4.5 is the late-stage check that says "yes this scored 92, but everyone is already long, valuation is stretched, momentum is stale — knock it down 30%". When the inputs aren't available it's a no-op (fail open).
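A sketch of the final multiplication only. How raw_penalty is built from the four crowding signals is not covered on this page, so it is taken as an input; clamping it to [0, 1] is an assumption that keeps the knock-down within the stated 30% ceiling.

```typescript
const MAX_PENALTY = 0.30; // from the text above

/** Multiplies the composite by (1 - MAX_PENALTY * rawPenalty); a missing penalty is a no-op. */
function applyCrowdingPenalty(composite: number, rawPenalty?: number): number {
  if (rawPenalty === undefined) return composite; // fail open when crowding inputs are unavailable
  const bounded = Math.min(1, Math.max(0, rawPenalty)); // assumption: raw penalty lives in [0, 1]
  return composite * (1 - MAX_PENALTY * bounded);
}

console.log(applyCrowdingPenalty(92));    // no crowding inputs: unchanged
console.log(applyCrowdingPenalty(92, 1)); // fully crowded: knocked down by 30%
```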
§ 02 · Agent pipeline
Eighteen agent roles, eight stages
Read the code: src/lib/agents/pipeline-stages.ts:84-112
Two separate pipelines run in production: discovery (find a new name, debate it, execute) and reeval (rescore everything we already hold). They share some agent names (bull_researcher, bear_researcher, consensus, execution_agent) but run on different triggers. The swap (rotation) pipeline is a third path that composes elements of both. Roles are dispatched by canStartAgent(): stage N cannot begin until every required agent in stage N−1 has reported complete.
Discovery — 11 roles across 8 stages
triggered by: scout cycle
- 1 market_scout · Screens the universe and picks the candidate. Enforces sector blocks, earnings blackout, and cooldown.
- 2 ∥ fundamentals · P/E, growth, debt, cash runway. Runs in parallel with the other three analysts.
- 2 ∥ news_analyst · Catalysts, headlines, days-to-event labels (display only; never votes).
- 2 ∥ technical · RSI, SMA crossovers, volume surge, chart patterns.
- 2 ∥ sentiment_analyst · Social sentiment, short interest, unusual options flow.
- 3 ∥ bull_researcher · Argues the upside in plain prose. Parallel with bear.
- 3 ∥ bear_researcher · Argues the downside. Both researchers read the full analyst dossier.
- 4 consensus · Judges the debate, declares winner + conviction.
- 5 trader · Position sizing, entry structure, stock vs option selection.
- 6 ⚖ risk_neutral · 15% position cap, sector concentration, macro cash floor. Can veto.
- 7 ⚖ portfolio_manager · Sole final authority. Approves or rejects with conviction + horizon. Can veto.
- 8 execution_agent · Routes PM-approved orders to Alpaca. Auto-submits profit ladder + stop orders.
⚖ marks veto stages. ∥ marks parallel-within-stage. Stage gating in canStartAgent() ensures no agent in stage N starts before stage N−1 fully reports.
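The stage-gating rule can be sketched as below. The record shape, the status values, and the function signature are assumptions for illustration; the real canStartAgent() lives in pipeline-stages.ts and may differ.

```typescript
type AgentStatus = "pending" | "running" | "complete";

// Hypothetical per-run record for one agent role.
interface StageAgent {
  role: string;
  stage: number;     // 1-based stage index
  required: boolean;
  status: AgentStatus;
}

/** True when the role is still pending and every required agent in the previous stage is complete. */
function canStartAgent(role: string, agents: StageAgent[]): boolean {
  const target = agents.find((a) => a.role === role);
  if (target === undefined || target.status !== "pending") return false;
  return agents
    .filter((a) => a.stage === target.stage - 1 && a.required)
    .every((a) => a.status === "complete"); // stage 1 has no predecessors, so it may start
}

const discoveryRun: StageAgent[] = [
  { role: "market_scout", stage: 1, required: true, status: "complete" },
  { role: "fundamentals", stage: 2, required: true, status: "complete" },
  { role: "sentiment_analyst", stage: 2, required: true, status: "running" },
  { role: "bull_researcher", stage: 3, required: true, status: "pending" },
];

console.log(canStartAgent("bull_researcher", discoveryRun)); // false: sentiment_analyst is still running
```

Agents inside the same stage never gate each other in this sketch, which is what lets the ∥ stages fan out in parallel.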
Reeval — 4 additional roles
triggered by: cron + manual
The reeval pipeline rescores everything currently held and proposes trims, exits, and rotations. Source: src/lib/reeval-pipeline.ts.
- a macro_analyst · Threat-level scan (LOW → CRITICAL). Cached between runs; refreshes on a fixed interval.
- b position_sentinel · Per-holding catalyst scan + deep-dive on the flagged ones. Emits the forward-return forecast (see §04).
- c briefing_officer · Per-holding briefing dossier. Surfaces rotation candidates when a held position weakens.
- d reeval_pm · Final reevaluation verdict: HOLD / TRIM / EXIT / ROTATE, with execution timing (immediate, 1h, 4h, EOD, next session).
Trim debates re-use bull_researcher, bear_researcher, consensus; exits route through the same execution_agent.
Add the three swap-only roles — head_to_head_pm, risk_neutral_swap, risk_validator — and you get the full count of eighteen distinct agent roles across the production runtime. See §05 for how those three compose with the discovery agents to execute a rotation.
§ 03 · PM cockpit
The PM bot — your chat-driven cockpit
Read the code: src/lib/pm-chat/
The PM is not a chatbot wrapped around a portfolio. It is the operator interface to the whole desk. Twenty-six named action tools place real orders, drive cycles, change configuration, swap LLM models, and schedule rotations. Thirteen math tools answer quantitative questions deterministically (no LLM-faking-numbers). When neither category fits, a Python execution fallback runs arbitrary calculations in a sandbox. Every destructive action returns a requiresConfirmation envelope so the operator gets a second look before anything fires.
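One way to picture the confirmation envelope is a discriminated union keyed on requiresConfirmation. Only that flag name comes from this page; the other fields, the dispatcher, and the choice of which tools count as destructive are hypothetical.

```typescript
// Hypothetical envelope shape; only the requiresConfirmation flag is named on this page.
type ActionEnvelope =
  | { requiresConfirmation: true; action: string; preview: string }
  | { requiresConfirmation: false; action: string; result: string };

// Illustrative subset; the real list of destructive tools is not given here.
const DESTRUCTIVE_ACTIONS = new Set<string>(["buy", "sell", "trim", "kill_engine"]);

function dispatchAction(action: string, confirmed: boolean): ActionEnvelope {
  if (DESTRUCTIVE_ACTIONS.has(action) && !confirmed) {
    // Nothing fires yet: the operator gets a preview and must confirm.
    return { requiresConfirmation: true, action, preview: `About to run ${action}` };
  }
  return { requiresConfirmation: false, action, result: `${action} executed` };
}

const firstPass = dispatchAction("buy", false);
if (firstPass.requiresConfirmation) {
  console.log(firstPass.preview); // narrowed to the confirmation variant
}
```

Keying the union on a boolean literal lets the compiler force callers to handle the "not yet executed" case before they can read a result.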
26 action tools
action-tools/
- buy · sell · trim · cancel_order · suggest_buy_size
- pause_engine · unpause_engine · run_cycle · kill_engine · force_restart_cycle · analyze_candidate
- set_max_position_pct · set_stop_loss · set_instrument · set_sector_focus · set_conviction_gate · toggle_options_engine · set_options_allocation_pct · set_options_style
- switch_cycle_model · switch_pm_model
- propose_swap · schedule_reeval · cancel_scheduled_reeval · place_swap_with_tranches · execute_pending_rotation
13 math tools
math-tools/
Pure functions. The LLM emits the call; deterministic code produces the number; the model only narrates. Prevents the "what's my Sharpe" hallucination problem.
- calc_sharpe · risk-adjusted return
- calc_sortino · downside-only Sharpe
- calc_drawdown · peak-to-trough
- trade_stats · win-rate, avg R
- position_sizing · qty from $ + price
- concentration_metrics · top-N weight
- sector_exposure · % per sector
- correlation_matrix · cross-symbol ρ
- portfolio_beta · weighted β vs SPY
- var_cvar · tail-loss expectation
- tax_exposure · realised + unrealised
- compounding_projection · forward curve
- reconcile_equity · book vs broker
Python execution fallback
python-exec/
When the operator asks something the 13 named math tools don't cover (for example, "given my last 50 trades, fit an exponential to the cumulative R curve and tell me when I'll double"), the PM drops to Python. Primary path: Anthropic's hosted code-execution tool (anthropic-code-exec.ts). Fallback: E2B sandbox (e2b-fallback.ts). Output is returned as a tool-result block the model can quote.
Operator memories
Long-term preferences ("never trade XOM", "default size for biotech = 3 % of NAV", "always check macro threat before raising the conviction gate") are stored per-user and prepended to every chat. The PM applies them implicitly — no need to re-state your taste every session.
Example prompts
- "Buy 2.5 % of NAV in NVDA, but only if your concentration check passes." → suggest_buy_size → concentration_metrics → buy (with requiresConfirmation)
- "What's my Sortino over the last 90 days and which trade hurt me most?" → calc_sortino + trade_stats; the PM narrates the worst-R trade with rationale + risk-summary pulled from the audit row.
- "Pause the engine until tomorrow's open, then run one cycle and switch the PM to Sonnet for it." → pause_engine → schedule_reeval + switch_pm_model → run_cycle (queued).
- "Propose a swap from AXS to V, risk-aware, account for my margin." → propose_swap spins up the head-to-head PM, the LLM risk-neutral stage, and the deterministic risk-validator (see §05). 1:1 swaps with net-concentration-delta ≤ 0 pass even from over-cap positions.
- "If I keep compounding at this rate, how big is the book at year end? Fit a curve." → compounding_projection for the baseline, then Python fallback for the curve fit.
§ 04 · Position sentinel
Forward-return forecast (μ, σ, conf, bias)
Read the code: src/lib/forecast/forward-return.ts
The line on the dashboard that reads 30D forecast: +4.8 % ±3.5 % · conf 0.74 is not vibes. It's a deterministic synthesis from four primitives — composite score, direction, recent residual mean, and confidence — extrapolated along a 30-trading-day anchor and √t-scaled to any other horizon. Pure function, no I/O, returns a discriminated { ok: true | false } so callers fail-soft on missing inputs rather than fabricating.
The math (preserved verbatim from synthesize-distribution.ts)
μ_anchor = ((compositeScore − 50) / 50) × 10
         + 1 if direction = up | −1 if direction = down
         + 0.5 × clamp(recentResidualMeanPct, −2, +2)
σ_anchor = 6 − 2 × confidence          (range 4 % – 6 %)
μ_h = μ_anchor · (h / 30)              (linear in horizon h)
σ_h = σ_anchor · √(h / 30)             (√t-scaled to horizon h)
μ (mid): the expected return percentage. Composite drives the bulk of it; direction adds a ±1 % nudge; recent residual error adds a tiny mean-reversion bias.
σ (spread): shrinks as input confidence rises, but never below 4 % at the 30-day anchor.
Output confidence: re-derived separately from the synthesizer's input confidence. It is the mean of per-signal availability flags, so the UI can suppress forecasts on stub inputs (no price history, no fundamentals, no catalyst).
Bias: the directional sign applied to μ_anchor — visible on the dashboard so the operator can see which way the forecast is leaning before reading the number.
Until 2026-05-18 this math only ran on ranked discovery candidates. The Position Sentinel rescores every held position too — so the same primitives now drive the per-holding cards on the dashboard.
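The μ and σ formulas in this section translate almost line for line into a pure function. The sketch below is not a copy of forward-return.ts: the type names, the "neutral" direction (which contributes no nudge), and the shape of the failure branch are assumptions.

```typescript
type ForecastDirection = "up" | "down" | "neutral"; // "neutral" is an assumption

interface ForecastInput {
  compositeScore?: number;        // 0-100
  direction?: ForecastDirection;
  recentResidualMeanPct?: number;
  confidence?: number;            // 0-1
}

type ForecastResult =
  | { ok: true; muPct: number; sigmaPct: number }
  | { ok: false; reason: string };

const ANCHOR_DAYS = 30; // the 30-trading-day anchor

const clampTo = (x: number, lo: number, hi: number): number => Math.min(hi, Math.max(lo, x));

function synthesizeForecast(input: ForecastInput, horizonDays: number): ForecastResult {
  const { compositeScore, direction, recentResidualMeanPct, confidence } = input;
  if (
    compositeScore === undefined || direction === undefined ||
    recentResidualMeanPct === undefined || confidence === undefined
  ) {
    return { ok: false, reason: "missing input" }; // fail soft instead of fabricating a number
  }
  const nudge = direction === "up" ? 1 : direction === "down" ? -1 : 0;
  const muAnchor =
    ((compositeScore - 50) / 50) * 10 + nudge + 0.5 * clampTo(recentResidualMeanPct, -2, 2);
  const sigmaAnchor = 6 - 2 * confidence; // 4 % to 6 % for confidence in [0, 1]
  const ratio = horizonDays / ANCHOR_DAYS;
  // The mean scales linearly with the horizon; the spread scales with its square root.
  return { ok: true, muPct: muAnchor * ratio, sigmaPct: sigmaAnchor * Math.sqrt(ratio) };
}

const sampleForecast = synthesizeForecast(
  { compositeScore: 70, direction: "up", recentResidualMeanPct: 0.4, confidence: 0.75 },
  30,
);
console.log(sampleForecast); // mu = 4 + 1 + 0.2 = 5.2 %, sigma = 6 - 1.5 = 4.5 %
```

Because μ shrinks linearly while σ shrinks only with the square root, shorter horizons carry a wider spread relative to the mean: at a quarter of the anchor, μ falls to a quarter of its anchor value but σ only to a half.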
§ 05 · Swap pipeline
Head-to-head PM → risk-neutral → risk-validator → execution
Read the code: src/lib/agents/rotation-orchestrator.ts:893-1708
A swap is not a sell-then-buy. It's a single atomic proposal — "exit sell-leg, enter buy-leg" — that has to pass two risk gates back-to-back, with the PM deciding first instead of last. The ordering inverts the discovery pipeline because the question being asked is different: discovery asks "is this name worth holding", rotation asks "is this name worth holding more than the one we already hold".
Order
- 1. market_scout · screens the buy-leg as a replacement candidate (same fundamentals/news/technical/sentiment fan-in).
- 2. bull_researcher + bear_researcher + consensus · bull-vs-bear review on the swap, not on the buy-leg in isolation.
- 3. trader · sizes the swap (1:1 by default, but trims/adds are allowed).
- 4. head_to_head_pm · decides first. Reads both legs side-by-side; returns SWAP / KEEP / SWAP_DIFFERENT_BUY with a reasoned verdict. Can veto.
- 5. risk_neutral_swap · LLM risk pass mirroring the discovery risk_neutral stage. Soft veto: REJECT halts the swap.
- 6. risk_validator · deterministic. Concentration policy, sector caps, margin awareness. Hard veto: ABORT halts the swap. Must approve.
- 7. execution_agent · executes in tranches, market-hours-aware, verify-or-cancel at next open.
Concentration policy (the AXS → V fix)
A 1:1 swap from an over-cap position has zero net concentration delta. Treating it as a fresh entry would block the swap indefinitely, because the over-cap position could never pass the entry cap. The policy in concentration-policy.ts buckets the request: new entry → hard cap; rebalance with netDelta ≤ 0 → approve; rebalance with netDelta > 0 → size-down recommendation. Margin context is honoured: when marginContext.enabled && multiplier ≥ 2 the effective cap uses buying power as a second, looser bound.
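The three buckets and the margin condition can be sketched as two small pure functions. The type and function names are hypothetical, and how the buying-power bound changes the effective cap is not shown because the page does not specify it.

```typescript
type ConcentrationBucket = "hard_cap" | "approve" | "size_down";

// Hypothetical request shape.
interface ConcentrationRequest {
  isNewEntry: boolean;
  netDelta: number; // net concentration change of the trade; 0 for a 1:1 swap
}

interface MarginContext {
  enabled: boolean;
  multiplier: number;
}

function bucketRequest(req: ConcentrationRequest): ConcentrationBucket {
  if (req.isNewEntry) return "hard_cap";                // fresh entries face the hard cap
  return req.netDelta <= 0 ? "approve" : "size_down";   // rebalances depend on the net delta
}

/** True when buying power should be considered as a second, looser bound. */
function usesBuyingPowerBound(margin: MarginContext): boolean {
  return margin.enabled && margin.multiplier >= 2;
}

// A 1:1 swap out of an over-cap position has netDelta = 0, so it lands in "approve".
console.log(bucketRequest({ isNewEntry: false, netDelta: 0 }));
```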
Tranches + market hours + verify-or-cancel
Large swaps split into tranches via rotation-tranches.ts. When the market is closed, the rotation is queued; when it opens, the executor places the first tranche, then verifies the fill at the next price-check. Unfilled orders cancel rather than chase the tape.
§ 06 · Honest gaps
What's NOT real
Four surfaces are wired but not fully fed. The page above is the fiction-free version; this section is the rest of it.
- Correlation matrix needs price history
  The correlation_matrix math tool computes pairwise ρ across symbols you supply, but a fully populated per-symbol price series isn't routinely cached. When history is missing the tool fails soft and the PM tells you so. Wiring an opportunistic backfill is on the list.
- L3 Black-Litterman runs as a no-op on cold cells
  Until enough agent_accuracy_rolling rows accumulate for a given (sector, regime, horizon) cell, the L3 tilt is zero. The layer is still in the pipeline; it just contributes nothing to the composite for those cells. As the trade log grows, more cells light up. Cited at composite-score.ts:134-156.
- Federated posterior is opt-in
  The L2 shrinkage layer accepts an optional third anchor: the average posterior across the whole user base. The plumbing works; the feeder that aggregates per-user posteriors into a federated row runs offline. Cold-start users currently see the sector-profile prior only.
- L0 z-score skips small cohorts
  When a sector has fewer than three tickers in the cohort at decision time, the L0 layer is skipped (composite-score.ts:95-97) and the composite feeds raw 0–100 scores into L1. This is intentional, since a 1-ticker z-score is meaningless, but it does mean thinly-covered sectors briefly lose the regime-detrend.
If any of the above changes status, this page must move first. The code is the contract; this page is its mirror.