
# Future Work

No work here is blocking - we're in a good state. Items below are
ordered roughly by priority within each section. Priority labels
(`HIGH` / `MEDIUM` / `LOW`) mark items that deserve explicit
ranking; unlabeled items are "someday, if the mood strikes."

## Projections: future enhancements

- **Chart vertical line at retirement boundary - priority LOW.**
  The accumulation-phase spec called this "mandatory" but it was
  explicitly deferred during implementation. The chart currently
  shows the full `accumulation_years + horizon` span without a
  visual marker for where accumulation ends and distribution
  begins. Easier to add to the kitty-graphics chart than the braille
  one.
- **Goal-seek over distribution horizon for W1 - priority LOW.**
  Today the W1 ("set spending, find date") workflow reports the
  earliest retirement at each user-configured `(horizon, confidence)`
  cell. The philosophically correct version asks "when have I
  accumulated enough wealth that the projection shows a 95%
  probability of success withdrawing X per year from retirement
  until age-of-death?" - i.e. goal-seek across both `accumulation_years`
  AND `distribution_years` simultaneously, anchored to a configured
  age-of-death. That is a two-dimensional search with a projection
  run per candidate; not worth optimizing until someone wants it.
- **Per-person retirement_age - priority LOW.**
  V1 of the accumulation-phase spec chose Option A: a single
  household retirement boundary derived from the oldest configured
  birthdate. Households where one earner retires significantly
  earlier than the other would benefit from per-person
  `retirement_age` fields on each `type::birthdate` record, with
  contributions stopped per-person.
- **Multiple spending models.** Flat (current) or decreasing (1-2%
  real annual decrease, per Blanchett's "spending smile"). Late-life
  healthcare is probably better modeled as a life event.
- **Historical projection overlay follow-ups.** The base
  `--overlay-actuals` overlay shipped (CLI tip + TUI primary surface).
  Open enhancements:
    - Historical `metadata.srf` / `projections.srf` for back-dated
      runs. Today the overlay re-runs against current classifications
      and assumptions; for historically faithful what-the-model-said-then
      output we'd check out the git-tracked versions of those files
      at the as-of commit and load those instead. Edge case until
      classifications materially drift.
    - Contribution-attribution overlay. Today's actuals line includes
      contributions implicitly; the bands assume modeled contributions
      that may or may not match reality. A "decompose actuals into
      market return vs contributions" annotation would clarify how
      much of the trajectory was the model being right vs new money
      arriving on schedule.
    - Mosaic mode: overlay multiple as-of starting points on one chart
      ("show me 1Y, 3Y, 5Y, 10Y projections all at once") so the user
      can see how the projection envelope tightened as data came in.
    - **Better composition basis for imported-only as-of.** Today
      the imported-only path uses today's allocations scaled by
      `imported_liquid / today_total_liquid`. That's the simplest
      thing that could work, but it's "today's mix back-dated" -
      it ignores everything we know about the historical context.
      Specifically: `imported_values.srf` already carries an
      `expected_return` field per row that the user captured at
      that date in their source spreadsheet. We could:
        - Use the imported `expected_return` as a sanity check
          against the simulation's per-position weighted return
          (warn or clamp if they diverge wildly - the spreadsheet's
          number reflects what the user actually saw at the time).
        - Use the imported `expected_return` to bias the
          stock/bond split inference: a higher expected return
          implies a higher historical equity weighting than today's
          mix probably reflects.
        - Reach further: derive a synthetic stock/bond split from
          the imported `expected_return` directly, treating it as
          a weighted average of SPY and AGG returns at that date
          and solving for the weights. That gives a per-imported-
          row composition that's locally faithful instead of
          one-mix-fits-all.
      None of these are urgent - the current "today's mix scaled"
      approximation is documented as such and the bands still
      render meaningfully - but each would tighten the historical
      faithfulness one notch. Pick whichever has the highest
      payoff vs. complexity when this gets revisited.
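
A minimal sketch of the third idea (deriving a stock/bond split from the imported `expected_return`), written in Python for brevity rather than in the project's Zig. It assumes expected returns for SPY and AGG at the as-of date are available from somewhere; `stock_weight_from_expected_return` and its parameters are hypothetical names, not existing zfin APIs.

```python
def stock_weight_from_expected_return(expected, stock_return, bond_return):
    """Solve expected = w * stock_return + (1 - w) * bond_return for w.

    Returns the stock weight clamped to [0, 1], or None when the two
    reference returns are equal and the split cannot be identified.
    """
    spread = stock_return - bond_return
    if spread == 0:
        return None
    w = (expected - bond_return) / spread
    # Clamp: an imported return outside the SPY/AGG range would otherwise
    # imply a leveraged or short position, which this model does not allow.
    return max(0.0, min(1.0, w))
```

Clamping is one possible policy; warning on an out-of-range value (as in the first idea above) would be an equally reasonable choice.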

## `--export-chart` follow-ups - priority LOW

V1 of `--export-chart <PATH>` shipped for `quote` and `projections`
(default bands mode only). Several adjacent surfaces still don't
have PNG export and were deferred:

- **`history --export-chart`.** The `history` command renders a
  single-series braille chart of portfolio value over time
  (synthesized into `Candle` records and fed to
  `format.computeBrailleChart`). It doesn't share the z2d
  pipeline that `quote` (`tui/chart.zig`) and `projections`
  (`tui/projection_chart.zig`) use. To export, options:
    - **A.** Pipe the synthesized candles through
      `tui/chart.zig`'s `renderChart` - but that draws Bollinger
      Bands and an RSI panel, both meaningless on a portfolio-
      value series.
    - **B.** Add a minimal "single-series line chart" z2d
      renderer (a slimmed-down `projection_chart.zig` without
      bands). ~150 lines. Same renderToSurface shape so PNG
      export is trivial after.
    - **C.** Skip it permanently; the braille chart is fine for
      what `history` is. Document as "not exportable".
  B is the right answer if PNG export of the history chart is
  ever requested.
- **`projections --convergence` / `--return-backtest`.** Both
  render forecast-evaluation charts via `tui/forecast_chart.zig`.
  Not refactored to expose a `renderToSurface` seam yet -
  parser rejects `--export-chart` in those modes today. Low
  effort to add (mirror the `tui/chart.zig` pattern).
- **`projections --vs <DATE>`.** No chart at all in this mode
  (text-only delta table); `--export-chart` rejected at parse
  time. Could grow a side-by-side bands comparison chart, but
  that's a feature of its own - not just an export plumbing job.
- **Theme overrides at export time.** Today the export always
  uses `theme.default_theme`. A `--theme <PATH>` flag at export
  time would let users render with their configured theme or a
  presentation-friendly one. Out of scope for V1; revisit when
  someone asks for it.
- **File format alternatives.** SVG / PDF / WebP - `z2d` only
  exports PNG natively today; would need an external dependency
  or a pixel-buffer-to-format conversion.

## Refactor: trim `src/format.zig` once Money / Date have absorbed their helpers - priority LOW

`src/format.zig` is still a ~1700-line grab-bag, but the money- and
date-shaped helpers that used to live there have been moved out:
money formatting now lives in `src/Money.zig` (with `{f}` /
`whole()` / `trim()` / `signed()` / `padRight(N)` / `padLeft(N)`),
and date formatting lives in `src/Date.zig` (with `{f}` /
`padRight(N)` / `padLeft(N)`). What's left in `format.zig` is the
genuinely-format-domain stuff: braille charts, return formatters,
allocation notes, signed-percent rendering.

If the file ever grows enough to be annoying again, consider
renaming to `src/render.zig` to better describe what's left, or
splitting the braille chart out (it's ~600 lines on its own).
Not blocking - file it as cleanup if and when it bites.

## Investigate: detailed 401(k) contributions data source

Found a more detailed contributions screen on at least one
employer-sponsored 401(k) provider portal - distinct from the
standard positions/holdings view we already pull from. Worth
investigating whether this unlocks better attribution than what
we get from the positions CSV alone, and whether other 401(k)
providers expose similar screens.

Open questions to answer when picking this up:

- Which screen specifically (path / URL within the portal)? Is there
  an export option, or is it view-only / scrape territory?
- What fields does it expose (employee pre-tax, employer match,
  after-tax / mega-backdoor, by-pay-period dates, per-fund
  allocations)?
- Refresh cadence - per-paycheck, daily, on-demand?
- Can it be auto-discovered like the existing audit CSVs, or
  is it manual-entry territory?

If the export is structured and recurring, this could feed a
401(k)-specific contributions classifier that bypasses the lot-diff
heuristic for that account, similar to how `cash_is_contribution`
opts ESPP/HSA accounts into cash-based attribution.

Related: ESPP-style accrual blind spot in the "Audit: manual-check
accounts mechanism" section above.

## Torn SRF files from server sync (root cause unknown)

**Status:** Root cause still unidentified. We have mitigations and
diagnostics in place that keep torn responses from corrupting the
cache, but we don't yet know *why* responses arrive torn. Until we
have a root cause, this is not resolved - it's mitigated.

Mitigations landed so far:

- `syncFromServer` (`src/service.zig`) validates responses via
  `cache.Store.looksCompleteSrf` before `writeRaw`. Torn HTTP bodies
  (empty, missing `#!srfv1` header, or no trailing newline) are
  rejected with a warn-level log and NOT written to cache.
- HTTP responses are checked for an `ETag` sha256 header; on mismatch
  we retry the request once before giving up and falling back to the
  provider.
- Read-path self-heal: on SRF parse failure during read, the cache
  entry is invalidated so a subsequent refresh can repair without
  user intervention.
- Diagnostics: richer error capture around the sync path. So far,
  HTTP transit is the dominant source of torn responses - but that's
  an observation, not a root cause.
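
For reference, the first two mitigations can be sketched roughly as follows. This is Python pseudocode for the logic, not the Zig in `src/service.zig`. It assumes the `#!srfv1` header sits at the very start of the body and that the `ETag` carries a hex sha256 of the body (possibly quoted); `fetch_validated` and the `fetch` callback are hypothetical names.

```python
import hashlib

SRF_HEADER = b"#!srfv1"

def looks_complete_srf(body: bytes) -> bool:
    """Reject empty bodies, a missing header, or a missing trailing newline."""
    if not body:
        return False
    if not body.startswith(SRF_HEADER):
        return False
    return body.endswith(b"\n")

def etag_matches(body: bytes, etag: str) -> bool:
    """Assumption: the ETag value is the hex sha256 of the body."""
    return hashlib.sha256(body).hexdigest() == etag.strip('"').lower()

def fetch_validated(fetch):
    """fetch() returns (body, etag). Returns the body, or None to signal
    that the caller should fall back to the provider."""
    for _attempt in range(2):  # original request plus one retry
        body, etag = fetch()
        if etag_matches(body, etag):
            break
    else:
        return None  # still mismatched after the retry
    # Only a complete-looking body may be written to the cache.
    return body if looks_complete_srf(body) else None
```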

**Remaining work:**

- Identify root cause. Candidates to investigate: proxy/load-balancer
  behavior, HTTP keepalive reuse, partial reads on the server side,
  client-side buffer handling. The ETag retry tells us whether the
  problem is per-request or persistent; dig into the diagnostics
  output when the next occurrence is captured.
- Once root cause is known, decide whether the current mitigations
  are sufficient or whether a targeted fix is needed. The
  mitigations may end up being the whole answer, but we can't
  conclude that without understanding the underlying cause.

(Content-Length validation was considered and rejected: once the
server starts compressing response bodies, Content-Length reflects
the compressed byte count, not the decoded payload, so it's not a
reliable integrity check.)

## Market-aware cache TTL for daily candles

Daily candle TTL is currently 23h45m, but candle data only becomes meaningful
after the market close. Investigate keying the cache freshness to ~4:30 PM
Eastern rather than a rolling window. This would avoid unnecessary refetches
during the trading day and ensure a fetch shortly after close gets fresh data.
Probably alleviated already by the server's cron refresh.
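
One way the freshness rule could look, sketched in Python rather than Zig. It assumes both timestamps have already been converted to US Eastern wall-clock time, treats every weekday as a trading day (no holiday calendar), and uses hypothetical names throughout.

```python
from datetime import datetime, time, timedelta

CUTOFF = time(16, 30)  # ~4:30 PM Eastern, shortly after the close

def last_cutoff(now_et: datetime) -> datetime:
    """Most recent weekday 4:30 PM at or before now_et."""
    candidate = datetime.combine(now_et.date(), CUTOFF)
    if candidate > now_et:
        candidate -= timedelta(days=1)
    while candidate.weekday() >= 5:  # Saturday=5, Sunday=6
        candidate -= timedelta(days=1)
    return candidate

def is_fresh(fetched_at_et: datetime, now_et: datetime) -> bool:
    """A cached candle file is fresh if it was fetched after the last cutoff."""
    return fetched_at_et >= last_cutoff(now_et)
```

Under this rule a file fetched Monday evening stays fresh through Tuesday's trading day, and a Friday-evening fetch stays fresh all weekend, which a rolling 23h45m window does not give us.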

## On-demand server-side fetch for new symbols

Currently the server's SRF endpoints (`/candles`, `/dividends`, etc.) are pure
cache reads - they 404 if the data isn't already on disk. New symbols only get
populated when added to the portfolio and picked up by the next cron refresh.

Consider: on a cache miss, instead of blocking the HTTP response with a
multi-second provider fetch, kick off an async background fetch (or just
auto-add the symbol to the portfolio) and return 404 as usual. The next
request - or the next cron run - would then have the data. This gives
"instant-ish gratification" for new symbols without the downsides of
synchronous fetch-on-miss (latency, rate limit contention, unbounded cache
growth from arbitrary tickers).

Note that this doesn't remove the need for the various provider API keys
that a fully functioning system still requires. A more aggressive approach
would treat ZFIN_SERVER as the sole source of record, but that makes the
process opaque while we wait for data (candles, for example) to populate.
The server could address this by spawning a thread to fetch the data and
returning 202 Accepted, which the client could then poll. Maybe that is
the better long-term approach?
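
A rough sketch of the 202-and-poll variant, in Python rather than Zig, with an in-memory dict standing in for the on-disk cache. All names here are hypothetical, and the sketch does nothing about rate limits or bounding cache growth from arbitrary tickers.

```python
from collections import deque

def handle_candles_request(symbol, cache, pending, fetch_queue):
    """Return (status, body) without ever blocking on a provider fetch."""
    if symbol in cache:
        return 200, cache[symbol]
    if symbol not in pending:
        # First miss for this symbol: schedule exactly one background fetch.
        pending.add(symbol)
        fetch_queue.append(symbol)
    return 202, None  # client polls again later

def run_background_fetches(fetch_queue, pending, cache, fetch):
    """Drain the queue; in a real server this would run on a worker thread."""
    while fetch_queue:
        symbol = fetch_queue.popleft()
        cache[symbol] = fetch(symbol)
        pending.discard(symbol)
```

The `pending` set is what keeps repeated polls from queueing duplicate provider fetches for the same symbol.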

## Configurable live-quote provider (Tiingo IEX) - priority LOW

The TUI refresh key (`r`) values the portfolio with live intraday
quotes via `DataService.loadLiveQuotes`, which is Yahoo-only: Yahoo is
keyless, consolidated, and stays off every rate-limit budget, so bursty
refresh traffic costs nothing. The tradeoffs are that Yahoo's
unofficial feed is ~15-minute delayed and "can break without notice."

Tiingo's IEX endpoint (`/iex/?tickers=A,B,C`) is a strong opt-in
alternative: it's genuinely real-time (IEX last-sale, no 15-min delay),
official/keyed, and bills per HTTP request - one call returns the whole
portfolio (confirmed empirically: a 2-ticker batch decrements the daily
quota by 1, not 2). Fields map cleanly: `tngoLast` to price, `prevClose`
to day-change. Caveats: IEX is a single venue (~2-3% of volume), so
`tngoLast` can sit stale between prints on illiquid names, and IEX
doesn't trade mutual funds, so those fall back to the candle close.

Proposal: a config knob (env var, e.g. `ZFIN_LIVE_QUOTE_PROVIDER` =
`yahoo` (default) | `tiingo`) that switches `loadLiveQuotes` to a new
`Tiingo.fetchQuotes(tickers)` batched call. Users on Tiingo's Power
tier ($30/mo, higher limits) who want real-time quotes and mash `r` a
lot (or who want streaming, once we add it) would reuse their existing
`TIINGO_API_KEY`; everyone else keeps the keyless Yahoo default.

Implementation notes:

- `Tiingo.fetchQuotes` returns an array whose order is NOT guaranteed to
  match the request order, so key results by the returned `ticker`
  field, not by position.
- Tiingo-sourced live quotes would share Tiingo's 50/hour token bucket
  (`RateLimiter.perHour`, wired into the provider). A batched quote
  call is 1 request, but heavy `r` use plus candle refreshes draw from
  the same hourly budget, so watch for contention.
- Tiingo websocket streaming would be the natural follow-on for true
  push-based real-time, replacing poll-on-`r` entirely.
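
The keying-by-ticker note could look something like this, sketched in Python rather than Zig. The `ticker`, `tngoLast`, and `prevClose` field names come from the notes above; the output shape, the case normalization, and the function name are assumptions for illustration.

```python
def index_quotes_by_ticker(requested, rows):
    """Match response rows to requested tickers by the `ticker` field,
    never by position in the response array."""
    by_ticker = {row["ticker"].upper(): row for row in rows}
    quotes = {}
    for ticker in requested:
        row = by_ticker.get(ticker.upper())
        if row is None or row.get("tngoLast") is None:
            # Not on IEX (e.g. a mutual fund): caller falls back to the
            # candle close.
            quotes[ticker] = None
            continue
        last = row["tngoLast"]
        prev = row.get("prevClose")
        quotes[ticker] = {
            "price": last,
            "day_change": None if prev is None else last - prev,
        }
    return quotes
```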

## Precise "as of <clock time>" via a datetime/timezone lib (zeit) - priority LOW

The portfolio tab's live-price footer is deliberately vague static
text: "(as of intraday quote today)" after a live refresh, falling
back to "(as of close on YYYY-MM-DD)" otherwise. We can't do better
today because the codebase has no wall-clock-to-local-time machinery -
`Date` is days-only, and every time display is either a date or a
relative "X ago" (`fmt.fmtTimeAgo`). There's no way to render an
absolute local clock time like "2:34 PM ET".

Pulling in a datetime/timezone library (e.g. [zeit](https://github.com/rockorager/zeit),
by the same author as libvaxis) would let us:

- Show a precise, honest stamp: "(as of 2:34 PM ET)" / "refreshed
  2:34 PM" instead of "today" / "Xs ago".
- Fix the current label's weekend/after-hours imprecision. Right now a
  refresh when the market is closed flips the footer to "intraday quote
  today" even though Yahoo returned the last close (which on a Saturday
  is Friday's). With real clock + market-calendar awareness, the label
  could say "as of Fri close" or "(market closed, last quote Fri 4:00
  PM ET)" instead of implying live intraday data.
- Replace `last_refresh_s` "refreshed Xs ago" in the TUI status bar
  (and the quote/earnings/options "data Xs ago" readouts) with absolute
  times where that reads better.

Scope is a judgment call: a new dependency for what's currently a
cosmetic label. Worth it once we want trustworthy timestamps (e.g. for
screenshots, or to stop conflating "live" with "last close"); not
before.

## Analysis: dividend equity / income-shaped equity - think about it

Dividend-equity ETFs (SCHD, VYM, DGRO, NOBL, SDY, VIG, etc.)
bucket as Equity in `analysis.bucketSector`. That's correct for
risk-exposure analysis - they drop with the market in a
2008-style crash, regardless of the dividend stream - but it
loses the income-vs-growth distinction that retirement-planning
tools care about.

Open question: is there a useful second dimension to add?
Possibilities:

- **Yield-weighted breakdown.** Aggregate `current_yield` per
  position, weight by market value, report a portfolio-level
  yield. Doesn't change the asset-class taxonomy; adds a new
  metric.
- **Income coverage of expenses.** "My dividends + bond coupons
  cover X% of projected retirement spending." Closer to what the
  income-side framing actually wants - answers the question
  rather than redefining the buckets.
- **Income-equity sub-bucket within Equity.** A sub-row in the
  Asset Category breakdown, not a 5th top-level bucket. Would
  need a way to mark funds as "income-shaped" - probably a
  per-symbol opt-in in `metadata.srf`.
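
The first two possibilities are simple arithmetic. A sketch in Python (not Zig), assuming each position is reduced to a `(market_value, current_yield)` pair; the function names are hypothetical:

```python
def portfolio_yield(positions):
    """Market-value-weighted yield over a list of (market_value, yield) pairs."""
    total = sum(mv for mv, _ in positions)
    if total == 0:
        return 0.0
    return sum(mv * y for mv, y in positions) / total

def income_coverage(positions, annual_spending):
    """Fraction of projected annual spending covered by portfolio income."""
    income = sum(mv * y for mv, y in positions)
    return income / annual_spending
```

For example, $60,000 yielding 3.5% plus $40,000 yielding 1% produces $2,500 of income: a 2.5% portfolio yield, covering 5% of a $50,000 annual spend.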

Not a bug. Not blocking anything. Could end up being a feature.
This is a note to revisit after using the 4-bucket view for a
while and seeing whether the missing dimension actually matters
in practice.

Resist the temptation to:

- **Add a 5th top-level bucket** ("Income Equity" / "Dividend
  Equity"). The 4-bucket view is already the right answer for
  "how much equity exposure do I have?". A 5th bucket
  fragments the headline number.
- **Override SCHD to Fixed Income.** Wrong on risk grounds.
  SCHD will lose 35-45% in an equity crash; treating it as FI
  makes the user think they have downside protection they don't.
- **Add per-symbol "intent" metadata** (`held_for_income::true`).
  Smell of putting framing into data. Intent is a property of
  the holder's strategy, not the security.

If a fix lands, it's probably a separate analysis section (yield
breakdown, income coverage) - not a change to the asset-class
taxonomy.

## Unprioritized items

The following items are acknowledged but not prioritized. Listed here
so they don't get lost; pick up opportunistically.

### UX

- **CLI options command UX.** The `options` command auto-expands only
  the nearest monthly expiration and lists others collapsed. Reconsider
  the interaction model - e.g. allow specifying an expiration date,
  showing all monthlies expanded by default, or filtering by strategy
  (covered calls, spreads).

### Options / valuation

- **Per-account covered call adjustment.** `adjustForCoveredCalls` in
  `valuation.zig` operates on portfolio-wide aggregated allocations.
  It matches sold calls against total underlying shares across all
  accounts. This is wrong - calls in one account can only cover
  shares in that same account. Fixing means restructuring
  `portfolioSummary`, since `Allocation` is currently
  account-agnostic. Low priority - naked calls are rare, and calls
  are typically in the same account as the underlying.
- **Covered call adjustment O(N*M) loop.** `adjustForCoveredCalls`
  has a nested loop - for each allocation, it iterates all lots to
  find matching option contracts. Fine for personal portfolios
  (<1000 lots). Pre-indexing options by underlying would help if
  someone had a very large options-heavy portfolio.
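
Both items point toward the same shape of fix: index by `(account, underlying)` once, then match. A sketch in Python (not the Zig in `valuation.zig`), using hypothetical tuple-shaped inputs and assuming the standard 100 shares per contract:

```python
from collections import defaultdict

SHARES_PER_CONTRACT = 100

def covered_shares_by_account(share_lots, sold_calls):
    """share_lots: (account, symbol, shares) tuples.
    sold_calls: (account, underlying, contracts) tuples.
    Returns covered share counts keyed by (account, underlying)."""
    # Pre-index both sides in one pass each: O(N + M), no nested loop.
    called = defaultdict(int)
    for account, underlying, contracts in sold_calls:
        called[(account, underlying)] += contracts * SHARES_PER_CONTRACT
    held = defaultdict(float)
    for account, symbol, shares in share_lots:
        held[(account, symbol)] += shares
    # A call only covers shares held in the same account.
    return {key: min(held.get(key, 0), n) for key, n in called.items()}
```

With 200 shares in one account, 100 in another, and 2 calls sold in the second, this reports 100 covered shares, where portfolio-wide matching would report 200.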

### Audit

- **Audit large-lot threshold tuning.** `src/commands/audit.zig` uses
  `audit_large_lot_threshold: f64 = 10_000.0` as the cutoff for
  "surface this new lot for confirmation." Revisit if $10k proves too
  aggressive (ESPP accruals spam the report) or too permissive (large
  DRIP confirmations slip past). If runtime tuning becomes necessary,
  a `--large-lot <amount>` flag or a global
  `audit_large_lot_threshold` field on `accounts.srf` would be
  reasonable extensions.

### Infra / performance

- **HTTP connection pooling.** Parallel server sync in `loadAllPrices`
  spawns up to 8 threads, each with its own HTTP connection. Could
  reuse connections to reduce TCP handshake overhead. Only matters
  with very large portfolios (100+ symbols) hitting ZFIN_SERVER.
- **Streaming cache deserialization.** Cache store reads entire files
  into memory (`readFileAlloc` with 50MB limit). For portfolios with
  10+ years of daily candles, this could use significant memory.
  Keep current approach unless memory becomes a real problem.