Caching layers
Where snapshots can get stale across orchestrator caching, baselines, and provider-side caches.
Caching issues usually show up as “the planner is wrong”.
Most of the time, the planner is fine. The input snapshot was stale.
The short version
CrossWatch can cache snapshots within a run.
Providers can cache API responses too.
Stale snapshots mean “no changes” or repeated adds.
Quick fixes
Set runtime.snapshot_ttl_sec = 0.
Rerun once (eventual consistency is common).
If it’s still wrong, suspect provider-side caching.
Related:
Snapshot behavior: Snapshots
Watchlist add flapping: Phantom Guard
Orchestrator caching: orchestrator/_snapshots.py (in-run snapshot cache)
Provider caching: varies; SIMKL is known to have disk TTL + ETag layers in some modules.
1) Orchestrator snapshot caching (in-run only)
What it is
The orchestrator can cache snapshots (indices) within a single run.
Stored in ctx.snap_cache.
Controlled by runtime.snapshot_ttl_sec.
If TTL > 0, repeated calls to build_snapshots_for_feature(feature, ...) can reuse an already-built index for the same provider+feature.
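A minimal sketch of the idea, not the real orchestrator code: ctx.snap_cache, runtime.snapshot_ttl_sec, and build_snapshots_for_feature are the names used on this page; the context object, the function signature, and the cache-entry shape are assumptions.

```python
# Illustrative only: the Ctx shape, signature, and entry layout are assumptions.
import time
from dataclasses import dataclass, field

@dataclass
class Ctx:
    config: dict
    snap_cache: dict = field(default_factory=dict)  # (provider, feature) -> entry

def build_snapshots_for_feature(ctx, provider, feature, build_index):
    """Return the index for provider+feature, reusing the in-run cache while fresh."""
    ttl = ctx.config.get("runtime", {}).get("snapshot_ttl_sec", 0)
    key = (provider, feature)
    entry = ctx.snap_cache.get(key)
    if ttl > 0 and entry and (time.time() - entry["built_at"]) < ttl:
        return entry["index"]                 # reuse the already-built index dict
    index = build_index(provider, feature)    # expensive: hits the provider API
    ctx.snap_cache[key] = {"index": index, "built_at": time.time()}
    return index
```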
Why it exists
two-way planning may build snapshots once and reuse them in multiple stages
some UI-triggered flows may call orchestration pieces repeatedly
What it does not do
It does not persist across runs.
It does not cache per endpoint; it caches “final index dict”.
Busting the cache
Pipelines call _bust_snapshot(provider) after successful writes to that provider so later stages don’t reuse a stale index.
If you see “adds confirmed” but later logic still thinks the item is missing, it is often because the snapshot cache wasn’t busted or the TTL is too long.
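A sketch of the write-then-bust pattern, following the snap_cache sketch above. _bust_snapshot is the name the pipelines use; passing ctx explicitly, the apply_adds wrapper, and the writer callable are assumptions made to keep the sketch self-contained.

```python
# Illustrative write-then-bust pattern; builds on the Ctx sketch above.
def _bust_snapshot(ctx, provider):
    """Drop every cached index for this provider so later stages rebuild it."""
    for key in [k for k in ctx.snap_cache if k[0] == provider]:
        ctx.snap_cache.pop(key, None)

def apply_adds(ctx, dst, feature, items, writer):
    """Write items to dst, then invalidate its cached snapshots."""
    confirmed = writer(dst, feature, items)   # provider write call (assumed shape)
    if confirmed:
        _bust_snapshot(ctx, dst)              # later stages rebuild instead of reusing
    return confirmed
```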
2) Baselines (cross-run “memory”)
Baselines in state.json are a form of cache:
they store the last “known inventory” per provider+feature
They are used for:
delta-index providers (merge baseline + delta)
drop guard comparisons (detect suspect snapshots)
observed deletions (two-way)
safe removals (only remove items that existed in baseline)
This is not a cache of API responses.
It can still mask live misses when the drop guard coerces a suspect snapshot back to the baseline.
That’s by design.
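For intuition, a sketch of how a delta-index provider might use the baseline. The {key: item} shape and the function names are assumptions; the ideas themselves (merge baseline + delta, only remove what the baseline knew) are the ones listed above.

```python
# Illustrative baseline usage; the {key: item} shape is an assumption.
def merge_baseline_with_delta(baseline, delta_added, delta_removed_keys):
    """Rebuild the 'present' inventory from the stored baseline plus a fresh delta."""
    present = dict(baseline)                  # last known inventory
    present.update(delta_added)               # apply new/changed items
    for key in delta_removed_keys:
        present.pop(key, None)                # apply observed removals
    return present

def safe_removals(baseline, planned_removal_keys):
    """Only remove items that actually existed in the baseline."""
    return [key for key in planned_removal_keys if key in baseline]
```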
3) Provider-side caching (memo / disk TTL / ETag)
Providers may implement their own caching.
Typical layers
In-run memo (dict in memory)
Disk TTL cache (JSON file) keyed by URL+params
ETag cache (If-None-Match → on 304, reuse the cached body)
These layers can cause the orchestrator to see stale snapshots even when the provider API has new data.
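A sketch of the three layers stacked together. The cache path, helper name, and use of requests are assumptions, not CrossWatch provider code; only the layering (memo → disk TTL → ETag) comes from this page. Note that force_refresh=True skips every layer and omits If-None-Match, which is the lever the SIMKL fix below relies on.

```python
# Illustrative layered fetch: in-run memo -> disk TTL -> ETag revalidation.
import hashlib, json, time
from pathlib import Path
import requests

_MEMO = {}                                    # layer 1: in-run memo
CACHE_DIR = Path(".cache")                    # layers 2-3: disk TTL + ETag (assumed path)

def cached_get(url, ttl_sec=300, force_refresh=False):
    if not force_refresh and url in _MEMO:
        return _MEMO[url]                                     # memo hit

    CACHE_DIR.mkdir(exist_ok=True)
    path = CACHE_DIR / (hashlib.sha1(url.encode()).hexdigest() + ".json")
    disk = json.loads(path.read_text()) if path.exists() else None
    if not force_refresh and disk and time.time() - disk["ts"] < ttl_sec:
        _MEMO[url] = disk["body"]
        return disk["body"]                                   # disk TTL hit

    headers = {}
    if not force_refresh and disk and disk.get("etag"):
        headers["If-None-Match"] = disk["etag"]               # ETag revalidation
    resp = requests.get(url, headers=headers, timeout=30)
    if resp.status_code == 304 and disk:
        body = disk["body"]                                   # 304: reuse cached body
    else:
        resp.raise_for_status()
        body = resp.json()
    path.write_text(json.dumps({"ts": time.time(),
                                "etag": resp.headers.get("ETag"),
                                "body": body}))
    _MEMO[url] = body
    return body
```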
Known hotspot: SIMKL “present PTW snapshot”
In CrossWatch, SIMKL watchlist indexing has historically used:
a “present snapshot” endpoint, then deltas when activities changed.
If the present snapshot is cached and not forced fresh when activities move, you get:
snapshot count unchanged
planner sees no adds
new SIMKL adds never propagate
Fix pattern (already discussed in project handover):
When force_refresh=True, do not send If-None-Match (force a full 200 body).
If the activities timestamp advanced beyond the cache timestamp, force a refresh of the present snapshot.
This is provider-side: the orchestrator can’t fix it; it can only bust its own snapshot memo.
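A sketch of the decision only. The helper names and timestamp fields are assumptions, and fetch is assumed to behave like the cached_get() sketch above (force_refresh=True bypasses the caches and omits If-None-Match).

```python
# Illustrative decision logic for refreshing the present snapshot.
def need_force_refresh(activities_ts, cached_snapshot_ts):
    """True when provider activity is newer than the cached present snapshot."""
    return cached_snapshot_ts is None or activities_ts > cached_snapshot_ts

def fetch_present_snapshot(fetch, activities_ts, cached_snapshot_ts):
    force = need_force_refresh(activities_ts, cached_snapshot_ts)
    return fetch(force_refresh=force)         # forced fetch returns a full 200 body
```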
4) How stale data shows up
A) Planner shows no changes
Likely causes:
provider snapshot itself is stale (provider cache)
orchestrator reused snap_cache within run (TTL too high)
drop guard coerced snapshot back to baseline because checkpoint didn’t advance
B) Adds loop forever
Likely causes:
provider accepted writes but doesn’t reflect them immediately (eventual consistency)
snapshot cache reused after write (no bust)
key mismatch (adds succeed but can’t be matched in next snapshot)
PhantomGuard not active or not recording stickiness correctly
C) Two-way deletes look wrong
Likely causes:
observed deletions ran while one side was “effectively down”
baseline is old and current snapshot is partial (delta mis-marked as present)
drop guard disabled or checkpoints missing
5) Make it fresh (practical levers)
Orchestrator-side
Set runtime.snapshot_ttl_sec = 0 to disable in-run caching (see the sketch after this list).
Ensure pipelines call _bust_snapshot(dst) after writes (they do today).
Enable sync.drop_guard to avoid bogus empty snapshots (helps safety, not freshness).
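The two knobs from this list, shown as a plain dict. The key names (runtime.snapshot_ttl_sec, sync.drop_guard) are the ones used on this page; where the config lives and how it is loaded is not covered here.

```python
# Orchestrator-side freshness settings (key names from this page; storage
# format and location are not specified here).
freshness_settings = {
    "runtime": {
        "snapshot_ttl_sec": 0,    # disable in-run snapshot caching
    },
    "sync": {
        "drop_guard": True,       # reject bogus empty snapshots (safety, not freshness)
    },
}
```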
Provider-side
Disable disk TTL caches temporarily (if configurable)
Force refresh key endpoints when activities moved
When forcing a refresh, avoid If-None-Match so a 304 doesn’t hand you back a stale cached body
After writes, clear in-run memo keys for relevant endpoints
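For the last point, a sketch of clearing in-run memo entries after a write. _MEMO and the URL keying follow the cached_get() sketch above; the prefix matching and the example endpoint are hypothetical.

```python
# Illustrative memo invalidation after a write (follows the cached_get() sketch).
def clear_memo_for(memo, endpoint_prefix):
    """Drop memo entries whose URL starts with the given endpoint prefix."""
    for url in [u for u in memo if u.startswith(endpoint_prefix)]:
        memo.pop(url, None)

# e.g. after a successful watchlist write (hypothetical endpoint):
# clear_memo_for(_MEMO, "https://api.example.com/watchlist")
```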
Related pages
Snapshot building and drop guard: Snapshots
Snapshot TTL and debug knobs: Runtime
Watchlist “ghost add” suppression: Phantom Guard
Two-way delete logic: Two-way sync