Caching layers
Where snapshots can get stale across orchestrator caching, baselines, and provider-side caches.
Caching issues usually show up as “the planner is wrong”.
Most of the time, the planner is fine. The input snapshot was stale.
The short version
CrossWatch can cache snapshots within a run.
Providers can cache API responses too.
Stale snapshots mean “no changes” or repeated adds.
Quick fixes
Set runtime.snapshot_ttl_sec = 0.
Rerun once (eventual consistency is common).
If it’s still wrong, suspect provider-side caching.
Related:
Snapshot behavior: Snapshots
Watchlist add flapping: Phantom Guard
Orchestrator caching: orchestrator/_snapshots.py (in-run snapshot cache)
Provider caching: varies; SIMKL is known to have disk TTL + ETag layers in some modules.
1) Orchestrator snapshot caching (in-run only)
What it is
The orchestrator can cache snapshots (indices) within a single run.
Stored in ctx.snap_cache.
Controlled by runtime.snapshot_ttl_sec.
If TTL > 0, repeated calls to build_snapshots_for_feature(feature, ...) can reuse an already-built index for the same provider+feature.
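A minimal sketch of the idea, not the real orchestrator code: ctx.snap_cache, runtime.snapshot_ttl_sec, and build_snapshots_for_feature are the names used on this page; the context object, the function signature, and the cache-entry shape are assumptions.

```python
# Illustrative only: the Ctx shape, signature, and entry layout are assumptions.
import time
from dataclasses import dataclass, field

@dataclass
class Ctx:
    config: dict
    snap_cache: dict = field(default_factory=dict)  # (provider, feature) -> entry

def build_snapshots_for_feature(ctx, provider, feature, build_index):
    """Return the index for provider+feature, reusing the in-run cache while fresh."""
    ttl = ctx.config.get("runtime", {}).get("snapshot_ttl_sec", 0)
    key = (provider, feature)
    entry = ctx.snap_cache.get(key)
    if ttl > 0 and entry and (time.time() - entry["built_at"]) < ttl:
        return entry["index"]                 # reuse the already-built index dict
    index = build_index(provider, feature)    # expensive: hits the provider API
    ctx.snap_cache[key] = {"index": index, "built_at": time.time()}
    return index
```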
Why it exists
two-way planning may build snapshots once and reuse them in multiple stages
some UI-triggered flows may call orchestration pieces repeatedly
What it does not do
It does not persist across runs.
It does not cache per endpoint; it caches “final index dict”.
Busting the cache
Pipelines call _bust_snapshot(provider) after successful writes to that provider so later stages don’t reuse a stale index.
If you see “adds confirmed” but later logic still thinks the item is missing, it is often because the snapshot cache wasn’t busted or the TTL is too long.
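A sketch of the write-then-bust pattern, following the snap_cache sketch above. _bust_snapshot is the name the pipelines use; passing ctx explicitly, the apply_adds wrapper, and the writer callable are assumptions made to keep the sketch self-contained.

```python
# Illustrative write-then-bust pattern; builds on the Ctx sketch above.
def _bust_snapshot(ctx, provider):
    """Drop every cached index for this provider so later stages rebuild it."""
    for key in [k for k in ctx.snap_cache if k[0] == provider]:
        ctx.snap_cache.pop(key, None)

def apply_adds(ctx, dst, feature, items, writer):
    """Write items to dst, then invalidate its cached snapshots."""
    confirmed = writer(dst, feature, items)   # provider write call (assumed shape)
    if confirmed:
        _bust_snapshot(ctx, dst)              # later stages rebuild instead of reusing
    return confirmed
```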
2) Baselines (cross-run “memory”)
Baselines in state.json are a form of cache:
they store the last “known inventory” per provider+feature
They are used for:
delta-index providers (merge baseline + delta)
drop guard comparisons (detect suspect snapshots)
observed deletions (two-way)
safe removals (only remove items that existed in baseline)
This is not a cache of API responses.
It can still mask live misses when the drop guard coerces a suspect snapshot back to the baseline.
That’s by design.
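For intuition, a sketch of how a delta-index provider might use the baseline. The {key: item} shape and the function names are assumptions; the ideas themselves (merge baseline + delta, only remove what the baseline knew) are the ones listed above.

```python
# Illustrative baseline usage; the {key: item} shape is an assumption.
def merge_baseline_with_delta(baseline, delta_added, delta_removed_keys):
    """Rebuild the 'present' inventory from the stored baseline plus a fresh delta."""
    present = dict(baseline)                  # last known inventory
    present.update(delta_added)               # apply new/changed items
    for key in delta_removed_keys:
        present.pop(key, None)                # apply observed removals
    return present

def safe_removals(baseline, planned_removal_keys):
    """Only remove items that actually existed in the baseline."""
    return [key for key in planned_removal_keys if key in baseline]
```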
3) Provider-side caching (memo / disk TTL / ETag)
Providers may implement their own caching.
Typical layers
In-run memo (dict in memory)
Disk TTL cache (JSON file) keyed by URL+params
ETag cache (If-None-Match → on 304, reuse the cached body)
These layers can cause the orchestrator to see stale snapshots even when the provider API has new data.
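A sketch of the three layers stacked together. The cache path, helper name, and use of requests are assumptions, not CrossWatch provider code; only the layering (memo → disk TTL → ETag) comes from this page. Note that force_refresh=True skips every layer and omits If-None-Match, which is the lever the SIMKL fix below relies on.

```python
# Illustrative layered fetch: in-run memo -> disk TTL -> ETag revalidation.
import hashlib, json, time
from pathlib import Path
import requests

_MEMO = {}                                    # layer 1: in-run memo
CACHE_DIR = Path(".cache")                    # layers 2-3: disk TTL + ETag (assumed path)

def cached_get(url, ttl_sec=300, force_refresh=False):
    if not force_refresh and url in _MEMO:
        return _MEMO[url]                                     # memo hit

    CACHE_DIR.mkdir(exist_ok=True)
    path = CACHE_DIR / (hashlib.sha1(url.encode()).hexdigest() + ".json")
    disk = json.loads(path.read_text()) if path.exists() else None
    if not force_refresh and disk and time.time() - disk["ts"] < ttl_sec:
        _MEMO[url] = disk["body"]
        return disk["body"]                                   # disk TTL hit

    headers = {}
    if not force_refresh and disk and disk.get("etag"):
        headers["If-None-Match"] = disk["etag"]               # ETag revalidation
    resp = requests.get(url, headers=headers, timeout=30)
    if resp.status_code == 304 and disk:
        body = disk["body"]                                   # 304: reuse cached body
    else:
        resp.raise_for_status()
        body = resp.json()
    path.write_text(json.dumps({"ts": time.time(),
                                "etag": resp.headers.get("ETag"),
                                "body": body}))
    _MEMO[url] = body
    return body
```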
Known hotspot: SIMKL “present PTW snapshot”
In CrossWatch, SIMKL watchlist indexing has historically used:
a “present snapshot” endpoint, then deltas when activities changed.
If the present snapshot is cached and not forced fresh when activities move, you get:
snapshot count unchanged
planner sees no adds
new SIMKL adds never propagate
Fix pattern (already discussed in project handover):
When force_refresh=True, do not send If-None-Match (force a full 200 body).
If the activities timestamp advanced beyond the cache timestamp, force a refresh of the present snapshot.
This is provider-side: the orchestrator can’t fix it; it can only bust its own snapshot memo.
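A sketch of the decision only. The helper names and timestamp fields are assumptions, and fetch is assumed to behave like the cached_get() sketch above (force_refresh=True bypasses the caches and omits If-None-Match).

```python
# Illustrative decision logic for refreshing the present snapshot.
def need_force_refresh(activities_ts, cached_snapshot_ts):
    """True when provider activity is newer than the cached present snapshot."""
    return cached_snapshot_ts is None or activities_ts > cached_snapshot_ts

def fetch_present_snapshot(fetch, activities_ts, cached_snapshot_ts):
    force = need_force_refresh(activities_ts, cached_snapshot_ts)
    return fetch(force_refresh=force)         # forced fetch returns a full 200 body
```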
4) How stale data shows up
A) Planner shows no changes
Likely causes:
provider snapshot itself is stale (provider cache)
orchestrator reused snap_cache within run (TTL too high)
drop guard coerced snapshot back to baseline because checkpoint didn’t advance
B) Adds loop forever
Likely causes:
provider accepted writes but doesn’t reflect them immediately (eventual consistency)
snapshot cache reused after write (no bust)
key mismatch (adds succeed but can’t be matched in next snapshot)
PhantomGuard not active or not recording stickiness correctly
C) Two-way deletes look wrong
Likely causes:
observed deletions ran while one side was “effectively down”
baseline is old and current snapshot is partial (delta mis-marked as present)
drop guard disabled or checkpoints missing
5) Make it fresh (practical levers)
Orchestrator-side
Set runtime.snapshot_ttl_sec = 0 to disable in-run caching (see the sketch after this list).
Ensure pipelines call _bust_snapshot(dst) after writes (they do today).
Enable sync.drop_guard to avoid bogus empty snapshots (helps safety, not freshness).
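The two knobs from this list, shown as a plain dict. The key names (runtime.snapshot_ttl_sec, sync.drop_guard) are the ones used on this page; where the config lives and how it is loaded is not covered here.

```python
# Orchestrator-side freshness settings (key names from this page; storage
# format and location are not specified here).
freshness_settings = {
    "runtime": {
        "snapshot_ttl_sec": 0,    # disable in-run snapshot caching
    },
    "sync": {
        "drop_guard": True,       # reject bogus empty snapshots (safety, not freshness)
    },
}
```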
Provider-side
Disable disk TTL caches temporarily (if configurable)
Force refresh key endpoints when activities moved
When forcing a refresh, avoid If-None-Match so a 304 doesn’t hand you back a stale cached body
After writes, clear in-run memo keys for relevant endpoints
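For the last point, a sketch of clearing in-run memo entries after a write. _MEMO and the URL keying follow the cached_get() sketch above; the prefix matching and the example endpoint are hypothetical.

```python
# Illustrative memo invalidation after a write (follows the cached_get() sketch).
def clear_memo_for(memo, endpoint_prefix):
    """Drop memo entries whose URL starts with the given endpoint prefix."""
    for url in [u for u in memo if u.startswith(endpoint_prefix)]:
        memo.pop(url, None)

# e.g. after a successful watchlist write (hypothetical endpoint):
# clear_memo_for(_MEMO, "https://api.example.com/watchlist")
```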
Related pages
Snapshot building and drop guard: Snapshots
Snapshot TTL and debug knobs: Runtime
Watchlist “ghost add” suppression: Phantom Guard
Two-way delete logic: Two-way sync