# Caching layers

{% tabs %}
{% tab title="End users" %}
Caching issues usually show up as “the planner is wrong”.

Most of the time, the planner is fine. The input snapshot was stale.

#### The short version

* CrossWatch can cache snapshots within a run.
* Providers can cache API responses too.
* Stale snapshots show up as “no changes” or as repeated adds.

#### Quick fixes

* Set `runtime.snapshot_ttl_sec = 0`.
* Rerun once (eventual consistency is common).
* If it’s still wrong, suspect provider-side caching.

Related:

* Snapshot behavior: [Snapshots](/blueprint-architecture/orchestrator/snapshots.md)
* Watchlist add flapping: [Phantom Guard](/blueprint-architecture/orchestrator/phantom-guard.md)
  {% endtab %}

{% tab title="Power users" %}
**Orchestrator caching:** `orchestrator/_snapshots.py` (in-run snapshot cache)\
**Provider caching:** varies; SIMKL is known to have disk TTL + ETag layers in some modules.

***

### 1) Orchestrator snapshot caching (in-run only)

#### What it is

The orchestrator can cache snapshots (indices) within a single run.

* Stored in `ctx.snap_cache`
* Controlled by `runtime.snapshot_ttl_sec`

If TTL > 0:

* repeated calls to `build_snapshots_for_feature(feature, ...)` can reuse an already-built index for provider+feature.
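
The reuse rule above can be sketched as a small TTL cache keyed by provider and feature. This is a minimal illustration, not the code in `orchestrator/_snapshots.py`: the `SnapshotCache` class, its `get_or_build` method, and the injected `clock` are hypothetical names, and the real `ctx.snap_cache` layout may differ.

```python
import time
from typing import Callable, Dict, Tuple

Index = Dict[str, dict]  # assumed shape: item key -> item payload


class SnapshotCache:
    """Hypothetical stand-in for ctx.snap_cache (in-run only, never persisted)."""

    def __init__(self, ttl_sec: float, clock: Callable[[], float] = time.monotonic):
        self.ttl_sec = ttl_sec
        self.clock = clock
        self._entries: Dict[Tuple[str, str], Tuple[float, Index]] = {}

    def get_or_build(self, provider: str, feature: str, build: Callable[[], Index]) -> Index:
        key = (provider, feature)
        now = self.clock()
        hit = self._entries.get(key)
        # Reuse only when caching is enabled and the entry is younger than the TTL.
        if self.ttl_sec > 0 and hit is not None and now - hit[0] < self.ttl_sec:
            return hit[1]
        index = build()
        if self.ttl_sec > 0:
            self._entries[key] = (now, index)
        return index
```

With `ttl_sec = 0` every call rebuilds the index, which is why setting `runtime.snapshot_ttl_sec = 0` is the first thing to try when a snapshot looks stale.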

#### Why it exists

* two-way planning may build snapshots once and reuse them in multiple stages
* some UI-triggered flows may call orchestration pieces repeatedly

#### What it does *not* do

* It does not persist across runs.
* It does not cache per endpoint; it caches the final index dict for each provider+feature.

#### Busting the cache

Pipelines call `_bust_snapshot(provider)` after successful writes to that provider so later stages don’t reuse a stale index.
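
As a rough sketch of what busting means, assume the cache is a plain dict keyed by `(provider, feature)`. The function below is illustrative only; the real `_bust_snapshot` may have a different signature and cache layout, and the provider name `OTHER` is a placeholder.

```python
from typing import Dict, Tuple


def bust_snapshot(snap_cache: Dict[Tuple[str, str], dict], provider: str) -> int:
    """Drop every cached index for `provider`; return how many entries were removed."""
    stale = [key for key in snap_cache if key[0] == provider]
    for key in stale:
        del snap_cache[key]
    return len(stale)


# Hypothetical cache contents after snapshots were built for two providers.
snap_cache = {
    ("SIMKL", "watchlist"): {"imdb:tt1": {}},
    ("SIMKL", "ratings"): {},
    ("OTHER", "watchlist"): {"imdb:tt1": {}},
}

# After a successful write to SIMKL, only SIMKL entries are dropped,
# so the next stage rebuilds them instead of reusing a pre-write index.
removed = bust_snapshot(snap_cache, "SIMKL")
```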

If you see “adds confirmed” but later logic still treats the item as missing, the snapshot cache was probably not busted, or the TTL is too long.

***

### 2) Baselines (cross-run “memory”)

Baselines in `state.json` are a form of cache:

* they store the last “known inventory” per provider+feature

They are used for:

* delta-index providers (merge baseline + delta)
* drop guard comparisons (detect suspect snapshots)
* observed deletions (two-way)
* safe removals (only remove items that existed in baseline)

This is not a cache of API responses.

It can still **mask live misses** when drop guard coerces a suspect snapshot back to the baseline.

That’s by design.
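
Two of the baseline uses listed above, the delta merge and safe removals, can be sketched as follows. The function names and the index shape (item key mapped to a payload dict) are assumptions made for illustration; they do not reflect the actual layout of `state.json`.

```python
from typing import Dict, Iterable, Set

Index = Dict[str, dict]  # assumed shape: item key -> item payload


def merge_baseline_and_delta(baseline: Index, added: Index, removed_keys: Iterable[str]) -> Index:
    """Rebuild a full index for a delta-index provider: baseline + adds - removes."""
    merged = dict(baseline)  # copy, so the stored baseline is not mutated
    merged.update(added)
    for key in removed_keys:
        merged.pop(key, None)
    return merged


def safe_removals(candidates: Iterable[str], baseline: Index) -> Set[str]:
    """Keep only removal candidates that were present in the last known inventory."""
    return {key for key in candidates if key in baseline}


baseline = {"imdb:tt1": {}, "imdb:tt2": {}}
current = merge_baseline_and_delta(baseline, {"imdb:tt3": {}}, ["imdb:tt2"])
to_remove = safe_removals(["imdb:tt1", "imdb:tt9"], baseline)
```

Here `imdb:tt9` is filtered out because it never appeared in the baseline, so a removal for it cannot be justified from known inventory.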

***

### 3) Provider-side caching (memo / disk TTL / ETag)

Providers may implement their own caching.

#### Typical layers

1. In-run memo (dict in memory)
2. Disk TTL cache (JSON file) keyed by URL+params
3. ETag cache (send `If-None-Match`; on a 304, reuse the cached body)

These layers can cause the orchestrator to see stale snapshots even when the provider API has new data.
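
To show how the three layers stack, here is a minimal in-memory sketch. Everything in it is hypothetical: the `LayeredFetcher` class, the `transport` callable standing in for an HTTP client, and the dict that stands in for the on-disk JSON file. No real provider module is reproduced here.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# transport(url, etag) -> (status, body, etag); hypothetical stand-in for an HTTP client.
Transport = Callable[[str, Optional[str]], Tuple[int, Optional[dict], Optional[str]]]


class LayeredFetcher:
    def __init__(self, transport: Transport, disk_ttl_sec: float,
                 clock: Callable[[], float] = time.time):
        self.transport = transport
        self.disk_ttl_sec = disk_ttl_sec
        self.clock = clock
        self.memo: Dict[str, dict] = {}  # layer 1: in-run memo
        # Layers 2 and 3: (saved_at, body, etag). A dict stands in for the JSON file.
        self.disk: Dict[str, Tuple[float, dict, Optional[str]]] = {}

    def get(self, url: str) -> dict:
        if url in self.memo:  # layer 1 hit: no TTL check, no network
            return self.memo[url]
        entry = self.disk.get(url)
        now = self.clock()
        if entry is not None and now - entry[0] < self.disk_ttl_sec:
            body = entry[1]  # layer 2 hit: still inside the disk TTL
        else:
            etag = entry[2] if entry is not None else None
            status, fresh, new_etag = self.transport(url, etag)
            if status == 304 and entry is not None:
                body = entry[1]  # layer 3: server says unchanged, reuse cached body
                self.disk[url] = (now, body, entry[2])
            else:
                body = fresh or {}
                self.disk[url] = (now, body, new_etag)
        self.memo[url] = body
        return body
```

Note that a change on the server stays invisible until both the memo and the disk TTL have been bypassed, which is the staleness the orchestrator cannot see from its side.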

#### Known hotspot: SIMKL “present PTW snapshot”

In CrossWatch, SIMKL watchlist indexing has historically used:

* a “present snapshot” endpoint, then deltas when activities changed.

If the present snapshot is cached and not forced fresh when activities move, you get:

* snapshot count unchanged
* planner sees no adds
* new SIMKL adds never propagate

Fix pattern (already discussed in project handover):

* when `force_refresh=True`, do **not** send `If-None-Match` (force a 200 with a full body)
* if activities timestamp advanced beyond cache timestamp, force refresh present snapshot
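
The two rules of the fix pattern can be expressed as small pure functions. This is a sketch under assumptions: the function names are hypothetical, timestamps are taken to be comparable numbers (for example epoch seconds), and the real SIMKL module may structure this differently.

```python
from typing import Dict, Optional


def should_force_refresh(activities_ts: float, cached_at: Optional[float]) -> bool:
    """Force a refresh when nothing is cached or activities moved past the cache."""
    return cached_at is None or activities_ts > cached_at


def conditional_headers(cached_etag: Optional[str], force_refresh: bool) -> Dict[str, str]:
    """Build request headers; omit If-None-Match when a fresh 200 body is required."""
    if force_refresh or not cached_etag:
        return {}
    return {"If-None-Match": cached_etag}


# Activities advanced after the present snapshot was cached, so refresh is forced
# and the conditional header is dropped.
force = should_force_refresh(activities_ts=1700000500.0, cached_at=1700000000.0)
headers = conditional_headers("abc123", force_refresh=force)
```

Keeping the decision separate from the header construction makes each rule easy to test without touching the network.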

This is provider-side; the orchestrator can’t fix it, it can only bust its own snapshot memo.

***

### 4) How stale data shows up

#### A) Planner shows no changes

Likely causes:

* provider snapshot itself is stale (provider cache)
* orchestrator reused `ctx.snap_cache` within the run (TTL too high)
* drop guard coerced snapshot back to baseline because checkpoint didn’t advance

#### B) Adds loop forever

Likely causes:

* provider accepted writes but doesn’t reflect them immediately (eventual consistency)
* snapshot cache reused after write (no bust)
* key mismatch (adds succeed but can’t be matched in next snapshot)
* Phantom Guard not active or not recording stickiness correctly

#### C) Two-way deletes look wrong

Likely causes:

* observed deletions ran while one side was “effectively down”
* baseline is old and current snapshot is partial (delta mis-marked as present)
* drop guard disabled or checkpoints missing

***

### 5) Make it fresh (practical levers)

#### Orchestrator-side

* Set `runtime.snapshot_ttl_sec = 0` (disable in-run caching)
* Ensure pipelines call `_bust_snapshot(dst)` after writes (they do today)
* Enable `sync.drop_guard` to avoid bogus empty snapshots (helps safety, not freshness)

#### Provider-side

* Disable disk TTL caches temporarily (if configurable)
* Force refresh key endpoints when activities moved
* When forcing refresh, omit `If-None-Match` so a 304 can’t hand back the stale cached body
* After writes, clear in-run memo keys for relevant endpoints

***

### Related pages

* Snapshot building and drop guard: [Snapshots](/blueprint-architecture/orchestrator/snapshots.md)
* Snapshot TTL and debug knobs: [Runtime](/blueprint-architecture/orchestrator/runtime.md)
* Watchlist “ghost add” suppression: [Phantom Guard](/blueprint-architecture/orchestrator/phantom-guard.md)
* Two-way delete logic: [Two-way sync](/blueprint-architecture/orchestrator/two-way-sync.md)
  {% endtab %}
  {% endtabs %}

