# Provider specifics

{% tabs %}
{% tab title="End users" %}
Providers don’t behave the same.

Some are fast and consistent. Some are stale or “eventually consistent”.

This page lists the common issues you’ll actually feel in the UI.

#### Common symptoms and what they usually mean

**Planner shows 0 changes when you know there are changes**

* stale provider snapshot (provider caching)
* CrossWatch snapshot TTL > 0 (in-run caching)

**The same adds repeat every run**

* destination accepted the write but never shows it later
* destination lacks stable IDs so matching never “sticks”

**Two-way wants scary deletes**

* one provider returned a tiny snapshot
* CrossWatch should block it via drop guard / mass delete protection

#### Provider quick notes

* **Plex**: usually has the best external IDs. Missing IDs still happen.
* **Jellyfin/Emby**: missing external IDs are common. Expect matching noise.
* **Trakt/SIMKL**: caching and rate limits can hide recent updates.
* **AniList**: anime keys can be weird. CrossWatch does extra key backfill.

Related:

* Where staleness comes from: [Caching layers](/blueprint-architecture/orchestrator/caching-layers.md)
* Matching and key rules: [Snapshots](/blueprint-architecture/orchestrator/snapshots.md)
* Two-way safety: [Two-way sync](/blueprint-architecture/orchestrator/two-way-sync.md)
  {% endtab %}

{% tab title="Power users" %}
Provider-specific quirks (and how they bite the orchestrator)

This doc is the “yeah, but in real life…” layer: behaviors that are common per provider and how to design around them.

It focuses on quirks that matter for orchestrator correctness:

* stable keys / IDs
* snapshot freshness
* eventual consistency
* deletion semantics
* write confirmation

***

### PLEX

#### Canonical IDs come from GUID parsing

`id_map` contains GUID patterns (`ids_from_guid`) that extract IDs like:

* `tmdb`
* `imdb`
* `tvdb`

If Plex items don’t expose external IDs cleanly:

* canonical keys drift
* diffs become noisy
* Jellyfin/Emby matching gets worse

**Best practice:**

* Always populate `item["ids"]` with the strongest external IDs available (tmdb/imdb/tvdb).
* For episodes/seasons, include typed tokens if you can (tvdb season/episode details), so alias matching works.
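
A minimal sketch of GUID-to-ID extraction, to make the idea concrete. The patterns and the `parse_guid_ids` helper are illustrative assumptions, not the real `ids_from_guid` patterns in `id_map`, which may differ:

```python
import re

# Hypothetical patterns covering modern ("tmdb://603") and legacy agent
# ("com.plexapp.agents.thetvdb://81189/1/2?lang=en") GUID shapes.
_GUID_PATTERNS = [
    ("imdb", re.compile(r"(?:^|\.)imdb://(tt\d+)")),
    ("tmdb", re.compile(r"(?:^|\.)(?:tmdb|themoviedb)://(\d+)")),
    ("tvdb", re.compile(r"(?:^|\.)(?:tvdb|thetvdb)://(\d+)")),
]


def parse_guid_ids(guids):
    """Collect external IDs from GUID strings; the first match per ID type wins."""
    ids = {}
    for guid in guids:
        for name, pattern in _GUID_PATTERNS:
            match = pattern.search(guid)
            if match and name not in ids:
                ids[name] = match.group(1)
    return ids


# Usage: the result is what you would store in item["ids"].
item = {"title": "Example", "ids": parse_guid_ids(["imdb://tt0133093", "tmdb://603"])}
```

GUIDs that carry no external ID (for example an internal `plex://...` GUID) simply produce an empty dict, which is the case where canonical keys start to drift.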

***

### TRAKT

#### ETag and “shadow” patterns

Trakt APIs are rate-limit sensitive and often benefit from:

* ETag caching (If-None-Match)
* a write-through shadow (“we wrote this, treat it as present until the next full sync”)

If your Trakt module uses ETag, be careful:

* A 304 response that reuses a stale cached body can hide new items unless you bust the cache keys after writes.

**Orchestrator impact:**

* stale snapshots → planner shows no changes
* or two-way thinks items were deleted (if the snapshot shrinks sharply)
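
A rough sketch of the “ETag cache plus bust after writes” idea. `ETagCache` and the example URL are hypothetical; this is not the actual Trakt module, only one way the pattern could look:

```python
class ETagCache:
    """Tiny in-memory ETag cache keyed by URL (illustrative only)."""

    def __init__(self):
        self._entries = {}  # url -> (etag, body)

    def request_headers(self, url):
        """Send If-None-Match only when we hold a cached body for this URL."""
        entry = self._entries.get(url)
        return {"If-None-Match": entry[0]} if entry else {}

    def resolve(self, url, status, etag, body):
        """Pick the body to use: reuse the cache on 304, store on 200."""
        if status == 304 and url in self._entries:
            return self._entries[url][1]
        if status == 200 and etag:
            self._entries[url] = (etag, body)
        return body

    def bust(self, *urls):
        """Forget cached entries so the next read is unconditional."""
        for url in urls:
            self._entries.pop(url, None)


# Usage: after a successful write, bust the endpoints that write affects.
cache = ETagCache()
watchlist_url = "https://example.invalid/sync/watchlist"  # placeholder URL
cache.resolve(watchlist_url, 200, 'W/"v1"', ["old-item"])
cache.bust(watchlist_url)
```

Once an entry is busted, `request_headers` returns no `If-None-Match`, so the next read cannot be answered with a 304 against the stale body.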

***

### SIMKL

#### “Present snapshot” vs “delta via activities”

Simkl’s sync model is naturally incremental:

* call `/sync/activities`
* if timestamps moved, fetch `/sync/all-items/...` with `date_from=...`

Many implementations still use a “present snapshot” for watchlist as a baseline. That’s where caching pain happens.

#### Caching layers are the classic foot-gun

If the Simkl module has:

* in-run memo cache
* disk TTL cache
* ETag-based 304 reuse

…you can get a perfectly “successful” run that sees an old watchlist forever.

**Typical symptom:**

* user adds new item on Simkl
* snapshot count doesn’t change
* planner plans 0 adds

**Fix pattern (provider-side):**

* when you need a truly fresh response, do **not** send `If-None-Match`
* if activities timestamp advanced beyond cached timestamp, force-refresh the present snapshot
* after write calls, bust caches for the impacted endpoints

**Orchestrator can’t fix this** beyond busting *its own* in-run snapshot memo.
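
The “activities moved, so force-refresh” rule could be sketched like this. `load_watchlist`, the cache dict shape, and the ISO-8601 timestamp format are assumptions for illustration; `fetch_fresh` stands in for an unconditional GET that sends no `If-None-Match`:

```python
from datetime import datetime


def _parse(ts):
    # Accept a trailing "Z" on Python versions where fromisoformat rejects it.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def load_watchlist(activities_ts, cache, fetch_fresh):
    """Return cached items unless the activities timestamp advanced.

    cache: dict with optional "ts" and "items" (hypothetical shape).
    fetch_fresh: callable that performs an unconditional fetch.
    """
    cached_ts = cache.get("ts")
    if cached_ts is None or _parse(activities_ts) > _parse(cached_ts):
        cache["items"] = fetch_fresh()
        cache["ts"] = activities_ts
    return cache["items"]
```

The point of the design is that freshness is decided by the provider's own activity timestamp, not by how old the cached file happens to be.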

***

### JELLYFIN / EMBY

#### Provider IDs are often missing or inconsistent

Jellyfin libraries frequently index by:

* title + year
* internal item ids

They do not reliably carry tmdb/imdb/tvdb `ProviderIds` for everything.

**Symptom (the famous one):**

* Plex → Jellyfin keeps planning hundreds of adds every run
* because Jellyfin index keys don’t match source external IDs

**Recommended strategy: write-through shadow**

* When you successfully write history/watchlist to Jellyfin, record a shadow entry keyed by source external IDs.
* When building the Jellyfin index, merge live inventory + shadow:
  * live wins when it has real IDs
  * shadow provides stable matching when live lacks IDs

This prevents “re-add forever” loops without needing perfect Jellyfin metadata.
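
A minimal sketch of the merge rule, assuming both indexes are dicts of canonical key to item and that each item carries an `ids` dict. `merge_index` and the key formats shown are hypothetical:

```python
def merge_index(live, shadow):
    """Merge live inventory with write-through shadow entries.

    Live entries replace shadow entries when they carry real IDs.
    Shadow entries stay in place when live has nothing better to offer.
    """
    merged = dict(shadow)
    for key, item in live.items():
        if item.get("ids"):
            merged[key] = item  # live wins when it has real IDs
        else:
            merged.setdefault(key, item)  # keep ID-less live items too
    return merged


live = {
    "tmdb:603": {"title": "A", "ids": {"tmdb": "603"}, "src": "live"},
    "jf:abc": {"title": "B", "ids": {}, "src": "live"},
}
shadow = {
    "tmdb:603": {"title": "A", "ids": {"tmdb": "603"}, "src": "shadow"},
    "imdb:tt0113277": {"title": "B", "ids": {"imdb": "tt0113277"}, "src": "shadow"},
}
merged = merge_index(live, shadow)
```

Here title “B” exists in Jellyfin without external IDs, but the shadow key `imdb:tt0113277` is still in the merged index, so a source item with that IMDb ID matches instead of being planned as an add again.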

#### Eventual consistency

Even when Jellyfin accepts a write, the new state might not be reflected immediately. If you verify-after-write too aggressively, you’ll misclassify success as failure.

**Provider-side mitigations:**

* small post-write delay for specific endpoints
* or “trust write” but use shadow for matching until live catches up
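
One way to combine both mitigations is a bounded check that never downgrades an accepted write to a failure. This is only a sketch: `confirm_write`, the retry budget, and the `"confirmed"` / `"pending"` labels are assumptions, not CrossWatch behavior:

```python
import time


def confirm_write(key, is_visible, attempts=3, delay=0.5, sleep=time.sleep):
    """Poll briefly for a written key.

    Returns "confirmed" if the key shows up within the retry budget,
    otherwise "pending" (the write was accepted, so it is not a failure).
    """
    for attempt in range(attempts):
        if is_visible(key):
            return "confirmed"
        if attempt < attempts - 1:
            sleep(delay)  # small post-write delay before re-checking
    return "pending"
```

A `"pending"` result is the case where the shadow should carry the match until the live inventory catches up.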

***

### ANILIST (watchlist)

#### The orchestrator does special key backfill

`_snapshots._maybe_backfill_anilist_shadow(...)` runs for `feature="watchlist"`:

* It builds a token map from other providers’ IDs (tmdb/imdb/tvdb/etc.)
* For each AniList item, if its IDs overlap that token map, it:
  * enriches missing IDs
  * rekeys AniList entries to a stronger canonical key
  * writes a scoped shadow file:
    * `/config/.cw_state/anilist_watchlist_shadow.<scope>.json`

This exists explicitly to reduce “AniList-only keys” and make cross-provider matching sane.

**Gotcha:**

* The shadow file is scoped. If the scope environment is set incorrectly, the wrong file gets updated.
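
To illustrate the token-map idea (not the real `_maybe_backfill_anilist_shadow` implementation), here is a simplified sketch. `backfill_ids`, `canonical_key`, the ID priority order, and all ID values are made up for the example:

```python
def backfill_ids(anilist_items, other_items):
    """Enrich AniList items with IDs from other providers that share a token."""
    token_map = {}
    for item in other_items:
        ids = item.get("ids", {})
        for name, value in ids.items():
            token_map[(name, str(value))] = ids

    enriched = []
    for item in anilist_items:
        ids = dict(item.get("ids", {}))
        for name, value in list(ids.items()):
            donor = token_map.get((name, str(value)))
            if donor:
                for donor_name, donor_value in donor.items():
                    ids.setdefault(donor_name, donor_value)  # fill gaps only
        enriched.append({**item, "ids": ids})
    return enriched


def canonical_key(ids, priority=("tmdb", "imdb", "tvdb", "anilist")):
    """Pick the strongest available ID as the key (priority is an assumption)."""
    for name in priority:
        if ids.get(name):
            return f"{name}:{ids[name]}"
    return None


others = [{"ids": {"tmdb": "1001", "mal": "2002"}}]
anilist = [{"title": "Example", "ids": {"anilist": "3003", "mal": "2002"}}]
rekeyed = [canonical_key(item["ids"]) for item in backfill_ids(anilist, others)]
```

The AniList entry shares a `mal` token with another provider, picks up its `tmdb` ID, and ends up keyed by that instead of an AniList-only key.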

***

### General “quirks” that are actually design constraints

#### 1) Stable IDs are everything

If providers don’t supply external IDs:

* canonical keys drift
* tombstones/unresolved/blackbox matching gets weaker
* two-way delete logic becomes risky

#### 2) Confirmed keys > confirmed counts

Providers that return only `{ok:true}` or a total count:

* make it impossible to track which items succeeded
* degrade PhantomGuard/unresolved quality

If you can, return:

* `confirmed_keys: [...]`
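
A small sketch of why per-key confirmation matters. `classify_write` and its result shape are hypothetical; only the `confirmed_keys` field name comes from this page:

```python
def classify_write(attempted_keys, response):
    """Split attempted keys by outcome, based on what the provider reports."""
    attempted = list(attempted_keys)
    if "confirmed_keys" not in response:
        # Only {ok: true} or a count: per-item outcome cannot be known.
        return {"confirmed": [], "unresolved": [], "unknown": attempted}
    confirmed = set(response["confirmed_keys"])
    return {
        "confirmed": [k for k in attempted if k in confirmed],
        "unresolved": [k for k in attempted if k not in confirmed],
        "unknown": [],
    }


precise = classify_write(["a", "b"], {"ok": True, "confirmed_keys": ["a"]})
vague = classify_write(["a", "b"], {"ok": True, "count": 1})
```

With `confirmed_keys`, key `b` can be tracked as unresolved. With only a count, both keys land in `unknown` and nothing downstream can tell which one failed.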

#### 3) “Present” vs “delta” semantics must be honest

If an index is partial but you treat it as full:

* you will create false removals
* drop guard will fight it, but only if checkpoints are meaningful

If your provider is incremental:

* declare delta semantics in capabilities (provider-side)
* merge with baseline before diffing
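
The false-removal problem is easy to show with a toy diff. `apply_delta` and `diff_keys` are hypothetical helpers, and real indexes hold items rather than integers:

```python
def apply_delta(baseline, upserts, removed=()):
    """Build a full index from a stored baseline plus an incremental delta."""
    full = dict(baseline)
    full.update(upserts)
    for key in removed:
        full.pop(key, None)
    return full


def diff_keys(src, dst):
    """Return (adds, removes) needed to make dst match src."""
    adds = sorted(set(src) - set(dst))
    removes = sorted(set(dst) - set(src))
    return adds, removes


baseline = {"a": 1, "b": 2}
delta = {"c": 3}  # provider only returned what changed
dst = {"a": 1, "b": 2}

wrong = diff_keys(delta, dst)  # delta mistaken for a full snapshot
right = diff_keys(apply_delta(baseline, delta), dst)
```

Treating the delta as a full snapshot plans removals for `a` and `b`. Merging with the baseline first plans only the add for `c`.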

***

### Related pages

* Key normalization and special backfills: [Snapshots](/blueprint-architecture/orchestrator/snapshots.md)
* Stale data paths: [Caching layers](/blueprint-architecture/orchestrator/caching-layers.md)
* Failure suppression: [Tombstones](/blueprint-architecture/orchestrator/tombstones.md), [Blackbox](/blueprint-architecture/orchestrator/blackbox.md), [Unresolved](/blueprint-architecture/orchestrator/unresolved.md)
* Delete logic: [Two-way sync](/blueprint-architecture/orchestrator/two-way-sync.md)
  {% endtab %}
  {% endtabs %}

