# Two-way sync
Exact two-way pipeline (A ↔ B), including tombstones, observed deletions, and guarded removal propagation.
Two-way sync keeps both sides aligned.
It is powerful. It is also where deletes can propagate.
## When to use it

Use two-way only when:

- both providers match items reliably (stable IDs)
- you have already done a few clean one-way runs
- you understand what "remove" will do
Start two-way with Watchlist only. Keep Remove off at first.
## What to expect

- The first run is safe; it avoids delete propagation.
- Later runs can propagate deletes when the evidence is strong.
- If a provider is down, CrossWatch avoids "everything deleted" behavior.
## Safety mechanisms you'll see

- **Drop guard**: blocks destructive plans from tiny snapshots.
- **Mass delete protection**: blocks large removal waves unless you opt in.
- **Tombstones**: remember deletions to avoid ping-pong.
## Recommended setup

1. Run Dry run first.
2. Enable Add; keep Remove off.
3. Watch plans for a few runs.
4. Only then consider enabling removes.
Related:

- Safer defaults: Best practice
- How deletes are remembered: Tombstones
- Safety model: Guardrails
Two-way means A ↔ B: it tries to keep both sides aligned, and it tries very hard not to turn an API hiccup into mass deletes.
## Implementation notes

- Primary implementation: `cw_platform/orchestrator/_pairs_twoway.py`
- Key helpers: `_snapshots.py`, `_planner.py`, `_tombstones.py`, `_phantoms.py`, `_pairs_blocklist.py`, `_pairs_massdelete.py`, `_unresolved.py`, `_blackbox.py`
Related runtime docs:
## Overview
Two-way is not "run one-way twice". It adds extra logic for safe delete propagation:

- **Observed deletions**: "it used to exist, now it's gone"
- **Tombstones**: "we saw a delete, remember it for a while"
- **Drop guard**: "a tiny snapshot might be wrong"
Per feature, two-way does the following:

1. Gate by provider availability and health.
2. Build snapshots for A and B.
3. Load baselines, checkpoints, and manual policy for both sides.
4. Apply drop-guard and index semantics (present vs delta).
5. Compute observed deletions and write tombstones (when safe).
6. Plan adds and optional removals in both directions.
7. Apply removals first, then adds.
8. Persist baselines and checkpoints.
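The whole pass can be sketched as a single function over per-side key maps. This is an illustrative reduction, not the real orchestrator API; the function name and signature are invented for the example:

```python
def two_way_feature(a_cur, b_cur, prev_a, prev_b, allow_removals=False):
    """Illustrative two-way pass over dicts of key -> item per side."""
    bootstrap = not prev_a and not prev_b
    # Observed deletions only make sense once a baseline exists.
    obs_a = set(prev_a) - set(a_cur) if not bootstrap else set()
    obs_b = set(prev_b) - set(b_cur) if not bootstrap else set()
    # An item missing on the other side is normally an add, unless that
    # side observed its deletion, in which case it may become a removal.
    add_to_b = [k for k in a_cur if k not in b_cur and k not in obs_b]
    add_to_a = [k for k in b_cur if k not in a_cur and k not in obs_a]
    can_remove = allow_removals and not bootstrap
    rem_from_a = [k for k in a_cur if k in obs_b] if can_remove else []
    rem_from_b = [k for k in b_cur if k in obs_a] if can_remove else []
    return {"add_to_A": add_to_a, "add_to_B": add_to_b,
            "rem_from_A": rem_from_a, "rem_from_B": rem_from_b}
```

The real pipeline adds tombstones, drop guards, and blocklists around this core, but the bootstrap rule (empty baselines mean no removals) is visible even in the sketch.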
## Technical reference

### Entry point

```
run_two_way_feature(ctx, src, dst, feature, fcfg, health_map)
```

Emits `feature:start` / `feature:done` and calls `_two_way_sync(ctx, a, b, ...)`.
### Step 0 — Health gating

If either side reports `auth_failed`, the pair emits `pair:skip` and returns `ok: false`. If either side is `down`:

- writes to that side are skipped (items become unresolved)
- observed deletions are disabled for both sides (to avoid "down = everything deleted")
Feature support is also checked: provider `capabilities.features[feature]` (best effort), plus health `features[feature]` if reported. If either side doesn't support the feature → `feature:unsupported` and no-op.
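The gating rules above can be condensed into a small decision function. A minimal sketch, assuming a simple `{"status": ...}` health shape (the real health map fields may differ):

```python
def gate_health(a_health, b_health):
    """Illustrative health gate for a two-way pair."""
    for h in (a_health, b_health):
        if h.get("status") == "auth_failed":
            # auth failure on either side skips the whole pair
            return {"skip_pair": True, "reason": "auth_failed"}
    a_down = a_health.get("status") == "down"
    b_down = b_health.get("status") == "down"
    return {
        "skip_pair": False,
        "write_to_a": not a_down,   # writes to a down side are skipped
        "write_to_b": not b_down,
        # any down side disables observed deletions on BOTH sides
        "observed_deletes": not (a_down or b_down),
    }
```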
### Step 1 — Build snapshots

`build_snapshots_for_feature(feature, providers={A,B}, ...)` returns:

- `A_cur`: current snapshot from A
- `B_cur`: current snapshot from B

Snapshots may be memoized per run using `ctx.snap_cache` and `ctx.snap_ttl_sec`.
### Step 2 — Baselines, checkpoints, manual policy

Loads the stable "previous state" once per run:

```
prev_state = ctx._stable_prev_state or state_store.load_state()
```

From `prev_state`:

- `prevA` baseline items: `state.providers[A][feature].baseline.items`
- `prevB` baseline items: `state.providers[B][feature].baseline.items`
- `prev_cp_A` / `prev_cp_B`: stored checkpoints

Current checkpoints:

- `now_cp_A = module_checkpoint(aops, cfg, feature)`
- `now_cp_B = module_checkpoint(bops, cfg, feature)`

Manual policy is loaded for each side:

- `manual_adds_A`, `manual_blocks_A`
- `manual_adds_B`, `manual_blocks_B`

Manual adds are merged into `A_eff`/`B_eff`; manual blocks filter both adds and removes.
### Step 3 — Drop guard (optional)

Controlled by `sync.drop_guard` (bool) and the runtime knobs:

- `runtime.suspect_min_prev` (default 20)
- `runtime.suspect_shrink_ratio` (default 0.10)
- `runtime.suspect_debug` (default True)

If enabled, each side's "present snapshot" can be coerced back to baseline when:

- the previous baseline is sufficiently large
- the current snapshot shrank too much
- the checkpoint did not advance

Flags produced: `A_suspect`, `B_suspect`. These flags later suppress observed deletions for that side.
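The three conditions combine like this. A sketch only: the exact shrink formula is an assumption (here, "shrank too much" means losing more than `suspect_shrink_ratio` of the baseline), and defaults mirror the `runtime.suspect_*` knobs above:

```python
def guard_snapshot(prev, cur, cp_advanced, min_prev=20, shrink_ratio=0.10):
    """Coerce a suspicious 'present' snapshot back to baseline.

    prev/cur: dicts of key -> item; cp_advanced: did the checkpoint move?
    Returns (effective_snapshot, suspect_flag)."""
    lost = len(prev) - len(cur)
    suspect = (
        len(prev) >= min_prev                 # baseline sufficiently large
        and lost > len(prev) * shrink_ratio   # snapshot shrank too much (assumed formula)
        and not cp_advanced                   # checkpoint did not advance
    )
    return (dict(prev), True) if suspect else (dict(cur), False)
```

An advancing checkpoint is treated as evidence the shrink is real, so the raw snapshot is kept in that case.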
### Step 4 — Index semantics: present vs delta

Per provider capability, `capabilities.index_semantics` defaults to `"present"`. Then:

- if `"delta"`: `eff = prev | cur`
- else: `eff = guarded_cur` (or the raw `cur` if drop guard is off)

Result: `A_eff`, `B_eff`.

Library whitelists (history/ratings) can filter:

- `prevA`, `A_cur`, `A_eff`
- `prevB`, `B_cur`, `B_eff`
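The present-vs-delta distinction is just a union rule; a minimal sketch:

```python
def effective_index(prev, cur, semantics="present"):
    """'present' providers report full state, so cur stands alone;
    'delta' providers report only changes, so eff unions baseline + delta."""
    if semantics == "delta":
        return {**prev, **cur}   # eff = prev | cur (cur wins on overlapping keys)
    return dict(cur)             # eff = guarded (or raw) current snapshot
```

For a delta provider, absence from `cur` means "unchanged", not "deleted", which is why the baseline is merged back in.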
### Step 5 — Tombstones (pair-scoped) + TTL

Two-way uses pair-scoped tombstones only.

- Pair key: `pair_key = "-".join(sorted([A, B]))` (e.g., `PLEX-SIMKL`)
- Load: `tomb_map = keys_for_feature(state_store, feature, pair=pair_key)`
- Filter by TTL: `sync.tombstone_ttl_days` (default 30)

Result: `tomb` is the set of tokens still "alive" (canonical keys and ID tokens). Tokens look like `imdb:tt...`, `tmdb:123`, etc.
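The TTL filter is a straightforward age check. A sketch, assuming `tomb_map` maps token to the Unix timestamp of the observed delete:

```python
import time

def live_tombstones(tomb_map, ttl_days=30, now=None):
    """Keep only tombstone tokens younger than the TTL."""
    now = time.time() if now is None else now
    ttl = ttl_days * 86400  # days -> seconds
    return {tok for tok, ts in tomb_map.items() if now - ts <= ttl}
```

Expired tombstones simply stop matching, so a long-ago delete no longer blocks a deliberate re-add.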
### Step 6 — Observed deletions
This is how two-way decides “an item disappeared on one side, so we should delete it on the other side too”.
#### When it runs

Observed deletions run only if:

- not bootstrapping (see below)
- `sync.include_observed_deletes` is true (default True)
- neither provider is down
- the relevant side's snapshot is not suspect (`A_suspect`/`B_suspect == False`)
#### How it's computed

For each side:

- `obsA = prevA.keys - A_cur.keys`
- `obsB = prevB.keys - B_cur.keys`

Interpretation: "it used to exist (baseline), and the live snapshot no longer contains it."
#### What happens next

- `newly = (obsA | obsB) - tomb`
- Every token in `newly` is written into tombstones with timestamp `now`.
- The orchestrator also tries to include ID tokens for the item by looking it up in `prevA` or `prevB`: it writes `imdb:...`, `tmdb:...`, etc. for stronger matching across providers.
- Finally, observed-deleted keys are removed from the effective indices: `A_eff.pop(k)` for `k in obsA`, and `B_eff.pop(k)` for `k in obsB`.

This prevents immediately re-adding an observed-deleted item back to the same side.
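For one side, the whole step reduces to a few set operations. A sketch (the real code also collects ID tokens from the baseline item; the signature here is invented):

```python
def observed_deletes(prev, cur, eff, tomb, now, suspect=False):
    """One side's observed deletions: keys in the baseline but gone from
    the live snapshot. Suppressed entirely when the snapshot is suspect."""
    obs = set() if suspect else set(prev) - set(cur)
    newly = obs - tomb                      # only record deletes not already known
    new_tombs = {k: now for k in newly}     # token -> timestamp of the delete
    # eff may still hold the key (delta semantics); pop it so the planner
    # does not immediately re-add the item to the side that deleted it
    eff = {k: v for k, v in eff.items() if k not in obs}
    return obs, new_tombs, eff
```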
#### Bootstrap mode

If this is the first time syncing this pair+feature:

```
bootstrap = (not prevA) and (not prevB) and not tomb
```

Then:

- observed deletions are skipped
- later removals are forced off, even if `allow_removals` is true

Translation: the first run won't delete things. Good.
### Step 7 — Tombstone expansion (tombX)

Tombstone tokens can be:

- canonical keys,
- ID tokens, or
- episode/season typed tokens.

Two-way expands tombstones into `tombX` by resolving tokens through alias maps:

- alias maps are built from `A_eff`, `B_eff`, `prevA`, `prevB`
- if a tomb token matches a typed token, it maps back to a canonical key
- those canonical keys are added to `tombX`

Net effect: tombstones match more reliably even if providers disagree on the primary key.
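The expansion itself is a token-to-key lookup across the alias maps. A sketch, assuming each alias map is a plain `{token: canonical_key}` dict:

```python
def expand_tombstones(tomb, alias_maps):
    """Resolve tombstone tokens back to canonical keys via alias maps
    (built from A_eff, B_eff, prevA, prevB). Original tokens are kept."""
    tombx = set(tomb)
    for amap in alias_maps:
        for token in tomb:
            key = amap.get(token)
            if key:
                tombx.add(key)   # typed/ID token resolved to a canonical key
    return tombx
```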
### Step 8 — Planning

Two-way produces four lists: `add_to_A`, `add_to_B`, `rem_from_A`, `rem_from_B`.
#### Presence features (watchlist/history/playlists)

For each item present in `A_eff` but not present in `B_eff`:

- normally → `add_to_B`
- but if removals are enabled and deletion is strongly indicated → `rem_from_A`

Deletion "strong signals" (any of):

- item tokens hit `tombX`
- canonical key is in `tombX`
- key is in `obsB` (B observed a deletion)
- key is in `shrinkB` (a B baseline key missing from the current snapshot)

…and the item must have existed on the opposite side before: `_prev_had(prevB, prevB_alias, item)` is true, OR it hits tombstones (`tombX`), which counts as strong enough evidence.
Same logic is repeated in reverse for items present in B_eff but not in A_eff.
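The add-vs-remove decision for one direction can be written as a single predicate. An illustrative sketch (the function and its flat arguments are invented; the real planner works on richer item objects):

```python
def plan_presence(item_key, tokens, in_b_eff, tombx, obs_b, shrink_b,
                  prev_had_b, allow_removals):
    """Item present on A, possibly absent on B: decide add_to_B vs rem_from_A.

    tokens: the item's ID tokens; tombx/obs_b/shrink_b: sets as described above;
    prev_had_b: did B's baseline ever contain this item?"""
    if in_b_eff:
        return None  # already present on B, nothing to plan
    strong = (bool(tokens & tombx) or item_key in tombx
              or item_key in obs_b or item_key in shrink_b)
    # tombstone hits count as sufficient evidence on their own
    evidenced = prev_had_b or bool(tokens & tombx) or item_key in tombx
    if allow_removals and strong and evidenced:
        return "rem_from_A"
    return "add_to_B"
```

Note how the default answer is always the add: removal requires opt-in plus a strong signal plus evidence the item ever lived on the other side.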
#### Ratings (special)

For ratings, compute diffs in both directions:

- `upserts_to_B, unrates_on_B = diff_ratings(A, B)`
- `upserts_to_A, unrates_on_A = diff_ratings(B, A)`

Conflict resolution:

- if both sides want to upsert, pick a winner: the newer `rated_at` wins (if both parse); otherwise `sync.bidirectional.source_of_truth` wins (default: A)
- if one side wants an upsert and the other wants an unrate:
  - if removals are enabled and tombstones match → unrate
  - else propagate the upsert (keep the rating)
Important: unrates/removals are only planned if `allow_removals` is enabled.
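The upsert-vs-upsert rule can be sketched directly. Assumptions: ratings are dicts with a numeric `rated_at` timestamp, and an unparseable/missing timestamp falls back to the source of truth:

```python
def resolve_rating_conflict(a, b, source_of_truth="A"):
    """Both sides upserted the same item: newer rated_at wins;
    otherwise sync.bidirectional.source_of_truth decides (default A)."""
    ts_a, ts_b = a.get("rated_at"), b.get("rated_at")
    if isinstance(ts_a, (int, float)) and isinstance(ts_b, (int, float)):
        return a if ts_a >= ts_b else b   # newer timestamp wins
    return a if source_of_truth == "A" else b
```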
### Step 9 — Blocklists, manual blocks, phantoms

Unresolved keys block adds on each destination: `load_unresolved_keys(dst, feature, cross_features=True)`.

For non-watchlist features, `apply_blocklist(...)` also blocks adds using tombstones + unresolved + blackbox. Manual blocks apply to both adds and removes.

Watchlist uses PhantomGuard (not blackbox) to avoid add flapping: `PhantomGuard(src=X, dst=Y, feature, ttl_days=cooldown_days)`. Ratings upserts are split:

- updates are not phantom-filtered
- fresh adds are phantom-filtered
### Step 10 — Mass delete protection

Before applying removals, `maybe_block_mass_delete(...)` can drop the entire removal list if it exceeds a ratio of the baseline size. Controlled by:

- `sync.allow_mass_delete`
- runtime ratio: `runtime.suspect_shrink_ratio` (default 0.10)
This is the “no, you’re not deleting 900 items because the API had a nap” guard.
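The guard is a single ratio check. A sketch with an assumed signature (the real `maybe_block_mass_delete(...)` takes orchestrator context):

```python
def maybe_block_mass_delete(removals, baseline_size, ratio=0.10, allow=False):
    """Drop the whole removal list if it exceeds `ratio` of the baseline,
    unless mass deletes are explicitly allowed (sync.allow_mass_delete)."""
    if allow or baseline_size == 0:
        return removals
    if len(removals) > baseline_size * ratio:
        return []   # blocked: too many removals in one wave
    return removals
```

All-or-nothing is deliberate: trimming the list to the threshold would still delete real data on a bad snapshot.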
### Step 11 — Apply order

Two-way applies in this order:

1. removals from A
2. removals from B
3. adds to A
4. adds to B
Why removals first?
It lets tombstones get recorded before add planning on later passes in the same run (and keeps indices tidy).
#### Tombstone marking on successful removals

After confirmed removals (and not in dry-run), `_mark_tombs(rem_items)` writes:

- the canonical key
- every `ids.*` token in the removed items

into pair-scoped tombstones, with key format `{feature}:{PAIR}|{token}`.
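Putting the pair key and token format together, the stored keys for one removed item look like this. A sketch; the helper name and item shape are illustrative:

```python
def tombstone_keys(feature, provider_a, provider_b, item):
    """Build pair-scoped tombstone keys for a removed item:
    canonical key plus every ids.* token, as {feature}:{PAIR}|{token}."""
    pair = "-".join(sorted([provider_a, provider_b]))   # e.g. PLEX-SIMKL
    tokens = [item["key"]] + [f"{k}:{v}" for k, v in item.get("ids", {}).items()]
    return [f"{feature}:{pair}|{t}" for t in tokens]
```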
### Step 12 — Apply results, confirmations, blackbox + unresolved

Adds use the same "effective confirmation" rules as one-way:

- if the provider returns `confirmed_keys`, those are used
- else, success is approximated as `attempted - newly_unresolved`

If `sync.verify_after_write` is enabled and the provider supports it, successes are further reduced by re-checking unresolved after the write.
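The confirmation fallback is simple set arithmetic; a minimal sketch:

```python
def effective_successes(attempted, confirmed_keys=None, newly_unresolved=()):
    """Provider-confirmed keys win; otherwise approximate success as
    everything attempted minus whatever just became unresolved."""
    if confirmed_keys is not None:
        return set(confirmed_keys)
    return set(attempted) - set(newly_unresolved)
```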
Blackbox integration:

- failed attempted keys (excluding skipped) call `record_attempts(dst, feature, failed_keys, reason="two:apply:add:failed", op="add", pair=pair_key)`
- successes call `record_success(dst, feature, success_keys, pair=pair_key)`

Unresolved:

- when a provider is down, everything becomes unresolved with hints `provider_down:add`/`provider_down:remove`
- when apply failures occur, those items are recorded unresolved with hint `apply:add:failed`
### Step 13 — Persist baselines + checkpoints

At the end, two-way persists:

- baseline items for both providers (filtered)
- checkpoints (if present)

Filter rules match one-way:

- skip items marked transient (`_cw_transient`, `_cw_skip_persist`, etc.)
- skip items with provider subobject `ignored == True`

Writes `state.json` with the updated `last_sync_epoch`.
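The persistence filter can be sketched as one pass over the items. The marker names come from the rules above; the function itself and the item shape are illustrative:

```python
def persistable_baseline(items, provider):
    """Drop transient items and items the provider marked ignored
    before persisting the baseline."""
    out = {}
    for key, item in items.items():
        if item.get("_cw_transient") or item.get("_cw_skip_persist"):
            continue  # transient: never persisted
        if (item.get(provider) or {}).get("ignored") is True:
            continue  # provider subobject says ignored
        out[key] = item
    return out
```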
## Config knobs that matter

Under `sync`:

- `enable_add` / `enable_remove` (or per-feature overrides)
- `include_observed_deletes`
- `tombstone_ttl_days`
- `drop_guard`
- `allow_mass_delete`
- `verify_after_write`
- `dry_run`

Under `sync.bidirectional` (ratings conflict preference):

- `source_of_truth` (A or B provider name)

Root `blackbox` (PhantomGuard):

- `enabled`, `block_adds`, `cooldown_days`
## Notes

- The first run is bootstrap mode; it avoids observed deletes and removals.
- If either side is `down`, observed deletions are disabled for both sides.
- Drop-guard suspect snapshots suppress observed deletions for that side.