# ADR-001: Josh-proxy for Bidirectional Sync
**Status:** Accepted
**Date:** 2026-01
## Context
We need bidirectional sync between a monorepo and N external subrepos. Each subrepo corresponds to a subfolder in the monorepo. Developers on both sides should see a clean, complete git history — not synthetic commits or squashed blobs.
### Alternatives considered
1. **git subtree**: Ships with git (in `contrib`). `git subtree split` extracts a subfolder into a standalone history. However, the split re-walks and rewrites the history on every run (O(n) in total commits), so it gets slower as the monorepo grows. Bidirectional sync requires manual `git subtree merge` with conflict-prone history grafting. There is no transport-layer filtering: all content must be fetched before it can be split.
2. **git submodule**: Tracks external repos via `.gitmodules` pointer commits. Does not provide content-level integration — monorepo commits don't contain subrepo files directly. Developers must run `git submodule update`. Bidirectional sync is not a supported workflow.
3. **Custom diff-and-patch scripts**: Compute diffs between monorepo subfolder and subrepo, apply patches in both directions. Fragile with renames, binary files, and merge conflicts. Loses authorship and commit granularity.
4. **josh-proxy**: A git proxy that computes filtered views of repositories on the fly. Clients `git clone` through josh and receive a repo containing only the specified subfolder, with history rewritten to match. Josh maintains a persistent SHA mapping, so the same monorepo commit always produces the same filtered SHA. It is also bidirectional: pushing back through josh maps filtered commits to monorepo commits.
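
To make the josh-proxy option more concrete, the sketch below builds the URL a client would clone to get a filtered view. It assumes josh's URL convention of appending a `:/<subfolder>` filter to the upstream repository path. The proxy host, port, and repository names are hypothetical placeholders, not values from this project.

```python
def josh_clone_url(proxy_base: str, upstream_repo: str, subfolder: str) -> str:
    """Build a josh-proxy URL that serves only `subfolder` of `upstream_repo`."""
    base = proxy_base.rstrip("/")
    repo = upstream_repo.strip("/")
    if not repo.endswith(".git"):
        repo += ".git"
    path = subfolder.strip("/")
    if not path:
        raise ValueError("subfolder must not be empty")
    # Assumed josh convention: "<repo>.git:/<subfolder>.git" selects the subfolder view.
    return f"{base}/{repo}:/{path}.git"


# Hypothetical proxy address and repository names, for illustration only.
url = josh_clone_url("http://josh.example.internal:8000", "org/monorepo", "services/billing")
print(url)
```

Passing the resulting URL to `git clone` should yield a repository whose root is the chosen subfolder. Pushing to the same URL is what sends changes back toward the monorepo.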
## Decision
Use josh-proxy as the transport layer for all sync operations.
## Consequences
**Positive:**
- Clean git history in both directions — no synthetic commits
- Deterministic SHA mapping — same monorepo state always produces same filtered SHA
- Bidirectional by design — push through josh maps back to monorepo
- Transport-layer filtering — content exclusion happens at clone/push time, not via generated files
- Supports any git hosting platform (Gitea, GitHub, GitLab) since it's a proxy
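
The determinism point rests on git's content addressing: an object id is a hash of the object's bytes, so a filter that always emits the same tree, parents, and metadata for a given monorepo commit always yields the same filtered SHA. The sketch below illustrates only that git property. It is not josh's implementation, and the identity string and commit message are made up for the example.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Return the SHA-1 id git assigns to an object with this type and body."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree: str, parents: list[str], ident: str, message: str) -> bytes:
    """Serialize a commit the way git stores it (author and committer are the same here)."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {ident}", f"committer {ident}", "", message]
    return ("\n".join(lines) + "\n").encode()


# Hypothetical filtered commit: empty tree, no parents, made-up identity.
EMPTY_TREE = git_object_id("tree", b"")
IDENT = "Dev <dev@example.com> 1700000000 +0000"

first = git_object_id("commit", commit_body(EMPTY_TREE, [], IDENT, "Add invoice model"))
second = git_object_id("commit", commit_body(EMPTY_TREE, [], IDENT, "Add invoice model"))
changed = git_object_id("commit", commit_body(EMPTY_TREE, [], IDENT, "Add invoice models"))

print(first == second)   # same inputs, same SHA
print(first == changed)  # any change to the inputs gives a different SHA
```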
**Negative:**
- Requires running a josh-proxy instance (operational overhead)
- Josh-proxy is a Rust project with a smaller community than git-native tools
- Proxy must have network access to the monorepo's git server
- Josh's SHA mapping is opaque — debugging requires understanding josh internals
- First-parent traversal behavior must be respected in merge commits (see ADR-008)
**Risks:**
- Josh-proxy downtime blocks all sync operations
- Josh-proxy bugs could corrupt the history mapping (mitigated by pushing with `--force-with-lease` on forward sync and always going through a PR on reverse sync)
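
The force-with-lease mitigation can be sketched as a guarded push: the push is rejected unless the remote branch still points at the commit the sync last observed. The helper below only builds the argument list. The function name, remote name, and branch are hypothetical, and how the sync tooling actually invokes git is not specified in this ADR.

```python
def forward_push_args(remote: str, branch: str, expected_sha: str, new_sha: str) -> list[str]:
    """Build a `git push` argv that refuses to overwrite unexpected remote state."""
    ref = f"refs/heads/{branch}"
    return [
        "git",
        "push",
        # Only update the ref if the remote still has the SHA we last saw.
        f"--force-with-lease={ref}:{expected_sha}",
        remote,
        f"{new_sha}:{ref}",
    ]


# Hypothetical values, for illustration only.
args = forward_push_args("subrepo", "main", "a" * 40, "b" * 40)
print(" ".join(args))
```

A caller could pass `args` to `subprocess.run(args, check=True)`. If someone pushed to the branch in the meantime, git rejects the update instead of silently discarding their commits.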