9 Commits
v1.1 ... v1.2

Author SHA1 Message Date
8ab07b83ab Update docs, changelog, examples, and add ADRs for v1.2
- Add v1.1.0 and v1.2.0 changelog entries
- Add exclude field to config reference and example config
- Add ADRs documenting all major design decisions
- Fix step numbering in reverse_sync()
- Fix action.yml to copy VERSION file
- Add dist/ and .env to .gitignore
- Use refs/tags/ format for Nix flake tag refs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 21:28:40 +03:00
95b83bd538 Fix PR body newlines rendering as literal \n
Bash double-quoted strings don't interpret \n as newlines.
Use actual newlines in the pr_body strings instead.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 16:13:13 +03:00
ce53d3c1d2 Fix reconciliation parent order and add reverse sync tree check
- Swap parent order in reconcile_filter_change(): josh-filtered must
  be first parent so josh can follow first-parent traversal to map
  history back to the monorepo. Old subrepo history on parent 2.
- Add tree comparison in reverse_sync() before commit detection:
  if subrepo tree matches josh-filtered tree, skip immediately.
  Prevents false positive PRs after reconciliation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 15:11:31 +03:00
16257f25d7 Fix reverse sync false positive after filter reconciliation
Add --ancestry-path to git log in reverse_sync() to prevent old
subrepo history from leaking through reconciliation merge parents.
Without this, every old subrepo commit appears as a "human commit"
triggering a spurious 0-commit PR on the monorepo.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 14:19:56 +03:00
c0ddb887ff Fix filter reconciliation for pre-v1.2 state and unrelated histories
Three bugs found during first CI run after enabling :exclude:

- Derive old filter (:/subfolder) when state has no josh_filter stored
  (pre-v1.2 upgrade path)
- Detect unrelated histories in forward_sync() and fall back to
  reconcile_filter_change() instead of creating a useless conflict PR
- Skip state update on conflict result (prevents storing wrong filter
  and mono SHA that blocks retries)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 13:30:24 +03:00
22bd59a9d7 Auto-reconcile subrepo history when josh filter changes
When the exclude list changes, josh-proxy recomputes filtered history
with new SHAs, breaking common ancestry with the subrepo. Instead of
requiring a manual reset (force-push), forward sync now detects the
filter change and creates a reconciliation merge commit that connects
the old and new histories — no force-push, no re-clone needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 10:40:08 +03:00
d7f8618b38 Use inline :exclude in josh-proxy URL instead of stored filter files
The :+ stored filter syntax doesn't work in josh-proxy URLs.
Inline :exclude[::p1,::p2] works directly — no files to generate
or commit, no extra dependencies.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 10:19:41 +03:00
5929585d6c Fix josh-proxy rejecting stored filter path with slash
Josh-proxy's parser treats "/" in :+ paths as a filter separator,
so :+.josh-filters/backend fails. Use flat naming at repo root:
.josh-filter-<target>.josh referenced as :+.josh-filter-<target>.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 09:47:16 +03:00
187a9ead14 Add file exclusion via josh stored filters (v1.2.0)
New `exclude` config field per target generates .josh-filters/<name>.josh
files with josh :exclude clauses. Josh-proxy applies exclusions at the
transport layer — excluded files never appear in the subrepo.

Preflight checks that generated filter files are committed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 22:45:13 +03:00
26 changed files with 820 additions and 36 deletions

.gitignore
@@ -1,3 +1,4 @@
.claude/*local*
dist/
.env
result

@@ -1,5 +1,32 @@
# Changelog
## 1.2.0
### Features
- **File exclusion**: `exclude` config field removes files/directories from the subrepo at the josh-proxy transport layer. Patterns are embedded inline in the josh-proxy URL using `:exclude[::pattern,...]` syntax — no extra files to generate or commit.
- **Filter change reconciliation**: When the josh filter changes (e.g., adding/removing exclude patterns), josh-sync automatically creates a reconciliation merge commit that connects old and new histories. No manual reset or force-push required.
- **Tree comparison guard**: Reverse sync now compares subrepo tree to josh-filtered tree before checking commit log. Skips immediately when trees are identical, avoiding false positives from reconciliation merge history.
- **Unrelated histories detection**: Forward sync detects when histories are unrelated (no common ancestor) and falls back to reconciliation instead of creating a useless conflict PR.
### Fixes
- Pre-v1.2 state compatibility: When upgrading from v1.0/v1.1 (no `josh_filter` stored in state), the old filter is derived from `subfolder` so reconciliation triggers correctly.
- Reconciliation merge parent order: Josh-filtered history is always first parent so josh-proxy can follow first-parent traversal back to the monorepo.
- Reverse sync `--ancestry-path` flag prevents old subrepo history from leaking through reconciliation merge parents.
- PR body `\n` now renders as actual newlines instead of literal text.
- Conflict result no longer updates sync state (added `continue` to skip state write).
- `action.yml` now copies VERSION file for correct `--version` output in CI.
- `.gitignore` now includes `dist/` and `.env`.
## 1.1.0
### Features
- **`onboard` command**: Interactive, resumable workflow for importing existing subrepos into the monorepo. Walks through: prerequisites check, import (creates PRs), wait for merge, reset (pushes josh-filtered history). Checkpoint/resume at every step.
- **`migrate-pr` command**: Migrates open PRs from an archived subrepo to the new one. Supports interactive selection, `--all` flag, and specific PR numbers. Uses `git apply --3way` for resilient patch application.
- **Onboard state tracking**: Stored on the `josh-sync-state` branch at `<target>/onboard.json`. Tracks step progress, import PR numbers, reset branches, and migrated PRs.
## 1.0.0
Initial release. Extracted from [private-monorepo-example](https://code.itkan.io/pe/private-monorepo-example) into a standalone reusable library.

@@ -16,12 +16,12 @@ josh:
targets:
  - name: "billing"
    subfolder: "services/billing"
    josh_filter: ":/services/billing"
    subrepo_url: "git@gitea.example.com:ext/billing.git"
    subrepo_auth: "ssh"
    branches:
      main: main
    forward_only: []
    exclude: # files excluded from subrepo (optional)
      - ".monorepo/"
bot:
  name: "josh-sync-bot"
@@ -58,8 +58,10 @@ Run `josh-sync preflight` to validate your setup.
## Documentation
- **[Setup Guide](docs/guide.md)** — Step-by-step: prerequisites, importing existing subrepos, CI workflows, and troubleshooting
- **[Setup Guide](docs/guide.md)** — Step-by-step: prerequisites, importing existing subrepos, CI workflows, file exclusion, and troubleshooting
- **[Configuration Reference](docs/config-reference.md)** — Full `.josh-sync.yml` field documentation
- **[Architecture Decision Records](docs/adr/)** — Design rationale and trade-offs
- **[Changelog](CHANGELOG.md)** — Version history
## CLI
@@ -79,12 +81,16 @@ josh-sync state reset <target> [branch]
- **Forward sync** (mono → subrepo): pushes directly if clean, creates conflict PR if not. Uses `--force-with-lease` for safety.
- **Reverse sync** (subrepo → mono): always creates a PR, never pushes directly.
- **File exclusion**: `exclude` patterns are embedded inline in the josh-proxy URL. Excluded files exist only in the monorepo.
- **Filter reconciliation**: Changing the exclude list auto-creates a merge commit that connects old and new histories — no force-push needed.
- **Loop prevention**: `Josh-Sync-Origin:` git trailer filters out bot commits.
- **State tracking**: orphan branch `josh-sync-state` stores JSON per target/branch.
## Dependencies
`bash >=4`, `git`, `curl`, `jq`, `yq` ([mikefarah/yq](https://github.com/mikefarah/yq) v4+), `openssh`
`bash >=4`, `git`, `curl`, `jq`, `yq` ([mikefarah/yq](https://github.com/mikefarah/yq) v4+), `openssh`, `rsync`
> The Nix flake bundles all dependencies automatically.
## License

@@ -1 +1 @@
1.1.0
1.2.0

@@ -26,6 +26,7 @@ runs:
run: |
JOSH_DIR="$(mktemp -d)"
cp -r "${{ github.action_path }}/bin" "${{ github.action_path }}/lib" "${JOSH_DIR}/"
cp "${{ github.action_path }}/VERSION" "${JOSH_DIR}/" 2>/dev/null || true
chmod +x "${JOSH_DIR}/bin/josh-sync"
echo "${JOSH_DIR}/bin" >> "$GITHUB_PATH"
echo "JOSH_SYNC_ROOT=${JOSH_DIR}" >> "$GITHUB_ENV"

@@ -208,13 +208,42 @@ _sync_direction() {
fi
fi
# Run sync
# Check for filter change (forward only — reverse uses same filter)
local result
if [ "$direction" = "forward" ]; then
result=$(forward_sync)
local prev_filter
prev_filter=$(echo "$state" | jq -r '.last_forward.josh_filter // empty')
# If no filter stored (pre-v1.2 state) but a previous sync exists,
# the old filter was the simple :/subfolder (before exclude was added)
if [ -z "$prev_filter" ]; then
local prev_mono_sha
prev_mono_sha=$(echo "$state" | jq -r '.last_forward.mono_sha // empty')
if [ -n "$prev_mono_sha" ]; then
local subfolder
subfolder=$(echo "$TARGET_JSON" | jq -r '.subfolder')
prev_filter=":/${subfolder}"
fi
fi
if [ -n "$prev_filter" ] && [ "$prev_filter" != "$JOSH_FILTER" ]; then
log "WARN" "Josh filter changed — reconciling histories"
log "INFO" "Old: ${prev_filter}"
log "INFO" "New: ${JOSH_FILTER}"
result=$(reconcile_filter_change)
else
result=$(forward_sync)
fi
else
result=$(reverse_sync)
fi
# If forward sync hit unrelated histories, fall back to reconciliation
if [ "$result" = "unrelated" ]; then
log "WARN" "Unrelated histories detected — falling back to filter reconciliation"
result=$(reconcile_filter_change)
log "INFO" "Reconciliation result: ${result}"
fi
log "INFO" "Result: ${result}"
# Handle warnings
@@ -224,6 +253,7 @@ _sync_direction() {
fi
if [ "$result" = "conflict" ]; then
echo "::warning::Target ${target_name}, branch ${branch}: merge conflict — PR created on subrepo"
continue
fi
if [ "$result" = "josh-rejected" ]; then
echo "::error::Target ${target_name}, branch ${branch}: josh rejected push — check proxy logs"
@@ -240,8 +270,9 @@ _sync_direction() {
--arg s_sha "${subrepo_sha_now:-}" \
--arg ts "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
--arg status "$result" \
--arg filter "$JOSH_FILTER" \
--argjson prev "$state" \
'$prev + {last_forward: {mono_sha:$m_sha, subrepo_sha:$s_sha, timestamp:$ts, status:$status}}')
'$prev + {last_forward: {mono_sha:$m_sha, subrepo_sha:$s_sha, timestamp:$ts, status:$status, josh_filter:$filter}}')
else
local mono_sha_now
mono_sha_now=$(git rev-parse "origin/${branch}" 2>/dev/null || echo "")

@@ -0,0 +1,42 @@
# ADR-001: Josh-proxy for Bidirectional Sync
**Status:** Accepted
**Date:** 2026-01
## Context
We need bidirectional sync between a monorepo and N external subrepos. Each subrepo corresponds to a subfolder in the monorepo. Developers on both sides should see a clean, complete git history — not synthetic commits or squashed blobs.
### Alternatives considered
1. **git subtree**: Built into git. `git subtree split` extracts a subfolder into a standalone repo. However, subtree split rewrites history on every run (O(n) on total commits), creating new SHAs each time. Bidirectional sync requires manual `subtree merge` with conflict-prone history grafting. No transport-layer filtering — all content must be fetched.
2. **git submodule**: Tracks external repos via `.gitmodules` pointer commits. Does not provide content-level integration — monorepo commits don't contain subrepo files directly. Developers must run `git submodule update`. Bidirectional sync is not a supported workflow.
3. **Custom diff-and-patch scripts**: Compute diffs between monorepo subfolder and subrepo, apply patches in both directions. Fragile with renames, binary files, and merge conflicts. Loses authorship and commit granularity.
4. **josh-proxy**: A git proxy that computes filtered views of repositories in real-time. Clients `git clone` through josh and receive a repo containing only the specified subfolder, with history rewritten to match. Josh maintains a persistent SHA mapping, so the same monorepo commit always produces the same filtered SHA. Bidirectional: pushing back through josh maps filtered commits to monorepo commits.
## Decision
Use josh-proxy as the transport layer for all sync operations.
## Consequences
**Positive:**
- Clean git history in both directions — no synthetic commits
- Deterministic SHA mapping — same monorepo state always produces same filtered SHA
- Bidirectional by design — push through josh maps back to monorepo
- Transport-layer filtering — content exclusion happens at clone/push time, not via generated files
- Supports any git hosting platform (Gitea, GitHub, GitLab) since it's a proxy
**Negative:**
- Requires running a josh-proxy instance (operational overhead)
- Josh-proxy is a Rust project with a smaller community than git-native tools
- Proxy must have network access to the monorepo's git server
- Josh's SHA mapping is opaque — debugging requires understanding josh internals
- First-parent traversal behavior must be respected in merge commits (see ADR-008)
**Risks:**
- Josh-proxy downtime blocks all sync operations
- Josh-proxy bugs could corrupt history mapping (mitigated by force-with-lease on forward, always-PR on reverse)

@@ -0,0 +1,50 @@
# ADR-002: State Storage on Orphan Git Branch
**Status:** Accepted
**Date:** 2026-01
## Context
Josh-sync needs persistent state to track what has already been synced (last-synced commit SHAs, timestamps, status). This prevents re-syncing unchanged content and enables incremental operation. The state must survive CI runner teardown — runners are ephemeral containers.
### Alternatives considered
1. **File in the repo**: Commit a state JSON file to the monorepo. Every sync run creates a commit, polluting history. Race conditions when multiple sync jobs run concurrently.
2. **External database/KV store**: Redis, SQLite, or a cloud KV service. Adds an infrastructure dependency. Credentials and connectivity to manage.
3. **CI artifacts/cache**: Platform-specific (GitHub Actions cache, Gitea cache). Not portable across CI platforms. Expiry policies vary.
4. **Orphan git branch**: A branch with no parent relationship to the main history. Stores JSON files in a simple `<target>/<branch>.json` layout. Pushed to origin, so it survives runner teardown. No external dependencies — uses git itself.
## Decision
Store sync state as JSON files on an orphan branch (`josh-sync-state`) in the monorepo.
### Storage layout
```
origin/josh-sync-state/
  <target>/<branch>.json   # sync state per target/branch
  <target>/onboard.json    # onboard workflow state (v1.1+)
```
### Implementation
- `read_state()`: `git fetch origin josh-sync-state && git show origin/josh-sync-state:<key>.json`
- `write_state()`: Uses `git worktree` to check out the orphan branch in a temp directory, writes JSON, commits, and pushes. This avoids touching the main working tree.
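
The two helpers could look roughly like the sketch below. It assumes the `josh-sync-state` branch already exists on `origin` and that a committer identity is configured. `state_key` is a hypothetical helper introduced here for illustration, and the commit message is a placeholder rather than the project's actual wording.

```bash
# Hypothetical helper: map target + branch to the state file path.
state_key() {
  printf '%s/%s.json\n' "$1" "$2"
}

# Read state for a key; prints "{}" when the branch or file does not exist yet.
read_state() {
  local key="$1"
  git fetch -q origin josh-sync-state 2>/dev/null || { echo '{}'; return 0; }
  git show "origin/josh-sync-state:${key}" 2>/dev/null || echo '{}'
}

# Write state through a temporary worktree so the main working tree is untouched.
write_state() {
  local key="$1" json="$2" tmp
  tmp="$(mktemp -d)/state"          # path must not exist before "worktree add"
  git fetch -q origin josh-sync-state
  git worktree add -q --detach "$tmp" origin/josh-sync-state
  mkdir -p "$tmp/$(dirname "$key")"
  printf '%s\n' "$json" > "$tmp/$key"
  git -C "$tmp" add "$key"
  git -C "$tmp" commit -q -m "state: update ${key}"
  git -C "$tmp" push -q origin HEAD:refs/heads/josh-sync-state
  git worktree remove --force "$tmp"
}
```

A call such as `write_state "$(state_key billing main)" "$json"` would then touch only `billing/main.json` on the state branch.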
## Consequences
**Positive:**
- Zero external dependencies — only git
- Portable across CI platforms (Gitea Actions, GitHub Actions, local)
- Human-readable JSON files — easy to inspect and debug
- Atomic updates via git commit + push
- Natural namespacing via directory structure
**Negative:**
- Concurrent writes can race (mitigated by concurrency groups in CI workflows)
- `git worktree` adds complexity to the write path
- State branch appears in `git branch -a` output (minor clutter)
- Push failures on the state branch are non-fatal (logged as warning, sync still succeeds)

@@ -0,0 +1,33 @@
# ADR-003: Force-with-Lease for Forward Sync
**Status:** Accepted
**Date:** 2026-01
## Context
Forward sync pushes monorepo changes to the subrepo. If someone pushes directly to the subrepo between when josh-sync reads its HEAD and when josh-sync pushes, a naive `git push` would overwrite their work. A `git push --force` would be worse — it would silently destroy concurrent changes.
## Decision
Use `git push --force-with-lease=refs/heads/<branch>:<expected-sha>` for all forward sync pushes. The expected SHA is recorded at the start of the sync operation (the "lease").
### How it works
1. Record subrepo HEAD SHA before any operations: `subrepo_sha=$(subrepo_ls_remote "$branch")`
2. Perform merge of monorepo changes onto subrepo state
3. Push with explicit lease: `--force-with-lease=refs/heads/main:<subrepo_sha>`
4. If the subrepo HEAD changed since step 1, git rejects the push
5. Josh-sync reports `lease-rejected` and retries on the next run
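
The steps above can be sketched as follows. `lease_flag` and `push_with_lease` are hypothetical names, not the project's functions, and the sketch reports every push failure as `lease-rejected`; real code would need to inspect git's output to tell a lease rejection apart from other errors.

```bash
# Build the explicit-lease flag from the branch and the SHA recorded in step 1.
lease_flag() {
  local branch="$1" expected_sha="$2"
  printf -- '--force-with-lease=refs/heads/%s:%s\n' "$branch" "$expected_sha"
}

# Push local HEAD to <remote>/<branch> under the lease.
# Prints "ok" on success, "lease-rejected" on any failure (simplification).
push_with_lease() {
  local remote="$1" branch="$2" expected_sha="$3"
  if git push "$(lease_flag "$branch" "$expected_sha")" \
       "$remote" "HEAD:refs/heads/${branch}" >/dev/null 2>&1; then
    echo "ok"
  else
    echo "lease-rejected"
  fi
}
```

Passing the SHA explicitly means the check does not depend on the state of any local tracking ref.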
## Consequences
**Positive:**
- Never overwrites concurrent changes — git atomically checks the expected SHA
- Explicit SHA lease (not just "current tracking ref") prevents stale-ref bugs
- Failed leases are retried on the next sync run — no data loss, just delay
- Works correctly with josh-proxy's SHA mapping
**Negative:**
- Lease-rejected means the sync run did work that gets discarded (clone, merge, etc.)
- Persistent lease failures indicate a concurrent push pattern that needs investigation
- Requires the `--force-with-lease` flag with explicit SHA — the shorthand form (`--force-with-lease` without `=`) is unsafe because it uses the local tracking ref, which may be stale

@@ -0,0 +1,41 @@
# ADR-004: Always-PR Policy for Reverse Sync
**Status:** Accepted
**Date:** 2026-01
## Context
Reverse sync brings subrepo changes back into the monorepo. The monorepo is the source of truth and typically has CI checks, code review requirements, and branch protection rules. Pushing directly to the monorepo's main branch would bypass these safeguards.
### Alternatives considered
1. **Direct push**: Fast, but bypasses all review and CI. A bad subrepo commit could break the entire monorepo with no review gate.
2. **Always create a PR**: Pushes to a staging branch (`auto-sync/subrepo-<branch>-<timestamp>`), then creates a PR via API. Humans review and merge.
3. **Configurable per-target**: Let users choose direct push vs PR. Adds complexity and a dangerous default.
## Decision
Reverse sync always creates a PR on the monorepo. Never pushes directly to the target branch.
### Implementation
1. Push subrepo HEAD through josh-proxy to a staging branch: `git push -o "base=main" josh://... HEAD:refs/heads/auto-sync/subrepo-main-<ts>`
2. Create PR via Gitea/GitHub API targeting the monorepo's main branch
3. PR includes a review checklist: scoped to subfolder, no leaked credentials, CI passes
The `-o "base=main"` option tells josh-proxy which monorepo branch to map the push against.
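
A minimal sketch of step 1, assuming a UTC `YYYYMMDD-HHMMSS` timestamp (the source only shows `<ts>`, so the format is a guess). `staging_branch` and `push_to_staging` are hypothetical names, and the josh-proxy URL is passed in by the caller rather than constructed here.

```bash
# Staging branch name: auto-sync/subrepo-<branch>-<timestamp>.
# The optional second argument overrides the timestamp (useful for testing).
staging_branch() {
  local branch="$1" ts="${2:-$(date -u +%Y%m%d-%H%M%S)}"
  printf 'auto-sync/subrepo-%s-%s\n' "$branch" "$ts"
}

# Push subrepo HEAD through josh-proxy to a fresh staging branch.
# $1 = josh-proxy URL for the target, $2 = monorepo branch to map against.
# Prints the staging branch name on success so the caller can open the PR.
push_to_staging() {
  local josh_url="$1" branch="$2" staging
  staging="$(staging_branch "$branch")"
  git push -o "base=${branch}" "$josh_url" "HEAD:refs/heads/${staging}" >&2 \
    && printf '%s\n' "$staging"
}
```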
## Consequences
**Positive:**
- All monorepo changes go through review — consistent with team workflow
- CI runs on the PR branch before merge
- Bad subrepo changes are caught before they affect the monorepo
- Audit trail via PR history
**Negative:**
- Reverse sync is not instant — requires human action to merge the PR
- Stale PRs accumulate if subrepo changes frequently but PRs aren't merged promptly
- Adds API dependency (needs token with PR creation scope)

@@ -0,0 +1,52 @@
# ADR-005: Git Trailer for Loop Prevention
**Status:** Accepted
**Date:** 2026-01
## Context
Bidirectional sync creates an infinite loop risk: forward sync pushes commit A to the subrepo, reverse sync sees commit A as "new" and creates a PR back to the monorepo, forward sync sees the merged PR as "new" and pushes again, etc.
### Alternatives considered
1. **SHA tracking only**: Compare SHAs to skip already-synced content. Breaks when josh-proxy rewrites SHAs (which it always does for filtered views). The monorepo commit SHA and the filtered/subrepo commit SHA are never the same.
2. **Commit message prefix**: Add `[sync]` to bot commit messages. Fragile — humans might use the same prefix. Requires string matching on message content.
3. **Git trailer**: A structured key-value pair in the commit message body (after a blank line), following the `git interpret-trailers` convention. Format: `Key: value`. Machine-parseable, unlikely to be used by humans, and supported by `git log --grep`.
## Decision
All bot commits include a git trailer with a configurable key (default: `Josh-Sync-Origin`). Both sync directions filter out commits containing this trailer.
### Format
```
Sync from monorepo 2026-02-12T10:30:00Z
Josh-Sync-Origin: forward/main/2026-02-12T10:30:00Z
```
The trailer value encodes: direction, branch, and timestamp. This aids debugging but is not parsed by the loop filter — only the trailer key presence matters.
### Filtering
- **Reverse sync**: `git log --invert-grep --grep="^${BOT_TRAILER}:"` excludes all commits with the trailer
- **CI loop guard**: The composite action checks if HEAD commit has the trailer before running sync at all
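
As an illustration of the key-presence check, the sketch below tests a commit message for the trailer. `has_bot_trailer` and `head_is_bot_commit` are hypothetical names, and the two sample messages are made up for the example.

```bash
BOT_TRAILER="${BOT_TRAILER:-Josh-Sync-Origin}"

# Succeeds when the commit message on stdin has the trailer key at line start.
has_bot_trailer() {
  grep -q "^${BOT_TRAILER}:"
}

# Loop guard for the current repository's HEAD commit (defined, not invoked here).
head_is_bot_commit() {
  git log -1 --format=%B HEAD | has_bot_trailer
}

bot_msg=$'Sync from monorepo 2026-02-12T10:30:00Z\n\nJosh-Sync-Origin: forward/main/2026-02-12T10:30:00Z'
human_msg=$'Fix billing rounding bug\n\nMentions Josh-Sync-Origin mid-line only.'

printf '%s\n' "$bot_msg"   | has_bot_trailer && echo "bot commit: skip"
printf '%s\n' "$human_msg" | has_bot_trailer || echo "human commit: sync"
```

Anchoring the pattern at line start is what keeps a message that merely mentions the key from being filtered.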
### Configuration
The trailer key is set in `.josh-sync.yml` under `bot.trailer`. This allows multiple josh-sync instances (with different bots) to operate on the same repos without interfering.
## Consequences
**Positive:**
- Reliable loop prevention — trailer is part of the immutable commit object
- Configurable key avoids conflicts between multiple sync bots
- Human-readable — `git log` shows the trailer in commit messages
- CI loop guard prevents unnecessary sync runs entirely
**Negative:**
- Commits with manually-added trailers matching the key would be incorrectly filtered
- Trailer must be in the commit body (after blank line), not the subject line
- Squash-and-merge on PRs may lose the trailer if the platform doesn't preserve commit message body

@@ -0,0 +1,55 @@
# ADR-006: Inline Exclude in Josh-Proxy URL
**Status:** Accepted
**Date:** 2026-02
## Context
Some files in a monorepo subfolder should not appear in the subrepo (e.g., monorepo-specific CI configs, internal tooling, secrets templates). We need a mechanism to exclude these files from sync.
### Alternatives considered
1. **`.josh-sync-exclude` file committed to the repo**: A gitignore-style file listing patterns. Requires generating and committing a file. Changes to the exclude list create commits. The file itself would need to be excluded from the subrepo (circular dependency).
2. **Post-clone file deletion**: Clone through josh, then `rm -rf` excluded paths before pushing. Fragile — deletions create diff noise. Doesn't work for reverse sync (excluded files would appear as "deleted" in the subrepo).
3. **Josh `:exclude` filter inline in the URL**: Josh-proxy supports `:exclude[::pattern1,::pattern2]` appended to the filter path. The exclusion happens at the transport layer — git objects for excluded files are never transferred. Works identically for clone (forward) and push (reverse).
4. **Separate josh filter file**: Generate a josh filter expression and store it somewhere. Adds state management complexity.
## Decision
Embed exclusion patterns inline in the josh-proxy URL using josh's native `:exclude` syntax. The `exclude` config field in `.josh-sync.yml` is transformed at config parse time into the josh filter string.
### Example
Config:
```yaml
exclude:
- ".monorepo/"
- "**/internal/"
```
Produces josh filter:
```
:/services/billing:exclude[::.monorepo/,::**/internal/]
```
### Implementation
The `parse_config()` function in `lib/config.sh` uses jq to conditionally append `:exclude[...]` to the josh filter when the `exclude` array is non-empty. The enriched filter is stored in `JOSH_SYNC_TARGETS` JSON and used everywhere via `$JOSH_FILTER`.
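
The transformation itself is small. The sketch below reproduces it in plain bash instead of jq, purely so it runs without extra tools; `build_josh_filter` is a hypothetical name and is not part of `lib/config.sh`.

```bash
# Usage: build_josh_filter <subfolder> [pattern...]
# Appends :exclude[...] only when at least one pattern is given.
build_josh_filter() {
  local subfolder="$1"; shift
  local filter=":/${subfolder}"
  if [ "$#" -gt 0 ]; then
    local joined="" pattern
    for pattern in "$@"; do
      # Prefix each pattern with "::" and separate entries with commas.
      joined+="${joined:+,}::${pattern}"
    done
    filter+=":exclude[${joined}]"
  fi
  printf '%s\n' "$filter"
}

build_josh_filter "services/billing" ".monorepo/" "**/internal/"
build_josh_filter "services/billing"
```

With no patterns the result is the plain `:/<subfolder>` filter, which is why adding the first exclude entry counts as a filter change.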
## Consequences
**Positive:**
- Zero committed files — exclusion is purely in the URL
- Transport-layer filtering — excluded content never leaves the git server
- Works identically for forward sync (clone), reverse sync (push), and reset
- Tree comparison (`skip` detection) works correctly since excluded files aren't in the filtered view
- Standard josh syntax — no custom invention
**Negative:**
- Josh's `:exclude` pattern syntax is limited (no negation, no regex — only glob-style patterns with `::` prefix)
- Long exclude lists make the URL unwieldy (though this is cosmetic — git handles long URLs fine)
- Changing the exclude list changes the josh filter, which changes all filtered SHAs (see ADR-007 for how this is handled)
- Debugging requires understanding josh's filter composition syntax

@@ -0,0 +1,53 @@
# ADR-007: Reconciliation Merge for Filter Changes
**Status:** Accepted
**Date:** 2026-02
## Context
When the josh filter changes (e.g., adding exclude patterns), josh-proxy recomputes the entire filtered history with new SHAs. The subrepo's existing history (based on the old filter) shares no common ancestor with the new filtered history. A naive forward sync would see "unrelated histories" and fail.
### Alternatives considered
1. **Force-push to subrepo**: Replace subrepo history with the new filtered view (same as `josh-sync reset`). Destructive — all local clones become invalid, open PRs are orphaned, developers must re-clone.
2. **Cherry-pick new commits**: Identify commits that exist in the new filtered history but not the old, cherry-pick them onto the subrepo. Complex — the "same" commit has different SHAs in old vs new filtered history. No reliable way to match them.
3. **Reconciliation merge commit**: Create a merge commit on the subrepo that has both the new filtered HEAD and the old subrepo HEAD as parents, using the new filtered tree. This establishes shared ancestry without rewriting history.
## Decision
When josh-sync detects a filter change (stored filter in state differs from current `$JOSH_FILTER`), create a reconciliation merge commit using `git commit-tree`.
### How it works
1. Clone subrepo (has old history)
2. Fetch josh-proxy filtered view (has new history)
3. If trees are identical → skip (filter change had no effect on content)
4. Create merge commit: `git commit-tree <josh-tree> -p <josh-head> -p <subrepo-head>`
5. Push with `--force-with-lease`
The merge commit uses the josh-filtered tree (new content) and has two parents:
- **Parent 1**: josh-filtered HEAD (new filter history) — must be first (see ADR-008)
- **Parent 2**: subrepo HEAD (old filter history) — preserves old history as a side branch
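
Steps 3 and 4 can be tried out in a throwaway repository. In the sketch below an orphan branch stands in for the josh-filtered view, since no josh-proxy is involved; the file names, commit messages, and the `g` wrapper are all made up for the example.

```bash
demo="$(mktemp -d)"
repo="$demo/sub"
git init -q "$repo"
# Wrapper: run git in the demo repo with a throwaway identity.
g() { git -C "$repo" -c user.name=demo -c user.email=demo@example.com "$@"; }

# Old subrepo history (built with the old filter).
printf 'app\n' > "$repo/app.txt"
printf 'internal\n' > "$repo/internal.txt"
g add . && g commit -q -m "old filtered history"
subrepo_head="$(g rev-parse HEAD)"

# New, unrelated history: same content minus the newly excluded file.
g checkout -q --orphan josh-filtered
g rm -q -f internal.txt
g commit -q -m "new filtered history"
josh_head="$(g rev-parse HEAD)"
josh_tree="$(g rev-parse 'HEAD^{tree}')"

# Step 3: identical trees mean the filter change had no effect on content.
if [ "$(g rev-parse "${subrepo_head}^{tree}")" = "$josh_tree" ]; then
  result="skip"
else
  # Step 4: josh-filtered tree, josh-filtered HEAD first, old HEAD second.
  merge="$(g commit-tree "$josh_tree" -p "$josh_head" -p "$subrepo_head" \
             -m "Reconcile filter change")"
  result="reconciled"
  g rev-list --parents -n 1 "$merge"   # prints: <merge> <parent 1> <parent 2>
fi
echo "$result"
```

Step 5 (the `--force-with-lease` push) and the bot trailer are left out because the sketch has no remote.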
### Detection
Filter change is detected by comparing the stored `josh_filter` in sync state with the current `$JOSH_FILTER`. For pre-v1.2 state (no filter stored), the old filter is derived as `:/<subfolder>`.
As a reactive fallback, `forward_sync()` also detects unrelated histories via `git merge-base` and falls back to reconciliation.
## Consequences
**Positive:**
- Non-destructive — old history is preserved as parent 2 of the merge
- Developers don't need to re-clone the subrepo
- Open PRs on the subrepo remain valid (they're based on commits that are still ancestors)
- Automatic — no manual intervention needed when changing exclude patterns
- Force-with-lease protects against concurrent changes during reconciliation
**Negative:**
- The merge commit is synthetic (created by bot, not a real merge of concurrent work)
- Parent ordering is critical — wrong order breaks josh's reverse mapping (see ADR-008)
- The reconciliation merge contains a bot trailer, so reverse sync correctly ignores it
- If the subrepo has diverged significantly (manual commits during filter change), the reconciliation merge may produce unexpected tree content (uses josh-filtered tree unconditionally)

@@ -0,0 +1,42 @@
# ADR-008: First-Parent Ordering in Reconciliation Merges
**Status:** Accepted
**Date:** 2026-02
## Context
Josh-proxy uses **first-parent traversal** when mapping subrepo history back to the monorepo. When you push a commit through josh-proxy, josh walks the first-parent chain to find a commit it can map to a monorepo commit. If the first parent leads to unmappable history, josh cannot reconstruct the monorepo-side branch correctly.
This became critical when the reconciliation merge (ADR-007) initially had the wrong parent order: old subrepo history as parent 1, josh-filtered as parent 2. Josh followed parent 1, couldn't find any mappable commit, and created a monorepo branch containing only the subrepo subfolder content — effectively deleting 1280 files from the rest of the monorepo.
## Decision
In reconciliation merge commits, the josh-filtered HEAD **must be parent 1** (first parent). The old subrepo HEAD is parent 2.
```bash
# Parent 1: josh-filtered HEAD; josh follows this.
# Parent 2: old subrepo HEAD; side branch, ignored by josh.
git commit-tree "$josh_tree" \
  -p "$josh_head" \
  -p "$subrepo_head" \
  -m "..."
```
### Why this is safe
- The old subrepo HEAD (`subrepo_head`) is still an ancestor of the merge commit regardless of parent order — push succeeds either way
- `--ancestry-path` in reverse sync still follows `B → M → C` regardless of parent order (it traces all paths, not just first-parent)
- Josh follows first-parent and finds the josh-filtered commit, which maps cleanly back to the monorepo
## Consequences
**Positive:**
- Josh can map the reconciliation merge back to the monorepo correctly
- Reverse sync through josh produces correct diffs (only subrepo-scoped changes)
- `git log --first-parent` on the subrepo shows the clean josh-filtered lineage
**Negative:**
- This is a subtle invariant — future changes to merge commit creation must preserve parent order
- The constraint is undocumented in josh-proxy's own documentation (discovered empirically)
- No automated test can verify this without a running josh-proxy instance
**Lesson learned:**
Parent order in `git commit-tree -p` is not cosmetic. For tools that rely on first-parent traversal (josh-proxy, `git log --first-parent`), parent 1 must be the "mainline" that the tool should follow.

@@ -0,0 +1,53 @@
# ADR-009: Tree Comparison as Sync Skip Guard
**Status:** Accepted
**Date:** 2026-02
## Context
Both forward and reverse sync need to detect "nothing to do" quickly. The primary mechanism is SHA comparison against stored state (last-synced SHA). However, this misses cases where:
- State is reset or lost
- Reconciliation merges change SHAs without changing content
- Multiple sync runs overlap
Additionally, reverse sync originally relied on `git log <base>..HEAD` to find new commits. After a reconciliation merge, the `..` range can leak old subrepo history through the merge's second parent, creating false positives.
## Decision
Add tree-level comparison as an early skip guard in both forward and reverse sync. Compare the git tree objects (which represent directory content, not commit history) to determine if there's actually any content difference.
### Forward sync
```bash
mono_tree=$(git rev-parse 'HEAD^{tree}')
subrepo_tree=$(git rev-parse "subrepo/${branch}^{tree}")
[ "$mono_tree" = "$subrepo_tree" ] && echo "skip"
```
### Reverse sync
```bash
subrepo_tree=$(git rev-parse "HEAD^{tree}")
josh_tree=$(git rev-parse "mono-filtered/${branch}^{tree}")
[ "$subrepo_tree" = "$josh_tree" ] && echo "skip"
```
Tree comparison happens **before** commit log analysis. If trees are identical, there is definitionally nothing to sync, regardless of what the commit history looks like.
### Combined with `--ancestry-path`
For reverse sync, even when trees differ, `git log --ancestry-path` restricts the commit range to the direct lineage between the two endpoints. This prevents old history from leaking through reconciliation merge parents.
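
The effect of `--ancestry-path` can be reproduced in a throwaway repository. The sketch assumes the range base is the josh-filtered commit, which the source does not state explicitly; the commit graph, file contents, and the `g` wrapper are invented for the demonstration.

```bash
demo="$(mktemp -d)"
repo="$demo/sub"
git init -q "$repo"
# Wrapper: run git in the demo repo with a throwaway identity.
g() { git -C "$repo" -c user.name=demo -c user.email=demo@example.com "$@"; }

# Old subrepo history: two commits that predate the filter change.
printf 'v1\n' > "$repo/app.txt"
g add app.txt && g commit -q -m "old subrepo commit 1"
printf 'v2\n' > "$repo/app.txt"
g add app.txt && g commit -q -m "old subrepo commit 2"
old_head="$(g rev-parse HEAD)"

# Josh-filtered history under the new filter: an unrelated root commit.
g checkout -q --orphan josh-filtered
printf 'v2 filtered\n' > "$repo/app.txt"
g add app.txt && g commit -q -m "josh-filtered head"
base="$(g rev-parse HEAD)"

# Reconciliation merge (josh-filtered first), then one human commit on top.
merge="$(g commit-tree "${base}^{tree}" -p "$base" -p "$old_head" -m "reconciliation merge")"
g checkout -q --detach "$merge"
printf 'v3\n' > "$repo/app.txt"
g add app.txt && g commit -q -m "human change"

plain="$(g rev-list --count "${base}..HEAD")"
lineage="$(g rev-list --count --ancestry-path "${base}..HEAD")"
echo "plain range: ${plain} commits, ancestry-path: ${lineage} commits"
```

In this graph the plain range counts four commits (the human change, the merge, and both old subrepo commits reached through parent 2), while the ancestry-path range counts only the merge and the human change.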
## Consequences
**Positive:**
- Eliminates false positives from reconciliation merges (trees are identical after reconciliation)
- Fast — tree SHA comparison is O(1), no content traversal
- Correct by definition — if trees match, content is identical
- Defense in depth — works even when state tracking has gaps
**Negative:**
- Tree comparison alone doesn't tell you *which* commits are new (still need `git log` for PR descriptions)
- Adds an extra `git rev-parse` call per sync direction (negligible cost)
- Cannot detect file-mode-only changes if josh normalizes modes (theoretical edge case)


@@ -0,0 +1,76 @@
# ADR-010: Onboard Workflow with Checkpoint/Resume
**Status:** Accepted
**Date:** 2026-02
## Context
Onboarding an existing subrepo into the monorepo is a multi-step process that involves human interaction (renaming repos, merging PRs). The full flow is:
1. Prerequisites: rename existing repo, create new empty repo
2. Import: copy subrepo content into monorepo, create import PR(s)
3. Wait: human merges the import PR(s)
4. Reset: force-push josh-filtered history to the new empty repo
5. (Optional) Migrate open PRs from archived repo
Each step can fail or be interrupted. The process may span hours or days (waiting for PR review). If interrupted, restarting from scratch wastes work and can create duplicate PRs.
### Alternatives considered
1. **Single-shot script**: Run all steps in sequence. If interrupted, must restart from scratch. Duplicate PRs if import step is re-run.
2. **Manual step-by-step commands**: `import`, then manually run `reset`. Simple but error-prone — users may forget steps or run them out of order.
3. **Checkpoint/resume with persistent state**: Track the current step and intermediate results (PR numbers, reset branches) in persistent state. On re-run, resume from the last completed step.
## Decision
Implement `josh-sync onboard` as a checkpoint/resume workflow with state stored on the `josh-sync-state` branch at `<target>/onboard.json`.
### State machine
```
start → importing → waiting-for-merge → resetting → complete
```
Each transition is persisted before proceeding. Re-running `josh-sync onboard <target>` reads the current step and resumes.
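A minimal sketch of the resume behaviour follows. It makes several simplifying assumptions: the step is kept in a plain temporary file rather than in `onboard.json` on the state branch, the steps do no real work, and `next_step` and `run_onboard` are hypothetical helper names.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for the persisted onboard state.
state_file=$(mktemp)
echo "start" > "$state_file"

# Return the step that follows the given one (linear state machine).
next_step() {
  case "$1" in
    start)             echo "importing" ;;
    importing)         echo "waiting-for-merge" ;;
    waiting-for-merge) echo "resetting" ;;
    resetting)         echo "complete" ;;
    *)                 return 1 ;;
  esac
}

# Resume from the persisted step. An optional argument names a step at which
# the run stops early, simulating an interruption.
run_onboard() {
  local stop_at="${1:-}"
  local step
  step=$(cat "$state_file")
  while [ "$step" != "complete" ]; do
    if [ "$step" = "$stop_at" ]; then
      return 0
    fi
    step=$(next_step "$step")
    echo "$step" > "$state_file"   # persist the transition before continuing
  done
}

run_onboard "waiting-for-merge"      # first run stops while waiting for the merge
interrupted_at=$(cat "$state_file")
run_onboard                          # second run resumes and finishes
final=$(cat "$state_file")
echo "${interrupted_at} -> ${final}"
```

Because the second invocation reads the stored step first, it never repeats `importing`, which is the property that avoids duplicate PRs.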
### State schema
```json
{
"step": "waiting-for-merge",
"archived_api": "https://host/api/v1/repos/org/repo-archived",
"archived_url": "git@host:org/repo-archived.git",
"archived_auth": "ssh",
"import_prs": { "main": 42 },
"reset_branches": ["main"],
"migrated_prs": [
{ "old_number": 5, "new_number": 12, "title": "Fix login" }
],
"timestamp": "2026-02-10T14:30:00Z"
}
```
### Per-branch progress
Import and reset both iterate over branches. Progress is saved after each branch, so interruption mid-iteration resumes at the next unprocessed branch.
### PR migration
`josh-sync migrate-pr` is a separate command that reads onboard state (for the archived repo URL) and tracks migrated PRs. It uses `git apply --3way` for resilient patch application — the subrepo's content is identical after reset, so patches apply cleanly.
## Consequences
**Positive:**
- Safe to interrupt at any point — no duplicate work on resume
- Per-branch tracking prevents duplicate import PRs or redundant resets
- Archived repo URL stored in state — `migrate-pr` can operate independently
- `--restart` flag allows starting over if state is corrupted
- Human-friendly — prints instructions at each step
**Negative:**
- State management adds complexity (read/write onboard state, step validation)
- Interactive steps (`read -r`) are not suitable for fully automated pipelines
- Onboard state persists on the state branch even after completion (minor clutter)
- The state machine is linear: steps cannot be skipped or run out of order

docs/adr/README.md Normal file

@@ -0,0 +1,18 @@
# Architecture Decision Records
This directory contains Architecture Decision Records (ADRs) for josh-sync. Each ADR documents a significant design decision, its context, the alternatives considered, and the rationale for the chosen approach.
## Index
| ADR | Title | Status |
|-----|-------|--------|
| [001](001-josh-proxy-for-sync.md) | Josh-proxy for bidirectional sync | Accepted |
| [002](002-state-on-orphan-branch.md) | State storage on orphan git branch | Accepted |
| [003](003-force-with-lease-forward.md) | Force-with-lease for forward sync | Accepted |
| [004](004-always-pr-reverse.md) | Always-PR policy for reverse sync | Accepted |
| [005](005-git-trailer-loop-prevention.md) | Git trailer for loop prevention | Accepted |
| [006](006-inline-exclude-filter.md) | Inline exclude in josh-proxy URL | Accepted |
| [007](007-reconciliation-merge.md) | Reconciliation merge for filter changes | Accepted |
| [008](008-first-parent-ordering.md) | First-parent ordering in reconciliation merges | Accepted |
| [009](009-tree-comparison-guard.md) | Tree comparison as sync skip guard | Accepted |
| [010](010-onboard-checkpoint-resume.md) | Onboard workflow with checkpoint/resume | Accepted |


@@ -32,6 +32,7 @@ Each target maps a monorepo subfolder to an external subrepo.
| `subrepo_ssh_key_var` | string | No | `"SUBREPO_SSH_KEY"` | Name of the env var holding the SSH private key for this target. |
| `branches` | object | Yes | — | Branch mapping: `mono_branch: subrepo_branch`. Each key-value pair syncs those branches bidirectionally. |
| `forward_only` | string[] | No | `[]` | Branches that only sync mono → subrepo, never reverse. |
| `exclude` | string[] | No | `[]` | File/directory patterns to exclude from sync via josh `:exclude` filter. Excluded files exist only in the monorepo, never in the subrepo. See [Excluding Files](guide.md#excluding-files-from-sync). |
## `bot` Section


@@ -91,6 +91,9 @@ targets:
branches:
main: main # mono_branch: subrepo_branch
forward_only: []
exclude: # files excluded from subrepo (optional)
- ".monorepo/" # monorepo-only config dir
- "**/internal/" # internal dirs at any depth
- name: "auth"
subfolder: "services/auth"
@@ -183,7 +186,7 @@ To pin to a specific version, use a tag ref in `devenv.yaml`:
```yaml
josh-sync:
url: git+https://your-gitea.example.com/org/josh-sync?ref=v1.1
url: git+https://your-gitea.example.com/org/josh-sync?ref=refs/tags/v1.2
flake: true
```
@@ -515,6 +518,65 @@ Bot commits include a git trailer like `Josh-Sync-Origin: forward/main/2024-02-1
Sync state is stored as JSON files on an orphan branch (`josh-sync-state`), one file per target/branch. This tracks the last-synced commit SHAs and timestamps to avoid re-syncing the same changes.
## Excluding Files from Sync
Some files in the monorepo subfolder may not belong in the subrepo (e.g., monorepo-specific CI configs, internal tooling). The `exclude` config field removes these at the josh-proxy layer — excluded files never appear in the subrepo.
### Configuration
Add an `exclude` list to any target:
```yaml
targets:
- name: "billing"
subfolder: "services/billing"
subrepo_url: "git@host:org/billing.git"
exclude:
- ".monorepo/" # directory at subfolder root
- "**/internal/" # directory at any depth
- "*.secret" # files by extension
branches:
main: main
```
### How it works
When `exclude` is present, josh-sync appends an inline `:exclude` filter to the josh-proxy URL. For the example above, the josh filter becomes:
```
:/services/billing:exclude[::.monorepo/,::**/internal/,::*.secret]
```
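The string construction can be sketched in plain Bash. josh-sync itself derives the filter with `jq` while parsing the config; `build_josh_filter` below is a hypothetical helper that only mirrors the resulting format.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: first argument is the subfolder, the rest are exclude patterns.
build_josh_filter() {
  local subfolder="$1"
  shift
  local filter=":/${subfolder}"
  if [ "$#" -gt 0 ]; then
    local joined="" pattern
    for pattern in "$@"; do
      # Prefix each pattern with "::" and separate entries with commas.
      joined+="${joined:+,}::${pattern}"
    done
    filter+=":exclude[${joined}]"
  fi
  printf '%s\n' "$filter"
}

# Patterns are quoted so the shell does not expand the globs.
build_josh_filter "services/billing" ".monorepo/" "**/internal/" "*.secret"
```

With no patterns the helper returns just `:/services/billing`, matching the auto-derived filter for targets without `exclude`.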
Josh-proxy applies this filter at the transport layer — no extra files to generate or commit. This means:
- **Forward sync**: the filtered clone already excludes the files
- **Reverse sync**: pushes through josh also respect the exclusion
- **Reset**: the subrepo history never contains excluded files
- **Tree comparison**: `skip` detection works correctly (excluded files are not in the diff)
### Pattern syntax
Josh uses `::` patterns inside `:exclude[...]`:
| Pattern | Matches |
|---------|---------|
| `dir/` | Directory at subfolder root |
| `file` | File at subfolder root |
| `**/dir/` | Directory at any depth |
| `**/file` | File at any depth |
| `*.ext` | Glob pattern (single `*` only) |
### Setup
1. Add `exclude` to the target in `.josh-sync.yml`
2. Run `josh-sync preflight` to verify the filter works
3. Forward sync will now exclude the specified files
No extra files to generate or commit — the exclusion is embedded directly in the josh-proxy URL.
### Changing the exclude list
You can safely add or remove patterns from `exclude` at any time. When josh-sync detects that the filter has changed since the last sync, it automatically creates a reconciliation merge commit on the subrepo that connects the old and new histories — no manual reset or force-push required. Developers do not need to re-clone the subrepo.
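To illustrate what "connects the old and new histories" means, the sketch below joins two unrelated histories in a scratch repository with `git commit-tree`. The repository contents and variable names are assumptions made for the demonstration; the real reconciliation also fetches from josh-proxy and pushes with `--force-with-lease`.

```bash
#!/usr/bin/env bash
set -euo pipefail

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "demo"
git config user.email "demo@example.com"
git config commit.gpgsign false

# Old subrepo history (filtered with the old exclude list).
echo old > a.txt
git add a.txt
git commit -q -m "old subrepo history"
old_head=$(git rev-parse HEAD)

# New filtered history: an unrelated root commit with different content.
echo new > a.txt
git add a.txt
new_tree=$(git write-tree)
new_head=$(git commit-tree "$new_tree" -m "new filtered history")

# Before reconciliation the two histories share no ancestor.
if git merge-base "$old_head" "$new_head" >/dev/null 2>&1; then
  before="related"
else
  before="unrelated"
fi

# Merge commit: new history as first parent, old history as second parent,
# carrying the new filtered tree.
merge=$(git commit-tree "$new_tree" -p "$new_head" -p "$old_head" \
  -m "Sync: filter configuration updated")

# After reconciliation the old head is an ancestor of the new tip, so existing
# clones can fast-forward instead of re-cloning.
if git merge-base --is-ancestor "$old_head" "$merge"; then
  after="related"
else
  after="unrelated"
fi
first_parent=$(git rev-parse "${merge}^1")
echo "before=${before} after=${after}"
```

Placing the new filtered history first matters because josh follows first-parent traversal when mapping subrepo commits back to the monorepo.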
## Adding a New Target
To add a new subrepo after initial setup:


@@ -5,12 +5,12 @@
# In devenv.yaml:
# inputs:
# josh-sync:
# url: github:org/josh-sync/v1.0.0
# url: git+https://your-gitea.example.com/org/josh-sync?ref=refs/tags/v1.2
# flake: true
#
# Or in flake.nix:
# inputs.josh-sync = {
# url = "github:org/josh-sync/v1.0.0";
# url = "git+https://your-gitea.example.com/org/josh-sync?ref=refs/tags/v1.2";
# inputs.nixpkgs.follows = "nixpkgs";
# };
@@ -21,14 +21,16 @@
# josh-sync CLI is now available in the shell.
# Commands:
# josh-sync sync --forward Forward sync (mono → subrepo)
# josh-sync sync --reverse Reverse sync (subrepo → mono)
# josh-sync preflight Validate config and connectivity
# josh-sync import <target> Initial import from subrepo
# josh-sync reset <target> Reset subrepo to josh-filtered view
# josh-sync status Show target config and sync state
# josh-sync state show <t> [b] Show state JSON
# josh-sync state reset <t> [b] Reset state
# josh-sync sync --forward Forward sync (mono → subrepo)
# josh-sync sync --reverse Reverse sync (subrepo → mono)
# josh-sync preflight Validate config and connectivity
# josh-sync import <target> Initial import from subrepo
# josh-sync reset <target> Reset subrepo to josh-filtered view
# josh-sync onboard <target> Interactive import + reset workflow
# josh-sync migrate-pr <target> Migrate PRs from archived repo
# josh-sync status Show target config and sync state
# josh-sync state show <t> [b] Show state JSON
# josh-sync state reset <t> [b] Reset state
enterShell = ''
echo "Josh Sync available; run 'josh-sync --help' for commands"


@@ -62,7 +62,7 @@ jobs:
done | sort -u | paste -sd ',' -)
echo "targets=${TARGETS}" >> "$GITHUB_OUTPUT"
- uses: https://your-gitea.example.com/org/josh-sync@v1
- uses: https://your-gitea.example.com/org/josh-sync@v1.2
with:
direction: forward
target: ${{ github.event.inputs.target || steps.detect.outputs.targets }}


@@ -10,17 +10,19 @@ josh:
targets:
- name: "billing"
subfolder: "services/billing"
josh_filter: ":/services/billing"
# josh_filter auto-derived as ":/services/billing" if omitted
subrepo_url: "https://gitea.example.com/ext/billing.git"
subrepo_auth: "https"
branches:
main: main
develop: develop
forward_only: []
exclude: # files excluded from subrepo (optional)
- ".monorepo/" # directory at subfolder root
- "**/internal/" # directory at any depth
- name: "auth"
subfolder: "services/auth"
josh_filter: ":/services/auth"
subrepo_url: "git@gitea.example.com:ext/auth.git"
subrepo_auth: "ssh"
# Per-target credential override (reads from $AUTH_SSH_KEY instead of $SUBREPO_SSH_KEY)
@@ -31,7 +33,6 @@ targets:
- name: "shared-lib"
subfolder: "libs/shared"
josh_filter: ":/libs/shared"
subrepo_url: "https://gitea.example.com/ext/shared-lib.git"
branches:
main: main


@@ -10,7 +10,7 @@ name: "Josh Sync ← Subrepo"
on:
schedule:
- cron: "0 1,7,13,19 * * *" # Every 6h, offset from forward
- cron: "0 1,7,13,19 * * *" # Every 6h, offset from forward
workflow_dispatch:
inputs:
target:
@@ -40,7 +40,7 @@ jobs:
curl -sL "https://github.com/mikefarah/yq/releases/download/v4.44.6/yq_linux_amd64" \
-o /usr/local/bin/yq && chmod +x /usr/local/bin/yq
- uses: https://your-gitea.example.com/org/josh-sync@v1
- uses: https://your-gitea.example.com/org/josh-sync@v1.2
with:
direction: reverse
target: ${{ github.event.inputs.target || '' }}


@@ -36,7 +36,10 @@ parse_config() {
export JOSH_SYNC_TARGETS
JOSH_SYNC_TARGETS=$(echo "$config_json" | jq '[.targets[] | . +
# Auto-derive josh_filter from subfolder if not set
(if (.josh_filter // "") == "" then
# When exclude patterns are present, append inline :exclude[::p1,::p2,...] to the filter
(if (.exclude // [] | length) > 0 then
{josh_filter: (":/" + .subfolder + ":exclude[" + (.exclude | map("::" + .) | join(",")) + "]")}
elif (.josh_filter // "") == "" then
{josh_filter: (":/" + .subfolder)}
else {} end) +
# Derive gitea_host and subrepo_repo_path from subrepo_url


@@ -11,7 +11,7 @@
# ─── Forward Sync: Monorepo → Subrepo ──────────────────────────────
#
# Returns: fresh | skip | clean | lease-rejected | conflict
# Returns: fresh | skip | clean | lease-rejected | conflict | unrelated
forward_sync() {
local mono_branch="$SYNC_BRANCH_MONO"
@@ -97,7 +97,14 @@ ${BOT_TRAILER}: forward/${mono_branch}/$(date -u +%Y-%m-%dT%H:%M:%SZ)" >&2
fi
else
# Conflict!
# Check: unrelated histories (filter change) vs normal merge conflict
if ! git merge-base "subrepo/${subrepo_branch}" "$mono_head" >/dev/null 2>&1; then
log "INFO" "No common ancestor — histories are unrelated (filter change?)"
echo "unrelated"
return
fi
# Normal merge conflict
local conflicted
conflicted=$(git diff --name-only --diff-filter=U 2>/dev/null || echo "(unknown)")
git merge --abort
@@ -115,7 +122,14 @@ ${BOT_TRAILER}: forward/${mono_branch}/$(date -u +%Y-%m-%dT%H:%M:%SZ)" >&2
local pr_body conflicted_list
# shellcheck disable=SC2001
conflicted_list=$(echo "$conflicted" | sed 's/^/- /')
pr_body="## Sync Conflict\n\nMonorepo \`${mono_branch}\` has changes that conflict with \`${subrepo_branch}\`.\n\n**Conflicted files:**\n${conflicted_list}\n\nPlease resolve and merge this PR to complete the sync."
pr_body="## Sync Conflict
Monorepo \`${mono_branch}\` has changes that conflict with \`${subrepo_branch}\`.
**Conflicted files:**
${conflicted_list}
Please resolve and merge this PR to complete the sync."
create_pr "${SUBREPO_API}" "${SUBREPO_TOKEN}" \
"$subrepo_branch" "$conflict_branch" \
@@ -128,6 +142,87 @@ ${BOT_TRAILER}: forward/${mono_branch}/$(date -u +%Y-%m-%dT%H:%M:%SZ)" >&2
fi
}
# ─── Filter Change Reconciliation ─────────────────────────────────
# When the josh filter changes (e.g., exclude patterns added/removed),
# josh-proxy recomputes filtered history with new SHAs. This creates a
# merge commit on the subrepo that connects old and new histories,
# re-establishing shared ancestry without a destructive force-push.
# Returns: reconciled | skip | lease-rejected
reconcile_filter_change() {
local mono_branch="$SYNC_BRANCH_MONO"
local subrepo_branch="$SYNC_BRANCH_SUBREPO"
local work_dir
work_dir=$(mktemp -d)
# shellcheck disable=SC2064 # Intentional early expansion — work_dir is local
trap "rm -rf '$work_dir'" EXIT
log "INFO" "=== Filter change reconciliation: ${mono_branch} ==="
# 1. Clone subrepo
git clone "$(subrepo_auth_url)" \
--branch "$subrepo_branch" --single-branch \
"${work_dir}/subrepo" || die "Failed to clone subrepo"
cd "${work_dir}/subrepo" || exit
git config user.name "$BOT_NAME"
git config user.email "$BOT_EMAIL"
local subrepo_head
subrepo_head=$(git rev-parse HEAD)
log "INFO" "Subrepo HEAD: ${subrepo_head:0:12}"
# 2. Fetch josh-proxy filtered view (new filter)
git remote add josh-filtered "$(josh_auth_url)"
git fetch josh-filtered "$mono_branch" || die "Failed to fetch from josh-proxy"
local josh_head josh_tree
josh_head=$(git rev-parse "josh-filtered/${mono_branch}")
# shellcheck disable=SC1083 # {tree} is git syntax, not shell brace expansion
josh_tree=$(git rev-parse "josh-filtered/${mono_branch}^{tree}")
log "INFO" "Josh-proxy HEAD (new filter): ${josh_head:0:12}"
# 3. Check if trees are already identical (filter change had no effect)
local subrepo_tree
# shellcheck disable=SC1083
subrepo_tree=$(git rev-parse "HEAD^{tree}")
if [ "$josh_tree" = "$subrepo_tree" ]; then
log "INFO" "Trees identical after filter change — no reconciliation needed"
echo "skip"
return
fi
# 4. Create merge commit: josh-proxy HEAD (first parent) + subrepo HEAD, with josh-proxy's tree
# Josh follows first-parent traversal — josh-filtered MUST be first so josh can map
# the history back to the monorepo. Old subrepo history hangs off parent 2.
local merge_commit
merge_commit=$(git commit-tree "$josh_tree" \
-p "$josh_head" \
-p "$subrepo_head" \
-m "Sync: filter configuration updated
${BOT_TRAILER}: filter-change/${mono_branch}/$(date -u +%Y-%m-%dT%H:%M:%SZ)")
git reset --hard "$merge_commit" >&2
log "INFO" "Created reconciliation merge: ${merge_commit:0:12}"
# 5. Record lease and push
local subrepo_sha
subrepo_sha=$(subrepo_ls_remote "$subrepo_branch")
if git push \
--force-with-lease="refs/heads/${subrepo_branch}:${subrepo_sha}" \
"$(subrepo_auth_url)" \
"HEAD:refs/heads/${subrepo_branch}"; then
log "INFO" "Filter change reconciled — shared ancestry re-established"
echo "reconciled"
else
log "WARN" "Force-with-lease rejected — subrepo changed during reconciliation"
echo "lease-rejected"
fi
}
# ─── Reverse Sync: Subrepo → Monorepo ──────────────────────────────
#
# Always creates a PR on the monorepo — never pushes directly.
@@ -156,9 +251,24 @@ reverse_sync() {
git remote add mono-filtered "$(josh_auth_url)"
git fetch mono-filtered "$mono_branch" || die "Failed to fetch from josh-proxy"
# 3. Find new human commits (excludes bot commits from forward sync)
# 3. Compare trees — skip if subrepo matches josh-filtered view
local subrepo_tree josh_tree
# shellcheck disable=SC1083 # {tree} is git syntax, not shell brace expansion
subrepo_tree=$(git rev-parse "HEAD^{tree}")
# shellcheck disable=SC1083
josh_tree=$(git rev-parse "mono-filtered/${mono_branch}^{tree}")
if [ "$subrepo_tree" = "$josh_tree" ]; then
log "INFO" "Subrepo tree matches josh-filtered view — nothing to sync"
echo "skip"
return
fi
# 4. Find new human commits (excludes bot commits from forward sync)
# Uses --ancestry-path to restrict to the direct lineage and avoid
# leaking old history through reconciliation merge parents.
local human_commits
human_commits=$(git log "mono-filtered/${mono_branch}..HEAD" \
human_commits=$(git log --ancestry-path "mono-filtered/${mono_branch}..HEAD" \
--oneline --invert-grep --grep="^${BOT_TRAILER}:" 2>/dev/null || echo "")
if [ -z "$human_commits" ]; then
@@ -170,7 +280,7 @@ reverse_sync() {
log "INFO" "New human commits to sync:"
echo "$human_commits" >&2
# 4. Push through josh to a staging branch
# 5. Push through josh to a staging branch
local ts
ts=$(date +%Y%m%d-%H%M%S)
local staging_branch="auto-sync/subrepo-${subrepo_branch}-${ts}"
@@ -178,9 +288,20 @@ reverse_sync() {
if git push -o "base=${mono_branch}" "$(josh_auth_url)" "HEAD:refs/heads/${staging_branch}"; then
log "INFO" "Pushed to staging branch via josh: ${staging_branch}"
# 5. Create PR on monorepo (NEVER direct push)
# 6. Create PR on monorepo (NEVER direct push)
local pr_body
pr_body="## Subrepo changes\n\nNew commits from subrepo \`${subrepo_branch}\`:\n\n\`\`\`\n${human_commits}\n\`\`\`\n\n**Review checklist:**\n- [ ] Changes scoped to synced subfolder\n- [ ] No leaked credentials or environment-specific config\n- [ ] CI passes"
pr_body="## Subrepo changes
New commits from subrepo \`${subrepo_branch}\`:
\`\`\`
${human_commits}
\`\`\`
**Review checklist:**
- [ ] Changes scoped to synced subfolder
- [ ] No leaked credentials or environment-specific config
- [ ] CI passes"
create_pr "${MONOREPO_API}" "${GITEA_TOKEN}" \
"$mono_branch" "$staging_branch" \
@@ -268,7 +389,14 @@ ${BOT_TRAILER}: import/${JOSH_SYNC_TARGET_NAME}/${ts}" >&2
# 5. Create PR on monorepo
local pr_body
pr_body="## Initial import\n\nImporting existing subrepo \`${subrepo_branch}\` (${file_count} files) into \`${subfolder}/\`.\n\n**Review checklist:**\n- [ ] Content looks correct\n- [ ] No leaked credentials or environment-specific config\n- [ ] CI passes"
pr_body="## Initial import
Importing existing subrepo \`${subrepo_branch}\` (${file_count} files) into \`${subfolder}/\`.
**Review checklist:**
- [ ] Content looks correct
- [ ] No leaked credentials or environment-specific config
- [ ] CI passes"
create_pr "${MONOREPO_API}" "${GITEA_TOKEN}" \
"$mono_branch" "$staging_branch" \


@@ -70,6 +70,12 @@
"items": { "type": "string" },
"default": [],
"description": "Branches that only sync mono → subrepo (never reverse)"
},
"exclude": {
"type": "array",
"items": { "type": "string" },
"default": [],
"description": "File/directory patterns to exclude from sync via josh :exclude filter. Josh pattern syntax: 'dir/' for directories, '*.ext' for globs, '**/dir/' for nested matches. Patterns are embedded inline in the josh-proxy URL."
}
}
}