Compare commits
17 Commits
| Author | SHA1 | Date |
|---|---|---|
| | 8ab07b83ab | |
| | 95b83bd538 | |
| | ce53d3c1d2 | |
| | 16257f25d7 | |
| | c0ddb887ff | |
| | 22bd59a9d7 | |
| | d7f8618b38 | |
| | 5929585d6c | |
| | 187a9ead14 | |
| | 401d0e87a4 | |
| | fbacec7f6f | |
| | 553f006174 | |
| | cb14cf9bd4 | |
| | 0363b0ee77 | |
| | 72430714af | |
| | 105216a27e | |
| | 405e5f4535 | |
3
.gitignore
vendored
@@ -1,3 +1,4 @@
.claude/*local*

dist/
.env
result
27
CHANGELOG.md
@@ -1,5 +1,32 @@
# Changelog

## 1.2.0

### Features

- **File exclusion**: `exclude` config field removes files/directories from the subrepo at the josh-proxy transport layer. Patterns are embedded inline in the josh-proxy URL using `:exclude[::pattern,...]` syntax — no extra files to generate or commit.
- **Filter change reconciliation**: When the josh filter changes (e.g., adding/removing exclude patterns), josh-sync automatically creates a reconciliation merge commit that connects old and new histories. No manual reset or force-push required.
- **Tree comparison guard**: Reverse sync now compares subrepo tree to josh-filtered tree before checking commit log. Skips immediately when trees are identical, avoiding false positives from reconciliation merge history.
- **Unrelated histories detection**: Forward sync detects when histories are unrelated (no common ancestor) and falls back to reconciliation instead of creating a useless conflict PR.

### Fixes

- Pre-v1.2 state compatibility: When upgrading from v1.0/v1.1 (no `josh_filter` stored in state), the old filter is derived from `subfolder` so reconciliation triggers correctly.
- Reconciliation merge parent order: Josh-filtered history is always first parent so josh-proxy can follow first-parent traversal back to the monorepo.
- Reverse sync `--ancestry-path` flag prevents old subrepo history from leaking through reconciliation merge parents.
- PR body `\n` now renders as actual newlines instead of literal text.
- Conflict result no longer updates sync state (added `continue` to skip state write).
- `action.yml` now copies VERSION file for correct `--version` output in CI.
- `.gitignore` now includes `dist/` and `.env`.

## 1.1.0

### Features

- **`onboard` command**: Interactive, resumable workflow for importing existing subrepos into the monorepo. Walks through: prerequisites check, import (creates PRs), wait for merge, reset (pushes josh-filtered history). Checkpoint/resume at every step.
- **`migrate-pr` command**: Migrates open PRs from an archived subrepo to the new one. Supports interactive selection, `--all` flag, and specific PR numbers. Uses `git apply --3way` for resilient patch application.
- **Onboard state tracking**: Stored on the `josh-sync-state` branch at `<target>/onboard.json`. Tracks step progress, import PR numbers, reset branches, and migrated PRs.

## 1.0.0

Initial release. Extracted from [private-monorepo-example](https://code.itkan.io/pe/private-monorepo-example) into a standalone reusable library.
2
Makefile
@@ -23,7 +23,7 @@ dist/josh-sync: bin/josh-sync lib/*.sh VERSION
 @echo '# Generated by: make build' >> dist/josh-sync
 @echo '' >> dist/josh-sync
 @# Inline all library modules (strip shebangs and source directives)
-@for f in lib/core.sh lib/config.sh lib/auth.sh lib/state.sh lib/sync.sh; do \
+@for f in lib/core.sh lib/config.sh lib/auth.sh lib/state.sh lib/sync.sh lib/onboard.sh; do \
 echo "# --- $$f ---" >> dist/josh-sync; \
 grep -v '^#!/' "$$f" | grep -v '^# shellcheck source=' >> dist/josh-sync; \
 echo '' >> dist/josh-sync; \
16
README.md
@@ -16,12 +16,12 @@ josh:
|
||||
targets:
|
||||
- name: "billing"
|
||||
subfolder: "services/billing"
|
||||
josh_filter: ":/services/billing"
|
||||
subrepo_url: "git@gitea.example.com:ext/billing.git"
|
||||
subrepo_auth: "ssh"
|
||||
branches:
|
||||
main: main
|
||||
forward_only: []
|
||||
exclude: # files excluded from subrepo (optional)
|
||||
- ".monorepo/"
|
||||
|
||||
bot:
|
||||
name: "josh-sync-bot"
|
||||
@@ -58,8 +58,10 @@ Run `josh-sync preflight` to validate your setup.
|
||||
|
||||
## Documentation
|
||||
|
||||
- **[Setup Guide](docs/guide.md)** — Step-by-step: prerequisites, importing existing subrepos, CI workflows, and troubleshooting
|
||||
- **[Setup Guide](docs/guide.md)** — Step-by-step: prerequisites, importing existing subrepos, CI workflows, file exclusion, and troubleshooting
|
||||
- **[Configuration Reference](docs/config-reference.md)** — Full `.josh-sync.yml` field documentation
|
||||
- **[Architecture Decision Records](docs/adr/)** — Design rationale and trade-offs
|
||||
- **[Changelog](CHANGELOG.md)** — Version history
|
||||
|
||||
## CLI
|
||||
|
||||
@@ -68,6 +70,8 @@ josh-sync sync [--forward|--reverse] [--target NAME[,NAME]] [--branch BRANCH]
|
||||
josh-sync preflight
|
||||
josh-sync import <target>
|
||||
josh-sync reset <target>
|
||||
josh-sync onboard <target> [--restart]
|
||||
josh-sync migrate-pr <target> [PR#...] [--all]
|
||||
josh-sync status
|
||||
josh-sync state show <target> [branch]
|
||||
josh-sync state reset <target> [branch]
|
||||
@@ -77,12 +81,16 @@ josh-sync state reset <target> [branch]
|
||||
|
||||
- **Forward sync** (mono → subrepo): pushes directly if clean, creates conflict PR if not. Uses `--force-with-lease` for safety.
|
||||
- **Reverse sync** (subrepo → mono): always creates a PR, never pushes directly.
|
||||
- **File exclusion**: `exclude` patterns are embedded inline in the josh-proxy URL. Excluded files exist only in the monorepo.
|
||||
- **Filter reconciliation**: Changing the exclude list auto-creates a merge commit that connects old and new histories — no force-push needed.
|
||||
- **Loop prevention**: `Josh-Sync-Origin:` git trailer filters out bot commits.
|
||||
- **State tracking**: orphan branch `josh-sync-state` stores JSON per target/branch.
|
||||
|
||||
## Dependencies
|
||||
|
||||
`bash >=4`, `git`, `curl`, `jq`, `yq` ([mikefarah/yq](https://github.com/mikefarah/yq) v4+), `openssh`
|
||||
`bash >=4`, `git`, `curl`, `jq`, `yq` ([mikefarah/yq](https://github.com/mikefarah/yq) v4+), `openssh`, `rsync`
|
||||
|
||||
> The Nix flake bundles all dependencies automatically.
|
||||
|
||||
## License
|
||||
|
||||
|
||||
@@ -26,6 +26,7 @@ runs:
|
||||
run: |
|
||||
JOSH_DIR="$(mktemp -d)"
|
||||
cp -r "${{ github.action_path }}/bin" "${{ github.action_path }}/lib" "${JOSH_DIR}/"
|
||||
cp "${{ github.action_path }}/VERSION" "${JOSH_DIR}/" 2>/dev/null || true
|
||||
chmod +x "${JOSH_DIR}/bin/josh-sync"
|
||||
echo "${JOSH_DIR}/bin" >> "$GITHUB_PATH"
|
||||
echo "JOSH_SYNC_ROOT=${JOSH_DIR}" >> "$GITHUB_ENV"
|
||||
|
||||
210
bin/josh-sync
@@ -9,6 +9,8 @@
|
||||
# preflight Validate config, connectivity, auth
|
||||
# import <target> Initial import: pull subrepo into monorepo
|
||||
# reset <target> Reset subrepo to josh-filtered view
|
||||
# onboard <target> Import existing subrepo into monorepo (interactive)
|
||||
# migrate-pr <target> [PR#...] [--all] Move PRs from archived to new subrepo
|
||||
# status Show target config and sync state
|
||||
# state show|reset Manage sync state directly
|
||||
#
|
||||
@@ -39,6 +41,8 @@ source "${JOSH_LIB_DIR}/auth.sh"
|
||||
source "${JOSH_LIB_DIR}/state.sh"
|
||||
# shellcheck source=../lib/sync.sh
|
||||
source "${JOSH_LIB_DIR}/sync.sh"
|
||||
# shellcheck source=../lib/onboard.sh
|
||||
source "${JOSH_LIB_DIR}/onboard.sh"
|
||||
|
||||
# ─── Version ────────────────────────────────────────────────────────
|
||||
|
||||
@@ -69,6 +73,8 @@ Commands:
|
||||
preflight Validate config, connectivity, auth, workflow coverage
|
||||
import <target> Initial import: pull existing subrepo into monorepo (creates PR)
|
||||
reset <target> Reset subrepo to josh-filtered view (after merging import PR)
|
||||
onboard <target> Import existing subrepo into monorepo (interactive, resumable)
|
||||
migrate-pr <target> [PR#...] [--all] Move PRs from archived to new subrepo
|
||||
status Show target config and sync state
|
||||
state show <target> [branch] Show sync state JSON
|
||||
state reset <target> [branch] Reset sync state to {}
|
||||
@@ -202,13 +208,42 @@ _sync_direction() {
|
||||
fi
|
||||
fi
|
||||
|
||||
# Run sync
|
||||
# Check for filter change (forward only — reverse uses same filter)
|
||||
local result
|
||||
if [ "$direction" = "forward" ]; then
|
||||
local prev_filter
|
||||
prev_filter=$(echo "$state" | jq -r '.last_forward.josh_filter // empty')
|
||||
|
||||
# If no filter stored (pre-v1.2 state) but a previous sync exists,
|
||||
# the old filter was the simple :/subfolder (before exclude was added)
|
||||
if [ -z "$prev_filter" ]; then
|
||||
local prev_mono_sha
|
||||
prev_mono_sha=$(echo "$state" | jq -r '.last_forward.mono_sha // empty')
|
||||
if [ -n "$prev_mono_sha" ]; then
|
||||
local subfolder
|
||||
subfolder=$(echo "$TARGET_JSON" | jq -r '.subfolder')
|
||||
prev_filter=":/${subfolder}"
|
||||
fi
|
||||
fi
|
||||
|
||||
if [ -n "$prev_filter" ] && [ "$prev_filter" != "$JOSH_FILTER" ]; then
|
||||
log "WARN" "Josh filter changed — reconciling histories"
|
||||
log "INFO" "Old: ${prev_filter}"
|
||||
log "INFO" "New: ${JOSH_FILTER}"
|
||||
result=$(reconcile_filter_change)
|
||||
else
|
||||
result=$(forward_sync)
|
||||
fi
|
||||
else
|
||||
result=$(reverse_sync)
|
||||
fi
|
||||
# If forward sync hit unrelated histories, fall back to reconciliation
|
||||
if [ "$result" = "unrelated" ]; then
|
||||
log "WARN" "Unrelated histories detected — falling back to filter reconciliation"
|
||||
result=$(reconcile_filter_change)
|
||||
log "INFO" "Reconciliation result: ${result}"
|
||||
fi
|
||||
|
||||
log "INFO" "Result: ${result}"
|
||||
|
||||
# Handle warnings
|
||||
@@ -218,6 +253,7 @@ _sync_direction() {
|
||||
fi
|
||||
if [ "$result" = "conflict" ]; then
|
||||
echo "::warning::Target ${target_name}, branch ${branch}: merge conflict — PR created on subrepo"
|
||||
continue
|
||||
fi
|
||||
if [ "$result" = "josh-rejected" ]; then
|
||||
echo "::error::Target ${target_name}, branch ${branch}: josh rejected push — check proxy logs"
|
||||
@@ -234,8 +270,9 @@ _sync_direction() {
|
||||
--arg s_sha "${subrepo_sha_now:-}" \
|
||||
--arg ts "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
|
||||
--arg status "$result" \
|
||||
--arg filter "$JOSH_FILTER" \
|
||||
--argjson prev "$state" \
|
||||
'$prev + {last_forward: {mono_sha:$m_sha, subrepo_sha:$s_sha, timestamp:$ts, status:$status}}')
|
||||
'$prev + {last_forward: {mono_sha:$m_sha, subrepo_sha:$s_sha, timestamp:$ts, status:$status, josh_filter:$filter}}')
|
||||
else
|
||||
local mono_sha_now
|
||||
mono_sha_now=$(git rev-parse "origin/${branch}" 2>/dev/null || echo "")
|
||||
@@ -643,6 +680,173 @@ cmd_state() {
|
||||
esac
|
||||
}
|
||||
|
||||
# ─── Onboard Command ──────────────────────────────────────────────
|
||||
|
||||
cmd_onboard() {
|
||||
local config_file=".josh-sync.yml"
|
||||
local target_name=""
|
||||
local restart=false
|
||||
|
||||
while [ $# -gt 0 ]; do
|
||||
case "$1" in
|
||||
--config) config_file="$2"; shift 2 ;;
|
||||
--debug) export JOSH_SYNC_DEBUG=1; shift ;;
|
||||
--restart) restart=true; shift ;;
|
||||
-*) die "Unknown flag: $1" ;;
|
||||
*) target_name="$1"; shift ;;
|
||||
esac
|
||||
done
|
||||
|
||||
if [ -z "$target_name" ]; then
|
||||
echo "Usage: josh-sync onboard <target> [--restart]" >&2
|
||||
parse_config "$config_file"
|
||||
echo "Available targets:" >&2
|
||||
echo "$JOSH_SYNC_TARGETS" | jq -r '.[].name' | sed 's/^/ /' >&2
|
||||
exit 1
|
||||
fi
|
||||
|
||||
parse_config "$config_file"
|
||||
|
||||
local target_json
|
||||
target_json=$(echo "$JOSH_SYNC_TARGETS" | jq -c --arg n "$target_name" '.[] | select(.name == $n)')
|
||||
[ -n "$target_json" ] || die "Target '${target_name}' not found in config"
|
||||
|
||||
log "INFO" "══════ Onboard target: ${target_name} ══════"
|
||||
load_target "$target_json"
|
||||
onboard_flow "$target_json" "$restart"
|
||||
}
|
||||
|
||||
# ─── Migrate PR Command ──────────────────────────────────────────
|
||||
|
||||
cmd_migrate_pr() {
|
||||
local config_file=".josh-sync.yml"
|
||||
local target_name=""
|
||||
local all=false
|
||||
local pr_numbers=()
|
||||
|
||||
while [ $# -gt 0 ]; do
|
||||
case "$1" in
|
||||
--config) config_file="$2"; shift 2 ;;
|
||||
--debug) export JOSH_SYNC_DEBUG=1; shift ;;
|
||||
--all) all=true; shift ;;
|
||||
-*) die "Unknown flag: $1" ;;
|
||||
*)
|
||||
if [ -z "$target_name" ]; then
|
||||
target_name="$1"
|
||||
else
|
||||
pr_numbers+=("$1")
|
||||
fi
|
||||
shift ;;
|
||||
esac
|
||||
done
|
||||
|
||||
if [ -z "$target_name" ]; then
|
||||
echo "Usage: josh-sync migrate-pr <target> [PR#...] [--all]" >&2
|
||||
parse_config "$config_file"
|
||||
echo "Available targets:" >&2
|
||||
echo "$JOSH_SYNC_TARGETS" | jq -r '.[].name' | sed 's/^/ /' >&2
|
||||
exit 1
|
||||
fi
|
||||
|
||||
parse_config "$config_file"
|
||||
|
||||
local target_json
|
||||
target_json=$(echo "$JOSH_SYNC_TARGETS" | jq -c --arg n "$target_name" '.[] | select(.name == $n)')
|
||||
[ -n "$target_json" ] || die "Target '${target_name}' not found in config"
|
||||
|
||||
load_target "$target_json"
|
||||
|
||||
# Load archived repo info from onboard state
|
||||
local onboard_state archived_api
|
||||
onboard_state=$(read_onboard_state "$target_name")
|
||||
archived_api=$(echo "$onboard_state" | jq -r '.archived_api')
|
||||
if [ -z "$archived_api" ] || [ "$archived_api" = "null" ]; then
|
||||
die "No archived repo info found. Run 'josh-sync onboard ${target_name}' first."
|
||||
fi
|
||||
|
||||
log "INFO" "Archived repo: ${archived_api}"
|
||||
|
||||
# Load already-migrated PR numbers for skip detection and display
|
||||
local migrated_numbers
|
||||
migrated_numbers=$(echo "$onboard_state" | jq -r '[.migrated_prs // [] | .[].old_number] | map(tostring) | .[]')
|
||||
|
||||
# Counters for summary
|
||||
local migrated=0 failed=0 skipped=0
|
||||
|
||||
# Helper: attempt migration of one PR with counting
|
||||
_try_migrate() {
|
||||
local num="$1"
|
||||
if echo "$migrated_numbers" | grep -qx "$num"; then
|
||||
log "INFO" "PR #${num} already migrated — skipping"
|
||||
skipped=$((skipped + 1))
|
||||
elif migrate_one_pr "$num"; then
|
||||
migrated=$((migrated + 1))
|
||||
else
|
||||
failed=$((failed + 1))
|
||||
fi
|
||||
}
|
||||
|
||||
if [ "$all" = true ]; then
|
||||
# Migrate all open PRs from archived repo
|
||||
local prs
|
||||
prs=$(list_open_prs "$archived_api" "$SUBREPO_TOKEN") \
|
||||
|| die "Failed to list PRs on archived repo"
|
||||
local count
|
||||
count=$(echo "$prs" | jq 'length')
|
||||
log "INFO" "Found ${count} open PR(s) on archived repo"
|
||||
|
||||
while read -r num; do
|
||||
_try_migrate "$num"
|
||||
done < <(echo "$prs" | jq -r '.[] | .number')
|
||||
|
||||
elif [ ${#pr_numbers[@]} -gt 0 ]; then
|
||||
# Migrate specific PR numbers
|
||||
for num in "${pr_numbers[@]}"; do
|
||||
_try_migrate "$num"
|
||||
done
|
||||
|
||||
else
|
||||
# Interactive: list open PRs, let user pick
|
||||
local prs
|
||||
prs=$(list_open_prs "$archived_api" "$SUBREPO_TOKEN") \
|
||||
|| die "Failed to list PRs on archived repo"
|
||||
local count
|
||||
count=$(echo "$prs" | jq 'length')
|
||||
|
||||
if [ "$count" -eq 0 ]; then
|
||||
log "INFO" "No open PRs on archived repo"
|
||||
return
|
||||
fi
|
||||
|
||||
# Display PRs with [migrated] marker for already-processed ones
|
||||
echo "" >&2
|
||||
echo "Open PRs on archived repo:" >&2
|
||||
while IFS=$'\t' read -r num title base_ref head_ref; do
|
||||
if echo "$migrated_numbers" | grep -qx "$num"; then
|
||||
echo " #${num}: ${title} (${base_ref} <- ${head_ref}) [migrated]" >&2
|
||||
else
|
||||
echo " #${num}: ${title} (${base_ref} <- ${head_ref})" >&2
|
||||
fi
|
||||
done < <(echo "$prs" | jq -r '.[] | "\(.number)\t\(.title)\t\(.base.ref)\t\(.head.ref)"')
|
||||
echo "" >&2
|
||||
echo "Enter PR numbers to migrate (space-separated), or 'all':" >&2
|
||||
local selection
|
||||
read -r selection
|
||||
|
||||
if [ "$selection" = "all" ]; then
|
||||
while read -r num; do
|
||||
_try_migrate "$num"
|
||||
done < <(echo "$prs" | jq -r '.[] | .number')
|
||||
else
|
||||
for num in $selection; do
|
||||
_try_migrate "$num"
|
||||
done
|
||||
fi
|
||||
fi
|
||||
|
||||
log "INFO" "Migration complete: ${migrated} migrated, ${failed} failed, ${skipped} skipped"
|
||||
}
|
||||
|
||||
# ─── Main ───────────────────────────────────────────────────────────
|
||||
|
||||
main() {
|
||||
@@ -666,6 +870,8 @@ main() {
|
||||
preflight) cmd_preflight "$@" ;;
|
||||
import) cmd_import "$@" ;;
|
||||
reset) cmd_reset "$@" ;;
|
||||
onboard) cmd_onboard "$@" ;;
|
||||
migrate-pr) cmd_migrate_pr "$@" ;;
|
||||
status) cmd_status "$@" ;;
|
||||
state) cmd_state "$@" ;;
|
||||
*)
|
||||
|
||||
42
docs/adr/001-josh-proxy-for-sync.md
Normal file
@@ -0,0 +1,42 @@
# ADR-001: Josh-proxy for Bidirectional Sync

**Status:** Accepted
**Date:** 2026-01

## Context

We need bidirectional sync between a monorepo and N external subrepos. Each subrepo corresponds to a subfolder in the monorepo. Developers on both sides should see a clean, complete git history — not synthetic commits or squashed blobs.

### Alternatives considered

1. **git subtree**: Built into git. `git subtree split` extracts a subfolder into a standalone repo. However, subtree split rewrites history on every run (O(n) on total commits), creating new SHAs each time. Bidirectional sync requires manual `subtree merge` with conflict-prone history grafting. No transport-layer filtering — all content must be fetched.

2. **git submodule**: Tracks external repos via `.gitmodules` pointer commits. Does not provide content-level integration — monorepo commits don't contain subrepo files directly. Developers must run `git submodule update`. Bidirectional sync is not a supported workflow.

3. **Custom diff-and-patch scripts**: Compute diffs between monorepo subfolder and subrepo, apply patches in both directions. Fragile with renames, binary files, and merge conflicts. Loses authorship and commit granularity.

4. **josh-proxy**: A git proxy that computes filtered views of repositories in real-time. Clients `git clone` through josh and receive a repo containing only the specified subfolder, with history rewritten to match. Josh maintains a persistent SHA mapping, so the same monorepo commit always produces the same filtered SHA. Bidirectional: pushing back through josh maps filtered commits to monorepo commits.

## Decision

Use josh-proxy as the transport layer for all sync operations.

## Consequences

**Positive:**
- Clean git history in both directions — no synthetic commits
- Deterministic SHA mapping — same monorepo state always produces same filtered SHA
- Bidirectional by design — push through josh maps back to monorepo
- Transport-layer filtering — content exclusion happens at clone/push time, not via generated files
- Supports any git hosting platform (Gitea, GitHub, GitLab) since it's a proxy

**Negative:**
- Requires running a josh-proxy instance (operational overhead)
- Josh-proxy is a Rust project with a smaller community than git-native tools
- Proxy must have network access to the monorepo's git server
- Josh's SHA mapping is opaque — debugging requires understanding josh internals
- First-parent traversal behavior must be respected in merge commits (see ADR-008)

**Risks:**
- Josh-proxy downtime blocks all sync operations
- Josh-proxy bugs could corrupt history mapping (mitigated by force-with-lease on forward, always-PR on reverse)
50
docs/adr/002-state-on-orphan-branch.md
Normal file
@@ -0,0 +1,50 @@
# ADR-002: State Storage on Orphan Git Branch

**Status:** Accepted
**Date:** 2026-01

## Context

Josh-sync needs persistent state to track what has already been synced (last-synced commit SHAs, timestamps, status). This prevents re-syncing unchanged content and enables incremental operation. The state must survive CI runner teardown — runners are ephemeral containers.

### Alternatives considered

1. **File in the repo**: Commit a state JSON file to the monorepo. Every sync run creates a commit, polluting history. Race conditions when multiple sync jobs run concurrently.

2. **External database/KV store**: Redis, SQLite, or a cloud KV service. Adds an infrastructure dependency. Credentials and connectivity to manage.

3. **CI artifacts/cache**: Platform-specific (GitHub Actions cache, Gitea cache). Not portable across CI platforms. Expiry policies vary.

4. **Orphan git branch**: A branch with no parent relationship to the main history. Stores JSON files in a simple `<target>/<branch>.json` layout. Pushed to origin, so it survives runner teardown. No external dependencies — uses git itself.

## Decision

Store sync state as JSON files on an orphan branch (`josh-sync-state`) in the monorepo.

### Storage layout

```
origin/josh-sync-state/
  <target>/<branch>.json # sync state per target/branch
  <target>/onboard.json # onboard workflow state (v1.1+)
```

### Implementation

- `read_state()`: `git fetch origin josh-sync-state && git show origin/josh-sync-state:<key>.json`
- `write_state()`: Uses `git worktree` to check out the orphan branch in a temp directory, writes JSON, commits, and pushes. This avoids touching the main working tree.
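
The read path above can be sketched as a small function. This is only an illustration under stated assumptions, not the project's `read_state()`: the name `read_state_sketch` is hypothetical, and falling back to `{}` when the branch or file is missing is an assumed convention.

```bash
#!/usr/bin/env bash
# Sketch of the read path: fetch the orphan state branch, then print one
# JSON file from it without checking anything out.
# Assumption: a missing branch or file is treated as empty state ("{}").
read_state_sketch() {
  local key="$1"   # e.g. "billing/main" for <target>/<branch>.json
  git fetch --quiet origin josh-sync-state 2>/dev/null || true
  git show "origin/josh-sync-state:${key}.json" 2>/dev/null || echo '{}'
}
```

A call such as `read_state_sketch billing/main` would print the stored JSON for the `billing` target on `main`. Because `git show` reads the blob straight from the remote-tracking ref, the working tree is never touched.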

## Consequences

**Positive:**
- Zero external dependencies — only git
- Portable across CI platforms (Gitea Actions, GitHub Actions, local)
- Human-readable JSON files — easy to inspect and debug
- Atomic updates via git commit + push
- Natural namespacing via directory structure

**Negative:**
- Concurrent writes can race (mitigated by concurrency groups in CI workflows)
- `git worktree` adds complexity to the write path
- State branch appears in `git branch -a` output (minor clutter)
- Push failures on the state branch are non-fatal (logged as warning, sync still succeeds)
33
docs/adr/003-force-with-lease-forward.md
Normal file
@@ -0,0 +1,33 @@
# ADR-003: Force-with-Lease for Forward Sync

**Status:** Accepted
**Date:** 2026-01

## Context

Forward sync pushes monorepo changes to the subrepo. If someone pushes directly to the subrepo between when josh-sync reads its HEAD and when josh-sync pushes, the two histories diverge. A plain `git push` would then be rejected as a non-fast-forward update, and a `git push --force` would silently destroy the concurrent changes.

## Decision

Use `git push --force-with-lease=refs/heads/<branch>:<expected-sha>` for all forward sync pushes. The expected SHA is recorded at the start of the sync operation (the "lease").

### How it works

1. Record subrepo HEAD SHA before any operations: `subrepo_sha=$(subrepo_ls_remote "$branch")`
2. Perform merge of monorepo changes onto subrepo state
3. Push with explicit lease: `--force-with-lease=refs/heads/main:<subrepo_sha>`
4. If the subrepo HEAD changed since step 1, git rejects the push
5. Josh-sync reports `lease-rejected` and retries on the next run
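
The steps above can be sketched with two small functions. The names `remote_head_sha` and `push_with_lease` are hypothetical and are not taken from josh-sync. As a simplification, this sketch reports every push failure as `lease-rejected`, whereas a real implementation would likely distinguish other failure modes.

```bash
#!/usr/bin/env bash
# Step 1: read the current remote branch tip (the value used as the lease).
remote_head_sha() {
  local remote="$1" branch="$2"
  git ls-remote "$remote" "refs/heads/${branch}" | cut -f1
}

# Steps 3-5: push HEAD with an explicit lease. Git only updates the remote
# ref if it still points at expected_sha; otherwise the push is refused.
# Simplification: any push failure is reported as "lease-rejected".
push_with_lease() {
  local remote="$1" branch="$2" expected_sha="$3"
  if git push --quiet \
      --force-with-lease="refs/heads/${branch}:${expected_sha}" \
      "$remote" "HEAD:refs/heads/${branch}" 2>/dev/null; then
    echo "pushed"
  else
    echo "lease-rejected"
  fi
}
```

A caller would record `lease=$(remote_head_sha "$remote" main)` before merging, then run `push_with_lease "$remote" main "$lease"` once the merge is done. Passing the SHA explicitly is what makes the check independent of any local tracking ref.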

## Consequences

**Positive:**
- Never overwrites concurrent changes — git atomically checks the expected SHA
- Explicit SHA lease (not just "current tracking ref") prevents stale-ref bugs
- Failed leases are retried on the next sync run — no data loss, just delay
- Works correctly with josh-proxy's SHA mapping

**Negative:**
- Lease-rejected means the sync run did work that gets discarded (clone, merge, etc.)
- Persistent lease failures indicate a concurrent push pattern that needs investigation
- Requires the `--force-with-lease` flag with explicit SHA — the shorthand form (`--force-with-lease` without `=`) is unsafe because it uses the local tracking ref, which may be stale
41
docs/adr/004-always-pr-reverse.md
Normal file
@@ -0,0 +1,41 @@
# ADR-004: Always-PR Policy for Reverse Sync

**Status:** Accepted
**Date:** 2026-01

## Context

Reverse sync brings subrepo changes back into the monorepo. The monorepo is the source of truth and typically has CI checks, code review requirements, and branch protection rules. Pushing directly to the monorepo's main branch would bypass these safeguards.

### Alternatives considered

1. **Direct push**: Fast, but bypasses all review and CI. A bad subrepo commit could break the entire monorepo with no review gate.

2. **Always create a PR**: Pushes to a staging branch (`auto-sync/subrepo-<branch>-<timestamp>`), then creates a PR via API. Humans review and merge.

3. **Configurable per-target**: Let users choose direct push vs PR. Adds complexity and a dangerous default.

## Decision

Reverse sync always creates a PR on the monorepo. Never pushes directly to the target branch.

### Implementation

1. Push subrepo HEAD through josh-proxy to a staging branch: `git push -o "base=main" josh://... HEAD:refs/heads/auto-sync/subrepo-main-<ts>`
2. Create PR via Gitea/GitHub API targeting the monorepo's main branch
3. PR includes a review checklist: scoped to subfolder, no leaked credentials, CI passes

The `-o "base=main"` option tells josh-proxy which monorepo branch to map the push against.
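
As a small illustration of the staging branch naming in step 1, the helper below builds a name of the form `auto-sync/subrepo-<branch>-<timestamp>`. The function name `staging_branch_name` is hypothetical, and the timestamp format is an assumption, since the ADR does not specify one.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the staging branch name for a reverse-sync PR.
# The timestamp format is an assumption; it avoids ":" because git ref
# names may not contain colons.
staging_branch_name() {
  local branch="$1"
  local ts="${2:-$(date -u +%Y%m%d-%H%M%S)}"
  printf 'auto-sync/subrepo-%s-%s\n' "$branch" "$ts"
}

staging_branch_name main 20260212-103000
# prints: auto-sync/subrepo-main-20260212-103000
```

Including a timestamp gives each sync run its own branch, so a new run would not overwrite a staging branch that still backs an open PR.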

## Consequences

**Positive:**
- All monorepo changes go through review — consistent with team workflow
- CI runs on the PR branch before merge
- Bad subrepo changes are caught before they affect the monorepo
- Audit trail via PR history

**Negative:**
- Reverse sync is not instant — requires human action to merge the PR
- Stale PRs accumulate if subrepo changes frequently but PRs aren't merged promptly
- Adds API dependency (needs token with PR creation scope)
52
docs/adr/005-git-trailer-loop-prevention.md
Normal file
@@ -0,0 +1,52 @@
# ADR-005: Git Trailer for Loop Prevention

**Status:** Accepted
**Date:** 2026-01

## Context

Bidirectional sync creates an infinite loop risk: forward sync pushes commit A to the subrepo, reverse sync sees commit A as "new" and creates a PR back to the monorepo, forward sync sees the merged PR as "new" and pushes again, etc.

### Alternatives considered

1. **SHA tracking only**: Compare SHAs to skip already-synced content. Breaks when josh-proxy rewrites SHAs (which it always does for filtered views). The monorepo commit SHA and the filtered/subrepo commit SHA are never the same.

2. **Commit message prefix**: Add `[sync]` to bot commit messages. Fragile — humans might use the same prefix. Requires string matching on message content.

3. **Git trailer**: A structured key-value pair in the commit message body (after a blank line), following the `git interpret-trailers` convention. Format: `Key: value`. Machine-parseable, unlikely to be used by humans, and supported by `git log --grep`.

## Decision

All bot commits include a git trailer with a configurable key (default: `Josh-Sync-Origin`). Both sync directions filter out commits containing this trailer.

### Format

```
Sync from monorepo 2026-02-12T10:30:00Z

Josh-Sync-Origin: forward/main/2026-02-12T10:30:00Z
```

The trailer value encodes: direction, branch, and timestamp. This aids debugging but is not parsed by the loop filter — only the trailer key presence matters.

### Filtering

- **Reverse sync**: `git log --invert-grep --grep="^${BOT_TRAILER}:"` excludes all commits with the trailer
- **CI loop guard**: The composite action checks if HEAD commit has the trailer before running sync at all
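
A loop guard of this kind could look roughly like the sketch below. The helper name `message_has_trailer` is hypothetical and this is not the composite action's actual code. It reuses the `BOT_TRAILER` variable and the same `^key:` pattern shown in the reverse-sync filter.

```bash
#!/usr/bin/env bash
# Hypothetical loop-guard helper: succeeds when the commit message read from
# stdin contains a line starting with "<key>:".
# BOT_TRAILER defaults to the documented trailer key.
BOT_TRAILER="${BOT_TRAILER:-Josh-Sync-Origin}"

message_has_trailer() {
  grep -q "^${BOT_TRAILER}:"
}

# The bot commit from the Format section is detected; a human commit is not.
printf 'Sync from monorepo 2026-02-12T10:30:00Z\n\nJosh-Sync-Origin: forward/main/2026-02-12T10:30:00Z\n' \
  | message_has_trailer && echo "bot commit: skip"
printf 'Fix rounding in invoice totals\n' \
  | message_has_trailer || echo "human commit: sync"
```

In a CI job the message would come from `git log -1 --format=%B HEAD` rather than `printf`. Only the key is matched, in line with the note that the trailer value is not parsed.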

### Configuration

The trailer key is set in `.josh-sync.yml` under `bot.trailer`. This allows multiple josh-sync instances (with different bots) to operate on the same repos without interfering.

## Consequences

**Positive:**
- Reliable loop prevention — trailer is part of the immutable commit object
- Configurable key avoids conflicts between multiple sync bots
- Human-readable — `git log` shows the trailer in commit messages
- CI loop guard prevents unnecessary sync runs entirely

**Negative:**
- Commits with manually-added trailers matching the key would be incorrectly filtered
- Trailer must be in the commit body (after blank line), not the subject line
- Squash-and-merge on PRs may lose the trailer if the platform doesn't preserve commit message body
55
docs/adr/006-inline-exclude-filter.md
Normal file
@@ -0,0 +1,55 @@
|
||||
# ADR-006: Inline Exclude in Josh-Proxy URL
|
||||
|
||||
**Status:** Accepted
|
||||
**Date:** 2026-02
|
||||
|
||||
## Context
|
||||
|
||||
Some files in a monorepo subfolder should not appear in the subrepo (e.g., monorepo-specific CI configs, internal tooling, secrets templates). We need a mechanism to exclude these files from sync.
|
||||
|
||||
### Alternatives considered
|
||||
|
||||
1. **`.josh-sync-exclude` file committed to the repo**: A gitignore-style file listing patterns. Requires generating and committing a file. Changes to the exclude list create commits. The file itself would need to be excluded from the subrepo (circular dependency).
|
||||
|
||||
2. **Post-clone file deletion**: Clone through josh, then `rm -rf` excluded paths before pushing. Fragile — deletions create diff noise. Doesn't work for reverse sync (excluded files would appear as "deleted" in the subrepo).
|
||||
|
||||
3. **Josh `:exclude` filter inline in the URL**: Josh-proxy supports `:exclude[::pattern1,::pattern2]` appended to the filter path. The exclusion happens at the transport layer — git objects for excluded files are never transferred. Works identically for clone (forward) and push (reverse).
|
||||
|
||||
4. **Separate josh filter file**: Generate a josh filter expression and store it somewhere. Adds state management complexity.
|
||||
|
||||
## Decision
|
||||
|
||||
Embed exclusion patterns inline in the josh-proxy URL using josh's native `:exclude` syntax. The `exclude` config field in `.josh-sync.yml` is transformed at config parse time into the josh filter string.
|
||||
|
||||
### Example
|
||||
|
||||
Config:
|
||||
```yaml
|
||||
exclude:
|
||||
- ".monorepo/"
|
||||
- "**/internal/"
|
||||
```
|
||||
|
||||
Produces josh filter:
|
||||
```
|
||||
:/services/billing:exclude[::.monorepo/,::**/internal/]
|
||||
```
|
||||
|
||||
### Implementation
|
||||
|
||||
The `parse_config()` function in `lib/config.sh` uses jq to conditionally append `:exclude[...]` to the josh filter when the `exclude` array is non-empty. The enriched filter is stored in `JOSH_SYNC_TARGETS` JSON and used everywhere via `$JOSH_FILTER`.
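As a rough illustration of that transformation, the sketch below builds the same filter string in plain bash. It is not the jq expression from `lib/config.sh`; `build_josh_filter` is a hypothetical helper name used only for this example.

```bash
#!/usr/bin/env bash
# Sketch only: mirrors the "append :exclude[...] when patterns exist" rule.
# build_josh_filter is a hypothetical name, not a josh-sync function.

# Usage: build_josh_filter <subfolder> [pattern...]
build_josh_filter() {
  local subfolder="$1"
  shift
  local filter=":/${subfolder}"

  if [ "$#" -gt 0 ]; then
    local joined="" pattern
    for pattern in "$@"; do
      # Prefix each pattern with "::" and join with commas.
      joined="${joined:+${joined},}::${pattern}"
    done
    filter="${filter}:exclude[${joined}]"
  fi

  printf '%s\n' "$filter"
}

# Quote patterns so the shell does not expand the globs.
build_josh_filter "services/billing" ".monorepo/" "**/internal/"
```

With no patterns the helper returns the plain `:/<subfolder>` filter, which matches the "non-empty array" condition described above.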
|
||||
|
||||
## Consequences
|
||||
|
||||
**Positive:**
|
||||
- Zero committed files — exclusion is purely in the URL
|
||||
- Transport-layer filtering — excluded content never leaves the git server
|
||||
- Works identically for forward sync (clone), reverse sync (push), and reset
|
||||
- Tree comparison (`skip` detection) works correctly since excluded files aren't in the filtered view
|
||||
- Standard josh syntax — no custom invention
|
||||
|
||||
**Negative:**
|
||||
- Josh's `:exclude` pattern syntax is limited (no negation, no regex — only glob-style patterns with `::` prefix)
|
||||
- Long exclude lists make the URL unwieldy (though this is cosmetic — git handles long URLs fine)
|
||||
- Changing the exclude list changes the josh filter, which changes all filtered SHAs (see ADR-007 for how this is handled)
|
||||
- Debugging requires understanding josh's filter composition syntax
|
||||
53
docs/adr/007-reconciliation-merge.md
Normal file
@@ -0,0 +1,53 @@
|
||||
# ADR-007: Reconciliation Merge for Filter Changes
|
||||
|
||||
**Status:** Accepted
|
||||
**Date:** 2026-02
|
||||
|
||||
## Context
|
||||
|
||||
When the josh filter changes (e.g., adding exclude patterns), josh-proxy recomputes the entire filtered history with new SHAs. The subrepo's existing history (based on the old filter) shares no common ancestor with the new filtered history. A naive forward sync would see "unrelated histories" and fail.
|
||||
|
||||
### Alternatives considered
|
||||
|
||||
1. **Force-push to subrepo**: Replace subrepo history with the new filtered view (same as `josh-sync reset`). Destructive — all local clones become invalid, open PRs are orphaned, developers must re-clone.
|
||||
|
||||
2. **Cherry-pick new commits**: Identify commits that exist in the new filtered history but not the old, cherry-pick them onto the subrepo. Complex — the "same" commit has different SHAs in old vs new filtered history. No reliable way to match them.
|
||||
|
||||
3. **Reconciliation merge commit**: Create a merge commit on the subrepo that has both the new filtered HEAD and the old subrepo HEAD as parents, using the new filtered tree. This establishes shared ancestry without rewriting history.
|
||||
|
||||
## Decision
|
||||
|
||||
When josh-sync detects a filter change (stored filter in state differs from current `$JOSH_FILTER`), create a reconciliation merge commit using `git commit-tree`.
|
||||
|
||||
### How it works
|
||||
|
||||
1. Clone subrepo (has old history)
|
||||
2. Fetch josh-proxy filtered view (has new history)
|
||||
3. If trees are identical → skip (filter change had no effect on content)
|
||||
4. Create merge commit: `git commit-tree <josh-tree> -p <josh-head> -p <subrepo-head>`
|
||||
5. Push with `--force-with-lease`
|
||||
|
||||
The merge commit uses the josh-filtered tree (new content) and has two parents:
|
||||
- **Parent 1**: josh-filtered HEAD (new filter history) — must be first (see ADR-008)
|
||||
- **Parent 2**: subrepo HEAD (old filter history) — preserves old history as a side branch
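The `git commit-tree` step can be tried in isolation. The following is a minimal sketch in a throwaway repository, assuming only stock `git`. The function name `demo_reconciliation_merge`, the branch name `filtered`, and the file contents are made up for the demo and are not part of josh-sync.

```bash
#!/usr/bin/env bash
# Sketch only: builds a reconciliation-style merge between two unrelated
# histories in a temporary repository and checks the parent order.

demo_reconciliation_merge() {
  local dir
  dir=$(mktemp -d) || return 1
  (
    cd "$dir" || exit 1
    git init -q .
    git config user.name "demo-bot"
    git config user.email "demo-bot@example.com"

    # Stand-in for the subrepo history built with the old filter.
    echo "old" > file.txt
    git add file.txt
    git commit -q -m "old subrepo history"
    subrepo_head=$(git rev-parse HEAD)

    # Stand-in for the new josh-filtered history: an unrelated root commit.
    git checkout -q --orphan filtered
    echo "new" > file.txt
    git add file.txt
    git commit -q -m "new filtered history"
    josh_head=$(git rev-parse HEAD)
    josh_tree=$(git rev-parse "${josh_head}^{tree}")

    # Merge commit: new tree, josh-filtered head first, old head second.
    merge=$(git commit-tree "$josh_tree" -p "$josh_head" -p "$subrepo_head" \
      -m "reconcile filter change")

    # Print "ok" only if parents and tree are what we expect.
    [ "$(git rev-parse "${merge}^1")" = "$josh_head" ] &&
      [ "$(git rev-parse "${merge}^2")" = "$subrepo_head" ] &&
      [ "$(git rev-parse "${merge}^{tree}")" = "$josh_tree" ] &&
      echo "ok"
  )
  local status=$?
  rm -rf "$dir"
  return "$status"
}

demo_reconciliation_merge
```

Because the old head is parent 2 of the merge, it remains an ancestor of the new tip, so pushing the merge does not discard any existing subrepo commits.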
|
||||
|
||||
### Detection
|
||||
|
||||
Filter change is detected by comparing the stored `josh_filter` in sync state with the current `$JOSH_FILTER`. For pre-v1.2 state (no filter stored), the old filter is derived as `:/<subfolder>`.
|
||||
|
||||
As a reactive fallback, `forward_sync()` also detects unrelated histories via `git merge-base` and falls back to reconciliation.
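Both checks can be sketched in a few lines of bash. The helper names `filter_changed` and `histories_unrelated` are hypothetical and the argument layout is an assumption; the actual logic lives in josh-sync's sync code and may differ.

```bash
#!/usr/bin/env bash
# Sketch only: hypothetical helpers illustrating the two detection paths.

# Usage: filter_changed <stored_filter> <current_filter> <subfolder>
# Returns 0 (true) when the stored filter differs from the current one.
filter_changed() {
  local stored="$1" current="$2" subfolder="$3"
  # Pre-v1.2 state has no stored filter: derive the old one from subfolder.
  [ -n "$stored" ] || stored=":/${subfolder}"
  [ "$stored" != "$current" ]
}

# Usage: histories_unrelated <commit-a> <commit-b>
# git merge-base exits non-zero when there is no common ancestor.
# Note: it also exits non-zero for invalid refs, so validate inputs first.
histories_unrelated() {
  ! git merge-base "$1" "$2" >/dev/null 2>&1
}

if filter_changed "" ":/services/billing:exclude[::.monorepo/]" "services/billing"; then
  echo "filter changed: reconcile"
fi
```

Checking the stored filter first keeps the common case cheap; the `merge-base` probe only matters when state is missing or stale.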
|
||||
|
||||
## Consequences
|
||||
|
||||
**Positive:**
|
||||
- Non-destructive — old history is preserved as parent 2 of the merge
|
||||
- Developers don't need to re-clone the subrepo
|
||||
- Open PRs on the subrepo remain valid (they're based on commits that are still ancestors)
|
||||
- Automatic — no manual intervention needed when changing exclude patterns
|
||||
- Force-with-lease protects against concurrent changes during reconciliation
|
||||
|
||||
**Negative:**
|
||||
- The merge commit is synthetic (created by bot, not a real merge of concurrent work)
|
||||
- Parent ordering is critical — wrong order breaks josh's reverse mapping (see ADR-008)
|
||||
- The reconciliation merge contains a bot trailer, so reverse sync correctly ignores it
|
||||
- If the subrepo has diverged significantly (manual commits during filter change), the reconciliation merge may produce unexpected tree content (uses josh-filtered tree unconditionally)
|
||||
42
docs/adr/008-first-parent-ordering.md
Normal file
@@ -0,0 +1,42 @@
|
||||
# ADR-008: First-Parent Ordering in Reconciliation Merges
|
||||
|
||||
**Status:** Accepted
|
||||
**Date:** 2026-02
|
||||
|
||||
## Context
|
||||
|
||||
Josh-proxy uses **first-parent traversal** when mapping subrepo history back to the monorepo. When you push a commit through josh-proxy, josh walks the first-parent chain to find a commit it can map to a monorepo commit. If the first parent leads to unmappable history, josh cannot reconstruct the monorepo-side branch correctly.
|
||||
|
||||
This became critical when the reconciliation merge (ADR-007) initially had the wrong parent order: old subrepo history as parent 1, josh-filtered as parent 2. Josh followed parent 1, couldn't find any mappable commit, and created a monorepo branch containing only the subrepo subfolder content — effectively deleting 1280 files from the rest of the monorepo.
|
||||
|
||||
## Decision
|
||||
|
||||
In reconciliation merge commits, the josh-filtered HEAD **must be parent 1** (first parent). The old subrepo HEAD is parent 2.
|
||||
|
||||
```bash
|
||||
# Parent 1 (first -p): josh-filtered HEAD, which josh follows.
# Parent 2 (second -p): old subrepo HEAD, a side branch that josh ignores.
git commit-tree "$josh_tree" \
  -p "$josh_head" \
  -p "$subrepo_head" \
  -m "..."
|
||||
```
|
||||
|
||||
### Why this is safe
|
||||
|
||||
- The old subrepo HEAD (`subrepo_head`) is still an ancestor of the merge commit regardless of parent order — push succeeds either way
|
||||
- `--ancestry-path` in reverse sync still follows the path `B → M → C` through the reconciliation merge `M` regardless of parent order (it traces all paths, not just first-parent)
|
||||
- Josh follows first-parent and finds the josh-filtered commit, which maps cleanly back to the monorepo
|
||||
|
||||
## Consequences
|
||||
|
||||
**Positive:**
|
||||
- Josh can map the reconciliation merge back to the monorepo correctly
|
||||
- Reverse sync through josh produces correct diffs (only subrepo-scoped changes)
|
||||
- `git log --first-parent` on the subrepo shows the clean josh-filtered lineage
|
||||
|
||||
**Negative:**
|
||||
- This is a subtle invariant — future changes to merge commit creation must preserve parent order
|
||||
- The constraint is undocumented in josh-proxy's own documentation (discovered empirically)
|
||||
- No automated test can verify this without a running josh-proxy instance
|
||||
|
||||
**Lesson learned:**
|
||||
Parent order in `git commit-tree -p` is not cosmetic. For tools that rely on first-parent traversal (josh-proxy, `git log --first-parent`), parent 1 must be the "mainline" that the tool should follow.
|
||||
53
docs/adr/009-tree-comparison-guard.md
Normal file
@@ -0,0 +1,53 @@
|
||||
# ADR-009: Tree Comparison as Sync Skip Guard
|
||||
|
||||
**Status:** Accepted
|
||||
**Date:** 2026-02
|
||||
|
||||
## Context
|
||||
|
||||
Both forward and reverse sync need to detect "nothing to do" quickly. The primary mechanism is SHA comparison against stored state (last-synced SHA). However, this misses cases where:
|
||||
|
||||
- State is reset or lost
|
||||
- Reconciliation merges change SHAs without changing content
|
||||
- Multiple sync runs overlap
|
||||
|
||||
Additionally, reverse sync originally relied on `git log <base>..HEAD` to find new commits. After a reconciliation merge, the `..` range can leak old subrepo history through the merge's second parent, creating false positives.
|
||||
|
||||
## Decision
|
||||
|
||||
Add tree-level comparison as an early skip guard in both forward and reverse sync. Compare the git tree objects (which represent directory content, not commit history) to determine if there's actually any content difference.
|
||||
|
||||
### Forward sync
|
||||
|
||||
```bash
|
||||
mono_tree=$(git rev-parse 'HEAD^{tree}')
|
||||
subrepo_tree=$(git rev-parse "subrepo/${branch}^{tree}")
|
||||
[ "$mono_tree" = "$subrepo_tree" ] && echo "skip"
|
||||
```
|
||||
|
||||
### Reverse sync
|
||||
|
||||
```bash
|
||||
subrepo_tree=$(git rev-parse "HEAD^{tree}")
|
||||
josh_tree=$(git rev-parse "mono-filtered/${branch}^{tree}")
|
||||
[ "$subrepo_tree" = "$josh_tree" ] && echo "skip"
|
||||
```
|
||||
|
||||
Tree comparison happens **before** commit log analysis. If trees are identical, there is definitionally nothing to sync, regardless of what the commit history looks like.
|
||||
|
||||
### Combined with `--ancestry-path`
|
||||
|
||||
For reverse sync, even when trees differ, `git log --ancestry-path` restricts the commit range to the direct lineage between the two endpoints. This prevents old history from leaking through reconciliation merge parents.
|
||||
|
||||
## Consequences
|
||||
|
||||
**Positive:**
|
||||
- Eliminates false positives from reconciliation merges (trees are identical after reconciliation)
|
||||
- Fast — tree SHA comparison is O(1), no content traversal
|
||||
- Correct by definition — if trees match, content is identical
|
||||
- Defense in depth — works even when state tracking has gaps
|
||||
|
||||
**Negative:**
|
||||
- Tree comparison alone doesn't tell you *which* commits are new (still need `git log` for PR descriptions)
|
||||
- Adds an extra `git rev-parse` call per sync direction (negligible cost)
|
||||
- Cannot detect file-mode-only changes if josh normalizes modes (theoretical edge case)
|
||||
76
docs/adr/010-onboard-checkpoint-resume.md
Normal file
@@ -0,0 +1,76 @@
|
||||
# ADR-010: Onboard Workflow with Checkpoint/Resume
|
||||
|
||||
**Status:** Accepted
|
||||
**Date:** 2026-02
|
||||
|
||||
## Context
|
||||
|
||||
Onboarding an existing subrepo into the monorepo is a multi-step process that involves human interaction (renaming repos, merging PRs). The full flow is:
|
||||
|
||||
1. Prerequisites: rename existing repo, create new empty repo
|
||||
2. Import: copy subrepo content into monorepo, create import PR(s)
|
||||
3. Wait: human merges the import PR(s)
|
||||
4. Reset: force-push josh-filtered history to the new empty repo
|
||||
5. (Optional) Migrate open PRs from archived repo
|
||||
|
||||
Each step can fail or be interrupted. The process may span hours or days (waiting for PR review). If interrupted, restarting from scratch wastes work and can create duplicate PRs.
|
||||
|
||||
### Alternatives considered
|
||||
|
||||
1. **Single-shot script**: Run all steps in sequence. If interrupted, must restart from scratch. Duplicate PRs if import step is re-run.
|
||||
|
||||
2. **Manual step-by-step commands**: `import`, then manually run `reset`. Simple but error-prone — users may forget steps or run them out of order.
|
||||
|
||||
3. **Checkpoint/resume with persistent state**: Track the current step and intermediate results (PR numbers, reset branches) in persistent state. On re-run, resume from the last completed step.
|
||||
|
||||
## Decision
|
||||
|
||||
Implement `josh-sync onboard` as a checkpoint/resume workflow with state stored on the `josh-sync-state` branch at `<target>/onboard.json`.
|
||||
|
||||
### State machine
|
||||
|
||||
```
|
||||
start → importing → waiting-for-merge → resetting → complete
|
||||
```
|
||||
|
||||
Each transition is persisted before proceeding. Re-running `josh-sync onboard <target>` reads the current step and resumes.
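The linear step order can be written down as a small lookup. This is only a sketch of the transition table above; `next_onboard_step` is a hypothetical helper name, and the real command also reads and writes `onboard.json` around each transition.

```bash
#!/usr/bin/env bash
# Sketch only: the onboard step order as a pure function.
# next_onboard_step is a hypothetical name, not a josh-sync function.

# Usage: next_onboard_step <current_step>
next_onboard_step() {
  case "$1" in
    start)             echo "importing" ;;
    importing)         echo "waiting-for-merge" ;;
    waiting-for-merge) echo "resetting" ;;
    resetting)         echo "complete" ;;
    complete)          echo "complete" ;;   # terminal state
    *)
      echo "unknown onboard step: $1" >&2
      return 1
      ;;
  esac
}

next_onboard_step "waiting-for-merge"
```

Rejecting unknown step names makes corrupted state fail loudly, which is the situation the `--restart` flag is meant to recover from.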
|
||||
|
||||
### State schema
|
||||
|
||||
```json
|
||||
{
|
||||
"step": "waiting-for-merge",
|
||||
"archived_api": "https://host/api/v1/repos/org/repo-archived",
|
||||
"archived_url": "git@host:org/repo-archived.git",
|
||||
"archived_auth": "ssh",
|
||||
"import_prs": { "main": 42 },
|
||||
"reset_branches": ["main"],
|
||||
"migrated_prs": [
|
||||
{ "old_number": 5, "new_number": 12, "title": "Fix login" }
|
||||
],
|
||||
"timestamp": "2026-02-10T14:30:00Z"
|
||||
}
|
||||
```
|
||||
|
||||
### Per-branch progress
|
||||
|
||||
Import and reset both iterate over branches. Progress is saved after each branch, so interruption mid-iteration resumes at the next unprocessed branch.
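One way to picture the resume behaviour is as a filter over the branch list. The sketch below assumes branch names without whitespace or glob characters and uses a hypothetical helper, `pending_branches`; how josh-sync actually tracks finished branches (for example via `import_prs` and `reset_branches` in the state file) may differ.

```bash
#!/usr/bin/env bash
# Sketch only: list the branches that still need processing.
# pending_branches is a hypothetical name, not a josh-sync function.

# Usage: pending_branches "<all branches>" "<already processed branches>"
# Both arguments are space-separated lists.
pending_branches() {
  local all="$1" done_list=" $2 " branch
  for branch in $all; do
    case "$done_list" in
      *" $branch "*) ;;                    # already processed: skip
      *) printf '%s\n' "$branch" ;;        # still pending
    esac
  done
}

pending_branches "main stage dev" "main"
```

After each branch completes, appending it to the processed list and persisting the state means an interrupted run picks up at the first pending branch.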
|
||||
|
||||
### PR migration
|
||||
|
||||
`josh-sync migrate-pr` is a separate command that reads onboard state (for the archived repo URL) and tracks migrated PRs. It uses `git apply --3way` for resilient patch application — the subrepo's content is identical after reset, so patches apply cleanly.
|
||||
|
||||
## Consequences
|
||||
|
||||
**Positive:**
|
||||
- Safe to interrupt at any point — no duplicate work on resume
|
||||
- Per-branch tracking prevents duplicate import PRs or redundant resets
|
||||
- Archived repo URL stored in state — `migrate-pr` can operate independently
|
||||
- `--restart` flag allows starting over if state is corrupted
|
||||
- Human-friendly — prints instructions at each step
|
||||
|
||||
**Negative:**
|
||||
- State management adds complexity (read/write onboard state, step validation)
|
||||
- Interactive steps (`read -r`) are not suitable for fully automated pipelines
|
||||
- Onboard state persists on the state branch even after completion (minor clutter)
|
||||
- The step machine is linear — cannot skip steps or run them out of order
|
||||
18
docs/adr/README.md
Normal file
@@ -0,0 +1,18 @@
|
||||
# Architecture Decision Records
|
||||
|
||||
This directory contains Architecture Decision Records (ADRs) for josh-sync. Each ADR documents a significant design decision, its context, the alternatives considered, and the rationale for the chosen approach.
|
||||
|
||||
## Index
|
||||
|
||||
| ADR | Title | Status |
|
||||
|-----|-------|--------|
|
||||
| [001](001-josh-proxy-for-sync.md) | Josh-proxy for bidirectional sync | Accepted |
|
||||
| [002](002-state-on-orphan-branch.md) | State storage on orphan git branch | Accepted |
|
||||
| [003](003-force-with-lease-forward.md) | Force-with-lease for forward sync | Accepted |
|
||||
| [004](004-always-pr-reverse.md) | Always-PR policy for reverse sync | Accepted |
|
||||
| [005](005-git-trailer-loop-prevention.md) | Git trailer for loop prevention | Accepted |
|
||||
| [006](006-inline-exclude-filter.md) | Inline exclude in josh-proxy URL | Accepted |
|
||||
| [007](007-reconciliation-merge.md) | Reconciliation merge for filter changes | Accepted |
|
||||
| [008](008-first-parent-ordering.md) | First-parent ordering in reconciliation merges | Accepted |
|
||||
| [009](009-tree-comparison-guard.md) | Tree comparison as sync skip guard | Accepted |
|
||||
| [010](010-onboard-checkpoint-resume.md) | Onboard workflow with checkpoint/resume | Accepted |
|
||||
@@ -32,6 +32,7 @@ Each target maps a monorepo subfolder to an external subrepo.
|
||||
| `subrepo_ssh_key_var` | string | No | `"SUBREPO_SSH_KEY"` | Name of the env var holding the SSH private key for this target. |
|
||||
| `branches` | object | Yes | — | Branch mapping: `mono_branch: subrepo_branch`. Each key-value pair syncs those branches bidirectionally. |
|
||||
| `forward_only` | string[] | No | `[]` | Branches that only sync mono → subrepo, never reverse. |
|
||||
| `exclude` | string[] | No | `[]` | File/directory patterns to exclude from sync via josh `:exclude` filter. Excluded files exist only in the monorepo, never in the subrepo. See [Excluding Files](guide.md#excluding-files-from-sync). |
|
||||
|
||||
## `bot` Section
|
||||
|
||||
|
||||
191
docs/guide.md
@@ -91,6 +91,9 @@ targets:
|
||||
branches:
|
||||
main: main # mono_branch: subrepo_branch
|
||||
forward_only: []
|
||||
exclude: # files excluded from subrepo (optional)
|
||||
- ".monorepo/" # monorepo-only config dir
|
||||
- "**/internal/" # internal dirs at any depth
|
||||
|
||||
- name: "auth"
|
||||
subfolder: "services/auth"
|
||||
@@ -165,6 +168,34 @@ SUBREPO_SSH_KEY="-----BEGIN OPENSSH PRIVATE KEY-----
|
||||
# AUTH_REPO_TOKEN=<auth-specific-token>
|
||||
```
|
||||
|
||||
### Updating josh-sync in devenv
|
||||
|
||||
To update to the latest version:
|
||||
|
||||
```bash
|
||||
devenv update josh-sync
|
||||
```
|
||||
|
||||
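Or with plain Nix flakes (on recent Nix releases, where `--update-input` is deprecated, `nix flake update josh-sync` is the equivalent):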
|
||||
|
||||
```bash
|
||||
nix flake lock --update-input josh-sync
|
||||
```
|
||||
|
||||
To pin to a specific version, use a tag ref in `devenv.yaml`:
|
||||
|
||||
```yaml
|
||||
josh-sync:
|
||||
url: git+https://your-gitea.example.com/org/josh-sync?ref=refs/tags/v1.2
|
||||
flake: true
|
||||
```
|
||||
|
||||
After updating, verify the version:
|
||||
|
||||
```bash
|
||||
josh-sync --version
|
||||
```
|
||||
|
||||
### Option B: Manual installation
|
||||
|
||||
Install the required tools, then either:
|
||||
@@ -189,11 +220,61 @@ For a new monorepo before import, preflight may warn that subfolders don't exist
|
||||
|
||||
## Step 5: Import Existing Subrepos
|
||||
|
||||
This is the critical onboarding step. For each existing subrepo, you run a three-step cycle: **import → merge → reset**.
|
||||
This is the critical onboarding step. There are two approaches:
|
||||
|
||||
- **`josh-sync onboard`** (recommended) — interactive, resumable, preserves open PRs
|
||||
- **Manual `import` → merge → `reset`** — lower-level, for automation or when there are no open PRs to preserve
|
||||
|
||||
### Option A: Onboard (recommended)
|
||||
|
||||
The `onboard` command walks you through the entire process interactively, with checkpoint/resume at every step.
|
||||
|
||||
**Before you start:**
|
||||
|
||||
1. **Rename** the existing subrepo on your Git server (e.g., `stores/storefront` → `stores/storefront-archived`)
|
||||
2. **Create a new empty repo** at the original path (e.g., a new `stores/storefront` with no commits)
|
||||
|
||||
The rename preserves the archived repo with all its history and open PRs. The new empty repo will receive josh-filtered history.
|
||||
|
||||
**Run onboard:**
|
||||
|
||||
```bash
|
||||
josh-sync onboard billing
|
||||
```
|
||||
|
||||
The command will:
|
||||
1. **Verify prerequisites** — checks the new empty repo is reachable, asks for the archived repo URL
|
||||
2. **Import** — copies subrepo content into monorepo and creates import PRs (one per branch)
|
||||
3. **Wait for merge** — shows PR numbers and waits for you to merge them
|
||||
4. **Reset** — pushes josh-filtered history to the new subrepo (per-branch, with resume)
|
||||
5. **Done** — prints instructions for developers and PR migration
|
||||
|
||||
If the process is interrupted at any point, re-run `josh-sync onboard billing` to resume from where it left off. Use `--restart` to start over.
|
||||
|
||||
**Migrate open PRs:**
|
||||
|
||||
After onboard completes, migrate PRs from the archived repo to the new one:
|
||||
|
||||
```bash
|
||||
# Interactive — lists open PRs and lets you pick
|
||||
josh-sync migrate-pr billing
|
||||
|
||||
# Migrate all open PRs at once
|
||||
josh-sync migrate-pr billing --all
|
||||
|
||||
# Migrate specific PRs by number
|
||||
josh-sync migrate-pr billing 5 8 12
|
||||
```
|
||||
|
||||
PR migration works by fetching the diff from the archived repo's PR, applying it to the new repo, and creating a new PR. File content is identical after reset, so patches apply cleanly.
|
||||
|
||||
### Option B: Manual import → merge → reset
|
||||
|
||||
Use this when the subrepo has no open PRs to preserve, or for scripted automation.
|
||||
|
||||
> Do this **one target at a time** to keep PRs reviewable.
|
||||
|
||||
### 5a. Import
|
||||
#### 5b-1. Import
|
||||
|
||||
```bash
|
||||
josh-sync import billing
|
||||
@@ -208,13 +289,13 @@ This:
|
||||
|
||||
Review the import PR — check for leaked credentials, environment-specific config, or files that shouldn't be in the monorepo.
|
||||
|
||||
### 5b. Merge the import PR
|
||||
#### 5b-2. Merge the import PR
|
||||
|
||||
Merge the PR using your Git platform's UI. This lands the subrepo content into the monorepo's main branch.
|
||||
|
||||
> At this point, the monorepo has the content but the histories are disconnected. Sync will **not** work until you complete the reset step.
|
||||
|
||||
### 5c. Reset
|
||||
#### 5b-3. Reset
|
||||
|
||||
```bash
|
||||
josh-sync reset billing
|
||||
@@ -228,9 +309,20 @@ This:
|
||||
|
||||
This establishes **shared commit ancestry** between josh's filtered view and the subrepo. Without this, josh-proxy can't compute diffs between the two.
|
||||
|
||||
> **Warning:** This is a destructive force-push that replaces the subrepo's history. Back up any important branches or tags in the subrepo beforehand.
|
||||
> **Warning:** This is a destructive force-push that replaces the subrepo's history. Back up any important branches or tags in the subrepo beforehand. Merge or close all open pull requests on the subrepo first — they will be invalidated.
|
||||
|
||||
### 5d. Repeat for each target
|
||||
After reset, **every developer with a local clone of the subrepo** must update their local copy to match the new history:
|
||||
|
||||
```bash
|
||||
cd /path/to/local-subrepo
|
||||
git fetch origin
|
||||
git checkout main && git reset --hard origin/main
|
||||
git checkout stage && git reset --hard origin/stage # repeat for each branch
|
||||
```
|
||||
|
||||
Or simply delete and re-clone the subrepo. Local-only branches (not pushed to the remote) will be lost either way.
|
||||
|
||||
#### 5b-4. Repeat for each target
|
||||
|
||||
```
|
||||
For each target:
|
||||
@@ -239,9 +331,9 @@ For each target:
|
||||
3. josh-sync reset <target>
|
||||
```
|
||||
|
||||
### 5e. Verify
|
||||
### Verify
|
||||
|
||||
After all targets are imported and reset:
|
||||
After all targets are imported and reset (whichever option you used):
|
||||
|
||||
```bash
|
||||
# Check all targets show state
|
||||
@@ -426,6 +518,65 @@ Bot commits include a git trailer like `Josh-Sync-Origin: forward/main/2024-02-1
|
||||
|
||||
Sync state is stored as JSON files on an orphan branch (`josh-sync-state`), one file per target/branch. This tracks the last-synced commit SHAs and timestamps to avoid re-syncing the same changes.
|
||||
|
||||
## Excluding Files from Sync
|
||||
|
||||
Some files in the monorepo subfolder may not belong in the subrepo (e.g., monorepo-specific CI configs, internal tooling). The `exclude` config field removes these at the josh-proxy layer — excluded files never appear in the subrepo.
|
||||
|
||||
### Configuration
|
||||
|
||||
Add an `exclude` list to any target:
|
||||
|
||||
```yaml
|
||||
targets:
|
||||
- name: "billing"
|
||||
subfolder: "services/billing"
|
||||
subrepo_url: "git@host:org/billing.git"
|
||||
exclude:
|
||||
- ".monorepo/" # directory at subfolder root
|
||||
- "**/internal/" # directory at any depth
|
||||
- "*.secret" # files by extension
|
||||
branches:
|
||||
main: main
|
||||
```
|
||||
|
||||
### How it works
|
||||
|
||||
When `exclude` is present, josh-sync appends an inline `:exclude` filter to the josh-proxy URL. For the example above, the josh filter becomes:
|
||||
|
||||
```
|
||||
:/services/billing:exclude[::.monorepo/,::**/internal/,::*.secret]
|
||||
```
|
||||
|
||||
Josh-proxy applies this filter at the transport layer — no extra files to generate or commit. This means:
|
||||
- **Forward sync**: the filtered clone already excludes the files
|
||||
- **Reverse sync**: pushes through josh also respect the exclusion
|
||||
- **Reset**: the subrepo history never contains excluded files
|
||||
- **Tree comparison**: `skip` detection works correctly (excluded files are not in the diff)
|
||||
|
||||
### Pattern syntax
|
||||
|
||||
Josh uses `::` patterns inside `:exclude[...]`:
|
||||
|
||||
| Pattern | Matches |
|
||||
|---------|---------|
|
||||
| `dir/` | Directory at subfolder root |
|
||||
| `file` | File at subfolder root |
|
||||
| `**/dir/` | Directory at any depth |
|
||||
| `**/file` | File at any depth |
|
||||
| `*.ext` | Glob pattern (single `*` only) |
|
||||
|
||||
### Setup
|
||||
|
||||
1. Add `exclude` to the target in `.josh-sync.yml`
|
||||
2. Run `josh-sync preflight` to verify the filter works
|
||||
3. Forward sync will now exclude the specified files
|
||||
|
||||
No extra files to generate or commit — the exclusion is embedded directly in the josh-proxy URL.
|
||||
|
||||
### Changing the exclude list
|
||||
|
||||
You can safely add or remove patterns from `exclude` at any time. When josh-sync detects that the filter has changed since the last sync, it automatically creates a reconciliation merge commit on the subrepo that connects the old and new histories — no manual reset or force-push required. Developers do not need to re-clone the subrepo.
|
||||
|
||||
## Adding a New Target
|
||||
|
||||
To add a new subrepo after initial setup:
|
||||
@@ -433,8 +584,12 @@ To add a new subrepo after initial setup:
|
||||
1. Add the target to `.josh-sync.yml`
|
||||
2. Update the forward workflow's `paths:` list to include the new subfolder
|
||||
3. Commit and push
|
||||
4. Run the import-merge-reset cycle for the new target:
|
||||
4. Import the target:
|
||||
```bash
|
||||
# Recommended: interactive onboard (preserves open PRs)
|
||||
josh-sync onboard new-target
|
||||
|
||||
# Or manual: import → merge PR → reset
|
||||
josh-sync import new-target
|
||||
# merge the PR
|
||||
josh-sync reset new-target
|
||||
@@ -471,6 +626,24 @@ The subfolder already contains the same content as the subrepo. This is fine —
|
||||
|
||||
Verify `bot.trailer` in config matches what's in commit messages. Check the loop guard in the CI workflow is active.
|
||||
|
||||
### "cannot lock ref" or "expected X but got Y"
|
||||
|
||||
**After reset (subrepo):** The subrepo's history was replaced by force-push. Local clones still have the old history:
|
||||
|
||||
```bash
|
||||
cd /path/to/subrepo
|
||||
git fetch origin
|
||||
git checkout main && git reset --hard origin/main
|
||||
```
|
||||
|
||||
Or simply delete and re-clone.
|
||||
|
||||
**After import/reset cycle (monorepo):** The import and reset steps create and update branches rapidly (`auto-sync/import-*`, `josh-sync-state`). If your local clone fetched partway through, tracking refs go stale:
|
||||
|
||||
```bash
|
||||
git remote prune origin && git pull
|
||||
```
|
||||
|
||||
### State issues
|
||||
|
||||
```bash
|
||||
|
||||
@@ -5,12 +5,12 @@
|
||||
# In devenv.yaml:
|
||||
# inputs:
|
||||
# josh-sync:
|
||||
# url: github:org/josh-sync/v1.0.0
|
||||
# url: git+https://your-gitea.example.com/org/josh-sync?ref=refs/tags/v1.2
|
||||
# flake: true
|
||||
#
|
||||
# Or in flake.nix:
|
||||
# inputs.josh-sync = {
|
||||
# url = "github:org/josh-sync/v1.0.0";
|
||||
# url = "git+https://your-gitea.example.com/org/josh-sync?ref=refs/tags/v1.2";
|
||||
# inputs.nixpkgs.follows = "nixpkgs";
|
||||
# };
|
||||
|
||||
@@ -26,6 +26,8 @@
|
||||
# josh-sync preflight Validate config and connectivity
|
||||
# josh-sync import <target> Initial import from subrepo
|
||||
# josh-sync reset <target> Reset subrepo to josh-filtered view
|
||||
# josh-sync onboard <target> Interactive import + reset workflow
|
||||
# josh-sync migrate-pr <target> Migrate PRs from archived repo
|
||||
# josh-sync status Show target config and sync state
|
||||
# josh-sync state show <t> [b] Show state JSON
|
||||
# josh-sync state reset <t> [b] Reset state
|
||||
|
||||
@@ -62,7 +62,7 @@ jobs:
|
||||
done | sort -u | paste -sd ',' -)
|
||||
echo "targets=${TARGETS}" >> "$GITHUB_OUTPUT"
|
||||
|
||||
- uses: https://your-gitea.example.com/org/josh-sync@v1
|
||||
- uses: https://your-gitea.example.com/org/josh-sync@v1.2
|
||||
with:
|
||||
direction: forward
|
||||
target: ${{ github.event.inputs.target || steps.detect.outputs.targets }}
|
||||
|
||||
@@ -10,17 +10,19 @@ josh:
|
||||
targets:
|
||||
- name: "billing"
|
||||
subfolder: "services/billing"
|
||||
josh_filter: ":/services/billing"
|
||||
# josh_filter auto-derived as ":/services/billing" if omitted
|
||||
subrepo_url: "https://gitea.example.com/ext/billing.git"
|
||||
subrepo_auth: "https"
|
||||
branches:
|
||||
main: main
|
||||
develop: develop
|
||||
forward_only: []
|
||||
exclude: # files excluded from subrepo (optional)
|
||||
- ".monorepo/" # directory at subfolder root
|
||||
- "**/internal/" # directory at any depth
|
||||
|
||||
- name: "auth"
|
||||
subfolder: "services/auth"
|
||||
josh_filter: ":/services/auth"
|
||||
subrepo_url: "git@gitea.example.com:ext/auth.git"
|
||||
subrepo_auth: "ssh"
|
||||
# Per-target credential override (reads from $AUTH_SSH_KEY instead of $SUBREPO_SSH_KEY)
|
||||
@@ -31,7 +33,6 @@ targets:
|
||||
|
||||
- name: "shared-lib"
|
||||
subfolder: "libs/shared"
|
||||
josh_filter: ":/libs/shared"
|
||||
subrepo_url: "https://gitea.example.com/ext/shared-lib.git"
|
||||
branches:
|
||||
main: main
|
||||
|
||||
@@ -40,7 +40,7 @@ jobs:
|
||||
curl -sL "https://github.com/mikefarah/yq/releases/download/v4.44.6/yq_linux_amd64" \
|
||||
-o /usr/local/bin/yq && chmod +x /usr/local/bin/yq
|
||||
|
||||
- uses: https://your-gitea.example.com/org/josh-sync@v1
|
||||
- uses: https://your-gitea.example.com/org/josh-sync@v1.2
|
||||
with:
|
||||
direction: reverse
|
||||
target: ${{ github.event.inputs.target || '' }}
|
||||
|
||||
@@ -28,6 +28,7 @@
|
||||
|
||||
installPhase = ''
|
||||
mkdir -p $out/{bin,lib}
|
||||
cp VERSION $out/
|
||||
cp lib/*.sh $out/lib/
|
||||
cp bin/josh-sync $out/bin/
|
||||
chmod +x $out/bin/josh-sync
|
||||
|
||||
48
lib/auth.sh
@@ -39,16 +39,15 @@ subrepo_ls_remote() {
|
||||
}
|
||||
|
||||
# ─── PR Creation ────────────────────────────────────────────────────
|
||||
# Shared helper for creating PRs on Gitea/GitHub API.
|
||||
# Shared helpers for creating PRs on Gitea/GitHub API.
|
||||
# Usage: create_pr <api_url> <token> <base> <head> <title> <body>
|
||||
# number=$(create_pr_number <api_url> <token> <base> <head> <title> <body>)
|
||||
#
|
||||
# create_pr — fire-and-forget (stdout suppressed, safe inside sync functions)
|
||||
# create_pr_number — returns the new PR number via stdout
|
||||
|
||||
create_pr() {
|
||||
local api_url="$1"
|
||||
local token="$2"
|
||||
local base="$3"
|
||||
local head="$4"
|
||||
local title="$5"
|
||||
local body="$6"
|
||||
create_pr_number() {
|
||||
local api_url="$1" token="$2" base="$3" head="$4" title="$5" body="$6"
|
||||
|
||||
curl -sf -X POST \
|
||||
-H "Authorization: token ${token}" \
|
||||
@@ -59,5 +58,36 @@ create_pr() {
|
||||
--arg title "$title" \
|
||||
--arg body "$body" \
|
||||
'{base:$base, head:$head, title:$title, body:$body}')" \
|
||||
"${api_url}/pulls" >/dev/null
|
||||
"${api_url}/pulls" | jq -r '.number'
|
||||
}
|
||||
|
||||
create_pr() {
|
||||
create_pr_number "$@" >/dev/null
|
||||
}
|
||||
|
||||
# ─── PR API Helpers ──────────────────────────────────────────────
|
||||
# Used by onboard and migrate-pr commands.
|
||||
|
||||
# List open PRs on a repo. Returns JSON array.
|
||||
# Usage: list_open_prs <api_url> <token>
|
||||
list_open_prs() {
|
||||
local api_url="$1" token="$2"
|
||||
curl -sf -H "Authorization: token ${token}" \
|
||||
"${api_url}/pulls?state=open&limit=50"
|
||||
}
|
||||
|
||||
# Get PR diff as plain text.
|
||||
# Usage: get_pr_diff <api_url> <token> <pr_number>
|
||||
get_pr_diff() {
|
||||
local api_url="$1" token="$2" pr_number="$3"
|
||||
curl -sf -H "Authorization: token ${token}" \
|
||||
"${api_url}/pulls/${pr_number}.diff"
|
||||
}
|
||||
|
||||
# Get single PR as JSON (for checking merge status, metadata, etc.).
|
||||
# Usage: get_pr <api_url> <token> <pr_number>
|
||||
get_pr() {
|
||||
local api_url="$1" token="$2" pr_number="$3"
|
||||
curl -sf -H "Authorization: token ${token}" \
|
||||
"${api_url}/pulls/${pr_number}"
|
||||
}
|
||||
|
||||
@@ -36,7 +36,10 @@ parse_config() {
|
||||
export JOSH_SYNC_TARGETS
|
||||
JOSH_SYNC_TARGETS=$(echo "$config_json" | jq '[.targets[] | . +
|
||||
# Auto-derive josh_filter from subfolder if not set
|
||||
(if (.josh_filter // "") == "" then
|
||||
# When exclude patterns are present, append inline :exclude[::p1,::p2,...] to the filter
|
||||
(if (.exclude // [] | length) > 0 then
|
||||
{josh_filter: (":/" + .subfolder + ":exclude[" + (.exclude | map("::" + .) | join(",")) + "]")}
|
||||
elif (.josh_filter // "") == "" then
|
||||
{josh_filter: (":/" + .subfolder)}
|
||||
else {} end) +
|
||||
# Derive gitea_host and subrepo_repo_path from subrepo_url
|
||||
|
||||
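The filter derivation in the hunk above can be illustrated in plain bash. This is only a sketch under assumptions: `build_josh_filter` is a hypothetical helper, not part of josh-sync, and it covers just the two auto-derived cases (subfolder only, and subfolder plus exclude patterns), not an explicitly configured `josh_filter`.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the jq expression in plain bash.
# Usage: build_josh_filter <subfolder> [pattern...]
build_josh_filter() {
  local subfolder="$1"
  shift
  local filter=":/${subfolder}"
  if [ "$#" -gt 0 ]; then
    local joined="" pattern
    for pattern in "$@"; do
      # Each pattern is prefixed with "::" and the entries are comma-separated.
      joined="${joined:+${joined},}::${pattern}"
    done
    filter="${filter}:exclude[${joined}]"
  fi
  echo "$filter"
}

# Example subfolder and patterns are made up for illustration.
build_josh_filter "services/storefront" ".env" "docs/"
build_josh_filter "services/storefront"
```

With no patterns the result is the plain `:/<subfolder>` filter, which matches the pre-1.2 behaviour described in the changelog.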
451
lib/onboard.sh
Normal file
@@ -0,0 +1,451 @@
|
||||
#!/usr/bin/env bash
|
||||
# lib/onboard.sh — Onboard orchestration and PR migration
|
||||
#
|
||||
# Provides:
|
||||
# onboard_flow() — Interactive: import → wait for merge → reset to new repo
|
||||
# migrate_one_pr() — Migrate a single PR from archived repo to new repo
|
||||
#
|
||||
# Onboard state is stored on the josh-sync-state branch at <target>/onboard.json.
|
||||
# Steps: start → importing → waiting-for-merge → resetting → complete
|
||||
#
|
||||
# Requires: lib/core.sh, lib/config.sh, lib/auth.sh, lib/state.sh, lib/sync.sh sourced
|
||||
# Expects: JOSH_SYNC_TARGET_NAME, BOT_NAME, BOT_EMAIL, SUBREPO_API, SUBREPO_TOKEN, etc.
|
||||
|
||||
# ─── Onboard State Helpers ────────────────────────────────────────
|
||||
# Follow the same pattern as read_state()/write_state() in lib/state.sh.
|
||||
|
||||
read_onboard_state() {
|
||||
local target_name="${1:-$JOSH_SYNC_TARGET_NAME}"
|
||||
git fetch origin "$STATE_BRANCH" 2>/dev/null || true
|
||||
git show "origin/${STATE_BRANCH}:${target_name}/onboard.json" 2>/dev/null || echo '{}'
|
||||
}
|
||||
|
||||
write_onboard_state() {
|
||||
local target_name="${1:-$JOSH_SYNC_TARGET_NAME}"
|
||||
local state_json="$2"
|
||||
local key="${target_name}/onboard"
|
||||
local tmp_dir
|
||||
tmp_dir=$(mktemp -d)
|
||||
|
||||
if git rev-parse "origin/${STATE_BRANCH}" >/dev/null 2>&1; then
|
||||
git worktree add "$tmp_dir" "origin/${STATE_BRANCH}" 2>/dev/null
|
||||
else
|
||||
git worktree add --detach "$tmp_dir" 2>/dev/null
|
||||
(cd "$tmp_dir" && git checkout --orphan "$STATE_BRANCH" && { git rm -rf . 2>/dev/null || true; })
|
||||
fi
|
||||
|
||||
mkdir -p "$(dirname "${tmp_dir}/${key}.json")"
|
||||
echo "$state_json" | jq '.' > "${tmp_dir}/${key}.json"
|
||||
|
||||
(
|
||||
cd "$tmp_dir" || exit
|
||||
git add -A
|
||||
if ! git diff --cached --quiet 2>/dev/null; then
|
||||
git -c user.name="$BOT_NAME" -c user.email="$BOT_EMAIL" \
|
||||
commit -m "onboard: update ${target_name}"
|
||||
git push origin "HEAD:${STATE_BRANCH}" || log "WARN" "Failed to push onboard state"
|
||||
fi
|
||||
)
|
||||
|
||||
git worktree remove "$tmp_dir" 2>/dev/null || rm -rf "$tmp_dir"
|
||||
}
|
||||
|
||||
# ─── Derive Archived API URL ─────────────────────────────────────
|
||||
# Given a URL like "git@host:org/repo-archived.git" or
|
||||
# "https://host/org/repo-archived.git", derive the Gitea API URL.
|
||||
|
||||
_archived_api_from_url() {
|
||||
local url="$1"
|
||||
# Strip .git suffix first — avoids non-greedy regex issues in POSIX ERE
|
||||
url="${url%.git}"
|
||||
local host repo_path
|
||||
|
||||
if echo "$url" | grep -qE '^(ssh://|git@)'; then
|
||||
# SSH URL
|
||||
if echo "$url" | grep -q '^ssh://'; then
|
||||
host=$(echo "$url" | sed -E 's|ssh://[^@]*@([^/]+)/.*|\1|')
|
||||
repo_path=$(echo "$url" | sed -E 's|ssh://[^@]*@[^/]+/(.+)$|\1|')
|
||||
else
|
||||
host=$(echo "$url" | sed -E 's|git@([^:/]+)[:/].*|\1|')
|
||||
repo_path=$(echo "$url" | sed -E 's|git@[^:/]+[:/](.+)$|\1|')
|
||||
fi
|
||||
else
|
||||
# HTTPS URL
|
||||
host=$(echo "$url" | sed -E 's|https?://([^/]+)/.*|\1|')
|
||||
repo_path=$(echo "$url" | sed -E 's|https?://[^/]+/(.+)$|\1|')
|
||||
fi
|
||||
|
||||
echo "https://${host}/api/v1/repos/${repo_path}"
|
||||
}
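To make the expected mapping concrete, the sketch below re-implements the same derivation with bash parameter expansion instead of `sed`. `derive_api_url` and the `git.example.com` host are hypothetical and used only for illustration. Unlike the function above, this version accepts only `:` as the separator in the `git@host:path` form.

```shell
#!/usr/bin/env bash
# Hypothetical re-implementation for illustration; not the function from the diff.
derive_api_url() {
  local url="${1%.git}"        # drop a trailing .git
  local rest host repo_path
  case "$url" in
    ssh://*)
      rest="${url#ssh://}"     # user@host/org/repo
      rest="${rest#*@}"        # host/org/repo
      host="${rest%%/*}"
      repo_path="${rest#*/}"
      ;;
    git@*)
      rest="${url#git@}"       # host:org/repo
      host="${rest%%:*}"
      repo_path="${rest#*:}"
      ;;
    *)
      rest="${url#*://}"       # host/org/repo
      host="${rest%%/*}"
      repo_path="${rest#*/}"
      ;;
  esac
  echo "https://${host}/api/v1/repos/${repo_path}"
}

derive_api_url "git@git.example.com:org/repo-archived.git"
derive_api_url "https://git.example.com/org/repo-archived.git"
derive_api_url "ssh://git@git.example.com/org/repo-archived.git"
```

All three inputs resolve to the same API URL, which is the property the onboard flow relies on when it stores `archived_api` in state.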
|
||||
|
||||
# ─── Onboard Flow ────────────────────────────────────────────────
|
||||
# Interactive orchestrator with checkpoint/resume.
|
||||
# Usage: onboard_flow <target_json> <restart>
|
||||
|
||||
onboard_flow() {
|
||||
local target_json="$1"
|
||||
local restart="${2:-false}"
|
||||
local target_name="$JOSH_SYNC_TARGET_NAME"
|
||||
|
||||
# Load existing onboard state (or empty)
|
||||
local onboard_state
|
||||
onboard_state=$(read_onboard_state "$target_name")
|
||||
local current_step
|
||||
current_step=$(echo "$onboard_state" | jq -r '.step // "start"')
|
||||
|
||||
if [ "$restart" = true ]; then
|
||||
log "INFO" "Restarting onboard from scratch"
|
||||
current_step="start"
|
||||
onboard_state='{}'
|
||||
fi
|
||||
|
||||
log "INFO" "Onboard step: ${current_step}"
|
||||
|
||||
# ── Step 1: Prerequisites + archived repo info ──
|
||||
if [ "$current_step" = "start" ]; then
|
||||
echo "" >&2
|
||||
echo "=== Onboarding ${target_name} ===" >&2
|
||||
echo "" >&2
|
||||
echo "Before proceeding, you should have:" >&2
|
||||
echo " 1. Renamed the existing subrepo (e.g., storefront → storefront-archived)" >&2
|
||||
echo " 2. Created a new EMPTY repo at the original URL" >&2
|
||||
echo "" >&2
|
||||
|
||||
# Verify the new (empty) subrepo is reachable (no HEAD ref — works on empty repos)
|
||||
if git ls-remote "$(subrepo_auth_url)" >/dev/null 2>&1; then
|
||||
# shellcheck disable=SC2001 # sed is clearer for URL pattern replacement
|
||||
log "INFO" "New subrepo is reachable at $(echo "$SUBREPO_URL" | sed 's|://[^@]*@|://***@|')"
|
||||
else
|
||||
log "WARN" "New subrepo is not reachable — make sure you created the new empty repo"
|
||||
fi
|
||||
|
||||
echo "Enter the archived repo URL (e.g., git@host:org/repo-archived.git):" >&2
|
||||
local archived_url
|
||||
read -r archived_url
|
||||
[ -n "$archived_url" ] || die "Archived URL is required"
|
||||
|
||||
# Determine auth type for archived repo (same as current subrepo)
|
||||
local archived_auth="${SUBREPO_AUTH:-https}"
|
||||
|
||||
# Derive API URL
|
||||
local archived_api
|
||||
archived_api=$(_archived_api_from_url "$archived_url")
|
||||
|
||||
# Verify archived repo is reachable via API
|
||||
if curl -sf -H "Authorization: token ${SUBREPO_TOKEN}" \
|
||||
"${archived_api}" >/dev/null 2>&1; then
|
||||
log "INFO" "Archived repo reachable: ${archived_api}"
|
||||
else
|
||||
log "WARN" "Cannot reach archived repo API — check URL and token"
|
||||
echo "Continue anyway? (y/N):" >&2
|
||||
local confirm
|
||||
read -r confirm
|
||||
[ "$confirm" = "y" ] || [ "$confirm" = "Y" ] || die "Aborted"
|
||||
fi
|
||||
|
||||
# Save state
|
||||
onboard_state=$(jq -n \
|
||||
--arg step "importing" \
|
||||
--arg archived_api "$archived_api" \
|
||||
--arg archived_url "$archived_url" \
|
||||
--arg archived_auth "$archived_auth" \
|
||||
--arg ts "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
|
||||
'{step:$step, archived_api:$archived_api, archived_url:$archived_url,
|
||||
archived_auth:$archived_auth, import_prs:{}, reset_branches:[],
|
||||
migrated_prs:[], timestamp:$ts}')
|
||||
write_onboard_state "$target_name" "$onboard_state"
|
||||
current_step="importing"
|
||||
fi
|
||||
|
||||
# ── Step 2: Import (reuses initial_import()) ──
|
||||
if [ "$current_step" = "importing" ]; then
|
||||
echo "" >&2
|
||||
log "INFO" "Step 2: Importing subrepo content into monorepo..."
|
||||
|
||||
local branches
|
||||
branches=$(echo "$target_json" | jq -r '.branches | keys[]')
|
||||
|
||||
# Load existing import_prs from state (resume support)
|
||||
local import_prs
|
||||
import_prs=$(echo "$onboard_state" | jq -r '.import_prs // {}')
|
||||
|
||||
# Build the archived repo clone URL for initial_import().
|
||||
# The content lives in the archived repo — the new repo at SUBREPO_URL is empty.
|
||||
local archived_url archived_clone_url
|
||||
archived_url=$(echo "$onboard_state" | jq -r '.archived_url')
|
||||
if [ "${SUBREPO_AUTH:-https}" = "ssh" ]; then
|
||||
archived_clone_url="$archived_url"
|
||||
else
|
||||
# shellcheck disable=SC2001
|
||||
archived_clone_url=$(echo "$archived_url" | sed "s|https://|https://${BOT_USER}:${SUBREPO_TOKEN}@|")
|
||||
fi
|
||||
|
||||
for branch in $branches; do
|
||||
local mapped
|
||||
mapped=$(echo "$target_json" | jq -r --arg b "$branch" '.branches[$b] // empty')
|
||||
[ -z "$mapped" ] && continue
|
||||
|
||||
# Skip branches that already have an import PR recorded
|
||||
if echo "$import_prs" | jq -e --arg b "$branch" 'has($b)' >/dev/null 2>&1; then
|
||||
log "INFO" "Import PR already recorded for ${branch} — skipping"
|
||||
continue
|
||||
fi
|
||||
|
||||
export SYNC_BRANCH_MONO="$branch"
|
||||
export SYNC_BRANCH_SUBREPO="$mapped"
|
||||
|
||||
log "INFO" "Importing branch: ${branch} (subrepo: ${mapped})"
|
||||
local result
|
||||
result=$(initial_import "$archived_clone_url")
|
||||
log "INFO" "Import result for ${branch}: ${result}"
|
||||
|
||||
if [ "$result" = "pr-created" ]; then
|
||||
# Find the import PR number via API
|
||||
local prs pr_number
|
||||
prs=$(list_open_prs "$MONOREPO_API" "$GITEA_TOKEN")
|
||||
pr_number=$(echo "$prs" | jq -r --arg t "$target_name" --arg b "$branch" \
|
||||
'[.[] | select(.title | test("\\[Import\\] " + $t + ":")) | select(.base.ref == $b)] | .[0].number // empty')
|
||||
|
||||
if [ -n "$pr_number" ]; then
|
||||
import_prs=$(echo "$import_prs" | jq --arg b "$branch" --arg n "$pr_number" '. + {($b): ($n | tonumber)}')
|
||||
log "INFO" "Import PR for ${branch}: #${pr_number}"
|
||||
else
|
||||
log "WARN" "Could not find import PR number for ${branch} — check monorepo PRs"
|
||||
fi
|
||||
fi
|
||||
|
||||
# Save progress after each branch (resume support)
|
||||
onboard_state=$(echo "$onboard_state" | jq --argjson prs "$import_prs" '.import_prs = $prs')
|
||||
write_onboard_state "$target_name" "$onboard_state"
|
||||
done
|
||||
|
||||
# Update state
|
||||
onboard_state=$(echo "$onboard_state" | jq \
|
||||
--arg step "waiting-for-merge" \
|
||||
--argjson prs "$import_prs" \
|
||||
'.step = $step | .import_prs = $prs')
|
||||
write_onboard_state "$target_name" "$onboard_state"
|
||||
current_step="waiting-for-merge"
|
||||
fi
|
||||
|
||||
# ── Step 3: Wait for merge ──
|
||||
if [ "$current_step" = "waiting-for-merge" ]; then
|
||||
echo "" >&2
|
||||
log "INFO" "Step 3: Waiting for import PR(s) to be merged..."
|
||||
|
||||
local import_prs
|
||||
import_prs=$(echo "$onboard_state" | jq -r '.import_prs')
|
||||
local pr_count
|
||||
pr_count=$(echo "$import_prs" | jq 'length')
|
||||
|
||||
if [ "$pr_count" -eq 0 ]; then
|
||||
log "WARN" "No import PRs recorded — skipping merge check"
|
||||
else
|
||||
echo "" >&2
|
||||
echo "Import PRs to merge:" >&2
|
||||
echo "$import_prs" | jq -r 'to_entries[] | " \(.key): PR #\(.value)"' >&2
|
||||
echo "" >&2
|
||||
echo "Merge the import PR(s) on the monorepo, then press Enter..." >&2
|
||||
read -r
|
||||
|
||||
# Verify each PR is merged
|
||||
local all_merged=true
|
||||
for branch in $(echo "$import_prs" | jq -r 'keys[]'); do
|
||||
local pr_number
|
||||
pr_number=$(echo "$import_prs" | jq -r --arg b "$branch" '.[$b]')
|
||||
local pr_json merged
|
||||
pr_json=$(get_pr "$MONOREPO_API" "$GITEA_TOKEN" "$pr_number")
|
||||
merged=$(echo "$pr_json" | jq -r '.merged // false')
|
||||
|
||||
if [ "$merged" = "true" ]; then
|
||||
log "INFO" "PR #${pr_number} (${branch}): merged"
|
||||
else
|
||||
log "ERROR" "PR #${pr_number} (${branch}): NOT merged — merge it first"
|
||||
all_merged=false
|
||||
fi
|
||||
done
|
||||
|
||||
if [ "$all_merged" = false ]; then
|
||||
die "Not all import PRs are merged. Re-run 'josh-sync onboard ${target_name}' after merging."
|
||||
fi
|
||||
fi
|
||||
|
||||
# Update state
|
||||
onboard_state=$(echo "$onboard_state" | jq '.step = "resetting"')
|
||||
write_onboard_state "$target_name" "$onboard_state"
|
||||
current_step="resetting"
|
||||
fi
|
||||
|
||||
# ── Step 4: Reset (pushes josh-filtered history to new repo) ──
|
||||
if [ "$current_step" = "resetting" ]; then
|
||||
echo "" >&2
|
||||
log "INFO" "Step 4: Pushing josh-filtered history to new subrepo..."
|
||||
|
||||
local branches
|
||||
branches=$(echo "$target_json" | jq -r '.branches | keys[]')
|
||||
local already_reset
|
||||
already_reset=$(echo "$onboard_state" | jq -r '.reset_branches // []')
|
||||
|
||||
for branch in $branches; do
|
||||
# Skip branches already reset (resume support)
|
||||
if echo "$already_reset" | jq -e --arg b "$branch" 'index($b) != null' >/dev/null 2>&1; then
|
||||
log "INFO" "Branch ${branch} already reset — skipping"
|
||||
continue
|
||||
fi
|
||||
|
||||
local mapped
|
||||
mapped=$(echo "$target_json" | jq -r --arg b "$branch" '.branches[$b] // empty')
|
||||
[ -z "$mapped" ] && continue
|
||||
|
||||
export SYNC_BRANCH_MONO="$branch"
|
||||
export SYNC_BRANCH_SUBREPO="$mapped"
|
||||
|
||||
local result
|
||||
result=$(subrepo_reset)
|
||||
log "INFO" "Reset result for ${branch}: ${result}"
|
||||
|
||||
# Track progress
|
||||
onboard_state=$(echo "$onboard_state" | jq --arg b "$branch" \
|
||||
'.reset_branches += [$b]')
|
||||
write_onboard_state "$target_name" "$onboard_state"
|
||||
done
|
||||
|
||||
# Update state
|
||||
onboard_state=$(echo "$onboard_state" | jq \
|
||||
--arg step "complete" \
|
||||
--arg ts "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
|
||||
'.step = $step | .timestamp = $ts')
|
||||
write_onboard_state "$target_name" "$onboard_state"
|
||||
current_step="complete"
|
||||
fi
|
||||
|
||||
# ── Step 5: Done ──
|
||||
if [ "$current_step" = "complete" ]; then
|
||||
echo "" >&2
|
||||
echo "=== Onboarding complete! ===" >&2
|
||||
echo "" >&2
|
||||
echo "The new subrepo now has josh-filtered history." >&2
|
||||
echo "Developers should re-clone or reset their local copies:" >&2
|
||||
echo " git fetch origin && git reset --hard origin/main" >&2
|
||||
echo "" >&2
|
||||
echo "To migrate open PRs from the archived repo:" >&2
|
||||
echo " josh-sync migrate-pr ${target_name} # interactive picker" >&2
|
||||
echo " josh-sync migrate-pr ${target_name} --all # migrate all" >&2
|
||||
echo " josh-sync migrate-pr ${target_name} 5 8 12 # specific PRs" >&2
|
||||
fi
|
||||
}
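The checkpoint/resume behaviour of `onboard_flow` can be sketched in isolation. The example below is an assumption-laden simplification: a local temp file replaces the `josh-sync-state` branch, and `read_step`, `next_step` and `run_flow` are hypothetical names. It only shows that a step written after each transition lets a second run continue where the first one stopped.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of checkpoint/resume; a local file replaces the state branch.
state_file="$(mktemp)"

read_step() {
  # An empty or missing file means the flow has not started yet.
  local step
  step=$(cat "$state_file" 2>/dev/null || true)
  echo "${step:-start}"
}

next_step() {
  case "$1" in
    start)             echo "importing" ;;
    importing)         echo "waiting-for-merge" ;;
    waiting-for-merge) echo "resetting" ;;
    *)                 echo "complete" ;;
  esac
}

run_flow() {
  local step
  step=$(read_step)
  while [ "$step" != "complete" ]; do
    step=$(next_step "$step")
    echo "$step" > "$state_file"   # checkpoint after every transition
    # Simulate an interruption once the flow starts waiting for a merge.
    if [ "$step" = "waiting-for-merge" ] && [ "${1:-}" = "interrupt" ]; then
      return 1
    fi
  done
}

run_flow interrupt || echo "interrupted at: $(read_step)"
run_flow && echo "resumed, now at: $(read_step)"
rm -f "$state_file"
```

The real flow stores more than the step name (import PR numbers, reset branches), but the resume logic is the same shape: read the last recorded step, then fall through the remaining `if` blocks.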
|
||||
|
||||
# ─── Migrate One PR ──────────────────────────────────────────────
|
||||
# Fetches the PR's branch from the archived repo, computes a local diff,
|
||||
# and applies it to the new subrepo with --3way for resilience.
|
||||
# Usage: migrate_one_pr <pr_number>
|
||||
#
|
||||
# Expects: JOSH_SYNC_TARGET_NAME, SUBREPO_API, SUBREPO_TOKEN, BOT_NAME, BOT_EMAIL loaded
|
||||
|
||||
migrate_one_pr() {
|
||||
local pr_number="$1"
|
||||
local target_name="$JOSH_SYNC_TARGET_NAME"
|
||||
|
||||
# Read archived repo info from onboard state
|
||||
local onboard_state archived_api
|
||||
onboard_state=$(read_onboard_state "$target_name")
|
||||
archived_api=$(echo "$onboard_state" | jq -r '.archived_api')
|
||||
if [ -z "$archived_api" ] || [ "$archived_api" = "null" ]; then
|
||||
die "No archived repo info found. Run 'josh-sync onboard ${target_name}' first."
|
||||
fi
|
||||
|
||||
# Check if this PR was already migrated
|
||||
local already_migrated
|
||||
already_migrated=$(echo "$onboard_state" | jq -r \
|
||||
--argjson num "$pr_number" '.migrated_prs // [] | map(select(.old_number == $num)) | length')
|
||||
if [ "$already_migrated" -gt 0 ]; then
|
||||
log "INFO" "PR #${pr_number} already migrated — skipping"
|
||||
return 0
|
||||
fi
|
||||
|
||||
# Same credentials — the repo was just renamed
|
||||
local archived_token="$SUBREPO_TOKEN"
|
||||
|
||||
# 1. Get PR metadata from archived repo
|
||||
local pr_json title base head body
|
||||
pr_json=$(get_pr "$archived_api" "$archived_token" "$pr_number") \
|
||||
|| die "Failed to fetch PR #${pr_number} from archived repo"
|
||||
title=$(echo "$pr_json" | jq -r '.title')
|
||||
base=$(echo "$pr_json" | jq -r '.base.ref')
|
||||
head=$(echo "$pr_json" | jq -r '.head.ref')
|
||||
body=$(echo "$pr_json" | jq -r '.body // ""')
|
||||
|
||||
log "INFO" "Migrating PR #${pr_number}: \"${title}\" (${base} <- ${head})"
|
||||
|
||||
# 2. Clone new subrepo, add archived repo as second remote
|
||||
# Save cwd so we can restore it (function runs in caller's shell, not subshell)
|
||||
local original_dir
|
||||
original_dir=$(pwd)
|
||||
|
||||
local work_dir
|
||||
work_dir=$(mktemp -d)
|
||||
# shellcheck disable=SC2064 # Intentional early expansion
|
||||
trap "cd '$original_dir' 2>/dev/null; rm -rf '$work_dir'" RETURN
|
||||
|
||||
git clone "$(subrepo_auth_url)" --branch "$base" --single-branch \
|
||||
"${work_dir}/subrepo" 2>&1 || die "Failed to clone new subrepo (branch: ${base})"
|
||||
|
||||
cd "${work_dir}/subrepo" || exit
|
||||
git config user.name "$BOT_NAME"
|
||||
git config user.email "$BOT_EMAIL"
|
||||
|
||||
# Build authenticated URL for the archived repo
|
||||
local archived_url archived_clone_url
|
||||
archived_url=$(echo "$onboard_state" | jq -r '.archived_url')
|
||||
if [ "${SUBREPO_AUTH:-https}" = "ssh" ]; then
|
||||
archived_clone_url="$archived_url"
|
||||
else
|
||||
# shellcheck disable=SC2001
|
||||
archived_clone_url=$(echo "$archived_url" | sed "s|https://|https://${BOT_USER}:${SUBREPO_TOKEN}@|")
|
||||
fi
|
||||
|
||||
# Fetch the PR's head and base branches from the archived repo
|
||||
git remote add archived "$archived_clone_url"
|
||||
git fetch archived "$head" "$base" 2>&1 \
|
||||
|| die "Failed to fetch branches from archived repo"
|
||||
|
||||
# 3. Compute diff locally and apply with --3way
|
||||
git checkout -B "$head" >&2
|
||||
|
||||
local diff
|
||||
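# Three-dot diff: only the changes made on head since it forked from base,
# which is what the PR shows even if base has advanced in the meantime.
diff=$(git diff "archived/${base}...archived/${head}")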
|
||||
if [ -z "$diff" ]; then
|
||||
log "WARN" "Empty diff for PR #${pr_number} — skipping"
|
||||
return 1
|
||||
fi
|
||||
|
||||
if echo "$diff" | git apply --3way 2>&1; then
|
||||
git add -A
|
||||
git commit -m "${title}
|
||||
|
||||
Migrated from archived repo PR #${pr_number}" >&2
|
||||
|
||||
git push "$(subrepo_auth_url)" "$head" >&2 \
|
||||
|| die "Failed to push branch ${head}"
|
||||
|
||||
# 4. Create PR on new repo
|
||||
local new_number
|
||||
new_number=$(create_pr_number "$SUBREPO_API" "$SUBREPO_TOKEN" \
|
||||
"$base" "$head" "$title" "$body")
|
||||
log "INFO" "Migrated PR #${pr_number} -> #${new_number}: \"${title}\""
|
||||
|
||||
# 5. Record in onboard state
|
||||
cd "$original_dir" || true
|
||||
onboard_state=$(read_onboard_state "$target_name")
|
||||
onboard_state=$(echo "$onboard_state" | jq \
|
||||
--argjson old "$pr_number" \
|
||||
--argjson new_num "${new_number}" \
|
||||
--arg title "$title" \
|
||||
'.migrated_prs += [{"old_number":$old, "new_number":$new_num, "title":$title}]')
|
||||
write_onboard_state "$target_name" "$onboard_state"
|
||||
else
|
||||
log "ERROR" "Could not apply changes for PR #${pr_number} even with 3-way merge"
|
||||
log "ERROR" "Manual migration needed: branch '${head}' from archived repo"
|
||||
return 1
|
||||
fi
|
||||
}
|
||||
154
lib/sync.sh
@@ -11,7 +11,7 @@
|
||||
|
||||
# ─── Forward Sync: Monorepo → Subrepo ──────────────────────────────
|
||||
#
|
||||
# Returns: fresh | skip | clean | lease-rejected | conflict
|
||||
# Returns: fresh | skip | clean | lease-rejected | conflict | unrelated
|
||||
|
||||
forward_sync() {
|
||||
local mono_branch="$SYNC_BRANCH_MONO"
|
||||
@@ -97,7 +97,14 @@ ${BOT_TRAILER}: forward/${mono_branch}/$(date -u +%Y-%m-%dT%H:%M:%SZ)" >&2
|
||||
fi
|
||||
|
||||
else
|
||||
# Conflict!
|
||||
# Check: unrelated histories (filter change) vs normal merge conflict
|
||||
if ! git merge-base "subrepo/${subrepo_branch}" "$mono_head" >/dev/null 2>&1; then
|
||||
log "INFO" "No common ancestor — histories are unrelated (filter change?)"
|
||||
echo "unrelated"
|
||||
return
|
||||
fi
|
||||
|
||||
# Normal merge conflict
|
||||
local conflicted
|
||||
conflicted=$(git diff --name-only --diff-filter=U 2>/dev/null || echo "(unknown)")
|
||||
git merge --abort
|
||||
@@ -115,7 +122,14 @@ ${BOT_TRAILER}: forward/${mono_branch}/$(date -u +%Y-%m-%dT%H:%M:%SZ)" >&2
|
||||
local pr_body conflicted_list
|
||||
# shellcheck disable=SC2001
|
||||
conflicted_list=$(echo "$conflicted" | sed 's/^/- /')
|
||||
pr_body="## Sync Conflict\n\nMonorepo \`${mono_branch}\` has changes that conflict with \`${subrepo_branch}\`.\n\n**Conflicted files:**\n${conflicted_list}\n\nPlease resolve and merge this PR to complete the sync."
|
||||
pr_body="## Sync Conflict
|
||||
|
||||
Monorepo \`${mono_branch}\` has changes that conflict with \`${subrepo_branch}\`.
|
||||
|
||||
**Conflicted files:**
|
||||
${conflicted_list}
|
||||
|
||||
Please resolve and merge this PR to complete the sync."
|
||||
|
||||
create_pr "${SUBREPO_API}" "${SUBREPO_TOKEN}" \
|
||||
"$subrepo_branch" "$conflict_branch" \
|
||||
@@ -128,6 +142,87 @@ ${BOT_TRAILER}: forward/${mono_branch}/$(date -u +%Y-%m-%dT%H:%M:%SZ)" >&2
|
||||
fi
|
||||
}
|
||||
|
||||
# ─── Filter Change Reconciliation ─────────────────────────────────
|
||||
# When the josh filter changes (e.g., exclude patterns added/removed),
|
||||
# josh-proxy recomputes filtered history with new SHAs. This creates a
|
||||
# merge commit on the subrepo that connects old and new histories,
|
||||
# re-establishing shared ancestry without a destructive force-push.
|
||||
# Returns: reconciled | skip | lease-rejected
|
||||
|
||||
reconcile_filter_change() {
|
||||
local mono_branch="$SYNC_BRANCH_MONO"
|
||||
local subrepo_branch="$SYNC_BRANCH_SUBREPO"
|
||||
local work_dir
|
||||
work_dir=$(mktemp -d)
|
||||
# shellcheck disable=SC2064 # Intentional early expansion — work_dir is local
|
||||
trap "rm -rf '$work_dir'" EXIT
|
||||
|
||||
log "INFO" "=== Filter change reconciliation: ${mono_branch} ==="
|
||||
|
||||
# 1. Clone subrepo
|
||||
git clone "$(subrepo_auth_url)" \
|
||||
--branch "$subrepo_branch" --single-branch \
|
||||
"${work_dir}/subrepo" || die "Failed to clone subrepo"
|
||||
|
||||
cd "${work_dir}/subrepo" || exit
|
||||
git config user.name "$BOT_NAME"
|
||||
git config user.email "$BOT_EMAIL"
|
||||
|
||||
local subrepo_head
|
||||
subrepo_head=$(git rev-parse HEAD)
|
||||
log "INFO" "Subrepo HEAD: ${subrepo_head:0:12}"
|
||||
|
||||
# 2. Fetch josh-proxy filtered view (new filter)
|
||||
git remote add josh-filtered "$(josh_auth_url)"
|
||||
git fetch josh-filtered "$mono_branch" || die "Failed to fetch from josh-proxy"
|
||||
|
||||
local josh_head josh_tree
|
||||
josh_head=$(git rev-parse "josh-filtered/${mono_branch}")
|
||||
# shellcheck disable=SC1083 # {tree} is git syntax, not shell brace expansion
|
||||
josh_tree=$(git rev-parse "josh-filtered/${mono_branch}^{tree}")
|
||||
log "INFO" "Josh-proxy HEAD (new filter): ${josh_head:0:12}"
|
||||
|
||||
# 3. Check if trees are already identical (filter change had no effect)
|
||||
local subrepo_tree
|
||||
# shellcheck disable=SC1083
|
||||
subrepo_tree=$(git rev-parse "HEAD^{tree}")
|
||||
if [ "$josh_tree" = "$subrepo_tree" ]; then
|
||||
log "INFO" "Trees identical after filter change — no reconciliation needed"
|
||||
echo "skip"
|
||||
return
|
||||
fi
|
||||
|
||||
# 4. Create merge commit: josh-proxy HEAD (first parent) + subrepo HEAD, with josh-proxy's tree
|
||||
# Josh follows first-parent traversal — josh-filtered MUST be first so josh can map
|
||||
# the history back to the monorepo. Old subrepo history hangs off parent 2.
|
||||
local merge_commit
|
||||
merge_commit=$(git commit-tree "$josh_tree" \
|
||||
-p "$josh_head" \
|
||||
-p "$subrepo_head" \
|
||||
-m "Sync: filter configuration updated
|
||||
|
||||
${BOT_TRAILER}: filter-change/${mono_branch}/$(date -u +%Y-%m-%dT%H:%M:%SZ)")
|
||||
|
||||
git reset --hard "$merge_commit" >&2
|
||||
log "INFO" "Created reconciliation merge: ${merge_commit:0:12}"
|
||||
|
||||
# 5. Record lease and push
|
||||
local subrepo_sha
|
||||
subrepo_sha=$(subrepo_ls_remote "$subrepo_branch")
|
||||
|
||||
if git push \
|
||||
--force-with-lease="refs/heads/${subrepo_branch}:${subrepo_sha}" \
|
||||
"$(subrepo_auth_url)" \
|
||||
"HEAD:refs/heads/${subrepo_branch}"; then
|
||||
|
||||
log "INFO" "Filter change reconciled — shared ancestry re-established"
|
||||
echo "reconciled"
|
||||
else
|
||||
log "WARN" "Force-with-lease rejected — subrepo changed during reconciliation"
|
||||
echo "lease-rejected"
|
||||
fi
|
||||
}
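The two git mechanics used here, detecting unrelated histories with `git merge-base` and joining them with `git commit-tree`, can be tried in a scratch repository. This is a sketch under assumptions: the repository, branch name `refiltered`, file contents and bot identity are all made up, and josh-proxy is not involved. It assumes `git` is available on the PATH.

```shell
#!/usr/bin/env bash
# Scratch repository with two unrelated histories (all names are hypothetical).
orig_dir=$(pwd)
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.name "Example Bot"
git config user.email "bot@example.invalid"

echo "old" > file.txt
git add file.txt
git commit -q -m "old subrepo history"
old_head=$(git rev-parse HEAD)

# An orphan branch has no parent, so it shares no ancestor with old_head.
git checkout -q --orphan refiltered
echo "new" > file.txt
git add file.txt
git commit -q -m "re-filtered history"
new_head=$(git rev-parse HEAD)
new_tree=$(git rev-parse "HEAD^{tree}")

# No common ancestor yet: merge-base exits non-zero.
if ! git merge-base "$old_head" "$new_head" >/dev/null 2>&1; then
  echo "unrelated"
fi

# Join both histories; the re-filtered commit is the first parent and supplies the tree.
merge_commit=$(git commit-tree "$new_tree" -p "$new_head" -p "$old_head" -m "reconcile")

first_parent=$(git rev-parse "${merge_commit}^1")
base=$(git merge-base "$merge_commit" "$old_head")
echo "merge commit: ${merge_commit}"

cd "$orig_dir" || exit 1
rm -rf "$repo"
```

After the merge commit exists, the old head is an ancestor of it, so a later `git merge-base` succeeds and a normal merge or fast-forward becomes possible again.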
|
||||
|
||||
# ─── Reverse Sync: Subrepo → Monorepo ──────────────────────────────
|
||||
#
|
||||
# Always creates a PR on the monorepo — never pushes directly.
|
||||
@@ -156,9 +251,24 @@ reverse_sync() {
|
||||
git remote add mono-filtered "$(josh_auth_url)"
|
||||
git fetch mono-filtered "$mono_branch" || die "Failed to fetch from josh-proxy"
|
||||
|
||||
# 3. Find new human commits (excludes bot commits from forward sync)
|
||||
# 3. Compare trees — skip if subrepo matches josh-filtered view
|
||||
local subrepo_tree josh_tree
|
||||
# shellcheck disable=SC1083 # {tree} is git syntax, not shell brace expansion
|
||||
subrepo_tree=$(git rev-parse "HEAD^{tree}")
|
||||
# shellcheck disable=SC1083
|
||||
josh_tree=$(git rev-parse "mono-filtered/${mono_branch}^{tree}")
|
||||
|
||||
if [ "$subrepo_tree" = "$josh_tree" ]; then
|
||||
log "INFO" "Subrepo tree matches josh-filtered view — nothing to sync"
|
||||
echo "skip"
|
||||
return
|
||||
fi
|
||||
|
||||
# 4. Find new human commits (excludes bot commits from forward sync)
|
||||
# Uses --ancestry-path to restrict to the direct lineage and avoid
|
||||
# leaking old history through reconciliation merge parents.
|
||||
local human_commits
|
||||
human_commits=$(git log "mono-filtered/${mono_branch}..HEAD" \
|
||||
human_commits=$(git log --ancestry-path "mono-filtered/${mono_branch}..HEAD" \
|
||||
--oneline --invert-grep --grep="^${BOT_TRAILER}:" 2>/dev/null || echo "")
|
||||
|
||||
if [ -z "$human_commits" ]; then
|
||||
@@ -170,7 +280,7 @@ reverse_sync() {
|
||||
log "INFO" "New human commits to sync:"
|
||||
echo "$human_commits" >&2
|
||||
|
||||
# 4. Push through josh to a staging branch
|
||||
# 5. Push through josh to a staging branch
|
||||
local ts
|
||||
ts=$(date +%Y%m%d-%H%M%S)
|
||||
local staging_branch="auto-sync/subrepo-${subrepo_branch}-${ts}"
|
||||
@@ -178,9 +288,20 @@ reverse_sync() {
|
||||
if git push -o "base=${mono_branch}" "$(josh_auth_url)" "HEAD:refs/heads/${staging_branch}"; then
|
||||
log "INFO" "Pushed to staging branch via josh: ${staging_branch}"
|
||||
|
||||
# 5. Create PR on monorepo (NEVER direct push)
|
||||
# 6. Create PR on monorepo (NEVER direct push)
|
||||
local pr_body
|
||||
pr_body="## Subrepo changes\n\nNew commits from subrepo \`${subrepo_branch}\`:\n\n\`\`\`\n${human_commits}\n\`\`\`\n\n**Review checklist:**\n- [ ] Changes scoped to synced subfolder\n- [ ] No leaked credentials or environment-specific config\n- [ ] CI passes"
|
||||
pr_body="## Subrepo changes
|
||||
|
||||
New commits from subrepo \`${subrepo_branch}\`:
|
||||
|
||||
\`\`\`
|
||||
${human_commits}
|
||||
\`\`\`
|
||||
|
||||
**Review checklist:**
|
||||
- [ ] Changes scoped to synced subfolder
|
||||
- [ ] No leaked credentials or environment-specific config
|
||||
- [ ] CI passes"
|
||||
|
||||
create_pr "${MONOREPO_API}" "${GITEA_TOKEN}" \
|
||||
"$mono_branch" "$staging_branch" \
|
||||
@@ -200,9 +321,13 @@ reverse_sync() {
|
||||
#
|
||||
# Used when a subrepo already has content and you're adding it to the
|
||||
# monorepo for the first time. Creates a PR.
|
||||
# Usage: initial_import [clone_url_override]
|
||||
# clone_url_override — if set, clone from this URL instead of subrepo_auth_url()
|
||||
# (used by onboard to clone from the archived repo)
|
||||
# Returns: skip | pr-created
|
||||
|
||||
initial_import() {
|
||||
local clone_url="${1:-$(subrepo_auth_url)}"
|
||||
local mono_branch="$SYNC_BRANCH_MONO"
|
||||
local subrepo_branch="$SYNC_BRANCH_SUBREPO"
|
||||
local subfolder
|
||||
@@ -225,8 +350,8 @@ initial_import() {
|
||||
--branch "$mono_branch" --single-branch \
|
||||
"${work_dir}/monorepo" || die "Failed to clone monorepo"
|
||||
|
||||
# 2. Clone subrepo
|
||||
git clone "$(subrepo_auth_url)" \
|
||||
# 2. Clone subrepo (or archived repo when clone_url is overridden)
|
||||
git clone "$clone_url" \
|
||||
--branch "$subrepo_branch" --single-branch \
|
||||
"${work_dir}/subrepo" || die "Failed to clone subrepo"
|
||||
|
||||
@@ -264,7 +389,14 @@ ${BOT_TRAILER}: import/${JOSH_SYNC_TARGET_NAME}/${ts}" >&2
|
||||
|
||||
# 5. Create PR on monorepo
|
||||
local pr_body
|
||||
pr_body="## Initial import\n\nImporting existing subrepo \`${subrepo_branch}\` (${file_count} files) into \`${subfolder}/\`.\n\n**Review checklist:**\n- [ ] Content looks correct\n- [ ] No leaked credentials or environment-specific config\n- [ ] CI passes"
|
||||
pr_body="## Initial import
|
||||
|
||||
Importing existing subrepo \`${subrepo_branch}\` (${file_count} files) into \`${subfolder}/\`.
|
||||
|
||||
**Review checklist:**
|
||||
- [ ] Content looks correct
|
||||
- [ ] No leaked credentials or environment-specific config
|
||||
- [ ] CI passes"
|
||||
|
||||
create_pr "${MONOREPO_API}" "${GITEA_TOKEN}" \
|
||||
"$mono_branch" "$staging_branch" \
|
||||
|
||||
@@ -70,6 +70,12 @@
"items": { "type": "string" },
"default": [],
"description": "Branches that only sync mono → subrepo (never reverse)"
},
"exclude": {
"type": "array",
"items": { "type": "string" },
"default": [],
"description": "File/directory patterns to exclude from sync via josh :exclude filter. Josh pattern syntax: 'dir/' for directories, '*.ext' for globs, '**/dir/' for nested matches. Patterns are embedded inline in the josh-proxy URL."
}
}
}