- README: add onboard and migrate-pr to CLI reference - Guide Step 5: add onboard as recommended Option A, move manual import/reset to Option B, document migrate-pr usage - Guide "Adding a New Target": mention onboard as preferred path Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
566 lines
18 KiB
Markdown
566 lines
18 KiB
Markdown
# Setup Guide
|
|
|
|
Step-by-step guide to setting up josh-sync for a new monorepo with existing subrepos.
|
|
|
|
## Overview
|
|
|
|
josh-sync provides bidirectional sync between a monorepo and N external subrepos via [josh-proxy](https://josh-project.github.io/josh/):
|
|
|
|
```
|
|
MONOREPO SUBREPOS
|
|
├── services/billing/ ──── forward ────► billing-repo/
|
|
├── services/auth/ (push or cron) auth-repo/
|
|
└── libs/shared/ ◄──── reverse ───── shared-lib-repo/
|
|
(cron → always PR)
|
|
via josh-proxy (filtered git views)
|
|
```
|
|
|
|
**Key safety properties:**
|
|
- Forward sync (mono → subrepo) uses `--force-with-lease` — never overwrites concurrent changes
|
|
- Reverse sync (subrepo → mono) always creates a PR — never pushes directly
|
|
- Git trailers (`Josh-Sync-Origin:`) prevent infinite sync loops
|
|
- State tracked on an orphan branch (`josh-sync-state`) — survives CI runner teardown
|
|
|
|
## Prerequisites
|
|
|
|
Before you begin, you need:
|
|
|
|
### josh-proxy instance
|
|
|
|
A running [josh-proxy](https://josh-project.github.io/josh/) that can access your monorepo's Git server. Verify connectivity:
|
|
|
|
```bash
|
|
git ls-remote https://josh.example.com/org/monorepo.git HEAD
|
|
```
|
|
|
|
### Bot account
|
|
|
|
A dedicated Git user (e.g., `josh-sync-bot`) with:
|
|
- Write access to the monorepo
|
|
- Write access to all subrepos
|
|
- Ability to create PRs on both monorepo and subrepo platforms
|
|
|
|
### Credentials
|
|
|
|
| Variable | Purpose | Required |
|
|
|----------|---------|----------|
|
|
| `SYNC_BOT_USER` | Bot's Git username | Yes |
|
|
| `SYNC_BOT_TOKEN` | API token with repo scope (monorepo + josh-proxy auth) | Yes |
|
|
| `SUBREPO_SSH_KEY` | SSH private key for subrepo access (if using SSH auth) | If SSH |
|
|
| `SUBREPO_TOKEN` | HTTPS token for subrepo access (defaults to `SYNC_BOT_TOKEN`) | No |
|
|
|
|
Per-target credential overrides are supported — see [Configuration Reference](config-reference.md).
|
|
|
|
### Tool dependencies
|
|
|
|
`bash >=4`, `git`, `curl`, `jq`, `yq` ([mikefarah/yq](https://github.com/mikefarah/yq) v4+), `openssh`, `rsync`
|
|
|
|
> The Nix flake bundles all dependencies automatically.
|
|
|
|
## Step 1: Create the Monorepo
|
|
|
|
Create a new repository on your Git server (e.g., `org/monorepo`). Create subdirectories for each subrepo you want to sync:
|
|
|
|
```bash
|
|
mkdir -p services/billing services/auth libs/shared
|
|
```
|
|
|
|
These directories will be populated during the import step. They can be empty or contain `.gitkeep` files for now.
|
|
|
|
Verify josh-proxy can see the monorepo:
|
|
|
|
```bash
|
|
git ls-remote https://josh.example.com/org/monorepo.git HEAD
|
|
```
|
|
|
|
## Step 2: Configure `.josh-sync.yml`
|
|
|
|
Create `.josh-sync.yml` at the monorepo root. Each target maps a monorepo subfolder to an external subrepo:
|
|
|
|
```yaml
|
|
josh:
|
|
proxy_url: "https://josh.example.com" # josh-proxy URL (no trailing slash)
|
|
monorepo_path: "org/monorepo" # repo path as josh sees it
|
|
|
|
targets:
|
|
- name: "billing" # unique identifier
|
|
subfolder: "services/billing" # monorepo subfolder
|
|
# josh_filter auto-derived as ":/services/billing" if omitted
|
|
subrepo_url: "git@gitea.example.com:ext/billing.git"
|
|
subrepo_auth: "ssh" # "https" (default) or "ssh"
|
|
branches:
|
|
main: main # mono_branch: subrepo_branch
|
|
forward_only: []
|
|
|
|
- name: "auth"
|
|
subfolder: "services/auth"
|
|
subrepo_url: "https://gitea.example.com/ext/auth.git"
|
|
subrepo_auth: "https"
|
|
subrepo_token_var: "AUTH_REPO_TOKEN" # per-target credential override
|
|
branches:
|
|
main: main
|
|
develop: develop # multiple branches supported
|
|
forward_only: []
|
|
|
|
- name: "shared-lib"
|
|
subfolder: "libs/shared"
|
|
subrepo_url: "https://gitea.example.com/ext/shared-lib.git"
|
|
branches:
|
|
main: main
|
|
forward_only: [main] # one-way: mono → subrepo only
|
|
|
|
bot:
|
|
name: "josh-sync-bot"
|
|
email: "josh-sync-bot@example.com"
|
|
trailer: "Josh-Sync-Origin" # git trailer for loop prevention
|
|
```
|
|
|
|
For the full field reference, see [Configuration Reference](config-reference.md).
|
|
|
|
## Step 3: Set Up Local Dev Environment
|
|
|
|
### Option A: Nix + devenv (recommended)
|
|
|
|
**`devenv.yaml`** — declare josh-sync as a flake input:
|
|
|
|
```yaml
|
|
inputs:
|
|
nixpkgs:
|
|
url: github:cachix/devenv-nixpkgs/rolling
|
|
josh-sync:
|
|
url: git+https://your-gitea.example.com/org/josh-sync?ref=main
|
|
flake: true
|
|
```
|
|
|
|
**`devenv.nix`** — import the josh-sync module:
|
|
|
|
```nix
|
|
{ inputs, ... }:
|
|
{
|
|
imports = [ inputs.josh-sync.devenvModules.default ];
|
|
|
|
name = "my-monorepo";
|
|
|
|
# .env contains secrets, not devenv config
|
|
dotenv.disableHint = true;
|
|
}
|
|
```
|
|
|
|
**`.envrc`** — activate devenv automatically:
|
|
|
|
```bash
|
|
DEVENV_WARN_TIMEOUT=20
|
|
use devenv
|
|
```
|
|
|
|
**`.env`** — local credentials (add to `.gitignore`):
|
|
|
|
```bash
|
|
SYNC_BOT_USER=sync-bot
|
|
SYNC_BOT_TOKEN=<your-api-token>
|
|
SUBREPO_SSH_KEY="-----BEGIN OPENSSH PRIVATE KEY-----
|
|
...
|
|
-----END OPENSSH PRIVATE KEY-----"
|
|
# Per-target overrides:
|
|
# AUTH_REPO_TOKEN=<auth-specific-token>
|
|
```
|
|
|
|
### Option B: Manual installation
|
|
|
|
Install the required tools, then either:
|
|
|
|
- Clone the josh-sync repo and add `bin/` to your `PATH`
|
|
- Run `make build` to create a single bundled script at `dist/josh-sync`
|
|
|
|
## Step 4: Validate with Preflight
|
|
|
|
```bash
|
|
josh-sync preflight
|
|
```
|
|
|
|
This validates:
|
|
- Config syntax and required fields
|
|
- josh-proxy connectivity (via `git ls-remote` through josh)
|
|
- Subrepo connectivity and authentication
|
|
- Branch mappings
|
|
- CI workflow path coverage (checks if `.gitea/workflows/josh-sync-forward.yml` paths match target subfolders)
|
|
|
|
For a new monorepo before import, preflight may warn that subfolders don't exist yet — that's expected.
|
|
|
|
## Step 5: Import Existing Subrepos
|
|
|
|
This is the critical onboarding step. There are two approaches:
|
|
|
|
- **`josh-sync onboard`** (recommended) — interactive, resumable, preserves open PRs
|
|
- **Manual `import` → merge → `reset`** — lower-level, for automation or when there are no open PRs to preserve
|
|
|
|
### Option A: Onboard (recommended)
|
|
|
|
The `onboard` command walks you through the entire process interactively, with checkpoint/resume at every step.
|
|
|
|
**Before you start:**
|
|
|
|
1. **Rename** the existing subrepo on your Git server (e.g., `stores/storefront` → `stores/storefront-archived`)
|
|
2. **Create a new empty repo** at the original path (e.g., a new `stores/storefront` with no commits)
|
|
|
|
The rename preserves the archived repo with all its history and open PRs. The new empty repo will receive josh-filtered history.
|
|
|
|
**Run onboard:**
|
|
|
|
```bash
|
|
josh-sync onboard billing
|
|
```
|
|
|
|
The command will:
|
|
1. **Verify prerequisites** — checks the new empty repo is reachable, asks for the archived repo URL
|
|
2. **Import** — copies subrepo content into monorepo and creates import PRs (one per branch)
|
|
3. **Wait for merge** — shows PR numbers and waits for you to merge them
|
|
4. **Reset** — pushes josh-filtered history to the new subrepo (per-branch, with resume)
|
|
5. **Done** — prints instructions for developers and PR migration
|
|
|
|
If the process is interrupted at any point, re-run `josh-sync onboard billing` to resume from where it left off. Use `--restart` to start over.
|
|
|
|
**Migrate open PRs:**
|
|
|
|
After onboard completes, migrate PRs from the archived repo to the new one:
|
|
|
|
```bash
|
|
# Interactive — lists open PRs and lets you pick
|
|
josh-sync migrate-pr billing
|
|
|
|
# Migrate all open PRs at once
|
|
josh-sync migrate-pr billing --all
|
|
|
|
# Migrate specific PRs by number
|
|
josh-sync migrate-pr billing 5 8 12
|
|
```
|
|
|
|
PR migration works by fetching the diff from the archived repo's PR, applying it to the new repo, and creating a new PR. File content is identical after reset, so patches apply cleanly.
|
|
|
|
### Option B: Manual import → merge → reset
|
|
|
|
Use this when the subrepo has no open PRs to preserve, or for scripted automation.
|
|
|
|
> Do this **one target at a time** to keep PRs reviewable.
|
|
|
|
#### 5b-1. Import
|
|
|
|
```bash
|
|
josh-sync import billing
|
|
```
|
|
|
|
This:
|
|
1. Clones the monorepo directly (not through josh)
|
|
2. Clones the subrepo
|
|
3. Copies subrepo content into the monorepo subfolder via `rsync`
|
|
4. Creates a branch `auto-sync/import-billing-<timestamp>`
|
|
5. Pushes it and creates a PR on the monorepo
|
|
|
|
Review the import PR — check for leaked credentials, environment-specific config, or files that shouldn't be in the monorepo.
|
|
|
|
#### 5b-2. Merge the import PR
|
|
|
|
Merge the PR using your Git platform's UI. This lands the subrepo content into the monorepo's main branch.
|
|
|
|
> At this point, the monorepo has the content but the histories are disconnected. Sync will **not** work until you complete the reset step.
|
|
|
|
#### 5b-3. Reset
|
|
|
|
```bash
|
|
josh-sync reset billing
|
|
```
|
|
|
|
> **You do NOT need to `git pull` locally before running reset.** The reset command clones fresh from josh-proxy — it never uses your local working copy.
|
|
|
|
This:
|
|
1. Clones the monorepo through josh-proxy with the josh filter (the "filtered view")
|
|
2. Force-pushes that filtered view to the subrepo, replacing its history
|
|
|
|
This establishes **shared commit ancestry** between josh's filtered view and the subrepo. Without this, josh-proxy can't compute diffs between the two.
|
|
|
|
> **Warning:** This is a destructive force-push that replaces the subrepo's history. Back up any important branches or tags in the subrepo beforehand. Merge or close all open pull requests on the subrepo first — they will be invalidated.
|
|
|
|
After reset, **every developer with a local clone of the subrepo** must update their local copy to match the new history:
|
|
|
|
```bash
|
|
cd /path/to/local-subrepo
|
|
git fetch origin
|
|
git checkout main && git reset --hard origin/main
|
|
git checkout stage && git reset --hard origin/stage # repeat for each branch
|
|
```
|
|
|
|
Or simply delete and re-clone the subrepo. Local-only branches (not pushed to the remote) will be lost either way.
|
|
|
|
#### 5b-4. Repeat for each target
|
|
|
|
```
|
|
For each target:
|
|
1. josh-sync import <target>
|
|
2. Review and merge the import PR on the monorepo
|
|
3. josh-sync reset <target>
|
|
```
|
|
|
|
### Verify
|
|
|
|
After all targets are imported and reset (whichever option you used):
|
|
|
|
```bash
|
|
# Check all targets show state
|
|
josh-sync status
|
|
|
|
# Test forward sync — should return "skip" (trees are identical after reset)
|
|
josh-sync sync --forward --target billing
|
|
|
|
# Test reverse sync — should return "skip" (no new human commits)
|
|
josh-sync sync --reverse --target billing
|
|
```
|
|
|
|
## Step 6: Set Up CI Workflows
|
|
|
|
### Forward sync (mono → subrepo)
|
|
|
|
Create `.gitea/workflows/josh-sync-forward.yml`:
|
|
|
|
```yaml
|
|
name: "Josh Sync → Subrepo"
|
|
|
|
on:
|
|
push:
|
|
branches: [main]
|
|
paths:
|
|
# List ALL target subfolders:
|
|
- "services/billing/**"
|
|
- "services/auth/**"
|
|
- "libs/shared/**"
|
|
schedule:
|
|
- cron: "0 */6 * * *" # every 6 hours as fallback
|
|
workflow_dispatch:
|
|
inputs:
|
|
target:
|
|
description: "Target to sync (empty = detect from push or all)"
|
|
required: false
|
|
default: ""
|
|
branch:
|
|
description: "Branch to sync (empty = triggered branch or all)"
|
|
required: false
|
|
default: ""
|
|
|
|
concurrency:
|
|
group: josh-sync-fwd-${{ github.ref_name }}
|
|
cancel-in-progress: false
|
|
|
|
jobs:
|
|
sync:
|
|
runs-on: docker
|
|
container: node:20-bookworm
|
|
timeout-minutes: 10
|
|
steps:
|
|
- uses: actions/checkout@v4
|
|
with:
|
|
fetch-depth: 2 # needed for target detection
|
|
|
|
- name: Install tools
|
|
run: |
|
|
apt-get update -qq && apt-get install -y -qq jq curl git openssh-client >/dev/null 2>&1
|
|
curl -sL "https://github.com/mikefarah/yq/releases/download/v4.44.6/yq_linux_amd64" \
|
|
-o /usr/local/bin/yq && chmod +x /usr/local/bin/yq
|
|
|
|
- name: Detect changed target
|
|
if: github.event_name == 'push'
|
|
id: detect
|
|
run: |
|
|
CHANGED=$(git diff --name-only HEAD~1 HEAD 2>/dev/null || echo "")
|
|
TARGETS=$(yq -o json '.targets' .josh-sync.yml \
|
|
| jq -r '.[] | "\(.name):\(.subfolder)"' \
|
|
| while IFS=: read -r name prefix; do
|
|
echo "$CHANGED" | grep -q "^${prefix}/" && echo "$name"
|
|
done | sort -u | paste -sd ',' -)
|
|
echo "targets=${TARGETS}" >> "$GITHUB_OUTPUT"
|
|
|
|
- uses: https://your-gitea.example.com/org/josh-sync@v1
|
|
with:
|
|
direction: forward
|
|
target: ${{ github.event.inputs.target || steps.detect.outputs.targets }}
|
|
branch: ${{ github.event.inputs.branch || github.ref_name }}
|
|
env:
|
|
SYNC_BOT_USER: ${{ secrets.SYNC_BOT_USER }}
|
|
SYNC_BOT_TOKEN: ${{ secrets.SYNC_BOT_TOKEN }}
|
|
SUBREPO_TOKEN: ${{ secrets.SUBREPO_TOKEN || secrets.SYNC_BOT_TOKEN }}
|
|
SUBREPO_SSH_KEY: ${{ secrets.SUBREPO_SSH_KEY }}
|
|
```
|
|
|
|
### Reverse sync (subrepo → mono)
|
|
|
|
Create `.gitea/workflows/josh-sync-reverse.yml`:
|
|
|
|
```yaml
|
|
name: "Josh Sync ← Subrepo"
|
|
|
|
on:
|
|
schedule:
|
|
- cron: "0 1,7,13,19 * * *" # every 6h, offset from forward
|
|
workflow_dispatch:
|
|
inputs:
|
|
target:
|
|
description: "Target to sync (empty = all)"
|
|
required: false
|
|
default: ""
|
|
branch:
|
|
description: "Branch to sync (empty = all eligible)"
|
|
required: false
|
|
default: ""
|
|
|
|
concurrency:
|
|
group: josh-sync-rev-${{ github.event.inputs.target || 'all' }}
|
|
cancel-in-progress: false
|
|
|
|
jobs:
|
|
sync:
|
|
runs-on: docker
|
|
container: node:20-bookworm
|
|
timeout-minutes: 10
|
|
steps:
|
|
- uses: actions/checkout@v4
|
|
|
|
- name: Install tools
|
|
run: |
|
|
apt-get update -qq && apt-get install -y -qq jq curl git openssh-client >/dev/null 2>&1
|
|
curl -sL "https://github.com/mikefarah/yq/releases/download/v4.44.6/yq_linux_amd64" \
|
|
-o /usr/local/bin/yq && chmod +x /usr/local/bin/yq
|
|
|
|
- uses: https://your-gitea.example.com/org/josh-sync@v1
|
|
with:
|
|
direction: reverse
|
|
target: ${{ github.event.inputs.target || '' }}
|
|
branch: ${{ github.event.inputs.branch || '' }}
|
|
env:
|
|
SYNC_BOT_USER: ${{ secrets.SYNC_BOT_USER }}
|
|
SYNC_BOT_TOKEN: ${{ secrets.SYNC_BOT_TOKEN }}
|
|
SUBREPO_TOKEN: ${{ secrets.SUBREPO_TOKEN || secrets.SYNC_BOT_TOKEN }}
|
|
SUBREPO_SSH_KEY: ${{ secrets.SUBREPO_SSH_KEY }}
|
|
```
|
|
|
|
### Required CI secrets
|
|
|
|
| Secret | Purpose |
|
|
|--------|---------|
|
|
| `SYNC_BOT_USER` | Bot username |
|
|
| `SYNC_BOT_TOKEN` | Bot API token (monorepo access + josh-proxy auth) |
|
|
| `SUBREPO_SSH_KEY` | SSH private key for subrepo push (if using SSH auth) |
|
|
| `SUBREPO_TOKEN` | Optional separate subrepo token (defaults to `SYNC_BOT_TOKEN`) |
|
|
|
|
> **GitHub Actions note:** These examples target Gitea Actions. For GitHub Actions, change the `uses:` reference to a GitHub repo (e.g., `org/josh-sync@v1`) and `runs-on:` to a GitHub runner (e.g., `ubuntu-latest`).
|
|
|
|
## How Ongoing Sync Works
|
|
|
|
Once set up, sync runs automatically:
|
|
|
|
### Forward sync (mono → subrepo)
|
|
|
|
Triggered by pushes to target subfolders or on a cron schedule:
|
|
|
|
1. Clones the monorepo through josh-proxy (filtered view of the subfolder)
|
|
2. Fetches the subrepo branch for comparison
|
|
3. If trees are identical → skip
|
|
4. If subrepo branch doesn't exist → fresh push
|
|
5. Merges mono changes on top of subrepo state
|
|
6. If clean merge → pushes with `--force-with-lease` (protects against concurrent changes)
|
|
7. If lease rejected → retries on next run (subrepo changed during sync)
|
|
8. If merge conflict → creates a conflict PR on the subrepo
|
|
|
|
### Reverse sync (subrepo → mono)
|
|
|
|
Runs on a cron schedule (never triggered by subrepo pushes):
|
|
|
|
1. Clones the subrepo
|
|
2. Fetches the monorepo's josh-filtered view for comparison
|
|
3. Finds new human commits (filters out bot commits by checking for the `Josh-Sync-Origin:` trailer)
|
|
4. If no new human commits → skip
|
|
5. Pushes through josh-proxy to a staging branch
|
|
6. Creates a PR on the monorepo — **never pushes directly**
|
|
|
|
### Loop prevention
|
|
|
|
Bot commits include a git trailer like `Josh-Sync-Origin: forward/main/2024-02-12T10:30:00Z`. Each sync direction filters out commits with this trailer, preventing changes from bouncing back and forth. The CI action also has a loop guard that skips entirely if the HEAD commit has the trailer.
|
|
|
|
### State tracking
|
|
|
|
Sync state is stored as JSON files on an orphan branch (`josh-sync-state`), one file per target/branch. This tracks the last-synced commit SHAs and timestamps to avoid re-syncing the same changes.
|
|
|
|
## Adding a New Target
|
|
|
|
To add a new subrepo after initial setup:
|
|
|
|
1. Add the target to `.josh-sync.yml`
|
|
2. Update the forward workflow's `paths:` list to include the new subfolder
|
|
3. Commit and push
|
|
4. Import the target:
|
|
```bash
|
|
# Recommended: interactive onboard (preserves open PRs)
|
|
josh-sync onboard new-target
|
|
|
|
# Or manual: import → merge PR → reset
|
|
josh-sync import new-target
|
|
# merge the PR
|
|
josh-sync reset new-target
|
|
```
|
|
5. Verify with `josh-sync status`
|
|
|
|
## Troubleshooting
|
|
|
|
### "Failed to clone through josh-proxy"
|
|
|
|
- Check josh-proxy is running and accessible
|
|
- Verify `monorepo_path` matches what josh-proxy expects
|
|
- Test manually: `git ls-remote https://<user>:<token>@josh.example.com/org/repo.git:/services/app.git`
|
|
|
|
### SSH authentication failures
|
|
|
|
- `SUBREPO_SSH_KEY` must contain the actual key content, not a file path
|
|
- For per-target keys, ensure `subrepo_ssh_key_var` in config matches the env var name
|
|
- Check the key has write access to the subrepo
|
|
|
|
### "Force-with-lease rejected"
|
|
|
|
Normal: the subrepo changed while sync was running. The next sync run will pick it up. If persistent, check for another process pushing to the subrepo simultaneously.
|
|
|
|
### "Josh rejected push" (reverse sync)
|
|
|
|
Josh-proxy couldn't map the push back to the monorepo. Check josh-proxy logs, verify the josh filter is correct. May indicate a history divergence — consider running `josh-sync reset <target>`.
|
|
|
|
### Import PR shows "No changes"
|
|
|
|
The subfolder already contains the same content as the subrepo. This is fine — the import is a no-op.
|
|
|
|
### Duplicate/looping commits
|
|
|
|
Verify `bot.trailer` in config matches what's in commit messages. Check the loop guard in the CI workflow is active.
|
|
|
|
### "cannot lock ref" or "expected X but got Y"
|
|
|
|
**After reset (subrepo):** The subrepo's history was replaced by force-push. Local clones still have the old history:
|
|
|
|
```bash
|
|
cd /path/to/subrepo
|
|
git fetch origin
|
|
git checkout main && git reset --hard origin/main
|
|
```
|
|
|
|
Or simply delete and re-clone.
|
|
|
|
**After import/reset cycle (monorepo):** The import and reset steps create and update branches rapidly (`auto-sync/import-*`, `josh-sync-state`). If your local clone fetched partway through, tracking refs go stale:
|
|
|
|
```bash
|
|
git remote prune origin && git pull
|
|
```
|
|
|
|
### State issues
|
|
|
|
```bash
|
|
# View current state
|
|
josh-sync state show <target> [branch]
|
|
|
|
# Reset state (forces next sync to run regardless of SHA comparison)
|
|
josh-sync state reset <target> [branch]
|
|
```
|