# Step 2 Prompt: Artifacts Pipeline

Source: `laravel/resources/prompts/_source/STEP-2-ARTIFACTS-PIPELINE.md`
Prompt version: c0d02f751adf

You are running Step 2 of the Agents HQ product-spec setup. Your job is to turn
business artifacts into grounded policies, goals, CEO instructions, pre-staged
C-level instructions, and a gap report. Laravel hosts this prompt; the work
happens in the local filesystem and Paperclip.

## Inputs

- `$BASE`, set to `~/Agents/{company-prefix}`
- Raw files in `$BASE/artifacts/`
- Any pasted conversation or business context from the user
- `state/agent-registry.json` from Step 1
- `state/setup.json`, `state/governor.json`, and `state/paperclip-health.json`
  from Step 1 when available

Initialize:

```bash
export BASE="${BASE:-$HOME/Agents/REPLACE_WITH_COMPANY_PREFIX}"
export STATE_DIR="$BASE/state"
export SETUP_STATE="$STATE_DIR/setup.json"
export REGISTRY="$STATE_DIR/agent-registry.json"
export ARTIFACTS_DIR="$BASE/artifacts"
export PARSED_DIR="$ARTIFACTS_DIR/_parsed"
export CLASSIFIED_DIR="$ARTIFACTS_DIR/_classified"
export CLASSIFICATION_VERSION="c0d02f751adf"
export ADAPTER_TYPE="$(jq -r '.adapterType // "pi_local"' "$SETUP_STATE" 2>/dev/null || printf 'pi_local')"
export PAPERCLIP_URL="$(jq -r '.paperclipUrl // "http://127.0.0.1:3100"' "$SETUP_STATE" 2>/dev/null || printf 'http://127.0.0.1:3100')"
export PAPERCLIP_COMPANY_ID="$(jq -r '.companyId // empty' "$REGISTRY" 2>/dev/null || true)"

mkdir -p "$ARTIFACTS_DIR" "$PARSED_DIR" "$CLASSIFIED_DIR" "$BASE/governance" "$STATE_DIR/templates"
```

If the user pastes business context or uploads files into the conversation,
write that content into `artifacts/` before parsing it. Use stable names:

```bash
PASTE_PATH="$ARTIFACTS_DIR/paste-$(date -u +%Y%m%d-%H%M%S).md"
cat > "$PASTE_PATH" <<'EOF'
PASTE_OR_UPLOADED_CONTENT_HERE
EOF
```

Do not leave pasted context only in the chat transcript. Step 2 reads from
`artifacts/` regardless of whether artifacts came from filesystem drops or the
conversation.

## Runtime Preconditions

Before parsing, verify Step 1 left usable runtime state:

```bash
test -d "$ARTIFACTS_DIR" || {
  echo "Missing artifacts directory. Run Step 1 first."
  exit 1
}

test -f "$BASE/state/agent-registry.json" || {
  echo "Missing state/agent-registry.json. Run Step 1 first."
  exit 1
}

if [ -f "$BASE/state/governor.json" ]; then
  GOVERNOR_ENABLED="$(jq -r '.enabled // .data.enabled // false' "$BASE/state/governor.json")"
  if [ "$GOVERNOR_ENABLED" != "true" ]; then
    echo "Governor is disabled. Enable it in the connectors plugin settings, then retry Step 2."
    exit 1
  fi
fi

if [ -z "$PAPERCLIP_COMPANY_ID" ]; then
  echo "Missing Paperclip company ID in state/agent-registry.json. Run Step 1 first."
  exit 1
fi

update_setup_step() {
  step="$1"
  ts="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  test -f "$SETUP_STATE" || printf '{"step":"%s"}\n' "$step" > "$SETUP_STATE"
  tmp="$(mktemp)"
  jq --arg step "$step" --arg ts "$ts" '.step = $step | .updatedAt = $ts' "$SETUP_STATE" > "$tmp" && mv "$tmp" "$SETUP_STATE"
}

update_setup_step "step-2.ingestion"
```

Any Step 2 agent registration must merge into `state/agent-registry.json` using
the `jq` merge pattern from Step 1. Do not replace the file with `cat >`.

Download the Step 2 templates before synthesis:

```bash
curl -fsSL "http://agents.tractionstudio.ai/downloads/policies/bucket-policy.md" -o "$STATE_DIR/templates/bucket-policy.md"
curl -fsSL "http://agents.tractionstudio.ai/downloads/policies/bucket-goals.md" -o "$STATE_DIR/templates/bucket-goals.md"
curl -fsSL "http://agents.tractionstudio.ai/downloads/policies/ceo-agents.md" -o "$STATE_DIR/templates/ceo-agents.md"
curl -fsSL "http://agents.tractionstudio.ai/downloads/policies/c-level-agents.md" -o "$STATE_DIR/templates/c-level-agents.md"
curl -fsSL "http://agents.tractionstudio.ai/downloads/policies/agents-md-template.md" -o "$STATE_DIR/templates/agents-md-template.md"
```

## Parse

Set setup state before parsing:

```bash
update_setup_step "step-2.parsing"
```

For every readable artifact directly under `artifacts/`:

1. Extract readable text.
2. Save the mirror as `artifacts/_parsed/{filename}.txt`.
3. Preserve filename, source path, parse time, and parse warnings.

Ignore files already inside `artifacts/_parsed/` or `artifacts/_classified/`.
Create one parsed text mirror per source file, preserving enough of the original
name to make review easy.

Suggested parser behavior:

- `.md`, `.txt`, `.csv`, `.json`, `.yaml`, `.yml`: copy/read as text.
- `.html`, `.htm`: extract visible text while preserving headings and lists.
- `.pdf`: use `pdftotext` if available; if unavailable, try another local PDF
  text extractor.
- `.docx`: use a structured document parser if available.
- Unknown binary files: mark unreadable rather than guessing.
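
The suggested behavior can be sketched as a small dispatcher. `parse_artifact` is a hypothetical helper name, not something Step 1 installs. This version handles only the plain-text and PDF cases and reports everything else as unreadable so the caller can log it; HTML and DOCX extraction would need a tool chosen locally.

```bash
# Sketch only: parse_artifact is a hypothetical helper, not part of the Step 1 tooling.
# Usage: parse_artifact <source-file> <parsed-dir>
# Prints the mirror path on success; returns 1 when the file should be logged as unreadable.
parse_artifact() {
  src="$1"
  out="$2/$(basename "$src").txt"
  mkdir -p "$2"
  # Lowercase the extension so .PDF and .pdf are treated alike.
  ext="$(printf '%s' "${src##*.}" | tr '[:upper:]' '[:lower:]')"
  case "$ext" in
    md|txt|csv|json|yaml|yml)
      cp "$src" "$out" ;;
    pdf)
      # Requires poppler's pdftotext; -layout keeps columns roughly aligned.
      command -v pdftotext >/dev/null 2>&1 || return 1
      pdftotext -layout "$src" "$out" || return 1 ;;
    *)
      # .html, .docx, and unknown binaries need a dedicated extractor.
      return 1 ;;
  esac
  # A mirror with no text (for example a scanned PDF) counts as unreadable.
  [ -s "$out" ] || { rm -f "$out"; return 1; }
  printf '%s\n' "$out"
}
```

A caller would loop over the files directly under `artifacts/` and write an error entry whenever the function returns non-zero.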

Use a parse log:

```text
artifacts/_parsed/_parse-log.jsonl
artifacts/_parsed/_errors.jsonl
```

Each parse log entry should include:

```json
{
  "source_path": "$BASE/artifacts/example.pdf",
  "parsed_path": "$BASE/artifacts/_parsed/example.pdf.txt",
  "sha256": "...",
  "parsed_at": "2026-04-28T00:00:00Z",
  "warnings": []
}
```

Each error entry should include:

```json
{
  "source_path": "$BASE/artifacts/bad-scan.pdf",
  "error": "No extractable text; OCR required",
  "recorded_at": "2026-04-28T00:00:00Z"
}
```
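
Both entry shapes can be appended with `jq`, which escapes paths and messages safely. The helper names below are hypothetical, and the sketch assumes `jq` and `sha256sum` are on `PATH` (on macOS, `shasum -a 256` would be the likely substitute).

```bash
# Sketch only: log_parse and log_parse_error are hypothetical helpers.

# Usage: log_parse <source-file> <parsed-file> <parsed-dir>
log_parse() {
  jq -nc \
    --arg source_path "$1" \
    --arg parsed_path "$2" \
    --arg sha256 "$(sha256sum "$1" | cut -d' ' -f1)" \
    --arg parsed_at "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    '{source_path: $source_path, parsed_path: $parsed_path, sha256: $sha256,
      parsed_at: $parsed_at, warnings: []}' >> "$3/_parse-log.jsonl"
}

# Usage: log_parse_error <source-file> <message> <parsed-dir>
log_parse_error() {
  jq -nc \
    --arg source_path "$1" \
    --arg message "$2" \
    --arg recorded_at "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    '{source_path: $source_path, error: $message, recorded_at: $recorded_at}' \
    >> "$3/_errors.jsonl"
}
```

Each call appends exactly one line, so the `.jsonl` files stay valid even when a run is interrupted between artifacts.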

If a file is unreadable, write an error entry, tell the user which file failed,
continue with readable artifacts, and later add the issue to
`governance/gap-report.json` and `governance/gap-report.md`. Never silently drop
an artifact.

If no readable artifacts remain after parsing, stop and ask the user to add
business artifacts before continuing.

## Classify

Set setup state before classification:

```bash
update_setup_step "step-2.classifying"
```

Classify each artifact into:

- Buckets: one or more of `cockpit`, `wings`, `left-engine`, `right-engine`,
  `fuel-tank`, or `fuselage`
- Content types
- Key facts
- Evidence quotes
- Confidence per bucket
- Gaps or contradictions

Write JSON to:

```text
artifacts/_classified/{filename}.json
```

Classification schema:

```json
{
  "classification_version": "c0d02f751adf",
  "source": {
    "filename": "artifact.pdf",
    "source_path": "$BASE/artifacts/artifact.pdf",
    "parsed_path": "$BASE/artifacts/_parsed/artifact.pdf.txt",
    "sha256": "..."
  },
  "buckets": [
    {
      "bucket": "cockpit",
      "confidence": 0.8,
      "reason": "The artifact defines mission and strategic priorities."
    }
  ],
  "content_types": ["vision", "market-evidence"],
  "key_facts": [
    {
      "fact": "The company targets regional service businesses.",
      "buckets": ["cockpit", "right-engine"],
      "evidence_quotes": [
        {
          "quote": "direct quote from the parsed artifact",
          "source_path": "$BASE/artifacts/artifact.pdf"
        }
      ]
    }
  ],
  "gaps": [
    {
      "bucket": "fuel-tank",
      "missing_evidence": "No pricing or runway detail found."
    }
  ],
  "contradictions": []
}
```
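
Before synthesis it may help to check that each classification file matches the schema. The following is a minimal sketch, assuming `jq` is on `PATH`. `validate_classification` is a hypothetical name, and two of its rules are assumptions rather than stated requirements: that `confidence` lies between 0 and 1, and that every key fact carries at least one evidence quote.

```bash
# Sketch only: validate_classification is a hypothetical helper.
# Usage: validate_classification <classified-json>
# Returns 0 when the file has the expected shape, non-zero otherwise.
validate_classification() {
  jq -e '
    ["cockpit", "wings", "left-engine", "right-engine", "fuel-tank", "fuselage"] as $known
    | (.classification_version | type == "string")
      and (.source.source_path | type == "string")
      and (.buckets | type == "array" and length > 0)
      and all(.buckets[];
            (.bucket as $b | $known | index($b) != null)
            and (.confidence | type == "number" and . >= 0 and . <= 1))
      and all(.key_facts[]?; (.evidence_quotes // []) | length > 0)
  ' "$1" >/dev/null
}
```

A file that fails the check is better reclassified than patched by hand, since synthesis treats the classified JSON as its source of truth.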

Default content-to-bucket map:

- Strategy, mission, vision, priorities: `cockpit`
- Product, users, roadmap, customer feedback: `wings`
- Marketing, brand, content, positioning: `left-engine`
- Sales, customer success, pipeline, retention: `right-engine`
- Finance, pricing, funding, runway, unit economics: `fuel-tank`
- Operations, delivery, admin, legal, HR, vendors, internal tooling: `fuselage`

Default artifact-to-bucket map:

| Artifact pattern | Default buckets |
|---|---|
| Business plan | `cockpit` |
| Competitor landscape report | `cockpit`, `wings`, `left-engine`, `right-engine` |
| Customer behavior story | `wings`, `left-engine` |
| Desirability validation report | `wings` |
| Problem interview script | `wings` |
| Viability validation report | `fuel-tank` |
| Feasibility validation report | `fuselage`, `wings` |
| Final refinement report | `cockpit` |
| Refined business canvas | `cockpit` |
| Traction roadmap | `cockpit`, `wings`, `left-engine`, `right-engine`, `fuel-tank`, `fuselage` |
| Branding guide | `left-engine` |
| Elevator pitch | `left-engine`, `right-engine` |
| Irresistible offer | `left-engine`, `right-engine` |
| Landing page | `left-engine` |

Defaults are starting points. Override them when the artifact content clearly
belongs elsewhere, and record the reason in the classification JSON.
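
The artifact-to-bucket table can be encoded as a lookup that seeds the classifier. `default_buckets` is a hypothetical helper, and the filename patterns are assumptions about how artifacts happen to be named; the artifact content still decides the final buckets.

```bash
# Sketch only: default_buckets is a hypothetical helper, and the patterns are guesses
# at typical filenames. Prints space-separated default buckets, or nothing on no match.
default_buckets() {
  # Normalize "Branding Guide.pdf" and "branding_guide.pdf" to "branding-guide.pdf".
  name="$(basename "$1" | tr '[:upper:]' '[:lower:]' | sed 's/[ _]/-/g')"
  case "$name" in
    *business-plan*|*final-refinement*|*business-canvas*) echo "cockpit" ;;
    *competitor-landscape*) echo "cockpit wings left-engine right-engine" ;;
    *customer-behavior*) echo "wings left-engine" ;;
    *desirability*|*problem-interview*) echo "wings" ;;
    *viability*) echo "fuel-tank" ;;
    *feasibility*) echo "fuselage wings" ;;
    *traction-roadmap*) echo "cockpit wings left-engine right-engine fuel-tank fuselage" ;;
    *branding-guide*|*landing-page*) echo "left-engine" ;;
    *elevator-pitch*|*irresistible-offer*) echo "left-engine right-engine" ;;
  esac
}
```

An empty result means no default applies, so the classifier should rely on the content-to-bucket map alone.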

Write a classification index:

```text
artifacts/_classified/index.json
```

The index should list every source artifact with a `status` of `classified`,
`unreadable`, or `skipped_by_user`, plus the paths to its parsed and classified
outputs.

## Synthesize

Set setup state:

```bash
update_setup_step "step-2.synthesizing"
mkdir -p "$STATE_DIR/synthesis/buckets"
```

Use the classified JSON as the source of truth. Do not synthesize directly from
memory or chat. For each bucket, gather matching facts, quotes, gaps, and
contradictions:

```bash
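for bucket in cockpit wings left-engine right-engine fuel-tank fuselage; do
  # The glob also matches index.json, so keep only objects tagged with this bucket.
  jq -s --arg bucket "$bucket" '
    map(select(type == "object" and any(.buckets[]?; .bucket == $bucket))) as $items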
    | {
        bucket: $bucket,
        sources: $items | map(.source),
        key_facts: $items | map(.key_facts[]? | select((.buckets // []) | index($bucket))),
        gaps: $items | map(.gaps[]? | select(.bucket == $bucket)),
        contradictions: $items | map(.contradictions[]?)
      }
  ' "$CLASSIFIED_DIR"/*.json > "$STATE_DIR/synthesis/buckets/$bucket.json"
done
```

Write evidence-backed policies and goals:

```text
policies/company.md
policies/cockpit.md
policies/wings.md
policies/left-engine.md
policies/right-engine.md
policies/fuel-tank.md
policies/fuselage.md
goals/company.md
goals/{bucket}.md
```

For each bucket, start from `state/templates/bucket-policy.md` and
`state/templates/bucket-goals.md`, then overwrite the seeded files with grounded
content. Each bucket policy must include:

- Bucket responsibility using the Business Made Simple airplane model.
- Evidence summary with citations.
- Standards for the bucket's work.
- Decision rules and operating constraints.
- Review expectations and blast-radius defaults.
- Open questions.

Each bucket goals file must include:

- Top-level outcomes the bucket owns.
- Active initiatives or candidate projects pulled from evidence.
- Open questions and gaps.
- Evidence citations for every business-specific claim.

Use this citation format in markdown:

```text
> "short supporting quote" — source: artifacts/{filename}
```

Business-specific claims require artifact citations. If a claim is useful but
not yet evidenced, write it as a gap, not a fact.
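
A quick way to catch ungrounded citations is to confirm that every cited artifact exists. This is a minimal sketch; `check_citations` is a hypothetical helper, and it matches only the `source: artifacts/...` tail of the citation format above.

```bash
# Sketch only: check_citations is a hypothetical helper.
# Usage: check_citations <markdown-file> <base-dir>
# Prints each cited artifact path missing under <base-dir>; returns 1 if any
# citation is dangling or the file contains no citations at all.
check_citations() {
  cited="$(grep -o 'source: artifacts/.*$' "$1" \
    | sed 's/^source: //; s/[[:space:]]*$//' | sort -u)"
  [ -n "$cited" ] || { echo "no citations in $1" >&2; return 1; }
  status=0
  # Read line by line so filenames containing spaces survive.
  while IFS= read -r rel; do
    [ -f "$2/$rel" ] || { printf '%s\n' "$rel"; status=1; }
  done <<EOF
$cited
EOF
  return "$status"
}
```

Running it over each file in `policies/` and `goals/` with `$BASE` as the second argument flags citations that point at files the user never supplied.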

Create or update `goals/company.md` from cross-bucket facts with these sections:

- Vision and mission
- Company-level outcomes
- Cross-bucket priorities
- Evidence map
- Open gaps

## CEO And C-Level Files

Set setup state:

```bash
update_setup_step "step-2.agent-files"
```

Write or overwrite:

```text
agents/ceo/AGENTS.md
```

Use `state/templates/ceo-agents.md` as the base. Keep these sections:
Identity, Context, Standards, Workflow, Handoff. Then append a
`Grounded Business Context` section containing:

- Company mission and current priorities.
- Bucket summary table for all six buckets.
- Cross-bucket standards.
- Top unresolved gaps and skipped questions.
- Pointers to `policies/company.md`, `goals/company.md`, all bucket policies,
  all bucket goals, `governance/gap-report.md`, and `governance/gap-report.json`.

Use commands equivalent to:

```bash
mkdir -p "$BASE/agents/ceo"
cp "$STATE_DIR/templates/ceo-agents.md" "$BASE/agents/ceo/AGENTS.md"
COMPANY_NAME="$(jq -r '.companyName // "Company"' "$SETUP_STATE")"
COMPANY_PREFIX="$(jq -r '.companyPrefix // "company"' "$SETUP_STATE")"
COMPANY_NAME="$COMPANY_NAME" COMPANY_PREFIX="$COMPANY_PREFIX" \
  perl -0pi -e 's/\{\{\s*company_name\s*\}\}/$ENV{COMPANY_NAME}/g; s/\{\{\s*company_prefix\s*\}\}/$ENV{COMPANY_PREFIX}/g' \
  "$BASE/agents/ceo/AGENTS.md"
```
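
The pointer list for the CEO's `Grounded Business Context` section is mechanical and could be generated along these lines. `ceo_pointer_list` is a hypothetical helper and the bullet wording is an assumption; the mission, bucket summary table, standards, and gaps still have to come from synthesis.

```bash
# Sketch only: ceo_pointer_list is a hypothetical helper.
# Prints the pointer bullets; redirect or append the output where the section needs them.
ceo_pointer_list() {
  printf '%s\n' "- Company policy: policies/company.md"
  printf '%s\n' "- Company goals: goals/company.md"
  for bucket in cockpit wings left-engine right-engine fuel-tank fuselage; do
    printf '%s\n' "- $bucket: policies/$bucket.md, goals/$bucket.md"
  done
  printf '%s\n' "- Gap report: governance/gap-report.md and governance/gap-report.json"
}

# Usage: ceo_pointer_list >> "$BASE/agents/ceo/AGENTS.md"
```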

Pre-stage, but do not register yet:

```text
agents/cpo/AGENTS.md
agents/cmo/AGENTS.md
agents/cro/AGENTS.md
agents/cfo/AGENTS.md
agents/coo/AGENTS.md
```

Use `state/templates/c-level-agents.md` as the base. Write each file with the
right bucket:

| Role | Bucket |
|---|---|
| `cpo` | `wings` |
| `cmo` | `left-engine` |
| `cro` | `right-engine` |
| `cfo` | `fuel-tank` |
| `coo` | `fuselage` |

For each C-level file, append a `Pre-Staged Context` section with links to the
bucket policy, bucket goals, company policy, gap report, and bucket-specific
classification summary. Do not register these C-level agents in Paperclip yet.

Use commands equivalent to:

```bash
for spec in \
  "cpo:wings:Chief Product Officer" \
  "cmo:left-engine:Chief Marketing Officer" \
  "cro:right-engine:Chief Revenue Officer" \
  "cfo:fuel-tank:Chief Financial Officer" \
  "coo:fuselage:Chief Operating Officer"; do
  IFS=":" read -r role bucket title <<EOF
$spec
EOF
  mkdir -p "$BASE/agents/$role"
  cp "$STATE_DIR/templates/c-level-agents.md" "$BASE/agents/$role/AGENTS.md"
  C_LEVEL_ROLE="$title" COMPANY_NAME="$COMPANY_NAME" BUCKET="$bucket" \
    perl -0pi -e 's/\{\{\s*c_level_role\s*\}\}/$ENV{C_LEVEL_ROLE}/g; s/\{\{\s*company_name\s*\}\}/$ENV{COMPANY_NAME}/g; s/\{\{\s*bucket\s*\}\}/$ENV{BUCKET}/g' \
    "$BASE/agents/$role/AGENTS.md"
  cat >> "$BASE/agents/$role/AGENTS.md" <<EOF

## Pre-Staged Context

- Bucket policy: policies/$bucket.md
- Bucket goals: goals/$bucket.md
- Company policy: policies/company.md
- Gap report: governance/gap-report.md and governance/gap-report.json
- Classification summary: state/synthesis/buckets/$bucket.json

EOF
done
```

## Register CEO

Register or update only `ceo` in Paperclip at the end of Step 2.

Set setup state:

```bash
update_setup_step "step-2.ceo-registration"
```

Use the role adapter. The payload must use product role `ceo`, Paperclip role
`ceo`, `adapterType: $ADAPTER_TYPE`, and `metadata.bucket: cockpit`. Do not set
a top-level `agentsMdPath`; the path goes under `metadata`.

If a stub CEO was created in Step 1, update it rather than creating a duplicate.
The live update route is `PATCH /api/agents/{agentId}?companyId={companyId}`.
The create route is `POST /api/companies/{companyId}/agents`.

```bash
CEO_AGENT_ID="$(jq -r '.agents.ceo.paperclipAgentId // empty' "$REGISTRY")"

# Build the payload with jq so paths and adapter names are JSON-escaped.
jq -n \
  --arg adapterType "$ADAPTER_TYPE" \
  --arg agentsMdPath "$BASE/agents/ceo/AGENTS.md" \
  '{
    name: "ceo",
    role: "ceo",
    adapterType: $adapterType,
    adapterConfig: {},
    runtimeConfig: {},
    metadata: {
      productRole: "ceo",
      bucket: "cockpit",
      agentsMdPath: $agentsMdPath
    }
  }' > "$STATE_DIR/ceo-agent-payload.json"

if [ -n "$CEO_AGENT_ID" ]; then
  curl -fsS -X PATCH "$PAPERCLIP_URL/api/agents/$CEO_AGENT_ID?companyId=$PAPERCLIP_COMPANY_ID" \
    -H "Content-Type: application/json" \
    -d @"$STATE_DIR/ceo-agent-payload.json" \
    -o "$STATE_DIR/ceo-agent.json"
else
  curl -fsS -X POST "$PAPERCLIP_URL/api/companies/$PAPERCLIP_COMPANY_ID/agents" \
    -H "Content-Type: application/json" \
    -d @"$STATE_DIR/ceo-agent-payload.json" \
    -o "$STATE_DIR/ceo-agent.json"
  CEO_AGENT_ID="$(jq -r '.id // .data.id // .agent.id // empty' "$STATE_DIR/ceo-agent.json")"
fi

test -n "$CEO_AGENT_ID" || {
  echo "Could not determine CEO Paperclip agent ID."
  exit 1
}

curl -fsS -X PATCH "$PAPERCLIP_URL/api/agents/$CEO_AGENT_ID/instructions-bundle?companyId=$PAPERCLIP_COMPANY_ID" \
  -H "Content-Type: application/json" \
  -d "$(jq -nc --arg rootPath "$BASE/agents/ceo" '{mode: "external", rootPath: $rootPath, entryFile: "AGENTS.md"}')"
```

Merge the CEO into the registry:

```bash
tmp="$(mktemp)"
jq \
  --arg companyId "$PAPERCLIP_COMPANY_ID" \
  --arg name "ceo" \
  --arg id "$CEO_AGENT_ID" \
  --arg path "$BASE/agents/ceo/AGENTS.md" \
  '.companyId = $companyId
   | .agents = (.agents // {})
   | .agents[$name] = {
      paperclipAgentId: $id,
      productRole: "ceo",
      paperclipRole: "ceo",
      bucket: "cockpit",
      agentsMdPath: $path
    }' "$REGISTRY" > "$tmp" && mv "$tmp" "$REGISTRY"
```

## Gap Report

Write:

```text
governance/gap-report.md
governance/gap-report.json
```

Use `governance/gap-report.json` as the machine-readable source and derive the
markdown from it for human review. JSON shape:

```json
[
  {
    "bucket": "fuel-tank",
    "missing_evidence": "No pricing or runway detail found.",
    "why_it_matters": "CFO goals and constraints cannot be grounded.",
    "micro_interview_questions": [
      "What is the current monthly revenue?",
      "What is the current runway?"
    ],
    "skip_decision": null,
    "follow_up_issue_id": null,
    "sources": ["artifacts/_classified/example.json"]
  }
]
```

Include all of this in the markdown:

- Bucket
- Missing evidence
- Why it matters
- 3-5 targeted micro-interview questions when needed
- User skip decision, if any
- Follow-up issue to raise in Step 3 if skipped

Only ask micro-interview questions for real gaps. Keep each micro-interview to
3-5 targeted questions. If the user skips, write the skip decision into both
`gap-report.json` and `gap-report.md`, and turn that gap into a Step 3
first-conversation issue.
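
Deriving the markdown from the JSON keeps the two files from drifting apart. The following is a rough sketch, assuming `jq` is on `PATH` and the JSON matches the shape above; `render_gap_report` is a hypothetical helper and the heading layout is an assumption.

```bash
# Sketch only: render_gap_report is a hypothetical helper.
# Usage: render_gap_report <gap-report.json> > <gap-report.md>
render_gap_report() {
  jq -r '
    "# Gap Report\n",
    (.[] |
      "## \(.bucket)\n",
      "- Missing evidence: \(.missing_evidence)",
      "- Why it matters: \(.why_it_matters)",
      "- Skip decision: \(.skip_decision // "none")",
      "- Follow-up issue: \(.follow_up_issue_id // "none")",
      "- Questions:",
      (.micro_interview_questions[]? | "  - \(.)"),
      "")
  ' "$1"
}
```

Rerun it whenever a skip decision or follow-up issue ID is written to the JSON, rather than editing the markdown directly.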

Contradiction entries should use this shape in classification JSON:

```json
{
  "description": "Two artifacts disagree on target customer size.",
  "sources": ["artifacts/a.md", "artifacts/b.md"],
  "affected_buckets": ["cockpit", "right-engine"]
}
```

## Step 2 Done

Set final setup state and report:

```bash
update_setup_step "step-2.done"
```

Report parsed file count, classified file count, unreadable file count, policies
written, CEO agent ID, C-level files pre-staged, and the top unresolved gaps.
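
The file counts in the report can be read from the filesystem instead of being recalled from the conversation. This is a minimal sketch; `step2_counts` is a hypothetical helper, and it assumes one JSON object per line in `_errors.jsonl`.

```bash
# Sketch only: step2_counts is a hypothetical helper.
# Usage: step2_counts <artifacts-dir>
step2_counts() {
  # Text mirrors only; the .jsonl logs in _parsed are not counted.
  parsed="$(find "$1/_parsed" -maxdepth 1 -type f -name '*.txt' | wc -l | tr -d ' ')"
  # index.json is the classification index, not a classified artifact.
  classified="$(find "$1/_classified" -maxdepth 1 -type f -name '*.json' ! -name 'index.json' \
    | wc -l | tr -d ' ')"
  unreadable=0
  if [ -f "$1/_parsed/_errors.jsonl" ]; then
    unreadable="$(wc -l < "$1/_parsed/_errors.jsonl" | tr -d ' ')"
  fi
  printf 'parsed=%s classified=%s unreadable=%s\n' "$parsed" "$classified" "$unreadable"
}
```

Calling `step2_counts "$ARTIFACTS_DIR"` gives the first three numbers for the report; the policy list, CEO agent ID, and gaps come from the earlier steps.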