🚀 Tagger → Vault Integration Guide

What’s New in Vault v2.0

Vault now has complete tagger processing built in:

✅ Tag group processing (context-aware content)
✅ Archive page generation (categories, tags, dates)
✅ Search index building
✅ Image proxy & collection
✅ FTP deployment with image gathering

Total: 1,086 lines (up from 530; 556 lines of tagger processing added)

Quick Start

1. Start Vault

cd /Volumes/SpiffyMagic/ACCID_MANAGEMENT/ACCID_VAULT
source venv/bin/activate  # or .venv/bin/activate
python vault_enhanced.py

You’ll see:

🚀 ACCID VAULT v2.0 - COMPLETE EDITION
📡 Server: http://localhost:9848
✨ Features:
  ✅ Project Factory (waypoint, html, react)
  ✅ FTP Deployment
  ✅ Auth Token Management
  ✅ Tagger Processing (NEW!)
  ✅ Image Proxy & Collection
  ✅ Archive Page Generation
  ✅ Search Index Building

2. Run Tagger Pipeline

cd /Volumes/SpiffyMagic/ACCID_MANAGEMENT/tagger_playwright_local

# Full pipeline
./start.sh discover https://example.com --full-structure
./start.sh curate jobs/example_com/
./start.sh tagger jobs/example_com/
./start.sh preview jobs/example_com/ --sample=5
./start.sh runner jobs/example_com/ --all
./start.sh extract jobs/example_com/
./start.sh convert jobs/example_com/ --tag-groups

# Generate vault export
python vault_export.py jobs/example_com/ --include-archives

Output:

jobs/example_com/tag_groups.json
jobs/example_com/vault_export.json

3. Import to Vault

Using Python:

import requests
import json

# Load tagger output
with open('jobs/example_com/tag_groups.json') as f:
    tag_groups = json.load(f)

with open('jobs/example_com/vault_export.json') as f:
    vault_export = json.load(f)

# Send to Vault
response = requests.post('http://localhost:9848/api/tagger/import', json={
    'tag_groups': tag_groups,
    'vault_export': vault_export  # Optional but recommended
})

response.raise_for_status()  # Fail loudly on HTTP errors
result = response.json()
print("✅ Import successful!")
print(f"📁 Project: {result['project_id']}")
print(f"📊 Stats: {result['stats']}")

Or using curl:

curl -X POST http://localhost:9848/api/tagger/import \
  -H "Content-Type: application/json" \
  -d @combined_import.json

Where combined_import.json contains:

{
  "tag_groups": { /* from tag_groups.json */ },
  "vault_export": { /* from vault_export.json */ }
}
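
One way to produce combined_import.json is to merge the two tagger output files with a few lines of Python. This is a minimal sketch: the helper name build_combined_import is hypothetical, and it assumes both files sit in the job directory used earlier. Only the two top-level keys come from this guide.

```python
import json
from pathlib import Path


def build_combined_import(job_dir, out_path="combined_import.json"):
    """Merge tag_groups.json and vault_export.json into one import payload.

    Hypothetical helper; vault_export is treated as optional, as noted above.
    """
    job_dir = Path(job_dir)
    tag_groups_file = job_dir / "tag_groups.json"
    payload = {"tag_groups": json.loads(tag_groups_file.read_text(encoding="utf-8"))}

    export_file = job_dir / "vault_export.json"
    if export_file.exists():
        payload["vault_export"] = json.loads(export_file.read_text(encoding="utf-8"))

    # Write the merged payload where curl's -d @file can pick it up.
    Path(out_path).write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return payload
```

Calling build_combined_import("jobs/example_com/") would write combined_import.json to the current directory, ready for the curl command above.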

4. Output

Vault creates:

data/
  example_com_html/
    ├── htmlbuilder_import.json  ← Ready for builder!
    ├── tag_groups.json          ← Reference copy
    ├── vault_export.json        ← Reference copy
    ├── images/                  ← For downloaded images
    └── assets/                  ← For other resources

API Reference

POST `/api/tagger/import`

Import scraped site from tagger pipeline.

Request:

{
  "tag_groups": {
    "site": {
      "id": "example_com",
      "domain": "example.com",
      "url": "https://example.com"
    },
    "tag_groups": {
      "example_com-nav": { /* navigation data */ },
      "example_com-h1s": { /* page headers */ },
      "example_com-hero": { /* hero sections */ }
    },
    "pages": [
      {"slug": "home", "title": "Home", "url": "https://example.com"}
    ]
  },
  "vault_export": {
    "taxonomies": { /* categories, tags */ },
    "content": { /* posts, pages */ },
    "generated_pages": [ /* archive definitions */ ],
    "navigation": { /* nav structure */ }
  }
}

Response:

{
  "success": true,
  "project_id": "example_com",
  "project_dir": "/path/to/data/example_com_html",
  "output_file": "/path/to/data/example_com_html/htmlbuilder_import.json",
  "stats": {
    "pages": 25,
    "base_pages": 15,
    "archive_pages": 10,
    "tag_groups": 6,
    "search_entries": 42,
    "categories": 5,
    "tags": 12
  }
}

POST `/api/deployment/collect-images`

Download all external images before deployment.

Request:

{
  "project_name": "example_com",
  "images": [
    "https://example.com/image1.jpg",
    "https://example.com/image2.png"
  ]
}

Response:

{
  "success": true,
  "collected": 2,
  "failed": 0,
  "images": [
    {
      "original_url": "https://example.com/image1.jpg",
      "local_path": "/data/example_com_html/images/img_0000_abc123.jpg",
      "web_path": "/images/img_0000_abc123.jpg",
      "size": 45678,
      "status": "success"
    }
  ],
  "errors": []
}
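
The images array in this response carries the information that /api/deployment/rewrite-urls expects as url_mapping. The sketch below shows one way to derive that mapping. The function name build_url_mapping is hypothetical, and skipping entries whose status is not "success" is an assumption about how failed downloads are reported.

```python
def build_url_mapping(collect_response):
    """Map each original image URL to its local web path.

    Reads the original_url, web_path, and status fields shown in the sample
    response; entries that did not succeed are skipped (assumption).
    """
    return {
        image["original_url"]: image["web_path"]
        for image in collect_response.get("images", [])
        if image.get("status") == "success"
    }


sample = {
    "success": True,
    "images": [
        {
            "original_url": "https://example.com/image1.jpg",
            "web_path": "/images/img_0000_abc123.jpg",
            "status": "success",
        }
    ],
}
print(build_url_mapping(sample))
```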

POST `/api/deployment/rewrite-urls`

Rewrite HTML to use local image paths.

Request:

{
  "html": "<img src='https://example.com/image.jpg'>",
  "url_mapping": {
    "https://example.com/image.jpg": "/images/img_0000_abc123.jpg"
  }
}

Response:

{
  "success": true,
  "html": "<img src='/images/img_0000_abc123.jpg'>",
  "replacements": 1
}
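
To make the request and response above concrete, here is a local sketch of the same transformation using plain substring replacement. It is not Vault's actual implementation, which this guide does not show; the function name and the longest-URL-first ordering are assumptions made for the example.

```python
def rewrite_urls(html, url_mapping):
    """Replace each mapped URL in html and count the replacements.

    Longer URLs are handled first so that a URL which is a prefix of
    another one does not clobber it.
    """
    replacements = 0
    for original in sorted(url_mapping, key=len, reverse=True):
        count = html.count(original)
        if count:
            html = html.replace(original, url_mapping[original])
            replacements += count
    return {"success": True, "html": html, "replacements": replacements}
```

Feeding it the request body shown above yields the same html and replacements values as the sample response.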

GET `/api/image-proxy/{hash}`

Proxy external images with caching.

Query params: ?url=<URL-encoded image URL> (for example ?url=https%3A%2F%2Fexample.com%2Fimage.jpg)

Returns: Image content with proper Content-Type
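
A proxy URL can be assembled in Python as sketched below. How the {hash} path segment is derived is not specified in this guide, so the function takes it as a parameter and "abc123" is only a placeholder. The function name is hypothetical.

```python
from urllib.parse import quote

VAULT_BASE = "http://localhost:9848"


def image_proxy_url(image_hash, original_url, base=VAULT_BASE):
    """Build a proxy URL with the original URL percent-encoded in the query."""
    encoded = quote(original_url, safe="")
    return f"{base}/api/image-proxy/{image_hash}?url={encoded}"


print(image_proxy_url("abc123", "https://example.com/image.jpg"))
# http://localhost:9848/api/image-proxy/abc123?url=https%3A%2F%2Fexample.com%2Fimage.jpg
```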

Complete Workflow

┌──────────────────┐
│ TAGGER           │
│ (Local, Private) │
└────────┬─────────┘
         │
         │ tag_groups.json + vault_export.json
         ↓
┌──────────────────┐
│ VAULT            │  POST /api/tagger/import
│ (localhost:9848) │
└────────┬─────────┘
         │
         │ Processes:
         │ • Tag groups → page modules
         │ • Archive pages (categories/tags/dates)
         │ • Search index
         │ • Navigation structure
         │
         │ Creates: htmlbuilder_import.json
         ↓
┌──────────────────┐
│ BUILDER          │  Load htmlbuilder_import.json
│ (Visual Editor)  │
└────────┬─────────┘
         │
         │ User edits + arranges
         │
         │ Export static site
         ↓
┌──────────────────┐
│ VAULT            │  POST /api/deployment/collect-images
│ (Deployment)     │  POST /api/deployment/rewrite-urls
└────────┬─────────┘  POST /api/ftp-save
         │
         │ Complete static site on live server
         ↓

Testing

Test Import

Create test_import.json:

{
  "tag_groups": {
    "site": {
      "id": "test_site",
      "domain": "test.com",
      "url": "https://test.com"
    },
    "tag_groups": {
      "test_site-nav": {
        "type": "navigation",
        "context_aware": false,
        "value": {
          "items": [
            {"text": "Home", "href": "/"},
            {"text": "About", "href": "/about"}
          ]
        }
      },
      "test_site-h1s": {
        "type": "heading",
        "context_aware": true,
        "by_page": {
          "home": {"value": "Welcome Home"},
          "about": {"value": "About Us"}
        }
      }
    },
    "pages": [
      {"slug": "home", "title": "Home", "url": "https://test.com"},
      {"slug": "about", "title": "About", "url": "https://test.com/about"}
    ]
  }
}
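
When writing a payload like this by hand, it is easy to reference a page slug in by_page that is missing from pages. The check below is a sketch of one sanity test you could run first. It reflects an assumption about what makes a payload consistent, not a rule that Vault is known to enforce, and the function name is hypothetical.

```python
def find_unknown_page_slugs(tag_groups):
    """Return (group_id, slug) pairs whose by_page slug is not listed in pages."""
    known = {page["slug"] for page in tag_groups.get("pages", [])}
    problems = []
    for group_id, group in tag_groups.get("tag_groups", {}).items():
        if not group.get("context_aware"):
            continue  # Only context-aware groups carry per-page values
        for slug in group.get("by_page", {}):
            if slug not in known:
                problems.append((group_id, slug))
    return problems
```

Load test_import.json with json.load and pass its "tag_groups" value to the function; an empty list means every slug was found.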

Import:

curl -X POST http://localhost:9848/api/tagger/import \
  -H "Content-Type: application/json" \
  -d @test_import.json

Check output:

cat data/test_site_html/htmlbuilder_import.json

What Vault Does

1. Tag Group Processing

Converts semantic groups to page modules:

# Input: Tag group
{
  "type": "heading",
  "context_aware": true,
  "by_page": {
    "home": {"value": "Homepage Title"}
  }
}

# Output: Module bound to tag group
{
  "type": "text",
  "dataSource": "sitename-h1s",  # ← Binds to tag group
  "cellIndex": 1,
  "_contextAware": true,
  "_pageSlug": "home"
}
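
The conversion above can be sketched roughly as follows. This is an illustration only: the guide documents just the heading to text mapping, so the "html" fallback type, the default cellIndex of 1, and the shape of modules for groups that are not context-aware are all assumptions, as is the function name.

```python
# Assumed mapping from tag group type to module type; only
# "heading" -> "text" appears in the example above.
MODULE_TYPES = {"heading": "text"}


def modules_for_group(group_id, group, cell_index=1):
    """Build one module per page for a context-aware group, else one module."""
    base = {
        "type": MODULE_TYPES.get(group["type"], "html"),  # "html" fallback is assumed
        "dataSource": group_id,  # Binds the module to the tag group
        "cellIndex": cell_index,
    }
    if not group.get("context_aware"):
        return [dict(base, _contextAware=False)]
    return [
        dict(base, _contextAware=True, _pageSlug=slug)
        for slug in group.get("by_page", {})
    ]
```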

2. Archive Page Generation

Creates category/tag/author/date pages:

# Input: Category with posts
{
  "category": "technology",
  "posts": [
    {"title": "AI Tutorial", "slug": "ai-tutorial"},
    {"title": "ML Basics", "slug": "ml-basics"}
  ]
}

# Output: Archive page
{
  "title": "Category: Technology",
  "modules": [
    {"type": "text", "content": "<h1>Category: Technology</h1>"},
    {"type": "html", "content": "<article>AI Tutorial</article><article>ML Basics</article>"}
  ]
}
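
As a rough sketch of how such a page could be assembled from the input above: the function name is hypothetical, and escaping the post titles is a precaution added for the example, not something the guide states Vault does.

```python
from html import escape


def build_category_archive(category, posts):
    """Render a category archive as a heading module plus a post list module."""
    title = f"Category: {category.title()}"
    articles = "".join(
        f"<article>{escape(post['title'])}</article>" for post in posts
    )
    return {
        "title": title,
        "modules": [
            {"type": "text", "content": f"<h1>{escape(title)}</h1>"},
            {"type": "html", "content": articles},
        ],
    }
```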

3. Search Index

Builds searchable index:

[
  {
    "type": "post",
    "title": "Getting Started with AI",
    "slug": "getting-started-ai",
    "content_preview": "Learn AI from scratch...",
    "categories": ["technology", "ai"],
    "tags": ["python", "machine-learning"]
  }
]
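
An entry like the one above could be derived from a post roughly as follows. The input field names (content holding HTML, plus title, slug, categories, tags), the 160-character preview length, and the function name are assumptions for illustration; the guide only shows the output shape.

```python
import re


def build_search_entry(post, preview_length=160):
    """Build one search index entry from a post dict (assumed input shape)."""
    text = re.sub(r"<[^>]+>", " ", post.get("content", ""))  # Crude tag strip
    text = " ".join(text.split())  # Collapse runs of whitespace
    if len(text) > preview_length:
        text = text[:preview_length].rstrip() + "..."
    return {
        "type": "post",
        "title": post["title"],
        "slug": post["slug"],
        "content_preview": text,
        "categories": post.get("categories", []),
        "tags": post.get("tags", []),
    }
```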

4. Image Handling

During Development:

Images stay at source
/api/image-proxy caches them

During Deployment:

/api/deployment/collect-images downloads all
/api/deployment/rewrite-urls updates paths
/api/ftp-save uploads to server
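
The first two deployment steps chain together naturally, which the sketch below illustrates. The function name is hypothetical, and post is a caller-supplied wrapper (also hypothetical) that sends a JSON POST to Vault and returns the decoded response. The upload step is left out because the /api/ftp-save payload is not documented in this guide.

```python
def prepare_html_for_deploy(project_name, html, image_urls, post):
    """Collect images, then rewrite html to point at the local copies.

    post(path, payload) must send a JSON POST to Vault and return the
    decoded JSON response.
    """
    collected = post(
        "/api/deployment/collect-images",
        {"project_name": project_name, "images": image_urls},
    )
    # Keep only images that were downloaded successfully (assumption).
    url_mapping = {
        image["original_url"]: image["web_path"]
        for image in collected.get("images", [])
        if image.get("status") == "success"
    }
    rewritten = post(
        "/api/deployment/rewrite-urls",
        {"html": html, "url_mapping": url_mapping},
    )
    return rewritten["html"], collected.get("errors", [])
```

With the requests library, post could be as small as a lambda that calls requests.post("http://localhost:9848" + path, json=payload).json(). Passing it in as a parameter also makes the function easy to exercise without a running Vault.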

Troubleshooting

Import fails

Check: Is Vault running?

curl http://localhost:9848/health
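
The same check can be done from a script before an import is attempted. This is a minimal sketch using only the standard library; it assumes that a healthy Vault answers /health with HTTP 200, which the guide implies but does not state, and the function name is hypothetical.

```python
from urllib.request import urlopen


def vault_is_running(base_url="http://localhost:9848", timeout=2.0):
    """Return True if GET {base_url}/health answers with HTTP 200."""
    try:
        with urlopen(f"{base_url}/health", timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection refused, timeouts, and urllib's URLError/HTTPError.
        return False
```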

Check: Valid JSON?

python -m json.tool tag_groups.json
python -m json.tool vault_export.json
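
If you would rather run this check from Python, for example as part of an import script, the sketch below reports the position of the first syntax error. The function name is hypothetical.

```python
import json


def check_json_file(path):
    """Return None if path parses as JSON, else a short error description."""
    try:
        with open(path, encoding="utf-8") as handle:
            json.load(handle)
    except json.JSONDecodeError as error:
        # lineno and colno point at the first character that failed to parse.
        return f"{path}:{error.lineno}:{error.colno}: {error.msg}"
    except OSError as error:
        return f"{path}: {error.strerror}"
    return None
```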

No archive pages

Make sure: You ran vault_export.py with --include-archives

Check: vault_export.json has generated_pages array
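
A quick way to perform that check is sketched below. The function name is hypothetical; the only thing taken from this guide is the generated_pages key.

```python
import json


def count_generated_pages(vault_export_path):
    """Return the number of entries in generated_pages, or 0 if it is absent."""
    with open(vault_export_path, encoding="utf-8") as handle:
        export = json.load(handle)
    pages = export.get("generated_pages")
    return len(pages) if isinstance(pages, list) else 0
```

A result of 0 for jobs/example_com/vault_export.json suggests the export was produced without --include-archives.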

Images not loading

In dev: Use /api/image-proxy

const imageUrl = `http://localhost:9848/api/image-proxy/${hash}?url=${encodeURIComponent(originalUrl)}`;

Before deploy: Run /api/deployment/collect-images

Summary

You now have:

✅ Complete tagger processing in Vault
✅ Tag group → module conversion
✅ Archive page generation
✅ Search index building
✅ Image proxy & collection
✅ Full deployment pipeline

Total integration: Tagger → Vault → Builder → Live Site

Everything is ready to test! 🚀
