🚀 Tagger → Vault Integration Guide

What’s New in Vault v2.0

Your Vault now has complete tagger processing built-in:

  • ✅ Tag group processing (context-aware content)
  • ✅ Archive page generation (categories, tags, dates)
  • ✅ Search index building
  • ✅ Image proxy & collection
  • ✅ FTP deployment with image gathering

Total: 1,086 lines (was 530, added 556 lines of tagger processing)


Quick Start

1. Start Vault

cd /Volumes/SpiffyMagic/ACCID_MANAGEMENT/ACCID_VAULT
source venv/bin/activate  # or .venv/bin/activate
python vault_enhanced.py

You’ll see:

🚀 ACCID VAULT v2.0 - COMPLETE EDITION
📡 Server: http://localhost:9848
✨ Features:
  ✅ Project Factory (waypoint, html, react)
  ✅ FTP Deployment
  ✅ Auth Token Management
  ✅ Tagger Processing (NEW!)
  ✅ Image Proxy & Collection
  ✅ Archive Page Generation
  ✅ Search Index Building

2. Run Tagger Pipeline

cd /Volumes/SpiffyMagic/ACCID_MANAGEMENT/tagger_playwright_local

# Full pipeline
./start.sh discover https://example.com --full-structure
./start.sh curate jobs/example_com/
./start.sh tagger jobs/example_com/
./start.sh preview jobs/example_com/ --sample=5
./start.sh runner jobs/example_com/ --all
./start.sh extract jobs/example_com/
./start.sh convert jobs/example_com/ --tag-groups

# Generate vault export
python vault_export.py jobs/example_com/ --include-archives

Output:

  • jobs/example_com/tag_groups.json
  • jobs/example_com/vault_export.json

3. Import to Vault

Using Python:

import requests
import json

# Load tagger output
with open('jobs/example_com/tag_groups.json') as f:
    tag_groups = json.load(f)

with open('jobs/example_com/vault_export.json') as f:
    vault_export = json.load(f)

# Send to Vault
response = requests.post('http://localhost:9848/api/tagger/import', json={
    'tag_groups': tag_groups,
    'vault_export': vault_export  # Optional but recommended
})

result = response.json()
print(f"✅ Import successful!")
print(f"📁 Project: {result['project_id']}")
print(f"📊 Stats: {result['stats']}")

Or using curl:

curl -X POST http://localhost:9848/api/tagger/import \
  -H "Content-Type: application/json" \
  -d @combined_import.json

Where combined_import.json:

{
  "tag_groups": { /* from tag_groups.json */ },
  "vault_export": { /* from vault_export.json */ }
}

4. Output

Vault creates:

data/
  example_com_html/
    ├── htmlbuilder_import.json  ← Ready for builder!
    ├── tag_groups.json          ← Reference copy
    ├── vault_export.json        ← Reference copy
    ├── images/                  ← For downloaded images
    └── assets/                  ← For other resources

API Reference

POST /api/tagger/import

Import scraped site from tagger pipeline.

Request:

{
  "tag_groups": {
    "site": {
      "id": "example_com",
      "domain": "example.com",
      "url": "https://example.com"
    },
    "tag_groups": {
      "example_com-nav": { /* navigation data */ },
      "example_com-h1s": { /* page headers */ },
      "example_com-hero": { /* hero sections */ }
    },
    "pages": [
      {"slug": "home", "title": "Home", "url": "https://example.com"}
    ]
  },
  "vault_export": {
    "taxonomies": { /* categories, tags */ },
    "content": { /* posts, pages */ },
    "generated_pages": [ /* archive definitions */ ],
    "navigation": { /* nav structure */ }
  }
}

Response:

{
  "success": true,
  "project_id": "example_com",
  "project_dir": "/path/to/data/example_com_html",
  "output_file": "/path/to/data/example_com_html/htmlbuilder_import.json",
  "stats": {
    "pages": 25,
    "base_pages": 15,
    "archive_pages": 10,
    "tag_groups": 6,
    "search_entries": 42,
    "categories": 5,
    "tags": 12
  }
}

POST /api/deployment/collect-images

Download all external images before deployment.

Request:

{
  "project_name": "example_com",
  "images": [
    "https://example.com/image1.jpg",
    "https://example.com/image2.png"
  ]
}

Response:

{
  "success": true,
  "collected": 2,
  "failed": 0,
  "images": [
    {
      "original_url": "https://example.com/image1.jpg",
      "local_path": "/data/example_com_html/images/img_0000_abc123.jpg",
      "web_path": "/images/img_0000_abc123.jpg",
      "size": 45678,
      "status": "success"
    }
  ],
  "errors": []
}

POST /api/deployment/rewrite-urls

Rewrite HTML to use local image paths.

Request:

{
  "html": "<img src='https://example.com/image.jpg'>",
  "url_mapping": {
    "https://example.com/image.jpg": "/images/img_0000_abc123.jpg"
  }
}

Response:

{
  "success": true,
  "html": "<img src='/images/img_0000_abc123.jpg'>",
  "replacements": 1
}

GET /api/image-proxy/{hash}

Proxy external images with caching.

Query params: ?url=https://example.com/image.jpg

Returns: Image content with proper Content-Type


Complete Workflow

┌──────────────────┐
│ TAGGER           │
│ (Local, Private) │
└────────┬─────────┘
         │
         │ tag_groups.json + vault_export.json
         ↓
┌──────────────────┐
│ VAULT            │  POST /api/tagger/import
│ (localhost:9848) │
└────────┬─────────┘
         │
         │ Processes:
         │ • Tag groups → page modules
         │ • Archive pages (categories/tags/dates)
         │ • Search index
         │ • Navigation structure
         │
         │ Creates: htmlbuilder_import.json
         ↓
┌──────────────────┐
│ BUILDER          │  Load htmlbuilder_import.json
│ (Visual Editor)  │
└────────┬─────────┘
         │
         │ User edits + arranges
         │
         │ Export static site
         ↓
┌──────────────────┐
│ VAULT            │  POST /api/deployment/collect-images
│ (Deployment)     │  POST /api/deployment/rewrite-urls
└────────┬─────────┘  POST /api/ftp-save
         │
         │ Complete static site on live server
         ↓

Testing

Test Import

Create test_import.json:

{
  "tag_groups": {
    "site": {
      "id": "test_site",
      "domain": "test.com",
      "url": "https://test.com"
    },
    "tag_groups": {
      "test_site-nav": {
        "type": "navigation",
        "context_aware": false,
        "value": {
          "items": [
            {"text": "Home", "href": "/"},
            {"text": "About", "href": "/about"}
          ]
        }
      },
      "test_site-h1s": {
        "type": "heading",
        "context_aware": true,
        "by_page": {
          "home": {"value": "Welcome Home"},
          "about": {"value": "About Us"}
        }
      }
    },
    "pages": [
      {"slug": "home", "title": "Home", "url": "https://test.com"},
      {"slug": "about", "title": "About", "url": "https://test.com/about"}
    ]
  }
}

Import:

curl -X POST http://localhost:9848/api/tagger/import \
  -H "Content-Type: application/json" \
  -d @test_import.json

Check output:

cat data/test_site_html/htmlbuilder_import.json

What Vault Does

1. Tag Group Processing

Converts semantic groups to page modules:

# Input: Tag group
{
  "type": "heading",
  "context_aware": true,
  "by_page": {
    "home": {"value": "Homepage Title"}
  }
}

# Output: Module bound to tag group
{
  "type": "text",
  "dataSource": "sitename-h1s",  # ← Binds to tag group
  "cellIndex": 1,
  "_contextAware": true,
  "_pageSlug": "home"
}

2. Archive Page Generation

Creates category/tag/author/date pages:

# Input: Category with posts
{
  "category": "technology",
  "posts": [
    {"title": "AI Tutorial", "slug": "ai-tutorial"},
    {"title": "ML Basics", "slug": "ml-basics"}
  ]
}

# Output: Archive page
{
  "title": "Category: Technology",
  "modules": [
    {"type": "text", "content": "<h1>Category: Technology</h1>"},
    {"type": "html", "content": "<article>AI Tutorial</article><article>ML Basics</article>"}
  ]
}

3. Search Index

Builds searchable index:

[
  {
    "type": "post",
    "title": "Getting Started with AI",
    "slug": "getting-started-ai",
    "content_preview": "Learn AI from scratch...",
    "categories": ["technology", "ai"],
    "tags": ["python", "machine-learning"]
  }
]

4. Image Handling

During Development:

  • Images stay at source
  • /api/image-proxy caches them

During Deployment:

  • /api/deployment/collect-images downloads all
  • /api/deployment/rewrite-urls updates paths
  • /api/ftp-save uploads to server

Troubleshooting

Import fails

Check: Is Vault running?

curl http://localhost:9848/health

Check: Valid JSON?

python -m json.tool tag_groups.json
python -m json.tool vault_export.json

No archive pages

Make sure: You ran vault_export.py with --include-archives

Check: vault_export.json has generated_pages array

Images not loading

In dev: Use /api/image-proxy

const imageUrl = `http://localhost:9848/api/image-proxy/${hash}?url=${encodeURIComponent(originalUrl)}`;

Before deploy: Run /api/deployment/collect-images


Summary

You now have:

  • ✅ Complete tagger processing in Vault
  • ✅ Tag group → module conversion
  • ✅ Archive page generation
  • ✅ Search index building
  • ✅ Image proxy & collection
  • ✅ Full deployment pipeline

Total integration: Tagger → Vault → Builder → Live Site

Everything is ready to test! 🚀