monorepo scaffolding spa-scraping lovable base44 fontsource parallel-agents cross-referencing

SPA Scraping & Second-Wave Scaffolding — Scaling from 14 to 24 Sites

Problem

After scaling from 2 to 14 sites (see bulk-site-scaffolding-and-shared-components.md), the user wanted to import all projects from their portfolio at aaronr.info. The site is a Lovable-built SPA — WebFetch and curl return only a shell <div id="root"></div> with no content.

Investigation & Approach

1. Failed: Standard scraping

WebFetch → "Portfolio" title only, no project data
curl → <div id="root"></div> + JS bundle reference

Root cause: Lovable apps are React SPAs. All content lives in the JavaScript bundle, not the HTML.

2. Solution: JS bundle extraction

Instead of rendering the SPA, we extracted data directly from the compiled JS bundle:

# Step 1: Find the bundle URL from the HTML
curl -sL https://aaronr.info/ | grep -oE 'src="/assets/[^"]*\.js"'
# → src="/assets/index-D_aM2TFg.js"

# Step 2: Extract all URLs from the bundle
curl -sL https://aaronr.info/assets/index-D_aM2TFg.js \
  | grep -oE '"https?://[^"]+"' | sort -u

# Step 3: Extract project name mappings from display logic
curl -sL https://aaronr.info/assets/index-D_aM2TFg.js \
  | tr ',' '\n' | grep -E 'lovable\.app|base44\.app|netlify|mountainlife'

This revealed the name-mapping function in the bundle:

// Bundle contained display name overrides:
g==="Sendy Visor Vibes"?"Sendy Visors"
g==="Pocket Lunch Ascend"?"Pocket Lunch"
g==="Lana Health Planner"?"HealthCal AI"
c==="ido"?"Weddings Lana"
// etc.

Key insight: When an SPA embeds its data in the compiled bundle, as this one did, you don't need to render it. Look for URL arrays, display-name mappings, and config objects in the compiled JS.
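The name-mapping step can be made repeatable with a small filter. The following is a minimal bash sketch, assuming the overrides always take the minified ternary shape shown in the excerpt above (`x==="Raw Name"?"Display Name"`); the function name `extract_name_overrides` is hypothetical, and a differently minified bundle may need a different pattern.

```shell
# Sketch: read minified bundle text on stdin and print "Raw Name => Display Name" pairs.
# Assumes overrides look like  g==="Raw Name"?"Display Name"  (as in the excerpt above).
extract_name_overrides() {
  grep -oE '[A-Za-z_$][A-Za-z0-9_$]*==="[^"]+"\?"[^"]+"' \
    | sed -E 's/^[^=]+==="([^"]+)"\?"([^"]+)"$/\1 => \2/'
}

# Usage (bundle URL taken from Step 1):
#   curl -sL https://aaronr.info/assets/index-D_aM2TFg.js | extract_name_overrides
```

Printing one pair per line keeps the output easy to diff against the existing site list.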

3. Discovered 18 projects, mapped to existing + new

| Source URL | Display Name | Action |
| --- | --- | --- |
| adventure-oasis.lovable.app | Adventure Oasis | Already exists |
| adventure-weddings.lovable.app | Adventure Weddings | Already exists |
| audacious-art.lovable.app | Audacious Art | Already exists |
| lana-health-planner.lovable.app | HealthCal AI | Already exists |
| healthcalai.netlify.app | Health Cal | Alt version of healthcal |
| mountainlife.tech | Final Fest | Already exists (mountain-life) |
| mountainlife.tech/resume | Resume | Already exists (aaronr) |
| ai.aaronr.info | Aaron Consulting | Already exists (squamish-ai) |
| sendy-visor-vibes.lovable.app | Sendy Visors | New site |
| pocket-lunch-ascend.lovable.app | Pocket Lunch | New site |
| email-health-helper.lovable.app | Lana AI | New site |
| unity-of-sound.com | Unity of Sound | New site |
| adventure-product-school.lovable.app | Adventure Product School | New site |
| festivalweddings.ca | Festival Weddings | New site |
| ido.lovable.app | Weddings Lana | New site |
| maker-spark (base44) | Maker Mentor | New site |
| liquid-donations (base44) | Liquid Donations | New site |
| perplexity.ai app | Imagine CNC | New site |

4. Parallel scaffolding: 10 agents

Same pattern as Phase 1 but with cross-references baked in:

  1. Read template files from existing site once
  2. Define per-site config with real scraped content where available
  3. Launch 10 parallel agents (~90 seconds total)
  4. Fix any dependency issues (see Gotcha below)
  5. Update businesses.ts registry (14 → 24 entries)
  6. Update root CLAUDE.md businesses table
  7. Build all 24 sites to verify

Several new sites share context with existing ones. Added cross-references in each site’s CLAUDE.md:

  • Lana AI ↔ HealthCal (health/AI vertical)
  • Festival Weddings ↔ Adventure Weddings (wedding vertical)
  • Weddings Lana ↔ Adventure Weddings (wedding vertical)
  • Unity of Sound ↔ Pirate Radi0 (music/community vertical)
  • Maker Mentor ↔ Create Makerspace (maker vertical)

Gotcha: @fontsource Variable font availability

Problem: One agent chose @fontsource-variable/lato and @fontsource-variable/playfair-display for weddings-lana, and pnpm install failed:

ERR_PNPM_FETCH_404 GET https://registry.npmjs.org/@fontsource-variable%2Flato: Not Found - 404

Root cause: Not all Google Fonts have Variable versions on fontsource. Lato has a static package (@fontsource/lato) but no variable package (@fontsource-variable/lato).

Fix: Use the standard Montserrat/Inter variable fonts that all other sites use:

"@fontsource-variable/montserrat": "^5",
"@fontsource-variable/inter": "^5",

Prevention: When scaffolding sites, stick to the known-good font packages unless explicitly requested. If using a custom font, verify the -variable package exists on npm first:

npm view @fontsource-variable/<font-name> version
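To run that check across every site at once, something like the following bash sketch could work. It assumes font dependencies appear as quoted `"@fontsource-variable/<name>"` strings in each package.json; both function names are hypothetical, and `check_variable_fonts` needs network access to the npm registry.

```shell
# List every @fontsource-variable/* package named in the given package.json files.
list_variable_fonts() {
  grep -ohE '"@fontsource-variable/[a-z0-9-]+"' "$@" | tr -d '"' | sort -u
}

# Ask the npm registry about each one; returns non-zero if any package is missing.
check_variable_fonts() {
  local pkg status=0
  for pkg in $(list_variable_fonts "$@"); do
    if npm view "$pkg" version >/dev/null 2>&1; then
      echo "ok       $pkg"
    else
      echo "MISSING  $pkg"
      status=1
    fi
  done
  return "$status"
}

# Usage, from the monorepo root:
#   check_variable_fonts apps/*/package.json
```

Running it before `pnpm install` surfaces a missing variable package as a one-line report rather than a failed install.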

Key Design Decisions

1. JS bundle scraping over browser automation

Why: No need for Puppeteer/Playwright. The compiled bundle contains all project data as string literals. curl + grep is faster and needs nothing beyond standard shell tools.

Trade-off: Only works for data embedded in bundles. Content loaded dynamically through API calls would need actual rendering, or direct requests to those API endpoints.

2. Present candidates for user approval

Why: Not all discovered projects should be scaffolded. Some are duplicates, some are alt versions, some are experiments. Always present the full list and let the user decide.

Pattern: Scrape → categorize (existing/new/duplicate) → present table → user picks → scaffold chosen ones.
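The categorize step of that pattern can be sketched in bash as below. It assumes two hypothetical input files with one host per line: the scraped hosts and the hosts already covered by existing sites. It only labels exact matches as existing; duplicates and alt versions (such as healthcalai.netlify.app) still need a human decision.

```shell
# Sketch: label each discovered host as "existing" or "new".
# $1 = file of discovered hosts, $2 = file of already-scaffolded hosts (both hypothetical).
categorize_sites() {
  local discovered="$1" known="$2" host
  while IFS= read -r host || [ -n "$host" ]; do
    [ -n "$host" ] || continue            # skip blank lines
    if grep -Fxq -- "$host" "$known"; then
      printf '%s\texisting\n' "$host"
    else
      printf '%s\tnew\n' "$host"
    fi
  done < "$discovered"
}

# Usage:
#   categorize_sites discovered.txt known.txt
```

The tab-separated output is meant as a starting point for the table presented to the user, not as a final decision.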

3. Cross-references in CLAUDE.md, not code

Why: Related businesses share context (wedding vertical, maker vertical) but are independent sites. Cross-references in CLAUDE.md give AI assistants context without creating code dependencies.
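As an illustration, the cross-reference section could be appended with a small idempotent helper like this bash sketch. The function name `add_related_businesses` and the dash-prefixed entry format are assumptions; only the `## Related Businesses` heading comes from this document.

```shell
# Sketch: append a "## Related Businesses" section to a CLAUDE.md file, at most once.
# $1 = path to the site's CLAUDE.md, remaining args = one entry per related business.
add_related_businesses() {
  local file="$1" entry
  shift
  if grep -q '^## Related Businesses' "$file" 2>/dev/null; then
    return 0   # section already present; leave the file untouched
  fi
  {
    printf '\n## Related Businesses\n\n'
    for entry in "$@"; do
      printf -- '- %s\n' "$entry"
    done
  } >> "$file"
}

# Usage:
#   add_related_businesses apps/festival-weddings/CLAUDE.md "Adventure Weddings (wedding vertical)"
```

Because the helper only touches CLAUDE.md, the sites stay independent at the code level.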

Prevention / Best Practices

  1. When scraping Lovable/SPA sites: WebFetch and plain curl return only the empty HTML shell, so extract data from the JS bundle directly. Find the bundle by looking for the script tag whose src matches /assets/*.js in the HTML shell.

  2. When scaffolding fonts: Only use @fontsource-variable/* packages that are known to exist. Default to montserrat/inter. Check npm before using custom fonts.

  3. When scaling to many sites: Update THREE places: (a) site files, (b) packages/ui/src/data/businesses.ts, (c) root CLAUDE.md businesses table.

  4. When sites share a vertical: Add cross-references in each site’s CLAUDE.md under a ## Related Businesses section.
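The three-places rule in item 3 lends itself to a quick consistency check. The bash sketch below assumes each site's directory name under apps/ appears verbatim in both packages/ui/src/data/businesses.ts and the root CLAUDE.md; the source does not confirm that, so treat the matching rule and the function name `check_site_registration` as hypothetical.

```shell
# Sketch: report app directories whose slug is missing from the registry or root CLAUDE.md.
# $1 = monorepo root (defaults to the current directory).
check_site_registration() {
  local root="${1:-.}" dir slug file status=0
  for dir in "$root"/apps/*/; do
    [ -d "$dir" ] || continue
    slug="$(basename "$dir")"
    for file in "$root/packages/ui/src/data/businesses.ts" "$root/CLAUDE.md"; do
      if ! grep -Fq -- "$slug" "$file"; then
        echo "missing: $slug not found in ${file#"$root"/}"
        status=1
      fi
    done
  done
  return "$status"
}

# Usage, from the monorepo root:
#   check_site_registration .
```

A non-zero exit status makes the check usable as a pre-build guard after each scaffolding wave.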

Files Created/Modified

New sites (10):

  • apps/sendy-visors/, apps/pocket-lunch/, apps/lana-ai/
  • apps/unity-of-sound/, apps/adventure-product-school/, apps/festival-weddings/
  • apps/weddings-lana/, apps/maker-mentor/, apps/liquid-donations/, apps/imagine-cnc/

Modified:

  • packages/ui/src/data/businesses.ts — 14 → 24 entries
  • CLAUDE.md — businesses table updated to 24 entries
  • apps/weddings-lana/package.json — fixed font deps
  • apps/weddings-lana/src/layouts/SiteLayout.astro — fixed font imports

Cross-References

  • See also: bulk-site-scaffolding-and-shared-components.md (Phase 1: 2→14 sites)
  • See also: build-errors/pnpm-monorepo-missing-dependency.md (pnpm strict deps)