lanthanides.io — Site Audit & Investment-Readiness Gaps
lanthanides.io — Site Audit & Investment-Readiness Gaps
Audit date: 2026-05-31 · Branch:
main· Audited against currentmain(HEAD2a0a1b9; a prior pass mis-stated its base as56e980f, which predated/framework/) Scope: the current static Jekyll repo (no Next.js yet). Read-only inventory. This document is the reference that later migration prompts read. It is factual and cites file paths. Where the olderdocs/improvement-plan.md(dated 2026-04-08) disagrees with the repo, the repo wins — that plan is stale (it referencessupply-risk.html,heatmap.html,search.js,heatmap.js,_sass/_heatmap.scss,alternative_supply.yml,header.html, all of which no longer exist).Re-evaluation note (2026-05-31): a prior pass of this audit was written against the stale base commit
56e980f, which predatesd55ad05(“feat: add/framework/— REE import/export operational reference”) and0628880/44fdc1d(the regulatory-monitor automation). That pass omitted (a) the served/framework/page from the inventory (§1.4), the permalink contract (§2), the navigation (§1.3), and the crown-jewel discussion (§6); and (b) the regulatory-monitor pipeline (.github/workflows/regulatory-monitor.yml
scripts/scraper|triage|notify) that keeps the regulatory data fresh. This revision adds both and corrects the nav link count (8 → 9). Every other count and claim below was re-verified against the working tree and stands unchanged.
1. Site map & inventory
1.1 Configuration & infrastructure
| File | What it does |
|---|---|
_config.yml |
Jekyll config. Collections elements (/elements/:name/) and articles (/news/:name/); plugins jekyll-feed, jekyll-seo-tag, jekyll-sitemap; kramdown + rouge; permalink: pretty; compressed Sass. Excludes Gemfile, Gemfile.lock, scripts/, invoices/, outreach/, node_modules/, vendor/, README.md, LICENSE, *.py, requirements.txt, _data/demo/ from the build. pages/framework.md is NOT excluded → it builds and serves at /framework/. |
Gemfile / Gemfile.lock |
Jekyll ~> 4.4, webrick ~> 1.9, plugins feed/seo-tag/sitemap. No framework. |
CNAME |
www.lanthanides.io (root-domain GitHub Pages; no /project/ base path). |
.gitignore |
Ignores _site/, caches, invoices/* (except .gitkeep), .env, prompts/, logs/, run-all.sh, outreach/suppliers.yml, outreach/drafts/. (So prompts/ being untracked is expected.) |
LICENSE |
Dual license: MIT (code) + CC BY 4.0 (content/data). |
README.md |
Project overview: 31 elements, two-price model, regulatory tracker, tech stack. |
CONTRIBUTING.md |
Contribution guide: data schemas, source standards, local dev. Contact = mironovb@berea.edu (line 220). |
.impeccable.md |
Design brief: personas, brand personality (“intelligence brief — Foreign Affairs meets a Goldman Sachs commodity note”), color semantics, IBM Plex + Source Serif typography commitments. |
robots.txt |
Allow: / + sitemap pointer (Liquid-templated, layout: null). |
humans.txt |
Creator metadata. Contact = mironovb@berea.edu. |
404.html |
Custom 404 with home/prices/news links. |
run-all.sh |
Pipeline runner (gitignored). |
.github/ISSUE_TEMPLATE/{price-update,data-correction,market-note}.yml |
Structured contributor intake forms. |
.github/PULL_REQUEST_TEMPLATE.md |
PR template. |
.github/workflows/community-intake.yml |
Manual-dispatch-only workflow: ingests an approved-labelled price-update issue → opens a PR (two human checkpoints; never auto-runs on issue creation). |
.github/workflows/price-update.yml |
Scheduled/again-manual price refresh workflow. |
.github/workflows/regulatory-monitor.yml |
Regulatory-monitor automation (added 0628880, refined 44fdc1d): cron every 6h (+ manual dispatch) → runs scripts/run_monitor.py (polls Chinese-gov sources via scripts/scraper/, scores significance via scripts/triage.py, alerts via scripts/notify/{telegram,email}.py), commits scripts/run_state.json back as the regulatory-monitor bot (the frequent “monitor: state update” commits). Keeps the crown-jewel regulatory data current; see §1.9, §5, §6. |
1.2 Layouts (_layouts/)
| File | What it does | Reads |
|---|---|---|
default.html |
Shell: head.html + nav.html + <main> + footer.html. Loads 5 JS files deferred (main, sort, ledger, regulatory-timeline, charts). |
— |
home.html |
Homepage: hero + stat counters, category legend, element grid, regulatory banner, recent articles (3), recent movements widget (5). | element_catalog, price_records, source_registry, site.articles, movements |
element-detail.html |
Element page: identity block, two-price cards (retail/bulk), Price Movement panel + Price History chart, inline regulatory notice, body content, related elements, prev/next, full element nav bar. | element_catalog, price_records, price_history, fluctuations (via includes) |
article.html |
Long-form article: meta, status badge, element tags, hero image, body, back link. | front-matter |
page.html |
Minimal <h1> + content wrapper (currently unused by any page). |
— |
data-page.html |
Title + description + content wrapper. Used by methodology, regulatory, sources, framework. |
— |
1.3 Includes (_includes/)
| File | What it does | Reads | Mounted by |
|---|---|---|---|
head.html |
<head>: favicons, manifest, theme-color #1A5C6B, Inter + JetBrains Mono Google Fonts, SEO/OG/Twitter meta, canonical, <link type="application/atom+xml" rel="alternate" href="https://www.lanthanides.io/feed.xml" title="lanthanides.io — Rare Earth Prices, Export Controls & Strategic Materials Intelligence" />, movements.xml <link>. |
site, page |
default.html |
nav.html |
Fixed header: logo + 9 links (Dashboard, Prices, Regulatory, Framework, Movements, News, Methodology, About, Contribute) + mobile toggle. Framework sits immediately after Regulatory (header-only — not in the footer, mirroring how Sources is footer-only). | page.url |
default.html |
footer.html |
Footer nav (8 links incl. Sources, not in header), live freshness badge computed from newest quote_date, license/build stamp. |
price_records, site.elements, site.time |
default.html |
element-grid.html |
Compact tile grid of all 31 elements (category color, retail/bulk price, indicators). | element_catalog, price_records (via price-selection) |
home.html |
price-selection.html |
Shared selection logic: retail_ref (metal-preferred, 5 g–1 kg, conf ≥ 0.5, lowest) + bulk_ref (most recent bulk/industrial, conf ≥ 0.6). |
price_records |
element-detail, element-grid, price-table |
price-table.html |
Master 4-section price ledger (server-rendered, sortable headers). | element_catalog, price_records |
pages/prices.html |
price-movement-panel.html |
% change table (retail/bulk × 7d/30d/90d/1y) with quality chip. | fluctuations.elements[sym] |
element-detail.html |
price-chart.html |
Emits chart container + inline JSON of raw retail/bulk observations for charts.js. |
price_history[sym] |
element-detail.html |
fluctuation-fallback.html |
No-JS latest-price + 30d-change block. ORPHANED — included by nothing. | price_history[sym] |
(none) |
dashboard-mover-row.html |
One row of the dashboard movers list: price, % label, 2-point SVG sparkline, confidence chip. | include params | pages/dashboard.html |
movement-row.html |
One row of the Market Movements feed (full + compact variants), incl. a server-baked sparkline from movements.yml. |
movements.events[] |
movements.html, home.html |
provenance-table.html |
Sortable per-element price-records table. | price_records |
all 31 _elements/*.md |
regulatory-banner.html |
Homepage alert listing active-control elements. | element_catalog |
home.html |
disclaimer.html |
“All prices require provenance…” bar. | — | prices, sources, regulatory |
structured-data-site.html |
JSON-LD WebSite + FAQPage (4 Q&As, embeds live counts). |
price_records, site.elements |
home.html |
structured-data-element.html |
JSON-LD Product + Offer[] per element. |
cat_info, retail_ref, bulk_ref |
element-detail.html |
structured-data-article.html |
JSON-LD Article. |
page |
article.html |
structured-data-breadcrumb.html |
JSON-LD BreadcrumbList. |
breadcrumb_current |
every page |
price-selection.html (see above) |
— | — | — |
1.4 Pages (pages/ + index.html)
| File | URL | What it does | Reads |
|---|---|---|---|
index.html |
/ |
Home (layout home). |
(see home.html) |
pages/dashboard.html |
/dashboard/ |
Market dashboard: 30-day movers (gainers/decliners), regulatory snapshot stat-cards, retail-premium leaderboard, data-coverage tile map. All computed in Liquid from fluctuations. |
fluctuations, element_catalog |
pages/prices.html |
/prices/ |
“The Ledger” — server price-table.html, then ledger.js replaces it with an interactive table fetched from elements.json. |
element_catalog, price_records; JS reads assets/data/elements.json |
pages/elements.html |
/elements/ |
Full directory, 4 category tables. Duplicates the retail/bulk selection algorithm inline (lines 62-99) instead of using price-selection.html. |
element_catalog, price_records, site.elements |
pages/regulatory.html |
/regulatory/ |
Export-control tracker: filter chips, notice cards (per _data/regulatory/*.yml), announcement timeline (per policy_events.yml), legal references. |
element_catalog, regulatory, policy_events |
pages/framework.md |
/framework/ |
REE import/export operational framework (data-page layout): quick-start decision tree, 3-tier regulatory classification (uncontrolled / general-licence / case-by-case), post-IEEPA US tariff stack, China-side control summary + per-element licence-status table, multiplicative realised-price model, four-axis landed-cost decomposition with worked examples + lot-size sensitivity, buyer playbook, channels/counterparties, 2026 decision-trigger calendar. ~450 lines, heavily sourced, table-driven (no charts); reconciles against _data/regulatory/mofcom_18_2025.yml + mofcom_55_62_2025.yml. |
static prose; deep-links /regulatory/, /methodology/, /elements/* |
pages/movements.html |
/movements/ |
Auto-generated movements feed (>10% / 30d + regulatory changes) + Atom link. | movements |
pages/news.html |
/news/ |
Article tile grid. Uses the articles collection, NOT _data/news.yml. |
site.articles |
pages/methodology.md |
/methodology/ |
Full methodology: two-price rationale, normalization, provenance chain, trust tiers, live source-breakdown table from source_breakdown.yml. |
source_breakdown |
pages/about.html |
/about/ |
Mission, principles, coverage stats, contributions, contact mironovb@berea.edu (line 79). |
site.elements, price_records, source_registry, policy_events |
pages/sources.html |
/sources/ |
Trust-tier table + registered-source table + “add a source” guide. | site_settings, source_registry |
1.5 Collections
| Collection | Dir | Count | Permalink | Notes |
|---|---|---|---|---|
elements |
_elements/*.md |
31 | /elements/:name/ (each file sets explicit permalink: /elements/<Sym>/) |
Rich hand-authored bodies: applications table, market/supply stats, verified-offers (provenance-table), important notes, physical properties, references. |
articles |
_articles/*.md |
5 | /news/:name/ |
chinas-invisible-rare-earth-wall, chinese-export-licence-timelines, ebay-rare-earth-market-research-april-2026, mofcom-october-2025-suspension, retail-premium-puzzle. Publication-quality, footnoted. |
1.6 Data files (_data/)
| File | What it is | Size | Consumed by | Status |
|---|---|---|---|---|
element_catalog.yml |
Master 31-element catalog (symbol, Z, category, control status, trade form, notes). | 31 | every template | ✅ canonical |
price_records.json |
Flat price observations (provenance, normalization, confidence, verification). | 238 records | price-selection, provenance-table, all stat counters |
✅ canonical for display |
price_history/*.yml |
Per-element time series (incl. median_aggregate rows). |
31 files, 285 observations | price-movement-panel, price-chart, fluctuation-fallback, movements pipeline |
⚠️ second price store — count (285) ≠ price_records.json (238) |
fluctuations.json |
Pre-computed per-element windowed % changes + coverage grade. | 3,772 lines | dashboard, price-movement-panel |
✅ used |
movements.yml |
Auto-detected price/regulatory events + sparkline geometry. | 1,214 lines | movements page/feed, home widget | ✅ used |
policy_events.yml |
11 policy events (MOFCOM/GAC/EU/State Council), Chinese refs. | 11 | regulatory timeline, about counter | ✅ used |
regulatory/*.yml |
5 detailed notices (gac_46_2024, mofcom_10_2025, mofcom_18_2025, mofcom_55_62_2025, mofcom_1_17_2026) with legal basis, compliance, suspension. |
5 | regulatory notice cards | ✅ used — crown jewel data |
source_registry.yml |
5 sources (Stanford AM, Edgetech, ALB, American Elements, Goodfellow), trust tiers. | 5 | sources page, counters | ✅ used |
source_breakdown.yml |
Observation mix by source type (285 total: 76.8% public_listing, 16.5% aggregate, 6.7% benchmark; community/supplier/invoice = 0). | — | methodology table | ✅ used |
site_settings.yml |
Thresholds, labels, trust-tier text. | — | sources page | ✅ used |
elements/*.yml |
Second 31-element store with retail_reference/bulk_benchmark/retail_premium_ratio. |
31 files | only assets/data/elements.json → ledger.js |
⚠️ parallel store; La.yml is broken placeholder (see §4) |
news.yml |
16 legacy “news feed” entries. | 16 | nothing | ❌ ORPHANED (news page uses the articles collection) |
key_developments.yml |
Homepage export-control summary. | — | nothing | ❌ ORPHANED |
snapshots/2026-05-24_*.json |
Dated price snapshot. | 1 | nothing (written by build_snapshots.py) |
❌ orphaned in templates |
demo/demo_price_records.json |
Synthetic dev data, self-labelled “DEMO DATA ONLY… Never display.” | — | excluded from build in _config.yml |
⚠️ demo dir in repo |
1.7 JavaScript (assets/js/)
| File | Module | What it does | Mount point | Status |
|---|---|---|---|---|
main.js |
— | Boots SML_Sort, SML_Ledger, SML_RegTimeline; wires mobile-nav toggle. |
global | ✅ |
sort.js |
SML_Sort |
Click-to-sort .data-table headers (data-sort-key). |
any .data-table |
✅ |
ledger.js |
SML_Ledger |
On /prices/, hides the server table and rebuilds it from assets/data/elements.json (filter/search/sort). |
.prices-page |
⚠️ reads the divergent _data/elements/*.yml store |
regulatory-timeline.js |
SML_RegTimeline |
Element-chip filtering of notice cards + timeline. | #reg-filter-strip |
✅ |
charts.js |
(IIFE) | Renders inline-SVG price-history chart; no connecting line below 3 points (dots only) per no-fabrication rule. | [data-chart="price-history"] |
✅ but starved of data |
1.8 Styles (_sass/) + media
- 24 SCSS partials imported by
assets/css/main.scss(_variables,_reset,_mixins,_typography,_layout,_header,_footer,_home,_cards,_badges,_tables,_panels,_dashboard,_movements,_element-detail,_elements-page,_news-page,_article,_about,_interactive,_regulatory-banner,_periodic-nav,_responsive,_print). assets/images/: favicons (ico/16/32/svg/apple-touch/android-chrome 192+512), logos (48/128/png),og-default.png(exists), article imageschina-rare-earth-controls-{600,1200}.jpg,site.webmanifest.assets/data/:elements.json(Liquid-generated from_data/elements/),fluctuations.json(a duplicate of_data/fluctuations.json).
1.9 Out-of-build tooling (not served)
scripts/*.py (validate, normalize, generate, import invoices/offers, detect movements, build snapshots, fluctuations, source breakdown, placeholder generator), outreach/*.py (consent-gated supplier intake), invoices/, logs/, prompts/. Excluded via _config.yml/.gitignore.
Regulatory-monitor pipeline (out-of-build, but a real asset — see §5/§6): scripts/run_monitor.py + scripts/run_state.json (orchestrator + dedup/seen state), scripts/scraper/monitor.py + scripts/scraper/sources.yaml (polls MOFCOM homepage, gov.cn, and RSS/Atom feeds for REE-relevant announcements), scripts/triage.py (rule-based 1–10 significance scorer, biased low to avoid false-positive alerts), scripts/notify/{telegram,email}.py (alert channels). Driven by .github/workflows/regulatory-monitor.yml; its automated commits are authored by the regulatory-monitor bot.
2. Routing / permalinks (MUST be preserved by the migration)
Every public URL the current site serves. This is the contract the Next.js migration must honor (redirects if any change).
Top-level pages
/— home/dashboard//prices//elements//regulatory//framework/— served page (pages/framework.md); was missing from the prior pass. Part of the contract./movements//news//methodology//about//sources//404.html
Element collection (31) — /elements/<Symbol>/ with original-case symbol:
/elements/La/, /elements/Ce/, /elements/Pr/, /elements/Nd/, /elements/Sm/, /elements/Y/, /elements/Sc/, /elements/Eu/, /elements/Gd/, /elements/Tb/, /elements/Dy/, /elements/Ho/, /elements/Er/, /elements/Tm/, /elements/Yb/, /elements/Lu/, /elements/Te/, /elements/V/, /elements/Sb/, /elements/W/, /elements/Bi/, /elements/Mo/, /elements/Zr/, /elements/Ta/, /elements/Nb/, /elements/Co/, /elements/Li/, /elements/Ga/, /elements/Ge/, /elements/Se/, /elements/In/
Article collection (5) — /news/<slug>/:
/news/chinas-invisible-rare-earth-wall//news/chinese-export-licence-timelines//news/ebay-rare-earth-market-research-april-2026//news/mofcom-october-2025-suspension//news/retail-premium-puzzle/
Machine-readable / feeds / SEO
/sitemap.xml(jekyll-sitemap)/feed.xml(jekyll-feed;<link type="application/atom+xml" rel="alternate" href="https://www.lanthanides.io/feed.xml" title="lanthanides.io — Rare Earth Prices, Export Controls & Strategic Materials Intelligence" />inhead.html)/movements.xml(custom Atom feed,permalink: /movements.xml)/assets/data/elements.json(open-data export,permalink: /assets/data/elements.json)/assets/data/fluctuations.json(open-data export)/robots.txt/humans.txt/assets/images/site.webmanifest
⚠️ Migration note:
/elements/<Sym>/uses case-sensitive symbols. Several pages also deep-link/methodology/#display-price,/methodology/#provenance-chain,/methodology/#data-sources-breakdown,/methodology/#oxide-to-metal— preserve those anchors. The/framework/page also exposes in-page anchors (#pricing,#us-side-tariff-stack-may-14-2026) for its own internal navigation — preserve them too.
3. Visualization audit
Context that colors every “data sufficiency” cell below: the price corpus is essentially two collection days — 2026-03-15 and 2026-04-04 (with a sprinkle of bulk on 04-04). fluctuations.json confirms it: nearly every window carries confidence: "low" and distinct_days_in_window: 2, and several elements have distinct_days: 1 with all windows null. The codebase itself says so (pages/dashboard.html:209 “Most current windows span only two distinct observation days, so confidence is generally low”). Any widget that draws a line, trend, or “% move” from ≤2 points is structurally choppy/misleading and is flagged accordingly.
| # | Visualization | Location | What it shows | Data sufficiency | Rendering quality | Decision | Rationale |
|---|---|---|---|---|---|---|---|
| 1 | Per-element price-history chart | _includes/price-chart.html + assets/js/charts.js (element-detail) |
Inline-SVG scatter/line of retail & bulk USD/kg over time | Poor — most elements have 2 distinct days → 2 dots/series; line suppressed below 3 points | Code is honest (dots-only <3 pts, per-day medians) but the result is a near-empty plot | REBUILD-CLEAN | Keep the no-fabrication discipline, but a 2-dot “chart” reads as broken. Rebuild as a point-in-time comparison (retail vs bulk vs premium) and only show a true time series once ≥3 distinct days exist; otherwise hide the chart, not show 2 dots. |
| 2 | Price Movement % table | _includes/price-movement-panel.html (element-detail) |
Retail/bulk × 7d/30d/90d/1y % change grid + quality chip | Poor/misleading — windows derive from 2 days and mix forms (e.g. day-1 oxide vs day-2 metal) → absurd values (Ce 1y pct_change: 10450%, Y retail 30d +115,837.5%) |
Clean table; many “—” cells; tooltips honest about confidence | REBUILD-CLEAN | The math is real but compares non-comparable products across only 2 days. Suppress any window with <3 distinct days and same-form anchoring; otherwise the headline number is noise dressed as signal. |
| 3 | Dashboard “30-day movers” | pages/dashboard.html + _includes/dashboard-mover-row.html |
Top gainers/decliners by abs % over trailing 30d | Misleading — fed by the same 2-day, mixed-form windows; surfaces +115,837% “spikes” |
Tidy two-column list; compact %; confidence chip | REBUILD-CLEAN | A “movers” board that ranks oxide→metal artifacts as the biggest movers actively undermines credibility. Gate on ≥3 distinct days / same-form deltas, or replace with a “latest prices + spread” board until the series deepens. |
| 4 | Movers 2-point sparkline | _includes/dashboard-mover-row.html:69-103 |
A hardcoded up/down/flat slope (M3,18 L77,4) |
None — geometry is fixed by direction, not data | Crisp SVG, but conveys zero information the arrow/% don’t already | REMOVE | Pure decoration. The brief (.impeccable.md) forbids non-informative visuals. The % label + direction arrow already carry the signal. |
| 5 | Movements-feed sparkline | _includes/movement-row.html:43-58 (+ movements.yml geometry) |
2-point slope baked by detect_movements.py |
Low — point_count: 2 on the events I inspected |
Honest meta (“2 pts · $8 → $9,275”) but the curve itself is a straight segment | REBUILD-CLEAN | Keep the factual from→to + observation count; drop or rebuild the 2-point “sparkline” so it doesn’t imply a shape. Acceptable once events span ≥3 points. |
| 6 | Dashboard premium leaderboard | pages/dashboard.html:356-424 |
Latest retail ÷ latest bulk ratio, ranked | Fair — needs one retail + one bulk obs; ratio is meaningful | Clean ranked table; inverse (<1×) highlighted | KEEP (clarify) | The premium ratio is a genuine differentiator. But it silently divides metal retail by oxide bulk, inflating ratios (the article cites 200:1). Keep, but label the form basis so the number is defensible. |
| 7 | Dashboard data-coverage map | pages/dashboard.html:426-490 |
Tile per element colored by coverage grade + observation count | Good — it is the honest meta-view of sparsity | Clean grid + legend; links to element pages | KEEP | This is the right way to be transparent about thin data. Reuse the pattern; it turns a weakness into a credibility signal. |
| 8 | fluctuation-fallback.html |
_includes/fluctuation-fallback.html |
No-JS latest price + 30d change per (symbol, tier) | n/a | Well-written, but included by no template | REMOVE | Dead code (orphaned include). Either wire it as the no-JS baseline under the chart or delete it; today it ships nothing. |
| 9 | Regulatory timeline | pages/regulatory.html + assets/js/regulatory-timeline.js + policy_events.yml/regulatory/*.yml |
Filterable notice cards + reverse-chron announcement timeline | Strong — 11 events + 5 deep notices, all sourced | Polished, accessible, progressively enhanced | KEEP | The crown jewel (see §6). Don’t touch the data model; only elevate it. |
| 10 | Home element grid | _includes/element-grid.html (home) |
31 tiles: category color, retail/bulk price, control/demand indicators | Fair — prices honest; “—” where absent | Dense, scannable, on-brand | KEEP | Core navigation and the at-a-glance promise of the brief. Keep; consider surfacing control status more boldly. |
| 11 | Interactive Ledger (replaces /prices/ table) |
assets/js/ledger.js ← assets/data/elements.json ← _data/elements/*.yml |
Filter/search/sortable price table | Inconsistent — reads a different data store than the server table; _data/elements/La.yml is broken placeholder → La shows blanks/garbage with JS on |
Functional UI, but numbers diverge from the no-JS page and from element-detail | REBUILD-CLEAN | Two sources of truth (price_records.json vs _data/elements/*.yml) produce different prices for the same element. Repoint to one canonical store (or drop the JS replacement) and fix/remove the placeholder file. |
| 12 | Element two-price cards | _layouts/element-detail.html:106-154 |
Retail Reference + Bulk Benchmark + premium ×, with source line | Good — single observed prices, fully attributed | Clean, on-brand, honest empty states | KEEP | The signature “two-price system.” Exactly the right abstraction; keep verbatim. |
| 13 | Regulatory snapshot stat-cards | pages/dashboard.html:310-354 |
Counts by export-control posture / regulatory state | Good — derived from catalog fields | Clean stat-card row, links to tracker | KEEP | Accurate, useful, links to the crown jewel. |
| 14 | Hero / about stat counters | _layouts/home.html:26-31, pages/about.html:44-61 |
Elements / records / sources / controls counts | Good | Simple, legible | KEEP (reconcile) | Honest, but uses price_records | size (238) while methodology cites 285 observations — pick one number and label it (see §4.3). |
4. Student-project signals (and where they get fixed)
Concrete, fixable tells that make a strong dataset read as a class project rather than a fundable venture.
4.1 .edu contact address (credibility)
mironovb@berea.edu appears in 3 public places: pages/about.html:79, CONTRIBUTING.md:220, humans.txt. A .edu address signals “student, not company.”
Fix: replace with a domain address (e.g. data@lanthanides.io / hello@lanthanides.io). → content/brand phase.
4.2 Stray scratch files at repo root (professionalism)
Untracked junk in the working tree: chat.md, combined.txt, combined 2.txt, combined 3.txt, .arun/. (logs/, prompts/, run-all.sh are correctly gitignored.) Also committed bytecode dirs outreach/__pycache__/, scripts/__pycache__/ despite __pycache__/ being in .gitignore — they were committed before the ignore rule.
Fix: delete the scratch files; git rm -r --cached the __pycache__ dirs. → repo-hygiene / migration setup phase.
4.3 Two parallel data stores with diverging numbers (data integrity)
- Prices:
price_records.json(238) vsprice_history/*.yml(285 observations). The “238 price records” stat (home/about) contradicts methodology’s “285 observations.” - Elements:
element_catalog.yml(canonical, drives all server pages) vs_data/elements/*.yml(drives onlyelements.json→ledger.js). With JS on,/prices/can show different prices than with JS off. Fix: collapse to a single canonical store in the migration; derive everything else from it; label every count. → data-model phase (high priority).
4.4 Broken / placeholder data shipping (data integrity)
_data/elements/La.yml is malformed placeholder: literal schema strings source: "Description of source(s)", applications_summary: "Brief description of primary uses.", broken indentation, no price_per_kg. It flows into assets/data/elements.json and the interactive Ledger. A demo dir (_data/demo/) and a generate_placeholder_data.py script also sit in the repo (demo is build-excluded, but its presence is a tell).
Fix: regenerate/repair La.yml from canonical data; remove _data/demo/ and the placeholder generator from the shipped tree. → data-model phase.
4.5 Apologetic “sparse data” framing (positioning)
The site repeatedly apologizes for thin coverage: methodology.md:406 “## Why Some Pages Show Sparse Data,” dashboard copy “We don’t hide thin coverage,” “Insufficient data yet,” “confidence is generally low.” Honesty is good; apology is not — it reads as a student hedging.
Fix: reframe transparency as a feature (“provenance-first; we show our work”) and lead with the regulatory intelligence that is deep, rather than foregrounding what’s missing. The existing /framework/ page already models this confident, declarative, decision-oriented voice — use it as the in-repo tone reference. → content/positioning phase.
4.6 No clear value proposition / “who is this for” above the fold (positioning)
The home hero (“Rare earth & critical metal prices, with provenance.”) states what but not for whom or why now. .impeccable.md names the audience (investors, journalists, analysts) but the page never does. There is no above-the-fold hook tying the (world-class) export-control tracking to a decision a reader is trying to make.
Fix: add a one-line audience + value statement and route the eye to the regulatory tracker / movements. → home redesign phase.
4.7 Design diverges from its own brief (polish)
.impeccable.md commits to IBM Plex Sans/Mono + Source Serif 4 (“serif headings… load-bearing,” “monospace for all numeric data”). The shipped site loads Inter + JetBrains Mono and no serif at all (head.html:13). The “Foreign Affairs / Goldman note” editorial identity is therefore not actually implemented.
Fix: decide the type system deliberately in the migration and apply it consistently. → design-system phase.
4.8 site.webmanifest is broken + off-brand (polish/bug)
assets/images/site.webmanifest points icons at /periodicpricing/assets/images/… — a stale GitHub-Pages project base path that 404s on the root domain (www.lanthanides.io). Its theme_color: #1a1d23 and background_color: #ffffff also contradict the brand tokens used elsewhere (theme-color #1A5C6B in head.html; paper #F5F2EC in the brief).
Fix: correct icon paths to /assets/images/… and align colors. → polish / PWA phase.
4.9 Thin-data charts implying trends (covered in §3)
Items #1, #2, #3, #4, #5 above. The migration should treat “≤2 distinct days ⇒ no line/no %move” as a hard rendering rule. → visualization phase.
4.10 Orphaned data + dead includes (codebase smell)
_data/news.yml, _data/key_developments.yml, _data/snapshots/, and _includes/fluctuation-fallback.html are referenced by no template. They suggest abandoned iterations.
Fix: delete or re-wire during migration. → cleanup phase.
4.11 Minor: contributor dropdown lists untracked elements
price-update.yml offers Pm, Hf, Re (not in the 31-element catalog).
Fix: trim to tracked elements (or commit to adding them). → low priority.
5. Assets to preserve (do not lose in the migration)
These are real, hard-won, and differentiating. Carry every one forward.
SEO & structured data
- The four
_includes/structured-data-*.html(WebSite + FAQPage, Product+Offer, Article, BreadcrumbList) — comprehensive JSON-LD with live counts. jekyll-seo-tag-style meta (title/desc/keywords/canonical/OG/Twitter) inhead.html, including the per-layout OG title logic./sitemap.xml(jekyll-sitemap) and/feed.xml(jekyll-feed).- The custom
/movements.xmlAtom feed (movements.xml) — hand-rolled, valid, category-tagged. robots.txt,humans.txt, favicons +site.webmanifest(fix paths per §4.8),og-default.png,CNAME.
Open data
assets/data/elements.jsonandassets/data/fluctuations.json(sitemap: false, stable permalinks) — the public open-data export. Preserve the URLs; fix the underlying store (§4.3/4.4).
Licensing
- Dual CC BY 4.0 (content/data) + MIT (code), surfaced in
LICENSE,footer.html,head.htmlrel="license", and the Atom<rights>.
Contributor pipeline
.github/ISSUE_TEMPLATE/{price-update,data-correction,market-note}.yml+PULL_REQUEST_TEMPLATE.md..github/workflows/community-intake.yml— the manual-dispatch,approved-label-gated, PR-opening “two human checkpoints” flow (and the matchingoutreach/community_intake.py). This consent-/review-gated provenance model is itself a credibility asset and is documented on/methodology/.
Regulatory-monitoring automation (the freshness engine behind the crown jewel)
.github/workflows/regulatory-monitor.yml+scripts/scraper/(monitor.py,sources.yaml— Chinese-gov + RSS pollers),scripts/triage.py(rule-based significance scorer),scripts/notify/{telegram,email}.py,scripts/run_monitor.py+run_state.json(dedup state). Runs every 6h and alerts only on critical announcements. This is what makes “updated as announcements land” literally true; carry the pipeline (and its bot-commit cadence) into the migration’s ops story even thoughscripts/itself isn’t served.
Content (the substance)
- All 31 element pages (
_elements/*.md) — sourced applications, market/supply stats, notes, properties, references. - All 5 articles (
_articles/*.md) — footnoted, publication-quality. - The
/framework/operational reference (pages/framework.md) — ~450 lines of sourced import/export decision support (regulatory tiering, US tariff stack, realised-price model, landed-cost decomposition, channels, 2026 trigger calendar). A near-unique companion to the regulatory tracker; preserve verbatim and keep the/framework/URL. element_catalog.yml,price_records.json,policy_events.yml,regulatory/*.yml,source_registry.yml,site_settings.yml,methodology.md— the data and the rules that govern it.- The two-price (Retail Reference / Bulk Benchmark) model and the
price-selection.htmllogic.
6. Crown jewel — the MOFCOM/GAC regulatory tracker
The strongest, least-replicable asset on the site is the Chinese export-control tracker (pages/regulatory.html, _data/regulatory/*.yml, _data/policy_events.yml) — now a cluster with its operational companion /framework/ and an automated monitoring pipeline (fourth bullet below) — and the migration should make it the centerpiece, not a tab buried mid-nav.
Why it’s the differentiator:
- Depth and primary sourcing.
_data/regulatory/mofcom_18_2025.yml(and siblings) carry the announcement number, the Chinese reference (商务部 海关总署公告2025年第18号), issuing authority, issue/effective dates, exact affected elements and controlled forms, the legal basis (Article 15, Export Control Law of the PRC; 2024 Dual-Use Items Regulations), the 45-working-day review period, and suspension state (e.g. “No. 18 was never suspended; Oct-2025 Nos. 55–62 suspended until 28 Nov 2026”).policy_events.ymlstrings 11 events from the July-2023 Ga/Ge controls through the Feb-2026 Japan sanctions into one coherent timeline. - It answers the question the audience actually has. The brief’s primary user (investor/journalist/analyst) needs “is this element controlled, by which announcement, since when, and is it suspended?” — and almost no English-language source assembles this with citations. The element pages already reflect it inline (
Dy.mdbanner cites No. 18/2025 + the 28-Nov-2026 suspension key date), and the dashboard/regulatory snapshot quantify it. - It’s defensible. Prices are thin and will always be contestable; the regulatory corpus is deep, dated, sourced, and updated as announcements land. It is the thing a funder can verify in five minutes and not find elsewhere.
- It now ships with two force-multipliers the prior pass missed. (1) The
/framework/operational reference (pages/framework.md) converts the tracker’s what/when into procurement how — three-tier classification (uncontrolled / general-licence / case-by-case), the live post-IEEPA US tariff stack, a multiplicative realised-price model, four-axis landed-cost decomposition with worked examples, and a 2026 decision-trigger calendar — all reconciled against the same_data/regulatory/*.ymlcorpus, in ~450 lines of confident, sourced, decision-oriented prose (the clearest in-repo counter-example to the apologetic voice of §4.5). (2) An automated monitoring pipeline (.github/workflows/regulatory-monitor.yml→scripts/scraper/→scripts/triage.py→scripts/notify/) polls Chinese-government sources every six hours, scores each item’s regulatory significance, and fires Telegram/email alerts only on critical announcements — the machinery that makes “updated as announcements land” literally true. Treat tracker + framework + monitor as one crown-jewel cluster; nav already keeps the first two adjacent (Regulatory → Framework).
Implication for the migration: elevate the tracker to a hero position (home above-the-fold + a first-class destination), keep its data model intact, and let the (transparent) thin price data play a supporting role behind the regulatory intelligence rather than leading with apologies about sparse coverage (§4.5).
End of audit. Later prompts: treat §2 (permalinks) as a hard contract, §3 decisions as the visualization backlog, §4 as the investment-readiness punch-list, §5 as the do-not-break list, and §6 as the product’s north star.