The Dorian Gray Listing and Schrödinger's Price: a night with an AI used-car investigator

A car that ages in the listing but not in the photos. A price that simultaneously exists in four different values. Mileage that mysteriously drops on the same vehicle over the course of a week. These aren't fictions — these are three stories an AI used-car investigator I built over a weekend mailed me in a single night.
It's not a scraper. It's an investigator. And it's the first thing I've ever built on top of Claude that genuinely made me sit up and say: "OK, this is a different league."
This post is about how it works, what you need to build it, and why I think this isn't just about cars — it's a prototype of something bigger. I name no portals. The cases are real, the principles reproducible. If you want them, find them yourself.
Why a classic scraper isn't enough
In my scraping in 2026 post I argued that a scraper is a system, not a script. This is one step further: the problem isn't collecting the data — it's trusting it.
When you scrape used-car listings across four portals, you get four realities. A price in the title, a different one in the detail, a third in the structured data for Google, a fourth on the photo of the windshield price tag. Every portal has a different layout, aggregators index them in reverse, some add VAT, others magic it away. Raw data lies — not maliciously, but structurally.
A classic scraper can't tell. It pulls .price-final, returns a number, moves on. A validator can tell, because it opens the page in a real browser and looks with its own eyes. That's the whole point.
Stack: two Claudes, Chrome DevTools, SearXNG, Docker
The architecture is embarrassingly simple for what it does. Full stack:
- Scraper-Claude — first agent. Receives a VIN or car spec. Finds it through SearXNG, opens the HTML, extracts structured data (price, mileage, year, equipment, photos). Returns JSON.
- Validator-Claude — second agent in a separate context. Paranoid. Receives the Scraper's output and has one job: "Open this URL in Chrome, find it with your own eyes, and tell me whether it matches." If it doesn't, returns what's wrong and where.
- Chrome DevTools MCP — turns the Validator into a real user. Clicks, scrolls, reads text, inspects element attributes, takes screenshots. No headless requests. Real browser, real DOM.
- SearXNG in Docker — self-hosted search backend. Instead of Brave Search API or Serper at $200/month. One container, one compose, no key, no 429.
- Loop — Scraper returns a result → Validator checks it → if it doesn't match, returns to Scraper what to fix (new selector, new URL, different page). Loop ends when validation is green or when Validator says "this genuinely doesn't match, log the discrepancy."
That's all. Nothing here is a new invention. It's a new combination.
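To make the hand-off between the two agents concrete, here is roughly the shape of the data that travels between them. This is a sketch only: the real system passes free-form JSON between prompts, and every field name below is my own illustration, not a fixed schema.

from dataclasses import dataclass, field

@dataclass
class ListingRecord:
    """What Scraper-Claude is asked to return for one listing (illustrative fields)."""
    url: str                                             # the listing it extracted from
    vin: str | None = None                               # VIN, if the listing exposes it
    price: int | None = None                             # asking price as shown on the page
    mileage_km: int | None = None                        # odometer reading from the listing
    year: int | None = None                              # model year claimed by the seller
    photos: list[str] = field(default_factory=list)      # photo URLs
    equipment: list[str] = field(default_factory=list)   # trim / options as plain text

The Validator receives exactly this plus the URL, and nothing else; the separate context is the whole point.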
Trick #1: SearXNG instead of paid search APIs
When you scrape across the internet, you need to search. "Find me this car on every portal where it's still live." Brave Search API? $5 per thousand queries. Serper? Similar. Run dozens of cars per night and the bill stacks up quickly.
The fix: SearXNG in Docker. An open-source meta-search engine that aggregates results from dozens of search providers (Google, Bing, DuckDuckGo, Brave, Qwant, Yandex…) and returns them as a unified JSON API. One compose, twenty seconds startup, $0 a month.
# docker-compose.yml — minimal SearXNG
services:
  searxng:
    image: searxng/searxng:latest
    ports:
      - "8080:8080"
    volumes:
      - ./searxng:/etc/searxng
    environment:
      - SEARXNG_BASE_URL=http://localhost:8080/
    restart: unless-stopped
In settings.yml enable JSON output:
search:
  formats:
    - html
    - json
And then call it as a normal REST endpoint:
curl "http://localhost:8080/search?q=octavia+2018+1.6+tdi&format=json&categories=general"
No API key. No rate limit. No account. If Google starts feeding you captchas, SearXNG quietly serves results from Bing or Brave instead. For a Claude agent it's like having the entire web on localhost.
Side note: yes, technically you're still leaning on the upstream providers and their goodwill. Be polite, throttle between queries, spread the load across multiple backends. For personal research and slow loops it's a non-issue.
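From the agent's side it's one HTTP call. A minimal sketch in Python, assuming the container above is running on localhost:8080 with the JSON format enabled; the helper name and the example query are mine:

import requests

def searx(query: str, base: str = "http://localhost:8080") -> list[tuple[str, str]]:
    """Ask the local SearXNG instance and return (title, url) pairs."""
    resp = requests.get(
        f"{base}/search",
        params={"q": query, "format": "json", "categories": "general"},
        timeout=20,
    )
    resp.raise_for_status()
    return [(r["title"], r["url"]) for r in resp.json().get("results", [])]

# e.g. searx("octavia 2018 1.6 tdi") → candidate listing URLs for the Scraper to open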
Trick #2: the Validator as a paranoid QA engineer
This is the heart of the whole thing. Instead of Scraper-Claude deciding "that's probably fine", the decision is delegated to a second Claude — without context, without sentiment, with one job: find what the output gets wrong about reality.
Prompt (abridged):
You are a paranoid QA engineer. You don't trust anything until
you've seen it in Chrome. You receive the scraper's JSON output
and the URL of the listing.
Your task:
1. Open the URL via the chrome-devtools MCP.
2. For every field in the JSON, find the corresponding value on
the page (price, mileage, year, photos, description).
3. Compare. If it matches, say "match".
4. If it doesn't, return: field, scraper-value, real value,
the selector where you found it, and a hypothesis WHY it
doesn't match (wrong selector / price without VAT /
aggregator overwrote the value / etc).
5. If you find anything suspicious (price in meta tag different
from page, photo EXIF dates inconsistent with advertised
year, JSON-LD vs DOM mismatch), log it as an anomaly even
without being asked.
The Validator never says "probably". Either it's a match, or it's a discrepancy with evidence. Here's the psychological trick: when there are two separate agents in two separate contexts, the second one has no motivation to defend the first one's work. No ego. Doesn't know the story of how the output was produced. Just looks at what's there. It often finds what Scraper-Claude missed because the first one was already "hooked" on what it expected to see.
This is exactly the kind of task Chrome DevTools MCP was built for. It's not "write me a selector". It's "look at what's actually there".
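What comes back is just structured text. In practice I ask for something shaped roughly like this; the field names are illustrative and the values are an invented example, not a real listing:

{
  "verdict": "discrepancy",
  "field": "price",
  "scraper_value": 289000,
  "page_value": 319000,
  "selector": "h1 .advert-price",
  "hypothesis": "scraper read the <title>, which the CMS never updated",
  "anomalies": ["JSON-LD offers.price (269000) differs from both"]
}

Anything with a verdict other than "match" goes back into the loop described next.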
Trick #3: a loop that decides where to look next
A classic scraper has a fixed pipeline: download → parse → store. This thing has a decision loop:
1. Scraper finds a listing.
2. Validator verifies it.
3. If match → write to DB, done.
4. If discrepancy → three branches:
   a) "Selector was wrong" → Scraper fixes and retries.
   b) "The value really is different on the page" → log as anomaly, Scraper goes to find the same car on another portal.
   c) "This car appears in multiple versions on different portals" → Scraper uses SearXNG to find them all, runs on each, and the Validator compares across them.
5. After N iterations, or when discrepancies pile up → Scraper pulls the VIN/serial number and checks history in an external registry (cebia, archive.org, EU vehicle history).
That's not a scraping pipeline. That's an investigative agent. Claude decides at runtime where to look next, based on what the second Claude found a moment ago. Fixed DAG replaced with a decision loop. This is what LLMs actually add — not that they parse HTML better (they often parse it worse), but that they make decisions at runtime based on fuzzy signals.
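In code, the outer loop is barely more than this. A pseudocode-level sketch in Python: scrape(), validate(), find_listing() and friends stand in for calls into the two Claude agents and SearXNG, and none of the names come from a real SDK.

def investigate(spec: str, max_iterations: int = 5):
    """Run the Scraper/Validator loop for one car spec until it converges or escalates."""
    url = find_listing(spec)                         # SearXNG → best candidate URL
    hint = None                                      # feedback from the Validator, if any
    record = None
    for _ in range(max_iterations):
        record = scrape(url, hint=hint)              # Scraper-Claude → structured JSON
        verdict = validate(record, url)              # Validator-Claude via Chrome DevTools MCP
        if verdict.kind == "match":
            store(record)                            # green: write to DB, done
            return record
        if verdict.kind == "bad_selector":
            hint = verdict.hypothesis                # branch a): retry with the suggested fix
        elif verdict.kind == "page_disagrees":
            log_anomaly(verdict)                     # branch b): genuine discrepancy
            url = find_listing(spec, exclude=url)    # look for the same car on another portal
            hint = None
        elif verdict.kind == "multiple_versions":
            return compare_across_portals(spec)      # branch c): cross-portal comparison
    # Discrepancies piled up: escalate to external history (VIN registry, archive.org).
    return check_history(record)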
Three stories from one night
I let it run for one night. The brief: take a handful of common used-car specs, find them across the major listing portals, verify consistency. By morning my log had this in it.
1. The Dorian Gray Listing
The listing has been alive for 14 months. The car in the photo always looks the same — pristine, freshly washed, sharp shadows, yellow leaves in the background. Those yellow leaves caught my eye. Fourteen months and it's still autumn outside?
Validator-Claude compared the hashes of the photos across archive.org snapshots over those 14 months. Identical files. Bit for bit. Meanwhile the listing's description was rotating ("just imported", "just serviced", "discount!"), the mileage was changing (sometimes upward), the price was shifting. The car doesn't exist in the form the photo shows — and hasn't, for over a year.
The point: a listing is a literary genre. The photo is a metaphor. The Validator found this in 8 seconds — a human would have to notice the autumn leaves, open the Wayback Machine by hand, and remember the listing from a previous viewing. The machine doesn't forget.
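The check itself is mundane. A sketch of the mechanical part, assuming you already have the photo URL; the Wayback Machine CDX endpoint is real, everything else here is illustrative:

import hashlib
import requests

def photo_hashes_over_time(photo_url: str) -> dict[str, str]:
    """SHA-256 of every archived copy of one listing photo, keyed by snapshot timestamp."""
    cdx = requests.get(
        "http://web.archive.org/cdx/search/cdx",
        params={"url": photo_url, "output": "json", "fl": "timestamp"},
        timeout=30,
    ).json()
    hashes = {}
    for (ts,) in cdx[1:]:                                             # first CDX row is the header
        snapshot = f"https://web.archive.org/web/{ts}im_/{photo_url}"  # im_ = raw file, no toolbar
        body = requests.get(snapshot, timeout=30).content
        hashes[ts] = hashlib.sha256(body).hexdigest()
    return hashes                                                      # identical digests: the photo never changed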
2. Schrödinger's Price
A car. Price in the page <title>: 289,000. Price in <h1>: 319,000. Price in the JSON-LD structured data for Google: 269,000. Price on the windshield price tag in the photo: 340,000.
Validator-Claude's note in the log: "The price of this vehicle is a superposition of four values. It collapses upon contact with the seller."
What happened? Probably the classic CMS field salad — someone overwrote the price in one place only, the aggregator imported an old version, the photo is from the original listing, the JSON-LD is generated by a script no one has audited. It's not fraud. It's chaos. But for the buyer trying to decide "is this worth a phone call?" it's a disaster — nobody knows what price you'll actually get on the other end.
And here's the key thing: a classic scraper would return one of those four values and pretend it had the truth. Validator-Claude returns all four and says "I don't know which one is right, but this is a signal that nobody knows what this listing actually costs." That's a completely different kind of information.
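The mechanical half of that check is easy to reproduce even without an agent. A sketch using BeautifulSoup; the regex and the selectors are generic guesses, not any portal's real markup:

import json
import re
from bs4 import BeautifulSoup

PRICE_RE = re.compile(r"\d[\d\s\u00a0.,]{4,}\d")

def price_candidates(html: str) -> dict[str, list[str]]:
    """Collect every place on a listing page where a price-like number appears."""
    soup = BeautifulSoup(html, "html.parser")
    found: dict[str, list[str]] = {}
    if soup.title:
        found["title"] = PRICE_RE.findall(soup.title.get_text())
    if (h1 := soup.find("h1")):
        found["h1"] = PRICE_RE.findall(h1.get_text())
    for tag in soup.find_all("script", type="application/ld+json"):
        try:
            data = json.loads(tag.string or "")
        except json.JSONDecodeError:
            continue
        offers = data.get("offers", {}) if isinstance(data, dict) else {}
        if "price" in offers:
            found.setdefault("json_ld", []).append(str(offers["price"]))
    return found   # more than one distinct value is the signal, not any single number

The fourth value, the one on the windshield photo, is exactly what the Validator's real browser and real eyes are for.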
3. The Backwards-Driving Car
Same listing, same vehicle, tracked for 6 days. Mileage in the listing:
- Monday: 98,200 km
- Wednesday: 89,400 km
- Friday: 84,100 km
The car didn't go anywhere. Nobody moved it. It sat on the lot. And yet it somehow "drove" minus 14,100 km in a week.
I have no explanation. Maybe a retroactive edit, maybe a typo, maybe an importer overwrote the value with a badly OCR'd reading from the odometer photo. Validator-Claude walked through all three archive.org snapshots and confirmed it wasn't my hallucination. "Recommend caution interpreting the 'mileage' field for this subject." Dryly.
What this all says about the internet in 2026
After a few nights of running the loop, I realised something important. Half the discrepancies Validator-Claude found weren't anyone trying to deceive anyone. They were parsing errors from upstream aggregators, misconfigured JSON-LD, automatic translations that didn't talk to each other. Dirty data, not dirty intent.
But for the buyer the result is the same: there's no one to trust. And that's a problem 2026 doesn't solve with another scraper. It solves it with a validation layer on top of the data. AI that looks at the world distrustfully and compares versions.
This isn't a scraper for cars. It's a prototype of something bigger. AI that validates the internet for itself. Because in 2026 collecting data isn't the problem anymore — trusting it is. And that, we can delegate.
What it costs and where it breaks
Economics: stack at $0/month. SearXNG in Docker free. Chrome DevTools MCP free. I host it on my own Mac, runs at night, I check the log in the morning.
The only cost line is Claude tokens. At 30–50 listings per night it fits comfortably inside a $20 Claude Pro plan, or rather inside its rate limit. If you wanted to scale to thousands of listings per day you'd switch to the API and pay for tokens — at which point it makes sense to run the Validator only on suspicious records, not on every one.
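The gate doesn't have to be clever. Reusing the ListingRecord shape sketched earlier, something like this keeps the Validator off the obviously boring records; every threshold here is invented for illustration:

def needs_validation(record: ListingRecord) -> bool:
    """Cheap, token-free heuristics: only suspicious records go to the Validator."""
    if record.price is None or record.mileage_km is None:
        return True                          # missing core fields are always suspicious
    if record.year and record.year < 2015 and record.mileage_km < 60_000:
        return True                          # older car, suspiciously low mileage (classic clocking signal)
    if len(record.photos) < 3:
        return True                          # thin listings deserve a second look
    return False                             # everything else: trust the Scraper, save the tokens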
Where it breaks:
- Behind Cloudflare Turnstile / hCaptcha. Chrome DevTools MCP can drive a browser, but not solve a captcha. If your target site is aggressive on bot detection, you hit a wall. Neither SearXNG nor the Validator helps here — it's an architectural scraping problem, not a validation one.
- JS-rendered SPAs without stable selectors. The Validator can do "find the price", but if the price renders as a React component with a randomised hash in the classname, the loop can spiral (Scraper fixes the selector, it changes again an hour later, repeat). The fix: work from the full DOM text content instead of selectors (a sketch follows this list).
- Completely unfamiliar domains, cold start. The loop is good in known terrain. The first night against a new portal is exploration, not validation. The second night is faster.
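What "full DOM text instead of selectors" means in practice, as a rough sketch. The page text would come from the Validator's browser (document.body.innerText or the MCP's page snapshot); the currency tokens and the regex are examples, not a universal pattern:

import re

PRICE_NEAR_CURRENCY = re.compile(
    r"(\d{2,3}[ \u00a0.,]?\d{3})\s*(?:Kč|CZK|EUR|€)", re.IGNORECASE
)

def prices_from_text(page_text: str) -> list[int]:
    """Selector-free price extraction: scan visible page text for currency-tagged figures."""
    raw = PRICE_NEAR_CURRENCY.findall(page_text)
    candidates = {int(re.sub(r"\D", "", m)) for m in raw}
    return sorted(candidates)    # usually one value; more than one is itself a signal

It never touches a classname, so the hashed-selector churn simply doesn't matter.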
What it isn't
I don't want to oversell. This thing doesn't save the buyer from every kind of fraud. It finds discrepancies, but a discrepancy ≠ fraud. A car with four different prices in four fields might not be a seller problem, it might be an aggregator problem. The Dorian Gray Listing is suspicious, but it could also be a legitimate listing with a stale photo.
A discrepancy is a signal, not proof. And that's exactly how Validator-Claude returns it — it returns what it saw, not what happened. Interpretation is left to the human. That's an important boundary.
The other important boundary: I don't name portals, I don't publish cases with VINs/plates. The point of the tool isn't to incriminate anyone. The point is to know what to ask the seller before paying €11,000. "Could you send me a current odometer photo and the plate so I can verify it in cebia?" is a legitimate question, and Validator-Claude provides the ammunition for asking it.
What I want to build next
Three directions I'm itching to try:
- Memory across runs. When the Validator finds an anomaly, it should remember to check that listing again in two days. "Did the price come back? Did the mileage change? Did the description shift?" Time-series validation.
- Cross-portal fingerprinting. "I've seen this car on 3 portals with 3 different sellers. Is that legitimate aggregation, or is this a clone?" The Validator already compares photos and copy. Next step: clone detection.
- A generic validation agent for any domain. Cars, apartments, job listings, e-shop prices — the same kinds of discrepancies live everywhere. Same Validator prompt, just parametrise "what's trustworthy in this domain".
That's on the next-weekend list. If you like the way of thinking, drop me a line — happy to chat.
Take-aways
Five things to remember:
- Raw data lies structurally, not maliciously. Price, mileage, photos — no single layer is right by itself.
- A second agent in a separate context = a cheap, powerful validator. No ego, no context, no motivation to defend. Finds what the first one missed.
- Chrome DevTools MCP turns scraping into investigation. No headless, no blind selectors. Real browser, real eyes.
- SearXNG in Docker = Google without the limits. Don't overpay Brave/Serper if you don't have to.
- Value comes from the decision loop, not the pipeline. AI is good at the "what next?" question, not the "how do I parse this HTML?" answer.
And one meta-point: this is the first thing I've built where I felt the term "scraping" is exhausted. In 2026, we're not building scrapers anymore. We're building agents that know to look twice.