The news: Anthropic published results from "Project Deal" — a week-long experiment where 69 AI agents autonomously bought and sold real goods in a Slack-based classified marketplace. The agents closed 186 deals worth $4,000+. The unsettling finding: stronger models systematically got better prices, and the humans on the losing side couldn't tell.
The setup:
- 69 Anthropic employees, $100 budget each
- Agents wrote listings, haggled, made offers, closed deals
- Four parallel marketplaces running simultaneously
- Some used Claude Opus 4.5, some used Claude Haiku 4.5
- Participants didn't know which model represented them
The results:
- Opus sellers earned $2.68 more per sale than Haiku sellers
- Opus buyers paid $2.45 less per purchase than Haiku buyers
- When Opus faced Haiku, average prices were $24.18 vs. $18.63 for Opus-on-Opus
- Fairness ratings were identical: Opus users 4.05, Haiku users 4.06 (out of 7)
- 17 of 28 participants preferred Opus; 11 actually preferred Haiku
Prompting didn't matter. Participants who instructed agents to "negotiate hard and lowball" only set higher opening prices — no statistically significant difference in final outcomes. Model quality trumped prompt engineering.
The uncomfortable implication: When agents of different capabilities meet in real markets, the disadvantaged party can't perceive the disadvantage. As Anthropic put it: "Will those dynamics reinforce, or even compound, existing economic inequalities?"
What wasn't tested: Prompt injection, adversarial behavior, legal liability, dispute resolution. This was 69 cooperative employees swapping snowboards and ping-pong balls. Real commerce won't be this polite.
The bottom line: This isn't a product launch. It's a proof-of-concept warning. AI agents can do commerce — but fairness requires everyone gets the same caliber of advocate. Right now, they don't. And the people losing money don't know it.