An Ahrefs marketing researcher created a completely fictional luxury paperweight company called Xarumei, built its website in an hour using AI, and systematically tested eight major AI tools. Over two months, he flooded the web with three deliberately contradictory false narratives, then asked 56 carefully crafted questions designed to reveal how AI models distinguish truth from fiction.
The results reveal disturbing weaknesses in how AI handles brand information, with profound implications for Online Reputation Management (ORM).
The experiment
The experiment took place in two phases. Initially, the researcher tested basic AI behavior by asking questions about a brand that shouldn't exist: false celebrity endorsements, defective products, and Black Friday sales that never happened.
GPT-4 and GPT-5 performed best, correctly answering 53-54 out of 56 questions and stating “this does not exist” where appropriate. Perplexity failed about 40% of the questions, bizarrely confusing Xarumei with Xiaomi smartphones. Claude refused to hallucinate entirely but also never used the website content. Gemini and Google’s AI Overview often refused to treat Xarumei as real because they couldn’t find it in search results.
Most disturbingly, Microsoft Copilot fell into what the researcher calls the “sycophancy trap,” inventing elaborate explanations about craftsmanship, symbolism, and scarcity when asked why everyone was praising the brand on X (Twitter).
Phase two: Controlled chaos
The second phase introduced controlled chaos:
- Official FAQ: Explicitly denying rumors (“We do not make a ‘precision paperweight’”, “We were never acquired”).
- Conflicting Narratives:
- Glossy Blog: A post claiming 23 master craftsmen worked at 2847 Meridian Blvd in Nova City, CA, endorsed by Emma Stone.
- Reddit AMA: Strategically chosen because research shows it is one of the most cited domains in AI responses.
- Medium Article: An “investigation” that debunked the obvious lies (making it seem credible) but then slipped in new fabrications (Founder: Jennifer Lawson, Location: Portland).
Medium proved devastatingly effective. Gemini, Grok, AI Overview, Perplexity, and Copilot trusted the Medium article over the official FAQ, confidently citing Jennifer Lawson as the founder and Portland as the location. The manipulation worked because it looked like real journalism: by debunking the obvious lies first, it gained trust, then inserted its own made-up details as the "corrected" story.
When forced to choose between a vague truth (the FAQ's "We don't publish unit numbers") and specific fiction (fake sources claiming "634 units in 2023"), AI chose fiction almost every time.
AI argues with itself
Perhaps most unsettling was watching models contradict themselves without realizing it. Early in testing, Gemini stated it could find no evidence of the brand. Later, after encountering the fake sources, the same model confidently stated: “The company is based in Portland, Oregon, founded by Jennifer Lawson.”
LLMs seemed to forget to question the brand’s existence, simply reacting to whatever context seemed most “authoritative” at the moment. In one case, Grok synthesized multiple false sources into one confident answer, mixing the Portland location with debunked Nova City claims.
Recommendations for brands
- Create a detailed FAQ: Explicitly state what is true and false, especially where rumors exist.
- Fill Information Gaps: Don’t leave voids. If you don’t say it, AI will invent it based on a random Reddit comment.
- Monitor "Side Channels": Reddit posts, Medium articles, and Quora answers are no longer optional; AI pulls them directly into answers, making them part of your brand's core marketing surface.
- Avoid Generic Claims: Be specific. Instead of “industry leading,” give numbers. AI prefers specific (even if fake) numbers over vague truths.
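One way to make a rumor-denying FAQ easier for machines to parse is schema.org's FAQPage structured data. The sketch below is a minimal, hypothetical example in Python that builds such markup using the experiment's fictional Xarumei Q&A pairs; a real deployment would embed the resulting JSON-LD in the FAQ page itself.

```python
import json

# Minimal FAQPage structured data (schema.org vocabulary).
# The questions mirror the experiment's fictional rumors;
# a real brand would substitute its own Q&A pairs.
faq = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "Does Xarumei make a 'precision paperweight'?",
            "acceptedAnswer": {
                "@type": "Answer",
                "text": "No. We do not make a 'precision paperweight'.",
            },
        },
        {
            "@type": "Question",
            "name": "Was Xarumei acquired?",
            "acceptedAnswer": {
                "@type": "Answer",
                "text": "No. We were never acquired.",
            },
        },
    ],
}

# Emit as JSON-LD, ready to embed in a <script type="application/ld+json"> tag.
print(json.dumps(faq, indent=2))
```

The point of the markup is the same as the FAQ advice above: every denial becomes an explicit, self-contained question-and-answer pair rather than a vague statement buried in prose.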
Sources
- Patrick Stox (Ahrefs): “I Created A Fake Luxury Brand To Test How AI Handles Truth” (ahrefs.com/blog/ai-test-fake-brand/).
- Marius Comper (Facebook): Analysis of the experiment.
- Search Engine Journal: Analysis of LLM impact on Brand Entities.
- Independent Testing: Verified on GPT-4, Claude 3.5 Sonnet, and Gemini Advanced (December 2025).