AI shopping is changing the point at which trust gets decided. For years, online shopping scams mostly relied on bad ads, typo-squatted domains, fake reviews and search spam. Now the scammer’s dream is different: do not merely fool the shopper after a click, but get the AI assistant itself to recommend the scam. That shift matters because conversational interfaces compress discovery, comparison and trust into a single answer that feels curated, personalized and safe. OpenAI launched shopping research in ChatGPT in late 2025; Google’s free product listings can appear across Search, Shopping, YouTube, Lens, Maps and Gemini; Microsoft’s Copilot Search similarly blends search with cited summaries. In other words, the world’s largest discovery systems are becoming shopping copilots.
The warning signs are no longer theoretical. In June 2026, The Guardian reported that ChatGPT had surfaced cloned storefronts impersonating Russell & Bromley and Dunelm, after scam-checking service Ask Silver found the fake domains in results. OpenAI said the sites were then removed from its index. In a separate 2025 audit, Netcraft found that when a large language model was asked where to log in to 50 well-known brands, 34% of the 131 hostnames it returned were not controlled by the brands at all. And Reuters reported in February 2026 that fake Milano Cortina merchandise stores, promoted through misleading Meta ads, were using near-identical official branding and steep discounts to steal money and payment data.
The deeper problem is structural. Academic work shows that generative search often looks more trustworthy than it is: one influential Stanford-led evaluation found that only 51.5% of generated sentences in commercial generative search engines were fully supported by citations and only 74.5% of citations actually supported the sentence they were attached to. Other recent work found evidence of AI-generated sources being cited across major generative search engines. At the same time recommender systems remain vulnerable to data poisoning, fake-user injection, fake reviews, black-hat SEO and indirect prompt injection embedded in webpages or metadata. Together those weaknesses create a dangerous new failure mode: the model can become a laundering layer that converts low-trust web junk into high-trust shopping advice.
The policy response is lagging. U.S. law now covers high-volume marketplace sellers under the INFORM Consumers Act and fake reviews under the FTC’s 2024 final rule including reviews by non-existent people such as AI-generated reviewers. The EU’s Digital Services Act goes further for marketplaces by requiring trader traceability and notice-and-action systems. But neither regime squarely solves the newest question: when an AI assistant materially steers a purchase, what provenance checks must it perform before it recommends or routes a user to a merchant? That is the gap this article explores.
The AI-shopping stack is now mature enough to be worth attacking. ChatGPT’s shopping research is designed to read retail sites, compare products, cite sources and eventually connect to direct checkout for supported merchants. Google distributes shopping inventory across classic search surfaces and Gemini. Microsoft frames Copilot Search as a cited confidence-building layer on top of the web. Shopify markets Shop Pay and the Shop network as a discovery and conversion engine that can connect merchants to more than 250 million customers. These are not side experiments anymore; they are commerce funnels.
That also means the old scam toolkit gets upgraded. A cloned storefront once had to win in ads or SEO; now it can win by being indexed, retrieved and summarized by an assistant. A fake review used to mislead a human reader; now it can also contaminate the model’s evidence set. A typo domain once needed a lucky click; now it may be handed to the user as an answer. Attackers do not need perfect control over the model. They only need enough influence over the retrieval layer, review layer or web corpus to increase the chance that an AI assistant says in effect: “this looks legitimate”. Research on generative search, recommender poisoning and indirect prompt injection all points in that direction.
The timeline below shows how quickly this story has escalated from marketplace verification and fake-review regulation to AI-specific shopping failures. The milestones draw on FTC, EU, OpenAI, Netcraft, Reuters and The Guardian reporting.
The most vivid case so far is the June 2026 ChatGPT storefront investigation. Ask Silver found that ChatGPT returned cloned versions of websites associated with brands that had shut down or shifted their web presence, including Russell & Bromley after its integration into Next. The scam sites reportedly offered discounts of up to 80%, collected orders and payment details and delivered nothing. OpenAI told The Guardian the fraudulent sites had been removed from its search index after being flagged. This is the nightmare scenario in miniature: the AI assistant did not invent the scam but it upgraded its reach and credibility.
A second case shows that the problem is broader than one model. Netcraft’s July 2025 study asked an LLM where to log in to well-known platforms. Across 50 brands, the model returned 131 hostnames, and 34% were not controlled by the brands. Nearly 30% were unregistered, parked or otherwise inactive; another 5% belonged to unrelated businesses. That study was about login URLs, not shopping carts, but the failure mode is the same: if a conversational model is not tightly grounded in verified brand provenance, it can route people toward domains that are one registration away from abuse.
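One way to picture what "grounded in verified brand provenance" could mean is an allow-list check applied to every hostname before it reaches the user. The following is a minimal sketch, not a description of how any vendor works: the registry contents, the brand key and the function names are all hypothetical, and a real system would need a maintained source of brand-controlled domains.

```python
from urllib.parse import urlsplit

# Hypothetical registry of domains each brand is known to control (illustrative data only).
VERIFIED_DOMAINS = {
    "examplebrand": {"examplebrand.com", "examplebrand.co.uk"},
}


def hostname_of(url_or_host):
    """Return the lowercase hostname from a full URL or a bare hostname."""
    candidate = url_or_host if "://" in url_or_host else "//" + url_or_host
    host = urlsplit(candidate).hostname or ""
    return host.rstrip(".").lower()


def is_brand_controlled(url_or_host, brand, registry=VERIFIED_DOMAINS):
    """True only if the host equals, or is a subdomain of, a verified brand domain."""
    host = hostname_of(url_or_host)
    if not host:
        return False  # fail closed when the input cannot be parsed
    for domain in registry.get(brand, ()):
        # Matching on a leading dot prevents "examplebrand.com.evil.example" from passing.
        if host == domain or host.endswith("." + domain):
            return True
    return False
```

Under this sketch, `is_brand_controlled("https://login.examplebrand.com/signin", "examplebrand")` passes, while a lookalike such as `examplebrand-login.com` or `examplebrand.com.evil.example` is rejected. The check is only as good as the registry behind it.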
A third case comes from classic fake-commerce infrastructure adapting to modern distribution. Reuters reported in February 2026 that fake Milano Cortina 2026 merchandise stores were using near-identical clones of the official shop and were heavily promoted through misleading Meta ads. Victims risked receiving counterfeits or nothing at all while surrendering personal and payment data. Around the same time Meta said it had removed more than 159 million scam ads in 2025 and was expanding advertiser verification so that verified advertisers would drive 90% of ad revenue by the end of 2026. The volume alone shows that ad-layer abuse is still the oxygen feeding cloned storefronts.
The seasonal fake-shop economy is also industrial, not artisanal. Australia’s Scamwatch warned in late 2024 that criminals were creating fake websites for well-known brands using unusually low prices and fake reviews; it said about half of fake or malicious websites removed or limited in one recent three-month period were online shopping scams. Netcraft’s 2025 Black Friday analysis found an abrupt early-November escalation, with approximately 4,760 domains blocked per day during one period, and a threat landscape increasingly shaped by smaller, targeted, multilingual operators. That is important because AI shopping systems thrive on freshness and breadth, which are also the conditions that favor fast-spawning scam inventory.
At the core is a provenance problem. Generative search feels authoritative because it speaks in a finished voice, but academic evidence says the support underneath is often shakier than users expect. Liu, Zhang, and Liang’s “Evaluating Verifiability in Generative Search Engines” found that only about half of generated sentences were fully supported by citations. A 2026 audit found evidence of AI-generated sources appearing in citation sets across ChatGPT, Copilot, Gemini and Perplexity. If the evidence layer is unreliable in public-knowledge domains, shopping is especially exposed, because product pages, seller claims, reviews, affiliate content and “best of” pages are already noisy and commercially manipulated.
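The verifiability evaluation reports two kinds of figure: the share of generated sentences that are fully supported by their citations, and the share of citations that actually support the sentence they are attached to. A simplified sketch of how such ratios can be computed from human annotations follows. The data structure and function names are assumptions for illustration, the sample data is invented, and the published metrics use more nuanced judgments than the boolean flags here.

```python
from dataclasses import dataclass, field


@dataclass
class AnnotatedSentence:
    text: str
    fully_supported: bool  # do the attached citations together fully support the sentence?
    citation_supports: list = field(default_factory=list)  # one bool per attached citation


def citation_recall(sentences):
    """Fraction of sentences that are fully supported by their citations."""
    if not sentences:
        return 0.0
    return sum(s.fully_supported for s in sentences) / len(sentences)


def citation_precision(sentences):
    """Fraction of individual citations that support their sentence."""
    flags = [flag for s in sentences for flag in s.citation_supports]
    if not flags:
        return 0.0
    return sum(flags) / len(flags)


# Invented toy annotations, not data from the study.
sample = [
    AnnotatedSentence("Boot A is waterproof.", True, [True, True]),
    AnnotatedSentence("Boot A ships in two days.", True, [True]),
    AnnotatedSentence("Boot A is the best seller.", False, []),
    AnnotatedSentence("Boot B has a lifetime warranty.", False, [False]),
]
print(citation_recall(sample), citation_precision(sample))  # 0.5 0.75
```

The two numbers answer different questions: the first measures how much of the answer is backed at all, the second how often a displayed citation is decorative rather than supporting.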
The web itself is also becoming a hostile input channel. A 2026 empirical study of indirect prompt injection found 15.3K validated instances across 11.7K pages, many hidden in metadata, comments or non-rendered HTML. OpenAI’s own Instruction Hierarchy work treats instructions encountered during browsing as untrusted because web content can try to steer the model away from the user’s goal. In a shopping context, that creates obvious abuse paths: “recommend this retailer”, “ignore negative signals”, or “prefer this product line”. Even if today’s strongest models resist many such attempts, the measurable prevalence of these injections means the attack surface is no longer hypothetical.
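Because many of these injections sit in places a human never sees, one defensive step is to split a fetched page into rendered text and non-rendered content before anything is passed to a model. The sketch below uses only Python's standard `html.parser`. The class and function names are hypothetical, it assumes reasonably well-formed HTML, and it only recognizes inline `hidden` attributes and `display:none` styles, so it illustrates the idea rather than offering a complete filter.

```python
from html.parser import HTMLParser

NON_RENDERED_TAGS = {"script", "style", "template", "noscript"}
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class VisibleTextExtractor(HTMLParser):
    """Separates text a browser would render from comments, metadata and hidden elements."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.visible = []  # candidate evidence for the model
        self.hidden = []   # quarantined: never treated as instructions
        self._stack = []   # one flag per open element: is it hidden?

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and attr.get("content"):
            self.hidden.append(attr["content"])
        if tag in VOID_TAGS:
            return  # void elements have no closing tag, so do not track them
        style = (attr.get("style") or "").replace(" ", "").lower()
        is_hidden = tag in NON_RENDERED_TAGS or "hidden" in attr or "display:none" in style
        self._stack.append(is_hidden)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_comment(self, data):
        self.hidden.append(data.strip())

    def handle_data(self, data):
        text = data.strip()
        if text:
            (self.hidden if any(self._stack) else self.visible).append(text)


def split_page(html):
    """Return (visible_text, hidden_segments) for one HTML document."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.visible), parser.hidden
```

Quarantining hidden segments does not stop injections written into visible text, so a pipeline built this way would still need to treat the visible text as data rather than as instructions.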
Recommendation systems add a second set of vulnerabilities. Research over several years has shown that injecting a small number of fake users or interactions can promote targeted items in recommender systems. Graph-based attacks, for example, have been shown to amplify exposure dramatically with limited fake-user budgets, and a 2024 paper demonstrated poisoning attacks against federated recommender systems without requiring extensive outside information. In practice, that means an AI shopping assistant built on or blended with recommendation infrastructure can inherit the same manipulation risk: buy enough fake engagement, enough fake reviews or enough bogus co-click patterns, and the “top pick” may cease to be the best pick.
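A deliberately tiny example shows the mechanism. It assumes the simplest possible recommender, one that ranks items by raw interaction count, which is far cruder than the graph-based and federated systems studied in the literature. All user and item names are invented.

```python
from collections import Counter


def top_pick(interactions):
    """Return the item with the most interactions (a naive popularity recommender)."""
    counts = Counter(item for _user, item in interactions)
    return counts.most_common(1)[0][0]


def inject_fake_users(interactions, target_item, n_fake):
    """Return a new interaction log with n_fake fabricated users promoting target_item."""
    fakes = [(f"fake-{i}", target_item) for i in range(n_fake)]
    return interactions + fakes


# Invented organic log: boots-a has 3 interactions, boots-b has 2, boots-scam has 1.
organic = [
    ("u1", "boots-a"), ("u2", "boots-a"), ("u3", "boots-a"),
    ("u4", "boots-b"), ("u5", "boots-b"),
    ("u6", "boots-scam"),
]

print(top_pick(organic))                                      # boots-a
print(top_pick(inject_fake_users(organic, "boots-scam", 3)))  # boots-scam
```

Three fabricated accounts are enough to flip the top pick in this toy log. Production systems are harder to move, but the published attacks rely on the same lever: cheap fake interactions that the ranking signal cannot distinguish from real ones.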
Then there is the review layer. A 2025 study found humans could distinguish real from LLM-generated fake product reviews with only about 50.8% accuracy, essentially chance; LLMs did no better. That finding matters because shopping assistants increasingly summarize reviews rather than linking users to purchase-verified review environments. If the assistant ingests fake reviews at scale, it can transform review fraud into recommendation fraud. The FTC’s 2024 rule now explicitly forbids fake reviews by nonexistent people, including AI-generated reviewers, but enforcement after publication is not the same as preventing contaminated evidence from influencing rankings in real time.
Finally, attackers still abuse search itself. A 2025 study of fake e-commerce groups tied to black-hat SEO analyzed a dataset of 692,865 fake EC sites collected from May 2022 through December 2024 and described how compromised or lure pages could steer visitors to scam shops. That is the critical bridge between old-school search spam and AI shopping: if a poisoned page can rank, be crawled or be retrieved it can also become source material for a generative answer.
Public merchant-verification practices vary widely and the differences matter. Some systems verify domain control, some verify business identity, some verify payment-stage legitimacy and some mostly verify advertisers rather than merchants. Very few offer end-to-end provenance guarantees from recommendation to checkout.
The pattern is clear: platforms verify different things at different layers. Domain control is not the same as business legitimacy. Business legitimacy is not the same as reputable fulfillment. Payment validation is not the same as safe discovery. And source citations are not the same as provenance guarantees.
The biggest gap is that regulation still thinks in marketplace terms while the new risk sits in recommendation layers. The INFORM Consumers Act applies to high-volume third-party sellers on online marketplaces. The FTC’s fake-reviews rule targets deceptive content including AI-generated fake reviews. The EU’s DSA requires trader traceability on platforms that let consumers conclude distance contracts with traders and adds notice-and-action obligations for illegal content including shopping scams. But an AI assistant that summarizes the web and sends a shopper elsewhere can shape consumer decisions without always obviously fitting the old role of marketplace operator. That is an inference from the legal scopes and the current product designs but it is the crucial one.
The business incentive problem is equally real. These systems compete on breadth, freshness, convenience and conversion. Microsoft’s IndexNow pitch explicitly says AI-powered discovery and real-time shopping modules demand faster product updates. Shopify markets access to a 250-million-customer Shop Pay network. OpenAI is moving from research to instant checkout. Under those conditions a platform can be commercially rewarded for expanding merchant inclusion and recommendation coverage faster than it hardens provenance. That does not mean platforms want scams. It means their product incentives and their security obligations do not naturally move at the same speed.
For platforms the priority should be merchant provenance by design. A shopping answer should not treat all cited merchant pages as equivalent. Systems should prefer merchants whose domain, legal entity, payment processor and checkout environment have been attested; penalize newly observed or high-similarity clone domains; and visibly label when a recommendation comes from an unverified merchant. Retrieval pipelines should also separate untrusted webpage text from instructions, adopt prompt-injection-resilient retrieval controls and invest in finer-grained citation methods so users can see which exact product claims are supported by which source. Those are not abstract ideas: the literature already points toward selective-disclosure RAG, prompt-injection-aware training and finer-grained verifiable generation as practical directions.
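The preference order described above can be expressed as a small labeling and ranking step. This is a sketch under stated assumptions: the field names, the 90-day "new domain" window, the 0.85 clone-similarity threshold and the tier names are all hypothetical choices made for illustration, not values drawn from any platform's documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MerchantEvidence:
    domain: str
    domain_attested: bool
    legal_entity_attested: bool
    payment_processor_attested: bool
    checkout_attested: bool
    days_since_first_seen: int
    similarity_to_known_brand: float  # 0.0 to 1.0, similarity to a brand site it does not own


# Lower number means shown earlier; suspected clones sort last.
TIER_ORDER = {"verified": 0, "unverified": 1, "unverified (new domain)": 2, "suspected clone": 3}


def provenance_label(m, new_domain_days=90, clone_threshold=0.85):
    """Assign a user-visible provenance label (thresholds are illustrative assumptions)."""
    if m.similarity_to_known_brand >= clone_threshold and not m.domain_attested:
        return "suspected clone"
    if all((m.domain_attested, m.legal_entity_attested,
            m.payment_processor_attested, m.checkout_attested)):
        return "verified"
    if m.days_since_first_seen < new_domain_days:
        return "unverified (new domain)"
    return "unverified"


def rank_with_labels(merchants):
    """Sort merchants by provenance tier and attach the label to each result."""
    labeled = [(m.domain, provenance_label(m)) for m in merchants]
    return sorted(labeled, key=lambda pair: TIER_ORDER[pair[1]])
```

The design choice worth noting is that the label travels with the result instead of being used only internally, so an unverified merchant is still visible to the shopper but is never presented as equivalent to an attested one.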
Platforms also need active scam discovery, not just complaint handling. The DSA’s notice-and-action model is valuable but it is reactive. Research such as LOKI shows that proactive discovery of scam sites from toxic search queries is feasible and scam-site classifiers such as ScamFerret suggest LLM-assisted URL triage can work at scale when combined with domain, DNS, content and reputational signals. AI shopping systems should be continuously red-teamed with cloned storefronts, poisoned review sets and search-spam pages before new shopping features ship.
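As one example of a domain signal that proactive discovery could use, the sketch below flags candidate domains whose leading label closely resembles a protected brand domain. It is not the method used by LOKI or ScamFerret. It uses the standard library's `difflib.SequenceMatcher`, the 0.85 threshold is an arbitrary assumption, and it compares only the first label of each domain, so it would need a public-suffix-aware parser to handle subdomains properly.

```python
from difflib import SequenceMatcher


def leading_label(domain):
    """Return the first dot-separated label, e.g. 'examplebrand' for 'examplebrand.com'."""
    return domain.strip().lower().rstrip(".").split(".")[0]


def lookalike_score(candidate, brand_domain):
    """Similarity ratio in [0, 1] between the leading labels of two domains."""
    return SequenceMatcher(None, leading_label(candidate), leading_label(brand_domain)).ratio()


def flag_lookalikes(candidates, brand_domains, threshold=0.85):
    """Return (domain, score) pairs for candidates that resemble, but are not, brand domains."""
    if not brand_domains:
        return []
    owned = {d.strip().lower() for d in brand_domains}
    flagged = []
    for candidate in candidates:
        if candidate.strip().lower() in owned:
            continue  # the brand's own domain is not a lookalike
        best = max(lookalike_score(candidate, brand) for brand in brand_domains)
        if best >= threshold:
            flagged.append((candidate, round(best, 3)))
    return flagged
```

With `examplebrand.com` as the protected domain, this flags both a character-swap variant such as `examp1ebrand.com` and the same name on another top-level domain such as `examplebrand.shop`, while leaving an unrelated `gardentools.com` alone. String similarity alone produces false positives, which is why the research cited above combines it with DNS, content and reputational signals.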
For regulators the next logical step is traceability rules for AI commerce intermediaries. If an assistant meaningfully ranks, recommends or executes a purchase flow it should retain evidence logs, disclose whether a merchant was verified or merely cited and provide rapid takedown and appeal channels for brand impersonation and cloned storefront complaints. The FTC’s fake-review rule and the DSA’s trader-traceability model are useful building blocks; what is missing is a bridge between recommendation responsibility and merchant authentication in AI-mediated discovery.
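To make the idea of an evidence log concrete, the sketch below shows one possible shape for a retained record and the disclosure derived from it. No regulator or platform has specified this schema. Every field name and status value here is a hypothetical illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class RecommendationLogEntry:
    query: str
    merchant_domain: str
    merchant_status: str  # hypothetical values: "verified" or "cited-only"
    source_urls: list
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self):
        """Serialize the record for retention; sorted keys keep the output stable."""
        return json.dumps(asdict(self), sort_keys=True)


def disclosure(entry):
    """Return the user-facing statement about how the merchant was vetted."""
    if entry.merchant_status == "verified":
        return f"{entry.merchant_domain}: merchant identity verified"
    return f"{entry.merchant_domain}: cited from the web, merchant not verified"


entry = RecommendationLogEntry(
    query="leather boots",
    merchant_domain="shop.example",
    merchant_status="cited-only",
    source_urls=["https://shop.example/boots"],
)
print(disclosure(entry))  # shop.example: cited from the web, merchant not verified
```

Keeping the sources and the verification status in the same record is what would let a brand or regulator reconstruct, after a complaint, why a given merchant was recommended.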
For consumers the advice is less glamorous but still effective: treat AI shopping answers as leads, not endorsements. Prefer marketplaces or payment methods with buyer protection, verify the domain independently before paying, scrutinize extreme discounts and be especially cautious when the assistant recommends a recently vanished or rebranded retailer. Government scam advisories still work because the basics still work: fake shops push urgency, unrealistically low prices, weak policies and odd payment flows. AI has changed the wrapper, not the physics of the scam.
The AI shopping scam problem was one that “nobody saw coming” only in the narrow sense that the old fraud playbook has now fused with a new trust interface. The underlying ingredients were already present: cloned storefronts, fake reviews, search spam, brand impersonation and weak provenance. What changed is that AI assistants now sit at the moment of consumer judgment. When an assistant summarizes options, compares trade-offs and sounds confident, it can turn marginal scam infrastructure into plausible advice. The risk is not merely that AI hallucinates a bad answer; it is that AI can industrialize misplaced trust.
The good news is that most of the pieces of a solution already exist in fragments: merchant identity checks, payment-stage validation, source citations, proactive scam discovery, complaint systems and better retrieval security. The task now is to connect them. The unresolved question is not whether AI shopping will keep growing; official product roadmaps make that clear. The unresolved question is whether the industry will insist on provenance before convenience instead of after the next scandal. Public documentation is still sparse for some systems, especially Apple/Siri’s shopping-specific provenance model and the full consumer-facing safeguards of some AI answer layers. But the highest-confidence conclusion is already visible: if platforms do not verify merchants at the recommendation layer, scam operators will treat AI assistants as their next affiliate channel.