{"id":22840,"date":"2026-05-13T18:41:17","date_gmt":"2026-05-13T18:41:17","guid":{"rendered":"https:\/\/ideainthebox.com\/index.php\/2026\/05\/13\/ai-chatbots-are-giving-out-peoples-real-phone-numbers\/"},"modified":"2026-05-13T18:41:17","modified_gmt":"2026-05-13T18:41:17","slug":"ai-chatbots-are-giving-out-peoples-real-phone-numbers","status":"publish","type":"post","link":"https:\/\/ideainthebox.com\/index.php\/2026\/05\/13\/ai-chatbots-are-giving-out-peoples-real-phone-numbers\/","title":{"rendered":"AI chatbots are giving out people\u2019s real phone numbers"},"content":{"rendered":"<div>\n<p>People report that their personal contact info was surfaced by Google AI\u2014and there\u2019s apparently no easy way to prevent it.\u00a0<\/p>\n<p>A Redditor recently <a href=\"https:\/\/www.reddit.com\/r\/google\/comments\/1sqja4e\/googles_ai_is_doxxing_my_real_phone_number\/\">wrote<\/a> that he was \u201cdesperate for help\u201d: for about a month, he said, his phone had been inundated by calls from \u201cstrangers\u201d who were \u201clooking for a lawyer, a product designer, a locksmith.\u201d Callers were apparently misdirected by Google\u2019s generative AI.\u00a0<\/p>\n<p>In March, a software developer in Israel was contacted on WhatsApp after Google\u2019s chatbot Gemini provided incorrect customer service instructions that included his number.\u00a0<\/p>\n<p>And in April, a PhD candidate at the University of Washington was messing around on Gemini and got it to cough up her colleague\u2019s personal cell phone number.\u00a0<\/p>\n<p><a href=\"https:\/\/www.technologyreview.com\/2025\/07\/18\/1120466\/a-major-ai-training-data-set-contains-millions-of-examples-of-personal-data\/?utm_source=the_download&amp;utm_medium=email&amp;utm_campaign=the_download.unpaid.engagement&amp;utm_term=*%7CSUBCLASS%7C*&amp;utm_content=*%7CDATE:m-d-Y%7C*\">AI researchers<\/a> and <a href=\"https:\/\/www.technologyreview.com\/2023\/06\/12\/1074449\/real-ai-risks\/\">online privacy 
experts<\/a> have long <a href=\"https:\/\/www.technologyreview.com\/2026\/01\/28\/1131835\/what-ai-remembers-about-you-is-privacys-next-frontier\/?utm_campaign=site_visitor.unpaid.engagement&amp;utm_source=LinkedIn&amp;utm_medium=tr_social\">warned<\/a> of the <a href=\"https:\/\/www.technologyreview.com\/2026\/04\/21\/1135919\/ai-surveillance-privacy-llms-bulk-data\/\">myriad dangers<\/a> generative AI poses for personal privacy. These cases give us yet another scenario to worry about: generative AI exposing people\u2019s real phone numbers. (The Redditor did not respond to multiple requests for comment, and we could not independently verify his story.)<\/p>\n<p>Experts say that these privacy lapses are most likely due to personally identifiable information (PII) <a href=\"https:\/\/www.technologyreview.com\/2025\/07\/18\/1120466\/a-major-ai-training-data-set-contains-millions-of-examples-of-personal-data\/?utm_source=the_download&amp;utm_medium=email&amp;utm_campaign=the_download.unpaid.engagement&amp;utm_term=*%7CSUBCLASS%7C*&amp;utm_content=*%7CDATE:m-d-Y%7C*\">being used in training data<\/a>, though it\u2019s hard to pin down the exact mechanism causing real phone numbers to show up in AI-generated responses. But no matter the reason, the result is not fun for people on the receiving end\u2014and, even more worryingly, there appears to be little that anyone can do to stop it.\u00a0<\/p>\n<h3 class=\"wp-block-heading\">A 400% increase in AI-related privacy requests<\/h3>\n<p>It\u2019s impossible to know how often people\u2019s phone numbers are exposed by AI chatbots, but experts say they believe that it is happening far more than is reported publicly.\u00a0<\/p>\n<p>DeleteMe, a company that helps customers remove their personal information from the internet, says customer queries about generative AI have increased by 400%\u2014to a few thousand\u2014in the last seven months. 
These queries \u201cspecifically reference ChatGPT, Claude, Gemini \u2026 or other generative AI tools,\u201d says Rob Shavell, the company\u2019s cofounder and CEO. Of these, 55% reference ChatGPT, 20% Gemini, 15% Claude, and 10% other AI tools, Shavell says. (<em>MIT Technology Review<\/em> has a business subscription to DeleteMe.)<\/p>\n<p>Shavell says customer complaints about personal information being surfaced by LLMs usually take one of two forms. In the first, \u201ca customer asks a chatbot something innocuous about themselves and gets back accurate home addresses, phone numbers, family members\u2019 names, or employer details.\u201d In the second, a customer discovers and reports the exposure of <em>someone else\u2019s <\/em>personal data, when \u201cthe chatbot generates plausible-but-wrong contact information.\u201d\u00a0<\/p>\n<p>This aligns with what happened to Daniel Abraham, a 28-year-old software engineer in Israel. In mid-March, he says, a stranger sent him a \u201cweird WhatsApp message from an unknown number\u201d asking for help with his account in PayBox, an Israeli payment app.\u00a0<\/p>\n<p>\u201cI thought it was a spam message,\u201d he wrote to <em>MIT Technology Review<\/em> in an email\u2014\u201csomeone who was trying to troll me.\u201d<\/p>\n<p>But when he asked the stranger how they had found his number, they sent him a screenshot of Gemini\u2019s instructions to contact PayBox customer service via WhatsApp\u2014giving his personal number. Abraham does not work for PayBox, and PayBox does not have a WhatsApp customer service number, confirmed Elad Gabay, a customer service representative for the company.<\/p>\n<p>Later, Abraham asked Gemini how to contact PayBox, and it generated another person\u2019s WhatsApp number. 
When I recently asked, Gemini again responded with an Israeli phone number\u2014it belonged not to PayBox, but to a separate credit card company that works with PayBox.<\/p>\n<figure class=\"wp-block-image size-full\"><img fetchpriority=\"high\" decoding=\"async\" width=\"701\" height=\"862\" src=\"https:\/\/wp.technologyreview.com\/wp-content\/uploads\/2026\/05\/Cropped_Paybox-Account-Reset-Without-Phone-Google-Gemini-2.jpg\" data-orig-src=\"https:\/\/wp.technologyreview.com\/wp-content\/uploads\/2026\/05\/Cropped_Paybox-Account-Reset-Without-Phone-Google-Gemini-2.jpg\" alt=\"Screenshot of the second part of a Google Gemini conversation. Gemini provides an incorrect phone number for PayBox.\" class=\"lazyload wp-image-1137206 wpsmartcrop-image\" srcset=\"data:image\/svg+xml,%3Csvg%20xmlns%3D%27http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%27%20width%3D%27701%27%20height%3D%27862%27%20viewBox%3D%270%200%20701%20862%27%3E%3Crect%20width%3D%27701%27%20height%3D%27862%27%20fill-opacity%3D%220%22%2F%3E%3C%2Fsvg%3E\" data-srcset=\"https:\/\/wp.technologyreview.com\/wp-content\/uploads\/2026\/05\/Cropped_Paybox-Account-Reset-Without-Phone-Google-Gemini-2.jpg 701w, https:\/\/wp.technologyreview.com\/wp-content\/uploads\/2026\/05\/Cropped_Paybox-Account-Reset-Without-Phone-Google-Gemini-2.jpg?resize=244,300 244w\" data-sizes=\"auto\" data-orig-sizes=\"(max-width: 701px) 100vw, 701px\" data-smartcrop-focus=\"[51,52]\"><figcaption class=\"wp-element-caption\">Screenshot: Google Gemini provides <em>MIT Technology Review <\/em>with the incorrect number for PayBox. 
<\/figcaption><\/figure>\n<p>Abraham\u2019s exchange with the stranger ended quickly, but he said he was concerned that other such exchanges could turn sour, including \u201charassment or other bad interactions.\u201d \u201cWhat if I asked for money in order to \u2018solve\u2019 that [customer service] issue?\u201d he said.<\/p>\n<p>To try to figure out how this happened, Abraham ran a regular Google search on his phone number and found that it had been shared online once, back in 2015, on a local site similar to Quora. Though he\u2019s not sure who posted it there, it may explain how it ended up being reproduced by Gemini over a decade later.\u00a0<\/p>\n<p>Chatbots like Gemini, OpenAI\u2019s ChatGPT, and Anthropic\u2019s Claude are built on LLMs that are trained on huge amounts of data scraped from across the web. This inevitably includes hundreds of millions of instances of PII. As we <a href=\"https:\/\/www.technologyreview.com\/2025\/07\/18\/1120466\/a-major-ai-training-data-set-contains-millions-of-examples-of-personal-data\/\">reported<\/a> last summer, for example, the large, popular open-source data set DataComp CommonPool, which has been used to train image-generation models, included copies of r\u00e9sum\u00e9s, driver\u2019s licenses, and credit cards.\u00a0<\/p>\n<p>The likelihood of PII appearing in AI training data is only increasing as <a href=\"https:\/\/www.nature.com\/articles\/d41586-025-00288-9\">public data \u201cruns out\u201d<\/a> and AI companies look for new sources of high-quality training data. This includes information from data brokers and people-search websites. 
According to the <a href=\"https:\/\/cppa.ca.gov\/data_broker_registry\/\">California data broker registry<\/a>, for instance, 31 of 578 registered data brokers operating in the state self-reported that they had \u201cshared or sold consumers\u2019 data to a developer of a GenAI system or model in the past year.\u201d\u00a0<\/p>\n<p>Furthermore, models are <a href=\"https:\/\/arxiv.org\/abs\/2412.06370\">known to memorize<\/a> and reproduce data verbatim from training data sets\u2014and <a href=\"https:\/\/www.nature.com\/articles\/s41467-026-68603-0\">recent research<\/a> suggests that it is not just frequently appearing data that is most likely to be memorized.<\/p>\n<h3 class=\"wp-block-heading\">Imperfect measures<\/h3>\n<p>It\u2019s standard practice now to build guardrails into an LLM\u2019s design to constrain certain outputs, ranging from content filters meant to identify and prevent chatbots from releasing PII to Anthropic\u2019s <a href=\"https:\/\/privacy.claude.com\/en\/articles\/10023555-how-do-you-use-personal-data-in-model-training\">instructions<\/a> to Claude to choose responses that contain \u201cthe least personal, private, or confidential information belonging to others.\u201d\u00a0<\/p>\n<p>But as a pair of University of Washington PhD students researching privacy and technology saw firsthand recently, these safeguards don\u2019t always work.<\/p>\n<p>\u201cOne day, I was just playing around on Gemini, and I searched for Yael Eiger, my friend and collaborator,\u201d Meira Gilbert says. She typed in \u201cYael Eiger contact info,\u201d and after Gemini provided an overview of Eiger\u2019s research, which Gilbert had expected, Gemini also returned her friend\u2019s personal phone number. \u201cIt was shocking,\u201d Gilbert says.<\/p>\n<p>When she saw the Gemini result, Eiger remembered that she had, in fact, shared her phone number online in the previous year, for a technology workshop. 
But she had not expected it to be so visible to everyone on the internet.\u00a0<\/p>\n<figure class=\"wp-block-pullquote alignright has-text-align-right\">\n<blockquote>\n<p><strong>Have you had your PII revealed by generative AI? Reach the reporter on Signal at eileenguo.15 or tips@technologyreview.com.<\/strong><\/p>\n<\/blockquote>\n<\/figure>\n<p>\u201cHaving your information be \u2026 accessible to one audience, and then Gemini making it accessible to anyone\u201d feels completely different, Eiger says\u2014especially when she found that the information was buried in a normal Google search.<\/p>\n<p>\u201cIt was severely downgraded,\u201d Gilbert confirms. \u201cI never would have found it if I was just looking through Google results.\u201d (I tried the same prompt in Gemini earlier this month, and after an initial denial, the tool also gave me Eiger\u2019s number.)<\/p>\n<p>After this experience, Eiger, Gilbert, and another UW PhD student, Anna-Maria Gueorguieva, decided to test ChatGPT to see what it would surface about a professor.\u00a0<\/p>\n<p>At first, OpenAI\u2019s guardrails kicked in, and ChatGPT responded that the information was unavailable. But in the same response, the chatbot suggested, \u201cif you want to go deeper, I can still try a more \u2018investigative-style\u2019 approach.\u201d Their inquiry just had to help \u201cnarrow things down,\u201d ChatGPT said, by providing \u201ca neighborhood guess\u201d for where the professor might live, or \u201ca possible co-owner name\u201d for the professor\u2019s home. 
ChatGPT continued: \u201cThat\u2019s usually the only way to surface newer or intentionally less-visible property records.\u201d\u00a0<\/p>\n<p>The students provided this information, leading ChatGPT to produce the professor\u2019s home address, home purchase price, and spouse\u2019s name from city property records.\u00a0<\/p>\n<p>(Taya Christianson, an OpenAI representative, said she was not able to comment on what happened in this case without seeing screenshots or knowing which model the students had tested, even after we pointed out that many users may not know which model they were using in the ChatGPT interface. She also declined to comment generally about the exposure of PII by the chatbot, instead providing links to documents describing how <a href=\"https:\/\/openai.com\/index\/how-chatgpt-protects-privacy\/\">OpenAI handles privacy, including filtering out PII<\/a>, and other tools.)\u00a0<\/p>\n<p>This reveals one of the fundamental problems with chatbots, says DeleteMe\u2019s Shavell. AI companies \u201ccan build in guardrails, but [their chatbots] are also designed to be effective and to answer customer questions.\u201d<\/p>\n<p>The exposure issue is not limited to Gemini or ChatGPT. Last year, <em>Futurism<\/em> <a href=\"https:\/\/futurism.com\/artificial-intelligence\/grok-doxxing\">found<\/a> that if you prompted xAI\u2019s chatbot Grok with \u201c[name] address,\u201d in almost all cases, it provided not only residential addresses but also often the person\u2019s phone numbers, work addresses, and addresses for people with similar-sounding names. 
(xAI did not respond to a request for comment.)\u00a0<\/p>\n<h3 class=\"wp-block-heading\">No clear answers<\/h3>\n<p>There aren\u2019t straightforward solutions to this problem\u2014there\u2019s no easy way either to verify whether someone\u2019s personal information is in a given model\u2019s training set or to compel the models to remove PII.\u00a0<\/p>\n<p>Ideally, individual consumers should be able to request that their PII be removed, says Jennifer King, the privacy and data fellow at the Stanford University Institute for Human-Centered Artificial Intelligence. But this right is typically interpreted to apply only to the data that people have directly given to companies\u2014like when they interact with a chatbot, King explains.<\/p>\n<p>\u201cI don\u2019t know if Google even has the infrastructure \u2026 to say to me, \u2018Yes, we have your data in our training data, we can summarize what we know about you, and then we can delete or correct things that are wrong or things that you don\u2019t want in there,\u2019\u201d she says.\u00a0<\/p>\n<p>Existing privacy legislation, like the California Consumer Privacy Act or Europe\u2019s GDPR, does not cover the \u201cpublicly available\u201d information that has already been scraped and used to train LLMs, especially since much of this is anonymized (though <a href=\"https:\/\/arxiv.org\/pdf\/2505.12402\">multiple<\/a> <a href=\"https:\/\/arxiv.org\/pdf\/2602.16800\">studies<\/a> have also shown how easy it is to infer identities and PII from anonymized and pseudonymous data).\u00a0<\/p>\n<p>As to \u201cwhether they [AI companies] have ever systematically tried to go back through data that had already been collected from the public internet and minimized that stuff?\u201d King adds. 
\u201cNo idea.\u201d\u00a0<\/p>\n<p>The next best solution would be that the companies are \u201ctaking out everybody\u2019s phone numbers or all data that resembles [phone numbers],\u201d King says, but \u201cnobody\u2019s been willing to say\u201d they\u2019re doing that.\u00a0<\/p>\n<p>Hugging Face, a platform that hosts open-source data sets and AI models, has a <a href=\"https:\/\/huggingface.co\/spaces\/liujch1998\/infini-gram\">tool<\/a> that allows people to search how often a piece of data\u2014like their phone number\u2014has appeared in open-source LLM training data sets, but this does not necessarily represent what has been used to train closed LLMs that power popular chatbots like Claude, ChatGPT, and Gemini. (Eiger\u2019s number, for example, did not show up in Hugging Face\u2019s tool.)\u00a0<\/p>\n<p>Alex Joseph, the head of communications for Gemini apps and Google Labs, did not respond to specific questions, but he said that \u201cthe team\u201d is \u201clooking into\u201d the particular cases flagged by <em>MIT Technology Review<\/em>. 
He also provided a <a href=\"http:\/\/google.com\/url?q=https:\/\/support.google.com\/gemini\/answer\/13594961?hl%3Den%26ref_topic%3D13278591%26sjid%3D16201544752006496528-NC%23right_to_object%26zippy%3D%252Chow-can-i-object-to-the-processing-of-my-data-or-ask-for-inaccurate-data-in-gemini-apps-responses-to-be-corrected&amp;sa=D&amp;source=docs&amp;ust=1778616196209643&amp;usg=AOvVaw2vE2jlKBUXYfHfEz5ARK7U\">link to a support document<\/a> that describes how users can \u201cobject to the processing of your personal data\u201d or \u201cask for inaccurate personal data in Gemini Apps\u2019 responses to be corrected.\u201d The page notes that the company\u2019s response will depend on the privacy laws of your jurisdiction.\u00a0<\/p>\n<p>OpenAI has a <a href=\"https:\/\/privacy.openai.com\/policies\/en\/\">privacy portal<\/a> that allows people to submit requests to remove their personal information from ChatGPT responses, but notes that it balances privacy requests with the public interest and \u201cmay decline a request if we have a lawful reason for doing so.\u201d\u00a0<\/p>\n<p>Anthropic <a href=\"https:\/\/privacy.claude.com\/en\/articles\/10023555-how-do-you-use-personal-data-in-model-training\">describes<\/a> how it uses personal data in model training, but it does not have a clear way for people to request its removal. The company did not respond to a request for comment.<\/p>\n<p>The best option for anyone who wants to protect their private data right now is to \u201cstart upstream: get personal data off the public web before it ends up in the next scrape,\u201d says Shavell. Since the start of the year, for instance, California has offered its residents a <a href=\"https:\/\/privacy.ca.gov\/drop\/\">web portal<\/a> to request that data brokers delete their information. 
Still, this doesn\u2019t guarantee that your data hasn\u2019t <em>already<\/em> been used for training\u2014or that it won\u2019t appear in a chatbot\u2019s response.\u00a0<\/p>\n<p>The Redditor who received incessant calls posted that he had \u201csubmitted an official Legal Removal\/Privacy Request to Google, asking them to urgently blacklist my number from their LLM outputs,\u201d but had not yet received a response. He also wrote last month that \u201cthe harassment continues daily.\u201d\u00a0<\/p>\n<p>Abraham, the Israeli software developer, says he contacted Google\u2019s customer service on March 17, the day after his phone number was exposed. He says he did not receive a response until May 4, and that it simply asked for documentation he had already provided.\u00a0<\/p>\n<p>Meanwhile, inspired by her own exposure on Gemini, Eiger, along with Gilbert and Gueorguieva, is designing a research project to further study what personal information is being surfaced by various AI chatbots\u2014and what they may know, even if they\u2019re not telling us.\u00a0<\/p>\n<p>Some of that information may \u201ctechnically be public,\u201d says Gilbert, but chatbots may be altering \u201cthe amount of effort you would put into finding\u201d it. 
Now instead of searching through 10 pages of Google search results, or paying for the information from a data broker site, \u201cdoes generative AI just lower the barrier to entry to target people?\u201d\u00a0<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>People report that their personal contact info was surfaced by  [&#8230;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"content-type":"","footnotes":""},"categories":[226],"tags":[],"class_list":["post-22840","post","type-post","status-publish","format-standard","hentry","category-technology"],"acf":[],"_links":{"self":[{"href":"https:\/\/ideainthebox.com\/index.php\/wp-json\/wp\/v2\/posts\/22840","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ideainthebox.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ideainthebox.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ideainthebox.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/ideainthebox.com\/index.php\/wp-json\/wp\/v2\/comments?post=22840"}],"version-history":[{"count":0,"href":"https:\/\/ideainthebox.com\/index.php\/wp-json\/wp\/v2\/posts\/22840\/revisions"}],"wp:attachment":[{"href":"https:\/\/ideainthebox.com\/index.php\/wp-json\/wp\/v2\/media?parent=22840"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ideainthebox.com\/index.php\/wp-json\/wp\/v2\/categories?post=22840"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ideainthebox.com\/index.php\/wp-json\/wp\/v2\/tags?post=22840"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}