OpenAI has given websites a way to allow or block their content in ChatGPT’s new search engine. Here’s how publishers can appear in ChatGPT’s search results while stopping OpenAI from using their content to train its AI models.
On October 31, 2024, OpenAI announced that its AI search engine prototype, SearchGPT, was being rebranded and integrated into its flagship product, ChatGPT. Via ChatGPT, the new search engine provides results similar to those of Google and Bing.
If ChatGPT’s new search engine becomes popular, publishers will want to appear in its search results, especially if there’s a chance it will drive referral traffic. However, many publishers have blocked all OpenAI user-agents from crawling their sites to keep OpenAI from training its large language models (LLMs) on their content. Fortunately, like Google and Apple, OpenAI has provided a way for publishers to be included in ChatGPT’s search results while still blocking the use of their content for AI model training.
One of OpenAI’s user-agents is called OAI-SearchBot. OpenAI uses OAI-SearchBot to find and link to sites in ChatGPT search results and explicitly states that it is not used to crawl content to train OpenAI’s generative AI foundation models.
So, as long as sites don’t disallow OAI-SearchBot in their robots exclusion file, their content will be eligible to appear in ChatGPT search results. They must also ensure they aren’t blocking the IPs used by OAI-SearchBot.
Publishers who want to appear in ChatGPT, Google, and Bing search results but still want to block those companies and other AI companies from using their content for LLM training can use the following robots exclusion list in their robots.txt file.
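A minimal sketch of such a list can be embedded in a short Python check that uses the standard library's urllib.robotparser to confirm that search crawlers stay allowed while training-related crawlers are blocked. The selection of tokens here (GPTBot, Google-Extended, Applebot-Extended, ClaudeBot, CCBot) is an assumption and is not exhaustive, so each vendor's documentation should be checked for its current tokens. The is_allowed helper and the example.com URL are hypothetical names used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt content (an assumption, not an exhaustive list).
# The first group blocks tokens that vendors document as training-related.
# Every other crawler, including OAI-SearchBot, Googlebot, and Bingbot,
# falls through to the wildcard group and stays allowed.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: Google-Extended
User-agent: Applebot-Extended
User-agent: ClaudeBot
User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str,
               url: str = "https://example.com/article") -> bool:
    """Return True if user_agent may fetch url under the given rules.

    Note: urllib.robotparser matches user-agent tokens by substring, which
    only approximates how real crawlers interpret robots.txt.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    agents = ("OAI-SearchBot", "Googlebot", "Bingbot",
              "GPTBot", "Google-Extended", "CCBot")
    for agent in agents:
        status = "allowed" if is_allowed(ROBOTS_TXT, agent) else "blocked"
        print(f"{agent}: {status}")
```

Only the contents of the ROBOTS_TXT string would go into a site's robots.txt file; the rest of the script is a local sanity check. It does not cover the separate requirement that OAI-SearchBot's IPs not be blocked at the firewall or CDN level.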
Jon Henshaw
Jon Henshaw is the founder of Coywolf and an industry veteran with almost three decades of SEO, digital marketing, and web technologies experience. Follow @jon@henshaw.social