# robots.txt for SavvyScratch
# This file controls which pages search engines and AI bots can crawl

# Allow all bots to crawl the entire site by default
User-agent: *
Allow: /

# Disallow crawling of admin, API, or private areas (add these if applicable)
# Disallow: /api/
# Disallow: /admin/
# Disallow: /_next/static/
# (Caution: blocking static assets can stop search engines from rendering pages.)

# Reference to your XML sitemap
Sitemap: https://www.savvyscratch.com/sitemap.xml

# ===================================
# AI-Specific Bot Configuration
# ===================================

# OpenAI GPTBot - Used by ChatGPT
User-agent: GPTBot
Allow: /
# If you want to disallow specific sections from GPTBot, uncomment:
# Disallow: /private/

# ChatGPT-User - For ChatGPT browsing
User-agent: ChatGPT-User
Allow: /

# Google-Extended - Token that controls use of content by Google's AI products (Gemini, formerly Bard)
User-agent: Google-Extended
Allow: /

# Anthropic Claude - Claude AI crawler
User-agent: anthropic-ai
Allow: /

# Common Crawl - Used by many AI training datasets
User-agent: CCBot
Allow: /

# Meta AI (Facebook/Meta)
User-agent: FacebookBot
Allow: /

# Bing AI / Copilot uses Bingbot; see the single Bingbot group under Search Engine Bots

# Perplexity AI
User-agent: PerplexityBot
Allow: /

# ===================================
# Search Engine Bots
# ===================================

# Google
User-agent: Googlebot
Allow: /

# Google Image Search
User-agent: Googlebot-Image
Allow: /

# Bing (search, and Bing AI / Copilot)
User-agent: Bingbot
Allow: /

# Yahoo (now uses Bing's crawler)
User-agent: Slurp
Allow: /

# DuckDuckGo
User-agent: DuckDuckBot
Allow: /

# Yandex (Russian search engine)
User-agent: Yandex
Allow: /

# Baidu (Chinese search engine) - Important if you have Chinese users
User-agent: Baiduspider
Allow: /

# ===================================
# Social Media Crawlers
# ===================================

# Twitter
User-agent: Twitterbot
Allow: /

# Facebook
User-agent: facebookexternalhit
Allow: /

# LinkedIn
User-agent: LinkedInBot
Allow: /

# Pinterest
User-agent: Pinterestbot
Allow: /

# ===================================
# Notes and Best Practices
# ===================================
#
# To BLOCK AI bots from crawling your site (if you change your mind):
# User-agent: GPTBot
# Disallow: /
#
# To block specific directories:
# User-agent: *
# Disallow: /admin/
# Disallow: /private/
# Note: a bot with its own group above follows only that group and ignores
# the * group, so repeat these Disallow lines in each named group as well.
#
# To slow down aggressive crawlers:
# User-agent: *
# Crawl-delay: 10
# Note: not every crawler honors Crawl-delay; Googlebot ignores it.
#
# Remember: robots.txt is advisory, not enforced.
# Malicious bots may ignore it.
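#
# Example (a sketch, left commented out so it has no effect):
# Several User-agent lines may share one group of rules, which avoids
# repeating the same Disallow lines for each bot. The bots listed and the
# /private/ path are placeholders chosen for illustration; adjust both to
# match this site before uncommenting. A bot should appear in only one
# group, so remove its separate group above if you enable this one.
#
# User-agent: GPTBot
# User-agent: CCBot
# User-agent: PerplexityBot
# Disallow: /private/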