# ── Standard crawlers ────────────────────────────────────────────────────────
User-agent: *
Allow: /
Disallow: /innerdocs/
Disallow: /chatbot-knowledge/
Disallow: /logs/
Disallow: /PhpFunctions/
Disallow: /oldlanding/
Disallow: /extrafiles/
Disallow: /azure/
Disallow: /backupsite/

# ── AI / LLM crawlers: explicitly allowed ────────────────────────────────────
# Note: a crawler that matches one of the named groups below follows only that
# group, so the Disallow rules in the * group above do not apply to it.

# OpenAI (model training)
User-agent: GPTBot
Allow: /

# OpenAI (ChatGPT on-demand fetch when answering users)
User-agent: ChatGPT-User
Allow: /

# OpenAI (ChatGPT search index)
User-agent: OAI-SearchBot
Allow: /

# Google AI (Gemini training and grounding)
User-agent: Google-Extended
Allow: /

# Perplexity AI
User-agent: PerplexityBot
Allow: /

# Anthropic / Claude
User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: Claude-Web
Allow: /

# Apple AI
User-agent: Applebot
Allow: /

User-agent: Applebot-Extended
Allow: /

# Meta AI
User-agent: FacebookBot
Allow: /

User-agent: meta-externalagent
Allow: /

# Common Crawl (used by many LLM training sets)
User-agent: CCBot
Allow: /

# Cohere
User-agent: cohere-ai
Allow: /

# ByteDance / Doubao
User-agent: Bytespider
Allow: /

# Mistral
User-agent: MistralAI-User
Allow: /

# DuckDuckGo AI
User-agent: DuckAssistBot
Allow: /

# Amazon
User-agent: Amazonbot
Allow: /

Sitemap: https://www.asystir.com/sitemap.xml