# Dancing Noodle Winnipeg - robots.txt
# https://dancingnoodle.ca

# ── General directives for all crawlers ──────────────────────────────────────
User-agent: *
Allow: /

# llms.txt is publicly readable by AI agents but should NOT be indexed
# as a search result. The noindex directive is applied via X-Robots-Tag
# in .htaccess (Apache) or equivalent server config.
Allow: /llms.txt

# Keep crawlers out of legal pages (still accessible to users via footer links).
# Note: Disallow blocks crawling, not indexing; a disallowed URL can still be
# indexed if other pages link to it.
Disallow: /privacy-policy.html
Disallow: /terms-of-use.html

# Block URL parameters that don't produce unique content
Disallow: /*?*author=*
Disallow: /*?*tag=*
Disallow: /*?*month=*
Disallow: /*?*view=*
Disallow: /*?*format=*

# ── AI crawlers: full access including llms.txt ──────────────────────────────
# A crawler that matches one of these named groups follows only that group,
# so the Disallow rules under "User-agent: *" do not apply to it.
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: CCBot
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: Claude-Web
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: cohere-ai
Allow: /

User-agent: FacebookBot
Allow: /

User-agent: PerplexityBot
Allow: /

# ── Google Ads bots ───────────────────────────────────────────────────────────
User-agent: AdsBot-Google
Allow: /

User-agent: AdsBot-Google-Mobile
Allow: /

# ── Slow down aggressive crawlers ────────────────────────────────────────────
# Crawl-delay is a non-standard directive; not every crawler honors it.
User-agent: Baiduspider
Crawl-delay: 10

User-agent: SemrushBot
Crawl-delay: 5

User-agent: AhrefsBot
Crawl-delay: 5

# ── Sitemap ───────────────────────────────────────────────────────────────────
Sitemap: https://dancingnoodle.ca/sitemap.xml
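
# ── Example: noindex header for llms.txt (illustrative only) ─────────────────
# A minimal sketch of the X-Robots-Tag setup mentioned in the llms.txt note,
# assuming Apache with mod_headers enabled. This is an assumption about the
# server config, not a copy of it, and it belongs in .htaccess, not in this
# file. It is kept here as comments so crawlers ignore it:
#
#   <IfModule mod_headers.c>
#     <Files "llms.txt">
#       Header set X-Robots-Tag "noindex"
#     </Files>
#   </IfModule>
#
# llms.txt must stay crawlable (Allow: /llms.txt) for this to work, because a
# crawler only sees the header when it is permitted to fetch the file.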