KCITService – AI Site Beacon

Description

KCITService – AI Site Beacon generates the four standard AI-readable files (llms.txt, llms-full.txt, llms-index.json, llms-sitemap.xml) plus an optional /ai.txt for training permissions, and integrates with your /robots.txt.

How this plugin is different

  • Multilingual UI out of the box — 8 bundled language packs (English, Simplified Chinese, Traditional Chinese, Japanese, Korean, Spanish, French, German). No separate language pack downloads, no machine-translated strings — every UI label, button, help text, and FAQ entry was hand-translated.
  • CJK-aware processing — the word counter and summary builder correctly handle Chinese, Japanese, and Korean characters. Most llms.txt generators undercount CJK content (counting an entire 1,000-character Chinese article as “1 word”) or produce mangled URL-encoded TOC anchors. We don’t.
  • All three signal files in one plugin — llms.txt for content discovery, robots.txt integration for crawl signals, and ai.txt (Spawning.ai format) for training permissions. One admin area, consistent UX, single generation pipeline.
  • Per-bot training controls — the ai.txt tab ships with a curated list of 13 known AI user-agents (OpenAI’s GPTBot/ChatGPT-User/OAI-SearchBot, Anthropic’s ClaudeBot/anthropic-ai, Google-Extended, Common Crawl’s CCBot, PerplexityBot, ByteDance’s Bytespider, Applebot-Extended, Amazonbot, Meta-ExternalAgent, cohere-ai) with default/allow/disallow radios for each.
  • Settings version history with rollback — every settings change is snapshotted (last 20 versions kept), and one click rolls back to any prior state. The rollback itself is reversible.
  • Privacy-first — the free version makes zero external API calls. Everything is processed locally inside WordPress. Customer/order/account data from WooCommerce is never read.
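
The CJK-aware counting described above can be sketched roughly as follows. This is a minimal illustration in Python, not the plugin's code: the function name `count_words` and the chosen Unicode ranges are assumptions, and treating every Hangul syllable as one word is a simplification.

```python
import re

# Assumed ranges (not taken from the plugin): Hiragana/Katakana,
# CJK Extension A, CJK Unified Ideographs, Hangul syllables.
CJK_CHAR = re.compile("[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uac00-\ud7af]")


def count_words(text: str) -> int:
    """Count each CJK character as one word, plus every other word token."""
    cjk_count = len(CJK_CHAR.findall(text))
    # Replace CJK characters with spaces so they do not merge with Latin tokens.
    remainder = CJK_CHAR.sub(" ", text)
    other_count = len(re.findall(r"\w+", remainder))
    return cjk_count + other_count
```

For comparison, a naive `len(text.split())` reports 1 for the five-character phrase 我們的團隊, while this sketch reports 5.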

Generated files

  • /llms.txt — concise Markdown guide and index
  • /llms-full.txt — fuller Markdown content package
  • /llms-index.json — structured machine-readable index
  • /llms-sitemap.xml — XML list of AI-readable resources
  • /ai.txt — optional Spawning.ai-style training permissions file
  • /robots.txt — integrated via WordPress’s robots_txt filter (does not write a physical file, plays well with SEO plugins)
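
To give a feel for what a concise /llms.txt looks like, here is a small Python sketch that assembles one in the layout suggested by the public llms.txt proposal (an H1 title, a blockquote summary, then H2 sections of links). The function name `build_llms_txt` and its arguments are hypothetical, and the plugin's real output may differ in detail.

```python
def build_llms_txt(site_name, summary, sections):
    """Assemble a Markdown index: title, summary, then one link list per section.

    `sections` maps a heading to a list of (title, url, note) tuples.
    """
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, items in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in items:
            entry = f"- [{title}]({url})"
            if note:
                entry += f": {note}"
            lines.append(entry)
        lines.append("")
    # End with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"
```

Calling it with one "Pages" section containing an "About" link produces a file that starts with `# Example Site`, followed by the quoted summary and a `## Pages` list.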

Standard features

  • Clean Markdown conversion of your public WordPress content
  • Two deployment modes: static files written to your site root, or virtual endpoints if root is not writable
  • Per-post include / exclude control via a sidebar meta box
  • Category, tag, and URL pattern exclusions
  • Manual or scheduled regeneration via WP-Cron (daily, weekly, monthly)
  • Live preview of the generated files before publishing
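
The per-post, category, tag, and URL pattern exclusions listed above could combine along these lines. This is only a sketch: the dictionary keys (`override`, `categories`, `tags`, `url`) and the glob-style pattern syntax are assumptions made for illustration, not the plugin's documented behavior.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def is_excluded(post, excluded_categories, excluded_tags, url_patterns):
    """Return True if the post should be left out of the generated files."""
    # A per-post override (hypothetical field) wins over every other rule.
    override = post.get("override")
    if override == "include":
        return False
    if override == "exclude":
        return True
    if set(post.get("categories", [])) & set(excluded_categories):
        return True
    if set(post.get("tags", [])) & set(excluded_tags):
        return True
    # Match glob-style patterns against the URL path only.
    path = urlparse(post["url"]).path
    return any(fnmatchcase(path, pattern) for pattern in url_patterns)
```

In this sketch a pattern such as `/private/*` excludes every URL whose path starts with `/private/`, unless the post is explicitly forced in.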

What this plugin does NOT claim

This plugin does not claim that ChatGPT, Claude, Gemini, Perplexity, or any specific AI system will automatically crawl, index, train on, or use the generated files. The llms.txt and ai.txt conventions are emerging standards and adoption by individual AI providers is voluntary. Treat this plugin as making your signals available, not as a guarantee of how any specific AI service will behave.

Installation

  1. Upload the kcitservice-ai-site-beacon folder to /wp-content/plugins/ or install through the WordPress plugins screen.
  2. Activate the plugin through the Plugins screen.
  3. Go to AI Site Beacon in the admin menu (top level, megaphone icon) to configure and generate your files.

FAQ

Will this slow down my site?

No. Files are generated on demand or on a schedule, and served as static files when possible.

Do I need to keep the plugin active for the files to work?

If files are deployed as static files in your site root, they remain accessible even if the plugin is deactivated. In virtual endpoint mode, the plugin must remain active to serve the URLs.
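
The choice between the two modes comes down to whether the site root can be written to. A minimal Python sketch of that decision, with a hypothetical function name and mode labels:

```python
import os


def choose_deploy_mode(site_root: str) -> str:
    """Pick "static" when the site root is a writable directory, else "virtual"."""
    if os.path.isdir(site_root) and os.access(site_root, os.W_OK):
        return "static"
    return "virtual"
```

Files written in the "static" case are served directly by the web server, which is why they stay reachable after the plugin is deactivated; the "virtual" case relies on the plugin answering the request.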

Does this work with multisite?

The MVP targets single-site WordPress. Multisite is not officially supported in 1.0 but should not break.

Will it expose private content?

No. Drafts, private posts, password-protected posts, trashed items, and revisions are excluded by default. WooCommerce cart, checkout, account, and order data are also excluded.
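
As an illustration of the default rule, the check below keeps only published, non-password-protected items and skips revisions. The field names mirror common WordPress post properties but are assumptions here, and the function name is hypothetical.

```python
def is_publicly_listable(post: dict) -> bool:
    """True only for published, unprotected, non-revision content."""
    return (
        post.get("status") == "publish"      # rejects draft, private, trash
        and not post.get("password")         # rejects password-protected posts
        and post.get("type") != "revision"   # rejects stored revisions
    )
```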

Contributors & Developers

“KCITService – AI Site Beacon” is open source software.

Changelog

1.0.7

  • Fixed: llms.txt and llms-full.txt now start with a UTF-8 byte-order mark (BOM) so browsers viewing the file directly render CJK and other non-ASCII characters correctly even when Apache serves the static file without an explicit charset=utf-8 Content-Type header. JSON and XML outputs are deliberately left BOM-free per their respective specs.
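
The BOM rule can be pictured with a short Python sketch. The function name `encode_output` is hypothetical; only the two file names and the "no BOM for JSON and XML" rule come from the changelog entry above.

```python
import codecs

# Only these two outputs get a BOM, per the 1.0.7 changelog entry.
BOM_FILES = {"llms.txt", "llms-full.txt"}


def encode_output(filename: str, text: str) -> bytes:
    """Encode a generated file as UTF-8, prefixing a BOM only for the Markdown files."""
    body = text.encode("utf-8")
    if filename in BOM_FILES:
        return codecs.BOM_UTF8 + body
    return body
```

The three BOM bytes (EF BB BF) let a browser detect UTF-8 even when the server sends no charset, while JSON and XML parsers receive the plain bytes they expect.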

1.0.6

  • Fixed: HTML numeric character references in post titles and excerpts (&#8211; for en-dash, &#038; for ampersand, etc.) are now decoded to their actual characters before being written into llms.txt, llms-full.txt, and llms-index.json. Previously some WordPress-stored titles leaked raw &#NNNN; markup into the generated files.
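
The decoding step is easy to show in Python, where the standard library's `html.unescape` handles numeric character references. The wrapper name `decode_title` is hypothetical and this is not the plugin's own code.

```python
import html


def decode_title(raw: str) -> str:
    """Turn HTML character references such as &#8211; into real characters."""
    return html.unescape(raw)
```

A stored title like `Tips &#8211; Tricks &#038; More` comes out with a real en-dash and a real ampersand.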

1.0.5

  • Fixed: text domain changed from ai-site-beacon to kcitservice-ai-site-beacon to match the plugin slug (WordPress.org Plugin Directory requirement). All 8 language packs renamed accordingly.
  • Fixed: “Tested up to” header bumped to WordPress 6.9.
  • Fixed: Plugin URI updated to a working URL.
  • Improved: readme description rewritten to lead with what makes this plugin different from the 150+ other plugins in this category.
  • No functional changes — settings, version history, and generated files carry over.

1.0.4

  • Renamed plugin to “KCITService – AI Site Beacon” and changed the folder slug to kcitservice-ai-site-beacon to comply with WordPress.org Plugin Directory naming guidelines.
  • Updated author attribution to KC IT Service with the official kc-itservice.com domain.
  • No functional changes; existing settings, version history, generated files, and language preferences are preserved automatically because internal option keys, hooks, and the text domain remain unchanged.

1.0.3

  • Fixed: Table-of-contents anchors in llms-full.txt no longer URL-encode CJK characters (%e6%88%91... is now 我們的團隊).
  • Fixed: Item summaries strip Markdown syntax (headers, lists, bold/italic, links) so excerpts read as clean prose instead of leaking # ## markers.
  • Fixed: Posts/pages with empty titles now fall back to the URL slug or the post-type label, preventing empty [](#) entries.
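
One way to build anchors that keep CJK characters readable, and that never end up empty, is sketched below in Python. The function name `heading_anchor` and the exact normalization rules are assumptions; the fallback argument stands in for the URL slug or post-type label mentioned above.

```python
import re


def heading_anchor(title: str, fallback: str = "untitled") -> str:
    """Build a TOC anchor that keeps Unicode letters instead of percent-encoding them."""
    text = title.strip().lower()
    # Drop punctuation but keep Unicode letters, digits, spaces, and hyphens.
    text = re.sub(r"[^\w\s-]", "", text)
    # Collapse runs of whitespace or underscores into single hyphens.
    text = re.sub(r"[\s_]+", "-", text).strip("-")
    # An empty result falls back to the supplied slug or label.
    return text or fallback
```

With this approach 我們的團隊 stays 我們的團隊, "Our Team!" becomes `our-team`, and an empty title yields the fallback.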

1.0.2

  • Added: /ai.txt generation — Spawning.ai-style AI training permissions file.
  • Added: ai.txt admin tab with per-bot allow/disallow controls for OpenAI (GPTBot, ChatGPT-User, OAI-SearchBot), Anthropic (ClaudeBot, anthropic-ai), Google (Google-Extended), Common Crawl (CCBot), Perplexity (PerplexityBot), ByteDance (Bytespider), Apple (Applebot-Extended), Amazon (Amazonbot), Meta (Meta-ExternalAgent), and Cohere (cohere-ai).
  • Added: default policy + custom additions textarea for ai.txt.
  • Added: ai.txt deploys via the same static-or-virtual mechanism as the other generated files.
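
The sketch below shows how a default policy, per-bot choices, and custom lines might be combined into one file. It assumes a robots.txt-like `User-Agent` / `Allow` / `Disallow` syntax, which is a guess for illustration only: the exact ai.txt syntax the plugin emits is not described here, and `render_ai_txt` is a hypothetical name.

```python
def render_ai_txt(default_policy, bot_policies, custom_lines=()):
    """Render a permissions file from a default policy and per-bot overrides.

    Policies are "allow" or "disallow"; a bot set to "default" gets no group
    of its own and so inherits the wildcard group.
    """
    def group(agent, policy):
        # Assumed robots.txt-like syntax, not a confirmed ai.txt format.
        rule = "Allow: /" if policy == "allow" else "Disallow: /"
        return [f"User-Agent: {agent}", rule, ""]

    lines = group("*", default_policy)
    for agent, policy in bot_policies.items():
        if policy == "default":
            continue
        lines += group(agent, policy)
    lines += list(custom_lines)
    return "\n".join(lines).rstrip() + "\n"
```

For example, a default of "allow" with GPTBot set to "disallow" and ClaudeBot left on "default" yields a wildcard group plus a single GPTBot group.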

1.0.1

  • Added: robots.txt integration with admin tab and customizable directives.
  • Added: 7 bundled language packs (zh_CN, zh_TW, ja, ko_KR, es_ES, fr_FR, de_DE).
  • Added: Versions tab — every settings change is snapshotted and rollback is one click.
  • Added: Help tab with quick-start, FAQ, and per-feature explanations.
  • Added: top-level admin menu with megaphone icon (was nested under Settings).
  • Added: version tracking and one-time rewrite-rules flush on upgrade.
  • Improved: word counter now correctly counts CJK (Chinese, Japanese, Korean) characters.
  • Improved: HTML-to-Markdown conversion no longer relies on DOM getElementById, which fails without a DTD ID declaration.
  • Fixed: activation flush_rewrite_rules() now runs after rules are registered.
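
The snapshot-and-rollback behavior of the Versions tab can be modeled with a bounded history. This Python sketch is an illustration under assumptions: the class name `SettingsHistory` and its methods are hypothetical, and the cap of 20 is the figure given in the plugin description.

```python
import copy
from collections import deque


class SettingsHistory:
    """Keep the last N prior settings states and allow reversible rollback."""

    def __init__(self, initial, max_versions=20):
        self.current = copy.deepcopy(initial)
        # The oldest snapshot is dropped automatically once the cap is reached.
        self.versions = deque(maxlen=max_versions)

    def update(self, new_settings):
        # Snapshot the state being replaced, then apply the change.
        self.versions.append(copy.deepcopy(self.current))
        self.current = copy.deepcopy(new_settings)

    def rollback(self, index):
        # Copy the target first, since update() may evict old snapshots.
        target = copy.deepcopy(self.versions[index])
        # Rolling back is itself an update, so it can be undone later.
        self.update(target)
```

Because a rollback records the state it replaces, calling `rollback(-1)` immediately afterwards restores the settings that were active before the rollback.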

1.0.0

  • Initial release.