User Agent based blocker middleware for Hono
UA blocker middleware for Hono applications. It blocks requests based on their User-Agent header and can generate a robots.txt file to discourage unwanted bots.
This package also exports AI bots lists, allowing you to easily block known AI bots.
Usage
UA Blocker Middleware
Block requests based on a list of forbidden user-agents:
import { uaBlocker } from '@hono/ua-blocker'
import { Hono } from 'hono'
const app = new Hono()
app.use(
'*',
uaBlocker({
// Add your custom blocklist here
// Can be either a list of User Agents or a RegExp directly
blocklist: ['ForbiddenBot', 'Not You'], // or /(FORBIDDENBOT|NOT YOU)/
})
)
app.get('/', (c) => c.text('Hello World'))
export default app
Block all known AI bots
We export a ready-to-use list of AI bots sourced from ai.robots.txt:
import { uaBlocker } from '@hono/ua-blocker'
import { aiBots } from '@hono/ua-blocker/ai-bots'
import { Hono } from 'hono'
const app = new Hono()
app.use(
'*',
uaBlocker({
blocklist: aiBots,
})
)
app.get('/', (c) => c.text('Hello World'))
export default app
Block only non-respecting bots
Allow bots that respect robots.txt and only block known non-respecting ones:
import { uaBlocker } from '@hono/ua-blocker'
import { nonRespectingAiBots, useAiRobotsTxt } from '@hono/ua-blocker/ai-bots'
import { Hono } from 'hono'
const app = new Hono()
app.use(
'*',
uaBlocker({
blocklist: nonRespectingAiBots,
})
)
// serve robots.txt
app.use('/robots.txt', useAiRobotsTxt())
app.get('/', (c) => c.text('Hello World'))
export default app
Serve a ready-made AI bots robots.txt
Serve a robots.txt file that disallows all known AI bots:
import { useAiRobotsTxt } from '@hono/ua-blocker/ai-bots'
import { Hono } from 'hono'
const app = new Hono()
// Serve robots.txt at /robots.txt
app.use('/robots.txt', useAiRobotsTxt())
app.get('/', (c) => c.text('Hello World'))
export default app
Extend the robots.txt content
Import the robots.txt content directly so you can extend it with your own rules:
import { AI_ROBOTS_TXT } from '@hono/ua-blocker/ai-bots'
import { Hono } from 'hono'
console.log(AI_ROBOTS_TXT)
// Output:
// User-agent: GPTBot
// User-agent: ChatGPT-User
// User-agent: Bytespider
// User-agent: CCBot
// ...
// Disallow: /
const app = new Hono()
app.use('/robots.txt', (c) => {
// Append your own rules after the generated AI bots group
const robotsTxt = AI_ROBOTS_TXT + '\nUser-agent: GoogleBot\nAllow: /'
return c.text(robotsTxt, 200)
// Output:
// User-agent: GPTBot
// User-agent: ChatGPT-User
// User-agent: Bytespider
// User-agent: CCBot
// ...
// Disallow: /
// User-agent: GoogleBot
// Allow: /
})
API
@hono/ua-blocker
uaBlocker(options)
Middleware that blocks requests based on their User-Agent header.
Parameters:
options.blocklist (string[] | RegExp, default: []) - The list of user agents to block. If a RegExp is passed, it must match the UPPERCASE form of the User-Agent header.
Returns: Hono middleware function
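As a rough illustration of the uppercase matching described above, the sketch below shows one way a string blocklist could be compiled into a single RegExp and tested against an uppercased User-Agent header. The helper names (escapeRegExp, toBlocklistRegExp, isBlocked) are hypothetical and not part of @hono/ua-blocker; the package's actual implementation may differ.

```typescript
// Hypothetical helper: escape regex metacharacters in a literal string.
function escapeRegExp(literal: string): string {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
}

// Hypothetical helper: turn a blocklist into one uppercase alternation.
// A RegExp is returned as-is, so it is expected to be written in uppercase.
function toBlocklistRegExp(blocklist: string[] | RegExp): RegExp | null {
  if (blocklist instanceof RegExp) return blocklist
  if (blocklist.length === 0) return null
  const alternatives = blocklist.map((ua) => escapeRegExp(ua.toUpperCase()))
  return new RegExp(`(${alternatives.join('|')})`)
}

// Hypothetical check: uppercase the header first, then test it.
function isBlocked(userAgent: string | undefined, blocklist: string[] | RegExp): boolean {
  const pattern = toBlocklistRegExp(blocklist)
  if (!pattern || !userAgent) return false
  return pattern.test(userAgent.toUpperCase())
}
```

Because the header is uppercased before matching, a list entry such as 'ForbiddenBot' would also match 'forbiddenbot/1.0' in this sketch, while a RegExp written in lowercase would never match.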
@hono/ua-blocker/ai-bots
aiBots
Pre-made list of AI bots user-agents sourced from ai.robots.txt, ready to be passed to uaBlocker().
nonRespectingAiBots
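Subset of the aiBots list containing only the bots known not to respect robots.txt directives. Passing it to uaBlocker() blocks those bots while allowing the ones that do respect robots.txt.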
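To make the relationship between the two lists concrete, here is a minimal sketch of how such a subset could be derived from per-bot metadata. Everything in it is an assumption for illustration: the BotEntry shape, the respect field, the "contains Yes" test, and the example bot names are hypothetical, and the package's real data and build step may look different.

```typescript
// Assumed metadata shape; only the `respect` field matters for this sketch.
interface BotEntry {
  respect: string
}

// Hypothetical helper: list every bot, plus the bots whose `respect`
// value does not contain "Yes" (the value may be wrapped in a markdown link).
function splitByRespect(bots: Record<string, BotEntry>): {
  all: string[]
  nonRespecting: string[]
} {
  const all = Object.keys(bots)
  const nonRespecting = Object.entries(bots)
    .filter(([, entry]) => !entry.respect.includes('Yes'))
    .map(([name]) => name)
  return { all, nonRespecting }
}

// Made-up sample data, not taken from any real list.
const sample: Record<string, BotEntry> = {
  ExampleBotA: { respect: 'Yes' },
  ExampleBotB: { respect: '[Yes](https://example.com/policy)' },
  ExampleBotC: { respect: 'No' },
  ExampleBotD: { respect: 'Unclear at this time.' },
}

const lists = splitByRespect(sample)
```

In this sketch, any bot whose policy is not an explicit "Yes" ends up in the non-respecting list, which errs on the side of blocking.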
AI_ROBOTS_TXT
robots.txt content that disallows all known AI bots.
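The format shown in the usage section (a run of User-agent lines followed by a single Disallow: / rule) can be reproduced with a few lines of code. The sketch below is only an illustration of that format; buildRobotsTxt is a hypothetical helper, not an export of this package.

```typescript
// Hypothetical helper: build one robots.txt group that disallows
// every listed user agent from the whole site.
function buildRobotsTxt(botNames: string[]): string {
  // A group without any User-agent line would be meaningless.
  if (botNames.length === 0) return ''
  const userAgentLines = botNames.map((name) => `User-agent: ${name}`)
  return [...userAgentLines, 'Disallow: /'].join('\n')
}

const robotsTxtExample = buildRobotsTxt(['GPTBot', 'CCBot'])
// robotsTxtExample is:
// User-agent: GPTBot
// User-agent: CCBot
// Disallow: /
```

Grouping all user agents above one shared rule keeps the file short compared with repeating Disallow: / for each bot.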
useAiRobotsTxt()
Middleware that serves the generated robots.txt content for known AI bots.
Returns: Hono middleware function
Author
finxol https://github.com/finxol
License
MIT