# User Agent based blocker middleware for Hono

[![codecov](https://codecov.io/github/honojs/middleware/graph/badge.svg?flag=ua-blocker)](https://codecov.io/github/honojs/middleware)

The UA blocker middleware for [Hono](https://honojs.dev) applications.
It blocks requests based on their User-Agent header and can serve a robots.txt file that asks crawlers to stay away.
The package also exports lists of known AI bots, so you can block them without maintaining your own list.

## Usage
### UA Blocker Middleware
Block requests based on a list of forbidden user-agents:
```ts
import { uaBlocker } from '@hono/ua-blocker'
import { Hono } from 'hono'

const app = new Hono()

app.use(
  '*',
  uaBlocker({
    // Add your custom blocklist here
    // Can be either a list of User Agents or a RegExp directly
    blocklist: ['ForbiddenBot', 'Not You'], // or /(FORBIDDENBOT|NOT YOU)/
  })
)
app.get('/', (c) => c.text('Hello World'))

export default app
```
### Block all known AI bots
We export a ready-to-use list of AI bots sourced from [ai.robots.txt](https://github.com/ai-robots-txt/ai.robots.txt):
```ts
import { uaBlocker } from '@hono/ua-blocker'
import { aiBots } from '@hono/ua-blocker/ai-bots'
import { Hono } from 'hono'

const app = new Hono()

app.use(
  '*',
  uaBlocker({
    blocklist: aiBots,
  })
)
app.get('/', (c) => c.text('Hello World'))

export default app
```
### Block only non-respecting bots
Allow bots that respect robots.txt and only block known non-respecting ones:
```ts
import { uaBlocker } from '@hono/ua-blocker'
import { nonRespectingAiBots, useAiRobotsTxt } from '@hono/ua-blocker/ai-bots'
import { Hono } from 'hono'

const app = new Hono()

app.use(
  '*',
  uaBlocker({
    blocklist: nonRespectingAiBots,
  })
)
// Serve robots.txt for the bots that do respect it
app.use('/robots.txt', useAiRobotsTxt())
app.get('/', (c) => c.text('Hello World'))

export default app
```
### Serve a ready-made AI bots robots.txt
Serve a robots.txt file that disallows all known AI bots:
```ts
import { useAiRobotsTxt } from '@hono/ua-blocker/ai-bots'
import { Hono } from 'hono'

const app = new Hono()

// Serve robots.txt at /robots.txt
app.use('/robots.txt', useAiRobotsTxt())
app.get('/', (c) => c.text('Hello World'))

export default app
```
### Extend the robots.txt content
Import the robots.txt content directly so you can extend it with your own rules:
```ts
import { AI_ROBOTS_TXT } from '@hono/ua-blocker/ai-bots'
import { Hono } from 'hono'

console.log(AI_ROBOTS_TXT)
// Output:
// User-agent: GPTBot
// User-agent: ChatGPT-User
// User-agent: Bytespider
// User-agent: CCBot
// ...
// Disallow: /

const app = new Hono()

app.use('/robots.txt', (c) => {
  // Append your own rules after the AI bots group
  const robotsTxt = AI_ROBOTS_TXT + '\nUser-agent: GoogleBot\nAllow: /'
  return c.text(robotsTxt, 200)
  // Output:
  // User-agent: GPTBot
  // User-agent: ChatGPT-User
  // User-agent: Bytespider
  // User-agent: CCBot
  // ...
  // Disallow: /
  // User-agent: GoogleBot
  // Allow: /
})
```
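If you append more than one group of rules, string concatenation gets hard to read. The sketch below is one possible way to structure it. `RobotsGroup` and `appendRobotsGroups` are hypothetical names that are not exported by `@hono/ua-blocker`, and the base string is a shortened stand-in for `AI_ROBOTS_TXT`.

```ts
// Hypothetical helper, NOT part of @hono/ua-blocker.
interface RobotsGroup {
  userAgents: string[]
  rules: string[] // e.g. 'Allow: /' or 'Disallow: /private'
}

// Appends each group to the base content, separated by a blank line.
function appendRobotsGroups(base: string, groups: RobotsGroup[]): string {
  const rendered = groups.map((group) =>
    [...group.userAgents.map((ua) => `User-agent: ${ua}`), ...group.rules].join('\n')
  )
  return [base.trimEnd(), ...rendered].join('\n\n') + '\n'
}

// Stand-in for AI_ROBOTS_TXT so the sketch runs without the package installed.
const base = 'User-agent: GPTBot\nDisallow: /'
const extended = appendRobotsGroups(base, [{ userAgents: ['GoogleBot'], rules: ['Allow: /'] }])
console.log(extended)
```

In a route handler you would pass the real constant instead of the stand-in, for example `c.text(appendRobotsGroups(AI_ROBOTS_TXT, groups))`.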
## API
### `@hono/ua-blocker`
#### `uaBlocker(options)`
Middleware that blocks requests based on their User-Agent header.
**Parameters:**
- `options.blocklist` (`string[] | RegExp`, default: `[]`) - The list of user-agents to block.
  If a RegExp is passed, it must be written to match UPPERCASE User-Agent strings, for example `/(FORBIDDENBOT|NOT YOU)/`.
**Returns:** Hono middleware function
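To illustrate why a RegExp blocklist needs to be uppercase, here is a minimal sketch of the kind of matching described above. It assumes the User-Agent is uppercased before it is compared. `isBlocked` and `escapeRegExp` are hypothetical helpers written for this example; the package's actual implementation may differ.

```ts
// Hypothetical sketch, NOT the package's implementation.
type Blocklist = string[] | RegExp

// Escape regex metacharacters so list entries are matched literally.
function escapeRegExp(value: string): string {
  return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
}

function isBlocked(userAgent: string | undefined, blocklist: Blocklist): boolean {
  if (!userAgent) return false
  if (Array.isArray(blocklist) && blocklist.length === 0) return false

  // String entries are uppercased here; a RegExp is used as given.
  const pattern =
    blocklist instanceof RegExp
      ? blocklist
      : new RegExp(blocklist.map((ua) => escapeRegExp(ua.toUpperCase())).join('|'))

  // Assumption: the incoming User-Agent is uppercased before matching.
  return userAgent.toUpperCase().search(pattern) !== -1
}

const ua = 'Mozilla/5.0 (compatible; ForbiddenBot/1.0)'
console.log(isBlocked(ua, ['ForbiddenBot', 'Not You'])) // true
console.log(isBlocked(ua, /(FORBIDDENBOT|NOT YOU)/)) // true
console.log(isBlocked(ua, /forbiddenbot/)) // false: lowercase pattern never matches
```

Under this assumption, a lowercase pattern without the `i` flag silently matches nothing, which is the main pitfall when passing a RegExp.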
### `@hono/ua-blocker/ai-bots`
#### `aiBots`
Pre-made list of AI bot user-agents sourced from [ai.robots.txt](https://github.com/ai-robots-txt/ai.robots.txt),
ready to be passed to `uaBlocker()`.
#### `nonRespectingAiBots`
Subset of the [`aiBots`](#aibots) list containing only the bots known not to respect `robots.txt` directives. Blocking this list lets through the bots that do respect `robots.txt`.
#### `AI_ROBOTS_TXT`
robots.txt content that disallows all known AI bots.
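The content has the shape shown in the usage example: one `User-agent` line per bot, followed by a single `Disallow: /`. As a rough sketch, content of that shape could be built from a list of user-agents like this. `buildRobotsTxt` is a hypothetical helper for illustration; the package exports the finished content as `AI_ROBOTS_TXT`.

```ts
// Hypothetical helper, NOT exported by @hono/ua-blocker.
// Builds a single robots.txt group that disallows every listed user-agent.
function buildRobotsTxt(userAgents: string[]): string {
  if (userAgents.length === 0) return ''
  const lines = userAgents.map((ua) => `User-agent: ${ua}`)
  return [...lines, 'Disallow: /'].join('\n')
}

const sample = buildRobotsTxt(['GPTBot', 'CCBot'])
console.log(sample)
// User-agent: GPTBot
// User-agent: CCBot
// Disallow: /
```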
#### `useAiRobotsTxt()`
Middleware that serves the generated robots.txt content for known AI bots.
**Returns:** Hono middleware function
## Author
finxol <https://github.com/finxol>
## License
MIT