Rate Limiting
Upstash-style rate limiting with Convex-first storage and middleware-friendly DX.
In this guide, we'll set up better-convex/plugins/ratelimit with an Upstash-parity API, Convex-first storage, and a UI-friendly hook.
We'll keep it practical: a reusable server guard, middleware wiring, algorithm choices, and client-side button disabling with live countdowns.
Approach
We'll build this in four steps so you can ship quickly and keep things maintainable:
- Enable the schema plugin so internal storage tables exist.
- Add one reusable server guard for all protected mutations.
- Wire the guard into middleware so handlers stay focused.
- Add optional client-side UX with `useRateLimit()` for disabled states and retry timers.
We'll end with a full API reference so you can tune behavior without guesswork.
Why this package
better-convex/plugins/ratelimit is designed as a hard cutover from component-driven APIs.
| What you get | Why it matters |
|---|---|
| Upstash-style API | Easier migration if your team already knows Upstash (limit, check, getRemaining, resetUsedTokens) |
| Convex-first tables | No component registration and no component migration path to manage |
| Read dedupe helpers | Common repeated reads may reuse cached results and reduce duplicate DB fetches |
| React hook support | hookAPI() + useRateLimit() gives accurate client countdown and button states |
| Fail-closed default | Safer behavior under pressure (failureMode: "closed") |
Install
```sh
bun add better-convex
```

That's it. You do not install or register `@convex-dev/rate-limiter`.
Enable schema plugin (required)
aggregatePlugin and migrationPlugin are built into defineSchema.
Rate limiting is opt-in, so add ratelimitPlugin() where you define schema:
```ts
import { defineSchema } from 'better-convex/orm';
import { ratelimitPlugin } from 'better-convex/plugins/ratelimit';

export const tables = {
  // your tables...
};

export default defineSchema(tables, {
  plugins: [ratelimitPlugin()],
});
```

Create a reusable guard
We'll start with a shared guard function so all mutations enforce rate limits consistently.
```ts
import { MINUTE, Ratelimit } from 'better-convex/plugins/ratelimit';
import { CRPCError } from 'better-convex/server';
import type { MutationCtx } from '../functions/generated/server';
import type { SessionUser } from '../shared/auth-shared';

const fixed = (rate: number) => Ratelimit.fixedWindow(rate, MINUTE);

const rateLimitConfig = {
  'default:free': fixed(60),
  'default:premium': fixed(200),
  'default:public': fixed(30),
  'todo/create:free': fixed(20),
  'todo/create:premium': fixed(60),
} as const;

function getUserTier(
  user: { isAdmin?: boolean; plan?: SessionUser['plan'] } | null
): 'free' | 'premium' | 'public' {
  if (!user) return 'public';
  if (user.isAdmin || user.plan) return 'premium';
  return 'free';
}

export async function rateLimitGuard(
  ctx: MutationCtx & {
    rateLimitKey: string;
    user: Pick<SessionUser, 'id' | 'plan'> | null;
  }
) {
  const tier = getUserTier(ctx.user);
  const limitKey = `${ctx.rateLimitKey}:${tier}` as keyof typeof rateLimitConfig;
  const resolved = limitKey in rateLimitConfig ? limitKey : (`default:${tier}` as const);

  const limiter = new Ratelimit({
    db: ctx.db,
    prefix: `example:${resolved}`,
    limiter: rateLimitConfig[resolved],
    failureMode: 'closed',
    enableProtection: true,
    denyListThreshold: 30,
  });

  const status = await limiter.limit(ctx.user?.id ?? 'anonymous');
  if (!status.success) {
    throw new CRPCError({
      code: 'TOO_MANY_REQUESTS',
      message: 'Rate limit exceeded. Please try again later.',
    });
  }
}
```

Now you have one place to tune plan limits, prefixes, and protection defaults.
No-wrapper IP signals with Better Auth
If you already use Better Auth sessions, you can enrich rate-limit checks with session-derived network signals in normal query/mutation middleware, without wrapping every endpoint in httpAction.
```ts
import { getSessionNetworkSignals } from 'better-convex/auth';
import { Ratelimit } from 'better-convex/plugins/ratelimit';
import type { MutationCtx } from '../functions/generated/server';

export async function rateLimitGuard(
  ctx: MutationCtx & {
    rateLimitKey: string;
    user: { id: string; session?: { ipAddress?: string; userAgent?: string } } | null;
  }
) {
  const limiter = new Ratelimit({
    db: ctx.db,
    prefix: `example:${ctx.rateLimitKey}`,
    limiter: Ratelimit.fixedWindow(60, '1 m'),
  });

  const identifier = ctx.user?.id ?? 'anonymous';
  const signals = await getSessionNetworkSignals(ctx, ctx.user?.session ?? null);

  const status = await limiter.limit(identifier, signals);
  if (!status.success) throw new Error('Too many requests');
}
```

`getSessionNetworkSignals()` returns:

- `{}` when no session is available
- `{ ip }`, `{ userAgent }`, or both when present on the session
- trimmed values, with blank strings normalized away
Optional anonymous-session strategy
If your app has public flows and you still want session-based IP/user-agent keys, use Better Auth anonymous sessions and captcha-gate anonymous session creation.
```ts
// server auth config
import { anonymous } from 'better-auth/plugins';

plugins: [
  // ...other plugins
  anonymous(),
];
```

```ts
// client auth config
import { anonymousClient } from 'better-auth/client/plugins';

plugins: [
  // ...other plugins
  anonymousClient(),
];
```

Important: session IP/user-agent is captured by the auth/session lifecycle and is not guaranteed to be fresh per request. Treat it as trusted-ish app-layer identity, not strict network truth. For cert/audit-grade per-request source-IP controls, use HTTP/proxy logging and enforcement.
Queries vs mutations
- Use `check()` in queries (read-only, no token consumption).
- Use `limit()` in mutations/actions (consumes capacity).

```ts
// query
const preview = await limiter.check(identifier, signals);

// mutation/action
const enforced = await limiter.limit(identifier, signals);
```

Wire it into middleware
Next, apply the guard from cRPC middleware so your handlers stay focused on business logic.
```ts
const rateLimitMiddleware = c.middleware<
  MutationCtx & { user?: Pick<SessionUser, 'id' | 'plan' | 'session'> | null }
>(async ({ ctx, meta, next }) => {
  await rateLimitGuard({
    ...ctx,
    rateLimitKey: meta.rateLimit ?? 'default',
    user: ctx.user ?? null,
  });
  return next({ ctx });
});

export const authMutation = c.mutation
  .meta({ auth: 'required' })
  .use(authMiddleware)
  .use(rateLimitMiddleware);
```

Then set per-procedure keys with metadata:

```ts
export const createTodo = authMutation
  .meta({ rateLimit: 'todo/create' })
  .input(z.object({ title: z.string().min(1) }))
  .mutation(async ({ ctx, input }) => {
    // business logic
  });
```

Choose your algorithm
Start simple and pick based on workload shape.
Fixed window
Best when hard windows are acceptable. Tokens reset at the start of each window.
```ts
const limiter = new Ratelimit({
  db: ctx.db,
  prefix: 'post:create',
  limiter: Ratelimit.fixedWindow(10, '1 m'),
});
```

Sliding window
Best when you want smoother request shaping without hard resets. Weighs the previous window proportionally so you don't get bursts at window boundaries.
```ts
const limiter = new Ratelimit({
  db: ctx.db,
  prefix: 'search',
  limiter: Ratelimit.slidingWindow(50, '1 m'),
});
```

Token bucket
Best for burst-friendly throughput with long-term control. Tokens refill at a steady rate up to maxTokens. Use maxReserved to allow requests to "borrow" from future tokens when the bucket is empty.
```ts
const limiter = new Ratelimit({
  db: ctx.db,
  prefix: 'llm:tokens',
  limiter: Ratelimit.tokenBucket(1000, '1 m', 1000, { maxReserved: 3000 }),
});
```

Algorithm options
All three algorithm builders accept an optional options object as the last argument.
| Option | Type | Default | Description |
|---|---|---|---|
| shards | number | 1 | Number of shards for write distribution. Higher values reduce contention at the cost of less precise counts (see Sharding). |
| maxReserved | number | undefined | Maximum tokens a request can "borrow" from future capacity. Only applies to fixedWindow and tokenBucket. Not supported by slidingWindow. |
| capacity | number | limit | Maximum stored tokens. Only applies to fixedWindow. Useful when you want a higher burst capacity than the per-window refill. |
| start | number | 0 | Epoch offset (ms) for window alignment. Only applies to fixedWindow. Aligns windows to a custom origin instead of epoch zero. |
Duration formats
Every window or interval parameter accepts a Duration — either a raw millisecond number or a human-readable string.
String format: "<number> <unit>" or "<number><unit>". Both '1 m' and '1m' work.
| Unit | Meaning | Example |
|---|---|---|
| ms | milliseconds | '500 ms' |
| s | seconds | '30 s' |
| m | minutes | '1 m' |
| h | hours | '1 h' |
| d | days | '1 d' |
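To make the accepted shapes concrete, here is an illustrative normalizer for this Duration format — a sketch only, not the library's actual parser:

```ts
// Illustrative Duration normalizer (not better-convex internals).
// Accepts a raw millisecond number, '1 m', or '1m'.
const UNIT_MS: Record<string, number> = {
  ms: 1,
  s: 1_000,
  m: 60_000,
  h: 3_600_000,
  d: 86_400_000,
};

function toMs(duration: number | string): number {
  if (typeof duration === 'number') return duration;
  const match = /^(\d+)\s*(ms|s|m|h|d)$/.exec(duration.trim());
  if (!match) throw new Error(`invalid duration: ${duration}`);
  return Number(match[1]) * UNIT_MS[match[2]];
}
```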
You can also use the pre-defined constants from better-convex/plugins/ratelimit:
```ts
import { SECOND, MINUTE, HOUR, DAY, WEEK } from 'better-convex/plugins/ratelimit';

Ratelimit.fixedWindow(100, MINUTE); // 60_000 ms
Ratelimit.slidingWindow(50, 30 * SECOND); // 30_000 ms
Ratelimit.tokenBucket(10, HOUR, 100); // 3_600_000 ms
```

Done. You now have deterministic, application-layer limits with one API surface.
Add a client-side limiter UX
Server enforcement is mandatory. Client checks are for better UX — disabled buttons, countdowns, and retry hints.
Expose the hook API
First, export the hook API from a Convex file. The hookAPI() method returns a getRateLimit query and a getServerTime mutation that the React hook consumes.
```ts
import { Ratelimit } from 'better-convex/plugins/ratelimit';

const limiter = new Ratelimit({
  limiter: Ratelimit.fixedWindow(3, '30 s'),
});

export const { getRateLimit, getServerTime } = limiter.hookAPI({
  identifier: async (_ctx, fromClient) => fromClient ?? 'anonymous',
  sampleShards: 1,
});
```

The identifier option can be a static string, or an async callback that receives (ctx, fromClient). Use the callback to resolve the identifier server-side (e.g. from auth) while still accepting a client-provided fallback.
sampleShards controls how many shards to read when estimating the remaining count. Set it to 1 for low-cost reads, or increase it for more accurate estimates on high-shard configs.
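The precedence an identifier callback typically implements can be sketched as a plain helper (the helper name is ours; how you obtain the server-side user id depends on your auth setup):

```ts
// Sketch: prefer the authenticated user id resolved on the server,
// fall back to the client-supplied identifier, then a shared bucket.
function resolveIdentifier(
  serverUserId: string | null,
  fromClient: string | undefined
): string {
  return serverUserId ?? fromClient ?? 'anonymous';
}
```

Inside the callback this would look something like `identifier: async (ctx, fromClient) => resolveIdentifier(await getUserId(ctx), fromClient)`, where `getUserId` is a stand-in for your own auth helper.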
Use the React hook
Then wire it up in your component with useRateLimit:
```ts
import { useRateLimit } from 'better-convex/plugins/ratelimit/react';

const rateLimitRef = 'ratelimitDemo:getInteractiveRateLimit' as const;
const serverTimeRef = 'ratelimitDemo:getInteractiveServerTime' as const;

const { status, check } = useRateLimit(rateLimitRef, {
  identifier: sessionId,
  count: 1,
  getServerTimeMutation: serverTimeRef,
});

const blocked = status?.ok === false;
const retryAt = status?.retryAt;
```

useRateLimit accepts either:

- a Convex function path string (`'module:functionName'`) — this is what the `/ratelimit` demo uses.
- a generated `FunctionReference` from `api`.
The hook returns:
| Field | Type | Description |
|---|---|---|
| status | HookStatus \| undefined | undefined while loading. { ok: true } when allowed, { ok: false, retryAt: number } when blocked. Auto-updates when retryAt passes. |
| check | (ts?, count?) => HookCheckValue \| undefined | Manual projection function. Call it with a timestamp and count to get a precise snapshot for custom gauges or progress bars. |
The HookCheckValue returned by check() has this shape:
| Field | Type | Description |
|---|---|---|
| value | number | Projected remaining tokens (negative means over-limit) |
| ts | number | Timestamp of the projection (client time) |
| config | ResolvedAlgorithm | The algorithm config for further calculations |
| shard | number | Which shard was sampled |
| ok | boolean | true when value >= 0 |
| retryAt | number \| undefined | Client timestamp when tokens become available |
If you need precise projected values (for custom gauges), call check(ts, count).
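For instance, a gauge component might clamp the projected value into a percentage. This helper is our own, not part of the hook API:

```ts
// Turn a projected token value (e.g. HookCheckValue.value from check())
// into a 0–100 gauge percentage. Negative values (over-limit) clamp to 0.
function remainingPercent(value: number, limit: number): number {
  const clamped = Math.max(0, Math.min(value, limit));
  return Math.round((clamped / limit) * 100);
}
```

You would feed it something like `remainingPercent(check(Date.now(), 1)?.value ?? 0, limit)`.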
Protection and deny lists
When enableProtection is on, the limiter tracks repeated failures per identifier, IP, user-agent, and country. Once a value reaches denyListThreshold, it gets blocked for 24 hours — without even checking the database.
You can also provide static deny lists to block known bad actors immediately.
```ts
const limiter = new Ratelimit({
  db: ctx.db,
  prefix: 'api',
  limiter: Ratelimit.fixedWindow(100, '1 m'),
  failureMode: 'closed',
  enableProtection: true,
  denyListThreshold: 30,
  denyList: {
    identifiers: ['known-bad-user-id'],
    ips: ['203.0.113.0'],
    userAgents: ['BadBot/1.0'],
    countries: ['XX'],
  },
});
```

To trigger deny-list matching on request metadata, pass ip, userAgent, or country in the limit() call:

```ts
const result = await limiter.limit(userId, {
  ip: request.headers.get('x-forwarded-for') ?? undefined,
  userAgent: request.headers.get('user-agent') ?? undefined,
  country: request.headers.get('x-country') ?? undefined,
});
```

Important: Deny-list state is in-memory and non-durable. It can survive across warm runtime requests, but is lost on cold starts/deploys. For persistent blocking, use an external deny list or database-backed blocklist.
Dynamic limits
Dynamic limits let you change rate limits at runtime — useful for feature flags, admin overrides, or gradual rollouts. Enable them with dynamicLimits: true in the constructor.
```ts
const limiter = new Ratelimit({
  db: ctx.db,
  prefix: 'api:search',
  limiter: Ratelimit.fixedWindow(100, '1 m'),
  dynamicLimits: true,
});
```

Then use setDynamicLimit to override the configured limit at runtime:

```ts
// Double the limit during a sale
await limiter.setDynamicLimit({ limit: 200 });

// Read the current override
const { dynamicLimit } = await limiter.getDynamicLimit();
// dynamicLimit === 200

// Remove the override (reverts to configured limit)
await limiter.setDynamicLimit({ limit: false });
```

The dynamic limit overrides the limit field of the algorithm. For token bucket, it overrides refillRate (and maxTokens if they were originally equal).
Limits and mitigations you should know
Important: This is application-layer limiting. It protects business logic and expensive downstream work, but it is not a network firewall or DDoS shield.
Recommended production posture:
- Enforce auth early and reject fast.
- Protect anonymous flows with captcha + validated session IDs.
- Put network-layer controls (Cloudflare or equivalent) in front when IP-based mitigation is required.
- Alert on request spikes and fail safely (`failureMode: "closed"` by default).
API Reference
Constructor options
Create a Ratelimit instance with a config object:
```ts
const limiter = new Ratelimit(config: RatelimitConfig);
```

| Option | Type | Default | Description |
|---|---|---|---|
| db | ctx.db | — | Convex database context. Required for limit, check, getRemaining, getValue, resetUsedTokens, setDynamicLimit, getDynamicLimit. Not needed for hookAPI() (it receives db from the query/mutation context). |
| limiter | ResolvedAlgorithm | — | Required. Algorithm created by Ratelimit.fixedWindow(), Ratelimit.slidingWindow(), or Ratelimit.tokenBucket(). |
| prefix | string | '@better-convex/plugins/ratelimit' | Namespaces stored state in the database. Use unique prefixes for different rate limit scopes. |
| dynamicLimits | boolean | false | Enables setDynamicLimit() / getDynamicLimit(). |
| failureMode | 'closed' \| 'open' | 'closed' | Behavior on timeout. 'closed' rejects, 'open' allows. |
| timeout | number | 5000 | Milliseconds before triggering failureMode behavior. |
| enableProtection | boolean | false | Enables deny-list tracking on repeated failures. |
| denyListThreshold | number | 30 | Consecutive failures before an identifier is blocked (24h). Requires enableProtection: true. |
| denyList | ProtectionLists | undefined | Static deny lists. See Protection and deny lists. |
| ephemeralCache | Map<string, number> \| false | new Map() | In-memory block cache. Shared across requests in the same Convex invocation. Pass false to disable. |
Algorithm builders
All builders are available as static methods on Ratelimit.
Ratelimit.fixedWindow(limit, window, options?)
```ts
fixedWindow(limit: number, window: Duration, options?: AlgorithmOptions): FixedWindowAlgorithm
```

| Parameter | Type | Description |
|---|---|---|
| limit | number | Tokens replenished per window |
| window | Duration | Window length (number in ms, or string like '1 m') |
| options.shards | number | Write distribution shards (default 1) |
| options.maxReserved | number | Max tokens that can be borrowed from future windows |
| options.capacity | number | Max stored tokens (default = limit) |
| options.start | number | Epoch offset for window alignment |
Ratelimit.slidingWindow(limit, window, options?)
```ts
slidingWindow(limit: number, window: Duration, options?: AlgorithmOptions): SlidingWindowAlgorithm
```

| Parameter | Type | Description |
|---|---|---|
| limit | number | Max requests in the sliding window |
| window | Duration | Window length |
| options.shards | number | Write distribution shards (default 1) |
| options.maxReserved | number | Max tokens that can be borrowed |
Note: reserve is not supported with sliding window. The algorithm needs both current and previous window counts, which makes reservation impractical.
Ratelimit.tokenBucket(refillRate, interval, maxTokens, options?)
```ts
tokenBucket(refillRate: number, interval: Duration, maxTokens: number, options?: AlgorithmOptions): TokenBucketAlgorithm
```

| Parameter | Type | Description |
|---|---|---|
| refillRate | number | Tokens added per interval |
| interval | Duration | Refill interval |
| maxTokens | number | Maximum bucket capacity |
| options.shards | number | Write distribution shards (default 1) |
| options.maxReserved | number | Max tokens that can be borrowed from future refills |
Core methods
limit(identifier, options?)
Consume tokens and return a response. This is the primary method for enforcing rate limits.
```ts
limit(identifier: string, options?: LimitRequest): Promise<RatelimitResponse>
```

check(identifier, options?)
Evaluate without consuming tokens. Use this for read-only checks (e.g. showing a warning before the user submits).
```ts
check(identifier: string, options?: CheckRequest): Promise<RatelimitResponse>
```

getRemaining(identifier)
Return the remaining tokens, reset time, and limit for an identifier.
```ts
getRemaining(identifier: string): Promise<RemainingResponse>
```

getValue(identifier, options?)
Return a raw snapshot for custom projections and UI calculations.
```ts
getValue(identifier: string, options?: { sampleShards?: number }): Promise<RateLimitSnapshot>
```

resetUsedTokens(identifier)
Clear all stored state for an identifier. Useful for admin resets.
```ts
resetUsedTokens(identifier: string): Promise<void>
```

setDynamicLimit(options)
Override the configured limit at runtime. Pass { limit: false } to remove the override. Requires dynamicLimits: true.
```ts
setDynamicLimit(options: { limit: number | false }): Promise<void>
```

getDynamicLimit()
Read the current dynamic override. Returns { dynamicLimit: number | null }. Requires dynamicLimits: true.
```ts
getDynamicLimit(): Promise<DynamicLimitResponse>
```

hookAPI(options?)
Export a getRateLimit query and getServerTime mutation for the React hook.
```ts
hookAPI(options?: HookAPIOptions): {
  getRateLimit: FunctionReference<'query'>;
  getServerTime: FunctionReference<'mutation'>;
}
```

Request options
LimitRequest
Pass these options to limit() to customize behavior per-call.
| Field | Type | Default | Description |
|---|---|---|---|
| rate | number | 1 | Alias for count. Tokens to consume. |
| count | number | 1 | Tokens to consume. Takes precedence if both rate and count are set. |
| reserve | boolean | false | Allow borrowing from future capacity (up to maxReserved). Not supported by slidingWindow. |
| ip | string | — | IP address for deny-list matching |
| userAgent | string | — | User-agent for deny-list matching |
| country | string | — | Country code for deny-list matching |
| geo | unknown | — | Reserved for future geo-based rules |
CheckRequest
Same fields as LimitRequest, except that check() is read-only and never consumes tokens.
Response types
RatelimitResponse
Returned by limit() and check().
| Field | Type | Description |
|---|---|---|
| success | boolean | true if the request was allowed |
| ok | boolean | Alias for success (Convex DX parity) |
| limit | number | Maximum tokens for this algorithm |
| remaining | number | Tokens left after this request (floored to 0) |
| reset | number | Epoch ms when tokens will be available |
| pending | Promise<unknown> | Resolves when async side-effects complete |
| reason | 'timeout' \| 'cacheBlock' \| 'denyList' | Present when a reason applies. Note: failureMode: 'open' can return success: true with reason: 'timeout'. |
| deniedValue | string | Present only when reason === 'denyList'. The value that triggered the block. |
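As a sketch of how an error handler might consume these fields (the helper name and messages are our own, not part of the library):

```ts
// Map a RatelimitResponse-shaped result to a user-facing hint.
type RejectionInfo = {
  success: boolean;
  reset: number; // epoch ms when tokens become available
  reason?: 'timeout' | 'cacheBlock' | 'denyList';
};

function describeRejection(resp: RejectionInfo, now: number): string {
  if (resp.success) return 'ok';
  if (resp.reason === 'denyList') return 'blocked';
  // For ordinary limit exhaustion, tell the user how long to wait.
  const waitSec = Math.max(0, Math.ceil((resp.reset - now) / 1000));
  return `retry in ${waitSec}s`;
}
```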
RemainingResponse
Returned by getRemaining().
| Field | Type | Description |
|---|---|---|
| remaining | number | Tokens available |
| reset | number | Epoch ms of next replenishment |
| limit | number | Maximum tokens |
RateLimitSnapshot
Returned by getValue(). Used for custom projections and the React hook.
| Field | Type | Description |
|---|---|---|
| value | number | Current token count |
| ts | number | Timestamp of last state update |
| shard | number | Which shard was read |
| config | ResolvedAlgorithm | Full algorithm config for calculateRateLimit() |
Hook API
HookAPIOptions
Options for hookAPI().
| Field | Type | Default | Description |
|---|---|---|---|
| identifier | string \| (ctx, fromClient?) => string \| Promise<string> | — | How to resolve the identifier. A string uses it directly. A callback receives the Convex context and the optional client-provided identifier. |
| sampleShards | number | 1 | How many shards to sample when reading. Higher = more accurate, more reads. |
UseRateLimitOptions
Options for the useRateLimit() React hook.
```ts
useRateLimit(
  getRateLimitValueQuery: FunctionReference<'query'> | string,
  options?: UseRateLimitOptions
)
```

| Field | Type | Default | Description |
|---|---|---|---|
| identifier | string | — | Passed to the getRateLimit query |
| count | number | 1 | Tokens to project for status calculation |
| sampleShards | number | — | Override sampleShards from hook API |
| getServerTimeMutation | FunctionReference \| string | — | Enables clock-skew correction between client and server |
Time constants
Pre-defined millisecond constants exported from better-convex/plugins/ratelimit:
| Constant | Value |
|---|---|
| SECOND | 1_000 |
| MINUTE | 60_000 |
| HOUR | 3_600_000 |
| DAY | 86_400_000 |
| WEEK | 604_800_000 |
Internal tables
The rate limiter stores state in three Convex tables. These are added only when you enable ratelimitPlugin() in defineSchema — do not define tables with these names yourself.
| Table | Purpose |
|---|---|
| ratelimit_state | Per-identifier, per-shard token state |
| ratelimit_dynamic_limit | Dynamic limit overrides per prefix |
| ratelimit_protection_hit | Protection tracking (hits, blocks) per prefix |
Advanced notes
calculateRateLimit
The calculateRateLimit function is exported for custom projections and UI calculations. It takes a state snapshot, algorithm config, current timestamp, and count, and returns the evaluated result without touching the database.
```ts
import { calculateRateLimit } from 'better-convex/plugins/ratelimit';

const result = calculateRateLimit(
  { value: 8, ts: Date.now() - 30_000 },
  Ratelimit.fixedWindow(10, '1 m'),
  Date.now(),
  1
);
// result.remaining, result.reset, result.retryAfter
```

Sharding
When shards > 1, each limit() call picks a random shard (or two, using power-of-two-choices when shards >= 3) to reduce write contention. The trade-off: reads (check, getRemaining, getValue) only sample a subset of shards, so remaining counts are approximate. For most use cases, shards: 1 (the default) is fine. Increase shards only when you see write contention on hot identifiers.
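The power-of-two-choices idea can be sketched like this — illustrative only; the library's internal shard selection may differ in detail:

```ts
// Pick two random shard indices and prefer the one with more remaining
// capacity. Two samples are enough to keep load roughly balanced without
// reading every shard's state.
function pickShard(remainingPerShard: number[]): number {
  const n = remainingPerShard.length;
  const a = Math.floor(Math.random() * n);
  const b = Math.floor(Math.random() * n);
  return remainingPerShard[a] >= remainingPerShard[b] ? a : b;
}
```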
Ephemeral cache
The ephemeral block cache is an in-memory Map<string, number> that caches "blocked until" timestamps. When a limit() call fails, subsequent calls for the same identifier skip the database read entirely until the block expires. The cache is per-Ratelimit instance and resets on each Convex function invocation. Pass ephemeralCache: false to disable it, or pass a shared Map across multiple Ratelimit instances to share the cache.
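Conceptually, the cache behaves like this minimal model — a sketch of the semantics, not the internal implementation:

```ts
// Minimal model of the ephemeral block cache:
// identifier -> blockedUntil (epoch ms).
const ephemeralCache = new Map<string, number>();

function isBlocked(cache: Map<string, number>, id: string, now: number): boolean {
  const until = cache.get(id);
  if (until === undefined) return false;
  if (now >= until) {
    cache.delete(id); // block expired; the next call checks the database again
    return false;
  }
  return true; // still blocked: skip the database read entirely
}
```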
ok alias
The response includes both success and ok. They are always identical. ok exists for Convex DX parity with patterns like if (!result.ok) throw ....