Rate Limiting
Upstash-style rate limiting with Convex-first storage and middleware-friendly DX.
In this guide, we'll set up better-convex/plugins/ratelimit with an Upstash-parity API, Convex-first storage, and a UI-friendly hook.
We'll keep it practical: a reusable server guard, middleware wiring, algorithm choices, and client-side button disabling with live countdowns.
Approach
We'll build this in four steps so you can ship quickly and keep things maintainable:
- Enable the schema plugin so internal storage tables exist.
- Add one reusable server guard for all protected mutations.
- Wire the guard into middleware so handlers stay focused.
- Add optional client-side UX with `useRateLimit()` for disabled states and retry timers.
We'll end with a full API reference so you can tune behavior without guesswork.
Why this package
better-convex/plugins/ratelimit is designed as a hard cutover from component-driven APIs.
| What you get | Why it matters |
|---|---|
| Upstash-style API | Easier migration if your team already knows Upstash (limit, check, getRemaining, resetUsedTokens) |
| Convex-first tables | No component registration and no component migration path to manage |
| Read dedupe helpers | Common repeated reads may reuse cached results and reduce duplicate DB fetches |
| React hook support | hookAPI() + useRateLimit() gives accurate client countdown and button states |
| Fail-closed default | Safer behavior under pressure (failureMode: "closed") |
Install
```sh
bun add better-convex
```

That's it. You do not install or register `@convex-dev/rate-limiter`.
Enable schema plugin (required)
aggregatePlugin and migrationPlugin are built into defineSchema.
Rate limiting is opt-in, so add ratelimitPlugin() where you define schema:
```ts
import { defineSchema } from 'better-convex/orm';
import { ratelimitPlugin } from 'better-convex/plugins/ratelimit';

export const tables = {
  // your tables...
};

export default defineSchema(tables, {
  plugins: [ratelimitPlugin()],
});
```

Create a reusable guard
We'll start with a shared guard function so all mutations enforce rate limits consistently.
```ts
import { MINUTE, Ratelimit } from 'better-convex/plugins/ratelimit';
import { CRPCError } from 'better-convex/server';
import type { MutationCtx } from '../functions/generated/server';
import type { SessionUser } from '../shared/auth-shared';

const fixed = (rate: number) => Ratelimit.fixedWindow(rate, MINUTE);

const rateLimitConfig = {
  'default:free': fixed(60),
  'default:premium': fixed(200),
  'default:public': fixed(30),
  'todo/create:free': fixed(20),
  'todo/create:premium': fixed(60),
} as const;

function getUserTier(
  user: { isAdmin?: boolean; plan?: SessionUser['plan'] } | null
): 'free' | 'premium' | 'public' {
  if (!user) return 'public';
  if (user.isAdmin || user.plan) return 'premium';
  return 'free';
}

export async function rateLimitGuard(
  ctx: MutationCtx & {
    rateLimitKey: string;
    user: Pick<SessionUser, 'id' | 'plan'> | null;
  }
) {
  const tier = getUserTier(ctx.user);
  const limitKey = `${ctx.rateLimitKey}:${tier}` as keyof typeof rateLimitConfig;
  const resolved = limitKey in rateLimitConfig ? limitKey : (`default:${tier}` as const);

  const limiter = new Ratelimit({
    db: ctx.db,
    prefix: `example:${resolved}`,
    limiter: rateLimitConfig[resolved],
    failureMode: 'closed',
    enableProtection: true,
    denyListThreshold: 30,
  });

  const status = await limiter.limit(ctx.user?.id ?? 'anonymous');
  if (!status.success) {
    throw new CRPCError({
      code: 'TOO_MANY_REQUESTS',
      message: 'Rate limit exceeded. Please try again later.',
    });
  }
}
```

Now you have one place to tune plan limits, prefixes, and protection defaults.
No-wrapper IP signals with Better Auth
If you already use Better Auth sessions, you can enrich rate-limit checks with session-derived network signals in normal query/mutation middleware, without wrapping every endpoint in httpAction.
```ts
import { getSessionNetworkSignals } from 'better-convex/auth';
import { Ratelimit } from 'better-convex/plugins/ratelimit';
import type { MutationCtx } from '../functions/generated/server';

export async function rateLimitGuard(
  ctx: MutationCtx & {
    rateLimitKey: string;
    user: { id: string; session?: { ipAddress?: string; userAgent?: string } } | null;
  }
) {
  const limiter = new Ratelimit({
    db: ctx.db,
    prefix: `example:${ctx.rateLimitKey}`,
    limiter: Ratelimit.fixedWindow(60, '1 m'),
  });

  const identifier = ctx.user?.id ?? 'anonymous';
  const signals = await getSessionNetworkSignals(ctx, ctx.user?.session ?? null);

  const status = await limiter.limit(identifier, signals);
  if (!status.success) throw new Error('Too many requests');
}
```

`getSessionNetworkSignals()` returns:

- `{}` when no session is available
- `{ ip }`, `{ userAgent }`, or both when present on the session
- trimmed values, with blank strings normalized away
Optional anonymous-session strategy
If your app has public flows and you still want session-based IP/user-agent keys, use Better Auth anonymous sessions and captcha-gate anonymous session creation.
```ts
// server auth config
import { anonymous } from 'better-auth/plugins';

plugins: [
  // ...other plugins
  anonymous(),
];
```

```ts
// client auth config
import { anonymousClient } from 'better-auth/client/plugins';

plugins: [
  // ...other plugins
  anonymousClient(),
];
```

Important: session IP/user-agent is captured by the auth/session lifecycle and is not guaranteed to be fresh per request. Treat it as trusted-ish app-layer identity, not strict network truth. For cert/audit-grade per-request source-IP controls, use HTTP/proxy logging and enforcement.
Queries vs mutations
- Use `check()` in queries (read-only, no token consumption).
- Use `limit()` in mutations/actions (consumes capacity).

```ts
// query
const preview = await limiter.check(identifier, signals);

// mutation/action
const enforced = await limiter.limit(identifier, signals);
```

Wire it into middleware
Next, apply the guard from cRPC middleware so your handlers stay focused on business logic.
```ts
const rateLimitMiddleware = c.middleware<
  MutationCtx & { user?: Pick<SessionUser, 'id' | 'plan' | 'session'> | null }
>(async ({ ctx, meta, next }) => {
  await rateLimitGuard({
    ...ctx,
    rateLimitKey: meta.rateLimit ?? 'default',
    user: ctx.user ?? null,
  });
  return next({ ctx });
});

export const authMutation = c.mutation
  .meta({ auth: 'required' })
  .use(authMiddleware)
  .use(rateLimitMiddleware);
```

Then set per-procedure keys with metadata:

```ts
export const createTodo = authMutation
  .meta({ rateLimit: 'todo/create' })
  .input(z.object({ title: z.string().min(1) }))
  .mutation(async ({ ctx, input }) => {
    // business logic
  });
```

Choose your algorithm
Start simple and pick based on workload shape.
Fixed window
Best when hard windows are acceptable. Tokens reset at the start of each window.
```ts
const limiter = new Ratelimit({
  db: ctx.db,
  prefix: 'post:create',
  limiter: Ratelimit.fixedWindow(10, '1 m'),
});
```

Sliding window
Best when you want smoother request shaping without hard resets. Weighs the previous window proportionally so you don't get bursts at window boundaries.
```ts
const limiter = new Ratelimit({
  db: ctx.db,
  prefix: 'search',
  limiter: Ratelimit.slidingWindow(50, '1 m'),
});
```

Token bucket
Best for burst-friendly throughput with long-term control. Tokens refill at a steady rate up to maxTokens. Use maxReserved to allow requests to "borrow" from future tokens when the bucket is empty.
```ts
const limiter = new Ratelimit({
  db: ctx.db,
  prefix: 'llm:tokens',
  limiter: Ratelimit.tokenBucket(1000, '1 m', 1000, { maxReserved: 3000 }),
});
```

Algorithm options
All three algorithm builders accept an optional options object as the last argument.
| Option | Type | Default | Description |
|---|---|---|---|
| shards | number | 1 | Number of shards for write distribution. Higher values reduce contention at the cost of less precise counts (see Sharding). |
| maxReserved | number | undefined | Maximum tokens a request can "borrow" from future capacity. Only applies to fixedWindow and tokenBucket. Not supported by slidingWindow. |
| capacity | number | limit | Maximum stored tokens. Only applies to fixedWindow. Useful when you want a higher burst capacity than the per-window refill. |
| start | number | 0 | Epoch offset (ms) for window alignment. Only applies to fixedWindow. Aligns windows to a custom origin instead of epoch zero. |
Duration formats
Every window or interval parameter accepts a Duration — either a raw millisecond number or a human-readable string.
String format: "<number> <unit>" or "<number><unit>". Both '1 m' and '1m' work.
| Unit | Meaning | Example |
|---|---|---|
| ms | milliseconds | '500 ms' |
| s | seconds | '30 s' |
| m | minutes | '1 m' |
| h | hours | '1 h' |
| d | days | '1 d' |
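To make the accepted shapes concrete, here is an illustrative normalizer for this Duration format — a sketch only, not the library's actual parser:

```ts
// Illustrative Duration normalizer (not better-convex internals).
// Accepts a raw millisecond number, '1 m', or '1m'.
const UNIT_MS: Record<string, number> = {
  ms: 1,
  s: 1_000,
  m: 60_000,
  h: 3_600_000,
  d: 86_400_000,
};

function toMs(duration: number | string): number {
  if (typeof duration === 'number') return duration;
  const match = /^(\d+)\s*(ms|s|m|h|d)$/.exec(duration.trim());
  if (!match) throw new Error(`invalid duration: ${duration}`);
  return Number(match[1]) * UNIT_MS[match[2]];
}
```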
You can also use the pre-defined constants from better-convex/plugins/ratelimit:
```ts
import { SECOND, MINUTE, HOUR, DAY, WEEK } from 'better-convex/plugins/ratelimit';

Ratelimit.fixedWindow(100, MINUTE); // 60_000 ms
Ratelimit.slidingWindow(50, 30 * SECOND); // 30_000 ms
Ratelimit.tokenBucket(10, HOUR, 100); // 3_600_000 ms
```

Done. You now have deterministic, application-layer limits with one API surface.
Add a client-side limiter UX
Server enforcement is mandatory. Client checks are for better UX — disabled buttons, countdowns, and retry hints.
Expose the hook API
First, export the hook API from a Convex file. The hookAPI() method returns a getRateLimit query and a getServerTime mutation that the React hook consumes.
```ts
import { Ratelimit } from 'better-convex/plugins/ratelimit';

const limiter = new Ratelimit({
  limiter: Ratelimit.fixedWindow(3, '30 s'),
});

export const { getRateLimit, getServerTime } = limiter.hookAPI({
  identifier: async (_ctx, fromClient) => fromClient ?? 'anonymous',
  sampleShards: 1,
});
```

The identifier option can be a static string, or an async callback that receives (ctx, fromClient). Use the callback to resolve the identifier server-side (e.g. from auth) while still accepting a client-provided fallback.
sampleShards controls how many shards to read when estimating the remaining count. Set it to 1 for low-cost reads, or increase it for more accurate estimates on high-shard configs.
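The precedence an identifier callback typically implements can be sketched as a plain helper (the helper name is ours; how you obtain the server-side user id depends on your auth setup):

```ts
// Sketch: prefer the authenticated user id resolved on the server,
// fall back to the client-supplied identifier, then a shared bucket.
function resolveIdentifier(
  serverUserId: string | null,
  fromClient: string | undefined
): string {
  return serverUserId ?? fromClient ?? 'anonymous';
}
```

Inside the callback this would look something like `identifier: async (ctx, fromClient) => resolveIdentifier(await getUserId(ctx), fromClient)`, where `getUserId` is a stand-in for your own auth helper.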
Use the React hook
Then wire it up in your component with useRateLimit:
```ts
import { useRateLimit } from 'better-convex/plugins/ratelimit/react';

const rateLimitRef = 'ratelimitDemo:getInteractiveRateLimit' as const;
const serverTimeRef = 'ratelimitDemo:getInteractiveServerTime' as const;

const { status, check } = useRateLimit(rateLimitRef, {
  identifier: sessionId,
  count: 1,
  getServerTimeMutation: serverTimeRef,
});

const blocked = status?.ok === false;
const retryAt = status?.retryAt;
```

useRateLimit accepts either:

- a Convex function path string (`'module:functionName'`) — this is what the `/ratelimit` demo uses.
- a generated `FunctionReference` from `api`.
The hook returns:
| Field | Type | Description |
|---|---|---|
| status | HookStatus \| undefined | undefined while loading. { ok: true } when allowed, { ok: false, retryAt: number } when blocked. Auto-updates when retryAt passes. |
| check | (ts?, count?) => HookCheckValue \| undefined | Manual projection function. Call it with a timestamp and count to get a precise snapshot for custom gauges or progress bars. |
The HookCheckValue returned by check() has this shape:
| Field | Type | Description |
|---|---|---|
| value | number | Projected remaining tokens (negative means over-limit) |
| ts | number | Timestamp of the projection (client time) |
| config | ResolvedAlgorithm | The algorithm config for further calculations |
| shard | number | Which shard was sampled |
| ok | boolean | true when value >= 0 |
| retryAt | number \| undefined | Client timestamp when tokens become available |
If you need precise projected values (for custom gauges), call check(ts, count).
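For instance, a gauge component might clamp the projected value into a percentage. This helper is our own, not part of the hook API:

```ts
// Turn a projected token value (e.g. HookCheckValue.value from check())
// into a 0–100 gauge percentage. Negative values (over-limit) clamp to 0.
function remainingPercent(value: number, limit: number): number {
  const clamped = Math.max(0, Math.min(value, limit));
  return Math.round((clamped / limit) * 100);
}
```

You would feed it something like `remainingPercent(check(Date.now(), 1)?.value ?? 0, limit)`.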
Protection and deny lists
When enableProtection is on, the limiter tracks repeated failures per identifier, IP, user-agent, and country. Once a value reaches denyListThreshold, it gets blocked for 24 hours — without even checking the database.
You can also provide static deny lists to block known bad actors immediately.
```ts
const limiter = new Ratelimit({
  db: ctx.db,
  prefix: 'api',
  limiter: Ratelimit.fixedWindow(100, '1 m'),
  failureMode: 'closed',
  enableProtection: true,
  denyListThreshold: 30,
  denyList: {
    identifiers: ['known-bad-user-id'],
    ips: ['203.0.113.0'],
    userAgents: ['BadBot/1.0'],
    countries: ['XX'],
  },
});
```

To trigger deny-list matching on request metadata, pass ip, userAgent, or country in the limit() call:

```ts
const result = await limiter.limit(userId, {
  ip: request.headers.get('x-forwarded-for') ?? undefined,
  userAgent: request.headers.get('user-agent') ?? undefined,
  country: request.headers.get('x-country') ?? undefined,
});
```

Important: Deny-list state is in-memory and non-durable. It can survive across warm runtime requests, but is lost on cold starts/deploys. For persistent blocking, use an external deny list or database-backed blocklist.
Dynamic limits
Dynamic limits let you change rate limits at runtime — useful for feature flags, admin overrides, or gradual rollouts. Enable them with dynamicLimits: true in the constructor.
```ts
const limiter = new Ratelimit({
  db: ctx.db,
  prefix: 'api:search',
  limiter: Ratelimit.fixedWindow(100, '1 m'),
  dynamicLimits: true,
});
```

Then use setDynamicLimit to override the configured limit at runtime:

```ts
// Double the limit during a sale
await limiter.setDynamicLimit({ limit: 200 });

// Read the current override
const { dynamicLimit } = await limiter.getDynamicLimit();
// dynamicLimit === 200

// Remove the override (reverts to configured limit)
await limiter.setDynamicLimit({ limit: false });
```

The dynamic limit overrides the limit field of the algorithm. For token bucket, it overrides refillRate (and maxTokens if they were originally equal).
Limits and mitigations you should know
Important: This is application-layer limiting. It protects business logic and expensive downstream work, but it is not a network firewall or DDoS shield.
Recommended production posture:
- Enforce auth early and reject fast.
- Protect anonymous flows with captcha + validated session IDs.
- Put network-layer controls (Cloudflare or equivalent) in front when IP-based mitigation is required.
- Alert on request spikes and fail safely (`failureMode: "closed"` by default).
API Reference
Constructor options
Create a Ratelimit instance with a config object:
```ts
const limiter = new Ratelimit(config: RatelimitConfig);
```

| Option | Type | Default | Description |
|---|---|---|---|
| db | ctx.db | — | Convex database context. Required for limit, check, getRemaining, getValue, resetUsedTokens, setDynamicLimit, getDynamicLimit. Not needed for hookAPI() (it receives db from the query/mutation context). |
| limiter | ResolvedAlgorithm | — | Required. Algorithm created by Ratelimit.fixedWindow(), Ratelimit.slidingWindow(), or Ratelimit.tokenBucket(). |
| prefix | string | '@better-convex/plugins/ratelimit' | Namespaces stored state in the database. Use unique prefixes for different rate limit scopes. |
| dynamicLimits | boolean | false | Enables setDynamicLimit() / getDynamicLimit(). |
| failureMode | 'closed' \| 'open' | 'closed' | Behavior on timeout. 'closed' rejects, 'open' allows. |
| timeout | number | 5000 | Milliseconds before triggering failureMode behavior. |
| enableProtection | boolean | false | Enables deny-list tracking on repeated failures. |
| denyListThreshold | number | 30 | Consecutive failures before an identifier is blocked (24h). Requires enableProtection: true. |
| denyList | ProtectionLists | undefined | Static deny lists. See Protection and deny lists. |
| ephemeralCache | Map<string, number> \| false | new Map() | In-memory block cache. Shared across requests in the same Convex invocation. Pass false to disable. |
Algorithm builders
All builders are available as static methods on Ratelimit.
Ratelimit.fixedWindow(limit, window, options?)
```ts
fixedWindow(limit: number, window: Duration, options?: AlgorithmOptions): FixedWindowAlgorithm
```

| Parameter | Type | Description |
|---|---|---|
| limit | number | Tokens replenished per window |
| window | Duration | Window length (number in ms, or string like '1 m') |
| options.shards | number | Write distribution shards (default 1) |
| options.maxReserved | number | Max tokens that can be borrowed from future windows |
| options.capacity | number | Max stored tokens (default = limit) |
| options.start | number | Epoch offset for window alignment |
Ratelimit.slidingWindow(limit, window, options?)
```ts
slidingWindow(limit: number, window: Duration, options?: AlgorithmOptions): SlidingWindowAlgorithm
```

| Parameter | Type | Description |
|---|---|---|
| limit | number | Max requests in the sliding window |
| window | Duration | Window length |
| options.shards | number | Write distribution shards (default 1) |
| options.maxReserved | number | Max tokens that can be borrowed |
Note: reserve is not supported with sliding window. The algorithm needs both current and previous window counts, which makes reservation impractical.
Ratelimit.tokenBucket(refillRate, interval, maxTokens, options?)
```ts
tokenBucket(refillRate: number, interval: Duration, maxTokens: number, options?: AlgorithmOptions): TokenBucketAlgorithm
```

| Parameter | Type | Description |
|---|---|---|
| refillRate | number | Tokens added per interval |
| interval | Duration | Refill interval |
| maxTokens | number | Maximum bucket capacity |
| options.shards | number | Write distribution shards (default 1) |
| options.maxReserved | number | Max tokens that can be borrowed from future refills |
Core methods
limit(identifier, options?)
Consume tokens and return a response. This is the primary method for enforcing rate limits.
```ts
limit(identifier: string, options?: LimitRequest): Promise<RatelimitResponse>
```

check(identifier, options?)
Evaluate without consuming tokens. Use this for read-only checks (e.g. showing a warning before the user submits).
```ts
check(identifier: string, options?: CheckRequest): Promise<RatelimitResponse>
```

getRemaining(identifier)
Return the remaining tokens, reset time, and limit for an identifier.
```ts
getRemaining(identifier: string): Promise<RemainingResponse>
```

getValue(identifier, options?)
Return a raw snapshot for custom projections and UI calculations.
```ts
getValue(identifier: string, options?: { sampleShards?: number }): Promise<RateLimitSnapshot>
```

resetUsedTokens(identifier)
Clear all stored state for an identifier. Useful for admin resets.
```ts
resetUsedTokens(identifier: string): Promise<void>
```

setDynamicLimit(options)
Override the configured limit at runtime. Pass { limit: false } to remove the override. Requires dynamicLimits: true.
```ts
setDynamicLimit(options: { limit: number | false }): Promise<void>
```

getDynamicLimit()
Read the current dynamic override. Returns { dynamicLimit: number | null }. Requires dynamicLimits: true.
```ts
getDynamicLimit(): Promise<DynamicLimitResponse>
```

hookAPI(options?)
Export a getRateLimit query and getServerTime mutation for the React hook.
```ts
hookAPI(options?: HookAPIOptions): {
  getRateLimit: FunctionReference<'query'>;
  getServerTime: FunctionReference<'mutation'>;
}
```

Request options
LimitRequest
Pass these options to limit() to customize behavior per-call.
| Field | Type | Default | Description |
|---|---|---|---|
| rate | number | 1 | Alias for count. Tokens to consume. |
| count | number | 1 | Tokens to consume. Takes precedence if both rate and count are set. |
| reserve | boolean | false | Allow borrowing from future capacity (up to maxReserved). Not supported by slidingWindow. |
| ip | string | — | IP address for deny-list matching |
| userAgent | string | — | User-agent for deny-list matching |
| country | string | — | Country code for deny-list matching |
| geo | unknown | — | Reserved for future geo-based rules |
CheckRequest
Same fields as LimitRequest, except that check() is read-only and never consumes tokens.
Response types
RatelimitResponse
Returned by limit() and check().
| Field | Type | Description |
|---|---|---|
| success | boolean | true if the request was allowed |
| ok | boolean | Alias for success (Convex DX parity) |
| limit | number | Maximum tokens for this algorithm |
| remaining | number | Tokens left after this request (floored to 0) |
| reset | number | Epoch ms when tokens will be available |
| pending | Promise<unknown> | Resolves when async side-effects complete |
| reason | 'timeout' \| 'cacheBlock' \| 'denyList' | Present when a reason applies. Note: failureMode: 'open' can return success: true with reason: 'timeout'. |
| deniedValue | string | Present only when reason === 'denyList'. The value that triggered the block. |
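As a sketch of how an error handler might consume these fields (the helper name and messages are our own, not part of the library):

```ts
// Map a RatelimitResponse-shaped result to a user-facing hint.
type RejectionInfo = {
  success: boolean;
  reset: number; // epoch ms when tokens become available
  reason?: 'timeout' | 'cacheBlock' | 'denyList';
};

function describeRejection(resp: RejectionInfo, now: number): string {
  if (resp.success) return 'ok';
  if (resp.reason === 'denyList') return 'blocked';
  // For ordinary limit exhaustion, tell the user how long to wait.
  const waitSec = Math.max(0, Math.ceil((resp.reset - now) / 1000));
  return `retry in ${waitSec}s`;
}
```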
RemainingResponse
Returned by getRemaining().
| Field | Type | Description |
|---|---|---|
| remaining | number | Tokens available |
| reset | number | Epoch ms of next replenishment |
| limit | number | Maximum tokens |
RateLimitSnapshot
Returned by getValue(). Used for custom projections and the React hook.
| Field | Type | Description |
|---|---|---|
| value | number | Current token count |
| ts | number | Timestamp of last state update |
| shard | number | Which shard was read |
| config | ResolvedAlgorithm | Full algorithm config for calculateRateLimit() |
Hook API
HookAPIOptions
Options for hookAPI().
| Field | Type | Default | Description |
|---|---|---|---|
| identifier | string \| (ctx, fromClient?) => string \| Promise<string> | — | How to resolve the identifier. A string uses it directly. A callback receives the Convex context and the optional client-provided identifier. |
| sampleShards | number | 1 | How many shards to sample when reading. Higher = more accurate, more reads. |
UseRateLimitOptions
Options for the useRateLimit() React hook.
```ts
useRateLimit(
  getRateLimitValueQuery: FunctionReference<'query'> | string,
  options?: UseRateLimitOptions
)
```

| Field | Type | Default | Description |
|---|---|---|---|
| identifier | string | — | Passed to the getRateLimit query |
| count | number | 1 | Tokens to project for status calculation |
| sampleShards | number | — | Override sampleShards from hook API |
| getServerTimeMutation | FunctionReference \| string | — | Enables clock-skew correction between client and server |
Time constants
Pre-defined millisecond constants exported from better-convex/plugins/ratelimit:
| Constant | Value |
|---|---|
| SECOND | 1_000 |
| MINUTE | 60_000 |
| HOUR | 3_600_000 |
| DAY | 86_400_000 |
| WEEK | 604_800_000 |
Internal tables
The rate limiter stores state in three Convex tables. These are added only when you enable ratelimitPlugin() in defineSchema — do not define tables with these names yourself.
| Table | Purpose |
|---|---|
| ratelimit_state | Per-identifier, per-shard token state |
| ratelimit_dynamic_limit | Dynamic limit overrides per prefix |
| ratelimit_protection_hit | Protection tracking (hits, blocks) per prefix |
Advanced notes
calculateRateLimit
The calculateRateLimit function is exported for custom projections and UI calculations. It takes a state snapshot, algorithm config, current timestamp, and count, and returns the evaluated result without touching the database.
```ts
import { calculateRateLimit } from 'better-convex/plugins/ratelimit';

const result = calculateRateLimit(
  { value: 8, ts: Date.now() - 30_000 },
  Ratelimit.fixedWindow(10, '1 m'),
  Date.now(),
  1
);
// result.remaining, result.reset, result.retryAfter
```

Sharding
When shards > 1, each limit() call picks a random shard (or two, using power-of-two-choices when shards >= 3) to reduce write contention. The trade-off: reads (check, getRemaining, getValue) only sample a subset of shards, so remaining counts are approximate. For most use cases, shards: 1 (the default) is fine. Increase shards only when you see write contention on hot identifiers.
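The power-of-two-choices idea can be sketched like this — illustrative only; the library's internal shard selection may differ in detail:

```ts
// Pick two random shard indices and prefer the one with more remaining
// capacity. Two samples are enough to keep load roughly balanced without
// reading every shard's state.
function pickShard(remainingPerShard: number[]): number {
  const n = remainingPerShard.length;
  const a = Math.floor(Math.random() * n);
  const b = Math.floor(Math.random() * n);
  return remainingPerShard[a] >= remainingPerShard[b] ? a : b;
}
```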
Ephemeral cache
The ephemeral block cache is an in-memory Map<string, number> that caches "blocked until" timestamps. When a limit() call fails, subsequent calls for the same identifier skip the database read entirely until the block expires. The cache is per-Ratelimit instance and resets on each Convex function invocation. Pass ephemeralCache: false to disable it, or pass a shared Map across multiple Ratelimit instances to share the cache.
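Conceptually, the cache behaves like this minimal model — a sketch of the semantics, not the internal implementation:

```ts
// Minimal model of the ephemeral block cache:
// identifier -> blockedUntil (epoch ms).
const ephemeralCache = new Map<string, number>();

function isBlocked(cache: Map<string, number>, id: string, now: number): boolean {
  const until = cache.get(id);
  if (until === undefined) return false;
  if (now >= until) {
    cache.delete(id); // block expired; the next call checks the database again
    return false;
  }
  return true; // still blocked: skip the database read entirely
}
```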
ok alias
The response includes both success and ok. They are always identical. ok exists for Convex DX parity with patterns like if (!result.ok) throw ....