BETTER-CONVEX

Rate Limiting

Upstash-style rate limiting with Convex-first storage and middleware-friendly DX.

In this guide, we'll set up better-convex/plugins/ratelimit with an Upstash-parity API, Convex-first storage, and a UI-friendly hook.

We'll keep it practical: a reusable server guard, middleware wiring, algorithm choices, and client-side button disabling with live countdowns.

Approach

We'll build this in four steps so you can ship quickly and keep things maintainable:

  1. Enable the schema plugin so internal storage tables exist.
  2. Add one reusable server guard for all protected mutations.
  3. Wire the guard into middleware so handlers stay focused.
  4. Add optional client-side UX with useRateLimit() for disabled states and retry timers.

We'll end with a full API reference so you can tune behavior without guesswork.

Why this package

better-convex/plugins/ratelimit is designed as a hard cutover from component-driven APIs.

What you get, and why it matters:

  • Upstash-style API: easier migration if your team already knows Upstash (limit, check, getRemaining, resetUsedTokens)
  • Convex-first tables: no component registration and no component migration path to manage
  • Read dedupe helpers: common repeated reads may reuse cached results and reduce duplicate DB fetches
  • React hook support: hookAPI() + useRateLimit() gives accurate client countdowns and button states
  • Fail-closed default: safer behavior under pressure (failureMode: "closed")

Install

bun add better-convex

That's it. You do not install or register @convex-dev/rate-limiter.

Enable schema plugin (required)

aggregatePlugin and migrationPlugin are built into defineSchema.
Rate limiting is opt-in, so add ratelimitPlugin() where you define your schema:

convex/functions/schema.ts
import { defineSchema } from 'better-convex/orm';
import { ratelimitPlugin } from 'better-convex/plugins/ratelimit';

export const tables = {
  // your tables...
};

export default defineSchema(tables, {
  plugins: [ratelimitPlugin()],
});

Create a reusable guard

We'll start with a shared guard function so all mutations enforce rate limits consistently.

convex/lib/rate-limiter.ts
import { MINUTE, Ratelimit } from 'better-convex/plugins/ratelimit';
import { CRPCError } from 'better-convex/server';
import type { MutationCtx } from '../functions/generated/server';
import type { SessionUser } from '../shared/auth-shared';

const fixed = (rate: number) => Ratelimit.fixedWindow(rate, MINUTE);

const rateLimitConfig = {
  'default:free': fixed(60),
  'default:premium': fixed(200),
  'default:public': fixed(30),
  'todo/create:free': fixed(20),
  'todo/create:premium': fixed(60),
} as const;

function getUserTier(
  user: { isAdmin?: boolean; plan?: SessionUser['plan'] } | null
): 'free' | 'premium' | 'public' {
  if (!user) return 'public';
  if (user.isAdmin || user.plan) return 'premium';
  return 'free';
}

export async function rateLimitGuard(
  ctx: MutationCtx & {
    rateLimitKey: string;
    user: Pick<SessionUser, 'id' | 'plan'> | null;
  }
) {
  const tier = getUserTier(ctx.user);
  const limitKey = `${ctx.rateLimitKey}:${tier}` as keyof typeof rateLimitConfig;
  const resolved = limitKey in rateLimitConfig ? limitKey : (`default:${tier}` as const);

  const limiter = new Ratelimit({
    db: ctx.db,
    prefix: `example:${resolved}`,
    limiter: rateLimitConfig[resolved],
    failureMode: 'closed',
    enableProtection: true,
    denyListThreshold: 30,
  });

  const status = await limiter.limit(ctx.user?.id ?? 'anonymous');
  if (!status.success) {
    throw new CRPCError({
      code: 'TOO_MANY_REQUESTS',
      message: 'Rate limit exceeded. Please try again later.',
    });
  }
}

Now you have one place to tune plan limits, prefixes, and protection defaults.

No-wrapper IP signals with Better Auth

If you already use Better Auth sessions, you can enrich rate-limit checks with session-derived network signals in normal query/mutation middleware, without wrapping every endpoint in httpAction.

convex/lib/rate-limiter.ts
import { getSessionNetworkSignals } from 'better-convex/auth';
import { Ratelimit } from 'better-convex/plugins/ratelimit';
import type { MutationCtx } from '../functions/generated/server';

export async function rateLimitGuard(
  ctx: MutationCtx & {
    rateLimitKey: string;
    user: { id: string; session?: { ipAddress?: string; userAgent?: string } } | null;
  }
) {
  const limiter = new Ratelimit({
    db: ctx.db,
    prefix: `example:${ctx.rateLimitKey}`,
    limiter: Ratelimit.fixedWindow(60, '1 m'),
  });

  const identifier = ctx.user?.id ?? 'anonymous';
  const signals = await getSessionNetworkSignals(ctx, ctx.user?.session ?? null);
  const status = await limiter.limit(identifier, signals);

  if (!status.success) throw new Error('Too many requests');
}

getSessionNetworkSignals() returns:

  • {} when no session is available
  • { ip }, { userAgent }, or both when present on the session
  • trimmed values, with blank strings normalized away
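The trimming and blank-string normalization can be sketched as a standalone helper (illustrative only; getSessionNetworkSignals is the real implementation):

```typescript
// Hypothetical standalone sketch of the normalization described above:
// trim each value and drop blank strings, returning only present fields.
type NetworkSignals = { ip?: string; userAgent?: string };

function normalizeSignals(
  session: { ipAddress?: string; userAgent?: string } | null
): NetworkSignals {
  if (!session) return {};
  const signals: NetworkSignals = {};
  const ip = session.ipAddress?.trim();
  if (ip) signals.ip = ip; // blank strings are normalized away
  const userAgent = session.userAgent?.trim();
  if (userAgent) signals.userAgent = userAgent;
  return signals;
}
```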

Optional anonymous-session strategy

If your app has public flows and you still want session-based IP/user-agent keys, use Better Auth anonymous sessions and captcha-gate anonymous session creation.

convex/functions/auth.ts
import { anonymous } from 'better-auth/plugins';

plugins: [
  // ...other plugins
  anonymous(),
];
src/lib/convex/auth-client.ts
import { anonymousClient } from 'better-auth/client/plugins';

plugins: [
  // ...other plugins
  anonymousClient(),
];

Important: session IP/user-agent is captured by auth/session lifecycle, not guaranteed to be fresh per request. Treat it as trusted-ish app-layer identity, not strict network truth. For cert/audit-grade per-request source IP controls, use HTTP/proxy logging and enforcement.

Queries vs mutations

  • Use check() in queries (read-only, no token consumption).
  • Use limit() in mutations/actions (consumes capacity).
// query
const preview = await limiter.check(identifier, signals);

// mutation/action
const enforced = await limiter.limit(identifier, signals);

Wire it into middleware

Next, apply the guard from cRPC middleware so your handlers stay focused on business logic.

convex/lib/crpc.ts
const rateLimitMiddleware = c.middleware<
  MutationCtx & { user?: Pick<SessionUser, 'id' | 'plan' | 'session'> | null }
>(async ({ ctx, meta, next }) => {
  await rateLimitGuard({
    ...ctx,
    rateLimitKey: meta.rateLimit ?? 'default',
    user: ctx.user ?? null,
  });
  return next({ ctx });
});

export const authMutation = c.mutation
  .meta({ auth: 'required' })
  .use(authMiddleware)
  .use(rateLimitMiddleware);

Then set per-procedure keys with metadata:

export const createTodo = authMutation
  .meta({ rateLimit: 'todo/create' })
  .input(z.object({ title: z.string().min(1) }))
  .mutation(async ({ ctx, input }) => {
    // business logic
  });

Choose your algorithm

Start simple and pick based on workload shape.

Fixed window

Best when hard windows are acceptable. Tokens reset at the start of each window.

const limiter = new Ratelimit({
  db: ctx.db,
  prefix: 'post:create',
  limiter: Ratelimit.fixedWindow(10, '1 m'),
});

Sliding window

Best when you want smoother request shaping without hard resets. Weighs the previous window proportionally so you don't get bursts at window boundaries.

const limiter = new Ratelimit({
  db: ctx.db,
  prefix: 'search',
  limiter: Ratelimit.slidingWindow(50, '1 m'),
});
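The boundary-smoothing math can be sketched in isolation (an illustration of the weighting idea, not the library's internals):

```typescript
// Illustrative sketch: the previous window's count is weighted by how much
// of it still overlaps the sliding window, then added to the current count.
function slidingWindowCount(
  prevCount: number,
  currCount: number,
  windowMs: number,
  elapsedInCurrentMs: number // time since the current window started
): number {
  const prevWeight = (windowMs - elapsedInCurrentMs) / windowMs;
  return prevCount * prevWeight + currCount;
}
```

Halfway through the current window, only half of the previous window's requests still count, which is what prevents a burst right at the boundary.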

Token bucket

Best for burst-friendly throughput with long-term control. Tokens refill at a steady rate up to maxTokens. Use maxReserved to allow requests to "borrow" from future tokens when the bucket is empty.

const limiter = new Ratelimit({
  db: ctx.db,
  prefix: 'llm:tokens',
  limiter: Ratelimit.tokenBucket(1000, '1 m', 1000, { maxReserved: 3000 }),
});
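The refill behavior can be sketched as plain math (illustrative, not the library's code):

```typescript
// Illustrative sketch of token-bucket refill: tokens accrue at refillRate
// per elapsed interval and are capped at maxTokens.
function refill(
  tokens: number,
  lastTs: number,
  now: number,
  refillRate: number,
  intervalMs: number,
  maxTokens: number
): number {
  const intervalsElapsed = Math.floor((now - lastTs) / intervalMs);
  return Math.min(maxTokens, tokens + intervalsElapsed * refillRate);
}
```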

Algorithm options

All three algorithm builders accept an optional options object as the last argument.

  • shards (number, default 1): number of shards for write distribution. Higher values reduce contention at the cost of less precise counts (see Sharding).
  • maxReserved (number, default undefined): maximum tokens a request can "borrow" from future capacity. Only applies to fixedWindow and tokenBucket. Not supported by slidingWindow.
  • capacity (number, default = limit): maximum stored tokens. Only applies to fixedWindow. Useful when you want a higher burst capacity than the per-window refill.
  • start (number, default 0): epoch offset (ms) for window alignment. Only applies to fixedWindow. Aligns windows to a custom origin instead of epoch zero.
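As an illustration of the start option's semantics (an assumption based on the description above), window alignment boils down to a window index computed relative to the custom origin instead of epoch zero:

```typescript
// Illustrative sketch: which fixed window a timestamp falls in, relative
// to a custom `start` origin (default: epoch zero).
function windowIndex(now: number, windowMs: number, start = 0): number {
  return Math.floor((now - start) / windowMs);
}
```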

Duration formats

Every window or interval parameter accepts a Duration — either a raw millisecond number or a human-readable string.

String format: "<number> <unit>" or "<number><unit>". Both '1 m' and '1m' work.

  • ms: milliseconds, e.g. '500 ms'
  • s: seconds, e.g. '30 s'
  • m: minutes, e.g. '1 m'
  • h: hours, e.g. '1 h'
  • d: days, e.g. '1 d'
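A sketch of a parser for this format (hypothetical; the library ships its own):

```typescript
// Hypothetical Duration parser: accepts a raw millisecond number or a
// string like '1 m' / '1m' with the units listed above.
const UNIT_MS: Record<string, number> = {
  ms: 1,
  s: 1_000,
  m: 60_000,
  h: 3_600_000,
  d: 86_400_000,
};

function parseDuration(input: number | string): number {
  if (typeof input === "number") return input;
  const match = input.match(/^(\d+)\s*(ms|s|m|h|d)$/);
  if (!match) throw new Error(`Invalid duration: ${input}`);
  return Number(match[1]) * UNIT_MS[match[2]];
}
```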

You can also use the pre-defined constants from better-convex/plugins/ratelimit:

import { SECOND, MINUTE, HOUR, DAY, WEEK } from 'better-convex/plugins/ratelimit';

Ratelimit.fixedWindow(100, MINUTE);        // 60_000 ms
Ratelimit.slidingWindow(50, 30 * SECOND);  // 30_000 ms
Ratelimit.tokenBucket(10, HOUR, 100);      // 3_600_000 ms

Done. You now have deterministic, application-layer limits with one API surface.

Add a client-side limiter UX

Server enforcement is mandatory. Client checks are for better UX — disabled buttons, countdowns, and retry hints.

Expose the hook API

First, export the hook API from a Convex file. The hookAPI() method returns a getRateLimit query and a getServerTime mutation that the React hook consumes.

convex/functions/ratelimit.ts
import { Ratelimit } from 'better-convex/plugins/ratelimit';

const limiter = new Ratelimit({
  limiter: Ratelimit.fixedWindow(3, '30 s'),
});

export const { getRateLimit, getServerTime } = limiter.hookAPI({
  identifier: async (_ctx, fromClient) => fromClient ?? 'anonymous',
  sampleShards: 1,
});

The identifier option can be a static string, or an async callback that receives (ctx, fromClient). Use the callback to resolve the identifier server-side (e.g. from auth) while still accepting a client-provided fallback.

sampleShards controls how many shards to read when estimating the remaining count. Set it to 1 for low-cost reads, or increase it for more accurate estimates on high-shard configs.

Use the React hook

Then wire it up in your component with useRateLimit:

src/components/send-button.tsx
import { useRateLimit } from 'better-convex/plugins/ratelimit/react';

const rateLimitRef = 'ratelimitDemo:getInteractiveRateLimit' as const;
const serverTimeRef = 'ratelimitDemo:getInteractiveServerTime' as const;

const { status, check } = useRateLimit(rateLimitRef, {
  identifier: sessionId,
  count: 1,
  getServerTimeMutation: serverTimeRef,
});

const blocked = status?.ok === false;
const retryAt = status?.retryAt;

useRateLimit accepts either:

  • a Convex function path string ('module:functionName') — this is what the /ratelimit demo uses.
  • a generated FunctionReference from api.

The hook returns:

  • status (HookStatus | undefined): undefined while loading. { ok: true } when allowed, { ok: false, retryAt: number } when blocked. Auto-updates when retryAt passes.
  • check ((ts?, count?) => HookCheckValue | undefined): manual projection function. Call it with a timestamp and count to get a precise snapshot for custom gauges or progress bars.

The HookCheckValue returned by check() has this shape:

  • value (number): projected remaining tokens (negative means over-limit)
  • ts (number): timestamp of the projection (client time)
  • config (ResolvedAlgorithm): the algorithm config for further calculations
  • shard (number): which shard was sampled
  • ok (boolean): true when value >= 0
  • retryAt (number | undefined): client timestamp when tokens become available

If you need precise projected values (for custom gauges), call check(ts, count).

Protection and deny lists

When enableProtection is on, the limiter tracks repeated failures per identifier, IP, user-agent, and country. Once a value reaches denyListThreshold, it gets blocked for 24 hours — without even checking the database.

You can also provide static deny lists to block known bad actors immediately.

const limiter = new Ratelimit({
  db: ctx.db,
  prefix: 'api',
  limiter: Ratelimit.fixedWindow(100, '1 m'),
  failureMode: 'closed',
  enableProtection: true,
  denyListThreshold: 30,
  denyList: {
    identifiers: ['known-bad-user-id'],
    ips: ['203.0.113.0'],
    userAgents: ['BadBot/1.0'],
    countries: ['XX'],
  },
});

To trigger deny-list matching on request metadata, pass ip, userAgent, or country in the limit() call:

const result = await limiter.limit(userId, {
  ip: request.headers.get('x-forwarded-for') ?? undefined,
  userAgent: request.headers.get('user-agent') ?? undefined,
  country: request.headers.get('x-country') ?? undefined,
});

Important: Deny-list state is in-memory and non-durable. It can survive across warm runtime requests, but is lost on cold starts/deploys. For persistent blocking, use an external deny list or database-backed blocklist.

Dynamic limits

Dynamic limits let you change rate limits at runtime — useful for feature flags, admin overrides, or gradual rollouts. Enable them with dynamicLimits: true in the constructor.

const limiter = new Ratelimit({
  db: ctx.db,
  prefix: 'api:search',
  limiter: Ratelimit.fixedWindow(100, '1 m'),
  dynamicLimits: true,
});

Then use setDynamicLimit to override the configured limit at runtime:

// Double the limit during a sale
await limiter.setDynamicLimit({ limit: 200 });

// Read the current override
const { dynamicLimit } = await limiter.getDynamicLimit();
// dynamicLimit === 200

// Remove the override (reverts to configured limit)
await limiter.setDynamicLimit({ limit: false });

The dynamic limit overrides the limit field of the algorithm. For token bucket, it overrides refillRate (and maxTokens if they were originally equal).

Limits and mitigations you should know

Important: This is application-layer limiting. It protects business logic and expensive downstream work, but it is not a network firewall or DDoS shield.

Recommended production posture:

  • Enforce auth early and reject fast.
  • Protect anonymous flows with captcha + validated session IDs.
  • Put network-layer controls (Cloudflare or equivalent) in front when IP-based mitigation is required.
  • Alert on request spikes and fail safely (failureMode: "closed" by default).

API Reference

Constructor options

Create a Ratelimit instance with a config object:

const limiter = new Ratelimit(config: RatelimitConfig);
  • db (ctx.db): Convex database context. Required for limit, check, getRemaining, getValue, resetUsedTokens, setDynamicLimit, and getDynamicLimit. Not needed for hookAPI() (it receives db from the query/mutation context).
  • limiter (ResolvedAlgorithm): required. Algorithm created by Ratelimit.fixedWindow(), Ratelimit.slidingWindow(), or Ratelimit.tokenBucket().
  • prefix (string, default '@better-convex/plugins/ratelimit'): namespaces stored state in the database. Use unique prefixes for different rate limit scopes.
  • dynamicLimits (boolean, default false): enables setDynamicLimit() / getDynamicLimit().
  • failureMode ('closed' | 'open', default 'closed'): behavior on timeout. 'closed' rejects, 'open' allows.
  • timeout (number, default 5000): milliseconds before triggering failureMode behavior.
  • enableProtection (boolean, default false): enables deny-list tracking on repeated failures.
  • denyListThreshold (number, default 30): consecutive failures before an identifier is blocked (24h). Requires enableProtection: true.
  • denyList (ProtectionLists, default undefined): static deny lists. See Protection and deny lists.
  • ephemeralCache (Map<string, number> | false, default new Map()): in-memory block cache. Shared across requests in the same Convex invocation. Pass false to disable.

Algorithm builders

All builders are available as static methods on Ratelimit.

Ratelimit.fixedWindow(limit, window, options?)

fixedWindow(limit: number, window: Duration, options?: AlgorithmOptions): FixedWindowAlgorithm
  • limit (number): tokens replenished per window
  • window (Duration): window length (number in ms, or string like '1 m')
  • options.shards (number): write distribution shards (default 1)
  • options.maxReserved (number): max tokens that can be borrowed from future windows
  • options.capacity (number): max stored tokens (default = limit)
  • options.start (number): epoch offset for window alignment

Ratelimit.slidingWindow(limit, window, options?)

slidingWindow(limit: number, window: Duration, options?: AlgorithmOptions): SlidingWindowAlgorithm
  • limit (number): max requests in the sliding window
  • window (Duration): window length
  • options.shards (number): write distribution shards (default 1)
  • options.maxReserved (number): max tokens that can be borrowed

Note: reserve is not supported with sliding window. The algorithm needs both current and previous window counts, which makes reservation impractical.

Ratelimit.tokenBucket(refillRate, interval, maxTokens, options?)

tokenBucket(refillRate: number, interval: Duration, maxTokens: number, options?: AlgorithmOptions): TokenBucketAlgorithm
  • refillRate (number): tokens added per interval
  • interval (Duration): refill interval
  • maxTokens (number): maximum bucket capacity
  • options.shards (number): write distribution shards (default 1)
  • options.maxReserved (number): max tokens that can be borrowed from future refills

Core methods

limit(identifier, options?)

Consume tokens and return a response. This is the primary method for enforcing rate limits.

limit(identifier: string, options?: LimitRequest): Promise<RatelimitResponse>

check(identifier, options?)

Evaluate without consuming tokens. Use this for read-only checks (e.g. showing a warning before the user submits).

check(identifier: string, options?: CheckRequest): Promise<RatelimitResponse>

getRemaining(identifier)

Return the remaining tokens, reset time, and limit for an identifier.

getRemaining(identifier: string): Promise<RemainingResponse>

getValue(identifier, options?)

Return a raw snapshot for custom projections and UI calculations.

getValue(identifier: string, options?: { sampleShards?: number }): Promise<RateLimitSnapshot>

resetUsedTokens(identifier)

Clear all stored state for an identifier. Useful for admin resets.

resetUsedTokens(identifier: string): Promise<void>

setDynamicLimit(options)

Override the configured limit at runtime. Pass { limit: false } to remove the override. Requires dynamicLimits: true.

setDynamicLimit(options: { limit: number | false }): Promise<void>

getDynamicLimit()

Read the current dynamic override. Returns { dynamicLimit: number | null }. Requires dynamicLimits: true.

getDynamicLimit(): Promise<DynamicLimitResponse>

hookAPI(options?)

Export a getRateLimit query and getServerTime mutation for the React hook.

hookAPI(options?: HookAPIOptions): {
  getRateLimit: FunctionReference<'query'>;
  getServerTime: FunctionReference<'mutation'>;
}

Request options

LimitRequest

Pass these options to limit() to customize behavior per-call.

  • rate (number, default 1): alias for count. Tokens to consume.
  • count (number, default 1): tokens to consume. Takes precedence if both rate and count are set.
  • reserve (boolean, default false): allow borrowing from future capacity (up to maxReserved). Not supported by slidingWindow.
  • ip (string): IP address for deny-list matching
  • userAgent (string): user-agent for deny-list matching
  • country (string): country code for deny-list matching
  • geo (unknown): reserved for future geo-based rules
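The rate/count precedence described above can be sketched as (an assumption based on the table; resolveCount is a hypothetical helper, not part of the library):

```typescript
// Illustrative precedence: `count` wins when both are provided; both
// default to 1 when neither is set.
function resolveCount(opts: { rate?: number; count?: number } = {}): number {
  return opts.count ?? opts.rate ?? 1;
}
```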

CheckRequest

Same shape as LimitRequest, except that check() is read-only and never consumes tokens.

Response types

RatelimitResponse

Returned by limit() and check().

  • success (boolean): true if the request was allowed
  • ok (boolean): alias for success (Convex DX parity)
  • limit (number): maximum tokens for this algorithm
  • remaining (number): tokens left after this request (floored to 0)
  • reset (number): epoch ms when tokens will be available
  • pending (Promise<unknown>): resolves when async side-effects complete
  • reason ('timeout' | 'cacheBlock' | 'denyList'): present when a reason applies. Note: failureMode: 'open' can return success: true with reason: 'timeout'.
  • deniedValue (string): present only when reason === 'denyList'. The value that triggered the block.

RemainingResponse

Returned by getRemaining().

  • remaining (number): tokens available
  • reset (number): epoch ms of next replenishment
  • limit (number): maximum tokens

RateLimitSnapshot

Returned by getValue(). Used for custom projections and the React hook.

  • value (number): current token count
  • ts (number): timestamp of last state update
  • shard (number): which shard was read
  • config (ResolvedAlgorithm): full algorithm config for calculateRateLimit()

Hook API

HookAPIOptions

Options for hookAPI().

  • identifier (string | (ctx, fromClient?) => string | Promise<string>): how to resolve the identifier. A string is used directly. A callback receives the Convex context and the optional client-provided identifier.
  • sampleShards (number, default 1): how many shards to sample when reading. Higher = more accurate, more reads.

UseRateLimitOptions

Options for the useRateLimit() React hook.

useRateLimit(
  getRateLimitValueQuery: FunctionReference<'query'> | string,
  options?: UseRateLimitOptions
)
  • identifier (string): passed to the getRateLimit query
  • count (number, default 1): tokens to project for status calculation
  • sampleShards (number): overrides sampleShards from the hook API
  • getServerTimeMutation (FunctionReference | string): enables clock-skew correction between client and server
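Clock-skew correction boils down to a simple offset (an illustrative sketch, not the hook's internals): a server-time sample yields an offset that converts server timestamps such as retryAt into client time.

```typescript
// Illustrative sketch: sample server time once, derive the offset, and use
// it to translate server timestamps into the client's clock.
function clockSkewOffset(serverNow: number, clientNow: number): number {
  return serverNow - clientNow;
}

function toClientTime(serverTs: number, offset: number): number {
  return serverTs - offset;
}
```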

Time constants

Pre-defined millisecond constants exported from better-convex/plugins/ratelimit:

  • SECOND: 1_000
  • MINUTE: 60_000
  • HOUR: 3_600_000
  • DAY: 86_400_000
  • WEEK: 604_800_000

Internal tables

The rate limiter stores state in three Convex tables. These are added only when you enable ratelimitPlugin() in defineSchema — do not define tables with these names yourself.

  • ratelimit_state: per-identifier, per-shard token state
  • ratelimit_dynamic_limit: dynamic limit overrides per prefix
  • ratelimit_protection_hit: protection tracking (hits, blocks) per prefix

Advanced notes

calculateRateLimit

The calculateRateLimit function is exported for custom projections and UI calculations. It takes a state snapshot, algorithm config, current timestamp, and count, and returns the evaluated result without touching the database.

import { calculateRateLimit } from 'better-convex/plugins/ratelimit';

const result = calculateRateLimit(
  { value: 8, ts: Date.now() - 30_000 },
  Ratelimit.fixedWindow(10, '1 m'),
  Date.now(),
  1
);
// result.remaining, result.reset, result.retryAfter

Sharding

When shards > 1, each limit() call picks a random shard (or two, using power-of-two-choices when shards >= 3) to reduce write contention. The trade-off: reads (check, getRemaining, getValue) only sample a subset of shards, so remaining counts are approximate. For most use cases, shards: 1 (the default) is fine. Increase shards only when you see write contention on hot identifiers.
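The selection strategy can be sketched as follows (illustrative; not the library's code):

```typescript
// Illustrative power-of-two-choices shard selection: pick two random
// shards and write to the less-loaded one. With fewer than 3 shards, a
// single random pick is used.
function pickShard(loads: number[], rand: () => number = Math.random): number {
  const n = loads.length;
  if (n < 3) return Math.floor(rand() * n); // single random pick
  const a = Math.floor(rand() * n);
  let b = Math.floor(rand() * n);
  if (b === a) b = (b + 1) % n; // ensure two distinct candidates
  return loads[a] <= loads[b] ? a : b;
}
```

Two random choices are enough to keep load surprisingly balanced, which is why the technique reduces write contention without coordinating across shards.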

Ephemeral cache

The ephemeral block cache is an in-memory Map<string, number> that caches "blocked until" timestamps. When a limit() call fails, subsequent calls for the same identifier skip the database read entirely until the block expires. The cache is per-Ratelimit instance and resets on each Convex function invocation. Pass ephemeralCache: false to disable it, or pass a shared Map across multiple Ratelimit instances to share the cache.
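The cache lookup can be sketched as follows (illustrative): "blocked until" timestamps keyed by identifier let repeat offenders be rejected without touching the database.

```typescript
// Illustrative sketch of the ephemeral block cache check: a hit with an
// unexpired "blocked until" timestamp short-circuits the database read.
function isBlocked(
  cache: Map<string, number>,
  key: string,
  now: number
): boolean {
  const blockedUntil = cache.get(key);
  if (blockedUntil === undefined) return false;
  if (now >= blockedUntil) {
    cache.delete(key); // block expired; clean up the stale entry
    return false;
  }
  return true;
}
```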

ok alias

The response includes both success and ok. They are always identical. ok exists for Convex DX parity with patterns like if (!result.ok) throw ....
