API Reference

CautionLabs API

Send text, get instant moderation scores across twelve harm categories. One endpoint, simple JSON, production-ready.

REST · JSON · HTTPS · https://api.cautionlabs.com/v1

Introduction

The CautionLabs API analyzes user-generated text and returns a confidence score from 0 (safe) to 1 (high confidence of violation) for each moderation category. Use it to filter chat, comments, reviews, and any other text-heavy product surface.

All requests are made over HTTPS. Responses are JSON. There is a single primary endpoint, so there is no versioning sprawl and no graph of micro-endpoints to learn.

Base URL: https://api.cautionlabs.com/v1
Need an API key? Get one from the dashboard.

Quickstart

Make your first moderation request in under a minute.

1. Create a free account and copy your API key from the API Keys page.
2. Replace YOUR_API_KEY in the examples below.
3. Send a POST request to /moderate with a JSON body containing text.

curl -X POST https://api.cautionlabs.com/v1/moderate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello, world!"}'

Authentication

Authenticate every request with your API key in the Authorization header using the Bearer scheme:

Authorization: Bearer YOUR_API_KEY

Keep your key secure

Never expose API keys in client-side code or public repositories.
Call the API from your backend or a trusted proxy.
Rotate compromised keys immediately via Roll API Key in the dashboard.
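
One way to follow the guidance above on a backend is to read the key from the environment instead of hard-coding it. The sketch below is illustrative only: the variable name `CAUTIONLABS_API_KEY` and both helper functions are assumptions, not part of the API.

```python
import os

# Hypothetical environment variable name; the API does not define one.
API_KEY_ENV_VAR = "CAUTIONLABS_API_KEY"


def load_api_key(env=os.environ):
    """Return the API key from the environment, or raise if it is missing."""
    key = env.get(API_KEY_ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"Set {API_KEY_ENV_VAR} before calling the API")
    return key


def auth_headers(api_key):
    """Build the Authorization header using the Bearer scheme."""
    return {"Authorization": f"Bearer {api_key}"}
```

Keeping the key out of source files also keeps it out of version control, and rolling a key then only requires updating the environment.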

Moderate text

Analyze a string of text and receive scores for all twelve moderation categories.

POST /moderate

Request body

Field	Type	Required	Description
`text`	string	Yes	The text content to analyze. UTF-8 encoded.

Example request

curl -X POST https://api.cautionlabs.com/v1/moderate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text": "Your content to analyze"}'
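
The same request can be sketched in Python using only the standard library. The function names and the 10-second timeout are assumptions chosen for illustration. The URL, headers, and body mirror the curl call above.

```python
import json
import urllib.request

BASE_URL = "https://api.cautionlabs.com/v1"


def build_moderation_request(text, api_key):
    """Build (but do not send) a POST /moderate request."""
    if not text:
        # The API documents a 400 for empty text, so fail early on the client.
        raise ValueError("text must be a non-empty string")
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/moderate",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def moderate(text, api_key, timeout=10.0):
    """Send the request and return the parsed JSON response as a dict."""
    request = build_moderation_request(text, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Note that `urllib.request.urlopen` raises `urllib.error.HTTPError` for non-2xx responses, so callers of `moderate` would need to catch it to read an error body.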

Example response

Returns 200 OK with a JSON object. Each key is a category field with a float score between 0 and 1.

{
  "toxic": 0.02,
  "profanity": 0.05,
  "hate": 0.01,
  "harassment": 0.03,
  "self_harm": 0.0,
  "adult": 0.0,
  "violence": 0.01,
  "drugs": 0.0,
  "weapons": 0.0,
  "pii": 0.0,
  "spam": 0.12,
  "minor": 0.0
}
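
A client may want to sanity-check a response like the one above before acting on it. The following is a minimal sketch, assuming the twelve field names shown in the example. `validate_scores` is a hypothetical helper, not something the API provides.

```python
CATEGORIES = (
    "toxic", "profanity", "hate", "harassment", "self_harm", "adult",
    "violence", "drugs", "weapons", "pii", "spam", "minor",
)


def validate_scores(scores):
    """Check that all twelve categories are present with scores in [0, 1]."""
    missing = [c for c in CATEGORIES if c not in scores]
    if missing:
        raise ValueError(f"missing categories: {missing}")
    for category in CATEGORIES:
        value = scores[category]
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not 0 <= value <= 1:
            raise ValueError(f"invalid score for {category}: {value!r}")
    # Return a normalized copy containing only the known categories.
    return {c: float(scores[c]) for c in CATEGORIES}
```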

Headers

Header	Required	Description
`Authorization`	Yes	`Bearer YOUR_API_KEY`
`Content-Type`	Yes	`application/json`

Response fields

Every successful response includes all twelve category scores. Set your own thresholds per category based on your product's risk tolerance.

Field	Category	Description
`toxic`	Toxicity	General toxicity score from 0 (safe) to 1 (highly toxic)
`profanity`	Profanity	Profanity confidence score from 0 (clean) to 1 (highly explicit)
`hate`	Hate Speech	Hate speech score from 0 (no hate) to 1 (strong hate)
`harassment`	Harassment	Harassment score from 0 (safe) to 1 (severe harassment)
`self_harm`	Self-Harm	Self-harm content score from 0 (safe) to 1 (high risk)
`adult`	Adult Content	Adult content score from 0 (safe) to 1 (highly explicit)
`violence`	Violence	Violence score from 0 (non-violent) to 1 (graphic/threatening)
`drugs`	Drug Content	Drug content score from 0 (safe) to 1 (drug-related)
`weapons`	Weapons	Weapons content score from 0 (safe) to 1 (weapons-related)
`pii`	PII Detection	PII presence score from 0 (clean) to 1 (contains PII)
`spam`	Spam	Spam confidence score from 0 (genuine) to 1 (definite spam)
`minor`	Minor Safety	Minor safety risk score from 0 (safe) to 1 (critical risk)

Tip: Scores are continuous, not binary. A score of 0.85 on toxic means high confidence of toxic content; 0.12 is likely safe. Tune cutoffs per surface (e.g. stricter in DMs than in public posts).
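
Per-surface cutoffs like those described in the tip could be expressed as a small lookup table. Every threshold value and surface name below is an illustrative assumption, not a CautionLabs recommendation.

```python
# Fallback cutoff for categories without an override (illustrative value).
DEFAULT_THRESHOLD = 0.8

# Hypothetical per-surface overrides: stricter cutoffs in DMs than in public posts.
THRESHOLDS = {
    "public_post": {"toxic": 0.85, "spam": 0.7},
    "dm": {"toxic": 0.6, "harassment": 0.5, "minor": 0.3},
}


def flagged_categories(scores, surface):
    """Return the sorted categories whose score meets or exceeds the cutoff."""
    overrides = THRESHOLDS.get(surface, {})
    return sorted(
        category
        for category, score in scores.items()
        if score >= overrides.get(category, DEFAULT_THRESHOLD)
    )


scores = {"toxic": 0.7, "spam": 0.12, "harassment": 0.55}
print(flagged_categories(scores, "dm"))           # ['harassment', 'toxic']
print(flagged_categories(scores, "public_post"))  # []
```

The same scores are flagged in a DM but pass in a public post, which is the per-surface tuning the tip describes.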

Errors

Errors use standard HTTP status codes. The response body includes an error object with code and message fields.

{
  "error": {
    "code": "insufficient_credits",
    "message": "Your account has no remaining credits."
  }
}

Status	Code	Description
`400`	`invalid_request`	Missing or malformed request body (e.g. empty `text`).
`401`	`unauthorized`	Missing, invalid, or revoked API key.
`402`	`insufficient_credits`	Account has no remaining credits for moderation requests.
`429`	`rate_limit_exceeded`	Too many requests. Retry after the time indicated in the `Retry-After` response header.
`500`	`internal_error`	Unexpected server error. Retry with exponential backoff.
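
A client might map these error responses onto an exception type that records whether a retry makes sense. In this sketch, `ModerationError` and `parse_error` are hypothetical names. Treating only 429 and 500 as retryable is an assumption drawn from the table above.

```python
import json

# Statuses the table above describes as worth retrying.
RETRYABLE_STATUSES = {429, 500}


class ModerationError(Exception):
    """An error response from the API, with its documented code and message."""

    def __init__(self, status, code, message):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message

    @property
    def retryable(self):
        return self.status in RETRYABLE_STATUSES


def parse_error(status, body):
    """Turn an HTTP status and response body into a ModerationError."""
    try:
        payload = json.loads(body)
    except ValueError:
        payload = None  # body was not valid JSON
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        error = {}
    return ModerationError(
        status,
        error.get("code", "unknown"),
        error.get("message", "No error details in response body"),
    )
```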

Rate limits

Requests are rate-limited per API key to ensure fair usage and platform stability. When you exceed the limit, the API returns 429 Too Many Requests.

Check the Retry-After response header for when to retry.
Implement exponential backoff in your client for transient errors.
High-volume needs? Once you're on a paid plan, contact us for enterprise limits.
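
The retry guidance above could be sketched as follows. This assumes `Retry-After` carries a number of seconds, since the format is not stated here; other values fall back to exponential backoff with jitter. The `send` callable, the attempt limit, and the base and cap delays are all hypothetical.

```python
import random
import time


def retry_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait before retrying; `attempt` is 0-based."""
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # not a plain number of seconds; fall back to backoff
    # Exponential backoff with full jitter, capped at `cap` seconds.
    return random.uniform(0.0, min(cap, base * 2 ** min(attempt, 30)))


def call_with_retries(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it returns a non-retryable status or attempts run out.

    `send` is a hypothetical callable returning (status, headers, body).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status not in (429, 500) or attempt == max_attempts - 1:
            return status, body
        sleep(retry_delay(attempt, headers.get("Retry-After")))
```

Passing `sleep` as a parameter keeps the loop easy to test without real waiting.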

Credits & billing

Each moderation request consumes credits from your account balance. New accounts include a free tier to get started.

Purchase additional credits from the Buy Credits page (minimum $5 USD).
Approximately 3,000 credits per $1 USD.
When credits are exhausted, requests return 402 Payment Required with code insufficient_credits.
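
For rough budgeting, the figures above can be turned into a small estimate. The credit cost of a single request is not stated here, so `credits_per_request` is left as an input the caller must supply. The rate itself is approximate.

```python
CREDITS_PER_USD = 3000      # approximate rate stated above
MINIMUM_PURCHASE_USD = 5    # minimum purchase stated above


def estimated_credits(usd):
    """Approximate credits received for a purchase of `usd` dollars."""
    if usd < MINIMUM_PURCHASE_USD:
        raise ValueError(f"minimum purchase is ${MINIMUM_PURCHASE_USD} USD")
    return int(usd * CREDITS_PER_USD)


def estimated_requests(usd, credits_per_request):
    """Approximate requests covered; `credits_per_request` is an assumption."""
    if credits_per_request <= 0:
        raise ValueError("credits_per_request must be positive")
    return estimated_credits(usd) // credits_per_request


print(estimated_credits(5))  # 15000
```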