JSON Schema Tutorial: Validate Your Data Like a Pro

Unvalidated JSON is one of the most common sources of bugs in APIs and data pipelines. JSON Schema gives you a powerful, language-agnostic way to define exactly what your data should look like - and reject anything that doesn't match. This tutorial covers everything from basic types to advanced composition patterns.

The Problem: JSON Without Validation Is a Liability

Imagine your API receives a user registration payload. You expect email to be a valid email string and age to be a positive integer. But what happens when a client sends "age": "twenty-five" or omits the email field entirely? Without validation, that bad data flows into your database, breaks downstream logic, and causes errors that are painful to debug hours or days later.

Here is the kind of payload that breaks systems every day:

{
  "name": "",
  "email": "not-an-email",
  "age": -5,
  "role": "superadmin"
}

Every field has a problem: empty name, invalid email format, negative age, and an unauthorized role value. A JSON Schema catches all four errors before the data ever touches your business logic.

What Is JSON Schema?

JSON Schema is a vocabulary for annotating and validating JSON documents. It is itself written in JSON. A schema describes the expected shape of data - which fields are required, what types they must be, what constraints apply, and how nested structures should look. The current stable version is Draft 2020-12, though Draft-07 remains widely supported by libraries.

JSON Schema is used for:

  • API contract enforcement - validate request and response bodies
  • Configuration file validation - catch typos in YAML/JSON configs at startup
  • Form validation - drive frontend validation from a single source of truth
  • Code generation - tools like quicktype generate TypeScript types from schemas
  • Documentation - OpenAPI/Swagger uses JSON Schema to document request shapes

Step 1: Your First Schema

A JSON Schema document starts with a $schema declaration and a type. The simplest possible schema accepts any object:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object"
}

This accepts {}, {"anything": true}, and so on. To be useful, you need to add properties and required:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "required": ["name", "email"],
  "properties": {
    "name": {
      "type": "string",
      "minLength": 1,
      "maxLength": 100
    },
    "email": {
      "type": "string",
      "format": "email"
    },
    "age": {
      "type": "integer",
      "minimum": 0,
      "maximum": 150
    }
  },
  "additionalProperties": false
}

Key points: required is an array of field names that must be present. properties defines constraints per field. additionalProperties: false rejects any key not listed in properties - useful for strict APIs.
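To make those three keywords concrete, here is a minimal hand-rolled sketch in plain JavaScript of what a validator checks for required and additionalProperties: false. The name checkObjectShape is hypothetical, the sketch ignores everything else (types, patternProperties, nesting), and it is not how any real library is implemented.

```javascript
// Hypothetical helper: illustrates `required` and `additionalProperties: false` only.
function checkObjectShape(data, schema) {
  const errors = [];
  const allowed = Object.keys(schema.properties || {});

  // `required`: every listed key must be present (its value is checked separately).
  for (const key of schema.required || []) {
    if (!Object.prototype.hasOwnProperty.call(data, key)) {
      errors.push(`missing required property "${key}"`);
    }
  }

  // `additionalProperties: false`: keys outside `properties` are rejected.
  if (schema.additionalProperties === false) {
    for (const key of Object.keys(data)) {
      if (!allowed.includes(key)) {
        errors.push(`unexpected property "${key}"`);
      }
    }
  }
  return errors;
}

const shape = {
  required: ['name', 'email'],
  properties: { name: {}, email: {}, age: {} },
  additionalProperties: false
};

// Reports the missing "email" and the unexpected "role".
console.log(checkObjectShape({ name: 'Alice', role: 'superadmin' }, shape));
```

Note that age is listed under properties but not under required, so leaving it out produces no error.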

Step 2: All Primitive Types

The type keyword accepts seven values (the six JSON data types plus integer):

  • "type": "string" - any JSON string
  • "type": "number" - integer or float
  • "type": "integer" - whole numbers only
  • "type": "boolean" - true or false
  • "type": "null" - the JSON null value
  • "type": "array" - a JSON array
  • "type": "object" - a JSON object

You can allow multiple types using an array: "type": ["string", "null"] is the standard pattern for a nullable field.
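If you are coming from JavaScript, the type names do not line up with typeof. The sketch below is one way to express the mapping, assuming a hypothetical helper called matchesType; it is an illustration of the rules, not code taken from a validator.

```javascript
// Hypothetical helper: maps a JSON Schema type name to a JavaScript check.
function matchesType(value, type) {
  // An array of type names means "any one of these".
  if (Array.isArray(type)) return type.some((t) => matchesType(value, t));

  switch (type) {
    case 'null':    return value === null;
    case 'boolean': return typeof value === 'boolean';
    case 'string':  return typeof value === 'string';
    case 'number':  return typeof value === 'number' && Number.isFinite(value);
    case 'integer': return Number.isInteger(value);
    case 'array':   return Array.isArray(value);
    // typeof null and typeof [] are both 'object' in JavaScript, so exclude them.
    case 'object':  return typeof value === 'object' && value !== null && !Array.isArray(value);
    default:        return false;
  }
}

console.log(matchesType(null, ['string', 'null'])); // true
console.log(matchesType(1.5, 'integer'));           // false
console.log(matchesType(2.0, 'integer'));           // true
console.log(matchesType([], 'object'));             // false
```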

Step 3: String Constraints

Strings support several constraints beyond just minLength and maxLength:

{
  "type": "string",
  "minLength": 3,
  "maxLength": 50,
  "pattern": "^[a-zA-Z0-9_-]+$",
  "format": "email"
}

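The pattern keyword takes a regular expression (ECMA-262 dialect). Patterns are not implicitly anchored, so include ^ and $ when the whole string must match. The format keyword is an annotation by default - validators may or may not enforce it depending on configuration. Common values include "email", "uri", "date", "date-time", "uuid", and "ipv4". The example above combines all four keywords only to show the syntax: no email address can match that pattern (it does not allow @ or .), so in a real schema use either the pattern or "format": "email", not both.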
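Two details of string validation are easy to get wrong: length is counted in characters rather than UTF-16 code units, and pattern is a search rather than a full match. Here is a small sketch of both, using a hypothetical checkString helper written for this article.

```javascript
// Hypothetical helper: checks minLength, maxLength, and pattern for one string.
function checkString(value, { minLength = 0, maxLength = Infinity, pattern } = {}) {
  // JSON Schema counts characters (code points), not UTF-16 code units,
  // so spread the string instead of using value.length.
  const length = [...value].length;
  if (length < minLength || length > maxLength) return false;

  // `pattern` is a search, not a full match: without ^ and $ it can match anywhere.
  if (pattern !== undefined && !new RegExp(pattern).test(value)) return false;
  return true;
}

console.log(checkString('user_01', { minLength: 3, maxLength: 50, pattern: '^[a-zA-Z0-9_-]+$' })); // true
console.log(checkString('bad name!', { pattern: '[a-z]+' }));   // true: unanchored, matches "bad"
console.log(checkString('bad name!', { pattern: '^[a-z]+$' })); // false
console.log(checkString('ab', { minLength: 3 }));               // false
```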

Step 4: Enums and Const

To restrict a field to a fixed set of values, use enum:

{
  "type": "string",
  "enum": ["active", "inactive", "suspended", "deleted"]
}

For a single allowed value, use const:

{
  "properties": {
    "apiVersion": { "const": "v2" }
  }
}

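This is useful for versioned API payloads where you want to assert a specific version string. Note that const under properties only constrains the value when the property is present; also list apiVersion in required if the field must be there.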

Step 5: Arrays

Array schemas use items to define the type of each element:

{
  "type": "array",
  "items": {
    "type": "string"
  },
  "minItems": 1,
  "maxItems": 10,
  "uniqueItems": true
}

This schema accepts an array of 1–10 unique strings. For tuple validation (fixed-length arrays with different types per position), use prefixItems in Draft 2020-12:

{
  "type": "array",
  "prefixItems": [
    { "type": "string" },
    { "type": "number" },
    { "type": "boolean" }
  ],
  "items": false
}

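Setting "items": false disallows additional elements beyond the defined prefix, so the array can have at most 3 items. prefixItems does not require every position to be present: [] and ["a"] are still valid. Add "minItems": 3 if the array must have exactly 3 items.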
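As a rough illustration of how prefixItems and "items": false interact, here is a hand-rolled sketch. The checkTuple helper and its predicate functions are hypothetical stand-ins for real sub-schemas, not a library API.

```javascript
// Hypothetical helper: `prefixChecks` plays the role of prefixItems, and
// `allowExtra = false` plays the role of "items": false.
function checkTuple(value, prefixChecks, allowExtra) {
  if (!Array.isArray(value)) return false;
  if (!allowExtra && value.length > prefixChecks.length) return false;

  // Only positions that actually exist are checked against their prefix schema.
  return value.every((item, i) => i >= prefixChecks.length || prefixChecks[i](item));
}

const prefix = [
  (v) => typeof v === 'string',
  (v) => typeof v === 'number',
  (v) => typeof v === 'boolean'
];

console.log(checkTuple(['a', 1, true], prefix, false));      // true
console.log(checkTuple(['a', 1, true, 'x'], prefix, false)); // false: extra element
console.log(checkTuple(['a'], prefix, false));               // true: shorter arrays pass
console.log(checkTuple([1, 'a', true], prefix, false));      // false: wrong types
```

The third call is the one to remember: a minimum length has to be stated separately with minItems.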

Step 6: Nested Objects

Real-world data is nested. Here is a schema for an order with embedded address and line items:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "required": ["orderId", "customer", "items"],
  "properties": {
    "orderId": { "type": "string", "format": "uuid" },
    "customer": {
      "type": "object",
      "required": ["name", "email"],
      "properties": {
        "name": { "type": "string", "minLength": 1 },
        "email": { "type": "string", "format": "email" }
      }
    },
    "items": {
      "type": "array",
      "minItems": 1,
      "items": {
        "type": "object",
        "required": ["sku", "quantity", "price"],
        "properties": {
          "sku": { "type": "string" },
          "quantity": { "type": "integer", "minimum": 1 },
          "price": { "type": "number", "exclusiveMinimum": 0 }
        }
      }
    },
    "discount": { "type": ["number", "null"], "minimum": 0, "maximum": 100 }
  }
}

Validate JSON Schema Instantly

Paste your JSON and schema into our free validator. Errors are highlighted in real time. 100% client side - no data leaves your browser.

Open JSON Schema Validator

Step 7: Reuse with $ref and $defs

Duplicating sub-schemas across a large document is a maintenance problem. Use $defs to define reusable schemas and $ref to reference them:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$defs": {
    "address": {
      "type": "object",
      "required": ["street", "city", "country"],
      "properties": {
        "street": { "type": "string" },
        "city": { "type": "string" },
        "country": { "type": "string", "minLength": 2, "maxLength": 2 }
      }
    }
  },
  "type": "object",
  "properties": {
    "billingAddress": { "$ref": "#/$defs/address" },
    "shippingAddress": { "$ref": "#/$defs/address" }
  }
}

The $ref value is a URI reference; in this example its fragment (the part after #) is a JSON Pointer. #/$defs/address means "look up address inside $defs at the root of this document." This eliminates duplication and keeps your schema DRY.
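To show what that lookup involves, here is a minimal sketch of resolving a same-document reference. The resolveLocalRef function is hypothetical and deliberately limited: it handles only references of the form #/..., not $id, $anchor, or references to other files, which real validators also support.

```javascript
// Hypothetical helper: resolves a same-document reference such as "#/$defs/address".
function resolveLocalRef(root, ref) {
  if (!ref.startsWith('#')) {
    throw new Error('this sketch only handles same-document references');
  }
  const pointer = decodeURIComponent(ref.slice(1));
  if (pointer === '') return root; // "#" refers to the whole document
  if (!pointer.startsWith('/')) {
    throw new Error('this sketch only handles JSON Pointer fragments');
  }

  let node = root;
  for (const rawToken of pointer.split('/').slice(1)) {
    // JSON Pointer escapes: ~1 means "/", ~0 means "~" (decode ~1 first).
    const token = rawToken.replace(/~1/g, '/').replace(/~0/g, '~');
    if (node === null || typeof node !== 'object' || !(token in node)) {
      throw new Error(`cannot resolve ${ref}`);
    }
    node = node[token];
  }
  return node;
}

const schemaDoc = { $defs: { address: { type: 'object' } } };
console.log(resolveLocalRef(schemaDoc, '#/$defs/address')); // { type: 'object' }
```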

Step 8: Composition Keywords

JSON Schema has four composition keywords for combining schemas:

  • allOf - data must be valid against all listed schemas (intersection)
  • anyOf - data must be valid against at least one schema (union)
  • oneOf - data must be valid against exactly one schema (exclusive union)
  • not - data must NOT be valid against the given schema

For example, oneOf can model a payment that is either a card or a bank transfer:

{
  "oneOf": [
    {
      "type": "object",
      "required": ["type", "cardNumber"],
      "properties": {
        "type": { "const": "card" },
        "cardNumber": { "type": "string", "pattern": "^[0-9]{16}$" }
      }
    },
    {
      "type": "object",
      "required": ["type", "iban"],
      "properties": {
        "type": { "const": "bank" },
        "iban": { "type": "string" }
      }
    }
  ]
}

This schema accepts either a card payment or a bank payment object, but not both simultaneously - a classic discriminated union pattern.
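The difference between anyOf and oneOf only shows up when sub-schemas overlap. In the payment example the two const values make the branches mutually exclusive, so the sketch below uses two overlapping rules instead. Each "schema" is reduced to a plain predicate function, and all names are hypothetical.

```javascript
// Hypothetical helpers: each "schema" here is just a predicate function.
const countMatches = (value, checks) => checks.filter((check) => check(value)).length;

const allOf = (value, checks) => countMatches(value, checks) === checks.length;
const anyOf = (value, checks) => countMatches(value, checks) >= 1;
const oneOf = (value, checks) => countMatches(value, checks) === 1;

// Two overlapping rules: 15 satisfies both of them.
const multipleOf3 = (n) => n % 3 === 0;
const multipleOf5 = (n) => n % 5 === 0;
const rules = [multipleOf3, multipleOf5];

console.log(anyOf(15, rules)); // true
console.log(oneOf(15, rules)); // false: both rules match
console.log(oneOf(9, rules));  // true: only multipleOf3 matches
console.log(allOf(9, rules));  // false
```

Because oneOf has to evaluate every branch to rule out a second match, anyOf is often the simpler choice when the branches cannot overlap anyway.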

Step 9: Running Validation in Code

Every major language has a JSON Schema library. Here are the most widely used:

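  • JavaScript/Node.js: Ajv - a fast validator that compiles schemas to functions; supports Draft-07 by default, with Draft 2019-09 and 2020-12 available through separate entry points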
  • Python: jsonschema library via pip
  • Java: networknt/json-schema-validator
  • Go: qri-io/jsonschema
  • PHP: opis/json-schema

Example with Ajv in Node.js:

const Ajv = require('ajv');
const addFormats = require('ajv-formats');

const ajv = new Ajv({ allErrors: true });
addFormats(ajv); // adds 'email', 'date-time', 'uri', etc.

const schema = {
  type: 'object',
  required: ['name', 'email'],
  properties: {
    name: { type: 'string', minLength: 1 },
    email: { type: 'string', format: 'email' },
    age: { type: 'integer', minimum: 0 }
  },
  additionalProperties: false
};

const validate = ajv.compile(schema);
const data = { name: 'Alice', email: 'alice@example.com', age: 30 };

if (!validate(data)) {
  console.error(validate.errors);
} else {
  console.log('Valid!');
}

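Setting allErrors: true makes Ajv report all validation errors instead of stopping at the first one - important for building good user-facing error messages. Note that the default ajv export targets Draft-07: for a schema that declares "$schema": "https://json-schema.org/draft/2020-12/schema" or uses prefixItems, load the 2020 build instead with require('ajv/dist/2020').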
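Raw error objects are rarely what you want to show a user. Below is a small sketch of turning them into readable lines, assuming the Ajv v8 error shape (an instancePath and a message per error). The formatErrors helper is hypothetical, and the sample errors are hand-written for illustration rather than captured from a real run.

```javascript
// Hypothetical helper: turns Ajv-style error objects into short messages.
function formatErrors(errors) {
  // validate.errors is null after a successful validation, so default to [].
  return (errors || []).map((err) => {
    // An empty instancePath means the error is about the root value.
    const where = err.instancePath === '' ? '(root)' : err.instancePath;
    return `${where}: ${err.message}`;
  });
}

// Hand-written sample in the shape Ajv v8 reports; not captured from a real run.
const sampleErrors = [
  { instancePath: '', keyword: 'required', message: "must have required property 'email'" },
  { instancePath: '/age', keyword: 'minimum', message: 'must be >= 0' }
];

console.log(formatErrors(sampleErrors));
```

In the Ajv example, you would pass validate.errors to this function in place of sampleErrors.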

Step 10: Integrating with OpenAPI

If you document your API with OpenAPI 3.x, JSON Schema is already built in. In OpenAPI 3.1, every Schema Object is a JSON Schema Draft 2020-12 schema (plus a few OpenAPI-specific keywords). OpenAPI 3.0 uses an extended subset of an earlier draft, so some keywords differ: for example, nullable: true instead of "type": ["string", "null"]. Either way, the same schemas that power your runtime checks can also drive your API documentation, client code generation, and mock server responses.

paths:
  /users:
    post:
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [name, email]
              properties:
                name:
                  type: string
                  minLength: 1
                email:
                  type: string
                  format: email

Tip: define your schemas in a separate components/schemas section and reference them with $ref: '#/components/schemas/User'. This keeps your OpenAPI spec DRY and makes generated client SDKs cleaner.

Common JSON Schema Mistakes

  • Missing required: properties listed under properties are optional by default. You must explicitly list required fields in the required array.
  • Forgetting additionalProperties: false: by default, JSON Schema allows extra keys. If you want strict validation, set this explicitly.
  • Using format without a format validator: "format": "email" is just an annotation unless your library is configured to enforce it (e.g., Ajv needs ajv-formats).
  • Confusing number and integer: number accepts 1.5; integer does not. Use integer for counts, IDs, and quantities.
  • Deeply nested schemas without $ref: large inline schemas become unreadable. Extract common shapes into $defs early.

Frequently Asked Questions

What is the difference between JSON Schema Draft-07 and Draft 2020-12?

The most visible changes are the renaming of definitions to $defs and the move of tuple validation from items to prefixItems. Compared with Draft-07, Draft 2020-12 also offers unevaluatedProperties and unevaluatedItems for tighter composition ($defs and the unevaluated keywords first appeared in the intermediate Draft 2019-09). Draft-07 is still widely supported; check your library's documentation for which drafts it supports before upgrading.

Is JSON Schema the same as TypeScript types?

No, but they serve similar purposes at different layers. TypeScript types are compile-time constructs that disappear at runtime. JSON Schema operates at runtime and validates actual data. Tools like quicktype and json-schema-to-typescript can generate TypeScript interfaces from JSON Schemas, giving you both compile-time and runtime safety from a single source of truth.

Can JSON Schema validate data inside a database?

MongoDB supports JSON Schema natively via collection validators (the $jsonSchema operator, which is based on Draft 4, so newer keywords are not available there). PostgreSQL has no built-in JSON Schema support, but you can validate JSON columns using PLV8 (JavaScript inside Postgres) or application-level validation before insert. Most teams validate at the API boundary and rely on database constraints for structural integrity.

How do I validate a JSON Schema document itself?

JSON Schema is self-describing: the meta-schema for Draft 2020-12 is published at https://json-schema.org/draft/2020-12/schema. You can validate your schema against this meta-schema using the same tools you use for regular validation. Ajv supports meta-schema validation with ajv.validateSchema(schema).

What is the performance impact of JSON Schema validation?

With a compiled validator like Ajv, validation overhead is usually under 1 ms for typical API payloads. Ajv compiles each schema to an optimized JavaScript function on first use and caches it. The main cost is the one-time compilation step, which is why you should call ajv.compile(schema) once at startup rather than on every request.
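One way to guarantee that compilation happens once is to keep compiled validators in a small cache keyed by name. The sketch below assumes a hypothetical createValidatorCache helper and uses a stand-in compile function so that it runs without any dependencies; in a real service you would pass something like (schema) => ajv.compile(schema) instead.

```javascript
// Hypothetical cache: `compile` stands in for a real compiler such as ajv.compile.
function createValidatorCache(compile) {
  const cache = new Map();
  return function getValidator(name, schema) {
    if (!cache.has(name)) {
      cache.set(name, compile(schema)); // pay the compilation cost once per name
    }
    return cache.get(name);
  };
}

// Stand-in compiler that counts its calls so the caching is visible.
let compileCalls = 0;
const fakeCompile = (schema) => {
  compileCalls += 1;
  return (data) => typeof data === schema.type;
};

const getValidator = createValidatorCache(fakeCompile);
getValidator('name', { type: 'string' })('Alice');
getValidator('name', { type: 'string' })('Bob');
console.log(compileCalls); // 1
```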

Can I use JSON Schema to validate environment variables or config files?

Yes. YAML and TOML configuration files can be parsed to JSON and validated against a schema at application startup. This catches misconfiguration early - before your app tries to connect to a database with a missing password. Libraries like env-schema (Node.js) wrap Ajv specifically for environment variable validation.

The Bottom Line

JSON Schema is one of the highest-ROI tools in a developer's toolkit. A few dozen lines of schema definition can prevent entire classes of bugs, improve API documentation, and make integration testing more reliable. Start with the basics - type, required, and properties - and add constraints incrementally as you understand your data better.

For more developer tools, explore our complete tools collection - all free, all running in your browser with no data sent to any server.

Use our free JSON Schema Validator

Paste any JSON document and schema. Validation errors are shown inline with clear explanations. Free, instant, no signup required.

Use our free tool here →
Written by Usman Khan
DevOps Engineer | MSc Cybersecurity | CEH | AWS Solutions Architect

Usman has 10+ years of experience securing enterprise infrastructure, managing high-traffic servers, and building zero-knowledge security tools.