Base Schemas

At their core, schemas are simply a collection of data processing rules. The simplest possible schema has no rules defined, and transparently passes its input through to its output: new Schema(). This empty schema is unopinionated and does nothing, so it's not particularly useful except as a building block.

To create a specialized schema with more interesting behavior, you could add data processing rules to an empty schema. This is not complicated, but would be tedious and repetitive for schemas defining hierarchies of properties that share many common rules. For this reason, you can specify a base schema to act as a template by passing either a schema instance or a registered schema name to the constructor: new Schema(base).

Schema Registry

The SchemaResolver provides a schema registry, used to resolve named base schemas during compilation. It is prepopulated with a small set of named schemas corresponding to fundamental "types".

In general, the prebuilt schemas embrace the philosophy of Postel's Law. Their normalization handlers are accommodating, and accept values of the schema's defined type plus any unambiguous format that enhances convenience (e.g. the stringified representation of the type's values), whereas their validation handlers strictly enforce the actual type. Additional per-schema notes follow:

string

new Schema('string')

Converts almost anything to a string. The string representations are intended to be complementary to the other prebuilt schemas, such that the other schemas can always successfully accept the format. Complex objects are stringified using $string, which produces an expanded JSON syntax that supports circular references and extended types like dates and bigints.

number

new Schema('number')

Normalizes numbers and stringified numbers; outputs and validates number values. Values must not be infinite or NaN.
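The split between lenient normalization and strict validation can be illustrated with a small standalone sketch. This is not the library's $number or $is-number code; the function names normalizeNumber and isValidNumber are hypothetical, and the handling of whitespace and unparseable strings is an assumption made for the example.

```javascript
// Standalone illustration only; NOT the library's $number / $is-number code.

// Lenient normalizer: accepts numbers and stringified numbers.
function normalizeNumber(input) {
  if (typeof input === 'string' && input.trim() !== '') {
    const parsed = Number(input);
    // Assumption: unparseable strings are left untouched so validation can reject them.
    return Number.isNaN(parsed) ? input : parsed;
  }
  return input;
}

// Strict validator: only finite, non-NaN numbers pass.
function isValidNumber(value) {
  return typeof value === 'number' && Number.isFinite(value);
}

const numberSamples = [42, '42', ' 3.5 ', 'abc', Infinity, NaN];
for (const sample of numberSamples) {
  const normalized = normalizeNumber(sample);
  console.log(sample, '->', normalized, 'valid:', isValidNumber(normalized));
}
```

In this sketch the first three samples end up valid, while 'abc', Infinity, and NaN are rejected by the validator even though the normalizer accepted them as input.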

boolean

new Schema('boolean')

Normalizes inputs based on their truthiness (note that this library uses a broader definition than the JavaScript standard) but outputs and validates actual boolean values.

object

new Schema('object')

// or more commonly, used with child property schemas,
new Schema('object')
.property('child1', new Schema(/* ... */))
.property('child2', new Schema(/* ... */))
// ...

// rarely,
new Schema('object')
.property('child', new Schema(/*... */)) // known child name
.property('*', new Schema('*')) // unknown child name

The object schema is set up to handle child properties, and normalizes the empty container signal as {}. It also accepts the expanded JSON syntax produced when $string processes an object, and supports wildcard child property schemas to match unknown keys. Unknown input properties can be accepted and pruned by using a wildcard with a $null normalizer.

An object schema without any properties treats its entire input as a monolithic value.

Any structured data returned from an object schema normalizer (potentially from the default) will be used as the input for its children. Avoid combining explicit child property assignments with "bulk assignments" to the container:

import { Schema, SchemaResolver } from '@versionzero/schema';
const resolver = new SchemaResolver();

const schema = await resolver.compile(
new Schema('object')
.deep()
.default( {child: 123} ) // bulk assignment!
.property('child', new Schema('number').default(456))
);
// The results are subtle, so avoid depending on this behavior if possible:
console.log( await schema.process({child: 789}) );
// -> { child: 789 } (explicitly set property wins over parent assignment)
console.log( await schema.process({}) );
// -> { child: 456 } (explicit empty object overrides the parent default, but undefined child uses the default)
console.log( await schema.process() );
// -> { child: 123 } (no parent assignment, so parent default of bulk assignment wins)

See the properties section for more information on building schema hierarchies.

array

new Schema('array')
// or more commonly either with a typed wildcard child:
new Schema('array')
.property('*', new Schema(/* whatever... */))
// or occasionally, defined as an explicit tuple:
new Schema('array')
.property('0', new Schema(/* first member */))
.property('1', new Schema(/* second member */))
// ...

The array schema behaves analogously to object, except that it additionally accepts simple comma-separated strings. It is commonly used with wildcard properties, allowing arbitrary-length arrays with typed values. It also normalizes a number into an empty array of that length, and if the array has a wildcard child property schema, the parent array normalizes the special character "*" as the values option of the wildcard child. The notes about bulk parent assignments also apply here.

See the properties section for more information on building schema hierarchies.
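As a rough picture of the convenience inputs described for array, the following standalone sketch handles only the comma-separated string and number cases. It is not the library's normalizer: the function name normalizeArrayInput is hypothetical, trimming of members is an assumption, and the "*" shorthand and expanded JSON syntax are deliberately left out.

```javascript
// Standalone illustration only; NOT the library's array normalizer.
function normalizeArrayInput(input) {
  // Real arrays pass through unchanged.
  if (Array.isArray(input)) return input;

  // A non-negative integer becomes an empty array of that length.
  if (typeof input === 'number' && Number.isInteger(input) && input >= 0) {
    return new Array(input);
  }

  // A simple comma-separated string becomes an array of its members.
  // Assumption: surrounding whitespace is trimmed from each member.
  if (typeof input === 'string') {
    return input.split(',').map((member) => member.trim());
  }

  // Anything else is left for validation to reject.
  return input;
}

console.log(normalizeArrayInput('a, b,c'));
console.log(normalizeArrayInput(3).length);
```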

date

new Schema('date')

Produces and validates Date instances. Uses the $date processor for normalization, which accepts multiple input formats and convenience syntaxes (e.g. "now" or "+365d"). See the $date documentation for details.

buffer

new Schema('buffer')

The most common use is to normalize base64 input strings, but other formats and encodings are supported.
See the $buffer documentation for details on other supported inputs.

function

new Schema('function')

By default, simple (non-class-constructor) functions are treated as dynamic values and invoked using the same signature as value processor functions. To pass functions as values, the function schema disables the dynamic schema option.

any

new Schema('any')

The any schema is essentially a pass-through just like new Schema(), but adds support for normalizing the empty container signal, using heuristics to determine whether to return {} or [].

literal

Schema.literal(value);

Not a base schema in the same sense as the others. Literals are created using a special static factory function rather than the constructor. Whatever value is provided will be:

  • used as the default
  • returned from the transform
  • validated as the only legal value

Literal schemas should not have child properties or be constructed as a union. See the API docs for details and the $literal processor for the pipeline variant.
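The three roles a literal value plays can be pictured with a standalone sketch. The helper makeLiteralRules is hypothetical and is not how Schema.literal is implemented; in particular, comparing with Object.is is an assumption made for the example.

```javascript
// Hypothetical helper for illustration; NOT the library's Schema.literal.
function makeLiteralRules(literal) {
  return {
    // Used when no input is supplied.
    default: literal,
    // Ignores whatever it is given and always produces the literal.
    transform: () => literal,
    // Assumption: equality is checked with Object.is.
    validate: (value) => Object.is(value, literal),
  };
}

const literalRules = makeLiteralRules('v1');
console.log(literalRules.default);               // v1
console.log(literalRules.transform('anything')); // v1
console.log(literalRules.validate('v1'));        // true
console.log(literalRules.validate('v2'));        // false
```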

inherit

Schema.inherit(propertyName)

Another special factory function. It creates a schema with a transform that generates the same value as whatever is in the output at the referenced propertyName in a parent schema. If propertyName is omitted, it uses the active child name where the current schema was found during transformation.

import { Schema, SchemaResolver } from '@versionzero/schema';
const resolver = new SchemaResolver();

const schema = await resolver.compile(
new Schema('object').deep() // the "deep" option ensures population of children even without assignments
.property('a', new Schema('number'))
.property('b', new Schema('number'))
.property('child', new Schema('object').deep()
.property('a', Schema.inherit()) // default is to just look for same name as current property
.property('ai', Schema.inherit('a'))
.property('c', new Schema('number'))
)
);

console.log( await schema.process( {a: 123, b: 456}) );
// -> { a: 123, b: 456, child: { a: 123, ai: 123 } }

reference

Schema.reference(path, absolute)

Like inherit, but uses a path rather than property name.

By default (absolute=true), paths are interpreted from the root. With absolute=false, meta-characters in the path allow navigation around the hierarchy. Uses the same path semantics as the $reference keyword.

import { Schema, SchemaResolver } from '@versionzero/schema';
const resolver = new SchemaResolver();

const schema = await resolver.compile(
new Schema('object').deep()
.property('a', new Schema('number'))
.property('child1', new Schema('object').deep()
.property('b', new Schema('number'))
)
.property('child2', new Schema('object').deep()
.property('ar', Schema.reference('a')) // absolute from root
.property('br', Schema.reference('^^child1.b', false)) // absolute=false: use relative path
)
);

console.log( await schema.process( {a: 123, child1:{b: 456}}) );
// -> { a: 123, child1: { b: 456 }, child2: { ar: 123, br: 456 } }


No Hard-Coded Schema Types!

These schemas are not "special" in any way. They are just a convenience, and are built using the public API. This is the entire actual definition of number:

import { Schema } from '@versionzero/schema';

export const NUMBER_SCHEMA = new Schema()
.option('type', 'number')
.meta('valueName', 'number')
.normalizer('$number')
.validator('$is-number');

($number and $is-number are value processor functions registered as keywords in SchemaResolver.)

Schema Inheritance

Any custom schema can be used as a base, or registered with SchemaResolver for reuse by name.

import { Schema, SchemaResolver } from '@versionzero/schema';
const schemaResolver = new SchemaResolver();

// Inherit and extend the built-in 'string' base schema for email validation
const emailSchema = new Schema('string')
.validator('$email')
.meta('description', 'Email address');

// You can inherit by reference and extend it...
const selectiveEmailSchema = new Schema(emailSchema)
.validator(/^(?!.*@aol\.com$).+/); // disallow aol addresses

// register it for reuse...
schemaResolver.registerSchema('SelectiveEmail', selectiveEmailSchema);
// reference by name, and continue extending...
const myEmailSchema = new Schema('SelectiveEmail')
.validator(/^(?!.*@gmail\.com$).+/); // also disallow gmail

The base parameter to new Schema(base) can be:

  • A Schema, a CompiledSchema, or compatible "schema-shaped" raw data.
  • A string name of a registered schema (e.g., 'string', 'number')

Compilation of Schema Hierarchies

The base schema is not resolved until the Schema is compiled into a CompiledSchema. During compilation, the SchemaResolver.resolve() call walks the Schema and its bases (and properties and unions, and their bases), flattening everything into the CompiledSchema used for runtime processing and validation. Late resolution of base names means that Schema definitions don't need to depend on the implementation details of what they reference. This is useful for encoding values as dependency-injected references (e.g. .property('logger', new Schema('Logger'))).

By default, an unresolved base schema throws an error at compilation time. This can be overridden by setting the lax option on a schema, which makes an unresolved base an error only if it is actually referenced at runtime. This is useful if a "partially resolved" schema might be shared between different applications, and a condition is used to block irrelevant references.
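The idea of late resolution and flattening can be sketched with a toy model in plain JavaScript. Everything here (ToyRegistry, its register and compile methods, the rule strings, and the way the lax flag merely records unresolved names) is hypothetical and greatly simplified; it does not reflect how SchemaResolver works internally.

```javascript
// Toy model only; ToyRegistry and its methods are hypothetical names.
class ToyRegistry {
  constructor() {
    this.schemas = new Map();
  }

  register(name, definition) {
    this.schemas.set(name, definition);
  }

  // Walk a definition and its chain of bases, flattening into one rule list.
  compile(definition, { lax = false } = {}) {
    const ruleGroups = [];
    const unresolved = [];
    const seen = new Set();
    let current = definition;
    while (current) {
      if (seen.has(current)) throw new Error('Circular base chain');
      seen.add(current);
      ruleGroups.unshift(current.rules); // base rules come before derived rules
      let base = current.base;
      if (typeof base === 'string') {
        const name = base;
        base = this.schemas.get(name); // names are looked up only now
        if (!base) {
          if (!lax) throw new Error(`Unresolved base schema: ${name}`);
          unresolved.push(name); // lax: defer the problem instead of throwing
        }
      }
      current = base;
    }
    return { rules: ruleGroups.flat(), unresolved };
  }
}

const toyRegistry = new ToyRegistry();

// The definition refers to 'string' by name before that name is registered.
const toyEmail = { base: 'string', rules: ['validate:email'] };
toyRegistry.register('string', { rules: ['normalize:string', 'validate:is-string'] });

console.log(toyRegistry.compile(toyEmail).rules);
// -> [ 'normalize:string', 'validate:is-string', 'validate:email' ]

const toyLogger = { base: 'Logger', rules: [] };
console.log(toyRegistry.compile(toyLogger, { lax: true }).unresolved);
// -> [ 'Logger' ]
```

Because the name is looked up only inside compile, the definition can be written before (or without knowledge of) whatever is eventually registered under that name. Compiling toyLogger without the lax flag throws in this toy model.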