Base Schemas
At their core, schemas are simply a collection of data processing rules. The simplest possible schema
has no rules defined, and transparently passes its input through to its output: new Schema(). This empty schema is
unopinionated and does nothing, so it's not particularly useful except as a building block.
To create a specialized schema with more interesting behavior, you could add data processing rules
to an empty schema. This is not complicated, but would be tedious and repetitive for schemas defining
hierarchies of properties that share many common rules. For this reason, you can specify a base schema
to act as a template by passing either a schema instance or a registered schema name to the constructor: new Schema(base).
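As a rough mental model (not the library's actual implementation), a schema can be pictured as an ordered list of rule functions, with a base contributing its rules first. The names `makeSchema` and `run` below are hypothetical and exist only for this sketch.

```javascript
// Hypothetical mental model, NOT the @versionzero/schema implementation:
// a "schema" is an ordered list of rule functions applied to the input.
function makeSchema(base = null, rules = []) {
  // A base acts as a template: its rules run before the rules added here.
  const inherited = base ? base.rules : [];
  return { rules: [...inherited, ...rules] };
}

function run(schema, input) {
  // With no rules, reduce returns the input untouched (pass-through).
  return schema.rules.reduce((value, rule) => rule(value), input);
}

const empty = makeSchema();
const trimmed = makeSchema(empty, [(v) => String(v).trim()]);
const shouting = makeSchema(trimmed, [(v) => v.toUpperCase()]);

console.log(run(empty, '  hi  '));    // -> '  hi  ' (unchanged)
console.log(run(shouting, '  hi  ')); // -> 'HI'
```

Each derived schema only states what it adds, which is the repetition a base schema is meant to remove.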
Schema Registry
The SchemaResolver provides a schema registry, used to resolve named base schemas during compilation.
It is prepopulated with a small set of named schemas corresponding to fundamental "types".
In general, the prebuilt schemas embrace Postel's Law. Their normalization handlers are accommodating: each accepts values of the schema's defined type plus any unambiguous format that adds convenience (e.g. the stringified representation of those values). Their validation handlers, by contrast, strictly enforce the actual type. Additional per-schema notes follow:
string
new Schema('string')
Converts almost anything to a string. The string representations are intended to be complementary to
the other prebuilt schemas, such that the other schemas can always successfully accept the format.
Complex objects are stringified using $string, which produces an expanded JSON syntax
that supports circular references and extended types like dates and bigints.
number
new Schema('number')
Normalizes numbers and stringified numbers; outputs and validates number values. Values must not be infinite or NaN.
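The split between accommodating normalization and strict validation can be sketched with two plain functions. This is an illustration only: `normalizeNumber` and `isValidNumber` are hypothetical names, and the real $number and $is-number processors may behave differently in edge cases.

```javascript
// Illustrative sketch only; not the library's $number / $is-number processors.
function normalizeNumber(input) {
  // Accommodating: accept numbers as-is and convert non-empty numeric strings.
  if (typeof input === 'string' && input.trim() !== '') {
    const parsed = Number(input);
    // Unparseable strings are returned unchanged so the validator can reject them.
    return Number.isNaN(parsed) ? input : parsed;
  }
  return input;
}

function isValidNumber(value) {
  // Strict: only finite values of the actual number type pass.
  return typeof value === 'number' && Number.isFinite(value);
}

console.log(isValidNumber(normalizeNumber('42')));     // -> true
console.log(isValidNumber(normalizeNumber('abc')));    // -> false
console.log(isValidNumber(normalizeNumber(Infinity))); // -> false
```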
boolean
new Schema('boolean')
Normalizes inputs based on their truthiness (note that this
library uses a broader definition than the JavaScript standard) but outputs and validates actual boolean values.
object
new Schema('object')
// or more commonly, used with child property schemas,
new Schema('object')
  .property('child1', new Schema(/* ... */))
  .property('child2', new Schema(/* ... */))
  // ...
// rarely,
new Schema('object')
  .property('child', new Schema(/* ... */)) // known child name
  .property('*', new Schema('*')) // unknown child name
The object schema is set up to handle child properties, and normalizes the empty container signal as {}.
It also accepts the expanded JSON syntax produced when $string processes an object, and supports wildcard child property
schemas to match unknown keys. Unknown input properties can be accepted and pruned by using a wildcard with a $null normalizer.
An object schema without any properties treats its entire input as a monolithic value.
Any structured data returned from an object schema normalizer (potentially from the default) will be used as
the input for its children. Avoid combining explicit child property assignments with "bulk assignments" to the container:
import { Schema, SchemaResolver } from '@versionzero/schema';
const resolver = new SchemaResolver();
const schema = await resolver.compile(
  new Schema('object')
    .deep()
    .default({ child: 123 }) // bulk assignment!
    .property('child', new Schema('number').default(456))
);
// The results are subtle, so avoid depending on this behavior if possible:
console.log( await schema.process({child: 789}) );
// -> { child: 789 } (explicitly set property wins over parent assignment)
console.log( await schema.process({}) );
// -> { child: 456 } (explicit empty object overrides the parent default, but undefined child uses the default)
console.log( await schema.process() );
// -> { child: 123 } (no parent assignment, so parent default of bulk assignment wins)
See the properties section for more information on building schema hierarchies.
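The wildcard matching described in this section can be pictured as a two-step lookup: an explicitly named property wins, otherwise the `*` entry applies. The sketch below uses hypothetical names (`resolveChild`, `processObject`) and plain functions standing in for child schemas. Two behaviors are assumptions made for the sketch, not documented library behavior: it throws when no child matches, and it treats a null result as "prune this key" to mimic a wildcard with a $null normalizer.

```javascript
// Simplified sketch; plain functions stand in for child schemas.
function resolveChild(properties, key) {
  // An explicitly named property takes precedence over the wildcard entry.
  if (Object.prototype.hasOwnProperty.call(properties, key)) return properties[key];
  return properties['*']; // undefined when no wildcard is defined
}

function processObject(properties, input) {
  const output = {};
  for (const [key, value] of Object.entries(input)) {
    const child = resolveChild(properties, key);
    // Assumption for this sketch: an unmatched key is an error.
    if (!child) throw new Error(`no schema matches property "${key}"`);
    const result = child(value);
    // Assumption: a null/undefined result prunes the key from the output.
    if (result !== null && result !== undefined) output[key] = result;
  }
  return output;
}

const pruning = { child: Number, '*': () => null };
console.log(processObject(pruning, { child: '7', extra: 'x' })); // -> { child: 7 }
```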
array
new Schema('array')
// or more commonly either with a typed wildcard child:
new Schema('array')
  .property('*', new Schema(/* whatever... */))
// or occasionally, defined as an explicit tuple:
new Schema('array')
  .property('0', new Schema(/* first member */))
  .property('1', new Schema(/* second member */))
  // ...
The array schema behaves analogously to object, except that it additionally accepts simple comma-separated strings.
It is commonly used with a wildcard property, which allows arrays of arbitrary length with typed values. It also
normalizes a number into an empty array of that length. If the array has a wildcard child property schema,
the parent array normalizes the special character "*" as the values option of the wildcard child. The notes
about bulk parent assignments also apply here.
See the properties section for more information on building schema hierarchies.
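The extra input forms accepted by array (comma-separated strings and a bare length) can be sketched as follows. `normalizeArrayInput` is a hypothetical stand-in for the built-in normalizer: trimming whitespace around items is an assumption, and the "*" handling for wildcard children is left out.

```javascript
// Hypothetical stand-in for the array normalizer; the real one may differ in details.
function normalizeArrayInput(input) {
  if (Array.isArray(input)) return input;
  if (typeof input === 'string') {
    // Assumption: items are trimmed after splitting on commas.
    return input.split(',').map((item) => item.trim());
  }
  if (Number.isInteger(input) && input >= 0) {
    // A number yields an array of that length with no assigned members.
    return new Array(input);
  }
  return input; // anything else is left for validation to reject
}

console.log(normalizeArrayInput('a, b,c')); // -> [ 'a', 'b', 'c' ]
console.log(normalizeArrayInput(3).length); // -> 3
```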
date
new Schema('date')
Produces and validates Date instances. Uses the $date processor for normalization, which accepts
multiple input formats and convenience syntaxes (e.g. "now" or "+365d").
See the $date documentation for details.
buffer
new Schema('buffer')
The most common use is to normalize base64 input strings, but other formats and encodings are supported.
See the $buffer documentation for details on other supported inputs.
function
new Schema('function')
By default, simple (non-class-constructor) functions are treated as dynamic values and invoked using the same
signature as value processor functions. To pass functions as values, the function schema disables the dynamic
schema option.
any
new Schema('any')
The any schema is essentially a pass-through just like new Schema(), but adds support for normalizing
the empty container signal, using heuristics to determine whether to return
{} or [].
literal
Schema.literal(value);
Not a base schema in the same sense as the others. Literals are created using a special static factory function rather than the constructor. Whatever value is provided will be:
- used as the default
- returned from the transform
- validated as the only legal value
Literal schemas should not have child properties or be constructed as a union.
See the API docs for details and the $literal processor for the pipeline variant.
inherit
Schema.inherit(propertyName)
Another special factory function. It creates a schema whose transform produces the same value
as whatever is in the output at the referenced propertyName in a parent schema. If propertyName
is omitted, it uses the name of the child property under which the current schema was found during transformation.
import { Schema, SchemaResolver } from '@versionzero/schema';
const resolver = new SchemaResolver();
const schema = await resolver.compile(
  new Schema('object').deep() // the "deep" option ensures children are populated even without assignments
    .property('a', new Schema('number'))
    .property('b', new Schema('number'))
    .property('child', new Schema('object').deep()
      .property('a', Schema.inherit()) // default: look for the same name as the current property
      .property('ai', Schema.inherit('a'))
      .property('c', new Schema('number'))
    )
);
console.log( await schema.process( {a: 123, b: 456}) );
// -> { a: 123, b: 456, child: { a: 123, ai: 123 } }
reference
Schema.reference(path, absolute)
Like inherit, but uses a path rather than a property name.
By default (absolute=true), paths are interpreted from the root. With absolute=false,
meta-characters in the path allow navigation around the hierarchy. Paths use the same semantics
as the $reference keyword.
import { Schema, SchemaResolver } from '@versionzero/schema';
const resolver = new SchemaResolver();
const schema = await resolver.compile(
  new Schema('object').deep()
    .property('a', new Schema('number'))
    .property('child1', new Schema('object').deep()
      .property('b', new Schema('number'))
    )
    .property('child2', new Schema('object').deep()
      .property('ar', Schema.reference('a')) // absolute from root (default)
      .property('br', Schema.reference('^^child1.b', false)) // absolute=false: relative path
    )
);
console.log( await schema.process( {a: 123, child1:{b: 456}}) );
// -> { a: 123, child1: { b: 456 }, child2: { ar: 123, br: 456 } }
No Hard-Coded Schema Types!
These schemas are not "special" in any way; they are just a convenience and are built using the public API.
This is the entire actual definition of number:
import { Schema } from '@versionzero/schema';
export const NUMBER_SCHEMA = new Schema()
  .option('type', 'number')
  .meta('valueName', 'number')
  .normalizer('$number')
  .validator('$is-number')
($number and $is-number are value processor functions registered as keywords in SchemaResolver.)
Schema Inheritance
Any custom schema can be used as a base, or registered with SchemaResolver for reuse by name.
import { Schema, SchemaResolver } from '@versionzero/schema';
const schemaResolver = new SchemaResolver();
// Inherit and extend the built-in 'string' base schema for email validation
const emailSchema = new Schema('string')
  .validator('$email')
  .meta('description', 'Email address');
// You can inherit by reference and extend it...
const selectiveEmailSchema = new Schema(emailSchema)
  .validator(/^(?!.*@aol\.com$).+/); // disallow aol addresses
// register it for reuse...
schemaResolver.registerSchema('SelectiveEmail', selectiveEmailSchema);
// reference by name, and continue extending...
const myEmailSchema = new Schema('SelectiveEmail')
  .validator(/^(?!.*@gmail\.com$).+/); // also disallow gmail
The base parameter to new Schema(base) can be:
- A Schema, a CompiledSchema, or compatible "schema-shaped" raw data.
- A string name of a registered schema (e.g., 'string', 'number').
Compilation of Schema Hierarchies
The base schema is not resolved until the Schema is compiled into a CompiledSchema.
During compilation, the SchemaResolver.resolve() call walks the Schema and its bases (and properties and unions, and their bases),
flattening everything into the CompiledSchema used for runtime processing and validation. Late resolution of base names means that
Schema definitions don't need to depend on the implementation details of what they reference. This is useful for
encoding values as dependency-injected references (e.g. .property('logger', new Schema('Logger'))). An unresolved
base schema throws an error at compilation time by default, but this can be overridden by setting the lax option
on a schema, which makes an unresolved base an error only if it is actually referenced at runtime. This is useful
if a "partially resolved" schema might be shared between different applications, and a condition is used to block
irrelevant references.
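The effect of late resolution can be illustrated with a toy registry. Everything here (`registry`, `compileSketch`, the `{ base, rules, lax }` shape) is hypothetical and far simpler than SchemaResolver. It only shows that a base name is looked up at compile time rather than at definition time, and how a lax flag might defer the failure until the schema is actually used.

```javascript
// Toy model of late base resolution; NOT the SchemaResolver implementation.
const registry = new Map();

function compileSketch(definition) {
  let inherited = [];
  if (typeof definition.base === 'string') {
    const base = registry.get(definition.base); // looked up only at compile time
    if (base) {
      inherited = compileSketch(base).rules; // flatten the base's rules
    } else if (definition.lax) {
      // Assumed lax behavior: defer the error until the rules actually run.
      const name = definition.base;
      inherited = [() => { throw new Error(`unresolved base "${name}"`); }];
    } else {
      throw new Error(`unresolved base "${definition.base}"`);
    }
  }
  const rules = [...inherited, ...(definition.rules || [])];
  return { rules, process: (input) => rules.reduce((v, rule) => rule(v), input) };
}

// The definition names 'Trimmed' before anything is registered under that name.
const upper = { base: 'Trimmed', rules: [(v) => v.toUpperCase()] };
registry.set('Trimmed', { rules: [(v) => String(v).trim()] });
console.log(compileSketch(upper).process('  ok  ')); // -> 'OK'
```

Because the lookup happens inside `compileSketch`, the `upper` definition never needs to know how 'Trimmed' is implemented, only its name.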