KSY keyword reference

A .ksy file is a YAML document that describes a binary format. The Kaitai Struct compiler (ksc, also called kaitai-struct-compiler) reads it and generates a parser library in a target language. This page is a compact reference to the keywords you write inside a .ksy file: the top-level keys, the meta keys, and the per-attribute keys used inside seq and instances.

note

Keys are spelled exactly as shown, including hyphens (for example size-eos, repeat-expr, pad-right). YAML is case-sensitive, and so is the compiler.

Top-level keys

These keys appear at the root of a .ksy file (and most also appear inside any user-defined type under types).

| Key | Required | Value | Purpose |
| --- | --- | --- | --- |
| meta | Yes (at root) | Map | Metadata about the format: id, endianness, encoding, imports, etc. |
| doc | No | String | Free-form documentation for the type. Supports Markdown. |
| doc-ref | No | String or list of strings | Reference(s) to an external document/specification, optionally with a URL. |
| params | No | List of maps | Parameters the type accepts when it is instantiated by a parent. |
| seq | No | List of attributes | The sequence of fields read, in order, from the stream. |
| instances | No | Map of name → attribute | Lazy / out-of-sequence fields: computed values or fields read from an explicit position. |
| enums | No | Map of name → (int → name) | Named integer-to-symbol mappings used by enum. |
| types | No | Map of name → type | Nested user-defined types, each with its own meta/seq/types/etc. |

A user-defined type under types may contain meta, seq, instances, enums, types, params, doc, and doc-ref. Only the root requires meta.

meta keys

The meta map holds format-wide settings. At the root, id is required; everything else is optional and inherited by nested types unless overridden.

| Key | Value | Purpose |
| --- | --- | --- |
| id | String (lower_underscore) | Identifier for the format / generated top-level class. Required at root. |
| title | String | Human-readable name of the format. |
| application | String or list | Application(s) that produce or consume this format. |
| file-extension | String or list | Typical file extension(s), without the leading dot. |
| xref | Map | Cross-references to external catalogs (Wikidata, MIME, PRONOM, RFC, etc.). |
| license | String | SPDX license identifier for the .ksy spec itself. |
| imports | List of strings | Other .ksy files to import; paths are relative, or resolved via -I / KSPATH. |
| encoding | String | Default string encoding (e.g. UTF-8, ASCII) for type: str fields. |
| endian | le or be | Default byte order for multi-byte numeric types. Can also be a switch-on map. |
| bit-endian | le or be | Default bit order for bit-sized integer types (bX). Since v0.9. |
| ks-version | String | Minimum compiler version required to build this spec. |
| ks-debug | Boolean | Force debug mode for this spec. |
| ks-opaque-types | Boolean | Allow referencing externally-defined (opaque) types. |

tip

endian and bit-endian set defaults only. Any individual numeric type can override them by appending the order to the type name — for example u4le, u4be, or b3le.
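To make the effect of the byte-order suffix concrete, the sketch below decodes the same four bytes as u4le and as u4be using Python's standard struct module. It is a hand-written illustration, not code produced by ksc, and the sample bytes are made up.

```python
import struct

# Four sample bytes (made up for illustration).
raw = bytes([0x01, 0x02, 0x03, 0x04])

# u4le: the least significant byte comes first in the stream.
(as_le,) = struct.unpack("<I", raw)

# u4be: the most significant byte comes first in the stream.
(as_be,) = struct.unpack(">I", raw)

print(hex(as_le))  # 0x4030201
print(hex(as_be))  # 0x1020304
```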

Attribute keys

These keys describe a single attribute (a field) inside seq or instances. An attribute is a YAML map; its core is a name plus enough information (a type, a size, or both) to determine what data to read and how many bytes it spans.

Identity and type

| Key | Value | Purpose |
| --- | --- | --- |
| id | String (lower_underscore) | Field name. Required in seq; for instances the map key is the name instead. |
| type | String or switch-on map | Data type: a built-in (u1/u2/u4/u8, s1/s2/s4/s8, f4, f8, str, strz, bX) or a user-defined type name. Omit for a raw byte array. |
| doc | String | Documentation for this field. |
| doc-ref | String or list | External reference for this field. |
| enum | String | Name of an enums entry to map the parsed integer onto named constants. |
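As a rough picture of what the enum key does at parse time, the sketch below maps a parsed integer onto a named constant. The Compression enum and the resolve_enum helper are hypothetical names written for this example. Returning the raw integer for an unmapped value is an assumption of this sketch; generated code in other target languages may handle unmapped values differently.

```python
from enum import IntEnum


class Compression(IntEnum):
    """Hypothetical enum standing in for one entry under `enums`."""
    NONE = 0
    ZLIB = 1


def resolve_enum(enum_cls, value):
    """Return the named member, or the raw integer if the value is unmapped.

    Falling back to the raw integer is an assumption of this sketch.
    """
    try:
        return enum_cls(value)
    except ValueError:
        return value


known = resolve_enum(Compression, 1)    # Compression.ZLIB
unknown = resolve_enum(Compression, 7)  # stays the plain integer 7
```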

The type key can also switch on a value at parse time:

- id: body
  type:
    switch-on: rec_type
    cases:
      1: rec_type_1
      2: rec_type_2
      _: rec_type_unknown

The _ case is the default (fallback) branch.
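The dispatch that switch-on describes can be pictured as a lookup table with a fallback. The sketch below is hand-written Python, not ksc output; the three parse_* functions are hypothetical stand-ins for the parsers of the user-defined types named in the cases.

```python
# Hypothetical stand-ins for the parsers of the three user-defined types.
def parse_rec_type_1(payload: bytes):
    return ("rec_type_1", payload)


def parse_rec_type_2(payload: bytes):
    return ("rec_type_2", payload)


def parse_rec_type_unknown(payload: bytes):
    return ("rec_type_unknown", payload)


# The explicit cases: value of rec_type -> parser to use.
CASES = {1: parse_rec_type_1, 2: parse_rec_type_2}


def parse_body(rec_type: int, payload: bytes):
    # dict.get supplies the fallback, playing the role of the `_` case.
    parser = CASES.get(rec_type, parse_rec_type_unknown)
    return parser(payload)
```

With these definitions, parse_body(2, b"abc") selects parse_rec_type_2, while any value other than 1 or 2 falls through to parse_rec_type_unknown.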

Size and string handling

| Key | Value | Purpose |
| --- | --- | --- |
| size | Integer or expression | Number of bytes the field occupies (a fixed length or a value computed from earlier fields). |
| size-eos | Boolean | If true, read until the end of the stream. Mutually exclusive with size. |
| contents | List / string / bytes | Magic value: assert that these exact bytes appear here; the parser fails otherwise. |
| encoding | String | Character encoding for a type: str field; overrides meta/encoding. |
| terminator | Integer (byte value) | Stop reading a string/byte array at this byte. strz is shorthand for terminator: 0. |
| consume | Boolean | Whether the terminator byte is consumed from the stream. Default true. |
| include | Boolean | Whether the terminator byte is included in the parsed value. Default false. |
| eos-error | Boolean | If false, reaching end-of-stream without finding the terminator is not an error. Default true. |
| pad-right | Integer (byte value) | Strip this trailing padding byte from a fixed-size string/byte array. |

info

contents validates fixed bytes — useful for file signatures. Values may be written as a byte array ([0xca, 0xfe, 0xba, 0xbe]), a string, or a mix; the compiler concatenates them into the expected byte sequence.

seq:
  - id: magic
    contents: [0xca, 0xfe, 0xba, 0xbe]
  - id: name
    type: str
    size: 16
    encoding: UTF-8
    pad-right: 0
  - id: comment
    type: strz
    encoding: ASCII
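The three fields in the snippet above can be approximated by hand to show how contents, pad-right, terminator, consume, include, and eos-error interact. This is a minimal sketch using only the standard library; the helper names and the sample data are invented for the example and are not part of any Kaitai Struct runtime.

```python
import io


def read_magic(stream, expected: bytes) -> bytes:
    """contents: fail unless the next bytes match exactly."""
    actual = stream.read(len(expected))
    if actual != expected:
        raise ValueError(f"bad magic: expected {expected!r}, got {actual!r}")
    return actual


def read_padded_str(stream, size: int, pad: int, encoding: str) -> str:
    """size + pad-right: read a fixed-size field, then strip trailing padding."""
    raw = stream.read(size)
    if len(raw) < size:
        raise EOFError("unexpected end of stream")
    return raw.rstrip(bytes([pad])).decode(encoding)


def read_terminated_str(stream, encoding: str, terminator: int = 0,
                        consume: bool = True, include: bool = False,
                        eos_error: bool = True) -> str:
    """terminator / consume / include / eos-error, with the documented defaults."""
    out = bytearray()
    while True:
        b = stream.read(1)
        if not b:  # end of stream reached before the terminator
            if eos_error:
                raise EOFError("terminator not found before end of stream")
            break
        if b[0] == terminator:
            if include:
                out += b  # keep the terminator in the value
            if not consume:
                stream.seek(-1, io.SEEK_CUR)  # leave it in the stream
            break
        out += b
    return bytes(out).decode(encoding)


# Made-up sample: magic, a 16-byte zero-padded name, a zero-terminated comment.
MAGIC = bytes([0xCA, 0xFE, 0xBA, 0xBE])
data = MAGIC + b"hello".ljust(16, b"\x00") + b"hi there\x00"
stream = io.BytesIO(data)

read_magic(stream, MAGIC)
name = read_padded_str(stream, 16, 0, "UTF-8")  # "hello"
comment = read_terminated_str(stream, "ASCII")  # "hi there"
```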

Repetition

| Key | Value | Purpose |
| --- | --- | --- |
| repeat | eos, expr, or until | Repeat this attribute, producing an array. |
| repeat-expr | Integer or expression | Required when repeat: expr: how many times to repeat. |
| repeat-until | Boolean expression | Required when repeat: until: stop once it is true. The current element is _. |

# repeat: eos (read elements until the stream ends)
- id: records
  type: record
  repeat: eos

# repeat: expr (read a known count)
- id: entries
  type: entry
  repeat: expr
  repeat-expr: num_entries

# repeat: until (read until a sentinel value)
- id: numbers
  type: s4
  repeat: until
  repeat-until: _ == -1
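The three repetition modes correspond to three familiar loop shapes. The sketch below reads signed 32-bit integers from an in-memory stream; little-endian order is an assumption made for the example, and the helper names are invented here rather than taken from a Kaitai Struct runtime. Note that in the until loop the element that satisfies the condition is still appended to the array before the loop stops.

```python
import io
import struct


def read_s4le(stream) -> int:
    """Read one signed 32-bit little-endian integer (byte order assumed)."""
    raw = stream.read(4)
    if len(raw) < 4:
        raise EOFError("unexpected end of stream")
    return struct.unpack("<i", raw)[0]


def at_eos(stream) -> bool:
    """True when the read position is at the end of a seekable stream."""
    pos = stream.tell()
    stream.seek(0, io.SEEK_END)
    end = stream.tell()
    stream.seek(pos)
    return pos >= end


def repeat_eos(stream):
    """repeat: eos, keep reading until the stream is exhausted."""
    items = []
    while not at_eos(stream):
        items.append(read_s4le(stream))
    return items


def repeat_expr(stream, count: int):
    """repeat: expr, read exactly `count` elements."""
    return [read_s4le(stream) for _ in range(count)]


def repeat_until(stream, stop):
    """repeat: until, read at least one element and stop once stop(item) is true."""
    items = []
    while True:
        item = read_s4le(stream)
        items.append(item)  # the sentinel itself stays in the array
        if stop(item):
            return items


sample = struct.pack("<4i", 10, 20, -1, 30)
all_items = repeat_eos(io.BytesIO(sample))                          # [10, 20, -1, 30]
first_two = repeat_expr(io.BytesIO(sample), 2)                      # [10, 20]
up_to_end = repeat_until(io.BytesIO(sample), lambda x: x == -1)     # [10, 20, -1]
```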

Conditionals and processing

| Key | Value | Purpose |
| --- | --- | --- |
| if | Boolean expression | Parse the field only when the expression is true; otherwise skip it. |
| process | zlib, xor(key), rol(n), ror(n), or a custom processor | Transform raw bytes (decompress / decrypt) before exposing them. Requires a known byte range (size or size-eos). |

- id: has_crc32
  type: u1
- id: crc32
  type: u4
  if: has_crc32 != 0
- id: body
  size: body_len
  process: zlib
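For intuition about the built-in processors, the sketch below gives plain-Python equivalents of zlib, xor(key), rol(n), and ror(n) applied to a byte string that has already been read. Treating rol and ror as per-byte rotations is an assumption of this sketch, and the function names are invented for the example.

```python
import zlib


def process_zlib(data: bytes) -> bytes:
    """zlib: inflate a zlib-compressed byte string."""
    return zlib.decompress(data)


def process_xor(data: bytes, key) -> bytes:
    """xor(key): key is one byte value or a byte string that repeats."""
    if isinstance(key, int):
        key = bytes([key])
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def process_rol(data: bytes, n: int) -> bytes:
    """rol(n): rotate every byte left by n bits (per-byte rotation assumed)."""
    n %= 8
    return bytes(((b << n) | (b >> (8 - n))) & 0xFF for b in data)


def process_ror(data: bytes, n: int) -> bytes:
    """ror(n): rotating right by n equals rotating left by 8 - n."""
    return process_rol(data, 8 - (n % 8))


inflated = process_zlib(zlib.compress(b"payload"))  # b"payload"
masked = process_xor(b"\x01\x02\x03", 0xFF)         # b"\xfe\xfd\xfc"
rotated = process_rol(b"\x81", 1)                   # b"\x03"
```

Because XOR is its own inverse, applying process_xor twice with the same key returns the original bytes, which is why the same processor serves for both obfuscation and recovery.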

Instance-only keys

Inside instances, the field name is the map key (not id). Three extra keys control where the value comes from.

| Key | Value | Purpose |
| --- | --- | --- |
| pos | Integer or expression | Read this field from an explicit position in the stream rather than the current one. |
| value | Expression | Make this a computed value instance: no bytes are read; the expression's result is the value. |
| io | I/O object expression | Read from a different stream (e.g. _root._io, or a substream of another field). |

A value instance computes a result; a pos instance seeks and parses:

instances:
  # computed: no parsing, just arithmetic
  len_in_meters:
    value: len_in_feet * 0.3048
  # positional: seek to offset and read
  header_copy:
    pos: 0
    type: u4
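The two kinds of instance can be pictured as lazy properties. The class below is a hand-written stand-in, not ksc output: len_in_meters is computed on access, while header_copy seeks to offset 0, reads four bytes, restores the previous stream position, and caches the result. Little-endian order for the u4 is an assumption, and the Example class name is hypothetical.

```python
import io
import struct


class Example:
    """Hand-written stand-in for a generated class (names are hypothetical)."""

    def __init__(self, stream, len_in_feet: float):
        self._io = stream
        self.len_in_feet = len_in_feet
        self._header_copy = None  # cache for the positional instance

    @property
    def len_in_meters(self) -> float:
        # value instance: nothing is read from the stream.
        return self.len_in_feet * 0.3048

    @property
    def header_copy(self) -> int:
        # pos instance: seek, read, then put the stream back where it was.
        if self._header_copy is None:
            saved = self._io.tell()
            self._io.seek(0)
            (self._header_copy,) = struct.unpack("<I", self._io.read(4))
            self._io.seek(saved)
        return self._header_copy


stream = io.BytesIO(struct.pack("<I", 0xDEADBEEF) + b"rest")
stream.seek(6)  # pretend sequential parsing has already advanced
example = Example(stream, len_in_feet=10)
header = example.header_copy  # 0xDEADBEEF, read from offset 0
```

Restoring the saved position is what lets a positional instance be accessed at any time without disturbing sequential reads of seq fields.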

Illustrative example

The .ksy snippets above are minimal and illustrative; they show keyword usage, not a complete real-world format. For complete, production specs see the format gallery linked below.

Compiling a .ksy file

The keywords above are consumed by ksc. A typical invocation:

kaitai-struct-compiler --target python --outdir out my_format.ksy

Common flags: -t / --target selects the output language (for example python, java, cpp_stl, javascript, csharp, go, rust, or all); -d / --outdir sets the output directory; -I adds an import search path used by meta/imports. Adding --read-write generates serialization support (_write / _check) in addition to parsing, currently for Java and Python.

Sources