KSY keyword reference
A .ksy file is a YAML document that describes a binary
format. The Kaitai Struct compiler (ksc, also called
kaitai-struct-compiler) reads it and generates a parser library in a target
language. This page is a compact reference to the keywords you write inside a
.ksy file: the top-level keys, the meta keys, and the per-attribute keys
used inside seq and instances.
Keys are spelled exactly as shown, including hyphens (for example size-eos,
repeat-expr, pad-right). YAML is case-sensitive, and so is the compiler.
Top-level keys
These keys appear at the root of a .ksy file (and most also appear inside any
user-defined type under types).
| Key | Required | Value | Purpose |
|---|---|---|---|
| meta | Yes (at root) | Map | Metadata about the format: id, endianness, encoding, imports, etc. |
| doc | No | String | Free-form documentation for the type. Supports Markdown. |
| doc-ref | No | String or list of strings | Reference(s) to an external document/specification, optionally with a URL. |
| params | No | List of maps | Parameters the type accepts when it is instantiated by a parent. |
| seq | No | List of attributes | The sequence of fields read, in order, from the stream. |
| instances | No | Map of name → attribute | Lazy / out-of-sequence fields: computed values or fields read from an explicit position. |
| enums | No | Map of name → (int → name) | Named integer-to-symbol mappings used by enum. |
| types | No | Map of name → type | Nested user-defined types, each with its own meta/seq/types/etc. |
A user-defined type under types may contain meta, seq, instances,
enums, types, params, doc, and doc-ref. Only the root requires
meta.
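As a rough sketch of how these top-level keys fit together, the following skeleton combines meta, doc, seq, types, params, instances, and enums in one file. The format itself and every name in it (example_container, header, chunk, chunk_kind, and so on) are invented for illustration and do not describe a real format.

```yaml
# Hypothetical format; all identifiers below are made up for illustration.
meta:
  id: example_container
  endian: le
doc: A header followed by a list of length-prefixed chunks.
seq:
  - id: header
    type: header
  - id: chunks
    type: chunk(header.version)   # pass a value to the type's params
    repeat: expr
    repeat-expr: header.num_chunks
types:
  header:
    seq:
      - id: version
        type: u1
      - id: num_chunks
        type: u2
  chunk:
    params:
      - id: version               # supplied by the parent, see above
        type: u1
    seq:
      - id: kind
        type: u1
        enum: chunk_kind          # resolved against the root-level enums
      - id: len_body
        type: u4
      - id: body
        size: len_body
    instances:
      is_legacy:
        value: version < 2        # computed, reads no bytes
enums:
  chunk_kind:
    1: text
    2: image
```

Only the root carries a meta map here; the nested types pick up endian: le from it.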
meta keys
The meta map holds format-wide settings. At the root, id is required; every
other key is optional. Parsing defaults such as endian, bit-endian, and
encoding are inherited by nested types unless a nested meta overrides them.
| Key | Value | Purpose |
|---|---|---|
| id | String (lower_underscore) | Identifier for the format / generated top-level class. Required at root. |
| title | String | Human-readable name of the format. |
| application | String or list | Application(s) that produce or consume this format. |
| file-extension | String or list | Typical file extension(s), without the leading dot. |
| xref | Map | Cross-references to external catalogs (Wikidata, MIME, PRONOM, RFC, etc.). |
| license | String | SPDX license identifier for the .ksy spec itself. |
| imports | List of strings | Other .ksy files to import; paths are relative, or resolved via -I / KSPATH. |
| encoding | String | Default string encoding (e.g. UTF-8, ASCII) for type: str fields. |
| endian | le or be | Default byte order for multi-byte numeric types. Can also be a switch-on map. |
| bit-endian | le or be | Default bit order for bit-sized integer types (bX). Since v0.9. |
| ks-version | String | Minimum compiler version required to build this spec. |
| ks-debug | Boolean | Force debug mode for this spec. |
| ks-opaque-types | Boolean | Allow referencing externally-defined (opaque) types. |
endian and bit-endian set defaults only. Any individual numeric type can
override them by appending the byte or bit order to the type name, for example
u4le, u4be, or b3le.
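A minimal sketch of defaults and a per-field override, assuming a made-up format id (endian_demo) and made-up field names:

```yaml
# Hypothetical spec; field names are illustrative only.
meta:
  id: endian_demo
  endian: le          # default byte order for u2/u4/u8, s2/s4/s8, f4/f8
  bit-endian: be      # default bit order for bX types
seq:
  - id: len_payload
    type: u4          # no suffix: uses the meta default (little-endian)
  - id: checksum
    type: u4be        # explicit suffix overrides the default for this field
  - id: flags
    type: b3          # 3 bits, using the default bit order
  - id: reserved
    type: b5          # remaining 5 bits of the same byte
```

Both bit fields keep the same bit order so that together they fill exactly one byte.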
Attribute keys
These keys describe a single attribute (a field) inside seq or instances.
An attribute is a YAML map. At its core it needs a name (id) and enough
information to determine what is read and how many bytes it spans, typically a
type, a size, or contents.
Identity and type
| Key | Value | Purpose |
|---|---|---|
| id | String (lower_underscore) | Field name. Required in seq; for instances the map key is the name instead. |
| type | String or switch-on map | Data type: a built-in (u1/u2/u4/u8, s1/s2/s4/s8, f4, f8, str, strz, bX) or a user-defined type name. Omit for a raw byte array. |
| doc | String | Documentation for this field. |
| doc-ref | String or list | External reference for this field. |
| enum | String | Name of an enums entry to map the parsed integer onto named constants. |
The type key can also switch on a value at parse time:
    - id: body
      type:
        switch-on: rec_type
        cases:
          1: rec_type_1
          2: rec_type_2
          _: rec_type_unknown
The _ case is the default (fallback) branch.
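The enum key from the table above combines naturally with switch-on. The sketch below maps a one-byte protocol number onto named constants and then switches on those names instead of raw integers. The format id, type names, and the choice of constants are assumptions made for this example.

```yaml
# Hypothetical spec; enum_switch_demo and its types are illustrative only.
meta:
  id: enum_switch_demo
seq:
  - id: protocol
    type: u1
    enum: ip_protocol             # integer is exposed as a named constant
  - id: len_body
    type: u2be
  - id: body
    size: len_body
    type:
      switch-on: protocol
      cases:
        # enum constants are written as enum_name::value and quoted,
        # because the key contains colons
        'ip_protocol::tcp': tcp_segment
        'ip_protocol::udp': udp_datagram
types:
  tcp_segment:
    seq:
      - id: src_port
        type: u2be
      - id: dst_port
        type: u2be
  udp_datagram:
    seq:
      - id: src_port
        type: u2be
      - id: dst_port
        type: u2be
enums:
  ip_protocol:
    6: tcp
    17: udp
```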
Size and string handling
| Key | Value | Purpose |
|---|---|---|
| size | Integer or expression | Number of bytes the field occupies (a fixed length or a value computed from earlier fields). |
| size-eos | Boolean | If true, read until the end of the stream. Mutually exclusive with size. |
| contents | List / string / bytes | Magic value: assert that these exact bytes appear here; the parser fails otherwise. |
| encoding | String | Character encoding for a type: str field; overrides meta/encoding. |
| terminator | Integer (byte value) | Stop reading a string/byte array at this byte. strz is shorthand for terminator: 0. |
| consume | Boolean | Whether the terminator byte is consumed from the stream. Default true. |
| include | Boolean | Whether the terminator byte is included in the parsed value. Default false. |
| eos-error | Boolean | If false, reaching end-of-stream without finding the terminator is not an error. Default true. |
| pad-right | Integer (byte value) | Strip this trailing padding byte from a fixed-size string/byte array. |
contents validates fixed bytes — useful for file signatures. Values may be
written as a byte array ([0xca, 0xfe, 0xba, 0xbe]), a string, or a mix; the
compiler concatenates them into the expected byte sequence.
    seq:
      - id: magic
        contents: [0xca, 0xfe, 0xba, 0xbe]
      - id: name
        type: str
        size: 16
        encoding: UTF-8
        pad-right: 0
      - id: comment
        type: strz
        encoding: ASCII
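The terminator-related keys (terminator, consume, include, eos-error) are easier to compare side by side. The following sketch reads a small key=value text record; the field names are invented, and the byte values are plain ASCII codes.

```yaml
# Hypothetical fragment; field names are illustrative only.
seq:
  # Read up to LF (0x0a). Defaults apply: the LF is consumed from the
  # stream but not included in the value.
  - id: first_line
    type: str
    encoding: ASCII
    terminator: 0x0a
  # Stop at '=' (0x3d) but leave it in the stream ...
  - id: key
    type: str
    encoding: ASCII
    terminator: 0x3d
    consume: false
  # ... so that it can be checked as a separate fixed field.
  - id: separator
    contents: '='
  # Keep the trailing LF in the value, and tolerate a final line that
  # ends at end-of-stream without one.
  - id: value
    type: str
    encoding: ASCII
    terminator: 0x0a
    include: true
    eos-error: false
```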
Repetition
| Key | Value | Purpose |
|---|---|---|
| repeat | eos, expr, or until | Repeat this attribute, producing an array. |
| repeat-expr | Integer or expression | Required when repeat: expr: how many times to repeat. |
| repeat-until | Boolean expression | Required when repeat: until: stop once it is true. The current element is _. |
    # repeat: eos (read elements until the stream ends)
    - id: records
      type: record
      repeat: eos
    # repeat: expr (read a known count)
    - id: entries
      type: entry
      repeat: expr
      repeat-expr: num_entries
    # repeat: until (read until a sentinel value)
    - id: numbers
      type: s4
      repeat: until
      repeat-until: _ == -1
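When the repeated element is a user-defined type rather than a plain integer, _ refers to the whole element just parsed, so its fields can be inspected. A small sketch, assuming a hypothetical block type whose zero-length instance marks the end of the list:

```yaml
# Hypothetical fragment; block and its fields are illustrative only.
seq:
  - id: blocks
    type: block
    repeat: until
    repeat-until: _.len_data == 0   # stop after the first empty block
types:
  block:
    seq:
      - id: len_data
        type: u1
      - id: data
        size: len_data
```

The terminating element is still part of the resulting array, since the condition is checked after each element is read.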
Conditionals and processing
| Key | Value | Purpose |
|---|---|---|
| if | Boolean expression | Parse the field only when the expression is true; otherwise skip it. |
| process | zlib, xor(key), rol(n), ror(n), or a custom processor | Transform raw bytes (decompress / decrypt) before exposing them. Requires a known byte range (size or size-eos). |
    - id: has_crc32
      type: u1
    - id: crc32
      type: u4
      if: has_crc32 != 0
    - id: body
      size: body_len
      process: zlib
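if and process can be combined on the same attribute. The sketch below, with invented field names, reads the rest of the stream either as XOR-obfuscated bytes or as plain bytes depending on a flag; the key for xor is taken from an earlier field.

```yaml
# Hypothetical fragment; field names are illustrative only.
seq:
  - id: is_obfuscated
    type: u1
  - id: xor_key
    type: u1
    if: is_obfuscated != 0
  # Exactly one of the two payload fields is parsed.
  - id: payload_decoded
    size-eos: true              # process needs a known byte range
    process: xor(xor_key)       # each byte is XORed with the key
    if: is_obfuscated != 0
  - id: payload_plain
    size-eos: true
    if: is_obfuscated == 0
```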
Instance-only keys
Inside instances, the field name is the map key (not id). Three extra keys
control where the data comes from.
| Key | Value | Purpose |
|---|---|---|
| pos | Integer or expression | Read this field from an explicit position in the stream rather than the current one. |
| value | Expression | Make this a computed value instance: no bytes are read; the expression's result is the value. |
| io | I/O object expression | Read from a different stream (e.g. _root._io, or a substream of another field). |
A value instance computes a result; a pos instance seeks and parses:
    instances:
      # computed: no parsing, just arithmetic
      len_in_meters:
        value: len_in_feet * 0.3048
      # positional: seek to offset and read
      header_copy:
        pos: 0
        type: u4
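The io key is usually paired with pos and size. As a sketch under assumed names (index_demo, index_entry, ofs_body, len_body), the following index table stores offsets that are relative to the start of the whole file, so each entry reads its body from the root stream:

```yaml
# Hypothetical spec; all identifiers are illustrative only.
meta:
  id: index_demo
  endian: le
seq:
  - id: num_entries
    type: u4
  - id: entries
    type: index_entry
    repeat: expr
    repeat-expr: num_entries
types:
  index_entry:
    seq:
      - id: ofs_body
        type: u4
      - id: len_body
        type: u4
    instances:
      body:
        io: _root._io       # offsets are absolute, so use the root stream
        pos: ofs_body
        size: len_body      # raw bytes; add a type to parse them further
      ofs_end:
        value: ofs_body + len_body   # computed, reads nothing
```

Naming the stream explicitly matters most when the enclosing type is itself parsed from a size-limited substream, where pos would otherwise be interpreted relative to that substream.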
The .ksy snippets above are minimal and illustrative; they show keyword usage,
not a complete real-world format. For complete, production specs see the format
gallery linked below.
Compiling a .ksy file
The keywords above are consumed by ksc. A typical invocation:
    kaitai-struct-compiler --target python --outdir out my_format.ksy
Common flags: -t / --target selects the output language (for example
python, java, cpp_stl, javascript, csharp, go, rust, or all);
-d / --outdir sets the output directory; -I adds an import search path
used by meta/imports. Adding --read-write generates serialization support
(_write / _check) in addition to parsing, currently for Java and Python.