
Repetition & conditions

Most real binary formats are not flat lists of single fields. They contain arrays whose length is computed at runtime, fields that exist only when a flag is set, and records whose body type depends on a tag byte read earlier. Kaitai Struct expresses all three patterns declaratively inside a seq attribute: you write no loops or if statements by hand, and the compiler generates them in the parser.

This page covers the three mechanisms:

| Goal | Key(s) | Result type |
| --- | --- | --- |
| Read the same field many times | repeat (+ repeat-expr / repeat-until) | array |
| Read a field only sometimes | if | the field's type, or null/absent |
| Choose a type at parse time | type: { switch-on, cases } | the selected type |

Note: A .ksy file is YAML. The keys below (repeat, if, switch-on, …) are ordinary YAML mapping keys on an attribute, and the compiler ksc (kaitai-struct-compiler) turns them into the loops and branches of the target language.

Repetition

Adding repeat to an attribute turns its value into an array. There are three repetition modes, selected by the value of repeat.

repeat: expr — a known number of times

Use repeat: expr together with repeat-expr, which holds an expression giving the element count. The count is frequently a field that was parsed just before the array.

seq:
  - id: num_floats
    type: u4
  - id: floats
    type: f8
    repeat: expr
    repeat-expr: num_floats
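For comparison, the following is a hand-written Python sketch of roughly the loop that repeat: expr stands in for in the num_floats / floats spec. It is not the code ksc emits; the function name read_floats is made up for this page, and little-endian byte order is an assumption, since the snippet does not declare an endianness.

```python
import io
import struct


def read_floats(stream):
    """Read a u4 count, then that many f8 values (little-endian assumed)."""
    (num_floats,) = struct.unpack("<I", stream.read(4))
    floats = []
    for _ in range(num_floats):          # repeat: expr / repeat-expr: num_floats
        (value,) = struct.unpack("<d", stream.read(8))
        floats.append(value)
    return floats


# Count of 3, followed by three 8-byte floats.
data = struct.pack("<I", 3) + struct.pack("<3d", 1.5, -2.0, 0.25)
print(read_floats(io.BytesIO(data)))  # [1.5, -2.0, 0.25]
```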

repeat-expr is a full expression, so the count can be computed:

  - id: pixels
    type: u1
    repeat: expr
    repeat-expr: width * height

repeat: eos — until the end of the stream

Use repeat: eos ("end of stream") to keep reading the field until there are no more bytes left in the current stream. No count is needed.

seq:
  - id: numbers
    type: u4
    repeat: eos

Tip: repeat: eos is bounded by the current stream, not the whole file. If the attribute lives inside a substream created by a size key, it stops at the end of that substream, which is a common way to read "all records in this section."
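The substream behaviour can be illustrated with a small hand-written Python sketch. It is only an analogy, not generated code: read_u4_until_eos is a hypothetical helper, the 12-byte section size and the trailing bytes are invented test data, and little-endian order is assumed.

```python
import io
import struct


def read_u4_until_eos(stream):
    """Keep reading u4 values until the stream has no bytes left."""
    numbers = []
    while True:
        chunk = stream.read(4)
        if not chunk:                    # end of *this* stream reached
            break
        if len(chunk) < 4:
            raise EOFError("stream ended in the middle of a u4")
        (value,) = struct.unpack("<I", chunk)
        numbers.append(value)
    return numbers


# A 12-byte section followed by unrelated trailing bytes.
data = struct.pack("<3I", 10, 20, 30) + b"TRAILER!"
whole = io.BytesIO(data)

# Rough equivalent of `size: 12`: carve out a substream, then loop to its end.
section = io.BytesIO(whole.read(12))
print(read_u4_until_eos(section))  # [10, 20, 30]
print(whole.read())                # b'TRAILER!'
```

The loop never sees the trailing bytes because it only has the carved-out section to read from, which mirrors how a size key limits what repeat: eos can consume.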

repeat: until — until a condition is met

Use repeat: until together with repeat-until, a boolean expression evaluated after each element is parsed. Reading stops once the expression becomes true. The just-parsed element is available as the special variable _.

Important: the element that satisfies the condition has already been read, so it is included in the resulting array.

seq:
  - id: numbers
    type: s4
    repeat: until
    repeat-until: _ == -1
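A hand-written Python sketch makes the "terminator is included" rule concrete. It assumes little-endian s4 values, and read_until_minus_one is a name invented here, not part of any Kaitai runtime.

```python
import io
import struct


def read_until_minus_one(stream):
    """Read s4 values until one equals -1; that value is kept in the result."""
    numbers = []
    while True:
        (value,) = struct.unpack("<i", stream.read(4))
        numbers.append(value)   # appended before the test, so the terminator stays
        if value == -1:         # repeat-until: _ == -1
            break
    return numbers


data = struct.pack("<5i", 7, 8, -1, 99, 100)
stream = io.BytesIO(data)
print(read_until_minus_one(stream))  # [7, 8, -1]
print(stream.tell())                 # 12
```

The values after the terminator are left unread in the stream. In this sketch a stream that ends without a -1 fails with struct.error, because the final read returns fewer than four bytes.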

When the repeated field is a user-defined type, reach into its attributes through _:

  - id: chunks
    type: chunk
    repeat: until
    repeat-until: _.len == 0

The repetition keys summarize as:

| repeat value | Companion key | Stops when |
| --- | --- | --- |
| expr | repeat-expr | the expression's count is reached |
| eos | (none) | the current stream is exhausted |
| until | repeat-until | the boolean expression (using _) is true |

Conditional fields with if

The if key makes an attribute optional. The field is only parsed when its boolean expression is true; otherwise nothing is read and the field is null/absent in the generated API.

seq:
  - id: has_crc32
    type: u1
  - id: crc32
    type: u4
    if: has_crc32 != 0
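In hand-written form, the has_crc32 / crc32 pair corresponds to something like the Python sketch below. It is an illustration under assumptions (little-endian order, a made-up read_header function, None standing in for "absent"), not the generated parser.

```python
import io
import struct


def read_header(stream):
    """Read a u1 flag, then a u4 CRC only when the flag is non-zero."""
    (has_crc32,) = struct.unpack("<B", stream.read(1))
    crc32 = None                          # field stays absent when the condition is false
    if has_crc32 != 0:                    # if: has_crc32 != 0
        (crc32,) = struct.unpack("<I", stream.read(4))
    return has_crc32, crc32


print(read_header(io.BytesIO(b"\x01\xef\xbe\xad\xde")))  # (1, 3735928559)
print(read_header(io.BytesIO(b"\x00")))                   # (0, None)
```

When the flag is zero, no bytes are consumed for crc32, so whatever follows in the stream is read from the position right after the flag.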

if works with enums too — compare against an enum value using the enum_name::value syntax:

seq:
  - id: my_animal
    type: u1
    enum: animal
  - id: dog_tag
    type: u4
    if: my_animal == animal::dog
enums:
  animal:
    1: cat
    2: dog

Info: if and repeat can be combined on the same attribute. The condition is checked first; if it is true, the field is read according to its repeat mode.

Type switching with switch-on

When a field's type is not fixed but is chosen by a value read earlier (a "tag" or "record type"), give type a mapping with switch-on and cases instead of a plain type name. switch-on is the expression to test; cases maps possible values to the type to parse for each.

seq:
  - id: rec_type
    type: u1
  - id: len
    type: u4
  - id: body
    size: len
    type:
      switch-on: rec_type
      cases:
        1: rec_type_1
        2: rec_type_2

Default case

A case key of _ acts as the default, matching any value not listed explicitly:

    type:
      switch-on: rec_type
      cases:
        1: rec_type_1
        2: rec_type_2
        _: rec_type_unknown

Note: If switch-on matches no case and there is no _ default, but the attribute has a size, parsing yields a raw byte array for that field rather than failing.
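The dispatch, the default case, and the raw-bytes fallback can be sketched by hand in Python as a dictionary lookup. Everything here is hypothetical: the two body parsers, their field layouts, the CASES table, and the little-endian header are stand-ins chosen for the example, since the page does not define rec_type_1 or rec_type_2.

```python
import io
import struct


def parse_rec_type_1(body):
    """Hypothetical body layout: a single u2 value."""
    return {"kind": 1, "value": struct.unpack("<H", body[:2])[0]}


def parse_rec_type_2(body):
    """Hypothetical body layout: ASCII text filling the body."""
    return {"kind": 2, "text": body.decode("ascii")}


CASES = {1: parse_rec_type_1, 2: parse_rec_type_2}   # cases: mapping


def read_record(stream, default=None):
    """Read rec_type (u1), len (u4), then a body parsed according to rec_type."""
    rec_type, length = struct.unpack("<BI", stream.read(5))
    body = stream.read(length)                 # size: len creates the bounded body
    parser = CASES.get(rec_type, default)      # `default` plays the role of `_`
    if parser is None:
        return rec_type, body                  # no case, no default: raw bytes
    return rec_type, parser(body)


known = struct.pack("<BI", 1, 2) + b"\x2a\x00"
unknown = struct.pack("<BI", 9, 3) + b"\x01\x02\x03"
print(read_record(io.BytesIO(known)))    # (1, {'kind': 1, 'value': 42})
print(read_record(io.BytesIO(unknown)))  # (9, b'\x01\x02\x03')
```

Because the body length is known up front, an unrecognised record type can be skipped cleanly and the next record still starts at the right offset.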

Switching on strings and enums

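Case keys are expressions, so string and enum values work as keys, but they must be quoted so YAML reads them as the intended literal. A string key needs two layers of quoting: inner double quotes that make it a string literal in the expression language, and outer single quotes for YAML.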

    type:
      switch-on: magic # a previously parsed string field
      cases:
        '"KETCHUP"': rec_type_1
        '"MUSTARD"': rec_type_2

    type:
      switch-on: media # a field with enum: media
      cases:
        'media::cdrom': rec_type_1
        'media::dvdrom': rec_type_2

Illustrative example: a tagged record list

The following .ksy was written for this page to show repeat, if, and switch-on working together. It is not taken from a real format in the gallery.

meta:
  id: tagged_log
  endian: le
seq:
  - id: num_records
    type: u4
  - id: records
    type: record
    repeat: expr
    repeat-expr: num_records
types:
  record:
    seq:
      - id: tag
        type: u1
        enum: rec_kind
      - id: len
        type: u2
      - id: body
        size: len
        type:
          switch-on: tag
          cases:
            'rec_kind::text': text_body
            'rec_kind::number': number_body
            _: raw_body # falls back to a raw-bytes wrapper type
      - id: checksum
        type: u4
        if: tag != rec_kind::text # only numeric/raw records carry a checksum
  text_body:
    seq:
      - id: value
        type: str
        size-eos: true
        encoding: UTF-8
  number_body:
    seq:
      - id: value
        type: s8
  raw_body:
    seq:
      - id: value
        size-eos: true
enums:
  rec_kind:
    1: text
    2: number

This single specification reads a u4 count, repeats a record exactly that many times (repeat: expr), selects each record body's type from the tag enum (switch-on / cases with a _ default), and parses a trailing checksum only for non-text records (if).
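To show what the specification replaces, here is a hand-written Python sketch of a parser for the same illustrative layout. It is not what ksc generates, and it flattens each record into a plain dict instead of nested objects; the names parse_tagged_log, parse_body, and REC_KIND, along with the sample bytes, are made up for this example.

```python
import io
import struct

REC_KIND = {1: "text", 2: "number"}   # mirrors the rec_kind enum


def parse_body(tag, body):
    """Pick the body interpretation from the tag (switch-on / cases)."""
    kind = REC_KIND.get(tag)
    if kind == "text":
        return body.decode("utf-8")              # text_body: str, size-eos
    if kind == "number":
        return struct.unpack("<q", body[:8])[0]  # number_body: s8
    return body                                  # `_` default: raw bytes


def parse_tagged_log(data):
    stream = io.BytesIO(data)
    (num_records,) = struct.unpack("<I", stream.read(4))
    records = []
    for _ in range(num_records):                 # repeat: expr
        tag, length = struct.unpack("<BH", stream.read(3))
        body = parse_body(tag, stream.read(length))
        checksum = None
        if REC_KIND.get(tag) != "text":          # if: tag != rec_kind::text
            (checksum,) = struct.unpack("<I", stream.read(4))
        records.append({"tag": tag, "body": body, "checksum": checksum})
    return records


data = (
    struct.pack("<I", 2)
    + struct.pack("<BH", 1, 2) + b"hi"
    + struct.pack("<BH", 2, 8) + struct.pack("<q", -5) + struct.pack("<I", 0xCAFE)
)
for record in parse_tagged_log(data):
    print(record)
# {'tag': 1, 'body': 'hi', 'checksum': None}
# {'tag': 2, 'body': -5, 'checksum': 51966}
```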

A note on serialization

The same keys apply when Kaitai Struct writes data rather than reads it. When serializing, an array declared with repeat-expr: 2 must hold exactly two elements; otherwise the _check() call made before _write() raises a ConsistencyError. Value instances are cached, so if a repeat-expr reads its count from a value instance and you change a field that instance depends on, invalidate the cached instance before writing so that the count is recomputed.
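The interaction between a cached count and the consistency check can be modelled with a toy Python class. This is a conceptual sketch only and does not use the Kaitai Struct runtime: the Image class, its num_pixels property, invalidate_num_pixels(), check(), and the locally defined ConsistencyError are all hypothetical stand-ins.

```python
class ConsistencyError(Exception):
    """Local stand-in for a serialization consistency error."""


class Image:
    """Toy object whose array length must match a cached, computed count."""

    def __init__(self, width, height, pixels):
        self.width = width
        self.height = height
        self.pixels = pixels
        self._num_pixels = None          # cache for the computed count

    @property
    def num_pixels(self):
        if self._num_pixels is None:     # computed once, then reused
            self._num_pixels = self.width * self.height
        return self._num_pixels

    def invalidate_num_pixels(self):
        self._num_pixels = None          # force recomputation on next access

    def check(self):
        if len(self.pixels) != self.num_pixels:
            raise ConsistencyError(
                f"expected {self.num_pixels} pixels, got {len(self.pixels)}"
            )


img = Image(2, 2, [0] * 4)
img.check()                              # passes and caches the count of 4

img.width = 3
img.pixels = [0] * 6
try:
    img.check()                          # stale cached count (4) no longer matches
except ConsistencyError as err:
    print("stale cache:", err)

img.invalidate_num_pixels()
img.check()                              # count recomputed as 6, check passes
print(img.num_pixels)                    # 6
```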
