Glossary

This page defines the vocabulary used throughout the YARA-X knowledge base. YARA-X is a Rust reimplementation of YARA, the pattern-matching tool used to identify and classify malware. Its command-line tool is yr, and rules are written in the YARA rule language with meta, strings, and condition sections.

note

The definitions below describe how each term is used in YARA-X specifically. Where the YARA-X behavior differs from the original YARA, the difference is called out. Links to the upstream documentation are in Sources.

Quick reference

Term | One-line definition
Rule | The basic unit of detection: patterns plus a boolean condition.
String (pattern) | A piece of data to search for: text, hex bytes, or a regex.
Condition | The mandatory boolean expression that decides whether a rule matches.
Module | An importable extension that exposes file-format or helper data to rules.
Namespace | A label that isolates a group of rules to avoid identifier clashes.
Compiled rules | Rules pre-built into a binary blob for fast reuse.
Scanner | The component that runs compiled rules against target data.

Rule

A rule is the basic unit of detection in YARA-X. Every rule begins with the keyword rule followed by an identifier, and describes a set of patterns together with a boolean condition that decides whether a piece of data (a file, a process, or a buffer) matches.

A rule is made of up to three sections: an optional meta section (metadata), an optional strings section (the patterns), and a mandatory condition section. Rules may also carry tags declared after the identifier (for example rule Example : TagName), which are used to filter output.

Illustrative example — minimal rule
rule silent_banker : banker {
    meta:
        description = "This is just an example"
        author = "Jane Doe"
    strings:
        $a = "dummy string"
    condition:
        $a
}

info

The meta section holds documentation only. Its values can be strings (valid UTF-8 only), integers, or the booleans true/false, and cannot be referenced from the condition.

String

A string — more precisely a pattern — is a piece of data that YARA-X searches for. Patterns are declared in the optional strings section, and each is given an identifier that starts with $. There are three kinds of pattern:

Pattern type | Example declaration | Matches
Text | $text = "text here" | A literal UTF-8 string
Hexadecimal | $hex = { E2 34 A1 C8 23 FB } | A raw byte sequence
Regular expression | $regex = /some regular expression: \w+/ | A regex match
Illustrative example — the three pattern types
rule ExampleRule {
    strings:
        $text = "text here"
        $hex = { E2 34 A1 C8 23 FB }
        $regex = /some regular expression: \w+/
    condition:
        any of them
}

tip

The strings section is optional. A rule can match purely on logic from a module — for example testing a PE header field — without declaring any patterns at all.
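
As a rough mental model of the three pattern kinds, the sketch below searches one buffer using only the Python standard library. It does not use YARA-X: the buffer contents and variable names are made up for illustration, and it ignores features such as pattern modifiers and hex wildcards.

```python
import re

# Hypothetical input buffer containing one occurrence of each example pattern.
data = b"prefix text here \xe2\x34\xa1\xc8\x23\xfb some regular expression: abc"

# Text pattern: a literal byte string.
text_hit = data.find(b"text here")

# Hexadecimal pattern: the same search, with the needle written as hex bytes.
hex_hit = data.find(bytes.fromhex("E2 34 A1 C8 23 FB"))

# Regular expression pattern: a bytes regex applied to the buffer.
regex_hit = re.search(rb"some regular expression: \w+", data)

# Model of `any of them`: true if at least one pattern was found.
any_of_them = text_hit != -1 or hex_hit != -1 or regex_hit is not None
print(any_of_them)  # True
```

Each search yields an offset into the buffer, which is the information a condition can later test.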

Condition

The condition is the mandatory section where the matching logic lives. It is a boolean expression: if it evaluates to true for a given input, the rule matches. The condition refers to previously defined patterns by their identifiers (such as $a), and can combine them with boolean operators, quantifiers such as any of them and all of them, and tests on a pattern's match count (#a) or match offset (@a).

Illustrative example — a condition combining patterns
rule TwoOfThree {
    strings:
        $a = "foo"
        $b = "bar"
        $c = "baz"
    condition:
        2 of ($a, $b, $c)
}
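
One way to read 2 of ($a, $b, $c) is "at least two distinct patterns from the set occur in the data". The sketch below models that reading in plain Python. The helper names matched_patterns and n_of are hypothetical, and this is a conceptual model, not a description of how YARA-X evaluates conditions.

```python
# Conceptual model only: the same three literals as the TwoOfThree rule.
PATTERNS = {"$a": b"foo", "$b": b"bar", "$c": b"baz"}


def matched_patterns(data: bytes) -> set[str]:
    """Return the identifiers of the patterns that occur at least once."""
    return {ident for ident, literal in PATTERNS.items() if literal in data}


def n_of(n: int, data: bytes) -> bool:
    """Model `n of ($a, $b, $c)`: at least n distinct patterns matched."""
    return len(matched_patterns(data)) >= n


print(n_of(2, b"foo ... bar"))  # True: two distinct patterns occur
print(n_of(2, b"foo foo foo"))  # False: one pattern, repeated
```

Repeated occurrences of a single pattern count once here, which is why the second buffer does not satisfy the quantifier.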

Module

A module is an extension that exposes structured, file-format-specific or helper data to the rule language. A module is brought into a rule with the import statement, after which its fields and functions are available under the module's name.

Commonly used modules include pe (Portable Executable files), elf (ELF files), math, and hash, among others.

Illustrative example — using the pe module
import "pe"

rule single_section {
    condition:
        pe.number_of_sections == 1
}

rule is_dll {
    condition:
        pe.characteristics & pe.DLL != 0
}
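
To make "structured data" concrete, the sketch below reads the two COFF header fields that correspond to pe.number_of_sections and pe.characteristics from a synthetic buffer, using Python's struct module. It is not the pe module's implementation: the function name coff_fields and the hand-built image are assumptions for illustration, and a real parser must handle truncated or malformed input.

```python
import struct

IMAGE_FILE_DLL = 0x2000  # COFF characteristics flag that marks a DLL


def coff_fields(data: bytes) -> tuple[int, int]:
    """Return (number_of_sections, characteristics) from a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # The DOS header stores the offset of the PE signature at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # COFF file header: Machine, NumberOfSections, TimeDateStamp,
    # PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader,
    # Characteristics.
    fields = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    return fields[1], fields[6]


# Synthetic image (hypothetical): a 64-byte DOS header stub followed
# directly by the PE signature and a COFF header with one section.
dos = bytearray(0x40)
dos[:2] = b"MZ"
struct.pack_into("<I", dos, 0x3C, 0x40)
coff = b"PE\0\0" + struct.pack("<HHIIIHH", 0x8664, 1, 0, 0, 0, 0, 0x2022)
image = bytes(dos) + coff

sections, characteristics = coff_fields(image)
print(sections == 1)                            # like single_section
print((characteristics & IMAGE_FILE_DLL) != 0)  # like is_dll
```

The module does this parsing once per scanned file, so rules can test the fields by name instead of decoding offsets themselves.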
Illustrative example — using the elf module
import "elf"

rule elf_64 {
    condition:
        elf.machine == elf.EM_X86_64
}

Namespace

A namespace is a label that groups rules together and isolates them from rules in other namespaces, so identical rule identifiers in separate sets do not clash. On the yr command line, a rules path can be prefixed with a namespace, with the namespace and path separated by a colon:

Place a rule file under a namespace
yr scan my_namespace:my_rules.yar /path/to/target

The --path-as-namespace flag tells yr to automatically use each source file's path as its namespace.

note

If you do not specify a namespace, rules go into a single default namespace. Namespaces matter most when you combine rule sets from different authors that might reuse the same rule names.
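
The isolation described above can be pictured as a registry keyed by the pair (namespace, rule identifier). The sketch below is a toy model in plain Python. The RuleSet class is hypothetical, the namespace name "default" is an assumption, and none of this is the YARA-X API.

```python
class RuleSet:
    """Toy registry in which rules are keyed by (namespace, identifier)."""

    def __init__(self) -> None:
        self._rules: dict[tuple[str, str], str] = {}

    def add(self, identifier: str, source: str, namespace: str = "default") -> None:
        # "default" is an assumed name for the namespace used when none is given.
        key = (namespace, identifier)
        if key in self._rules:
            raise ValueError(
                f"duplicate rule {identifier!r} in namespace {namespace!r}"
            )
        self._rules[key] = source

    def namespaces(self) -> set[str]:
        return {namespace for namespace, _ in self._rules}


rules = RuleSet()
# Two authors reuse the identifier "dropper"; separate namespaces avoid a clash.
rules.add("dropper", "rule dropper { condition: true }", namespace="vendor_a")
rules.add("dropper", "rule dropper { condition: false }", namespace="vendor_b")
print(sorted(rules.namespaces()))  # ['vendor_a', 'vendor_b']
```

Adding a second dropper rule to vendor_a would raise an error in this model, which mirrors the identifier clash that namespaces are meant to prevent.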

Compiled rules

Compiled rules are rules that have been processed from their text source into a binary form ahead of time, so they can be loaded and reused without recompiling. The yr compile command builds one or more source files into a single binary file, which is written to output.yarc unless another output path is given:

Compile sources into a binary, then scan with it
yr compile rules/test.yara
yr scan --compiled-rules output.yarc /bin

Compiling once and scanning many times avoids paying the compilation cost on every run, which matters when scanning large numbers of files.

Scanner

The scanner is the component that takes compiled rules and runs them against target data — a file, a directory of files, or an in-memory buffer — reporting which rules match. On the command line the scanner is invoked through yr scan:

Scan a target directory with a rule file
yr scan rules/test.yara /bin
yr scan --output-format ndjson rules/test.yara /bin | jq .path
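
The jq .path filter above pulls the path field out of each NDJSON record. A rough Python equivalent follows, assuming only that each output line is a JSON object with a path key. The function name paths_from_ndjson and the sample records are hypothetical, not captured yr output.

```python
import json
from typing import Iterable


def paths_from_ndjson(lines: Iterable[str]) -> list[str]:
    """Collect the `path` value from each non-empty NDJSON line."""
    paths = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        paths.append(record["path"])
    return paths


# Hypothetical records; only the `path` key is taken from the command above.
sample = '{"path": "/bin/ls"}\n{"path": "/bin/cat"}\n'
print(paths_from_ndjson(sample.splitlines()))  # ['/bin/ls', '/bin/cat']
```

In a real pipeline the lines would come from the standard output of yr scan, for example by passing sys.stdin to the function in a script placed at the end of the pipe.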

The same scanning capability is exposed programmatically through the YARA-X APIs (Rust, plus C, Python, Go, and others), where a scanner object is created from compiled rules and then run against input data.


YARA vs YARA-X

YARA is the original C-based pattern-matching tool for malware research. YARA-X is its reimplementation in Rust, designed to be faster, safer, and more user-friendly while keeping the rule language largely compatible.

Aspect | YARA | YARA-X
Implementation language | C | Rust
CLI command | yara | yr
Project direction | Receives maintenance and bug fixes | Where new modules and features are added
Rule language | Original syntax | Largely the same, with some syntax differences

info

According to the upstream project, new development effort — including new modules — now focuses on YARA-X, and VirusTotal has run YARA-X in production scanning billions of files. Existing rules generally carry over, but there are intentional syntax differences, so review the upstream migration notes before porting a large rule set.

Sources