Strings & patterns
The strings section of a YARA-X rule defines the patterns you want to look for inside a file or process memory. The condition section then decides how those patterns combine into a match. YARA-X supports three kinds of patterns:
- Text strings: quoted literals such as "foobar".
- Hex strings: byte sequences with wildcards, jumps, and alternatives.
- Regular expressions: delimited by /.../, with the usual regex syntax.
Every pattern has an identifier that starts with $, and can be tuned with one or more modifiers.
YARA-X is the Rust reimplementation of YARA. The rule language is highly compatible with the original, but a few behaviors differ — base64 matching in particular is more precise and produces fewer false positives. Where behavior diverges from classic YARA, this page calls it out.
Text strings
A text string is a sequence of bytes written as a double-quoted literal. By default, matching is case-sensitive and uses one byte per character (ASCII).
rule TextExample {
    strings:
        $a = "foobar"
        $b = "https://example.com"
    condition:
        $a and $b
}
Text strings accept escape sequences such as \n, \t, \", \\, and \xHH for an arbitrary byte.
String modifiers
Modifiers are appended after the pattern definition and change how the string is matched. This page covers the following modifiers:
| Modifier | Effect |
|---|---|
| nocase | Case-insensitive matching. |
| wide | Matches the two-bytes-per-character (interleaved zero byte) encoding. |
| ascii | Matches the plain one-byte-per-character form. This is the default, so it only needs to be written alongside wide to keep the one-byte form. |
| fullword | Match only when the pattern is delimited by non-alphanumeric characters. |
| xor | Match the string XORed with a single-byte key (optionally restricted to a key range). |
| base64 | Match the three base64 encodings of the string. |
| base64wide | Like base64, but the base64 result is then wide-encoded. |
You can combine several modifiers on the same pattern, subject to the constraints described below.
nocase
nocase turns a pattern into a case-insensitive one.
$text = "foobar" nocase
This matches foobar, Foobar, FOOBAR, fOoBaR, and so on.
wide and ascii
wide searches for strings encoded with two bytes per character — common in Windows executables. The literal Borland is matched as B\x00o\x00r\x00l\x00a\x00n\x00d\x00.
$wide_only = "Borland" wide
$wide_and_ascii = "Borland" wide ascii
When wide is present, the ascii modifier brings back matching of the plain one-byte form as well, so the second pattern above matches both encodings. ascii on its own is redundant because patterns are ASCII by default.
wide is not full UTF-16. It simply interleaves each byte of the string with a zero byte, so it only covers characters that fit in a single byte (English text and similar).
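The interleaving described above can be sketched in a few lines of Python. This is only an illustration of the byte layout, not YARA-X code; the helper name `wide` is made up for this example.

```python
def wide(data: bytes) -> bytes:
    """Interleave every byte of `data` with a zero byte (hypothetical helper)."""
    out = bytearray()
    for b in data:
        out.append(b)
        out.append(0)
    return bytes(out)


print(wide(b"Borland"))
# For plain ASCII text the interleaved form coincides with UTF-16LE.
print(wide(b"Borland") == "Borland".encode("utf-16-le"))
```

The equality only holds for single-byte characters, which is why wide should not be treated as a general UTF-16 search.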
fullword
fullword requires the match to be delimited by characters that are not alphanumeric.
$text = "domain" fullword
This matches domain in www.my-domain.com but not in www.mydomain.com, because in the latter domain is immediately preceded by the letter y.
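As a rough model of the delimiter check, the Python sketch below accepts an occurrence only when the bytes on both sides are non-alphanumeric (or the data ends there). The function `fullword_offsets` is a hypothetical helper for illustration and says nothing about how YARA-X implements the modifier internally.

```python
def fullword_offsets(data: bytes, word: bytes) -> list[int]:
    """Return offsets where `word` is delimited by non-alphanumeric bytes."""
    def is_alnum(byte: int) -> bool:
        # bytes.isalnum() is True only for ASCII letters and digits.
        return bytes([byte]).isalnum()

    hits = []
    start = data.find(word)
    while start != -1:
        end = start + len(word)
        before_ok = start == 0 or not is_alnum(data[start - 1])
        after_ok = end == len(data) or not is_alnum(data[end])
        if before_ok and after_ok:
            hits.append(start)
        start = data.find(word, start + 1)
    return hits


print(fullword_offsets(b"www.my-domain.com", b"domain"))
print(fullword_offsets(b"www.mydomain.com", b"domain"))
```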
xor
xor searches for the string after it has been XORed with a single byte. With no arguments it tries all 256 possible single-byte keys.
$xor = "This program cannot" xor
You can restrict the key to a range with xor(min-max):
$xor = "This program cannot" xor(0x01-0xff)
xor can be combined with wide and ascii. When combined with wide, the XOR is applied after the wide transformation, so the interleaved zero bytes are XORed with the key as well:
$xor = "This program cannot" xor wide ascii
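To make the key range and the ordering with wide concrete, here is a minimal Python sketch that enumerates the byte sequences such patterns stand for. Both helpers (`wide`, `xor_variants`) are hypothetical names for this illustration; a real scanner would not need to materialize every variant like this.

```python
def wide(data: bytes) -> bytes:
    """Interleave each byte with a zero byte (hypothetical helper)."""
    return b"".join(bytes([b, 0]) for b in data)


def xor_variants(data: bytes, lo: int = 0x00, hi: int = 0xFF) -> dict[int, bytes]:
    """Map every key in [lo, hi] to `data` XORed with that key (hypothetical helper)."""
    return {key: bytes(b ^ key for b in data) for key in range(lo, hi + 1)}


everything = xor_variants(b"This program cannot")            # like a bare xor
limited = xor_variants(b"This program cannot", 0x01, 0xFF)   # like xor(0x01-0xff)
print(len(everything), len(limited))

# Key 0x00 leaves the string unchanged, so the 0x01-0xff range skips the plaintext.
print(everything[0x00])

# wide first, XOR second: the interleaved zero bytes turn into the key byte.
print(xor_variants(wide(b"Th"))[0x20])
```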
base64 and base64wide
base64 matches the three possible base64 encodings of a string (one for each byte-alignment offset). base64wide does the same and then applies the wide encoding to the result.
$a = "This program cannot" base64
$b = "This program cannot" base64wide
Both modifiers accept a custom 64-character alphabet:
$a = "This program cannot" base64("!@#$%^&*(){}[]....custom 64-char alphabet....")
Unlike classic YARA, YARA-X does not generate the false-positive matches that base64 patterns are prone to, and it lets you specify different alphabets for base64 and base64wide on the same pattern.
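The "three possible encodings" come from the fact that base64 works on 3-byte groups, so the encoded form of a string depends on where it starts within a group. The Python sketch below derives the three alignment-independent fragments using the standard alphabet. It is an illustration under that assumption, with a made-up helper name (`base64_variants`), and is not taken from the YARA-X source.

```python
import base64


def base64_variants(data: bytes) -> list[bytes]:
    """Return the three alignment-independent base64 fragments of `data`."""
    variants = []
    for pad in range(3):
        # Pretend `pad` unknown bytes precede the data in the encoded stream.
        encoded = base64.b64encode(b"\x00" * pad + data)
        # Drop the '=' padding and the last character before it, because that
        # character would change depending on the bytes that follow the data.
        stripped = encoded.rstrip(b"=")
        if len(stripped) != len(encoded):
            stripped = stripped[:-1]
        # Drop the leading characters that depend on the preceding bytes.
        variants.append(stripped[(0, 2, 3)[pad]:])
    return variants


for fragment in base64_variants(b"This program cannot"):
    print(fragment.decode())
```

Each printed fragment is a substring of the base64 text of any buffer that contains the original string at the corresponding alignment.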
Modifier combinations and constraints
A few combinations are rejected at compile time:
- base64 and base64wide require the pattern to be at least 3 bytes long.
- Using xor, fullword, or nocase together with base64 or base64wide causes a compiler error.
- nocase cannot be combined with xor (nor with base64/base64wide, per the rule above).
| Combination | Allowed? |
|---|---|
| wide + ascii | Yes |
| xor + wide + ascii | Yes |
| base64 + custom alphabet | Yes |
| base64 + nocase / fullword / xor | No (compiler error) |
| nocase + xor | No |
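The constraints listed above can be restated as a small pre-flight check, for example in a script that generates rules. The Python sketch below is a toy: `check_modifiers` and its messages are hypothetical, and the authoritative check is whatever the YARA-X compiler reports.

```python
def check_modifiers(pattern: bytes, modifiers: set[str]) -> list[str]:
    """Return the constraint violations described on this page (hypothetical helper)."""
    problems = []
    uses_base64 = bool(modifiers & {"base64", "base64wide"})

    if uses_base64 and len(pattern) < 3:
        problems.append("base64/base64wide need a pattern of at least 3 bytes")

    clashing = sorted(modifiers & {"xor", "fullword", "nocase"})
    if uses_base64 and clashing:
        problems.append("base64/base64wide cannot be combined with " + ", ".join(clashing))

    if {"nocase", "xor"} <= modifiers:
        problems.append("nocase cannot be combined with xor")

    return problems


print(check_modifiers(b"This program cannot", {"xor", "wide", "ascii"}))
print(check_modifiers(b"ab", {"base64"}))
print(check_modifiers(b"secret", {"nocase", "xor"}))
```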
Hex strings
Hex strings describe raw byte sequences and are written inside curly braces. They are useful for matching binary opcodes or structures where some bytes vary.
rule HexExample {
    strings:
        $h = { E2 34 ?? C8 A? FB }
    condition:
        $h
}
Wildcards
The ? character is a wildcard that operates at the nibble (half-byte) level:
- ?? matches any full byte.
- A? matches any byte whose high nibble is A (i.e. A0–AF).
- ?A matches any byte whose low nibble is A.
$h = { E2 34 ?? C8 A? FB }
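One way to picture nibble wildcards is as a value/mask pair per byte: a ? nibble contributes zero bits to the mask, so it is ignored in the comparison. The Python sketch below applies that idea to the pattern above. The helpers `parse_hex_token` and `matches_at` are hypothetical and cover only plain bytes and wildcards, not jumps or alternatives.

```python
def parse_hex_token(token: str) -> tuple[int, int]:
    """Turn a token such as 'E2', 'A?', '?A' or '??' into a (value, mask) pair."""
    value = mask = 0
    for nibble in token:
        value <<= 4
        mask <<= 4
        if nibble != "?":
            value |= int(nibble, 16)
            mask |= 0xF
    return value, mask


def matches_at(data: bytes, offset: int, tokens: list[str]) -> bool:
    """Check whether the token sequence matches `data` starting at `offset`."""
    if offset < 0 or offset + len(tokens) > len(data):
        return False
    for i, token in enumerate(tokens):
        value, mask = parse_hex_token(token)
        if (data[offset + i] & mask) != value:
            return False
    return True


tokens = "E2 34 ?? C8 A? FB".split()
print(matches_at(bytes.fromhex("E23477C8A5FB"), 0, tokens))  # A5 has high nibble A
print(matches_at(bytes.fromhex("E23477C8B5FB"), 0, tokens))  # B5 does not
```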
Jumps
Square brackets introduce a jump — a variable-length run of arbitrary bytes — written as [X-Y] where X <= Y:
| Syntax | Meaning |
|---|---|
| [4-6] | Between 4 and 6 bytes |
| [6] | Exactly 6 bytes (same as [6-6]) |
| [10-] | 10 or more bytes (unbounded) |
| [-] | 0 or more bytes (unbounded) |
$h = { F4 23 [4-6] 62 B4 }
The lower bound must not exceed the upper bound.
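A jump behaves much like a bounded "any byte" repetition in a regular expression. As an analogy only, the Python sketch below maps each jump form from the table to the equivalent quantifier in Python's `re` module (in DOTALL mode, so that the dot also covers newline bytes). The helper `jump_to_regex` is hypothetical and Python's engine is not the one YARA-X uses.

```python
import re


def jump_to_regex(jump: str) -> bytes:
    """Translate a jump such as '[4-6]', '[6]', '[10-]' or '[-]' to a bytes regex."""
    body = jump.strip("[]")
    if "-" not in body:
        return b".{%d}" % int(body)
    low, high = body.split("-")
    low_n = int(low) if low else 0
    if not high:
        return b".{%d,}" % low_n
    high_n = int(high)
    if low_n > high_n:
        raise ValueError("lower bound exceeds upper bound")
    return b".{%d,%d}" % (low_n, high_n)


# Rough counterpart of { F4 23 [4-6] 62 B4 }.
pattern = re.compile(rb"\xF4\x23" + jump_to_regex("[4-6]") + rb"\x62\xB4", re.DOTALL)

print(bool(pattern.search(bytes.fromhex("F423" + "00" * 5 + "62B4"))))
print(bool(pattern.search(bytes.fromhex("F423" + "00" * 3 + "62B4"))))
```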
Alternatives
Parentheses with the | separator express a choice between several byte sequences. Alternatives may contain wildcards and can be of different lengths.
$h = { F4 23 ( 62 B4 | 56 | 45 ?? 67 ) 45 }
This matches F4 23 followed by 62 B4, or 56, or 45 <any> 67, and then 45.
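The same reading can be checked with an analogous bytes regex in Python, where the alternatives become a non-capturing group. This is a sketch for intuition only; the name `alternatives` is made up here and Python's `re` module merely stands in for the matching logic.

```python
import re

# Rough counterpart of { F4 23 ( 62 B4 | 56 | 45 ?? 67 ) 45 }.
alternatives = re.compile(rb"\xF4\x23(?:\x62\xB4|\x56|\x45.\x67)\x45", re.DOTALL)

samples = {
    "62 B4 branch": "F42362B445",
    "56 branch": "F4235645",
    "45 ?? 67 branch": "F42345996745",
    "no branch": "F4239945",
}
for label, hex_bytes in samples.items():
    print(label, bool(alternatives.fullmatch(bytes.fromhex(hex_bytes))))
```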
Regular expressions
Regular expressions are written between forward slashes, like in Perl, and are an alternative to text strings for flexible matching.
rule RegexExample {
    strings:
        $re = /foo(bar|baz)\d{2,4}/
    condition:
        $re
}
Flags
Two flags can follow the closing slash:
| Flag | Effect |
|---|---|
| i | Case-insensitive matching. |
| s | The dot (.) also matches newline characters. |
They can be combined, e.g. /foo/is.
/foo/i is equivalent to /foo/ nocase. The YARA-X documentation recommends using the nocase modifier form when defining patterns, for consistency with text strings.
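The effect of the two flags can be tried out with Python's `re` module, whose `IGNORECASE` and `DOTALL` options play the same role as i and s. This is an analogy for experimentation: Python's regex engine is a different implementation, so details beyond these two flags may not carry over.

```python
import re

data = b"FOO\nbar"

# No flags: matching is case-sensitive, so lowercase "foo" is not found.
plain = re.search(rb"foo", data)

# Counterpart of /foo/i.
ignore_case = re.search(rb"foo", data, re.IGNORECASE)

# Without the s counterpart, the dot does not cross the newline.
dot_default = re.search(rb"foo.bar", data, re.IGNORECASE)

# Counterpart of /foo.bar/is.
dot_all = re.search(rb"foo.bar", data, re.IGNORECASE | re.DOTALL)

print(plain, ignore_case, dot_default, dot_all)
```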
Modifiers on regular expressions
Regular expressions accept the same modifiers as text strings where they make sense: nocase, ascii, wide, and fullword.
$re = /md5: [0-9a-f]{32}/ nocase wide
The regex engine supports the usual constructs: anchors (^, $), alternation (|), grouping ((...)), character classes ([...], \d, \s, \w), word boundaries (\b, \B), greedy and lazy quantifiers (*, +, ?, {n,m}, and their ?-suffixed lazy forms), and escape sequences including Unicode (\x{...}, \u{...}).
A complete illustrative example
The rule below is for documentation purposes only and is not intended to detect any real-world threat. It exists to show several pattern types and modifiers working together.
rule Illustrative_PatternShowcase {
    meta:
        author = "techwriter.ai docs"
        description = "Illustrative only: demonstrates pattern types and modifiers."
    strings:
        $text = "evil-config" nocase fullword
        $wide = "Borland" wide ascii
        $b64 = "This program cannot" base64
        $xored = "secret-key" xor(0x01-0xff)
        $hex = { 6A 40 68 [4-6] FF 15 ?? ?? ?? ?? }
        $regex = /https?:\/\/[a-z0-9.-]+\/[a-z0-9]{8}/ nocase
    condition:
        2 of them
}
To scan a file against your rules from the command line, use the scan command of the yr tool, passing the rules file followed by the target:
yr scan rules.yar suspicious.bin