Strings & patterns

The strings section of a YARA-X rule defines the patterns you want to look for inside a file or process memory. The condition section then decides how those patterns combine into a match. YARA-X supports three kinds of patterns:

  • Text strings — quoted literals such as "foobar".
  • Hex strings — byte sequences with wildcards, jumps, and alternatives.
  • Regular expressions — delimited by /.../, with the usual regex syntax.

Every pattern has an identifier that starts with $, and can be tuned with one or more modifiers.

note

YARA-X is the Rust reimplementation of YARA. The rule language is highly compatible with the original, but a few behaviors differ — base64 matching in particular is more precise and produces fewer false positives. Where behavior diverges from classic YARA, this page calls it out.

Text strings

A text string is a sequence of bytes written as a double-quoted literal. By default, matching is case-sensitive and uses one byte per character (ASCII).

rule TextExample {
    strings:
        $a = "foobar"
        $b = "https://example.com"
    condition:
        $a and $b
}

Text strings accept escape sequences such as \n, \t, \", \\, and \xHH for an arbitrary byte.

String modifiers

Modifiers are appended after the pattern definition and change how the string is matched. The full set supported in YARA-X is:

Modifier      Effect
nocase        Case-insensitive matching.
wide          Matches the two-bytes-per-character encoding (each byte followed by a zero byte).
ascii         Matches the plain one-byte-per-character form. This is the default; write it explicitly only alongside wide to keep matching both encodings.
fullword      Matches only when the pattern is delimited by non-alphanumeric characters.
xor           Matches the string XORed with a single-byte key (all 256 keys, or a given key range).
base64        Matches the three possible base64 encodings of the string.
base64wide    Like base64, but the base64 result is then wide-encoded.

You can combine several modifiers on the same pattern, subject to the constraints described below.

nocase

nocase turns a pattern into a case-insensitive one.

$text = "foobar" nocase

This matches foobar, Foobar, FOOBAR, fOoBaR, and so on.

wide and ascii

wide searches for strings encoded with two bytes per character — common in Windows executables. The literal Borland is matched as B\x00o\x00r\x00l\x00a\x00n\x00d\x00.

$wide_only = "Borland" wide
$wide_and_ascii = "Borland" wide ascii

When wide is present, the ascii modifier brings back matching of the plain one-byte form as well, so the second pattern above matches both encodings. ascii on its own is redundant because patterns are ASCII by default.

info

wide is not full UTF-16. It simply interleaves each byte of the string with a zero byte, so it only covers characters that fit in a single byte (English text and similar).
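The transformation described above is simple enough to emulate outside YARA-X. The following Python sketch is only an illustration of the byte layout, not how YARA-X implements the modifier; the helper name wide is hypothetical.

```python
def wide(data: bytes) -> bytes:
    """Follow every byte with a zero byte, as the `wide` modifier does."""
    out = bytearray()
    for byte in data:
        out.append(byte)
        out.append(0)
    return bytes(out)


encoded = wide(b"Borland")
print(encoded)  # b'B\x00o\x00r\x00l\x00a\x00n\x00d\x00'

# For characters that fit in one byte, the result coincides with UTF-16LE.
print(encoded == "Borland".encode("utf-16-le"))  # True
```

Characters outside the single-byte range have no equivalent here, which is why wide should not be treated as general UTF-16 matching.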

fullword

fullword requires the match to be delimited by characters that are not alphanumeric.

$text = "domain" fullword

This matches domain in www.my-domain.com but not in www.mydomain.com, because in the latter domain is immediately preceded by a letter (y).
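The delimiter check can be sketched in a few lines of Python. This is an approximation for illustration only: it assumes "alphanumeric" means ASCII letters and digits and treats the start and end of the data as valid delimiters. The function name fullword_offsets is hypothetical.

```python
def fullword_offsets(haystack: bytes, needle: bytes) -> list[int]:
    """Return offsets where `needle` occurs delimited by non-alphanumeric bytes."""
    offsets = []
    start = haystack.find(needle)
    while start != -1:
        end = start + len(needle)
        # A one-byte slice is empty at the data boundaries, which counts as a delimiter.
        before_ok = start == 0 or not haystack[start - 1:start].isalnum()
        after_ok = end == len(haystack) or not haystack[end:end + 1].isalnum()
        if before_ok and after_ok:
            offsets.append(start)
        start = haystack.find(needle, start + 1)
    return offsets


print(fullword_offsets(b"www.my-domain.com", b"domain"))  # [7]
print(fullword_offsets(b"www.mydomain.com", b"domain"))   # []
```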

xor

xor searches for the string after it has been XORed with a single byte. With no arguments it tries all 256 possible single-byte keys.

$xor = "This program cannot" xor

You can restrict the key to a range with xor(min-max):

$xor = "This program cannot" xor(0x01-0xff)

xor can be combined with wide and ascii. When combined with wide, the XOR is applied after the wide transformation:

$xor = "This program cannot" xor wide ascii
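To make the key range and the ordering with wide concrete, here is a small Python sketch that enumerates the XORed forms. It is an emulation under stated assumptions, not YARA-X's implementation; the helper names wide and xor_variants are hypothetical.

```python
def wide(data: bytes) -> bytes:
    """Follow each byte with a zero byte (the `wide` transformation)."""
    return bytes(b for ch in data for b in (ch, 0))


def xor_variants(data: bytes, lo: int = 0x00, hi: int = 0xFF) -> dict[int, bytes]:
    """Map each key in [lo, hi] to `data` XORed with that single byte."""
    return {key: bytes(b ^ key for b in data) for key in range(lo, hi + 1)}


plain = b"This program cannot"
all_keys = xor_variants(plain)             # comparable to `xor`
nonzero = xor_variants(plain, 0x01, 0xFF)  # comparable to `xor(0x01-0xff)`
print(len(all_keys), len(nonzero))         # 256 255

# Key 0x00 leaves the data unchanged, so the 0x01-0xff range skips the plaintext.
print(all_keys[0x00] == plain)             # True

# XOR runs after `wide`, so the interleaved zero bytes become the key byte.
print(xor_variants(wide(b"Hi"), 0x41, 0x41)[0x41])  # b'\tA(A'
```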

base64 and base64wide

base64 matches the three possible base64 encodings of a string (one for each byte-alignment offset). base64wide does the same and then applies the wide encoding to the result.

$a = "This program cannot" base64
$b = "This program cannot" base64wide

Both modifiers accept a custom 64-character alphabet:

$a = "This program cannot" base64("!@#$%^&*(){}[]....custom 64-char alphabet....")

tip

Unlike classic YARA, YARA-X does not generate the false-positive matches that base64 patterns are prone to, and it lets you specify different alphabets for base64 and base64wide on the same pattern.
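Why there are exactly three encodings is easiest to see by computing them. Base64 packs 3 input bytes into 4 output characters, so the characters a string produces depend on whether it starts at offset 0, 1, or 2 within a 3-byte group. The Python sketch below derives, for each offset, the characters that are fully determined by the string itself. It uses the standard alphabet only and is an illustration of the idea, not YARA-X's actual algorithm; base64_variants is a hypothetical name.

```python
import base64


def base64_variants(plain: bytes) -> list[bytes]:
    """Return the base64 fragments that are stable at byte offsets 0, 1 and 2."""
    variants = []
    for offset in range(3):
        # Pad on both sides so the characters we keep never touch '=' padding.
        padded = b"\x00" * offset + plain + b"\x00" * 2
        encoded = base64.b64encode(padded)
        # Keep only characters whose 6 bits come entirely from `plain`.
        first = (8 * offset + 5) // 6           # ceil(8 * offset / 6)
        last = 8 * (offset + len(plain)) // 6   # floor, exclusive end
        variants.append(encoded[first:last])
    return variants


plain = b"This program cannot"
for variant in base64_variants(plain):
    print(variant.decode())

# Whatever surrounds the string, one of the fragments shows up in the encoding.
blob = base64.b64encode(b"MZ" + plain + b" be run in DOS mode")
print(any(v in blob for v in base64_variants(plain)))  # True
```

Characters at the edges that also depend on neighbouring bytes are dropped, which is why each fragment is slightly shorter than a full encoding of the string.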

Modifier combinations and constraints

A few combinations are rejected at compile time:

  • base64 and base64wide require the pattern to be at least 3 bytes long.
  • Using xor, fullword, or nocase together with base64 or base64wide causes a compiler error.
  • nocase cannot be combined with xor (nor with base64/base64wide, per the rule above).

Combination                          Allowed?
wide + ascii                         Yes
xor + wide + ascii                   Yes
base64 + custom alphabet             Yes
base64 + nocase / fullword / xor     No (compiler error)
nocase + xor                         No

Hex strings

Hex strings describe raw byte sequences and are written inside curly braces. They are useful for matching binary opcodes or structures where some bytes vary.

rule HexExample {
    strings:
        $h = { E2 34 ?? C8 A? FB }
    condition:
        $h
}

Wildcards

The ? character is a wildcard that operates at the nibble (half-byte) level:

  • ?? matches any full byte.
  • A? matches any byte whose high nibble is A (i.e. A0 through AF).
  • ?A matches any byte whose low nibble is A.

$h = { E2 34 ?? C8 A? FB }
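Nibble wildcards can be thought of as a value plus a bit mask per byte. The Python sketch below illustrates that reading for fixed-length patterns without jumps or alternatives; it is not taken from YARA-X, and the function names are hypothetical.

```python
def parse_hex_token(token: str) -> tuple[int, int]:
    """Return (value, mask) for a two-character token such as 'E2', '??', 'A?' or '?A'."""
    value = mask = 0
    for ch in token:
        value <<= 4
        mask <<= 4
        if ch != "?":
            value |= int(ch, 16)
            mask |= 0xF   # this nibble must match exactly
    return value, mask


def hex_pattern_matches(pattern: str, data: bytes) -> bool:
    """Check `data` against a fixed-length hex pattern, byte by byte."""
    tokens = pattern.split()
    if len(tokens) != len(data):
        return False
    return all(
        (byte & mask) == value
        for byte, (value, mask) in zip(data, map(parse_hex_token, tokens))
    )


print(parse_hex_token("A?"))  # (160, 240), i.e. value 0xA0 with mask 0xF0
print(hex_pattern_matches("E2 34 ?? C8 A? FB", bytes.fromhex("E2 34 00 C8 A7 FB")))  # True
print(hex_pattern_matches("E2 34 ?? C8 A? FB", bytes.fromhex("E2 34 00 C8 B7 FB")))  # False
```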

Jumps

Square brackets introduce a jump — a variable-length run of arbitrary bytes — written as [X-Y] where X <= Y:

Syntax    Meaning
[4-6]     Between 4 and 6 bytes
[6]       Exactly 6 bytes (same as [6-6])
[10-]     10 or more bytes (unbounded)
[-]       0 or more bytes (unbounded)

$h = { F4 23 [4-6] 62 B4 }

The lower bound must not exceed the upper bound.
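One way to reason about jumps is to map them onto bounded repetitions of "any byte" in a byte-oriented regular expression. The Python sketch below does that translation for patterns made of plain hex bytes and jumps only. It is an analogy built on Python's re module, not a description of how YARA-X scans; hex_with_jumps_to_regex is a hypothetical helper.

```python
import re


def hex_with_jumps_to_regex(pattern: str) -> re.Pattern:
    """Translate plain hex bytes and [X-Y] jumps into a bytes regex."""
    parts = []
    for token in pattern.split():
        if token.startswith("["):
            lo, sep, hi = token[1:-1].partition("-")
            if not sep:          # "[6]" means exactly 6 bytes
                hi = lo
            # "[-]" has no lower bound written, which means 0.
            parts.append(b".{%s,%s}" % (lo.encode() or b"0", hi.encode()))
        else:
            parts.append(re.escape(bytes.fromhex(token)))
    # DOTALL makes '.' match every byte value, including 0x0A.
    return re.compile(b"".join(parts), re.DOTALL)


rx = hex_with_jumps_to_regex("F4 23 [4-6] 62 B4")
print(rx.pattern)
print(bool(rx.search(bytes.fromhex("F4 23 00 00 00 00 62 B4"))))  # True: 4-byte gap
print(bool(rx.search(bytes.fromhex("F4 23 00 00 00 62 B4"))))     # False: 3-byte gap
```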

Alternatives

Parentheses with the | separator express a choice between several byte sequences. Alternatives may contain wildcards and can be of different lengths.

$h = { F4 23 ( 62 B4 | 56 | 45 ?? 67 ) 45 }

This matches F4 23 followed by 62 B4, or 56, or 45 <any> 67, and then 45.
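The same alternative can be expressed as a non-capturing group in a byte regex, which makes it easy to check by hand which sequences qualify. The Python sketch below handles only full bytes, ?? wildcards, and the ( | ) tokens, each separated by spaces. It is an analogy using Python's re module rather than YARA-X behavior; alternatives_to_regex is a hypothetical name.

```python
import re


def alternatives_to_regex(pattern: str) -> re.Pattern:
    """Translate hex bytes, '??' and ( a | b ) alternatives into a bytes regex."""
    structural = {"(": b"(?:", "|": b"|", ")": b")", "??": b"."}
    parts = [
        structural.get(token) or re.escape(bytes.fromhex(token))
        for token in pattern.split()
    ]
    return re.compile(b"".join(parts), re.DOTALL)


rx = alternatives_to_regex("F4 23 ( 62 B4 | 56 | 45 ?? 67 ) 45")
for candidate in ("F4 23 62 B4 45", "F4 23 56 45", "F4 23 45 99 67 45", "F4 23 57 45"):
    print(candidate, bool(rx.fullmatch(bytes.fromhex(candidate))))
```

The first three candidates correspond to the three branches of the alternative; the last one fits none of them.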

Regular expressions

Regular expressions are written between forward slashes, like in Perl, and are an alternative to text strings for flexible matching.

rule RegexExample {
    strings:
        $re = /foo(bar|baz)\d{2,4}/
    condition:
        $re
}

Flags

Two flags can follow the closing slash:

Flag    Effect
i       Case-insensitive matching.
s       The dot (.) also matches newline characters.

They can be combined, e.g. /foo/is.
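Python's re module has analogous flags (re.IGNORECASE and re.DOTALL), which makes the effect of each one easy to try out. Python uses a different regex engine than YARA-X, so the sketch below only illustrates what i and s mean, not engine-specific behavior.

```python
import re

data = b"FOO\nBAR"

# No flags: the case differs, so nothing matches.
print(re.search(rb"foo.bar", data))

# Case-insensitive only (like /foo.bar/i): '.' still refuses the newline.
print(re.search(rb"foo.bar", data, re.IGNORECASE))

# Both flags (like /foo.bar/is): '.' now matches the newline too.
print(re.search(rb"foo.bar", data, re.IGNORECASE | re.DOTALL))
```

Only the last call returns a match object; the first two print None.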

note

/foo/i is equivalent to /foo/ nocase. The YARA-X documentation recommends using the nocase modifier form when defining patterns, for consistency with text strings.

Modifiers on regular expressions

Regular expressions accept the same modifiers as text strings where they make sense: nocase, ascii, wide, and fullword.

$re = /md5: [0-9a-f]{32}/ nocase wide

The regex engine supports the usual constructs: anchors (^, $), alternation (|), grouping ((...)), character classes ([...], \d, \s, \w), word boundaries (\b, \B), greedy and lazy quantifiers (*, +, ?, {n,m}, and their ?-suffixed lazy forms), and escape sequences including Unicode (\x{...}, \u{...}).

A complete illustrative example

Illustrative example

The rule below is for documentation purposes only and is not intended to detect any real-world threat. It exists to show several pattern types and modifiers working together.

rule Illustrative_PatternShowcase {
    meta:
        author = "techwriter.ai docs"
        description = "Illustrative only — demonstrates pattern types and modifiers."
    strings:
        $text = "evil-config" nocase fullword
        $wide = "Borland" wide ascii
        $b64 = "This program cannot" base64
        $xored = "secret-key" xor(0x01-0xff)
        $hex = { 6A 40 68 [4-6] FF 15 ?? ?? ?? ?? }
        $regex = /https?:\/\/[a-z0-9.-]+\/[a-z0-9]{8}/ nocase
    condition:
        2 of them
}

To test the rule against a file from the command line, run the yr tool's scan command with the rules file followed by the target:

yr scan rules.yar suspicious.bin
