
Yara analysis - Basics

  • brencronin
  • 11 minutes ago
  • 8 min read

What is yara?


YARA is an open-source, cross-platform tool used by malware researchers and security analysts to identify, classify, and detect malware samples based on textual or binary patterns. Often described as a "Swiss Army knife" for threat hunting, YARA works by matching rules (sets of strings combined with Boolean conditions) against files or running processes, which makes it possible to identify malware families. YARA is maintained by VirusTotal.



Ways yara can be implemented


In practice, YARA is typically used in one of three ways:


  • Running YARA on a forensic analysis system. Files are copied onto that system and scanned there.

    • A variation is to mount the file system under investigation on the analysis system, so its files can be scanned without copying them.

  • Installing YARA on the target system and scanning files in place.

  • Integrating YARA rules into endpoint security products.




Yara analysis and malware


YARA is used both for malware analysis and for scanning otherwise normal files for indicators of malicious activity, and the distinction between these use cases is important. When YARA is applied to the analysis of known or suspected malware, such as during reverse engineering or rule development, this work should be performed within an isolated forensic virtual machine to prevent accidental execution or environmental contamination. In contrast, when YARA is used operationally to scan trusted or production files against vetted detection rules, the files do not need to be copied into a forensic VM, enabling efficient, large-scale scanning directly within operational environments.



Breakdown of Yara Rules


A yara rule has the following key sections:


  • Imports

  • Rule name and tags (tags are labels that let rules be grouped and filtered)

  • meta:

  • strings:

  • condition:
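
To show how these sections fit together, here is a minimal Python sketch that assembles the text of a rule from its parts. The helper name render_rule and the values passed to it (demo_rule, the tags, the strings) are made up for illustration; only the overall layout (import line, rule name, colon-separated tags, meta, strings, condition) reflects YARA's rule syntax.

```python
def render_rule(name, tags, meta, strings, condition, imports=()):
    """Assemble YARA rule text from its sections (illustrative helper)."""
    lines = [f'import "{module}"' for module in imports]
    if lines:
        lines.append("")
    # Tags follow the rule name after a colon, separated by spaces.
    header = f"rule {name}"
    if tags:
        header += " : " + " ".join(tags)
    lines += [header, "{"]
    if meta:
        lines.append("    meta:")
        lines += [f'        {key} = "{value}"' for key, value in meta.items()]
    if strings:
        lines.append("    strings:")
        lines += [f"        ${ident} = {value}" for ident, value in strings.items()]
    lines += ["    condition:", f"        {condition}", "}"]
    return "\n".join(lines)


# All values below are placeholders, not taken from a real rule.
rule_text = render_rule(
    name="demo_rule",
    tags=["demo", "backdoor"],
    meta={"author": "author name", "description": "description of rule"},
    strings={"s1": '"value1"', "h1": "{ 6D 61 69 6E }"},
    condition="any of them",
    imports=["pe"],
)
print(rule_text)
```

Printing rule_text gives a complete rule that can be saved to a .yar file; the meta and strings sections are optional, while the condition is always required.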



yara imports


In YARA, imports enable rules to access structured, contextual information about a file or execution environment that goes beyond simple byte or string matching. By importing a module (such as pe, elf, or math), a rule can evaluate file format metadata, headers, sections, entropy, or runtime properties and incorporate that data into detection logic. This allows YARA rules to be more precise, context-aware, and resilient to false positives than rules based solely on raw byte patterns.


A YARA module is a built-in or external component that exposes structured data and functions to rules. Modules provide feature extraction, shortcuts, and calculations. They fall into two broad groups: executable-format modules and utility modules.


Executable-format modules:

  • pe – Portable Executable metadata

  • elf – ELF binaries

  • dotnet – .NET assembly metadata


Utility modules:

  • math – entropy and numeric functions

  • hash – cryptographic hashes

  • cuckoo – Cuckoo sandbox analysis reports

  • console – prints values to the console


Modules define which capabilities exist; an import statement is how a rule enables one of them.
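
As an illustration of the kind of calculation a utility module performs, the sketch below computes Shannon entropy over a byte buffer in plain Python. It is a stand-in for the idea behind the math module's entropy function, not YARA's own implementation; the function name shannon_entropy and the sample buffers are assumptions made for this example.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Sum p * log2(1/p) over every byte value that occurs in the buffer.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


low = shannon_entropy(b"A" * 1024)             # a single repeated byte value
high = shannon_entropy(bytes(range(256)) * 4)  # every byte value equally frequent
print(low, high)
```

Values close to 8 suggest compressed or encrypted content, which is why a rule might combine a string match with a condition such as math.entropy(0, filesize) > 7.0 (the threshold here is only an example).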


In the example below, the pe module is imported so that its fields and functions can be referenced in the rule's condition:


import "pe"

rule example {
   condition:
     pe.is_pe and pe.number_of_sections > 5
}

Here, number_of_sections is a field exposed by the pe module. The YARA documentation lists everything the PE module provides: https://yara.readthedocs.io/en/latest/modules/pe.html



yara rule name and tags


Tags are labels that let rules be grouped and filtered. They follow the rule name, separated from it by a colon:


rule <rule name> : tag1 tag2 tag3
{

For example, the rule CISA published in its 'BRICKSTORM Backdoor' report (BRICKSTORM Backdoor | CISA) has the following name and tags:




rule name = CISA_251165_02


tags:

  • BRICKSTORM

  • backdoor

  • installs_other_components

  • communicates_with_c2

  • exfiltrates_data


yara Metadata


meta:
author = "author name"
description = "description of rule"

For an example of a populated meta section, see the rule CISA published in its 'BRICKSTORM Backdoor' report (BRICKSTORM Backdoor | CISA).



yara strings


Three main types of strings:


  1. Human readable text

  2. Hex sequences

  3. Regular expressions


Each string is assigned to an identifier that starts with $. A common naming convention is:


$a = application

$c = command

$f = file

$IP = IP address

$p = Process

$r = Registry

$s = string


This convention is not mandatory, and some rules use identifiers such as:


$op1

$op2


A rule can also define several strings of the same kind, distinguished by a numeric suffix:


$s1 = "value1"

$s2 = "value2"

$s3 = "value3"


For an example of a strings section, see the rule CISA published in its 'BRICKSTORM Backdoor' report (BRICKSTORM Backdoor | CISA).



yara strings analysis best practices


During static analysis, analysts commonly begin with string extraction to identify human-readable artifacts that may indicate functionality, configuration details, or command-and-control behavior. Extracted strings are then reviewed to identify candidates for YARA rule development.



Common string extraction tools include:


  • The standard strings utility

  • 010 Editor, which provides additional context through file structure visualization


For many executable formats (e.g., EXEs, DLLs, ELFs), the most informative strings are often concentrated toward the latter portion of the file, where configuration data, encoded resources, or embedded commands are stored. As a result, analysts often find value in starting analysis near the end of the file and working upward rather than scanning sequentially from the beginning.


It is also important to understand minimum string length thresholds. Most tools apply a default minimum (for example, 010 Editor defaults to five characters), which means shorter, but potentially meaningful, strings may be excluded unless the threshold is adjusted.
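
The effect of the minimum-length threshold is easy to see with a small strings-style extractor. The following is a simplified Python sketch (printable ASCII only, no UTF-16 handling), not a replacement for the tools above; the function name extract_strings and the sample bytes are invented for the example.

```python
import re


def extract_strings(data: bytes, min_len: int = 4):
    """Yield (offset, text) for runs of printable ASCII at least min_len long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")


# Made-up buffer containing one short and two longer strings.
sample = b"\x00\x01cmd\x00/bin/sh\x00\xffhttps://example.invalid\x00"
short_threshold = list(extract_strings(sample, min_len=3))
longer_threshold = list(extract_strings(sample, min_len=5))
print(short_threshold)
print(longer_threshold)
```

With a threshold of 5 the three-character string "cmd" disappears from the results, which is exactly the kind of short but meaningful artifact that a default setting can hide. The offsets also make it straightforward to sort results and review the end of the file first.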


Because modern binaries can contain thousands of strings, specialized tools can help prioritize analysis by highlighting the most relevant ones:


  • FLOSS (FireEye Labs Obfuscated String Solver) – Automatically detects and decodes obfuscated and runtime-decrypted strings

  • StringSifter – Uses machine-learning techniques to rank strings by analytical relevance and surface the most suspicious or useful artifacts


Example Brickstorm yara rule strings



$s7 = fs.(*WebServer).RunServer.


The string fs.(*WebServer).RunServer corresponds to a method within a custom Go package (commonly referenced as wssoft) and is indicative of BRICKSTORM, a Go-based backdoor that targets VMware vCenter environments. BRICKSTORM is capable of instantiating an embedded web server, performing file system and directory manipulation, uploading and downloading files, executing shell commands, and providing SOCKS proxying to support lateral movement. Command-and-control communications are conducted over WebSockets to hard-coded C2 infrastructure.


To further evade detection, BRICKSTORM leverages DNS over HTTPS (DoH) for C2 resolution, issuing encrypted HTTPS requests to well-known, legitimate DoH resolvers. This technique allows the malware to obscure DNS activity within normal HTTPS traffic and bypass traditional DNS monitoring and logging controls.




The hard-coded DNS-over-HTTPS (DoH) resolver endpoints belong to:


  • Cloudflare (1.0.0.1, 1.1.1.1)

  • Google (8.8.8.8, 8.8.4.4)

  • Quad9 (9.9.9.9)


Analytical significance

  • Explicit DoH usage allows malware to:

    • Bypass traditional DNS logging

    • Evade network-based detections

    • Blend into legitimate HTTPS traffic

  • Hardcoding multiple resolvers provides resilience and failover
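
During triage, a quick way to surface this behavior is to check extracted strings for HTTPS URLs that point at the resolver addresses listed above. The Python sketch below does that; the function name find_doh_references is hypothetical, and the sample strings (including the /dns-query path) are assumptions for illustration rather than values copied from the CISA rule.

```python
# Resolver IPs come from the list above; provider names are the ones given there.
DOH_RESOLVERS = {
    "1.1.1.1": "Cloudflare",
    "1.0.0.1": "Cloudflare",
    "8.8.8.8": "Google",
    "8.8.4.4": "Google",
    "9.9.9.9": "Quad9",
}


def find_doh_references(strings):
    """Return (provider, string) pairs for strings embedding an HTTPS URL to a listed resolver."""
    hits = []
    for text in strings:
        for ip, provider in DOH_RESOLVERS.items():
            if f"https://{ip}/" in text:
                hits.append((provider, text))
                break
    return hits


extracted = [
    "https://1.1.1.1/dns-query",   # hypothetical extracted string
    "https://9.9.9.9/dns-query",   # hypothetical extracted string
    "GET /index.html HTTP/1.1",    # unrelated string, should not be flagged
]
doh_hits = find_doh_references(extracted)
print(doh_hits)
```

A hit on its own is weak evidence, since legitimate software also uses DoH; it becomes more interesting when several resolvers appear together in a binary that has no obvious reason to resolve names itself.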


Why write yara rules in hex versus ascii?


Writing strings in hex is a deliberate engineering choice, not a convenience issue.


Text strings in YARA match ASCII only by default, unless modifiers such as wide or nocase are added:

$s = "main.startNew"

But in real binaries, the same string may appear as:

  • ASCII

  • UTF-16LE (m\x00a\x00i\x00n\x00...)

  • Mixed encoding

  • Embedded in binary blobs


Hex strings match exactly the bytes you specify, so the encoding is spelled out in the rule rather than assumed:

$s = { 6D 61 69 6E 2E 73 74 61 72 74 4E 65 77 }

This removes guesswork and makes the rule deterministic.
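
To make the encoding point concrete, the Python sketch below renders the same text as a hex string for two encodings. The helper name to_yara_hex is made up for this example. Note that the ASCII form and the UTF-16LE form are different byte sequences, so a rule that should catch both needs both hex strings (or a text string with the wide and ascii modifiers).

```python
def to_yara_hex(text: str, encoding: str = "ascii") -> str:
    """Render text as a YARA hex string for the given encoding."""
    return "{ " + " ".join(f"{byte:02X}" for byte in text.encode(encoding)) + " }"


ascii_form = to_yara_hex("main.startNew")
wide_form = to_yara_hex("main", encoding="utf-16-le")
print("$s_ascii =", ascii_form)
print("$s_wide  =", wide_form)
```

The ASCII output reproduces the hex string shown above, while the UTF-16LE output interleaves a 00 byte after each character.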


Malware authors may also try to obfuscate strings, for example by splitting them. Hex strings support wildcards and jumps, which make it easy to match one part of a pattern followed, within X bytes, by another part.
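
The jump idea can be emulated in Python with a bounded gap in a byte-level regular expression. This is only an approximation for building intuition, roughly equivalent to a YARA hex string of the form { first [0-N] second }; the function name jump_pattern, the split point, and the gap size are assumptions for the example.

```python
import re


def jump_pattern(first: bytes, second: bytes, max_gap: int):
    """Compile a regex that behaves like the YARA hex string { first [0-max_gap] second }."""
    gap = b".{0,%d}?" % max_gap
    # DOTALL lets the gap cover any byte value, including newlines.
    return re.compile(re.escape(first) + gap + re.escape(second), re.DOTALL)


pattern = jump_pattern(b"main.", b"startNew", max_gap=4)
contiguous = pattern.search(b"\x00main.startNew\x00") is not None
split_small = pattern.search(b"main.\x90\x90startNew") is not None
split_large = pattern.search(b"main." + b"\x90" * 10 + b"startNew") is not None
print(contiguous, split_small, split_large)
```

The pattern still matches when up to four filler bytes separate the two halves, and stops matching once the gap exceeds the limit, which keeps the rule tolerant of small insertions without matching unrelated data far apart.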


yara condition


The condition references the rule's strings and combines them with Boolean logic to decide whether the rule matches. Example:

condition:
	any of them

Conditions can use several types of operators:


  • Boolean operators

  • Relational operators

  • Arithmetic operators

  • Bitwise operators


A good idea is to group different strings into high, medium, and low fidelity, then create conditions based on those.
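
One way to reason about tiered conditions before writing them is to prototype the logic in Python. The sketch below mirrors a condition such as any of ($high*) or 2 of ($med*) or (1 of ($med*) and 3 of ($low*)). The indicator lists, thresholds, and function names are hypothetical choices for illustration, not a recommended rule.

```python
def count_hits(data: bytes, patterns) -> int:
    """Count how many distinct patterns occur in data."""
    return sum(1 for pattern in patterns if pattern in data)


def tiered_match(data: bytes, high, medium, low) -> bool:
    """Mirror: any of ($high*) or 2 of ($med*) or (1 of ($med*) and 3 of ($low*))."""
    high_hits = count_hits(data, high)
    medium_hits = count_hits(data, medium)
    low_hits = count_hits(data, low)
    return high_hits >= 1 or medium_hits >= 2 or (medium_hits >= 1 and low_hits >= 3)


# Hypothetical indicator tiers, for illustration only.
HIGH = [b"fs.(*WebServer).RunServer"]
MEDIUM = [b"/dns-query", b"socks5"]
LOW = [b"websocket", b"upload", b"download", b"/bin/sh"]

benign = b"GET /download HTTP/1.1 Upgrade: websocket"
suspicious = b"socks5 ... https://1.1.1.1/dns-query"
print(tiered_match(benign, HIGH, MEDIUM, LOW))
print(tiered_match(suspicious, HIGH, MEDIUM, LOW))
```

Low-fidelity strings alone never trigger the match here, which is the point of tiering: common strings only count when something more specific is present as well.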


For example, the rule CISA published in its 'BRICKSTORM Backdoor' report (BRICKSTORM Backdoor | CISA) requires 8 of its strings to match before the rule triggers.


Yara Rule Files Management


You can have multiple rule files or a single master rule file.


There are three correct and commonly used ways to run YARA rules that are split across multiple files, depending on how you organize your rules and how much control you need.


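1. Use a Rule Directory (Best Practice)


The yara command takes rule files rather than a rule directory, so keep the .yar/.yara files in one directory and expand them into individual rule-file arguments.


Linux / macOS (the shell expands the wildcard)

yara -r /path/to/rules/*.yar /path/to/scan/

Windows (PowerShell; the array of paths is passed as separate arguments)

.\yara64.exe -r (Get-ChildItem C:\YaraRules\*.yar).FullName C:\TargetDirectory\

Best practice for large rule sets

  • Picks up every rule file in the directory

  • Supports modular rule organization

  • Easy updates and CI integration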


2. Use include Statements (Modular Rules)


Create a main rule file that includes others.


main.yar

include "exe_rules.yar"
include "jpg_rules.yar"
include "malware_rules.yar"

Run:

yara main.yar /path/to/scan/

Best when:

  • You want explicit control over which rules are loaded

  • You maintain rule libraries


Paths in include are relative to the including file
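
Maintaining the include list by hand gets tedious as a rule library grows, so it can be generated. The Python sketch below writes an index file with one include line per rule file found under a directory; because the index sits at the top of that directory, the relative paths resolve correctly. The function name build_index, the index.yar file name, and the demo file names are all assumptions for this example.

```python
import tempfile
from pathlib import Path


def build_index(rules_dir: Path, index_name: str = "index.yar") -> Path:
    """Write an index file containing one include line per rule file under rules_dir."""
    index_path = rules_dir / index_name
    rule_files = sorted(
        path for path in rules_dir.rglob("*")
        if path.suffix in {".yar", ".yara"} and path != index_path
    )
    lines = [
        f'include "{path.relative_to(rules_dir).as_posix()}"'
        for path in rule_files
    ]
    index_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return index_path


# Demo on a throwaway directory tree; the file names are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "malware").mkdir()
    (root / "malware" / "apt.yar").write_text("rule apt_demo { condition: false }\n")
    (root / "pe.yara").write_text("rule pe_demo { condition: false }\n")
    (root / "notes.txt").write_text("not a rule file\n")
    index_lines = build_index(root).read_text(encoding="utf-8").splitlines()

print(index_lines)
```

The generated file is then used like any other main rule file, for example yara index.yar /path/to/scan/.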


3. Specify Multiple Rule Files on the Command Line


You can pass multiple rule files directly.

yara rules1.yar rules2.yar rules3.yar /path/to/scan/

Windows:

yara64.exe rules1.yar rules2.yar rules3.yar C:\TargetDirectory\

This is less scalable as rule count grows


Recommended Structure (Professional Setup)

yara/
├── core/
│   ├── filetypes.yar
│   ├── pe.yar
│   └── archives.yar
├── malware/
│   ├── apt.yar
│   └── ransomware.yar
└── main.yar

main.yar:

include "core/filetypes.yar"
include "malware/apt.yar"

Important Notes


Recursive scanning is separate from rule loading

  • -r applies to scan targets, not rule files


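Use -w to suppress warnings

yara -w -r main.yar target/

Compile-only test (CI/CD): yarac compiles the rules and fails on syntax errors; -C is only for loading rules that have already been compiled

yarac main.yar main.yarc
yara -C main.yarc target/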

Installing and Managing YARA


YARA binaries can be downloaded directly from the official VirusTotal GitHub releases page: https://github.com/VirusTotal/yara/releases


YARA syntax is largely consistent across versions; however, newer releases may introduce additional modules or expand existing module capabilities. A complete list of available modules and the YARA versions in which they were introduced is maintained in the official documentation: https://yara.readthedocs.io/en/stable/modules.html

To verify the version of YARA installed on your system, run:

yara --version

Instructions for building YARA from source, including required dependencies and platform-specific considerations, are available here: https://yara.readthedocs.io/en/stable/gettingstarted.html


Because installing and maintaining a full malware analysis toolchain can be time-consuming, many practitioners choose to use a purpose-built malware analysis virtual machine with YARA and related tooling preinstalled. A widely used option is REMnux, which includes an up-to-date YARA installation and a comprehensive set of malware analysis tools: https://remnux.org/


Executing Yara Rules


To run a YARA rule file against the files in a directory, pass the directory as the scan target; add the -r (recursive) option to include subdirectories.


Basic Command (Non-Recursive)


Scans only the files in a single directory:

yara rules.yar /path/to/directory

Recursive Directory Scan (Most Common)


Scans all files and subdirectories:

yara -r rules.yar /path/to/directory

Commonly Used Options (Practical)


Show matching rule names and files (default behavior)

yara -r rules.yar /path/to/directory

Print strings that matched

yara -r -s rules.yar /path/to/directory

Suppress warnings

yara -r -w rules.yar /path/to/directory

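Limit file size (avoid huge files; the size is given in bytes, here 50 MB)

yara -r --skip-larger=52428800 rules.yar /path/to/directory

Scan only certain file types (yara accepts a single scan target, so a shell wildcard would be read as extra rule files; use find to pass files one at a time)

find /path/to/directory -type f -name '*.exe' -exec yara rules.yar {} \;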

Example (Realistic DFIR Use)

yara -r -s --skip-larger=104857600 suspicious.yar C:\Evidence\DiskImage\

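Exit Codes (Important for Automation)

Exit Code    Meaning
0            Scan completed without errors, whether or not any rule matched
Non-zero     An error occurred (for example, a rule file failed to compile)

Because the exit code does not indicate whether anything matched, automation should inspect the scan output (for example, check whether it is empty).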
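
In automation it is usually more informative to parse the match lines than to rely on the exit status alone. The Python sketch below parses the default output format (one "rule name, space, file path" line per match) and wraps a scan in subprocess. The function names, the sample rule name, and the sample paths are invented for the example, and run_scan assumes a yara binary is available on PATH.

```python
import subprocess


def parse_matches(output: str):
    """Turn default yara output ('RULE_NAME file_path' per line) into (rule, path) pairs."""
    matches = []
    for line in output.splitlines():
        if not line.strip():
            continue
        # Split on the first space only, so paths containing spaces stay intact.
        rule, _, path = line.partition(" ")
        matches.append((rule, path))
    return matches


def run_scan(rules_file: str, target: str):
    """Run yara recursively and return (exit_code, matches). Assumes yara is on PATH."""
    result = subprocess.run(
        ["yara", "-r", rules_file, target],
        capture_output=True, text=True, check=False,
    )
    return result.returncode, parse_matches(result.stdout)


# Sample text in the default output format; the rule name and paths are made up.
sample_output = (
    "suspicious_rule /evidence/a.bin\n"
    "suspicious_rule /evidence/dir with space/b.dll\n"
)
parsed = parse_matches(sample_output)
print(parsed)
```

A pipeline can then treat a non-empty match list as a finding and a non-zero exit code as a failed scan, keeping the two conditions separate. Options that add fields to each line (such as printing tags or metadata) would need a different parser.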

Windows Command Prompt (cmd.exe)


Recursive scan of a directory

yara64.exe -r rules.yar C:\Path\To\Directory

Show matched strings

yara64.exe -r -s rules.yar C:\Path\To\Directory

Suppress warnings

yara64.exe -r -w rules.yar C:\Path\To\Directory

Limit file size (recommended; the size is given in bytes, here 100 MB)

yara64.exe -r --skip-larger=104857600 rules.yar C:\Path\To\Directory

Quote paths with spaces

yara64.exe -r rules.yar "C:\Program Files\Target Directory"

Windows PowerShell


PowerShell parses arguments differently, so quoting is more important.


Basic recursive scan

.\yara64.exe -r rules.yar C:\Path\To\Directory

With matched strings

.\yara64.exe -r -s rules.yar C:\Path\To\Directory

With spaces in paths

.\yara64.exe -r rules.yar "C:\Program Files\Target Directory"

Capture output to a file

.\yara64.exe -r rules.yar C:\Path\To\Directory | Out-File yara_results.txt

PowerShell-Native Enumeration (More Control)


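When more control over file selection is needed, PowerShell can enumerate files and pass them individually to YARA. The rules are recompiled on every invocation, so this is slower than a single recursive scan: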

Get-ChildItem C:\Path\To\Directory -Recurse -File | ForEach-Object {
    .\yara64.exe rules.yar $_.FullName
}
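
The same enumeration idea, sketched in Python for analysts who script outside PowerShell: walk a directory tree and keep only the files worth scanning, filtered by extension and size. The function name iter_scan_targets, its parameters, and the demo files are assumptions for this example; each yielded path could then be handed to yara, for instance through subprocess.

```python
import os
import tempfile


def iter_scan_targets(root, extensions=None, max_bytes=None):
    """Yield file paths under root, optionally filtered by extension and size."""
    wanted = {ext.lower() for ext in extensions} if extensions else None
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if wanted and os.path.splitext(name)[1].lower() not in wanted:
                continue
            if max_bytes is not None and os.path.getsize(path) > max_bytes:
                continue
            yield path


# Demo on temporary files; names and sizes are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    for name, size in [("a.exe", 10), ("notes.txt", 10), ("big.EXE", 2000)]:
        with open(os.path.join(tmp, name), "wb") as handle:
            handle.write(b"\x00" * size)
    selected = [
        os.path.basename(path)
        for path in iter_scan_targets(tmp, extensions={".exe"}, max_bytes=1000)
    ]

print(selected)
```

Filtering before scanning keeps very large or irrelevant files out of the run, at the cost of possibly missing malware that hides behind an unexpected extension.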

References


YARA Rules for Beginners: A Practical Guide to Threat Hunting


Lets Defend: How to Install YARA on Windows


Awesome-Yara


Linux Binary Analysis for Reverse Engineering and Vulnerability Discovery


Basics of Binary Analysis


10 ways to analyze binary files on Linux


Yara-Rules Project (Note last update 2022)


Bad VIB(E)s Part Two: Detection and Hardening within ESXi Hypervisors


BRICKSTORM Backdoor malware Analysis report


yaraHub


EDR: Understanding and Deploying Yara Components


Florian Roth Nextron Github




 
 
 

©2021 by croninity. Proudly created with Wix.com
