CSP-CERT® Malware Research:
Analyzing Weaponized Documents

by CSP CERT® Harli Aquino
posted August 2018


Cybercriminals commonly spread malware through mass spam campaigns. Over the years they have used different social engineering techniques to lure users into opening and, consequently, executing malicious attachments.

Such attachments often contain embedded malicious code or exploits that take advantage of software features or vulnerabilities. These "weaponized documents", as the industry calls them, are very effective at spreading malware for the following reasons:

  • Many applications that can open Microsoft Office, PDF, and RTF documents contain vulnerabilities.
  • Cybercriminals can use certain application features to run malicious code. Examples of these features are macro execution and automatic loading of URLs and embedded files.
  • Creating legitimate-looking documents that can deceive users is relatively easy.

In this article, we will discuss a few document features and vulnerabilities, and how we can extract or analyze malicious elements using open-source tools.

Interactive PDF Features

PDF documents can contain interactive elements such as annotations, buttons, form fields, and rich media. Cybercriminals have found that, instead of writing complicated code, they can use such features to lure users into opening phishing pages. The documents only need to look real enough that users do not hesitate to click certain buttons.

Sample: d3ac19cfe27e653932d38d568233bd0ac508c41864a94b9cc0905c265a4c71fe

Annots objects enable users to add comments, editing markup, and other items to PDF documents. However, they also enable cybercriminals to display user interface elements that hide malicious phishing links from plain sight. In this sample, the buttons "CONFIRM ORDER" and "CANCEL ORDER" link to the same phishing page.

  1. Use peepdf (https://github.com/jesparza/peepdf) to see the links that the buttons open.
  2. Extract the URIs and then check them against URL scanners. In this case, the URI is a shortened URL that leads to further redirections.
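
The URI extraction step can also be approximated with a few lines of Python when peepdf is not at hand. The following is a minimal sketch, not a replacement for peepdf: the function name `extract_uris` is hypothetical, and the code assumes that the link actions are stored uncompressed and that the URI strings contain no escaped parentheses. URIs hidden inside compressed object streams will not be found this way.

```python
import re

# Matches the "/URI (http://...)" entry of a PDF URI action dictionary.
# Assumption: the string is a plain literal without escaped parentheses.
URI_ACTION = re.compile(rb"/URI\s*\(([^)]*)\)")


def extract_uris(pdf_bytes):
    """Return the unique URIs found in raw PDF bytes, in order of appearance."""
    seen = []
    for match in URI_ACTION.finditer(pdf_bytes):
        uri = match.group(1).decode("latin-1")
        if uri not in seen:
            seen.append(uri)
    return seen


# Example usage (the file name is a placeholder):
# with open("sample.pdf", "rb") as handle:
#     for uri in extract_uris(handle.read()):
#         print(uri)
```

Because several buttons may point to the same page, as in this sample, the sketch removes duplicates before returning the list.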

Microsoft Office VBA Macros

VBA macros are useful for automating tasks such as generating reports from spreadsheet data. However, cybercriminals also use macros to run malicious code. In newer versions of Microsoft Office, automatic execution of VBA macros is disabled by default, but a little social engineering is often enough to trick users into enabling it.

Sample: 5e50f1753e0917f9254c6732ef5aca37fd49617dc653dd8e0c40daac2c181c08

Malicious documents often include a convincing message that prompts users to enable macros. Following the prompt causes the VBA code to run automatically.


Viewing Code

Warning: Analyzing VBA macros in this manner allows the malicious code to automatically execute.

  1. Press Alt+F8 to open the "Macros" window, which lists the available VBA macros in the document.
  2. Click "Edit" to open the Visual Basic for Applications IDE, which allows you to debug the malicious code and see its actual intent. Such code is usually not password-protected.

Extracting Code

You can extract VBA macro code using the "olevba" tool from the "python-oletools" package (https://github.com/decalage2/oletools/wiki). olevba is a script that parses files such as Microsoft Office documents to detect VBA macros and extract their source code in clear text.
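
The same extraction can be scripted through the Python interface of oletools, which reads the macro source without opening the document in Microsoft Office. The sketch below assumes that oletools is installed (`pip install oletools`). The helper names `dump_macros` and `find_autoexec`, and the short keyword list, are illustrative additions and not part of oletools.

```python
# Macro names that Office runs automatically. Illustrative, not exhaustive.
AUTOEXEC_NAMES = ("AutoOpen", "AutoExec", "Document_Open", "Workbook_Open", "Auto_Open")


def find_autoexec(vba_code):
    """Return the auto-run macro names that appear in the given VBA source."""
    lowered = vba_code.lower()
    return [name for name in AUTOEXEC_NAMES if name.lower() in lowered]


def dump_macros(path):
    """Return (module name, source code) pairs for each VBA module in the file."""
    # Imported here so that the keyword helper above works without oletools.
    from oletools.olevba import VBA_Parser

    parser = VBA_Parser(path)
    try:
        if not parser.detect_vba_macros():
            return []
        return [
            (vba_filename, vba_code)
            for _file, _stream, vba_filename, vba_code in parser.extract_macros()
        ]
    finally:
        parser.close()


# Example usage (the file name is a placeholder):
# for name, code in dump_macros("sample.doc"):
#     print(name, find_autoexec(code))
```

The olevba command-line tool performs a much broader keyword analysis of its own. The small check here only shows where such triage would plug in.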


Cybercriminals also embed exploits for application vulnerabilities in weaponized documents so that arbitrary code or malware runs automatically when the document is opened. Many recent, well-known Microsoft Office vulnerabilities are commonly exploited through RTF files.

Sample: 36e8d7a55d1a6e644a1b79d33c56e8cd05b838739345e7f833d28f8eb772a9d5

To inspect such files, you can use open-source tools such as "rtfobj", which is also part of the "python-oletools" package. "rtfobj" is a Python module that detects and extracts objects embedded in RTF files.
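
To illustrate what such a tool looks for, the sketch below decodes the hexadecimal payload that follows each `\objdata` control word in an RTF file. It is a deliberately naive approximation that uses only the standard library: the names `iter_objdata` and `mentions_equation_editor` are hypothetical, and the code assumes that the hex data is not interrupted by extra control words or groups, a trick that malicious RTF files often use and that rtfobj handles far more robustly. The check for the "Equation." class name is an added assumption about how an Equation Editor object typically identifies itself.

```python
import binascii
import re

# "\objdata" followed by hex digits, up to the next brace or control word.
OBJDATA = re.compile(rb"\\objdata(?![a-z])([^{}\\]*)")
NON_HEX = re.compile(rb"[^0-9A-Fa-f]")


def iter_objdata(rtf_bytes):
    """Yield (offset, decoded bytes) for each \\objdata payload found."""
    for match in OBJDATA.finditer(rtf_bytes):
        # Whitespace and line breaks between hex digits are ignored.
        hex_digits = NON_HEX.sub(b"", match.group(1))
        if len(hex_digits) % 2:
            hex_digits = hex_digits[:-1]  # drop a dangling half byte
        yield match.start(), binascii.unhexlify(hex_digits)


def mentions_equation_editor(blob):
    """Return True if the decoded object names an Equation Editor class."""
    return b"equation." in blob.lower()


# Example usage (the file name is a placeholder):
# with open("sample.rtf", "rb") as handle:
#     for offset, blob in iter_objdata(handle.read()):
#         print(hex(offset), len(blob), mentions_equation_editor(blob))
```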



Sample: CVE-2017-11882 exploit

CVE-2017-11882 involves a Microsoft Office component that contains a stack buffer overflow vulnerability. In this case, you can analyze the shellcode based on the following report from Palo Alto Networks: https://researchcenter.paloaltonetworks.com/2017/12/unit42-analysis-of-cve-2017-11882-exploit-in-the-wild/ .

Analyzing the Exploit

Required Tools

  • OllyDbg 1.10 (ollydbg.exe)
  • Microsoft® NT Global Flags Manipulator (gflags.exe)

  1. Configure "gflags.exe" to trigger a debugging session on "EQNEDT32.exe" using your preferred debugger. In this case, we will set the debugger to OllyDbg 1.10.
  2. Click "Apply" and then open the RTF exploit document. Microsoft Word loads "EQNEDT32.exe" and then OllyDbg opens.
  3. Set a breakpoint on 0x411655, which is in EQNEDT32's context, where the stack buffer overflow is expected to occur.
  4. Press F9. The EDX register holds the address of a font name.
  5. Press F9 repeatedly until the value of the EDX register no longer shows a valid font name. The EDX register eventually holds the address of the exploit shellcode (0x0012F3D0).
  6. Press Ctrl+G and then type "EDX" to follow the address in the debugger window.
  7. Set a breakpoint on the shellcode instructions and then debug it to see its intended behavior.
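
The gflags.exe configuration in the first step can also be scripted. As far as we understand, the debugger setting is stored as a "Debugger" value under the "Image File Execution Options" registry key for the target image, so the sketch below writes that value directly. The function names and the OllyDbg path in the usage comment are hypothetical. The write must run on Windows from an elevated prompt, and differences between 32-bit and 64-bit registry views are not handled here.

```python
IFEO_ROOT = r"SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options"


def ifeo_debugger_entry(image_name, debugger_path):
    """Return (key path, value name, value data) for a per-image debugger."""
    key_path = IFEO_ROOT + "\\" + image_name
    # Quote the path so that directories containing spaces still work.
    return key_path, "Debugger", '"%s"' % debugger_path


def set_ifeo_debugger(image_name, debugger_path):
    """Write the debugger entry under HKEY_LOCAL_MACHINE (Windows only)."""
    import winreg  # available only on Windows, so imported lazily

    key_path, name, value = ifeo_debugger_entry(image_name, debugger_path)
    with winreg.CreateKeyEx(
        winreg.HKEY_LOCAL_MACHINE, key_path, 0, winreg.KEY_SET_VALUE
    ) as key:
        winreg.SetValueEx(key, name, 0, winreg.REG_SZ, value)


# Example usage (the debugger path is a placeholder):
# set_ifeo_debugger("EQNEDT32.EXE", r"C:\tools\odbg110\ollydbg.exe")
```

Remove the value after the analysis. Otherwise every later launch of "EQNEDT32.exe" on the analysis machine will start under the debugger.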