Chintan Shah, McAfee Blogs
An Inside Look into Microsoft Rich Text Format and OLE Exploits Fri, 24 Jan 2020 18:09:03 +0000

The post An Inside Look into Microsoft Rich Text Format and OLE Exploits appeared first on McAfee Blogs.


There has been a dramatic shift in the platforms targeted by attackers over the past few years. Up until 2016, browsers tended to be the most common attack vector used to exploit and infect machines, but Microsoft Office applications are now preferred, according to a report published in March 2019. The increasing use of Microsoft Office as an exploitation target poses an interesting security challenge, and weaponized documents in email attachments are now a top infection vector.

Object Linking and Embedding (OLE), a technology based on the Component Object Model (COM), is one of the features in Microsoft Office documents which allows objects created in other Windows applications to be linked or embedded into documents, thereby creating a compound document structure and providing a richer user experience. OLE has been heavily abused by attackers over the past few years in a variety of ways. Recent OLE exploits have been observed loading COM objects to orchestrate and control the process memory, taking advantage of parsing vulnerabilities in COM objects, hiding malicious code, or connecting to external resources to download additional malware.

Microsoft Rich Text Format (RTF) is heavily used in email attachments in phishing attacks. Its wide adoption in phishing is primarily attributed to the fact that it can contain a wide variety of exploits and can be used efficiently as a delivery mechanism to target victims. Microsoft RTF files can embed various object types, either to exploit parsing vulnerabilities or to aid further exploitation. The Object Linking and Embedding feature in RTF files is largely abused either to link the RTF document to external malicious code or to embed other file format exploits within it, using the RTF file as the exploit container. This versatility is what makes the RTF file format so attractive to attackers.

In the sections below, we outline some of the exploitation and infection strategies used in Microsoft Rich Text Format files in the recent past. Towards the end, we reflect on the key takeaways that can help automate the analysis of RTF exploits and set the direction for a generic analysis approach.

RTF Control Words

Rich Text Format files are heavily formatted using control words. Control words in RTF files primarily define the way the document is presented to the user. Since these control words have associated parameters and data, errors in parsing them can become a target for exploitation. Past exploits have also been found using control words to embed malicious resources. Consequently, it is important to examine every destination control word that consumes data and to extract its stream. The RTF specification describes several hundred control words that consume data.
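As a minimal sketch of what enumerating control words can look like (not the tooling used for this research), the Python snippet below lists every control word and its optional numeric parameter. The function name and the sample string are hypothetical, and the regular expression is a simplification: it does not handle escaped backslashes, `\bin` raw data or deliberately obfuscated input.

```python
import re

# A control word is a backslash, ASCII letters, an optional signed integer
# parameter, and an optional single space delimiter that belongs to the word.
CONTROL_WORD = re.compile(rb"\\([A-Za-z]+)(-?\d+)?[ ]?")


def list_control_words(rtf_bytes):
    """Return (name, parameter) pairs in document order; parameter is int or None."""
    words = []
    for match in CONTROL_WORD.finditer(rtf_bytes):
        name = match.group(1).decode("ascii")
        param = int(match.group(2)) if match.group(2) else None
        words.append((name, param))
    return words


# Hypothetical sample: a tiny RTF body with a "datastore" destination.
sample = rb"{\rtf1\ansi{\*\datastore 0105000002000000}\par}"
print(list_control_words(sample))
```

A listing like this makes it easier to spot destinations such as “datastore” or “objdata” whose data deserves extraction and closer inspection.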

RTF parsers must also be able to handle the control word obfuscation mechanisms commonly used by attackers, to further aid the analysis process. Below is an earlier exploit that used control word parameters to introduce executable payloads inside the “datastore” control word.

Overlay Data in RTF Files

Overlay data is additional data appended to the end of an RTF document. Exploit authors predominantly use it to embed decoy files or additional resources, either in the clear or in encrypted form, in which case it is usually decrypted when the attacker-controlled code executes. Overlay data beyond a certain size should be deemed suspicious and must be extracted and analysed further, even though the Microsoft Word RTF parser ignores it while processing the document. Below are some instances of RTF exploits with a large volume of overlay data appended at the end of the file, including a CVE-2015-1641 exploit that embeds both the decoy document and multi-staged shellcode with markers.
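One way to measure overlay data, sketched here under simplifying assumptions, is to find the brace that closes the first top-level RTF group and count the bytes after it. The function name is hypothetical, and the sketch assumes the document contains no `\bin` raw data with unbalanced braces; what size counts as suspicious is left to the analyst.

```python
def overlay_size(rtf_bytes):
    """Return the number of bytes after the brace that closes the first top-level group."""
    depth = 0
    i = 0
    n = len(rtf_bytes)
    while i < n:
        byte = rtf_bytes[i:i + 1]
        if byte == b"\\":
            # Skip the backslash and the next byte so that the escaped
            # characters \{ \} and \\ are never counted as group delimiters.
            i += 2
            continue
        if byte == b"{":
            depth += 1
        elif byte == b"}":
            depth -= 1
            if depth == 0:
                return n - (i + 1)
        i += 1
    return 0  # no balanced top-level group was found


# Hypothetical sample: a small document followed by 10 bytes of appended data.
document = rb"{\rtf1 text\}more}" + b"A" * 10
print(overlay_size(document))
```

Trailing whitespace or a newline after the final brace is common in benign files, so a practical check would compare the result against a threshold rather than against zero.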

Object Linking and Embedding in RTF Files

Linked or embedded objects in RTF documents are represented as RTF objects, specifically by the RTF destination control word “object”. The data for the embedded or linked object is stored as the parameter to the RTF sub-destination control word “objdata” in the hex-encoded OLESaveToStream format. The modifier control word “objclass” determines the type of the object embedded in the RTF file and helps the client application render the object. However, the hex-encoded object data passed to the “objdata” control word can also be heavily obfuscated, either to make reverse engineering and analysis more time consuming or to break immature RTF parsers. OLE has been one of the dominant attack vectors in the recent past, with many OLE-based exploits used in targeted attacks. Robust RTF parsers that extract objects, combined with deeper inspection of the object data, are therefore critical.
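The extraction step described above can be sketched as follows. This is a simplified illustration with a hypothetical function name: it assumes the hex data follows “objdata” directly and is interrupted only by whitespace, so it will not cope with samples that interleave control words or nested groups inside the hex stream.

```python
import binascii
import re

# Capture everything after \objdata up to the next brace or backslash.
OBJDATA = re.compile(rb"\\objdata(?![A-Za-z])([^{}\\]*)")


def extract_objdata(rtf_bytes):
    """Return the decoded bytes of every objdata destination found."""
    blobs = []
    for match in OBJDATA.finditer(rtf_bytes):
        # Drop whitespace and any other non-hex filler between the digits.
        hex_digits = re.sub(rb"[^0-9A-Fa-f]", b"", match.group(1))
        if len(hex_digits) % 2:
            hex_digits = hex_digits[:-1]  # ignore a dangling nibble
        blobs.append(binascii.unhexlify(hex_digits))
    return blobs


# Hypothetical sample with hex data split by spaces and a line break.
sample = rb"{\object\objemb{\*\objdata 0105 0000" + b"\r\n" + rb"0200 0000}}"
print(extract_objdata(sample))
```

The decoded bytes can then be handed to an OLE parser for inspection of the object header and native data.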

Object Linking – Linking RTF to External Resource

Using object linking, it is possible to link an RTF file to a remote object, which could be a malicious resource hosted on a remote server. The resulting RTF file then behaves as a downloader and subsequently executes the downloaded resource by invoking the registered application-specific resource handlers. Among the modifier control words of “object”, a linked object is indicated by the nested control word “objautlink”, as represented below in the RTF document.

As indicated in the above representation, the object data passed to the RTF control word “objdata” is an OLE1.0 NativeStream in the OLESaveToStream format, followed by the NativeDataSize, which indicates the size of the OLE2.0 compound document wrapped in the NativeStream. As per the Rich Text Format specifications, if the object is linked to the container application, which in this case is the RTF document, the Root Storage directory entry of the compound document will have the CLSID of StdOleLink, indicating a linked object. Also, when the object is in the OLE2.0 format, the linked source data is specified in the MonikerStream of the OLEStream structure. As highlighted below, while parsing the object data, the ole32.OleConvertOLESTREAMToIStorage function is responsible for converting the OLE1.0 NativeStream data to the OLE2.0 structured storage format. Following the pointer to the OLE stream, lpolestream, allows us to visualize the parsed and extracted native data. Below is a memory snapshot taken when an RTF document with a linked object was parsed by the winword.exe process.
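To make the layout concrete, here is a hedged Python sketch that reads the fields preceding the native data in an OLE1.0 embedded object. The field order (OLEVersion, FormatID, three length-prefixed ANSI strings, NativeDataSize) follows our reading of Microsoft's [MS-OLEDS] documentation; the function names and the sample bytes are hypothetical and built by hand for illustration.

```python
import struct

CFB_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # signature of an OLE2 compound file


def read_lp_ansi(buf, offset):
    """Read a 4-byte length followed by that many bytes of ANSI text."""
    (length,) = struct.unpack_from("<I", buf, offset)
    start = offset + 4
    raw = buf[start:start + length]
    return raw.rstrip(b"\x00").decode("latin-1"), start + length


def parse_ole1_embedded(buf):
    """Parse the OLE1.0 object header and return its fields with the native data."""
    ole_version, format_id = struct.unpack_from("<II", buf, 0)
    offset = 8
    class_name, offset = read_lp_ansi(buf, offset)
    topic_name, offset = read_lp_ansi(buf, offset)
    item_name, offset = read_lp_ansi(buf, offset)
    (native_size,) = struct.unpack_from("<I", buf, offset)
    offset += 4
    return {
        "ole_version": ole_version,
        "format_id": format_id,  # 2 is assumed to mean an embedded object
        "class_name": class_name,
        "topic_name": topic_name,
        "item_name": item_name,
        "native_data": buf[offset:offset + native_size],
    }


def lp(text):
    """Helper for the demo: build a length-prefixed, null-terminated string."""
    data = text.encode("latin-1") + b"\x00"
    return struct.pack("<I", len(data)) + data


# Hand-built sample: class name "OLE2Link", empty topic and item names,
# and native data that holds only the compound file signature.
sample = (struct.pack("<II", 0x0501, 2) + lp("OLE2Link")
          + struct.pack("<I", 0) * 2
          + struct.pack("<I", len(CFB_MAGIC)) + CFB_MAGIC)
parsed = parse_ole1_embedded(sample)
print(parsed["class_name"], parsed["native_data"].startswith(CFB_MAGIC))
```

If the native data starts with the compound file signature, it can be passed on to a structured storage parser to enumerate its storages and streams.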

Opening an RTF document with a link to an external object brings up a dialog box asking whether to update the data from the linked object, as shown below.

However, a prompt like this is not ideal for an attacker targeting victims. The dialog can be suppressed by inserting another modifier control word, “objupdate”, which internally calls the link object’s IOleObject::Update method to update the link’s source.

Subsequently the urlmon.dll, which is the registered server for the URL Moniker, is instantiated.

Once the COM object is instantiated, the connection is initiated to the external resource and, based on the content-type header returned by the server in the response, URL Moniker consults the Mime database in the registry and invokes registered application handlers.

Microsoft documents how the URL Moniker is executed and the algorithm that determines which handlers to invoke. There have been multiple such RTF exploits in the past, including CVE-2017-0199, CVE-2017-8759 and others, that used Monikers to download and execute remote code.

The COM objects used in these exploits have since been blacklisted by Microsoft in newer versions, but similar techniques could be used in future, which makes analysis of OLE structured storage streams a necessity.

Object Embedding – RTF Containing OLE Controls

As indicated earlier, embedded objects are represented in the container documents in the OLE2 format. When an object is stored in the OLE2 format, the container application (here, the RTF file) creates an OLE Compound File Storage for each embedded object, and the respective object data is stored in OLE Compound File Stream Objects. The layout of container documents storing embedded objects is represented below and described in the Microsoft documentation.

RTF exploits have historically been found embedding and loading multiple OLE controls in order to bypass exploit mitigations and to take advantage of memory corruption vulnerabilities by loading vulnerable OLE controls. Embedded OLE controls in an RTF document are usually indicated by the nested control word “objocx” or “objemb”, followed by “objclass” with the name of the OLE control used to render the object as its argument. Below is an example of an earlier exploit used in targeted attacks, which exploited a vulnerability in a COM object and loaded another OLE control, containing the staged malicious code, to aid exploitation. It is therefore critical to extract this object data, extract the OLE2 compound file storage and extract each of the stream objects for further inspection of hidden malicious shellcode.

Object Embedding – RTF Containing Other Documents

Malicious RTF documents can use the OLE functionality to embed other file formats like Flash files and Word documents, either to exploit the respective file format vulnerabilities or to set the stage for successful exploitation. Multiple RTF exploits have been observed embedding OOXML documents using OLE functionality to manipulate the process heap memory and bypass Windows exploit mitigations. In RTF files, embedded objects are usually indicated by the nested control word “objemb”, with a version-dependent “ProgID” string as the argument to the nested control word “objclass”. One such RTF exploit, used in targeted attacks in the recent past, is indicated below.

Below is another instance where the PDF file was physically embedded within the compound document. As mentioned, the embedded object is stored physically along with all the information required to render it.

In the embedded object, the creating application’s identifier is stored in the CLSID field of the compound file directory entry of the CFB storage object. If we take a look at the previous instance, when the object data is extracted and inspected manually, the following CLSID is observed in the CFB storage object, which corresponds to the CLSID_Microsoft_Word_Document.
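As an illustration of this manual check, the sketch below formats the 16-byte CLSID field of a compound file directory entry and looks it up in a small table. It assumes, based on our reading of the [MS-CFB] layout, that the CLSID sits at offset 80 of the 128-byte directory entry. The table is a hypothetical, non-exhaustive list of well-known identifiers; the source does not state the exact CLSID value seen in the sample.

```python
import uuid

# Non-exhaustive lookup table of well-known CLSIDs (assumed values).
KNOWN_CLSIDS = {
    "00020906-0000-0000-C000-000000000046": "Microsoft Word 97-2003 Document",
    "F4754C9B-64F5-4B40-8AF4-679732AC0607": "Microsoft Word Document (OOXML)",
    "00000300-0000-0000-C000-000000000046": "StdOleLink",
}

CLSID_OFFSET = 80  # position of the CLSID inside a 128-byte directory entry


def describe_clsid(dir_entry):
    """Return (CLSID text, friendly name) for a raw CFB directory entry."""
    raw = bytes(dir_entry[CLSID_OFFSET:CLSID_OFFSET + 16])
    # CLSIDs are stored with the first three fields in little-endian order.
    text = str(uuid.UUID(bytes_le=raw)).upper()
    return text, KNOWN_CLSIDS.get(text, "unknown")


# Hand-built directory entry: only the CLSID field is populated.
entry = bytes(80) + bytes.fromhex("0609020000000000C000000000000046") + bytes(32)
print(describe_clsid(entry))
```

An automated pipeline could use the same lookup to decide which embedded objects to extract and analyse first.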

When the OLE2 stream objects are parsed and the embedded OOXML is extracted and analysed after deflating its contents, we see suspicious ActiveX object loading activity and embedded malicious code in one of the binary files. This shows why it is important to extract the files embedded in RTF documents and analyse them further.

OLE Packages in RTF Files

RTF documents can also embed other file types like scripts (VBScript, JavaScript, etc.), XML files and executables via OLE packages. An OLE package in an RTF file is indicated by the ProgID string “package” as the argument to the nested control word “objclass”. The Packager format is a legacy format that does not have an associated OLE server. Looking at the associated CLSID in the registry, there is no specific data format mapped to Packages.

This essentially implies that OLE packages can store multiple file types. If a user clicks the object, the packaged file is executed, which leads to infection of the machine if the file is a malicious script. RTF documents have been known to deliver malware by embedding scripts via OLE packages and then using Monikers, as described in the previous sections, to drop files in the desired directory and execute them. One such instance of a malicious RTF document exploiting CVE-2018-0802, embedding an executable file, is shown below.

Since many RTF documents have been found delivering malware via OLE packages, it is critical to look for these embedded objects and analyse them for such additional payloads. Embedded executables / scripts within RTF could be malicious. Looking for OLE packages and extracting embedded files should be a trivial task.

The above exploit delivery strategies can allow us to take a step towards building analysis frameworks for RTF documents. Primarily, inspecting the linked or embedded objects turns out to be the critical aspect of automated analysis tasks along with the RTF control words inspection. The following are the key takeaways:

  • Using the RTF file as the container, many other file format exploits can be embedded inside using the Object Linking and Embedding feature, essentially weaponizing the RTF documents.
  • Extracting and analysing embedded or linked objects for malicious code, payloads or resource handler invocations is essential.
  • If an RTF document has a large volume of appended data, it must be examined further.
  • Non-OLE control words and OLE packages must also be analysed for any malicious content.

McAfee Response

As Microsoft Office vulnerabilities continue to surface, generic inspection methods will have to be improved and enhanced, consequently leading to better detection results. As a reminder, the McAfee Anti-Malware engine used on all our endpoints and most of our appliances has the potential to unpack Office, RTF and OLE documents, expose the streams of content and unpack these streams if necessary.

Evolution of Malware Sandbox Evasion Tactics – A Retrospective Study Mon, 09 Sep 2019 19:05:58 +0000

The post Evolution of Malware Sandbox Evasion Tactics – A Retrospective Study appeared first on McAfee Blogs.


Executive Summary

Malware evasion techniques are widely used to circumvent detection as well as analysis and understanding. One of the dominant categories of evasion is sandbox detection, simply because today’s sandboxes are becoming the fastest and easiest way to get an overview of a threat. Many companies use these kinds of systems to detonate the malicious files and URLs they find, to obtain more indicators of compromise with which to extend their defenses and block other related malicious activity. Nowadays we understand security as a global process, and sandbox systems are part of this ecosystem. That is why we must pay attention to the methods malware uses against them and to how we can defeat those methods.

Historically, sandboxes allowed researchers to visualize the behavior of malware accurately within a short period of time. As the technology evolved over the past few years, malware authors started producing malicious code that delves much deeper into the system to detect the sandboxing environment.

As sandboxes became more sophisticated and evolved to defeat the evasion techniques, we observed multiple strains of malware that dramatically changed their tactics to remain a step ahead. In the following sections, we look back on some of the most prevalent sandbox evasion techniques used by malware authors over the past few years and show that malware families extended their code in parallel, introducing ever stealthier techniques.

The following diagram shows one of the most prevalent sandbox evasion tricks we will discuss in this blog, although many others exist.

Delaying Execution

Initially, several strains of malware were observed using timing-based evasion techniques (latent execution), which primarily boiled down to delaying the execution of the malicious code for a period using known Windows APIs like NtDelayExecution, CreateWaitableTimer, SetTimer and others. These techniques remained popular until sandboxes started identifying and mitigating them.


As sandboxes identified this stalling and attempted to defeat it by accelerating code execution, malware resorted to acceleration checks using multiple methods. One of those methods, used by multiple malware families including Win32/Kovter, was to call the Windows API GetTickCount before and after the delay and check whether the expected time had elapsed. We observed several variations of this method across malware families.
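The logic of such an acceleration check can be illustrated with a short Python analogue. This is not code from any malware family: it swaps GetTickCount for Python's monotonic clock, and the function name and tolerance value are assumptions made for illustration.

```python
import time


def sleep_was_accelerated(delay_s, tolerance=0.9, clock=time.monotonic, sleep=time.sleep):
    """Return True if far less time elapsed than the requested delay.

    A sandbox that fast-forwards sleeps without also adjusting the clock
    leaves the measured elapsed time shorter than the requested delay.
    """
    start = clock()
    sleep(delay_s)
    elapsed = clock() - start
    return elapsed < delay_s * tolerance


# Demo with stand-in functions: the "sleep" returns at once and the clock
# reports that only one second passed for a 60-second request.
ticks = iter([0.0, 1.0])
print(sleep_was_accelerated(60, clock=lambda: next(ticks), sleep=lambda s: None))
```

From the defender's side, the example shows why a sandbox that shortens sleeps also needs to keep every time source it exposes consistent with the skipped interval.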

Sandbox vendors could easily bypass this evasion technique by simply creating the analysis snapshot from a machine that had already been running for more than 20 minutes.

API Flooding

Another approach that subsequently became more prevalent, observed with Win32/Cutwail malware, is calling junk APIs in a loop to introduce the delay, dubbed API flooding. Below is the code from the malware that shows this method.



Inline Code

We observed how this code resulted in a denial-of-service (DoS) condition, since sandboxes could not handle it well enough. On the other hand, this sort of behavior is not too difficult for more capable sandboxes to detect. As sandboxes became better at handling API-based stalling code, yet another strategy to achieve a similar objective was to introduce inline assembly code that waited for more than 5 minutes before executing the hostile code. We found this technique in use as well.

Inline stalling code turned out to be a simple approach that could sidestep most of the advanced sandboxes of the time. Sandboxes are now much more capable and are armed with code instrumentation and full system emulation capabilities to identify and report stalling code. In our observation, the following depicts the growth of the popular timing-based evasion techniques used by malware over the past few years.

Hardware Detection

Another category of evasion tactic widely adopted by malware was fingerprinting the hardware, specifically a check on the total physical memory size, available HD size / type and available CPU cores.

These methods became prominent in malware families like Win32/Phorpiex, Win32/Comrerop, Win32/Simda and multiple other prevalent ones. Based on our tracking of their variants, we noticed Windows API DeviceIoControl() was primarily used with specific Control Codes to retrieve the information on Storage type and Storage Size.

Ransomware and cryptocurrency mining malware were found to be checking the total available physical memory using the well-known GlobalMemoryStatusEx() trick. A similar check is shown below.

Storage Size check:

Illustrated below is an example API interception code implemented in the sandbox that can manipulate the returned storage size.
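As a rough analogue of both sides of this check, the Python sketch below pairs a storage size test with an interception function that a sandbox could substitute for the real query. It uses `shutil.disk_usage` in place of DeviceIoControl, and the 80 GiB threshold, the function names and the reported 500 GiB size are all hypothetical values chosen for illustration.

```python
import shutil
from collections import namedtuple

MIN_DISK_BYTES = 80 * 1024**3  # hypothetical threshold a sample might use


def disk_looks_virtual(path=".", usage_fn=shutil.disk_usage):
    """Return True if the reported total disk size is below the threshold."""
    return usage_fn(path).total < MIN_DISK_BYTES


FakeUsage = namedtuple("FakeUsage", "total used free")


def intercepted_disk_usage(path):
    """Sandbox-side stand-in that reports a plausible 500 GiB disk."""
    total = 500 * 1024**3
    return FakeUsage(total=total, used=total // 2, free=total // 2)


# With the interception in place, the check no longer flags the machine.
print(disk_looks_virtual(usage_fn=intercepted_disk_usage))
```

The same pattern of returning believable values applies to memory size and CPU core count queries.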

Subsequently, a Windows Management Instrumentation (WMI) based approach became more favored since these calls could not be easily intercepted by the existing sandboxes.

Here is our observed growth path in the tracked malware families with respect to the Storage type and size checks.

CPU Temperature Check

Malware authors are always adding new and interesting methods to bypass sandbox systems. Another check that is quite interesting involves checking the temperature of the processor in execution.

A code sample where we saw this in the wild is:

The check is executed through a WMI call on the system. This is interesting because virtual machines typically do not return a result for this call.

CPU Count

Popular malware families like Win32/Dyreza were seen using the CPU core count as an evasion strategy. Several malware families were initially found using a trivial API based route, as outlined earlier. However, most malware families later resorted to WMI and stealthier PEB access-based methods.

Any evasion code in the malware that does not rely on APIs is challenging to identify in the sandboxing environment and malware authors look to use it more often. Below is a similar check introduced in the earlier strains of malware.

There are a number of ways to get the CPU core count, though the stealthier way was to access the PEB, which can be achieved by introducing inline assembly code or by using intrinsic functions.

One of the relatively newer techniques for getting the CPU core count has been outlined in a separate blog post. However, in our observations of malware using the CPU core count to evade automated analysis systems, the following techniques were adopted in the sequence outlined.

User Interaction

Another class of infamous techniques malware authors used extensively to circumvent the sandboxing environment was to exploit the fact that automated analysis systems are never manually interacted with by humans. Conventional sandboxes were never designed to emulate user behavior and malware was coded with the ability to determine the discrepancy between the automated and the real systems. Initially, multiple malware families were found to be monitoring for Windows events and halting the execution until they were generated.

Below is a snapshot from a Win32/Gataka variant using GetForegroundWindow and checking whether another call to the same API returns a different window handle. The same technique was found in Locky ransomware variants.
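The foreground window check can be sketched in Python as a simple polling loop. This is an illustration of the idea rather than code from either family; the function name, attempt count and interval are assumptions. On Windows, the handle getter could be `ctypes.windll.user32.GetForegroundWindow`, while the demo below uses a stand-in so that it runs anywhere.

```python
import time


def foreground_changed(get_handle, attempts=10, interval_s=0.5, sleep=time.sleep):
    """Return True if the foreground window handle changes while polling.

    On an unattended analysis machine nobody switches windows, so the
    handle usually stays the same for the whole polling period.
    """
    first = get_handle()
    for _ in range(attempts):
        sleep(interval_s)
        if get_handle() != first:
            return True
    return False


# Demo with stand-in handles: the "user" switches windows on the third poll.
handles = iter([0x1001, 0x1001, 0x2002])
print(foreground_changed(lambda: next(handles), sleep=lambda s: None))
```

A sandbox that simulates user activity, for example by periodically switching the active window, would cause a check like this to report a real user.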

Below is another snapshot from the Win32/Sazoora malware, checking for mouse movements, which became a technique widely used by several other families.

Malware campaigns were also found deploying a range of techniques to check historical interactions with the infected system. One such campaign, delivering the Dridex malware, extensively used the Auto Execution macro that triggered only when the document was closed. Below is a snapshot of the VB code from one such campaign.

The same malware campaign was also found introducing Registry key checks in the code for MRU (Most Recently Used) files to validate historical interactions with the infected machine. Variations in this approach were found doing the same check programmatically as well.

MRU check using Registry key: \HKEY_CURRENT_USER\Software\Microsoft\Office\16.0\Word\User MRU

Programmatic version of the above check:

Here is our depiction of how these approaches gained adoption among evasive malware.

Environment Detection

Another technique used by malware is to fingerprint the target environment, thus exploiting the misconfiguration of the sandbox. At the beginning, tricks such as Red Pill techniques were enough to detect the virtual environment, until sandboxes started to harden their architecture. Malware authors then used new techniques, such as checking the hostname against common sandbox names or the registry to verify the programs installed; a very small number of programs might indicate a fake machine. Other techniques, such as checking the filename to detect if a hash or a keyword (such as malware) is used, have also been implemented as has detecting running processes to spot potential monitoring tools and checking the network address to detect blacklisted ones, such as AV vendors.
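To show how simple some of these fingerprinting checks are, here is a hedged Python sketch of the filename and hostname tests. The keyword and hostname lists are hypothetical examples, not values taken from any specific malware family, and the function names are ours.

```python
import ntpath
import re

# File stems that are exactly an MD5, SHA-1 or SHA-256 hex digest.
HASH_NAME = re.compile(r"^(?:[0-9a-f]{32}|[0-9a-f]{40}|[0-9a-f]{64})$", re.IGNORECASE)
SUSPICIOUS_WORDS = ("malware", "sample", "virus")    # hypothetical keywords
SANDBOX_HOSTNAMES = {"sandbox", "analysis", "maltest"}  # hypothetical names


def filename_is_suspicious(path):
    """Return True if the file name is a bare hash or contains a telltale keyword."""
    stem = ntpath.splitext(ntpath.basename(path))[0].lower()
    return bool(HASH_NAME.match(stem)) or any(word in stem for word in SUSPICIOUS_WORDS)


def hostname_is_suspicious(hostname):
    """Return True if the host name matches a known sandbox name."""
    return hostname.lower() in SANDBOX_HOSTNAMES


print(filename_is_suspicious(r"C:\Users\x\d41d8cd98f00b204e9800998ecf8427e.exe"))
print(filename_is_suspicious("invoice_2019.doc"))
```

For sandbox operators, the takeaway is to submit samples under realistic file names and to give analysis machines ordinary-looking host names.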

Locky and Dridex were using tricks such as detecting the network.

Using Evasion Techniques in the Delivery Process

In the past few years we have observed sandbox detection and evasion techniques being increasingly implemented in the delivery mechanism to make detection and analysis harder. Attackers are increasingly likely to add a layer of protection to their infection vectors to avoid burning their payloads. Thus, it is common to find evasion techniques in malicious Word and other weaponized documents.

McAfee Advanced Threat Defense

McAfee Advanced Threat Defense (ATD) is a sandboxing solution which replicates the sample under analysis in a controlled environment, performing malware detection through advanced static and dynamic behavioral analysis. As a sandboxing solution, it defeats the evasion techniques seen in many adversaries’ samples. McAfee’s sandboxing technology is armed with multiple advanced capabilities that complement each other to bypass evasion techniques that check for the presence of virtualized infrastructure, and it makes sandbox environments behave like real physical machines. The evasion techniques described in this paper, where adversaries employ code or behavior to evade detection, are bypassed by the McAfee Advanced Threat Defense sandbox. These include:

  • Use of Windows APIs to delay the execution of the sample and to query hard disk size, CPU core count and other environment information.
  • Methods to identify human interaction through mouse clicks, keystrokes and interactive message boxes.
  • Retrieval of hardware information like hard disk size, CPU count and hardware vendor through registry artifacts.
  • System uptime checks to identify how long the system has been running.
  • Checks for the color depth and resolution of Windows.
  • Checks for recently used documents and files.

In addition to this, McAfee Advanced Threat Defense is equipped with smart static analysis engines as well as machine-learning based algorithms that play a significant detection role when samples detect the virtualized environment and exit without exhibiting malware behavior. One of McAfee’s flagship capabilities, the Family Classification Engine, works at the assembly level and provides significant traces once a sample is loaded in memory, even if the sandbox detonation is not completed, resulting in enhanced detection for our customers.


Traditional sandboxing environments were built by running virtual machines over one of the available virtualization solutions (VMware, VirtualBox, KVM, Xen) which leaves huge gaps for evasive malware to exploit.

Malware authors continue to improve their creations by adding new techniques to bypass security solutions, and evasion techniques remain a powerful means of detecting a sandbox. As technologies improve, so do malware techniques.

Sandboxing systems are now equipped with advanced instrumentation and emulation capabilities which can detect most of these techniques. However, we believe the next step in sandboxing technology is going to be the bare metal analysis environment which can certainly defeat any form of evasive behavior, although common weaknesses will still be easy to detect.
