Phishing Threat Uses UTF-8 BOM in ZIP Signature to Evade Detection

This blog was written by Sanchit Karve.

Last week, we noticed thousands of malware files in the wild that employ a simple phishing attack by modifying the hosts file on Windows systems. What’s interesting, however, is the technique chosen by the malware authors to distribute their payload. The samples in question (Example MD5: 34d9b42bfd64c6f752fe27eef8d80c5f) are packaged in a ZIP file along with a 0-byte readme.txt file.

Usually, ZIP files start with the ZIP signature 0x04034B50 (or “PK”, 03, 04), but in this case the author chose to insert the UTF-8 Byte Order Mark (BOM) (represented as 0xEFBBBF) before the ZIP header.


Unicode BOMs are often used to indicate the endianness of encoded textual data. It is redundant for UTF-8 data, as byte order has no meaning in UTF-8, which is why the Unicode Standard leaves it as optional and does not recommend its use. Despite this, it is not uncommon to see UTF-8 BOMS in text files and data streams. For example, many popular applications (Notepad, Google Docs, etc.) use the UTF-8 BOM to explicitly state that a document is encoded with UTF-8.

Because the ZIP file is prefixed with the UTF-8 BOM, it tricks many applications into assuming that the file is a UTF-8-encoded text file. For example, when such a file is opened by Windows 7, the OS complains that such a ZIP file is invalid. Some third-party archive programs, such as 7-Zip, WinRAR, and some others ignore the BOM and read the ZIP file correctly.

Because the only way to run the file is to manually extract and execute it, the malware authors expect their victims to have third-party archiver applications installed on their computers.

It is also likely that the authors using this technique want to evade detection from antivirus products and email spam filters that adhere strictly to the ZIP format. Even though most of the samples are virtually the same, they generate unique hashes due to the varying timestamp fields embedded in the ZIP header as well as differences in the installers’ overlay sections caused by varying application names.

As for the sample itself, the main payload is an installer that always bears one or more of the following words: Golaya, Russkaya, or Devochka (which together roughly translates to “Naked Russian Girls” in Russian).

This installer silently executes batch and VBScript files to modify the hosts file on a victim’s machine and map IP addresses to popular Russian websites as shown below:


When users visit one of these websites, instead of being connected to the site’s IP, they are instead connected to the IP address listed against the site name in the hosts file. Like any other phishing attack, the page hosted at the IP address in the hosts file looks almost like the original site, and is the perfect bait to lure users into unknowingly giving away their account credentials.

This threat has been growing steadily over the past few days. VirusTotal reports that it currently has more than 5 million submissions of this malware family.

McAfee detects this threat as Trojan-SkyHook, Agent-FBH and Agent-FBX.
Thanks to my colleague Srinivasa Kanamatha for discovering the anomaly.

Introducing McAfee+

Identity theft protection and privacy for your digital life

FacebookLinkedInTwitterEmailCopy Link

Stay Updated

Follow us to stay updated on all things McAfee and on top of the latest consumer and mobile security threats.


More from McAfee Labs

Back to top