Periodic Connections to Control Server Offer New Way to Detect Botnets

Chintan Shah

Oct 24, 2013

7 MIN READ

A number of recent botnets and advanced threats use HTTP as their primary communications channel with their control servers. McAfee Labs research during the last couple of years reveals that more than 60 percent of the top botnet families depend on HTTP. These numbers have increased significantly over the last few quarters. The following pie chart shows the predominance of HTTP in botnet communications.

Why is HTTP so popular? One reason is that HTTP is always allowed on the network perimeter. Because the malicious traffic blends well with legitimate HTTP traffic, it’s hard to differentiate and impossible for network security devices to block malicious traffic unless there is a known signature for it.

The limits with the traditional signature-based approach drive security researchers to shift focus to behavioral-based approaches, but the question remains: What network behaviors should the security devices look for?

Botnets typically work in a “pull” fashion; they continuously fetch commands from the control server, either at fixed intervals or at stealth level. Once connected, they usually “phone home” via HTTP GET/POST requests to a specific server resource (URI). Subsequently, a botnet might execute a command sent by the server or sleep for fixed interval before it tries again. Security researchers can perhaps leverage this connection behavior.

Difference Between Automated and Human-Initiated Traffic

An infected machine connecting periodically to a control server is automated traffic. We need to draw a line between automated and human-initiated traffic as well as between control server responses and legitimate server responses. We can rely on a few facts:

It is abnormal for most users to connect to a specific server resource repeatedly and at periodic intervals. There might be dynamic web pages that periodically refresh content, but these legitimate behaviors can be detected by looking the server responses.

The first connection to any web server will always have response greater than 1KB because these are web pages. A response size of just 100 or 200 bytes is hard to imagine under usual conditions.

Legitimate web pages will always have embedded images, JavaScript, tags, links to several other domains, links to several file paths on the same domain, etc.

Browsers will send the full HTTP headers in the request unless it comes from a man-in-the-middle attack

The preceding points give us a way we can look for specific behaviors on the network: Repetitive connections to the same server resource over HTTP.

If we monitor a machine under idle conditions–when the user is not logged on and the host does not generate a high volume of traffic–we can distinguish botnet activity with a high level of accuracy.

Under these conditions, if the machine was infected with the Zeus botnet, for example, the traffic from the infected machine would look like this:

Notice that Zeus connects to one control domain and keeps running HTTP POST every six seconds to a specific server resource. Algorithmically, while idle, we’d deem a host’s activity repetitively suspicious under these conditions:

The number of unique domains a system connects to is less than a certain threshold
The number of unique URIs a system connects to is less than a certain threshold
For each unique domain, the number of times a unique URI is repetitively connected to is greater than a certain threshold

Assuming the volume of traffic from the host is less, If we take the preceding conditions in a window of say two hours, we might come up with following:

Number of unique domains = 1 (less than the threshold)
Number of unique URIs connected = 1 (less than the threshold)
For each unique domain, the number of times a unique URI is repetitively connected to = 13 (greater than threshold)

All of the thresholds can be set as configurable parameters, depending on typical traffic on a network. The following traffic pattern shows the behavior of the SpyEye botnet. The repetitive activity here occurs every 31 seconds as it connects to a specific resource.

However, the solution does not mandate that repetitive activity should be seen at these fixed intervals. If we choose to monitor within a larger window. We could detect more stealthy activities. The following flowchart represents a possible sequence of operations.

The first few checks are important to determine whether the host is talking too much. First, Total URI > Threshold determines that we have enough traffic to look into. Next, Total Domain access >/= Y determines that the number of domains accessed is not too large. The final check is to see if Total unique URIs < Z. The source ends up on the suspicious list if we believe it has generated repetitive connections.

For instance, if the Total URIs = 30, Total Domain access = 3, and Total Unique URI accessed = 5, we guarantee a repetitive URI access from the host. Now if the number of repetitive accesses to any particular URI crosses the threshold (for example, 1 URI accessed 15 times within a window), we can further examine the connection and apply some of heuristics to increase our confidence level and eliminate false positives. Some heuristics we can apply:

Minimal HTTP headers sent in the request
Absence of UA/referrer headers
Small server responses lacking structure of usual web page

Let’s look at an example of SpyEye sending minimal HTTP headers without a referrer header:

We implemented a proof of concept for this approach and detected repetitive activity over the network with relative ease.

We applied this approach to several top botnet families that exhibit this behavior. We found we could detect them with a medium to high level of confidence.

Behavioral detection methods will be key to detecting next-generation threats. Given the complexity and sophistication of the recent advanced attacks, such detection approaches can address threats proactively–without waiting for signature updates–and will prove to be much faster.

Introducing McAfee+

Identity theft protection and privacy for your digital life

Stay Updated

Chintan Shah is currently working as a Security Researcher with McAfee Intrusion Prevention System team and holds broad experience in the network security industry. He primarily focuses on Exploit and...

Periodic Connections to Control Server Offer New Way to Detect Botnets

Difference Between Automated and Human-Initiated Traffic

Introducing McAfee+

More from McAfee Labs

AI Can Find Your Location 91% of the Time Using Just One Photo

Silent Swap: A Crypto Clipper Extension Campaign

Game Over: WeedHack – The Rise of Minecraft Malware-as-a-Service Campaigns

Sinkholing CountLoader: Insights into Its Recent Campaign

Why Hackers Are Collecting Data They Can’t Read Yet. And How to Stay Safe

Operation NoVoice: Rootkit Tells No Tales

AI Wrote This Malware: Dissecting the Insides of a Vibe-Coded Malware Campaign

Learn to Identify and Avoid Malicious Browser Extensions

Astaroth: Banking Trojan Abusing GitHub for Resilience

Android Malware Promises Energy Subsidy to Steal Financial Data

Think Before You Click: EPI PDF’s Hidden Extras

Android Malware Targets Indian Banking Users to Steal Financial Info and Mine Crypto