The merciless malignancy of malware Part 2 (SEM 101)

Malware infections are no laughing matter. When they afflict your website, they can infect your customers, who won’t appreciate your sharing, intentional or not (and I’m guessing it’s not)! And if Bing discovers malware on your site, your listing in the Bing search engine results pages (SERPs) will either be completely omitted or the link to your site will be disabled, so when the searcher clicks on it, only a malware warning appears. All told, this is bad news for conversions, don’t you think?

This article is Part 2 of a three-part series on malware. Part 1 covered how to detect the presence of malware on your site by using the Bing Webmaster Center tools to get access to the information the bot sees when it crawls your site’s pages and the external links they contain. In this post, we’ll cover the available resources and strategies to do a malware clean-up job. It’s usually a big job, so let’s get right to it.

Cleaning up the mess

Bing’s detection of malware on your site usually indicates that your site was hacked. Comprehensive information on how to clean up each specific malware infection could fill an entire book (and this post is quite long as is). Instead of deep dives into specifics, let’s talk about strategies and resources for combating this problem.

Sources of malware code

There are three primary ways your website might be serving malware:

External source. If hackers exploit an existing security vulnerability to gain access to your source code, they often edit your HTML or script files to make calls to externally based, malicious content on servers they control. Worse yet, hackers don’t even need access to your web server or source files to inflict this attack. If you include externally based content on your pages, and if the hackers can successfully attack the source at its external site, your pages then will unintentionally serve their malware to your customers.
Local source. Sometimes hackers, once they’ve gained full access to vulnerable web servers, put malware code directly in your webpage files and/or place malicious content in the directory structure of your website. Your page’s HTML source code may still appear to be clean, but in this case, the poisoned images, documents, or other binary files they call locally can be the source of the malware attack.
Man-in-the-middle attack. Although this is a less common form of attack due to its technical sophistication, hackers can, when server and network security is severely compromised, inject malware into your webpage content over the network as it travels from your web server to the end user.

Your webpage will be considered malicious if you serve malware from any source, be it from an external server, directly from your web server, or by man-in-the-middle attacks. A user browsing to your webpage from the Bing SERPs will not be able to distinguish your clean content from the malicious content inserted there by hackers. It’s all presented as content in your webpages, so you are ultimately responsible for protecting your customers.

Attack indicators

The malicious code changes that hackers will likely employ come in one or more of these five forms. If any of these elements in your code appear to be suspicious, unexpectedly modified, or unfamiliar to you as webmaster, investigate them further.

Script code. Webmasters should check for new JavaScript code inserted into their pages. This code will be contained within <script> tags. Inserted, malicious script is commonly encoded or encrypted and bundled with a small routine that decodes it at run time, which allows the script to execute but prevents the webmaster from reading and interpreting what the code does. This is known as obfuscated JavaScript code. Such code will look like many wrapped lines of continuous, seemingly random alphanumeric characters within a <script> tag. If your pages do not normally use obfuscated scripts, this form of script code will definitely stand out as different. This malicious code usually runs when the page is loaded, and it typically appends an exploit or a poisoned, hidden control to the page as it is loaded.
<iframe> code. An <iframe> is simply an HTML tag that enables an unrelated HTML document to be loaded within another HTML document. The <iframe> tag enables hackers to inject poisoned HTML and script code into another webmaster’s webpage. Injected script code often employs <iframe> tags as the means of creating hidden windows that enable malware exploits to execute without the user’s knowledge.
Page redirect. If a hacker gets access to edit your home page, they can add code that will automatically and immediately redirect a web browser to another web page that runs malware as it loads. That page is usually a similarly named, identical-looking copy on an external server, but it could also be one created on your own server. The redirect can be done by means of <meta> refresh, JavaScript, or even 301/302 redirects. Unless you find the code for the redirect when you examine your content, you typically won’t see any malware on your page because it’s not executed there. Be sure to also inspect your web server configuration for unauthorized redirects.
Externally sourced content. Note that while small, externally based controls, like hit counters, can be perfectly secure when you first install them, if the webmaster of that control’s host server is not security conscious, those once-benign controls can themselves become malicious vectors later on. Also, small advertising hosts can outsource their contracts to other advertising hosts, who might sub-contract that work out again several times down the line, all done in order to sell more advertising. But the farther the content gets from the original trusted external host, the less control that host has over what is actually served, and the more vulnerable your pages become. Only use external (third party) content from highly trusted sources whose security practices are widely known to be good.
Obfuscation efforts. Attackers often try to hide their exploitation work from quick inspections by using external, referring domain names that are spelled very similarly to known, trusted entities of the Web. Check the spelling of domain names in all external resources to be sure the URLs were not changed to addresses that are similarly named but not the actual, intended target. This includes external references to advertisers, hit counters and other such controls, external images, analytics trackers, and the like. Also, look for URLs that substitute IP addresses for domain names, another common method of obfuscation.
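
To make a first pass over many pages faster, several of the indicators above can be flagged automatically. What follows is only a rough sketch in Python using the standard library. The names (IndicatorScanner, scan_html) and the 200-character threshold are my own assumptions, not part of any Bing tool. A finding means "look at this by hand," not "this is malware," and a clean result does not prove the page is safe.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumption: an unbroken run of 200+ such characters inside a script is
# unusual enough to deserve a manual look. Tune this for your own pages.
LONG_RUN = re.compile(r"[A-Za-z0-9+/=%\\]{200,}")
IP_HOST = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def host_of(url):
    """Return the host part of a URL, or '' if there is none."""
    try:
        return urlparse(url).hostname or ""
    except ValueError:
        return ""


class IndicatorScanner(HTMLParser):
    """Collects items that deserve manual review; it does not prove infection."""

    def __init__(self):
        super().__init__()
        self.findings = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        attrs = {name: (value or "") for name, value in attrs}
        if tag == "script":
            self._in_script = True
        if tag == "iframe":
            style = attrs.get("style", "").replace(" ", "").lower()
            hidden = (
                attrs.get("width") in ("0", "1")
                or attrs.get("height") in ("0", "1")
                or "display:none" in style
                or "visibility:hidden" in style
            )
            if hidden:
                self.findings.append(("hidden-iframe", attrs.get("src", "")))
        if tag == "meta" and attrs.get("http-equiv", "").lower() == "refresh":
            self.findings.append(("meta-refresh", attrs.get("content", "")))
        # URLs that use a raw IP address instead of a domain name.
        for name in ("src", "href"):
            if IP_HOST.match(host_of(attrs.get(name, ""))):
                self.findings.append(("ip-address-url", attrs[name]))

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Long runs of continuous characters suggest obfuscated script.
        if self._in_script and LONG_RUN.search(data):
            self.findings.append(("obfuscated-script", data.strip()[:60]))


def scan_html(html_text):
    """Return a list of (indicator, detail) pairs found in one page."""
    scanner = IndicatorScanner()
    scanner.feed(html_text)
    scanner.close()
    return scanner.findings
```

You could call scan_html() on the text of each HTML file in your site’s source folder and review whatever comes back. Server-side 301/302 redirects live in the web server configuration, so this sketch will not see them.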

What can you do?

You gotta look at the code. If someone cracked your web server’s security and modified your source code, you need to find what’s changed as the first step in identifying and cleaning up the malware. You can do this by visually inspecting the HTML and script code on your pages for unauthorized changes.

When you examine your source code, carefully inspect your code on your web server. Look for newly added scripts in your HTML pages that execute when the page loads, especially obfuscated script. Consider any references to third-party domains in your source code as a potential source of malware. Suspects should include any inserted external code that runs on your site when the page is loaded, including hit counters, images, media content, and other externally sourced controls. External scripts should never be implicitly trusted without a careful consideration of that host’s security practices, as this is a major security vulnerability.
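
One way to get a quick inventory of third-party references is to list every external host a page points to and compare that list against the hosts you actually trust. The sketch below is one possible approach, not an official tool. The function name review_external_hosts, the trusted-host list, and the 0.85 similarity cutoff are all assumptions for illustration. It also flags hosts spelled almost like a trusted one, since near-miss spellings are a common hiding trick.

```python
import difflib
from html.parser import HTMLParser
from urllib.parse import urlparse


class ExternalHostCollector(HTMLParser):
    """Gathers the hosts named in src/href attributes, excluding your own."""

    def __init__(self, own_host):
        super().__init__()
        self.own_host = own_host.lower()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                try:
                    host = urlparse(value).hostname
                except ValueError:
                    continue  # malformed URL; inspect it by hand
                if host and host != self.own_host:
                    self.hosts.add(host)


def review_external_hosts(html_text, own_host, trusted_hosts):
    """Label each external host as trusted, a lookalike, or unknown."""
    trusted = [host.lower() for host in trusted_hosts]
    collector = ExternalHostCollector(own_host)
    collector.feed(html_text)
    collector.close()
    report = {}
    for host in sorted(collector.hosts):
        if host in trusted:
            report[host] = "trusted"
        elif difflib.get_close_matches(host, trusted, n=1, cutoff=0.85):
            # Assumed cutoff: close spelling but not an exact match.
            report[host] = "lookalike of a trusted host"
        else:
            report[host] = "unknown"
    return report
```

Anything labeled "unknown" or "lookalike of a trusted host" is a candidate for the careful review described above. Hosts that are built up inside script code at run time will not show up in this kind of static listing.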

As much as possible, remove unnecessary, externally sourced content to reduce your exposure to exploits beyond your control. Only embed content from trusted third parties into your webpages. If you discover some code that was added to or modified on your page without authorization or realize a once-trusted external page element now appears to be malicious, simply remove that portion of the code from your file to clean it up.

Malware might also have been embedded in your existing images, document files, animations and media content, or other binary files that are presented on your pages. All of these should be scanned again with an antivirus tool for malware.

If you are using a version control system for maintaining your site’s source code, you can easily redeploy the last known good version before the infection occurred. Just be sure that the versioned source code from your workstation is not the source of the malware.
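
Before you redeploy, it can help to see exactly which files on the server differ from the last known good version. Here is a minimal sketch, assuming you have a clean export of that version in one folder and a copy of the live site in another (both folder names are up to you, and compare_trees is a made-up name). It compares SHA-256 digests of every file, so it also catches changes to images and other binary files.

```python
import hashlib
from pathlib import Path


def file_digests(root):
    """Map each file's relative path under root to its SHA-256 digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_trees(known_good_dir, deployed_dir):
    """Report files added to, removed from, or modified in the deployed copy."""
    good = file_digests(known_good_dir)
    live = file_digests(deployed_dir)
    return {
        "added": sorted(set(live) - set(good)),
        "removed": sorted(set(good) - set(live)),
        "modified": sorted(
            path for path in good.keys() & live.keys() if good[path] != live[path]
        ),
    }
```

Files listed under "added" or "modified" that you did not change yourself are the first places to look. This only tells you what changed, not whether the change is malicious, and it is only as trustworthy as the known good copy you compare against.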

Diagnostic tools to use

To help in your source code examination, use these tools for additional insight on cleaning up a malware mess:

Run an antivirus utility on your source code. Install a fully capable antivirus software tool (and regularly update it to be sure its program code and malware signatures are current) to run a thorough scan of the folders containing your website’s source code. It may be able to detect some forms of malware if they are locally installed on your web server or your webpage files were modified with unauthorized, malicious scripts. Also run a thorough antivirus scan of your personal workstation (the one you use to edit your site’s source code and connect to the web server for uploads). You may unknowingly infect an otherwise clean web server from a workstation that is itself infected with malware. And if you get a key logger infection on your workstation, the hacker controlling that malware might steal your web server’s FTP logon credentials, providing them with full access to attack your site with malicious content.
Run Fiddler HTTP proxy on your website. Fiddler is a no-cost web debugging proxy tool used to see what HTTP calls are being made when your page is loaded. By examining the network traffic generated by your webpages, you can see if your pages are making unexpected calls to unknown resources, and if so, identify where they are going. Watch the Fiddler video tutorials and read its documentation to learn how this valuable tool is used and how it works.

Checking for man-in-the-middle attacks

You might also inspect your source code as received by your browser using the browser’s View Source command to check for “man-in-the-middle” attacks. In that case, a direct inspection of the original webpage source code files on your web server would likely reveal no malware infection. However, by revealing and examining the source code for the infected webpages from your web browser and comparing the results to the original, clean file from the web server, you might find the malicious changes. If so, inform your web-hosting provider that they might be the victims of a "man-in-the-middle" attack. If your provider takes no action as a result, consider moving your website to a more trusted provider. Luckily, as this is a much more sophisticated attack, it is less common than overt modification of the code on your webpages.
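
For a static page, the comparison described above can be done line by line instead of by eye. The following is a small sketch, assuming you have saved the original file from the web server and the View Source output from your browser as text (diff_served_page is a made-up name). It uses Python’s standard difflib module to list only the lines that differ.

```python
import difflib


def diff_served_page(original_text, served_text):
    """Return unified-diff lines; an empty list means the two copies match."""
    return list(
        difflib.unified_diff(
            original_text.splitlines(),
            served_text.splitlines(),
            fromfile="web-server-file",
            tofile="as-received-by-browser",
            lineterm="",
        )
    )
```

Lines beginning with "+" exist only in the copy your browser received, so those are the ones to examine first. Keep in mind that pages generated dynamically on the server will legitimately differ from their source files, so this comparison is mainly useful for static HTML.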

Warning! Make sure both your browser and your operating system are running the latest security updates, along with running up-to-date antivirus, anti-spyware, and software firewall products, to minimize the vulnerabilities to your computer when loading pages likely to be infected with malware.

Also, most web browsers allow you to configure specific security settings for individual sites. Add your infected site to the list. (If your browser doesn’t allow you to specify security settings for individual sites, you can temporarily implement these settings for all sites during your testing, but you may want to revert those changes later to restore full functionality.) You’ll want to disable JavaScript for your tests. If you’re using Internet Explorer, you’ll also want to disable ActiveX controls. These changes will protect your computer from the infection methods used by malware.

Verifying your fixes

Once you have cleaned up the problem, you should verify your work to be sure the revised code is clean.

Visually inspect the page on the web server to be sure the edits are in place.
Visually inspect the changed page and its source code in your web browser.
Use Fiddler to ensure that the malware’s unexpected external network calls have been eliminated.
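
If you save the captured traffic in the HTTP Archive (HAR) format, the last check above can be partly automated. The sketch below assumes you have such an export on disk and a list of hosts your pages are supposed to contact. The function names and the allowed-host list are mine, for illustration only. It relies on the standard HAR layout, in which each entry under log.entries carries a request.url.

```python
import json
from urllib.parse import urlparse


def unexpected_hosts(har, allowed_hosts):
    """Return the sorted hosts contacted in a HAR capture that are not allowed."""
    allowed = {host.lower() for host in allowed_hosts}
    found = set()
    for entry in har["log"]["entries"]:
        host = urlparse(entry["request"]["url"]).hostname
        if host and host not in allowed:
            found.add(host)
    return sorted(found)


def unexpected_hosts_in_file(har_path, allowed_hosts):
    """Same check, reading the HAR capture from a file on disk."""
    # utf-8-sig tolerates a byte order mark at the start of the export.
    with open(har_path, encoding="utf-8-sig") as handle:
        return unexpected_hosts(json.load(handle), allowed_hosts)
```

An empty result after your clean-up suggests the page no longer calls out to hosts you don’t recognize, at least for the page loads you captured. It is a supplement to the visual inspections, not a replacement for them.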

A stumper

Sometimes you’ll scan your site’s code and find no clear source of malware, yet malware is clearly affecting the users of your website. If this is the case, look at portions of your code where you take user input without input validation, write cookies to the user’s computer, or other such personalized activity beyond simply displaying information to a generic user. Your site may be the victim of cross-site scripting (XSS). Resolving this specific issue is beyond the scope of this article, but it is very commonly used by hackers for exploiting computer security vulnerabilities, and you should learn how to protect your site against such attacks.

Additional information resources

Microsoft offers a number of useful, anti-malware resources to help you understand what you are up against and what you need to do. Check these out for starters:

Visit the Microsoft Malware Protection Center Threat Research and Response portal for information on malware, Microsoft security products, useful guidance and advice, and more.
Visit the Microsoft TechNet Security TechCenter to access their vast library of security resources, including the articles:

The topic of malware clean up is admittedly not really an introductory level subject, despite this being the SEM 101 column. But the negative implications of detected malware infections on a website are huge. Referrals from Bing will likely dry up after the bots detect malware because of end user protection mechanisms employed on the SERPs to prevent searchers from clicking an infected page. And on top of that, the few customers who choose to circumvent those protections on the SERP or who choose to browse directly to an infected site may possibly suffer the frustrating consequences of a malware infection. Either way, the folks whom you are trying to convert, either with a purchase, a subscription, or a download, will be forced to deal with the unpleasant mess left by the malware picked up from your site. They won’t remain your customers for long. And that’s why this topic needs to be addressed in SEM 101, even though it’s not really a 101-level topic.

If you have any questions or comments about malware, please feel free to post them in our General Questions forum. For regular SEM and SEO questions and suggestions, please go to our SEM forum. Next up: how to better secure your computers against hacker attacks. Until then...

-- Rick DeJarnette, Bing Webmaster Center
