How to remove malware from a website manually & malware injection removal
The detection rates of anti-malware and antivirus scanners vary considerably. Knowing how to manually scan for and remove malware is a useful skill, both for confirming a scanner's effectiveness and for compensating for its failings. In this article, Andrey Kucherov, Malware Analyst at Imunify360, describes some essential manual website malware detection and cleanup techniques.
Table of Contents
- Types of Malware
- Manual search cookbook
- Malicious code analysis
- Server logs can help
- Integrity checks
The reality of modern security creates new challenges for web hosts every day. It is well known that there is no absolute protection that guarantees a 0% chance of your website being hacked. Even major players in online markets suffer from security breaches, and, from time to time, make users change their passwords "just in case". Usually, after getting into a victim's host, perhaps using some zero-day software vulnerability, brute-force attacks on weak passwords, or via infected "neighbor" sites, attackers will try to strengthen their positions by injecting additional malware into system folders. They may use malware such as web-shells or hidden uploaders, or they will attempt to insert 'backdoors' into a CMS's core files or into the website database.
Malware scanners can help you identify uploaded web shells, backdoors, phishing content, spam mailers and other types of malicious software, that is, everything already known and encountered before. Some scanners, like ImunifyAV, have heuristic rule-sets that allow them to identify files containing suspicious code, with signatures matching those used by malicious scripts. They can also identify files with suspicious attributes that may have been uploaded by hackers. Unfortunately, even if you run an antivirus scanner on your host after an infection, some malicious software may still go undetected. That means intruders still have a 'back door' to your system, and can get back in whenever they want.
A modern hacker's scripts differ significantly from those that existed four or five years ago. Now, malware developers combine multiple techniques, like code obfuscation, encryption, decomposition, and external loading. They use many methods to bypass even the best antivirus software, so there is always a chance that something is left behind on a server once they have departed. It may seem paranoid, but you probably can't survive in the current security reality without some degree of healthy paranoia on your side.
So, what can you do to effectively secure your website? A comprehensive approach is needed: initial automated malware scans must be followed by a manual check. In this article I will talk about how to identify malicious software without using malware scanners.
First, let's review what we are going to be looking for:
1. Hacker Scripts
Very often during an attack there are a number of files of a certain type uploaded onto a victim's system. These may be web shells (e.g. c99.php), backdoors, file uploaders, spam mailers, phishing pages, doorways (web pages that are created for the deliberate manipulation of search engine indexes), or defacement content (for example, the hacker's logo, obscene messages, links, etc.). In some cases you can simply search the name of the suspicious file on the web to find out what it does—script kiddies usually do not bother modifying files much so it will probably turn up in search results.
2. Code Injection
3. SQL Injection
4. Cache Injections
If a caching server such as memcached is configured insecurely, attackers can inject content into cached data on the fly. In some cases, spam can be injected into website pages this way without actually hacking the core functionality of a website.
5. System components replacement
If hackers are able to get privileged (root) access to a server, they can replace some web server components or caching server components with infected versions. Such a web server can then be controlled via remote commands, and it can add dynamic redirects and malicious code to different website pages. As with cache injections, a webmaster is usually not able to spot the infection because all user files and databases appear unaffected. This is the most difficult case, and in some situations it is easier to rebuild the server and migrate user data rather than try to detect all the malware.
By now, I'll assume that you've already checked the files and database dump with AV scanners and that they have not identified anything malicious. If the malicious redirect or script (embedded in the <script> tag) is still somewhere on the pages of your website, redirects will continue sending users to malicious websites.
How should you proceed? Read on to find out.
Manual search cookbook
On Linux and other Unix-like systems, it is hard to find more useful commands for searching files than find and grep.
With find, you can look for all files that were modified in the past week by filtering on modification time.
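A minimal sketch of such a search, assuming a POSIX-compatible find. The function name recently_modified and the 7-day default are illustrative only:

```shell
# Hypothetical helper: list regular files under a directory whose content
# was modified within the last N days (default: 7).
recently_modified() {
    dir="${1:-.}"
    days="${2:-7}"
    # -mtime -N matches files modified less than N*24 hours ago
    find "$dir" -type f -mtime "-$days"
}
```

For example, `recently_modified /var/www/html 7`, where /var/www/html stands in for your own document root.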
Sometimes, attackers change file modification dates to avoid detection. In this case, you can search by inode change time instead, looking for .php and .phtml files whose attributes have changed recently.
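A sketch of that attribute-change search, again assuming a POSIX-compatible find. The name changed_scripts is hypothetical:

```shell
# Hypothetical helper: list .php and .phtml files whose inode change time
# (ctime) falls within the last N days (default: 7).
changed_scripts() {
    dir="${1:-.}"
    days="${2:-7}"
    # -ctime looks at the last change to the file's metadata or content,
    # so it still matches a file whose modification time was backdated.
    find "$dir" -type f \( -name '*.php' -o -name '*.phtml' \) -ctime "-$days"
}
```

The change time is updated by operations such as chmod, chown, rename, and writes, and it cannot be set directly with touch, which is why it is a useful cross-check against a forged modification date.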
If you need to look for file changes within a certain time frame, find can do that as well.
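A sketch of a time-frame search. It assumes a find that supports the -newermt test (GNU and BSD find do; it is not part of POSIX). The function name and the dates in the usage line are illustrative:

```shell
# Hypothetical helper: list files modified after FROM and not after TO.
# FROM and TO are date strings such as 2021-01-01.
modified_between() {
    dir="${1:-.}"
    from="$2"
    to="$3"
    # -newermt DATE is true when the file's mtime is later than DATE
    find "$dir" -type f -newermt "$from" ! -newermt "$to"
}
```

For example, `modified_between . 2021-01-01 2021-01-08` lists files changed during the first week of January 2021.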
When your web server is compromised, it is good practice to check for files with the setuid or setgid (SUID/SGID) bits set, just to be safe.
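A sketch of such a check, using the standard -perm test of find. The function name is hypothetical, and scanning from / is only a default:

```shell
# Hypothetical helper: list regular files with the setuid (4000) or
# setgid (2000) permission bit set.
find_setid_files() {
    dir="${1:-/}"
    # -xdev stays on one filesystem; errors from unreadable
    # directories are discarded.
    find "$dir" -xdev -type f \( -perm -4000 -o -perm -2000 \) 2>/dev/null
}
```

Comparing the result against a list from a clean installation of the same distribution helps to separate legitimate system binaries from files an attacker has planted.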
Now that we know how to search for possibly malicious files, let's dive a bit deeper and list what exactly we are looking for and where.
1. Check the upload, cache, tmp, backup, log, and images directories.
You need to check all directories that are used for file uploading. For example, with Joomla you should look for .php files in the ./images folder. There is a high chance that if you find something, it will be malicious. For WordPress it is worth checking the wp-content/uploads, backup and theme cache directories.
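As a sketch, the check for scripts in upload directories could look like this. The function name is hypothetical, and the directory list covers only the two locations named above, so extend it for your own CMS:

```shell
# Hypothetical helper: list PHP scripts inside directories that should
# normally hold only uploaded media.
scripts_in_upload_dirs() {
    root="${1:-.}"
    for sub in images wp-content/uploads; do
        if [ -d "$root/$sub" ]; then
            find "$root/$sub" -type f \( -name '*.php' -o -name '*.phtml' \)
        fi
    done
}
```

Every hit deserves a manual look, although a few plugins do place legitimate PHP files in such directories.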
2. Look for files with weird names
Here are examples of strange file names to look out for: php, fyi.php, n2fd2.php. You should also look for unusual patterns in file names. For example:
- File names comprising an odd and unreadable mixture of letters, numbers and symbols, e.g. srrfwz.php, ath.php, kirill.php, b374k.php.php, tryag.php.
- Because many users rename files by appending the digit '1', look out for normal-looking file names that append numbers other than 1 to filename parts, e.g. index9.php, wp3-login.php.
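Some of these naming patterns can be searched for mechanically. The following is a rough sketch: the function name and the three glob patterns are assumptions chosen to match the examples above, and they will produce both false positives and misses:

```shell
# Hypothetical helper: flag file names that follow a few suspicious patterns.
suspicious_names() {
    dir="${1:-.}"
    # *.php.*           double extensions such as b374k.php.php
    # index[0-9]*.php   index followed by a digit, e.g. index9.php
    # wp[0-9]*-*.php    wp followed by a digit, e.g. wp3-login.php
    find "$dir" -type f \( \
        -name '*.php.*' \
        -o -name 'index[0-9]*.php' \
        -o -name 'wp[0-9]*-*.php' \
    \)
}
```

Randomly named files such as srrfwz.php are hard to catch with patterns, so a manual review of the file listing is still needed.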
3. Look for files with unusual extensions.
4. Look for files with non-standard attributes or creation dates.
5. Look for doorways that use a large number of .html or .php files.
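For the doorway check, one approach is to count .html and .php files per directory and inspect the largest ones. This is a sketch with a hypothetical function name, and what counts as "too many" depends on the site:

```shell
# Hypothetical helper: show the directories holding the most .html/.php
# files, largest first (default: top 10).
count_pages_per_dir() {
    dir="${1:-.}"
    find "$dir" -type f \( -name '*.html' -o -name '*.php' \) \
        | sed 's|/[^/]*$||' \
        | sort | uniq -c | sort -rn | head -n "${2:-10}"
}
```

Each output line is a file count followed by a directory path. A directory with hundreds of near-identical pages that the CMS did not create is a typical sign of a doorway.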
Server logs can help
- Correlating the date and time an email was sent (using details in the mail log and the email message header) with access_log entries helps to determine how spam was sent out, and to find the mailer script on the server.
- FTP xferlog analysis helps to identify which files were uploaded or changed during the attack and by whom.
- If your mail server and PHP settings are correctly configured, you can find the name of the sending PHP script and its full path in your mail log or in the full email message header. This helps to quickly find and eliminate the source of spam.
- Some modern CMSs and plugins have more advanced defense techniques to proactively detect cyber attacks. Their logs might show if there was any attack and whether the CMS or plugin was able to protect itself or not.
- The access_log and error_log files also allow you to track a hacker's actions once you have identified the script names that were used, the IP address, or the HTTP User-Agent. You can also check the POST requests made on the day the attack happened. Often, such checks allow you to find which malicious files were uploaded and which were already present before the attack.
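The POST request check can be sketched as follows. It assumes the access log uses the common or combined log format, where the client IP is the first field and the request path is the seventh. The function name and the date in the usage line are illustrative:

```shell
# Hypothetical helper: summarize POST requests made on a given day,
# grouped by client IP and requested path, most frequent first.
# DAY uses the log's own date format, e.g. 10/Jan/2021.
post_requests_on() {
    log="$1"
    day="$2"
    grep -F "[$day:" "$log" \
        | grep -F '"POST ' \
        | awk '{print $1, $7}' \
        | sort | uniq -c | sort -rn
}
```

For example, `post_requests_on access_log 10/Jan/2021`. Repeated POST requests to a script inside an upload or image directory are worth investigating first.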
Integrity checks
It is much easier to analyze attack vectors and look for malware scripts on websites if some security precautions were taken beforehand. Integrity checks help to identify changes on the file system in a timely manner, and to detect malicious activity quickly. The easiest and most effective way to perform such checks is by using version control systems such as git, SVN, or CVS. For example, with git, if you correctly configure the .gitignore file, the process of integrity checking comes down to executing two commands:
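A sketch of such a check, assuming the site root is already a git repository with a committed, known-good baseline. The wrapper name check_integrity is hypothetical, and git status plus git diff are one plausible choice for the two commands:

```shell
# Hypothetical wrapper around two standard git commands.
check_integrity() {
    repo="${1:-.}"
    # Lists new (??), modified (M) and deleted (D) files since the last commit
    git -C "$repo" status --short
    # Shows the exact lines that changed in tracked files
    git -C "$repo" diff
}
```

Untracked files reported by git status are candidates for uploaded malware, while git diff reveals code injected into existing files.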
This guarantees that you have a backup copy of your files, and allows you to quickly restore the website to a previous state. Experienced server administrators can also use inotify, tripwire, auditd and similar tools to track file and folder access and changes.
Unfortunately, it is not always possible to configure a version control system or site integrity check services on a server. On shared hosting in particular, you often cannot install a version control system or system services. To overcome this problem you can use CMS extensions, in the form of a plugin or a stand-alone script, to track file changes. Some CMSs (e.g. Bitrix or DLE) already have built-in integrity checks.
If the website is using custom scripts or is built with static HTML files, you can use the following shell command to make a snapshot of currently stored files:
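One way to take such a snapshot is to record a checksum for every file. This is a sketch that assumes GNU coreutils (sha256sum). The name make_snapshot is illustrative, and the snapshot file should be stored outside the site directory so that it does not list itself:

```shell
# Hypothetical helper: write "checksum  path" for every file under DIR
# into OUT, sorted by path so that two snapshots line up when compared.
make_snapshot() {
    dir="${1:-.}"
    out="$2"
    ( cd "$dir" && find . -type f -exec sha256sum {} + | sort -k 2 ) > "$out"
}
```

For example, `make_snapshot /var/www/html ~/snapshots/before.txt`, where both paths are placeholders for your own locations.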
If you suspect an infection, you can create another snapshot and compare the two using any comparison software you like, for example WinDiff, Araxis Merge, Beyond Compare, the diff command (on Linux), or even an online comparison tool.
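With the diff command, the comparison could be sketched like this. The function name and the output labels are assumptions, and the two arguments are snapshot files in a one-line-per-file text format:

```shell
# Hypothetical helper: report lines that differ between two snapshots.
compare_snapshots() {
    before="$1"
    after="$2"
    # diff exits with status 1 when the files differ, which is expected here
    { diff "$before" "$after" || true; } \
        | sed -n -e 's/^> /added or changed: /p' \
                 -e 's/^< /removed or changed: /p'
}
```

A file that appears only as "added or changed" is new, one that appears only as "removed or changed" was deleted, and one that appears under both labels was modified.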
In cases where antivirus solutions fail to clean a hacked website, it is useful to know how to do it yourself on the command line, with heuristics, and using built-in OS and CMS tools and features. Even with malware scanners that have high detection rates, such as ImunifyAV, being able to confirm the results manually is an important, confidence-building skill.