How to block bots on your website and server

In 2019, 40% to 50% of all internet traffic came from automated bots. While this figure includes both good bots and bad bots, it’s no secret that many of these bots are built with malicious intent and can harm your website and server in various ways.

Cybercriminals use bad bots to spy on, disrupt, spam, and hack websites and servers of all sizes, so this is no longer an issue exclusive to big companies and enterprises. Even individuals with an online presence (e.g. a social media profile) may have to deal with these malicious bots and will need to block them.

With close to half of all internet traffic consisting of bots, chances are we won’t be able to block and avoid all of them. However, here we will share some important tips you can use to effectively block and manage malicious bots on your website and server.

Key challenges in blocking bots

On the surface, isn’t blocking bots on your website fairly straightforward? Can’t we simply block all traffic that doesn’t come from human users?

In theory it might seem that simple, but in practice there are two significant challenges to consider when blocking bot activity:

We wouldn’t want to block good bots

As briefly discussed above, not all bots are malicious. There are bots owned by reputable companies (e.g. Google, Amazon, Facebook) that perform beneficial tasks for both the website and its users. We don’t want, for example, to accidentally block Googlebot so that it can’t crawl and index our site. If our site isn’t indexed properly, it won’t appear in Google search results.

Bots are masking themselves as human users

Malicious bots now use AI and machine learning technologies to imitate humanlike behaviors such as non-linear mouse movements and humanlike typing patterns. They are also getting better at masking obvious fingerprints, such as headless browser and OS signatures, while rotating between hundreds if not thousands of IP addresses.

So, differentiating these malicious bots from legitimate human users can be very challenging, while accidentally blocking legitimate users (false positives) can lead to a loss of revenue as well as long-term damage to your reputation.

Thus, when blocking bots on our website and server, we’ll have to always consider these two challenges. 

Tips on how you can block bots on your website and server

1. Set up rules and policies on your website

Good bot management begins with setting up rules for which bots can access your website’s resources and which can’t.

This is typically done by configuring your robots.txt file, but it’s important to note that not all bots will comply with your rules. As a general rule of thumb, good bots follow your robots.txt directives, so compliance can be a good way to tell good bots from malicious ones. Robots.txt also keeps good bots from consuming too much of your server’s resources, and lets you limit good bots that aren’t beneficial for your site (e.g. if you are not an eCommerce site, you might not need bots that provide eCommerce-related functions).
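To see how a compliant bot reads these rules, here is a minimal Python sketch using the standard library’s robots.txt parser. The robots.txt content, the paths, and the "ExampleShopBot" name are made up for illustration. Keep in mind that this only shows what a well-behaved bot would do; a malicious bot can simply ignore the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the bot name and paths are illustrative only.
ROBOTS_TXT = """\
User-agent: ExampleShopBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(user_agent: str, path: str) -> bool:
    """Return True if a compliant bot with this user agent may fetch the path."""
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for agent, path in [
        ("Googlebot", "/blog/post"),
        ("Googlebot", "/admin/settings"),
        ("ExampleShopBot/1.0", "/blog/post"),
    ]:
        print(agent, path, is_allowed(agent, path))
```

In this sketch, the unwanted bot is shut out of the whole site, while every other bot is only kept away from the /admin/ area.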

If your site runs on the Apache web server, you can also use .htaccess, Apache’s per-directory configuration file, to block many known malicious bots and botnets.
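Blocking rules of this kind commonly match the User-Agent header of each request against a deny list. The Python sketch below illustrates that matching logic only; it is not Apache configuration. The pattern names are hypothetical, and treating an empty User-Agent as suspicious is an assumption made for this example. Since the header is easy to fake, this catches only unsophisticated bots.

```python
import re

# Hypothetical deny list; real lists are much longer and need regular upkeep.
BLOCKED_AGENT_PATTERNS = [r"badbot", r"evilscraper"]
BLOCKED_AGENTS = re.compile("|".join(BLOCKED_AGENT_PATTERNS), re.IGNORECASE)


def should_block(user_agent: str) -> bool:
    """Return True for an empty User-Agent or one matching the deny list."""
    if not user_agent.strip():
        # Assumption for this sketch: requests without a User-Agent are blocked.
        return True
    return BLOCKED_AGENTS.search(user_agent) is not None


if __name__ == "__main__":
    print(should_block("Mozilla/5.0 (compatible; BadBot/2.1)"))
    print(should_block("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
```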

2. Monitor your traffic for signs of malicious bots

Use traffic monitoring tools like Google Analytics to monitor the following metrics: 

  • Increase in pageviews: A sudden spike in pageviews, especially an unprecedented one, is a very likely symptom of bot traffic.
  • Increase in bounce rate: Bounce rate is the percentage of visits in which a user views a single page and then leaves without moving to another page or clicking anything on the page. An abnormal spike in bounce rate can be a sign of bots performing their tasks on a single page and then leaving immediately.
  • Abnormally high or low dwell time: Dwell time, or session duration, is how long a user stays on a website, and normally it should remain relatively steady. A sudden and unexplained increase or decrease in dwell time can be a sign of bot traffic.
  • Drop in conversion rate: Fake account creations, fake form submissions, and similar bot activities inflate traffic without producing real conversions, which lowers your conversion rate.
  • Traffic from an unexpected location: A sudden increase in activity from an unusual location, especially from a region where the site’s language isn’t commonly spoken.
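If you export these metrics as daily numbers, a simple statistical check can flag sudden changes for you. The sketch below is one possible approach, not a feature of any particular analytics tool. The function name, the sample figures, and the threshold of three standard deviations are all assumptions chosen for illustration.

```python
from statistics import mean, stdev
from typing import Sequence


def is_anomalous(history: Sequence[float], today: float, threshold: float = 3.0) -> bool:
    """Flag today's value if it is more than `threshold` standard
    deviations above or below the mean of the historical values."""
    if len(history) < 2:
        return False  # not enough data to estimate a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        return today != baseline
    return abs(today - baseline) / spread > threshold


if __name__ == "__main__":
    # Hypothetical daily pageviews for the past week.
    pageviews = [1000, 1040, 980, 1010, 995, 1020, 1005]
    print(is_anomalous(pageviews, 2500))  # a sudden spike
    print(is_anomalous(pageviews, 1015))  # an ordinary day
```

Because the check looks at the distance from the baseline in both directions, the same function works for metrics where a sudden drop is the warning sign, such as dwell time or conversion rate.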

When any of these symptoms are found on your website or server, you can then use the other methods to track the source and block the bot activities. 

3. Invest in a proper bot mitigation solution

To tackle the two core challenges discussed above, a capable bot management solution is required. Since many malicious bots now use AI technologies to imitate humanlike patterns and rotate between hundreds of user agents and IP addresses, we also need an AI-powered solution like DataDome that can use behavioral analysis to detect and manage malicious bots in real time and on autopilot.

Your bot management solution should be able to:

  • Properly differentiate between bots and legitimate human users.
  • Differentiate between good bots and bad bots via behavioral and fingerprinting analyses.
  • Decide whether to block the incoming traffic or implement rate-limiting or other approaches as needed.
  • Challenge the traffic via CAPTCHA, JavaScript, and other methods. 
  • Filter bot traffic based on fingerprint reputation (IP address whitelist, browser used, OS used, etc.).
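To make the rate-limiting option concrete, here is a minimal sketch of a sliding-window limiter keyed by client identifier (an IP address in the example). The class name and the limits are assumptions for illustration, and this is not how any specific product implements it. Because sophisticated bots rotate IP addresses, a limiter like this is only a baseline defense.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client id -> recent request times

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        """Record a request and return True if it is within the limit."""
        if now is None:
            now = time.monotonic()
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: block or challenge this request
        hits.append(now)
        return True


if __name__ == "__main__":
    limiter = SlidingWindowLimiter(max_requests=3, window_seconds=10)
    for second in range(5):
        print(second, limiter.allow("203.0.113.7", now=second))
```

A request that exceeds the limit does not have to be blocked outright; it could instead be sent a CAPTCHA or JavaScript challenge, as listed above.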

Final words

Avoiding false positives, so that we don’t accidentally block legitimate human users and beneficial good bots, is very important when attempting to block and manage bad bot activity. This is why an advanced bot mitigation solution like DataDome, which can perform behavior-based analysis in real time to detect malicious bot activity, is now necessary in the age of sophisticated, AI-powered bots that can convincingly mimic humanlike behaviors.