Block by User Agent
Sometimes your site may fall victim to an overly aggressive or otherwise problematic crawl bot. Blocking such bots from your server is straightforward and needs only a small edit to your domain's ___general/example.com.conf file.
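For example, to block the Yandex crawl bot:
# ~* is a case-insensitive regex match against the User-Agent header
if ($http_user_agent ~* "YandexBot") {
    # Refuse the request with 403 Forbidden
    return 403;
}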
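If more than one bot needs blocking, repeating the if block for each one gets unwieldy. One possible approach, shown here only as a sketch, is to collect the patterns in a map and test a single variable. The variable name $blocked_bot, the second crawler name ExampleBot, and the listen and server_name values are assumptions made for illustration. Note that map is only valid in the http context, so it may need to live outside the per-domain file.

```nginx
events {}

http {
    # Sets $blocked_bot to 1 when the User-Agent matches any listed pattern.
    # ~* makes each pattern a case-insensitive regex.
    map $http_user_agent $blocked_bot {
        default        0;
        ~*YandexBot    1;
        ~*ExampleBot   1;   # hypothetical second crawler, for illustration
    }

    server {
        listen 80;                  # assumed values for this sketch
        server_name example.com;

        # "0" and the empty string count as false, so only matched bots get 403
        if ($blocked_bot) {
            return 403;
        }
    }
}
```

Adding another bot is then a one-line change to the map, and the if block itself stays untouched.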
Alternatively, if you want to show a custom message rather than block outright (for instance, so that a human who was blocked by mistake can contact you), a rewrite is more suitable.
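For example, to serve a static HTML page for every request from the matching user agent:
if ($http_user_agent ~* "YandexBot") {
    # Internally rewrite any URI to the message page; "last" stops rewrite
    # processing and matches the new URI against the configured locations
    rewrite .* /no-crawl-bots.html last;
}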
Then create an ordinary HTML file at /no-crawl-bots.html (typically relative to the site's document root) containing whatever message you would like to pass to the affected user agents.
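Where the rewrite rule is placed matters. If it ends up inside a location block rather than directly in the server block, the rewritten request for /no-crawl-bots.html can match that same location again and trigger the rule repeatedly. The sketch below shows one way to avoid this, by giving the message page its own exact-match location. The listen, server_name and root values are placeholders assumed for illustration, and the server block is expected to sit inside the http context.

```nginx
server {
    listen 80;                      # assumed values for this sketch
    server_name example.com;
    root /var/www/example.com;      # hypothetical document root

    # Exact match: requests for the message page land here, so the
    # rewrite in "location /" is never applied to the page itself.
    location = /no-crawl-bots.html {
    }

    location / {
        if ($http_user_agent ~* "YandexBot") {
            # Internal rewrite; the client still sees the original URL
            rewrite ^ /no-crawl-bots.html last;
        }
    }
}
```

Because this is an internal rewrite, the bot receives the message page with a 200 status rather than a 403, which may or may not be what you want from a crawler's point of view.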