So you are getting a 404 error or, perhaps even worse, a 500!

The error in your Apache2 logs looks something like this:

***.162.245.*** - - [03/Apr/2020:12:49:50 +0000] "GET /robots.txt HTTP/1.1" 404 89670 "-" "Mozilla/5.0 (compatible; SomeUserAgent/2.1; +"

In a perfect world, you’d only have a single site/domain on this host, so you’d know that the robots.txt file should reside in Apache’s root serving directory (the DocumentRoot).

However, I happen to have (as you might) a ton of VirtualHosts on this machine, so I’m not sure which robots.txt file is missing.

First steps

The very first thing you should do is check the output of apachectl -S.

This lists every configured virtual host along with the config file and line that defines it, plus general settings such as the ServerRoot and the main DocumentRoot.
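With many VirtualHosts the output gets long, so a small filter can help. The sketch below is only an illustration: list_vhosts is a made-up helper name, and it assumes the usual name-based layout where each entry looks like "port 80 namevhost example.com (/path/to/site.conf:1)". A port with only a single vhost is printed in a different shape and will not be matched.

```shell
# Hypothetical helper (not part of Apache): reads `apachectl -S` output on
# stdin and prints "<server name> <config file:line>" for every name-based
# virtual host entry.
list_vhosts() {
    awk '$3 == "namevhost" { gsub(/[()]/, "", $5); print $4, $5 }'
}

# Usage (usually needs root):
#   sudo apachectl -S 2>&1 | list_vhosts
```

Each output line pairs a server name with the file that defines it, which is a quick way to see which config to open for a given site.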

Adding debugging

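Create a file. The a2enconf command used further down expects it to be named temp_debug.conf and to live in Apache’s conf-available directory (typically /etc/apache2/conf-available/ on Debian/Ubuntu).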


Put this config in it:

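# Raise error-log verbosity to trace4 (very noisy; temporary use only)
LogLevel trace4
# One shared access log for all vhosts: %v = vhost name, %p = port, %f = file the request mapped to
GlobalLog ${APACHE_LOG_DIR}/debug.log "%v:%p %h %l %u %t \"%r\" %>s %O file=%f"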

You can read more about Apache log formatting in the mod_log_config documentation if you need more/different output.

Enabling and Disabling

Now you just need to enable it, so run this from the command line:

sudo a2enconf temp_debug && sudo apachectl graceful

You can now watch the new log entries as they are written:

tail -f -n100 /var/log/apache2/debug.log
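Once a few requests have come in, you can narrow the log down to the failing path. This is a rough sketch rather than a polished tool: find_missing is a hypothetical helper name, and the field positions assume the exact GlobalLog format shown earlier, with no spaces in the remote user or in the mapped file path.

```shell
# Hypothetical helper: reads debug.log lines on stdin and prints
# "<vhost:port> file=<path>" for every 404 on the given request path.
# Field numbers assume the log format:
#   %v:%p %h %l %u %t "%r" %>s %O file=%f
find_missing() {
    awk -v path="$1" '$8 == path && $10 == 404 { print $1, $12 }'
}

# Usage:
#   find_missing /robots.txt < /var/log/apache2/debug.log | sort | uniq -c
```

The first column tells you which VirtualHost answered, and the file= value shows where Apache looked for the file on disk.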

When you’re done and have resolved the problem, you can disable the debug logging again by doing the following:

sudo a2disconf temp_debug && sudo apachectl graceful

Reloading Apache

The commands above already reload Apache through apachectl graceful. If you enable or disable the configuration without that step, remember to reload Apache yourself afterwards:

sudo systemctl reload apache2