Apache and Log Rotation

I was looking into Apache log rotation schemes and here’s what I came up with. Note that this is specifically for a Linux system.

- http://httpd.apache.org/docs/2.0/programs/rotatelogs.html

These are the Apache docs for a program called ‘rotatelogs’, which comes with Apache. Instead of naming a log file directly, you specify the log as a pipe to a rotatelogs process. (A ‘pipe’ is a one-way communication channel between two processes: here Apache writes log lines into it and rotatelogs reads them.)

- configuring the log in httpd.conf

CustomLog "|rotatelogs -l /var/log/httpd/prod.error_log.%Y%m%d%H%M 86400" common

This will rotate the logs every 24 hours, creating a log file named:

/var/log/httpd/prod.error_log.YYYYMMDDHHMM

Both Apache and Urchin can be configured to use/be aware of this log filename format.
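
To make the naming concrete, here is a rough Python sketch of how a time-based rotator could derive the file name from the pattern and the 86400-second interval. It is not rotatelogs itself: the function name is made up, and it assumes intervals are aligned to multiples of the rotation time counted from the Unix epoch in UTC (the -l flag in the config above would use local time instead).

```python
import calendar
import time


def rotated_log_name(pattern: str, rotation_seconds: int, now: float) -> str:
    """Return the log file name a time-based rotator would use at `now`.

    Assumption: intervals are aligned to multiples of `rotation_seconds`
    since the Unix epoch (UTC), and the file is named after the start
    of the interval that contains `now`.
    """
    interval_start = int(now) - (int(now) % rotation_seconds)
    return time.strftime(pattern, time.gmtime(interval_start))


pattern = "/var/log/httpd/prod.error_log.%Y%m%d%H%M"
# 15:30 UTC on April 24 falls in the interval that began at 00:00 UTC.
stamp = calendar.timegm((2024, 4, 24, 15, 30, 0))
print(rotated_log_name(pattern, 86400, stamp))
```

Every hit during that 24-hour interval maps to the same name, which is why one file per day shows up.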

- Apache + Urchin

Now, it seems, you cannot easily set the time at which this 24-hour interval starts and ends (though the rotatelogs docs do describe an optional offset argument, in minutes from UTC). But that really does not matter. Why? Log rotation has nothing to do with making boundaries for (Urchin) statistics. Rotation of Apache logs is only for keeping logs from becoming unwieldy and taking up too much disk space. Urchin will read a log that goes across a midnight boundary and still process all the data properly according to the time stamp of each line, not the file’s time stamp or creation date. So, eventually, all data will properly get into Urchin.

The downside (if it can be called that) is that data for the previous day will not all be in Urchin at 00:01 the following day. If logs rotate at 07:00, then at 00:01 on April 25 the only data in Urchin from April 24 is from 00:00 to 07:00 on April 24. At 07:00 on April 25, the rest of the data from April 24 will be put into Urchin. Make sense?
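
If it helps, here is a small Python sketch of the idea that hits are assigned to a day by each line’s own time stamp, not by which file they sit in. This is only an illustration, not how Urchin is actually implemented; the function name and the sample log lines are invented, and it assumes Common Log Format time stamps.

```python
from collections import Counter
from datetime import datetime


def hits_per_day(lines):
    """Count hits per calendar day using each line's own time stamp."""
    counts = Counter()
    for line in lines:
        # Common Log Format puts the time stamp in [brackets].
        stamp = line[line.index("[") + 1 : line.index("]")]
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        counts[when.date().isoformat()] += 1
    return dict(counts)


# Hypothetical contents of a single file that spans midnight.
rotated_file = [
    '192.0.2.1 - - [24/Apr/2024:23:59:58 -0700] "GET / HTTP/1.0" 200 2326',
    '192.0.2.2 - - [25/Apr/2024:00:00:03 -0700] "GET /a HTTP/1.0" 200 512',
    '192.0.2.3 - - [25/Apr/2024:06:59:59 -0700] "GET /b HTTP/1.0" 404 209',
]
print(hits_per_day(rotated_file))
```

One file, two days of data, and each hit still ends up counted under the right day.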

And, most importantly, the reasons for using rotatelogs instead of Linux’s logrotate or a 3rd party rotator:

1) It’s a standard part of Apache
2) It permits log rotation without restarting Apache

- Lastly, in my travels, I found a bit of useful info in the case that we ever have to do some recovery:

“Urchin does not need any sort of log rotation to avoid data duplication. Urchin is equipped with a log tracking capability that ensures only new hits are processed.”

This is good news. We can rerun any logs without having to spend hours figuring out what we don’t need to reprocess.

3 Responses to “Apache and Log Rotation”

  1. jason says:

    After all of this, I looked more closely at how the Linux logrotate is configured and deals with Apache. Apache, when HUP’d, closes its log file handle and re-opens the log by name; if the intended file name is not found, it creates a new log file. Good news. So all a log rotator has to do is rename the log file (Apache is still writing to the same inode, so no log entries are lost) and then HUP Apache. Apache closes the file handle, sees that its usual filename is not there, and then starts the new log under its usual name.

    All of this was already in place for the project I was working on. Now knowing this, I realized I had set up another rotator, which confused the whole process. Once that was gone, the Linux rotation process began to normalize, but it was set to run on a weekly basis and so it was too slow to notice. That’s when I started looking into the ideas in my original post. Once I realized that a good rotation process was already in place, but was running slower than expected, I changed it to run on a daily basis and it seems that we’re good to go for the project. Amazingly, the easiest solution is the right one.
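
The rename-then-reopen behaviour described in the comment above can be tried out with a few lines of Python. This is just a stand-in for what Apache and a rotator do, not their code: the file names are invented, it runs in a temporary directory, and it assumes a POSIX system such as Linux, where an open file can be renamed.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as log_dir:
    live = os.path.join(log_dir, "access_log")
    rotated = live + ".1"

    handle = open(live, "a")        # the server's open log handle
    handle.write("before rename\n")
    handle.flush()

    os.rename(live, rotated)        # the rotator renames the file
    handle.write("after rename\n")  # same inode, so this lands in the renamed file
    handle.flush()

    handle.close()                  # the HUP: close the old handle ...
    handle = open(live, "a")        # ... and reopen under the usual name
    handle.write("after reopen\n")
    handle.close()

    with open(rotated) as f:
        rotated_lines = f.read().splitlines()
    with open(live) as f:
        live_lines = f.read().splitlines()

print(rotated_lines)
print(live_lines)
```

Nothing written between the rename and the reopen is lost; it simply ends up in the rotated file.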

  2. jason says:

    To clarify, now that this has also been done for REDF production…

    To set up basic rotation for Apache in Linux, you do not need to do anything to Apache, save that it writes to logs. All of the work is in logrotate. And that isn’t much. Place the following in /etc/logrotate.d/httpd

    /var/log/httpd/*log {
        rotate 14
        daily
        missingok
        sharedscripts
        postrotate
            /bin/kill -HUP `cat /var/run/httpd.pid 2>/dev/null` 2> /dev/null || true
        endscript
    }

    This means:
    - rotate 14 = Keep 14 rotated logs; with daily rotation, logs hang around for 14 days and then rotate into oblivion
    - daily = Rotation happens daily, meaning each log file number is advanced and the oldest file is deleted
    - missingok = Do not complain if a log file is missing
    - sharedscripts = Run the related scripts once, not once for every file matched by the glob pattern (/var/log/httpd/*log)
    - postrotate/endscript = Execute the lines in between these directives after the logs have been rotated (i.e. HUP the Apache process after rotating)
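
To see what “rotate 14” plus “daily” adds up to over time, here is a toy Python simulation of numbered rotation. It only models file names in memory and is not logrotate’s code; the function name and the base name access_log are made up for the example.

```python
def rotate_once(files, base="access_log", keep=14):
    """Return the set of file names after one numbered rotation.

    base.1 is the newest rotated log; anything that would be numbered
    above `keep` is dropped.
    """
    result = set()
    for name in files:
        if name == base:
            result.add(f"{base}.1")  # live log becomes .1
        else:
            number = int(name.rsplit(".", 1)[1]) + 1
            if number <= keep:
                result.add(f"{base}.{number}")  # shift .N to .N+1
    result.add(base)  # the server recreates the live log after the HUP
    return result


files = {"access_log"}
for _ in range(20):  # twenty days of daily rotation
    files = rotate_once(files)

print(sorted(files, key=lambda n: int(n.rsplit(".", 1)[1]) if "." in n else 0))
```

After twenty days there are still only the live log and fourteen numbered ones.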

  3. Shane Chao says:

    By adding the “dateext” parameter to the application-specific logrotate file, the date in the form YYYYMMDD is appended to the file name when it rotates.
