Some Tuning Tips For Apache Mod_Cache (mod_disk_cache)

This post was delayed by months, but better late than never. A while back I was tasked with creating a caching cluster to serve cached API requests. During my research I selected Apache mod_cache, and below are some things I learned while tuning it.

Server Specifications:

Dell 1950s, dual-core Intel Xeon 2.0 GHz
8 GB of RAM
Running 64-bit CentOS 5

Server Application Type:

API Caching Server

Basic Installation:

Install Apache 2.2.8
gunzip -c httpd-2.2.8.tar.gz | tar -xvf -
cd httpd-2.2.8
./configure --disable-status --enable-status=shared --enable-rewrite --enable-so --enable-proxy --enable-cache --enable-disk-cache
make
make install

Install PHP 5.2.8

gunzip -c php-5.2.8.tar.gz | tar -xvf -
cd php-5.2.8
./configure --with-apxs2=/usr/local/apache2/bin/apxs  --enable-module=so --blah --blah --blah
make
make install

Below is the mod_cache portion of my apache vhost:

<IfModule mod_cache.c>
<IfModule mod_disk_cache.c>
CacheDefaultExpire 3600
CacheEnable disk /
CacheRoot "/opt/apicache/"
CacheDirLevels 2
CacheDirLength 1
CacheMaxFileSize 1000000
CacheMinFileSize 1
CacheIgnoreCacheControl On
CacheIgnoreNoLastMod On
CacheIgnoreQueryString Off
CacheIgnoreHeaders None
CacheLastModifiedFactor 0.1
CacheDefaultExpire 3600
CacheMaxExpire 86400
CacheStoreNoStore On
CacheStorePrivate On
</IfModule>
</IfModule>
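
To make the expiry directives above a bit more concrete, here is a minimal Python sketch of how I understand mod_cache picks a lifetime when a response carries no explicit expiry information. The function name `heuristic_ttl` and its parameters are hypothetical, and it deliberately ignores Expires and Cache-Control: max-age, so treat it as a simplification rather than Apache's actual code.

```python
def heuristic_ttl(seconds_since_last_modified=None,
                  factor=0.1,           # CacheLastModifiedFactor from the vhost above
                  default_expire=3600,  # CacheDefaultExpire
                  max_expire=86400):    # CacheMaxExpire
    """Return an approximate cache lifetime in seconds (simplified sketch)."""
    # No usable Last-Modified header: fall back to the default lifetime.
    if seconds_since_last_modified is None or seconds_since_last_modified <= 0:
        return default_expire
    # Otherwise scale the document's age by the factor, capped at the maximum.
    return min(int(seconds_since_last_modified * factor), max_expire)


if __name__ == "__main__":
    print(heuristic_ttl(None))        # no Last-Modified header
    print(heuristic_ttl(36000))       # modified 10 hours ago
    print(heuristic_ttl(10_000_000))  # very old document, hits the cap
```

With the values from my vhost, a response last modified 10 hours ago would be kept for about an hour, and nothing is kept longer than a day.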

The first performance decision I made was to use --enable-disk-cache. After doing some research I found that, contrary to what you would think, disk cache is faster than memory cache when it comes to Apache mod_cache and OS interaction. The reason is that with mod_mem_cache the server has to read each file into memory, copying its data into RAM and then into a kernel buffer in order to deliver it, which is not optimal. With mod_disk_cache on Linux, Apache uses the sendfile API, which does not require the server to read the file before delivering it. The server identifies the file to deliver and the destination via the API, and the OS then reads and delivers the file. No read call or user-space memory for the payload is required, and the OS can serve the data straight from the file system cache. In effect the kernel acts as the buffer, which increases cache speed.
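
To illustrate the sendfile idea outside of Apache, here is a small Python sketch. It is not how Apache is implemented; `serve_cached_file` and `demo` are hypothetical names, and the socket pair simply stands in for a client connection. Python's `socket.sendfile()` uses the OS sendfile call where it is available and falls back to plain sends otherwise.

```python
import os
import socket
import tempfile


def serve_cached_file(path, conn):
    """Hand the file to the kernel for delivery instead of read()ing it ourselves."""
    with open(path, "rb") as cached:
        # Returns the total number of bytes sent.
        return conn.sendfile(cached)


def demo():
    payload = b"cached API response\n" * 100
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    server_side, client_side = socket.socketpair()
    try:
        sent = serve_cached_file(path, server_side)
        server_side.close()  # signal end of stream to the reader
        chunks = []
        while True:
            chunk = client_side.recv(65536)
            if not chunk:
                break
            chunks.append(chunk)
        return sent, b"".join(chunks) == payload
    finally:
        server_side.close()
        client_side.close()
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is that the serving code never holds the payload in its own buffers; it only names the file and the destination.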

The second performance issue I saw was that when I set CacheDirLevels and CacheDirLength too high, load skyrocketed. I found that CacheDirLevels 2 and CacheDirLength 1 was the optimal setting for about 50 GB of cache, since it lowers the amount of directory traversal needed on reads and writes.
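
A rough way to see why these two values matter is to count how many directories a given combination can produce. The Python sketch below assumes each character in a cache directory name comes from a 64-symbol alphabet, which is my understanding of how mod_disk_cache encodes its hashes; the function names are hypothetical.

```python
ALPHABET_SIZE = 64  # assumption: 64 possible characters per position in a directory name


def leaf_directories(levels, length):
    """Number of possible bottom-level directories for the given settings."""
    return (ALPHABET_SIZE ** length) ** levels


def total_directories(levels, length):
    """Number of possible directories across all levels under CacheRoot."""
    names_per_level = ALPHABET_SIZE ** length
    return sum(names_per_level ** depth for depth in range(1, levels + 1))


if __name__ == "__main__":
    for levels, length in [(2, 1), (3, 1), (3, 2)]:
        print(levels, length,
              leaf_directories(levels, length),
              total_directories(levels, length))
```

Under that assumption, CacheDirLevels 2 with CacheDirLength 1 tops out at 4,096 leaf directories, while 3 levels of length 2 allows tens of billions, which means far more directory entries to create and walk.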

The third performance gain came after I met Brian Moon at the Velocity Conference in 2008, a very bright guy! I asked Brian how I could optimize the filesystem for Apache mod_disk_cache, and he told me to make some fstab changes, which apply only if you are using EXT3. Specifically, my cache partition should reflect the entry below:

/dev/md1                /opt                    ext3   defaults,noatime,nodiratime,data=writeback 1 2

Setting noatime removes a write for every read. Typically, when a file is read, the system updates the file’s inode with the access time so that the last access is recorded, which entails a write to the file system. Unless you are running some sort of mirror, you probably do not need the access time written.

Setting nodiratime does the same as noatime, but for directories.
*Note 08/09/2009: it has been pointed out to me that noatime is a superset of nodiratime, so if you use noatime you don’t need the nodiratime entry.

Setting data=writeback means data ordering is not preserved: data may be written into the file system after its metadata has been committed to the journal, which offers higher throughput. Warning: this setting could allow recently modified files to become corrupted in the event of an unexpected reboot or system crash.
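
If you want to double check an fstab entry against the three options discussed above, a small Python sketch like the following could do it. The helper names are hypothetical, and it only parses the text of a single line; it does not inspect what is actually mounted.

```python
def parse_fstab_line(line):
    """Split one fstab entry into its device, mount point, type and option list."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("expected at least 4 fields in an fstab entry")
    device, mount_point, fs_type, options = fields[:4]
    return {
        "device": device,
        "mount_point": mount_point,
        "fs_type": fs_type,
        "options": options.split(","),
    }


def tuning_report(line):
    """Report the atime and journaling options found in one fstab entry."""
    options = parse_fstab_line(line)["options"]
    data_mode = next(
        (opt.split("=", 1)[1] for opt in options if opt.startswith("data=")),
        None,
    )
    return {
        "noatime": "noatime" in options,
        # nodiratime adds nothing once noatime is set
        "redundant_nodiratime": "noatime" in options and "nodiratime" in options,
        "data_mode": data_mode,
    }


# The cache partition entry from this post
FSTAB_LINE = "/dev/md1 /opt ext3 defaults,noatime,nodiratime,data=writeback 1 2"

if __name__ == "__main__":
    print(tuning_report(FSTAB_LINE))
```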

If you look at the graph below, you will see the sharp gain from the tuning outlined above.

[Graph: modcache]

Happy tuning, fun stuff!

13 Responses to “Some Tuning Tips For Apache Mod_Cache (mod_disk_cache)”

  1. I have a load balancer in front of my app server. Would mod_cache with mod_disk_cache also be the solution if the load balancer balances the load across many servers behind it?

  2. Phil Chen says:

    Hi Frederic,

    I currently have Apache mod_cache with disk caching running on a cluster of 9 servers sitting behind some Foundry load balancers, and it works fine. How the load balancer affects the situation depends more on your application. Hope that answers your question.

  3. Lamah says:

    Hi there, nodiratime is not actually required (it is implied as part of noatime, which applies to all files, including directories.)

  4. Phil Chen says:

    Upon doing some research it looks like you are correct that noatime is a superset of nodiratime. Thanks for the input, Lamah.

  5. akira says:

    Did you try tmpfs as disk_cache?

  6. Doug in Paia says:

    Hi Phil, thanks for the great post, this really got me jump started. I am running a small site with cakePHP. We host on a couple of CentOS 5 boxes, 64 bit on both VPS and dedicated. I am running Apache ab against our VPS instances and I tried your settings. I also tried a vanilla mem cache. I came up with the results below. The summary is that I didn’t get any performance boost in tests on two sample pages. In the first instance, disk caching is slower by a factor of 3 and memory caching is about the same. In the second test case, it seems to make no difference whether caching is on or not. Is Apache ab a valid test? Any thoughts about what I might be doing wrong? Thanks, Doug

    n1000c10ModCache_new-releases.html:Requests per second:2.65
    n1000c10ModMemCache_new-releases.html:Requests per second:8.59
    n1000c10new-releases.html:Requests per second:8.89 << no cache enabled

    n1000c10ModCache_moon-blue-ray.html:Requests per second:1.14
    n1000c10ModCacheR1_moon-blue-ray.html:Requests per second:1.20
    n1000c10ModCacheR1_moon-blue-ray_try2.html:Requests per second:1.23
    n1000c10ModMemCache_moon-blue-ray.html:Requests per second:1.22
    n1000c10moon-blue-ray.html:Requests per second:1.25 << no cache enabled

  7. Phil Chen says:

    Doug in Paia,

    Thanks for posting the results from your Apache ab tests. It looks like you are testing the theory on disk caching versus caching in RAM. Apache ab is a good way to go about it; however, I think the variables are your disk speed and your EXT3 settings, which may be slowing your disk results down.

    To be honest, the performance difference between disk caching and mem caching in this instance is not very large; my results for the two differed by just 0.2. The big factor is the EXT3 settings I outline: when using disk caching, that is where you get the bigger bang. I chose disk caching because it allows for a lot more cache storage than the maximum amount of RAM in your server. So for my situation disk allowed me a lot more cache, and the steps above gave me the performance I needed.

    That’s just my view on it, hope that helps :)

  8. randy melder says:

    Nice job, Phil. Simple change with a big impact. I’m seeing lots of hits and first person accounts have proven this a valuable config change. You’re the man.

  9. Hello,

    I have mod_cache used in conjunction with mod_cband.
    But when we enable mod_cache, mod_cband is overridden.
    Any idea how to define the order of execution?

    Best regards,
    Vincenzo D’Amore

  10. [...] Kasindorf (dormando) and, after discussing file-system optimization, proceeded to implement some file-system tuning which, in turn, helped increase our Apache Web Servers running mod_disk_cache efficiency by 3 [...]

  11. Thanks for the tip, I was looking for this. Now let’s see the results over the next week.

  12. indibest.com says:

    Excellent work. I was able to reduce MaxClients after the changes

  13. Steve Clay says:

    This is helpful, but readers should note that several of these directives tell mod_cache to ignore/defy HTTP caching directives sent by your app:

    CacheIgnoreCacheControl On
    CacheIgnoreNoLastMod On
    CacheIgnoreHeaders None
    CacheStoreNoStore On
    CacheStorePrivate On

    The result is that users may see stale content that should not have been cached and, more dangerously, CacheStorePrivate On tells the cache it can serve to multiple people responses marked as private.

    Caching is always app-dependent, but generally settings like these indicate your app/server is not sending appropriate cache headers. Fix that and these settings won’t be needed.
