Caching: the Ultimate Speed Booster
One of the secrets of high performance is not to write faster PHP code, but to avoid executing PHP code at all by caching the generated HTML in a file or in shared memory. The PHP script runs only once and its HTML output is captured; subsequent requests for the script are served the cached HTML. If the data needs to be updated regularly, an expiry time is set for the cached HTML. HTML caching is part of neither the PHP language nor the Zend Engine; it is implemented in PHP code. There are many class libraries that do this. One of them is the PEAR Cache, which we will cover in the next section. Another is the Smarty template library.
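The idea can be sketched in a few lines of plain PHP, without any library. The helper name cached_html(), the use of the system temporary directory and the 60-second lifetime are assumptions made for this illustration; they are not part of PEAR Cache or Smarty.

```php
<?php
// Minimal sketch of file-based HTML caching with an expiry time.
// cached_html() is a hypothetical helper, not a library function.
function cached_html(string $key, int $ttl, callable $render, string $dir): string
{
    $file = $dir . '/' . md5($key) . '.html';

    // Serve the cached copy if it exists and is younger than $ttl seconds.
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        return file_get_contents($file);
    }

    // Otherwise run the expensive code once and capture what it prints.
    ob_start();
    $render();
    $html = ob_get_clean();

    // LOCK_EX guards against two requests writing the file at the same time.
    file_put_contents($file, $html, LOCK_EX);
    return $html;
}

// Usage: the closure stands in for the slow part of a page.
echo cached_html('front-page', 60, function () {
    echo "<p>Generated at " . date('H:i:s') . "</p>";
}, sys_get_temp_dir());
```

Reloading the page within 60 seconds should show the same timestamp, because the closure is not run again until the cached file has expired.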
Finally, the HTML sent to a web client can be compressed. This is enabled by placing the following code at the beginning of your PHP script:
<?php
ob_start("ob_gzhandler");
// ... the rest of your script goes here ...
?>
If your HTML is highly compressible, it is possible to reduce the size of your HTML file by 50-80%, reducing network bandwidth requirements and latencies. The downside is that you need to have some CPU power to spare for compression.
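To get a rough feel for the saving on your own pages, you can compress a sample of the HTML by hand. The sketch below assumes the zlib extension is loaded (ob_gzhandler needs it as well); compression_saving() is a hypothetical helper, and the sample markup is deliberately repetitive, so it will compress better than a typical page.

```php
<?php
// Sketch: measure how much gzip shrinks a piece of HTML.
// compression_saving() is a hypothetical helper for illustration only.
function compression_saving(string $html): float
{
    $gz = gzencode($html, 6);   // level 6 is a moderate compression level
    return 100 * (1 - strlen($gz) / strlen($html));
}

// Artificial sample: 500 identical table rows.
$html = str_repeat("<tr><td class=\"cell\">row</td></tr>\n", 500);

printf("%d bytes of HTML, %.0f%% smaller after gzip\n",
       strlen($html), compression_saving($html));
```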
HTML Caching with PEAR Cache
The PEAR Cache is a set of caching classes that allows you to cache multiple types of data, including HTML and images.
The most common use of the PEAR Cache is to cache HTML text. To do this, we use the Output buffering class which caches all text printed or echoed between the start() and end() functions:
<?php
require_once("Cache/Output.php");
$cache = new Cache_Output("file", array("cache_dir" => "cache/") );
if ($contents = $cache->start(md5("this is a unique key!"))) {
    #
    # aha, cached data returned
    #
    print $contents;
    print "<p>Cache Hit</p>";
} else {
    #
    # no cached data, or cache expired
    #
    print "<p>Don't leave home without it…</p>"; # place in cache
    print "<p>Stand and deliver</p>"; # place in cache
    print $cache->end(10); # store for 10 seconds and print the buffered text
}
?>
Since I wrote these lines, a superior PEAR cache system has been developed: Cache_Lite.
The Cache constructor takes the storage driver to use as the first parameter. File, database and shared memory storage drivers are available; see the pear/Cache/Container directory. Benchmarks by Ulf Wendel suggest that the "file" storage driver offers the best performance. The second parameter is the storage driver options. The options are "cache_dir", the location of the caching directory, and "filename_prefix", which is the prefix to use for all cached files. Strangely enough, cache expiry times are not set in the options parameter.
To cache some data, you generate a unique id for the cached data using a key. In the above example, we used md5("this is a unique key!").
The start() function uses the key to find a cached copy of the contents. If the contents are not cached, start() returns an empty string, and everything subsequently written with echo or print is buffered in the output cache until end() is called.
The end() function returns the contents of the buffer, and ends output buffering. The end() function takes as its first parameter the expiry time of the cache. This parameter can be the seconds to cache the data, or a Unix integer timestamp giving the date and time to expire the data, or zero to default to 24 hours.
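The three forms of the expiry parameter can be pictured as one rule that turns the argument into an absolute expiry time. The sketch below is not PEAR's code: expires_at() is a hypothetical helper, and the cut-off used to tell a number of seconds from a Unix timestamp (1 January 2000) is an assumption made for this illustration.

```php
<?php
// Hypothetical helper: convert an "expires" argument into an absolute
// Unix timestamp. Not taken from the PEAR Cache source.
const DEFAULT_LIFETIME = 86400;      // 24 hours, the default described above
const TIMESTAMP_CUTOFF = 946684800;  // 2000-01-01 00:00:00 UTC (assumed cut-off)

function expires_at(int $expires, int $now): int
{
    if ($expires === 0) {
        return $now + DEFAULT_LIFETIME;   // zero: fall back to the default
    }
    if ($expires < TIMESTAMP_CUTOFF) {
        return $now + $expires;           // small value: seconds from now
    }
    return $expires;                      // large value: already a timestamp
}

$now = time();
print date('c', expires_at(10, $now)) . "\n";  // ten seconds from now
print date('c', expires_at(0, $now)) . "\n";   // 24 hours from now
```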
Another way to use the PEAR cache is to store variables or other data. To do so, you can use the base Cache class:
<?php
require_once("Cache.php");
$cache = new Cache("file", array("cache_dir" => "cache/") );
$id = $cache->generateID("this is a unique key");
if ($data = $cache->get($id)) {
    print "Cache hit.<br>Data: $data";
} else {
    $data = "The quality of mercy is not strained...";
    $cache->save($id, $data, $expires = 60);
    print "Cache miss.<br>";
}
?>
To save the data we use save(). If your unique key is already a legal file name, you can bypass the generateID() step. Objects and arrays can be saved because save() will serialize the data for you. The last parameter controls when the data expires; this can be the seconds to cache the data, or a Unix integer timestamp giving the date and time to expire the data, or zero to use the default of 24 hours. To retrieve the cached data we use get().
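Serialization is what lets a structured value survive the trip through a cache file. The sketch below uses only PHP's built-in serialize() and unserialize(); that the string is what ends up in the cache file is a simplification of what a cache class does internally, and the variable names are illustrative.

```php
<?php
// Sketch: an array is flattened to a string before storage and rebuilt
// on retrieval. This mimics, in simplified form, what a cache class
// does behind save() and get().
$data = ['title' => 'Portia', 'lines' => [1, 2, 3]];

$stored   = serialize($data);       // the string a cache would write to disk
$restored = unserialize($stored);   // the value a cache would hand back

var_dump($restored === $data);      // same keys, values and types
```

Only unserialize data that your own code wrote; a cache directory that other users can write to should not be trusted.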
You can delete a cached data item using $cache->delete($id) and remove all cached items using $cache->flush().
New: A faster caching class is Cache_Lite. Highly recommended.
Perhaps the most significant change to PHP performance I have experienced since I first wrote this article is my use of Squid, a web accelerator that is able to take over the management of all static HTTP files from Apache. You may be surprised to find that the overhead of using Apache to serve both dynamic PHP and static images, JavaScript, CSS and HTML is extremely high. In my experience, 40-50% of our Apache CPU utilisation goes into handling these static files (the remaining CPU usage is taken up by PHP running in Apache).
It is better to offload the serving of these static files to Squid in httpd-accelerator mode. In this scenario, you use Squid as the front web server on port 80, and set all .php requests to be dispatched to Apache on port 81, which I have running on the same server. The static files are served by Squid.
Here is a sample setup with Squid 2.6. A portion of the default Squid configuration file, modified for acceleration, is shown below. In this sample the server runs Squid on port 80, and the origin server (Apache) listens on port 8000 rather than port 81. Make sure that all http_access permissions in the default config file are commented out. We assume that all files are cached for 7 hours (420 minutes). Then add at the bottom of the default config file:
http_port 80 vport=8000
cache_peer 127.0.0.1 parent 8000 3130 originserver
http_access allow all
# change below to match your hostname (used in logs as host)
visible_hostname 10.1.187.23
cache_store_log none
refresh_pattern -i \.jpg$ 0 50% 420
refresh_pattern -i \.gif$ 0 50% 420
refresh_pattern -i \.png$ 0 50% 420
refresh_pattern -i \.js$ 0 20% 420
refresh_pattern -i \.htm$ 0 20% 420
refresh_pattern -i \.html$ 0 20% 420