Caching made simple

In this tutorial we will show a very simple way to implement query caching. The goal is to create something that is versatile and easy to develop and use, even for not so experienced programmers.

The main idea is to have a PHP class which will automatically do the caching for all it’s methods. In the end, each method will have an additional parameter, which will define how long till the cache (for that function results) invalidates. If the cache is valid, we fetch the already-calculated results. Else, we run the query normally, and update the cache.

Cache Table

We will need a table in the database, that we’ll use to hold the cache. We’ll name that table `cache` (who would have guessed??). Here is the CREATE TABLE query.

CREATE TABLE IF NOT EXISTS `cache` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`_key` varchar(255) DEFAULT NULL,
`_call` text,
`value` longtext,
`ts` int(11) DEFAULT NULL,
`locktime` int(11) DEFAULT NULL,
`expirationtime` int(11) DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `_key_2` (`_key`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8
  • _key is the unique identifier for each cache record. It’s a string made by running md5 on the function name and arguments (more on this later)
  • _call is the function name and arguments
  • value is the cached query result
  • ts is the timestamp when the query was cached
  • locktime is the timestamp when the cache was locked for this query
  • expirationtime is the timestamp when the cache will invalidate

Before moving on to the PHP code, let me explain (through example) how this is going to work. Let’s suppose that you have a bookstore database and you need to search the books by the author name. Our goal is to call a function like this:
$cache->searchBooksByAuthor(300, “nikos”);
where 300 is the time before the results of this function expire.

Let’s start by defining the class, and 1 variable, the default value for the expiration of a locked row. I’ll also add the definition of the function mentioned before, for searching the books

class Cache {
   var $lock_timeout = 10;

   function __call($name, $arguments) {
   }

   function _searchBooksByAuthor($author) {
      // query the database like you would normally
      // do and return the result
   }
}

Notice that an underscore precedes the function name, and that the timeout isn’t included in the arguments. You’ll understand why, once we start adding code to the __call() function.

__call is a magic method. It is automatically invoked, when the class method we are calling doesn’t exist. Now, it’s getting more clear why we are using the underscore before the function name. searchBooksByAuthor doesn’t actually exist, so the __call() function will be invoked. Inside __call() we’ll check the cache and if we have a valid result cached, we’ll return that, else we’ll execute the _searchBooksByAuthor, add the results to our cache and return them at the same time.

Let’s start adding code to __call(). Let me say, that since everybody is using their own database wrapper, I am just using a fake function named myquery() to show which queries are executed.

// first think we must check that the function actually exists
if ( !method_exists($this, '_'.$name) ) {
   // handle the error anyway you like
}

$current_timestamp = time();
$timeout = $arguments[0]; // we said that the first argument will be the time till the cache record invalidates
$expiration_timestamp = $current_timestamp + $timeout;

// lets create the key for the cached row
$arguments[0] = $name;
$call = serialize(array_merge((array)$name, array_slice($arguments, 1)));
$key = md5($call);

We create a serialized array consisting of the function name and all the argument values. We use md5() on it and that’s our final key for the cached row. $call isn’t actually necessary and won’t be used anywhere in this class, but it is very usefull for debugging and to easily check what your cache holds (through a DB manager like phpMyAdmin).

If no records are in the cache table, we must run the original function, get the data, insert a row in the cache table, and return the data.

// searching for the cached row
$cached = myquery("SELECT * FROM cache WHERE _key = '$key'");

if ( !$cached['id'] ) {
   // row doesn't exist in the cache table, we just need to create it
   // but first, run the original function and get the result
   $data = call_user_func_array(array($this, '_'.$name), array_diff_key($arguments, array($timeout)));
   $value = serialize($data);

   myquery("
      INSERT INTO `cache`
      (`_key`, `_call`, `ts`, `locktime`, `expirationtime`, `value`)
      VALUES
      ('$key', '$call', $current_timestamp, 0, $expiration_timestamp, $value)
   ");

   return $data;
}

If the row does exist in the cache table, then there several options: it might be valid, it might be invalid, it might be locked, it might be problematic (eg locked and past the locking timeout)

// if valid
if ( $cached['ts'] + timeout > $current_timestamp ) return unserialize($cached['value']);

// locked, so we'll serve the outdated content
if ( $cached['locktime'] && $cached['locktime'] + $this->lock_timeout < $current_timestamp ) return unserialize($cached['value']);

// if we reach here, either the cache is invalid, or it is locked, and past the locked timeout
// in either case we lock and update the cache row
myquery("UPDATE `cache` SET locktime = $current_timestamp WHERE `_key`='$key'");

$data = call_user_func_array(array($this, '_'.$name), array_diff_key($arguments, array(' ')));
$value = serialize($data);

myquery("
   UPDATE `cache` SET
      `value` = '$value',
      `ts` = $current_timestamp,
      `locktime` = 0,
      `expirationtime` = $expiration_timestamp
   WHERE `_key` = '$key'
");

return $data;

And there you have it, an easy way to cache the result of your queries. Any comments,questions or suggestions are more than welcome.

This is a blog post of mine that was posted early 2011 in another blog that no longer exists. I found it, and thought it would be nice if I added it to this blog as well

Leave a Reply

Your email address will not be published. Please enter your name, email and a comment.

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>