Thursday, May 28, 2009

Counting the number of queries with PDO

One common habit you often see when people write their own database wrappers is counting the number of queries they perform. PDO doesn't support this internally, so I've seen people struggle and come up with all kinds of solutions. Mostly this is because people don't know how to properly extend the PDO classes and how to call the parent methods with a variable number of arguments, as PDO requires you to do.

In my last post I discussed calling parent methods with a variable number of arguments, so let's put that to actual use. Counting the number of queries is quite simple, really. You just have to override the methods that perform the actual queries and, in addition to calling the parent method, increment the query count each time these methods are called.

Here is an example class for overriding the PDO class.

class lqPDOWrapper extends PDO
{
 private $queryCount;

 public function __construct ($dsn)
 {
  $args = func_get_args();
  call_user_func_array(array($this, 'parent::__construct'), $args);

  $this->queryCount = 0;
  $this->setAttribute(PDO::ATTR_STATEMENT_CLASS,
   array('lqPDOStatementWrapper', array(& $this->queryCount)));
 }

 public function exec ($statement)
 {
  $this->queryCount++;

  $args = func_get_args();
  return call_user_func_array(array($this, 'parent::exec'), $args);
 }

 public function query ($statement)
 {
  $this->queryCount++;

  $args = func_get_args();
  return call_user_func_array(array($this, 'parent::query'), $args);
 }

 public function getQueryCount ()
 {
  return $this->queryCount;
 }
}

However, the PDO class is not the only class that needs to be extended. You will also need to create a new class for the statements, because with prepared statements the statement class can also perform queries. So, here's how to extend the PDOStatement class.

class lqPDOStatementWrapper extends PDOStatement
{
 private $queryCount;

 protected function __construct (& $queryCount)
 {
  $this->queryCount = & $queryCount;
 }

 public function execute ($input_parameters = array())
 {
  $this->queryCount++;

  $args = func_get_args();
  return call_user_func_array(array($this, 'parent::execute'), $args);
 }
}

Now, whenever you want to get the number of queries performed through the database object, just call the getQueryCount() method on the PDO object. Note that even though the arguments in these functions are not actually used directly, the argument lists are (mostly) needed to conform to the PDO class specification.

In the class extending the PDO class, I've taken advantage of PDO's built-in support for custom statement classes (the PDO::ATTR_STATEMENT_CLASS attribute). This makes it simple to create your own PDOStatement wrappers. In addition, you can pass constructor arguments to the statement class, which I use to pass a reference to the database object's counter.

This solution is kind of neat. Usually you see people passing the actual database object to the statement class in order to increment the query count from there. In my solution, however, I simply pass a reference to the database object's query counter to the statement class, and the statement class binds its own $queryCount member to that reference. Because the statement's member $queryCount is actually a reference to the database object's member $queryCount, incrementing the statement's counter increments the database object's query count.
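The reference sharing can be tried out without a database. The following is only a minimal sketch: CounterOwner and CounterChild are hypothetical stand-ins for the two wrapper classes, and nothing here touches PDO itself.

```php
<?php
// Hypothetical stand-ins for the PDO wrappers; no database is involved.
class CounterOwner
{
    private $count = 0;

    public function makeChild()
    {
        // The property is passed by reference because the child's
        // constructor declares its parameter with &
        return new CounterChild($this->count);
    }

    public function getCount()
    {
        return $this->count;
    }
}

class CounterChild
{
    private $count;

    public function __construct(&$count)
    {
        // Bind this member to the owner's member instead of copying it
        $this->count = &$count;
    }

    public function bump()
    {
        // Increments the owner's counter through the reference
        $this->count++;
    }
}

$owner = new CounterOwner();
$child = $owner->makeChild();
$child->bump();
$child->bump();

echo $owner->getCount(); // prints 2
```

Without the & in the constructor's assignment, the child would only get a copy of the value and the owner's counter would stay at zero.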

Here is a simple example of how to use the above classes:

$db = new lqPDOWrapper('mysql:host=localhost;dbname=test', 'root', '');
$db->query('SELECT 1 + 1');
$sth = $db->prepare('SELECT 2 + 2');
$sth->execute();
$sth->execute();

echo "Performed " . $db->getQueryCount() . " queries.";

This will output "Performed 3 queries.". One query was performed through the query() method and two queries were performed by calling the execute() method twice.

Wednesday, May 27, 2009

Parent methods with variable arguments list

PHP's syntax allows you to create functions with a variable number of arguments. Either you can give parameters default values or you can use func_get_args() to retrieve a larger number of arguments. These functions can be quite handy when programming, because PHP doesn't have any kind of function overloading based on the argument list.

Usually this doesn't really cause any headaches or problems, but there is one particular case that can become tricky: calling the parent's version of a method that takes a variable number of arguments.

When designing an API, it might be a good idea to avoid large variability in a function's argument list and to clearly indicate the default values if you do happen to have a variable number of arguments. Personally, I would also recommend using array parameters instead of getting the list of arguments with func_get_args(). Sometimes, however, you are handed an API that cannot be changed, and you're stuck with extending the API classes.

For example, if you're trying to extend PHP's internal classes, you may have the problem that the default values for arguments aren't even listed in PHP's documentation. Because of this, you can't just pass default values to the functions; you actually have to pass a different number of arguments.

PHP has a built-in function called call_user_func_array(). How to use it here may not be obvious, since PHP's documentation barely mentions how to call parent methods using a callback parameter. In fact, the appropriate usage pretty much only appears in one example in the documentation and is not mentioned or described elsewhere.

To call a parent class's method with call_user_func_array(), you'd have to use something like:

$args = func_get_args();
call_user_func_array(array($this, 'parent::test'), $args);

This allows you to call the parent method with a variable number of arguments using the arguments from the current method. Of course, you can always modify the arguments if needed. However, the assignment to the $args variable before passing it as a parameter is necessary. If you try to use func_get_args() directly as a parameter to a function, PHP stops with "Fatal error: func_get_args(): Can't be used as a function parameter".

Here is a simple example of code that calls a parent method with a variable number of arguments:

class ParentClass
{
 function test ()
 {
  echo "Args: ";
  foreach (func_get_args() as $arg)
  {
   echo "$arg, ";
  }
 }
}

class ChildClass extends ParentClass
{
 function test ()
 {
  echo "Called from Child, ";
  
  $args = func_get_args();
  call_user_func_array(array($this, 'parent::test'), $args);
 }
}

$foo = new ChildClass();
$foo->test('arg1', 'arg2', 'arg3', 'arg4');

This code will simply output "Called from Child, Args: arg1, arg2, arg3, arg4,".
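On PHP 5.6 and later, the same forwarding can also be sketched with variadic parameters and argument unpacking (the ... operator), which avoids the 'parent::test' callback string; that string form is deprecated as of PHP 8.2. The class names below are hypothetical and only mirror the example above.

```php
<?php
// Hypothetical classes mirroring the ParentClass/ChildClass example

class BaseExample
{
    public function test()
    {
        // The parent still reads its arguments with func_get_args()
        return 'Args: ' . implode(', ', func_get_args());
    }
}

class VariadicChild extends BaseExample
{
    // Collect any number of arguments into the $args array
    public function test(...$args)
    {
        // Unpack $args again when calling the parent method
        return 'Called from Child, ' . parent::test(...$args);
    }
}

$child = new VariadicChild();
$result = $child->test('arg1', 'arg2', 'arg3');

echo $result; // Called from Child, Args: arg1, arg2, arg3
```

The methods return strings instead of echoing them here, purely to make the result easier to check.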

Saturday, May 9, 2009

str_split and UTF-8 (and other encodings)

If you've ever dealt with UTF-8 in PHP, you probably know that you're in for a lot of hassle, since PHP doesn't internally support any character encodings (until PHP6, that is). Luckily, at least the mbstring extension exists, which provides the basic functionality for handling various encodings in PHP, but even that library is missing some often needed and useful functions.

One of these missing functions is an equivalent of str_split(). That function splits a string into an array of strings with a specified number of characters. It can come in quite handy at times, even though it has only been available among the normal string functions since PHP5. Here's how to achieve the same functionality with UTF-8 and other encodings.

Dealing with UTF-8

If you're working with UTF-8, there's a relatively easy solution for you, since the PCRE functions support UTF-8 simply by using the modifier u. If you just need to separate the entire string into an array of characters, you can simply use preg_split():

$chars = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);

The empty regular expression matches between all characters, which causes the string to be split into a character array. Because of the u modifier, the string is treated as UTF-8 and every character in the encoding is matched as a whole. The use of PREG_SPLIT_NO_EMPTY is required, because otherwise the result would contain an empty string at the beginning and at the end: the regular expression also matches between the beginning of the string and the first character, and between the last character and the end.

You could also use preg_match_all() to just create a list of all characters like:

preg_match_all('/./us', $string, $match);
$chars = $match[0];
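As a quick check of both approaches, here is a small sketch. The sample string is my own choice and is written with byte escapes so that the result does not depend on the encoding of the source file.

```php
<?php
// "héllo": the é is the two bytes C3 A9 in UTF-8
$string = "h\xC3\xA9llo";

// The byte-oriented str_split() cuts the é in half
$bytes = str_split($string);

// Both UTF-8 aware variants keep it as a single character
$chars = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
preg_match_all('/./us', $string, $match);

echo count($bytes), "\n";       // 6
echo count($chars), "\n";       // 5
var_dump($chars === $match[0]); // bool(true)
```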

If you want to replicate the str_split functionality, allowing you to split the string into longer character sequences, you could use the following function:

function str_split_utf8 ($string, $split_length = 1)
{
 $length = (int) $split_length;
 $string = (string) $string;
 
 if ($length < 1)
 {
  return false;
 }
 
 return preg_split("/(.{{$length}})/us", $string, -1,
  PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
}

This function will work just like str_split(), except that it will work correctly for UTF-8 strings. I prefer to use preg_split() here instead of preg_match_all(), because my testing indicates it is slightly faster. The function works in a somewhat roundabout way, because the chunks are captured as delimiters instead of the string being split into an array of strings (hence the use of PREG_SPLIT_DELIM_CAPTURE). Only the last "overflow" characters are actually returned as a nonempty piece between delimiters.
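To see what the delimiter capture contributes, the same pattern can be run with and without PREG_SPLIT_DELIM_CAPTURE. This is only an illustrative sketch with a sample string of my own and a fixed chunk length of 2.

```php
<?php
// "äöüxß" written as UTF-8 byte escapes
$string = "\xC3\xA4\xC3\xB6\xC3\xBCx\xC3\x9F";

// With delimiter capture, the matched two-character chunks are returned
$withCapture = preg_split('/(.{2})/us', $string, -1,
    PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);

// Without it, only the leftover "overflow" character survives
$withoutCapture = preg_split('/(.{2})/us', $string, -1,
    PREG_SPLIT_NO_EMPTY);

echo implode('|', $withCapture), "\n";    // äö|üx|ß
echo implode('|', $withoutCapture), "\n"; // ß
```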

You could use preg_split() to actually split the string into an array of strings of the proper size by replacing the above preg_split() call with:

preg_split(
 "/(?<=\G.{" . ($length - 1) . "}|\A.{{$length}})(?<=.{{$length}})(?=.)/us",
 $string);

However, this regular expression is considerably slower, and I think you'll agree that it's not the most obvious regular expression either. If you need practice in understanding regular expressions, feel free to try to figure out how and why it works.

Encodings other than UTF-8

If you want an mbstring equivalent of str_split() that allows you to use different encodings, you'll have to resort to manually extracting parts of the string with the mbstring functions. While it isn't really that much harder, it is significantly slower. To replicate the str_split() function, you could use a function like:

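// PHP 7.4 and later ship a built-in mb_str_split() in the mbstring
// extension, so only define this version when it is missing
if (!function_exists('mb_str_split'))
{
 function mb_str_split ($string, $split_length = 1, $encoding = null)
 {
  $chunk = (int) $split_length;
  $string = (string) $string;

  if ($chunk < 1)
  {
   return false;
  }

  // Use internal encoding if none provided
  if ($encoding === null)
  {
   $encoding = mb_internal_encoding();
  }

  // Collect one chunk of $chunk characters per iteration
  $len = mb_strlen($string, $encoding);
  $return = array();
  for ($i = 0; $i < $len; $i += $chunk)
  {
   $return[] = mb_substr($string, $i, $chunk, $encoding);
  }
  return $return;
 }
}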

This would work just fine with UTF-8 too, but the problem is that due to the iterative nature of the code, it's much slower than calling a single function to do the entire operation for you. UTF-8 is used much more often in PHP than any other multibyte encoding, which is why I provided a separate way of working with UTF-8 strings.
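To illustrate the loop on an encoding other than UTF-8, here is a small sketch that chunks a UTF-16BE string two characters at a time. The sample string and the choice of UTF-16BE are my own assumptions, and the mbstring extension has to be loaded.

```php
<?php
// "héllo" as UTF-8 byte escapes, converted to UTF-16BE for the test
$utf8 = "h\xC3\xA9llo";
$utf16 = mb_convert_encoding($utf8, 'UTF-16BE', 'UTF-8');

$chunks = array();
$len = mb_strlen($utf16, 'UTF-16BE');

for ($i = 0; $i < $len; $i += 2)
{
    // Offsets and lengths are counted in characters, not bytes
    $piece = mb_substr($utf16, $i, 2, 'UTF-16BE');

    // Convert back to UTF-8 only to make the result printable
    $chunks[] = mb_convert_encoding($piece, 'UTF-8', 'UTF-16BE');
}

echo strlen($utf16), "\n";     // 10 (bytes)
echo $len, "\n";               // 5 (characters)
echo implode('|', $chunks);    // hé|ll|o
```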