Testing Database Locks with PHPUnit and Gearman

For the Beta 2 release of Doctrine 2 we plan to integrate pessimistic database-level locks across all the supported vendors (MySQL, Oracle, PostgreSQL, IBM DB2 so far). This means row-level locking as defined in the ANSI SQL standard using "SELECT .. FOR UPDATE" will be optionally available in DQL queries and finder methods. The implementation of this extension to SELECT statements is rather trivial; functional testing of this feature, however, is not.

A general approach would look like this:

  1. Run queries 1 and 2 with FOR UPDATE in the background
  2. Have both queries lock the row for a specified time x (using sleep)
  3. Verify that one of the two processes/threads runs approximately 2*x the lock time.

Since PHP does not support process forking or threads out of the box, you run into a serious problem: how do you execute two database queries in parallel and verify that one query is indeed blocking read access for the second one?

Side note: There are some drawbacks to this testing approach. It could be that one background thread has already finished its lock sleep when the second one just starts. The locking would work in these cases, but the run time would be nowhere near 2*x seconds, producing a test failure. We are talking about a functional test though, and I will accept a failure from time to time just to be 99% sure that locking works.
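To make the timing argument concrete, here is a minimal sketch of the arithmetic behind it, with no database involved. The function name and its parameters are made up for illustration. It assumes the first worker takes the lock immediately and that both workers hold it for the same time x.

```php
<?php
/**
 * Expected run time of the second worker, assuming the first worker takes
 * the lock at t=0 and both workers hold the lock for $lockTime seconds.
 * Hypothetical helper for illustration, not part of the Doctrine test suite.
 */
function secondWorkerRunTime($lockTime, $startDelay)
{
    // The second worker gets the lock once the first one releases it,
    // or right away if it only started after the release.
    $acquiredAt = max($lockTime, $startDelay);
    $finishedAt = $acquiredAt + $lockTime;

    return $finishedAt - $startDelay;
}

echo secondWorkerRunTime(1.0, 0.0), "\n"; // both workers start together
echo secondWorkerRunTime(1.0, 0.9), "\n"; // second worker starts late
```

With no delay the second worker needs the full 2*x. If it starts 0.9 seconds late it only runs for about 1.1 seconds, so an assertion on 2 seconds would fail although the lock did its job.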

Solving this problem with Gearman provides a pretty nice "real-world" example for the job-server that I wanted to share. This blog post contains a stripped-down code example from the Doctrine 2 test suite. If you are interested, you can see the complete Gearman Locking Tests on GitHub.

Gearman allows you to register worker processes with the job-server and lets clients execute jobs on those workers in parallel. After installing the Gearman job-server and the PHP pecl/gearman extension (Rasmus Lerdorf has a post on installation) we can go on writing our locking tests with Gearman.

The first bit is the worker, a PHP script that tries to acquire a database lock and then sleeps for a second. The return value of this script is the total time required for acquiring the lock and sleeping.

class LockAgentWorker
{
    private $em; // the EntityManager, assigned in processWorkload()

    public function findWithLock($job)
    {
        $fixture = $this->processWorkload($job); // setup doctrine in here
 
        $s = microtime(true);
        $this->em->beginTransaction();
 
        $entity = $this->em->find($fixture['entityName'], $fixture['entityId'], $fixture['lockMode']);
 
        sleep(1);
        $this->em->rollback(); // clean up doctrine
 
        return (microtime(true) - $s);
    }
}

The glue code for the worker script consists of registering the worker method with the job-server and a simple infinite loop:

$lockAgent = new LockAgentWorker();
 
$worker = new \GearmanWorker();
$worker->addServer();
$worker->addFunction("findWithLock", array($lockAgent, "findWithLock"));
 
while($worker->work()) {
    if ($worker->returnCode() != GEARMAN_SUCCESS) {
        break;
    }
}

We need two running workers for this to work, since one worker only processes one task at a time. Just open up two terminals and launch the worker script in each. They will wait for their first task to process.

Now we need to write our PHPUnit TestCase, which will contain a GearmanClient to execute two "findWithLock" tasks in parallel. Our locking assertion will work like this:

  1. Register two tasks for the "findWithLock" method that access the same database row.
  2. Register a completed callback using "GearmanClient::setCompleteCallback()" that collects the run-time of the individual workers.
  3. Execute these tasks in parallel using "GearmanClient::runTasks()".
  4. Assert that the maximum run-time is around 2 seconds (since each worker sleeps 1 second)

The code for these steps could look like:

class GearmanLockTest extends \Doctrine\Tests\OrmFunctionalTestCase
{
    private $gearman = null;
    private $maxRunTime = 0;
    private $articleId;
 
    public function testLockIsAquired()
    {
        // .. write fixture data into the database
 
        $gearman = new \GearmanClient();
        $gearman->addServer();
        $gearman->setCompleteCallback(array($this, "gearmanTaskCompleted"));
 
        $workload = array(); // necessary workload data to configure workers
        $gearman->addTask("findWithLock", serialize($workload));
        $gearman->addTask("findWithLock", serialize($workload));
 
        $gearman->runTasks();
 
        $this->assertTrue($this->maxRunTime >= 2);
    }
 
    public function gearmanTaskCompleted($task)
    {
        $this->maxRunTime = max($this->maxRunTime, $task->data());
    }
}
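The workload handling is left out of the listings above. As a minimal sketch of what it could look like, the following assumes the client sends a serialized array with the three keys the worker reads in its find() call. The function name decodeWorkload, the entity name and the lock mode value are made up for illustration; the real test suite may do this differently.

```php
<?php
/**
 * Decode the workload string sent by the client into the fixture array
 * the worker uses. Hypothetical helper, not taken from the Doctrine tests.
 */
function decodeWorkload($serialized)
{
    $fixture = unserialize($serialized);
    if (!is_array($fixture)) {
        throw new InvalidArgumentException('Workload is not a serialized array.');
    }

    // Keys taken from the find() call in the worker.
    foreach (array('entityName', 'entityId', 'lockMode') as $key) {
        if (!array_key_exists($key, $fixture)) {
            throw new InvalidArgumentException("Workload is missing '$key'.");
        }
    }

    return $fixture;
}

// Client side: the string that would be passed to GearmanClient::addTask().
$workload = serialize(array(
    'entityName' => 'Article', // made-up entity name
    'entityId'   => 1,
    'lockMode'   => 4,         // placeholder, use the lock mode you want to test
));

// Worker side: inside findWithLock() this would be decodeWorkload($job->workload()).
$fixture = decodeWorkload($workload);
```

Failing early on a malformed workload keeps a broken test setup from showing up as a confusing timing failure later on.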

Now, if both workers are waiting for tasks, we can run this test and get a green bar for working lock support.

Doctrine 2 Beta 1 released

Today we are happy to announce that the first beta of Doctrine 2 has been released and we fixed 165 issues kindly reported by several early adopters.

You can grab the code from the Github project:

http://github.com/doctrine/doctrine2/tree/2.0.0-BETA1
git clone git://github.com/doctrine/doctrine2.git

Or download it from our website:

http://www.doctrine-project.org/download#2_0

It is our belief that Doctrine 2 brings PHP ORMs to a new level. We are leaving behind the Active Record pattern because we think it hurts testability and project maintainability and is not a suitable abstraction (80/20) for models that exceed the complexity of a blog or otherwise simple web application. Instead we implemented a pure Data Mapper approach with the help of the new Reflection functionality of PHP 5.3, so your Domain Objects neither have to extend a base class nor implement an interface.
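To illustrate the idea, here is a heavily simplified sketch of how a mapper can fill a plain object through Reflection. It is not Doctrine's actual hydrator; the Article class and the hydrate() function are made up for this example. Note that newInstanceWithoutConstructor() needs PHP 5.4 or later, so a PHP 5.3 library has to create the instance in a different way.

```php
<?php
// A plain domain object: no base class, no interface, and a constructor
// with a required argument. Class and field names are made up.
class Article
{
    private $id;
    private $title;

    public function __construct($title)
    {
        $this->title = $title;
    }

    public function getId()    { return $this->id; }
    public function getTitle() { return $this->title; }
}

/**
 * Build an object from a database row using only Reflection.
 * Simplified sketch, not Doctrine's actual hydrator.
 */
function hydrate($className, array $row)
{
    $class  = new ReflectionClass($className);
    // Creates the instance without calling the constructor (PHP 5.4+).
    $object = $class->newInstanceWithoutConstructor();

    foreach ($row as $field => $value) {
        $property = $class->getProperty($field);
        if (PHP_VERSION_ID < 80100) {
            // Needed before PHP 8.1 to write private properties.
            $property->setAccessible(true);
        }
        $property->setValue($object, $value);
    }

    return $object;
}

$article = hydrate('Article', array('id' => 42, 'title' => 'Hello'));
echo $article->getId(), ' ', $article->getTitle(), "\n";
```

The domain class knows nothing about persistence here: all access to its private state happens from the outside, which is what makes the missing base class possible.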

We also dropped most of the magical features of Doctrine 1 in favour of a simple and standardized API that is loosely based on the Java Persistence API, a technical standard for Object Relational Mappers. However, we try not to blindly follow the "Program PHP like Java" approach and deviated from JPA where applicable to make the concepts fit better into the PHP environment, such as alternatively hydrating all results into nested array structures for very high read performance.

The cornerstone of Doctrine 2 is the query language DQL. It allows you to execute queries on the object level defined by your metadata in a similar fashion to SQL. You can even use joins, subselects, aggregates and group clauses in DQL, eliminating the need to circumvent the ORM for more advanced SQL features. Nevertheless it is also possible to write plain SQL and let Doctrine 2 hydrate the results into an object graph.

In the last three months since alpha 4 we have made considerable changes and integrated lots of feedback from our users. The most notable changes are:

  • Allowing Constructors of your Domain Objects to have non-optional parameters.
  • Allowing you to define a natural ordering of to-many collections that is automatically enforced through an SQL ORDER BY clause when they are retrieved from the database.
  • Shipping the Symfony Console Component to replace our own Console Implementation
  • New DQL syntax to load objects partially, omitting potentially expensive fields from retrieval for the current request.
  • Changes to how bi-directional associations have to be defined in the mapping files.
  • Several changes to the Events API inside the ORM, to make sure many possible extension scenarios work smoothly.
  • Enhancements to our Console Tools
  • Surpassed 1000 unit tests, each running against the Postgres, MySQL, SQLite and Oci8 drivers.
  • Moved from SVN to Git: http://github.com/doctrine/doctrine2

We also made several painful backwards-incompatible changes that seemed necessary to clean up and optimize the API or to allow the ORM to be even faster than before. The beta phase beginning today will not contain any larger BC breaks anymore, opening up this release for a broader testing audience.

For the next iteration several enhancements are planned:

  • Support for PDO IBM, IBM DB2, SqlSrv, PDO SqlSrv and MsSql drivers
  • Pessimistic Lock Support (FOR UPDATE and SHARED)
  • Support for Custom Hydration Modes
  • Support for Custom Persister Implementations
  • Support for handling very large collections of related objects without needing an in-memory representation of the collection
  • Separation of Doctrine\Common, Doctrine\DBAL and Doctrine\ORM into three different projects
  • Extend the documentation even further, adding quickstart tutorials, cookbook recipes and enhancing the existing chapters.

We also plan to support several extensions for the 2.0 release such as:

  • Migrations
  • PHPUnit Database Testing Integration
  • NestedSet Support
  • Symfony 2 Support
  • Zend Framework 2 Support

Please try out the new Beta release if you find the time and leave your feedback in our Issue Tracker or on the Mailing-List, or come discuss Doctrine 2 with us on Freenode #doctrine-dev.

New Netbeans PHP CodeSniffer Plugin Version

This morning I took the time to merge several changes by Manuel Piechler into my Github branch of the CodeSniffer Plugin and released a new NBM file for you to download. Here are the changes done by Manuel:

  • Ability to specify the path to the CodeSniffer binary
  • Automatic Detection of all the installed Coding Standards
  • Fixed Automatic Binary Detection on Windows

The only open TODO for this project is the possibility to specify different standards on a per-project basis. Currently you can only choose a global standard.

Resources for a PHP and Hudson CI Integration

Yesterday I finally had the time to set up my first continuous integration environment. Possible solutions for CI are phpUnderControl, Hudson or Arbit. phpUnderControl is the most widespread solution, but from what I have heard it is complex to set up and maintain and supposedly a hack, and Arbit is still in an early alpha, so I decided to give Hudson a shot. Another reason for this decision: I heard it has a simple plugin architecture and is easy to install and use. Additionally, Hudson is easily integrated into Netbeans and Redmine, and I use both tools regularly in development.

My motivation to dive into CI is easily explained. I just never felt it was necessary to add a continuous integration environment to my projects, since I had one or two simple bash scripts that did the job. In general this is rather annoying, because they mostly only run PHPUnit and have to be run by a cronjob or manually, without any real notification process. Additionally you have no way to navigate the test results and code coverage, and no history of the last builds. For projects like Doctrine 2 we have the additional requirement to support 4 different database platforms, i.e. 4 different PHPUnit configurations. Currently I solve that with a Bash script that iterates over the configuration file names and invokes PHPUnit.
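The Bash script itself is not shown here. As a rough idea of what such a loop does, here is the same thing sketched in PHP instead of Bash. The function name and the configuration file names are placeholders, not the real Doctrine 2 files.

```php
<?php
/**
 * Build one PHPUnit command line per configuration file.
 * File names used below are placeholders, not the real Doctrine 2 configs.
 */
function buildPhpunitCommands(array $configFiles)
{
    $commands = array();
    foreach ($configFiles as $file) {
        // -c tells PHPUnit which XML configuration file to read.
        $commands[] = 'phpunit -c ' . escapeshellarg($file);
    }

    return $commands;
}

$commands = buildPhpunitCommands(array(
    'mysql.phpunit.xml',
    'postgres.phpunit.xml',
    'sqlite.phpunit.xml',
    'oracle.phpunit.xml',
));

foreach ($commands as $command) {
    echo $command, "\n";
    // To actually run it: passthru($command, $exitCode);
}
```

A loop like this runs the suites, but it gives you neither notifications nor a build history, which is exactly the gap a CI server fills.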

There are already some awesome sources how to get Hudson and PHP working. I'll list them here:

All those guides are already awesome and I would recommend choosing one of them to install Hudson; I don't think I can add anything new to them. I have used Sebastian's Howto, however I also like the third one. David Luhman's guide adds lots of details that are important to get the different parts of a build process to work.

What all these tutorials do is use a bash command to execute the build process or specify an Ant build file. However, there is also a Phing build process plugin for Hudson that allows you to specify the build.xml targets to execute in the process. From the "Available Plugins" list you can choose the "Phing plugin".

After installation you have to configure the Phing install. The Phing Plugin Wiki Page shows how to do this. You have to go to "Manage Hudson" => "Configure System" and look for Phing. There you find the dialog to configure your phing installations.

When choosing a build script for your project you can now select "Phing" instead of Ant.

You can enter the targets to build, for example on my local Hudson instance I only execute "test" for Doctrine 2, since I am not interested in the building and deployment onto the PEAR channel at this development stage.

From inside Netbeans you can then start builds by pointing to the Hudson instance. See this tutorial by one of the Hudson + Netbeans Developers. You can then start all the builds from inside Netbeans and be notified of the success or failure.

Application Lifecycle Management and Deployment with PEAR and PHAR (revisited) *UPDATE*

Some weeks ago I posted an article on using PEAR and PHAR for application lifecycle management and deployment. Since then I have gotten some feedback through comments, and have also discussed the topic with colleagues. I have optimized the approach quite considerably and even made an open-source project out of parts of it, and I want to share all that is new with you. First of all, yes, the presented solution was somewhat complex, partly because it is still a proposed idea and definitely up for optimizations. However, I am still very convinced of my approach, something I should discuss in more detail.

The only other two languages I have ever written more than 50 lines of code in are Java and C#. In both languages you can explicitly import different dependencies like libraries and frameworks by adding them to your application. In Java, for example, you would add Spring as an MVC web layer, Hibernate as an ORM and several other things to your project and bundle them directly in your executable. This is very easy to configure and maintain in IDEs like Netbeans or Eclipse (C# offers the same by allowing you to attach the DLLs of libraries to your project). It also makes for a much more straightforward deployment.

In the PHP world this situation was quite different (up to PHP 5.3) for several reasons:

  • You could not package a library into a single distributable file.
  • The PEAR installer, the only tool for updating and managing the dependencies of your application, installs into a system/global directory by default. This means the dependencies your application uses are located in a completely different location than your application code.
  • You can't manage multiple versions of the same package with PEAR in this system directory, making it very hard to control servers with different applications.
  • The global directory with your application dependencies is most often not under version control, which makes deployment of applications with PEAR dependencies somewhat difficult.

There are some solutions to these problems:

  • Don't use PEAR, but put all dependencies in your version control system.
  • Don't use PEAR, and bundle dependencies to your code in the build/deployment process.
  • Use PEAR as described in the article, on a per-project basis.

The solutions that don't use PEAR suffer from the disadvantage that you need to keep track of all the library and framework dependencies yourself and upgrade them yourself. This might not look like a huge problem at first glance, but in my opinion many PHP applications and projects suffer from using either no framework/library at all or exactly one. There is no real cherry-picking going on in the PHP world. For example, I would really like to use Zend Framework for the general application layout, but still use Doctrine for the modelling and HTML Purifier for the output escaping. Certain tasks might then only be solvable with the help of eZ Components, all of which are then to some extent dependencies of my application. With PEARHUB and PEARFARM on the horizon (read Padraic on this topic) even more PHP code will be distributed using PEAR channels in the near future. My immutable DateTime code for example makes for a great little open source library that could be distributed via PEAR, as well as Yadif - a dependency injection container I am using extensively.

Question: Are you really going to manage all these dependencies correctly manually? Is everything up to date all the time, and upgradeable with ease?

The PEAR-driven solution begins to look very desirable in this light, however it has a considerable disadvantage: The PEAR installer itself works on a system-wide/user-centric basis, making it impossible to manage the dependencies of several applications using only one Linux user. My little Pearanha to the rescue: I have taken the PEAR installer code (which is distributed with all PHP installations across all systems) and put a new CLI tool on top of it. It's a very simple code generator that produces a re-configured PEAR installer script which only manages a single application in a given directory. This approach is also used by the symfony plugin mechanism, which internally uses the PEAR installer (did you know?).

Let's revisit the blog application example from my previous PEAR post. First install Pearanha from GitHub, make the "pearanha" file executable and put it in your PATH (a PEAR channel will follow soon).

Now we need to have an existing application directory somewhere, for example /var/www/blog and then we can put Pearanha on top of it with:

benny@desktop: pearanha generate-project

You then need to specify the project base dir and then the project style (for example Zend Framework or Manual), which prompts you for the directory that should be used as the vendor/library directory that PEAR will install all the code in. You will also be prompted for a binary/scripts directory which will then hold a new PHP file for you, the file my_pearanha.

Pro Argument: Switching to Pearanha can be done at any point in your application lifecycle. Just define an additional vendor directory for all the dependencies to go in and generate the application's PEAR installer and you are good to go.

The generated script is your new application specific PEAR installer and you can begin to install all the required dependencies of your application:

benny@desktop:~$ cd /var/www/blog
benny@desktop:/var/www/blog$ ./vendor/pear/my_pearanha channel-discover pear.zfcampus.org
benny@desktop:/var/www/blog$ ./vendor/pear/my_pearanha install zfcampus/zf-alpha
benny@desktop:/var/www/blog$ ./vendor/pear/my_pearanha channel-discover htmlpurifier.org
benny@desktop:/var/www/blog$ ./vendor/pear/my_pearanha install hp/HTMLPurifier
benny@desktop:/var/www/blog$ ./vendor/pear/my_pearanha channel-discover pear.phpdoctrine.org
benny@desktop:/var/www/blog$ ./vendor/pear/my_pearanha install pear.phpdoctrine.org/DoctrineORM-2.0.0
benny@desktop:/var/www/blog$ ./vendor/pear/my_pearanha channel-discover components.ez.no
benny@desktop:/var/www/blog$ ./vendor/pear/my_pearanha install ezc/ezcGraph

All this stuff is now located in /var/www/blog/vendor. Again you can use PEAR's complete upgrade, remove and install functionality for your application, now without the hassle of having to create a Linux user for each project you want to manage this way, which in my opinion is a considerable simplification. The complete application (including its dependencies) can then be put under version control and be readily packaged as a single executable PHAR file by your build process.
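For the application to find the libraries installed this way, the vendor directory has to be on the include path. A minimal sketch of how a bootstrap file could do that follows. The helper name is made up, and it assumes PEAR was configured to put the PHP files directly into /var/www/blog/vendor as in the example above.

```php
<?php
/**
 * Put a project-local vendor directory in front of an include path string.
 * Hypothetical helper; the directory layout is an assumption.
 */
function prependToIncludePath($directory, $includePath)
{
    $parts = explode(PATH_SEPARATOR, $includePath);
    if (in_array($directory, $parts, true)) {
        return $includePath; // already present, nothing to do
    }
    array_unshift($parts, $directory);

    return implode(PATH_SEPARATOR, $parts);
}

// In the application bootstrap:
set_include_path(prependToIncludePath('/var/www/blog/vendor', get_include_path()));
```

Putting the project directory first means the application-specific versions win over anything that is still installed in the global PEAR directory.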

As a side note, I did try Pyrus instead of PEAR for the same purpose, however most of the current PEAR channels don't validate against Pyrus's strict standards for the package.xml file. In the future this might change and a Pyrus-based application installer will then be integrated into Pearanha.

UPDATE: I renamed PHPiranha to Pearanha as it's more appropriate. Also, after apinstein's comment on "pear config-create" I rewrote the generate-project parts to use the config-create functionality internally, which allowed me to throw away half of the self-written code. Thanks!