Archive for November, 2009

Using MySQL INNER JOIN in CakePHP Pagination

First of all, I've got to hand it to Matt: he really did a BBBIIIGGG favor to the CakePHP community by publishing his guide to advanced CakePHP techniques. This guide / book will give great insight into the framework to any seasoned programmer who is picking up Cake for the first time. I'm going to be floating ideas to complement the advanced techniques and, all in all, promote the good programming practices I'm aware of.

Now, I've seen a lot of shitty code when it comes to CakePHP. Yes, I've even seen mysql_query() calls in views! ... yes, I've lived that day and kept my sanity intact. But I can't blame the programmers either, because obviously they were newbies and were under a lot of pressure from their blood-sucking employers to "get things going". Anyhoo ... this might be the subject of another post, BUT I really had to get it out of my system ... *phew* ... feel so light now :)

So, MySQL INNER JOINs ... when should you use them? Simple answer: when you want to filter rows out of your result set. In my experience it's quicker than taking a LEFT JOIN and filtering the results in the "WHERE" clause. I don't have any metrics to support this conclusion right now, but I speak in the light of many tests I've conducted on large datasets. A simpler way to put it: with "LEFT JOIN" the result set keeps every row from the left table, so the "WHERE" clause has a lot more rows to filter out. CakePHP's choice of LEFT JOIN is sound, however, because its intention is not to filter out records; it's merely to include whichever associated records match the conditions you supply. That's why its "Containable" behavior is so cool (special thanks to Felix for maturing it and making it part of Cake's core).
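
To make the difference concrete, here is a small self-contained sketch. It uses PDO with an in-memory SQLite database instead of MySQL and CakePHP (an assumption, purely so it runs anywhere), and the tables and rows are made up for illustration. It only shows which rows each kind of join produces; it does not measure speed.

```php
<?php
// Illustration only: in-memory SQLite via PDO, hypothetical schema and data.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$db->exec('CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)');
$db->exec('CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT)');
$db->exec('CREATE TABLE posts_tags (post_id INTEGER, tag_id INTEGER)');
$db->exec("INSERT INTO posts VALUES (1, 'Joins'), (2, 'Cooking'), (3, 'Untagged')");
$db->exec("INSERT INTO tags VALUES (1, 'PHP'), (2, 'Food')");
$db->exec('INSERT INTO posts_tags VALUES (1, 1), (2, 2)');

// LEFT JOIN keeps every post, even the one without any tag.
$left = $db->query(
    'SELECT p.title, t.name FROM posts p
     LEFT JOIN posts_tags pt ON pt.post_id = p.id
     LEFT JOIN tags t ON t.id = pt.tag_id'
)->fetchAll(PDO::FETCH_ASSOC);

// LEFT JOIN, then throw away the unwanted rows in WHERE.
$leftFiltered = $db->query(
    "SELECT p.title FROM posts p
     LEFT JOIN posts_tags pt ON pt.post_id = p.id
     LEFT JOIN tags t ON t.id = pt.tag_id
     WHERE t.name = 'PHP'"
)->fetchAll(PDO::FETCH_COLUMN);

// INNER JOIN with the filter in ON: only matching rows are ever produced.
$inner = $db->query(
    "SELECT p.title FROM posts p
     INNER JOIN posts_tags pt ON pt.post_id = p.id
     INNER JOIN tags t ON t.id = pt.tag_id AND t.name = 'PHP'"
)->fetchAll(PDO::FETCH_COLUMN);

echo count($left), ' rows from the plain LEFT JOIN, ', count($inner), " from the INNER JOIN\n";
```

Both filtered queries return the same post; the difference is how many rows exist before the filter is applied.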

The more you familiarize yourself with Cake's datasource classes the better. The most excellent example was published on the Bakery by Nate, on how to use JOINs in CakePHP. I think this should be made part of the documentation too. It could actually save you from overriding the Controller::paginate() function. When you come to know the flexibility offered by the datasource class you love it even more :P - a simple example:

  class PostsController extends AppController {

      public function by_tag($tag) {
          /**
           * This will fetch Posts tagged $tag (say, 'PHP')
           */
          $this->paginate['Post'] = array(
              'limit' => 10,
              'contain' => '',
              'conditions' => array(
                  'Post.published' => 1
              ),
              'fields' => array('Post.*', 'Tag.*'),
              // 'joins' is a list, so every join gets its own array
              'joins' => array(
                  array(
                      'table' => 'posts_tags',
                      'type' => 'INNER',
                      'alias' => 'PostTag',
                      'conditions' => array(
                          'Post.id = PostTag.post_id'
                      )
                  ),
                  array(
                      'table' => 'tags',
                      'alias' => 'Tag',
                      'type' => 'INNER',
                      'conditions' => array(
                          'PostTag.tag_id = Tag.id',
                          // key => value, so Cake escapes $tag instead of
                          // it being interpolated straight into the SQL
                          'Tag.name' => $tag
                      )
                  )
              )
          );

          $data = $this->paginate('Post');
          $this->set(compact('data'));
      }
  }

This is just a simple example of what you can achieve by adding simple joins to your Model::find() options and, of course, to your pagination settings. I've stretched it a bit further: I've actually used sub-queries and sub-joins, really complex stuff, when paginating some complex data sets. Thanks to 'joins' I never had to override the Controller::paginate() method. Just for the sake of example, let's say I want to retrieve posts tagged 'PHP' or 'CakePHP' written by users who have a rating above 3. Of course this can be done in other ways; here is one that uses a sub-query join in CakePHP elegantly:

  // work this out wherever you want it - in your model or controller
  // but if I were you, I'd put this in my model

  $join = "SELECT posts.id AS POST_ID FROM posts JOIN authors ON (posts.author_id = authors.id)";
  $join = $join.' '."JOIN users_ratings ON (authors.user_id = users_ratings.user_id AND users_ratings.rating > 3)";
  $join = $join.' '."WHERE 1=1";

  // in your controller
  $this->paginate['Post'] = array(
      'limit' => 10,
      'contain' => '',
      'conditions' => array(
          'Post.published' => 1
      ),
      'fields' => array('Post.*', 'Tag.*'),
      // 'joins' is a list, so every join gets its own array
      'joins' => array(
          array(
              'table' => 'posts_tags',
              'type' => 'INNER',
              'alias' => 'PostTag',
              'conditions' => array(
                  'Post.id = PostTag.post_id'
              )
          ),
          array(
              'table' => 'tags',
              'alias' => 'Tag',
              'type' => 'INNER',
              'conditions' => array(
                  "PostTag.tag_id = Tag.id AND Tag.name IN('PHP', 'CakePHP')"
              )
          ),
          // the sub-query, wrapped in parentheses, acts as a derived table
          array(
              'table' => '('.$join.')',
              'alias' => 'FILTERED_RESULTS',
              'type' => 'INNER',
              'conditions' => array(
                  "Post.id = FILTERED_RESULTS.POST_ID"
              )
          )
      )
  );

And this will elegantly filter out the posts you need :P
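
Since both examples repeat the same join shape, the arrays can be built by a few small helpers. This is only a sketch: innerJoin(), subqueryJoin() and tagJoins() are hypothetical names, not CakePHP functions, and the code is plain PHP that produces the array you would assign to $this->paginate['Post']. It also assumes that join conditions are parsed like ordinary Cake conditions, so that an array value turns into IN (...).

```php
<?php
// Hypothetical helpers (not part of CakePHP) that build 'joins' entries.
function innerJoin($table, $alias, array $conditions) {
    return array(
        'table' => $table,
        'alias' => $alias,
        'type' => 'INNER',
        'conditions' => $conditions
    );
}

function subqueryJoin($sql, $alias, array $conditions) {
    // Parentheses make the datasource treat the SQL as a derived table.
    return innerJoin('(' . $sql . ')', $alias, $conditions);
}

function tagJoins(array $tags) {
    return array(
        innerJoin('posts_tags', 'PostTag', array('Post.id = PostTag.post_id')),
        innerJoin('tags', 'Tag', array(
            'PostTag.tag_id = Tag.id',
            // Assumption: an array value is rendered as IN (...) and escaped.
            'Tag.name' => $tags
        ))
    );
}

$ratedSql = 'SELECT posts.id AS POST_ID FROM posts'
    . ' JOIN authors ON (posts.author_id = authors.id)'
    . ' JOIN users_ratings ON (authors.user_id = users_ratings.user_id'
    . ' AND users_ratings.rating > 3)';

$joins = tagJoins(array('PHP', 'CakePHP'));
$joins[] = subqueryJoin($ratedSql, 'FILTERED_RESULTS', array(
    'Post.id = FILTERED_RESULTS.POST_ID'
));

$settings = array(
    'limit' => 10,
    'conditions' => array('Post.published' => 1),
    'joins' => $joins
);

// In a controller you would then do: $this->paginate['Post'] = $settings;
print_r($settings['joins'][2]);
```

Keeping the builders in the model, as suggested above, means the controller only has to say which filters it wants.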

Conclusion: you can really write any kind of query and devise a condition-based system that adds filters auto-magically (I will present such a system in another post). Remember, CakePHP is all about auto-magic! ... which is actually the culmination of "convention over configuration", so use it to the fullest!

CakePHP Archivable Behavior

Alright folks ... yeh I know I've been out of the picture really long and me blog is looking deserted for real now. Anyhow, I've got a bunch of posts in the pipeline. Thanks to Ahmed of SoccerLens for convincing me to start posting again :).

Enough chit chat ... so the cool thing I bring you here my friends is this Behavior I just baked up for a client. The Archivable Behavior. I love CakePHP's behavior architecture and had Mariano Iglesias's SoftDeletable behavior in mind before baking this baby.

What does it do?

It simply puts the record you want to delete in another table. I see no use in bloating my existing table by adding a "deleted" field, especially when the table would go through this process many times. Imagine how big your table would get when you simply soft delete a record and it just sits there, rarely used. I've seen this kind of methodology run into trouble when you are searching the table: MySQL has to go through a lot of unwanted, deleted records to extract the active ones, which slows down the search process, especially when you are using UUIDs.

Usage:

This example is for a model named, say, "MyPost".


public $actsAs = array('Archivable'=>array('table'=>'my_posts_archives'));

If you don't specify the table key, it will simply look for the table 'my_posts_archived'. The schema for the archives table is simple: it's the same as the table my_posts, but with an additional field, my_post_id. You're going to have to create that table yourself. If you're like me and use phpMyAdmin, it should be really simple: go into Table -> Operations -> Copy.

Once you're done setting up the table and attaching the Archivable behavior to your model, anytime you execute the statement $this->MyPost->del($id); it will simply move the record from the my_posts table to the my_posts_archives table.
You can also use $this->MyPost->unarchive($id); to do the reverse, i.e. remove the archived record from the archive table and move it back to the main table. As simple as that! :P
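
If you want to picture what "moving" a record means, here is a conceptual sketch in plain PHP. It is not the behavior's source code: it uses PDO with an in-memory SQLite database so it runs on its own, archiveRow() and unarchiveRow() are hypothetical names, the single title column is made up, and it assumes my_post_id stores the id the row had in my_posts.

```php
<?php
// Conceptual sketch only, NOT the Archivable behavior's actual code.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE my_posts (id INTEGER PRIMARY KEY, title TEXT)');
// Same columns as my_posts, plus my_post_id (assumed to hold the original id).
$db->exec('CREATE TABLE my_posts_archives (id INTEGER PRIMARY KEY, title TEXT, my_post_id INTEGER)');
$db->exec("INSERT INTO my_posts (id, title) VALUES (1, 'Hello'), (2, 'World')");

// Copy from one table to the other, then delete the source row, atomically.
function moveRow(PDO $db, $copySql, $deleteSql, $id) {
    $db->beginTransaction();
    try {
        $db->prepare($copySql)->execute(array($id));
        $db->prepare($deleteSql)->execute(array($id));
        $db->commit();
    } catch (Exception $e) {
        $db->rollBack();
        throw $e;
    }
}

function archiveRow(PDO $db, $id) {
    moveRow(
        $db,
        'INSERT INTO my_posts_archives (title, my_post_id) SELECT title, id FROM my_posts WHERE id = ?',
        'DELETE FROM my_posts WHERE id = ?',
        $id
    );
}

function unarchiveRow(PDO $db, $id) {
    moveRow(
        $db,
        'INSERT INTO my_posts (id, title) SELECT my_post_id, title FROM my_posts_archives WHERE my_post_id = ?',
        'DELETE FROM my_posts_archives WHERE my_post_id = ?',
        $id
    );
}

function countRows(PDO $db, $table) {
    return (int) $db->query('SELECT COUNT(*) FROM ' . $table)->fetchColumn();
}

archiveRow($db, 1);
$afterArchive = array(countRows($db, 'my_posts'), countRows($db, 'my_posts_archives'));

unarchiveRow($db, 1);
$afterUnarchive = array(countRows($db, 'my_posts'), countRows($db, 'my_posts_archives'));

$restoredTitle = $db->query('SELECT title FROM my_posts WHERE id = 1')->fetchColumn();
```

The transaction is there so a failure halfway through never leaves the record in both tables or in neither.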

Notes:

I've baked this on CakePHP 1.3.0.0 with PHP 5.2.4. It should run smoothly on CakePHP 1.2.x too, but if you are using PHP 5.3 you're gonna have to make some adjustments where objects are passed by reference in the code.
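
Which adjustments are needed isn't spelled out here. A likely candidate (an assumption, not something taken from the behavior's code) is the pair of reference idioms that PHP 5.3 deprecates: call-time pass-by-reference and assigning the result of new by reference. Since PHP 5 hands objects around as handles, dropping the & usually keeps the same behavior. The class and function below are hypothetical.

```php
<?php
// Hypothetical names, illustration only.
class Record {
    public $archived = false;
}

// The parameter is a plain object handle; no & is needed for the
// function to modify the caller's object.
function markArchived(Record $record) {
    $record->archived = true;
}

// PHP 5.3 deprecates "$record =& new Record;", so use a plain assignment.
$record = new Record();

// PHP 5.3 deprecates "markArchived(&$record);", so pass the variable as-is.
markArchived($record);

var_dump($record->archived); // bool(true)
```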

Download the Behavior

Have fun y'all ;)