Subtly Bad Code

Alright, let’s have a little fun… I just added a new blog and went to include it on the main page, but my code failed citing the database throwing errors. It took me forever to find. I’m curious if others can find it.

I was further confused because the code worked fine until I added the new blog to the list of ones for it to use, and it was specifically built so that it wouldn’t matter how many blogs there were. It has a separate file that just lists blogs to include, and reads that file at runtime and builds a query to retrieve posts from all of them.

You need some background, first… All the posts are stored in a database, so each has its own table. I built this monster query, basically looking something like (Get most recent posts from blog 1) UNION (Get more recent posts from blog 2) UNION (…3…), and then tack an “ORDER BY…” onto the end. Credit for this idea goes to Andrew; I’d have never thought of it myself.

What the list includes is blog IDs in the database. They ranged from 2 to 9, skipping 8 (which isn’t used). After a bout of spam registrations, the numbers got run up, so when I included the new one, it was numbered 51.

The below code (in PHP) calls some custom-rolled functions, but I’ll just say up front that the error does not depend on understanding how they work. Similarly, the answer does not have to do with caching in any way, so don’t get too hung up on the amount of code devoted to working with the cache. (And finally, I’m building one huge variable called $query the whole time, and then return that variable… This isn’t a crucial thing to understand either, I just wanted to explain it since it’s somewhat of a bizarre practice. .= is the PHP variable concatenation method.)

// $count is the number of blogs to pull out
function genRPQuery($count) {
  // Retrieve it from Memcache
  $query = getCachedObject("bigquery-$count");
  // It'll return NULL if it doesn't exist, so we check for that...
  if($query) return $query;

    // Since we're here, we didn't return, and
    // thus didn't get it out of the cache

    // Next two lines read in the files. blogList()
    // returns a list of the blogs -- it's little more than a
    // file read with caching enabled.
    $blogs = explode(',', rtrim(blogList(),"n"));
    $fields = rtrim(cachedFile('./fields.inc',30), "n");

    foreach ($blogs as $i) {
      // We have a loop for each blog
      // For unfamiliar eyes, .= is PHP's means of variable concatenation
      // We're building a ridiculously-long query, each one a SELECT, encased in
      // parens, and we UNION them all together...
      $query .= "(SELECT $fields FROM wp_" . $i . "_posts WHERE post_status='publish' AND post_type='post' AND post_password='' ORDER BY post_date DESC LIMIT $count)n";
      // If we're not on the last one, insert a "UNION" in (see above)
      if($i<sizeof($blogs)) $query .= "UNIONn";
    }
    // Done, so now we order them, getting just the most recent $count ones
    $query .= "ORDER BY post_date DESC LIMIT $count;";

    // Insert it into the cache for 15 seconds for next time!
    cacheObject("bigquery-$count", $query, 15);
    return $query;
}

Remember, it worked fine when the list was blogs number “2,3,4,5,6,7,9” but the simple change to “2,3,4,5,6,7,9,51” causes it to blow up and try run a query with invalid syntax. This made no sense to me, since the code was built to not care about things like that. I eventually found it, and feel like an idiot.

I’ve posted a hint in the comments… It’s in the interest of fairness because I turned on some debugging and got the information I share. But it also really narrows your attention to a couple of lines, so I don’t want to include it in the main post.

6 thoughts on “Subtly Bad Code

  1. Hint – (avert your eyes if you don’t want it!)

    I eventually (temporarily) changed the code to just print the query.

    Upon examining it, the last two per-blog (SELECT) queries didn’t have a UNION between them, even though the code is supposed to insert one for all but the last one. Understandably, the resulting query made no sense, thus preventing MySQL from running it.

  2. Yup! The problem was that that second-to-last item was 7, with the last being 9, so it just happened to work perfectly with my code checking to see if it was less than the size of the array (7 or 8 or so?).

    I ended up keeping the foreach and just adding another variable that I incremented and checking that.

    I just love that it *had* worked.

  3. That is the trap of foreach loops. I love them but if I need to count I tend to use a for loop. Adding a seperate counter works but I worry about it adding bugs of its own at times. A lot depends on complexity of course.

Leave a Reply

Your email address will not be published. Required fields are marked *