AbleCommerce unable to process payment after initial rejection

If a payment for an anonymous order does not go through the first time for whatever reason (invalid card number, payment gateway timeout, etc.) subsequent attempts to submit a payment for that order will fail with the error:

Value cannot be null. Parameter name: request.Payment.Order.User

This is due to a bug in AbleCommerce that causes the user ID associated with the order to be removed. This is the correct action for anonymous orders that have been completed, but in this case order processing is not yet complete.

To work around the issue you have to dive into the database and add the anonymous user’s ID back to the order record. After this step has been done the next attempt to process a payment should proceed normally. This has to be done after each payment failure.

YUI caching breaks editing

From the AJAX settings screen (/admin/settings.php?section=ajax) you can modify various parameters regarding how Moodle loads the YUI libraries. One of the options is to cache the JavaScript output. This has obvious performance benefits and should normally be enabled. However, during a recent update it appears that caching has broken the ability to edit. Whenever an editing page loads the browser throws the following error:

TypeError: M.core_formchangechecker is undefined

The form actually loads but because of the JavaScript error the rendering of the forum stops before it is complete, leaving the user unable to edit a number of options for the component being edited.

To work around the issue I have disabled caching. This isn’t ideal, but until a solution is found it is the most workable option.

PHP’s built-in validation/sanitation … who knew?

So apparently PHP has built-in validation/sanitation? How come nobody ever told me?! And why are so many people writing email validation regex? Find a sample email validation in the PHP docs.

That being said, the built-in validation isn’t perfect. Run the outlier cases cited in Phil Haack’s email RFC article to see the few instances where PHP fails. Are you likely to run into these outliers in the wild? No, but it’d still be nice to have completely accurate validation. We may not produce perfect code, but we should at least strive for that.

You can test PHP’s built-in validation on functions-online. Compare how it does against Cal Henderson’s RFC822 Email Address Parser.

Optimizing database queries in CakePHP

We’ve been using CakePHP for a while now. I’ve found it to be very helpful for quick prototyping and development of core site functionality. This is nice because it allows more time for tweaking a site’s user interface, functionality, and performance. That last one, performance, can occasionally be a challenge. One of the means of optimizing a site’s performance is through the crafting of database queries that take into account indexes and conditions. But because CakePHP is building the SQL we have to keep in mind how CakePHP does this as we construct our query parameters.

Let’s review one of those situations. Say we have the following (simplified) models:

class Item extends AppModel {
  var $hasMany = array('Answer');
  var $belongsTo = array('Packet');
}

class Student extends AppModel {
  var $hasMany = array('Answer');
} 

class Answer extends AppModel {
  var $belongsTo = array('Student','Item','Packet');
}

class Packet extends AppModel {
  var $hasMany = array('Answer','Student');
  var $hasAndBelongsToMany = array('Item');
}

Let me explain the model relationships. We have test items that are assigned to packets. Packets are sent out for testing to a group of student. So each answer is unique based on the packet, item, and student. The (somewhat circular) relationships between these models provides for easy access to specific data. For example, let’s say we want to know all the answers from a particular state for a couple of our items. Here’s an example query that present some optimization issues:

$queryOpts = array(
  'fields' => array('Item.*');
  'conditions' => array('Item.id' => array(1995, 1726, 1971, 1707, 1972));
  'contain' => array(
    'Answer' => array(
      'fields' => array('Answer.*');
      'conditions' => array(
        'Answer.packet_id IN (SELECT id FROM packets WHERE state = "CA")'
      )
    )
  );
);
$items = $this->Item->find('all',$queryOpts);

You’ll note that I’m not using CakePHP syntax for the Answer model conditional. We had been using this conditional in our web application prior to moving to CakePHP. Since there is not, so far as I can tell, an easy way to work with SQL subqueries we decided to start by forcing a query similar to the original.

Running the above creates the following SQL statement when querying the answers table:

SELECT `Answer`.* FROM `answers` AS `Answer` WHERE `Answer`.`packet_id` IN (SELECT id FROM packets WHERE packet_type = "F") AND `Answer`.`item_id` IN (1995, 1726, 1971, 1707, 1972)

We have a fairly large data set so the above query takes ~32 seconds to run on my system. But that query is not optimized; reordering the conditions in the WHERE clause produces the following query:

SELECT `Answer`.* FROM `answers` AS `Answer` WHERE `Answer`.`item_id` IN (1995, 1726, 1971, 1707, 1972) AND `Answer`.`packet_id` IN (SELECT id FROM packets WHERE packet_type = "F")

The optimized query performs significantly better, <1 second.

Since for this example we know which items we want we can update the query to manually force the desired ordering:

$queryOpts = array(
  'fields' => array('Item.*');
  'conditions' => array('Item.id' => array(1995, 1726, 1971, 1707, 1972));
  'contain' => array(
    'Answer' => array(
      'fields' => array('Answer.*');
      'conditions' => array(
        'Item.id' => array(1995, 1726, 1971, 1707, 1972)
        'Answer.packet_id IN (SELECT id FROM packets WHERE state = "CA")'
      )
    )
  );
);
$items = $this->Item->find('all',$queryOpts);

But what if we don’t know what items we want. Say we’re pulling items (and answers) as part of another containable query. We have no way to specify the items in the Answer model conditionals. But we can improve the query so that we’re not using a subquery each time (i.e. we’re doing it the “CakePHP” way):

$fields = array('Packet.id');
$conditions = array('Packet.packet_type' => 'F');
$packet_list = $this->Item->Packet->find('list',array('fields'=>$fields,'conditions'=>$conditions));

$queryOpts = array(
  'fields' => array('Item.*');
  'conditions' => array('Item.id' => array(1995, 1726, 1971, 1707, 1972));
  'contain' => array(
    'Answer' => array(
      'fields' => array('Answer.*');
      'conditions' => array(
        'Answer.packet_id' => $packet_list
      )
    )
  );
);
$items = $this->Item->find('all',$queryOpts);

Which produces the following two queries:

SELECT `Packet`.`id` FROM `packets` AS `Packet` WHERE `Packet`.`packet_type` = 'F';
SELECT `Answer`.*, `Answer`.`item_id` FROM `answers` AS `Answer` WHERE `Answer`.`packet_id` IN (1278, 1277, 1276, 1274,...,1726, 1971, 1972, 1995);

Again we’re seeing the kind of performance we want, <1 second for both queries. As good as when we knew all our conditions in advance. The main drawback to doing it this way is the size of the query string sent to your database. If your first query is pulling a large number of results then your second query will be quite large. And most databases have limits on how large a query can be.

This isn’t the only way to get to the end point either. As with any programming task there are a variety of ways to arrive at the desired result. For example, you could break up the query into multiple queries that build off of the previous query. Sticking with our items/answers query you could first get the items in one query. Then foreach through the results and run a query to get the answers, integrating the results manually. It’s more work, and likely would decrease performance since you’re increasing the number of database queries, but if you can’t seem to optimize your all-in-one queries it’s worthwhile to try other options to see if you can get better performance.

Aside: performance problems due to poorly optimized queries can be somewhat mitigated by query caching. Even so, you’re better of with an optimized query to begin with since it saves you from a performance hit every time the relevant cached result is flushed.

ASP goes on a diet

I recently modified our custom mass mailer script to accept attachments. After the modifications the new functionality worked nicely, except when it didn’t. Sometimes the script would error out when it accessed the Request object. When this happened I would receive the following message:

Request object error ‘ASP 0104 : 80004005’ Operation not Allowed

Though the error messages produced by ASP tend to be a bit cryptic, Microsoft sometimes does a decent job of documenting them on their web site. A quick search revealed the problem: IIS 5.x and 6 have a default request limit of 200K. That’s sufficient in most situations where you would attach something to an email, but once you start attaching multiple documents there’s a chance you’ll need a higher limit. The fix is easy enough, but does require editing the IIS metabase. Setting the AspMaxRequestEntityAllowed property to an appropriate limit (in bytes, up to 1GB max) addresses the issue. I like to use metaedit to edit the metabase, but you can also use the command-line metabase editing script:

cscript adsutil.vbs set w3svc/ASPMaxRequestEntityAllowed 20971520

20971520 is the limit of the request size in bytes (20M in this case).

References:

WP redirects confuse IE

Some users were having problems on a community site that was implemented using WordPress+BuddyPress. After some testing the issue involved users visiting the site with Internet Explorer (IE). These users were receiving error pages instead of site content. Other web browsers did not have similar issues.

The cause of the problem turns out to be a confluence of issues:

  • The site is hosted on a Windows+IIS server, a rare platform for WordPress (and even more so for BuddyPress) and one that probably doesn’t receive full attention during quality assurance testing.
  • How WordPress performs redirects on IIS is a bit quirky. BuddyPress issues a lot of redirects so this redirect quirk comes into play quite often. The issue is that when WordPress needs to perform a redirect on IIS it returns a “refresh” header pointing to the new page rather than a “location” header.
  • IE’s attempt to make the Internet more friendly; specifically IE’s use of “friendly error pages.” These friendly error pages replace the content delivered by the server (if that content falls below a certain size in KB).

Normally none of these issue are a problem by themselves and a web browser (including IE) will load the page indicated by the redirect. However, all three of the above issues taken together result in a situation where IE never sees the header refresh and so doesn’t redirect the user to the correct location.

The fix is fairly simple: change the headers that WordPress sends to include the standard “location” header. To do this you modify wordpress/wp-includes/pluggable.php@wp_redirect() so that it reads as follows (line 14 is new):

function wp_redirect($location, $status = 302) {
	global $is_IIS;

	$location = apply_filters('wp_redirect', $location, $status);
	$status = apply_filters('wp_redirect_status', $status, $location);

	if ( !$location ) // allows the wp_redirect filter to cancel a redirect
		return false;

	$location = wp_sanitize_redirect($location);

	if ( $is_IIS ) {
		header("Refresh: 0;url=$location");
		header("Location: $location", true, $status);
	} else {
		if ( php_sapi_name() != 'cgi-fcgi' )
			status_header($status); // This causes problems on IIS and some FastCGI setups
		header("Location: $location", true, $status);
	}
}

Integrating calculated fields and model data in CakePHP

(This is mostly a summary of Dealing with calculated fields in CakePHP’s find().)

One of the great things about CakePHP is that if it doesn’t have some core functionality you want/need there are easy ways to add it. More and more I’m taking advantage of this ability. This all comes about because I wanted to have calculated fields available inline with the model data. By calculated fields I mean results that are not data columns (e.g. SELECT *, CURDATE() AS current_date FROM users … yes, that’s a fairly contrived example).

By default Cake places results from calculated fields outside the model data, like this:

Array
(
    [0] => Array
        (
            [User] => Array
                (
                    [id] => 1
                    [username] => aaas
                )

            [0] => Array
                (
                    [current_date] => '2011-05-13'
                )

        )
)

What we want is to place the “current_date” calculated field inside the User model, so it’s more naturally accessed with $users[0]['User']['current_date'] instead of $users[0]['User'][0]['current_date']. Easy enough to do through a model’s afterFind() callback method (to make it widely available place the function in app_model.php).

function afterFind($results, $primary=false) {
	if($primary == true) {
		if(Set::check($results, '0.0') && Set::check($results, "0." . $this->alias)) {
			$fields = array_keys( $results[0][0] );
			foreach($results as $key=>$value) {
				foreach( $fields as $fieldName ) {
					$results[$key][$this->alias][$fieldName] = $value[0][$fieldName];
				}
				unset($results[$key][0]);
			}
		}
	}
	return $results;
}

And yet, even as I add such enhanced functionality to my web apps I’m finding limits. The above logic is complicated a little because you don’t know what type of results you’re getting. The results you get from calling, for example, find('all') versus find('count') are not the same. But Cake doesn’t give any hints to the afterFind() callback, and as a result additional logic is included to try and guess at our data structure. The above adds a quick but incomplete hack by a) checking that the results are for the primary model queried (e.g. not from contained model), b) checking for the presence of nested numerical keys, and c) checking that there is model data to integrate with.

The take away is that while the code produces the desired data structure, in its current form it does so only for specific results.

Align two columns in Excel

I recently had two sets of data, one a full list of records and the other a list of identifiers for the records that needed to be extracted. Extracting the relevant records from the full list would be a fairly easy programming task, but the data was in Excel and I wanted to try and accomplish the task in that environment. Thankfully this problem has already been solved and the answer posted to the web.

If I have one column of identifiers for the records that need to be extracted (column A, 100 records) and one column with the full list of identifiers (column B, 1000 records) the following formula will indicate which identifiers from column B match an identifier from column A:

=IF(ISNA(MATCH(B1,$A$1:$A$100,0)),"","X")

Just place the formula in its own column and copy down for the length of the full data set. The columns do not need to be in any particular order and you can create a separate worksheet that contains your filter list, keeping your data and your filter separate. You can then AutoFilter on your search column to see the results (and copy/paste to another worksheet if necessary).

What file was that in?

One of the more annoying things about coding is finding the right file. Or, even worse, finding a file you didn’t know you needed to look at. Especially when the number of files you’re parsing is in the thousands. If you’re on Windows the built-in search can help, but you never know if all the files in the target directory have been indexed.

Luckily, one of the nice things about coding is that you’re often dealing with plain text files. And you are typically searching for a particular string. As they say, there’s an app for that. Both *nix and Windows are capable of searching through the contents an entire hierarchy of files using command-line programs. Each OS has a variety of commands that can do the job, but I’ll highlight the two I use most often here.

In Windows you would use findstr.exe, and it’s as simple as running the following from the containing directory:

findstr.exe /MIS "searchtext" filetype

On *nix systems the grep command is your friend:

grep -lr "searchtext" filetype

Of course, a quick google search will get you all the help you need in refining your search.

Update 2012-08-22: What file wasn’t that in?

I recently had a need to find files that did not contain a specific string. A bit more difficult of a prospect. I suspect there’s probably a way to do this on the command line, but so far I’ve only come up with a work-around (in Windows). Use findstr.exe in a loop to show how many times each file contains the string, then filter for files with zero instances:

for /R %f in (filetypes) do find /C /I "searchtext" %f >> ..find.txt

(filetypes in this instance can be a space-separated list of possible file types.)