A note on various updates to the Items Utility related to packets and school reports.
Field and Pilot Test Packets
For this year’s field and pilot testing the researchers are going to be using OMR software for both production and data collection. On the one hand this should mean less work for the tech group in helping prepare packets. On the other hand … so much for all that work on the PDF generation tool. Hopefully we’ll get some use out of it elsewhere.
At any rate, in order to import the actual items into the utility they need to be in an image format. EPS would be the preferred format since it would provide the best resolution on print-out. Linux has a tool for this, so I was easily able to update the item_print.php script to add EPS export capability. This is done by first creating a PDF in the usual manner (HTML to PDF using DOMPDF), then using the pdftops Linux tool to create the EPS.
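As a rough illustration of the second step, here is a minimal sketch of how the pdftops command line might be assembled. The function name build_eps_command() and the file paths are made up for this example and are not taken from item_print.php.

```php
<?php
// Hypothetical helper (not from item_print.php): builds the shell command
// that converts an already-generated PDF into an EPS file.
function build_eps_command(string $pdfPath, string $epsPath): string
{
    // -eps asks pdftops for Encapsulated PostScript instead of plain PostScript.
    // escapeshellarg() keeps odd characters in file names from breaking the shell.
    return 'pdftops -eps ' . escapeshellarg($pdfPath) . ' ' . escapeshellarg($epsPath);
}

// Example paths are placeholders.
echo build_eps_command('/tmp/item_42.pdf', '/tmp/item_42.eps'), PHP_EOL;
```

One thing to keep in mind: an EPS file holds a single image, so for a multi-page PDF pdftops needs -f and -l set to the same page number.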
Unfortunately, the OMR design software doesn’t like the exported EPS. I’m not sure where the problem lies, and because EPS is a rather complex format I decided to go in another direction. So now the script also provides an option to export to TIFF, which seems to work fine. The process is similar to the above, only instead of pdftops the script now uses Ghostscript (gs).
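For comparison, a Ghostscript call for the TIFF route could be put together along these lines. This is only a sketch: the helper name build_tiff_command(), the tiff24nc output device and the 300 dpi default are my assumptions for the example, not necessarily what the script uses.

```php
<?php
// Hypothetical helper (not from item_print.php): builds a Ghostscript
// command that rasterizes a PDF into a TIFF image.
function build_tiff_command(string $pdfPath, string $tiffPath, int $dpi = 300): string
{
    $parts = [
        'gs',
        '-dSAFER',            // restrict file access while interpreting the PDF
        '-dBATCH',            // exit when the input has been processed
        '-dNOPAUSE',          // do not wait for a key press between pages
        '-sDEVICE=tiff24nc',  // assumed device: 24-bit RGB, uncompressed TIFF
        '-r' . $dpi,          // assumed resolution in dots per inch
        '-sOutputFile=' . escapeshellarg($tiffPath),
        escapeshellarg($pdfPath),
    ];

    return implode(' ', $parts);
}

// Example paths are placeholders.
echo build_tiff_command('/tmp/item_42.pdf', '/tmp/item_42.tif'), PHP_EOL;
```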
I didn’t want to spend too much time on this, so there isn’t much in the way of error handling for the benefit of the front end. If an error is encountered during the conversion process the script will just display the command output.
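The "just display the command output" approach might look something like the sketch below. The function name run_conversion() and the returned array shape are invented for the example, and the echo command merely stands in for a real pdftops or gs call.

```php
<?php
// Hypothetical wrapper: runs a conversion command and reports whether it
// succeeded, along with everything the command printed.
function run_conversion(string $command): array
{
    $output = [];
    $status = 0;

    // 2>&1 folds stderr into stdout so error messages are captured as well.
    exec($command . ' 2>&1', $output, $status);

    return [
        'ok'     => $status === 0,           // exit status 0 means success
        'output' => implode("\n", $output),  // raw output, one line per entry
    ];
}

// Stand-in command; a real call would pass the pdftops or gs command line.
$result = run_conversion('echo converted');

if (!$result['ok']) {
    // Minimal error handling: show the raw command output to the user.
    echo '<pre>', htmlspecialchars($result['output']), '</pre>';
}
```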
School Reports Class and Student Counts
The school report format has been edited by staff. Most of the modifications were simple text updates or formatting changes. However, the staff wanted to add some numbers to the report which were not being calculated: number of students and number of classes involved in the testing of each topic.
This is a somewhat complex calculation since the data is spread out among various tables in the database and arrays in the script. To perform the calculation I decided to add one more value to the packet data array under $student_data[packet_id]['packet_metadata']. This new value is an array of topics covered by the packet. The means of determining this required quite a bit of looping through the arrays. Surprisingly, however, the speed of the script does not appear to be significantly affected.
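To make the idea concrete, here is a simplified sketch of attaching a topic list to each packet's metadata. Only $student_data[packet_id]['packet_metadata'] comes from the actual script; the 'topics' key, the $packet_items and $item_topics lookup arrays, and the function name are assumptions made up for this example.

```php
<?php
// Hypothetical sketch: add a list of covered topics to each packet's metadata.
// $packet_items maps packet id => list of item ids (assumed structure).
// $item_topics maps item id => topic name (assumed structure).
function add_packet_topics(array $student_data, array $packet_items, array $item_topics): array
{
    foreach ($student_data as $packet_id => $packet) {
        $topics = [];

        foreach ($packet_items[$packet_id] ?? [] as $item_id) {
            if (isset($item_topics[$item_id])) {
                // Using the topic as the array key removes duplicates.
                $topics[$item_topics[$item_id]] = true;
            }
        }

        // 'topics' is an assumed key name for the new value.
        $student_data[$packet_id]['packet_metadata']['topics'] = array_keys($topics);
    }

    return $student_data;
}

// Placeholder data for illustration only.
$student_data = [7 => ['packet_metadata' => ['grade' => 6]]];
$packet_items = [7 => [101, 102, 103]];
$item_topics  = [101 => 'Forces', 102 => 'Forces', 103 => 'Energy'];

$student_data = add_packet_topics($student_data, $packet_items, $item_topics);
print_r($student_data[7]['packet_metadata']['topics']);
```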
Further down in the script the actual calculation is performed. A couple of factors need to be taken into account when calculating these numbers: a packet can cover more than one topic; and a single class can be split among multiple packets. With these things in mind I use an array to store information about each grade/class combination and use that array to determine the actual numbers.
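A stripped-down sketch of that counting logic follows. The array layout ('grade', 'class', 'topics' and 'students' keys) and the function name count_by_topic() are assumptions for illustration; the real script's structures are more involved.

```php
<?php
// Hypothetical sketch: count distinct classes and students per topic.
// Assumes each packet carries grade, class and topics in its metadata,
// plus a list of student ids.
function count_by_topic(array $student_data): array
{
    $seen = [];

    foreach ($student_data as $packet) {
        $meta = $packet['packet_metadata'];
        // A class split across several packets yields the same key each time,
        // so it is only counted once per topic.
        $class_key = $meta['grade'] . '/' . $meta['class'];

        // A packet can cover several topics, so it contributes to each one.
        foreach ($meta['topics'] as $topic) {
            $seen[$topic]['classes'][$class_key] = true;
            foreach ($packet['students'] ?? [] as $student_id) {
                $seen[$topic]['students'][$student_id] = true;
            }
        }
    }

    $counts = [];
    foreach ($seen as $topic => $sets) {
        $counts[$topic] = [
            'classes'  => count($sets['classes']),
            'students' => count($sets['students'] ?? []),
        ];
    }

    return $counts;
}

// Placeholder data: class 6/A is split across two packets.
$student_data = [
    1 => ['packet_metadata' => ['grade' => 6, 'class' => 'A', 'topics' => ['Forces', 'Energy']], 'students' => [1, 2]],
    2 => ['packet_metadata' => ['grade' => 6, 'class' => 'A', 'topics' => ['Forces']], 'students' => [3]],
    3 => ['packet_metadata' => ['grade' => 7, 'class' => 'B', 'topics' => ['Energy']], 'students' => [4, 5]],
];

print_r(count_by_topic($student_data));
```

With this sample data, class 6/A counts once for Forces even though its students are spread over two packets.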
School Report Misconception Reference Duplication
As I was working on the report I noticed that one of the features I had developed wasn’t working as expected. The researchers wanted to have a list of references for the misconceptions at the bottom of the report. I added this feature in the previous revision of the report, but noticed that some of the references were showing up more than once in the reference list. The reason for this had to do with the way references were stored in the data array. Previously each misconception had its own list of references, so a reference associated with more than one misconception would be listed more than once. To rectify the situation I created a references data array that contains the actual references and then pointed each entry in a misconception’s reference list to the corresponding key in that array. When a misconception is present in the report it is noted, and the list of references is later compiled based on this information.
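The key-based approach can be sketched as follows. The array names, the 'references' key and the function compile_references() are made up for this example, and the reference strings are placeholders rather than real citations.

```php
<?php
// Hypothetical sketch: compile a duplicate-free reference list for the
// misconceptions that actually appear in a report.
function compile_references(array $present_ids, array $misconceptions, array $references): array
{
    $needed = [];

    foreach ($present_ids as $misconception_id) {
        foreach ($misconceptions[$misconception_id]['references'] ?? [] as $ref_key) {
            // Each reference key is recorded once, however many
            // misconceptions point at it.
            $needed[$ref_key] = true;
        }
    }

    // Keep only the references whose keys were collected above.
    return array_intersect_key($references, $needed);
}

// Placeholder data: the full reference text is stored once, by key.
$references = ['r1' => 'Reference A', 'r2' => 'Reference B', 'r3' => 'Reference C'];

// Each misconception stores reference keys instead of full reference text.
$misconceptions = [
    'm1' => ['references' => ['r1', 'r2']],
    'm2' => ['references' => ['r2']],
    'm3' => ['references' => ['r3']],
];

// Misconceptions m1 and m2 were noted as present in this report.
print_r(compile_references(['m1', 'm2'], $misconceptions, $references));
```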
I also had to work on the reference sorting. One of the problems with sorting these entries is that the references may contain HTML and high-bit UTF8 characters. I created a custom sort function called ref_sort() that will massage the strings so that they are more amenable to sorting. First HTML tags are stripped from the string, then leading spaces/line breaks are trimmed, and then any HTML entity encoding is decoded (for high-bit characters). Finally the strings are compared using the strcmp() PHP function. The only problem with this function is that it isn’t UTF8 safe and so doesn’t do a good job when the first character is high-bit. The sorting, however, is good enough.
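A minimal version of such a comparison function might look like this. It follows the steps described above, but the helper ref_sort_key() and the sample strings are my own additions for illustration, so treat it as a sketch rather than the script's actual code.

```php
<?php
// Hypothetical helper: normalize a reference string before comparison.
function ref_sort_key(string $ref): string
{
    $text = strip_tags($ref);  // 1. drop HTML tags
    $text = ltrim($text);      // 2. trim leading spaces and line breaks
    // 3. decode HTML entities such as &Eacute; into real UTF-8 characters
    return html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// Comparison callback for usort(): byte-wise comparison of the cleaned strings.
function ref_sort(string $a, string $b): int
{
    return strcmp(ref_sort_key($a), ref_sort_key($b));
}

// Placeholder entries, not real references.
$refs = ['<i>Zebra</i> study', "\n  Apple report", '&Eacute;tude'];
usort($refs, 'ref_sort');

print_r($refs);
```

The sample also shows the limitation: because strcmp() compares bytes, the entry beginning with "É" lands after "Zebra" instead of next to the other E entries.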