sat, 03-mar-2012, 15:47
Recently read iBooks

Recently read iBooks

The Morning News Tournament of Books starts next week, and I’m about a third of the way through the only unread volume in this year’s contest, Téa Obreht’s The Tiger’s Wife. In preparation for my posts on the tournament, I wanted to generate a tournament bracket, filled with my choices. I could have fired up Inkscape or my favorite old drawing program, xfig, but drawing something that has such an obvious pattern built into it would be much easier using a programming language rather than moving a mouse pointer around over and over again.

I debated refreshing my Metapost skills, the tool I used to generate my best baseball scorecards, and even wrote a simple Pic (a language written in 1982 by Brian Kernighan, of K&R and awk fame) macro to do it. But I wanted something that would work on the web, and for figures on the Internet, SVG is really the best format (well, unless you’re stuck on Internet Explorer…). Metapost and Pic produce PostScript and PDF, and I wasn’t happy with the way the available PS to SVG image converters mangled the Pic PostScript.

I settled on Python (of course!) and the svgwrite library. Generating an SVG file programatically is pretty easy with svgwrite. You start the drawing with:

svg = svgwrite.Drawing(output_filename, size = ("700px", "650px"))

and then draw stuff onto the page with commands like those found in my draw_bracket function. The all-caps variables are global parameters set at the top of the code to control the size of various elements across the entire figure. I also created a Point class for handling x / y coordinates in the diagram. That’s what start is defined as below (and how I can access x and y via start.x and start.y.

def draw_bracket(svg, start, width, height, names):
    """ Draw a bracket from (start.x, start.y) right width, down height,
        to (start.x, start.y + height), placing the names above the
        horizonal lines. """

    lines = svg.add(svg.g(stroke_width = 2, stroke = "red"))
    lines.add(svg.line((start.x, start.y), (start.x + width, start.y)))
    lines.add(svg.line((start.x + width, start.y),
                       (start.x + width, start.y + height)))
    lines.add(svg.line((start.x + width, start.y + height),
                       (start.x, start.y + height)))
    texts = svg.add(svg.g(font_size = TEXT_SIZE))
                       (start.x + TEXT_RIGHT, start.y - TEXT_UP)))
                       (start.x + TEXT_RIGHT, start.y + height - TEXT_UP)))

With this function, all that’s left is to design the data structures that hold the data in each column of the diagram, and write the loops to draw the brackets. Here’s the first loop:

current_start = Point(MARGIN, MARGIN)
for bracket in first_brackets:
    draw_bracket(svg, current_start, INIT_BRACKET_WIDTH,
            FIRST_BRACKET_HEIGHT, bracket)
    current_start = current_start + \
                    Point(0, FIRST_BRACKET_HEIGHT + FIRST_BRACKET_SKIP)

first_brackets is a tuple of paired tuples, so each bracket above contains the two book titles that should appear on each leg of that bracket.

first_brackets = (
        ("Sense of an Ending", "The Devil All the Time"),
        ("Lightning Rods", "Salvage the Bones"),

The full code can be downloaded at The result:

2012 Tournament of Books

I’ll comment more on my picks after I’ve finished The Tiger’s Wife, and as the tournament starts next week. There were a lot of great books in the tournament, and I wouldn’t be disappointed if some of the later bracket winners in my diagram wound up winning. For me, the most interesting first round bracket is Wil Weaton’s tough choice between Ann Pachett’s State of Wonder and The Sisters Brothers by Patrick deWitt. I chose Sisters, but it was a tough choice, and even though it doesn’t make it past round one on my diagram, I’d be happy to see Wonder win it all.

One other note on the screenshot from my iPad at the top of the post. There are several excellent, non-Tournament books pictured. I highly recommend The Last Werewolf, by Glen Duncan, Neal Stephenson’s Reamde, 11/22/63 (Stephen King), and Eleanor Henderson’s Ten Thousand Saints. These four are easily better than the worst of this year’s Tournament (Hollinghurst, DeWitt and Zambreno).

wed, 22-feb-2012, 16:25
snowmachine trespass

Snowmachine trespass

It's great living where we do: in the middle of nowhere, and yet, only a few miles from town. I can ski to work in the winter on the multi-use commuter trail across the street from our driveway and ride my bike on the road in the summer.

If there is a down side to living where we do it's that it feels like the middle of nowhere to those who think it's their right to drive their gasoline-powered vehicles wherever they want. Several years ago I put up a pair of signs on one of our trails after a four wheeler damaged the vegetation around the trail. And miraculously, that seems to have worked.

Now we've got snowmachines (snowmobiles) riding down the road, which is illegal, then crossing onto the powerline, which is private property. Golden Valley Electric Association has a right of way easement, but it is still my property. In a different time and place it's the sort of thing that might be solved by shooting the offenders with a rock salt-loaded shotgun.

I took a more measured approach and stuck a pair of poles in the snow where they blew across the driveway in attempt to indicate that this isn't an acceptable place to drive their machines. Today they made a new track next to the poles, and knocked one of them over for good measure (the photo above).

And, keep in mind, there's a snowmachine-friendly trail across the street!

I know this is a fight I can't win, so I should just ignore it. And that's what I'll do after making this plea: if any of you ride snowmachines or know others that do, please encourage them to respect private property, including powerlines. Asshats like those riding roughshod all over my property give considerate riders a bad name.

tue, 07-feb-2012, 17:19
data logger

Today was the first day where I got some good data skiing to and from work using my data logger. There’s a photo of it in it’s protective box on the right. The Arduino and data logging shield (with the sensors soldered to it) is sitting on top of a battery pack holding six AA batteries. The accelerometer is the little square board that is sticking up on the left side of the logger, and you can see the SD card on the right side. The cord under the rubber bands leads to the external temperature sensor.

This morning it took about four minutes for the sensor to go from room temperature to outside temperature (-12°F), which means I need to pre-acclimate it before going out for a ski. A thermocouple would respond faster (much less mass), but they’re not as accurate because they have such a wide response range (-200°C to 1,350°C). A thermistor might be a good compromise, but I haven’t fiddled with those yet.

Here’s the temperature data from my ski home:

When I left work, the temperature at our house was 12°F, and I figured it would be warmer almost everywhere else, so I used “extra green” kick wax, which has a range of 12 to 21°F. I’ve highlighted this range on the plot with a transparent green box. In general, if you’ve chosen wax that’s too warm for the conditions, you won’t get much glide, and if the wax is rated too cold, you won’t have much kick. The plot shows that as I got near the end of the route and the temperature dropped below the lower range of the wax, I should have lost some glide, which is pretty much exactly what happened. Normally this isn’t a big issue on the Goldstream Valley Trail because it’s often very smooth, which means that a warmer wax is needed to get a grip, but this afternoon’s trail had seen a lot of snowmachine traffic, it wasn’t very smooth, and I didn’t get as much glide as earlier in the ski.

The other interesting thing is the dramatic dip marked “Goldstream Creek” on the plot. This is where the trail crosses the Creek on a small bridge designed for light recreational traffic (nothing bigger than a snowmachine or four-wheeler). It’s probably the lowest place in the trail. Our house is also on the Creek, so the two coldest spots on the trail are exactly where I’d expect them to be, right on the Creek.

sun, 05-feb-2012, 13:50

Over the past couple weeks I’ve been experimenting with a data logging shield from adafruit. My original idea was to build a unit I could take with my on my commute to work to see how the temperatures change over the route. I’m also interested in watching the temperatures inside a dog house when there’s a dog sleeping in there. During the cold snap, where temperatures got as low as -55°F, how warm could the dogs keep their houses (which are insulated)?

I added a three axis accelerometer (ADXL335) with the idea it could tell me when I was moving, but I don’t think I can afford to read (and log) the sensors that often when it’s running off batteries (6 AA cells) at cold temperature. Instead, it’ll tell me the position (relative to the ground) of the logger, and I’ll use that to indicate when I start whatever activity I want to measure. The data logger starts logging as soon as it gets power, and even though it has a clock, it may drift relative to GPS time, so I can time the start based on when the logger’s position changes.

Here’s the schematic:

Data logger circuit

I wired it on a breadboard first to confirm the circuit was correct and to get the program storing the data. The data logger shield has a 10 x 10 perfboard section, so once everything was working, I soldered the parts directly onto the board using the plan below. The black lines are above the board and the orange lines are the solder joins below the board.

Perfboard layout

I don’t know yet how long the battery will last, but I ran it last night in the arctic entryway and got the following data:

The orange dots are from the temperature sensor on the board, wrapped in bubble wrap and sitting inside a cardboard box. The cyan dots are from the waterproof sensor outside of the box. Our arctic entryway is heated with a ventilation fan that blows warm air from the house into the room, so the oscillation in the temperature shows the fan going on at the bottom of the sweep and then off at the top. The insulation in the box reduces the temperature fluctuations and traps the slight amount of heat produced by the electronics.

Time to watch the Super Bowl.

thu, 02-feb-2012, 16:56

In my last post I discussed the January 2012 cold snap, which was the fifth coldest January on record in Fairbanks. I wondered how much of this was due to the arbitrary choice of “month” as the aggregation level for the analysis. In other words, maybe January 2012 was really remarkable only because the cold happened to fall within the margins of that particular month?

So I ran a query against the GHCN daily database for the Fairbanks Airport station. This data set won’t allow me to answer the exact question I’m interested in because the data for Fairbanks only goes back to 1948 and because I don’t have the data for the last day in January 2012. But I think the analysis is still valid, even if it’s not perfect.

The following query calculates the average temperature in Fairbanks for every 31-day period possible, and ranks it according to a descending sort of the average temperature for those periods. The results show the start date, end date, the year and month that the majority of data appears in (more on that later), and the average temperature over the period.

One other note: the temperatures in this post are in degrees Celsius, which is why they’re different than those in my previous post (in °F).

SELECT date(dte - interval '15 days') AS start,
       date(dte + interval '15 days') AS end,
       to_char(dte, 'YYYY-MM') AS yyyymm,
       round(avg_31_days::numeric, 1),
       rank() OVER (ORDER BY avg_31_days) AS rank
    SELECT dte,
           avg(tavg_c) OVER (
              ORDER BY dte
           ) AS avg_31_days
    FROM (
        SELECT dte,
               (sum(CASE WHEN variable = 'TMIN'
                    THEN raw_value * 0.1
                    ELSE NULL END)
              + sum(CASE WHEN variable = 'TMAX'
                    THEN raw_value * 0.1
                    ELSE NULL END)) / 2.0 AS tavg_c
        FROM ghcnd_obs
        WHERE station_id = 'USW00026411'
        GROUP BY dte
    ) AS foo
) AS bar
ORDER BY rank;

The funky CASE WHEN ... END stuff in the innermost query is because of the structure of the data in the GHCN database. The exciting part is the window function in the first subquery that calculates the average temperature for a 31-day window surrounding every date in the database.

Some of the results:

start end yyyymm avg (°C) rank
1964-12-10 1965-01-09 1964-12 -36.9 1
1964-12-09 1965-01-08 1964-12 -36.9 2
1964-12-11 1965-01-10 1964-12 -36.9 2
1971-01-06 1971-02-05 1971-01 -36.8 4
1964-12-08 1965-01-07 1964-12 -36.7 5
1971-01-07 1971-02-06 1971-01 -36.6 6
1971-01-05 1971-02-04 1971-01 -36.5 7
1964-12-12 1965-01-11 1964-12 -36.4 8
1968-12-21 1969-01-20 1969-01 -36.3 9
1968-12-22 1969-01-21 1969-01 -36.3 10
2012-01-14 2012-02-13 2012-01 -33.9 42
2012-01-13 2012-02-12 2012-01 -33.9 43

There are some interesting things here. First, the January we just went through doesn’t show up until the 42nd coldest. You can also see that there was a very cold period from mid-December 1964 through mid-January 1964. This “even” appears five times in the first ten coldest periods. The January 1971 event occurs three times in the top ten.

To me, this means a couple things. First, we’re diluting the rankings because the same cold weather event shows up multiple times. If we use the yyyymm column to combine these, we can get a better sense of where January 2012 fits. Also, if an event shows up on here a bunch of times, that probably means that the event was longer than the 31-day window we’ve set at the outset. If you look at the minimum and maximum dates for the 1964 event, the real even lasted from December 8, 1964 through January 11, 1964 (35 days).

If we use this query as a subquery of one that groups on yyyymm, we’ll get a ranking of the overall events:

SELECT yyyymm, min_avg,
       rank() OVER (ORDER BY min_rank)
    SELECT yyyymm,
           min(avg_31_days) AS min_avg,
           min(rank) AS min_rank
    FROM (ABOVE QUERY) AS foobie
    GROUP BY yyyymm
) AS barfoo
ORDER BY min_rank;

Here’s the results of the new query:

yyyymm min_avg (°C) rank
1964-12 -36.9 1
1971-01 -36.8 2
1969-01 -36.3 3
1951-01 -34.2 4
1996-11 -34.0 5
2012-01 -33.9 6
1953-01 -33.7 7
1956-12 -33.5 8
1966-01 -33.3 9
1965-01 -33.2 10

Our cold snap winds up in sixth place. I’m sure that if I had all the data the National Weather Service used in their analysis, last month would drop even lower in an analysis like this one.

My conclusion: last month was a really cold month. But the exceptional nature of it was at least partly due to the coincidence of it happening within the confines of an arbitrary 31-day period called “January.”

tags: weather  SQL 

Meta Photolog Archives