thu, 16-jul-2009, 07:08

Cycling with the wind

Photo by ~BostonBill~

I decided to look at wind a little more deeply after yesterday’s bike ride home. It seemed clear to me that the wind was strongly at my back for much of the route. It wasn’t my fastest ride home, but it was close, and it didn’t feel like I was working all that hard.

Here’s the process. First, examine all my bicycling tracks individually, using PostGIS’s ST_Azimuth function to calculate the direction I was traveling at each point. The query uses another of the new window functions (lead) in PostgreSQL 8.4.

SELECT
    point_id, dt_local,
    ST_Azimuth(
        point_utm,
        lead(point_utm) OVER (PARTITION BY tid ORDER BY dt_local)
    ) / (2 * pi()) *360
FROM points
WHERE tid = TID
ORDER BY dt_local;
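For intuition, here’s a minimal Python sketch of the same bearing calculation outside the database. It assumes the points are (easting, northing) pairs in a projected coordinate system such as UTM; the function names azimuth_degrees and track_azimuths are made up for this example and aren’t part of PostGIS.

```python
import math


def azimuth_degrees(p1, p2):
    """Bearing from p1 to p2 in degrees, clockwise from north (0 to 360).

    Points are (easting, northing) pairs in projected units (e.g. UTM meters).
    Returns None when the two points coincide, since there is no direction.
    """
    dx = p2[0] - p1[0]  # change in easting
    dy = p2[1] - p1[1]  # change in northing
    if dx == 0 and dy == 0:
        return None
    # atan2(dx, dy) rather than atan2(dy, dx): the angle is measured
    # from north and increases clockwise, like a compass bearing.
    return math.degrees(math.atan2(dx, dy)) % 360.0


def track_azimuths(points):
    """Pair each point with the next one in the track, like lead() does.

    The last point has no successor, so its bearing is None.
    """
    if not points:
        return []
    bearings = [azimuth_degrees(a, b) for a, b in zip(points, points[1:])]
    return bearings + [None]
```

A point directly east of the previous one gives a bearing of 90, and one directly south gives 180.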

Then, for each point, find the direction the wind was blowing. This is a pretty slow query, but I haven’t found a better way to compare timestamps in the database to find the closest record. This technique, based on converting both timestamps to “epoch,” which is the number of seconds since January 1st, 1970, is faster than using an interval type of operation (like: WHERE obs_dt - POINT_DT BETWEEN interval '-3 minutes' AND interval '3 minutes').

SELECT obs_dt, wdir, wspd
FROM observations
WHERE abs(extract(epoch from obs_dt) - extract(epoch from POINT_DT)) < 5 * 60
    AND wspd IS NOT NULL
    AND wdir IS NOT NULL
ORDER BY abs(extract(epoch from obs_dt) - extract(epoch from POINT_DT))
LIMIT 1;
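If the observations are already loaded into memory, the same closest-record lookup can be sketched in Python with a binary search. This is only an illustration, not the approach used in the database: the function name nearest_observation is hypothetical, and it assumes a list of (obs_dt, wdir, wspd) tuples sorted by time with the NULL rows already filtered out.

```python
import bisect
from datetime import datetime, timedelta


def nearest_observation(observations, point_dt, max_gap=timedelta(minutes=5)):
    """Return the observation closest in time to point_dt, or None.

    observations: list of (obs_dt, wdir, wspd) tuples sorted by obs_dt
    (assumed layout). Only observations strictly less than max_gap away
    qualify, matching the "< 5 * 60" seconds test in the SQL.
    """
    times = [obs[0] for obs in observations]
    i = bisect.bisect_left(times, point_dt)
    # The closest record is either just before or just after the
    # insertion point, so only those two need to be compared.
    candidates = observations[max(i - 1, 0):i + 1]
    if not candidates:
        return None
    best = min(candidates, key=lambda obs: abs(obs[0] - point_dt))
    if abs(best[0] - point_dt) < max_gap:
        return best
    return None
```

For a whole track it would be cheaper to build the list of times once rather than on every call.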

Now I’ve got the direction I was traveling and the direction the wind is coming from. I wrote a Python function that returns a value from -1 (wind is in my face) to 1 (wind is at my back). The procedure is to convert both the travel direction and the wind direction to unit u and v vectors and get the distance between the endpoints of the two vectors. The distance is then scaled so that wind from behind the direction of travel ranges from 0 to 1, and wind blowing against the direction of travel ranges from -1 to 0.

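import math

def wind_effect(mydir, winddir):
    """ Returns a number from 1 (wind at my back) to -1 (wind in my face)
        based on the directions passed in.

        Remember that wind direction is where the wind is *from*, so a
        wind direction of 0 and a mydir of 0 means the wind is in my face.
    """
    try:
        mydir = float(mydir)
        winddir = float(winddir)
    except (TypeError, ValueError):
        # Missing or non-numeric direction: no value for this point
        return None

    # Unit vectors (speed of 1.0) for the travel and wind directions
    my_spd = 1.0
    wind_spd = 1.0
    u_mydir = -1 * my_spd * math.sin(math.radians(mydir))
    v_mydir = -1 * my_spd * math.cos(math.radians(mydir))
    u_winddir = -1 * wind_spd * math.sin(math.radians(winddir))
    v_winddir = -1 * wind_spd * math.cos(math.radians(winddir))
    # Distance between the vector endpoints: 0 when the directions
    # match, sqrt(2) for a 90 degree crosswind, 2 when they are opposite
    distance = math.sqrt((u_mydir - u_winddir)**2 + (v_mydir - v_winddir)**2)
    factor = (1.41421356 - distance)
    if factor < 0.0:
        # Wind from behind: scale distances of sqrt(2)..2 to 0..1
        factor = factor / -0.58578644
    else:
        # Wind from ahead: scale distances of 0..sqrt(2) to -1..0
        factor = factor / -1.41421356

    return factor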

Finally, multiply this value by the wind speed at that time, and sum all these values for an entire bicycling track. The result is a “wind factor.” A positive wind factor means the wind was generally at my back during the ride, negative means it was blowing in my face. Yesterday’s ride home had the highest wind factor (1.07) among trips since June. So the wind really was at my back!
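The weighting and summing step might look something like the sketch below. It’s a self-contained illustration, so it includes a compact version of the same distance-based scaling; the wind_factor function and the (mydir, winddir, wspd) tuple layout are assumptions made for the example, not the actual script.

```python
import math

SQRT2 = math.sqrt(2.0)


def wind_effect(mydir, winddir):
    """Compact form of the distance-based scaling: -1 (headwind) to 1 (tailwind)."""
    a = math.radians(mydir)
    b = math.radians(winddir)
    # Distance between the endpoints of the two unit vectors. Negating
    # both vectors doesn't change the distance, so the signs are dropped.
    distance = math.hypot(math.sin(a) - math.sin(b), math.cos(a) - math.cos(b))
    factor = SQRT2 - distance
    if factor < 0.0:
        return factor / -(2.0 - SQRT2)  # wind from behind: 0 to 1
    return factor / -SQRT2              # wind from ahead: -1 to 0


def wind_factor(samples):
    """Sum wind_effect * wind speed over a track.

    samples: iterable of (mydir, winddir, wspd) tuples, one per track
    point (assumed layout). Points with a missing value are skipped.
    """
    total = 0.0
    for mydir, winddir, wspd in samples:
        if mydir is None or winddir is None or wspd is None:
            continue
        total += wind_effect(mydir, winddir) * wspd
    return total
```

Riding north with a 2 mph wind from the south contributes +2, and riding north into a 1 mph wind from the north contributes -1, so those two points together give a wind factor of 1.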

Can “wind factor” help predict average speed? Here’s the R and results:

$ R --save < wind_from_abr.R
> data<-read.table('wind_factor_from_abr',header=TRUE)
> model<-lm(speed ~ wind, data)
> summary(model)

Call:
lm(formula = speed ~ wind, data = data)

Residuals:
     Min       1Q   Median       3Q      Max
-0.90395 -0.46782 -0.04334  0.40286  0.85918

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  14.7796     0.1471  100.48   <2e-16 ***
wind          0.4369     0.2875    1.52    0.147
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.5522 on 17 degrees of freedom
Multiple R-squared: 0.1196,     Adjusted R-squared: 0.06784
F-statistic:  2.31 on 1 and 17 DF,  p-value: 0.1469

Hmm. Not a whole lot of help here. The wind coefficient isn’t statistically significant (p = 0.147), and the model isn’t very predictive (only 12% of the variation in average speed is explained by wind factor). However, the sign of the wind coefficient is what I’d expect: a positive wind factor is (weakly) correlated with a higher average speed.
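For anyone who wants to see what lm is computing for the slope, intercept, and R-squared, here’s a minimal least-squares sketch in plain Python. It’s an illustration only: the function name simple_linear_regression is made up, and it doesn’t reproduce the standard errors or p-values that R reports.

```python
def simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x has no variation, so the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Share of the variation in y explained by x; undefined if y is constant
    r_squared = (sxy * sxy) / (sxx * syy) if syy else float("nan")
    return slope, intercept, r_squared
```

With speed as y and wind factor as x, the slope corresponds to the "wind" estimate in the R output and r_squared to "Multiple R-squared."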

Thinking more about my route from work, I suspect that the route is actually two trips: the trip from ABR to the bottom of Miller Hill (4.8 miles) and the two mile trip over Miller Hill to our house. I’ll bet that wind becomes statistically significant if I only consider the first part of the trip: wind doesn’t have as much effect on a hill climb, and after making it over the top, the rest is a bumpy, gravel road where speed is determined more by safety than wind or how hard I’m pedalling. I think this might also resolve the question of why the ride home is so much easier than to work. It’s not because I’m glad to be out of work or because I’m carrying a lunchbox full of food to work, it’s because it’s downhill from ABR to the bottom of Miller Hill.

tue, 14-jul-2009, 21:17

Route to work

Ride to work

It’s interesting riding the same route back and forth to work every day. I have a perception that the wind is always in my face, and I wondered whether the wind tends to blow in one direction in the morning and another in the afternoon when I’m riding home. Even though riding home seems much easier than riding to work, the wind always seems stronger on the ride home.

If you look at the little map of my bicycling route from work on the right, you can see that the major portion of the trip is in a northwesterly (to work) or southeasterly (from work) direction. The long stretch that’s north/south is Miller Hill, and wind doesn’t really matter on that section of the route because it’s a steep hill. The color of the dots indicates speed, from blue (slow) to red (fast).

I took a look at the wind data from our weather station for the days I bicycled to work; for the two hours before and after each ride on those dates. The weather database has a binned summary of wind direction (each row in the table shows the number of five-minute observations where the wind was blowing in a particular cardinal direction) and an average speed. I multiplied the two for each hour during my rides, and then summed them over all my rides this year. The plot below shows what the data looks like.

And, for reference, the SQL query, data set and gnuplot script. It took me forever to figure out how to make gnuplot make a histogram of this data.

Wind direction, to and from work

I was right about it being windier on my ride home from work. But, my perception that the wind is always in my face isn’t right. In both the morning and afternoon, there are two predominant wind directions, northwest (which would be at my back on the way home) and south-southeast (in my face). This is one of those cases where I notice when the wind is in my face, but when it’s at my back it doesn’t register.

At some point I’ll have to see if there’s any relationship between my average speed and the wind. At least then I’d have something to blame when I arrive at my destination with a slow time.

tags: bicycling  weather  wind 
sun, 12-jul-2009, 15:22

Stevenson screen


For the past ten days I’ve been collecting data from three sets of temperature sensors located in different places around the yard. There’s the sensor in the Rainwise weather station at the top of the dog yard gate, a collection of sensors out behind the house under the oil tank, and a set of sensors under a collection of yogurt containers on top of a foundation post on the west side of the house.

It’s not easy to get accurate temperature readings. You need to site the sensors where they’ll get a good reading (between 4 and 6 feet off the ground, out in the open and away from buildings and trees), keep the sensors from getting heated by the sun, and keep them dry both from rain and snow, as well as from condensation inside an enclosure. I’ve got the last one figured out, but siting and solar radiation are proving to be big challenges.

The plot below shows the hourly average temperature readings for all three sets of sensors over the last ten days since I added the west sensor.

Temperature sensor comparison

The sensor atop the dog yard gate (the red line) is well sited in terms of its distance from large objects like buildings and trees, but it’s too high off the ground. It’s enclosed in a Gill multi-plate radiation shield, which is effective at reducing the effect of solar radiation when the wind is blowing. Compared with other temperature sensors in the region, this sensor is commonly several degrees warmer during the middle of the day, and I think this is because the shield isn’t keeping the sensor cool enough. We do seem to get less wind than other places, and I think this is why the shield isn’t working as well as it should. The sensor’s location away from everything does allow it to reach accurate minimum temperatures at night.

The sensor cluster behind the house is effectively shielded from the sun because it’s very close to the north side of the house, and even when the sun is in the north (in Fairbanks, the sun comes pretty close to circling the sky in summer) there are trees behind the house that keep it shaded. But it’s much too close to the house, and the location is far more sheltered than is appropriate. The moderating effect of being so close to the house reduces the diurnal temperature range, clipping the highs and lows compared to the data from the dog yard sensor.

On the west side of the house, I’ve got three sensors sitting on top of a foundation post (a telephone pole driven into the ground). There are several layers of yogurt containers on top of the sensors, both to protect them from rain and to reduce solar heating under the containers. The radiation shielding appears to be almost as effective as the commercial Gill shield over the dog yard sensor (the high temperature peak on the graph is very similar between the two), but something is keeping the temperature from dropping at night. The low temperatures from the west sensors are more than 5 degrees warmer than the dog yard sensor. I suspect the sensors aren’t high enough off the ground, and that the foundation post may be absorbing a lot of heat during the day and keeping the sensors artificially warm at night.

My plan is to place the west sensors inside the Stevenson screen pictured at the top of this post. I’ll raise it to between 4 and 6 feet off the ground and see how the temperatures compare with the dog yard sensor. I’m also working on a solar-powered aspiration system in case the Stevenson screen doesn’t have enough of an effect on the high temperatures on sunny days. I haven’t quite worked it out yet, but the idea is to put a small computer fan on top of a short piece of 4” plastic pipe that contains the sensors. When the sun is shining, the solar panel drives the fan, which pulls air up through the pipe and over the sensors. We’ll see if it’s needed in the next few days.

sat, 11-jul-2009, 11:04

West weather station

West weather sensors

Fairbanks had some very hot weather earlier this week, including breaking a high temperature record at the airport on Wednesday afternoon. We got up to 94°F here on the Creek, but what was worse was that the low temperature on Wednesday night was 60°F, too warm to cool the house much. We were tempted to sleep out in the back cabin because the house was so warm.

It’s been cooler in the last couple days. But how much cooler?

The new version of PostgreSQL (8.4) has support for window functions, which make it easier to answer questions like this. Window functions allow you to calculate aggregates (average, sum, etc.) at one level while displaying the data from another level. In this case, I want to compare the hourly average temperatures over the last 24 hours with the overall hourly average temperature over the past week. Without window functions, I’d need to combine a query that yields the hourly average temperature over the last 24 hours with a query that calculates overall seven-day hourly average temperatures. And if I want the difference between the two, the first two queries become a subquery of a third query.

Here’s the query:

SELECT dt, t_avg,
    seven_day_avg::numeric(4,1),
    (t_avg - seven_day_avg)::numeric(4,1) AS anomaly
FROM (
    SELECT dt, t_avg::numeric(4,1),
        avg(t_avg::numeric(4,1)) OVER (PARTITION BY extract(hour from dt)) AS seven_day_avg
    FROM hourly WHERE dt > current_timestamp - interval '7 days'
    ) AS sevenday
WHERE dt > current_timestamp - interval '24 hours'
ORDER BY dt;

And the result for the last 24 hours of temperature data:

         dt       | t_avg | seven_day_avg | anomaly
------------------+-------+---------------+---------
 2009-07-10 11:00 |  67.1 |          72.3 |    -5.2
 2009-07-10 12:00 |  70.8 |          75.8 |    -5.0
 2009-07-10 13:00 |  72.9 |          77.4 |    -4.5
 2009-07-10 14:00 |  74.1 |          78.5 |    -4.4
 2009-07-10 15:00 |  74.6 |          80.2 |    -5.6
 2009-07-10 16:00 |  75.9 |          80.4 |    -4.5
 2009-07-10 17:00 |  76.1 |          81.0 |    -4.9
 2009-07-10 18:00 |  76.9 |          80.4 |    -3.5
 2009-07-10 19:00 |  76.5 |          79.3 |    -2.8
 2009-07-10 20:00 |  73.1 |          77.0 |    -3.9
 2009-07-10 21:00 |  69.1 |          73.5 |    -4.4
 2009-07-10 22:00 |  63.7 |          68.2 |    -4.5
 2009-07-10 23:00 |  57.6 |          62.3 |    -4.7
 2009-07-11 00:00 |  52.1 |          56.5 |    -4.4
 2009-07-11 01:00 |  48.5 |          52.6 |    -4.1
 2009-07-11 02:00 |  45.5 |          49.3 |    -3.8
 2009-07-11 03:00 |  43.4 |          47.9 |    -4.5
 2009-07-11 04:00 |  42.2 |          47.1 |    -4.9
 2009-07-11 05:00 |  44.8 |          47.6 |    -2.8
 2009-07-11 06:00 |  47.5 |          49.7 |    -2.2
 2009-07-11 07:00 |  51.2 |          53.8 |    -2.6
 2009-07-11 08:00 |  55.3 |          59.2 |    -3.9
 2009-07-11 09:00 |  60.4 |          64.0 |    -3.6
 2009-07-11 10:00 |  65.5 |          68.3 |    -2.8
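To make the PARTITION BY step concrete, here’s roughly what the query does, sketched in plain Python. It’s only an illustration: the function name hourly_anomalies is made up, and it assumes the hourly data arrives as a list of (dt, t_avg) pairs.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def hourly_anomalies(rows, now):
    """Compare the last 24 hours with the seven-day average for each hour.

    rows: list of (dt, t_avg) pairs, one per hour (assumed layout).
    Returns (dt, t_avg, seven_day_avg, anomaly) tuples for the last
    24 hours, ordered by time.
    """
    # Inner query: keep the last seven days
    week = [(dt, t) for dt, t in rows if dt > now - timedelta(days=7)]

    # PARTITION BY extract(hour from dt): group by hour of the day,
    # then average within each group
    by_hour = defaultdict(list)
    for dt, t in week:
        by_hour[dt.hour].append(t)
    hour_avg = {hour: sum(temps) / len(temps) for hour, temps in by_hour.items()}

    # Outer query: every row keeps its own value next to its group's
    # average, and only the last 24 hours are displayed
    day_ago = now - timedelta(hours=24)
    return [
        (dt, t, hour_avg[dt.hour], t - hour_avg[dt.hour])
        for dt, t in sorted(week)
        if dt > day_ago
    ]
```

The point of the window function is the last step: each row keeps its own value while also seeing the aggregate for its partition, which is what would otherwise take a join between two queries.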

Conclusion? It’s been several degrees cooler in the last 24 hours compared with the last seven days.

And window functions are groovy.

thu, 02-jul-2009, 19:45

The company I work for (ABR, Inc.) pays its employees to use alternative transportation (bicycle, walk, run, ski, etc.) to get to work. It’s not a lot of money, but human behavior being what it is, even a small incentive really works. I’ve been riding my bicycle to work three or four times a week, carrying a hand-held GPS (Garmin eTrex Vista Cx) along with me and logging the data into spatially aware databases (PostGIS and spatialite). You can see a map of last week’s data on my GIS page.

One problem with my GPS is that I get really inaccurate elevation data, and as anyone who has ridden a bicycle knows, elevation changes are really important to how fast you can ride. I’ve noticed that I ride much faster going home than on my way to work, and had concluded (until examining the data…) that it must be an uphill ride to work.

But how to fix the elevation data? One approach would be to download the new global digital elevation model (DEM) that’s based on data from Japan’s ASTER satellite and use this to find the corrected elevation for each GPS trackpoint. I’ll probably try this at some point just to see how well the DEM matches my data. But the approach I wound up using is to generate an “average” track based on all my trips to and from work.

There’s probably a way to do this with a single SQL statement, but I couldn’t figure out how, so I did the point iteration in Python. The first SQL query takes an existing trip and segments the trip into a series of points. I chose a maximum interval of 50 meters because that was approximately how often the GPS reports data at the speeds I’m normally travelling. Here’s the PostGIS SQL:

SELECT ST_AsText(ST_PointN(
  segpoints.points,
  generate_series(1, ST_NPoints(segpoints.points)))
  )
FROM (
  SELECT
    ST_GeometryN(
        ST_Segmentize(track_utm, 50),
        generate_series(1, ST_NumGeometries(ST_Segmentize(track_utm, 50)))
    )
  AS points FROM tracks WHERE tid=67
  ) AS segpoints;

This yields a series of points (in WKT format) along the track.
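The idea behind the segmenting step can be sketched in Python: walk along the track and add intermediate points so that no two consecutive points are more than 50 meters apart. This is an illustration of the concept with a made-up function name, and it doesn’t necessarily place the points exactly where PostGIS’s ST_Segmentize would.

```python
import math


def segmentize(points, max_len):
    """Add points along a track so no segment is longer than max_len.

    points: list of (x, y) pairs in projected units (e.g. UTM meters).
    Each segment that is too long is split into equal-length pieces.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not points:
        return []
    out = [points[0]]
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        length = math.hypot(x2 - x1, y2 - y1)
        # Number of equal pieces needed to keep each one within max_len
        pieces = max(1, math.ceil(length / max_len))
        for i in range(1, pieces + 1):
            t = i / pieces  # fraction of the way along this segment
            out.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
    return out
```

A 120 meter segment with a 50 meter maximum is split into three 40 meter pieces, while a 30 meter segment is left alone.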

My Python script takes each of these points and finds the average X, Y, and Z (elevation) coordinates of all points within 50 meters of the point. Here’s the query:

SELECT count(*), avg(lat), avg(lon), avg(ele)*3.2808399 AS avg_ft
FROM points
WHERE ST_Within(
  ST_SetSRID(ST_GeomFromText(POINT_WKT), 32606),
  ST_Buffer(point_utm, 50)
);
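As a rough picture of what this query does for each point, here’s a brute-force Python sketch. It’s not the script itself: the function name average_nearby is hypothetical, it assumes points are (x, y, elevation in meters) tuples in UTM coordinates, it averages x and y rather than lat and lon, and it uses a true 50 meter circle where ST_Buffer produces a polygon approximation.

```python
import math

FEET_PER_METER = 3.2808399  # same conversion factor as the SQL


def average_nearby(center, points, radius=50.0):
    """Average position and elevation of all points within radius of center.

    center: (x, y) in projected meters.
    points: list of (x, y, ele_m) tuples (assumed layout).
    Returns a dict with the count and averages, or None if nothing is nearby.
    """
    cx, cy = center
    nearby = [p for p in points if math.hypot(p[0] - cx, p[1] - cy) <= radius]
    if not nearby:
        return None
    n = len(nearby)
    return {
        "count": n,
        "avg_x": sum(p[0] for p in nearby) / n,
        "avg_y": sum(p[1] for p in nearby) / n,
        "avg_ft": sum(p[2] for p in nearby) / n * FEET_PER_METER,
    }
```

Checking every point against every center is slow for a large table; the database can use a spatial index for the same comparison.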

And the results:

Elevation and speed, Jul 1 ride to ABR

The red squares show the elevations along my route, and the green markers show the speed I was traveling at that point. In this case, I’m riding to work, so home is on the left and ABR is on the right. The big hill on the left is Miller Hill Road. In general, you can see that a ride to work involves going over Miller Hill, then a gradual climb along Sheep Creek Road to the intersection with Ester Dome Road (the little hitch in this section is the hairpin curve before Anne’s Greenhouses where so many people go off the road in winter…), followed by a steeper descent to ABR. My speed tracks the elevation changes fairly closely. I’ll be interested in seeing how these plots change over time as my legs and cardiovascular system strengthen. Will the green points rise uniformly, or will I be able to see improvements in individual aspects of my performance such as hill climbing?

One final point: the elevations at home and at ABR are essentially the same. My conclusion after all this is that elevation can’t account for why my rides home are so much faster (1.28 mph, on average). Wind? Cold mornings? Excitement to get home?

For reference, here's my ride home on the same day:

Elevation and speed, Jul 1 ride from ABR

tags: ABR  bicycle  cycling  elevation  GIS  GPS  speed 
