Hot summer, Window functions

West weather sensors
Fairbanks had some very hot weather earlier this week, including breaking a high temperature record at the airport on Wednesday afternoon. We got up to 94°F here on the Creek, but what was worse was that the low temperature on Wednesday night was 60°F, too warm to cool the house much. We were tempted to sleep out in the back cabin because the house was so warm.
It’s been cooler in the last couple days. But how much cooler?
The new version of PostgreSQL (8.4) has support for window functions, which make it easier to answer questions like this. Window functions allow you to calculate aggregates (average, sum, etc.) at one level, while display the data from another level. In this case, I want to compare the hourly average temperatures over the last 24 hours with the overall hourly average temperature over the past week. Without window functions, I’d need to combine a query that yields the hourly average temperature over the last 24 hours with a query that calculates overall seven-day hourly average temperatures. And if I want the difference between the two, the first two queries become a subquery of a third query.
Here’s the query:
SELECT dt, t_avg,
seven_day_avg::numeric(4,1),
(t_avg - seven_day_avg)::numeric(4,1) AS anomaly
FROM (
SELECT dt, t_avg::numeric(4,1),
avg(t_avg::numeric(4,1)) OVER (PARTITION BY extract(hour from dt)) AS seven_day_avg
FROM hourly WHERE dt > current_timestamp - interval ’7 days’
) AS sevenday
WHERE dt > current_timestamp - interval ’24 hours’
ORDER BY dt;
And the result for the last 24 hours of temperature data:
dt | t_avg | seven_day_avg | anomaly
------------------+-------+---------------+---------
2009-07-10 11:00 | 67.1 | 72.3 | -5.2
2009-07-10 12:00 | 70.8 | 75.8 | -5.0
2009-07-10 13:00 | 72.9 | 77.4 | -4.5
2009-07-10 14:00 | 74.1 | 78.5 | -4.4
2009-07-10 15:00 | 74.6 | 80.2 | -5.6
2009-07-10 16:00 | 75.9 | 80.4 | -4.5
2009-07-10 17:00 | 76.1 | 81.0 | -4.9
2009-07-10 18:00 | 76.9 | 80.4 | -3.5
2009-07-10 19:00 | 76.5 | 79.3 | -2.8
2009-07-10 20:00 | 73.1 | 77.0 | -3.9
2009-07-10 21:00 | 69.1 | 73.5 | -4.4
2009-07-10 22:00 | 63.7 | 68.2 | -4.5
2009-07-10 23:00 | 57.6 | 62.3 | -4.7
2009-07-11 00:00 | 52.1 | 56.5 | -4.4
2009-07-11 01:00 | 48.5 | 52.6 | -4.1
2009-07-11 02:00 | 45.5 | 49.3 | -3.8
2009-07-11 03:00 | 43.4 | 47.9 | -4.5
2009-07-11 04:00 | 42.2 | 47.1 | -4.9
2009-07-11 05:00 | 44.8 | 47.6 | -2.8
2009-07-11 06:00 | 47.5 | 49.7 | -2.2
2009-07-11 07:00 | 51.2 | 53.8 | -2.6
2009-07-11 08:00 | 55.3 | 59.2 | -3.9
2009-07-11 09:00 | 60.4 | 64.0 | -3.6
2009-07-11 10:00 | 65.5 | 68.3 | -2.8
Conclusion? It’s been several degrees cooler in the last 24 hours compared with the last seven days.
And window functions are groovy.
Another good query using the new rank() window aggregate:
SELECT date, t_min, rank() OVER (PARTITION BY extract(year from date) ORDER BY t_min DESC) AS rank_min, t_avg, rank() OVER (PARTITION BY extract(year from date) ORDER BY t_avg DESC) AS rank_avg, t_max, rank() OVER (PARTITION BY extract(year from date) ORDER BY t_max DESC) as rank_max FROM daily WHERE extract(year from date) = 2009 ORDER BY rank_avgIt shows the daily minimum, maximum and average temperatures with their rank among this year’s temperatures. Something similar could be used to get the rank for each day among the historical record (where you’d use something like (PARTITION BY to_char(extract(month from date), '00') || to_char(extract(day from date), '00') ORDER BY t_avg DESC)), but at this point I don’t even have a full year of data here at the Creek. Here’s the top 20 results from the query:
date | t_min | rank_min | t_avg | rank_avg | t_max | rank_max ------------+--------+----------+--------+----------+--------+---------- 2009-07-08 | 48.0 | 5 | 72.0 | 1 | 94.0 | 1 2009-07-07 | 45.0 | 10 | 69.0 | 2 | 87.0 | 2 2009-07-03 | 40.0 | 25 | 66.5 | 3 | 84.0 | 4 2009-07-06 | 45.0 | 10 | 65.8 | 4 | 82.9 | 9 2009-06-04 | 37.0 | 31 | 65.8 | 4 | 86.0 | 3 2009-07-02 | 39.0 | 27 | 65.2 | 6 | 84.0 | 4 2009-07-05 | 43.0 | 14 | 65.2 | 6 | 83.0 | 7 2009-07-04 | 41.0 | 20 | 64.6 | 8 | 84.0 | 4 2009-07-09 | 54.0 | 1 | 64.2 | 9 | 72.8 | 40 2009-06-16 | 36.8 | 34 | 62.2 | 10 | 80.6 | 16 2009-06-12 | 47.0 | 7 | 61.9 | 11 | 76.0 | 26 2009-05-21 | 44.0 | 12 | 61.8 | 12 | 77.0 | 22 2009-06-11 | 37.0 | 31 | 61.7 | 13 | 81.8 | 12 2009-06-15 | 46.0 | 9 | 61.7 | 13 | 77.7 | 20 2009-07-01 | 35.0 | 39 | 61.6 | 15 | 81.0 | 14 2009-06-03 | 40.0 | 25 | 61.6 | 15 | 78.0 | 18 2009-07-10 | 42.4 | 16 | 61.3 | 17 | 77.6 | 21 2009-05-25 | 32.0 | 46 | 61.2 | 18 | 81.6 | 13 2009-06-14 | 49.0 | 2 | 61.1 | 19 | 74.7 | 34 2009-06-17 | 36.0 | 35 | 60.8 | 20 | 83.0 | 7Comment by cswingle — Sat, 11 Jul 2009 @ 18:03:41 -0800
[...] For the past ten days I’ve been collecting data from three sets of temperature sensors located in different places around the yard. There’s the sensor in the Rainwise weather station at the top of the dog yard gate, a collection of sensors out behind the house under the oil tank, and a set of sensors under a collection of yogurt containers on top of a foundation post on the west side of the house. [...]
Pingback by metachronistic » Weather station updates — Sun, 12 Jul 2009 @ 17:02:28 -0800