wed, 27-mar-2013, 18:35

Earlier today our monitor stopped working and left us without heat when it was −35°F outside. I drove home and swapped the broken heater with our spare, but the heat was off for several hours and the temperature in the house dropped into the 50s until I got the replacement running. While I waited for the house to warm up, I took a look at the heat loss data for the building.

To do this, I experimented with the “Python scientific computing stack,”: the IPython shell (I used the notebook functionality to produce the majority of this blog post), Pandas for data wrangling, matplotlib for plotting, and NumPy in the background. Ordinarily I would have performed the entire analysis in R, but I’m much more comfortable in Python and the IPython notebook is pretty compelling. What is lacking, in my opinion, is the solid graphics provided by the ggplot2 package in R.

First, I pulled the data from the database for the period the heater was off (and probably a little extra on either side):

import psycopg2
from pandas.io import sql
con = psycopg2.connect(host = 'localhost', database = 'arduino_wx')
temps = sql.read_frame("""
    SELECT obs_dt, downstairs,
        (lead(downstairs) over (order by obs_dt) - downstairs) /
            interval_to_seconds(lead(obs_dt) over (order by obs_dt) - obs_dt)
            * 3600 as downstairs_rate,
        upstairs,
        (lead(upstairs) over (order by obs_dt) - upstairs) /
            interval_to_seconds(lead(obs_dt) over (order by obs_dt) - obs_dt)
            * 3600 as upstairs_rate,
        outside
    FROM arduino
    WHERE obs_dt between '2013-03-27 07:00:00' and '2013-03-27 12:00:00'
    ORDER BY obs_dt;""", con, index_col = 'obs_dt')

SQL window functions calculate the rate the temperature is changing from one observation to the next, and convert the units to the change in temperature per hour (Δ°F/hour).

Adding the index_col attribute in the sql.read_frame() function is very important so that the Pandas data frame doesn’t have an arbitrary numerical index. When plotting, the index column is typically used for the x-axis / independent variable.

Next, calculate the difference between the indoor and outdoor temperatures, which is important in any heat loss calculations (the greater this difference, the greater the loss):

temps['downstairs_diff'] = temps['downstairs'] - temps['outside']
temps['upstairs_diff'] = temps['upstairs'] - temps['outside']

I took a quick look at the data and it looks like the downstairs temperatures are smoother so I subset the data so it only contains the downstairs (and outside) temperature records.

temps_up = temps[['outside', 'downstairs', 'downstairs_diff', 'downstairs_rate']]
print(u"Minimum temperature loss (°f/hour) = {0}".format(
    temps_up['downstairs_rate'].min()))
temps_up.head(10)

Minimum temperature loss (deg F/hour) = -3.7823079517
obs_dt outside downstairs diff rate
2013-03-27 07:02:32 -33.09 65.60 98.70 0.897
2013-03-27 07:07:32 -33.19 65.68 98.87 0.661
2013-03-27 07:12:32 -33.26 65.73 98.99 0.239
2013-03-27 07:17:32 -33.52 65.75 99.28 -2.340
2013-03-27 07:22:32 -33.60 65.56 99.16 -3.782
2013-03-27 07:27:32 -33.61 65.24 98.85 -3.545
2013-03-27 07:32:31 -33.54 64.95 98.49 -2.930
2013-03-27 07:37:32 -33.58 64.70 98.28 -2.761
2013-03-27 07:42:32 -33.48 64.47 97.95 -3.603
2013-03-27 07:47:32 -33.28 64.17 97.46 -3.780

You can see from the first bit of data that when the heater first went off, the differential between inside and outside was almost 100 degrees, and the temperature was dropping at a rate of 3.8 degrees per hour. Starting at 65°F, we’d be below freezing in just under nine hours at this rate, but as the differential drops, the rate that the inside temperature drops will slow down. I'd guess the house would stay above freezing for more than twelve hours even with outside temperatures as cold as we had this morning.

Here’s a plot of the data. The plot looks pretty reasonable with very little code:

import matplotlib.pyplot as plt
plt.figure()
temps_up.plot(subplots = True, figsize = (8.5, 11),
    title = u"Heat loss from our house at −35°F",
    style = ['bo-', 'ro-', 'ro-', 'ro-', 'go-', 'go-', 'go-'])
plt.legend()
# plt.subplots_adjust(hspace = 0.15)
plt.savefig('downstairs_loss.pdf')
plt.savefig('downstairs_loss.svg')

You’ll notice that even before I came home and replaced the heater, the temperature in the house had started to rise. This is certainly due to solar heating as it was a clear day with more than twelve hours of sunlight.

The plot shows what looks like a relationship between the rate of change inside and the temperature differential between inside and outside, so we’ll test this hypothesis using linear regression.

First, get the data where the temperature in the house was dropping.

cooling = temps_up[temps_up['downstairs_rate'] < 0]

Now run the regression between rate of change and outside temperature:

import pandas as pd
results = pd.ols(y = cooling['downstairs_rate'], x = cooling.ix[:, 'outside'])
results
-------------------------Summary of Regression Analysis-------------------------

Formula: Y ~ <x> + <intercept>

Number of Observations:         38
Number of Degrees of Freedom:   2

R-squared:         0.9214
Adj R-squared:     0.9192

Rmse:              0.2807

F-stat (1, 36):   421.7806, p-value:     0.0000

Degrees of Freedom: model 1, resid 36

-----------------------Summary of Estimated Coefficients------------------------
      Variable       Coef    Std Err     t-stat    p-value    CI 2.5%   CI 97.5%
--------------------------------------------------------------------------------
             x     0.1397     0.0068      20.54     0.0000     0.1263     0.1530
     intercept     1.3330     0.1902       7.01     0.0000     0.9603     1.7057
---------------------------------End of Summary---------------------------------

You can see there’s a very strong positive relationship between the outside temperature and the rate that the inside temperature changes. As it warms outside, the drop in inside temperature slows.

The real relationship is more likely to be related to the differential between inside and outside. In this case, the relationship isn’t quite as strong. I suspect that the heat from the sun is confounding the analysis.

results = pd.ols(y = cooling['downstairs_rate'], x = cooling.ix[:, 'downstairs_diff'])
results
-------------------------Summary of Regression Analysis-------------------------

Formula: Y ~ <x> + <intercept>

Number of Observations:         38
Number of Degrees of Freedom:   2

R-squared:         0.8964
Adj R-squared:     0.8935

Rmse:              0.3222

F-stat (1, 36):   311.5470, p-value:     0.0000

Degrees of Freedom: model 1, resid 36

-----------------------Summary of Estimated Coefficients------------------------
      Variable       Coef    Std Err     t-stat    p-value    CI 2.5%   CI 97.5%
--------------------------------------------------------------------------------
             x    -0.1032     0.0058     -17.65     0.0000    -0.1146    -0.0917
     intercept     6.6537     0.5189      12.82     0.0000     5.6366     7.6707
---------------------------------End of Summary---------------------------------
con.close()

I’m not sure how much information I really got out of this, but I am pleasantly surprised that the house held it’s heat as well as it did even with the very cold temperatures. It might be interesting to intentionally turn off the heater in the middle of winter and examine these relationship for a longer period and without the influence of the sun.

And I’ve enjoyed learning a new set of tools for data analysis. Thanks to my friend Ryan for recommending them.

tags: weather  house  Pandas  Python  IPython 
Meta Photolog Archives