metachronistic http://swingleydev.com/blog/ Latest metachronistic posts en-us Thu, 13 Sep 2018 17:40:09 -0800 Equinox Marathon Weather, 2018 update http://swingleydev.com/blog/p/2010/ <div class="document"> <div class="section" id="introduction"> <h1>Introduction</h1> <p>A couple years ago I wrote a post about past <a class="reference external" href="https://swingleydev.com/blog/p/1999/">Equinox Marathon weather</a>. Since that post Andrea and I have run the relay twice, and I plan on running the full marathon in a couple days. This post updates the statistics and plots to include two more years of the race.</p> </div> <div class="section" id="methods"> <h1>Methods</h1> <p>Methods and data are the same as in my previous <a class="reference external" href="https://swingleydev.com/blog/p/1999/">post</a>, except the daily data has been updated to include 2016 and 2017. The R code is available at the end of the previous post.</p> </div> <div class="section" id="results"> <h1>Results</h1> <div class="section" id="race-day-weather"> <h2>Race day weather</h2> <p>Temperatures at the airport on race day ranged from 19.9&nbsp;°F in 1972 to 68.0&nbsp;°F in 1969, but the average range is between 34.1 and 53.1&nbsp;°F. Using our model of Ester Dome temperatures, we get an average range of 29.5 to 47.3&nbsp;°F and an overall min / max of 16.1 / 61.3&nbsp;°F. Generally speaking, it will drop below freezing on Ester Dome, but possibly before most of the runners get up there.</p> <p>Precipitation (rain, sleet or snow) has fallen on 16 out of 55 race days, or 29% of the time, and measurable snowfall has been recorded on four of those sixteen. The highest amount fell in 2014 with 0.36 inches of liquid precipitation (no snow was recorded and the temperatures were between 45 and 51&nbsp;°F so it was almost certainly all rain, even on Ester Dome). 
More than a quarter of an inch of precipitation fell in three of the sixteen years when it rained or snowed (1990, 1993, and 2014), but most rainfall totals are much smaller.</p> <p>Measurable snow fell at the airport in four years, or seven percent of the time: 4.1&nbsp;inches in 1993, 2.1&nbsp;inches in 1985, 1.2&nbsp;inches in 1996, and 0.4&nbsp;inches in 1992. But that’s at the airport station. Five of the 12 years where measurable precipitation fell at the airport but no snow was recorded had possible minimum temperatures on Ester Dome that were below freezing. It’s likely that some of the precipitation recorded at the airport in those years was coming down as snow up on Ester Dome. If so, that means snow may have fallen on nine race days, bringing the percentage up to sixteen percent.</p> <p>Wind data from the airport has only been recorded since 1984, but from those years the average wind speed at the airport on race day is 4.8&nbsp;miles per hour. The highest 2-minute wind speed on an Equinox race day was 21&nbsp;miles per hour in 2003. Unfortunately, no wind data is available for Ester Dome, but wind speeds there are likely to be higher than what is recorded at the airport.</p> </div> <div class="section" id="weather-from-the-week-prior"> <h2>Weather from the week prior</h2> <p>It’s also useful to look at the weather from the week before the race, since excessive pre-race rain or snow can make conditions on race day very different, even if the race day weather is pleasant. The year I ran the full marathon (2013), it snowed the week before, and much of the trail in the woods before the water stop near Henderson and all of the out and back were covered in snow.</p> <p>The most dramatic example of this was 1992, when 23 inches (!) of snow fell at the airport in the week prior to the race, with much higher totals up on the summit of Ester Dome. 
Measurable snow has been recorded at the airport in the week prior to six races, but all the weekly totals are under an inch except for the snow year of 1992.</p> <p>Precipitation has fallen in 44 of 55 pre-race weeks (80% of the time). Three years have had more than an inch of precipitation prior to the race: 1.49&nbsp;inches in 2015, 1.26&nbsp;inches in 1992 (most of which fell as snow), and 1.05&nbsp;inches in 2007. On average, just over two tenths of an inch of precipitation falls in the week before the race.</p> </div> </div> <div class="section" id="summary"> <h1>Summary</h1> <p>The following stacked plot shows the weather for all 55 runnings of the Equinox Marathon. The top panel shows the range of temperatures on race day from the airport station (wide bars) and estimated on Ester Dome (thin lines below bars). The shaded area at the bottom shows where temperatures are below freezing.</p> <p>The middle panel shows race day liquid precipitation (rain, melted snow). Bars marked with an asterisk indicate years where snow was also recorded at the airport, but remember that five of the other years with liquid precipitation probably experienced snow on Ester Dome (1977, 1986, 1991, 1994, and 2016) because the temperatures were likely to be below freezing at elevation.</p> <p>The bottom panel shows precipitation totals from the week prior to the race. Bars marked with an asterisk indicate weeks where snow was also recorded at the airport.</p> <div class="figure"> <a class="reference external image-reference" href="//media.swingleydev.com/img/blog/2018/09/equinox_weather_thru_2017.pdf"><img alt="Equinox Marathon Weather" class="img-responsive" src="//media.swingleydev.com/img/blog/2018/09/equinox_weather_thru_2017.svgz" /></a> </div> <p>Here’s a table with most of the data from the analysis. 
A CSV with this data can be downloaded from <a class="reference external" href="//media.swingleydev.com/img/blog/2018/09/all_wx.csv">all_wx.csv</a></p> <table border="1" class="docutils"> <colgroup> <col width="15%" /> <col width="9%" /> <col width="9%" /> <col width="13%" /> <col width="13%" /> <col width="8%" /> <col width="8%" /> <col width="8%" /> <col width="10%" /> <col width="10%" /> </colgroup> <thead valign="bottom"> <tr><th class="head">Date</th> <th class="head">min t</th> <th class="head">max t</th> <th class="head">ED min t</th> <th class="head">ED max t</th> <th class="head">awnd</th> <th class="head">prcp</th> <th class="head">snow</th> <th class="head">p prcp</th> <th class="head">p snow</th> </tr> </thead> <tbody valign="top"> <tr><td>1963-09-21</td> <td>32.0</td> <td>54.0</td> <td>27.5</td> <td>48.2</td> <td>&nbsp;</td> <td>0.00</td> <td>0.0</td> <td>0.01</td> <td>0.0</td> </tr> <tr><td>1964-09-19</td> <td>34.0</td> <td>57.9</td> <td>29.4</td> <td>51.8</td> <td>&nbsp;</td> <td>0.00</td> <td>0.0</td> <td>0.03</td> <td>0.0</td> </tr> <tr><td>1965-09-25</td> <td>37.9</td> <td>60.1</td> <td>33.1</td> <td>53.9</td> <td>&nbsp;</td> <td>0.00</td> <td>0.0</td> <td>0.80</td> <td>0.0</td> </tr> <tr><td>1966-09-24</td> <td>36.0</td> <td>62.1</td> <td>31.3</td> <td>55.8</td> <td>&nbsp;</td> <td>0.00</td> <td>0.0</td> <td>0.01</td> <td>0.0</td> </tr> <tr><td>1967-09-23</td> <td>35.1</td> <td>57.9</td> <td>30.4</td> <td>51.8</td> <td>&nbsp;</td> <td>0.00</td> <td>0.0</td> <td>0.00</td> <td>0.0</td> </tr> <tr><td>1968-09-21</td> <td>23.0</td> <td>44.1</td> <td>19.1</td> <td>38.9</td> <td>&nbsp;</td> <td>0.00</td> <td>0.0</td> <td>0.04</td> <td>0.0</td> </tr> <tr><td>1969-09-20</td> <td>35.1</td> <td>68.0</td> <td>30.4</td> <td>61.3</td> <td>&nbsp;</td> <td>0.00</td> <td>0.0</td> <td>0.00</td> <td>0.0</td> </tr> <tr><td>1970-09-19</td> <td>24.1</td> <td>39.9</td> <td>20.1</td> <td>34.9</td> <td>&nbsp;</td> <td>0.00</td> <td>0.0</td> <td>0.42</td> <td>0.0</td> 
</tr> <tr><td>1971-09-18</td> <td>35.1</td> <td>55.9</td> <td>30.4</td> <td>50.0</td> <td>&nbsp;</td> <td>0.00</td> <td>0.0</td> <td>0.14</td> <td>0.0</td> </tr> <tr><td>1972-09-23</td> <td>19.9</td> <td>42.1</td> <td>16.1</td> <td>37.0</td> <td>&nbsp;</td> <td>0.00</td> <td>0.0</td> <td>0.01</td> <td>0.2</td> </tr> <tr><td>1973-09-22</td> <td>30.0</td> <td>44.1</td> <td>25.6</td> <td>38.9</td> <td>&nbsp;</td> <td>0.00</td> <td>0.0</td> <td>0.05</td> <td>0.0</td> </tr> <tr><td>1974-09-21</td> <td>48.0</td> <td>60.1</td> <td>42.5</td> <td>53.9</td> <td>&nbsp;</td> <td>0.08</td> <td>0.0</td> <td>0.00</td> <td>0.0</td> </tr> <tr><td>1975-09-20</td> <td>37.9</td> <td>55.9</td> <td>33.1</td> <td>50.0</td> <td>&nbsp;</td> <td>0.02</td> <td>0.0</td> <td>0.02</td> <td>0.0</td> </tr> <tr><td>1976-09-18</td> <td>34.0</td> <td>59.0</td> <td>29.4</td> <td>52.9</td> <td>&nbsp;</td> <td>0.00</td> <td>0.0</td> <td>0.54</td> <td>0.0</td> </tr> <tr><td>1977-09-24</td> <td>36.0</td> <td>48.9</td> <td>31.3</td> <td>43.4</td> <td>&nbsp;</td> <td>0.06</td> <td>0.0</td> <td>0.20</td> <td>0.0</td> </tr> <tr><td>1978-09-23</td> <td>30.0</td> <td>42.1</td> <td>25.6</td> <td>37.0</td> <td>&nbsp;</td> <td>0.00</td> <td>0.0</td> <td>0.10</td> <td>0.3</td> </tr> <tr><td>1979-09-22</td> <td>35.1</td> <td>62.1</td> <td>30.4</td> <td>55.8</td> <td>&nbsp;</td> <td>0.00</td> <td>0.0</td> <td>0.17</td> <td>0.0</td> </tr> <tr><td>1980-09-20</td> <td>30.9</td> <td>43.0</td> <td>26.5</td> <td>37.8</td> <td>&nbsp;</td> <td>0.00</td> <td>0.0</td> <td>0.35</td> <td>0.0</td> </tr> <tr><td>1981-09-19</td> <td>37.0</td> <td>43.0</td> <td>32.2</td> <td>37.8</td> <td>&nbsp;</td> <td>0.15</td> <td>0.0</td> <td>0.04</td> <td>0.0</td> </tr> <tr><td>1982-09-18</td> <td>42.1</td> <td>61.0</td> <td>37.0</td> <td>54.8</td> <td>&nbsp;</td> <td>0.02</td> <td>0.0</td> <td>0.22</td> <td>0.0</td> </tr> <tr><td>1983-09-17</td> <td>39.9</td> <td>46.9</td> <td>34.9</td> <td>41.5</td> <td>&nbsp;</td> <td>0.00</td> 
<td>0.0</td> <td>0.05</td> <td>0.0</td> </tr> <tr><td>1984-09-22</td> <td>28.9</td> <td>60.1</td> <td>24.6</td> <td>53.9</td> <td>5.8</td> <td>0.00</td> <td>0.0</td> <td>0.08</td> <td>0.0</td> </tr> <tr><td>1985-09-21</td> <td>30.9</td> <td>42.1</td> <td>26.5</td> <td>37.0</td> <td>6.5</td> <td>0.14</td> <td>2.1</td> <td>0.57</td> <td>0.0</td> </tr> <tr><td>1986-09-20</td> <td>36.0</td> <td>52.0</td> <td>31.3</td> <td>46.3</td> <td>8.3</td> <td>0.07</td> <td>0.0</td> <td>0.21</td> <td>0.0</td> </tr> <tr><td>1987-09-19</td> <td>37.9</td> <td>61.0</td> <td>33.1</td> <td>54.8</td> <td>6.3</td> <td>0.00</td> <td>0.0</td> <td>0.00</td> <td>0.0</td> </tr> <tr><td>1988-09-24</td> <td>37.0</td> <td>45.0</td> <td>32.2</td> <td>39.7</td> <td>4.0</td> <td>0.00</td> <td>0.0</td> <td>0.11</td> <td>0.0</td> </tr> <tr><td>1989-09-23</td> <td>36.0</td> <td>61.0</td> <td>31.3</td> <td>54.8</td> <td>8.5</td> <td>0.00</td> <td>0.0</td> <td>0.07</td> <td>0.5</td> </tr> <tr><td>1990-09-22</td> <td>37.9</td> <td>50.0</td> <td>33.1</td> <td>44.4</td> <td>7.8</td> <td>0.26</td> <td>0.0</td> <td>0.00</td> <td>0.0</td> </tr> <tr><td>1991-09-21</td> <td>36.0</td> <td>57.0</td> <td>31.3</td> <td>51.0</td> <td>4.5</td> <td>0.04</td> <td>0.0</td> <td>0.03</td> <td>0.0</td> </tr> <tr><td>1992-09-19</td> <td>24.1</td> <td>33.1</td> <td>20.1</td> <td>28.5</td> <td>6.7</td> <td>0.01</td> <td>0.4</td> <td>1.26</td> <td>23.0</td> </tr> <tr><td>1993-09-18</td> <td>28.0</td> <td>37.0</td> <td>23.8</td> <td>32.2</td> <td>4.9</td> <td>0.29</td> <td>4.1</td> <td>0.37</td> <td>0.3</td> </tr> <tr><td>1994-09-24</td> <td>27.0</td> <td>51.1</td> <td>22.8</td> <td>45.5</td> <td>6.0</td> <td>0.02</td> <td>0.0</td> <td>0.08</td> <td>0.0</td> </tr> <tr><td>1995-09-23</td> <td>43.0</td> <td>66.9</td> <td>37.8</td> <td>60.3</td> <td>4.0</td> <td>0.00</td> <td>0.0</td> <td>0.00</td> <td>0.0</td> </tr> <tr><td>1996-09-21</td> <td>28.9</td> <td>37.9</td> <td>24.6</td> <td>33.1</td> <td>6.9</td> <td>0.06</td> 
<td>1.2</td> <td>0.26</td> <td>0.0</td> </tr> <tr><td>1997-09-20</td> <td>27.0</td> <td>55.0</td> <td>22.8</td> <td>49.1</td> <td>3.8</td> <td>0.00</td> <td>0.0</td> <td>0.03</td> <td>0.0</td> </tr> <tr><td>1998-09-19</td> <td>42.1</td> <td>60.1</td> <td>37.0</td> <td>53.9</td> <td>4.9</td> <td>0.00</td> <td>0.0</td> <td>0.37</td> <td>0.0</td> </tr> <tr><td>1999-09-18</td> <td>39.0</td> <td>64.9</td> <td>34.1</td> <td>58.4</td> <td>3.8</td> <td>0.00</td> <td>0.0</td> <td>0.26</td> <td>0.0</td> </tr> <tr><td>2000-09-16</td> <td>28.9</td> <td>50.0</td> <td>24.6</td> <td>44.4</td> <td>5.6</td> <td>0.00</td> <td>0.0</td> <td>0.30</td> <td>0.0</td> </tr> <tr><td>2001-09-22</td> <td>33.1</td> <td>57.0</td> <td>28.5</td> <td>51.0</td> <td>1.6</td> <td>0.00</td> <td>0.0</td> <td>0.00</td> <td>0.0</td> </tr> <tr><td>2002-09-21</td> <td>33.1</td> <td>48.9</td> <td>28.5</td> <td>43.4</td> <td>3.8</td> <td>0.00</td> <td>0.0</td> <td>0.03</td> <td>0.0</td> </tr> <tr><td>2003-09-20</td> <td>26.1</td> <td>46.0</td> <td>22.0</td> <td>40.7</td> <td>9.6</td> <td>0.00</td> <td>0.0</td> <td>0.00</td> <td>0.0</td> </tr> <tr><td>2004-09-18</td> <td>26.1</td> <td>48.0</td> <td>22.0</td> <td>42.5</td> <td>4.3</td> <td>0.00</td> <td>0.0</td> <td>0.25</td> <td>0.0</td> </tr> <tr><td>2005-09-17</td> <td>37.0</td> <td>63.0</td> <td>32.2</td> <td>56.6</td> <td>0.9</td> <td>0.00</td> <td>0.0</td> <td>0.09</td> <td>0.0</td> </tr> <tr><td>2006-09-16</td> <td>46.0</td> <td>64.0</td> <td>40.7</td> <td>57.6</td> <td>4.3</td> <td>0.00</td> <td>0.0</td> <td>0.00</td> <td>0.0</td> </tr> <tr><td>2007-09-22</td> <td>25.0</td> <td>45.0</td> <td>20.9</td> <td>39.7</td> <td>4.7</td> <td>0.00</td> <td>0.0</td> <td>1.05</td> <td>0.0</td> </tr> <tr><td>2008-09-20</td> <td>34.0</td> <td>51.1</td> <td>29.4</td> <td>45.5</td> <td>4.5</td> <td>0.00</td> <td>0.0</td> <td>0.08</td> <td>0.0</td> </tr> <tr><td>2009-09-19</td> <td>39.0</td> <td>50.0</td> <td>34.1</td> <td>44.4</td> <td>5.8</td> <td>0.00</td> 
<td>0.0</td> <td>0.25</td> <td>0.0</td> </tr> <tr><td>2010-09-18</td> <td>35.1</td> <td>64.9</td> <td>30.4</td> <td>58.4</td> <td>2.5</td> <td>0.00</td> <td>0.0</td> <td>0.00</td> <td>0.0</td> </tr> <tr><td>2011-09-17</td> <td>39.9</td> <td>57.9</td> <td>34.9</td> <td>51.8</td> <td>1.3</td> <td>0.00</td> <td>0.0</td> <td>0.44</td> <td>0.0</td> </tr> <tr><td>2012-09-22</td> <td>46.9</td> <td>66.9</td> <td>41.5</td> <td>60.3</td> <td>6.0</td> <td>0.00</td> <td>0.0</td> <td>0.33</td> <td>0.0</td> </tr> <tr><td>2013-09-21</td> <td>24.3</td> <td>44.1</td> <td>20.3</td> <td>38.9</td> <td>5.1</td> <td>0.00</td> <td>0.0</td> <td>0.13</td> <td>0.6</td> </tr> <tr><td>2014-09-20</td> <td>45.0</td> <td>51.1</td> <td>39.7</td> <td>45.5</td> <td>1.6</td> <td>0.36</td> <td>0.0</td> <td>0.00</td> <td>0.0</td> </tr> <tr><td>2015-09-19</td> <td>37.9</td> <td>44.1</td> <td>33.1</td> <td>38.9</td> <td>2.9</td> <td>0.01</td> <td>0.0</td> <td>1.49</td> <td>0.0</td> </tr> <tr><td>2016-09-17</td> <td>34.0</td> <td>57.9</td> <td>29.4</td> <td>51.8</td> <td>2.2</td> <td>0.01</td> <td>0.0</td> <td>0.61</td> <td>0.0</td> </tr> <tr><td>2017-09-16</td> <td>33.1</td> <td>66.0</td> <td>28.5</td> <td>59.5</td> <td>3.1</td> <td>0.00</td> <td>0.0</td> <td>0.02</td> <td>0.0</td> </tr> </tbody> </table> </div> </div> Thu, 13 Sep 2018 17:40:09 -0800 http://swingleydev.com/blog/p/2010/ Equinox Marathon R running weather Another Equinox Marathon prediction http://swingleydev.com/blog/p/2009/ <div class="document"> <div class="section" id="introduction"> <h1>Introduction</h1> <p>In previous posts (<a class="reference external" href="https://swingleydev.com/blog/p/2000/">Fairbanks Race Predictor</a>, <a class="reference external" href="https://swingleydev.com/blog/p/1968/">Equinox from Santa Claus</a>, <a class="reference external" href="https://swingleydev.com/blog/p/1967/">Equinox from Gold Discovery</a>) I’ve looked at predicting Equinox Marathon results based on results from earlier races. 
In all those cases I’ve looked at single race comparisons: how results from Gold Discovery can predict Marathon times, for example. In this post I’ll look at all the <a class="reference external" href="https://www.runningclubnorth.org/usibelli-running-series/">Usibelli Series races</a> I completed this year to see how they can inform my expectations for next Saturday’s Equinox Marathon.</p> </div> <div class="section" id="methods"> <h1>Methods</h1> <p>I’ve been collecting the results from all Usibelli Series races since 2010. Using that data, grouped by the name of the person racing and year, I find all runners that completed the same set of Usibelli Series races that I finished in 2018, as well as their Equinox Marathon finish pace. Between 2010 and 2017 there are 160 records that match.</p> <p>The data looks like this. <tt class="docutils literal">crr</tt> is that person’s Chena River Run pace in minutes, <tt class="docutils literal">msr</tt> is Midnight Sun Run pace for the same person and year, <tt class="docutils literal">rotv</tt> is the pace from Run of the Valkyries, <tt class="docutils literal">gdr</tt> is the Gold Discovery Run, and <tt class="docutils literal">em</tt> is Equinox Marathon pace for that same person and year.</p> <table border="1" class="docutils"> <colgroup> <col width="19%" /> <col width="21%" /> <col width="19%" /> <col width="21%" /> <col width="21%" /> </colgroup> <thead valign="bottom"> <tr><th class="head">crr</th> <th class="head">msr</th> <th class="head">rotv</th> <th class="head">gdr</th> <th class="head">em</th> </tr> </thead> <tbody valign="top"> <tr><td>8.1559</td> <td>8.8817</td> <td>8.1833</td> <td>10.2848</td> <td>11.8683</td> </tr> <tr><td>8.7210</td> <td>9.1387</td> <td>9.2120</td> <td>11.0152</td> <td>13.6796</td> </tr> <tr><td>8.7946</td> <td>9.0640</td> <td>9.0077</td> <td>11.3565</td> <td>13.1755</td> </tr> <tr><td>9.4409</td> <td>10.6091</td> <td>9.6250</td> <td>11.2080</td> <td>13.1719</td> </tr> <tr><td>7.3581</td> 
<td>7.1836</td> <td>7.1310</td> <td>8.0001</td> <td>9.6565</td> </tr> <tr><td>7.4731</td> <td>7.5349</td> <td>7.4700</td> <td>8.2465</td> <td>9.8359</td> </tr> <tr><td>...</td> <td>...</td> <td>...</td> <td>...</td> <td>...</td> </tr> </tbody> </table> <p>I will use two methods to predict Equinox Marathon times from these records: multiple linear regression and Random Forest.</p> <p>The R code for the analysis appears at the end of this post.</p> </div> <div class="section" id="results"> <h1>Results</h1> <div class="section" id="linear-regression"> <h2>Linear regression</h2> <p>We start with linear regression, which isn’t entirely appropriate for this analysis because the independent variables (pre-Equinox race pace times) aren’t really independent of one another. A person who runs a 6 minute pace in the Chena River Run is likely to also be someone who runs Gold Discovery faster than the average runner. This relationship, in fact, is the basis for this analysis.</p> <p>I started with a model that includes all the races I completed in 2018, but pace time for the Midnight Sun Run wasn’t statistically significant so I removed it from the final model, which included Chena River Run, Run of the Valkyries, and Gold Discovery.</p> <p>This model is significant, as are all the coefficients except the intercept, and the model explains nearly 80% of the variation in the data:</p> <pre class="literal-block">
##
## Call:
## lm(formula = em ~ crr + gdr + rotv, data = input_pivot)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3.8837 -0.6534 -0.2265  0.3549  5.8273
##
## Coefficients:
##             Estimate Std. Error t value Pr(&gt;|t|)
## (Intercept)   0.6217     0.5692   1.092 0.276420
## crr          -0.3723     0.1346  -2.765 0.006380 **
## gdr           0.8422     0.1169   7.206 2.32e-11 ***
## rotv          0.7607     0.2119   3.591 0.000442 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.278 on 156 degrees of freedom
## Multiple R-squared:  0.786,  Adjusted R-squared:  0.7819
## F-statistic:   191 on 3 and 156 DF,  p-value: &lt; 2.2e-16
</pre> <p>Using this model and my 2018 results, my overall pace and finish times for Equinox are predicted to be 10:45 and 4:41:50. The 95% confidence intervals for these predictions are 10:30–11:01 and 4:35:11–4:48:28.</p> </div> <div class="section" id="random-forest"> <h2>Random Forest</h2> <p>Random Forest is another regression method, but it doesn’t require that the independent variables be independent of one another. Here are the results of building 5,000 random trees from the data:</p> <pre class="literal-block">
##
## Call:
##  randomForest(formula = em ~ ., data = input_pivot, ntree = 5000)
##                Type of random forest: regression
##                      Number of trees: 5000
## No. of variables tried at each split: 1
##
##           Mean of squared residuals: 1.87325
##                     % Var explained: 74.82
##      IncNodePurity
## crr       260.8279
## gdr       321.3691
## msr       268.0936
## rotv      295.4250
</pre> <p>This model, which includes all race results, explains just under 75% of the variation in the data. And you can see from the importance result that Gold Discovery results factor more heavily in the result than earlier races in the season like Chena River Run and the Midnight Sun Run.</p> <p>Using this model, my predicted pace is 10:13 and my finish time is 4:27:46. The 95% confidence intervals are 9:23–11:40 and 4:05:58–5:05:34. You’ll notice that the confidence intervals are wider than with linear regression, probably because there are fewer assumptions with Random Forest and less power.</p> </div> </div> <div class="section" id="conclusion"> <h1>Conclusion</h1> <p>My number one goal for this year’s Equinox Marathon is simply to finish without injuring myself, something I wasn’t able to do the last time I ran the whole race in 2013. 
I finished in 4:49:28 with an overall pace of 11:02, but the race or my training for it resulted in a torn hip labrum.</p> <p>If I’m able to finish uninjured, I’d like to beat my time from 2013. These results suggest I should have no problem achieving my second goal, and perhaps, knowing how much faster these predictions are than my 2013 time, I can race conservatively and still get a personal best.</p> </div> <div class="section" id="appendix-r-code"> <h1>Appendix - R code</h1> <div class="highlight"><pre><span></span><span class="kn">library</span><span class="p">(</span>tidyverse<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>RPostgres<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>lubridate<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>glue<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>randomForest<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>knitr<span class="p">)</span>

<span class="c1"># Connect to the local PostgreSQL database of race results</span>
races <span class="o">&lt;-</span> dbConnect<span class="p">(</span>Postgres<span class="p">(),</span>
                   host <span class="o">=</span> <span class="s">&quot;localhost&quot;</span><span class="p">,</span>
                   dbname <span class="o">=</span> <span class="s">&quot;races&quot;</span><span class="p">)</span>

all_races <span class="o">&lt;-</span> races <span class="o">%&gt;%</span>
    tbl<span class="p">(</span><span class="s">&quot;all_races&quot;</span><span class="p">)</span>

<span class="c1"># All Usibelli Series races</span>
usibelli_races <span class="o">&lt;-</span> tibble<span class="p">(</span>race <span class="o">=</span> <span class="kt">c</span><span class="p">(</span><span class="s">&quot;Chena River Run&quot;</span><span class="p">,</span> <span class="s">&quot;Midnight Sun Run&quot;</span><span class="p">,</span>
                                  <span class="s">&quot;Jim Loftus Mile&quot;</span><span class="p">,</span> <span class="s">&quot;Run of the Valkyries&quot;</span><span class="p">,</span>
                                  <span class="s">&quot;Gold Discovery Run&quot;</span><span class="p">,</span> <span class="s">&quot;Santa Claus Half Marathon&quot;</span><span class="p">,</span>
                                  <span class="s">&quot;Golden Heart Trail Run&quot;</span><span class="p">,</span> <span class="s">&quot;Equinox Marathon&quot;</span><span class="p">))</span>

<span class="c1"># My 2018 results in those races</span>
css_2018 <span class="o">&lt;-</span> all_races <span class="o">%&gt;%</span>
    inner_join<span class="p">(</span>usibelli_races<span class="p">,</span> copy <span class="o">=</span> <span class="kc">TRUE</span><span class="p">)</span> <span class="o">%&gt;%</span>
    filter<span class="p">(</span>year <span class="o">==</span> <span class="m">2018</span><span class="p">,</span>
           name <span class="o">==</span> <span class="s">&quot;Christopher Swingley&quot;</span><span class="p">)</span> <span class="o">%&gt;%</span>
    collect<span class="p">()</span>

<span class="c1"># The races I ran in 2018, plus the Equinox Marathon</span>
candidate_races <span class="o">&lt;-</span> css_2018 <span class="o">%&gt;%</span>
    select<span class="p">(</span>race<span class="p">)</span> <span class="o">%&gt;%</span>
    bind_rows<span class="p">(</span>tibble<span class="p">(</span>race <span class="o">=</span> <span class="kt">c</span><span class="p">(</span><span class="s">&quot;Equinox Marathon&quot;</span><span class="p">)))</span>

input_data <span class="o">&lt;-</span> all_races <span class="o">%&gt;%</span>
    inner_join<span class="p">(</span>candidate_races<span class="p">,</span> copy <span class="o">=</span> <span class="kc">TRUE</span><span class="p">)</span> <span class="o">%&gt;%</span>
    filter<span class="p">(</span><span class="o">!</span><span class="kp">is.na</span><span class="p">(</span>gender<span class="p">),</span> <span class="o">!</span><span class="kp">is.na</span><span class="p">(</span>birth_year<span class="p">))</span> <span class="o">%&gt;%</span>
    collect<span class="p">()</span>

<span class="c1"># One row per runner and year, one pace column per race;</span>
<span class="c1"># keep only runners with a result in every race</span>
input_pivot <span class="o">&lt;-</span> input_data <span class="o">%&gt;%</span>
    group_by<span class="p">(</span>race<span class="p">,</span> name<span class="p">,</span> year<span class="p">)</span> <span class="o">%&gt;%</span>
    mutate<span class="p">(</span>n <span class="o">=</span> n<span class="p">())</span> <span class="o">%&gt;%</span>
    filter<span class="p">(</span>n <span class="o">==</span> <span class="m">1</span><span class="p">)</span> <span class="o">%&gt;%</span>
    ungroup<span class="p">()</span> <span class="o">%&gt;%</span>
    select<span class="p">(</span>name<span class="p">,</span> year<span class="p">,</span> race<span class="p">,</span> pace_min<span class="p">)</span> <span class="o">%&gt;%</span>
    spread<span class="p">(</span>race<span class="p">,</span> pace_min<span class="p">)</span> <span class="o">%&gt;%</span>
    rename<span class="p">(</span>crr <span class="o">=</span> <span class="sb">`Chena River Run`</span><span class="p">,</span>
           msr <span class="o">=</span> <span class="sb">`Midnight Sun Run`</span><span class="p">,</span>
           rotv <span class="o">=</span> <span class="sb">`Run of the Valkyries`</span><span class="p">,</span>
           gdr <span class="o">=</span> <span class="sb">`Gold Discovery Run`</span><span class="p">,</span>
           em <span class="o">=</span> <span class="sb">`Equinox Marathon`</span><span class="p">)</span> <span class="o">%&gt;%</span>
    filter<span class="p">(</span><span class="o">!</span><span class="kp">is.na</span><span class="p">(</span>crr<span class="p">),</span> <span class="o">!</span><span class="kp">is.na</span><span class="p">(</span>msr<span class="p">),</span> <span class="o">!</span><span class="kp">is.na</span><span class="p">(</span>rotv<span class="p">),</span>
           <span class="o">!</span><span class="kp">is.na</span><span class="p">(</span>gdr<span class="p">),</span> <span class="o">!</span><span class="kp">is.na</span><span class="p">(</span>em<span class="p">))</span> <span class="o">%&gt;%</span>
    select<span class="p">(</span><span class="o">-</span><span class="kt">c</span><span class="p">(</span>name<span class="p">,</span> year<span class="p">))</span>

kable<span class="p">(</span>input_pivot <span class="o">%&gt;%</span> <span class="kp">head</span><span class="p">)</span>

css_2018_pivot <span class="o">&lt;-</span> css_2018 <span class="o">%&gt;%</span>
    select<span class="p">(</span>name<span class="p">,</span> year<span class="p">,</span> race<span class="p">,</span> pace_min<span class="p">)</span> <span class="o">%&gt;%</span>
    spread<span class="p">(</span>race<span class="p">,</span> pace_min<span class="p">)</span> <span class="o">%&gt;%</span>
    rename<span class="p">(</span>crr <span class="o">=</span> <span class="sb">`Chena River Run`</span><span class="p">,</span>
           msr <span class="o">=</span> <span class="sb">`Midnight Sun Run`</span><span class="p">,</span>
           rotv <span class="o">=</span> <span class="sb">`Run of the Valkyries`</span><span class="p">,</span>
           gdr <span class="o">=</span> <span class="sb">`Gold Discovery Run`</span><span class="p">)</span> <span class="o">%&gt;%</span>
    select<span class="p">(</span><span class="o">-</span><span class="kt">c</span><span class="p">(</span>name<span class="p">,</span> year<span class="p">))</span>

<span class="c1"># Format decimal minutes as M:SS</span>
pace <span class="o">&lt;-</span> <span class="kr">function</span><span class="p">(</span>minutes<span class="p">)</span> <span class="p">{</span>
    mm <span class="o">=</span> <span class="kp">floor</span><span class="p">(</span>minutes<span class="p">)</span>
    seconds <span class="o">=</span> <span class="p">(</span>minutes <span class="o">-</span> mm<span class="p">)</span> <span class="o">*</span> <span class="m">60</span>

    glue<span class="p">(</span><span class="s">&#39;{mm}:{sprintf(&quot;%02.0f&quot;, seconds)}&#39;</span><span class="p">)</span>
<span class="p">}</span>

<span class="c1"># Format decimal minutes as H:MM:SS</span>
finish_time <span class="o">&lt;-</span> <span class="kr">function</span><span class="p">(</span>minutes<span class="p">)</span> <span class="p">{</span>
    hh <span class="o">=</span> <span class="kp">floor</span><span class="p">(</span>minutes <span class="o">/</span> <span class="m">60.0</span><span class="p">)</span>
    min <span class="o">=</span> minutes <span class="o">-</span> <span class="p">(</span>hh <span class="o">*</span> <span class="m">60</span><span class="p">)</span>
    mm <span class="o">=</span> <span class="kp">floor</span><span class="p">(</span><span class="kp">min</span><span class="p">)</span>
    seconds <span class="o">=</span> <span class="p">(</span>min <span class="o">-</span> mm<span class="p">)</span> <span class="o">*</span> <span class="m">60</span>

    glue<span class="p">(</span><span class="s">&#39;{hh}:{sprintf(&quot;%02d&quot;, mm)}:{sprintf(&quot;%02.0f&quot;, seconds)}&#39;</span><span class="p">)</span>
<span class="p">}</span>

lm_model <span class="o">&lt;-</span> lm<span class="p">(</span>em <span class="o">~</span> crr <span class="o">+</span> gdr <span class="o">+</span> rotv<span class="p">,</span> data <span class="o">=</span> input_pivot<span class="p">)</span>

<span class="kp">summary</span><span class="p">(</span>lm_model<span class="p">)</span>

prediction <span class="o">&lt;-</span> predict<span class="p">(</span>lm_model<span class="p">,</span> css_2018_pivot<span class="p">,</span>
                      interval <span class="o">=</span> <span class="s">&quot;confidence&quot;</span><span class="p">,</span> level <span class="o">=</span> <span class="m">0.95</span><span class="p">)</span>

prediction

rf <span class="o">&lt;-</span> randomForest<span class="p">(</span>em <span class="o">~</span> <span class="m">.</span><span class="p">,</span> data <span class="o">=</span> input_pivot<span class="p">,</span> ntree <span class="o">=</span> <span class="m">5000</span><span class="p">)</span>
rf
importance<span class="p">(</span>rf<span class="p">)</span>

rfp_all <span class="o">&lt;-</span> predict<span class="p">(</span>rf<span class="p">,</span> css_2018_pivot<span class="p">,</span> predict.all <span class="o">=</span> <span class="kc">TRUE</span><span class="p">)</span>

rfp_all<span class="o">$</span>aggregate

<span class="c1"># 95% interval from the spread of the individual tree predictions</span>
rf_ci <span class="o">&lt;-</span> quantile<span class="p">(</span>rfp_all<span class="o">$</span>individual<span class="p">,</span> <span class="kt">c</span><span class="p">(</span><span class="m">0.025</span><span class="p">,</span> <span class="m">0.975</span><span class="p">))</span>

rf_ci
</pre></div> </div> </div> Sun, 09 Sep 2018 10:54:14 -0800 http://swingleydev.com/blog/p/2009/ Equinox Marathon running statistics randomForest Lincoln in the Bardo, George Saunders http://swingleydev.com/blog/p/2007/ <div class="document"> <p>Well that was disappointing. I’ve read some of George Saunders’s short stories and was entertained, but I didn’t much enjoy <a class="reference external" href="https://www.amazon.com/dp/0812995341">Lincoln in the Bardo</a>. It’s the story of Abraham Lincoln coming to the graveyard to visit his newly dead son William, told from the perspective of a variety of lost souls that don’t believe they’re dead. There was no plot to speak of, and none of the large cast of characters was appealing. I did enjoy the sections that were fictional quotes from contemporary histories, many of which contradicted each other on the details, and some of the characters told funny stories, but it didn’t hold together as a novel.</p> <p>Widely acclaimed, winner of the Man Booker Prize, on many best of 2017 lists. 
Not my cup of tea.</p> <p>Music I listened to while reading this:</p> <ul class="simple"> <li>Carlow Town, Seamus Fogarty</li> <li>You’ve Got Tonight, Wiretree</li> </ul> </div> Wed, 03 Jan 2018 19:32:58 -0900 http://swingleydev.com/blog/p/2007/ books George Saunders All Our Wrong Todays, Elan Mastai http://swingleydev.com/blog/p/2006/ <div class="document"> <p>It’s one day until <a class="reference external" href="https://themorningnews.org/archive/tag/tob18">The Tournament of Books</a> announces the list of books for this year’s competition, and I’ve been reading some of the <a class="reference external" href="https://themorningnews.org/article/the-year-in-fiction-2017">Long List</a>, including the book commented on here, Elan Mastai’s <a class="reference external" href="https://www.amazon.com/dp/1101985135">All Our Wrong Todays</a>. I thoroughly enjoyed it. The writing sparkles, the narrator is hilariously self-deprecating, and because of the premise, there is a lot of insightful commentary about contemporary society.</p> <p>The main plot line is that the main character grew up in an alternative timeline where a device that produces free energy was invented in 1965 and put into the public domain. With free energy and fifty plus years, his world is something of a technological utopia (especially compared with our present). However, for reasons best left unspoiled, he alters the timeline and is stuck here in our timeline with the rest of us.</p> <p>The narrator on waking up for the first time in our timeline:</p> <blockquote> Here, it’s like nobody has considered using even the most rudimentary technology to improve the process. Mattresses don’t subtly vibrate to keep your muscles loose. Targeted steam valves don’t clean your body in slumber. I mean, blankets are made from tufts of plant fiber spun into thread and occasionally stuffed with feathers. Feathers. 
Like from actual birds.</blockquote> <p>While there are a lot of science-fiction concepts in the story, it’s really more of a love story than what it sounds like it’d be. There were a couple plot points I probably would have written differently, but the book is really funny, touching and thoughtful. I highly recommend it. Best book I’ve read in 2018 so far…</p> <p>A couple other quotes I found particularly timely:</p> <blockquote> Part of the problem is this world is basically a cesspool of misogyny, male entitlement, and deeply demented gender constructs accepted as casual fact by outrageously large swaths of the human population.</blockquote> <p>and</p> <blockquote> People are despondent about the future because they’re increasingly aware that we, as a species, chased an inspiring dream that led us to ruin. We told ourselves the world is here for us to control, so the better our technology, the better our control, the better our world will be. The fact that for every leap in technology the world gets more sour and chaotic is deeply confusing. The better things we build keep making it worse. The belief that the world is here for humans to control is the philosophical bedrock of our civilization, but it’s a mistaken belief. 
Optimism is the pyre on which we’ve been setting ourselves aflame.</blockquote> <p>Music I listened to while reading this book:</p> <ul class="simple"> <li>Jesus Christ, Brand New</li> <li>House of Cards, Radiohead</li> <li>Conundrum, Hak Baker</li> <li>Die Young, Sylvan Esso</li> <li>Feat &amp; Force, Vagabon</li> <li>No War, Cari Cari</li> </ul> </div> Tue, 02 Jan 2018 15:18:40 -0900 http://swingleydev.com/blog/p/2006/ books Elan Mastai Big South Fork River & Recreation Area, February Climate http://swingleydev.com/blog/p/2005/ <div class="document"> <div class="section" id="introduction"> <h1>Introduction</h1> <p>I’m planning a short trip to visit family in Florida and thought I’d take advantage of being in a new place to do some late winter backpacking where it’s warmer than in Fairbanks. I think I’ve settled on a 3‒5 day backpacking trip in Big South Fork National River and Recreation Area, which is in northeastern Tennessee and southeastern Kentucky.</p> <p>Except for a couple summer trips in New England in the 80s, my backpacking experience has been in summer, in places where it doesn’t rain much and is typically hot and dry (California, Oregon). So I’d like to find out what the weather should be like when I’m there.</p> </div> <div class="section" id="data"> <h1>Data</h1> <p>I’ll use the <a class="reference external" href="https://www.ncdc.noaa.gov/ghcn-daily-description">Global Historical Climatology Network — Daily</a> dataset, which contains daily weather observations for more than 100&nbsp;thousand stations across the globe. There are more than 26&nbsp;thousand active stations in the United States, and data for some U.S. stations goes back to 1836. I loaded the entire dataset—2.4&nbsp;billion records as of last week—into a PostgreSQL database, partitioning the data by year. 
I’m interested in daily minimum and maximum temperature (<tt class="docutils literal">TMIN</tt>, <tt class="docutils literal">TMAX</tt>), precipitation (<tt class="docutils literal">PRCP</tt>) and snowfall (<tt class="docutils literal">SNOW</tt>), and in stations within 50&nbsp;miles of the center of the recreation area.</p> <p>The following map shows the recreation area boundary (with some strange drawing errors, probably due to using the <tt class="docutils literal">fortify</tt> command) in green, the Tennessee/Kentucky border across the middle of the plot, and the 19&nbsp;stations used in the analysis.</p> <div class="figure"> <img alt="//media.swingleydev.com/img/blog/2017/12/biso_stations.svgz" class="img-responsive" src="//media.swingleydev.com/img/blog/2017/12/biso_stations.svgz" /> </div> <p>Here are the details on the stations:</p> <table border="1" class="docutils"> <colgroup> <col width="15%" /> <col width="32%" /> <col width="13%" /> <col width="11%" /> <col width="11%" /> <col width="12%" /> <col width="7%" /> </colgroup> <thead valign="bottom"> <tr><th class="head">station_id</th> <th class="head">station_name</th> <th class="head">start_year</th> <th class="head">end_year</th> <th class="head">latitude</th> <th class="head">longitude</th> <th class="head">miles</th> </tr> </thead> <tbody valign="top"> <tr><td>USC00407141</td> <td>PICKETT SP</td> <td>2000</td> <td>2017</td> <td>36.5514</td> <td>-84.7967</td> <td>6.13</td> </tr> <tr><td>USC00406829</td> <td>ONEIDA</td> <td>1959</td> <td>2017</td> <td>36.5028</td> <td>-84.5308</td> <td>9.51</td> </tr> <tr><td>USC00400081</td> <td>ALLARDT</td> <td>1928</td> <td>2017</td> <td>36.3806</td> <td>-84.8744</td> <td>12.99</td> </tr> <tr><td>USC00404590</td> <td>JAMESTOWN</td> <td>2003</td> <td>2017</td> <td>36.4258</td> <td>-84.9419</td> <td>14.52</td> </tr> <tr><td>USC00157677</td> <td>STEARNS 2S</td> <td>1936</td> <td>2017</td> <td>36.6736</td> <td>-84.4792</td> <td>16.90</td> </tr> <tr><td>USC00401310</td> 
<td>BYRDSTOWN</td> <td>1998</td> <td>2017</td> <td>36.5803</td> <td>-85.1256</td> <td>24.16</td> </tr> <tr><td>USC00406493</td> <td>NEWCOMB</td> <td>1999</td> <td>2017</td> <td>36.5517</td> <td>-84.1728</td> <td>29.61</td> </tr> <tr><td>USC00158711</td> <td>WILLIAMSBURG 1NW</td> <td>2011</td> <td>2017</td> <td>36.7458</td> <td>-84.1753</td> <td>33.60</td> </tr> <tr><td>USC00405332</td> <td>LIVINGSTON RADIO WLIV</td> <td>1961</td> <td>2017</td> <td>36.3775</td> <td>-85.3364</td> <td>36.52</td> </tr> <tr><td>USC00154208</td> <td>JAMESTOWN WWTP</td> <td>1971</td> <td>2017</td> <td>37.0056</td> <td>-85.0617</td> <td>39.82</td> </tr> <tr><td>USC00406170</td> <td>MONTEREY</td> <td>1904</td> <td>2017</td> <td>36.1483</td> <td>-85.2650</td> <td>40.04</td> </tr> <tr><td>USC00406619</td> <td>NORRIS</td> <td>1936</td> <td>2017</td> <td>36.2131</td> <td>-84.0603</td> <td>41.13</td> </tr> <tr><td>USC00402202</td> <td>CROSSVILLE ED &amp; RESEARCH</td> <td>1912</td> <td>2017</td> <td>36.0147</td> <td>-85.1314</td> <td>41.61</td> </tr> <tr><td>USW00053868</td> <td>OAK RIDGE ASOS</td> <td>1999</td> <td>2017</td> <td>36.0236</td> <td>-84.2375</td> <td>42.24</td> </tr> <tr><td>USC00401561</td> <td>CELINA</td> <td>1948</td> <td>2017</td> <td>36.5408</td> <td>-85.4597</td> <td>42.31</td> </tr> <tr><td>USC00157510</td> <td>SOMERSET 2 N</td> <td>1950</td> <td>2017</td> <td>37.1167</td> <td>-84.6167</td> <td>42.36</td> </tr> <tr><td>USW00003841</td> <td>OAK RIDGE ATDD</td> <td>1948</td> <td>2017</td> <td>36.0028</td> <td>-84.2486</td> <td>43.02</td> </tr> <tr><td>USW00003847</td> <td>CROSSVILLE MEM AP</td> <td>1954</td> <td>2017</td> <td>35.9508</td> <td>-85.0814</td> <td>43.87</td> </tr> <tr><td>USC00404871</td> <td>KINGSTON</td> <td>2000</td> <td>2017</td> <td>35.8575</td> <td>-84.5278</td> <td>45.86</td> </tr> </tbody> </table> <p>To perform the analysis, I collected all valid observations for the stations listed, then reduced the results, including observations where the day of the 
year was between 45 and 52 (February&nbsp;14‒21).</p> <table border="1" class="docutils"> <colgroup> <col width="43%" /> <col width="57%" /> </colgroup> <thead valign="bottom"> <tr><th class="head">variable</th> <th class="head">observations</th> </tr> </thead> <tbody valign="top"> <tr><td>PRCP</td> <td>5,942</td> </tr> <tr><td>SNOW</td> <td>5,091</td> </tr> <tr><td>TMAX</td> <td>4,900</td> </tr> <tr><td>TMIN</td> <td>4,846</td> </tr> </tbody> </table> </div> <div class="section" id="results"> <h1>Results</h1> <div class="section" id="temperature"> <h2>Temperature</h2> <p>We will consider temperature first. The following two plots show the distribution of daily minimum and maximum temperatures. In both plots, the bars represent the number of observations at that temperature, the vertical red line through the middle of the plot shows the average temperature, and the light orange and blue sections show the ranges of temperatures enclosing 80% and 98% of the data.</p> <div class="figure"> <img alt="//media.swingleydev.com/img/blog/2017/12/min_temp_dist.svgz" class="img-responsive" src="//media.swingleydev.com/img/blog/2017/12/min_temp_dist.svgz" /> </div> <div class="figure"> <img alt="//media.swingleydev.com/img/blog/2017/12/max_temp_dist.svgz" class="img-responsive" src="//media.swingleydev.com/img/blog/2017/12/max_temp_dist.svgz" /> </div> <p>The minimum daily temperature figure shows that the average minimum temperature is below freezing (28.9&nbsp;°F), and eighty percent of all days in the third week of February were between 15 and 43&nbsp;°F (the light orange region). The minimum temperature was colder than 15&nbsp;°F or warmer than 54&nbsp;°F 2% of the time (the light blue region). 
Maximum daily temperature was an average of 51&nbsp;°F, and was rarely below freezing or above 72&nbsp;°F.</p> <p>Another way to look at this sort of data is to count particular occurrences and divide by the total, “binning” the data into groups. 
Here we look at the number of days that were below freezing, colder than 20&nbsp;°F or colder than 10&nbsp;°F.</p> <table border="1" class="docutils"> <colgroup> <col width="34%" /> <col width="32%" /> <col width="34%" /> </colgroup> <thead valign="bottom"> <tr><th class="head">temperature</th> <th class="head">observed&nbsp;days</th> <th class="head">percent&nbsp;chance</th> </tr> </thead> <tbody valign="top"> <tr><td>below freezing</td> <td>3,006</td> <td>62.0</td> </tr> <tr><td>colder than 20</td> <td>1,079</td> <td>22.3</td> </tr> <tr><td>colder than 10</td> <td>203</td> <td>4.2</td> </tr> <tr><td>TOTAL</td> <td>4,846</td> <td>100.0</td> </tr> </tbody> </table> <p>What about the daily maximum temperature?</p> <table border="1" class="docutils"> <colgroup> <col width="36%" /> <col width="34%" /> <col width="30%" /> </colgroup> <thead valign="bottom"> <tr><th class="head">temperature</th> <th class="head">observed days</th> <th class="head">percent chance</th> </tr> </thead> <tbody valign="top"> <tr><td>colder than 20</td> <td>22</td> <td>0.4</td> </tr> <tr><td>below freezing</td> <td>371</td> <td>7.6</td> </tr> <tr><td>below 40</td> <td>1,151</td> <td>23.5</td> </tr> <tr><td>above 50</td> <td>2,569</td> <td>52.4</td> </tr> <tr><td>above 60</td> <td>1,157</td> <td>23.6</td> </tr> <tr><td>above 70</td> <td>80</td> <td>1.6</td> </tr> <tr><td>TOTAL</td> <td>4,900</td> <td>100.0</td> </tr> </tbody> </table> <p>The chances of it being below freezing during the day are pretty slim, and more than half the time it’s warmer than 50&nbsp;°F, so even if it’s cold at night, I should be able to get plenty warm hiking during the day.</p> </div> <div class="section" id="precipitation"> <h2>Precipitation</h2> <p>How often it rains, and how much falls when it does, are also important for planning a successful backpacking trip. Most of my backpacking has been done in the summer in California, where rainfall is rare and even when it does rain, it’s typically over quickly. 
Daily weather data can’t tell us about the hourly pattern of rainfall, but we can find out how often and how much it has rained in the past.</p> <table border="1" class="docutils"> <colgroup> <col width="35%" /> <col width="32%" /> <col width="33%" /> </colgroup> <thead valign="bottom"> <tr><th class="head">rainfall&nbsp;amount</th> <th class="head">observed&nbsp;days</th> <th class="head">percent&nbsp;chance</th> </tr> </thead> <tbody valign="top"> <tr><td>raining</td> <td>2,375</td> <td>40.0</td> </tr> <tr><td>tenth</td> <td>1,610</td> <td>27.1</td> </tr> <tr><td>quarter</td> <td>1,136</td> <td>19.1</td> </tr> <tr><td>half</td> <td>668</td> <td>11.2</td> </tr> <tr><td>inch</td> <td>308</td> <td>5.2</td> </tr> <tr><td>TOTAL</td> <td>5,942</td> <td>100.0</td> </tr> </tbody> </table> <p>This data shows that the chance of rain on any given day between February&nbsp;14th and the 21st is 40%, and the chance of getting at least a tenth of an inch is about 27%. That’s certainly higher than in the Sierra Nevada in July, although by August, afternoon thunderstorms are more common in the mountains.</p> <p><em>When there is precipitation</em>, the distribution of precipitation totals looks like this:</p> <table border="1" class="docutils"> <colgroup> <col width="60%" /> <col width="40%" /> </colgroup> <thead valign="bottom"> <tr><th class="head">cumulative&nbsp;frequency</th> <th class="head">precipitation</th> </tr> </thead> <tbody valign="top"> <tr><td>1%</td> <td>0.01</td> </tr> <tr><td>5%</td> <td>0.02</td> </tr> <tr><td>10%</td> <td>0.02</td> </tr> <tr><td>25%</td> <td>0.07</td> </tr> <tr><td>50%</td> <td>0.22</td> </tr> <tr><td>75%</td> <td>0.59</td> </tr> <tr><td>90%</td> <td>1.18</td> </tr> <tr><td>95%</td> <td>1.71</td> </tr> <tr><td>99%</td> <td>2.56</td> </tr> </tbody> </table> <p>These numbers are cumulative, which means that on 1&nbsp;percent of the days with precipitation, there was a hundredth of an inch of liquid precipitation <em>or less</em>. 
Ten percent of the days had 0.02&nbsp;inches or less. And 50&nbsp;percent of rainy days had 0.22&nbsp;inches of liquid precipitation or less. Reading the numbers from the top of the distribution, there was more than an inch of rain 10&nbsp;percent of the days on which it rained, which is a little disturbing.</p> <p>One final question about precipitation is how long it rains once it starts raining. Do we get little showers here and there, or are there large storms that dump rain for days without a break? To answer this question, I counted the number of days between zero-rainfall days, which is equal to the number of consecutive days where it rained.</p> <table border="1" class="docutils"> <colgroup> <col width="52%" /> <col width="48%" /> </colgroup> <thead valign="bottom"> <tr><th class="head">consecutive&nbsp;days</th> <th class="head">percent&nbsp;chance</th> </tr> </thead> <tbody valign="top"> <tr><td>1</td> <td>53.0</td> </tr> <tr><td>2</td> <td>24.4</td> </tr> <tr><td>3</td> <td>11.9</td> </tr> <tr><td>4</td> <td>7.5</td> </tr> <tr><td>5</td> <td>2.2</td> </tr> <tr><td>6</td> <td>0.9</td> </tr> <tr><td>7</td> <td>0.1</td> </tr> </tbody> </table> <p>The results show that more than half the time, a single day of rain is followed by at least one day without. 
And the chance of having it rain every day of a three-day trip to this area in mid-February is 11.9%.</p> </div> <div class="section" id="snowfall"> <h2>Snowfall</h2> <p>Repeating the precipitation analysis with snowfall:</p> <table border="1" class="docutils"> <colgroup> <col width="35%" /> <col width="32%" /> <col width="33%" /> </colgroup> <thead valign="bottom"> <tr><th class="head">snowfall&nbsp;amount</th> <th class="head">observed&nbsp;days</th> <th class="head">percent&nbsp;chance</th> </tr> </thead> <tbody valign="top"> <tr><td>snowing</td> <td>322</td> <td>6.3</td> </tr> <tr><td>inch</td> <td>148</td> <td>2.9</td> </tr> <tr><td>two</td> <td>115</td> <td>2.3</td> </tr> <tr><td>TOTAL</td> <td>5,091</td> <td>100.0</td> </tr> </tbody> </table> <p>Snowfall isn’t common on these dates, but it did happen, so I will need to be prepared for it. Also, the <tt class="docutils literal">PRCP</tt> variable includes melted snow, so a small portion of the precipitation from the previous section overlaps with the snowfall shown here.</p> </div> </div> <div class="section" id="conclusion"> <h1>Conclusion</h1> <p>Based on this analysis, a 3‒5&nbsp;day backpacking trip to the Big South Fork National River and Recreation Area seems well within my abilities and my gear. It will almost certainly be below freezing at night, but isn’t likely to be much below 20&nbsp;°F, snowfall is uncommon, and even though I will probably experience some rain, it shouldn’t be too much or carry on for the entire trip.</p> </div> <div class="section" id="appendix"> <h1>Appendix</h1> <p>The R code for this analysis appears below. I’ve loaded the GHCND data into a PostgreSQL database with observation data partitioned by year. 
The database tables are structured basically as they come from the National Centers for Environmental Information.</p> <div class="highlight"><pre><span></span><span class="kn">library</span><span class="p">(</span>tidyverse<span class="p">)</span> <span class="kn">library</span><span class="p">(</span>dbplyr<span class="p">)</span> <span class="kn">library</span><span class="p">(</span>glue<span class="p">)</span> <span class="kn">library</span><span class="p">(</span>maps<span class="p">)</span> <span class="kn">library</span><span class="p">(</span>sp<span class="p">)</span> <span class="kn">library</span><span class="p">(</span>rgdal<span class="p">)</span> <span class="kn">library</span><span class="p">(</span>scales<span class="p">)</span> <span class="kn">library</span><span class="p">(</span>knitr<span class="p">)</span> noaa <span class="o">&lt;-</span> src_postgres<span class="p">(</span>dbname <span class="o">=</span> <span class="s">&quot;noaa&quot;</span><span class="p">)</span> biso_stations <span class="o">&lt;-</span> noaa <span class="o">%&gt;%</span> tbl<span class="p">(</span>build_sql<span class="p">(</span> <span class="s">&quot;WITH inv AS (</span> <span class="s"> SELECT station_id, max(start_year) AS start_year,</span> <span class="s"> min(end_year) AS end_year,</span> <span class="s"> array_agg(variable::text) AS variables</span> <span class="s"> FROM ghcnd_inventory</span> <span class="s"> WHERE variable IN (&#39;TMIN&#39;, &#39;TMAX&#39;, &#39;PRCP&#39;, &#39;SNOW&#39;)</span> <span class="s"> GROUP BY station_id)</span> <span class="s"> SELECT station_id, station_name, start_year, end_year,</span> <span class="s"> latitude, longitude,</span> <span class="s"> ST_Distance(ST_Transform(a.the_geom, 32617),</span> <span class="s"> ST_Transform(b.the_geom, 32617))/1609 AS miles</span> <span class="s"> FROM ghcnd_stations AS a</span> <span class="s"> INNER JOIN inv USING(station_id),</span> <span class="s"> (SELECT ST_SetSRID(</span> <span 
class="s"> ST_MakePoint(-84.701553,</span> <span class="s"> 36.506800), 4326) AS the_geom) AS b</span> <span class="s"> WHERE inv.variables @&gt; ARRAY[&#39;TMIN&#39;, &#39;TMAX&#39;, &#39;PRCP&#39;, &#39;SNOW&#39;]</span> <span class="s"> AND end_year = 2017</span> <span class="s"> AND ST_Distance(ST_Transform(a.the_geom, 32617),</span> <span class="s"> ST_Transform(b.the_geom, 32617))/1609 &lt; 65</span> <span class="s"> ORDER BY miles&quot;</span><span class="p">))</span> start_doy <span class="o">&lt;-</span> <span class="m">32</span> <span class="c1"># Feb 1</span> end_doy <span class="o">&lt;-</span> <span class="m">59</span> <span class="c1"># Feb 28</span> ghcnd_variables <span class="o">&lt;-</span> noaa <span class="o">%&gt;%</span> tbl<span class="p">(</span><span class="s">&quot;ghcnd_variables&quot;</span><span class="p">)</span> <span class="c1"># ghcnd_obs partitioned by year, so query by year</span> obs_by_year <span class="o">&lt;-</span> <span class="kr">function</span><span class="p">(</span>conn<span class="p">,</span> year<span class="p">,</span> start_doy<span class="p">,</span> end_doy<span class="p">)</span> <span class="p">{</span> <span class="kp">print</span><span class="p">(</span>year<span class="p">)</span> filter_start_dte <span class="o">&lt;-</span> glue<span class="p">(</span><span class="s">&quot;{year}-01-01&quot;</span><span class="p">)</span> filter_end_dte <span class="o">&lt;-</span> glue<span class="p">(</span><span class="s">&quot;{year}-12-31&quot;</span><span class="p">)</span> conn <span class="o">%&gt;%</span> tbl<span class="p">(</span><span class="s">&quot;ghcnd_obs&quot;</span><span class="p">)</span> <span class="o">%&gt;%</span> inner_join<span class="p">(</span>biso_stations<span class="p">)</span> <span class="o">%&gt;%</span> inner_join<span class="p">(</span>ghcnd_variables<span class="p">)</span> <span class="o">%&gt;%</span> mutate<span class="p">(</span>doy <span class="o">=</span> date_part<span 
class="p">(</span><span class="s">&#39;doy&#39;</span><span class="p">,</span> dte<span class="p">),</span> value <span class="o">=</span> raw_value <span class="o">*</span> raw_multiplier<span class="p">)</span> <span class="o">%&gt;%</span> filter<span class="p">(</span>dte <span class="o">&gt;=</span> filter_start_dte<span class="p">,</span> dte <span class="o">&lt;=</span> filter_end_dte<span class="p">,</span> doy <span class="o">&gt;=</span> start_doy<span class="p">,</span> doy <span class="o">&lt;=</span> end_doy<span class="p">,</span> <span class="kp">is.na</span><span class="p">(</span>qual_flag<span class="p">),</span> variable <span class="o">%in%</span> <span class="kt">c</span><span class="p">(</span><span class="s">&#39;TMIN&#39;</span><span class="p">,</span> <span class="s">&#39;TMAX&#39;</span><span class="p">,</span> <span class="s">&#39;PRCP&#39;</span><span class="p">,</span> <span class="s">&#39;SNOW&#39;</span><span class="p">))</span> <span class="o">%&gt;%</span> select<span class="p">(</span><span class="o">-</span><span class="kt">c</span><span class="p">(</span>raw_value<span class="p">,</span> time_of_obs<span class="p">,</span> qual_flag<span class="p">,</span> description<span class="p">,</span> raw_multiplier<span class="p">))</span> <span class="o">%&gt;%</span> collect<span class="p">()</span> <span class="p">}</span> feb_obs <span class="o">&lt;-</span> map_df<span class="p">(</span><span class="m">1968</span><span class="o">:</span><span class="m">2017</span><span class="p">,</span> <span class="kr">function</span><span class="p">(</span>x<span class="p">)</span> obs_by_year<span class="p">(</span>noaa<span class="p">,</span> x<span class="p">,</span> start_doy<span class="p">,</span> end_doy<span class="p">))</span> <span class="c1"># MAP</span> restrict_miles <span class="o">&lt;-</span> <span class="m">50</span> biso_filtered <span class="o">&lt;-</span> biso_stations <span class="o">%&gt;%</span> collect<span class="p">()</span> <span class="o">%&gt;%</span> filter<span class="p">(</span>miles 
<span class="o">&lt;</span> restrict_miles<span class="p">)</span> nps_boundary <span class="o">&lt;-</span> readOGR<span class="p">(</span><span class="s">&quot;nps_boundary.shp&quot;</span><span class="p">,</span> verbose <span class="o">=</span> <span class="kc">FALSE</span><span class="p">)</span> biso_boundary <span class="o">&lt;-</span> <span class="kp">subset</span><span class="p">(</span>nps_boundary<span class="p">,</span> UNIT_CODE <span class="o">==</span> <span class="s">&#39;BISO&#39;</span><span class="p">)</span> biso_df <span class="o">&lt;-</span> fortify<span class="p">(</span>biso_boundary<span class="p">)</span> <span class="o">%&gt;%</span> tbl_df<span class="p">()</span> q <span class="o">&lt;-</span> ggplot<span class="p">(</span>data <span class="o">=</span> biso_filtered<span class="p">,</span> aes<span class="p">(</span>x <span class="o">=</span> longitude<span class="p">,</span> y <span class="o">=</span> latitude<span class="p">))</span> <span class="o">+</span> theme_bw<span class="p">()</span> <span class="o">+</span> theme<span class="p">(</span>axis.text <span class="o">=</span> element_blank<span class="p">(),</span> axis.ticks <span class="o">=</span> element_blank<span class="p">(),</span> panel.grid <span class="o">=</span> element_blank<span class="p">())</span> <span class="o">+</span> geom_hline<span class="p">(</span>yintercept <span class="o">=</span> <span class="m">36.6</span><span class="p">,</span> colour <span class="o">=</span> <span class="s">&quot;darkcyan&quot;</span><span class="p">,</span> size <span class="o">=</span> <span class="m">0.5</span><span class="p">)</span> <span class="o">+</span> geom_point<span class="p">(</span>colour <span class="o">=</span> <span class="s">&quot;darkred&quot;</span><span class="p">)</span> <span class="o">+</span> geom_text<span class="p">(</span>aes<span class="p">(</span>label <span class="o">=</span> str_to_title<span class="p">(</span>station_name<span class="p">)),</span> 
size <span class="o">=</span> <span class="m">3</span><span class="p">,</span> hjust <span class="o">=</span> <span class="m">0.5</span><span class="p">,</span> vjust <span class="o">=</span> <span class="m">0</span><span class="p">,</span> nudge_y <span class="o">=</span> <span class="m">0.01</span><span class="p">)</span> <span class="o">+</span> geom_polygon<span class="p">(</span>data <span class="o">=</span> biso_df<span class="p">,</span> aes<span class="p">(</span>x <span class="o">=</span> long<span class="p">,</span> y <span class="o">=</span> lat<span class="p">),</span> fill <span class="o">=</span> <span class="s">&quot;darkgreen&quot;</span><span class="p">)</span> <span class="o">+</span> scale_x_continuous<span class="p">(</span>name <span class="o">=</span> <span class="s">&quot;&quot;</span><span class="p">,</span> limits <span class="o">=</span> <span class="kt">c</span><span class="p">(</span><span class="kp">min</span><span class="p">(</span>biso_filtered<span class="o">$</span>longitude<span class="p">)</span> <span class="o">-</span> <span class="m">0.02</span><span class="p">,</span> <span class="kp">max</span><span class="p">(</span>biso_filtered<span class="o">$</span>longitude<span class="p">)</span> <span class="o">+</span> <span class="m">0.02</span><span class="p">))</span> <span class="o">+</span> scale_y_continuous<span class="p">(</span>name <span class="o">=</span> <span class="s">&quot;&quot;</span><span class="p">,</span> limits <span class="o">=</span> <span class="kt">c</span><span class="p">(</span><span class="kp">min</span><span class="p">(</span>biso_filtered<span class="o">$</span>latitude<span class="p">)</span> <span class="o">-</span> <span class="m">0.02</span><span class="p">,</span> <span class="kp">max</span><span class="p">(</span>biso_filtered<span class="o">$</span>latitude<span class="p">)</span> <span class="o">+</span> <span class="m">0.02</span><span class="p">))</span> <span class="o">+</span> 
coord_quickmap<span class="p">()</span> <span class="kp">print</span><span class="p">(</span><span class="kp">q</span><span class="p">)</span> <span class="c1"># OBS</span> feb_obs_filtered <span class="o">&lt;-</span> feb_obs <span class="o">%&gt;%</span> filter<span class="p">(</span>miles <span class="o">&lt;</span> restrict_miles<span class="p">,</span> doy <span class="o">&gt;=</span> <span class="m">45</span><span class="p">,</span> doy <span class="o">&lt;=</span> <span class="m">52</span><span class="p">)</span> <span class="c1"># feb 14-21</span> <span class="c1"># TEMP PLOTS</span> tmin_rects <span class="o">&lt;-</span> tibble<span class="p">(</span>pwidth <span class="o">=</span> <span class="kt">c</span><span class="p">(</span><span class="s">&quot;80&quot;</span><span class="p">,</span> <span class="s">&quot;98&quot;</span><span class="p">),</span> xmin <span class="o">=</span> quantile<span class="p">((</span>feb_obs_filtered <span class="o">%&gt;%</span> filter<span class="p">(</span>variable <span class="o">==</span> <span class="s">&#39;TMIN&#39;</span><span class="p">))</span><span class="o">$</span>value<span class="o">*</span><span class="m">9</span><span class="o">/</span><span class="m">5+32</span><span class="p">,</span> <span class="kt">c</span><span class="p">(</span><span class="m">0.10</span><span class="p">,</span> <span class="m">0.01</span><span class="p">)),</span> xmax <span class="o">=</span> quantile<span class="p">((</span>feb_obs_filtered <span class="o">%&gt;%</span> filter<span class="p">(</span>variable <span class="o">==</span> <span class="s">&#39;TMIN&#39;</span><span class="p">))</span><span class="o">$</span>value<span class="o">*</span><span class="m">9</span><span class="o">/</span><span class="m">5+32</span><span class="p">,</span> <span class="kt">c</span><span class="p">(</span><span class="m">0.90</span><span class="p">,</span> <span class="m">0.99</span><span class="p">)),</span> ymin <span class="o">=</span> 
<span class="o">-</span><span class="kc">Inf</span><span class="p">,</span> ymax <span class="o">=</span> <span class="kc">Inf</span><span class="p">)</span> q <span class="o">&lt;-</span> ggplot<span class="p">(</span>data <span class="o">=</span> feb_obs_filtered <span class="o">%&gt;%</span> filter<span class="p">(</span>variable <span class="o">==</span> <span class="s">&#39;TMIN&#39;</span><span class="p">),</span> aes<span class="p">(</span>x <span class="o">=</span> value<span class="o">*</span><span class="m">9</span><span class="o">/</span><span class="m">5+32</span><span class="p">))</span> <span class="o">+</span> theme_bw<span class="p">()</span> <span class="o">+</span> geom_rect<span class="p">(</span>data <span class="o">=</span> tmin_rects <span class="o">%&gt;%</span> filter<span class="p">(</span>pwidth <span class="o">==</span> <span class="s">&quot;98&quot;</span><span class="p">),</span> inherit.aes <span class="o">=</span> <span class="kc">FALSE</span><span class="p">,</span> aes<span class="p">(</span>xmin <span class="o">=</span> xmin<span class="p">,</span> xmax <span class="o">=</span> xmax<span class="p">,</span> ymin <span class="o">=</span> ymin<span class="p">,</span> ymax <span class="o">=</span> ymax<span class="p">),</span> fill <span class="o">=</span> <span class="s">&quot;darkcyan&quot;</span><span class="p">,</span> alpha <span class="o">=</span> <span class="m">0.2</span><span class="p">)</span> <span class="o">+</span> geom_rect<span class="p">(</span>data <span class="o">=</span> tmin_rects <span class="o">%&gt;%</span> filter<span class="p">(</span>pwidth <span class="o">==</span> <span class="s">&quot;80&quot;</span><span class="p">),</span> inherit.aes <span class="o">=</span> <span class="kc">FALSE</span><span class="p">,</span> aes<span class="p">(</span>xmin <span class="o">=</span> xmin<span class="p">,</span> xmax <span class="o">=</span> xmax<span class="p">,</span> ymin <span class="o">=</span> ymin<span 
class="p">,</span> ymax <span class="o">=</span> ymax<span class="p">),</span> fill <span class="o">=</span> <span class="s">&quot;darkorange&quot;</span><span class="p">,</span> alpha <span class="o">=</span> <span class="m">0.2</span><span class="p">)</span> <span class="o">+</span> geom_vline<span class="p">(</span>xintercept <span class="o">=</span> <span class="kp">mean</span><span class="p">((</span>feb_obs_filtered <span class="o">%&gt;%</span> filter<span class="p">(</span>variable <span class="o">==</span> <span class="s">&#39;TMIN&#39;</span><span class="p">))</span><span class="o">$</span>value<span class="o">*</span><span class="m">9</span><span class="o">/</span><span class="m">5+32</span><span class="p">),</span> colour <span class="o">=</span> <span class="s">&quot;red&quot;</span><span class="p">,</span> size <span class="o">=</span> <span class="m">0.5</span><span class="p">)</span> <span class="o">+</span> geom_histogram<span class="p">(</span>binwidth <span class="o">=</span> <span class="m">1</span><span class="p">)</span> <span class="o">+</span> scale_x_continuous<span class="p">(</span>name <span class="o">=</span> <span class="s">&quot;Minimum temperature (°F)&quot;</span><span class="p">,</span> breaks <span class="o">=</span> pretty_breaks<span class="p">(</span>n <span class="o">=</span> <span class="m">10</span><span class="p">))</span> <span class="o">+</span> scale_y_continuous<span class="p">(</span>name <span class="o">=</span> <span class="s">&quot;Days&quot;</span><span class="p">,</span> breaks <span class="o">=</span> pretty_breaks<span class="p">(</span>n <span class="o">=</span> <span class="m">6</span><span class="p">))</span> <span class="o">+</span> ggtitle<span class="p">(</span><span class="s">&quot;Minimum daily temperature distribution, February 14‒21&quot;</span><span class="p">)</span> <span class="kp">print</span><span class="p">(</span><span class="kp">q</span><span class="p">)</span> max_temp_distribution <span 
class="o">&lt;-</span> quantile<span class="p">((</span>feb_obs_filtered <span class="o">%&gt;%</span> filter<span class="p">(</span>variable <span class="o">==</span> <span class="s">&#39;TMAX&#39;</span><span class="p">))</span><span class="o">$</span>value<span class="o">*</span><span class="m">9</span><span class="o">/</span><span class="m">5</span> <span class="o">+</span> <span class="m">32</span><span class="p">,</span> <span class="kt">c</span><span class="p">(</span><span class="m">0.01</span><span class="p">,</span> <span class="m">0.05</span><span class="p">,</span> <span class="m">0.10</span><span class="p">,</span> <span class="m">0.25</span><span class="p">,</span> <span class="m">0.5</span><span class="p">,</span> <span class="m">0.75</span><span class="p">,</span> <span class="m">0.90</span><span class="p">,</span> <span class="m">0.95</span><span class="p">,</span> <span class="m">0.99</span><span class="p">))</span> tmax_rects <span class="o">&lt;-</span> tibble<span class="p">(</span>pwidth <span class="o">=</span> <span class="kt">c</span><span class="p">(</span><span class="s">&quot;80&quot;</span><span class="p">,</span> <span class="s">&quot;98&quot;</span><span class="p">),</span> xmin <span class="o">=</span> quantile<span class="p">((</span>feb_obs_filtered <span class="o">%&gt;%</span> filter<span class="p">(</span>variable <span class="o">==</span> <span class="s">&#39;TMAX&#39;</span><span class="p">))</span><span class="o">$</span>value<span class="o">*</span><span class="m">9</span><span class="o">/</span><span class="m">5+32</span><span class="p">,</span> <span class="kt">c</span><span class="p">(</span><span class="m">0.10</span><span class="p">,</span> <span class="m">0.01</span><span class="p">)),</span> xmax <span class="o">=</span> quantile<span class="p">((</span>feb_obs_filtered <span class="o">%&gt;%</span> filter<span class="p">(</span>variable <span class="o">==</span> <span class="s">&#39;TMAX&#39;</span><span 
class="p">))</span><span class="o">$</span>value<span class="o">*</span><span class="m">9</span><span class="o">/</span><span class="m">5+32</span><span class="p">,</span> <span class="kt">c</span><span class="p">(</span><span class="m">0.90</span><span class="p">,</span> <span class="m">0.99</span><span class="p">)),</span> ymin <span class="o">=</span> <span class="o">-</span><span class="kc">Inf</span><span class="p">,</span> ymax <span class="o">=</span> <span class="kc">Inf</span><span class="p">)</span> q <span class="o">&lt;-</span> ggplot<span class="p">(</span>data <span class="o">=</span> feb_obs_filtered <span class="o">%&gt;%</span> filter<span class="p">(</span>variable <span class="o">==</span> <span class="s">&#39;TMAX&#39;</span><span class="p">),</span> aes<span class="p">(</span>x <span class="o">=</span> value<span class="o">*</span><span class="m">9</span><span class="o">/</span><span class="m">5+32</span><span class="p">))</span> <span class="o">+</span> theme_bw<span class="p">()</span> <span class="o">+</span> geom_rect<span class="p">(</span>data <span class="o">=</span> tmax_rects <span class="o">%&gt;%</span> filter<span class="p">(</span>pwidth <span class="o">==</span> <span class="s">&quot;98&quot;</span><span class="p">),</span> inherit.aes <span class="o">=</span> <span class="kc">FALSE</span><span class="p">,</span> aes<span class="p">(</span>xmin <span class="o">=</span> xmin<span class="p">,</span> xmax <span class="o">=</span> xmax<span class="p">,</span> ymin <span class="o">=</span> ymin<span class="p">,</span> ymax <span class="o">=</span> ymax<span class="p">),</span> fill <span class="o">=</span> <span class="s">&quot;darkcyan&quot;</span><span class="p">,</span> alpha <span class="o">=</span> <span class="m">0.2</span><span class="p">)</span> <span class="o">+</span> geom_rect<span class="p">(</span>data <span class="o">=</span> tmax_rects <span class="o">%&gt;%</span> filter<span class="p">(</span>pwidth <span 
class="o">==</span> <span class="s">&quot;80&quot;</span><span class="p">),</span> inherit.aes <span class="o">=</span> <span class="kc">FALSE</span><span class="p">,</span> aes<span class="p">(</span>xmin <span class="o">=</span> xmin<span class="p">,</span> xmax <span class="o">=</span> xmax<span class="p">,</span> ymin <span class="o">=</span> ymin<span class="p">,</span> ymax <span class="o">=</span> ymax<span class="p">),</span> fill <span class="o">=</span> <span class="s">&quot;darkorange&quot;</span><span class="p">,</span> alpha <span class="o">=</span> <span class="m">0.2</span><span class="p">)</span> <span class="o">+</span> geom_vline<span class="p">(</span>xintercept <span class="o">=</span> <span class="kp">mean</span><span class="p">((</span>feb_obs_filtered <span class="o">%&gt;%</span> filter<span class="p">(</span>variable <span class="o">==</span> <span class="s">&#39;TMAX&#39;</span><span class="p">))</span><span class="o">$</span>value<span class="o">*</span><span class="m">9</span><span class="o">/</span><span class="m">5+32</span><span class="p">),</span> colour <span class="o">=</span> <span class="s">&quot;red&quot;</span><span class="p">,</span> size <span class="o">=</span> <span class="m">0.5</span><span class="p">)</span> <span class="o">+</span> geom_histogram<span class="p">(</span>binwidth <span class="o">=</span> <span class="m">1</span><span class="p">)</span> <span class="o">+</span> scale_x_continuous<span class="p">(</span>name <span class="o">=</span> <span class="s">&quot;Maximum temperature (°F)&quot;</span><span class="p">,</span> breaks <span class="o">=</span> pretty_breaks<span class="p">(</span>n <span class="o">=</span> <span class="m">10</span><span class="p">))</span> <span class="o">+</span> scale_y_continuous<span class="p">(</span>name <span class="o">=</span> <span class="s">&quot;Days&quot;</span><span class="p">,</span> breaks <span class="o">=</span> pretty_breaks<span class="p">(</span>n <span 
class="o">=</span> <span class="m">8</span><span class="p">))</span> <span class="o">+</span> ggtitle<span class="p">(</span><span class="s">&quot;Maximum daily temperature distribution, February 14‒21&quot;</span><span class="p">)</span> <span class="kp">print</span><span class="p">(</span><span class="kp">q</span><span class="p">)</span> <span class="c1"># TEMP BINS</span> below_freezing_percent <span class="o">&lt;-</span> feb_obs_filtered <span class="o">%&gt;%</span> filter<span class="p">(</span>variable <span class="o">==</span> <span class="s">&#39;TMIN&#39;</span><span class="p">)</span> <span class="o">%&gt;%</span> mutate<span class="p">(</span><span class="sb">`below freezing`</span> <span class="o">=</span> <span class="kp">ifelse</span><span class="p">(</span>value <span class="o">&lt;</span> <span class="m">0</span><span class="p">,</span> <span class="m">1</span><span class="p">,</span> <span class="m">0</span><span class="p">),</span> <span class="sb">`colder than 20`</span> <span class="o">=</span> <span class="kp">ifelse</span><span class="p">(</span>value<span class="o">*</span><span class="m">9</span><span class="o">/</span><span class="m">5</span> <span class="o">+</span> <span class="m">32</span> <span class="o">&lt;</span> <span class="m">20</span><span class="p">,</span> <span class="m">1</span><span class="p">,</span> <span class="m">0</span><span class="p">),</span> <span class="sb">`colder than 10`</span> <span class="o">=</span> <span class="kp">ifelse</span><span class="p">(</span>value<span class="o">*</span><span class="m">9</span><span class="o">/</span><span class="m">5</span> <span class="o">+</span> <span class="m">32</span> <span class="o">&lt;</span> <span class="m">10</span><span class="p">,</span> <span class="m">1</span><span class="p">,</span> <span class="m">0</span><span class="p">))</span> <span class="o">%&gt;%</span> summarize<span class="p">(</span><span class="sb">`below freezing`</span> <span class="o">=</span> 
<span class="kp">sum</span><span class="p">(</span><span class="sb">`below freezing`</span><span class="p">),</span> <span class="sb">`colder than 20`</span> <span class="o">=</span> <span class="kp">sum</span><span class="p">(</span><span class="sb">`colder than 20`</span><span class="p">),</span> <span class="sb">`colder than 10`</span> <span class="o">=</span> <span class="kp">sum</span><span class="p">(</span><span class="sb">`colder than 10`</span><span class="p">),</span> TOTAL <span class="o">=</span> n<span class="p">(),</span> total <span class="o">=</span> n<span class="p">())</span> <span class="o">%&gt;%</span> gather<span class="p">(</span>temperature<span class="p">,</span> <span class="sb">`observed days`</span><span class="p">,</span> <span class="o">-</span>total<span class="p">)</span> <span class="o">%&gt;%</span> mutate<span class="p">(</span><span class="sb">`percent chance`</span> <span class="o">=</span> <span class="sb">`observed days`</span> <span class="o">/</span> total <span class="o">*</span> <span class="m">100</span><span class="p">)</span> <span class="o">%&gt;%</span> select<span class="p">(</span>temperature<span class="p">,</span> <span class="sb">`observed days`</span><span class="p">,</span> <span class="sb">`percent chance`</span><span class="p">)</span> kable<span class="p">(</span>below_freezing_percent<span class="p">,</span> digits <span class="o">=</span> <span class="m">1</span><span class="p">,</span> align <span class="o">=</span> <span class="s">&quot;lrr&quot;</span><span class="p">,</span> format.args <span class="o">=</span> <span class="kt">list</span><span class="p">(</span>big.mark <span class="o">=</span> <span class="s">&quot;,&quot;</span><span class="p">))</span> <span class="c1"># PRCP BINS</span> prcp_percent <span class="o">&lt;-</span> feb_obs_filtered <span class="o">%&gt;%</span> filter<span class="p">(</span>variable <span class="o">==</span> <span class="s">&#39;PRCP&#39;</span><span 
class="p">)</span> <span class="o">%&gt;%</span> mutate<span class="p">(</span>raining <span class="o">=</span> <span class="kp">ifelse</span><span class="p">(</span>value <span class="o">&gt;</span> <span class="m">0</span><span class="p">,</span> <span class="m">1</span><span class="p">,</span> <span class="m">0</span><span class="p">),</span> tenth <span class="o">=</span> <span class="kp">ifelse</span><span class="p">(</span>value <span class="o">&gt;</span> <span class="m">0.1</span> <span class="o">*</span> <span class="m">25.4</span><span class="p">,</span> <span class="m">1</span><span class="p">,</span> <span class="m">0</span><span class="p">),</span> quarter <span class="o">=</span> <span class="kp">ifelse</span><span class="p">(</span>value <span class="o">&gt;</span> <span class="m">0.25</span> <span class="o">*</span> <span class="m">25.4</span><span class="p">,</span> <span class="m">1</span><span class="p">,</span> <span class="m">0</span><span class="p">),</span> half <span class="o">=</span> <span class="kp">ifelse</span><span class="p">(</span>value <span class="o">&gt;</span> <span class="m">0.5</span> <span class="o">*</span> <span class="m">25.4</span><span class="p">,</span> <span class="m">1</span><span class="p">,</span> <span class="m">0</span><span class="p">),</span> inch <span class="o">=</span> <span class="kp">ifelse</span><span class="p">(</span>value <span class="o">&gt;</span> <span class="m">1</span> <span class="o">*</span> <span class="m">25.4</span><span class="p">,</span> <span class="m">1</span><span class="p">,</span> <span class="m">0</span><span class="p">))</span> <span class="o">%&gt;%</span> summarize<span class="p">(</span>raining <span class="o">=</span> <span class="kp">sum</span><span class="p">(</span>raining<span class="p">),</span> tenth <span class="o">=</span> <span class="kp">sum</span><span class="p">(</span>tenth<span class="p">),</span> quarter <span class="o">=</span> <span class="kp">sum</span><span 
class="p">(</span>quarter<span class="p">),</span> half <span class="o">=</span> <span class="kp">sum</span><span class="p">(</span>half<span class="p">),</span> inch <span class="o">=</span> <span class="kp">sum</span><span class="p">(</span>inch<span class="p">),</span> TOTAL <span class="o">=</span> n<span class="p">(),</span> total <span class="o">=</span> n<span class="p">())</span> <span class="o">%&gt;%</span> gather<span class="p">(</span><span class="sb">`rainfall amount`</span><span class="p">,</span> <span class="sb">`observed days`</span><span class="p">,</span> <span class="o">-</span>total<span class="p">)</span> <span class="o">%&gt;%</span> mutate<span class="p">(</span><span class="sb">`percent chance`</span> <span class="o">=</span> <span class="sb">`observed days`</span> <span class="o">/</span> total <span class="o">*</span> <span class="m">100</span><span class="p">)</span> <span class="o">%&gt;%</span> select<span class="p">(</span><span class="sb">`rainfall amount`</span><span class="p">,</span> <span class="sb">`observed days`</span><span class="p">,</span> <span class="sb">`percent chance`</span><span class="p">)</span> kable<span class="p">(</span>prcp_percent<span class="p">,</span> digits <span class="o">=</span> <span class="m">1</span><span class="p">,</span> align <span class="o">=</span> <span class="s">&quot;lrr&quot;</span><span class="p">,</span> format.args <span class="o">=</span> <span class="kt">list</span><span class="p">(</span>big.mark <span class="o">=</span> <span class="s">&quot;,&quot;</span><span class="p">))</span> <span class="c1"># PRCP DIST</span> prcp_cum_freq <span class="o">&lt;-</span> tibble<span class="p">(</span><span class="sb">`cumulative frequency`</span> <span class="o">=</span> <span class="kt">c</span><span class="p">(</span><span class="s">&quot;1%&quot;</span><span class="p">,</span> <span class="s">&quot;5%&quot;</span><span class="p">,</span> <span class="s">&quot;10%&quot;</span><span 
class="p">,</span> <span class="s">&quot;25%&quot;</span><span class="p">,</span> <span class="s">&quot;50%&quot;</span><span class="p">,</span> <span class="s">&quot;75%&quot;</span><span class="p">,</span> <span class="s">&quot;90%&quot;</span><span class="p">,</span> <span class="s">&quot;95%&quot;</span><span class="p">,</span> <span class="s">&quot;99%&quot;</span><span class="p">),</span> precipitation <span class="o">=</span> quantile<span class="p">((</span>feb_obs_filtered <span class="o">%&gt;%</span> filter<span class="p">(</span>variable <span class="o">==</span> <span class="s">&quot;PRCP&quot;</span><span class="p">,</span> value <span class="o">&gt;</span> <span class="m">0</span><span class="p">))</span><span class="o">$</span>value<span class="o">/</span><span class="m">25.4</span><span class="p">,</span> <span class="kt">c</span><span class="p">(</span><span class="m">0.01</span><span class="p">,</span> <span class="m">0.05</span><span class="p">,</span> <span class="m">0.10</span><span class="p">,</span> <span class="m">0.25</span><span class="p">,</span> <span class="m">0.5</span><span class="p">,</span> <span class="m">0.75</span><span class="p">,</span> <span class="m">0.90</span><span class="p">,</span> <span class="m">0.95</span><span class="p">,</span> <span class="m">0.99</span><span class="p">)))</span> kable<span class="p">(</span>prcp_cum_freq<span class="p">,</span> digits <span class="o">=</span> <span class="m">2</span><span class="p">,</span> align<span class="o">=</span><span class="s">&quot;lr&quot;</span><span class="p">)</span> <span class="c1"># PRCP PATTERN</span> no_prcp <span class="o">&lt;-</span> feb_obs <span class="o">%&gt;%</span> filter<span class="p">(</span>variable <span class="o">==</span> <span class="s">&#39;PRCP&#39;</span><span class="p">,</span> value <span class="o">==</span> <span class="m">0</span><span class="p">,</span> miles <span class="o">&lt;</span> restrict_miles<span class="p">,</span> doy <span
class="o">&gt;=</span> <span class="m">44</span><span class="p">,</span> doy <span class="o">&lt;=</span> <span class="m">53</span><span class="p">)</span> consecutive_rain <span class="o">&lt;-</span> no_prcp <span class="o">%&gt;%</span> group_by<span class="p">(</span>station_name<span class="p">)</span> <span class="o">%&gt;%</span> arrange<span class="p">(</span>station_name<span class="p">,</span> dte<span class="p">)</span> <span class="o">%&gt;%</span> mutate<span class="p">(</span>days <span class="o">=</span> <span class="kp">as.integer</span><span class="p">(</span>dte <span class="o">-</span> lag<span class="p">(</span>dte<span class="p">)</span> <span class="o">-</span> <span class="m">1</span><span class="p">))</span> <span class="o">%&gt;%</span> filter<span class="p">(</span><span class="o">!</span><span class="kp">is.na</span><span class="p">(</span>days<span class="p">),</span> days <span class="o">&gt;</span> <span class="m">0</span><span class="p">,</span> days <span class="o">&lt;</span> <span class="m">10</span><span class="p">)</span> consecutive_days_dist <span class="o">&lt;-</span> consecutive_rain <span class="o">%&gt;%</span> ungroup<span class="p">()</span> <span class="o">%&gt;%</span> mutate<span class="p">(</span>total <span class="o">=</span> n<span class="p">())</span> <span class="o">%&gt;%</span> arrange<span class="p">(</span>days<span class="p">)</span> <span class="o">%&gt;%</span> group_by<span class="p">(</span>days<span class="p">,</span> total<span class="p">)</span> <span class="o">%&gt;%</span> summarize<span class="p">(</span><span class="sb">`percent chance`</span> <span class="o">=</span> n<span class="p">()</span><span class="o">/</span><span class="kp">max</span><span class="p">(</span>total<span class="p">)</span><span class="o">*</span><span class="m">100</span><span class="p">)</span> <span class="o">%&gt;%</span> rename<span class="p">(</span><span class="sb">`consecutive days`</span> <span class="o">=</span> 
days<span class="p">)</span> <span class="o">%&gt;%</span> select<span class="p">(</span><span class="sb">`consecutive days`</span><span class="p">,</span> <span class="sb">`percent chance`</span><span class="p">)</span> kable<span class="p">(</span>consecutive_days_dist<span class="p">,</span> digits <span class="o">=</span> <span class="m">1</span><span class="p">,</span> align <span class="o">=</span> <span class="s">&quot;lr&quot;</span><span class="p">)</span> <span class="c1"># SNOW DIST</span> snow_percent <span class="o">&lt;-</span> feb_obs_filtered <span class="o">%&gt;%</span> filter<span class="p">(</span>variable <span class="o">==</span> <span class="s">&#39;SNOW&#39;</span><span class="p">)</span> <span class="o">%&gt;%</span> mutate<span class="p">(</span>snowing <span class="o">=</span> <span class="kp">ifelse</span><span class="p">(</span>value <span class="o">&gt;</span> <span class="m">0</span><span class="p">,</span> <span class="m">1</span><span class="p">,</span> <span class="m">0</span><span class="p">),</span> half <span class="o">=</span> <span class="kp">ifelse</span><span class="p">(</span>value <span class="o">&gt;</span> <span class="m">0.5</span> <span class="o">*</span> <span class="m">25.4</span><span class="p">,</span> <span class="m">1</span><span class="p">,</span> <span class="m">0</span><span class="p">),</span> inch <span class="o">=</span> <span class="kp">ifelse</span><span class="p">(</span>value <span class="o">&gt;</span> <span class="m">1</span> <span class="o">*</span> <span class="m">25.4</span><span class="p">,</span> <span class="m">1</span><span class="p">,</span> <span class="m">0</span><span class="p">),</span> two <span class="o">=</span> <span class="kp">ifelse</span><span class="p">(</span>value <span class="o">&gt;</span> <span class="m">2</span> <span class="o">*</span> <span class="m">25.4</span><span class="p">,</span> <span class="m">1</span><span class="p">,</span> <span class="m">0</span><span 
class="p">))</span> <span class="o">%&gt;%</span> summarize<span class="p">(</span>snowing <span class="o">=</span> <span class="kp">sum</span><span class="p">(</span>snowing<span class="p">),</span> inch <span class="o">=</span> <span class="kp">sum</span><span class="p">(</span>inch<span class="p">),</span> two <span class="o">=</span> <span class="kp">sum</span><span class="p">(</span>two<span class="p">),</span> TOTAL <span class="o">=</span> n<span class="p">(),</span> total <span class="o">=</span> n<span class="p">())</span> <span class="o">%&gt;%</span> gather<span class="p">(</span><span class="sb">`snowfall amount`</span><span class="p">,</span> <span class="sb">`observed days`</span><span class="p">,</span> <span class="o">-</span>total<span class="p">)</span> <span class="o">%&gt;%</span> mutate<span class="p">(</span><span class="sb">`percent chance`</span> <span class="o">=</span> <span class="sb">`observed days`</span> <span class="o">/</span> total <span class="o">*</span> <span class="m">100</span><span class="p">)</span> <span class="o">%&gt;%</span> select<span class="p">(</span><span class="sb">`snowfall amount`</span><span class="p">,</span> <span class="sb">`observed days`</span><span class="p">,</span> <span class="sb">`percent chance`</span><span class="p">)</span> kable<span class="p">(</span>snow_percent<span class="p">,</span> digits <span class="o">=</span> <span class="m">1</span><span class="p">,</span> align <span class="o">=</span> <span class="s">&quot;lrr&quot;</span><span class="p">,</span> format.args <span class="o">=</span> <span class="kt">list</span><span class="p">(</span>big.mark <span class="o">=</span> <span class="s">&quot;,&quot;</span><span class="p">))</span> </pre></div> </div> </div> Sun, 31 Dec 2017 11:10:31 -0900 http://swingleydev.com/blog/p/2005/ R weather BISO Tennessee Kentucky