Tableau Prep Tackles Nest Thermostat Data

Nest Learning Thermostat captures tons of data

Last year, I went home from the Teknion Christmas party with a shiny, new Nest Learning Thermostat. Sure, I was excited about the potential energy savings but being the data geek I am, I was more excited about the idea that I could capture tons of Nest thermostat data and then visualize it in Tableau! So for the past year, after figuring out how, I’ve been collecting lots and lots of data.

Collecting Nest Thermostat Data in Google Sheets

Nest makes some of the data available via weekly reports – but it’s nowhere near as detailed as I’d like. I mean, I want minute-by-minute snapshots of inside/oustide temperature, humidity, is my AC or Heater running, etc… Fortunately, Nest has an API that you can use to get at the detail data. It took some digging, but I found a GitHub project where Michael Pesce leveraged some work by Beezly — all focused on automating the pull of Nest data into a Google Sheet document.

As Michael notes, there is a limit to how many rows of data a single Google Sheet document will allow and storing a record every few minutes will quickly hit that limit. His solution was to continually update certain variables, add together others, average others and just keep one record per day. But I really wanted the minute-by-minute Nest thermostat data.

My solution was to modify the script (download my modified version here) to still keep one record per day, but to concatenate values together with time stamps all in one exceedingly long string (per measure) where just two entries might look like:

|34.66@Tue Feb 19 2019 22:42:10 GMT-0600 (CST)|34.66@Tue Feb 19 2019 23:02:10 GMT-0600 (CST)

I set the script to automatically run every minute (see instructions in the script comments). So, by the end of the day, I’ll have a bunch of really, really long strings. By the end of the year I have a bunch of records with a bunch of really, really long strings.

I know – it’s a horrible data structure. But I have a tool takes horrible data structures and turns them into gold.


Tableau Prep for the gold!

All of the Nest thermostat data I capture via the script is in a Google Sheet, which I have to download can be connected to directly with Tableau Prep 2019.4 (currently in beta)! The Tableau Flow itself looks a bit complex, but it’s mostly just the same steps duplicated for each measure:

Tableau Prep Flow that untangles Nest Learning Thermostat Data

Each piece, like Inside Temps is primarily splitting the really, really long string:

Some long string splitting in Tableau Prep

then pivoting so that all those measurements and times are together in a single column:

Pivot in Tableau Prep

And then doing some additional splitting out of the measure from the timestamp, parsing the date and time, and changing data types:

More string splitting in Tableau Prep

The same pattern is reused for each metric – good thing Tableau Prep has copy and paste of steps! When all measures are done, we can union them together:

Union together a bunch of Nest Learning Thermostat data in Tableau Prep

The union results in a single table with a column for each measure, but they are split out across different rows, so only one measure has a value on any given row and the rest are NULLs. We can collapse the rows with an Aggregate the groups on the timestamp and takes a MIN (or MAX) of each measure:

Aggregate to collapse rows in Tableau Prep

The script also captured the overall weather for each day (cloudy, sunny, rainy, thunderstorms, snow, etc…) and so I join that in at the day level. There’s no end to what I can do with the Nest thermostat data in Tableau now that I’ve captured it and cleaned it up with Tableau Prep:

Lots of Nest Learning Thermostat data parsed by Tableau Prep to visualize in Tableau Desktop

2 Responses so far.

  1. Tim M says:

    Just wondering why you used a Union rather than a Join? As I expect most of the Time stamps are the same and seems like fewer steps than Unioning then Aggregating?

    • Joshua Milligan says:

      That’s a great question Tim! Your approach would work very well in general, but here was my reasoning: some of the timestamps get lost. Sometimes the Nest thermostat data is retrieved while the weather data is not (and possibly vice versa). Also, I initially split the really long strings somewhat arbitrarily to avoid the limit of the Split feature in Prep (I think it will split up to 150 values, but I often have more values in the strings). That cuts some of the timestamps in half, so they get discarded. I’m not worried about losing them given the overall volume of data. But the join would get rather messy as it would have to be a full outer and I wasn’t confident that I wouldn’t end up with many-to-many matches. The union avoids these problems.

      Great question!

Leave a Reply

Your email address will not be published. Required fields are marked *