Measure All the Things

Those of you who know me well will probably know that I kind of have a thing for graphs. More generally, I find monitoring, trending, and logging to be invaluable. No sane sysadmin would ever set up a server without those things.

But why doesn’t my house support this? Over the winter, I noticed that my furnace was periodically throwing an error code. It worked fine, but would eventually display an error code, shut down, and start back up again. I have no idea what causes it or how often it happens. Since the error code’s description isn’t anything hazardous-sounding, and since the system operates fine on the whole, I haven’t yet considered paying someone to come out and look at it.

Meanwhile, my problems aren’t limited to heat. My air conditioner has been fraught with problems. When the technician comes, he’ll attach a set of gauges to the lines to measure pressure. A more thorough technician might also attach temperature clamps to see the exact temperature going in and out, which provides a sanity check and allows him to calculate superheat and subcooling. It’s also not uncommon to measure current of various components and see if they are within normal parameters. Of course, my system, being designed by buffoons, seems to make this stuff really hard — a decent number of measurements require disassembling the thing.

And as all of this goes on, I’ve been tempted to install a Brultech ECM1240 meter, which measures current on each circuit of your home’s electrical panel. I don’t need it, but I really have no concept of how much electricity I’m using (beyond the monthly bill), nor what uses the most.

But all of this leaves me frustrated. When my furnace hit an error, why couldn’t it send an SNMP trap, fire a syslog message, or send me an email? (Or, for that matter, send an alert to my HVAC company, which could then log in and look at electronic diagnostics, having a good idea what the problem was before anyone came out?)
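To show how little I'm asking for, here is a rough sketch of the syslog half of that wish, in Python. It builds a classic BSD-style (RFC 3164) syslog line and sends it over UDP. The hostname `furnace`, the tag `hvac`, the error code 33, the timestamp, and the server address are all made up for illustration; I have no idea what a real furnace controller would report.

```python
import socket
from datetime import datetime

LOCAL0 = 16  # syslog facility "local0"
ERR = 3      # syslog severity "error"


def format_syslog(facility, severity, hostname, tag, message, when=None):
    """Build a BSD-style syslog line: <PRI>timestamp host tag: message."""
    when = when or datetime.now()
    pri = facility * 8 + severity  # priority value defined by the syslog protocol
    # Timestamp is "Mmm dd hh:mm:ss" with the day padded by a space, not a zero.
    stamp = f"{when.strftime('%b')} {when.day:2d} {when.strftime('%H:%M:%S')}"
    return f"<{pri}>{stamp} {hostname} {tag}: {message}"


def send_syslog(line, server="127.0.0.1", port=514):
    """Fire-and-forget UDP send; the server address here is an assumption."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii", "replace"), (server, port))


# Hypothetical event: the furnace hits an error and restarts at 4:30am.
line = format_syslog(LOCAL0, ERR, "furnace", "hvac",
                     "error code 33, shutting down and restarting",
                     when=datetime(2013, 1, 15, 4, 30, 0))
print(line)
```

Calling `send_syslog(line)` would hand that line to whatever syslog server is listening. That is roughly the entire amount of software involved.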

When they come out to fix my AC, why do they have to bring their own gauges? Why can’t the tech just pull out an iPad and read the values over Bluetooth, getting not just pressure but a wealth of other information that the AC is already tracking for its internal operations? And, when he says, “It seems like you’ve got a leak,” why can’t we pull up a graph and see the pressure decreasing over the past week? And for that matter, why didn’t it just send me an email alerting me that the pressure had fallen below the expected range, before the thing got so low that it didn’t work?
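The "email me before it stops working" part is not exotic either. As a minimal sketch, assuming the unit logged one pressure reading per day, a plain least-squares line through the readings is enough to estimate when the pressure will cross a minimum. The readings, the units, and the 60 threshold below are invented; a real system would need the manufacturer's expected range and would have to account for temperature and load.

```python
def days_until_below(readings, minimum):
    """Estimate days from the last reading until pressure drops below minimum.

    readings: list of (day, pressure) pairs, in order.
    Returns None if there are too few points or the pressure is not falling.
    """
    n = len(readings)
    if n < 2:
        return None
    mean_x = sum(x for x, _ in readings) / n
    mean_y = sum(y for _, y in readings) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in readings)
    if sxx == 0:
        return None  # all readings taken on the same day
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in readings)
    slope = sxy / sxx  # pressure change per day
    if slope >= 0:
        return None  # steady or rising: nothing to warn about
    intercept = mean_y - slope * mean_x
    crossing_day = (minimum - intercept) / slope
    return max(0.0, crossing_day - readings[-1][0])


# Hypothetical readings: losing 2 units of pressure per day.
readings = [(0, 70.0), (1, 68.0), (2, 66.0), (3, 64.0)]
remaining = days_until_below(readings, minimum=60.0)
if remaining is not None:
    print(f"Pressure trending down; below range in about {remaining:.1f} days")
```

With those made-up numbers, the warning fires with about two days to spare, which is two days more notice than I got.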

And when I really want to troubleshoot more, why can’t I just set things to log in a more verbose mode? Why doesn’t my thermostat send an INFO event whenever it kicks a zone on or off, which can just live in a ring buffer that’s generally ignored until something goes wrong? And before my smoke detector sounds the alarm, can’t it send me a warning that it’s detecting light smoke and will go off soon? (As an aside, the concept of a ‘pre-alarm’ is not a new one.) And when it does randomly sound an alarm at 4:30am, why can’t it send me a text message telling me which detector has fired, so I don’t have to run around in a panic before realizing that it’s the smoke detector in my bedroom, where there is clearly no smoke or fire? (And, for that matter, why can’t I reply “stfu” to stop it from sounding?)
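The ring buffer idea is worth spelling out, because it costs almost nothing. Here is a sketch using Python's standard `logging` module: INFO events pile up in a fixed-size buffer and are quietly overwritten, and only when an ERROR arrives does the recent history get passed along. `RingBufferHandler` and `ListHandler` are names I made up for this example, and the thermostat messages are invented.

```python
import collections
import logging


class ListHandler(logging.Handler):
    """Stand-in for a real destination (file, syslog, email): collects messages."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


class RingBufferHandler(logging.Handler):
    """Keep the last `capacity` records; dump them to `target` on ERROR or worse."""

    def __init__(self, capacity, target):
        super().__init__()
        self.buffer = collections.deque(maxlen=capacity)  # oldest entries fall off
        self.target = target

    def emit(self, record):
        self.buffer.append(record)
        if record.levelno >= logging.ERROR:
            for buffered in self.buffer:
                self.target.handle(buffered)
            self.buffer.clear()


target = ListHandler()
log = logging.getLogger("thermostat")
log.setLevel(logging.INFO)
log.propagate = False  # keep the example's output out of the root logger
log.addHandler(RingBufferHandler(capacity=3, target=target))

for zone in range(5):
    log.info("zone %d on", zone)  # ignored unless something goes wrong
log.error("furnace fault")        # triggers a dump of the recent history

dumped = target.messages
print(dumped)
```

With a capacity of 3, the dump holds the two most recent zone events plus the fault itself. The older events were silently discarded, which is exactly the point.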

What frustrates me about this is that it’s like I’m describing a futuristic, almost science-fiction world. SNMP is almost as old as I am. Ethernet has been around for more than 30 years. syslog has been around for about as long. Everything I’ve described could have been fully implemented in 1990. People have been talking about “smart homes” since before then. More than two decades later, when everything has computerized onboard diagnostics already and boards that can do Ethernet, syslog, SNMP, and an embedded webserver are incredibly cheap, this is still a pipe dream. Why?!

One thought on “Measure All the Things”

  1. Most likely reason why: added up-front unit cost.

    i.e. selecting, integrating (with extensive testing), and securing the additional hardware + software components to do this.

    And if done badly, or the embedded software gains remote exploits over time (likely), it also increases the liability and risk levels for the manufacturer a lot.
