IoT Day 1: Home Energy Monitoring

Graph showing electricity usage from an Eagle home energy monitor
Eagle energy home monitoring IoT device


In this next series of blog posts I explore an Internet of Things topic – home energy monitoring – from a first-person perspective.  Join me as I install, use and hack a monitor (and related cloud services) in my new home.


Bringing IoT Home

I recently moved into a new home that uses electric heat exclusively.  Having come from a natural-gas forced-air furnace, I wasn’t sure what to expect in terms of cost.  I used to just keep the thermostat at 20C and forget about it.  But after hearing some horror stories about $800 electricity bills for heating this place, monitoring became a little more important to me.  So, I ordered an energy monitor.

BC Hydro offers a discount on an Eagle system from Rainforest Automation, pre-programmed for your meter.  It arrived in a box today.  Here’s what I was able to do with it out of the box and how I’ll be using it in the future.

Eagle energy monitor IoT device from Rainforest Automation

In my next post I’ll talk about how I connect it to some cloud services and see what they provide.  Ultimately it’s an open platform, boasting a REST API and developer documentation that I’ll be digging into.

Smart Meter 101

Smart meters communicate wirelessly using a Zigbee-compatible protocol.  The Eagle monitoring unit captures data from that broadcast (the current meter reading counter) and stores it on the device with a timestamp.  This leads to some other questions as well, especially for the DIY device hacker.

Are there other ways of capturing that data without Zigbee – optically, for example?  I think there are, but I haven’t investigated much, as this option came pre-configured and wasn’t horrendously expensive (~$60) for an embedded system with a web server, Zigbee and more (can’t wait to hack this sucker!).

Does the meter actually store any historical data?  I’m hoping to find out, but I believe all it spits out is the current meter counter, so it doesn’t matter if a bunch of transmissions are missed – the count will correct itself later on.

Can I read my neighbours’ meters?  It looks like you need some special ID numbers generated from a hydro service account to access a meter, so I doubt I’ll be trying it on others.

In-Device Application

After plugging it in and connecting it to my router with an ethernet cable, all the green lights came on except the Cloud light (as I hadn’t configured that yet).  The unit has a label on the underside with the name of the device as it would appear on my network, eagle-xxxxxx.local – I just checked my router and used the IP directly.

First they tell you to go to their website and enter a few codes from the label.  I didn’t get too far with that until I powered off the device once and restarted it.  Then I could hit the host/IP and start using it right away.  I did send them a support request mentioning some of these issues and they actually updated their docs, so be sure to communicate early if you are working through a similar scenario yourself.

The Eagle unit runs a few services, e.g. the RESTful interface and a web server that you can hit directly when plugged into your local network.  The main part of the UI I’m interested in is a usage History graph, showing hourly, daily or weekly usage.  It also has a download feature so you can get a CSV text file showing the same data.

I’m not sure yet how much data it will retain – I guess we’ll find out tomorrow.

Graph showing electricity usage from an Eagle home energy monitor


Feeding External Services

Eagle energy monitor UI for configuring cloud connected services

An important part of the Settings in the unit is for configuring cloud services.  When configured, the Eagle unit can push data to an external service.  There are a few pre-defined ones – BlueDot, Bidgely, etc. – but also the option to add your own custom service URL.  (Which tells you that I think I already know what Part III of this series will be :) )

The only bummer I’ve noticed with the cloud services so far is that, at least with the one I’ve tested, they only work while the Internet connection is up.  Although the monitor will continue to store data and you can view it in the device’s history chart, it does not re-send records to an external service after a network failure.

So, if you are like me and even want to shut down your Internet at night to save energy, just don’t expect to gain all the benefits.  More on that later.  Until then, I’ll be leaving it all on so we can see how things go.

Next Steps

My next planned step is to get a baseline for my energy usage during the day.  It’s actually pretty easy for me to see how it’s going throughout the day, as I work from home and several of us are in the house all day long.  It’s pretty cool so far to see a spike when the heaters turn on or the hot water is getting low.

First World Problems

Although I’m going to try to get my energy bill down, I’m still amazed at how cheap our power is, especially relative to those in the world who don’t have access to any.  So if I look out of touch trying to save a few cents a day, keep in mind I’d pay 10x my current bill before I gave up my clean, hydro-generated, renewable power for a dark hut with a dung fire!  I’m very thankful I can even run these experiments.

More to come in Day 2 – as I enable a cloud service to help me make sense of my data.

Web console for Kafka messaging system

Kafka Web Console - Zookeeper Register Form

Running Kafka for a streaming collection service can feel somewhat opaque at times, which is why I was thrilled to find the Kafka Web Console project on Github yesterday.  This Scala application can be downloaded and installed in a couple of steps.  An included web server can then be launched to serve it up quickly.  Here’s how to do all that.

For a quick intro to what the web console does, see my video first.  Instructions for getting started follow below.

Kafka Web Console Project – Download

The main repository for this project is available on Github (claudemamo/kafka-web-console).  However, I wanted the ability to add and remove Kafka topics so I use a forked repository that has a specific branch with those capabilities.  (These are planned to be added to the main project but have not been yet.)

Download the ZIP archive file and unzip.

Before doing anything further, we need another application to build and launch the console.

Download Play Framework

A program called Play with Activator is used to build and launch the web console.  It’s a Java app for launching Scala apps.

Download it here, unzip it and add it to the system path so you can execute the activator command that it provides.
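On Linux/OSX that boils down to something like the following – the archive name and install path are placeholders for whichever version you grab:

```bash
unzip typesafe-activator-x.x.x.zip
export PATH=$PATH:$HOME/activator-x.x.x   # wherever you unzipped it
which activator                           # confirm the command resolves on the PATH
```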

Build/Launch

Now back to the Kafka web console code.  Enter the top level directory and execute the Play Activator start command (with one special option):
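It goes something like this – the “special option” is the flag that applies the console’s database evolutions on first start (flag name per the project’s README at the time; treat it as an assumption if your version differs):

```bash
cd kafka-web-console-master
activator start -DapplyEvolutions.default=true
```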

The first time it runs, it will take some time to build the source before launching the web server.

The last line of the console output shows that it launches by default on port 9000.

Configuring the Console

Kafka Web Console - Zookeeper Register Form
Step 1 – Register your Zookeeper instance

Now you can open the web console and start using the application.  The Step 1 figure shows the basic form for your Zookeeper configuration details.  This is the only setup step required to get access to your Kafka brokers and topics.

The remaining steps/figures just show the different screens.  See the video to watch it all in action.

Kafka Web Console - Broker List Table
Step 2 – Brokers tab shows list of Kafka Brokers
Kafka Web Console - Topic List Table
Step 3 – Topics tab lists the Kafka topics. Note that this forked version provides the red “delete” button for each topic, as well as the additional “create topic” tab above.
Kafka Web Console - Topic Feed Table
Step 4 – Selecting a topic allows you to see an active topic stream.


Drinking from the (data) Firehose of Terror

Image of a boy about to blasted by a firehose

Between classic business transactions and social interactions and machine-generated observations, the digital data tap has been turned on and it will never be turned off. The flow of data is everlasting. Which is why you see a lot of things in the loop around real time frameworks and streaming frameworks. – Mike Hoskins, CTO Actian

From Mike Hoskins to Mike Richards (yes we can do that kind of leap in logic, it’s the weekend)…

Oh, Joel Miller, you just found the marble in the oatmeal!   You’re a lucky, lucky, lucky little boy – because you know why?  You get to drink from… the firehose!  Okay, ready?  Open wide! – Stanley Spadowski, UHF

Firehose of Terror

I think you get the picture – a potentially frightening picture for those unprepared to handle the torrent of data coming down the pipe.  Unfortunately, the disaster will not merely overwhelm the unprepared.  Quite the contrary – I believe they will be consumed by irrelevancy.

If you’re still with me, let me explain.

I agree that the tap has been turned on – maybe not at full blast or under maximum control, yet the data is coming and is already well beyond a trickle.  The helter-skelter implementations of big data solutions out there have, perhaps, created more of a turbulent, blasting firehose than a meandering stream of flowing data.  And this is only the beginning.

Success is Still Newsworthy

We are still at the stage where designing a (successful) enterprise system built around streaming data, for example, is big news.  Why is it news?  Because building is harder than merely planning, especially when new open source projects continue to push us beyond the bleeding edge.  New tools are helping us see how we can handle more data and find more value, but they are also making the pool of data so much larger that the tools themselves are often irrelevant by the time they are adopted.

For example, MapReduce was awesome, until it began to be so widely adopted that its limitations became apparent.  It’s like finding the marble in a sandbox filled with oatmeal – not easy, but when you find it, you’re a winner!  Oh, the prize is a fierce typhoon of even more data coming your way.  Congratulations! (Sorry you didn’t prepare for that.)

So where does this leave organisations that have no ability to handle more than a trickle of data?

It’s a win or lose scenario – either you can do something about it or you can’t.  As software developers or data managers we won’t be judged along some smooth gradation of skills and capabilities.

We’ll be judged against a checklist:
Yes or no.
Pass or fail.
Win or lose.
Firehose or … an icky pail to hide in the closet.

Data’s Need for Speed

Why is it a pass/fail scenario?  Consider your car – is it successful when it mostly starts in the morning?  Never.  Anything short of starting every time is a complete failure, because starting is what it is designed to do.

I argue that today’s data streams are being designed to handle data at maximum velocity.  Sure many services aren’t producing millions of records per second, but as we gear up with the latest toolsets, we make the tools themselves hunger and thirst to get more and more data into their greedy little hands.  Feed the beast – or ignore it at your peril.

Systems today are designed to run at full throttle – 100% – all out – maximum overdrive. None of us would like to pay for premium broadband and find that we only use 10% of our bandwidth.  Likewise, our systems are waiting for us to crank up the volume to see what we’ll do next.

Our data economy inherently wants to run at maximum but much of the old plumbing needs upgrades to function at that rate.  You’d be irate if the fire department ran their hoses at 10% power when trying to douse your burning home.  However, if they told you later that max pressure would burst the old hoses and you’d have no water, then you might be a little more appreciative.

Patch up those hoses so they are tested and ready.  Buckle up the survival suit.  Start digging through the oatmeal and open wide.  Be forewarned – if you are not searching for the marble, you can never find it.  If you cannot find it, you’ll be sent home empty-handed.

—-

p.s. If you don’t win, I highly doubt you’ll even receive a lousy copy of the home edition.


OSX Open Command – Launch Custom Application

The OSX “open” command line tool is very useful.  Use it to launch a URL or point to a folder and the web browser or Finder pops up automatically.  But what about when you want to launch a particular app to handle a resource you provide?  It can do that as well.  Easily.

In this example I’m setting up my environment to launch several Windows remote desktop client sessions using the wonderful CoRD application.  The idea is that I can put the CoRD app in a folder with a shell script and have an easily transportable launcher package to give to others – without them having to install an RDP client, etc.

Because I’m not installing the CoRD application globally, the RDP protocol doesn’t get associated with the app.  Therefore, when using the open command I need to explicitly tell it which application to launch the resource with, using the “-a” flag:
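A minimal sketch – the app path and RDP URL are placeholders for your own launcher folder and server:

```bash
# Launch (or reuse) the CoRD app sitting next to this script to open an RDP session
open -a ./CoRD.app "rdp://administrator@192.168.1.50"
```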

The benefit to using the open command here is that it won’t launch new CoRD instances, but will add them as sessions to the existing one.  Likewise, open will start the process and return you to the command line – so you can list several open statements in a script and it will run them all without having to do any funky backgrounding step.

Thanks to this old OSXDaily post for pointing me in the right direction.  I had given up on the command because I didn’t realise there were more options.  Next time I’ll read the manual first!


VIDEO: Kibana 3 Dashboard – 3 Use Cases Demonstrated

Kibana 3 Dashboard: All Events

Kibana dashboards, from the Elasticsearch project, can help you visualise activity and incidents in log files. Here I show 3 different types of use cases for dashboards and how each can be used to answer different questions depending on the person.  Video and details follow.

Text Search Dashboard

The first example is the simplest UI I could imagine: a query/search box, a histogram, and a table.  In this instance any user, at any level of curiosity, can find textual data in the logs using a keyword match.

Kibana 3 Dashboard: Text Search

Then they can also see the relative number of records that occur at given times within the time window of all the data available.  These are aggregate counts of all records that have some match to the query keyword or algebra.

Likewise, the table reflects the subset of data provided by the records, with the ability to only show fields of interest.

Process Details

A slightly more advanced use is to focus on a particular process (i.e. application) running on a machine that’s being logged.  Here we can then take a particular metric, e.g. CPU usage, and graph it instead of just a simple histogram.

Kibana 3 Dashboard: Process Details

A typical user may be in charge of a particular set of services in a system.  Here they can see how they perform and yet still dig into the details as desired.

I also do some cool “markers” to subtly show when events coincide with other process metrics.

All Events

The data example shown here has process, performance and event logging information.  I combine multiple queries and have them drive different parts of the dashboard – a pie chart, summary table, histogram, sparkline and other charts based on numeric data.

Kibana 3 Dashboard: All Events

These can then all be filtered based on time windows that are interactively selected.  This is really the typical picture of a dashboard – giving more densely packed information about a variety of metrics, ideal for system managers to get a handle on things.

Streaming Pipeline

The data are generated by Windows servers using a custom C# application that pushes data into a Kafka topic in a Hadoop cluster running on EC2. The data stream is then read from the topic using the Actian DataFlow platform and pushed into Elasticsearch for Kibana to use at the end of the pipeline.  There are other reasons I have this kind of pipeline – namely that DataFlow can simultaneously feed other outgoing parts of the pipeline – RDBMS, graph, etc.   More on that in a future video/post.

Next Steps

  • My next plans are to show you Kibana version 4 in action, replicating some of what I’ve shown here.
  • If you haven’t seen it already, see this link and my video with some tips and tricks for using Kibana 3.
  • Tell me more about your interests in dashboards and I’ll consider focusing on them too.

Google wants “mobile-friendly” – fix your WordPress site

Google mobile check fixed success

TheNextWeb reports: “Google will begin ranking mobile-friendly sites higher starting April 21“.  It’s always nice having advance warning, so use it wisely – here’s how to tweak WordPress to increase your mobile-friendliness.

Google Mobile-Friendly Check

I use a self-hosted WordPress site and wanted to make sure it was ready for action.  I already thought it was, because I’ve accessed it on a mobile device very often and it worked okay.

I even went onto the Google Web Admin tools and the mobile usability check said things were fine.

Google admin tool says mobile check is okay

However, all was not golden when I ran the Google mobile-friendly checker.  (Obviously two different apps here, hopefully those will merge.)

Google mobile check failure

Try it here, now!  The complaints were that some content is wider than the screen and that links were too close together.  Fair enough.

WordPress Mobile-Friendly Activation

If you’re not already using WordPress’s Jetpack features, you’re really missing out.  I use it mostly for monitoring stats but there are several other features that make it very useful, including one called Mobile Theme.

  1. From the admin sidebar select Jetpack (install it first if not already enabled).  It will show you some suggested plugins to enable, plus show you a search bar to find others.
  2. Enter “mobile” and click on the Mobile Theme item.
  3. Activate it (lower-right corner button).
  4. And you’re done!

Going back to Google’s checker, it now shows a different preview and says things are fine.  Looking at the site after making these changes, it’s obviously better.

Google mobile check fixed success

However, I still have one plugin (Crayon, for displaying code samples) that seems to force some posts wider than the screen.  I assume the plugin creators will fix that up, but it’s not too bad at this point.  Unless Google complains, it doesn’t matter anyway!

iPhone cable – loose connection?

iPhone cable – mysterious loose connection bothering you?  Before buying a new gold plated cord or adapter, clean out the port with a toothpick.  You will be amazed!

Kafka Consumer – Simple Python Script and Tips

Screenshot from Hortonworks site describing how Kafka works

[UPDATE: Check out the Kafka Web Console that allows you to manage topics and see traffic going through your topics – all in a browser!]

When you’re pushing data into a Kafka topic, it’s always helpful to monitor the traffic using a simple Kafka consumer script.  Here’s a simple script I’ve been using that subscribes to a given topic and outputs the results.  It depends on the kafka-python module and takes a single argument for the topic name.  Modify the script to point to the right server IP.
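A minimal sketch along those lines, using the kafka-python SimpleConsumer API of that era – the broker address and consumer-group name below are placeholders you’d change for your environment:

```python
#!/usr/bin/env python
# Minimal Kafka consumer sketch (kafka-python SimpleConsumer API).
# Usage: python kafka_consumer.py <topic-name>
import sys

from kafka import KafkaClient, SimpleConsumer

topic = sys.argv[1]                         # topic name is the single argument
kafka = KafkaClient("192.168.1.10:9092")    # change to your broker's IP:port

# max_buffer_size=0 lifts the fetch-buffer cap (some versions use None instead)
consumer = SimpleConsumer(kafka, "monitor-group", topic, max_buffer_size=0)

# Optional: jump to the most recent offset instead of resuming where we left off
# consumer.seek(0, 2)

for message in consumer:                    # iterate forever, printing each message
    print(message)
```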

Max Buffer Size

There are two lines I wanted to focus on in particular.  The first is the “max_buffer_size” setting:
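In the sketch above, that’s the keyword argument passed when creating the consumer:

```python
consumer = SimpleConsumer(kafka, "monitor-group", topic,
                          max_buffer_size=0)  # 0 = no cap on the fetch buffer
```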

When subscribing to a topic with a large backlog of messages that haven’t been received before, the consumer/client can max out its buffer and fail.  Setting an infinite buffer size (zero) allows it to take everything that is available.

If you kill and restart the script it will continue where it last left off, at the last offset that was received.  This is pretty cool but in some environments it has some trouble, so I changed the default by adding another line.

Offset Out of Range Error

As I regularly kill the servers running Kafka and the producers feeding it (yes, just for fun), things sometimes go a bit crazy.  I’m not entirely sure why, but I started getting an offset-out-of-range error from the consumer.

To fix it I added the “seek” setting:
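In the sketch above it sits right after the consumer is created:

```python
consumer.seek(0, 2)   # whence=2: seek relative to the end, i.e. the latest offset
```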

If you set it to (0,0) it will restart scanning from the first message.  Setting it to (0,2) allows it to start from the most recent offset – so letting you tap back into the stream at the latest moment.

Removing this line forces it back to the behaviour mentioned earlier, where it picks up from the last message it previously received.  But if/when that breaks, you’ll want a line like this to save the day.


For more about Kafka on Hadoop, see Hortonworks’ excellent overview page, from which the screenshot above is taken.

Web Mapping Illustrated – 10 year celebration giveaway [ENDED!]

My O’Reilly, 2005 book

Update: All copies are gone!  If you want Geospatial Desktop or Geospatial Power Tools – go to LocatePress.com – quantity discounts available.  For Web Mapping Illustrated go to Amazon.

I’m giving away a couple copies of my circa 2005 classic book.  Details below…  When O’Reilly published Web Mapping Illustrated – Using Open Source GIS Toolkits – nothing like it existed on the market.  It was a gamble but worked out well in the end.

Primarily focused on MapServer, GDAL/OGR and PostGIS, it is a how-to guide for those building web apps that included maps.  That’s right, you couldn’t just use somebody else’s maps all the time – us geographers needed jobs, after all.

To help give you the context of the times: a couple of months before the final print date, Google Maps was released.  I blithely added a reference to their site just in case it became popular.

The book is still selling today and though I haven’t reviewed it in a while, I do believe many of the concepts are still as valid as when it was written.  In fact, it’s even easier to install and configure the apps now due to packaging and distribution options that didn’t exist back then.  Note this was also a year before OSGeo.org’s collaborative efforts started to help popularise the tools further.

In celebration of 10 years of sales I have a couple autographed copies as giveaways to the first two people who don’t mind paying only for the shipping (about USD$8) and who drop me a note expressing their interest.

Additionally, I have some of Gary Sherman’s excellent Geospatial Desktop books as giveaways as well.  Same deal, pay actual shipping cost only from my remote hut in northern Canada.  Just let me know you’d like one of them and I’ll email you the PayPal details.  Sorry, not autographed by Gary, though I was editor and publisher, so could scribble on it for you if desired.

Neo4j Cypher Query for Graph Density Analysis

Graph density calculation with Neo4j

Graph analysis is all about finding relationships. In this post I show how to compute graph density (a ratio of how well connected relationships in a graph are) using a Cypher query with Neo4j. This is a follow up to the earlier post: SPARQL Query for Graph Density Analysis.

Installing Neo4j Graph Database

In this example we launch Neo4j and enter Cypher commands into the web console.

  1. Neo4j is a Java application and requires Oracle JDK 7 or OpenJDK 7 to be on your system.
  2. Download the Community Edition of Neo4j.  In my case I grabbed the Unix .tar.gz file and extracted it.  Installation on Windows may vary.
  3. From the command line, within the newly created neo4j-community folder, start the database (see the command sketch just after this list).
  4. Use the web console at http://localhost:7474 – the top line of the page is a window for entering Cypher commands.
  5. Load sample CSV data using a Cypher command – I cover this in a separate post here.  Be sure the path to the file matches where you saved the CSV files on your system.
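Here is the start command referenced in step 3 – the folder name will vary with the version you downloaded:

```bash
cd neo4j-community-2.x.x
./bin/neo4j start
```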

Quick Visualization

Using the built-in Neo4j graph browser, you can easily see all your relationships as nodes and edges.  Just run a query that returns all the objects:
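A catch-all pattern along these lines does the trick, returning the nodes and relationships so the browser can draw them:

```cypher
MATCH (n)-[r]->(m)
RETURN n, r, m
```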

Friends graph sample viewed in Neo4j

Compute Graph Density

Graph density requires the total number of nodes and the total number of relationships/edges.  We compute them both separately, then pull them together at the end.

Compute the number of unique nodes in the graph

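A sketch of the node count, assuming the people were loaded with a Person label as in the CSV-loading post:

```cypher
MATCH (p:Person)
RETURN count(DISTINCT p) AS nodes
```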
This tells us that there are 21 people as Subjects in the graph.  (I’m not sure how this differed from the 20 I had in my other post – perhaps part of the header from the CSV came in?)

Therefore, the maximum number of edges between all people would be 21² (we’ll use only 21×20, as a person doesn’t link to themselves in this example).

Compute the number of edges in the graph

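A sketch of the edge count, using the variable prefixes explained just below:

```cypher
MATCH (p:Person)-[r:HAS_FRIEND]->(f)
RETURN count(DISTINCT r) AS edges
```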
Here we only match nodes labelled “Person” and only where they have a relationship called “HAS_FRIEND”.

(The “p:”, “r:” and “f:” prefixes act like the “?…” variable references in SPARQL – you set them to whatever you want as a pointer to the data returned from the MATCH statement.)

When we loaded the data from CSV, we set up a relationship that I thought would be just one-way.  But you can see that, if you don’t provide the DISTINCT keyword, you get double the record count, so I’m assuming it’s treating the relationships as bi-directional.

Total edges is 57.  Do the quick math and see that the ratio is then:
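That is, the number of edges divided by the maximum possible edges from the figures above:

57 / (21 × 20) = 57 / 420 ≈ 0.136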

Rolling it all together

We can do all this in a single query and get some nice readable results, even though the query looks a bit long.  Note that I snuck in an extra “-1” to compensate for that stray record mentioned above.
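A sketch of one way to roll it together, assuming the same Person label and HAS_FRIEND relationship as above (note the “- 1” on the node count):

```cypher
MATCH (p:Person)
WITH count(DISTINCT p) - 1 AS nodes
MATCH (:Person)-[r:HAS_FRIEND]->()
RETURN nodes,
       count(DISTINCT r) AS edges,
       toFloat(count(DISTINCT r)) / (nodes * (nodes - 1)) AS density
```

The WITH clause carries the node count forward so both aggregates can feed the density expression in a single statement.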

Challenge: Why did I get different results than in the SPARQL query example?

In the earlier post I had only 20 nodes, but in this one I got 21.  Can you explain why?

Future Post

What’s your favourite graph analytic?  Let me know and I’ll try it out in a future post.

One of the things I have planned is to do some further comparisons between graph analytics in SPARQLverse and other products like Neo4j, but with large amounts of data instead of these tiny examples.  What different results would you expect?