Data Quality in the IoT – part II

This is a follow-up to my earlier IoT blog entry on data quality on the Internet and, ultimately, in the Internet of Things (IoT).  In the last post, we pointed out that in the increasingly automated IoT, people often don’t know where their decision-support data has come from.  And what if it does not have the credibility you assumed? In our previous example we looked at the data linked to bar codes and QR codes. Even though this data may be used for making critical decisions in our personal and business lives, the origin of the data can be hard to determine.

So where do you apply intelligence about the origin and reputation of data in the IoT?  Given that there are various indicators of trustworthiness and reputation available on the Internet (such as McAfee Global Threat Intelligence – GTI), where can you implement this intelligence to filter out bad data sources?  There are a few options to evaluate the data source:

Managing data quality intelligence on the device

The device is the remote endpoint where the data ultimately arrives and is processed for some in-field application. Ideally, this device is smart enough to make decisions about the quality of the data it receives, for example by verifying the credentials presented by the source (like a web site security certificate).  However, such credential management is costly from a computation and network perspective and is simply not available to many of the emerging devices and assets in the IoT.  You are not going to deploy an SSL certificate verification mechanism, or even an IPsec capability, onto a battery-operated device used for road sensing or for tracking the expiry date of pharmaceuticals. So what then?

Managing data quality intelligence on the gateway

Many if not most of the devices that find their way onto the IoT will not access the full-fledged Internet.  The Internet is too unhygienic for that, so these devices will sit behind domestic or business gateways such as DSL routers with WiFi networks for local devices.  Alternatively, or even at the same time, many IoT devices may never access even a full-fledged IP (v4 or v6) network, because they are constrained in their networking ability and rely on layer 2 and layer 3 protocols designed for low-power, slow networks. In this case, another gateway will be employed which speaks IP on one side and various application-specific protocols on the other, acting as a proxy for potentially hundreds of IoT devices.

In some of these cases it may be appropriate to allow the gateway (the device consumers get from their ISP, such as a DSL or DOCSIS modem) to analyze data quality on behalf of the IoT device.  The gateway, being more powerful, can afford to use intelligence about the reputation of the data source and decide for the IoT device what data can be trusted.  For instance, a person watching TV clicks on a link within an advertisement in a late-night infomercial (because the TV has an integrated web browser).  The link leads to a malware site that was recently established to take advantage of flaws in the TV software.  The TV is too “dumb” to either run security software or access intelligence, but the home gateway is smart enough and stops the connection.

Alternatively, consider a situation where an IP-enabled car attempts to “call home” for operating system updates, but DNS poisoning has directed it to a bad IP address, where malware mimicking the car updates will get loaded. (Malware in car operating systems has been lightly researched to date, but early findings are scary(1).) Again, the smarter, more aware gateway might intercept the connection based on its ability to access and manage data quality intelligence.
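The gateway decision in both examples can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the score table, the 0 to 100 scale, the threshold and the function name `gateway_allows` are all hypothetical, and the addresses come from ranges reserved for documentation. A real gateway would query a live reputation service instead of a hard-coded table.

```python
from typing import Mapping

# Hypothetical reputation scores (0 = known bad, 100 = trusted), keyed by
# destination IP address. A real gateway would query a reputation feed.
REPUTATION = {
    "192.0.2.10": 95,    # assumed: the car maker's genuine update server
    "203.0.113.66": 5,   # assumed: a recently established malware host
}

DEFAULT_SCORE = 50  # assumed score for destinations the feed does not know
BLOCK_BELOW = 30    # assumed policy threshold


def gateway_allows(dest_ip: str, scores: Mapping[str, int] = REPUTATION) -> bool:
    """Return True if the gateway should let a local device connect to dest_ip."""
    return scores.get(dest_ip, DEFAULT_SCORE) >= BLOCK_BELOW


if __name__ == "__main__":
    for ip in ("192.0.2.10", "203.0.113.66", "198.51.100.1"):
        print(ip, "allowed" if gateway_allows(ip) else "blocked")
```

Because the check is made on the destination address rather than the host name, it would still catch the car example, where the name looks legitimate but DNS poisoning has pointed it at a bad address.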

Intelligence in the network

Another place where intelligence might be applied, as an alternative to the device or the gateway or possibly in addition to those types of controls, is within the enterprise or service-provider network.  This is a concept that does not align well with “net neutrality”, which calls for service providers to move all traffic regardless of source or destination and not apply differing qualities of service on the Internet.  More and more, this neutral culture is being challenged by the need to find better security solutions that engage large network providers and compensate for less-smart gateways and end devices.  (Within an enterprise network, owned and operated by a single owner, no “neutrality” issues arise.)

One option for applying intelligence in the network is at the major border or peering points.  Traffic from sources with severely degraded reputations, due to present or recent bad behavior, may be denied access to selected destinations inside the network, for instance IP ranges that support more sensitive IoT applications and that the application owners have proactively identified as such to the service provider.
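As a rough sketch of that border rule, the Python below drops traffic only when a degraded source targets a registered sensitive range. The two lists, the function name `border_permits` and the addresses (taken from ranges reserved for documentation) are assumptions made for illustration, not a description of any real provider's configuration.

```python
import ipaddress

# Hypothetical: ranges that application owners registered as sensitive IoT.
PROTECTED_RANGES = [ipaddress.ip_network("198.51.100.0/24")]

# Hypothetical: source ranges whose reputation is severely degraded.
DEGRADED_SOURCES = [ipaddress.ip_network("203.0.113.0/24")]


def border_permits(src: str, dst: str) -> bool:
    """Deny only traffic from a degraded source to a protected destination."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    degraded = any(src_ip in net for net in DEGRADED_SOURCES)
    sensitive = any(dst_ip in net for net in PROTECTED_RANGES)
    return not (degraded and sensitive)


if __name__ == "__main__":
    print(border_permits("203.0.113.7", "198.51.100.10"))  # degraded -> sensitive
    print(border_permits("203.0.113.7", "192.0.2.1"))      # degraded -> ordinary
```

Traffic from the same degraded source to an ordinary destination still passes, which keeps the filtering limited to the destinations whose owners asked for it.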

Another option for applying intelligence in the network is to flag, tag or “stain” data packets as they move through the service provider network, so that their reputation is embedded directly within the network flows.  These stains might then be read by either the gateways or the endpoint IoT devices.  Where do you apply such stains?  In IPv4, there are fields in the IP header, such as “type of service”, that can carry reputation information within a network domain-of-control without affecting the normal functioning of IP or of higher-level protocols like TCP or UDP.  In IPv6, there are additional ways to potentially inject intelligence into the packet headers, such as the Flow Label field or the Destination Options extension header.(2)
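To make the staining idea concrete, here is a minimal Python sketch that writes a reputation level into the 8-bit IPv4 type of service value and reads it back. The encoding is purely an assumption for illustration: a level from 0 to 7 is placed in the three high-order bits and the remaining five bits are left untouched. It is not the scheme from the cited draft, and the function names are hypothetical.

```python
STAIN_SHIFT = 5        # assumed: stain occupies the three high-order bits
STAIN_MAX = 0b111      # so levels run from 0 to 7
LOW_BITS_MASK = 0x1F   # the five low-order bits are preserved


def stain_tos(tos: int, level: int) -> int:
    """Return the type of service byte with the reputation level written in."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("type of service must fit in one byte")
    if not 0 <= level <= STAIN_MAX:
        raise ValueError("reputation level must be between 0 and 7")
    return (level << STAIN_SHIFT) | (tos & LOW_BITS_MASK)


def read_stain(tos: int) -> int:
    """Return the reputation level carried in a stained type of service byte."""
    return (tos & 0xFF) >> STAIN_SHIFT


if __name__ == "__main__":
    stained = stain_tos(0x02, 5)
    print(hex(stained), read_stain(stained))
```

A network element inside the provider's domain would apply `stain_tos` as packets enter, and a gateway or endpoint would only need the single shift in `read_stain` to recover the level, which is cheap enough for constrained devices.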


1) Checkoway et al., Comprehensive Experimental Analyses of Automotive Attack Surfaces, University of California, 2011

2) Macaulay et al., IPv6 Packet Staining, IETF draft, Dec 2011
