The Evolution of Data Management for IoT

In the upcoming webinar for SnapLogic, we will be looking at the Internet of Things from the perspective of data.

  • What data can be expected
  • How IoT data builds upon the evolution of data management and analytics for big data
  • Why IoT data differs from data from other sources
  • Who can make the most use of IoT data, and who can be impacted most by it
  • Where IoT data needs to be processed
  • When IoT data has an impact

Specifically, we will look at how the recent evolution of data management in response to big data is, in some ways, ideally suited to IoT data, and how it is still evolving to handle the unique characteristics of IoT data and metadata.

The business drivers range from new sources of data that can help organizations better understand, service and retain customers, to consolidation in many industries, which creates the need to bring together data from disparate and duplicate information and operational systems after mergers and acquisitions. One of the more pervasive developments has been the movement of data acquisition, storage, processing, management and analytics to the Cloud.

Beyond these corporate motives, governments and non-governmental organizations (NGOs) are using data for good to bring about a better quality of life for millions or billions of individuals. Clean water, prosecuting genocide, fighting human trafficking, reducing hunger, and opening up new means of commerce are only a few examples. Some look at the future and see a utopian paradise, others a dystopian wasteland. The IoT, with evolving data management and analytics, is unlikely to bring about either extreme, but I do think that the future will be better for billions as a result.

The basic question that we’ll ask in this webinar is “What is the Internet of Things?” From simple connectivity to the resulting cognitive patterns that will be exhibited by these connected things, we will explore what it means to be a thing on the Internet of Things, how the IoT is currently evolving, and how to derive value from the IoT. It is also important to recognize that the IoT is already here: many organizations are reaping the benefits of IoT data management and sensor analytics. The webinar will show ways in which your organization can join the IoT or mature its IoT capabilities.

Big data was often described by three parameters overwhelming the old ways of integrating and storing data: volume, velocity and variety. Really, we are looking at deftly interweaving the volumetric flow of data in timely ways that flexibly provide for privacy, security, convenience, transparency, governance and compliance. Nowhere is this evolution better expressed than in data management for the Internet of Things (IoT).

We will cover some of the more interesting and useful aspects of preparing for IoT data and sensor analytics. Though the term was coined by Kevin Ashton in 1999, the IoT is still considered to be in the early stages of adoption and relevance. While the latest trends in data management and analytics apply to IoT data and sensor analytics, there are specific needs in properly addressing IoT data, such as time-series data, location data, and metadata specific to the IoT, which legacy ETL (extract, transform and load) and DBMS (database management systems) tools simply don’t handle well. In addition to these characteristics of IoT data, we will explore other aspects that make IoT data so interesting.
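To make the time-series and location characteristics concrete, here is a minimal sketch in Python. The record fields and the windowing helper are our own illustration, not any vendor's schema or API:

    from dataclasses import dataclass, field
    from datetime import datetime, timedelta
    from typing import Dict, List

    @dataclass
    class SensorReading:
        """One time-series point from a sensor, carrying location and metadata."""
        sensor_id: str
        timestamp: datetime                  # when the measurement was taken
        value: float                         # the measurement itself
        unit: str                            # e.g. "degC" or "kPa"
        latitude: float                      # where the reading was taken
        longitude: float
        metadata: Dict[str, str] = field(default_factory=dict)  # firmware, calibration, etc.

    def window(readings: List[SensorReading],
               start: datetime, length: timedelta) -> List[SensorReading]:
        """Return the readings inside a time window: a basic time-series operation
        that row-oriented ETL tools and general-purpose DBMSs were not built around."""
        end = start + length
        return [r for r in readings if start <= r.timestamp < end]

Even this toy record shows why IoT data strains legacy tooling: every value arrives with a timestamp, a location, and device metadata that all need to be managed together.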

The IoT isn’t yet living up to its hype, which would require many solution spaces coming together as ecosystems. Instead, the IoT is growing within each vertical separately, creating new data silos. This is exemplified by the 30-plus standards bodies addressing IoT data communication, transport and packaging. Metadata and API management can help. Metadata also addresses the nuances of IoT data, such as the factors arising when a sensor is replaced, which allow continuity of the data set and an understanding of the differences before and after the change, as sketched below.
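As a hypothetical illustration of that sensor-replacement nuance (the channel name, serial numbers and offsets below are invented), metadata can record which physical device was installed when, and any calibration offset needed to keep the logical series continuous:

    from datetime import datetime

    # Metadata for one logical channel: the physical sensor behind "tank-3-temp"
    # was swapped on 2016-10-01, and the two devices read slightly differently,
    # so a per-device offset preserves continuity of the series.
    channel_metadata = {
        "channel_id": "tank-3-temp",
        "installations": [
            {"device_serial": "A-1041", "installed": datetime(2015, 3, 12),
             "removed": datetime(2016, 10, 1), "offset_degC": 0.0},
            {"device_serial": "B-2207", "installed": datetime(2016, 10, 1),
             "removed": None, "offset_degC": -0.4},
        ],
    }

    def adjusted_value(raw: float, timestamp: datetime, meta: dict) -> float:
        """Apply the offset of whichever device was installed at `timestamp`, so
        analytics see one continuous series across the replacement."""
        for inst in meta["installations"]:
            if inst["installed"] <= timestamp and (
                    inst["removed"] is None or timestamp < inst["removed"]):
                return raw + inst["offset_degC"]
        raise ValueError("no device installed at that time")

Without metadata like this, a simple sensor swap can quietly break the continuity of years of time-series data.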

Information Technology (IT) and Operational Technology (OT) are coming together in the IoT. This means interfacing legacy systems on both sides of the house, such as enterprise resource planning (ERP) and customer relationship management (CRM) systems with supervisory control and data acquisition (SCADA) systems, and relational database management systems (RDBMS) with historian DBMSs. This also means deriving context from the Edge of the IoT for use in central IT and OT systems, and bringing context from those central systems for use in streaming analytics at the Edge. Further, this means that machine learning (ML) is not just for deep analysis at the end of the data management and analytics (DMA) process; ML is now necessary for properly managing data at each step, from the sensor or actuator generating the data stream, to intermediate gateways, to central, massively scalable analytic platforms, on-premises and in the Cloud. One simple example of ML at a gateway is sketched below.
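As a minimal sketch, assuming a gateway that keeps running statistics and forwards only unusual readings upstream (the class, threshold and warm-up period are illustrative, not taken from any product), ML at the Edge can be as modest as this:

    class GatewayFilter:
        """Running mean/variance (Welford's algorithm) kept on an edge gateway so
        that only unusual readings are streamed upstream individually; routine
        readings can be summarized and sent in batches."""

        def __init__(self, z_threshold: float = 3.0):
            self.n = 0
            self.mean = 0.0
            self.m2 = 0.0
            self.z_threshold = z_threshold

        def observe(self, value: float) -> bool:
            """Update the running statistics; return True if the reading looks
            anomalous given what the gateway has seen and should be forwarded now."""
            self.n += 1
            delta = value - self.mean
            self.mean += delta / self.n
            self.m2 += delta * (value - self.mean)
            if self.n < 10:                      # not enough history yet; forward everything
                return True
            std = (self.m2 / (self.n - 1)) ** 0.5
            return std > 0 and abs(value - self.mean) / std > self.z_threshold

In this sketch the central platform would still receive periodic summaries; only outliers travel upstream as individual events, which is one way context flows between the Edge and the center.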

As we discuss all of this, participants in the webinar will come away with five specific recommendations on gaining advantage through the latest IoT data management technologies and business processes. For more on what we will be discussing, visit my post on the SnapLogic Blog. I hope that you’ll register and join the conversation on 2016 October 27 at 10:00 am PDT.

Kognitio on Hadoop Best Use Contest


During the week of 2016 September 26, at the O'Reilly Strata-Hadoop conference in New York City, Kognitio announced the start of their contest looking for the best use case or application of Kognitio on Hadoop. Kognitio is looking for innovative solutions that include Kognitio on Hadoop. Innovation is defined by Kognitio as

Innovation could be a novel or interesting application, or it could be something that is commonplace but is now being done at scale.

This covers a wide range of potential big data analytics use cases that might include data-for-good, government, academic or business applications. Contestants must write up their use case in a short paper, to be submitted to Kognitio no later than 2017 March 31. Applications will be judged by a named panel headed by a leading industry analyst. The winner will be notified on 2017 June 01. Applicants can be individuals, groups or organizations. The winner may choose among the following three prizes:

  1. US$5,000.00
  2. A one year standard support contract
  3. A one year internship at Kognitio’s R&D facility in the UK – subject to the intern being eligible to work in the United Kingdom

Kognitio on Hadoop is free to download; registered entrants will receive notifications of patches and updates to the free software, as well as preferential support on the Kognitio forums.


As one of the first in-memory, massively parallel processing (MPP) analytics platforms, Kognitio has over 25 years of experience to bring to big data processing…always in-memory, MPP and on clusters. Today, the Kognitio Analytical Platform is delivered via appliances, software, and the cloud. Kognitio on Hadoop was announced at the 2016 Strata-Hadoop conference in London. This free-to-use version of the Kognitio Analytical Platform includes full YARN integration, allowing Hadoop users to pull vast amounts of data into memory for data management and analytics (DMA). As an in-memory MPP analytical platform, Kognitio is very scalable and can provide MPP execution of any computational statistics or data science applications. MPP execution of SQL, MDX, R, Python and other languages, for advanced analytics, is handled through a bulk synchronous parallel (BSP) API. This provides extremely fast, high-concurrency access to the data. In addition to supporting these languages, Kognitio has strong partnerships with business intelligence vendors such as Tableau, MicroStrategy and others. For Tableau, Kognitio has a first-class connector; the two share, for example, a joint customer in the financial services market with 10,000 users accessing nine petabytes (9PB) of data in Hadoop [five terabytes (5TB) in Kognitio]. As an example of the high concurrency available through Kognitio, that financial services customer routinely sees 1500-2000 queries per second from ~500 concurrent sessions. Note that this is one analytical subsystem; there are another 15 such uses of Kognitio, each for a specific purpose, accessing that 9PB data lake.
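As a generic, purely illustrative sketch of the bulk synchronous parallel execution model mentioned above (this is not Kognitio's API; the function names and message format are invented), each worker computes on its own partition, messages are exchanged, and all workers synchronize before the next superstep:

    from typing import Callable, Dict, List, Tuple

    # One superstep takes (worker_id, local_data, inbox) and returns
    # (new_local_data, outgoing_messages); a message is (target_worker, payload).
    Superstep = Callable[[int, List[float], List[float]],
                         Tuple[List[float], List[Tuple[int, float]]]]

    def bsp_run(partitions: Dict[int, List[float]],
                superstep: Superstep, steps: int = 3) -> Dict[int, List[float]]:
        """Toy driver for the BSP pattern: local compute, exchange, then barrier."""
        inboxes: Dict[int, List[float]] = {w: [] for w in partitions}
        for _ in range(steps):
            outboxes: Dict[int, List[float]] = {w: [] for w in partitions}
            # 1. Compute phase: every worker processes only its own partition.
            for worker, data in partitions.items():
                new_data, messages = superstep(worker, data, inboxes[worker])
                partitions[worker] = new_data
                for target, payload in messages:
                    outboxes[target].append(payload)
            # 2. Communication + barrier: messages only become visible next superstep.
            inboxes = outboxes
        return partitions

In a real MPP engine the workers run on separate cores or nodes and the barrier is enforced by the platform; the value of the model is that each compute phase is embarrassingly parallel and the synchronization points are explicit.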

Kognitio on Hadoop

Kognitio on Hadoop can be downloaded free of charge, with no data size limits or functional restrictions, and without registration. There is also a range of paid support options available. Kognitio on Hadoop is integrated with YARN and works on any existing Hadoop infrastructure, so no additional hardware is required solely for Kognitio. Kognitio on Hadoop accesses files, such as CSV files, stored in HDFS, just as one would normally store data in Hadoop. Intelligent parallelism in Kognitio 8.2 allows queries to be assigned to as few as one core, or to use all cores, allowing for extraordinarily high levels of concurrency; this apportionment is performed dynamically by Kognitio. In addition to the obvious advantage of such a mature product being free to use, Kognitio on Hadoop can be deployed, tested, and brought into production much more easily, while many open source solutions are still trying to run in a lab. Kognitio on Hadoop was developed internally using Apache Hadoop, and it is in production at customers on Apache Hadoop as well as on the distributions from Cloudera, Hortonworks and MapR.

Why is this important to sensor analytics ecosystems (SAE)?

As the Internet of Things matures beyond simple connectivity and communication, in-memory MPP analytical platforms, such as Kognitio on Hadoop, will be required to allow context to be derived from intelligent sensor packages and Edge gateways all the way to the Cloud, and to provide context back to the Edge, Fog and sensors in real time. Kognitio on Hadoop conceivably allows true collaboration and contextualization among things and humans in sensor analytics ecosystems.

Innovation at CHRISTUS with Informatica

CHRISTUS Health is a Catholic healthcare organization that started with a call to the Mother Superior of the Monastery of the Order of the Incarnate Word and Blessed Sacrament in Lyons, France in 1866. Three sisters responded, beginning their mission to care for the people of Texas in the United States of America. Since then, CHRISTUS Health has been working in accordance with their ideals and values, bringing quality healthcare to the people of the United States, Mexico and Chile. Fifteen years ago, a journey began to bring together financial and clinical data measuring patient outcomes, as well as operational efficiency. In 2013 a Business Intelligence Division was established to take analytics to another level. And this year, at Informatica World 2016 #INFA16, CHRISTUS Health was awarded an Informatica Innovation Award.

CHRISTUS Health – Everything CHRISTUS Health does is about taking better care of patients. While it was imperative that they control growing volumes of data, they knew that if they did it right, there was an unprecedented opportunity to use data to increase effectiveness and quality of services delivered. With end-to-end data management providing accurate, consistent views of data across the enterprise, the health system is confident they will improve the patient experience, enable clinical insight discovery and identify operational efficiencies. CHRISTUS Health implemented Informatica as a scalable enterprise data management architecture across its hundreds of healthcare facilities powering a voracious demand for business and clinical analytics for value-based care delivery. Immediate outcomes include an annual $1M in savings across lines of business and a 30 percent reduction in data stewardship efforts coupled with increased contract management across the Group Purchasing Organization.

Now in its sixteenth year, the Informatica Innovation Awards program acknowledges organizations that creatively change their processes through data using Informatica products.

Exceptional Informatica Customers Receive 2016 Innovation Awards for Excellence around Business Transformation
Annual Awards Program Honors Visionary Organizations that Unlock Data to Power Business
Eleven organizations will be recognized as Innovation Awards winners and finalists at a luncheon on Monday, May 23rd. We’ll be announcing those organizations in this press release. Now in its 16th year, the annual Informatica Innovation Awards program honors organizations that demonstrate vision, creativity and leadership in the use of Informatica solutions to transform business through data.

Through the efforts of Peggy O'Neill, Analyst Relations at Informatica Corporation, Clarise and I were able to speak with Mavis Girlinghouse, a 34-year veteran of CHRISTUS Health and currently their System Director of Business Intelligence. We had a good time discussing the challenges specific to the healthcare industry, such as dealing with HL7 data and pushing that data to vendors in a useable fashion, the challenges of the recent change to ICD-10 – which went rather smoothly at CHRISTUS Health – and the coming improvements to patient care and population health management through IoT data and sensor analytics.

The primary business case driving the work being honored this week was managing the supply chain across the organization, with a focus on contract management. Remember, CHRISTUS Health spans six states in the USA, and six more in Mexico and Chile. This isn't a case study, and we didn't have time to go into competitive vendor selection or return on investment. However, one immediate return was identifying contracts with terms that were impossible to meet, resulting in recurring late payment fees; renegotiating these contracts resulted in significant cost savings. Technically, CHRISTUS Health went from four MDM systems to one, and from numerous spreadsheets requiring weeks of manual labor to single-click answers. They aren't stopping at managing suppliers and supplies with MDM. Next on the list is the pharmacy system, followed by the implant tracking system. There are other challenges with which Informatica products help, such as dealing with the three electronic medical record (EMR) systems in use throughout CHRISTUS Health. Mavis is also planning for the deluge of data that will come with the change from episodic healthcare to population health management, where dynamic data mapping will ease the ETL (extract, transform, load) issues. CHRISTUS Health is also constantly planning for the ever-changing security landscape. Population health management, and improving patient-physician interaction and care management, will require bringing together EMR, pharmacy, third-party provider, IoT, social, and other external data sources to help CHRISTUS Health patients make better, informed lifestyle and prevention choices. Informatica brings the tools that allow new incoming data to be properly mirrored with like data, thus bringing proper context to care decisions.

You can also hear Mavis on The Cube, where she shares some great advice and experiences on how to get started with big data, and how data truly can save lives.

Informatica Partners with Tableau at INFA16

Informatica Teams with Tableau to Bridge the Business and IT Divide
Partnership Helps Customers Answer the World’s Toughest Questions with Trusted Data
Informatica has expanded its partnership with Tableau Software (NYSE: DATA), a global leader in rapid-fire, easy-to-use business analytics software. Together, the two organizations are delivering insights and understanding for business users at the speed they require, while ensuring IT controls are in place. Uniting Tableau’s self-service analytics with Informatica’s clean, trusted data and rich metadata enables business users to make even faster and smarter decisions with the support and blessings of IT. Together, the companies are providing the industry’s first end-to-end architecture and framework for agile business intelligence.

Today at Informatica World, Informatica and Tableau announced a continuation and expansion of their partnership. For Information Technology organizations, this means that they can focus on enabling the new generation of business analysts and data scientists. IT must evolve their data management architecture to provide appropriate and immediate access to legacy, internal, third-party, social and IoT data.

Appropriate Access

The application of privacy, transparency, security and convenience to data management has become one of the main challenges for organizations of all types and sizes today. Unintended exposure of data, whether through criminal or terrorist action or through carelessness, has been a major source of embarrassment, bad press, and legal action in recent years. Privacy means different things to different people. Cultural differences affect this perception of privacy, and regional regulatory differences reflect these different perceptions. Security has shades of meaning as well, all revolving around protection. Technology has tried to provide physical and cyber security through different means: roles, authorization, access controls, firewalls, locks, malware lists, heuristics, encryption, private and public keys, and virtual private networks, all meant to keep the wrong people out. Two more things have become apparent as the Internet has developed: people expect transparency in how corporations, governments and other organizations deal with the data that they generate, and people are willing to trade a certain amount of security and privacy for convenience. Regardless of these risks, IT must move from being a gatekeeper of data, away from unthinking lockdown of data, toward controlling the flow of data.

Immediate Access

Informatica Powers the New Era of Self-Service Analytics
Informatica Delivers the Data to Harness Disruptive Trends and Turns Them into Business Advantage
Power users of data are quickly becoming the norm as data analysis and the entire BI market is being disrupted as businesses are putting data in the hands of all employees and accelerating adoption through self-service data visualization and business discovery solutions. As the center of gravity for analytics shifts from IT to lines of business, enterprises face a tough new challenge: how to ensure that every user is empowered with the right data. Informatica has the answer with our solutions for cloud analytics.

In addition to the announcement with Tableau, Informatica announced its own evolving products and services for self-service data management.

The move of project and program management towards adopting the Agile Manifesto has shown that responsiveness to customer needs, whether that customer is internal, external or the citizenry, is paramount. We might talk about latency, or data temperature, or query times, but what really matters is that the analyst, and the organization, has the required data and the necessary tools to inform their gut as to the best decision. This means exposing both the data and the algorithms, assuring data stewardship, governance and quality, and providing the tools for self-service data preparation, business intelligence, reporting, analytics and visualization. It means productionalizing the work of data science teams so that those needing to make decisions, within and outside the organization, can use the inferences, predictions and insights to improve the performance of their decisions. Key to this is using the products of data science, machine learning and predictive analytics, to understand, manage and expose metadata, to anticipate and protect against risks, attack surfaces and threat vectors, and to expand context to all sources and all uses of data. Two-way traceability will be vital to this new IT architecture.

Importance to Sensor Analytics Ecosystems

Sensor Analytics Ecosystems (SAE) will be required to overcome the data silos, market verticals and myriad conflicting standards that are being created under the banner of the Internet of Things. Many technologies and processes will be required to bring this about. Self-service data preparation, self-service BI, productionalizing data science, the Intelligent Data Platform: the combination of human and machine intelligence will be among the most important. One need only look at the public Tableau Cloud, and at open data and open gov sites, to see the potential of data storytelling. Extrapolate this through some of the experiments in sensor journalism. All of these are small steps towards the improvements in data management and analytics, coupled with the Internet of Things, that will bring about SAE to address the world's critical challenges.

Data Powers Business at Informatica World 2016 INFA16

Clarise and I will be attending Informatica World 2016 at the Moscone Center in San Francisco. This year's theme is "Data Powers Business". The agenda includes a host of keynotes, sessions, a solutions expo, and roundtables. Four summits give clues to the trends and announcements that we can expect.

  • Cloud Innovation Summit
  • MDM 360 Summit
  • Big Data Ready Summit
  • Data Disruption Summit

Over the past several years, Informatica has been delivering on the promise made with the introduction of the Intelligent Data Platform (IDP). We have had a bit of a peek into how Informatica will continue this evolution in each of these four areas, with specific product announcements expected at Informatica World 2016. Cloud innovation has been happening throughout the past year at Informatica; of particular note to us was the support of Salesforce IoT Cloud initiatives at Dreamforce 2015. Informatica Cloud and Data-as-a-Service will provide even more solutions to Informatica customers going forward. Informatica Master Data Management (MDM) solutions will give 360º views and improved data governance and quality. Big Data 10.1 was announced at this spring's Strata conference in Santa Clara; more details will be announced at INFA16. Data, or digital, disruption has been receiving major attention this year. Informatica will be showcasing how improvements to their architecture and the IDP, their business transformation services using BOST (the business, operations, systems, technology framework), and intelligent use of metadata will help their customers leverage data disruption from third-party, social, IoT and other sources. Security will also be a topic running throughout all of these areas, with Informatica's latest offering, Secure@Source, adding to their strong security portfolio. It will also be interesting to see Informatica building upon Vibe and other products for real-time and streaming data management.

There are two roundtable series that will attract those interested in IoT:

  • IoT: What it means for the Telecommunication, Technology, Automotive and Energy industries
  • Ready for the Internet of Things with MDM and Big Data Relationship Management

We plan to live tweet the event [@JAdP and @CZDS] using #INFA16, as well as to publish additional blog posts. If you plan to attend Informatica World 2016, and would like to meet with us to discuss using Informatica for IoT data management and preparation for sensor analytics, please DM us on Twitter, or follow the contact link in the footer or author links.

The DataArchon blog provides research, analysis, insight and leadership for all aspects of data management and analysis. While many like to talk about data science as something new, we see it as an extension of analytic teams. The new discipline that many seek has more of a craft feel to it, and we see its practitioner more as a data smith.

