OSS DSS Studies Introduction

First, let me say that we're talking about systems supporting the decisions that are made by human beings, not "expert systems" that automate decisions.  As an example, let's look at inventory management.  A human might use various components of a DSS to determine the amount of an item in stock, the demand for that item as a trend to determine when it might be out of stock, and predictives as to various factors (internal, external, environmental, political, etc.) that might affect supply, to come to a decision as to how much and when to order more of that item.  An expert system might be created that could also determine when and how much of an item to order, using neural networks, Bayesian nets or other algorithms.  The expert system might even draw from the same DSS components (or directly from their underlying data) as the human might.  One could even run the expert system in parallel with humans making the decisions, scoring or otherwise evaluating the two, until the expert system is comparable to or better than the humans.  But, we're not really interested in expert systems in this study guide.  We'll be focusing on systems that help humans to make better decisions, not on automated feedback and control loops.
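To make the inventory example a little more concrete, here is a minimal sketch in Python of the kind of figure a DSS component might hand to the human: an estimate of when an item runs out, given the stock on hand and a demand trend. Everything in it is an assumption for illustration: the names (ItemStatus, days_until_stockout, latest_order_day), the linear demand trend, and the sample numbers. It only informs the decision; how much and when to order is still up to the person.

```python
from dataclasses import dataclass
import math


@dataclass
class ItemStatus:
    """Hypothetical snapshot of one inventory item (all fields are assumptions)."""
    on_hand: float         # units currently in stock
    daily_demand: float    # current demand, in units per day (must be >= 0)
    demand_growth: float   # change in daily demand, in units per day, per day
    lead_time_days: float  # days between placing an order and receiving it


def days_until_stockout(item: ItemStatus) -> float:
    """Estimate the days until stock reaches zero under a linear demand trend.

    Cumulative demand after t days is d*t + 0.5*g*t**2; we solve for the
    t at which that equals the stock on hand.
    """
    d, g, s = item.daily_demand, item.demand_growth, item.on_hand
    if d < 0:
        raise ValueError("daily_demand must not be negative")
    if s <= 0:
        return 0.0
    if g == 0:
        return s / d if d > 0 else math.inf
    discriminant = d * d + 2 * g * s
    if discriminant < 0:
        # Demand shrinks to nothing before the stock is used up.
        return math.inf
    return (-d + math.sqrt(discriminant)) / g


def latest_order_day(item: ItemStatus) -> float:
    """Days from now by which an order must be placed to arrive before stockout."""
    return max(0.0, days_until_stockout(item) - item.lead_time_days)


steady = ItemStatus(on_hand=120, daily_demand=10, demand_growth=0, lead_time_days=5)
rising = ItemStatus(on_hand=120, daily_demand=10, demand_growth=0.5, lead_time_days=5)
print(days_until_stockout(steady), latest_order_day(steady))
print(round(days_until_stockout(rising), 2), round(latest_order_day(rising), 2))
```

The point of the sketch is the division of labor: the tool does the arithmetic on the trend, and the human weighs it against the external, environmental and political factors that no formula here captures.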

To me, a technology doesn't matter very much if it's not supporting some process, or a step within a process.  That process may be for personal reasons or supporting work activities. For this study guide, let's begin by continuing the discussion that we began in the previous posts, about the process by which one makes a decision, the steps, the events, the triggers and the consequences of making a decision.

I have my own process in making decisions.  I've played in executive and management roles for many years, and have been responsible for 5 P/L centers.  But this is a study guide, and while I intend to offer my own opinions and interpretations, we need some objective sources to study.  Let's start with a Google search.  Of course, Wikipedia has an article.  A site of which I've not heard before has the first hit with their article on problem-solving and decision-making.  Science Daily has a timely article from 2010 March 13 on how we really make decisions, our brain activity during decision making.  I also like the map from The Institute for Strategic Clarity.  Mindtools sets out a list of techniques and tools for aiding in the decision making process, and provides an important caveat "Do remember, though, that the tools in this chapter exist only to assist your intelligence and common sense. These are your most important assets in good Decision Making".  Reading through various reviews, the one book on decision making that I want to add to my library is The Managerial Decision-Making Process, 5th ed. by E. Frank Harrison.  From the Glossary of Political Economy Terms, we have:


Where formal organizations are the setting in which decisions are made, the particular decisions or policies chosen by decision-makers can often be explained through reference to the organization's particular structure and procedural rules. Such explanations typically involve looking at the distribution of responsibilities among organizational sub-units, the activities of committees and ad hoc coordinating groups, meeting schedules, rules of order etc. The notion of fixed-in-advance standard operating procedures (SOPs) typically plays an important role in such explanations of individual decisions made. -- Organizational process models of decision-making


Let's revisit and expand upon the summary that we gave in the third post in this series.

  1. As an individual faced with making a decision, I may want input from others, I may want consensus, but in the end, it is an individual decision, and I will bear the fruits of having made that decision.
  2. I need to put the problem, and my decision making, into context.  I have a variety of resources at my disposal to do so:

    • historical data
    • current information
    • structured data from transactional systems, master data, metadata, data warehouse, and other possible sources
    • unstructured data from blogs, wikis, Zotero libraries, Evernote, searches, bookmarks and similar sources
    • email
    • non-electronic correspondence, notes and conversations
    • personal experience
    • the experience of others garnered through water cooler and hallway conversations, formal meetings, twitter, phone calls and the like
  3. Now I need to understand all of these facts, opinions and conjecture at my disposal.  Part of this is sifting all of it through my internal filters, using my "gut".  Part is using the various reporting and analytical tools at my disposal, and then filtering those through my gut.  And really, this and the next point will constitute the majority of this OSS DSS Study Guide - the tools we use.
  4. As I contemplate the various decisions that I might make from all of this, I want to understand the consequences of each potential decision: might this decision lead to a better product, more profit, less profit, broader market penetration, higher reliability, or even an alternate universe.
  5. As I make this decision, I'll want to collaborate with others.  Ideally, I'll want to collaborate within the context of my decision support system. Once upon a time we would do this by embedding the tools within a portal system; now we take a more master data management approach, and use a service-oriented architecture with either Web Services Description Language (WSDL) or representational state transfer (ReST) application programming interfaces (APIs) to the collaborative environment, usually a wiki.
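
As a rough illustration of the last point, here is a minimal Python sketch of how a DSS tool might record a decision note on a wiki through a ReST-style API. The host, the "/api/pages" path, the JSON fields and the function name build_wiki_update are all hypothetical; they are not taken from any particular wiki product, and a real wiki's API will differ. The sketch only builds the request and does not send it.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical values, for illustration only: neither the host nor the
# "/api/pages" path comes from any particular wiki product.
WIKI_BASE_URL = "https://wiki.example.com/api/pages"


def build_wiki_update(page: str, summary: str, author: str) -> urllib.request.Request:
    """Build (but do not send) a ReST-style PUT that records a decision note."""
    body = json.dumps({"author": author, "summary": summary}).encode("utf-8")
    # Percent-encode the page name so it is safe to use as one URL path segment.
    page_path = urllib.parse.quote(page, safe="")
    return urllib.request.Request(
        url=f"{WIKI_BASE_URL}/{page_path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_wiki_update(
    page="reorder-decision-widgets",
    summary="Ordering 500 units; the demand trend suggests a stockout in 10 days.",
    author="analyst",
)
print(request.get_method(), request.full_url)
# Sending it would be urllib.request.urlopen(request), which is left out here.
```

The design choice worth noting is that the analysis tool and the wiki stay separate systems; the API call is the only coupling between them, which is what lets the collaborative environment be swapped out later.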

In summary, this introduction has set up a framework for a decision-making process for an individual to use a decision support system.  The majority of this study guide will be to explore the actual decision support system, and the open source tools from which we can build such a system.

Syllabus for OSS DSS Studies

As promised, here's the syllabus for our study guide to decision support systems using open source solutions. We'll start with a first draft on 2010-03-23, and update and change based on ideas, comments and lessons learned. So, please comment. :) The updates will be marked. Deletions will be marked with a strike-through and not removed.

  1. Introduction
    1. Continuing the discussion of the processes and technologies that constitute a decision support system
    2. Formalizing a definition of DSS as well as the components, such as business intelligence (BI) that contribute to a DSS
    3. Providing [and updating] the list of references for this study guide
  2. Preparation
    1. Discussing the technology for use in this study guide including the client(s) and server (Red Hat Enterprise Linux 5)
    2. Checking for prerequisites for the open source solutions that will be used
    3. Hands-on exercises for preparing the system
  3. Installation
    1. Pointers and examples for installing the open source server-side packages including but not limited to:
      1. LucidDB
      2. Pentaho BI-Server, including PAT, and Administrative Console
      3. RServe and/or RApache
    2. Pointers for installation of client-side software and some examples on Mac OS X
  4. Modeling
    1. Generally, we would determine the models, the architecture and then one (or more competing) design(s) to satisfy that architecture, including selecting the right technical solutions for the job at hand. Here, we're creating a learning environment for certain tools, so we're introducing the architecture and design studies after the technology installs.
    2. In general, this section will explore the various means of modeling processes, systems and data, specifically as these relate to making decisions.
    3. Decision Making Processes
      1. Decision Theory
      2. Game Theory
      3. Machine Learning & Data Mining
      4. Bayes and Iterations
      5. Predictives
    4. Information Flow
    5. Mathematical Modeling
    6. Data Modeling
    7. UML
    8. Dimensional Modeling
    9. PMML
  5. Architecture and Design
    1. In this section, we'll examine the differences between enterprise and system architecture, and between architecture and design. We'll look at various architectural and design elements that might influence both policy and technology directions.
    2. Discussing Enterprise Architecture, especially the translation between the user needs and technology/operational realities
    3. System Architecture
    4. SOA, ReST, WSDL, and Master Data Management
    5. Technology selection and vendor bake-offs
  6. Implementation Considerations
    1. Discussing the various philosophies and considerations for implementing any DSS, or really, any system integration project. We'll look at our own three-track implementation methodology, as well as how the new Pentaho Agile BI tools support our method. In addition, we'll consider how we'll get all these OSS tools working together, on the same data sets, as well as the importance of managing data about the data.
    2. Pentaho Agile BI and our own 8D™ Method
    3. System and Data Integration
    4. Metadata
  7. Using the Tools
    1. This is the vaguest part of our syllabus. We'll be using the examples from our various references, but with the system we've set up here, rather than the exact systems that the references use. For example, we'll be using LucidDB and not MySQL for the examples from Pentaho Solutions. Remember too, that this is a study guide, and is not meant to be a book written as a series of blog posts, so while we might vary from the reference materials, we'll always refer to them.
    2. ETL
    3. Reporting
    4. OLAP
    5. Data Mining & Machine Learning
    6. Statistical Analysis
    7. Predictives
    8. Workflow
    9. Collaboration
    10. Hmm, this should take years :D

Renewables and Smart Grid

We are currently in, at least, the fourth era of growth and interest in renewable energy. The first two of which I'm aware, in the late 1800's into the turn of that century, and in the 1950's, both concentrated on solar (Photovoltaics and Solar Thermal), with some wind power in the first. The third was during the Carter Administration in the 1970's (famously ending when Ronald Reagan ordered the solar panels off the roof of the White House). Disclosure: I was doing photovoltaic research at SES, Inc (now part of Royal Dutch Shell) as a physical electrochemist during this time.

During the recent upswing in interest, investment and installations of renewable energy sources (photovoltaics, solar thermal, wind, wave, tidal, geothermal, biomass, etc.) I've been worried that the bubble would soon burst. But today, I've had a thought that encourages me, that maybe renewables will take their place alongside coal, oil and nuclear. The reason for this is complex, more social than technical, more due to business than to science.

Many point to the past failures of renewables, of whatever type, due to inefficiencies and to long periods, or infinite time, for a return on the upfront investment. But I think that much of what prevented adoption of renewables is more for social and business reasons. For the most part, the past marketing effort for renewables was to get people off the grid. This was scary for the individual, not justified by the ROI, and inimical to business interests.

Today however, we have the prospect of the Smart Grid. What exactly defines the Smart Grid is still being debated, but here's my hopeful thought. Just as the Internet evolved to combine data, communication and collaboration protocols into what we now term Web2.0 or read-write-web or social media, allowing anyone who desires to do so to become a producer of content as well as a consumer, the Smart Grid will not force users of renewable energy sources off the grid, but will allow whoever desires to do so to become a producer as well as a consumer of utility services, starting with electricity, but perhaps evolving to include other utility services as well. Let me also point out that I'm not [just] talking about the individual, I'm talking about communities and small businesses. For example, the Smart Grid would allow a small business such as our local Coastside Scavengers to install an AdaptiveARC reactor, transforming the waste they pick up from our homes into electricity, and additional cash flow.

This possibility has social, business and economic implications that the previous generations of renewables lacked. This gives me hope. This also strengthens my desire to see workable standards, and working implementations of the Smart Grid(s) - whatever that turns out to really mean.

Questions and Commonality

In the introduction to our open source solutions (OSS) for decision support systems (DSS) study guide (SG), I gave a variety of examples of activities that might be considered using a DSS. I asked some questions as to what common elements exist among these activities that might help us to define a modern platform for DSS, and whether or not we could build such a system using open source solutions.

In this post, let's examine the first of those questions, and see if we can start answering it. In the next post, we will lay out a syllabus of sorts for this OSS DSS SG.

The first common element is that in all cases, we have an individual doing the activity, not a machine nor a committee.

Secondly, the individual has some resources at their disposal. Those resources include current and historical information, structured and unstructured data, communiqués and opinions, and some amount of personal experience, augmented by the experience of others.

Thirdly, though not explicit, there's the idea of digesting these resources and performing formal or informal analyses.

Fourthly, though again, not explicit, the concept of trying to predict what might happen next, or as a result of the decision is inherent to all of the examples.

Finally, there's collaboration involved. Few of us can make good decisions in a vacuum.

Of course, since the examples are fictional, and created by us, they represent our biases. If you had fingered our domain server back in 1993, or read our .project and .plan files from that time, you would have seen that we were interested in sharing information and analyses, while providing a framework for making decisions using such tools as email, gopher and electronic bulletin boards. So, if you identify any other commonalities, or think anything is missing, please join the discussion in the comments.

From these commonalities, can we begin to answer the first question we had asked: "What does this term [DSS] really mean?". Let's try.

A DSS is a set of processes and technology that help an individual to make a better decision than they could without the DSS.

That's nice and vague; generic enough to be almost meaningless, but it provides some key points that will help us to bound the specifics as we go along. For example, if a process or technology doesn't help us to make a better decision, then it doesn't fit. If something allows us to make a better decision, but we can't define the process or identify the technology involved, it doesn't belong (e.g. "my gut tells me so").

Let's create a list from all of the above.

  1. Individual Decision Maker
  2. Process
  3. Technology
  4. Structured Data
  5. Unstructured Data
  6. Historical Information
  7. Current Information
  8. Communication
  9. Opinion
  10. Collaboration
  11. Analysis
  12. Prediction
  13. Personal Experience
  14. Others' Experience

What do you think? Does a modern system to support decisions need to cover all of these elements and no others? Is this list complete and sufficient? The comments are open.

First DSS Study Guide

Someone sitting in their study, looking at their books, journals, piles of scholarly periodicals and files of correspondence with learned colleagues probably didn't think that they were looking at their decision support system, but they were.

Someone sitting on the plains, looking at the conditions around them, smoke signals from distant tribe members, records knotted into a string, probably didn't think that they were looking at their decision support system, but they were.

Someone at the nexus of a modern military command, control, communications, computing and intelligence system, probably didn't think that they were looking at their decision support system, but they were.

Someone pulling data from transactional systems, and dumping the results of reports & analyses from a BI tool into a spreadsheet to feed a dashboard for the executives of a huge corporation probably didn't think that they were looking at their decision support system, but they were.

The term "decision support system" has been in use for over 50 years, perhaps longer.

  • But what does this term really mean?
  • What do all of my examples have in common?
  • How can we build a reasonable decision support system from open source solutions?
  • What resources exist to help us learn?

I'm starting a series of posts, essentially a "study guide" to help answer these questions.

I'll be drawing from and pointing to the following books and online resources as we install, configure and use open source systems to create a technical platform for a decision support system.

  1. Bayesian Computation with R by Jim Albert, Springer Use R! Series, ISBN: 0-38-792297-0 (also available as a Kindle ebook)
  2. R in a Nutshell by Joseph Adler, ISBN: 0-59-680170-X
  3. Pentaho Solutions: Business Intelligence and Data Warehousing with Pentaho and MySQL by Roland Bouman and Jos van Dongen, ISBN: 0-47-048432-2
  4. Pentaho Reporting 3.5 for Java Developers by Will Gorman, ISBN: 1-84-719319-6
  5. Pentaho Kettle Solutions: Building Open Source ETL Solutions with Pentaho Data Integration by Matt Casters, Roland Bouman & Jos van Dongen, ISBN: 0-47-063517-7, due 2010 September
  6. Data Mining: Practical Machine Learning Tools and Techniques by Ian H. Witten and Eibe Frank, Second Edition, Morgan-Kaufmann Series in Data Management Systems, ISBN: 0-12-088407-0, a.k.a. "The Weka Book" (also available as a Kindle ebook; the Third Edition is available for pre-order)
  7. LucidDB online documentation
  8. Pertinent information from Eigenbase
  9. LucidDB mailing list archive on Nabble
  10. Anything I can find on PAT
  11. Pentaho Community Forums, Wiki, WebEx Events, and other community sources
  12. R Mailing Lists and Forums
  13. Various Books in PDF from The R Project
  14. Information Management and Open Source Solution Blogs from our side-column linkblogs

In this study guide series of posts:

  • I'll show how data warehousing (DW) and business intelligence (BI) can be extended to include all the elements held in common from my DSS examples.
  • We'll examine the open source solutions Pentaho, R, Rserve, Rapache, LucidDB and possibly Map-Reduce & Key-value-stores, and the related open source projects, communities and companies in terms of how they can be used to create a DSS.
  • I would like to add a collaboration tool to the mix, as we do in our implementation projects, possibly Mindtouch, a ReSTful Wiki Platform.
  • We may add one non-open source package, SQLStream, that's built upon open source elements from Eigenbase. This will allow us to add a real-time component to our DSS.
  • I'll give my own experience in installing these packages and getting them to work together, with pointers to the resources listed above.
  • We'll explore sample and public data sets with the DSS environment we created, again with pointers to and help from the resources listed.

The purpose of this series of posts is to serve as a study guide, not as an online book written as a blog. The goal is to help us to define a modern DSS and build it out of open source solutions, while using existing resources.

Please feel free to comment, especially if there is anything that you feel should be included beyond what I've outlined here.

The TeleInterActive Press is a collection of blogs by Clarise Z. Doval Santos and Joseph A. di Paolantonio, covering the Internet of Things, Data Management and Analytics, and other topics for business and pleasure.
