Recently Bill Inmon published a post on his B-eye channel about what the data warehouse is not. It immediately led to a storm of discussion on Twitter between several analysts, among them Colin White, Claudia Imhoff and Seth Grimes. The bottom line of their reaction: Bill’s definition is a traditional one and does not reflect current industry practices. Moreover, Inmon initially stated that SAP was the only company offering a true data warehouse solution (see Colin White’s tweets), but in the current version of his article this statement has been left out. Another version of his truth?
At the risk of initiating another fierce discussion, I want to react to the following statement of Bill’s: “Data warehouses exist for the purpose of supporting management, not operations”. Five years ago I would have concurred, but now I simply disagree. It’s like having a word processor but restricting its use to writing memos. The roots of ERP date back to MRP, Material Requirements Planning. Obviously, its current scope is much broader. I used to dislike the idea of using a data warehouse to support operations, but I have come to a different view. What is the use of a sound data warehouse when its use is restricted to only one business function? Data (or information or knowledge for that matter) is there to be shared, not to sit on. As Colin White points out: there’s theory and there’s practice, and the industry simply evolves.
OK, that said, I need to say that I’m also allergic to people redefining and misinterpreting concepts that were soundly defined in the first place. In that respect, I support Bill’s plea. Take for example the concept of a core competence, defined by Prahalad & Hamel. Does anybody still know the three distinguishing characteristics of a core competence? Guess not (they are: it provides entry to new markets, it is difficult to imitate, and it supports customers’ needs). Take any declared core competence and I bet hardly any instance complies with these criteria.
But I’m not going into the discussion regarding the (sound) definition of a data warehouse (again). Instead, I’d like to approach the issue of operational use from a different perspective: yep, the architectural one. One key question we need to ask is whether there’s a fundamental architectural principle that inhibits the use of a data warehouse for operational purposes. Or rather, a universal principle. My answer to that question would be no, there’s not. There’s no principle like ‘Thou shalt not use a data warehouse operationally, ever’. Architectural principles that do guide this trade-off include, for example, reducing dependencies, reusability and business responsiveness. But it is up to each individual organization how it wants to make this trade-off and what the implications are for its data warehouse.
Second, we need to bear in mind that the way the application architecture is realized (i.e. physically implemented) is driven to a large extent by the capabilities of the technology architecture (i.e. infrastructure). Nowadays, new data warehouse technologies are capable of dealing with operational service levels (real-time integration, split-second response, failover etc.) at an affordable cost, making it practically and economically feasible to use a data warehouse for operational purposes. Again, this should be a trade-off driven by architecture principles and constraints.
Then there’s the question of whether the data warehouse is a logical or a physical construct. If it is seen as a logical construct, operational use of a data warehouse may imply, at the physical level, a separate datastore, for example, in order to avoid direct physical access to the (physical) data warehouse.
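To make the logical vs. physical point a bit more tangible, here is a minimal sketch in Python. It is purely illustrative and not taken from any product: the names `LogicalWarehouse`, `PhysicalStore` and `Workload` are hypothetical, and the "replication" is just a second in-memory copy. The only thing it tries to show is that consumers talk to one logical warehouse, while operational requests are physically served from a separate datastore.

```python
from dataclasses import dataclass, field
from enum import Enum


class Workload(Enum):
    MANAGEMENT = "management"    # reporting and analysis
    OPERATIONAL = "operational"  # lookups made by running business processes


@dataclass
class PhysicalStore:
    """Hypothetical stand-in for a physical datastore (here just a dict)."""
    name: str
    rows: dict = field(default_factory=dict)

    def get(self, key):
        return self.rows.get(key)


@dataclass
class LogicalWarehouse:
    """One logical data warehouse with two physical stores behind it."""
    warehouse: PhysicalStore
    operational_copy: PhysicalStore

    def load(self, key, value):
        # Assumption: every load is also replicated to the operational copy.
        self.warehouse.rows[key] = value
        self.operational_copy.rows[key] = value

    def store_for(self, workload):
        # Operational consumers never touch the physical warehouse directly.
        if workload is Workload.OPERATIONAL:
            return self.operational_copy
        return self.warehouse

    def query(self, key, workload):
        store = self.store_for(workload)
        return store.name, store.get(key)


dw = LogicalWarehouse(PhysicalStore("warehouse"), PhysicalStore("operational_store"))
dw.load("customer:42", {"segment": "gold"})
```

Calling `dw.query("customer:42", Workload.OPERATIONAL)` is answered by the store named `operational_store`, while the same call with `Workload.MANAGEMENT` is answered by `warehouse`. Whether such a split is worth the extra dependency is exactly the kind of trade-off the principles above should decide.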
It’s a difficult discussion and not one that is likely to be settled by this post. What I wanted to make clear is that there are multiple considerations and viewpoints – principles, logical vs. physical, application vs. infrastructure – that need to be weighed when operational use of a data warehouse is being discussed.
I would very much like to hear your comments on this one!
wouter
