There is a massive amount of hype and buzz in the Data Warehousing and Business Intelligence marketplace surrounding the term ‘Big Data’. Recently we have even seen talk of Big Data as a replacement for Data Warehousing. I believe this is a misunderstanding of what Big Data is. In fact, Big Data strategies only work if they co-exist with a well thought-out and supported Enterprise Data Warehouse. So I don’t believe we are witnessing the end of Data Warehousing – and here’s why.
First, what is Big Data? In his recent blog post Raw is More, John Bantleman defines Big Data using the criteria of volume, velocity, variety and value. This is a great definition and captures exactly why the hype, buzz and excitement around Big Data will be with us for some time – businesses now have the means to collect, store and analyse huge volumes of data, from varied sources, at high frequency, in a very cost-efficient manner – and this hasn’t been possible before.
I recall the days of the first dot-com boom, when trying to capture and store all the detailed data generated by people browsing a website – every click, interaction and page viewed – over a period of more than a month was nigh-on impossible. A client providing share trading services couldn’t hold more than 14 days’ worth of detailed browsing data – so think how difficult it was to generate insights into user behaviour. With the arrival of Big Data, this problem has gone away; it’s possible to keep the detail data for much longer.
So where does an Enterprise Data Warehouse (EDW) fit into the picture? Are we now witnessing the demise of the EDW, to be replaced by ‘Big Data’ systems? In short … no. For an organisation to get value out of its data, it must be able to generate insights quickly, effectively and for as many user groups as possible. For this you need a well-structured Data Warehouse.
In a recent Australian CIO article, ‘Five things CIOs should know about Big Data’, the misinformed idea is presented that ‘Big Data’ somehow allows an organisation to forget all the hard work and thinking that goes into creating a well-constructed EDW. The article suggests that a Big Data implementation will enable:
- Access to data by more than just a handful of highly paid and hard-to-find Data Scientists. Untrue – you will need even more sophisticated data analysis skills if your data is not structured in a logical way, and most people in the organisation do not have those skills.
- Support for all the business questions that can be thrown at it, unlike an EDW, and without the need for any structure. Untrue – a well-designed dimensional data model at the core of the EDW supports a wide variety of business questions, and the data model doesn’t prevent, limit or second-guess those questions. Giving data structure actually makes it easier to navigate and to generate insight from. Good luck if you need to navigate your unstructured Big Data store without your expert guide available!
- As much detail data as the underlying infrastructure can support. True, but you still have to have the means and capability to access that data.
The article goes even further, suggesting that ‘You can use a [Big Data repository] as a dumping ground, and run the analysis on top of it, and discover the relationships later.’ I’ve seen ‘data dumps’ and they are not fun for anyone to use. They typically suffer from extremely poor data quality, poor performance and lack of control – which is exactly why we’ve spent 20 years refining the way Data Warehouses help businesses generate insight!
We believe that ‘Big Data’ and Enterprise Data Warehousing need to co-exist, supporting the need for organisations to generate insight from all data. Big Data provides the deep analytical capability to generate insight from huge volumes of data and transactions that you just wouldn’t need to make available to everybody on expensive hardware, whereas an Enterprise Data Warehouse brings insight from data to as many business users as possible, in a structured and planned way.
Is there a meeting point in the future? We believe there could be – a ‘Big Data 2.0’ where an Enterprise Data Warehouse can take advantage of the infrastructure approach that ‘Big Data’ uses. In the meantime, if your ‘Big Data’ vendor tells you that you don’t need that Data Warehouse any more, come and talk to us at altis Consulting for a more rounded and balanced view.