OK, let’s make it simple. Imagine data is like water. You have built a dam to hold this water. But then the weather turns unpredictable, the sky clouds over and it starts raining cats and dogs. Your dam soon fills up and is overwhelmed, and water spills over with unimaginable consequences. That is a Big Data scenario. The internet is that cloud, pouring petabytes of data every second, and it is reshaping businesses as we speak.
However, the minute we talk about data, the first person who comes to mind is the guy who collects it. Data has always been the remit of IT people: they collect it, they store it and they take responsibility for it.
Transputec has been in the business for 30 years, collecting, storing and transferring data for IT companies or on their behalf. In that time we have seen IT evolve into a space where the data coming into the servers has grown enormously and is little understood. Even before it became enormous, however, businesses had to put up with unfiltered data of very poor quality and with slow, heavy, batch-orientated aggregations. IT takes ownership of the business warehouse, but IT is also too busy fighting this fire, so even small data already presents an enormous challenge. On top of this, the data is now growing to gigantic proportions and traditional IT finds itself overwhelmed, whilst the real owners of the data are absent. As we see it, IT is never the owner of this data, but rather a custodian.
At the data source level, it is important to find the right question to answer. For example, there is no point studying the behaviour of chickens in the evening for someone who is in the soft drinks business. Big Data has a lot of answers; you just need to ask the right questions!
The real value of Big Data is in the hands of those who own it, i.e. the data analysts and visualisers. The industry today suffers from a lack of Big Data analytical expertise, a very poor perception of data science and no integration with business processes. Moreover, the owners do not realise its potential.
For example, if you are using R for your Big Data regression, the glmulti package provides efficient subset selection for vlm, but it can use only 30 features at a time. Its performance also goes down as the number of rows and the size of your Big Data increase. Naturally, the data scientist complains to the IT Director that his computer or server is slowing down. But the IT Director does not have an answer to this problem. This is a brand new problem for IT. That is why CFOs should not base their ROI calculations on traditional IT models.
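To give a feel for why subset selection becomes expensive as features are added, here is a minimal sketch in Python. It is not how glmulti works internally; it only assumes a plain exhaustive search in which every feature is either in or out of the model, so the number of candidate models doubles with each extra feature. The function name `candidate_models` is hypothetical and used for illustration only.

```python
def candidate_models(n_features: int) -> int:
    """Count the feature subsets an exhaustive search would have to consider.

    Assumption: main effects only, each feature is either included or
    excluded, and the empty (intercept-only) model counts as one candidate.
    """
    return 2 ** n_features


if __name__ == "__main__":
    # Each extra feature doubles the number of candidate models.
    for n in (10, 20, 30, 40):
        print(f"{n} features -> {candidate_models(n):,} candidate models")
```

Under this assumption, 30 features already mean more than a billion candidate models, and every one of them has to be fitted against all the rows of the data. That is a workload problem, not something a faster server alone is likely to solve.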
So who are the real owners of the data? The truth is that they differ from industry to industry, but for most industries Big Data is not owned by IT. For example, in FMCG the owners of Big Data are the brand directors, marketing directors and sales directors, who can make use of analytics based on Big Data intelligence, whereas in the heavy industry sectors it is the chief executives. In economics and government, the owners could be the economists and strategists. As mentioned previously, IT is merely the custodian.
Traditionally, the supposed owners of the data (whom we have called the custodians) are not aware of what the data is doing or how to make sense of it.
There is a story of a frog which was living in a well. One day a big frog which lived in the ocean found himself in the well by accident.
They had a conversation in which the big frog explained that he came from the ocean. The small frog could not understand what an ocean was, so he began to compare it to his well. “Is it twice the size of my well? No? Ten times the size of my well?” He simply could not visualise the vastness of the ocean.
Today’s conversations surrounding Big Data can be compared to this one: those who are used to traditional small data sources such as OLTP and structured data sets cannot comprehend unstructured NoSQL Big Data, or even the terms surrounding Big Data such as MapReduce. As data scientists only understand regular data, they may be unable to visualise the potential of Big Data. What they cannot visualise, they cannot comprehend.
Imagine a data scientist goes back into his office and says, “Okay, I want to access Big Data”. Brace yourselves: it’s like imagining the little frog’s well getting flooded by the ocean. Can you visualise that? The data not only fills the well but attacks like a tsunami.
In order to make sense of Big Data, businesses need to have infrastructure and applications in place. In other words, an initial investment needs to be made. However, the risk is that the data is so overwhelming that it confuses business leaders.
So even before Big Data is approached, the following things need to be considered:
1. Identify which sources of Big Data you want to access.
2. Implement the methods and tools to capture that data.
3. Have a business question.
Just as an ocean holds many kinds of fish, Big Data is full of information of many different types.
So prioritising the information that you want or need to know is key. This is why I say that Big Data is not useless, it is “usedless”: those who are exposed to it cannot comprehend it, and they are not its owners. It seems that:
Those who understand Big Data do not have the necessary skill set to make sense of it.
Those who can use and understand it often lack the investment backing.
Those who have the investment backing may fail to implement a proper business strategy around it.
As a result, Big Data initiatives are suffering at various levels.
Because the investor is the CFO, the ROI may be calculated using traditional ROI models, which are IT-investment centric. When a company’s IT director presents his budget for a server, he is able to justify the need because he is its end user. Big Data, by contrast, supplies Business Intelligence, analytics and a whole range of new possibilities, and is beyond the imagination of the regular IT investment goal.
So we say that the investment proposition for Big Data should use different ROI models from the traditional IT-centric ones. The decision on Big Data should be taken by the company as a whole, perhaps by a real owner such as the CEO, and not by the owners of the IT silo. Big Data decisions are big, and therefore should neither be taken by nor sold to the owners of IT.
But what exactly are you trying to do with Big Data? That question is often missing. Yet we can see success stories of companies that were able to harness Big Data. One major example is Palantir, which is valued at an estimated $15b. It was a Big Data start-up in 2004, so within 10 years we have seen the value of a start-up in the Big Data space rise to $15 billion. If we analyse Palantir’s success, all three points we proposed earlier come into the picture:
1. Palantir knew what its sources of information were going to be: it sells software that does Big Data link analysis.
2. They implemented the correct tools to capture the data.
3. They have a particular business question: finding fraudulent behaviour (addressing a simple business problem, fraud prevention).
A simple question with a $15b value!
So the example of Palantir shows that with the right question, the right skill set and the right sources of information, Big Data can really get big, even in terms of ROI.
Businesses are asking the wrong people the questions: IT guys can't visualise it, so they push it back.
So if you want to sell Big Data, don’t sell it to the IT guys.
Madhava Kumar Turumella
Chief Executive Officer (CEO), Big Data Services
Madhava Kumar Turumella is a graduate in Business Studies and a postgraduate in Computer Applications. Madhava brings extensive knowledge of implementing business processes. A technology visionary, business leader and statistician, Madhava is renowned for his experience in directing and sourcing data collection operations. He has more than 24 years of leadership experience working with large multinational corporations. He speaks many languages, including German and English. He is a community volunteer and has received many prestigious awards for his community volunteering activities.