Big Data and the 3Vs: What is the fourth ‘V’ and what are the implications of not embracing it?
The term Big Data is commonly associated with three Vs that define its properties or dimensions: Volume, Variety and Velocity. Volume refers to the amount of data, Variety to the number of types of data, and Velocity to the speed of data processing. According to the 3Vs model, the challenges of Big Data management result from the expansion of all three properties, rather than from volume (the sheer amount of data to be managed) alone.
Extracting insights and value from Unstructured Data
Along with the 3Vs, the ability to manage and extract insights from unstructured data, or unstructured information, is essential to effective Big Data deployment. The term refers to information that either does not have a pre-defined data model or is not organised in a pre-defined manner. Unstructured information is typically text-heavy, but may also contain data such as dates, numbers and facts.
It is commonly estimated that more than 80% of today’s information is unstructured, and it is typically too large to manage effectively. Unstructured data is also complex and hard to represent in a simple way. Yet companies want to be able to combine their data and analyse it to gain new insights. For example, organisations in different industries have combined geospatial vessel location data with weather and news data to make real-time, mission-critical decisions.
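To make the vessel example concrete, here is a minimal sketch in Python of how two such sources might be combined. It is an illustration only: the record types, field names, the one-degree grid and the wind threshold are all assumptions made for this example, not a description of any real system.

```python
import math
from dataclasses import dataclass


@dataclass
class VesselPosition:  # hypothetical record from a vessel-tracking feed
    vessel_id: str
    lat: float
    lon: float


@dataclass
class WeatherReading:  # hypothetical record from a weather feed
    lat: float
    lon: float
    wind_knots: float


def grid_cell(lat: float, lon: float) -> tuple:
    """Snap a coordinate to a 1-degree grid cell (an illustrative choice)."""
    return (math.floor(lat), math.floor(lon))


def vessels_in_high_wind(positions, weather, threshold_knots):
    """Return the IDs of vessels whose grid cell reports wind at or above the threshold."""
    # Index the weather feed by grid cell so each vessel lookup is a dictionary hit.
    wind_by_cell = {grid_cell(w.lat, w.lon): w.wind_knots for w in weather}
    flagged = []
    for p in positions:
        wind = wind_by_cell.get(grid_cell(p.lat, p.lon))
        # Vessels in cells with no weather reading are skipped rather than guessed at.
        if wind is not None and wind >= threshold_knots:
            flagged.append(p.vessel_id)
    return flagged
```

Neither feed needs to know about the other; the shared grid cell is the only thing that links them, which is the kind of loose combination of sources the paragraph above describes.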
Aligning your Big Data programme with Business Objectives
A Big Data programme can help an organisation master these dimensions of a problem: that is, to become more efficient, proactive and ultimately predictive in dealing with business challenges. Companies are increasing their investment in Big Data strategies and delivery platforms as they realise these are critical to their business success. Key business imperatives include the ability to:
Develop and monetise deeper business and customer insights
However, gaining a 360-degree view of your business and of the market through data analysis has been a challenge for many years. Until recently, the solution generally involved significant spending on powerful data warehouses and relational database management systems, which required painstaking data preparation in order to answer pre-determined questions.
What happens when the questions change, as they often do? How quickly can you respond to the unknown? A new question may lead to months of expensive schema redesign or, worse, to going back to the source systems to gather data not captured by the existing warehouse schema. It may also lead to a lengthy upgrade to a larger, more expensive system if much more data is needed to answer it. The fact that everything must first conform to a rigid schema design (‘schema on write’) is a big hurdle. Unfortunately, the cost, time and effort involved are inherent in the schema on write approach.
Why not just store large quantities of data in the most efficient way, which is its natural form? And why not refine the structure as you query or explore it, adapting to the questions as you go (‘schema on read’)? While the growth and speed of data have largely been addressed by the arrival of scale-out solutions such as Hadoop and some NoSQL databases, the real game changer lies in supporting the second ‘V’: Variety. Flexible data structures and schema on read are where things get really exciting.
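The idea of schema on read can be sketched in a few lines of Python. This is a simplified illustration, not how Hadoop or any particular NoSQL database implements it: the function name, the JSON-lines storage format and the sample field names are assumptions made for the example.

```python
import json


def read_with_schema(raw_lines, schema):
    """Apply a schema at query time.

    raw_lines: records stored verbatim as JSON strings (nothing enforced on write).
    schema:    maps each field of interest to a converter, e.g. {"amount": float}.
    """
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        row = {}
        for field, convert in schema.items():
            value = record.get(field)
            # Fields absent from a record become None instead of rejecting the record.
            row[field] = convert(value) if value is not None else None
        rows.append(row)
    return rows


# Records are kept exactly as they arrived, even though their shapes differ.
raw = [
    '{"customer": "acme", "amount": "12.50"}',
    '{"customer": "globex", "amount": "3", "channel": "web"}',
]

# Today's question: how much did each customer spend?
spend = read_with_schema(raw, {"customer": str, "amount": float})

# Tomorrow's question needs a field nobody planned for; no redesign or reload is required.
channels = read_with_schema(raw, {"customer": str, "channel": str})
```

The trade-off in this sketch is that validation moves from load time to query time: a malformed value only surfaces when a query asks for it, which is the price paid for the flexibility.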
The Fourth and the Most Important ‘V’ in Big Data Deployment
Whilst solving the challenges around the 3Vs is the focus of many Big Data solutions, realising a fourth V, Value, is key. Unless you extract tangible business value from your Big Data solution, it is irrelevant how well the other three Vs are managed.
To get maximum value from a Big Data strategy, you need to be organised internally, and your entire organisation needs at least a baseline understanding of what that strategy could deliver.
Most people in your organisation will understand that Big Data offers the ability to become more efficient, proactive and predictive when trying to overcome well-trodden business issues. However, many will have worked at organisations that failed to find value in Big Data, and that may have spent vast amounts of money on powerful data warehouses and relational database management systems to analyse data and answer pre-determined questions.
Big Data does indeed offer amazing opportunities, as both structured and unstructured data can be combined and viewed from multiple perspectives, revealing new insights and helping organisations find novel solutions to complex problems. Leading organisations will increasingly scale their programmes to be cross-functional, combining data analytics with other applications and embedding intelligence in every process. Fundamentally, the need is for improved business insight and speed. These are complex challenges to solve as data velocity, variety and volume continue to grow, and companies want to integrate and exploit new and legacy data sources.