Big data is a term applied to datasets whose size or type is beyond the ability of traditional relational databases to capture, manage, and process with low latency (IBM, 2018).
The V’s of Big Data
Such datasets share one or more of the common V’s.
Value is often added to the core list as a driver towards a more rounded architecture (DeVan, 2016).
This data comes from sensors, devices, video/audio, networks, log files, transactional applications, web, and social media, much of it generated in real time and on a large scale (IBM, 2018).
Data collection and analysis have been growing exponentially over the past few decades. Many traditional tools and technology stacks can no longer store or retrieve specific data at this size and volume.
Newer technologies have arisen to address this continual growth under the banner of Big Data and Big Data Tooling (NT, 2017).
Use cases in the wild
Apache Hadoop, HBase and similar projects pioneered an approach in which commodity computers are networked together to form a highly distributed file system. Such a cluster can keep growing as machines are added, with advanced features such as fault tolerance and redundancy built right into the core.
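The redundancy idea can be illustrated with a minimal sketch. This is not how Hadoop itself places data (its placement policy is more involved); the function names `place_replicas` and `surviving_replicas`, the node names, and the replication factor of 3 are all assumptions made for illustration. Each block of a file is copied to several machines, so losing one machine does not lose the block.

```python
import hashlib


def place_replicas(block_id, nodes, replication_factor=3):
    """Pick distinct nodes to hold copies of one block (illustrative only)."""
    if replication_factor > len(nodes):
        raise ValueError("not enough nodes for the requested replication factor")
    # A stable hash means the same block always starts at the same node.
    digest = hashlib.sha256(block_id.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(nodes)
    # Walk the node list from the starting point, wrapping around the end.
    return [nodes[(start + i) % len(nodes)] for i in range(replication_factor)]


def surviving_replicas(replicas, failed_nodes):
    """Return the replicas that are still reachable after some nodes fail."""
    return [node for node in replicas if node not in failed_nodes]


# Hypothetical five-machine cluster.
nodes = ["node-a", "node-b", "node-c", "node-d", "node-e"]
replicas = place_replicas("block-0001", nodes)

# Simulate the loss of one machine that held a copy.
alive = surviving_replicas(replicas, failed_nodes={replicas[0]})
print("replicas:", replicas)
print("still readable from:", alive)
```

With three copies per block, a single machine failure still leaves two readable copies, which is the property that lets a cluster of unreliable commodity hardware behave as reliable storage.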
It is estimated that 40 zettabytes of data will be created by 2020, an increase of 300 times from 2005.
Every day we create 2.5 quintillion bytes of data; every minute 72 hours of video are uploaded to YouTube, 216,000 Instagram posts are created and over 204 million emails are sent (IBMBigDataHub, 2018).
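To put these figures on a common scale, the short calculation below converts them into exabytes and per-day rates. It assumes decimal (SI) units, where one exabyte is 10^18 bytes and one zettabyte is 10^21 bytes, and it derives the 2005 figure purely from the quoted 300-fold increase; the variable names are illustrative.

```python
# Decimal (SI) unit sizes in bytes (an assumption about the units used).
EXABYTE = 10**18
ZETTABYTE = 10**21
MINUTES_PER_DAY = 60 * 24

# Figures quoted in the text above.
bytes_by_2020 = 40 * ZETTABYTE
growth_factor = 300
bytes_per_day = 25 * 10**17  # 2.5 quintillion, written as an integer
video_hours_per_minute = 72
instagram_posts_per_minute = 216_000
emails_per_minute = 204_000_000

# Baseline implied by "an increase of 300 times from 2005".
baseline_2005_exabytes = bytes_by_2020 / growth_factor / EXABYTE
daily_exabytes = bytes_per_day / EXABYTE

# Scale the per-minute rates up to a full day.
video_hours_per_day = video_hours_per_minute * MINUTES_PER_DAY
instagram_posts_per_day = instagram_posts_per_minute * MINUTES_PER_DAY
emails_per_day = emails_per_minute * MINUTES_PER_DAY

print(f"Implied 2005 volume: {baseline_2005_exabytes:.1f} EB")
print(f"Data created per day: {daily_exabytes} EB")
print(f"Video uploaded per day: {video_hours_per_day:,} hours")
print(f"Instagram posts per day: {instagram_posts_per_day:,}")
print(f"Emails sent per day: {emails_per_day:,}")
```

On these assumptions, the implied 2005 volume is roughly 133 exabytes, the daily creation rate is 2.5 exabytes, and the per-minute YouTube figure scales to 103,680 hours of video per day.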
It is important to note that around 90 percent of all generated data is unstructured, that is, without a clearly defined, consistent schema.
Without a way to store, analyse and retrieve relevant information quickly, this data will be entirely useless in the long run.
With the advent of distributed technologies to store and analyse massive datasets, many companies and governments now use Big Data to identify trends across a wide range of industries and consumer habits.
Data is often collected from multiple sources, ingested in both normalised and unnormalised schemas, and sharded across networked machines for later analysis and reporting. More granular data is then often fed into downstream databases to drive specific lead actions.
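As a rough sketch of what sharding means here, the example below routes records from different sources onto a fixed number of shards by hashing a key. The record fields, the `routing_key` and `shard_for` helpers, and the shard count of 4 are hypothetical; production systems typically use more elaborate schemes, but the principle of deterministic placement is the same.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical number of machines


def shard_for(key, num_shards):
    """Map a key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def routing_key(record):
    """Choose a key even though records do not share a consistent schema."""
    return record.get("user_id") or record.get("device_id") or record["source"]


# Records from several sources, each with a different set of fields.
records = [
    {"source": "web", "user_id": "u1", "page": "/home"},
    {"source": "sensor", "device_id": "d7", "reading": 21.5},
    {"source": "web", "user_id": "u1", "page": "/checkout"},
    {"source": "social", "user_id": "u2", "text": "hello"},
]

# Group records by the shard they hash to.
shards = defaultdict(list)
for record in records:
    shards[shard_for(routing_key(record), NUM_SHARDS)].append(record)

for shard_id in sorted(shards):
    print(shard_id, shards[shard_id])
```

Because the hash is stable, every record for the same user lands on the same shard, so later analysis of that user's activity only needs to read from one machine.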
Big Data usages
Big Data is being used in a wide variety of ways:
- Understand and target customers
- Optimise business processes and practices
- Improve healthcare and public health
- Improve science and research
- Optimise machine training
- Improve law enforcement
- Financial trading (Marr, 2017)
Even a single one of these, such as "Optimising machine training", shows huge signs of gain worldwide.
Autonomous vehicles and self-driving cars feed their data back into centralised Big Data platforms that then help other vehicles learn from obstacles, roads, incidents and patterns for safer journeys across the entire network.
Big Data is the way of the future. It is vital that we optimise the rules without making them too constrained, in order to keep up with what the future holds.
IBM (2018) What is Big Data Analytics? [Online] IBM.com, Available from: https://www.ibm.com/analytics/hadoop/big-data-analytics (Accessed on 12th January 2018)
NT, Baiju. (2017) 11 interesting Big Data case studies in Telecom [Online] Bigdata-madesimple.com, Available from: http://bigdata-madesimple.com/11-interesting-big-data-case-studies-in-telecom/ (Accessed on 13th January 2018)
IBMBigDataHub (2018) Extracting business value from the 4 V’s of big data [Online] IBMBigDataHub.com, Available from: http://www.ibmbigdatahub.com/infographic/extracting-business-value-4-vs-big-data (Accessed on 13th January 2018)
Marr, B. (2017) How is Big Data Used in Practice? 10 Use Cases Everyone Must Read [Online] BernardMarr.com, Available from: https://www.bernardmarr.com/default.asp?contentID=1076 (Accessed on 13th January 2018)