Netflix Hadoop Big Data Marketing Use Case

Andrew • Jan 28, 2018 • Big Data


Netflix is a video streaming service that holds a wealth of information about its user base: likes, dislikes, general consumer habits, retention lengths and much more.

Netflix uses its big data to commission original programming that it expects to succeed and be accepted in the relevant markets (O'Neill, 2016).

Netflix also runs A/B tests to determine which variant of an otherwise similar item performs better. For example, when showing cover images for series or movies, it randomly shows alternative images to different users to determine which one draws the stronger response from its user base.
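As a minimal sketch of how such a cover-image test might be evaluated, the snippet below compares the click-through rates of two variants with a two-proportion z-test. The function name and the view and click counts are hypothetical, chosen only for illustration; nothing here is taken from Netflix's own tooling.

```python
import math


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Return (z, two-sided p-value) for the difference in click-through rate."""
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled rate under the null hypothesis that both variants perform the same.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical numbers: each cover image is shown to 10,000 randomly chosen users.
z, p = two_proportion_z_test(clicks_a=500, views_a=10_000,
                             clicks_b=600, views_b=10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between the two images is unlikely to be down to chance alone. A real experiment would also need to fix the sample size in advance and account for running many tests at once.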

Volume:
As of Q4 2017, Netflix has around 120 million subscribed users and counting (Statista, 2017). With a steady growth rate year on year, it is important that the company uses its immense data aggregation and analytics to drive new business and support investment into the platform.

The number of titles in Netflix's database varies wildly from country to country. A recent report on the largest licence zone indicated that there are over 1,000 television shows and almost 5,000 movies in the United States database alone.

As millions of users watch the service globally, activity adds up to roughly 400 billion events a day and 8 million events every second (MindSight, n.d.).
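A quick back-of-the-envelope check puts the daily and per-second figures quoted above side by side. The reading that the per-second number is a peak rather than an average is an assumption made here, not something the cited source is quoted as saying.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

daily_total = 400_000_000_000   # quoted daily figure
quoted_per_second = 8_000_000   # quoted per-second figure

# Average rate implied by the daily figure alone.
average_per_second = daily_total / SECONDS_PER_DAY
ratio = quoted_per_second / average_per_second

print(f"average per second: {average_per_second:,.0f}")  # average per second: 4,629,630
print(f"quoted rate / average: {ratio:.2f}")             # quoted rate / average: 1.73
```

The daily total works out to about 4.6 million per second on average, so the quoted 8 million per second is best read as a busy-period rate of roughly 1.7 times that average.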

Variety:
While a lot of the data is structured, such as show categorisations, actor information or user ratings, there is a massive amount of unstructured information that needs to be processed, such as general analytics, A/B test results, and play and resume times per user on each title.

Even though the concept of a streaming service is quite simple, the many enhancements layered on top of it make the implementation much more complex.

HDFS is a distributed file system that handles large data sets running on commodity hardware (IBM, 2018). Each cluster has a single NameNode, which manages the file system namespace and regulates client access to files. A cluster can contain any number of DataNodes, usually one per machine, and each DataNode manages the storage attached to the machine it runs on. Files are split into blocks, and each block is replicated across several DataNodes to guarantee a high level of fault tolerance and availability.
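The toy model below sketches the idea of a NameNode that tracks which DataNodes hold each block. It is not the HDFS API: the class, method names and file path are hypothetical, and the round-robin placement is a simplification of the rack-aware placement real HDFS uses. The 128 MB block size and replication factor of 3 match the Hadoop 2.x defaults.

```python
import math
from collections import defaultdict

BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, the default HDFS block size in Hadoop 2.x
REPLICATION = 3                 # the default HDFS replication factor


class ToyNameNode:
    """Holds metadata only: which DataNodes store each block of each file."""

    def __init__(self, datanodes):
        self.datanodes = list(datanodes)
        self.block_map = defaultdict(list)  # file path -> list of replica lists
        self._next = 0                      # round-robin cursor (a simplification)

    def add_file(self, path, size_bytes):
        replicas = min(REPLICATION, len(self.datanodes))
        for _ in range(math.ceil(size_bytes / BLOCK_SIZE)):
            holders = [self.datanodes[(self._next + i) % len(self.datanodes)]
                       for i in range(replicas)]
            self.block_map[path].append(holders)
            self._next += 1

    def readable(self, path, failed):
        """A file is readable if every block has a replica on a live DataNode."""
        return all(any(dn not in failed for dn in holders)
                   for holders in self.block_map[path])


namenode = ToyNameNode(["dn1", "dn2", "dn3", "dn4"])
namenode.add_file("/logs/plays.csv", 300 * 1024 * 1024)  # 300 MB -> 3 blocks

print(namenode.readable("/logs/plays.csv", failed={"dn1", "dn2"}))         # True
print(namenode.readable("/logs/plays.csv", failed={"dn1", "dn2", "dn3"}))  # False
```

With three replicas per block, the file survives the loss of any two DataNodes. It only becomes unreadable once every replica of some block sits on a failed machine.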

Pig provides a high-level language called Pig Latin, which lets the operator run SQL-like queries on the data without writing and executing complex Java applications to retrieve meaningful results. Pig translates Pig Latin scripts into MapReduce jobs, which are then run automatically against the data.
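To give a feel for that translation, the sketch below mimics in plain Python the map, shuffle and reduce steps that a simple grouping query boils down to. The Pig Latin script in the comment, the file name, the field names and the sample records are all hypothetical, and the plan Pig actually generates is more involved than this.

```python
from collections import defaultdict

# A Pig Latin script along these lines (file and field names are made up):
#
#   plays    = LOAD 'plays.tsv' AS (user:chararray, title:chararray);
#   by_title = GROUP plays BY title;
#   counts   = FOREACH by_title GENERATE group, COUNT(plays);
#
# corresponds roughly to the three MapReduce steps below.


def map_phase(records):
    """Emit a (title, 1) pair for every play record."""
    for _user, title in records:
        yield title, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values for each key to get a play count per title."""
    return {key: sum(values) for key, values in grouped.items()}


plays = [("u1", "Title A"), ("u2", "Title B"), ("u3", "Title A")]
counts = reduce_phase(shuffle(map_phase(plays)))
print(counts)  # {'Title A': 2, 'Title B': 1}
```

The three lines of Pig Latin stand in for all of this plumbing, which is the main appeal of the language for analysts.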

In marketing, big data is providing insights into which content is most effective at each stage of a sales cycle (Columbus, 2016).

Never before has this amount of information been available for teams of analysts to scrutinise in order to drive feature improvement and customer understanding in business.

References

O'Neill, E. (2016) 10 companies that are using big data [Online] ICAS.com, Available from: https://www.icas.com/ca-today-news/10-companies-using-big-data (Accessed on 27th January 2018)

Statista (2017) Number of Netflix streaming subscribers worldwide from 3rd quarter 2011 to 4th quarter 2017 (in millions) [Online] Statista.com, Available from: https://www.statista.com/statistics/250934/quarterly-number-of-netflix-streaming-subscribers-worldwide/ (Accessed on 27th January 2018)

MindSight (n.d.) How Netflix Uses Big Data To Drive Big Business [Online] GoMindSight.com, Available from: https://www.gomindsight.com/blog/how-netflix-uses-big-data-to-drive-big-business/ (Accessed on 27th January 2018)

IBM (2018) What is HDFS? [Online] IBM.com, Available from: https://www.ibm.com/analytics/hadoop/hdfs (Accessed on 28th January 2018)

Columbus, L. (2016) Ten Ways Big Data Is Revolutionizing Marketing And Sales [Online] Forbes.com, Available from: https://www.forbes.com/sites/louiscolumbus/2016/05/09/ten-ways-big-data-is-revolutionizing-marketing-and-sales/#5686d00921cf (Accessed on 26th January 2018)

