What are Real-time Systems?
If you look up the term “real-time” in a dictionary, it will tell you that it is the actual time during which a process takes place or an event occurs (Dictionary.com, n.d.).
In computing, the term usually describes a system in which, once an event is created, a data service can respond almost immediately with the newly updated event in its index.
What is a Timing issue?
Timing is arguably the most crucial aspect of achieving a truly real-time system, because of what such a system is trying to do: react to events as data points as soon as they have happened.
For this to be possible, the following steps need to happen:
- An event is triggered
- The event data is sent to a collector service
- The collector service extracts the event's metadata and stores it immediately in a centralised database
- Additional processing is then done on the data, and the result is stored along with the metadata
- The data is then immediately available in a master index that the application layer can query in near real time
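The steps above can be sketched in a few lines of Python. This is only a minimal in-memory illustration, not how any particular analytics product is built: the `Collector` class, its `collect`, `process` and `query` methods, and the event fields (`event_id`, `site`, `payload`) are all hypothetical names, and a list and a dictionary stand in for the centralised database and the master index.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Collector:
    """Hypothetical in-memory stand-in for the collector, database and index."""

    database: list = field(default_factory=list)  # stands in for the centralised database
    index: dict = field(default_factory=dict)     # stands in for the master index

    def collect(self, event: dict) -> dict:
        # Extract the metadata and store it straight away.
        metadata = {
            "event_id": event["event_id"],
            "site": event["site"],
            "received_at": time.time(),
        }
        record = {"metadata": metadata, "processed": None}
        self.database.append(record)

        # Additional processing, stored alongside the metadata.
        record["processed"] = self.process(event["payload"])

        # Make the record visible to the application layer.
        self.index.setdefault(metadata["site"], []).append(record)
        return record

    @staticmethod
    def process(payload: dict) -> dict:
        # Placeholder processing (an assumption): normalise the page path.
        return {"path": payload.get("path", "/").lower()}

    def query(self, site: str) -> list:
        # What the application layer would read from the index.
        return self.index.get(site, [])


collector = Collector()
collector.collect({"event_id": 1, "site": "example.com", "payload": {"path": "/Home"}})
print(collector.query("example.com")[0]["processed"])  # {'path': '/home'}
```

In a real deployment each of these stages would be a separate service, and the time between `collect` being called and the record appearing in `query` is the latency that determines how "real-time" the system feels.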
Having personally created a real-time analytics platform over the past few years (AO Analytics, n.d.), I can speak to the challenges and the number of moving parts involved in such a complex, interconnected setup.
In the web analytics space the flow works as follows in its simplest form:
The highest demand on an analytics tracker service is the scaling required to handle every website it is installed on. The service has to cater for the combined traffic of all of its customers, so the load grows multiplicatively with the number of sites and the traffic each one receives.
This means that if you track a thousand websites that each get a thousand visitors, the service needs to be able to handle a million (1000 × 1000) visitors.
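The same arithmetic can be written as a small back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that the thousand visitors per site arrive over one day; the text above does not state a time window, and the function names are hypothetical.

```python
def total_visitors(sites: int, visitors_per_site: int) -> int:
    """Combined visitors the tracker has to handle across all sites."""
    return sites * visitors_per_site


def average_rate_per_second(total: int, window_seconds: int = 86_400) -> float:
    """Average arrival rate, assuming the total is spread over one day."""
    return total / window_seconds


total = total_visitors(1000, 1000)
print(total)                                     # 1000000
print(round(average_rate_per_second(total), 2))  # 11.57
```

An average like this understates the problem, because traffic is rarely spread evenly: the service has to be sized for its peaks rather than its mean.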
Concurrency often becomes one of the largest issues in this case, and the single biggest pitfall is the database itself (Johan Nilsson, n.d.): it needs to handle the concurrent traffic and also index all incoming requests, so that the data is available to every request that follows.
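To make the contention concrete, here is a minimal sketch in which many concurrent writers insert into one shared, indexed store. The `IndexedStore` class is a hypothetical stand-in for a single database node, not a real database client; the lock plays the role of whatever the database does internally to keep its rows and index consistent.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class IndexedStore:
    """Hypothetical single-node store with one index, guarded by one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._rows = []
        self._by_site = {}

    def insert(self, site: str, visitor_id: int) -> None:
        # Row and index are updated together so later reads see both.
        with self._lock:
            self._rows.append((site, visitor_id))
            self._by_site.setdefault(site, []).append(visitor_id)

    def count(self, site: str) -> int:
        with self._lock:
            return len(self._by_site.get(site, []))


store = IndexedStore()
with ThreadPoolExecutor(max_workers=32) as pool:
    for i in range(10_000):
        pool.submit(store.insert, f"site-{i % 10}", i)

# Leaving the "with" block waits for every submitted insert to finish.
print(store.count("site-0"))  # 1000
```

Every insert has to pass through the same lock, so adding more writer threads does not add write capacity. That is the single-database pitfall in miniature.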
A well-established way to address this is database clustering: a multi-master database accepts the writes, and numerous read cluster members serve the data lookups once the data has been added.
When programming such an application, it is important to remember that general software engineering principles still apply, but performance and data integrity are the highest concerns.
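At the application level, clustering of this kind usually shows up as a routing decision: writes go to a master, reads go to a read member. The sketch below shows that idea with simple round-robin selection. `ClusterRouter` and the node names are hypothetical, and a real cluster would also have to deal with replication lag, failover and conflict resolution between masters, none of which is modelled here.

```python
import itertools


class ClusterRouter:
    """Hypothetical router: writes go to masters, reads go to read members."""

    def __init__(self, masters: list, replicas: list):
        if not masters or not replicas:
            raise ValueError("need at least one master and one read member")
        # Round-robin over each group so load is spread evenly.
        self._masters = itertools.cycle(masters)
        self._replicas = itertools.cycle(replicas)

    def node_for_write(self) -> str:
        return next(self._masters)

    def node_for_read(self) -> str:
        return next(self._replicas)


router = ClusterRouter(["master-1", "master-2"], ["read-1", "read-2", "read-3"])
print([router.node_for_write() for _ in range(3)])  # ['master-1', 'master-2', 'master-1']
print([router.node_for_read() for _ in range(4)])   # ['read-1', 'read-2', 'read-3', 'read-1']
```

Separating the two paths means read capacity can be increased by adding read members without touching the write path.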
Johan Nilsson, Bjorn Wittenmark, Martin Turngren, Martin Sanfridson – Timing Problems in Real-Time Control Systems (n.d.) – Available from: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.48.6989&rep=rep1&type=pdf
Statvoo Analytics (n.d.) – Available from: https://analytics.statvoo.com