A Brief Talk on Databases


Databases have existed since ancient times, when elaborate record-keeping systems were maintained by governments, hospitals and other organisations to track vast amounts of information; it was not until around the 1960s, however, that computerised databases began to take shape and become cost-effective for private organisations to use (Quickbase, n.d.).

I have used countless databases over my career to reliably store and retrieve data at many scales, from small collections of fewer than ten items to terabytes' worth of hundreds of billions of data points spread across clusters of machines around the world.

Databases come in all shapes and sizes and are invaluable to any organisation or entity, of whatever size, that needs to keep track of information it can trust.

One of the best examples of database usage I have been involved in was a cloud-based, real-time analytics service that started off with a single MySQL relational database to collect visitor information for any website that had signed up to the service.

When an individual visited a website with the service installed, a script would send the visitor's unique information, including but not limited to their Internet Protocol (IP) address, browser details, geographical location and service provider, to a collection resource that inserted the data into the database. The visitor was then monitored every five to ten seconds to see whether they were still on the website, and the record was updated each time they interacted with the website or navigated to a new page.
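
As a rough illustration, the collection step might have looked something like the following Python sketch. The visits table, its columns and the pymysql driver are all assumptions made for this example; they are not the actual schema of the service.

    import pymysql

    # Connect to the analytics database (host and credentials are placeholders).
    conn = pymysql.connect(host="localhost", user="analytics",
                           password="secret", database="analytics")

    def record_hit(conn, site_id, visitor_id, ip, browser, country, isp, page):
        """Insert a new visit, or refresh the row if the visitor is already known.

        Assumes a UNIQUE key on (site_id, visitor_id) so that the periodic
        polls update last_seen instead of creating duplicate rows.
        """
        with conn.cursor() as cur:
            cur.execute(
                """
                INSERT INTO visits
                    (site_id, visitor_id, ip, browser, country, isp, page, last_seen)
                VALUES (%s, %s, %s, %s, %s, %s, %s, NOW())
                ON DUPLICATE KEY UPDATE page = VALUES(page), last_seen = NOW()
                """,
                (site_id, visitor_id, ip, browser, country, isp, page),
            )
        conn.commit()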

Website owners would then be able to see who was visiting their websites, from what countries and which pages were most popular.
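
A dashboard query for a site's most popular pages, for instance, might be a simple GROUP BY over the hypothetical visits table above; again a sketch, reusing a pymysql connection like the one opened earlier:

    def popular_pages(conn, site_id, limit=10):
        """Return (page, hits) pairs for a site, busiest pages first."""
        with conn.cursor() as cur:
            cur.execute(
                """
                SELECT page, COUNT(*) AS hits
                FROM visits
                WHERE site_id = %s
                GROUP BY page
                ORDER BY hits DESC
                LIMIT %s
                """,
                (site_id, limit),
            )
            return cur.fetchall()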

Once a day, the data was then analysed and aggregated into large count tables, providing historical information for reports and faster lookups when querying older data.
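
Such a nightly roll-up can be expressed as a single INSERT ... SELECT; the daily_page_counts table below is, once more, an illustrative guess rather than the real one:

    def rollup_daily_counts(conn, day):
        """Aggregate one day's raw visits into a compact count table."""
        with conn.cursor() as cur:
            cur.execute(
                """
                INSERT INTO daily_page_counts (site_id, page, day, hits)
                SELECT site_id, page, DATE(last_seen), COUNT(*)
                FROM visits
                WHERE DATE(last_seen) = %s
                GROUP BY site_id, page
                """,
                (day,),
            )
        conn.commit()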

This system worked extremely well but gradually slowed as the data grew, so certain features were redesigned to make less costly use of SQL JOINs between tables and to add appropriate INDEXes where necessary.
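
Adding an index to match a hot query is often a one-line change. A composite index such as the one below (my guess at what would help the grouping query sketched earlier, not the actual change that was made) lets MySQL resolve the per-site page counts without scanning the whole table:

    def add_page_index(conn):
        """Create a composite index covering the WHERE and GROUP BY columns."""
        with conn.cursor() as cur:
            cur.execute(
                "CREATE INDEX idx_visits_site_page ON visits (site_id, page)"
            )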

There were also additional databases supporting this main system; for example, geographical information corresponding to IP address ranges was kept in a NoSQL solution called MongoDB, used as an object store for almost instantaneous key-based retrieval.
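
A lookup against such a store might resemble the following pymongo sketch, where the geo database, the ip_ranges collection and its fields are all assumed for illustration; IP addresses are stored as integers so a containment query can find the matching range:

    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    geo = client["geo"]

    def country_for_ip(ip_as_int):
        """Find the range document whose [start, end] span contains the address."""
        doc = geo.ip_ranges.find_one(
            {"start": {"$lte": ip_as_int}, "end": {"$gte": ip_as_int}}
        )
        return doc["country"] if doc else None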

PowerDesigner (Jacikevicius, 2015), MySQL Workbench (MySQL, n.d.) and Percona Toolkit (Percona, 2017) proved to be very useful data modelling tools, helping to produce an elegant and reliable picture of how the database would ultimately be structured and used, with many added benefits besides.

Databases are among the few components of a software system that can, all by themselves, have a significant impact on the system's overall speed and reliability.

It has been said that the nastiest performance bottleneck is often the database (Gualtieri, 2011).

References

Quickbase (n.d.) A Timeline of Database History [Online] Quickbase.com, Available from: http://www.quickbase.com/articles/timeline-of-database-history (Accessed on 27th October 2017)

Jacikevicius, Z. (2015) Top 6 Data Modeling Tools [Online] DataScienceCentral.com, Available from: https://www.datasciencecentral.com/profiles/blogs/top-6-data-modeling-tools (Accessed on 27th October 2017)

MySQL (n.d.) MySQL Workbench 6.3 – Enhanced Data Migration [Online] MySQL.com, Available from: https://www.mysql.com/products/workbench/ (Accessed on 27th October 2017)

Percona (2017) Percona Toolkit Documentation [Online] Percona.com, Available from: https://www.percona.com/doc/percona-toolkit/LATEST/index.html (Accessed on 27th October 2017)

Gualtieri, M. (2011) The Nastiest Performance Bottleneck Is Often The Database [Online] Forrester.com, Available from: https://go.forrester.com/blogs/11-02-13-the_nastiest_performance_bottleneck_is_often_the_database/ (Accessed on 27th October 2017)