
Tracking vaccinations in the Netherlands

COVID-19 vaccinations in the Netherlands started on January 6th. The Dutch government (“Rijksoverheid”) has received its share of criticism regarding the speed of vaccination so far, as it is lagging behind the vaccination roadmap it laid out in December last year. For the first few weeks Rijksoverheid did not have an online vaccination tracker; the number of administered vaccines was communicated in other ways, such as via Twitter by the Minister of Health, Welfare and Sport.

Starting January 27th, Rijksoverheid began listing the number of administered vaccines on its “Coronadashboard”. Every day the (estimated) number of administered vaccines is updated; the figure is cumulative, counting all vaccinations administered since January 6th. The dashboard currently does not display the number of vaccines over time. I think this has to do with the fact that Rijksoverheid does not have the actual number of vaccines administered per day, which might be caused by a delay in the reporting of administered vaccines by the different institutions. Furthermore, the number of administered vaccines is an estimate based on the vaccines that are delivered to vaccination locations.

Due to this lack of information about administered vaccines over time, I have created a dashboard that can be seen on vaccinetracker.nl. As explained above, these are not the actual vaccines administered per day; they are the numbers reported daily by Rijksoverheid. This does not give us an accurate picture of vaccination speed, but it does give us some idea of what is currently happening.
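To illustrate how such daily figures can be derived from the cumulative totals, here is a minimal sketch in Python using pandas. The column names and the figures are hypothetical and only serve as an example; the actual pipeline behind vaccinetracker.nl is not shown here.

```python
import pandas as pd

# Hypothetical cumulative totals as reported by Rijksoverheid (illustrative numbers only).
reported = pd.DataFrame(
    {
        "date": pd.to_datetime(["2021-01-27", "2021-01-28", "2021-01-29", "2021-01-30"]),
        "cumulative_vaccinations": [135_000, 148_000, 163_000, 177_000],
    }
)

# The daily figure is simply the day-over-day difference of the cumulative series.
# Note: this reflects reporting speed, not actual administration speed, because
# institutions report with a delay and the totals themselves are estimates.
reported["daily_reported"] = reported["cumulative_vaccinations"].diff()

print(reported)
```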

Below is a screenshot of the dashboard as seen on vaccinetracker.nl. Please visit the website for the interactive version.



Big Data: volume, velocity, variety… and value

Doug Laney introduced us to the first three Vs of big data back in 2001: volume, velocity, and variety. As we have amassed more data over time, the volume of data has increased. Think about sensors on machines: we can now investigate how machines in a factory or a warehouse are doing based on continuous sensor readings. Furthermore, through our smart devices and social media platforms, even more data is being generated. The emergence of the Internet of Things (IoT) has brought us a goldmine of datasets.

What makes big data even more special is that this data arrives continuously, in real time. We can monitor machines as they drill for oil, likes on social media posts are registered instantly, and rainfall is continually measured and recorded. These three examples fall under the second V, velocity. The speed at which data arrives has increased tremendously over the last decades, facilitated by the growth in bandwidth and internet speeds.

Moreover, we now have a myriad of data formats. Think about the different types of data an online store can generate: the click paths that people follow on the website, the information a customer fills in, the items a customer ends up buying, the payment method they used, and those are just a few examples. All of these actions generate different data formats that need to be stored, processed, and analyzed.
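As a rough illustration of this variety, the sketch below models two of these sources as separate Python structures. All field names and values are made up for the example; the point is simply that each source has its own shape and needs its own storage and processing path.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class ClickEvent:
    """A single clickstream record from the website."""
    session_id: str
    url: str
    timestamp: datetime


@dataclass
class Order:
    """A completed purchase, with a very different shape than a click event."""
    order_id: str
    items: List[str]
    payment_method: str
    total: float


click = ClickEvent(session_id="abc123", url="/products/42", timestamp=datetime.now())
order = Order(order_id="o-1001", items=["sku-42"], payment_method="iDEAL", total=19.95)
print(click, order, sep="\n")
```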

Other scholars and big data engineers have added more Vs to the mix, such as variability, veracity, and visualization. But in this post I would like to discuss a different V: value. On the surface, big data seems great. We have a lot of information, which pleases those who adhere to the ‘law of large numbers’. In statistics there are many principles that point toward the idea that bigger is better. Think of the central limit theorem: the larger the sample size, the closer the distribution of the sample mean gets to a normal distribution. And increasing the sample size is all about getting an answer that is closer to ‘reality’; we want a sample that represents the population.
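A quick way to see this effect is to simulate it. The sketch below, with arbitrarily chosen sample sizes, draws repeated samples from a skewed distribution and shows that the means of larger samples cluster ever more tightly around the true mean.

```python
import numpy as np

rng = np.random.default_rng(42)

# Draw sample means from a heavily skewed (exponential) population with mean 1.0.
# As the sample size n grows, the distribution of the sample mean becomes
# narrower and increasingly bell-shaped around the population mean.
for n in (5, 50, 500):
    means = rng.exponential(scale=1.0, size=(10_000, n)).mean(axis=1)
    print(f"n={n:>3}: mean of sample means={means.mean():.3f}, spread={means.std():.3f}")
```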

But let’s say we’re a company. We have loads of data. Statisticians would be jealous of our datasets. So much data. But what now? We let the data sit in a database or a distributed file system for a couple of weeks before we analyze it. We analyze it, and oops – it’s already too late. The interesting trends we found through our analysis are now irrelevant. That is why we should seek value in our data. It means acting fast, it means performing the right analyses. It also means realizing what we want to achieve with our data. Do we want to increase sales? Do we want to understand our population better? Do we want to facilitate better decision-making processes?

We have to know what we’re doing; that is why the value principle is so important. This principle builds on the other Vs as well: the volume, variety, and velocity of the data. Value is often overlooked, but it is imperative to your big data solution.