The process of selecting a movie is far less enjoyable than the act of watching it. Hence, I created an app in Qlik Sense to select a movie on the basis of any of the following criteria:
- Principal Cast
- Release Year
- Title Type
- Adult Content
It uses the dataset files that IMDb provides on its website. That is not to say the data is without issues, but I have done my best to remove malformed values from each of the columns. Some interesting insights are visible as soon as the app loads:
- As of Jan 27, IMDb has 4.78 million titles listed with a total of 794.5 million votes
- The rating distribution leans toward the high end of the scale, indicating that titles tend to be rated higher rather than lower
- The average score of all titles is 6.94, so anything rated 7+ should be considered better than average
- Drama and Comedy constitute the bulk of titles by far
- The number of titles has been increasing exponentially over the decades, but it has exploded in this century
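Headline figures like the title count, total votes and average score can be cross-checked outside Qlik Sense. The following is a minimal Python sketch, not the app's load script. It assumes a ratings file with the columns tconst, averageRating and numVotes (the layout IMDb documents for title.ratings) and takes a plain, unweighted mean across titles, which may or may not be how the app's 6.94 was computed. The function name `summarize_ratings` and the sample rows are hypothetical.

```python
import csv
import io


def summarize_ratings(tsv_text):
    """Return (title_count, total_votes, mean_rating) for ratings TSV text.

    Assumes the columns averageRating and numVotes are present.
    The mean is unweighted: every title counts once, regardless of votes.
    """
    # QUOTE_NONE: treat quote characters as ordinary text, not field wrappers.
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    count = 0
    total_votes = 0
    rating_sum = 0.0
    for row in reader:
        count += 1
        total_votes += int(row["numVotes"])
        rating_sum += float(row["averageRating"])
    mean_rating = rating_sum / count if count else 0.0
    return count, total_votes, mean_rating


# Hypothetical two-row sample, only to show the shape of the input.
sample = (
    "tconst\taverageRating\tnumVotes\n"
    "tt0000001\t6.0\t100\n"
    "tt0000002\t8.0\t300\n"
)
print(summarize_ratings(sample))
```

To run it against the real data, read the extracted ratings file into a string and pass that in place of `sample`.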
There are, of course, many more insights to glean from this data. If you intend to use the app, you need to download the following 4 files from the aforementioned IMDb website. These are updated daily, so make sure to grab the latest ones whenever you use the app.
Each of these files, on extraction, contains a data.tsv file in tab-separated format. For this app, I renamed each data.tsv after its source (e.g. title.basics.tsv) and loaded them from the ‘D:\Data\IMDB’ folder. If you do things differently, be sure to adjust the data library path and file names in the data load editor.
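If you want to peek at an extracted file before pointing Qlik Sense at it, a few lines of Python will do. This is only a sketch and not the app's load script. `DATA_DIR` and `load_tsv` are hypothetical names, and the code assumes IMDb's documented convention of writing `\N` for missing values.

```python
import csv
from pathlib import Path

# Hypothetical folder; point this at wherever your renamed .tsv files live.
DATA_DIR = Path(".")


def load_tsv(path, limit=None):
    """Read a tab-separated file into a list of dicts, mapping \\N to None."""
    rows = []
    with open(path, newline="", encoding="utf-8") as handle:
        # QUOTE_NONE: titles may contain quote characters, so treat them
        # as ordinary text instead of field delimiters.
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for index, row in enumerate(reader):
            if limit is not None and index >= limit:
                break
            rows.append(
                {key: (None if value == r"\N" else value)
                 for key, value in row.items()}
            )
    return rows


if __name__ == "__main__":
    basics = DATA_DIR / "title.basics.tsv"
    if basics.exists():
        # Print only the first few rows; the full files are very large.
        for row in load_tsv(basics, limit=5):
            print(row)
```

The `limit` argument is there because the full files run to millions of rows, so a small sample is usually enough to confirm the column names and spot odd values.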
I would have loved to host this app on the web, but my server certainly isn’t up to the task of handling the workload. Hence, feel free to download the qvf and the dataset files and set out on your own journey of discovery. I will leave you with a couple more screenshots showing the look and usage of this app.