How to Read the Whisky Database


The Whisky Database page features an integrated Google Sheet with all the key information on each whisky in my dataset. You can sort, scroll and search this database with ease. Below are some simple instructions on how to interpret the results, to get the most out of this resource.

As I explained in my Introduction to this site, the ultimate goal here is really one of arbitrage. That is, taking advantage of the price difference between two or more similar whiskies by comparing their relative quality (e.g., the Meta-Critic scores and variation within a cluster), thus benefiting from the quality differential in a given price range.

That said, I do NOT recommend you simply look at the highest scoring whiskies in any given cluster or category. Those just reflect cases where there is broad agreement among reviewers. You will still have to head off to some of the actual reviews out there to see if the specific characteristics present in that whisky match your personal taste.

But more broadly, you are going to want to spend some time exploring whiskies that score across the mid-to-high end range of any given class/cluster. And when doing so, you need to pay careful attention to the whiskies with a high standard deviation (i.e., measure of disagreement among reviewers). This may ultimately be more important for finding matches to your personal preferences, once you research the whiskies further. Please see my When Reviewers Disagree page for more info on this point.
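As a rough illustration of the browsing approach described above, here is a minimal Python sketch that keeps whiskies at or above a chosen score and flags the ones where reviewers disagree. The field names (`name`, `score`, `stdev`), the two thresholds, and the sample rows are all hypothetical placeholders, not values taken from the database.

```python
from typing import Any, Dict, List, Tuple

def shortlist(rows: List[Dict[str, Any]],
              score_floor: float,
              stdev_flag: float) -> List[Tuple[str, float, bool]]:
    """Return (name, score, contested) for rows scoring at least score_floor.

    'contested' is True when the standard deviation is at or above stdev_flag,
    i.e. reviewers disagree and the whisky deserves a closer look.
    Both thresholds are arbitrary choices made by the caller.
    """
    picks = [r for r in rows if r["score"] >= score_floor]
    picks.sort(key=lambda r: r["score"], reverse=True)  # best score first
    return [(r["name"], r["score"], r["stdev"] >= stdev_flag) for r in picks]

# Hypothetical rows, for illustration only.
sample = [
    {"name": "Example A", "score": 8.9, "stdev": 0.2},
    {"name": "Example B", "score": 8.6, "stdev": 0.6},
    {"name": "Example C", "score": 8.1, "stdev": 0.7},
]
for name, score, contested in shortlist(sample, score_floor=8.5, stdev_flag=0.5):
    print(name, score, "reviewers disagree" if contested else "broad agreement")
```

The flag is only a prompt to go and read the individual reviews; it says nothing about whether the whisky will suit your own taste.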

There are also plenty of low-scoring whiskies that I find to be quite good. Again, you need to keep the number of reviews and the standard deviations in mind. But there are definitely cases where I think decent whiskies are being under-served by the reviewer rankings (especially in the single malt category). I plan to explore the possible reasons why in my more detailed individual whisky commentary posts.

As a bit of background to help you calibrate yourself: as of October 2015, the overall average Meta-Critic score – for all the whiskies in my dataset – is ~8.53. This is based on the actual population average of all the normalized scores for all the reviewers on each whisky in my dataset. The more delicate single malt clusters will have a sub-population average that is a bit lower, for the reasons outlined here.

Now, how to read the actual page:

Spreadsheet Information

The spreadsheet contains 10 columns of information about each whisky, with as many rows as there are whiskies in the dataset (that meet the criteria discussed earlier). Naturally, this makes for more columns and rows than can be displayed on a typical screen.

The API will do the best it can to match as much information as possible to your screen size (i.e., how many columns, and how wide). You can always change your screen resolution or web browser zoom settings to help (i.e., Ctrl- or Ctrl+ on your keyboard) – the table will refresh automatically to adjust. But I have preset some options for easier viewing – and there are dedicated controls to allow you to customize the display further.

User Controls

At the top of the page, you will see a header that gives you some key options:

By default, I’ve chosen 50 entries (i.e., number of whiskies) to display per page. You can change this, but it’s a good trade-off to show you all the whiskies in a given class or super-cluster, while still limiting load times.

You can hide select columns if you like, as well as export/copy all the data through the buttons on the right, in case you want to keep a copy of the data.

A key feature here is the Search option. Although you can sort the results by column in the table below, the Search option is your best bet to find not only individual whiskies, but entire sub-classes of whiskies to compare.

Column Organization

To explain why you want to do that, let’s look at the column organization:

I explain each of these column headings below. The key point right now is that the database is by default pre-sorted by Class (descending), followed by Super Cluster (ascending) and finally MetaCritic Score (descending) – as shown by the purple arrow heads above. If you click on any of the gray arrow heads, you will re-sort the entire dataset by that column (and only that column). Note that you cannot select multiple columns to sort by, so you will not be able to easily separate out groups if you re-sort.

This is why you are better off using the search feature to narrow your selection – the search results will display with the multiple sort criteria by default. So if you search “japan” for example, all the Japanese whiskies will be displayed in the proper sort order (i.e., by class, then flavour super cluster, then score).
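If you export the data and want to reproduce this behaviour yourself, the default ordering and a simple search could be sketched in Python roughly as follows. This is not the code behind the page: the field names (`class`, `super_cluster`, `score`, `country`) and the sample rows are assumptions made up for illustration.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]

def default_sort(rows: List[Row]) -> List[Row]:
    """Class (descending), then Super Cluster (ascending), then score (descending).

    Python's sort is stable, so the keys are applied from least to most
    significant and earlier orderings survive within ties.
    """
    out = sorted(rows, key=lambda r: r["score"], reverse=True)
    out = sorted(out, key=lambda r: r["super_cluster"])
    out = sorted(out, key=lambda r: r["class"], reverse=True)
    return out

def search(rows: List[Row], term: str) -> List[Row]:
    """Keep rows where any field contains the term (case-insensitive)."""
    term = term.lower()
    return [r for r in rows if any(term in str(v).lower() for v in r.values())]

# Hypothetical rows, for illustration only.
sample = [
    {"name": "Example 1", "score": 8.4, "class": "Single-Malt-like", "super_cluster": "ABC", "country": "Japan"},
    {"name": "Example 2", "score": 8.9, "class": "Single-Malt-like", "super_cluster": "ABC", "country": "Japan"},
    {"name": "Example 3", "score": 8.6, "class": "Single-Malt-like", "super_cluster": "EF", "country": "Scotland"},
    {"name": "Example 4", "score": 8.2, "class": "Scotch-like", "super_cluster": "", "country": "Japan"},
]
for row in default_sort(search(sample, "japan")):
    print(row["name"], row["class"], row["super_cluster"], row["score"])
```

Filtering first and sorting second keeps the groups intact, which is the same reason the search box works better than clicking a single column header.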

Any columns that cannot be displayed across your screen width will be available to you as an open-menu feature next to the whisky name. For example:

If you click on the green “+” symbol, it will open up to show you the data for this whisky from the missing columns, as shown below:

In the case above, the last two columns were cut off, so the info is provided in the pull-out menu. If you are reading this on a mobile device, a lot more column info will be presented here. Press on the red “-” to close the menu.

Column Descriptions

So what do all the columns mean?

Whisky name is the first column, and is typically self-explanatory. A number followed by “yo” is the age in years stated on the label. In brackets is additional information about the bottling, where relevant.

MetaCritic score refers to the average normalized score of all reviewers who have reported on that whisky. Again, this is not a raw score aggregator, but a proper statistical meta-analysis with standardized normalization (as described here).

STDEV is the standard deviation of the normalized reviewer scores around the mean Meta-Critic score, a measure of how much the reviewers disagree. See this discussion to find out why this is important to consider.

# is the number of reviewers on which the mean Meta-Critic score and standard deviation are based. Again, see the link above for a discussion.
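To make the three numeric columns concrete, here is a small Python sketch that derives them from a list of already-normalized reviewer scores. The normalization step itself is not shown (it is described elsewhere on the site), and the use of the sample standard deviation (n-1 denominator) is an assumption, since this page does not say which form is used.

```python
import statistics
from typing import List, Tuple

def summarize(normalized_scores: List[float]) -> Tuple[float, float, int]:
    """Return (mean score, standard deviation, number of reviewers).

    Assumes the scores have already been normalized per reviewer.
    Uses the sample standard deviation; with a single review there is
    no spread to measure, so 0.0 is returned in that case.
    """
    n = len(normalized_scores)
    if n == 0:
        raise ValueError("at least one review is required")
    mean = statistics.mean(normalized_scores)
    stdev = statistics.stdev(normalized_scores) if n > 1 else 0.0
    return round(mean, 2), round(stdev, 2), n

# Hypothetical normalized scores from two reviewers.
print(summarize([8.0, 9.0]))
```

A whisky with only a handful of reviews can show a very small or very large standard deviation by chance, which is why the # column should be read alongside STDEV.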

Cost is an approximate indicator of the average worldwide price for the whisky in question. In cases where the whisky is not widely available, I have limited the price estimate to areas where it can be found.

$ is for whiskies <$30 CAD
$$ is for whiskies between $30-$50 CAD
$$$ is for whiskies between $50-$70 CAD
$$$$ is for whiskies between $70-$125 CAD
$$$$$ is for whiskies between $125-$300 CAD
$$$$$+ is for all whiskies >$300 CAD
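The price bands above can be expressed as a simple lookup. The sketch below is only an illustration: the list does not say which tier a price sitting exactly on a boundary ($50, $70, $125) belongs to, so the code assumes it goes to the higher tier. The two ends are taken from the list itself ($ is strictly under $30, and $$$$$+ is strictly over $300).

```python
def cost_tier(price_cad: float) -> str:
    """Map an approximate price in CAD to the Cost symbol used in the table.

    Assumption: prices exactly on 50, 70 or 125 fall into the higher tier.
    """
    if price_cad < 0:
        raise ValueError("price cannot be negative")
    if price_cad < 30:
        return "$"
    if price_cad < 50:
        return "$$"
    if price_cad < 70:
        return "$$$"
    if price_cad < 125:
        return "$$$$"
    if price_cad <= 300:
        return "$$$$$"
    return "$$$$$+"

print(cost_tier(60))
```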

Class is a key column. Although country and whisky type are specified in later columns, the Class column groups together whiskies that share major common characteristics. You typically will want to compare scores within these major classes. At present, I have four Classes in the dataset:

Single-Malt-like – includes all the traditional Scottish “single malt” whiskies, and other international malt whiskies. Includes some international blends that taste more like single malts than blends. All whiskies in this class are assigned to a Cluster and Super Cluster, as explained below.
Scotch-like – refers to the typical Scottish “blended whiskies”, and also includes international whiskies that again are of comparable style.
Rye-like – refers to classic Canadian (and some American) blended whisky style that is heavily influenced by rye flavours. Some whiskies in this category are exclusively rye whisky (i.e., 100% or “straight” rye whisky), including some international whiskies.
Bourbon-like – refers to the classic American whisky style. Bourbon is a predominantly corn-based whisky (at least 51% corn in the mash). Some Canadian whisky can be bourbon-like.

Super Cluster and Cluster typically refer to the revised flavour cluster analysis performed here, based on the earlier Wishart analysis (and expanded for all single malt-like whiskies in the dataset). The Super Clusters are groupings of Clusters where the characteristics are similar enough to overlap considerably on the principal component analysis. I recommend you use the Super Clusters when planning to host whisky tasting events, to demonstrate the widest range of flavours. The individual Clusters could be described as follows:

A – Full-bodied, sweet, pronounced sherry – with fruity, honey and spicy notes (e.g., Aberlour A’Bunadh, Auchentoshan Three Wood, Glenfiddich 15yo, Glendronach 12yo, Glenmorangie Lasanta)
B – Full-bodied, sweet, pronounced sherry – with fruity, floral and malty notes, some honey and spicy notes may be evident (e.g., Balvenie New Wood 17yo, Glenfarclas 10yo/15yo/17yo, Glengoyne 17yo/21yo, Penderyn Madeira)
C – Full-bodied, sweet, pronounced sherry – with fruity, floral, nutty, and spicy notes, some smoky notes may be evident (e.g., Aberlour 10yo, Glenfarclas 105/12yo/21yo/25yo/30yo, Glenmorangie Signet, Highland Park 18yo)
E – Medium-bodied, medium-sweet – with fruity, honey, malty and winey notes, some smoky and spicy notes may be evident (e.g., Auchentoshan 12yo/18yo, Dalmore 12yo, Glenrothes Select Reserve/Vintage 1989/1991/1992/1994, Old Pulteney, Redbreast 12yo)
F – Full-bodied, sweet and malty – with fruity, spicy, and smoky notes (e.g., Bunnahabhain 12yo, Deanston 12yo, Glen Garioch 10yo/12yo/15yo, Glenlivet French Oak 15yo, Tobermory 10yo)
G – Light-bodied, sweet, apéritif-style – with honey, floral, fruity and spicy notes, but rarely any smoky notes (e.g., BenRiach 12yo, Bruichladdich Laddie Classic, Glenfiddich 12yo, Glen Garioch Founder’s Reserve, Glenmorangie 10yo, Jura 10yo)
H – Very light-bodied, sweet, apéritif-style – with malty, fruity and floral notes (e.g., Auchentoshan Classic/10yo, Cardhu 12yo, Dalwhinnie 15yo, Glen Grant 10yo, Tamdhu 10yo)
I – Medium-bodied, medium-sweet, quite smoky – with some medicinal notes and spicy, fruity and nutty notes (e.g., Ardmore Traditional Cask, BenRiach Curiositas 10yo, Bowmore 12yo, Highland Park 12yo, Jura Superstition, Oban 14yo, Talisker 10yo)
J – Full-bodied, dry, very smoky, pungent – with medicinal notes and some spicy, malty and fruity notes (e.g., Ardbeg 10yo/Corryvreckan/Uigeadail, Lagavulin 16yo, Laphroaig 10yo/15yo/18yo/Quarter Cask, Tomintoul Peaty Tang)

Please see the Whisky Database for further examples of each class. You can probably tell from the above (and the flavour chart) why I decided to “super cluster” A-B-C together, E-F together, and G-H together.
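The grouping of Clusters into Super Clusters described above amounts to a small lookup table. Here is a minimal Python sketch; the label strings ("ABC", "EF", "GH") are an assumed spelling for the grouped Super Clusters, and I and J are treated as Super Clusters of their own.

```python
# Single-malt Cluster letter -> Super Cluster label (note there is no Cluster D).
SUPER_CLUSTER = {
    "A": "ABC", "B": "ABC", "C": "ABC",
    "E": "EF", "F": "EF",
    "G": "GH", "H": "GH",
    "I": "I",
    "J": "J",
}

def super_cluster(cluster: str) -> str:
    """Return the Super Cluster label for a single-malt Cluster letter."""
    key = cluster.strip().upper()
    if key not in SUPER_CLUSTER:
        raise ValueError(f"unknown cluster: {cluster!r}")
    return SUPER_CLUSTER[key]

print(super_cluster("f"))
```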

For bourbons and American whiskies, I am using a Cluster classification system based on rye grain content in the mashbill.

R0 – (r=0%) – “No Rye” whisky with no rye grain (i.e., no rye in the mashbill or in the resulting taste, includes pure corn whiskies and “wheaters”)
R1 – (r<=10%) – “Low Rye” whisky of 10% or less rye grain (i.e., lower rye content and flavour than typical)
R2 – (10%<r<=15%) – “Standard Rye” whisky of 10-15% rye grain (i.e., classic bourbon recipe)
R3 – (15%<r<51%) – “High Rye” whisky of more than 15% rye (i.e., more rye content or flavour than typical, but not enough to classify as a rye whisky)
R4 – (r>=51%) – “Rye” whisky of 51% or more rye (aka “straight rye”, although that brings with it other requirements for aging)

Please see my Bourbon Classification page for more info on how this classification scheme was developed.
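The R0 to R4 thresholds listed above translate directly into a small function. This Python sketch uses only the mashbill percentage; where the mashbill is not published, the descriptions above also allow for classifying by flavour, which the code does not attempt.

```python
def rye_cluster(rye_percent: float) -> str:
    """Classify a bourbon or American whisky by rye content in the mashbill.

    Thresholds follow the list above:
    R0: r = 0, R1: 0 < r <= 10, R2: 10 < r <= 15, R3: 15 < r < 51, R4: r >= 51.
    """
    if not 0 <= rye_percent <= 100:
        raise ValueError("rye_percent must be between 0 and 100")
    if rye_percent == 0:
        return "R0"
    if rye_percent <= 10:
        return "R1"
    if rye_percent <= 15:
        return "R2"
    if rye_percent < 51:
        return "R3"
    return "R4"

print(rye_cluster(12))
```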

Country is the country of origin for the whisky.

Type is the actual source material used for the whisky.

… And with that, have fun exploring the Whisky Database!

16 comments

dax
September 20, 2016 9:50 am
This is an excellent resource. After many years I have found the best whisky resource online.
RM
December 28, 2016 2:20 pm
Comment is private. No need to publish. The author of the content on this excellent website refers to him or herself in the first person but not by name. No signature or by-line. It suggests an academic who doesn’t want her/his university knowing they are in any way associated with whisky. Understandable but a shame not to get proper credit. Best wishes.
Christian
December 10, 2018 10:53 am
Is there a way to access the database through an API with API key perhaps? I think it would be fun to make a little web application with this data and point users back to this website.
- selfbuilt
  December 10, 2018 5:06 pm
  Yes it would, but that is beyond my current skill set (or time availability to learn). But I do plan to post an article about future expansion possibilities soon.
  - Neil
    February 17, 2019 7:27 pm
    If you need help with that I am more than willing to, I have already started my own whiskey app with your data.
    Hope it is okay, but it is unlikely that the app will ever go into production.
    - JJ
      November 26, 2020 10:47 pm
      Did you ever make the app?
Erik
March 8, 2020 10:30 am
Hi, I was just wondering if there’s a reason some whiskys are listed in your database under a different cluster than they are in the wishart book and in the description of the clusters? E.g. Glenfarclas 21 is listed in the B cluster in your database, but is listed in the C cluster in both the book and the description of the cluster on your site. Glenfarclas 40 on the other hand is listed as cluster A in the database, but also C in the book. Are these differences intentional?
- selfbuilt
  March 8, 2020 2:31 pm
  I will have to go back and check my notes. Sometimes the cluster assignments change when modern tasting notes differ from those Dr Wishart used to build his rankings. A small change on a scale or two can be enough to shift a cluster assignment.
tom evans
April 7, 2020 9:11 pm
I found this website to be a great resource. It certainly helps me visualize how my taste in Scotch is changing. Do you know of any resource that includes the values for the individual whiskies that you used to create the cluster analysis? Malts.com has their graph for their family of whiskies but it doesn’t include all the ones I am interested in trying. It seems like the coordinates for the individual whisky would be a good thing to include in the database. I am assuming (perhaps incorrectly) that you would have used those data points to do your plots.
- selfbuilt
  April 8, 2020 7:18 am
  The exact coordinates of each whisky can change as new whiskies are added and the cluster analysis is re-run. This is one of the reasons why I have just stuck with the major clusters. That work, led by Dr Wishart, is challenging to recreate, so I’ve focused instead here on my Meta-Critic measures of whisky quality.
AD Lopez
October 28, 2020 3:18 pm
I would love to do some analysis on what users are clicking on or searching when they look through the database. Do you have that available?
- selfbuilt
  October 28, 2020 3:52 pm
  No, the site does not track user input. So I have no information as to what they are searching for or clicking on.
Frank GA
January 4, 2021 3:42 pm
Hi,
What is the most recent overall average Meta-Critic score – for all the whiskies in the data set?
- selfbuilt
  January 4, 2021 10:48 pm
  The overall average for all whiskies in the dataset (as of Dec 2020) is 8.55.
  However, you really need to compare within a given whisky type, as there are significant differences in scores across styles. The general breakdown of average scores right now is: Blends (8.32), Bourbon (8.58), Malt (8.60), Rye (8.62).
  Similarly, you really should consider malts by flavour super-cluster, since more delicate whiskies get lower scores than more flavourful (or smoky) ones. The current breakdown for averages by super-cluster is: ABC (8.68), EF (8.52), GH (8.40), I (8.71), J (8.72).
Steve
November 14, 2021 4:37 pm
Great resource. Are you able to share the two dimension values associated with each that allow one to plot the chart for the respective entries?
- selfbuilt
  November 15, 2021 1:07 pm
  No, because I haven’t been able to recreate the full cluster analysis/PCA for new whiskies. So the cluster assignments for most of the entries in the database are approximations based on flavour descriptors used in the reviews. It’s good to think of them as a general guide, using the best classification system developed to date (i.e., the Wishart analysis).
