Can analytics radically simplify the intricacies, nuances and 700-plus-year history of Scotch Whisky? Kailash Purang says yes, and takes on the expert palates of the Scotch Whisky Masters armed only with his data set. His visualization groups 86 single malt brands by 12 flavor characteristics, such as sweet, smoky, honey and nutty, into clusters of similar flavor. It presents them in a simple graphical form so we can all understand the differences and similarities between the various single malt brands.
Each dot (or node) represents a single malt brand. Each line (or edge) represents the strength of the similarity between two whiskies: the thicker and darker the line, the more similar they taste. Natural clusters therefore form, with similar-tasting single malts closer together and highly connected. This type of analytic is used in food science. It can be matched with market share and segment profile data to engineer new flavor styles designed to appeal to high-value segments and/or reposition existing brands to appeal to new segments. It also provides a ‘cheat sheet’ for bar and retail staff, instantly giving them the expert knowledge to recommend new brands to sample, or substitute brands when a client’s favorite single malt is not available.
For the everyday consumer it also offers a fun way to explore a wonderfully complex subject. If you like the taste profile of one brand you can now try others that are similar, or use it to explore very different flavor types. Either way you will be guaranteed a great night out. Cheers, Kailash! See you at the bar?
This Teradata Aster visualization shows a Gephi representation of an Aster Lens-produced graph using the University of Strathclyde’s Whisky Classification open data set. It covers 86 single malt Scotch Whisky brands across 12 expert-rated flavor characteristics, along with their distillery geo-coordinates.
It uses cosine similarity to group the whiskies by their flavor types while catering for sparse data caused by single malts that have none of a certain flavor type. For example, some single malts have no ‘sweetness’ or no ‘medicinal’ taste, which creates many zero flavor ratings. Cosine similarity deals well with a proliferation of zeros, which can otherwise be a defining factor in clustering. The Aster Lens visualization is created using the GraphGen function in Teradata Aster. The nodes are of equal size, representing the equal weight of the different single malts, and the links indicate the strength of the similarities.
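To make the idea concrete, here is a minimal sketch of cosine similarity turned into weighted graph edges. It is not the Aster GraphGen workflow and does not use the Strathclyde data: the malt names, the four flavor axes, the ratings and the 0.8 threshold are all hypothetical, chosen only for illustration.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two flavor-rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero profile has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical ratings on four illustrative axes:
# sweetness, smoky, honey, medicinal (not real data).
profiles = {
    "Malt A": [2, 0, 2, 0],
    "Malt B": [4, 0, 4, 0],
    "Malt C": [0, 3, 0, 4],
    "Malt D": [1, 3, 0, 3],
}


def similarity_edges(profiles, threshold=0.8):
    """Return (malt, malt, weight) edges for pairs at or above the threshold.

    The threshold is an assumption of this sketch; it only keeps the
    graph from becoming fully connected.
    """
    edges = []
    for (name1, vec1), (name2, vec2) in combinations(profiles.items(), 2):
        weight = cosine_similarity(vec1, vec2)
        if weight >= threshold:
            edges.append((name1, name2, round(weight, 3)))
    return edges


edges = similarity_edges(profiles)
for source, target, weight in edges:
    print(f"{source} -- {target}: {weight}")
```

With these made-up ratings, only two pairs clear the threshold: Malt A and Malt B, whose profiles are proportional and so score 1.0, and Malt C and Malt D. A flavor that both whiskies rate as zero adds nothing to the dot product or to either norm, so shared zeros neither raise nor lower the score. In a drawing, the edge weight would map to the thickness and darkness of the line.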
Kailash is the lead Data Scientist for Teradata in Singapore. He also works across South East Asia and most notably in Indonesia, supporting the leading banking and communication industry clients Teradata serves in the region.
Kailash holds a Bachelor of Economics and Statistics as well as a Master’s in Economics from the National University of Singapore. He also holds a Bachelor of Management from the University of London. He has worked in the field of analytics for 15 years across various industries.
Despite having ‘sold his soul’ to join the commercial world, he still believes that the aim of all this learning and technology is to make people’s lives easier and more fun. To help introduce analytics in a fun, ‘tear-less’ way, he spends his spare time creating visualizations that show how everybody can benefit from simple analytic applications.
As a Data Scientist for Teradata, he strives to make his clients realize the full potential of ‘Big Data’ so that their customers can benefit via better services and offerings.