Karthikeyan.Guruswamy@thinkbiganalytics.com

Analytics vs Data Science - Are they the same?

Blog Post created by Karthikeyan.Guruswamy@thinkbiganalytics.com on May 18, 2016

Do the terms Analytics and Data Science refer to two different things or the same thing? This blog post was inspired by a discussion on the topic with some of my team members during a recent conference.

A lot of millennials think of Analytics as an old-fashioned way of describing Data Science. On the other hand, a lot of traditional data mining/warehouse people think of Data Science as old wine in a newly designed big data bottle. I know this debate exists not only in traditional companies that have been solving customer problems for years, but also in new startups that are trying to find a range of customers to consume their products. The debate also exists among customers who have used different products over the years and are now wondering - "What is the new data science shiny object thingy? We've been using scoring algorithms with SAS and SPSS for years. Is this a new marketing term?"

In this blog post, I want to demystify the two terms and call out the differences. As a practitioner, I have to tell you that the terms Analytics and Data Science are distinctly different. They are two different animals. Here's why ...

Analytics is often tied to a technology or a relatively canned approach, usually applied to a few hypotheses with well-defined methods. Mature analytic processes often create useful solutions.
Data Science, however, is free-form: it often drives which Analytics should be run for a problem, and it falls into the discovery realm. In many cases, Data Science helps formulate the hypothesis itself ...

That being said, the definitions have morphed quite a bit over the last few years. Just as BI and Analytics are used in the same breath in many places, the word Analytics is now tied to Data Science and often used to mean the same thing!

How to use BOTH the terms in the same paragraph:

A churn problem comes with poly-structured data - transactions, emails, call center notes. We need to decide what "Analytics" need to be performed to generate a rank-ordered list of customers. We can decide to use Teradata Aster's Multi-Genre (TM) Advanced Analytics or even Spark MLlib to dig through the data and create models for prediction. The process of finding out which "different Analytics" need to be performed is Data Science ... Should I use a Hidden Markov model, logistic regression, or a combination of both? Should I combine Personalized PageRank features with XGBoost to increase my precision and recall? Which "analytic method(s)" are worth a shot given the descriptive statistics of the data?
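To make the "which analytic method is worth a shot" question a bit more concrete, here is a minimal sketch in plain Python of how one might compare candidate methods on precision and recall. Everything in it is an assumption for illustration: the names `method_a` and `method_b`, the helper functions, and the toy labels are hypothetical, and they stand in for whatever models (logistic regression, XGBoost, and so on) were actually trained on real churn data.

```python
from typing import Dict, List, Tuple


def precision_recall(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    """Return (precision, recall) for binary labels where 1 means 'churned'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Guard against division by zero when a method predicts no churners
    # or when the data contains no churners.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def compare_methods(
    y_true: List[int], predictions: Dict[str, List[int]]
) -> Dict[str, Tuple[float, float]]:
    """Score every candidate method against the same held-out labels."""
    return {name: precision_recall(y_true, y_pred)
            for name, y_pred in predictions.items()}


# Hypothetical toy data: 1 = customer churned, 0 = customer stayed.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]

# Hypothetical predictions from two candidate analytic methods.
candidate_predictions = {
    "method_a": [1, 0, 0, 1, 0, 1, 1, 0],
    "method_b": [1, 1, 1, 1, 0, 1, 1, 0],
}

results = compare_methods(y_true, candidate_predictions)
for name, (p, r) in results.items():
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

In this toy case, one method catches every churner at the cost of more false alarms, while the other is more balanced. Deciding which trade-off matters for the business is exactly the Data Science part; running the chosen method repeatedly is the Analytics part.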

Synergistic Alignments:

Analytics methods are also more prescriptive and are used in places close to operationalization. Analytics folks are more tied to the code and the underlying technology, and tend to create controls (KPIs) over the process. Data Science sits more in the discovery realm. Data Science folks tend to get more caught up in how algorithms work, even under the hood, along with their limitations, visualizations, insights, finding needles in the haystack, the smoking gun, etc.

In layman's terms, if Analytics is like law enforcement, Data Science would be the forensic lab that provides the methods and evidence :). We need both to solve the problem not just once, but repeatedly in the future.

Outcomes