What is Data Science?

Data Science is a branch of science that takes 'Data' as its subject.

I know you don't like this definition :) and it does not give you any clear understanding of Data Science.

Before we try to understand 'Data Science', let's first ask 'What is Science?'.

The following is the first line of the definition of Science from Wikipedia.


Science (from Latin scientia, meaning "knowledge") is a systematic enterprise that builds and organizes knowledge in the form of testable explanations and predictions about the universe.  


If you replace the term 'universe' with the term 'Data', I think it becomes a pretty accurate formal definition of Data Science.


Science (from Latin scientia, meaning "knowledge") is a systematic enterprise that builds and organizes knowledge in the form of testable explanations and predictions about the Data.  


It may sound pretty accurate, but I know you don't like this definition either. It is too dry, and it doesn't sound practical to me.


However, if we try to define a term in a practical sense, we usually end up with a description that 'sounds easy' but 'can be interpreted in too many different ways'. If you ask ten people to define 'Data Science' in a practical sense, you might get ten different definitions. But if you just put those ten definitions into your brain and keep them there for some time, your brain will come up with 'some kind of understanding' on its own, even though you may not be able to express it in your own words. Or just watch all the videos listed below and, if possible, a couple more that you can find on YouTube. You will come away with some form of understanding, and I think your understanding will be correct, whatever it is.


The following is my personal understanding of 'Data Science' after watching all the videos linked below.


In the usual data analysis that we have done, we are given a certain set of data (usually structured data stored in a well-defined format, like a database table), and we are also given a predefined 'question' (analysis target) that we need to answer from the data. In most cases the methods and techniques for the analysis are known in advance.


In contrast, in Data Science we are given a set of data (usually Big Data) and we don't even know what kind of questions we need to ask about it, and in many cases we don't know what kind of methodology we have to use. So in many cases of Data Science, 'making a question' (finding 'what to ask') is the first thing we have to do, and then we have to figure out what kind of methodology we need to find the answer to that question. This is what scientists do in other scientific areas.
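
To make the contrast a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the toy records, the field names (region, price, units, ad_spend) and the function names are hypothetical and do not come from any real data set or library. The first function answers a question that was fixed in advance. The second one does not know the question yet, so it scans every pair of numeric fields and ranks them by correlation to suggest what might be worth asking. Real Data Science work is of course far messier than this.

```python
from itertools import combinations
from statistics import mean

# Hypothetical toy records, invented only for this illustration.
records = [
    {"region": "north", "price": 10, "units": 100, "ad_spend": 1},
    {"region": "north", "price": 12, "units": 80, "ad_spend": 3},
    {"region": "south", "price": 14, "units": 60, "ad_spend": 2},
    {"region": "south", "price": 16, "units": 40, "ad_spend": 4},
]


def average_units(rows, region):
    """Conventional analysis: the question (average units in one region)
    and the method (a mean) are both known in advance."""
    return mean(r["units"] for r in rows if r["region"] == region)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences.
    Assumes neither sequence is constant (otherwise this divides by zero)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def strongest_relationships(rows, fields):
    """Exploratory step: no question is given, so look at every pair of
    numeric fields and rank the pairs by the strength of their correlation.
    The top pairs are candidates for 'what to ask' next."""
    scored = []
    for a, b in combinations(fields, 2):
        r = pearson([row[a] for row in rows], [row[b] for row in rows])
        scored.append((a, b, r))
    return sorted(scored, key=lambda t: abs(t[2]), reverse=True)


if __name__ == "__main__":
    # Predefined question: what is the average number of units sold in the north?
    print(average_units(records, "north"))

    # No predefined question: which fields seem to move together?
    for a, b, r in strongest_relationships(records, ["price", "units", "ad_spend"]):
        print(a, b, round(r, 2))
```

In this toy data the strongest pair is price and units, which suggests a question nobody asked at the start: 'does raising the price reduce the number of units sold?'. Finding that question, and then choosing a proper method to answer it, is the part that the paragraph above describes as Data Science.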


This is a somewhat risky definition (you may not agree with it), but if you push me for a more practical definition, I would say 'Data Science' is just a marketing term that became popular as another marketing term, 'Big Data', got attention from many people. As you may see on the Big Data page, conventional technology does not work very well with Big Data, so we need new knowledge and methodology to analyze it, and that new knowledge and technology tends to be called 'Data Science'.




