What is Data Science, and how do you become a Data Scientist?

Data Science is not just about data. At its most basic, it means recognizing which data to keep and identifying how to process it for different results. It does not stop there. Data scientists also need to spot the blanks in data and fill them with data that ‘may’ come up in the future. Data Science is essentially about connecting the dots in a business and using existing and not-yet-existing data to meet the demands of that business.

Data Science is one of the hottest areas in technology, and data scientists are in demand worldwide. In fact, Microsoft has announced a new online certification program called the Microsoft Professional Degree Program.

What is Data Science

Most of us think Data Science is simply statistics. If you are good at statistics, you will be able to represent the numbers in any way you want: charts, infographics, etc. But will you be able to identify the different data needs of the business in different areas? Can you ‘foresee’ data? Will you be able to fill in data pieces that are required but are not yet available? These questions don’t belong to statistics alone.

What is Data Science? It is difficult to explain in one sentence, but I will try, and then list its parts so that the overall picture emerges. Data science is the science of identifying data for different purposes, identifying a business’s need for information, and processing the data with the tools at hand to provide the inputs a business needs to thrive. Thus, Data Science is a bit of everything. It includes not only statistical skills but also some managerial skills, some language processing, research skills, some machine learning knowledge, and a clear idea of which tools are required to produce the desired results.

Data Science involves all of the following, regardless of which of them a given business actually uses:

  1. Creating the need for data
  2. Categorizing data sets based on their possible usage
  3. Strategized storage of data sets on premises or in the cloud; in either case, the data sets should be available on demand without delay
  4. Understanding of business process flows and how different data sets are useful for each
  5. Understanding of business decisions to help the business do better
  6. Ability to process data using different sets of tools (spreadsheets, databases, programming languages, etc.) to meet the demands of business processes
  7. Ability to foresee what kind of data will be incoming in the near future and to use it for current processes
  8. Analyzing the results of a process and going back to the drawing board to make it better

The above list is not comprehensive but highlights the main points of data science. As the first point suggests, data scientists need to be able to convince businesses that all of their data is useful and hence should be stored for a long time. Perhaps those useful old databases could be kept on a shared cloud for 10-15 years, so that the business can look back at them and build more effective databases. Any need may arise as the business surroundings keep changing. The laws of the land change, business processes change, and data needs to be adapted. Thus, the more data you have at hand, the more effective you’ll be.
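As a small illustration of points 6 and 7 in the list, here is a minimal Python sketch of filling blanks in a data set and ‘foreseeing’ the next value. The sales figures and the function names `fill_gaps` and `forecast_next` are made up for this example, and linear interpolation plus a straight-line trend are only the simplest possible choices, not what a real project would necessarily use.

```python
from statistics import mean

def fill_gaps(values):
    """Fill None entries by linear interpolation between the nearest known values.

    Gaps at the very start or end of the list are left as None.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled

def forecast_next(values):
    """Extrapolate one step ahead using a least-squares straight line."""
    xs = range(len(values))
    x_bar, y_bar = mean(xs), mean(values)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar + slope * (len(values) - x_bar)

# Hypothetical monthly sales; None marks a month with no record.
monthly_sales = [100, None, 120, 130, None, 150]

complete = fill_gaps(monthly_sales)
print(complete)                 # [100, 110.0, 120, 130, 140.0, 150]
print(forecast_next(complete))  # 160.0
```

The filled-in values are estimates, not facts, so in practice you would keep track of which entries were recorded and which were inferred.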

Traits of & Requirements to become a Data Scientist

Earlier, I described data science as an amalgamation of marketing, managerial, statistical, and Machine Learning skills. Statistical skills alone won’t be enough. You’ll need more than that.

First of all, you’ll need Math skills: Calculus and Algebra in addition to simple arithmetic. Learn the metric system for calculations, as it helps keep them precise. You must be good at permutations and combinations. A certificate course in Math may cover all of these. There are online courses too, for example at Coursera.
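To see what permutations and combinations look like in practice, here is a short Python sketch. The scenario (picking 3 items out of 10) is invented for illustration; `math.comb` and `math.perm` are standard library functions available from Python 3.8 onwards.

```python
import math

# Hypothetical scenario: picking 3 items out of 10 candidates.
candidates, chosen = 10, 3

# Combinations: order does not matter, n! / (k! * (n - k)!)
print(math.comb(candidates, chosen))  # 120

# Permutations: order matters, n! / (n - k)!
print(math.perm(candidates, chosen))  # 720
```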

It will help if you have experience or knowledge of team management. Likewise, certificates and diplomas in business management will give you an edge.

You’ll need to learn at least one data handling language. From the adverts I have seen, Python and R are always in demand. R is not part of Hadoop, but the two are often used together, so if you also have a certificate in Hadoop, your chances of being hired increase.

The requirements to become a data scientist will keep changing as more and more things are added to Data Science. For example, a bit of Machine Learning experience will go a long way in getting a good job in the field because everyone is focusing on AI these days.

The job description of a Data Scientist varies from business to business. One company may simply need analytics, while another will want data scientists working on artificial intelligence. Check out the list I wrote to explain Data Science. The more points you can cover, the better it will be for you.

If you still have questions about what data science is or what the requirements are to become a Data Scientist, please leave a comment. I’ll try to get answers for you.

Arun Kumar is a Microsoft MVP alumnus, obsessed with technology, especially the Internet. He deals with the multimedia content needs of training and corporate houses. Follow him on Twitter @PowercutIN