Data science jobs have been hailed by Harvard Business Review as being among the “sexiest jobs of the 21st century”. Why? Because data is the new “oil” or the new “gold”.
We are living in a data deluge. A lot of data has been generated in the past decade. This has paved the way for companies to build systems that mine and analyse the data and create insights and value from it. Consequently, people with the required skills are sought after as the need for storing, working with and securing data grows.
No wonder there are a lot of opportunities in this newly emerging field, created not only by the rapid generation of new data but also by the need to work with it.
What is data science?
Simply speaking, data science is working with data with the aim of creating insight and delivering business value. Data science is broad. It includes, but is not limited to:
- Data mining;
- Data analysis, machine learning; and
- Data visualisation.
As such, data science cannot be defined as just one of the aforementioned skills.
How to become a data scientist
A combination of educational background, training and skills is required. Here are a few of the qualifications and educational routes that lead into the field:
- A degree in a quantitative field (physics, maths, statistics, finance, economics, computer science, etc.).
- A higher degree/research (master’s, PhD) in a quantitative field. Although this is not a strict requirement, it does come in handy since it builds the necessary problem-solving skills. In most cases, it also provides the necessary coding skills, which are gained through research.
- Direct entry through a four-year bachelor’s degree in data science is becoming common. In South Africa, for example, top universities such as Wits, UCT, UP and the University of Stellenbosch, among others, are offering specialised degrees in data science or data analytics.
- Transitioning into a data scientist role from closely related roles such as quantitative analyst/developer, software engineer/developer, business intelligence analyst or statistician, among others, is possible. This can be done by leveraging domain knowledge and other skills useful in data science, and closing the gap through self-study courses.
- Short courses are also offered to upskill graduates who do not necessarily have the skills (coding, statistics) gained through study.
- Online courses/boot camps and MOOCs (massive open online courses) are also becoming popular in closing the skills gap.
Six skills needed to become a data scientist
1. Statistics
This prepares the data scientist to analyse data in a scientific way. It includes, but is not limited to, distributions, linear regression, probability theory, Bayesian statistics, statistical tests, etc.
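To make one of these topics concrete, here is a minimal sketch of simple linear regression fitted by ordinary least squares, using only the Python standard library. The function name `fit_line` and the spend/sales figures are made up for illustration; in practice a data scientist would more likely reach for an established statistics or machine learning package.

```python
from statistics import mean


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: advertising spend versus sales (illustrative only).
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(spend, sales)
print(f"sales = {slope:.2f} * spend + {intercept:.2f}")
```

The slope estimates how much sales change for each extra unit of spend, which is the kind of relationship a data scientist would then test and explain to the business.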
2. Coding
Coding is a necessary tool in the data scientist’s toolbox. It helps the data scientist to analyse data and use different algorithms to deliver a solution in code. The two languages most commonly used by the data science community are Python and R. Modules/packages are available in both languages, which make it easy to do data analysis, visualisation, machine learning, etc. Other languages such as C++, Scala and Java also come in handy, especially when creating data science products that integrate with existing systems.
3. Curiosity
One of the most important skills of a data scientist is curiosity. The data scientist is eager to find patterns in data, is not afraid to investigate ‘anomalies’, and can come up with a sound reason why such patterns or anomalies exist.
4. Data wrangling
This is the ability to extract data from various sources, clean it up and transform it into the required format.
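The extract, clean and transform steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names, the sample records and the rule of dropping rows with a missing value are all hypothetical, and real projects usually pull data from files, databases or APIs rather than an inline string.

```python
import csv
import io

# Hypothetical raw export with stray whitespace, mixed case and a missing value.
RAW = """name,city,monthly_spend
 Thandi ,cape town,1200.50
Sipho,JOHANNESBURG,
 Lerato,Pretoria ,980
"""


def wrangle(raw_text):
    """Extract rows from CSV text, clean the fields and convert types."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw_text)):
        spend = record["monthly_spend"].strip()
        if not spend:
            # One possible policy: drop rows with no spend value.
            continue
        rows.append({
            "name": record["name"].strip(),
            "city": record["city"].strip().title(),
            "monthly_spend": float(spend),
        })
    return rows


clean_rows = wrangle(RAW)
for row in clean_rows:
    print(row)
```

Whether to drop, flag or fill in incomplete rows is a judgement call that depends on the business question, which is part of why wrangling is treated as a skill in its own right.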
5. Storytelling
This covers the ability to explain one’s findings to a non-technical audience and to the business, to aid them in their decision-making. Great communication skills are needed to translate findings into insights. Findings must be presented not only in visuals but also in language the business can understand, so that they lead to action that, in turn, leads to growth.
6. Domain knowledge
Data science is applicable in many industries and, therefore, knowledge in these business domains is important. It can be acquired over a period of time as one works in a particular field.
In summary, data science is an interdisciplinary field at the intersection of coding skills, maths/statistics, and domain and business knowledge.