Data scientist jobs vary greatly from company to company in responsibilities, expectations and skill sets. This can make recruiting the right data scientist difficult and expensive. Writing accurate job descriptions, decoding nuances in experience and skill sets and matching the hiring company's present and future expectations are all hard to do without understanding the sub-categories of sought-after candidates.
Company A may need a dedicated data scientist to build machine learning solutions into a web application. Company B may need an expert to process terabytes of data on a Spark cluster. And Company C may need an excellent communicator to break down statistical insights into terms that are easier to understand and act on.
In my experience, data scientists can be roughly grouped into three major categories: The Software Engineer, The Statistician and The Machine Learning Expert. An individual candidate or job opening will be some blend of one or more of these. Using a dataset with over 50 million professionals in the U.S., we can quickly identify data scientists at the level of job skills and interests, helping you recruit the right prospect for any data scientist job.
1. Data Scientist: The Software Engineer
The Software Engineer data scientist is a highly skilled programmer who knows multiple languages beyond the baseline choice of R or Python. This person has extensive knowledge of software systems, architecture and development. They most likely have a computer science degree, though it isn't required. They're capable of deploying predictive models into a production environment such as a web application.
They should also be familiar with a wide array of computer systems and fluent in Linux/Unix-based terminals. Most cloud platforms, such as AWS, require working on remote machines, so this fluency matters. You should also expect them to have a fairly well-established GitHub profile with a variety of example projects. Better yet, they're an active contributor to an open-source project.
This is your “hacker” type of data scientist.
2. Data Scientist: The Statistician
The Statistician data scientist has a mastery of numbers, their underlying mathematics and theoretical interpretations. They are highly educated, even when compared to other data scientists. They almost always have an advanced degree in mathematics, statistics, physics or a related field. The Statistician focuses strongly on the validity of results and careful analysis of any findings, and avoids hand-waving explanations that aren't backed by solid data. They're very good with charting and visualization tools, and you can expect to hear them speak about statistical significance, p-values and correlation.
This is your “math whiz.”
3. Data Scientist: The Machine Learning Expert
The Machine Learning Expert data scientist has extensive knowledge and training in artificial intelligence (AI). They come from a variety of backgrounds, as there aren't as many formal education paths designed specifically for machine learning.
They have a deep understanding of virtually all ML algorithms, their applications and their “gotchas.” When presented with a potential application of ML, they should be able to immediately propose a possible approach or clearly explain why the problem is not a good candidate. They should also be fairly in tune with state-of-the-art developments in AI and up to date on current academic papers published in the field.
This is your “AI specialist.”
Any individual data scientist is going to be some mixture of the above, and all data scientists need a solid understanding of machine learning. Still, the difference between a candidate who is strongly a number 3 and one who is strongly a number 1 is like the difference between a Formula 1 racer who drives for a living and a lawyer who drives to work each day. Working out the right mixture for any given role, and matching it to the candidate, is the key to finding the right data scientist for the job.
What sort of data scientist are you looking for?