Data scientists aren’t a nice-to-have anymore; they’re a must-have, and businesses of all sizes are scooping up this new breed of engineering professional.
Companies like Apple, BirchBox, GE, Facebook, YellowPages, Bank of America and a swath of startups all have data scientists on their payroll. Even the White House has a chief data scientist. It used to be a role no one wanted.
Today it’s a different story. Data scientists have now become critical to a company’s success. Any business that relies on data or analytics will need an expert in looking for numerical needles in data haystacks. But how do you find the right one for your business?
Companies may find themselves lost in the sea of information that confronts them, and often attempt to bring in experts to sift and sort the information for them. More than two-thirds of data scientists say cleaning and organizing data is their most time-consuming task and 52.3 percent say that poor quality data is their biggest daily obstacle.
For many companies, this is a completely new universe of hiring. As a data scientist myself, and having worked with and hired many in the field, I’ve put together a guide for the newbies looking to build the perfect data scientist team.
When posting your job listing on relevant sites, it’s critical that you look for horizontal skill sets. Place ads on machine learning verticals or relevant forums you see on Reddit. You won’t find your data scientist at typical headhunter sites.
You’re going to get a lot of candidates because data science is the latest buzzword showing up on resumes, but you can easily spot the fakers by asking specific questions: What tools do you use? What was your role within the organization? With these types of qualifying questions, you can get down to a handful of candidates.
Now the next stage is tricky.
There is nothing really typical about the interview process moving forward. It’s all about asking the right questions. You’re going to want to ask about their experience with engineering, machine learning and programming.
Have they worked with unstructured data and algorithms, and do they know modern technologies such as Hadoop and Spark and scripting languages such as Python? But the key part of these skills is a keen understanding of correlation, causation and related concepts.
These are central to modeling exercises involving big data. We want to see a few years of experience on the battlefield, testing and training models with R, Weka, Matlab or other related tools – in plain speak, the machine learning solutions used today.
This person will need to do these critical tasks: clean and categorize data, ask questions, analyze this data with statistics and machine learning models, visualize these results and improve models and algorithms to produce better results and execute them quickly.
This all seems like basic stuff, but the art of identifying the critical traits of a stellar data scientist isn’t so cut and dried. In addition to these five tasks, there are traits that separate a good data scientist from a great one.
And you find out by giving them a challenge. This is where it gets fun.
The real deal
Set up a challenge to learn how the candidate works in the wild. For example, we’ve done challenges asking candidates to predict how many people would survive the Titanic.
Candidates should create a daily project plan that presents a glimpse of how their brain works – algorithms to predict survival rate, which ages and genders are more likely to survive than others and why, and other factors you would never have thought of.
I have seen amazing, mind-blowing ways in which candidates explore and clean data, choose and apply models, analyze the results and, very importantly, present their findings.
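As a rough illustration, the first exploratory step in such a project plan might look something like the sketch below, which groups passengers by gender and by age band and computes a survival rate for each group. The records, the field names and the helpers age_band and survival_rates are all hypothetical, invented here for illustration; they are not real Titanic data and not the challenge we actually ran.

```python
from collections import defaultdict

# Hypothetical toy records, invented for illustration only.
# These are NOT rows from the real Titanic passenger list.
PASSENGERS = [
    {"sex": "female", "age": 29, "survived": 1},
    {"sex": "female", "age": 8, "survived": 1},
    {"sex": "female", "age": 52, "survived": 0},
    {"sex": "male", "age": 34, "survived": 0},
    {"sex": "male", "age": 6, "survived": 1},
    {"sex": "male", "age": 61, "survived": 0},
    {"sex": "male", "age": None, "survived": 0},  # missing age: a cleaning decision
]


def age_band(age):
    """Bucket an age into a coarse band; the cutoff of 16 is an assumption."""
    if age is None:
        return "unknown"
    return "child" if age < 16 else "adult"


def survival_rates(rows, key):
    """Return {group: share of survivors} for the grouping given by key(row)."""
    counts = defaultdict(lambda: [0, 0])  # group -> [survivors, total]
    for row in rows:
        group = key(row)
        counts[group][0] += row["survived"]
        counts[group][1] += 1
    return {group: s / t for group, (s, t) in counts.items()}


by_sex = survival_rates(PASSENGERS, lambda r: r["sex"])
by_band = survival_rates(PASSENGERS, lambda r: age_band(r["age"]))

for name, rates in (("sex", by_sex), ("age band", by_band)):
    for group, rate in sorted(rates.items()):
        print(f"{name:>8} | {group:<8} | {rate:.0%}")
```

What matters in a challenge is less the table itself than the choices behind it: how the candidate treats the missing age, why they pick a particular cutoff, and which other factors they decide to test next.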
Read between the resume lines
Intellectual curiosity is what you should look for in these project plans. It’s what gives the candidate the ability to find loopholes or outliers in data that help crack the code, answering questions like how a fraudster taps into your system or what consumer shopping behaviors should be considered when creating a new product marketing strategy.
Data scientists find the opportunities that you didn’t even know were in the realm of existence for your company. They also find the needle in the haystack that is causing a kink in your business – but on an entirely monumental scale. So, if there are 100 billion galaxies, the Milky Way being one of them with 200 billion stars and planets, it’s like finding planet Earth in that.
In many instances, these are very complex algorithms and very technical findings. However, a data scientist is only as good as the person they must relay their findings to. Others within the business need to be able to understand this information and apply these insights appropriately.
Good data scientists can make analogies and metaphors to explain the data but not every concept can be boiled down in layman’s terms. A space rocket is not an automobile and, in the brave new world, everyone must make this paradigm shift.
And lastly, the data scientist you’re looking for needs to have strong business acumen. Do they know your business? Do they know what problems you’re trying to solve? And do they find opportunities that you never would have guessed or spotted?
Data scientists see the forest for the trees – or the galaxy for the stars – to help you find new ways to leverage your data, and champion ideas that others may not embrace at first. As my good friend said, you have to find a MacGyver instead of a James Bond. While Bond has a lot of support behind him, MacGyver is more creative and has more imaginative ways of getting results.
Who will bring your business to the next level?