Artificial intelligence is one of the most exciting and attractive fields to get into. The global machine learning (ML) market is estimated to grow from $1.4 billion in 2017 to $8.8 billion by 2022. AI is projected to create 2.3 million related jobs by 2020, according to Gartner. The average salary of a machine learning engineer is between $125,000 and $175,000. At the top ten highest paying companies for AI talent, the average salary easily surpasses $200,000. Clearly, there are a lot of reasons to join this booming industry.
In this article, I’m going to break down exactly what artificial intelligence jobs entail, and how to go about getting the skills required and break down the hype a bit.
First, let’s be very clear about what artificial intelligence work is and isn’t. Feel free to skip this section if you think you’ve got a handle on it.
Artificial intelligence is an incredibly broad term — it involves an aspirational drive to replicate human learning and behavior in machines. How do we cut through the hype?
Let’s talk first about a specific element of artificial intelligence that is actionable and well-compensated: machine learning. Machine learning is a subset of artificial intelligence that involves using certain rules and algorithms to try to generalize insights from one dataset to a broader one.
You might take labeled data that is human-classified and work on extending the logic with machine learning or let the computer go through unlabelled data and figure things out for you. You might work in a form of reinforcement learning that is akin to deep learning: a specific set of machine learning approaches that use layers of reinforcement learning to get to the desired outcome.
You’ll be working with pipelines of data if you choose to get into machine learning — the skill of having machines make predictions and labels for new sets of data after absorbing certain rules from similar datasets.
Machine learning is a set of programming tools to work with data and deep learning or reinforcement learning is a subset within that. While there are models programmed with pre-set rules to take data and process it a certain way — for example, a linear regression model that can tell you how much a dependent variable is affected by an independent variable (rent dependent on the number of rooms in an apartment, for example) — deep learning approaches tend to use semi-structured models to evaluate data at scale that works a bit like the human brain.
The critical distinction is that deep learning will operate through multiple layers of feedback. A neural network-like deep learning model will self-correct and optimize toward a certain outcome, tuning itself so that its output gradually matches its input through the self-modification of weights within the model.
This is perhaps best illustrated with the simplest deep learning model: the perceptron, illustrated above. In this case, from a series of inputs, the layer of hidden calculations performed between the input and the output self-modifies until it arrives at the desired output.
Why does this matter? It forms the basis of all sorts of exciting AI innovations you’ve heard of, from self-driving cars to video/image recognition. By creating increasingly efficient models that help machines manage the complexity of data patterns that can stretch into trillions of possibilities, humanity can benefit from automated processing of ever-larger data — gaining richer insights on datasets that can grow larger and larger. Those insights can allow a social network like Facebook to automatically classify the photos on its network, or allow somebody to pattern-match and predict your behavior based on your search history.
However, despite all the hype, deep learning approaches are nowhere close to how scientists think the human mind actually operates.
Let’s take a step back and define all of these terms so we know exactly what we’re talking about:
Data science involves using statistics and theory to treat large datasets so you can get a business answer or prediction based on the underlying dataset.
Artificial intelligence is the broad aspiration of granting machines human-like learning and reasoning. Much of it is a theory at this point, rather than something practical and implementable.
Machine learning is a way of creating predictive models that learn without needing to be explicitly programmed to do so, an actionable subset of artificial intelligence. You can think of machine learning models as semi-structured objective functions, wherein a data scientist will train a model for a certain outcome, without having to explicitly plug in all of the variables and interactions required. The model understands it is trying to minimize a certain amount of error and adjusts accordingly
Deep learning is a subset of machine learning and specifically refers to models like convolutional neural networks that reconcile an input and output with dense hidden layers that perform self-correcting levels of calculations in order to come to the desired outcome. In practice, in production, the number of layers and calculations performed is exponentially high.
There are traditionally two fundamental splits here when you’re working with datasets and artificial intelligence/machine learning:
Data scientists, who help tailor the business logic of the models that are being created. Basically, data scientists help communicate findings from data models to business decision-makers and they help tune and tailor models that help businesses ask the right questions of their data.
Machine learning engineers build the data plumbing that allows for data scientists to process and work with huge reams of data that continually updates. In practice, they’re responsible for feeding the models defined by data scientists with the data they need to perform well, and they’re often responsible for taking theoretical data science models and helping scale them out to production-level models that can handle the day-to-day of companies that generate terabytes of data.
I’ll break it down into more detail, but as a rule of thumb, even if the two broad roles share some overlap, a data scientist is often going to be working with the theory behind the data science of artificial intelligence, while machine learning engineers will implement models in practice. Data scientists tend to have a stronger theoretical foundation in machine learning, statistics, and mathematics, while machine learning engineers typically have a stronger software engineering background.
You’re going to have to take one of these broad roles if you’re going to work with artificial intelligence models.
Lots of people question the long-term prospects of work in artificial intelligence or machine learning. After all, won’t that work be automated along with everything AI else will automate? It’s a valid question, but for now, it’s important to consider artificial intelligence in the same vein as industrial revolutions of the past: something that allows for people to gain new capabilities and create whole new economies. ATMs are correlated with an increase in bank tellers.
Yet, ATMs may be responsible for long-term structural unemployment. The future, as ever, is murky. Yet we can learn from the history of ATMs that automation doesn’t automatically mean job loss, though it certainly means that new technologies can upend established truths.
Compensation and roles
Data scientists have one broad split in the categorical definition here: data analysts also fall under their purview. The main difference is that data analysts lean more toward communication data and doing one-off queries of established data models, which tend to be defined by data scientists. This article dives deeper into the split between data analyst and data scientist roles.
Meanwhile, data engineers will also earn an average of about $90,000 a year, similar to their data scientist peers. However, engineers focused specifically on implementing machine learning earn significantly more, easily going above $100,000 a year, and at its upper tiers, a $200,000-a-year average among top-paying companies. Well-known names in the AI field will sometimes get millions of dollars in cash compensation and stock, though they tend to be AI practitioners who are doing cutting-edge work and research at top universities or laboratories around the world.
Broadly speaking, if you want to develop your career in artificial intelligence, you can get started with a software development background and pick up the machine learning theory, or you can start off with the machine learning theory and communication skills and gradually pick up the programming chops to work in machine learning.
In order to work with artificial intelligence/machine learning, you generally need four skill sets:
The software engineering chops to implement models in practice. You’ll often work with tools like Python, Pandas, Scikit-Learn, TensorFlow and Spark. The ability to ably work within that toolset will determine your ability to process, “wrangle,” clean, and manage your data so you can use it to process the large streams of data required in a production-level model.
The knowledge of machine learning theory so you know what model to implement and why, and the downsides or upsides of applying certain approaches to certain data problems.
The ability to use statistical inference to quickly evaluate whether or not a model is working.
Domain-level knowledge and the ability to communicate insights from data to business stakeholders. It’s important not only to be able to gain insights from data, but also to be able to push the right answers in front of business-level units so you can help drive solutions.
In practice, machine learning engineers will lean more on their software engineering chops, while data scientists rely more on their knowledge of machine learning theory and statistical inference, along with the ability to communicate those data insights.
Here are some resources that can help you pick up the skills you need to place your best foot forward when it comes to applying to the AI jobs that are out there (mostly a hybrid of data science or machine learning engineering roles).
Software engineering for artificial intelligence
This free, curated course will run you through the basics of how to use powerful Python frameworks to wrangle data and build basic models for it. You’ll start working with critical data science tools, such as Pandas and sci-kit learn, and get a real feel for how to put machine learning theory into practice.
This tutorial for Apache Spark helps introduce how to work with big data sets for data engineers and machine learning engineers.
Working with TensorFlow will be an important part of understanding and implementing artificial intelligence models. This website offers a bunch of beginner-level tutorials that can help you quickly understand this powerful deep learning framework.
This collection of different big data sets will give you open-source data you can play around with as you look to build big data pipelines of your own.
Machine Learning/Artificial Intelligence Theory
This highly technical piece talks about the possible statistical and mathematical roots of why deep learning models seem to function so well.
This interactive Python notebook by AI legend Peter Norvig will help you reason with basic probability concepts and play around with them, gaining a critical skillset and perspective into statistical inference.
This handy tutorial simplifies Bayes Theorem, a crucial part of reasoning with changing probabilities and an important perspective to have with ever-shifting machine learning models.
This tutorial goes over the statistical foundation for calculating confidence intervals, a foundational part of machine learning evaluation.
Job boards/places to find ML work
All of this theory is great, but where do you actually go to find job postings related to AI? Here are some places where you might find artificial intelligence work, ranging from specific communities to AI-focused mailing lists or job boards.
Hacker News, a technically focused community wrapped around the YCombinator accelerator for startups, has monthly “Who is hiring” threads that tend to bring up a lot of work in artificial intelligence. Just ctrl+f for “machine learning engineer” or “data scientist” roles with different companies. As a bonus, hiring managers tend to post directly, which should help you get in touch with the right people faster.
AngelList is a repository of startup jobs, and there are several listings for machine learning jobs. Look around and apply with one click.
Data Elixir is a data science specific mailing list, and it also offers a job board for positions in industry that deal with artificial intelligence and data science. There are often positions for machine learning engineers as well.
KDNuggets is filled with data science and artificial intelligence resources and it serves as a useful place for job postings as well, with job postings dedicated to data engineers and machine learning engineers as well.
This AI job board curates some opportunities in the field. While it can be hit or miss when it comes to curation of the job posts presented, there are enough postings that are relevant to make up for it.
In order for you to get into a position to do artificial intelligence work, you’re likely going to have to network and do informational interviews with people in artificial intelligence roles. Then you’re going to have to interview.
I hope these resources have been helpful with regards to what artificial intelligence truly is and the steps you can take to begin a career in artificial intelligence!
This post is part of our contributor series. The views expressed are the author's own and not necessarily shared by TNW.