The Rise of Machine Learning
Machine learning is a completely different animal from other forms of automation. While the old way of automation focused on throwing processing power at simple, well-defined tasks, machine learning takes a different approach.
The old way's big limitation, even in an age of cheap, abundant processing power, is that someone has to define every step ahead of time (an algorithm) before a machine can execute it. If the task is straightforward, like assembling a car bumper, then it's not particularly hard to get that algorithm up and running.
That approach runs into problems in domains where there's a large amount of variability and a substantial number of choices that have to be made on a regular basis.
This type of dynamic presented problems in the past because old methods relied on the limited brains of programmers, who can only account for a small number of possibilities for any given activity. Spelling out the logic for how to do something by hand becomes untenable quickly for all but the most basic activities.
Enter machine learning (ML). Rather than depending on programmers determining all possible paths ahead of time, ML uses data to generate programs that solve problems. You can think of ML like a big, black box that takes in data (as much as you can give it) and spits out programs.
There are a bunch of different ways to generate these programs, but the type of ML creating the most economic value right now is a set of techniques known as supervised learning. Supervised learning revolves around using labeled examples (inputs paired with the correct answers) to find patterns, and then building programs that apply those patterns to new inputs.
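To make the "data in, program out" idea concrete, here's a minimal sketch of supervised learning in Python. Everything in it is an assumption made up for illustration: the toy data (exclamation marks in an email, paired with a spam label), the function names, and the deliberately crude technique of picking a single cutoff. Real systems use far richer data and models.

```python
# Hypothetical labeled examples: (exclamation marks in an email, is it spam?).
examples = [(0, False), (1, False), (2, False), (5, True), (7, True), (9, True)]

def learn_threshold(data):
    """Pick the cutoff that misclassifies the fewest labeled examples."""
    candidates = sorted({x for x, _ in data})

    def errors(cut):
        # Count examples where "x >= cut" disagrees with the true label.
        return sum((x >= cut) != label for x, label in data)

    return min(candidates, key=errors)

def make_program(data):
    """Return a tiny 'program' (a function) generated from the data."""
    cut = learn_threshold(data)
    return lambda x: x >= cut

is_spam = make_program(examples)
print(is_spam(1), is_spam(8))  # prints: False True
```

Notice that nobody typed the cutoff by hand. Swap in different examples and the same code produces a different program.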
Another popular set of techniques is reinforcement learning, which involves building programs that learn through trial and error to maximize some kind of reward.
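Here's a rough sketch of that reward-maximizing idea, using one of the simplest reinforcement learning setups (an "epsilon-greedy" strategy choosing among a few options). The payout probabilities, the function name, and the parameter values are all hypothetical. The program is never told which option is best; it has to find out by trying them.

```python
import random

def run_bandit(payout_probs, steps=5000, epsilon=0.1, seed=0):
    """Learn by trial and error which option pays out most often."""
    rng = random.Random(seed)
    counts = [0] * len(payout_probs)    # times each option was tried
    values = [0.0] * len(payout_probs)  # running average reward per option
    total_reward = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random option.
            action = rng.randrange(len(payout_probs))
        else:
            # Exploit: pick the option that looks best so far.
            action = max(range(len(values)), key=values.__getitem__)
        reward = 1 if rng.random() < payout_probs[action] else 0
        counts[action] += 1
        # Nudge the running average toward the reward just observed.
        values[action] += (reward - values[action]) / counts[action]
        total_reward += reward
    return values, counts, total_reward

# Three made-up options that pay out 20%, 50%, and 80% of the time.
values, counts, total_reward = run_bandit([0.2, 0.5, 0.8])
print(counts)
```

After enough tries, most of the attempts should pile up on the third option, because that's where the reward is.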
The specifics of those particular topics are very math-heavy and not particularly relevant for right now, so we're not going to zero in on the details. For now, just remember this: ML is all about using data to automatically create and improve software.
ML Out in the Wild
An example that you've probably already used today is the way that Google suggests things when you use the search bar. The suggestions you get were not manually entered by a programmer — that would be far too labor-intensive.
Instead, Google utilizes supervised learning to constantly scan their incoming search data and spot patterns in the search terms.
If you go to Google right now and type in "What is..." you'll see some suggestions, which are likely not related to what you're looking for. That's because that phrase is extremely broad. Once you narrow it down a little bit, like "What is machine ..." you'll get closer to what you want.
Google can do this because they've recognized that those combinations of words tend to lead to specific searches. As more and more people use their search engine, their ML algorithms automatically build those suggestions and give people increasingly relevant results.
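A toy version of this idea fits in a few lines of Python. To be clear, this is not Google's actual system. It's a sketch that simply counts which full searches most often follow each typed prefix, and the search log and function names are made up for illustration.

```python
from collections import Counter, defaultdict

def build_suggester(search_log, top_n=3):
    """Count how often each full query follows each typed prefix."""
    completions = defaultdict(Counter)
    for query in search_log:
        query = query.strip().lower()
        for end in range(1, len(query) + 1):
            completions[query[:end]][query] += 1

    def suggest(prefix):
        counts = completions.get(prefix.lower(), Counter())
        return [query for query, _ in counts.most_common(top_n)]

    return suggest

# Made-up search log; a real one would be vastly larger.
log = (["what is love"] * 5 + ["what is my ip"] * 4
       + ["what is machine learning"] * 3 + ["what is machine code"])

suggest = build_suggester(log)
print(suggest("what is"))
# ['what is love', 'what is my ip', 'what is machine learning']
print(suggest("what is machine"))
# ['what is machine learning', 'what is machine code']
```

The broad prefix returns whatever is most popular overall, while the narrower one gets closer to what you want. Every new search added to the log changes the counts, so the suggestions update without anyone editing them by hand.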
The only reason Google can do this so effectively is that they deal with enormous amounts of data. On a typical day, they're processing about 40,000 searches every second.
If Google were just starting out and processing only a few searches a day, then they'd have to wait a long time before they had enough data to use ML. That's because ML systems utilize statistics to spot patterns, and it's very difficult to accurately spot patterns with small sample sizes.
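A quick back-of-the-envelope calculation shows why sample size matters so much. Suppose (hypothetically) that 30% of the people who type a certain prefix go on to finish it the same way. The sketch below uses the standard textbook approximation for the margin of error of an observed share; the 30% figure and the sample sizes are made up for illustration.

```python
import math

def margin_of_error(share, n, z=1.96):
    """Rough 95% margin of error for a share observed in n samples.

    Uses the normal approximation; z=1.96 is the usual 95% multiplier.
    """
    return z * math.sqrt(share * (1 - share) / n)

for n in (20, 2_000, 2_000_000):
    print(f"{n:>9,} searches: 30% plus or minus {margin_of_error(0.30, n):.2%}")
```

With 20 searches, the uncertainty is about 20 percentage points, so the pattern could easily be noise. With two million, it shrinks to well under a tenth of a point. The margin only falls with the square root of the sample size, which is why it takes so much data to pin a pattern down.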
Building Houses vs. Growing Plants
The reason ML is so powerful is that it allows programmers to teach software how to do things, rather than strictly define what software should do. Once data is available, ML systems can continuously improve themselves and generate increasingly complex programs.
There's an element of mystery to the whole ML process as a result of this, because the algorithms are generating programs on their own. Programmers are no longer specifying all the moving parts of their software, and are instead simply telling their ML systems what kind of end results they want (better searches, improved movie recommendations, etc.). Because they aren't designing the output programs, they often have no idea exactly how those programs work.
Take, for example, self-driving cars. Even though Google and several other companies have built vehicles that can drive themselves, nobody can spell out, rule by rule, exactly how they do it. This is because self-driving cars use ML to teach themselves how to drive.
First, the engineers working on these projects design a set of target outputs. Their primary goal is to have the car drive itself safely (an important distinction, since a car could get to its destination in a variety of very unsafe ways), so the goal is to have a program that knows when to stop at stoplights, make U-turns, merge onto the freeway, and so on.
The engineers then slap a plethora of sensors onto their prototype vehicles and let them drive around. When the vehicle makes a mistake (like stopping when it shouldn't), it looks at the data surrounding that mistake, updates its internal algorithm, then tries again.
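The "make a mistake, update, try again" loop can be sketched with one of the oldest learning methods around, the perceptron. This is a stand-in chosen for its simplicity, not the method any self-driving car actually uses, and the two sensor readings (is the light red, is an obstacle close) are hypothetical.

```python
def train(drives, passes=20):
    """Adjust the weights only when a stop/go prediction turns out wrong."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(passes):
        mistakes = 0
        for readings, should_stop in drives:
            score = bias + sum(w * x for w, x in zip(weights, readings))
            predicted_stop = score > 0
            if predicted_stop != should_stop:
                # Mistake: shift the weights toward the correct answer.
                mistakes += 1
                step = 1.0 if should_stop else -1.0
                weights = [w + step * x for w, x in zip(weights, readings)]
                bias += step
        if mistakes == 0:
            break  # a full pass with no mistakes, so stop adjusting
    return weights, bias

def decide_stop(weights, bias, readings):
    return bias + sum(w * x for w, x in zip(weights, readings)) > 0

# Hypothetical readings: (light_is_red, obstacle_close) -> should the car stop?
drives = [((0, 0), False), ((1, 0), True), ((0, 1), True), ((1, 1), True)]

weights, bias = train(drives)
print([decide_stop(weights, bias, readings) for readings, _ in drives])
# [False, True, True, True]
```

The code never states the rule "stop if the light is red or something is in the way." The program arrives at that behavior purely by being corrected when it gets a situation wrong.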
While the engineering team is not entirely hands-off — they have to regularly make tweaks to the ML systems to ensure the targets are understood and correct — the task of driving is not defined by them. They simply tell the ML systems what outputs they want, and the car figures out the rest.
The difference between traditional software systems and ML systems is like the difference between building a house (the old way of building software) and growing a plant (the new way of building software). A house is built by laborers from blueprints that are created by an architect. The laborers are expected to follow the design laid out in those blueprints exactly, and they fully understand what the end product will be because of the nature of that construction process.
Growing a plant, on the other hand, involves planting a seed and then tending to it until it grows into the end product you want. How well it turns out depends on how much water you use, what nutrients you put into the soil, and so on. You don't have to understand everything about that growth process to get what you want out of the plant, but you do need to provide it with the right environment to thrive.
What this means for companies looking to automate is that they now need to focus on finding ways to gather large amounts of data about what their employees are doing every day. Once they have that, they can put an ML system in place that will (eventually) figure out how to replace those employees.