It’s not Back to the Future day any more, but the future is still coming whether you like it or not. You can either hide away from it like a coward, or read this article and prepare yourself for what’s to come. You’re not a coward, are you?
Last time, we had a look at current developments in the field of machine learning and AI, focussing on how Google, Microsoft and Apple have been using the technology in their flagship products. This time around, we’re going to look in a bit more depth at the theory surrounding it all, and explain how it all applies to the future of online search.
Apple’s second acquisition is another start-up called VocalIQ, self-described as “the world’s first self-learning dialogue API – putting real, natural conversation between people and their machines.”
Siri has previously been criticised for being a bit too formulaic, and (relatively) simplistic. But with VocalIQ on board, this is set to change. Self-learning is the key here.
This is the kind of advancement that we mentioned Amit Singhal describing (in Part 1) when he spoke of the need for “computer-based personalities” that are “very sensitive to human interaction”.
It is at this point that we should probably ask ourselves how noble an aim this actually is; ancient myths and sci-fi stories alike are filled with warnings about what might happen should we reach just that little bit too far (just ask Icarus, or Will Smith). Of course, warnings like this are a little overdramatic; a highly sophisticated talking digital assistant isn’t quite the Terminator, but there is nonetheless something to be said for caution here.
But mild caution aside, this technology is still incredibly exciting and has the potential for boundless use, so let’s continue.
Now, to go a bit further into this idea of self-learning programs, let’s enter the complex and twisted rabbit hole of deep learning.
Hold on to your pocket watch…
(Quick disclaimer here: my own understanding of deep learning, and machine learning generally, is limited. As such, I’m going to stick to the basic ideas behind the theory, which should be enough for now.)
The more basic forms of artificial intelligence rely on humans to manually input various different kinds of instructions that might mean the same thing. This is easiest to understand in terms of speech recognition like we have with Siri. A basic example would be programming various voice commands like “call John”, “phone John” and “talk to John” all leading to the same function, which is basically a complicated flow chart. This is a simplified picture of a hand-coded, rule-based system, where a developer has to anticipate every phrasing in advance. Machine learning, and deep learning in particular, aims to generate similar results by learning from examples instead, with far less manual input.
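To make the “complicated flow chart” idea concrete, here is a minimal sketch of hand-written voice command rules. The phrase list and the `parse_command` function are made up for illustration; this is not how Siri is actually implemented.

```python
# Hypothetical hand-written rules: a developer must list every phrasing up front.
CALL_PHRASES = ("call", "phone", "talk to")


def parse_command(utterance):
    """Return ("call", name) if the utterance matches a known phrasing, else None."""
    text = utterance.strip().lower()
    for phrase in CALL_PHRASES:
        prefix = phrase + " "
        if text.startswith(prefix):
            name = text[len(prefix):].strip()
            if name:
                return ("call", name)
    # Any phrasing the developer did not anticipate falls through here.
    return None
```

All three phrasings from the example lead to the same result, but something like “ring John” is simply not understood until a human adds another rule.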
The aim of deep learning is to go from mimicking the results of the functions of a brain to loosely simulating how those functions come about. This is done using what are known as artificial neural networks. Essentially, this involves building a mathematical model inspired by biological brains, in which simplified neurons and their connections are simulated to create a vastly complex program with the potential for learning and self-improvement.
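As a rough illustration of what a simulated neuron looks like, here is a minimal sketch of a tiny two-layer network. The weights and biases are invented numbers chosen purely for the example; a real network would have vastly more neurons and would learn its weights from data.

```python
import math


def sigmoid(x):
    """Squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of its inputs, then a squashing function."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(total)


def layer(inputs, weight_rows, biases):
    """A layer is just several neurons looking at the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


def tiny_network(inputs):
    # Made-up weights: two hidden neurons feeding one output neuron.
    hidden = layer(inputs, [[0.5, -0.4], [0.3, 0.8]], [0.0, -0.1])
    return neuron(hidden, [1.2, -0.7], 0.05)
```

Learning, in this picture, means adjusting those weights and biases until the output becomes useful.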
Let’s take a look at Google’s image recognition software as an example here:
Google’s image recognition tools are designed to pick out and describe certain features of images, like cats or bottles or… you get the picture.
For a brief (ish) explanation of how this actually works, I’ll refer to Reddit user GregBahm:
“You give the computer millions of photos, and tell the computer which ones are birds, and then you say “now you figure out what all those photos have in common.” And the computer will make millions and billions and trillions of random guesses for how it could tell the difference between a bird or not a bird. Then the computer will test each random guess, by using it against all the photos you gave it and then comparing its guesses to the answers you provided. If it happened to find some random pattern that allows it to guess right, the computer bases the next guesses off of that previous pattern, improving it and improving it”
So the computer effectively learns as it goes on, constantly building up and improving its pattern-recognising faculty.
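The guess, test and improve loop GregBahm describes can be sketched in a few lines. This is a toy under heavy assumptions: the features, the six labelled examples and the function names are all invented, and real image recognition systems are normally trained with gradient-based methods rather than purely random guesses.

```python
import random

# Made-up labelled examples: ([feathers, beak, legs, fur], is_bird)
EXAMPLES = [
    ([1, 1, 2, 0], True),   # sparrow
    ([1, 1, 2, 0], True),   # penguin
    ([0, 0, 4, 1], False),  # cat
    ([0, 0, 4, 1], False),  # dog
    ([0, 1, 4, 1], False),  # platypus
    ([0, 0, 2, 0], False),  # human
]


def accuracy(weights, examples):
    """Fraction of examples where a positive score agrees with the 'is bird' label."""
    correct = 0
    for features, is_bird in examples:
        score = sum(w * f for w, f in zip(weights, features))
        if (score > 0) == is_bird:
            correct += 1
    return correct / len(examples)


def train_by_guessing(examples, rounds=2000, seed=0):
    rng = random.Random(seed)
    best = [0.0] * len(examples[0][0])
    best_acc = accuracy(best, examples)
    for _ in range(rounds):
        # Make a random guess based on the best pattern found so far.
        guess = [w + rng.uniform(-1, 1) for w in best]
        acc = accuracy(guess, examples)
        if acc > best_acc:  # keep a guess only if it beats the previous best
            best, best_acc = guess, acc
    return best, best_acc
```

Calling `train_by_guessing(EXAMPLES)` returns the best weights found and their accuracy on the labelled examples, which by construction can never be worse than where the search started.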
Here we can go along a cool (and vaguely informative) diversion:
Take Google’s Deep Dream image manipulation program, a piece of software that takes Google’s image recognition tools and runs them in a feedback loop on their own output, reversing the deep learning process to create odd, and often incredibly creepy, results.
Some of you might have seen its equal parts nightmarish and dog-obsessed results before, but if you haven’t, here is a picture of your humble author that has been fed through this delightful warping program:
Deep Dream takes that image recognition software and puts it into a kind of feedback loop. It is essentially told to look for the bird in a picture where there is no bird; it then nudges the picture to look more like the bird it thinks it sees, and repeats that process. It makes order from chaos.
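The feedback loop can be sketched with a deliberately simple stand-in. Here the “bird detector” is just a fixed 3x3 pattern (an assumption for illustration), whereas the real Deep Dream works on a trained neural network; the shared idea is that the picture, not the detector, gets changed so that the detector responds more strongly.

```python
# Hypothetical "bird detector": it responds strongly when pixels resemble PATTERN.
# PATTERN is a 3x3 image flattened into a list of nine pixel values.
PATTERN = [0.0, 1.0, 0.0,
           1.0, 1.0, 1.0,
           0.0, 1.0, 0.0]


def detector_score(image):
    """How strongly the detector 'sees' its pattern in the image."""
    return sum(p * x for p, x in zip(PATTERN, image))


def dream(image, steps=10, step_size=0.1):
    """Repeatedly nudge the image so the detector responds more strongly."""
    image = list(image)
    for _ in range(steps):
        # For this linear detector, the gradient of the score with respect
        # to each pixel is simply the matching PATTERN value.
        gradient = PATTERN
        image = [x + step_size * g for x, g in zip(image, gradient)]
    return image
```

Starting from a flat grey image such as `[0.2] * 9`, each pass makes the pattern a little more visible, which is the toy version of inserting the bird into the picture.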
In a phrase, this is what it looks like when a digital neural network hallucinates or, more romantically, when robots dream.
And it turns out that when they do, they see a lot of dogs (or in my case, what look like birds and rodents). This most likely has something to do with the kind of sample images initially fed into the image recognition software.
Now, there are few practical applications for Deep Dream itself, but as an exercise to show what kinds of things can be made possible with deep learning techniques, it’s pretty interesting. And the basic idea of reversing deep learning mechanisms to go from analysis to creation is something that could certainly have mileage.
But let’s get back to the point:
Deep learning neural networks are designed to work, loosely speaking, like we do. When you identify something as a bird, you don’t literally go through your brain, pick out every possible condition that something must satisfy in order to be a bird and then apply them systematically to what you see in front of you. Instead, you just sort of know. This is the holy grail of machine learning.
Steven Levy, in an essay on neural networks and the future of Google, gave the following dramatic description of neural network development:
“the black art of organizing several layers of artificial neurons so that the entire system could be trained, or even train itself, to divine coherence from random inputs, much in a way that a newborn learns to organize the data pouring into his or her virgin senses.”
So what does all of this mean for search engines?
The Future of Search
Well, first of all, machine learning generally isn’t a new thing when it comes to SEO. Google’s Panda and Penguin updates both made use of machine learning in their own way.
But the future of search promises to break genuinely new ground.
Imagine a world where even Google’s chief engineers couldn’t tell you about the specifics of their ranking algorithms. That’s where we’re headed.
Take Panda and Penguin again. Both of these updates are designed to separate low quality or spammy sites from the real good stuff that should be ranking highly on SERPs. But it’s pretty difficult, impossible in fact, to lay out a set of hard and fast rules to decide what exactly counts as ‘spammy’.
Currently, Google’s search algorithms are incredibly complicated and are regularly updated by developers when they notice something that could be added, making the evaluations of sites steadily more and more accurate. Take the Hummingbird update, for example, which improved the understanding of context in searches, in addition to what Panda and Penguin had already done (and continue to do).
However, there’s only so far this can go.
Recall the framework GregBahm described for image recognition software. Now imagine that, instead of looking at pictures and picking out birds, it was looking at websites and picking out those that satisfied searchers. Such a system would learn to do this, and would get better and better over time with little manual input from developers.
What this means for us in the SEO industry is that our craft is going to become ever more creative, more art-like than science-like, as steadfast rules about rankings become steadily less universal. Pressure will mount on SEOs to step up their game, spammers will be shaking in their boots, and this can only be good news for the general public.
There’s still a fair way to go though. Google’s machine learning technology is far from perfect, as we’ve heard from Amit Singhal, and as we saw when their image recognition software caused deep offence by rather awkwardly thinking it saw gorillas in a picture that definitely was not of gorillas.
Improvement is happening though, and Google’s technology is constantly being refined. With Google among the true pioneers in the field of machine learning, one thing is certain about Skynet, sorry, Google: the future promises truly exciting and groundbreaking innovation.
I’ll end with another quotation from Steven Levy’s essay, illustrating the real shift in paradigm that developments in machine learning have brought about:
“Many years ago, Larry Page and Sergey Brin spoke, maybe only half jokingly, of search being an implant in our brains. No one talks about implants now. Instead of tapping our brains to make search better, Google is building brains of its own.”
And for good measure, here’s Doc Brown himself, with a few wise words on what the future might hold: