New Thinking for the New Year — BERT Comes to Search
- Posted on: Dec 30 2019
Language is tricky. Just take the line from the Beatles song Don’t Pass Me By.
In one line, Ringo, hoping not to lose the girl who may indeed pass him by, make him cry, make him blue, sings, “I’ll be waiting here, just waiting to hear from you.” That’s the kind of stuff that drives adults from other countries crazy when they’re trying to learn English as a second or third language. “Here” and “hear” sound identical, but their meanings have nothing in common. Ah, the beloved and perplexing homophone.
Now imagine you’re a computer algorithm trying to decipher a search query. It’s no easy feat to accurately grasp the nuances of English and then accurately deliver the relevant search engine results page to the searcher.
In late October, Google said it’s getting better and better at doing that. Its latest tool for understanding the idiosyncrasies of language and search terms is named BERT. Google says BERT is a big deal. How big? Pandu Nayak, Vice President of Search at Google, said BERT is “the biggest leap forward in the past five years, and one of the biggest leaps forward in the history of Search.”
OK, so what exactly does BERT do? And, no, he doesn’t hang out with Ernie.
In 2018, Google introduced and open-sourced a neural network-based technique for natural language processing (NLP) pre-training called Bidirectional Encoder Representations from Transformers — BERT. This technology enables anyone to train their own state-of-the-art question answering system.
“Transformers” is key here. Transformers are models that process words in relation to all the other words in a sentence, rather than one-by-one in order. This enables BERT to consider the full context of a word by looking at the words that come before and after it. That context is essential for Google to understand what the searcher really means and not get sidetracked by wacky syntax or “keyword-ese,” where searchers type strings of words they assume will help the algorithm understand the query, even if that’s not how they’d naturally ask the question.
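To make the bidirectional idea concrete, here is a deliberately simple sketch, not BERT itself and nothing like Google's actual system. It only illustrates why seeing the words on both sides of an ambiguous word beats reading left-to-right and stopping. The cue-word lists and the `sense_of_bank` function are invented for this toy example.

```python
# Toy illustration (NOT BERT): disambiguating a word using context on
# both sides of it, versus only the words that came before it.
# The cue-word sets below are made up for this sketch.

RIVER_CUES = {"river", "water", "fishing", "muddy"}
MONEY_CUES = {"loan", "deposit", "account", "cash"}

def sense_of_bank(words, index, bidirectional=True):
    """Guess the sense of 'bank' at words[index] from surrounding cues."""
    left = words[:index]
    # A strictly left-to-right model hasn't seen the words after 'bank' yet.
    right = words[index + 1:] if bidirectional else []
    context = set(left) | set(right)
    if context & RIVER_CUES:
        return "river bank"
    if context & MONEY_CUES:
        return "financial bank"
    return "unknown"

sentence = "she sat on the bank and watched the muddy river".split()
i = sentence.index("bank")

# Left-to-right only: no cue words appear before 'bank', so no decision.
print(sense_of_bank(sentence, i, bidirectional=False))  # -> unknown
# Bidirectional: 'muddy' and 'river' to the right settle the meaning.
print(sense_of_bank(sentence, i, bidirectional=True))   # -> river bank
```

The same intuition applies to queries: a preposition or qualifier late in the sentence can completely change what the earlier words mean, and a model that reads in only one direction misses it.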
Google estimates that BERT will help it better understand one out of every 10 searches in the U.S. in English, and it will be adding more languages over time. Impacting 10 percent of all searches is a big deal.
BERT is the biggest thing to come to Google search since RankBrain, the company’s first artificial intelligence method for understanding queries, which was integrated into its algorithm in 2015.
BERT will be applied to both ranking and featured snippets in search. Featured snippets are selected search results that are featured in a box on top of Google’s organic search results and below paid search. Featured snippets are usually a paragraph, often with an image, answering the searcher’s question right away. This content is typically from a site, and being featured in the snippet increases click-through rate anywhere from 2 to 8 percent, with a corresponding 677 percent increase in revenue from organic traffic.
Google believes BERT will really help with longer, more conversational queries, or searches where prepositions like “for” and “to” matter a lot to the meaning. With BERT the Google algorithm will be better able to understand the context of the words in the query.
While developing BERT, Google did lots of testing. Here are three examples the company has shared, each showing the results delivered before and after BERT.
The first example involved a Brazilian traveler who wanted to visit the U.S. and typed a query that would have confused the old algorithm. In the pre-BERT world, the algorithm simply linked “Brazil traveler” and “USA” and missed the direction of travel: it assumed the searcher was a U.S. citizen traveling to Brazil, which was totally wrong. You can see below how the BERT-driven results get it right.
Another test query was “do estheticians stand a lot at work.” Before BERT, the algorithm looked to match keywords. In this case, it matched the word “stand” in the query with “stand-alone” and went down the wrong path, surfacing different types of estheticians rather than the actual amount of standing the job requires. See the difference.
The third example from Google dealt with the nuances of language, nuances that can go right over the head/CPU of the computer, so to speak. It was an oddly constructed query asking whether one person can pick up a prescription for someone else at a pharmacy. Old Google got it all wrong, but BERT Google hit it on the head.
Bottom line? As an update, BERT will immediately impact 10 percent of all search queries, and that’s pretty big. BERT should make Google search much more intuitive and helpful for searchers. And that’s good for us all.