Study Results


Google Autocomplete Study

2.22.2011 | 3 Comments

The Study

Eight participants were asked to rate 13 potential influencers compiled from online research, and to provide their own opinion on the algorithm. Each influencer was rated on a simple 1-5 scale, with 5 being the most influential.

Scale 1 to 5

Fortunately, we discovered a few other patterns of suggestion from our experts that weren’t included in the survey, but are noteworthy and discussed in detail following the results. These additional influencers included:

  • Trending phrases
  • Competitive in AdWords bidding
  • Location of user performing the search
  • Personalized search (searches you may have performed in the past)

The Results

Rank Potential Influencer Score
1 Volume of queries for a term 4.63
2 Click-through history of suggestions 3.25
3 Volume of actual search results for a keyword theme 3.25
4 Click-through rate of suggestions in search box 3.13
5 Explicit nature of search query for a given phrase 2.88
6 Click-through rate of search results containing a keyword theme 2.88
7 Non-Explicit nature of search queries around a keyword theme 2.75
8 Quantity of link text crawled throughout the web with a keyword theme 2.63
9 Quality of specific search results with a keyword theme 2.50
10 Quality of web pages containing link text with a keyword theme 2.50
11 Quality of the root domain of search results with a keyword theme 2.38
12 Overall diversity of search suggestions beyond sentiment 2.13
13 Diversity of sentiment (positive and negative) 1.88
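
Each score in the table is consistent with a plain average of the eight participants' 1-5 ratings, rounded to two decimals (every listed score equals a whole-number total divided by 8). The sketch below illustrates that arithmetic in Python; it is not the tooling used for the study. The function name `mean_score` and the sample ratings are assumptions: the ratings are one set that matches the 4.63 score and the later note that three participants rated that factor below 5.

```python
from decimal import Decimal, ROUND_HALF_UP


def mean_score(ratings):
    """Average a list of 1-5 ratings and round half up to two decimals."""
    if not ratings:
        raise ValueError("at least one rating is required")
    mean = Decimal(sum(ratings)) / Decimal(len(ratings))
    # Round half up so that 4.625 becomes 4.63, as in the table.
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Hypothetical ratings: five participants give 5, three give 4.
sample_ratings = [5, 5, 5, 5, 5, 4, 4, 4]
print(mean_score(sample_ratings))  # prints 4.63
```

Decimal arithmetic is used here because Python's built-in `round` rounds exact halves to the nearest even digit, which would turn 4.625 into 4.62.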

1. Volume of Queries for a Term (4.63)

It’s unknown whether or not our participants, who travel in the same industry circles, were aware of Payne’s experiment prior to taking the survey, but it’s obvious that most believed that search volume was the most critical influencer.

Rand Fishkin, Ian Lurie and I gave a score less than 5 to volume of queries for a term. My reasoning had more to do with ranking and research performed in Google’s own AdWords Keyword Research Tool. For example, I queried the word “nike” in a standalone browser while not logged into Google:

Screenshot of Nike Autocomplete Results

Based on the volume hypothesis, the search volume for these terms should parallel data in Google’s Keyword Tool. However, regardless of match type, the sequence and terms were completely different. In a query, the predicted results start with “nike sb”, “nike shoes”, “nikeid” and “nike shoes”. However, if they were ranked by volume, the sequence would be “nike id”, “nike store”, “nike shoes” and “nike sb”. See the data below in both broad and exact match types.
It is important to note that ALL of the terms that appear in the predictions were found to have search history in Google’s keyword database. This doesn’t discount machine learning, but is definitely noteworthy.
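
The comparison described above can be sketched as a small check: sort the terms by search volume and see whether the autocomplete order agrees. This is only an illustration. The function names are hypothetical, and the volume numbers are made up; they were chosen solely to reproduce the volume order given in the text ("nike id", "nike store", "nike shoes", "nike sb"), not taken from the Keyword Tool.

```python
def rank_by_volume(volumes):
    """Return terms sorted from highest to lowest search volume."""
    return sorted(volumes, key=volumes.get, reverse=True)


def order_matches_volume(suggestions, volumes):
    """True if the suggestions appear in descending-volume order."""
    known = [term for term in suggestions if term in volumes]
    expected = [term for term in rank_by_volume(volumes) if term in known]
    return known == expected


# Hypothetical volumes that only preserve the order stated in the text.
volumes = {"nike id": 400, "nike store": 300, "nike shoes": 200, "nike sb": 100}

# The first two autocomplete predictions reported in the text.
suggestions = ["nike sb", "nike shoes"]

print(rank_by_volume(volumes))
print(order_matches_volume(suggestions, volumes))  # prints False
```

A `False` result is what the text describes: the predictions do not follow volume order, so volume alone cannot explain the sequence.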

Exact Match via Google AdWords

Screenshot of Search Volume in AdWords

Broad Match via Google AdWords

Screenshot of Broad Match Search Volume

While volume may actually be a factor, it most definitely does not stand on its own. Payne’s experiment was fascinating because it introduced a NEW query (“brent payne manipulated this”) at a high volume and managed to inject it into the autocomplete predicted results.

Screenshot of Google SERPS Featuring Brent Payne

It is also worth noting that Payne’s first attempt to influence the search result used an active query with search history (“brent payne seo”). The pattern of search and click behavior may have triggered a flag that actually removed the autocomplete prediction. Read Brent’s post here.

Aaron Wall, SEOBook.com Founder, made it clear in his response that he believed the reason phrases appear in autocomplete “is mainly down to search volume.” Since all the results in the Nike example did have search volume history, he may be right. However, there is still insufficient data to explain the sequence (ranking) of the search terms that appear.


2. Click-Through History of Suggestions (3.25)

For those of you who don’t know what click-through rate (CTR) is, here is a link to give you some background. This article is geared toward CTR of AdWords paid search ads, but clearly states that “a keyword’s CTR is a strong indicator of relevance”.

Here’s where the unfairness starts. As you’re doing a search for a vendor or possible vendor, what’s the likelihood that you’ll click on the predicted search result that says {company} scam? If it were me, I’d be curious and inclined to click. So would others, which forces that negative search result to stick in Google Autocomplete until its CTR goes down (which isn’t likely to happen right away, thanks to curiosity). In this influencing factor, we’re identifying the importance of the history of CTR, not the actual click-through rate as it is today (see 6. Click-through rate of search results containing a keyword theme).

Those who have tried to manipulate CTR using automated programs have been filtered out, in part by an advanced algorithm that detects patterns of unusual behavior. For example, if you were to create a browser emulator on a machine using a dynamic IP that changes every 2 seconds, Google would still see the traffic coming from one OS, one browser type, and other footprints Google can easily detect.

You could also go the Brent Payne route, but thanks to all the hype his research created, Amazon has cracked down on that tactic as well.

Screenshot of Mechanical Turk Warning Message

If we were to stop at this point and make some inferences, one might identify the start of a business rule. Perhaps: A keyword must have a history of searches and a higher-than-average frequency of click-throughs. I’m still just speculating; there is much more to cover ahead.
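
To make that speculative rule concrete, here is a minimal sketch in Python. It is not Google's logic; it only encodes the two conditions guessed at above. The function name `passes_rule`, the "acme" keywords, and all numbers are hypothetical.

```python
from statistics import mean


def passes_rule(keyword, stats):
    """Speculative rule: has search history AND above-average CTR."""
    average_ctr = mean(entry["ctr"] for entry in stats.values())
    entry = stats[keyword]
    return entry["searches"] > 0 and entry["ctr"] > average_ctr


# Made-up statistics for three candidate predictions.
stats = {
    "acme scam": {"searches": 500, "ctr": 0.30},
    "acme reviews": {"searches": 800, "ctr": 0.10},
    "acme careers": {"searches": 0, "ctr": 0.00},
}

for keyword in stats:
    print(keyword, passes_rule(keyword, stats))
```

With this sample data only "acme scam" passes: it has search history and its click-through rate is above the average of the three, which mirrors the curiosity effect described earlier.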


3. Volume of Actual Search Results (3.25)

If you’re not familiar with occurrences in search results, this refers to the number of web pages that search engines believe might be a match for your query. Going back to the Nike example, Google indicates that there may be 98 million results that match that particular search:

Screenshot of Nike Search

As an experiment, I chose a different brand, Disney, and ran a simultaneous search on three browsers only to discover different volumes and different sequencing of keywords (logged out of Google on all three).

Screenshot of Three Browsers and Results

A quick call to a relative in Michigan (I’m based in California), who replicated the test, produced similar results, except that the top keywords were “disney channel”, “disney world” and “disney store”. The test produced results differentiated by geographic targeting. One could therefore infer that attempting to influence results for a local business from outside its local area may not produce the results one might expect (at least not in the local area of the business).
In other words, there may not be value for a local business to use Payne’s technique to influence search volume outside of the service area of the local business.


4. Click-Through Rate of Suggestions in Search Box (3.13)

Not too different from click-through history, this potential influencer is what may be the death blow to the negative terms. As witnessed in Payne’s test, too much, too fast, from different locations blew out the “brent payne seo” predicted search result from autocomplete.

It’s possible that data is recorded separately and later augmented from Google’s search and Google’s autocomplete features. In other words, in the short term, predictions may be more easily influenced than actual results.

This possible factor requires more experimentation.


5. Explicit Nature of a Search Query for a Given Phrase (2.88)

“XYZ Company Scam” is a phrase commonly found in company brand name queries. The explicit nature of the query, as opposed to results generated from machine learning, is the next possible criterion for ranking in Google’s current autocomplete feature.

In short, a search query may have a higher probability of ranking higher in autocomplete if searched for in its original state without modifying search terms. The searcher must complete the query “XYZ Company Scam” by selecting the phrase from the predicted results or by hitting the Enter key after typing the entire phrase.

Further testing is required to qualify this potential factor.


6. Click-through Rate of Search Results (2.88)

In this instance we are not referring to the autocomplete predictions, but the actual search results. Assume for a moment that there are 100 searches per day for “big red balloon”. The search results display the first 10 listings. If searchers are choosing one or more results at a higher frequency than a different term, the search term will appear to be more relevant.

Example (made up data):

  • 100 searches made for “big red balloon”; overall average click-through rate of results is 4%
  • 100 searches made for “big blue balloon”; overall average click-through rate of results is 20%
  • “big blue balloon” may be more likely to appear in autocomplete simply because it was selected at a higher rate in the natural results
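
The made-up example above reduces to simple arithmetic: click-through rate is clicks divided by searches. The sketch below, in Python, orders the two queries by that rate. The function name and the click counts (4 and 20, derived from the 4% and 20% figures) are illustrative assumptions, and nothing here reflects how Google actually weighs CTR.

```python
def click_through_rate(clicks, searches):
    """CTR as a fraction: clicks on results divided by searches."""
    if searches == 0:
        return 0.0
    return clicks / searches


# Made-up data matching the example: 4% and 20% of 100 searches.
queries = {
    "big red balloon": {"searches": 100, "clicks": 4},
    "big blue balloon": {"searches": 100, "clicks": 20},
}

ranked = sorted(
    queries,
    key=lambda q: click_through_rate(queries[q]["clicks"], queries[q]["searches"]),
    reverse=True,
)
print(ranked)  # prints ['big blue balloon', 'big red balloon']
```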

7. Non-Explicit Nature of Search Queries (2.75)

Latent Semantic Indexing (LSI) is a consideration spoken of frequently when discussing natural search ranking. Using clusters of similar keywords reinforces a theme rather than creating pages containing too much explicit keyword usage. Google will analyze patterns of keyword usage to help AdWords advertisers find keyword opportunities and to return a diverse array of search suggestions to users.

This potential autocomplete factor is centered around a Markov-chain type process (as used in Google PageRank). It is also referred to as machine learning, or the use of terms used for similar queries.

For example, even though you searched for “disneyland tickets”, Google may return “disneyland packages” in addition to results for the original query. For the purpose of this study, a few of the experts who participated believe ranking in autocomplete may be slightly influenced by related searches around a main keyword theme derived from a history of searches in Google.
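
As a rough illustration of what a Markov-chain type process could look like here, the Python sketch below counts which word tends to follow another in a history of past queries and suggests the most frequent followers. It is a toy first-order model under stated assumptions, not a description of Google's system; the function names and the query history are hypothetical.

```python
from collections import Counter, defaultdict


def build_transitions(queries):
    """Count how often each word follows another across past queries."""
    transitions = defaultdict(Counter)
    for query in queries:
        words = query.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            transitions[current_word][next_word] += 1
    return transitions


def suggest(prefix_word, transitions, limit=3):
    """Suggest completions for a word, most frequent follower first."""
    followers = transitions.get(prefix_word.lower(), Counter())
    return [f"{prefix_word} {word}" for word, _ in followers.most_common(limit)]


# Hypothetical query history.
history = [
    "disneyland tickets",
    "disneyland tickets",
    "disneyland tickets discount",
    "disneyland packages",
    "disneyland packages",
    "disneyland hours",
]

transitions = build_transitions(history)
print(suggest("disneyland", transitions, limit=2))
# prints ['disneyland tickets', 'disneyland packages']
```

Even a searcher who only ever typed “disneyland tickets” would be offered “disneyland packages” by this model, because the suggestion comes from the pooled history of related searches rather than from the explicit query.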


8. Quantity of Link Text Crawled on the Web (2.63)

Another factor completely outside the realm of Payne’s test, link text plays an important role in organic ranking and may be an influencer of autocomplete rank. A quick search for “click here” returns the Adobe Reader download page as the top result, despite the fact that the page does not contain the phrase anywhere in the page or source code. Googlebot reads the text within a link almost like an introduction to the page it is being sent to. See how one might analyze the anchor text used for David Mihm’s Local Ranking Factors using SEOMOZ.org’s Linkscape tool:

Screenshot of Linkscape by seomoz.org

The quantity of links (provided they are not too explicit, as they were with the “miserable failure” Googlebomb of the mid-2000s) may influence organic ranking and possibly Google’s autocomplete predictions.

It’s recommended to have a conversation with a link building expert before venturing off into link building tactics that could cause your website and company to be removed from the search results and exposed the way JC Penney was in the New York Times.

Please note that these diagrams were randomly selected for entertainment purposes only. Speak with Eric Ward, Jim Boykin, or Nate Dame before attempting to emulate any of the below scenarios.

Pictures of Link Building Diagrams (ten examples)


9. Quality of Search Results with a Keyword Theme (2.50)

In a scenario where Google autocomplete must choose between two similar queries, such as “product A” and “product B”, it’s likely that Google autocomplete may predict the query with the highest quality of content.

Example:

Screenshot of Skechers SERPS

In the above screenshot, find two products: “shape ups” and “twinkle toes”. What makes shape ups rank higher in the suggestions than twinkle toes? Both are popular products, but how does Google know which product is the most popular aside from volume alone? Quality of search results may be the answer.

The top 10 search results that appear for each query were analyzed. Even though there may be truth in the idea that Google PageRank could be a thing of the past as it pertains to ranking, PageRank was included in the measurement tools. Rand Fishkin and his team at seomoz.org have spent years analyzing ranking factors and Google’s algorithm to come up with their own relevancy measurement tool, appropriately named mozRank. mozRank and page authority were both used in the study of quality. The results are below.

Tables Comparing Link and SEO Data

The results may be coincidental, but despite the fact that twinkle toes produced 2,950,000 results and shape ups produced 2,180,000 (quantity of results), shape ups remains several suggestions higher than twinkle toes.

Note in the tables above that pages appearing within the first page of the search results for shape ups had a higher PageRank, mozRank, and page authority, despite the lack of typical SEO page structure (title, meta description, heading tag usage).

Also note that the websites associated with the highest ranked suggestion contained more inbound links to the individual pages and to the root domains.
The inference here could be that the quality of web pages, and quantity of links to those pages, may impact positioning within Google’s autocomplete. One might also add that the value of the h1 tag in this instance means less than it might in natural/organic search.


10. Quality of Web Pages Containing Link Text (2.50)

Similar to Factor #8, the quality of the pages that link anywhere using the keyword within the link text may be a factor in autocomplete predictions. Thanks to tools such as Yahoo! Site Explorer and OpenSiteExplorer.org, it’s not difficult to research how others are linking to a particular website. It is, however, difficult to find ALL links on the Internet with a given link text.

If anyone has a web crawler that can search for a specific phrase and identify whether or not the text is within an anchor tag, please use the contact information at the end of this article so that the next version will have more data to base this potential influencer on.
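
The per-page half of that request, deciding whether a phrase sits inside an anchor tag, can be sketched with Python's standard `html.parser` module. This is not a crawler: fetching pages and discovering every link on the Internet is the hard part and is left out. The class name `AnchorTextFinder`, the helper `phrase_in_anchor_text`, and the sample HTML are assumptions made for illustration.

```python
from html.parser import HTMLParser


class AnchorTextFinder(HTMLParser):
    """Collect the visible text of every <a> element in an HTML document."""

    def __init__(self):
        super().__init__()
        self._depth = 0        # greater than 0 while inside an <a> element
        self._buffer = []      # text pieces of the anchor being read
        self.anchor_texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                # Join the pieces and collapse runs of whitespace.
                text = " ".join("".join(self._buffer).split())
                self.anchor_texts.append(text)
                self._buffer = []

    def handle_data(self, data):
        if self._depth > 0:
            self._buffer.append(data)


def phrase_in_anchor_text(html, phrase):
    """True if the phrase occurs inside the text of any link on the page."""
    finder = AnchorTextFinder()
    finder.feed(html)
    finder.close()
    phrase = phrase.lower()
    return any(phrase in text.lower() for text in finder.anchor_texts)


# Hypothetical page: the phrase appears once as link text, once as plain text.
sample = (
    '<p>To download, <a href="https://example.com/reader">click here</a>.</p>'
    "<p>Do not click here.</p>"
)
print(phrase_in_anchor_text(sample, "click here"))  # prints True
```

Run against many crawled pages, a check like this would separate occurrences of a phrase used as link text from occurrences in ordinary body copy.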


11. Quality of Root Domain in SERPS (2.38)

Refer to tables in Factor #9. In reviewing the data from the first 10 results that appear for a given query, there are two different important variables to consider: links and PageRank (or mozRank, which isn’t represented in the data above).

Based on the individual page data and the quantity of links pointing to the root domains, one could make a safe assumption that quality of root domain may play a role in the ranking of predictions in Google autocomplete.

Additional research can be performed to better understand this potential factor.


12. Overall Diversity of Search Suggestions Beyond Sentiment (2.13)

This potential Google autocomplete suggestion influencer is very similar to factor #7 in that it is based on a pattern of assumptions Google might make based on machine learning. A possible way one might think to “game the system” to achieve higher placement would be to create multiple WordPress blogs and cross-link them through various non-relevant channels. Assuming the web pages within these blogs are optimized for search and the links are not coming from link farms or paid links in website sidebar navigation, this technique might actually work.

A good example of this would be business directories, such as Yelp, ServiceMagic, and so forth. Take a look at how they dominate a majority of the search results:

Google Screenshot

These directories have earned their Google PageRank through the years by offering high quality listings, earning links from advertisers who pin up “Listed in” and “Official Member of” badges on their websites, great site structure, and age of a website.

Because it’s difficult for a small business to compete with the gigantic directories, Google may be injecting a small number of less popular websites into the mix, simply to ensure that all of the listings aren’t directories, or all blogs, or all article websites.

Google might then catalog and classify the “type” of web pages and websites within the search results, and then inject a few less relevant and less popular alternatives to allow diversity. The slightly more prudent approach taken in the late 2000s included “Universal Search”, which allowed images, news, products, maps, and video to appear within the natural search results.

It could be argued that Universal Search is how Google handles diversity of results. Then again, we may need to study the saturation and SEO focal points of the various web directories that appear for a contractor-related phrase before ruling out Universal Search.


13. Diversity of Sentiment (Negative and Positive) (1.88)

There are a number of knowledgeable search marketing experts who strongly believe that the search engines want to deliver a natural mix of sentiment, along with relevancy, popularity, and search behavior history. This would mean that a listing with a negative sentiment (scam, fraud, etc) may show up in the search results even if the top 10 results are more relevant, popular and have greater click-through history, simply because Google wants diversity.

Then there are the rest of us who believe otherwise, which might explain the low 1.88 score this potential influencer received. Further data is needed to support this idea. If you have it, please share and we’ll include it in the next update of this study.


Potential Factors Not Included in the Survey

Trending

Ian Lurie discusses the trending of searches as a possible factor in predicted search suggestions. If you have insights or data to add to this, please connect with us below.

Competitive in AdWords Bidding

Tim Eschenauer states that “Google knows many searchers are influenced by what they suggest, at least at the beginning stages of their search. You’d think they’d at least try to lead them to the paid results.” Share your thoughts with us on the impact of PPC search data to organic autocomplete suggestions and we will add it to our next survey.

Personalized Search

Google admits that “you may see search queries from relevant searches that you’ve done in the past.” Many SEO experts, including contributors to this article, frequently search for their own rank on terms such as “SEO Expert” and “SEO Consultant”. Even with Personalized Search turned on, autocomplete does not return predictions for either query:

Screenshot of Google SEO Search

If you have examples of different circumstances or testing with regard to autocomplete and Personalized Search, please let us know and we will include them in the next survey.


Additional Commentary

Cayley Vos, Owner

Netpaths, Los Angeles Search Engine Optimization Company

We have seen dramatic increase in clicks through to long tail searches suggested by Google. It also allows other websites to capitalize on brand searches. We ranked for a keyword ‘brand discount’. An example search would be “Nike Discount” and this search greatly helped the smaller retailer with a 25% traffic increase.


Tom Critchlow, Head of Search

Distilled, an Internet Marketing Company in London

I’ve only done some testing around this subject so don’t have definitive answers but I’m a strong believer that the search suggest options are generated from Markov-chain type processes. Obviously on top of that if there is large search volume Google wants to add value by using CTR, volume of mentions etc etc.

But Google also wants to provide search suggest results for long tail queries and I believe they do this by document analysis, Markov-chaining and machine learning because I’ve seen numerous examples of search results where for example there is a “brand x sucks” search suggest but where this phrase appears nowhere online and is also not searched for therefore I believe this hints that Google are creating this from somewhere by knowing that “brand x” refers to a brand and that “brand x sucks” is a common search format (obviously this is a very basic example but demonstrates the concept). Remember that Google drove a car through a city for 100,000 miles without a driver. They know a thing or two about machine learning.


Ian Lurie, President

Conversation Marketing, CEO, Portent

I think trending is also really, really important. If a particular set of phrases around a topic are trending sharply, those phrases are more likely to end up in search suggest.

Search suggest is influenced primarily by click-through and trends. If increasing numbers of people are searching on phrases and clicking through at a steady rate, and those phrases are easily clustered around a topic, you can bet they’ll end up in Search Suggest.


Rand Fishkin, CEO

seomoz.org

I suspect the formula for it is both more and less complex than we’d suspect – less complex in that the quantity of signals may be just a few (maybe amount of use/appearance on the web, search volume and usage/CTR) and more complex, e.g. machine learning against a sample set of queries and suggestions to analyze the usefulness/effectiveness of the algorithm producing them (and whether they’re working well for users).

Search suggest is likely one of Google’s best methods for helping those stuck on what to query and reducing the uncertainty of intent. I suspect it also positively influences number of searches per searcher, particularly the number of searches where ads appear.


Aaron Wall, SEOBook Founder

www.SEOBook.com

In my opinion, Google autocomplete predictions mainly come down to search volume. The more a phrase is searched, the more likely it will be to appear in the suggestions.


Tim Eschenauer, SEO/SEM Strategist

Austin & Williams, an Advertising Agency

My feeling is that Google Suggests terms that have high search volumes and perhaps are competitive on the PPC space. Google knows many searchers are influenced by what they suggest at least at the beginning stages of their search. You’d think they’d at least try to lead them to the paid results.


Special Thanks

We know how busy life gets when you’re the leader in your industry. We want to thank those who participated in the study despite their crazy schedules. Not listed above, but worthy of mention for participating nonetheless, are:

Danny Sullivan, Editor-in-Chief
SearchEngineLand.com

Robert Wright, Search Marketing Expert
MrWebGuru.com

Comments & Feedback

To participate in the next study, please send your case study data to info@seosteve.com. To comment on the first run at analyzing Google autocomplete, please use the Facebook comment section below. Thanks for reading!

About the Author

Steve Wiideman has been practicing the crafts of Search Engine Optimization and Search Engine Marketing for nearly a decade, working for corporations on paid, organic, local, and product search. He has authored several popular eBooks and is acclaimed for his ranking in Google for the term “SEO Expert”. Wiideman hosts weekly workshops at Creative Search Strategies and continues to work as an SEO Consultant for a handful of popular brands.