Personalized search

Personalized search refers to search experiences that are tailored specifically to an individual's interests by incorporating information about the individual beyond the specific query provided. Pitkow et al. describe two general approaches to personalizing search results, one involving modifying the user’s query and the other re-ranking search results.[1]

History

Google introduced personalized search in 2004 and implemented it in Google Search in 2005. Google applies personalized search to everyone, not just to those who have a Google account. Little information is available on exactly how Google personalizes its searches; however, it is believed to use the user's language, location, and web history.[2]

Early search engines, like Yahoo! and AltaVista, found results based only on keywords. Personalized search, as pioneered by Google, has become far more complex, with the goal to "understand exactly what you mean and give you exactly what you want."[3] Using mathematical algorithms, search engines are now able to return results based on the number of links to and from sites; the more links a site has, the higher it is placed on the page.[4] Search engines have two degrees of expertise: the shallow expert and the deep expert. A shallow expert serves as a witness who knows some specific information about a given event. A deep expert, on the other hand, has comprehensive knowledge that gives it the capacity to deliver unique information that is relevant to each individual inquirer.[5] If a person knows what he or she wants, then the search engine will act as a shallow expert and simply locate that information. But search engines are also capable of deep expertise in that they rank results, indicating that those near the top are more relevant to a user's wants than those below.[6]

While many search engines take advantage of information about people in general, or about specific groups of people, personalized search depends on a user profile that is unique to the individual. Research systems that personalize search results model their users in different ways. Some rely on users explicitly specifying their interests or on demographic or cognitive characteristics.[7][8] But user-supplied information can be hard to collect and keep up to date. Others have built implicit user models based on content the user has read or their history of interaction with Web pages.[9][10][11][12][13]

There are several publicly available systems for personalizing Web search results (e.g., Google Personalized Search and Bing's search result personalization[14]). However, the technical details and evaluations of these commercial systems are proprietary. One technique Google uses to personalize searches is to track log-in time and whether the user has enabled web history in the browser. The more often a user returns to the same site through a Google search result, the more strongly Google infers that the user likes that page. When the user later performs certain searches, Google's personalized search algorithm gives the page a boost, moving it up through the ranks. Even when a user is signed out, Google may personalize results because it keeps a 180-day record of what a particular web browser has searched for, linked to a cookie in that browser.[15]
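The boosting behaviour described above can be illustrated with a small sketch. It is not Google's algorithm, whose details are proprietary; the function name rerank_with_history, the additive scoring rule, the boost value, and the example URLs are all assumptions chosen for illustration.

```python
from collections import Counter


def rerank_with_history(results, visit_counts, boost=0.5):
    """Re-rank (url, base_score) pairs, favouring URLs the user visited before.

    Assumed scoring rule: personalized score = base_score + boost * visits.
    """
    def personalized_score(item):
        url, base_score = item
        return base_score + boost * visit_counts.get(url, 0)

    # sorted() is stable, so results with equal scores keep their original order.
    ranked = sorted(results, key=personalized_score, reverse=True)
    return [url for url, _ in ranked]


# Hypothetical history: the user clicked the same result three times.
history = Counter(["example.org/recipes"] * 3)

# Hypothetical unpersonalized ranking with base relevance scores.
results = [
    ("example.com/a", 3.0),
    ("example.net/b", 2.5),
    ("example.org/recipes", 2.0),
]

reranked = rerank_with_history(results, history)
print(reranked)
```

With an empty history the function returns the original order, which corresponds to an unpersonalized result list.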

To better understand how personalized search results are presented to users, a group of researchers at Northeastern University compared an aggregate set of searches from logged-in users against a control group. The research team found that 11.7% of results show differences due to personalization; however, this varies widely by search query and result ranking position.[16] Of the various factors tested, the two that had measurable impact were being logged in with a Google account and the IP address of the searching user. Results with high degrees of personalization include companies and politics. One of the factors driving personalization is localization of results, with company queries showing store locations relevant to the location of the user. For example, a search for "used car sales" may return results for local car dealerships in the user's area. On the other hand, queries with the least personalization include factual queries ("what is") and health.[17]
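A comparison of this kind can be sketched as follows. This is a simplified position-by-position comparison, not the metric used in the study; the function names and the example result lists are hypothetical.

```python
def fraction_changed(control, treatment):
    """Share of rank positions whose URL differs between two result lists."""
    length = max(len(control), len(treatment))
    if length == 0:
        return 0.0
    differing = sum(
        1
        for i in range(length)
        # A position missing from either list counts as a difference.
        if i >= len(control) or i >= len(treatment) or control[i] != treatment[i]
    )
    return differing / length


def average_personalization(pairs):
    """Mean fraction_changed over (control, treatment) pairs, one pair per query."""
    if not pairs:
        return 0.0
    return sum(fraction_changed(c, t) for c, t in pairs) / len(pairs)


# Hypothetical results for two queries: the first differs at ranks 3 and 4,
# the second is identical for both accounts.
pairs = [
    (["a", "b", "c", "d"], ["a", "b", "d", "c"]),
    (["x", "y"], ["x", "y"]),
]
rate = average_personalization(pairs)
print(rate)
```

In practice the control account would issue the same query at the same time as the test account, so that differences unrelated to personalization are not counted.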

When measuring personalization, it is important to eliminate background noise. In this context, one type of background noise is the carry-over effect: when a user performs a search and follows it with a subsequent search, the results of the second search are influenced by the first search. This is a form of personalization based on recent search history, but it is not a consistent element of personalization because the phenomenon times out after 10 minutes, according to the researchers. The researchers also noted that the top-ranked URLs are less likely to change because of personalization, with most personalization occurring at the lower ranks.[18]

The Filter Bubble

Main article: Filter bubble

Several concerns have been raised regarding personalized search. It decreases the likelihood of finding new information by biasing search results towards what the user has already found. It also introduces potential privacy problems: a user may not be aware that their search results are personalized for them, and may wonder why the things that they are interested in have become so relevant. Author Eli Pariser coined the term "filter bubble" for this problem. He argues that people are letting major websites drive their destiny and make decisions based on the vast amount of data they have collected on individuals. This can isolate users in their own worlds, or "filter bubbles", where they only see information that they want to see, a consequence of "The Friendly World Syndrome." As a result, people are much less informed of problems in the developing world, which can further widen the gap between the North (developed countries) and the South (developing countries).[19]

Methods of personalization can “promote” certain results that have been showing up regularly in searches by like-minded individuals in the same community, which makes it easy to understand how the filter bubble happens. As certain results are bumped up and viewed more by individuals, other results not favored by them are relegated to obscurity. As this happens on a community-wide level, it results in the community, consciously or not, sharing a skewed perspective of events.[20]

An area of particular concern in some parts of the world is the use of personalized search as a form of control over the people using it, by giving them only particular information. This can be used to exert influence over widely discussed topics such as gun control, or even to steer people toward siding with a particular political regime in different countries.[21] While total control by a particular government through personalized search alone is a stretch, the information readily available from searches can easily be controlled by the richest corporations. The biggest example of a corporation controlling information is Google. Google not only delivers the information it wants users to see but at times also uses personalized search to steer users toward its own companies or affiliates. This has led to control of various parts of the web and the pushing out of competitors; for example, Google Maps took a dominant position in the online map and directions industry, with MapQuest and others forced to take a backseat.[22]

Many search engines use concept-based user profiling strategies that derive only topics that users are highly interested in, but for best results, according to researchers Wai-Tin and Dik Lun, both positive and negative preferences should be considered. Profiles that apply both negative and positive preferences yield the highest-quality and most relevant results by separating similar queries from dissimilar ones. For example, typing in 'apple' could refer to either the fruit or the Macintosh computer, and capturing both kinds of preference helps a search engine learn which apple the user is really looking for based on the links clicked. One concept-based strategy the researchers devised to improve personalized search and capture both positive and negative preferences is the click-based method. This method captures a user's interests based on which links they click on in a results list, while downgrading unclicked links.[23]
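The idea behind the click-based method can be illustrated with a minimal sketch. It is not the researchers' actual algorithm; the function name update_profile, the reward and penalty values, and the concept labels attached to each URL are assumptions made for illustration.

```python
from collections import defaultdict


def update_profile(profile, shown, clicked, reward=1.0, penalty=0.5):
    """Update concept weights from one results page.

    shown   -- list of (url, concepts) pairs displayed to the user
    clicked -- set of URLs the user clicked
    Concepts of clicked links gain weight (positive preference); concepts of
    unclicked links lose weight (negative preference).
    """
    for url, concepts in shown:
        delta = reward if url in clicked else -penalty
        for concept in concepts:
            profile[concept] += delta
    return profile


# Hypothetical results page for the ambiguous query 'apple'.
shown = [
    ("example.com/apple-pie", ["fruit"]),
    ("example.com/macintosh", ["computer"]),
    ("example.com/orchards", ["fruit"]),
]

profile = defaultdict(float)
update_profile(profile, shown, clicked={"example.com/macintosh"})
print(dict(profile))
```

After this single interaction the profile holds a positive weight for the computer concept and a negative weight for the fruit concept, which a ranking function could then use to order future results for the same query.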

The feature also has profound effects on the search engine optimization industry, because search results will no longer be ranked the same way for every user.[24] An example of this is found in Eli Pariser's The Filter Bubble, in which he had two friends type "BP" into Google's search bar. One friend found information on the BP oil spill in the Gulf of Mexico while the other retrieved investment information.[25]

Some have noted that personalization serves to customize not only a user's search results but also advertisements. This has been criticized as an invasion of privacy.[26]

The Case of Google

Google is a leading example of search personalization. Google is not just a search engine, but a corporation that is entering every facet of our lives. Personalization with Google has gone far beyond search. There are a host of new applications, all of which can be personalized and integrated with the help of a Google account. Personalizing search does not require an account. However, one is almost deprived of a choice, since so many useful Google products are only accessible with a Google account. The Google Dashboard, introduced in 2009, covers more than 20 products and services, including Gmail, Calendar, Docs, and YouTube,[27] and keeps track of all the information directly under one’s name. The free Google Custom Search is available for individuals and big companies alike, providing search functionality for individual websites and powering corporate sites such as that of the New York Times. The high level of personalization available with Google played a significant part in helping it remain the world’s most popular search engine.

One prominent example of Google’s ability to personalize search is Google News. Google has geared its news to show everyone a few similar articles that can be deemed interesting, but as soon as the user scrolls down, the news articles begin to differ. Google takes into account past searches as well as the location of the user to make sure that local news gets to them first. This can make searching much easier and reduce the time spent going through the news to find the desired information. The concern, however, is that very important information can be held back because it does not match the criteria that the program sets for the particular user. This can create the “filter bubble” described earlier.[28]

A point about personalization that often gets overlooked is the tension between privacy and personalization. While the two do not have to be mutually exclusive, it is often the case that as one becomes more prominent, it compromises the other. Google provides a host of services, and many of these services do not require information to be collected about a person to be customizable. Since there is no threat of privacy invasion with these services, the balance has been tipped to favor personalization over privacy, even when it comes to search. As people reap the rewards of convenience from customizing their other Google services, they desire better search results, even if these come at the expense of private information. Where to draw the line in the tradeoff between information and search results is new territory, and Google gets to make that decision. Until people get the power to control the information that is being collected about them, Google is not truly protecting privacy. Google’s popularity as a search engine and Internet browser has allowed it to gain a lot of power. Its popularity has created millions of usernames, which have been used to collect vast amounts of information about individuals. Google can use multiple methods of personalization such as traditional, social, geographic, IP address, browser, cookies, time of day, year, behavioral, query history, bookmarks, and more. Although many people would say that having Google personalize search results based on previous searches is a good thing, there are negatives that come with it.[29][30] With the power from this information, Google has chosen to bully its way into other sectors in which it owns services, such as videos, document sharing, shopping, maps, and many more. Google has done this by steering searchers to its own services as opposed to others such as MapQuest.

Using search personalization, Google has doubled its video market share to about 80 percent. The legal definition of a monopoly is when a firm gains control of 70 to 80 percent of the market. Google has reinforced this monopoly by creating significant barriers to entry, such as manipulating search results to show its own services. This can be clearly seen with Google Maps being the first thing displayed in most searches.

The analytical firm Experian Hitwise stated that since 2007, MapQuest has had its traffic cut in half because of this. Other statistics from around the same time include Photobucket going from 20 percent of market share to only 3 percent, Myspace going from 12 percent market share to less than 1 percent, and ESPN from 8 percent to 4 percent market share. In terms of images, Photobucket went from 31 percent in 2007 to 10 percent in 2010. Even Yahoo Images has gone from 12 percent to 7 percent. It becomes very apparent that the decline of these companies has come because of Google’s increase in market share from 43 percent in 2007 to about 55 percent in 2009.

It might be easy to say that all of this has come from Google being more dominant because it provides better services. However, Experian Hitwise has also created graphs showing the market share of about 15 different companies at once. This has been done for every category, including pictures, videos, product search, and more. The graph for product search is evidence enough of Google’s bullying because its numbers went from 1.3 million unique visitors to 11.9 million unique visitors in one month. That kind of growth can only come with the change of a process.

In the end, these graphs share two common themes. The first is that Google’s market share has an inverse relationship to the market share of the leading competitors. The second is that this inverse relationship began around 2007, which is around the time that Google began to use its “Universal Search” method.[31]

Benefits

One of the most critical benefits of personalized search is improving the quality of decisions consumers make. The internet has made the transaction cost of obtaining information significantly lower than ever. However, humans' capacity for processing information has not expanded much.[32] When facing an overwhelming amount of information, consumers need a sophisticated tool to help them make high-quality decisions. Two studies examined the effects of personalized screening and ordering tools, and the results show a positive correlation between personalized search and the quality of consumers’ decisions.

The first study was conducted by Kristin Diehl from the University of South Carolina. Her research found that reducing search cost led to lower-quality choices. The reason was that ‘consumers make worse choices because lower search costs cause them to consider inferior options.’ It also showed that if consumers have a specific goal in mind, they search further, resulting in an even worse decision.[33] The second study, by Gerald Häubl from the University of Alberta and Benedict G.C. Dellaert from Maastricht University, mainly focused on recommendation systems. Both studies concluded that a personalized search and recommendation system significantly improved consumers’ decision quality and reduced the number of products inspected.[34]

Models

Personalized search has gained popularity because of the demand for more relevant information. Research has indicated low success rates among major search engines in providing relevant results; in 52% of 20,000 queries, searchers did not find any relevant results within the documents that Google returned.[35] Personalized search can improve search quality significantly, and there are two main ways to achieve this goal.

The first model is based on users’ historical searches and search locations. People are probably familiar with this model, since search results often reflect their current location and previous searches.

The second model uses a recommendation system. In “Personalized Search: Integrating Collaboration and Social Networks”, Bracha Shapira and Boaz Zabar focused on such a model.[36] This model shows results of other users who have searched for similar keywords. The authors examined keyword search, the recommendation system, and the recommendation system with a social network, each working separately, and compared the results in terms of search quality. The results show that a personalized search engine with the recommendation system produces better-quality results than the standard search engine, and that the recommendation system with a social network improves results even further.
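A recommendation model of this general kind can be sketched as follows. This is an illustrative simplification, not Shapira and Zabar's system: the use of Jaccard similarity between keyword sets, the function names, the similarity threshold, and the example search log are all assumptions, and the social-network component is not modelled.

```python
from collections import Counter


def keyword_overlap(query_a, query_b):
    """Jaccard similarity between the keyword sets of two queries."""
    a = set(query_a.lower().split())
    b = set(query_b.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(query, search_log, min_similarity=0.5, top_n=3):
    """Suggest URLs that other users clicked after searching similar keywords.

    search_log -- list of (user, past_query, clicked_url) records
    Each sufficiently similar past query votes for its clicked URL, weighted
    by how similar it is to the current query.
    """
    votes = Counter()
    for _user, past_query, url in search_log:
        similarity = keyword_overlap(query, past_query)
        if similarity >= min_similarity:
            votes[url] += similarity
    return [url for url, _ in votes.most_common(top_n)]


# Hypothetical search log from other users.
search_log = [
    ("ann", "jaguar car price", "example.com/cars"),
    ("bob", "jaguar car review", "example.com/cars"),
    ("cat", "jaguar habitat", "example.org/wildlife"),
]

suggestions = recommend("jaguar car", search_log)
print(suggestions)
```

A social-network variant could, for example, restrict or weight the log to records from users connected to the searcher, but that extension is left out of this sketch.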

Disadvantages

While there are documented benefits of search personalization, there are also arguments against its use. The foundation of these arguments is that personalization confines internet users’ search engine results to material that aligns with the users’ interests and history. It limits users’ exposure to material that would be relevant to the search query but is not displayed because it differs from the user’s interests and history. Search personalization takes the objectivity out of the search engine and undermines the engine. “Objectivity matters little when you know what you are looking for, but its lack is problematic when you do not”.[37] One of the main functions of the internet is the collection and sharing of information, and the criticism of search personalization is that it limits this core function of the web. It prevents users from easily accessing all the information that is available for a specific search query. Search personalization adds a bias to users’ search queries. If a user has a particular set of interests or internet history and uses the web to research a controversial issue, the user’s search results will reflect that. The user may not be shown both sides of the issue if the user’s interests lean to one side or another, and may miss information that could be important. A study of search personalization and its effects on search results in Google News found that different users received news stories in different orders even though each user entered the same search query. “When I further distilled the results, I saw that only 12% of the searchers had the same three stories in the same order. This to me is prima facie evidence that there is filtering going on”.[38] If search personalization had not been active, all the results should in theory have been the same stories in an identical order.

Another disadvantage of search personalization is that internet companies such as Google are gathering and potentially selling users' internet interests and histories to other companies. This raises a privacy issue: whether people are content with companies gathering and selling their internet information without their consent or knowledge. Many web users are unaware of the use of search personalization, and even fewer know that user data is a valuable commodity for internet companies.

Sites that use Personalized Search

Eli Pariser, author of The Filter Bubble, explains how search personalization differs between Facebook and Google. Facebook bases personalization on the amount of things we share and on what pages we “like”. It also takes into consideration our social interactions: whose profiles we visit the most and who we message or chat with are all indicators that Facebook uses for personalization. Google, by contrast, relies not on what we share but on what we “click” to filter what comes up in our searches. In addition, Facebook searches are not necessarily as private as Google searches. Facebook draws on the more public self, and we share what other people want to see. Even when tagging photographs, Facebook uses personalization and face recognition to automatically assign a name to a face without the user having to tag it. With Google, we are provided similar websites and resources based on what we initially click on. Personalization is not limited to Google and Facebook; other websites also use filtering to better match user preferences. For example, Netflix uses the user's search history to suggest movies that they may be interested in. Sites like Amazon and personal shopping sites also use other people's history in order to serve users' interests better. Twitter also uses personalization by “suggesting” other people to follow. In addition, based on who we “follow” and who we “tweet” and “retweet” at, Twitter filters content according to what it judges to be in our best interest. Mark Zuckerberg, founder of Facebook, believed that we only have one identity. Pariser argues that this is completely false and that search personalization is further evidence against it. Although personalized search may seem helpful, it is not a very accurate representation of who we are as people.
There are also instances where people search for and share things in order to make themselves look better. For example, someone may look up and share political articles and other intellectual articles for this reason. Search personalization is not an ideal representation of any person. Many sites are used for different purposes; together they do not make up one person’s identity, but are in fact false representations of ourselves.[39]

Personalized Search and Online Shopping

Search engines, such as Google and Yahoo!, utilize personalized search to attract possible customers to products that fit their presumed desires. Based on a large amount of data aggregated from an individual’s web clicks, search engines can use personalized search to put forth advertisements that may pique the interest of an individual. Utilizing personalized search can help consumers find what they want faster, as well as help match up products and services to individuals within more specialized or niche markets. Many of the products or services that are sold via personalized online results would struggle to sell in brick-and-mortar stores. These types of products and services are called long tail items.[40] Using personalized search allows faster product and service discovery for consumers, and reduces the amount of advertising money that must be spent to reach those consumers. In addition, utilizing personalized search can help companies determine which individuals should be offered online coupon codes for their products or services. By tracking whether an individual has perused its website, considered purchasing an item, or previously made a purchase, a company can post advertisements on other websites to reach that particular consumer in an attempt to have them make a purchase.

Aside from aiding consumers and businesses in finding one another, personalized search greatly benefits the search engines that provide it. The more data collected on an individual, the more personalized the results will be. In turn, this allows search engines to sell more advertisements because companies understand that they will have a better opportunity to sell to high-percentage matched individuals than to medium- and low-percentage matched individuals. This aspect of personalized search angers many scholars, such as William Badke and Eli Pariser, because they believe personalized search is driven by the desire to increase advertisement revenues. In addition, they believe that personalized search results are frequently utilized to sway individuals into using products and services that are offered by the particular search engine company or any other company partnered with it. For example, a Google search for any company with at least one brick-and-mortar location will offer a map showing the closest company location, using the Google Maps service, as the first result to the query.[41] In order to use other mapping services, such as MapQuest, a user would have to dig deeper into the results. Another example pertains to vaguer queries. Searching for the word “shoes” using the Google search engine will return several advertisements for shoe companies that pay Google to link their website as a first result for consumers’ queries.

References

  1. Pitkow, James; Hinrich Schütze; Todd Cass; Rob Cooley; Don Turnbull; Andy Edmonds; Eytan Adar; Thomas Breuel (2002). "Personalized search". Communications of the ACM (CACM) 45 (9): 50–55.
  2. http://personalization.ccs.neu.edu/paper.pdf
  3. Remerowski, Ted (2013). National Geographic: Inside Google.
  4. Remerowski, Ted (2013). National Geographic: Inside Google.
  5. Simpson, Thomas (2012). "Evaluating Google as an epistemic tool". Metaphilosophy 43 (4): 426–445.
  6. Simpson, Thomas (2012). "Evaluating Google as an epistemic tool". Metaphilosophy 43 (4): 426–445.
  7. Ma, Z.; Pant, G.; Sheng, O. (2007). "Interest-based personalized search". ACM TOIS 25 (5).
  8. Frias-Martinez, E.; Chen, S.Y.; Liu, X. (2007). "Automatic cognitive style identification of digital library users for personalization". JASIST 58 (2): 237–251. doi:10.1002/asi.20477.
  9. Chirita, P.; Firan, C.; Nejdl, W. (2006). "Summarizing local context to personalize global Web search". SIGIR: 287–296.
  10. Dou, Z.; Song, R.; Wen, J.R. (2007). "A large-scale evaluation and analysis of personalized search strategies". WWW: 581–590.
  11. Shen, X.; Tan, B.; Zhai, C.X. (2005). "Implicit user modeling for personalized search". CIKM: 824–831.
  12. Sugiyama, K.; Hatano, K.; Yoshikawa, M. (2004). "Adaptive web search based on user profile constructed without any effort from the user". WWW: 675–684.
  13. Teevan, J.; Dumais, S.T.; Horvitz, E. (2005). "Personalizing search via automated analysis of interests and activities". SIGIR: 415–422.
  14. Crook, Aidan, and Sanaz Ahari. "Making search yours". Bing. Retrieved 14 March 2011.
  15. Sullivan, Danny. "Of "Magic Keywords" and Flavors Of Personalized Search At Google". Retrieved 21 April 2014.
  16. Briggs, Justin. "A Better Understanding of Personalized Search". Retrieved 21 April 2014.
  17. Briggs, Justin. "A Better Understanding of Personalized Search". Retrieved 21 April 2014.
  18. Briggs, Justin. "A Better Understanding of Personalized Search". Retrieved 21 April 2014.
  19. Pariser, Eli (2011). The Filter Bubble.
  20. Smyth, B. (2007). "Adaptive Information Access: Personalization And Privacy". International Journal Of Pattern Recognition & Artificial Intelligence: 183–205.
  21. Pariser, Eli (2011). The Filter Bubble.
  22. "Traffic Report: How Google is squeezing out competitors and muscling into new markets". Retrieved 27 April 2014.
  23. Wai-Tin, Kenneth; Dik Lun, L (2010). "Deriving concept-based user profiles from search engine logs". IEEE Transactions on Knowledge and Data Engineering 22 (7): 969–982. doi:10.1109/tkde.2009.144.
  24. "Google Personalized Results Could Be Bad for Search". Network World. Retrieved July 12, 2010.
  25. Pariser, Eli (2011). The Filter Bubble.
  26. "Search Engines and Customized Results Based on Your Internet History". SEO Optimizers. Retrieved 27 February 2013.
  27. Mattison, D. (2010). "Time, Space, And Google: Toward A Real-Time, Synchronous, Personalized, Collaborative Web". Searcher: 20–31.
  28. Pariser, Eli (2011). The Filter Bubble.
  29. Jackson, Mark. "The Future of Google's Search Personalization". Retrieved 29 April 2014.
  30. Harry, David. "Search Personalization and the User Experience". Retrieved 29 April 2014.
  31. "Traffic Report: How Google is Squeezing out Competitors and Muscling into New Markets". ConsumerWatchDog.org. Retrieved 29 April 2014.
  32. Diehl, K. (2003). Personalization and Decision Support Tools: Effects on Search and Consumer Decision Making. Advances In Consumer Research, 30(1), 166-169.
  33. Diehl, K. (2003). Personalization and Decision Support Tools: Effects on Search and Consumer Decision Making. Advances In Consumer Research, 30(1), 166-169.
  34. Diehl, K. (2003). Personalization and Decision Support Tools: Effects on Search and Consumer Decision Making. Advances In Consumer Research, 30(1), 166-169.
  35. Coyle, M., & Smyth, B. (2007). Information recovery and discovery in collaborative web search. Advances in Information Retrieval (pp. 356–367).
  36. Shapira, B., & Zabar, B. (2011). Personalized search: Integrating collaboration and social networks. Journal Of The American Society For Information Science & Technology, 62(1), 146-160. doi:10.1002/asi.21446
  37. Simpson, Thomas W. (2012). "Evaluating Google As An Epistemic Tool". Metaphilosophy 43.4: 426–445. doi:10.1111/j.1467-9973.2012.01759.x.
  38. Bates, Mary Ellen (2011). "Is Google Hiding My News?" 35.6.
  39. http://www.sp.uconn.edu/~jbl00001/pariser_the%20filter%20bubble_introduction.pdf
  40. Badke, William. “Personalization and Information Literacy”. Online, 47. Feb. 2012.
  41. Inside Google. "Traffic Report: How Google Is Squeezing Out Competitors and Muscling Into New Markets." Consumer Watchdog. http://www.consumerwatchdog.org, 2 June 2010. Web.