Top 10 Myths about Search Engines

    Technology  2022-05-12  1 Engine Overview.PDF


    • Myth: Some search engines are close to “perfect”.
    • Fact: They seem “perfect” only because you have no other choice.
        - Search engines lower our expectations
        - We are getting used to their poor performance


    • Myth: There are magic algorithms in search engines.
    • Fact: No single magic algorithm can win the search battle for you.
        - PageRank is not as important as you think. It is only one small factor among the many that search engines use to determine ranking
        - Search algorithms keep improving
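The point that PageRank is only one factor among many can be made concrete with a small sketch. The signal names and weights below are hypothetical, chosen purely for illustration, and the weighted sum is one simple way to combine signals. It is not a description of how any real search engine ranks results.

```python
from dataclasses import dataclass

# Hypothetical signal weights, for illustration only. Real engines use many
# more signals and typically learn how to combine them from data.
WEIGHTS = {"text_match": 0.5, "freshness": 0.2, "click_rate": 0.2, "pagerank": 0.1}


@dataclass
class Page:
    url: str
    signals: dict  # signal name -> value in [0, 1]


def score(page, weights=WEIGHTS):
    """Weighted sum over all signals; a missing signal counts as 0."""
    return sum(w * page.signals.get(name, 0.0) for name, w in weights.items())


def rank(pages, weights=WEIGHTS):
    """Return pages sorted from highest to lowest combined score."""
    return sorted(pages, key=lambda p: score(p, weights), reverse=True)


pages = [
    # High PageRank, but weak on every other signal.
    Page("a.example", {"text_match": 0.3, "freshness": 0.2, "click_rate": 0.2, "pagerank": 0.9}),
    # Low PageRank, but strong on the other signals.
    Page("b.example", {"text_match": 0.9, "freshness": 0.7, "click_rate": 0.6, "pagerank": 0.2}),
]
ranked = rank(pages)
print([p.url for p in ranked])
```

Under these assumed weights, the page with the much higher PageRank still ranks second, because the other signals dominate the combined score.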


    • Myth: Most of the information on the Web has been indexed by search engines.
    • Fact: Only a very tiny fraction of Web information is indexed.
        - Seen URLs >> crawled URLs
        - Dynamic content, the deep Web, Web 2.0 content
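The gap between seen and crawled URLs follows from how a crawler works: each fetched page reveals several new links, while the fetch budget is limited. The sketch below uses a hypothetical toy link graph and a plain breadth-first crawl. It is only meant to show the effect, not to model a production crawler.

```python
from collections import deque

# Toy link graph (hypothetical): page -> outgoing links.
LINKS = {
    "home": ["a", "b", "c"],
    "a": ["d", "e"],
    "b": ["f", "g"],
    "c": ["h"],
}


def crawl(seed, links, budget):
    """Breadth-first crawl that fetches at most `budget` pages."""
    seen = {seed}       # every URL discovered so far
    crawled = []        # URLs actually fetched, in fetch order
    frontier = deque([seed])
    while frontier and len(crawled) < budget:
        url = frontier.popleft()
        crawled.append(url)
        for out in links.get(url, []):
            if out not in seen:
                seen.add(out)
                frontier.append(out)
    return seen, crawled


seen, crawled = crawl("home", LINKS, budget=3)
print(len(seen), len(crawled))
```

With a budget of three fetches, the crawler has already discovered eight URLs. On the real Web the ratio is far more lopsided, and dynamic or deep-Web pages may never appear as links at all.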


    • Myth: It is easy to switch to another search engine.
    • Fact: Users only switch to a search engine that is significantly better than their current one.


    • Myth: Ranking is the most important thing.
    • Fact: An infrastructure that enables quick innovation is the most important thing.
        - No good infrastructure, no good ranking
        - Good ranking is the result of a great deal of hard work behind the scenes


    • Myth: A search engine is equivalent to Web information retrieval.
    • Fact: A search engine is equivalent to Web-scale information management.
        - Information acquisition, processing, storage, access, indexing, querying, mining
        - Managing the information in the world


    • Myth: Cool features are king.
    • Fact: Doing the “simple” thing, and doing it best, is king.
        - In terms of features, current search engines are in fact the same as those of ten years ago
        - Ideas vs. ideas that do work!
        - Of course, this holds unless you have a really cool idea that can change the game



    • Myth: Ideas in top conference papers are excellent.
    • Fact: Most of them DO NOT work at all!
        - Toy systems
        - Small datasets
        - Scholastic evaluation
        - How can we narrow the gap between academia and industry?


    • Myth: Most Web search researchers are from the IR community.
    • Fact: They come from diverse fields.
        - For example, search researchers at Microsoft Research Asia come from databases, machine learning, systems, IR, multimedia, etc.


    • Myth: We know what the next-generation search engine is.
    • Fact: We don’t know.
        - Many efforts are under way
        - Users will tell