Internet search tools fall into two camps: search engines, such as HotBot and AltaVista, and online directories, such as Yahoo and Lycos. The difference between the two lies in how they compile their site listings. Of course, there are exceptions to every rule. Some search utilities, such as Ask Jeeves, combine the search engine and directory approaches into a single package, hoping to provide users with the best of both worlds.
In directory-based search services, the Web site listings are compiled manually. For example, the ever-popular Yahoo dedicates staff to accepting site suggestions from users, reviewing and categorizing them, and adding them to a specific directory on the Yahoo site.
You can usually submit your Web site simply by filling out an online form. On Yahoo, for example, you'll find submission information at www.yahoo.com/docs/info/include.html. Because human intervention is necessary to process, verify, and review submission requests, expect a delay before your site secures a spot in a directory-based search service.
On the flip side, search engines completely automate the compilation process, removing the human component entirely.
A software robot, called a spider or crawler, automatically fetches sites all over the Web, reading pages and following associated links. By design, a spider will return to a site periodically to check for new pages and changes to existing pages.
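The fetch-read-follow loop described above can be sketched in a few lines of Python. This is only an illustration, not how any particular search engine works: the three-page FAKE_WEB dictionary, the crawl function, and its fetch and max_pages parameters are all hypothetical names invented here, and an in-memory dictionary stands in for real network requests so the example runs anywhere.

```python
from collections import deque
from html.parser import HTMLParser


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl; returns {url: html} for every page reached.

    `fetch` is any callable that returns a page's HTML, or None when
    the page cannot be retrieved (an assumption made for this sketch).
    """
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # broken link: nothing to read or follow
            continue
        pages[url] = html
        parser = LinkParser()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return pages


# Hypothetical three-page site standing in for the Web.
FAKE_WEB = {
    "/index.html": '<a href="/about.html">About</a> <a href="/news.html">News</a>',
    "/about.html": '<a href="/index.html">Home</a>',
    "/news.html": '<a href="/missing.html">Old story</a>',
}

pages = crawl("/index.html", FAKE_WEB.get)
print(sorted(pages))  # ['/about.html', '/index.html', '/news.html']
```

A real spider would also schedule return visits to pick up new and changed pages; in this sketch that would amount to calling crawl again later and comparing the results.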
Results from spidering are recorded in the search engine’s index or catalog. Given the wealth of information available on the Internet, it is not surprising that indexes grow very large. For example, the AltaVista index was recently expanded to 350 million pages. This may seem like a mammoth number, but by all estimates it still represents less than 35 percent of all pages on the Web.
Because of the depth and breadth of information being indexed, there is usually a delay, sometimes up to several weeks, between the time a site has been “spidered” and when it appears in a search index. Until this two-step process has been completed, a site remains unavailable to search queries.
Finally, the heart of each search engine is an algorithm that matches keyword queries against the information in the index, ranking results in the order the algorithm deems most relevant.
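As a rough illustration of matching keywords against an index, the Python sketch below builds a tiny inverted index and ranks pages by how often the query words appear on them. Real engines use far more elaborate and proprietary relevance formulas; raw word counts are a stand-in chosen here for simplicity, and the page names, texts, and function names are all hypothetical.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to {url: number of occurrences on that page}."""
    index = defaultdict(Counter)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word][url] += 1
    return index


def search(index, query):
    """Return URLs ordered by total occurrences of the query words."""
    scores = Counter()
    for word in tokenize(query):
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical pages, as a spider might have recorded them.
PAGES = {
    "a.html": "Search engines use a spider to build an index.",
    "b.html": "A directory is compiled by hand. A directory has no spider.",
    "c.html": "The spider feeds the index; the index answers each search.",
}

index = build_index(PAGES)
results = search(index, "spider index")
print(results)  # ['c.html', 'a.html', 'b.html']
```

Swapping in a different scoring rule inside search would reorder the same pages, which is one way to picture why two engines can rank the same site differently.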
Because the spiders, resulting indexes, and search algorithms of each search engine differ, so do the search results and rankings across the various search engines. This explains why a top 10 site in HotBot may not appear near the top of AltaVista when the same keyword search criterion is entered.
In addition, many, but not all, search utilities also reference metatags (invisible HTML tags within documents that describe their content) as a way to control how content is indexed. As a result, proper use of metatags throughout a site can also boost search engine ranking.
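To make the idea of metatags concrete, the sketch below shows a page carrying the commonly used description and keywords metatags, and reads them back with Python's standard html.parser module, roughly as an indexing program might. The page content and the MetaTagParser class are invented for illustration; how much weight any given search utility gives these tags is not something this example can show.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects <meta name="..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = attrs.get("name")
        content = attrs.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


# Hypothetical page; the metatags sit in <head> and are not displayed.
PAGE = """<html><head>
<title>Acme Widgets</title>
<meta name="description" content="Hand-made widgets shipped worldwide.">
<meta name="keywords" content="widgets, gadgets, hand-made">
</head><body>Welcome to Acme.</body></html>"""

parser = MetaTagParser()
parser.feed(PAGE)
keywords = [k.strip() for k in parser.meta["keywords"].split(",")]

print(parser.meta["description"])  # Hand-made widgets shipped worldwide.
print(keywords)                    # ['widgets', 'gadgets', 'hand-made']
```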
因特網搜索工具分為兩大陣營:搜索引擎,如HotBot和AltaVista,以及在線目錄,如Yahoo和Lycos。兩者間的差別與它們如何編撰網站編目有關。當然,對任何規律都有例外。有些搜索實用程序,如Ask Jeeves,把搜索引擎和目錄方法合併成單一的軟件包,希望把這兩個陣營中最好的東西提供給用戶。
在基於目錄的搜索服務中,Web網站編目是手工編撰的。比如一直流行的Yahoo就指定專門的人力資源來接受用戶對網站的建議,並對建議進行評價和分類,再把它們加到Yahoo網站上特定目錄中。
通常是通過簡單地填寫在線表格就能把你的網站信息提交給(搜索引擎)。例如,在Yahoo網站上,你可以在 www.yahoo.com/docs/info/include.html 上找到提交信息。由於人工干預對處理、驗證和評價提交請求是必要的,所以在網站在基於目錄的搜索服務中捕捉到一處之前,可望有些延遲。
另一方面,搜索引擎完全實現了編撰過程的自動化,徹底消除了人工干預。
一個叫做蜘蛛或爬蟲的軟件機器人自動地在整個Web上取出站點,閱讀頁面和跟隨相關的鏈接。通過設計,蜘蛛可以周期性地返回到站點,檢查新的頁面和修改已有頁面。
蜘蛛爬行得到的結果記錄在搜索引擎的索引或目錄中。已知了因特網上可資利用的信息的價值,對索引擴張到非常大的規模是不會感到驚訝的。例如,AltaVista的索引最近已增至3.5億頁而名列前茅。這個數字看來好像非常大,但總體估計它僅代表了Web上不足35%的頁面。
由於已編索引的信息的深度與廣度(非常大),所以通常在“蜘蛛爬行過”站點的時間與出現在搜索索引中的時間之間有一個延遲,有時多達幾周。只有這兩步的過程完成之後,站點才能供搜索查詢使用。
最後,每個搜索引擎的心臟是一種算法,它將關鍵字查詢與索引中的信息匹配起來,並按算法認為最有關聯的順序把結果列出。
由於每種搜索引擎的蜘蛛、產生的索引和搜索算法都是不一樣的,所以在不同搜索引擎上的搜索結果和排列次序是不同的。這就解釋了為什麼當相同的關鍵字搜索準則輸入進去時,HotBot中排在最前面的10個站點不會出現在AltaVista中最前面的站點中。
此外,很多(但不是所有的)搜索實用程序也引用元標記(文檔中用來描述其內容的、看不見的HTML標記),作為控製內容如何編索引的方法。因此,在整個站點中正確使用元標記也能提高(此站點)在搜索引擎中的排列名次。
手機版







