The importance of a website's robots.txt file

In the FTP directory of many sites you will find a robots.txt file. Many webmasters only know that it is used to restrict which files spiders may visit. Does the file do anything else? Let's find out.

"User-agent: *" addresses all search engine spiders, so the rules that follow apply to every one of them. If you do not want Baidu to include your site, replace * with "baiduspider"; the content restricted by the Disallow lines that follow will then not be crawled or included by Baidu's spider. To block the whole site, write "Disallow: /". To block every file under a folder, write "Disallow: /admin/". To block everything whose path begins with admin, write "Disallow: /admin", and so on. To block one specific file in a folder, for example index.htm in the admin folder, write "Disallow: /admin/index.htm". If Disallow is left empty, with no "/" after it, all pages of the site may be crawled and included.
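The Disallow variants above can be checked with a minimal sketch using Python's standard urllib.robotparser module, which matches paths by prefix. The helper name is_allowed, the bot names and the sample paths are assumptions made for illustration only.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_text, user_agent, path):
    """Return True if user_agent may fetch path under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, path)


# One rule set per case described in the text (sample data).
BLOCK_SITE = "User-agent: *\nDisallow: /"
BLOCK_FOLDER = "User-agent: *\nDisallow: /admin/"
BLOCK_PREFIX = "User-agent: *\nDisallow: /admin"
BLOCK_FILE = "User-agent: *\nDisallow: /admin/index.htm"
ALLOW_ALL = "User-agent: *\nDisallow:"
BLOCK_BAIDU = "User-agent: baiduspider\nDisallow: /"

if __name__ == "__main__":
    cases = [
        ("whole site", BLOCK_SITE, "examplebot", "/page.html"),
        ("folder", BLOCK_FOLDER, "examplebot", "/admin/index.htm"),
        ("folder, sibling file", BLOCK_FOLDER, "examplebot", "/admin.html"),
        ("prefix", BLOCK_PREFIX, "examplebot", "/admin.html"),
        ("single file", BLOCK_FILE, "examplebot", "/admin/other.htm"),
        ("empty Disallow", ALLOW_ALL, "examplebot", "/admin/index.htm"),
        ("Baidu only", BLOCK_BAIDU, "baiduspider", "/page.html"),
        ("Baidu only, other bot", BLOCK_BAIDU, "examplebot", "/page.html"),
    ]
    for label, rules, agent, path in cases:
        print(label, agent, path, is_allowed(rules, agent, path))
```

Note the difference between "/admin/" and "/admin": the first blocks only paths inside the folder, while the second also blocks a file such as /admin.html because the match is on the leading characters of the path.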


1. The file name must be exactly right: it must be all lowercase and end in .txt, that is, robots.txt.

2. The file must be placed in the website's root directory, so that it can be accessed directly at the site's address followed by /robots.txt (for example, the taofengyu site's address followed by /robots.txt).

3. The contents of the file must use correct syntax. The most commonly used directives are User-agent and Disallow.
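The naming and location rules can be illustrated with a small sketch. The function names robots_url and is_valid_robots_path are hypothetical, and example.com is a placeholder domain, not one taken from the article.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Build the address where spiders look for robots.txt for this page's site."""
    parts = urlsplit(page_url)
    # Keep only scheme and host; the file always sits directly under the root.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def is_valid_robots_path(path):
    """True only for the exact lowercase name in the root directory."""
    return path == "/robots.txt"


if __name__ == "__main__":
    print(robots_url("https://example.com/blog/post.html?page=2"))
    for candidate in ["/robots.txt", "/Robots.TXT", "/blog/robots.txt"]:
        print(candidate, is_valid_robots_path(candidate))
```

A file uploaded as Robots.TXT or placed in a subfolder fails the check, which mirrors the two placement rules in the list.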

What is the robots file? It is a bridge of communication between search engines and a website: a file whose syntax both sides have agreed on. Every time a search engine crawls a site, it checks this file first, much like a key to the door. If the file does not exist, the search engine crawls without restriction. If the file exists, the search engine crawls according to the rules it contains. Some will ask: we build a website precisely so that search engines include it, so why restrict crawling? While crawling the whole site, a search engine may come across content you collected from other sites, or near-duplicate pages with no substantive content. These can sharply lower the search engine's evaluation of your site and undermine your SEO. With the robots file you can tell the spider which pages you do not want it to see, which also indirectly reduces the load on the server.
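The crawl flow described here (check the file first, crawl freely if it is missing, otherwise follow its rules) might be sketched as follows. This is an illustration under assumptions rather than how any particular search engine works: crawlable_paths is a hypothetical helper, passing None stands for a site without a robots.txt file, and the /collected/ folder is made-up sample data.

```python
from urllib.robotparser import RobotFileParser


def crawlable_paths(paths, robots_text, user_agent="examplebot"):
    """Return the paths a rule-following spider would still fetch.

    robots_text is None when the site has no robots.txt (a convention
    chosen for this sketch); in that case nothing is restricted.
    """
    if robots_text is None:
        return list(paths)
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return [p for p in paths if parser.can_fetch(user_agent, p)]


if __name__ == "__main__":
    site_paths = ["/index.html", "/collected/copy1.html", "/collected/copy2.html"]
    # No robots.txt: every page is open to the spider.
    print(crawlable_paths(site_paths, None))
    # Hide the folder of collected, low-value pages from all spiders.
    rules = "User-agent: *\nDisallow: /collected/"
    print(crawlable_paths(site_paths, rules))
```

Fewer fetched pages also means fewer requests hitting the server, which is the load reduction mentioned above.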

