  #1 (permalink)  
Old 03-04-2008, 02:53 PM
ryanhellyer's Avatar
Moderator
 
Join Date: Dec 2007
Posts: 586
Limiting bot rate

A colleague recently asked me how to limit the amount of traffic a site receives from robots.

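It is easy enough to stop well-behaved bots from visiting altogether: a Disallow rule in robots.txt does that (a noindex meta tag only keeps pages out of search results, since the bot still has to fetch the page to read the tag). But is it possible to simply reduce the number of visits from bots?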

Sitemaps let you specify how often each page is likely to change (changefreq), but AFAIK bots simply use this as a relative measure. So if you say your home page changes every hour and your contact form once per year, they may visit your home page more often than your contact page accordingly. But my understanding is that they don't obey your commands per se, just that they use it as a guide to how important your various pages are. Or perhaps I have misunderstood?
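
For reference, the Sitemaps protocol itself describes changefreq as a hint rather than a command, which matches the understanding above. Here is a minimal Python sketch of a sitemap expressing the hourly/yearly preference; the example.com URLs, the priority values and the build_sitemap helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML for a list of (url, changefreq, priority) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, changefreq, priority in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # changefreq is only a hint to crawlers, not a schedule they must follow
        ET.SubElement(url, "changefreq").text = changefreq
        ET.SubElement(url, "priority").text = priority
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages: a busy home page and a rarely changing contact form
pages = [
    ("http://www.example.com/", "hourly", "1.0"),
    ("http://www.example.com/contact", "yearly", "0.1"),
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

For a site that is rarely updated, setting a slower changefreq on every page is cheap to try, but there is no guarantee that any particular bot will visit less often as a result.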

Any ideas/suggestions?

In case you are wondering why they would want to limit the number of bot visits, it is because bandwidth is quite expensive within our organisation. Traffic from robots costs about $120 per year, whereas traffic from humans costs only $80. The site is rarely updated, so it seems pointless for the bots to visit every minute, which they seem to be doing at the moment.
  #2 (permalink)  
Old 03-04-2008, 04:10 PM
BigAlReturns's Avatar
Moderator Extraordinaire!
 
Join Date: Dec 2007
Location: The Wirral, England
Posts: 298

By SiteMaps do you mean Webmaster Tools? I think there is an option in tools to slow down overall bot rate, although I have no idea how effective it actually is.
Beyond this, I don't know of any other methods to be honest.
  #3 (permalink)  
Old 03-04-2008, 05:22 PM
ryanhellyer's Avatar
Moderator
 
Join Date: Dec 2007
Posts: 586

By Sitemap, I mean the XML file which search engine spiders can find via your robots.txt file (see the Wikipedia article on Sitemaps).
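
To illustrate how a spider finds the sitemap via robots.txt, here is a small Python sketch using the standard library's robots.txt parser (site_maps() needs Python 3.8 or later). The robots.txt content and the example.com URL are assumptions made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that allows everything and advertises a sitemap
robots_txt = """\
User-agent: *
Disallow:

Sitemap: http://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns the list of Sitemap URLs, or None if there are none
sitemaps = parser.site_maps()
print(sitemaps)
```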
  #4 (permalink)  
Old 03-04-2008, 08:31 PM
BigAlReturns's Avatar
Moderator Extraordinaire!
 
Join Date: Dec 2007
Location: The Wirral, England
Posts: 298

Ah OK, I just wondered because Google Webmaster Tools used to be called Google Sitemaps, before they extended its functionality somewhat.
I don't have a Webmaster Tools account to check with, but I'm pretty certain that if you log in there is an option to slow the crawl rate.
  #5 (permalink)  
Old 03-05-2008, 12:23 AM
ryanhellyer's Avatar
Moderator
 
Join Date: Dec 2007
Posts: 586

Thanks BigAl
  #6 (permalink)  
Old 03-24-2008, 02:51 AM
Junior Member
 
Join Date: Mar 2008
Posts: 1

Thanks a lot, BigAl. It really helped me.
  #7 (permalink)  
Old 03-27-2008, 08:14 AM
Junior Member
 
Join Date: Mar 2008
Posts: 13

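ryanhellyer, the easiest solution would be to slow down the speed with which the bots can index your site. For instance, to ask them to pause a minute between fetches, put the following in your robots.txt file (Crawl-delay is a non-standard directive, so not every bot honours it; Googlebot, for one, ignores it):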

User-agent: *
Crawl-delay: 60
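
To show how a compliant bot would read that rule, here is a short Python sketch using the standard library's robots.txt parser. The bot name "ExampleBot" is made up, and the requests-per-day figure is just arithmetic on the delay, not a measurement.

```python
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Crawl-delay: 60
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Delay in seconds that applies to this (hypothetical) bot
delay = parser.crawl_delay("ExampleBot")

# Upper bound on fetches per day for one bot that waits `delay` between requests
max_fetches_per_day = 86400 // delay
print(delay, max_fetches_per_day)
```

So a bot that respects the rule makes at most 1440 requests a day, however often it would otherwise have come back.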

To actually limit the number of times that a particular bot can index your site -- perhaps a certain number of times per month -- would require detecting the IP address (assuming that it does not change over time) and dynamically throttling by keeping track of how many times that bot has already visited your site. That would be much more involved, and would require storing information in a database.
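
A rough Python sketch of that second approach, assuming the bot is identified by IP address and the counts are kept in SQLite. The VisitThrottle class, the table layout and the limit of 2 are all hypothetical, chosen only to keep the example short.

```python
import sqlite3


class VisitThrottle:
    """Count requests per client and month, refusing requests over a limit."""

    def __init__(self, limit, db_path=":memory:"):
        self.limit = limit
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS visits ("
            " client TEXT, month TEXT, hits INTEGER,"
            " PRIMARY KEY (client, month))"
        )

    def allow(self, client, month):
        """Record one request; return False if the allowance is used up."""
        row = self.db.execute(
            "SELECT hits FROM visits WHERE client = ? AND month = ?",
            (client, month),
        ).fetchone()
        hits = row[0] if row else 0
        if hits >= self.limit:
            return False
        self.db.execute(
            "INSERT OR REPLACE INTO visits (client, month, hits)"
            " VALUES (?, ?, ?)",
            (client, month, hits + 1),
        )
        self.db.commit()
        return True


# Hypothetical bot IP, allowed two requests per month
throttle = VisitThrottle(limit=2)
results = [throttle.allow("192.0.2.10", "2008-03") for _ in range(3)]
next_month = throttle.allow("192.0.2.10", "2008-04")
print(results, next_month)
```

When allow() returns False, the site could answer with something cheap, for example a 503 status and a Retry-After header, instead of the full page. Keying on the User-Agent string instead of the IP address is another option if the bot's address changes.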

I hope this gives you some ideas.
  #8 (permalink)  
Old 03-28-2008, 09:06 AM
ryanhellyer's Avatar
Moderator
 
Join Date: Dec 2007
Posts: 586

That sounds like a great idea, coder4hire!

I'll send my colleague a link to this topic.
  #9 (permalink)  
Old 05-02-2008, 06:25 AM
Junior Member
 
Join Date: Apr 2008
Posts: 11

It's really helpful. One kind of sitemap is in Google Webmaster Tools. Is it different from a website's sitemap?
  #10 (permalink)  
Old 05-02-2008, 07:49 AM
ryanhellyer's Avatar
Moderator
 
Join Date: Dec 2007
Posts: 586

I don't know much about Google's Webmaster Tools. But the sitemap for a site is just an XML file which you can use to notify search engines when your site is updated and what the most important parts of your site are (this allows search engines to index your site better).
All times are GMT -5.