Ever wondered how a spammer got your email address? Read on...
Web bots and spam - we know what spam is, but what about web bots? And are they related? YES!
What are web bots?
Basically, these are automated programs that scour the Internet gathering information, links and email addresses. Major search engines have used them for over a decade. A search engine's bot will spider your site once it has your URL, either because the URL was submitted through an add URL page such as: www.google.co.uk/addurl.html
or because the spider followed a link to your site from a page on another site.
Search engine web bots are generally well behaved: they gather and index your web site, then use that information to provide links into your site when someone searches their index. The web bots that spammers now use are not so helpful, because they exist only to collect email addresses. They can collect millions of email addresses from web pages in a relatively short time. Once they have your email address you are on their spam lists.
Below we show you how to avoid spam bots but ensure you don't frighten away the friendly spiders.
- Protecting your email addresses from spam web bots
This is relatively simple if you are used to using JavaScript, or editing raw HTML.
JavaScript to hide your email address from web bots:
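As a minimal sketch of the idea, and not necessarily the exact script this newsletter originally shipped, the address can be assembled from separate parts when the page loads. The four variable names match the ones listed in this article; the helper name buildMailtoLink and the placeholder values are assumptions made for illustration only.

```javascript
// Hypothetical placeholder values: change these to your own details.
var DisplayName = "Contact us";
var MailboxName = "info";
var DomainName = "example";
var DomainExtension = "com";

// Hypothetical helper: assembles the address from its parts at run time,
// so the complete address never appears as one string in the page source.
function buildMailtoLink(displayName, mailbox, domain, extension) {
  var address = mailbox + "@" + domain + "." + extension;
  return '<a href="mailto:' + address + '">' + displayName + "</a>";
}

var emailLinkHtml = buildMailtoLink(
  DisplayName, MailboxName, DomainName, DomainExtension
);

// In a browser, write the link into the page where the script sits.
// The check lets the same code run outside a browser without errors.
if (typeof document !== "undefined") {
  document.write(emailLinkHtml);
}
```

In a page, the code would sit inside a script element at the spot where the link should appear. Visitors with JavaScript switched off will not see the link, so this is a trade-off rather than a complete fix.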
Within the script above, change the following variables to your own details:
DisplayName
MailboxName
DomainName
DomainExtension
Then copy the whole of the script above and paste it into the pages where you want the email address link to appear. And that's it - as the email address is made up from multiple parts, predatory spam web bots that scan the page source for complete addresses will not recognise it as a valid email address ;-)
HELP to ensure your site is search engine web bot friendly:
If you want the web bots to spider all of your site, ensure you have a text file called robots.txt in the root directory containing:
User-agent: *
Disallow: /cgi-bin
This tells the web bots that they may index all of your site except the cgi-bin directory.
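To make the effect of a Disallow line concrete, here is a rough sketch of how a well-behaved web bot might read the rules: every path that starts with a disallowed prefix is skipped. The function names parseDisallowRules and isAllowed are hypothetical, and the parsing is deliberately simplified (it only handles the "User-agent: *" group and plain prefixes), so treat it as an illustration rather than a full robots.txt parser.

```javascript
// Hypothetical helper: collect the Disallow prefixes that apply to
// every bot ("User-agent: *") from the text of a robots.txt file.
function parseDisallowRules(robotsTxt) {
  var rules = [];
  var appliesToAll = false;
  robotsTxt.split(/\r?\n/).forEach(function (line) {
    // Drop comments, then split the line into "field: value".
    var parts = line.split("#")[0].trim().match(/^([^:]+):\s*(.*)$/);
    if (!parts) return;
    var field = parts[1].trim().toLowerCase();
    var value = parts[2].trim();
    if (field === "user-agent") {
      appliesToAll = value === "*";
    } else if (field === "disallow" && appliesToAll && value !== "") {
      rules.push(value);
    }
  });
  return rules;
}

// Hypothetical helper: a path is allowed unless it starts with one of
// the disallowed prefixes.
function isAllowed(path, rules) {
  return !rules.some(function (prefix) {
    return path.indexOf(prefix) === 0;
  });
}

var rules = parseDisallowRules("User-agent: *\nDisallow: /cgi-bin");
var homeAllowed = isAllowed("/index.html", rules);      // true
var cgiAllowed = isAllowed("/cgi-bin/form.pl", rules);  // false
```

Keep in mind that robots.txt is only a request. Friendly search engine spiders honour it, but nothing forces a spam bot to do the same.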
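If you have a sub directory that you do not want the web bots to index, add a further Disallow line for it to the same robots.txt file in the root directory. Web bots only read robots.txt from the root of a site, so a copy placed inside the sub directory is ignored. For example, for a sub directory called private (a placeholder name):
User-agent: *
Disallow: /cgi-bin
Disallow: /private/
This tells the web bots NOT to index pages in the private sub directory.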
In addition to the above, don't forget to add the correct meta data to your pages; see Meta tags revisited at: www.seiretto.com/newsletters/
NEED a boost to get indexed on search engines like Google? Why not get listed on our review pages? We are particularly interested in reviews from our relatively new Starter and Ecommerce hosting accounts. For more details please see: www.seiretto.com/featuredsites/starter.php
Need to know more about robots.txt? Try: http://www.robotstxt.org/wc/faq.html
Copyright © 1996-2008 Seiretto Ltd. All rights reserved. Registered in England & Wales no: 4716409. VAT no: GB780 4245 32