Bob Hoffmann's Web Tips  
 


Fighting Spam with Unicode

Definition of Spam: Those obnoxious, unsolicited emails you receive concerning Viagra, snoring, your penis/breast size, the latest hot stock, the greatest porno Web site, earning a six-figure income if you really want to, and going into business for yourself by flooding other people's inboxes with the same crap that's driving you insane.

How did they find my email address?
Short answer:
They looked for it.
Long answer: By using spidering software that searches the Internet for email addresses and compiles the addresses it finds, or by purchasing a CD of email addresses that were collected using this technique.

The Solution: Don't put your email address online.

This is a tricky solution, because WSU requires that Web pages have an email address at which someone can be reached. The key is to avoid publishing your email address in any form that these address spiders can decipher.

How about just using a graphic with my email address?
Theoretically possible, but this limits the usability of your page and can run afoul of accessibility law. And if you use a graphic and then link it with a mailto:my-email@wsu.edu reference, the spider software will simply take your address from the link.

If you don't bother to link the graphic, you depend on the initiative and accuracy of the visitor to get the address into an email program, which is definitely a hit-or-miss proposition. Even worse, since university Web sites are required by law to be accessible to the visually impaired, an email address that appears only as a graphic violates Federal guidelines. Screen readers can read HTML text (like this text that you are reading now), but they can't read graphics (like the example to the right). Can you say Section 508 incompatible? I thought you could.

Is there a way to fool the spiders with JavaScript?
Briefly, yes, but since some people disable JavaScript in their browsers, this also presents accessibility problems and should be avoided.

What about HTML forms?
Now you're talking! Use a form, such as the Video Unit's Video Project Proposal Form. Your visitors type in their comments, click the submit button, and boom, the server emails you the message.

This can be an excellent solution, although there are some drawbacks. First, creating a form is much more involved than simply providing an email link. In addition to creating the form with HTML, you need to link it to a server-side program ("script") that processes the form. Cooperation with your system administrator may be required. Second (and more importantly), this only works if your email address is stored in the script, and not as a parameter in the actual HTML form. It is easy enough to use the FormMail script, which only requires you to construct an HTML form correctly, but that form will include your email address as a parameter, where a spider can harvest it.
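To make that second point concrete, here is a minimal sketch in Python of the server-side half of a form mailer. It is an illustration under stated assumptions, not the FormMail script or anything WSU actually runs: the function name build_message, the form field names subject and comments, and the RECIPIENT constant are all made up for this example. The only point is that the address lives in the script, so it never appears in the HTML form.

```python
from email.message import EmailMessage

# Hypothetical recipient. It is stored only in this server-side script,
# never in the HTML that visitors (and spiders) can download.
RECIPIENT = "my-email@wsu.edu"


def build_message(form_fields):
    """Turn submitted form fields into an email addressed to RECIPIENT."""
    # Collapse whitespace so a submitted subject cannot contain line breaks.
    subject = " ".join(form_fields.get("subject", "(no subject)").split())

    msg = EmailMessage()
    msg["To"] = RECIPIENT  # any "recipient" field in the form is ignored
    msg["Subject"] = "Web form: " + subject
    msg.set_content(form_fields.get("comments", ""))
    return msg


message = build_message({"subject": "Video project", "comments": "Hello, Bob."})
print(message["To"])
```

Actually sending the message is left out, since that part depends on how your system administrator has set up mail on the server.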

OK, so what is the correct way to stop my email address from getting lifted?
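In a word, Unicode. Each character on the keyboard (and more) can be represented in HTML by a numeric character reference: a series of six other characters made up of an ampersand, a pound sign, a three-digit code, and a semicolon. For instance, if your HTML source code contains the combination

&#064;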

The browser will interpret this series of characters as

@
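If you want to see this conversion without opening a browser, the following small Python sketch does the same decoding with the standard library's html.unescape function. The variable names are just for illustration, and the six-character form with a leading zero is an assumption about how you choose to write the reference.

```python
import html

encoded = "&#064;"                 # what you type into the HTML source
decoded = html.unescape(encoded)   # what a browser displays

print(len(encoded))   # 6
print(decoded)        # @
print(ord(decoded))   # 64, the number inside the reference
```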

While browsers and some email programs automatically convert these characters, many of the spiders that mine email addresses do not. So instead of using the hypertext reference

mailto:my-email@wsu.edu

try replacing the @ character with

&#064;

and you get

mailto:my-email&#064;wsu.edu
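Why does this trip up a spider? The sketch below, in Python, imitates a very simple harvester that scans raw page source with a pattern match. The pattern and the harvest function are assumptions about how a naive spider behaves, not a description of any particular product. The last line shows the limit of the trick: a spider that bothers to decode the references first will still find the address.

```python
import html
import re

# Deliberately simple address pattern, assumed for this illustration.
ADDRESS = re.compile(r"[A-Za-z0-9._-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def harvest(page_source):
    """Return every string in the raw source that looks like an email address."""
    return ADDRESS.findall(page_source)


plain = '<a href="mailto:my-email@wsu.edu">Send Email</a>'
hidden = '<a href="mailto:my-email&#064;wsu.edu">Send Email</a>'

print(harvest(plain))                  # ['my-email@wsu.edu']
print(harvest(hidden))                 # []
print(harvest(html.unescape(hidden)))  # ['my-email@wsu.edu']
```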

For added levels of fun and security, you can convert your entire address, or the entire reference, with a Unicode converter, such as the Fantomas mailShield. This is exactly what I have done with the email addresses on this site. So when you see:

Have a Web question?
Ask Bob Hoffmann

you can click the link to obtain the email address in your email client, but if you examine this page's source code, you will see that the reference gives my address in Unicode, quite indecipherable to the ordinary mortal. Remember, only the hypertext reference should have your email address. The link text should say something like "Send Email," or perhaps your name, as I have done above.
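If you would rather not depend on an online converter, converting an entire reference takes only a few lines of Python. This is a rough sketch of the idea, not the method used by the Fantomas mailShield; the function name to_char_refs and the zero-padded three-digit format are choices made for this example.

```python
import html


def to_char_refs(text):
    """Replace every character with a decimal reference, e.g. '@' becomes '&#064;'."""
    return "".join("&#{:03d};".format(ord(ch)) for ch in text)


href = to_char_refs("mailto:my-email@wsu.edu")
link = '<a href="{}">Send Email</a>'.format(href)

print(link)

# Round trip: decoding the reference should give back the original address.
assert html.unescape(href) == "mailto:my-email@wsu.edu"
```

Note that only the hypertext reference is converted; the link text stays as "Send Email" so the address never appears in readable form.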

Warning: While you can type or paste Unicode directly into Dreamweaver 4.0's link field in the Properties Inspector, previous versions of Dreamweaver will automatically convert the Unicode back to ordinary characters (Grrr!). For Dreamweaver 1-3, you will need to insert the Unicode in code view.

 

Want to skip the labored explanation?
Go straight to the solution.

 

 

Incorrect way to give your email address on your site:

[Graphic: fake email address]

(With an unlinked graphic.) This is the only time you will be able to read my email address on a Web page that I have made—because spiders can't read the graphic.

 

Did I do the Unicode correctly?
Just click the link in your browser to see if the correct address appears in your email program.

Have a Web question?
Ask Bob Hoffmann

 
                         
 
 
Refer questions or comments to Bob Hoffmann, 509-335-7744.
CAHNRS Information Department, 401 Hulbert Hall, Washington State University, Pullman, Washington, 99164-6244.