{"id":20,"date":"2008-03-12T19:44:14","date_gmt":"2008-03-13T00:44:14","guid":{"rendered":"http:\/\/unixmonkey.net\/?p=20"},"modified":"2008-03-12T19:44:14","modified_gmt":"2008-03-13T00:44:14","slug":"overkill-email-obfuscation-with-ruby-and-javascript","status":"publish","type":"post","link":"https:\/\/unixmonkey.net\/?p=20","title":{"rendered":"Overkill Email Obfuscation with Ruby and Javascript"},"content":{"rendered":"<p><a href=\"https:\/\/unixmonkey.net\/blog\/wp-content\/uploads\/2008\/03\/runaway_spiders.jpg\" title=\"Robot Spiders from Runaway\"><img decoding=\"async\" src=\"https:\/\/unixmonkey.net\/blog\/wp-content\/uploads\/2008\/03\/runaway_spiders.thumbnail.jpg\" alt=\"Robot Spiders from Runaway\" align=\"right\" \/><\/a>The web is a generally free and open place for all types of communication, but if you put your email address on 1 website, you can expect an email-harvesting robot spider to find that address and send it to its spammer overlords.<\/p>\n<p>Once on a spammer&#8217;s list, you can expect to get all kinds of interesting stock tips, products to enhance your manhood, and friendly letters from Nigerian diplomats.<\/p>\n<p>If you simply have too little to do in the day, this can be a great way to meet new people and start a career in day trading. 
However, some of us are just too darn busy to stop what we are doing every two-thirds of a second to check our email, but we still need it to keep in contact with friends, family, and business contacts.<\/p>\n<p>Using a few tips pulled from the web, I set out to create a nice link helper for Ruby \/ Rails that displays email links that work just like regular mailto: links and even degrade gracefully for users without JavaScript.<\/p>\n<p>Let's not put the plain email address in the page source at all. Instead, we'll break it into pieces and use a little JavaScript to put it back together when the page renders.<\/p>\n<pre lang=\"rails\">\n# Takes in an email address and (optionally) anchor text.\n# Its purpose is to obfuscate email addresses so spiders and\n# spammers can't harvest them.\ndef js_antispam_email_link(email, linktext=email)\n    user, domain = email.split('@')\n    # build the noscript fallback first, while linktext is still plain text\n    fallback = \"#{user}(at)#{domain}\"\n    fallback = \"#{linktext} #{fallback}\" unless linktext == email\n    # if linktext wasn't specified, throw email address builder into js document.write statement\n    linktext = \"'+'#{user}'+'@'+'#{domain}'+'\" if linktext == email\n    out =  \"<noscript>#{fallback}<\/noscript>\\n\"\n    out += \"<script language='javascript'>\\n\"\n    out += \"  <!--\\n\"\n    out += \"    string = '#{user}'+'@'+'#{domain}';\\n\"\n    out += \"    document.write('<a href='+'ma'+'il'+'to:'+ string +'>#{linktext}<\/a>');\\n\"\n    out += \"  \/\/-->\\n\"\n    out += \"<\/script>\\n\"\n    return out\nend<\/pre>\n<p>This is probably good enough for 90% of those robots, but if one spammer gets your address, he will likely share (or sell) it to all his friends. The weak spot here is the noscript version, so let's fuzz that up a bit by converting it to HTML character entities.<\/p>\n<p>One of the earliest and simplest ways to obfuscate an email address is to convert each character into its HTML character entity. 
This makes the source look nasty, but the browser renders it correctly, so the end user is none the wiser.<\/p>\n<p>An address like abc@example.com will look like this in the source:<\/p>\n<pre lang=\"rails\">\n&#097;&#098;&#099;&#064;&#101;&#120;&#097;&#109;&#112;&#108;&#101;&#046;&#099;&#111;&#109;<\/pre>\n<p>Let&#8217;s build a simple method to convert a plaintext string into something like the above.  I&#8217;m going to cheat and only convert a-z and A-Z and leave @ signs, dots, dashes, etc. alone.<\/p>\n<pre lang=\"rails\">\n# HTML encodes ASCII letters (a-z and A-Z), useful for obfuscating\n# an email address from spiders and spammers\ndef html_obfuscate(string)\n  output_array = []\n  lower = %w(a b c d e f g h i j k l m n o p q r s t u v w x y z)\n  upper = %w(A B C D E F G H I J K L M N O P Q R S T U V W X Y Z)\n  char_array = string.split('')\n  char_array.each do |char|\n    output = lower.index(char) + 97 if lower.include?(char)\n    output = upper.index(char) + 65 if upper.include?(char)\n    if output\n      output_array << \"&#038;##{output};\"\n    else\n      output_array << char\n    end\n  end\n  return output_array.join\nend<\/pre>\n<p>Now, in our js_antispam_email_link method, we can \"encrypt\" the user and domain before sending them to the browser like so:<\/p>\n<pre lang=\"rails\">\ndef js_antispam_email_link(email, linktext=email)\n  user, domain = email.split('@')\n  user = html_obfuscate(user)\n  domain = html_obfuscate(domain)\n  ...<\/pre>\n<p>Not bad, but many spiders these days can still decode HTML entities and get at that address, so let's build up our defenses a bit more by adding some methods to really screw with those spiders.<\/p>\n<p>We'll write a method that encodes a string with ROT13 and puts that on the web page, then use some JavaScript to decode it when the page is displayed. 
ROT13 is a really simple cipher that shifts each letter a-z by 13 places, half the alphabet, so applying it twice gets you back the original text.<\/p>\n<p>This is a really simple one-liner borrowed from <a href=\"http:\/\/www.miranda.org\/~jkominek\/rot13\/ruby\/\">Jay Kominek<\/a>:<\/p>\n<pre lang=\"rails\">\n# Rot13 encodes a string\ndef rot13(string)\n  string.tr \"A-Za-z\", \"N-ZA-Mn-za-m\"\nend<\/pre>\n<p>Let's use this to really beef up our link helper with some JavaScript that can decipher it. The JS code is taken from <a href=\"http:\/\/blog.macromates.com\/2006\/obfuscating-email-addresses\/\">Allan Odgaard<\/a>:<\/p>\n<pre lang=\"javascript\">\nstring = '#{email}'.replace(\/[a-zA-Z]\/g,\n  function(c){\n    return String.fromCharCode(\n      (c <= 'Z' ? 90 : 122) >= (c = c.charCodeAt(0) + 13) ? c : c - 26\n    );\n  });<\/pre>\n<p>Now we've got some pretty strong defenses against those pesky robots, and because we only use simple HTML character encoding and lightweight ROT13 ciphering, it shouldn't be too taxing on your web server to spit out a page with a few emails on it. 
Less sophisticated browsers still get the contact info and everyone is a little bit happier to come home to a (relatively) clean inbox.<\/p>\n<p>Here's the whole shebang put together; put this in application_helper.rb if you're using Rails:<\/p>\n<pre lang=\"rails\">\n# Rot13 encodes a string\ndef rot13(string)\n  string.tr \"A-Za-z\", \"N-ZA-Mn-za-m\"\nend\n\n# HTML encodes ASCII letters (a-z and A-Z), useful for obfuscating\n# an email address from spiders and spammers\ndef html_obfuscate(string)\n  output_array = []\n  lower = %w(a b c d e f g h i j k l m n o p q r s t u v w x y z)\n  upper = %w(A B C D E F G H I J K L M N O P Q R S T U V W X Y Z)\n  char_array = string.split('')\n  char_array.each do |char|\n    output = lower.index(char) + 97 if lower.include?(char)\n    output = upper.index(char) + 65 if upper.include?(char)\n    if output\n      output_array << \"&#038;##{output};\"\n    else\n      output_array << char\n    end\n  end\n  return output_array.join\nend\n\n# Takes in an email address and (optionally) anchor text.\n# Its purpose is to obfuscate email addresses so spiders and\n# spammers can't harvest them.\ndef js_antispam_email_link(email, linktext=email)\n  user, domain = email.split('@')\n  user   = html_obfuscate(user)\n  domain = html_obfuscate(domain)\n  # build the noscript fallback first, while linktext is still plain text\n  fallback = \"#{user}(at)#{domain}\"\n  fallback = \"#{linktext}<br\/><small>#{fallback}<\/small>\" unless linktext == email\n  # if linktext wasn't specified, throw encoded email address builder into js document.write statement\n  linktext = \"'+'#{user}'+'@'+'#{domain}'+'\" if linktext == email\n  rot13_encoded_email = rot13(email) # obfuscate email address as rot13\n  out =  \"<noscript>#{fallback}<\/noscript>\\n\" # js disabled browsers see this\n  out += \"<script language='javascript'>\\n\"\n  out += \"  <!--\\n\"\n  out += \"    string = '#{rot13_encoded_email}'.replace(\/[a-zA-Z]\/g, function(c){ return String.fromCharCode((c <= 'Z' ? 90 : 122) >= (c = c.charCodeAt(0) + 13) ? 
c : c - 26);});\\n\"\n  out += \"    document.write('<a href='+'ma'+'il'+'to:'+ string +'>#{linktext}<\/a>');\\n\"\n  out += \"  \/\/-->\\n\"\n  out += \"<\/script>\\n\"\n  return out\nend<\/pre>\n<p>I hope this helps somebody out there. Please leave a comment if you have any suggestions.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The web is a generally free and open place for all types of communication, but if you put your email address on 1 website, you can expect an email-harvesting robot spider to find that address and send it to its spammer overlords. Once on a spammer&#8217;s list, you can expect to get all kinds of [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6,7,9,14,15,17],"tags":[38,39,41,46,48],"_links":{"self":[{"href":"https:\/\/unixmonkey.net\/index.php?rest_route=\/wp\/v2\/posts\/20"}],"collection":[{"href":"https:\/\/unixmonkey.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/unixmonkey.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/unixmonkey.net\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/unixmonkey.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=20"}],"version-history":[{"count":0,"href":"https:\/\/unixmonkey.net\/index.php?rest_route=\/wp\/v2\/posts\/20\/revisions"}],"wp:attachment":[{"href":"https:\/\/unixmonkey.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=20"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/unixmonkey.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=20"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/unixmonkey.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=20"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}