Regular expressions: Email validation

Anyone working on web projects runs into a very well-known problem: finding a good regular expression (regexp) to check the validity of user-submitted email addresses.

This website has compiled various regular expressions that try to solve this problem. The editor's idea is great: using a set of valid and invalid emails and a simple unit test, he provides a good comparison of some of the most widely used regexps.

His philosophy is great: "It's better to accept a few invalid addresses than reject any valid ones, so I'm looking for 0 false-positives and as few false-negatives as possible."
But I've noticed two problems:

  • His "best" regexp doesn't work in JavaScript (when this was written, JS didn't support advanced features such as negative lookbehind)
  • The method used to validate IP addresses is not correct (it doesn't enforce the 0-255 range for each octet)
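
The second point can be illustrated with a small JavaScript sketch. The NAIVE_IPV4 pattern is a hypothetical digits-only check, assumed here purely for illustration (it is not taken from any of the compared regexps). RANGE_IPV4 uses a per-octet alternation that restricts each number to 0-255. The names compareIpChecks, NAIVE_IPV4 and RANGE_IPV4 are made up for this example.

```javascript
// Hypothetical naive check: any 1-3 digits per octet, so 999 slips through.
const NAIVE_IPV4 = /^\d{1,3}(\.\d{1,3}){3}$/;

// Range-aware octet: 0-199 | 200-249 | 250-255.
const OCTET = "(1?\\d{1,2}|2[0-4]\\d|25[0-5])";
const RANGE_IPV4 = new RegExp("^" + OCTET + "(\\." + OCTET + "){3}$");

// Run both checks on the same input so the difference is visible.
function compareIpChecks(ip) {
  return { ip, naive: NAIVE_IPV4.test(ip), rangeAware: RANGE_IPV4.test(ip) };
}

for (const ip of ["192.168.1.1", "255.255.255.255", "256.1.1.1", "999.168.1.1"]) {
  console.log(compareIpChecks(ip));
}
```

The last two inputs pass the naive check but fail the range-aware one, which is the gap the second bullet describes.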

So I've decided to improve an existing one, adding another test criterion: also checking the "real" validity of the IP address. The following work is based on G. Arluison's improved version of Warren Gaebel's regex.

Here are my solutions:


/^[-a-z0-9~!$%^&*_=+}{\'?]+(\.[-a-z0-9~!$%^&*_=+}{\'?]+)*@([a-z0-9]([-a-z0-9_]?[a-z0-9])*(\.[-a-z0-9_]+)*\.(aero|arpa|biz|com|coop|edu|gov|info|int|mil|museum|name|net|org|pro|travel|mobi|[a-z]{2})|([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})(\.([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})){3})(:[0-9]{1,5})?$/i

This one works very well: it accepts 18 of the 18 valid addresses (with a strict IP address check) and rejects 19 of the 20 invalid ones. The one it misses comes down to a problem with checking the overall length of the address.
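
As a minimal usage sketch in JavaScript, the regex above can be wrapped in a small helper. Since the pattern itself doesn't limit the overall length, the helper adds a separate length check first. The name isValidEmail and the 254-character limit are assumptions for this example (254 is a figure commonly derived from RFC 5321, not something the regex enforces), so adjust them to your own rules.

```javascript
// First regex above, copied verbatim.
const EMAIL_RE = /^[-a-z0-9~!$%^&*_=+}{\'?]+(\.[-a-z0-9~!$%^&*_=+}{\'?]+)*@([a-z0-9]([-a-z0-9_]?[a-z0-9])*(\.[-a-z0-9_]+)*\.(aero|arpa|biz|com|coop|edu|gov|info|int|mil|museum|name|net|org|pro|travel|mobi|[a-z]{2})|([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})(\.([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})){3})(:[0-9]{1,5})?$/i;

// Assumed overall limit; not part of the regex itself.
const MAX_EMAIL_LENGTH = 254;

// Hypothetical helper: reject over-long input first, then apply the regex.
function isValidEmail(address, maxLength = MAX_EMAIL_LENGTH) {
  return address.length <= maxLength && EMAIL_RE.test(address);
}

console.log(isValidEmail("user@example.com"));  // true
console.log(isValidEmail("user@192.168.1.1"));  // true
console.log(isValidEmail("user@256.168.1.1"));  // false (octet out of range)
console.log(isValidEmail("a".repeat(250) + "@example.com")); // false (too long)
```

Keeping the length test outside the pattern leaves the regex unchanged and makes the limit easy to tune.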

There's just one small problem: each time a new TLD longer than 2 characters is added, you'll need to append it to the list in the regex. If you want a more generic solution, you can use this variant (note that this version does not check whether the TLD really exists; it accepts any TLD of 2 to 6 letters):

/^[-a-z0-9~!$%^&*_=+}{\'?]+(\.[-a-z0-9~!$%^&*_=+}{\'?]+)*@([a-z0-9]([-a-z0-9_]?[a-z0-9])*(\.[-a-z0-9_]+)*\.([a-z]{2,6})|([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})(\.([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})){3})(:[0-9]{1,5})?$/i
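
To see how the two variants differ in practice, here is a short JavaScript sketch that runs both regexes on the same addresses. The names LISTED_TLD_RE, GENERIC_TLD_RE and compareVariants, as well as the sample addresses, are assumptions chosen for this example.

```javascript
// First solution: explicit TLD list, copied verbatim.
const LISTED_TLD_RE = /^[-a-z0-9~!$%^&*_=+}{\'?]+(\.[-a-z0-9~!$%^&*_=+}{\'?]+)*@([a-z0-9]([-a-z0-9_]?[a-z0-9])*(\.[-a-z0-9_]+)*\.(aero|arpa|biz|com|coop|edu|gov|info|int|mil|museum|name|net|org|pro|travel|mobi|[a-z]{2})|([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})(\.([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})){3})(:[0-9]{1,5})?$/i;

// Generic variant: any TLD of 2 to 6 letters, copied verbatim.
const GENERIC_TLD_RE = /^[-a-z0-9~!$%^&*_=+}{\'?]+(\.[-a-z0-9~!$%^&*_=+}{\'?]+)*@([a-z0-9]([-a-z0-9_]?[a-z0-9])*(\.[-a-z0-9_]+)*\.([a-z]{2,6})|([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})(\.([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})){3})(:[0-9]{1,5})?$/i;

// Hypothetical helper: report what each variant says about one address.
function compareVariants(address) {
  return {
    address,
    listed: LISTED_TLD_RE.test(address),
    generic: GENERIC_TLD_RE.test(address),
  };
}

const samples = [
  "user@example.museum",      // TLD in the list: both accept
  "user@example.london",      // 6-letter TLD missing from the list: only generic accepts
  "user@example.zzzz",        // made-up TLD: generic accepts it anyway
  "user@example.photography", // more than 6 letters: both reject
];

for (const address of samples) {
  console.log(compareVariants(address));
}
```

The trade-off is visible in the last two samples: the generic variant lets a made-up TLD through, and its {2,6} bound still rejects TLDs longer than 6 letters.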

Both solutions should be usable in any language that provides Perl-style regular expressions (PCRE or a compatible flavour), on both the server and the client side (JavaScript, PHP, Perl, Python, Ruby, etc.).

About the author

Alexandre DE DOMMELIN

Geneva - Switzerland