  1. #1

    Help with validating email address (preg_match)

    Hi,

    I am trying to validate an email address, so that the user has to enter a valid email. I am using this script:

    PHP Code:
    $email = 'user@domain.com';
    if (preg_match("^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,6})$^", $email)) {
        echo 'Email is correct!';
    }

    I did not come up with the expression myself; I found it, and I don't fully understand it. Could someone (PLEASE!) break it up and try to explain what the different blocks mean?

    I realize that [_a-z0-9-] is the set of valid characters, but what about the other things? Like the ^ at the beginning, and what difference do the + and the * make between the blocks?

    Anyway, I would truly love it if someone would try to explain this to me. Thanks!

    Hugs / Merilille

  2. #2
    Join Date
    Nov 2002
    Posts
    514
    Get help with your server optimization - A forum on server optimization...
    ExoPHPDesk - Powerful PHP HelpDesk

  3. #3
    Join Date
    Apr 2002
    Location
    UK
    Posts
    429
    Regular expressions aren't a speciality of mine, but I'll have a go.

    I'd read it in sections.

    "^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,6})$^"

    ^[_a-z0-9-]+

    ^ = starting at the beginning of the line.

    [_a-z0-9-] = match a-z, 0-9, _ or -

    + = (above match) one or more times.

    (\.[_a-z0-9-]+)*

    ( = start a subpattern.

    \. = match a full stop, single character (the backslash makes the dot literal).

    [_a-z0-9-] = match a-z, 0-9, _ or - again.

    + = (above match) one or more times.

    ) = end a subpattern.

    * = match the preceding thing (in this case the subpattern) 0 or more times.

    @

    @ = match that @ character, once.

    [a-z0-9-]+

    [a-z0-9-] = match a-z, 0-9 or -.

    + = ... one or more times.

    (\.[a-z0-9-]+)*

    ( = start another subpattern.

    \. = match a full stop, once.

    [a-z0-9-] = match a-z, 0-9 or -.

    + = ... one or more times.

    ) = end a subpattern.

    * = match the preceding thing (in this case the subpattern) 0 or more times.

    (\.[a-z]{2,6})$^

    ( = start another subpattern.

    \. = match a full stop, once.

    [a-z]{2,6} = match a-z for 2 to 6 characters.

    ) = end a subpattern.

    $ = end of line.

    ^ = starting at the beginning of the line (not sure why that's there).
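
    Putting the blocks back together: below is a minimal sketch of the whole pattern in use, assuming / as the delimiter and a backslash in front of each dot, as in the first post. The function name and the sample addresses are made up for illustration.

    ```php
    <?php
    // Hypothetical helper: the name and the sample addresses are illustrative only.
    function matches_pattern($email)
    {
        // ^ and $ tie the match to the whole string; \. is a literal full stop.
        $pattern = '/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,6})$/';
        return preg_match($pattern, $email) === 1;
    }

    var_dump(matches_pattern('user@domain.com'));          // bool(true)
    var_dump(matches_pattern('first.last@mail.host.org')); // bool(true)
    var_dump(matches_pattern('user@domain'));              // bool(false): no final .xx block
    var_dump(matches_pattern('user..name@domain.com'));    // bool(false): nothing between the dots
    ```

    Note that the character classes are lowercase only, so an address with capitals fails unless the i modifier is added.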

  4. #4
    Join Date
    Sep 2000
    Location
    Alberta, Canada
    Posts
    3,146
    disoft, for someone who's not sure what they are doing, you do pretty well.

    I would also agree that the caret ^ at the end should not be there.


    Was also wondering if anyone sees this code as faster, compared to what was originally posted:

    if ( preg_match ( "/^[A-Za-z0-9\._-]+[@][A-Za-z0-9_-]+([.][A-Za-z0-9\._-]+)+[A-Za-z]{2,6}$/", $email ) )

    Haven't tried it yet but wondering how important the * pattern match is, since the whole variable is supposed to be a pattern match anyway.
    PotentProducts.com - for all your Hosting needs
    Helping people Host, Create and Maintain their Web Site
    ServerAdmin Services also available

  5. #5
    Thanks a lot for the answers, they were all great!

    Regarding the last ^: if I don't have it, I get this error message:
    "Warning: No ending delimiter '^' found in C:\web\index.php on line 28"
    When I place it there at the end, the error goes away. That's why I put it there; I'm not that good with this stuff, but it worked.

    I do have one question about my original post, though. It seems as if an email like "firstname..lastname@host.com" is valid, and "firstname!#%lastname@host.com" is also valid. Well, my script seems to think so anyway.
    Does anyone have any idea why this happens, and how I can fix it?
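
    One likely explanation, sketched below: when ^ is used as the delimiter, the leading ^ is read as a delimiter and no longer works as the start-of-string anchor, so the pattern is free to start matching anywhere in the string. The pattern body is the one from the first post; the variable names are made up for illustration.

    ```php
    <?php
    // Pattern body from the first post, without any delimiters.
    $body = '[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,6})$';

    // With ^ as the delimiter there is no start anchor left inside the pattern.
    $caret_delimited = '^' . $body . '^';
    // With / as the delimiter, an explicit ^ anchors the match to the start.
    $slash_delimited = '/^' . $body . '/';

    $address = 'firstname!#%lastname@host.com';

    var_dump(preg_match($caret_delimited, $address)); // int(1): matches the tail "lastname@host.com"
    var_dump(preg_match($slash_delimited, $address)); // int(0)
    ```

    The same thing happens with "firstname..lastname@host.com": the unanchored version is satisfied by "lastname@host.com" alone.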

  6. #6
    Join Date
    Sep 2000
    Location
    Alberta, Canada
    Posts
    3,146
    It would seem that escaping is required:

    (preg_match("/^[_a-z0-9-]+(.[_a-z0-9-]+)*@[a-z0-9-]+(.[a-z0-9-]+)*(.[a-z]{2,6})$/", $email))

    You should then notice that the addresses you say are accepted will no longer be accepted.

    Also, the {2,6} at the end means 2 to 6 characters, so a domain name ending in 2 characters (.ca, .us, etc.) is accepted as it stands.
    Changing it to {1,6} would only let single-letter endings through as well.
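
    A quick way to check what {2,6} accepts is to test the last block on its own. This is only a sketch; the variable name and the sample endings are made up for illustration.

    ```php
    <?php
    // {2,6} means "between 2 and 6 repetitions" of the preceding [a-z].
    $tld = '/^\.[a-z]{2,6}$/';

    var_dump(preg_match($tld, '.ca'));     // int(1): two letters are enough
    var_dump(preg_match($tld, '.museum')); // int(1): six letters
    var_dump(preg_match($tld, '.c'));      // int(0): one letter is too short
    ```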

  7. #7
    Join Date
    Oct 2004
    Location
    UK
    Posts
    487
    I'm sure with preg_match, you have to have the same character at the beginning and the end to specify the beginning and end of the regex to PHP. This can be any character, but most people use ^ as it is not regularly used in regexes.

  8. #8
    Join Date
    Dec 2004
    Location
    Canada
    Posts
    1,097
    Originally posted by Xenatino
    I'm sure with preg_match, you have to have the same character at the beginning and the end to specify the beginning and end of the regex to PHP. This can be any character, but most people use ^ as it is not regularly used in regexes.
    Most people use '/' or '#' as they're not regular expression metacharacters. Using '^' is asinine because of its ambiguity.

    Also of note is that '.' is a wildcard in REs, and as it isn't escaped here, the '.'s which are intended to match only a full stop will actually match any character whatsoever. This may cause false positives. REs are parsed into a tree and executed in that fashion. There's negligible difference between using '*' and '+', and oftentimes '*' will be faster as it's less strict, and thus requires less testing.

  9. #9
    Join Date
    Jan 2004
    Location
    Stockholm
    Posts
    25
    Also of note is that '.' is a wildcard in REs, and as it isn't escaped here, the '.'s which are intended to match only a full stop will actually match any character whatsoever.
    I am wondering about this as well, since "user%username@host.com" will work, and the % can be any character, it seems. Hmm, how would I go about fixing this?

  10. #10
    Join Date
    Jan 2004
    Location
    Stockholm
    Posts
    25
    Solved it, needed a backslash in front of the dots.
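
    For reference, the difference the backslash makes can be sketched like this. Both patterns are the one discussed in this thread, once with bare dots and once with escaped dots; the variable names are made up for illustration.

    ```php
    <?php
    // Bare dots: each . matches any single character.
    $unescaped = '/^[_a-z0-9-]+(.[_a-z0-9-]+)*@[a-z0-9-]+(.[a-z0-9-]+)*(.[a-z]{2,6})$/';
    // Escaped dots: each \. matches only a full stop.
    $escaped = '/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,6})$/';

    $address = 'user%username@host.com';

    var_dump(preg_match($unescaped, $address)); // int(1): the bare . swallows the %
    var_dump(preg_match($escaped, $address));   // int(0)
    var_dump(preg_match($escaped, 'user.username@host.com')); // int(1)
    ```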
    --

  11. #11
    Join Date
    Dec 2003
    Location
    Earth
    Posts
    144
    Just my 2 cents but if you really want to be picky about validating email addresses:

    1. I would break it up first

    soandso@someplace.com = soandso , @ , someplace, . , com

    2. check each part individually

    3. If they check out it's valid

    This would be an alternative to using a universal regex to verify.
    Just think if you couldn't split the string with '@' it would be invalid right away.
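
    The three steps above could be sketched roughly as follows. The function name and the rules applied to each part are assumptions chosen for illustration, not a complete check.

    ```php
    <?php
    // Hypothetical helper; the per-part rules are illustrative assumptions.
    function check_email_parts($email)
    {
        // 1. Break it up: exactly one "@" must separate the two halves.
        $parts = explode('@', $email);
        if (count($parts) !== 2) {
            return false;
        }
        list($local, $domain) = $parts;

        // 2. Check each part individually.
        if (!preg_match('/^[_a-z0-9-]+(\.[_a-z0-9-]+)*$/i', $local)) {
            return false;
        }
        $labels = explode('.', $domain);
        if (count($labels) < 2) {
            return false; // need at least "someplace" and "com"
        }
        foreach ($labels as $label) {
            if (!preg_match('/^[a-z0-9-]+$/i', $label)) {
                return false; // empty or odd-looking label
            }
        }

        // 3. Every part checked out.
        return true;
    }
    ```

    An address without an "@", or with more than one, is rejected at step 1 before any pattern is run.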

  12. #12
    Join Date
    Apr 2002
    Location
    UK
    Posts
    429
    Well, personally, I use this complex affair:

    PHP Code:
    <?php
    /*
     * The preg_match pattern is all one line.
     */
    function is_email_format($dataval)
    {
        return (preg_match("/^[-_.[:alnum:]]+@((([[:alnum:]]|[[:alnum:]][[:alnum:]-]*[[:alnum:]])\.)+(ad|ae|aero|af|ag|ai|al|am|an|ao|aq|ar|arpa|as|at|au|aw|az|ba|bb|bd|be|bf|bg|bh|bi|biz|bj|bm|bn|bo|br|bs|bt|bv|bw|by|bz|ca|cc|cd|cf|cg|ch|ci|ck|cl|cm|cn|co|com|coop|cr|cs|cu|cv|cx|cy|cz|de|dj|dk|dm|do|dz|ec|edu|ee|eg|eh|er|es|et|eu|fi|fj|fk|fm|fo|fr|ga|gb|gd|ge|gf|gh|gi|gl|gm|gn|gov|gp|gq|gr|gs|gt|gu|gw|gy|hk|hm|hn|hr|ht|hu|id|ie|il|in|info|int|io|iq|ir|is|it|jm|jo|jp|ke|kg|kh|ki|km|kn|kp|kr|kw|ky|kz|la|lb|lc|li|lk|lr|ls|lt|lu|lv|ly|ma|mc|md|mg|mh|mil|mk|ml|mm|mn|mo|mp|mq|mr|ms|mt|mu|museum|mv|mw|mx|my|mz|na|name|nc|ne|net|nf|ng|ni|nl|no|np|nr|nt|nu|nz|om|org|pa|pe|pf|pg|ph|pk|pl|pm|pn|pr|pro|ps|pt|pw|py|qa|re|ro|ru|rw|sa|sb|sc|sd|se|sg|sh|si|sj|sk|sl|sm|sn|so|sr|st|su|sv|sy|sz|tc|td|tf|tg|th|tj|tk|tm|tn|to|tp|tr|tt|tv|tw|tz|ua|ug|uk|um|us|uy|uz|va|vc|ve|vg|vi|vn|vu|wf|ws|ye|yt|yu|za|zm|zw)|(([0-9][0-9]?|[0-1][0-9][0-9]|[2][0-4][0-9]|[2][5][0-5])\.){3}([0-9][0-9]?|[0-1][0-9][0-9]|[2][0-4][0-9]|[2][5][0-5]))$/i", $dataval)) ? true : false;
    }
    ?>

  13. #13
    Join Date
    Oct 2004
    Posts
    104
    This is what I use in tcl:

    regexp {^[^@ ][^@ ]*@[^@ ][^@ ]*\.[^@ ][^@ ]*$} $EMAIL

    I am not completely happy with it, but for the most part, it does what I want it to do.
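
    For anyone following along in PHP, a rough counterpart of that Tcl check might look like the sketch below. The function name is made up, and the pattern is copied as-is; [^@ ][^@ ]* could equally be written [^@ ]+.

    ```php
    <?php
    // Hypothetical PHP counterpart of the Tcl check above.
    function loose_email_check($email)
    {
        // Something without @ or spaces, an @, then something.something,
        // again without @ or spaces.
        return preg_match('/^[^@ ][^@ ]*@[^@ ][^@ ]*\.[^@ ][^@ ]*$/', $email) === 1;
    }

    var_dump(loose_email_check('user@domain.com'));  // bool(true)
    var_dump(loose_email_check('user@domain'));      // bool(false): no dot after the @
    var_dump(loose_email_check('us er@domain.com')); // bool(false): contains a space
    ```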

  14. #14
    Join Date
    Jan 2004
    Location
    Stockholm
    Posts
    25
    I ended up using this, could anyone please comment on it?

    PHP Code:
    function validateemail($email) {
        if (preg_match("/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.(ad|ae|aero|af|ag|ai|al|am|an|ao|aq|ar|arpa|as|at|au|aw|az|ba|bb|bd|be|bf|bg|bh|bi|biz|bj|bm|bn|bo|br|bs|bt|bv|bw|by|bz|ca|cc|cd|cf|cg|ch|ci|ck|cl|cm|cn|co|com|coop|cr|cs|cu|cv|cx|cy|cz|de|dj|dk|dm|do|dz|ec|edu|ee|eg|eh|er|es|et|eu|fi|fj|fk|fm|fo|fr|ga|gb|gd|ge|gf|gh|gi|gl|gm|gn|gov|gp|gq|gr|gs|gt|gu|gw|gy|hk|hm|hn|hr|ht|hu|id|ie|il|in|info|int|io|iq|ir|is|it|jm|jo|jp|ke|kg|kh|ki|km|kn|kp|kr|kw|ky|kz|la|lb|lc|li|lk|lr|ls|lt|lu|lv|ly|ma|mc|md|mg|mh|mil|mk|ml|mm|mn|mo|mp|mq|mr|ms|mt|mu|museum|mv|mw|mx|my|mz|na|name|nc|ne|net|nf|ng|ni|nl|no|np|nr|nt|nu|nz|om|org|pa|pe|pf|pg|ph|pk|pl|pm|pn|pr|pro|ps|pt|pw|py|qa|re|ro|ru|rw|sa|sb|sc|sd|se|sg|sh|si|sj|sk|sl|sm|sn|so|sr|st|su|sv|sy|sz|tc|td|tf|tg|th|tj|tk|tm|tn|to|tp|tr|tt|tv|tw|tz|ua|ug|uk|um|us|uy|uz|va|vc|ve|vg|vi|vn|vu|wf|ws|ye|yt|yu|za|zm|zw))$/", $email))
            {
                return true;
            }
        else
            {
                return false;
            }
    }

    --

  15. #15
    Join Date
    Oct 2004
    Posts
    104
    Originally posted by Suedish
    I ended up using this, could anyone please comment on it?

    Two things about this, and one about this topic in general. First, all ASCII characters are allowed in an email address except space and ()<>@,;:\".[]. Your method will not allow all of the chars that RFC 822 allows. Second, using this method means that you have to keep up with new top-level domains as they are added.

    As for this topic in general, spending a lot of time trying to validate email addresses in code is a waste of time. When some guy gives us an address like lskjdf@lskjf, our validators will pick it up and flag the address, but as soon as the user knows that he cannot use that address, he will put in something like me@blahblahblah.com. That is why I do the simplest validation up front, and let the mail server do the real validation on the back end.
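
    As a rough sketch of that split between a cheap check up front and real validation by the mail server: the function name, the return values and the $send_confirmation callable are all made up for illustration, and the callable stands in for whatever actually hands the message to the mail server.

    ```php
    <?php
    // Sketch only: $send_confirmation is a hypothetical callable that returns
    // true when the mail server accepts a message for the address.
    function register_address($email, callable $send_confirmation)
    {
        // Simplest validation up front: something@something.something,
        // with no spaces and no extra @.
        if (!preg_match('/^[^@ ]+@[^@ ]+\.[^@ ]+$/', $email)) {
            return 'rejected';
        }
        // The real validation happens on the back end.
        return $send_confirmation($email) ? 'pending' : 'undeliverable';
    }
    ```

    With this arrangement lskjdf@lskjf is turned away immediately, while me@blahblahblah.com gets through the first check and is only caught when delivery fails.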

  16. #16
    Join Date
    Jun 2004
    Location
    Bay Area -USA
    Posts
    1,740
    This is a function I use. It's not my own work, but I've had it for so long I can't remember where I got it from. Either way, here it is:


    PHP Code:
    //check for correct Address. If none - go back
    function Validate_String($string, $return_invalid_chars = false)
    {
        $valid_chars = "1234567890-_.^~abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
        $invalid_chars = "";

        if ($string == null || $string == "")
            return(true);

        //For every character on the string.
        for ($index = 0; $index < strlen($string); $index++)
        {
            $char = substr($string, $index, 1);

            //Is it a valid character?
            if (strpos($valid_chars, $char) === false)
            {
                //If not, is it already on the list of invalid characters?
                if (strpos($invalid_chars, $char) === false)
                {
                    //If it's not, add it.
                    if ($invalid_chars == "")
                        $invalid_chars .= $char;
                    else
                        $invalid_chars .= ", " . $char;
                }
            }
        }

        //If the string does not contain invalid characters, the function will return true.
        //If it does, it will either return false or a list of the invalid characters used
        //in the string, depending on the value of the second parameter.
        if ($return_invalid_chars == true && $invalid_chars != "")
        {
            $last_comma = strrpos($invalid_chars, ",");

            if ($last_comma != false)
                $invalid_chars = substr($invalid_chars, 0, $last_comma) .
                    " and " . substr($invalid_chars, $last_comma + 1, strlen($invalid_chars));

            return($invalid_chars);
        }
        else
            return($invalid_chars == "");
    }

    function Verify_Email_Address($email_address)
    {
        //Assumes that valid email addresses consist of user_name@domain.tld
        $at = strpos($email_address, "@");
        $dot = strrpos($email_address, ".");

        if ($at === false ||
            $dot === false ||
            $dot <= $at ||
            $dot == 0 ||
            $dot == strlen($email_address) - 1)
            return(false);

        $user_name = substr($email_address, 0, $at);
        $domain_name = substr($email_address, $at + 1, strlen($email_address));

        if (Validate_String($user_name) == false ||
            Validate_String($domain_name) == false)
            return(false);

        return(true);
    }

    Then call it by using
    PHP Code:
    if(Verify_Email_Address($mail_to)){
        //do something
    }


  17. #17
    Originally posted by folsom
    Two things about this, and one about this topic in general. First, all ASCII characters are allowed in an email address except space and ()<>@,;:\".[]. Your method will not allow all of the chars that RFC 822 allows. Second, using this method means that you have to keep up with new top-level domains as they are added.

    As for this topic in general, spending a lot of time trying to validate email addresses in code is a waste of time. When some guy gives us an address like lskjdf@lskjf, our validators will pick it up and flag the address, but as soon as the user knows that he cannot use that address, he will put in something like me@blahblahblah.com. That is why I do the simplest validation up front, and let the mail server do the real validation on the back end.
    After seeing this, I came up with this. It's not perfect, and looks like hell, but it does a decent job.

    preg_match('/^[^()<>@,;:\\"\[\] ]+@[^()<>@,;:\\"\[\] ]+\.[^()<>@,;:\\"\[\] ]+$/', $email);
