Thread: PHP: preg_match

  1. #1
    Join Date
    Mar 2004
    Location
    Toronto, Canada
    Posts
    122

    PHP: preg_match

    I have a system where people can upload files. Right now I am trying to check that the name of an uploaded file contains only the chars a-z A-Z 0-9 _ -

    I am trying to do this as follows:

    PHP Code:
    $path_parts = pathinfo($_FILES['userfile']['name']);

    $pattern = "/([a-zA-Z0-9_-]){1,255}/";

    if (preg_match($pattern, $path_parts['basename']))
        echo "good";
    else
        echo "bad";
    The problem is that as long as at least one of the chars listed in my expression appears in the file name, the match seems to succeed. I do not want to allow spaces, '%', or any other special chars in file names, but they are still being allowed.

    Google has helped me this far but I can't seem to find anything to help me on the last bit here. How do I ensure that any file name with a char that is not in my list gets excluded?

  2. #2
    Join Date
    May 2004
    Location
    Singapore
    Posts
    263
    /^pattern$/ asserts that the subject matches the pattern from its start to its end.
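    A minimal sketch of the difference, using the character class from the first post. The two sample file names are made up for illustration.

```php
<?php
// Unanchored: succeeds if ANY run of allowed characters occurs in the subject.
$unanchored = '/[a-zA-Z0-9_-]{1,255}/';
// Anchored: the whole subject must consist of allowed characters.
$anchored = '/^[a-zA-Z0-9_-]{1,255}$/';

$names = ['report_2004', 'my file%'];   // hypothetical sample names
foreach ($names as $name) {
    printf("%-12s unanchored=%d anchored=%d\n",
        $name,
        preg_match($unanchored, $name),
        preg_match($anchored, $name));
}
```

    Note that by default $ also matches just before a trailing newline, so if that matters the D modifier or \z in place of $ may be worth considering.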
    #include<cstdio>
    char*s="#include<cstdio>%cchar*s=%c%s%c;%cint main(){std::printf(s,10,34,s,34,10);}";
    int main(){std::printf(s,10,34,s,34,10);}

  3. #3
    Join Date
    Feb 2005
    Location
    Seattle, Washington
    Posts
    147

  4. #4
    Join Date
    May 2004
    Location
    Singapore
    Posts
    263
    ctype_alnum() won't work out of the box here, mfonda. One would have to apply it to every character in the subject and test for underscores and dashes as well.
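    For comparison, a rough sketch of what that per-character approach could look like. The function name is_safe_name is hypothetical, and this assumes the ctype extension is available.

```php
<?php
// Hypothetical helper: true only if every character is a letter, a digit, "_" or "-".
function is_safe_name(string $name): bool
{
    if ($name === '') {
        return false;               // reject empty names up front
    }
    foreach (str_split($name) as $ch) {
        // ctype_alnum() alone would reject "_" and "-", so test them separately.
        if ($ch !== '_' && $ch !== '-' && !ctype_alnum($ch)) {
            return false;
        }
    }
    return true;
}

var_dump(is_safe_name('my_file-01'));
var_dump(is_safe_name('my file%'));
```

    An anchored regular expression does the same job in one line, which is probably why the thread sticks with preg_match.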

  5. #5
    Join Date
    Mar 2004
    Location
    Toronto, Canada
    Posts
    122
    Ok, I realized that $path_parts['basename'] gives me the file name WITH extension so that was causing a problem.

    I tried modifying my pattern as follows to account for the extension, but it's not quite working.

    $pattern = "/^[a-zA-Z0-9_-]$\.^[a-zA-Z]$/"; (in my sample code above this outputs 'bad')

    I would like to add a further restriction that makes sure the name is 1 to 255 chars and the extension is 3 chars, but the following caused an error (Parse error: parse error, unexpected ',' in ...).

    $pattern = "/^[a-zA-Z0-9_-]${1,255}\.^[a-zA-Z]${3}/";
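    The parse error most likely comes from PHP itself rather than from the regex engine: inside a double-quoted string, ${ starts variable interpolation, so ${1,255} never reaches preg_match. A sketch of one way around it, assuming single quotes are acceptable here, with each quantifier placed directly after its character class and the anchors only at the two ends of the pattern. The sample names are made up.

```php
<?php
// Single quotes: PHP does not interpolate "$" or "${...}" in this string.
$pattern = '/^[a-zA-Z0-9_-]{1,255}\.[a-zA-Z]{3}$/';

var_dump(preg_match($pattern, 'holiday-photo.jpg'));   // hypothetical valid name
var_dump(preg_match($pattern, 'holiday photo.jpg'));   // contains a space
```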

  6. #6
    Join Date
    May 2004
    Location
    Singapore
    Posts
    263
    $pattern = "/^[0-9A-Za-z\_\-]{1,255}\.[0-9A-Za-z]{3}$/";

    though personally I think restricting file extensions to exactly 3 characters is a little too restrictive.

  7. #7
    Join Date
    Mar 2004
    Location
    Toronto, Canada
    Posts
    122
    Originally posted by laserlight
    $pattern = "/^[0-9A-Za-z\_\-]{1,255}\.[0-9A-Za-z]{3}$/";

    though personally I think restricting file extensions to exactly 3 characters is a little too restrictive.
    Perfect, thank you very much.

    I know it's very restrictive to force it to be three chars and I might change that, but for now all of the types I am allowing to be uploaded should have a 3 char extension. I will be looking at that part again later.

  8. #8
    Join Date
    Apr 2000
    Location
    California
    Posts
    3,051
    BTW, you don't need to backwack the _ and - characters, and you can use \w in place of A-Za-z0-9 and _ anyway. I also doubt you want a number in the file extension, right?

    $pattern = "/^[-\w]{1,255}\.[A-Za-z]{3}$/";

    You can also match case-insensitively with the i modifier, which saves some typing:

    $pattern = "/^[-\w]{1,255}\.[a-z]{3}$/i";

    Also, you'd probably want to check that no one uploads a file you don't want them to be able to run (e.g., .php, .cgi, etc.). Just some ideas and suggestions to help you along.
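    A quick sketch comparing the spelled-out character class with the \w shorthand plus the i modifier. The sample names are invented, and this assumes plain ASCII input: what \w matches can vary with locale-specific matching or the u modifier.

```php
<?php
$explicit = '/^[0-9A-Za-z_-]{1,255}\.[A-Za-z]{3}$/';
$short    = '/^[-\w]{1,255}\.[a-z]{3}$/i';   // hyphen first, so it stays literal

// Hypothetical sample names
$samples = ['Notes_v2.TXT', 'a-b.doc', 'bad name.txt', 'archive.tar.gz', 'song.mp3'];
foreach ($samples as $s) {
    // For ASCII input both patterns should give the same answer.
    printf("%-16s explicit=%d short=%d\n",
        $s, preg_match($explicit, $s), preg_match($short, $s));
}
```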

  9. #9
    Join Date
    May 2004
    Location
    Singapore
    Posts
    263
    you don't need to backwack the _ and - characters
    I think it is good practice though, especially if someone somehow moves the dash from the end to somewhere in the middle while modifying the regex.

    and I doubt you want a number in the file extension, right?
    mp3 files?

  10. #10
    Join Date
    Mar 2004
    Location
    Toronto, Canada
    Posts
    122
    I do check the file extension, just not with the regular expression. I do a MIME type check and I check the extension separately, as I read that not all browsers send the proper MIME type and that it is easy to spoof anyway.
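    A separate extension check of that kind might look roughly like this. The function name has_allowed_extension and the allow-list contents are hypothetical, since the thread does not say which types are accepted.

```php
<?php
// Hypothetical allow-list; replace with the types the upload form really accepts.
$allowed = ['jpg', 'png', 'pdf'];

// Hypothetical helper: compares the last extension, lowercased, to the allow-list.
function has_allowed_extension(string $filename, array $allowed): bool
{
    $ext = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
    return in_array($ext, $allowed, true);
}

var_dump(has_allowed_extension('Scan_01.PDF', $allowed));
var_dump(has_allowed_extension('script.php', $allowed));
```

    pathinfo() only reports the part after the last dot, so a name such as shell.php.jpg passes this check on its own; the anchored pattern from earlier in the thread rejects it because the name part may not contain a dot.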

    MP3 files are not allowed for this so no numbers in the extension.

  11. #11
    Join Date
    Apr 2000
    Location
    California
    Posts
    3,051
    Originally posted by laserlight
    I think it is good practice though, especially if someone somehow moves the dash from the end to somewhere in the middle while modifying the regex.


    mp3 files?
    It's never good practice to backwack characters that don't need it; it can cause confusion. The hyphen could indeed pose an issue if it's placed differently, but the underscore shouldn't.
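    To illustrate the hyphen concern, here is a small sketch with a made-up character class in which the hyphen has ended up between two other characters. The patterns and sample strings are invented for the example and are not taken from the thread.

```php
<?php
// Intended: lowercase letters plus the literal characters + - _
$accidentalRange = '/^[a-z+-_]+$/';    // "+-_" is read as the range "+" through "_"
$escapedHyphen   = '/^[a-z+\-_]+$/';   // the backslash keeps "-" literal

var_dump(preg_match($accidentalRange, 'A<B'));     // uppercase and "<" fall inside the range
var_dump(preg_match($escapedHyphen, 'A<B'));
var_dump(preg_match($escapedHyphen, 'a+b-c_d'));
```

    Keeping the hyphen first or last in the class, as in [-\w], avoids the range without any escaping.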
