Web Hosting Talk

Stripping domain from URL, what do you think is more solid?


yegorpb
02-13-2007, 03:13 PM
I have 2 methods that I came up with. I personally like the first, since it's much simpler and works with all TLD extensions, but I might not be seeing all sides of this.



function getDomainName($full_url) {
    $ref = parse_url($full_url);
    // parse_url() leaves out 'host' (or returns false) when it cannot find one
    if (!empty($ref['host'])) {
        $ref_check = substr($ref['host'], 0, 4);

        // check for www
        if ($ref_check == "www.") {
            $ref['host'] = substr($ref['host'], 4);
        }
    } else {
        $ref = array('host' => "BAD_URL");
    }

    // Output filtered url
    return $ref['host'];
}

vs.

function getDomainName($url) {
    $url = strtolower($url);
    if (substr($url, 0, 7) == 'http://')
        $url = substr($url, 7);
    if (strpos($url, '/'))
        $url = substr($url, 0, strpos($url, '/'));
    while (strpos($url, '.') != strrpos($url, '.'))
        $url = substr($url, strpos($url, '.')+1);
    return $url;
}

Xenatino
02-13-2007, 05:38 PM
EDIT: Sorry, misread original post

Engelmacher
02-13-2007, 06:37 PM
URLs don't have to have "www" in them. Your code dies if you feed it something like "http://slurp.junk.co.jp".

yegorpb
02-13-2007, 09:15 PM
My code doesn't need a www. It strips the www if it's there. The 2nd piece of code only works with normal TLDs, like .com/.net, and doesn't work with 2-part TLDs like co.uk.

First one should work with everything.

Engelmacher
02-13-2007, 09:47 PM
Quote (yegorpb): My code doesn't need a www. It strips the www if it's there. 2nd piece of code only works with normal TLDs, like .com/.net, and doesn't with 2-part TLDs like co.uk. First one should work with everything.

Test it yourself. It clearly returns "BAD_URL" if it doesn't find "www." in the array.

yegorpb
02-13-2007, 10:35 PM
Quote (Engelmacher): Test it yourself. It clearly returns "BAD_URL" if it doesn't find "www." in the array.

No, it doesn't. It returns that only if it's not a valid URL.
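
For anyone who wants to check this rather than argue about it, here is a minimal test sketch. It assumes the parameter name in the function signature matches the variable handed to parse_url(); the name getDomainNameChecked and the sample URLs are made up for illustration.

```php
<?php
// Hypothetical test copy of the first function; the name is invented for this sketch.
function getDomainNameChecked($full_url) {
    $ref = parse_url($full_url);
    if (!empty($ref['host'])) {
        // strip a leading "www." only when it is present
        if (substr($ref['host'], 0, 4) == "www.") {
            return substr($ref['host'], 4);
        }
        return $ref['host'];
    }
    // parse_url() found no host component
    return "BAD_URL";
}

$samples = array(
    "http://slurp.junk.co.jp",
    "http://www.example.com/index.html",
    "www.example.com",   // no scheme, so parse_url() treats it as a path
    "not a url",
);
foreach ($samples as $s) {
    echo $s, " => ", getDomainNameChecked($s), "\n";
}
```

With a host present and no "www." prefix, the inner if is simply skipped and the host comes back unchanged. Only inputs without a host component should reach the "BAD_URL" branch.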

Teh_Winnar
02-13-2007, 11:54 PM
I personally like this:



<?php
// get host name from URL
preg_match('@^(?:http://)?([^/]+)@i',
"http://www.php.net/index.html", $matches);
$host = $matches[1];

// get last two segments of host name
preg_match('/[^.]+\.[^.]+$/', $host, $matches);
echo "domain name is: {$matches[0]}\n";
?>

plumsauce
02-14-2007, 05:35 AM
The regex is pretty good, I like the "until slash" part, but it does not check whether the host contains any periods. For example, it would also match http://localhost. It will also match an IP address such as http://192.168.0.1.

The second part will fall apart if you have country-code TLDs. For example, blah.co.uk comes back as just co.uk.

This is fine, until someone fat fingers something that gets past the regex. Users will do the most insane things to data input screens. And then there are my own coding errors :)
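
One way to cover both points, as a rough sketch rather than a complete solution: reject hosts that are IP addresses or contain no dot, and keep three labels when the host ends in a known two-part suffix. The function name registrableDomain and the short $twoPartSuffixes list are assumptions for illustration only; a real list would be far longer (the Public Suffix List is the usual source).

```php
<?php
// Hypothetical helper; returns false when the host is not a usable domain name.
function registrableDomain($url) {
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return false;
    }
    $host = strtolower($host);

    // reject IP addresses and dotless hosts such as "localhost"
    if (filter_var($host, FILTER_VALIDATE_IP) !== false || strpos($host, '.') === false) {
        return false;
    }

    // illustrative sample only, not a complete list
    $twoPartSuffixes = array('co.uk', 'org.uk', 'co.jp', 'com.au');

    $labels = explode('.', $host);
    $count = count($labels);
    $lastTwo = $labels[$count - 2] . '.' . $labels[$count - 1];

    // keep one extra label when the host ends in a two-part suffix
    $keep = in_array($lastTwo, $twoPartSuffixes) ? 3 : 2;
    return implode('.', array_slice($labels, -$keep));
}
```

The suffix lookup is the weak spot: any two-part suffix missing from the list is treated like a normal TLD, so the list has to be maintained.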

Engelmacher
02-14-2007, 06:25 AM
Quote (yegorpb): No, it doesn't. It returns it if it's not a valid URL.

Well then I guess the PHP installations on my test machines must be horribly broken because that's what it does on every single one of them. I don't know what else to say other than "don't quit your day job".

brendandonhu
02-14-2007, 03:08 PM
Why don't you just use parse_url()?
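
For reference, a minimal sketch of that suggestion: parse_url() accepts an optional component argument, so the host can be pulled out directly. The sample URL is simply the one from the regex example earlier in the thread.

```php
<?php
// PHP_URL_HOST makes parse_url() return only the host (or null if there is none)
$host = parse_url("http://www.php.net/index.html", PHP_URL_HOST);
echo "host is: {$host}\n";
```

This still leaves the "www." prefix and any subdomains in place, so it covers only the first step.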

yegorpb
02-14-2007, 03:26 PM
Quote (Engelmacher): Well then I guess the PHP installations on my test machines must be horribly broken because that's what it does on every single one of them. I don't know what else to say other than "don't quit your day job".

Right back at you, buddy. I also suggest you get a pair of eyeglasses. :rolleyes: There is a } you are not seeing.

Quote (brendandonhu): Why don't you just use parse_url()?

I did, in the first example. I just wanted to see if it's sound code.

Xenatino
02-14-2007, 08:11 PM
Quote (Engelmacher): Well then I guess the PHP installations on my test machines must be horribly broken because that's what it does on every single one of them. I don't know what else to say other than "don't quit your day job".

Quote (yegorpb): Right back at you, buddy. I also suggest you get a pair of eyeglasses. There is a } you are not seeing.

This might shed some light in a test situation: http://www.1921681100.com/public/623.php