
Detect if url is a redirect or offline?
lexington 02-27-2008, 05:29 PM Hello, is there some code I could use to check a url that a user enters into a field to see if that url is a direct to another site? Also, another error check to see if the site is online? Either one that you could help me with is fine. If you could post a working example that would be great. Thanks!
azizny 02-27-2008, 09:45 PM What do you mean "url is a direct to another site"? You mean if it redirects to another website using header redirect?
Peace,
lexington 02-27-2008, 09:50 PM Hmm, I suppose those sites are using a header redirect. I am referring to when you view a site url and it redirects to another url. There should be a way for a script to detect that by comparing the first url to the changed url or something?
azizny 02-27-2008, 11:22 PM There are two steps:
Step one: Fetch the redirect url from the actual url
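// $url = www.site.com
// $getvar = /page?urlid=2
function get_url($url, $getvar) {
    $fp = fsockopen($url, 80, $errno, $errstr, 30);
    if (!$fp) {
        return NULL; // could not connect
    }
    $query = "GET $getvar HTTP/1.1\r\n";
    $query .= "Host: $url\r\n";
    // Without this the server may keep the connection open and feof() stalls
    $query .= "Connection: close\r\n\r\n";
    fputs($fp, $query);
    $buf = ''; // initialise before appending
    while (!feof($fp)) {
        $buf .= fgets($fp, 128);
    }
    fclose($fp);
    // $url[1] receives the Location value, without the trailing \r
    preg_match("/Location: (.+)\r?\n/U", $buf, $url);
    print_r($url);
    return $url;
}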
Step two: use fopen to check whether the url exists (this needs allow_url_fopen to be enabled).
Peace,
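For illustration, the header parsing in step one can be separated from the socket code, which makes it easier to try out. This is only a minimal sketch: the helper name find_location_header and the sample response are made up for this example and are not from the post above.

```php
<?php
// Hypothetical helper: returns the Location header value from a raw HTTP
// response string, or NULL when the response has no Location header.
function find_location_header($response) {
    // Headers end at the first blank line; ignore anything in the body.
    $parts = preg_split("/\r?\n\r?\n/", $response, 2);
    $headers = $parts[0];
    // Header names are case-insensitive; the trailing \r is left out of the match.
    if (preg_match('/^Location:[ \t]*(.*?)[ \t]*\r?$/mi', $headers, $m)) {
        return $m[1];
    }
    return NULL;
}

// Made-up sample response, for demonstration only.
$sample = "HTTP/1.1 302 Found\r\nLocation: http://www.example.com/new\r\nContent-Length: 0\r\n\r\n";
echo find_location_header($sample), "\n";
```

Returning the plain string (or NULL) rather than the whole match array means the caller does not have to guess which array index holds the url.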
lexington 02-27-2008, 11:48 PM Meaning fopen(get_url($url))?
azizny 02-28-2008, 11:50 AM Depends on what the function is returning, which is why I used print_r($url). The captured url should be in $url[1] ($url[0] holds the whole matched "Location: ..." line); if you return that, then you can use fopen(get_url($url, $getvar), 'r').
You are not trying to scrape a scripts directory, are you?
Peace,
lexington 02-28-2008, 08:36 PM Not sure what that means? I am adding this to my own site, since some people add urls that appear to be normal but then redirect to a spam site. I want to add a check that sees whether the url remains the same and, if so, allows it.
azizny 02-28-2008, 09:21 PM What if the site doing a header redirect is actually fine (e.g. the site moved)?
Peace,
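One possible way to tell a harmless redirect (site moved to another page on the same host) from a redirect to a different site is to compare host names instead of whole urls. The sketch below is an assumption about what "same site" could mean here: equal host names, ignoring case and a leading "www.". The function names normalize_host and is_same_site are made up for this example.

```php
<?php
// Hypothetical helper: lower-case the host and drop a leading "www.".
function normalize_host($host) {
    return preg_replace('/^www\./', '', strtolower($host));
}

// Hypothetical helper: TRUE when the redirect target stays on the host of
// the url the user entered, FALSE when it points at a different host.
function is_same_site($original, $target) {
    $a = parse_url($original, PHP_URL_HOST);
    $b = parse_url($target, PHP_URL_HOST);
    if (!$a) {
        return FALSE; // the entered url has no usable host
    }
    if (!$b) {
        return TRUE;  // relative Location such as "/moved" stays on the same host
    }
    return normalize_host($a) === normalize_host($b);
}

var_dump(is_same_site('http://www.site.com/page', 'http://site.com/moved'));
var_dump(is_same_site('http://www.site.com/page', 'http://spam.example/'));
```

This would still reject a legitimate move to a new domain, so whether to block or only flag such urls is a judgment call for the site owner.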
Xeentech 02-29-2008, 04:28 PM Could we please avoid reimplementing an incomplete version of HTTP every time someone has a question like this? There are many HTTP implementations available to reuse.
Here's an example using cURL that I made in about a second.
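<?php
$ch = curl_init("http://domain.tld/");
curl_exec($ch);
// 301, 303, 307 and 308 are redirects too, not only 302
$code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
if (in_array($code, array(301, 302, 303, 307, 308))) {
    print("URL was an HTTP redirect\n");
}
?>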
lexington 02-29-2008, 04:40 PM Thanks I will try that :)
lexington 02-29-2008, 06:31 PM This works, but when the url is not a redirect it displays the website on my own page. Is there a way to prevent that? If the site is a redirect, it displays the printed error, which works fine.
EDIT
I believe it is the curl_exec function that is displaying the other site on my page.
Xeentech 02-29-2008, 08:29 PM I guess you could write the output to /dev/null like this:
$fp = fopen("/dev/null", "w");
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_exec($ch);
fclose($fp);
lexington 02-29-2008, 09:25 PM Is that the full code to use? Because I do not see where it checks the site url in your new code.
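Putting the pieces from the posts above together, a complete check might look like the following. This is a sketch under assumptions rather than a definitive version: the function names classify_status and check_url and the timeout values are made up for this example. Instead of writing to /dev/null it uses CURLOPT_NOBODY and CURLOPT_RETURNTRANSFER so that nothing is printed, and it treats a status code of 0 (what curl_getinfo() reports when no response arrived) as "offline".

```php
<?php
// Hypothetical helper: maps an HTTP status code to a coarse label.
function classify_status($code) {
    if ($code == 0) {
        return 'offline';   // no HTTP response was received at all
    }
    if (in_array($code, array(301, 302, 303, 307, 308))) {
        return 'redirect';
    }
    if ($code >= 200 && $code < 300) {
        return 'ok';
    }
    return 'error';         // 4xx, 5xx and anything else
}

// Hypothetical wrapper around the cURL calls shown earlier in the thread.
function check_url($url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);          // HEAD request: skip the page body
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return output instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false); // report the redirect, do not follow it
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);    // assumed timeout values
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);
    curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    return classify_status($code);
}
```

It could be called as check_url("http://domain.tld/") and the url rejected when the result is 'redirect' or 'offline'. Some servers answer HEAD requests differently from GET requests, so dropping the CURLOPT_NOBODY line may be needed for those.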