
How to take the address of all links present in a file?
wayiran 08-14-2006, 04:17 AM I have done this much of it:
<?php
$text=htmlspecialchars(file_get_contents("welcome.txt"), ENT_NOQUOTES);
$start = strpos($text,"<a href=\"");
$end = strpos($text,"\">");
echo substr($text, ($start+12), ($end - $start-12));
?>
But it takes only the first link in the file. How can I change it to get the addresses of all the links?
Thank you.
UK-Networks 08-14-2006, 06:26 AM <?
$text=htmlspecialchars(file_get_contents("welcome.txt"), ENT_NOQUOTES);
$start=0;
$end=0;
$test=0;
while (strpos($text,"<a href=\"")) {
    $start = strpos($text,"<a href=\"",$end);
    $end = strpos($text,"\">",$start);
    echo substr($text, ($start+12), ($end - $start-12));
}
?>
wayiran 08-14-2006, 09:04 AM Thanks, but the while loop repeats forever and produces 1000 pages of:
hhhvvvsss; <body> <p><a href="hhhvvvsss; <body> <p><a href="hhhvvvsss; <body> <p><a href="hhhvvvsss; <body> <p><a href=" ...
Do you have a correction? Why does this happen?
while (strpos($text,"<a href=\"")) {
}
I think it starts over from the beginning of $text when it reaches the end! When I removed the <a href="..."> links from my text file, it didn't show anything at all, not even the other content.
My welcome.txt file contains:
</head>
<body>
<p><a href="hhh">aaa</a></p>
</body>
</html>
<p><a href="vvv">aaa</a></p>
<p><a href="sss">aaa</a></p>
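A note on why the loop never ends: the condition `strpos($text,"<a href=\"")` always searches from the start of `$text`, so it keeps returning the same result no matter how far the loop has advanced. Once the inner `strpos()` calls run past the last link they return `false`, which is then used as offset 0, and the scan appears to start over (which matches the repeating output). A minimal sketch of a terminating version follows. It assumes the raw HTML is searched without `htmlspecialchars()`, uses an inline string in place of `welcome.txt`, and `extractHrefs()` is a hypothetical helper name.

```php
<?php
// Sketch only: $html stands in for file_get_contents("welcome.txt").
$html = '<p><a href="hhh">aaa</a></p>' . "\n"
      . '<p><a href="vvv">aaa</a></p>' . "\n"
      . '<p><a href="sss">aaa</a></p>';

// extractHrefs() is a hypothetical helper, not something from the thread.
function extractHrefs($html)
{
    $needle = '<a href="';
    $links  = array();
    $offset = 0;

    // Compare with !== false: strpos() returns 0 for a match at position 0,
    // and a plain truthiness check would treat that 0 as "not found".
    while (($start = strpos($html, $needle, $offset)) !== false) {
        $start += strlen($needle);          // first character of the URL
        $end = strpos($html, '"', $start);  // closing quote of the attribute
        if ($end === false) {
            break;                          // unterminated attribute: stop
        }
        $links[] = substr($html, $start, $end - $start);
        $offset  = $end + 1;                // resume the search after this link
    }
    return $links;
}

$links = extractHrefs($html);
echo implode("\n", $links), "\n";
```

The loop ends because the search offset only ever moves forward, and the `!== false` test stops it as soon as no further `<a href="` is found.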
UK-Networks 08-14-2006, 09:22 AM <?
$text=htmlspecialchars(file_get_contents("welcome.txt"), ENT_NOQUOTES);
$start=0;
$end=0;
$test=0;
while ((strpos($text,"<a href=\"")) && ($start>=$test)) {
    $start = strpos($text,"<a href=\"",$end);
    $end = strpos($text,"\">",$start);
    $test=$end;
    echo substr($text, ($start+12), ($end - $start-12));
}
?>
Give that a shot
tiamak 08-14-2006, 11:00 AM Why play with strpos and substr instead of using the preg_match_all() function?
$html = file_get_contents('welcome.txt');
$urlpattern = '/<a[^>]+href="([^"]+)/i';
preg_match_all($urlpattern, $html, $matches);
foreach ($matches[1] as $u) {
    echo $u."\n";
}
I guess it looks much better :D
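To unpack the pattern a little: `<a[^>]+` matches the tag name plus whatever precedes `href` inside the tag, `href="` anchors on the attribute, and `([^"]+)` captures everything up to the next double quote. `preg_match_all()` puts the full matches in `$matches[0]` and the captured URLs in `$matches[1]`. A small sketch of that, where the inline sample string (with an extra `class` attribute and an uppercase tag) is an assumption for illustration and stands in for `welcome.txt`:

```php
<?php
// Sketch only: sample markup standing in for the contents of welcome.txt.
$html = '<p><a href="hhh">aaa</a></p>' . "\n"
      . '<p><a class="x" href="vvv">aaa</a></p>' . "\n"
      . '<p><A HREF="sss">aaa</A></p>';

$urlpattern = '/<a[^>]+href="([^"]+)/i';

// preg_match_all() returns the number of full matches found.
$count = preg_match_all($urlpattern, $html, $matches);

// $matches[0] holds each full match, e.g. '<a href="hhh'
// $matches[1] holds only the captured group, i.e. the URL itself
foreach ($matches[1] as $u) {
    echo $u, "\n";
}
```

As written, the pattern only picks up double-quoted `href` values; single-quoted or unquoted attributes would need a different pattern.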
wayiran 08-14-2006, 04:10 PM <?
$text=htmlspecialchars(file_get_contents("welcome.txt"), ENT_NOQUOTES);
$start=0;
$end=0;
$test=0;
while ((strpos($text,"<a href=\"")) && ($start>=$test)) {
    $start = strpos($text,"<a href=\"",$end);
    $end = strpos($text,"\">",$start);
    $test=$end;
    echo substr($text, ($start+12), ($end - $start-12));
}
?>
It gives:
hhh
for:
</head>
<body>
<p><a href="hhh">aaa</a></p>
</body>
</html>
<p><a href="vvv">aaa</a></p>
<p><a href="sss">aaa</a></p>
That means it takes just the first URL and leaves the rest!
Any new ideas?
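A likely reason only the first URL comes out: after the first pass, `$test` holds the end of the link just found while `$start` holds its beginning, so the guard `$start>=$test` is already false on the next check. The sketch below replays that guard and counts the passes. The inline sample string and the `$passes` counter are assumptions added for illustration, and the first condition is written with `!== false` so that a match at position 0 would not be mistaken for "not found".

```php
<?php
// Sketch only: a sample string standing in for the contents of welcome.txt.
$text = '<p><a href="hhh">aaa</a></p><p><a href="vvv">aaa</a></p>';

$start  = 0;
$end    = 0;
$test   = 0;
$passes = 0;   // hypothetical counter, not in the original loop

while ((strpos($text, '<a href="') !== false) && ($start >= $test)) {
    $start = strpos($text, '<a href="', $end);  // beginning of the link
    $end   = strpos($text, '">', $start);       // end of the same link
    $test  = $end;
    $passes++;
    // Here $start < $test, so ($start >= $test) fails on the next check
    // and the loop exits even though a second link remains.
    printf("pass %d: start=%d end=%d\n", $passes, $start, $end);
}

echo "passes: ", $passes, "\n";
```

Dropping the `$test` guard and instead stopping when `strpos()` returns `false` for the next `$start` avoids both the early exit and the endless loop.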
wayiran 08-14-2006, 04:29 PM Quote from tiamak: Why play with strpos and substr instead of using the preg_match_all() function?
PHP Code:
$html = file_get_contents('welcome.txt');
$urlpattern = '/<a[^>]+href="([^"]+)/i';
preg_match_all($urlpattern, $html, $matches);
foreach ($matches[1] as $u) {
    echo $u."\n";
}
I guess it looks much better :D
Thanks, it worked.