Web Hosting Talk

Parsing external HTML content to a PHP doc


szarwell
12-27-2004, 08:49 PM
I'm looking to do a bit more than a typical PHP include. I want to grab an external HTML document and parse out certain parts of it to display on my page. Ideally, I'd read the document into a string and then pull out the text that appears between specific tags. For example, I could grab the text between <a href="mailto: and "> to get all the email addresses on that page, and store them in an array so I can print them on my PHP page. I have tried many variations of fread, file_get_contents, and anything else I could think of, with no luck. Please help! TIA

Burhan
12-28-2004, 02:54 AM
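<?php

// Read the whole document into a string. The same call accepts a URL
// if allow_url_fopen is enabled.
$contents = file_get_contents("foo.html");
if ($contents === false) {
    die("Could not read foo.html");
}

// Capture the text between mailto: and the closing "> (non-greedy, case-insensitive).
preg_match_all("|<a href=\"mailto:(.*?)\">|i", $contents, $matches, PREG_PATTERN_ORDER);

// $matches[1] is the array of captured addresses.
echo "Number of matches : " . count($matches[1]) . "<br />";
echo implode("<br />", $matches[1]);
?>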


Try that one.
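
If the regex misses links because the anchor tags carry extra attributes (a class before the href, say), a DOM-based version may hold up better. This is only a rough sketch using PHP's DOM extension, not something posted in the thread: the function name extract_mailto_addresses and the sample markup are made up for illustration, and it assumes the DOM extension is enabled.

```php
<?php
// Hypothetical helper (not from the thread): collect mailto: addresses
// from an HTML string with the DOM extension instead of a regex.
function extract_mailto_addresses(string $html): array
{
    if ($html === '') {
        return [];
    }

    $doc = new DOMDocument();
    // Real-world HTML is often malformed, so keep parser warnings quiet.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $addresses = [];
    foreach ($doc->getElementsByTagName('a') as $link) {
        $href = $link->getAttribute('href');
        // Compare the scheme case-insensitively, like the regex's "i" flag.
        if (stripos($href, 'mailto:') === 0) {
            // Drop the scheme and any "?subject=..." query part.
            $address = substr($href, strlen('mailto:'));
            $addresses[] = explode('?', $address, 2)[0];
        }
    }
    return $addresses;
}

// Made-up sample markup, for illustration only.
$sample = '<p><a class="x" href="mailto:jane@example.com">Jane</a> '
        . '<a href="page.html">Other</a> '
        . '<a href="MAILTO:bob@example.com?subject=Hi">Bob</a></p>';

echo implode("<br />", extract_mailto_addresses($sample));
```

To use it on a real page, pass in the string returned by file_get_contents. The result is a plain array, so it can be looped over or imploded the same way as $matches[1] above.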

GigaByteSolutions
12-28-2004, 07:46 AM
Hope the mail addresses are not for spam! :)

Xenatino
12-28-2004, 11:03 AM
My thoughts exactly

szarwell
12-28-2004, 11:38 AM
Thanks fyrestrtr! And no, Xenatino and GigaByteSolutions, this isn't for spam. The email addresses were just an example I used. If I were spamming, I'd be more clever than to use that as an example. Just to set you at ease, I'm using this at work to grab employee profiles from our PeopleFinder pages and then display them on an org chart for our department. That was just a little more than I thought I needed to say.

Xenatino
12-28-2004, 02:42 PM
Okay szarwell, but we can never be too careful with a one-post user asking how to grab email addresses from remote sites!

I apologise for my mistake.