croakingtoad
11-17-2003, 03:49 AM
I'm trying to hack this script (Mini-Fetch) a bit to get it to do what I want, but I'm not sure how to accomplish it. The script (http://www.mikenew.net/) fetches remote HTML documents and inserts them into your own site, and it also lets you modify the fetched code to some extent.
Below is an example of the script variables used to remove certain code or content from your fetch:
$value = eregi_replace( "<IMG alt=[^>]*>", "", $value ); // Remove all image alt="whatever" tags
$value = eregi_replace( "<class[^>]*>", "", $value ); // Remove all variations of <class> tags.
$value = eregi_replace( "<table[^>]*>", "", $value ); // Remove ALL variations of <table> tags.
$value = eregi_replace( "<tr[^>]*>", "", $value ); // Remove all variations of <tr> tags.
$value = eregi_replace( "<td[^>]*>", "", $value ); // Remove all variations of <td> tags.
$value = eregi_replace( "<p class=\"lead\"[^>]*>", "", $value ); // Remove all variations of <p class="lead"> tags.
Using the above format, what should I add to remove a certain tag and all the content inside it, using a wildcard?
For example, if the tag is:
<p class="headline">blah blah blah 123 blah</p>
How can I get it to remove all instances of <p class="headline"> and any wildcard content until the next </p> is encountered?
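One possible approach, as a minimal sketch and not something taken from Mini-Fetch itself: eregi_replace uses POSIX regular expressions, which have no lazy quantifier, so "match anything up to the next </p>" is easier to express with preg_replace (PCRE) and a lazy .*? pattern. Note also that eregi_replace was deprecated in PHP 5.3 and removed in PHP 7.0. The function name strip_headline and the sample input below are hypothetical, made up for illustration.

```php
<?php
// Hypothetical helper (not part of Mini-Fetch): removes every
// <p class="headline">...</p> block, including the text inside it.
function strip_headline($html)
{
    // [^>]*  allows extra attributes after class="headline"
    // .*?    is lazy, so it stops at the first closing </p>
    // i      makes the match case-insensitive (like eregi_replace)
    // s      lets . match newlines, so multi-line paragraphs are removed too
    return preg_replace('#<p class="headline"[^>]*>.*?</p>#is', '', $html);
}

// Made-up sample input standing in for the fetched HTML in $value.
$value = '<p class="headline">blah blah blah 123 blah</p><p>keep me</p>';
$value = strip_headline($value);

echo $value; // prints: <p>keep me</p>
```

If the script has to stay on eregi_replace, a pattern such as "<p class=\"headline\"[^>]*>[^<]*</p>" should work in the same format as the lines above, but only when the paragraph contains no nested tags, because [^<]* stops at the first < it meets.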
