
View Full Version : PHP : Convert URL to Link ?
SunShellNET 11-17-2009, 01:37 PM Hi
Can anyone help me with this ?
I have a string
Hello welcome to www.google.com , please do not forget to visit http://gmail.com
This is plain text that I assign to a variable $text
Now, I want to echo it like this
Hello welcome to <a href=http://www.google.com>www.google.com</a> , please do not forget to visit <a href=http://gmail.com>http://gmail.com</a>
Any help is highly appreciated
mattle 11-17-2009, 02:31 PM There are three steps to this:
1. Define the rules for matching text to a url (perhaps anything that starts with "www." or "http://" and ends with an alphanumeric character or trailing slash and includes only the following characters (a-z0-9/.-?&=%) -- ie, "www.google.com", without capturing the comma or "http://webmail.sample.com/?inbox")
2. Read this: http://php.net/manual/en/function.preg-replace.php
3. Google this: "regex tutorial"
Energizer Bunny 11-17-2009, 02:34 PM Mattle, I think he meant this:
echo'
Hello welcome to <a href="http://www.google.com">www.google.com</a> , please do not forget to visit <a href="http://gmail.com">http://gmail.com</a>';
Update: I think the user meant something like this
$text = 'Hello welcome to www.google.com , please do not forget to visit http://gmail.com';
$text = str_replace('www.google.com','<a href="http://www.google.com">www.google.com</a>',$text);
$text = str_replace('http://gmail.com','<a href="http://gmail.com">http://gmail.com</a>',$text);
echo $text;
Not sure if a regex will be any faster than the str_replace function; maybe some PHP guru can elaborate.
mattle 11-17-2009, 03:54 PM Actually...I ran some timing tests in another thread...minuscule difference at best. Regexes tend to be more efficient if they match at the beginning of a string; str_replace is pretty consistent regardless of match location....anyhoo...
I did get the distinct impression that the OP was looking to replace all URLs with links. Either way, I still prefer the PCRE functions; to me, this is still cleaner...but I'm an old Perl hack ;)
$text = preg_replace("/(www\.google\.com)/", "<a href=\"http://$1\">$1</a>", $text);
...
Energizer Bunny 11-17-2009, 05:37 PM Hmm.. how do you do the testing? I always wanted to know.
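One common way to run this kind of timing test is to loop each call many times and compare the elapsed wall-clock time from microtime(true). This is only a minimal sketch: the helper name time_it and the iteration count are assumptions for illustration, not what mattle actually ran.

```php
<?php
// Hypothetical helper: runs $fn $iterations times and returns elapsed seconds.
function time_it(callable $fn, int $iterations = 100000): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

$text = 'Hello welcome to www.google.com , please do not forget to visit http://gmail.com';
$link = '<a href="http://www.google.com">www.google.com</a>';

// The same replacement done two ways, so only the function differs.
$viaStr = function () use ($text, $link) {
    return str_replace('www.google.com', $link, $text);
};
$viaPreg = function () use ($text, $link) {
    return preg_replace('/www\.google\.com/', $link, $text);
};

printf("str_replace:  %.4f s\n", time_it($viaStr));
printf("preg_replace: %.4f s\n", time_it($viaPreg));
```

The absolute numbers depend on the machine and PHP version, so it is the ratio between the two lines that is worth looking at.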
SunShellNET 11-17-2009, 05:45 PM I can explain again what I meant.
This is a test string. Every time I execute the script, the string will change, so google.com is just a sample domain
$text = "Hi, Welcome to www.google.com, Please visit http://gmail.com as well. See http://www.yahoo.com " ;
Then do the conversion
Then display the string
echo $text;
And it should display
Hi, Welcome to <a href=http://www.google.com>www.google.com</a> , Please visit <a href=http://gmail.com>http://gmail.com</a> as well. See <a href=http://www.yahoo.com>http://www.yahoo.com</a>
Energizer Bunny 11-17-2009, 06:01 PM Use Mattle's code or mine; both should work.
Unless you just want to show $text as plain text, with all the HTML markup visible? If so, I think you need to use htmlentities or decode so the links are displayed as pure text rather than as clickable links.
mattle 11-17-2009, 06:10 PM Incorrect...our code samples are predicated on knowing what urls $text might contain. htmlentities() and html_entity_decode() (I'm assuming here...there is no decode() function) have nothing to do with this.
Follow the directions in my first reply and post back here with code samples if you run into difficulty.
Energizer Bunny 11-17-2009, 06:23 PM Depends on how he wants to show it: as the full HTML code, or on a web page as rendered links ... it's all up to the user from here on :)
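The two display modes being discussed can be sketched as follows. Echoing the markup as-is lets the browser render a link, while htmlspecialchars() escapes it so the tags show up as literal text. The variable names are illustrative only.

```php
<?php
$html = '<a href="http://gmail.com">http://gmail.com</a>';

// Sent as-is: the browser renders a clickable link.
$rendered = $html;

// Escaped: the browser displays the tags themselves as plain text.
$asSource = htmlspecialchars($html, ENT_QUOTES, 'UTF-8');

echo $rendered, "\n";
echo $asSource, "\n";
```

Neither call finds URLs in plain text, which is why it is a separate question from the conversion itself.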
cselzer 11-19-2009, 11:00 PM $text = ereg_replace("[[:alpha:]]+://[^<>[:space:]]+[[:alnum:]/]","<a href=\"\\0\">\\0</a>", $text);
SunShellNET 11-20-2009, 12:13 AM I already found it, but there is a problem
http://google.com
it will detect and convert to a link
www.google.com
it will not detect.
An alteration would be appreciated; I tried but got an error
mattle 11-20-2009, 09:50 AM First of all...don't use the ereg_* functions. They are deprecated and will not be in PHP6. (http://php.net/manual/en/function.ereg-replace.php)
Secondly, you need to do some research on regular expressions to understand *how* cselzer's example is working. As yet, however, you have not defined specific rules for how you want to match something as a url...that has to be your starting point. Then I will be happy to help you write the regular expressions that will implement those rules.
Matt
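For reference, cselzer's pattern carries over to preg_replace() almost unchanged: PCRE accepts the same POSIX bracket classes, the pattern needs delimiters (here ~, an arbitrary choice), and the whole match is $0 in the replacement. This is a sketch rather than a guaranteed drop-in, and like the original it only matches URLs that include a scheme such as http://.

```php
<?php
$text = 'Please visit http://gmail.com as well. See http://www.yahoo.com';

// Same character classes as the ereg version, wrapped in ~ delimiters.
$text = preg_replace(
    '~[[:alpha:]]+://[^<>[:space:]]+[[:alnum:]/]~',
    '<a href="$0">$0</a>',
    $text
);

echo $text, "\n";
```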
SunShellNET 11-20-2009, 09:57 AM Hi www.google.com
Hi http://google.com
Hi http://www.google.com
All the above lines must be detected and converted from text to links. These are the conditions
If I output these lines using *echo* then I should get a clickable output.
Hope this helps
mattle 11-20-2009, 10:33 AM Okay...so start by breaking those strings down. As an example, I'll do the first one for you.
1. Starts with "www."
2. Contains only valid domain name characters (a-z,0-9,-,.) ("google")
3. The domain portion should end with a "." and then 2-4 a-z characters (".com")
4. Possibly followed by a slash "/"
5. Possibly followed by a URI (probably allow any characters here except spaces)
6. String should be terminated by a "/", "a-z" or "0-9". This will help us to not capture punctuation at the end of a URL. For example:
"www.google.com,"
GOOD: <a href="www.google.com">www.google.com</a>,
BAD: <a href="www.google.com,">www.google.com,</a>
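The last rule can be tried on its own. A greedy run of non-space characters followed by one required letter, digit, or slash forces the engine to back off any trailing punctuation. The pattern below is only that fragment, not a full URL regex, and the sample strings are made up for illustration.

```php
<?php
// \S* grabs as much as it can, then backtracks until the final
// character of the match is a letter, digit, or slash.
$pattern = '/\S*[\/a-z\d]/i';

foreach (['www.google.com,', 'www.google.com/mail/.', 'www.google.com'] as $sample) {
    preg_match($pattern, $sample, $m);
    echo $sample, ' => ', $m[0], "\n";
}
```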
So, by breaking down how to match the string, we can build our regular expression.
/(www\.[a-z\d.-]*\.[a-z]{2,4}\/?(?:[^\s]*[\/a-z\d])?)/i
Now, let's break this down:
/ : start your regex
( : begin capturing (we'll use this later)
www : match the text "www"
\. : match a period *
[ : begin a character set **
a-z : match any letter
\d : match any digit
. : match a period (no escape needed inside a character set)
- : match a hyphen (placed last so it is not read as a range)
] : end the character set **
* : match the previous set zero or more times
\. : match a period *
[ : start another character set **
a-z : match any letter
] : end the character set **
{2,4}: match the previous character set 2-4 times
\/ : match a forward slash *
? : make the previous item (the slash) optional
(?: : begin a non-capturing block ***
[ : begin another character set **
^ : matching any characters that are NOT
\s : whitespace
] : end the character set **
* : match the previous character set as many times as possible
[ : start another character set **
\/ : matching a forward slash *
a-z : matching any letter
\d : matching any digit
] : end the character set **
) : end the non-capturing block ***
? : make the previous block (the URI) optional
) : end our capturing block
/ : end our regex
i : perform a case-insensitive search
(There is no "g" flag in PHP: preg_replace() already replaces every match it finds.)
* Regular expressions have some reserved characters, including ".". They mean other things on their own, so to denote those actual characters, you need to escape them with a backslash. The "/" is escaped for a different reason: it is our delimiter, so an unescaped one would end the regex early.
** A character set matches any one of the characters in brackets ([ ]). It is usually followed by a +, *, or a number range like {1,3} to tell how many characters to match.
*** Parentheses indicate a "block". It's a way of organizing your regular expression into smaller chunks. On their own, they also capture the contents and create what is called a "back reference". This allows you to use the contents of your regex later. They are also useful when you want to apply an operator to a whole portion of a regular expression, like in our example. We have a piece that represents the URI -- [^\s]*[\/a-z\d]
That whole piece may or may not be present, so we want to follow it with a ?
[^\s]*[\/a-z\d]?
Problem is, that will only apply to the last character set, so we need a block:
([^\s]*[\/a-z\d])?
Except now the problem is that we have created another capturing block that we don't need. To tell the regex engine not to create a back reference to that block, we use the ?: notation
(?:[^\s]*[\/a-z\d])?
Our back references are created in the order that they are started (although we only have one...) and are available in the replacement string as $1...$n.
So, to use a back reference:
echo preg_replace("/a(b)c/i", '$1', "abc abc");
yields: "b b"
In this example, we've matched the abc twice, and created a back reference to the "b". We then replace all of our matches "abc" with our back reference "b".
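To see what the engine actually stores, preg_match() can fill an array with its groups. In this sketch (the sample patterns are illustrative) the first pattern creates two back references, while the ?: version creates only one.

```php
<?php
// Two capturing groups: index 1 and index 2 are both filled.
preg_match('/(a)(b)c/', 'abc', $capturing);

// The first group is non-capturing, so "b" moves up to index 1.
preg_match('/(?:a)(b)c/', 'abc', $nonCapturing);

print_r($capturing);     // [0] => abc, [1] => a, [2] => b
print_r($nonCapturing);  // [0] => abc, [1] => b
```

Index 0 always holds the whole match, which is what $0 refers to in a replacement string.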
So, to convert a url to a link, the basic formula is:
$myRegex = '/(www\.[a-z\d.-]*\.[a-z]{2,4}\/?(?:[^\s]*[\/a-z\d])?)/i';
$text = preg_replace($myRegex, "<a href='http://$1'>$1</a>", $text);
I would recommend writing a separate regex for urls beginning with http...I'll let you go ahead and try your hand. Post what you come up with here, and we can go over it if you're having problems.
Also, check out this site for quick and easy testing: http://www.regextester.com/
Matt
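One possible way to cover both the "www." and the "http://" cases in a single pass is a callback that adds the scheme only when it is missing. This is a sketch under assumptions, not the regex worked out above: the function name linkify is hypothetical, and the pattern is deliberately looser (it does not check the domain or the 2-4 letter ending).

```php
<?php
// Hypothetical helper: wraps every "http(s)://..." or "www...." run in a link.
function linkify(string $text): string
{
    // Scheme or "www.", then non-space characters, ending on a letter,
    // digit, or slash so trailing punctuation is left outside the link.
    $pattern = '~\b((?:https?://|www\.)[^\s<>]*[/a-z\d])~i';

    $result = preg_replace_callback($pattern, function (array $m): string {
        $url = $m[1];
        // Bare "www." matches have no scheme, so add one for the href only.
        $href = stripos($url, 'http') === 0 ? $url : 'http://' . $url;

        return '<a href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '">'
            . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '</a>';
    }, $text);

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $text;
}

echo linkify('Hi, Welcome to www.google.com, Please visit http://gmail.com as well. See http://www.yahoo.com'), "\n";
```

Only the matched URL is escaped here; the surrounding text is assumed to be plain text that is safe to output.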
Hello welcome to www.google.com, please do not forget to visit http://gmail.com
Why not do it BEFORE the string gets created?
The OP said "www.google.com" and "http://gmail.com" are dynamic.
$display_string = 'Hello welcome to <a href="';
$display_string .= $firsturl;
$display_string .= '">';
$display_string .= $firsturl;
$display_string .= '</a>, please do not forget to visit <a href="';
$display_string .= $secondurl;
$display_string .= '">';
$display_string .= $secondurl;
$display_string .= '</a>';
echo $display_string;
SunShellNET 11-20-2009, 10:54 AM Because I receive the string already created.
Because I get the string created.
Sorry, that's not what I inferred from:
Every time I execute the script, the string will change.