Web Hosting Talk

View Full Version : Get internet page from LINUX command line?


owocki
10-19-2005, 01:17 PM
Does anyone have any idea how to download an HTML page from the command line of a (Fedora) Linux box?

Thanks.

tamasrepus
10-19-2005, 01:32 PM
See wget or curl.
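
For reference, a minimal sketch of what that looks like in practice. It only defines a helper that uses curl if present and falls back to wget; the name fetch_page and the output filename in the example are made up for illustration, and it assumes at least one of the two tools is on your PATH.

```shell
#!/bin/sh
# fetch_page URL OUTFILE
# Save the page at URL to OUTFILE using whichever downloader is installed.
# fetch_page is a hypothetical helper name chosen for this sketch.
fetch_page() {
    url=$1
    out=$2
    if command -v curl >/dev/null 2>&1; then
        # -f: fail on HTTP errors, -s: quiet, -S: still print errors,
        # -L: follow redirects, -o: write to the named file
        curl -fsSL -o "$out" "$url"
    elif command -v wget >/dev/null 2>&1; then
        # -q: quiet, -O: write to the named file
        wget -q -O "$out" "$url"
    else
        echo "neither curl nor wget found on PATH" >&2
        return 127
    fi
}

# Example:
#   fetch_page http://www.google.com/ google.html
```

The one-liners it wraps are simply curl -o google.html URL and wget -O google.html URL.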

owocki
10-19-2005, 01:41 PM
I do not have root access to this box and do not wish to install any other programs. I would simply like a command-line solution.

Thanks :)

LimpBagel
10-19-2005, 02:47 PM
They will both be installed on most servers. You run them from the command line.

owocki
10-19-2005, 03:21 PM
Unfortunately, the server I am on does not have these :(.

I have seen this implemented in a loop. I was hoping that's what I would find.

Oshaka
10-19-2005, 03:27 PM
You can save the html file via links/elinks/lynx by hitting ESC for the menu, going to save as.

tree-host
10-19-2005, 03:30 PM
wget is normally installed... have you tried wget http://www.google.com ?

wget is often used to install programs, so I would expect it's there somewhere.

You could also try looking for it (locate wget). I think it's normally in /usr/bin, in which case you can run

/usr/bin/wget http://www.google.com

cerebis
10-19-2005, 03:35 PM
They might be on the server and just not on your path. Depending on how well configured your server is, try typing "man -k wget" or "man -k curl" and see if you get any hits. If you do, it's very likely the program is somewhere on the system.
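
If the man pages are not installed either, a direct search is another option. This is only a sketch: find_tool is a made-up helper name, and the default directory list is my guess at common install locations, not something specific to your box.

```shell
#!/bin/sh
# find_tool NAME [DIR...]
# Print the first executable called NAME found on PATH or, failing that,
# in the listed directories. find_tool is a hypothetical helper; the
# default directory list is an assumption about common install locations.
find_tool() {
    name=$1
    shift
    if command -v "$name" >/dev/null 2>&1; then
        command -v "$name"
        return 0
    fi
    # No directories given: fall back to the assumed defaults.
    [ $# -gt 0 ] || set -- /usr/bin /usr/local/bin /usr/sbin /usr/local/sbin /opt/bin
    for dir in "$@"; do
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Example:
#   find_tool wget || find_tool curl || echo "no downloader found"
```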

Seriously though, a script to download webpages will require the use of some shell program (such as wget) or be done in a language which possesses the ability to do http via a package (such as perl or python).
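
One possible exception, if your shell is bash: bash can open TCP connections itself through the special /dev/tcp path, so a plain HTTP fetch can be sketched without any external downloader. This assumes your bash was built with /dev/tcp support (some distributions disable it), and it handles plain HTTP only: no HTTPS, no redirects, and the output includes the response headers. The function names are made up for this example.

```shell
#!/bin/bash
# Assumes bash with /dev/tcp support; plain HTTP only.
# http_get_request and fetch_raw are hypothetical helper names.

# http_get_request HOST PATH
# Print a minimal HTTP/1.0 GET request.
http_get_request() {
    printf 'GET %s HTTP/1.0\r\nHost: %s\r\n\r\n' "$2" "$1"
}

# fetch_raw HOST [PATH] [PORT]
# Send the request and print the raw response (status line and
# headers included) to stdout.
fetch_raw() {
    host=$1
    path=${2:-/}
    port=${3:-80}
    # File descriptor 3 is a read/write TCP connection for the duration
    # of the braces and is closed automatically afterwards.
    {
        http_get_request "$host" "$path" >&3
        cat <&3
    } 3<>"/dev/tcp/$host/$port"
}

# Example:
#   fetch_raw www.google.com / > response.txt
```

The body starts after the first blank line in response.txt.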

If you have access to gcc, then just download the source to wget

http://ftp.gnu.org/pub/gnu/wget/wget-1.10.2.tar.gz

Untar it, go into the directory and type

./configure --prefix=$HOME/wget && make install

when it's done, you'll have a working wget at

$HOME/wget/bin/wget

If you don't want to type that in every time, then either add the location to your path or create a symlink in your local bin directory.
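
Roughly, either option looks like the sketch below. The add_to_path helper is a made-up name, and $HOME/bin is only an assumption about where your personal bin directory lives.

```shell
#!/bin/sh
# add_to_path DIR
# Prepend DIR to PATH for the current shell unless it is already listed.
# add_to_path is a hypothetical helper name for this sketch.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                          # already present, nothing to do
        *) PATH="$1:$PATH"; export PATH ;;
    esac
}

# Option 1: put the private install directory on PATH. Adding the same
# call to your shell startup file makes it permanent.
#   add_to_path "$HOME/wget/bin"
#
# Option 2: symlink the binary into a personal bin directory that is
# already on PATH ($HOME/bin is an assumption).
#   mkdir -p "$HOME/bin" && ln -s "$HOME/wget/bin/wget" "$HOME/bin/wget"
```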

Burhan
10-20-2005, 02:51 AM
Quoting Oshaka: "You can save the html file via links/elinks/lynx by hitting ESC for the menu, going to save as."

Actually, if you want to download a webpage:

lynx -source http://www.google.com/ > google.html

Or, if you want to spider a website:

lynx -crawl -traversal http://www.google.com/

SouthiRobert
10-20-2005, 07:34 PM
Perhaps 'fetch' if it is on *BSD. (Yeah, I know you said Linux, but these days I am not sure what people call a BSD box :)

almahdi
10-21-2005, 04:06 AM
If you've got PHP use this code:


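#!/usr/bin/php
<?php
// Fetch a URL and print its contents to stdout.
// Requires allow_url_fopen to be enabled in php.ini.
error_reporting(0);
if ($argc <= 1) {
    echo "Usage: ./fetch.php http://www.testurl.com/\n";
    exit(1);
}
$handle = fopen($argv[1], "r");
if (!$handle) {
    die("Error Opening URL\n");
}
$contents = '';
// Read in 8 KB chunks until the end of the stream.
while (!feof($handle)) {
    $contents .= fread($handle, 8192);
}
fclose($handle);
echo $contents;
?>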