View Full Version : meta tags
Can someone tell me what meta name tags are important for promoting a site? I assume the following is a good idea:
<META NAME="ROBOTS" CONTENT="INDEX,FOLLOW">
Also, should I have a robots.txt file on my server? Any suggestions would be appreciated.
Chicken 09-29-2001, 02:08 AM Different SEs use different methods to index your site, so what may be crucial to one SE isn't a factor with another. I don't know the specifics, and of course I can't think of, nor find, the URL that always gets posted about SE placement.
JustLurkin77 09-29-2001, 02:21 AM More than you would ever want to know about search engine placement...
www.searchengineforums.com
Aloha
well here are a few links
http://spider-food.net/
http://www.searchenginewatch.com/webmasters
more important than meta is your title and the words that are first on your site
then meta tags.
slade 09-29-2001, 05:54 PM Originally posted by Ron
Also, should I have a robots.txt file on my server? Any suggestions would be appreciated.
I'm gonna say probably yes. If there is anything on the site you don't want the engines to spider, like a /cgi-bin for a shopping cart or other scripts, a robots.txt is a good idea.
It won't keep every spider out, but most of them will obey it.
Also, I'm gonna add WebMasterWorld (http://webmasterworld.com)
Thanks slade. I do not have a cgi-bin although I do have a shopping cart. My entire site runs in ASP including the top page. Is there anything I should take into consideration with an ASP site?
Can you tell me where I can get a template for robots.txt?
Cyberpunk 10-01-2001, 08:09 AM <META NAME="Robots" Content="ALL">
<META NAME="Robots" Content="INDEX,FOLLOW"> (same as all)
<META NAME="Robots" Content="NONE">
<META NAME="Robots" Content="NOINDEX,NOFOLLOW"> (same as none)
<META NAME="Robots" Content="NOINDEX,FOLLOW">
<META NAME="Robots" Content="INDEX,NOFOLLOW">
this is supposed to be a good one too
<META NAME="Revisit" Content="7"> (number is supposed to signify days).
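The robots values listed above boil down to two flags, index and follow. Here is a minimal Python sketch of that reading. The function name `parse_robots_meta` is made up for illustration, and treating a missing value as "index and follow" is an assumption, not something every engine promises.

```python
def parse_robots_meta(content):
    """Return (index, follow) for a robots meta CONTENT value."""
    # Assumed default when a flag is not mentioned: index and follow.
    index, follow = True, True
    for token in content.split(","):
        token = token.strip().upper()
        if token == "ALL":
            index, follow = True, True
        elif token == "NONE":
            index, follow = False, False
        elif token == "INDEX":
            index = True
        elif token == "NOINDEX":
            index = False
        elif token == "FOLLOW":
            follow = True
        elif token == "NOFOLLOW":
            follow = False
    return index, follow


# The six values from the list above.
for value in ["ALL", "INDEX,FOLLOW", "NONE", "NOINDEX,NOFOLLOW",
              "NOINDEX,FOLLOW", "INDEX,NOFOLLOW"]:
    print(value, parse_robots_meta(value))
```

Under this reading ALL and INDEX,FOLLOW come out the same, and so do NONE and NOINDEX,NOFOLLOW.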
Validator and links for robots.txt info I found useful:
http://www.searchengineworld.com/cgi-bin/robotcheck.cgi
I went to the URL you suggested and am still a little confused. Which of the following is correct:
User-agent: *
Disallow: /convertproducts.asp
Disallow:
User-agent: *
Disallow: /convertproducts.asp/
Disallow:
User-agent: *
Disallow: /convertproducts.asp
User-agent: *
Disallow: /convertproducts.asp/
What I am trying to tell the robots is:
All robots welcome, disallow convertproducts.asp, and spider everything else.
Is the forward slash supposed to go after the file name as well? By the way, I only want to disallow certain files, not whole directories.
Aloha
ron
http://www.robotstxt.org/wc/exclusion-admin.html
since it is a file it is better to leave it off
if you had the / on the end it would try to look deeper inside that url
so use:
User-agent: *
Disallow: /convertproducts.asp
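If you want to check this yourself, here is a rough sketch using Python's standard `urllib.robotparser`. It only shows how that one parser reads the rules (real spiders may behave differently), the helper `allowed` is made up for illustration, and `/default.asp` is a hypothetical page name.

```python
from urllib.robotparser import RobotFileParser


def allowed(rules, path, agent="*"):
    """Parse robots.txt lines and report whether `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(rules)
    return parser.can_fetch(agent, path)


no_slash = ["User-agent: *", "Disallow: /convertproducts.asp"]
with_slash = ["User-agent: *", "Disallow: /convertproducts.asp/"]

# Rules are matched as path prefixes, so the trailing slash matters.
print(allowed(no_slash, "/convertproducts.asp"))    # the file is blocked
print(allowed(no_slash, "/default.asp"))            # other pages stay open
print(allowed(with_slash, "/convertproducts.asp"))  # slash version misses the file
```

With the trailing slash the rule only matches URLs that continue past `/convertproducts.asp/`, so the file itself is not covered.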
I kind of thought so. But should there be an additional line that says just "Disallow:"? My point is, don't allow so and so file but allow everything else. See example:
User-agent: *
Disallow: /convertproducts.asp
Disallow:
Also, I wanted to ask if it is ok to create this file in a text editor, saved as a text file, and uploaded in ASCII.
Lastly, I have about 50 - 100 ASP files that are related to the shopping cart, should I Disallow them all? Also the database?
One thing that concerns me about a robots.txt file is that anyone who goes to your site and opens the file in their browser can see all your files listed that you don't want spidered. Is this really such a good idea from the security standpoint?
Thanks and Aloha!
Aloha
no you do not need the extra line in there ;)
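a quick way to see that the extra line changes nothing, at least as Python's standard `urllib.robotparser` reads it (other parsers may differ). The helper `build` and every path except convertproducts.asp are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def build(lines):
    """Return a parser loaded with the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


short = build(["User-agent: *", "Disallow: /convertproducts.asp"])
extra = build(["User-agent: *", "Disallow: /convertproducts.asp", "Disallow:"])

# Hypothetical paths, apart from convertproducts.asp from this thread.
paths = ["/", "/default.asp", "/convertproducts.asp", "/cart/view.asp"]
for path in paths:
    # Both columns should agree for every path.
    print(path, short.can_fetch("*", path), extra.can_fetch("*", path))
```

anything not matched by a Disallow line is allowed by default, so the empty Disallow adds nothing here.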
yes I just do mine in notepad and upload
well I too have had that thought about spidered pages that I do not want others to know about if I can help it
so I just use the meta tags in those pages
if you go to www.searchenginewatch.com/webmasters
they have some great info on meta tags etc...
are all your asp pages in the root dir ???
or are they in a dir ??
if they are in a dir you would just use
disallow: /members/
(assuming your dir is called members)
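a small sketch of what that one directory line covers, again using Python's standard `urllib.robotparser` as a stand-in for a well-behaved spider. The file names are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse(["User-agent: *", "Disallow: /members/"])

# Hypothetical file names; only the /members/ directory comes from the post.
for path in ["/members/cart.asp", "/members/admin/orders.asp", "/default.asp"]:
    print(path, parser.can_fetch("*", path))
```

everything under /members/ is blocked by the single line, while pages outside it stay open, so 50 - 100 cart files in one dir would not each need their own entry.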
if you have ? etc... in pages engines tend to ignore these (I am pretty sure but engines change constantly)
(covering myself here;)
hehehehehe
hope this helps
If you have pages you want to keep hidden as much as possible I would not put them in your robots text ???
and just use meta tags to tell engines to goaway ;)
I worked on a large site and we used a method with CF called fusebox
http://www.fusebox.org
pretty cool as you can not tell where your pages are in dir etc..
a good thing to look at for grins ;)
Cyberpunk 10-03-2001, 01:12 AM Bit late back to this one....
You have allow ok.
You have to have a disallow for every directory and filename you want blocked to the robots that obey the exclusion standard.
User-agent: *
Disallow: /convertproducts.asp
is the correct one for what you're asking.
Robots.txt should be in public root of your webspace.
While we're on the subject, I read a while back that if you have subdomains etc. you still only need the one file in the root, and somewhere else I read you need one in the root of every subdomain. Which is correct, folks?
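For what it's worth, robots that follow the exclusion standard request /robots.txt from each host they visit, so a subdomain is treated as its own site and would need its own file. A rough Python sketch of where a spider would look for a given page; the function name `robots_url` and the example.com hosts are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the robots.txt URL a crawler would check for `page_url`."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://www.example.com/shop/cart.asp"))
print(robots_url("http://shop.example.com/cart.asp"))
```

The two pages map to two different robots.txt locations because the host names differ.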