Using Meta tags
Meta Tag Overview
Meta tags are bits of information inserted into the <head> area of your web pages. Apart from the fairly important <title> tag, which provides the headline for the browser, the info in this header is mostly for search engines and browsers. The more obvious examples are setting the PICS rating, the language set used and the description line.
I’m actually using this to revise my own metafiles this month! I know a number of sites say to include the </meta> to close the tag but, to be honest, I didn’t think this was actually needed; certainly validators aren’t concerned about it. That is, until you validate as XHTML… Better to put it in, eh. And use lower case. Caps are history!
(Hopefully I’ll add a page on XHTML and Style Sheets next month (July 2003))
Update. Here’s a thing. I was fixing some meta files and cross checked them.
XHTML was happy with </meta> but HTML 4.01 seems to see this as a new meta tag and goes all wobbly. The solution is to use /> instead. This way HTML and XHTML are both happy.
I went through all this from HTML 3.2 to HTML 4.01. With the move to XHTML I now have several hundred pages to revise…
You’d be amazed how rewriting and editing over 100,000 lines of code focuses your mind on future proofing!
I’ve heard experts proclaim the death of metas, especially as Google and others largely ignore them. To put that myth to bed, here’s an extract following a recent site submission (Sept 2003):
Important: ExactSeek is a META Tag search engine. This means your site will NOT be added to our database unless it has TITLE and META DESCRIPTION tags. To automatically generate these tags for your site (at no cost) and learn more about search engine optimization, visit SubmitExpress
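If you want to check a page against that sort of requirement before submitting, here is a minimal sketch using Python’s standard html.parser. The class name HeadCheck and the function missing_tags are made up for illustration, and the check (a non-empty title plus a non-empty meta description) is only my reading of the extract above, not ExactSeek’s actual test.

```python
from html.parser import HTMLParser


class HeadCheck(HTMLParser):
    """Collect the <title> text and the meta description of a page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.description = ""

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def missing_tags(html):
    """Return a list naming whichever of title/description is absent or empty."""
    parser = HeadCheck()
    parser.feed(html)
    parser.close()
    missing = []
    if not parser.title.strip():
        missing.append("title")
    if not parser.description.strip():
        missing.append("description")
    return missing
```

Run it over each page before a submission; an empty list means both tags were found.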
The <title> tag has a number of uses, including identifying the page – especially when minimised – plus search engines generally use it in the page description. It never ceases to amaze – and irritate – me, the number of “webmasters” that forget the page title. Blah!
A lot of folk say make the title “this many words” or “that many letters”, but the rules change all the time. Myself, I reckon just put in what feels right!
For the Title, don’t forget to close the tag.
<meta name="keywords" content="fish, chips, mushy, peas" />
Strictly for search engines, and then only a few of them.
In the example above I’ve used food for a chippy; it could just as easily be widgets for an engineering company. For a custom intranet search engine, this can be a godsend, but really only a few search engines use them anymore. Can’t hurt to add them mind, as long as you use them sensibly.
<meta name="description" content="For proper chips like yer Ma’ made! Sometown’s premier greasy spoon. Second left after the gasworks!" />
Our intrepid frier extols the company’s virtues. This tells search engines the description of your page. This can be as long or as short as you wish, but a vague rule of thumb is around 200 characters, ideally with some keywords in there. The figure is a guide only as the rule changes depending on which search engine you have in mind, and no, Google won’t give you a clue, not even if you beg. Needless to say, only the first sentence or so is actually used.
Naturally bear in mind some engines just take the first sentence they find, period, others the first sentence in the <body>. At present, according to
Search Engine Watch:
“this tag enjoys much support, and it is well worth using.”
Me? I just put in what I feel is appropriate (when I remember). If it gets used, all to the good; if not, it’s as much a guide for me as them, hmmm.
Meta Robots Tag
<meta name="robots" content="index,follow" /> ( content="all" ) OR
<meta name="robots" content="noindex,follow" /> OR
<meta name="robots" content="index,nofollow" /> OR
<meta name="robots" content="noindex,nofollow" /> ( content="none" )
This lets you specify that a particular page should NOT be indexed by a search engine and is widely supported, at least by the respectable engines (mutters angrily about spambots). To keep spiders out, simply add the noindex form between your <head> tags on each page you don’t want indexed.
The INDEX directive specifies if an indexing robot should index the page – or not.
The FOLLOW directive specifies if a robot is to follow links on the page – or not.
Given the defaults are INDEX and FOLLOW (ALL) you don’t really need to add this to your code.
I reckon it’s less hassle to write a robots.txt file and have done.
See Robots.Txt for more info if you don’t know how to do this.
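To make the INDEX and FOLLOW logic concrete, here is a rough sketch of how a well-behaved robot might read the content value. The function name parse_robots is my own invention, and real spiders may well handle odd or conflicting values differently; this simply applies the defaults and the "all" / "none" shorthands described above.

```python
def parse_robots(content):
    """Return (index, follow) booleans for a robots meta content string."""
    index, follow = True, True  # the defaults: INDEX and FOLLOW
    for token in content.lower().split(","):
        token = token.strip()
        if token == "all":
            index, follow = True, True
        elif token == "none":
            index, follow = False, False
        elif token == "index":
            index = True
        elif token == "noindex":
            index = False
        elif token == "follow":
            follow = True
        elif token == "nofollow":
            follow = False
        # anything else is ignored in this sketch
    return index, follow
```

So parse_robots("noindex,follow") says "don’t index this page, but do follow its links", and an empty string falls back to the defaults.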
<meta http-equiv="pics-label" content="(pics-1.1 ……" />
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
To quote Vancouver Webpages:
“META tags with an HTTP-EQUIV attribute are equivalent to HTTP headers. Typically, they control the action of browsers, and may be used to refine the information provided by the actual headers. Tags using this form should have an equivalent effect when specified as an HTTP header, and in some servers may be translated to actual HTTP headers automatically or by a pre-processing tool.”
Basically they tell the browser how to deal with you. I use the above two to essentially say:
“Hello, this is a decent site, for all, and it’s in English.” The latter one can be considered essential.
There are several more but largely, I believe, they are outdated. The above link details most of them.
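Since an http-equiv tag stands in for a real HTTP header, its content can be read the same way a header would be. A small sketch, assuming you already have the content string to hand: the helper name charset_of is hypothetical, but Message and get_content_charset are from Python’s standard email package.

```python
from email.message import Message


def charset_of(content_type):
    """Return the charset named in a Content-Type value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() lower-cases the value it finds
    return msg.get_content_charset()
```

Fed the Content-Type value from the tag above, this gives back the character set in lower case, and None when no charset is declared.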
<link rel="stylesheet" type="text/css" href="ackadia.css" />
I’ve seen this used with the “http-equiv” but I believe using the above form is the correct way to go.
The href is the path to the Cascading Style Sheet settings. These can be set locally, but it’s far easier to handle it on a global basis. If you aren’t familiar with CSS then now would be a good time to learn, eh!
When I get round to it I’ll write a tutorial on it, though there are several good examples and a few good books on it already.
<meta name="option" content="attributes" />
The name attribute is used for other types which do not correspond to HTTP headers. Apparently, though, some agents may interpret certain tags whether declared as “name” or as “http-equiv”. That ambiguous enough for you? Really the only ones you need to think about are “description”, “keywords” and, to a lesser extent, “robots”, which I covered further up the page.
Most of the “options” are completely ignored by all but the odd custom or specialised search engine.
For example, a book company might be interested in “author”, Microsoft or Macromedia might be interested in “generator”, which some web editors spit out with the code, and pretty much no-one gives a monkey’s about “copyright”.
You might want to note, though, that when I was checking a page for validation under the Web Accessibility Initiative it flagged a warning message because I’d taken out the “author” entry.
As I understand it, if your company makes custom cars, for example, you can make up something like:
<meta name="Landspeed3" content="Engine-Chassis">
And search your site accordingly. Specialised stuff, hmmm.
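As a rough idea of how a custom site search might pick these up, the sketch below gathers every name/content pair on a page into a dictionary. MetaCollector and collect_meta are names I’ve made up for the example, and lower-casing the names is my own assumption to make lookups forgiving, not something any standard search tool is known to do.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Gather <meta name="..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = attrs.get("name")
        if name:  # skips http-equiv tags, which have no name attribute
            self.meta[name.lower()] = attrs.get("content") or ""


def collect_meta(html):
    """Return a dict of lower-cased meta names to their content values."""
    parser = MetaCollector()
    parser.feed(html)
    parser.close()
    return parser.meta
```

A site search could then filter pages on something like collect_meta(page).get("landspeed3").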
The other “names” of interest are these:
<meta name="expires" content="31 December 1999">
I’m not sure if this is used anymore, but it tells search engines when your page should be deleted from their directory and should be presented in the form above (day month year). I think the search engines have their own way of dealing with out of date pages…
<meta name="revisit-after" content="30 days">
This is supposed to tell the search engine to visit your site again in nn days.
Meanwhile, back in spider HQ, the bots are laughing their binary socks off in the bar whilst lampooning you for your audacity.
For me it summons up the image of the Smash family robots* rolling on the floor laughing, waving a potato peeler and chuckling
… and they mash them all to bits
(* Older Brits will remember this. I’ll try and find a picture.)
There again, it can’t hurt to put it in, I guess.
Surprisingly basic for W3 Schools, but as HTML sites go, this is in the top echelons. They have quizzes too now, which is useful.
Vancouver Webpages – A Dictionary of HTML META Tags
Whilst I won’t go as far as to say this is the last word, it is a very detailed look at meta tag usage.
Aimed at directing (responsible) search engine spiders and bots
Bit basic, but the site’s OK
Search Engine Watch
Takes it from the search engine’s point of view. Basically, as I’ve mentioned elsewhere, metas are largely yesterday’s news, but they have their place yet.
It may seem like a daft question, but I have to ask…
… can you track the spiders’ home?
Even if we can’t tie a packet of string to its thieving legs, it has to come to our websites from ‘somewhere’.
For instance, from my logs…
Date Wed Jan 28 03:14:44 2004
IP Address: 184.108.40.206
Range is: Videotron.com, Montreal
Logistical nightmare, perhaps, but if enough folk see it coming from the same host…
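For what it’s worth, tallying visitors from a log is the easy bit. Here is a small sketch that counts hits per address, assuming a log laid out like the snippet above with an "IP Address:" line per visit. The function name count_visitors is made up, and the layout is only a guess at how such a log would look in bulk.

```python
from collections import Counter


def count_visitors(log_text):
    """Count how many times each address appears on an 'IP Address:' line."""
    hits = Counter()
    for line in log_text.splitlines():
        label, sep, value = line.partition(":")
        if sep and label.strip().lower() == "ip address":
            hits[value.strip()] += 1
    return hits
```

Calling count_visitors(log).most_common(10) would then list the ten busiest addresses, which is a start if several of us wanted to compare notes.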
*Offers image of Garfield with a book… THWAP*