Using Meta tags
Meta Tag Overview
Meta tags are bits of information inserted into the <head> area of your web pages. Apart from the fairly important <title> tag, which provides the headline for the browser, the info in this header is mostly for search engines and browsers. The more obvious examples are setting the PICS rating, the language or character set used and the description line.
I’m actually using this to revise my own metafiles this month! I know a number of sites say to include the </meta> to close the tag but, to be honest, I didn’t think this was needed; certainly validators aren’t concerned about it. That is, until you validate as XHTML… Better to put it in, eh? And use lowercase. Caps are history!
(Hopefully, I’ll add a page on XHTML and Style Sheets next month (July 2003))
Update. Here’s a thing. I was fixing some meta files and cross-checked them.
XHTML was happy with </meta>, but HTML 4.01 seems to see this as a new metafile and goes all wobbly. The solution is to use /> instead. This way, HTML and XHTML are both happy.
I went through all this from HTML 3.2 to HTML 4.01. With the move to XHTML, I now have several hundred pages to revise…
You’d be amazed how rewriting and editing over 100,000 lines of code focuses your mind on future-proofing!
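With several hundred pages to revise, the </meta> to /> change is the sort of thing worth scripting. What follows is only a rough sketch, not the method actually used on this site: the function name self_close_meta is made up for illustration, and it assumes no attribute value contains a ">" character.

```python
import re

# Matches <meta ...></meta> in any letter case and captures the attributes.
# Assumption: attribute values never contain a ">" character.
_META_PAIR = re.compile(r"<meta\b([^>]*?)\s*/?>\s*</meta\s*>", re.IGNORECASE)


def self_close_meta(html: str) -> str:
    """Rewrite every <meta ...></meta> pair as <meta ... />.

    Tags that are already written as <meta ... /> are left untouched.
    The tag name is written back in lowercase; attributes are kept as found.
    """
    return _META_PAIR.sub(
        lambda match: "<meta" + match.group(1).rstrip() + " />", html
    )
```

Run it over a copy of a page first and re-validate before trusting it with the real thing.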
I’ve heard experts proclaim the death of meta tags, especially as Google and others largely ignore them. To put that myth to bed, here’s an extract from the response to a recent site submission (Sept 2003):
Important: ExactSeek is a META Tag search engine. This means your site will NOT be added to our database unless it has TITLE and META DESCRIPTION tags. To automatically generate these tags for your site (at no cost) and learn more about search engine optimization, visit SubmitExpress
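Since a missing title or description can get a submission turned away, a quick pre-submission check can save some grief. The sketch below uses Python's standard html.parser module; the class name HeadCheck and the function missing_tags are my own inventions for illustration and have nothing to do with how ExactSeek itself checks pages.

```python
from html.parser import HTMLParser


class HeadCheck(HTMLParser):
    """Collects the <title> text and the meta description of a page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.description = ""

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = (attrs.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def missing_tags(html):
    """Return a list naming whichever of 'title' / 'description' is absent or empty."""
    checker = HeadCheck()
    checker.feed(html)
    checker.close()
    missing = []
    if not checker.title.strip():
        missing.append("title")
    if not checker.description:
        missing.append("description")
    return missing
```

An empty list means both tags are present and non-empty; it says nothing about whether they are any good.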
Title
<title>Sid’s Chippy</title>
This has a number of uses, including identifying the page (especially when minimised), plus search engines generally use it in the page description. It never ceases to amaze (and irritate) me how many “webmasters” forget the page title. Blah!
A lot of folks say to make the title “this many words” or “that many letters”, but the rules change all the time. Myself, I reckon just put in what feels right!
For the Title, don’t forget to close the tag.
Keywords
<meta name="keywords" content="fish, chips, mushy, peas" />
Strictly for search engines, and then only a few of them.
In the example above, I’ve used food for a chippy; it could just as easily be widgets for an engineering company. For a custom intranet search engine, this can be a godsend, but only a few search engines use them anymore. Can’t hurt to add them, mind, as long as you use them sensibly.
Page Description
<meta name="description" content="For proper chips like yer Ma’ made! Sometown’s premier greasy spoon. Second left after the gasworks!" />
Our intrepid fryer extols the company’s virtues. This tells search engines the description of your page. This can be as long or as short as you wish, but a vague rule of thumb is around 200 characters and, ideally, use some keywords in there. The figure is a guide only, as the rule changes depending on which search engine you have in mind, and no, Google won’t give you a clue, not even if you beg. Needless to say, only the first sentence or so is used.
Naturally, bear in mind some engines take the first sentence they find, period, others the first sentence in the <body>. At present, according to
Search Engine Watch:
“this tag enjoys much support, and it is well worth using.”
Me? I just put in what I feel is appropriate (when I remember). If it gets used, all to the good; if not, it’s as much a guide for me as them, hmmm.
Meta Robots Tag
<meta name="robots" content="index,follow" /> ( content="all" ) OR
<meta name="robots" content="noindex,follow" /> OR
<meta name="robots" content="index,nofollow" /> OR
<meta name="robots" content="noindex,nofollow" /> ( content="none" )
This lets you specify that a particular page should NOT be indexed by a search engine and is widely supported, at least by respectable engines (mutters angrily about spambots). To keep spiders out, simply add the relevant line between the head tags of each page you don’t want indexed.
The INDEX directive specifies if an indexing robot should index the page – or not.
The FOLLOW directive specifies if a robot is to follow links on the page – or not.
Given the defaults are INDEX and FOLLOW (ALL), you don’t need to add this to your code.
I reckon it’s less hassle to write a robots.txt file and have done.
See Robots.Txt for more info if you don’t know how to do this.
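To make the directives concrete, here is a small sketch of how a well-behaved robot might read the content value, with INDEX and FOLLOW as the defaults and "all" and "none" as the shorthands listed above. The function name parse_robots is my own, and real spiders may well handle odd or unknown values differently; this one simply ignores them.

```python
def parse_robots(content=""):
    """Interpret a robots meta 'content' value.

    Returns a pair (index, follow) of booleans. Both default to True,
    which matches a page that has no robots meta tag at all.
    Unknown tokens are ignored (an assumption, not a documented rule).
    """
    index, follow = True, True
    for token in content.lower().split(","):
        token = token.strip()
        if token == "all":
            index, follow = True, True
        elif token == "none":
            index, follow = False, False
        elif token == "index":
            index = True
        elif token == "noindex":
            index = False
        elif token == "follow":
            follow = True
        elif token == "nofollow":
            follow = False
    return index, follow
```

So parse_robots("noindex,follow") says "don't index this page, but do follow its links", and an empty value falls back to the defaults.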
HTTP-EQUIV
Examples:
<meta http-equiv="pics-label" content="(pics-1.1 ……" />
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
To quote Vancouver Webpages:
“META tags with an HTTP-EQUIV attribute are equivalent to HTTP headers. Typically, they control the action of browsers and may be used to refine the information provided by the actual headers. Tags using this form should have an equivalent effect when specified as an HTTP header, and in some servers may be translated to actual HTTP headers automatically or by a pre-processing tool.”
They tell the browser how to deal with you. I use the above two to essentially say: hello, this is a decent site, suitable for all, and it’s in English.
The latter one can be considered essential.
There are several more, but largely, I believe they are outdated. The above link details most of them.
LINK REL
<link rel="stylesheet" type="text/css" href="ackadia.css" />
I’ve seen this used with the “http-equiv” but I believe using the above form is the correct way to go.
The href is the path to the Cascading Style Sheet settings. These can be set locally, but it’s far easier to handle them on a global basis. If you aren’t familiar with CSS, then now would be a good time to learn, eh!
When I get around to it, I’ll write a tutorial on it, though there are several good examples and a few good books on it already.
A similar approach is used for JavaScript, where the external file is pulled in with a <script> tag and its src attribute.
NAME attributes
<meta name="option" content="attributes" />
The name attributes are used for other types which do not correspond to HTTP headers. Some agents may interpret certain tags whether they are declared as “name” or as “http-equiv”. Is that ambiguous enough for you? Really the only ones you need to think about are the “description”, “keywords” and, to a lesser extent, “robots”, which I covered further up the page.
Most of the “options” are completely ignored by all but the odd custom or specialised search engine.
For example, a book company might be interested in “author”, and Microsoft or Macromedia might be interested in “generator”, which some web editors spit out with the code. Pretty much no one gives a monkey’s about “copyright”.
You might want to note, though, that when I was checking a page for validation under the Web Accessibility Initiative, it flagged a warning message because I’d taken out the “author” entry.
As I understand it, if your company makes custom cars, for example, you can make up something like:
<meta name="Landspeed3" content="Engine-Chassis" />
And search your site accordingly. Specialised stuff, hmmm.
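As I understand the idea, a home-grown site search would read those made-up names out of each page and match on them. The sketch below is one way that might look, again using Python's standard html.parser; MetaCollector, pages_with_meta and the page paths you feed it are all hypothetical names for illustration.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Gathers name/content pairs from the meta tags of one page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = attrs.get("name")
        if name:
            # Names are compared case-insensitively, so store them lowercased.
            self.meta[name.lower()] = attrs.get("content") or ""


def pages_with_meta(pages, name, value):
    """Return the sorted paths of pages whose meta 'name' equals 'value'.

    'pages' maps a page path to that page's HTML source.
    """
    hits = []
    for path, html in pages.items():
        collector = MetaCollector()
        collector.feed(html)
        collector.close()
        if collector.meta.get(name.lower(), "").lower() == value.lower():
            hits.append(path)
    return sorted(hits)
```

Calling pages_with_meta(pages, "Landspeed3", "Engine-Chassis") would then list every page carrying the tag from the example above.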
The other “names” of interest are these:
Expiration Date
<meta name="expires" content="31 December 1999" />
I’m not sure if this is used anymore, but it tells search engines when your page should be deleted from their index, and the date should be presented in the form above (day, month, year). I think search engines have their own way of dealing with out-of-date pages…
Revisit-after
<meta name="revisit-after" content="30 days" />
This is supposed to tell the search engine to visit your site again in nn days.
Meanwhile, back in spider HQ, the bots are laughing their binary socks off in the bar whilst lampooning you for your audacity.
For me it summons up the image of the Smash family robots* rolling on the floor laughing, waving a potato peeler and chuckling … and they mash them all to bits
(* Older Brits will remember this. I’ll try and find a picture.)
There again, it can’t hurt to put it in, I guess.
Links
Anybrowser’s Metatag generator
Surprisingly basic for W3 Schools, but as HTML sites go, this is in the top echelons. They have quizzes too now, which is useful.
Vancouver Webpages – A Dictionary of HTML META Tags
Whilst I won’t go as far as to say this is the last word, it is a very detailed look at meta tag usage.
Robots.Txt
Aimed at directing (responsible) search engine spiders and bots
Search Engine Watch
Takes it from the search engine’s point of view. As I’ve mentioned elsewhere, meta’s are largely yesterday’s news, but they have their place yet.
It may seem like a daft question, but I have to ask…
… can you track the spiders’ home?
Even if we can’t tie a packet of string to its thieving legs, it has to come to our websites from ‘somewhere’.
For instance, from my logs:
Date Wed Jan 28 03:14:44 2004
IP Address: 66.130.150.162
Range is: Videotron.com, Montreal
Browser/OS: EmailSiphon
Logistical nightmare, perhaps, but if enough folk see it coming from the same host…
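If a few of us pooled log extracts like the one above, counting where a given bot comes from is simple enough. This is only a sketch and assumes every record uses exactly the four-line layout shown (Date, IP Address, Range is, Browser/OS); the function names are made up, and SAMPLE is just the extract above copied in.

```python
from collections import Counter

# The log extract from the article, used as sample input.
SAMPLE = """
Date Wed Jan 28 03:14:44 2004
IP Address: 66.130.150.162
Range is: Videotron.com, Montreal
Browser/OS: EmailSiphon
"""


def parse_records(log_text):
    """Split log text into a list of dicts, one per 'Date ...' record."""
    records, current = [], {}
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("Date "):
            if current:
                records.append(current)
            current = {"date": line[len("Date "):]}
        elif line.startswith("IP Address:"):
            current["ip"] = line.split(":", 1)[1].strip()
        elif line.startswith("Range is:"):
            current["range"] = line.split(":", 1)[1].strip()
        elif line.startswith("Browser/OS:"):
            current["agent"] = line.split(":", 1)[1].strip()
    if current:
        records.append(current)
    return records


def hosts_for_agent(records, agent):
    """Count visits per host range for one browser/OS string."""
    return Counter(
        record.get("range", "unknown")
        for record in records
        if record.get("agent") == agent
    )
```

Feed it everyone's logs, ask for "EmailSiphon", and the hosts that keep turning up float to the top of the count.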
Offers image of Garfield with a book… THWAP