Reasons I don’t like HTML 5

Please Note: At the time of this writing, HTML 5 is still a draft, and will remain so for some years to come; the specs are still changing. Hence this post might become outdated very soon.

The Web needs a new version of its favorite markup language, HTML. Both XHTML 1.0 and HTML 4.01 are more than eight years old by now, so the W3C is working on the next chapter, XHTML 2.0. Due to its lack of movement, many people assumed the W3C Working Group was dead, and given the considerable criticism of the whole concept of XHTML, many people felt that was rightly so. Meanwhile some practical people (who call themselves the WHAT WG) have started on a standard of their own, which they called “HTML 5”. The W3C has adopted this draft and is now developing both HTML5 and XHTML2 (which are rivals). That doesn’t mean the WHAT WG has stopped working on it, though; rather, HTML5 is being developed twice. I’m a bit confused about this. Anyway, there are some details about this whole HTML5 thing that I’d like to explain here.

Side note: No matter what some people ((Microsoft)) tell you, HTML/XHTML will be around for a really long time; neither Silverlight nor Flash nor Adobe AIR can replace it.

Redefinitions

<i>

i stands for italic, meaning the text is supposed to be rendered in italics. It’s a purely presentational element with no semantic value whatsoever. Hence HTML 4.01 discouraged it in favor of style sheets, like all presentational elements, and the XHTML 2.0 drafts dropped it. Now, all of a sudden, it’s back, with some “semantic” value forced onto it.

The i element represents a span of text in an alternate voice or mood, or otherwise offset from the normal prose, such as a taxonomic designation, a technical term, an idiomatic phrase from another language, a thought, a ship name, or some other prose whose typical typographic presentation is italicized. (…) Note: Style sheets can be used to format i elements, just like any other element can be restyled. Thus, it is not the case that content in i elements will necessarily be italicised.

So, because there are cases where the italic tag has been used to mark up text that had semantic value in itself, the element has now gained that value (and a rather blurry one, too). The problem is that HTML5 is supposed to be backward compatible, which it can’t be if it redefines elements that already exist. For example, if I’ve used <i> in my navigation ((which has been and still is quite common)) to make the links italic, I would be fully complying with the HTML 3.2 and HTML 4.01 Transitional doctypes. The moment I switch to HTML 5, these links magically gain semantic value which they never had and were never intended to have. From a semantic standpoint, HTML 5 would have broken my site. Not backward compatible at all.
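
To make the navigation case concrete, here is a minimal sketch of such a page. The page title and the link targets are made up for illustration; only the doctype and the way <i> is used matter.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
  <title>Example site</title>
</head>
<body>
  <!-- <i> is used purely for looks here: the links should simply be italic -->
  <p>
    <i><a href="/">Home</a></i> |
    <i><a href="/about.html">About</a></i> |
    <i><a href="/contact.html">Contact</a></i>
  </p>
</body>
</html>
```

Under HTML 4.01 Transitional this markup says nothing beyond “render these links in italics”. Read against the HTML 5 draft definition quoted above, the very same markup would claim that every link is in an “alternate voice or mood”, which was never the intention.
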
The second problem is that people won’t start to think of <i> as anything other than “italic”. It would look odd in CSS, too. Just think about:

i {
  font-style: normal;
  font-weight: bold;
}

Imagine a book about HTML in which “i” is explained as an “element that represents a span of text in an alternate voice or mood, or otherwise offset from the normal prose yapp yapp yapp” and not as “italic”.

<b>

b is not “bold” anymore, but explained as

The b element represents a span of text to be stylistically offset from the normal prose without conveying any extra importance, such as key words in a document abstract, product names in a review, or other spans of text whose typical typographic presentation is boldened.

Same nonsense as above.

<u>

u(nderline) was dropped, though (as well as <tt> and some others). Why? Why not make up another daft definition ((Like: The u element represents a span of text to be stylistically offset from the normal prose without conveying any extra importance, such as key words in a document abstract, product names in a review, or other spans of text whose typical typographic presentation is underlined. – Me))? That would, at least, be consistent.

<small>

Of course, the small element is back from the dead (or rather, the deprecated). Let’s have a look at the old, current and coming specs. Which one doesn’t quite fit?

SMALL places text in a small font. HTML 3.2

SMALL: Renders text in a “small” font. HTML 4.01

The small element represents small print (part of a document often describing legal restrictions, such as copyrights or other disadvantages), or other side comments. The small element does not “de-emphasise” or lower the importance of text emphasised by the em element or marked as important with the strong element. HTML 5

<hr>

The <hr> stands for “horizontal rule”. Though it had semantic value in HTML 3.2 ((“Horizontal rules may be used to indicate a change in topic. In a speech based user agent, the rule could be rendered as a pause.” – http://www.w3.org/TR/REC-html32.html#hr)), it lost that in HTML 4.01 ((“The HR element causes a horizontal rule to be rendered by visual user agents.” http://www.w3.org/TR/html401/present/graphics.html#h-15.3)) and was hence deprecated and dropped. Now it’s back as:

The hr element represents a paragraph-level thematic break, e.g. a scene change in a story, or a transition to another topic within a section of a reference book.

As explained above, redefining a presentational element as a semantic element is no good. XHTML 2.0 has a similar element, but they have solved it more elegantly by calling it <separator>, thus avoiding the confusion mentioned above:

The separator element separates parts of the document from each other.

<font>

The <font> tag is back, if (and only if!) inserted by someone using a WYSIWYG editor. Now that’s something I can understand! I think the marquee tag should be mandatory if inserted by someone using a Macintosh. As of this writing, the font tag has been dropped again.

No (useful) Doctype

The doctype for HTML 5 will be <!DOCTYPE HTML>

which completely omits any mention of “HTML 5”, which is rather unusual. I personally find that suspicious. Also, the spec makes it sound as if the doctype were somewhat redundant:

A DOCTYPE is a mostly useless, but required, header.

DOCTYPEs are required for legacy reasons. When omitted, browsers tend to use a different rendering mode that is incompatible with some specifications. Including the DOCTYPE in a document ensures that the browser makes a best-effort attempt at following the relevant specifications.

As far as I know, the doctype is in fact an important part of SGML. Not only that, but there is no doctype for XHTML5.

That means, of course, that there is no way of telling whether a document is (X)HTML5 or 6 (if there will be an (X)HTML 6, and if they decide not to change this scheme again).
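
For comparison, here are the older doctypes next to the new one. This is a side-by-side listing rather than a single valid document (a real page carries exactly one doctype, on its first line), and the comments are mine.

```html
<!-- HTML 4.01 Strict: the public identifier names the language and its version -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">

<!-- XHTML 1.0 Strict: again, the version is spelled out -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<!-- HTML 5 (draft): no public identifier, no DTD reference, no version -->
<!DOCTYPE HTML>
```

With the first two, a tool that reads nothing but the first line can tell which version of the language to expect. With the third, all it learns is “some kind of HTML”.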

Conclusion

All in all I get the feeling that HTML5 is a solution for today’s problems, not tomorrow’s. Most people working on HTML5 are either web developers or browser vendors: people who use the specs or are responsible for implementing them. This somehow seems rotten to me. Of course, the solutions they came up with are really practical for today’s problems, but I doubt they will be equally well suited to the problems that will arise in the next ten or so years. If you read the XHTML 2.0 specs, you sense a much more visionary approach. Still, I admire the hard work these people have done. What I really fear ((almost as much as a Cappuccino shortage)) is that there will be an arms race for dominance between (X)HTML5 and XHTML 2.0. The last thing we need (or want) is a new browser/markup war.