Better Living Through XHTML – A List Apart

XHTML is the standard markup language for web documents and the successor to HTML 4. A mixture of classic (HTML) and cutting-edge (XML), this hybrid language looks and works much like HTML but is based on XML, the web’s “super” markup language, and brings web pages many of XML’s benefits, as enumerated by the Online Style Guide of the Branch Libraries of The New York Public Library.

If you want your site to work well in today’s browsers and non-traditional devices, and to continue to work well in tomorrow’s, it’s a good idea to author new sites in XHTML, and to convert old pages to XHTML as your work schedule permits.

And the W3C has made it easy to do so. You can learn the rules of XHTML faster than Domino’s delivers a medium pizza with black olives and fresh mushrooms. These few, simple rules exemplify W3C’s practicality, for they bring consistency and XML well-formedness to the web without requiring busy designers and developers to learn entirely new markup techniques.
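As a rough illustration of how few rules are involved, the sketch below shows a minimal XHTML 1.0 Transitional page. The title, text, and image file name are placeholders invented for this example; the comments mark the habits that differ from everyday HTML 4.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <title>Placeholder title</title>
  <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body>
  <!-- Element and attribute names are written in lowercase. -->
  <!-- Every element is closed; empty elements end with " />". -->
  <p>First line.<br />Second line.</p>
  <!-- Attribute values are always quoted, and img needs an alt attribute. -->
  <img src="example.gif" alt="Placeholder image" width="100" height="50" />
</body>
</html>
```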

But as with any transition, you’ll get better and more predictable results if you prepare ahead. This article will help you do that, by examining tools that can assist you in converting to XHTML and verify that you’ve done it correctly. The article will also discuss changes in the way some browsers display XHTML pages that might puzzle you if you’re not expecting them, and help you prepare workarounds if needed.


If you haven’t already done so, read the Online Style Guide of the Branch Libraries of The New York Public Library before you read this article.

The Style Guide illuminates XHTML without requiring you to muddle through the often arcane literature at W3C; includes valuable information on CSS (including Style Sheets you can grab and use on your own sites); explains how to work with the W3C validators; and offers updated tips for Dreamweaver users.

This author helped The New York Public Library create the Style Guide and is grateful to the Library and to web coordinator (and Style Guide co-creator) Carrie Bickner for choosing to make the Style Guide available to the entire community. The Style Guide is frequently updated to correct errors and provide new information.

TIDY TIME

By far the easiest method of creating valid XHTML pages is to write them from scratch. But much web design is really redesign, and you’ll often find yourself charged with updating old pages. Redesign assignments provide the perfect opportunity to migrate to XHTML.

The free tool HTML Tidy can quickly convert your HTML to XHTML. We recently used it to do just that with The Daily Report at zeldman.com. We likewise used Tidy for last year’s CSS/XHTML conversion of A List Apart, and we’ve employed it successfully on several client sites.

Tidy was created by the brilliant standards geek Dave Raggett and is now maintained as Open Source software by the community at SourceForge, although some versions are maintained by individuals as a labor of love.

For our conversion, we used MacTidy 1.0b13, the latest version of Tidy for Mac OS, developed by Terry Teague.

There are online versions of Tidy as well as downloadable binaries for Windows, Unix, various Linux distributions, Mac OS X, and other platforms. Each version offers different capabilities and consequently includes quite different documentation.


Like many busy people we tend to avoid reading the manual, but in this case we urge you to read every word. Though some versions of Tidy look rudimentary and its functions may appear obvious, Tidy is a power tool. You must acquaint yourself with its settings and preferences to ensure the desired results.

For our Daily Report conversion, we refreshed our memory with a mere glance at MacTidy’s documentation, and this proved to be a mistake.

On our first pass, using settings we’d never used before, Tidy converted our encoded character entities to non-encoded, platform-specific keyboard characters; transformed Unicode entities that work in all browsers into named entities that should work in all browsers but don’t; and changed comment brackets (the < in <!-- for instance) to encoded characters, thereby triggering errors in an embedded JavaScript function.

The power was Tidy’s, the fault was ours. Misuse of Photoshop, Illustrator, or Flash can have equally dire consequences, and Tidy can’t be blamed for the errors of its users. So do yourself three favors:

  1. Read the manual.
  2. Keep a backup copy of your document.
  3. Read the manual.

Those who share our manual-reading avoidance problem will want to know which option setting was right. Alas, there is no single “right” setting. The proper setting depends on the type of character encoding you intend to specify in your page header, the type of encoding you’ve told your HTML editor to output (usually, but not always, Latin1), and other variables. Here’s a tip, though. Be sure to choose Convert HTML to XML if you want to generate XHTML. (Remember: XHTML is really XML.)


The Style Guide explains how to work with the W3C’s (X)HTML and CSS validators. Validation takes only a few moments. If you don’t bother with this step, and if your XHTML or CSS contains errors, your site may not function properly. It may also look quite different from what you intended.

With valid markup and CSS, compliant browsers tend to render your site as you expect, with exceptions to be discussed below. With invalid XHTML or CSS, all bets are off, and you can’t blame the browsers. (Well, you can, but it wouldn’t be fair and it won’t do you a bit of good.)

If you write your markup by hand, unless you’re perfect, you’re likely to make mistakes every once in a while. If you use Macromedia Dreamweaver or Adobe GoLive straight out of the box, your site is certain to contain errors that the validators can help you fix.

We have every confidence that upcoming versions of Dreamweaver and GoLive will help you author more valid web content, but these versions are not yet on the market, and even when they become available, you may well need to go in and massage your markup by hand. (Meanwhile, Dreamweaver users should consult the Style Guide’s tips, updated 15 February 2002 to coincide with the present article.)

Regardless of the way you generate your markup, it pays to work with the validators. They’re like non-judgemental XHTML and CSS experts that will point out your problems without thinking badly of you.


Even the best experts sometimes give bad advice. They may also eat too much garlic at lunch, or use your fax machine more than you’d like. The robot experts at validator.w3.org and jigsaw.w3.org/css-validator/ may also occasionally give you unexpected trouble.

Primarily, this has to do with the language the validators use to report errors. Written by and for standards geeks, the validators sometimes provide “help” that may confuse or even mislead regular working stiffs. It’s not W3C’s job to dumb down web standards for the rest of us, but we sometimes wish the validators would say, “Hey, dummy, you forgot to close your <p> tag” instead of the cryptic stuff they usually spit out.


To be fair, cryptic validator messages are often the result of software limitations. The validator is not 2001’s HAL. If you forget to close one tag, the validator can’t possibly know that you intended to close it, and may thus report an error farther down on the page instead of zeroing in on the real problem. The validator may point to an improperly nested tag that is, in fact, properly nested; an earlier one is not, and that throws the validator for a loop.

As author, you’re responsible for your own errors, whether generated by a (possibly misused) tool or marked up by hand. Knowing about the XHTML validator’s tendency to report nesting errors below where they actually occur can help you make sense of confusing error reports and get back on track quickly.
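To make the “error reported farther down” effect concrete, here is a small hypothetical fragment (not taken from the Daily Report). The actual mistake is a missing closing tag near the top, but a parser is likely to complain only when it reaches a later, perfectly innocent tag.

```html
<div>
  <p>This paragraph is never closed.
  <ul>
    <li>This list is nested correctly.</li>
  </ul>
</div>
<!-- The real mistake is the missing </p> on the second line.
     A validator may only notice it at <ul> (a list cannot sit inside a
     paragraph) or at </div>, several lines below the actual omission. -->
```

When an error report points at markup that looks correct, reading upward from the reported line for an unclosed element is usually the fastest way to find the culprit.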


Many commonly used HTML editors generate doctypes with relative URIs, such as…

These URIs invariably trigger CSS validation errors that are nearly impossible for mere mortals to comprehend. We have banged our heads bloody over error messages like this one:

org.xml.sax.SAXException: Please, fix your system identifier (URI) in the DOCTYPE

The little-known CSS Validator FAQ translates these deeply geeky utterances into readable English, and explains how to fix the problems. In the example given above, the solution is to use a doctype with an absolute URI, such as:

We’ll share important tips on doctypes a little later in this article.
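As an illustration of the difference between the two kinds of doctype, the sketch below assumes an XHTML 1.0 Transitional page. The relative path in the first declaration is one typical form an editor might emit and is chosen here only as an example; the second declaration uses the full URI of the DTD at w3.org.

```html
<!-- Relative system identifier: a remote validator may be unable to
     resolve this path to the actual DTD. -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "DTD/xhtml1-transitional.dtd">

<!-- Absolute system identifier: the complete URI of the DTD. -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
```

A page carries only one doctype; the two are shown together purely for comparison.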

Another common problem that doesn’t disturb the validators (but wreaks havoc in older browsers, and even in some newer ones such as MSIE6) involves the optional xml prologue that precedes the doctype and namespace declarations. Again, see the NYPL Style Guide’s XHTML Guidelines for details, including the solution.
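For readers who have not met the prologue, here is a sketch of the top of an XHTML page with and without it. The encoding value is an assumption for illustration. The second variant shows one widely used workaround, which is to omit the prologue and declare the encoding in a meta element; treat it as a starting point and consult the Style Guide for its own recommendation.

```html
<?xml version="1.0" encoding="iso-8859-1"?>
<!-- Variant A (above): the optional XML prologue, which must be the
     very first thing in the file, ahead of the doctype. -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">

<!-- Variant B: no prologue, so the doctype comes first and the
     encoding is declared inside the head instead. -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
```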

Other common validation problems (and solutions) are covered in Eric Meyer’s Liberty! Equality! Validity! at Netscape DevEdge. {Ed: Netscape may have moved the cited article since Better Living Through XHTML was first published.}


The validators are also the products of human engineering, and thus, like all software, contain a few bugs. You should report these bugs when you encounter them (we tell you how below), but may feel intimidated about doing so, since you’re likelier to think your markup is at fault than to suspect that a powerful computer programmed by standards experts could be wrong. But every once in a great while, it can be.

In our recent encounter with Tidy, using incorrect option settings, we got a web page that wasn’t quite right, and decided to fix the errors by hand. Without realizing it, we missed one error.

Obeying our ill-advised option settings, Tidy had converted our encoded © copyright symbol to the Macintosh keyboard character for copyright. This keyboard character is fine for Macintosh-to-Macintosh document transfer, but isn’t recommended for the web. We ran the resulting page through the W3C validator and it passed with flying colors.

We next tried to validate our style sheet, but W3C’s CSS validator told us it couldn’t do so because of an error in our XHTML: “An invalid XML character (Unicode: 0xa9) was found in the element content of the document.”


Unfortunately, we couldn’t search and replace “0xa9,” since 0xa9 was not a text string in our document. (It happens to be the copyright symbol in Unicode, but unless you’ve committed Unicode characters to memory, the validator’s message isn’t particularly helpful.)
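A short sketch may help decode the message. The validator’s 0xa9 is hexadecimal for decimal 169, so the character it objects to is the one that the numeric reference &#169; (or &#xA9;) stands for. The name in the fragment below is a placeholder.

```html
<!-- Numeric character reference: plain ASCII in the source file,
     so it survives whatever encoding the editor saves in. -->
<p>Copyright &#169; 2002 Example Author</p>

<!-- A raw copyright character typed from the keyboard: what is actually
     saved depends on the editor's encoding, which has to agree with the
     charset the page declares. -->
<p>Copyright © 2002 Example Author</p>
```

Searching the source for the literal symbol rather than for the string “0xa9” is what locates this kind of error.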

The CSS validator provided Line and Column references for the error, and these might have proved useful in pinpointing the problem if they mapped to anything. But the references mapped to nothing since the CSS validator doesn’t print out your markup.

The W3C markup validator does print out your XHTML markup, complete with Line references, but only if it thinks your markup is invalid. And as we’ve said, that W3C validator considered our markup kosher.

We thus found ourselves in a Catch-22. One validator said our page was fine; the other choked on it and provided error reports we couldn’t use.

Temporarily baffled, we uploaded the page anyway, and within an hour, Daily Report readers including Mark Howells, Zeke Runyon, and Dylan Foley had taken it upon themselves to proofread our source and find the error. We thanked them, corrected the error, and were back in business.

Had we been working on a client project instead of a personal site, we would not have uploaded the page until we had found and corrected the error. In most cases, it’s best to quit your HTML editor, go for a walk, and come back later, with a clearer head.


Such problems are quite rare (and in our case, they could have been avoided by consulting Tidy’s user manual in the first place), but they do crop up. One friend of ours, who has been called the “greatest living web designer,” routinely types Windows keyboard characters into his source. Markup errors happen; validator errors (very occasionally) happen.

If you think the W3C XHTML validator is in error, visit the feedback page. To report possible CSS validator errors, write to [email protected]. (The email address listed on the CSS validator ReadMe page is non-functional because it is incomplete.) If the validator is indeed at fault, or you have strong reasons to think it is, be kind and considerate in reporting the error.

The W3C validators are a free resource maintained by knowledgeable individuals as a labor of love. Displays of petulance, though sometimes tempting, will either offend these hard-working people or (more likely) make them wonder why you’re behaving so rudely, and prompt them to toss your note into the trash.

So now you have a valid XHTML page. Will it look the way it used to? In some recent, standards-friendly browsers, it may not, but you can fix that quickly.

After converting the Daily Report from HTML 4.01 Transitional to XHTML 1.0 Transitional, our page was in no way different from the previous version except for the change in doctype and associated markup rules.

But IE6 and Mozilla/Netscape 6 decided it should look different than it used to. Here’s what IE6/Windows did to our menu bar:

IE6 mangles menu bar.

And here’s how Netscape 6.2/Mac felt about it:

NN6 mangles menu bar.

View a Daily Report of April 2002 to see how the menu is supposed to display.


To fix these (to our mind) glitches in MSIE6 and Mozilla/Netscape 6, we added two rules to a style sheet embedded in the page header:

img {display: block;}
.inline {display: inline;}

The first rule fixed the menu bar. The second fixed layout problems caused elsewhere by the first rule. Where we wanted images to display inline, we added a class="inline" attribute to the img tag. Problem solved.
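Put together, the two rules and the class attribute might be used as in the sketch below. The table, file names, and dimensions are invented for illustration and are not the Daily Report’s actual markup.

```html
<head>
  <title>Placeholder title</title>
  <style type="text/css">
    /* Images become block boxes, so no inline gap appears beneath them. */
    img     { display: block; }
    /* Opt-out for images that should still flow with the text. */
    .inline { display: inline; }
  </style>
</head>
<body>
  <!-- Menu bar built from sliced images in table cells. -->
  <table cellspacing="0" cellpadding="0" border="0">
    <tr>
      <td><img src="menu-one.gif" alt="One" width="80" height="20" /></td>
      <td><img src="menu-two.gif" alt="Two" width="80" height="20" /></td>
    </tr>
  </table>
  <!-- An image inside running text keeps inline display via the class. -->
  <p>Text with an icon
    <img class="inline" src="icon.gif" alt="icon" width="12" height="12" />
    in the middle.</p>
</body>
```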

If markup (structure) and visual display (design) are two different animals per W3C thinking, why did these browsers change our display, and how did we come up with the CSS rules that solved the problem?


For one thing, as noted in the Daily Report itself (26 January), experts disagree on how standards like CSS should be interpreted. Specifically, they disagree on what styles (if any) should be applied by default to images that haven’t been styled by the page designer.

Eric Meyer’s Tables, Images, and Mysterious Gaps explains how the CSS experts at Mozilla interpret unstyled image tags in relation to the implied grid of every web page, and provides workarounds for those who don’t want extra space being added to their web layouts. The issue primarily affects “combination” layouts that use a mixture of ancient (tables) and modern (CSS) layout technologies. {Ed: Netscape may have moved the cited article since Better Living Through XHTML was first published.}

STRICT vs. STRICT

Meyer’s helpful article states that Mozilla and its child browser, Netscape 6, only do this to (X)HTML documents with strict doctypes, but that may be unintentionally misleading, as it seems to suggest that extra space is applied to images only in HTML Strict or XHTML Strict. A glance at many web design mailing lists will show you that this is the popular interpretation of the word “strict” in this context.

In fact, what Meyer and his colleagues mean by “strict doctypes” is “complete” (or valid) doctypes, i.e. any document (even HTML 4.01 Transitional) that includes a complete URI. (Meyer isn’t misusing the word strict; it’s just that the word means different things in different contexts.)

In practice, we’ve found that Netscape 6.x applies its experts’ strict CSS interpretation to some HTML 4.01 Transitional documents with complete doctypes, and not to others. This may be because Mozilla is still in Beta, hence Netscape 6.x is still unfinished; or it may indicate an underlying principle that we’ve failed to discern.

More to the point for our present purposes, Mozilla/NN6 always applies this CSS interpretation (and thus, this extra space) to pages authored in XHTML (Strict or Transitional). The moment you convert to XHTML, images contained in table cells will do to your layout what Germany did to Poland.


Most CSS-compliant browsers use the presence or absence of a complete doctype to trigger standards-compliant or backward-compatible (“Quirks mode”) presentation, respectively, a practice first suggested (as far as we know) by Todd Fahrner in 1998, and first implemented by IE5/Mac in March, 2000.

Mozilla/NN6 follows this pattern, as does IE6/Win. IE6 also includes a DOM property that tells whether standards-compliant mode is switched on for a given document.

When in “standards” mode, a compliant browser assumes that you know what you’re doing and displays your page per W3C specs. In “Quirks” mode, the browser surmises that you’ve crafted an old-fashioned, probably invalid page, and displays it as an older browser might. You control which tack the browser takes by including or excluding a complete (X)HTML doctype.
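The DOM property is not named in the text above; in IE6 it is document.compatMode, which reports "CSS1Compat" in standards mode and "BackCompat" in Quirks mode. The snippet below is a minimal sketch of how a page could check it while you experiment with doctypes. The alert messages are our own wording, and browsers that lack the property fall through to the last branch.

```html
<script type="text/javascript">
  // Read the rendering mode the browser chose for this document.
  var mode = document.compatMode;
  if (mode == "CSS1Compat") {
    alert("Standards mode");
  } else if (mode == "BackCompat") {
    alert("Quirks mode");
  } else {
    alert("This browser does not report compatMode");
  }
</script>
```

Swapping a complete doctype for an incomplete one and reloading should flip the reported mode, which makes the switch easy to confirm.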

See Fix Your Site With the Right DOCTYPE! to learn which doctype you should use for your web project.

(There is one exception to this rule: Mozilla/NN6, in common with MSIE, treats HTML 4.0 pages, even those with complete doctypes, in backward-compatible “Quirks” mode. So if you’re not quite ready for XHTML, but you’re writing valid HTML and CSS and want the browser to display your page correctly, choose an HTML 4.01 doctype. Of course, we encourage you to use XHTML instead.)

After converting to XHTML, if your images begin invading the borders of neighboring countries, you’ll have to take a few minutes to add compensatory rules to your style sheets. Every layout is different, so no single CSS rule or collection of rules will solve every problem, but Eric Meyer’s article and the style rules we used and have listed above should provide a starting point, and this extra work shouldn’t take you much time at all.


Through Eric Meyer’s article, Mozilla/Netscape has documented why it acts as it does. We’re not sure why IE6/Win changed its display when we updated our page’s doctype to XHTML. (Both versions, the old HTML 4.01 Transitional and the new XHTML 1.0 Transitional, used complete doctypes.)

We think it may have to do with the way some browsers treat white space. The two tags below are functionally equivalent, but because of their varying use (or non-use) of white space, they may display differently in a browser that attempts to parse white space in markup. Thus:

… might display differently than:


The second example (the one with white space in its markup) might result in unwanted visual gaps in your web page. Likewise in the example below. The first tag (with no white space)…

… might look different in your browser than the functionally identical:


Why does this happen? The “whitespace bug” was a known problem in Netscape Navigator dating back to Version 3.0 (if not earlier). When Microsoft decided to build a competing browser, its engineers emulated much of Netscape’s behavior, including some of its bugs. Our guess is that MSIE6 continues to emulate this old Netscape bug.
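A hypothetical pair of table rows illustrates the kind of difference described above. Both cells hold the same image; only the white space around it differs. The file name and dimensions are invented for this sketch.

```html
<table cellspacing="0" cellpadding="0" border="0">
  <tr>
    <!-- No white space between the cell tags and the image. -->
    <td><img src="example.gif" alt="" width="100" height="20" /></td>
  </tr>
  <tr>
    <!-- Same content with line breaks and indentation around the image.
         A browser that treats this white space as text may show a gap. -->
    <td>
      <img src="example.gif" alt="" width="100" height="20" />
    </td>
  </tr>
</table>
```

If a layout shows unexplained gaps, closing up the white space inside the affected cells is a quick way to test whether this behavior is the cause.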

Regardless of why IE6 behaved as it did, our additional rules (display: block) fixed the problem in that browser as well. Your mileage may vary, but some version of (display: block) will probably solve your design problem in both Mozilla/NN6 and IE6.


When properly used, W3C standards enhance accessibility and promise long-term durability (which we call “forward compatibility”) for any document published on the web. If you care to reach the largest audience for the longest time possible, you’ll want to work with web standards, and where document structure is concerned, XHTML is the way to go.

While some W3C standards are intended to help experts accomplish sophisticated tasks, markup (XHTML) and style sheets are for everyone, and W3C has taken pains to pave the road to XHTML.

The rules of XHTML take minutes to learn and the benefits of XHTML are vast. It’s easy to author in XHTML and equally easy to convert HTML to XHTML by hand. Tools like Tidy can help automate the process as long as you take a few minutes to read the documentation before pushing the button.

Free online validators help ensure that your XHTML and CSS are kosher, though error reporting may sometimes, momentarily, confuse you, and in very rare instances the validators can misbehave.

After converting to XHTML, you may need to adjust your style sheet to compensate for some browsers’ default presentation of images, particularly when they occur in table cells, but if you make this a part of your work routine it can become second nature. And as new browsers continue to gain market share, we’ll be doing less and less design work with tables, and more and more via CSS.

With a little care and feeding, XHTML will help your sites work better in more browsers and devices, thus reaching greater numbers of readers, now and for years to come. What more could you ask?
