APLawrence.com -  Resources for Unix and Linux Systems, Bloggers and the self-employed

W3C Validation

© July 2004 Tony Lawrence

HTML validation really is important, but it can be hard to get people to understand why.

Referencing: https://validator.w3.org/

Neatness counts, or so some of our teachers told us. For webmasters, neatness includes producing valid HTML.

Trouble is, there isn't a lot of incentive to do so. Because there has always been so much broken HTML on the web, and even disagreement about what is and isn't valid, most browsers will happily display pages that produce dozens of errors when run through the W3C validator referenced above. Many webmasters limit their "validity checking" to calling up the page in their own browser, so it's easy for mistakes to creep in. That isn't the major source of invalid pages, though - scripts are.

Web pages like this one use a lot of scripts. A script converted this article, originally typed as plain text, into an HTML page. Other scripts produce the common links and ads at the top and the Wiki comments at the bottom, and search out related pages. Any of those scripts can easily introduce HTML that won't pass W3C validation.
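As an illustration of how a text-to-HTML script can stay out of trouble, here is a minimal Python sketch. It is not the script used on this site; the function name `text_to_html` and the "blank line separates paragraphs" rule are assumptions made for the example. The point it shows is that characters such as `&` and `<` in the typed text have to be escaped before they are wrapped in tags, or the output will not validate.

```python
import html


def text_to_html(text: str) -> str:
    """Turn plain text into HTML paragraphs (hypothetical helper).

    Assumes paragraphs are separated by a blank line.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    # html.escape converts &, <, > and quotes into entities so that
    # typed text cannot be mistaken for markup.
    return "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)


if __name__ == "__main__":
    print(text_to_html("Tables & forms\n\nUse <table> sparingly"))
```

A real conversion script would also have to handle links, lists and preformatted text, which is where errors tend to creep in as features are added.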

There's that lack of incentive again: since the browsers are so forgiving, why would you want to spend a lot of time chasing down obscure errors? https://www.pantos.org/atw/h-valid.html gives some reasons.

I've been sloppy about that sort of thing in the past, but recently have tried to clean things up. It's been simply amazing how much horribly broken HTML has been produced here. Usually it's not that I did such a terrible job with the script to start with, but that I introduced errors as I added new features or changed my mind about how things should work. Some of the errors that the W3C Validator helped me find were really bad, but I never noticed any objection from any of the multiple browsers I use.

Another source of errors is the data files that scripts read. The Wiki comments are a good example of that: anyone can introduce "bad" HTML into an otherwise valid page simply by adding a comment. There's not much I can do about that ahead of time (too complicated), so I have to fix that kind of thing after the fact.
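Full validation of a comment is indeed complicated, but a rough pre-check for mismatched tags is possible. The sketch below uses Python's standard `html.parser`. The names `BalanceChecker` and `comment_problems` are made up for this example, and the check is deliberately stricter than HTML 4.01, which lets some end tags (such as `</p>` and `</li>`) be omitted. Treat it as a warning light, not a validator.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML 4.01
VOID_TAGS = {"area", "base", "br", "col", "hr", "img",
             "input", "link", "meta", "param"}


class BalanceChecker(HTMLParser):
    """Tracks open tags and records closing tags that do not match."""

    def __init__(self):
        super().__init__()
        self.stack = []      # currently open tags, innermost last
        self.problems = []   # human-readable complaints

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.problems.append(f"unexpected </{tag}>")


def comment_problems(fragment: str) -> list:
    """Return a list of tag-balance complaints for an HTML fragment."""
    checker = BalanceChecker()
    checker.feed(fragment)
    checker.close()
    problems = list(checker.problems)
    # Anything still on the stack was opened but never closed.
    problems.extend(f"unclosed <{tag}>" for tag in checker.stack)
    return problems


if __name__ == "__main__":
    print(comment_problems("<b>bold <i>oops</b>"))
```

A comment that returns an empty list can still be invalid in other ways (bad attributes, bad nesting of block and inline elements), so after-the-fact fixes would still be needed.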

There have been a few places where I've run into changing standards. For example, I had used "align=center" with the table that creates the navigation links at the top of each page. That was fine in ancient HTML, and if you don't have a

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

at the top of your document, browsers happily center the table. You need a DOCTYPE declaration to validate, but the minute you add that, browsers don't like "align=center" for tables and the W3C validator doesn't either. It also doesn't like "hspace" and "vspace". Actually, centering tables turns out to be easy - tables naturally center if you use something like this:

<table style="margin-left: auto; margin-right: auto;" width="75%">

The "vspace" and "hspace" are dealt with by using margin-left, margin-top etc. in a style sheet or in-line declaration. So, I have this "sidebar" class:

TABLE.sidebar { margin-left:2em; margin-right:2em; margin-top:2em;
margin-bottom:2em;float: right;}

Which might be used like this:

<table class="sidebar" width="100" bgcolor="#cceeff" cellspacing="2" border="1">

You can see an instance of that in the "There's that lack of incentive" paragraph above.

The validator was happier after those changes.
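On a site with thousands of pages, leftover attributes like these are easier to find with a script than by eye. The following is a small Python sketch under stated assumptions: the `FLAGGED` table lists only the three attributes mentioned above, and the names `AttributeScanner` and `find_flagged` are invented for the example. It reports the line, tag and attribute of each hit and makes no attempt to judge validity against any DTD.

```python
from html.parser import HTMLParser

# Attributes this article reports trouble with on <table>;
# the mapping is an assumption for illustration and can be extended.
FLAGGED = {"table": {"align", "hspace", "vspace"}}


class AttributeScanner(HTMLParser):
    """Collects (line, tag, attribute) for every flagged attribute."""

    def __init__(self):
        super().__init__()
        self.hits = []

    def handle_starttag(self, tag, attrs):
        for name, _value in attrs:
            if name in FLAGGED.get(tag, ()):
                line, _column = self.getpos()
                self.hits.append((line, tag, name))


def find_flagged(page: str) -> list:
    """Return flagged attributes found in one page of HTML."""
    scanner = AttributeScanner()
    scanner.feed(page)
    scanner.close()
    return scanner.hits


if __name__ == "__main__":
    sample = '<p>intro</p>\n<table align=center hspace=4 width="75%">\n</table>'
    for line, tag, attr in find_flagged(sample):
        print(f"line {line}: <{tag}> uses {attr}")
```

Running something like this over script output as well as static files would catch the cases where a script, not a hand edit, is producing the old attributes.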

I wish I could say that every page here is now W3C valid. Unfortunately, we have almost 12,000 individual pages here now, and more that are generated on demand from scripts. I think I've found and corrected a lot of the problems, but I'm sure there are many more lurking within. So while my intention now is to produce only valid HTML, the website may not entirely conform in practice. Well, better to make the effort than to do nothing, right?

After writing this, I noticed a new error in the script that produces the main Blog pages: so this page itself may not validate until I fix it!


You could save some bandwidth by reducing the number of declarations. For example, when all margins are 2ems:

margin: 2em;

Or if you wanted separate margins for each side of the box:

margin: 2em 1em 2em 1em; /* order: top right bottom left */

It's also worth keeping in mind that an "em" is relative to the element's font size, so when a user increases/decreases their browser's font size the margin would also change relative to the new font size.
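To make the shorthand rule and the "em" arithmetic concrete, here is a short Python sketch. The helper names `expand_margin` and `em_to_px` are hypothetical, and the 16px and 20px font sizes are only example figures. The expansion follows the standard CSS rule for one to four values.

```python
def expand_margin(shorthand: str) -> dict:
    """Expand a CSS margin shorthand value into its four sides."""
    parts = shorthand.split()
    if len(parts) == 1:            # all four sides
        top = right = bottom = left = parts[0]
    elif len(parts) == 2:          # vertical, horizontal
        top = bottom = parts[0]
        right = left = parts[1]
    elif len(parts) == 3:          # top, horizontal, bottom
        top, right, bottom = parts
        left = right
    elif len(parts) == 4:          # top, right, bottom, left
        top, right, bottom, left = parts
    else:
        raise ValueError("margin takes one to four values")
    return {"top": top, "right": right, "bottom": bottom, "left": left}


def em_to_px(ems: float, font_size_px: float) -> float:
    """An em is a multiple of the element's own font size."""
    return ems * font_size_px


if __name__ == "__main__":
    print(expand_margin("2em 1em 2em 1em"))
    # The same 2em margin grows when the reader enlarges the text.
    print(em_to_px(2, 16), em_to_px(2, 20))
```

So a 2em margin is 32 pixels at a 16px font size and 40 pixels at 20px.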

July 19, 2004

Thanks! I need to do a LOT of work with my style sheets..


---July 19, 2004

Tony, while you're at it, can you fix the bug that truncates the comment dates in these posts? ("***July 19, 2004" becomes "***July ," at least in IE6) This may just be an IE6 problem.

Please note that I had to change -'s to *'s in order to demonstrate this bug.

Damn, this is hard to demonstrate. For some reason, the "July nineteen, two-thousand four" part of the date displays as just "July ,". Whyizzit?

Another thing: the "IEsix" above displays as just "IE".


---July 19, 2004

I see it; just haven't figured it out yet, sorry..


---July 19, 2004

OK, found it.. let me know if it doesn't work for you..


---July 19, 2004

BTW, I really appreciate it when people take the time to point out errors or things they just don't like. I suppose part of the reluctance is that at too many places suggestions and comments just get ignored. I WANT to hear about problems. I may already know about it, and may already be trying to fix it, but you never know, so it's worth telling me.


---July 26, 2004

One more little suggestion:

Please turn off the page auto-refresh feature that kicks in after a minute or two of viewing these pages. You're in the middle of reading an article down near the bottom of the page, turn your eyes away from the screen for a second, and voila! you're back at the top of the page with no user intervention whatsoever. Or so you think. I would have reported this sooner, but I really wasn't believing what I was seeing when it happened.


---July 26, 2004

You're supposed to read faster :-)

Seriously - I thought I HAD taken those out..

Looking harder now..

