W3C HTML and XHTML Validator

The World Wide Web Consortium, also known as the W3C, is the organization tasked with setting and maintaining the international standards for common Web markup and style languages, such as HTML, XHTML and cascading style sheets (CSS). Its Web site offers several valuable tools that help ensure that a Web site complies with these standards. First and foremost is the W3C HTML and XHTML code validator.

There are many HTML and XHTML validators, but the W3C validator is the de facto standard. You can find the validator here:

http://validator.w3.org

You may find it interesting to learn how to eliminate every error message the W3C validator produces. It can be an educational process even if you are a very experienced developer. Although you may think you know a lot about HTML or XHTML, you will likely find that you do not know as much as you thought you knew about the W3C’s coding standards.

There are some very good reasons for validating Web page code.

Browsers are designed to interpret standards. Compliance with standards helps ensure consistent rendering among different browsers. In other words, it helps a site look the same in Internet Explorer, Netscape, Firefox, Opera and other browsers.
Browsers are very forgiving and automatically fix many coding errors, but this frequently requires re-rendering, which slows down the final display of the page. Did you ever watch a Web page flash as the browser displays it, or watch a page rearrange itself several times while loading? This is frequently caused by poor coding methods. Each flash of the page is another attempt at rendering.
Browsers are very forgiving, but search engine spiders may not be. Spiders are designed to read and dissect clean code that meets established standards. If a search engine spider cannot read the code properly, it may not be able to dissect the page and find the content. This results in either abandonment of the page by the spider or a missed opportunity that might have resulted in a high ranking page.

Before you use the validator, a page needs to have a Doctype declaration. The Doctype declaration names the document type definition (DTD) the page is written against, and it needs to be the very first line of code on a Web page. The DTD sets the standard to be used in the validation. The declaration also serves another purpose. Many modern browsers have more than one rendering mode and use the Doctype to choose between them. If you do not declare a standard to be used for rendering, the page may not look consistent across different browsers.
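As a rough illustration of the "very first line" requirement, the following Python sketch tests whether a page's source begins with a Doctype declaration. The function name and the sample pages are made up for this example, and tolerating leading whitespace or a byte-order mark is an assumption of the sketch rather than a rule taken from the W3C.

```python
import re

# Matches a Doctype declaration such as <!DOCTYPE HTML PUBLIC "..." "...">.
# The pattern is case-insensitive and spans line breaks inside the declaration.
DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+html\b[^>]*>", re.IGNORECASE)


def starts_with_doctype(page_source: str) -> bool:
    """Return True if the first non-blank content of the page is a Doctype."""
    # Assumption of this sketch: a byte-order mark and leading whitespace
    # are ignored before looking for the declaration.
    stripped = page_source.lstrip("\ufeff \t\r\n")
    return DOCTYPE_RE.match(stripped) is not None


# Hypothetical sample pages used only for illustration.
STRICT_PAGE = """<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html><head><title>Example</title></head><body><p>Hello</p></body></html>"""

NO_DOCTYPE_PAGE = "<html><head><title>Example</title></head><body></body></html>"

if __name__ == "__main__":
    print(starts_with_doctype(STRICT_PAGE))
    print(starts_with_doctype(NO_DOCTYPE_PAGE))
```

A check like this only confirms that a declaration is present at the top of the page. Whether the declaration itself is one of the valid ones is still a question for the validator.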

You can find a list of valid DTDs here:

http://www.w3.org/QA/2002/04/valid-dtd-list.html

Note that all valid Doctype declarations from HTML 4.01 on contain a URL that points to the standards document, the DTD itself. If your Web page editor already adds a Doctype declaration to every Web page, that declaration may not be valid if it does not contain this URL.

Using the W3C validator is very easy. Just copy and paste, or type, the complete URL of the Web page you wish to test. You may find some of the validation messages to be a bit cryptic, but the W3C provides links with most error messages that lead to additional information.
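If you test the same pages often, it can be handy to build a link that opens the validator with the page address already filled in. The Python sketch below assumes that the validator accepts the address through a uri query parameter on its /check path. The function name and the example address are hypothetical.

```python
from urllib.parse import urlencode

# Assumed entry point of the validator's URL-checking form.
VALIDATOR_CHECK_URL = "http://validator.w3.org/check"


def build_validation_link(page_url: str) -> str:
    """Return a validator link with the page address percent-encoded."""
    # urlencode escapes characters such as ":" and "/" so that the page
    # address survives as a single query-string value.
    return VALIDATOR_CHECK_URL + "?" + urlencode({"uri": page_url})


if __name__ == "__main__":
    # Hypothetical page address used only for illustration.
    print(build_validation_link("http://www.example.com/index.html"))
```

The sketch only builds the link. It does not contact the validator or interpret its results.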

Some common page elements produce errors or warnings even though they do not negatively affect either browsers or search engine spiders. Microsoft-specific entity codes, such as those for copyright and other symbols, may not be code compliant. The validator handles these errors well and typically offers the code compliant equivalent. Body tag attributes such as leftmargin and rightmargin (Internet Explorer) or marginheight and marginwidth (Netscape) are browser extensions that are not part of the official coding standards. These types of issues will not typically produce negative effects with either browsers or spiders, but they do show up as errors because they do not comply with the standard.
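To see in advance which body tag attributes are likely to be flagged, a page can be scanned with Python's standard html.parser module. This is a minimal sketch, not a substitute for the validator: the attribute list contains only the four extensions named above, and the class and function names are invented for this example.

```python
from html.parser import HTMLParser

# Browser-specific body attributes mentioned in the text above.
NONSTANDARD_BODY_ATTRS = {"leftmargin", "rightmargin", "marginheight", "marginwidth"}


class BodyAttrChecker(HTMLParser):
    """Collects non-standard attributes found on the body tag."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lowercase.
        if tag == "body":
            for name, _value in attrs:
                if name in NONSTANDARD_BODY_ATTRS:
                    self.found.append(name)


def find_nonstandard_body_attrs(page_source: str) -> list:
    """Return the flagged body attributes in the order they appear."""
    checker = BodyAttrChecker()
    checker.feed(page_source)
    checker.close()
    return checker.found


if __name__ == "__main__":
    # Hypothetical page fragment used only for illustration.
    sample = '<html><body LEFTMARGIN="0" marginwidth="0" bgcolor="#ffffff"><p>Hi</p></body></html>'
    print(find_nonstandard_body_attrs(sample))
```

Margins set this way can usually be moved into a style sheet, which keeps the page compliant.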

One type of nuisance error is a missing alt attribute in image tags. Current standards require the alt attribute because it benefits Internet users with vision impairments, who rely on special browsers that describe images through the use of alt attributes. Making the attribute mandatory pushes designers and developers to supply it. It may not make sense, however, to include text with every alt attribute. The code compliant workaround is to simply include an empty string value (alt="") for lines and other page elements that need no explanation.
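Missing alt attributes are easy to find before the page ever reaches the validator. The following Python sketch lists the images that have no alt attribute at all, while accepting the empty alt="" form described above. The class name, function name and sample markup are made up for this example.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Records the src of every img tag that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <img ... /> also arrive here, because
        # the default handle_startendtag calls handle_starttag.
        if tag == "img":
            attr_map = dict(attrs)
            # alt="" counts as present; only a completely absent alt is flagged.
            if "alt" not in attr_map:
                self.missing.append(attr_map.get("src", "(no src)"))


def find_images_missing_alt(page_source: str) -> list:
    """Return the src values of images that lack an alt attribute."""
    finder = MissingAltFinder()
    finder.feed(page_source)
    finder.close()
    return finder.missing


if __name__ == "__main__":
    # Hypothetical page fragment used only for illustration.
    sample = """
    <img src="logo.gif" alt="Company logo">
    <img src="rule.gif" alt="">
    <img src="photo.jpg">
    <img src="spacer.gif" />
    """
    print(find_images_missing_alt(sample))
```

In the sample, only photo.jpg and spacer.gif are reported. The decorative rule.gif passes because it carries the empty alt value.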

It can be very beneficial to eliminate every error message displayed and to ensure complete compliance with the DTD standard you chose. Once a page has been validated as compliant, the W3C offers a cut-and-paste snippet of code that can be added to your Web pages as both a "gold star" certification of compliance and as a link to re-validate the code when you make changes.