General advantages and disadvantages of HTML vs XML and XHTML

Written by: Nelson Druell; published: August 2006


  


This article compares three markup languages: Hypertext Markup Language (HTML), Extensible Markup Language (XML), and Extensible Hypertext Markup Language (XHTML), which reformulates HTML using the rules of XML.

HTML

HTML is the primary format used on the World Wide Web. HTML can display Web pages with a wide range of colors, shapes, and objects. Although not a true programming language, HTML has increased in power over the years.

HTML is not derived from XML; it began as an application of SGML, the older standard from which XML was later distilled as a strict subset. Whereas XML is a strict language (as you will learn), HTML takes many liberties that have helped it become the popular presentation tool it is today. Although the spirit of the young Internet encouraged freedom, developers have now realized that the freedom of HTML has repercussions. Because HTML is so flexible, many browsers and Web applications have added their own functionality to the base HTML specification. Like all enhanced functionality, this comes with additional security risks.

For this reason, efforts are underway to replace HTML with a much more regulated and standardized markup language known as XHTML.

XML

XML is the foundation for many data formats, including WML, XHTML, and more. It has recently become popular because it can facilitate the transfer of data between widely disparate programs, operating systems, and companies. The key to XML's utility is that it enables any developer to design her own data format using her own terms and requirements. XML has been adopted so widely that Microsoft, for example, has built XML support into many of its products, from operating systems to server components.

To illustrate the utility of XML, let's consider a sample corporation that needs to share information about fruit inventory. Because direct access to a database would be a security risk (as well as poor business practice), the developer can instead define an XML format that describes the type, size, and color of each fruit on hand. Once she has determined the specs, the developer could program the host to pull data from a database and convert it to an XML file. On the other end, a special client could read the generated XML file and parse the information to fill its own database. This process would thus allow for rapid and standardized data transfer.

To illustrate this, consider the following sample source code to see how such an XML file would appear. Note the hierarchy and the matching set of labels. Each label is a property, which could have sub-properties. In this case, we are passing information about an apple and a grape.

    <FRUIT>
      <NAME>APPLE
        <COLOR>RED</COLOR>
        <SIZE>BIG</SIZE>
      </NAME>
      <NAME>GRAPE
        <COLOR>PURPLE</COLOR>
        <SIZE>SMALL</SIZE>
      </NAME>
    </FRUIT>

By extrapolating from this simple example, you can see how XML data is organized. The use of such hierarchical, self-describing data formats is still in its infancy, and will continue to grow for many years.
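The host-and-client exchange described above can be sketched in a few lines of Python using the standard library's xml.etree.ElementTree module. This is only an illustration under stated assumptions: the inventory list stands in for the result of a database query, and the function names build_fruit_xml and parse_fruit_xml are hypothetical, not part of any product mentioned in this article.

```python
import xml.etree.ElementTree as ET

# Hypothetical inventory rows, standing in for the result of a database query.
inventory = [
    {"name": "APPLE", "color": "RED", "size": "BIG"},
    {"name": "GRAPE", "color": "PURPLE", "size": "SMALL"},
]


def build_fruit_xml(rows):
    """Host side: convert inventory rows into the FRUIT document format."""
    root = ET.Element("FRUIT")
    for row in rows:
        name = ET.SubElement(root, "NAME")
        name.text = row["name"]
        ET.SubElement(name, "COLOR").text = row["color"]
        ET.SubElement(name, "SIZE").text = row["size"]
    return ET.tostring(root, encoding="unicode")


def parse_fruit_xml(document):
    """Client side: read a FRUIT document back into plain dictionaries."""
    root = ET.fromstring(document)
    rows = []
    for name in root.findall("NAME"):
        rows.append({
            # The fruit name is the text that precedes the first child element.
            "name": (name.text or "").strip(),
            "color": name.findtext("COLOR"),
            "size": name.findtext("SIZE"),
        })
    return rows


document = build_fruit_xml(inventory)   # what the host would send
received = parse_fruit_xml(document)    # what the client would store
```

Because both sides agree only on the document format, neither needs to know how the other stores its data.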

XML is already the foundation of many other Internet-based formatting languages, and these XML-based languages are giving XML the push it needs to become the de facto standard. One of the more recent, XHTML, is slowly gaining ground, and many expect it to overtake HTML in prevalence.

XHTML

XHTML is widely expected to replace HTML. Although this process will take several years, many Webmasters have already embraced XHTML, and are slowly integrating its rules into their development. XHTML 1.0 is a reformulation of HTML 4.01 as an XML application, and many consider it the successor to HTML 4.

What makes XHTML so popular is its simple yet rigid ruleset. This ruleset is so powerful because it enforces a universal standard. The rules are as follows:

  • XHTML requires a declaration at the top of every XHTML page.

    This new rule tells the browser the type of data to render, which keeps all parts of the data presentation and transfer process flowing smoothly. The DOCTYPE line is mandatory; the XML declaration that precedes it is optional but recommended. The following is an example of an XHTML declaration:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

  • All XHTML pages must have the <html>, <head>, <title>, and <body> tags.

    These tags typically exist in all Web pages. With HTML, however, Web browsers will overlook a missing tag and fill it in automatically. This is not the case with XHTML.

  • All tags must be closed.

    Prior to XHTML, Web pages included tags like <p>, which typically had a closing tag </p>. However, it didn't matter if the closing tag was left out. With XHTML, every tag must be closed. In addition, empty tags like <HR>, which creates a line across a Web page, must now close themselves and look like <hr />. This is a completely new concept for Web pages.

  • All tags must be lowercase.

    Again, this is a new rule. HTML is not case-sensitive, and earlier pages were often written with uppercase tags; now these tags must be lowercase. As you may have noticed in the previous rule, the <HR> tag not only gained a slash, but also became lowercase. (This applies to tag names and attribute names, but not to attribute values.)

  • All attribute values must be in quotes.

    Although this rule has traditionally been considered good coding practice, it is now mandatory. This will add complications for dynamically created Web pages, because the code that generates them must quote every value it writes.

  • All tags must be properly nested (not overlapping).

    Again, this was considered good coding practice, but was not required. With XHTML, the following would no longer be correct, because the <I> tag is closed before the <B> tag that was opened inside it: <I><B>Bolded and Italicized</I></B>

    Instead, it would now be written as follows: <i><b>Bolded and Italicized</b></i>

    (Note the lowercase letters.)

  • All attribute values must be written out.

    HTML allows a few attributes to appear without a value. For example, if you are coding a group of radio buttons, one might be listed as "checked." In XHTML, such an attribute takes its own name as its value. See the following old versus new way of listing this:

    Old: <INPUT TYPE=RADIO CHECKED NAME="AnyName">

    New: <input type="radio" checked="checked" name="AnyName" />

    (Note the use of lowercase, quotes, and a closing slash.)

  • All <pre> tags must not contain the following tags: <big>, <small>, <sub>, <sup>, <img>, or <object>

  • Forms cannot be nested inside other forms.

  • All "&" symbols must be written as "&amp;".

  • All CSS selectors must refer to tag and attribute names in lowercase letters, because XHTML names are case-sensitive.

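  • All JavaScript should be kept in external files.

    JavaScript is a programming language, and is separate from XHTML, which is only a markup language. Remember, XHTML describes the structure of a page, and CSS handles its presentation. Keeping scripts in external files also keeps characters such as < and & out of the XHTML document.

    In addition, an inline script can no longer be commented out to hide it from older browsers.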

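  • Scripts and style sheets must not be wrapped in <!-- comments -->.

    Ordinary comments are still supported in XHTML, but an XML parser is allowed to discard their contents. Embedded script or style content that contains < or & should instead be enclosed in a CDATA section, which uses the following syntax: <![CDATA[ content appears here ]]>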
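Several of the rules above (closed tags, proper nesting, quoted and written-out attribute values, and escaped ampersands) are well-formedness rules that XHTML inherits from XML, so any XML parser will reject markup that breaks them. The following sketch uses Python's standard xml.etree.ElementTree module to show this. The helper name is_well_formed and the sample fragments are assumptions made for illustration. The check does not validate against the XHTML DTD, so it will not catch uppercase tag names or a missing <head> or <body>.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if an XML parser accepts the fragment, else False."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Illustrative fragments: each rule is shown in its old HTML form and its XHTML form.
samples = {
    "unclosed tag": "<p>First line<br></p>",
    "closed tag": "<p>First line<br /></p>",
    "overlapping tags": "<i><b>Bolded and Italicized</i></b>",
    "nested tags": "<i><b>Bolded and Italicized</b></i>",
    "unquoted value": "<input type=radio />",
    "minimized attribute": '<input type="radio" checked />',
    "full attribute": '<input type="radio" checked="checked" name="AnyName" />',
    "bare ampersand": "<p>Fish & chips</p>",
    "escaped ampersand": "<p>Fish &amp; chips</p>",
}

results = {label: is_well_formed(markup) for label, markup in samples.items()}

for label, accepted in results.items():
    print(f"{label}: {'accepted' if accepted else 'rejected'}")
```

A Web browser reading HTML would quietly repair most of the rejected fragments, which is exactly the leniency that XHTML removes.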

By contrasting these simple but powerful rules with HTML, you can begin to see the advantages of XHTML. In addition, PCS (Personal Communication Service) devices also use XHTML. Because of the myriad of vendors, each with its own proprietary approach, the strict rules of XHTML and XML are vital. Without this standard, Web developers would have to create separate Web pages for each device. Fortunately, because of this standard, developers can create one or two pages for all devices. However, XHTML is still too bloated for many smaller PCS devices. Therefore, another option is required.
