XML: Why and how?
HTML: Simple but limited
HTML transformed the Internet from an obscure academic tool into the rich, flexible, and powerful World Wide Web that we know today. HTML has some huge advantages that made it ripe for this explosive growth. It is relatively simple to learn and use. With just a limited number of tags and a word processor, anyone can produce legible HTML pages within minutes. But the syntax is still powerful, despite being simple. Lists, links, graphics, and formatted text can all be produced with a set of easy-to-learn tags. And HTML, with a few minor exceptions, is a universally accepted language, common to all web pages and displayable by all browsers on all platforms. If you build a web page with well-formatted HTML, then you are guaranteed the widest possible audience across all platforms.
HTML's very success created a demand for ever larger and more sophisticated web applications. But HTML cannot cope with the increasing demands made on it.
HTML is too static and rigid for the fast-changing world of the Web. Its tags are hard-coded by committees and browser authors. It should be possible to add new tags without making them arbitrary or nonstandard.
HTML was designed for a fixed purpose - web pages. But the information these pages contain is valuable in many other formats, such as printed documents, manuals, financial data in databases and so on. Transforming HTML to or from these alternative storage media is costly and difficult.
Looking ahead, it would be a huge benefit to capture and organize the vast amount of information on the Web. But HTML documents are intrinsically unstructured: HTML tags describe how documents should be displayed, not how documents are structured. This means that simple text searches are all that is possible. Searching structured documents would magnify the usefulness of information on the Web manyfold.
Finally, HTML's linking mechanisms are very weak. Links break regularly, with no warning, and they are hard-coded into the document, making it necessary to use a tool to maintain them. And links are unidirectional - an unnecessary limitation on a pure hyperlinking system.
What is XML?
Aware of the problems facing HTML, the World Wide Web Consortium (W3C) ratified a standard for a new form of markup language, called XML. XML stands for Extensible Markup Language. It bears some superficial similarities to HTML, but is in fact a much more powerful concept.
XML in practice
Let's say that a travel agent wants to allow users to browse through lists of currently available holiday destinations. A typical section of HTML might look like this: -
<H1>Turkey</H1>
In XML, it might look like this: -
<LOCATIONS>
<COUNTRY><NAME>Turkey</NAME>
The XML document describes structure. Locations are divided into countries
and cities. And a city is made up of a one-line description and a longer
description. Notice that the XML document mentions nothing about display - that
is left to the style sheet. The style sheet author may decide, for the purposes
of the on-line version, to print it thus: -
TURKEY
Istanbul - The city of the sultans
Istanbul, gateway from west to east, has been the capital of three empires
throughout its long, troubled history.
Ankara - Heart of the new republic
The founders of the modern, secular Turkey chose Ankara as their new
capital.
An additional style sheet can be produced that allows the underlying XML
document to be printed in a brochure. Or the user may decide to just browse the
country and cities, with their short descriptions, ignoring the longer
descriptions. And the document could also include hotel information, prices,
availability, and so on. Each of these new fields would require a tag (not shown
above) defined in the DTD and a matching style sheet specification to state how
it should be displayed.
All of this power and flexibility is available because XML has split
structure from display.
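
The split between structure and display can be sketched in a few lines of Python using the standard library's xml.etree.ElementTree. This is only an illustration, not the style sheet mechanism itself: the two render functions stand in for two style sheets, and the tag names CITY, SHORTDESC and LONGDESC are assumptions, since the article's sample shows only LOCATIONS, COUNTRY and NAME.

```python
import xml.etree.ElementTree as ET

# Hypothetical completion of the travel document. Only LOCATIONS, COUNTRY and
# NAME come from the article; CITY, SHORTDESC and LONGDESC are assumed names.
TRAVEL_XML = """
<LOCATIONS>
  <COUNTRY>
    <NAME>Turkey</NAME>
    <CITY>
      <NAME>Istanbul</NAME>
      <SHORTDESC>The city of the sultans</SHORTDESC>
      <LONGDESC>Istanbul, gateway from west to east, has been the capital of three empires throughout its long, troubled history.</LONGDESC>
    </CITY>
    <CITY>
      <NAME>Ankara</NAME>
      <SHORTDESC>Heart of the new republic</SHORTDESC>
      <LONGDESC>The founders of the modern, secular Turkey chose Ankara as their new capital.</LONGDESC>
    </CITY>
  </COUNTRY>
</LOCATIONS>
""".strip()


def render_full(root):
    """Stand-in for the on-line style sheet: country, cities, both descriptions."""
    lines = []
    for country in root.findall("COUNTRY"):
        lines.append(country.findtext("NAME").upper())
        for city in country.findall("CITY"):
            lines.append(f"{city.findtext('NAME')} - {city.findtext('SHORTDESC')}")
            lines.append(city.findtext("LONGDESC"))
    return lines


def render_summary(root):
    """Stand-in for a browsing view: same data, long descriptions left out."""
    lines = []
    for country in root.findall("COUNTRY"):
        lines.append(country.findtext("NAME").upper())
        for city in country.findall("CITY"):
            lines.append(f"{city.findtext('NAME')} - {city.findtext('SHORTDESC')}")
    return lines


root = ET.fromstring(TRAVEL_XML)
full_view = render_full(root)
summary_view = render_summary(root)

print("\n".join(full_view))
print()
print("\n".join(summary_view))
```

Both views read the same unchanged document; only the rendering code differs, which is the point the travel example makes about style sheets.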
XML will greatly extend the hyperlinking mechanism familiar to all web
surfers. Links, defined with the XLL (Extensible Link Language), can be stored
and maintained independently of the documents in which they appear. They can be
bidirectional, allowing greater power to link common data within structured
documents. And they can have attributes that define what type of link each one
is. Some links will even make the link target document look like part of the
link source document. All of this extra flexibility and power will transform
the way documents are accessed and linked together across the Web.
The Document Object Model (DOM) is a standard object-based API that will
give scripting languages, among other things, dynamic access to XML document
content. The content of tags will be available for complex processing on the
client side.
XML will be used wherever data needs to be structured for presentation or
interchange, so it will greatly enhance e-commerce. XML is an excellent way to
define nonproprietary data structures, thus allowing for smooth interchange of
data between heterogeneous systems across the Web. For example, if all travel
agents defined travel packages in a standard way, using the same DTD, they would
be easily able to transfer data between themselves and airlines, hotels, and
customers.
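
As a rough sketch of such an interchange, the Python below uses the standard library's xml.dom.minidom, one implementation of the DOM API mentioned earlier. One party builds a travel package document and the other reads it back through DOM calls. The PACKAGE, DESTINATION, NIGHTS and PRICE tags, the currency attribute, and the sample values are all made up for illustration; a real exchange would follow whatever DTD the parties had agreed on.

```python
from xml.dom import minidom


def build_package(destination, nights, price, currency):
    """Sender side: build a package document and serialise it to text.

    The element and attribute names are hypothetical, not a real travel DTD.
    """
    doc = minidom.Document()
    package = doc.createElement("PACKAGE")
    doc.appendChild(package)

    for tag, value in (("DESTINATION", destination), ("NIGHTS", str(nights))):
        element = doc.createElement(tag)
        element.appendChild(doc.createTextNode(value))
        package.appendChild(element)

    price_element = doc.createElement("PRICE")
    price_element.setAttribute("currency", currency)
    price_element.appendChild(doc.createTextNode(str(price)))
    package.appendChild(price_element)

    return doc.toxml()


def read_package(xml_text):
    """Receiver side: parse the text and pull the fields out via the DOM."""
    doc = minidom.parseString(xml_text)

    def text_of(tag):
        # First element with this tag name, then its text node's content.
        return doc.getElementsByTagName(tag)[0].firstChild.data

    price_element = doc.getElementsByTagName("PRICE")[0]
    return {
        "destination": text_of("DESTINATION"),
        "nights": int(text_of("NIGHTS")),
        "price": float(text_of("PRICE")),
        "currency": price_element.getAttribute("currency"),
    }


# Made-up sample values, used only to show the round trip.
wire_format = build_package("Istanbul", 7, 499.0, "USD")
received = read_package(wire_format)

print(wire_format)
print(received)
```

Because both sides depend only on the agreed tag names, neither needs to know how the other stores the data internally.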
XML will enhance the power of the Web considerably. It will leverage the vast
store of information on the Web to improve searching and data interchange. And
it will help overcome some of the limitations currently faced by HTML.