Untangling the Web

2002-9-11 10:59:06 [Author] 畅享网

--SOAP uses XML as a simple and elegant solution that automates B2B transactions

By Greg Barish

Most of today's Web applications are built for human consumption. Because real people interact with these applications, information must be presented in a visually appealing way: users fill out HTML forms and receive static or dynamic HTML output in response. In recent years, however, more and more software agents - not people - have been interacting with these Web applications. For example, metacatalogs take a query entered through a single user interface and automatically run it against hundreds of existing online catalogs. The long-term view of Web-based B2B is based on such automation. In fact, it is likely that the network traffic generated by such automation will eventually dwarf the traffic generated by human interaction.

THE ORIGINS OF SOAP 
From IBM rejection to W3C recognition

SOAP was first proposed by Microsoft as a means for heterogeneous software objects to communicate over a network. The protocol's Microsoft origins may seem surprising considering that it is not tied directly to any Microsoft technology - rather, it is a proposal for an open standard. However, the truth is that the original 1998 proposal (which involved Microsoft, UserLand, and DevelopMentor Inc.) did emphasize an approach that favored what has become BizTalk - Microsoft's SOAP strategy. It was only after the input of IBM, which initially rejected it, that the proposal began to distance itself from its original Microsoft bent, evolving into something more open. Sun also initially rejected the proposal and only recently (June 2000) changed its tune, whispering support for the version that the W3C acknowledged in May of 2000. Several other B2B companies (Ariba, CommerceOne Corp, and Lotus among them) also supported the proposal submitted to the W3C.
 
While a nice visual interface is an asset when it comes to enabling humans to interact with machines, it is an unnecessary obstacle when machines communicate with each other. What B2B really needs is an easy way to integrate the back-end systems of participating organizations. And we're not just talking about a solution that involves each business maintaining multiple interfaces to its data. That's the way things work today and, to a large extent, visual interfaces have proved to be unwieldy for machine access. IT managers want a way to consolidate their data and functionality in one system that can be accessed over the Web by real people or automatically by software agents.

The Simple Object Access Protocol, better known as SOAP, is aimed squarely at this data consolidation problem. Recently acknowledged by the World Wide Web Consortium (W3C) as a submitted proposal, SOAP uses XML and HTTP to define a component interoperability standard on the Web. SOAP enables Web applications to communicate with each other in a flexible, descriptive manner while enjoying the built-in network optimization and security of an HTTP-based messaging protocol. SOAP's foundations come from attempts to establish an XML-based form of RPC as well as Microsoft's own efforts to push its DCOM technology beyond Windows.

SOAP increases the utility of Web applications by defining a standard for how information should be requested by remote components and how it should be described upon delivery. The key to achieving both of these goals is the use of XML to name not only the functions and parameters being requested, but also the data being returned.
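To make the idea of named requests and named results concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not a complete SOAP client: the method name GetPrice, the parameter sku, the result fields, and the namespace urn:example-catalog are all hypothetical. Only the envelope namespace is taken from SOAP 1.1.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace; the catalog namespace is a made-up example.
NS = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "cat": "urn:example-catalog",  # hypothetical application namespace
}


def build_request(method, params):
    """Wrap a named method call and its named parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{NS['soap']}}}Envelope")
    body = ET.SubElement(envelope, f"{{{NS['soap']}}}Body")
    call = ET.SubElement(body, f"{{{NS['cat']}}}{method}")
    for name, value in params.items():
        # Each parameter travels under its own element name.
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_response(xml_text, method):
    """Return the named fields of a <method>Response element as a dict."""
    envelope = ET.fromstring(xml_text)
    result = envelope.find(f"soap:Body/cat:{method}Response", NS)
    if result is None:
        raise ValueError(f"no {method}Response element in SOAP body")
    return {child.tag: child.text for child in result}


if __name__ == "__main__":
    # Hypothetical call: ask a seller for the price of one catalog item.
    print(build_request("GetPrice", {"sku": "A-100"}))
```

Because both the request and the reply carry element names, the receiving side can look up "sku" or "price" by name instead of relying on the position of a value in a page or a parameter list.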

Why SOAP?

As it exists today, Web-based distributed computing is not widely practical. IT managers have just two ways to go about enabling components to talk to each other over the Internet. One method is to use what HTTP provides, which means marshalling input and output as part of a POST or GET request/reply scenario. The other way is to use existing component technologies (integrating as necessary) between servers. In the latter scenario, objects communicate using a binary protocol over TCP/IP, but not over HTTP.

Let's take the HTTP-based solution first. Under this approach, components invoke functionality on other remote components by issuing POST or GET requests and processing associated HTML replies. However, this process is not general; it is inherently inflexible and, at times, can be just plain ugly. To understand why, let's consider an example.

Mix and Match

Suppose your company is trying to match sellers to buyers. You have established partnerships with several seller Web sites, each one different and each one providing access to its catalog via the Web. Now, suppose your company wants to integrate essentially all of these Web sites into one virtual catalog, so that when users query for some product, your system can match the query against those sellers that have the requested product. The problem is that the seller catalogs are huge and highly dynamic, and the sellers vary widely in how they store their data. Thus, downloading catalogs in their native format on a periodic basis is not always practical because (a) it is not always possible, (b) it may mean significant integration costs, and (c) it usually requires very large, redundant databases. Since each seller distributes its catalog via the Web already, it would be far less costly if your company could simply extract that data from those pages - possibly even extract it on the fly (per query).

However, there is no simple solution to this problem of extraction. Extraction implies that either your company develops technology that allows the data to be extracted (or "scraped") from the sellers' Web pages, or each seller provides an alternative, easy-to-parse interface to the data.
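The scraping option can be sketched in a few lines of Python with the standard html.parser module. This is only an illustration of why the approach is fragile: the page layout, the td element, and the class name "price" are assumptions about a hypothetical seller page, not anything a real site guarantees.

```python
from html.parser import HTMLParser


class PriceScraper(HTMLParser):
    """Collect the text of every <td class="price"> cell on a page."""

    def __init__(self):
        super().__init__()
        self.in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "td" and ("class", "price") in attrs:
            self.in_price = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_price = False

    def handle_data(self, data):
        if self.in_price and data.strip():
            self.prices.append(data.strip())


def scrape_prices(html):
    """Return the price strings found in a seller's page (hypothetical layout)."""
    scraper = PriceScraper()
    scraper.feed(html)
    scraper.close()
    return scraper.prices
```

The scraper depends entirely on presentation markup. If the seller redesigns the page and moves prices into a differently named element, the same code silently returns an empty list, which is the maintenance burden the text describes.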

Obviously, the root of the problem here is that the existing Web site data is prepared for human - not machine - consumption. Although useful data exists on Web pages, it is embedded among another type of data (HTML tags) that exists purely to help browsers produce a visual representation. Inflexibility is another problem with querying via HTTP. If the client or server wants to communicate more complex data types (such as a list of catalog items, each of which has a list of colors and/or features), some ad hoc method for encoding those data structures must be developed.
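As a rough illustration of what such an ad hoc encoding looks like, the following Python sketch flattens a list of catalog items into query-string parameters and back. The naming convention (item0.sku, item0.colors) and the "|" separator are invented for this example; every pair of partners would have to agree on some private convention like it.

```python
from urllib.parse import parse_qs, urlencode


def encode_items(items):
    """Flatten a list of items into a query string.

    Made-up convention: item<N>.sku holds the identifier and
    item<N>.colors holds the colors joined by '|'.
    """
    flat = {}
    for i, item in enumerate(items):
        flat[f"item{i}.sku"] = item["sku"]
        flat[f"item{i}.colors"] = "|".join(item["colors"])
    return urlencode(flat)


def decode_items(query):
    """Rebuild the list of items from a query string made by encode_items."""
    fields = parse_qs(query)  # blank values are dropped by default
    items = []
    i = 0
    while f"item{i}.sku" in fields:
        colors = fields.get(f"item{i}.colors", [""])[0]
        items.append({
            "sku": fields[f"item{i}.sku"][0],
            "colors": colors.split("|") if colors else [],
        })
        i += 1
    return items


if __name__ == "__main__":
    catalog = [
        {"sku": "A-100", "colors": ["red", "blue"]},
        {"sku": "B-200", "colors": ["green"]},
    ]
    print(encode_items(catalog))
```

The round trip works only while both sides share the convention, and it breaks as soon as a color name contains the "|" separator or an item gains a second nested list.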
