
XML Acceptor
XML as the API

May 1 1999
Rev. Jan 29 2002



Date         Description
Dec 13 1999  Added section "Scalability"
Jan 18 2000  Changed "Truly Separating Clients and Servers via Documents" section to better explain what is meant by 'XML is the API'.  Added Shadow Objects.
Aug 8 2002   Added Follow Up and SingleMethod rebuttal

Introduction

A simplistic observation of today's server designs is that most of them utilize XML in one way or another.   That's the reality of 1999 - XML is a widely accepted and sexy technology.   Add XML to your design and suddenly you have marketing's attention.

Although XML was designed as an interchange format for sending data between disparate systems across the Internet, XML is also useful within systems.  Consider sending payroll data to a payroll server.  In the "old" days (which are, in fact, not very old at all!) of distributed computing, IDL (Interface Definition Language) would specify, at a very detailed level, how payroll clients communicate with payroll servers.

Note:  The term IDL is used generically here; it does not refer specifically to CORBA, COM, DCOM, RPC, EJB, Java/RMI, etc.

IDL has several problems:

  1. IDL's scalar datatypes are limiting.  IDL was designed for passing small amounts of structured data from client to server.  Although IDL supports structs, arrays, and object references, passing data through individual method parameters is limiting from a design perspective and inefficient from a runtime perspective due to the additional round-trips.
  2. IDL is brittle.  If you need to evolve the data or methods, you must change the IDL.  IDL is not supposed to change.  When IDL changes, existing servers and clients must be updated.  Generally, if new methods are added to IDL, clients should still work without recompiling, although few vendors are willing to actually guarantee this.

Enter XML.  XML and IDL function well together.  The above problems can be solved with XML:

  1. Pass all data via strings containing XML.
  2. XML is inherently extensible.  Methods that accept XML can look at the top-level element name and its 'version' attribute.  In many cases DTDs can be modified without breaking existing code, as long as these modifications do not remove elements or attributes or change their semantics.
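
As a rough illustration of the second point, here is a minimal Python sketch.  The <UpdateVacation> document, its attributes, and the rule that a missing 'version' means version 1 are all assumptions made for the example, not part of any published DTD.  The function reads only the top-level element name and the 'version' attribute, so a newer client that adds an attribute does not break it.

```python
import xml.etree.ElementTree as ET

def describe(xml_in):
    """Return (top-level element name, version) for a request document."""
    root = ET.fromstring(xml_in)
    # Assumption for this sketch: no 'version' attribute means version "1".
    return root.tag, root.get("version", "1")

# Hypothetical request documents; the second adds an attribute the
# first version never had, and the code above simply ignores it.
old = '<UpdateVacation employee="42" days="3"/>'
new = '<UpdateVacation version="2" employee="42" days="3" reason="holiday"/>'

print(describe(old))
print(describe(new))
```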

The XML-Acceptor Pattern

The XML-Acceptor Pattern (see Design Patterns) involves combining IDL and XML.  The pattern is best described with an example.  The IDL consists of one method:

int Accept([in] String xml_in, [out] String xml_out);

This method follows the request-response paradigm popularized by HTTP.    Clients post XML to the server (aka "acceptor") and get XML back.  The server parses the xml_in string and routes it to the appropriate handler depending on what type of document it is.   For example, the document may update an employee's vacation time.  The acceptor responds with another XML document in xml_out -- this document contains status information, including any error messages.
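
To make the shape of the call concrete, here is a minimal Python sketch of an acceptor.  It mimics the IDL signature by returning an integer code together with the xml_out string.  The <UpdateVacation> and <Status> document types and the 0/1 return codes are assumptions invented for the example; a real handler would update the employee's record where the comment indicates.

```python
import xml.etree.ElementTree as ET

def accept(xml_in):
    """Mimic `int Accept(xml_in, xml_out)` by returning (code, xml_out)."""
    status = ET.Element("Status")           # hypothetical response document
    try:
        doc = ET.fromstring(xml_in)
    except ET.ParseError as exc:
        status.set("ok", "false")
        status.text = f"malformed request: {exc}"
        return 1, ET.tostring(status, encoding="unicode")

    if doc.tag == "UpdateVacation":         # hypothetical document type
        # A real handler would update the employee's vacation time here.
        status.set("ok", "true")
        status.set("employee", doc.get("employee", ""))
        return 0, ET.tostring(status, encoding="unicode")

    status.set("ok", "false")
    status.text = f"unknown document type: {doc.tag}"
    return 1, ET.tostring(status, encoding="unicode")

code, xml_out = accept('<UpdateVacation employee="42" days="3"/>')
print(code, xml_out)
```

Errors come back as documents too, so the caller handles success and failure the same way: parse xml_out and look at it.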

The benefits of this technique include:

  • Simplicity.  There is only one method.  New operations can be added to the acceptor simply by passing different XML documents to the method.
  • Extensibility.  When IDL is used in a traditional manner, client and server programmers can work in isolation, but the IDL must be defined.  We all know that IDL changes during development, which is a painful process.  Changes to IDL must be carefully coordinated.  The XML-Acceptor pattern allows methods, input, and output data to evolve without changing the IDL. 
  • Richness.  XML data can be (almost) arbitrarily complex.  XML can be used to pass simple datatypes or complex graphs of objects.  The complex data is passed "by value" whereas in the old days, complex data would probably be passed "by reference" as objects.  Passing data by value is usually faster than passing data by reference because round-trips are avoided to actually access the data.

Coarse-Grained Transactions

The ability to pass arbitrarily complex information in a single method call enables coarse-grained transactions.  A single method call can perform an entire business transaction.  In contrast, a transaction in a textbook object-oriented system requires several method calls ("set" this, "get" that).  Coarse-grained transactions reduce the number of round-trips.


Scalability

State introduces bottlenecks.  Bottlenecks decrease scalability.

Scalability is the degree to which a system can process multiple transactions at the same time.  The more transactions per second, the higher the scalability.  Another name for transactions per second is "throughput."  High scalability implies high throughput.

Scalability can also be a measure of the degree to which throughput increases as resources are added to a system.  Add more CPU.  Add more memory.  Add more disk.  Add another computer.  Did throughput increase?

Computers have a fixed amount of resources.  These resources include RAM, CPU cycles per second, and database locks.  Consuming resources for a long period of time decreases scalability because when one object consumes resources, there are fewer resources available for everyone else.  When objects wait for resources before they can perform useful work, that's called a bottleneck.  Traffic can't flow through the system unabated, creating the equivalent of rush hour in most of the world's cities.

Objects that retain state between method calls consume resources for the duration of a business transaction.  An object consumes at least RAM because it remains in memory.  The object could be "paged" to disk; that's good for scalability but it's yet another performance hit; anyway, one resource (memory) is just substituted for another (disk).  

If a significant amount of time passes between the beginning and the end of a transaction, the bottleneck can be severe.  In a web scenario, a severe bottleneck could force a customer to wait several minutes for her order to be approved.  A bottleneck in the order-submittal module can have a detrimental effect on the bottom line, because shoppers will get frustrated and leave, and might not return to the website again.

Coarse-grained transactions increase throughput by decreasing state.  Request documents contain all the information that an acceptor needs to perform a transaction.  The acceptor doesn't have to retain state between calls.  The acceptor might stay in memory after processing a request, but the resources it consumes are minimized because it doesn't have to keep track of prior calls.  Consuming fewer resources results in better scalability.

In the ideal scenario, state is eliminated entirely.  However, that may cause an undue burden on the caller.  The caller is essentially caching data for the XML acceptor.  Client caching is impractical when the cached data set is large.  However, a balance between client and server needs can usually be found.


XML-RPC and SOAP

I frequently receive questions regarding XML-RPC and SOAP and whether they replace XML-Acceptor.  The answer is no.  XML-Acceptor is a pattern which can be implemented in many ways:  within an ORB, as a DCOM object, as an RMI server, as an Enterprise Java Bean, with XML-RPC, etc.

XML-RPC and SOAP both implement the XML-Acceptor pattern.  Likewise, XML-RPC and SOAP can be used to implement the pattern.  The RPC method would accept and return a string containing an XML document.  XML-RPC and SOAP documents merely wrap yet another document which is sent to the acceptor - it's another rather cumbersome step to wrap a document inside another document (by converting it to a string as PCDATA), but it works.
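
The wrapping step can be sketched in a few lines of Python.  The <Envelope> and <AcceptRequest> element names are placeholders invented for the example; real SOAP and XML-RPC envelopes use their own element names and namespaces.  The point is only that the inner document travels as escaped character data and has to be parsed a second time on arrival.

```python
import xml.etree.ElementTree as ET

inner = '<AddUser userName="Joe"/>'

# Hypothetical envelope, standing in for a SOAP or XML-RPC message.
envelope = ET.Element("Envelope")
param = ET.SubElement(envelope, "AcceptRequest")
param.text = inner      # stored as text; the serializer escapes the markup

wire = ET.tostring(envelope, encoding="unicode")
print(wire)

# The receiver parses the envelope, then parses the payload again.
payload = ET.fromstring(wire).find("AcceptRequest").text
request = ET.fromstring(payload)
print(request.tag, request.get("userName"))
```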

XML-RPC and SOAP use HTTP which is a good thing.  HTTP scales well because it is stateless.  Round-robin HTTP load balancers are common and well understood.  Coarse-grained transactions work well in a stateless environment.  In fact, statelessness requires coarse-grained transactions because objects have no memory between method calls.  In order to achieve maximum scalability, as much context should be passed in XML documents as possible and should not be stored on the server.

XML-RPC and SOAP let developers write server-side code without realizing that XML is being used at all.  I think this is undesirable.  XML can be leveraged in many exciting ways and hiding XML from developers is not a good thing.  XML can express graph-oriented data structures, unlike method parameters.  Certainly, XML can be passed to methods as strings, but when using XML-RPC and SOAP, that's likely to be the exception rather than the rule.


Truly Separating Clients and Servers via Documents

To be blunt, RPC and its offspring such as SOAP are old-school.  Replacing closed binary datagrams with open XML doesn't buy very much.  On the other hand, documents bring a lot to the table.

There's no better example of the document paradigm than HTML.  After all, HTML is XML's predecessor.  The browser is the first widely used document-based application.  Of course, you don't know you're using a document-based application right now because the implementation is well hidden from the end-user.  A browser presents a view on top of a document.

Some programs don't display HTML at all -- they process HTML and put it into a database for fast indexed retrieval.  Web Crawlers represent the second widely-used document-based application.

In the future, most applications will be document-based.  For example, in a business-to-business scenario, a computer sends a request for quotation (RFQ) to a  server.  The RFQ is an XML document.  The quote server receives the request and returns the quote (again, in XML).  Thus, XML is the API.  There are no methods.  There are just documents.  There are just nodes in a tree that represent some sort of data that can be processed.

Fully embracing XML as an API has several benefits.  XML documents are a lot easier to understand than a handful of objects, methods, and properties.  XML documents can be stored in flat files, allowing client and server code to be decoupled in a way that was never before possible.  Is the server not finished yet?  Is it down?  Is the developer working at home?  No matter -- the developer who is writing client-side code just needs to be able to read an XML document from a URL as if it was returned by the server.  If a document contains everything that the client logic needs (and it will if XML-Acceptor is followed correctly), client and server logic can be developed independently of each other.  I believe this is the most compelling argument for XML-Acceptor and is especially relevant on the Web, which is, after all, just another distributed network.

XML Acceptor has profound consequences on software testing.  Test personnel can craft request documents via a text editor and store them in the file system.  These documents can then be sent to XMLAcceptors with a minimum of coding.  The response documents can be compared against files containing expected results.  Writing an automated test harness to do this is trivial.  XML Acceptor greatly simplifies the traditional way of hand-coding test suites.  It also reduces the required skill level for testers which means more testers.  More testers means better code.
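
A harness of this kind might look like the following Python sketch.  The accept function is a stand-in acceptor written for the example, and the file names are arbitrary.  Both documents are canonicalized before comparison (ET.canonicalize, available in Python 3.8 and later) so that attribute order and insignificant whitespace do not cause false failures.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

def accept(xml_in):
    # Stand-in acceptor for the sketch: echoes the user name back.
    doc = ET.fromstring(xml_in)
    out = ET.Element("Status", ok="true", userName=doc.get("userName", ""))
    return ET.tostring(out, encoding="unicode")

def run_case(request_file, expected_file):
    """Send one stored request and compare with the stored response."""
    actual = accept(request_file.read_text())
    expected = expected_file.read_text()
    # Canonical form makes the comparison independent of attribute order.
    return (ET.canonicalize(actual, strip_text=True)
            == ET.canonicalize(expected, strip_text=True))

with tempfile.TemporaryDirectory() as tmp:
    # In practice a tester would write these two files in a text editor.
    req = Path(tmp, "add_user.request.xml")
    exp = Path(tmp, "add_user.expected.xml")
    req.write_text('<AddUser userName="Joe"/>')
    exp.write_text('<Status userName="Joe" ok="true"/>')
    passed = run_case(req, exp)

print("PASS" if passed else "FAIL")
```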


Message Dispatch

An acceptor might perform multiple actions.  This corresponds to a component that has multiple methods.  Each action is described by an XML document.  Each action document could even have its own DTD.  Making different actions conform to a single DTD would be limiting and DTD validation would not be very effective.  For example, you might send an <AddUser> document to create a new user:

<AddUser userName="Joe"/>

And you might send a <GetUser> document to retrieve the user's details:

<GetUser userName="Joe"/>

The doctype might even be used to route different documents to different server components.  Other parts of the document can be used to route documents.

Similarly, the Accept() method can return any sort of document.  In an error case, the acceptor might return an <Error> document, and a different type of document upon success.
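
Routing needs only the name of the top-level element, so the acceptor does not have to build a whole tree before deciding where a document goes.  Here is a minimal Python sketch using the standard library's Expat binding.  The helper name top_level_name and the private exception are made up for the example; the handler raises as soon as it sees the first start tag, and the exception propagates out of Parse().

```python
import xml.parsers.expat

class _Found(Exception):
    """Raised by the handler to abandon the parse early."""

def top_level_name(xml_in):
    """Return the name of the first element in the document."""
    found = []

    def on_start(name, attrs):
        found.append(name)
        raise _Found        # no further handlers are called after this

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    try:
        parser.Parse(xml_in, True)
    except _Found:
        pass
    return found[0]

print(top_level_name('<AddUser userName="Joe"/>'))
```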


The XML Bus

An acceptor that passes messages to other objects follows the bus paradigm.   Clients post messages to the bus which routes the messages to the appropriate components.  A bus architecture decouples components in a complex system because the components only communicate through the bus.  Components do not know about each other -- they only know about XML documents.

A bus can be used with event and store-and-forward patterns to further decrease coupling and increase scalability.  XML-Acceptor greatly simplifies the implementation of a bus.  There is only one method, Accept, which accepts a known number of parameters.  With conventional object-oriented systems, the proliferation of methods having different signatures makes building a bus a daunting task.
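
A bus along these lines can be sketched in Python as follows.  The Bus class, its register method, and the in-memory user table are all invented for the example.  Every component has the same signature (XML string in, XML string out), which is what keeps the bus itself so small.

```python
import xml.etree.ElementTree as ET

class Bus:
    """Routes each document to a component by top-level element name."""

    def __init__(self):
        self._routes = {}

    def register(self, doc_type, component):
        self._routes[doc_type] = component

    def accept(self, xml_in):
        component = self._routes.get(ET.fromstring(xml_in).tag)
        if component is None:
            return error("no component registered for this document")
        return component(xml_in)

def error(message):
    err = ET.Element("Error")
    err.text = message
    return ET.tostring(err, encoding="unicode")

users = {}   # stand-in for a user database

def add_user(xml_in):
    users[ET.fromstring(xml_in).get("userName")] = {}
    return "<Ok/>"

def get_user(xml_in):
    name = ET.fromstring(xml_in).get("userName")
    if name not in users:
        return error("unknown user")
    return ET.tostring(ET.Element("User", userName=name), encoding="unicode")

bus = Bus()
bus.register("AddUser", add_user)
bus.register("GetUser", get_user)

print(bus.accept('<AddUser userName="Joe"/>'))
print(bus.accept('<GetUser userName="Joe"/>'))
```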


Authentication and Access Control

As with all forms of RPC, authentication and access control represent a major hurdle for XML-Acceptor implementations.  Unauthorized access to an acceptor can result in a breach of sensitive information.

The acceptor could utilize the underlying implementation's infrastructure, such as DCOM's NT-user based security.  However, using the security features of the RPC mechanism or operating system may be unfeasible when the user base is large, as would be the case if the acceptor was invoked over the Internet via HTTP.

Here is an overview of a build-it-yourself approach to authentication and authorization:

  1. Send a <Login> document to the acceptor.  The document contains the login credentials, encrypted with a public key and then perhaps converted to base64.
  2. The acceptor validates the credentials and generates and returns an access token.
  3. Subsequent messages require the access token.  The access token is used to validate the caller and access control lists can specify which actions the caller is authorized to perform.  For example, the access control list consists of a list of doctypes that the caller can send to the Accept method.
  4. Access tokens time out explicitly and after a period of inactivity, not unlike web server sessions.

The Achilles' heel of this simple scheme is that access tokens are passed in clear text.  However, access tokens could be encrypted using a public-key mechanism and then converted to base64.  Secondly, access tokens could be guessed if the token generator was predictable -- access tokens need to be fairly random large numbers.   Furthermore, server code is responsible for enforcing access control lists -- this represents the most vulnerable point of attack due to programmer error and oversight.
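
The token bookkeeping in steps 2 through 4 might be sketched in Python like this.  Credential checking and encryption are left out entirely, and the timeout value, the ACL contents, and the function names are assumptions made for the example.  Tokens come from the standard library's secrets module so that they are not guessable.

```python
import secrets
import time

IDLE_TIMEOUT = 15 * 60   # seconds; an arbitrary value for this sketch

# Hypothetical access control list: user -> doctypes that user may send.
ACL = {"joe": {"GetUser"}, "admin": {"GetUser", "AddUser"}}

_tokens = {}   # token -> (user, time of last use)

def login(user, now=None):
    """Issue an unpredictable token (credential checking is omitted)."""
    token = secrets.token_hex(32)
    _tokens[token] = (user, time.time() if now is None else now)
    return token

def authorize(token, doctype, now=None):
    """Return True if the token is live and the doctype is allowed."""
    now = time.time() if now is None else now
    entry = _tokens.get(token)
    if entry is None:
        return False
    user, last_used = entry
    if now - last_used > IDLE_TIMEOUT:
        del _tokens[token]            # timed out after inactivity
        return False
    _tokens[token] = (user, now)      # reset the inactivity clock
    return doctype in ACL.get(user, set())

def logout(token):
    """Explicitly expire a token."""
    _tokens.pop(token, None)
```

The acceptor would call authorize with the token and the top-level element name of each incoming document before routing it.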

Passing XML documents over an unsecured channel represents another point of attack.   Line sniffers could easily retrieve sensitive information.  SSL seems to be a viable solution to this problem.

Probably the biggest issue of all is building, maintaining, and providing an API to a user database.  This is a necessary evil if security is a major concern.

Of course, the user management API might not be an API per se, but instead might be a set of XML documents that are sent to an acceptor.  In this case you would have to create a well-known user, such as "guest" that can create new users.  The guest user would have to be created via a mechanism that bypasses the acceptor because this is a chicken and egg problem.  How  could you create a user if the user database was empty?  If the user database was empty you wouldn't be able to authenticate with the acceptor!

Better security has run-time consequences.  Storing access tokens in a centralized server reduces scalability.  The token database becomes a bottleneck.  Also, access control lists should be cached in order to deliver decent performance.  All of this extra bookkeeping tremendously increases the development effort.


Performance Considerations

Performance is generally acceptable as long as the size of the documents remains small compared to the amount of free memory available; otherwise, a SAX-style parser must be used.  See the next section regarding batch processing.


Batch vs Transactional Processing

Like traditional RPC, XML-Acceptor doesn't work in scenarios that would result in a large number of server round-trips.  For example, consider a typical Perl script in which each line in a comma-separated file is transformed.  This is called a 'batch' process as opposed to a 'transactional' one.  Calling a remote Acceptor object once for each line in the file would result in horrible performance.  The exception, of course, is when an Acceptor object is invoked in process.  A better solution to batch problems is to ship the entire file to the server, process it, and send the file back to the client.

XML-Acceptor doesn't help much with batch processes although the file could theoretically be sent over via HTTP.  However, sending the file back generally requires an asynchronous mechanism, and HTTP is not a good fit.  I personally have used both FTP and SSH/SFTP as a bi-directional transfer mechanism with 'listeners' on each end looking for files to arrive in an inbox.  SMTP and POP3/IMAP could also be used, as well as some other sort of message queuing mechanism.  Either way you go, it's a tedious job.  This is one area where SOAP and XML-RPC show their immaturity.  On the other hand, Microsoft's BizTalk server handles asynchronous messaging quite well and is very handy for batch processing.


Lessons Learned

XML-Acceptor is not a magic recipe.  The real work is in defining the business transactions in XML.  The response DTD should be designed early to include error messages.

Large amounts of data are a big problem.  Large XML documents are typically returned by the acceptor in response to a query.  Passing a large string back to the client requires significant bandwidth.  A streaming solution is required in order for client software to be able to display portions of the XML document as soon as it arrives -- much like the way browsers can display parts of an HTML document before all of it has been received.   Event-based parsers are best suited to this sort of operation because DOMs need the entire XML document to be loaded in memory.
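
Here is a minimal Python sketch of the client side, using the standard library's incremental XMLPullParser.  The chunks list stands in for data arriving from the network, and the <Users> and <User> element names are assumptions for the example.  Each <User> is handled as soon as the parser reports it and is then cleared, so the whole response never has to sit in memory.

```python
import xml.etree.ElementTree as ET

# Stand-in for a response arriving in pieces; note the split mid-element.
chunks = ['<Users><User userName="Joe"/>', '<User userName="T', 'om"/></Users>']

parser = ET.XMLPullParser(events=("end",))
seen = []

def drain():
    """Handle every element the parser has finished so far."""
    for _event, elem in parser.read_events():
        if elem.tag == "User":
            seen.append(elem.get("userName"))   # e.g. display this row now
            elem.clear()                        # release what we no longer need

for chunk in chunks:
    parser.feed(chunk)
    drain()

parser.close()   # flush anything the parser was still holding back
drain()

print(seen)
```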

XML does not yet have a standard query language.  Queries are a common operation.  A DTD can be created to describe a query.  XPath could be used to describe queries.  The attractiveness of XPath is that existing designs can be leveraged.  However, generic query languages are usually too "big" to process and clients must be limited to a small set of query functionality expressed as custom DTDs.
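
As one possible shape for such a query document, here is a small Python sketch.  The <Query> element, its 'select' attribute, and the sample data are invented for the example.  Python's ElementTree implements only a limited subset of XPath, which happens to fit the idea of restricting clients to a small set of query functionality.

```python
import xml.etree.ElementTree as ET

# Stand-in for data held by the server.
DATA = ET.fromstring(
    '<Users>'
    '<User userName="Joe" dept="payroll"/>'
    '<User userName="Tom" dept="sales"/>'
    '</Users>'
)

def run_query(xml_in):
    """Evaluate the path in a hypothetical <Query select="..."/> document."""
    query = ET.fromstring(xml_in)
    result = ET.Element("Result")
    result.extend(DATA.findall(query.get("select")))
    return ET.tostring(result, encoding="unicode")

out = run_query('<Query select="./User[@dept=\'payroll\']"/>')
print(out)
```

A production acceptor would want to check the path expression against a list of permitted queries rather than evaluate whatever the client sends.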

Clients should identify their locale when establishing a connection to the acceptor.  This allows error messages to be generated in the language of the client.  The locale and other basic "handshaking" data can be passed via XML (of course!).

Shadow Objects.  Programmers frequently re-represent DOM trees in programming language-specific data structures known as "shadow" objects.  Shadow objects are representations of the document using a data structure that is specific to the application.  Shadow objects shield programming logic from changes to XML documents and allow applications to support multiple document formats, at the expense of using CPU and memory to transform the DOM into another data structure.  However, retrieving data from shadow objects is significantly faster than accessing it via the DOM.  If you're starting an XML-based application from scratch, you'll have to decide when to store data as DOM trees and when to use some other data structure.  There's no easy answer.  Spending a lot of time on this issue is generally not recommended unless you expect the document format to change or be versioned.  Shadow objects are a good idea in Internet and Extranet scenarios.
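
A shadow object can be as simple as the following Python sketch.  The Employee class and both document layouts are hypothetical.  Two versions of the document are mapped onto the same structure, so the business logic downstream never sees which format arrived.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

@dataclass
class Employee:              # hypothetical shadow object
    name: str
    vacation_days: int

def from_v1(doc):
    # Assumed version 1 layout: data carried in attributes.
    return Employee(doc.get("name"), int(doc.get("vacationDays")))

def from_v2(doc):
    # Assumed version 2 layout: data carried in child elements.
    return Employee(doc.findtext("Name"), int(doc.findtext("VacationDays")))

def load(xml_in):
    """Build the shadow object from whichever document version arrives."""
    doc = ET.fromstring(xml_in)
    convert = from_v2 if doc.get("version") == "2" else from_v1
    return convert(doc)

v1 = '<Employee name="Joe" vacationDays="3"/>'
v2 = ('<Employee version="2"><Name>Joe</Name>'
      '<VacationDays>3</VacationDays></Employee>')

print(load(v1))
print(load(v2))
```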

Finally, persisting XML data is a tedious process.  There are very few tools to assist with querying and updating relational data.  If you can forgo using a relational database, check out eXcelon.


Critique

XML-Acceptor is not without its critics.

  • "Since the IDL has only one method, the IDL is not self-documenting.  Additional documentation must be provided, such as the DTDs of the input and output documents."
    • Rebuttal:  This is very true but what is self-documenting IDL?  Additional documentation is always needed.  I believe that XML documents are much easier to understand, more manageable, and more amenable to change than a collection of interfaces, objects, methods, and properties.
  • "XML documents are not type-safe"
    • Rebuttal:  This is true because DTDs do not describe datatypes.  When XML-Schema is completed, this problem will disappear.  However, today's DTD validation provides a better-than-nothing sanity check of documents.  DTD validation validates the structure of the document which in most cases is good enough.  Note that if DTD validation is desirable, you can't use XML namespaces until XML-Schema has been incorporated into popular parsers.
  • "XML is not an efficient wire format.  Sending XML data across the wire will degrade performance."
    • Rebuttal:  XML is not the most efficient wire format.  In addition to network latency issues, clients must spend time building up XML and servers must spend time parsing it.  However, XML is easy to build, easy to parse, and, best of all, it's an open and language-neutral format.  XML messages can be extended easily compared with proprietary compressed data formats. 
  • "XML must be parsed before the acceptor can do anything.  This degrades performance."
    • Rebuttal:  That's what they said about SQL, isn't it?

      Event-based parsers (such as Expat) can be used to parse XML very quickly.  They can also be used to parse portions of documents.  For example, the acceptor can parse just the doctype (the first node) to decide how to route the XML document to the appropriate server component.

      The XML-Acceptor pattern uses coarse-grained transactions.  Each call to the Accept() method results in the server performing more work, such as database queries and updates, compared to traditional remote procedure calls.  Each input XML document describes a complete business transaction.  XML-Acceptor can sometimes result in better performance because fewer method calls means fewer round trips and resources are tied up for a shorter period of time.

      On the other hand, XML is not for everything.  In-process method calls are much slower when XML has to be generated and parsed.  XML is not always a good choice for in-process objects.  XML-Acceptor is meant to be used in the same cases when CORBA, DCOM, or RPC would be used.

      I have used XMLAcceptor extensively for in-process objects.  I have reaped many rewards from doing so.  Performance generally has not been a problem.  I use C++, and I can't vouch for other languages like Java.

      However, there are issues with Java and other environments that use lazy garbage collection.  The DOM causes the VM to allocate thousands of little objects, which must eventually be garbage collected.  A general problem with the Java garbage collector is that it's slow and unpredictable.  Java programmers, if they want to write fast code, must reuse objects as much as possible and create new objects sparingly.  The DOM is the worst case scenario for the Java garbage collector.  C++, for all its deficiencies, does not suffer this performance penalty.
  • "A system filled with classes with only one method with the same signature, isn't that scary? A typical code snippet becomes like this:

    boo = foo.doWork(null);
    bar = foo.doWork(boo);
    bar.doWork(foo);

    and every stack trace consist of many doWork(). Wouldn't it be a hell to debug or maintain?"
    • Rebuttal:
      In practice, it isn't. XML Acceptor results in highly modular and reusable code. XML Acceptor forces developers to write components that do one thing and one thing only. This is a goal of object oriented systems but requires discipline to achieve in practice. XML Acceptor forces the issue.

      Of course, XML Acceptors can be implemented using modern programming languages. With the source code, one can "step" inside an XML Acceptor and see the implementation. An XML Acceptor follows the Facade Pattern. The interface looks simple but behind the "wall" there is some complex stuff going on.

      The top-level document node (e.g., the DOCTYPE) is analogous to a method. With XML Acceptor, methods don't matter. Only documents and objects matter.

      A single object can support multiple documents. Generally the "other" documents are configuration documents which are then followed by a <doWork/> document. This means objects are not free-threaded. In fact this is where J2EE and COM+ are headed. Systems generally scale better when multiple instances are used rather than using the same instance in multiple threads due to synchronization issues.

Conclusion

XML-based APIs provide unprecedented flexibility and simplicity.  A single XML message can replace hundreds of method calls.  XML message routing can be implemented via COM, HTTP, CORBA, SOAP, RMI, XML-RPC, etc.

In order to use XML-Acceptor successfully, documents must be designed with care.  The DTD is the interface.  Just like IDL, DTDs must be subject to the design/review process.

I have designed and built several XML-based servers using XML-Acceptor.  XML-Acceptor enabled rapid development and resulted in fewer bugs and fewer development headaches compared to coding against traditional Java and C++ interfaces.


Follow Up (2.75 years later)

It has been mentioned that IDL supports arbitrary complexity of input and output parameters via the Any and Variant datatypes.  Therefore, XML provides no value over and above existing capabilities.

That is true.  In fact, you can do anything with a Turing machine so in theory object oriented programming really isn't needed either.  The question is:  is it harder or easier to use the right tool for the job?

The COM interface I use actually accepts and returns a Variant. This has proven to be very flexible as it allows the caller to pass in a string, a stream (which I have never actually done), or a MSXML DOM pointer (node or element). This has resulted in an extremely flexible and usable interface.

I no longer agree with passing traditional datatypes to business logic, either local or remote. I stand behind my claim that XML allows far more richness and usability even compared to Any or Variant (including arrays). Defining classes and structures in IDL is cumbersome. Writing code to read and populate arrays is cumbersome. Using XSLT, XPath, and the DOM is actually quite comfortable.

One benefit of XML is that it substantially simplifies the testing process.  This has been proven in practice.

If the data is just data, pass it as such. If the data are objects, pass them as explicit IDL objects. Notice that XML Acceptor does not pass objects (besides possibly a DOM) because XML cannot encode object references.

Although this makes the hair stand up on proponents of object-oriented programming, in practice it actually isn't so bad. Passing data by value, rather than by reference, is what the web is based on. One can hardly argue with cgi-bin scripts, servlets, SOAP, and other uses of Representational State Transfer (REST). Pass-by-value semantics eliminate excessive round-trips.

XML Acceptor has proven itself by scaling up to arbitrarily complex tasks. A workflow engine has been built in which each component of the workflow is an XML Acceptor. The workflow engine itself is an XML Acceptor.

The workflow engine is similar to Apache Ant.   There is currently standards work going on in the area of orchestrating web services. My workflow engine can be considered as an early implementation of these efforts which are, after all, implementations of XML Acceptor. I regret that I am unable to distribute the engine and sample components as Open Source.

The components in the workflow "talk" to each other in a decoupled fashion. For example, a component emits relational data into an XML node and another component transforms this data into HTML via XSLT. Each component can read, write, delete, and create any nodes in the document. I never imagined the power of this implementation when I initially authored this paper.
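
The "emit and consume" style can be sketched in Python as below.  This is not my workflow engine; the component names, the <Job>, <Rows>, and <Html> nodes, and the sample data are invented for the example, and plain Python stands in for the XSLT step.  Each component receives the shared document, adds or reads nodes, and hands it on.

```python
import xml.etree.ElementTree as ET

def emit_rows(doc):
    # Hypothetical component: emits relational-style rows into a <Rows> node.
    rows = ET.SubElement(doc, "Rows")
    for name, days in [("Joe", "3"), ("Tom", "5")]:
        ET.SubElement(rows, "Row", name=name, days=days)
    return doc

def render_html(doc):
    # Consumes <Rows> and adds an <Html> node (plain Python, not XSLT).
    table = ET.SubElement(ET.SubElement(doc, "Html"), "table")
    for row in doc.find("Rows"):
        tr = ET.SubElement(table, "tr")
        for key in ("name", "days"):
            ET.SubElement(tr, "td").text = row.get(key)
    return doc

def run_workflow(xml_in, components):
    """Pass one document through each component in turn."""
    doc = ET.fromstring(xml_in)
    for component in components:
        doc = component(doc)
    return ET.tostring(doc, encoding="unicode")

out = run_workflow("<Job/>", [emit_rows, render_html])
print(out)
```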

The downside to this "emit and consume" approach is that the XML can get pretty hairy. Documentation becomes of utmost importance to make sure that components don't step on each other's toes. The nice thing about IDL is that it's rigid and therefore the data can't get out of control as long as you avoid Variant and Any.

XML Acceptor is not for everything. But it is my feeling that when all business logic is implemented using XML Acceptor, you can do some mind-blowing stuff (as I have).  For example, since all messages are XML, it is trivial to pass messages bidirectionally between components via a message queue.  This is a lot faster than using an application server which introduces a great deal of overhead.

However, the lack of compile-time interface enforcement and the overhead of parsing and validating XML really turn some developers off. It is not possible for XML Acceptor to take hold in an organization without evangelism from someone of considerable influence. By the way, passing Variants and Anys has the same compile-time enforcement problem. I hope that some day computers will be fast enough for most of the negatives of XML Acceptor to simply be irrelevant. In my applications they are. Believe me, I have spent enough time worrying about them.


(c) 2001 XMLEverywhere.com