WORLD WIDE WEB JOURNAL

XML
Principles, Tools, and Techniques

O’REILLY
`
`
`
`IBM-1012
`
`Page 1 of 17
`
`
`

`

`WORLD WIDE WEB JOURNAL
XML: PRINCIPLES, TOOLS, AND TECHNIQUES Volume 2, Issue 4, Fall 1997
`
`Publisher: Dale Dougherty
`
Guest Editor: Dan Connolly
`
`Series Editor: Rohit Khare
`
`Managing Editor: Donna Woonteiler
News Editor: D.C. Denison

Production Editor: Nancy Crumpton

Technical Illustrator: Robert Romano

Software Tools Specialist: Mike Sierra
`
`Quality Assurance: Ellie Fountain Maden
`Cover Design: Hanna Dyer
`
`Text Design: Nancy Priest, Marcia Ciro
`
`Subscription Administrator: Marianne Cooke
`Photos: Flint Born
`
ISBN: 1-56592-349-9
`
`The individual contributions are copyrighted by the authors or their respective employers. The print
`compilation is Copyright © 1997 O’Reilly & Associates, Inc. All rights reserved. Printed in the United
`States of America.
`
Many of the designations used by manufacturers and sellers to distinguish their products are claimed
as trademarks. Where those designations appear in this book, and O’Reilly & Associates, Inc. was aware
of a trademark claim, the designations have been printed in caps or initial caps.
`
`While every precaution has been taken in the preparation of this book, the publisher assumes no
`responsibility for errors or omissions, or for damages resulting from the use of the information
`contained herein.
`
This book is printed on acid-free paper with 85% recycled content, 15% post-consumer
waste. O’Reilly & Associates is committed to using paper with the highest recycled content
available consistent with high quality.
ISSN: 1085-2301
`
`
`

`

`
Architecture
Dan Connolly
Domain Leader
connolly@w3.org
Arnaud Le Hors
lehors@w3.org
Hakon Lie
howcome@w3.org
Jim Gettys
jg@w3.org
Chris Lilley
chris@w3.org
Philipp Hoschka
hoschka@w3.org
Masayasu "Mimasa" Ishikawa
mimasa@w3.org
Youichirou Koga
yvkoga@w3.org
Dave Raggett
dsr@w3.org
Yves Lafon
lafon@w3.org
Irène Vatton
vatton@w3.org
Ora Lassila
lassila@w3.org
Henrik Frystyk Nielsen
frystyk@w3.org
Daniel Veillard
veillard@w3.org
`
`W3C Administration
`Jean-Francois Abramatic
`W3C Chairman and Associate
`Director, MIT Laboratory for
`Computer Science
`jfa@w3.org
`Tim Berners-Lee
Director of the W3C
`timbl@w3.org
`Vincent Quint
`Deputy Director for Europe
`quint@w3.org
`Nobuo Saito
W3C Associate Chairman
and Dean, Keio University
nobuo.saito@w3.org
Alan Kotok
W3C Associate Chairman
`kotok@w3.org
`Tatsuya Hagino
`Deputy Director for Asia
`hagino@w3.org
`
User Interface
Vincent Quint
Domain Leader
quint@w3.org
Bert Bos
bert@w3.org
Ramzi Guetari
guetari@w3.org

José Kahan
`kahan@w3.org
`Sally Khudairi
`khudairi@w3.org
`Stephan Montigaud
`montigaud@w3.org
Gerald Oskoboiny
gerald@w3.org
Luc Ottavj
ottavj@w3.org
Pierre Fillault
fillault@w3.org
Takeshi "Yamachan" Yamane
yamachan@w3.org

Administrative Support
Pamela Ahern
pam@w3.org
Susan Hardy
susan@w3.org
Marie-Line Ramfos
ramfos@w3.org
Josiane Roberts
roberts@inria.fr
`
`Nancy Ryan
`ryan@w3.org
`Yukari Mitsuhashi
`yukari@w3.org
`
`
`Technology and Society
`Jim Miller
`Domain Leader
jmiller@w3.org
Eui-Suk Chung
euisuk@w3.org
Daniel Dardailler
danield@w3.org
Philip DesAutels
philipd@w3.org
Josef Dietl
jdietl@w3.org
Joseph Reagle
reagle@w3.org
Ralph Swick
swick@w3.org

Cross Areas and
Technical Support
Janet Bertot
bertot@w3.org
Stephane Boyera
boyera@w3.org
Daichi Funato
daichi@w3.org
`Tom Greene
`tjg@w3.org
`
`
`
`
`

`

`Consulting, Inc. Poet Soft
`
`Primrose
`Pretty G
`
`
`
`
ABN AMRO Bank
Access Company Limited
Adobe Systems Inc.
Aérospatiale
AGFSI
Agfa Division, Bayer Corp.
Agranat Systems, Inc.
Alcatel Alsthom Recherche
Alfa-Omega Foundation
Alis Technologies, Inc.
America Online, Inc.
American International Group Data
Center, Inc. (AIG)
`American Internet Corporation
`Apple Computer, Inc.
`ArborText, Inc.
`Architecture Projects
`Management Ltd.
`ArrowPoint Communications
`Art Technology Group
`Asymetrix Corporation
`AT&T
`
`Attachmate Corporation
BackWeb Technologies, Inc.
BELGACOM
Bellcore
Bitstream, Inc.
British Telecommunications
Laboratories
Bull S.A.
`Canal +
`Canon, Inc.
`Cap Gemini Innovation
`Center for Democracy
`and Technology
`Center for Mathematics and
`Computer Science (CWI)
`CERN
`CIRAD
`
CNET-The Computer Network
CNR-Istituto Elaborazione
dell’Informazione
CNRS
Commissariat à l’Énergie
Atomique (CEA)
`CompuServe, Inc.
`Computer Answer Line
`Corporation for National Research
`Initiatives (CNRI)
`CosmosBay
`Council for the Central
`Laboratory of the Research
`Councils (CCL)
`CyberCash, Inc.
`Cygnus Support
`Daewoo Electronics Company
`Dassault Aviation
`Data Channel
`
Data Research Associates, Inc.
Defense Information Systems
Agency (DISA)
Deutsche Telekom-Online Pro
Dienste GmbH & Co. KG
(T-Online)
Digital Equipment Corporation
Digital Vision Laboratories
Corporation
DigitalStyle Corporation
Direct Marketing Association, Inc.
DoubleClick
Eastman Kodak Company
Ecole Nationale Supérieure
d’Informatique et de
Mathématiques
Appliquées (ENSIMAG)
`EDF
`
`EEIG/ERCIM
`ENEL
`
`Engage Technologies
`ENN Corporation
`Enterprise Integration Technology
`Entrust Technologies, Inc.
`ERICSSON
`
`Ernst & Young LLP
`ETNO TEAM S.p.A.
`Firefly Network, Inc.
`First Virtual Holdings, Inc.
`
`FirstFloor Software, Inc.
`Folio Corporation
`Foundation for Research and
`Technology (FORTH)
`France Telecom
`Fujitsu Limited
`Fulcrum Technologies, Inc.
`GCTECH S.A.
`GEMPLUS
`
`General Magic, Inc.
`Geoworks
`GMD Institute FIT
`Graphic Communications
`Association
`Grenoble Network Initiative
GRIF S.A.
Groupe ESC Grenoble
Harlequin Inc.
HAVAS
`Hewlett Packard
`Laboratories, Bristol
`Hitachi, Ltd.
`@Home Network
`Hong Kong Jockey Club
`Hummingbird Communications Ltd.
`IBERDROLA S.A.
`IBM Corporation
`ILOG, S.A.
`InContext Systems
`Industrial Technology
`Research Institute
`Infopartners
`INRETS
`
`Inso Corporation, Providence
Institut Franco-Russe A.M.
Liapunov
Institute for Information Industry
Intel Corporation
Intermind
Internet Profiles Corporation
Intraspect Software, Inc.
Joint Info. Systems Comm. of the
UK Higher Ed. Funding Council
Justsystem Corporation
K2Net, Inc.
KnowledgeCite
Kumamoto Institute of Computer
Software, Inc.
`Lexmark International, Inc.
`Los Alamos National Laboratory
`Lotus Development Corporation
`Lucent Technologies
`Mainspring Communications, Inc.
`Marimba, Inc.
`Matra Hachette
`MBED Software
`MCI Telecommunications
`Metrowerks Corporation
`Michelin
`
`Microsoft Corp.
`Microsystems Software, Inc.
`MITRE Corporation
`Mitsubishi Electric Corporation
`MTA SZTAKI
Narrowline
`National Center for
`Supercomputing
`Applications (NCSA)
`National Security Agency (NSA)
`National University of Singapore
`NCR
`
`NEC Corporation
`
`Netscape Communications
NHS (National Health Service, UK)
Nippon Telegraph & Telephone
Corp. (NTT)
NOKIA Corporation
Novell, Inc.
NTT Data Communications
Systems Corp.
Nynex Science & Technology, Inc.
O’Reilly & Associates, Inc.
Object Management Group,
Inc. (OMG)
Object Services and
Consulting, Inc.
`
`
`

`

OCLC (Online Computer Library
Center, Inc.)
Omron Corporation
Open Market, Inc.
Open Sesame
Open Software Associates, Inc.
Open Software Foundation
Open Text Corporation
Oracle Corp.
ORSTOM
Pacifitech Corporation
Partners HealthCare System, Inc.
Pencom Web Works
Philips Electronic N.V.
Poet Software Corporation
PointCast Incorporated
Pretty Good Privacy, Inc.
Prodigy Services Corporation
The Productivity Works, Inc.
Progressive Networks
Public IP Exchange, Ltd. (PIPEX)
Qualcomm Inc.
Raptor Systems, Inc.
Reed-Elsevier
Reuters Limited
Rice University for Nat’l
HCPP Software
Riverland Holding NV/SA
Royal Melbourne Institute of
Technology
Security Dynamics
Technologies, Inc.
Sema Group
Sharp Corporation
SICS
Siemens-Nixdorf
Silicon Graphics, Inc.
SLIGOS
SoftQuad, Inc.
Software Publishers Association
(SPA)
Sony Corporation
Spyglass, Inc.
Strategic Interactive Group
Sun Microsystems Corporation
SURFnet bv
Swedish Institute for Systems
Development (SISU)
Syracuse University
Tandem Computers Inc.
Technische Universität Graz
Teknema Corporation
Telecom Italia
Telequip Corporation
Terisa Systems, Inc.
Tercel Group
Thomson-CSF
TIAA-CREF
Toshiba Corporation
TriTeal Corporation
TRUSTe
UKERNA
Unwired Planet
USWeb Corporation
VeriSign, Inc.
Verity, Inc.
Vignette Corporation
VTT Information Technology
webMethods, Inc.
WebTV Networks Inc.
Wolfram Research, Inc.
WWW Consult Pty Ltd.
WWW-KR
Xerox Corporation
Xionics Document
Technologies, Inc.
`
`
`

`

`
`
CONTENTS

EDITORIAL

Guest Editor Dan Connolly and Series Editor Rohit
Khare team up to herald the appearance of XML
and discuss its evolution.
`
`
`XML BACKGROUND
`
Members of the W3C's XML Editorial Review
Board talk about the road to XML: its history,
breakthroughs, the participation of Microsoft and
Netscape, and the work that remains.
`
`
WORK IN PROGRESS

In “The Web Is Ruined and I Ruined It,” self-
proclaimed HTML Terrorist David Siegel discusses
how proper separation of structure (HTML), style
(CSS), and semantics (XML) makes content more
compelling and design more effective.
`
`
TIMELINE

W3C REPORTS

Recent, noteworthy W3C events

See next page for detailed listing

TECHNICAL PAPERS

See next page for detailed listing
`
`
`
This issue’s cover image was photographed by
Kevin Thomas and manipulated in
Adobe Photoshop 4.0 by Edie Freedman.
`
`CONTENTS
`
`W3C REPORTS
`
`Extensible Markup Language (XML)
TIM BRAY, JEAN PAOLI, C.M. SPERBERG-MCQUEEN

Extensible Markup Language (XML)
Part 2: Linking
TIM BRAY, STEVE DEROSE

HTML-Math:
Mathematical Markup Language Working Draft
ROBERT R. MINER, PATRICK D. F. ION
`
`
`
`Document Object Model Requirements
`LAUREN WOOD, JARED SORENSEN
`
`
`
`TECHNICAL PAPERS
`
`A Guide to XML
`NORMAN WALSH
`
`
`
XML and CSS
STUART CULSHAW, MICHAEL LEVENTHAL, AND MURRAY MALONEY

The Evolution of Web Documents:
The Ascent of XML
DAN CONNOLLY, ROHIT KHARE, ADAM RIFKIN
`
`Embedded Markup Considered Harmful
`THEODOR HOLM NELSON
`
`
`Chemical Markup Language:
`A Simple Introduction to Structured Documents
PETER MURRAY-RUST

Codifying Medical Records in XML:
`Philosophy and Engineering
`THOMAS L. LINCOLN
`
`
XML: Can the Desperate Perl Hacker Do It?
MICHAEL LEVENTHAL

XML: From Bytes to Characters
BERT BOS

An Introduction to XML Processing with Lark
TIM BRAY
`
`
`
`
Building XML Parsers for Microsoft's IE4
JEAN PAOLI, DAVID SCHACH, CHRIS LOVETT, ANDREW LAYMAN, ISTVAN CSERI

JUMBO: An Object-Based XML Browser
PETER MURRAY-RUST

Capturing the State of Distributed Systems with XML
ROHIT KHARE, ADAM RIFKIN

XML, Java, and the Future of the Web
JON BOSAK

WIDL: Application Integration with XML
CHARLES ALLEN
`
XML, JAVA,
AND THE FUTURE OF THE WEB

Jon Bosak

Introduction
`
The extraordinary growth of the World Wide Web has been fueled by the
ability it gives authors to easily and cheaply distribute electronic
documents to an international audience. As Web documents have become
larger and more complex, however, Web content providers have begun to
experience the limitations of a medium that does not provide the
extensibility, structure, and data checking needed for large-scale
commercial publishing. The ability of Java applets to embed powerful
data manipulation capabilities in Web clients makes even clearer the
limitations of current methods for the transmittal of document data.
`
To address the requirements of commercial Web publishing and enable the
further expansion of Web technology into new domains of distributed
document processing, the World Wide Web Consortium has developed an
Extensible Markup Language (XML) for applications that require
functionality beyond the current Hypertext Markup Language (HTML). This
paper describes the XML effort and discusses new kinds of Java-based
Web applications made possible by XML.*
`
Background: HTML and SGML

Most documents on the Web are stored and transmitted in HTML. HTML is a
simple language well suited for hypertext, multimedia, and the display
of small and reasonably simple documents. HTML is based on SGML
(Standard Generalized Markup Language, ISO 8879), a standard system for
defining and using document formats.

SGML allows documents to describe their own grammar, that is, to
specify the tag set used in the document and the structural
relationships that those tags represent. HTML applications are
applications that hardwire a small set of tags in conformance with a
single SGML specification. Freezing a small set of tags allows users to
leave the language specification out of the document and makes it much
easier to build applications, but this ease comes at the cost of
severely limiting HTML in several important respects, chief among which
are extensibility, structure, and validation.
`
- Extensibility. HTML does not allow users to specify their own tags or
  attributes in order to parameterize or otherwise semantically qualify
  their data.

- Structure. HTML does not support the specification of deep structures
  needed to represent database schemas or object-oriented hierarchies.

- Validation. HTML does not support the kind of language specification
  that allows consuming applications to check data for structural
  validity on importation.
`
In contrast to HTML stands generic SGML. A generic SGML application is
one that supports SGML language specifications of arbitrary complexity
and makes possible the qualities of extensibility, structure, and
validation missing from HTML. SGML makes it possible to define your own
formats for your own documents, to handle large and complex documents,
and to manage large information repositories. However, full SGML
contains many optional features that are not needed for Web
applications and has proven to have a cost/benefit ratio unattractive
to current vendors of Web browsers.

* This paper, first published on the Web on November 17, 1996 [1], was
revised March 10, 1997 [2] and is here presented in a form edited for
the World Wide Web Journal.
`
The XML Effort

The World Wide Web Consortium (W3C) has created an SGML Working Group
to build a set of specifications to make it easy and straightforward to
use the beneficial features of SGML on the Web. See the W3C SGML/XML
Activity page [3] for the current status of this effort. The goal of
the W3C SGML activity is to enable the delivery of self-describing data
structures of arbitrary depth and complexity to applications that
require such structures.

The first phase of this effort is the specification of a simplified
subset of SGML specially designed for Web applications. This subset,
called XML (Extensible Markup Language), retains the key SGML
advantages of extensibility, structure, and validation in a language
that is designed to be vastly easier to learn, use, and implement than
full SGML.
`
`XML differs from HTML in three major respects:
`
1. Information providers can define new tag and attribute names at
   will.

2. Document structures can be nested to any level of complexity.

3. Any XML document can contain an optional description of its grammar
   for use by applications that need to perform structural validation.
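
The three differences listed above can be illustrated with a short sketch. Everything in it is an assumption made for illustration rather than something taken from the XML drafts: the element names are invented, and the `GRAMMAR` dictionary is a simplified stand-in for the optional grammar description (a real XML processor would read a DTD instead).

```python
import xml.etree.ElementTree as ET

# Hypothetical patient record using tag names invented for this sketch.
RECORD = """\
<patient id="p-001">
  <name>Jane Doe</name>
  <history>
    <visit date="1997-03-10">
      <allergies>
        <allergy>penicillin</allergy>
      </allergies>
    </visit>
  </history>
</patient>
"""

# Hypothetical grammar: each element name maps to the children it may contain.
# This dictionary is a simplified stand-in for a DTD, not DTD syntax.
GRAMMAR = {
    "patient": {"name", "history"},
    "name": set(),
    "history": {"visit"},
    "visit": {"allergies"},
    "allergies": {"allergy"},
    "allergy": set(),
}


def depth(element):
    """Return the nesting depth of an element tree (a leaf has depth 1)."""
    return 1 + max((depth(child) for child in element), default=0)


def structural_errors(element, grammar):
    """List every element that is undeclared or appears under the wrong parent."""
    errors = []
    if element.tag not in grammar:
        errors.append(f"undeclared element <{element.tag}>")
        return errors
    for child in element:
        if child.tag not in grammar[element.tag]:
            errors.append(f"<{child.tag}> not allowed inside <{element.tag}>")
        errors.extend(structural_errors(child, grammar))
    return errors


root = ET.fromstring(RECORD)
print(depth(root))                       # nesting depth of the record
print(structural_errors(root, GRAMMAR))  # empty list when the structure is valid
```

An application that does not need validation can simply skip the call to `structural_errors`, which mirrors the optional role the grammar plays in XML.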
`
XML has been designed for maximum expressive power, maximum
teachability, and maximum ease of implementation. The language is not
backward-compatible with existing HTML documents, but documents
conforming to the W3C HTML 3.2 specification can easily be converted to
XML, as can generic SGML documents and documents generated from
databases.
`
The first working draft of XML was announced in November 1996 at the
SGML 96 Conference. A major revision of the draft was announced at the
Sixth World Wide Web Conference in April 1997. XML 1.0 is currently
scheduled for recommendation to the W3C Advisory Council during October
1997. See the W3C XML page [3] for links to the latest draft.
`
`Web Applications of XML
`
The applications that will drive the acceptance of XML are those that
cannot be accomplished within the limitations of HTML. These
applications can be divided into four broad categories:

1. Applications that require the Web client to mediate between two or
   more heterogeneous databases.

2. Applications that attempt to distribute a significant proportion of
   the processing load from the Web server to the Web client.

3. Applications that require the Web client to present different views
   of the same data to different users.

4. Applications in which intelligent Web agents attempt to tailor
   information discovery to the needs of individual users.
`
The alternative to XML for these applications is proprietary code
embedded as “script elements” in HTML documents and delivered in
conjunction with proprietary browser plug-ins or Java applets. XML
derives from a philosophy that data belongs to its creators and that
content providers are best served by a data format that does not bind
them to particular script languages, authoring tools, and delivery
engines but provides a standardized, vendor-independent, level playing
field upon which different authoring and delivery tools may freely
compete.
`
Database Interchange:
The Universal Hub

A paradigmatic example of this first category of XML applications is
the information tracking system for a home health care agency.

Home health care is a major component of America’s multibillion-dollar
medical industry that continues to increase in importance as the health
care burden is shifted from hospitals to home care settings.
Information management is critical to this industry in order to meet
the record-keeping requirements of the federal agencies and health
maintenance organizations that pay for patient care.

The typical patient entering a home health care agency is represented
to the information system by a large collection of paper-based
historical materials in the form of patient medical histories and
billing data from a variety of doctors, hospitals, pharmacies, and
insurance companies. The biggest task in getting the patient into the
system is the manual entry of this material into the agency’s database.
`
The coming of the Web has given the medical informatics community the
hope that an electronic means can be found to alleviate this burden.*
Unfortunately, existing Web applications represent fundamentally
insufficient models for an adequate solution. Hospitals have begun to
offer the agencies a solution that goes something like this:

1. Log into the hospital’s Web site.

2. Become an authorized user.

3. Access the patient’s medical records using a Web browser.

4. Print out the records from the browser.

5. Manually key in the data from the printouts.

The knowledgeable reader may smile at this “solution,” but in fact this
is not a joke; this is an actual proposal from a large American
hospital known for its early adoption of advanced medical information
systems.

A slightly more sophisticated version of this “solution” envisions the
operator reading the patient data from the Web browser and keying it
directly into the agency’s online forms-based interface in a separate
window instead of making a printout first. The only difference between
this version and the previous one is that it saves the paper that would
have been needed for the printout. It does nothing to address the root
of the problem. A real solution would look more like this:

1. Log into the hospital’s Web site.

2. Become an authorized user.

3. Access the patient’s medical records in a Web-based interface that
   represents the records for that patient with a folder icon.

4. Drag the folder from the Web application over to the internal
   database application.

5. Drop it into the database.

However, this solution is not possible within the limitations of HTML,
for three reasons.

- The HTML tag set is too limited to represent or differentiate between
  the multitude of database fields in the mixture of documents making
  up the patient’s medical history.

- HTML is incapable of representing the variety of structures in those
  documents.

- HTML lacks any mechanism for checking the data for structural
  validity before the receiving application attempts to import it into
  the target database.

One technically feasible way to implement seamless interchange of
patient care records is simply to require all hospitals and health care
agencies to use a single standard system dictated by the government
(such an approach has actually been suggested). In an environment where
hospitals are going out of business on a daily basis and many health
care agencies are in deep financial difficulty, however, a scheme that
requires change en masse is hardly practical.

* For more information we refer you to Lincoln Stein’s article,
“Electronic Medical Records: Promises and Threats,” in the Summer 1997
issue of the W3J, entitled Web Security: A Matter of Trust.
`
The other way to enable interchange between heterogeneous systems is to
adopt a single industry-wide interchange format that serves as the
single output format for all exporting systems and the single input
format for all importing systems. This is, in fact, the purpose for
which SGML was initially designed, and XML simply carries on this
tradition.
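
The appeal of a single interchange format can be made concrete with a rough count. This is an illustrative calculation, not a figure from the original paper; it assumes $n$ systems that all need to exchange data and counts each one-directional converter separately.

```latex
\begin{align*}
\text{pairwise converters} &= n(n-1) \\
\text{hub converters} &= 2n \quad \text{(one exporter and one importer per system)} \\
n = 10:\quad 10 \cdot 9 &= 90 \quad \text{versus} \quad 2 \cdot 10 = 20
\end{align*}
```

Under these assumptions the hub format needs fewer converters as soon as more than three systems are involved, and the gap widens quadratically after that.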
`
A number of industries, including the aerospace, automotive,
telecommunications, and computer software industries, have been using
hub languages to perform data interchange for years, and by this time
the process is well understood. Typically, the major players in an
industry form a standards consortium tasked with defining a Document
Type Definition (DTD), which is the way in which the tag set and
grammar of a markup language are defined. This DTD can then be sent
with documents that have been marked up in the industry standard
language using off-the-shelf editing tools, and any standard
application on the receiving end can validate and process them.

The XML solution is system-independent, vendor-independent, and proven
by over a decade of SGML implementation experience. XML merely extends
this proven approach to document interchange over the Web.
Interestingly, the same day on which the first XML 1.0 draft was
released also saw the formal announcement of an SGML initiative within
HL7, the standards organization for health care IS vendors, to develop
a Health Care Markup Language designed to solve exactly the kind of
problem described in this example.

Previous vertical-industry efforts have shown that capturing data in a
rich markup often has benefits beyond the immediate requirements of
data exchange. In a well-designed standardized patient data system, for
example, specific information originally gathered in the course of a
routine physical exam and tagged <allergies>, <drug-reactions>, and so
on would instantly be available to alert the staff of an emergency room
that an unconscious patient from a distant city was allergic to
penicillin. The ability of XML to define tags specific to an area of
application is critical to this scenario, because the otherwise
unqualified word “penicillin” in the thousands of pages of a patient’s
entire medical history could not trigger the recognition that the same
word inside an <allergies> element could trigger.

The health care example is relevant not only because of the scope of
the problem and the enormous sums of money involved but also because it
is paradigmatic of a very wide range of future Web applications: any in
which Web clients (or Java applications running on those clients) are
expected to mediate the lossless exchange of complex data between
systems that use different forms of data representation in a way that
can be standardized across an industry or other interest group. Some
random examples of such applications are:

- Legal publishing

- The government drug approval process

- Collaborative CAD/CAM efforts

- Collaborative calendar management across different systems

- Any corporate network application that works across databases,
  especially where policies must be enforced: purchase orders, expense
  requests, etc.

- Exchange of information between players in any broker-organized
  business: insurance, securities, banking, etc.
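
The penicillin scenario in the health care example can be sketched in a few lines. The record below is hypothetical, and the element names other than <allergies> are invented for illustration; the point is only that a word tagged as an allergy can be told apart from the same word appearing in free text.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; only <allergies> comes from the example in the text.
HISTORY = """\
<patient>
  <note>Prescribed penicillin in 1989 for an ear infection.</note>
  <note>Family member asked about penicillin dosage.</note>
  <allergies>
    <allergy>penicillin</allergy>
  </allergies>
</patient>
"""


def mentions(root, word):
    """Count elements whose text mentions the word anywhere in the record."""
    return sum(1 for el in root.iter() if el.text and word in el.text.lower())


def allergy_alerts(root):
    """Return only the substances recorded inside an <allergies> element."""
    return [a.text.strip() for a in root.findall("./allergies/allergy") if a.text]


root = ET.fromstring(HISTORY)
print(mentions(root, "penicillin"))  # every mention, with no way to tell them apart
print(allergy_alerts(root))          # only the mentions tagged as allergies
```

A plain text search finds every occurrence of the word, while the query against the tagged structure returns only the entry that should raise an alert.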
`
Distributed Processing:
Giving Java Something to Do

A paradigmatic example of this second category of XML applications is
the data delivery system designed by the semiconductor industry.

Each major semiconductor manufacturer maintains several terabytes of
technical data on all of the ICs that it produces. To enable
interchange of this data, an industry consortium (the Pinnacles Group)
was formed several years ago by Intel, National Semiconductor, Philips,
Texas Instruments, and Hitachi to design an industry-specific SGML
markup language. The consortium finished that specification in 1995,
and its member companies are now well into the implementation phase of
the process.
`
One might think that the rise in popularity of HTML would cause the
Pinnacles members to reconsider their decision, but in fact the
limitations of HTML have convinced them that their original strategy
was the correct one. Their initial idea was that the richly
parameterized data stream made possible by the industry-specific SGML
markup would enable intelligent applications not merely to display
semiconductor data sheets as readable documents but actually to drive
design processes. It is now recognized that this approach is a perfect
fit with the concept of distributed Java applets, and the vision of the
near future is one in which engineers can access a manufacturer’s Web
site and download not only viewable data on particular integrated
circuits but also a Java applet that allows them to model those
circuits in various combinations.
`
The semiconductor application is a good demonstration of the advantages
of XML because:

1. It requires industry-specific markup that cannot be implemented
   within the confines of the fixed HTML tag set.

2. It requires that the data representation be platform- and
   vendor-independent so that data from a variety of sources can be
   used to drive a variety of distributed applications (some of which
   may be provided by third parties, generating a subindustry of
   providers of tools that can work with the standardized data stream).
`
3. Its utility rests ultimately in the fact that a
   computation-intensive process (modeling circuits for hours at a
   time) that would otherwise entail an enormous, extended resource hit
   on the server has been changed into a brief interaction with the
   server followed by an extended interaction with the user’s own Web
   client. This aspect has been summed up in the slogan “XML gives Java
   something to do.”

Note that validation, while sometimes important, does not always play
the crucial role in this category of applications that it does in
applications where data must be checked for structural integrity before
entering a database. To make processing as efficient as possible, XML
has been designed so that validation is optional in applications where
it is not needed.

As with the health-care example, the semiconductor application is
notable not merely for the sheer size of the market it represents but
also because it is paradigmatic of an enormous range of future
Java-based Web applications: virtually any application in which
standardized data is expected to be manipulated in interesting ways on
the client. Perhaps the most obvious examples of such applications are
the following:

- Design applications where the designer would otherwise use server
  cycles to consider various alternatives: electronics, engineering,
  architecture, menu planning, etc.

- Scheduling applications where a customer would otherwise use server
  cycles to entertain various possibilities: airlines, trains, buses,
  and subways; restaurants, movies, plays, and concerts. This is what
  Easy Sabre and Ticketron will look like a few years from now as the
  economies of distributed Java-based processing become evident.

- Commercial applications that allow consumers to explore alternatives
  by supplying different shopping criteria: real estate, automobiles,
  appliances, etc.

- The entire spectrum of educational applications, a small subset of
  which are the ones we call “online help.”

- The entire spectrum of customer-support applications, ranging from
  lawn-mower maintenance through technical support for computers.
A harbinger of applications to come in the last category is the
Solution Exchange Standard, an SGML markup language announced in June
1996 by a consortium of over 60 hardware, software, and communications
companies to facilitate the exchange of technical support information
among vendors, system integrators, and corporate help desks. In the
words of the announcement:
`
    The standard has been designed to be flexible. It is independent
    of any platform, vendor or application, so it can be used to
    exchange solution information without regard to the system it is
    coming from or going to. [...] Additionally, the standard has been
    designed to have a long lifetime. SGML offers room for growth and
    extensibility, so the standard can easily accommodate rapidly
    changing support environments.

Such applications, which the XML subset is specifically designed to
address, will grow in importance as consumers come to expect
interoperability among their data-manipulating applets and information
providers confront the realities of trying to support
computation-intensive tasks directly on their Web servers.

View Selection: Letting the User Decide

A third variety of XML applications are those in which users may wish
to switch between different views of the data without requiring that
the data be downloaded again in a different form from the Web server.

One early application in this category will be dynamic tables of
contents. It is possible now, using Web servers built on
object-oriented databases, to present the user with a table of contents
into a large collection of data that can be expanded with a mouse click
to “open up” a portion of the TOC and reveal more detailed levels of
the document structure. Dynamic TOCs of this kind can be generated at
run time directly from the hierarchical structure of the document.
Unfortunately, the Web latency built into every expansion or
contraction of the TOC makes this process sluggish in many user
environments. A much better solution is to download the entire
structured TOC to the client rather than just individual
server-generated views of the TOC. Then the user can expand, contract,
and move about in the TOC supported by a much faster process running
directly on the client.

A group at Sun actually implemented a form of this solution as part of
a Java-based HTML help browser, but the limitations of HTML required
the team to come up with a couple of clever workarounds. In this
application, a TOC was constructed by hand (the lack of structure in
ordinary HTML makes it impossible to reliably generate a TOC directly
from the document) using nonstandard tags invented for the purpose, and
then the TOC piece was wrapped in a comment within an HTML page to hide
the nonstandard markup from Web browsers. A Java applet downloaded with
the HTML document interpreted the hidden markup and provided the
client-based TOC behavior.
`
`In practice, this application worked very well and
`testified both to the ingenuity of its designers and
`to the validity of the basic concept. But in an
`XML environment, neither the manual creation of
`the TOC nor its concealment would have been
`necessar
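
The client-side table of contents discussed in this section can be illustrated with a small sketch. It is not the Sun implementation and not any particular product; it only assumes that the document arrives once as XML whose nesting carries the structure. The element names `section` and `title`, the sample document, and the helper names `TocNode`, `build_toc`, `render`, and `toggle` are all hypothetical, chosen for illustration. The TOC is derived from the element hierarchy, and expanding or collapsing an entry changes only local state, with no further request to the server.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; <section> and <title> are illustrative names only.
DOC = """
<doc>
  <section><title>Installation</title>
    <section><title>Requirements</title></section>
    <section><title>Setup</title></section>
  </section>
  <section><title>Usage</title>
    <section><title>Basics</title></section>
  </section>
</doc>
"""


class TocNode:
    """One TOC entry; 'expanded' is purely client-side view state."""

    def __init__(self, title, children):
        self.title = title
        self.children = children
        self.expanded = False


def build_toc(element):
    """Derive TOC entries from the nesting of direct <section> children."""
    nodes = []
    for sec in element.findall("section"):
        title = sec.findtext("title", default="(untitled)").strip()
        nodes.append(TocNode(title, build_toc(sec)))
    return nodes


def render(nodes, depth=0):
    """Return the currently visible TOC lines for the given view state."""
    lines = []
    for node in nodes:
        if not node.children:
            marker = " "          # leaf entry, nothing to expand
        elif node.expanded:
            marker = "-"          # expanded, children shown below
        else:
            marker = "+"          # collapsed, children hidden
        lines.append("  " * depth + marker + " " + node.title)
        if node.expanded:
            lines.extend(render(node.children, depth + 1))
    return lines


def toggle(nodes, title):
    """Flip the first entry with this title; return True if one was found."""
    for node in nodes:
        if node.title == title:
            node.expanded = not node.expanded
            return True
        if toggle(node.children, title):
            return True
    return False


toc = build_toc(ET.fromstring(DOC))   # parsed once, after a single download
collapsed = render(toc)
toggle(toc, "Installation")           # local state change only
expanded = render(toc)
print("\n".join(expanded))
```

Because the whole structure is held on the client, every later view is just another call to `render`; the server is involved only in the initial download.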
