
`WORLD WIDE WEB JOURNAL
`
`XML
`
`Principles, Tools, and Techniques
`
`O’REILLY
`
`
`
`IBM-1011
`
`Page 1 of 16
`
`
`

`

`WORLD WIDE WEB JOURNAL
`XML: PRINCIPLES, TOOLS, AND TECHNIQUES Volume 2, Issue 4, Fall 1997
`
`Publisher: Dale Dougherty
`Guest Editor: Dan Connolly
`Series Editor: Rohit Khare
`Managing Editor: Donna Woonteiler
`News Editor: D.C. Denison
`Production Editor: Nancy Crumpton
`Technical Illustrator: Robert Romano
`Software Tools Specialist: Mike Sierra
`Quality Assurance: Ellie Fountain Maden
`Cover Design: Hanna Dyer
`Text Design: Nancy Priest, Marcia Ciro
`Subscription Administrator: Marianne Cooke
`Photos: Flint Born
`
`ISBN: 1-56592-349-9
`
`The individual contributions are copyrighted by the authors or their respective employers. The print compilation is Copyright © 1997 O’Reilly & Associates, Inc. All rights reserved. Printed in the United States of America.
`
`Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and O’Reilly & Associates, Inc. was aware of a trademark claim, the designations have been printed in caps or initial caps.
`
`While every precaution has been taken in the preparation of this book, the publisher assumes no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.
`
`This book is printed on acid-free paper with 85% recycled content, 15% post-consumer waste. O’Reilly & Associates is committed to using paper with the highest recycled content available consistent with high quality.
`
`ISSN: 1085-2301
`
`
`

`

`
`Architecture
`Dan Connolly
`Domain Leader
`connolly@w3.org
`Arnaud Le Hors
`lehors@w3.org
`Håkon Lie
`howcome@w3.org
`Jim Gettys
`jg@w3.org
`Chris Lilley
`chris@w3.org
`Philipp Hoschka
`hoschka@w3.org
`Masayasu “Mimasa” Ishikawa
`mimasa@w3.org
`Youichirou Koga
`yvkoga@w3.org
`Dave Raggett
`dsr@w3.org
`Yves Lafon
`lafon@w3.org
`Irène Vatton
`vatton@w3.org
`Ora Lassila
`lassila@w3.org
`Henrik Frystyk Nielsen
`frystyk@w3.org
`Daniel Veillard
`veillard@w3.org
`
`W3C Administration
`Jean-François Abramatic
`W3C Chairman and Associate
`Director, MIT Laboratory for
`Computer Science
`jfa@w3.org
`Tim Berners-Lee
`Director of the W3C
`timbl@w3.org
`Vincent Quint
`Deputy Director for Europe
`quint@w3.org
`Nobuo Saito
`W3C Associate Chairman
`and Dean, Keio University
`nobuo.saito@w3.org
`Alan Kotok
`W3C Associate Chairman
`kotok@w3.org
`Tatsuya Hagino
`Deputy Director for Asia
`hagino@w3.org
`User Interface
`Vincent Quint
`Domain Leader
`quint@w3.org
`Bert Bos
`bert@w3.org
`Ramzi Guetari
`guetari@w3.org
`
`José Kahan
`kahan@w3.org
`Sally Khudairi
`khudairi@w3.org
`Stephan Montigaud
`montigaud@w3.org
`Gerald Oskoboiny
`gerald@w3.org
`Luc Ottavj
`ottavj@w3.org
`Pierre Fillault
`fillault@w3.org
`Takeshi "Yamachan" Yamane
`yamachan@w3.org
`
`Administrative Support
`Pamela Ahern
`pam@w3.org
`Susan Hardy
`susan@w3.org
`MarieLine Ramfos
`ramfos@w3.org
`Josiane Roberts
`roberts@inria.fr
`Nancy Ryan
`ryan@w3.org
`Yukari Mitsuhashi
`yukari@w3.org
`
`Technology and Society
`Jim Miller
`Domain Leader
`jmiller@w3.org
`Eui-Suk Chung
`euisuk@w3.org
`Daniel Dardailler
`danield@w3.org
`Philip DesAutels
`philipd@w3.org
`Josef Dietl
`jdietl@w3.org
`Joseph Reagle
`reagle@w3.org
`Ralph Swick
`swick@w3.org
`
`Cross Areas and Technical Support
`Janet Bertot
`bertot@w3.org
`Stephane Boyera
`boyera@w3.org
`Daichi Funato
`daichi@w3.org
`Tom Greene
`tjg@w3.org
`
`
`
`
`

`

`
`
`
`
`ABN AMRO Bank
`Access Company Limited
`Adobe Systems Inc.
`Aérospatiale
`AGFSI
`Agfa Division, Bayer Corp.
`Agranat Systems, Inc.
`Alcatel Alsthom Recherche
`Alfa-Omega Foundation
`Alis Technologies, Inc.
`America Online, Inc.
`American International Group Data Center, Inc. (AIG)
`American Internet Corporation
`Apple Computer, Inc.
`ArborText, Inc.
`Architecture Projects Management Ltd.
`ArrowPoint Communications
`Art Technology Group
`Asymetrix Corporation
`AT&T
`Attachmate Corporation
`BackWeb Technologies, Inc.
`BELGACOM
`Bellcore
`Bitstream, Inc.
`British Telecommunications Laboratories
`Bull S.A.
`Canal +
`Canon, Inc.
`Cap Gemini Innovation
`Center for Democracy and Technology
`Center for Mathematics and Computer Science (CWI)
`CERN
`CIRAD
`CNET: The Computer Network
`CNR-Istituto Elaborazione dell’Informazione
`CNRS
`Commissariat à l’Énergie Atomique (CEA)
`CompuServe, Inc.
`Computer Answer Line
`Corporation for National Research Initiatives (CNRI)
`CosmosBay
`Council for the Central Laboratory of the Research Councils (CCL)
`CyberCash, Inc.
`Cygnus Support
`Daewoo Electronics Company
`Dassault Aviation
`Data Channel
`Data Research Associates, Inc.
`Defense Information Systems Agency (DISA)
`Deutsche Telekom-Online Pro Dienste GmbH & Co. KG (T-Online)
`Digital Equipment Corporation
`Digital Vision Laboratories Corporation
`DigitalStyle Corporation
`Direct Marketing Association, Inc.
`DoubleClick
`Eastman Kodak Company
`École Nationale Supérieure d’Informatique et de Mathématiques Appliquées (ENSIMAG)
`EDF
`EEIG/ERCIM
`ENEL
`Engage Technologies
`ENN Corporation
`Enterprise Integration Technology
`Entrust Technologies, Inc.
`ERICSSON
`Ernst & Young LLP
`ETNO TEAM S.p.A.
`Firefly Network, Inc.
`First Virtual Holdings, Inc.
`FirstFloor Software, Inc.
`Folio Corporation
`Foundation for Research and Technology (FORTH)
`France Telecom
`Fujitsu Limited
`Fulcrum Technologies, Inc.
`GCTECH S.A.
`GEMPLUS
`General Magic, Inc.
`Geoworks
`GMD Institute FIT
`Graphic Communications Association
`Grenoble Network Initiative
`GRIF S.A.
`Groupe ESC Grenoble
`Harlequin Inc.
`HAVAS
`Hewlett Packard Laboratories, Bristol
`Hitachi, Ltd.
`@Home Network
`Hong Kong Jockey Club
`Hummingbird Communications Ltd.
`IBERDROLA S.A.
`IBM Corporation
`ILOG, S.A.
`InContext Systems
`Industrial Technology Research Institute
`Infopartners
`INRETS
`Inso Corporation, Providence
`Institut Franco-Russe A.M. Liapunov
`Institute for Information Industry
`Intel Corporation
`Intermind
`Internet Profiles Corporation
`Intraspect Software, Inc.
`Joint Info. Systems Comm. of the UK Higher Ed. Funding Council
`Justsystem Corporation
`K2Net, Inc.
`KnowledgeCite
`Kumamoto Institute of Computer Software, Inc.
`Lexmark International, Inc.
`Los Alamos National Laboratory
`Lotus Development Corporation
`Lucent Technologies
`Mainspring Communications, Inc.
`Marimba, Inc.
`Matra Hachette
`MBED Software
`MCI Telecommunications
`Metrowerks Corporation
`Michelin
`Microsoft Corp.
`Microsystems Software, Inc.
`MITRE Corporation
`Mitsubishi Electric Corporation
`MTA SZTAKI
`Narrowline
`National Center for Supercomputing Applications (NCSA)
`National Security Agency (NSA)
`National University of Singapore
`NCR
`NEC Corporation
`Netscape Communications
`NHS (National Health Service, UK)
`Nippon Telegraph & Telephone Corp. (NTT)
`NOKIA Corporation
`Novell, Inc.
`NU Data Communications Systems Corp.
`Nynex Science & Technology, Inc.
`O’Reilly & Associates, Inc.
`Object Management Group, Inc. (OMG)
`Object Services and Consulting, Inc.
`
`
`

`

`
`
`
`
`
`
`OCLC (Online Computer Library Center, Inc.)
`Omron Corporation
`Open Market, Inc.
`Open Sesame
`Open Software Associates, Inc.
`Open Software Foundation
`Open Text Corporation
`Oracle Corp.
`ORSTOM
`Pacifitech Corporation
`Partners HealthCare System, Inc.
`Pencom Web Works
`Philips Electronic N.V.
`Poet Software Corporation
`PointCast Incorporated
`Pretty Good Privacy, Inc.
`Prodigy Services Corporation
`The Productivity Works, Inc.
`Progressive Networks
`Public IP Exchange, Ltd. (PIPEX)
`Qualcomm Inc.
`Raptor Systems, Inc.
`Reed-Elsevier
`Reuters Limited
`Rice University for Nat’l HCPP Software
`Riverland Holding NV/SA
`Royal Melbourne Institute of Technology
`Security Dynamics Technologies, Inc.
`Sema Group
`Sharp Corporation
`SICS
`Siemens-Nixdorf
`Silicon Graphics, Inc.
`SLIGOS
`SoftQuad, Inc.
`Software Publishers Association (SPA)
`Sony Corporation
`Spyglass, Inc.
`Strategic Interactive Group
`Sun Microsystems Corporation
`SURFnet bv
`Swedish Institute for Systems Development (SISU)
`Syracuse University
`Tandem Computers Inc.
`Technische Universität Graz
`Teknema Corporation
`Telecom Italia
`Telequip Corporation
`Terisa Systems, Inc.
`Tercel Group
`Thomson-CSF
`TIAA-CREF
`Toshiba Corporation
`TriTeal Corporation
`TRUSTe
`UKERNA
`Unwired Planet
`USWeb Corporation
`VeriSign, Inc.
`Verity, Inc.
`Vignette Corporation
`VTT Information Technology
`webMethods, Inc.
`WebTV Networks Inc.
`Wolfram Research, Inc.
`WWW. Consult Pty Ltd.
`WWW-KR
`Xerox Corporation
`Xionics Document Technologies, Inc.
`
`
`
`
`

`

`
`
`CONTENTS
`
`EDITORIAL 1
`Guest Editor Dan Connolly and Series Editor Rohit Khare team up to herald the appearance of XML and discuss its evolution.
`
`XML BACKGROUND
`Members of the W3C’s XML Editorial Review Board talk about the road to XML: its history, breakthroughs, the participation of Microsoft and Netscape, and the work that remains.
`
`WORK IN PROGRESS 13
`In “The Web Is Ruined and I Ruined It,” self-proclaimed HTML Terrorist David Siegel discusses how proper separation of structure (HTML), style (CSS), and semantics (XML) makes content more compelling and design more effective.
`
`TIMELINE 22
`Recent, noteworthy W3C events
`
`W3C REPORTS 27
`See next page for detailed listing
`
`TECHNICAL PAPERS 95
`See next page for detailed listing
`
`This issue’s cover image was photographed by Kevin Thomas and manipulated in Adobe Photoshop 4.0 by Edie Freedman.
`
`
`
`
`
`
`
`
`
`

`

`W3C REPORTS
`
`Extensible Markup Language (XML)
`TIM BRAY, JEAN PAOLI, C.M. SPERBERG-MCQUEEN
`
`Extensible Markup Language (XML) Part 2: Linking
`TIM BRAY, STEVE DEROSE
`
`HTML-Math: Mathematical Markup Language Working Draft
`ROBERT R. MINER, PATRICK D. F. ION
`
`Document Object Model Requirements
`LAUREN WOOD, JARED SORENSEN
`
`TECHNICAL PAPERS
`
`A Guide to XML
`NORMAN WALSH
`
`XML and CSS
`STUART CULSHAW, MICHAEL LEVENTHAL, AND MURRAY MALONEY
`
`The Evolution of Web Documents: The Ascent of XML
`DAN CONNOLLY, ROHIT KHARE, ADAM RIFKIN
`
`Embedded Markup Considered Harmful
`THEODOR HOLM NELSON
`
`
`

`

`
`
`
`
`
`Chemical Markup Language: A Simple Introduction to Structured Documents 135
`PETER MURRAY-RUST
`
`Codifying Medical Records in XML: Philosophy and Engineering 149
`THOMAS L. LINCOLN
`
`XML: Can the Desperate Perl Hacker Do It? 153
`MICHAEL LEVENTHAL
`
`XML: From Bytes to Characters 165
`BERT BOS
`
`An Introduction to XML Processing with Lark 177
`TIM BRAY
`
`Building XML Parsers for Microsoft’s IE4 187
`JEAN PAOLI, DAVID SCHACH, CHRIS LOVETT, ANDREW LAYMAN, ISTVAN CSERI
`
`JUMBO: An Object-Based XML Browser 197
`PETER MURRAY-RUST
`
`Capturing the State of Distributed Systems with XML 207
`ROHIT KHARE, ADAM RIFKIN
`
`XML, Java, and the Future of the Web 219
`JON BOSAK
`
`WIDL: Application Integration with XML 229
`CHARLES ALLEN
`
`

`

`
`
`
`
`
`
`
`
`
`XML BACKGROUND
`
`
`The Road to XML
`
`
`ADAPTING SGML TO THE WEB
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`Many computer scientists have talked about simplifying SGML. The W3C’s XML Editorial Review Board has been working at it since July ’96. So far, their efforts have received almost universal acclaim. Recently D.C. Denison canvassed a group of Editorial Review Board (ERB) members, and asked them to look back on how the XML project got off the ground, and where they think it’s going from here.
`
`XML wasn’t the only acronym in the running when W3C’s Working Group began to consider a name for what they hoped to create: specifications for a subset of SGML that was optimized for the Web.
`
`“There were several acronyms that we considered,” Tim Bray remembers. “I believe there was MGML, for Minimal Generalized Markup Language, and something called SIMPL for Simple Internet Markup Protocol, or something like that. Eventually we voted, and XML, for Extensible Markup Language, won out. It was short and sweet, and people liked it.”
`
`“Marketing XML to the HTML user was one of our prime goals,” Jean Paoli adds. “We thought that putting the spin on the ‘Extensibility’ part of the language would attract the HTML user.”
`
`Choosing a name for the project was trivial, of course, compared to some of the other challenges that faced the group when they first started working together in July 1996. Many other efforts to simplify SGML had run out of steam long before reaching the proposal stage. Somehow, however, this group managed to pull it off, publishing a working draft that’s received wide acceptance in both the SGML and Web communities. How did they do it?
`
`
`
`
`
`
`

`

`
`
`
`
`
`
`
`
`Let’s Go Back a Few Years
`
`A slimmed-down SGML is not a new concept. Many members of the XML team have been discussing the idea for years.
`
`“Most computer scientists who have worked with SGML have proposed simplifications; specifically, keeping all the structural flexibility but losing many syntax options,” ERB member Steve DeRose says. “I’ve heard of about a dozen proposals over the years.”
`
`Some of the XML authors, in fact, were already
`using a sort of proto-XML.
`
`“I think it’s important to understand that I and some other people had actually been doing XML for years,” Tim Bray says. “A lot of people who are in the business were actually using SGML data in the case of open text searching and displaying. In the case of electronic book technology, there was a similar kind of story: we had long observed the fact that if they sent you some nicely tagged text you could do any number, any amount of useful things with it without worrying about the minutiae of the standard and without having to have a DTD. So what XML in effect is has been around for a long time.”
`
`Dave Hollander was another XML author who had already jumped the gun, so to speak.
`
`“I developed a simplified SGML language while working on HP’s LaserROM program in the early ’90s,” he recalls. “That evolved into the language used in our HP-UX help systems.”
`
`The rise of the Web, and HTML, pressed other members of the ERB to approach XML from the other direction.
`
`“Everyone who uses HTML for very long discovers that they want ‘just one more tag,’” according to Steve DeRose. “If you’re doing catalogs you need a <PRICE> tag; for repair manuals you need <PARTNUM>; for ancient manuscripts you need <LACUNA> and <SIC>. Having been through this enough times, I want to be able to create new information structures any time my data justifies them, and do it easily. This is why C++ lets you make your own classes (imagine a development environment that didn’t!), and it’s why XML is absolutely necessary. To do generalized processing, retrieval, etc., I have to be able to say what things in documents are. I can do that with XML, but I can’t do it with any one particular fixed tag set.”
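`
`DeRose’s point about being able to say what things in a document are can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the <CATALOG> and <ITEM> wrapper elements, the sample part numbers and prices, and the helper prices_by_part are invented for the example; only the <PRICE> and <PARTNUM> tag names come from his remarks. The parser is the standard-library xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog data; the wrapper elements and values are assumptions.
CATALOG = """<CATALOG>
  <ITEM>
    <PARTNUM>A-100</PARTNUM>
    <PRICE>19.95</PRICE>
  </ITEM>
  <ITEM>
    <PARTNUM>B-200</PARTNUM>
    <PRICE>5.50</PRICE>
  </ITEM>
</CATALOG>"""


def prices_by_part(xml_text):
    """Map each part number to its price by looking elements up by tag name."""
    root = ET.fromstring(xml_text)
    return {
        item.findtext("PARTNUM"): float(item.findtext("PRICE"))
        for item in root.iter("ITEM")
    }


prices = prices_by_part(CATALOG)
print(prices)
```

`Because the tags name what each value is, the lookup does not depend on layout or position in the page.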
`
`Jean Paoli was also well aware of HTML’s shortcomings. “I discovered that a lot of Web content providers were using what they called ‘structured comments’ to hide information in their HTML,” he says. “I was convinced that they needed a simple way to extend HTML, and I always thought that it could be a kind of simplified SGML that my SGML customers were all already using.”
`
`Jon Bosak was similarly inspired.
`
`
`
`
`
`
`
`
`“XML arose from the realization that HTML is insufficient for certain kinds of Web applications,” he says. “I was one of the people who came to this realization early because I was working in a field, online technical documentation, where the requirements are well understood and it’s clear that HTML can’t meet them. I was putting this complex material in online browsers used by millions of people before anyone had heard of HTML or the Web, and I knew from experience that HTML wasn’t going to work for that kind of publishing. I knew that it wouldn’t work well for any kind of large-scale content production. So I could see a time coming when large content providers would have to turn from HTML to something more powerful. The question was, what would they provide?
`
`“I could see only two possibilities: either the big software companies would offer proprietary and probably binary-coded formats or we could get them to adopt a single, standard, human-readable format. The only standard solution that I knew could do the job was SGML.”
`
`Bosak’s solution: “I started a working group in the W3C to provide specifications that could put SGML on the Web. What came out of that activity was XML, a subset of SGML designed for Web use.”
`
`Working and Evangelizing
`
`The official W3C group, originally called the SGML ERB, began working together in July 1996; the larger mailing list discussions, the SGML Working Group, started the following September. Work proceeded quietly through most of 1996 and early 1997, via teleconferences, email, and the occasional conference. (In July ’97, the SGML ERB became the XML WG and the SGML WG became the XML SIG.)
`
`Meanwhile interest was growing, as the XML authors discussed the project with their colleagues. Perhaps it was an early indication of XML’s flexibility that some authors, like Tim Bray, found that they could tailor their descriptions of XML to their audience.
`
`“If I was talking to people to whom search and retrieval is very important, I would point out that when you invent your own tags, you can use them to drive searches and that’s a lot better,” he recalls.
`
`“When I was talking to people to whom Java and that whole type of thing is important, I would point out that HTML is fine but it doesn’t give Java much to chew on. And XML does. And if I was talking to people who are in the publishing business and are irritated at HTML’s fairly primitive page make-up facilities, I would point out that one solution to that is to de-couple the markup syntax and the formatting semantics, and XML does that.”
`
`When ERB member C.M. Sperberg-McQueen spoke to colleagues about XML, he promoted “the ability to use your own tags, rather than the rather eccentric and constricted vocabulary of HTML,” he recalls. “That’s easily the most important aspect of XML from the point of view of academic research. The ability to write an XML parser that fits in 50 Kb of memory also captured the attention of a lot of programmers and tool developers.”
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`Eve Maler found that the XML applications that generated the most excitement were “the ones that blur the distinction between information delivery and transacting business, such as ordering a new part by clicking on a part number in an online service bulletin. And the idea of using XML as an exchange protocol for purely transaction-oriented applications is also pretty popular, as we’ve seen by the quick promotion of XML-based EDI initiatives.
`
`“Of course, for many people who have had exposure only to HTML, they’re most impressed simply by the notion that tags can have meaning,” Maler continues. “Many of the business and technical requirements they’ve conceived to date could be addressed with this one innovation!”
`
`Soon, a certain software company began to show an interest in XML. Jean Paoli, of Microsoft, a member of the original SGML Editorial Review Board, had been aggressively evangelizing XML to the company’s Explorer product teams.
`
`“When I talked about XML to the people here at Microsoft,” Paoli remembers, “I always stressed its ability to encode data, not documents. Nobody at Microsoft understood why you would want to use XML for things that HTML is good for. But data? Yes. And describing customers and orders? Yes. Financial information? Yes. So I always sold XML to the database people, the people who understood the value of structuring data.”
`
`“Adam Bosworth (who designed Microsoft Access) and Thomas Reardon helped me a lot selling this idea.”
`
`“But, even more important, it was the Channel Definition Format (CDF) that helped sell the whole XML story to Microsoft,” Paoli continues. “At that moment (February ’97), the push battle was terrible between Netscape and Microsoft, and the Internet Explorer team was searching for a good data file format to represent Webcasting information. It was evident that XML was a good choice. I presented XML to the managers of the Microsoft Internet Push team and we modeled their Webcasting data in ten minutes! It took only a few days to decide to use XML. The first XML application (CDF) by Microsoft gave Microsoft a big win. This was the beginning of a lot of PR around XML. Starting XML with a winning application was a great thing for XML!”
`
`In March ’97, Microsoft officially announced that they were going to base their new Channel Definition Format on XML. This generated a fair amount of interest in XML among programmers and Internet professionals.
`
`Breakthroughs
`
`As late ’96 turned into early ’97, two events brought a new level of attention to the XML project. The first was the SGML ’96 conference, held in November 1996.
`
`“The SGML ’96 conference was a watershed,” Steve DeRose remembers, “because it was not clear whether the SGML community would see XML as SGML writ large, or as some kind of competitor. Since SGML software already supports tag extensibility, variant delimiters, etc., and the SGML market has huge amounts of high-value data, this community is important. The SGML community saw the benefits of simplicity and ease of adoption and jumped on board. The Web community has done the same, though for different reasons: extensibility and validation. The beauty of XML is that it gets you the best of both worlds; but any technology like that overlaps partly with both of the things it draws on; the reception in both communities is therefore crucial. As soon as I saw the major SGML vendors and the major Web vendors all diving in, I knew we were in good shape.”
`
`The WWW6 Conference, held early in 1997 in Santa Clara, California, was another milestone.
`
`“We put on a major PR blitz at that conference, and I think it went over pretty well,” Tim Bray recalls. “I think XML was one of the hot stories of that conference. By May 1 of ’97 it was pretty obvious we were onto something that was going to be significant. And it’s grown since then.”
`
`“Microsoft announced CDF based on XML a few weeks before the WWW6 conference, on purpose, in order to boost the interest in XML,” Paoli remembers. “I took a bunch of Microsoft people who were involved in XML to the conference, and we made as much noise as possible in all the XML sessions.”
`
`“The SGML people got it as soon as they saw XML,” Jon Bosak recalls, “because they all come from industries that had to solve this problem a long time ago. The HTML people only got it this year; that’s when they started hitting the wall in large numbers, in terms of having to deal with significant levels of content. At the WWW5 Conference in Paris a year earlier, not many people knew what I was talking about. But when we presented the XML draft at the WWW6 Conference in April ’97, about half the faces in the audience lit up. Those were content providers and Web site administrators who’d finally hit that wall. They knew that they had a problem, they just didn’t know what to do about it. As soon as they saw XML, they knew.”
`
`Microsoft versus Netscape
`
`Soon Netscape joined Microsoft in agreeing to support the new standard. Tim Bray began working with Netscape as a consultant. Articles on XML began showing up in a variety of print magazines and online publications. Predictably, many media stories played up the Microsoft-versus-Netscape angle.
`
`Many ERB members tend to downplay the importance of the competition between Microsoft and Netscape, but they all agree it will have an impact.
`
`“Looking at this purely from the industry point of view,” Jon Bosak says, “the competition can only do us good by accelerating the acceptance of a truly open, human-readable data format.”
`
`
`
`

`

`
`
`“The participation of both Microsoft and Netscape has been very beneficial,” C.M. Sperberg-McQueen adds. “They bring a particular technical perspective to the discussions: the view of the world from a large programming shop with enormous numbers of current users is rather different from the view of the world from an academic institution or from a smaller commercial organization. In that sense, the Microsoft and Netscape viewpoints have been more similar than different, in my view.”
`
`Steve DeRose believes that competitive issues will not intrude on the creation of the XML specification.
`
`“The competition between Microsoft and Netscape would be almost a non-issue if not for a few over-excited articles,” he says. “All the representatives on the XML Working Group are deeply committed to doing the right thing, and to a consensus process. Neither Netscape nor Microsoft has tried to dominate the process or to foist any self-serving proposals on the [XML working] group. Also, I think both companies realize they have better places to compete than over syntax. Let them and everyone else compete on user interface quality, reliability, performance, and functionality, not on who can dream up new tag names or punctuation marks faster.”
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`Details, Details
`
`Although XML has met with an enthusiastic reception, the ERB members are well aware of the work that remains. First and foremost, they have to finish the specification.
`
`“It would be nice if we could finish XML 1.0 and get it snapshotted,” Tim Bray says. “We should get it blessed by W3C as a recommendation, and maybe even get it blessed by another standards organization as well, just so that we have a line in the sand and can say, ‘Okay, this phase is done.’ I think we need to do that simply because there are so many implementations happening so fast that just to be fair to the people who believed in what we’ve done we have to stop changing it. We have to stop and say, ‘Okay, here’s what it is. Maybe it’s not perfect yet, it could be improved still further, but here’s 1.0 and that’s what 1.0 is.’ I think clearly by the end of the year we must have 1.0 finished, blessed, and canonized. There will still be lots of other things to work on. The 1.0 version won’t have a solution to the style sheet problem, it won’t have a solution for lots of other things, but the base language has to be frozen.”
`
`Jon Bosak, for one, is hopeful that the big issues are behind them.
`
`“I may be whistling in the dark,” he says. “But aside from the political issues we’re going to have to deal with as a result of competition, I don’t think that XML really faces any major problems once we get the specification for 1.0 finished. It’s been designed to be easy to implement, and outside of all the last-minute internationalization details, it hasn’t really changed much for a while. The basics have been in place since last November ’96, and most of the finer points were settled by April ’97.”
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`Fortunately, XML will be easier to develop than HTML, according to Tim Bray.
`
`“HTML is painfully difficult to evolve,” he says, “because it is a mixture of formatting semantics and hypertext semantics and GUI semantics with forms and so on. And trying to evolve all of those capabilities at once without breaking them is very difficult. Now XML, the basic language, has a syntax and there’s going to be a style sheet facility and there’s going to be various behavior facilities. That doesn’t mean that evolving any of this stuff is easy, it just means that you can partition the problems and solve them without having to solve them all at once, which is the problem that HTML faces. So a lot of the advanced capabilities that users of the Web are asking for, I think, are going to be easier to solve in an XML context.”
`
`Still there are details on top of details.
`
`“In addition to the greater complexity of XML itself,” Bosak says, “we’re dealing with all kinds of issues that were never confronted directly in HTML: how to handle whitespace, for example, or whether to make stuff like tag names case-sensitive or not, or whether the Japanese character for an ideographic space is really a space or not. Lots of nitty but mind-bogglingly complex problems that finally can’t be sidestepped any more. And there was a big policy question, which was what to do about error handling, but we’re past that now.”
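`
`One of the questions Bosak lists, whether tag names are case-sensitive, was settled in favor of case sensitivity in the XML 1.0 Recommendation as eventually published. The following is a minimal sketch of what that means in practice, using Python’s standard-library xml.etree.ElementTree parser; the helper is_well_formed and the sample element names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Start and end tags must match exactly, including letter case.
matched = is_well_formed("<Title>Lark</Title>")
mismatched = is_well_formed("<Title>Lark</title>")

# Names that differ only in case are two different element types.
root = ET.fromstring("<doc><Price>1</Price><price>2</price></doc>")
distinct = [child.tag for child in root]

print(matched, mismatched, distinct)
```

`A case-insensitive design would have had to treat the second document as well-formed and the two child elements as the same type.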
`
`“The real action,” Bosak continues, “shifts now
`to the other two pieces of the puzzle, the linking
`piece and the style sheet piece. We call
`them XLL, for extensible linking language, and
`XSL, for extensible style sheet language. XML
`itself is just about syntax. With XLL and XSL
`we get into semantics, and that’s where the
`real competition is going to be: how you actually
`do stuff.”
`
`“The hardest thing, in general,” Steve DeRose
`says, “is to look far enough ahead to make
`sure that the language will scale up smoothly
`and accommodate later extensions without
`getting kludgy. The broad-stroke picture is very
`clear, but if you don’t pin all the details down
`well enough, systems won’t interoperate and
`you lose a central benefit of standardization.
`
`Yet still ahead, after the big technical problems
`are largely solved, there’s another challenge:
`inspiring people to exploit the new possibilities
`that come with XML.
`
`“Now that it is reasonable to expect
`next-generation tools to have better control over
`encoding information,” Dave Hollander says,
`“we need to get ready to use these features.
`My next key initiative is how to get authors,
`collaborators, and consumers of information to
`make the best use of the new capabilities.”
`
`
`
`
`
`
`
`“Now, we have to encourage the market to
`create specific horizontal and vertical DTDs, to
`build common vocabularies,” Paoli says. “We
`need to let content providers generate useful
`XML data while we, the software and tool
`builders, build tools which access and use
`this data.”
`
`“It’s nice to see descriptive markup move into
`the mainstream and be adopted so quickly.
`I hope that it will let us really move data into
`forms that will outlast rev 5.3.9.1b of somebody’s
`word processor, and help make bit-rot
`a non-issue for the future of literature.”
`
`
`

`

`
`There’s plenty to do, to be sure. Yet, at this
`point it appears likely that the early work of
`the XML ERB has created enough momentum
`to carry the project to completion.
`
`“What’s important, from here on in, is to keep
`all these activities moving toward the goal we
`started with in July 1996,” Jon Bosak says. “It’s
`more like a snowball gathering speed down a
`slope now. It doesn’t need pushing, it just
`needs to be kept pointed in the right
`direction.”
`
`
`
`
`
`
`
