3/30/2017
`
`Extensible Markup Language (XML)
`
`Extensible Markup Language (XML)
`
`W3C Working Draft 14-Nov-96
`
`WD-xml-961114
`
`This version:
`http://www.w3.org/pub/WWW/TR/WD-xml-961114.html
`Previous versions:
`Latest version:
`http://www.textuality.com/sgml-erb/WD-xml.html
`
`Editors:
`Tim Bray (Textuality) <tbray@textuality.com>
`C. M. Sperberg-McQueen (University of Illinois at Chicago) <cmsmcq@uic.edu>
`
`Status of this memo
`
`This is a W3C Working Draft for review by W3C members and other interested parties. It is a draft document
`and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to use W3C
`Working Drafts as reference material or to cite them as other than "work in progress". A list of current W3C
`working drafts can be found at: http://www.w3.org/pub/WWW/TR
`
`Note: since working drafts are subject to frequent change, you are advised to reference the above URL, rather
`than the URLs for working drafts themselves.
`
`This work is part of the W3C SGML Activity.
`
`Abstract
`
`Extensible Markup Language (XML) is an extremely simple dialect of SGML which is completely described in
`this document. The goal is to enable generic SGML to be served, received, and processed on the Web in the way
`that is now possible with HTML. For this reason, XML has been designed for ease of implementation, and for
`interoperability with both SGML and HTML.
`Note on status of this document: This is even more of a moving target than the typical W3C working draft.
`Several important decisions on the details of XML are still outstanding - members of the W3C SGML Working
`Group will recognize these areas of particular volatility in the spec, but those who are not intimately familiar
`with the deliberative process should be careful to avoid actions based on the content of this document, until the
`notice you are now reading has been removed.
`
`Table of Contents
`
`1. Introduction
`1.1 Origin and Goals
`1.2 Relationship to Other Standards
`1.3 Notation
`
`https://www.w3.org/TR/WD-xml-961114#sec1.1
`
`IBM EX. 1023
`
`1.4 Terminology
`1.5 Common Syntactic Constructs
`2. Documents
`2.1 Logical and Physical Structure
`2.2 Characters
`2.3 Syntax of Text and Markup
`2.4 Comments
`2.5 Processing Instructions
`2.6 Marked Sections
`2.7 White Space Handling
`2.8 Prolog and Document Type Declaration
`2.9 Required Markup Declaration
`3. Logical Structures
`3.1 Start- and End-Tags
`3.2 Well-Formed XML Documents
`3.3 Element Declaration
`3.3.1 Mixed Content
`3.3.2 Element Content
`3.4 Attribute Declaration
`3.4.1 Attribute Types
`3.4.2 Attribute Defaults
`3.5 Partial DTD Information
`4. Physical Structures
`4.1 Character and Entity References
`4.2 Declaring Entities
`4.2.1 Internal Entities
`4.2.2 External Entities
`4.2.3 Character Encoding in Entities
`4.2.4 The Document Entity
`4.3 XML Processor Treatment of Entities
`4.4 Notation Declaration
`5. Conformance
`A. XML and SGML
`B. References
`C. Working Group and Editorial Review Board Membership
`C.1 Working Group
`C.2 Editorial Review Board
`
`1. Introduction
`
`Extensible Markup Language, abbreviated XML, describes a class of data objects stored on computers and
`partially describes the behavior of programs which process these objects. Such objects are called XML
`documents. XML is an application profile or restricted form of SGML, the Standard Generalized Markup
`Language [ISO 8879].
`
`XML documents are made up of storage units called entities, which contain either text or binary data. Text is
`made up of characters, some of which form the character data in the document, and some of which form markup.
`Markup encodes a description of the document's storage layout, structure, and arbitrary attribute-value pairs
`associated with that structure. XML provides a mechanism to impose constraints on the storage layout and
`logical structure.
`
`A software module called an XML processor is used to read XML documents and provide access to their content
`and structure. It is assumed that an XML processor is doing its work on behalf of another module, referred to as
`the application. This specification describes some of the required behavior of an XML processor in terms of the
`manner it must read XML data, and the information it must provide to the application.
`
`1.1 Origin and Goals
`
`XML was developed by a Generic SGML Editorial Review Board formed under the auspices of the W3
`Consortium in 1996 and chaired by Jon Bosak of Sun Microsystems, with the very active participation of a
`Generic SGML Working Group also organized by the W3C. The membership of these groups is given in an
`appendix.
`
`The design goals for XML are:
`
`1. XML shall be straightforwardly usable over the Internet.
`2. XML shall support a wide variety of applications.
`3. XML shall be compatible with SGML.
`4. It shall be easy to write programs which process XML documents.
`5. The number of optional features in XML is to be kept to the absolute minimum, ideally zero.
`6. XML documents should be human-legible and reasonably clear.
`7. The XML design should be prepared quickly.
`8. The design of XML shall be formal and concise.
`9. XML documents shall be easy to create.
`10. Terseness is of minimal importance.
`
`This specification, together with the associated standards, provides all the information necessary to understand
`XML version 1.0 and construct computer programs to process it.
`
`This version of the XML specification (0.01) is for internal discussion within the SGML ERB only. It should not
`be distributed outside the ERB.
`
`Known problems in version 0.01:
`
1. Several items in the bibliography have no references to them; several references in the text do not point to
anything in the bibliography.
`2. The EBNF grammar has not been checked for completeness, and has at least two start productions.
`3. The description of conformance in the final section is incomplete.
`4. Language exists in the spec which describes the effect of several decisions which have not been taken.
`Specifically, XML may have INCLUDE/IGNORE marked sections as does SGML, the comment syntax
`may change, XML may have CONREF attributes, the 8879 syntax for EMPTY elements may be
`outlawed, XML may choose to rule out what 8879 calls "ambiguous" content models, XML may choose to
`prohibit overlap between enumerated attribute values for different attributes, the handling for attribute
`values in the absence of a DTD may be specified, there may be a way to signal whether the DTD is
`complete, the DTD summary may be dropped, and XML may support parameter entities, and XML may
`predefine a large number of character entities, for example those from HTML 3.2.
`
`1.2 Relationship to Other Standards
`
`Other standards relevant to users and implementors of XML include:
`
`1. SGML (ISO 8879-1986). Valid XML Documents are SGML documents in the sense described in ISO
`standard 8879.
`2. Unicode, ISO 10646. This specification depends on ISO standard 10646 and the technically identical
`Unicode Standard, Version 2.0, which define the encodings and meanings of the characters which make up
`XML text data.
`
`3. IETF RFC 1738. This specification defines the syntax and semantics of Uniform Resource Locators, or
`URLs.
`4. World-Wide Web Consortium Working Draft WD-html32-960909: HTML 3.2 Reference Specification.
`This includes the repertoire of characters to be predefined by an XML processor.
`
`1.3 Notation
`
`The formal grammar of XML is given using a simple Extended Backus-Naur Form (EBNF) notation. Each rule
`in the grammar defines one non-terminal or terminal symbol of the grammar, in the form
`
`symbol ::= expression
`
`Symbols are written with an initial capital letter if they are defined by a regular expression, with an initial
`lowercase letter if they have a more complex definition (i.e. if they require a stack for proper recognition).
`Literal strings are quoted; unless otherwise noted they are not case-sensitive. The distinction between symbols
`which can and cannot be recognized using simple regular-expressions is made for clarity only. It may be
reflected in the boundary between an implementation's lexical scanner and its parser, but there are no
`assumptions about the placement of such a boundary, nor even that the implementation has separate modules for
`parser and lexical scanner.
`
`Within the expression on the right-hand side of a rule, the meaning of symbols is as shown below:
`
#NN
where NN is a decimal integer, the expression matches the character in ISO 10646 whose UCS-4 bit-string,
when interpreted as an unsigned binary number, has the value indicated
#xNN
where NN is a hexadecimal integer, the expression matches the character in ISO 10646 whose UCS-4 bit-string,
when interpreted as an unsigned binary number, has the value indicated
[#xNN-#xNN], [a-zA-Z]
matches any character with a value in the range(s) indicated (inclusive)
[^#xNN-#xNN], [^a-z]
matches any character with a value outside the range indicated
[^abc]
matches any character with a value not among the characters given
"string"
matches the literal string given inside the double quotes
'string'
matches the literal string given inside the single quotes
a b
a followed by b
a | b
a or b but not both
a?
a or nothing; optional a
a+
one or more occurrences of a
a*
zero or more occurrences of a
(expression)
expression is treated as a unit; allows subgroups to carry the operators ?, *, or +
/* ... */
comment
[ WFC: ... ]
Well-formedness check; this identifies by name a check for well-formedness that is associated with a
production.
[ VC: ... ]
Validity check; this identifies by name a check for validity that is associated with a production.
`
`1.4 Terminology
`
`Some terms used with special meaning in this specification are:
`
may
Conforming data and software may but need not behave as described.
must
Conforming data and software must behave as described; otherwise they are in error.
error
A violation of the rules of this specification; results are undefined. Conforming software may detect and
report an error and may recover from it.
`reportable error
`An error which conforming software must report to the user, unless the user has explicitly disabled error
`reporting.
`validity constraint
`A rule which applies to all valid XML documents. Violations of validity constraints are errors; they must
`be reported by validating XML processors.
`well-formedness constraint
`A rule which applies to all well-formed XML documents. Violations of well-formedness constraints are
`reportable errors.
`at user option
`Conforming software may or must (depending on the verb in the sentence) provide users a means to select
`the behavior described; it must also allow the user not to select it.
`
`match
`
`Case-insensitive match: two strings or names being compared must be identical except for differences
`between upper- and lower-case letters in scripts which have such a distinction. Characters with multiple
`possible representations in ISO 10646 (e.g. both precomposed and base+diacritic forms) match only if
`they have the same representation, except for case differences, in both strings. Case folding must be
`performed as specified in The Unicode Standard, Version 2.0, section 4.1; in particular, it is recommended
`that case-insensitive matching be performed by folding uppercase letters to lowercase, not vice versa.
`exact(ly) match
`Case-sensitive match: two strings or names being compared must be identical. Characters with multiple
`possible representations in ISO 10646 (e.g. both precomposed and base+diacritic forms) match only if
`they have the same representation in both strings.
`for compatibility
`A feature of XML included solely to ensure that XML remains compatible with SGML; the expectation is
`that in many cases, those aspects of SGML that are not required to satisfy XML's requirements but
`mandated only to achieve conformance may be removed or replaced in the near future by the organizations
`that maintain that standard.
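
The difference between "match" and "exact(ly) match" can be illustrated with a minimal Python sketch. The function names xml_match and xml_exact_match are hypothetical, and Python's str.lower() follows the Unicode version bundled with the interpreter rather than Unicode 2.0, so this only approximates the folding rule described above.

```python
def xml_match(a: str, b: str) -> bool:
    """Case-insensitive match: fold uppercase to lowercase, then compare."""
    # No normalization is applied, so a precomposed character and its
    # base+diacritic form remain different, as the definition requires.
    return a.lower() == b.lower()


def xml_exact_match(a: str, b: str) -> bool:
    """Case-sensitive match: the two strings must be identical."""
    return a == b


print(xml_match("Greeting", "GREETING"))        # differ only in case
print(xml_exact_match("Greeting", "GREETING"))  # case differs
print(xml_match("\u00e9", "e\u0301"))           # different representations
```

Under these assumptions the first call reports a match and the other two do not.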
`
`1.5 Common Syntactic Constructs
`
`This section defines some symbols used widely in the grammar.
`
`S (white space) consists of one or more blank characters, carriage returns, line feeds, or tabs.
`
`< 1 White space >
`
`S ::= (#x0020 | #x000a | #x000d | #x0009)+
`
`For some purposes, characters are classified as letters, digits, or other characters:
`
< 2 Name >

Character ::= [#x20-#xFFFFFFFF]           /* any ISO 10646 32-bit code */

Letter ::= [#x41-#x5A] | [#x61-#x7A]      /* Latin 1 upper and lowercase */
         | #xAA | #xB5 | #xBA             /* Latin 1 supplementary letters */
         | [#xC0-#xD6] | [#xD8-#xF6]      /* Latin 1 supplementary letters */
         | [#xF8-#xFF]
         | [#x0100-#x017F]                /* Extended Latin-A */
         | [#x0180-#x0217]                /* Extended Latin-B */
         | [#x0250-#x1FFF]                /* IPA extensions, spacing modifiers,
                                             diacritics, Greek, Coptic, Cyrillic,
                                             Armenian, Hebrew, ... */
         | [#x3040-#x9FFF]                /* CJK */
         | [#xF900-#xFDFF]                /* CJK compatibility ideographs ... */
         | [#xFE70-#xFEFE]                /* Arabic presentation forms B */
         | [#xFF21-#xFF3A]                /* Fullwidth Latin A-Z */
         | [#xFF41-#xFF5A]                /* Fullwidth Latin a-z */
         | [#xFF66-#xFFDC]                /* Halfwidth katakana, hangul */

Digit ::= [#x0030-#x0039]                 /* Correct this table using section 4.5 of
                                             Unicode 2.0
                                             ISO 646 digits */
        | [#x0660-#x0669]                 /* Arabic-Indic digits */
        | [#x06F0-#x06F9]                 /* Eastern Arabic-Indic digits */
        | [#x0966-#x096F]                 /* Devanagari digits */
        | [#x09E6-#x09EF]                 /* Bengali digits */
        | [#x0A66-#x0A6F]                 /* Gurmukhi digits */
        | [#x0AE6-#x0AEF]                 /* Gujarati digits */
        | [#x0B66-#x0B6F]                 /* Oriya digits */
        | [#x0BE7-#x0BEF]                 /* Tamil digits (no zero) */
        | [#x0C66-#x0C6F]                 /* Telugu digits */
        | [#x0CE6-#x0CEF]                 /* Kannada digits */
        | [#x0D66-#x0D6F]                 /* Malayalam digits */
        | [#x0E50-#x0E59]                 /* Thai digits */
        | [#x0ED0-#x0ED9]                 /* Lao digits */
        | [#xFF10-#xFF19]                 /* Fullwidth digits
                                             Ranges taken from Java documentation. Check
                                             against Unicode 2.0, section 4.6.
                                             N.B. not clear whether the relevant Greek and
                                             Hebrew letters should also be digits. Will
                                             matter for NUMBER attributes. */
`
`A Name is a token beginning with a letter or hyphen and continuing with letters, digits, hyphens, or full stops
`(together known as name characters). The use of any name beginning with a string which matches "-XML-" in a
`fashion other than those described in this specification is a reportable error.
`
`A Number is a sequence of digits. An Nmtoken (name token) is any mixture of name characters.
`
`< 3 Names, Numbers, and Tokens >
`
`Name ::= (Letter | '-') (Letter | Digit | '-' | '.')*
`
`Number ::= Digit+
`
`Nmtoken ::= (Letter | Digit | '-' | '.')+
`
`Nmtokens ::= Nmtoken (S Nmtoken)*
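
As an illustration only, the Name, Number, and Nmtoken productions can be turned into regular expressions. The following Python sketch copies the Letter and Digit ranges from the grammar above; the helper names (char_class, is_name, is_number, is_nmtoken) are hypothetical and are not defined by this specification.

```python
import re

# Ranges copied from the Letter and Digit productions of this draft.
LETTER_RANGES = [
    (0x41, 0x5A), (0x61, 0x7A), (0xAA, 0xAA), (0xB5, 0xB5), (0xBA, 0xBA),
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0xFF), (0x0100, 0x017F),
    (0x0180, 0x0217), (0x0250, 0x1FFF), (0x3040, 0x9FFF), (0xF900, 0xFDFF),
    (0xFE70, 0xFEFE), (0xFF21, 0xFF3A), (0xFF41, 0xFF5A), (0xFF66, 0xFFDC),
]
DIGIT_RANGES = [
    (0x0030, 0x0039), (0x0660, 0x0669), (0x06F0, 0x06F9), (0x0966, 0x096F),
    (0x09E6, 0x09EF), (0x0A66, 0x0A6F), (0x0AE6, 0x0AEF), (0x0B66, 0x0B6F),
    (0x0BE7, 0x0BEF), (0x0C66, 0x0C6F), (0x0CE6, 0x0CEF), (0x0D66, 0x0D6F),
    (0x0E50, 0x0E59), (0x0ED0, 0x0ED9), (0xFF10, 0xFF19),
]


def char_class(ranges):
    """Render (low, high) code point pairs as the inside of a regex class."""
    return "".join(f"\\u{lo:04x}-\\u{hi:04x}" for lo, hi in ranges)


LETTER = char_class(LETTER_RANGES)
DIGIT = char_class(DIGIT_RANGES)

# Name    ::= (Letter | '-') (Letter | Digit | '-' | '.')*
NAME_RE = re.compile(f"[{LETTER}\\-][{LETTER}{DIGIT}\\-.]*")
# Number  ::= Digit+
NUMBER_RE = re.compile(f"[{DIGIT}]+")
# Nmtoken ::= (Letter | Digit | '-' | '.')+
NMTOKEN_RE = re.compile(f"[{LETTER}{DIGIT}\\-.]+")


def is_name(s):
    return NAME_RE.fullmatch(s) is not None


def is_number(s):
    return NUMBER_RE.fullmatch(s) is not None


def is_nmtoken(s):
    return NMTOKEN_RE.fullmatch(s) is not None
```

With these definitions, "greeting" and "-XML-SPACE" are Names, while "9lives" is an Nmtoken but not a Name because it begins with a digit.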
`
`Literal data is any quoted string containing neither the quotation mark used as a delimiter nor angle brackets. It
`may contain entity and character references.
`
`< 4 Literals >
`
Literal ::= '"' [^"<>]* '"'
          | "'" [^'<>]* "'"

QuotedCData ::= '"' [^"<>]* '"'
              | "'" [^'<>]* "'"

QuotedNames ::= '"' Names '"' | "'" Names "'"
`
`2. Documents
`
`A textual object is an XML Document if it is either valid, or failing that, well-formed, as defined in this
`specification.
`
`2.1 Logical and Physical Structure
`
`Each XML document has both a logical and a physical structure.
`
`Physically, the document is composed of units called entities; it begins in a "root" or document entity, which
`may refer to other entities, and so on ad infinitum. Entities referred to are embedded in the document at the point
`of reference.
`
`The document contains declarations, elements, comments, entity references, character references, and processing
`instructions, all of which are indicated in the document by explicit markup. These concepts and their markup are
`all explained elsewhere in this specification.
`
`The two structures must be synchronous: tags and elements must each begin and end in the same entity, but may
`refer to other entities internally; comments, processing instructions, character references, and entity references
`must each be contained entirely within a single entity. Entities must each contain an integral number of elements,
`comments, processing instructions, and references, possibly together with character data not contained within
`any element in the entity, or else they must contain non-textual data, which by definition contains no elements.
`
`2.2 Characters
`
`The data stored in an XML entity is either text or binary. Binary data has an associated notation, identified by
`name; beyond a requirement to make available the notation name and the size in bytes of the binary data in a
storage object, XML provides no information about and places no constraints on binary data. So-called
binary data may in fact be textual, perhaps even well-formed XML text; but its identification as binary means
that an XML processor need not parse it in the fashion described by the specification. XML text data is a
sequence of characters. A character is an atomic unit of text represented by a bit string; valid bit
strings and their meanings are specified by ISO 10646.
`
`Users may extend the ISO 10646 character repertoire, in the rare cases where this is necessary, by making use of
`the private use areas.
`
`The mechanism for encoding character values into bit patterns may vary from entity to entity. All XML
`processors must accept the UTF-8 and UCS-2 encodings of 10646; the mechanisms for signalling which of the
`two are in use, and for bringing other encodings into play, are discussed later, in the discussion of character
`encodings.
`
`Regardless of the specific encoding used, any character in the ISO 10646 character set may be referred to by the
`decimal or hexadecimal equivalent of its bit string:
`
`< 5 Character references >
`
`Hex ::= [0-9a-fA-F]
`
`Hex4 ::= Hex Hex Hex Hex
`
`CharRef ::= '&#' Number ';'
`
`| '&u-' Hex4 ';'
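
A minimal Python sketch of how the two character reference forms in this draft might be expanded follows. The function name expand_char_refs is hypothetical, and the sketch assumes ASCII digits only in the decimal form, which is narrower than the Digit production.

```python
import re

# CharRef ::= '&#' Number ';' | '&u-' Hex4 ';'
# Assumption: Number is limited to ASCII digits here for simplicity.
CHAR_REF = re.compile(r"&#([0-9]+);|&u-([0-9a-fA-F]{4});")


def expand_char_refs(text):
    """Replace each character reference with the character it denotes."""
    def replace(match):
        decimal, hex4 = match.group(1), match.group(2)
        code = int(decimal) if decimal is not None else int(hex4, 16)
        # chr() raises ValueError for values outside Python's code point range.
        return chr(code)
    return CHAR_REF.sub(replace, text)


print(expand_char_refs("&#65;&u-0042;C"))
```

Entity references such as &amp; are left untouched by this sketch, since they are not character references.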
`
`2.3 Syntax of Text and Markup
`
`XML text consists of intermingled character data and markup. Markup takes the form of start-tags, end-tags,
`empty elements, entity references, character references, comments, marked sections, document type declarations,
`and processing instructions. The simplest form of XML processor thus could parse a well-formed XML
`document using the following rules:
`
`< 6 Trivial text grammar >
`
`Trivial ::= (PCDATA | Markup)*
`
`Eq ::= S? '=' S?
`
Markup ::= '<' Name (S Name Eq QuotedCData)* S? '>'      /* start-tags */
         | '</' Name S? '>'                                /* end-tags */
         | '<' Name (S Name Eq QuotedCData)* S? '/>'     /* empty elements */
         | '&' Name ';'                                    /* entity references */
         | '&#' Number ';'                                 /* character references */
         | '&u-' Hex4 ';'                                  /* character references */
         | '<!--' [^-]* ('-' [^-]+)* '-->'               /* comments */
         | '<![CDATA[' MsData ']]>'                        /* marked sections */
         | '<!DOCTYPE' (Name | S)+ ('[' [^]]* ']')? '>'    /* doc type declaration */
         | '<?' [^?]* ('?' [^>]+)* '?>'                  /* processing instructions */

`
`Most processors will require the more complex grammar given in the rest of this specification.
`
`All text that is not markup constitutes the character data of the document.
`
`The ampersand character (&) and the left and right angle bracket (< and >) may appear in their literal form only
`when used as markup delimiters, or within comments, processing instructions, or marked sections. If they are
`needed in the text, they must be represented using the strings "&amp;", "&lt;", and "&gt;". Parsed character data
`is thus any string of characters which does not contain the start-delimiter of any markup. Character data is any
`string of characters not including the marked-section-close delimiter, "]]>". For convenience, the single-quote
`character (') may be represented as "&sq;", and the double-quote character (") as "&dq;".
`
`< 7 Character Data >
`
`PCDATA ::= [^<&]*
`
`MsData ::= [^]]* (((']' ([^]])) | (']]' [^>])) [^]]*)*
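
The escaping rule described above can be sketched in Python as follows. The function names are hypothetical; the replacement strings "&amp;", "&lt;", "&gt;", "&sq;", and "&dq;" are the ones named in this section of the draft.

```python
def escape_pcdata(text):
    """Escape the characters that would otherwise be read as markup delimiters."""
    # The ampersand must be replaced first so that the ampersands introduced
    # by the other replacements are not escaped a second time.
    return (text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;"))


def escape_quotes(text):
    """Optionally escape quote characters using the convenience strings."""
    return text.replace("'", "&sq;").replace('"', "&dq;")


print(escape_pcdata("a < b && c > d"))
```

The order of the replacements is the one design point worth noting: escaping the ampersand last would corrupt the output of the earlier steps.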
`
`2.4 Comments
`
`Comments may appear anywhere that character data may, except in a marked section (more properly, comments
`appearing in a marked section will not be recognized as such). They are not part of the document's character
`data; an XML processor may, but need not, make it possible for an application to retrieve the text of comments.
`An XML processor must inform the application of the length of comments if they are not passed through, to
`enable the application to keep track of the correct location of objects in the XML document. For compatibility,
`the string -- (double-hyphen) may not occur within comments.
`
`< 8 Comments >
`
`Comment ::= '<!--' [^-]* ('-' [^-]+)* '-->'
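
The Comment production translates almost directly into a regular expression. The following Python sketch, with the hypothetical helper is_comment, shows how the production rules out a double hyphen inside a comment.

```python
import re

# Comment ::= '<!--' [^-]* ('-' [^-]+)* '-->'
# Every hyphen inside the comment must be followed by a non-hyphen character.
COMMENT_RE = re.compile(r"<!--[^-]*(?:-[^-]+)*-->")


def is_comment(text):
    """Return True if the whole string is a comment under the production above."""
    return COMMENT_RE.fullmatch(text) is not None


print(is_comment("<!-- a single-hyphen is fine -->"))
print(is_comment("<!-- a double--hyphen is not -->"))
```

Under this reading of the production, the first string is accepted and the second is rejected.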
`
`2.5 Processing Instructions
`
`Processing instructions, usually referred to as PIs, allow the XML processor to pass instructions directly to
`selected applications.
`
`< 9 Processing Instructions >
`
`PI ::= '<?' Name S [^?]* ('?' [^>]+)* '?>'
`
`PIs are not part of the document's character data, but must be passed through to the application. The Name which
`follows the '?' at the beginning of the PI is called the PI target. It is normally the name of a declared notation,
identifying the application to which it belongs. The use of the PI target "XML" in any way other than those
described in this specification is a reportable error.
`
`2.6 Marked Sections
`
Marked sections can occur anywhere character data may occur; they are used to escape blocks of text which may
contain characters which would otherwise be recognized as markup. Marked sections begin with the string
<![CDATA[ and end with the string ]]>:
`
`< 10 Marked Sections >
`
`MS ::= MsStart MsData MsEnd
`
`MsStart ::= '<![CDATA['
`
`MsEnd ::= ']]>'
`
`Within a marked section, only the MsEnd string is recognized, so that angle brackets and ampersands may occur
`in their literal form and need not be escaped using &lt;, etc. Marked sections cannot nest.
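
Because only the MsEnd string is recognized inside a marked section, reading one reduces to a search for the first "]]>". A minimal Python sketch, using the hypothetical function read_marked_section, is given below.

```python
MS_START = "<![CDATA["
MS_END = "]]>"


def read_marked_section(text, pos=0):
    """Return (data, next_pos) for the marked section starting at text[pos]."""
    if not text.startswith(MS_START, pos):
        raise ValueError("no marked section starts at this position")
    data_start = pos + len(MS_START)
    # Only MsEnd is recognized, so the first "]]>" always closes the section.
    data_end = text.find(MS_END, data_start)
    if data_end == -1:
        raise ValueError("marked section is not terminated")
    return text[data_start:data_end], data_end + len(MS_END)


data, next_pos = read_marked_section("<![CDATA[a < b & c]]>tail")
print(data)
```

The same logic shows why marked sections cannot nest: an inner "<![CDATA[" is treated as ordinary data, and the first "]]>" ends the outer section.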
`
`2.7 White Space Handling
`
`While authoring XML documents, it is often convenient to use "white space" (spaces, tabs, blank lines, denoted
`by the nonterminal S in this specification) to set apart the markup for greater readability. Such white space is
`typically not intended for inclusion in the delivered version of the document. On the other hand, "significant"
`white space, to be retained in the delivered version, is common; for example in poetry or source code.
`
`An XML processor must provide two distinct white space handling modes, COLLAPSE and KEEP, and have the
`ability to apply these modes on a per-element basis. They operate as follows:
`
`COLLAPSE
`The XML processor must suppress (i.e. not pass to the application) all white space in an element which
`immediately follows the start-tag and all that which immediately precedes the end-tag. In the element's
`character data, it must convert all sequences of white space characters to a single space (#x0020)
`character, before passing the data to the application.
`
`KEEP
`
`The XML processor must suppress initial line break characters which immediately follow the start-tag of
`the element, and which immediately precede its end-tag. All other characters in the character data of the
`element must be passed to the application without change.
`
`The white space handling mode is signaled through the use of a reserved attribute, whose declaration is as
`follows:
`
`<!ATTLIST * -XML-SPACE (KEEP|COLLAPSE) #IMPLIED>
`
`where the "*" signifies that this attribute may apply to any element.
`
`The value of the attribute sets the white space handling mode for the element and for any contained elements.
`Unless otherwise specified, an XML processor is to set the white space handling mode for the root element of a
`document to COLLAPSE.
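
The two modes can be sketched in Python for the simple case of an element that contains only character data. The function names collapse and keep are hypothetical, and the sketch assumes that "line break characters" means carriage return (#x000d) and line feed (#x000a).

```python
import re

# The characters of the nonterminal S: space, line feed, carriage return, tab.
S_CHARS = " \n\r\t"
S_RUN = re.compile(r"[ \n\r\t]+")


def collapse(char_data):
    """COLLAPSE: drop leading and trailing white space, squeeze the rest."""
    return S_RUN.sub(" ", char_data.strip(S_CHARS))


def keep(char_data):
    """KEEP: drop only the line breaks next to the start-tag and end-tag."""
    return char_data.strip("\r\n")


content = "\n  Hello,\n\t world!  \n"
print(repr(collapse(content)))
print(repr(keep(content)))
```

An element with child elements would need the same treatment applied around each piece of character data, which this sketch does not attempt.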
`
`2.8 Prolog and Document Type Declaration
`
`The function of the markup in an XML document is to describe its storage and logical structures, and associate
`attribute-value pairs with the logical structure. XML provides a mechanism, the document type declaration, to
`define constraints on that logical structure, and to support the use of predefined storage units. An XML document
`is said to be valid if there is an associated document type declaration and if the document complies with the
`constraints expressed in it.
`
`The document type declaration must appear before the first element in the document.
`
`< 11 XML document >
`
`document ::= Prolog element Misc*
`
Prolog ::= EncodingDecl? Misc* RMDecl? Misc* (doctypedecl | DtdSummary)? Misc*
`
`Misc ::= Comment | PI | S
`
`For example, the following is a complete XML document, well-formed but not valid:
`
`<greeting>Hello, world!</greeting>
`
`The XML document type declaration may include a pointer to an external entity containing a subset of the
`necessary markup declarations, and may also directly include another, internal, subset of the necessary markup
`declarations.
`
`< 12 Document type declaration >
`
doctypedecl ::= '<!DOCTYPE' S Name Extid? S? ('[' internalsubset* ']' S?)? '>'

internalsubset ::= elementdecl | AttlistDecl | EntityDecl
                 | NotationDecl | DtdSummary | S | Comment
`
`These two subsets, taken together, are properly referred to as the document type definition, abbreviated DTD.
`However, it is a common practice for the bulk of the markup declarations to appear in the external subset, and
`for this subset, usually contained in a file, to be referred to as "the DTD" for a class of documents. For example:
`
`<!DOCTYPE greeting SYSTEM "hello.dtd">
`<greeting>Hello, world!</greeting>
`
`The system identifier hello.dtd indicates the location of a full DTD for the document.
`
`The declarations can also be given locally, as in this slightly larger example:
`
<?XML encoding="UTF-8"?>
<!DOCTYPE greeting [
<!ELEMENT greeting (#PCDATA)>
]>
<greeting>Hello, world!</greeting>

The character-set label <?XML encoding="UTF-8"?> indicates that the document entity is encoded using the UTF-8
transformation of ISO 10646. The legal values of the character set code are given in the discussion of character
encodings.
`
`Individual markup declaration types are described elsewhere in this specification.
`
`2.9 Required Markup Declaration
`
`In many cases, an XML processor can read an XML document and accomplish useful tasks without having first
`processed the entire DTD. However, certain declarations can substantially affect the actions of an XML
`processor. A document author can communicate whether or not DTD processing is necessary using a required
`markup declaration (abbreviated RMD) processing instruction:
`
`< 13 Required markup declaration >
`
`RMDecl ::= '<?XML' S 'RMD' Eq ('NONE' | 'INTERNAL' | 'ALL') S? '?>'
`
`In an RMD, the value NONE indicates that an XML processor can parse the containing document correctly
`without first reading any part of the DTD. The value INTERNAL indicates that the XML processor must read and
`process the internal subset of the DTD to parse the containing document correctly. The value ALL indicates that
the XML processor must read and process the declarations in both the subsets of the DTD to parse the containing
document correctly.
`
`The RMD must indicate that the entire DTD is required if the external subset contains any declarations of
`
`1. undistinguished empty elements, and these elements occur in the document, or
`2. attributes with default values, and elements to which these attributes apply appear in the document, or
`3. entities, and references to those entities appear in the document.
`
`If such declarations occur in the internal but not the external subset, the RMD should take the value INTERNAL.
`It is an error to specify INTERNAL if the external subset is required, or to specify NONE if the internal or
`external subset is required.
`
`If no RMD is provided, the effect is identical to an RMD with the value ALL.
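
The rules for choosing and checking an RMD value can be summarized in a short Python sketch. The function names and the two boolean parameters are hypothetical: internal_needed and external_needed stand for "the internal (or external) subset contains declarations of the three kinds listed above that the document actually uses".

```python
def minimal_rmd(internal_needed, external_needed):
    """Return the least demanding RMD value that is not an error."""
    if external_needed:
        return "ALL"
    if internal_needed:
        return "INTERNAL"
    return "NONE"


def rmd_is_error(declared, internal_needed, external_needed):
    """Check a declared RMD value; None means no RMD was provided."""
    # A missing RMD has the same effect as the value ALL.
    effective = "ALL" if declared is None else declared
    if effective == "NONE":
        return internal_needed or external_needed
    if effective == "INTERNAL":
        return external_needed
    return False


print(minimal_rmd(internal_needed=True, external_needed=False))
print(rmd_is_error("NONE", internal_needed=True, external_needed=False))
```

ALL is never an error in this sketch; it only causes the processor to read more of the DTD than the document strictly requires.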
`
`3. Logical Structures
`
`Each XML document contains one or more elements, the boundaries of which are either delineated by start-tags
`and end-tags, or, for empty elements, are limited to the start-tag. Each element has a type, identified by name
`(sometimes called its generic identifier or GI), and may have a set of attributes. Each attribute has a name and a
`value.
`
`This specification does not constrain the semantics, use, or (beyond syntax) names of the elements and attributes.
`
`3.1 Start- and End-Tags
`
`The beginning of every XML element is marked by a start-tag.
`
`< 14 Start-tag Recognition >
`
`STag ::= '<' Name (S Attribute)* S? '>'
`
Attribute ::= Name Eq QuotedCData      [ VC: Attribute Value Type ]
`
`The Name in the start- and end-tag rules gives the element's type. The Name-QuotedCData pairs are referred to
`as the attributes of the element, with the Name referred to as the attribute name and the content of the
`QuotedCData (the characters between the "'" or '"' delimiters) as the attribute value.
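
A minimal Python sketch of start-tag recognition follows, built from the STag, Attribute, Eq, and QuotedCData productions. It assumes ASCII-only names for brevity, and the function name parse_start_tag is hypothetical.

```python
import re

S = r"[ \n\r\t]"                     # one white space character
NAME = r"[A-Za-z-][A-Za-z0-9.-]*"    # assumption: ASCII letters and digits only
QUOTED = r"""(?:"[^"<>]*"|'[^'<>]*')"""

# Attribute ::= Name Eq QuotedCData, preceded by the S required in STag
ATTR_RE = re.compile(rf"{S}+({NAME}){S}*={S}*({QUOTED})")
# STag ::= '<' Name (S Attribute)* S? '>'
STAG_RE = re.compile(rf"<({NAME})((?:{S}+{NAME}{S}*={S}*{QUOTED})*){S}*>")


def parse_start_tag(tag):
    """Return (element type, {attribute name: attribute value})."""
    match = STAG_RE.fullmatch(tag)
    if match is None:
        raise ValueError("not a start-tag")
    # The attribute value is the text between the quote delimiters.
    attrs = {name: quoted[1:-1]
             for name, quoted in ATTR_RE.findall(match.group(2))}
    return match.group(1), attrs


print(parse_start_tag('<poem -XML-SPACE="KEEP" lang=\'en\'>'))
```

Checking the attribute values against their declared types, as the validity constraint requires, is outside the scope of this sketch.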
`
`Validity Constraint - Attribute Value Type:
`Attribute values must be of the type declared for the attribute. (For attribute types, see the discussion of attribute
`declarations.)
`
`The end of every element which is not empty is marked by an end-tag:
`
`< 15 End-tag Recognition >
`
`ETag ::= '</' Name S? '>'
`
`The Name, once again, gives the element's type.
`
`If an element is empty, the start-tag constitutes the whole element. An empty element takes a special form:
`
`< 16 Tags for empty elements >
`
EmptyElement ::= '<' Name (S Attribute)* S? '/>'
`
`For compatibility, an empty element may have the same syntax as a start-tag; in this case, it cannot be
`recognized based on syntax, but must be declared as being empty. Such elements are called undistinguished
`empty elements.
`
`The text between the start-tag and end-tag is called the element's content:
`
`< 17 Content of elements >
`
`content ::= (element | PCDATA | MS | PI | Comment)*
`
element ::= EmptyElement               /* empty elements */
          | STag content ETag          [ WFC: GI Match ]
`
`Well-Formedness Constraint - GI Match:
`The Name in an element's end-tag must match that in the start-tag.
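
The GI Match constraint lends itself to a stack-based check. The Python sketch below is a simplification under stated assumptions: names are ASCII only, comments, processing instructions, and marked sections are absent from the input, and undistinguished empty elements are not handled. The function name tags_nest_properly is hypothetical. Names are folded to lowercase because "match" in this specification is case-insensitive.

```python
import re

# Matches start-tags, end-tags, and empty-element tags. Attribute values
# cannot contain angle brackets, so [^<>]* is enough to skip over them.
TAG_RE = re.compile(r"<(/?)([A-Za-z-][A-Za-z0-9.-]*)([^<>]*)>")


def tags_nest_properly(text):
    """Return True if every end-tag matches the most recent open start-tag."""
    open_elements = []
    for closing, name, rest in TAG_RE.findall(text):
        name = name.lower()            # case-insensitive GI match
        if closing:
            if not open_elements or open_elements.pop() != name:
                return False
        elif rest.endswith("/"):
            continue                   # empty element: nothing to close
        else:
            open_elements.append(name)
    return not open_elements


print(tags_nest_properly("<greeting>Hello, world!</GREETING>"))
print(tags_nest_properly("<a><b></a></b>"))
```

The first document passes the check despite the difference in case; the second fails because the end-tag for a arrives while b is still open.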
`
`3.2 Well-Formed XML Documents
`
`A textual object is said to be a well-formed XML document if, first, it matches the production above labeled
`XML Document, and if:
`
`1. There are no undistinguished empty elements which have not been specified as such in an element
`declaration.
`2. For each entity reference which appears in the document, the entity name has been declared in the
`document type declaration.
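
The second condition can be sketched as a simple scan for entity references. This assumes references take the form `&name;` and that the set of declared entity names has already been collected from the document type declaration; `undeclared_entities` and `declared` are hypothetical names, and the Name pattern is an ASCII-only simplification.

```python
import re

NAME = r"[A-Za-z_][A-Za-z0-9._-]*"  # simplified, ASCII-only Name (assumption)
ENTITY_REF = re.compile(rf"&({NAME});")


def undeclared_entities(text, declared):
    """Return the sorted names of referenced entities that are not declared."""
    referenced = {m.group(1) for m in ENTITY_REF.finditer(text)}
    return sorted(referenced - set(declared))
```

A document passes this particular check when the function returns an empty list.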
`
`Matching the "XML Document" production implies that:
`
`1. It contains one or more elements.
`2. There is one element, called the root, for which neither the start-tag nor the end-tag is in the content of
`any other element. For all other elements, if the start-tag is in the content of another element, the end-tag is
`in the content of the same element. More simply stated, the elements, delineated by start- and end-tags,
`nest within each other properly.
`
`As a consequence of this, for each non-root element C, there is one other element P such that C is in the content
`of P, but is not in the content of any other element that is in the content of P. Then P is referred to as the parent
`of C, and C as the child of P.
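
The parent and child relationship follows directly from proper nesting and can be derived with a stack. This is a sketch that assumes the input is already well-formed and uses only the '/>' form for empty elements; `parents` is a hypothetical helper and the tag pattern is the same ASCII-only simplification as before.

```python
import re

NAME = r"[A-Za-z_][A-Za-z0-9._-]*"  # simplified, ASCII-only Name (assumption)
TAG = re.compile(rf"<(?P<close>/?)(?P<gi>{NAME})[^<>]*?(?P<empty>/?)>")


def parents(text):
    """List (element type, index of parent or None) in document order."""
    elements = []  # one entry per element, in order of its start-tag
    stack = []     # indices into `elements` of the currently open elements
    for m in TAG.finditer(text):
        if m.group("close"):
            stack.pop()  # well-formed input assumed, so this is the match
            continue
        # The innermost open element is the parent; the root has none.
        parent = stack[-1] if stack else None
        elements.append((m.group("gi"), parent))
        if not m.group("empty"):
            stack.append(len(elements) - 1)
    return elements
```

For `<doc><sec><p>x</p><br/></sec></doc>`, the root `doc` has no parent, `sec` is a child of `doc`, and both `p` and `br` are children of `sec`.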
`
`3.3 Element Declaration
`
`The element structure of an XML document may be declared fully or partially. Such declarations serve two
`purposes:
`
`1. To establish a set of structural constraints, i.e. a grammar, for a class of documents, and to verify that
`documents are valid, i.e. comply with that grammar.
`2. To make XML documents well-formed by declaring undistinguished empty elements.
`
`An element declaration constrains the element's type and its content. The content constraints will be described
`first; four forms are available: empty, any, mixed content, and element content.
`
`Declarations often contain references to element types, for example when constraining which element types can
`appear as children of others, and which attributes may be attached to which element types. At user option, an
`XML processor may issue a warning when n
