` Request for Comments: 1341 N. Freed, Innosoft
` June 1992
`
` MIME (Multipurpose Internet Mail Extensions):
`
` Mechanisms for Specifying and Describing
` the Format of Internet Message Bodies
`
` Status of this Memo
`
` This RFC specifies an IAB standards track protocol for the
` Internet community, and requests discussion and suggestions
` for improvements. Please refer to the current edition of
` the "IAB Official Protocol Standards" for the
` standardization state and status of this protocol.
` Distribution of this memo is unlimited.
`
` Abstract
`
` RFC 822 defines a message representation protocol which
` specifies considerable detail about message headers, but
` which leaves the message content, or message body, as flat
` ASCII text. This document redefines the format of message
` bodies to allow multi-part textual and non-textual message
` bodies to be represented and exchanged without loss of
` information. This is based on earlier work documented in
` RFC 934 and RFC 1049, but extends and revises that work.
` Because RFC 822 said so little about message bodies, this
` document is largely orthogonal to (rather than a revision
` of) RFC 822.
`
` In particular, this document is designed to provide
` facilities to include multiple objects in a single message,
` to represent body text in character sets other than US-
` ASCII, to represent formatted multi-font text messages, to
` represent non-textual material such as images and audio
` fragments, and generally to facilitate later extensions
` defining new types of Internet mail for use by cooperating
` mail agents.
`
` This document does NOT extend Internet mail header fields to
` permit anything other than US-ASCII text data. It is
` recognized that such extensions are necessary, and they are
` the subject of a companion document [RFC-1342].
`
` A table of contents appears at the end of this document.
`
` Borenstein & Freed [Page i]
`
` 1 Introduction
`
` Since its publication in 1982, RFC 822 [RFC-822] has defined
` the standard format of textual mail messages on the
` Internet. Its success has been such that the RFC 822 format
` has been adopted, wholly or partially, well beyond the
` confines of the Internet and the Internet SMTP transport
` defined by RFC 821 [RFC-821]. As the format has seen wider
` use, a number of limitations have proven increasingly
` restrictive for the user community.
`
` RFC 822 was intended to specify a format for text messages.
` As such, non-text messages, such as multimedia messages that
` might include audio or images, are simply not mentioned.
` Even in the case of text, however, RFC 822 is inadequate for
` the needs of mail users whose languages require the use of
` character sets richer than US ASCII [US-ASCII]. Since RFC
` 822 does not specify mechanisms for mail containing audio,
` video, Asian language text, or even text in most European
` languages, additional specifications are needed.
`
` One of the notable limitations of RFC 821/822 based mail
` systems is the fact that they limit the contents of
` electronic mail messages to relatively short lines of
` seven-bit ASCII. This forces users to convert any non-
` textual data that they may wish to send into seven-bit bytes
` representable as printable ASCII characters before invoking
` a local mail UA (User Agent, a program with which human
` users send and receive mail). Examples of such encodings
` currently used in the Internet include pure hexadecimal,
` uuencode, the 3-in-4 base 64 scheme specified in RFC 1113,
` the Andrew Toolkit Representation [ATK], and many others.
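`
` As a rough illustration of the 3-in-4 idea mentioned above,
` the following Python sketch uses the standard library's
` base64 module. That module implements a later form of the
` encoding; it is assumed here, for illustration only, to
` behave like the RFC 1113 scheme. The sample octets are
` hypothetical.

```python
import base64

# Hypothetical binary payload: three octets, one with the high bit set.
raw = bytes([0x00, 0xFF, 0x10])

# Every 3 input octets become 4 printable ASCII characters.
encoded = base64.b64encode(raw)

# The encoded form contains only 7-bit values.
is_seven_bit = all(octet < 128 for octet in encoded)

# Decoding restores the original octets without loss.
decoded = base64.b64decode(encoded)
```

` The cost of this kind of encoding is size: the encoded form
` is about one third larger than the original data.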
`
` The limitations of RFC 822 mail become even more apparent as
` gateways are designed to allow for the exchange of mail
` messages between RFC 822 hosts and X.400 hosts. X.400 [X400]
` specifies mechanisms for the inclusion of non-textual body
` parts within electronic mail messages. The current
` standards for the mapping of X.400 messages to RFC 822
` messages specify that either X.400 non-textual body parts
` should be converted to (not encoded in) an ASCII format, or
` that they should be discarded, notifying the RFC 822 user
` that discarding has occurred. This is clearly undesirable,
` as information that a user may wish to receive is lost.
` Even though a user’s UA may not have the capability of
` dealing with the non-textual body part, the user might have
` some mechanism external to the UA that can extract useful
` information from the body part. Moreover, it does not allow
` for the fact that the message may eventually be gatewayed
` back into an X.400 message handling system (i.e., the X.400
` message is "tunneled" through Internet mail), where the
` non-textual information would definitely become useful
` again.
`
` RFC 1341   MIME: Multipurpose Internet Mail Extensions   June 1992
`
` This document describes several mechanisms that combine to
` solve most of these problems without introducing any serious
` incompatibilities with the existing world of RFC 822 mail.
` In particular, it describes:
`
` 1. A MIME-Version header field, which uses a version number
` to declare a message to be conformant with this
` specification and allows mail processing agents to
` distinguish between such messages and those generated
` by older or non-conformant software, which is presumed
` to lack such a field.
`
` 2. A Content-Type header field, generalized from RFC 1049
` [RFC-1049], which can be used to specify the type and
` subtype of data in the body of a message and to fully
` specify the native representation (encoding) of such
` data.
`
` 2.a. A "text" Content-Type value, which can be used to
` represent textual information in a number of
` character sets and formatted text description
` languages in a standardized manner.
`
` 2.b. A "multipart" Content-Type value, which can be
` used to combine several body parts, possibly of
` differing types of data, into a single message.
`
` 2.c. An "application" Content-Type value, which can be
` used to transmit application data or binary data,
` and hence, among other uses, to implement an
` electronic mail file transfer service.
`
` 2.d. A "message" Content-Type value, for encapsulating
` a mail message.
`
` 2.e. An "image" Content-Type value, for transmitting
` still image (picture) data.
`
` 2.f. An "audio" Content-Type value, for transmitting
` audio or voice data.
`
` 2.g. A "video" Content-Type value, for transmitting
` video or moving image data, possibly with audio as
` part of the composite video data format.
`
` 3. A Content-Transfer-Encoding header field, which can be
` used to specify an auxiliary encoding that was applied
` to the data in order to allow it to pass through mail
` transport mechanisms which may have data or character
` set limitations.
`
` 4. Two optional header fields that can be used to further
` describe the data in a message body, the Content-ID and
` Content-Description header fields.
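`
` To make the list above concrete, the following sketch builds
` a two-part message with Python's standard email package.
` That package follows later revisions of MIME, so it may emit
` header fields or parameters that this document does not
` define; the addresses, subject, and sample octets are
` hypothetical.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "sender@example.com"       # hypothetical address
msg["To"] = "recipient@example.com"      # hypothetical address
msg["Subject"] = "Report with attachment"

# A text body; the library also adds the MIME-Version field.
msg.set_content("The data file is attached.")

# Adding a part of another type turns the message into multipart/mixed.
msg.add_attachment(
    bytes([0, 1, 2, 255]),
    maintype="application",
    subtype="octet-stream",
)

text_part, binary_part = msg.iter_parts()

# The optional descriptive fields can be set on an individual body part.
binary_part["Content-ID"] = "<sample-part@example.com>"
binary_part["Content-Description"] = "Four sample octets"
```

` Calling msg.as_string() on the result shows the header
` fields of the message and of each body part as they would
` be transmitted.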
`
` MIME has been carefully designed as an extensible mechanism,
` and it is expected that the set of content-type/subtype
` pairs and their associated parameters will grow
` significantly with time. Several other MIME fields, notably
` including character set names, are likely to have new values
` defined over time. In order to ensure that the set of such
` values is developed in an orderly, well-specified, and
` public manner, MIME defines a registration process which
` uses the Internet Assigned Numbers Authority (IANA) as a
` central registry for such values. Appendix F provides
` details about how IANA registration is accomplished.
`
` Finally, to specify and promote interoperability, Appendix A
` of this document provides a basic applicability statement
` for a subset of the above mechanisms that defines a minimal
` level of "conformance" with this document.
`
` HISTORICAL NOTE: Several of the mechanisms described in
` this document may seem somewhat strange or even baroque at
` first reading. It is important to note that compatibility
` with existing standards AND robustness across existing
` practice were two of the highest priorities of the working
` group that developed this document. In particular,
` compatibility was always favored over elegance.
`
` 2 Notations, Conventions, and Generic BNF Grammar
`
` This document is being published in two versions, one as
` plain ASCII text and one as PostScript. The latter is
` recommended, though the textual contents are identical. An
` Andrew-format copy of this document is also available from
` the first author (Borenstein).
`
` Although the mechanisms specified in this document are all
` described in prose, most are also described formally in the
` modified BNF notation of RFC 822. Implementors will need to
` be familiar with this notation in order to understand this
` specification, and are referred to RFC 822 for a complete
` explanation of the modified BNF notation.
`
` Some of the modified BNF in this document makes reference to
` syntactic entities that are defined in RFC 822 and not in
` this document. A complete formal grammar, then, is obtained
` by combining the collected grammar appendix of this document
` with that of RFC 822.
`
` The term CRLF, in this document, refers to the sequence of
` the two ASCII characters CR (13) and LF (10) which, taken
` together, in this order, denote a line break in RFC 822
` mail.
`
` The term "character set", wherever it is used in this
` document, refers to a coded character set, in the sense of
` ISO character set standardization work, and must not be
` misinterpreted as meaning "a set of characters."
`
` The term "message", when not further qualified, means either
` the (complete or "top-level") message being transferred on a
` network, or a message encapsulated in a body of type
` "message".
`
` The term "body part", in this document, means one of the
` parts of the body of a multipart entity. A body part has a
` header and a body, so it makes sense to speak about the body
` of a body part.
`
` The term "entity", in this document, means either a message
` or a body part. All kinds of entities share the property
` that they have a header and a body.
`
` The term "body", when not further qualified, means the body
` of an entity, that is the body of either a message or of a
` body part.
`
` Note: the previous four definitions are clearly circular.
` This is unavoidable, since the overall structure of a MIME
` message is indeed recursive.
`
` In this document, all numeric and octet values are given in
` decimal notation.
`
` It must be noted that Content-Type values, subtypes, and
` parameter names as defined in this document are case-
` insensitive. However, parameter values are case-sensitive
` unless otherwise specified for the specific parameter.
`
` FORMATTING NOTE: This document has been carefully formatted
` for ease of reading. The PostScript version of this
` document, in particular, places notes like this one, which
` may be skipped by the reader, in a smaller, italicized,
` font, and indents them as well. In the text version, only the
` indentation is preserved, so if you are reading the text
` version of this you might consider using the PostScript
` version instead. However, all such notes will be indented
` and preceded by "NOTE:" or some similar introduction, even
` in the text version.
`
` The primary purpose of these non-essential notes is to
` convey information about the rationale of this document, or
` to place this document in the proper historical or
` evolutionary context. Such information may be skipped by
` those who are focused entirely on building a compliant
` implementation, but may be of use to those who wish to
` understand why this document is written as it is.
`
` For ease of recognition, all BNF definitions have been
` placed in a fixed-width font in the PostScript version of
` this document.
`
` 3 The MIME-Version Header Field
`
` Since RFC 822 was published in 1982, there has really been
` only one format standard for Internet messages, and there
` has been little perceived need to declare the format
` standard in use. This document is an independent document
` that complements RFC 822. Although the extensions in this
` document have been defined in such a way as to be compatible
` with RFC 822, there are still circumstances in which it
` might be desirable for a mail-processing agent to know
` whether a message was composed with the new standard in
` mind.
`
` Therefore, this document defines a new header field, "MIME-
` Version", which is to be used to declare the version of the
` Internet message body format standard in use.
`
` Messages composed in accordance with this document MUST
` include such a header field, with the following verbatim
` text:
`
` MIME-Version: 1.0
`
` The presence of this header field is an assertion that the
` message has been composed in compliance with this document.
`
` Since it is possible that a future document might extend the
` message format standard again, a formal BNF is given for the
` content of the MIME-Version field:
`
` MIME-Version := text
`
` Thus, future format specifiers, which might replace or
` extend "1.0", are (minimally) constrained by the definition
` of "text", which appears in RFC 822.
`
` Note that the MIME-Version header field is required at the
` top level of a message. It is not required for each body
` part of a multipart entity. It is required for the embedded
` headers of a body of type "message" if and only if the
` embedded message is itself claimed to be MIME-compliant.
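`
` A mail-processing agent might test for this field roughly
` as follows. This is a minimal sketch using Python's standard
` email parser; the function name declares_mime and the
` sample messages are hypothetical, and only the verbatim
` value "1.0" given above is accepted.

```python
from email import message_from_string


def declares_mime(raw_message: str) -> bool:
    """Return True if the top-level header carries MIME-Version: 1.0."""
    msg = message_from_string(raw_message)
    value = msg.get("MIME-Version")
    return value is not None and value.strip() == "1.0"


# Hypothetical sample messages, one with the field and one without.
new_style = "From: a@example.com\nMIME-Version: 1.0\n\nHello.\n"
old_style = "From: a@example.com\n\nHello.\n"

new_result = declares_mime(new_style)
old_result = declares_mime(old_style)
```

` Only the top-level header is inspected here, since the field
` is not required on the body parts of a multipart entity.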
`
` 4 The Content-Type Header Field
`
` The purpose of the Content-Type field is to describe the
` data contained in the body fully enough that the receiving
` user agent can pick an appropriate agent or mechanism to
` present the data to the user, or otherwise deal with the
` data in an appropriate manner.
`
` HISTORICAL NOTE: The Content-Type header field was first
` defined in RFC 1049. RFC 1049 Content-types used a simpler
` and less powerful syntax, but one that is largely compatible
` with the mechanism given here.
`
` The Content-Type header field is used to specify the nature
` of the data in the body of an entity, by giving type and
` subtype identifiers, and by providing auxiliary information
` that may be required for certain types. After the type and
` subtype names, the remainder of the header field is simply a
` set of parameters, specified in an attribute/value notation.
` The set of meaningful parameters differs for the different
` types. The ordering of parameters is not significant.
` Among the defined parameters is a "charset" parameter by
` which the character set used in the body may be declared.
` Comments are allowed in accordance with RFC 822 rules for
` structured header fields.
`
` In general, the top-level Content-Type is used to declare
` the general type of data, while the subtype specifies a
` specific format for that type of data. Thus, a Content-Type
` of "image/xyz" is enough to tell a user agent that the data
` is an image, even if the user agent has no knowledge of the
` specific image format "xyz". Such information can be used,
` for example, to decide whether or not to show a user the raw
` data from an unrecognized subtype -- such an action might be
` reasonable for unrecognized subtypes of text, but not for
` unrecognized subtypes of image or audio. For this reason,
` registered subtypes of audio, image, text, and video should
` not contain embedded information that is really of a
` different type. Such compound types should be represented
` using the "multipart" or "application" types.
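`
` One possible reading of the paragraph above, as a sketch:
` a user agent that meets an unrecognized subtype looks only
` at the top-level type to decide whether showing the raw
` data is reasonable. The function name may_show_raw is
` hypothetical, and a real agent could apply a different
` policy.

```python
def may_show_raw(content_type: str) -> bool:
    """Decide, from the top-level type alone, whether raw display is reasonable.

    Intended for a subtype the agent does not recognize.
    """
    top_level = content_type.split("/", 1)[0].strip().lower()
    # Raw text is still readable; raw image or audio octets are not.
    return top_level == "text"
```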
`
` Parameters are modifiers of the content-subtype, and do not
` fundamentally affect the requirements of the host system.
` Although most parameters make sense only with certain
` content-types, others are "global" in the sense that they
` might apply to any subtype. For example, the "boundary"
` parameter makes sense only for the "multipart" content-type,
` but the "charset" parameter might make sense with several
` content-types.
`
` An initial set of seven Content-Types is defined by this
` document. This set of top-level names is intended to be
` substantially complete. It is expected that additions to
` the larger set of supported types can generally be
` accomplished by the creation of new subtypes of these
` initial types. In the future, more top-level types may be
` defined only by an extension to this standard. If another
` primary type is to be used for any reason, it must be given
` a name starting with "X-" to indicate its non-standard
` status and to avoid a potential conflict with a future
` official name.
`
` In the Extended BNF notation of RFC 822, a Content-Type
` header field value is defined as follows:
`
` Content-Type := type "/" subtype *[";" parameter]
`
` type := "application" / "audio"
`       / "image" / "message"
`       / "multipart" / "text"
`       / "video" / x-token
`
` x-token := <The two characters "X-" followed, with no
` intervening white space, by any token>
`
` subtype := token
`
` parameter := attribute "=" value
`
` attribute := token
`
` value := token / quoted-string
`
` token := 1*<any CHAR except SPACE, CTLs, or tspecials>
`
` tspecials := "(" / ")" / "<" / ">" / "@"  ; Must be in
`            / "," / ";" / ":" / "\" / <">  ; quoted-string,
`            / "/" / "[" / "]" / "?" / "."  ; to use within
`            / "="                          ; parameter values
`
` Note that the definition of "tspecials" is the same as the
` RFC 822 definition of "specials" with the addition of the
` three characters "/", "?", and "=".
`
` Note also that a subtype specification is MANDATORY. There
` are no default subtypes.
`
` The type, subtype, and parameter names are not case
` sensitive. For example, TEXT, Text, and TeXt are all
` equivalent. Parameter values are normally case sensitive,
` but certain parameters are interpreted to be case-
` insensitive, depending on the intended use. (For example,
` multipart boundaries are case-sensitive, but the "access-
` type" for message/External-body is not case-sensitive.)
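`
` The grammar and case rules above can be sketched as a small
` parser. This is an illustration under stated assumptions,
` not a complete implementation: it does not handle RFC 822
` comments or folded lines, it assumes linear white space is
` tolerated around "/", ";" and "=", and it leaves all
` parameter values as written, with no per-parameter case
` folding. The name parse_content_type is hypothetical.

```python
import re

# token: one or more CHARs other than SPACE, CTLs, or tspecials.
TOKEN = r"[!#$%&'*+\-0-9A-Z^_`a-z{|}~]+"
# quoted-string in the RFC 822 sense, allowing backslash-quoted pairs.
QUOTED = r'"(?:[^"\\\r\n]|\\.)*"'

TYPE_RE = re.compile(rf"\s*({TOKEN})\s*/\s*({TOKEN})\s*")
PARAM_RE = re.compile(rf";\s*({TOKEN})\s*=\s*({TOKEN}|{QUOTED})\s*")

STANDARD_TYPES = {
    "application", "audio", "image", "message",
    "multipart", "text", "video",
}


def parse_content_type(value: str):
    """Return (type, subtype, params) for a Content-Type field value."""
    match = TYPE_RE.match(value)
    if match is None:
        raise ValueError("a type and a subtype are both mandatory")
    # Type and subtype names are not case sensitive.
    main, sub = match.group(1).lower(), match.group(2).lower()
    is_x_token = main.startswith("x-") and len(main) > 2
    if main not in STANDARD_TYPES and not is_x_token:
        raise ValueError(f"unknown top-level type: {main}")

    params = {}
    pos = match.end()
    while pos < len(value):
        param = PARAM_RE.match(value, pos)
        if param is None:
            raise ValueError(f"malformed parameter at offset {pos}")
        name, raw = param.group(1).lower(), param.group(2)
        if raw.startswith('"'):
            # Strip the quotes and undo backslash quoting.
            raw = re.sub(r"\\(.)", r"\1", raw[1:-1])
        params[name] = raw  # value case is preserved
        pos = param.end()
    return main, sub, params
```

` For example, a value such as a/b=c contains tspecials and so
` can appear as a boundary only when written as a
` quoted-string; the parser returns it with the quotes removed.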
`
` Beyond this syntax, the only constraint on the definition of
` subtype names is the desire that their uses must not
` conflict. That is, it would be undesirable to have two
` different communities using "Content-Type:
` application/foobar" to mean two different things. The
` process of defining new content-subtypes, then, is not
` intended to be a mechanism for imposing restrictions, but
` simply a mechanism for publicizing the usages. There are,
` therefore, two acceptable mechanisms for defining new
` Content-Type subtypes:
`
` 1. Private values (starting with "X-") may be
` defined bilaterally between two cooperating
` agents without outside registration or
` standardization.
`
` 2. New standard values must be documented,
` registered with, and approved by IANA, as
` described in Appendix F. Where intended for
` public use, the formats they refer to must
` also be defined by a published specification,
` and possibly offered for standardization.
`
` The seven standard initial predefined Content-Types are
` detailed in the bulk of this document. They are:
`
` text -- textual information. The primary subtype,
` "plain", indicates plain (unformatted) text. No
` special software is required to get the full
` meaning of the text, aside from support for the
` indicated character set. Subtypes are to be used
` for enriched text in forms where application
` software may enhance the appearance of the text,
` but such software must not be required in order to
` get the general idea of the content. Possible
` subtypes thus include any readable word processor
` format. A very simple and portable subtype,
` richtext, is defined in this document.
` multipart -- data consisting of multiple parts of
` independent data types. Four initial subtypes
` are defined, including the primary "mixed"
` subtype, "alternative" for representing the same
` data in multiple formats, "parallel" for parts
` intended to be viewed simultaneously, and "digest"
` for multipart entities in which each part is of
` type "message".
` message -- an encapsulated message. A body of
` Content-Type "message" is itself a fully formatted
` RFC 822 conformant message which may contain its
` own different Content-Type header field. The
` primary subtype is "rfc822". The "partial"
` subtype is defined for partial messages, to permit
` the fragmented transmission of bodies that are
` thought to be too large to be passed through mail
` transport facilities. Another subtype,
` "External-body", is defined for specifying large
` bodies by reference to an external data source.
` image -- image data. Image requires a display device
` (such as a graphical display, a printer, or a FAX
` machine) to view the information. Initial
` subtypes are defined for two widely-used image
` formats, jpeg and gif.
` audio -- audio data, with initial subtype "basic".
` Audio requires an audio output device (such as a
` speaker or a telephone) to "display" the contents.
` video -- video data. Video requires the capability to
` display moving images, typically including
` specialized hardware and software. The initial
` subtype is "mpeg".
` application -- some other kind of data, typically
` either uninterpreted binary data or information to
` be processed by a mail-based application. The
` primary subtype, "octet-stream", is to be used in
` the case of uninterpreted binary data, in which
` case the simplest recommended action is to offer
` to write the information into a file for the user.
` Two additional subtypes, "ODA" and "PostScript",
` are defined for transporting ODA and PostScript
` documents in bodies. Other expected uses for
` "application" include spreadsheets, data for
` mail-based scheduling systems, and languages for
` "active" (computational) email. (Note that active
` email entails several security considerations,
` which are discussed later in this memo,
` particularly in the context of
` application/PostScript.)
`
` Default RFC 822 messages are typed by this protocol as plain
` text in the US-ASCII character set, which can be explicitly
` specified as "Content-type: text/plain; charset=us-ascii".
` If no Content-Type is specified, either by error or by an
` older user agent, this default is assumed. In the presence
` of a MIME-Version header field, a receiving User Agent can
` also assume that plain US-ASCII text was the sender’s
` intent. In the absence of a MIME-Version specification,
` plain US-ASCII text must still be assumed, but the sender’s
` intent might have been otherwise.
`
` RATIONALE: In the absence of any Content-Type header field
` or MIME-Version header field, it is impossible to be certain
` that a message is actually text in the US-ASCII character
` set, since it might well be a message that, using the
` conventions that predate this document, includes text in
` another character set or non-textual data in a manner that
` cannot be automatically recognized (e.g., a uuencoded
` compressed UNIX tar file). Although there is no fully
` acceptable alternative to treating such untyped messages as
` "text/plain; charset=us-ascii", implementors should remain
` aware that if a message lacks both the MIME-Version and the
` Content-Type header fields, it may in practice contain
` almost anything.
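`
` The defaulting rule and its caveat can be sketched as
` follows, using Python's standard email parser. The function
` name effective_content_type and the second return value,
` which records whether the sender's intent can be assumed,
` are hypothetical conveniences for this illustration.

```python
from email import message_from_string

DEFAULT_CONTENT_TYPE = "text/plain; charset=us-ascii"


def effective_content_type(raw_message: str):
    """Return (content_type_value, intent_known) for a message."""
    msg = message_from_string(raw_message)
    declared = msg.get("Content-Type")
    if declared is not None:
        return declared.strip(), True
    # No Content-Type: the default applies either way, but only a
    # MIME-Version field lets the reader assume it was intended.
    return DEFAULT_CONTENT_TYPE, msg.get("MIME-Version") is not None


# Hypothetical samples: a MIME message and a pre-MIME message, both untyped.
typed_by_default = effective_content_type("MIME-Version: 1.0\n\nHello.\n")
untyped_legacy = effective_content_type("Subject: hi\n\nbegin 644 x.tar.Z\n")
```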
`
` It should be noted that the list of Content-Type values
` given here may be augmented in time, via the mechanisms
` described above, and that the set of subtypes is expected to
` grow substantially.
`
` When a mail reader encounters mail with an unknown Content-
` type value, it should generally treat it as equivalent to
` "application/octet-stream", as described later in this
` document.
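`
` A mail reader might apply this fallback along the following
` lines. This is a sketch only: the function name
` type_for_handling and the set of supported types passed to
` it are hypothetical, and parameters are simply discarded
` before the comparison.

```python
FALLBACK_TYPE = "application/octet-stream"


def type_for_handling(content_type: str, supported: set) -> str:
    """Return the type/subtype a reader would act on."""
    # Drop any parameters and compare without regard to case.
    normalized = content_type.split(";", 1)[0].strip().lower()
    return normalized if normalized in supported else FALLBACK_TYPE


# Hypothetical set of types this reader knows how to present.
supported_types = {"text/plain", "image/gif"}

known = type_for_handling("Image/GIF", supported_types)
unknown = type_for_handling("x-vendor/thing; version=2", supported_types)
```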
`
` 5 The Content-Transfer-Encoding Header Field
`
` Many Content-Types which could usefully be transported via
` email are represented, in their "natural" format, as 8-bit
` character or binary data. Such data cannot be transmitted
` over some transport protocols. For example, RFC 821
` restricts mail messages to 7-bit US-ASCII data with 1000
` character lines.
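`
` As an illustration of that restriction, the sketch below
` tests whether a body could pass unmodified through such a
` transport. It assumes, as a reading of RFC 821 rather than
` something stated here, that the 1000-character limit counts
` the terminating CRLF; the function name
` fits_seven_bit_transport is hypothetical.

```python
def fits_seven_bit_transport(data: bytes, max_line: int = 1000) -> bool:
    """True if data is 7-bit and every line fits within max_line octets.

    The two octets of each terminating CRLF are counted against the limit.
    """
    if any(octet > 127 for octet in data):
        return False
    # A final segment with no CRLF is still charged two octets, which
    # keeps the check conservative.
    return all(len(line) + 2 <= max_line for line in data.split(b"\r\n"))
```

` Data that fails this test is the kind that needs the
` re-encoding described next.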
`
` It is necessary, therefore, to define a standard mechanism
` for re-encoding such data into a 7-bit short-line format.
` This document specifies that such encodings will be
` indicated by a new "Content-Transfer-Encoding" header field.
` The Content-Transfer-