draft-ietf-cbor-7049bis-16.txt | rfc8949.txt | |||
---|---|---|---|---|
Network Working Group C. Bormann | Internet Engineering Task Force (IETF) C. Bormann | |||
Internet-Draft Universitaet Bremen TZI | Request for Comments: 8949 Universität Bremen TZI | |||
Obsoletes: 7049 (if approved) P. Hoffman | STD: 94 P. Hoffman | |||
Intended status: Standards Track ICANN | Obsoletes: 7049 ICANN | |||
Expires: 3 April 2021 30 September 2020 | Category: Standards Track December 2020 | |||
ISSN: 2070-1721 | ||||
Concise Binary Object Representation (CBOR) | Concise Binary Object Representation (CBOR) | |||
draft-ietf-cbor-7049bis-16 | ||||
Abstract | Abstract | |||
The Concise Binary Object Representation (CBOR) is a data format | The Concise Binary Object Representation (CBOR) is a data format | |||
whose design goals include the possibility of extremely small code | whose design goals include the possibility of extremely small code | |||
size, fairly small message size, and extensibility without the need | size, fairly small message size, and extensibility without the need | |||
for version negotiation. These design goals make it different from | for version negotiation. These design goals make it different from | |||
earlier binary serializations such as ASN.1 and MessagePack. | earlier binary serializations such as ASN.1 and MessagePack. | |||
This document is a revised edition of RFC 7049, with editorial | This document obsoletes RFC 7049, providing editorial improvements, | |||
improvements, added detail, and fixed errata. This revision formally | new details, and errata fixes while keeping full compatibility with | |||
obsoletes RFC 7049, while keeping full compatibility of the | the interchange format of RFC 7049. It does not create a new version | |||
interchange format from RFC 7049. It does not create a new version | ||||
of the format. | of the format. | |||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This is an Internet Standards Track document. | |||
provisions of BCP 78 and BCP 79. | ||||
Internet-Drafts are working documents of the Internet Engineering | ||||
Task Force (IETF). Note that other groups may also distribute | ||||
working documents as Internet-Drafts. The list of current Internet- | ||||
Drafts is at https://datatracker.ietf.org/drafts/current/. | ||||
Internet-Drafts are draft documents valid for a maximum of six months | This document is a product of the Internet Engineering Task Force | |||
and may be updated, replaced, or obsoleted by other documents at any | (IETF). It represents the consensus of the IETF community. It has | |||
time. It is inappropriate to use Internet-Drafts as reference | received public review and has been approved for publication by the | |||
material or to cite them other than as "work in progress." | Internet Engineering Steering Group (IESG). Further information on | |||
Internet Standards is available in Section 2 of RFC 7841. | ||||
This Internet-Draft will expire on 3 April 2021. | Information about the current status of this document, any errata, | |||
and how to provide feedback on it may be obtained at | ||||
https://www.rfc-editor.org/info/rfc8949. | ||||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2020 IETF Trust and the persons identified as the | Copyright (c) 2020 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents | |||
license-info) in effect on the date of publication of this document. | (https://trustee.ietf.org/license-info) in effect on the date of | |||
Please review these documents carefully, as they describe your rights | publication of this document. Please review these documents | |||
and restrictions with respect to this document. Code Components | carefully, as they describe your rights and restrictions with respect | |||
extracted from this document must include Simplified BSD License text | to this document. Code Components extracted from this document must | |||
as described in Section 4.e of the Trust Legal Provisions and are | include Simplified BSD License text as described in Section 4.e of | |||
provided without warranty as described in the Simplified BSD License. | the Trust Legal Provisions and are provided without warranty as | |||
described in the Simplified BSD License. | ||||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4 | 1. Introduction | |||
1.1. Objectives . . . . . . . . . . . . . . . . . . . . . . . 4 | 1.1. Objectives | |||
1.2. Terminology . . . . . . . . . . . . . . . . . . . . . . . 6 | 1.2. Terminology | |||
2. CBOR Data Models . . . . . . . . . . . . . . . . . . . . . . 8 | 2. CBOR Data Models | |||
2.1. Extended Generic Data Models . . . . . . . . . . . . . . 9 | 2.1. Extended Generic Data Models | |||
2.2. Specific Data Models . . . . . . . . . . . . . . . . . . 9 | 2.2. Specific Data Models | |||
3. Specification of the CBOR Encoding . . . . . . . . . . . . . 10 | 3. Specification of the CBOR Encoding | |||
3.1. Major Types . . . . . . . . . . . . . . . . . . . . . . . 11 | 3.1. Major Types | |||
3.2. Indefinite Lengths for Some Major Types . . . . . . . . . 14 | 3.2. Indefinite Lengths for Some Major Types | |||
3.2.1. The "break" Stop Code . . . . . . . . . . . . . . . . 14 | 3.2.1. The "break" Stop Code | |||
3.2.2. Indefinite-Length Arrays and Maps . . . . . . . . . . 14 | 3.2.2. Indefinite-Length Arrays and Maps | |||
3.2.3. Indefinite-Length Byte Strings and Text Strings . . . 16 | 3.2.3. Indefinite-Length Byte Strings and Text Strings | |||
3.2.4. Summary of indefinite-length use of major types . . . 17 | 3.2.4. Summary of Indefinite-Length Use of Major Types | |||
3.3. Floating-Point Numbers and Values with No Content . . . . 18 | 3.3. Floating-Point Numbers and Values with No Content | |||
3.4. Tagging of Items . . . . . . . . . . . . . . . . . . . . 20 | 3.4. Tagging of Items | |||
3.4.1. Standard Date/Time String . . . . . . . . . . . . . . 23 | 3.4.1. Standard Date/Time String | |||
3.4.2. Epoch-based Date/Time . . . . . . . . . . . . . . . . 23 | 3.4.2. Epoch-Based Date/Time | |||
3.4.3. Bignums . . . . . . . . . . . . . . . . . . . . . . . 24 | 3.4.3. Bignums | |||
3.4.4. Decimal Fractions and Bigfloats . . . . . . . . . . . 25 | 3.4.4. Decimal Fractions and Bigfloats | |||
3.4.5. Content Hints . . . . . . . . . . . . . . . . . . . . 26 | 3.4.5. Content Hints | |||
3.4.5.1. Encoded CBOR Data Item . . . . . . . . . . . . . 27 | 3.4.5.1. Encoded CBOR Data Item | |||
3.4.5.2. Expected Later Encoding for CBOR-to-JSON | 3.4.5.2. Expected Later Encoding for CBOR-to-JSON Converters | |||
Converters . . . . . . . . . . . . . . . . . . . . 27 | 3.4.5.3. Encoded Text | |||
3.4.5.3. Encoded Text . . . . . . . . . . . . . . . . . . 28 | 3.4.6. Self-Described CBOR | |||
3.4.6. Self-Described CBOR . . . . . . . . . . . . . . . . . 29 | 4. Serialization Considerations | |||
4. Serialization Considerations . . . . . . . . . . . . . . . . 29 | 4.1. Preferred Serialization | |||
4.1. Preferred Serialization . . . . . . . . . . . . . . . . . 29 | 4.2. Deterministically Encoded CBOR | |||
4.2. Deterministically Encoded CBOR . . . . . . . . . . . . . 31 | 4.2.1. Core Deterministic Encoding Requirements | |||
4.2.1. Core Deterministic Encoding Requirements . . . . . . 31 | 4.2.2. Additional Deterministic Encoding Considerations | |||
4.2.2. Additional Deterministic Encoding Considerations . . 32 | 4.2.3. Length-First Map Key Ordering | |||
4.2.3. Length-first Map Key Ordering . . . . . . . . . . . . 34 | 5. Creating CBOR-Based Protocols | |||
5. Creating CBOR-Based Protocols . . . . . . . . . . . . . . . . 35 | 5.1. CBOR in Streaming Applications | |||
5.1. CBOR in Streaming Applications . . . . . . . . . . . . . 35 | 5.2. Generic Encoders and Decoders | |||
5.2. Generic Encoders and Decoders . . . . . . . . . . . . . . 36 | 5.3. Validity of Items | |||
5.3. Validity of Items . . . . . . . . . . . . . . . . . . . . 37 | 5.3.1. Basic validity | |||
5.3.1. Basic validity . . . . . . . . . . . . . . . . . . . 37 | 5.3.2. Tag validity | |||
5.3.2. Tag validity . . . . . . . . . . . . . . . . . . . . 37 | 5.4. Validity and Evolution | |||
5.5. Numbers | ||||
5.4. Validity and Evolution . . . . . . . . . . . . . . . . . 38 | 5.6. Specifying Keys for Maps | |||
5.5. Numbers . . . . . . . . . . . . . . . . . . . . . . . . . 39 | 5.6.1. Equivalence of Keys | |||
5.6. Specifying Keys for Maps . . . . . . . . . . . . . . . . 40 | 5.7. Undefined Values | |||
5.6.1. Equivalence of Keys . . . . . . . . . . . . . . . . . 42 | 6. Converting Data between CBOR and JSON | |||
5.7. Undefined Values . . . . . . . . . . . . . . . . . . . . 43 | 6.1. Converting from CBOR to JSON | |||
6. Converting Data between CBOR and JSON . . . . . . . . . . . . 43 | 6.2. Converting from JSON to CBOR | |||
6.1. Converting from CBOR to JSON . . . . . . . . . . . . . . 43 | 7. Future Evolution of CBOR | |||
6.2. Converting from JSON to CBOR . . . . . . . . . . . . . . 44 | 7.1. Extension Points | |||
7. Future Evolution of CBOR . . . . . . . . . . . . . . . . . . 46 | 7.2. Curating the Additional Information Space | |||
7.1. Extension Points . . . . . . . . . . . . . . . . . . . . 46 | 8. Diagnostic Notation | |||
7.2. Curating the Additional Information Space . . . . . . . . 47 | 8.1. Encoding Indicators | |||
8. Diagnostic Notation . . . . . . . . . . . . . . . . . . . . . 47 | 9. IANA Considerations | |||
8.1. Encoding Indicators . . . . . . . . . . . . . . . . . . . 49 | 9.1. CBOR Simple Values Registry | |||
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 49 | 9.2. CBOR Tags Registry | |||
9.1. Simple Values Registry . . . . . . . . . . . . . . . . . 50 | 9.3. Media Types Registry | |||
9.2. Tags Registry . . . . . . . . . . . . . . . . . . . . . . 50 | 9.4. CoAP Content-Format Registry | |||
9.3. Media Type ("MIME Type") . . . . . . . . . . . . . . . . 51 | 9.5. Structured Syntax Suffix Registry | |||
9.4. CoAP Content-Format . . . . . . . . . . . . . . . . . . . 51 | 10. Security Considerations | |||
9.5. The +cbor Structured Syntax Suffix Registration . . . . . 52 | 11. References | |||
10. Security Considerations . . . . . . . . . . . . . . . . . . . 53 | 11.1. Normative References | |||
11. References . . . . . . . . . . . . . . . . . . . . . . . . . 55 | 11.2. Informative References | |||
11.1. Normative References . . . . . . . . . . . . . . . . . . 55 | Appendix A. Examples of Encoded CBOR Data Items | |||
11.2. Informative References . . . . . . . . . . . . . . . . . 57 | Appendix B. Jump Table for Initial Byte | |||
Appendix A. Examples of Encoded CBOR Data Items . . . . . . . . 59 | Appendix C. Pseudocode | |||
Appendix B. Jump Table for Initial Byte . . . . . . . . . . . . 63 | Appendix D. Half-Precision | |||
Appendix C. Pseudocode . . . . . . . . . . . . . . . . . . . . . 66 | ||||
Appendix D. Half-Precision . . . . . . . . . . . . . . . . . . . 69 | ||||
Appendix E. Comparison of Other Binary Formats to CBOR's Design | Appendix E. Comparison of Other Binary Formats to CBOR's Design | |||
Objectives . . . . . . . . . . . . . . . . . . . . . . . 70 | Objectives | |||
E.1. ASN.1 DER, BER, and PER . . . . . . . . . . . . . . . . . 71 | E.1. ASN.1 DER, BER, and PER | |||
E.2. MessagePack . . . . . . . . . . . . . . . . . . . . . . . 71 | E.2. MessagePack | |||
E.3. BSON . . . . . . . . . . . . . . . . . . . . . . . . . . 72 | E.3. BSON | |||
E.4. MSDTP: RFC 713 . . . . . . . . . . . . . . . . . . . . . 72 | E.4. MSDTP: RFC 713 | |||
E.5. Conciseness on the Wire . . . . . . . . . . . . . . . . . 72 | E.5. Conciseness on the Wire | |||
Appendix F. Well-formedness errors and examples . . . . . . . . 73 | Appendix F. Well-Formedness Errors and Examples | |||
F.1. Examples for CBOR data items that are not well-formed . . 74 | F.1. Examples of CBOR Data Items That Are Not Well-Formed | |||
Appendix G. Changes from RFC 7049 . . . . . . . . . . . . . . . 76 | Appendix G. Changes from RFC 7049 | |||
G.1. Errata processing, clerical changes . . . . . . . . . . . 76 | G.1. Errata Processing and Clerical Changes | |||
G.2. Changes in IANA considerations . . . . . . . . . . . . . 77 | G.2. Changes in IANA Considerations | |||
G.3. Changes in suggestions and other informational | G.3. Changes in Suggestions and Other Informational Components | |||
components . . . . . . . . . . . . . . . . . . . . . . . 77 | Acknowledgements | |||
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . 79 | Authors' Addresses | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 79 | ||||
1. Introduction | 1. Introduction | |||
There are hundreds of standardized formats for binary representation | There are hundreds of standardized formats for binary representation | |||
of structured data (also known as binary serialization formats). Of | of structured data (also known as binary serialization formats). Of | |||
those, some are for specific domains of information, while others are | those, some are for specific domains of information, while others are | |||
generalized for arbitrary data. In the IETF, probably the best-known | generalized for arbitrary data. In the IETF, probably the best-known | |||
formats in the latter category are ASN.1's BER and DER [ASN.1]. | formats in the latter category are ASN.1's BER and DER [ASN.1]. | |||
The format defined here follows some specific design goals that are | The format defined here follows some specific design goals that are | |||
skipping to change at page 4, line 26 | skipping to change at line 155 | |||
to note that this is not a proposal that the grammar in RFC 8259 be | to note that this is not a proposal that the grammar in RFC 8259 be | |||
extended in general, since doing so would cause a significant | extended in general, since doing so would cause a significant | |||
backwards incompatibility with already deployed JSON documents. | backwards incompatibility with already deployed JSON documents. | |||
Instead, this document simply defines its own data model that starts | Instead, this document simply defines its own data model that starts | |||
from JSON. | from JSON. | |||
Appendix E lists some existing binary formats and discusses how well | Appendix E lists some existing binary formats and discusses how well | |||
they do or do not fit the design objectives of the Concise Binary | they do or do not fit the design objectives of the Concise Binary | |||
Object Representation (CBOR). | Object Representation (CBOR). | |||
This document is a revised edition of [RFC7049], with editorial | This document obsoletes [RFC7049], providing editorial improvements, | |||
improvements, added detail, and fixed errata. This revision formally | new details, and errata fixes while keeping full compatibility with | |||
obsoletes RFC 7049, while keeping full compatibility of the | the interchange format of RFC 7049. It does not create a new version | |||
interchange format from RFC 7049. It does not create a new version | ||||
of the format. | of the format. | |||
1.1. Objectives | 1.1. Objectives | |||
The objectives of CBOR, roughly in decreasing order of importance, | The objectives of CBOR, roughly in decreasing order of importance, | |||
are: | are: | |||
1. The representation must be able to unambiguously encode most | 1. The representation must be able to unambiguously encode most | |||
common data formats used in Internet standards. | common data formats used in Internet standards. | |||
skipping to change at page 5, line 26 | skipping to change at line 200 | |||
3. Data must be able to be decoded without a schema description. | 3. Data must be able to be decoded without a schema description. | |||
* Similar to JSON, encoded data should be self-describing so | * Similar to JSON, encoded data should be self-describing so | |||
that a generic decoder can be written. | that a generic decoder can be written. | |||
4. The serialization must be reasonably compact, but data | 4. The serialization must be reasonably compact, but data | |||
compactness is secondary to code compactness for the encoder and | compactness is secondary to code compactness for the encoder and | |||
decoder. | decoder. | |||
* "Reasonable" here is bounded by JSON as an upper bound in | * "Reasonable" here is bounded by JSON as an upper bound in size | |||
size, and by the implementation complexity limiting how much | and by the implementation complexity, which limits the amount | |||
effort can go into achieving that compactness. Using either | of effort that can go into achieving that compactness. Using | |||
general compression schemes or extensive bit-fiddling violates | either general compression schemes or extensive bit-fiddling | |||
the complexity goals. | violates the complexity goals. | |||
5. The format must be applicable to both constrained nodes and high- | 5. The format must be applicable to both constrained nodes and high- | |||
volume applications. | volume applications. | |||
* This means it must be reasonably frugal in CPU usage for both | * This means it must be reasonably frugal in CPU usage for both | |||
encoding and decoding. This is relevant both for constrained | encoding and decoding. This is relevant both for constrained | |||
nodes and for potential usage in applications with a very high | nodes and for potential usage in applications with a very high | |||
volume of data. | volume of data. | |||
6. The format must support all JSON data types for conversion to and | 6. The format must support all JSON data types for conversion to and | |||
skipping to change at page 6, line 30 | skipping to change at line 252 | |||
The term "byte" is used in its now-customary sense as a synonym for | The term "byte" is used in its now-customary sense as a synonym for | |||
"octet". All multi-byte values are encoded in network byte order | "octet". All multi-byte values are encoded in network byte order | |||
(that is, most significant byte first, also known as "big-endian"). | (that is, most significant byte first, also known as "big-endian"). | |||
This specification makes use of the following terminology: | This specification makes use of the following terminology: | |||
Data item: A single piece of CBOR data. The structure of a data | Data item: A single piece of CBOR data. The structure of a data | |||
item may contain zero, one, or more nested data items. The term | item may contain zero, one, or more nested data items. The term | |||
is used both for the data item in representation format and for | is used both for the data item in representation format and for | |||
the abstract idea that can be derived from that by a decoder; the | the abstract idea that can be derived from that by a decoder; the | |||
former can be addressed specifically by using "encoded data item". | former can be addressed specifically by using the term "encoded | |||
data item". | ||||
Decoder: A process that decodes a well-formed encoded CBOR data item | Decoder: A process that decodes a well-formed encoded CBOR data item | |||
and makes it available to an application. Formally speaking, a | and makes it available to an application. Formally speaking, a | |||
decoder contains a parser to break up the input using the syntax | decoder contains a parser to break up the input using the syntax | |||
rules of CBOR, as well as a semantic processor to prepare the data | rules of CBOR, as well as a semantic processor to prepare the data | |||
in a form suitable to the application. | in a form suitable to the application. | |||
Encoder: A process that generates the (well-formed) representation | Encoder: A process that generates the (well-formed) representation | |||
format of a CBOR data item from application information. | format of a CBOR data item from application information. | |||
skipping to change at page 7, line 25 | skipping to change at line 296 | |||
Stream decoder: A process that decodes a data stream and makes each | Stream decoder: A process that decodes a data stream and makes each | |||
of the data items in the sequence available to an application as | of the data items in the sequence available to an application as | |||
they are received. | they are received. | |||
Terms and concepts for floating-point values such as Infinity, NaN | Terms and concepts for floating-point values such as Infinity, NaN | |||
(not a number), negative zero, and subnormal are defined in | (not a number), negative zero, and subnormal are defined in | |||
[IEEE754]. | [IEEE754]. | |||
Where bit arithmetic or data types are explained, this document uses | Where bit arithmetic or data types are explained, this document uses | |||
the notation familiar from the programming language C [C], except | the notation familiar from the programming language C [C], except | |||
that "**" denotes exponentiation and ".." denotes a range that | that ".." denotes a range that includes both ends given, and | |||
includes both ends given. Examples and pseudocode assume that signed | superscript notation denotes exponentiation. For example, 2 to the | |||
integers use two's complement representation and that right shifts of | power of 64 is notated: 2^(64). In the plain-text version of this | |||
signed integers perform sign extension; these assumptions are also | specification, superscript notation is not available and therefore is | |||
specified in Sections 6.8.2 and 7.6.7 of the 2020 version of C++, | rendered by a surrogate notation. That notation is not optimized for | |||
successor of [Cplusplus17]. | this RFC; it is unfortunately ambiguous with C's exclusive-or (which | |||
is only used in the appendices, which in turn do not use | ||||
exponentiation) and requires circumspection from the reader of the | ||||
plain-text version. | ||||
Examples and pseudocode assume that signed integers use two's | ||||
complement representation and that right shifts of signed integers | ||||
perform sign extension; these assumptions are also specified in | ||||
Sections 6.8.1 (basic.fundamental) and 7.6.7 (expr.shift) of the 2020 | ||||
version of C++ (currently available as a final draft, [Cplusplus20]). | ||||
Similar to the "0x" notation for hexadecimal numbers, numbers in | Similar to the "0x" notation for hexadecimal numbers, numbers in | |||
binary notation are prefixed with "0b". Underscores can be added to | binary notation are prefixed with "0b". Underscores can be added to | |||
a number solely for readability, so 0b00100001 (0x21) might be | a number solely for readability, so 0b00100001 (0x21) might be | |||
written 0b001_00001 to emphasize the desired interpretation of the | written 0b001_00001 to emphasize the desired interpretation of the | |||
bits in the byte; in this case, it is split into three bits and five | bits in the byte; in this case, it is split into three bits and five | |||
bits. Encoded CBOR data items are sometimes given in the "0x" or | bits. Encoded CBOR data items are sometimes given in the "0x" or | |||
"0b" notation; these values are first interpreted as numbers as in C | "0b" notation; these values are first interpreted as numbers as in C | |||
and are then interpreted as byte strings in network byte order, | and are then interpreted as byte strings in network byte order, | |||
including any leading zero bytes expressed in the notation. | including any leading zero bytes expressed in the notation. | |||
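The "0x"/"0b" convention described above can be illustrated with a minimal Python sketch. The helper name "notation_to_bytes" is hypothetical and is not part of either document; it only shows one way to read such a literal first as a number and then as a byte string in network byte order, keeping leading zero bytes.

```python
# Illustrative sketch only; notation_to_bytes is a hypothetical helper,
# not something defined by the specification.

def notation_to_bytes(literal: str) -> bytes:
    """Read a "0x"/"0b" literal as a number, then as big-endian bytes,
    keeping any leading zero bytes written in the notation."""
    prefix = literal[:2].lower()
    digits = literal[2:].replace("_", "")  # underscores are readability only
    bits_per_digit = {"0x": 4, "0b": 1}.get(prefix)
    if bits_per_digit is None:
        raise ValueError("expected a 0x or 0b prefix")
    total_bits = len(digits) * bits_per_digit
    if total_bits == 0 or total_bits % 8 != 0:
        raise ValueError("literal does not describe a whole number of bytes")
    value = int(digits, 16 if prefix == "0x" else 2)
    return value.to_bytes(total_bits // 8, byteorder="big")


# The example from the text: 0b001_00001 is the same number as 0x21.
initial_byte = 0b001_00001
high3 = initial_byte >> 5        # the leading three bits
low5 = initial_byte & 0b11111    # the trailing five bits
```

Because the byte count is taken from the number of digits written, not from the numeric value, a literal such as 0x0001 yields two bytes and not one.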
Words may be _italicized_ for emphasis; in the plain text form of | Words may be _italicized_ for emphasis; in the plain text form of | |||
this specification this is indicated by surrounding words with | this specification, this is indicated by surrounding words with | |||
underscore characters. Verbatim text (e.g., names from a programming | underscore characters. Verbatim text (e.g., names from a programming | |||
language) may be set in "monospace" type; in plain text this is | language) may be set in "monospace" type; in plain text, this is | |||
approximated somewhat ambiguously by surrounding the text in double | approximated somewhat ambiguously by surrounding the text in double | |||
quotes (which also retain their usual meaning). | quotes (which also retain their usual meaning). | |||
2. CBOR Data Models | 2. CBOR Data Models | |||
CBOR is explicit about its generic data model, which defines the set | CBOR is explicit about its generic data model, which defines the set | |||
of all data items that can be represented in CBOR. Its basic generic | of all data items that can be represented in CBOR. Its basic generic | |||
data model is extensible by the registration of "simple values" and | data model is extensible by the registration of "simple values" and | |||
tags. Applications can then subset the resulting extended generic | tags. Applications can then create a subset of the resulting | |||
data model to build their specific data models. | extended generic data model to build their specific data models. | |||
Within environments that can represent the data items in the generic | Within environments that can represent the data items in the generic | |||
data model, generic CBOR encoders and decoders can be implemented | data model, generic CBOR encoders and decoders can be implemented | |||
(which usually involves defining additional implementation data types | (which usually involves defining additional implementation data types | |||
for those data items that do not already have a natural | for those data items that do not already have a natural | |||
representation in the environment). The ability to provide generic | representation in the environment). The ability to provide generic | |||
encoders and decoders is an explicit design goal of CBOR; however | encoders and decoders is an explicit design goal of CBOR; however, | |||
many applications will provide their own application-specific | many applications will provide their own application-specific | |||
encoders and/or decoders. | encoders and/or decoders. | |||
In the basic (un-extended) generic data model defined in Section 3, a | In the basic (unextended) generic data model defined in Section 3, a | |||
data item is one of: | data item is one of the following: | |||
* an integer in the range -2**64..2**64-1 inclusive | * an integer in the range -2^(64)..2^(64)-1 inclusive | |||
* a simple value, identified by a number between 0 and 255, but | * a simple value, identified by a number between 0 and 255, but | |||
distinct from that number itself | distinct from that number itself | |||
* a floating-point value, distinct from an integer, out of the set | * a floating-point value, distinct from an integer, out of the set | |||
representable by IEEE 754 binary64 (including non-finites) | representable by IEEE 754 binary64 (including non-finites) | |||
[IEEE754] | [IEEE754] | |||
* a sequence of zero or more bytes ("byte string") | * a sequence of zero or more bytes ("byte string") | |||
* a sequence of zero or more Unicode code points ("text string") | * a sequence of zero or more Unicode code points ("text string") | |||
* a sequence of zero or more data items ("array") | * a sequence of zero or more data items ("array") | |||
* a mapping (mathematical function) from zero or more data items | * a mapping (mathematical function) from zero or more data items | |||
("keys") each to a data item ("values"), ("map") | ("keys") each to a data item ("values"), ("map") | |||
* a tagged data item ("tag"), comprising a tag number (an integer in | * a tagged data item ("tag"), comprising a tag number (an integer in | |||
the range 0..2**64-1) and the tag content (a data item) | the range 0..2^(64)-1) and the tag content (a data item) | |||
Note that integer and floating-point values are distinct in this | Note that integer and floating-point values are distinct in this | |||
model, even if they have the same numeric value. | model, even if they have the same numeric value. | |||
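As a rough illustration of three points from the list above (the integer range, simple values being distinct from their identifying number, and integers being distinct from floating-point values), here is a minimal Python sketch. The names "SimpleValue", "is_cbor_integer", and "same_scalar_item" are assumptions made for this example only. The comparison helper covers scalars only and does not attempt any special handling of NaN.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleValue:
    """A simple value: identified by a number, but not equal to that number."""
    number: int

    def __post_init__(self):
        if not 0 <= self.number <= 255:
            raise ValueError("simple value number must be in 0..255")


def is_cbor_integer(item) -> bool:
    """True if item is an integer in the range -2**64..2**64-1 inclusive."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(item, bool) or not isinstance(item, int):
        return False
    return -2**64 <= item <= 2**64 - 1


def same_scalar_item(a, b) -> bool:
    """Compare two scalars as data items: the type is part of the identity,
    so the integer 0 and the floating-point value 0.0 are different items."""
    return type(a) is type(b) and a == b
```

Python itself treats 0 == 0.0 as true, which is why the sketch compares types explicitly; a native dict keyed by both values would silently merge them.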
Also note that serialization variants are not visible at the generic | Also note that serialization variants are not visible at the generic | |||
data model level, including the number of bytes of the encoded | data model level. This deliberate absence of visibility includes the | |||
floating-point value or the choice of one of the ways in which an | number of bytes of the encoded floating-point value. It also | |||
integer, the length of a text or byte string, the number of elements | includes the choice of encoding for an "argument" (see Section 3) | |||
in an array or pairs in a map, or a tag number, (collectively "the | such as the encoding for an integer, the encoding for the length of a | |||
argument", see Section 3) can be encoded. | text or byte string, the encoding for the number of elements in an | |||
array or pairs in a map, or the encoding for a tag number. | ||||
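The point that argument encodings are invisible at the data model level can be sketched as follows. This example relies on the head layout specified in Section 3 of the document, which is not part of this excerpt: a three-bit major type, five bits of additional information, and additional information values 24 to 27 announcing 1, 2, 4, or 8 following bytes. The function name "decode_head" is hypothetical, and indefinite lengths and reserved values are simply rejected.

```python
def decode_head(data: bytes):
    """Return (major_type, argument, bytes_consumed) for a definite-length head.

    A sketch under the assumptions stated above, not a complete decoder.
    """
    if not data:
        raise ValueError("no input")
    initial = data[0]
    major, info = initial >> 5, initial & 0x1F
    if info < 24:
        # The argument is carried directly in the additional information.
        return major, info, 1
    if info <= 27:
        size = 1 << (info - 24)  # 1, 2, 4, or 8 following bytes
        if len(data) < 1 + size:
            raise ValueError("truncated argument")
        return major, int.from_bytes(data[1:1 + size], "big"), 1 + size
    raise ValueError("reserved or indefinite-length additional information")


# Five serialization variants of the same data item, the integer 10.
variants = ["0a", "180a", "19000a", "1a0000000a", "1b000000000000000a"]
decoded = [decode_head(bytes.fromhex(v))[:2] for v in variants]
```

Every entry of "decoded" is the pair (0, 10): the encodings differ in length, but an application working at the data model level sees the same integer in each case.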
2.1. Extended Generic Data Models | 2.1. Extended Generic Data Models | |||
This basic generic data model comes pre-extended by the registration | This basic generic data model has been extended in this document by | |||
of a number of simple values and tag numbers right in this document, | the registration of a number of simple values and tag numbers, such | |||
such as: | as: | |||
* "false", "true", "null", and "undefined" (simple values identified | * "false", "true", "null", and "undefined" (simple values identified | |||
by 20..23) | by 20..23, Section 3.3) | |||
* integer and floating-point values with a larger range and | * integer and floating-point values with a larger range and | |||
precision than the above (tag numbers 2 to 5) | precision than the above (tag numbers 2 to 5, Section 3.4) | |||
* application data types such as a point in time or an RFC 3339 | * application data types such as a point in time or date/time string | |||
date/time string (tag numbers 1, 0) | defined in RFC 3339 (tag numbers 1 and 0, Section 3.4) | |||
Further elements of the extended generic data model can be (and have | Additional elements of the extended generic data model can be (and | |||
been) defined via the IANA registries created for CBOR. Even if such | have been) defined via the IANA registries created for CBOR. Even if | |||
an extension is unknown to a generic encoder or decoder, data items | such an extension is unknown to a generic encoder or decoder, data | |||
using that extension can be passed to or from the application by | items using that extension can be passed to or from the application | |||
representing them at the interface to the application within the | by representing them at the application interface within the basic | |||
basic generic data model, i.e., as generic simple values or generic | generic data model, i.e., as generic simple values or generic tags. | |||
tags. | ||||
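One way a generic decoder might hand tags to an application is sketched below in Python. The names "GenericTag", "KNOWN_TAGS", and "present_tag" are assumptions for this illustration, and tag number 99999 is used only as a stand-in for a tag the decoder has no handler for. Tag number 1 is treated as an epoch-based point in time, as mentioned in the list above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class GenericTag:
    """Basic generic data model representation of a tag."""
    number: int
    content: object


# Hypothetical table of the extensions this particular decoder understands.
KNOWN_TAGS = {
    # Tag number 1: epoch-based date/time, mapped to a native UTC datetime.
    1: lambda content: datetime.fromtimestamp(content, tz=timezone.utc),
}


def present_tag(number: int, content):
    """Convert a known tag to a native value; pass an unknown tag through
    unchanged as a generic tag so the application can still inspect it."""
    handler = KNOWN_TAGS.get(number)
    if handler is None:
        return GenericTag(number, content)
    return handler(content)
```

With this arrangement, registering a new tag number later does not break the decoder: items using it keep arriving as "GenericTag" values until a handler is added.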
In other words, the basic generic data model is stable as defined in | In other words, the basic generic data model is stable as defined in | |||
this document, while the extended generic data model expands by the | this document, while the extended generic data model expands by the | |||
registration of new simple values or tag numbers, but never shrinks. | registration of new simple values or tag numbers, but never shrinks. | |||
While there is a strong expectation that generic encoders and | While there is a strong expectation that generic encoders and | |||
decoders can represent "false", "true", and "null" ("undefined" is | decoders can represent "false", "true", and "null" ("undefined" is | |||
intentionally omitted) in the form appropriate for their programming | intentionally omitted) in the form appropriate for their programming | |||
environment, implementation of the data model extensions created by | environment, the implementation of the data model extensions created | |||
tags is truly optional and a matter of implementation quality. | by tags is truly optional and a matter of implementation quality. | |||
2.2. Specific Data Models | 2.2. Specific Data Models | |||
The specific data model for a CBOR-based protocol usually subsets the | The specific data model for a CBOR-based protocol usually takes a | |||
extended generic data model and assigns application semantics to the | subset of the extended generic data model and assigns application | |||
data items within this subset and its components. When documenting | semantics to the data items within this subset and its components. | |||
such specific data models, where it is desired to specify the types | When documenting such specific data models and specifying the types | |||
of data items, it is preferred to identify the types by the names | of data items, it is preferable to identify the types by their | |||
they have in the generic data model ("negative integer", "array") | generic data model names ("negative integer", "array") instead of | |||
instead of by referring to aspects of their CBOR representation | referring to aspects of their CBOR representation ("major type 1", | |||
("major type 1", "major type 4"). | "major type 4"). | |||
Specific data models can also specify what values (including values | Specific data models can also specify value equivalency (including | |||
of different types) are equivalent for the purposes of map keys and | values of different types) for the purposes of map keys and encoder | |||
encoder freedom. For example, in the generic data model, a valid map | freedom. For example, in the generic data model, a valid map MAY | |||
MAY have both "0" and "0.0" as keys, and an encoder MUST NOT encode | have both "0" and "0.0" as keys, and an encoder MUST NOT encode "0.0" | |||
"0.0" as an integer (major type 0, Section 3.1). However, if a | as an integer (major type 0, Section 3.1). However, if a specific | |||
specific data model declares that floating-point and integer | data model declares that floating-point and integer representations | |||
representations of integral values are equivalent, using both map | of integral values are equivalent, using both map keys "0" and "0.0" | |||
keys "0" and "0.0" in a single map would be considered duplicates, | in a single map would be considered duplicates, even while encoded as | |||
even while encoded as different major types, and so invalid; and an | different major types, and so invalid; and an encoder could encode | |||
encoder could encode integral-valued floats as integers or vice | integral-valued floats as integers or vice versa, perhaps to save | |||
versa, perhaps to save encoded bytes. | encoded bytes. | |||
3. Specification of the CBOR Encoding | 3. Specification of the CBOR Encoding | |||
A CBOR data item (Section 2) is encoded to or decoded from a byte | A CBOR data item (Section 2) is encoded to or decoded from a byte | |||
string carrying a well-formed encoded data item as described in this | string carrying a well-formed encoded data item as described in this | |||
section. The encoding is summarized in Table 7 in Appendix B, | section. The encoding is summarized in Table 7 in Appendix B, | |||
indexed by the initial byte. An encoder MUST produce only well- | indexed by the initial byte. An encoder MUST produce only well- | |||
formed encoded data items. A decoder MUST NOT return a decoded data | formed encoded data items. A decoder MUST NOT return a decoded data | |||
item when it encounters input that is not a well-formed encoded CBOR | item when it encounters input that is not a well-formed encoded CBOR | |||
data item (this does not detract from the usefulness of diagnostic | data item (this does not detract from the usefulness of diagnostic | |||
skipping to change at page 11, line 6 | skipping to change at line 470 | |||
are not used as an integer argument, but as a floating-point value | are not used as an integer argument, but as a floating-point value | |||
(see Section 3.3). | (see Section 3.3). | |||
28, 29, 30: These values are reserved for future additions to the | 28, 29, 30: These values are reserved for future additions to the | |||
CBOR format. In the present version of CBOR, the encoded item is | CBOR format. In the present version of CBOR, the encoded item is | |||
not well-formed. | not well-formed. | |||
31: No argument value is derived. If the major type is 0, 1, or 6, | 31: No argument value is derived. If the major type is 0, 1, or 6, | |||
the encoded item is not well-formed. For major types 2 to 5, the | the encoded item is not well-formed. For major types 2 to 5, the | |||
item's length is indefinite, and for major type 7, the byte does | item's length is indefinite, and for major type 7, the byte does | |||
not constitute a data item at all but terminates an indefinite | not constitute a data item at all but terminates an indefinite- | |||
length item; all are described in Section 3.2. | length item; all are described in Section 3.2. | |||
The initial byte and any additional bytes consumed to construct the | The initial byte and any additional bytes consumed to construct the | |||
argument are collectively referred to as the "head" of the data item. | argument are collectively referred to as the _head_ of the data item. | |||
The meaning of this argument depends on the major type. For example, | The meaning of this argument depends on the major type. For example, | |||
in major type 0, the argument is the value of the data item itself | in major type 0, the argument is the value of the data item itself | |||
(and in major type 1 the value of the data item is computed from the | (and in major type 1, the value of the data item is computed from the | |||
argument); in major type 2 and 3 it gives the length of the string | argument); in major type 2 and 3, it gives the length of the string | |||
data in bytes that follows; and in major types 4 and 5 it is used to | data in bytes that follow; and in major types 4 and 5, it is used to | |||
determine the number of data items enclosed. | determine the number of data items enclosed. | |||
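As an illustration of the head structure described above, the following Python sketch splits the initial byte into major type and additional information and derives the argument. The function name `decode_head` and its return tuple are assumptions made for this example and are not part of the specification. The byte counts used for additional information values 24 to 27 (1, 2, 4, and 8 bytes) come from the part of Section 3 that this comparison skips.

```python
def decode_head(data: bytes, offset: int = 0):
    """Return (major_type, additional_info, argument, head_length).

    Hypothetical helper for illustration only. For major type 7 with
    additional information 25..27, the returned integer is the raw bit
    pattern of a floating-point value, not an integer argument.
    """
    if offset >= len(data):
        raise ValueError("not well-formed: input ends before the head")
    initial = data[offset]
    major_type = initial >> 5      # high-order 3 bits
    info = initial & 0x1F          # low-order 5 bits

    if info < 24:
        # The argument is the additional information value itself.
        return major_type, info, info, 1
    if info <= 27:
        size = 1 << (info - 24)    # 1, 2, 4, or 8 following bytes
        raw = data[offset + 1:offset + 1 + size]
        if len(raw) < size:
            raise ValueError("not well-formed: input ends inside the head")
        return major_type, info, int.from_bytes(raw, "big"), 1 + size
    if info < 31:
        # 28, 29, 30 are reserved for future additions.
        raise ValueError("not well-formed: reserved additional information")
    # info == 31: no argument value is derived.
    if major_type in (0, 1, 6):
        raise ValueError("not well-formed: 31 is not allowed for this major type")
    return major_type, info, None, 1
```

For instance, `decode_head(bytes.fromhex("1901f4"))` yields major type 0, additional information 25, argument 500, and a head length of 3 bytes.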
If the encoded sequence of bytes ends before the end of a data item, | If the encoded sequence of bytes ends before the end of a data item, | |||
that item is not well-formed. If the encoded sequence of bytes still | that item is not well-formed. If the encoded sequence of bytes still | |||
has bytes remaining after the outermost encoded item is decoded, that | has bytes remaining after the outermost encoded item is decoded, that | |||
encoding is not a single well-formed CBOR item; depending on the | encoding is not a single well-formed CBOR item. Depending on the | |||
application, the decoder may either treat the encoding as not well- | application, the decoder may either treat the encoding as not well- | |||
formed or just identify the start of the remaining bytes to the | formed or just identify the start of the remaining bytes to the | |||
application. | application. | |||
A CBOR decoder implementation can be based on a jump table with all | A CBOR decoder implementation can be based on a jump table with all | |||
256 defined values for the initial byte (Table 7). A decoder in a | 256 defined values for the initial byte (Table 7). A decoder in a | |||
constrained implementation can instead use the structure of the | constrained implementation can instead use the structure of the | |||
initial byte and following bytes for more compact code (see | initial byte and following bytes for more compact code (see | |||
Appendix C for a rough impression of how this could look). | Appendix C for a rough impression of how this could look). | |||
3.1. Major Types | 3.1. Major Types | |||
The following lists the major types and the additional information | The following lists the major types and the additional information | |||
and other bytes associated with the type. | and other bytes associated with the type. | |||
Major type 0: an unsigned integer in the range 0..2**64-1 inclusive. | Major type 0: | |||
The value of the encoded item is the argument itself. For | An unsigned integer in the range 0..2^(64)-1 inclusive. The value | |||
example, the integer 10 is denoted as the one byte 0b000_01010 | of the encoded item is the argument itself. For example, the | |||
(major type 0, additional information 10). The integer 500 would | integer 10 is denoted as the one byte 0b000_01010 (major type 0, | |||
be 0b000_11001 (major type 0, additional information 25) followed | additional information 10). The integer 500 would be 0b000_11001 | |||
by the two bytes 0x01f4, which is 500 in decimal. | (major type 0, additional information 25) followed by the two | |||
bytes 0x01f4, which is 500 in decimal. | ||||
Major type 1: a negative integer in the range -2**64..-1 inclusive. | Major type 1: | |||
The value of the item is -1 minus the argument. For example, the | A negative integer in the range -2^(64)..-1 inclusive. The value | |||
integer -500 would be 0b001_11001 (major type 1, additional | of the item is -1 minus the argument. For example, the integer | |||
information 25) followed by the two bytes 0x01f3, which is 499 in | -500 would be 0b001_11001 (major type 1, additional information | |||
decimal. | 25) followed by the two bytes 0x01f3, which is 499 in decimal. | |||
Major type 2: a byte string. The number of bytes in the string is | Major type 2: | |||
equal to the argument. For example, a byte string whose length is | A byte string. The number of bytes in the string is equal to the | |||
5 would have an initial byte of 0b010_00101 (major type 2, | argument. For example, a byte string whose length is 5 would have | |||
additional information 5 for the length), followed by 5 bytes of | an initial byte of 0b010_00101 (major type 2, additional | |||
binary content. A byte string whose length is 500 would have 3 | information 5 for the length), followed by 5 bytes of binary | |||
initial bytes of 0b010_11001 (major type 2, additional information | content. A byte string whose length is 500 would have 3 initial | |||
25 to indicate a two-byte length) followed by the two bytes 0x01f4 | bytes of 0b010_11001 (major type 2, additional information 25 to | |||
for a length of 500, followed by 500 bytes of binary content. | indicate a two-byte length) followed by the two bytes 0x01f4 for a | |||
length of 500, followed by 500 bytes of binary content. | ||||
Major type 3: a text string (Section 2), encoded as UTF-8 | Major type 3: | |||
([RFC3629]). The number of bytes in the string is equal to the | A text string (Section 2) encoded as UTF-8 [RFC3629]. The number | |||
argument. A string containing an invalid UTF-8 sequence is well- | of bytes in the string is equal to the argument. A string | |||
formed but invalid (Section 1.2). This type is provided for | containing an invalid UTF-8 sequence is well-formed but invalid | |||
systems that need to interpret or display human-readable text, and | (Section 1.2). This type is provided for systems that need to | |||
allows the differentiation between unstructured bytes and text | interpret or display human-readable text, and allows the | |||
that has a specified repertoire (that of Unicode) and encoding | differentiation between unstructured bytes and text that has a | |||
(UTF-8). In contrast to formats such as JSON, the Unicode | specified repertoire (that of Unicode) and encoding (UTF-8). In | |||
characters in this type are never escaped. Thus, a newline | contrast to formats such as JSON, the Unicode characters in this | |||
character (U+000A) is always represented in a string as the byte | type are never escaped. Thus, a newline character (U+000A) is | |||
0x0a, and never as the bytes 0x5c6e (the characters "\" and "n") | always represented in a string as the byte 0x0a, and never as the | |||
nor as 0x5c7530303061 (the characters "\", "u", "0", "0", "0", and | bytes 0x5c6e (the characters "\" and "n") nor as 0x5c7530303061 | |||
"a"). | (the characters "\", "u", "0", "0", "0", and "a"). | |||
Major type 4: an array of data items. In other formats, arrays are | Major type 4: | |||
also called lists, sequences, or tuples (a "CBOR sequence" is | An array of data items. In other formats, arrays are also called | |||
something slightly different, though [RFC8742]). The argument is | lists, sequences, or tuples (a "CBOR sequence" is something | |||
the number of data items in the array. Items in an array do not | slightly different, though [RFC8742]). The argument is the number | |||
need to all be of the same type. For example, an array that | of data items in the array. Items in an array do not need to all | |||
contains 10 items of any type would have an initial byte of | be of the same type. For example, an array that contains 10 items | |||
0b100_01010 (major type 4, additional information 10 for the | of any type would have an initial byte of 0b100_01010 (major type | |||
length) followed by the 10 remaining items. | 4, additional information 10 for the length) followed by the 10 | |||
remaining items. | ||||
Major type 5: a map of pairs of data items. Maps are also called | Major type 5: | |||
tables, dictionaries, hashes, or objects (in JSON). A map is | A map of pairs of data items. Maps are also called tables, | |||
comprised of pairs of data items, each pair consisting of a key | dictionaries, hashes, or objects (in JSON). A map is comprised of | |||
that is immediately followed by a value. The argument is the | pairs of data items, each pair consisting of a key that is | |||
number of _pairs_ of data items in the map. For example, a map | immediately followed by a value. The argument is the number of | |||
that contains 9 pairs would have an initial byte of 0b101_01001 | _pairs_ of data items in the map. For example, a map that | |||
(major type 5, additional information 9 for the number of pairs) | contains 9 pairs would have an initial byte of 0b101_01001 (major | |||
followed by the 18 remaining items. The first item is the first | type 5, additional information 9 for the number of pairs) followed | |||
key, the second item is the first value, the third item is the | by the 18 remaining items. The first item is the first key, the | |||
second key, and so on. Because items in a map come in pairs, | second item is the first value, the third item is the second key, | |||
their total number is always even: A map that contains an odd | and so on. Because items in a map come in pairs, their total | |||
number of items (no value data present after the last key data | number is always even: a map that contains an odd number of items | |||
item) is not well-formed. A map that has duplicate keys may be | (no value data present after the last key data item) is not well- | |||
well-formed, but it is not valid, and thus it causes indeterminate | formed. A map that has duplicate keys may be well-formed, but it | |||
decoding; see also Section 5.6. | is not valid, and thus it causes indeterminate decoding; see also | |||
Section 5.6. | ||||
Major type 6: a tagged data item ("tag") whose tag number, an | Major type 6: | |||
integer in the range 0..2**64-1 inclusive, is the argument and | A tagged data item ("tag") whose tag number, an integer in the | |||
whose enclosed data item ("tag content") is the single encoded | range 0..2^(64)-1 inclusive, is the argument and whose enclosed | |||
data item that follows the head. See Section 3.4. | data item (_tag content_) is the single encoded data item that | |||
follows the head. See Section 3.4. | ||||
Major type 7: floating-point numbers and simple values, as well as | Major type 7: | |||
the "break" stop code. See Section 3.3. | Floating-point numbers and simple values, as well as the "break" | |||
stop code. See Section 3.3. | ||||
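The byte-level examples given for major types 0 to 5 can be reproduced with a short Python sketch. The names `encode_head` and `encode_int` are hypothetical, and the sketch always picks the shortest head that fits the argument, which is a choice made for this example (the text above does not require it for well-formedness).

```python
def encode_head(major_type: int, argument: int) -> bytes:
    """Encode a head for major types 0..6 using the shortest form.

    Hypothetical helper for illustration only.
    """
    if not 0 <= major_type <= 6:
        raise ValueError("this sketch covers major types 0..6 only")
    if not 0 <= argument < 2**64:
        raise ValueError("argument out of range 0..2**64-1")
    if argument < 24:
        # The argument fits directly into the additional information.
        return bytes([(major_type << 5) | argument])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < 1 << (8 * size):
            initial = (major_type << 5) | info
            return bytes([initial]) + argument.to_bytes(size, "big")
    raise AssertionError("unreachable: argument was range-checked above")


def encode_int(value: int) -> bytes:
    """Encode an integer as major type 0 (unsigned) or 1 (negative)."""
    if value >= 0:
        return encode_head(0, value)
    # For major type 1, the value is -1 minus the argument.
    return encode_head(1, -1 - value)
```

With these helpers, `encode_int(500)` gives the bytes 0x1901f4 and `encode_int(-500)` gives 0x3901f3, matching the examples above. `encode_head(2, 5)` gives 0x45 (the initial byte of a 5-byte byte string), and `encode_head(5, 9)` gives 0xa9 (a map of 9 pairs).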
These eight major types lead to a simple table showing which of the | These eight major types lead to a simple table showing which of the | |||
256 possible values for the initial byte of a data item are used | 256 possible values for the initial byte of a data item are used | |||
(Table 7). | (Table 7). | |||
In major types 6 and 7, many of the possible values are reserved for | In major types 6 and 7, many of the possible values are reserved for | |||
future specification. See Section 9 for more information on these | future specification. See Section 9 for more information on these | |||
values. | values. | |||
Table 1 summarizes the major types defined by CBOR, ignoring the next | Table 1 summarizes the major types defined by CBOR, ignoring | |||
section for now. The number N in this table stands for the argument, | Section 3.2 for now. The number N in this table stands for the | |||
mt for the major type. | argument. | |||
+====+=======================+=================================+ | +============+=======================+=========================+ | |||
| mt | Meaning | Content | | | Major Type | Meaning | Content | | |||
+====+=======================+=================================+ | +============+=======================+=========================+ | |||
| 0 | unsigned integer N | - | | | 0 | unsigned integer N | - | | |||
+----+-----------------------+---------------------------------+ | +------------+-----------------------+-------------------------+ | |||
| 1 | negative integer -1-N | - | | | 1 | negative integer -1-N | - | | |||
+----+-----------------------+---------------------------------+ | +------------+-----------------------+-------------------------+ | |||
| 2 | byte string | N bytes | | | 2 | byte string | N bytes | | |||
+----+-----------------------+---------------------------------+ | +------------+-----------------------+-------------------------+ | |||
| 3 | text string | N bytes (UTF-8 text) | | | 3 | text string | N bytes (UTF-8 text) | | |||
+----+-----------------------+---------------------------------+ | +------------+-----------------------+-------------------------+ | |||
| 4 | array | N data items (elements) | | | 4 | array | N data items (elements) | | |||
+----+-----------------------+---------------------------------+ | +------------+-----------------------+-------------------------+ | |||
| 5 | map | 2N data items (key/value pairs) | | | 5 | map | 2N data items (key/ | | |||
+----+-----------------------+---------------------------------+ | | | | value pairs) | | |||
| 6 | tag of number N | 1 data item | | +------------+-----------------------+-------------------------+ | |||
+----+-----------------------+---------------------------------+ | | 6 | tag of number N | 1 data item | | |||
| 7 | simple/float | - | | +------------+-----------------------+-------------------------+ | |||
+----+-----------------------+---------------------------------+ | | 7 | simple/float | - | | |||
+------------+-----------------------+-------------------------+ | ||||
Table 1: Overview over the definite-length use of CBOR major | Table 1: Overview over the Definite-Length Use of CBOR Major | |||
types (mt = major type, N = argument) | Types (N = Argument) | |||
3.2. Indefinite Lengths for Some Major Types | 3.2. Indefinite Lengths for Some Major Types | |||
Four CBOR items (arrays, maps, byte strings, and text strings) can be | Four CBOR items (arrays, maps, byte strings, and text strings) can be | |||
encoded with an indefinite length using additional information value | encoded with an indefinite length using additional information value | |||
31. This is useful if the encoding of the item needs to begin before | 31. This is useful if the encoding of the item needs to begin before | |||
the number of items inside the array or map, or the total length of | the number of items inside the array or map, or the total length of | |||
the string, is known. (The ability to start sending a data item | the string, is known. (The ability to start sending a data item | |||
before all of it is known is often referred to as "streaming" within | before all of it is known is often referred to as "streaming" within | |||
that data item.) | that data item.) | |||
Indefinite-length arrays and maps are dealt with differently than | Indefinite-length arrays and maps are dealt with differently than | |||
indefinite-length strings (byte strings and text strings). | indefinite-length strings (byte strings and text strings). | |||
3.2.1. The "break" Stop Code | 3.2.1. The "break" Stop Code | |||
The "break" stop code is encoded with major type 7 and additional | The "break" stop code is encoded with major type 7 and additional | |||
information value 31 (0b111_11111). It is not itself a data item: it | information value 31 (0b111_11111). It is not itself a data item: it | |||
is just a syntactic feature to close an indefinite-length item. | is just a syntactic feature to close an indefinite-length item. | |||
If the "break" stop code appears anywhere where a data item is | If the "break" stop code appears where a data item is expected, other | |||
expected, other than directly inside an indefinite-length string, | than directly inside an indefinite-length string, array, or map -- | |||
array, or map -- for example directly inside a definite-length array | for example, directly inside a definite-length array or map -- the | |||
or map -- the enclosing item is not well-formed. | enclosing item is not well-formed. | |||
3.2.2. Indefinite-Length Arrays and Maps | 3.2.2. Indefinite-Length Arrays and Maps | |||
Indefinite-length arrays and maps are represented using their major | Indefinite-length arrays and maps are represented using their major | |||
type with the additional information value of 31, followed by an | type with the additional information value of 31, followed by an | |||
arbitrary-length sequence of zero or more items for an array or key/ | arbitrary-length sequence of zero or more items for an array or key/ | |||
value pairs for a map, followed by the "break" stop code | value pairs for a map, followed by the "break" stop code | |||
(Section 3.2.1). In other words, indefinite-length arrays and maps | (Section 3.2.1). In other words, indefinite-length arrays and maps | |||
look identical to other arrays and maps except for beginning with the | look identical to other arrays and maps except for beginning with the | |||
additional information value of 31 and ending with the "break" stop | additional information value of 31 and ending with the "break" stop | |||
skipping to change at page 17, line 25 | skipping to change at line 775 | |||
5F -- Start indefinite-length byte string | 5F -- Start indefinite-length byte string | |||
44 -- Byte string of length 4 | 44 -- Byte string of length 4 | |||
aabbccdd -- Bytes content | aabbccdd -- Bytes content | |||
43 -- Byte string of length 3 | 43 -- Byte string of length 3 | |||
eeff99 -- Bytes content | eeff99 -- Bytes content | |||
FF -- "break" | FF -- "break" | |||
After decoding, this results in a single byte string with seven | After decoding, this results in a single byte string with seven | |||
bytes: 0xaabbccddeeff99. | bytes: 0xaabbccddeeff99. | |||
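The decoding of this example can be sketched in Python as follows. The function name `decode_indefinite_byte_string` is an assumption for illustration. The sketch accepts only definite-length byte string chunks between the opening 0x5f and the "break" stop code, and it ignores any bytes that follow the "break".

```python
def decode_indefinite_byte_string(data: bytes) -> bytes:
    """Concatenate the chunks of an indefinite-length byte string.

    Hypothetical helper for illustration only.
    """
    if not data or data[0] != 0x5F:
        raise ValueError("expected 0x5f (major type 2, additional information 31)")
    pos = 1
    chunks = []
    while True:
        if pos >= len(data):
            raise ValueError('not well-formed: missing "break" stop code')
        initial = data[pos]
        pos += 1
        if initial == 0xFF:
            # The "break" stop code closes the indefinite-length item.
            return b"".join(chunks)
        major_type, info = initial >> 5, initial & 0x1F
        if major_type != 2 or info > 27:
            raise ValueError("not well-formed: chunk is not a definite-length byte string")
        if info < 24:
            length = info
        else:
            size = 1 << (info - 24)    # 1, 2, 4, or 8 length bytes
            if pos + size > len(data):
                raise ValueError("not well-formed: input ends inside a chunk head")
            length = int.from_bytes(data[pos:pos + size], "big")
            pos += size
        if pos + length > len(data):
            raise ValueError("not well-formed: input ends inside a chunk")
        chunks.append(data[pos:pos + length])
        pos += length
```

Applied to the bytes 0x5f44aabbccdd43eeff99ff, the function returns the seven bytes 0xaabbccddeeff99.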
3.2.4. Summary of indefinite-length use of major types | 3.2.4. Summary of Indefinite-Length Use of Major Types | |||
Table 2 summarizes the major types defined by CBOR as used for | Table 2 summarizes the major types defined by CBOR as used for | |||
indefinite length encoding (with additional information set to 31). | indefinite-length encoding (with additional information set to 31). | |||
mt stands for the major type. | ||||
+====+===================+==================================+ | +============+===================+==================================+ | |||
| mt | Meaning | enclosed up to "break" stop code | | | Major Type | Meaning | Enclosed up to "break" Stop Code | | |||
+====+===================+==================================+ | +============+===================+==================================+ | |||
| 0 | (not well-formed) | - | | | 0 | (not well- | - | | |||
+----+-------------------+----------------------------------+ | | | formed) | | | |||
| 1 | (not well-formed) | - | | +------------+-------------------+----------------------------------+ | |||
+----+-------------------+----------------------------------+ | | 1 | (not well- | - | | |||
| 2 | byte string | definite-length byte strings | | | | formed) | | | |||
+----+-------------------+----------------------------------+ | +------------+-------------------+----------------------------------+ | |||
| 3 | text string | definite-length text strings | | | 2 | byte string | definite-length byte strings | | |||
+----+-------------------+----------------------------------+ | +------------+-------------------+----------------------------------+ | |||
| 4 | array | data items (elements) | | | 3 | text string | definite-length text strings | | |||
+----+-------------------+----------------------------------+ | +------------+-------------------+----------------------------------+ | |||
| 5 | map | data items (key/value pairs) | | | 4 | array | data items (elements) | | |||
+----+-------------------+----------------------------------+ | +------------+-------------------+----------------------------------+ | |||
| 6 | (not well-formed) | - | | | 5 | map | data items (key/value pairs) | | |||
+----+-------------------+----------------------------------+ | +------------+-------------------+----------------------------------+ | |||
| 7 | "break" stop code | - | | | 6 | (not well- | - | | |||
+----+-------------------+----------------------------------+ | | | formed) | | | |||
+------------+-------------------+----------------------------------+ | ||||
| 7 | "break" stop | - | | ||||
| | code | | | ||||
+------------+-------------------+----------------------------------+ | ||||
Table 2: Overview over the indefinite-length use of CBOR | Table 2: Overview of the Indefinite-Length Use of CBOR Major | |||
major types (mt = major type, additional information = | Types (Additional Information = 31) | |||
31) | ||||
3.3. Floating-Point Numbers and Values with No Content | 3.3. Floating-Point Numbers and Values with No Content | |||
Major type 7 is for two types of data: floating-point numbers and | Major type 7 is for two types of data: floating-point numbers and | |||
"simple values" that do not need any content. Each value of the | "simple values" that do not need any content. Each value of the | |||
5-bit additional information in the initial byte has its own separate | 5-bit additional information in the initial byte has its own separate | |||
meaning, as defined in Table 3. Like the major types for integers, | meaning, as defined in Table 3. Like the major types for integers, | |||
items of this major type do not carry content data; all the | items of this major type do not carry content data; all the | |||
information is in the initial bytes (the head). | information is in the initial bytes (the head). | |||
skipping to change at page 19, line 37 | skipping to change at line 848 | |||
byte extension: it is followed by an additional byte to represent the | byte extension: it is followed by an additional byte to represent the | |||
simple value. (To minimize confusion, only the values 32 to 255 are | simple value. (To minimize confusion, only the values 32 to 255 are | |||
used.) This maintains the structure of the initial bytes: as for the | used.) This maintains the structure of the initial bytes: as for the | |||
other major types, the length of these always depends on the | other major types, the length of these always depends on the | |||
additional information in the first byte. Table 4 lists the numeric | additional information in the first byte. Table 4 lists the numeric | |||
values assigned and available for simple values. | values assigned and available for simple values. | |||
+=========+==============+ | +=========+==============+ | |||
| Value | Semantics | | | Value | Semantics | | |||
+=========+==============+ | +=========+==============+ | |||
| 0..19 | (Unassigned) | | | 0..19 | (unassigned) | | |||
+---------+--------------+ | +---------+--------------+ | |||
| 20 | False | | | 20 | false | | |||
+---------+--------------+ | +---------+--------------+ | |||
| 21 | True | | | 21 | true | | |||
+---------+--------------+ | +---------+--------------+ | |||
| 22 | Null | | | 22 | null | | |||
+---------+--------------+ | +---------+--------------+ | |||
| 23 | Undefined | | | 23 | undefined | | |||
+---------+--------------+ | +---------+--------------+ | |||
| 24..31 | (Reserved) | | | 24..31 | (reserved) | | |||
+---------+--------------+ | +---------+--------------+ | |||
| 32..255 | (Unassigned) | | | 32..255 | (unassigned) | | |||
+---------+--------------+ | +---------+--------------+ | |||
Table 4: Simple Values | Table 4: Simple Values | |||
An encoder MUST NOT issue two-byte sequences that start with 0xf8 | An encoder MUST NOT issue two-byte sequences that start with 0xf8 | |||
(major type 7, additional information 24) and continue with a byte | (major type 7, additional information 24) and continue with a byte | |||
less than 0x20 (32 decimal). Such sequences are not well-formed. | less than 0x20 (32 decimal). Such sequences are not well-formed. | |||
(This implies that an encoder cannot encode false, true, null, or | (This implies that an encoder cannot encode "false", "true", "null", | |||
undefined in two-byte sequences, and that only the one-byte variants | or "undefined" in two-byte sequences and that only the one-byte | |||
of these are well-formed; more generally speaking, each simple value | variants of these are well-formed; more generally speaking, each | |||
only has a single representation variant). | simple value only has a single representation variant). | |||
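The single-representation rule for simple values can be illustrated with a small Python sketch. The name `encode_simple` is hypothetical. The sketch refuses the values 24 to 31 because Table 4 marks them as reserved, and it uses the one-byte form for 0 to 23 and the two-byte form (0xf8 followed by the value) for 32 to 255.

```python
def encode_simple(value: int) -> bytes:
    """Encode a simple value in its only well-formed representation.

    Hypothetical helper for illustration only.
    """
    if 0 <= value <= 23:
        # One byte: major type 7, value in the additional information.
        return bytes([0b111_00000 | value])
    if 32 <= value <= 255:
        # Two bytes: 0xf8 (additional information 24) plus the value.
        return bytes([0xF8, value])
    if 24 <= value <= 31:
        raise ValueError("simple values 24..31 are reserved")
    raise ValueError("simple value out of range 0..255")
```

For example, `encode_simple(20)` gives 0xf4 ("false") and `encode_simple(22)` gives 0xf6 ("null"), while `encode_simple(32)` gives the two bytes 0xf820.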
The 5-bit values of 25, 26, and 27 are for 16-bit, 32-bit, and 64-bit | The 5-bit values of 25, 26, and 27 are for 16-bit, 32-bit, and 64-bit | |||
IEEE 754 binary floating-point values [IEEE754]. These floating- | IEEE 754 binary floating-point values [IEEE754]. These floating- | |||
point values are encoded in the additional bytes of the appropriate | point values are encoded in the additional bytes of the appropriate | |||
size. (See Appendix D for some information about 16-bit floating- | size. (See Appendix D for some information about 16-bit floating- | |||
point numbers.) | point numbers.) | |||
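A minimal Python sketch of the three floating-point encodings follows. The function name `encode_float` and its `bits` parameter are assumptions for this example. It relies on the standard `struct` module, whose `e`, `f`, and `d` format codes produce IEEE 754 binary16, binary32, and binary64 values. The caller picks the width; the sketch makes no attempt to choose the shortest width that preserves the value.

```python
import struct

# Additional information value and struct format for each width.
_FLOAT_FORMATS = {16: (25, ">e"), 32: (26, ">f"), 64: (27, ">d")}


def encode_float(value: float, bits: int = 64) -> bytes:
    """Encode a float as major type 7 with the requested width.

    Hypothetical helper for illustration only. struct.pack raises
    OverflowError if a finite value does not fit the chosen width.
    """
    if bits not in _FLOAT_FORMATS:
        raise ValueError("bits must be 16, 32, or 64")
    info, fmt = _FLOAT_FORMATS[bits]
    initial = (7 << 5) | info      # 0xf9, 0xfa, or 0xfb
    return bytes([initial]) + struct.pack(fmt, value)
```

For example, `encode_float(1.0, 16)` gives the three bytes 0xf93c00.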
3.4. Tagging of Items | 3.4. Tagging of Items | |||
In CBOR, a data item can be enclosed by a tag to give it some | In CBOR, a data item can be enclosed by a tag to give it some | |||
additional semantics, as uniquely identified by a "tag number". The | additional semantics, as uniquely identified by a _tag number_. The | |||
tag is major type 6, its argument (Section 3) indicates the tag | tag is major type 6, its argument (Section 3) indicates the tag | |||
number, and it contains a single enclosed data item, the "tag | number, and it contains a single enclosed data item, the _tag | |||
content". (If a tag requires further structure to its content, this | content_. (If a tag requires further structure to its content, this | |||
structure is provided by the enclosed data item.) We use the term | structure is provided by the enclosed data item.) We use the term | |||
"tag" for the entire data item consisting of both a tag number and | _tag_ for the entire data item consisting of both a tag number and | |||
the tag content: the tag content is the data item that is being | the tag content: the tag content is the data item that is being | |||
tagged. | tagged. | |||
For example, assume that a byte string of length 12 is marked with a | For example, assume that a byte string of length 12 is marked with a | |||
tag of number 2 to indicate it is a positive "bignum" | tag of number 2 to indicate it is an unsigned _bignum_ | |||
(Section 3.4.3). The encoded data item would start with a byte | (Section 3.4.3). The encoded data item would start with a byte | |||
0b110_00010 (major type 6, additional information 2 for the tag | 0b110_00010 (major type 6, additional information 2 for the tag | |||
number) followed by the encoded tag content: 0b010_01100 (major type | number) followed by the encoded tag content: 0b010_01100 (major type | |||
2, additional information of 12 for the length) followed by the 12 | 2, additional information 12 for the length) followed by the 12 bytes | |||
bytes of the bignum. | of the bignum. | |||
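The byte layout in this example can be reproduced with a short Python sketch. The helper names encode_head and encode_tagged_byte_string are illustrative only and do not come from the specification, and the 12 content bytes are arbitrary placeholder values.

```python
def encode_head(major_type: int, argument: int) -> bytes:
    """Encode the initial byte and argument of a data item (shortest form)."""
    if not 0 <= major_type <= 7:
        raise ValueError("major type must be in 0..7")
    if not 0 <= argument < 1 << 64:
        raise ValueError("argument must fit in 64 bits")
    if argument < 24:
        # The argument fits directly into the 5-bit additional information.
        return bytes([(major_type << 5) | argument])
    for additional_info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < 1 << (8 * size):
            head = bytes([(major_type << 5) | additional_info])
            return head + argument.to_bytes(size, "big")
    raise AssertionError("unreachable")


def encode_tagged_byte_string(tag_number: int, content: bytes) -> bytes:
    """Tag head (major type 6), then byte string head (major type 2), then bytes."""
    return encode_head(6, tag_number) + encode_head(2, len(content)) + content


bignum_bytes = bytes(range(1, 13))  # 12 arbitrary placeholder bytes
encoded = encode_tagged_byte_string(2, bignum_bytes)
print(f"{encoded[0]:08b} {encoded[1]:08b} + {len(encoded) - 2} content bytes")
```

The first two bytes correspond to 0b110_00010 and 0b010_01100 from the text above.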
The definition of a tag number describes the additional semantics | In the extended generic data model, a tag number's definition | |||
conveyed for tags with this tag number in the extended generic data | describes the additional semantics conveyed with the tag number. | |||
model. These semantics may include equivalence of some tagged data | These semantics may include equivalence of some tagged data items | |||
items with other data items, including some that can already be | with other data items, including some that can be represented in the | |||
represented in the basic generic data model. For instance, 0xc24101, | basic generic data model. For instance, 0xc24101, a bignum the tag | |||
a bignum the tag content of which is the byte string with the single | content of which is the byte string with the single byte 0x01, is | |||
byte 0x01, is equivalent to an integer 1, which could also be encoded | equivalent to an integer 1, which could also be encoded as 0x01, | |||
for instance as 0x01, 0x1801, or 0x190001. The tag definition may | 0x1801, or 0x190001. The tag definition may specify a preferred | |||
include the definition of a preferred serialization (Section 4.1) | serialization (Section 4.1) that is recommended for generic encoders; | |||
that is recommended for generic encoders; this may prefer basic | this may prefer basic generic data model representations over ones | |||
generic data model representations over ones that employ a tag. | that employ a tag. | |||
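To illustrate the equivalence described above, the following Python sketch decodes the four encodings mentioned in the text and shows that each yields the integer 1. The function decode_integer is a hypothetical helper that handles only unsigned integers (major type 0) and tag number 2 with a short byte string; it is not a general CBOR decoder.

```python
def decode_integer(data: bytes) -> int:
    """Decode an unsigned integer or a tag 2 bignum (short byte string only)."""
    major, info = data[0] >> 5, data[0] & 0x1F
    if major == 0:
        if info < 24:
            return info
        size = {24: 1, 25: 2, 26: 4, 27: 8}.get(info)
        if size is None or len(data) < 1 + size:
            raise ValueError("not a well-formed unsigned integer")
        return int.from_bytes(data[1:1 + size], "big")
    if major == 6 and info == 2:
        content = data[1:]
        if not content:
            raise ValueError("tag content is missing")
        c_major, c_length = content[0] >> 5, content[0] & 0x1F
        # Assumption of this sketch: byte string lengths below 24 only.
        if c_major != 2 or c_length >= 24:
            raise ValueError("unsupported tag content in this sketch")
        payload = content[1:1 + c_length]
        if len(payload) != c_length:
            raise ValueError("truncated byte string")
        return int.from_bytes(payload, "big")
    raise ValueError("unsupported data item in this sketch")


encodings = ["c24101", "01", "1801", "190001"]
values = [decode_integer(bytes.fromhex(h)) for h in encodings]
print(values)
```

Of these four encodings, only 0x01 is the preferred serialization; the others are valid but longer than needed.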
The tag definition usually restricts what kinds of nested data item | The tag definition usually defines which nested data items are valid | |||
or items are valid for such tags. Tag definitions may restrict their | for such tags. Tag definitions may restrict their content to a very | |||
content to a very specific syntactic structure, as the tags defined | specific syntactic structure, as the tags defined in this document | |||
in this document do, or they may aim at a more semantically defined | do, or they may define their content more semantically. An example | |||
definition of their content, as for instance tags 40 and 1040 do | for the latter is how tags 40 and 1040 accept multiple ways to | |||
[RFC8746]: These accept a number of different ways of representing | represent arrays [RFC8746]. | |||
arrays. | ||||
As a matter of convention, many tags do not accept null or undefined | As a matter of convention, many tags do not accept "null" or | |||
values as tag content; instead, the expectation is that a null or | "undefined" values as tag content; instead, the expectation is that a | |||
undefined value can be used in place of the entire tag; Section 3.4.2 | "null" or "undefined" value can be used in place of the entire tag; | |||
provides some further considerations for one specific tag about the | Section 3.4.2 provides some further considerations for one specific | |||
handling of this convention in application protocols and in mapping | tag about the handling of this convention in application protocols | |||
to platform types. | and in mapping to platform types. | |||
Decoders do not need to understand tags of every tag number, and tags | Decoders do not need to understand tags of every tag number, and tags | |||
may be of little value in applications where the implementation | may be of little value in applications where the implementation | |||
creating a particular CBOR data item and the implementation decoding | creating a particular CBOR data item and the implementation decoding | |||
that stream know the semantic meaning of each item in the data flow. | that stream know the semantic meaning of each item in the data flow. | |||
Their primary purpose in this specification is to define common data | The primary purpose of tags in this specification is to define common | |||
types such as dates. A secondary purpose is to provide conversion | data types such as dates. A secondary purpose is to provide | |||
hints when it is foreseen that the CBOR data item needs to be | conversion hints when it is foreseen that the CBOR data item needs to | |||
translated into a different format, requiring hints about the content | be translated into a different format, requiring hints about the | |||
of items. Understanding the semantics of tags is optional for a | content of items. Understanding the semantics of tags is optional | |||
decoder; it can simply present both the tag number and the tag | for a decoder; it can simply present both the tag number and the tag | |||
content to the application, without interpreting the additional | content to the application, without interpreting the additional | |||
semantics of the tag. | semantics of the tag. | |||
A tag applies semantics to the data item it encloses. Tags can nest: | A tag applies semantics to the data item it encloses. Tags can nest: | |||
If tag A encloses tag B, which encloses data item C, tag A applies to | if tag A encloses tag B, which encloses data item C, tag A applies to | |||
the result of applying tag B on data item C. | the result of applying tag B on data item C. | |||
IANA maintains a registry of tag numbers as described in Section 9.2. | IANA maintains a registry of tag numbers as described in Section 9.2. | |||
Table 5 provides a list of tag numbers that were defined in | Table 5 provides a list of tag numbers that were defined in [RFC7049] | |||
[RFC7049], with definitions in the rest of this section. (Tag number | with definitions in the rest of this section. (Tag number 35 was | |||
35 was also defined in [RFC7049]; a discussion of this tag number | also defined in [RFC7049]; a discussion of this tag number follows in | |||
follows in Section 3.4.5.3.) Note that many other tag numbers have | Section 3.4.5.3.) Note that many other tag numbers have been defined | |||
been defined since the publication of [RFC7049]; see the registry | since the publication of [RFC7049]; see the registry described at | |||
described at Section 9.2 for the complete list. | Section 9.2 for the complete list. | |||
+============+=============+==================================+ | +=======+=============+==================================+ | |||
| Tag Number | Data Item | Tag Content Semantics | | | Tag | Data Item | Semantics | | |||
+============+=============+==================================+ | +=======+=============+==================================+ | |||
| 0 | text string | Standard date/time string; see | | | 0 | text string | Standard date/time string; see | | |||
| | | Section 3.4.1 | | | | | Section 3.4.1 | | |||
+------------+-------------+----------------------------------+ | +-------+-------------+----------------------------------+ | |||
| 1 | integer or | Epoch-based date/time; see | | | 1 | integer or | Epoch-based date/time; see | | |||
| | float | Section 3.4.2 | | | | float | Section 3.4.2 | | |||
+------------+-------------+----------------------------------+ | +-------+-------------+----------------------------------+ | |||
| 2 | byte string | Positive bignum; see | | | 2 | byte string | Unsigned bignum; see | | |||
| | | Section 3.4.3 | | | | | Section 3.4.3 | | |||
+------------+-------------+----------------------------------+ | +-------+-------------+----------------------------------+ | |||
| 3 | byte string | Negative bignum; see | | | 3 | byte string | Negative bignum; see | | |||
| | | Section 3.4.3 | | | | | Section 3.4.3 | | |||
+------------+-------------+----------------------------------+ | +-------+-------------+----------------------------------+ | |||
| 4 | array | Decimal fraction; see | | | 4 | array | Decimal fraction; see | | |||
| | | Section 3.4.4 | | | | | Section 3.4.4 | | |||
+------------+-------------+----------------------------------+ | +-------+-------------+----------------------------------+ | |||
| 5 | array | Bigfloat; see Section 3.4.4 | | | 5 | array | Bigfloat; see Section 3.4.4 | | |||
+------------+-------------+----------------------------------+ | +-------+-------------+----------------------------------+ | |||
| 21 | (any) | Expected conversion to base64url | | | 21 | (any) | Expected conversion to base64url | | |||
| | | encoding; see Section 3.4.5.2 | | | | | encoding; see Section 3.4.5.2 | | |||
+------------+-------------+----------------------------------+ | +-------+-------------+----------------------------------+ | |||
| 22 | (any) | Expected conversion to base64 | | | 22 | (any) | Expected conversion to base64 | | |||
| | | encoding; see Section 3.4.5.2 | | | | | encoding; see Section 3.4.5.2 | | |||
+------------+-------------+----------------------------------+ | +-------+-------------+----------------------------------+ | |||
| 23 | (any) | Expected conversion to base16 | | | 23 | (any) | Expected conversion to base16 | | |||
| | | encoding; see Section 3.4.5.2 | | | | | encoding; see Section 3.4.5.2 | | |||
+------------+-------------+----------------------------------+ | +-------+-------------+----------------------------------+ | |||
| 24 | byte string | Encoded CBOR data item; see | | | 24 | byte string | Encoded CBOR data item; see | | |||
| | | Section 3.4.5.1 | | | | | Section 3.4.5.1 | | |||
+------------+-------------+----------------------------------+ | +-------+-------------+----------------------------------+ | |||
| 32 | text string | URI; see Section 3.4.5.3 | | | 32 | text string | URI; see Section 3.4.5.3 | | |||
+------------+-------------+----------------------------------+ | +-------+-------------+----------------------------------+ | |||
| 33 | text string | base64url; see Section 3.4.5.3 | | | 33 | text string | base64url; see Section 3.4.5.3 | | |||
+------------+-------------+----------------------------------+ | +-------+-------------+----------------------------------+ | |||
| 34 | text string | base64; see Section 3.4.5.3 | | | 34 | text string | base64; see Section 3.4.5.3 | | |||
+------------+-------------+----------------------------------+ | +-------+-------------+----------------------------------+ | |||
| 36 | text string | MIME message; see | | | 36 | text string | MIME message; see | | |||
| | | Section 3.4.5.3 | | | | | Section 3.4.5.3 | | |||
+------------+-------------+----------------------------------+ | +-------+-------------+----------------------------------+ | |||
| 55799 | (any) | Self-described CBOR; see | | | 55799 | (any) | Self-described CBOR; see | | |||
| | | Section 3.4.6 | | | | | Section 3.4.6 | | |||
+------------+-------------+----------------------------------+ | +-------+-------------+----------------------------------+ | |||
Table 5: Tag numbers defined in RFC 7049 | Table 5: Tag Numbers Defined in RFC 7049 | |||
Conceptually, tags are interpreted in the generic data model, not at | Conceptually, tags are interpreted in the generic data model, not at | |||
(de-)serialization time. A small number of tags (at this time, tag | (de-)serialization time. A small number of tags (at this time, tag | |||
number 25 and tag number 29 [IANA.cbor-tags]) have been registered | number 25 and tag number 29 [IANA.cbor-tags]) have been registered | |||
with semantics that may require processing at (de-)serialization | with semantics that may require processing at (de-)serialization | |||
time: The decoder needs to be aware and the encoder needs to be in | time: the decoder needs to be aware of, and the encoder needs to be | |||
control of the exact sequence in which data items are encoded into | in control of, the exact sequence in which data items are encoded | |||
the CBOR data item. This means these tags cannot be implemented on | into the CBOR data item. This means these tags cannot be implemented | |||
top of an arbitrary generic CBOR encoder/decoder (which might not | on top of an arbitrary generic CBOR encoder/decoder (which might not | |||
reflect the serialization order for entries in a map at the data | reflect the serialization order for entries in a map at the data | |||
model level and vice versa); their implementation therefore typically | model level and vice versa); their implementation therefore typically | |||
needs to be integrated into the generic encoder/decoder. The | needs to be integrated into the generic encoder/decoder. The | |||
definition of new tags with this property is NOT RECOMMENDED. | definition of new tags with this property is NOT RECOMMENDED. | |||
IANA allocated tag numbers 65535, 4294967295, and | IANA allocated tag numbers 65535, 4294967295, and | |||
18446744073709551615 (binary all-ones in 16-bit, 32-bit, and 64-bit). | 18446744073709551615 (binary all-ones in 16-bit, 32-bit, and 64-bit). | |||
These can be used as a convenience for implementers that want a | These can be used as a convenience for implementers who want a | |||
single integer data structure to indicate either that a specific tag | single-integer data structure to indicate either the presence of a | |||
is present, or the absence of a tag. That allocation is described in | specific tag or absence of a tag. That allocation is described in | |||
Section 10 of [I-D.bormann-cbor-notable-tags]. These tags are not | Section 10 of [CBOR-TAGS]. These tags are not intended to occur in | |||
intended to occur in actual CBOR data items; implementations MAY flag | actual CBOR data items; implementations MAY flag such an occurrence | |||
such an occurrence as an error. | as an error. | |||
Protocols using tag numbers 0 and 1 extend the generic data model | Protocols can extend the generic data model (Section 2) with data | |||
(Section 2) with data items representing points in time; tag numbers | items representing points in time by using tag numbers 0 and 1, with | |||
2 and 3, with arbitrarily sized integers; and tag numbers 4 and 5, | arbitrarily sized integers by using tag numbers 2 and 3, and with | |||
with floating-point values of arbitrary size and precision. | floating-point values of arbitrary size and precision by using tag | |||
numbers 4 and 5. | ||||
3.4.1. Standard Date/Time String | 3.4.1. Standard Date/Time String | |||
Tag number 0 contains a text string in the standard format described | Tag number 0 contains a text string in the standard format described | |||
by the "date-time" production in [RFC3339], as refined by Section 3.3 | by the "date-time" production in [RFC3339], as refined by Section 3.3 | |||
of [RFC4287], representing the point in time described there. A | of [RFC4287], representing the point in time described there. A | |||
nested item of another type or a text string that doesn't match the | nested item of another type or a text string that doesn't match the | |||
[RFC4287] format is invalid. | format described in [RFC4287] is invalid. | |||
3.4.2. Epoch-based Date/Time | 3.4.2. Epoch-Based Date/Time | |||
Tag number 1 contains a numerical value counting the number of | Tag number 1 contains a numerical value counting the number of | |||
seconds from 1970-01-01T00:00Z in UTC time to the represented point | seconds from 1970-01-01T00:00Z in UTC time to the represented point | |||
in civil time. | in civil time. | |||
The tag content MUST be an unsigned or negative integer (major types | The tag content MUST be an unsigned or negative integer (major types | |||
0 and 1), or a floating-point number (major type 7 with additional | 0 and 1) or a floating-point number (major type 7 with additional | |||
information 25, 26, or 27). Other contained types are invalid. | information 25, 26, or 27). Other contained types are invalid. | |||
Non-negative values (major type 0 and non-negative floating-point | Nonnegative values (major type 0 and nonnegative floating-point | |||
numbers) stand for time values on or after 1970-01-01T00:00Z UTC and | numbers) stand for time values on or after 1970-01-01T00:00Z UTC and | |||
are interpreted according to POSIX [TIME_T]. (POSIX time is also | are interpreted according to POSIX [TIME_T]. (POSIX time is also | |||
known as "UNIX Epoch time".) Leap seconds are handled specially by | known as "UNIX Epoch time".) Leap seconds are handled specially by | |||
POSIX time and this results in a 1 second discontinuity several times | POSIX time, and this results in a 1-second discontinuity several | |||
per decade. Note that applications that require the expression of | times per decade. Note that applications that require the expression | |||
times beyond early 2106 cannot leave out support of 64-bit integers | of times beyond early 2106 cannot leave out support of 64-bit | |||
for the tag content. | integers for the tag content. | |||
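The "early 2106" limit follows from the largest value that fits in a 32-bit unsigned integer. A minimal Python sketch, using the hypothetical helper encode_epoch_time (restricted to nonnegative integer seconds), shows the date of that limit and the point where the tag content grows to a 64-bit argument:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def encode_epoch_time(seconds: int) -> bytes:
    """Encode tag number 1 around a nonnegative integer (major type 0)."""
    if not 0 <= seconds < 1 << 64:
        raise ValueError("this sketch handles nonnegative 64-bit values only")
    tag_head = b"\xc1"  # major type 6, tag number 1
    if seconds < 24:
        return tag_head + bytes([seconds])
    for additional_info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if seconds < 1 << (8 * size):
            return tag_head + bytes([additional_info]) + seconds.to_bytes(size, "big")
    raise AssertionError("unreachable")


last_32bit = (1 << 32) - 1
rollover = EPOCH + timedelta(seconds=last_32bit)
print(rollover.isoformat())
print(encode_epoch_time(last_32bit).hex())   # 32-bit argument
print(encode_epoch_time(last_32bit + 1).hex())  # 64-bit argument
```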
Negative values (major type 1 and negative floating-point numbers) | Negative values (major type 1 and negative floating-point numbers) | |||
are interpreted as determined by the application requirements as | are interpreted as determined by the application requirements as | |||
there is no universal standard for UTC count-of-seconds time before | there is no universal standard for UTC count-of-seconds time before | |||
1970-01-01T00:00Z (this is particularly true for points in time that | 1970-01-01T00:00Z (this is particularly true for points in time that | |||
precede discontinuities in national calendars). The same applies to | precede discontinuities in national calendars). The same applies to | |||
non-finite values. | non-finite values. | |||
To indicate fractional seconds, floating-point values can be used | To indicate fractional seconds, floating-point values can be used | |||
within tag number 1 instead of integer values. Note that this | within tag number 1 instead of integer values. Note that this | |||
generally requires binary64 support, as binary16 and binary32 provide | generally requires binary64 support, as binary16 and binary32 provide | |||
non-zero fractions of seconds only for a short period of time around | nonzero fractions of seconds only for a short period of time around | |||
early 1970. An application that requires tag number 1 support may | early 1970. An application that requires tag number 1 support may | |||
restrict the tag content to be an integer (or a floating-point value) | restrict the tag content to be an integer (or a floating-point value) | |||
only. | only. | |||
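The precision limit of binary32 can be seen by round-tripping a timestamp with a half-second fraction through both formats. This Python sketch uses the standard struct module; the timestamp value is an arbitrary example chosen for illustration.

```python
import struct


def round_trip(value: float, fmt: str) -> float:
    """Pack a float into the given IEEE 754 format and unpack it again."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


timestamp = 1600000000.5  # an instant in 2020 with a half-second fraction
as_binary32 = round_trip(timestamp, ">f")
as_binary64 = round_trip(timestamp, ">d")
print(as_binary32, as_binary64)
```

In binary32, adjacent representable values near this magnitude are 128 seconds apart, so the fraction is lost; binary64 preserves it.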
Note that platform types for date/time may include null or undefined | Note that platform types for date/time may include "null" or | |||
values, which may also be desirable at an application protocol level. | "undefined" values, which may also be desirable at an application | |||
While emitting tag number 1 values with non-finite tag content values | protocol level. While emitting tag number 1 values with non-finite | |||
(e.g., with NaN for undefined date/time values or with Infinite for | tag content values (e.g., with NaN for undefined date/time values or | |||
an expiry date that is not set) may seem an obvious way to handle | with Infinity for an expiry date that is not set) may seem an obvious | |||
this, using untagged null or undefined avoids the use of non-finites | way to handle this, using untagged "null" or "undefined" avoids the | |||
and results in a shorter encoding. Application protocol designers | use of non-finites and results in a shorter encoding. Application | |||
are encouraged to consider these cases and include clear guidelines | protocol designers are encouraged to consider these cases and include | |||
for handling them. | clear guidelines for handling them. | |||
3.4.3. Bignums | 3.4.3. Bignums | |||
Protocols using tag numbers 2 and 3 extend the generic data model | Protocols using tag numbers 2 and 3 extend the generic data model | |||
(Section 2) with "bignums" representing arbitrarily sized integers. | (Section 2) with "bignums" representing arbitrarily sized integers. | |||
In the basic generic data model, bignum values are not equal to | In the basic generic data model, bignum values are not equal to | |||
integers from the same model, but the extended generic data model | integers from the same model, but the extended generic data model | |||
created by this tag definition defines equivalence based on numeric | created by this tag definition defines equivalence based on numeric | |||
value, and preferred serialization (Section 4.1) never makes use of | value, and preferred serialization (Section 4.1) never makes use of | |||
bignums that also can be expressed as basic integers (see below). | bignums that also can be expressed as basic integers (see below). | |||
skipping to change at page 25, line 12 ¶ | skipping to change at line 1102 ¶ | |||
leading zeroes (note that this means the preferred serialization for | leading zeroes (note that this means the preferred serialization for | |||
n = 0 is the empty byte string, but see below). Decoders that | n = 0 is the empty byte string, but see below). Decoders that | |||
understand these tags MUST be able to decode bignums that do have | understand these tags MUST be able to decode bignums that do have | |||
leading zeroes. The preferred serialization of an integer that can | leading zeroes. The preferred serialization of an integer that can | |||
be represented using major type 0 or 1 is to encode it this way | be represented using major type 0 or 1 is to encode it this way | |||
instead of as a bignum (which means that the empty string never | instead of as a bignum (which means that the empty string never | |||
occurs in a bignum when using preferred serialization). Note that | occurs in a bignum when using preferred serialization). Note that | |||
this means the non-preferred choice of a bignum representation | this means the non-preferred choice of a bignum representation | |||
instead of a basic integer for encoding a number is not intended to | instead of a basic integer for encoding a number is not intended to | |||
have application semantics (just as the choice of a longer basic | have application semantics (just as the choice of a longer basic | |||
integer representation than needed, such as 0x1800 for 0x00 does | integer representation than needed, such as 0x1800 for 0x00, does | |||
not). | not). | |||
For example, the number 18446744073709551616 (2**64) is represented | For example, the number 18446744073709551616 (2^(64)) is represented | |||
as 0b110_00010 (major type 6, tag number 2), followed by 0b010_01001 | as 0b110_00010 (major type 6, tag number 2), followed by 0b010_01001 | |||
(major type 2, length 9), followed by 0x010000000000000000 (one byte | (major type 2, length 9), followed by 0x010000000000000000 (one byte | |||
0x01 and eight bytes 0x00). In hexadecimal: | 0x01 and eight bytes 0x00). In hexadecimal: | |||
C2 -- Tag 2 | C2 -- Tag 2 | |||
49 -- Byte string of length 9 | 49 -- Byte string of length 9 | |||
010000000000000000 -- Bytes content | 010000000000000000 -- Bytes content | |||
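The preferred serialization rule and the example above can be sketched in Python as follows. The function encode_unsigned is a hypothetical helper that covers nonnegative integers only: it uses major type 0 whenever the value fits in 64 bits and falls back to tag number 2 otherwise.

```python
def encode_head(major_type: int, argument: int) -> bytes:
    """Encode the initial byte and argument of a data item (shortest form)."""
    if argument < 24:
        return bytes([(major_type << 5) | argument])
    for additional_info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < 1 << (8 * size):
            head = bytes([(major_type << 5) | additional_info])
            return head + argument.to_bytes(size, "big")
    raise ValueError("argument must fit in 64 bits")


def encode_unsigned(n: int) -> bytes:
    """Encode a nonnegative integer, using a bignum only when required."""
    if n < 0:
        raise ValueError("this sketch handles nonnegative integers only")
    if n < 1 << 64:
        return encode_head(0, n)  # basic integer, never a bignum
    # Shortest byte string in network byte order, without leading zeroes.
    content = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return encode_head(6, 2) + encode_head(2, len(content)) + content


print(encode_unsigned(2**64).hex())
print(encode_unsigned(2**64 - 1).hex())
```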
3.4.4. Decimal Fractions and Bigfloats | 3.4.4. Decimal Fractions and Bigfloats | |||
Protocols using tag number 4 extend the generic data model with data | Protocols using tag number 4 extend the generic data model with data | |||
items representing arbitrary-length decimal fractions of the form | items representing arbitrary-length decimal fractions of the form | |||
m*(10**e). Protocols using tag number 5 extend the generic data | m*(10^(e)). Protocols using tag number 5 extend the generic data | |||
model with data items representing arbitrary-length binary fractions | model with data items representing arbitrary-length binary fractions | |||
of the form m*(2**e). As with bignums, values of different types are | of the form m*(2^(e)). As with bignums, values of different types | |||
not equal in the generic data model. | are not equal in the generic data model. | |||
Decimal fractions combine an integer mantissa with a base-10 scaling | Decimal fractions combine an integer mantissa with a base-10 scaling | |||
factor. They are most useful if an application needs the exact | factor. They are most useful if an application needs the exact | |||
representation of a decimal fraction such as 1.1 because there is no | representation of a decimal fraction such as 1.1 because there is no | |||
exact representation for many decimal fractions in binary floating- | exact representation for many decimal fractions in binary floating- | |||
point representations. | point representations. | |||
"Bigfloats" combine an integer mantissa with a base-2 scaling factor. | "Bigfloats" combine an integer mantissa with a base-2 scaling factor. | |||
They are binary floating-point values that can exceed the range or | They are binary floating-point values that can exceed the range or | |||
the precision of the three IEEE 754 formats supported by CBOR | the precision of the three IEEE 754 formats supported by CBOR | |||
(Section 3.3). Bigfloats may also be used by constrained | (Section 3.3). Bigfloats may also be used by constrained | |||
applications that need some basic binary floating-point capability | applications that need some basic binary floating-point capability | |||
without the need for supporting IEEE 754. | without the need for supporting IEEE 754. | |||
A decimal fraction or a bigfloat is represented as a tagged array | A decimal fraction or a bigfloat is represented as a tagged array | |||
that contains exactly two integer numbers: an exponent e and a | that contains exactly two integer numbers: an exponent e and a | |||
mantissa m. Decimal fractions (tag number 4) use base-10 exponents; | mantissa m. Decimal fractions (tag number 4) use base-10 exponents; | |||
the value of a decimal fraction data item is m*(10**e). Bigfloats | the value of a decimal fraction data item is m*(10^(e)). Bigfloats | |||
(tag number 5) use base-2 exponents; the value of a bigfloat data | (tag number 5) use base-2 exponents; the value of a bigfloat data | |||
item is m*(2**e). The exponent e MUST be represented in an integer | item is m*(2^(e)). The exponent e MUST be represented in an integer | |||
of major type 0 or 1, while the mantissa can also be a bignum | of major type 0 or 1, while the mantissa can also be a bignum | |||
(Section 3.4.3). Contained items with other structures are invalid. | (Section 3.4.3). Contained items with other structures are invalid. | |||
An example of a decimal fraction is that the number 273.15 could be | An example of a decimal fraction is the representation of the number | |||
represented as 0b110_00100 (major type 6 for tag, additional | 273.15 as 0b110_00100 (major type 6 for tag, additional information 4 | |||
information 4 for the tag number), followed by 0b100_00010 (major | for the tag number), followed by 0b100_00010 (major type 4 for the | |||
type 4 for the array, additional information 2 for the length of the | array, additional information 2 for the length of the array), | |||
array), followed by 0b001_00001 (major type 1 for the first integer, | followed by 0b001_00001 (major type 1 for the first integer, | |||
additional information 1 for the value of -2), followed by | additional information 1 for the value of -2), followed by | |||
0b000_11001 (major type 0 for the second integer, additional | 0b000_11001 (major type 0 for the second integer, additional | |||
information 25 for a two-byte value), followed by 0b0110101010110011 | information 25 for a two-byte value), followed by 0b0110101010110011 | |||
(27315 in two bytes). In hexadecimal: | (27315 in two bytes). In hexadecimal: | |||
C4 -- Tag 4 | C4 -- Tag 4 | |||
82 -- Array of length 2 | 82 -- Array of length 2 | |||
21 -- -2 | 21 -- -2 | |||
19 6ab3 -- 27315 | 19 6ab3 -- 27315 | |||
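As a sketch of how an encoder might arrive at these bytes, the following Python code derives the exponent and mantissa from a decimal.Decimal value and emits tag number 4. The helper names are illustrative, and the sketch assumes that both numbers fit into major type 0 or 1 (a mantissa that needs a bignum is rejected).

```python
from decimal import Decimal


def encode_int(n: int) -> bytes:
    """Encode an integer as major type 0 (n >= 0) or major type 1 (n < 0)."""
    major_type, argument = (0, n) if n >= 0 else (1, -1 - n)
    if argument < 24:
        return bytes([(major_type << 5) | argument])
    for additional_info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < 1 << (8 * size):
            head = bytes([(major_type << 5) | additional_info])
            return head + argument.to_bytes(size, "big")
    raise ValueError("value needs a bignum, which this sketch does not cover")


def encode_decimal_fraction(value: Decimal) -> bytes:
    """Encode value as tag 4 enclosing the array [exponent, mantissa]."""
    if not value.is_finite():
        raise ValueError("only finite values can be decimal fractions")
    sign, digits, exponent = value.as_tuple()
    mantissa = int("".join(map(str, digits)))
    if sign:
        mantissa = -mantissa
    # 0xC4 is tag number 4; 0x82 is an array of length 2.
    return b"\xc4\x82" + encode_int(exponent) + encode_int(mantissa)


print(encode_decimal_fraction(Decimal("273.15")).hex())
```

Note that the exponent comes first in the array, followed by the mantissa.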
An example of a bigfloat is that the number 1.5 could be represented | An example of a bigfloat is the representation of the number 1.5 as | |||
as 0b110_00101 (major type 6 for tag, additional information 5 for | 0b110_00101 (major type 6 for tag, additional information 5 for the | |||
the tag number), followed by 0b100_00010 (major type 4 for the array, | tag number), followed by 0b100_00010 (major type 4 for the array, | |||
additional information 2 for the length of the array), followed by | additional information 2 for the length of the array), followed by | |||
0b001_00000 (major type 1 for the first integer, additional | 0b001_00000 (major type 1 for the first integer, additional | |||
information 0 for the value of -1), followed by 0b000_00011 (major | information 0 for the value of -1), followed by 0b000_00011 (major | |||
type 0 for the second integer, additional information 3 for the value | type 0 for the second integer, additional information 3 for the value | |||
of 3). In hexadecimal: | of 3). In hexadecimal: | |||
C5 -- Tag 5 | C5 -- Tag 5 | |||
82 -- Array of length 2 | 82 -- Array of length 2 | |||
20 -- -1 | 20 -- -1 | |||
03 -- 3 | 03 -- 3 | |||
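A bigfloat for a Python float can be derived from float.as_integer_ratio(), which returns a reduced fraction whose denominator is a power of two. The sketch below is one possible approach under that assumption; the helper names are illustrative and only finite values are handled.

```python
import math


def encode_int(n: int) -> bytes:
    """Encode an integer as major type 0 (n >= 0) or major type 1 (n < 0)."""
    major_type, argument = (0, n) if n >= 0 else (1, -1 - n)
    if argument < 24:
        return bytes([(major_type << 5) | argument])
    for additional_info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < 1 << (8 * size):
            head = bytes([(major_type << 5) | additional_info])
            return head + argument.to_bytes(size, "big")
    raise ValueError("value needs a bignum, which this sketch does not cover")


def encode_bigfloat(value: float) -> bytes:
    """Encode value as tag 5 enclosing the array [exponent, mantissa]."""
    if not math.isfinite(value):
        raise ValueError("only finite values are handled in this sketch")
    mantissa, denominator = value.as_integer_ratio()
    # denominator == 2**k, so value == mantissa * 2**(-k).
    exponent = -(denominator.bit_length() - 1)
    # 0xC5 is tag number 5; 0x82 is an array of length 2.
    return b"\xc5\x82" + encode_int(exponent) + encode_int(mantissa)


print(encode_bigfloat(1.5).hex())
```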
skipping to change at page 27, line 38 ¶ | skipping to change at line 1218 ¶ | |||
therefore wants to say what it believes is the proper way to convert | therefore wants to say what it believes is the proper way to convert | |||
binary strings to JSON. | binary strings to JSON. | |||
The data item tagged can be a byte string or any other data item. In | The data item tagged can be a byte string or any other data item. In | |||
the latter case, the tag applies to all of the byte string data items | the latter case, the tag applies to all of the byte string data items | |||
contained in the data item, except for those contained in a nested | contained in the data item, except for those contained in a nested | |||
data item tagged with an expected conversion. | data item tagged with an expected conversion. | |||
These three tag numbers suggest conversions to three of the base data | These three tag numbers suggest conversions to three of the base data | |||
encodings defined in [RFC4648]. Tag number 21 suggests conversion to | encodings defined in [RFC4648]. Tag number 21 suggests conversion to | |||
base64url encoding (Section 5 of RFC 4648), where padding is not used | base64url encoding (Section 5 of [RFC4648]) where padding is not used | |||
(see Section 3.2 of RFC 4648); that is, all trailing equals signs | (see Section 3.2 of [RFC4648]); that is, all trailing equals signs | |||
("=") are removed from the encoded string. Tag number 22 suggests | ("=") are removed from the encoded string. Tag number 22 suggests | |||
conversion to classical base64 encoding (Section 4 of RFC 4648), with | conversion to classical base64 encoding (Section 4 of [RFC4648]) with | |||
padding as defined in RFC 4648. For both base64url and base64, | padding as defined in RFC 4648. For both base64url and base64, | |||
padding bits are set to zero (see Section 3.5 of RFC 4648), and the | padding bits are set to zero (see Section 3.5 of [RFC4648]), and the | |||
conversion to alternate encoding is performed on the contents of the | conversion to alternate encoding is performed on the contents of the | |||
byte string (that is, without adding any line breaks, whitespace, or | byte string (that is, without adding any line breaks, whitespace, or | |||
other additional characters). Tag number 23 suggests conversion to | other additional characters). Tag number 23 suggests conversion to | |||
base16 (hex) encoding, with uppercase alphabetics (see Section 8 of | base16 (hex) encoding with uppercase alphabetics (see Section 8 of | |||
RFC 4648). Note that, for all three tag numbers, the encoding of the | [RFC4648]). Note that, for all three tag numbers, the encoding of | |||
empty byte string is the empty text string. | the empty byte string is the empty text string. | |||
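The three suggested conversions map directly onto Python's standard base64 module. The following sketch uses a hypothetical function expected_conversion that takes the tag number and the byte string content; how a converter discovers the tag and walks nested items is outside its scope.

```python
import base64


def expected_conversion(tag_number: int, data: bytes) -> str:
    """Convert a byte string to text as suggested by tag number 21, 22, or 23."""
    if tag_number == 21:
        # base64url without padding: strip all trailing "=" characters.
        return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")
    if tag_number == 22:
        # Classical base64 with padding.
        return base64.b64encode(data).decode("ascii")
    if tag_number == 23:
        # base16 with uppercase alphabetics.
        return base64.b16encode(data).decode("ascii")
    raise ValueError("not an expected-conversion tag number")


sample = bytes.fromhex("fbff")
for tag_number in (21, 22, 23):
    print(tag_number, expected_conversion(tag_number, sample))
```

None of these calls inserts line breaks or whitespace, which matches the requirement that the conversion is performed on the contents of the byte string only.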
3.4.5.3. Encoded Text | 3.4.5.3. Encoded Text | |||
Some text strings hold data that have formats widely used on the | Some text strings hold data that have formats widely used on the | |||
Internet, and sometimes those formats can be validated and presented | Internet, and sometimes those formats can be validated and presented | |||
to the application in appropriate form by the decoder. There are | to the application in appropriate form by the decoder. There are | |||
tags for some of these formats. | tags for some of these formats. | |||
* Tag number 32 is for URIs, as defined in [RFC3986]. If the text | * Tag number 32 is for URIs, as defined in [RFC3986]. If the text | |||
string doesn't match the "URI-reference" production, the string is | string doesn't match the "URI-reference" production, the string is | |||
invalid. | invalid. | |||
* Tag numbers 33 and 34 are for base64url- and base64-encoded text | * Tag numbers 33 and 34 are for base64url- and base64-encoded text | |||
strings, respectively, as defined in [RFC4648]. If any of: | strings, respectively, as defined in [RFC4648]. If any of the | |||
following apply: | ||||
- the encoded text string contains non-alphabet characters or | - the encoded text string contains non-alphabet characters or | |||
only 1 alphabet character in the last block of 4 (where | only 1 alphabet character in the last block of 4 (where | |||
alphabet is defined by Section 5 of [RFC4648] for tag number 33 | alphabet is defined by Section 5 of [RFC4648] for tag number 33 | |||
and Section 4 of [RFC4648] for tag number 34), or | and Section 4 of [RFC4648] for tag number 34), or | |||
- the padding bits in a 2- or 3-character block are not 0, or | - the padding bits in a 2- or 3-character block are not 0, or | |||
- the base64 encoding has the wrong number of padding characters, | - the base64 encoding has the wrong number of padding characters, | |||
or | or | |||
- the base64url encoding has padding characters, | - the base64url encoding has padding characters, | |||
the string is invalid. | the string is invalid. | |||
* Tag number 36 is for MIME messages (including all headers), as | * Tag number 36 is for MIME messages (including all headers), as | |||
defined in [RFC2045]. A text string that isn't a valid MIME | defined in [RFC2045]. A text string that isn't a valid MIME | |||
message is invalid. (For this tag, validity checking may be | message is invalid. (For this tag, validity checking may be | |||
particularly onerous for a generic decoder and might therefore not | particularly onerous for a generic decoder and might therefore not | |||
be offered. Note that many MIME messages are general binary data | be offered. Note that many MIME messages are general binary data | |||
and can therefore not be represented in a text string; | and therefore cannot be represented in a text string; | |||
[IANA.cbor-tags] lists a registration for tag number 257 that is | [IANA.cbor-tags] lists a registration for tag number 257 that is | |||
similar to tag number 36 but uses a byte string as its tag | similar to tag number 36 but uses a byte string as its tag | |||
content.) | content.) | |||
Note that tag numbers 33 and 34 differ from 21 and 22 in that the | Note that tag numbers 33 and 34 differ from 21 and 22 in that the | |||
data is transported in base-encoded form for the former and in raw | data is transported in base-encoded form for the former and in raw | |||
byte string form for the latter. | byte string form for the latter. | |||
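
The four invalidity conditions listed above for tag numbers 33 and 34 could be checked along the following lines. This is a sketch under stated assumptions: the function name is_valid_encoded_text and the alphabet constants are hypothetical, and trailing "=" characters are treated as padding while an "=" anywhere else is treated as a non-alphabet character.

```python
import string

_COMMON = string.ascii_uppercase + string.ascii_lowercase + string.digits
B64URL_ALPHABET = _COMMON + "-_"  # Section 5 of RFC 4648 (tag number 33)
B64_ALPHABET = _COMMON + "+/"     # Section 4 of RFC 4648 (tag number 34)


def is_valid_encoded_text(tag_number: int, text: str) -> bool:
    """Apply the validity rules for the content of tag numbers 33 and 34."""
    if tag_number == 33:
        alphabet, padded = B64URL_ALPHABET, False
    elif tag_number == 34:
        alphabet, padded = B64_ALPHABET, True
    else:
        raise ValueError("only tag numbers 33 and 34 are handled")

    body = text.rstrip("=")
    padding = len(text) - len(body)

    # Non-alphabet characters (including "=" before the end) are invalid.
    if any(ch not in alphabet for ch in body):
        return False

    # Only 1 alphabet character in the last block of 4 is invalid.
    tail = len(body) % 4
    if tail == 1:
        return False

    # base64 needs exactly the right padding; base64url needs none.
    expected_padding = (4 - tail) % 4 if padded else 0
    if padding != expected_padding:
        return False

    # Padding bits in a 2- or 3-character final block must be 0.
    if tail == 0:
        return True
    unused_bits = 4 if tail == 2 else 2
    last_value = alphabet.index(body[-1])
    return last_value & ((1 << unused_bits) - 1) == 0
```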
[RFC7049] also defined a tag number 35, for regular expressions that | [RFC7049] also defined a tag number 35 for regular expressions that | |||
are in Perl Compatible Regular Expressions (PCRE/PCRE2) form [PCRE] | are in Perl Compatible Regular Expressions (PCRE/PCRE2) form [PCRE] | |||
or in JavaScript regular expression syntax [ECMA262]. The state of | or in JavaScript regular expression syntax [ECMA262]. The state of | |||
the art in these regular expression specifications has since advanced | the art in these regular expression specifications has since advanced | |||
and is continually advancing, so the present specification does not | and is continually advancing, so this specification does not attempt | |||
attempt to update the references to a snapshot that is current at the | to update the references. Instead, this tag remains available (as | |||
time of writing. Instead, this tag remains available (as registered | registered in [RFC7049]) for applications that specify the particular | |||
in [RFC7049]) for applications that specify the particular regular | regular expression variant they use out-of-band (possibly by limiting | |||
expression variant they use out-of-band (possibly by limiting the | the usage to a defined common subset of both PCRE and ECMA262). As | |||
usage to a defined common subset of both PCRE and ECMA262). As the | this specification clarifies tag validity beyond [RFC7049], we note | |||
present specification clarifies tag validity beyond [RFC7049], we | that due to the open way the tag was defined in [RFC7049], any | |||
note that due to the open way the tag was defined in [RFC7049], any | ||||
contained string value needs to be valid at the CBOR tag level (but | contained string value needs to be valid at the CBOR tag level (but | |||
may then not be "expected" at the application level). | then may not be "expected" at the application level). | |||
3.4.6. Self-Described CBOR | 3.4.6. Self-Described CBOR | |||
In many applications, it will be clear from the context that CBOR is | In many applications, it will be clear from the context that CBOR is | |||
being employed for encoding a data item. For instance, a specific | being employed for encoding a data item. For instance, a specific | |||
protocol might specify the use of CBOR, or a media type is indicated | protocol might specify the use of CBOR, or a media type is indicated | |||
that specifies its use. However, there may be applications where | that specifies its use. However, there may be applications where | |||
such context information is not available, such as when CBOR data is | such context information is not available, such as when CBOR data is | |||
stored in a file that does not have disambiguating metadata. Here, | stored in a file that does not have disambiguating metadata. Here, | |||
it may help to have some distinguishing characteristics for the data | it may help to have some distinguishing characteristics for the data | |||
skipping to change at page 29, line 51 ¶ | skipping to change at line 1326 ¶ | |||
which will never be found at the beginning of a JSON text. | which will never be found at the beginning of a JSON text. | |||
4. Serialization Considerations | 4. Serialization Considerations | |||
4.1. Preferred Serialization | 4.1. Preferred Serialization | |||
For some values at the data model level, CBOR provides multiple | For some values at the data model level, CBOR provides multiple | |||
serializations. For many applications, it is desirable that an | serializations. For many applications, it is desirable that an | |||
encoder always chooses a preferred serialization (preferred | encoder always chooses a preferred serialization (preferred | |||
encoding); however, the present specification does not put the burden | encoding); however, the present specification does not put the burden | |||
of enforcing this preference on either encoder or decoder. | of enforcing this preference on either the encoder or decoder. | |||
Some constrained decoders may be limited in their ability to decode | Some constrained decoders may be limited in their ability to decode | |||
non-preferred serializations: For example, if only integers below | non-preferred serializations: for example, if only integers below | |||
1_000_000_000 (one billion) are expected in an application, the | 1_000_000_000 (one billion) are expected in an application, the | |||
decoder may leave out the code that would be needed to decode 64-bit | decoder may leave out the code that would be needed to decode 64-bit | |||
arguments in integers. An encoder that always uses preferred | arguments in integers. An encoder that always uses preferred | |||
serialization ("preferred encoder") interoperates with this decoder | serialization ("preferred encoder") interoperates with this decoder | |||
for the numbers that can occur in this application. More generally | for the numbers that can occur in this application. Generally | |||
speaking, it therefore can be said that a preferred encoder is more | speaking, a preferred encoder is more universally interoperable (and | |||
universally interoperable (and also less wasteful) than one that, | also less wasteful) than one that, say, always uses 64-bit integers. | |||
say, always uses 64-bit integers. | ||||
Similarly, a constrained encoder may be limited in the variety of | Similarly, a constrained encoder may be limited in the variety of | |||
representation variants it supports in such a way that it does not | representation variants it supports such that it does not emit | |||
emit preferred serializations ("variant encoder"): Say, it could be | preferred serializations ("variant encoder"). For instance, a | |||
designed to always use the 32-bit variant for an integer that it | constrained encoder could be designed to always use the 32-bit | |||
encodes even if a short representation is available (again, assuming | variant for an integer that it encodes even if a short representation | |||
that there is no application need for integers that can only be | is available (assuming that there is no application need for integers | |||
represented with the 64-bit variant). A decoder that does not rely | that can only be represented with the 64-bit variant). A decoder | |||
on only ever receiving preferred serializations ("variation-tolerant | that does not rely on receiving only preferred serializations | |||
decoder") can therefore be said to be more universally interoperable | ("variation-tolerant decoder") can therefore be said to be more | |||
(it might very well optimize for the case of receiving preferred | universally interoperable (it might very well optimize for the case | |||
serializations, though). Full implementations of CBOR decoders are | of receiving preferred serializations, though). Full implementations | |||
by definition variation-tolerant; the distinction is only relevant if | of CBOR decoders are by definition variation tolerant; the | |||
a constrained implementation of a CBOR decoder meets a variant | distinction is only relevant if a constrained implementation of a | |||
encoder. | CBOR decoder meets a variant encoder. | |||
The preferred serialization always uses the shortest form of | The preferred serialization always uses the shortest form of | |||
representing the argument (Section 3); it also uses the shortest | representing the argument (Section 3); it also uses the shortest | |||
floating-point encoding that preserves the value being encoded. | floating-point encoding that preserves the value being encoded. | |||
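
For integers, "shortest form of representing the argument" can be illustrated with the sketch below. The function names are hypothetical, and the sketch covers only major types 0 and 1 (unsigned and negative integers) within the 64-bit range.

```python
import struct


def _encode_head(major_type: int, argument: int) -> bytes:
    """Encode an initial byte plus the shortest argument that fits."""
    initial = major_type << 5
    if argument < 24:
        return bytes([initial | argument])        # argument in initial byte
    if argument <= 0xFF:
        return bytes([initial | 24, argument])    # additional uint8_t
    if argument <= 0xFFFF:
        return bytes([initial | 25]) + struct.pack(">H", argument)
    if argument <= 0xFFFFFFFF:
        return bytes([initial | 26]) + struct.pack(">I", argument)
    return bytes([initial | 27]) + struct.pack(">Q", argument)


def encode_int(value: int) -> bytes:
    """Preferred serialization of an integer in the range -2**64..2**64-1."""
    if not -(2**64) <= value < 2**64:
        raise ValueError("outside the range of major types 0 and 1")
    if value >= 0:
        return _encode_head(0, value)             # major type 0
    return _encode_head(1, -1 - value)            # major type 1
```

With this approach, one billion (1_000_000_000) is emitted with a 32-bit argument, so a constrained decoder that omits the code for 64-bit arguments can still process it.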
The preferred serialization for a floating-point value is the | The preferred serialization for a floating-point value is the | |||
shortest floating-point encoding that preserves its value, e.g., | shortest floating-point encoding that preserves its value, e.g., | |||
0xf94580 for the number 5.5, and 0xfa45ad9c00 for the number 5555.5. | 0xf94580 for the number 5.5, and 0xfa45ad9c00 for the number 5555.5. | |||
For NaN values, a shorter encoding is preferred if zero-padding the | For NaN values, a shorter encoding is preferred if zero-padding the | |||
shorter significand towards the right reconstitutes the original NaN | shorter significand towards the right reconstitutes the original NaN | |||
value (for many applications, the single NaN encoding 0xf97e00 will | value (for many applications, the single NaN encoding 0xf97e00 will | |||
suffice). | suffice). | |||
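
One way to select the shortest value-preserving floating-point encoding is a round-trip test per width, sketched below. The function name encode_float is hypothetical, and the sketch assumes an application that does not need NaN payloads, so every NaN is emitted as the single encoding 0xf97e00.

```python
import math
import struct


def encode_float(value: float) -> bytes:
    """Emit the shortest of binary16/32/64 that preserves the value."""
    if math.isnan(value):
        # Assumption: NaN payloads are not needed by the application.
        return bytes.fromhex("f97e00")
    for initial_byte, fmt in ((0xF9, ">e"), (0xFA, ">f")):
        try:
            packed = struct.pack(fmt, value)
        except OverflowError:
            continue  # finite value too large for this width
        if struct.unpack(fmt, packed)[0] == value:
            return bytes([initial_byte]) + packed
    return bytes([0xFB]) + struct.pack(">d", value)
```

The round-trip comparison is exact, so a narrower width is chosen only when no precision is lost; positive and negative Infinity end up in the 16-bit form.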
Definite length encoding is preferred whenever the length is known at | Definite-length encoding is preferred whenever the length is known at | |||
the time the serialization of the item starts. | the time the serialization of the item starts. | |||
4.2. Deterministically Encoded CBOR | 4.2. Deterministically Encoded CBOR | |||
Some protocols may want encoders to only emit CBOR in a particular | Some protocols may want encoders to only emit CBOR in a particular | |||
deterministic format; those protocols might also have the decoders | deterministic format; those protocols might also have the decoders | |||
check that their input is in that deterministic format. Those | check that their input is in that deterministic format. Those | |||
protocols are free to define what they mean by a "deterministic | protocols are free to define what they mean by a "deterministic | |||
format" and what encoders and decoders are expected to do. This | format" and what encoders and decoders are expected to do. This | |||
section defines a set of restrictions that can serve as the base of | section defines a set of restrictions that can serve as the base of | |||
skipping to change at page 31, line 38 ¶ | skipping to change at line 1401 ¶ | |||
- 24 to 255 and -25 to -256 MUST be expressed only with an | - 24 to 255 and -25 to -256 MUST be expressed only with an | |||
additional uint8_t; | additional uint8_t; | |||
- 256 to 65535 and -257 to -65536 MUST be expressed only with an | - 256 to 65535 and -257 to -65536 MUST be expressed only with an | |||
additional uint16_t; | additional uint16_t; | |||
- 65536 to 4294967295 and -65537 to -4294967296 MUST be expressed | - 65536 to 4294967295 and -65537 to -4294967296 MUST be expressed | |||
only with an additional uint32_t. | only with an additional uint32_t. | |||
Floating-point values also MUST use the shortest form that | Floating-point values also MUST use the shortest form that | |||
preserves the value, e.g. 1.5 is encoded as 0xf93e00 (binary16) | preserves the value, e.g., 1.5 is encoded as 0xf93e00 (binary16) | |||
and 1000000.5 as 0xfa49742408 (binary32). (One implementation of | and 1000000.5 as 0xfa49742408 (binary32). (One implementation of | |||
this is to have all floats start as a 64-bit float, then do a test | this is to have all floats start as a 64-bit float, then do a test | |||
conversion to a 32-bit float; if the result is the same numeric | conversion to a 32-bit float; if the result is the same numeric | |||
value, use the shorter form and repeat the process with a test | value, use the shorter form and repeat the process with a test | |||
conversion to a 16-bit float. This also works to select 16-bit | conversion to a 16-bit float. This also works to select 16-bit | |||
float for positive and negative Infinity as well.) | float for positive and negative Infinity as well.) | |||
* Indefinite-length items MUST NOT appear. They can be encoded as | * Indefinite-length items MUST NOT appear. They can be encoded as | |||
definite-length items instead. | definite-length items instead. | |||
skipping to change at page 32, line 21 ¶ | skipping to change at line 1432 ¶ | |||
4. "z", encoded as 0x617a. | 4. "z", encoded as 0x617a. | |||
5. "aa", encoded as 0x626161. | 5. "aa", encoded as 0x626161. | |||
6. [100], encoded as 0x811864. | 6. [100], encoded as 0x811864. | |||
7. [-1], encoded as 0x8120. | 7. [-1], encoded as 0x8120. | |||
8. false, encoded as 0xf4. | 8. false, encoded as 0xf4. | |||
(Implementation note: the self-delimiting nature of the CBOR | | Implementation note: the self-delimiting nature of the CBOR | |||
encoding means that there are no two well-formed CBOR encoded data | | encoding means that there are no two well-formed CBOR encoded | |||
items where one is a prefix of the other. The bytewise | | data items where one is a prefix of the other. The bytewise | |||
lexicographic comparison of deterministic encodings of different | | lexicographic comparison of deterministic encodings of | |||
map keys therefore always ends in a position where the byte | | different map keys therefore always ends in a position where | |||
differs between the keys, before the end of a key is reached.) | | the byte differs between the keys, before the end of a key is | |||
| reached. | ||||
4.2.2. Additional Deterministic Encoding Considerations | 4.2.2. Additional Deterministic Encoding Considerations | |||
CBOR tags present additional considerations for deterministic | CBOR tags present additional considerations for deterministic | |||
encoding. If a CBOR-based protocol were to provide the same | encoding. If a CBOR-based protocol were to provide the same | |||
semantics for the presence and absence of a specific tag (e.g., by | semantics for the presence and absence of a specific tag (e.g., by | |||
allowing both tag 1 data items and raw numbers in a date/time | allowing both tag 1 data items and raw numbers in a date/time | |||
position, treating the latter as if they were tagged), the | position, treating the latter as if they were tagged), the | |||
deterministic format would not allow the presence of the tag, based | deterministic format would not allow the presence of the tag, based | |||
on the "shortest form" principle. For example, a protocol might give | on the "shortest form" principle. For example, a protocol might give | |||
encoders the choice of representing a URL as either a text string or, | encoders the choice of representing a URL as either a text string or, | |||
using Section 3.4.5.3, tag number 32 containing a text string. This | using Section 3.4.5.3, tag number 32 containing a text string. This | |||
protocol's deterministic encoding needs to either require that the | protocol's deterministic encoding needs either to require that the | |||
tag is present or require that it is absent, not allow either one. | tag is present or to require that it is absent, not allow either one. | |||
In a protocol that does require tags in certain places to obtain | In a protocol that does require tags in certain places to obtain | |||
specific semantics, the tag needs to appear in the deterministic | specific semantics, the tag needs to appear in the deterministic | |||
format as well. Deterministic encoding considerations also apply to | format as well. Deterministic encoding considerations also apply to | |||
the content of tags. | the content of tags. | |||
If a protocol includes a field that can express integers with an | If a protocol includes a field that can express integers with an | |||
absolute value of 2^64 or larger using tag numbers 2 or 3 | absolute value of 2^(64) or larger using tag numbers 2 or 3 | |||
(Section 3.4.3), the protocol's deterministic encoding needs to | (Section 3.4.3), the protocol's deterministic encoding needs to | |||
specify whether smaller integers are also expressed using these tags | specify whether smaller integers are also expressed using these tags | |||
or using major types 0 and 1. Preferred serialization uses the | or using major types 0 and 1. Preferred serialization uses the | |||
latter choice, which is therefore recommended. | latter choice, which is therefore recommended. | |||
Protocols that include floating-point values, whether represented | Protocols that include floating-point values, whether represented | |||
using basic floating-point values (Section 3.3) or using tags (or | using basic floating-point values (Section 3.3) or using tags (or | |||
both), may need to define extra requirements on their deterministic | both), may need to define extra requirements on their deterministic | |||
encodings, such as: | encodings, such as: | |||
skipping to change at page 33, line 21 ¶ | skipping to change at line 1482 ¶ | |||
and negative zero as distinct values, the application might not | and negative zero as distinct values, the application might not | |||
distinguish these and might decide to represent all zero values | distinguish these and might decide to represent all zero values | |||
with a positive sign, disallowing negative zero. (The application | with a positive sign, disallowing negative zero. (The application | |||
may also want to restrict the precision of floating-point values | may also want to restrict the precision of floating-point values | |||
in such a way that there is never a need to represent 64-bit -- or | in such a way that there is never a need to represent 64-bit -- or | |||
even 32-bit -- floating-point values.) | even 32-bit -- floating-point values.) | |||
* If a protocol includes a field that can express floating-point | * If a protocol includes a field that can express floating-point | |||
values, with a specific data model that declares integer and | values, with a specific data model that declares integer and | |||
floating-point values to be interchangeable, the protocol's | floating-point values to be interchangeable, the protocol's | |||
deterministic encoding needs to specify whether (for example) the | deterministic encoding needs to specify whether, for example, the | |||
integer 1.0 is encoded as 0x01 (unsigned integer), 0xf93c00 | integer 1.0 is encoded as 0x01 (unsigned integer), 0xf93c00 | |||
(binary16), 0xfa3f800000 (binary32), or 0xfb3ff0000000000000 | (binary16), 0xfa3f800000 (binary32), or 0xfb3ff0000000000000 | |||
(binary64). Example rules for this are: | (binary64). Example rules for this are: | |||
1. Encode integral values that fit in 64 bits as values from | 1. Encode integral values that fit in 64 bits as values from | |||
major types 0 and 1, and other values as the preferred | major types 0 and 1, and other values as the preferred | |||
(smallest of 16-, 32-, or 64-bit) floating-point | (smallest of 16-, 32-, or 64-bit) floating-point | |||
representation that accurately represents the value, | representation that accurately represents the value, | |||
2. Encode all values as the preferred floating-point | 2. Encode all values as the preferred floating-point | |||
representation that accurately represents the value, even for | representation that accurately represents the value, even for | |||
integral values, or | integral values, or | |||
3. Encode all values as 64-bit floating-point representations. | 3. Encode all values as 64-bit floating-point representations. | |||
Rule 1 straddles the boundaries between integers and floating- | Rule 1 straddles the boundaries between integers and floating- | |||
point values, and Rule 3 does not use preferred serialization, so | point values, and Rule 3 does not use preferred serialization, so | |||
Rule 2 may be a good choice in many cases. | Rule 2 may be a good choice in many cases. | |||
* If NaN is an allowed value and there is no intent to support NaN | * If NaN is an allowed value, and there is no intent to support NaN | |||
payloads or signaling NaNs, the protocol needs to pick a single | payloads or signaling NaNs, the protocol needs to pick a single | |||
representation, typically 0xf97e00. If that simple choice is not | representation, typically 0xf97e00. If that simple choice is not | |||
possible, specific attention will be needed for NaN handling. | possible, specific attention will be needed for NaN handling. | |||
* Subnormal numbers (nonzero numbers with the lowest possible | * Subnormal numbers (nonzero numbers with the lowest possible | |||
exponent of a given IEEE 754 number format) may be flushed to zero | exponent of a given IEEE 754 number format) may be flushed to zero | |||
outputs or be treated as zero inputs in some floating-point | outputs or be treated as zero inputs in some floating-point | |||
implementations. A protocol's deterministic encoding may want to | implementations. A protocol's deterministic encoding may want to | |||
specifically accommodate such implementations while creating an | specifically accommodate such implementations while creating an | |||
onus on other implementations, by excluding subnormal numbers from | onus on other implementations by excluding subnormal numbers from | |||
interchange, interchanging zero instead. | interchange, interchanging zero instead. | |||
* The same number can be represented by different decimal fractions, | * The same number can be represented by different decimal fractions, | |||
by different bigfloats, and by different forms under other tags | by different bigfloats, and by different forms under other tags | |||
that may be defined to express numeric values. Depending on the | that may be defined to express numeric values. Depending on the | |||
implementation, it may not always be practical to determine | implementation, it may not always be practical to determine | |||
whether any of these forms (or forms in the basic generic data | whether any of these forms (or forms in the basic generic data | |||
model) are equivalent. An application protocol that presents | model) are equivalent. An application protocol that presents | |||
choices of this kind for the representation format of numbers | choices of this kind for the representation format of numbers | |||
needs to be explicit in how the formats are to be chosen for | needs to be explicit about how the formats for deterministic | |||
deterministic encoding. | encoding are to be chosen. | |||
4.2.3. Length-first Map Key Ordering | 4.2.3. Length-First Map Key Ordering | |||
The core deterministic encoding requirements (Section 4.2.1) sort map | The core deterministic encoding requirements (Section 4.2.1) sort map | |||
keys in a different order from the one suggested by Section 3.9 of | keys in a different order from the one suggested by Section 3.9 of | |||
[RFC7049] (called "Canonical CBOR" there). Protocols that need to be | [RFC7049] (called "Canonical CBOR" there). Protocols that need to be | |||
compatible with [RFC7049]'s order can instead be specified in terms | compatible with the order specified in [RFC7049] can instead be | |||
of this specification's "length-first core deterministic encoding | specified in terms of this specification's "length-first core | |||
requirements": | deterministic encoding requirements": | |||
A CBOR encoding satisfies the "length-first core deterministic | A CBOR encoding satisfies the "length-first core deterministic | |||
encoding requirements" if it satisfies the core deterministic | encoding requirements" if it satisfies the core deterministic | |||
encoding requirements except that the keys in every map MUST be | encoding requirements except that the keys in every map MUST be | |||
sorted such that: | sorted such that: | |||
1. If two keys have different lengths, the shorter one sorts | 1. If two keys have different lengths, the shorter one sorts | |||
earlier; | earlier; | |||
2. If two keys have the same length, the one with the lower value in | 2. If two keys have the same length, the one with the lower value in | |||
(byte-wise) lexical order sorts earlier. | (bytewise) lexical order sorts earlier. | |||
For example, under the length-first core deterministic encoding | For example, under the length-first core deterministic encoding | |||
requirements, the following keys are sorted correctly: | requirements, the following keys are sorted correctly: | |||
1. 10, encoded as 0x0a. | 1. 10, encoded as 0x0a. | |||
2. -1, encoded as 0x20. | 2. -1, encoded as 0x20. | |||
3. false, encoded as 0xf4. | 3. false, encoded as 0xf4. | |||
4. 100, encoded as 0x1864. | 4. 100, encoded as 0x1864. | |||
5. "z", encoded as 0x617a. | 5. "z", encoded as 0x617a. | |||
6. [-1], encoded as 0x8120. | 6. [-1], encoded as 0x8120. | |||
7. "aa", encoded as 0x626161. | 7. "aa", encoded as 0x626161. | |||
8. [100], encoded as 0x811864. | 8. [100], encoded as 0x811864. | |||
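
The difference between the two key orderings can be reproduced from the encodings listed above. The sketch below is illustrative only; the function names are hypothetical, and the keys are assumed to be already available in their deterministic encoded form.

```python
# Encoded forms of the example keys 10, -1, false, 100, "z", [-1],
# "aa", and [100], taken from the list above.
EXAMPLE_KEYS = [
    bytes.fromhex(h)
    for h in ("0a", "20", "f4", "1864", "617a", "8120", "626161", "811864")
]


def bytewise_order(encoded_keys):
    """Core deterministic order: bytewise lexicographic comparison."""
    return sorted(encoded_keys)


def length_first_order(encoded_keys):
    """Length-first order: shorter encodings first, then bytewise."""
    return sorted(encoded_keys, key=lambda key: (len(key), key))
```

Python compares bytes objects bytewise, so plain sorted() gives the ordering of Section 4.2.1, while the (length, bytes) sort key gives the length-first ordering.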
(Although [RFC7049] used the term "Canonical CBOR" for its form of | | Although [RFC7049] used the term "Canonical CBOR" for its form | |||
requirements on deterministic encoding, this document avoids this | | of requirements on deterministic encoding, this document avoids | |||
term because "canonicalization" is often associated with specific | | this term because "canonicalization" is often associated with | |||
uses of deterministic encoding only. The terms are essentially | | specific uses of deterministic encoding only. The terms are | |||
interchangeable, however, and the set of core requirements in this | | essentially interchangeable, however, and the set of core | |||
document could also be called "Canonical CBOR", while the length- | | requirements in this document could also be called "Canonical | |||
first-ordered version of that could be called "Old Canonical CBOR".) | | CBOR", while the length-first-ordered version of that could be | |||
| called "Old Canonical CBOR". | ||||
5. Creating CBOR-Based Protocols | 5. Creating CBOR-Based Protocols | |||
Data formats such as CBOR are often used in environments where there | Data formats such as CBOR are often used in environments where there | |||
is no format negotiation. A specific design goal of CBOR is to not | is no format negotiation. A specific design goal of CBOR is to not | |||
need any included or assumed schema: a decoder can take a CBOR item | need any included or assumed schema: a decoder can take a CBOR item | |||
and decode it with no other knowledge. | and decode it with no other knowledge. | |||
Of course, in real-world implementations, the encoder and the decoder | Of course, in real-world implementations, the encoder and the decoder | |||
will have a shared view of what should be in a CBOR data item. For | will have a shared view of what should be in a CBOR data item. For | |||
skipping to change at page 35, line 41 ¶ | skipping to change at line 1601 ¶ | |||
based protocols MUST produce only valid items, that is, the protocol | based protocols MUST produce only valid items, that is, the protocol | |||
cannot be designed to make use of invalid items. An encoder can be | cannot be designed to make use of invalid items. An encoder can be | |||
capable of encoding as many or as few types of values as is required | capable of encoding as many or as few types of values as is required | |||
by the protocol in which it is used; a decoder can be capable of | by the protocol in which it is used; a decoder can be capable of | |||
understanding as many or as few types of values as is required by the | understanding as many or as few types of values as is required by the | |||
protocols in which it is used. This lack of restrictions allows CBOR | protocols in which it is used. This lack of restrictions allows CBOR | |||
to be used in extremely constrained environments. | to be used in extremely constrained environments. | |||
The rest of this section discusses some considerations in creating | The rest of this section discusses some considerations in creating | |||
CBOR-based protocols. With few exceptions, it is advisory only and | CBOR-based protocols. With few exceptions, it is advisory only and | |||
explicitly excludes any language from BCP 14 other than words that | explicitly excludes any language from BCP 14 [RFC2119] [RFC8174] | |||
could be interpreted as "MAY" in the sense of BCP 14. The exceptions | other than words that could be interpreted as "MAY" in the sense of | |||
aim at facilitating interoperability of CBOR-based protocols while | BCP 14. The exceptions aim at facilitating interoperability of CBOR- | |||
making use of a wide variety of both generic and application-specific | based protocols while making use of a wide variety of both generic | |||
encoders and decoders. | and application-specific encoders and decoders. | |||
5.1. CBOR in Streaming Applications | 5.1. CBOR in Streaming Applications | |||
In a streaming application, a data stream may be composed of a | In a streaming application, a data stream may be composed of a | |||
sequence of CBOR data items concatenated back-to-back. In such an | sequence of CBOR data items concatenated back-to-back. In such an | |||
environment, the decoder immediately begins decoding a new data item | environment, the decoder immediately begins decoding a new data item | |||
if data is found after the end of a previous data item. | if data is found after the end of a previous data item. | |||
Not all of the bytes making up a data item may be immediately | Not all of the bytes making up a data item may be immediately | |||
available to the decoder; some decoders will buffer additional data | available to the decoder; some decoders will buffer additional data | |||
skipping to change at page 36, line 39 ¶ | skipping to change at line 1648 ¶ | |||
Generic CBOR encoders provide an application interface that allows | Generic CBOR encoders provide an application interface that allows | |||
the application to specify any well-formed value to be encoded as a | the application to specify any well-formed value to be encoded as a | |||
CBOR data item, including simple values and tags unknown to the | CBOR data item, including simple values and tags unknown to the | |||
encoder. | encoder. | |||
Even though CBOR attempts to minimize these cases, not all well- | Even though CBOR attempts to minimize these cases, not all well- | |||
formed CBOR data is valid: for example, the encoded text string | formed CBOR data is valid: for example, the encoded text string | |||
"0x62c0ae" does not contain valid UTF-8 (because [RFC3629] requires | "0x62c0ae" does not contain valid UTF-8 (because [RFC3629] requires | |||
always using the shortest form) and so is not a valid CBOR item. | always using the shortest form) and so is not a valid CBOR item. | |||
Also, specific tags may make semantic constraints that may be | Also, specific tags may make semantic constraints that may be | |||
violated, for instance by a bignum tag enclosing another tag, or by | violated, for instance, by a bignum tag enclosing another tag or by | |||
an instance of tag number 0 containing a byte string, or containing a | an instance of tag number 0 containing a byte string or containing a | |||
text string with contents that do not match [RFC3339]'s "date-time" | text string with contents that do not match the "date-time" | |||
production. There is no requirement that generic encoders and | production of [RFC3339]. There is no requirement that generic | |||
decoders make unnatural choices for their application interface to | encoders and decoders make unnatural choices for their application | |||
enable the processing of invalid data. Generic encoders and decoders | interface to enable the processing of invalid data. Generic encoders | |||
are expected to forward simple values and tags even if their specific | and decoders are expected to forward simple values and tags even if | |||
codepoints are not registered at the time the encoder/decoder is | their specific codepoints are not registered at the time the encoder/ | |||
written (Section 5.4). | decoder is written (Section 5.4). | |||
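The difference between well-formed and valid can be illustrated with the 0x62c0ae example from the paragraph above. The following is a minimal sketch, not part of either document version; the helper name is_valid_utf8 is hypothetical, and the check relies on Python's strict UTF-8 decoder, which rejects non-shortest forms.

```python
# The example item 0x62c0ae: initial byte 0x62, then two content bytes.
item = bytes.fromhex("62c0ae")

major_type = item[0] >> 5        # high 3 bits: 3 means text string
length = item[0] & 0x1F          # low 5 bits: 2 content bytes follow
payload = item[1:1 + length]     # b"\xc0\xae"


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")     # strict mode rejects overlong forms
        return True
    except UnicodeDecodeError:
        return False


# The item is well-formed (the head and the length are consistent) ...
well_formed = major_type == 3 and len(payload) == length
# ... but not valid, because 0xc0 0xae is an overlong encoding of ".".
valid = well_formed and is_valid_utf8(payload)
```

Here well_formed is true while valid is false, which is the situation Section 5.3 asks protocols to address.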
5.3. Validity of Items | 5.3. Validity of Items | |||
A well-formed but invalid CBOR data item (Section 1.2) presents a | A well-formed but invalid CBOR data item (Section 1.2) presents a | |||
problem with interpreting the data encoded in it in the CBOR data | problem with interpreting the data encoded in it in the CBOR data | |||
model. A CBOR-based protocol could be specified in several layers, | model. A CBOR-based protocol could be specified in several layers, | |||
in which the lower layers don't process the semantics of some of the | in which the lower layers don't process the semantics of some of the | |||
CBOR data they forward. These layers can't notice any validity | CBOR data they forward. These layers can't notice any validity | |||
errors in data they don't process and MUST forward that data as-is. | errors in data they don't process and MUST forward that data as-is. | |||
The first layer that does process the semantics of an invalid CBOR | The first layer that does process the semantics of an invalid CBOR | |||
item MUST take one of two choices: | item MUST pick one of two choices: | |||
1. Replace the problematic item with an error marker and continue | 1. Replace the problematic item with an error marker and continue | |||
with the next item, or | with the next item, or | |||
2. Issue an error and stop processing altogether. | 2. Issue an error and stop processing altogether. | |||
A CBOR-based protocol MUST specify which of these options its | A CBOR-based protocol MUST specify which of these options its | |||
decoders take, for each kind of invalid item they might encounter. | decoders take for each kind of invalid item they might encounter. | |||
Such problems might occur at the basic validity level of CBOR or in | Such problems might occur at the basic validity level of CBOR or in | |||
the context of tags (tag validity). | the context of tags (tag validity). | |||
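The two choices listed above might be sketched as a per-protocol policy. This is an illustration under stated assumptions only: the names Policy, ErrorMarker, ValidityError, check_text, and process are hypothetical, and the only validity check shown is UTF-8 validation of text string content.

```python
from enum import Enum


class Policy(Enum):
    REPLACE_WITH_MARKER = 1   # choice 1: substitute an error marker, continue
    STOP = 2                  # choice 2: issue an error and stop


class ErrorMarker:
    """Stands in for an item that failed a validity check."""

    def __init__(self, reason: str):
        self.reason = reason


class ValidityError(Exception):
    """Raised when the policy is to stop processing."""


def check_text(payload: bytes):
    """Return None if payload is valid UTF-8, else a short description."""
    try:
        payload.decode("utf-8")
        return None
    except UnicodeDecodeError as exc:
        return f"invalid UTF-8: {exc.reason}"


def process(payloads, policy: Policy):
    """Apply the chosen policy to a sequence of text string payloads."""
    out = []
    for payload in payloads:
        problem = check_text(payload)
        if problem is None:
            out.append(payload.decode("utf-8"))
        elif policy is Policy.REPLACE_WITH_MARKER:
            out.append(ErrorMarker(problem))
        else:
            raise ValidityError(problem)
    return out
```

With Policy.REPLACE_WITH_MARKER, items after the invalid one are still delivered; with Policy.STOP, the first invalid item ends processing.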
5.3.1. Basic validity | 5.3.1. Basic validity | |||
Two kinds of validity errors can occur in the basic generic data | Two kinds of validity errors can occur in the basic generic data | |||
model: | model: | |||
Duplicate keys in a map: Generic decoders (Section 5.2) make data | Duplicate keys in a map: Generic decoders (Section 5.2) make data | |||
skipping to change at page 38, line 5 ¶ | skipping to change at line 1706 ¶ | |||
that the sequence of bytes in a UTF-8 string (major type 3) is | that the sequence of bytes in a UTF-8 string (major type 3) is | |||
actually valid UTF-8 and react appropriately. | actually valid UTF-8 and react appropriately. | |||
5.3.2. Tag validity | 5.3.2. Tag validity | |||
Two additional kinds of validity errors are introduced by adding tags | Two additional kinds of validity errors are introduced by adding tags | |||
to the basic generic data model: | to the basic generic data model: | |||
Inadmissible type for tag content: Tag numbers (Section 3.4) specify | Inadmissible type for tag content: Tag numbers (Section 3.4) specify | |||
what type of data item is supposed to be used as their tag | what type of data item is supposed to be used as their tag | |||
content; for example, the tag numbers for positive or negative | content; for example, the tag numbers for unsigned or negative | |||
bignums are supposed to be put on byte strings. A decoder that | bignums are supposed to be put on byte strings. A decoder that | |||
decodes the tagged data item into a native representation (a | decodes the tagged data item into a native representation (a | |||
native big integer in this example) is expected to check the type | native big integer in this example) is expected to check the type | |||
of the data item being tagged. Even decoders that don't have such | of the data item being tagged. Even decoders that don't have such | |||
native representations available in their environment may perform | native representations available in their environment may perform | |||
the check on those tags known to them and react appropriately. | the check on those tags known to them and react appropriately. | |||
Inadmissible value for tag content: The type of data item may be | Inadmissible value for tag content: The type of data item may be | |||
admissible for a tag's content, but the specific value may not be; | admissible for a tag's content, but the specific value may not be; | |||
e.g., a value of "yesterday" is not acceptable for the content of | e.g., a value of "yesterday" is not acceptable for the content of | |||
skipping to change at page 38, line 29 ¶ | skipping to change at line 1730 ¶ | |||
would present a tag with an unknown tag number (Section 5.4). | would present a tag with an unknown tag number (Section 5.4). | |||
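The type check on tag content described for bignums could look like the following sketch. It assumes the tag number and its already decoded content are handed over separately; the function name decode_bignum is hypothetical. The value computation follows the bignum definition (tag 2 is n, tag 3 is -1 - n).

```python
def decode_bignum(tag_number: int, content) -> int:
    """Convert the content of tag 2 or tag 3 into a native integer.

    Raises TypeError when the tag content has an inadmissible type.
    """
    if tag_number not in (2, 3):
        raise ValueError("not a bignum tag number")
    # Inadmissible type for tag content: bignums go on byte strings only.
    if not isinstance(content, (bytes, bytearray)):
        raise TypeError("bignum tag content must be a byte string")
    n = int.from_bytes(content, "big")
    return n if tag_number == 2 else -1 - n
```

For instance, the nine-byte content 0x010000000000000000 yields 2^(64) under tag 2 and -2^(64)-1 under tag 3, while a text string as content is rejected.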
5.4. Validity and Evolution | 5.4. Validity and Evolution | |||
A decoder with validity checking will expend the effort to reliably | A decoder with validity checking will expend the effort to reliably | |||
detect data items with validity errors. For example, such a decoder | detect data items with validity errors. For example, such a decoder | |||
needs to have an API that reports an error (and does not return data) | needs to have an API that reports an error (and does not return data) | |||
for a CBOR data item that contains any of the validity errors listed | for a CBOR data item that contains any of the validity errors listed | |||
in the previous subsection. | in the previous subsection. | |||
The set of tags defined in the tag registry (Section 9.2), as well as | The set of tags defined in the "Concise Binary Object Representation | |||
the set of simple values defined in the simple values registry | (CBOR) Tags" registry (Section 9.2), as well as the set of simple | |||
(Section 9.1), can grow at any time beyond the set understood by a | values defined in the "Concise Binary Object Representation (CBOR) | |||
generic decoder. A validity-checking decoder can do one of two | Simple Values" registry (Section 9.1), can grow at any time beyond | |||
things when it encounters such a case that it does not recognize: | the set understood by a generic decoder. A validity-checking decoder | |||
can do one of two things when it encounters such a case that it does | ||||
not recognize: | ||||
* It can report an error (and not return data). Note that treating | * It can report an error (and not return data). Note that treating | |||
this case as an error can cause ossification, and is thus not | this case as an error can cause ossification and is thus not | |||
encouraged. This error is not a validity error per se. This kind | encouraged. This error is not a validity error, per se. This | |||
of error is more likely to be raised by a decoder that would be | kind of error is more likely to be raised by a decoder that would | |||
performing validity checking if this were a known case. | be performing validity checking if this were a known case. | |||
* It can emit the unknown item (type, value, and, for tags, the | * It can emit the unknown item (type, value, and, for tags, the | |||
decoded tagged data item) to the application calling the decoder, | decoded tagged data item) to the application calling the decoder, | |||
with an indication that the decoder did not recognize that tag | and then give the application an indication that the decoder did | |||
number or simple value. | not recognize that tag number or simple value. | |||
The latter approach, which is also appropriate for decoders that do | The latter approach, which is also appropriate for decoders that do | |||
not support validity checking, provides forward compatibility with | not support validity checking, provides forward compatibility with | |||
newly registered tags and simple values without the requirement to | newly registered tags and simple values without the requirement to | |||
update the encoder at the same time as the calling application. (For | update the encoder at the same time as the calling application. (For | |||
this, the API for the decoder needs to have a way to mark unknown | this, the decoder's API needs the ability to mark unknown items so | |||
items so that the calling application can handle them in a manner | that the calling application can handle them in a manner appropriate | |||
appropriate for the program.) | for the program.) | |||
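One possible shape for an API that marks unknown items is sketched below. All names (UnknownTag, KNOWN_TAGS, deliver_tag, the strict parameter) are assumptions made for illustration; no particular CBOR library is implied.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class UnknownTag:
    """Marks a tag whose number the decoder did not recognize."""
    number: int
    content: Any


# Tags this hypothetical decoder knows how to process.
KNOWN_TAGS: Dict[int, Callable[[Any], Any]] = {
    2: lambda b: int.from_bytes(b, "big"),
    3: lambda b: -1 - int.from_bytes(b, "big"),
}


def deliver_tag(number: int, content: Any, strict: bool = False) -> Any:
    """Process a known tag, or pass an unknown one on, marked as such."""
    handler = KNOWN_TAGS.get(number)
    if handler is not None:
        return handler(content)
    if strict:
        # First option: report an error (can cause ossification).
        raise ValueError(f"unrecognized tag number {number}")
    # Second option: emit the item with an indication that it is unknown.
    return UnknownTag(number, content)
```

The application can then test for UnknownTag and decide for itself, which keeps the decoder usable after new tag numbers are registered.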
Since some of the processing needed for validity checking may have an | Since some of the processing needed for validity checking may have an | |||
appreciable cost (in particular with duplicate detection for maps), | appreciable cost (in particular with duplicate detection for maps), | |||
support of validity checking is not a requirement placed on all CBOR | support of validity checking is not a requirement placed on all CBOR | |||
decoders. | decoders. | |||
Some encoders will rely on their applications to provide input data | Some encoders will rely on their applications to provide input data | |||
in such a way that valid CBOR results from the encoder. A generic | in such a way that valid CBOR results from the encoder. A generic | |||
encoder may also want to provide a validity-checking mode where it | encoder may also want to provide a validity-checking mode where it | |||
reliably limits its output to valid CBOR, independent of whether or | reliably limits its output to valid CBOR, independent of whether or | |||
not its application is indeed providing API-conformant data. | not its application is indeed providing API-conformant data. | |||
5.5. Numbers | 5.5. Numbers | |||
CBOR-based protocols should take into account that different language | CBOR-based protocols should take into account that different language | |||
environments pose different restrictions on the range and precision | environments pose different restrictions on the range and precision | |||
of numbers that are representable. For example, the basic JavaScript | of numbers that are representable. For example, the basic JavaScript | |||
number system treats all numbers as floating-point values, which may | number system treats all numbers as floating-point values, which may | |||
result in silent loss of precision in decoding integers with more | result in the silent loss of precision in decoding integers with more | |||
than 53 significant bits. Another example is that, since CBOR keeps | than 53 significant bits. Another example is that, since CBOR keeps | |||
the sign bit for its integer representation in the major type, it has | the sign bit for its integer representation in the major type, it has | |||
one bit more for signed numbers of a certain length (e.g., | one bit more for signed numbers of a certain length (e.g., | |||
-2**64..2**64-1 for 1+8-byte integers) than the typical platform | -2^(64)..2^(64)-1 for 1+8-byte integers) than the typical platform | |||
signed integer representation of the same length (-2**63..2**63-1 for | signed integer representation of the same length (-2^(63)..2^(63)-1 | |||
8-byte int64_t). A protocol that uses numbers should define its | for 8-byte int64_t). A protocol that uses numbers should define its | |||
expectations on the handling of non-trivial numbers in decoders and | expectations on the handling of nontrivial numbers in decoders and | |||
receiving applications. | receiving applications. | |||
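Both range effects mentioned above can be reproduced with a few lines of arithmetic. This is a sketch only; Python floats are used as a stand-in for a binary64-only number system, and fits_int64 is a hypothetical helper name.

```python
# Integers with more than 53 significant bits do not survive binary64.
big = 2**53 + 1
as_double = float(big)                   # rounds to a representable value
lost_precision = int(as_double) != big   # the low bit was silently dropped

# Range of 1+8-byte CBOR integers versus a platform int64_t.
CBOR_MIN, CBOR_MAX = -2**64, 2**64 - 1
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1


def fits_int64(n: int) -> bool:
    """True if n can be stored in a signed 64-bit integer."""
    return INT64_MIN <= n <= INT64_MAX
```

A receiving application limited to int64_t would need a defined reaction (error, bignum fallback, or similar) for values where fits_int64 is false.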
A CBOR-based protocol that includes floating-point numbers can | A CBOR-based protocol that includes floating-point numbers can | |||
restrict which of the three formats (half-precision, single- | restrict which of the three formats (half-precision, single- | |||
precision, and double-precision) are to be supported. For an | precision, and double-precision) are to be supported. For an | |||
integer-only application, a protocol may want to completely exclude | integer-only application, a protocol may want to completely exclude | |||
the use of floating-point values. | the use of floating-point values. | |||
A CBOR-based protocol designed for compactness may want to exclude | A CBOR-based protocol designed for compactness may want to exclude | |||
specific integer encodings that are longer than necessary for the | specific integer encodings that are longer than necessary for the | |||
skipping to change at page 40, line 12 ¶ | skipping to change at line 1805 ¶ | |||
compact application that does not require deterministic encoding | compact application that does not require deterministic encoding | |||
should accept values that use a longer-than-needed encoding (such as | should accept values that use a longer-than-needed encoding (such as | |||
encoding "0" as 0b000_11001 followed by two bytes of 0x00) as long as | encoding "0" as 0b000_11001 followed by two bytes of 0x00) as long as | |||
the application can decode an integer of the given size. Similar | the application can decode an integer of the given size. Similar | |||
considerations apply to floating-point values; decoding both | considerations apply to floating-point values; decoding both | |||
preferred serializations and longer-than-needed ones is recommended. | preferred serializations and longer-than-needed ones is recommended. | |||
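A decoder that accepts longer-than-needed integer encodings while still being able to report them might be sketched as follows. The function name decode_uint and its three-part return value are assumptions for illustration; only major type 0 is handled.

```python
def decode_uint(data: bytes):
    """Decode an unsigned integer (major type 0) from the start of data.

    Returns (value, bytes_consumed, is_preferred), where is_preferred is
    False for a longer-than-needed encoding.
    """
    initial = data[0]
    major_type, info = initial >> 5, initial & 0x1F
    if major_type != 0:
        raise ValueError("not an unsigned integer")
    if info < 24:
        return info, 1, True             # value is in the initial byte
    if info > 27:
        raise ValueError("not a valid integer encoding")
    size = 1 << (info - 24)              # 1, 2, 4, or 8 argument bytes
    if len(data) < 1 + size:
        raise ValueError("truncated item")
    value = int.from_bytes(data[1:1 + size], "big")
    # Smallest value that actually needs this many argument bytes.
    minimum = {1: 24, 2: 2**8, 4: 2**16, 8: 2**32}[size]
    return value, 1 + size, value >= minimum
```

Applied to 0b000_11001 followed by two bytes of 0x00, this returns the value 0, three bytes consumed, and a preferred flag of False; a protocol requiring deterministic encoding could reject on that flag, while other applications simply use the value.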
CBOR-based protocols for constrained applications that provide a | CBOR-based protocols for constrained applications that provide a | |||
choice between representing a specific number as an integer and as a | choice between representing a specific number as an integer and as a | |||
decimal fraction or bigfloat (such as when the exponent is small and | decimal fraction or bigfloat (such as when the exponent is small and | |||
non-negative), might express a quality-of-implementation expectation | nonnegative) might express a quality-of-implementation expectation | |||
that the integer representation is used directly. | that the integer representation is used directly. | |||
5.6. Specifying Keys for Maps | 5.6. Specifying Keys for Maps | |||
The encoding and decoding applications need to agree on what types of | The encoding and decoding applications need to agree on what types of | |||
keys are going to be used in maps. In applications that need to | keys are going to be used in maps. In applications that need to | |||
interwork with JSON-based applications, conversion is simplified by | interwork with JSON-based applications, conversion is simplified by | |||
limiting keys to text strings only; otherwise, there has to be a | limiting keys to text strings only; otherwise, there has to be a | |||
specified mapping from the other CBOR types to text strings, and this | specified mapping from the other CBOR types to text strings, and this | |||
often leads to implementation errors. In applications where keys are | often leads to implementation errors. In applications where keys are | |||
numeric in nature and numeric ordering of keys is important to the | numeric in nature, and numeric ordering of keys is important to the | |||
application, directly using the numbers for the keys is useful. | application, directly using the numbers for the keys is useful. | |||
If multiple types of keys are to be used, consideration should be | If multiple types of keys are to be used, consideration should be | |||
given to how these types would be represented in the specific | given to how these types would be represented in the specific | |||
programming environments that are to be used. For example, in | programming environments that are to be used. For example, in | |||
JavaScript Maps [ECMA262], a key of integer 1 cannot be distinguished | JavaScript Maps [ECMA262], a key of integer 1 cannot be distinguished | |||
from a key of floating-point 1.0. This means that, if integer keys | from a key of floating-point 1.0. This means that, if integer keys | |||
are used, the protocol needs to avoid use of floating-point keys the | are used, the protocol needs to avoid the use of floating-point keys | |||
values of which happen to be integer numbers in the same map. | the values of which happen to be integer numbers in the same map. | |||
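The same hazard exists in other environments. As an illustration that is not taken from the text above, a Python dict also treats integer 1 and floating-point 1.0 as one key. The helper model_key is a hypothetical workaround that keeps the two apart by pairing each key with its type name.

```python
# Map entries as a decoder might deliver them, in order.
pairs = [(1, "integer key"), (1.0, "floating-point key")]

# 1 == 1.0 and hash(1) == hash(1.0), so the second entry replaces the first.
native = dict(pairs)


def model_key(key):
    """Hypothetical wrapper: distinguish keys by type as well as value."""
    return (type(key).__name__, key)


# Both entries survive when the type is part of the key.
distinct = {model_key(k): v for k, v in pairs}
```

After this runs, native holds a single entry and distinct holds two, so a protocol mixing both key types in one map would lose data in the plain dict.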
Decoders that deliver data items nested within a CBOR data item | Decoders that deliver data items nested within a CBOR data item | |||
immediately on decoding them ("streaming decoders") often do not keep | immediately on decoding them ("streaming decoders") often do not keep | |||
the state that is necessary to ascertain uniqueness of a key in a | the state that is necessary to ascertain uniqueness of a key in a | |||
map. Similarly, an encoder that can start encoding data items before | map. Similarly, an encoder that can start encoding data items before | |||
the enclosing data item is completely available ("streaming encoder") | the enclosing data item is completely available ("streaming encoder") | |||
may want to reduce its overhead significantly by relying on its data | may want to reduce its overhead significantly by relying on its data | |||
source to maintain uniqueness. | source to maintain uniqueness. | |||
A CBOR-based protocol MUST define what to do when a receiving | A CBOR-based protocol MUST define what to do when a receiving | |||
application does see multiple identical keys in a map. The resulting | application sees multiple identical keys in a map. The resulting | |||
rule in the protocol MUST respect the CBOR data model: it cannot | rule in the protocol MUST respect the CBOR data model: it cannot | |||
prescribe a specific handling of the entries with the identical keys, | prescribe a specific handling of the entries with the identical keys, | |||
except that it might have a rule that having identical keys in a map | except that it might have a rule that having identical keys in a map | |||
indicates a malformed map and that the decoder has to stop with an | indicates a malformed map and that the decoder has to stop with an | |||
error. When processing maps that exhibit entries with duplicate | error. When processing maps that exhibit entries with duplicate | |||
keys, a generic decoder might do one of the following: | keys, a generic decoder might do one of the following: | |||
* Not accept maps with duplicate keys (that is, enforce validity for | * Not accept maps with duplicate keys (that is, enforce validity for | |||
maps, see also Section 5.4). These generic decoders are | maps, see also Section 5.4). These generic decoders are | |||
universally useful. An application may still need to do perform | universally useful. An application may still need to perform its | |||
its own duplicate checking based on application rules (for | own duplicate checking based on application rules (for instance, | |||
instance if the application equates integers and floating-point | if the application equates integers and floating-point values in | |||
values in map key positions for specific maps). | map key positions for specific maps). | |||
* Pass all map entries to the application, including ones with | * Pass all map entries to the application, including ones with | |||
duplicate keys. This requires the application to handle (check | duplicate keys. This requires that the application handle (check | |||
against) duplicate keys, even if the application rules are | against) duplicate keys, even if the application rules are | |||
identical to the generic data model rules. | identical to the generic data model rules. | |||
* Lose some entries with duplicate keys, e.g. by only delivering the | * Lose some entries with duplicate keys, e.g., deliver only the | |||
final (or first) entry out of the entries with the same key. With | final (or first) entry out of the entries with the same key. With | |||
such a generic decoder, applications may get different results for | such a generic decoder, applications may get different results for | |||
a specific key on different runs and with different generic | a specific key on different runs, and with different generic | |||
decoders as which value is returned is based on generic decoder | decoders, which value is returned is based on generic decoder | |||
implementation and the actual order of keys in the map. In | implementation and the actual order of keys in the map. In | |||
particular, applications cannot validate key uniqueness on their | particular, applications cannot validate key uniqueness on their | |||
own as they do not necessarily see all entries; they may not be | own as they do not necessarily see all entries; they may not be | |||
able to use such a generic decoder if they do need to validate key | able to use such a generic decoder if they need to validate key | |||
uniqueness. These generic decoders can only be used in situations | uniqueness. These generic decoders can only be used in situations | |||
where the data source and transfer can be relied upon to always | where the data source and transfer always provide valid maps; this | |||
provide valid maps; this is not possible if the data source and | is not possible if the data source and transfer can be attacked. | |||
transfer can be attacked. | ||||
Generic decoders need to document which of these three approaches | Generic decoders need to document which of these three approaches | |||
they implement. | they implement. | |||
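The three approaches above can be sketched as selectable behaviors of a map builder. The function name build_map and the policy strings are hypothetical. Note one stated simplification: key comparison uses Python equality, which is not identical to the generic data model (for example, Python equates 1 and 1.0).

```python
def build_map(pairs, policy="reject"):
    """Turn decoded (key, value) pairs into an application-facing map.

    policy: "reject"     - do not accept maps with duplicate keys
            "pass_all"   - pass every entry on, duplicates included
            "keep_first" - lose later entries with a repeated key
            "keep_last"  - lose earlier entries with a repeated key
    """
    if policy == "pass_all":
        return list(pairs)               # the application must check
    if policy not in ("reject", "keep_first", "keep_last"):
        raise ValueError(f"unknown policy {policy!r}")
    result = {}
    for key, value in pairs:
        if key in result:
            if policy == "reject":
                raise ValueError(f"duplicate map key {key!r}")
            if policy == "keep_first":
                continue                 # silently drop this entry
        result[key] = value
    return result
```

With "keep_first" or "keep_last", the application never sees that a duplicate existed, which is why those decoders are only suitable when the data source and transfer can be trusted.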
The CBOR data model for maps does not allow ascribing semantics to | The CBOR data model for maps does not allow ascribing semantics to | |||
the order of the key/value pairs in the map representation. Thus, a | the order of the key/value pairs in the map representation. Thus, a | |||
CBOR-based protocol MUST NOT specify that changing the key/value pair | CBOR-based protocol MUST NOT specify that changing the key/value pair | |||
order in a map would change the semantics, except to specify that | order in a map changes the semantics, except to specify that some | |||
some orders are disallowed, for example where they would not meet the | orders are disallowed, for example, where they would not meet the | |||
requirements of a deterministic encoding (Section 4.2). (Any | requirements of a deterministic encoding (Section 4.2). (Any | |||
secondary effects of map ordering such as on timing, cache usage, and | secondary effects of map ordering such as on timing, cache usage, and | |||
other potential side channels are not considered part of the | other potential side channels are not considered part of the | |||
semantics but may be enough reason on their own for a protocol to | semantics but may be enough reason on their own for a protocol to | |||
require a deterministic encoding format.) | require a deterministic encoding format.) | |||
Applications for constrained devices that have maps where a small | Applications for constrained devices should consider using small | |||
number of frequently used keys can be identified should consider | integers as keys if they have maps with a small number of frequently | |||
using small integers as keys; for instance, a set of 24 or fewer | used keys; for instance, a set of 24 or fewer keys can be encoded in | |||
frequent keys can be encoded in a single byte as unsigned integers, | a single byte as unsigned integers, up to 48 if negative integers are | |||
up to 48 if negative integers are also used. Less frequently | also used. Less frequently occurring keys can then use integers with | |||
occurring keys can then use integers with longer encodings. | longer encodings. | |||
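The key counts above follow from the integer encoding: arguments 0 to 23 fit in the initial byte for both major type 0 and major type 1. A minimal encoder sketch, with the hypothetical name encode_int and always choosing the shortest form, makes this visible.

```python
def encode_int(n: int) -> bytes:
    """Encode an integer in its shortest CBOR form (major type 0 or 1)."""
    if n >= 0:
        major, arg = 0, n
    else:
        major, arg = 1, -1 - n           # negative integers store -1 - n
    if arg < 24:
        return bytes([(major << 5) | arg])   # single-byte encoding
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < 1 << (8 * size):
            head = bytes([(major << 5) | info])
            return head + arg.to_bytes(size, "big")
    raise OverflowError("integer needs a bignum tag")


# Keys -24..23 are the 48 integers that encode in a single byte.
single_byte_keys = [k for k in range(-30, 30) if len(encode_int(k)) == 1]
```

The list single_byte_keys contains exactly the 48 values from -24 to 23; keys outside that range cost at least two bytes each.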
5.6.1. Equivalence of Keys | 5.6.1. Equivalence of Keys | |||
The specific data model applying to a CBOR data item is used to | The specific data model that applies to a CBOR data item is used to | |||
determine whether keys occurring in maps are duplicates or distinct. | determine whether keys occurring in maps are duplicates or distinct. | |||
At the generic data model level, numerically equivalent integer and | At the generic data model level, numerically equivalent integer and | |||
floating-point values are distinct from each other, as they are from | floating-point values are distinct from each other, as they are from | |||
the various big numbers (Tags 2 to 5). Similarly, text strings are | the various big numbers (Tags 2 to 5). Similarly, text strings are | |||
distinct from byte strings, even if composed of the same bytes. A | distinct from byte strings, even if composed of the same bytes. A | |||
tagged value is distinct from an untagged value or from a value | tagged value is distinct from an untagged value or from a value | |||
tagged with a different tag number. | tagged with a different tag number. | |||
Within each of these groups, numeric values are distinct unless they | Within each of these groups, numeric values are distinct unless they | |||
are numerically equal (specifically, -0.0 is equal to 0.0); for the | are numerically equal (specifically, -0.0 is equal to 0.0); for the | |||
purpose of map key equivalence, NaN (not a number) values are | purpose of map key equivalence, NaN values are equivalent if they | |||
equivalent if they have the same significand after zero-extending | have the same significand after zero-extending both significands at | |||
both significands at the right to 64 bits. | the right to 64 bits. | |||
(Byte and text) strings are compared byte by byte, arrays element by | Both byte strings and text strings are compared byte by byte, arrays | |||
element, and are equal if they have the same number of bytes/elements | are compared element by element, and are equal if they have the same | |||
and the same values at the same positions. Two maps are equal if | number of bytes/elements and the same values at the same positions. | |||
they have the same set of pairs regardless of their order; pairs are | Two maps are equal if they have the same set of pairs regardless of | |||
equal if both the key and value are equal. | their order; pairs are equal if both the key and value are equal. | |||
Tagged values are equal if both the tag number and the tag content | Tagged values are equal if both the tag number and the tag content | |||
are equal. (Note that a generic decoder that provides processing for | are equal. (Note that a generic decoder that provides processing for | |||
a specific tag may not be able to distinguish some semantically | a specific tag may not be able to distinguish some semantically | |||
equivalent values, e.g. if leading zeroes occur in the content of tag | equivalent values, e.g., if leading zeroes occur in the content of | |||
2/3 (Section 3.4.3).) Simple values are equal if they simply have | tag 2 or tag 3 (Section 3.4.3).) Simple values are equal if they | |||
the same value. Nothing else is equal in the generic data model; a | simply have the same value. Nothing else is equal in the generic | |||
simple value 2 is not equivalent to an integer 2 and an array is | data model; a simple value 2 is not equivalent to an integer 2, and | |||
never equivalent to a map. | an array is never equivalent to a map. | |||
As discussed in Section 2.2, specific data models can make values | As discussed in Section 2.2, specific data models can make values | |||
equivalent for the purpose of comparing map keys that are distinct in | equivalent for the purpose of comparing map keys that are distinct in | |||
the generic data model. Note that this implies that a generic | the generic data model. Note that this implies that a generic | |||
decoder may deliver a decoded map to an application that needs to be | decoder may deliver a decoded map to an application that needs to be | |||
checked for duplicate map keys by that application (alternatively, | checked for duplicate map keys by that application (alternatively, | |||
the decoder may provide a programming interface to perform this | the decoder may provide a programming interface to perform this | |||
service for the application). Specific data models are not able to | service for the application). Specific data models are not able to | |||
distinguish values for map keys that are equal for this purpose at | distinguish values for map keys that are equal for this purpose at | |||
the generic data model level. | the generic data model level. | |||
5.7. Undefined Values | 5.7. Undefined Values | |||
In some CBOR-based protocols, the simple value (Section 3.3) of | In some CBOR-based protocols, the simple value (Section 3.3) of | |||
Undefined might be used by an encoder as a substitute for a data item | "undefined" might be used by an encoder as a substitute for a data | |||
with an encoding problem, in order to allow the rest of the enclosing | item with an encoding problem, in order to allow the rest of the | |||
data items to be encoded without harm. | enclosing data items to be encoded without harm. | |||
6. Converting Data between CBOR and JSON | 6. Converting Data between CBOR and JSON | |||
This section gives non-normative advice about converting between CBOR | This section gives non-normative advice about converting between CBOR | |||
and JSON. Implementations of converters MAY use whichever advice | and JSON. Implementations of converters MAY use whichever advice | |||
here they want. | here they want. | |||
It is worth noting that a JSON text is a sequence of characters, not | It is worth noting that a JSON text is a sequence of characters, not | |||
an encoded sequence of bytes, while a CBOR data item consists of | an encoded sequence of bytes, while a CBOR data item consists of | |||
bytes, not characters. | bytes, not characters. | |||
skipping to change at page 45, line 7 ¶ | skipping to change at line 2031 ¶ | |||
6.2. Converting from JSON to CBOR | 6.2. Converting from JSON to CBOR | |||
All JSON values, once decoded, directly map into one or more CBOR | All JSON values, once decoded, directly map into one or more CBOR | |||
values. As with any kind of CBOR generation, decisions have to be | values. As with any kind of CBOR generation, decisions have to be | |||
made with respect to number representation. In a suggested | made with respect to number representation. In a suggested | |||
conversion: | conversion: | |||
* JSON numbers without fractional parts (integer numbers) are | * JSON numbers without fractional parts (integer numbers) are | |||
represented as integers (major types 0 and 1, possibly major type | represented as integers (major types 0 and 1, possibly major type | |||
6 tag number 2 and 3), choosing the shortest form; integers longer | 6, tag number 2 and 3), choosing the shortest form; integers | |||
than an implementation-defined threshold may instead be | longer than an implementation-defined threshold may instead be | |||
represented as floating-point values. The default range that is | represented as floating-point values. The default range that is | |||
represented as integer is -2**53+1..2**53-1 (fully exploiting the | represented as integer is -2^(53)+1..2^(53)-1 (fully exploiting | |||
range for exact integers in the binary64 representation often used | the range for exact integers in the binary64 representation often | |||
for decoding JSON [RFC7493]). A CBOR-based protocol, or a generic | used for decoding JSON [RFC7493]). A CBOR-based protocol, or a | |||
converter implementation, may choose -2**32..2**32-1 or | generic converter implementation, may choose -2^(32)..2^(32)-1 or | |||
-2**64..2**64-1 (fully using the integer ranges available in CBOR | -2^(64)..2^(64)-1 (fully using the integer ranges available in | |||
with uint32_t or uint64_t, respectively) or even -2**31..2**31-1 | CBOR with uint32_t or uint64_t, respectively) or even | |||
or -2**63..2**63-1 (using popular ranges for two's complement | -2^(31)..2^(31)-1 or -2^(63)..2^(63)-1 (using popular ranges for | |||
signed integers). (If the JSON was generated from a JavaScript | two's complement signed integers). (If the JSON was generated | |||
implementation, its precision is already limited to 53 bits | from a JavaScript implementation, its precision is already limited | |||
maximum.) | to 53 bits maximum.) | |||
* Numbers with fractional parts are represented as floating-point | * Numbers with fractional parts are represented as floating-point | |||
values, performing the decimal-to-binary conversion based on the | values, performing the decimal-to-binary conversion based on the | |||
precision provided by IEEE 754 binary64. The mathematical value | precision provided by IEEE 754 binary64. The mathematical value | |||
of the JSON number is converted to binary64 using the | of the JSON number is converted to binary64 using the | |||
roundTiesToEven procedure in Section 4.3.1 of [IEEE754]. Then, | roundTiesToEven procedure in Section 4.3.1 of [IEEE754]. Then, | |||
when encoding in CBOR, the preferred serialization uses the | when encoding in CBOR, the preferred serialization uses the | |||
shortest floating-point representation exactly representing this | shortest floating-point representation exactly representing this | |||
conversion result; for instance, 1.5 is represented in a 16-bit | conversion result; for instance, 1.5 is represented in a 16-bit | |||
floating-point value (not all implementations will be capable of | floating-point value (not all implementations will be capable of | |||
skipping to change at page 46, line 43 ¶ | skipping to change at line 2111 ¶ | |||
protocol is designed to tolerate and embrace implementations that | protocol is designed to tolerate and embrace implementations that | |||
start using more codepoints than initially allocated. | start using more codepoints than initially allocated. | |||
Sizing the codepoint space may be difficult because the range | Sizing the codepoint space may be difficult because the range | |||
required may be hard to predict. Protocol designs should attempt to | required may be hard to predict. Protocol designs should attempt to | |||
make the codepoint space large enough so that it can slowly be filled | make the codepoint space large enough so that it can slowly be filled | |||
over the intended lifetime of the protocol. | over the intended lifetime of the protocol. | |||
CBOR has three major extension points: | CBOR has three major extension points: | |||
* the "simple" space (values in major type 7). Of the 24 efficient | the "simple" space (values in major type 7): Of the 24 efficient | |||
(and 224 slightly less efficient) values, only a small number have | (and 224 slightly less efficient) values, only a small number have | |||
been allocated. Implementations receiving an unknown simple data | been allocated. Implementations receiving an unknown simple data | |||
item may easily be able to process it as such, given that the | item may easily be able to process it as such, given that the | |||
structure of the value is indeed simple. The IANA registry in | structure of the value is indeed simple. The IANA registry in | |||
Section 9.1 is the appropriate way to address the extensibility of | Section 9.1 is the appropriate way to address the extensibility of | |||
this codepoint space. | this codepoint space. | |||
* the "tag" space (values in major type 6). The total codepoint | the "tag" space (values in major type 6): The total codepoint space | |||
space is abundant; only a tiny part of it has been allocated. | is abundant; only a tiny part of it has been allocated. However, | |||
However, not all of these codepoints are equally efficient: the | not all of these codepoints are equally efficient: the first 24 | |||
first 24 only consume a single ("1+0") byte, and half of them have | only consume a single ("1+0") byte, and half of them have already | |||
already been allocated. The next 232 values only consume two | been allocated. The next 232 values only consume two ("1+1") | |||
("1+1") bytes, with nearly a quarter already allocated. These | bytes, with nearly a quarter already allocated. These subspaces | |||
subspaces need some curation to last for a few more decades. | need some curation to last for a few more decades. | |||
Implementations receiving an unknown tag number can choose to | Implementations receiving an unknown tag number can choose to | |||
process just the enclosed tag content or, preferably, to process | process just the enclosed tag content or, preferably, to process | |||
the tag as an unknown tag number wrapping the tag content. The | the tag as an unknown tag number wrapping the tag content. The | |||
IANA registry in Section 9.2 is the appropriate way to address the | IANA registry in Section 9.2 is the appropriate way to address the | |||
extensibility of this codepoint space. | extensibility of this codepoint space. | |||
* the "additional information" space. An implementation receiving | the "additional information" space: An implementation receiving an | |||
an unknown additional information value has no way to continue | unknown additional information value has no way to continue | |||
decoding, so allocating codepoints in this space is a major step | decoding, so allocating codepoints in this space is a major step | |||
beyond just exercising an extension point. There are also very | beyond just exercising an extension point. There are also very | |||
few codepoints left. See also Section 7.2. | few codepoints left. See also Section 7.2. | |||
7.2. Curating the Additional Information Space | 7.2. Curating the Additional Information Space | |||
The human mind is sometimes drawn to filling in little perceived gaps | The human mind is sometimes drawn to filling in little perceived gaps | |||
to make something neat. We expect the remaining gaps in the | to make something neat. We expect the remaining gaps in the | |||
codepoint space for the additional information values to be an | codepoint space for the additional information values to be an | |||
attractor for new ideas, just because they are there. | attractor for new ideas, just because they are there. | |||
The present specification does not manage the additional information | The present specification does not manage the additional information | |||
codepoint space by an IANA registry. Instead, allocations out of | codepoint space by an IANA registry. Instead, allocations out of | |||
this space can only be done by updating this specification. | this space can only be done by updating this specification. | |||
For an additional information value of n >= 24, the size of the | For an additional information value of n >= 24, the size of the | |||
additional data typically is 2**(n-24) bytes. Therefore, additional | additional data typically is 2^(n-24) bytes. Therefore, additional | |||
information values 28 and 29 should be viewed as candidates for | information values 28 and 29 should be viewed as candidates for | |||
128-bit and 256-bit quantities, in case a need arises to add them to | 128-bit and 256-bit quantities, in case a need arises to add them to | |||
the protocol. Additional information value 30 is then the only | the protocol. Additional information value 30 is then the only | |||
additional information value available for general allocation, and | additional information value available for general allocation, and | |||
there should be a very good reason for allocating it before assigning | there should be a very good reason for allocating it before assigning | |||
it through an update of the present specification. | it through an update of the present specification. | |||
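The size rule above can be illustrated with a small sketch. The function names below are hypothetical and not part of any CBOR library; the second function only reproduces the arithmetic behind viewing values 28 and 29 as candidates for 128-bit and 256-bit quantities, which this specification does not allocate.

```python
def following_bytes(additional_info: int) -> int:
    """Bytes that follow the initial byte to carry the argument.

    Hypothetical helper covering only the allocated values 0..27.
    """
    if not 0 <= additional_info <= 27:
        raise ValueError("only additional information values 0..27 are handled")
    if additional_info < 24:
        return 0  # the argument is the additional information value itself
    return 2 ** (additional_info - 24)  # 1, 2, 4, or 8 bytes


def candidate_width_bits(additional_info: int) -> int:
    """Width in bits that the 2**(n-24) pattern would give for n >= 24.

    Purely arithmetic; values 28 and 29 are unallocated candidates.
    """
    if additional_info < 24:
        raise ValueError("the pattern applies to values of 24 and above")
    return 8 * 2 ** (additional_info - 24)


for n in (24, 25, 26, 27):
    print(n, following_bytes(n))
print(candidate_width_bits(28), candidate_width_bits(29))
```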
8. Diagnostic Notation | 8. Diagnostic Notation | |||
CBOR is a binary interchange format. To facilitate documentation and | CBOR is a binary interchange format. To facilitate documentation and | |||
skipping to change at page 48, line 23 ¶ | skipping to change at line 2184 ¶ | |||
The notation borrows the JSON syntax for numbers (integer and | The notation borrows the JSON syntax for numbers (integer and | |||
floating-point), True (>true<), False (>false<), Null (>null<), UTF-8 | floating-point), True (>true<), False (>false<), Null (>null<), UTF-8 | |||
strings, arrays, and maps (maps are called objects in JSON; the | strings, arrays, and maps (maps are called objects in JSON; the | |||
diagnostic notation extends JSON here by allowing any data item in | diagnostic notation extends JSON here by allowing any data item in | |||
the key position). Undefined is written >undefined< as in | the key position). Undefined is written >undefined< as in | |||
JavaScript. The non-finite floating-point numbers Infinity, | JavaScript. The non-finite floating-point numbers Infinity, | |||
-Infinity, and NaN are written exactly as in this sentence (this is | -Infinity, and NaN are written exactly as in this sentence (this is | |||
also a way they can be written in JavaScript, although JSON does not | also a way they can be written in JavaScript, although JSON does not | |||
allow them). A tag is written as an integer number for the tag | allow them). A tag is written as an integer number for the tag | |||
number, followed by the tag content in parentheses; for instance, an | number, followed by the tag content in parentheses; for instance, a | |||
RFC 3339 (ISO 8601) date could be notated as: | date in the format specified by RFC 3339 (ISO 8601) could be notated | |||
as: | ||||
0("2013-03-21T20:04:00Z") | 0("2013-03-21T20:04:00Z") | |||
or the equivalent relative time as | or the equivalent relative time as the following: | |||
1(1363896240) | 1(1363896240) | |||
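As a minimal sketch of why these two notations are equivalent, the following Python snippet converts the RFC 3339 string used with tag number 0 into the epoch-based value used with tag number 1. It uses only the standard library and assumes the timestamp ends in a literal "Z" with no fractional seconds.

```python
from datetime import datetime, timezone

text = "2013-03-21T20:04:00Z"

# strptime matches the literal "Z" and returns a naive datetime,
# so the UTC time zone is attached explicitly.
moment = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
epoch_seconds = int(moment.timestamp())

print(f'0("{text}")')        # tag number 0: date/time string
print(f"1({epoch_seconds})")  # tag number 1: epoch-based date/time
```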
Byte strings are notated in one of the base encodings, without | Byte strings are notated in one of the base encodings, without | |||
padding, enclosed in single quotes, prefixed by >h< for base16, >b32< | padding, enclosed in single quotes, prefixed by >h< for base16, >b32< | |||
for base32, >h32< for base32hex, >b64< for base64 or base64url (the | for base32, >h32< for base32hex, >b64< for base64 or base64url (the | |||
actual encodings do not overlap, so the string remains unambiguous). | actual encodings do not overlap, so the string remains unambiguous). | |||
For example, the byte string 0x12345678 could be written h'12345678', | For example, the byte string 0x12345678 could be written h'12345678', | |||
b32'CI2FM6A', or b64'EjRWeA'. | b32'CI2FM6A', or b64'EjRWeA'. | |||
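The example above can be reproduced with the Python standard library. This is only a sketch: the function name diagnostic_forms is hypothetical, and the base32hex and base64url variants are left out for brevity.

```python
import base64


def diagnostic_forms(data: bytes) -> dict:
    """Return unpadded base16, base32, and base64 notations for a byte string."""

    def unpadded(encoded: bytes) -> str:
        # The diagnostic notation omits the "=" padding characters.
        return encoded.decode("ascii").rstrip("=")

    return {
        "h": f"h'{data.hex()}'",
        "b32": f"b32'{unpadded(base64.b32encode(data))}'",
        "b64": f"b64'{unpadded(base64.b64encode(data))}'",
    }


forms = diagnostic_forms(bytes.fromhex("12345678"))
for notation in forms.values():
    print(notation)
```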
Unassigned simple values are given as "simple()" with the appropriate | Unassigned simple values are given as "simple()" with the appropriate | |||
integer in the parentheses. For example, "simple(42)" indicates | integer in the parentheses. For example, "simple(42)" indicates | |||
major type 7, value 42. | major type 7, value 42. | |||
A number of useful extensions to the diagnostic notation defined here | A number of useful extensions to the diagnostic notation defined here | |||
are provided in Appendix G of [RFC8610], "Extended Diagnostic | are provided in Appendix G of [RFC8610], "Extended Diagnostic | |||
Notation" (EDN). Similarly, an extension of this notation could be | Notation" (EDN). Similarly, this notation could be extended in a | |||
provided in a separate document to provide for the documentation of | separate document to provide documentation for NaN payloads, which | |||
NaN payloads, which are not covered in the present document. | are not covered in this document. | |||
8.1. Encoding Indicators | 8.1. Encoding Indicators | |||
Sometimes it is useful to indicate in the diagnostic notation which | Sometimes it is useful to indicate in the diagnostic notation which | |||
of several alternative representations were actually used; for | of several alternative representations were actually used; for | |||
example, a data item written >1.5< by a diagnostic decoder might have | example, a data item written >1.5< by a diagnostic decoder might have | |||
been encoded as a half-, single-, or double-precision float. | been encoded as a half-, single-, or double-precision float. | |||
The convention for encoding indicators is that anything starting with | The convention for encoding indicators is that anything starting with | |||
an underscore and all following characters that are alphanumeric or | an underscore and all following characters that are alphanumeric or | |||
underscore, is an encoding indicator, and can be ignored by anyone | underscore is an encoding indicator, and can be ignored by anyone not | |||
not interested in this information. For example, "_" or "_3". | interested in this information. For example, "_" or "_3". Encoding | |||
Encoding indicators are always optional. | indicators are always optional. | |||
A single underscore can be written after the opening brace of a map | A single underscore can be written after the opening brace of a map | |||
or the opening bracket of an array to indicate that the data item was | or the opening bracket of an array to indicate that the data item was | |||
represented in indefinite-length format. For example, [_ 1, 2] | represented in indefinite-length format. For example, [_ 1, 2] | |||
contains an indicator that an indefinite-length representation was | contains an indicator that an indefinite-length representation was | |||
used to represent the data item [1, 2]. | used to represent the data item [1, 2]. | |||
An underscore followed by a decimal digit n indicates that the | An underscore followed by a decimal digit n indicates that the | |||
preceding item (or, for arrays and maps, the item starting with the | preceding item (or, for arrays and maps, the item starting with the | |||
preceding bracket or brace) was encoded with an additional | preceding bracket or brace) was encoded with an additional | |||
information value of 24+n. For example, 1.5_1 is a half-precision | information value of 24+n. For example, 1.5_1 is a half-precision | |||
floating-point number, while 1.5_3 is encoded as double precision. | floating-point number, while 1.5_3 is encoded as double precision. | |||
This encoding indicator is not shown in Appendix A. (Note that the | This encoding indicator is not shown in Appendix A. (Note that the | |||
encoding indicator "_" is thus an abbreviation of the full form "_7", | encoding indicator "_" is thus an abbreviation of the full form "_7", | |||
which is not used.) | which is not used.) | |||
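To make the relation between the indicator digit n and the additional information value 24+n concrete, here is a minimal sketch that encodes a floating-point number in the width selected by the indicator. The function name encode_float is hypothetical; the sketch covers only major type 7 with n in 1..3 and does not check that the value is exactly representable in the chosen width.

```python
import struct

# Indicator digit -> struct format for big-endian binary16/32/64.
_FLOAT_FORMATS = {1: ">e", 2: ">f", 3: ">d"}


def encode_float(value: float, indicator: int) -> bytes:
    """Encode value as a CBOR float using additional information 24 + indicator."""
    if indicator not in _FLOAT_FORMATS:
        raise ValueError("indicator must be 1 (half), 2 (single), or 3 (double)")
    additional_info = 24 + indicator
    initial_byte = (7 << 5) | additional_info  # major type 7 in the top 3 bits
    return bytes([initial_byte]) + struct.pack(_FLOAT_FORMATS[indicator], value)


print(encode_float(1.5, 1).hex())  # 1.5_1
print(encode_float(1.5, 3).hex())  # 1.5_3
```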
The detailed chunk structure of byte and text strings of indefinite | The detailed chunk structure of byte and text strings of indefinite | |||
length can be notated in the form (_ h'0123', h'4567') and (_ "foo", | length can be notated in the form (_ h'0123', h'4567') and (_ "foo", | |||
"bar"). However, for an indefinite length string with no chunks | "bar"). However, for an indefinite-length string with no chunks | |||
inside, (_ ) would be ambiguous whether a byte string (0x5fff) or a | inside, (_ ) would be ambiguous as to whether a byte string (0x5fff) | |||
text string (0x7fff) is meant and is therefore not used. The basic | or a text string (0x7fff) is meant and is therefore not used. The | |||
forms ''_ and ""_ can be used instead and are reserved for the case | basic forms ''_ and ""_ can be used instead and are reserved for the | |||
with no chunks only -- not as short forms for the (permitted, but not | case of no chunks only -- not as short forms for the (permitted, but | |||
really useful) encodings with only empty chunks, which to preserve | not really useful) encodings with only empty chunks, which need to be | |||
the chunk structure need to be notated as (_ ''), (_ ""), etc. | notated as (_ ''), (_ ""), etc., to preserve the chunk structure. | |||
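The difference between an indefinite-length string with no chunks and one with a single empty chunk can be seen in the bytes themselves. The following sketch uses hypothetical helper names and, as a simplifying assumption, only handles chunks shorter than 24 bytes so that each chunk length fits into the initial byte.

```python
def definite_string(payload: bytes, major_type: int) -> bytes:
    """Encode one definite-length chunk (sketch: payloads under 24 bytes only)."""
    if len(payload) > 23:
        raise ValueError("sketch only handles chunks shorter than 24 bytes")
    return bytes([(major_type << 5) | len(payload)]) + payload


def indefinite_string(chunks, text: bool = False) -> bytes:
    """Encode chunks as an indefinite-length byte string or text string."""
    major_type = 3 if text else 2
    out = bytearray([(major_type << 5) | 31])  # 0x5f or 0x7f
    for chunk in chunks:
        payload = chunk.encode("utf-8") if text else chunk
        out += definite_string(payload, major_type)
    out.append(0xFF)  # "break" stop code
    return bytes(out)


print(indefinite_string([bytes.fromhex("0123"), bytes.fromhex("4567")]).hex())
print(indefinite_string([]).hex())             # ''_
print(indefinite_string([], text=True).hex())  # ""_
print(indefinite_string([b""]).hex())          # (_ '')
```

The last three lines show why (_ ) alone would be ambiguous while ''_, ""_, and (_ '') each correspond to a distinct encoding.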
9. IANA Considerations | 9. IANA Considerations | |||
IANA has created two registries for new CBOR values. The registries | IANA has created two registries for new CBOR values. The registries | |||
are separate, that is, not under an umbrella registry, and follow the | are separate, that is, not under an umbrella registry, and follow the | |||
rules in [RFC8126]. IANA has also assigned a new MIME media type and | rules in [RFC8126]. IANA has also assigned a new media type, an | |||
an associated Constrained Application Protocol (CoAP) Content-Format | associated CoAP Content-Format entry, and a structured syntax suffix. | |||
entry. | ||||
9.1. Simple Values Registry | 9.1. CBOR Simple Values Registry | |||
IANA has created the "Concise Binary Object Representation (CBOR) | IANA has created the "Concise Binary Object Representation (CBOR) | |||
Simple Values" registry at [IANA.cbor-simple-values]. The initial | Simple Values" registry at [IANA.cbor-simple-values]. The initial | |||
values are shown in Table 4. | values are shown in Table 4. | |||
New entries in the range 0 to 19 are assigned by Standards Action. | New entries in the range 0 to 19 are assigned by Standards Action | |||
It is suggested that these Standards Actions allocate values starting | [RFC8126]. It is suggested that IANA allocate values starting with | |||
with the number 16 in order to reserve the lower numbers for | the number 16 in order to reserve the lower numbers for contiguous | |||
contiguous blocks (if any). | blocks (if any). | |||
New entries in the range 32 to 255 are assigned by Specification | New entries in the range 32 to 255 are assigned by Specification | |||
Required. | Required. | |||
9.2. Tags Registry | 9.2. CBOR Tags Registry | |||
IANA has created the "Concise Binary Object Representation (CBOR) | IANA has created the "Concise Binary Object Representation (CBOR) | |||
Tags" registry at [IANA.cbor-tags]. The tags that were defined in | Tags" registry at [IANA.cbor-tags]. The tags that were defined in | |||
[RFC7049] are described in detail in Section 3.4, and other tags have | [RFC7049] are described in detail in Section 3.4, and other tags have | |||
already been defined since then. | already been defined since then. | |||
New entries in the range 0 to 23 ("1+0") are assigned by Standards | New entries in the range 0 to 23 ("1+0") are assigned by Standards | |||
Action. New entries in the ranges 24 to 255 ("1+1") and 256 to 32767 | Action. New entries in the ranges 24 to 255 ("1+1") and 256 to 32767 | |||
(lower half of "1+2") are assigned by Specification Required. New | (lower half of "1+2") are assigned by Specification Required. New | |||
entries in the range 32768 to 18446744073709551615 (upper half of | entries in the range 32768 to 18446744073709551615 (upper half of | |||
skipping to change at page 51, line 5 ¶ | skipping to change at line 2302 ¶ | |||
* Description of semantics (URL) -- This description is optional; | * Description of semantics (URL) -- This description is optional; | |||
the URL can point to something like an Internet-Draft or a web | the URL can point to something like an Internet-Draft or a web | |||
page. | page. | |||
Applicants exercising the First Come First Served range and making a | Applicants exercising the First Come First Served range and making a | |||
suggestion for a tag number that is not representable in 32 bits | suggestion for a tag number that is not representable in 32 bits | |||
(i.e., larger than 4294967295) should be aware that this could reduce | (i.e., larger than 4294967295) should be aware that this could reduce | |||
interoperability with implementations that do not support 64-bit | interoperability with implementations that do not support 64-bit | |||
numbers. | numbers. | |||
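As a rough sketch of the ranges described in this section, the following hypothetical helpers map a tag number to its registration policy and check whether it is representable in 32 bits. The function names are illustrative only, and the policy strings simply repeat the names used in the text.

```python
MAX_TAG_NUMBER = 18446744073709551615  # 2**64 - 1


def tag_registration_policy(tag_number: int) -> str:
    """Return the registration policy name for a tag number (hypothetical helper)."""
    if not 0 <= tag_number <= MAX_TAG_NUMBER:
        raise ValueError("tag numbers are unsigned 64-bit integers")
    if tag_number <= 23:
        return "Standards Action"
    if tag_number <= 32767:
        return "Specification Required"
    return "First Come First Served"


def representable_in_32_bits(tag_number: int) -> bool:
    """True if the tag number does not require 64-bit support to decode."""
    return 0 <= tag_number <= 4294967295


for number in (2, 24, 32767, 32768, 4294967296):
    print(number, tag_registration_policy(number), representable_in_32_bits(number))
```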
9.3. Media Type ("MIME Type") | 9.3. Media Types Registry | |||
The Internet media type [RFC6838] for a single encoded CBOR data item | The Internet media type [RFC6838] ("MIME type") for a single encoded | |||
is application/cbor, as defined in [IANA.media-types]: | CBOR data item is "application/cbor" as defined in the "Media Types" | |||
registry [IANA.media-types]: | ||||
Type name: application | Type name: application | |||
Subtype name: cbor | Subtype name: cbor | |||
Required parameters: n/a | Required parameters: n/a | |||
Optional parameters: n/a | Optional parameters: n/a | |||
Encoding considerations: Binary | Encoding considerations: Binary | |||
Security considerations: See Section 10 of this document | Security considerations: See Section 10 of RFC 8949. | |||
Interoperability considerations: n/a | Interoperability considerations: n/a | |||
Published specification: This document | Published specification: RFC 8949 | |||
Applications that use this media type: Many | Applications that use this media type: Many | |||
Additional information: | Additional information: | |||
* Magic number(s): n/a | ||||
* File extension(s): .cbor | ||||
* Macintosh file type code(s): n/a | Magic number(s): n/a | |||
File extension(s): .cbor | ||||
Macintosh file type code(s): n/a | ||||
Person & email address to contact for further information: IETF CBOR | Person & email address to contact for further information: IETF CBOR | |||
Working Group cbor@ietf.org (mailto:cbor@ietf.org) or IETF | Working Group (cbor@ietf.org) or IETF Applications and Real-Time | |||
Applications and Real-Time Area art@ietf.org (mailto:art@ietf.org) | Area (art@ietf.org) | |||
Intended usage: COMMON | Intended usage: COMMON | |||
Restrictions on usage: none | Restrictions on usage: none | |||
Author: IETF CBOR Working Group cbor@ietf.org (mailto:cbor@ietf.org) | Author: IETF CBOR Working Group (cbor@ietf.org) | |||
Change controller: The IESG iesg@ietf.org (mailto:iesg@ietf.org) | Change controller: The IESG (iesg@ietf.org) | |||
9.4. CoAP Content-Format | 9.4. CoAP Content-Format Registry | |||
The CoAP Content-Format for CBOR is registered in | The CoAP Content-Format for CBOR has been registered in the "CoAP | |||
[IANA.core-parameters]: | Content-Formats" subregistry within the "Constrained RESTful | |||
Environments (CoRE) Parameters" registry [IANA.core-parameters]: | ||||
Media Type: application/cbor | Media Type: application/cbor | |||
Encoding: - | ||||
Id: 60 | Encoding: - | |||
Reference: [RFCthis] | ID: 60 | |||
9.5. The +cbor Structured Syntax Suffix Registration | Reference: RFC 8949 | |||
The Structured Syntax Suffix [RFC6838] for media types based on a | 9.5. Structured Syntax Suffix Registry | |||
single encoded CBOR data item is +cbor, as defined in | ||||
[IANA.media-type-structured-suffix]: | ||||
Name: Concise Binary Object Representation (CBOR) | The structured syntax suffix [RFC6838] for media types based on a | |||
single encoded CBOR data item is +cbor, which IANA has registered in | ||||
the "Structured Syntax Suffixes" registry [IANA.structured-suffix]: | ||||
+suffix: +cbor | Name: Concise Binary Object Representation (CBOR) | |||
References: [RFCthis] | +suffix: +cbor | |||
Encoding Considerations: CBOR is a binary format. | References: RFC 8949 | |||
Interoperability Considerations: n/a | Encoding Considerations: CBOR is a binary format. | |||
Interoperability Considerations: n/a | ||||
Fragment Identifier Considerations: The syntax and semantics of | Fragment Identifier Considerations: The syntax and semantics of | |||
fragment identifiers specified for +cbor SHOULD be as specified | fragment identifiers specified for +cbor SHOULD be as specified | |||
for "application/cbor". (At publication of this document, there | for "application/cbor". (At publication of RFC 8949, there is no | |||
is no fragment identification syntax defined for "application/ | fragment identification syntax defined for "application/cbor".) | |||
cbor".) | ||||
The syntax and semantics for fragment identifiers for a specific | The syntax and semantics for fragment identifiers for a specific | |||
"xxx/yyy+cbor" SHOULD be processed as follows: | "xxx/yyy+cbor" SHOULD be processed as follows: | |||
* For cases defined in +cbor, where the fragment identifier | * For cases defined in +cbor, where the fragment identifier | |||
resolves per the +cbor rules, then process as specified in | resolves per the +cbor rules, then process as specified in | |||
+cbor. | +cbor. | |||
* For cases defined in +cbor, where the fragment identifier does | * For cases defined in +cbor, where the fragment identifier does | |||
not resolve per the +cbor rules, then process as specified in | not resolve per the +cbor rules, then process as specified in | |||
"xxx/yyy+cbor". | "xxx/yyy+cbor". | |||
* For cases not defined in +cbor, then process as specified in | * For cases not defined in +cbor, then process as specified in | |||
"xxx/yyy+cbor". | "xxx/yyy+cbor". | |||
Security Considerations: See Section 10 of this document | Security Considerations: See Section 10 of RFC 8949. | |||
Contact: IETF CBOR Working Group cbor@ietf.org | Contact: IETF CBOR Working Group (cbor@ietf.org) or IETF | |||
(mailto:cbor@ietf.org) or IETF Applications and Real-Time Area | Applications and Real-Time Area (art@ietf.org) | |||
art@ietf.org (mailto:art@ietf.org) | ||||
Author/Change Controller: The IESG iesg@ietf.org | Author/Change Controller: IETF | |||
(mailto:iesg@ietf.org) | ||||
10. Security Considerations | 10. Security Considerations | |||
A network-facing application can exhibit vulnerabilities in its | A network-facing application can exhibit vulnerabilities in its | |||
processing logic for incoming data. Complex parsers are well known | processing logic for incoming data. Complex parsers are well known | |||
as a likely source of such vulnerabilities, such as the ability to | as a likely source of such vulnerabilities, such as the ability to | |||
remotely crash a node, or even remotely execute arbitrary code on it. | remotely crash a node, or even remotely execute arbitrary code on it. | |||
CBOR attempts to narrow the opportunities for introducing such | CBOR attempts to narrow the opportunities for introducing such | |||
vulnerabilities by reducing parser complexity, by giving the entire | vulnerabilities by reducing parser complexity, by giving the entire | |||
range of encodable values a meaning where possible. | range of encodable values a meaning where possible. | |||
skipping to change at page 53, line 43 ¶ | skipping to change at line 2435 ¶ | |||
As discussed throughout this document, there are many values that can | As discussed throughout this document, there are many values that can | |||
be considered "equivalent" in some circumstances and "not equivalent" | be considered "equivalent" in some circumstances and "not equivalent" | |||
in others. As just one example, the numeric value for the number | in others. As just one example, the numeric value for the number | |||
"one" might be expressed as an integer or a bignum. A system | "one" might be expressed as an integer or a bignum. A system | |||
interpreting CBOR input might accept either form for the number | interpreting CBOR input might accept either form for the number | |||
"one", or might reject one (or both) forms. Such acceptance or | "one", or might reject one (or both) forms. Such acceptance or | |||
rejection can have security implications in the program that is using | rejection can have security implications in the program that is using | |||
the interpreted input. | the interpreted input. | |||
Hostile input may be constructed to overrun buffers, overflow or | Hostile input may be constructed to overrun buffers, to overflow or | |||
underflow integer arithmetic, or cause other decoding disruption. | underflow integer arithmetic, or to cause other decoding disruption. | |||
CBOR data items might have lengths or sizes that are intentionally | CBOR data items might have lengths or sizes that are intentionally | |||
extremely large or too short. Resource exhaustion attacks might | extremely large or too short. Resource exhaustion attacks might | |||
attempt to lure a decoder into allocating very big data items | attempt to lure a decoder into allocating very big data items | |||
(strings, arrays, maps, or even arbitrary precision numbers) or | (strings, arrays, maps, or even arbitrary precision numbers) or | |||
exhaust the stack depth by setting up deeply nested items. Decoders | exhaust the stack depth by setting up deeply nested items. Decoders | |||
need to have appropriate resource management to mitigate these | need to have appropriate resource management to mitigate these | |||
attacks. (Items for which very large sizes are given can also | attacks. (Items for which very large sizes are given can also | |||
attempt to exploit integer overflow vulnerabilities.) | attempt to exploit integer overflow vulnerabilities.) | |||
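One of the mitigations mentioned above, refusing to act on a declared length before checking it against the input actually available, might look like the sketch below. All names are hypothetical, only definite-length strings are handled, and nesting-depth limits are not shown. Python integers do not overflow, so an implementation in a language with fixed-width integers would also need to guard the additions.

```python
class DecodeError(ValueError):
    """Raised when input is truncated or declares an implausible size."""


def read_head(data: bytes, pos: int):
    """Return (major_type, argument, next_pos) for the head starting at pos."""
    if pos >= len(data):
        raise DecodeError("truncated input")
    initial = data[pos]
    major_type, info = initial >> 5, initial & 0x1F
    pos += 1
    if info < 24:
        return major_type, info, pos
    if info > 27:
        raise DecodeError("sketch does not handle additional information 28..31")
    size = 1 << (info - 24)
    if size > len(data) - pos:
        raise DecodeError("truncated argument")
    return major_type, int.from_bytes(data[pos:pos + size], "big"), pos + size


def string_bounds(data: bytes, pos: int = 0):
    """Return (start, end) of a definite-length string payload, checked first."""
    major_type, length, pos = read_head(data, pos)
    if major_type not in (2, 3):
        raise DecodeError("not a byte string or text string")
    # Compare against the remaining input before allocating or slicing.
    if length > len(data) - pos:
        raise DecodeError("declared length exceeds remaining input")
    return pos, pos + length


print(string_bounds(bytes.fromhex("420123")))
```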
A CBOR decoder, by definition, only accepts well-formed CBOR; this is | A CBOR decoder, by definition, only accepts well-formed CBOR; this is | |||
the first step to its robustness. Input that is not well-formed CBOR | the first step to its robustness. Input that is not well-formed CBOR | |||
causes no further processing from the point where the lack of well- | causes no further processing from the point where the lack of well- | |||
formedness was detected. If possible, any data decoded up to this | formedness was detected. If possible, any data decoded up to this | |||
point should have no impact on the application using the CBOR | point should have no impact on the application using the CBOR | |||
decoder. | decoder. | |||
In addition to ascertaining well-formedness, a CBOR decoder might | In addition to ascertaining well-formedness, a CBOR decoder might | |||
also perform validity checks on the CBOR data. Alternatively, it can | also perform validity checks on the CBOR data. Alternatively, it can | |||
leave those checks to the application using the decoder. This choice | leave those checks to the application using the decoder. This choice | |||
skipping to change at page 55, line 6 ¶ | skipping to change at line 2496 ¶ | |||
underflow of integer arithmetic, and other such errors that are aimed | underflow of integer arithmetic, and other such errors that are aimed | |||
to disrupt the encoder. | to disrupt the encoder. | |||
Protocols should be defined in such a way that potential multiple | Protocols should be defined in such a way that potential multiple | |||
interpretations are reliably reduced to a single interpretation. For | interpretations are reliably reduced to a single interpretation. For | |||
example, an attacker could make use of invalid input such as | example, an attacker could make use of invalid input such as | |||
duplicate keys in maps, or exploit different precision in processing | duplicate keys in maps, or exploit different precision in processing | |||
numbers to make one application base its decisions on a different | numbers to make one application base its decisions on a different | |||
interpretation than the one that will be used by a second | interpretation than the one that will be used by a second | |||
application. To facilitate consistent interpretation, encoder and | application. To facilitate consistent interpretation, encoder and | |||
decoder implementations should provide a validity checking mode of | decoder implementations should provide a validity-checking mode of | |||
operation (Section 5.4). Note, however, that a generic decoder | operation (Section 5.4). Note, however, that a generic decoder | |||
cannot know about all requirements that an application poses on its | cannot know about all requirements that an application poses on its | |||
input data; it is therefore not relieving the application from | input data; it is therefore not relieving the application from | |||
performing its own input checking. Also, since the set of defined | performing its own input checking. Also, since the set of defined | |||
tag numbers evolves, the application may employ a tag number that is | tag numbers evolves, the application may employ a tag number that is | |||
not yet supported for validity checking by the generic decoder it | not yet supported for validity checking by the generic decoder it | |||
uses. Generic decoders therefore need to provide documentation which | uses. Generic decoders therefore need to document which tag numbers | |||
tag numbers they support and what validity checking they can provide | they support and what validity checking they provide for those tag | |||
for each of them as well as for basic CBOR validity (UTF-8 checking, | numbers as well as for basic CBOR (UTF-8 checking, duplicate map key | |||
duplicate map key checking). | checking). | |||
Section 3.4.3 notes that using the non-preferred choice of a bignum | Section 3.4.3 notes that using the non-preferred choice of a bignum | |||
representation instead of a basic integer for encoding a number is | representation instead of a basic integer for encoding a number is | |||
not intended to have application semantics, but it can have such | not intended to have application semantics, but it can have such | |||
semantics if an application receiving CBOR data is using a decoder in | semantics if an application receiving CBOR data is using a decoder in | |||
the basic generic data model. This disparity causes a security issue | the basic generic data model. This disparity causes a security issue | |||
if the two sets of semantics differ. Thus, applications using CBOR | if the two sets of semantics differ. Thus, applications using CBOR | |||
need to specify the data model that they are using for each use of | need to specify the data model that they are using for each use of | |||
CBOR data. | CBOR data. | |||
It is common to convert CBOR data to other formats. In many cases, | It is common to convert CBOR data to other formats. In many cases, | |||
CBOR has more expressive types than other formats; this is | CBOR has more expressive types than other formats; this is | |||
particularly true for the common conversion to JSON. The loss of | particularly true for the common conversion to JSON. The loss of | |||
type information can cause security issues for the systems that are | type information can cause security issues for the systems that are | |||
processing the less-expressive data. | processing the less-expressive data. | |||
Section 6.2 describes a possibly-common usage scenario of converting | Section 6.2 describes a possibly common usage scenario of converting | |||
between CBOR and JSON that could allow an attack if the attcker knows | between CBOR and JSON that could allow an attack if the attacker | |||
that the application is performing the conversion. | knows that the application is performing the conversion. | |||
Security considerations for the use of base16 and base64 from | Security considerations for the use of base16 and base64 from | |||
[RFC4648], and the use of UTF-8 from [RFC3629], are relevant to CBOR | [RFC4648], and the use of UTF-8 from [RFC3629], are relevant to CBOR | |||
as well. | as well. | |||
11. References | 11. References | |||
11.1. Normative References | 11.1. Normative References | |||
[C] International Organization for Standardization, | [C] International Organization for Standardization, | |||
"Information technology — Programming languages — C", ISO/ | "Information technology - Programming languages - C", | |||
IEC 9899:2018, Fourth Edition, June 2018. | Fourth Edition, ISO/IEC 9899:2018, June 2018, | |||
<https://www.iso.org/standard/74528.html>. | ||||
[Cplusplus17] | [Cplusplus20] | |||
International Organization for Standardization, | International Organization for Standardization, | |||
"Programming languages — C++", ISO/IEC 14882:2017, Fifth | "Programming languages - C++", Sixth Edition, ISO/IEC DIS | |||
Edition, December 2017. | 14882, ISO/IEC ISO/IEC JTC1 SC22 WG21 N 4860, March 2020, | |||
<https://isocpp.org/files/papers/N4860.pdf>. | ||||
[IEEE754] IEEE, "IEEE Standard for Floating-Point Arithmetic", IEEE | [IEEE754] IEEE, "IEEE Standard for Floating-Point Arithmetic", IEEE | |||
Std 754-2019, DOI 10.1109/IEEESTD.2019.8766229, | Std 754-2019, DOI 10.1109/IEEESTD.2019.8766229, | |||
<https://ieeexplore.ieee.org/document/8766229>. | <https://ieeexplore.ieee.org/document/8766229>. | |||
[RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail | [RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail | |||
Extensions (MIME) Part One: Format of Internet Message | Extensions (MIME) Part One: Format of Internet Message | |||
Bodies", RFC 2045, DOI 10.17487/RFC2045, November 1996, | Bodies", RFC 2045, DOI 10.17487/RFC2045, November 1996, | |||
<https://www.rfc-editor.org/info/rfc2045>. | <https://www.rfc-editor.org/info/rfc2045>. | |||
skipping to change at page 57, line 5 ¶ | skipping to change at line 2590 ¶ | |||
[RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for | [RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for | |||
Writing an IANA Considerations Section in RFCs", BCP 26, | Writing an IANA Considerations Section in RFCs", BCP 26, | |||
RFC 8126, DOI 10.17487/RFC8126, June 2017, | RFC 8126, DOI 10.17487/RFC8126, June 2017, | |||
<https://www.rfc-editor.org/info/rfc8126>. | <https://www.rfc-editor.org/info/rfc8126>. | |||
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
[TIME_T] The Open Group Base Specifications, "Open Group Standard: | [TIME_T] The Open Group, "The Open Group Base Specifications", | |||
Vol. 1: Base Definitions, Issue 7", Section 4.16 'Seconds | Section 4.16, 'Seconds Since the Epoch', Issue 7, 2018 | |||
Since the Epoch', IEEE Std 1003.1, 2018 Edition, 2018, | Edition, IEEE Std 1003.1, 2018, | |||
<http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/ | <https://pubs.opengroup.org/onlinepubs/9699919799/ | |||
V1_chap04.html#tag_04_16>. | basedefs/V1_chap04.html#tag_04_16>. | |||
11.2. Informative References | 11.2. Informative References | |||
[ASN.1] International Telecommunication Union, "Information | [ASN.1] International Telecommunication Union, "Information | |||
Technology — ASN.1 encoding rules: Specification of Basic | Technology - ASN.1 encoding rules: Specification of Basic | |||
Encoding Rules (BER), Canonical Encoding Rules (CER) and | Encoding Rules (BER), Canonical Encoding Rules (CER) and | |||
Distinguished Encoding Rules (DER)", ITU-T Recommendation | Distinguished Encoding Rules (DER)", ITU-T Recommendation | |||
X.690, 1994. | X.690, 2015, | |||
<https://www.itu.int/rec/T-REC-X.690-201508-I/en>. | ||||
[BSON] Various, "BSON - Binary JSON", 2013, | ||||
<http://bsonspec.org/>. | ||||
[ECMA262] Ecma International, "ECMAScript 2018 Language | [BSON] Various, "BSON - Binary JSON", <http://bsonspec.org/>. | |||
Specification", ECMA Standard ECMA-262, 9th Edition, June | ||||
2018, <https://www.ecma- | ||||
international.org/publications/files/ECMA-ST/Ecma- | ||||
262.pdf>. | ||||
[I-D.bormann-cbor-notable-tags] | [CBOR-TAGS] | |||
Bormann, C., "Notable CBOR Tags", Work in Progress, | Bormann, C., "Notable CBOR Tags", Work in Progress, | |||
Internet-Draft, draft-bormann-cbor-notable-tags-02, 25 | Internet-Draft, draft-bormann-cbor-notable-tags-02, 25 | |||
June 2020, <http://www.ietf.org/internet-drafts/draft- | June 2020, <https://tools.ietf.org/html/draft-bormann- | |||
bormann-cbor-notable-tags-02.txt>. | cbor-notable-tags-02>. | |||
[ECMA262] Ecma International, "ECMAScript 2020 Language | ||||
Specification", Standard ECMA-262, 11th Edition, June | ||||
2020, <https://www.ecma- | ||||
international.org/publications/standards/Ecma-262.htm>. | ||||
[Err3764] RFC Errata, Erratum ID 3764, RFC 7049, | ||||
<https://www.rfc-editor.org/errata/eid3764>. | ||||
[Err3770] RFC Errata, Erratum ID 3770, RFC 7049, | ||||
<https://www.rfc-editor.org/errata/eid3770>. | ||||
[Err4294] RFC Errata, Erratum ID 4294, RFC 7049, | ||||
<https://www.rfc-editor.org/errata/eid4294>. | ||||
[Err4409] RFC Errata, Erratum ID 4409, RFC 7049, | ||||
<https://www.rfc-editor.org/errata/eid4409>. | ||||
[Err4963] RFC Errata, Erratum ID 4963, RFC 7049, | ||||
<https://www.rfc-editor.org/errata/eid4963>. | ||||
[Err4964] RFC Errata, Erratum ID 4964, RFC 7049, | ||||
<https://www.rfc-editor.org/errata/eid4964>. | ||||
[Err5434] RFC Errata, Erratum ID 5434, RFC 7049, | ||||
<https://www.rfc-editor.org/errata/eid5434>. | ||||
[Err5763] RFC Errata, Erratum ID 5763, RFC 7049, | ||||
<https://www.rfc-editor.org/errata/eid5763>. | ||||
[Err5917] RFC Errata, Erratum ID 5917, RFC 7049, | ||||
<https://www.rfc-editor.org/errata/eid5917>. | ||||
[IANA.cbor-simple-values] | [IANA.cbor-simple-values] | |||
IANA, "Concise Binary Object Representation (CBOR) Simple | IANA, "Concise Binary Object Representation (CBOR) Simple | |||
Values", | Values", | |||
<http://www.iana.org/assignments/cbor-simple-values>. | <https://www.iana.org/assignments/cbor-simple-values>. | |||
[IANA.cbor-tags] | [IANA.cbor-tags] | |||
IANA, "Concise Binary Object Representation (CBOR) Tags", | IANA, "Concise Binary Object Representation (CBOR) Tags", | |||
<http://www.iana.org/assignments/cbor-tags>. | <https://www.iana.org/assignments/cbor-tags>. | |||
[IANA.core-parameters] | [IANA.core-parameters] | |||
IANA, "Constrained RESTful Environments (CoRE) | IANA, "Constrained RESTful Environments (CoRE) | |||
Parameters", | Parameters", | |||
<http://www.iana.org/assignments/core-parameters>. | <https://www.iana.org/assignments/core-parameters>. | |||
[IANA.media-type-structured-suffix] | ||||
IANA, "Structured Syntax Suffix Registry", | ||||
<http://www.iana.org/assignments/media-type-structured- | ||||
suffix>. | ||||
[IANA.media-types] | [IANA.media-types] | |||
IANA, "Media Types", | IANA, "Media Types", | |||
<http://www.iana.org/assignments/media-types>. | <https://www.iana.org/assignments/media-types>. | |||
[IANA.structured-suffix] | ||||
IANA, "Structured Syntax Suffixes", | ||||
<https://www.iana.org/assignments/media-type-structured- | ||||
suffix>. | ||||
[MessagePack] | [MessagePack] | |||
Furuhashi, S., "MessagePack", 2013, <http://msgpack.org/>. | Furuhashi, S., "MessagePack", <https://msgpack.org/>. | |||
[PCRE] Ho, A., "PCRE - Perl Compatible Regular Expressions", | [PCRE] Hazel, P., "PCRE - Perl Compatible Regular Expressions", | |||
2018, <http://www.pcre.org/>. | <https://www.pcre.org/>. | |||
[RFC0713] Haverty, J., "MSDTP-Message Services Data Transmission | [RFC0713] Haverty, J., "MSDTP-Message Services Data Transmission | |||
Protocol", RFC 713, DOI 10.17487/RFC0713, April 1976, | Protocol", RFC 713, DOI 10.17487/RFC0713, April 1976, | |||
<https://www.rfc-editor.org/info/rfc713>. | <https://www.rfc-editor.org/info/rfc713>. | |||
[RFC6838] Freed, N., Klensin, J., and T. Hansen, "Media Type | [RFC6838] Freed, N., Klensin, J., and T. Hansen, "Media Type | |||
Specifications and Registration Procedures", BCP 13, | Specifications and Registration Procedures", BCP 13, | |||
RFC 6838, DOI 10.17487/RFC6838, January 2013, | RFC 6838, DOI 10.17487/RFC6838, January 2013, | |||
<https://www.rfc-editor.org/info/rfc6838>. | <https://www.rfc-editor.org/info/rfc6838>. | |||
skipping to change at page 59, line 21 ¶ | skipping to change at line 2727 ¶ | |||
Sequences", RFC 8742, DOI 10.17487/RFC8742, February 2020, | Sequences", RFC 8742, DOI 10.17487/RFC8742, February 2020, | |||
<https://www.rfc-editor.org/info/rfc8742>. | <https://www.rfc-editor.org/info/rfc8742>. | |||
[RFC8746] Bormann, C., Ed., "Concise Binary Object Representation | [RFC8746] Bormann, C., Ed., "Concise Binary Object Representation | |||
(CBOR) Tags for Typed Arrays", RFC 8746, | (CBOR) Tags for Typed Arrays", RFC 8746, | |||
DOI 10.17487/RFC8746, February 2020, | DOI 10.17487/RFC8746, February 2020, | |||
<https://www.rfc-editor.org/info/rfc8746>. | <https://www.rfc-editor.org/info/rfc8746>. | |||
[SIPHASH_LNCS] | [SIPHASH_LNCS] | |||
Aumasson, J. and D. Bernstein, "SipHash: A Fast Short- | Aumasson, J. and D. Bernstein, "SipHash: A Fast Short- | |||
Input PRF", Lecture Notes in Computer Science pp. 489-508, | Input PRF", Progress in Cryptology - INDOCRYPT 2012, pp. | |||
DOI 10.1007/978-3-642-34931-7_28, 2012, | 489-508, DOI 10.1007/978-3-642-34931-7_28, 2012, | |||
<https://doi.org/10.1007/978-3-642-34931-7_28>. | <https://doi.org/10.1007/978-3-642-34931-7_28>. | |||
[SIPHASH_OPEN] | [SIPHASH_OPEN] | |||
Aumasson, J. and D.J. Bernstein, "SipHash: a fast short- | Aumasson, J. and D.J. Bernstein, "SipHash: a fast short- | |||
input PRF", <https://131002.net/siphash/siphash.pdf>. | input PRF", <https://www.aumasson.jp/siphash/siphash.pdf>. | |||
[YAML] Ben-Kiki, O., Evans, C., and I.d. Net, "YAML Ain't Markup | [YAML] Ben-Kiki, O., Evans, C., and I.d. Net, "YAML Ain't Markup | |||
Language (YAML[TM]) Version 1.2", 3rd Edition, October | Language (YAML[TM]) Version 1.2", 3rd Edition, October | |||
2009, <http://www.yaml.org/spec/1.2/spec.html>. | 2009, <https://www.yaml.org/spec/1.2/spec.html>. | |||
Appendix A. Examples of Encoded CBOR Data Items | Appendix A. Examples of Encoded CBOR Data Items | |||
The following table provides some CBOR-encoded values in hexadecimal | The following table provides some CBOR-encoded values in hexadecimal | |||
(right column), together with diagnostic notation for these values | (right column), together with diagnostic notation for these values | |||
(left column). Note that the string "\u00fc" is one form of | (left column). Note that the string "\u00fc" is one form of | |||
diagnostic notation for a UTF-8 string containing the single Unicode | diagnostic notation for a UTF-8 string containing the single Unicode | |||
character U+00FC, LATIN SMALL LETTER U WITH DIAERESIS (u umlaut). | character U+00FC (LATIN SMALL LETTER U WITH DIAERESIS, "ü"). | |||
Similarly, "\u6c34" is a UTF-8 string in diagnostic notation with a | Similarly, "\u6c34" is a UTF-8 string in diagnostic notation with a | |||
single character U+6C34 (CJK UNIFIED IDEOGRAPH-6C34, often | single character U+6C34 (CJK UNIFIED IDEOGRAPH-6C34, "水"), often | |||
representing "water"), and "\ud800\udd51" is a UTF-8 string in | representing "water", and "\ud800\udd51" is a UTF-8 string in | |||
diagnostic notation with a single character U+10151 (GREEK ACROPHONIC | diagnostic notation with a single character U+10151 (GREEK ACROPHONIC | |||
ATTIC FIFTY STATERS). (Note that all these single-character strings | ATTIC FIFTY STATERS, "𐅑"). (Note that all these single-character | |||
could also be represented in native UTF-8 in diagnostic notation, | strings could also be represented in native UTF-8 in diagnostic | |||
just not in an ASCII-only specification.) In the diagnostic notation | notation, just not if an ASCII-only specification is required.) In | |||
provided for bignums, their intended numeric value is shown as a | the diagnostic notation provided for bignums, their intended numeric | |||
decimal number (such as 18446744073709551616) instead of showing a | value is shown as a decimal number (such as 18446744073709551616) | |||
tagged byte string (such as 2(h'010000000000000000')). | instead of a tagged byte string (such as 2(h'010000000000000000')). | |||
+==============================+====================================+ | +==============================+====================================+ | |||
|Diagnostic | Encoded | | |Diagnostic | Encoded | | |||
+==============================+====================================+ | +==============================+====================================+ | |||
|0 | 0x00 | | |0 | 0x00 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
|1 | 0x01 | | |1 | 0x01 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
|10 | 0x0a | | |10 | 0x0a | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
skipping to change at page 63, line 47 ¶ | skipping to change at line 2944 ¶ | |||
Appendix B. Jump Table for Initial Byte | Appendix B. Jump Table for Initial Byte | |||
For brevity, this jump table does not show initial bytes that are | For brevity, this jump table does not show initial bytes that are | |||
reserved for future extension. It also only shows a selection of the | reserved for future extension. It also only shows a selection of the | |||
initial bytes that can be used for optional features. (All unsigned | initial bytes that can be used for optional features. (All unsigned | |||
integers are in network byte order.) | integers are in network byte order.) | |||
+============+================================================+ | +============+================================================+ | |||
| Byte | Structure/Semantics | | | Byte | Structure/Semantics | | |||
+============+================================================+ | +============+================================================+ | |||
| 0x00..0x17 | Unsigned integer 0x00..0x17 (0..23) | | | 0x00..0x17 | unsigned integer 0x00..0x17 (0..23) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x18 | Unsigned integer (one-byte uint8_t follows) | | | 0x18 | unsigned integer (one-byte uint8_t follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x19 | Unsigned integer (two-byte uint16_t follows) | | | 0x19 | unsigned integer (two-byte uint16_t follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x1a | Unsigned integer (four-byte uint32_t follows) | | | 0x1a | unsigned integer (four-byte uint32_t follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x1b | Unsigned integer (eight-byte uint64_t follows) | | | 0x1b | unsigned integer (eight-byte uint64_t follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x20..0x37 | Negative integer -1-0x00..-1-0x17 (-1..-24) | | | 0x20..0x37 | negative integer -1-0x00..-1-0x17 (-1..-24) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x38 | Negative integer -1-n (one-byte uint8_t for n | | | 0x38 | negative integer -1-n (one-byte uint8_t for n | | |||
| | follows) | | | | follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x39 | Negative integer -1-n (two-byte uint16_t for n | | | 0x39 | negative integer -1-n (two-byte uint16_t for n | | |||
| | follows) | | | | follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x3a | Negative integer -1-n (four-byte uint32_t for | | | 0x3a | negative integer -1-n (four-byte uint32_t for | | |||
| | n follows) | | | | n follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x3b | Negative integer -1-n (eight-byte uint64_t for | | | 0x3b | negative integer -1-n (eight-byte uint64_t for | | |||
| | n follows) | | | | n follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x40..0x57 | byte string (0x00..0x17 bytes follow) | | | 0x40..0x57 | byte string (0x00..0x17 bytes follow) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x58 | byte string (one-byte uint8_t for n, and then | | | 0x58 | byte string (one-byte uint8_t for n, and then | | |||
| | n bytes follow) | | | | n bytes follow) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x59 | byte string (two-byte uint16_t for n, and then | | | 0x59 | byte string (two-byte uint16_t for n, and then | | |||
| | n bytes follow) | | | | n bytes follow) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
skipping to change at page 65, line 43 ¶ | skipping to change at line 3036 ¶ | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xba | map (four-byte uint32_t for n, and then n | | | 0xba | map (four-byte uint32_t for n, and then n | | |||
| | pairs of data items follow) | | | | pairs of data items follow) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xbb | map (eight-byte uint64_t for n, and then n | | | 0xbb | map (eight-byte uint64_t for n, and then n | | |||
| | pairs of data items follow) | | | | pairs of data items follow) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xbf | map, pairs of data items follow, terminated by | | | 0xbf | map, pairs of data items follow, terminated by | | |||
| | "break" | | | | "break" | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xc0 | Text-based date/time (data item follows; see | | | 0xc0 | text-based date/time (data item follows; see | | |||
| | Section 3.4.1) | | | | Section 3.4.1) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xc1 | Epoch-based date/time (data item follows; see | | | 0xc1 | epoch-based date/time (data item follows; see | | |||
| | Section 3.4.2) | | | | Section 3.4.2) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xc2 | Positive bignum (data item "byte string" | | | 0xc2 | unsigned bignum (data item "byte string" | | |||
| | follows) | | | | follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xc3 | Negative bignum (data item "byte string" | | | 0xc3 | negative bignum (data item "byte string" | | |||
| | follows) | | | | follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xc4 | Decimal Fraction (data item "array" follows; | | | 0xc4 | decimal Fraction (data item "array" follows; | | |||
| | see Section 3.4.4) | | | | see Section 3.4.4) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xc5 | Bigfloat (data item "array" follows; see | | | 0xc5 | bigfloat (data item "array" follows; see | | |||
| | Section 3.4.4) | | | | Section 3.4.4) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xc6..0xd4 | (tag) | | | 0xc6..0xd4 | (tag) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xd5..0xd7 | Expected Conversion (data item follows; see | | | 0xd5..0xd7 | expected conversion (data item follows; see | | |||
| | Section 3.4.5.2) | | | | Section 3.4.5.2) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xd8..0xdb | (more tags; 1/2/4/8 bytes of tag number and | | | 0xd8..0xdb | (more tags; 1/2/4/8 bytes of tag number and | | |||
| | then a data item follow) | | | | then a data item follow) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xe0..0xf3 | (simple value) | | | 0xe0..0xf3 | (simple value) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xf4 | False | | | 0xf4 | false | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xf5 | True | | | 0xf5 | true | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xf6 | Null | | | 0xf6 | null | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xf7 | Undefined | | | 0xf7 | undefined | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xf8 | (simple value, one byte follows) | | | 0xf8 | (simple value, one byte follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xf9 | Half-Precision Float (two-byte IEEE 754) | | | 0xf9 | half-precision float (two-byte IEEE 754) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xfa | Single-Precision Float (four-byte IEEE 754) | | | 0xfa | single-precision float (four-byte IEEE 754) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xfb | Double-Precision Float (eight-byte IEEE 754) | | | 0xfb | double-precision float (eight-byte IEEE 754) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xff | "break" stop code | | | 0xff | "break" stop code | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
Table 7: Jump Table for Initial Byte | Table 7: Jump Table for Initial Byte | |||
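The regularity of the jump table can be illustrated with a short C sketch that splits an initial byte into its major type (high-order 3 bits) and additional information (low-order 5 bits). The struct and function names here are hypothetical, chosen for illustration only; the bit operations are the same ones used by the pseudocode in Appendix C.

```c
#include <stdint.h>

/* Hypothetical helper type; the field split follows the jump table. */
struct cbor_initial_byte {
    uint8_t major_type;       /* high-order 3 bits, 0..7 */
    uint8_t additional_info;  /* low-order 5 bits, 0..31 */
};

struct cbor_initial_byte split_initial_byte(uint8_t ib) {
    struct cbor_initial_byte r;
    r.major_type = ib >> 5;
    r.additional_info = ib & 0x1f;
    return r;
}

/* Number of argument bytes that follow the initial byte, or -1 if the
 * additional information is not in 0..27 (28..30 are reserved, and 31
 * marks an indefinite-length item or the "break" stop code). */
int argument_length(uint8_t additional_info) {
    if (additional_info < 24)
        return 0;             /* the value is in the initial byte */
    switch (additional_info) {
    case 24: return 1;        /* uint8_t follows */
    case 25: return 2;        /* uint16_t follows */
    case 26: return 4;        /* uint32_t follows */
    case 27: return 8;        /* uint64_t follows */
    default: return -1;
    }
}
```

For example, the row for 0x3b (negative integer, eight-byte uint64_t follows) corresponds to major type 1 with additional information 27.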
Appendix C. Pseudocode | Appendix C. Pseudocode | |||
The well-formedness of a CBOR item can be checked by the pseudocode | The well-formedness of a CBOR item can be checked by the pseudocode | |||
in Figure 1. The data is well-formed if and only if: | in Figure 1. The data is well-formed if and only if: | |||
* the pseudocode does not "fail"; | * the pseudocode does not "fail"; | |||
* after execution of the pseudocode, no bytes are left in the input | * after execution of the pseudocode, no bytes are left in the input | |||
(except in streaming applications) | (except in streaming applications). | |||
The pseudocode has the following prerequisites: | The pseudocode has the following prerequisites: | |||
* take(n) reads n bytes from the input data and returns them as a | * take(n) reads n bytes from the input data and returns them as a | |||
byte string. If n bytes are no longer available, take(n) fails. | byte string. If n bytes are no longer available, take(n) fails. | |||
* uint() converts a byte string into an unsigned integer by | * uint() converts a byte string into an unsigned integer by | |||
interpreting the byte string in network byte order. | interpreting the byte string in network byte order. | |||
* Arithmetic works as in C. | * Arithmetic works as in C. | |||
* All variables are unsigned integers of sufficient range. | * All variables are unsigned integers of sufficient range. | |||
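One possible realization of the take(n) and uint() prerequisites in C is sketched below. It is an assumption-laden illustration rather than part of the pseudocode: the cbor_input cursor type and the name uint_be are hypothetical, failure is modeled as a NULL return, and the caller is assumed to pass at most 8 bytes to uint_be.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical input cursor over a byte buffer. */
struct cbor_input {
    const uint8_t *p;    /* next unread byte */
    const uint8_t *end;  /* one past the last byte */
};

/* take(n): return a pointer to the next n bytes and advance the
 * cursor, or return NULL ("fail") if fewer than n bytes remain. */
const uint8_t *take(struct cbor_input *in, size_t n) {
    if ((size_t)(in->end - in->p) < n)
        return NULL;
    const uint8_t *bytes = in->p;
    in->p += n;
    return bytes;
}

/* uint(): interpret n bytes (n <= 8) as an unsigned integer in
 * network byte order (most significant byte first). */
uint64_t uint_be(const uint8_t *bytes, size_t n) {
    uint64_t v = 0;
    for (size_t i = 0; i < n; i++)
        v = (v << 8) | bytes[i];
    return v;
}
```

With the input 0x1903e8, taking one byte yields the initial byte 0x19, taking two more yields the argument 1000, and any further take of one or more bytes fails.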
Note that "well_formed" returns the major type for well-formed | Note that "well_formed" returns the major type for well-formed | |||
definite length items, but 99 for an indefinite length item (or -1 | definite-length items, but 99 for an indefinite-length item (or -1 | |||
for a "break" stop code, only if "breakable" is set). This is used | for a "break" stop code, only if "breakable" is set). This is used | |||
in "well_formed_indefinite" to ascertain that indefinite length | in "well_formed_indefinite" to ascertain that indefinite-length | |||
strings only contain definite length strings as chunks. | strings only contain definite-length strings as chunks. | |||
well_formed(breakable = false) { | well_formed(breakable = false) { | |||
// process initial bytes | // process initial bytes | |||
ib = uint(take(1)); | ib = uint(take(1)); | |||
mt = ib >> 5; | mt = ib >> 5; | |||
val = ai = ib & 0x1f; | val = ai = ib & 0x1f; | |||
switch (ai) { | switch (ai) { | |||
case 24: val = uint(take(1)); break; | case 24: val = uint(take(1)); break; | |||
case 25: val = uint(take(2)); break; | case 25: val = uint(take(2)); break; | |||
case 26: val = uint(take(4)); break; | case 26: val = uint(take(4)); break; | |||
skipping to change at page 69, line 15 ¶ | skipping to change at line 3169 ¶ | |||
Note that the remaining complexity of a complete CBOR decoder is | Note that the remaining complexity of a complete CBOR decoder is | |||
about presenting data that has been decoded to the application in an | about presenting data that has been decoded to the application in an | |||
appropriate form. | appropriate form. | |||
Major types 0 and 1 are designed in such a way that they can be | Major types 0 and 1 are designed in such a way that they can be | |||
encoded in C from a signed integer without actually doing an if-then- | encoded in C from a signed integer without actually doing an if-then- | |||
else for positive/negative (Figure 2). This uses the fact that | else for positive/negative (Figure 2). This uses the fact that | |||
(-1-n), the transformation for major type 1, is the same as ~n | (-1-n), the transformation for major type 1, is the same as ~n | |||
(bitwise complement) in C unsigned arithmetic; ~n can then be | (bitwise complement) in C unsigned arithmetic; ~n can then be | |||
expressed as (-1)^n for the negative case, while 0^n leaves n | expressed as (-1)^n for the negative case, while 0^n leaves n | |||
unchanged for non-negative. The sign of a number can be converted to | unchanged for nonnegative. The sign of a number can be converted to | |||
-1 for negative and 0 for non-negative (0 or positive) by arithmetic- | -1 for negative and 0 for nonnegative (0 or positive) by arithmetic- | |||
shifting the number by one bit less than the bit length of the number | shifting the number by one bit less than the bit length of the number | |||
(for example, by 63 for 64-bit numbers). | (for example, by 63 for 64-bit numbers). | |||
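As a self-contained illustration of the sign trick described above, the sketch below writes the complete head into a caller-supplied buffer. The function name "encode_sint_buf" and the buffer parameter are assumptions made for this example. The sketch also assumes that ">>" on a negative int64_t is an arithmetic shift, which C leaves implementation-defined.

```c
#include <stddef.h>
#include <stdint.h>

/* Writes the CBOR encoding of the signed integer n into out, which
   must have room for 9 bytes, and returns the number of bytes
   written.  Name and signature are illustrative only. */
static size_t encode_sint_buf(int64_t n, uint8_t *out) {
    /* Assumes arithmetic right shift: 0 for nonnegative n,
       all ones for negative n. */
    uint64_t sign = (uint64_t)(n >> 63);
    uint8_t mt = (uint8_t)(sign & 0x20);  /* 0x00, or 0x20 for major type 1 */
    uint64_t ui = sign ^ (uint64_t)n;     /* n, or ~n, which equals -1-n */
    size_t len = 0;
    uint8_t ai;
    int extra;                            /* argument bytes after the initial byte */

    if (ui < 24) {                        /* argument fits in the initial byte */
        out[len++] = (uint8_t)(mt + ui);
        return len;
    }
    if (ui <= 0xff)            { ai = 24; extra = 1; }
    else if (ui <= 0xffff)     { ai = 25; extra = 2; }
    else if (ui <= 0xffffffff) { ai = 26; extra = 4; }
    else                       { ai = 27; extra = 8; }

    out[len++] = (uint8_t)(mt + ai);
    for (int i = extra - 1; i >= 0; i--)  /* network byte order */
        out[len++] = (uint8_t)(ui >> (8 * i));
    return len;
}
```

The only branches left are those that select the argument length; the choice between major type 0 and major type 1 is made entirely by the mask and the exclusive-or.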
void encode_sint(int64_t n) { | void encode_sint(int64_t n) { | |||
uint64_t ui = n >> 63; // extend sign to whole length | uint64_t ui = n >> 63; // extend sign to whole length | |||
unsigned mt = ui & 0x20; // extract (shifted) major type | unsigned mt = ui & 0x20; // extract (shifted) major type | |||
ui ^= n; // complement negatives | ui ^= n; // complement negatives | |||
if (ui < 24) | if (ui < 24) | |||
*p++ = mt + ui; | *p++ = mt + ui; | |||
else if (ui < 256) { | else if (ui < 256) { | |||
skipping to change at page 73, line 27 ¶ | skipping to change at line 3356 ¶ | |||
| | 00 00 04 31 00 13 00 00 00 | | | | | 00 00 04 31 00 13 00 00 00 | | | |||
| | 10 30 00 02 00 00 00 10 31 | | | | | 10 30 00 02 00 00 00 10 31 | | | |||
| | 00 03 00 00 00 00 00 | | | | | 00 03 00 00 00 00 00 | | | |||
+-------------+----------------------------+----------------+ | +-------------+----------------------------+----------------+ | |||
| CBOR | 82 01 82 02 03 | 9f 01 82 02 03 | | | CBOR | 82 01 82 02 03 | 9f 01 82 02 03 | | |||
| | | ff | | | | | ff | | |||
+-------------+----------------------------+----------------+ | +-------------+----------------------------+----------------+ | |||
Table 8: Examples for Different Levels of Conciseness | Table 8: Examples for Different Levels of Conciseness | |||
Appendix F. Well-formedness errors and examples | Appendix F. Well-Formedness Errors and Examples | |||
There are three basic kinds of well-formedness errors that can occur | There are three basic kinds of well-formedness errors that can occur | |||
in decoding a CBOR data item: | in decoding a CBOR data item: | |||
* Too much data: There are input bytes left that were not consumed. | Too much data: There are input bytes left that were not consumed. | |||
This is only an error if the application assumed that the input | This is only an error if the application assumed that the input | |||
bytes would span exactly one data item. Where the application | bytes would span exactly one data item. Where the application | |||
uses the self-delimiting nature of CBOR encoding to permit | uses the self-delimiting nature of CBOR encoding to permit | |||
additional data after the data item, as is for example done in | additional data after the data item, as is done in CBOR sequences | |||
CBOR sequences [RFC8742], the CBOR decoder can simply indicate | [RFC8742], for example, the CBOR decoder can simply indicate which | |||
what part of the input has not been consumed. | part of the input has not been consumed. | |||
* Too little data: The input data available would need additional | Too little data: The input data available would need additional | |||
bytes added at their end for a complete CBOR data item. This may | bytes added at their end for a complete CBOR data item. This may | |||
indicate the input is truncated; it is also a common error when | indicate the input is truncated; it is also a common error when | |||
trying to decode random data as CBOR. For some applications, | trying to decode random data as CBOR. For some applications, | |||
however, this may not actually be an error, as the application may | however, this may not actually be an error, as the application may | |||
not be certain it has all the data yet and can obtain or wait for | not be certain it has all the data yet and can obtain or wait for | |||
additional input bytes. Some of these applications may have an | additional input bytes. Some of these applications may have an | |||
upper limit for how much additional data can show up; here the | upper limit for how much additional data can appear; here the | |||
decoder may be able to indicate that the encoded CBOR data item | decoder may be able to indicate that the encoded CBOR data item | |||
cannot be completed within this limit. | cannot be completed within this limit. | |||
* Syntax error: The input data are not consistent with the | Syntax error: The input data are not consistent with the | |||
requirements of the CBOR encoding, and this cannot be remedied by | requirements of the CBOR encoding, and this cannot be remedied by | |||
adding (or removing) data at the end. | adding (or removing) data at the end. | |||
In Appendix C, errors of the first kind are addressed in the first | In Appendix C, errors of the first kind are addressed in the first | |||
paragraph/bullet list (requiring "no bytes are left"), and errors of | paragraph and bullet list (requiring "no bytes are left"), and errors | |||
the second kind are addressed in the second paragraph/bullet list | of the second kind are addressed in the second paragraph/bullet list | |||
(failing "if n bytes are no longer available"). Errors of the third | (failing "if n bytes are no longer available"). Errors of the third | |||
kind are identified in the pseudocode by specific instances of | kind are identified in the pseudocode by specific instances of | |||
calling fail(), in order: | calling fail(), in order: | |||
* a reserved value is used for additional information (28, 29, 30) | * a reserved value is used for additional information (28, 29, 30) | |||
* major type 7, additional information 24, value < 32 (incorrect) | * major type 7, additional information 24, value < 32 (incorrect) | |||
* incorrect substructure of indefinite length byte/text string (may | * incorrect substructure of indefinite-length byte string or text | |||
only contain definite length strings of the same major type) | string (may only contain definite-length strings of the same major | |||
type) | ||||
* "break" stop code (mt=7, ai=31) occurs in a value position of a | * "break" stop code (major type 7, additional information 31) occurs | |||
map or except at a position directly in an indefinite length item | in a value position of a map or except at a position directly in | |||
where also another enclosed data item could occur | an indefinite-length item where also another enclosed data item | |||
could occur | ||||
* additional information 31 used with major type 0, 1, or 6 | * additional information 31 used with major type 0, 1, or 6 | |||
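The three kinds of errors and the five fail() cases listed above can be told apart by a checker along the following lines. This is a hedged C sketch modeled on the Appendix C pseudocode, not a reference implementation: all names (wf_check, wf_item, the WF_* constants) are assumptions made for this example, and the recursion has no depth limit, which a production decoder would likely want to add.

```c
#include <stddef.h>
#include <stdint.h>

enum wf_status { WF_OK, WF_TOO_MUCH, WF_TOO_LITTLE, WF_SYNTAX };
enum { WF_FAIL = -2, WF_BREAK = -1, WF_INDEF = 99 };

struct wf { const uint8_t *p; size_t left; enum wf_status err; };

static int wf_item(struct wf *s, int breakable);

static int wf_fail(struct wf *s, enum wf_status e) { s->err = e; return WF_FAIL; }

/* Reads an n-byte big-endian value; returns 0 on premature end of input. */
static int wf_arg(struct wf *s, size_t n, uint64_t *val) {
    if (s->left < n) { s->err = WF_TOO_LITTLE; return 0; }
    *val = 0;
    while (n--) { *val = (*val << 8) | *s->p++; s->left--; }
    return 1;
}

static int wf_indefinite(struct wf *s, int mt, int breakable) {
    int it;
    switch (mt) {
    case 2: case 3:                 /* chunks: definite length, same type */
        while ((it = wf_item(s, 1)) != WF_BREAK) {
            if (it == WF_FAIL) return WF_FAIL;
            if (it != mt) return wf_fail(s, WF_SYNTAX);
        }
        break;
    case 4:
        while ((it = wf_item(s, 1)) != WF_BREAK)
            if (it == WF_FAIL) return WF_FAIL;
        break;
    case 5:                         /* a break is not allowed in a value position */
        while ((it = wf_item(s, 1)) != WF_BREAK)
            if (it == WF_FAIL || wf_item(s, 0) == WF_FAIL) return WF_FAIL;
        break;
    case 7:                         /* "break" stop code */
        return breakable ? WF_BREAK : wf_fail(s, WF_SYNTAX);
    default:                        /* major type 0, 1, or 6 with ai 31 */
        return wf_fail(s, WF_SYNTAX);
    }
    return WF_INDEF;
}

static int wf_item(struct wf *s, int breakable) {
    uint64_t ib, val, i;
    if (!wf_arg(s, 1, &ib)) return WF_FAIL;
    int mt = (int)(ib >> 5), ai = (int)(ib & 0x1f);
    val = (uint64_t)ai;
    if (ai >= 24 && ai <= 27) {     /* 1, 2, 4, or 8 argument bytes */
        if (!wf_arg(s, (size_t)1 << (ai - 24), &val)) return WF_FAIL;
    } else if (ai >= 28 && ai <= 30) {
        return wf_fail(s, WF_SYNTAX);   /* reserved additional information */
    } else if (ai == 31) {
        return wf_indefinite(s, mt, breakable);
    }
    switch (mt) {
    case 2: case 3:                 /* string payload */
        if (val > s->left) return wf_fail(s, WF_TOO_LITTLE);
        s->p += (size_t)val; s->left -= (size_t)val;
        break;
    case 4: case 5:                 /* val items, or val key/value pairs */
        for (i = 0; i < val; i++)
            if (wf_item(s, 0) == WF_FAIL ||
                (mt == 5 && wf_item(s, 0) == WF_FAIL))
                return WF_FAIL;
        break;
    case 6:                         /* tag content */
        if (wf_item(s, 0) == WF_FAIL) return WF_FAIL;
        break;
    case 7:                         /* two-byte simple value below 32 */
        if (ai == 24 && val < 32) return wf_fail(s, WF_SYNTAX);
        break;
    }
    return mt;
}

/* Checks that data holds exactly one well-formed CBOR data item. */
static enum wf_status wf_check(const uint8_t *data, size_t len) {
    struct wf s = { data, len, WF_OK };
    if (wf_item(&s, 0) == WF_FAIL) return s.err;
    return s.left ? WF_TOO_MUCH : WF_OK;
}
```

An application that accepts CBOR sequences would treat WF_TOO_MUCH as success and continue decoding at the first unconsumed byte; an application reading from a stream could treat WF_TOO_LITTLE as a request to wait for more input.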
F.1. Examples for CBOR data items that are not well-formed | F.1. Examples of CBOR Data Items That Are Not Well-Formed | |||
This subsection shows a few examples for CBOR data items that are not | This subsection shows a few examples for CBOR data items that are not | |||
well-formed. Each example is a sequence of bytes each shown in | well-formed. Each example is a sequence of bytes, each shown in | |||
hexadecimal; multiple examples in a list are separated by commas. | hexadecimal; multiple examples in a list are separated by commas. | |||
Examples for well-formedness error kind 1 (too much data) can easily | Examples for well-formedness error kind 1 (too much data) can easily | |||
be formed by adding data to a well-formed encoded CBOR data item. | be formed by adding data to a well-formed encoded CBOR data item. | |||
Similarly, examples for well-formedness error kind 2 (too little | Similarly, examples for well-formedness error kind 2 (too little | |||
data) can be formed by truncating a well-formed encoded CBOR data | data) can be formed by truncating a well-formed encoded CBOR data | |||
item. In test suites, it may be beneficial to specifically test with | item. In test suites, it may be beneficial to specifically test with | |||
incomplete data items that would require large amounts of addition to | incomplete data items that would require large amounts of addition to | |||
be completed (for instance by starting the encoding of a string of a | be completed (for instance by starting the encoding of a string of a | |||
very large size). | very large size). | |||
A premature end of the input can occur in a head or within the | A premature end of the input can occur in a head or within the | |||
enclosed data, which may be bare strings or enclosed data items that | enclosed data, which may be bare strings or enclosed data items that | |||
are either counted or should have been ended by a "break" stop code. | are either counted or should have been ended by a "break" stop code. | |||
* End of input in a head: 18, 19, 1a, 1b, 19 01, 1a 01 02, 1b 01 02 | End of input in a head: 18, 19, 1a, 1b, 19 01, 1a 01 02, 1b 01 02 03 | |||
03 04 05 06 07, 38, 58, 78, 98, 9a 01 ff 00, b8, d8, f8, f9 00, fa | 04 05 06 07, 38, 58, 78, 98, 9a 01 ff 00, b8, d8, f8, f9 00, fa 00 | |||
00 00, fb 00 00 00 | 00, fb 00 00 00 | |||
* Definite length strings with short data: 41, 61, 5a ff ff ff ff | Definite-length strings with short data: 41, 61, 5a ff ff ff ff 00, | |||
00, 5b ff ff ff ff ff ff ff ff 01 02 03, 7a ff ff ff ff 00, 7b 7f | 5b ff ff ff ff ff ff ff ff 01 02 03, 7a ff ff ff ff 00, 7b 7f ff | |||
ff ff ff ff ff ff ff 01 02 03 | ff ff ff ff ff ff 01 02 03 | |||
* Definite length maps and arrays not closed with enough items: 81, | Definite-length maps and arrays not closed with enough items: 81, 81 | |||
81 81 81 81 81 81 81 81 81, 82 00, a1, a2 01 02, a1 00, a2 00 00 | 81 81 81 81 81 81 81 81, 82 00, a1, a2 01 02, a1 00, a2 00 00 00 | |||
00 | ||||
* Tag number not followed by tag content: c0 | Tag number not followed by tag content: c0 | |||
* Indefinite length strings not closed by a "break" stop code: 5f 41 | Indefinite-length strings not closed by a "break" stop code: 5f 41 | |||
00, 7f 61 00 | 00, 7f 61 00 | |||
* Indefinite length maps and arrays not closed by a "break" stop | Indefinite-length maps and arrays not closed by a "break" stop | |||
code: 9f, 9f 01 02, bf, bf 01 02 01 02, 81 9f, 9f 80 00, 9f 9f 9f | code: 9f, 9f 01 02, bf, bf 01 02 01 02, 81 9f, 9f 80 00, 9f 9f 9f 9f | |||
9f 9f ff ff ff ff, 9f 81 9f 81 9f 9f ff ff ff | 9f ff ff ff ff, 9f 81 9f 81 9f 9f ff ff ff | |||
A few examples for the five subkinds of well-formedness error kind 3 | A few examples for the five subkinds of well-formedness error kind 3 | |||
(syntax error) are shown below. | (syntax error) are shown below. | |||
Subkind 1: | Subkind 1: | |||
Reserved additional information values: 1c, 1d, 1e, 3c, 3d, 3e, | ||||
* Reserved additional information values: 1c, 1d, 1e, 3c, 3d, 3e, | 5c, 5d, 5e, 7c, 7d, 7e, 9c, 9d, 9e, bc, bd, be, dc, dd, de, fc, | |||
5c, 5d, 5e, 7c, 7d, 7e, 9c, 9d, 9e, bc, bd, be, dc, dd, de, fc, | fd, fe, | |||
fd, fe, | ||||
Subkind 2: | Subkind 2: | |||
Reserved two-byte encodings of simple values: f8 00, f8 01, f8 | ||||
* Reserved two-byte encodings of simple values: f8 00, f8 01, f8 18, | 18, f8 1f | |||
f8 1f | ||||
Subkind 3: | Subkind 3: | |||
Indefinite-length string chunks not of the correct type: 5f 00 | ||||
ff, 5f 21 ff, 5f 61 00 ff, 5f 80 ff, 5f a0 ff, 5f c0 00 ff, 5f | ||||
e0 ff, 7f 41 00 ff | ||||
* Indefinite length string chunks not of the correct type: 5f 00 ff, | Indefinite-length string chunks not definite length: 5f 5f 41 00 | |||
5f 21 ff, 5f 61 00 ff, 5f 80 ff, 5f a0 ff, 5f c0 00 ff, 5f e0 ff, | ff ff, 7f 7f 61 00 ff ff | |||
7f 41 00 ff | ||||
* Indefinite length string chunks not definite length: 5f 5f 41 00 | ||||
ff ff, 7f 7f 61 00 ff ff | ||||
Subkind 4: | Subkind 4: | |||
Break occurring on its own outside of an indefinite-length | ||||
item: ff | ||||
* Break occurring on its own outside of an indefinite length item: | Break occurring in a definite-length array or map or a tag: 81 | |||
ff | ff, 82 00 ff, a1 ff, a1 ff 00, a1 00 ff, a2 00 00 ff, 9f 81 ff, | |||
9f 82 9f 81 9f 9f ff ff ff ff | ||||
* Break occurring in a definite length array or map or a tag: 81 ff, | ||||
82 00 ff, a1 ff, a1 ff 00, a1 00 ff, a2 00 00 ff, 9f 81 ff, 9f 82 | ||||
9f 81 9f 9f ff ff ff ff | ||||
* Break in indefinite length map would lead to odd number of items | Break in an indefinite-length map that would lead to an odd | |||
(break in a value position): bf 00 ff, bf 00 00 00 ff | number of items (break in a value position): bf 00 ff, bf 00 00 | |||
00 ff | ||||
Subkind 5: | Subkind 5: | |||
Major type 0, 1, 6 with additional information 31: 1f, 3f, df | ||||
* Major type 0, 1, 6 with additional information 31: 1f, 3f, df | ||||
Appendix G. Changes from RFC 7049 | Appendix G. Changes from RFC 7049 | |||
As discussed in the introduction, this document is a revised edition | As discussed in the introduction, this document formally obsoletes | |||
of RFC 7049, with editorial improvements, added detail, and fixed | RFC 7049 while keeping full compatibility with the interchange format | |||
errata. This document formally obsoletes RFC 7049, while keeping | from RFC 7049. This document provides editorial improvements, added | |||
full compatibility of the interchange format from RFC 7049. This | detail, and fixed errata. This document does not create a new | |||
document does not create a new version of the format. | version of the format. | |||
G.1. Errata processing, clerical changes | G.1. Errata Processing and Clerical Changes | |||
The two verified errata on RFC 7049, EID 3764 and EID 3770, concerned | The two verified errata on RFC 7049, [Err3764] and [Err3770], | |||
two encoding examples in the text that have been corrected | concerned two encoding examples in the text that have been corrected | |||
(Section 3.4.3: "29" -> "49", Section 5.5: "0b000_11101" -> | (Section 3.4.3: "29" -> "49", Section 5.5: "0b000_11101" -> | |||
"0b000_11001"). Also, RFC 7049 contained an example using the | "0b000_11001"). Also, RFC 7049 contained an example using the | |||
numeric value 24 for a simple value (EID 5917), which is not well- | numeric value 24 for a simple value [Err5917], which is not well- | |||
formed; this example has been removed. Errata report 5763 pointed to | formed; this example has been removed. Errata report 5763 [Err5763] | |||
an accident in the wording of the definition of tags; this was | pointed to an error in the wording of the definition of tags; this | |||
resolved during a re-write of Section 3.4. Errata report 5434 | was resolved during a rewrite of Section 3.4. Errata report 5434 | |||
pointed out that the UBJSON example in Appendix E no longer complied | [Err5434] pointed out that the Universal Binary JSON (UBJSON) example | |||
with the version of UBJSON current at the time of submitting the | in Appendix E no longer complied with the version of UBJSON current | |||
report. It turned out that the UBJSON specification had completely | at the time of the errata report submission. It turned out that the | |||
changed since 2013; this example therefore also was removed. Further | UBJSON specification had completely changed since 2013; this example | |||
errata reports (4409, 4963, 4964) complained that the map key sorting | therefore was removed. Other errata reports [Err4409] [Err4963] | |||
rules for canonical encoding were onerous; these led to a | [Err4964] complained that the map key sorting rules for canonical | |||
reconsideration of the canonical encoding suggestions and replacement | encoding were onerous; these led to a reconsideration of the | |||
by the deterministic encoding suggestions (described below). An | canonical encoding suggestions and replacement by the deterministic | |||
editorial suggestion in errata report 4294 was also implemented | encoding suggestions (described below). An editorial suggestion in | |||
(improved symmetry by adding "Second value" to a comment to the last | errata report 4294 [Err4294] was also implemented (improved symmetry | |||
example in Section 3.2.2). | by adding "Second value" to a comment to the last example in | |||
Section 3.2.2). | ||||
Other more clerical changes include: | Other clerical changes include: | |||
* use of new RFCXML functionality [RFC7991]; | * the use of new xml2rfc functionality [RFC7991]; | |||
* explain some more of the notation used; | * more explanation of the notation used; | |||
* updated references, e.g. for RFC4627 to [RFC8259] in many places, | * the update of references, e.g., from RFC 4627 to [RFC8259], from | |||
for CNN-TERMS to [RFC7228]; added missing reference to [IEEE754] | CNN-TERMS to [RFC7228], and from the 5.1 edition to the 11th | |||
(importing required definitions) and updated to [ECMA262]; added a | edition of [ECMA262]; the addition of a reference to [IEEE754] and | |||
reference to [RFC8618] that further illustrates the discussion in | importation of required definitions; the addition of references to | |||
Appendix E; | [C] and [Cplusplus20]; and the addition of a reference to | |||
[RFC8618] that further illustrates the discussion in Appendix E; | ||||
* the discussion of diagnostic notation mentions the "Extended | * in the discussion of diagnostic notation (Section 8), the | |||
Diagnostic Notation" (EDN) defined in [RFC8610] as well as the gap | "Extended Diagnostic Notation" (EDN) defined in [RFC8610] is now | |||
diagnostic notation has in representing NaN payloads; an | mentioned, the gap in representing NaN payloads is now | |||
explanation was added on how to represent indefinite length | highlighted, and an explanation of representing indefinite-length | |||
strings with no chunks; | strings with no chunks has been added (Section 8.1); | |||
* the addition of this appendix. | * the addition of this appendix. | |||
G.2. Changes in IANA considerations | G.2. Changes in IANA Considerations | |||
The IANA considerations were generally updated (clerical changes, | The IANA considerations were generally updated (clerical changes, | |||
e.g., now pointing to the CBOR working group as the author of the | e.g., now pointing to the CBOR Working Group as the author of the | |||
specification). References to the respective IANA registries have | specification). References to the respective IANA registries were | |||
been added to the informative references. | added to the informative references. | |||
Tags in the space from 256 to 32767 (lower half of "1+2") are no | In the "Concise Binary Object Representation (CBOR) Tags" registry | |||
longer assigned by First Come First Served; this range is now | [IANA.cbor-tags], tags in the space from 256 to 32767 (lower half of | |||
Specification Required. | "1+2") are no longer assigned by First Come First Served; this range | |||
is now Specification Required. | ||||
G.3. Changes in suggestions and other informational components | G.3. Changes in Suggestions and Other Informational Components | |||
In revising the document, beyond processing errata reports, the WG | While revising the document, beyond the addressing of the errata | |||
could use nearly seven years of experience with the use of CBOR in a | reports, the working group drew upon nearly seven years of experience | |||
diverse set of applications. This led to a number of editorial | with CBOR in a diverse set of applications. This led to a number of | |||
changes, including adding tables for illustration, but also to | editorial changes, including adding tables for illustration, but also | |||
emphasizing some aspects and de-emphasizing others. | emphasizing some aspects and de-emphasizing others. | |||
A significant addition in this revision is Section 2, which discusses | A significant addition is Section 2, which discusses the CBOR data | |||
the CBOR data model and its small variations involved in the | model and its small variations involved in the processing of CBOR. | |||
processing of CBOR. Introducing terms for those (basic generic, | The introduction of terms for those variations (basic generic, | |||
extended generic, specific) enables more concise language in other | extended generic, specific) enables more concise language in other | |||
places of the document, but also helps in clarifying expectations on | places of the document and also helps to clarify expectations of | |||
implementations and on the extensibility features of the format. | implementations and of the extensibility features of the format. | |||
RFC 7049, as a format derived from the JSON ecosystem, was influenced | As a format derived from the JSON ecosystem, RFC 7049 was influenced | |||
by the JSON number system that was in turn inherited from JavaScript | by the JSON number system that was in turn inherited from JavaScript | |||
at the time. JSON does not provide distinct integers and floating- | at the time. JSON does not provide distinct integers and floating- | |||
point values (and the latter are decimal in the format). CBOR | point values (and the latter are decimal in the format). CBOR | |||
provides binary representations of numbers, which do differ between | provides binary representations of numbers, which do differ between | |||
integers and floating-point values. Experience from implementation | integers and floating-point values. Experience from implementation | |||
and use now suggested that the separation between these two number | and use suggested that the separation between these two number | |||
domains should be more clearly drawn in the document; language that | domains should be more clearly drawn in the document; language that | |||
suggested an integer could seamlessly stand in for a floating-point | suggested an integer could seamlessly stand in for a floating-point | |||
value was removed. Also, a suggestion (based on I-JSON [RFC7493]) | value was removed. Also, a suggestion (based on I-JSON [RFC7493]) | |||
was added for handling these types when converting JSON to CBOR, and | was added for handling these types when converting JSON to CBOR, and | |||
the use of a specific rounding mechanism has been recommended. | the use of a specific rounding mechanism has been recommended. | |||
For a single value in the data model, CBOR often provides multiple | For a single value in the data model, CBOR often provides multiple | |||
encoding options. The revision adds a new section Section 4, which | encoding options. A new section (Section 4) introduces the term | |||
first introduces the term "preferred serialization" (Section 4.1) and | "preferred serialization" (Section 4.1) and defines it for various | |||
defines it for various kinds of data items. On the basis of this | kinds of data items. On the basis of this terminology, the section | |||
terminology, the section goes on to discuss how a CBOR-based protocol | then discusses how a CBOR-based protocol can define "deterministic | |||
can define "deterministic encoding" (Section 4.2), which now avoids | encoding" (Section 4.2), which avoids terms "canonical" and | |||
the RFC 7049 terms "canonical" and "canonicalization". The | "canonicalization" from RFC 7049. The suggestion of "Core | |||
suggestion of "Core Deterministic Encoding Requirements" | Deterministic Encoding Requirements" (Section 4.2.1) enables generic | |||
Section 4.2.1 enables generic support for such protocol-defined | support for such protocol-defined encoding requirements. This | |||
encoding requirements. The present revision further eases the | document further eases the implementation of deterministic encoding | |||
implementation of deterministic encoding by simplifying the map | by simplifying the map ordering suggested in RFC 7049 to a simple | |||
ordering suggested in RFC 7049 to simple lexicographic ordering of | lexicographic ordering of encoded keys. A description of the older | |||
encoded keys. A description of the older suggestion is kept as an | suggestion is kept as an alternative, now termed "length-first map | |||
alternative, now termed "length-first map key ordering" | key ordering" (Section 4.2.3). | |||
(Section 4.2.3). | ||||
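The difference between the two map key orderings mentioned above can be sketched as a pair of C comparators over already-encoded keys. The type "struct enc_key" and both function names are assumptions made for this illustration; each function returns a negative, zero, or positive value in the style of memcmp().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical holder for the encoded bytes of one map key. */
struct enc_key { const uint8_t *bytes; size_t len; };

/* Bytewise lexicographic ordering of the encoded keys: compare byte
   by byte; if one key is a prefix of the other, the shorter key
   sorts first. */
static int cmp_bytewise(const struct enc_key *a, const struct enc_key *b) {
    size_t n = a->len < b->len ? a->len : b->len;
    int c = memcmp(a->bytes, b->bytes, n);
    if (c != 0)
        return c;
    return (a->len > b->len) - (a->len < b->len);
}

/* Length-first ordering: a shorter encoded key always sorts first;
   keys of equal length are compared bytewise. */
static int cmp_length_first(const struct enc_key *a, const struct enc_key *b) {
    if (a->len != b->len)
        return a->len < b->len ? -1 : 1;
    return memcmp(a->bytes, b->bytes, a->len);
}
```

For example, the key 100 (encoded as 18 64) sorts before the key -1 (encoded as 20) under bytewise lexicographic ordering, but after it under length-first ordering, because the encoding of -1 is shorter.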
The terminology for well-formed and valid data was sharpened and more | The terminology for well-formed and valid data was sharpened and more | |||
stringently used, avoiding less well-defined alternative terms such | stringently used, avoiding less well-defined alternative terms such | |||
as "syntax error", "decoding error" and "strict mode" outside | as "syntax error", "decoding error", and "strict mode" outside of | |||
examples. Also, a third level of requirements beyond CBOR-level | examples. Also, a third level of requirements that an application | |||
validity that an application has on its input data is now explicitly | has on its input data beyond CBOR-level validity is now explicitly | |||
called out. Well-formed (processable at all), valid (checked by a | called out. Well-formed (processable at all), valid (checked by a | |||
validity-checking generic decoder), and expected input (as checked by | validity-checking generic decoder), and expected input (as checked by | |||
the application) are treated as a hierarchy of layers of | the application) are treated as a hierarchy of layers of | |||
acceptability. | acceptability. | |||
The handling of non-well-formed simple values was clarified in text | The handling of non-well-formed simple values was clarified in text | |||
and pseudocode. Appendix F was added to discuss well-formedness | and pseudocode. Appendix F was added to discuss well-formedness | |||
errors and provide examples for them. The pseudocode was updated to | errors and provide examples for them. The pseudocode was updated to | |||
be more portable and some portability considerations were added. | be more portable, and some portability considerations were added. | |||
The discussion of validity has been sharpened in two areas. Map | The discussion of validity has been sharpened in two areas. Map | |||
validity (handling of duplicate keys) was clarified and the domain of | validity (handling of duplicate keys) was clarified, and the domain | |||
applicability of certain implementation choices explained. Also, | of applicability of certain implementation choices explained. Also, | |||
while streamlining the terminology for tags, tag numbers, and tag | while streamlining the terminology for tags, tag numbers, and tag | |||
content, discussion was added on tag validity, and the restrictions | content, discussion was added on tag validity, and the restrictions | |||
were clarified on tag content, in general and specifically for tag 1. | were clarified on tag content, in general and specifically for tag 1. | |||
An implementation note (and note for future tag definitions) was | An implementation note (and note for future tag definitions) was | |||
added to Section 3.4 about defining tags with semantics that depend | added to Section 3.4 about defining tags with semantics that depend | |||
on serialization order. | on serialization order. | |||
Tag 35 is no longer defined in this updated document; the | Tag 35 is not defined by this document; the registration based on the | |||
registration based on the definition in RFC 7049 remains in place. | definition in RFC 7049 remains in place. | |||
Terminology was introduced in Section 3 for "argument" and "head", | Terminology was introduced in Section 3 for "argument" and "head", | |||
simplifying further discussion. | simplifying further discussion. | |||
The security considerations were mostly rewritten and significantly | The security considerations (Section 10) were mostly rewritten and | |||
expanded; in multiple other places, the document is now more explicit | significantly expanded; in multiple other places, the document is now | |||
that a decoder cannot simply condone well-formedness errors. | more explicit that a decoder cannot simply condone well-formedness | |||
errors. | ||||
Acknowledgements | Acknowledgements | |||
CBOR was inspired by MessagePack. MessagePack was developed and | CBOR was inspired by MessagePack. MessagePack was developed and | |||
promoted by Sadayuki Furuhashi ("frsyuki"). This reference to | promoted by Sadayuki Furuhashi ("frsyuki"). This reference to | |||
MessagePack is solely for attribution; CBOR is not intended as a | MessagePack is solely for attribution; CBOR is not intended as a | |||
version of or replacement for MessagePack, as it has different design | version of, or replacement for, MessagePack, as it has different | |||
goals and requirements. | design goals and requirements. | |||
The need for functionality beyond the original MessagePack | The need for functionality beyond the original MessagePack | |||
Specification became obvious to many people at about the same time | specification became obvious to many people at about the same time | |||
around the year 2012. BinaryPack is a minor derivation of | around the year 2012. BinaryPack is a minor derivation of | |||
MessagePack that was developed by Eric Zhang for the binaryjs | MessagePack that was developed by Eric Zhang for the binaryjs | |||
project. A similar, but different, extension was made by Tim Caswell | project. A similar, but different, extension was made by Tim Caswell | |||
for his msgpack-js and msgpack-js-browser projects. Many people have | for his msgpack-js and msgpack-js-browser projects. Many people have | |||
contributed to the discussion about extending MessagePack to separate | contributed to the discussion about extending MessagePack to separate | |||
text string representation from byte string representation. | text string representation from byte string representation. | |||
The encoding of the additional information in CBOR was inspired by | The encoding of the additional information in CBOR was inspired by | |||
the encoding of length information designed by Klaus Hartke for CoAP. | the encoding of length information designed by Klaus Hartke for CoAP. | |||
skipping to change at page 79, line 42 ¶ | skipping to change at line 3658 ¶ | |||
Richardson, Nico Williams, Peter Occil, Phillip Hallam-Baker, Ray | Richardson, Nico Williams, Peter Occil, Phillip Hallam-Baker, Ray | |||
Polk, Stuart Cheshire, Tim Bray, Tony Finch, Tony Hansen, and Yaron | Polk, Stuart Cheshire, Tim Bray, Tony Finch, Tony Hansen, and Yaron | |||
Sheffer. Benjamin Kaduk provided an extensive review during IESG | Sheffer. Benjamin Kaduk provided an extensive review during IESG | |||
processing. Éric Vyncke, Erik Kline, Robert Wilton, and Roman Danyliw | processing. Éric Vyncke, Erik Kline, Robert Wilton, and Roman Danyliw | |||
provided further IESG comments, which included an IoT directorate | provided further IESG comments, which included an IoT directorate | |||
review by Eve Schooler. | review by Eve Schooler. | |||
Authors' Addresses | Authors' Addresses | |||
Carsten Bormann | Carsten Bormann | |||
Universitaet Bremen TZI | Universität Bremen TZI | |||
Postfach 330440 | Postfach 330440 | |||
D-28359 Bremen | D-28359 Bremen | |||
Germany | Germany | |||
Phone: +49-421-218-63921 | Phone: +49-421-218-63921 | |||
Email: cabo@tzi.org | Email: cabo@tzi.org | |||
Paul Hoffman | Paul Hoffman | |||
ICANN | ICANN | |||
Email: paul.hoffman@icann.org | Email: paul.hoffman@icann.org | |||
End of changes. 299 change blocks. | ||||
906 lines changed or deleted | 950 lines changed or added | |||