--- 1/draft-ietf-cbor-7049bis-00.txt 2017-10-14 13:13:11.338842153 -0700
+++ 2/draft-ietf-cbor-7049bis-01.txt 2017-10-14 13:13:11.462845114 -0700
@@ -1,19 +1,19 @@
 Network Working Group                                         C. Bormann
 Internet-Draft                                   Universitaet Bremen TZI
 Intended status: Standards Track                              P. Hoffman
-Expires: October 14, 2017                                          ICANN
-                                                          April 12, 2017
+Expires: April 17, 2018                                            ICANN
+                                                        October 14, 2017


               Concise Binary Object Representation (CBOR)
-                       draft-ietf-cbor-7049bis-00
+                       draft-ietf-cbor-7049bis-01

 Abstract

    The Concise Binary Object Representation (CBOR) is a data format
    whose design goals include the possibility of extremely small code
    size, fairly small message size, and extensibility without the need
    for version negotiation. These design goals make it different from
    earlier binary serializations such as ASN.1 and MessagePack.

 Contributing
@@ -36,21 +36,21 @@
    Internet-Drafts are working documents of the Internet Engineering
    Task Force (IETF). Note that other groups may also distribute
    working documents as Internet-Drafts. The list of current Internet-
    Drafts is at http://datatracker.ietf.org/drafts/current/.

    Internet-Drafts are draft documents valid for a maximum of six months
    and may be updated, replaced, or obsoleted by other documents at any
    time. It is inappropriate to use Internet-Drafts as reference
    material or to cite them other than as "work in progress."

-   This Internet-Draft will expire on October 14, 2017.
+   This Internet-Draft will expire on April 17, 2018.

 Copyright Notice

    Copyright (c) 2017 IETF Trust and the persons identified as the
    document authors. All rights reserved.

    This document is subject to BCP 78 and the IETF Trust's Legal
    Provisions Relating to IETF Documents
    (http://trustee.ietf.org/license-info) in effect on the date of
    publication of this document. Please review these documents
@@ -72,68 +72,70 @@
        2.2.2.  Indefinite-Length Byte Strings and Text Strings . . .  11
      2.3.  Floating-Point Numbers and Values with No Content . . . .  12
      2.4.  Optional Tagging of Items . . . . . . . . . . . . . . . .  14
        2.4.1.  Date and Time . . . . . . . . . . . . . . . . . . . .  16
        2.4.2.  Bignums . . . . . . . . . . . . . . . . . . . . . . .  16
        2.4.3.  Decimal Fractions and Bigfloats . . . . . . . . . . .  16
        2.4.4.  Content Hints . . . . . . . . . . . . . . . . . . . .  18
          2.4.4.1.  Encoded CBOR Data Item  . . . . . . . . . . . . .  18
          2.4.4.2.  Expected Later Encoding for CBOR-to-JSON
                    Converters  . . . . . . . . . . . . . . . . . . .  18
-         2.4.4.3.  Encoded Text  . . . . . . . . . . . . . . . . . .  18
+         2.4.4.3.  Encoded Text  . . . . . . . . . . . . . . . . . .  19
        2.4.5.  Self-Describe CBOR  . . . . . . . . . . . . . . . . .  19
-   3.  Creating CBOR-Based Protocols . . . . . . . . . . . . . . . .  19
-     3.1.  CBOR in Streaming Applications  . . . . . . . . . . . . .  20
-     3.2.  Generic Encoders and Decoders . . . . . . . . . . . . . .  20
-     3.3.  Syntax Errors . . . . . . . . . . . . . . . . . . . . . .  21
-       3.3.1.  Incomplete CBOR Data Items  . . . . . . . . . . . . .  21
-       3.3.2.  Malformed Indefinite-Length Items . . . . . . . . . .  22
-       3.3.3.  Unknown Additional Information Values . . . . . . . .  22
-     3.4.  Other Decoding Errors . . . . . . . . . . . . . . . . . .  22
-     3.5.  Handling Unknown Simple Values and Tags . . . . . . . . .  23
-     3.6.  Numbers . . . . . . . . . . . . . . . . . . . . . . . . .  23
-     3.7.  Specifying Keys for Maps  . . . . . . . . . . . . . . . .  24
-     3.8.  Undefined Values  . . . . . . . . . . . . . . . . . . . .  25
-     3.9.  Canonical CBOR  . . . . . . . . . . . . . . . . . . . . .  26
-     3.10. Strict Mode . . . . . . . . . . . . . . . . . . . . . . .  27
-   4.  Converting Data between CBOR and JSON . . . . . . . . . . . .  28
-     4.1.  Converting from CBOR to JSON  . . . . . . . . . . . . . .  29
-     4.2.  Converting from JSON to CBOR  . . . . . . . . . . . . . .  30
-   5.  Future Evolution of CBOR  . . . . . . . . . . . . . . . . . .  31
-     5.1.  Extension Points  . . . . . . . . . . . . . . . . . . . .  31
-     5.2.  Curating the Additional Information Space . . . . . . . .  32
-   6.  Diagnostic Notation . . . . . . . . . . . . . . . . . . . . .  32
-     6.1.  Encoding Indicators . . . . . . . . . . . . . . . . . . .  33
-   7.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  34
-     7.1.  Simple Values Registry  . . . . . . . . . . . . . . . . .  34
-     7.2.  Tags Registry . . . . . . . . . . . . . . . . . . . . . .  34
-     7.3.  Media Type ("MIME Type")  . . . . . . . . . . . . . . . .  35
-     7.4.  CoAP Content-Format . . . . . . . . . . . . . . . . . . .  36
-     7.5.  The +cbor Structured Syntax Suffix Registration . . . . .  36
-   8.  Security Considerations . . . . . . . . . . . . . . . . . . .  37
-   9.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  38
-   10. References  . . . . . . . . . . . . . . . . . . . . . . . . .  38
-     10.1.  Normative References . . . . . . . . . . . . . . . . . .  38
-     10.2.  Informative References . . . . . . . . . . . . . . . . .  39
-   Appendix A.  Examples . . . . . . . . . . . . . . . . . . . . . .  41
-   Appendix B.  Jump Table . . . . . . . . . . . . . . . . . . . . .  45
-   Appendix C.  Pseudocode . . . . . . . . . . . . . . . . . . . . .  48
-   Appendix D.  Half-Precision . . . . . . . . . . . . . . . . . . .  50
+     2.5.  CBOR Data Models  . . . . . . . . . . . . . . . . . . . .  20
+   3.  Creating CBOR-Based Protocols . . . . . . . . . . . . . . . .  21
+     3.1.  CBOR in Streaming Applications  . . . . . . . . . . . . .  22
+     3.2.  Generic Encoders and Decoders . . . . . . . . . . . . . .  22
+     3.3.  Syntax Errors . . . . . . . . . . . . . . . . . . . . . .  23
+       3.3.1.  Incomplete CBOR Data Items  . . . . . . . . . . . . .  23
+       3.3.2.  Malformed Indefinite-Length Items . . . . . . . . . .  24
+       3.3.3.  Unknown Additional Information Values . . . . . . . .  24
+     3.4.  Other Decoding Errors . . . . . . . . . . . . . . . . . .  24
+     3.5.  Handling Unknown Simple Values and Tags . . . . . . . . .  25
+     3.6.  Numbers . . . . . . . . . . . . . . . . . . . . . . . . .  25
+     3.7.  Specifying Keys for Maps  . . . . . . . . . . . . . . . .  26
+     3.8.  Undefined Values  . . . . . . . . . . . . . . . . . . . .  27
+     3.9.  Canonical CBOR  . . . . . . . . . . . . . . . . . . . . .  28
+     3.10. Strict Mode . . . . . . . . . . . . . . . . . . . . . . .  29
+   4.  Converting Data between CBOR and JSON . . . . . . . . . . . .  30
+     4.1.  Converting from CBOR to JSON  . . . . . . . . . . . . . .  31
+     4.2.  Converting from JSON to CBOR  . . . . . . . . . . . . . .  32
+   5.  Future Evolution of CBOR  . . . . . . . . . . . . . . . . . .  33
+     5.1.  Extension Points  . . . . . . . . . . . . . . . . . . . .  33
+     5.2.  Curating the Additional Information Space . . . . . . . .  34
+   6.  Diagnostic Notation . . . . . . . . . . . . . . . . . . . . .  34
+     6.1.  Encoding Indicators . . . . . . . . . . . . . . . . . . .  35
+   7.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  36
+     7.1.  Simple Values Registry  . . . . . . . . . . . . . . . . .  36
+     7.2.  Tags Registry . . . . . . . . . . . . . . . . . . . . . .  36
+     7.3.  Media Type ("MIME Type")  . . . . . . . . . . . . . . . .  37
+     7.4.  CoAP Content-Format . . . . . . . . . . . . . . . . . . .  38
+     7.5.  The +cbor Structured Syntax Suffix Registration . . . . .  38
+   8.  Security Considerations . . . . . . . . . . . . . . . . . . .  39
+   9.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  40
+   10. References  . . . . . . . . . . . . . . . . . . . . . . . . .  40
+     10.1.  Normative References . . . . . . . . . . . . . . . . . .  40
+     10.2.  Informative References . . . . . . . . . . . . . . . . .  41
+   Appendix A.  Examples . . . . . . . . . . . . . . . . . . . . . .  43
+   Appendix B.  Jump Table . . . . . . . . . . . . . . . . . . . . .  47
+   Appendix C.  Pseudocode . . . . . . . . . . . . . . . . . . . . .  50
+   Appendix D.  Half-Precision . . . . . . . . . . . . . . . . . . .  52
    Appendix E.  Comparison of Other Binary Formats to CBOR's Design
-                Objectives . . . . . . . . . . . . . . . . . . . . .  51
-     E.1.  ASN.1 DER, BER, and PER . . . . . . . . . . . . . . . . .  52
-     E.2.  MessagePack . . . . . . . . . . . . . . . . . . . . . . .  52
-     E.3.  BSON  . . . . . . . . . . . . . . . . . . . . . . . . . .  53
-     E.4.  UBJSON  . . . . . . . . . . . . . . . . . . . . . . . . .  53
-     E.5.  MSDTP: RFC 713  . . . . . . . . . . . . . . . . . . . . .  53
-     E.6.  Conciseness on the Wire . . . . . . . . . . . . . . . . .  53
-   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  54
+                Objectives . . . . . . . . . . . . . . . . . . . . .  53
+     E.1.  ASN.1 DER, BER, and PER . . . . . . . . . . . . . . . . .  54
+     E.2.  MessagePack . . . . . . . . . . . . . . . . . . . . . . .  54
+     E.3.  BSON  . . . . . . . . . . . . . . . . . . . . . . . . . .  55
+     E.4.  UBJSON  . . . . . . . . . . . . . . . . . . . . . . . . .  55
+     E.5.  MSDTP: RFC 713  . . . . . . . . . . . . . . . . . . . . .  55
+     E.6.  Conciseness on the Wire . . . . . . . . . . . . . . . . .  55
+   Appendix F.  Changes from RFC 7049  . . . . . . . . . . . . . . .  56
+   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  56

 1.  Introduction

    There are hundreds of standardized formats for binary representation
    of structured data (also known as binary serialization formats). Of
    those, some are for specific domains of information, while others are
    generalized for arbitrary data. In the IETF, probably the best-known
    formats in the latter category are ASN.1's BER and DER [ASN.1].

    The format defined here follows some specific design goals that are
@@ -603,20 +605,27 @@
    | 32..255 | (Unassigned)    |
    +---------+-----------------+

                         Table 2: Simple Values

    The 5-bit values of 25, 26, and 27 are for 16-bit, 32-bit, and 64-bit
    IEEE 754 binary floating-point values. These floating-point values
    are encoded in the additional bytes of the appropriate size. (See
    Appendix D for some information about 16-bit floating point.)
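As a reviewer's illustration of the context paragraph above (it is not part of either draft revision), the three floating-point encodings can be sketched in Python. The helper names `encode_float` and `decode_float` are hypothetical; the sketch assumes big-endian (network byte order) payloads and uses only the standard `struct` module, whose `e` format code handles IEEE 754 binary16.

```python
import struct

# Major type 7 occupies the top three bits of the initial byte.
MAJOR_TYPE_7 = 7 << 5

# Additional information value -> struct format of the IEEE 754 payload
# (big-endian; 'e' = binary16, 'f' = binary32, 'd' = binary64).
FLOAT_FORMATS = {25: ">e", 26: ">f", 27: ">d"}


def encode_float(value: float, additional_info: int) -> bytes:
    """Encode value using additional information 25, 26, or 27 (hypothetical helper)."""
    fmt = FLOAT_FORMATS[additional_info]
    return bytes([MAJOR_TYPE_7 | additional_info]) + struct.pack(fmt, value)


def decode_float(data: bytes) -> float:
    """Decode a single major type 7 floating-point item from the start of data."""
    initial = data[0]
    info = initial & 0x1F
    if initial >> 5 != 7 or info not in FLOAT_FORMATS:
        raise ValueError("not a floating-point item")
    fmt = FLOAT_FORMATS[info]
    size = struct.calcsize(fmt)
    payload = data[1:1 + size]
    if len(payload) != size:
        raise ValueError("incomplete floating-point item")
    return struct.unpack(fmt, payload)[0]


print(encode_float(1.0, 25).hex())       # f93c00
print(encode_float(100000.0, 26).hex())  # fa47c35000
print(encode_float(1.1, 27).hex())       # fb3ff199999999999a
```

The sketch leaves the choice of width to the caller; picking the shortest encoding that preserves a value is a separate question it does not address.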

+   An encoder MUST NOT encode False as the two-byte sequence of 0xf814,
+   MUST NOT encode True as the two-byte sequence of 0xf815, MUST NOT
+   encode Null as the two-byte sequence of 0xf816, and MUST NOT encode
+   Undefined value as the two-byte sequence of 0xf817. A decoder MUST
+   treat these two-byte sequences as an error. Similar prohibitions
+   apply to the unassigned simple values as well.
+
 2.4.  Optional Tagging of Items

    In CBOR, a data item can optionally be preceded by a tag to give it
    additional semantics while retaining its structure. The tag is major
    type 6, and represents an integer number as indicated by the tag's
    integer value; the (sole) data item is carried as content data. If a
    tag requires structured data, this structure is encoded into the
    nested data item. The definition of a tag usually restricts what
    kinds of nested data item or items can be carried by a tag.

@@ -877,20 +888,102 @@
    use as a distinguishing mark for frequently used file types. In
    particular, it is not a valid start of a Unicode text in any Unicode
    encoding if followed by a valid CBOR data item.

    For instance, a decoder might be able to parse both CBOR and JSON.
    Such a decoder would need to mechanically distinguish the two
    formats. An easy way for an encoder to help the decoder would be to
    tag the entire CBOR item with tag 55799, the serialization of which
    will never be found at the beginning of a JSON text.

+2.5.  CBOR Data Models
+
+   CBOR is explicit about its generic data model, which defines the set
+   of all data items that can be represented in CBOR. Its basic generic
+   data model is extensible by the registration of simple type values
+   and tags. Applications can then subset the resulting extended
+   generic data model to build their specific data models.
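As a reviewer's illustration (not text from either draft revision), the prohibition this revision adds to Section 2.3 can be sketched as a decoder-side check. The function name `decode_simple_value` is hypothetical. The sketch assumes that the "similar prohibitions" extend to every value below 32 carried in the extra byte after 0xf8; that is this sketch's reading, since the added paragraph names only 0xf814 through 0xf817 and the unassigned simple values.

```python
def decode_simple_value(data: bytes) -> int:
    """Return the simple value number at the start of data (hypothetical helper).

    Raises ValueError for two-byte sequences such as 0xf814..0xf817,
    which the added paragraph requires a decoder to treat as an error.
    """
    if not data:
        raise ValueError("empty input")
    initial = data[0]
    if initial >> 5 != 7:
        raise ValueError("not major type 7")
    info = initial & 0x1F
    if info < 24:
        # One-byte form: the value is carried in the initial byte itself.
        return info
    if info == 24:
        if len(data) < 2:
            raise ValueError("incomplete simple value")
        value = data[1]
        # Assumption: every value below 32 is rejected in the two-byte
        # form, not just 20..23 (False, True, Null, Undefined).
        if value < 32:
            raise ValueError(f"two-byte encoding of simple value {value} is an error")
        return value
    raise ValueError("additional information does not denote a simple value")


print(decode_simple_value(bytes.fromhex("f4")))    # 20 (False, one-byte form)
print(decode_simple_value(bytes.fromhex("f8ff")))  # 255
```

An encoder that always emits values below 24 in the one-byte form never produces the prohibited sequences, so the check is only needed on the decoding side.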
+
+   Within environments that can represent the data items in the generic
+   data model, generic CBOR encoders and decoders can be implemented
+   (which usually involves defining additional implementation data types
+   for those data items that do not already have a natural
+   representation in the environment). The ability to provide generic
+   encoders and decoders is an explicit design goal of CBOR; however
+   many applications will provide their own application-specific
+   encoders and/or decoders.
+
+   In the basic (un-extended) generic data model, a data item is one of:
+
+   o  an integer in the range -2**64..2**64-1 inclusive
+
+   o  a simple value, identified by a number between 0 and 255, but
+      distinct from that number
+
+   o  a floating point value, distinct from an integer, out of the set
+      representable by IEEE 754 binary64 (including non-finites)
+
+   o  a sequence of zero or more bytes ("byte string")
+
+   o  a sequence of zero or more Unicode code points ("text string")
+
+   o  a sequence of zero or more data items ("array")
+
+   o  a mapping (mathematical function) from zero or more data items
+      ("keys") each to a data item ("values"), ("map")
+
+   o  a tagged data item, comprising a tag (an integer in the range
+      0..2**64-1) and a value (a data item)
+
+   Note that integer and floating-point values are distinct in this
+   model, even if they have the same numeric value.
+
+   This basic generic data model comes pre-extended by the registration
+   of a number of simple values and tags right in this document, such
+   as:
+
+   o  "false", "true", "null", and "undefined" (simple values identified
+      by 20..23)
+
+   o  integer and floating point values with a larger range and
+      precision than the above (tags 2 to 5)
+
+   o  application data types such as a point in time or an RFC 3339
+      date/time string (tags 1, 0)
+
+   Further elements of the extended generic data model can be (and have
+   been) defined via the IANA registries created for CBOR. Even if such
+   an extension is unknown to a generic encoder or decoder, data items
+   using that extension can be passed to or from the application by
+   representing them at the interface to the application within the
+   basic generic data model, i.e., as generic values of a simple type or
+   generic tagged items.
+
+   In other words, the basic generic data model is stable as defined in
+   this document, while the extended generic data model expands by the
+   registration of new simple values or tags, but never shrinks.
+
+   While there is a strong expectation that generic encoders and
+   decoders can represent "false", "true", and "null" in the form
+   appropriate for their programming environment, implementation of the
+   data model extensions created by tags is truly optional and a matter
+   of implementation quality.
+
+   A specific data model usually subsets the extended generic data model
+   and assigns application semantics to the data items within this
+   subset and its components. When documenting such specific data
+   models, where it is desired to specify the types of data items, it is
+   preferred to identify the types by their names in the generic data
+   model ("negative integer", "array") instead of by referring to
+   aspects of their CBOR representation ("major type 1", "major type
+   4").
+
 3.  Creating CBOR-Based Protocols

    Data formats such as CBOR are often used in environments where there
    is no format negotiation. A specific design goal of CBOR is to not
    need any included or assumed schema: a decoder can take a CBOR item
    and decode it with no other knowledge.

    Of course, in real-world implementations, the encoder and the decoder
    will have a shared view of what should be in a CBOR data item. For
    example, an agreed-to format might be "the item is an array whose
@@ -1773,52 +1866,52 @@
    [ECMA262]  European Computer Manufacturers Association, "ECMAScript
               Language Specification 5.1 Edition", ECMA Standard ECMA-
               262, June 2011, .

    [RFC2045]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
               Extensions (MIME) Part One: Format of Internet Message
               Bodies", RFC 2045, DOI 10.17487/RFC2045, November 1996,
-              .
+              .

    [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
               Requirement Levels", BCP 14, RFC 2119,
-              DOI 10.17487/RFC2119, March 1997,
-              .
+              DOI 10.17487/RFC2119, March 1997, .

    [RFC3339]  Klyne, G. and C. Newman, "Date and Time on the Internet:
               Timestamps", RFC 3339, DOI 10.17487/RFC3339, July 2002,
-              .
+              .

    [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
               10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November
-              2003, .
+              2003, .

    [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
               Resource Identifier (URI): Generic Syntax", STD 66,
               RFC 3986, DOI 10.17487/RFC3986, January 2005,
-              .
+              .

    [RFC4287]  Nottingham, M., Ed. and R. Sayre, Ed., "The Atom
               Syndication Format", RFC 4287, DOI 10.17487/RFC4287,
-              December 2005, .
+              December 2005, .

    [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
               Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006,
-              .
+              .

    [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
-              IANA Considerations Section in RFCs", BCP 26, RFC 5226,
-              DOI 10.17487/RFC5226, May 2008,
-              .
+              IANA Considerations Section in RFCs", RFC 5226,
+              DOI 10.17487/RFC5226, May 2008, .

    [TIME_T]   The Open Group Base Specifications, "Vol. 1: Base
               Definitions, Issue 7", Section 4.15 'Seconds Since the
               Epoch', IEEE Std 1003.1, 2013 Edition, 2013, .

 10.2.  Informative References

    [ASN.1]    International Telecommunication Union, "Information
@@ -1828,35 +1921,35 @@
               X.690, 1994.

    [BSON]     Various, "BSON - Binary JSON", 2013, .

    [MessagePack]
               Furuhashi, S., "MessagePack", 2013, .

    [RFC0713]  Haverty, J., "MSDTP-Message Services Data Transmission
               Protocol", RFC 713, DOI 10.17487/RFC0713, April 1976,
-              .
+              .

    [RFC6838]  Freed, N., Klensin, J., and T. Hansen, "Media Type
               Specifications and Registration Procedures", BCP 13,
               RFC 6838, DOI 10.17487/RFC6838, January 2013,
-              .
+              .

    [RFC7159]  Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
               Interchange Format", RFC 7159, DOI 10.17487/RFC7159, March
-              2014, .
+              2014, .

    [RFC7228]  Bormann, C., Ersue, M., and A. Keranen, "Terminology for
               Constrained-Node Networks", RFC 7228,
-              DOI 10.17487/RFC7228, May 2014,
-              .
+              DOI 10.17487/RFC7228, May 2014, .

    [UBJSON]   The Buzz Media, "Universal Binary JSON Specification",
               2013, .

    [YAML]     Ben-Kiki, O., Evans, C., and I. Net, "YAML Ain't Markup
               Language (YAML[TM]) Version 1.2", 3rd Edition, October
               2009, .

 Appendix A.  Examples
@@ -2470,22 +2563,39 @@
    |             | 00 00                    |                          |
    |             |                          |                          |
    | UBJSON      | 61 02 42 01 61 02 42 02  | 61 ff 42 01 61 02 42 02  |
    |             | 42 03                    | 42 03 45                 |
    |             |                          |                          |
    | CBOR        | 82 01 82 02 03           | 9f 01 82 02 03 ff        |
    +-------------+--------------------------+--------------------------+

         Table 6: Examples for Different Levels of Conciseness

-Authors' Addresses
+Appendix F.  Changes from RFC 7049
+
+   The following is a list of known changes from RFC 7049. This list is
+   non-authoritative. It is meant to help reviewers see the significant
+   differences.
+
+   o  Updated reference for [RFC4627] to [RFC7159] in many places
+   o  Updated reference for [CNN-TERMS] to [RFC7228]
+
+   o  Added a comment to the last example in Section 2.2.1 (added
+      "Second value")
+
+   o  Fixed a bug in the example in Section 2.4.2 ("29" -> "49")
+
+   o  Fixed a bug in the last paragraph of Section 3.6 ("0b000_11101" ->
+      "0b000_11001")
+
+Authors' Addresses

    Carsten Bormann
    Universitaet Bremen TZI
    Postfach 330440
    D-28359 Bremen
    Germany

    Phone: +49-421-218-63921
    EMail: cabo@tzi.org

    Paul Hoffman