--- 1/draft-ietf-cbor-array-tags-07.txt 2019-10-08 08:13:12.549018141 -0700
+++ 2/draft-ietf-cbor-array-tags-08.txt 2019-10-08 08:13:12.581018954 -0700
@@ -1,18 +1,18 @@
 Network Working Group                                    C. Bormann, Ed.
 Internet-Draft                                   Universitaet Bremen TZI
-Intended status: Standards Track                         August 22, 2019
-Expires: February 23, 2020
+Intended status: Standards Track                        October 08, 2019
+Expires: April 10, 2020

    Concise Binary Object Representation (CBOR) Tags for Typed Arrays
-                     draft-ietf-cbor-array-tags-07
+                     draft-ietf-cbor-array-tags-08

 Abstract

    The Concise Binary Object Representation (CBOR, RFC 7049) is a data
    format whose design goals include the possibility of extremely small
    code size, fairly small message size, and extensibility without the
    need for version negotiation.

    The present document makes use of this extensibility to define a
    number of CBOR tags for typed arrays of numeric data, as well as two
@@ -28,21 +28,21 @@
    Internet-Drafts are working documents of the Internet Engineering
    Task Force (IETF). Note that other groups may also distribute
    working documents as Internet-Drafts. The list of current Internet-
    Drafts is at https://datatracker.ietf.org/drafts/current/.

    Internet-Drafts are draft documents valid for a maximum of six months
    and may be updated, replaced, or obsoleted by other documents at any
    time. It is inappropriate to use Internet-Drafts as reference
    material or to cite them other than as "work in progress."

-   This Internet-Draft will expire on February 23, 2020.
+   This Internet-Draft will expire on April 10, 2020.

 Copyright Notice

    Copyright (c) 2019 IETF Trust and the persons identified as the
    document authors. All rights reserved.

    This document is subject to BCP 78 and the IETF Trust's Legal
    Provisions Relating to IETF Documents
    (https://trustee.ietf.org/license-info) in effect on the date of
    publication of this document. Please review these documents
@@ -50,53 +50,53 @@
    to this document.
    Code Components extracted from this document must include Simplified
    BSD License text as described in Section 4.e of the Trust Legal
    Provisions and are provided without warranty as described in the
    Simplified BSD License.

 Table of Contents

    1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
      1.1.  Terminology . . . . . . . . . . . . . . . . . . . . . . .   3
    2.  Typed Arrays  . . . . . . . . . . . . . . . . . . . . . . . .   3
-     2.1.  Types of numbers  . . . . . . . . . . . . . . . . . . . .   3
+     2.1.  Types of numbers  . . . . . . . . . . . . . . . . . . . .   4
    3.  Additional Array Tags . . . . . . . . . . . . . . . . . . . .   5
-     3.1.  Multi-dimensional Array . . . . . . . . . . . . . . . . .   5
+     3.1.  Multi-dimensional Array . . . . . . . . . . . . . . . . .   6
        3.1.1.  Row-major Order . . . . . . . . . . . . . . . . . . .   6
        3.1.2.  Column-Major order  . . . . . . . . . . . . . . . . .   7
      3.2.  Homogeneous Array . . . . . . . . . . . . . . . . . . . .   8
    4.  Discussion  . . . . . . . . . . . . . . . . . . . . . . . . .   9
    5.  CDDL typenames  . . . . . . . . . . . . . . . . . . . . . . .  10
    6.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  11
    7.  Security Considerations . . . . . . . . . . . . . . . . . . .  13
    8.  References  . . . . . . . . . . . . . . . . . . . . . . . . .  14
      8.1.  Normative References  . . . . . . . . . . . . . . . . . .  14
      8.2.  Informative References  . . . . . . . . . . . . . . . . .  14
-   Contributors  . . . . . . . . . . . . . . . . . . . . . . . . . .  14
+   Contributors  . . . . . . . . . . . . . . . . . . . . . . . . . .  15
    Acknowledgements  . . . . . . . . . . . . . . . . . . . . . . . .  15
    Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .  15

 1. Introduction

    The Concise Binary Object Representation (CBOR, [RFC7049]) provides
    for the interchange of structured data without a requirement for a
    pre-agreed schema.
    RFC 7049 defines a basic set of data types, as well as a tagging
    mechanism that enables extending the set of data types supported via
    an IANA registry.

-   Recently, a simple form of typed arrays of numeric data have received
+   Recently, a simple form of typed arrays of numeric data has received
    interest both in the Web graphics community [TypedArray] and in the
    JavaScript specification [TypedArrayES6], as well as in corresponding
    implementations [ArrayBuffer].

    Since these typed arrays may carry significant amounts of data, there
    is interest in interchanging them in CBOR without the need of lengthy
-   conversion of each number in the array. This also can save space
+   conversion of each number in the array. This can also save space
    overhead with encoding a type for each element of an array.

    This document defines a number of interrelated CBOR tags that cover
    these typed arrays, as well as two additional tags for multi-
    dimensional and homogeneous arrays. It is intended as the reference
    document for the IANA registration of the tags defined.

    Note that an application that generates CBOR with these tags has
    considerable freedom in choosing variants, e.g., with respect to
    endianness, embedded type (signed vs. unsigned), and number of bits
@@ -111,30 +111,35 @@

 1.1. Terminology

    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
    "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
    "OPTIONAL" in this document are to be interpreted as described in
    BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
    capitals, as shown here.

    The term "byte" is used in its now customary sense as a synonym for
    "octet". Where bit arithmetic is explained, this document uses the
-   notation familiar from the programming language C (including C++14's
-   0bnnn binary literals), except that the operator "**" stands for
-   exponentiation.
+   notation familiar from the programming language C [C] (including
+   C++14's 0bnnn binary literals [Cplusplus]), except that the operator
+   "**" stands for exponentiation.

    The term "array" is used in a general sense in this document, unless
    further specified. The term "classical CBOR array" describes an
    array represented with CBOR major type 4. A "homogeneous array" is
    an array of elements that are all of the same type (the term is
-   neutral whether that is a representation type or an application data
-   model type).
+   neutral as to whether that is a representation type or an application
+   data model type).
+
+   The terms "big endian" and "little endian" are used to indicate a
+   most significant byte first (MSB first) representation of integers,
+   and a least significant byte first (LSB first) representation,
+   respectively.

 2. Typed Arrays

    Typed arrays are homogeneous arrays of numbers, all of which are
    encoded in a single form of binary representation. The concatenation
    of these representations is encoded as a single CBOR byte string
    (major type 2), enclosed by a single tag indicating the type and
    encoding of all the numbers represented in the byte string.

 2.1. Types of numbers

@@ -174,37 +179,38 @@
    of the tag: Tag values from 64 to 87.

    The value is split up into 5 bit fields: 0b010_f_s_e_ll, as detailed
    in Table 2.

    +-------+-------------------------------------------------------+
    | Field | Use                                                   |
    +-------+-------------------------------------------------------+
    | 0b010 | the constant bits 0, 1, 0                             |
    | f     | 0 for integer, 1 for float                            |
-   | s     | 0 for unsigned integer or float, 1 for signed integer |
+   | s     | 0 for float or unsigned integer, 1 for signed integer |
    | e     | 0 for big endian, 1 for little endian                 |
    | ll    | A number for the length (Table 1).                    |
    +-------+-------------------------------------------------------+

           Table 2: Bit fields in the low 8 bits of the tag

    The number of bytes in each array element can then be calculated by
    "2**(f + ll)" (or "1 << (f + ll)" in a typical programming language).
    (Notice that 0f and ll are the two least significant bits,
    respectively, of each nibble (4bit) in the byte.)

    In the CBOR representation, the total number of elements in the array
    is not expressed explicitly, but implied from the length of the byte
    string and the length of each representation. It can be computed
-   inversely to the previous formula from the length of the byte string
-   in bytes: "bytelength >> (f + ll)".
+   from the length, in bytes, of the byte string comprising the
+   representation of the array by inverting the previous formula:
+   "bytelength >> (f + ll)".

    For the uint8/sint8 values, the endianness is redundant. Only the
    tag for the big endian variant is used and assigned as such. The Tag
    that would signify the little endian variant of sint8 MUST NOT be
    used, its tag number is marked as reserved. As a special case, the
    Tag that would signify the little endian variant of uint8 is instead
    assigned to signify that the numbers in the array are using clamped
    conversion from integers, as described in more detail in
    Section 7.1.11 ("ToUint8Clamp") of the ES6 JavaScript specification
    [TypedArrayES6]; the assumption here is that a program-internal
@@ -233,21 +239,21 @@

    A multi-dimensional array is represented as a tagged array that
    contains two (one-dimensional) arrays. The first array defines the
    dimensions of the multi-dimensional array (in the sequence of outer
    dimensions towards inner dimensions) while the second array
    represents the contents of the multi-dimensional array. If the
    second array is itself tagged as a Typed Array then the element type
    of the multi-dimensional array is known to be the same type as that
    of the Typed Array.
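
Editorial illustration (not part of either draft revision): the Table 2 bit layout and the two size formulas in the context lines above can be sketched in Python, whose "<<", ">>", and 0b literals match the notation the draft uses. The names TypedArrayTag, decode_typed_array_tag, and element_count are hypothetical, and the check that the byte string length is a whole multiple of the element size is an assumption of this sketch rather than something stated in the quoted text.

```python
from typing import NamedTuple

# Tag numbers singled out in the quoted text (0b010_f_s_e_ll layout).
UINT8_CLAMPED_TAG = 0b010_0_0_1_00  # 68: "little endian uint8" slot, used for clamped uint8
RESERVED_TAG = 0b010_0_1_1_00       # 76: "little endian sint8" slot, MUST NOT be used


class TypedArrayTag(NamedTuple):
    """Hypothetical container for the decoded fields of a typed-array tag."""
    is_float: bool       # f field: 0 integer, 1 float
    is_signed: bool      # s field: 1 only for signed integers
    little_endian: bool  # e field, forced to False for the clamped uint8 tag
    clamped: bool        # True only for the clamped uint8 special case
    ll: int              # length field (index into Table 1)
    element_bytes: int   # 2**(f + ll)


def decode_typed_array_tag(tag: int) -> TypedArrayTag:
    """Split a tag number of the form 0b010_f_s_e_ll into its fields."""
    if not 64 <= tag <= 87:
        raise ValueError(f"tag {tag} is outside the typed-array range 64..87")
    if tag == RESERVED_TAG:
        raise ValueError(f"tag {tag} is reserved and must not be used")
    f = (tag >> 4) & 1
    s = (tag >> 3) & 1
    e = (tag >> 2) & 1
    ll = tag & 0b11
    clamped = tag == UINT8_CLAMPED_TAG
    return TypedArrayTag(
        is_float=bool(f),
        is_signed=bool(s),
        little_endian=bool(e) and not clamped,
        clamped=clamped,
        ll=ll,
        element_bytes=1 << (f + ll),
    )


def element_count(tag: int, bytelength: int) -> int:
    """Number of elements implied by the byte string length."""
    fields = decode_typed_array_tag(tag)
    # Assumption of this sketch: reject lengths that are not a whole
    # number of elements instead of silently truncating.
    if bytelength % fields.element_bytes != 0:
        raise ValueError("byte string length is not a multiple of the element size")
    return bytelength >> (int(fields.is_float) + fields.ll)
```

For instance, tag 85 is 0b010_1_0_1_01, so under this sketch it decodes as a little endian float with 4-byte elements, and a 12-byte byte string holds 3 of them.
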

    Two tags are defined by this document, one for elements arranged in
-   row-major order, and one for column-major order.
+   row-major order, and one for column-major order [RowColMajor].

 3.1.1. Row-major Order

    Tag: 40

    Data Item: array (major type 4) of two arrays, one array (major type
    4) of dimensions, which are unsigned integers distinct from zero, and
    one array (either a CBOR array of major type 4, or a Typed Array, or
    a Homogeneous Array) of elements

@@ -342,21 +348,21 @@
    type of the first array element.

    This can be used in application data models that apply specific
    semantics to homogeneous arrays. Also, in certain cases,
    implementations in strongly typed languages may be able to create
    native homogeneous arrays of specific types instead of ordered lists
    while decoding. Which CBOR data items constitute elements of the
    same application type is specific to the application.

    Figure 4 shows an example for a homogeneous array of booleans in C++
-   and CBOR.
+   [Cplusplus] and CBOR.

    bool boolArray[2] = { true, false };

                        # Homogeneous Array Tag
       82               #array(2)
          F5            # true
          F4            # false

            Figure 4: Homogeneous array in C++ and CBOR

@@ -512,20 +518,26 @@
    Uint8ClampedArray for where the application expects a Uint8Array, or
    vice versa, potentially leading to very different (and unexpected)
    processing semantics of the in-memory data structures constructed.
    Applications that could be affected by this therefore will need to be
    careful about making this distinction in their input validation.

 8. References

 8.1. Normative References

+   [C]        "Information technology -- Programming languages -- C",
+              ISO/IEC 9899, 2018.
+
+   [Cplusplus]
+              "Programming languages -- C++", ISO/IEC 14882, 2017.
+
    [IEEE754]  IEEE, "IEEE Standard for Floating-Point Arithmetic", IEEE
               Std 754-2008.

    [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
               Requirement Levels", BCP 14, RFC 2119,
               DOI 10.17487/RFC2119, March 1997, .

    [RFC7049]  Bormann, C. and P. Hoffman, "Concise Binary Object
               Representation (CBOR)", RFC 7049, DOI 10.17487/RFC7049,
@@ -534,52 +546,63 @@
    [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
               2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
               May 2017, .

    [RFC8610]  Birkholz, H., Vigano, C., and C. Bormann, "Concise Data
               Definition Language (CDDL): A Notational Convention to
               Express Concise Binary Object Representation (CBOR) and
               JSON Data Structures", RFC 8610, DOI 10.17487/RFC8610,
               June 2019, .

+   [TypedArrayES6]
+              "22.2 TypedArray Objects", in: ECMA-262 6th Edition, The
+              ECMAScript 2015 Language Specification, June 2015,
+              .
+
 8.2. Informative References

    [ArrayBuffer]
               Mozilla Developer Network, "JavaScript typed arrays",
               2013, .

+   [RowColMajor]
+              Wikipedia, "Row- and column-major order", September 2019,
+              .
+
    [TypedArray]
               Vukicevic, V. and K. Russell, "Typed Array Specification",
-              February 2011.
-
-   [TypedArrayES6]
-              "22.2 TypedArray Objects", in: ECMA-262 6th Edition, The
-              ECMAScript 2015 Language Specification, June 2015,
-              .
+              February 2011,
+              .

 Contributors

    The initial draft for this specification was written by Johnathan
    Roatch (roatch@gmail.com). Many thanks for getting this ball
    rolling. Glenn Engel suggested the tags for multi-dimensional arrays
    and homogeneous arrays.

 Acknowledgements

    Jim Schaad provided helpful comments and reminded us that column-
    major order still is in use. Jeffrey Yaskin helped improve the
    definition of homogeneous arrays. IANA helped correct an error in a
-   previous version.
+   previous version. Francesca Palombini acted as a shepherd, and
+   Alexey Melnikov as responsible area director. Elwyn Davies as Gen-
+   ART reviewer and IESG members Martin Vigoureux, Adam Roach, Roman
+   Danyliw, and Benjamin Kaduk helped finding further improvements of
+   the text; thanks also to the other reviewers.

 Author's Address

    Carsten Bormann (editor)
    Universitaet Bremen TZI
    Postfach 330440
    Bremen D-28359
    Germany

    Phone: +49-421-218-63921