RTMP Errata and Addenda | July 2024 | |
Thornburgh | Informational | [Page] |
The specification for Adobe's Real-Time Messaging Protocol (RTMP), last updated in December 2012, contains errors, omissions, and ambiguities that impede interoperation. This memo corrects, clarifies, and amends portions of the RTMP specification.¶
Copyright © 2023 Michael Thornburgh. All rights reserved.¶
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:¶
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.¶
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.¶
At the time of writing, Adobe's Real-Time Messaging Protocol (RTMP) continues to be used by the media streaming industry and in media streaming products for transmitting video and audio over IP networks, including between and through various stages of live video production, ingest, and distribution.¶
The RTMP specification [RTMP] was last updated by Adobe in December 2012. That version contains numerous errors, omissions, and ambiguities that impede the independent development of correct and interoperable streaming products.¶
At the time of writing, the RTMP specification is considered to be an immutable historical artifact. This memo is intended to be referenced alongside [RTMP] to correct, clarify, and amend portions of that historical specification.¶
This memo addresses RTMP Version 3 as publicly described in [RTMP]. It does not address private or undocumented extensions for which interoperation is not desired by the extending parties.¶
For coordinated interoperable extensions and enhancements to RTMP, see Enhanced RTMP [E-RTMP].¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
Definitions of types and structures in this specification use traditional text diagrams paired with procedural descriptions using a C-like syntax. The C-like procedural descriptions SHALL be construed as definitive.¶
Structures are packed to take only as many bytes as explicitly indicated. There is no 32-bit alignment constraint, and fields are not padded for alignment unless explicitly indicated or described. Text diagrams may include a bit ruler across the top; this is a convenience for counting bits in individual fields and does not necessarily imply field alignment on a multiple of the ruler width.¶
Unless specified otherwise, reserved fields SHOULD be set to 0 by a sender and MUST be ignored by a receiver.¶
The procedural syntax of this specification defines correct and error-free encoded inputs to a parser. The procedural syntax does not describe a fully featured parser, including error detection and handling. Implementations MUST include means to identify error circumstances, including truncations causing elementary or composed types not to fit inside containing structures, fields, or elements. Unless specified otherwise, an error circumstance SHALL abort the parsing and processing of an element and its enclosing elements.¶
This memo uses the elementary types and constructs described in Section 2.1.1 of [RFC7016]. The definitions of that section are incorporated by reference as though fully set forth here.¶
This section lists additional elementary types used in the following sections.¶
uint24_t var;
uint16le_t var;
uint32le_t var;
Section 5.2 of [RTMP]
describes the initial handshake of an RTMP Chunk Stream connection. In that
section and its subsections, the C1
and S1
packets are
described as containing fields of 1528 random bytes, and the C2
and
S2
packets as containing identical echoes of the same 1528 random
bytes from S1
and C1
respectively.¶
Some proprietary extensions to RTMP use these data fields for further
handshaking purposes (such as cryptographic key exchange or feature enablement),
and in these cases the C2
and S2
packets might not contain
identical echoes of the random bytes from the respective S1
and
C1
packets.¶
Clients or servers that are not implementing or enforcing proprietary
extensions to the handshake SHOULD NOT fail the connection or
limit functionality on account of a non-identical echo of the corresponding
random bytes received in an S2
or C2
packet.¶
Section 4 of [RTMP] states that "all integer fields are carried in network byte order". However, there are two exceptions in the specification where multi-byte integers are coded in little-endian byte order. To draw attention to these exceptions and to eliminate the dissonance between the exceptions and the statement, the first two sentences of that section are amended to read¶
Unless otherwise specified, multi-byte integer fields are carried in network byte order, byte zero is the first byte shown, and bit zero is the most significant bit in a word or field. This byte order is commonly known as "big-endian".¶
Section 5.3.1.1 of [RTMP]
describes the one, two, and three byte encodings of the Chunk Basic Header.
The descriptive text gives a formula to decode the three-byte encoding as
though the "cs id - 64
" field is little-endian; however, this is not
explicitly stated and is easy to miss given the diagrams and the description
of the field as just "8 or 16 bits". This section clarifies the Chunk Basic
Header encodings.¶
0 1 2 3 4 5 6 7| +-+-+-+-+-+-+-+-+ |fmt| 2 .. 63 | +-+-+-+-+-+-+-+-+ 0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |fmt| 0 | cs id - 64 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |fmt| 1 | cs id - 64 (LE) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ struct chunkBasicHeader_t { uintn_t fmt :2; // format uintn_t provisionalChunkStreamID :6; if (provisionalChunkStreamID >= 2) chunkStreamID = provisionalChunkStreamID; // one-byte form else { if (0 == provisionalChunkStreamID) uint8_t chunkStreamIDMinus64; // two-byte form else // 1 == provisionalChunkStreamID uint16le_t chunkStreamIDMinus64; // three-byte form chunkStreamID = chunkStreamIDMinus64 + 64; } } :variable*8;¶
The chunkStreamIDMinus64
field of the 3-byte form encodes the
chunk stream ID less 64 in little-endian byte order.¶
Section 5.3.1.2 of [RTMP]
describes the four Chunk Message Header formats. The format being used is
indicated by the fmt
field of the Chunk Basic Header.¶
The Type 0
Chunk Message Header format includes the RTMP Message
Stream ID. The description of the Message Stream ID (4 bytes)
field
in
Section 5.3.1.2.5 of [RTMP]
specifies that the field "is stored in little-endian format". However, the
diagram doesn't show this and it is easily missed. This section clarifies the
Type 0
Chunk Message Header format.¶
0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7|0 1 2 3 4 5 6 7 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | provisional timestamp | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | message length |message type id| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | message stream id (LE) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+ | if(0xFFFFFF == provisionalTimestamp) extendedTimestamp | +~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+~+ struct chunkMessageHeaderType0_t { uint24_t provisionalTimestamp; uint24_t messageLength; uint8_t messageTypeID; uint32le_t messageStreamID; if (0xFFFFFF == provisionalTimestamp) { uint32_t extendedTimestamp; timestamp = extendedTimestamp; } else timestamp = provisionalTimestamp; } :variable*8;¶
The messageStreamID
field encodes the message stream ID
as four bytes in little-endian byte order. Note however that
per Section 7
the message stream ID SHOULD NOT exceed 16777215.¶
Section 5.3.1.2 of [RTMP]
implies that the Extended Timestamp
field is separate from the Chunk
Message Header. This implication is misleading because, when it is present,
it is in fact part of the chunk header and does not count toward the chunk's
payload or the Message Length
. Because of this, all four types of
Chunk Message Header should be recognized as being variable-length.¶
While
Section 5.3.1.3 of [RTMP]
states that the Extended Timestamp
is present in Type 3
chunks when the most recent Type 0
, Type 1
, or Type
2
chunk on the same Chunk Stream ID indicated the presence of an extended
timestamp, this has caused confusion for some implementors for the Type
3
chunks that are the subsequent portions of the same message, or that
are the first chunk of subsequent messages on the same Chunk Stream ID. To
clarify: while the Extended Timestamp
field is indicated, it
MUST be present in every Type 3
Chunk Message
Header for a message, including the first as well as subsequent chunks of
that message and subsequent messages starting with Type 3
,
even though this is wasteful.¶
When used with a reliable transport protocol such as TCP [RFC0793], RTMP Chunk Stream provides guaranteed timestamp-ordered end-to-end delivery of all messages, across multiple streams.¶
However, this wording is misleading. For example, Section 5.3.1.2.1 of [RTMP] alludes to a case where "the stream timestamp goes backward (e.g. because of a backward seek)". Note that RTMP can be used for Video on Demand (VOD) and Digital Video Recorder (DVR) playback as well as for live streaming.¶
Senders MAY apply real-time treatments to RTMP messages, including prioritization policies and transmission deadlines. Prioritization can include (but is not limited to) preferential allocation of transmission resources to audio over video, or to one stream over another. Prioritization can cause, for example, later audio messages to be transmitted before earlier video messages.¶
Senders have full control over scheduling, interleaving, and transmitting RTMP messages, or portions thereof, over lower-layer transports. When using RTMP Chunk Stream over a reliable, in-order transport protocol such as TCP [RFC0793], a sender can ensure delivery of messages in any desired order, including in timestamp order.¶
For optimum interoperability with straightforward implementations and with many existing implementations, timestamps SHOULD NOT go backwards in the same Message Stream ID for the same RTMP Message Type ID without an explicit signal from the receiver (such as initiating a backward seek) indicating such a jump is expected.¶
Section 5.4.2 of [RTMP]
describes the Abort Message
Protocol Control Message. The description
of this message states¶
An application may send this message when closing in order to indicate that further processing of the messages is not required.¶
This description is misleading, and suggests that receipt of this message signals that no further processing of any messages is required, which is incorrect.¶
This protocol control message aborts a single partially-transmitted message on the indicated Chunk Stream ID. Afterward, new messages can commence on the same Chunk Stream ID.¶
Senders can apply real-time treatments to messages, including transmission
deadlines. A message's transmission deadline might be exceeded after transmission
has started but before transmission is complete. Therefore, receivers
MUST be operable to process the Abort Message
Protocol
Control Message by discarding the partially-received message and being prepared
to receive a new message on the indicated Chunk Stream ID.¶
If at least one chunk for a new message is transmitted, then a timestamp is established for that message against which a timestamp delta in a subsequent message can be applied, even if the first message is aborted.¶
Section 6.1 of [RTMP] defines the RTMP Message Format, with an explicit encoding of the RTMP message header including field widths and byte ordering. However, this definition is misleading because RTMP messages are not serialized according to this format over either of the common transports. For example, when RTMP messages are transported by the RTMP Chunk Stream, RTMP message headers are encoded into Chunk Message Headers; and when RTMP messages are transported in RTMFP according to Section 5.1 of [RFC7425], RTMP message headers are encoded into flow metadata and the leading bytes of flow user messages. Therefore, the Message Header of the RTMP Message Format SHOULD be construed as a virtual header that is mapped to the facilities of a lower-layer transport protocol such as RTMP Chunk Stream or RTMFP.¶
The semantics of the RTMP Message Virtual Header and the encodings used in the common transport protocols constrain the possible values of RTMP Message Header fields. In particular:¶
Note that in some cases these constraints are more restrictive than the facilities provided by lower-layer transports. For example, RTMP Chunk Stream could encode a Stream ID up to 4294967295; RTMFP could encode a Stream ID of any finite non-negative integer (though typical implementations are limited to 264 - 1) and transport messages of unlimited length. However, exceeding the limits enumerated above will impede interoperability and protocol translation, so they SHOULD NOT be exceeded.¶
RTMP Chunk Stream is not the only possible transport service for RTMP Messages. This flexibility is stated in several places. For example:¶
The [sic] section specifies the format of RTMP messages that are transferred between entities on a network using a lower level transport layer, such as RTMP Chunk Stream. While RTMP was designed to work with the RTMP Chunk Stream, it can send the messages using any other transport protocol. (Section 6 of [RTMP])¶
Real-time streaming network communication for the Flash platform of video, audio, and data typically uses Adobe's Real-Time Messaging Protocol (RTMP) messages. RTMP messages were originally designed to be transported over RTMP Chunk Stream in TCP; however, other transports (such as the one described in this memo) are possible. (Section 1 of [RFC7425])¶
The Flash platform uses RTMP messages for media streaming and communication. This section describes how to transport RTMP messages over RTMFP flows and additional messages and semantics unique to this transport. (Section 5 of [RFC7425])¶
This flexibility is memorialized here.¶
Note well that during ordinary processing, RTMP Messages can be translated between different protocols and can pass through heterogeneous processing elements, which can mask, confound, or destroy message meta-information beyond the Message Virtual Header (such as the Chunk Stream ID or RTMFP flow on which a message originally arrived). Different processing elements can apply different treatments, policies, priorities, and deadlines to RTMP message streams, potentially affecting message arrival, arrival order, and interarrival timing.¶
When objectEncoding
3 is negotiated during connect
(Section 8.1.1), Type 17
Command
Extended and Type 15
Data Extended messages can be used to carry
AMF 3 [AMF3] coded values.¶
Section 7.1.1 of [RTMP]
implies that a Type 17
Command Extended message is a sequence of
AMF 3 coded values. Likewise,
Section 7.1.2 of [RTMP]
implies that a Type 15
Data Extended message is a sequence of AMF 3
coded values. These implications are incorrect.¶
0 1 2 3 4 5 6 7 +-+-+-+-+-+-+-+-+ |format selector| +-+-+-+-+-+-+-+-+ +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+ | coded value | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+ : +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+ | coded value | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+ // Message Type 17 "Command Extended" and Type 15 "Data Extended" struct extendedActionMessagePayload_t { uint8_t formatSelector; if (0 == formatSelector) { while (remainder() > 0) { provisionalValue = decode_amf0(); if (avmplus_object_marker == provisionalValue) // saw marker byte 0x11 codedValue = decode_amf3(); else codedValue = provisionalValue; } } // else unknown format } :rtmpMessagePayloadLength*8;¶
A Type 17
Command Extended or Type 15
Data Extended
message payload begins with a format selector byte, followed by a sequence
of values in a format-specific encoding. Currently only format 0 is defined;
therefore, the format selector byte MUST be 0. Format 0 is a
sequence of AMF values, each encoded in AMF 0 by default; AMF 3 encoding for
a value is selected by prefixing it with an avmplus-object-marker
(byte 0x11
) as defined in [AMF0].¶
Type 20
Command messages and Type 18
Data messages
comprise a sequence of AMF 0 coded values, as originally implied.¶
Section 7.2.1.1 of [RTMP]
lists parameters for the connect
command and their potential values
or ranges. However, that section contains incomplete and misleading information
particularly in capability negotiation.¶
The objectEncoding
property of the connect
command's Command Object
is OPTIONAL with a default value of 0. If given, the possible
values are 0 and 3 (Numbers). This value indicates the Object Encoding methods
supported by the client. A value of 0 indicates support only for AMF 0 and
the Type 20
Command and Type 18
Data message types. A
value of 3 indicates support for both AMF 0 and AMF 3, and for Type 17
Command Extended and Type 15
Data Extended messages in addition to
Type 20
and Type 18
messages. See
Section 7.2 for the corrected encoding of these
message types.¶
The server responds with the highest legal value that both it and the
client support (that is, the server MUST NOT send 3
if the client didn't send 3
). Some clients require this property in
the _result
, so the server SHOULD send it. If Object
Encoding 0 is negotiated, client and server MUST NOT send
Command Extended or Data Extended messages to each other. If Object Encoding
3 is negotiated, client and server can use either the normal (Type 20
and Type 18
) or extended (Type 17
and Type 15
)
message types at any time.¶
Interoperability Note: The connect
command SHOULD
be sent in a Type 20
Command message unless the client has prior
knowledge that the server to which it is connecting supports Object Encoding
3.¶
The tcUrl
property of the connect
command's Command Object
is REQUIRED. Its value is a string representing the URI
to which the client is connecting. Contrary to
Section 7.2.1.1 of [RTMP],
RTMP family URIs follow the
generic syntax for URIs [RFC3986]. The path and optional
query components identify the abstract target resource within the server's
namespace. See [RTMP-URIS] for more information about the
syntax constraints and access semantics of RTMP family URIs.¶
The meaning of "target resource" is implementation-specific and is the prerogative of the server.¶
Interoperability Note: Some client implementations incorrectly presume
the server's interpretation of the path
component of the URI,
specifically by presuming the number of path segments that identify the target
resource and setting tcUrl
accordingly. These clients then incorrectly
interpret the remaining path segments of the original URI as identifying a secondary
resource (such as the name of a stream to play or publish via the connection to
the target resource). This behavior is not in keeping with the spirit of
URIs (particularly that the authority
governs its namespace),
is not interoperable, and is NOT RECOMMENDED. Such secondary
resources, when encoded in a URI, should instead be identified by the fragment
component as described in [RTMP-URIS]. The full path and
optional query components from the original URI SHOULD be
preserved in tcUrl
.¶
Per Security Considerations (Section 10), the
client SHOULD strip any userinfo
and fragment
components from the URI before sending it in tcUrl
.¶
The app
property of the connect
command's Command Object
is REQUIRED. Its value is a string indicating the abstract
target application resource to which the client is connecting. Unless
the client has prior knowledge to the contrary of the server's specific
implementation, app
SHOULD be set to the path
and optional query
components of the URI to which the client is
connecting, not including the leading slash of the path (if any).¶
The meaning of "target application resource" is implementation-specific and is the prerogative of the server.¶
The videoCodecs
and audioCodecs
properties of the
connect
command's Command Object are bit fields representing the video and
audio codecs, respectively, that the client is able and willing to receive.
These properties are OPTIONAL and each defaults to 0 (that is,
no bits set, so no codecs indicated). The values for these fields are
enumerated in
Section 7.2.1.1 of [RTMP];
multiple codecs are combined with bitwise-or.¶
Interoperability Note: Some server implementations, including Adobe Media
Server, use this information to selectively filter out Video
and
Audio
messages having codecs not indicated in the respective bit
fields. In other words, some servers might not send the video and/or audio
messages of certain streams unless these properties are sent with appropriate
values. Therefore, clients SHOULD send videoCodecs
and audioCodecs
in the connect
command indicating at least
their supported playback codecs.¶
Section 7.2.1.1 of [RTMP]
shows that optional user arguments to the connect
command are allowed.
However, that section shows and implies that “any optional information” would
be in a single optional AMF Object following the Command Object. This is
misleading. The connect
command can accept, expect, or require zero
or more additional arguments of any AMF types following the Command Object.
The number and types of the additional arguments, their interpretation, and
whether they are expected or required is implementation-specific and is the
prerogative of the server. Often a server will expect one or more string
arguments to the connect
command following the Command Object; for
example a developer key, a user name and password, or implementation-specific
connection properties.¶
RTMP doesn’t define an explicit connection authentication mechanism, handshake, or protocol. However, RTMP does provide multiple generic facilities that an application could use to authenticate and authorize a connection. Some common examples:¶
connect
command;¶
tcUrl
and app
properties of the connect
command's Command Object);¶
connect
command, potentially incorporating authentication
credentials and a challenge or other information from the connect
command's _result
.¶
This list of examples is not exhaustive; other authentication schemes are
possible both within the facilities of RTMP and external (such as the server
requiring an authorized TLS [RFC8446] client certificate
for an rtmps:
connection).¶
A concrete multi-phase challenge/response authentication protocol, using
URI query parameters and challenges returned from connect
_error
responses, is described in the
Authentication between RTMP Publisher and Primetime Live Packager
section of [PRIMETIME].¶
Section 7.2.1 of [RTMP],
in context, implies that "call
" is the name of a standard remote
command to be invoked on the server (similar to connect
and
createStream
). This implication is incorrect. Instead, "call
"
is an implementation-specific client-side method of a
NetConnection
ActionScript object (specific to the APIs of Flash
Player and Adobe Integrated Runtime), the effect of which in those clients
is to send an arbitrary command to the server as a normally-encoded Command
Message
(Section 7.2 of [RTMP]).¶
This incorrect implication has caused confusion for some implementers.¶
Note that Section 7.2.1.2 of [RTMP] correctly describes the encoding of a generic remote procedure call into a Command Message. However, the "Optional Arguments" portion following the Command Object in that section’s illustration implies that the optional arguments are composed into a single AMF Object; instead, additional arguments can be a concatenation of any kinds of AMF coded values, as clarified and described in Section 7.2.¶
Section 7.2.2.3 of [RTMP]
erroneously states that the deleteStream
command is a "NetStream"
command. The deleteStream
command is actually a "NetConnection"
command (like connect
and createStream
) and is to be sent
on Message Stream ID 0 (the "control stream"). The Message Stream ID being
deleted is sent as the fourth field of the command message, as indicated.¶
This memo has no IANA actions.¶
RTMP and RTMFP URIs use the generic syntax for
URIs [RFC3986]. URIs can encode sensitive or private information such as user
names, passwords, stream names, and so on. Implementations SHOULD
remove any userinfo
and fragment
components of a URI before
sending it as the tcUrl
property of the connect
command's
Command Object or in the Ancillary Data option of an RTMFP Endpoint Discriminator
(Section 4.4.2.2 of [RFC7425]).¶
Thanks to Slavik Lozben for his detailed review of this memo.¶