Flow Control#
Like any HTTP proxy, Envoy takes flow control very seriously. Because CPU and memory are limited, it must prevent any single flow from consuming too many resources. As with any software built on an asynchronous, multi-threaded, multiplexed architecture, flow control is never simple.
If someone asked me what the hardest part of learning Envoy’s implementation was, my answer would be flow control, and there is very little information about it on the web. Some readers may ask: if it is so difficult, why study it at all? In my opinion, studying it has at least the following value:
Envoy sits on the path that business traffic must pass through, so it cannot be allowed to fail. Understanding its memory usage lets us evaluate service resources on a sound basis.
Understanding how Envoy behaves, and how service degrades, when traffic exceeds capacity lets us take precautions in advance.
Because flow control involves every participant on the data path, studying it is itself a way to understand how Envoy’s flow components relate to one another.
Note that “flow control” in this section does not mean what we usually do for microservice APIs, that is, limiting API TPS so that high-frequency calls do not overload and crash the service. It is instead backpressure-based protection that prevents a single connection or HTTP/2 stream from occupying too much buffer memory while Envoy processes a data stream such as a request body or response body.
Envoy’s official documentation includes an “Envoy Flow Control” document that describes some of these details. This section records what I learned from studying it, along with a good deal of my own interpretation.
Flow control in Envoy is accomplished by limiting each Buffer and attaching watermark callbacks to it. When a Buffer contains more data than the configured limit, a high watermark callback fires, which triggers a chain of events that eventually tells the data source to stop sending data. This back-off may be immediate (e.g., stopping reads from a socket) or gradual (e.g., withholding HTTP/2 window updates), so all Buffer limits in Envoy are considered soft limits.
When the Buffer eventually drains (usually to half of the high watermark, to avoid flapping back and forth), a low watermark callback fires to notify the sender that it can resume sending data.
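The high/low watermark behaviour described above can be sketched in a few lines. This is only a minimal model under stated assumptions, not Envoy's actual buffer class: `SketchWatermarkBuffer` and its members are hypothetical names, only a byte count is tracked, and the low watermark is assumed to be half of the high watermark.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical, simplified model of a watermark buffer. It tracks only a byte
// count; the names are illustrative and are not Envoy's real API.
class SketchWatermarkBuffer {
public:
  SketchWatermarkBuffer(std::size_t high_watermark, std::function<void()> on_above_high,
                        std::function<void()> on_below_low)
      : high_watermark_(high_watermark),
        low_watermark_(high_watermark / 2), // Assumption: low = half of high.
        on_above_high_(std::move(on_above_high)), on_below_low_(std::move(on_below_low)) {}

  // The limit is soft: data is always accepted. The callback only asks the
  // data source to slow down.
  void add(std::size_t bytes) {
    size_ += bytes;
    if (!above_high_ && size_ > high_watermark_) {
      above_high_ = true;
      on_above_high_();
    }
  }

  void drain(std::size_t bytes) {
    size_ -= (bytes > size_) ? size_ : bytes;
    // Fire only after an earlier high watermark event, and only once the
    // buffer is down to the low watermark.
    if (above_high_ && size_ <= low_watermark_) {
      above_high_ = false;
      on_below_low_();
    }
  }

  std::size_t size() const { return size_; }

private:
  const std::size_t high_watermark_;
  const std::size_t low_watermark_;
  std::function<void()> on_above_high_;
  std::function<void()> on_below_low_;
  std::size_t size_ = 0;
  bool above_high_ = false;
};
```

The gap between the two thresholds is the point of the design: a buffer hovering around a single limit would toggle the source on and off for every few bytes, while here the source stays paused until a meaningful amount of data has drained.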
The following sections first walk through the simpler TCP flow control process, and then the more complex HTTP/2 flow control process.
Some flow control terms#
back up - A situation in which data piles up in one or more intermediate buffers because traffic toward the destination is slow or stalled, so that the buffers run out of space.
buffers fill up - The buffer space reaches its upper limit.
backpressure - A feedback mechanism that lets a system keep responding to requests, rather than crash under load, when its processing capacity is exceeded. It comes into play when data arrives faster than it can be processed or output, which would otherwise lead to congestion and potential data loss. For more details, see “Backpressure explained - the resisted flow of data through software”.
High/Low Watermark - A design pattern for bounding memory or buffer consumption without triggering control actions too frequently (jitter). See “What are high and low water marks in bit streaming” for details.
drained - The process of emptying a buffer. Typically, when the amount of data in a buffer exceeds the “high watermark”, the system throttles the input source. The data already in the buffer is then processed and removed until the amount drops below the “low watermark” or the buffer is empty.
HTTP/2 window - The HTTP/2 standard’s flow control mechanism: a WINDOW_UPDATE frame indicates the number of octets the sender may transmit in addition to the existing flow control window. See “Hypertext Transfer Protocol Version 2 (HTTP/2) - 5.2. Flow Control” for details.
http stream - A stream as defined by the HTTP/2 standard. For details, see “Hypertext Transfer Protocol Version 2 (HTTP/2) - 5. Streams and Multiplexing”.
TCP flow control implementation#
Flow control for TCP and TLS endpoints is handled through the coordination between the Write Buffer of Network::ConnectionImpl and the Network::TcpProxy Filter.
The flow control for Downstream is as follows.
Downstream Network::ConnectionImpl::write_buffer_ buffers too much data. It calls Network::ConnectionCallbacks::onAboveWriteBufferHighWatermark().
Network::TcpProxy::DownstreamCallbacks receives onAboveWriteBufferHighWatermark() and calls readDisable(true) on the Upstream connection.
When the Downstream write buffer has drained, it calls Network::ConnectionCallbacks::onBelowWriteBufferLowWatermark().
Network::TcpProxy::DownstreamCallbacks receives onBelowWriteBufferLowWatermark() and calls readDisable(false) on the Upstream connection.
The flow control for Upstream is roughly the same.
Upstream Network::ConnectionImpl::write_buffer_ buffers too much data. It calls Network::ConnectionCallbacks::onAboveWriteBufferHighWatermark().
Network::TcpProxy::UpstreamCallbacks receives onAboveWriteBufferHighWatermark() and calls readDisable(true) on the Downstream connection.
When the Upstream write buffer has drained, it calls Network::ConnectionCallbacks::onBelowWriteBufferLowWatermark().
Network::TcpProxy::UpstreamCallbacks receives onBelowWriteBufferLowWatermark() and calls readDisable(false) on the Downstream connection.
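The symmetry of the two paths is easier to see in code. The following is a rough sketch under my own assumptions, not the real TcpProxy filter: `SketchConnection` and `SketchTcpProxy` are hypothetical types, buffers are reduced to byte counts, and the low watermark is assumed to be half of the high watermark.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical, simplified connection: a write-buffer byte count with
// watermarks plus a read-disable flag. Not Envoy's Network::ConnectionImpl.
struct SketchConnection {
  explicit SketchConnection(std::size_t limit) : high_watermark(limit) {}

  void readDisable(bool disable) { read_disabled = disable; }

  // Data written to this connection that the socket has not accepted yet.
  void write(std::size_t bytes) {
    buffered += bytes;
    if (!above_high && buffered > high_watermark) {
      above_high = true;
      if (on_above_high) on_above_high();
    }
  }

  // The socket accepted `bytes`, so the write buffer drains.
  void flushed(std::size_t bytes) {
    buffered -= (bytes > buffered) ? buffered : bytes;
    if (above_high && buffered <= high_watermark / 2) {
      above_high = false;
      if (on_below_low) on_below_low();
    }
  }

  std::size_t high_watermark;
  std::size_t buffered = 0;
  bool above_high = false;
  bool read_disabled = false;
  std::function<void()> on_above_high;
  std::function<void()> on_below_low;
};

// Hypothetical proxy wiring: when one side's write buffer backs up, stop
// reading from the opposite side, because that is where the data comes from.
struct SketchTcpProxy {
  explicit SketchTcpProxy(std::size_t limit) : downstream(limit), upstream(limit) {
    downstream.on_above_high = [this] { upstream.readDisable(true); };
    downstream.on_below_low = [this] { upstream.readDisable(false); };
    upstream.on_above_high = [this] { downstream.readDisable(true); };
    upstream.on_below_low = [this] { downstream.readDisable(false); };
  }
  // The callbacks capture `this`, so copying would leave dangling wiring.
  SketchTcpProxy(const SketchTcpProxy&) = delete;
  SketchTcpProxy& operator=(const SketchTcpProxy&) = delete;

  SketchConnection downstream;
  SketchConnection upstream;
};
```

Once reads are disabled on a socket, the kernel receive buffer fills and TCP's own window does the rest of the work toward the remote sender.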
The subsystems and callback mechanism involved are described in this book’s section: Callback design pattern.
HTTP2 Flow Control Implementation#
Because the various buffers in the HTTP/2 stack are fairly complicated, each path from a buffer going over the watermark limit to disabling data from the data source is documented separately.
Note
Readers who don’t know much about Envoy’s http-connection-manager and http filter chain are advised to read the following section of this book: http connection manager section. The following assumes that the reader already knows this.
HTTP2 flow control general flow#
Downstream/Upstream connection backs-up overview#
For HTTP/2, when filters, streams, or connections back up, the end result is readDisable(true) being called on the source stream. This results in the stream ceasing to consume window, and so not sending further flow control window updates to the peer. The peer will eventually stop sending data once the available window is consumed (or nghttp2 will close the connection if the peer violates the flow control limit), which limits the amount of data Envoy will buffer for each stream.
When readDisable(false) is called, any outstanding unconsumed data is immediately consumed, which results in resuming window updates to the peer and the resumption of data.
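To make the "stop consuming window" idea concrete, here is a small model of one stream as seen from the receiving side. It is only a sketch under simplifying assumptions: `SketchReceiveStream` is a hypothetical class, window is returned to the peer the moment data is consumed (standing in for a WINDOW_UPDATE frame), and a single boolean replaces whatever bookkeeping the real codec uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical model of the receive side of one HTTP/2 stream.
class SketchReceiveStream {
public:
  explicit SketchReceiveStream(std::uint32_t initial_window) : peer_window_(initial_window) {}

  // The peer tries to send `bytes`. Returns how many bytes the flow control
  // window allowed through.
  std::uint32_t peerSends(std::uint32_t bytes) {
    const std::uint32_t accepted = std::min(bytes, peer_window_);
    peer_window_ -= accepted;
    unconsumed_bytes_ += accepted;
    if (!read_disabled_) {
      consume();
    }
    return accepted;
  }

  void readDisable(bool disable) {
    read_disabled_ = disable;
    if (!disable) {
      consume(); // Outstanding data is consumed as soon as reads resume.
    }
  }

  std::uint32_t peerWindow() const { return peer_window_; }
  std::uint32_t unconsumedBytes() const { return unconsumed_bytes_; }

private:
  // Consuming data is what hands window back to the peer.
  void consume() {
    peer_window_ += unconsumed_bytes_;
    unconsumed_bytes_ = 0;
  }

  std::uint32_t peer_window_;
  std::uint32_t unconsumed_bytes_ = 0;
  bool read_disabled_ = false;
};
```

While the stream is read-disabled, the most the peer can push is one full window, which is why the buffered amount per stream stays bounded even though nothing rejects data outright.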
Figure: Downstream connection backs-up and backpressure overview#
Figure: Upstream connection back up and backpressure#
The “Unbounded buffer” in the figures above does not mean that the buffer has no limit; it means that the limit is a soft limit.
Upstream connection and Upstream http stream back-up at the same time#
Note that readDisable(true) on a stream may be called by multiple entities. It is called when any filter buffers too much, when the stream backs up and has too much data buffered, or when the connection has too much data buffered. Because of this, readDisable() maintains a count of the number of times it has been called to both enable and disable the stream, resuming reads when each caller has called the equivalent low watermark callback.
Source code:
void ConnectionImpl::StreamImpl::readDisable(bool disable) {
  ENVOY_CONN_LOG(debug, "Stream {} {}, unconsumed_bytes {} read_disable_count {}",
                 parent_.connection_, stream_id_, (disable ? "disabled" : "enabled"),
                 unconsumed_bytes_, read_disable_count_);
  if (disable) {
    ++read_disable_count_;
  } else {
    ASSERT(read_disable_count_ > 0);
    --read_disable_count_;
    if (!buffersOverrun()) {
      scheduleProcessingOfBufferedData(false);
      if (shouldAllowPeerAdditionalStreamWindow()) {
        grantPeerAdditionalStreamWindow();
      }
    }
  }
}
For example, if the TCP window upstream fills up and results in the network buffer backing up, all the streams associated with that connection will readDisable(true) their downstream data sources.
When the HTTP/2 flow control window fills up, an individual stream may use all of the window available and call a second readDisable(true) on its downstream data source.
When the upstream TCP socket drains, the connection will go below its low watermark and each stream will call readDisable(false) to resume the flow of data. The stream which had both a network level block and a H2 flow control block will still not be fully enabled.
Once the upstream peer sends window updates, the stream buffer will drain and the second readDisable(false) will be called on the downstream data source, which will finally result in data flowing from downstream again.
Example:
If the Upstream TCP write buffer fills up and the network buffer backs up, all streams associated with that connection call readDisable(true) on their Downstream data sources.
If, at the same time, the HTTP/2 flow control window is exhausted, a single stream may use up all of the available window and call readDisable(true) a second time on its Downstream data source.
Then, as the Upstream TCP write buffer continues to send and drains, the connection falls below its low watermark and each stream calls readDisable(false) to resume the data flow. However, a stream that is blocked at both the network level and the H2 flow control level is still not fully enabled.
Once the Upstream peer sends an HTTP/2 window update, the stream buffer drains and the second readDisable(false) is called on the Downstream data source, which finally lets data flow from the Downstream again.
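The four steps of this example can be replayed with nothing more than a counter. This is a minimal sketch, assuming a hypothetical `SketchDownstreamSource` that stands in for the downstream data source; only the counting behaviour is modelled.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a downstream data source with counted
// read-disable semantics.
struct SketchDownstreamSource {
  void readDisable(bool disable) {
    if (disable) {
      ++read_disable_count;
    } else {
      assert(read_disable_count > 0);
      --read_disable_count;
    }
  }
  // Reads resume only when every blocker has released its hold.
  bool readsEnabled() const { return read_disable_count == 0; }

  int read_disable_count = 0;
};

// Replays the four steps and records whether reads are enabled after each.
inline std::vector<bool> replayTwoBlockers() {
  SketchDownstreamSource source;
  std::vector<bool> enabled_after_step;

  source.readDisable(true); // 1. Upstream connection write buffer backs up.
  enabled_after_step.push_back(source.readsEnabled());

  source.readDisable(true); // 2. The stream also exhausts its HTTP/2 window.
  enabled_after_step.push_back(source.readsEnabled());

  source.readDisable(false); // 3. Upstream connection drains.
  enabled_after_step.push_back(source.readsEnabled());

  source.readDisable(false); // 4. Upstream peer sends a window update.
  enabled_after_step.push_back(source.readsEnabled());

  return enabled_after_step;
}
```

A plain boolean would have re-enabled reads at step 3, while the stream is still blocked on HTTP/2 flow control; the counter is what keeps independent blockers from cancelling each other.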
Figure: Upstream connection and Upstream http stream back-up at the same time#
Collaboration of Router::Filter during Upstream back-up#
The two main parties involved in flow control are the router filter (Envoy::Router::Filter) and the connection manager (Envoy::Http::ConnectionManagerImpl). The router is responsible for intercepting watermark events for its own buffers, the individual upstream streams (if codec buffers fill up) and the upstream connection (if the network buffer fills up). It passes any events to the connection manager, which has the ability to call readDisable() to enable and disable further data from downstream.
Figure: Collaboration of Router::Filter during Upstream back-up#
Collaboration of Http::ConnectionManagerImpl when Downstream back-up#
On the reverse path, when the downstream connection backs up, the connection manager collects events for the downstream streams and the downstream connection. It passes events to the router filter via Envoy::Http::DownstreamWatermarkCallbacks and the router can then call readDisable() on the upstream stream. Filters opt into subscribing to DownstreamWatermarkCallbacks as a performance optimization to avoid each watermark event on a downstream HTTP/2 connection resulting in “number of streams * number of filters” callbacks. Instead, only the router filter is notified and only the “number of streams” multiplier applies. Because the router filter only subscribes to notifications when it has an upstream connection, the connection manager tracks how many outstanding high watermark events have occurred and passes any on to the router filter when it subscribes.
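The "tracks outstanding events and passes them on at subscription time" part is the least obvious, so here is a small sketch of it. All types are hypothetical (`SketchWatermarkCallbacks`, `SketchActiveStream`, `SketchRouter`); the method names merely echo the ones mentioned in the text, and the real classes do considerably more.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical subscriber interface for downstream watermark events.
struct SketchWatermarkCallbacks {
  virtual ~SketchWatermarkCallbacks() = default;
  virtual void onAboveWriteBufferHighWatermark() = 0;
  virtual void onBelowWriteBufferLowWatermark() = 0;
};

// Hypothetical per-stream state kept on the connection manager side.
class SketchActiveStream {
public:
  void callHighWatermarkCallbacks() {
    ++high_watermark_count_;
    for (auto* cb : watermark_callbacks_) cb->onAboveWriteBufferHighWatermark();
  }

  void callLowWatermarkCallbacks() {
    assert(high_watermark_count_ > 0);
    --high_watermark_count_;
    for (auto* cb : watermark_callbacks_) cb->onBelowWriteBufferLowWatermark();
  }

  // A late subscriber is first brought up to date with every high watermark
  // event that is still outstanding.
  void addDownstreamWatermarkCallbacks(SketchWatermarkCallbacks& cb) {
    watermark_callbacks_.push_back(&cb);
    for (std::uint32_t i = 0; i < high_watermark_count_; ++i) {
      cb.onAboveWriteBufferHighWatermark();
    }
  }

private:
  std::vector<SketchWatermarkCallbacks*> watermark_callbacks_;
  std::uint32_t high_watermark_count_ = 0;
};

// Hypothetical router: it translates downstream watermark events into
// read-disable calls on its upstream request.
struct SketchRouter : SketchWatermarkCallbacks {
  void onAboveWriteBufferHighWatermark() override { ++upstream_read_disable_count; }
  void onBelowWriteBufferLowWatermark() override { --upstream_read_disable_count; }
  int upstream_read_disable_count = 0;
};
```

Without the replay in `addDownstreamWatermarkCallbacks()`, a router that subscribed late would later receive low watermark events for high watermark events it never saw, and its count would go negative.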
Figure: Collaboration of Http::ConnectionManagerImpl when Downstream back-up#
HTTP decode/encode filter flow control detail#
Each HTTP and HTTP/2 filter has an opportunity to call decoderBufferLimit() or encoderBufferLimit() on creation. No filter should buffer more than the configured bytes without calling the appropriate watermark callbacks or sending an error response.
Filters may override the default limit with calls to setDecoderBufferLimit() and setEncoderBufferLimit(). These limits are applied as filters are created so filters later in the chain can override the limits set by prior filters. It is recommended that filters calling these functions should generally only perform increases to the buffer limit, to avoid potentially conflicting with the buffer requirements of other filters in the chain.
Most filters do not buffer internally, but instead push back on data by returning a FilterDataStatus on encodeData()/decodeData() calls. If a buffer is a streaming buffer, i.e. the buffered data will resolve over time, it should return FilterDataStatus::StopIterationAndWatermark to pause further data processing, which will cause the ConnectionManagerImpl to trigger watermark callbacks on behalf of the filter. If a filter can not make forward progress without the complete body, it should return FilterDataStatus::StopIterationAndBuffer. In this case if the ConnectionManagerImpl buffers more than the allowed data it will return an error downstream: a 413 on the request path, 500 or resetStream() on the response path.
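The difference between the two "stop" statuses is what happens when the buffer limit is exceeded. The sketch below models only the request (decode) path and only that decision; `SketchFilterDataStatus`, `SketchOutcome` and `SketchDecodePath` are hypothetical names of mine, not the connection manager's real code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical mirror of the three statuses discussed in the text.
enum class SketchFilterDataStatus { Continue, StopIterationAndBuffer, StopIterationAndWatermark };

// What the (hypothetical) connection manager does with the data.
enum class SketchOutcome { Forwarded, Buffered, BufferedAndPaused, PayloadTooLarge };

class SketchDecodePath {
public:
  explicit SketchDecodePath(std::size_t buffer_limit) : buffer_limit_(buffer_limit) {}

  SketchOutcome onFilterResult(SketchFilterDataStatus status, std::size_t bytes) {
    switch (status) {
    case SketchFilterDataStatus::Continue:
      return SketchOutcome::Forwarded; // Nothing is held back.
    case SketchFilterDataStatus::StopIterationAndBuffer:
      buffered_ += bytes;
      // The filter needs the whole body, so pausing the source cannot help:
      // exceeding the limit is an error (413 on the request path).
      return buffered_ > buffer_limit_ ? SketchOutcome::PayloadTooLarge : SketchOutcome::Buffered;
    case SketchFilterDataStatus::StopIterationAndWatermark:
      buffered_ += bytes;
      // The data will drain over time, so apply backpressure instead.
      if (buffered_ > buffer_limit_) {
        downstream_paused_ = true;
        return SketchOutcome::BufferedAndPaused;
      }
      return SketchOutcome::Buffered;
    }
    return SketchOutcome::Forwarded; // Unreachable for valid enum values.
  }

  bool downstreamPaused() const { return downstream_paused_; }

private:
  const std::size_t buffer_limit_;
  std::size_t buffered_ = 0;
  bool downstream_paused_ = false;
};
```

In other words, StopIterationAndWatermark trades latency for memory safety, while StopIterationAndBuffer treats the limit as a hard cap on body size.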
Decoder filters#
For filters which do their own internal buffering, filters buffering more than the buffer limit should call onDecoderFilterAboveWriteBufferHighWatermark if they are streaming filters, i.e. filters which can process more bytes as the underlying buffer is drained. This causes the downstream stream to be readDisabled and the flow of downstream data to be halted. The filter is then responsible for calling onDecoderFilterBelowWriteBufferLowWatermark when the buffer is drained to resume the flow of data.
Decoder filters which must buffer the full request should respond with a 413 (Payload Too Large) when encountering a request body too large to buffer.
The decoder high watermark path for streaming filters is as follows:
When an instance of Envoy::Router::StreamDecoderFilter buffers too much data it should call StreamDecoderFilterCallback::onDecoderFilterAboveWriteBufferHighWatermark().
When Envoy::Http::ConnectionManagerImpl::ActiveStreamDecoderFilter receives onDecoderFilterAboveWriteBufferHighWatermark() it calls readDisable(true) on the downstream stream to pause data.
And the low watermark path:
When the buffer of the Envoy::Router::StreamDecoderFilter drains, it should call StreamDecoderFilterCallback::onDecoderFilterBelowWriteBufferLowWatermark().
When Envoy::Http::ConnectionManagerImpl receives onDecoderFilterBelowWriteBufferLowWatermark() it calls readDisable(false) on the downstream stream to resume data.
Encoder filters#
Encoder filters buffering more than the buffer limit should call onEncoderFilterAboveWriteBufferHighWatermark if they are streaming filters, i.e. filters which can process more bytes as the underlying buffer is drained. The high watermark call will be passed from the Envoy::Http::ConnectionManagerImpl to the Envoy::Router::Filter which will readDisable(true) to stop the flow of upstream data. Streaming filters which call onEncoderFilterAboveWriteBufferHighWatermark should call onEncoderFilterBelowWriteBufferLowWatermark when the underlying buffer drains.
Filters which must buffer a full response body before processing further should respond with a 500 (Server Error) if encountering a response body which is larger than the buffer limits.
The encoder high watermark path for streaming filters is as follows:
When an instance of Envoy::Router::StreamEncoderFilter buffers too much data it should call StreamEncoderFilterCallback::onEncoderFilterAboveWriteBufferHighWatermark().
When Envoy::Http::ConnectionManagerImpl::ActiveStreamEncoderFilter receives onEncoderFilterAboveWriteBufferHighWatermark() it calls ConnectionManagerImpl::ActiveStream::callHighWatermarkCallbacks().
callHighWatermarkCallbacks() then in turn calls DownstreamWatermarkCallbacks::onAboveWriteBufferHighWatermark() for all filters which registered to receive watermark events.
Envoy::Router::Filter receives onAboveWriteBufferHighWatermark() and calls readDisable(true) on the upstream request.
The encoder low watermark path for streaming filters is as follows:
When the buffer of an instance of Envoy::Router::StreamEncoderFilter drains it should call StreamEncoderFilterCallback::onEncoderFilterBelowWriteBufferLowWatermark().
When Envoy::Http::ConnectionManagerImpl::ActiveStreamEncoderFilter receives onEncoderFilterBelowWriteBufferLowWatermark() it calls ConnectionManagerImpl::ActiveStream::callLowWatermarkCallbacks().
callLowWatermarkCallbacks() then in turn calls DownstreamWatermarkCallbacks::onBelowWriteBufferLowWatermark() for all filters which registered to receive watermark events.
Envoy::Router::Filter receives onBelowWriteBufferLowWatermark() and calls readDisable(false) on the upstream request.
HTTP and HTTP/2 codec upstream send buffer#
Below I am using the original document directly. However, I have included diagrams that I have drawn to make it easier to understand.
The upstream send buffer Envoy::Http::Http2::ConnectionImpl::StreamImpl::pending_send_data_ is H2 stream data destined for an Envoy backend. Data is added to this buffer after each filter in the chain is done processing, and it backs up if there is insufficient connection or stream window to send the data. The high watermark path goes as follows:
When pending_send_data_ has too much data it calls ConnectionImpl::StreamImpl::pendingSendBufferHighWatermark().
pendingSendBufferHighWatermark() calls StreamCallbackHelper::runHighWatermarkCallbacks().
runHighWatermarkCallbacks() results in all subscribers of Envoy::Http::StreamCallbacks receiving an onAboveWriteBufferHighWatermark() callback.
When Envoy::Router::Filter receives onAboveWriteBufferHighWatermark() it calls StreamDecoderFilterCallback::onDecoderFilterAboveWriteBufferHighWatermark().
When Envoy::Http::ConnectionManagerImpl receives onDecoderFilterAboveWriteBufferHighWatermark() it calls readDisable(true) on the downstream stream to pause data.
For the low watermark path:
When pending_send_data_ drains it calls ConnectionImpl::StreamImpl::pendingSendBufferLowWatermark().
pendingSendBufferLowWatermark() calls StreamCallbackHelper::runLowWatermarkCallbacks().
runLowWatermarkCallbacks() results in all subscribers of Envoy::Http::StreamCallbacks receiving an onBelowWriteBufferLowWatermark() callback.
When Envoy::Router::Filter receives onBelowWriteBufferLowWatermark() it calls StreamDecoderFilterCallback::onDecoderFilterBelowWriteBufferLowWatermark().
When Envoy::Http::ConnectionManagerImpl receives onDecoderFilterBelowWriteBufferLowWatermark() it calls readDisable(false) on the downstream stream to resume data.
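The callback chain above can be compressed into a short sketch. This is a simplification under my own assumptions: every type here is hypothetical, the pending send buffer is a byte count with the low watermark at half of the high watermark, and the hop through the decoder filter callbacks and the connection manager is collapsed into a direct readDisable() call on the downstream stream.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical subscriber interface for upstream stream watermark events.
struct SketchStreamCallbacks {
  virtual ~SketchStreamCallbacks() = default;
  virtual void onAboveWriteBufferHighWatermark() = 0;
  virtual void onBelowWriteBufferLowWatermark() = 0;
};

// Hypothetical stand-in for the downstream stream.
struct SketchDownstreamStream {
  void readDisable(bool disable) { read_disable_count += disable ? 1 : -1; }
  int read_disable_count = 0;
};

// Hypothetical router filter. In the real chain the event travels through the
// decoder filter callbacks and the connection manager before reaching the
// downstream stream; that hop is collapsed into one call here.
class SketchRouterFilter : public SketchStreamCallbacks {
public:
  explicit SketchRouterFilter(SketchDownstreamStream& downstream) : downstream_(downstream) {}
  void onAboveWriteBufferHighWatermark() override { downstream_.readDisable(true); }
  void onBelowWriteBufferLowWatermark() override { downstream_.readDisable(false); }

private:
  SketchDownstreamStream& downstream_;
};

// Hypothetical upstream codec stream holding data it cannot send yet.
class SketchUpstreamStream {
public:
  explicit SketchUpstreamStream(std::size_t high_watermark) : high_watermark_(high_watermark) {}

  void addCallbacks(SketchStreamCallbacks& callbacks) { callbacks_.push_back(&callbacks); }

  // Data queued because there is not enough stream or connection window.
  void queue(std::size_t bytes) {
    pending_send_bytes_ += bytes;
    if (!above_high_ && pending_send_bytes_ > high_watermark_) {
      above_high_ = true;
      for (auto* cb : callbacks_) cb->onAboveWriteBufferHighWatermark();
    }
  }

  // The peer granted window, so `bytes` of pending data were sent.
  void sent(std::size_t bytes) {
    pending_send_bytes_ -= (bytes > pending_send_bytes_) ? pending_send_bytes_ : bytes;
    if (above_high_ && pending_send_bytes_ <= high_watermark_ / 2) {
      above_high_ = false;
      for (auto* cb : callbacks_) cb->onBelowWriteBufferLowWatermark();
    }
  }

private:
  const std::size_t high_watermark_;
  std::size_t pending_send_bytes_ = 0;
  bool above_high_ = false;
  std::vector<SketchStreamCallbacks*> callbacks_;
};
```

The upstream stream knows nothing about the downstream side; it only notifies its subscribers, and the router filter is the piece that knows where the data came from.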
Figure: Collaboration of Router::Filter during Upstream back-up#
HTTP and HTTP/2 network upstream network buffer#
Below I am using the original document directly. However, I have included diagrams that I have drawn to make it easier to understand. At the same time, I discovered a bug in the official documentation and submitted a Pull Request to correct it: [Docs: flow control - Add a figure and fix suspected incorrect use of Envoy::Http::ConnectionManagerImpl #33847](https://github.com/envoyproxy/envoy/pull/33847)
The upstream network buffer is HTTP/2 data for all streams destined for the Envoy backend. If the network buffer fills up, all streams associated with the underlying TCP connection will be informed of the back-up, and the data sources (HTTP/2 streams or HTTP connections) feeding into those streams will be readDisabled.
The high watermark path is as follows:
When Envoy::Network::ConnectionImpl::write_buffer_ has too much data it calls Network::ConnectionCallbacks::onAboveWriteBufferHighWatermark().
When Envoy::Http::CodecClient receives onAboveWriteBufferHighWatermark() it calls onUnderlyingConnectionAboveWriteBufferHighWatermark() on codec_.
When Envoy::Http::Http2::ConnectionImpl (the original document incorrectly uses Envoy::Http::ConnectionManagerImpl here) receives onAboveWriteBufferHighWatermark() it calls runHighWatermarkCallbacks() for each stream of the connection.
runHighWatermarkCallbacks() results in all subscribers of Envoy::Http::StreamCallbacks receiving an onAboveWriteBufferHighWatermark() callback.
When Envoy::Router::Filter receives onAboveWriteBufferHighWatermark() it calls StreamDecoderFilterCallback::onDecoderFilterAboveWriteBufferHighWatermark().
When Envoy::Http::ConnectionManagerImpl receives onDecoderFilterAboveWriteBufferHighWatermark() it calls readDisable(true) on the downstream stream to pause data.
The low watermark path is as follows:
When Envoy::Network::ConnectionImpl::write_buffer_ is drained it calls Network::ConnectionCallbacks::onBelowWriteBufferLowWatermark().
When Envoy::Http::CodecClient receives onBelowWriteBufferLowWatermark() it calls onUnderlyingConnectionBelowWriteBufferLowWatermark() on codec_.
When Envoy::Http::Http2::ConnectionImpl (the original document incorrectly uses Envoy::Http::ConnectionManagerImpl here) receives onBelowWriteBufferLowWatermark() it calls runLowWatermarkCallbacks() for each stream of the connection.
runLowWatermarkCallbacks() results in all subscribers of Envoy::Http::StreamCallbacks receiving an onBelowWriteBufferLowWatermark() callback.
When Envoy::Router::Filter receives onBelowWriteBufferLowWatermark() it calls StreamDecoderFilterCallback::onDecoderFilterBelowWriteBufferLowWatermark().
When Envoy::Http::ConnectionManagerImpl receives onDecoderFilterBelowWriteBufferLowWatermark() it calls readDisable(false) on the downstream stream to resume data.
As with the downstream network buffer, it is important that new upstream streams associated with an existing upstream connection that is already over its buffer limits are created in the correct state. To handle this, Envoy::Http::Http2::ClientConnectionImpl tracks the state of the underlying Network::Connection in underlying_connection_above_watermark_. If a new stream is created while the connection is above the high watermark, runHighWatermarkCallbacks() is called on the new stream immediately.
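Why the new stream must be "born" in the high watermark state becomes clear once the matching low watermark event arrives. The following sketch assumes hypothetical `SketchStream` and `SketchClientConnection` types; only the flag and the per-stream count are modelled, and the member and method names simply echo the ones in the text.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stream that only counts outstanding high watermark events.
struct SketchStream {
  void runHighWatermarkCallbacks() { ++high_watermark_count; }
  void runLowWatermarkCallbacks() {
    assert(high_watermark_count > 0); // A low event must match a high event.
    --high_watermark_count;
  }
  int high_watermark_count = 0;
};

// Hypothetical client connection that remembers whether the underlying
// network connection is currently above its high watermark.
class SketchClientConnection {
public:
  SketchStream& newStream() {
    streams_.push_back(std::make_unique<SketchStream>());
    SketchStream& stream = *streams_.back();
    if (underlying_connection_above_watermark_) {
      // Bring the new stream in line with the connection's current state.
      stream.runHighWatermarkCallbacks();
    }
    return stream;
  }

  void onUnderlyingConnectionAboveWriteBufferHighWatermark() {
    underlying_connection_above_watermark_ = true;
    for (auto& stream : streams_) stream->runHighWatermarkCallbacks();
  }

  void onUnderlyingConnectionBelowWriteBufferLowWatermark() {
    underlying_connection_above_watermark_ = false;
    for (auto& stream : streams_) stream->runLowWatermarkCallbacks();
  }

private:
  bool underlying_connection_above_watermark_ = false;
  std::vector<std::unique_ptr<SketchStream>> streams_;
};
```

If `newStream()` skipped the immediate call, a stream created during the back-up would both send data into an already full connection and later receive a low watermark event with no matching high watermark event.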
Figure: Collaboration of Router::Filter when Upstream connection back-up#