I’ve been doing some major learning and deep dives lately on SQL Server Always On Availability Groups (AGs), and with SQL Server 2025 now in preview, I immediately wanted to see whether any new features relating to Always On were going to be available. You can find the full list here, but one feature that grabbed my attention more than the others was the new server-level sp_configure setting max ucs send boxcars.
What Is UCS and What Does It Do?
UCS stands for Universal Communication Service, and it’s the communication protocol that AGs use to send log blocks between primary and secondary replicas. Log blocks are units of transaction log data that SQL Server sends from the primary to the secondary replicas; each secondary replays these blocks during redo to stay synchronized with the primary. UCS also detects when a secondary replica falls behind the primary by evaluating how long it takes the primary to receive an acknowledgement from the secondary that a transaction was hardened to the transaction log. If a threshold is met and SQL Server determines a secondary is too far behind, UCS enters what Microsoft calls a flow control state. As I understand it, you can think of flow control as a stopgap or valve that governs the number of transactions sent through to your secondary replicas. If too many transactions are backed up on the secondary and the threshold is reached, SQL Server temporarily throttles the sending of log blocks to that secondary until it catches up. Once the number of backlogged transactions falls below whatever threshold is set under the hood, transactions flow over again. This helps keep the primary from oversaturating the network to your secondary with messages.
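If you want to see whether flow control is actually kicking in, one place to look is the Availability Replica performance counters. Here's a minimal sketch, assuming those counters are exposed through sys.dm_os_performance_counters on your build and that you run it on the primary. As far as I know, the values are cumulative, so treat a single sample as a starting point rather than a rate.

```sql
-- Sketch: list the flow control counters for each availability replica.
-- Assumes it runs on the primary. cntr_value is cumulative since startup,
-- so take two samples and diff them to see activity over an interval.
SELECT
    RTRIM(object_name)   AS object_name,
    RTRIM(counter_name)  AS counter_name,
    RTRIM(instance_name) AS replica_instance,
    cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Availability Replica%'
  AND counter_name LIKE 'Flow Control%'
ORDER BY instance_name, counter_name;
```

If the flow control counters barely move between samples, flow control probably isn't your bottleneck.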
Now, with servers in the same data center, communication between nodes is usually pretty quick because the network traffic is local. However, when you have a stretched cluster (e.g., a three-node AG with two nodes in one data center and a DR node in another), the DR node is reached over a wide-area network (WAN). When UCS sends communications over the WAN and there is network latency, your secondary replica can lag behind the primary. While this setting affects synchronous and asynchronous AG replicas alike, disaster recovery (DR) nodes are usually the most susceptible because they are most likely the ones in another data center in a different geographical location.
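To get a feel for how far behind each secondary is, the HADR DMVs are a reasonable first stop. This is only a sketch, assuming you run it on the primary replica; the column aliases are my own, and the thresholds that count as "too far behind" depend on your environment.

```sql
-- Sketch: how far behind is each secondary database? Run on the primary.
SELECT
    ag.name                       AS ag_name,
    ar.replica_server_name,
    ar.availability_mode_desc,    -- SYNCHRONOUS_COMMIT or ASYNCHRONOUS_COMMIT
    DB_NAME(drs.database_id)      AS database_name,
    drs.synchronization_state_desc,
    drs.log_send_queue_size,      -- KB of log not yet sent to this secondary
    drs.log_send_rate,            -- KB/sec
    drs.redo_queue_size           -- KB of log received but not yet redone
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar
    ON ar.replica_id = drs.replica_id
JOIN sys.availability_groups AS ag
    ON ag.group_id = drs.group_id
WHERE drs.is_local = 0            -- only rows describing remote replicas
ORDER BY drs.log_send_queue_size DESC;
```

A DR replica across the WAN that consistently sits at the top of this list with a growing log send queue is the kind of node this post is talking about.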
The new setting tries to help mitigate this potential for flow control when network latency happens by raising the number of UCS boxcars that can be used to send log blocks from primary to secondary. According to Microsoft documentation, “UCS packets are grouped together in a boxcar to allow for more efficient transmission over a network. When you increase the maximum number of UCS boxcars, more packets can be transferred at a time, which in turn defers entering flow control.”
— Microsoft Learn
Increasing the Performance of Your Availability Group
Starting with SQL Server 2025 Preview, you can dictate how many UCS boxcars are used to deliver log blocks to your secondary nodes with the new sp_configure setting max ucs send boxcars. Raising the limit gives SQL Server more boxcars to send more log blocks to your secondary node(s), increasing efficiency and decreasing the odds of SQL Server entering a flow control state. Here’s an example of how to adjust this setting:
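USE master;
GO
-- 'max ucs send boxcars' is an advanced option, so expose advanced options first.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
GO
-- Raise the boxcar limit from the default of 256 (valid range: 256 to 2048).
EXEC sp_configure 'max ucs send boxcars', 1024;
RECONFIGURE;
GO
-- The new value takes effect only after the SQL Server instance restarts.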
Now, technically, this feature has been available for you to adjust since SQL Server 2022. However, there you have to do it through a registry value that SQL Server reads during initialization. I don’t know about you, but the registry isn’t a location I visit frequently, and I’d like to stay out of it if I can help it. But if you have more courage than I do, you can read up on how to do it in the MS post linked above.
Caveats and Things to be Aware of
Now, there are some inherent details about this setting that you should be aware of, according to Microsoft. The following excerpt is from the official documentation:
- This setting is an advanced sp_configure option.
- The minimum value is 256 (the default), and the maximum value is 2048. However, you can use a value of 0 to reset the value to default.
- This configuration option takes precedence over the registry setting.
- This setting takes effect after a SQL Server instance restart.
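Because the change only kicks in after a restart, it can be handy to confirm what is configured versus what is actually running. A minimal sketch, assuming the option shows up in sys.configurations under the same name used with sp_configure:

```sql
-- Sketch: compare the configured value with the value currently in use.
-- Until the instance restarts, value and value_in_use are expected to differ.
SELECT name, value, value_in_use, minimum, maximum, is_dynamic, is_advanced
FROM sys.configurations
WHERE name = 'max ucs send boxcars';

-- Per the documentation excerpt above, 0 resets the option to its default.
-- Uncomment to roll back:
-- EXEC sp_configure 'max ucs send boxcars', 0;
-- RECONFIGURE;
```

If value and value_in_use don't match, the instance hasn't been restarted since the change.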
My Thoughts
I think this setting can be very useful in specific scenarios where you’re experiencing performance issues due to network lag and degraded communication between nodes, especially when those nodes communicate over a WAN and the network cannot be upgraded, or other limiting factors prevent a root cause fix. Do I think this should be done preemptively? Well, unless Microsoft recommends it as a standard best practice, or trusted members of the SQL Server community widely accept changing the default preemptively, I usually wouldn’t change it out of the box. However, if you are experiencing related issues, this could definitely be a fix, or at least mask the problem until a longer-term fix can be implemented. I don’t see many other conversations around this setting currently, so as time goes on, maybe changing it will become part of a best practice.