How CommitBatchSize And CommitBatchThreshold Affect Replication

<div>Note: This is a long post. I hope it’s worth your while to read. A common suggestion for optimizing transactional replication performance is to adjust the values in your distribution agent profile for CommitBatchSize and CommitBatchThreshold. Unfortunately, what these two values really do isn’t documented very well anywhere. According to <a href="http://msdn.microsoft.com/en-us/library/ms147328.aspx" target="_blank">Books Online</a>: <ul> <li>CommitBatchSize “Is the number of transactions to be issued to the Subscriber before a COMMIT statement is issued. The default is 100.” </li> <li>CommitBatchThreshold “Is the number of replication commands to be issued to the Subscriber before a COMMIT statement is issued. The default is 1000.” </li> </ul> When you read this you might wonder what the difference is between a transaction and a command. After all, isn’t a command just an <a href="http://msdn.microsoft.com/en-us/library/ms187878.aspx" target="_blank">autocommit transaction</a>? Does this mean that the commands in a transaction could get broken up into smaller transactions at the subscriber? Won’t that violate <a href="http://en.wikipedia.org/wiki/ACID" target="_blank">ACID</a>? Microsoft went out of their way to make these two distinct values so each one has to influence delivery of commands in some way, right? I went the cheap route to find out and posted to the forums at <a href="http://www.sqlservercentral.com/" target="_blank">SQLServerCentral</a> and to <a href="http://groups.google.com/group/microsoft.public.sqlserver.replication/browse_thread/thread/e67e86ff450794e3" target="_blank">microsoft.public.sqlserver.replication</a>. While I waited for an answer I set up some simple tests to try and figure it out for myself. Before reading on, I want to test your replication mettle and give you a little quiz. If you want to cut to the chase and see if you’re right, jump to the conclusions at the end of this post to view the answers. 
<ol> <li>Based on the defaults above, what happens if I issue 1,500 insert\update\delete statements which each affect a single row all contained within an explicit transaction (e.g. BEGIN TRANSACTION, 1,500 updates, then COMMIT)? </li> <li>What if those 1,500 statements are not part of a transaction? </li> <li>What happens if I issue three update statements: the first affecting 600 rows, the second affecting 500 rows, and the third affecting 400 rows? </li> </ol>   <h5>Setup</h5> I started by creating a simple table and populating it with 1,000 rows (representative of a large and random sample set): <pre class="brush: sql">SET NOCOUNT ON; 
GO 
CREATE TABLE TransReplTest (
		[ID] INT IDENTITY(1, 1) ,
		[SomeDate] DATETIME ,
		[SomeValue] VARCHAR(20) ,
		CONSTRAINT [PK_TransReplTest] PRIMARY KEY CLUSTERED ([ID])
	); 
GO 
INSERT	INTO TransReplTest
		(SomeDate ,
			SomeValue
		)
SELECT	GETDATE() ,
		CONVERT(VARCHAR(20), RAND() * 1000);
GO 1000
</pre>

I created a publication, added a table article to it, and set up a subscription. I created a new distribution agent profile based on the default agent profile and changed only the values for CommitBatchSize and CommitBatchThreshold. Each time I changed their values I stopped and started the distribution agent to ensure I was using the new value. Finally, I set up a profiler session capturing the following events on the subscriber: “RPC:Completed”, “SQL:BatchCompleted”, and “SQL:BatchStarting”. I captured some extra noise so for the sake of readability I saved my profiler sessions and filtered them for each test afterwards. <h5>Test 1</h5> To make things easier to understand I set the values low, like this: CommitBatchSize: 3 CommitBatchThreshold: 5 First I ran this statement which randomly updates 10 rows, one at a time, all contained within a transaction:

<pre class="brush: sql">BEGIN TRANSACTION;

DECLARE	@rownum TINYINT; 
SET @rownum = 0;

WHILE @rownum < 10
BEGIN

	UPDATE	TransReplTest
	SET		SomeDate = GETDATE()
	WHERE	ID IN (SELECT TOP 1
							ID
					FROM	TransReplTest
					ORDER BY NEWID());

	SET @rownum = @rownum + 1; 
END;

COMMIT;</pre>

Profiler shows 1 transaction with 10 updates. CommitBatchThreshold didn’t appear to affect anything here. <a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipGexn0YGArRAG6ozLJHmEF0fUbA_8DchUQcW6sQjpUrC_3UuNHAobIsDz9umWdQyHezhK7eRy31Y6Xz1MYaYCO2cOH3ZST0_WZDO2W_Ptuv-GlzXeKM9DZvNlSSLi-2lfWCjvIVaroMeF/s1600-h/image45.png" imageanchor="1"><img alt="Click to enlarge" height="176" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipGexn0YGArRAG6ozLJHmEF0fUbA_8DchUQcW6sQjpUrC_3UuNHAobIsDz9umWdQyHezhK7eRy31Y6Xz1MYaYCO2cOH3ZST0_WZDO2W_Ptuv-GlzXeKM9DZvNlSSLi-2lfWCjvIVaroMeF/s176/image45.png" style="display: block; float: none; margin-left: auto; margin-right: auto" title="Click to enlarge" width="240" /></a> 
When I issue the same statement without the explicit transaction Profiler shows 4 transactions of 3 updates, 3 updates, 3 updates, and 1 update. CommitBatchSize causes every 3 statements (each of which is an autocommit transaction) to be grouped together in a transaction on the subscriber. 
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgafe5CAb_R8eAfjEb44HARjZZfaJ-dobyY1ug0PGXxNheQ8LBYkJShwuIFrAaBED6PSc7Labz3z8kZH9zG_PGkxP6t8BfBwU8vTidN0gsILVBbaNM31Ofsqot_dcyjncRjocoOrr98QjMP/s1600-h/image39.png" imageanchor="1"><img alt="Click to enlarge" height="176" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgafe5CAb_R8eAfjEb44HARjZZfaJ-dobyY1ug0PGXxNheQ8LBYkJShwuIFrAaBED6PSc7Labz3z8kZH9zG_PGkxP6t8BfBwU8vTidN0gsILVBbaNM31Ofsqot_dcyjncRjocoOrr98QjMP/s176/image39.png" style="display: block; float: none; margin-left: auto; margin-right: auto" title="Click to enlarge" width="240" /></a> 
Next I issued a single update statement affecting 10 rows:

<pre class="brush: sql">UPDATE	TransReplTest
SET		SomeDate = GETDATE()
WHERE	ID IN (SELECT TOP 10
						ID
				FROM	TransReplTest
				ORDER BY NEWID());</pre>

Profiler shows 1 transaction with 10 updates at the subscriber. Again CommitBatchThreshold did not affect anything here. 
				
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi2BhQ3zvr31v1SmwRX0oc42r-UbZqPP0wUt9gSi_lnHmMbQT8hYg4b4GGTvH35hnGznbwVEySszg6fMSgRcZZZT0fXWnOJsWknRXADtiUUfAGAQCpFDHPByJm6woGYSnGkLcX5aK8hMYZD/s1600-h/image51.png" imageanchor="1"><img alt="Click to enlarge" height="176" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi2BhQ3zvr31v1SmwRX0oc42r-UbZqPP0wUt9gSi_lnHmMbQT8hYg4b4GGTvH35hnGznbwVEySszg6fMSgRcZZZT0fXWnOJsWknRXADtiUUfAGAQCpFDHPByJm6woGYSnGkLcX5aK8hMYZD/s176/image51.png" style="display: block; float: none; margin-left: auto; margin-right: auto" title="Click to enlarge" width="240" /></a>

Finally, I issued an update statement affecting 3 rows 3 times: 
<pre class="brush: sql">UPDATE	TransReplTest
SET		SomeDate = GETDATE()
WHERE	ID IN (SELECT TOP 3
						ID
				FROM	TransReplTest
				ORDER BY NEWID()); 
GO 3</pre>

This time Profiler shows 2 transactions; the first contains 6 updates and the second contains 3 updates. Even though CommitBatchSize was set to 3, the number of rows affected by the first two statements exceeded CommitBatchThreshold (set to 5) and so the third statement was put into its own transaction.
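
The three Test 1 traces can be reproduced with a small simulation. This is only a sketch of the behavior seen in Profiler, not the distribution agent's actual code: the function name `group_literal` and its list-of-command-counts input are made up for illustration, and it assumes a publisher transaction is never split and that a commit happens as soon as either counter reaches its setting.

```python
def group_literal(publisher_txns, commit_batch_size, commit_batch_threshold):
    """Group publisher transactions into subscriber transactions.

    publisher_txns: one entry per publisher transaction, holding the number
    of single-row commands that transaction produced.
    Returns the number of commands in each subscriber transaction.
    """
    batches = []
    txn_count = 0       # publisher transactions in the open subscriber transaction
    command_count = 0   # commands in the open subscriber transaction
    for commands in publisher_txns:
        # Assumption: a publisher transaction is always delivered whole.
        txn_count += 1
        command_count += commands
        # Assumption: commit once either counter reaches its configured value.
        if txn_count >= commit_batch_size or command_count >= commit_batch_threshold:
            batches.append(command_count)
            txn_count = 0
            command_count = 0
    if command_count:
        batches.append(command_count)  # whatever is left over at the end
    return batches


# Test 1 settings: CommitBatchSize = 3, CommitBatchThreshold = 5
print(group_literal([10], 3, 5))       # 10 updates in one explicit transaction -> [10]
print(group_literal([1] * 10, 3, 5))   # 10 autocommit updates -> [3, 3, 3, 1]
print(group_literal([3, 3, 3], 3, 5))  # 3 statements of 3 rows each -> [6, 3]
```

Each printed list matches what Profiler showed for the corresponding Test 1 statement.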

<h5>Test 2</h5> Let's see what happens if we switch the values around and run through the same statements as Test 1: CommitBatchSize: 5 CommitBatchThreshold: 3 Randomly update 10 rows in a single transaction. We still see 1 transaction with 10 updates.

Now the same statement without the transaction. Now we see 4 transactions with updates in batches of 4, 1, 4, and 1. This time CommitBatchThreshold set to 3 was the reason we saw the smaller batches. But wait – shouldn’t a CommitBatchThreshold of 3 mean we should see 3 transactions with 3 updates in each transaction and 1 transaction with 1 update?

Next, the single update statement affecting 10 rows. Profiler shows 1 transaction with 10 updates. CommitBatchThreshold made no difference here.

Finally, updating 3 rows 3 times. Profiler shows 1 transaction with 6 updates and 1 transaction with 3 updates. Just like in Test 1, even though CommitBatchThreshold was set to 3, the number of rows affected by the first two statements exceeded CommitBatchSize (set to 5) and so the third statement was put into its own transaction.

<h5>Conclusions</h5> Based on the profiler traces, here’s what we can conclude: <ul> <li>Replication delivers changes to subscribers one row at a time; a single update statement which changes 1,000 rows at the publisher will result in 1,000 statements at the subscriber. </li> <li>Transactions at the publisher are kept intact at the subscriber. If a million rows are affected inside a transaction – either explicit or autocommit - on the publisher you can count on a million statements in one explicit transaction to be delivered to the subscriber. We saw this in both the statement which updated 10 rows one at a time in an explicit transaction and the statement which updated 10 rows in one shot. </li> <li>The lower of the two values for CommitBatchSize and CommitBatchThreshold is the primary driver for how statements are delivered to subscribers. When the number of transactions on the publisher reaches that lower value the distribution agent will apply them in a single explicit transaction at the subscriber. We saw this in the statement which updated 10 rows one at a time (not in a transaction). </li> <li>The higher of the two values can cause commands to be delivered before the lower value is met. If the number of rows affected at the publisher reaches the higher value the distribution agent will immediately apply the changes in a single explicit transaction at the subscriber. We saw this in the statements which updated 3 rows 3 times. </li> </ul> OK, now that we know how CommitBatchSize and CommitBatchThreshold work, let’s look at the answers to the quiz: <ol> <li>Based on the defaults above, what happens if I issue 1,500 insert\update\delete statements which each affect a single row all contained within an explicit transaction (e.g. BEGIN TRANSACTION, 1,500 updates, then COMMIT)? Answer: Transactions at the publisher are honored at the subscriber so we’ll see one explicit transaction applied at the subscriber which contains 1,500 updates.   
</li> <li>What if those 1,500 statements are not part of a transaction? Answer: Because CommitBatchSize is set to 100 we’ll see 15 transactions at the subscriber, each containing 100 statements which affect one row per statement.   </li> <li>What happens if I issue three update statements: the first affecting 600 rows, the second affecting 500 rows, and the third affecting 400 rows? Answer: Because CommitBatchThreshold is set to 1,000 we’ll see 2 transactions at the subscriber. The first transaction will contain 1,100 statements and the second transaction will contain 400 statements. </li> </ol> BTW, remember that one weird case where CommitBatchSize was set to 5, CommitBatchThreshold was set to 3, and when we updated 10 rows one at a time we saw 4 transactions with updates in batches of 4, 1, 4, and 1? I think that’s a bug in SQL Server. It looks like the distribution agent alternates between (CommitBatchThreshold + 1) and (CommitBatchThreshold - 2) number of commands placed into each transaction delivered to the subscriber. Since this only appears to happen when CommitBatchSize is higher than CommitBatchThreshold – and most people don’t change the values to work that way – this seems relatively insignificant…but still not the behavior that I expected to see. P.S. – I did get a response to my usenet posting from MVPs Hilary Cotter and Paul Ibison. You can read it <a href="http://groups.google.com/group/microsoft.public.sqlserver.replication/browse_thread/thread/e67e86ff450794e3" target="_blank">here</a>. Gopal Ashok, a program manager in the SQL Replication team, confirmed that what I thought was a bug was indeed a bug. Yay me!</div>
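
The conclusions and quiz answers above can be checked with a short simulation. This is a sketch of the rules as stated in the conclusions, not of how the distribution agent is actually implemented; the function name `group_by_conclusions` and its input format are hypothetical. It assumes the lower of the two settings is compared against the number of publisher transactions, the higher against the number of commands, and that publisher transactions are never split.

```python
def group_by_conclusions(publisher_txns, commit_batch_size, commit_batch_threshold):
    """Apply the post's conclusions to a list of publisher transactions.

    publisher_txns: one entry per publisher transaction, holding the number
    of single-row commands it produced.
    Returns the number of commands in each subscriber transaction.
    """
    lower = min(commit_batch_size, commit_batch_threshold)   # compared to transaction count
    higher = max(commit_batch_size, commit_batch_threshold)  # compared to command count
    batches = []
    txn_count = 0
    command_count = 0
    for commands in publisher_txns:
        # Publisher transactions are kept intact at the subscriber.
        txn_count += 1
        command_count += commands
        if txn_count >= lower or command_count >= higher:
            batches.append(command_count)
            txn_count = 0
            command_count = 0
    if command_count:
        batches.append(command_count)
    return batches


# Quiz, using the defaults: CommitBatchSize = 100, CommitBatchThreshold = 1000
quiz_1 = group_by_conclusions([1500], 100, 1000)
quiz_2 = group_by_conclusions([1] * 1500, 100, 1000)
quiz_3 = group_by_conclusions([600, 500, 400], 100, 1000)
print(quiz_1)                   # [1500]
print(len(quiz_2), set(quiz_2)) # 15 {100}
print(quiz_3)                   # [1100, 400]

# Test 2 settings: CommitBatchSize = 5, CommitBatchThreshold = 3
print(group_by_conclusions([3, 3, 3], 5, 3))  # [6, 3]
print(group_by_conclusions([1] * 10, 5, 3))   # [3, 3, 3, 1]
```

The last line is the one place where the sketch and the trace disagree: the rules predict batches of 3, 3, 3, and 1, while Profiler showed 4, 1, 4, and 1, which is the behavior confirmed as a bug.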


About Kendal

Kendal is a database strategist, community advocate, public speaker, and blogger. A practiced IT professional with over 15 years of SQL Server experience, Kendal excels at disaster recovery, high availability planning/implementation, & debugging/troubleshooting mission critical SQL Server environments. Kendal is a Senior Consultant on the Microsoft Premier Developer Support team and President of MagicPASS, the Orlando, FL based chapter of PASS. Before joining Microsoft, Kendal was a SQL Server/Data Platform MVP from 2011-2016.

4 comments

Kurt said...: You have saved me a great deal of time in proving this behavior to a client. Excellent and thorough work.

Kurt Survance
SQL Consulting
http://www.sqlconsulting.com; April 23, 2009 at 2:21 PM
Anonymous said...: Kendal,
Thanks for the information it was very informative. I do have one question and that is regarding CommitBatchSizeTheshold. I had thought that was the max of commands that can be applied in one commit which is 1000. So if I insert 1,500 records in one Transaction why does it come back with 1 transaction, 1500 commands should it say 1 transaction 1000 commands and 1 transaction 500 commands.; June 7, 2010 at 7:43 PM
Unknown said...: Should we replicate indices, not replicate and rebuild on subscriber, or both ?; March 22, 2013 at 9:10 AM
Unknown said...: I notice latency on the subscribers when the housekeeping jobs (agent history clean up and Distribution clean up) on the distributor have long run times (> 15 seconds). They are currently set to the default; is it recommended to alter the schedules of these jobs, or leave "as is"? My distributor is on a dedicated host, I rebuild idxs and updstats on distribution, do not auto shrink devices, so on and so forth. Our publisher does process large updates, 60K rows at most, every hour or so in a few select tables. Is it too much to ask for replication never to exceed 5 seconds latency for P->D and D->S?; March 22, 2013 at 9:17 AM

Kendal Van Dyke
