Friday 20 November 2009

An update-update conflict between two peers on the same node

I recently hit an error in a peer-to-peer replication topology between two nodes, and trying to solve it nearly drove me insane.

The topology was typical of a peer-to-peer design in that updates are done on one node only, while reads are done on both, to gain scale-out and load-balancing benefits. (I was testing this setup before implementing the final topology, which horizontally partitions the updates: both nodes eventually take updates, but one node updates records in the first half of the data and the other node updates records in the second half. Reads are still done on both.)

Anyway, while testing the 'updates on one node' situation I got this conflict error:

"An update-update conflict between peer 1 (incoming) and peer 1 (on disk) was detected and resolved. The incoming update was skipped by peer 2."

This made no sense, as it looks like the conflict was with itself! (i.e. Peer 1 incoming and Peer 1 on disk)

I tore down the replication many times, checked my code, and kept debugging for a full day without any luck. In the end I went to bed square-eyed and disheartened, and woke tired and disappointed at having wasted the whole previous day and got nowhere with a solution. My wife was also concerned about my stress, bless her.

I popped a message on the priority managed groups (luckily we are a Gold Partner) and, lo and behold, MVP Hilary Cotter pointed me to this:

http://support.microsoft.com/kb/973223


This talks about 'dummy updates' causing problems. Dummy updates?
 
Well, in my case the update statement was along the lines of this:
 
UPDATE FRED SET COL = 'ABC' WHERE ID = 123;
 
The problem was that COL was already 'ABC' (and so really the update wasn't necessary), but I had not coded for this. Although the update was a valid statement, it sent Replication into a spin.
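
One way to avoid issuing a dummy update in the first place is to make the statement skip rows that already hold the target value. The following is only a sketch, reusing the same made-up table and column names as above; it is not taken from the KB article, and the extra IS NULL check assumes COL is nullable.

```sql
-- Hypothetical guarded version of the statement above.
-- The extra predicate means the row is only touched when the stored
-- value actually differs, so an unnecessary update changes zero rows.
UPDATE FRED
SET    COL = 'ABC'
WHERE  ID = 123
  AND  (COL <> 'ABC' OR COL IS NULL);  -- IS NULL check: assumes COL allows NULLs
```

With the guard in place, running the statement against a row where COL is already 'ABC' affects no rows, so there should be no change for replication to deliver to the other peer.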
 
I hope this saves someone else some pain and anger.
