Thanks for the information, Greg.

Unfortunately, modifying the application stack this close to the holiday season won’t be an option, so I’m left with:

1) Optimizing the settings I have for my query mix.
2) Optimizing any long-running DML queries (if any) to prevent lag due to locks.
3) Getting a better understanding of “what” causes lag.

#3 will probably be central to at least minimizing lag during heavy DML load. If anyone has a good resource describing when a slave would potentially start to lag, that would help me hunt for the cause. I know long-running DML on the master may cause lag, but I’m uncertain as to the specifics of why.

During periods of lag we do have more DML than usual running against the master, but the queries themselves are very quick, although there might be 20-30 DML operations per second against some of our central tables that store user account information. Even under heavy DML the queries still return in under a second. Could a large volume of short-running DML cause replication lag issues for large tables (~20M)?

Thanks again for your help. BDR looks interesting but is probably too cutting edge for my client.

Mike Wilson