* James Mansion (james@xxxxxxxxxxxxxxxxxxxxxx) wrote: > Fine. But like I said, I'd suggest measuring the fractional improvement > for this > when sending multi-row inserts before writing something complex. I > think the > big will will be doing multi-row inserts at all. You're re-hashing things I've already said. The big win is batching the inserts, however that's done, into fewer transactions. Sure, multi-row inserts could be used to do that, but so could dropping begin/commits in right now which probably takes even less effort. > If you are going to > prepare then > you'll need a collection of different prepared statements for different > batch sizes > (say 1,2,3,4,5,10,20,50) and things will get complicated. A multi-row > insert > with unions and dynamic SQL is actually rather universal. No, as was pointed out previously already, you really just need 2. A single-insert, and a batch insert of some size. It'd be interesting to see if there's really much of a performance difference between a 50-insert prepared statement, and 50 1-insert prepared statements. If they're both done in larger transactions, I don't know that there's really alot of performance difference. > Personally I'd implement that first (and it should be easy to do across > multiple > dbms types) and then return to it to have a more complex client side with > prepared statements etc if (and only if) necessary AND the performance > improvement were measurably worthwhile, given the indexing and storage > overheads. storage overhead? indexing overhead? We're talking about prepared statements here, what additional storage requirement do you think those would impose? What additional indexing overhead? I don't believe we actually do anything differently between prepared statements and multi-row inserts that would change either of those. > There is no point optimising away the CPU of the simple parse if you are > just going to get hit with a lot of latency from round trips, and forming a > generic multi-insert SQL string is much, much easier to get working as a > first > step. Server CPU isn't a bottleneck all that often - and with something as > simple as this you'll hit IO performance bottlenecks rather easily. Ah, latency is a reasonable thing to bring up. Of course, if you want to talk about latency then you get to consider that multi-insert SQL will inherently have larger packet sizes which could cause them to be delayed in some QoS arrangements. As I said, most of this is a re-hash of things already said. The low-hanging fruit here is doing multiple inserts inside of a transaction, rather than 1 insert per transaction. Regardless of how that's done, it's going to give the best bang-for-buck. It will complicate the client code some, regardless of how it's implemented, so that failures are handled gracefully (if that's something you care about anyway), but as there exists some queueing mechanisms in rsyslog already, hopefully it won't be too bad. Thanks, Stephen
Attachment:
signature.asc
Description: Digital signature