On Mon, Jan 6, 2020 at 10:20 AM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
> Or do a middle ground thing where we use 32-bit arithmetic
> for the per-station weights, but go to 64-bit for the weight sum? I
> don't really have a good grip on how much of a performance impact we're
> talking about here, so I'm not sure which I prefer...

Double width accumulation is very common in many applications. Double width
addition and comparison are _much_ cheaper than double width multiplication
and division.

/john
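
For illustration only, a minimal sketch in C of the middle ground discussed above: 32-bit per-station weights with a 64-bit weight sum. The struct and function names are hypothetical and are not taken from mac80211. The sketch assumes the sum is only ever added to, subtracted from, and compared, so no double width multiply or divide is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-station state, not the real mac80211 structure. */
struct sta_sketch {
	uint32_t weight;	/* per-station weight stays 32-bit */
};

/* Accumulate 32-bit weights into a 64-bit sum: one widening add each. */
static uint64_t weight_sum(const struct sta_sketch *stas, size_t n)
{
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += stas[i].weight;	/* zero-extended, then 64-bit add */
	return sum;
}

/* Adjust the sum when one station's weight changes: 64-bit add/sub only. */
static uint64_t weight_sum_update(uint64_t sum, uint32_t old_w, uint32_t new_w)
{
	assert(sum >= old_w);	/* old weight must be part of the sum */
	return sum - old_w + new_w;
}

/* Compare the sum against a limit: a 64-bit compare, no mul or div. */
static int weight_sum_exceeds(uint64_t sum, uint64_t limit)
{
	return sum > limit;
}
```

On a 32-bit target the 64-bit add, subtract and compare above would typically compile to a couple of instructions each (add with carry, compare high then low), whereas a 64-bit divide usually ends up as a library call.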