Addy, On Mon, Sep 29, 2014 at 12:45 AM, addy ke <addy.ke at rock-chips.com> wrote: > Hi Doug > > On 2014/9/29 12:39, Doug Anderson wrote: >> Addy, >> >> On Sat, Sep 27, 2014 at 12:11 AM, Addy Ke <addy.ke at rock-chips.com> wrote: >>> From: Addy <addy.ke at rock-chips.com> >>> >>> As show in I2C specification: >>> - Standard-mode: the minimum HIGH period of the scl clock is 4.0us >>> the minimum LOW period of the scl clock is 4.7us >>> - Fast-mode: the minimum HIGH period of the scl clock is 0.6us >>> the minimum LOW period of the scl clock is 1.3us >>> >>> I have measured i2c SCL waveforms in fast-mode by oscilloscope >>> on rk3288-pinky board. the LOW period of the scl clock is 1.3us. >>> It is so critical that we must adjust LOW division to increase >>> the LOW period of the scl clock. >>> >>> After put this patch, we can find that min_low_ns is about >>> 5.4us in Standard-mode and 1.6us in Fast-mode. >>> >>> Thanks Doug for the suggestion about division formulas. >>> >>> Signed-off-by: Addy <addy.ke at rock-chips.com> >>> --- >>> Changes in v2: >>> - remove Fast-mode plus and HS-mode >>> - use new formulas suggested by Doug >>> >>> drivers/i2c/busses/i2c-rk3x.c | 104 +++++++++++++++++++++++++++++++++++++++--- >>> 1 file changed, 97 insertions(+), 7 deletions(-) >>> >>> diff --git a/drivers/i2c/busses/i2c-rk3x.c b/drivers/i2c/busses/i2c-rk3x.c >>> index e637c32..81672a8 100644 >>> --- a/drivers/i2c/busses/i2c-rk3x.c >>> +++ b/drivers/i2c/busses/i2c-rk3x.c >>> @@ -428,19 +428,109 @@ out: >>> return IRQ_HANDLED; >>> } >>> >>> +static void rk3x_i2c_get_min_ns(unsigned int scl_rate, >>> + unsigned long *min_low_ns, >>> + unsigned long *min_high_ns) >>> +{ >>> + WARN_ON(scl_rate > 400000); >>> + >>> + /* As show in I2C specification: >>> + * - Standard-mode(100KHz): >>> + * min_low_ns is 4700ns >>> + * min_high_ns is 4000ns >>> + * - Fast-mode(400KHz): >>> + * min_low_ns is 1300ns >>> + * min_high_ns is 600ns >>> + * >>> + * (min_low_ns + min_high_ns) up to 10000ns in Standard-mode >>> + * and 2500ns in Fast-mode >>> + */ >>> + if (scl_rate <= 100000) { >>> + *min_low_ns = 4700 + 650; >>> + *min_high_ns = 4000 + 650; >>> + } else { >>> + *min_low_ns = 1300 + 300; >>> + *min_high_ns = 600 + 300; >> >> Wait, where did the extra 650 and 300 come from here? Are you just >> trying to balance things out? That's not the right way to do things. >> Please explain what you were trying to accomplish and we can figure >> out a better way. >> >> ...maybe this helped make thins nicer in the clock rates you tried, >> but I'd bet that I can find clock rates that this is broken for... >> >> > 650 = [(1000000000/100K) ns - min_low_ns_100k - min_high_ns_100k] / 2 > 300 = [(1000000000/300K) ns - min_low_ns_400k - min_high_ns_400k] / 2 > > if there is no extra 650 and 300: > extra_div is 3 in fast-mode and 11 in standard-mode. Umm, OK. Except that "min_low_ns" is no longer the actual minimum "ns" we need according to the spec. That will mean that you're sometimes going to end up with a clock rate that's slower than you want. > 1)if all of the "extra" is asssigned to the high part, the LOW period is too close to min_low. > In my test: > 400k, scl_low_ns: 1440ns, scl_high_ns: 1120ns > 100k, scl_low_ns: 4800ns, scl_high_ns: 5120ns Why does it matter if it's close to "min_low"? It's above the minimum so it's fine, right? If you need a buffer, add an explicit buffer to the calculations. > 2)if not, dato hold time is too close to min_hold_time(400khz, 900ns) > In my test: > 400k, scl_low_ns: 1760ns, scl_high_ns: 800ns > 100k, scl_low_ns: 5304ns, scl_high_ns: 4368ns > > hold_time_ns ~= scl_low_ns / 2 = 880ns It's still under, though. Why does this matter? Again, if you need a buffer then add a buffer. > So I think we must decrease min_low_ns:min_high_ns to decrease scl_low in case 2 > > So I set the extra 650 and 300. > > After this, in my test: > 400k, scl_low_ns: 1600ns, scl_high_ns: 960ns > 100k, scl_low_ns: 5200ns, scl_hign_ns: 4576ns > > I think this value is more reasonable. You're testing with a very particular clock input rate. The function needs to be general and work across a lot of different input clock rates. That's why my python code has loops in it to test the results over a huge variety of input clock rates... --- Here are a few examples of differences (SEP 30 is my new proposal below): With this example your code produces a slightly slower clock rate than desirable: SEP 30: CLK = 74250000, Req = 100000, act = 99798.39, 5.49 us low, 4.53 us high, low/high = 1.21 (50 41) AddyV2: CLK = 74250000, Req = 100000, act = 98736.70, 5.39 us low, 4.74 us high, low/high = 1.14 (49 43) Here's another example where your patch doesn't produce ideal timings (in this case the difference is more pronounced) SEP 30: CLK = 66380000, Req = 400000, act = 395119.05, 1.69 us low, 0.84 us high, low/high = 2.00 (13 6) AddyV2: CLK = 66380000, Req = 400000, act = 377159.09, 1.69 us low, 0.96 us high, low/high = 1.75 (13 7) Here's an example your code violates timings (true, a 1.61MHz clock isn't super realistic in rk3288, but it does show a non-ideal case): SEP 30: CLK = 1610000, Req = 100000, act = 67083.33, 4.97 us low, 9.94 us high, low/high = 0.50 (0 1) AddyV2: CLK = 1610000, Req = 100000, act = 67083.33, 9.94 us low, 4.97 us high, low/high = 2.00 (1 0) ...ERROR: low too long 9.94 us > 6.85 us (/2 4.97 us > 3.42 us) (conflicting=0) --- I'll paste new code here with a proposal that adds a buffer (could be 0) and also send you an attachment in a separate email (I can't quite believe that the lists will handle attachments). One thing you'll notice is that with sufficiently slow input clocks it's possible to get into a state where we can't meet both the minimum and maximum times for SCL being low. This probably isn't super realistic, but the code could certainly have a warning. --- def DIV_ROUND_UP(a, b): return (a + b - 1) // b # This is bassed on v2 sent by Addy (fixed DIV_ROUND_UP) def test_it_addy_v2(min_low_ns, min_high_ns, max_low_ns, i2c_rate, scl_rate): # Add the buffer in suggested by addy if min_low_ns == 4700: min_low_ns += 650 min_high_ns += 650 else: min_low_ns += 300 min_high_ns += 300 min_total_ns = min_low_ns + min_high_ns # Adjust to avoid overflow i2c_rate_khz = DIV_ROUND_UP(i2c_rate, 1000) scl_rate_khz = DIV_ROUND_UP(scl_rate, 1000) # We need the total div to be >= this number so we don't clock too fast. min_total_div = DIV_ROUND_UP(i2c_rate_khz, scl_rate_khz * 8); # These are the min dividers needed for min hold times. min_low_div = DIV_ROUND_UP(i2c_rate_khz * min_low_ns, 8 * 1000000) min_high_div = DIV_ROUND_UP(i2c_rate_khz * min_high_ns, 8 * 1000000) min_div_for_hold = (min_low_div + min_high_div) if min_div_for_hold > min_total_div: # Time needed to meet hold requirements is important. Just use that div_low = min_low_div div_high = min_high_div else: # We've got to distribute some time among the low and high so we # don't run too fast. extra_div = min_total_div - min_div_for_hold # We'll try to split things up perfectly evenly, biasing slightly # towards having a higher div for low (spend more time low). ideal_low_div = DIV_ROUND_UP(i2c_rate_khz * min_low_ns, scl_rate_khz * 8 * min_total_ns) # Handle when the ideal low div is going to take up more than we have if ideal_low_div > min_low_div + extra_div: assert ideal_low_div == min_low_div + extra_div + 1 ideal_low_div = min_low_div + extra_div # Give low the "ideal" and give high whatever extra is left. div_low = ideal_low_div div_high = min_high_div + (extra_div - (ideal_low_div - min_low_div)) # Adjust to the fact that the hardware has an implicit "+1". # NOTE: Above calculations always produce div_low > 0 and div_high > 0. div_low -= 1 div_high -= 1 return div_low, div_high, False # Note: we don't calculate "conflicting" # test_it_sep30 - Sept 30th proposal for i2c stuff # # min_low_ns: The minimum number of ns we need to hold low to meet i2c spec # min_high_ns: The minimum number of ns we need to hold high to meet i2c spec # max_low_ns: The maximum number of ns we can hold low to meet i2c spec # # Note: max_low_ns should be (max data hold time * 2 - buffer) # This is because the i2c host on Rockchip holds the data line for # half the low time. def test_it_sep30(min_low_ns, min_high_ns, max_low_ns, i2c_rate, scl_rate): min_total_ns = min_low_ns + min_high_ns # Adjust to avoid overflow i2c_rate_khz = DIV_ROUND_UP(i2c_rate, 1000) scl_rate_khz = DIV_ROUND_UP(scl_rate, 1000) # We need the total div to be >= this number so we don't clock too fast. min_total_div = DIV_ROUND_UP(i2c_rate_khz, scl_rate_khz * 8); # These are the min dividers needed for min hold times. min_low_div = DIV_ROUND_UP(i2c_rate_khz * min_low_ns, 8 * 1000000) min_high_div = DIV_ROUND_UP(i2c_rate_khz * min_high_ns, 8 * 1000000) min_div_for_hold = (min_low_div + min_high_div) # This is the maximum divider so we don't go over the max. We don't round up # here (we round down) since this is a max. max_low_div = i2c_rate_khz * max_low_ns / (8 * 1000000) # Giving different rounding, it's possible that min > max (!?!) We're # totally out of luck in that case, so let's go with getting clocks # right I guess... if min_low_div > max_low_div: # print "WARNING: conflicting restrictions" conflicting = True max_low_div = min_low_div else: conflicting = False if min_div_for_hold > min_total_div: # Time needed to meet hold requirements is important. Just use that div_low = min_low_div div_high = min_high_div else: # We've got to distribute some time among the low and high so we # don't run too fast. extra_div = min_total_div - min_div_for_hold # We'll try to split things up perfectly evenly, biasing slightly # towards having a higher div for low (spend more time low). ideal_low_div = DIV_ROUND_UP(i2c_rate_khz * min_low_ns, scl_rate_khz * 8 * min_total_ns) # Don't allow it to go over the max if ideal_low_div > max_low_div: ideal_low_div = max_low_div # Handle when the ideal low div is going to take up more than we have if ideal_low_div > min_low_div + extra_div: assert ideal_low_div == min_low_div + extra_div + 1 ideal_low_div = min_low_div + extra_div # Give low the "ideal" and give high whatever extra is left. div_low = ideal_low_div div_high = min_high_div + (extra_div - (ideal_low_div - min_low_div)) # Adjust to the fact that the hardware has an implicit "+1". # NOTE: Above calculations always produce div_low > 0 and div_high > 0. div_low -= 1 div_high -= 1 return div_low, div_high, conflicting def print_it(label, min_low_ns, min_high_ns, max_low_ns, i2c_rate, scl_rate, div_low, div_high, conflicting): T_pclk_us = 1000000. / i2c_rate T_sclk_us = 1000000. / scl_rate T_low_us = T_pclk_us * (div_low + 1) * 8 T_high_us = T_pclk_us * (div_high + 1) * 8 T_tot_us = (T_high_us + T_low_us) freq = 1000000. / T_tot_us print "%s: CLK = %d, Req = %d, act = %.2f, %.2f us low, " \ "%.2f us high, low/high = %.2f (%d %d)" % ( label, i2c_rate, scl_rate, freq, T_low_us, T_high_us, T_low_us / T_high_us, div_low, div_high) if T_low_us * 1000 < min_low_ns: print "...ERROR: not low long enough" if T_high_us * 1000 < min_high_ns: print "...ERROR: not high long enough" if T_low_us * 1000 > max_low_ns: print "...ERROR: low too long %.2f us > %.2f us " \ "(/2 %.2f us > %.2f us) (conflicting=%d)" % ( T_low_us, max_low_ns / 1000., T_low_us / 2., max_low_ns / 2000., conflicting) elif conflicting: print "...Conflicting, but not low too long?" return (i2c_rate, scl_rate, freq, T_low_us, T_high_us) def test_it(min_low_ns, min_high_ns, max_low_ns, i2c_rate, scl_rate): sep30 = test_it_sep30(min_low_ns, min_high_ns, max_low_ns, i2c_rate, scl_rate) addy_v2 = test_it_addy_v2(min_low_ns, min_high_ns, max_low_ns, i2c_rate, scl_rate) print_it("SEP 30", min_low_ns, min_high_ns, max_low_ns, i2c_rate, scl_rate, sep30[0], sep30[1], sep30[2]) print_it("AddyV2", min_low_ns, min_high_ns, max_low_ns, i2c_rate, scl_rate, addy_v2[0], addy_v2[1], addy_v2[2]) # Test to see where results are differet #if (sep30[0], sep30[1]) != (addy_v2[0], addy_v2[1]): # print_it("SEP 30", min_low_ns, min_high_ns, max_low_ns, # i2c_rate, scl_rate, sep30[0], sep30[1], sep30[2]) # print_it("AddyV2", min_low_ns, min_high_ns, max_low_ns, # i2c_rate, scl_rate, addy_v2[0], addy_v2[1], addy_v2[2]) # Test to see where clock rates are different. #if (sep30[0] + sep30[1]) != (addy_v2[0] + addy_v2[1]): # print_it("SEP 30", min_low_ns, min_high_ns, max_low_ns, # i2c_rate, scl_rate, sep30[0], sep30[1], sep30[2]) # print_it("AddyV2", min_low_ns, min_high_ns, max_low_ns, # i2c_rate, scl_rate, addy_v2[0], addy_v2[1], addy_v2[2]) min_low_std_ns = 4700 min_high_std_ns = 4000 max_data_hold_std_ns = 3450 data_hold_buffer_std_ns = 50 min_low_fast_ns = 1300 min_high_fast_ns = 600 max_data_hold_fast_ns = 900 data_hold_buffer_fast_ns = 50 test_it(min_low_std_ns, min_high_std_ns, max_data_hold_std_ns * 2 - data_hold_buffer_std_ns, 74250000, 100000) test_it(min_low_std_ns, min_high_std_ns, max_data_hold_std_ns * 2 - data_hold_buffer_std_ns, 2000001, 100000) test_it(min_low_std_ns, min_high_std_ns, max_data_hold_std_ns * 2 - data_hold_buffer_std_ns, 1610000, 100000) test_it(min_low_std_ns, min_high_std_ns, max_data_hold_std_ns * 2 - data_hold_buffer_std_ns, 1484000, 100000) test_it(min_low_fast_ns, min_high_fast_ns, max_data_hold_fast_ns * 2 - data_hold_buffer_fast_ns, 66380000, 400000) test_it(min_low_fast_ns, min_high_fast_ns, max_data_hold_fast_ns * 2 - data_hold_buffer_fast_ns, 74250000, 400000) test_it(min_low_fast_ns, min_high_fast_ns, max_data_hold_fast_ns * 2 - data_hold_buffer_fast_ns, 12800000, 400000) test_it(min_low_fast_ns, min_high_fast_ns, max_data_hold_fast_ns * 2 - data_hold_buffer_fast_ns, 9400000, 400000) test_it(min_low_fast_ns, min_high_fast_ns, max_data_hold_fast_ns * 2 - data_hold_buffer_fast_ns, 6400000, 400000) test_it(min_low_fast_ns, min_high_fast_ns, max_data_hold_fast_ns * 2 - data_hold_buffer_fast_ns, 5000000, 400000) test_it(min_low_fast_ns, min_high_fast_ns, max_data_hold_fast_ns * 2 - data_hold_buffer_fast_ns, 3200000, 400000) test_it(min_low_fast_ns, min_high_fast_ns, max_data_hold_fast_ns * 2 - data_hold_buffer_fast_ns, 1600000, 400000) test_it(min_low_fast_ns, min_high_fast_ns, max_data_hold_fast_ns * 2 - data_hold_buffer_fast_ns, 800000, 400000) #for i in xrange(74250000, 800000, -100): # test_it(min_low_std_ns, min_high_std_ns, # max_data_hold_std_ns * 2 - data_hold_buffer_std_ns, i, 100000) # #for i in xrange(74250000, 800000, -100): # test_it(min_low_fast_ns, min_high_fast_ns, # max_data_hold_fast_ns * 2 - data_hold_buffer_fast_ns, i, 400000) --- -Doug