Re: [LLVM PATCH] bpf: fix wrong truncation elimination when there is back-edge/loop

Jiong Wang <jiong.wang@xxxxxxxxxxxxx> · Wed, 16 Oct 2019 16:33:15 +0100

Jiong Wang writes:

> Yonghong Song writes:
>
>> On 10/12/19 12:18 AM, Jiong Wang wrote:
>>> Currently, BPF backend is doing truncation elimination. If one truncation
>>> is performed on a value defined by narrow loads, then it could be redundant
>>> given BPF loads zero extend the destination register implicitly.
>>> 
>>> When the definition of the truncated value is a merging value (PHI node)
>>> that could come from different code paths, then checks need to be done on
>>> all possible code paths.
>>> 
>>> Above described optimization was introduced as r306685, however it doesn't
>>> work when there is back-edge, for example when loop is used inside BPF
>>> code.
>>> 
>>> For example for the following code, a zero-extended value should be stored
>>> into b[i], but the "and reg, 0xffff" is wrongly eliminated which then
>>> generates corrupted data.
>>> 
>>> void cal1(unsigned short *a, unsigned long *b, unsigned int k)
>>> {
>>>    unsigned short e;
>>> 
>>>    e = *a;
>>>    for (unsigned int i = 0; i < k; i++) {
>>>      b[i] = e;
>>>      e = ~e;
>>>    }
>>> }
>>> 
>>> The reason is r306685 was trying to do the PHI node checks inside isel
>>> DAG2DAG phase, and the checks are done on MachineInstr. This is actually
>>> wrong, because MachineInstr is being built during isel phase and the
>>> associated information is not completed yet. A quick search shows none
>>> target other than BPF is access MachineInstr info during isel phase.
>>> 
>>> For an PHI node, when you reached it during isel phase, it may have all
>>> predecessors linked, but not successors. It seems successors are linked to
>>> PHI node only when doing SelectionDAGISel::FinishBasicBlock and this
>>> happens later than PreprocessISelDAG hook.
>>> 
>>> Previously, BPF program doesn't allow loop, there is probably the reason
>>> why this bug was not exposed.
>>> 
>>> This patch therefore fixes the bug by the following approach:
>>>   - The existing truncation elimination code and the associated
>>>     "load_to_vreg_" records are removed.
>>>   - Instead, implement truncation elimination using MachineSSA pass, this
>>>     is where all information are built, and keep the pass together with other
>>>     similar peephole optimizations inside BPFMIPeephole.cpp. Redundant move
>>>     elimination logic is updated accordingly.
>>>   - Unit testcase included + no compilation errors for kernel BPF selftest.
>>
>> Thanks for the fix. The code looks good. Just two minor comments.
>
> Thanks Yonghong. Your comments make sense to me, will fix them.
>
>> After the fix, could you directly push to the llvm repo?
>
> Sure will do.
>
> (And I will update my llvm account email first, should be quick, if it takes
> too long will come back to you for committing help)

Fix pushed after two minor comments addressed and re-unit-tested:

  https://github.com/llvm/llvm-project/commit/ec51851026a55e1cfc7f006f0e75f0a19acb32d3

Regards,
Jiong