Re: [PATCH] drm/amdgpu: csa_vaddr should not larger than AMDGPU_GMC_HOLE_START

Christian König <ckoenig.leichtzumerken@xxxxxxxxx> · Fri, 18 Jan 2019 10:11:47 +0100



    Hi Monk,

      
      > You see that for UMD, it can use 0 to HOLE_START

      Let me say it once more: Neither the UMD nor anybody else CAN use 0 to
      HOLE_START; that region is reserved for the ATC hardware!

      
      We unfortunately didn't know that initially and also didn't use
      the ATC, so we didn't run into a problem.

      
      But ROCm now uses the ATC on Raven/Picasso and I have a branch
      where I enable it on Vega as well. So if we don't fix that, we
      will run into problems here.

      
      The ATC isn't usable in combination with SRIOV and I don't think
      Windows uses it either, so they probably never ran into any
      issues.

      
      > Do you mean that even the UMD should not use virtual addresses
      > that fall in the range from 0 to HOLE_START?

      Yes, exactly! Combined with ATC use, that can have quite a bunch of
      strange and hard-to-track-down effects, because two parts of the
      driver are using the same address space.

      
      > In that case, where should the UMD work? I assume our UMD is still
      > using this range; this part puzzles me.

        
      At least Mesa now uses the high address space from
      HOLE_END..0xFFFF FFFF FFFF FFFF.

      
      Regards,

      Christian.

      
      On 18.01.19 at 02:32, Liu, Monk wrote:

    
        Thanks Christian,

        Questions I have now:
         
        
        1. You see that for UMD, it can use 0 to HOLE_START, so why can't
           the CSA use that range, even though it is, as you said, reserved
           for the ATC h/w? Note that for the Windows KMD, the CSA is
           allocated by the UMD driver, so the CSA shares the same
           aperture/space range with other UMD BOs, which means the CSA on
           Windows is also located in the ATC range. If that’s a problem,
           why does Windows still work well?

           Can you illustrate this limitation in more detail? We need to
           understand why the CSA couldn’t be put in the ATC range.

        2. According to your previous description: "Now on
           Vega/Raven/Picasso etc.. (everything with a GFX9) the lower range
           (0x0-0x8000 0000 0000) is reserved for SVA/ATC use. Since we
           unfortunately didn't know that initially we exposed those to
           older user space as usable and also put the CSA in there."

           Do you mean that even the UMD should not use virtual addresses
           that fall in the range from 0 to HOLE_START?
            
          
        In that case, where should the UMD work? I assume our UMD is still
        using this range; this part puzzles me.
          
         
        /Monk
        
          
          From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Koenig, Christian
          Sent: Thursday, January 17, 2019 9:26 PM
          To: Liu, Monk <Monk.Liu@xxxxxxx>; Lou, Wentao <Wentao.Lou@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx; Zhu, Rex <Rex.Zhu@xxxxxxx>
          Cc: Deng, Emily <Emily.Deng@xxxxxxx>
          Subject: Re: [PATCH] drm/amdgpu: csa_vaddr should not larger than AMDGPU_GMC_HOLE_START
          
        
          Hi Monk,

            
          > Regarding the above sentence, do you mean this range
          > (0->HOLE_START) shouldn’t be exposed to user space? I don’t get
          > your point here …

          Yes exactly. As I said, the problem is that 0->HOLE_START is
          reserved for the ATC hardware; we should not touch it at all.

            
          > Putting the CSA in 0~HOLE_START is the legacy approach we have
          > used for a long time, since a very early stage. How come you
          > think it is a problem now?

          That turned out never to have been a good idea in the first place.

            
            What we could do is reduce the max_pfn for SRIOV because the
            ATC doesn't work in that configuration anyway. But I would
            only do this as a last resort.

            
            Any idea why an address above the hole doesn't work with
            SRIOV? It seems to work fine in the bare metal case.

            
            Regards,

            Christian.

            
            On 17.01.19 at 14:19, Liu, Monk wrote:
        
        
          Hi Christian,

          Thanks for explaining the hole for us.

          My understanding is that we could still put the CSA in
          0~HOLE_START, because we can report to the UMD that the max space
          is HOLE_START-CSA_SIZE, so no collision will occur.
           
          > Now on Vega/Raven/Picasso etc.. (everything with a GFX9) the
          > lower range (0x0-0x8000 0000 0000) is reserved for SVA/ATC
          > use. Since we unfortunately didn't know that initially we
          > exposed those to older userspace as usable and also put the
          > CSA in there.

            
          Regarding the above sentence, do you mean this range
          (0->HOLE_START) shouldn’t be exposed to user space? I don’t get
          your point here …

          Putting the CSA in 0~HOLE_START is the legacy approach we have
          used for a long time, since a very early stage. How come you
          think it is a problem now?
           
          /Monk
          
            
            From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Koenig, Christian
            Sent: Thursday, January 17, 2019 4:30 PM
            To: Liu, Monk <Monk.Liu@xxxxxxx>; Lou, Wentao <Wentao.Lou@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx; Zhu, Rex <Rex.Zhu@xxxxxxx>
            Cc: Deng, Emily <Emily.Deng@xxxxxxx>
            Subject: Re: [PATCH] drm/amdgpu: csa_vaddr should not larger than AMDGPU_GMC_HOLE_START
            
          
            Hi Monk,

              
              ok let me explain a bit more how the hardware works.

              
              The GMC manages a virtual 64bit address space, but only
              48bit of that virtual address space are handled by the
              page table walker.

              
              The 48 bits of address space are sign extended, so bit 47
              is extended into bits 48-63.

              
              This gives us the following memory layout:

              0x0
              .... virtual address space
              0x8000 0000 0000
              .... hole
              0xFFFF 8000 0000 0000
              .... virtual address space
              0xFFFF FFFF FFFF FFFF

              
              Trying to access the hole results in a range fault
              interrupt IIRC.

              
              When doing the VM page table walk the topmost 16 bits are
              ignored, so when programming the page table walker you cut
              those off and use a linear address again. This is what
              AMDGPU_GMC_HOLE_MASK is good for.
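
A minimal sketch of the address handling described above, written as standalone C rather than kernel code. The constant and function names are illustrative assumptions modelled on the values quoted in this mail (hole from 0x8000 0000 0000 to 0xFFFF 8000 0000 0000, topmost 16 bits dropped for the walker); they are not copied from the amdgpu headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values taken from the layout in this mail (assumed names). */
#define HOLE_START 0x0000800000000000ULL /* first address inside the hole */
#define HOLE_END   0xFFFF800000000000ULL /* first address above the hole */
#define HOLE_MASK  0x0000FFFFFFFFFFFFULL /* keeps the low 48 bits */

/* Linear 48-bit address -> sign-extended address: copy bit 47 into bits 48-63. */
static inline uint64_t sign_extend48(uint64_t linear)
{
    if (linear & (1ULL << 47))
        linear |= ~HOLE_MASK;
    return linear;
}

/* Sign-extended address -> linear address, as used for the page table walker. */
static inline uint64_t to_linear(uint64_t addr)
{
    return addr & HOLE_MASK;
}

/* True if addr lies between the two usable ranges, i.e. inside the hole. */
static inline bool in_hole(uint64_t addr)
{
    return addr >= HOLE_START && addr < HOLE_END;
}
```

For example, sign_extend48(0x0000FFFFFFF00000) yields 0xFFFFFFFFFFF00000, which lies above the hole, and to_linear() maps it back. The unextended value 0x0000FFFFFFF00000 itself falls inside the hole.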

              
              Now on Vega/Raven/Picasso etc.. (everything with a GFX9)
              the lower range (0x0-0x8000 0000 0000) is reserved for
              SVA/ATC use. Since we unfortunately didn't know that
              initially we exposed those to older userspace as usable
              and also put the CSA in there.

              
              The most likely cause of the problem is that we still have a
              bug somewhere in this handling, e.g. not correctly using sign
              extended addresses *OR* using sign extended addresses
              where we should use linear ones instead.

              
              Regards,

              Christian.

              
              On 17.01.19 at 09:04, Liu, Monk wrote:
          
          
            Hi Christian,

            I believe Wentao can fix the issue we hit with the steps below:

            1. Return virtual_address_max (the UMD uses it) as
               HOLE_START - RESERVED_SIZE
            2. [optional] Still keep virtual_address_offset at RESERVED_SIZE
               (the current way; I think it’s because we previously put the
               CSA in the 0 -> RESERVED_SIZE space)
            3. Put the CSA in HOLE_START - RESERVED_SIZE -> HOLE_START
               (it’s the current design)

            I don’t see where the above scheme is incorrect … can you
            explain GMC_HOLE_START in more detail?
             
            e.g.

            - Why did you set GMC_HOLE_START to 0x8’000’0000’0000 (half the
              size of the max 48-bit address space)? Is it for HSA purposes,
              to make sure a GPU address can also be used as a CPU address?
            - Now that MAX_PFN is 1’000’0000’0000, do you need to change
              GMC_HOLE_START?

            Thanks, we need some catching up.
              
             
            /Monk
             
            
                From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Koenig, Christian
                Sent: Thursday, January 17, 2019 3:39 PM
                To: Lou, Wentao <Wentao.Lou@xxxxxxx>; Liu, Monk <Monk.Liu@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx; Zhu, Rex <Rex.Zhu@xxxxxxx>
                Cc: Deng, Emily <Emily.Deng@xxxxxxx>
                Subject: Re: [PATCH] drm/amdgpu: csa_vaddr should not larger than AMDGPU_GMC_HOLE_START
              
            
              On 17.01.19 at 04:17, Lou, Wentao wrote:
            
            
              > Hi Christian,
              >
              > Your solution was:
              > addr = (max_pfn - (AMDGPU_VA_RESERVED_SIZE >> AMDGPU_PAGE_SHIFT)) << AMDGPU_PAGE_SHIFT;
              > Now max_pfn = 0x10 0000 0000, AMDGPU_VA_RESERVED_SIZE = 0x10 0000,
              > AMDGPU_PAGE_SHIFT = 12.
              > We still got addr = 0xFFFF FFF0 0000, which would cause a ring
              > gfx timeout.
            
            
              But 0xFFFF FFF0 0000 is the correct address, if that is
              causing a problem then there is a bug somewhere else.
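
To make the numbers above concrete, here is a small standalone sketch (not driver code) that evaluates both candidate addresses with the values Wentao quoted. The macro and function names are shortened stand-ins chosen for this example, and the hole start is assumed to be the 0x8000 0000 0000 mentioned elsewhere in this thread.

```c
#include <stdint.h>

#define PAGE_SHIFT    12
#define RESERVED_SIZE 0x100000ULL            /* 0x10 0000, as quoted above */
#define HOLE_START    0x0000800000000000ULL  /* assumed from this thread */

/* Top of the VM space minus the reserved area: subtract in pages, then shift. */
static inline uint64_t csa_below_top(uint64_t max_pfn)
{
    return (max_pfn - (RESERVED_SIZE >> PAGE_SHIFT)) << PAGE_SHIFT;
}

/* The alternative suggested for testing: reserved area directly below the hole. */
static inline uint64_t csa_below_hole(void)
{
    return HOLE_START - RESERVED_SIZE;
}
```

With max_pfn = 0x10 0000 0000 the first function returns 0xFFFF FFF0 0000 (a linear address, before any sign extension), and the second returns 0x7FFF FFF0 0000.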

              
              Please try to use
              AMDGPU_GMC_HOLE_START-AMDGPU_VA_RESERVED_SIZE as well.
              Does that work?

              
              > Before commit 1bf621c42137926ac249af761c0190a9258aa0db,
              > vm_size was 32GB, and csa_addr was under AMDGPU_GMC_HOLE_START.
            
            
              Wait a second, why was the vm_size 32GB? This is on a
              Vega10, isn't it?

              
              > I didn’t understand why csa_addr needs to be above
              > AMDGPU_GMC_HOLE_START now.
            
            
              On Vega10 the lower range, i.e. everything below
              AMDGPU_GMC_HOLE_START, is reserved for SVA.

              
              Regards,

              Christian.

              
              > Thanks.
              >
              > BR,
              > Wentao
               
               
                   From: Koenig, Christian <Christian.Koenig@xxxxxxx>
                   Sent: Wednesday, January 16, 2019 5:48 PM
                   To: Lou, Wentao <Wentao.Lou@xxxxxxx>; Liu, Monk <Monk.Liu@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx; Zhu, Rex <Rex.Zhu@xxxxxxx>
                   Cc: Deng, Emily <Emily.Deng@xxxxxxx>
                   Subject: Re: [PATCH] drm/amdgpu: csa_vaddr should not larger than AMDGPU_GMC_HOLE_START
                
              
                Hi Wentao,

                  
                  well the problem is you don't seem to understand how
                  the hardware works.

                  
                  See, the engines see an MC address space with a hole in
                  the middle, similar to how the x86 64bit CPU address
                  space works. But the page tables are programmed
                  linearly.

                  
                  So the calculation in amdgpu_driver_open_kms() is
                  correct because it takes the MC address and makes a
                  linear page table index from it again.
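
As a rough illustration of that conversion (a sketch under assumptions, not the code in amdgpu_driver_open_kms()): an MC address above the hole is masked down to 48 bits and then shifted by the page size to get a linear page index. The names below and the 4 KiB page shift are assumptions made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12                       /* assumed 4 KiB GPU pages */
#define HOLE_MASK  0x0000FFFFFFFFFFFFULL    /* keeps the low 48 bits */

/* MC (sign-extended) address -> linear page table index. */
static inline uint64_t mc_to_pfn(uint64_t mc_addr)
{
    return (mc_addr & HOLE_MASK) >> PAGE_SHIFT;
}

/* A page index is only usable if it lies below max_pfn. */
static inline bool pfn_in_range(uint64_t pfn, uint64_t max_pfn)
{
    return pfn < max_pfn;
}
```

With max_pfn = 0x10 0000 0000, the MC address 0xFFFF FFFF FFF0 0000 maps to page index 0xF FFFF FF00, which is in range. Without the mask, the shifted value would be far above max_pfn.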

                  
                  The only thing we might need to fix here is shifting
                  max_pfn before the subtraction and I doubt that even
                  that is necessary.

                  
                  Regards,

                  Christian.

                  
                  On 16.01.19 at 10:34, Lou, Wentao wrote:
              
              
                Hi Christian,
                 
                vm_size is now set to 0x4 0000 GB by the commit below:
                1bf621c42137926ac249af761c0190a9258aa0db
                  drm/amdgpu: Remove unnecessary VM size calculations

                As a result, max_pfn is 0x10 0000 0000.
                amdgpu_csa_vaddr computes max_pfn << 12 to get
                0x1 0000 0000 0000, and then subtracts
                AMDGPU_VA_RESERVED_SIZE to get 0xFFFF FFF0 0000.
                Unfortunately this number lies between AMDGPU_GMC_HOLE_START
                and AMDGPU_GMC_HOLE_END, so amdgpu_gmc_sign_extend is called
                to turn it into 0xFFFF FFFF FFF0 0000.
                 
                In amdgpu_driver_open_kms, the extended csa_addr cannot be
                passed into amdgpu_map_static_csa directly, as it would be
                above the limit of max_pfn.
                So csa_addr was restricted by AMDGPU_GMC_HOLE_MASK to make
                it usable for amdgpu_vm_alloc_pts.
                But this restriction by AMDGPU_GMC_HOLE_MASK makes the
                address fall back into AMDGPU_GMC_HOLE again, which causes a
                GPU reset.
                We just put amdgpu_csa_vaddr back to AMDGPU_GMC_HOLE_START,
                to avoid the address touching AMDGPU_GMC_HOLE.
                By the way, if max_pfn is shifted too far to the left, the
                result is always zero, with or without min(*,*).
                 
                 
                BR,
                Wentao
                 
                 
                -----Original Message-----
                From: Koenig, Christian <Christian.Koenig@xxxxxxx>
                Sent: Tuesday, January 15, 2019 4:02 PM
                To: Liu, Monk <Monk.Liu@xxxxxxx>; Lou, Wentao <Wentao.Lou@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx; Zhu, Rex <Rex.Zhu@xxxxxxx>
                Subject: Re: [PATCH] drm/amdgpu: csa_vaddr should not larger than AMDGPU_GMC_HOLE_START
                 
                On 15.01.19 at 07:19, Liu, Monk wrote:
                > The max_pfn is now 1'0000'0000'0000'0000 (bytes), which is
                > above 48 bits, and ANDing it with AMDGPU_GMC_HOLE_MASK makes
                > it zero ....
                >
                > And in the code of "amdgpu_driver_open_kms()" I saw that
                > @Zhu, Rex wrote:
                >
                > "csa_addr = amdgpu_csa_vaddr(adev) & AMDGPU_GMC_HOLE_MASK".
                > I think this is wrong, since you intentionally place the
                > csa above the GMC hole, right?
                 
                The fix is just completely incorrect, since
                min(adev->vm_manager.max_pfn << AMDGPU_GPU_PAGE_SHIFT,
                AMDGPU_GMC_HOLE_START) still gives you 0 when we shift
                max_pfn too much to the left.

                The correct solution is to subtract the reserved size first
                and then shift. E.g.:

                addr = (max_pfn - (AMDGPU_VA_RESERVED_SIZE >> AMDGPU_PAGE_SHIFT)) << AMDGPU_PAGE_SHIFT;
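
The difference between the two orderings can be seen in a standalone sketch. The names are stand-ins for this example only, and the max_pfn value of 1 << 52 used in the note below is a hypothetical, chosen because shifting it left by 12 no longer fits in 64 bits.

```c
#include <stdint.h>

#define PAGE_SHIFT    12
#define RESERVED_SIZE 0x100000ULL
#define HOLE_START    0x0000800000000000ULL  /* assumed from this thread */

/* Rejected approach: shift first, then clamp with min(). */
static inline uint64_t top_shift_then_min(uint64_t max_pfn)
{
    /* High bits are shifted out; max_pfn == 2^52 yields 0 here. */
    uint64_t top = max_pfn << PAGE_SHIFT;

    return top < HOLE_START ? top : HOLE_START;
}

/* Suggested approach: subtract the reserved pages first, then shift. */
static inline uint64_t base_subtract_then_shift(uint64_t max_pfn)
{
    return (max_pfn - (RESERVED_SIZE >> PAGE_SHIFT)) << PAGE_SHIFT;
}
```

For max_pfn = 1 << 52 the first function returns 0, so the min() has nothing left to clamp, while the second still returns 0xFFFF FFFF FFF0 0000. For max_pfn = 0x10 0000 0000 the second returns 0xFFFF FFF0 0000.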
                 
                Regards,
                Christian.
                 
                > 
                > Looks like we should modify this place
                >
                > /Monk
                >
                > -----Original Message-----
                > From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Christian König
                > Sent: Monday, January 14, 2019 9:05 PM
                > To: Lou, Wentao <Wentao.Lou@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
                > Subject: Re: [PATCH] drm/amdgpu: csa_vaddr should not larger than AMDGPU_GMC_HOLE_START
                >
                > On 14.01.19 at 09:40, wentalou wrote:
                >> After removing unnecessary VM size calculations, vm_manager.max_pfn
                >> would reach 0x10,0000,0000. max_pfn << AMDGPU_GPU_PAGE_SHIFT exceeding
                >> AMDGPU_GMC_HOLE_START would cause a GPU reset.
                >>
                >> Change-Id: I47ad0be2b0bd9fb7490c4e1d7bb7bdacf71132cb
                >> Signed-off-by: wentalou <Wentao.Lou@xxxxxxx>
                > NAK, that is incorrect. We intentionally place the csa above the GMC hole.
                >
                > Regards,
                > Christian.
                >
                >> ---
                >>  drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 3 ++-
                >>  1 file changed, 2 insertions(+), 1 deletion(-)
                >>
                >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
                >> index 7e22be7..dd3bd01 100644
                >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
                >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
                >> @@ -26,7 +26,8 @@
                >>
                >>  uint64_t amdgpu_csa_vaddr(struct amdgpu_device *adev)
                >>  {
                >> -        uint64_t addr = adev->vm_manager.max_pfn << AMDGPU_GPU_PAGE_SHIFT;
                >> +        uint64_t addr = min(adev->vm_manager.max_pfn << AMDGPU_GPU_PAGE_SHIFT,
                >> +                            AMDGPU_GMC_HOLE_START);
                >>
                >>          addr -= AMDGPU_VA_RESERVED_SIZE;
                >>          addr = amdgpu_gmc_sign_extend(addr);
                > _______________________________________________
                > amd-gfx mailing list
                > amd-gfx@xxxxxxxxxxxxxxxxxxxxx
                > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
                 
              
_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx