Re: kernel BUG at include/linux/mm.h:1020!

On Tue, 26 Mar 2019 at 01:31, Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:
>
>
> Ok, thanks.
>
> Trying one last time before putting together a debugging patch to see
> exactly what PFNs are triggering as I still have not reproduced this on a
> local machine. This is another replacement that is based on the assumption
> that it's the free_pfn at the end of the zone that is triggering the
> warning and it happens to be the case the end of a zone is aligned. Sorry
> for the frustration with this and for persisting.
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index f171a83707ce..b4930bf93c8a 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -242,6 +242,7 @@ __reset_isolation_pfn(struct zone *zone, unsigned long pfn, bool check_source,
>                                                         bool check_target)
>  {
>         struct page *page = pfn_to_online_page(pfn);
> +       struct page *block_page;
>         struct page *end_page;
>         unsigned long block_pfn;
>
> @@ -267,20 +268,26 @@ __reset_isolation_pfn(struct zone *zone, unsigned long pfn, bool check_source,
>             get_pageblock_migratetype(page) != MIGRATE_MOVABLE)
>                 return false;
>
> +       /* Ensure the start of the pageblock or zone is online and valid */
> +       block_pfn = pageblock_start_pfn(pfn);
> +       block_page = pfn_to_online_page(max(block_pfn, zone->zone_start_pfn));
> +       if (block_page) {
> +               page = block_page;
> +               pfn = block_pfn;
> +       }
> +
> +       /* Ensure the end of the pageblock or zone is online and valid */
> +       block_pfn += pageblock_nr_pages;
> +       block_pfn = min(block_pfn, zone_end_pfn(zone) - 1);
> +       end_page = pfn_to_online_page(block_pfn);
> +       if (!end_page)
> +               return false;
> +
>         /*
>          * Only clear the hint if a sample indicates there is either a
>          * free page or an LRU page in the block. One or other condition
>          * is necessary for the block to be a migration source/target.
>          */
> -       block_pfn = pageblock_start_pfn(pfn);
> -       pfn = max(block_pfn, zone->zone_start_pfn);
> -       page = pfn_to_page(pfn);
> -       if (zone != page_zone(page))
> -               return false;
> -       pfn = block_pfn + pageblock_nr_pages;
> -       pfn = min(pfn, zone_end_pfn(zone));
> -       end_page = pfn_to_page(pfn);
> -
>         do {
>                 if (pfn_valid_within(pfn)) {
>                         if (check_source && PageLRU(page)) {
> @@ -309,7 +316,7 @@ __reset_isolation_pfn(struct zone *zone, unsigned long pfn, bool check_source,
>  static void __reset_isolation_suitable(struct zone *zone)
>  {
>         unsigned long migrate_pfn = zone->zone_start_pfn;
> -       unsigned long free_pfn = zone_end_pfn(zone);
> +       unsigned long free_pfn = zone_end_pfn(zone) - 1;
>         unsigned long reset_migrate = free_pfn;
>         unsigned long reset_free = migrate_pfn;
>         bool source_set = false;
>
>
>
> --
> Mel Gorman
> SUSE Labs


I do not want to speak too soon, but it looks like this patch has fixed the problem.
The system has already run overnight without a hang (kernel panic).
I will keep watching for a day.

--
Best Regards,
Mike Gavrilov.



