memory-management - About the usage of "__GFP_COMP"?

Tags: memory-management linux-kernel kernel

I haven't found anything useful about it, apart from the comment on __GFP_COMP in the kernel source code, which says: "__GFP_COMP address compound page metadata."

I googled it, but I am still confused.

In addition, I called kzalloc() with the GFP_KERNEL flag on Linux-4.19.82, but the kernel eventually complained and reported the flags as GFP_KERNEL|__GFP_COMP|__GFP_ZERO. I understand why __GFP_ZERO was added to GFP_KERNEL, but where does __GFP_COMP come from?

Here is the relevant code snippet (see gitlab.denx.de/Xenomai/xenomai/-/blob/v3.1/kernel/cobalt/heap.c, line 735):

int xnheap_init(struct xnheap *heap, void *membase, size_t size)
{
    int n, nrpages;
    spl_t s;

    ......    
    nrpages = size >> XNHEAP_PAGE_SHIFT;
    heap->pagemap = kzalloc(sizeof(struct xnheap_pgentry) * nrpages,
                GFP_KERNEL);
    if (heap->pagemap == NULL)
        return -ENOMEM;
    ......

}

Here is the implementation of kzalloc:


/**
 * kzalloc - allocate memory. The memory is set to zero.
 * @size: how many bytes of memory are required.
 * @flags: the type of memory to allocate (see kmalloc).
 */
static inline void *kzalloc(size_t size, gfp_t flags)
{
    return kmalloc(size, flags | __GFP_ZERO);
}

Here is the most relevant line of the "dmesg" output:

    page allocation failure: order:9, mode:0x60c0c0
(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null)

Here is the full log:

    [22041.387673] HelloWorld: page allocation failure: order:9,
mode:0x60c0c0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null)
    [22041.387678] HelloWorld cpuset=/ mems_allowed=0
    [22041.387690] CPU: 3 PID: 27737 Comm: HelloWorld Not tainted
4.19.84
    [22041.387695] I-pipe domain: Linux
    [22041.387697] Call Trace:
    [22041.387711]  dump_stack+0x9e/0xc8
    [22041.387718]  warn_alloc+0x100/0x190
    [22041.387725]  __alloc_pages_slowpath+0xb93/0xbd0
    [22041.387732]  __alloc_pages_nodemask+0x26d/0x2b0
    [22041.387739]  alloc_pages_current+0x6a/0xe0
    [22041.387744]  kmalloc_order+0x18/0x40
    [22041.387748]  kmalloc_order_trace+0x24/0xb0
    [22041.387754]  __kmalloc+0x20e/0x230
    [22041.387759]  ? __vmalloc_node_range+0x171/0x250
    [22041.387765]  xnheap_init+0x87/0x200
    [22041.387770]  ? remove_process+0xc0/0xc0
    [22041.387775]  cobalt_umm_init+0x61/0xb0
    [22041.387779]  cobalt_process_attach+0x64/0x4c0
    [22041.387784]  ? snprintf+0x45/0x70
    [22041.387790]  ? security_capable+0x46/0x60
    [22041.387794]  bind_personality+0x5a/0x120
    [22041.387798]  cobalt_bind_core+0x27/0x60
    [22041.387803]  CoBaLt_bind+0x18a/0x1d0
    [22041.387812]  ? handle_head_syscall+0x3f0/0x3f0
    [22041.387816]  ipipe_syscall_hook+0x119/0x340
    [22041.387822]  __ipipe_notify_syscall+0xd3/0x190
    [22041.387827]  ? __x64_sys_rt_sigaction+0x7b/0xd0
    [22041.387832]  ipipe_handle_syscall+0x3e/0xc0
    [22041.387837]  do_syscall_64+0x3b/0x250
    [22041.387842]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [22041.387847] RIP: 0033:0x7ff3d074e481
    [22041.387852] Code: 89 c6 48 8b 05 10 6b 21 00 c7 04 24 00 00 00 a4 8b
38 85 ff 75 43 bb 00 00 00 10 c7 44 24 04 11 00 00 00 48 89 e7 89 d8 0f 05
<bf> 04 00 00 00 48 89 c3 e8 e2 e0 ff ff 8d 53 26 83 fa 26 0f 87 46
    [22041.387855] RSP: 002b:00007ffc62caf210 EFLAGS: 00000246 ORIG_RAX:
0000000010000000
    [22041.387860] RAX: ffffffffffffffda RBX: 0000000010000000 RCX:
00007ff3d074e481
    [22041.387863] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
00007ffc62caf210
    [22041.387865] RBP: 00007ff3d20a3780 R08: 00007ffc62caf160 R09:
0000000000000000
    [22041.387868] R10: 0000000000000008 R11: 0000000000000246 R12:
00007ff3d0965b00
    [22041.387870] R13: 0000000001104320 R14: 00007ff3d0965d40 R15:
0000000001104050
    [22041.387876] Mem-Info:
    [22041.387885] active_anon:56054 inactive_anon:109301 isolated_anon:0
                    active_file:110190 inactive_file:91980 isolated_file:0
                    unevictable:9375 dirty:1 writeback:0 unstable:0
                    slab_reclaimable:22463 slab_unreclaimable:19122
                    mapped:101678 shmem:25642 pagetables:7663 bounce:0
                    free:456443 free_pcp:0 free_cma:0
    [22041.387891] Node 0 active_anon:224216kB inactive_anon:437204kB
active_file:440760kB inactive_file:367920kB unevictable:37500kB
isolated(anon):0kB isolated(file):0kB mapped:406712kB dirty:4kB
writeback:0kB shmem:102568kB writeback_tmp:0kB unstable:0kB
all_unreclaimable? no
    [22041.387893] Node 0 DMA free:15892kB min:32kB low:44kB high:56kB
active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB
unevictable:0kB writepending:0kB present:15992kB managed:15892kB
mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB
local_pcp:0kB free_cma:0kB
    [22041.387901] lowmem_reserve[]: 0 2804 3762 3762
    [22041.387912] Node 0 DMA32 free:1798624kB min:5836kB low:8704kB
high:11572kB active_anon:188040kB inactive_anon:219400kB
active_file:184156kB inactive_file:346776kB unevictable:24900kB
writepending:0kB present:3017476kB managed:2927216kB mlocked:24900kB
kernel_stack:1712kB pagetables:7564kB bounce:0kB free_pcp:0kB local_pcp:0kB
free_cma:0kB
    [22041.387920] lowmem_reserve[]: 0 0 958 958
    [22041.387930] Node 0 Normal free:11256kB min:1992kB low:2972kB
high:3952kB active_anon:36084kB inactive_anon:218100kB active_file:257220kB
inactive_file:21148kB unevictable:12600kB writepending:4kB
present:1048576kB managed:981268kB mlocked:12600kB kernel_stack:5280kB
pagetables:23088kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
    [22041.387938] lowmem_reserve[]: 0 0 0 0
    [22041.387948] Node 0 DMA: 3*4kB (U) 3*8kB (U) 1*16kB (U) 1*32kB (U)
3*64kB (U) 0*128kB 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB
(M) = 15892kB
    [22041.387990] Node 0 DMA32: 14912*4kB (UME) 13850*8kB (UME) 9325*16kB
(UME) 5961*32kB (UME) 3622*64kB (UME) 2359*128kB (UME) 1128*256kB (UME)
524*512kB (M) 194*1024kB (UM) 0*2048kB 0*4096kB = 1799872kB
    [22041.388033] Node 0 Normal: 1643*4kB (UME) 71*8kB (UME) 47*16kB (UM)
35*32kB (M) 38*64kB (M) 1*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB
0*4096kB = 11572kB
    [22041.388071] Node 0 hugepages_total=0 hugepages_free=0
hugepages_surp=0 hugepages_size=2048kB
    [22041.388073] 232507 total pagecache pages
    [22041.388077] 7 pages in swap cache
    [22041.388079] Swap cache stats: add 1015, delete 1008, find 0/1
    [22041.388081] Free swap  = 995068kB
    [22041.388083] Total swap = 999420kB
    [22041.388086] 1020511 pages RAM
    [22041.388088] 0 pages HighMem/MovableOnly
    [22041.388090] 39417 pages reserved
    [22041.388092] 0 pages hwpoisoned

Best Answer

As you have noticed, kzalloc(size, flags) ends up calling:

kmalloc(size, flags | __GFP_ZERO);

which is how __GFP_ZERO gets added to your GFP_KERNEL.

What happens next depends on the requested size. Based on what you said in the comments under your post, you have:

size == 256*1024
nrpages == size >> 9 == 512
sizeof(struct xnheap_pgentry) == 12

So the final size is 12 * 512 = 6144, which is larger than KMALLOC_MAX_CACHE_SIZE (4096). As you correctly pointed out in the comments, this is quite a large request.
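
To make the arithmetic concrete, here is a tiny stand-alone sketch of that computation. The values XNHEAP_PAGE_SHIFT == 9, sizeof(struct xnheap_pgentry) == 12 and KMALLOC_MAX_CACHE_SIZE == 4096 are assumptions taken from the discussion above, not read from your actual build:

#include <stdio.h>

/* Assumed values (taken from the comments above, not from a real build):
 * XNHEAP_PAGE_SHIFT == 9, sizeof(struct xnheap_pgentry) == 12,
 * KMALLOC_MAX_CACHE_SIZE == 4096. */
#define XNHEAP_PAGE_SHIFT       9
#define PGENTRY_SIZE            12
#define KMALLOC_MAX_CACHE_SIZE  4096

int main(void)
{
    size_t size = 256 * 1024;                   /* heap size reported in the comments */
    size_t nrpages = size >> XNHEAP_PAGE_SHIFT; /* 512 */
    size_t request = PGENTRY_SIZE * nrpages;    /* 6144 */

    printf("nrpages=%zu, kzalloc request=%zu bytes, goes to kmalloc_large: %s\n",
           nrpages, request,
           request > KMALLOC_MAX_CACHE_SIZE ? "yes" : "no");
    return 0;
}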

You therefore enter this branch in kmalloc():

static __always_inline void *kmalloc(size_t size, gfp_t flags)
{
    if (__builtin_constant_p(size)) {
        if (size > KMALLOC_MAX_CACHE_SIZE)
            return kmalloc_large(size, flags); // <<<<<<< HERE

    /* ... */
}
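
Side note: in your case the size is computed at run time, so __builtin_constant_p(size) is false and the call goes through __kmalloc() instead; this matches the __kmalloc -> kmalloc_order_trace -> kmalloc_order sequence in your stack trace (kmalloc_large() is inlined into __kmalloc()). Assuming your kernel uses SLUB, that path takes the same shortcut for large requests; an abridged sketch, not a verbatim listing:

void *__kmalloc(size_t size, gfp_t flags)
{
    struct kmem_cache *s;
    void *ret;

    /* Large requests bypass the slab caches and go straight to
     * kmalloc_large(), just like the compile-time-constant branch above. */
    if (unlikely(size > KMALLOC_MAX_CACHE_SIZE))
        return kmalloc_large(size, flags);

    s = kmalloc_slab(size, flags);
    /* ... regular slab allocation ... */
}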

What happens next is:

  1. kmalloc_large() is:

    static __always_inline void *kmalloc_large(size_t size, gfp_t flags)
    {
        unsigned int order = get_order(size);
        return kmalloc_order_trace(size, flags, order);
    }
    
  2. kmalloc_order_trace() is:

    void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order)
    {
        void *ret = kmalloc_order(size, flags, order);
        trace_kmalloc(_RET_IP_, ret, size, PAGE_SIZE << order, flags);
        return ret;
    }
    
  3. Finally, kmalloc_order() is:

    /*
     * To avoid unnecessary overhead, we pass through large allocation requests
     * directly to the page allocator. We use __GFP_COMP, because we will need to
     * know the allocation order to free the pages properly in kfree.
     */
    void *kmalloc_order(size_t size, gfp_t flags, unsigned int order)
    {
        void *ret;
        struct page *page;
    
        flags |= __GFP_COMP; // <<<<<<<<<<<<<< FLAG __GFP_COMP ADDED HERE
        page = alloc_pages(flags, order);
        ret = page ? page_address(page) : NULL;
        kmemleak_alloc(ret, size, 1, flags);
        kasan_kmalloc_large(ret, size, flags);
        return ret;
    }
    

So in the end, kmalloc_order() is the function responsible for adding the __GFP_COMP flag, and it does so for implementation-related reasons (so that those pages can be correctly kfree()'d afterwards).
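
For completeness, this is roughly what the matching free path looks like with SLUB on 4.19 (an abridged sketch, not a verbatim listing). Because __GFP_COMP made alloc_pages() build a compound page, kfree() can locate the head page and recover the allocation order from its metadata; without the flag there would be no reliable way to know how many pages to return to the page allocator:

void kfree(const void *x)
{
    struct page *page;

    if (unlikely(ZERO_OR_NULL_PTR(x)))
        return;

    /* For a large kmalloc() allocation, virt_to_head_page() maps any
     * address inside it back to the head page because the pages were
     * allocated as a compound page (__GFP_COMP). */
    page = virt_to_head_page(x);
    if (unlikely(!PageSlab(page))) {
        /* Large kmalloc(): not backed by a slab cache, so the order is
         * read back from the compound page and the pages are returned
         * directly to the page allocator. */
        BUG_ON(!PageCompound(page));
        __free_pages(page, compound_order(page));
        return;
    }
    /* ... regular slab object: freed back into its kmem_cache ... */
}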

In your specific case, what happens next is that when kmalloc_order() calls alloc_pages(), the kernel fails to allocate the pages and reports the allocation failure you see in the log. Without some additional analysis it is hard to tell what the cause is and how to fix it; you may simply be running out of memory. Note that an order:9 request means 512 physically contiguous pages (2 MB), so it can also fail because free memory is too fragmented, even when plenty of memory is free overall.

This question about memory-management and the usage of "__GFP_COMP" can also be found on Stack Overflow: https://stackoverflow.com/questions/62486827/
