linux - Understanding strange OOM behavior?

Tags: linux memory linux-kernel

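My server triggered the OOM killer, and I am trying to understand why. The system has plenty of RAM (128 GB), and it looks like only about 70 GB was actually in use. After reading earlier questions about OOM, it looks like this could be a case of memory fragmentation. Here is the system log output: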

Jun 23 17:20:10 server1 kernel: [517262.504589] gmond invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Jun 23 17:20:10 server1 kernel: [517262.504593] gmond cpuset=/ mems_allowed=0-1
Jun 23 17:20:10 server1 kernel: [517262.504598] CPU: 4 PID: 1522 Comm: gmond Tainted: P           OE 3.15.1-031501-lowlatency #201406161841
Jun 23 17:20:10 server1 kernel: [517262.504599] Hardware name: Dell Inc. PowerEdge R420/0K29HN, BIOS 2.3.3 07/10/2014
Jun 23 17:20:10 server1 kernel: [517262.504601]  0000000000000000 ffff880fce2ab848 ffffffff817746ec 0000000000000007
Jun 23 17:20:10 server1 kernel: [517262.504603]  ffff880f74691950 ffff880fce2ab898 ffffffff8176a980 ffff880f00000000
Jun 23 17:20:10 server1 kernel: [517262.504605]  000201da81383df8 ffff881470376540 ffff881dcf7ab2a0 0000000000000000
Jun 23 17:20:10 server1 kernel: [517262.504607] Call Trace:
Jun 23 17:20:10 server1 kernel: [517262.504615]  [<ffffffff817746ec>] dump_stack+0x4e/0x71
Jun 23 17:20:10 server1 kernel: [517262.504618]  [<ffffffff8176a980>] dump_header+0x7e/0xbd
Jun 23 17:20:10 server1 kernel: [517262.504620]  [<ffffffff8176aa16>] oom_kill_process.part.6+0x57/0x30a
Jun 23 17:20:10 server1 kernel: [517262.504623]  [<ffffffff811654e7>] oom_kill_process+0x47/0x50
Jun 23 17:20:10 server1 kernel: [517262.504625]  [<ffffffff81165825>] out_of_memory+0x145/0x1d0
Jun 23 17:20:10 server1 kernel: [517262.504628]  [<ffffffff8116c1ba>] __alloc_pages_nodemask+0xb1a/0xc40
Jun 23 17:20:10 server1 kernel: [517262.504634]  [<ffffffff811adba3>] alloc_pages_current+0xb3/0x180
Jun 23 17:20:10 server1 kernel: [517262.504636]  [<ffffffff81161737>] __page_cache_alloc+0xb7/0xd0
Jun 23 17:20:10 server1 kernel: [517262.504638]  [<ffffffff81163f80>] filemap_fault+0x280/0x430
Jun 23 17:20:10 server1 kernel: [517262.504642]  [<ffffffff8118a0d9>] __do_fault+0x39/0x90
Jun 23 17:20:10 server1 kernel: [517262.504644]  [<ffffffff8118e31e>] do_read_fault.isra.59+0x10e/0x1d0
Jun 23 17:20:10 server1 kernel: [517262.504646]  [<ffffffff8118e870>] do_linear_fault.isra.61+0x70/0x80
Jun 23 17:20:10 server1 kernel: [517262.504647]  [<ffffffff8118e986>] handle_pte_fault+0x76/0x1b0
Jun 23 17:20:10 server1 kernel: [517262.504652]  [<ffffffff81095fe0>] ? lock_hrtimer_base.isra.25+0x30/0x60
Jun 23 17:20:10 server1 kernel: [517262.504654]  [<ffffffff8118eea4>] __handle_mm_fault+0x1b4/0x360
Jun 23 17:20:10 server1 kernel: [517262.504655]  [<ffffffff8118f101>] handle_mm_fault+0xb1/0x160
Jun 23 17:20:10 server1 kernel: [517262.504658]  [<ffffffff81784667>] ? __do_page_fault+0x2b7/0x5a0
Jun 23 17:20:10 server1 kernel: [517262.504660]  [<ffffffff81784522>] __do_page_fault+0x172/0x5a0
Jun 23 17:20:10 server1 kernel: [517262.504664]  [<ffffffff8111fdec>] ? acct_account_cputime+0x1c/0x20
Jun 23 17:20:10 server1 kernel: [517262.504667]  [<ffffffff810a73a9>] ? account_user_time+0x99/0xb0
Jun 23 17:20:10 server1 kernel: [517262.504669]  [<ffffffff810a79dd>] ? vtime_account_user+0x5d/0x70
Jun 23 17:20:10 server1 kernel: [517262.504671]  [<ffffffff8178498e>] do_page_fault+0x3e/0x80
Jun 23 17:20:10 server1 kernel: [517262.504673]  [<ffffffff817811f8>] page_fault+0x28/0x30
Jun 23 17:20:10 server1 kernel: [517262.504674] Mem-Info:
Jun 23 17:20:10 server1 kernel: [517262.504675] Node 0 DMA per-cpu:
Jun 23 17:20:10 server1 kernel: [517262.504677] CPU    0: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504678] CPU    1: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504679] CPU    2: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504680] CPU    3: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504681] CPU    4: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504682] CPU    5: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504683] CPU    6: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504684] CPU    7: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504685] CPU    8: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504686] CPU    9: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504687] CPU   10: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504687] CPU   11: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504688] CPU   12: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504689] CPU   13: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504690] CPU   14: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504691] CPU   15: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504692] CPU   16: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504693] CPU   17: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504694] CPU   18: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504695] CPU   19: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504696] CPU   20: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504697] CPU   21: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504698] CPU   22: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504698] CPU   23: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504699] Node 0 DMA32 per-cpu:
Jun 23 17:20:10 server1 kernel: [517262.504701] CPU    0: hi:  186, btch:  31 usd:  30
Jun 23 17:20:10 server1 kernel: [517262.504702] CPU    1: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504703] CPU    2: hi:  186, btch:  31 usd:  34
Jun 23 17:20:10 server1 kernel: [517262.504704] CPU    3: hi:  186, btch:  31 usd:  27
Jun 23 17:20:10 server1 kernel: [517262.504705] CPU    4: hi:  186, btch:  31 usd:  30
Jun 23 17:20:10 server1 kernel: [517262.504705] CPU    5: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504706] CPU    6: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504707] CPU    7: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504708] CPU    8: hi:  186, btch:  31 usd: 173
Jun 23 17:20:10 server1 kernel: [517262.504709] CPU    9: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504710] CPU   10: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504711] CPU   11: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504712] CPU   12: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504713] CPU   13: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504714] CPU   14: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504715] CPU   15: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504716] CPU   16: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504717] CPU   17: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504718] CPU   18: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504719] CPU   19: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504720] CPU   20: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504721] CPU   21: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504722] CPU   22: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504722] CPU   23: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504723] Node 0 Normal per-cpu:
Jun 23 17:20:10 server1 kernel: [517262.504724] CPU    0: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504725] CPU    1: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504726] CPU    2: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504727] CPU    3: hi:  186, btch:  31 usd:  14
Jun 23 17:20:10 server1 kernel: [517262.504728] CPU    4: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504729] CPU    5: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504730] CPU    6: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504731] CPU    7: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504732] CPU    8: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504733] CPU    9: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504734] CPU   10: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504735] CPU   11: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504736] CPU   12: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504737] CPU   13: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504738] CPU   14: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504739] CPU   15: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504740] CPU   16: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504740] CPU   17: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504741] CPU   18: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504742] CPU   19: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504743] CPU   20: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504744] CPU   21: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504745] CPU   22: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504746] CPU   23: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504747] Node 1 Normal per-cpu:
Jun 23 17:20:10 server1 kernel: [517262.504748] CPU    0: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504749] CPU    1: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504750] CPU    2: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504751] CPU    3: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504752] CPU    4: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504753] CPU    5: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504754] CPU    6: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504755] CPU    7: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504756] CPU    8: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504757] CPU    9: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504758] CPU   10: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504758] CPU   11: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504759] CPU   12: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504760] CPU   13: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504761] CPU   14: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504762] CPU   15: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504763] CPU   16: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504764] CPU   17: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504765] CPU   18: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504766] CPU   19: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504767] CPU   20: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504768] CPU   21: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504769] CPU   22: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504770] CPU   23: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504773] active_anon:17833290 inactive_anon:2465707 isolated_anon:0
Jun 23 17:20:10 server1 kernel: [517262.504773]  active_file:573 inactive_file:595 isolated_file:36
Jun 23 17:20:10 server1 kernel: [517262.504773]  unevictable:0 dirty:4 writeback:0 unstable:0
Jun 23 17:20:10 server1 kernel: [517262.504773]  free:82698 slab_reclaimable:43224 slab_unreclaimable:11476749
Jun 23 17:20:10 server1 kernel: [517262.504773]  mapped:2465518 shmem:2465767 pagetables:66385 bounce:0
Jun 23 17:20:10 server1 kernel: [517262.504773]  free_cma:0
Jun 23 17:20:10 server1 kernel: [517262.504776] Node 0 DMA free:14804kB min:8kB low:8kB high:12kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15968kB managed:15828kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Jun 23 17:20:10 server1 kernel: [517262.504779] lowmem_reserve[]: 0 2933 64370 64370
Jun 23 17:20:10 server1 kernel: [517262.504782] Node 0 DMA32 free:247776kB min:2048kB low:2560kB high:3072kB active_anon:1774744kB inactive_anon:607052kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3083200kB managed:3003592kB mlocked:0kB dirty:16kB writeback:0kB mapped:607068kB shmem:607068kB slab_reclaimable:25524kB slab_unreclaimable:302060kB kernel_stack:4928kB pagetables:3100kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:2660 all_unreclaimable? yes
Jun 23 17:20:10 server1 kernel: [517262.504785] lowmem_reserve[]: 0 0 61436 61436
Jun 23 17:20:10 server1 kernel: [517262.504787] Node 0 Normal free:34728kB min:42952kB low:53688kB high:64428kB active_anon:30286072kB inactive_anon:9255576kB active_file:236kB inactive_file:640kB unevictable:0kB isolated(anon):0kB isolated(file):16kB present:63963136kB managed:62911420kB mlocked:0kB dirty:0kB writeback:0kB mapped:9255000kB shmem:9255724kB slab_reclaimable:86416kB slab_unreclaimable:22165372kB kernel_stack:21072kB pagetables:121112kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:13936 all_unreclaimable? yes
Jun 23 17:20:10 server1 kernel: [517262.504791] lowmem_reserve[]: 0 0 0 0
Jun 23 17:20:10 server1 kernel: [517262.504793] Node 1 Normal free:33484kB min:45096kB low:56368kB high:67644kB active_anon:39272344kB inactive_anon:200kB active_file:2112kB inactive_file:1752kB unevictable:0kB isolated(anon):0kB isolated(file):128kB present:67108864kB managed:66056916kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:276kB slab_reclaimable:60956kB slab_unreclaimable:23439564kB kernel_stack:13536kB pagetables:141328kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:18448 all_unreclaimable? yes
Jun 23 17:20:10 server1 kernel: [517262.504797] lowmem_reserve[]: 0 0 0 0
Jun 23 17:20:10 server1 kernel: [517262.504799] Node 0 DMA: 1*4kB (U) 0*8kB 1*16kB (U) 0*32kB 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 0*1024kB 1*2048kB (R) 3*4096kB (M) = 14804kB
Jun 23 17:20:10 server1 kernel: [517262.504807] Node 0 DMA32: 4660*4kB (UEM) 2172*8kB (EM) 1739*16kB (EM) 1046*32kB (UEM) 629*64kB (EM) 344*128kB (UEM) 155*256kB (E) 46*512kB (UE) 3*1024kB (E) 0*2048kB 0*4096kB = 247904kB
Jun 23 17:20:10 server1 kernel: [517262.504816] Node 0 Normal: 9038*4kB (M) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 36152kB
Jun 23 17:20:10 server1 kernel: [517262.504822] Node 1 Normal: 9055*4kB (UM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 36220kB
Jun 23 17:20:10 server1 kernel: [517262.504829] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Jun 23 17:20:10 server1 kernel: [517262.504830] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Jun 23 17:20:10 server1 kernel: [517262.504831] 2467056 total pagecache pages
Jun 23 17:20:10 server1 kernel: [517262.504832] 0 pages in swap cache
Jun 23 17:20:10 server1 kernel: [517262.504833] Swap cache stats: add 0, delete 0, find 0/0
Jun 23 17:20:10 server1 kernel: [517262.504834] Free swap  = 0kB
Jun 23 17:20:10 server1 kernel: [517262.504834] Total swap = 0kB
Jun 23 17:20:10 server1 kernel: [517262.504835] 33542792 pages RAM
Jun 23 17:20:10 server1 kernel: [517262.504836] 0 pages HighMem/MovableOnly
Jun 23 17:20:10 server1 kernel: [517262.504837] 262987 pages reserved
Jun 23 17:20:10 server1 kernel: [517262.504838] 0 pages hwpoisoned
Jun 23 17:20:10 server1 kernel: [517262.504839] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
Jun 23 17:20:10 server1 kernel: [517262.504866] [  569]     0   569     4997      144      13        0             0 upstart-udev-br
Jun 23 17:20:10 server1 kernel: [517262.504868] [  578]     0   578    12891      187      29        0         -1000 systemd-udevd
Jun 23 17:20:10 server1 kernel: [517262.504873] [  692]   101   692    80659     2295      59        0             0 rsyslogd
Jun 23 17:20:10 server1 kernel: [517262.504875] [  750]     0   750     4084      331      13        0             0 upstart-file-br
Jun 23 17:20:10 server1 kernel: [517262.504877] [  792]     0   792     3815       53      13        0             0 upstart-socket-
Jun 23 17:20:10 server1 kernel: [517262.504879] [  842]   111   842    27001      275      53        0             0 dbus-daemon
Jun 23 17:20:10 server1 kernel: [517262.504880] [  851]     0   851     8834      101      22        0             0 systemd-logind
Jun 23 17:20:10 server1 kernel: [517262.504886] [ 1232]     0  1232     2558      572       8        0             0 dhclient
Jun 23 17:20:10 server1 kernel: [517262.504888] [ 1342]   104  1342    24484      281      49        0             0 ntpd
Jun 23 17:20:10 server1 kernel: [517262.504890] [ 1440]     0  1440     3955       41      12        0             0 getty
Jun 23 17:20:10 server1 kernel: [517262.504891] [ 1443]     0  1443     3955       41      12        0             0 getty
Jun 23 17:20:10 server1 kernel: [517262.504893] [ 1448]     0  1448     3955       39      13        0             0 getty
Jun 23 17:20:10 server1 kernel: [517262.504895] [ 1450]     0  1450     3955       41      13        0             0 getty
Jun 23 17:20:10 server1 kernel: [517262.504896] [ 1452]     0  1452     3955       42      13        0             0 getty
Jun 23 17:20:10 server1 kernel: [517262.504898] [ 1469]     0  1469     4785       40      13        0             0 atd
Jun 23 17:20:10 server1 kernel: [517262.504900] [ 1470]     0  1470    15341      168      32        0         -1000 sshd
Jun 23 17:20:10 server1 kernel: [517262.504902] [ 1472]     0  1472     5914       65      17        0             0 cron
Jun 23 17:20:10 server1 kernel: [517262.504904] [ 1478]   999  1478    16020     3710      31        0             0 gmond
Jun 23 17:20:10 server1 kernel: [517262.504905] [ 1486]     0  1486     4821       65      14        0             0 irqbalance
Jun 23 17:20:10 server1 kernel: [517262.504907] [ 1500]     0  1500   343627     1730      85        0             0 nscd
Jun 23 17:20:10 server1 kernel: [517262.504909] [ 1559]     0  1559     1092       37       8        0             0 acpid
Jun 23 17:20:10 server1 kernel: [517262.504911] [ 1641]     0  1641     4978       71      13        0             0 master
Jun 23 17:20:10 server1 kernel: [517262.504913] [ 1650]   103  1650     5427       72      14        0             0 qmgr
Jun 23 17:20:10 server1 kernel: [517262.504917] [ 1895]     0  1895     1900       30       9        0             0 getty
Jun 23 17:20:10 server1 kernel: [517262.504919] [ 1906]  1000  1906  2854329     2610    2594        0             0 thttpd
Jun 23 17:20:10 server1 kernel: [517262.504927] [ 3163]  1000  3163     2432       39      10        0             0 searchd
Jun 23 17:20:10 server1 kernel: [517262.504928] [ 3167]  1000  3167  2727221  2467025    4863        0             0 sphinx-daemon
Jun 23 17:20:10 server1 kernel: [517262.504931] [47622]  1000 47622 17834794 17329575   33989        0             0 MyExec

<.................Trimmed bunch of processes with low mem usage.......................................>


Jun 23 17:20:10 server1 kernel: [517262.508350] Out of memory: Kill process 47622 (MyExec) score 526 or sacrifice child
Jun 23 17:20:10 server1 kernel: [517262.508375] Killed process 47622 (MyExec) total-vm:71339176kB, anon-rss:69318300kB, file-rss:0kB
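
The Mem-Info counters in the log above are page counts, so a quick unit conversion helps when comparing them with figures such as "128 GB" or "about 70 GB". The following is a minimal sketch, assuming 4 kB pages (which matches the "*4kB" column in the buddy lists); the helper name `pages_to_gib` is hypothetical, and the numbers are copied from the log.

```python
PAGE_KB = 4  # assumption: 4 kB base pages, as the "*4kB" entries in the log suggest


def pages_to_gib(pages, page_kb=PAGE_KB):
    """Convert a page count to GiB."""
    return pages * page_kb / (1024 * 1024)


# Counters copied from the Mem-Info block in the log (all in pages).
counters = {
    "active_anon": 17833290,
    "inactive_anon": 2465707,
    "slab_unreclaimable": 11476749,
    "free": 82698,
}
total_ram_pages = 33542792  # "33542792 pages RAM"

for name, pages in counters.items():
    print(f"{name:20s} {pages_to_gib(pages):8.2f} GiB")
print(f"{'total RAM':20s} {pages_to_gib(total_ram_pages):8.2f} GiB")
```

This is only a unit conversion of counters already present in the log. It shows which counters are large, not why the OOM killer fired.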

Looking at the following lines, the problem appears to be fragmentation.

Jun 23 17:20:10 server1 kernel: [517262.504816] Node 0 Normal: 9038*4kB (M) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 36152kB
Jun 23 17:20:10 server1 kernel: [517262.504822] Node 1 Normal: 9055*4kB (UM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 36220kB

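I don't understand why the system would be so badly fragmented. It had been up for only 5 days when this happened. Also, looking at the process that invoked the OOM killer (gmond invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0), it seems it was only requesting a single 4 kB block, and there are plenty of those available.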
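
To make the relationship between `order` and the free-list lines concrete: an order-N request asks for one contiguous block of 2^N pages, and each "count*sizekB" entry in a buddy line is the number of free blocks of that size. The sketch below parses one such line and checks whether any free block is large enough. It assumes 4 kB pages, the function names are hypothetical, and it deliberately ignores the watermark checks the real allocator also applies.

```python
import re

PAGE_KB = 4  # assumption: 4 kB base pages


def parse_buddy_line(line):
    """Return {block_size_kb: free_block_count} for one buddy free-list line."""
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}


def has_block_for_order(blocks, order):
    """True if at least one free block of 2**order pages or larger exists.

    This only looks at block sizes; it does not model zone watermarks.
    """
    need_kb = PAGE_KB << order
    return any(count > 0 for size, count in blocks.items() if size >= need_kb)


line = ("Node 0 Normal: 9038*4kB (M) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB "
        "0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 36152kB")
blocks = parse_buddy_line(line)

print(has_block_for_order(blocks, 0))  # an order=0 request needs one 4 kB block
print(has_block_for_order(blocks, 1))  # an order=1 request needs one 8 kB block
print(sum(size * count for size, count in blocks.items()), "kB free in total")
```

Fragmentation would matter for the order=1 case here, but the request in the log is order=0, which any single free 4 kB page can satisfy as far as block size is concerned.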

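  1. Is my understanding of fragmentation correct in this case?
  2. How can I find out why memory became so fragmented?
  3. What can I do to avoid getting into this situation?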

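One thing you may notice is that I have turned swap off completely and set swappiness to 0. The reason is that my system has more than enough RAM and should never need to swap. I plan to enable swap and set swappiness to 10. I am not sure whether that will help in this situation.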

Thanks for your input.

Best answer

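Your understanding of fragmentation is incorrect. An order=0 request needs only a single 4 kB page, so fragmentation cannot make it fail. The OOM killer was invoked because the memory watermarks were breached: free memory in both Normal zones is below the zone's min watermark. Look at this: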

Node 0 Normal free:34728kB min:42952kB low:53688kB
Node 1 Normal free:33484kB min:45096kB low:56368kB
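
The comparison can be checked mechanically. The following is a minimal sketch that pulls `free`, `min` and `low` out of a per-zone line from the log and reports whether `free` is below `min`. The function names are hypothetical, and the check is a simplification: the kernel's real watermark test also accounts for `lowmem_reserve` and the allocation flags.

```python
import re

ZONE_RE = re.compile(r"(Node \d+ \w+) free:(\d+)kB min:(\d+)kB low:(\d+)kB")


def zone_watermarks(line):
    """Extract zone name and free/min/low values (in kB) from a zone line."""
    m = ZONE_RE.search(line)
    if m is None:
        return None
    free_kb, min_kb, low_kb = (int(v) for v in m.group(2, 3, 4))
    return {"zone": m.group(1), "free": free_kb, "min": min_kb, "low": low_kb}


def below_min(line):
    """True if the zone's free memory is under its min watermark (simplified)."""
    zone = zone_watermarks(line)
    return zone is not None and zone["free"] < zone["min"]


# Prefixes of the zone lines from the log.
lines = [
    "Node 0 DMA32 free:247776kB min:2048kB low:2560kB high:3072kB",
    "Node 0 Normal free:34728kB min:42952kB low:53688kB",
    "Node 1 Normal free:33484kB min:45096kB low:56368kB",
]
for line in lines:
    zone = zone_watermarks(line)
    print(zone["zone"], "below min" if below_min(line) else "above min")
```

Both Normal zones come out below their min watermark, which is the condition the answer points at.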

Regarding "linux - Understanding strange OOM behavior?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/31034536/
