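My server triggered the OOM killer and I am trying to understand why. The system has plenty of RAM (128 GB), and it looks like only about 70 GB was actually in use. Having read earlier questions about OOM, I think this may be a case of memory fragmentation. Here is the syslog output: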
Jun 23 17:20:10 server1 kernel: [517262.504589] gmond invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Jun 23 17:20:10 server1 kernel: [517262.504593] gmond cpuset=/ mems_allowed=0-1
Jun 23 17:20:10 server1 kernel: [517262.504598] CPU: 4 PID: 1522 Comm: gmond Tainted: P OE 3.15.1-031501-lowlatency #201406161841
Jun 23 17:20:10 server1 kernel: [517262.504599] Hardware name: Dell Inc. PowerEdge R420/0K29HN, BIOS 2.3.3 07/10/2014
Jun 23 17:20:10 server1 kernel: [517262.504601] 0000000000000000 ffff880fce2ab848 ffffffff817746ec 0000000000000007
Jun 23 17:20:10 server1 kernel: [517262.504603] ffff880f74691950 ffff880fce2ab898 ffffffff8176a980 ffff880f00000000
Jun 23 17:20:10 server1 kernel: [517262.504605] 000201da81383df8 ffff881470376540 ffff881dcf7ab2a0 0000000000000000
Jun 23 17:20:10 server1 kernel: [517262.504607] Call Trace:
Jun 23 17:20:10 server1 kernel: [517262.504615] [<ffffffff817746ec>] dump_stack+0x4e/0x71
Jun 23 17:20:10 server1 kernel: [517262.504618] [<ffffffff8176a980>] dump_header+0x7e/0xbd
Jun 23 17:20:10 server1 kernel: [517262.504620] [<ffffffff8176aa16>] oom_kill_process.part.6+0x57/0x30a
Jun 23 17:20:10 server1 kernel: [517262.504623] [<ffffffff811654e7>] oom_kill_process+0x47/0x50
Jun 23 17:20:10 server1 kernel: [517262.504625] [<ffffffff81165825>] out_of_memory+0x145/0x1d0
Jun 23 17:20:10 server1 kernel: [517262.504628] [<ffffffff8116c1ba>] __alloc_pages_nodemask+0xb1a/0xc40
Jun 23 17:20:10 server1 kernel: [517262.504634] [<ffffffff811adba3>] alloc_pages_current+0xb3/0x180
Jun 23 17:20:10 server1 kernel: [517262.504636] [<ffffffff81161737>] __page_cache_alloc+0xb7/0xd0
Jun 23 17:20:10 server1 kernel: [517262.504638] [<ffffffff81163f80>] filemap_fault+0x280/0x430
Jun 23 17:20:10 server1 kernel: [517262.504642] [<ffffffff8118a0d9>] __do_fault+0x39/0x90
Jun 23 17:20:10 server1 kernel: [517262.504644] [<ffffffff8118e31e>] do_read_fault.isra.59+0x10e/0x1d0
Jun 23 17:20:10 server1 kernel: [517262.504646] [<ffffffff8118e870>] do_linear_fault.isra.61+0x70/0x80
Jun 23 17:20:10 server1 kernel: [517262.504647] [<ffffffff8118e986>] handle_pte_fault+0x76/0x1b0
Jun 23 17:20:10 server1 kernel: [517262.504652] [<ffffffff81095fe0>] ? lock_hrtimer_base.isra.25+0x30/0x60
Jun 23 17:20:10 server1 kernel: [517262.504654] [<ffffffff8118eea4>] __handle_mm_fault+0x1b4/0x360
Jun 23 17:20:10 server1 kernel: [517262.504655] [<ffffffff8118f101>] handle_mm_fault+0xb1/0x160
Jun 23 17:20:10 server1 kernel: [517262.504658] [<ffffffff81784667>] ? __do_page_fault+0x2b7/0x5a0
Jun 23 17:20:10 server1 kernel: [517262.504660] [<ffffffff81784522>] __do_page_fault+0x172/0x5a0
Jun 23 17:20:10 server1 kernel: [517262.504664] [<ffffffff8111fdec>] ? acct_account_cputime+0x1c/0x20
Jun 23 17:20:10 server1 kernel: [517262.504667] [<ffffffff810a73a9>] ? account_user_time+0x99/0xb0
Jun 23 17:20:10 server1 kernel: [517262.504669] [<ffffffff810a79dd>] ? vtime_account_user+0x5d/0x70
Jun 23 17:20:10 server1 kernel: [517262.504671] [<ffffffff8178498e>] do_page_fault+0x3e/0x80
Jun 23 17:20:10 server1 kernel: [517262.504673] [<ffffffff817811f8>] page_fault+0x28/0x30
Jun 23 17:20:10 server1 kernel: [517262.504674] Mem-Info:
Jun 23 17:20:10 server1 kernel: [517262.504675] Node 0 DMA per-cpu:
Jun 23 17:20:10 server1 kernel: [517262.504677] CPU 0: hi: 0, btch: 1 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504678] CPU 1: hi: 0, btch: 1 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504679] CPU 2: hi: 0, btch: 1 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504680] CPU 3: hi: 0, btch: 1 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504681] CPU 4: hi: 0, btch: 1 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504682] CPU 5: hi: 0, btch: 1 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504683] CPU 6: hi: 0, btch: 1 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504684] CPU 7: hi: 0, btch: 1 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504685] CPU 8: hi: 0, btch: 1 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504686] CPU 9: hi: 0, btch: 1 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504687] CPU 10: hi: 0, btch: 1 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504687] CPU 11: hi: 0, btch: 1 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504688] CPU 12: hi: 0, btch: 1 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504689] CPU 13: hi: 0, btch: 1 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504690] CPU 14: hi: 0, btch: 1 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504691] CPU 15: hi: 0, btch: 1 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504692] CPU 16: hi: 0, btch: 1 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504693] CPU 17: hi: 0, btch: 1 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504694] CPU 18: hi: 0, btch: 1 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504695] CPU 19: hi: 0, btch: 1 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504696] CPU 20: hi: 0, btch: 1 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504697] CPU 21: hi: 0, btch: 1 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504698] CPU 22: hi: 0, btch: 1 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504698] CPU 23: hi: 0, btch: 1 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504699] Node 0 DMA32 per-cpu:
Jun 23 17:20:10 server1 kernel: [517262.504701] CPU 0: hi: 186, btch: 31 usd: 30
Jun 23 17:20:10 server1 kernel: [517262.504702] CPU 1: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504703] CPU 2: hi: 186, btch: 31 usd: 34
Jun 23 17:20:10 server1 kernel: [517262.504704] CPU 3: hi: 186, btch: 31 usd: 27
Jun 23 17:20:10 server1 kernel: [517262.504705] CPU 4: hi: 186, btch: 31 usd: 30
Jun 23 17:20:10 server1 kernel: [517262.504705] CPU 5: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504706] CPU 6: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504707] CPU 7: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504708] CPU 8: hi: 186, btch: 31 usd: 173
Jun 23 17:20:10 server1 kernel: [517262.504709] CPU 9: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504710] CPU 10: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504711] CPU 11: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504712] CPU 12: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504713] CPU 13: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504714] CPU 14: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504715] CPU 15: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504716] CPU 16: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504717] CPU 17: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504718] CPU 18: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504719] CPU 19: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504720] CPU 20: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504721] CPU 21: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504722] CPU 22: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504722] CPU 23: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504723] Node 0 Normal per-cpu:
Jun 23 17:20:10 server1 kernel: [517262.504724] CPU 0: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504725] CPU 1: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504726] CPU 2: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504727] CPU 3: hi: 186, btch: 31 usd: 14
Jun 23 17:20:10 server1 kernel: [517262.504728] CPU 4: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504729] CPU 5: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504730] CPU 6: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504731] CPU 7: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504732] CPU 8: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504733] CPU 9: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504734] CPU 10: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504735] CPU 11: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504736] CPU 12: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504737] CPU 13: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504738] CPU 14: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504739] CPU 15: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504740] CPU 16: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504740] CPU 17: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504741] CPU 18: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504742] CPU 19: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504743] CPU 20: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504744] CPU 21: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504745] CPU 22: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504746] CPU 23: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504747] Node 1 Normal per-cpu:
Jun 23 17:20:10 server1 kernel: [517262.504748] CPU 0: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504749] CPU 1: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504750] CPU 2: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504751] CPU 3: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504752] CPU 4: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504753] CPU 5: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504754] CPU 6: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504755] CPU 7: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504756] CPU 8: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504757] CPU 9: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504758] CPU 10: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504758] CPU 11: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504759] CPU 12: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504760] CPU 13: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504761] CPU 14: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504762] CPU 15: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504763] CPU 16: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504764] CPU 17: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504765] CPU 18: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504766] CPU 19: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504767] CPU 20: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504768] CPU 21: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504769] CPU 22: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504770] CPU 23: hi: 186, btch: 31 usd: 0
Jun 23 17:20:10 server1 kernel: [517262.504773] active_anon:17833290 inactive_anon:2465707 isolated_anon:0
Jun 23 17:20:10 server1 kernel: [517262.504773] active_file:573 inactive_file:595 isolated_file:36
Jun 23 17:20:10 server1 kernel: [517262.504773] unevictable:0 dirty:4 writeback:0 unstable:0
Jun 23 17:20:10 server1 kernel: [517262.504773] free:82698 slab_reclaimable:43224 slab_unreclaimable:11476749
Jun 23 17:20:10 server1 kernel: [517262.504773] mapped:2465518 shmem:2465767 pagetables:66385 bounce:0
Jun 23 17:20:10 server1 kernel: [517262.504773] free_cma:0
Jun 23 17:20:10 server1 kernel: [517262.504776] Node 0 DMA free:14804kB min:8kB low:8kB high:12kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15968kB managed:15828kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Jun 23 17:20:10 server1 kernel: [517262.504779] lowmem_reserve[]: 0 2933 64370 64370
Jun 23 17:20:10 server1 kernel: [517262.504782] Node 0 DMA32 free:247776kB min:2048kB low:2560kB high:3072kB active_anon:1774744kB inactive_anon:607052kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3083200kB managed:3003592kB mlocked:0kB dirty:16kB writeback:0kB mapped:607068kB shmem:607068kB slab_reclaimable:25524kB slab_unreclaimable:302060kB kernel_stack:4928kB pagetables:3100kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:2660 all_unreclaimable? yes
Jun 23 17:20:10 server1 kernel: [517262.504785] lowmem_reserve[]: 0 0 61436 61436
Jun 23 17:20:10 server1 kernel: [517262.504787] Node 0 Normal free:34728kB min:42952kB low:53688kB high:64428kB active_anon:30286072kB inactive_anon:9255576kB active_file:236kB inactive_file:640kB unevictable:0kB isolated(anon):0kB isolated(file):16kB present:63963136kB managed:62911420kB mlocked:0kB dirty:0kB writeback:0kB mapped:9255000kB shmem:9255724kB slab_reclaimable:86416kB slab_unreclaimable:22165372kB kernel_stack:21072kB pagetables:121112kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:13936 all_unreclaimable? yes
Jun 23 17:20:10 server1 kernel: [517262.504791] lowmem_reserve[]: 0 0 0 0
Jun 23 17:20:10 server1 kernel: [517262.504793] Node 1 Normal free:33484kB min:45096kB low:56368kB high:67644kB active_anon:39272344kB inactive_anon:200kB active_file:2112kB inactive_file:1752kB unevictable:0kB isolated(anon):0kB isolated(file):128kB present:67108864kB managed:66056916kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:276kB slab_reclaimable:60956kB slab_unreclaimable:23439564kB kernel_stack:13536kB pagetables:141328kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:18448 all_unreclaimable? yes
Jun 23 17:20:10 server1 kernel: [517262.504797] lowmem_reserve[]: 0 0 0 0
Jun 23 17:20:10 server1 kernel: [517262.504799] Node 0 DMA: 1*4kB (U) 0*8kB 1*16kB (U) 0*32kB 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 0*1024kB 1*2048kB (R) 3*4096kB (M) = 14804kB
Jun 23 17:20:10 server1 kernel: [517262.504807] Node 0 DMA32: 4660*4kB (UEM) 2172*8kB (EM) 1739*16kB (EM) 1046*32kB (UEM) 629*64kB (EM) 344*128kB (UEM) 155*256kB (E) 46*512kB (UE) 3*1024kB (E) 0*2048kB 0*4096kB = 247904kB
Jun 23 17:20:10 server1 kernel: [517262.504816] Node 0 Normal: 9038*4kB (M) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 36152kB
Jun 23 17:20:10 server1 kernel: [517262.504822] Node 1 Normal: 9055*4kB (UM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 36220kB
Jun 23 17:20:10 server1 kernel: [517262.504829] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Jun 23 17:20:10 server1 kernel: [517262.504830] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Jun 23 17:20:10 server1 kernel: [517262.504831] 2467056 total pagecache pages
Jun 23 17:20:10 server1 kernel: [517262.504832] 0 pages in swap cache
Jun 23 17:20:10 server1 kernel: [517262.504833] Swap cache stats: add 0, delete 0, find 0/0
Jun 23 17:20:10 server1 kernel: [517262.504834] Free swap = 0kB
Jun 23 17:20:10 server1 kernel: [517262.504834] Total swap = 0kB
Jun 23 17:20:10 server1 kernel: [517262.504835] 33542792 pages RAM
Jun 23 17:20:10 server1 kernel: [517262.504836] 0 pages HighMem/MovableOnly
Jun 23 17:20:10 server1 kernel: [517262.504837] 262987 pages reserved
Jun 23 17:20:10 server1 kernel: [517262.504838] 0 pages hwpoisoned
Jun 23 17:20:10 server1 kernel: [517262.504839] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
Jun 23 17:20:10 server1 kernel: [517262.504866] [ 569] 0 569 4997 144 13 0 0 upstart-udev-br
Jun 23 17:20:10 server1 kernel: [517262.504868] [ 578] 0 578 12891 187 29 0 -1000 systemd-udevd
Jun 23 17:20:10 server1 kernel: [517262.504873] [ 692] 101 692 80659 2295 59 0 0 rsyslogd
Jun 23 17:20:10 server1 kernel: [517262.504875] [ 750] 0 750 4084 331 13 0 0 upstart-file-br
Jun 23 17:20:10 server1 kernel: [517262.504877] [ 792] 0 792 3815 53 13 0 0 upstart-socket-
Jun 23 17:20:10 server1 kernel: [517262.504879] [ 842] 111 842 27001 275 53 0 0 dbus-daemon
Jun 23 17:20:10 server1 kernel: [517262.504880] [ 851] 0 851 8834 101 22 0 0 systemd-logind
Jun 23 17:20:10 server1 kernel: [517262.504886] [ 1232] 0 1232 2558 572 8 0 0 dhclient
Jun 23 17:20:10 server1 kernel: [517262.504888] [ 1342] 104 1342 24484 281 49 0 0 ntpd
Jun 23 17:20:10 server1 kernel: [517262.504890] [ 1440] 0 1440 3955 41 12 0 0 getty
Jun 23 17:20:10 server1 kernel: [517262.504891] [ 1443] 0 1443 3955 41 12 0 0 getty
Jun 23 17:20:10 server1 kernel: [517262.504893] [ 1448] 0 1448 3955 39 13 0 0 getty
Jun 23 17:20:10 server1 kernel: [517262.504895] [ 1450] 0 1450 3955 41 13 0 0 getty
Jun 23 17:20:10 server1 kernel: [517262.504896] [ 1452] 0 1452 3955 42 13 0 0 getty
Jun 23 17:20:10 server1 kernel: [517262.504898] [ 1469] 0 1469 4785 40 13 0 0 atd
Jun 23 17:20:10 server1 kernel: [517262.504900] [ 1470] 0 1470 15341 168 32 0 -1000 sshd
Jun 23 17:20:10 server1 kernel: [517262.504902] [ 1472] 0 1472 5914 65 17 0 0 cron
Jun 23 17:20:10 server1 kernel: [517262.504904] [ 1478] 999 1478 16020 3710 31 0 0 gmond
Jun 23 17:20:10 server1 kernel: [517262.504905] [ 1486] 0 1486 4821 65 14 0 0 irqbalance
Jun 23 17:20:10 server1 kernel: [517262.504907] [ 1500] 0 1500 343627 1730 85 0 0 nscd
Jun 23 17:20:10 server1 kernel: [517262.504909] [ 1559] 0 1559 1092 37 8 0 0 acpid
Jun 23 17:20:10 server1 kernel: [517262.504911] [ 1641] 0 1641 4978 71 13 0 0 master
Jun 23 17:20:10 server1 kernel: [517262.504913] [ 1650] 103 1650 5427 72 14 0 0 qmgr
Jun 23 17:20:10 server1 kernel: [517262.504917] [ 1895] 0 1895 1900 30 9 0 0 getty
Jun 23 17:20:10 server1 kernel: [517262.504919] [ 1906] 1000 1906 2854329 2610 2594 0 0 thttpd
Jun 23 17:20:10 server1 kernel: [517262.504927] [ 3163] 1000 3163 2432 39 10 0 0 searchd
Jun 23 17:20:10 server1 kernel: [517262.504928] [ 3167] 1000 3167 2727221 2467025 4863 0 0 sphinx-daemon
Jun 23 17:20:10 server1 kernel: [517262.504931] [47622] 1000 47622 17834794 17329575 33989 0 0 MyExec
<.................Trimmed bunch of processes with low mem usage.......................................>
Jun 23 17:20:10 server1 kernel: [517262.508350] Out of memory: Kill process 47622 (MyExec) score 526 or sacrifice child
Jun 23 17:20:10 server1 kernel: [517262.508375] Killed process 47622 (MyExec) total-vm:71339176kB, anon-rss:69318300kB, file-rss:0kB
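The `rss` column in the process table above is counted in pages, not kilobytes, which makes the "about 70 GB" figure easy to misread. A minimal sketch of the conversion, assuming 4 kB base pages (consistent with the smallest 4kB block in the free lists); the helper names are made up for illustration:

```python
# Assumption: 4 kB base pages, as suggested by the 4kB blocks in the log.
PAGE_SIZE_KB = 4


def pages_to_kb(pages):
    """Convert a page count from the OOM process table to kilobytes."""
    return pages * PAGE_SIZE_KB


def kb_to_gib(kb):
    """Convert kilobytes to GiB (1 GiB = 1024 * 1024 kB)."""
    return kb / (1024 * 1024)


# rss values (in pages) copied from the process table in the log.
rss_pages = {"MyExec": 17329575, "sphinx-daemon": 2467025}

for name, pages in rss_pages.items():
    kb = pages_to_kb(pages)
    print(f"{name}: {kb} kB ({kb_to_gib(kb):.1f} GiB)")
```

For MyExec the converted value equals the `anon-rss:69318300kB` reported in the "Killed process" line, which supports the page-based reading of the column.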
Looking at the following lines, the problem appears to be fragmentation.
Jun 23 17:20:10 server1 kernel: [517262.504816] Node 0 Normal: 9038*4kB (M) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 36152kB
Jun 23 17:20:10 server1 kernel: [517262.504822] Node 1 Normal: 9055*4kB (UM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 36220kB
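A minimal sketch for reading these free-list lines, assuming the syslog prefix has been stripped. Each `count*sizekB` pair is the number of free blocks of that size; the function names are hypothetical, not part of any kernel tooling:

```python
import re

# The two free-list lines quoted above, without the syslog prefix.
BUDDY_LINES = [
    "Node 0 Normal: 9038*4kB (M) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB "
    "0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 36152kB",
    "Node 1 Normal: 9055*4kB (UM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB "
    "0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 36220kB",
]

# Matches "<count>*<size>kB"; the trailing "= NNNkB" total has no "*".
BLOCK_RE = re.compile(r"(\d+)\*(\d+)kB")


def parse_buddy_line(line):
    """Return (zone_name, {block_size_kb: free_block_count})."""
    zone, _, rest = line.partition(":")
    blocks = {int(size): int(count) for count, size in BLOCK_RE.findall(rest)}
    return zone.strip(), blocks


def total_free_kb(blocks):
    """Sum the free memory across all block sizes, in kB."""
    return sum(size * count for size, count in blocks.items())


for line in BUDDY_LINES:
    zone, blocks = parse_buddy_line(line)
    largest = max((s for s, c in blocks.items() if c > 0), default=0)
    print(f"{zone}: {total_free_kb(blocks)} kB free, "
          f"largest free block {largest} kB")
```

Both zones have free memory only in 4 kB blocks, and the per-size counts add up to the totals printed at the end of each line.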
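I don't know why the system would be so badly fragmented; it had only been running for 5 days when this happened. Also, looking at the process that invoked the OOM killer (gmond invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0), it seems it only requested a single 4K block, and plenty of those are available.
- Is my understanding of fragmentation correct in this case?
- How can I find out why memory became so fragmented?
- What can I do to avoid ending up in this situation?
One thing you may notice is that I have turned swap off completely and set swappiness to 0. The reason is that my system has more than enough RAM and should never need to swap. I plan to enable swap and set swappiness to 10. I am not sure whether that will help in this situation.
Thanks for your input.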
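Best answer
Your understanding of fragmentation is not correct. The OOM was raised because the memory watermarks were breached: in both Normal zones, free memory is below the min watermark, so even an order=0 (single 4 kB page) request cannot be satisfied. Look at this: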
Node 0 Normal free:34728kB min:42952kB low:53688kB
Node 1 Normal free:33484kB min:45096kB low:56368kB
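The comparison can be sketched in a few lines. This is only an illustration of reading the numbers; the kernel's real watermark check also takes things like `lowmem_reserve` and allocation flags into account, and the function names here are hypothetical:

```python
import re

# The two zone summaries quoted above.
ZONE_LINES = [
    "Node 0 Normal free:34728kB min:42952kB low:53688kB",
    "Node 1 Normal free:33484kB min:45096kB low:56368kB",
]

# Matches "<name>:<value>kB" pairs such as "free:34728kB".
FIELD_RE = re.compile(r"(\w+):(\d+)kB")


def parse_zone_line(line):
    """Return (zone_name, {field: value_in_kb}) for one zone summary."""
    name = line.split(" free:")[0]
    fields = {key: int(value) for key, value in FIELD_RE.findall(line)}
    return name, fields


def below_min(fields):
    """True when free memory has dropped under the min watermark."""
    return fields["free"] < fields["min"]


for line in ZONE_LINES:
    name, fields = parse_zone_line(line)
    shortfall = fields["min"] - fields["free"]
    status = "BELOW min" if below_min(fields) else "ok"
    print(f"{name}: free={fields['free']} kB, min={fields['min']} kB, "
          f"{status} (shortfall {shortfall} kB)")
```

Both Normal zones come out below their min watermark, which matches the explanation that the allocation failed on watermarks rather than on a lack of contiguous blocks.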
Regarding "linux - Understanding strange OOM behavior", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/31034536/