All times are UTC + 8 hours



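Post #1
Title: Ubuntu 16.04 freezes; the log repeatedly shows "NMI watchdog: BUG: soft lockup - CPU#6"
Posted: 2017-09-29 10:55

Joined: 2017-09-29 9:53
Posts: 3
System: Ubuntu 16.04
Thanks given: 2
Thanks received: 0
Hello everyone,

At our company we have a machine running Ubuntu 16.04 as a build server. It is a Lenovo Yangtian T4900c-00 with the following configuration:
Memory: 7.9 GB
Processor: Intel(R) Core(TM) I7-4900 [email protected]*8
Graphics: Gallium 0.4 on NV106
OS: Ubuntu 16.04 LTS 32-bit
Disk: 976.0 GB

The system has an SSH server, Samba, git, svn, and other software installed. We log in to the machine with PuTTY to compile software, and Samba provides file sharing.
However, the server freezes frequently, once or twice a day, which is very frustrating. It used to dual-boot Windows 7 and Ubuntu; I have since removed Windows 7 and reinstalled Ubuntu alone, but it still freezes.
After a freeze I rebooted the machine and collected the following log: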

Sep 28 18:01:06 buildserver NetworkManager[866]: <info> [1506592866.9899] manager: Networking is enabled by state file
Sep 28 18:01:06 buildserver NetworkManager[866]: <info> [1506592866.9900] Loaded device plugin: NMVxlanFactory (internal)
Sep 28 18:01:06 buildserver NetworkManager[866]: <info> [1506592866.9900] Loaded device plugin: NMVlanFactory (internal)
Sep 28 18:01:06 buildserver NetworkManager[866]: <info> [1506592866.9900] Loaded device plugin: NMVethFactory (internal)
Sep 28 18:01:06 buildserver NetworkManager[866]: <info> [1506592866.9900] Loaded device plugin: NMTunFactory (internal)
Sep 28 18:01:06 buildserver NetworkManager[866]: <info> [1506592866.9900] Loaded device plugin: NMMacvlanFactory (internal)
Sep 28 18:01:06 buildserver NetworkManager[866]: <info> [1506592866.9900] Loaded device plugin: NMIPTunnelFactory (internal)
Sep 28 18:01:06 buildserver NetworkManager[866]: <info> [1506592866.9901] Loaded device plugin: NMInfinibandFactory (internal)
Sep 28 18:01:06 buildserver NetworkManager[866]: <info> [1506592866.9901] Loaded device plugin: NMEthernetFactory (internal)
Sep 28 18:01:06 buildserver NetworkManager[866]: <info> [1506592866.9901] Loaded device plugin: NMBridgeFactory (internal)
Sep 28 18:01:06 buildserver NetworkManager[866]: <info> [1506592866.9901] Loaded device plugin: NMBondFactory (internal)
Sep 28 18:01:06 buildserver NetworkManager[866]: <info> [1506592866.9904] Loaded device plugin: NMWifiFactory (/usr/lib/i386-linux-gnu/NetworkManager/libnm-device-plugin-wifi.so)
Sep 28 18:01:06 buildserver NetworkManager[866]: <info> [1506592866.9906] Loaded device plugin: NMAtmManager (/usr/lib/i386-linux-gnu/NetworkManager/libnm-device-plugin-adsl.so)
Sep 28 18:01:07 buildserver NetworkManager[866]: <info> [1506592867.0258] Loaded device plugin: NMBluezManager (/usr/lib/i386-linux-gnu/NetworkManager/libnm-device-plugin-bluetooth.so)
Sep 28 18:01:07 buildserver NetworkManager[866]: <info> [1506592867.0261] Loaded device plugin: NMWwanFactory (/usr/lib/i386-linux-gnu/NetworkManager/libnm-device-plugin-wwan.so)
Sep 28 18:01:07 buildserver NetworkManager[866]: nm_device_get_device_type: assertion 'NM_IS_DEVICE (self)' failed
Sep 28 18:01:07 buildserver NetworkManager[866]: <info> [1506592867.0324] device (lo): link connected
Sep 28 18:01:07 buildserver NetworkManager[866]: <info> [1506592867.0329] manager: (lo): new Generic device (/org/freedesktop/NetworkManager/Devices/0)
Sep 28 18:01:07 buildserver NetworkManager[866]: <info> [1506592867.0348] device (enp5s0): link connected
Sep 28 18:01:07 buildserver NetworkManager[866]: <info> [1506592867.0354] manager: (enp5s0): new Ethernet device (/org/freedesktop/NetworkManager/Devices/1)
Sep 28 18:01:07 buildserver NetworkManager[866]: <info> [1506592867.0358] manager: startup complete
Sep 28 18:01:07 buildserver NetworkManager[866]: <info> [1506592867.0358] manager: NetworkManager state is now CONNECTED_GLOBAL
Sep 28 18:01:07 buildserver NetworkManager[866]: <info> [1506592867.0372] urfkill disappeared from the bus
Sep 28 18:01:07 buildserver NetworkManager[866]: <info> [1506592867.0399] ofono is now available
Sep 28 18:01:07 buildserver NetworkManager[866]: <warn> [1506592867.0402] failed to enumerate oFono devices: GDBus.Error:org.freedesktop.DBus.Error.ServiceUnknown: The name org.ofono was not provided by any .service files
Sep 28 18:01:07 buildserver NetworkManager[866]: <info> [1506592867.0405] ModemManager available in the bus
Sep 28 18:01:16 buildserver NetworkManager[866]: <info> [1506592876.6081] manager: WiFi hardware radio set enabled
Sep 28 18:01:16 buildserver NetworkManager[866]: <info> [1506592876.6081] manager: WWAN hardware radio set enabled
Sep 28 18:35:19 buildserver kernel: [ 2065.183249] INFO: rcu_sched detected stalls on CPUs/tasks:
Sep 28 18:35:19 buildserver kernel: [ 2065.183265] 7-...: (11 GPs behind) idle=8d5/1/0 softirq=19851/19852 fqs=120
Sep 28 18:35:19 buildserver kernel: [ 2065.183266] (detected by 6, t=15002 jiffies, g=10726, c=10725, q=441)
Sep 28 18:35:19 buildserver kernel: [ 2065.183267] Task dump for CPU 7:
Sep 28 18:35:19 buildserver kernel: [ 2065.183268] swapper/7 R running task 0 0 1 0x00000008
Sep 28 18:35:19 buildserver kernel: [ 2065.183269] Call Trace:
Sep 28 18:35:19 buildserver kernel: [ 2065.183272] ? cpuidle_enter_state+0x156/0x350
Sep 28 18:35:19 buildserver kernel: [ 2065.183274] ? cpuidle_enter+0x14/0x20
Sep 28 18:35:19 buildserver kernel: [ 2065.183275] ? call_cpuidle+0x21/0x40
Sep 28 18:35:19 buildserver kernel: [ 2065.183276] ? do_idle+0x164/0x1d0
Sep 28 18:35:19 buildserver kernel: [ 2065.183278] ? cpu_startup_entry+0x6d/0x70
Sep 28 18:35:19 buildserver kernel: [ 2065.183279] ? start_secondary+0x15c/0x1b0
Sep 28 18:35:19 buildserver kernel: [ 2065.183280] ? startup_32_smp+0x16b/0x16d
Sep 28 18:35:19 buildserver kernel: [ 2065.183282] rcu_sched kthread starved for 14762 jiffies! g10726 c10725 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x0
Sep 28 18:35:19 buildserver kernel: [ 2065.183282] rcu_sched R running task 0 7 2 0x00000000
Sep 28 18:35:19 buildserver kernel: [ 2065.183283] Call Trace:
Sep 28 18:35:19 buildserver kernel: [ 2065.183285] __schedule+0x264/0x720
Sep 28 18:35:19 buildserver kernel: [ 2065.183287] ? lock_timer_base+0x67/0x80
Sep 28 18:35:19 buildserver kernel: [ 2065.183296] schedule+0x2e/0x80
Sep 28 18:35:19 buildserver kernel: [ 2065.183298] schedule_timeout+0x198/0x360
Sep 28 18:35:19 buildserver kernel: [ 2065.183299] ? del_timer_sync+0x50/0x50
Sep 28 18:35:19 buildserver kernel: [ 2065.183300] rcu_gp_kthread+0x4ea/0x880
Sep 28 18:35:19 buildserver kernel: [ 2065.183302] kthread+0xdb/0x110
Sep 28 18:35:19 buildserver kernel: [ 2065.183303] ? rcu_note_context_switch+0x100/0x100
Sep 28 18:35:19 buildserver kernel: [ 2065.183303] ? kthread_create_on_node+0x30/0x30
Sep 28 18:35:19 buildserver kernel: [ 2065.183304] ret_from_fork+0x21/0x2c
Sep 28 18:35:46 buildserver kernel: [ 2091.923860] NMI watchdog: BUG: soft lockup - CPU#6 stuck for 23s! [kworker/6:1:132]
Sep 28 18:35:46 buildserver kernel: [ 2091.923861] Modules linked in: snd_hda_codec_hdmi joydev input_leds intel_rapl x86_pkg_temp_thermal intel_powerclamp snd_hda_codec_realtek coretemp snd_hda_codec_generic kvm snd_hda_intel irqbypass snd_hda_codec crc32_pclmul pcbc snd_hda_core snd_hwdep snd_pcm aesni_intel snd_seq_midi aes_i586 snd_seq_midi_event crypto_simd cryptd snd_rawmidi intel_cstate intel_rapl_perf snd_seq snd_seq_device lpc_ich snd_timer snd mei_me mei soundcore shpchp mac_hid parport_pc ppdev lp parport autofs4 hid_generic usbhid hid nouveau mxm_wmi i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ahci drm r8169 libahci mii wmi fjes video
Sep 28 18:35:46 buildserver kernel: [ 2091.923889] CPU: 6 PID: 132 Comm: kworker/6:1 Not tainted 4.10.0-28-generic #32~16.04.2-Ubuntu
Sep 28 18:35:46 buildserver kernel: [ 2091.923890] Hardware name: LENOVO 90ETCTO1WW/ , BIOS FCKT77AUS 12/22/2015
Sep 28 18:35:46 buildserver kernel: [ 2091.923893] Workqueue: events netstamp_clear
Sep 28 18:35:46 buildserver kernel: [ 2091.923894] task: f55c0000 task.stack: f55ca000
Sep 28 18:35:46 buildserver kernel: [ 2091.923896] EIP: smp_call_function_many+0x1ca/0x220
Sep 28 18:35:46 buildserver kernel: [ 2091.923897] EFLAGS: 00000202 CPU: 6
Sep 28 18:35:46 buildserver kernel: [ 2091.923897] EAX: 00000007 EBX: f5df4450 ECX: 00000007 EDX: 00000001
Sep 28 18:35:46 buildserver kernel: [ 2091.923898] ESI: f5dde540 EDI: f5dde544 EBP: f55cbe68 ESP: f55cbe44
Sep 28 18:35:46 buildserver kernel: [ 2091.923898] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Sep 28 18:35:46 buildserver kernel: [ 2091.923899] CR0: 80050033 CR2: bfb28d10 CR3: 2bf31740 CR4: 001406f0
Sep 28 18:35:46 buildserver kernel: [ 2091.923899] Call Trace:
Sep 28 18:35:46 buildserver kernel: [ 2091.923902] ? arch_unregister_cpu+0x20/0x20
Sep 28 18:35:46 buildserver kernel: [ 2091.923904] ? netif_receive_skb_internal+0x19/0x90
Sep 28 18:35:46 buildserver kernel: [ 2091.923905] ? arch_unregister_cpu+0x20/0x20
Sep 28 18:35:46 buildserver kernel: [ 2091.923906] on_each_cpu+0x2a/0x50
Sep 28 18:35:46 buildserver kernel: [ 2091.923917] ? netif_receive_skb_internal+0x19/0x90
Sep 28 18:35:46 buildserver kernel: [ 2091.923918] ? netif_receive_skb_internal+0x1a/0x90
Sep 28 18:35:46 buildserver kernel: [ 2091.923919] text_poke_bp+0x5d/0xd0
Sep 28 18:35:46 buildserver kernel: [ 2091.923920] ? netif_receive_skb_internal+0x19/0x90
Sep 28 18:35:46 buildserver kernel: [ 2091.923922] arch_jump_label_transform+0x83/0x110
Sep 28 18:35:46 buildserver kernel: [ 2091.923923] ? netif_receive_skb_internal+0x1e/0x90
Sep 28 18:35:46 buildserver kernel: [ 2091.923924] __jump_label_update+0x6c/0x80
Sep 28 18:35:46 buildserver kernel: [ 2091.923934] jump_label_update+0x74/0x80
Sep 28 18:35:46 buildserver kernel: [ 2091.923935] static_key_slow_inc+0xad/0xc0
Sep 28 18:35:46 buildserver kernel: [ 2091.923936] static_key_enable+0x1c/0x50
Sep 28 18:35:46 buildserver kernel: [ 2091.923937] netstamp_clear+0x2a/0x40
Sep 28 18:35:46 buildserver kernel: [ 2091.923939] process_one_work+0x121/0x400
Sep 28 18:35:46 buildserver kernel: [ 2091.923940] worker_thread+0x37/0x4b0
Sep 28 18:35:46 buildserver kernel: [ 2091.923941] kthread+0xdb/0x110
Sep 28 18:35:46 buildserver kernel: [ 2091.923942] ? process_one_work+0x400/0x400
Sep 28 18:35:46 buildserver kernel: [ 2091.923943] ? kthread_create_on_node+0x30/0x30
Sep 28 18:35:46 buildserver kernel: [ 2091.923944] ret_from_fork+0x21/0x2c
Sep 28 18:35:46 buildserver kernel: [ 2091.923945] Code: 00 89 f8 e8 29 97 30 00 3b 05 54 ce c3 d9 0f 8d af fe ff ff 8b 1e 03 1c 85 e0 d1 b1 d9 8b 53 0c 83 e2 01 74 0e 8d 74 26 00 f3 90 <8b> 53 0c 83 e2 01 75 f6 0f ae e8 89 f6 eb bf 0f b6 45 e0 89 04
Sep 28 18:36:14 buildserver kernel: [ 2119.924489] NMI watchdog: BUG: soft lockup - CPU#6 stuck for 23s! [kworker/6:1:132]
Sep 28 18:36:14 buildserver kernel: [ 2119.924491] Modules linked in: snd_hda_codec_hdmi joydev input_leds intel_rapl x86_pkg_temp_thermal intel_powerclamp snd_hda_codec_realtek coretemp snd_hda_codec_generic kvm snd_hda_intel irqbypass snd_hda_codec crc32_pclmul pcbc snd_hda_core snd_hwdep snd_pcm aesni_intel snd_seq_midi aes_i586 snd_seq_midi_event crypto_simd cryptd snd_rawmidi intel_cstate intel_rapl_perf snd_seq snd_seq_device lpc_ich snd_timer snd mei_me mei soundcore shpchp mac_hid parport_pc ppdev lp parport autofs4 hid_generic usbhid hid nouveau mxm_wmi i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ahci drm r8169 libahci mii wmi fjes video
Sep 28 18:36:14 buildserver kernel: [ 2119.924521] CPU: 6 PID: 132 Comm: kworker/6:1 Tainted: G L 4.10.0-28-generic #32~16.04.2-Ubuntu
Sep 28 18:36:14 buildserver kernel: [ 2119.924522] Hardware name: LENOVO 90ETCTO1WW/ , BIOS FCKT77AUS 12/22/2015
Sep 28 18:36:14 buildserver kernel: [ 2119.924526] Workqueue: events netstamp_clear
Sep 28 18:36:14 buildserver kernel: [ 2119.924527] task: f55c0000 task.stack: f55ca000
Sep 28 18:36:14 buildserver kernel: [ 2119.924529] EIP: smp_call_function_many+0x1c8/0x220
Sep 28 18:36:14 buildserver kernel: [ 2119.924529] EFLAGS: 00000202 CPU: 6
Sep 28 18:36:14 buildserver kernel: [ 2119.924530] EAX: 00000007 EBX: f5df4450 ECX: 00000007 EDX: 00000001
Sep 28 18:36:14 buildserver kernel: [ 2119.924531] ESI: f5dde540 EDI: f5dde544 EBP: f55cbe68 ESP: f55cbe44
Sep 28 18:36:14 buildserver kernel: [ 2119.924531] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068

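Yesterday, September 28, I reinstalled the system. At 18:36 it appears to have hung again, and for the next several hours the log kept showing "NMI watchdog: BUG: soft lockup - CPU#6 stuck for 23s! [kworker/6:1:132]" and similar messages.
I searched online but found no solution. This problem has been bothering me for a long time.
I would appreciate any suggestions. Thank you!

The complete log is in the attachment.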
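
For a long log like this one, it can help to tally the soft lockup reports per CPU and task rather than reading them one by one. The following is a minimal Python sketch, not part of the original post: the function name `summarize_soft_lockups` is made up for illustration, and the regular expression assumes the message format seen in the excerpt above.

```python
import re
from collections import Counter

# Matches messages such as:
#   NMI watchdog: BUG: soft lockup - CPU#6 stuck for 23s! [kworker/6:1:132]
# The pattern is an assumption based on the log excerpt in this thread.
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<task>[^\]]+)\]"
)


def summarize_soft_lockups(lines):
    """Count soft lockup reports per (CPU number, task name) pair."""
    counts = Counter()
    for line in lines:
        match = SOFT_LOCKUP.search(line)
        if match:
            counts[(int(match.group("cpu")), match.group("task"))] += 1
    return counts


# Three lines copied from the log above; only two are soft lockup reports.
sample = [
    "Sep 28 18:35:46 buildserver kernel: [ 2091.923860] NMI watchdog: BUG: "
    "soft lockup - CPU#6 stuck for 23s! [kworker/6:1:132]",
    "Sep 28 18:35:46 buildserver kernel: [ 2091.923893] Workqueue: events "
    "netstamp_clear",
    "Sep 28 18:36:14 buildserver kernel: [ 2119.924489] NMI watchdog: BUG: "
    "soft lockup - CPU#6 stuck for 23s! [kworker/6:1:132]",
]

summary = summarize_soft_lockups(sample)
for (cpu, task), n in sorted(summary.items()):
    print(f"CPU#{cpu} [{task}]: {n} report(s)")
```

To run it over a whole file, pass an open file object instead of the sample list, for example `summarize_soft_lockups(open("kern.log"))`, where `kern.log` is a hypothetical name for the extracted attachment.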


Attachment:
kern.7z [123.59 KiB]
Downloaded 3 times
Post #2
Title: Re: Ubuntu 16.04 freezes; the log repeatedly shows NMI watchdog: BUG: soft lockup - CP
Posted: 2017-09-29 12:10

Joined: 2009-08-04 16:33
Posts: 16912
Thanks given: 21
Thanks received: 1832
Quote:
the log kept showing "NMI watchdog: BUG: soft lockup - CPU#6 stuck for 23s! [kworker/6:1:132]"

1. Try installing a different kernel version.
1-1. Then reboot
and boot into the newly installed kernel.
1-2. See
https://github.com/lxc/lxc/issues/1088
Problem on latest kernel: kernel:NMI watchdog: BUG: soft lockup
1-2-1. https://github.com/lxc/lxc/issues/1088# ... -240370737
Our lockup happened on June 31 and right after that I upgraded the kernel to latest Ubuntu 16.04 kernel.
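
When testing a different kernel as suggested in step 1, it may be useful to confirm which kernel release each lockup report actually came from. The sketch below is only an illustration: the helper name `kernel_releases` is hypothetical, and the pattern assumes the "CPU: ... Comm: ... Not tainted/Tainted: ..." header format seen in the log in post #1.

```python
import re

# Matches the per-report header line, for example:
#   CPU: 6 PID: 132 Comm: kworker/6:1 Not tainted 4.10.0-28-generic #32~16.04.2-Ubuntu
# The format is assumed from the log excerpt in this thread.
REPORT_HEADER = re.compile(
    r"CPU: \d+ PID: \d+ Comm: \S+ (?:Not tainted|Tainted: .*?) "
    r"(?P<release>\d+\.\d+\.\d+\S*) #"
)


def kernel_releases(lines):
    """Return the set of kernel release strings found in report headers."""
    found = set()
    for line in lines:
        match = REPORT_HEADER.search(line)
        if match:
            found.add(match.group("release"))
    return found


# Two header lines copied from the log in post #1.
sample = [
    "Sep 28 18:35:46 buildserver kernel: [ 2091.923889] CPU: 6 PID: 132 "
    "Comm: kworker/6:1 Not tainted 4.10.0-28-generic #32~16.04.2-Ubuntu",
    "Sep 28 18:36:14 buildserver kernel: [ 2119.924521] CPU: 6 PID: 132 "
    "Comm: kworker/6:1 Tainted: G L 4.10.0-28-generic #32~16.04.2-Ubuntu",
]

releases = kernel_releases(sample)
print(sorted(releases))
```

If the same script later reports more than one release, the lockups occurred under several kernels, which would suggest the kernel version alone is not the cause.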



Post #3
Title: Re: Ubuntu 16.04 freezes; the log repeatedly shows NMI watchdog: BUG: soft lockup - CP
Posted: 2017-09-29 12:19

Joined: 2017-09-29 9:53
Posts: 3
System: Ubuntu 16.04
Thanks given: 2
Thanks received: 0
poloshiao wrote:
Quote:
the log kept showing "NMI watchdog: BUG: soft lockup - CPU#6 stuck for 23s! [kworker/6:1:132]"

1. Try installing a different kernel version.
1-1. Then reboot
and boot into the newly installed kernel.
1-2. See
https://github.com/lxc/lxc/issues/1088
Problem on latest kernel: kernel:NMI watchdog: BUG: soft lockup
1-2-1. https://github.com/lxc/lxc/issues/1088# ... -240370737
Our lockup happened on June 31 and right after that I upgraded the kernel to latest Ubuntu 16.04 kernel.


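Thank you very much for your reply. I have already installed several Linux versions; I tried both 14.04 and 16.04, in 32-bit and 64-bit, but they all report similar errors.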


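Post #4
Title: Re: Ubuntu 16.04 freezes; the log repeatedly shows NMI watchdog: BUG: soft lockup - CP
Posted: 2017-09-29 17:54

Joined: 2009-08-04 16:33
Posts: 16912
Thanks given: 21
Thanks received: 1832
See
https://bugs.launchpad.net/ubuntu/+sour ... ug/1530405
NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kerneloops:814]
Search that page for "kworker".

Try this temporary workaround and see whether it helps:
https://bugs.launchpad.net/ubuntu/+sour ... omments/71
gksudo gedit /etc/sysctl.conf # requires the gksu package to be installed first
# add this line
kernel.watchdog_thresh=30
sudo systemctl reboot

There is no single agreed-upon solution yet.

If you are interested, you can also post there and compare notes with them.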
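
As background on the workaround above: according to the kernel's sysctl documentation, `kernel.watchdog_thresh` defaults to 10 seconds and the soft lockup threshold is twice that value, which fits the "stuck for 22s" and "stuck for 23s" messages quoted in this thread. Raising the value therefore changes when the warning fires rather than removing whatever keeps the CPU busy. The Python sketch below illustrates the arithmetic; the function names are made up for this example, and the parser handles only simple `key=value` lines.

```python
DEFAULT_WATCHDOG_THRESH = 10  # kernel default, in seconds


def parse_watchdog_thresh(conf_text):
    """Return the last kernel.watchdog_thresh value in sysctl.conf-style
    text, or None if the key is not set. Simplified illustrative parser."""
    value = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines ('#' or ';').
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, val = line.partition("=")
        if sep and key.strip() == "kernel.watchdog_thresh":
            value = int(val.strip())
    return value


def soft_lockup_threshold(watchdog_thresh):
    """Soft lockup threshold in seconds: 2 * watchdog_thresh."""
    return 2 * watchdog_thresh


# The fragment suggested in the post above.
conf = """\
# add this line
kernel.watchdog_thresh=30
"""

thresh = parse_watchdog_thresh(conf)
print("configured:", thresh, "-> soft lockup after", soft_lockup_threshold(thresh), "s")
print("default:", DEFAULT_WATCHDOG_THRESH, "-> soft lockup after",
      soft_lockup_threshold(DEFAULT_WATCHDOG_THRESH), "s")
```

On a running system the value currently in effect can be read from `/proc/sys/kernel/watchdog_thresh`, which is a quick way to check that the setting was applied after the reboot.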



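Post #5
Title: Re: Ubuntu 16.04 freezes; the log repeatedly shows NMI watchdog: BUG: soft lockup - CP
Posted: 2017-09-30 9:50

Joined: 2017-09-29 9:53
Posts: 3
System: Ubuntu 16.04
Thanks given: 2
Thanks received: 0
poloshiao wrote:
See
https://bugs.launchpad.net/ubuntu/+sour ... ug/1530405
NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kerneloops:814]
Search that page for "kworker".

Try this temporary workaround and see whether it helps:
https://bugs.launchpad.net/ubuntu/+sour ... omments/71
gksudo gedit /etc/sysctl.conf # requires the gksu package to be installed first
# add this line
kernel.watchdog_thresh=30
sudo systemctl reboot

There is no single agreed-upon solution yet.

If you are interested, you can also post there and compare notes with them.


Hi poloshiao,
Thank you very much for your reply. I installed the gksu package and set kernel.watchdog_thresh to 30, but the machine still froze.
I will try installing other kernel versions next. Thanks!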

