Opened 12 months ago

Closed 12 months ago

Last modified 12 months ago

#1560 closed defect (worksforme)

CPU Usage and Load average peaked at 3am this morning

Reported by: HwyXingFrog
Owned by:
Priority: major
Milestone:
Component: Backend
Version: 8.0.2-RELEASE
Keywords:
Cc:

Description

All I really have are the Reporting Charts.

I couldn't even load the web page to reboot, so I had to use the power button on the box. Even after the reboot, CPU usage and system load are much higher than average, and file access is now very slow.

Is there any more information I can capture from logs or anything?

It seems my FreeNAS system does things like this after it runs for a month or so, but I don't really have any concrete conclusions.

Attachments (4)

cpu-1d.png (16.8 KB) - added by HwyXingFrog 12 months ago.
cpu-1w.png (14.6 KB) - added by HwyXingFrog 12 months ago.
load-1d.png (20.2 KB) - added by HwyXingFrog 12 months ago.
load-1w.png (15.0 KB) - added by HwyXingFrog 12 months ago.


Change History (8)

Changed 12 months ago by HwyXingFrog

comment:1 Changed 12 months ago by jpaetzel

There are tasks the underlying OS runs at 3am. It's also possible that the filesystem started doing a scrub. Can you paste the output of running zpool status from the CLI?

Perhaps a snapshot of the output of top as well.

comment:2 Changed 12 months ago by HwyXingFrog

The system finally recovered after another reboot.

So, if this happens again, is this all the extra info that helps?

[root@freenas] ~# zpool status
  pool: Pool1
 state: ONLINE
 scrub: scrub completed after 2h30m with 0 errors on Sat Jun 2 03:25:13 2012
config:

NAME                                            STATE   READ  WRITE  CKSUM
Pool1                                           ONLINE  0     0      0
  raidz1                                        ONLINE  0     0      0
    gptid/2988f3da-a278-11e0-a287-001fd05b97a8  ONLINE  0     0      0
    gptid/29ea801d-a278-11e0-a287-001fd05b97a8  ONLINE  0     0      0
    gptid/2a4f3b89-a278-11e0-a287-001fd05b97a8  ONLINE  0     0      0
    gptid/2acb6c0f-a278-11e0-a287-001fd05b97a8  ONLINE  0     0      0
    gptid/2b41e1c3-a278-11e0-a287-001fd05b97a8  ONLINE  0     0      0
    gptid/2bc761c6-a278-11e0-a287-001fd05b97a8  ONLINE  0     0      0

errors: No known data errors
[root@freenas] ~# top
last pid: 10956; load averages: 0.05, 0.01, 0.00 up 0+11:06:02 12:00:31
45 processes: 1 running, 44 sleeping
CPU: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
Mem: 370M Active, 1232M Inact, 2034M Wired, 129M Cache, 198M Buf, 56M Free
Swap: 12G Total, 1184K Used, 12G Free

  PID USERNAME   THR PRI NICE   SIZE    RES STATE  C TIME  WCPU COMMAND
 2487 root         1  44    0 47788K  7324K select 0 0:17 0.00% smbd
10355 root         1  44    0 47796K  7592K select 0 0:14 0.00% smbd
 1847 root         7  44    0 65044K  7388K ucond  1 0:13 0.00% collectd
 2215 www          1  44    0 19324K  3436K kqread 0 0:05 0.00% lighttpd
 1750 root         6  44    0   126M 68324K uwait  0 0:04 0.00% python
 1501 root         1  44    0 11776K  2236K select 0 0:01 0.00% ntpd
 1347 root         1  44    0 46624K  6444K select 1 0:01 0.00% smbd
 1343 root         1  44    0 38048K  4356K select 0 0:00 0.00% nmbd
 2305 root         1  76    0 64096K 23068K ttyin  1 0:00 0.00% python
 1962 root         1  54    0  7832K  1184K nanslp 0 0:00 0.00% cron
 1088 root         1  44    0  6904K  1176K select 0 0:00 0.00% syslogd
 1731 avahi        1  44    0 16932K  2236K select 1 0:00 0.00% avahi-daemon
10914 root         1  44    0 47796K  7036K select 0 0:00 0.00% smbd
10915 root         1  44    0 33300K  4056K select 0 0:00 0.00% sshd
 2129 root         1  44    0  7836K  1320K select 0 0:00 0.00% rpcbind
10917 root         1  44    0 10172K  2600K pause  1 0:00 0.00% csh
10356 root         1  44    0 46632K  6796K select 0 0:00 0.00% smbd
 1403 root         1  44    0 46624K  6400K select 0 0:00 0.00% smbd
 2133 root         1  49    0  6772K  1180K select 1 0:00 0.00% mountd
 2306 root         1  76    0  6772K   924K ttyin  0 0:00 0.00% getty
 1940 root         1  44    0 24972K  3048K select 0 0:00 0.00% sshd
 2307 root         1  76    0  6772K   924K ttyin  1 0:00 0.00% getty
 2312 root         1  76    0  6772K   924K ttyin  0 0:00 0.00% getty
 2310 root         1  76    0  6772K   924K ttyin  1 0:00 0.00% getty
 2311 root         1  76    0  6772K   924K ttyin  1 0:00 0.00% getty
 2309 root         1  76    0  6772K   924K ttyin  1 0:00 0.00% getty
 2308 root         1  76    0  6772K   924K ttyin  1 0:00 0.00% getty
 1725 messagebus   1  70    0  7980K  1576K select 1 0:00 0.00% dbus-daemon
  641 root         1  76    0  5684K  1084K select 0 0:00 0.00% dhclient
10956 root         1  44    0  9224K  2052K CPU1   0 0:00 0.00% top
10930 root         1  44    0 46632K  6684K select 0 0:00 0.00% smbd
 1691 root         1  76    0  5812K  1068K select 0 0:00 0.00% rsync
  788 root         1  44    0  3200K   576K select 1 0:00 0.00% devd
  662 _dhcp        1  44    0  5684K  1156K select 1 0:00 0.00% dhclient

Let me know so I know the info to gather if/when this happens again.

Thanks.
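
For next time, here is a minimal sketch of grabbing both outputs requested in this ticket in one step while the box is still reachable over SSH. It assumes Python 3.7 or later is available, which may not be true of the Python shipped with 8.0.2, and that FreeBSD top accepts -b (batch mode) and -d 1 (a single display). The function name capture_diagnostics and the report layout are made up for illustration and are not part of FreeNAS.

```python
import subprocess
from datetime import datetime

# Assumed command set: the two commands asked for in this ticket.
# The top flags (-b batch mode, -d 1 one display) are an assumption
# about FreeBSD top; adjust them if your version differs.
DEFAULT_COMMANDS = [
    ["zpool", "status"],
    ["top", "-b", "-d", "1"],
]


def capture_diagnostics(commands=DEFAULT_COMMANDS, timeout=30):
    """Run each command and return one text report with a timestamp header."""
    sections = ["captured: " + datetime.now().isoformat()]
    for cmd in commands:
        try:
            result = subprocess.run(
                cmd, capture_output=True, text=True, timeout=timeout
            )
            body = result.stdout + result.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            # A missing or hung command is recorded instead of aborting the run.
            body = "failed: {}\n".format(exc)
        sections.append("$ " + " ".join(cmd) + "\n" + body)
    return "\n".join(sections)
```

Calling print(capture_diagnostics()) and redirecting the result to a file on the pool would keep a record that survives a reboot. The timeout is there because a loaded system may not answer promptly.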

comment:3 Changed 12 months ago by william

  • Resolution set to worksforme
  • Status changed from new to closed

Taking a look at zpool status, it looks like it was a zpool scrub.

Scrub runs every 30 days in 8.0.x.

comment:4 Changed 12 months ago by HwyXingFrog

So then the issue would be that the scrub pinned the CPU for 9+ hours when it first started, but after a reboot it took only 2.5 hours (according to the zpool status output above).
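
To make the timing concrete, here is a rough sketch that works out when the completed scrub began, using only the scrub line from the zpool status output above. The regular expression is an assumption based on that single line and may not match other ZFS versions. The helper name scrub_window is hypothetical.

```python
import re
from datetime import datetime, timedelta

# Pattern inferred from the one "scrub completed" line in this ticket.
SCRUB_RE = re.compile(
    r"scrub completed after (?:(\d+)h)?(\d+)m with (\d+) errors on (.+)$"
)


def scrub_window(line):
    """Return (start, finish, duration) parsed from a 'scrub completed' line."""
    match = SCRUB_RE.search(line.strip())
    if match is None:
        raise ValueError("not a 'scrub completed' line")
    hours, minutes, _errors, stamp = match.groups()
    duration = timedelta(hours=int(hours or 0), minutes=int(minutes))
    finish = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
    # The start time is simply the finish time minus the reported duration.
    return finish - duration, finish, duration


line = "scrub: scrub completed after 2h30m with 0 errors on Sat Jun 2 03:25:13 2012"
start, finish, duration = scrub_window(line)
print(start)     # 2012-06-02 00:55:13
print(duration)  # 2:30:00
```

So the scrub that completed began at about 00:55 on Jun 2. If the top capture above was taken the same day (an assumption), its uptime of 0+11:06:02 at 12:00:31 puts the boot at about 00:54:29, which would mean this scrub started less than a minute after the reboot.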
