Explaining the CPU load information in /proc/stat

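Preface

Plenty of Chinese-language blogs have explained what each field in the Linux /proc/stat file means. While putting a document together today I searched for the topic again and found several obvious errors, which most articles about /proc/stat repeat. Hence this post on a rather well-worn subject.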

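The authoritative documentation for /proc/stat

Here is the original documentation for /proc/stat. Every other write-up is translated or derived from it, so if you can read the English, go straight to the source.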

/proc/stat
              kernel/system statistics.  Varies with architecture.  Common
              entries include:

              cpu  3357 0 4313 1362393
                     The amount of time, measured in units of USER_HZ
                     (1/100ths of a second on most architectures, use
                     sysconf(_SC_CLK_TCK) to obtain the right value), that
                     the system spent in various states:

                     user   (1) Time spent in user mode.

                     nice   (2) Time spent in user mode with low priority
                            (nice).

                     system (3) Time spent in system mode.

                     idle   (4) Time spent in the idle task.  This value
                            should be USER_HZ times the second entry in the
                            /proc/uptime pseudo-file.

                     iowait (since Linux 2.5.41)
                            (5) Time waiting for I/O to complete.

                     irq (since Linux 2.6.0-test4)
                            (6) Time servicing interrupts.

                     softirq (since Linux 2.6.0-test4)
                            (7) Time servicing softirqs.

                     steal (since Linux 2.6.11)
                            (8) Stolen time, which is the time spent in
                            other operating systems when running in a
                            virtualized environment.

                     guest (since Linux 2.6.24)
                            (9) Time spent running a virtual CPU for guest
                            operating systems under the control of the Linux
                            kernel.

                     guest_nice (since Linux 2.6.33)
                            (10) Time spent running a niced guest (virtual
                            CPU for guest operating systems under the
                            control of the Linux kernel).

              page 5741 1808
                     The number of pages the system paged in and the number
                     that were paged out (from disk).

              swap 1 0
                     The number of swap pages that have been brought in and
                     out.

              intr 1462898
                     This line shows counts of interrupts serviced since
                     boot time, for each of the possible system interrupts.
                     The first column is the total of all interrupts
                     serviced including unnumbered architecture specific
                     interrupts; each subsequent column is the total for
                     that particular numbered interrupt.  Unnumbered
                     interrupts are not shown, only summed into the total.

              disk_io: (2,0):(31,30,5764,1,2) (3,0):...
                     (major,disk_idx):(noinfo, read_io_ops, blks_read,
                     write_io_ops, blks_written)
                     (Linux 2.4 only)

              ctxt 115315
                     The number of context switches that the system
                     underwent.

              btime 769041601
                     boot time, in seconds since the Epoch, 1970-01-01
                     00:00:00 +0000 (UTC).

              processes 86031
                     Number of forks since boot.

              procs_running 6
                     Number of processes in runnable state.  (Linux 2.5.45
                     onward.)

              procs_blocked 2
                     Number of processes blocked waiting for I/O to
                     complete.  (Linux 2.5.45 onward.)

Source: http://man7.org/linux/man-pages/man5/proc.5.html
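
To make the field order above concrete, here is a minimal Python sketch that maps the numbers on a cpu line to the names the man page uses and converts ticks to seconds. The function names (parse_cpu_line, read_cpu_times, ticks_to_seconds) are my own and purely illustrative; it assumes a Linux system where os.sysconf("SC_CLK_TCK") reports USER_HZ.

```python
import os

# Field names in the order the man page lists them for the "cpu" line.
CPU_FIELDS = ("user", "nice", "system", "idle", "iowait",
              "irq", "softirq", "steal", "guest", "guest_nice")


def parse_cpu_line(line):
    """Map the numbers on a 'cpu' line to field names."""
    parts = line.split()
    if not parts or not parts[0].startswith("cpu"):
        raise ValueError("not a cpu line: %r" % line)
    # zip() stops at the shorter input, so older kernels that emit
    # fewer columns simply yield fewer keys.
    return dict(zip(CPU_FIELDS, (int(value) for value in parts[1:])))


def read_cpu_times(path="/proc/stat"):
    """Return the aggregate 'cpu' line as a dict of USER_HZ ticks."""
    with open(path) as stat_file:
        for line in stat_file:
            if line.startswith("cpu "):
                return parse_cpu_line(line)
    raise RuntimeError("no aggregate cpu line found in %s" % path)


def ticks_to_seconds(ticks):
    """Convert USER_HZ ticks to seconds using sysconf(_SC_CLK_TCK)."""
    return ticks / os.sysconf("SC_CLK_TCK")


# The sample line from the man page excerpt, parsed without touching /proc.
for name, ticks in parse_cpu_line("cpu  3357 0 4313 1362393").items():
    print("%-8s %d" % (name, ticks))
```

On a live system, read_cpu_times() returns the same kind of dict for the current counters, and ticks_to_seconds() turns any of its values into seconds.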

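Common incorrect explanations

Parameter  Explanation
user       CPU time spent in user mode, accumulated from system boot to the present (unit: jiffies), excluding processes with a negative nice value. 1 jiffy = 0.01 s
nice       CPU time used by processes with a negative nice value, accumulated from system boot to the present (unit: jiffies)
system     Kernel time, accumulated from system boot to the present (unit: jiffies)
idle       Waiting time other than disk I/O wait, accumulated from system boot to the present (unit: jiffies)
iowait     Disk I/O wait time, accumulated from system boot to the present (unit: jiffies)
irq        Hard interrupt time, accumulated from system boot to the present (unit: jiffies)
softirq    Soft interrupt time, accumulated from system boot to the present (unit: jiffies)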

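The correct explanation of the CPU load information in /proc/stat

Parameter  Description
user       CPU time spent in user mode. The English documentation adds no further qualification, so the claim that it excludes processes with a negative nice value is wrong.
nice       The original text says: user-mode CPU time used by low-priority processes. A low priority corresponds to a positive nice value, so the claim that this is the CPU time used by processes with a negative nice value is wrong.
system     CPU time spent in system (kernel) mode
idle       Time the CPU was idle (not including I/O wait)
iowait     Time spent waiting for I/O to complete
irq        Time spent servicing hardware interrupts
softirq    Time spent servicing soft interrupts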
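
Because every field is a counter that only grows from boot, a load figure comes from the difference between two samples rather than from a single reading. The sketch below shows one way to turn two samples into percentages. The name cpu_percentages and the sample numbers are made up for illustration, and leaving guest and guest_nice out of the total rests on my assumption that the kernel already counts them inside user and nice.

```python
# Fields that make up the total. guest and guest_nice are left out on the
# assumption that the kernel already counts them inside user and nice.
TOTAL_FIELDS = ("user", "nice", "system", "idle",
                "iowait", "irq", "softirq", "steal")


def cpu_percentages(before, after):
    """Share of elapsed ticks per field between two samples, in percent."""
    deltas = {
        name: after.get(name, 0) - before.get(name, 0)
        for name in TOTAL_FIELDS
    }
    total = sum(deltas.values())
    if total <= 0:
        # No ticks elapsed (or counters went backwards): report zeros.
        return {name: 0.0 for name in TOTAL_FIELDS}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


# Two made-up samples (not real measurements), taken some time apart.
sample_before = {"user": 100, "nice": 0, "system": 50, "idle": 850}
sample_after = {"user": 150, "nice": 25, "system": 75, "idle": 950}

for name, percent in cpu_percentages(sample_before, sample_after).items():
    print("%-8s %5.1f%%" % (name, percent))
```

In practice the two dicts would come from reading the cpu line of /proc/stat twice, a second or so apart.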

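Verifying with a test

Apart from nice, the other fields are mostly unproblematic. Most articles say:

nice is the CPU time used by processes with a negative nice value

So I picked a machine and tested it.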

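  1. Initially the nice field was 0. I ran a CPU stress program with a nice value of -20, and the nice field did not change; it stayed at 0.
  2. I then reran the CPU stress program with a nice value of 20 (which Linux caps at 19), and the nice field started growing immediately.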

This matches what the English documentation says.
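
For anyone who wants to repeat the experiment, here is a rough Python sketch of it. It is not the stress program I used. The helper names (nice_ticks, read_nice_ticks, nice_growth_while_busy) are hypothetical, and the busy loop is just a stand-in for any CPU-bound workload. It assumes Linux, and a negative increment needs root privileges.

```python
import subprocess
import sys

# Child program: adjust its own nice value, then burn CPU for a while.
BUSY_CHILD = """
import os, sys, time
os.nice(int(sys.argv[1]))
deadline = time.time() + float(sys.argv[2])
while time.time() < deadline:
    pass
"""


def nice_ticks(stat_text):
    """Extract the nice field (second number) from the aggregate cpu line."""
    for line in stat_text.splitlines():
        if line.startswith("cpu "):
            return int(line.split()[2])
    raise ValueError("no aggregate cpu line found")


def read_nice_ticks(path="/proc/stat"):
    """Read the current nice counter from /proc/stat."""
    with open(path) as stat_file:
        return nice_ticks(stat_file.read())


def nice_growth_while_busy(increment, seconds=3.0):
    """Return how far the nice counter grew while a busy child ran."""
    start = read_nice_ticks()
    subprocess.run(
        [sys.executable, "-c", BUSY_CHILD, str(increment), str(seconds)],
        check=True,
    )
    return read_nice_ticks() - start
```

Calling nice_growth_while_busy(19) should report a clearly positive number, while nice_growth_while_busy(-20), run as root, should stay at or near zero as long as nothing else on the machine is running at a positive nice value.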

2014-05-13 14:51