Mike Chirico (mchirico@users.sourceforge.net) or (mchirico@gmail.com)
Copyright (c) 2005 (GNU Free Documentation License)
Last Updated: Fri Dec 2 07:28:29 EST 2005
[http://souptonuts.sourceforge.net/performance_tutorial.html]

Performance Monitoring on Linux

The steps in this document were tested with Fedora Core 4.

oprofile - steps for running oprofile on Fedora.

Although it is possible to get an RPM with Red Hat's vmlinux in it, I personally prefer to recompile the kernel from source. The kernel source code contains a wealth of information. Start with the Documentation folder. Later, take a look at lxr, glimpse, and patchset, which are powerful tools for searching and understanding the kernel source. (Reference TIP 117)

Step 1:

Find out what version of the kernel you are running.

          
   $ uname -a
   Linux closet.squeezel.com 2.6.12-1.1398_FC4 #1 Fri Jul 15 00:52:32 EDT 2005 i686 i686 i386 GNU/Linux

Step 2:

Download the source into a directory of your choice. Above, I'm running 2.6.12-1, but I'm going to go for 2.6.12.3, since it's a little later. You want the signature file as well.

    $ wget http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.12.3.tar.gz
    $ wget http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.12.3.tar.gz.sign          

Now, check the signature.

    $ gpg --verify linux-2.6.12.3.tar.gz.sign linux-2.6.12.3.tar.gz

Step 3:

Unpack the file.

    $ tar -xzf linux-2.6.12.3.tar.gz
    $ cd linux-2.6.12.3

Step 4:

Copy the ".config" used to compile your previous kernel. You should find it in the following directory "/lib/modules/$(uname -r)/build/.config".

Copy it to the linux-2.6.12.3 directory.

   $ cp "/lib/modules/$(uname -r)/build/.config" .
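
If the build directory is missing, the copy above fails. The following is a minimal sketch of a fallback lookup. The helper name find_kernel_config is my own invention, and the second location, /boot/config-$(uname -r), is an assumption about where your distribution keeps a copy of the running kernel's configuration.

```shell
# find_kernel_config: print the first readable file among its arguments.
# Returns non-zero if none of the candidates can be read.
# (Hypothetical helper name; not part of any kernel tooling.)
find_kernel_config() {
    for candidate in "$@"; do
        if [ -r "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Typical call, run from inside the unpacked source tree:
#   cfg=$(find_kernel_config "/lib/modules/$(uname -r)/build/.config" \
#                            "/boot/config-$(uname -r)") && cp "$cfg" .config
```

The function only reports a path. The copy itself stays a separate, visible step.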

Step 5:

Run make as follows. "make oldconfig" will ask a few questions. The install steps (shown with the # prompt) have to be done with root privileges.

   $ make oldconfig
   $ make bzImage
   $ make modules
   # make modules_install
   # make install

Step 6:

Edit "/boot/grub/grub.conf" and set default=0, as shown below, so that the first title entry (the new kernel) boots by default.

   # grub.conf generated by anaconda                                                         
   #                                                                                         
   # Note that you do not have to rerun grub after making changes to this file               
   # NOTICE:  You have a /boot partition.  This means that                                   
   #          all kernel and initrd paths are relative to /boot/, eg.                        
   #          root (hd0,2)                                                                   
   #          kernel /vmlinuz-version ro root=/dev/VolGroup00/LogVol00                       
   #          initrd /initrd-version.img                                                     
   #boot=/dev/hda                                                                            
   default=0
   timeout=5                                                                                 
   splashimage=(hd0,2)/grub/splash.xpm.gz                                                    
   hiddenmenu                                                                                
   title Fedora Core (2.6.12.3)                                                              
           root (hd0,2)                                                                      
           kernel /vmlinuz-2.6.12.3 ro root=/dev/VolGroup00/LogVol00 rhgb quiet              
           initrd /initrd-2.6.12.3.img                                                       
   title Fedora Core (2.6.12-1.1398_FC4)                                                     
           root (hd0,2)                                                                      
           kernel /vmlinuz-2.6.12-1.1398_FC4 ro root=/dev/VolGroup00/LogVol00 rhgb quiet     
           initrd /initrd-2.6.12-1.1398_FC4.img                                              
   title Fedora Core (2.6.11-1.1369_FC4)                                                     
           root (hd0,2)                                                                      
           kernel /vmlinuz-2.6.11-1.1369_FC4 ro root=/dev/VolGroup00/LogVol00 rhgb quiet     
           initrd /initrd-2.6.11-1.1369_FC4.img                                              
   title Other                                                                               
           rootnoverify (hd0,1)                                                              
           chainloader +1                                                                    
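
GRUB counts the title entries starting at 0, so it helps to see them numbered before rebooting. This is a rough sketch; list_grub_entries is a name I made up, and it assumes the simple "title ..." layout shown above.

```shell
# list_grub_entries FILE: show the default= setting, then number each
# "title" line the way GRUB counts them (starting at 0).
# (Hypothetical helper; assumes the grub.conf layout shown above.)
list_grub_entries() {
    awk '
        /^[ \t]*default[ \t]*=/ {
            gsub(/[ \t]/, "")            # strip padding around the setting
            print "setting: " $0
        }
        /^[ \t]*title[ \t]/ {
            sub(/^[ \t]*title[ \t]+/, "") # drop the keyword
            sub(/[ \t]+$/, "")            # drop trailing padding
            printf "%d: %s\n", n++, $0
        }
    ' "$1"
}

# Usage: list_grub_entries /boot/grub/grub.conf
```

The entry whose number matches the default= setting is the one that boots.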

Step 7:

Restart the computer, or run the shutdown command with the -r option. This is normally done as root.

   # shutdown -r now

Step 8:

Run opcontrol. The commands below are done as root. My kernel was compiled in the following directory "/home/kernel/linux-2.6.12.3/", so I'll run opcontrol as follows:

   # opcontrol --vmlinux=/home/kernel/linux-2.6.12.3/vmlinux

Now start.

   # opcontrol --start
   Using 2.6+ OProfile kernel interface.
   Reading module info.
   Using log file /var/lib/oprofile/oprofiled.log
   Daemon started.
   Profiler running.

Run report.

   $ opreport

   CPU: CPU with timer interrupt, speed 0 MHz (estimated)          
   Profiling through timer interrupt                               
             TIMER:0|                                              
     samples|      %|                                              
   ------------------                                              
       64451 99.5628 vmlinux                                       
          93  0.1437 opreport                                      
          67  0.1035 libc-2.3.5.so                                 
          44  0.0680 libstdc++.so.6.0.4                            
          25  0.0386 bash                                          
          21  0.0324 oprofiled                                     
           9  0.0139 ld-2.3.5.so                                   
           5  0.0077 ext3                                          
           4  0.0062 libcrypto.so.0.9.7f                           
           3  0.0046 jbd                                           
           2  0.0031 libproc-3.2.5.so                              
           1  0.0015 grep                                          
           1  0.0015 dm_mod                                        
           1  0.0015 ip_conntrack                                  
           1  0.0015 ip_tables                                     
           1  0.0015 libpthread-2.3.5.so                           
           1  0.0015 oprofile                                      
           1  0.0015 dirname                                       
           1  0.0015 libdns.so.20.0.2                              
           1  0.0015 cfenvd                                        
           1  0.0015 sshd                                          
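
The report can get long. Here is a small sketch that keeps only the first few rows; top_images is a hypothetical helper name, and it assumes the three-column summary layout shown above (samples, percent, image).

```shell
# top_images N: read opreport summary text on stdin and print the first N
# data rows as "image percent". Header lines are skipped because their
# first field is not a plain integer.
# (Hypothetical helper; assumes the summary layout shown above.)
top_images() {
    awk -v limit="$1" '
        NF >= 3 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9.]+$/ {
            printf "%s %s\n", $3, $2
            if (++shown >= limit) exit
        }
    '
}

# Usage: opreport | top_images 5
```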

You can reset the stats at any time.

   # opcontrol --reset
   Signalling daemon... done

To stop profiling, use the following command. The daemon itself keeps running.

   # opcontrol --stop
   Stopping profiling.

To shut down the daemon, use the following command.

  # opcontrol --shutdown
  Stopping profiling.
  Killing daemon.

Reference the following for more documentation:
http://oprofile.sourceforge.net/doc/

Disk and Filesystem Monitoring

IOzone - Benchmark Tool

IOzone is a good benchmark tool. It generates a lot of reads and writes on the filesystem. You'll want to run it, make changes to the kernel, and then run it again.

Quick Start Instructions

Do this on each filesystem that you're running the tests on, or just copy the contents of the current directory.

   $ wget http://www.iozone.org/src/current/iozone3_247.tar
   $ tar -xf iozone3_247.tar
   $ cd iozone3_247/src/current/

   $ make linux

Now to run a default test. This will generate a long table of tests.

   $ ./iozone -a

If you are running tests on a small filesystem, consider limiting the file size to 10000 KB or smaller with the -s option. The -O (oh) option reports results in operations per second, so higher numbers mean better performance.

   $ ./iozone -a -s 10000 -O

Another very good tool for stress testing your system is dbench. It's very simple to use, and, with precise control, load averages of 600 and above can be obtained.

Partitions

Before looking at some of the other tools, find out what partitions you have set up on your system. Here's a quick way to determine this information.

   [chirico@squeezel ~]$ cat /proc/partitions     
   major minor  #blocks  name                     
                                                  
      8     0  156250000 sda                      
      8     1     104391 sda1                     
      8     2  156143767 sda2                     
    253     0  153976832 dm-0                     
    253     1    2031616 dm-1                     
      7     0      20480 loop0                    
      7     1     204800 loop1                    
      7     2    2408604 loop2                    
      7     3    2686116 loop3                    
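
The #blocks column is in units of 1024 bytes, which is not obvious from the listing. As a quick sketch, the conversion to decimal gigabytes could look like this. The name partition_sizes is my own, and the optional file argument exists only so that the function can be tried on saved output.

```shell
# partition_sizes [FILE]: convert the #blocks column (1024-byte units) of
# /proc/partitions into decimal gigabytes (1 GB = 1000000000 bytes).
# (Hypothetical helper name.)
partition_sizes() {
    awk 'NF == 4 && $1 ~ /^[0-9]+$/ {
        printf "%s %.1f GB\n", $4, $3 * 1024 / 1000000000
    }' "${1:-/proc/partitions}"
}

# Usage: partition_sizes
```

For the sda line above, 156250000 blocks works out to 160.0 GB.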

Now run fdisk on the sda disk so that we can see how it is partitioned.

   [root@squeezel ~]# fdisk -l /dev/sda                             
                                                                    
   Disk /dev/sda: 160.0 GB, 160000000000 bytes                      
   255 heads, 63 sectors/track, 19452 cylinders                     
   Units = cylinders of 16065 * 512 = 8225280 bytes                 
                                                                    
      Device Boot      Start         End      Blocks   Id  System   
   /dev/sda1   *           1          13      104391   83  Linux    
   /dev/sda2              14       19452   156143767+  8e  Linux LVM

vmstat

Now you can run vmstat with the -p option on sda1 or sda2. vmstat also has the -d option for more disk statistics, but that is not shown here. The first listing is always the average since the last reboot.

   [root@squeezel ~]# vmstat -p sda1                              
   sda1          reads   read sectors  writes    requested writes 
                    610       1236         53        106          

Or running this on sda2.

   [root@squeezel ~]# vmstat -p sda2                                
   sda2          reads   read sectors  writes    requested writes 
                1084073   12126266    3117484   24939880          

You probably want continuous output. For example, the following will produce stats every 3 seconds. After 5 outputs, the output will stop. Again, the first line of output is the average since the last reboot.

     $ vmstat 3 5
     procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
      r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
      1  0    208  13968 346176 186824    0    0    17    63   76    36  1  2 95  1
      1  0    208  14012 346180 186820    0    0     0    24 1268  1163  1  2 97  0
      0  0    208  12944 346180 187080    0    0     4     0 1386  1763 10  2 88  0
      0  0    208  13028 346184 187076    0    0     0    16 1249  1107  1  2 97  0
      0  0    208  12860 346192 187068    0    0     0    16 1191   913  1  2 98  0

Take note of "si" and "so" under "swap", since large values mean you need more memory to prevent swapping. Below you can see a more heavily loaded system. This is an older version of vmstat so the heading is different.

     $ vmstat 3 5                                                                    
        procs                      memory    swap          io     system         cpu 
      r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id 
      2  2  0  44596  10920 219208 6584900   0   1     1     1    3     0   3   1   2
      2  1  1  44604  14684 219052 6581368   0   3   987  2872  692  1962   2  29  69
      0  0  0  44604  14412 219100 6581316   0   0     7   848  444  1056   1   7  92
      0  1  0  44600  14356 219212 6581512   0   0    93  1463  626  1183   1   9  90
      0 20  1  44600  13932 219284 6581928   0   0    69  1668  589  1376   1  16  83
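
Since the column positions differ between vmstat versions, a filter that watches "si" and "so" should find them by name. This is a minimal sketch; swap_activity is a hypothetical name, and it assumes the header row with the column names comes before the data rows, as in both listings above.

```shell
# swap_activity: read vmstat output on stdin, locate the "si" and "so"
# columns from the header row, and report samples where either is non-zero.
# (Hypothetical helper; works with both header layouts shown above.)
swap_activity() {
    awk '
        si_col == 0 {
            # Still looking for the header row that names the columns.
            for (i = 1; i <= NF; i++) {
                if ($i == "si") si_col = i
                if ($i == "so") so_col = i
            }
            next
        }
        $1 ~ /^[0-9]+$/ {
            sample++
            if ($si_col + 0 > 0 || $so_col + 0 > 0)
                printf "sample %d: si=%s so=%s\n", sample, $si_col, $so_col
        }
    '
}

# Usage: vmstat 3 5 | swap_activity
```

Remember that sample 1 is the average since the last reboot, so a hit there is less urgent than a hit in the later samples.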

iostat

This utility gives a rate in blocks per second. Think of iostat as the speedometer, whereas vmstat gives you the mileage, or total count. However, notice that iostat also gives you a total count in the last two columns, which would match the numbers above had both commands been executed at the same time. By default, iostat reports averages since the last boot.

   [chirico@squeezel performance]$ iostat -p                             
   Linux 2.6.11-1.35_FC3smp (squeezel.squeezel.com)        07/25/2005    
                                                                         
   avg-cpu:  %user   %nice %system %iowait   %idle                       
              0.75    0.09    0.82    0.49   97.85                       
                                                                         
   Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
   sda               6.29        32.76        67.67   12127992   25053770
   sda1              0.00         0.00         0.00       1236        106
   sda2             11.39        32.75        67.67   12126386   25053656

  • tps - transfers per second issued to the device.
  • Blk_read/s - blocks per second read from the device (1 Block = 512 bytes).
  • Blk_wrtn/s - blocks per second written to the device.
  • Blk_read - total number of blocks read.
  • Blk_wrtn - total number of blocks written.
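
Because 1 block is 512 bytes, dividing a blocks-per-second figure by 2 gives KB per second. Here is a short sketch of that arithmetic; blk_rates_kb is a made-up name, and it assumes the six-column device rows shown above.

```shell
# blk_rates_kb: read iostat device rows on stdin and print the read and
# write rates in KB/s, using 1 block = 512 bytes (KB/s = blocks/s / 2).
# (Hypothetical helper; assumes the six-column layout shown above.)
blk_rates_kb() {
    awk 'NF == 6 && $2 ~ /^[0-9.]+$/ {
        printf "%s read=%.1f KB/s write=%.1f KB/s\n", $1, $3 / 2, $4 / 2
    }'
}

# Usage: iostat -p | blk_rates_kb
```

For the sda row above, 32.76 blocks/s read is about 16.4 KB/s.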

What you really want is NOT an average since the last boot, but a current listing. The following command gives you 3 sets of data spaced 5 seconds apart. The first set is still the average since boot.


  $ iostat -xtc 5 3
  Linux 2.6.13-1.1526_FC4 (closet.squeezel.com) 	10/10/2005

  Time: 09:22:22 PM
  avg-cpu:  %user   %nice    %sys %iowait   %idle
             0.25    0.03    0.18    0.28   99.26

  Time: 09:22:27 PM
  avg-cpu:  %user   %nice    %sys %iowait   %idle
             0.80    0.00    0.40    0.00   98.80

  Time: 09:22:32 PM
  avg-cpu:  %user   %nice    %sys %iowait   %idle
             0.60    0.00    0.60    0.00   98.80


smartmontools

To get drive temperature, power-on hours, and other related information, take a look at smartmontools. Unfortunately, SATA drives are not fully supported.

   [root@closet smartmontools-5.33]# smartctl -A /dev/hda
   smartctl version 5.33 [i686-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen                                         
   Home page is http://smartmontools.sourceforge.net/                                                                     
                                                                                                                          
   === START OF READ SMART DATA SECTION ===                                                                               
   SMART Attributes Data Structure revision number: 16                                                                    
   Vendor Specific SMART Attributes with Thresholds:                                                                      
   ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE                       
     1 Raw_Read_Error_Rate     0x000b   100   100   060    Pre-fail  Always       -       0                               
     2 Throughput_Performance  0x0005   155   155   050    Pre-fail  Offline      -       226                             
     3 Spin_Up_Time            0x0007   103   103   024    Pre-fail  Always       -       238 (Average 291)               
     4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       54                              
     5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0                               
     7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0                               
     8 Seek_Time_Performance   0x0005   123   123   000    Pre-fail  Offline      -       37                              
     9 Power_On_Hours          0x0012   098   098   000    Old_age   Always       -       15394                           
    10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0                               
    12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       53                              
   192 Power-Off_Retract_Count 0x0032   100   100   050    Old_age   Always       -       338                             
   193 Load_Cycle_Count        0x0012   100   100   050    Old_age   Always       -       338                             
   194 Temperature_Celsius     0x0002   137   137   000    Old_age   Always       -       40 (Lifetime Min/Max 16/44)     
   196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0                               
   197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0                               
   198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0                               
   199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0                               
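
If you only care about a couple of values, the table is easy to filter. The following is a sketch under the assumption that the ten-column layout above holds for your drive; smart_summary is a hypothetical name. It also flags any attribute whose normalized VALUE has dropped to its THRESH or below, which is how SMART marks an attribute as failed.

```shell
# smart_summary: read "smartctl -A" output on stdin, print the raw value of
# two attributes, and flag attributes whose VALUE is at or below THRESH.
# Columns: $2 = name, $4 = VALUE, $6 = THRESH, $10 = RAW_VALUE.
# (Hypothetical helper; assumes the column layout shown above.)
smart_summary() {
    awk '
        NF >= 10 && $1 ~ /^[0-9]+$/ {
            if ($2 == "Temperature_Celsius") print "temperature_c " $10
            if ($2 == "Power_On_Hours")      print "power_on_hours " $10
            # A threshold of 000 means the attribute has no failure level.
            if ($6 + 0 > 0 && $4 + 0 <= $6 + 0)
                print "AT_OR_BELOW_THRESHOLD " $2
        }
    '
}

# Usage (as root): smartctl -A /dev/hda | smart_summary
```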

To check the overall health of the drive, use the -Hc option.

   [root@closet smartmontools-5.33]# smartctl -Hc /dev/hda 
   smartctl version 5.33 [i686-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen                    
   Home page is http://smartmontools.sourceforge.net/                                                
                                                                                                     
   === START OF READ SMART DATA SECTION ===                                                          
   SMART overall-health self-assessment test result: PASSED                                          
                                                                                                     
   General SMART Values:                                                                             
   Offline data collection status:  (0x84) Offline data collection activity                          
                                           was suspended by an interrupting command from host.       
                                           Auto Offline Data Collection: Enabled.                    
   Self-test execution status:      (   0) The previous self-test routine completed                  
                                           without error or no self-test has ever                    
                                           been run.                                                 
   Total time to complete Offline                                                                    
   data collection:                 (2855) seconds.                                                  
   Offline data collection                                                                           
   capabilities:                    (0x1b) SMART execute Offline immediate.                          
                                           Auto Offline data collection on/off support.              
                                           Suspend Offline collection upon new                       
                                           command.                                                  
                                           Offline surface scan supported.                           
                                           Self-test supported.                                      
                                           No Conveyance Self-test supported.                        
                                           No Selective Self-test supported.                         
   SMART capabilities:            (0x0003) Saves SMART data before entering                          
                                           power-saving mode.                                        
                                           Supports SMART auto save timer.                           
   Error logging capability:        (0x01) Error logging supported.                                  
                                           General Purpose Logging supported.                        
   Short self-test routine                                                                           
   recommended polling time:        (   1) minutes.                                                  
   Extended self-test routine                                                                        
   recommended polling time:        (  48) minutes.                                                  

Quotas

For monitoring filesystem quotas, reference "Implementing Disk Quotas on Linux".


Information on Processes

If you just want to find out what is going on, use the ps command with -e (every process) and -f (full format) to get the following output.

   $ ps -ef
      root         1     0  0 Nov16 ?        00:00:01 init [5]                                              
      root         2     1  0 Nov16 ?        00:00:22 [migration/0]
      root         3     1  0 Nov16 ?        00:00:00 [ksoftirqd/0]
      root         4     1  0 Nov16 ?        00:00:22 [migration/1]

You may want to redirect the above to a file for searching, for example "ps -ef > data".

The following command can give you detailed information on every running process in your system, sorted by percentage cpu, with no headers ("h" option).

   $ ps h  -e -o %cpu,pid,user,state,start,time,etime,%cpu,%mem,cmd|sort -rn
      0.1 28157 root     S 07:19:34 00:00:00       00:28  0.1  0.2 sshd: chirico [priv]
      0.0   937 named    S   Oct 27 00:03:56 22-07:16:06  0.0  0.4 /usr/sbin/named -u named -t /var/named/chroot
      0.0     7 root     S   Oct 22 00:00:00 26-21:53:13  0.0  0.0 [kacpid]
      0.0  7963 mchirico S   Nov 05 00:00:40 12-13:41:05  0.0  0.2 fetchmail
      0.0   785 root     S   Oct 22 00:00:00 26-21:53:02  0.0  0.0 udevd
      0.0    69 root     S   Oct 22 00:00:00 26-21:53:12  0.0  0.0 [khubd]
      0.0    66 root     S   Oct 22 00:00:01 26-21:53:12  0.0  0.0 [kblockd/0]
      0.0     5 root     S   Oct 22 00:00:00 26-21:53:13  0.0  0.0 [kthread]
      0.0     4 root     S   Oct 22 00:00:00 26-21:53:13  0.0  0.0 [khelper]
      0.0     3 root     S   Oct 22 00:00:00 26-21:53:13  0.0  0.0 [events/0]
      0.0  3589 chirico  S   Oct 26 00:00:00 23-02:46:55  0.0  0.1 /usr/libexec/gconfd-2 6

If you want to choose a particular process, use the -p option.

   $ ps -p 28157 -o %cpu,pid,user,state,start,time,etime,%cpu,%mem,cmd
    %CPU   PID USER     S  STARTED     TIME     ELAPSED %CPU %MEM CMD
    0.0 28157 root     S 07:19:33 00:00:00       06:18  0.0  0.2 sshd: chirico [priv]
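
On a busy system most rows show 0.0, so it can help to keep only the processes above some CPU percentage. Here is a minimal sketch; busy_processes is a name I made up, and it assumes %CPU is the first column of the ps output.

```shell
# busy_processes MIN: read ps output on stdin (with %CPU as the first
# column) and print the rows whose %CPU is at least MIN, highest first.
# The header row is dropped because its first field is not a number.
# (Hypothetical helper name.)
busy_processes() {
    awk -v min="$1" '$1 ~ /^[0-9.]+$/ && $1 + 0 >= min + 0' | sort -rn
}

# Usage: ps -e -o pcpu,pid,user,comm | busy_processes 1.0
```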


Users

Sometimes you want to focus on what a user is doing.

accton

This command turns process accounting on and off. Used in conjunction with lastcomm, you can see what commands are being executed by a user.

The following, performed as root, turns on system accounting.

    # accton /var/account/pacct

You should notice that the file /var/account/pacct starts to grow.

To view the content of this file for user chirico, execute the following command.

    $ lastcomm --user chirico  

To turn off system accounting, execute the accton command, again as root, without a filename.

    # accton
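
The lastcomm listing grows quickly, so a per-command tally is often easier to read. This is a rough sketch, assuming the command name is the first field of each lastcomm line; command_counts is a hypothetical name.

```shell
# command_counts: read lastcomm output on stdin and count how many times
# each command name (the first field) appears, most frequent first.
# Ties are broken alphabetically so the order is stable.
# (Hypothetical helper name.)
command_counts() {
    awk 'NF > 0 { count[$1]++ }
         END { for (c in count) printf "%d %s\n", count[c], c }' |
        sort -k1,1nr -k2,2
}

# Usage: lastcomm --user chirico | command_counts
```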

Network

Netstat with the -p option will give you two critical pieces of information: socket and process id. For network monitoring, especially when you're initially looking into what is using your resources, look to netstat.

netstat

$ netstat -nap
Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name   
tcp        0      0 192.168.1.81:59961          63.111.66.12:80             ESTABLISHED 21879/gweather-appl
tcp        0      0 192.168.1.81:48264          64.233.163.109:995          TIME_WAIT   -
tcp        0      0 192.168.1.81:50652          147.140.8.44:22             ESTABLISHED 1644/ssh
tcp        0      0 192.168.1.81:49265          147.140.8.44:22             ESTABLISHED 5374/ssh
tcp        0      0 192.168.1.81:49262          147.140.8.44:22             ESTABLISHED 5321/ssh
tcp        0      0 192.168.1.81:49157          147.140.8.44:22             ESTABLISHED 13360/ssh
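
With many connections, a count per state gives a faster overview than the raw listing. A minimal sketch follows; tcp_states is a made-up name, and it assumes the state is the sixth column of each tcp row, as in the output above.

```shell
# tcp_states: read netstat output on stdin and count TCP connections per
# state (the sixth column of each row that starts with "tcp").
# (Hypothetical helper; assumes the column layout shown above.)
tcp_states() {
    awk '$1 ~ /^tcp/ { state[$6]++ }
         END { for (s in state) printf "%s %d\n", s, state[s] }' |
        sort
}

# Usage: netstat -nap | tcp_states
```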

Common Terms

The following is a list of common terms used in the Linux kernel.

Slab Allocator

The slab allocator tries to efficiently allocate small amounts of memory, meaning anything smaller than a page, as well as sizes that may cross page boundaries. Its job is to keep things efficient when these buffers are allocated and deallocated often, which could otherwise lead to memory fragmentation.
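
To see which caches the slab allocator is holding, you can look at /proc/slabinfo. The sketch below is only an estimate: slab_usage is a hypothetical name, it assumes the version 2.x layout (name, active_objs, num_objs, objsize, ...), and reading the file may require root on your system.

```shell
# slab_usage [FILE]: for each cache in slabinfo, estimate the memory held
# as num_objs * objsize and print it in KB, largest first.
# Columns (assumed 2.x layout): $1 = name, $3 = num_objs, $4 = objsize.
# (Hypothetical helper name.)
slab_usage() {
    awk 'NF >= 4 && $1 !~ /^(slabinfo|#)/ {
        printf "%d %s\n", $3 * $4 / 1024, $1
    }' "${1:-/proc/slabinfo}" | sort -rn
}

# Usage (possibly as root): slab_usage | head
```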

Recommended Readings

Enabling High Performance Data Transfers


Other Tutorials

Breaking Firewalls with OpenSSH and PuTTY: If the system administrator deliberately filters out all traffic except port 22 (ssh), to a single server, it is very likely that you can still gain access to other computers behind the firewall. This article shows how remote Linux and Windows users can gain access to firewalled samba, mail, and http servers. In essence, it shows how OpenSSH and PuTTY can be used as a VPN solution for your home or workplace.

Linux System Admin Tips: There are over 200 Linux tips and tricks from using encryption, tar, cpio, setting up multiple IP addresses on one NIC, tips for using man, putting jobs into the background, using shred, watching who is doing what on your system, listing system settings, IPC status, how to make a file immutable so that even root cannot delete it, ssh key pair generation, keeping 12 months of backups in the system logs, low level tape commands (mt), mounting an ISO image as a filesystem, getting information on the hard drive, setting up cron jobs, look command, getting a bigger word dictionary, finding out if a command is aliased, ASCII codes, using elinks, screen commands, FTP auto-login, Bash brace expansion, Bash string operators, for loops in Bash, diff and patch, script, changing the library path (ldconfig), monitoring file usage, --parents option in commands, advanced usage of the find command, cat tricks, guarding against SYN flood attacks and ping, special shell variables, RPM usage, finding IP and MAC address, DOS to UNIX conversion, PHP as command line scripting language, Gnuplot, POVRAY and making animated GIFs, plus a lot more.

Create a Live Linux CD - BusyBox and OpenSSH Included: These steps will show you how to create a functioning Linux system, with the latest 2.6 kernel compiled from source, and how to integrate the BusyBox utilities including the installation of DHCP. Plus, how to compile in the OpenSSH package on this CD based system. On system boot-up a filesystem will be created and the contents from the CD will be uncompressed and completely loaded into RAM -- the CD could be removed at this point for boot-up on a second computer. The remaining functioning system will have full ssh capabilities. You can take over any PC assuming, of course, you have configured the kernel with the appropriate drivers and the PC can boot from a CD. This tutorial steps you through the whole process.

SQLite Tutorial: This article explores the power and simplicity of sqlite3, first by starting with common commands and triggers, then the attach statement with the union operation is introduced in a way that allows multiple tables, in separate databases, to be combined as one virtual table, without the overhead of copying or moving data. Next, the simple sign function and the amazingly powerful trick of using this function in SQL select statements to solve complex queries with a single pass through the data is demonstrated, after making a brief mathematical case for how the sign function defines the absolute value and IF conditions.

The Lemon Parser Tutorial: This article explains how to build grammars and programs using the lemon parser, which is faster than yacc. And, unlike yacc, it is thread safe.

How to Compile the 2.6 kernel for Red Hat 9 and 8.0 and get Fedora Updates: This is a step by step tutorial on how to compile the 2.6 kernel from source.

Virtual Filesystem: Building A Linux Filesystem From An Ordinary File. You can take a disk file, format it as ext2, ext3, or reiser filesystem and then mount it, just like a physical drive. Yes, it is then possible to read and write files to this newly mounted device. You can also copy the complete filesystem, since it is just a file, to another computer. If security is an issue, read on. This article will show you how to encrypt the filesystem, and mount it with ACL (Access Control Lists), which give you rights beyond the traditional read (r), write (w) and execute (x) for the 3 classes owner, group and other.

Gmail on Home Linux Box using Postfix and Fetchmail: If you have a Google Gmail account, you can relay mail from your home Linux system. It's a good exercise in configuring Postfix with TLS and SASL. Plus, you will learn how to bring down the mail safely, using fetchmail with the "sslcertck" option, that is, after you have verified and copied the necessary certificates. You'll learn it all from this tutorial. And you'll have Gmail running on your local Postfix MTA.

Postfix 2nd Instance for Sender-based Routing: Multiple Gmail and Comcast Accounts. Configure your home system to support several Gmail accounts, and additionally, Comcast and/or other ISP accounts that require individual authentication rules based on the sending address. This tutorial walks you through configuring a second instance of Postfix, on a second IP address (same NIC), with sender-based routing.

Working With Time: What? There are 61 seconds in a minute? We can go back in time? We still tell time by the sun?



Mike Chirico, a father of triplets (all girls), lives outside of Philadelphia, PA, USA. He has worked with Linux since 1996, has a Masters in Computer Science and Mathematics from Villanova University, and has worked in computer-related jobs from Wall Street to the University of Pennsylvania. His hero is Paul Erdos, a brilliant number theorist who was known for his open collaboration with others.


Mike's notes page is souptonuts. For open source consulting needs, please send an email to mchirico@cwxstat.com. All consulting work must include a donation to SourceForge.net.
