
UNIX Unleashed, System Administrator's Edition

- 19 -

Kernel Basics and Configuration

by Dan Wilson, Bill Pierce, and Bill Wood

You're probably asking yourself, "Why would I want to know about this thing called the UNIX kernel? I can add users, run jobs, print files, perform backups and restores, and even start up and shut down the machine when it needs it. Why do I need to know about, or even change, my system's configuration to do my job as a systems administrator?" The simple answer is "You don't need to know much about the UNIX kernel if you're certain you'll never have to add any hardware or change or tune your system to perform better."

In all of our collective years of experience as systems administrators, about 26, we have rarely, if ever, found it possible or desirable to operate a UNIX system with its Original Equipment Manufacturer (OEM) configuration left as delivered. There are just too many different uses for this type of operating system for it to remain unchanged throughout its lifetime. So, assuming you are one of the fortunate individuals who has the title of system administrator, we'll try to provide you with some useful and general information about this all-powerful UNIX process called the kernel. After that, we'll take you through some sample configurations for the following UNIX operating systems:

HP-UX 10.x

Solaris 2.5

System V Release 4 (SVR4)

AIX

Linux

What Is a Kernel?

Let's start by providing a definition for the term kernel. The UNIX kernel is the software that manages a user program's access to the system's hardware and software resources. These resources include being granted CPU time, accessing memory, reading from and writing to the disk drives, connecting to the network, and interacting with the terminal or GUI interface. The kernel makes this all possible by controlling and providing access to memory, processors, input/output devices, disk files, and special services for user programs.

Kernel Services

The basic UNIX kernel can be broken into four main subsystems:

Process Management

Memory Management

I/O Management

File Management

These subsystems should be viewed as separate entities that work in concert to provide services to a program that enable it to do meaningful work. These management subsystems make it possible for a user to access a database via a Web interface, print a report, or do something as complex as managing a 911 emergency system. At any moment, numerous programs may request services from these subsystems. It is the kernel's responsibility to schedule the work and, if the process is authorized, grant access to these subsystems. In short, programs interact with the subsystems via software libraries and the system call interface. Refer to your UNIX reference manuals for descriptions of the system calls and libraries supported by your system. Because each of the subsystems is key to enabling a process to perform a useful function, we will cover the basics of each subsystem. We'll start by looking at how the UNIX kernel comes to life by way of the system initialization process.
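As a simple illustration of the difference between a library routine and the system call interface, the short C sketch below writes one message through the standard I/O library and another through the write() system call; both ultimately rely on the kernel's I/O Management subsystem. This is only a sketch; the full set of system calls is documented in section 2 of your system's reference manual.

#include <stdio.h>      /* library interface (buffered I/O)   */
#include <string.h>
#include <unistd.h>     /* system call interface (write)      */

int main(void)
{
    const char *msg = "hello by way of the system call interface\n";

    /* Library routine: printf() buffers the data and eventually
       calls write() on the process's behalf. */
    printf("hello by way of the C library\n");
    fflush(stdout);

    /* Direct system call: the process traps into the kernel, which
       performs the I/O on file descriptor 1 (standard output). */
    write(STDOUT_FILENO, msg, strlen(msg));

    return 0;
}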

System Initialization

System initialization (booting) is the first step toward bringing your system into an operational state. A number of machine-dependent and machine-independent steps are performed before your system is ready to begin servicing users. At system startup, there is nothing running on the Central Processing Unit (CPU). The kernel is a complex program that must have its binary image loaded at a specific address from some type of storage device, usually a disk drive. The boot disk maintains a small restricted area called the boot sector that contains a boot program, which loads and initializes the kernel. You'll find that this is a vendor-specific procedure that reflects the architectural hardware differences between the various UNIX vendor platforms. When this step is completed, the CPU must jump to a specific memory address and start executing the code at that location. Once the kernel is loaded, it goes through its own hardware and software initialization.

Kernel Mode

The operating system, or kernel, runs in a privileged manner known as kernel mode. This mode of operation allows the kernel to run without interference from other programs currently in the system. The microprocessor enforces this line of demarcation between user and kernel mode. With the kernel operating in its own protected address space, it is guaranteed to maintain the integrity of its own data structures and those of other processes. (That's not to say that a privileged process could not inadvertently cause corruption within the kernel.) These data structures are used by the kernel to manage and control itself and any other programs that may be running in the system. If any of these data structures were accidentally or intentionally altered, the system could quickly crash. Now that we have learned what a UNIX kernel is and how it is loaded into the system, we are ready to take a look at the four UNIX subsystems: Process Management, Memory Management, Filesystem Management, and I/O Management.

Process Management

The Process Management subsystem controls the creation, termination, accounting, and scheduling of processes. It also oversees process state transitions and the switching between privileged and nonprivileged modes of execution. The Process Management subsystem also facilitates and manages the complex task of the creation of child processes.

A simple definition of a process is that it is an executing program. It is an entity that requires system resources, and it has a finite lifetime. It has the capability to create other processes via the system call interface. In short, it is an electronic representation of a user's or programmer's desire to accomplish some useful piece of work. A process may appear to the user as if it is the only job running in the machine. This "sleight of hand" is only an illusion. At any one time a processor is only executing a single process.

Process Structure

A process has a definite structure (see Figure 19.1). The kernel views this string of bits as the process image. This binary image consists of both a user and a system address space, as well as registers that store the process's data during its execution. The user address space is also known as the user image. This is the code that is written by a programmer and compiled into a ".o" object file. An object file is a file that contains machine language code and data in a format that the linker program can use to create an executable program.


Figure 19.1.
Diagram of process areas.

The user address space consists of five separate areas: text, data, bss, stack, and user area.

Text Segment The first area of a process is its text segment. This area contains the executable program code for the process. This area is shared by other processes that execute the program. It is therefore fixed and unchangeable and is usually swapped out to disk by the system when memory gets too tight.

Data Area The data area contains both the global and static variables used by the program. For example, a programmer may know in advance that a certain data variable needs to be set to a certain value. In the C programming language, it would look like:

int x = 15;

If you were to look at the data segment when the program was loaded, you would see that the variable x was an integer type with an initial value of 15.

Bss Area The bss area, like the data area, holds information for the program's variables. The difference is that the bss area maintains variables that will have their data values assigned to them during the program's execution. For example, a programmer may know that she needs variables to hold certain data that will be input by a user during the execution of the program.

int a,b,c;        // a, b, and c are variables that hold integer values.
char *ptr;        // ptr is an uninitialized character pointer.

The program code can also make calls to library routines like malloc to obtain a chunk of memory and assign it to a variable like the one declared above.
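For example, a minimal sketch of obtaining memory with malloc() and assigning it to an uninitialized pointer such as ptr might look like this (the size of 1024 bytes is arbitrary):

#include <stdlib.h>

int a, b, c;        /* uninitialized integers: stored in the bss area      */
char *ptr;          /* uninitialized pointer: also stored in the bss area  */

int main(void)
{
    ptr = malloc(1024);     /* memory obtained at run time from the heap */
    if (ptr == NULL)
        return 1;           /* allocation failed */
    free(ptr);
    return 0;
}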

Stack Area The stack area maintains the process's local variables, parameters used in functions, and values returned by functions. For example, a program may contain code that calls another block of code (possibly written by someone else). The calling block of code passes data to the receiving block of code by way of the stack. The called block of code then processes the data and returns data to the calling code. The stack plays an important role in allowing a process to work with temporary data.
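A minimal sketch of this arrangement is shown below; the argument passed to square() and the value it returns travel between the two blocks of code by way of the stack (and, on some architectures, registers).

/* The parameter n and the local variable result exist on the stack
   only for the duration of the call. */
static int square(int n)
{
    int result = n * n;
    return result;              /* returned to the caller */
}

int main(void)
{
    int answer = square(5);     /* 5 is passed to square() as a parameter */
    return answer == 25 ? 0 : 1;
}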

User Area The user area maintains data that is used by the kernel while the process is running. The user area contains the real and effective user identifiers, real and effective group identifiers, current directory, and a list of open files. Sizes of the text, data, and stack areas, as well as pointers to process data structures, are maintained. Other areas that can be considered part of the process's address space are the heap, private shared libraries data, shared libraries, and shared memory. During initial startup and execution of the program, the kernel allocates the memory and creates the necessary structures to maintain these areas.

The user area is used by the kernel to manage the process. This area maintains the majority of the accounting information for a process. It is part of the process address space and is only used by the kernel while the process is executing (see Figure 19.2). When the process is not executing, its user area may be swapped out to disk by the Memory Manager. In most versions of UNIX, the user area is mapped to a fixed virtual memory address. Under HP-UX 10.X, this virtual address is 0x7FFE6000. When the kernel performs a context switch (starts executing a different process), it will always map the process's physical address to this virtual address. Since the kernel already has a pointer fixed to this location in memory, it is a simple matter of referencing the current u pointer to begin managing the newly switched-in process. The file /usr/include/sys/user.h contains the user area's structure definition for your version of UNIX.

Figure 19.2.
Diagram of kernel address space.
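Although the user area itself belongs to the kernel, a process can ask the kernel for much of the accounting information kept there (its real and effective user and group identifiers, its current directory, and so on) through ordinary system calls. The following is a minimal sketch:

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    char cwd[1024];

    /* The real and effective IDs are kept in the per-process user area. */
    printf("real uid: %ld   effective uid: %ld\n",
           (long) getuid(), (long) geteuid());
    printf("real gid: %ld   effective gid: %ld\n",
           (long) getgid(), (long) getegid());

    /* The current working directory is also tracked there. */
    if (getcwd(cwd, sizeof(cwd)) != NULL)
        printf("current directory: %s\n", cwd);

    return 0;
}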

Process Table The process table is another important structure used by the kernel to manage the processes in the system. The process table is an array of process structures that the kernel uses to manage the execution of programs. Each table entry defines a process that the kernel has created. The process table is always resident in the computer's memory. This is because the kernel is repeatedly querying and updating this table as it switches processes in and out of the CPU. For those processes that are not currently executing, their process table structures are being updated by the kernel for scheduling purposes. The process structures for your system are defined in /usr/include/sys/proc.h.

Fork Process The kernel provides each process with the tools to duplicate itself for the purpose of creating a new process. This new entity is termed a child process. The fork() system call is invoked by an existing process (termed the parent process) and creates a replica of the parent process. While a process will have one parent, it can spawn many children. The new child process inherits certain attributes from its parent. The fork() system call documentation for HP-UX 10.0 (fork(2) in HP-UX Reference Release 10.0 Volume 3 (of 4) HP 9000 Series Computers) lists the following as being inherited by the child:

Real, effective, and saved user IDs

Real, effective, and saved group IDs

Supplementary group IDs

Process group ID

Environment

File descriptors

Close-on-exec flags

Signal handling settings

Signal mask

Profiling on/off status

Command name in the accounting record

Nice value

All attached shared memory segments

Current working directory

Root directory

File mode creation mask

File size limit

Real-time priority

It is important to note how the child process differs from the parent process in order to see how one tells the difference between the parent and the child. When the kernel creates a child process on behalf of the parent, it gives the child a new process identifier. This unique process ID is returned to the parent by the kernel to be used by the parent's code (of which the child also has a copy at this point) to determine the next step the parent process should follow: either continue on with additional work, wait for the child to finish, or terminate. The kernel returns a value of 0 (zero) from fork() to the child. Since the child is still executing the parent's copy of the program at this point, the code simply checks for a return value of 0 (zero) and continues executing that branch of the code. The following short pseudocode segment should help clarify this concept.

            start
            print " I am a process "
            print " I will now make a copy of myself "
            pid = fork()
            if pid is greater than 0
               print " I am the parent "
               wait() for the child or exit()
            else if pid equals 0
               print " I am the new child "
               print " I am now ready to start a new program "
               exec("new_program")
            else
               print " fork() failed "

The child process can also make another system call that replaces the child's process image with a new one. The system call that completely overlays the child's text, data, and bss areas with those of a new program is called exec(). This is how the system is able to execute multiple programs. By using the fork() and exec() system calls in conjunction with one another, a single process is able to execute numerous programs that perform any number of tasks that the programmer needs to have done. Except for a few system-level processes started at boot time, this is how the kernel goes about executing the numerous jobs your system is required to run to support your organization.
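Here is a minimal, working C version of the pseudocode above. It forks a child, has the child overlay itself with a new program (/bin/date is used purely as an example), and has the parent wait for the child to finish:

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();                    /* create the child process */

    if (pid > 0) {                         /* parent: fork() returns the child's PID */
        int status;
        printf("parent %ld created child %ld\n", (long) getpid(), (long) pid);
        wait(&status);                     /* wait for the child to terminate */
    } else if (pid == 0) {                 /* child: fork() returns 0 */
        printf("child %ld about to exec a new program\n", (long) getpid());
        execl("/bin/date", "date", (char *) NULL);
        perror("execl");                   /* reached only if the exec fails */
        exit(1);
    } else {                               /* fork() failed */
        perror("fork");
        exit(1);
    }
    return 0;
}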

To see how all this looks on your system, you can use the ps command to view the fact that the system has created all these new child processes. The ps -ef command will show you that the child's parent process ID column (PPID) matches the parent's process ID column (PID). The simplest way to test this is to log on and, at the shell prompt, issue a UNIX command. By doing this you are telling the shell to spawn a child process that will execute the command (program) you just gave it and return control to you once the command has finished executing. Another way to experiment with this is to start a program in what is termed the background. This is done by simply appending an ampersand (&) to the end of your command line statement. This tells the system to start the new program, but not to wait for it to finish before giving control back to your current shell process. This way you can use the ps -ef command to view your current shell and background processes.

Sample ps -ef output from a system running AIX 4.2
     UID   PID  PPID   C    STIME    TTY  TIME CMD
    root     1     0   0   Apr 24      -  2:55 /etc/init
    root  2060 17606   0 10:38:30      -  0:02 dtwm
    root  2486     1   0   Apr 24      -  0:00 /usr/dt/bin/dtlogin -daemon
    root  2750  2486   0   Apr 24      -  3:12 /usr/lpp/X11/bin/X -x xv -D 
/usr/lib/X11//rgb -T -force :0 -auth /var/dt/A:0-yjc2ya
    root  2910     1   0   Apr 24      -  0:00 /usr/sbin/srcmstr
    root  3176  2486   0   Apr 25      -  0:00 dtlogin <:0>        -daemon
    root  3794     1   0   Apr 25      -  0:00 /usr/ns-home/admserv/ns-admin 
-d /usr/ns-home/admserv .
    root  3854  2910   0   Apr 24      -  0:00 /usr/lpp/info/bin/infod
    root  4192  6550   0   Apr 24      -  0:00 rpc.ttdbserver 100083 1
    root  4364     1   0   Apr 24      -  2:59 /usr/sbin/syncd 60
    root  4628     1   0   Apr 24      -  0:00 /usr/lib/errdemon
    root  5066     1   0   Apr 24      -  0:03 /usr/sbin/cron
    root  5236  2910   0   Apr 24      -  0:00 /usr/sbin/syslogd
    root  5526  2910   0   Apr 24      -  0:00 /usr/sbin/biod 6
    root  6014  2910   0   Apr 24      -  0:00 sendmail: accepting connections
    root  6284  2910   0   Apr 24      -  0:00 /usr/sbin/portmap
    root  6550  2910   0   Apr 24      -  0:00 /usr/sbin/inetd
    root  6814  2910   0   Apr 24      -  9:04 /usr/sbin/snmpd
    root  7080  2910   0   Apr 24      -  0:00 /usr/sbin/dpid2
    root  7390     1   0   Apr 24      -  0:00 /usr/sbin/uprintfd
    root  7626     1   0   Apr 24      -  0:00 /usr/OV/bin/ntl_reader 
0 1 1 1 1000 /usr/OV/log/nettl
    root  8140  7626   0   Apr 24      -  0:00 netfmt -CF
    root  8410  8662   0   Apr 24      -  0:00 nvsecd -O
    root  8662     1   0   Apr 24      -  0:15 ovspmd
    root  8926  8662   0   Apr 24      -  0:19 ovwdb -O -n5000 -t
    root  9184  8662   0   Apr 24      -  0:04 pmd -Au -At -Mu -Mt -m
    root  9442  8662   0   Apr 24      -  0:32 trapgend -f
    root  9700  8662   0   Apr 24      -  0:01 mgragentd -f
    root  9958  8662   0   Apr 24      -  0:00 nvpagerd
    root 10216  8662   0   Apr 24      -  0:00 nvlockd
    root 10478  8662   0   Apr 24      -  0:05 trapd
    root 10736  8662   0   Apr 24      -  0:04 orsd
    root 11004  8662   0   Apr 24      -  0:31 ovtopmd -O -t
    root 11254  8662   0   Apr 24      -  0:00 nvcold -O
    root 11518  8662   0   Apr 24      -  0:03 ovactiond
    root 11520  8662   0   Apr 24      -  0:05 nvcorrd
    root 11780  8662   0   Apr 24      -  0:00 actionsvr
    root 12038  8662   0   Apr 24      -  0:00 nvserverd
    root 12310  8662   0   Apr 24      -  0:04 ovelmd
    root 12558  8662   0   Apr 24      -  4:28 netmon -P
    root 12816  8662   0   Apr 24      -  0:04 ovesmd
    root 13074  8662   0   Apr 24      -  0:00 snmpCollect
    root 13442  2910   0   Apr 24      -  0:00 /usr/lib/netsvc/yp/ypbind
    root 13738  5526   0   Apr 24      -  0:00 /usr/sbin/biod 6
    root 13992  5526   0   Apr 24      -  0:00 /usr/sbin/biod 6
    root 14252  5526   0   Apr 24      -  0:00 /usr/sbin/biod 6
    root 14510  5526   0   Apr 24      -  0:00 /usr/sbin/biod 6
    root 14768  5526   0   Apr 24      -  0:00 /usr/sbin/biod 6
    root 15028  2910   0   Apr 24      -  0:00 /usr/sbin/rpc.statd
    root 15210  6550   0   Apr 24      -  0:00 rpc.ttdbserver 100083 1
    root 15580  2910   0   Apr 24      -  0:00 /usr/sbin/writesrv
    root 15816  2910   0   Apr 24      -  0:00 /usr/sbin/rpc.lockd
    root 16338  2910   0   Apr 24      -  0:00 /usr/sbin/qdaemon
    root 16520  2060   0 13:44:46      -  0:00 /usr/dt/bin/dtexec -open 0 
-ttprocid 2.pOtBq 01 17916 1342177279 1 0 0 10.19.12.115 3_101_1 /usr/dt/bin/dtterm
    root 16640     1   0   Apr 24   lft0  0:00 /usr/sbin/getty /dev/console
    root 17378     1   0   Apr 24      -  0:13 /usr/bin/pmd
    root 17606  3176   0 10:38:27      -  0:00 /usr/dt/bin/dtsession
    root 17916     1   0 10:38:28      -  0:00 /usr/dt/bin/ttsession -s
    root 18168     1   0   Apr 24      -  0:00 /usr/lpp/diagnostics/bin/diagd
  nobody 18562 19324   0   Apr 25      -  0:32 ./ns-httpd -d 
/usr/ns-home/httpd-supp_aix/config
    root 18828 22410   0 13:44:47  pts/2  0:00 /bin/ksh
    root 19100 21146   0 13:45:38  pts/3  0:00 vi hp.c
  nobody 19324     1   0   Apr 25      -  0:00 ./ns-httpd -d 
/usr/ns-home/httpd-supp_aix/config
    root 19576  6550   0 13:43:38      -  0:00 telnetd
  nobody 19840 19324   0   Apr 25      -  0:33 ./ns-httpd -d 
/usr/ns-home/httpd-supp_aix/config
    root 19982 17606   0 10:38:32      -  0:03 dtfile
  nobody 20356 19324   0   Apr 25      -  0:33 ./ns-httpd -d 
/usr/ns-home/httpd-supp_aix/config
    root 20694 20948   0   Apr 25      -  0:00 
/usr/ns-home/admserv/ns-admin -d /usr/ns-home/admserv .
    root 20948  3794   0   Apr 25      -  0:01 
/usr/ns-home/admserv/ns-admin -d /usr/ns-home/admserv .
    root 21146 23192   0 13:45:32  pts/3  0:00 /bin/ksh
  nobody 21374 19324   0   Apr 25      -  0:00 ./ns-httpd -d /usr/ns-home/httpd-supp_aix/config
    root 21654  2060   0 13:45:31      -  0:00 /usr/dt/bin/dtexec 
-open 0 -ttprocid 2.pOtBq 01 17916 1342177279 1 0 0 10.19.12.115 3_102_1 /usr/dt/bin/dtterm
    root 21882 19576   0 13:43:39  pts/0  0:00 -ksh
    root 22038 19982   0 10:38:37      -  0:04 dtfile
    root 22410 16520   0 13:44:47      -  0:00 /usr/dt/bin/dtterm
    root 22950 21882   8 13:46:06  pts/0  0:00 ps -ef
    root 23192 21654   0 13:45:31      -  0:00 /usr/dt/bin/dtterm
    root 23438 18828   0 13:45:03  pts/2  0:00 vi aix.c

Process Run States

A process moves between several states during its lifetime, although a process can only be in one state at any one time. Certain events, such as system interrupts, blocking of resources, or software traps, will cause a process to change its run state. The kernel maintains queues in memory and assigns each process to one of them based upon that process's state. It keeps track of each process by its process ID.

UNIX version System V Release 4 (SVR4) recognizes the following process run states:

- SIDL      This is the state right after a process has issued a fork() system call. A process image has yet to be copied into memory.
- SRUN      The process is ready to run and is waiting to be executed by the CPU.
- SONPROC   The process is currently being executed by the CPU.
- SSLEEP    The process is blocking on an event or resource.
- SZOMB     The process has terminated and is waiting on either its parent or the init process to allow it to completely exit.
- SXBRK     The process has been switched out so that another process can be executed.
- SSTOP     The process is stopped.

When a process first starts, the kernel allocates it a slot in the process table and places the process in the SIDL state. Once the process has the resources it needs to run, the kernel places it onto the run queue. The process is now in the SRUN state awaiting its turn in the CPU. Once its turn comes for the process to be switched into the CPU, the kernel will tag it as being in the SONPROC state. In this state, the process will execute in either user or kernel mode. User mode is where the process is executing nonprivileged code from the user's compiled program. Kernel mode is where kernel code is being executed from the kernel's privileged address space via a system call.

At some point the process is switched out of the CPU because it has either been signaled to do so (for instance, the user issues a stop signal--SSTOP state) or the process has exceeded its quota of allowable CPU time and the kernel needs the CPU to do some work for another process. The act of switching the focus of the CPU from one process to another is called a context switch. When this occurs, the process enters what is known as the SXBRK state. If the process still needs to run and is waiting for another system resource, such as disk services, it will enter the SSLEEP state until the resource is available and the kernel wakes the process up and places it on the SRUN queue. When the process has finally completed its work and is ready to terminate, it enters the SZOMB state. We have seen the fundamentals of what states a process can exist in and how it moves through them. Let's now learn how a kernel schedules a process to run.

Process Scheduler

Most modern versions of UNIX (for instance, SVR4 and Solaris 2.x) are classified as preemptive operating systems. They are capable of interrupting an executing process and "freezing" it so that the CPU can service a different process. This has the obvious advantage of fairly allocating the system's resources to all the processes in the system, which is one goal of the systems architects and programmers who design and write schedulers. The disadvantages are that not all processes are equal and that complex algorithms must be designed and implemented as kernel code in order to maintain the illusion that each user process is running as if it were the only job in the system. The kernel maintains this balance by placing processes in the various priority queues or run queues and apportioning CPU time slices based on each process's priority class (real-time versus timeshare).

Universities and UNIX system vendors have conducted extensive studies on how best to design and build an optimal scheduler. Each vendor's flavor of UNIX--4.4BSD, SVR4, HP-UX, Solaris, and AIX, to name a few--attempts to implement this research to provide a scheduler that best balances its customers' needs. The systems administrator must realize that there are limits to the scheduler's ability to service batch, real-time, and interactive users in the same environment. Once the system becomes overloaded, it will become necessary for some jobs to suffer at the expense of others. This is an extremely important issue to both users and systems administrators alike. The reader should refer to Chapter 22, "Systems Performance and Tuning," to gain a better understanding of what he can do to balance and tune his system.

Memory Management

Random access memory (RAM) is a very critical component in any computer system. It's the one component that always seems to be in short supply on most systems. Unfortunately, most organizations' budgets don't allow for the purchase of all the memory that their technical staff feel is necessary to support all their projects. Luckily, UNIX allows us to execute all sorts of programs without, what appears at first glance to be, enough physical memory. This comes in very handy when the system is required to support a user community that needs to execute an organization's custom and commercial software to gain access to its data.

Memory chips are high-speed electronic devices that plug directly into your computer. Main memory is also called core memory by some technicians. Ever heard of a core dump? (Writing out main memory to a storage device for post-dump analysis.) Usually it is caused by a program or system crash or failure. An important aspect of memory chips is that they can store data at specific locations called addresses. This makes it quite convenient for another hardware device called the central processing unit (CPU) to access these locations to run your programs. The kernel uses a paging and segmentation arrangement to organize process memory. This is where the memory management subsystem plays a significant role. Memory management can be defined as the efficient managing and sharing of the system's memory resources by the kernel and user processes.

Memory management follows certain rules that manage both physical and virtual memory. Since we already have an idea of what a physical memory chip or card is, we will provide a definition of virtual memory. Virtual memory is where the addressable memory locations that a process can be mapped into are independent of the physical address space of the CPU. Generally speaking, a process can exceed the physical address space/size of main memory and still load and execute.

The systems administrator should be aware that just because she has a fixed amount of physical memory, she should not expect it all to be available to execute user programs. The kernel is always resident in main memory, and depending upon the kernel's configuration (tunables such as kernel tables, daemons, device drivers loaded, and so on), the amount left over is what can be classified as available memory. It is important for the systems administrator to know how much available memory the system has to work with when supporting his environment. Most systems display memory statistics during boot time. If your kernel is larger than it needs to be to support your environment, consider reconfiguring a smaller kernel to free up resources.

We learned before that a process has a well-defined structure and has certain specific control data structures that the kernel uses to manage the process during its system lifetime. One of the more important data structures that the kernel uses is the virtual address space (vas in HP-UX and as in SVR4. For a more detailed description of the layout of these structures, look at the vas.h or as.h header files under /usr/include on your system.).

A virtual address space exists for each process and is used by the process to keep track of process logical segments or regions that point to specific segments of the process's text (code), data, u_area, user, and kernel stacks; shared memory; shared library; and memory mapped file segments. Per-process regions protect and maintain the number of pages mapped into the segments. Each segment has a virtual address space segment as well. Multiple programs can share the process's text segment. The data segment holds the process's initialized and uninitialized (BSS) data. These areas can change size as the program executes.

The u_area and kernel stack contain information used by the kernel, and are a fixed size. The user stack is contained in the u_area; however, its size will fluctuate during its execution. Memory mapped files allow programmers to bring files into memory and work with them while in memory. Obviously, there is a limit to the size of the file you can load into memory (check your system documentation). Shared memory segments are usually set up and used by a process to share data with other processes. For example, a programmer may want to be able to pass messages to other programs by writing to a shared memory segment and having the receiving programs attach to that specific shared memory segment and read the message. Shared libraries allow programs to link to commonly used code at runtime. Shared libraries reduce the amount of memory needed by executing programs because only one copy of the code is required to be in memory. Each program will access the code at that memory location when necessary.
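As a brief illustration of the shared memory case described above, the following minimal sketch creates a small System V shared memory segment, attaches it to the process's address space, and writes a message into it. A cooperating process could read the message by attaching the segment created with the same key (the key value 0x1234 and the 4 KB size are arbitrary examples):

#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(void)
{
    key_t key = 0x1234;                        /* agreed-upon key (example value) */
    int   shmid;
    char *addr;

    /* Create (or locate) a 4 KB shared memory segment. */
    shmid = shmget(key, 4096, IPC_CREAT | 0666);
    if (shmid == -1) { perror("shmget"); return 1; }

    /* Map the segment into this process's address space. */
    addr = (char *) shmat(shmid, NULL, 0);
    if (addr == (char *) -1) { perror("shmat"); return 1; }

    strcpy(addr, "message left in shared memory");

    shmdt(addr);                               /* detach; the segment persists */
    return 0;
}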

When a programmer writes and compiles a program, the compiler generates the object file from the source code. The linker program (ld) links the object file with the appropriate libraries and, if necessary, other object files to generate the executable program. The executable program contains virtual addresses that are converted into physical memory addresses when the program is run. This address translation must occur before the CPU can reference the actual code in memory.

When the program starts to run, the kernel sets up its data structures (proc, virtual address space, per-process region) and begins to execute the process in user mode. Eventually, the process will access a page that's not in main memory (for instance, the pages in its working set are not in main memory). This is called a page fault. When this occurs, the kernel puts the process to sleep, switches from user mode to kernel mode, and attempts to load the page that the process was requesting to be loaded. The kernel searches for the page by locating the per-process region where the virtual address is located. It then goes to the segments (text, data, or other) per-process region to find the actual region that contains the information necessary to read in the page.

The kernel must now find a free page in which to load the process's requested page. If there are no free pages, the kernel must either page or swap out pages to make room for the new page request. Once there is some free space, the kernel pages in a block of pages from disk. This block contains the requested page plus additional pages that may be used by the process. Finally the kernel establishes the permissions and sets the protections for the newly loaded pages. The kernel wakes the process and switches back to user mode so the process can begin executing using the requested page. Pages are not brought into memory until the process requests them for execution. This is why the system is referred to as a demand paging system.


NOTE: The verb page means to move individual blocks of memory for a process between system memory and disk swap area. The pagesize is defined in the /usr/include/limits.h header file. For a definition of paging see RAM I/O.
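The page size in use on a given system can also be queried at run time. The following is a minimal sketch using the standard sysconf() interface:

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Ask the system for the size of a memory page in bytes. */
    long pagesize = sysconf(_SC_PAGESIZE);

    if (pagesize == -1) {
        perror("sysconf");
        return 1;
    }
    printf("system page size: %ld bytes\n", pagesize);
    return 0;
}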

The memory management unit is a hardware component that handles the translation of virtual address spaces to physical memory addresses. The memory management unit also prevents a process from accessing another process's address space unless it is permitted to do so (protection fault). Memory is thus protected at the page level. The Translation Lookaside Buffer (TLB) is a hardware cache that maintains the most recently used virtual address space to physical address translations. It is controlled by the memory management unit to reduce the number of address translations that occur on the system.

Input and Output Management

The simplest definition of input/output is the control of data between hardware devices and software. A systems administrator is concerned with I/O at two separate levels. The first level is concerned with I/O between user address space and kernel address space; the second level is concerned with I/O between kernel address space and physical hardware devices. When data is written to disk, the first level of the I/O subsystem copies the data from user space to kernel space. Data is then passed from the kernel address space to the second level of the I/O subsystem. This is when the physical hardware device activates its own I/O subsystems, which determine the best location for the data on the available disks.

The OEM (Original Equipment Manufacturer) UNIX configuration is satisfactory for many work environments, but it does not take into consideration the network traffic or the behavior of specific applications on your system. Systems administrators find that they need to reconfigure the system's I/O to meet the expectations of the users and the demands of their applications. You should use the default configuration as a starting point and, as experience is gained with the demands on the system resources, tune the system to achieve peak I/O performance.

UNIX comes with a wide variety of tools that monitor system performance. Learning to use these tools will help you determine whether a performance problem is hardware or software related. Using these tools will help you determine whether a problem is poor user training, application tuning, system maintenance, or system configuration. sar, iostat, and monitor are some of your best basic I/O performance monitoring tools.

RAM I/O

The memory subsystem comes into play when programs start requesting access to more physical RAM than is installed on your system. Once this point is reached, UNIX will start I/O processes called paging and swapping. This is when kernel procedures start moving pages of memory out to the paging or swap areas defined on your hard drives. (This procedure is similar to how swap files work under Microsoft Windows on a PC.) All UNIX systems use these procedures to free physical memory for reuse by other programs. The drawback is that once paging and swapping have started, system performance decreases rapidly. The system will continue using these techniques until demands for physical RAM drop to the amount that is installed on your system. There are only two physical states for memory performance on your system: Either you have enough RAM or you don't, and performance drops through the floor.

Memory performance problems are simple to diagnose; either you have enough memory or your system is thrashing. Computer systems start thrashing when more resources are dedicated to moving memory (paging and swapping) from RAM to the hard drives. Performance decreases as the CPUs and all subsystems become dedicated to trying to free physical RAM for themselves and other processes.

This summary doesn't do justice, however, to the complexity of memory management nor does it help you to deal with problems as they arise. To provide the background to understand these problems, we need to discuss virtual memory activity in more detail.

We have been discussing two memory processes: paging and swapping. These two processes help UNIX fulfill memory requirements for all processes. UNIX systems employ both paging and swapping to reduce I/O traffic and execute better control over the system's total aggregate memory. Keep in mind that paging and swapping are temporary measures; they cannot fix the underlying problem of low physical RAM memory.

Swapping moves entire idle processes to disk for reclamation of memory, and is a normal procedure for the UNIX operating system. When the idle process is called by the system again, it will copy the memory image from the disk swap area back into RAM.

On systems performing paging and swapping, swapping occurs in two separate situations. Swapping is often a part of normal housekeeping. Jobs that sleep for more than 20 seconds are considered idle and may be swapped out at any time. Swapping is also an emergency technique used to combat extreme memory shortages. Remember our definition of thrashing; this is when a system is in trouble. Some system administrators sum this up very well by calling it "desperation swapping."

Paging, on the other hand, moves individual pages (or pieces) of processes to disk and reclaims the freed memory, with most of the process remaining loaded in memory. Paging employs an algorithm to monitor usage of the pages, to leave recently accessed pages in physical memory, and to move idle pages into disk storage. This allows for optimum performance of I/O and reduces the amount of I/O traffic that swapping would normally require.


NOTE: Monitoring what the system is doing is easy with the ps command. ps is the "process status" command on all UNIX systems and typically shows many idle and swapped-out jobs. This command has a rich set of options to show you what the computer is doing, too many to cover here.

I/O performance management, like all administrative tasks, is a continual process. Generating performance statistics on a routine basis will assist in identifying and correcting potential problems before they have an impact on your system or, worst case, your users. UNIX offers basic system usage statistics packages that will assist you in automatically collecting and examining usage statistics.

You will find the load on the system will increase rapidly as new jobs are submitted and resources are not freed quickly enough. Performance drops as the disks become I/O bound trying to satisfy paging and swapping calls. Memory overload quickly forces a system to become I/O and CPU bound. However, once you identify the problem to be memory, you will find adding RAM to be cheaper than adding another CPU to your system.

Hard Drive I/O

Some simple configuration considerations will help you obtain better I/O performance regardless of your system's usage patterns. The factors to consider are the arrangement of your disks and disk controllers and the speed of the hard drives.

The best policy is to spread the disk workload as evenly as possible across all controllers. If you have a large system with multiple I/O backplanes, split your disk drives evenly among the buses. Most disk controllers allow you to daisy chain several disk drives from the same controller channel. For the absolute best performance, spread the disk drives evenly over all controllers. This is particularly important if your system has many users who all need to make large sequential transfers.

Small Computer System Interface (SCSI) devices are those that adhere to the American National Standards Institute (ANSI) standards for connecting intelligent interface peripherals to computers. The SCSI bus is a daisy-chained arrangement originating at a SCSI adapter card that interconnects several SCSI controllers. Each controller interfaces its device to the bus and has a different SCSI address that is set on the controller. This address determines the priority that the SCSI device is given, with the highest address having the highest priority. When you load balance a system, always place the more frequently accessed data on the hard drives with the highest SCSI addresses. Data at the top of the channel takes less access time, and load balancing increases the availability of that data to the system.

After deciding the best placement of the controllers and hard drives on your system, you have one last item for increasing system performance. When adding new disks, remember that the seek time of the disk is the single most important indicator of its performance. Different processes will be accessing the disk at the same time, working with different files and reading from different areas of the disk.

The seek time of a disk is the measure of time required to move the disk drive's heads from one track to another. Seek time is affected by how far the heads have to move from one track to another. Moving the heads from track to track takes less time than shifting those same drive heads across the entire disk. You will find that seek time is actually a nonlinear measurement, taking into account that the heads have to accelerate, decelerate, and then stabilize in their new position. This is why all disks will typically specify a minimum, average, and maximum seek time. The ratio of time spent seeking between tracks to time spent transferring data is usually at least 10 to 1. The lower the aggregate seek time, the greater your performance gain or improvement.

One problem with the ability to add paging and swap files to the hard disks is that some systems administrators try to use this feature as a substitute for adding more RAM to a system. It does not work that way. The most you could hope for is to temporarily mask the underlying cause, low physical memory. There is one thing that a systems administrator can do to increase performance, and that is to accurately balance the disk drives.

Don't overlook the obvious upgrade path for I/O performance: tuning. If you understand how your system is configured and how you intend to use it, you will be much less likely to buy equipment you don't need or that won't solve your problem.

Filesystem Management Subsystem

In discussing kernel basics and configuration, a very important topic, filesystems, must be considered. This discussion deals with the basic structural method of long-term storage of system and user data. Filesystems and the parameters that are used to create them have a direct impact on performance, system resource utilization, and kernel efficiency in dealing with Input/Output (I/O).

Filesystem Types

There are several important filesystem types that are supported by different operating systems (OS), many of which are no longer used for new implementations. The reasons they are not used range from inefficiency to simple obsolescence. However, many operating systems still support these filesystem structures so that compatibility doesn't become an issue for portability.

This support of other filesystem structures plays a large role in allowing companies to move between OS and computer types with little impact to their applications.

The following is a list of filesystem types that are supported by specific operating systems. The list will only cover local, network, and CD-ROM filesystems.

OS        Local Filesystem   NFS*   CD-ROM
Solaris   ufs                yes    hsfs
SunOS     4.2                yes    hsfs
SCO       EAFS               yes    HS
IRIX      efs                yes    iso9660
Digital   ufs                yes    cdfs
HP-UX     hfs                yes    cdfs
AIX       jfs                yes    cdrfs
Linux     ext2               yes    iso9660

Note: NFS stands for Network File System.

Hardware Architecture

Since filesystems are stored on disk, the systems administrator should look at basic disk hardware architecture before proceeding with the specifics of filesystems. A disk is physically divided into tracks, sectors, and blocks. A good representation of a sector would be a piece of pie removed from the pie pan. Therefore, as with a pie, a disk is composed of several sectors (see Figure 19.3). Tracks are concentric rings going from the outside perimeter toward the center of the disk, with each track becoming smaller as it approaches the center. Because tracks are concentric, they never touch each other. The area of a track that lies between the edges of a sector is termed a block, and the block is the area where data is stored. Disk devices typically use a block mode accessing scheme when transferring data between the file management subsystem and the I/O subsystem. The block size is usually 512- or 1024-byte fixed-length blocks, depending upon the scheme used by the operating system. A programmer may access files using either block or character device files.

Figure 19.3.
Diagram of a single platter from a hard drive showing disk geometry.

You now have a basic understanding of the terms tracks, sectors, and blocks as they apply to a single platter disk drive. But most disks today are composed of several platters, with each platter having its own read/write head. With this in mind, we have a new term: cylinder (see Figure 19.4). Let's make the assumption that we have a disk drive with six platters, so, logically, it must have six read/write heads. When read/write head 1 is on track 10 of platter 1, then heads 2 through 6 are on track 10 of their respective platters. You now have a cylinder. A cylinder is collectively the same track on each platter of a multi-platter disk.

Figure 19.4.
Diagram showing multiple platters of a single disk drive.

Filesystem Concepts and Format

The term filesystem has two connotations. The first is the complete hierarchical filesystem tree. The second is the collection place on disk device(s) for files. Visualize the filesystem as consisting of a single node at the highest level (ROOT) and all other nodes descending from the root node in a tree-like fashion (see Figure 19.5). The second meaning will be used for this discussion, and Hewlett Packard's High-performance Filesystem will be used for technical reference purposes.

Figure 19.5.
Diagram of a UNIX hierarchical filesystem.

The superblock is the key to maintaining the filesystem. It's an 8 KB block of disk space that maintains the current status of the filesystem. Because of its importance, a copy is maintained in memory and at each cylinder group within the filesystem. The copy in main memory is updated as events transpire. The update daemon is the process that calls on the kernel to flush the cached superblocks, modified inodes, and cached data blocks to disk. The superblock maintains the following static and dynamic information about the filesystem:

Filesystem size

Number of Inodes

Location of free space

Number of cylinder groups

Fragment size and number

Block size and number

Location of superblocks, cylinder groups, inodes, and data blocks

Total number of free data blocks

Total number of free inodes

Filesystem status flag (clean flag)

As you can see from the listed information, the superblock maintains the integrity of the filesystem and all associated pertinent information. To prevent catastrophic events, the OS stores copies of the superblock in cylinder groups. The locations of these alternate superblocks may be found in /etc/sbtab. When system administrators are using fsck -b to recover from an alternate superblock, they will be required to give the location of that alternate block. Again, the only place to find that information is in /etc/sbtab. As a qualification to that statement, there is always an alternate superblock at block sixteen.
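Much of the per-filesystem information held in the superblock (block size, total and free blocks, total and free inodes) is visible to an ordinary program through the statvfs() interface. The following is a minimal sketch that reports on the filesystem holding the root directory:

#include <stdio.h>
#include <sys/statvfs.h>

int main(void)
{
    struct statvfs vfs;

    /* Query the filesystem that contains the root directory. */
    if (statvfs("/", &vfs) == -1) {
        perror("statvfs");
        return 1;
    }

    printf("fragment size: %lu bytes\n", (unsigned long) vfs.f_frsize);
    printf("total blocks:  %lu\n", (unsigned long) vfs.f_blocks);
    printf("free blocks:   %lu\n", (unsigned long) vfs.f_bfree);
    printf("total inodes:  %lu\n", (unsigned long) vfs.f_files);
    printf("free inodes:   %lu\n", (unsigned long) vfs.f_ffree);
    return 0;
}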

Cylinder groups are adjacent groups of cylinders, 16 cylinders by default, that have their own set of inodes and free space mapping. This is done to improve performance and reduce disk latency. Disk latency is the time between when the disk is read and when the I/O subsystem can transfer the data. Some factors that affect disk latency are rotational speed, seek time, and the interleave factor. This arrangement also keeps a file's inodes and data blocks in closer proximity to each other.


NOTE: The interleave factor is the value that determines the order in which sectors on a disk drive are accessed.

The layout of the cylinder group is:

Boot block

Primary superblock

Redundant superblock

Cylinder group information

Inode table

Data blocks

The boot block and the primary superblock will only be there if this is the first cylinder group; otherwise, it may be filled with data.

Inodes are fixed-length entries whose size varies according to the OS implementation. The SVR4 implementation is 128 bytes for a UFS inode and 64 bytes for an S5 inode. The inode maintains all of the pertinent information about the file except for the filename and the data. The information maintained by the inode is as follows:

File permissions or mode

Type of file

Number of hard links

Current owner

Group associated to the file

Actual file size in bytes

Time Stamps

Time/Date file last changed

Time/Date file last accessed

Time/Date last inode modification

Single indirect block pointer

Double indirect block pointer

Triple indirect block pointer

There are 15 slots in the inode structure for disk addresses, or pointers (see Figure 19.6). Twelve of the slots are for direct block addressing. A direct address can either point to a complete block or to a fragment of that block. The block and fragment sizes we are discussing are configurable parameters that are set at filesystem creation. They cannot be altered unless the filesystem is removed and re-created with the new parameters.

Figure 19.6.
Diagram of an Inode Structure of a UNIX filesystem
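Most of the inode information listed above can be retrieved for any file with the stat() system call. The following minimal sketch prints a few of those fields for a file named on the command line:

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
    struct stat st;

    if (argc != 2) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return 1;
    }
    if (stat(argv[1], &st) == -1) {       /* ask the kernel for the file's inode data */
        perror("stat");
        return 1;
    }

    printf("inode number:  %lu\n", (unsigned long) st.st_ino);
    printf("mode (octal):  %o\n",  (unsigned int) st.st_mode);
    printf("hard links:    %lu\n", (unsigned long) st.st_nlink);
    printf("owner uid/gid: %lu/%lu\n",
           (unsigned long) st.st_uid, (unsigned long) st.st_gid);
    printf("size in bytes: %lu\n", (unsigned long) st.st_size);
    return 0;
}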

The following is a listing of a typical AIX root directory using ls -ali, showing the inode number for each file entry in the directory.

 inode Permissions  ln Owner Group    Size Access Date   Filename
     2 drwxr-xr-x   23 bin   bin      1024 Apr 27 15:53  . (dot)
     2 drwxr-xr-x   23 bin   bin      1024 Apr 27 15:53  .. (dot, dot)
   765 -rw-r--r--    1 root  system    259 Apr 08 08:34  Guidefaults
  1257 -rw-------    1 root  system    156 Apr 27 11:01  .Xauthority
  2061 drwxr-xr-x   11 root  system    512 Apr 27 11:01  .dt
   591 -rwxr-xr-x    1 root  system   3970 Apr 08 08:38  .dtprofile
  6151 drwx------    3 root  system    512 Apr 17 13:42  .netscape
   593 -rw-------    1 root  system   1904 Apr 11 08:12  .old_sh_history
  1011 -rwxr-----    1 7     system    254 Apr 10 11:15  .profile
  1007 -rw-------    1 root  system   3444 Apr 27 15:53  .sh_history
  1009 -rw-r--r--    1 root  system     30 Apr 14 10:35  .showcase
  2069 drwxr-xr-x    2 root  system    512 Apr 08 08:54  TT_DB
  2058 drwxr-xr-x    3 root  system    512 Apr 11 11:21  admin
   109 lrwxrwxrwx    1 bin   bin         8 Apr 01 05:27  bin -> /usr/bin
    23 drwxrwxr-x    4 root  system   2048 Apr 27 14:37  dev
    24 drwxr-xr-x   12 root  system   6144 Apr 27 11:29  etc
     2 drwxr-xr-x    5 bin   bin       512 Apr 02 01:52  home
  8195 drwxr-xr-x    2 root  system    512 Apr 25 13:08  httpd
   586 lrwxrwxrwx    1 bin   bin        20 Apr 02 01:57  launch_demo ->
    22 lrwxrwxrwx    1 bin   bin         8 Apr 01 05:27  lib -> /usr/lib
    16 drwx------    2 root  system    512 Apr 01 05:27  lost+found
   100 drwxr-xr-x   26 bin   bin      1024 Apr 11 15:23  lpp
   101 drwxr-xr-x    2 bin   bin       512 Apr 01 05:27  mnt
  4096 drwxr-xr-x    2 root  system    512 Apr 11 14:57  mnt10032
  4097 drwxr-xr-x    2 root  system    512 Apr 14 10:31  mnt10086
  1251 -rw-rw-rw-    1 root  system   3192 Apr 15 14:12  nv6000.log
   102 drwxr-xr-x    2 root  system    512 Apr 02 01:54  opt
   103 drwxr-xr-x    3 bin   bin       512 Apr 11 15:23  sbin
  1252 -rw-r--r--    1 root  system  39265 Apr 27 13:29  smit.log
  1253 -rw-r--r--    1 root  system   5578 Apr 27 13:24  smit.script
   271 drwxrwxr-x    2 root  system    512 Apr 01 05:37  tftpboot
     2 drwxrwxrwt    9 bin   bin      1536 Apr 27 15:47  tmp
    99 lrwxrwxrwx    1 bin   bin         5 Apr 01 05:27  u -> /home
   192 lrwxrwxrwx    1 root  system     21 Apr 01 05:30  unix -> /usr/lib/boot/unix_up
     2 drwxr-xr-x   26 bin   bin       512 Apr 25 13:19  usr
     2 drwxr-xr-x   14 bin   bin       512 Apr 01 06:03  var
   764 -rw-rw-rw-    1 root  system   3074 Apr 08 08:33  vim.log
     2 drwxr-xr-x   12 bin   bin      2048 Apr 08 08:21  welcome

Single indirect addressing (slot 13) points to a block of four-byte pointers that point to data blocks. If the block that is pointed to by the single indirect method is 4 KB in size, it would contain 1024 four-byte pointers, and if it were 8 KB in size, it would contain 2048 four-byte pointers to data blocks. The double indirect block pointer is located in slot 14, and slot 15 maintains the triple indirect block pointer.
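The maximum file size that this addressing scheme can reach follows directly from the arithmetic above. The short program below works it out for a 4 KB block with four-byte pointers (12 direct blocks plus the single, double, and triple indirect blocks); treat the numbers as illustrative, since real filesystems impose their own limits:

#include <stdio.h>

int main(void)
{
    const unsigned long long block = 4096;       /* block size in bytes                 */
    const unsigned long long ptrs  = block / 4;  /* four-byte pointers per block: 1024  */

    unsigned long long direct = 12ULL * block;               /* 12 direct blocks  */
    unsigned long long single = ptrs * block;                /* single indirect   */
    unsigned long long dbl    = ptrs * ptrs * block;         /* double indirect   */
    unsigned long long triple = ptrs * ptrs * ptrs * block;  /* triple indirect   */
    unsigned long long total  = direct + single + dbl + triple;

    printf("direct:          %llu bytes\n", direct);
    printf("single indirect: %llu bytes\n", single);
    printf("double indirect: %llu bytes\n", dbl);
    printf("triple indirect: %llu bytes\n", triple);
    printf("total:           %llu bytes (about %llu GB)\n", total, total >> 30);
    return 0;
}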

In the "Filesystem Concepts and Format" section, the initial discussion covered basic concepts of superblocks, alternate superblocks, cylinder groups, inodes, and direct and indirect addressing of data blocks. Further reading into these subjects is a must for all systems administrators, especially the new and inexperienced.

Kernel Configuration Process

Kernel configuration is a detailed process in which the systems administrator is altering the behavior of the computer. The systems administrator must remember that a change of a single parameter may affect other kernel subsystems, thus exposing the administrator to the "law of unintended consequences."

When Do You Rebuild the Kernel

Kernel components are generally broken into four major groups, and if changes are made to any of these groups, a kernel reconfiguration is required.

Subsystems--These are components that are required for special functionality (for example, ISO 9660 CD-ROM filesystem support).

Dump Devices--System memory dumps are placed here when a panic condition exists. Core dumps are usually placed at the end of the swap area.

Configurable Parameters--These are tuning parameters and data structures. There are a significant number, and they may have inter-dependencies, so it is important that you are aware of the impact of each change.

Device Drivers--These handle interfaces to peripherals like modems, printers, disks, tape drives, kernel memory, and other physical devices.

HP-UX 10.X

There are two ways to rebuild the kernel:

A. Use the System Administration Manager (SAM)

Step 1--Run SAM and select "Kernel Configuration."

You will now see the following four identified components:

Subsystem

Configurable Parameters

Dump Devices

Device Drivers

Step 2--Select the desired component and make the appropriate change(s).

Step 3--Now answer the prompts and the kernel will be rebuilt.

Step 4--It will also prompt you for whether you want to reboot the kernel now or later.

Consider the importance of the changes and the availability of the system before you answer this prompt. If you answer "YES" to reboot the system now, the action cannot be reversed. The point is to know what you are going to do prior to getting to that prompt.

B. Manual Method

Step 1--Go to the build area of the kernel by typing the command line below.

# cd /stand/build

Step 2--The first step is to create a system file from the current system configuration by typing the command line below.

# /usr/lbin/sysadm/system_prep -s system

This command places the current system configuration in the file named system. There is no requirement that you call it system; it could be any name you desire.

Step 3--Now you must modify the existing parameters and insert unlisted configuration parameters, new subsystems, and device drivers, or alter the dump device. If one of the configurable parameters is not listed in this file, it is because the previous kernel used its default value.

Step 4--The next step is to create the conf.c file, and we are using the modified system file to create it. Remember, if you did not use system for the existing configuration file, insert your name where I show system. The conf.c file has constants for the tunable parameters. Type the command below to execute the config program.

# /usr/sbin/config -s system

Step 5--Now rebuild the kernel by linking the driver objects to the basic kernel.

# make -f config.mk

Step 6--Save the old system configuration file.

# mv /stand/system /stand/system.prev

Step 7--Save the old kernel.

# mv /stand/vmunix /stand/vmunix.prev

Step 8--Move the new system configuration file into place.

# mv ./system /stand/system

Step 9--Move the new kernel into place.

# mv ./vmunix_test /stand/vmunix

Step 10--You are ready to boot the system to load the new kernel.

# shutdown -r -y 60

Solaris 2.5

Suppose you were going to run Oracle on your Sun system under Solaris 2.5 and wanted to change max_nprocs to 1000 and set up the following Interprocess Communication configuration for your shared memory and semaphore parameters:

SHMMAX 2097152 (2 x the default 1048576)
SHMMIN 1
SHMNI 100
SHMSEG 32
SEMMNI 64
SEMMNS 1600
SEMMNU 1250
SEMMSL 25

Step 1--As root, enter the commands below:

# cd /etc
# cp system system.old     (make a backup copy first)

Step 2

# vi system

Add or change the following:

set max_nprocs=1000
set shmsys:shminfo_shmmax=2097152
set shmsys:shminfo_shmmin=1
set shmsys:shminfo_shmmni=100
set shmsys:shminfo_shmseg=32
set semsys:seminfo_semmni=64
set semsys:seminfo_semmns=1600
set semsys:seminfo_semmnu=1250
set semsys:seminfo_semmsl=25

Save and close the file.

Step 3--Reboot your system by entering the following command.

# shutdown -i6 -g0 -y

After the reboot, the kernel parameter and kernel module variables above are in effect on your system.
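As a quick check (a sketch; sysdef output formatting varies between releases), you can look for the shared memory and semaphore sections of the sysdef report:

# sysdef | grep -i shm
# sysdef | grep -i sem

The values shown should match what you set in /etc/system.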

SVR4

In this example we will set the tunable NPROC to 500 and then rebuild the kernel to reflect this new value.

Step 1--Log into the system as root and make a backup of /stand/unix to another area.

# cp /stand/unix /old/unix

Step 2

#cd /etc/conf/cf.d

Edit the init.base file to include any changes that you made in the /etc/inittab file that you want to make permanent. A new /etc/inittab file is created when a new kernel is built and put into place.

Step 3--In this step you edit the configuration files in the /etc/conf directory. We will change only /etc/conf/cf.d/stune (although you can also change /etc/conf/cf.d/mtune). The stune and mtune files contain the tunable parameters the system uses for its kernel configuration. stune is the file you should use when you alter tunable values for the system; it overrides the values listed in mtune. mtune is the master parameter specification file for the system and contains each tunable parameter's default, minimum, and maximum values.

The following command line is an example of how you make stune reflect a parameter change.

# /etc/conf/bin/idtune NPROC 500

You can look at stune to see the change. (stune can also be edited directly with vi.)
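For reference, an stune entry is simply the parameter name followed by its value, so after the idtune command above you should find a line similar to the following (illustrative; spacing may differ):

NPROC 500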

Step 4--Build the new kernel.

# /etc/conf/bin/idbuild

It will take several minutes to complete.

Step 5--Reboot the computer system to enable the new kernel to take effect.

# shutdown -i6 -g0 -y

To see your changes, log back into your system and execute the sysdef command. The system parameters will then be displayed.

AIX 4.2

Unlike the preceding examples, the AIX operating system requires a special tool to reconfigure the kernel. This tool is the System Management Interface Tool (SMIT), developed by IBM for the AIX operating system. The AIX kernel is modular in the sense that portions of the kernel's subsystems are resident only when required.

The following shows a SMIT session that changes the maximum number of processes allowed per user on an AIX 4.2 system, demonstrated with screen prints of an actual kernel configuration session. While using SMIT, you can see the command sequences SMIT generates by pressing the F6 key (the command-line equivalent is also sketched at the end of this section). SMIT also keeps two interaction logs that are handy for post-configuration review: smit.log is an ASCII file that records all menu selections, commands, and output of a session, and smit.script records just the actual command lines used during the session.

Step 1--As root, start SMIT with the following command. This brings up the IBM SMIT GUI interface screen.

# smit

Figure 19.7.
Systems management interface tool.

Step 2--Select "System Environments" from the System Management menu with your mouse.

Figure 19.8.
SMIT--System environments.

Step 3--Select "Change/Show Characteristics of Operating System" from the System Environment menu with your mouse.

Figure 19.9.
SMIT--Change/Show Characteristics of Operating System.

Step 4--Change "Maximum number of PROCESSES allowed per user" to "50" in the "Change/Show Characteristics of Operating System" menu. Do this by selecting the field for "Maximum number of PROCESSES" with your mouse. Then change the current value in the field to "50."

Figure 19.10.
Maximum number of processes changed to 50.

Step 5--After making your change, select the "OK" button to make the new kernel parameters take effect.

Step 6--The System Management Interface Tool will respond with a "Command Status" screen. Verify that there are no errors in it. If there are none, you are done.

Figure 19.11.
SMIT--Command Status screen.

If an error is returned, it will look like the following screen print.

Figure 19.12.
SMIT--Possible Error Screen example.
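For reference, pressing F6 in Step 5 (or reviewing smit.script afterward) shows the command SMIT runs on your behalf. A sketch of the equivalent command line and a follow-up check, assuming the standard sys0 device and its maxuproc attribute:

# chdev -l sys0 -a maxuproc=50
# lsattr -E -l sys0 -a maxuproc

The lsattr output should show maxuproc set to 50.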

Linux

The Point-to-Point Protocol (PPP) allows you to dial in over a telephone line and run Transmission Control Protocol/Internet Protocol (TCP/IP). This lets you run your GUI applications that use IP from a system that is not directly connected to a network. Let's look at how to configure PPP into the Linux kernel.

Step 1--Linux source code is usually found in the /usr/src/linux directory. Start by changing to this directory with the following command.

# cd /usr/src/linux

Step 2--Type the following:

# make config

You will be presented with a series of questions asking whether you would like to include or enable specific modules, drivers, and other options in your kernel. For our build, we need the modem and networking device driver support configured into the kernel. Make sure you have answered [y] to the following (illustrative prompts are shown after this list):

Networking Support (CONFIG_NET)

Network device support (CONFIG_NETDEVICES)

TCP/IP networking (CONFIG_INET)

PPP (point-to-point) support (CONFIG_PPP)

You can accept the defaults for most of the questions. It's a good idea to go through this step once without changing anything to get a feel for the questions you will need to answer; then you can set your configuration in one pass and move on to the next step.
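As an illustration (the exact wording, defaults, and available answers vary with the kernel version), the relevant prompts look something like the following, and you answer y to each:

Networking support (CONFIG_NET) [y] y
Network device support (CONFIG_NETDEVICES) [y] y
TCP/IP networking (CONFIG_INET) [y] y
PPP (point-to-point) support (CONFIG_PPP) [n] y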

After you respond to all the questions, you will see a message telling you that the kernel is configured (it still needs to be built and loaded).

Step 3--The next two commands set up the source dependencies needed to compile the kernel and clean out files left behind by the previous build.

# make dep
# make clean

Step 4--To compile the new kernel issue the make command.

# make

Don't be surprised if this takes several minutes.

Step 5--To see the new kernel, do a long listing of the directory.

# ls -al

You should see vmlinux in the current directory.

Step 6--You now need to make a bootable kernel.

# make boot

To see the compressed bootable kernel image, do a long listing on arch/i386/boot. You will see a file named zImage.

Step 7--The last step is to install the new kernel to the boot drive.

# make zlilo

This command causes the previous kernel (/vmlinuz) to become /vmlinuz.old and installs your new kernel image, zImage, as /vmlinuz. You can now reboot to check your new kernel configuration. During the boot process, you should see messages about the newly configured PPP device driver scroll by as the system loads.
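If the messages scroll by too quickly to read, you can review them after logging in; a minimal check (assuming the kernel ring buffer is still available) is:

# dmesg | grep -i ppp

You should see the PPP driver's initialization message if the new kernel was built and loaded correctly.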

Once everything checks out and you are satisfied with your new Linux kernel, you can continue on with setting up the PPP software.

Summary

We began our discussion by defining the UNIX kernel and the four basic subsystems that comprise the operating system. We described how Process Management creates and manages processes and how Memory Management handles the multiple processes in the system. We discussed how the I/O subsystem takes advantage of swapping and paging to balance the system's load, and how the I/O subsystem interacts with the file management subsystem.

Next, we covered the steps involved in altering the kernel configuration. We demonstrated in detail the steps involved in configuring:

HP-UX 10.X

Solaris 2.5

System V Release 4 (SVR4)

AIX

Linux

In the authors' opinion, the systems administrator should become familiar with the concepts presented in this chapter. Further in-depth study of the kernel and its four subsystems will make the systems administrator more knowledgeable and effective at systems management.



©Copyright, Macmillan Computer Publishing. All rights reserved.