Magic SysRq key of Linux

The magic SysRq key is a key combination in the Linux kernel which allows the user to perform various low-level commands regardless of the system’s state.

It is often used to recover from freezes or to reboot a computer without corrupting the filesystem. The key combination consists of Alt+SysRq+commandkey. In many systems, the SysRq key is the PrintScreen key.

First, you need to enable the SysRq key, as shown below.

# echo “1” > /proc/sys/kernel/sysrq

List of SysRq Command Keys

Following are the command keys available for Alt+SysRq+commandkey.

‘k’ – Kills all the process running on the current virtual console.
’s’ – This will attempt to sync all the mounted file system.
‘b’ – Immediately reboot the system, without unmounting partitions or syncing.
‘e’ – Sends SIGTERM to all process except init.
‘m’ – Output current memory information to the console.
‘i’ – Send the SIGKILL signal to all processes except init
‘r’ – Switch the keyboard from raw mode (the mode used by programs such as X11), to XLATE mode.
’s’ – sync all mounted file systems.
‘t’ – Output a list of current tasks and their information to the console.
‘u’ – Remount all mounted filesystems in read-only mode.
‘o’ – Shutdown the system immediately.
‘p’ – Print the current registers and flags to the console.
‘0-9′ – Sets the console log level, controlling which kernel messages will be printed to your console.
‘f’ – Will call oom_kill to kill a process which takes more memory.
‘h’ – Used to display the help. But any other keys than the above listed will print help.

We can also do this by echoing the keys to the /proc/sysrq-trigger file. For example, to reboot a system you can perform the following.

# echo “b” > /proc/sysrq-trigger

Perform a Safe reboot of Linux using Magic SysRq Key

To perform a safe reboot of a Linux computer which hangs up, do the following. This will avoid the fsck during the next re-booting. i.e Press Alt+SysRq+letter highlighted below.

unRaw (take control of keyboard back from X11,
tErminate (send SIGTERM to all processes, allowing them to terminate gracefully),
kIll (send SIGILL to all processes, forcing them to terminate immediately),
Sync (flush data to disk),
Unmount (remount all filesystems read-only),
reBoot.

Linux Process & Process Management

Introduction

A Linux server, like any other computer you may be familiar with, runs applications. To the computer, these are considered “processes”.

While Linux will handle the low-level, behind-the-scenes management in a process’s lifecycle, you will need a way of interacting with the operating system to manage it from a higher-level.

In this post, we will discuss some simple aspects of process management. Linux provides an abundant collection of tools for this purpose.

How To View Running Processes in Linux

The easiest way to find out what processes are running on your server is to run the top command:

# top

top – 07:45:38 up 9:38, 1 user, load average: 0.00, 0.01, 0.05
Tasks: 89 total, 2 running, 87 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 1014976 total, 517484 free, 102656 used, 394836 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 728192 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 20 0 128092 6680 3932 S 0.0 0.7 0:02.84 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:00.03 ksoftirqd/0
6 root 20 0 0 0 0 S 0.0 0.0 0:00.32 kworker/u30:0
7 root rt 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
————- output truncated ——————-

The top chunk of information gives system statistics, such as system load and the total number of tasks.

You can easily see that there is 2 running process, and 87 processes are sleeping (aka idle/not using CPU resources).

The bottom portion has the running processes and their usage statistics.

Though top gives you an interface to view running processes based on ncurses. This tool is not always flexible enough to adequately cover all scenarios. A powerful command called ps is often the answer to these problems.

List processes with ps command

When called without arguments, the output can be a bit lack-luster:

# ps

PID TTY TIME CMD
3125 pts/0 00:00:00 sudo
3126 pts/0 00:00:00 su
3127 pts/0 00:00:00 bash
3150 pts/0 00:00:00 ps

This output shows all of the processes associated with the current user and terminal session. This makes sense because we are only running bash, sudo and ps with this terminal currently.

We can run ps command with different options to get a complete picture of the processes on this system.

BSD style – The options in bsd style syntax are not preceded by a dash.

# ps aux

UNIX/LINUX style – The options in Linux style syntax are preceded by a dash as usual.

# ps -ef

It is okay to mix both the syntax styles on Linux systems. For example “ps au -x”. In this post, we’re using both style syntaxes.

# ps aux

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 2 0.0 0.0 0 0 ? S Jan11 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S Jan11 0:00 [ksoftirqd/0]
root 6 0.0 0.0 0 0 ? S Jan11 0:00 [kworker/u30:0]
root 7 0.0 0.0 0 0 ? S Jan11 0:00 [migration/0]
root 8 0.0 0.0 0 0 ? S Jan11 0:00 [rcu_bh]
root 9 0.0 0.0 0 0 ? R Jan11 0:00 [rcu_sched]
root 10 0.0 0.0 0 0 ? S Jan11 0:00

————- output truncated ——————-

These options tell ps to show processes owned by all users (regardless of their terminal association) in a user-friendly format.

To see a tree view, where hierarchal relationships are illustrated, we can run the command with these options:

# ps axjf

PPID PID PGID SID TTY TPGID STAT UID TIME COMMAND
0 2 0 0 ? -1 S 0 0:00 [kthreadd]
2 3 0 0 ? -1 S 0 0:00 \_ [ksoftirqd/0]
2 6 0 0 ? -1 S 0 0:00 \_ [kworker/u30:0]
2 7 0 0 ? -1 S 0 0:00 \_ [migration/0]
2 8 0 0 ? -1 S 0 0:00 \_ [rcu_bh]
2 9 0 0 ? -1 R 0 0:00 \_ 1 2024 2024 2024 ? -1 Ss 0 0:00 /usr/sbin/sshd
2024 3100 3100 3100 ? -1 Ss 0 0:00 \_ sshd: ajoy[priv]
3100 3103 3100 3100 ? -1 S 1000 0:00 \_ sshd: ajoy@pts/0
3103 3104 3104 3104 pts/0 3153 Ss 1000 0:00 \_ -bash
3104 3125 3125 3104 pts/0 3153 S 0 0:00 \_ sudo su –
3125 3126 3125 3104 pts/0 3153 S 0 0:00 \_ su –
3126 3127 3127 3104 pts/0 3153 S 0 0:00 \_ -bash
3127 3153 3153 3104 pts/0 3153 R+ 0 0:00 \_ ps axjf
————- output truncated ——————-

As you can see, the process sshd is shown to be a parent of the processes like bash, su, sudo, and ps ajx itself.

List the Process based on the UID and Commands (ps -u, ps -C)

Use -u option to displays the process that belongs to a specific username. When you have multiple usernames, separate them using a comma. The example below displays all the process that are owned by user wwwrun, or postfix.

# ps -f -u wwwrun,postfix

UID PID PPID C STIME TTY TIME CMD
postfix 7457 7435 0 Mar09 ? 00:00:00 qmgr -l -t fifo -u
wwwrun 7495 7491 0 Mar09 ? 00:00:00 /usr/sbin/httpd2-prefork -f /etc/apache2/httpd.conf
wwwrun 7496 7491 0 Mar09 ? 00:00:00 /usr/sbin/httpd2-prefork -f /etc/apache2/httpd.conf
wwwrun 7497 7491 0 Mar09 ? 00:00:00 /usr/sbin/httpd2-prefork -f /etc/apache2/httpd.conf
wwwrun 7498 7491 0 Mar09 ? 00:00:00 /usr/sbin/httpd2-prefork -f /etc/apac

The following example shows that all the processes which have tatad.pl in its command execution.

# ps -f -C tatad.pl

UID PID PPID C STIME TTY TIME CMD
root 9576 1 0 Mar09 ? 00:00:00 /opt/tata/perl/bin/perl /opt/tata/bin/tatad.pl
root 9577 9576 0 Mar09 ? 00:00:00 /opt/tata/perl/bin/perl /opt/tata/bin/tatad.pl
root 9579 9576 0 Mar09 ? 00:00:00 /opt/tata/perl/bin/perl /opt/tata/bin/tatad.pl
root 9580 9576 0 Mar09 ? 00:00:00 /opt/tata/perl/bin/perl /opt/tata/bin/tatad.pl
root 9581 9576 0 Mar09 ? 00:00:00 /opt/tata/perl/bin/perl /opt/tata/bi

The following method is used to get a list of processes with a particular PPID.

#ps -f –ppid 9576

UID PID PPID C STIME TTY TIME CMD
root 9577 9576 0 Mar09 ? 00:00:00 /opt/tata/perl/bin/perl /opt/tata/bin/tatad.pl
root 9579 9576 0 Mar09 ? 00:00:00 /opt/tata/perl/bin/perl /opt/tata/bin/tatad.pl
root 9580 9576 0 Mar09 ? 00:00:00 /opt/tata/perl/bin/perl /opt/tata/bin/tatad.pl
root 9581 9576 0 Mar09 ? 00:00:00 /opt/tata/perl/bin/perl /opt/tata/bin

List Processes in a Hierarchy (ps –forest)

The example below displays the process Id and commands in a hierarchy. –forest is an argument to ps command which displays ASCII art of process tree. From this tree, we can identify which is the parent process and the child processes it forked in a recursive manner.

#ps -e -o pid,args –forest
468 \_ sshd: root@pts/7
514 | \_ -bash
17484 \_ sshd: root@pts/11
17513 | \_ -bash
24004 | \_ vi ./790310__11117/journal
15513 \_ sshd: root@pts/1
15522 | \_ -bash
4280 \_ sshd: root@pts/5
4302 | \_ -bash

List elapsed wall time for processes (ps -o pid,etime=)

If you want the get the elapsed time for the processes which are currently running ps command provides etime which provides the elapsed time since the process was started, in the form [[dd-]hh:]mm: , ss.

The below command displays the elapsed time for the process IDs 1 (init) and process id 29675.

For example “10-22:13:29? in the output represents the process init is running for 10days, 22hours,13 minutes and 29seconds. Since init process starts during the system startup, this time will be same as the output of the ‘uptime’ command.

# ps -p 1,29675 -o pid,etime=
PID
1 10-22:13:29
29675 1-02:58:46

List all threads for a particular process (ps -L)

You can get a list of threads for the processes. When a process hangs, we might need to identify the list of threads running for a particular process as shown below.

# ps -C java -L -o pid,tid,pcpu,state,nlwp,args

PID TID %CPU S NLWP COMMAND
16992 16992 0.0 S 15 ../jre/bin/java -Djava.ext.dirs=../jre/lib/ext:../lib:../auto_lib -Xdebug -Xnoagent -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5006
16992 16993 0.0 S 15 ../jre/bin/java -Djava.ext.dirs=../jre/lib/ext:../lib:../auto_lib -Xdebug -Xnoagent -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5006
16992 16994 0.0 S 15 ../jre/bin/java -Djava.ext.dirs=../jre/lib/ext:../lib:../auto_lib -Xdebug -Xnoagent -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5006
16992 16995 0.0 S 15 ../jre/bin/java -Djava.ext.dirs=../jre/lib/ext:../lib:../auto_lib -Xdebug -Xnoagent -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5006
16992 16996 0.0 S 15 ../jre/bin/java -Djava.ext.dirs=../jre/lib/ext:../lib:../auto_lib -Xdebug -Xnoagent -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5006
16992 16997 0.0 S 15 ../jre/bin/java -Djava.ext.dirs=../jre/lib/ext:../lib:../auto_lib -Xdebug -Xnoagent -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5006
16992 16998 0.0 S 15 ../jre/bin/java -Djava.ext.dirs=../jre/lib/ext:../lib:../auto_lib -Xdebug -Xnoagent -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5006
16992 16999 0.0 S 15 ../jre/bin/java -Djava.ext.dirs=../jre/lib/ext:../lib:../auto_lib -Xdebug -Xnoagent -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5006
16992 17000 0.0 S 15 ../jre/bin/java -Djava.ext.dirs=../jre/lib/ext:../lib:../auto_l

Finding memory Leak (ps –sort pmem)

A memory leak, technically, is an ever-increasing usage of memory by an application.

With common desktop applications, this may go unnoticed, because a process typically frees any memory it has used when you close the application.

However, In the client/server model, memory leakage is a serious issue, because applications are expected to be available 24×7. Applications must not continue to increase their memory usage indefinite because this can cause serious issues. To monitor such memory leaks, we can use the following commands.

# ps aux –sort pmem

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 1520 508 ? S 2005 1:27 init
inst 1309 0.0 0.4 344308 33048 ? S 2005 1:55 agnt (idle)
inst 2919 0.0 0.4 345580 37368 ? S 2005 20:02 agnt (idle)
inst 24594 0.0 0.4 345068 36960 ? S 2005 15:45 agnt (idle)

In the above ps command, –sort option outputs the highest %MEM at the bottom. Just note down the PID for the highest %MEM usage. Then use ps command to view all the details about this process id, and monitor the change over time. You had to manually repeat ir or put it as a cron to a file.

The VSZ number is useless if what you are interested in is memory consumption. VSZ measures how much of the process’s virtual memory space has been marked by the process of memory that should be mapped by the operating system if the process happens to touch it. But it has nothing to do with whether that memory has actually been touched and used. VSZ is an internal detail about how a process does memory allocation — how big a chunk of unused memory it grabs at once. Look at RSS for the count of memory pages it has actually started using

RSS:

Resident set size = the non-swapped physical memory that a task has used; Resident Set currently in physical memory including Code, Data, Stack

VSZ:

Virtual memory usage of entire process = VmLib + VmExe + VmData + VmStk

In other words,
a) VSZ *includes* RSS
b) “ps -aux” alone isn’t enough to tell you if a process is thrashing (although, if your system *is* thrashing, “ps -aux” will help you identify the processes experiencing the biggest hits).

# ps ev –pid=27645

PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
27645 ? S 3:01 0 25 1231262 1183976 14.4 /TaskServer/bin/./wrapper-linux-x86-32

# ps ev –pid=27645

PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
27645 ? S 3:01 0 25 1231262 1183976 14.4 /TaskServer/bin/./wrapper-linux-x86-32

Note: In the above output, if RSS (resident set size, in KB) increases over time (so would %MEM), it may indicate a memory leak in the application.

The following command displays all the process owned by Linux username: oracle.

# ps U oracle

PID TTY STAT TIME COMMAND
5014 ? Ss 0:01 /oracle/bin/tnslsnr
7124 ? Ss 0:00 ora_q002_med
8206 ? Ss 0:00 ora_cjq0_med
8852 ? Ss 0:01 ora_pmon_med

Following command displays all the process owned by the current user.

# ps U $USER

PID TTY STAT TIME COMMAND
10329 ? S 0:00 sshd: ajoy@pts/1,pts/2
10330 pts/1 Ss 0:00 -bash
10354 pts/2 Ss+ 0:00 -bash

The ps command can be configured to show a selected list of columns only. There are a large number of columns to to show and the full list is available in the man pages.

The following command shows only the pid, username, cpu, memory and command columns.

# ps -e -o pid,uname,pcpu,pmem,comm

PID USER %CPU %MEM COMMAND
1 root 0.0 0.6 systemd
2 root 0.0 0.0 kthreadd
3 root 0.0 0.0 ksoftirqd/0
6 root 0.0 0.0 kworker/u30:0
7 root 0.0 0.0 migration/0

The ps command is quite flexible and it is possible to rename the column labels as shown below:

# ps -e -o pid,uname=USERNAME,pcpu=CPU_USAGE,pmem,comm

PID USERNAME CPU_USAGE %MEM COMMAND
1 root 0.0 0.6 systemd
2 root 0.0 0.0 kthreadd
3 root 0.0 0.0 ksoftirqd/0
6 root 0.0 0.0 kworker/u30:0
7 root 0.0 0.0 migration/0

Combined with the watch command we can turn ps into a realtime process reporter. Simple example is like this

# watch -n 1 ‘ps -e -o pid,uname,cmd,pmem,pcpu –sort=-pmem,-pcpu | head -15’

Every 1.0s: ps -e -o pid,uname,cmd,pmem,pcpu –… Sun Dec 1 18:16:08 2009

PID USER CMD %MEM %CPU
3800 1000 /opt/google/chrome/chrome – 4.6 1.4
7492 1000 /opt/google/chrome/chrome – 2.7 1.4
3150 1000 /opt/google/chrome/chrome 2.7 2.5
3824 1000 /opt/google/chrome/chrome – 2.6 0.6
3936 1000 /opt/google/chrome/chrome – 2.4 1.6
2936 1000 /usr/bin/plasma-desktop 2.3 0.2
9666 1000 /opt/google/chrome/chrome – 2.1 0.8

Process IDs:

In Linux and Unix-like systems, each process is assigned a process ID, or PID. This is how the operating system identifies and keeps track of processes.

# pgrep bash

3104
3127

The first process spawned at boot, called init, is given the PID of “1”.

# pgrep init

1

This process is then responsible for spawning every other process on the system. The later processes are given larger PID numbers.

A process’s parent is the process that was responsible for spawning it. If a process’s parent is killed, then the child processes also die. The parent process’s PID is referred to as the PPID.

Process States:

Here are the different values that the s, stat and state output specifiers (header “STAT” or “S”) will display to describe the state of a process:

Running

The process is either running (it is the current process in the system) or it is ready to run (it is waiting to be assigned to one of the system’s CPUs).

Waiting

The process is waiting for an event or for a resource. Linux differentiates between two types of waiting process; interruptible and uninterruptible. Interruptible waiting processes can be interrupted by signals whereas uninterruptible waiting processes are waiting directly on hardware conditions and cannot be interrupted under any circumstances.

Stopped

The process has been stopped, usually by receiving a signal. A process that is being debugged can be in a stopped state.
Zombie
This is a halted process which, for some reason, still has a task_struct data structure in the task vector. It is what it sounds like, a dead process.

D uninterruptible sleep (usually IO)
R running or runnable (on run queue)
S interruptible sleep (waiting for an event to complete)
T stopped, either by a job control signal or because it is being traced.
W paging (not valid since the 2.6.xx kernel)
X dead (should never be seen)
Z defunct (“zombie”) process, terminated but not reaped by its parent.

For BSD formats and when the stat keyword is used, additional characters may be displayed:
< high-priority (not nice to other users)
N low-priority (nice to other users)
L has pages locked into memory (for real-time and custom IO)
s is a session leader
l is multi-threaded (using CLONE_THREAD, like NPTL pthreads do)
+ is in the foreground process group

Processes Signals:

All processes in Linux respond to signals. Signals are an os-level way of telling programs to terminate or modify their behavior. The most common way of passing signals to a program is with the kill command. The default functionality of this utility is to attempt to kill a process:

# kill <PID of process>

This sends the TERM signal to the process. The TERM signal tells the process to please terminate. This allows the program to perform clean-up operations and exit smoothly.

If the program is misbehaving and does not exit when given the TERM signal, we can escalate the signal by passing the KILL signal:

# kill -KILL <PID of process>

This is a special signal that is not sent to the program.

Instead, it is given to the operating system kernel, which shuts down the process. This is used to bypass programs that ignore the signals sent to them.

Each signal has an associated number that can be passed instead of the name. For instance, You can pass “-15” instead of “-TERM”, and “-9” instead of “-KILL”.

Signals are not only used to shut down programs. They can also be used to perform other actions.

For instance, many daemons will restart when they are given the HUP or hang-up signal. Apache is one program that operates like this.

# kill -HUP <PID of httpd>

The above command will cause Apache to reload its configuration file and resume serving content.

You can list all of the signals that are possible to send with the kill by typing:

# kill -l

1) SIGHUP 2) SIGINT 3) SIGQUIT 4) SIGILL 5) SIGTRAP
6) SIGABRT 7) SIGBUS 8) SIGFPE 9) SIGKILL 10) SIGUSR1
11) SIGSEGV 12) SIGUSR2 13) SIGPIPE 14) SIGALRM 15) SIGTERM
16) SIGSTKFLT 17) SIGCHLD 18) SIGCONT 19) SIGSTOP 20) SIGTSTP
21) SIGTTIN 22) SIGTTOU 23) SIGURG 24) SIGXCPU 25) SIGXFSZ
26) SIGVTALRM 27) SIGPROF 28) SIGWINCH 29) SIGIO 30) SIGPWR
31) SIGSYS 34) SIGRTMIN 35) SIGRTMIN+1 36) SIGRTMIN+2 37) SIGRTMIN+3
38) SIGRTMIN+4 39) SIGRTMIN+5 40) SIGRTMIN+6 41) SIGRTMIN+7 42) SIGRTMIN+8
43) SIGRTMIN+9 44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12 47) SIGRTMIN+13
48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14 51) SIGRTMAX-13 52) SIGRTMAX-12
53) SIGRTMAX-11 54) SIGRTMAX-10 55) SIGRTMAX-9 56) SIGRTMAX-8 57) SIGRTMAX-7
58) SIGRTMAX-6 59) SIGRTMAX-5 60) SIGRTMAX-4 61) SIGRTMAX-3 62) SIGRTMAX-2
63) SIGRTMAX-1 64) SIGRTMAX

The conventional way of sending signals is through the use of PIDs, there are also methods of doing this with regular process names.

The pkill command works in almost exactly the same way as kill, but it operates on a process name instead:

# pkill -9 ping

is equivalent to

# kill -9 `pgrep ping`

If you would like to send a signal to every instance of a certain process, you can use the killall command:

# killall firefox

Process Priorities

Some processes might be considered mission critical for your situation, while others may be executed whenever there might be leftover resources.You will want to adjust which processes are given priority in a server environment. Linux controls priority through a value called niceness.

High priority tasks are considered less nice, because they don’t share resources as well. Low priority processes, on the other hand, are nice because they insist on only taking minimal resources.

When we ran top at the beginning of this blog, there was a column marked “NI”. This is the nice value of the process:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 20 0 128092 6680 3932 S 0.0 0.7 0:02.84 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0

Nice values can range between “-19/-20” (highest priority) and “19/20” (lowest priority) depending on the system.

To run a program with a certain nice value, we can use the nice command:

# nice -n 15 <command>

This only works while executing a new program.

To alter the nice value of a program that is already executing, we use a tool called renice:

# renice 0 <PID of process>

Real, Effective & Saved UID explained

Each Linux/Unix process has 3 UIDs associated with it. Superuser privilege is UID=0.

Real UID

This is the UID of the user/process that created THIS process. It can be changed only if the running process has EUID=0.

Effective UID

This UID is used to evaluate privileges of the process to perform a particular action. EUID can be changed either to RUID, or SUID if EUID!=0. If EUID=0, it can be changed to anything.

Saved UID

If the binary image file, that was launched has a Set-UID bit on, SUID will be the UID of the owner of the file. Otherwise, SUID will be the RUID.

  • What is the idea behind this?

Normal programs, like “ls”, “cat”, “echo” will be run by a normal user, under that users UID. Special programs that allow the user to have controlled access to protected data, can have Set-UID bit to allow the program to be run under privileged UID.

An example of such program is “passwd”. If you list it in full, you will see that it has a Set-UID bit and the owner is “root”. When a normal user, say “ajoy”, runs “passwd”, passwd starts with:

Real-UID = ajoy
Effective-UID = ajoy
Saved-UID = root

The program calls a system call “seteuid( 0 )” and since SUID=0, the call will succeed and the UIDs will be:

Real-UID = ajoy
Effective-UID = root
Saved-UID = root

After that, “passwd” process will be able to access /etc/passwd and change password for user “ajoy”. Note that user “ajoy” cannot write to /etc/passwd on it’s own. Note one other thing, setting a Set-UID on an executable file is not enough to make it run as a privileged process. The program itself must make a system call.

That is the idea.

List open files lsof command explained

The command lsof stands for list open files, which will list all the open files in the system. The open files include network connections, devices, and directories. The output of the lsof command will have the following columns:

COMMAND process name.
PID process ID
USER Username
FD file descriptor
TYPE node type of the file
DEVICE device number
SIZE file size
NODE node number
NAME full path of the file name.

Simply typing lsof will provide a list of all open files belonging to all active processes.

# lsof

COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
init 1 root cwd DIR 8,1 4096 2 /
init 1 root txt REG 8,1 124704 917562 /sbin/init
init 1 root 0u CHR 1,3 0t0 4369 /dev/null
init 1 root 1u CHR 1,3 0t0 4369 /dev/null
init 1 root 2u CHR 1,3 0t0 4369 /dev/null
init 1 root 3r FIFO 0,8 0t0 6323 pipe
—————————————-truncated——————

By default, one file per line is displayed. Most of the columns are self-explanatory. We will explain the details about a couple of cryptic columns (FD and TYPE).

FD – Represents the file descriptor. Some of the values of FDs are,

cwd – Current Working Directory
txt – Text file
mem – Memory mapped file
mmap – Memory mapped device
NUMBER – Represent the actual file descriptor. The character after the number i.e ’1u’, represents the mode in which the file is opened. r for read, w for write, u for read and write.

TYPE – Specifies the type of the file. Some of the values of TYPEs are,

REG – Regular File
DIR – Directory
FIFO – First In First Out
CHR – Character special file

The lsof command by itself without may return lot of records as output, which may not be very meaningful except to give you a rough idea about how many files are open in the system at any given point of view as shown below.

# lsof | wc -l

3093

Use lsof –u option to display all the files opened by a specific user.

# lsof –u ajoy

vi 7190 ajoy txt REG 8,1 47

 

List opened files under a directory

You can list the processes which opened files under a specified directory using ‘+D’ option. +D will recurse the sub directories also. If you don’t want lsof to recurse, then use ‘+d’ option.

# lsof +D /var/log/

COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
rsyslogd 488 syslog 1w REG 8,1 1151 268940 /var/log/syslog
rsyslogd 488 syslog 2w REG 8,1 2405 269616 /var/log/auth.log
console-k 144 root 9w REG 8,1 10871 269369 /var/log/ConsoleKit/history

List opened files based on process names starting with

You can list the files opened by process names starting with a string, using ‘-c’ option. -c followed by the process name will list the files opened by the process starting with that processes name. You can give multiple -c switch on a single command line.

# lsof -c ssh -c init

COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
init 1 root txt REG 8,1 124704 917562 /sbin/init
init 1 root mem REG 8,1 1434180 1442625 /lib/i386-linux-gnu/libc-2.13.so
init 1 root mem REG 8,1 30684 1442694 /lib/i386-linux-gnu/librt-2.13.so

ssh-agent 1528 user1 1u CHR 1,3 0t0 4369 /dev/null
ssh-agent 1528 user1 2u CHR 1,3 0t0 4369 /dev/null

List processes using a mount point

Sometime when we try to umount a directory, the system will say “Device or Resource Busy” error. So we need to find out what are all the processes using the mount point and kill those processes to umount the directory. By using lsof we can find those processes.

# lsof /home

The following will also work.

# lsof +D /home/

# lsof -u ^ajoy

COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
rtkit-dae 1380 rtkit 7u 0000 0,9 0 4360 anon_inode
udisks-da 1584 root cwd DIR 8,1 4096 2 /

The above command listed all the files opened by all users, expect user ‘ajoy’.

List all open files by a specific process

You can list all the files opened by a specific process using ‘-p’ option. It will be helpful sometimes to get more information about a specific process.

# lsof -p 1753

COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
bash 1753 user1 cwd DIR 8,1 4096 393571 /home/ajoy/test.txt
bash 1753 user1 rtd DIR 8,1 4096 2 /
bash 1753 user1 255u CHR 136,0 0t0 3 /dev/pts/0

Kill all process that belongs to a particular user

When you want to kill all the processes which has files opened by a specific user, you can use ‘-t’ option to list output only the process id of the process, and pass it to kill as follows

# kill -9 `lsof -t -u ajoy`

The above command will kill all process belonging to user ‘ajoy’, which has files opened.

Similarly you can also use ‘-t’ in many ways. For example, to list process id of a process which opened /var/log/syslog can be done by

# lsof -t /var/log/syslog

489

Combine more list options using OR/AND

By default when you use more than one list option in lsof, they will be treated as OR. For example,

# lsof -u user1-c init

COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
init 1 root cwd DIR 8,1 4096 2 /
init 1 root txt REG 8,1 124704 917562 /sbin/init
bash 1995 user1 2u CHR 136,2 0t0 5 /dev/pts/2
bash 1995 user1 255u CHR 136,2 0t0 5 /dev/pts/2

The above command uses two list options, ‘-u’ and ‘-c’. So the command will list process belongs to user ‘lakshmanan’ as well as process name starts with ‘init’.

But when you want to list a process belongs to user ‘lakshmanan’ and the process name starts with ‘init’, you can use ‘-a’ option.

# lsof -u user1 -c init -a

List all network connections

You can list all the network connections opened by using ‘-i’ option.

# lsof -i

COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
avahi-dae 515 avahi 13u IPv4 6848 0t0 UDP *:mdns
avahi-dae 515 avahi 16u IPv6 6851 0t0 UDP *:52060
cupsd 1075 root 5u IPv6 22512 0t0 TCP ip6-localhost:ipp (LISTEN)

List all network files in use by a specific process

You can list all the network files which is being used by a process as follows

# lsof -i -a -p 234

You can also use the following

# lsof -i -a -c ssh

List processes which are listening on a particular port

You can list the processes which are listening on a particular port by using ‘-i’ with ‘:’ as follows

# lsofi :25

COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
exim4 2541 Debian-exim 3u IPv4 8677 TCP localhost:smtp (LISTEN)

List all TCP or UDP connections

You can list all the TCP or UDP connections by specifying the protocol using ‘-i’.

# lsof -i tcp; lsof -i udp;

List all Network File System ( NFS ) files

You can list all the NFS files by using ‘-N’ option. The following lsof command will list all NFS files used by user ‘lakshmanan’.

# lsof -N -u user1 -a

4608 475196 /bin/vi

sshd 7163 ajoy 3u IPv6 15088263 TCP dev-db:ssh->abc-12-12-12-12.socal.res.rr.com:2631 (ESTABLISHED)

A system administrator can use this command to get some idea on what users are executing on the system.

List Users of a particular file

If you like to view all the users who are using a particular file, use lsof as shown below. In this example, it displays all users who are currently using vi.

# lsof /bin/vi

COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
vi 7258 root txt REG 8,1 474608 475196 /bin/vi
vi 7300 ajoy txt REG 8,1 474608 475196 /bin/vi

Semaphores in a DB perspective

Semaphores can be described as counters which are used to provide synchronization between processes or between threads within a process for shared resources like shared memories. System V semaphores support semaphore sets where each one is a counting semaphore. So when an application requests semaphores, the kernel releases them in sets. The number of semaphores per set can be defined through the kernel parameter SEMMSL.

To see all semaphore settings, run:

# ipcs -ls

  • The SEMMSL Parameter

This parameter defines the maximum number of semaphores per semaphore set.

Oracle recommends SEMMSL to be at least 250 for 9i R2 and 10g R1/R2 databases except for 9i R2 on x86 platforms where the minimum value is lower. Since these recommendations are minimum settings, it’s best to set it always to at least 250 for 9i and 10g databases on x86 and x86-64 platforms.

NOTE:
If a database gets thousands of concurrent connections where the ora.init parameter PROCESSES is very large, then SEMMSL should be larger as well. Note what Metalink Note:187405.1 and Note:184821.1 have to say regarding SEMMSL: “The SEMMSL setting should be 10 plus the largest PROCESSES parameter of any Oracle database on the system”. Even though these notes talk about 9i databases this SEMMSL rule also applies to 10g databases. I’ve seen low SEMMSL settings to be an issue for 10g RAC databases where Oracle recommended to increase SEMMSL and to calculate it according to the rule mentioned in these notes. An example for setting semaphores for higher PROCESSES settings can be found at Example for Semaphore Settings.

  • The SEMMNI Parameter

This parameter defines the maximum number of semaphore sets for the entire Linux system.

Oracle recommends SEMMNI to be at least 128 for 9i R2 and 10g R1/R2 databases except for 9i R2 on x86 platforms where the minimum value is lower. Since these recommendations are minimum settings, it’s best to set it always to at least 128 for 9i and 10g databases on x86 and x86-64 platforms.

  • The SEMMNS Parameter

This parameter defines the total number of semaphores (not semaphore sets) for the entire Linux system. A semaphore set can have more than one semaphore, and as the semget(2) man page explains, values greater than SEMMSL * SEMMNI makes it irrelevant. The maximum number of semaphores that can be allocated on a Linux system will be the lesser of: SEMMNS or (SEMMSL * SEMMNI).

Oracle recommends SEMMNS to be at least 32000 for 9i R2 and 10g R1/R2 databases except for 9i R2 on x86 platforms where the minimum value is lower. Setting SEMMNS to 32000 ensures that SEMMSL * SEMMNI (250*128=32000) semaphores can be be used. Therefore it’s recommended to set SEMMNS to at least 32000 for 9i and 10g databases on x86 and x86-64 platforms.

  • The SEMOPM Parameter

This parameter defines the maximum number of semaphore operations that can be performed per semop(2) system call (semaphore call). The semop(2) function provides the ability to do operations for multiple semaphores with one semop(2) system call. Since a semaphore set can have the maximum number of SEMMSL semaphores per semaphore set, it is often recommended to set SEMOPM equal to SEMMSL.

Oracle recommends to set SEMOPM to a minimum value of 100 for 9i R2 and 10g R1/R2 databases on x86 and x86-64 platforms.

  • Setting Semaphore Parameters

To determine the values of the four described semaphore parameters, run:

# cat /proc/sys/kernel/sem
250 32000 32 128

These values represent SEMMSL, SEMMNS, SEMOPM, and SEMMNI.

Alternatively, you can run:

# ipcs -ls

All four described semaphore parameters can be changed in the proc file system without reboot:

# echo 250 32000 100 128 > /proc/sys/kernel/sem

Alternatively, you can use sysctl(8) to change it:

# sysctl -w kernel.sem=”250 32000 100 128″

To make the change permanent, add or change the following line in the file /etc/sysctl.conf. This file is used during the boot process.

# echo “kernel.sem=250 32000 100 128” >> /etc/sysctl.conf

  • Example for Semaphore Settings

On systems where the ora.init parameter PROCESSES is very large, the semaphore settings need to be adjusted accordingly.

As shown at The SEMMSL Parameter the SEMMSL setting should be 10 plus the largest PROCESSES parameter of any Oracle database on the system. So if you have one database instance running on a system where PROCESSES is set to 5000, then SEMMSL should be set to 5010.

As shown at The SEMMNS Parameter the maximum number of semaphores that can be allocated on a Linux system will be the lesser of: SEMMNS or (SEMMSL * SEMMNI). Since SEMMNI can stay at 128, we need to increase SEMMNS to 641280 (5010*128).

As shown at The SEMOPM Parameter a semaphore set can have the maximum number of SEMMSL semaphores per semaphore set and it is recommended to set SEMOPM equal to SEMMSL. Since SEMMSL is set to 5010 the SEMOPM parameter should be set to 5010 as well.

Hence, if the ora.init parameter PROCESSES is set to 5000, then the semaphore settings should be as follows:

# sysctl -w kernel.sem=”5010 641280 5010 128″

Init & Initrd

In this post, we’re discussing on two major components of Linux OS, initrd and init process.

Initrd a.k.a initial RAM disk:

The initial RAM disk (initrd) is an initial root file system that is mounted prior to when the real root file system is available. The initrd is bound to the kernel and loaded as part of the kernel boot procedure. The kernel then mounts this initrd as part of the two-stage boot process to load the modules to make the real file systems available and get at the real root file system.

The initrd contains a minimal set of directories and executables to achieve this, such as the insmod tool to install kernel modules into the kernel.

In the case of desktop or server Linux systems, the initrd is a transient file system. Its lifetime is short, only serving as a bridge to the real root file system. In embedded systems with no mutable storage, the initrd is the permanent root file system. We’re going to traverse both of these contexts.

Anatomy of the initrd

The initrd image contains the necessary executables and system files to support the second-stage boot of a Linux system. Depending on which version of Linux you’re running, the method for creating the initial RAM disk can vary. Prior to Fedora Core 3, the initrd is constructed using the loop device. The loop device is a device driver that allows you to mount a file as a block device and then interpret the file system it represents. The loop device may not be present in your kernel, but you can enable it through the kernel’s configuration tool (make menuconfig) by selecting Device Drivers > Block Devices > Loopback Device Support. You can inspect the loop device as follows (your initrd file name will vary):

Inspecting the initrd (prior to FC3)

# mkdir temp ; cd temp
# cp /boot/initrd.img.gz .
# gunzip initrd.img.gz
# mount -t ext -o loop initrd.img /mnt/initrd
# ls -la /mnt/initrd
#

You can now inspect the /mnt/initrd subdirectory for the contents of the initrd. Note that even if your initrd image file does not end with the .gz suffix, it’s a compressed file, and you can add the .gz suffix to gunzip it.
Beginning with Fedora Core 3, the default initrd image is a compressed cpio archive file. Instead of mounting the file as a compressed image using the loop device, you can use a cpio archive. To inspect the contents of a cpio archive, use the following commands:

Inspecting the initrd (FC3 and later)

# mkdir temp ; cd temp
# cp /boot/initrd-2.6.14.2.img initrd-2.6.14.2.img.gz
# gunzip initrd-2.6.14.2.img.gz
# cpio -i –make-directories < initrd-2.6.14.2.img
#

The result is a small root file system, as shown below. The small, but necessary, set of applications are present in the ./bin directory, including nash (not a shell, a script interpreter), insmod for loading kernel modules, and lvm (logical volume manager tools).

Default Linux initrd directory structure

# ls -la
#
drwxr-xr-x 10 root root 4096 May 7 02:48 .
drwxr-x— 15 root root 4096 May 7 00:54 ..
drwxr-xr-x 2 root root 4096 May 7 02:48 bin
drwxr-xr-x 2 root root 4096 May 7 02:48 dev
drwxr-xr-x 4 root root 4096 May 7 02:48 etc
-rwxr-xr-x 1 root root 812 May 7 02:48 init
-rw-r–r– 1 root root 1723392 May 7 02:45 initrd-2.6.14.2.img
drwxr-xr-x 2 root root 4096 May 7 02:48 lib
drwxr-xr-x 2 root root 4096 May 7 02:48 loopfs
drwxr-xr-x 2 root root 4096 May 7 02:48 proc
lrwxrwxrwx 1 root root 3 May 7 02:48 sbin -> bin
drwxr-xr-x 2 root root 4096 May 7 02:48 sys
drwxr-xr-x 2 root root 4096 May 7 02:48 sysroot
#

Of interest in the above command output is the init file at the root. This file, like the traditional Linux boot process, is invoked when the initrd image is decompressed into the RAM disk. We’ll explore about init in detail later in this post.

Tools for creating an initrd

The initial RAM disk was originally created to support bridging the kernel to the ultimate root file system through a transient root file system. The initrd is also useful as a non-persistent root file system mounted in a RAM disk for embedded Linux systems.

 

The mkinitrd utility is ideal for creating initrd images. In addition to creating an initrd image, it also identifies the modules to load for your particular system and populates them into the image.

Find out your kernel version:

# uname -r

2.6.15.4

Make a backup of existing ram disk:

# cp /boot/initrd.$(uname -r).img /root

To create initial ramdisk image type the following command as the root user:

# mkinitrd -o /boot/initrd.$(uname -r).img $(uname -r)
# ls -l /boot/initrd.$(uname -r).img

# mkinitrd -f -v /boot/initrd-$(uname -r).img $(uname -r)

The -v verbose flag causes mkinitrd to display the names of all the modules it is including in the initial ramdisk. The -f option will force an overwrite of any existing initial ramdisk image at the path you have specified. If you are in a kernel version different to the initrd you are building (including if you are in Rescue Mode) you must specify the full kernel version, without architecture

# mkinitrd -f -v /boot/initrd-2.6.18-164.el5.img 2.6.18-164.el5

You may need to modify grub.conf to point out to correct ramdisk image, make sure following line existing in grub.conf file:

# initrd /boot/initrd.img-2.6.15.4.img

 

initrd during Linux booting process:

initrd provides the capability to load a RAM disk by the bootloader. This RAM disk can then be mounted as the root filesystem and programs can be run from it. Afterward, a new root file system can be mounted on a different device. The previous root (from initrd) is then moved to a directory and can be subsequently unmounted. initrd is mainly designed to allow system startup to occur in two phases, where the kernel comes up with a minimum set of compiled-in drivers, and where additional modules are loaded from initrd.

  1. The boot loader loads the kernel and the initial RAM disk
  2. The kernel converts initrd into a “normal” RAM disk and frees the memory used by initrd
  3. initrd is mounted read-write as root
  4. /linuxrc is executed (this can be any valid executable, including shell scripts; it is run with uid 0 and can do basically everything init can do)
  5. linuxrc mounts the “real” root file system
  6. linuxrc places the root file system at the root directory using the pivot_root system call
  7. The usual boot sequence (e.g. invocation of /sbin/init) is performed on the root file system
  8. The initrd file system is removed

What is init?

Init is a user-space process that always has PID=1 and PPID=0. It’s the first user-space program spawned by the kernel once everything is ready (i.e. essential device drivers are initialized and the root filesystem is mounted). As the first process launched, it doesn’t have a meaningful parent.

 

When the kernel is loaded, it immediately initializes and configures the computer’s memory and configures the various hardware attached to the system, including all processors, I/O subsystems, and storage devices. It then looks for the compressed initrd image(s) in a predetermined location in memory, decompresses it directly to /sysroot/, and loads all necessary drivers. Next, it initializes virtual devices related to the file system, such as LVM or software RAID, before completing the initrd processes and freeing up all the memory the disk image once occupied.

The kernel then creates a root device, mounts the root partition read-only, and frees any unused memory.

At this point, the kernel is loaded into memory and operational. However, since there are no user applications that allow meaningful input to the system, not much can be done with the system.

To set up the user environment, the kernel executes the /sbin/init program.

The /sbin/init program (also called init) coordinates the rest of the boot process and configures the environment for the user. When the init command starts, it becomes the parent or grandparent of all of the processes that start up automatically on the system. First, it runs the /etc/rc.d/rc.sysinit script, which sets the environment path, starts swap, checks the file systems, and executes all other steps required for system initialization. For example, most systems use a clock, so rc.sysinit reads the /etc/sysconfig/clock configuration file to initialize the hardware clock. Another example is if there are special serial port processes which must be initialized, rc.sysinit executes the /etc/rc.serial file.

The init command then processes the jobs in the /etc/event.d directory, which describes how the system should be set up in each SysV init runlevel. Runlevels are a state, or mode, defined by the services listed in the SysV /etc/rc.d/rc<x>.d/ directory, where <x> is the number of the runlevel.

Next, the init command sets the source function library, /etc/rc.d/init.d/functions, for the system, which configures how to start, kill, and determine the PID of a program.

The init program starts all of the background processes by looking in the appropriate rc directory for the runlevel specified as the default in /etc/inittab. The rc directories are numbered to correspond to the runlevel they represent. For instance, /etc/rc.d/rc5.d/ is the directory for runlevel 5.
When booting to runlevel 5, the init program looks in the /etc/rc.d/rc5.d/ directory to determine which processes to start and stop.
Below is an example listing of the /etc/rc.d/rc5.d/ directory:

K74ypserv -> ../init.d/ypserv
K74ypxfrd -> ../init.d/ypxfrd
K85mdmpd -> ../init.d/mdmpd
K89netplugd -> ../init.d/netplugd
K99microcode_ctl -> ../init.d/microcode_ctl
S04readahead_early -> ../init.d/readahead_early
S05kudzu -> ../init.d/kudzu
S06cpuspeed -> ../init.d/cpuspeed
S08ip6tables -> ../init.d/ip6tables
S08iptables -> ../init.d/iptables
S09isdn -> ../init.d/isdn

As illustrated in this listing, none of the scripts that actually start and stop the services are located in the /etc/rc.d/rc5.d/ directory. Rather, all of the files in /etc/rc.d/rc5.d/ are symbolic links pointing to scripts located in the /etc/rc.d/init.d/ directory. Symbolic links are used in each of the rc directories so that the runlevels can be reconfigured by creating, modifying, and deleting the symbolic links without affecting the actual scripts they reference.

The name of each symbolic link begins with either a K or an S. The K links are processes that are killed on that runlevel, while those beginning with an S are started.

The init command first stops all of the K symbolic links in the directory by issuing the /etc/rc.d/init.d/<command> stop command, where <command> is the process to be killed. It then starts all of the S symbolic links by issuing /etc/rc.d/init.d/<command> start.

Running Additional Programs at Boot Time:

The /etc/rc.d/rc.local script is executed by the init command at boot time or when changing run levels. Adding commands to the bottom of this script is an easy way to perform necessary tasks like starting special services or initialize devices without writing complex initialization scripts in the /etc/rc.d/init.d/ directory and creating symbolic links.
The /etc/rc.serial script is used if serial ports must be setup at boot time. This script runs setserial commands to configure the system’s serial ports. Refer to the setserial man page for more information.

Shutting Down

To shut down Red Hat Enterprise Linux, the root user may issue the /sbin/shutdown command. The shutdown man page has a complete list of options, but the two most common uses are:

/sbin/shutdown -h now

and

/sbin/shutdown -r now

After shutting everything down, the -h option halts the machine and the -r option reboots.
PAM console users can use the reboot and halt commands to shut down the system while in run levels 1 through 5. For more information about PAM console users, refer to the Red Hat Enterprise Linux Deployment Guide.
If the computer does not power itself down, be careful not to turn off the computer until a message appears indicating that the system is halted.
Failure to wait for this message can mean that not all the hard drive partitions are unmounted, which can lead to file system corruption.

Further to the discussion on init, we have init stages like init s, init 0, init 1, init 2, init 3, init 4, init 5, init 6 which corresponds to run levels. init s and init 1 are single user mode but with a significant difference.

“init 1” is an administrative mode that runs the /etc/inittab script.
“init s” is a repair mode that doesn’t mount the filesystem. So, /etc/inittab doesn’t need to be there

Booting to run level 1 causes RHEL to process the /etc/rc.d/rc.sysinit script followed a single script from rc1.d which after some basic checks and cleanup execs init S.

Booting to runlevel s (S or single) causes RHEL to just process the /etc/rc.d/rc.sysinit script. If /etc/inittab is missing or corrupt, you go directly to a root shell with no scripts processed.

 

 

 

Linux memory usage demystified.. !!!!

Have you ever worried about Memory usage in the Linux server?

Ok, let’s take a look into the Linux memory.

Memory Architecture

To execute a process, the Linux kernel allocates a portion of the memory area to the requesting process. The process uses the memory area as workspace and performs the
required work. Assume a desktop or laptop assigned to you in your office cubicle. Take a look at your desktop (read as your Windows/Linux Dekstop) most of us will use this place in our system to scatter text files, documents, pdf’s, ppt’s and other things to perform your work, look how muddled up it looks. We are very poor in managing our desktop area, we are speaking about a maximum area of 1366×768 or more. Think of our Linux kernel managing GB’s of memory how cluttered it will be, But the Linux kernel is perfectly efficient in managing its memory area. The difference is that the kernel has to allocate space in a more dynamic manner. The number of running processes sometimes comes to tens of thousands and amount of memory is usually limited. Therefore, Linux kernel must handle the memory efficiently or you will end up with an unresponsive system.

32 Bit Vs 64 bit

On 32-bit architectures such as the i386, the Linux kernel can directly address only the first gigabyte of physical memory (896 MB when considering the reserved range). Memory above the so-called ZONE_NORMAL must be mapped into the lower 1 GB. This mapping is completely transparent to applications, but allocating a memory page in ZONE_HIGHMEM causes a small performance degradation.

On the other hand, with 64-bit architectures such as x86_64, ZONE_NORMAL extends all the way to 64 GB or to 128 GB in the case of IA-64 systems. As you can see, the overhead of mapping memory pages from ZONE_HIGHMEM into ZONE_NORMAL can be eliminated by using a 64-bit architecture

On 32-bit architectures, the maximum address space that single process can access is 4GB.This is a restriction derived from 32-bit virtual addressing. In a standard implementation, the virtual address space is divided into a 3 GB user space and a 1 GB kernel space. There are some variants like 4 G/4 G addressing layout implementing.

On the other hand, on 64-bit architecture such as x86_64 and IA64, no such restriction exists.Each single process can benefit from the vast and huge address space.

Memory Usage

On a Linux system, many programs run at the same time. These programs support multiple users, and some processes are more used than others. Some of these programs use a portion of memory while the rest are “sleeping.” When an application accesses cache, the performance increases because an in-memory access retrieves data, thereby eliminating the need to access slower disks. The OS uses an algorithm to control which programs will use physical memory and which are paged out. This is transparent to user programs.

Paging and Swapping

In Linux, genrally *NIXes, there are differences between paging and swapping. Paging moves individual pages to swap space on the disk and swapping is a bigger operation that moves the entire address space of a process to swap space in one operation.

If your server is always paging to disk (a high page-out rate), buddy..! its time to increase your physical RAM. However, for systems with a low page-out rate, it might not affect performance. If a process behaves poorly. Paging can be a serious performance problem when the amount of free memory pages falls below the minimum amount specified, because the paging mechanism is not able to handle the requests for physical memory pages and the swap mechanism is called to free more pages. This significantly increases I/O to disk and will quickly degrade a server’s performance.

Memory Available

This indicates how much physical memory is available for use. If, after you start your application, this value has decreased significantly, you might have a memory leak. Check the application that is causing it and make the necessary adjustments.

System Cache

This is the common memory space used by the file system cache.

Page Faults

There are two types of page faults: soft page faults, when the page is found in memory, and hard page faults, when the page is not found in memory and must be fetched from disk. Accessing the disk will slow your application considerably.

Private Memory

This represents the memory used by each process running on the server.

Oh.. ok.. at least some of you seems its too boring by now, So let’s think it real time.

To explain the memory usage I’ll try to illustrate with examples from my virtual test server running  PostgreSQL DB Instance on Centos 6.3.

The above output is a typical output for the free command used for memory monitoring. Just looking at this numbers, we can conclude that the system is having approx 2GB of ram and nearly 95% of it is used. If we look at the Swap: line in the output, we see that the swap space appears to be unused.

In between the Mem: and Swap: lines, we see a line labeled -/+ buffers/cache. This is probably the trickiest part in interpreting free command output. This line shows how much of the physical memory is used by the buffer/cache. In other words, this shows how much memory is being used (a better word would be borrowed) for disk caching. Disk caching makes the system run much faster.

So, while at first glance, this system appears to be running short of memory, it’s actually just making good use of memory that’s currently not needed for anything else. The key number to look at in the output above is, therefore, 1703. This is the amount of memory that would be made available to your applications if they need it.

Now we will see the Swap and Paging activity

NOTE: If your system is busy and you want to watch how memory is changing, you can run free with a -s (seconds) argument that causes the command to give you totals every X seconds. You need to start worrying about memory only and only if you see the swap usage is high.

Why Linux?

Why do I use Linux? Why do I spend much of my time suggesting others use it? Is it just because it’s available for free? These are interesting questions that are not discussed very often.

I post my personal motive for using Linux below. Some are practical, while others are more judicious.

If you’re one of those quivers on the brink of switching to Linux, reading this list might be a good place to start, and you may find some inspiration to make the leap.

  • Control over my system

There’s no “right way” or “wrong way” of doing things (although there are sensible and efficient ways of doing things, of course). I have the freedom to do what I want with Linux. In the Linux community, you’ll never hear somebody say, “Hey! You’re not supposed to do that!” or, “Serves you right for doing it the wrong way!” Instead, what you’re more likely to hear is, “Hey! I didn’t know you could do that! That’s cool!” Innovative solutions are encouraged. There is more than one way to do that a.k.a TIMTOWTDT

This freedom extends to my choice of software applications I use on my system. If I don’t like a particular piece of software, I can use an alternative. This is true even of desktop or system components.

  • Community

All operating systems tend to generate communities around them, but the community around Linux is proactive, rather than passive. What does it mean? Well, a typical posting in a Windows community forum might be something like this: “Hey! Feature X doesn’t work as it should! This sucks! I sure hope Microsoft fixes it soon!” The same posting in a Linux forum is more likely to be like this: “Hey! Feature X doesn’t work as it should! Here’s a solution…” Not only that but, in the reply, there will be several other solutions from other people or the original solution will be improved by others. Not only that but a developer might read the posting and offer a tweak for the original program, or even start his/her alternative project.

People in the Linux community share what they know. This is the whole damn point. Linux is based on the fundamental concept that knowledge wants to be free. Personally, I think this is awe-inspiring, not least because Linux brings out the very best in people.

What this also means that, if you solve a problem and take just a few moments to share it, you’re making Linux stronger. You are both a user and a contributor.

  • Virus-Free

This is a very practical reason for liking Linux, but no less valid: When using Linux, I don’t have to worry about viruses. Viruses are an ever-present threat to Windows–a kind of terrorism for computers.

I won’t brag that there are no viruses for Linux, but I’m fairly certain that there are fewer viruses in active circulation. Any that arise tend to die out quickly, simply because it’s much harder to infect a Linux system due to the way it’s built.

In addition, there’s a secret I read in some of the hacker blogs and sites that are rarely discussed: The kind of people who create viruses respect and like Linux. They don’t want to damage it, either in practical terms or by damaging its reputation. There’s also the fact that Linux is still a minority operating system, and virus writers tend to target the big fishes in the pond.

The lack of viruses means I don’t have to have an annoying antivirus program installed on my computer. There are no irritating pop-ups from the virus checker telling me it’s doing a good job, and no daily/weekly scans rendering my computer almost unusable. (IMHO nearly all antivirus programs on Windows are almost as bad as the viruses they claim to protect the user from.)

  • It’s Free Of Charge

This is another very practical reason, but it really can’t be underestimated. Linux doesn’t cost anything, and everybody in the world, therefore, has access to it. Who can argue with that?

And finally, if I’m to list out my personal liking for Linux I’ll put it in this way,

  • It’s free.
  • I can run on pretty much any hardware.
  • It is highly scalable… I can install it on a 486 or a dual core.
  • Help is readily available and free of charge.
  • Well documented.
  • An inbuilt standard help system that is actually useful (man pages).
  • Powerful CLI.
  • Linux can be configured to run without a GUI for max performance. This is especially useful for servers. Other operating systems don’t have this luxury.
  • Linux will actually give you a reason why something had an error.
  • You can choose which type of desktop environment you want.
  • With Linux, we can compile our own kernel so we don’t have to a wear a one size fits all hat.
  • When you “end a task” in Linux it actually works. (SIGTERM, SIGHUP,SIGKILL).
  • The command line auto complete feature works the way you expect it to.
  • Linux tends not to hide details.
  • You can choose a filesystem that better fits your needs. 
  • When there is a security exploit I can expect a patch immediately on the next day.

 

Linux – Most Popular Operating System

Before Linux

In order to understand the popularity of Linux, we need to travel back in time, about 30 years ago.

Imagine computers as big as houses, even stadiums. While the sizes of those computers posed substantial problems, there was one thing that made this even worse: every computer had a different operating system. Software was always customized to serve a specific purpose, and software for one given system didn’t run on another system. Being able to work with one system didn’t automatically mean that you could work with another. It was difficult, both for the users and the system administrators.

Computers were extremely expensive then, and sacrifices had to be made even after the original purchase just to get the users to understand how they worked. The total cost per unit of computing power was enormous.

Technologically the world was not quite that advanced, so they had to live with the size for another decade. In 1969, a team of developers in the Bell Labs laboratories started working on a solution for the software problem, to address these compatibility issues. They developed a new operating system, which was

Simple and elegant.

Written in the C programming language instead of in assembly code.

Able to recycle code.

The Bell Labs developers named their project “UNIX.”

The code recycling features were very important. Until then, all commercially available computer systems were written in a code specifically developed for one system. UNIX on the other hand needed only a small piece of that special code, which is now commonly named the kernel. This kernel is the only piece of code that needs to be adapted for every specific system and forms the base of the UNIX system. The operating system and all other functions were built around this kernel and written in a higher programming language, C. This language was especially developed for creating the UNIX system. Using this new technique, it was much easier to develop an operating system that could run on many different types of hardware.

The software vendors were quick to adapt, since they could sell ten times more software almost effortlessly. Weird new situations came in existence: imagine for instance computers from different vendors communicating in the same network, or users working on different systems without the need for extra education to use another computer. UNIX did a great deal to help users become compatible with different systems.

Throughout the next couple of decades, the development of UNIX continued. More things became possible to do and more hardware and software vendors added support for UNIX to their products.

UNIX was initially found only in very large environments with mainframes and minicomputers (note that a PC is a “micro” computer). You had to work at a university, for the government or for large financial corporations in order to get your hands on a UNIX system.

But smaller computers were being developed, and by the end of the 80’s, many people had home computers. By that time, there were several versions of UNIX available for the PC architecture, but none of them are truly free and more important: they were all terribly slow, so most people ran MS DOS or Windows 3.1 on their home PCs.

Linus and Linux

By the beginning of the 90s, home PCs were finally powerful enough to run a full blown UNIX. Linus Torvalds, a young man studying computer science at the university of Helsinki, thought it would be a good idea to have some sort of freely available academic version of UNIX, and promptly started to code.

He started to ask questions, looking for answers and solutions that would help him get UNIX on his PC. Below is one of his first posts in comp.os.minix, dating from 1991:

From: torvalds@klaava.Helsinki.FI (Linus Benedict Torvalds)
Newsgroups: comp.os.minix
Subject: Gcc-1.40 and a posix-question
Message-ID: <1991Jul3.100050.9886@klaava.Helsinki.FI>
Date: 3 Jul 91 10:00:50 GMT
Hello netlanders,
Due to a project I’m working on (in minix), I’m interested in the posix
standard definition. Could somebody please point me to a (preferably)
machine-readable format of the latest posix rules? Ftp-sites would be
nice.

From the start, it was Linus’ goal to have a free system that was completely compliant with the original UNIX. That is why he asked for POSIX standards, POSIX still being the standard for UNIX.

In those days plug-and-play wasn’t invented yet, but so many people were interested in having a UNIX system of their own, that this was only a small obstacle. New drivers became available for all kinds of new hardware, at a continuously rising speed. Almost as soon as a new piece of hardware became available, someone bought it and submitted it to the Linux test, as the system was gradually being called, releasing more free code for an ever wider range of hardware. These coders didn’t stop at their PC’s; every piece of hardware they could find was useful for Linux.

Back then, those people were called “nerds” or “freaks”, but it didn’t matter to them, as long as the supported hardware list grew longer and longer. Thanks to these people, Linux is now not only ideal to run on new PC’s, but is also the system of choice for old and exotic hardware that would be useless if Linux didn’t exist.

Two years after Linus’ post, there were 12000 Linux users. The project, popular with hobbyists, grew steadily, all the while staying within the bounds of the POSIX standard. All the features of UNIX were added over the next couple of years, resulting in the mature operating system Linux has become today. Linux is a full UNIX clone, fit for use on workstations as well as on middle-range and high-end servers. Today, a lot of the important players on the hardware and software market each have their team of Linux developers; at your local dealer’s you can even buy pre-installed Linux systems with official support – eventhough there is still a lot of hardware and software that is not supported, too.

Current application of Linux systems

Today Linux has joined the desktop market. Linux developers concentrated on networking and services in the beginning, and office applications have been the last barrier to be taken down. We don’t like to admit that Microsoft is ruling this market, so plenty of alternatives have been started over the last couple of years to make Linux an acceptable choice as a workstation, providing an easy user interface and MS compatible office applications like word processors, spreadsheets, presentations and the like.

On the server side, Linux is well-known as a stable and reliable platform, providing database and trading services for companies like Amazon, the well-known online bookshop, US Post Office, the German army and many others. Especially Internet providers and Internet service providers have grown fond of Linux as firewall, proxy- and web server, and you will find a Linux box within reach of every UNIX system administrator who appreciates a comfortable management station. Clusters of Linux machines are used in the creation of movies such as “Titanic”, “Shrek” and others. In post offices, they are the nerve centers that route mail and in large search engine, clusters are used to perform internet searches.These are only a few of the thousands of heavy-duty jobs that Linux is performing day-to-day across the world.

It is also worth to note that modern Linux not only runs on workstations, mid- and high-end servers but also on “gadgets” like PDA’s, mobiles, a shipload of embedded applications and even on experimental wristwatches. This makes Linux the only operating system in the world covering such a wide range of hardware.

Current development Torvalds continues to direct the development of the kernel. Stallman heads the Free Software Foundation, which in turn supports the GNU components. Finally, individuals and corporations develop third-party non-GNU components. These third-party components comprise a vast body of work and may include both kernel modules and user applications and libraries. Linux vendors and communities combine and distribute the kernel, GNU components, and non-GNU components, with additional package management software in the form of Linux distributions.

Linux is a Unix-like computer operating system assembled under the model of free and open source software development and distribution. The defining component of any Linux system is the Linux kernel, an operating system kernel first released October 5, 1991, by Linus Torvalds. Linux system distributions may vary in many details of system operation, configuration, and software package selections.

Linux runs on a wide variety of computer hardware, including mobile phones, tablet computers, network routers, televisions, video game consoles, desktop computers, mainframes and supercomputers. Linux is a leading server operating system and runs the 10 fastest supercomputers in the world. In addition, more than 90% of today’s supercomputers run some variant of Linux.

The development of Linux is one of the most prominent examples of free and open source software collaboration: the underlying source code may be used, modified, and distributed—commercially or non-commercially—by anyone under licenses such as the GNU General Public License. Typically Linux is packaged in a format known as a Linux distribution for desktop and server use. Some popular mainstream Linux distributions include Debian (and its derivatives such as Ubuntu), Fedora and openSUSE. Linux distributions include the Linux kernel, supporting utilities and libraries and usually a large amount of application software to fulfill the distribution’s intended use.