Unikernel: Another paradigm for the cloud

In this cloud era it is hard to imagine a world without access to services in the cloud. From contacting someone through mail to storing work-related documents on an online drive and accessing them across devices, there are a lot of services we use on a daily basis that live in the cloud.

To reduce the cost of compute power, virtualization has been adopted to offer more services with less hardware. Then came the concept of containers, where you deploy the application in isolated containers built from lightweight images that carry only the few binaries and libraries needed to run it. But we still need underlying VMs to deploy such solutions, and all those VMs come with a cost. While large data-centers offer services in the cloud, they are also hungry for electric power, which is a growing concern as our planet is being drained of its resources. So what we need now are less power-hungry solutions.

What if, instead of virtualizing an entire operating system, you were to load an application with only the required components from the operating system, effectively reducing the size of the virtual machine to its bare minimum resource footprint? This is where unikernels come into play.

Unikernel

Unikernel is a relatively new concept that was first introduced around 2013 by Anil Madhavapeddy in a paper titled “Unikernels: Library Operating Systems for the Cloud” (Madhavapeddy, et al., 2013).

You can find more details on unikernels by searching the scholarly articles on Google.

Unikernels are defined by the community at Unikernel.org as follows.

“Unikernels are specialized, single-address-space machine images constructed by using library operating systems.”

For more detailed reading about the concepts behind unikernels, please follow this link.

A unikernel is an application that has been boiled down to a small, secure, lightweight virtual machine, eliminating the general-purpose operating system such as Linux or Windows. Unikernels aim to be much more secure than Linux, and they pursue this through several thrusts: dropping the notion of users, running a single process per VM, and limiting the amount of code incorporated into each VM. This means that there are no users and no shell to log in to and, more importantly, you can't run anything other than the one program you want to run inside. Despite their relatively young age, unikernels borrow from age-old concepts rooted in the dawn of the computer era: microkernels and library operating systems. A unikernel holds a single application. Single-address-space means that, at its core, the unikernel has no separate user and kernel address spaces. Library operating systems are the core of unikernel systems. Unikernels are provisioned directly on the hypervisor without a traditional system like Linux underneath, so a single server can host many times more unikernel VMs than full-OS VMs.

You can have a look here for details about microkernel, monolithic & library operating systems.

Virtual Machines vs Linux Containers vs Unikernels

Virtualization of services can be implemented in various ways. One of the most widespread methods today is the virtual machine, hosted on hypervisors such as VMware's ESXi or the Linux Foundation's Xen Project.

Hypervisors allow hosting multiple guest operating systems on a single physical machine. These guest operating systems are executed in what are called virtual machines. The widespread use of hypervisors is due to their ability to better distribute and optimize the workload on the physical servers, as opposed to legacy infrastructures of one operating system per physical server.

Containers are another method of virtualization, differentiated from hypervisors in that they create virtualized environments while sharing the host's kernel. This is a lighter approach than hypervisors, which require each guest to carry its own copy of the operating system kernel, making a hypervisor-virtualized environment resource-heavy in contrast to containers, which share parts of the existing operating system.

As aforementioned, unikernels leverage the abstraction of hypervisors in addition to using library operating systems to include only the required kernel routines alongside the application, making them the lightest of all three solutions.

The figure above shows the major difference between the three virtualization technologies. Here we can clearly see that virtual machines present a much larger load on the infrastructure as opposed to containers and unikernels.

Additionally, unikernels are in direct "competition" with containers. By providing services in the form of reduced virtual machines, unikernels improve on the container model with increased security. Because they share the host kernel, containerized applications share the same vulnerabilities as the host operating system. Furthermore, containers do not possess the same level of host/guest isolation as hypervisors/virtual machines, potentially making container breaches more damaging than breaches of either virtual machines or unikernels.

Virtual Machines
  Pros:
  • Allows deploying different operating systems on a single host
  • Complete isolation from the host
  • Orchestration solutions available
  Cons:
  • Requires compute power proportional to the number of instances
  • Requires large infrastructures
  • Each instance loads an entire operating system

Linux Containers
  Pros:
  • Lightweight virtualization
  • Fast boot times
  • Orchestration solutions
  • Dynamic resource allocation
  Cons:
  • Reduced isolation between host and guest due to the shared kernel
  • Less flexible (i.e., dependent on the host kernel)
  • Network is less flexible

Unikernels
  Pros:
  • Lightweight images
  • Specialized applications
  • Complete isolation from the host
  • Higher security against absent functionalities (e.g., remote command execution)
  Cons:
  • Not mature enough yet for production
  • Requires developing applications from the ground up
  • Limited deployment possibilities
  • Lack of complete IDE support
  • Static resource allocation
  • Lack of orchestration tools

A comparison of the three solutions

Docker, containerization technology, and container orchestrators like Kubernetes and OpenShift are two steps forward for the world of DevOps, and the principles they promote are forward-thinking and largely on-target for a more secure, performance-oriented, and easy-to-manage cloud future. However, an alternative approach leveraging unikernels and immutable servers will result in smaller, easier-to-manage, more secure images that will be simpler for existing enterprises to adopt. As DevOps matures, the shortcomings of cloud application deployment and management are becoming clear. Virtual machine image bloat, large attack surfaces, legacy executables, base-OS fragmentation, and an unclear division of responsibilities between development and IT for cloud deployments are all causing significant friction (and opportunities for the future).

For example: it remains virtually impossible to create a Ruby or Python web server virtual machine image that DOESN'T include build tools (gcc), ssh, and multiple latent shell executables. All of these components are detrimental to production systems: they increase image size, increase attack surface, and increase maintenance overhead.

Compared to VMs running operating systems like Windows and Linux, a unikernel is claimed to have a tiny fraction (on the order of a tenth of 1%) of the attack surface. In a unikernel, tools such as sysdig, tcpdump, and mysql-client are not installed, and you can't just "apt-get install" them either: an attacker has to bring them along with the exploit. To take it further, even a simple cat /etc/hosts or a grep of /var/log/nginx/access.log simply won't work, because those would be separate processes.
So unikernels are highly resistant to remote code execution attacks, more specifically shell-code exploits.

Immutable Servers & Unikernels

Immutable Servers are a deployment model that mandates that no application updates, security patches, or configuration changes happen on production systems. If any of these layers needs to be modified, a new image is constructed, pushed, and cycled into production. Heroku is a great example of immutable servers in action: every change to your application requires a 'git push' to overwrite the existing version. The advantages of this approach include higher confidence in the code running in production, integration of testing into deployment workflows, and easy verification that systems have not been compromised.

Once you become a believer in the concept of immutable servers, then speed of deployment and minimizing vulnerability surface area become objectives. Containers promote the idea of single-service-per-container (microservices), and unikernels take this idea even further.

Unikernels allow you to compile and link your application code all the way down to, and including, the operating system. For example, if your application doesn't require persistent disk access, no device drivers or OS facilities for disk access would even be included in the final production image. Since unikernels are designed to run on hypervisors such as Xen, they only need interfaces to standardized resources such as networking and persistence. Device drivers for thousands of displays, disks, and network cards are completely unnecessary. Production systems become minimalist, requiring only the application code, the runtime environment, and the OS facilities the application actually uses. The net effect is smaller VM images with less surface area that can be deployed faster and maintained more easily.
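To make this concrete, here is a hedged sketch of the workflow with the Nanos unikernel and its OPS tool: you build an ordinary Linux binary and let OPS wrap it into a bootable image (the names server and main.go are illustrative assumptions; see the OPS documentation for exact usage):

$ GOOS=linux go build -o server main.go
$ ops run server

OPS assembles a minimal image around the binary and boots it under QEMU/KVM; no general-purpose OS is involved.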

Traditional operating systems (Linux, Windows) will become extinct on servers. They will be replaced with single-user, bare-metal hypervisors optimized for the specific hardware, taking decades of multi-user, hardware-agnostic code cruft with them. A more mature build-deploy-manage tool set based on these technologies will be truly game-changing for hosted and enterprise clouds alike.

ClickOS: C++; targets Xen; Network Function Virtualization
HalVM: Haskell; targets Xen
IncludeOS: C++; targets KVM, VirtualBox, ESXi, Google Cloud, OpenStack; orchestration tool available
MirageOS: OCaml; targets KVM, Xen, RTOS/MCU
Nanos Unikernel: C, C++, Go, Java, Node.js, Python, Rust, Ruby, PHP, etc.; targets QEMU/KVM; orchestration tool available
OSv: Java, C, C++, Node, Ruby; targets VirtualBox, ESXi, KVM, Amazon EC2, Google Cloud; cloud and IoT (ARM)
Rumprun: C, C++, Erlang, Go, Java, JavaScript, Node.js, Python, Ruby, Rust; targets Xen, KVM
Unik: Go, Node.js, Java, C, C++, Python, OCaml; targets VirtualBox, ESXi, KVM, Xen, Amazon EC2, Google Cloud, OpenStack, PhotonController; unikernel compiler toolbox with orchestration possible through Kubernetes and Cloud Foundry
ToroKernel: FreePascal; targets VirtualBox, KVM, Xen, Hyper-V; unikernel dedicated to running microservices
Comparing a few unikernel solutions from active projects (languages; hypervisor targets; remarks)

Out of the various existing projects, some stand out due to their wide range of supported languages. For the active projects, the table above describes the languages they support, the hypervisors they can run on, and remarks concerning their functionality.

I am currently experimenting with unikernels on AWS and Google Cloud Platform and will share the results in another post soon.

Source: Medium, github, containerjournal, linuxjournal

Linux Inside Win 10

I am a zealous fan of Linux and FOSS. I have been using Linux and its TUI with the bash shell for more than seventeen years. When I moved to my new role I found it a bit difficult to have a Windows 10 laptop, and I was literally fumbling with PowerShell and the cmd line when I tried working with tools like terraform, git, etc. But luckily I figured out a solution for old-school *NIX users like me who are forced to use a Windows laptop, and that solution is WSL.

Windows Subsystem for Linux, a.k.a. WSL, is an environment in Windows 10 for running unmodified Linux binaries, in a way similar to Linux Containers. Please go through my earlier post on LinuxContainers for more details. When it was first introduced, WSL ran Linux binaries by implementing a Linux API compatibility layer partly in the Windows kernel. The second iteration, WSL 2, uses the Linux kernel itself in a lightweight VM to provide better compatibility with native Linux installations.

To use WSL in Win 10, you have to enable the WSL feature from the Windows optional features. Being an aficionado of the command line rather than the GUI, I'll now list the step-by-step commands to enable WSL, install your favourite Linux distribution, and use it.

  1. Open PowerShell as administrator and execute the command below:
     Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux
     Note: Restart is required when prompted
  2. Download the distro package with the command below:
     Invoke-WebRequest -Uri https://aka.ms/wsl-ubuntu-1804 -OutFile Ubuntu.appx -UseBasicParsing

In the above command you can use any of the Linux distros available for WSL. For example, if you want to use Kali Linux instead of Ubuntu, change the distro URL in the step 2 command to "https://aka.ms/wsl-kali-linux". Please refer to this guide for all available distros.

Once the download is completed, you need to add that package to the Windows 10 application list which can be done using the command below.

  3. Add-AppxPackage .\Ubuntu.appx

Once all these steps are completed, type Ubuntu into the Win 10 search (the magnifying glass in the bottom left corner, near the Windows logo) and you will see your Ubuntu entry (search for whichever WSL distro you added). To complete the initialization of your newly installed distro, launch a new instance, which in our case means selecting and running Ubuntu from the search results.

This will start the installation of your chosen Linux distro's binaries and libraries. It will take some time to complete the installation and configuration (approximately 5 to 10 minutes, depending on your laptop/desktop configuration).

Once installation is complete, you will be prompted to create a new user account (and its password).

Note: You can choose any username and password you wish – they have no bearing on your Windows username.

If everything goes well, we'll have Ubuntu installed as a subsystem. When you open a new distro instance, you won't be prompted for your password, but if you elevate your privileges using sudo, you will need to enter it.

The next step is updating your package catalog and upgrading your installed packages with the Ubuntu package manager, apt. To do so, execute the command below at the prompt.

$ sudo apt update && sudo apt upgrade

Now we will proceed to install Git.

$ sudo apt install git

$ git --version

Finally, test your git installation with the command above. You can also install other tools and packages available in the repository; I have installed git, terraform, aws-cli, azure-cli, and ansible. You can set up Python, Ruby, and Go programming environments as well, and Python pip and Ruby gem installations are also supported. You can use this subsystem as an alternative for your day-to-day Linux operations, and as an alternative terminal to PowerShell if you are using Sublime Text, Atom, or Visual Studio Code.
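As a closing usage sketch, you can also invoke Linux commands directly from PowerShell through wsl.exe, and your Windows drives appear under /mnt inside the subsystem (the path below is an illustrative assumption):

wsl uname -a
wsl ls -la /mnt/c/Users

This makes it easy to mix Linux tooling into Windows-side scripts without opening a separate distro window.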

Arch installation – Part 1

This is a continuation of my earlier post on Arch Linux. In this post we'll try to install and tame Arch Linux in the easiest and quickest way.

The default installation covers only a minimal base system and expects the end user to configure and use it. Based on the KISS (Keep It Simple, Stupid!) principle, Arch Linux focuses on elegance, code correctness, minimalism, and simplicity.

Arch Linux follows a rolling release model and has its own package manager, pacman. With the aim of providing a cutting-edge operating system, Arch always keeps its repositories up to date. The fact that it provides a minimal base system gives you the choice to install it even on low-end hardware and then install only the required packages over it.
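For example, keeping the whole rolling-release system current is a single pacman command (a sketch: -S synchronizes packages, -y refreshes the package databases, and -u upgrades everything that is out of date):

$ sudo pacman -Syu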

Also, it's one of the most popular distros for learning Linux from scratch. If you like to experiment with a DIY attitude, you should give Arch Linux a try. It's what many Linux users consider the core Linux experience.

You have been warned…!!! The method we are going to discuss here wipes out the existing operating system(s) from your computer and installs Arch Linux in their place. So if you are going to follow this tutorial, make sure that you have backed up your files (or else you'll lose all of them), or install into a fresh VM (you can deploy a VM easily with VirtualBox, VMware Workstation, KVM, Xen, etc.; deploying a VM with any of those tools is completely out of scope for this post). Again, I would recommend you try this installation on an old laptop or a new VM.

Requirements for installing Arch Linux:

  • An x86_64 (i.e. 64-bit) compatible machine
  • Minimum 512 MB of RAM (recommended 2 GB)
  • At least 1 GB of free disk space (recommended 20 GB for basic usage)
  • An active internet connection
  • A USB drive with minimum 2 GB of storage capacity
  • Familiarity with Linux command line

You have been warned..!!! If you are a person addicted to the GUI and don't want to do a DIY kind of Linux installation, IMHO, please stop reading here and go with Arch derivatives like Manjaro.

Step 1: Download the ISO

You can download the ISO from the official website.
I prefer using the magnet link and downloading the ISO with a torrent client.

Step 2: Create a live USB of Arch Linux

We will have to create a live USB of Arch Linux from the ISO you just downloaded.

If you are on Linux, you can use the dd command to create a live USB, but I recommend trying Balena Etcher; it is my favourite way of doing it. Etcher is a portable piece of software and can be used on Windows/macOS/Linux.

You can download it from the link here
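If you do go the dd route on Linux, a minimal sketch looks like this, assuming the ISO is in the current directory and the USB stick shows up as /dev/sdX; double-check the device name with lsblk first, because dd will overwrite whatever it points at:

$ lsblk
$ sudo dd if=archlinux-x86_64.iso of=/dev/sdX bs=4M status=progress && sync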

Once you have created a live USB for Arch Linux, shut down your PC. Plug in your USB and boot your system. While booting, keep pressing ESC, F2, F10 or F12 (depending upon your system) to go into the boot settings. In there, select booting from USB or removable disk. You will be presented with a Linux terminal, automatically logged in as the root user.

As the initial steps are completed, we'll discuss the installation part in my next post.

Demystifying Arch

There is always a lot of buzz around the Arch Linux distribution in the Linux community. Those who are switching to Linux as a desktop operating system are very reluctant to go the Arch way, and so are many who already have experience with Linux. IMHO it's only because of the mystery prevailing around this distro. In this blog we'll try to address some of those concerns and help everyone embrace the Arch world without any worries.

Arch principles are: Simplicity, Modernity, Pragmatism, User Centrality and Versatility.

Arch Linux is an independently developed, x86-64 general-purpose GNU/Linux distribution that strives to provide the latest stable versions of most software by following a rolling-release model. The default installation is a minimal base system, configured by the user to add only what is purposely required. In other words, Arch is a DIY Linux installation where the user has full control over each and every component that gets installed and configured.

The original creator of Arch Linux, Judd Vinet, was in total love with the simplicity and elegance of Slackware, BSD, etc. He wanted a distribution that could be DIY: you make it yourself from the ground up, and it only has what you want it to have.

The solution was a shell-based manual installation instead of the automatic ncurses or graphical installations that teach you almost nothing and leave you knowing nothing about what exactly was installed on the system. The shell-based installer of Arch Linux means that you do everything using commands, so you know exactly what is happening, since you are the one doing it. You can customize everything right from the pre-installation steps. You don't "install the OS" in Arch; you download the base system and install it into the root partition/directory. Then you install the other packages you need, either by chrooting into the system or by booting into the new system, which will be just a terminal, since a base-system install is all you've done. As you install and configure each package, you must know what your required packages are and how they are installed and configured. That is how Arch Linux helps you build your own custom Linux (DIY) and learn by doing, breaking & repeating.

Arch Linux doesn't care about being easy to set up in the Ubuntu/Debian/Red Hat/Fedora style. But there do exist easy-to-install variants if anyone wants a touch and feel of Arch Linux; Antergos & Manjaro are essentially Arch with a graphical installer.

Put more simply, a typical Arch installation looks like this:

  • Download the ISO image
  • Burn the image to a DVD/USB
  • Boot Arch from the media
  • Create disk partitions
  • Set up the network with/without DHCP, including wired or wireless
  • Optimize gcc for the specific CPU
  • Configure/compile the Linux kernel & modules
  • Install the base packages
  • Environment configuration
  • Install necessary software/tools/applications
  • Set up the X server and GUI

In one word: the installation bundles in distributions like Debian and CentOS do all of the above in a more user-friendly way, or even automatically, whereas in Arch Linux you do it manually, step by step. But it is not as hard as it sounds. We'll walk through the installation in the next post.
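Just to give a flavour of those manual steps before the full walkthrough, here is a hedged sketch of the heart of the install, assuming your partitions are already created and mounted under /mnt:

# pacstrap /mnt base linux linux-firmware
# genfstab -U /mnt >> /mnt/etc/fstab
# arch-chroot /mnt

pacstrap pulls the base system into the new root, genfstab records the mounts, and arch-chroot drops you into the new system to configure it.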

Linux Containers

The Linux Containers project or LinuX Containers (LXC) is an open source container platform that provides a set of tools, templates, libraries, and language bindings. LXC has a simple command line interface that improves the user experience when starting containers. LXC is a user-space interface for the Linux kernel containment features. Through a powerful API and simple tools, it lets Linux users easily create and manage system or application containers.

LXC offers an operating-system-level virtualization environment that can be installed on many Linux-based systems. Your Linux distribution may have it available through its package repository. LXC is free software: most of the code is released under the terms of the GNU LGPLv2.1+ license, some Android compatibility bits are released under a standard 2-clause BSD license, and some binaries and templates are released under the GNU GPLv2 license.

LXC is an OS-level virtualization technology that allows the creation and running of multiple isolated Linux virtual environments (VEs) on a single control host. These isolated containers can be used either to sandbox specific applications or to emulate an entirely new host. LXC relies on the kernel's cgroups functionality, introduced in kernel version 2.6.24, to partition and limit resources such as CPU and memory, together with kernel namespaces to give each container an isolated view of the system. Note that a VE is distinct from a virtual machine (VM).

The idea of what we now call container technology first appeared in 2000 as FreeBSD jails, a technology that allows the partitioning of a FreeBSD system into multiple subsystems, or jails. Jails were developed as safe environments that a system administrator could share with multiple users inside or outside of an organization.

In 2001, an implementation of an isolated environment made its way into Linux, by way of Jacques Gélinas’ VServer project. Once this foundation was set for multiple controlled userspaces in Linux, pieces began to fall into place to form what is today’s Linux container.

Very quickly, more technologies combined to make this isolated approach a reality. Control groups (cgroups) is a kernel feature that controls and limits resource usage for a process or groups of processes. And systemd, an initialization system that sets up the userspace and manages its processes, uses cgroups to provide greater control over these isolated processes. Both of these technologies, while adding overall control for Linux, were the framework for how environments could successfully stay separated.

Current LXC uses the following kernel features to contain processes:

  • Kernel namespaces (ipc, uts, mount, pid, network and user)
  • AppArmor and SELinux profiles
  • Seccomp policies
  • Chroots (using pivot_root)
  • Kernel capabilities
  • CGroups (control groups)

LXC containers are often considered as something in the middle between a chroot and a full-fledged virtual machine. The goal of LXC is to create an environment as close as possible to a standard Linux installation but without the need for a separate kernel. LXC is currently made of a few separate components:

  • The liblxc library
  • Several language bindings for the API:
      • python3 (in-tree, long-term support in 2.0.x)
      • lua (in-tree, long-term support in 2.0.x)
      • Go
      • ruby
      • python2
      • Haskell
  • A set of standard tools to control the containers
  • Distribution container templates

Now we will quickly explore the LXC containers.

For most modern Linux distributions, the kernel is enabled with cgroups, but you most likely will need to install the LXC utilities.

If you’re using Red Hat or CentOS, you’ll need to install the EPEL repositories first. For other distributions, such as Ubuntu or Debian, simply type:

$ sudo apt-get install lxc
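On Red Hat or CentOS the equivalent would be something like the following (a sketch; package names can vary by release):

$ sudo yum install epel-release
$ sudo yum install lxc lxc-templates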

Create the ~/.config/lxc directory if it doesn’t already exist, and copy the /etc/lxc/default.conf configuration file to ~/.config/lxc/default.conf. Append the following two lines to the end of the file:

lxc.id_map = u 0 100000 65536
lxc.id_map = g 0 100000 65536

Append the following to the /etc/lxc/lxc-usernet file (replace the first column with your user name):

akbharat veth lxcbr0 10

The quickest way for these settings to take effect is either to reboot the host or log the user out and then log back in.

Once logged back in, verify that the veth networking driver is currently loaded:

$ lsmod|grep veth
veth 16384 0

$ sudo modprobe veth

If you don't see the veth module loaded, you can use the command above to load it into the kernel.

Next, download a container image and name it “my-container”. When you type the following command, you’ll see a long list of supported containers under many Linux distributions and versions:

$ sudo lxc-create -t download -n my-container

You’ll be given three prompts to pick the distribution, release and architecture. I chose the following:

Distribution: ubuntu
Release: xenial
Architecture: amd64

Once you press Enter, the rootfs will be downloaded locally and configured. For security reasons, each container does not ship with an OpenSSH server or user accounts, and a default root password is not provided. In order to change the root password and log in, you must run either lxc-attach or chroot into the container directory path (after it has been started). We will use the command below to start the container:

$ sudo lxc-start -n my-container -d

The -d option daemonizes the container, and it will run in the background. If you want to observe the boot process, replace the -d with -F, and it will run in the foreground, ending at a login prompt.
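For example, the foreground variant is:

$ sudo lxc-start -n my-container -F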

Open up a second terminal window and verify the status of the container:

$ sudo lxc-info -n my-container
Name: my-container
State: RUNNING
PID: 1356
IP: 10.0.3.28
CPU use: 0.29 seconds
BlkIO use: 16.80 MiB
Memory use: 29.02 MiB
KMem use: 0 bytes
Link: vethPRK7YU
TX bytes: 1.34 KiB
RX bytes: 2.09 KiB
Total bytes: 3.43 KiB

There is also another way to see a list of all installed containers, which is provided below:

$ sudo lxc-ls -f
NAME STATE AUTOSTART GROUPS IPV4 IPV6
my-container RUNNING 0 - 10.0.3.28 -

The container is running, but you cannot use it yet. To work with it, attach to the container directly using the LXC tool set:

$ sudo lxc-attach -n my-container
root@my-container:/#

If you now set a username from within the container, you can use the lxc-console command to connect to the container with your newly created username and password.

$ sudo lxc-console -n my-container

If you want to connect to the container using SSH, you can install the OpenSSH server in the container, figure out the container's IP address with the commands below, and connect to that IP from your Linux host.

root@my-container:/# apt-get install openssh-server

root@my-container:/# ip addr show eth0|grep inet

From your Linux host now ssh to the container using the IP we captured from the earlier command.

$ ssh 10.0.3.28

On the host system, and not within the container, it’s interesting to observe which LXC processes are initiated and running after launching a container:

$ ps aux|grep lxc|grep -v grep
root 861 0.0 0.0 234772 1368 ? Ssl 11:01 0:00 /usr/bin/lxcfs /var/lib/lxcfs/
lxc-dns+ 1155 0.0 0.1 52868 2908 ? S 11:01 0:00 dnsmasq -u lxc-dnsmasq --strict-order --bind-interfaces --pid-file=/run/lxc/dnsmasq.pid --listen-address 10.0.3.1 --dhcp-range 10.0.3.2,10.0.3.254 --dhcp-lease-max=253 --dhcp-no-override --except-interface=lo --interface=lxcbr0 --dhcp-leasefile=/var/lib/misc/dnsmasq.lxcbr0.leases --dhcp-authoritative
root 1196 0.0 0.1 54484 3928 ? Ss 11:01 0:00 [lxc monitor] /var/lib/lxc my-container
root 1658 0.0 0.1 54780 3960 pts/1 S+ 11:02 0:00 sudo lxc-attach -n my-container
root 1660 0.0 0.2 54464 4900 pts/1 S+ 11:02 0:00 lxc-attach -n my-container

And finally, to stop the container and verify that it is stopped, we can use the following commands in sequence:

$ sudo lxc-stop -n my-container

$ sudo lxc-ls -f
NAME STATE AUTOSTART GROUPS IPV4 IPV6
my-container STOPPED 0 - - -

$ sudo lxc-info -n my-container
Name: my-container
State: STOPPED

You might also want to destroy the container:

$ sudo lxc-destroy -n my-container
Destroyed container my-container

This is a very basic example of container capability. In the modern container era, containerization is much more sophisticated, and Docker is a significant improvement on LXC's capabilities. Its obvious advantages are gaining Docker a growing following of adherents. In fact, it is getting dangerously close to negating the advantage of VMs over VEs, because of its ability to quickly and easily transfer and replicate any Docker-created package. We'll discuss more on Docker in my next post.

Linux Interview Questions

These are some of the interview questions we were using while conducting L1, L2 & L3 level interviews. I'll not be posting answers to these questions, as it's up to you to search for and find the answers; that way you'll understand the topic and remember the answers for good.

What is the difference between kernel image and initrd image?

Why is root file system mounted read-only initially?

What will happen if it is mounted read-write?

What are the process states in Linux?

Which log files holds the information of currently logged users?

What is called a fork bomb? What is its relevance?

What is the difference between commands last and lastb?

How do you ensure your users have hard-to-guess passwords?

What is logrotate and why is it used in a Linux system?

How will you increase the priority of a process?

What is the range for the priority of the processes?

I have a file named “/ \+Xy \+\8_/ \” How do I get rid of it?

What does the command pvscan do?

How will you probe for a new SCSI drive?

How will you take a system backup image for recovery?

What will the command chage -E -1 do?

What will the option -a do for usermod?

What is the command gpasswd?

When debugging a core in gdb, what does the command `bt` give?

What do you mean by man (1) (5) and (8)?

How many fields in crontab and what are those?

To display a list of all manual pages containing the keyword "date", which command can be used?

What do you mean by restricted deletion flag?

What’s the difference between `telnet` and `ssh`? What’s a good use for each?

What will vgcfgbackup and vgcfgrestore do and can you explain with a scenario?

What is the difference between insmod and modprobe?

What to do if the newly built kernel does not boot?

What are TCP wrappers? What is the library responsible for TCP wrappers?

In how many ways you can block the traffic to ssh from a specific host?

How will you run ssh in debugging mode to identify a failure in establishing a connection?

Which file is responsible for kernel parameter changes?

What do you mean by hardening of a Linux server? How will you achieve this?

How will you tune a Linux server for maximum performance?

What is the use of the commands clustat and clusvcadm?

What is multipathing? Why is it used?

What is black-listing of devices in multipath?

Which command will display the current multipath configuration gathered from sysfs, the device mapper, and all other available components on the system?

What is multipath interactive console?

What is GFS?

How will you create a GFS file system?

What is called NIC bonding or teaming?

How will you repair a GFS file system?

 

SHELL SCRIPTING

How do you read arguments in a shell program?

How do you send a mail message to somebody within a script?

Which command is used to count lines and/or characters in a file?

How do you start a process in the background?

How will you define a function in a shell script?

How do you get character positions 10-20 from a text file?

What do you mean by getopts?

How do you test for file properties in a shell script?

How will you define an array?

How can you take in key-board inputs to a shell script?

What is meant by traps in a shell scripts?

Is it possible to use alphanumeric values in a for loop?

Write a script to check whether a string is a palindrome.

Write a script to check the number of machines that are up on the network (provided a list of hosts is available in a file).

If you have the words f1;f2;f3, write a script to display f3 first, then f1, and finally f2.

Print the numbers 1 to 1000 in the format 0001, 0002, 0003, ..., 1000.

Magic SysRq key of Linux

The magic SysRq key is a key combination in the Linux kernel which allows the user to perform various low-level commands regardless of the system’s state.

It is often used to recover from freezes or to reboot a computer without corrupting the filesystem. The key combination consists of Alt+SysRq+commandkey. In many systems, the SysRq key is the PrintScreen key.

First, you need to enable the SysRq key, as shown below.

# echo "1" > /proc/sys/kernel/sysrq
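This setting is lost on reboot. To make it persistent, you can append it to /etc/sysctl.conf and reload (a sketch):

# echo "kernel.sysrq = 1" >> /etc/sysctl.conf
# sysctl -p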

List of SysRq Command Keys

Following are the command keys available for Alt+SysRq+commandkey.

'k' – Kills all the processes running on the current virtual console.
's' – Attempts to sync all mounted file systems.
'b' – Immediately reboots the system, without unmounting partitions or syncing.
'e' – Sends SIGTERM to all processes except init.
'm' – Outputs current memory information to the console.
'i' – Sends SIGKILL to all processes except init.
'r' – Switches the keyboard from raw mode (the mode used by programs such as X11) to XLATE mode.
't' – Outputs a list of current tasks and their information to the console.
'u' – Remounts all mounted filesystems read-only.
'o' – Shuts down the system immediately.
'p' – Prints the current registers and flags to the console.
'0'-'9' – Sets the console log level, controlling which kernel messages will be printed to your console.
'f' – Calls oom_kill to kill the process taking the most memory.
'h' – Displays help; any key other than those listed above will also print help.

We can also do this by echoing the keys to the /proc/sysrq-trigger file. For example, to reboot a system you can perform the following.

# echo "b" > /proc/sysrq-trigger

Perform a Safe reboot of Linux using Magic SysRq Key

To perform a safe reboot of a Linux computer which has hung, do the following. This will avoid an fsck during the next boot. Press Alt+SysRq+<letter>, with the letters highlighted below, in order:

unRaw (take control of the keyboard back from X11),
tErminate (send SIGTERM to all processes, allowing them to terminate gracefully),
kIll (send SIGKILL to all processes, forcing them to terminate immediately),
Sync (flush data to disk),
Unmount (remount all filesystems read-only),
reBoot.

Finding the ‘find’ command

The common problem users or admins run into when first dealing with a Linux machine is how to find the files they are looking for.

Here we discuss the GNU/Linux find command, one of the most important and most-used commands on Linux systems. find is used to search for and locate files and directories based on conditions you specify. It can match a variety of criteria: permissions, users, groups, file type, date, size, and more.

1. Find files under the current directory or in a specific directory:

ajoy@testserver:~$ find . -name file2.txt 
./file2.txt
ajoy@testserver:~$ find /home/ajoy/ -name file5.txt 
/home/ajoy/file5.txt
ajoy@testserver:~$

2. Find only files or only directories

ajoy@testserver:~$ find /home/ajoy -type f -name file1.txt
/home/ajoy/file1.txt
ajoy@testserver:~$ find /home/ajoy -type f -name "*.txt"
/home/ajoy/File1.txt
/home/ajoy/File5.txt
/home/ajoy/file3.txt
/home/ajoy/File2.txt
==result truncated=====
/home/ajoy/file9.txt
/home/ajoy/File9.txt
/home/ajoy/File7.txt
ajoy@testserver:~$

In the above example, if we replace -type f with -type d, find will look only for directories.

ajoy@testserver:~$ find . -type d -name "test"
./test
ajoy@testserver:~$

3. Find files ignoring the case

ajoy@testserver:~$ find . -iname file2.txt
./file2.txt
ajoy@testserver:/tmp$ find /home/ajoy/ -iname file5.txt
/home/ajoy/File5.txt
/home/ajoy/file5.txt
ajoy@testserver:~$

4. Find files while limiting directory traversal

ajoy@testserver:~$ find test/ -maxdepth 3 -name "*.py"
test/subdir1/subdir2/github_summary.py
test/subdir1/github_summary.py
test/github_summary.py
ajoy@testserver:~$ find test/ -maxdepth 2 -name "*.py"
test/subdir1/github_summary.py
test/github_summary.py
ajoy@testserver:~$

5. Find files by inverting the match

ajoy@testserver:~$ find test/ -not -name "*.py"
test/
test/File1.txt
test/File5.txt
test/File2.txt
test/subdir1
test/subdir1/subdir2
test/File3.txt
test/File6.txt
test/File4.txt
test/File8.txt
test/File9.txt
test/File7.txt
ajoy@testserver:~$

6. Find with multiple search criteria

ajoy@testserver:~$ find test/ -name "*.txt" -o -name "*.py"
test/File1.txt
test/File5.txt
test/File2.txt
test/subdir1/subdir2/github_summary.py
test/subdir1/github_summary.py
test/File3.txt
test/File6.txt
test/File4.txt
test/File8.txt
test/github_summary.py
test/File9.txt
test/File7.txt
ajoy@testserver:~$

7. Find files with certain permissions

ajoy@testserver:~$ find . -type f -perm 0664
./file3.txt
./file6.txt
./file1.txt
./.gitconfig
./file5.txt
./file4.txt
./file2.txt
./file8.txt
./file7.txt
ajoy@testserver:~$

ajoy@testserver:~$ sudo find / -maxdepth 2 -perm /u=s 2>/dev/null
/bin/ping6
/bin/ping
/bin/fusermount
/bin/mount
/bin/umount
/bin/su
ajoy@testserver:~$

The above example shows files with the setuid (suid) permission set. The /dev/null bit bucket is used to discard permission-related errors while find traverses the directories.

ajoy@testserver:~$ sudo find / -maxdepth 2 -type d -perm /o=t 2>/dev/null
/tmp
/var/tmp
/var/crash
/run/shm
/run/lock
ajoy@testserver:~$

The above example shows the directories with the sticky bit set.

8. Find files based on users and groups

ajoy@testserver:~$ sudo find /var -user www-data
/var/cache/apache2/mod_cache_disk
ajoy@testserver:~$

ajoy@testserver:~$ sudo find /var -group crontab
/var/spool/cron/crontabs
ajoy@testserver:~$

9. Find files by access, modification, and change time

ajoy@testserver:~$ find / -maxdepth 2 -mtime 50 ==> last modified 50 days back

ajoy@testserver:~$ find / -maxdepth 2 -atime 50 ==> last accessed 50 days back

ajoy@testserver:~$ find /  -mtime +50 -mtime -100 ==> modified between 50 to 100 days ago

ajoy@testserver:~$ find .  -cmin -60 ==> changed in last 60 minutes (1 hour)

ajoy@testserver:~$ find / -mmin -60 ==> modified in last 60 minutes

ajoy@testserver:~$ find / -amin -60 ==> accessed in last 60 minutes

10. Find files based on size

ajoy@testserver:~$ find . -size 50M ==> all files of 50 MB (use the M suffix; without it, -size counts 512-byte blocks)

ajoy@testserver:~$ find / -size +50M -size -100M ==> all files greater than 50 MB & less than 100 MB

ajoy@testserver:~$ find /var -type f -empty ==> empty file

ajoy@testserver:~$ find /var -type d -empty ==> empty directory

11. Listing out files found with find

ajoy@testserver:~$ find . -maxdepth 1 -name "*.txt" -exec ls -l {} \;
-rw-rw-r-- 1 ajoy ajoy 0 Jan 31 17:40 ./file3.txt
-rw-rw-r-- 1 ajoy ajoy 0 Jan 31 17:40 ./file6.txt
-rw-rw-r-- 1 ajoy ajoy 0 Jan 31 17:40 ./file1.txt
-rw-rw-r-- 1 ajoy ajoy 0 Jan 31 17:40 ./file5.txt
-rw-rw-r-- 1 ajoy ajoy 0 Jan 31 17:40 ./file4.txt
-rw-rw-r-- 1 ajoy ajoy 0 Jan 31 17:40 ./file2.txt
-rw-rw-r-- 1 ajoy ajoy 0 Jan 31 17:40 ./file8.txt
-rw-rw-r-- 1 ajoy ajoy 0 Jan 31 17:40 ./file7.txt
-rw-rw-r-- 1 ajoy ajoy 0 Jan 31 17:40 ./file9.txt
ajoy@testserver:~$

The above command will long-list the files matching the find criteria, and the one given below will remove the files matching the given criteria:

ajoy@testserver:~$ find . -maxdepth 1 -name "*.txt" -exec rm -f {} \;
ajoy@testserver:~$ ls
script.sh test
ajoy@testserver:~$

12. Finding the smallest and largest files

The 4 largest files under /var and its subdirectories:

ajoy@testserver:~$ sudo find /var -type f -exec ls -l {} \; |sort -n -r -k5 |head -4
-rwxr-xr-x 1 root root 998 Dec 5 2012 /var/lib/dpkg/info/sgml-base.preinst
-rwxr-xr-x 1 root root 991 Mar 25 2013 /var/lib/dpkg/info/ureadahead.postinst
-rwxr-xr-x 1 root root 980 Sep 23 2014 /var/lib/dpkg/info/man-db.postrm
-rwxr-xr-x 1 root root 972 May 15 2014 /var/lib/dpkg/info/locales.preinst

The 2 smallest files under /var and its subdirectories:

ajoy@testserver:~$ sudo find /var -type f -exec ls -l {} \; |sort -n -k5 |head -2
-rw——- 1 daemon daemon 2 Jan 31 16:36 /var/spool/cron/atjobs/.SEQ
-rw——- 1 root ajoy 40 Jan 31 20:16 /var/lib/sudo/ajoy/0
ajoy@testserver:~$

Command Line Tips and Tricks

To find the largest file:

  • Search under /home for files greater than 100MB, where c stands for bytes.

# find /home -size +104857600c

  • The following command will output all the files under /usr/share that are more than 50MB, in the format "file name : file size":

# find /usr/share -type f -size +50000k -exec ls -lh {} \; | awk '{ print $9 ": " $5 }'

  • The script below will display every directory under the directory given as an argument, with its size, sorted with the largest at the bottom.

Add the following lines to a file and give the file execute permission,

#!/bin/bash

du -sm $(find $1 -maxdepth 1 -xdev -type d) | sort -g

Then do a chmod +x <filename> to assign execute permission
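You can then run it against any directory; for example, assuming you saved the script as dirsizes.sh:

# ./dirsizes.sh /var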

  • The following two commands help you detect empty files.

# find . -size 0c

or,

# find . -empty 

To Fix a Corrupt RPM database:

Execute the below commands as root user

# rm /var/lib/rpm/__db*

# rpm -v --rebuilddb

Checking for SELinux:

# getenforce

This command will tell whether SELinux is in Enforcing, Permissive or Disabled mode.

# ls -Z

This command helps in identifying the SELinux context of files and folders.

# echo 0 > /selinux/enforce

This helps to change SELinux from enforcing to permissive temporarily (echo 1 > /selinux/enforce sets it back to enforcing).

To disable SELinux permanently, open the following file in your favourite editor and change the SELINUX directive to disabled.

# vi /etc/selinux/config

SELINUX=disabled

Increase the semaphore count in Linux:

# ipcs -a

------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 0 gdm 600 393216 2 dest
0x00000000 32769 gdm 600 393216 2 dest
0x00000000 65538 gdm 600 393216 2 dest
0x00000000 98307 gdm 600 393216 2 dest

------ Semaphore Arrays --------
key semid owner perms nsems
0x00000000 0 root 600 1
0x00000000 65537 root 600 1

------ Message Queues --------
key msqid owner perms used-bytes messages

# sysctl -a |grep sem
kernel.sem = 250 32000 32 128

# sysctl -w kernel.sem=300
kernel.sem = 300

# sysctl -a |grep sem
kernel.sem = 300 32000 32 128

We need to edit the /etc/sysctl.conf file to make these changes persistent across reboots.
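For example (a sketch), append the desired value to /etc/sysctl.conf and reload it:

# echo "kernel.sem = 300 32000 32 128" >> /etc/sysctl.conf
# sysctl -p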

The quick way to find the hardware model of your server:

# dmidecode -t 1

  • An example output,

# dmidecode 2.11
SMBIOS 2.4 present.

Handle 0x0001, DMI type 1, 27 bytes
System Information
Manufacturer: VMware, Inc.
Product Name: VMware Virtual Platform
Version: None
Serial Number: VMware-56 4d 42 b0 ee 0c 84 de-af 2d 47 d3 95 40 70 ef
UUID: 564D42B0-EE0C-84DE-AF2D-47D3954070EF
Wake-up Type: Power Switch
SKU Number: Not Specified
Family: Not Specified

  • Another example output,

# dmidecode 3.0
Scanning /dev/mem for entry point.
SMBIOS 2.4 present.

Handle 0x0100, DMI type 1, 27 bytes
System Information
Manufacturer: Xen
Product Name: HVM domU
Version: 4.2.amazon
Serial Number: ec2c4ad4-1afd-abba-aa66-74e921a4a3b7
UUID: EC2C4AD4-1AFD-ABBA-AA66-74E921A4A3B7
Wake-up Type: Power Switch
SKU Number: Not Specified
Family: Not Specified

The Product Name will tell you the server details, such as whether it is VMware, HP ProLiant, Dell PowerEdge, etc.

To get the details of the IP Address and other N/W aspects:
# ifconfig -a
eth0 Link encap:Ethernet HWaddr 00:0C:29:40:70:EF
inet addr:192.168.213.155 Bcast:192.168.213.255 Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fe40:70ef/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:594 errors:0 dropped:0 overruns:0 frame:0
TX packets:530 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:54336 (53.0 KiB) TX bytes:127273 (124.2 KiB)

eth1 Link encap:Ethernet HWaddr 00:0C:29:40:70:F9
inet6 addr: fe80::20c:29ff:fe40:70f9/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:14 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 b) TX bytes:2700 (2.6 KiB)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:4 errors:0 dropped:0 overruns:0 frame:0
TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:240 (240.0 b) TX bytes:240 (240.0 b)

The typical ifconfig command gives you info about all the interfaces configured in the system.

A shorter and simpler method is the ip command, as below:

# ip addr list
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:40:70:ef brd ff:ff:ff:ff:ff:ff
inet 192.168.213.155/24 brd 192.168.213.255 scope global eth0
inet6 fe80::20c:29ff:fe40:70ef/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:40:70:f9 brd ff:ff:ff:ff:ff:ff
inet6 fe80::20c:29ff:fe40:70f9/64 scope link
valid_lft forever preferred_lft forever

# netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
169.254.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth0
192.168.213.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0

This will provide the kernel routing table, and so does the one below:

# route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
link-local * 255.255.0.0 U 1002 0 0 eth0
192.168.213.0 * 255.255.255.0 U 0 0 0 eth0

# ethtool eth0
Settings for eth0:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Supported pause frame use: No
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Advertised pause frame use: No
Advertised auto-negotiation: Yes
Speed: 1000Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 1
Transceiver: internal
Auto-negotiation: on
MDI-X: Unknown
Supports Wake-on: d
Wake-on: d
Current message level: 0x00000007 (7)
drv probe link
Link detected: yes

ethtool provides the details of your network interfaces.

# mii-tool eth1
eth1: negotiated 100baseTx-FD, link ok

This command also throws out some brief info.

Setting a Temporary IP:

#ifconfig eth0 192.168.213.155 netmask 255.255.255.0

#route add default gw 192.168.213.1

These settings will be lost the moment you reboot your system.

Run a command repeatedly and display the output:

#watch -d ls -l

By default the program runs every two seconds and lists the contents of the directory where the command was executed.

Another way is a simple while loop:

#while true; do ls -l; done
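The bare loop above re-runs ls as fast as it can; to mimic watch's default two-second interval, add a sleep:

#while true; do ls -l; sleep 2; done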

Information on a command:

# which ping
/bin/ping

This gives the absolute path of a command.

# rpm -qf /bin/ping
iputils-20071127-16.el6.x86_64

This tells you which package provides the command.

Add the words horizontally to a file:

To paste the contents of two files horizontally we can take the help of paste command,

# cat command
ping
ping
ping
ping

# cat ip
192.168.213.153
192.168.213.100
192.168.213.45
192.168.213.61

# paste -d " " command ip > output

-d " " will separate the contents with a single space instead of a tab, which is the default.

# cat output
ping 192.168.213.153
ping 192.168.213.100
ping 192.168.213.45
ping 192.168.213.61

Redirecting the top command output to a file:

# top -b -n2 -d5 > /tmp/top.out

This command will run top 2 times and wait 5 seconds between each output.

Create a directory with different permissions:

# mkdir /root/mydir -v -m 1777
mkdir: created directory `/root/mydir’

# ls -lhd /root/mydir/
drwxrwxrwt. 2 root root 4.0K Aug 9 20:35 /root/mydir/

This can be used instead of creating a directory and then using the chmod command to change its permissions.

Create multiple files or directories:

#touch myfile{1..5}.txt

# ls my*

myfile1.txt  myfile2.txt  myfile3.txt  myfile4.txt  myfile5.txt

# mkdir dir{1..4}

Commands cat and tac:

# cat file
This is Linux
This is Ubuntu

# tac file
This is Ubuntu
This is Linux
[root@docker ~]#

Linux Process & Process Management

Introduction

A Linux server, like any other computer you may be familiar with, runs applications. To the computer, these are considered “processes”.

While Linux will handle the low-level, behind-the-scenes management in a process’s lifecycle, you will need a way of interacting with the operating system to manage it from a higher-level.

In this post, we will discuss some simple aspects of process management. Linux provides an abundant collection of tools for this purpose.

How To View Running Processes in Linux

The easiest way to find out what processes are running on your server is to run the top command:

# top

top - 07:45:38 up 9:38, 1 user, load average: 0.00, 0.01, 0.05
Tasks: 89 total, 2 running, 87 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 1014976 total, 517484 free, 102656 used, 394836 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 728192 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 20 0 128092 6680 3932 S 0.0 0.7 0:02.84 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:00.03 ksoftirqd/0
6 root 20 0 0 0 0 S 0.0 0.0 0:00.32 kworker/u30:0
7 root rt 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
————- output truncated ——————-

The top chunk of information gives system statistics, such as system load and the total number of tasks.

You can easily see that there are 2 running processes and 87 sleeping processes (i.e., idle/not using CPU resources).

The bottom portion has the running processes and their usage statistics.

top gives you an ncurses-based interface for viewing running processes, but it is not always flexible enough to adequately cover all scenarios. A powerful command called ps is often the answer to these problems.

List processes with ps command

When called without arguments, the output can be a bit lack-luster:

# ps

PID TTY TIME CMD
3125 pts/0 00:00:00 sudo
3126 pts/0 00:00:00 su
3127 pts/0 00:00:00 bash
3150 pts/0 00:00:00 ps

This output shows all of the processes associated with the current user and terminal session. This makes sense because we are only running bash, sudo and ps with this terminal currently.

We can run ps command with different options to get a complete picture of the processes on this system.

BSD style – The options in bsd style syntax are not preceded by a dash.

# ps aux

UNIX/LINUX style – The options in Linux style syntax are preceded by a dash as usual.

# ps -ef

It is okay to mix both syntax styles on Linux systems. For example, "ps au -x". In this post, we're using both style syntaxes.

# ps aux

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 2 0.0 0.0 0 0 ? S Jan11 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S Jan11 0:00 [ksoftirqd/0]
root 6 0.0 0.0 0 0 ? S Jan11 0:00 [kworker/u30:0]
root 7 0.0 0.0 0 0 ? S Jan11 0:00 [migration/0]
root 8 0.0 0.0 0 0 ? S Jan11 0:00 [rcu_bh]
root 9 0.0 0.0 0 0 ? R Jan11 0:00 [rcu_sched]
root 10 0.0 0.0 0 0 ? S Jan11 0:00

————- output truncated ——————-

These options tell ps to show processes owned by all users (regardless of their terminal association) in a user-friendly format.

To see a tree view, where hierarchal relationships are illustrated, we can run the command with these options:

# ps axjf

PPID PID PGID SID TTY TPGID STAT UID TIME COMMAND
0 2 0 0 ? -1 S 0 0:00 [kthreadd]
2 3 0 0 ? -1 S 0 0:00 \_ [ksoftirqd/0]
2 6 0 0 ? -1 S 0 0:00 \_ [kworker/u30:0]
2 7 0 0 ? -1 S 0 0:00 \_ [migration/0]
2 8 0 0 ? -1 S 0 0:00 \_ [rcu_bh]
2 9 0 0 ? -1 R 0 0:00 \_ [rcu_sched]
1 2024 2024 2024 ? -1 Ss 0 0:00 /usr/sbin/sshd
2024 3100 3100 3100 ? -1 Ss 0 0:00 \_ sshd: ajoy[priv]
3100 3103 3100 3100 ? -1 S 1000 0:00 \_ sshd: ajoy@pts/0
3103 3104 3104 3104 pts/0 3153 Ss 1000 0:00 \_ -bash
3104 3125 3125 3104 pts/0 3153 S 0 0:00 \_ sudo su –
3125 3126 3125 3104 pts/0 3153 S 0 0:00 \_ su –
3126 3127 3127 3104 pts/0 3153 S 0 0:00 \_ -bash
3127 3153 3153 3104 pts/0 3153 R+ 0 0:00 \_ ps axjf
————- output truncated ——————-

As you can see, the sshd process is shown to be a parent of processes like bash, su, sudo, and the ps axjf command itself.

List processes based on UID and command (ps -u, ps -C)

Use the -u option to display the processes that belong to a specific username. When you have multiple usernames, separate them using a comma. The example below displays all the processes owned by the users wwwrun or postfix.

# ps -f -u wwwrun,postfix

UID PID PPID C STIME TTY TIME CMD
postfix 7457 7435 0 Mar09 ? 00:00:00 qmgr -l -t fifo -u
wwwrun 7495 7491 0 Mar09 ? 00:00:00 /usr/sbin/httpd2-prefork -f /etc/apache2/httpd.conf
wwwrun 7496 7491 0 Mar09 ? 00:00:00 /usr/sbin/httpd2-prefork -f /etc/apache2/httpd.conf
wwwrun 7497 7491 0 Mar09 ? 00:00:00 /usr/sbin/httpd2-prefork -f /etc/apache2/httpd.conf
wwwrun 7498 7491 0 Mar09 ? 00:00:00 /usr/sbin/httpd2-prefork -f /etc/apac

The following example shows all the processes running the tatad.pl command.

# ps -f -C tatad.pl

UID PID PPID C STIME TTY TIME CMD
root 9576 1 0 Mar09 ? 00:00:00 /opt/tata/perl/bin/perl /opt/tata/bin/tatad.pl
root 9577 9576 0 Mar09 ? 00:00:00 /opt/tata/perl/bin/perl /opt/tata/bin/tatad.pl
root 9579 9576 0 Mar09 ? 00:00:00 /opt/tata/perl/bin/perl /opt/tata/bin/tatad.pl
root 9580 9576 0 Mar09 ? 00:00:00 /opt/tata/perl/bin/perl /opt/tata/bin/tatad.pl
root 9581 9576 0 Mar09 ? 00:00:00 /opt/tata/perl/bin/perl /opt/tata/bi

The following method is used to get a list of processes with a particular PPID.

#ps -f --ppid 9576

UID PID PPID C STIME TTY TIME CMD
root 9577 9576 0 Mar09 ? 00:00:00 /opt/tata/perl/bin/perl /opt/tata/bin/tatad.pl
root 9579 9576 0 Mar09 ? 00:00:00 /opt/tata/perl/bin/perl /opt/tata/bin/tatad.pl
root 9580 9576 0 Mar09 ? 00:00:00 /opt/tata/perl/bin/perl /opt/tata/bin/tatad.pl
root 9581 9576 0 Mar09 ? 00:00:00 /opt/tata/perl/bin/perl /opt/tata/bin

List Processes in a Hierarchy (ps --forest)

The example below displays the process IDs and commands in a hierarchy. --forest is an argument to the ps command which displays ASCII art of the process tree. From this tree, we can identify the parent process and the child processes it forked, recursively.

#ps -e -o pid,args --forest
468 \_ sshd: root@pts/7
514 | \_ -bash
17484 \_ sshd: root@pts/11
17513 | \_ -bash
24004 | \_ vi ./790310__11117/journal
15513 \_ sshd: root@pts/1
15522 | \_ -bash
4280 \_ sshd: root@pts/5
4302 | \_ -bash

List elapsed wall time for processes (ps -o pid,etime=)

If you want to get the elapsed time of processes which are currently running, the ps command provides etime: the elapsed time since the process was started, in the form [[dd-]hh:]mm:ss.

The command below displays the elapsed time for process ID 1 (init) and process ID 29675.

For example, “10-22:13:29” in the output means the init process has been running for 10 days, 22 hours, 13 minutes and 29 seconds. Since the init process starts at system startup, this time will match the output of the ‘uptime’ command.

# ps -p 1,29675 -o pid,etime=
PID
1 10-22:13:29
29675 1-02:58:46

List all threads for a particular process (ps -L)

You can get a list of the threads belonging to a process. When a process hangs, we might need to identify the threads running inside it, as shown below.

# ps -C java -L -o pid,tid,pcpu,state,nlwp,args

PID TID %CPU S NLWP COMMAND
16992 16992 0.0 S 15 ../jre/bin/java -Djava.ext.dirs=../jre/lib/ext:../lib:../auto_lib -Xdebug -Xnoagent -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5006
16992 16993 0.0 S 15 ../jre/bin/java -Djava.ext.dirs=../jre/lib/ext:../lib:../auto_lib -Xdebug -Xnoagent -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5006
16992 16994 0.0 S 15 ../jre/bin/java -Djava.ext.dirs=../jre/lib/ext:../lib:../auto_lib -Xdebug -Xnoagent -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5006
16992 16995 0.0 S 15 ../jre/bin/java -Djava.ext.dirs=../jre/lib/ext:../lib:../auto_lib -Xdebug -Xnoagent -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5006
16992 16996 0.0 S 15 ../jre/bin/java -Djava.ext.dirs=../jre/lib/ext:../lib:../auto_lib -Xdebug -Xnoagent -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5006
16992 16997 0.0 S 15 ../jre/bin/java -Djava.ext.dirs=../jre/lib/ext:../lib:../auto_lib -Xdebug -Xnoagent -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5006
16992 16998 0.0 S 15 ../jre/bin/java -Djava.ext.dirs=../jre/lib/ext:../lib:../auto_lib -Xdebug -Xnoagent -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5006
16992 16999 0.0 S 15 ../jre/bin/java -Djava.ext.dirs=../jre/lib/ext:../lib:../auto_lib -Xdebug -Xnoagent -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5006
16992 17000 0.0 S 15 ../jre/bin/java -Djava.ext.dirs=../jre/lib/ext:../lib:../auto_l

Finding Memory Leaks (ps --sort pmem)

A memory leak, technically, is an ever-increasing usage of memory by an application.

With common desktop applications this may go unnoticed, because any memory a process has used is released when you close the application.

However, in the client/server model memory leakage is a serious issue, because applications are expected to be available 24×7. An application must not keep increasing its memory usage indefinitely, as this can eventually exhaust the system. To monitor such memory leaks, we can use the following commands.

# ps aux --sort pmem

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 1520 508 ? S 2005 1:27 init
inst 1309 0.0 0.4 344308 33048 ? S 2005 1:55 agnt (idle)
inst 2919 0.0 0.4 345580 37368 ? S 2005 20:02 agnt (idle)
inst 24594 0.0 0.4 345068 36960 ? S 2005 15:45 agnt (idle)

In the above ps command, the --sort option places the highest %MEM at the bottom of the output. Note down the PID with the highest %MEM usage, then use ps to view all the details of that process ID and monitor the change over time. You have to repeat this manually, or schedule it via cron and append the output to a file.
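As a minimal sketch of how you might automate that sampling (the PID 1309 is taken from the output above, and the log path is just an illustration), a loop like the following appends an RSS/%MEM reading with a timestamp every five minutes:

# while true; do echo "$(date '+%F %T') $(ps -o rss=,pmem= -p 1309)" >> /tmp/memwatch.log; sleep 300; done

Reviewing the logged RSS column over a few hours will show whether the usage climbs steadily.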

The VSZ number is of little use if what you are interested in is actual memory consumption. VSZ measures how much of the process’s virtual address space has been reserved: memory that the operating system would map in only if the process actually touched it. It has nothing to do with whether that memory has been touched and used. VSZ is an internal detail of how a process allocates memory, namely how big a chunk of unused address space it grabs at once. Look at RSS for the count of memory pages the process has actually started using.

RSS:

Resident set size: the non-swapped physical memory that a task has used; the resident set currently in physical memory, including code, data, and stack.

VSZ:

Virtual memory usage of the entire process = VmLib + VmExe + VmData + VmStk

In other words,
a) VSZ *includes* RSS
b) “ps aux” alone isn’t enough to tell you if a process is thrashing (although, if your system *is* thrashing, “ps aux” will help you identify the processes experiencing the biggest hits).
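To see the difference in practice, you can print both columns side by side; sorting by RSS (an arbitrary but convenient choice) puts the heaviest real-memory consumers first:

# ps -eo pid,vsz,rss,pmem,comm --sort=-rss | head -5

A process that has merely reserved a large address space will show a large VSZ but a much smaller RSS.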

# ps ev --pid=27645

PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
27645 ? S 3:01 0 25 1231262 1183976 14.4 /TaskServer/bin/./wrapper-linux-x86-32

Run the same command again after an interval and compare the values:

# ps ev --pid=27645

PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
27645 ? S 3:01 0 25 1231262 1183976 14.4 /TaskServer/bin/./wrapper-linux-x86-32

Note: In the above output, if RSS (resident set size, in KB) increases over time (%MEM would increase as well), it may indicate a memory leak in the application.

The following command displays all the processes owned by the Linux user oracle.

# ps U oracle

PID TTY STAT TIME COMMAND
5014 ? Ss 0:01 /oracle/bin/tnslsnr
7124 ? Ss 0:00 ora_q002_med
8206 ? Ss 0:00 ora_cjq0_med
8852 ? Ss 0:01 ora_pmon_med

The following command displays all the processes owned by the current user.

# ps U $USER

PID TTY STAT TIME COMMAND
10329 ? S 0:00 sshd: ajoy@pts/1,pts/2
10330 pts/1 Ss 0:00 -bash
10354 pts/2 Ss+ 0:00 -bash

The ps command can be configured to show a selected list of columns only. There are a large number of columns to show, and the full list is available in the man pages.

The following command shows only the pid, username, cpu, memory and command columns.

# ps -e -o pid,uname,pcpu,pmem,comm

PID USER %CPU %MEM COMMAND
1 root 0.0 0.6 systemd
2 root 0.0 0.0 kthreadd
3 root 0.0 0.0 ksoftirqd/0
6 root 0.0 0.0 kworker/u30:0
7 root 0.0 0.0 migration/0

The ps command is quite flexible and it is possible to rename the column labels as shown below:

# ps -e -o pid,uname=USERNAME,pcpu=CPU_USAGE,pmem,comm

PID USERNAME CPU_USAGE %MEM COMMAND
1 root 0.0 0.6 systemd
2 root 0.0 0.0 kthreadd
3 root 0.0 0.0 ksoftirqd/0
6 root 0.0 0.0 kworker/u30:0
7 root 0.0 0.0 migration/0

Combined with the watch command, we can turn ps into a real-time process reporter. A simple example looks like this:

# watch -n 1 'ps -e -o pid,uname,cmd,pmem,pcpu --sort=-pmem,-pcpu | head -15'

Every 1.0s: ps -e -o pid,uname,cmd,pmem,pcpu --… Sun Dec 1 18:16:08 2009

PID USER CMD %MEM %CPU
3800 1000 /opt/google/chrome/chrome – 4.6 1.4
7492 1000 /opt/google/chrome/chrome – 2.7 1.4
3150 1000 /opt/google/chrome/chrome 2.7 2.5
3824 1000 /opt/google/chrome/chrome – 2.6 0.6
3936 1000 /opt/google/chrome/chrome – 2.4 1.6
2936 1000 /usr/bin/plasma-desktop 2.3 0.2
9666 1000 /opt/google/chrome/chrome – 2.1 0.8

Process IDs:

In Linux and Unix-like systems, each process is assigned a process ID, or PID. This is how the operating system identifies and keeps track of processes.

# pgrep bash

3104
3127

The first process spawned at boot, called init, is given the PID of “1”.

# pgrep init

1

This process is then responsible for spawning every other process on the system. Processes spawned later are given larger PID numbers.

A process’s parent is the process that was responsible for spawning it. If a process’s parent is killed, the child processes do not die with it; instead they are re-parented, typically to init (PID 1), and become orphan processes. The parent process’s PID is referred to as the PPID.
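You can inspect this relationship directly. For example, $$ expands to the current shell's own PID, so the following shows the shell and its parent:

# ps -o pid,ppid,comm -p $$

The PPID column will contain the PID of whatever spawned the shell, such as an sshd session process.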

Process States:

Here are the different values that the s, stat and state output specifiers (header “STAT” or “S”) will display to describe the state of a process:

Running

The process is either running (it is the current process in the system) or it is ready to run (it is waiting to be assigned to one of the system’s CPUs).

Waiting

The process is waiting for an event or for a resource. Linux differentiates between two types of waiting processes: interruptible and uninterruptible. Interruptible waiting processes can be interrupted by signals, whereas uninterruptible waiting processes are waiting directly on hardware conditions and cannot be interrupted under any circumstances.

Stopped

The process has been stopped, usually by receiving a signal. A process that is being debugged can be in a stopped state.

Zombie

This is a process that has terminated but, for some reason, still has a task_struct data structure in the task vector: its parent has not yet reaped its exit status. It is what it sounds like, a dead process.

D uninterruptible sleep (usually IO)
R running or runnable (on run queue)
S interruptible sleep (waiting for an event to complete)
T stopped, either by a job control signal or because it is being traced.
W paging (not valid since the 2.6.xx kernel)
X dead (should never be seen)
Z defunct (“zombie”) process, terminated but not reaped by its parent.

For BSD formats and when the stat keyword is used, additional characters may be displayed:
< high-priority (not nice to other users)
N low-priority (nice to other users)
L has pages locked into memory (for real-time and custom IO)
s is a session leader
l is multi-threaded (using CLONE_THREAD, like NPTL pthreads do)
+ is in the foreground process group
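To see these state codes in practice, print the STAT column for every process, or filter for a specific state; the awk filter below (one of several ways to do it) picks out processes in uninterruptible sleep:

# ps -eo pid,stat,comm | awk '$2 ~ /^D/'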

Processes Signals:

All processes in Linux respond to signals. Signals are an OS-level way of telling programs to terminate or modify their behavior. The most common way of passing signals to a program is with the kill command. The default functionality of this utility is to attempt to kill a process:

# kill <PID of process>

This sends the TERM signal to the process. The TERM signal asks the process to terminate gracefully, allowing the program to perform clean-up operations and exit smoothly.

If the program is misbehaving and does not exit when given the TERM signal, we can escalate the signal by passing the KILL signal:

# kill -KILL <PID of process>

This is a special signal that is not sent to the program itself. Instead, it is handled by the operating system kernel, which shuts the process down immediately. This is used to bypass programs that ignore the signals sent to them.

Each signal has an associated number that can be passed instead of the name. For instance, you can pass “-15” instead of “-TERM”, and “-9” instead of “-KILL”.
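For example, using the bash PID from the earlier examples, these two commands are equivalent:

# kill -TERM 3104
# kill -15 3104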

Signals are not only used to shut down programs. They can also be used to perform other actions.

For instance, many daemons will restart when they are given the HUP or hang-up signal. Apache is one program that operates like this.

# kill -HUP <PID of httpd>

The above command will cause Apache to reload its configuration file and resume serving content.

You can list all of the signals that can be sent with kill by typing:

# kill -l

1) SIGHUP 2) SIGINT 3) SIGQUIT 4) SIGILL 5) SIGTRAP
6) SIGABRT 7) SIGBUS 8) SIGFPE 9) SIGKILL 10) SIGUSR1
11) SIGSEGV 12) SIGUSR2 13) SIGPIPE 14) SIGALRM 15) SIGTERM
16) SIGSTKFLT 17) SIGCHLD 18) SIGCONT 19) SIGSTOP 20) SIGTSTP
21) SIGTTIN 22) SIGTTOU 23) SIGURG 24) SIGXCPU 25) SIGXFSZ
26) SIGVTALRM 27) SIGPROF 28) SIGWINCH 29) SIGIO 30) SIGPWR
31) SIGSYS 34) SIGRTMIN 35) SIGRTMIN+1 36) SIGRTMIN+2 37) SIGRTMIN+3
38) SIGRTMIN+4 39) SIGRTMIN+5 40) SIGRTMIN+6 41) SIGRTMIN+7 42) SIGRTMIN+8
43) SIGRTMIN+9 44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12 47) SIGRTMIN+13
48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14 51) SIGRTMAX-13 52) SIGRTMAX-12
53) SIGRTMAX-11 54) SIGRTMAX-10 55) SIGRTMAX-9 56) SIGRTMAX-8 57) SIGRTMAX-7
58) SIGRTMAX-6 59) SIGRTMAX-5 60) SIGRTMAX-4 61) SIGRTMAX-3 62) SIGRTMAX-2
63) SIGRTMAX-1 64) SIGRTMAX

The conventional way of sending signals is by PID, but there are also methods of doing this with regular process names.

The pkill command works in almost exactly the same way as kill, but it operates on a process name instead:

# pkill -9 ping

is equivalent to

# kill -9 `pgrep ping`

If you would like to send a signal to every instance of a certain process, you can use the killall command:

# killall firefox

Process Priorities

Some processes might be considered mission-critical for your situation, while others may be executed only when there are leftover resources. You will want to adjust which processes are given priority in a server environment. Linux controls priority through a value called niceness.

High priority tasks are considered less nice, because they don’t share resources as well. Low priority processes, on the other hand, are nice because they insist on only taking minimal resources.

When we ran top at the beginning of this blog, there was a column marked “NI”. This is the nice value of the process:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 20 0 128092 6680 3932 S 0.0 0.7 0:02.84 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0

Nice values can range between -20 (highest priority) and 19 (lowest priority); the exact bounds vary slightly between systems.

To run a program with a certain nice value, we can use the nice command:

# nice -n 15 <command>

This only works when launching a new program.
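A quick way to see the effect (sleep here is just a placeholder workload): start a program under nice, then check its NI column using $!, which holds the PID of the last background job:

# nice -n 15 sleep 300 &
# ps -o pid,ni,comm -p $!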

To alter the nice value of a program that is already executing, we use a tool called renice:

# renice 0 <PID of process>
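For example, to set the niceness of the bash process from the earlier examples to 10 (the PID is illustrative):

# renice 10 -p 3104

Note that an unprivileged user can only increase a process's nice value; decreasing it (raising priority) requires root.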