Linux Programming – Part 11: Program Cat Command

Cat command.

Are you excited? I am. Here, we’ll code the cat command in C using four fundamental kernel APIs: open(), read(), write(), and close(). This is the very beginning of our Linux programming journey. Let’s get started!

Since we’ve gone through the UNIX/Linux basics as well as the gcc programming tutorials, let’s build our first Linux command in C. We’re going to build the cat command, but without any options this time. The program we create prints out the contents of each file given as an argument.

Here is the code.

cat.c

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <fcntl.h>

static void do_cat(const char *path);
static void die(const char *s);

int
main(int argc, char *argv[]) {
	int i;
	if (argc < 2) {
		fprintf(stderr, "%s: file name not given\n", argv[0]);
		exit(1);
	}
	for (i = 1; i < argc; i++) {
		do_cat(argv[i]);
	}
	exit(0);
}

#define BUFFER_SIZE 2048

static void
do_cat(const char *path) {
	int fd;
	unsigned char buf[BUFFER_SIZE];
	int n;

	fd = open(path, O_RDONLY);
	if (fd < 0) die(path);
	for (;;) {
		n = read(fd, buf, sizeof buf);
		if (n < 0) die(path);
		if (n == 0) break;
		if(write(STDOUT_FILENO, buf, n) < 0) die(path);
	}
	if (close(fd) < 0) die (path);
}

static void
die(const char *s) {
	perror(s);
	exit(1);
}

To start, let’s take a look at the main() function.

main()

image01: main()

The first if-branch, in lines 13, 14, and 15, checks the number of command line arguments. If no file name was given (argc is less than 2, because argv[0] normally holds the program name), it prints the error message in line 14 and exits. fprintf(), by the way, works like printf() except that it takes the output stream as its first argument; here that stream is stderr, the standard error. Also keep in mind that the message includes the program name, taken from argv[0].

The main part of the function is the for loop. From a brief glance, you can see that it passes each command line argument to do_cat() in turn. Since the argument do_cat() receives is a file name from the command line, we can also assume that do_cat() opens that file and reads through its contents.
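To make the relationship between printf() and fprintf() concrete, here is a minimal sketch. The helper name report_missing_file() is hypothetical and not part of cat.c; it takes the stream as a parameter so that the same call can target stderr or any other FILE *.

```c
#include <stdio.h>

/* Hypothetical helper (not in cat.c): prints the same message as main(),
 * but to whichever stream the caller passes in.
 * Returns the number of characters written, or a negative value on error. */
static int
report_missing_file(FILE *out, const char *prog) {
	return fprintf(out, "%s: file name not given\n", prog);
}
```

Calling report_missing_file(stderr, argv[0]) would reproduce what cat.c does, while printf(...) behaves like fprintf(stdout, ...).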

do_cat() Part1

Let’s take a look at the do_cat() function.

do_cat()

static void
do_cat(const char *path) {
	int fd;
	unsigned char buf[BUFFER_SIZE];
	int n;

	fd = open(path, O_RDONLY);
	if (fd < 0) die(path);
	for (;;) {
		n = read(fd, buf, sizeof buf);
		if (n < 0) die(path);
		if (n == 0) break;
		if(write(STDOUT_FILENO, buf, n) < 0) die(path);
	}
	if (close(fd) < 0) die (path);
}

Even with do_cat() pulled out on its own, it is still a bit hard to understand all of it at a glance. Let’s look at smaller segments one after another.

do_cat() Part2

do_cat()


fd = open(path, O_RDONLY);
if (fd < 0) die(path);
.
.
(omitted)
.
.
if (close(fd) < 0) die (path);

Now, that’s better. First, it opens the file specified in the command line argument. Since the cat command doesn’t write any data to the file, the flag passed to open() is O_RDONLY (read-only). The next line checks whether the file was opened successfully: open() returns a negative value on failure.

And finally, the last line closes the file. Every file descriptor a program opens should be closed once it is no longer needed.
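The open/check/close pattern can be tried in isolation. The following is a minimal sketch; the helper name try_open_close() is hypothetical and does not appear in cat.c, and it returns a status instead of calling die() so it is easy to experiment with.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not in cat.c): reports whether path can be opened
 * for reading. Returns 0 on success and -1 on failure, leaving errno as
 * set by the failing call. */
static int
try_open_close(const char *path) {
	int fd = open(path, O_RDONLY);   /* a descriptor >= 0, or -1 on failure */
	if (fd < 0) return -1;
	if (close(fd) < 0) return -1;    /* release the descriptor again */
	return 0;
}
```

On a typical Linux system, try_open_close("/dev/null") should succeed, while a path that does not exist should make it return -1.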

sample_code

status = close(fd);
if (status < 0) die(path);

Wouldn’t it be easier to understand if it were written like this? In real-world code, though, the call is usually folded into the condition, so we need to get used to slightly denser code.

die() is a function that prints an error message and terminates the program. We’ll take a look at it later.

do_cat() Part3

Let’s look at the for-loop.

for (;;) {
     n = read(fd, buf, sizeof buf);
     if (n < 0) die(path);
     if (n == 0) break;
     if(write(STDOUT_FILENO, buf, n) < 0) die(path);
}

for (;;) is an infinite loop: it tells the program to repeat the body until something inside breaks out of it. So, what does it repeat? Since the body contains read() and write(), we can assume it reads data from the file and writes it to the console.

But it shouldn’t loop forever. Once it reaches the end of the file, it should stop. With that assumption in mind, we can find the exit condition:

if (n == 0) break;

n == 0 could mean ‘when there is nothing left to read’. Even though we might not fully grasp what’s going on in the code yet, we can at least make an assumption.

do_cat() Part4

Now let’s take a look at read() and write().

read()

n = read(fd, buf, sizeof buf);
if (n < 0) die(path);
if (n == 0) break;
if(write(STDOUT_FILENO, buf, n) < 0) die(path);

read() reads bytes from the stream specified by the file descriptor fd into buf. The third argument, sizeof buf (the size of the array), is the maximum number of bytes to read in one call.

sizeof buf could also be written as BUFFER_SIZE, and the compiler accepts both. But the former is easier for a human reader: the C language guarantees that sizeof buf is the size of the array buf, so a programmer can see at a glance that the limit matches the buffer, and it stays correct even if the declaration of buf changes. That makes it the better code.

The next line checks the result of read(). If an error occurred (a negative return value), die() ends the program.

The third line ends the for-loop once the whole file has been read, as I mentioned earlier: when there is no more data, read() returns 0.
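One caveat is worth a small sketch: sizeof only reports the buffer size when it is applied to the array itself. The two function names below are hypothetical and exist only for illustration.

```c
#include <stddef.h>

#define BUFFER_SIZE 2048

/* sizeof applied to the array itself yields the array's size in bytes. */
static size_t
array_bytes(void) {
	unsigned char buf[BUFFER_SIZE];
	return sizeof buf;            /* BUFFER_SIZE * sizeof(unsigned char) */
}

/* Pitfall: a pointer parameter only knows the size of the pointer. */
static size_t
pointer_bytes(const unsigned char *p) {
	return sizeof p;              /* size of a pointer, not of the buffer */
}
```

Inside do_cat() the array is declared locally, so sizeof buf is the right tool there; if the buffer were passed to another function as a pointer, the size would have to be passed along separately.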

write()

write() writes the contents of buf to STDOUT_FILENO (standard output). The number of bytes it writes is n, the number of bytes that read() just read.
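One detail cat.c glosses over: write() is allowed to write fewer bytes than requested, for example when the output is a pipe or a terminal. The sketch below shows one common way to handle that; write_all() is a hypothetical helper name, not something cat.c defines, and retrying on EINTR is a design choice made for this sketch.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper (not in cat.c): keeps calling write() until all
 * n bytes of buf have been written. Returns 0 on success, -1 on error. */
static int
write_all(int fd, const unsigned char *buf, size_t n) {
	size_t done = 0;
	while (done < n) {
		ssize_t w = write(fd, buf + done, n - done);
		if (w < 0) {
			if (errno == EINTR) continue;  /* interrupted by a signal: retry */
			return -1;
		}
		done += (size_t)w;                 /* advance past the bytes written */
	}
	return 0;
}
```

In do_cat(), the write() call could then become if (write_all(STDOUT_FILENO, buf, n) < 0) die(path);.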

#define BUFFER_SIZE 2048

Keep in mind that the number of bytes read() returns doesn’t always equal the buffer’s maximum size. As shown above, in line 23, the buffer size is defined as 2048. So, if the program reads a 2050-byte file, it has to go through the loop at least twice: typically 2048 bytes the first time and 2 bytes the second.
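To observe how many passes the loop makes, the read loop can be instrumented. This is a minimal sketch under the same BUFFER_SIZE as cat.c; count_data_reads() is a hypothetical helper that works on an already opened descriptor.

```c
#include <unistd.h>

#define BUFFER_SIZE 2048

/* Hypothetical helper (not in cat.c): reads fd to the end and returns how
 * many read() calls delivered data, or -1 on error. The total number of
 * bytes read is stored in *total. */
static int
count_data_reads(int fd, long *total) {
	unsigned char buf[BUFFER_SIZE];
	int calls = 0;
	*total = 0;
	for (;;) {
		ssize_t n = read(fd, buf, sizeof buf);
		if (n < 0) return -1;
		if (n == 0) break;        /* end of file: nothing left to read */
		calls++;
		*total += n;
	}
	return calls;
}
```

For a 2050-byte regular file, one would typically expect two data-carrying reads, followed by a final read() that returns 0 and ends the loop.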

do_cat() Afterthoughts

We’ve been through all the individual components of the do_cat() function. Now, let’s connect the dots and understand what is going on in the entire program.

First, the program opens a stream for the path it was given as an argument. read() and write() then work through all the bytes of the stream, copying the data to the console. The for-loop keeps this going and ends once read() reports the end of the file.

Header files

In cat.c, the program includes several header files. As you write more programs, you will need additional APIs, and you can always look up which header each one requires with the man command. You really don’t have to memorize everything. If necessary, google it.

And even if you forget to include a necessary header file, the compiler can warn you: pass the -Wall option to gcc when compiling.

errno variables

open(), close(), read(), and write() share a consistent pattern:

  • Return a value of 0 or greater if successful
  • Return -1 if failed

When a system call fails, a constant identifying the error is assigned to the global variable errno (errno stands for ERRor Number).

Some examples:

  • ENOENT: the file doesn’t exist
  • EINVAL: an argument’s value is invalid
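The errno mechanism can be seen directly by capturing its value right after a failing call. This is a minimal sketch; open_error_code() is a hypothetical helper name used only for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not in cat.c): tries to open path read-only and
 * returns 0 on success, or the errno value set by open() on failure. */
static int
open_error_code(const char *path) {
	int fd = open(path, O_RDONLY);
	if (fd < 0) return errno;     /* read errno right after the failing call */
	close(fd);
	return 0;
}
```

Passing a path that does not exist should yield ENOENT. Note that errno is only meaningful immediately after a call has reported failure; later calls may change it.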

die()

When an error occurs, we need to deal with it based on the error message the system prints out. In cat.c, the die() function is responsible for printing that message.

Let’s take a look at it:

static void
die(const char *s) {
	perror(s);
	exit(1);
}

The perror() function prints the error message, and the exit() function terminates the program. perror() is a library function.

perror()

#include <stdio.h>

void perror(const char *s);

perror() prints the error message corresponding to the current value of errno to the standard error, prefixed with the string s and a colon.

image02: perror() samples

Image 02 shows some examples of the error messages printed when inappropriate files are given to cat.c.

  • /etc/shadow: Permission denied
  • directory: ..: Is a directory
  • non-existing file: No such file or directory

strerror()

Along with perror(), we also have strerror().

strerror() returns the error message string corresponding to an errno value. Keep in mind that the string returned by strerror() may be overwritten by the next call to strerror(). So, once the string is returned, use it right away, or copy it if you need it later; do not keep the pointer.
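As a minimal sketch of using strerror() safely, the hypothetical helper below copies the message into a caller-supplied buffer straight away, producing the same "prefix: message" shape that perror() prints. The name format_error() is an assumption for illustration only.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds "prefix: message" in out from an explicit
 * error number. The strerror() result is copied immediately instead of
 * being stored as a pointer. Returns the value of snprintf(). */
static int
format_error(char *out, size_t len, const char *prefix, int errnum) {
	return snprintf(out, len, "%s: %s", prefix, strerror(errnum));
}
```

For example, format_error(msg, sizeof msg, path, errno) right after a failed open() gives a string that can be logged or printed later, even if other calls to strerror() happen in between.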
