Linux Programming – Part 3: File system

Linux is composed of multiple files.

Linux handles files and file systems a little differently from other major OSes. It also helps to know that Linux itself is built almost entirely out of files and file systems.

A file system is the part of the system that defines what a file is. The word 'file' is used in more than one sense, so to understand it clearly we need to separate three different meanings:

  • General files
  • Regular files
  • Stream

General files

We'll take a look at 'stream' later on, so let's start with clear definitions of the first two: general files and regular files. When I execute the ls command, the system lists all the files stored in the /etc directory, as shown in image 01. Even though there are different types of entries, including binaries, directories, and symbolic links, they are all files in the broad sense. Let's call this general notion 'general files' here.

image 01: ls /etc
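
To make the idea concrete, here is a minimal Python sketch that asks the kernel what type each entry is. The helper name file_kind is my own invention, not a standard API, and the demo assumes /etc exists as it does on a typical Linux system.

```python
import os
import stat

def file_kind(path):
    """Return a short label for the file type at path (hypothetical helper)."""
    mode = os.lstat(path).st_mode  # lstat: do not follow symbolic links
    if stat.S_ISLNK(mode):
        return "symbolic link"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISREG(mode):
        return "regular file"
    if stat.S_ISCHR(mode):
        return "character device"
    if stat.S_ISBLK(mode):
        return "block device"
    if stat.S_ISFIFO(mode):
        return "named pipe"
    if stat.S_ISSOCK(mode):
        return "socket"
    return "unknown"

if __name__ == "__main__":
    # Print the type of the first few entries in /etc, if it exists.
    if os.path.isdir("/etc"):
        for name in sorted(os.listdir("/etc"))[:10]:
            print(name, "->", file_kind(os.path.join("/etc", name)))
```

Every entry gets an answer from the same lstat call, which is one way to see that directories and symbolic links are files too.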

Regular file

The aforementioned 'general files' come in many varieties, such as documents, spreadsheets, images, source code, and so on. But from the kernel's perspective, they are all the same kind of file: a sequence of bytes. These are regular files.

In Windows, for example, files appear to be identified by extensions such as .img or .txt, but an extension only tells the system how to handle the file; it does not change the contents. If you change an image file's extension from .img to .txt, Windows Explorer will no longer open it properly, even though the bytes inside are unchanged.
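
As a small illustration of that point, the Python sketch below writes some bytes to a file called picture.img, renames it to picture.txt, and reads it back. The function name and file names are made up for this example; it works in a temporary directory so nothing on your system is touched.

```python
import os
import tempfile

def rename_keeps_bytes(data):
    """Write data to picture.img, rename it to picture.txt, and read it back."""
    with tempfile.TemporaryDirectory() as workdir:
        old_path = os.path.join(workdir, "picture.img")
        new_path = os.path.join(workdir, "picture.txt")
        with open(old_path, "wb") as f:
            f.write(data)
        os.rename(old_path, new_path)  # only the name changes, not the bytes
        with open(new_path, "rb") as f:
            return f.read()

if __name__ == "__main__":
    payload = bytes([0x89, 0x50, 0x4E, 0x47])  # arbitrary sample bytes
    print(rename_keeps_bytes(payload) == payload)
```

The kernel stores and returns the same bytes under either name; the extension only matters to the programs that choose how to open the file.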

Directory

A directory is a file that contains other files (general files). UNIX traditionally allowed users to read a directory as raw bytes, but Linux prevents it.
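
Here is a rough Python sketch of that behavior, assuming a Linux system. The helper names are my own. It opens a directory and tries to read bytes from it, then lists the same directory the supported way.

```python
import os
import tempfile

def read_directory_bytes(path):
    """Try to read raw bytes from a directory.

    Returns the name of the error raised, or None if the read succeeded.
    """
    fd = os.open(path, os.O_RDONLY)  # opening a directory read-only is allowed
    try:
        os.read(fd, 16)  # on Linux this is expected to fail with EISDIR
        return None
    except OSError as err:
        return type(err).__name__
    finally:
        os.close(fd)

def list_directory(path):
    """The supported way to look inside a directory."""
    return sorted(os.listdir(path))

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        open(os.path.join(d, "hello.txt"), "w").close()
        print(read_directory_bytes(d))
        print(list_directory(d))
```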

Symbolic links

Symbolic links, also known as soft links, are files that hold the name (path) of another file. When a symbolic link is accessed, the kernel replaces it with the file the link points to.

Let's think about an actual use case. Generally, /var/log is the directory where Linux logs are stored, but some Linux distros use /var/adm as the log directory. A program built for those distros assumes /var/adm is its logging directory, so to run it on a system that logs to /var/log, we need to create a symbolic link named /var/adm.

When /var/adm is a symbolic link that points to /var/log, whatever the program writes under /var/adm ends up in the actual logging directory: /var/log.
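
The same idea can be tried safely in a temporary directory. In this Python sketch, the directories log and adm stand in for /var/log and /var/adm, and the function name is hypothetical.

```python
import os
import tempfile

def log_through_symlink(message):
    """Write via a symlinked directory and read the result from the real one."""
    with tempfile.TemporaryDirectory() as root:
        real_dir = os.path.join(root, "log")  # stands in for /var/log
        link_dir = os.path.join(root, "adm")  # stands in for /var/adm
        os.mkdir(real_dir)
        os.symlink(real_dir, link_dir)        # adm -> log

        # The "program" only knows about the adm path.
        with open(os.path.join(link_dir, "app.log"), "w") as f:
            f.write(message)

        # The data is found in the real directory.
        with open(os.path.join(real_dir, "app.log")) as f:
            stored = f.read()
        points_to_real = os.readlink(link_dir) == real_dir
        return points_to_real, stored

if __name__ == "__main__":
    print(log_through_symlink("service started\n"))
```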

Device files

Device files represent hardware devices as files. For example, the device file /dev/sda represents an SSD or HDD, and by using a special API, a program can access and control the data on the device through that file.

There are two types of device files: character device files and block device files. The difference is whether the device supports random access: a block device lets you read or write at any position you like, while a character device is read and written in order, as a stream of bytes. SSDs and HDDs are block devices, while printers and modems are character devices.

Some device files are unusual, such as /dev/null: it is always empty when you read it, and whatever data you write to it disappears.
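
A quick way to check this yourself is the Python sketch below, which assumes a Linux system where /dev/null exists. The function name is made up for the example.

```python
import os
import stat

def probe_dev_null(path="/dev/null"):
    """Return (is_character_device, bytes_written, bytes_read_back)."""
    mode = os.stat(path).st_mode
    with open(path, "wb") as sink:
        written = sink.write(b"this data is discarded")
    with open(path, "rb") as source:
        data = source.read()  # reading hits end-of-file immediately
    return stat.S_ISCHR(mode), written, data

if __name__ == "__main__":
    is_char, written, data = probe_dev_null()
    print("character device:", is_char)
    print("bytes accepted:", written)
    print("bytes read back:", data)
```

The write is accepted in full, yet nothing can be read back, and the stat call reports /dev/null as a character device.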

Named pipe

Named pipes, also known as FIFOs, are files used for communication between processes. Since they are rarely used, we don't have to learn them, at least for now.

UNIX domain sockets

UNIX domain sockets are also files used for communication between processes. These days TCP sockets can be used in their place, so we don't have to learn them for now either.

Supplementary information

Besides their data, files also carry other information, such as:

  • File type (regular file, directory, and so on)
  • Permissions
  • Size
  • Last-modified time

You can list all of that information with a command like ls -l.

image 03: ls -l
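
Programs can read the same information with the stat family of calls. This is a minimal Python sketch; the helper name describe and the dictionary keys are my own choices, not part of any standard.

```python
import os
import stat
import time

def describe(path):
    """Collect the kind of information ls -l shows for one path."""
    st = os.lstat(path)  # lstat: describe a symlink itself, not its target
    return {
        # Type and permissions in ls -l style, for example "-rw-r--r--".
        "mode": stat.filemode(st.st_mode),
        "size": st.st_size,
        "modified": time.strftime("%Y-%m-%d %H:%M", time.localtime(st.st_mtime)),
    }

if __name__ == "__main__":
    # /etc is assumed to exist, as on a typical Linux system.
    if os.path.exists("/etc"):
        print(describe("/etc"))
```

The first character of the mode string encodes the file type (for instance "d" for a directory and "-" for a regular file), and the rest encodes the permissions.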

General files’ definitions

Data

Every file has data, regardless of its type. A regular file may contain text or an image, while a directory contains the files inside it.

Supplementary information

A file has supplementary information, such as its last-modified time.

Path

A file is specified by its path, which makes the path an integral and arguably the most important aspect of a file.

File systems and mount

File systems reside on storage devices, such as SSDs and HDDs. After dividing a disk into partitions, you can create a file system on each partition, and then mount each partition at a location (a directory) on your machine.

The mount command shows which file systems are in use on your system. On my Ubuntu Server, there are five different mounted file systems at the moment, as shown in image 04. Ext4 is the file system type most commonly used on modern Linux systems.

mount -t ext4
image 04: mount -t ext4
Type    Description
ext4    The most commonly used Linux file system.
xfs     A journaling file system built by a company called SGI.
btrfs   A copy-on-write file system for Linux.
Table: Linux's file systems
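
On Linux, the kernel also exposes the list of mounted file systems as plain text in /proc/mounts, one mount per line. The Python sketch below filters that text by type, roughly like mount -t ext4 does. The function name is hypothetical, and the parsing is simplified: it does not decode escaped characters such as spaces in mount points.

```python
import os

def mounts_of_type(mounts_text, fstype):
    """Return (device, mount_point) pairs whose file system type matches."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Each line holds: device, mount point, type, options, dump, pass.
        if len(fields) >= 3 and fields[2] == fstype:
            result.append((fields[0], fields[1]))
    return result

if __name__ == "__main__":
    # /proc/mounts is Linux-specific, so check before reading it.
    if os.path.exists("/proc/mounts"):
        with open("/proc/mounts") as f:
            for device, mount_point in mounts_of_type(f.read(), "ext4"):
                print(device, "on", mount_point)
```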

There are other types of file systems too, such as procfs, tmpfs, and devfs, which differ from standard ones like ext4 or xfs. Since they have no backing storage device, they are known as pseudo file systems (we'll look at them later on…).
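
One way to see which file systems need a device is /proc/filesystems, where the kernel marks types that need no block device with the word nodev. The Python sketch below splits that listing into the two groups. The function name is my own, and the demo assumes a Linux system where /proc is mounted.

```python
import os

def split_filesystems(text):
    """Split /proc/filesystems content into (device_backed, nodev) name lists."""
    device_backed, nodev = [], []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "nodev":
            nodev.append(fields[1])          # no block device required
        elif len(fields) == 1:
            device_backed.append(fields[0])  # expects a block device
    return device_backed, nodev

if __name__ == "__main__":
    if os.path.exists("/proc/filesystems"):
        with open("/proc/filesystems") as f:
            device_backed, nodev = split_filesystems(f.read())
        print("needs a device:", device_backed)
        print("no device needed:", nodev)
```

The exact names you see depend on which file system drivers your kernel has loaded.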
