Linux is composed of multiple files.
Linux handles files and file systems a little differently from other major OSes. It also helps to know that Linux itself is largely built out of files and file systems: programs, configuration, and even devices are exposed as files.
A file system is the system responsible for defining what a file is. To understand files, we need a clear definition, and it helps to distinguish three different meanings of the word:
- General files
- Regular files
- Stream
General files
We’ll take a look at ‘stream’ later on, so let’s start with clear definitions of the first two: general files and regular files. When I execute the ls command, the system lists all the files stored in the /etc directory, as shown in image 01. Even though the listing includes different types of files, such as binaries, directories, and symbolic links, they are all files in the broad sense. Let’s call this general notion of files ‘general files’ here.
Regular files
The aforementioned ‘general files’ come in many varieties, such as documents, spreadsheets, images, source code, you name it. But from the kernel’s perspective, they are all the same kind of file: a plain sequence of bytes.
In Windows, for example, files appear to be identified by extensions such as .img or .txt, but an extension only tells the system how to interact with the file. If you change an image file’s extension from .img to .txt, for instance, Windows Explorer won’t open it properly, even though the data inside is unchanged.
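To make the point about extensions concrete, here is a minimal sketch in Python. The file names and the bytes written are hypothetical, chosen only for illustration; the idea is simply that renaming a file changes its name, not the bytes stored in it.

```python
import os
import tempfile

# Hypothetical content: arbitrary bytes standing in for image data.
payload = b"\x89PNG\r\n\x1a\n pretend these bytes are an image"

with tempfile.TemporaryDirectory() as tmp:
    as_image = os.path.join(tmp, "picture.img")
    as_text = os.path.join(tmp, "picture.txt")

    with open(as_image, "wb") as f:
        f.write(payload)

    # Renaming only changes the name; the stored bytes are untouched.
    os.rename(as_image, as_text)

    with open(as_text, "rb") as f:
        after_rename = f.read()

print(after_rename == payload)
```

The extension is only a hint for programs that choose how to open the file; the bytes read back are identical.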
Directory
A directory is a file that can hold other files (general files). UNIX traditionally allowed users to read a directory as raw bytes, but Linux prevents it.
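A small sketch of this behavior, assuming a Linux system and a throwaway temporary directory (the file name a.txt is hypothetical): listing the directory works, while trying to read it like an ordinary file is rejected.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Put one (hypothetical) file inside the directory.
    open(os.path.join(tmp, "a.txt"), "w").close()

    # The supported way to look inside a directory is to list its entries.
    entries = os.listdir(tmp)

    # Reading the directory itself as a byte stream is refused.
    try:
        with open(tmp, "rb") as f:
            f.read()
        raised = False
    except IsADirectoryError:
        raised = True

print(entries, raised)
```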
Symbolic links
Symbolic links, also known as soft links, are files that store the name of another file. When a symbolic link is accessed, the kernel replaces it with the actual file to which the link points.
Let’s think about an actual use case. Generally, /var/log is the directory where Linux logs are stored, but in some Linux distros /var/adm serves as the log directory. A program built for those distros treats /var/adm as its logging directory, so in this case we need to create a symbolic link at /var/adm.
When we create a symbolic link /var/adm that points to /var/log, the program writes its logs through the symbolic link, which redirects them to the actual logging directory: /var/log.
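The /var/adm scenario can be sketched without touching the real system directories. In the Python example below, the directories log and adm inside a temporary directory are hypothetical stand-ins for /var/log and /var/adm, and app.log is an invented file name.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Hypothetical stand-ins for /var/log and /var/adm.
    log_dir = os.path.join(root, "log")
    adm_link = os.path.join(root, "adm")
    os.mkdir(log_dir)

    # Create the symbolic link adm that points to log.
    os.symlink(log_dir, adm_link)

    # A program that only knows about "adm" writes through the link.
    with open(os.path.join(adm_link, "app.log"), "w") as f:
        f.write("started\n")

    # The data actually lives in the real directory.
    with open(os.path.join(log_dir, "app.log")) as f:
        written = f.read()

    is_link = os.path.islink(adm_link)
    link_target = os.readlink(adm_link)

print(is_link, written.strip())
```

The program never needs to know that adm is a link; the kernel resolves it on every access.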
Device files
Device files represent hardware devices as files. The device file /dev/sda, for example, represents an SSD or HDD, and by using a special API, programs can access and control the data on the device through that file.
There are two types of device files: character device files and block device files. The difference between the two is random access: a block device lets you read or write at any position you like, while a character device is read and written as a sequential stream. SSDs and HDDs are block devices, while printers and modems are character devices.
Among device files there are some unusual ones, such as /dev/null, which is always empty: whatever data you send to it disappears.
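The behavior of /dev/null is easy to check for yourself. This is a minimal sketch assuming a Linux system where /dev/null exists; the message written is arbitrary.

```python
import os
import stat

# /dev/null is a character device file.
mode = os.stat("/dev/null").st_mode
is_char_device = stat.S_ISCHR(mode)

# Writing succeeds, but the data is simply discarded.
message = b"this data is discarded"
with open("/dev/null", "wb") as sink:
    written = sink.write(message)

# Reading always hits end-of-file immediately, so we get no bytes back.
with open("/dev/null", "rb") as source:
    data = source.read()

print(is_char_device, written, data)
```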
Named pipe
Named pipes, also known as FIFOs, are files used for communication between processes. Since they are rarely used, we don’t have to learn them, at least for now.
UNIX domain sockets
UNIX domain sockets are also files used for communication between processes. TCP sockets can take their place in many cases, so we don’t have to learn them for now either.
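To see that all of these types are still "files" with a type tag attached, the following Python sketch classifies a few paths using the standard stat module. The helper name file_kind and the file names in the temporary directory are hypothetical, and the example assumes a Linux system.

```python
import os
import stat
import tempfile

def file_kind(path):
    """Return a label for the file type at path, without following symlinks."""
    mode = os.lstat(path).st_mode
    checks = [
        (stat.S_ISREG, "regular file"),
        (stat.S_ISDIR, "directory"),
        (stat.S_ISLNK, "symbolic link"),
        (stat.S_ISCHR, "character device"),
        (stat.S_ISBLK, "block device"),
        (stat.S_ISFIFO, "named pipe"),
        (stat.S_ISSOCK, "UNIX domain socket"),
    ]
    for predicate, label in checks:
        if predicate(mode):
            return label
    return "unknown"

kinds = {}
with tempfile.TemporaryDirectory() as tmp:
    regular = os.path.join(tmp, "notes.txt")
    directory = os.path.join(tmp, "subdir")
    link = os.path.join(tmp, "notes-link")
    fifo = os.path.join(tmp, "queue")

    open(regular, "w").close()   # regular file
    os.mkdir(directory)          # directory
    os.symlink(regular, link)    # symbolic link
    os.mkfifo(fifo)              # named pipe (FIFO)

    for path in (regular, directory, link, fifo):
        kinds[os.path.basename(path)] = file_kind(path)

kinds["/dev/null"] = file_kind("/dev/null")

for name, kind in kinds.items():
    print(name, "->", kind)
```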
Supplementary information
Besides their data, files also carry other information, such as:
- File type (for example, file or directory)
- Permissions
- Size
- Last updated time
You can list all of this information with a command such as ls -l.
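The same information that ls -l prints can also be read programmatically. Here is a minimal sketch using Python's os.stat; the file name report.txt, its content, and the permission bits are hypothetical values picked for the example.

```python
import os
import stat
import tempfile
import time

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "report.txt")  # hypothetical file
    with open(path, "w") as f:
        f.write("hello\n")
    os.chmod(path, 0o640)  # owner: read/write, group: read, others: none

    st = os.stat(path)
    info = {
        "type": "directory" if stat.S_ISDIR(st.st_mode) else "file",
        "permissions": stat.filemode(st.st_mode),  # same notation as ls -l
        "size": st.st_size,                        # in bytes
        "updated": time.ctime(st.st_mtime),        # last modification time
    }

for key, value in info.items():
    print(key, ":", value)
```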
General files’ definitions
Data
A file or a directory has data regardless of its file type. A file may have text or an image, while a directory has files inside.
Supplementary information
A file has supplementary information such as its updated time.
Path
A path is the name by which a file is located, such as /etc, and it is an integral and arguably the most important aspect of a file.
File systems and mount
File systems exist on hardware devices, such as SSDs and HDDs. By partitioning such a disk, you can put a file system on each partition, and you can mount each partition at a location of your choice on your machine.
The mount command tells you which file systems are in use on your system. On my Ubuntu Server, five file systems are mounted at the moment, as shown in image 04. ext4 is the file system most commonly used on modern Linux systems.
mount -t ext4
Type | Description |
ext4 | The most commonly used Linux file system |
xfs | A journaling file system developed by SGI (Silicon Graphics). |
btrfs | A copy-on-write file system for Linux. |
There are other types of file systems too, such as procfs, tmpfs, and devfs, and they differ from standard file systems like ext4 or xfs. Since they have no backing storage device, they are known as pseudo file systems (we’ll look at them later on…).
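As a rough equivalent of mount -t ext4, the sketch below filters mount entries by file system type in Python. It assumes the /proc/mounts format of "device mount-point type options ..." per line; the helper names parse_mounts and mounts_of_type and the sample text are hypothetical, made up for illustration rather than taken from a real machine.

```python
import os

def parse_mounts(text):
    """Parse /proc/mounts-style text into (device, mount_point, fs_type) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[0], fields[1], fields[2]))
    return entries

def mounts_of_type(entries, fs_type):
    """Keep only the entries whose file system type matches fs_type."""
    return [entry for entry in entries if entry[2] == fs_type]

# Hypothetical sample in the /proc/mounts format.
sample = """\
/dev/sda2 / ext4 rw,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /run tmpfs rw,nosuid,nodev,noexec,relatime 0 0
"""
sample_entries = parse_mounts(sample)
ext4_sample = mounts_of_type(sample_entries, "ext4")
print(ext4_sample)

# On a real Linux system, the same functions can read the live mount table.
if os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as f:
        live_entries = parse_mounts(f.read())
    for device, mount_point, fs_type in mounts_of_type(live_entries, "ext4"):
        print(device, "on", mount_point, "type", fs_type)
```

In the sample, proc and tmpfs appear as their own "device" names because, as pseudo file systems, they have no disk partition behind them.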