Back to basics: Operating systems
There has been some form of operating system in almost every computer since the creation of the “personal computer”, and even before that. Even so, most people have little to no idea what it is or what it does. I’ll even go as far as to say that programmers working on higher-level software, like web applications, only vaguely know what it does since, for them, it is just another thing running in the background, taken for granted.
The operating system is a key piece of software inside the computer since it is, in a way, the “mother of all software”, but what is it and what are its roles?
We’ll go over all of this in this article. The first part will be about the roles of an operating system, then we’ll dive a bit deeper and give an overview of how it actually achieves them.
The bare machine problem
Imagine you had a computer. Just the computer. No operating system or any other software on it. What would you have?
You would have exactly that! A computer. A huge box of rare metals that you can power on and it would have electricity flowing through it. You might even have an output made by the motherboard firmware that shows the company that made it and the settings for it.
This is what we could call the bare machine problem. Without software, you can’t really do anything on a computer. We often forget that computers are just extremely complex physical machines. When you see it this way, you realise that no sane human being would use a CPU at the raw hardware level, without any abstraction (and yes I consider Assembly as an abstraction level). It is just way too complex and there are too many variables to take care of.
Now let’s consider a middle ground. Let’s say we have a single piece of software running on this computer. What I mean by this is that you power on the computer and it boots straight into a note taking application. Good, but what if I want to load my web browser? Unless the note taking application has an integrated web browser, I’ll have to shut down, put the web browser in and reboot.
This might be a bit inconvenient for the user, but it would be a nightmare to develop software for this. It would basically mean that every piece of software has to implement its own ways of interacting with the hardware. Even the most basic of things, like files, would become a serious engineering problem since all programs would have to share the disk without knowing about each other and without a unified way to manage it.
In the next section, we’ll see that this is the problem the operating system solves, which is why operating systems are almost as old as “modern” computers.
Roles of an operating system
We can see how a computer would need an operating system for it to be usable, but it is still a bit abstract what it should be responsible for. If we take our last example again, we can see two problems arising.
The first would be the flow we talked about: boot a program, run it, power off, load another program and repeat. This is inconvenient and it would be nice if we could switch from web browser to note taking application without having to continuously reboot and change programs. The operating system can do that by managing the computer resources for us. This is how modern computers can run tens or even hundreds of programs “at the same time” without us even lifting a finger.
The second would be that every piece of software would have to bring its own way to manage the hardware of the computer, from CPU to input handling. This would tie it down to that specific machine with those specific components, which would be a sad state for a note taking program. This is where the operating system slips in. It takes the role of abstracting the hardware for the program so that it can concentrate on what it does best: take notes. If the program wants to save a file, take user inputs or show something on the screen, it can just call the appropriate OS functions without ever knowing which hardware was involved or how the OS carried out the call.
These are the two core roles of an operating system in a nutshell. We’ll go over each of them and elaborate a bit more on the concept. The how will follow in the next section.
Managing the computer resources
You could say that there are two main resources on a computer: CPU and memory. Without them, you can’t really do anything. The CPU is responsible for actually executing a program and the memory for keeping it loaded during that time. Both can be managed by the operating system and we’ll go over how for each.
We can start with the most basic of CPUs which would be a single core one. Just as a quick reminder, a single core CPU can only do one thing at a time. The operating system will be the manager of that CPU and it will decide who runs when on it. When it finds that one program ran for enough time, it will switch it for another one. What you can do doesn’t stop there though.
An ideal operating system will try to always have something using that core so it is never waiting for something to happen. You can see how it would be a waste of processing if the CPU was idling while waiting for user inputs on that note application. While the program is waiting for this, the OS might take the chance to load another program on the core so that it can work on something else until an input is typed. That change of the program running on the CPU is called a context switch.
If a context switch takes a few microseconds and you switch between running programs many times per second, you get something really interesting. That single core now feels like it is running multiple programs “at the same time”, but it isn’t. It is just switching really fast between programs that need the CPU. Context switching and the decision of which program should run when are core concepts of OS development.
Now that we know how the operating system manages the CPU, we can sort of see a problem. If both our note taking application and our web browser are running almost at the same time, it means both need to be in memory. Do we switch programs in and out of memory just like we do with the CPU? No, switching them like this would be really bad, since fetching from the disk is an operation that takes milliseconds, not microseconds or nanoseconds.
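To make the time slicing idea more concrete, here is a minimal simulation of a single core being shared. It is only a sketch: the function name, the program names and the fixed time slice are assumptions made for illustration, and a real kernel decides who runs next in far more sophisticated ways.

```python
from collections import deque


def run_time_sliced(programs, time_slice):
    """Simulate one core shared by several programs (illustrative only).

    programs: dict mapping a program name to the CPU time units it still needs.
    time_slice: how long a program may run before it is switched out.
    Returns the list of (name, time run) turns and the number of context switches.
    """
    ready = deque(programs.items())
    timeline = []
    context_switches = 0
    while ready:
        name, remaining = ready.popleft()
        ran = min(time_slice, remaining)
        timeline.append((name, ran))
        remaining -= ran
        if remaining > 0:
            ready.append((name, remaining))  # not finished: back of the line
        # Count a switch only when a different program gets the core next.
        if ready and ready[0][0] != name:
            context_switches += 1
    return timeline, context_switches


timeline, switches = run_time_sliced({"notes": 3, "browser": 5}, time_slice=2)
print(timeline)
print(switches)
```

Neither program ever runs for more than two time units in a row while the other is waiting, which is what gives the impression that both are running together.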
We can see how to solve this problem with a simple visualisation. Let’s say we have 500 megabytes of memory. You could see that memory like a spa divided into three small sections. Since it does a lot of operations and manages the whole computer, the operating system gets a spot and it manages who gets a seat too. The note taking application is currently running and so is the web browser, so the operating system will make sure both of them have a place. Those places are special in a way too. The OS doesn’t want the web browser to put its feet all over the other places and vice-versa, so it will make sure to wall off each seat while, at the same time, giving the nice illusion that this seat is a whole spa in itself. This is a bit hard to wrap your head around the first time, but we’ll come back to it, in more detail, during the next section. This splitting and management of memory is how the problem is solved.
By sharing the system memory between running programs, we can easily continue the context switching without having to constantly swap the programs in and out. This keeps context switching short and means we aren’t using the disk for no reason.
When a program is too big for the memory, or when there are too many programs for the available memory, the OS will start choosing who gets to be in memory and will go as far as loading only the parts of a program that are needed. The rest is kept on disk. This practice of partially loading programs in memory and keeping inactive parts on disk is often called swapping (the finer-grained version, done one block of memory at a time, is also known as paging). This is how the operating system can give the illusion of hundreds of programs loaded in memory and, for programs, the belief they can take more memory than the computer physically has.
Providing hardware abstraction
The operating system has the critical role of abstracting the computer hardware so that programs and users can cleanly interact with it. Both have different needs for abstraction and we’ll go over them.
Like we said before, a program doesn’t want to have to manage how it writes a file or displays something on the monitor. Forcing every program to do this itself would quickly become chaos. To solve this problem, the operating system becomes the middle man between a program and the hardware. Using this model, a program will almost never interface with hardware directly. The only exceptions are drivers, but we’ll address those later.
The operating system exposes a clean interface of functions that can be used by programs. Those programs will simply call them and leave the OS to figure out how to execute this request on the current hardware. This interface is the abstraction provided to programs and the functions are called system calls.
Each system call has a clear purpose and never involves the caller in how the work gets done. An example could be open on UNIX-like operating systems. Among other things, it is given the path of the file the program wants to “open” and it returns a file descriptor that lets the application work with the file. That’s it. The caller never had to think about the type of drive (SATA, NVMe, SD, etc.), where on it the file is stored or how to retrieve it from there. It just asked through the interface and the operating system took care of the complexities.
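As a small illustration, Python’s os module exposes thin wrappers around a few of these calls. The sketch below writes a note and reads it back through a file descriptor; the helper name and the file name are made up for the example, and Python is only standing in for whatever language the program is written in.

```python
import os
import tempfile


def save_and_reload(text):
    """Write text to a temporary file and read it back through file descriptors."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "note.txt")  # hypothetical file name

    # Ask the OS to open (and create) the file. We get back a file descriptor,
    # a small integer, and never learn what kind of drive is underneath.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, text.encode("utf-8"))
    finally:
        os.close(fd)

    fd = os.open(path, os.O_RDONLY)
    try:
        data = os.read(fd, 4096)  # plenty for a short note
    finally:
        os.close(fd)

    # Clean up the temporary file and directory.
    os.remove(path)
    os.rmdir(directory)
    return data.decode("utf-8")


print(save_and_reload("buy milk"))
```

Nothing in this code mentions sectors, controllers or drive types. All of that stays on the operating system’s side of the interface.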
For users, the abstraction is noticeable in two subtle ways. The first concerns what I keep referring to as a “running program”. Context switching requires a lot of information about a program and, without it, we would end up in a weird situation where a program loses all its progress every time it loses the CPU.
Another thing is that the operating system needs a way to know all the programs that are running so it can make decisions. To bundle together all this information and identify it as a program currently running, the OS provides an abstraction called the process.
Those processes can be exposed to the users too, without them ever knowing all the information they package together, as a quick way of saying that this program is running, right now, on the computer. A program like a task manager can expose all the processes of a computer and allow users to see them, kill them or see how much they’re using resources.
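Here is a rough sketch of what “bundling this information together” could look like. The fields are a heavily simplified assumption (real kernels track much more, such as open files and memory mappings), and ToyCPU and context_switch are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """A toy process record: just enough to pause and resume a program."""
    pid: int
    name: str
    state: str = "ready"       # "ready", "running" or "waiting"
    program_counter: int = 0   # where the program was in its code
    registers: dict = field(default_factory=dict)


class ToyCPU:
    """Stand-in for the core's own state."""
    def __init__(self):
        self.program_counter = 0
        self.registers = {}


def context_switch(cpu, current, following):
    """Save the running process's progress, then restore the next one's."""
    current.program_counter = cpu.program_counter
    current.registers = dict(cpu.registers)
    current.state = "ready"

    cpu.program_counter = following.program_counter
    cpu.registers = dict(following.registers)
    following.state = "running"
```

Because the progress is saved in the process record, switching back to the first process later restores exactly what it had on the CPU when it was paused.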
For the second abstraction users interact with daily, we need to take a bit of a step back. We all want persistent storage on a computer. What good would a personal computer be if it couldn’t store data that persists between reboots? This means that the operating system needs to manage a disk since it falls under the OS’s responsibilities, but how do you store data?
We’ll get into a bit more detail later, but you don’t want the user or programs to have to ask tricky questions like where on a disk a file is stored, as in, on which sector or platter. This is why the operating system needs to expose an abstraction of all of this to them. This abstraction is the file system or, simply, files and folders on partitions.
Files and folders are a sweet convenient illusion given to users and programs. It lets us cluster data together into files and organize those into folders cleanly. The operating system has the role of keeping track of those files so that, when we ask for one, it can service our request from the disk. This way of organizing the computer’s data is key to how we use it and it is all managed by the operating system.
Every “formatted” disk has to abide by this abstraction and it is why we can have a USB key that we can plug into multiple computers and still see the same files on it. The operating system needs to know a lot of ways that files can be stored on a disk and use the right way for the right disk so that users don’t have broken files when switching computers or operating systems.
How does it achieve its roles
We now know that an operating system needs to manage the computer resources and provide a clean, usable abstraction of the hardware. Both sound somewhat simple when reading about them, but they are complex tasks that require tight coupling between the operating system and the hardware.
In this section, we’ll do an overview of how the operating system manages to achieve its roles. We’ll talk about the basic techniques and what they do in the OS. This will be an overview for each of them and, for a more in depth analysis, I encourage picking up a book like Operating Systems: Three Easy Pieces and/or Modern Operating Systems.
Execution modes
All those operations that are close to the hardware mean that the operating system, or at least its core, needs to be really close to it and have unlimited access to the machine. This right is a core need for software like an operating system, but it means we also need a way to keep user processes from having those unlimited rights. Remember that they are managed by the OS.
This is why an operating system, with the help of the hardware, divides the computer into two separate spaces for software: kernel space and user space. We can call those execution modes too.
Kernel space is for the core of the operating system, which we call the kernel. Only it and a restricted number of privileged modules (like drivers, which we’ll cover later) can run in this mode, but what does it gain from it? The kernel is the place where all the complex operations described in the previous section (managing resources and abstraction) are done. To achieve those operations, it needs unlimited access to the machine and the processes on it, which is why it has its own dedicated execution mode where it is the sole dictator of the computer.
User space is for the rest of the software. Every user process and processes we could count as part of the operating system (like a file manager) are executed in this execution mode. It exists as a place where a process can be restricted in what it can do and how it interacts with the computer. If a process wants to open a file, it needs to call the kernel via a system call, like we said before, and it will execute the request on behalf of the process in kernel space. This ensures every process has a standardized way of interacting with the computer. It also ensures the kernel is the only one with full control over the computer.
Process management
In the previous section, we talked about how the operating system shares the CPU among multiple processes like our note taking application and our internet browser. This is what we call process management and it is a core part of the kernel. A two-process system is simple to manage, but if you look at modern computers, you’ll see that there are hundreds of processes running. This brings two management issues: scalability and fairness.
By scalability, we mean that the kernel must be able to handle an enormous number of processes. This includes more interactive ones like our note taking application and background processes like a service that pings an email server to see if you’ve got mail.
This brings the other issue of fairness. By fairness, we mean that the way we manage processes must be fair to all programs by giving each of them enough CPU time. If our note taking application doesn’t have enough CPU time, it will start to feel sluggish quickly since it needs fast response times. If our background email service doesn’t get enough CPU time, we’ll probably miss a new email not by a few seconds, but by a few minutes or hours since it never got the chance to get the CPU and ping the server.
Those two issues are dealt with using what we call CPU scheduling algorithms. Those algorithms are each made to solve specific problems, but most of them try to balance a mix of fairness between processes, scalability, response times and complexity. Just to name a few: FIFO (first in, first out), round-robin scheduling, multilevel queue scheduling (MLQ) and the Linux Completely Fair Scheduler (CFS).
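To see why the choice of algorithm matters, we can compute how long each process waits under plain FIFO. This is a minimal sketch that assumes every process arrives at time 0 and that we know how long each one needs the CPU, which a real scheduler does not. The function name and the numbers are made up for the example.

```python
def fifo_waiting_times(bursts):
    """Return how long each process waits before it first gets the CPU.

    bursts: CPU time each process needs, in arrival order.
    FIFO runs each process to completion before starting the next one.
    """
    waits = []
    elapsed = 0
    for burst in bursts:
        waits.append(elapsed)
        elapsed += burst
    return waits


def average(values):
    return sum(values) / len(values)


long_first = fifo_waiting_times([10, 1, 1])   # a long job arrived first
short_first = fifo_waiting_times([1, 1, 10])  # same jobs, long one last

print(long_first, average(long_first))
print(short_first, average(short_first))
```

The same three jobs wait 7 time units on average when the long one happens to arrive first, and only 1 when it arrives last. FIFO is simple, but its response times depend heavily on arrival order, which is one of the trade-offs the other algorithms try to improve on.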
Another issue the kernel has to watch for is processes hogging the CPU. When running, the process has the CPU, not the kernel, and it will only willingly give back the CPU when performing a system call. A malicious (or badly made) process could hog the CPU and prevent it from being given back to the kernel, so it would never be switched out for another process. To fix this issue, the kernel sets up with the CPU what we call a clock interrupt. This makes sure that, at a regular interval, the CPU forcefully gives control back to the kernel. The length of that interval is usually determined by which scheduling algorithm is used. It is worth mentioning that the clock interrupt isn’t the only interrupt. A lot of hardware components use an interrupt to notify the kernel that it needs to handle an event.
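The effect of the clock interrupt can be sketched with a tiny simulation. Everything here is a simplifying assumption: time advances in “ticks”, a well-behaved process makes a system call on every tick, and a hog never does. The function and parameter names are hypothetical.

```python
def simulate(yields_voluntarily, total_ticks, timer_interval=None):
    """Return how many ticks of CPU each process received.

    yields_voluntarily: dict mapping a process name to True if it makes a
        system call every tick (handing the CPU back), False if it never does.
    timer_interval: ticks between clock interrupts, or None for no timer.
    """
    names = list(yields_voluntarily)
    usage = {name: 0 for name in names}
    current = 0
    ran_since_switch = 0
    for _ in range(total_ticks):
        name = names[current]
        usage[name] += 1
        ran_since_switch += 1
        gave_up = yields_voluntarily[name]
        interrupted = (timer_interval is not None
                       and ran_since_switch >= timer_interval)
        if gave_up or interrupted:
            # The kernel is back in control and picks the next process.
            current = (current + 1) % len(names)
            ran_since_switch = 0
    return usage


behaviour = {"hog": False, "notes": True}
print(simulate(behaviour, total_ticks=12))
print(simulate(behaviour, total_ticks=12, timer_interval=3))
```

Without a timer, the hog keeps all 12 ticks and the note taking application never runs. With an interrupt every 3 ticks, the hog is forced off the CPU regularly and the note taking application gets its turns.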
Memory management
In the last section, we also mentioned how the kernel manages the memory by splitting it so that each process can be put on it. Splitting memory alone causes an issue and it is that a process needs to constantly reference things by memory addresses, but it doesn’t know where it is in memory so it can’t. To fix this issue, we have to remember the spa analogy.
First, each process had a seat that was walled off. Memory works in the same way. The kernel will give each process a specified amount of memory, or an area if you prefer. Each time a process requests something from memory, the CPU will make sure the address is valid by checking that it doesn’t reference anything outside what has been assigned to that process. If the address falls outside, the CPU will tell the kernel and the kernel will deal with the offending process. Think of it this way: you wouldn’t want your note taking application secretly reading your browser history.
Second, each process believes it has a spa to itself. This is an important one. The kernel will put a process’s memory where it can and will maybe move it around or even swap it to the disk. A program can’t keep track of this. This is why the kernel gives each program the illusion that it has the whole memory to itself, from address 0 to whatever the kernel decided. Those fictional addresses are called logical addresses. Every time the process references one, it is the job of the kernel and the CPU to translate it into a physical address, the real address in memory where the content is actually stored.
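Both ideas, the wall and the illusion, can be sketched with the simplest translation scheme there is: a base and a limit per process. This is an assumption chosen for brevity. Modern CPUs generally translate addresses in fixed-size pages through more elaborate structures, and the names below are invented for the example.

```python
class MemoryFault(Exception):
    """Raised when a process references an address outside its own area."""


def translate(logical_address, base, limit):
    """Turn a process's logical address into a physical one.

    base: physical address where the process's area starts.
    limit: size of the area, so valid logical addresses are 0 .. limit - 1.
    """
    if not 0 <= logical_address < limit:
        # The wall: the access is refused and the kernel gets notified.
        raise MemoryFault(f"address {logical_address} is outside the area")
    # The illusion: the process counts from 0, the hardware adds the base.
    return base + logical_address


# Two processes both use logical address 100 but land in different places.
print(translate(100, base=4096, limit=1024))  # note taking application
print(translate(100, base=8192, limit=1024))  # web browser
```

If the kernel moves a process elsewhere in memory, only its base changes. The process keeps using the same logical addresses and never notices.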
This illusion has a nice side effect too. It means that the process is completely oblivious to where the content of its memory is stored. This is how swapping is achieved. Say the computer runs out of memory and the kernel swaps a part of the note taking application onto the disk. When that process later references an address in that part, the CPU will notify the kernel that the logical address isn’t referencing anything in memory. This allows the kernel to bring the content back into memory and ask the CPU to service the request again (it will succeed since the content is now present).
All of this is made possible with the help of the CPU since it contains another component called the memory management unit (MMU). Depending on the CPU architecture, the MMU needs references to different data structures, which the kernel sets up at boot time, so that, at every memory request, it can go through those and find the physical address or check whether a process even has the right to access this part of memory. It is the one responsible for notifying the kernel if something is wrong or missing by raising a fault.
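The fault-then-retry flow described above can be sketched like this. It is a toy model with made-up names: memory is split into numbered chunks called pages, only a fixed number of them fit in RAM, and the kernel evicts whichever page was loaded the longest ago, which is an assumption and not how real kernels pick their victims.

```python
class ToyMemory:
    """Pages live either in RAM (resident) or on disk; reads trigger swap-ins."""

    def __init__(self, frames, pages):
        self.capacity = frames   # how many pages fit in RAM at once
        self.resident = {}       # page number -> content currently in RAM
        self.disk = dict(pages)  # everything starts out on disk
        self.faults = 0

    def read(self, page):
        if page not in self.resident:
            # The "CPU" notices the page is missing and notifies the "kernel".
            self.faults += 1
            self._swap_in(page)
        # The request is serviced again and now succeeds.
        return self.resident[page]

    def _swap_in(self, page):
        if len(self.resident) >= self.capacity:
            # RAM is full: move the page loaded longest ago back to disk.
            victim = next(iter(self.resident))
            self.disk[victim] = self.resident.pop(victim)
        self.resident[page] = self.disk.pop(page)


memory = ToyMemory(frames=2, pages={0: "a", 1: "b", 2: "c"})
print([memory.read(p) for p in [0, 1, 0, 2, 0]])
print(memory.faults)
```

The caller always gets the right content back, even though three pages are being juggled through only two slots of RAM. The only visible cost is the number of faults, each of which stands for a slow trip to the disk.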
Drivers
By this point, you probably realized that the kernel is extremely close to the hardware because of its roles. Does this mean that it has to know every piece of hardware on Earth to be able to interact with them? No, not really. A kernel might know some CPUs and a few other key components well, but if we are talking about something like a printer or your graphics card, it will rely on drivers.
Drivers are the privileged modules, usually running in kernel space, that we mentioned earlier. The kernel will load them and they will provide a way to talk to a specific piece of hardware. This is done so that, when a system call is received and the kernel doesn’t know how to interact with certain hardware, it can go through the designated driver, which will operate the hardware on behalf of the kernel.
This is how a user can buy a new printer and make it usable on a computer without ever modifying the kernel. They’ll simply go to the manufacturer to get the driver and they’ll ask the operating system to load it. After that (and maybe a quick reboot), the user should be able to operate their printer. If they have two different printers, then the kernel will talk to each of them through their driver. All of this without the user noticing the difference.
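In code, this arrangement looks a lot like an interface with several implementations. The sketch below is only an analogy written in ordinary Python, not how a real kernel loads drivers, and every class and method name in it is hypothetical.

```python
from abc import ABC, abstractmethod


class PrinterDriver(ABC):
    """What the toy kernel expects from any printer driver."""

    @abstractmethod
    def print_page(self, text):
        """Send one page of text to the device."""


class LaserDriver(PrinterDriver):
    def print_page(self, text):
        return f"[laser] {text}"


class InkjetDriver(PrinterDriver):
    def print_page(self, text):
        return f"[inkjet] {text}"


class ToyKernel:
    def __init__(self):
        self.drivers = {}

    def load_driver(self, device, driver):
        """Register the driver for a device; the kernel code itself is unchanged."""
        self.drivers[device] = driver

    def sys_print(self, device, text):
        """The 'system call': the kernel delegates to the right driver."""
        return self.drivers[device].print_page(text)


kernel = ToyKernel()
kernel.load_driver("office", LaserDriver())
kernel.load_driver("home", InkjetDriver())
print(kernel.sys_print("office", "hello"))
print(kernel.sys_print("home", "hello"))
```

The program calling sys_print uses the same call for both printers. Supporting a third model only means loading another driver.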
File systems
File systems were briefly mentioned in the last section and we’ll define them here. The kernel needs to provide the abstraction of the files and folders while being able to keep track of all the data on a disk. The file system is what allows it to do it.
The kernel will be able to split a disk into parts that are called partitions. Those partitions can be formatted using a specific file system supported by the kernel. This file system tells the kernel how to store data on it, how files are named, where they should be located etc.
This is why file systems transcend kernels a bit. They are a more standardized way of storing data on a partition. It allows our example from earlier where a USB key was formatted using a specific file system. If it is supported by two different kernels, then both will see the partition on the USB key, check the file system and interact with the data on it in the way it was intended. All of this will be completely hidden from most users unless they dig a bit deeper into disk partitioning and formatting. A few notable file systems are NTFS, FAT16/FAT32 and ext4.
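To give a feel for what “keeping track of all the data on a disk” involves, here is a toy file system. It is a sketch under heavy assumptions: the “disk” is a Python list of tiny fixed-size blocks, there are no folders, and the bookkeeping is a single table from file name to block numbers. Real file systems such as the ones named above are far more involved, and the class is invented for the example.

```python
class ToyFileSystem:
    """Files are split into fixed-size blocks scattered over a toy disk."""

    def __init__(self, block_count, block_size=4):
        self.block_size = block_size
        self.blocks = [None] * block_count  # the raw "disk"
        self.table = {}                     # file name -> list of block numbers

    def write(self, name, data):
        if name in self.table:
            self.delete(name)
        size = self.block_size
        chunks = [data[i:i + size] for i in range(0, len(data), size)]
        free = [i for i, block in enumerate(self.blocks) if block is None]
        if len(chunks) > len(free):
            raise OSError("disk full")
        used = free[:len(chunks)]
        for number, chunk in zip(used, chunks):
            self.blocks[number] = chunk
        self.table[name] = used

    def read(self, name):
        # Follow the table to reassemble the file, wherever its blocks are.
        return "".join(self.blocks[number] for number in self.table[name])

    def delete(self, name):
        for number in self.table.pop(name):
            self.blocks[number] = None


fs = ToyFileSystem(block_count=8)
fs.write("a.txt", "hello world")
fs.write("b.txt", "notes")
fs.delete("a.txt")
fs.write("c.txt", "fragmented file!")
print(fs.table["c.txt"])
print(fs.read("c.txt"))
```

The last file ends up in blocks that are not next to each other on the disk, yet reading it back gives the original text. The user only ever deals with a name, and the table does the rest.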
Conclusion
An operating system is an extremely complex piece of software that tends to be forgotten when it is well done. This article served to give an overview of it and to demystify its roles and inner workings. After reading this, you might see why they’re hard to make and why the biggest ones aren’t replaced every couple of years. Even Windows has been building on the same kernel, Windows NT, since 1993.
If you want to learn beyond what was shown here, the two books I linked earlier are a great start to this since they dive way deeper than I did.