Commit 20ee369

Merge branch 'master' into capitalize-linux
2 parents c2ecc33 + 5f76b7e commit 20ee369

82 files changed: +877 -426 lines changed

Assets/linux-kernel.png

-11.7 KB

Booting/README.md

+1 -1
@@ -7,6 +7,6 @@ This chapter describes the Linux kernel boot process. Here you will see a series
 * [Video mode initialization and transition to protected mode](linux-bootstrap-3.md) - describes video mode initialization in the kernel setup code and transition to protected mode.
 * [Transition to 64-bit mode](linux-bootstrap-4.md) - describes preparation for transition into 64-bit mode and details of transition.
 * [Kernel Decompression](linux-bootstrap-5.md) - describes preparation before kernel decompression and details of direct decompression.
-* [Kernel random address randomization](linux-bootstrap-6.md) - describes randomization of the Linux kernel load address.
+* [Kernel load address randomization](linux-bootstrap-6.md) - describes randomization of the Linux kernel load address.
 
 This chapter coincides with `Linux kernel v4.17`.

Booting/images/bss.png

-1.37 KB
-11.3 KB

Booting/images/linear_address.png

-717 Bytes

Booting/images/minimal_stack.png

-1.82 KB

Booting/images/simple_bootloader.png

-335 Bytes

Booting/images/stack1.png

-1.38 KB

Booting/images/stack2.png

-1.2 KB
-451 Bytes
-1.63 KB

Booting/linux-bootstrap-1.md

+2 -2
@@ -73,7 +73,7 @@ The starting address is formed by adding the base address to the value in the EI
 '0xfffffff0'
 ```
 
-We get `0xfffffff0`, which is 16 bytes below 4GB. This point is called the [reset vector](https://en.wikipedia.org/wiki/Reset_vector). It's the memory location at which the CPU expects to find the first instruction to execute after reset. It contains a [jump](https://en.wikipedia.org/wiki/JMP_%28x86_instruction%29) (`jmp`) instruction that usually points to the [BIOS](https://en.wikipedia.org/wiki/BIOS) (Basic Input/Output System) entry point. For example, if we look in the [coreboot](https://www.coreboot.org/) source code (`src/cpu/x86/16bit/reset16.inc`), we see:
+We get `0xfffffff0`, which is 16 bytes below 4GB. This point is called the [reset vector](https://en.wikipedia.org/wiki/Reset_vector). It's the memory location at which the CPU expects to find the first instruction to execute after reset. It contains a [jump](https://en.wikipedia.org/wiki/JMP_%28x86_instruction%29) (`jmp`) instruction that usually points to the [BIOS](https://en.wikipedia.org/wiki/BIOS) (Basic Input/Output System) entry point. For example, if we look in the [coreboot](https://www.coreboot.org/) source code ([src/cpu/x86/16bit/reset16.inc](https://review.coreboot.org/plugins/gitiles/coreboot/+/refs/heads/4.11_branch/src/cpu/x86/16bit/reset16.inc)), we see:
 
 ```assembly
 .section ".reset", "ax", %progbits
@@ -87,7 +87,7 @@ _start:
 
 Here we can see the `jmp` instruction [opcode](http://ref.x86asm.net/coder32.html#xE9), which is `0xe9`, and its destination address at `_start16bit - ( . + 2)`.
 
-We also see that the `reset` section is `16` bytes and is compiled to start from the address `0xfffffff0` (`src/cpu/x86/16bit/reset16.ld`):
+We also see that the `reset` section is `16` bytes and is compiled to start from the address `0xfffffff0` ([src/cpu/x86/16bit/reset16.ld](https://review.coreboot.org/plugins/gitiles/coreboot/+/refs/heads/4.11_branch/src/cpu/x86/16bit/reset16.ld)):
 
 ```
 SECTIONS {

Booting/linux-bootstrap-3.md

+2 -2
@@ -299,7 +299,7 @@ io_delay();
 
 At first, there is an inline assembly statement with a `cli` instruction which clears the interrupt flag (`IF`). After this, external interrupts are disabled. The next line disables NMI (non-maskable interrupt).
 
-An interrupt is a signal to the CPU which is emitted by hardware or software. After getting such a signal, the CPU suspends the current instruction sequence, saves its state and transfers control to the interrupt handler. After the interrupt handler has finished it's work, it transfers control back to the interrupted instruction. Non-maskable interrupts (NMI) are interrupts which are always processed, independently of permission. They cannot be ignored and are typically used to signal for non-recoverable hardware errors. We will not dive into the details of interrupts now but we will be discussing them in the coming posts.
+An interrupt is a signal to the CPU which is emitted by hardware or software. After getting such a signal, the CPU suspends the current instruction sequence, saves its state and transfers control to the interrupt handler. After the interrupt handler has finished its work, it transfers control back to the interrupted instruction. Non-maskable interrupts (NMI) are interrupts which are always processed, independently of permission. They cannot be ignored and are typically used to signal for non-recoverable hardware errors. We will not dive into the details of interrupts now but we will be discussing them in the coming posts.
 
 Let's get back to the code. We can see in the second line that we are writing the byte `0x80` (disabled bit) to `0x70` (the CMOS Address register). After that, a call to the `io_delay` function occurs. `io_delay` causes a small delay and looks like:
 
@@ -326,7 +326,7 @@ static int a20_test(int loops)
 
 saved = ctr = rdfs32(A20_TEST_ADDR);
 
-while (loops--) {
+while (loops--) {
 wrfs32(++ctr, A20_TEST_ADDR);
 io_delay(); /* Serialize and make delay constant */
 ok = rdgs32(A20_TEST_ADDR+0x10) ^ ctr;

Booting/linux-bootstrap-5.md

+1 -1
@@ -204,7 +204,7 @@ Like before, we push `rsi` onto the stack to preserve the pointer to `boot_param
 * `output` - the start address of the decompressed kernel;
 * `output_len` - the size of the decompressed kernel;
 
-All arguments will be passed through registers as per the [System V Application Binary Interface](http://www.x86-64.org/documentation/abi.pdf). We've finished all the preparations and can now decompress the kernel.
+All arguments will be passed through registers as per the [System V Application Binary Interface](https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-1.0.pdf). We've finished all the preparations and can now decompress the kernel.
 
 Kernel decompression
 --------------------------------------------------------------------------------

Booting/linux-bootstrap-6.md

+3 -3
@@ -46,7 +46,7 @@ This function takes five parameters:
 * `input`;
 * `input_size`;
 * `output`;
-* `output_isze`;
+* `output_size`;
 * `virt_addr`.
 
 Let's try to understand what these parameters are. The first parameter, `input` is just the `input_data` parameter of the `extract_kernel` function from the [arch/x86/boot/compressed/misc.c](https://github.com/torvalds/linux/blob/v4.16/arch/x86/boot/compressed/misc.c) source code file, cast to `unsigned long`:
@@ -146,7 +146,7 @@ Now, we call another function:
 initialize_identity_maps();
 ```
 
-The `initialize_identity_maps` function is defined in the [arch/x86/boot/compressed/kaslr_64.c](https://github.com/torvalds/linux/blob/master/arch/x86/boot/compressed/kaslr_64.c) source code file. This function starts by initialising an instance of the `x86_mapping_info` structure called `mapping_info`:
+The `initialize_identity_maps` function is defined in the [arch/x86/boot/compressed/kaslr_64.c](https://github.com/torvalds/linux/blob/master/arch/x86/boot/compressed/kaslr_64.c) source code file. This function starts by initializing an instance of the `x86_mapping_info` structure called `mapping_info`:
 
 ```C
 mapping_info.alloc_pgt_page = alloc_pgt_page;
@@ -254,7 +254,7 @@ add_identity_map(mem_avoid[MEM_AVOID_ZO_RANGE].start,
 mem_avoid[MEM_AVOID_ZO_RANGE].size);
 ```
 
-THe `mem_avoid_init` function first tries to avoid memory regions currently used to decompress the kernel. We fill an entry from the `mem_avoid` array with the start address and the size of the relevant region and call the `add_identity_map` function, which builds the identity mapped pages for this region. The `add_identity_map` function is defined in the [arch/x86/boot/compressed/kaslr_64.c](https://github.com/torvalds/linux/blob/v4.16/arch/x86/boot/compressed/kaslr_64.c) source code file and looks like this:
+The `mem_avoid_init` function first tries to avoid memory regions currently used to decompress the kernel. We fill an entry from the `mem_avoid` array with the start address and the size of the relevant region and call the `add_identity_map` function, which builds the identity mapped pages for this region. The `add_identity_map` function is defined in the [arch/x86/boot/compressed/kaslr_64.c](https://github.com/torvalds/linux/blob/v4.16/arch/x86/boot/compressed/kaslr_64.c) source code file and looks like this:
 
 ```C
 void add_identity_map(unsigned long start, unsigned long size)

Cgroups/images/menuconfig.png

-15.1 KB

Cgroups/linux-cgroups-1.md

+9 -9
@@ -30,7 +30,7 @@ Each of these control group subsystems depends on related configuration option.
 You may see enabled control groups on your computer via [proc](https://en.wikipedia.org/wiki/Procfs) filesystem:
 
 ```
-$ cat /proc/cgroups
+$ cat /proc/cgroups
 #subsys_name hierarchy num_cgroups enabled
 cpuset 8 1 1
 cpu 7 66 1
@@ -90,7 +90,7 @@ So, if we will run this script we will see following result:
 
 ```
 $ sudo chmod +x cgroup_test_script.sh
-~$ ./cgroup_test_script.sh
+~$ ./cgroup_test_script.sh
 print line
 print line
 print line
@@ -147,7 +147,7 @@ crw-rw-rw- 1 root tty 5, 0 Dec 3 22:48 /dev/tty
 see the first `c` letter in a permissions list. The second part is `5:0` is major and minor numbers of the device. You can see these numbers in the output of `ls` too. And the last `w` letter forbids tasks to write to the specified device. So let's start the `cgroup_test_script.sh` script:
 
 ```
-~$ ./cgroup_test_script.sh
+~$ ./cgroup_test_script.sh
 print line
 print line
 print line
@@ -164,7 +164,7 @@ and add pid of this process to the `devices/tasks` file of our group:
 The result of this action will be as expected:
 
 ```
-~$ ./cgroup_test_script.sh
+~$ ./cgroup_test_script.sh
 print line
 print line
 print line
@@ -174,7 +174,7 @@ print line
 ./cgroup_test_script.sh: line 5: /dev/tty: Operation not permitted
 ```
 
-Similar situation will be when you will run you [docker](https://en.wikipedia.org/wiki/Docker_\(software\)) containers for example:
+Similar situation will be when you will run you [docker](https://en.wikipedia.org/wiki/Docker_(software)) containers for example:
 
 ```
 ~$ docker ps
@@ -213,7 +213,7 @@ Control group /:
 │ └─6404 /bin/bash
 ```
 
-Now we know a little about `control groups` mechanism, how to use it manually and what's purpose of this mechanism. It's time to look inside of the Linux kernel source code and start to dive into implementation of this mechanism.
+Now we know a little about `control groups` mechanism, how to use it manually and what's the purpose of this mechanism. It's time to look inside of the Linux kernel source code and start to dive into implementation of this mechanism.
 
 Early initialization of control groups
 --------------------------------------------------------------------------------
@@ -294,7 +294,7 @@ Here we may see call of the `init_cgroup_root` function which will execute initi
 struct cgroup_root cgrp_dfl_root;
 ```
 
-Its `cgrp` field represented by the `cgroup` structure which represents a `cgroup` as you already may guess and defined in the [include/linux/cgroup-defs.h](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/include/linux/cgroup-defs.h) header file. We already know that a process which is represented by the `task_struct` in the Linux kernel. The `task_struct` does not contain direct link to a `cgroup` where this task is attached. But it may be reached via `css_set` field of the `task_struct`. This `css_set` structure holds pointer to the array of subsystem states:
+Its `cgrp` field represented by the `cgroup` structure which represents a `cgroup` as you already may guess and defined in the [include/linux/cgroup-defs.h](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/include/linux/cgroup-defs.h) header file. We already know that a process is represented by the `task_struct` in the Linux kernel. The `task_struct` does not contain direct link to a `cgroup` where this task is attached. But it may be reached via `css_set` field of the `task_struct`. This `css_set` structure holds pointer to the array of subsystem states:
 
 ```C
 struct css_set {
@@ -324,14 +324,14 @@ struct cgroup_subsys_state {
 
 So, the overall picture of `cgroups` related data structure is following:
 
-```
+```
 +-------------+ +---------------------+ +------------->+---------------------+ +----------------+
 | task_struct | | css_set | | | cgroup_subsys_state | | cgroup |
 +-------------+ | | | +---------------------+ +----------------+
 | | | | | | | | flags |
 | | | | | +---------------------+ | cgroup.procs |
 | | | | | | cgroup |--------->| id |
-| | | | | +---------------------+ | .... |
+| | | | | +---------------------+ | .... |
 |-------------+ |---------------------+----+ +----------------+
 | cgroups | ------> | cgroup_subsys_state | array of cgroup_subsys_state
 |-------------+ +---------------------+------------------>+---------------------+ +----------------+

Concepts/linux-cpu-2.md

+7 -7
@@ -19,13 +19,13 @@ set_cpu_present(cpu, true);
 set_cpu_possible(cpu, true);
 ```
 
-Before we will consider implementation of these functions, let's consider all of these masks.
+Before we consider implementation of these functions, let's consider all of these masks.
 
-The `cpu_possible` is a set of cpu ID's which can be plugged in anytime during the life of that system boot or in other words mask of possible CPUs contains maximum number of CPUs which are possible in the system. It will be equal to value of the `NR_CPUS` which is which is set statically via the `CONFIG_NR_CPUS` kernel configuration option.
+The `cpu_possible` is a set of cpu ID's which can be plugged in anytime during the life of that system boot or in other words mask of possible CPUs contains maximum number of CPUs which are possible in the system. It will be equal to value of the `NR_CPUS` which is set statically via the `CONFIG_NR_CPUS` kernel configuration option.
 
 The `cpu_present` mask represents which CPUs are currently plugged in.
 
-The `cpu_online` represents a subset of the `cpu_present` and indicates CPUs which are available for scheduling or in other words a bit from this mask tells to kernel is a processor may be utilized by the Linux kernel.
+The `cpu_online` represents a subset of the `cpu_present` and indicates CPUs which are available for scheduling or in other words a bit from this mask tells the kernel if a processor may be utilized by the Linux kernel.
 
 The last mask is `cpu_active`. Bits of this mask tells to Linux kernel is a task may be moved to a certain processor.
 
@@ -94,9 +94,9 @@ And returns `1` every time. We need it here for only one purpose: at compile tim
 cpumask API
 --------------------------------------------------------------------------------
 
-As we can define cpumask with one of the method, Linux kernel provides API for manipulating a cpumask. Let's consider one of the function which presented above. For example `set_cpu_online`. This function takes two parameters:
+As we can define cpumask with one of the methods, Linux kernel provides API for manipulating a cpumask. Let's consider one of the function which presented above. For example `set_cpu_online`. This function takes two parameters:
 
-* Number of CPU;
+* Index of CPU;
 * CPU status;
 
 Implementation of this function looks as:
@@ -113,7 +113,7 @@ void set_cpu_online(unsigned int cpu, bool online)
 }
 ```
 
-First of all it checks the second `state` parameter and calls `cpumask_set_cpu` or `cpumask_clear_cpu` depends on it. Here we can see casting to the `struct cpumask *` of the second parameter in the `cpumask_set_cpu`. In our case it is `cpu_online_bits` which is a bitmap and defined as:
+First of all it checks the second `state` parameter and calls `cpumask_set_cpu` or `cpumask_clear_cpu` depending on it. Here we can see casting to the `struct cpumask *` of the second parameter in the `cpumask_set_cpu`. In our case it is `cpu_online_bits` which is a bitmap and defined as:
 
 ```C
 static DECLARE_BITMAP(cpu_online_bits, CONFIG_NR_CPUS) __read_mostly;
@@ -128,7 +128,7 @@ static inline void cpumask_set_cpu(unsigned int cpu, struct cpumask *dstp)
 }
 ```
 
-The `set_bit` function takes two parameters too, and sets a given bit (first parameter) in the memory (second parameter or `cpu_online_bits` bitmap). We can see here that before `set_bit` will be called, its two parameters will be passed to the
+The `set_bit` function takes two parameters too, and sets a given bit (first parameter) in the memory (second parameter or `cpu_online_bits` bitmap). We can see here that before `set_bit` is called, its two parameters will be passed to the
 
 * cpumask_check;
 * cpumask_bits.
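
Reviewer note: the `Concepts/linux-cpu-2.md` hunks above describe `set_cpu_online` choosing between setting and clearing a bit in the `cpu_online_bits` bitmap. The following is a minimal user-space sketch of that idea, not the kernel implementation. Every `sketch_` name, the constant `SKETCH_NR_CPUS`, and the use of a plain range assertion in place of `cpumask_check` are assumptions made for illustration only.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for CONFIG_NR_CPUS; the value is arbitrary. */
#define SKETCH_NR_CPUS 8

/* Number of bits in one bitmap word and words needed for a given bit count. */
#define SKETCH_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_WORDS(bits) \
	(((bits) + SKETCH_BITS_PER_WORD - 1) / SKETCH_BITS_PER_WORD)

/* Plays the role of the cpu_online_bits bitmap described in the text. */
static unsigned long sketch_online_bits[SKETCH_WORDS(SKETCH_NR_CPUS)];

/* Set bit nr in the bitmap starting at addr (non-atomic, unlike the kernel). */
static void sketch_set_bit(unsigned int nr, unsigned long *addr)
{
	addr[nr / SKETCH_BITS_PER_WORD] |= 1UL << (nr % SKETCH_BITS_PER_WORD);
}

/* Clear bit nr in the bitmap starting at addr. */
static void sketch_clear_bit(unsigned int nr, unsigned long *addr)
{
	addr[nr / SKETCH_BITS_PER_WORD] &= ~(1UL << (nr % SKETCH_BITS_PER_WORD));
}

/* Report whether bit nr is set. */
static bool sketch_test_bit(unsigned int nr, const unsigned long *addr)
{
	return (addr[nr / SKETCH_BITS_PER_WORD] >> (nr % SKETCH_BITS_PER_WORD)) & 1UL;
}

/* Mirrors the described flow: check the index, then set or clear the bit. */
static void sketch_set_cpu_online(unsigned int cpu, bool online)
{
	assert(cpu < SKETCH_NR_CPUS); /* assumed stand-in for cpumask_check */
	if (online)
		sketch_set_bit(cpu, sketch_online_bits);
	else
		sketch_clear_bit(cpu, sketch_online_bits);
}

/* Query helper added for the sketch. */
static bool sketch_cpu_online(unsigned int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	return sketch_test_bit(cpu, sketch_online_bits);
}
```

Calling `sketch_set_cpu_online(3, true)` sets bit 3 of the bitmap, so `sketch_cpu_online(3)` then returns true while other CPUs stay offline. Calling it again with `false` clears the bit.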
