
B. Completely Fair Scheduler

For project 1, you must implement the scheduler described in this appendix. This scheduler is a simplified version of the Completely Fair Scheduler (CFS) in Linux, which in turn is a variant of a weighted fair queuing scheduler.


B.1 Niceness

Niceness is the traditional way in which Unix systems allow users to express the relative importance of threads. Each thread has an integer nice value that determines how "nice" the thread should be to other threads. A positive nice, to the maximum of NICE_MAX (19), decreases the priority of a thread, thus causing it to give up some CPU time it would otherwise receive. On the other hand, a negative nice, to the minimum of NICE_MIN (-20), tends to take away CPU time from other threads. By default, each thread has a nice value of NICE_DEFAULT (0).

Pintos uses the functions described below to set and get nice values.

Function: int thread_get_nice (void)
Returns the current thread's nice value.

Function: void thread_set_nice (int new_nice)
Sets the current thread's nice value to new_nice.
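One detail the text leaves unspecified is what happens when new_nice falls outside [NICE_MIN, NICE_MAX]; a reasonable choice is to clamp it. The following is a minimal sketch under that assumption; the clamp_nice() helper is illustrative, not part of the Pintos API:

```c
#include <assert.h>

/* Nice-value bounds from this appendix. */
#define NICE_MIN     (-20)
#define NICE_MAX     19
#define NICE_DEFAULT 0

/* Hypothetical helper: clamp a requested nice value to the valid
   range before storing it in the thread structure. */
static int
clamp_nice (int nice)
{
  if (nice < NICE_MIN)
    return NICE_MIN;
  if (nice > NICE_MAX)
    return NICE_MAX;
  return nice;
}
```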


B.2 Computing Weights

The nice value is not directly used to make scheduling decisions. Instead, each thread is assigned an integer weight that is derived from its nice value. Threads with lower nice values have higher priority, and therefore are given higher weight. In contrast, threads with higher nice values have lower weight. Threads with higher weights tend to not only be scheduled more often, but run for longer periods of time when scheduled.

The following table shows the mapping from nice value to weight, as used in CFS. The table entries are unsigned 32-bit integers, but all scheduler calculations involving them must be carried out in 64-bit arithmetic.

 
static const uint32_t prio_to_weight[40] =
  {
    /* -20 */    88761, 71755, 56483, 46273, 36291,
    /* -15 */    29154, 23254, 18705, 14949, 11916,
    /* -10 */     9548,  7620,  6100,  4904,  3906,
    /*  -5 */     3121,  2501,  1991,  1586,  1277,
    /*   0 */     1024,   820,   655,   526,   423,
    /*   5 */      335,   272,   215,   172,   137,
    /*  10 */      110,    87,    70,    56,    45,
    /*  15 */       36,    29,    23,    18,    15,
  };
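Since the table is indexed from 0 while nice values start at -20, a lookup helper simply offsets the nice value by 20. Only the table itself comes from the text; the nice_to_weight() helper name is an assumption:

```c
#include <assert.h>
#include <stdint.h>

static const uint32_t prio_to_weight[40] =
  {
    /* -20 */    88761, 71755, 56483, 46273, 36291,
    /* -15 */    29154, 23254, 18705, 14949, 11916,
    /* -10 */     9548,  7620,  6100,  4904,  3906,
    /*  -5 */     3121,  2501,  1991,  1586,  1277,
    /*   0 */     1024,   820,   655,   526,   423,
    /*   5 */      335,   272,   215,   172,   137,
    /*  10 */      110,    87,    70,    56,    45,
    /*  15 */       36,    29,    23,    18,    15,
  };

/* Hypothetical helper: map nice in [-20, 19] to its weight.
   Nice -20 lands at index 0, nice 19 at index 39. */
static uint64_t
nice_to_weight (int nice)
{
  return prio_to_weight[nice + 20];
}
```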


B.3 Calculating Virtual Runtime vruntime

Each thread keeps track of its vruntime, which is short for "virtual runtime." It is a normalized measure of how much CPU time a thread has already consumed. CFS always selects the thread with the lowest vruntime value when picking a task to run, which represents the thread that is farthest behind relative to its desired share.

If multiple threads have the same vruntime value, you should break ties by scheduling the thread with the lowest tid. (This tie breaker is needed only for the tests; it is not used in the actual CFS algorithm.)

A thread's virtual runtime increases linearly as a function of consumed CPU time d, where the thread's weight determines the rate of increase. Hence, given the same amount of CPU consumption, vruntime increases more slowly for threads with higher weight and more quickly for threads with lower weight.

Specifically, vruntime is computed based on the thread's consumed CPU time d and its weight w as follows:

vruntime = vruntime_0 + d * w0 / w

where vruntime_0 is an initial value for the thread's virtual runtime set when the thread is added to the ready queue, and where w0 is the weight of a thread with a nice value of 0. (64-bit integer arithmetic must be used for all CFS calculations.)
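The formula can be sketched as follows. The calc_vruntime() helper and its parameter names are assumptions; w0 = 1024 is the weight of a nice-0 thread, taken from the table above:

```c
#include <assert.h>
#include <stdint.h>

#define W0 1024ULL   /* Weight of a thread with nice value 0. */

/* Hypothetical helper: advance a thread's virtual runtime by the
   CPU time d it consumed, scaled inversely by its weight w.
   Multiplying before dividing avoids truncating the quotient early;
   all arithmetic is 64-bit, as required. */
static uint64_t
calc_vruntime (uint64_t vruntime_0, uint64_t d, uint64_t w)
{
  return vruntime_0 + d * W0 / w;
}
```

For instance, 4,000,000 ns of CPU time advances a nice-0 thread's vruntime by exactly 4,000,000, but a thread of weight 2048 by only 2,000,000.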

The very first thread's vruntime_0 is initialized to 0, but consider what would happen if the vruntime_0 values of threads created later were also set to 0 when those threads are added to the ready queue: those threads would appear to have no CPU time consumed at all, and would be preferred by the scheduler until they caught up with the threads that were already running in the system (which would then not receive any CPU time).

Instead, CFS chooses as the initial value of vruntime_0 for newly created threads the minimum value of vruntime of all threads already running or ready at that point. This value, called min_vruntime, must be maintained for each ready queue.

The scheduler must calculate the vruntime values of all threads in accordance with their actual CPU consumption, so that accurate values are used when the scheduler makes decisions. Specifically, your scheduler must use updated values of vruntime when a new thread is created and whenever a thread is unblocked.


B.4 Calculating ideal_runtime

At each timer interrupt the scheduler needs to decide whether to preempt the currently running thread or not. A thread is preempted if it has run for more than its "ideal runtime," which represents the length of this thread's time slice. In CFS, the length of a thread's time slice depends on its niceness: higher priority threads receive longer time slices than lower priority threads.

Specifically, the ideal_runtime is computed as

ideal_runtime = 4000000 * n * w / s

where n is the number of threads either running or ready to run, w is the weight of the thread, and s is the sum of weights of all threads that are either running or ready to run. Since all of these variables may change as a thread runs, the preemption decision should be made based on their current values when a timer interrupt occurs.
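This computation and the per-tick preemption check can be sketched as follows (the helper names are assumptions, not the required API):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: length of a thread's time slice in ns, where
   n is the number of running + ready threads, w this thread's weight,
   and s the sum of their weights.  All arithmetic is 64-bit. */
static uint64_t
ideal_runtime (uint64_t n, uint64_t w, uint64_t s)
{
  return 4000000ULL * n * w / s;
}

/* Hypothetical helper: at a timer interrupt, preempt the running
   thread once it has run longer than its ideal runtime, using the
   current values of n, w, and s. */
static bool
should_preempt (uint64_t ran_ns, uint64_t n, uint64_t w, uint64_t s)
{
  return ran_ns > ideal_runtime (n, w, s);
}
```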

Notice that in the common case where all threads have the same weight (s = n * w), the ideal runtime is 4,000,000 ns, or 4 ms. For example, assuming a timer frequency of 1000 Hz, if two CPU-bound threads were running on a CPU, they would take turns every 4 clock ticks.

This time interval is long enough to avoid excessive context switch overhead, but short enough so that users can perceive their threads as making progress simultaneously.


B.5 Handling I/O bound threads

I/O bound threads spend much of their time in the blocked state, waiting for I/O operations to complete. (In the Linux kernel, they are referred to as "sleepers.") An example is a program such as PowerPoint, which may run only when a user presses a key to update a slide, then go back to sleep to wait for more input. To increase responsiveness, the scheduler should schedule such threads as early as possible when they become ready. Most general-purpose schedulers, CFS included, include a special policy for this case.

When a thread is unblocked, its vruntime is likely to be lower than that of threads that did not sleep. As with the newly created threads discussed above, without adjustment such threads would be scheduled exclusively until they caught up with the others. Although this meets the goal of minimizing latency, it is in general undesirable, particularly if the thread then starts using the CPU extensively.

To avoid this, CFS sets an unblocked thread's vruntime_0 to a slightly smaller value than min_vruntime, specifically:

vruntime_0 = max(vruntime, min_vruntime - 20000000)

where 20000000 represents the "sleeper bonus" given to I/O bound processes when they wake up (unblock). This adjustment tends to place these threads at the front of the ready queue.

To prevent threads from gaming this mechanism by sleeping intentionally, the thread's vruntime from when it began sleeping is used as a lower bound. This ensures that a thread's vruntime never decreases, so a thread cannot receive more CPU time than if it had remained continuously ready.
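The wake-up adjustment can be sketched as follows. The wakeup_vruntime() helper name is an assumption; since the values are unsigned, the min_vruntime - 20000000 term is clamped at 0:

```c
#include <assert.h>
#include <stdint.h>

#define SLEEPER_BONUS 20000000ULL   /* Bonus granted on wake-up, in ns. */

/* Hypothetical helper: compute an unblocked thread's vruntime_0.
   The thread receives the sleeper bonus, but its old vruntime acts
   as a lower bound so vruntime never moves backwards. */
static uint64_t
wakeup_vruntime (uint64_t old_vruntime, uint64_t min_vruntime)
{
  uint64_t bonus_floor =
    min_vruntime > SLEEPER_BONUS ? min_vruntime - SLEEPER_BONUS : 0;
  return old_vruntime > bonus_floor ? old_vruntime : bonus_floor;
}
```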

Threads receiving a sleeper bonus may temporarily obtain a vruntime value that is less than min_vruntime. In this case, do not take the vruntime value of the thread receiving the bonus into account when updating the ready queue's min_vruntime. You must maintain the invariant that a ready queue's min_vruntime value increases monotonically.

Keep in mind that unblocked threads are not necessarily added to the current CPU's ready queue. To preserve processor affinity, unblocked threads are added to the CPU on which they last ran.

Since I/O bound threads require quick response times, your scheduler must arrange for the unblocked thread to preempt the running thread if the idle thread is running, or if the vruntime of the newly unblocked thread is lower than that of the currently running thread. To do so, it must return RETURN_YIELD from sched_unblock() in this case. As a simplification for this project, you should treat all occasions in which an existing thread is unblocked as if they were related to I/O.
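The resulting decision might look like the sketch below. The should_yield() helper and its parameters are assumptions; the actual sched_unblock() must return RETURN_YIELD rather than a boolean:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: an unblocked thread should preempt the running
   thread if the CPU is idle, or if the unblocked thread is further
   behind (has a lower vruntime) than the current thread. */
static bool
should_yield (bool cpu_is_idle, uint64_t unblocked_vruntime,
              uint64_t current_vruntime)
{
  return cpu_is_idle || unblocked_vruntime < current_vruntime;
}
```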


B.6 Summary

A summary of the CFS algorithm is provided below:

- Each thread's weight is derived from its nice value via the prio_to_weight table; lower nice means higher weight.
- A thread's vruntime advances in proportion to its consumed CPU time, scaled inversely by its weight: vruntime = vruntime_0 + d * w0 / w.
- The scheduler always picks the ready thread with the lowest vruntime, breaking ties by lowest tid.
- At each timer interrupt, the running thread is preempted if it has run longer than ideal_runtime = 4000000 * n * w / s.
- A newly created thread starts with vruntime_0 = min_vruntime; an unblocked thread receives vruntime_0 = max(vruntime, min_vruntime - 20000000) and may preempt the running thread.
- Each ready queue maintains min_vruntime, which must increase monotonically.


B.7 Load Balancing

Whereas the previous sections focused on the per-processor scheduling policy, this section describes how CFS balances load between CPUs. This load balancing policy is specific to the CFS scheduler because its load metric is CFS-specific. We therefore recommend getting CFS working before attempting to implement load balancing.

When load balancing, a CPU moves threads from another CPU's ready queue to its own. To decide whether to pull threads from another CPU's ready queue, CFS examines the load on each CPU, represented by the variable cpu_load. cpu_load is defined as the sum of the weights of all threads in that CPU's ready queue. (Note that unlike s, the sum of weights used to compute ideal_runtime, cpu_load does not include the weight of the thread currently running on that CPU.)

A CPU's imbalance is calculated as follows:

imbalance = (busiest_cpu_load - my_load) / 2

where busiest_cpu_load is the cpu_load of the CPU with highest load and my_load is the cpu_load of the CPU that is executing the load balancing.

If imbalance is small (imbalance * 4 < busiest_cpu_load) then no rebalancing occurs. Otherwise, CFS pulls threads from the busiest CPU to the CPU that initiated the load balancing. It continues to do so until the aggregate weight of threads that have been migrated equals or exceeds imbalance.
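These two rules can be sketched as follows (the helper names are assumptions; busiest_load is always at least my_load, since it belongs to the most loaded CPU):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: how much load should migrate to even things
   out, where loads are sums of ready-thread weights. */
static uint64_t
imbalance (uint64_t busiest_load, uint64_t my_load)
{
  return (busiest_load - my_load) / 2;
}

/* Hypothetical helper: rebalance only when the imbalance is not
   small, i.e. when imbalance * 4 >= busiest_load. */
static bool
should_rebalance (uint64_t busiest_load, uint64_t my_load)
{
  return imbalance (busiest_load, my_load) * 4 >= busiest_load;
}
```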

Since threads' vruntime values are significant only when compared to the vruntime of other threads on a CPU's local queue, the vruntime of threads in different CPUs' ready queues may be vastly different. Therefore, the vruntime values of each of the migrated threads must be adjusted upon migration as follows:

vruntime_0 = vruntime - busiest_cpu_minvruntime + my_minvruntime

where busiest_cpu_minvruntime is the min_vruntime of the busiest CPU and my_minvruntime is the min_vruntime of the CPU that initiated the load balancing. This adjustment will allow a thread to retain any sleeper bonus it may have enjoyed at the time of the migration. The min_vruntime of the receiving CPU is not updated. Make sure that vruntime_0 (and thus vruntime) stays nonnegative.
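A sketch of the per-thread adjustment, clamped so the result stays nonnegative even when a sleeper bonus makes vruntime smaller than the source queue's min_vruntime (the migrate_vruntime() helper name is an assumption):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rebase a migrated thread's vruntime from the
   source CPU's min_vruntime to the destination CPU's min_vruntime.
   Adding before subtracting lets us clamp at 0 instead of
   underflowing the unsigned subtraction. */
static uint64_t
migrate_vruntime (uint64_t vruntime, uint64_t src_min_vruntime,
                  uint64_t dst_min_vruntime)
{
  uint64_t shifted = vruntime + dst_min_vruntime;
  return shifted > src_min_vruntime ? shifted - src_min_vruntime : 0;
}
```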



This document was generated by Godmar Back on January 19, 2022 using texi2html.