-
Philippe Gerum authored
To maintain consistency between both Cobalt and host schedulers, reflecting a thread migration to another CPU into the Cobalt scheduler state must happen from secondary mode only, on behalf of the migrated thread itself once it runs on the target CPU (*). For this reason, handle_setaffinity_event() may NOT fix up thread->sched immediately using the passive migration call for a blocked thread. Failing to ensure this may lead to the following scenario, with taskA as the migrated thread, and taskB any other Cobalt thread: CPU0(cobalt): suspend(taskA, XNRELAX) CPU0(cobalt): suspend(taskB, ...) CPU0(cobalt): enter_root(), next_task := <whatever> ... CPU0(root): handle_setaffinity_event(taskA, CPU3) taskA->sched = xnsched_struct(CPU3) CPU0(root): <relax epilogue code for taskA> CPU0(root): resume(taskA, XNRELAX) enqueue(rq=CPU3), reschedule IPI to CPU3 CPU0(root): resume(taskB, ...) CPU0(root): leave_root(), host_task := taskA ... CPU0(cobalt): suspend(taskA) CPU0(cobalt): enter_root(), next_task := host_task := taskA CPU0(root??): <<<taskA execution>>> BROKEN CPU3(cobalt): <taskA execution> via reschedule IPI To sum up, we would end up with the migrated task running on both CPUs in parallel, which would be, well, a problem. To resync the Cobalt scheduler information, send a SIGSHADOW signal to the migrated thread, asking it to switch back to primary mode from the handler, at which point the interrupted syscall may be restarted. This guarantees that check_affinity() is called, and fixups are done from the proper context. There is a cost: setting the affinity of a blocked thread may now induce a delay for that target thread as well, since it has to make a roundtrip between primary and secondary modes for handling the change event. However, 1) there is no other safe way to handle such event, 2) changing the CPU affinity of a remote real-time thread at random times makes absolutely no sense latency-wise, anyway. (*) This means that the Cobalt scheduler state regarding the CPU information lags behind the host scheduler state until the migrated thread switches back to primary mode (i.e. task_cpu(p) != xnsched_cpu(xnthread_from_task(p)->sched)). This is ok since Cobalt does not schedule such thread until then.
59344943