1. 02 Dec, 2019 1 commit
    • cobalt/process: exit_thread handlers should reschedule as needed · 393ece85
      Philippe Gerum authored
      An exit_handler should be called from the root domain, with
      preemption enabled and hard IRQs on, in order to keep all options
      open for it, such as taking regular sleeping locks.
      
      If such a handler updates the Cobalt scheduler state, it has to
      trigger the rescheduling procedure (xnsched_run()) internally as well,
      grouping the changes and the rescheduling call into the same
      interrupt-free section, in order to guard against CPU migration.
      
      Relying on the core to kick off that procedure later on in order
      to commit the pending changes is unreliable.
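      A minimal sketch of the resulting idiom, assuming the usual nklock
      pattern; my_exit_handler and the state update are placeholders,
      not the actual implementation:

        #include <cobalt/kernel/lock.h>
        #include <cobalt/kernel/sched.h>

        /* Hypothetical exit_thread handler: the scheduler-state update
         * and xnsched_run() share one interrupt-free section, so no
         * CPU migration can slip in between them. */
        static void my_exit_handler(struct xnthread *thread)
        {
              spl_t s;

              xnlock_get_irqsave(&nklock, s); /* hard IRQs off from here */
              /* ... update Cobalt scheduler state for 'thread' ... */
              xnsched_run();                  /* commit pending changes now */
              xnlock_put_irqrestore(&nklock, s);
        }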
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
  2. 25 Nov, 2019 3 commits
  3. 03 Jun, 2019 5 commits
    • cobalt/kernel: Implement synchronous stop and resume for process debugging · 0b9e8180
      Jan Kiszka authored
      When a thread running in primary mode hits a breakpoint, it is
      immediately relaxed, leaving room for other runnable RT threads of
      the process to take over. This can delay gdb in taking control and
      stopping the whole process.
      
      To prevent this unexpected and often undesired behavior, this introduces
      a mechanism to stop all threads of a process in primary mode when one of
      them runs into a breakpoint or a single-step event. The new XNDBGSTOP
      thread state is used for this purpose.
      
      On resumption, no thread may continue in primary mode until all
      threads that are supposed to do so have reached this state. The
      last one to arrive then wakes up all threads waiting on XNDBGSTOP.
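      A rough sketch of the barrier idea; resume_barrier, stopped_count
      and unstop_group are illustrative names, not the actual
      implementation, and nklock protection is elided:

        #include <linux/atomic.h>
        #include <cobalt/kernel/thread.h>

        static void unstop_group(void);  /* hypothetical: xnthread_resume() loop */
        static atomic_t stopped_count;   /* threads yet to reach the barrier */

        static void resume_barrier(struct xnthread *thread)
        {
              if (!atomic_dec_and_test(&stopped_count)) {
                    /* not the last one: park on XNDBGSTOP until the
                     * whole group has checked in */
                    xnthread_suspend(thread, XNDBGSTOP,
                                     XN_INFINITE, XN_RELATIVE, NULL);
                    return;
              }
              unstop_group();  /* last arrival: wake all XNDBGSTOP waiters */
        }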
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    • cobalt/kernel: Reintroduce process-local thread list · ac66d641
      Jan Kiszka authored
      Track threads on a per-process basis again to make walking through them
      cheaper.
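      Illustratively, walking a process's threads then becomes a local
      list traversal; the member names below are assumptions about the
      reintroduced structure:

        #include <linux/list.h>
        #include <cobalt/kernel/thread.h>

        /* Assumed: cobalt_process carries a 'thread_list' head and
         * xnthread a 'next' link. */
        static void visit_process_threads(struct cobalt_process *process)
        {
              struct xnthread *thread;

              list_for_each_entry(thread, &process->thread_list, next)
                    /* ... per-thread work, no global list scan ... */;
        }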
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    • cobalt/kernel: Switch back ptraced primary mode threads on resume · cc4964de
      Jan Kiszka authored
      When a thread is stopped in primary mode for debugging, make sure it
      will migrate back before resuming in user space. This is a building
      block for making real-time process debugging more deterministic.
      
      Whether a thread should resume in primary mode is conveyed via a
      new thread info flag, XNCONTHI. It is set either by the exception
      handler on detecting a breakpoint hit in primary user mode, or by
      xnthread_relax when invoked for a thread under XNSSTEP, which
      indicates that the thread was signaled to stop.
      
      The feature depends on the new I-pipe notifier for user interrupt
      return, i.e. a callback invoked when a secondary mode thread is
      about to return to user space after an exception or an interrupt.
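      A sketch of how that notifier might use the flag; the function
      name is illustrative:

        #include <cobalt/kernel/thread.h>

        /* Hypothetical user-return callback, running over the current
         * task: if XNCONTHI was set while the thread was stopped,
         * migrate it back to primary mode before user space resumes. */
        static void user_return_notifier_sketch(void)
        {
              struct xnthread *thread = xnthread_current();

              if (thread && xnthread_test_info(thread, XNCONTHI)) {
                    xnthread_clear_info(thread, XNCONTHI);
                    xnthread_harden();  /* back to primary mode */
              }
        }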
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    • cobalt/kernel: Rework register/unregister_debugged_thread · 65973e23
      Jan Kiszka authored
      Given that registration can run asynchronously to deregistration,
      it is safer to put both the test for XNSSTEP and the deregistration
      under nklock. This is likely overkill for task termination, but it
      is more consistent.
      
      While at it, move up the functions and add a cobalt_ prefix. We will
      need them earlier in the file and are going to export one of them for
      use in other code modules.
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    • cobalt/kernel: Simplify mayday processing · 4e7bb325
      Jan Kiszka authored
      The mayday mechanism exists in order to kick a Xenomai userspace
      task into secondary mode while it is running userspace code. For
      that, we ask the I-pipe to call us back when the task has been
      interrupted and is about to return to userspace.
      
      So far, the relaxation has been deferred from that callback to the
      return path of a special syscall, injected via a VDSO-like
      mechanism. This is undesirable because it is a complex,
      arch-specific scheme that can easily break and, specifically,
      destroys the backtrace of ptraced tasks.
      
      Fortunately, we can fulfill the needs of mayday also by relaxing
      the task directly from the mayday callback. Tested successfully on
      x86-64 and ARM.
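      The gist of the simplification, as a sketch; the callback name is
      illustrative, and the SIGDEBUG reason is picked for the example:

        #include <cobalt/kernel/thread.h>

        /* Hypothetical mayday callback, invoked when the interrupted
         * task is about to return to user space: relax it right here
         * instead of detouring through an injected syscall. */
        static void mayday_callback_sketch(void)
        {
              xnthread_relax(0, SIGDEBUG_MIGRATE_SIGNAL);
        }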
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
  4. 19 Mar, 2019 2 commits
  5. 02 Oct, 2018 2 commits
  6. 03 Jul, 2018 5 commits
    • cobalt/tracing: Primarily identify threads via pid · 8c668069
      Jan Kiszka authored
      Except for the short phase between thread_init and shadow_map, a thread
      is always identifiable via the pid of its Linux mate. Use this shorter
      value, which also correlates with what ftrace records anyway, instead of
      the pointer or the name. Report the full thread name only in prominent
      cases: init, resume and switch.
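      An illustrative tracepoint shape following that rule; the event
      name and fields are examples, not the actual definitions:

        #include <linux/tracepoint.h>
        #include <cobalt/kernel/thread.h>

        TRACE_EVENT(cobalt_thread_example,
              TP_PROTO(struct xnthread *thread),
              TP_ARGS(thread),
              TP_STRUCT__entry(
                    __field(pid_t, pid)
              ),
              TP_fast_assign(
                    /* the pid of the Linux mate identifies the thread */
                    __entry->pid = xnthread_host_pid(thread);
              ),
              TP_printk("pid=%d", __entry->pid)
        );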
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    • cobalt/tracing: Don't report current thread in tracepoints · e23670a7
      Jan Kiszka authored
      All these tracepoints fire synchronously, and the thread context
      is already recorded by ftrace.
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    • cobalt/timer: fix, rework ptrace detection and timer forwarding · 9ebc2b6e
      Philippe Gerum authored
      Ptracing may cause timer overruns, as the ptraced application
      cannot wait for the current period in a timely manner when stopped
      on a breakpoint or single-stepped. A mechanism was introduced a
      long time ago for hiding those overruns from the application while
      ptracing is in effect.
      
      The current implementation dealing with this case has two major flaws:
      
      - it crashes the system when single-stepping (observed on ARM i.MX6q),
        revealing a past regression which went unnoticed so far.
      
      - it uses a big hammer to forward (most) timers without running
        their respective timeout handlers while ptracing, in order to
        hide this timespan from the overrun accounting code. This
        introduces two issues:

        * the timer forwarding code sits in the tick announcement path,
          which is very hot, even though ptracing an application is
          definitely not a common operation.

        * all timers are affected/blocked during ptracing, except those
          specifically marked (XNTIMER_NOBLCK) at creation, which turns
          out to be impractical for the common case.
      
      The new implementation only addresses what is at stake, i.e.
      hiding ptrace-induced overrun reports from applications. This can
      be done simply by noting that a thread should disregard overruns
      after an exit from ptraced mode (XNHICCUP), then discarding the
      pending overruns when this flag is detected by the code reporting
      them (xntimer_get_overrun()).
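      A simplified sketch of that reporting-side filter; the helper
      shape below is an assumption, the actual check lives in
      xntimer_get_overrun():

        #include <cobalt/kernel/thread.h>

        static unsigned long filter_overruns(struct xnthread *thread,
                                             unsigned long pending)
        {
              if (xnthread_test_info(thread, XNHICCUP)) {
                    /* overruns piled up while ptraced: hide them */
                    xnthread_clear_info(thread, XNHICCUP);
                    return 0;
              }
              return pending;
        }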
    • cobalt: Fix conditional build by avoiding #if XENO_DEBUG() · 90e99e74
      Jan Kiszka authored
      In contrast to #ifdef CONFIG_x, #if IS_ENABLED(x) (or our wrapper
      of the latter) does not update the dependency information for
      kbuild. So switching any such config option could easily leave
      inconsistent build artifacts behind.
      
      This conversion also fixes de66d324: there is not, and never was,
      a CONFIG_XENO_DEBUG_LOCKING.
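      An illustrative before/after of the conversion, assuming the usual
      CONFIG_XENO_OPT_* naming:

        /* Before: kbuild's dependency info is not updated for this
         * form, so toggling the option may leave stale objects. */
        #if XENO_DEBUG(LOCKING)
        void check_locking(void);
        #endif

        /* After: a plain #ifdef is visible to kbuild. */
        #ifdef CONFIG_XENO_OPT_DEBUG_LOCKING
        void check_locking(void);
        #endif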
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    • Philippe Gerum authored · c6232b1f
  7. 10 Apr, 2018 1 commit
    • cobalt: unconditionally allow idling requests · 33dd9ce1
      Philippe Gerum authored
      Wait for the idling interface rework from I-pipe/4.14, which will
      provide more information for determining whether Cobalt should be ok
      with entering the target idle state.
      
      As a result of this change, the original kernel behavior is restored
      for all ipipe-4.9.y patches with respect to entering an idle state,
      including for the releases lacking commits #89146106e8 or #8d3fa22c95.
      
      This change only affects kernels built with CONFIG_CPU_IDLE enabled.
      
      NOTE: XNIDLE is intentionally kept for future use in the Cobalt core.
  8. 20 Mar, 2018 1 commit
    • cobalt/sched, clock: provide ipipe_enter_idle_hook() · 59560645
      Philippe Gerum authored
      Since kernel 4.9, the pipeline code may ask us whether it would be
      fine to enter the idle state on the current CPU, by means of a
      probing hook called ipipe_enter_idle_hook().
      
      Provide this hook, considering that the absence of outstanding
      timers means idleness to us.
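      A sketch of the hook under that policy; its exact signature and
      the timer-queue accessor are assumptions:

        #include <cobalt/kernel/sched.h>

        /* Hypothetical helper standing in for a walk over the per-CPU
         * Cobalt timer queues. */
        static bool cobalt_cpu_has_timers(int cpu);

        bool ipipe_enter_idle_hook(void)
        {
              /* no outstanding timers == fine with idling */
              return !cobalt_cpu_has_timers(xnsched_cpu(xnsched_current()));
        }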
  9. 08 Dec, 2017 1 commit
  10. 29 Oct, 2017 1 commit
    • cobalt/process: fix CPU migration handling for blocked threads · 59344943
      Philippe Gerum authored
      To maintain consistency between the Cobalt and host schedulers,
      reflecting a thread's migration to another CPU into the Cobalt
      scheduler state must happen from secondary mode only, on behalf of
      the migrated thread itself once it runs on the target CPU (*).
      
      For this reason, handle_setaffinity_event() may NOT fix up
      thread->sched immediately using the passive migration call for a
      blocked thread.
      
      Failing to ensure this may lead to the following scenario, with taskA
      as the migrated thread, and taskB any other Cobalt thread:
      
      CPU0(cobalt): suspend(taskA, XNRELAX)
      CPU0(cobalt): suspend(taskB, ...)
      CPU0(cobalt): enter_root(), next_task := <whatever>
      ...
      CPU0(root): handle_setaffinity_event(taskA, CPU3)
            taskA->sched = xnsched_struct(CPU3)
      CPU0(root): <relax epilogue code for taskA>
      CPU0(root): resume(taskA, XNRELAX)
            enqueue(rq=CPU3), reschedule IPI to CPU3
      CPU0(root): resume(taskB, ...)
      CPU0(root): leave_root(), host_task := taskA
      ...
      CPU0(cobalt): suspend(taskA)
      CPU0(cobalt): enter_root(), next_task := host_task := taskA
      CPU0(root??): <<<taskA execution>>> BROKEN
      CPU3(cobalt): <taskA execution> via reschedule IPI
      
      To sum up, we would end up with the migrated task running on both CPUs
      in parallel, which would be, well, a problem.
      
      To resync the Cobalt scheduler information, send a SIGSHADOW signal to
      the migrated thread, asking it to switch back to primary mode from the
      handler, at which point the interrupted syscall may be restarted. This
      guarantees that check_affinity() is called, and fixups are done from
      the proper context.
      
      There is a cost: setting the affinity of a blocked thread may now
      induce a delay for that target thread as well, since it has to
      make a roundtrip between primary and secondary modes to handle the
      change event. However, 1) there is no other safe way to handle
      such an event, and 2) changing the CPU affinity of a remote
      real-time thread at random times makes absolutely no sense
      latency-wise anyway.
      
      (*) This means that the Cobalt scheduler state regarding the
      CPU information lags behind the host scheduler state until
      the migrated thread switches back to primary mode
      (i.e. task_cpu(p) != xnsched_cpu(xnthread_from_task(p)->sched)).
      This is ok since Cobalt does not schedule such thread until then.
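      The fixup then reduces to signaling the migrated thread, roughly
      as sketched below; locking and error handling are omitted:

        #include <cobalt/kernel/thread.h>

        /* Sketch: instead of touching thread->sched from the remote
         * CPU, ask the thread to switch back to primary mode itself;
         * check_affinity() then runs in the proper context. */
        static void request_affinity_resync(struct xnthread *thread)
        {
              xnthread_signal(thread, SIGSHADOW, SIGSHADOW_ACTION_HARDEN);
        }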
  11. 17 Jul, 2017 1 commit
  12. 20 Jul, 2016 1 commit
  13. 16 Jul, 2016 1 commit
  14. 09 Jul, 2016 1 commit
    • cobalt/kernel: use raw_spinlock* API to access IRQ pipeline locks · 18aaf9db
      Philippe Gerum authored
      In order to cope with PREEMPT_RT_FULL, the spinlock* API should be
      invoked for regular spin locks exclusively, so that those locks can be
      handled by PREEMPT_RT's sleeping lock API seamlessly.
      
      Since I-pipe locks are basically raw locks with hard IRQ
      management, sticking to the raw_spinlock* API for them makes
      sense. The regular spinlock* and raw_spinlock* APIs can be used
      interchangeably for manipulating IRQ pipeline locks
      (ipipe_spinlock_t) with the current pipeline releases, so this
      change is backward compatible.
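      An illustrative use; the lock and its definition macro are
      placeholders:

        #include <linux/ipipe_lock.h>

        static IPIPE_DEFINE_RAW_SPINLOCK(pipeline_lock);

        static void touch_pipeline_state(void)
        {
              unsigned long flags;

              /* raw_spinlock* stays a spinning lock with hard IRQs
               * off, even on PREEMPT_RT where plain spinlock* may
               * route through the sleeping-lock API */
              raw_spin_lock_irqsave(&pipeline_lock, flags);
              /* ... manipulate pipeline state ... */
              raw_spin_unlock_irqrestore(&pipeline_lock, flags);
        }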
  15. 04 Mar, 2016 1 commit
  16. 02 Mar, 2016 1 commit
  17. 01 Mar, 2016 1 commit
    • cobalt/thread: force secondary mode for joining threads · 738d0d12
      Philippe Gerum authored
      Make xnthread_join() switch the caller to secondary mode prior to
      waiting for the target thread's termination. The original runtime
      mode is restored upon return.
      
      Since the joiner was already synchronized on an event that may be
      sent by the joinee from secondary mode exclusively, this change
      does not drop any real-time guarantee for the joiner: there was
      never any in the first place.
      
      This is a preparation step to a stricter synchronization between the
      joiner and the joinee, especially in the SMP case.
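      A rough sketch of the resulting join path; the wait itself is
      elided:

        #include <cobalt/kernel/thread.h>

        static int join_sketch(struct xnthread *target)
        {
              struct xnthread *curr = xnthread_current();
              bool switched = false;

              if (curr && !xnthread_test_state(curr, XNRELAX)) {
                    xnthread_relax(0, 0);  /* drop to secondary mode */
                    switched = true;
              }

              /* ... wait for 'target' to terminate ... */

              if (switched)
                    return xnthread_harden();  /* restore primary mode */

              return 0;
        }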
  18. 27 Nov, 2015 1 commit
  19. 23 Nov, 2015 1 commit
  20. 10 Nov, 2015 1 commit
    • cobalt/posix/process: fix mayday mapping denial on NOMMU · ac450df2
      Philippe Gerum authored
      Until we stop exposing the mayday page through /dev/mem, we can't
      ask for PROT_EXEC on NOMMU, since that device does not define mmap
      capabilities, and the default ones won't allow an executable
      mapping with MAP_SHARED. In the NOMMU case, this is (currently)
      not an issue for implementing the MAYDAY support.
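      Illustratively, the mapping request would differ as sketched
      below; the helper and its 'have_mmu' switch are hypothetical:

        #include <sys/types.h>
        #include <sys/mman.h>

        static void *map_mayday_page(int fd, off_t off, size_t pagesz,
                                     int have_mmu)
        {
              /* PROT_EXEC only where the MMU makes it possible; on
               * NOMMU, an executable MAP_SHARED mapping of /dev/mem
               * is refused */
              int prot = PROT_READ | (have_mmu ? PROT_EXEC : 0);

              return mmap(NULL, pagesz, prot, MAP_SHARED, fd, off);
        }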
  21. 15 Oct, 2015 1 commit
    • cobalt/process: avoid crash at fork · e30f30df
      Gilles Chanteperdrix authored
      When handle_cleanup_event() is called because a thread calls exec
      after a fork from a Xenomai process, handle_taskexit_event() runs
      before remove_process(). However, handle_taskexit_event() calls
      clear_threadinfo(), with the effect that later calls to
      cobalt_current_process() return NULL, causing crashes in cleanup
      functions that rely on cobalt_current_process() or
      cobalt_ppd_get(), such as cobalt_mutex_reclaim().
  22. 14 Sep, 2015 1 commit
  23. 20 Jul, 2015 1 commit
  24. 16 Jul, 2015 1 commit
  25. 27 Jun, 2015 1 commit
    • cobalt/kernel: Rework thread debugging helpers · e0f8d1b2
      Jan Kiszka authored
      Factor out register/unregister_debugged_thread helpers to have a
      single point where tasks related to preparing/cleaning up
      ptrace-based thread debugging can be placed. Put all steps under
      nklock, which is required anyway for manipulating xnthread::state
      and fixes a lurking race. The timer lock counter can then be
      converted into a non-atomic variable.
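      A simplified sketch of the factored-out helper, with the
      bookkeeping reduced to its shape:

        #include <cobalt/kernel/thread.h>

        static int timer_lock_count;  /* non-atomic: nklock-protected */

        static void register_debugged_thread_sketch(struct xnthread *thread)
        {
              spl_t s;

              /* all state changes happen in one nklock section,
               * closing the race on xnthread::state */
              xnlock_get_irqsave(&nklock, s);
              xnthread_set_state(thread, XNSSTEP);
              if (timer_lock_count++ == 0)
                    /* ... freeze timers while debugging ... */;
              xnlock_put_irqrestore(&nklock, s);
        }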
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
  26. 18 May, 2015 1 commit
  27. 16 May, 2015 1 commit
    • cobalt/mutex, lib/cobalt: detect attempts to sleep while holding a mutex · e3153cf2
      Philippe Gerum authored
      Going to sleep intentionally while holding a mutex is an application
      issue. When user consistency checks are enabled (XENO_OPT_DEBUG_USER),
      any attempt to do so will trigger a SIGDEBUG signal to the offending
      thread, conveying the SIGDEBUG_RESCNT_SLEEP code.
      
      This change implicitly disables fast locking optimizations for
      user-space threads when XENO_OPT_DEBUG_USER is enabled. Since
      debugging code, present or future, may add some overhead anyway,
      this seems acceptable.
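      The check might look roughly like this on a sleep path; the
      resource-count accessor is an assumption:

        #include <cobalt/kernel/assert.h>
        #include <cobalt/kernel/thread.h>

        static int thread_rescnt(struct xnthread *curr);  /* hypothetical */

        static void check_sleep_holding_mutex(struct xnthread *curr)
        {
              if (XENO_DEBUG(USER) && thread_rescnt(curr) > 0)
                    /* going to sleep while holding a mutex: complain */
                    xnthread_signal(curr, SIGDEBUG, SIGDEBUG_RESCNT_SLEEP);
        }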
  28. 25 Mar, 2015 1 commit