Timers and hrtimers
Scheduling kernel work at a specific time
Two timer subsystems
The kernel has two independent timer mechanisms:
| | timer_list | hrtimer |
|---|---|---|
| Resolution | jiffies (1ms–10ms) | nanoseconds |
| Data structure | hash table (timer wheel) | red-black tree |
| Callback context | softirq (TIMER_SOFTIRQ) | softirq or hardirq |
| Use for | Coarse timeouts, watchdogs | High-precision, audio/video, userspace sleep |
timer_list: jiffies-based timers
timer_list is the traditional timer API. Timers expire on a jiffies boundary (typically 1ms–10ms depending on HZ).
/* include/linux/timer.h */
struct timer_list {
    struct hlist_node entry;    /* timer wheel position */
    unsigned long expires;      /* expiry time in jiffies */
    void (*function)(struct timer_list *);
    u32 flags;
};
API
#include <linux/timer.h>
struct my_device {
    struct timer_list watchdog;
};

static void my_watchdog_handler(struct timer_list *t)
{
    /* Map the timer back to the structure that embeds it */
    struct my_device *dev = from_timer(dev, t, watchdog);

    /* runs in TIMER_SOFTIRQ context: no sleeping */
    if (device_stuck(dev))
        recover_device(dev);

    /* Restart timer */
    mod_timer(&dev->watchdog, jiffies + HZ * 5); /* 5 seconds */
}
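/* Initialize (once, before first use) */
timer_setup(&dev->watchdog, my_watchdog_handler, 0);
/* Start timer: expire in 5 seconds */
mod_timer(&dev->watchdog, jiffies + HZ * 5);
/* Check if pending */
if (timer_pending(&dev->watchdog)) { ... }
/* Deactivate without waiting: the handler may still be running on another CPU */
del_timer(&dev->watchdog);
/* Deactivate and wait for an in-flight handler to complete.
 * Do not call this from the handler or while holding a lock the handler takes. */
del_timer_sync(&dev->watchdog);
/* Kernels 6.2 and later also provide these as timer_delete() and timer_delete_sync() */
/* Move a pending timer to a new expiry, or arm it if it is not pending */
mod_timer(&dev->watchdog, jiffies + HZ);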
Time conversion helpers
jiffies + HZ /* 1 second */
jiffies + HZ / 2 /* 500ms */
jiffies + msecs_to_jiffies(250) /* 250ms */
jiffies + usecs_to_jiffies(500) /* 500µs, rounded up to a whole jiffy */
/* Remaining time (negative once the expiry has passed) */
long remaining = (long)(timer->expires - jiffies);
The timer wheel
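Timers are stored in a hierarchical hash table called the timer wheel. The wheel has multiple levels of 64 slots each. Since Linux 4.8, each level is 8 times coarser than the one below it, so at HZ=1000 the levels cover these ranges:
Level 0: 64 slots × 1 jiffy = 0 to 63 ms
Level 1: 64 slots × 8 jiffies = 64 to 511 ms
Level 2: 64 slots × 64 jiffies = 512 ms to ~4 seconds
Level 3: 64 slots × 512 jiffies = ~4 to ~32 seconds
Higher levels continue the same pattern; with 9 levels at HZ=1000 the wheel reaches about 12 days.
On each timer tick, the wheel advances and timers in the current slot are fired. The current wheel does not cascade timers down to lower levels: a timer stays in the slot chosen when it was armed, so a timer queued at a higher level can fire late by up to that level's granularity (roughly 12.5% of the timeout in the worst case). Kernels before 4.8 used a cascading wheel that moved timers to finer levels as their expiry approached.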
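To make the level selection concrete, the following user-space sketch picks a wheel level for a timeout. It assumes the non-cascading layout used by kernels since 4.8 (64 slots per level, each level 8 times coarser than the previous one, 9 levels). The names `sketch_wheel_level` and `sketch_wheel_granularity` are hypothetical, and the level boundaries are simplified compared with the kernel's own index calculation.

```c
#define SKETCH_LVL_SLOTS  64u  /* slots per level */
#define SKETCH_LVL_SHIFT  3u   /* each level is 2^3 = 8 times coarser */
#define SKETCH_LVL_DEPTH  9u   /* assumed number of levels (HZ > 100) */

/* Pick the level for a timeout that is delta jiffies in the future. */
static unsigned int sketch_wheel_level(unsigned long delta)
{
    unsigned int level = 0;
    unsigned long range = SKETCH_LVL_SLOTS; /* level 0 covers 0..63 */

    while (level < SKETCH_LVL_DEPTH - 1 && delta >= range) {
        range <<= SKETCH_LVL_SHIFT;
        level++;
    }
    return level;
}

/* Slot width, in jiffies, at a given level: 1, 8, 64, 512, ... */
static unsigned long sketch_wheel_granularity(unsigned int level)
{
    return 1UL << (level * SKETCH_LVL_SHIFT);
}
```

Under these assumptions a 5-second timeout at HZ=1000 (5000 jiffies) lands in level 3, where slots are 512 jiffies wide. This is why timer_list suits coarse timeouts and not precise ones.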
hrtimer: high-resolution timers
hrtimers were introduced in Linux 2.6.16 by Thomas Gleixner (LWN). They use ktime_t (nanosecond resolution) and a red-black tree ordered by expiry time. The timer with the earliest expiry determines when the next hardware timer interrupt is programmed.
/* include/linux/hrtimer.h */
struct hrtimer {
    struct timerqueue_node node;    /* rb-tree node, stores expiry */
    ktime_t _softexpires;           /* earliest possible expiry */
    enum hrtimer_restart (*function)(struct hrtimer *);
    struct hrtimer_clock_base *base;
    u8 state;
    u8 is_rel;                      /* relative time? */
    u8 is_soft;                     /* softirq delivery? */
    u8 is_hard;                     /* hardirq delivery? */
};
API
#include <linux/hrtimer.h>
struct my_device {
    struct hrtimer timer;
};

static enum hrtimer_restart my_hrtimer_handler(struct hrtimer *timer)
{
    struct my_device *dev = container_of(timer, struct my_device, timer);

    /* Process work (atomic context: must not sleep) */
    do_periodic_work(dev);

    /* Restart: advance the expiry in 10ms steps until it lies in the future */
    hrtimer_forward_now(timer, ms_to_ktime(10));
    return HRTIMER_RESTART;

    /* Or: don't restart */
    /* return HRTIMER_NORESTART; */
}
/* Initialize (hrtimer_setup() exists since Linux 6.13; older kernels
 * use hrtimer_init() and assign timer.function by hand) */
hrtimer_setup(&dev->timer, my_hrtimer_handler,
              CLOCK_MONOTONIC, HRTIMER_MODE_REL);
/* Start: fire in 10ms */
hrtimer_start(&dev->timer, ms_to_ktime(10), HRTIMER_MODE_REL);
/* Start with absolute time: 100ms from now on CLOCK_MONOTONIC */
hrtimer_start(&dev->timer, ktime_add(ktime_get(), ms_to_ktime(100)),
              HRTIMER_MODE_ABS);
/* Cancel */
hrtimer_cancel(&dev->timer);        /* cancel; waits for a running callback to finish */
hrtimer_try_to_cancel(&dev->timer); /* never waits; returns -1 if the callback is running */
/* Check if active (queued, or callback currently running) */
if (hrtimer_active(&dev->timer)) { ... }
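The handler above re-arms itself with hrtimer_forward_now(), which moves the expiry forward in whole intervals from the previous expiry, not from the time the callback happened to run. That keeps a periodic timer from drifting. Below is a user-space sketch of that arithmetic, modelled on the kernel's hrtimer_forward(). The name `sketch_forward` is hypothetical, times are plain nanosecond integers standing in for ktime_t, and the interval is assumed to be positive.

```c
#include <stdint.h>

/*
 * Advance *expires past now in multiples of interval.
 * Returns the number of intervals added: 0 if the timer has not
 * expired yet, 1 for an on-time timer, more if periods were missed.
 */
static uint64_t sketch_forward(int64_t *expires, int64_t now, int64_t interval)
{
    int64_t delta = now - *expires;
    uint64_t overruns = 1;

    if (delta < 0)
        return 0; /* not expired: leave the expiry untouched */

    if (delta >= interval) {
        /* Skip all fully missed periods in one step */
        overruns = (uint64_t)(delta / interval);
        *expires += interval * (int64_t)overruns;
        if (*expires > now)
            return overruns;
        overruns++;
    }
    *expires += interval;
    return overruns;
}
```

With an expiry of 100, an interval of 10 and a callback that runs at time 105, the new expiry is 110 and not 115, so the lateness of one callback does not accumulate. A return value above 1 indicates that periods were missed.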
Clock sources
CLOCK_MONOTONIC /* always-increasing, not affected by settimeofday */
CLOCK_REALTIME /* wall-clock time, can jump */
CLOCK_BOOTTIME /* like MONOTONIC but includes suspend time */
CLOCK_TAI /* international atomic time */
Use CLOCK_MONOTONIC for most kernel timers. Use CLOCK_REALTIME only when the timer must track wall-clock time.
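The same clock IDs are exposed to user space through clock_gettime(), which offers a quick way to observe how they differ on a running system. The helper below is a minimal sketch; `read_clock_ns` is a hypothetical name, and it assumes a Linux system with glibc, where CLOCK_BOOTTIME and CLOCK_TAI are available.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <time.h>

/* Read a clock in nanoseconds; returns -1 if the clock cannot be read. */
static int64_t read_clock_ns(clockid_t id)
{
    struct timespec ts;

    if (clock_gettime(id, &ts) != 0)
        return -1;
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}
```

For example, `read_clock_ns(CLOCK_BOOTTIME) - read_clock_ns(CLOCK_MONOTONIC)` approximates the time the system has spent suspended, and successive reads of CLOCK_MONOTONIC never go backwards, while CLOCK_REALTIME may jump when the wall clock is set.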
Delivery modes
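/* Without _SOFT or _HARD: hardirq context on non-RT kernels, softirq context on PREEMPT_RT */
/* Soft mode: callback runs in HRTIMER_SOFTIRQ context */
HRTIMER_MODE_REL_SOFT
/* Hard mode: callback runs in hardirq context, even on PREEMPT_RT (lower latency, more restrictions) */
HRTIMER_MODE_REL_HARD
/* Pinned: timer is bound to the CPU it is started on */
HRTIMER_MODE_REL_PINNED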
Userspace sleep: hrtimer under the hood
The kernel's nanosleep() syscall and usleep() in glibc are implemented via hrtimers. When a process calls nanosleep(10ms):
- hrtimer_start() arms a timer 10ms in the future
- Process goes to TASK_INTERRUPTIBLE sleep
- Timer fires → callback wakes the process
- Process returns from nanosleep()
The precision of usleep(1) (1µs sleep) depends on whether CONFIG_HIGH_RES_TIMERS is set and whether the hardware supports high-resolution timer mode.
# Check if high-resolution timers are active (may require root)
grep hres_active /proc/timer_list
# List available clocksources
cat /sys/devices/system/clocksource/clocksource0/available_clocksource
# tsc hpet acpi_pm
# Current clocksource
cat /sys/devices/system/clocksource/clocksource0/current_clocksource
# tsc
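The effect of high-resolution mode on short sleeps can also be measured from user space. The sketch below reports the resolution the kernel advertises for CLOCK_MONOTONIC and measures how far a sleep overshoots its request. The function names are hypothetical, and the code assumes Linux with a glibc that provides clock_nanosleep().

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <time.h>

static int64_t timespec_to_ns(const struct timespec *ts)
{
    return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/* Resolution reported for CLOCK_MONOTONIC in ns, or -1 on error. */
static int64_t monotonic_resolution_ns(void)
{
    struct timespec res;

    if (clock_getres(CLOCK_MONOTONIC, &res) != 0)
        return -1;
    return timespec_to_ns(&res);
}

/* Sleep for request_ns; return the overshoot in ns, or -1 on error. */
static int64_t sleep_overshoot_ns(int64_t request_ns)
{
    struct timespec start, end, req;
    int rc;

    req.tv_sec = request_ns / 1000000000LL;
    req.tv_nsec = request_ns % 1000000000LL;

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1;

    /* Relative sleep; on EINTR, req holds the remaining time, so retry */
    do {
        rc = clock_nanosleep(CLOCK_MONOTONIC, 0, &req, &req);
    } while (rc == EINTR);
    if (rc != 0)
        return -1;

    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1;
    return timespec_to_ns(&end) - timespec_to_ns(&start) - request_ns;
}
```

With high-resolution timers active, the reported resolution is typically 1 ns; without them it is the tick period (for example 4,000,000 ns at HZ=250). Even in high-resolution mode, `sleep_overshoot_ns(1000)` usually returns tens of microseconds, because scheduling latency and the task's timer slack (50 µs by default for normal tasks) add to the requested time.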
Further reading
- Softirqs: Where TIMER_SOFTIRQ and HRTIMER_SOFTIRQ run
- Workqueues: For work deferred by a timer that needs to sleep
- Documentation/timers/timers-howto.rst: Which timer to use when