Skip to content
  • Tomoki Sekiyama's avatar
    eb1c160b
    elevator: Fix a race in elevator switching and md device initialization · eb1c160b
    Tomoki Sekiyama authored
    The soft lockup below happens at the boot time of the system using dm
    multipath and the udev rules to switch scheduler.
    
    [  356.127001] BUG: soft lockup - CPU#3 stuck for 22s! [sh:483]
    [  356.127001] RIP: 0010:[<ffffffff81072a7d>]  [<ffffffff81072a7d>] lock_timer_base.isra.35+0x1d/0x50
    ...
    [  356.127001] Call Trace:
    [  356.127001]  [<ffffffff81073810>] try_to_del_timer_sync+0x20/0x70
    [  356.127001]  [<ffffffff8118b08a>] ? kmem_cache_alloc_node_trace+0x20a/0x230
    [  356.127001]  [<ffffffff810738b2>] del_timer_sync+0x52/0x60
    [  356.127001]  [<ffffffff812ece22>] cfq_exit_queue+0x32/0xf0
    [  356.127001]  [<ffffffff812c98df>] elevator_exit+0x2f/0x50
    [  356.127001]  [<ffffffff812c9f21>] elevator_change+0xf1/0x1c0
    [  356.127001]  [<ffffffff812caa50>] elv_iosched_store+0x20/0x50
    [  356.127001]  [<ffffffff812d1d09>] queue_attr_store+0x59/0xb0
    [  356.127001]  [<ffffffff812143f6>] sysfs_write_file+0xc6/0x140
    [  356.127001]  [<ffffffff811a326d>] vfs_write+0xbd/0x1e0
    [  356.127001]  [<ffffffff811a3ca9>] SyS_write+0x49/0xa0
    [  356.127001]  [<ffffffff8164e899>] system_call_fastpath+0x16/0x1b
    
    This is caused by a race between md device initialization by multipathd and
    shell script to switch the scheduler using sysfs.
    
     - multipathd:
       SyS_ioctl -> do_vfs_ioctl -> dm_ctl_ioctl -> ctl_ioctl -> table_load
       -> dm_setup_md_queue -> blk_init_allocated_queue -> elevator_init
        q->elevator = elevator_alloc(q, e); // not yet initialized
    
     - sh -c 'echo deadline > /sys/$DEVPATH/queue/scheduler':
       elevator_switch (in the call trace above)
        struct elevator_queue *old = q->elevator;
        q->elevator = elevator_alloc(q, new_e);
        elevator_exit(old);                 // lockup! (*)
    
     - multipathd: (cont.)
        err = e->ops.elevator_init_fn(q);   // init fails; q->elevator is modified
    
    (*) When del_timer_sync() is called, lock_timer_base() will loop infinitely
    while timer->base == NULL. In this case, as timer will never initialized,
    it results in lockup.
    
    This patch introduces acquisition of q->sysfs_lock around elevator_init()
    into blk_init_allocated_queue(), to provide mutual exclusion between
    initialization of the q->scheduler and switching of the scheduler.
    
    This should fix this bugzilla:
    https://bugzilla.redhat.com/show_bug.cgi?id=902012
    
    
    
    Signed-off-by: default avatarTomoki Sekiyama <tomoki.sekiyama@hds.com>
    Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
    eb1c160b
    elevator: Fix a race in elevator switching and md device initialization
    Tomoki Sekiyama authored
    The soft lockup below happens at the boot time of the system using dm
    multipath and the udev rules to switch scheduler.
    
    [  356.127001] BUG: soft lockup - CPU#3 stuck for 22s! [sh:483]
    [  356.127001] RIP: 0010:[<ffffffff81072a7d>]  [<ffffffff81072a7d>] lock_timer_base.isra.35+0x1d/0x50
    ...
    [  356.127001] Call Trace:
    [  356.127001]  [<ffffffff81073810>] try_to_del_timer_sync+0x20/0x70
    [  356.127001]  [<ffffffff8118b08a>] ? kmem_cache_alloc_node_trace+0x20a/0x230
    [  356.127001]  [<ffffffff810738b2>] del_timer_sync+0x52/0x60
    [  356.127001]  [<ffffffff812ece22>] cfq_exit_queue+0x32/0xf0
    [  356.127001]  [<ffffffff812c98df>] elevator_exit+0x2f/0x50
    [  356.127001]  [<ffffffff812c9f21>] elevator_change+0xf1/0x1c0
    [  356.127001]  [<ffffffff812caa50>] elv_iosched_store+0x20/0x50
    [  356.127001]  [<ffffffff812d1d09>] queue_attr_store+0x59/0xb0
    [  356.127001]  [<ffffffff812143f6>] sysfs_write_file+0xc6/0x140
    [  356.127001]  [<ffffffff811a326d>] vfs_write+0xbd/0x1e0
    [  356.127001]  [<ffffffff811a3ca9>] SyS_write+0x49/0xa0
    [  356.127001]  [<ffffffff8164e899>] system_call_fastpath+0x16/0x1b
    
    This is caused by a race between md device initialization by multipathd and
    shell script to switch the scheduler using sysfs.
    
     - multipathd:
       SyS_ioctl -> do_vfs_ioctl -> dm_ctl_ioctl -> ctl_ioctl -> table_load
       -> dm_setup_md_queue -> blk_init_allocated_queue -> elevator_init
        q->elevator = elevator_alloc(q, e); // not yet initialized
    
     - sh -c 'echo deadline > /sys/$DEVPATH/queue/scheduler':
       elevator_switch (in the call trace above)
        struct elevator_queue *old = q->elevator;
        q->elevator = elevator_alloc(q, new_e);
        elevator_exit(old);                 // lockup! (*)
    
     - multipathd: (cont.)
        err = e->ops.elevator_init_fn(q);   // init fails; q->elevator is modified
    
    (*) When del_timer_sync() is called, lock_timer_base() will loop infinitely
    while timer->base == NULL. In this case, as timer will never initialized,
    it results in lockup.
    
    This patch introduces acquisition of q->sysfs_lock around elevator_init()
    into blk_init_allocated_queue(), to provide mutual exclusion between
    initialization of the q->scheduler and switching of the scheduler.
    
    This should fix this bugzilla:
    https://bugzilla.redhat.com/show_bug.cgi?id=902012
    
    
    
    Signed-off-by: default avatarTomoki Sekiyama <tomoki.sekiyama@hds.com>
    Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
Loading