| Description | In the Linux kernel, the following vulnerability has been resolved:  net/mlx5e: Prevent deadlock while disabling aRFS  When disabling aRFS under the `priv->state_lock`, any scheduled aRFS works are canceled using the `cancel_work_sync` function, which waits for the work to end if it has already started. However, while waiting for the work handler, the handler will try to acquire the `state_lock` which is already acquired.  The worker acquires the lock to delete the rules if the state is down, which is not the worker's responsibility since disabling aRFS deletes the rules.  Add an aRFS state variable, which indicates whether the aRFS is enabled and prevent adding rules when the aRFS is disabled.  Kernel log:  ====================================================== WARNING: possible circular locking dependency detected 6.7.0-rc4_net_next_mlx5_5483eb2 #1 Tainted: G          I ------------------------------------------------------ ethtool/386089 is trying to acquire lock: ffff88810f21ce68 ((work_completion)(&rule->arfs_work)){+.+.}-{0:0}, at: __flush_work+0x74/0x4e0  but task is already holding lock: ffff8884a1808cc0 (&priv->state_lock){+.+.}-{3:3}, at: mlx5e_ethtool_set_channels+0x53/0x200 [mlx5_core]  which lock already depends on the new lock.  the existing dependency chain (in reverse order) is:  -> #1 (&priv->state_lock){+.+.}-{3:3}:        __mutex_lock+0x80/0xc90        arfs_handle_work+0x4b/0x3b0 [mlx5_core]        process_one_work+0x1dc/0x4a0        worker_thread+0x1bf/0x3c0        kthread+0xd7/0x100        ret_from_fork+0x2d/0x50        ret_from_fork_asm+0x11/0x20  -> #0 ((work_completion)(&rule->arfs_work)){+.+.}-{0:0}:        __lock_acquire+0x17b4/0x2c80        lock_acquire+0xd0/0x2b0        __flush_work+0x7a/0x4e0        __cancel_work_timer+0x131/0x1c0        arfs_del_rules+0x143/0x1e0 [mlx5_core]        mlx5e_arfs_disable+0x1b/0x30 [mlx5_core]        mlx5e_ethtool_set_channels+0xcb/0x200 [mlx5_core]        ethnl_set_channels+0x28f/0x3b0        ethnl_default_set_doit+0xec/0x240        genl_family_rcv_msg_doit+0xd0/0x120        genl_rcv_msg+0x188/0x2c0        netlink_rcv_skb+0x54/0x100        genl_rcv+0x24/0x40        netlink_unicast+0x1a1/0x270        netlink_sendmsg+0x214/0x460        __sock_sendmsg+0x38/0x60        __sys_sendto+0x113/0x170        __x64_sys_sendto+0x20/0x30        do_syscall_64+0x40/0xe0        entry_SYSCALL_64_after_hwframe+0x46/0x4e  other info that might help us debug this:   Possible unsafe locking scenario:         CPU0                    CPU1        ----                    ----   lock(&priv->state_lock);                                lock((work_completion)(&rule->arfs_work));                                lock(&priv->state_lock);   lock((work_completion)(&rule->arfs_work));   *** DEADLOCK ***  3 locks held by ethtool/386089:  #0: ffffffff82ea7210 (cb_lock){++++}-{3:3}, at: genl_rcv+0x15/0x40  #1: ffffffff82e94c88 (rtnl_mutex){+.+.}-{3:3}, at: ethnl_default_set_doit+0xd3/0x240  #2: ffff8884a1808cc0 (&priv->state_lock){+.+.}-{3:3}, at: mlx5e_ethtool_set_channels+0x53/0x200 [mlx5_core]  stack backtrace: CPU: 15 PID: 386089 Comm: ethtool Tainted: G          I        6.7.0-rc4_net_next_mlx5_5483eb2 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace:  <TASK>  dump_stack_lvl+0x60/0xa0  check_noncircular+0x144/0x160  __lock_acquire+0x17b4/0x2c80  lock_acquire+0xd0/0x2b0  ? __flush_work+0x74/0x4e0  ? save_trace+0x3e/0x360  ? __flush_work+0x74/0x4e0  __flush_work+0x7a/0x4e0  ? __flush_work+0x74/0x4e0  ? __lock_acquire+0xa78/0x2c80  ? lock_acquire+0xd0/0x2b0  ? mark_held_locks+0x49/0x70  __cancel_work_timer+0x131/0x1c0  ? mark_held_locks+0x49/0x70  arfs_del_rules+0x143/0x1e0 [mlx5_core]  mlx5e_arfs_disable+0x1b/0x30 [mlx5_core]  mlx5e_ethtool_set_channels+0xcb/0x200 [mlx5_core]  ethnl_set_channels+0x28f/0x3b0  ethnl_default_set_doit+0xec/0x240  genl_family_rcv_msg_doit+0xd0/0x120  genl_rcv_msg+0x188/0x2c0  ? ethn ---truncated--- |