- Jan 06, 2021
-
-
Greg Kroah-Hartman authored
Tested-by:
Jon Hunter <jonathanh@nvidia.com> Tested-by:
Linux Kernel Functional Testing <lkft@linaro.org> Tested-by:
Shuah Khan <skhan@linuxfoundation.org> Tested-by:
Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20210104155705.740576914@linuxfoundation.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Hyeongseok Kim authored
[ Upstream commit 252bd125 ] If emergency system shutdown is called, like by thermal shutdown, a dm device could be alive when the block device couldn't process I/O requests anymore. In this state, the handling of I/O errors by new dm I/O requests or by those already in-flight can lead to a verity corruption state, which is a misjudgment. So, skip verity work in response to I/O error when system is shutting down. Signed-off-by:
Hyeongseok Kim <hyeongseok@gmail.com> Reviewed-by:
Sami Tolvanen <samitolvanen@google.com> Signed-off-by:
Mike Snitzer <snitzer@redhat.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Takashi Iwai authored
[ Upstream commit 618de0f4 ] The PCM hw_params core function tries to clear up the PCM buffer before actually using for avoiding the information leak from the previous usages or the usage before a new allocation. It performs the memset() with runtime->dma_bytes, but this might still leave some remaining bytes untouched; namely, the PCM buffer size is aligned in page size for mmap, hence runtime->dma_bytes doesn't necessarily cover all PCM buffer pages, and the remaining bytes are exposed via mmap. This patch changes the memory clearance to cover the all buffer pages if the stream is supposed to be mmap-ready (that guarantees that the buffer size is aligned in page size). Reviewed-by:
Lars-Peter Clausen <lars@metafoo.de> Link: https://lore.kernel.org/r/20201218145625.2045-3-tiwai@suse.de Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Thomas Gleixner authored
[ Upstream commit ba8ea8e7 ] can_stop_idle_tick() checks whether the do_timer() duty has been taken over by a CPU on boot. That's silly because the boot CPU always takes over with the initial clockevent device. But even if no CPU would have installed a clockevent and taken over the duty then the question whether the tick on the current CPU can be stopped or not is moot. In that case the current CPU would have no clockevent either, so there would be nothing to keep ticking. Remove it. Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Acked-by:
Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20201206212002.725238293@linutronix.de Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Gabriel Krisman Bertazi authored
[ Upstream commit fc6b6a87 ] Internally, UBD treats each physical IO segment as a separate command to be submitted in the execution pipe. If the pipe returns a transient error after a few segments have already been written, UBD will tell the block layer to requeue the request, but there is no way to reclaim the segments already submitted. When a new attempt to dispatch the request is done, those segments already submitted will get duplicated, causing the WARN_ON below in the best case, and potentially data corruption. In my system, running a UML instance with 2GB of RAM and a 50M UBD disk, I can reproduce the WARN_ON by simply running mkfs.fvat against the disk on a freshly booted system. There are a few ways to around this, like reducing the pressure on the pipe by reducing the queue depth, which almost eliminates the occurrence of the problem, increasing the pipe buffer size on the host system, or by limiting the request to one physical segment, which causes the block layer to submit way more requests to resolve a single operation. Instead, this patch modifies the format of a UBD command, such that all segments are sent through a single element in the communication pipe, turning the command submission atomic from the point of view of the block layer. The new format has a variable size, depending on the number of elements, and looks like this: +------------+-----------+-----------+------------ | cmd_header | segment 0 | segment 1 | segment ... +------------+-----------+-----------+------------ With this format, we push a pointer to cmd_header in the submission pipe. This has the advantage of reducing the memory footprint of executing a single request, since it allow us to merge some fields in the header. It is possible to reduce even further each segment memory footprint, by merging bitmap_words and cow_offset, for instance, but this is not the focus of this patch and is left as future work. One issue with the patch is that for a big number of segments, we now perform one big memory allocation instead of multiple small ones, but I wasn't able to trigger any real issues or -ENOMEM because of this change, that wouldn't be reproduced otherwise. This was tested using fio with the verify-crc32 option, and by running an ext4 filesystem over this UBD device. The original WARN_ON was: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at lib/refcount.c:28 refcount_warn_saturate+0x13f/0x141 refcount_t: underflow; use-after-free. Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 5.5.0-rc6-00002-g2a5bb2cf75c8 #346 Stack: 6084eed0 6063dc77 00000009 6084ef60 00000000 604b8d9f 6084eee0 6063dcbc 6084ef40 6006ab8d e013d780 1c00000000 Call Trace: [<600a0c1c>] ? printk+0x0/0x94 [<6004a888>] show_stack+0x13b/0x155 [<6063dc77>] ? dump_stack_print_info+0xdf/0xe8 [<604b8d9f>] ? refcount_warn_saturate+0x13f/0x141 [<6063dcbc>] dump_stack+0x2a/0x2c [<6006ab8d>] __warn+0x107/0x134 [<6008da6c>] ? wake_up_process+0x17/0x19 [<60487628>] ? blk_queue_max_discard_sectors+0x0/0xd [<6006b05f>] warn_slowpath_fmt+0xd1/0xdf [<6006af8e>] ? warn_slowpath_fmt+0x0/0xdf [<600acc14>] ? raw_read_seqcount_begin.constprop.0+0x0/0x15 [<600619ae>] ? os_nsecs+0x1d/0x2b [<604b8d9f>] refcount_warn_saturate+0x13f/0x141 [<6048bc8f>] refcount_sub_and_test.constprop.0+0x2f/0x37 [<6048c8de>] blk_mq_free_request+0xf1/0x10d [<6048ca06>] __blk_mq_end_request+0x10c/0x114 [<6005ac0f>] ubd_intr+0xb5/0x169 [<600a1a37>] __handle_irq_event_percpu+0x6b/0x17e [<600a1b70>] handle_irq_event_percpu+0x26/0x69 [<600a1bd9>] handle_irq_event+0x26/0x34 [<600a1bb3>] ? handle_irq_event+0x0/0x34 [<600a5186>] ? unmask_irq+0x0/0x37 [<600a57e6>] handle_edge_irq+0xbc/0xd6 [<600a131a>] generic_handle_irq+0x21/0x29 [<60048f6e>] do_IRQ+0x39/0x54 [...] ---[ end trace c6e7444e55386c0f ]--- Cc: Christopher Obbard <chris.obbard@collabora.com> Reported-by:
Martyn Welch <martyn@collabora.com> Signed-off-by:
Gabriel Krisman Bertazi <krisman@collabora.com> Tested-by:
Christopher Obbard <chris.obbard@collabora.com> Acked-by:
Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by:
Richard Weinberger <richard@nod.at> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Eric Biggers authored
[ Upstream commit edf7ddbf ] Missing calls to mntget() (or equivalently, too many calls to mntput()) are hard to detect because mntput() delays freeing mounts using task_work_add(), then again using call_rcu(). As a result, mnt_count can often be decremented to -1 without getting a KASAN use-after-free report. Such cases are still bugs though, and they point to real use-after-frees being possible. For an example of this, see the bug fixed by commit 1b0b9cc8 ("vfs: fsmount: add missing mntget()"), discussed at https://lkml.kernel.org/linux-fsdevel/20190605135401.GB30925@xxxxxxxxxxxxxxxxxxxxxxxxx/T/#u . This bug *should* have been trivial to find. But actually, it wasn't found until syzkaller happened to use fchdir() to manipulate the reference count just right for the bug to be noticeable. Address this by making mntput_no_expire() issue a WARN if mnt_count has become negative. Suggested-by:
Miklos Szeredi <miklos@szeredi.hu> Signed-off-by:
Eric Biggers <ebiggers@google.com> Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Jessica Yu authored
[ Upstream commit 38dc717e ] Apparently there has been a longstanding race between udev/systemd and the module loader. Currently, the module loader sends a uevent right after sysfs initialization, but before the module calls its init function. However, some udev rules expect that the module has initialized already upon receiving the uevent. This race has been triggered recently (see link in references) in some systemd mount unit files. For instance, the configfs module creates the /sys/kernel/config mount point in its init function, however the module loader issues the uevent before this happens. sys-kernel-config.mount expects to be able to mount /sys/kernel/config upon receipt of the module loading uevent, but if the configfs module has not called its init function yet, then this directory will not exist and the mount unit fails. A similar situation exists for sys-fs-fuse-connections.mount, as the fuse sysfs mount point is created during the fuse module's init function. If udev is faster than module initialization then the mount unit would fail in a similar fashion. To fix this race, delay the module KOBJ_ADD uevent until after the module has finished calling its init routine. References: https://github.com/systemd/systemd/issues/17586 Reviewed-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Tested-By:
Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> Signed-off-by:
Jessica Yu <jeyu@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Jaegeuk Kim authored
[ Upstream commit a95ba66a ] Light reported sometimes shinker gets nat_cnt < dirty_nat_cnt resulting in wrong do_shinker work. Let's avoid to return insane overflowed value by adding single tracking value. Reported-by:
Light Hsieh <Light.Hsieh@mediatek.com> Reviewed-by:
Chao Yu <yuchao0@huawei.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Trond Myklebust authored
[ Upstream commit b6d49ecd ] When returning the layout in nfs4_evict_inode(), we need to ensure that the layout is actually done being freed before we can proceed to free the inode itself. Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Qinglang Miao authored
[ Upstream commit 59165d16 ] Add the missing destroy_workqueue() before return from i3c_master_register in the error handling case. Signed-off-by:
Qinglang Miao <miaoqinglang@huawei.com> Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Link: https://lore.kernel.org/linux-i3c/20201028091543.136167-1-miaoqinglang@huawei.com Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Qinglang Miao authored
[ Upstream commit ffa17970 ] I noticed that iounmap() of msgr_block_addr before return from mpic_msgr_probe() in the error handling case is missing. So use devm_ioremap() instead of just ioremap() when remapping the message register block, so the mapping will be automatically released on probe failure. Signed-off-by:
Qinglang Miao <miaoqinglang@huawei.com> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201028091551.136400-1-miaoqinglang@huawei.com Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Zheng Liang authored
[ Upstream commit 1eab0fea ] When devm_rtc_allocate_device is failed in pl031_probe, it should release mem regions with device. Reported-by:
Hulk Robot <hulkci@huawei.com> Signed-off-by:
Zheng Liang <zhengliang6@huawei.com> Signed-off-by:
Alexandre Belloni <alexandre.belloni@bootlin.com> Acked-by:
Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20201112093139.32566-1-zhengliang6@huawei.com Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Jan Kara authored
[ Upstream commit 10f04d40 ] The on-disk quota format supports quota files with upto 2^32 blocks. Be careful when computing quota file offsets in the quota files from block numbers as they can overflow 32-bit types. Since quota files larger than 4GB would require ~26 millions of quota users, this is mostly a theoretical concern now but better be careful, fuzzers would find the problem sooner or later anyway... Reviewed-by:
Andreas Dilger <adilger@dilger.ca> Signed-off-by:
Jan Kara <jack@suse.cz> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Miroslav Benes authored
[ Upstream commit 5e8ed280 ] If a module fails to load due to an error in prepare_coming_module(), the following error handling in load_module() runs with MODULE_STATE_COMING in module's state. Fix it by correctly setting MODULE_STATE_GOING under "bug_cleanup" label. Signed-off-by:
Miroslav Benes <mbenes@suse.cz> Signed-off-by:
Jessica Yu <jeyu@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Dinghao Liu authored
[ Upstream commit 28d21191 ] When clk_hw_register_fixed_rate_with_accuracy() fails, clk_data should be freed. It's the same for the subsequent two error paths, but we should also unregister the already registered clocks in them. Signed-off-by:
Dinghao Liu <dinghao.liu@zju.edu.cn> Signed-off-by:
Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20201020061226.6572-1-dinghao.liu@zju.edu.cn Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Boqun Feng authored
commit 8d1ddb5e upstream. Syzbot reports a potential deadlock found by the newly added recursive read deadlock detection in lockdep: [...] ======================================================== [...] WARNING: possible irq lock inversion dependency detected [...] 5.9.0-rc2-syzkaller #0 Not tainted [...] -------------------------------------------------------- [...] syz-executor.1/10214 just changed the state of lock: [...] ffff88811f506338 (&f->f_owner.lock){.+..}-{2:2}, at: send_sigurg+0x1d/0x200 [...] but this lock was taken by another, HARDIRQ-safe lock in the past: [...] (&dev->event_lock){-...}-{2:2} [...] [...] [...] and interrupts could create inverse lock ordering between them. [...] [...] [...] other info that might help us debug this: [...] Chain exists of: [...] &dev->event_lock --> &new->fa_lock --> &f->f_owner.lock [...] [...] Possible interrupt unsafe locking scenario: [...] [...] CPU0 CPU1 [...] ---- ---- [...] lock(&f->f_owner.lock); [...] local_irq_disable(); [...] lock(&dev->event_lock); [...] lock(&new->fa_lock); [...] <Interrupt> [...] lock(&dev->event_lock); [...] [...] *** DEADLOCK *** The corresponding deadlock case is as followed: CPU 0 CPU 1 CPU 2 read_lock(&fown->lock); spin_lock_irqsave(&dev->event_lock, ...) write_lock_irq(&filp->f_owner.lock); // wait for the lock read_lock(&fown-lock); // have to wait until the writer release // due to the fairness <interrupted> spin_lock_irqsave(&dev->event_lock); // wait for the lock The lock dependency on CPU 1 happens if there exists a call sequence: input_inject_event(): spin_lock_irqsave(&dev->event_lock,...); input_handle_event(): input_pass_values(): input_to_handler(): handler->event(): // evdev_event() evdev_pass_values(): spin_lock(&client->buffer_lock); __pass_event(): kill_fasync(): kill_fasync_rcu(): read_lock(&fa->fa_lock); send_sigio(): read_lock(&fown->lock); To fix this, make the reader in send_sigurg() and send_sigio() use read_lock_irqsave() and read_lock_irqrestore(). Reported-by:
<syzbot+22e87cdf94021b984aa6@syzkaller.appspotmail.com> Reported-by:
<syzbot+c5e32344981ad9f33750@syzkaller.appspotmail.com> Signed-off-by:
Boqun Feng <boqun.feng@gmail.com> Signed-off-by:
Jeff Layton <jlayton@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Randy Dunlap authored
commit dc889b8d upstream. Make the printk() [bfs "printf" macro] seem less severe by changing "WARNING:" to "NOTE:". <asm-generic/bug.h> warns us about using WARNING or BUG in a format string other than in WARN() or BUG() family macros. bfs/inode.c is doing just that in a normal printk() call, so change the "WARNING" string to be "NOTE". Link: https://lkml.kernel.org/r/20201203212634.17278-1-rdunlap@infradead.org Reported-by:
<syzbot+3fd34060f26e766536ff@syzkaller.appspotmail.com> Signed-off-by:
Randy Dunlap <rdunlap@infradead.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: "Tigran A. Aivazian" <aivazian.tigran@gmail.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Takashi Iwai authored
commit 88a06d6f upstream. The runtime->avail field may be accessed concurrently while some places refer to it without taking the runtime->lock spinlock, as detected by KCSAN. Usually this isn't a big problem, but for consistency and safety, we should take the spinlock at each place referencing this field. Reported-by:
<syzbot+a23a6f1215c84756577c@syzkaller.appspotmail.com> Reported-by:
<syzbot+3d367d1df1d2b67f5c19@syzkaller.appspotmail.com> Link: https://lore.kernel.org/r/20201206083527.21163-1-tiwai@suse.de Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Takashi Iwai authored
commit 4ebd4703 upstream. The snd_seq_queue struct contains various flags in the bit fields. Those are categorized to two different use cases, both of which are protected by different spinlocks. That implies that there are still potential risks of the bad operations for bit fields by concurrent accesses. For addressing the problem, this patch rearranges those flags to be a standard bool instead of a bit field. Reported-by:
<syzbot+63cbe31877bb80ef58f5@syzkaller.appspotmail.com> Link: https://lore.kernel.org/r/20201206083456.21110-1-tiwai@suse.de Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Chao Yu authored
commit e584bbe8 upstream. syzbot reported a bug which could cause shift-out-of-bounds issue, fix it. Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x107/0x163 lib/dump_stack.c:120 ubsan_epilogue+0xb/0x5a lib/ubsan.c:148 __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395 sanity_check_raw_super fs/f2fs/super.c:2812 [inline] read_raw_super_block fs/f2fs/super.c:3267 [inline] f2fs_fill_super.cold+0x16c9/0x16f6 fs/f2fs/super.c:3519 mount_bdev+0x34d/0x410 fs/super.c:1366 legacy_get_tree+0x105/0x220 fs/fs_context.c:592 vfs_get_tree+0x89/0x2f0 fs/super.c:1496 do_new_mount fs/namespace.c:2896 [inline] path_mount+0x12ae/0x1e70 fs/namespace.c:3227 do_mount fs/namespace.c:3240 [inline] __do_sys_mount fs/namespace.c:3448 [inline] __se_sys_mount fs/namespace.c:3425 [inline] __x64_sys_mount+0x27f/0x300 fs/namespace.c:3425 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reported-by:
<syzbot+ca9a785f8ac472085994@syzkaller.appspotmail.com> Signed-off-by:
Anant Thazhemadam <anant.thazhemadam@gmail.com> Signed-off-by:
Chao Yu <yuchao0@huawei.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mauro Carvalho Chehab authored
commit d0ac1a26 upstream. As reported on: https://lore.kernel.org/linux-media/20190627222020.45909-1-willemdebruijn.kernel@gmail.com/ if gp8psk_usb_in_op() returns an error, the status var is not initialized. Yet, this var is used later on, in order to identify: - if the device was already started; - if firmware has loaded; - if the LNBf was powered on. Using status = 0 seems to ensure that everything will be properly powered up. So, instead of the proposed solution, let's just set status = 0. Reported-by:
syzbot <syzkaller@googlegroups.com> Reported-by:
Willem de Bruijn <willemb@google.com> Signed-off-by:
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Anant Thazhemadam authored
commit 31dcb6c3 upstream. A kernel-infoleak was reported by syzbot, which was caused because dbells was left uninitialized. Using kzalloc() instead of kmalloc() fixes this issue. Reported-by:
<syzbot+a79e17c39564bedf0930@syzkaller.appspotmail.com> Tested-by:
<syzbot+a79e17c39564bedf0930@syzkaller.appspotmail.com> Signed-off-by:
Anant Thazhemadam <anant.thazhemadam@gmail.com> Link: https://lore.kernel.org/r/20201122224534.333471-1-anant.thazhemadam@gmail.com Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rustam Kovhaev authored
commit d24396c5 upstream. when directory item has an invalid value set for ih_entry_count it might trigger use-after-free or out-of-bounds read in bin_search_in_dir_item() ih_entry_count * IH_SIZE for directory item should not be larger than ih_item_len Link: https://lore.kernel.org/r/20201101140958.3650143-1-rkovhaev@gmail.com Reported-and-tested-by:
<syzbot+83b6f7cf9922cae5c4d7@syzkaller.appspotmail.com> Link: https://syzkaller.appspot.com/bug?extid=83b6f7cf9922cae5c4d7 Signed-off-by:
Rustam Kovhaev <rkovhaev@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Anant Thazhemadam authored
commit 70f259a3 upstream. When h5_close() gets called, the memory allocated for the hu gets freed only if hu->serdev doesn't exist. This leads to a memory leak. So when h5_close() is requested, close the serdev device instance and free the memory allocated to the hu entirely instead. Fixes: https://syzkaller.appspot.com/bug?extid=6ce141c55b2f7aafd1c4 Reported-by:
<syzbot+6ce141c55b2f7aafd1c4@syzkaller.appspotmail.com> Tested-by:
<syzbot+6ce141c55b2f7aafd1c4@syzkaller.appspotmail.com> Signed-off-by:
Anant Thazhemadam <anant.thazhemadam@gmail.com> Signed-off-by:
Marcel Holtmann <marcel@holtmann.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Randy Dunlap authored
commit cb525319 upstream. SCSI_CXGB4_ISCSI selects CHELSIO_T4. The latter depends on TLS || TLS=n, so since 'select' does not check dependencies of the selected symbol, SCSI_CXGB4_ISCSI should also depend on TLS || TLS=n. This prevents the following kconfig warning and restricts SCSI_CXGB4_ISCSI to 'm' whenever TLS=m. WARNING: unmet direct dependencies detected for CHELSIO_T4 Depends on [m]: NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_CHELSIO [=y] && PCI [=y] && (IPV6 [=y] || IPV6 [=y]=n) && (TLS [=m] || TLS [=m]=n) Selected by [y]: - SCSI_CXGB4_ISCSI [=y] && SCSI_LOWLEVEL [=y] && SCSI [=y] && PCI [=y] && INET [=y] && (IPV6 [=y] || IPV6 [=y]=n) && ETHERNET [=y] Link: https://lore.kernel.org/r/20201208220505.24488-1-rdunlap@infradead.org Fixes: 7b36b6e0 ("[SCSI] cxgb4i v5: iscsi driver") Cc: Karen Xie <kxie@chelsio.com> Cc: linux-scsi@vger.kernel.org Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Signed-off-by:
Randy Dunlap <rdunlap@infradead.org> Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Qinglang Miao authored
commit 2d18e54d upstream. A memory leak is found in cgroup1_parse_param() when multiple source parameters overwrite fc->source in the fs_context struct without free. unreferenced object 0xffff888100d930e0 (size 16): comm "mount", pid 520, jiffies 4303326831 (age 152.783s) hex dump (first 16 bytes): 74 65 73 74 6c 65 61 6b 00 00 00 00 00 00 00 00 testleak........ backtrace: [<000000003e5023ec>] kmemdup_nul+0x2d/0xa0 [<00000000377dbdaa>] vfs_parse_fs_string+0xc0/0x150 [<00000000cb2b4882>] generic_parse_monolithic+0x15a/0x1d0 [<000000000f750198>] path_mount+0xee1/0x1820 [<0000000004756de2>] do_mount+0xea/0x100 [<0000000094cafb0a>] __x64_sys_mount+0x14b/0x1f0 Fix this bug by permitting a single source parameter and rejecting with an error all subsequent ones. Fixes: 8d2451f4 ("cgroup1: switch to option-by-option parsing") Reported-by:
Hulk Robot <hulkci@huawei.com> Signed-off-by:
Qinglang Miao <miaoqinglang@huawei.com> Reviewed-by:
Zefan Li <lizefan@huawei.com> Signed-off-by:
Tejun Heo <tj@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Johan Hovold authored
commit 5812b32e upstream. Specify type alignment when declaring linker-section match-table entries to prevent gcc from increasing alignment and corrupting the various tables with padding (e.g. timers, irqchips, clocks, reserved memory). This is specifically needed on x86 where gcc (typically) aligns larger objects like struct of_device_id with static extent on 32-byte boundaries which at best prevents matching on anything but the first entry. Specifying alignment when declaring variables suppresses this optimisation. Here's a 64-bit example where all entries are corrupt as 16 bytes of padding has been inserted before the first entry: ffffffff8266b4b0 D __clk_of_table ffffffff8266b4c0 d __of_table_fixed_factor_clk ffffffff8266b5a0 d __of_table_fixed_clk ffffffff8266b680 d __clk_of_table_sentinel And here's a 32-bit example where the 8-byte-aligned table happens to be placed on a 32-byte boundary so that all but the first entry are corrupt due to the 28 bytes of padding inserted between entries: 812b3ec0 D __irqchip_of_table 812b3ec0 d __of_table_irqchip1 812b3fa0 d __of_table_irqchip2 812b4080 d __of_table_irqchip3 812b4160 d irqchip_of_match_end Verified on x86 using gcc-9.3 and gcc-4.9 (which uses 64-byte alignment), and on arm using gcc-7.2. Note that there are no in-tree users of these tables on x86 currently (even if they are included in the image). Fixes: 54196ccb ("of: consolidate linker section OF match table declarations") Fixes: f6e916b8 ("irqchip: add basic infrastructure") Cc: stable <stable@vger.kernel.org> # 3.9 Signed-off-by:
Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20201123102319.8090-2-johan@kernel.org [ johan: adjust context to 5.4 ] Signed-off-by:
Johan Hovold <johan@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Damien Le Moal authored
commit 0ebcdd70 upstream. For a null_blk device with zoned mode enabled is currently initialized with a number of zones equal to the device capacity divided by the zone size, without considering if the device capacity is a multiple of the zone size. If the zone size is not a divisor of the capacity, the zones end up not covering the entire capacity, potentially resulting is out of bounds accesses to the zone array. Fix this by adding one last smaller zone with a size equal to the remainder of the disk capacity divided by the zone size if the capacity is not a multiple of the zone size. For such smaller last zone, the zone capacity is also checked so that it does not exceed the smaller zone size. Reported-by:
Naohiro Aota <naohiro.aota@wdc.com> Fixes: ca4b2a01 ("null_blk: add zone support") Cc: stable@vger.kernel.org Signed-off-by:
Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Arnaldo Carvalho de Melo authored
commit 7ddcdea5 upstream. To pick up the changes in: a85cbe61 ("uapi: move constants from <linux/kernel.h> to <linux/const.h>") That causes no changes in tooling, just addresses this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/const.h' differs from latest version at 'include/uapi/linux/const.h' diff -u tools/include/uapi/linux/const.h include/uapi/linux/const.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Petr Vorel <petr.vorel@gmail.com> Signed-off-by:
Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Petr Vorel authored
commit a85cbe61 upstream. and include <linux/const.h> in UAPI headers instead of <linux/kernel.h>. The reason is to avoid indirect <linux/sysinfo.h> include when using some network headers: <linux/netlink.h> or others -> <linux/kernel.h> -> <linux/sysinfo.h>. This indirect include causes on MUSL redefinition of struct sysinfo when included both <sys/sysinfo.h> and some of UAPI headers: In file included from x86_64-buildroot-linux-musl/sysroot/usr/include/linux/kernel.h:5, from x86_64-buildroot-linux-musl/sysroot/usr/include/linux/netlink.h:5, from ../include/tst_netlink.h:14, from tst_crypto.c:13: x86_64-buildroot-linux-musl/sysroot/usr/include/linux/sysinfo.h:8:8: error: redefinition of `struct sysinfo' struct sysinfo { ^~~~~~~ In file included from ../include/tst_safe_macros.h:15, from ../include/tst_test.h:93, from tst_crypto.c:11: x86_64-buildroot-linux-musl/sysroot/usr/include/sys/sysinfo.h:10:8: note: originally defined here Link: https://lkml.kernel.org/r/20201015190013.8901-1-petr.vorel@gmail.com Signed-off-by:
Petr Vorel <petr.vorel@gmail.com> Suggested-by:
Rich Felker <dalias@aerifal.cx> Acked-by:
Rich Felker <dalias@libc.org> Cc: Peter Korsgaard <peter@korsgaard.com> Cc: Baruch Siach <baruch@tkos.co.il> Cc: Florian Weimer <fweimer@redhat.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Bart Van Assche authored
commit fa4d0f19 upstream. With the current implementation the following race can happen: * blk_pre_runtime_suspend() calls blk_freeze_queue_start() and blk_mq_unfreeze_queue(). * blk_queue_enter() calls blk_queue_pm_only() and that function returns true. * blk_queue_enter() calls blk_pm_request_resume() and that function does not call pm_request_resume() because the queue runtime status is RPM_ACTIVE. * blk_pre_runtime_suspend() changes the queue status into RPM_SUSPENDING. Fix this race by changing the queue runtime status into RPM_SUSPENDING before switching q_usage_counter to atomic mode. Link: https://lore.kernel.org/r/20201209052951.16136-2-bvanassche@acm.org Fixes: 986d413b ("blk-mq: Enable support for runtime power management") Cc: Ming Lei <ming.lei@redhat.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: stable <stable@vger.kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Hannes Reinecke <hare@suse.de> Reviewed-by:
Jens Axboe <axboe@kernel.dk> Acked-by:
Alan Stern <stern@rowland.harvard.edu> Acked-by:
Stanley Chu <stanley.chu@mediatek.com> Co-developed-by:
Can Guo <cang@codeaurora.org> Signed-off-by:
Can Guo <cang@codeaurora.org> Signed-off-by:
Bart Van Assche <bvanassche@acm.org> Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jamie Iles authored
[ Upstream commit a61df3c4 ] syzkaller found the following JFFS2 splat: Unable to handle kernel paging request at virtual address dfffa00000000001 Mem abort info: ESR = 0x96000004 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x00000004 CM = 0, WnR = 0 [dfffa00000000001] address between user and kernel address ranges Internal error: Oops: 96000004 [#1] SMP Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 0 PID: 12745 Comm: syz-executor.5 Tainted: G S 5.9.0-rc8+ #98 Hardware name: linux,dummy-virt (DT) pstate: 20400005 (nzCv daif +PAN -UAO BTYPE=--) pc : jffs2_parse_param+0x138/0x308 fs/jffs2/super.c:206 lr : jffs2_parse_param+0x108/0x308 fs/jffs2/super.c:205 sp : ffff000022a57910 x29: ffff000022a57910 x28: 0000000000000000 x27: ffff000057634008 x26: 000000000000d800 x25: 000000000000d800 x24: ffff0000271a9000 x23: ffffa0001adb5dc0 x22: ffff000023fdcf00 x21: 1fffe0000454af2c x20: ffff000024cc9400 x19: 0000000000000000 x18: 0000000000000000 x17: 0000000000000000 x16: ffffa000102dbdd0 x15: 0000000000000000 x14: ffffa000109e44bc x13: ffffa00010a3a26c x12: ffff80000476e0b3 x11: 1fffe0000476e0b2 x10: ffff80000476e0b2 x9 : ffffa00010a3ad60 x8 : ffff000023b70593 x7 : 0000000000000003 x6 : 00000000f1f1f1f1 x5 : ffff000023fdcf00 x4 : 0000000000000002 x3 : ffffa00010000000 x2 : 0000000000000001 x1 : dfffa00000000000 x0 : 0000000000000008 Call trace: jffs2_parse_param+0x138/0x308 fs/jffs2/super.c:206 vfs_parse_fs_param+0x234/0x4e8 fs/fs_context.c:117 vfs_parse_fs_string+0xe8/0x148 fs/fs_context.c:161 generic_parse_monolithic+0x17c/0x208 fs/fs_context.c:201 parse_monolithic_mount_data+0x7c/0xa8 fs/fs_context.c:649 do_new_mount fs/namespace.c:2871 [inline] path_mount+0x548/0x1da8 fs/namespace.c:3192 do_mount+0x124/0x138 fs/namespace.c:3205 __do_sys_mount fs/namespace.c:3413 [inline] __se_sys_mount fs/namespace.c:3390 [inline] __arm64_sys_mount+0x164/0x238 fs/namespace.c:3390 __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline] invoke_syscall arch/arm64/kernel/syscall.c:48 [inline] el0_svc_common.constprop.0+0x15c/0x598 arch/arm64/kernel/syscall.c:149 do_el0_svc+0x60/0x150 arch/arm64/kernel/syscall.c:195 el0_svc+0x34/0xb0 arch/arm64/kernel/entry-common.c:226 el0_sync_handler+0xc8/0x5b4 arch/arm64/kernel/entry-common.c:236 el0_sync+0x15c/0x180 arch/arm64/kernel/entry.S:663 Code: d2d40001 f2fbffe1 91002260 d343fc02 (38e16841) ---[ end trace 4edf690313deda44 ]--- This is because since ec10a24f, the option parsing happens before fill_super and so the MTD device isn't associated with the filesystem. Defer the size check until there is a valid association. Fixes: ec10a24f ("vfs: Convert jffs2 to use the new mount API") Cc: <stable@vger.kernel.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by:
Jamie Iles <jamie@nuviainc.com> Signed-off-by:
Richard Weinberger <richard@nod.at> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
lizhe authored
[ Upstream commit cd3ed3c7 ] Set rp_size to zero will be ignore during remounting. The method to identify whether we input a remounting option of rp_size is to check if the rp_size input is zero. It can not work well if we pass "rp_size=0". This patch add a bool variable "set_rp_size" to fix this problem. Reported-by:
Jubin Zhong <zhongjubin@huawei.com> Signed-off-by:
lizhe <lizhe67@huawei.com> Signed-off-by:
Richard Weinberger <richard@nod.at> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Christophe Leroy authored
[ Upstream commit 1891ef21 ] fls() and fls64() are using __builtin_ctz() and _builtin_ctzll(). On powerpc, those builtins trivially use ctlzw and ctlzd power instructions. Allthough those instructions provide the expected result with input argument 0, __builtin_ctz() and __builtin_ctzll() are documented as undefined for value 0. The easiest fix would be to use fls() and fls64() functions defined in include/asm-generic/bitops/builtin-fls.h and include/asm-generic/bitops/fls64.h, but GCC output is not optimal: 00000388 <testfls>: 388: 2c 03 00 00 cmpwi r3,0 38c: 41 82 00 10 beq 39c <testfls+0x14> 390: 7c 63 00 34 cntlzw r3,r3 394: 20 63 00 20 subfic r3,r3,32 398: 4e 80 00 20 blr 39c: 38 60 00 00 li r3,0 3a0: 4e 80 00 20 blr 000003b0 <testfls64>: 3b0: 2c 03 00 00 cmpwi r3,0 3b4: 40 82 00 1c bne 3d0 <testfls64+0x20> 3b8: 2f 84 00 00 cmpwi cr7,r4,0 3bc: 38 60 00 00 li r3,0 3c0: 4d 9e 00 20 beqlr cr7 3c4: 7c 83 00 34 cntlzw r3,r4 3c8: 20 63 00 20 subfic r3,r3,32 3cc: 4e 80 00 20 blr 3d0: 7c 63 00 34 cntlzw r3,r3 3d4: 20 63 00 40 subfic r3,r3,64 3d8: 4e 80 00 20 blr When the input of fls(x) is a constant, just check x for nullity and return either 0 or __builtin_clz(x). Otherwise, use cntlzw instruction directly. For fls64() on PPC64, do the same but with __builtin_clzll() and cntlzd instruction. On PPC32, lets take the generic fls64() which will use our fls(). The result is as expected: 00000388 <testfls>: 388: 7c 63 00 34 cntlzw r3,r3 38c: 20 63 00 20 subfic r3,r3,32 390: 4e 80 00 20 blr 000003a0 <testfls64>: 3a0: 2c 03 00 00 cmpwi r3,0 3a4: 40 82 00 10 bne 3b4 <testfls64+0x14> 3a8: 7c 83 00 34 cntlzw r3,r4 3ac: 20 63 00 20 subfic r3,r3,32 3b0: 4e 80 00 20 blr 3b4: 7c 63 00 34 cntlzw r3,r3 3b8: 20 63 00 40 subfic r3,r3,64 3bc: 4e 80 00 20 blr Fixes: 2fcff790 ("powerpc: Use builtin functions for fls()/__fls()/fls64()") Cc: stable@vger.kernel.org Signed-off-by:
Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by:
Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/348c2d3f19ffcff8abe50d52513f989c4581d000.1603375524.git.christophe.leroy@csgroup.eu Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Paolo Bonzini authored
[ Upstream commit 39485ed9 ] Until commit e7c587da ("x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP"), KVM was testing both Intel and AMD CPUID bits before allowing the guest to write MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD. Testing only Intel bits on VMX processors, or only AMD bits on SVM processors, fails if the guests are created with the "opposite" vendor as the host. While at it, also tweak the host CPU check to use the vendor-agnostic feature bit X86_FEATURE_IBPB, since we only care about the availability of the MSR on the host here and not about specific CPUID bits. Fixes: e7c587da ("x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP") Cc: stable@vger.kernel.org Reported-by:
Denis V. Lunev <den@openvz.org> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Paolo Bonzini authored
[ Upstream commit df7e8818 ] Userspace that does not know about the AMD_IBRS bit might still allow the guest to protect itself with MSR_IA32_SPEC_CTRL using the Intel SPEC_CTRL bit. However, svm.c disallows this and will cause a #GP in the guest when writing to the MSR. Fix this by loosening the test and allowing the Intel CPUID bit, and in fact allow the AMD_STIBP bit as well since it allows writing to MSR_IA32_SPEC_CTRL too. Reported-by:
Zhiyi Guo <zhguo@redhat.com> Analyzed-by:
Dr. David Alan Gilbert <dgilbert@redhat.com> Analyzed-by:
Laszlo Ersek <lersek@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Paolo Bonzini authored
[ Upstream commit 6441fa61 ] If the guest is configured to have SPEC_CTRL but the host does not (which is a nonsensical configuration but these are not explicitly forbidden) then a host-initiated MSR write can write vmx->spec_ctrl (respectively svm->spec_ctrl) and trigger a #GP when KVM tries to restore the host value of the MSR. Add a more comprehensive check for valid bits of SPEC_CTRL, covering host CPUID flags and, since we are at it and it is more correct that way, guest CPUID flags too. For AMD, remove the unnecessary is_guest_mode check around setting the MSR interception bitmap, so that the code looks the same as for Intel. Cc: Jim Mattson <jmattson@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Jan Kara authored
[ Upstream commit b08070ec ] ext4_handle_error() with errors=continue mount option can accidentally remount the filesystem read-only when the system is rebooting. Fix that. Fixes: 1dc1097f ("ext4: avoid panic during forced reboot") Signed-off-by:
Jan Kara <jack@suse.cz> Reviewed-by:
Andreas Dilger <adilger@dilger.ca> Cc: stable@kernel.org Link: https://lore.kernel.org/r/20201127113405.26867-2-jack@suse.cz Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Filipe Manana authored
[ Upstream commit 7f458a38 ] When defragmenting we skip ranges that have holes or inline extents, so that we don't do unnecessary IO and waste space. We do this check when calling should_defrag_range() at btrfs_defrag_file(). However we do it without holding the inode's lock. The reason we do it like this is to avoid blocking other tasks for too long, that possibly want to operate on other file ranges, since after the call to should_defrag_range() and before locking the inode, we trigger a synchronous page cache readahead. However before we were able to lock the inode, some other task might have punched a hole in our range, or we may now have an inline extent there, in which case we should not set the range for defrag anymore since that would cause unnecessary IO and make us waste space (i.e. allocating extents to contain zeros for a hole). So after we locked the inode and the range in the iotree, check again if we have holes or an inline extent, and if we do, just skip the range. I hit this while testing my next patch that fixes races when updating an inode's number of bytes (subject "btrfs: update the number of bytes used by an inode atomically"), and it depends on this change in order to work correctly. Alternatively I could rework that other patch to detect holes and flag their range with the 'new delalloc' bit, but this itself fixes an efficiency problem due a race that from a functional point of view is not harmful (it could be triggered with btrfs/062 from fstests). CC: stable@vger.kernel.org # 5.4+ Reviewed-by:
Josef Bacik <josef@toxicpanda.com> Signed-off-by:
Filipe Manana <fdmanana@suse.com> Signed-off-by:
David Sterba <dsterba@suse.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Eric Auger authored
[ Upstream commit 16b8fe4c ] In case an error occurs in vfio_pci_enable() before the call to vfio_pci_probe_mmaps(), vfio_pci_disable() will try to iterate on an uninitialized list and cause a kernel panic. Lets move to the initialization to vfio_pci_probe() to fix the issue. Signed-off-by:
Eric Auger <eric.auger@redhat.com> Fixes: 05f0c03f ("vfio-pci: Allow to mmap sub-page MMIO BARs if the mmio page is exclusive") CC: Stable <stable@vger.kernel.org> # v4.7+ Signed-off-by:
Alex Williamson <alex.williamson@redhat.com> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-