- Jan 06, 2021
-
-
Baoquan He authored
commit dc2da7b4 upstream. VMware observed a performance regression during memmap init on their platform, and bisected to commit 73a6e474 ("mm: memmap_init: iterate over memblock regions rather that check each PFN") causing it. Before the commit: [0.033176] Normal zone: 1445888 pages used for memmap [0.033176] Normal zone: 89391104 pages, LIFO batch:63 [0.035851] ACPI: PM-Timer IO Port: 0x448 With commit [0.026874] Normal zone: 1445888 pages used for memmap [0.026875] Normal zone: 89391104 pages, LIFO batch:63 [2.028450] ACPI: PM-Timer IO Port: 0x448 The root cause is the current memmap defer init doesn't work as expected. Before, memmap_init_zone() was used to do memmap init of one whole zone, to initialize all low zones of one numa node, but defer memmap init of the last zone in that numa node. However, since commit 73a6e474, function memmap_init() is adapted to iterater over memblock regions inside one zone, then call memmap_init_zone() to do memmap init for each region. E.g, on VMware's system, the memory layout is as below, there are two memory regions in node 2. The current code will mistakenly initialize the whole 1st region [mem 0xab00000000-0xfcffffffff], then do memmap defer to iniatialize only one memmory section on the 2nd region [mem 0x10000000000-0x1033fffffff]. In fact, we only expect to see that there's only one memory section's memmap initialized. That's why more time is costed at the time. [ 0.008842] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x0009ffff] [ 0.008842] ACPI: SRAT: Node 0 PXM 0 [mem 0x00100000-0xbfffffff] [ 0.008843] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x55ffffffff] [ 0.008844] ACPI: SRAT: Node 1 PXM 1 [mem 0x5600000000-0xaaffffffff] [ 0.008844] ACPI: SRAT: Node 2 PXM 2 [mem 0xab00000000-0xfcffffffff] [ 0.008845] ACPI: SRAT: Node 2 PXM 2 [mem 0x10000000000-0x1033fffffff] Now, let's add a parameter 'zone_end_pfn' to memmap_init_zone() to pass down the real zone end pfn so that defer_init() can use it to judge whether defer need be taken in zone wide. Link: https://lkml.kernel.org/r/20201223080811.16211-1-bhe@redhat.com Link: https://lkml.kernel.org/r/20201223080811.16211-2-bhe@redhat.com Fixes: commit 73a6e474 ("mm: memmap_init: iterate over memblock regions rather that check each PFN") Signed-off-by:
Baoquan He <bhe@redhat.com> Reported-by:
Rahul Gopakumar <gopakumarr@vmware.com> Reviewed-by:
Mike Rapoport <rppt@linux.ibm.com> Cc: David Hildenbrand <david@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mike Kravetz authored
commit e7dd91c4 upstream. syzbot reported the deadlock here [1]. The issue is in hugetlb cow error handling when there are not enough huge pages for the faulting task which took the original reservation. It is possible that other (child) tasks could have consumed pages associated with the reservation. In this case, we want the task which took the original reservation to succeed. So, we unmap any associated pages in children so that they can be used by the faulting task that owns the reservation. The unmapping code needs to hold i_mmap_rwsem in write mode. However, due to commit c0d0381a ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization") we are already holding i_mmap_rwsem in read mode when hugetlb_cow is called. Technically, i_mmap_rwsem does not need to be held in read mode for COW mappings as they can not share pmd's. Modifying the fault code to not take i_mmap_rwsem in read mode for COW (and other non-sharable) mappings is too involved for a stable fix. Instead, we simply drop the hugetlb_fault_mutex and i_mmap_rwsem before unmapping. This is OK as it is technically not needed. They are reacquired after unmapping as expected by calling code. Since this is done in an uncommon error path, the overhead of dropping and reacquiring mutexes is acceptable. While making changes, remove redundant BUG_ON after unmap_ref_private. [1] https://lkml.kernel.org/r/000000000000b73ccc05b5cf8558@google.com Link: https://lkml.kernel.org/r/4c5781b8-3b00-761e-c0c7-c5edebb6ec1a@oracle.com Fixes: c0d0381a ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization") Signed-off-by:
Mike Kravetz <mike.kravetz@oracle.com> Reported-by:
<syzbot+5eee4145df3c15e96625@syzkaller.appspotmail.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Bart Van Assche authored
commit fa4d0f19 upstream. With the current implementation the following race can happen: * blk_pre_runtime_suspend() calls blk_freeze_queue_start() and blk_mq_unfreeze_queue(). * blk_queue_enter() calls blk_queue_pm_only() and that function returns true. * blk_queue_enter() calls blk_pm_request_resume() and that function does not call pm_request_resume() because the queue runtime status is RPM_ACTIVE. * blk_pre_runtime_suspend() changes the queue status into RPM_SUSPENDING. Fix this race by changing the queue runtime status into RPM_SUSPENDING before switching q_usage_counter to atomic mode. Link: https://lore.kernel.org/r/20201209052951.16136-2-bvanassche@acm.org Fixes: 986d413b ("blk-mq: Enable support for runtime power management") Cc: Ming Lei <ming.lei@redhat.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: stable <stable@vger.kernel.org> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Hannes Reinecke <hare@suse.de> Reviewed-by:
Jens Axboe <axboe@kernel.dk> Acked-by:
Alan Stern <stern@rowland.harvard.edu> Acked-by:
Stanley Chu <stanley.chu@mediatek.com> Co-developed-by:
Can Guo <cang@codeaurora.org> Signed-off-by:
Can Guo <cang@codeaurora.org> Signed-off-by:
Bart Van Assche <bvanassche@acm.org> Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Viresh Kumar authored
commit 0e1d9ca1 upstream. Fix the clock reference counting by calling the missing clk_put() in the error path. Cc: v5.10 <stable@vger.kernel.org> # v5.10 Fixes: dd461cd9 ("opp: Allow dev_pm_opp_get_opp_table() to return -EPROBE_DEFER") Signed-off-by:
Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Quanyang Wang authored
commit 976509bb upstream. In function _allocate_opp_table, opp_dev is allocated and referenced by opp_table via _add_opp_dev. But in the case that the subsequent calls return -EPROBE_DEFER, it will jump to err label and opp_table will be freed. Then opp_dev becomes an unreferenced object to cause memory leak. So let's call _remove_opp_dev to do the cleanup. This fixes the following kmemleak report: unreferenced object 0xffff000801524a00 (size 128): comm "swapper/0", pid 1, jiffies 4294892465 (age 84.616s) hex dump (first 32 bytes): 40 00 56 01 08 00 ff ff 40 00 56 01 08 00 ff ff @.V.....@.V..... b8 52 77 7f 08 00 ff ff 00 3c 4c 00 08 00 ff ff .Rw......<L..... backtrace: [<00000000b1289fb1>] kmemleak_alloc+0x30/0x40 [<0000000056da48f0>] kmem_cache_alloc+0x3d4/0x588 [<00000000a84b3b0e>] _add_opp_dev+0x2c/0x88 [<0000000062a380cd>] _add_opp_table_indexed+0x124/0x268 [<000000008b4c8f1f>] dev_pm_opp_of_add_table+0x20/0x1d8 [<00000000e5316798>] dev_pm_opp_of_cpumask_add_table+0x48/0xf0 [<00000000db0a8ec2>] dt_cpufreq_probe+0x20c/0x448 [<0000000030a3a26c>] platform_probe+0x68/0xd8 [<00000000c618e78d>] really_probe+0xd0/0x3a0 [<00000000642e856f>] driver_probe_device+0x58/0xb8 [<00000000f10f5307>] device_driver_attach+0x74/0x80 [<0000000004f254b8>] __driver_attach+0x58/0xe0 [<0000000009d5d19e>] bus_for_each_dev+0x70/0xc8 [<0000000000d22e1c>] driver_attach+0x24/0x30 [<0000000001d4e952>] bus_add_driver+0x14c/0x1f0 [<0000000089928aaa>] driver_register+0x64/0x120 Cc: v5.10 <stable@vger.kernel.org> # v5.10 Fixes: dd461cd9 ("opp: Allow dev_pm_opp_get_opp_table() to return -EPROBE_DEFER") Signed-off-by:
Quanyang Wang <quanyang.wang@windriver.com> [ Viresh: Added the stable tag ] Signed-off-by:
Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Serge Semin authored
commit 72188381 upstream. I mistakenly added the select attributes to the SPI_DW_BT1_DIRMAP config instead of having them defined in SPI_DW_BT1. If the kernel doesn't have the MULTIPLEXER and MUX_MMIO configs manually enabled and the SPI_DW_BT1_DIRMAP config hasn't been selected, Baikal-T1 SPI device will always fail to be probed by the driver. Fix that and the error reported by the test robot: >> ld.lld: error: undefined symbol: devm_mux_control_get >>> referenced by spi-dw-bt1.c >>> spi/spi-dw-bt1.o:(dw_spi_bt1_sys_init) in archive drivers/built-in.a by moving the MULTIPLEXER/MUX_MMIO configs selection to the SPI_DW_BT1 config. Link: https://lore.kernel.org/lkml/202011161745.uYRlekse-lkp@intel.com/ Link: https://lore.kernel.org/linux-spi/20201116040721.8001-1-rdunlap@infradead.org/ Fixes: abf00907 ("spi: dw: Add Baikal-T1 SPI Controller glue driver") Reported-by:
kernel test robot <lkp@intel.com> Signed-off-by:
Serge Semin <Sergey.Semin@baikalelectronics.ru> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Ramil Zaripov <Ramil.Zaripov@baikalelectronics.ru> Link: https://lore.kernel.org/r/20201127144612.4204-1-Sergey.Semin@baikalelectronics.ru Signed-off-by:
Mark Brown <broonie@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jamie Iles authored
[ Upstream commit a61df3c4 ] syzkaller found the following JFFS2 splat: Unable to handle kernel paging request at virtual address dfffa00000000001 Mem abort info: ESR = 0x96000004 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x00000004 CM = 0, WnR = 0 [dfffa00000000001] address between user and kernel address ranges Internal error: Oops: 96000004 [#1] SMP Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 0 PID: 12745 Comm: syz-executor.5 Tainted: G S 5.9.0-rc8+ #98 Hardware name: linux,dummy-virt (DT) pstate: 20400005 (nzCv daif +PAN -UAO BTYPE=--) pc : jffs2_parse_param+0x138/0x308 fs/jffs2/super.c:206 lr : jffs2_parse_param+0x108/0x308 fs/jffs2/super.c:205 sp : ffff000022a57910 x29: ffff000022a57910 x28: 0000000000000000 x27: ffff000057634008 x26: 000000000000d800 x25: 000000000000d800 x24: ffff0000271a9000 x23: ffffa0001adb5dc0 x22: ffff000023fdcf00 x21: 1fffe0000454af2c x20: ffff000024cc9400 x19: 0000000000000000 x18: 0000000000000000 x17: 0000000000000000 x16: ffffa000102dbdd0 x15: 0000000000000000 x14: ffffa000109e44bc x13: ffffa00010a3a26c x12: ffff80000476e0b3 x11: 1fffe0000476e0b2 x10: ffff80000476e0b2 x9 : ffffa00010a3ad60 x8 : ffff000023b70593 x7 : 0000000000000003 x6 : 00000000f1f1f1f1 x5 : ffff000023fdcf00 x4 : 0000000000000002 x3 : ffffa00010000000 x2 : 0000000000000001 x1 : dfffa00000000000 x0 : 0000000000000008 Call trace: jffs2_parse_param+0x138/0x308 fs/jffs2/super.c:206 vfs_parse_fs_param+0x234/0x4e8 fs/fs_context.c:117 vfs_parse_fs_string+0xe8/0x148 fs/fs_context.c:161 generic_parse_monolithic+0x17c/0x208 fs/fs_context.c:201 parse_monolithic_mount_data+0x7c/0xa8 fs/fs_context.c:649 do_new_mount fs/namespace.c:2871 [inline] path_mount+0x548/0x1da8 fs/namespace.c:3192 do_mount+0x124/0x138 fs/namespace.c:3205 __do_sys_mount fs/namespace.c:3413 [inline] __se_sys_mount fs/namespace.c:3390 [inline] __arm64_sys_mount+0x164/0x238 fs/namespace.c:3390 __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline] invoke_syscall arch/arm64/kernel/syscall.c:48 [inline] el0_svc_common.constprop.0+0x15c/0x598 arch/arm64/kernel/syscall.c:149 do_el0_svc+0x60/0x150 arch/arm64/kernel/syscall.c:195 el0_svc+0x34/0xb0 arch/arm64/kernel/entry-common.c:226 el0_sync_handler+0xc8/0x5b4 arch/arm64/kernel/entry-common.c:236 el0_sync+0x15c/0x180 arch/arm64/kernel/entry.S:663 Code: d2d40001 f2fbffe1 91002260 d343fc02 (38e16841) ---[ end trace 4edf690313deda44 ]--- This is because since ec10a24f, the option parsing happens before fill_super and so the MTD device isn't associated with the filesystem. Defer the size check until there is a valid association. Fixes: ec10a24f ("vfs: Convert jffs2 to use the new mount API") Cc: <stable@vger.kernel.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by:
Jamie Iles <jamie@nuviainc.com> Signed-off-by:
Richard Weinberger <richard@nod.at> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
lizhe authored
[ Upstream commit cd3ed3c7 ] Set rp_size to zero will be ignore during remounting. The method to identify whether we input a remounting option of rp_size is to check if the rp_size input is zero. It can not work well if we pass "rp_size=0". This patch add a bool variable "set_rp_size" to fix this problem. Reported-by:
Jubin Zhong <zhongjubin@huawei.com> Signed-off-by:
lizhe <lizhe67@huawei.com> Signed-off-by:
Richard Weinberger <richard@nod.at> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Pavel Begunkov authored
commit dfea9fce upstream. The purpose of io_uring_cancel_files() is to wait for all requests matching ->files to go/be cancelled. We should first drop files of a request in io_req_drop_files() and only then make it undiscoverable for io_uring_cancel_files. First drop, then delete from list. It's ok to leave req->id->files dangling, because it's not dereferenced by cancellation code, only compared against. It would potentially go to sleep and be awaken by following in io_req_drop_files() wake_up(). Fixes: 0f212204 ("io_uring: don't rely on weak ->files references") Cc: <stable@vger.kernel.org> # 5.5+ Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rodrigo Siqueira authored
commit 6bdeff12 upstream. Some old ASICs might not implement/require get_dig_frontend helper; in this scenario, we can have a NULL pointer exception when we try to call it inside vbios disable operation. For example, this situation might happen when using Polaris12 with an eDP panel. This commit avoids this situation by adding a specific get_dig_frontend implementation for DCEx. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Harry Wentland <Harry.Wentland@amd.com> Cc: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Cc: Chiawen Huang <chiawen.huang@amd.com> Reported-and-tested-by:
Borislav Petkov <bp@suse.de> Acked-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kevin Vigor authored
commit 93decc56 upstream. In __make_request() a new r10bio is allocated and passed to raid10_read_request(). The read_slot member of the bio is not initialized, and the raid10_read_request() uses it to index an array. This leads to occasional panics. Fix by initializing the field to invalid value and checking for valid value in raid10_read_request(). Cc: stable@vger.kernel.org Signed-off-by:
Kevin Vigor <kvigor@gmail.com> Signed-off-by:
Song Liu <songliubraving@fb.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Michal Kubecek authored
[ Upstream commit efb796f5 ] Syzbot reported a shift of a u32 by more than 31 in strset_parse_request() which is undefined behavior. This is caused by range check of string set id using variable ret (which is always 0 at this point) instead of id (string set id from request). Fixes: 71921690 ("ethtool: provide string sets with STRSET_GET request") Reported-by:
<syzbot+96523fb438937cd01220@syzkaller.appspotmail.com> Signed-off-by:
Michal Kubecek <mkubecek@suse.cz> Link: https://lore.kernel.org/r/b54ed5c5fd972a59afea3e1badfb36d86df68799.1607952208.git.mkubecek@suse.cz Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ivan Vecera authored
[ Upstream commit ef72cd3c ] Fix two error paths in ethnl_set_channels() to avoid lock-up caused but unreleased RTNL. Fixes: e19c591e ("ethtool: set device channel counts with CHANNELS_SET request") Reported-by:
LiLiang <liali@redhat.com> Signed-off-by:
Ivan Vecera <ivecera@redhat.com> Reviewed-by:
Michal Kubecek <mkubecek@suse.cz> Link: https://lore.kernel.org/r/20201215090810.801777-1-ivecera@redhat.com Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paolo Abeni authored
[ Upstream commit 0c148460 ] Currently MPTCP is not propagating the security context from the ingress request socket to newly created msk at clone time. Address the issue invoking the missing security helper. Fixes: cf7da0d6 ("mptcp: Create SUBFLOW socket for incoming connections") Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Davide Caratti authored
[ Upstream commit 44d4775c ] syzkaller shows that packets can still be dequeued while taprio_destroy() is running. Let sch_taprio use the reset() function to cancel the advance timer and drop all skbs from the child qdiscs. Fixes: 5a781ccb ("tc: Add support for configuring the taprio scheduler") Link: https://syzkaller.appspot.com/bug?id=f362872379bf8f0017fb667c1ab158f2d1e764ae Reported-by:
<syzbot+8971da381fb5a31f542d@syzkaller.appspotmail.com> Signed-off-by:
Davide Caratti <dcaratti@redhat.com> Acked-by:
Vinicius Costa Gomes <vinicius.gomes@intel.com> Link: https://lore.kernel.org/r/63b6d79b0e830ebb0283e020db4df3cdfdfb2b94.1608142843.git.dcaratti@redhat.com Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- Dec 30, 2020
-
-
Greg Kroah-Hartman authored
Tested-by:
Jon Hunter <jonathanh@nvidia.com> Tested-by:
Guenter Roeck <linux@roeck-us.net> Tested-by:
Linux Kernel Functional Testing <lkft@linaro.org> Link: https://lore.kernel.org/r/20201229103832.108495696@linuxfoundation.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yazen Ghannam authored
[ Upstream commit 028c221e ] AMD systems provide a "NodeId" value that represents a global ID indicating to which "Node" a logical CPU belongs. The "Node" is a physical structure equivalent to a Die, and it should not be confused with logical structures like NUMA nodes. Logical nodes can be adjusted based on firmware or other settings whereas the physical nodes/dies are fixed based on hardware topology. The NodeId value can be used when a physical ID is needed by software. Save the AMD NodeId to struct cpuinfo_x86.cpu_die_id. Use the value from CPUID or MSR as appropriate. Default to phys_proc_id otherwise. Do so for both AMD and Hygon systems. Drop the node_id parameter from cacheinfo_*_init_llc_id() as it is no longer needed. Update the x86 topology documentation. Suggested-by:
Borislav Petkov <bp@alien8.de> Signed-off-by:
Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by:
Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201109210659.754018-2-Yazen.Ghannam@amd.com Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
Linus Torvalds authored
commit d652d5f1 upstream. Commit 991fcb77 ("drm/edid: Fix uninitialized variable in drm_cvt_modes()") just replaced one warning with another. The original warning about a possibly uninitialized variable was due to the compiler not being smart enough to see that the case statement actually enumerated all possible cases. And the initial fix was just to add a "default" case that had a single "unreachable()", just to tell the compiler that that situation cannot happen. However, that doesn't actually fix the fundamental reason for the problem: the compiler still doesn't see that the existing case statements enumerate all possibilities, so the compiler will still generate code to jump to that unreachable case statement. It just won't complain about an uninitialized variable any more. So now the compiler generates code to our inline asm marker that we told it would not fall through, and end end result is basically random. We have created a bridge to nowhere. And then, depending on the random details of just exactly what the compiler ends up doing, 'objtool' might end up complaining about the conditional branches (for conditions that cannot happen, and that thus will never be taken - but if the compiler was not smart enough to figure that out, we can't expect objtool to do so) going off in the weeds. So depending on how the compiler has laid out the result, you might see something like this: drivers/gpu/drm/drm_edid.o: warning: objtool: do_cvt_mode() falls through to next function drm_mode_detailed.isra.0() and now you have a truly inscrutable warning that makes no sense at all unless you start looking at whatever random code the compiler happened to generate for our bare "unreachable()" statement. IOW, don't use "unreachable()" unless you have an _active_ operation that generates code that actually makes it obvious that something is not reachable (ie an UD instruction or similar). Solve the "compiler isn't smart enough" problem by just marking one of the cases as "default", so that even when the compiler doesn't otherwise see that we've enumerated all cases, the compiler will feel happy and safe about there always being a valid case that initializes the 'width' variable. This also generates better code, since now the compiler doesn't generate comparisons for five different possibilities (the four real ones and the one that can't happen), but just for the three real ones and "the rest" (which is that last one). A smart enough compiler that sees that we cover all the cases won't care. Cc: Lyude Paul <lyude@redhat.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Damien Le Moal authored
commit 2e896d89 upstream. Conventional zones do not have a write pointer and so cannot accept zone append writes. Make sure to fail any zone append write command issued to a conventional zone. Reported-by:
Naohiro Aota <naohiro.aota@wdc.com> Fixes: e0489ed5 ("null_blk: Support REQ_OP_ZONE_APPEND") Cc: stable@vger.kernel.org Signed-off-by:
Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Damien Le Moal authored
commit 0ebcdd70 upstream. For a null_blk device with zoned mode enabled is currently initialized with a number of zones equal to the device capacity divided by the zone size, without considering if the device capacity is a multiple of the zone size. If the zone size is not a divisor of the capacity, the zones end up not covering the entire capacity, potentially resulting is out of bounds accesses to the zone array. Fix this by adding one last smaller zone with a size equal to the remainder of the disk capacity divided by the zone size if the capacity is not a multiple of the zone size. For such smaller last zone, the zone capacity is also checked so that it does not exceed the smaller zone size. Reported-by:
Naohiro Aota <naohiro.aota@wdc.com> Fixes: ca4b2a01 ("null_blk: add zone support") Cc: stable@vger.kernel.org Signed-off-by:
Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Steven Rostedt (VMware) authored
commit adab66b7 upstream. It was believed that metag was the only architecture that required the ring buffer to keep 8 byte words aligned on 8 byte architectures, and with its removal, it was assumed that the ring buffer code did not need to handle this case. It appears that sparc64 also requires this. The following was reported on a sparc64 boot up: kernel: futex hash table entries: 65536 (order: 9, 4194304 bytes, linear) kernel: Running postponed tracer tests: kernel: Testing tracer function: kernel: Kernel unaligned access at TPC[552a20] trace_function+0x40/0x140 kernel: Kernel unaligned access at TPC[552a24] trace_function+0x44/0x140 kernel: Kernel unaligned access at TPC[552a20] trace_function+0x40/0x140 kernel: Kernel unaligned access at TPC[552a24] trace_function+0x44/0x140 kernel: Kernel unaligned access at TPC[552a20] trace_function+0x40/0x140 kernel: PASSED Need to put back the 64BIT aligned code for the ring buffer. Link: https://lore.kernel.org/r/CADxRZqzXQRYgKc=y-KV=S_yHL+Y8Ay2mh5ezeZUnpRvg+syWKw@mail.gmail.com Cc: stable@vger.kernel.org Fixes: 86b3de60 ("ring-buffer: Remove HAVE_64BIT_ALIGNED_ACCESS") Reported-by:
Anatoly Pugachev <matorola@gmail.com> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Nikita Shubin authored
commit 00c33482 upstream. Mismatch in probe platform_set_drvdata set's and method's that call dev_get_platdata will result in "Unable to handle kernel NULL pointer dereference", let's use according method for getting driver data after platform_set_drvdata. 8<--- cut here --- Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = (ptrval) [00000000] *pgd=00000000 Internal error: Oops: 5 [#1] ARM Modules linked in: CPU: 0 PID: 1 Comm: swapper Not tainted 5.9.10-00003-g723e101e0037-dirty #4 Hardware name: Technologic Systems TS-72xx SBC PC is at ep93xx_rtc_read_time+0xc/0x2c LR is at __rtc_read_time+0x4c/0x8c [...] [<c02b01c8>] (ep93xx_rtc_read_time) from [<c02ac38c>] (__rtc_read_time+0x4c/0x8c) [<c02ac38c>] (__rtc_read_time) from [<c02ac3f8>] (rtc_read_time+0x2c/0x4c) [<c02ac3f8>] (rtc_read_time) from [<c02acc54>] (__rtc_read_alarm+0x28/0x358) [<c02acc54>] (__rtc_read_alarm) from [<c02abd80>] (__rtc_register_device+0x124/0x2ec) [<c02abd80>] (__rtc_register_device) from [<c02b028c>] (ep93xx_rtc_probe+0xa4/0xac) [<c02b028c>] (ep93xx_rtc_probe) from [<c026424c>] (platform_drv_probe+0x24/0x5c) [<c026424c>] (platform_drv_probe) from [<c0262918>] (really_probe+0x218/0x374) [<c0262918>] (really_probe) from [<c0262da0>] (device_driver_attach+0x44/0x60) [<c0262da0>] (device_driver_attach) from [<c0262e70>] (__driver_attach+0xb4/0xc0) [<c0262e70>] (__driver_attach) from [<c0260d44>] (bus_for_each_dev+0x68/0xac) [<c0260d44>] (bus_for_each_dev) from [<c026223c>] (driver_attach+0x18/0x24) [<c026223c>] (driver_attach) from [<c0261dd8>] (bus_add_driver+0x150/0x1b4) [<c0261dd8>] (bus_add_driver) from [<c026342c>] (driver_register+0xb0/0xf4) [<c026342c>] (driver_register) from [<c0264210>] (__platform_driver_register+0x30/0x48) [<c0264210>] (__platform_driver_register) from [<c04cb9ac>] (ep93xx_rtc_driver_init+0x10/0x1c) [<c04cb9ac>] (ep93xx_rtc_driver_init) from [<c000973c>] (do_one_initcall+0x7c/0x1c0) [<c000973c>] (do_one_initcall) from [<c04b9ecc>] (kernel_init_freeable+0x168/0x1ac) [<c04b9ecc>] (kernel_init_freeable) from [<c03b2228>] (kernel_init+0x8/0xf4) [<c03b2228>] (kernel_init) from [<c00082c0>] (ret_from_fork+0x14/0x34) Exception stack(0xc441dfb0 to 0xc441dff8) dfa0: 00000000 00000000 00000000 00000000 dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 Code: e12fff1e e92d4010 e590303c e1a02001 (e5933000) ---[ end trace c914d6030eaa95c8 ]--- Fixes: b809d192 ("rtc: ep93xx: stop setting platform_data") Signed-off-by:
Nikita Shubin <nikita.shubin@maquefel.me> Signed-off-by:
Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20201201095507.10317-1-nikita.shubin@maquefel.me Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Zhuguangqing authored
commit 236761f1 upstream. If state has not changed successfully and we updated cpufreq_state, next time when the new state is equal to cpufreq_state (not changed successfully last time), we will return directly and miss a freq_qos_update_request() that should have been. Fixes: 5130802d ("thermal: cpu_cooling: Switch to QoS requests for freq limits") Cc: v5.4+ <stable@vger.kernel.org> # v5.4+ Signed-off-by:
Zhuguangqing <zhuguangqing@xiaomi.com> Acked-by:
Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by:
Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20201106092243.15574-1-zhuguangqing83@gmail.com Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Bjorn Andersson authored
commit 138a6428 upstream. The reliance on the remoteproc's state for determining when to send sysmon notifications to a remote processor is racy with regard to concurrent remoteproc operations. Further more the advertisement of the state of other remote processor to a newly started remote processor might not only send the wrong state, but might result in a stream of state changes that are out of order. Address this by introducing state tracking within the sysmon instances themselves and extend the locking to ensure that the notifications are consistent with this state. Fixes: 1f36ab3f ("remoteproc: sysmon: Inform current rproc about all active rprocs") Fixes: 1877f54f ("remoteproc: sysmon: Add notifications for events") Fixes: 1fb82ee8 ("remoteproc: qcom: Introduce sysmon") Cc: stable@vger.kernel.org Reviewed-by:
Rishabh Bhatnagar <rishabhb@codeaurora.org> Link: https://lore.kernel.org/r/20201122054135.802935-2-bjorn.andersson@linaro.org Signed-off-by:
Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
DingHua Ma authored
commit 291de1d1 upstream. When I use the axp20x chip to power my SDIO device on the 5.4 kernel, the output voltage of DLDO2 is wrong. After comparing the register manual and source code of the chip, I found that the mask bit of the driver register of the port was wrong. I fixed this error by modifying the mask register of the source code. This error seems to be a copy error of the macro when writing the code. Now the voltage output of the DLDO2 port of axp20x is correct. My development environment is Allwinner A40I of arm architecture, and the kernel version is 5.4. Signed-off-by:
DingHua Ma <dinghua.ma.sz@gmail.com> Reviewed-by:
Chen-Yu Tsai <wens@csie.org> Cc: <stable@vger.kernel.org> Fixes: db4a555f ("regulator: axp20x: use defines for masks") Link: https://lore.kernel.org/r/20201201001000.22302-1-dinghua.ma.sz@gmail.com Signed-off-by:
Mark Brown <broonie@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jubin Zhong authored
commit 4684709b upstream. If kobject_init_and_add() fails, pci_slot_release() is called to delete slot->list from parent->slots. But slot->list hasn't been initialized yet, so we dereference a NULL pointer: Unable to handle kernel NULL pointer dereference at virtual address 00000000 ... CPU: 10 PID: 1 Comm: swapper/0 Not tainted 4.4.240 #197 task: ffffeb398a45ef10 task.stack: ffffeb398a470000 PC is at __list_del_entry_valid+0x5c/0xb0 LR is at pci_slot_release+0x84/0xe4 ... __list_del_entry_valid+0x5c/0xb0 pci_slot_release+0x84/0xe4 kobject_put+0x184/0x1c4 pci_create_slot+0x17c/0x1b4 __pci_hp_initialize+0x68/0xa4 pciehp_probe+0x1a4/0x2fc pcie_port_probe_service+0x58/0x84 driver_probe_device+0x320/0x470 Initialize slot->list before calling kobject_init_and_add() to avoid this. Fixes: 8a94644b ("PCI: Fix pci_create_slot() reference count leak") Link: https://lore.kernel.org/r/1606876422-117457-1-git-send-email-zhongjubin@huawei.com Signed-off-by:
Jubin Zhong <zhongjubin@huawei.com> Signed-off-by:
Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v5.9+ Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Johan Hovold authored
commit 5812b32e upstream. Specify type alignment when declaring linker-section match-table entries to prevent gcc from increasing alignment and corrupting the various tables with padding (e.g. timers, irqchips, clocks, reserved memory). This is specifically needed on x86 where gcc (typically) aligns larger objects like struct of_device_id with static extent on 32-byte boundaries which at best prevents matching on anything but the first entry. Specifying alignment when declaring variables suppresses this optimisation. Here's a 64-bit example where all entries are corrupt as 16 bytes of padding has been inserted before the first entry: ffffffff8266b4b0 D __clk_of_table ffffffff8266b4c0 d __of_table_fixed_factor_clk ffffffff8266b5a0 d __of_table_fixed_clk ffffffff8266b680 d __clk_of_table_sentinel And here's a 32-bit example where the 8-byte-aligned table happens to be placed on a 32-byte boundary so that all but the first entry are corrupt due to the 28 bytes of padding inserted between entries: 812b3ec0 D __irqchip_of_table 812b3ec0 d __of_table_irqchip1 812b3fa0 d __of_table_irqchip2 812b4080 d __of_table_irqchip3 812b4160 d irqchip_of_match_end Verified on x86 using gcc-9.3 and gcc-4.9 (which uses 64-byte alignment), and on arm using gcc-7.2. Note that there are no in-tree users of these tables on x86 currently (even if they are included in the image). Fixes: 54196ccb ("of: consolidate linker section OF match table declarations") Fixes: f6e916b8 ("irqchip: add basic infrastructure") Cc: stable <stable@vger.kernel.org> # 3.9 Signed-off-by:
Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20201123102319.8090-2-johan@kernel.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Felix Fietkau authored
commit ed89b893 upstream. It was accidentally dropped while adding multiple wiphy support Fixes fast-rx support and avoids handling reordering in both mac80211 and the driver Cc: stable@vger.kernel.org Fixes: c89d3625 ("mt76: add function for allocating an extra wiphy") Signed-off-by:
Felix Fietkau <nbd@nbd.name> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Masami Hiramatsu authored
commit 60efe21e upstream. Disable ftrace selftests when any tracer (kernel command line options like ftrace=, trace_events=, kprobe_events=, and boot-time tracing) starts running because selftest can disturb it. Currently ftrace= and trace_events= are checked, but kprobe_events has a different flag, and boot-time tracing didn't checked. This unifies the disabled flag and all of those boot-time tracing features sets the flag. This also fixes warnings on kprobe-event selftest (CONFIG_FTRACE_STARTUP_TEST=y and CONFIG_KPROBE_EVENTS=y) with boot-time tracing (ftrace.event.kprobes.EVENT.probes) like below; [ 59.803496] trace_kprobe: Testing kprobe tracing: [ 59.804258] ------------[ cut here ]------------ [ 59.805682] WARNING: CPU: 3 PID: 1 at kernel/trace/trace_kprobe.c:1987 kprobe_trace_self_tests_ib [ 59.806944] Modules linked in: [ 59.807335] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.10.0-rc7+ #172 [ 59.808029] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 04/01/204 [ 59.808999] RIP: 0010:kprobe_trace_self_tests_init+0x5f/0x42b [ 59.809696] Code: e8 03 00 00 48 c7 c7 30 8e 07 82 e8 6d 3c 46 ff 48 c7 c6 00 b2 1a 81 48 c7 c7 7 [ 59.812439] RSP: 0018:ffffc90000013e78 EFLAGS: 00010282 [ 59.813038] RAX: 00000000ffffffef RBX: 0000000000000000 RCX: 0000000000049443 [ 59.813780] RDX: 0000000000049403 RSI: 0000000000049403 RDI: 000000000002deb0 [ 59.814589] RBP: ffffc90000013e90 R08: 0000000000000001 R09: 0000000000000001 [ 59.815349] R10: 0000000000000001 R11: 0000000000000000 R12: 00000000ffffffef [ 59.816138] R13: ffff888004613d80 R14: ffffffff82696940 R15: ffff888004429138 [ 59.816877] FS: 0000000000000000(0000) GS:ffff88807dcc0000(0000) knlGS:0000000000000000 [ 59.817772] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 59.818395] CR2: 0000000001a8dd38 CR3: 0000000002222000 CR4: 00000000000006a0 [ 59.819144] Call Trace: [ 59.819469] ? init_kprobe_trace+0x6b/0x6b [ 59.819948] do_one_initcall+0x5f/0x300 [ 59.820392] ? rcu_read_lock_sched_held+0x4f/0x80 [ 59.820916] kernel_init_freeable+0x22a/0x271 [ 59.821416] ? rest_init+0x241/0x241 [ 59.821841] kernel_init+0xe/0x10f [ 59.822251] ret_from_fork+0x22/0x30 [ 59.822683] irq event stamp: 16403349 [ 59.823121] hardirqs last enabled at (16403359): [<ffffffff810db81e>] console_unlock+0x48e/0x580 [ 59.824074] hardirqs last disabled at (16403368): [<ffffffff810db786>] console_unlock+0x3f6/0x580 [ 59.825036] softirqs last enabled at (16403200): [<ffffffff81c0033a>] __do_softirq+0x33a/0x484 [ 59.825982] softirqs last disabled at (16403087): [<ffffffff81a00f02>] asm_call_irq_on_stack+0x10 [ 59.827034] ---[ end trace 200c544775cdfeb3 ]--- [ 59.827635] trace_kprobe: error on probing function entry. Link: https://lkml.kernel.org/r/160741764955.3448999.3347769358299456915.stgit@devnote2 Fixes: 4d655281 ("tracing/boot Add kprobe event support") Cc: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Signed-off-by:
Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Carlos Garnacho authored
commit fe600099 upstream. This 2-in-1 model (Product name: Switch SA5-271) features a SW_TABLET_MODE that works as it would be expected, both when detaching the keyboard and when folding it behind the tablet body. It used to work until the introduction of the allow list at commit 8169bd3e ("platform/x86: intel-vbtn: Switch to an allow-list for SW_TABLET_MODE reporting"). Add this model to it, so that the Virtual Buttons device announces the EV_SW features again. Fixes: 8169bd3e ("platform/x86: intel-vbtn: Switch to an allow-list for SW_TABLET_MODE reporting") Cc: stable@vger.kernel.org Signed-off-by:
Carlos Garnacho <carlosg@gnome.org> Link: https://lore.kernel.org/r/20201201135727.212917-1-carlosg@gnome.org Signed-off-by:
Hans de Goede <hdegoede@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dan Williams authored
commit 2dd2a174 upstream. A recent change to ndctl to attempt to reconfigure namespaces in place uncovered a label accounting problem in block-window-type namespaces. The ndctl "create.sh" test is able to trigger this signature: WARNING: CPU: 34 PID: 9167 at drivers/nvdimm/label.c:1100 __blk_label_update+0x9a3/0xbc0 [libnvdimm] [..] RIP: 0010:__blk_label_update+0x9a3/0xbc0 [libnvdimm] [..] Call Trace: uuid_store+0x21b/0x2f0 [libnvdimm] kernfs_fop_write+0xcf/0x1c0 vfs_write+0xcc/0x380 ksys_write+0x68/0xe0 When allocated capacity for a namespace is renamed (new UUID) the labels with the old UUID need to be deleted. The ndctl behavior to always destroy namespaces on reconfiguration hid this problem. The immediate impact of this bug is limited since block-window-type namespaces only seem to exist in the specification and not in any shipping products. However, the label handling code is being reused for other technologies like CXL region labels, so there is a benefit to making sure both vertical labels sets (block-window) and horizontal label sets (pmem) have a functional reference implementation in libnvdimm. Fixes: c4703ce1 ("libnvdimm/namespace: Fix label tracking error") Cc: <stable@vger.kernel.org> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Signed-off-by:
Dan Williams <dan.j.williams@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Lad Prabhakar authored
commit 61a6d854 upstream. rpcif_enable_rpm calls pm_runtime_enable, so rpcif_disable_rpm needs to call pm_runtime_disable and not pm_runtime_put_sync. Fixes: ca7d8b98 ("memory: add Renesas RPC-IF driver") Reported-by:
Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by:
Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by:
Sergei Shtylyov <sergei.shtylyov@gmail.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20201126191146.8753-3-prabhakar.mahadev-lad.rj@bp.renesas.com Signed-off-by:
Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Lad Prabhakar authored
commit a0453f4e upstream. In the error path of rpcif_manual_xfer() the value of ret is overwritten by value returned by reset_control_reset() function and thus returning incorrect value to the caller. This patch makes sure the correct value is returned to the caller of rpcif_manual_xfer() by dropping the overwrite of ret in error path. Also now we ignore the value returned by reset_control_reset() in the error path and instead print a error message when it fails. Fixes: ca7d8b98 ("memory: add Renesas RPC-IF driver") Reported-by:
Pavel Machek <pavel@denx.de> Signed-off-by:
Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by:
Sergei Shtylyov <sergei.shtylyov@gmail.com> Reviewed-by:
Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by:
Pavel Machek (CIP) <pavel@denx.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20201126191146.8753-2-prabhakar.mahadev-lad.rj@bp.renesas.com Signed-off-by:
Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Lad Prabhakar authored
commit 4e6b86b4 upstream. Release the node reference by calling of_node_put(flash) in the probe. Fixes: ca7d8b98 ("memory: add Renesas RPC-IF driver") Reported-by:
Pavel Machek <pavel@denx.de> Signed-off-by:
Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by:
Sergei Shtylyov <sergei.shtylyov@gmail.com> Reviewed-by:
Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by:
Pavel Machek (CIP) <pavel@denx.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20201126191146.8753-4-prabhakar.mahadev-lad.rj@bp.renesas.com Signed-off-by:
Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dan Carpenter authored
commit 96999c79 upstream. The devm_ioremap() function returns NULL on error, it doesn't return error pointers. This bug could lead to an Oops during probe. Fixes: f046e4a3 ("memory: jz4780_nemc: Only request IO memory the driver will use") Cc: <stable@vger.kernel.org> Signed-off-by:
Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by:
Paul Cercueil <paul@crapouillou.net> Link: https://lore.kernel.org/r/20200803143607.GC346925@mwanda Signed-off-by:
Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
SeongJae Park authored
commit 9996bd49 upstream. 'xenbus_backend' watches 'state' of devices, which is writable by guests. Hence, if guests intensively updates it, dom0 will have lots of pending events that exhausting memory of dom0. In other words, guests can trigger dom0 memory pressure. This is known as XSA-349. However, the watch callback of it, 'frontend_changed()', reads only 'state', so doesn't need to have the pending events. To avoid the problem, this commit disallows pending watch messages for 'xenbus_backend' using the 'will_handle()' watch callback. This is part of XSA-349 Cc: stable@vger.kernel.org Signed-off-by:
SeongJae Park <sjpark@amazon.de> Reported-by:
Michael Kurth <mku@amazon.de> Reported-by:
Pawel Wieczorkiewicz <wipawel@amazon.de> Reviewed-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
SeongJae Park authored
commit 3dc86ca6 upstream. This commit adds a counter of pending messages for each watch in the struct. It is used to skip unnecessary pending messages lookup in 'unregister_xenbus_watch()'. It could also be used in 'will_handle' callback. This is part of XSA-349 Cc: stable@vger.kernel.org Signed-off-by:
SeongJae Park <sjpark@amazon.de> Reported-by:
Michael Kurth <mku@amazon.de> Reported-by:
Pawel Wieczorkiewicz <wipawel@amazon.de> Reviewed-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
SeongJae Park authored
commit be987200 upstream. This commit adds support of the 'will_handle' watch callback for 'xen_bus_type' users. This is part of XSA-349 Cc: stable@vger.kernel.org Signed-off-by:
SeongJae Park <sjpark@amazon.de> Reported-by:
Michael Kurth <mku@amazon.de> Reported-by:
Pawel Wieczorkiewicz <wipawel@amazon.de> Reviewed-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
SeongJae Park authored
commit 2e85d32b upstream. Some code does not directly make 'xenbus_watch' object and call 'register_xenbus_watch()' but use 'xenbus_watch_path()' instead. This commit adds support of 'will_handle' callback in the 'xenbus_watch_path()' and it's wrapper, 'xenbus_watch_pathfmt()'. This is part of XSA-349 Cc: stable@vger.kernel.org Signed-off-by:
SeongJae Park <sjpark@amazon.de> Reported-by:
Michael Kurth <mku@amazon.de> Reported-by:
Pawel Wieczorkiewicz <wipawel@amazon.de> Reviewed-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
SeongJae Park authored
commit fed1755b upstream. If handling logics of watch events are slower than the events enqueue logic and the events can be created from the guests, the guests could trigger memory pressure by intensively inducing the events, because it will create a huge number of pending events that exhausting the memory. Fortunately, some watch events could be ignored, depending on its handler callback. For example, if the callback has interest in only one single path, the watch wouldn't want multiple pending events. Or, some watches could ignore events to same path. To let such watches to volutarily help avoiding the memory pressure situation, this commit introduces new watch callback, 'will_handle'. If it is not NULL, it will be called for each new event just before enqueuing it. Then, if the callback returns false, the event will be discarded. No watch is using the callback for now, though. This is part of XSA-349 Cc: stable@vger.kernel.org Signed-off-by:
SeongJae Park <sjpark@amazon.de> Reported-by:
Michael Kurth <mku@amazon.de> Reported-by:
Pawel Wieczorkiewicz <wipawel@amazon.de> Reviewed-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-