  Nov 18, 2021
    • PCI: aardvark: Fix PCIe Max Payload Size setting · c7a440cd
      Pali Rohár authored
      commit a4e17d65 upstream.
      
Change the PCIe Max Payload Size setting in the PCIe Device Control register to
512 bytes to align with the PCIe Link Initialization sequence as defined in the
Marvell Armada 3700 Functional Specification. According to the specification,
the maximal Max Payload Size supported by this device is 512 bytes.
      
Without this, the kernel prints a suspicious line:
      
          pci 0000:01:00.0: Upstream bridge's Max Payload Size set to 256 (was 16384, max 512)
      
With this change, it becomes:
      
          pci 0000:01:00.0: Upstream bridge's Max Payload Size set to 256 (was 512, max 512)
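
For illustration, the change amounts to programming the payload-size field of
the Device Control register during link setup; a minimal sketch, assuming the
aardvark register accessors and the PCI_EXP_DEVCTL_PAYLOAD_* macro introduced
by the companion patch below:

    /* Sketch: force Max Payload Size to 512 bytes during setup. */
    reg = advk_readl(pcie, PCIE_CORE_PCIEXP_CAP + PCI_EXP_DEVCTL);
    reg &= ~PCI_EXP_DEVCTL_PAYLOAD;		/* clear MPS field (bits 7:5) */
    reg |= PCI_EXP_DEVCTL_PAYLOAD_512B;	/* 512-byte Max Payload Size */
    advk_writel(pcie, reg, PCIE_CORE_PCIEXP_CAP + PCI_EXP_DEVCTL);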
      
      Link: https://lore.kernel.org/r/20211005180952.6812-3-kabel@kernel.org
      
      
      Fixes: 8c39d710 ("PCI: aardvark: Add Aardvark PCI host controller driver")
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Marek Behún <kabel@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • PCI: Add PCI_EXP_DEVCTL_PAYLOAD_* macros · f967d120
      Pali Rohár authored
      commit 460275f1 upstream.
      
      Define a macro PCI_EXP_DEVCTL_PAYLOAD_* for every possible Max Payload
      Size in linux/pci_regs.h, in the same style as PCI_EXP_DEVCTL_READRQ_*.
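
Per the upstream commit, the new definitions (values encode bits 7:5 of the
Device Control register) are:

    #define  PCI_EXP_DEVCTL_PAYLOAD_128B  0x0000 /* 128 Bytes */
    #define  PCI_EXP_DEVCTL_PAYLOAD_256B  0x0020 /* 256 Bytes */
    #define  PCI_EXP_DEVCTL_PAYLOAD_512B  0x0040 /* 512 Bytes */
    #define  PCI_EXP_DEVCTL_PAYLOAD_1024B 0x0060 /* 1024 Bytes */
    #define  PCI_EXP_DEVCTL_PAYLOAD_2048B 0x0080 /* 2048 Bytes */
    #define  PCI_EXP_DEVCTL_PAYLOAD_4096B 0x00a0 /* 4096 Bytes */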
      
      Link: https://lore.kernel.org/r/20211005180952.6812-2-kabel@kernel.org
      
      
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • drm/sun4i: Fix macros in sun8i_csc.h · f3396f6d
      Jernej Skrabec authored
      
      commit c302c98d upstream.
      
Macros SUN8I_CSC_CTRL() and SUN8I_CSC_COEFF() don't follow the usual
recommendation of having their arguments enclosed in parentheses. While that
didn't change anything for quite some time, it actually became important
after the CSC code rework in commit ea067aee ("drm/sun4i: de2/de3:
Remove redundant CSC matrices").
      
      Without this fix, colours are completely off for supported YVU formats
      on SoCs with DE2 (A64, H3, R40, etc.).
      
Fix the issue by enclosing the macro arguments in parentheses.
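
Illustratively, the difference looks like this (the offsets are placeholders,
not the real register map):

    /* Before: arguments are substituted unparenthesized, so an argument
     * such as 'base + off' breaks operator precedence after expansion. */
    #define SUN8I_CSC_COEFF(base, i)	(base + 0x10 + 4 * i)

    /* After: every argument is wrapped in parentheses. */
    #define SUN8I_CSC_COEFF(base, i)	((base) + 0x10 + 4 * (i))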
      
      Cc: stable@vger.kernel.org # 5.12+
      Fixes: 88302939 ("drm/sun4i: Add DE2 CSC library")
Reported-by: Roman Stratiienko <r.stratiienko@gmail.com>
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210831184819.93670-1-jernej.skrabec@gmail.com

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/85xx: fix timebase sync issue when CONFIG_HOTPLUG_CPU=n · 10233552
      Xiaoming Ni authored
      
      commit c45361ab upstream.
      
      When CONFIG_SMP=y, timebase synchronization is required when the second
      kernel is started.
      
      arch/powerpc/kernel/smp.c:
        int __cpu_up(unsigned int cpu, struct task_struct *tidle)
        {
        	...
        	if (smp_ops->give_timebase)
        		smp_ops->give_timebase();
        	...
        }
      
        void start_secondary(void *unused)
        {
        	...
        	if (smp_ops->take_timebase)
        		smp_ops->take_timebase();
        	...
        }
      
When CONFIG_HOTPLUG_CPU=n and CONFIG_KEXEC_CORE=n, both
smp_85xx_ops.give_timebase and smp_85xx_ops.take_timebase are NULL,
so the timebase is not synchronized.

Timebase synchronization does not depend on CONFIG_HOTPLUG_CPU.
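
A minimal sketch of the fix, assuming the ops assignments simply move out of
the CONFIG_HOTPLUG_CPU block in mpc85xx_smp_init():

  void __init mpc85xx_smp_init(void)
  {
  	...
  	/* Needed whenever CONFIG_SMP=y, not only for CPU hotplug. */
  	smp_85xx_ops.give_timebase = smp_generic_give_timebase;
  	smp_85xx_ops.take_timebase = smp_generic_take_timebase;
  #ifdef CONFIG_HOTPLUG_CPU
  	ppc_md.cpu_die = smp_85xx_mach_cpu_die;
  #endif
  	...
  }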
      
      Fixes: 56f1ba28 ("powerpc/mpc85xx: refactor the PM operations")
      Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210929033646.39630-3-nixiaoming@huawei.com

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/powernv/prd: Unregister OPAL_MSG_PRD2 notifier during module unload · 77d543e6
      Vasant Hegde authored
      
      commit 52862ab3 upstream.
      
Commit 587164cd introduced a new OPAL message type (OPAL_MSG_PRD2) and
added an OPAL notifier for it, but I missed unregistering the notifier on
the module unload path. This results in the call trace below when the
opal_prd module is unloaded and loaded again.

Also add a new notifier_block for the OPAL_MSG_PRD2 message.
      
      Sample calltrace (modprobe -r opal_prd; modprobe opal_prd)
        BUG: Unable to handle kernel data access on read at 0xc0080000192200e0
        Faulting instruction address: 0xc00000000018d1cc
        Oops: Kernel access of bad area, sig: 11 [#1]
        LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
        CPU: 66 PID: 7446 Comm: modprobe Kdump: loaded Tainted: G            E     5.14.0prd #759
        NIP:  c00000000018d1cc LR: c00000000018d2a8 CTR: c0000000000cde10
        REGS: c0000003c4c0f0a0 TRAP: 0300   Tainted: G            E      (5.14.0prd)
        MSR:  9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE>  CR: 24224824  XER: 20040000
        CFAR: c00000000018d2a4 DAR: c0080000192200e0 DSISR: 40000000 IRQMASK: 1
        ...
        NIP notifier_chain_register+0x2c/0xc0
        LR  atomic_notifier_chain_register+0x48/0x80
        Call Trace:
          0xc000000002090610 (unreliable)
          atomic_notifier_chain_register+0x58/0x80
          opal_message_notifier_register+0x7c/0x1e0
          opal_prd_probe+0x84/0x150 [opal_prd]
          platform_probe+0x78/0x130
          really_probe+0x110/0x5d0
          __driver_probe_device+0x17c/0x230
          driver_probe_device+0x60/0x130
          __driver_attach+0xfc/0x220
          bus_for_each_dev+0xa8/0x130
          driver_attach+0x34/0x50
          bus_add_driver+0x1b0/0x300
          driver_register+0x98/0x1a0
          __platform_driver_register+0x38/0x50
          opal_prd_driver_init+0x34/0x50 [opal_prd]
          do_one_initcall+0x60/0x2d0
          do_init_module+0x7c/0x320
          load_module+0x3394/0x3650
          __do_sys_finit_module+0xd4/0x160
          system_call_exception+0x140/0x290
          system_call_common+0xf4/0x258
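
A sketch of the matching teardown; the second notifier_block name
(opal_prd_event_nb2) is an assumption here:

  static int opal_prd_remove(struct platform_device *pdev)
  {
  	misc_deregister(&opal_prd_dev);
  	opal_message_notifier_unregister(OPAL_MSG_PRD, &opal_prd_event_nb);
  	opal_message_notifier_unregister(OPAL_MSG_PRD2, &opal_prd_event_nb2);
  	return 0;
  }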
      
      Fixes: 587164cd ("powerpc/powernv: Add new opal message type")
      Cc: stable@vger.kernel.org # v5.4+
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211028165716.41300-1-hegdevasant@linux.vnet.ibm.com

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mtd: rawnand: au1550nd: Keep the driver compatible with on-die ECC engines · 9dcdadd6
      Miquel Raynal authored
      
      commit 7e3cdba1 upstream.
      
Following the introduction of the generic ECC engine infrastructure, it
was necessary to reorganize the code and move the ECC configuration to
the ->attach_chip() hook. Failing to do that properly led to a first
series of fixes supposed to stabilize the situation. Unfortunately, this
only fixed the use of software ECC engines, preventing any other kind of
engine from being used, including on-die ones.

It is now time to (finally) fix the situation by ensuring that we still
provide a default (e.g. software ECC) while still supporting different
ECC engines such as on-die ECC engines if properly described in the
device tree.

There are no changes needed on the core side in order to do this; we
just need to leverage the logic there, which allows:
1- a subsystem default (set to Host engines in the raw NAND world)
2- a driver-specific default (here set to software ECC engines)
3- any type of engine requested by the user (i.e. described in the DT)

As the raw NAND subsystem has not yet been fully converted to the ECC
engine infrastructure, in order to provide a default ECC engine for this
driver we need to set chip->ecc.engine_type *before* calling
nand_scan(). During the initialization step, the core will consider this
entry as the default engine for this driver. This value may of course
be overridden by the user if the usual DT properties are provided.
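
Concretely, the pattern (shared by every driver fix in this series) is a
sketch along these lines in the driver's probe path:

    /* Driver-specific default; overridden by the usual ECC properties
     * if they are present in the device tree. */
    chip->ecc.engine_type = NAND_ECC_ENGINE_TYPE_SOFT;
    chip->ecc.algo = NAND_ECC_ALGO_HAMMING;
    ret = nand_scan(chip, 1);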
      
      Fixes: dbffc8cc ("mtd: rawnand: au1550: Move the ECC initialization to ->attach_chip()")
      Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210928222258.199726-3-miquel.raynal@bootlin.com

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mtd: rawnand: plat_nand: Keep the driver compatible with on-die ECC engines · 51e34fcf
      Miquel Raynal authored
      
      commit 325fd539 upstream.
      
Following the introduction of the generic ECC engine infrastructure, it
was necessary to reorganize the code and move the ECC configuration to
the ->attach_chip() hook. Failing to do that properly led to a first
series of fixes supposed to stabilize the situation. Unfortunately, this
only fixed the use of software ECC engines, preventing any other kind of
engine from being used, including on-die ones.

It is now time to (finally) fix the situation by ensuring that we still
provide a default (e.g. software ECC) while still supporting different
ECC engines such as on-die ECC engines if properly described in the
device tree.

There are no changes needed on the core side in order to do this; we
just need to leverage the logic there, which allows:
1- a subsystem default (set to Host engines in the raw NAND world)
2- a driver-specific default (here set to software ECC engines)
3- any type of engine requested by the user (i.e. described in the DT)

As the raw NAND subsystem has not yet been fully converted to the ECC
engine infrastructure, in order to provide a default ECC engine for this
driver we need to set chip->ecc.engine_type *before* calling
nand_scan(). During the initialization step, the core will consider this
entry as the default engine for this driver. This value may of course
be overridden by the user if the usual DT properties are provided.
      
      Fixes: 612e048e ("mtd: rawnand: plat_nand: Move the ECC initialization to ->attach_chip()")
      Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210928222258.199726-8-miquel.raynal@bootlin.com

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mtd: rawnand: orion: Keep the driver compatible with on-die ECC engines · e1de04df
      Miquel Raynal authored
      
      commit 194ac63d upstream.
      
Following the introduction of the generic ECC engine infrastructure, it
was necessary to reorganize the code and move the ECC configuration to
the ->attach_chip() hook. Failing to do that properly led to a first
series of fixes supposed to stabilize the situation. Unfortunately, this
only fixed the use of software ECC engines, preventing any other kind of
engine from being used, including on-die ones.

It is now time to (finally) fix the situation by ensuring that we still
provide a default (e.g. software ECC) while still supporting different
ECC engines such as on-die ECC engines if properly described in the
device tree.

There are no changes needed on the core side in order to do this; we
just need to leverage the logic there, which allows:
1- a subsystem default (set to Host engines in the raw NAND world)
2- a driver-specific default (here set to software ECC engines)
3- any type of engine requested by the user (i.e. described in the DT)

As the raw NAND subsystem has not yet been fully converted to the ECC
engine infrastructure, in order to provide a default ECC engine for this
driver we need to set chip->ecc.engine_type *before* calling
nand_scan(). During the initialization step, the core will consider this
entry as the default engine for this driver. This value may of course
be overridden by the user if the usual DT properties are provided.
      
      Fixes: 553508ce ("mtd: rawnand: orion: Move the ECC initialization to ->attach_chip()")
      Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210928222258.199726-6-miquel.raynal@bootlin.com

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mtd: rawnand: pasemi: Keep the driver compatible with on-die ECC engines · b4e2e9fb
      Miquel Raynal authored
      
      commit f16b7d2a upstream.
      
Following the introduction of the generic ECC engine infrastructure, it
was necessary to reorganize the code and move the ECC configuration to
the ->attach_chip() hook. Failing to do that properly led to a first
series of fixes supposed to stabilize the situation. Unfortunately, this
only fixed the use of software ECC engines, preventing any other kind of
engine from being used, including on-die ones.

It is now time to (finally) fix the situation by ensuring that we still
provide a default (e.g. software ECC) while still supporting different
ECC engines such as on-die ECC engines if properly described in the
device tree.

There are no changes needed on the core side in order to do this; we
just need to leverage the logic there, which allows:
1- a subsystem default (set to Host engines in the raw NAND world)
2- a driver-specific default (here set to software ECC engines)
3- any type of engine requested by the user (i.e. described in the DT)

As the raw NAND subsystem has not yet been fully converted to the ECC
engine infrastructure, in order to provide a default ECC engine for this
driver we need to set chip->ecc.engine_type *before* calling
nand_scan(). During the initialization step, the core will consider this
entry as the default engine for this driver. This value may of course
be overridden by the user if the usual DT properties are provided.
      
      Fixes: 8fc6f1f0 ("mtd: rawnand: pasemi: Move the ECC initialization to ->attach_chip()")
      Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210928222258.199726-7-miquel.raynal@bootlin.com

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mtd: rawnand: gpio: Keep the driver compatible with on-die ECC engines · 963db3cc
      Miquel Raynal authored
      
      commit b5b5b4dc upstream.
      
Following the introduction of the generic ECC engine infrastructure, it
was necessary to reorganize the code and move the ECC configuration to
the ->attach_chip() hook. Failing to do that properly led to a first
series of fixes supposed to stabilize the situation. Unfortunately, this
only fixed the use of software ECC engines, preventing any other kind of
engine from being used, including on-die ones.

It is now time to (finally) fix the situation by ensuring that we still
provide a default (e.g. software ECC) while still supporting different
ECC engines such as on-die ECC engines if properly described in the
device tree.

There are no changes needed on the core side in order to do this; we
just need to leverage the logic there, which allows:
1- a subsystem default (set to Host engines in the raw NAND world)
2- a driver-specific default (here set to software ECC engines)
3- any type of engine requested by the user (i.e. described in the DT)

As the raw NAND subsystem has not yet been fully converted to the ECC
engine infrastructure, in order to provide a default ECC engine for this
driver we need to set chip->ecc.engine_type *before* calling
nand_scan(). During the initialization step, the core will consider this
entry as the default engine for this driver. This value may of course
be overridden by the user if the usual DT properties are provided.
      
      Fixes: f6341f64 ("mtd: rawnand: gpio: Move the ECC initialization to ->attach_chip()")
      Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210928222258.199726-4-miquel.raynal@bootlin.com

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mtd: rawnand: mpc5121: Keep the driver compatible with on-die ECC engines · 13566bc1
      Miquel Raynal authored
      
      commit f9d8570b upstream.
      
Following the introduction of the generic ECC engine infrastructure, it
was necessary to reorganize the code and move the ECC configuration to
the ->attach_chip() hook. Failing to do that properly led to a first
series of fixes supposed to stabilize the situation. Unfortunately, this
only fixed the use of software ECC engines, preventing any other kind of
engine from being used, including on-die ones.

It is now time to (finally) fix the situation by ensuring that we still
provide a default (e.g. software ECC) while still supporting different
ECC engines such as on-die ECC engines if properly described in the
device tree.

There are no changes needed on the core side in order to do this; we
just need to leverage the logic there, which allows:
1- a subsystem default (set to Host engines in the raw NAND world)
2- a driver-specific default (here set to software ECC engines)
3- any type of engine requested by the user (i.e. described in the DT)

As the raw NAND subsystem has not yet been fully converted to the ECC
engine infrastructure, in order to provide a default ECC engine for this
driver we need to set chip->ecc.engine_type *before* calling
nand_scan(). During the initialization step, the core will consider this
entry as the default engine for this driver. This value may of course
be overridden by the user if the usual DT properties are provided.
      
      Fixes: 6dd09f77 ("mtd: rawnand: mpc5121: Move the ECC initialization to ->attach_chip()")
      Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210928222258.199726-5-miquel.raynal@bootlin.com

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mtd: rawnand: xway: Keep the driver compatible with on-die ECC engines · 9b366f52
      Miquel Raynal authored
      
      commit 6bcd2960 upstream.
      
Following the introduction of the generic ECC engine infrastructure, it
was necessary to reorganize the code and move the ECC configuration to
the ->attach_chip() hook. Failing to do that properly led to a first
series of fixes supposed to stabilize the situation. Unfortunately, this
only fixed the use of software ECC engines, preventing any other kind of
engine from being used, including on-die ones.

It is now time to (finally) fix the situation by ensuring that we still
provide a default (e.g. software ECC) while still supporting different
ECC engines such as on-die ECC engines if properly described in the
device tree.

There are no changes needed on the core side in order to do this; we
just need to leverage the logic there, which allows:
1- a subsystem default (set to Host engines in the raw NAND world)
2- a driver-specific default (here set to software ECC engines)
3- any type of engine requested by the user (i.e. described in the DT)

As the raw NAND subsystem has not yet been fully converted to the ECC
engine infrastructure, in order to provide a default ECC engine for this
driver we need to set chip->ecc.engine_type *before* calling
nand_scan(). During the initialization step, the core will consider this
entry as the default engine for this driver. This value may of course
be overridden by the user if the usual DT properties are provided.
      
      Fixes: d525914b ("mtd: rawnand: xway: Move the ECC initialization to ->attach_chip()")
      Cc: stable@vger.kernel.org
      Cc: Jan Hoffmann <jan@3e8.eu>
      Cc: Kestrel seventyfour <kestrelseventyfour@gmail.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Tested-by: Jan Hoffmann <jan@3e8.eu>
Link: https://lore.kernel.org/linux-mtd/20210928222258.199726-10-miquel.raynal@bootlin.com

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mtd: rawnand: ams-delta: Keep the driver compatible with on-die ECC engines · cbc55cf4
      Miquel Raynal authored
      
      commit d707bb74 upstream.
      
Following the introduction of the generic ECC engine infrastructure, it
was necessary to reorganize the code and move the ECC configuration to
the ->attach_chip() hook. Failing to do that properly led to a first
series of fixes supposed to stabilize the situation. Unfortunately, this
only fixed the use of software ECC engines, preventing any other kind of
engine from being used, including on-die ones.

It is now time to (finally) fix the situation by ensuring that we still
provide a default (e.g. software ECC) while still supporting different
ECC engines such as on-die ECC engines if properly described in the
device tree.

There are no changes needed on the core side in order to do this; we
just need to leverage the logic there, which allows:
1- a subsystem default (set to Host engines in the raw NAND world)
2- a driver-specific default (here set to software ECC engines)
3- any type of engine requested by the user (i.e. described in the DT)

As the raw NAND subsystem has not yet been fully converted to the ECC
engine infrastructure, in order to provide a default ECC engine for this
driver we need to set chip->ecc.engine_type *before* calling
nand_scan(). During the initialization step, the core will consider this
entry as the default engine for this driver. This value may of course
be overridden by the user if the usual DT properties are provided.
      
      Fixes: 59d93473 ("mtd: rawnand: ams-delta: Move the ECC initialization to ->attach_chip()")
      Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210928222258.199726-2-miquel.raynal@bootlin.com

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • s390/cio: make ccw_device_dma_* more robust · 1f420818
      Halil Pasic authored
      
      commit ad9a1451 upstream.
      
Since commit 48720ba5 ("virtio/s390: use DMA memory for ccw I/O and
classic notifiers") we were supposed to make sure that
virtio_ccw_release_dev() completes before the ccw device and the
attached dma pool are torn down, but unfortunately we did not.  Before
that commit it used to be OK to delay cleaning up the memory allocated
by virtio-ccw indefinitely (which isn't really intuitive for anyone used
to destruction happening in reverse construction order), but now we
trigger a BUG_ON if the genpool is destroyed before all memory allocated
from it is deallocated. Which brings down the guest. We can observe this
problem when unregister_virtio_device() does not give up the last
reference to the virtio_device (e.g. because a virtio-scsi attached SCSI
disk got removed without its previously mounted partition first being
unmounted).
      
      To make sure that the genpool is only destroyed after all the necessary
      freeing is done let us take a reference on the ccw device on each
      ccw_device_dma_zalloc() and give it up on each ccw_device_dma_free().
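
A minimal sketch of the idea (error handling and locking elided):

  void *ccw_device_dma_zalloc(struct ccw_device *cdev, size_t size)
  {
  	void *addr = cio_gp_dma_zalloc(cdev->private->dma_pool,
  				       &cdev->dev, size);
  	if (addr)
  		get_device(&cdev->dev);	/* pin device and genpool */
  	return addr;
  }

  void ccw_device_dma_free(struct ccw_device *cdev, void *cpu_addr,
  			 size_t size)
  {
  	if (!cpu_addr)
  		return;
  	cio_gp_dma_free(cdev->private->dma_pool, cpu_addr, size);
  	put_device(&cdev->dev);	/* last ref may free the genpool */
  }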
      
      Actually there are multiple approaches to fixing the problem at hand
      that can work. The upside of this one is that it is the safest one while
      remaining simple. We don't crash the guest even if the driver does not
pair allocations and frees. The downsides are the reference counting
overhead, the fact that reference counting for ccw devices becomes more
complex (we need to pair the calls to the aforementioned functions for
it to be correct), and that if we happen to leak, we leak more than
necessary (the whole ccw device instead of just the genpool).
      
      Some alternatives to this approach are taking a reference in
      virtio_ccw_online() and giving it up in virtio_ccw_release_dev() or
      making sure virtio_ccw_release_dev() completes its work before
      virtio_ccw_remove() returns. The downside of these approaches is that
      these are less safe against programming errors.
      
      Cc: <stable@vger.kernel.org> # v5.3
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Fixes: 48720ba5 ("virtio/s390: use DMA memory for ccw I/O and classic notifiers")
Reported-by: <bfu@redhat.com>
Reviewed-by: Vineeth Vijayan <vneethv@linux.ibm.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • s390/ap: Fix hanging ioctl caused by orphaned replies · c9ca9669
      Harald Freudenberger authored
      
      commit 3826350e upstream.
      
When a queue is switched to soft offline during heavy load and later
switched back to soft online and used again, the caller may be blocked
forever in the ioctl call.

The failure occurs because there is a pending reply left over after the
queue(s) have been switched to offline. This orphaned reply is received
when the queue is switched back online and is accidentally counted
against the outstanding replies. So when there is a valid outstanding
reply and this orphaned reply is received, it is taken for the
outstanding one, thus dropping the outstanding counter to 0. Voilà: with
this counter the receive function is not called any more, the real
outstanding reply is never received (until another request comes in...),
and the ioctl blocks.
      
      The fix is simple. However, instead of readjusting the counter when an
      orphaned reply is detected, I check the queue status for not empty and
      compare this to the outstanding counter. So if the queue is not empty
      then the counter must not drop to 0 but at least have a value of 1.
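
A sketch of that accounting rule in the reply-receive path (simplified from
the AP queue state machine):

  /* A reply was received: one outstanding request completes. */
  aq->queue_count = max_t(int, 0, aq->queue_count - 1);

  /* If the hardware reports the queue as not empty, at least one
   * real reply is still pending; don't let the counter hit 0. */
  if (!status.queue_empty && !aq->queue_count)
  	aq->queue_count = 1;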
      
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • s390/tape: fix timer initialization in tape_std_assign() · 57de1fbe
      Sven Schnelle authored
      
      commit 213fca9e upstream.
      
Commit 9c6c273a ("timer: Remove init_timer_on_stack() in favor
of timer_setup_on_stack()") changed the timer setup from
init_timer_on_stack() to timer_setup(), but missed changing the
mod_timer() call accordingly. And while at it, use msecs_to_jiffies()
instead of the open-coded timeout calculation.
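
A sketch of the corrected arming, assuming the on-stack timer is set up via
timer_setup():

  timer_setup(&timeout, tape_std_assign_timeout, 0);
  ...
  /* Arm with mod_timer() instead of poking .expires and calling
   * add_timer(); compute the 2s timeout via msecs_to_jiffies(). */
  mod_timer(&timeout, jiffies + msecs_to_jiffies(2000));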
      
      Cc: stable@vger.kernel.org
      Fixes: 9c6c273a ("timer: Remove init_timer_on_stack() in favor of timer_setup_on_stack()")
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • s390/cio: check the subchannel validity for dev_busid · 1174298a
      Vineeth Vijayan authored
      
      commit a4751f15 upstream.
      
Check the validity of the subchannel before reading other fields in
the schib.
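
A sketch of the shape of the check in dev_busid_show(); treat the helper
names as assumptions:

  if (stsch(sch->schid, &schib) || !css_sch_is_valid(&schib))
  	return sprintf(buf, "none\n");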
      
      Fixes: d3683c05 ("s390/cio: add dev_busid sysfs entry for each subchannel")
      CC: <stable@vger.kernel.org>
Reported-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Link: https://lore.kernel.org/r/20211105154451.847288-1-vneethv@linux.ibm.com

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • video: backlight: Drop maximum brightness override for brightness zero · 7d0341b3
      Marek Vasut authored
      
      commit 33a5471f upstream.
      
The note in c2adda27 ("video: backlight: Add of_find_backlight helper
in backlight.c") says that gpio-backlight uses brightness as power state.
This has since been fixed in ec665b75 ("backlight: gpio-backlight:
Correct initial power state handling"), and other backlight drivers do not
require this workaround. Drop the workaround.
      
This fixes the case where e.g. pwm-backlight can perfectly well be set to
brightness 0 on boot in DT, which without this patch leads to the display
brightness being max instead of off.
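
The dropped workaround in of_find_backlight() was, roughly:

  /* Removed: promoting brightness 0 to the maximum. */
  if (!bd->props.brightness)
  	bd->props.brightness = bd->props.max_brightness;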
      
      Fixes: c2adda27 ("video: backlight: Add of_find_backlight helper in backlight.c")
      Cc: <stable@vger.kernel.org> # 5.4+
      Cc: <stable@vger.kernel.org> # 4.19.x: ec665b75: backlight: gpio-backlight: Correct initial power state handling
Signed-off-by: Marek Vasut <marex@denx.de>
Acked-by: Noralf Trønnes <noralf@tronnes.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mfd: dln2: Add cell for initializing DLN2 ADC · 332306b1
      Jack Andersen authored
      commit 313c84b5 upstream.
      
This patch extends the DLN2 driver, adding a cell for the adc_dln2 module.
      
      The original patch[1] fell through the cracks when the driver was added
      so ADC has never actually been usable. That patch did not have ACPI
      support which was added in v5.9, so the oldest supported version this
      current patch can be backported to is 5.10.
      
      [1] https://www.spinics.net/lists/linux-iio/msg33975.html
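
Structurally, the change is one more entry in the driver's mfd_cell table; a
sketch with resource details omitted:

  static const struct mfd_cell dln2_devs[] = {
  	{ .name = "dln2-gpio" },
  	{ .name = "dln2-i2c" },
  	{ .name = "dln2-spi" },
  	/* New cell: lets the DLN2 ADC driver bind. */
  	{ .name = "dln2-adc" },
  };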
      
      
      
      Cc: <stable@vger.kernel.org> # 5.10+
Signed-off-by: Jack Andersen <jackoalan@gmail.com>
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Link: https://lore.kernel.org/r/20211018112541.25466-1-noralf@tronnes.org

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mm, oom: do not trigger out_of_memory from the #PF · 1d457987
      Michal Hocko authored
      commit 60e2793d upstream.
      
      Any allocation failure during the #PF path will return with VM_FAULT_OOM
      which in turn results in pagefault_out_of_memory.  This can happen for 2
      different reasons.  a) Memcg is out of memory and we rely on
      mem_cgroup_oom_synchronize to perform the memcg OOM handling or b)
      normal allocation fails.
      
      The latter is quite problematic because allocation paths already trigger
      out_of_memory and the page allocator tries really hard to not fail
      allocations.  Anyway, if the OOM killer has been already invoked there
      is no reason to invoke it again from the #PF path.  Especially when the
      OOM condition might be gone by that time and we have no way to find out
      other than allocate.
      
Moreover if the allocation failed and the OOM killer hasn't been invoked
then we are unlikely to do the right thing from the #PF context, because
we have already lost the allocation context and restrictions and
therefore might OOM-kill a task from a different NUMA domain.
      
This all suggests that there is no legitimate reason to trigger
out_of_memory from pagefault_out_of_memory, so drop it.  Just to be sure
that no #PF path returns with VM_FAULT_OOM without an allocation attempt,
print a warning that this is happening before we restart the #PF.
      
      [VvS: #PF allocation can hit into limit of cgroup v1 kmem controller.
      This is a local problem related to memcg, however, it causes unnecessary
      global OOM kills that are repeated over and over again and escalate into a
      real disaster.  This has been broken since kmem accounting has been
      introduced for cgroup v1 (3.8).  There was no kmem specific reclaim for
      the separate limit so the only way to handle kmem hard limit was to return
      with ENOMEM.  In upstream the problem will be fixed by removing the
      outdated kmem limit, however stable and LTS kernels cannot do it and are
      still affected.  This patch fixes the problem and should be backported
      into stable/LTS.]
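
With this patch (and the follow-up below that also bails out for dying
tasks), pagefault_out_of_memory() ends up roughly as:

  void pagefault_out_of_memory(void)
  {
  	static DEFINE_RATELIMIT_STATE(pfoom_rs, DEFAULT_RATELIMIT_INTERVAL,
  				      DEFAULT_RATELIMIT_BURST);

  	if (mem_cgroup_oom_synchronize(true))
  		return;	/* the memcg OOM path handled it */

  	if (fatal_signal_pending(current))
  		return;	/* dying task: just let it die */

  	if (__ratelimit(&pfoom_rs))
  		pr_warn("Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF\n");
  }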
      
      Link: https://lkml.kernel.org/r/f5fd8dd8-0ad4-c524-5f65-920b01972a42@virtuozzo.com
      
      
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Cc: Uladzislau Rezki <urezki@gmail.com>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mm, oom: pagefault_out_of_memory: don't force global OOM for dying tasks · ac7f6bef
      Vasily Averin authored
      commit 0b28179a upstream.
      
      Patch series "memcg: prohibit unconditional exceeding the limit of dying tasks", v3.
      
      Memory cgroup charging allows killed or exiting tasks to exceed the hard
      limit.  It can be misused and allowed to trigger global OOM from inside
      a memcg-limited container.  On the other hand if memcg fails allocation,
      called from inside #PF handler it triggers global OOM from inside
      pagefault_out_of_memory().
      
      To prevent these problems this patchset:
 (a) removes execution of out_of_memory() from
     pagefault_out_of_memory(), because nobody can explain why it is
     necessary.
 (b) allows memcg to fail allocations of dying/killed tasks.
      
      This patch (of 3):
      
Any allocation failure during the #PF path will return with VM_FAULT_OOM,
which in turn results in pagefault_out_of_memory, which in turn executes
out_of_memory() and can kill a random task.
      
      An allocation might fail when the current task is the oom victim and
      there are no memory reserves left.  The OOM killer is already handled at
      the page allocator level for the global OOM and at the charging level
      for the memcg one.  Both have much more information about the scope of
      allocation/charge request.  This means that either the OOM killer has
      been invoked properly and didn't lead to the allocation success or it
      has been skipped because it couldn't have been invoked.  In both cases
      triggering it from here is pointless and even harmful.
      
      It makes much more sense to let the killed task die rather than to wake
      up an eternally hungry oom-killer and send him to choose a fatter victim
      for breakfast.
      
      Link: https://lkml.kernel.org/r/0828a149-786e-7c06-b70a-52d086818ea3@virtuozzo.com
      
      
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Cc: Uladzislau Rezki <urezki@gmail.com>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/bpf: Emit stf barrier instruction sequences for BPF_NOSPEC · 1ada8699
      Naveen N. Rao authored
      
      upstream commit b7540d62
      
      Emit similar instruction sequences to commit a048a07d
      ("powerpc/64s: Add support for a store forwarding barrier at kernel
      entry/exit") when encountering BPF_NOSPEC.
      
      Mitigations are enabled depending on what the firmware advertises. In
      particular, we do not gate these mitigations based on current settings,
      just like in x86. Due to this, we don't need to take any action if
      mitigations are enabled or disabled at runtime.
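
For reference, the sequence emitted for BPF_NOSPEC looks roughly as follows
(macro names as in the upstream commit; per the backport note below they were
adjusted for this branch, and the STF_BARRIER_FALLBACK thunk call is elided):

  case BPF_ST | BPF_NOSPEC:
  	switch (stf_barrier) {
  	case STF_BARRIER_EIEIO:
  		EMIT(PPC_RAW_EIEIO() | 0x02000000);
  		break;
  	case STF_BARRIER_SYNC_ORI:
  		EMIT(PPC_RAW_SYNC());
  		EMIT(PPC_RAW_LD(b2p[TMP_REG_1], _R13, 0));
  		EMIT(PPC_RAW_ORI(_R31, _R31, 0));
  		break;
  	case STF_BARRIER_FALLBACK:
  		/* branch-and-link to the bpf_stf_barrier fallback thunk */
  		break;
  	case STF_BARRIER_NONE:
  		break;
  	}
  	break;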
      
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/956570cbc191cd41f8274bed48ee757a86dac62a.1633464148.git.naveen.n.rao@linux.vnet.ibm.com

[adjust macros to account for commits 1c9debbc and ef909ba9.
adjust security feature checks to account for commit 84ed26fd]
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/lib: Add helper to check if offset is within conditional branch range · 51cf71d5
      Naveen N. Rao authored
      
      upstream commit 4549c3ea
      
      Add a helper to check if a given offset is within the branch range for a
      powerpc conditional branch instruction, and update some sites to use the
      new helper.
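
Per the upstream commit, the helper is a range-and-alignment check on the
signed 16-bit BD field of conditional branch instructions:

  static inline bool is_offset_in_cond_branch_range(long offset)
  {
  	return offset >= -0x8000 && offset <= 0x7fff && !(offset & 0x3);
  }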
      
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/442b69a34ced32ca346a0d9a855f3f6cfdbbbd41.1633464148.git.naveen.n.rao@linux.vnet.ibm.com

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • memcg: prohibit unconditional exceeding the limit of dying tasks · 74293225
      Vasily Averin authored
      commit a4ebf1b6 upstream.
      
      Memory cgroup charging allows killed or exiting tasks to exceed the hard
      limit.  It is assumed that the amount of the memory charged by those
      tasks is bound and most of the memory will get released while the task
is exiting.  This resembles the heuristic for the global OOM situation
when tasks get access to memory reserves.  There is no global memory
shortage at the memcg level, so the memcg heuristic is more relaxed.
      
      The above assumption is overly optimistic though.  E.g.  vmalloc can
      scale to really large requests and the heuristic would allow that.  We
      used to have an early break in the vmalloc allocator for killed tasks
      but this has been reverted by commit b8c8a338 ("Revert "vmalloc:
      back off when the current task is killed"").  There are likely other
      similar code paths which do not check for fatal signals in an
      allocation&charge loop.  Also there are some kernel objects charged to a
      memcg which are not bound to a process life time.
      
      It has been observed that it is not really hard to trigger these
      bypasses and cause global OOM situation.
      
      One potential way to address these runaways would be to limit the amount
      of excess (similar to the global OOM with limited oom reserves).  This
      is certainly possible but it is not really clear how much of an excess
      is desirable and still protects from global OOMs as that would have to
      consider the overall memcg configuration.
      
      This patch is addressing the problem by removing the heuristic
      altogether.  Bypass is only allowed for requests which either cannot
      fail or where the failure is not desirable while excess should be still
      limited (e.g.  atomic requests).  Implementation wise a killed or dying
      task fails to charge if it has passed the OOM killer stage.  That should
      give all forms of reclaim chance to restore the limit before the failure
      (ENOMEM) and tell the caller to back off.
      
In addition, this patch renames the should_force_charge() helper to
task_is_dying(), because now its use is not associated with forced
charging.
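
The renamed helper, per the upstream commit:

  static bool task_is_dying(void)
  {
  	return tsk_is_oom_victim(current) || fatal_signal_pending(current) ||
  		(current->flags & PF_EXITING);
  }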
      
      This patch depends on pagefault_out_of_memory() to not trigger
      out_of_memory(), because then a memcg failure can unwind to VM_FAULT_OOM
      and cause a global OOM killer.
      
      Link: https://lkml.kernel.org/r/8f5cebbb-06da-4902-91f0-6566fc4b4203@virtuozzo.com
      
      
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Uladzislau Rezki <urezki@gmail.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net, neigh: Enable state migration between NUD_PERMANENT and NTF_USE · a8cdf34f
      Daniel Borkmann authored
      
      [ Upstream commit 3dc20f47 ]
      
      Currently, it is not possible to migrate a neighbor entry between NUD_PERMANENT
      state and NTF_USE flag with a dynamic NUD state from a user space control plane.
      Similarly, it is not possible to add/remove NTF_EXT_LEARNED flag from an existing
      neighbor entry in combination with NTF_USE flag.
      
This is due to the latter directly calling into neigh_event_send() without any
of the metadata updates that happen in __neigh_update(). Thus, to enable this
use case, extend the latter with a NEIGH_UPDATE_F_USE flag under which we break
the NUD_PERMANENT state in particular, so that a later neigh_event_send() is
able to re-resolve the neighbor entry.
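
A sketch of the core of the change in __neigh_update(), assuming the new flag
simply demotes NUD_PERMANENT so a subsequent neigh_event_send() can re-resolve
the entry:

  if (flags & NEIGH_UPDATE_F_USE) {
  	new = old & ~NUD_PERMANENT;
  	neigh->nud_state = new;
  	err = 0;
  	goto out;
  }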
      
      Before fix, NUD_PERMANENT -> NUD_* & NTF_USE:
      
        # ./ip/ip n replace 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a
        # ./ip/ip n
        192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT
        [...]
        # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn
        # ./ip/ip n
        192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT
        [...]
      
      As can be seen, despite the admin-triggered replace, the entry remains in the
      NUD_PERMANENT state.
      
      After fix, NUD_PERMANENT -> NUD_* & NTF_USE:
      
        # ./ip/ip n replace 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a
        # ./ip/ip n
        192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT
        [...]
        # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn
        # ./ip/ip n
        192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn REACHABLE
        [...]
        # ./ip/ip n
        192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn STALE
        [...]
        # ./ip/ip n replace 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a
        # ./ip/ip n
        192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT
        [...]
      
      After the fix, the admin-triggered replace switches to a dynamic state from
      the NTF_USE flag which triggered a new neighbor resolution. Likewise, we can
      transition back from there, if needed, into NUD_PERMANENT.
      
      Similar before/after behavior can be observed for below transitions:
      
      Before fix, NTF_USE -> NTF_USE | NTF_EXT_LEARNED -> NTF_USE:
      
        # ./ip/ip n replace 192.168.178.30 dev enp5s0 use
        # ./ip/ip n
        192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE
        [...]
        # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn
        # ./ip/ip n
        192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE
        [...]
      
      After fix, NTF_USE -> NTF_USE | NTF_EXT_LEARNED -> NTF_USE:
      
        # ./ip/ip n replace 192.168.178.30 dev enp5s0 use
        # ./ip/ip n
        192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE
        [...]
        # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn
        # ./ip/ip n
        192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn REACHABLE
        [...]
        # ./ip/ip n replace 192.168.178.30 dev enp5s0 use
        # ./ip/ip n
        192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE
        [..]
      
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Roopa Prabhu <roopa@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
    • f2fs: should use GFP_NOFS for directory inodes · 0bf5c6a1
      Jaegeuk Kim authored
      
      commit 92d602bc upstream.
      
We use inline_dentry, which requires allocating a dentry page when adding a
link. If we allow reclaiming memory from the filesystem, we can end up taking
down_read(&sbi->cp_rwsem) twice via f2fs_lock_op(). I think this should be
okay, but how about stopping the lockdep complaint [1]?
      
      f2fs_create()
       - f2fs_lock_op()
       - f2fs_do_add_link()
        - __f2fs_find_entry
         - f2fs_get_read_data_page()
         -> kswapd
          - shrink_node
           - f2fs_evict_inode
            - f2fs_lock_op()
      
      [1]
      
(fs_reclaim){+.+.}-{0:0}:
      kswapd0:        lock_acquire+0x114/0x394
      kswapd0:        __fs_reclaim_acquire+0x40/0x50
      kswapd0:        prepare_alloc_pages+0x94/0x1ec
      kswapd0:        __alloc_pages_nodemask+0x78/0x1b0
      kswapd0:        pagecache_get_page+0x2e0/0x57c
      kswapd0:        f2fs_get_read_data_page+0xc0/0x394
      kswapd0:        f2fs_find_data_page+0xa4/0x23c
      kswapd0:        find_in_level+0x1a8/0x36c
      kswapd0:        __f2fs_find_entry+0x70/0x100
      kswapd0:        f2fs_do_add_link+0x84/0x1ec
      kswapd0:        f2fs_mkdir+0xe4/0x1e4
      kswapd0:        vfs_mkdir+0x110/0x1c0
      kswapd0:        do_mkdirat+0xa4/0x160
      kswapd0:        __arm64_sys_mkdirat+0x24/0x34
      kswapd0:        el0_svc_common.llvm.17258447499513131576+0xc4/0x1e8
      kswapd0:        do_el0_svc+0x28/0xa0
      kswapd0:        el0_svc+0x24/0x38
      kswapd0:        el0_sync_handler+0x88/0xec
      kswapd0:        el0_sync+0x1c0/0x200
      kswapd0:
-> #1 (&sbi->cp_rwsem){++++}-{3:3}:
      kswapd0:        lock_acquire+0x114/0x394
      kswapd0:        down_read+0x7c/0x98
      kswapd0:        f2fs_do_truncate_blocks+0x78/0x3dc
      kswapd0:        f2fs_truncate+0xc8/0x128
      kswapd0:        f2fs_evict_inode+0x2b8/0x8b8
      kswapd0:        evict+0xd4/0x2f8
      kswapd0:        iput+0x1c0/0x258
      kswapd0:        do_unlinkat+0x170/0x2a0
      kswapd0:        __arm64_sys_unlinkat+0x4c/0x68
      kswapd0:        el0_svc_common.llvm.17258447499513131576+0xc4/0x1e8
      kswapd0:        do_el0_svc+0x28/0xa0
      kswapd0:        el0_svc+0x24/0x38
      kswapd0:        el0_sync_handler+0x88/0xec
      kswapd0:        el0_sync+0x1c0/0x200
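
The fix itself is small; a sketch, assuming it adjusts the inode mapping's
allocation mask for directories in f2fs_iget():

  /* Dentry page allocations for directories must not recurse into
   * filesystem reclaim while f2fs_lock_op() is held. */
  if (S_ISDIR(inode->i_mode))
  	mapping_set_gfp_mask(inode->i_mapping, GFP_NOFS);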
      
      Cc: stable@vger.kernel.org
      Fixes: bdbc90fa ("f2fs: don't put dentry page in pagecache into highmem")
Reviewed-by: Chao Yu <chao@kernel.org>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Light Hsieh <light.hsieh@mediatek.com>
Tested-by: Light Hsieh <light.hsieh@mediatek.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • irqchip/sifive-plic: Fixup EOI failed when masked · 7930892c
      Guo Ren authored
      commit 69ea4630 upstream.
      
      When using "devm_request_threaded_irq(,,,,IRQF_ONESHOT,,)" in a driver,
      only the first interrupt is handled, and following interrupts are never
      delivered (initially reported in [1]).
      
      That's because the RISC-V PLIC cannot EOI masked interrupts, as explained
      in the description of Interrupt Completion in the PLIC spec [2]:
      
      <quote>
      The PLIC signals it has completed executing an interrupt handler by
      writing the interrupt ID it received from the claim to the claim/complete
      register. The PLIC does not check whether the completion ID is the same
      as the last claim ID for that target. If the completion ID does not match
      an interrupt source that *is currently enabled* for the target, the
      completion is silently ignored.
      </quote>
      
      Re-enable the interrupt before completion if it has been masked during
      the handling, and remask it afterwards.
      
      [1] http://lists.infradead.org/pipermail/linux-riscv/2021-July/007441.html
      [2] https://github.com/riscv/riscv-plic-spec/blob/8bc15a35d07c9edf7b5d23fec9728302595ffc4d/riscv-plic.adoc
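
The resulting EOI handler, roughly (register and symbol names per the
upstream driver):

  static void plic_irq_eoi(struct irq_data *d)
  {
  	struct plic_handler *handler = this_cpu_ptr(&plic_handlers);

  	if (irqd_irq_masked(d)) {
  		plic_irq_unmask(d);
  		writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
  		plic_irq_mask(d);
  	} else {
  		writel(d->hwirq, handler->hart_base + CONTEXT_CLAIM);
  	}
  }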
      
      
      
      Fixes: bb0fed1c ("irqchip/sifive-plic: Switch to fasteoi flow")
Reported-by: Vincent Pelletier <plr.vincent@gmail.com>
Tested-by: Nikita Shubin <nikita.shubin@maquefel.me>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
      Cc: stable@vger.kernel.org
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
[maz: amended commit message]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211105094748.3894453-1-guoren@kernel.org

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • posix-cpu-timers: Clear task::posix_cputimers_work in copy_process() · f67f6eb7
      Michael Pratt authored
      
      commit ca7752ca upstream.
      
      copy_process currently copies task_struct.posix_cputimers_work as-is. If a
      timer interrupt arrives while handling clone and before dup_task_struct
      completes then the child task will have:
      
      1. posix_cputimers_work.scheduled = true
      2. posix_cputimers_work.work queued.
      
      copy_process clears task_struct.task_works, so (2) will have no effect and
      posix_cpu_timers_work will never run (not to mention it doesn't make sense
      for two tasks to share a common linked list).
      
      Since posix_cpu_timers_work never runs, posix_cputimers_work.scheduled is
      never cleared. Since scheduled is set, future timer interrupts will skip
      scheduling work, with the ultimate result that the task will never receive
      timer expirations.
      
      Together, the complete flow is:
      
      1. Task 1 calls clone(), enters kernel.
      2. Timer interrupt fires, schedules task work on Task 1.
         2a. task_struct.posix_cputimers_work.scheduled = true
         2b. task_struct.posix_cputimers_work.work added to
             task_struct.task_works.
      3. dup_task_struct() copies Task 1 to Task 2.
      4. copy_process() clears task_struct.task_works for Task 2.
      5. Future timer interrupts on Task 2 see
         task_struct.posix_cputimers_work.scheduled = true and skip scheduling
         work.
      
      Fix this by explicitly clearing contents of task_struct.posix_cputimers_work
      in copy_process(). This was never meant to be shared or inherited across
      tasks in the first place.
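
A sketch of the helper added for copy_process(), per the upstream fix:

  static inline void clear_posix_cputimers_work(struct task_struct *p)
  {
  	/* A copied work entry from the parent is meaningless in the
  	 * child; reset both the work struct and the scheduled flag. */
  	memset(&p->posix_cputimers_work.work, 0,
  	       sizeof(p->posix_cputimers_work.work));
  	init_task_work(&p->posix_cputimers_work.work,
  		       posix_cpu_timers_work);
  	p->posix_cputimers_work.scheduled = false;
  }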
      
      Fixes: 1fb497dd ("posix-cpu-timers: Provide mechanisms to defer timer handling to task_work")
Reported-by: Rhys Hiltner <rhys@justin.tv>
Signed-off-by: Michael Pratt <mpratt@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20211101210615.716522-1-mpratt@google.com

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Dave Jones's avatar
      x86/mce: Add errata workaround for Skylake SKX37 · 1372eb18
      Dave Jones authored
      
      commit e629fc14 upstream.
      
      Erratum SKX37 is word-for-word identical to the other errata listed in
      this workaround. I happened to notice this after investigating a CMCI
      storm on a Skylake host. While I can't confirm this was the root cause,
      spurious corrected errors do sound like a likely suspect.
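
      The workaround lives in intel_filter_mce(); the change just adds the
      Skylake server model to the existing list (condensed sketch; the
      status signature is the spurious corrected-error pattern the errata
      describe):

      	/* MCE errata HSD131, HSM142, HSW131, BDM48, HSM142 and SKX37 */
      	static bool intel_filter_mce(struct mce *m)
      	{
      		struct cpuinfo_x86 *c = &boot_cpu_data;

      		if ((c->x86 == 6) &&
      		    ((c->x86_model == INTEL_FAM6_HASWELL) ||
      		     (c->x86_model == INTEL_FAM6_HASWELL_L) ||
      		     (c->x86_model == INTEL_FAM6_BROADWELL) ||
      		     (c->x86_model == INTEL_FAM6_HASWELL_G) ||
      		     (c->x86_model == INTEL_FAM6_SKYLAKE_X)) &&
      		    (m->bank == 0) &&
      		    ((m->status & 0xa0000000ffffffff) == 0x80000000000f0005))
      			return true;

      		return false;
      	}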
      
      Fixes: 2976908e ("x86/mce: Do not log spurious corrected mce errors")
      Signed-off-by: default avatarDave Jones <davej@codemonkey.org.uk>
      Signed-off-by: default avatarDave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: default avatarTony Luck <tony.luck@intel.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20211029205759.GA7385@codemonkey.org.uk
      
      
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1372eb18
    • Maciej W. Rozycki's avatar
      MIPS: Fix assembly error from MIPSr2 code used within MIPS_ISA_ARCH_LEVEL · 1ee5bc2b
      Maciej W. Rozycki authored
      
      commit a923a267 upstream.
      
      Fix assembly errors like:
      
      {standard input}: Assembler messages:
      {standard input}:287: Error: opcode not supported on this processor: mips3 (mips3) `dins $10,$7,32,32'
      {standard input}:680: Error: opcode not supported on this processor: mips3 (mips3) `dins $10,$7,32,32'
      {standard input}:1274: Error: opcode not supported on this processor: mips3 (mips3) `dins $12,$9,32,32'
      {standard input}:2175: Error: opcode not supported on this processor: mips3 (mips3) `dins $10,$7,32,32'
      make[1]: *** [scripts/Makefile.build:277: mm/highmem.o] Error 1
      
      with code produced from `__cmpxchg64' for MIPS64r2 CPU configurations
      using CONFIG_32BIT and CONFIG_PHYS_ADDR_T_64BIT.
      
      This is due to MIPS_ISA_ARCH_LEVEL downgrading the assembly architecture
      to `r4000' i.e. MIPS III for MIPS64r2 configurations, while there is a
      block of code containing a DINS MIPS64r2 instruction conditionalized on
      MIPS_ISA_REV >= 2 within the scope of the downgrade.
      
      The assembly architecture override code pattern has been put there for
      LL/SC instructions, so that code compiles for configurations that select
      a processor to build for that does not support these instructions while
      still providing run-time support for processors that do, dynamically
      switched by non-constant `cpu_has_llsc'.  It went in with linux-mips.org
      commit aac8aa77 ("Enable a suitable ISA for the assembler around
      ll/sc so that code builds even for processors that don't support the
      instructions. Plus minor formatting fixes.") back in 2005.
      
      Fix the problem by wrapping these instructions along with the adjacent
      SYNC instructions only, following the practice established with commit
      cfd54de3 ("MIPS: Avoid move psuedo-instruction whilst using
      MIPS_ISA_LEVEL") and commit 378ed6f0 ("MIPS: Avoid using .set mips0
      to restore ISA").  Strictly speaking the SYNC instructions do not have
      to be wrapped as they are only used as a Loongson3 erratum workaround,
      so they will be enabled in the assembler by default, but do this so as
      to keep code consistent with other places.
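
      The shape of the fix, heavily simplified (an illustrative fragment
      rather than the exact __cmpxchg64 source; register operands are
      illustrative and the Loongson3 SYNC lines are elided):

      	"	.set	push				\n"
      	"	.set	" MIPS_ISA_ARCH_LEVEL "		\n" /* LL/SC usable */
      	"1:	lld	%L0, %2				\n"
      	"	.set	pop				\n"
      	/*
      	 * MIPS64r2 instructions such as DINS now assemble against the
      	 * ISA the kernel is actually configured for, outside the override.
      	 */
      	"	dins	%L1, %4, 32, 32			\n"
      	"	.set	push				\n"
      	"	.set	" MIPS_ISA_ARCH_LEVEL "		\n"
      	"	scd	%L1, %2				\n"
      	"	beqz	%L1, 1b				\n"
      	"	.set	pop				\n"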
      
      Reported-by: default avatarkernel test robot <lkp@intel.com>
      Signed-off-by: default avatarMaciej W. Rozycki <macro@orcam.me.uk>
      Fixes: c7e2d71d ("MIPS: Fix set_pte() for Netlogic XLR using cmpxchg64()")
      Cc: stable@vger.kernel.org # v5.1+
      Signed-off-by: default avatarThomas Bogendoerfer <tsbogend@alpha.franken.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1ee5bc2b
    • Helge Deller's avatar
      parisc: Fix backtrace to always include init function names · fc42bbb7
      Helge Deller authored
      
      commit 279917e2 upstream.
      
      I noticed that sometimes at kernel startup the backtraces did not
      include the function names of init functions. Their addresses were not
      resolved to function names; instead only the raw address was printed.
      
      Debugging shows that the culprit is is_ksym_addr(), which is called by
      the backtrace functions to check whether an address belongs to a
      function in the kernel. The problem occurs only for CONFIG_KALLSYMS_ALL=y.
      
      When looking at is_ksym_addr() one can see that for CONFIG_KALLSYMS_ALL=y
      the function only tries to resolve the address via the is_kernel()
      function, which performs this check:
      	if (addr >= _stext && addr <= _end)
      		return 1;
      On parisc the init functions are located before _stext, so this check fails.
      Other platforms seem to have all functions (including init functions)
      behind _stext.
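
      For reference, the kallsyms check looks roughly like this (simplified
      from kernel/kallsyms.c), which is why everything rests on the _stext
      range test when CONFIG_KALLSYMS_ALL=y:

      	static inline int is_ksym_addr(unsigned long addr)
      	{
      		if (IS_ENABLED(CONFIG_KALLSYMS_ALL))
      			return is_kernel(addr);

      		return is_kernel_text(addr) || is_kernel_inittext(addr);
      	}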
      
      The following patch moves the _stext symbol to the beginning of the
      kernel so that it also covers the init section. This fixes the check and
      does not seem to have any negative side effects on where the kernel
      mapping happens in the map_pages() function in arch/parisc/mm/init.c.
      
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      Cc: stable@kernel.org # 5.4+
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fc42bbb7
    • Arnd Bergmann's avatar
      ARM: 9156/1: drop cc-option fallbacks for architecture selection · 241c74cc
      Arnd Bergmann authored
      commit 418ace99 upstream.
      
      Naresh and Antonio ran into a build failure with the latest Debian
      armhf compilers, with lots of output like
      
       tmp/ccY3nOAs.s:2215: Error: selected processor does not support `cpsid i' in ARM mode
      
      As it turns out, $(cc-option) fails early here when the FPU is not
      selected before the CPU architecture is selected, as the compiler
      option check runs before enabling -msoft-float, which causes
      a problem when testing a target architecture level without an FPU:
      
      cc1: error: '-mfloat-abi=hard': selected architecture lacks an FPU
      
      Passing e.g. -march=armv6k+fp in place of -march=armv6k would avoid this
      issue, but the fallback logic is already broken because all supported
      compilers (gcc-5 and higher) are much more recent than these options,
      and building with -march=armv5t as a fallback no longer works.
      
      The best way forward that I see is to just remove all the checks, which
      also has the nice side-effect of slightly improving the startup time for
      'make'.
      
      The -mtune=marvell-f option was apparently never supported by any
      mainline compiler, and the custom Codesourcery gcc build that did
      support it is now too old to build kernels, so just use -mtune=xscale
      unconditionally for those.
      
      This should be safe to apply on all stable kernels, and will be required
      in order to keep building them with gcc-11 and higher.
      
      Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=996419
      
      
      
      Reported-by: default avatarAntonio Terceiro <antonio.terceiro@linaro.org>
      Reported-by: default avatarNaresh Kamboju <naresh.kamboju@linaro.org>
      Reported-by: default avatarSebastian Andrzej Siewior <sebastian@breakpoint.cc>
      Tested-by: default avatarSebastian Reichel <sebastian.reichel@collabora.com>
      Tested-by: default avatarKlaus Kudielka <klaus.kudielka@gmail.com>
      Cc: Matthias Klose <doko@debian.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarRussell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      241c74cc
    • Michał Mirosław's avatar
      ARM: 9155/1: fix early early_iounmap() · 03f25781
      Michał Mirosław authored
      
      commit 0d08e7bf upstream.
      
      Currently __set_fixmap() bails out with a warning when called in early boot
      from early_iounmap(). Fix it, and while at it, make the comment a bit easier
      to understand.
      
      Cc: <stable@vger.kernel.org>
      Fixes: b089c31c ("ARM: 8667/3: Fix memory attribute inconsistencies when using fixmap")
      Acked-by: default avatarArd Biesheuvel <ardb@kernel.org>
      Signed-off-by: default avatarMichał Mirosław <mirq-linux@rere.qmqm.pl>
      Signed-off-by: default avatarRussell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      03f25781
    • Willem de Bruijn's avatar
      selftests/net: udpgso_bench_rx: fix port argument · ee79560c
      Willem de Bruijn authored
      
      [ Upstream commit d336509c ]
      
      The below commit added optional support for passing a bind address.
      It configures the sockaddr bind arguments before parsing options and
      reconfigures on options -b and -4.
      
      This broke support for passing the port (-p) on its own.
      
      Configure sockaddr after parsing all arguments.
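
      A sketch of the reordering (variable and helper names are
      illustrative, not the exact selftest source):

      	int c;

      	/* parse every option first... */
      	while ((c = getopt(argc, argv, "4b:p:")) != -1) {
      		switch (c) {
      		case '4':
      			cfg_family = AF_INET;
      			break;
      		case 'b':
      			cfg_bind_addr = optarg;
      			break;
      		case 'p':
      			cfg_port = strtoul(optarg, NULL, 0);
      			break;
      		}
      	}

      	/* ...and only then build the sockaddr, so -p alone is honored */
      	setup_sockaddr(cfg_family, cfg_bind_addr, cfg_port, &cfg_bind);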
      
      Fixes: 3327a9c4 ("selftests: add functionals test for UDP GRO")
      Reported-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      ee79560c
    • Rahul Lakkireddy's avatar
      cxgb4: fix eeprom len when diagnostics not implemented · 8b215edb
      Rahul Lakkireddy authored
      
      [ Upstream commit 4ca110bf ]
      
      Check that diagnostics monitoring support is implemented for the
      SFF-8472 compliant port module and set the correct length for the
      ethtool port module EEPROM read.
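
      The idea, sketched with the generic ethtool constants (the test is the
      SFF-8472 "diagnostic monitoring implemented" flag, byte 92 bit 6 of
      the ID page; the surrounding code is illustrative):

      	/*
      	 * Only advertise the larger A2h page when the module actually
      	 * implements diagnostic monitoring.
      	 */
      	if (id_page[92] & (1 << 6)) {
      		modinfo->type = ETH_MODULE_SFF_8472;
      		modinfo->eeprom_len = ETH_MODULE_SFF_8472_LEN;
      	} else {
      		modinfo->type = ETH_MODULE_SFF_8079;
      		modinfo->eeprom_len = ETH_MODULE_SFF_8079_LEN;
      	}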
      
      Fixes: f56ec676 ("cxgb4: Add support for ethtool i2c dump")
      Signed-off-by: default avatarManoj Malviya <manojmalviya@chelsio.com>
      Signed-off-by: default avatarRahul Lakkireddy <rahul.lakkireddy@chelsio.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      8b215edb
    • Dust Li's avatar
      net/smc: fix sk_refcnt underflow on linkdown and fallback · 93bc3ef6
      Dust Li authored
      
      [ Upstream commit e5d5aadc ]
      
      We got the following WARNING when running an ab/nginx
      test with RDMA link flapping (up-down-up).
      The reason is that when an smc_sock fallback and a linkdown
      happen simultaneously, we may get the following situation:
      
      __smc_lgr_terminate()
       --> smc_conn_kill()
          --> smc_close_active_abort()
                 smc_sock->sk_state = SMC_CLOSED
                 sock_put(smc_sock)
      
      The smc_sock is set to SMC_CLOSED and sock_put() is called
      when the link group is terminated. But when the application
      later calls close() on the socket, we get:
      
      __smc_release():
          if (smc_sock->fallback)
              smc_sock->sk_state = SMC_CLOSED
              sock_put(smc_sock)
      
      Again we set the smc_sock to CLOSED even though it's already
      in the CLOSED state, and we put the refcnt a second time, so
      the following warning appears:
      
      refcount_t: underflow; use-after-free.
      WARNING: CPU: 5 PID: 860 at lib/refcount.c:28 refcount_warn_saturate+0x8d/0xf0
      Modules linked in:
      CPU: 5 PID: 860 Comm: nginx Not tainted 5.10.46+ #403
      Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 8c24b4c 04/01/2014
      RIP: 0010:refcount_warn_saturate+0x8d/0xf0
      Code: 05 5c 1e b5 01 01 e8 52 25 bc ff 0f 0b c3 80 3d 4f 1e b5 01 00 75 ad 48
      
      RSP: 0018:ffffc90000527e50 EFLAGS: 00010286
      RAX: 0000000000000026 RBX: ffff8881300df2c0 RCX: 0000000000000027
      RDX: 0000000000000000 RSI: ffff88813bd58040 RDI: ffff88813bd58048
      RBP: 0000000000000000 R08: 0000000000000003 R09: 0000000000000001
      R10: ffff8881300df2c0 R11: ffffc90000527c78 R12: ffff8881300df340
      R13: ffff8881300df930 R14: ffff88810b3dad80 R15: ffff8881300df4f8
      FS:  00007f739de8fb80(0000) GS:ffff88813bd40000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000000000a01b008 CR3: 0000000111b64003 CR4: 00000000003706e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       smc_release+0x353/0x3f0
       __sock_release+0x3d/0xb0
       sock_close+0x11/0x20
       __fput+0x93/0x230
       task_work_run+0x65/0xa0
       exit_to_user_mode_prepare+0xf9/0x100
       syscall_exit_to_user_mode+0x27/0x190
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      This patch adds a check in __smc_release() to make
      sure we won't do an extra sock_put() or set the
      socket to CLOSED when it's already in the CLOSED state.
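
      The guarded fallback path in __smc_release() then looks roughly like
      this (condensed sketch of the upstream change):

      	} else {
      		if (sk->sk_state != SMC_CLOSED) {
      			if (sk->sk_state != SMC_LISTEN &&
      			    sk->sk_state != SMC_INIT)
      				sock_put(sk); /* passive closing */
      			sk->sk_state = SMC_CLOSED;
      			sk->sk_state_change(sk);
      		}
      	}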
      
      Fixes: 51f1de79 ("net/smc: replace sock_put worker by socket refcounting")
      Signed-off-by: default avatarDust Li <dust.li@linux.alibaba.com>
      Reviewed-by: default avatarTony Lu <tonylu@linux.alibaba.com>
      Signed-off-by: default avatarDust Li <dust.li@linux.alibaba.com>
      Acked-by: default avatarKarsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      93bc3ef6
    • Eiichi Tsukata's avatar
      vsock: prevent unnecessary refcnt inc for nonblocking connect · 7e03b797
      Eiichi Tsukata authored
      
      [ Upstream commit c7cd82b9 ]
      
      Currently vsock_connect() increments the sock refcount for a
      nonblocking socket each time it's called, which can lead to a memory
      leak if it's called multiple times, because the connect timeout
      function decrements the sock refcount only once.
      
      Fix it by making vsock_connect() return -EALREADY immediately when
      the sock state is already SS_CONNECTING.
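
      A sketch of the added case in vsock_connect()'s state switch
      (condensed; comment wording is ours):

      	case SS_CONNECTING:
      		/*
      		 * A previous nonblocking connect() already armed the
      		 * timeout and took its reference; returning -EALREADY
      		 * avoids taking another one that nothing would drop.
      		 */
      		err = -EALREADY;
      		if (flags & O_NONBLOCK)
      			goto out;
      		break;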
      
      Fixes: d021c344 ("VSOCK: Introduce VM Sockets")
      Reviewed-by: default avatarStefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: default avatarEiichi Tsukata <eiichi.tsukata@nutanix.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      7e03b797