Commit Graph

790960 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
de4c70d6a9 This is the 4.19.65 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1Js7MACgkQONu9yGCS
 aT4PQxAAo7xa4kYvDxc1RjUY/yIlp6lQ3rpYAAfZB0t8vN+dqivnJZ7m6JHeWX1Y
 CMcxg85zxLVFeuiXdP821Zj68AB5zqlWMhX0bXm2lhw/Eo9+XHzXtnrLZHhz0/Xd
 M5cmfIPmoyPCUQQfzSfUMvch+ZpwzEt5op5pUfSjckSpjHQZ0HFj1WJ4D8Hn9jAJ
 y4+DAKDZgtqhb3GvpS6MoVnBJgcPk9+mBiDkSb12L392+FvHqfeBi3tDRhvyiZAO
 iJrk747SPds7NlNmuRnj7YyUSDhBzaceRCz0Jsv9FT5EKXoPErXdsL3Bkfa9TREM
 pH0OaMgNr6WSXLO9qIMcfxMeaKVIvIbotqBTkBTzhEAGPkHA75dhi0lpixXXFExg
 MaqhLfmHO0dOEr9FrvYGe7f2wUA1Rdw/qRTM3KPEKmHxMqBS7eufIWMHwie1n9Oe
 cYoP6UkxUIvhUyFV2BlMRFdMfaDbtR0iqy8Dqh36NISD6PAYaUGSoVeSO1fEg4Jy
 5GgrKPg6rcz2XNY2cVbsm2zLpqY4dY58SFK9ORfuULdKUQvScvFGrdSSW0CgX+uc
 F/5NmPutUoboHVxFraDPx7yo46pHf1RW0Me4xZ0aJ3e9ituLAN4fmJ9u46nofb5M
 thPelQlMVt30O41uViJ0ADkOjCsiBr3AxOFvc76Ct9Q/BJVxhLk=
 =JVBv
 -----END PGP SIGNATURE-----

Merge 4.19.65 into android-4.19

Changes in 4.19.65
	ARM: riscpc: fix DMA
	ARM: dts: rockchip: Make rk3288-veyron-minnie run at hs200
	ARM: dts: rockchip: Make rk3288-veyron-mickey's emmc work again
	ARM: dts: rockchip: Mark that the rk3288 timer might stop in suspend
	ftrace: Enable trampoline when rec count returns back to one
	dmaengine: tegra-apb: Error out if DMA_PREP_INTERRUPT flag is unset
	arm64: dts: rockchip: fix isp iommu clocks and power domain
	kernel/module.c: Only return -EEXIST for modules that have finished loading
	firmware/psci: psci_checker: Park kthreads before stopping them
	MIPS: lantiq: Fix bitfield masking
	dmaengine: rcar-dmac: Reject zero-length slave DMA requests
	clk: tegra210: fix PLLU and PLLU_OUT1
	fs/adfs: super: fix use-after-free bug
	clk: sprd: Add check for return value of sprd_clk_regmap_init()
	btrfs: fix minimum number of chunk errors for DUP
	btrfs: qgroup: Don't hold qgroup_ioctl_lock in btrfs_qgroup_inherit()
	cifs: Fix a race condition with cifs_echo_request
	ceph: fix improper use of smp_mb__before_atomic()
	ceph: return -ERANGE if virtual xattr value didn't fit in buffer
	ACPI: blacklist: fix clang warning for unused DMI table
	scsi: zfcp: fix GCC compiler warning emitted with -Wmaybe-uninitialized
	perf version: Fix segfault due to missing OPT_END()
	x86: kvm: avoid constant-conversion warning
	ACPI: fix false-positive -Wuninitialized warning
	be2net: Signal that the device cannot transmit during reconfiguration
	x86/apic: Silence -Wtype-limits compiler warnings
	x86: math-emu: Hide clang warnings for 16-bit overflow
	mm/cma.c: fail if fixed declaration can't be honored
	lib/test_overflow.c: avoid tainting the kernel and fix wrap size
	lib/test_string.c: avoid masking memset16/32/64 failures
	coda: add error handling for fget
	coda: fix build using bare-metal toolchain
	uapi linux/coda_psdev.h: move upc_req definition from uapi to kernel side headers
	drivers/rapidio/devices/rio_mport_cdev.c: NUL terminate some strings
	ipc/mqueue.c: only perform resource calculation if user valid
	mlxsw: spectrum_dcb: Configure DSCP map as the last rule is removed
	xen/pv: Fix a boot up hang revealed by int3 self test
	x86/kvm: Don't call kvm_spurious_fault() from .fixup
	x86/paravirt: Fix callee-saved function ELF sizes
	x86, boot: Remove multiple copy of static function sanitize_boot_params()
	drm/nouveau: fix memory leak in nouveau_conn_reset()
	kconfig: Clear "written" flag to avoid data loss
	kbuild: initialize CLANG_FLAGS correctly in the top Makefile
	Btrfs: fix incremental send failure after deduplication
	Btrfs: fix race leading to fs corruption after transaction abort
	mmc: dw_mmc: Fix occasional hang after tuning on eMMC
	mmc: meson-mx-sdio: Fix misuse of GENMASK macro
	gpiolib: fix incorrect IRQ requesting of an active-low lineevent
	IB/hfi1: Fix Spectre v1 vulnerability
	mtd: rawnand: micron: handle on-die "ECC-off" devices correctly
	selinux: fix memory leak in policydb_init()
	ALSA: hda: Fix 1-minute detection delay when i915 module is not available
	mm: vmscan: check if mem cgroup is disabled or not before calling memcg slab shrinker
	s390/dasd: fix endless loop after read unit address configuration
	cgroup: kselftest: relax fs_spec checks
	parisc: Fix build of compressed kernel even with debug enabled
	drivers/perf: arm_pmu: Fix failure path in PM notifier
	arm64: compat: Allow single-byte watchpoints on all addresses
	arm64: cpufeature: Fix feature comparison for CTR_EL0.{CWG,ERG}
	nbd: replace kill_bdev() with __invalidate_device() again
	xen/swiotlb: fix condition for calling xen_destroy_contiguous_region()
	IB/mlx5: Fix unreg_umr to ignore the mkey state
	IB/mlx5: Use direct mkey destroy command upon UMR unreg failure
	IB/mlx5: Move MRs to a kernel PD when freeing them to the MR cache
	IB/mlx5: Fix clean_mr() to work in the expected order
	IB/mlx5: Fix RSS Toeplitz setup to be aligned with the HW specification
	IB/hfi1: Check for error on call to alloc_rsm_map_table
	drm/i915/gvt: fix incorrect cache entry for guest page mapping
	eeprom: at24: make spd world-readable again
	ARC: enable uboot support unconditionally
	objtool: Support GCC 9 cold subfunction naming scheme
	gcc-9: properly declare the {pv,hv}clock_page storage
	x86/vdso: Prevent segfaults due to hoisted vclock reads
	scsi: mpt3sas: Use 63-bit DMA addressing on SAS35 HBA
	x86/cpufeatures: Carve out CQM features retrieval
	x86/cpufeatures: Combine word 11 and 12 into a new scattered features word
	x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations
	x86/speculation: Enable Spectre v1 swapgs mitigations
	x86/entry/64: Use JMP instead of JMPQ
	x86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS
	Documentation: Add swapgs description to the Spectre v1 documentation
	Linux 4.19.65

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Iceeabdb164657e0a616db618e6aa8445d56b0dc1
2019-08-06 20:08:18 +02:00
Greg Kroah-Hartman
cc4c818b22 Linux 4.19.65 2019-08-06 19:06:58 +02:00
Josh Poimboeuf
7634b9cd27 Documentation: Add swapgs description to the Spectre v1 documentation
commit 4c92057661a3412f547ede95715641d7ee16ddac upstream

Add documentation to the Spectre document about the new swapgs variant of
Spectre v1.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:58 +02:00
Thomas Gleixner
b88241aef6 x86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS
commit f36cf386e3fec258a341d446915862eded3e13d8 upstream

Intel provided the following information:

 On all current Atom processors, instructions that use a segment register
 value (e.g. a load or store) will not speculatively execute before the
 last writer of that segment retires. Thus they will not use a
 speculatively written segment value.

That means on ATOMs there is no speculation through SWAPGS, so the SWAPGS
entry paths can be excluded from the extra LFENCE if PTI is disabled.

Create a separate bug flag for the through SWAPGS speculation and mark all
out-of-order ATOMs and AMD/HYGON CPUs as not affected. The in-order ATOMs
are excluded from the whole mitigation mess anyway.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:58 +02:00
Josh Poimboeuf
931b6bfe8a x86/entry/64: Use JMP instead of JMPQ
commit 64dbc122b20f75183d8822618c24f85144a5a94d upstream

Somehow the swapgs mitigation entry code patch ended up with a JMPQ
instruction instead of JMP, where only the short jump is needed.  Some
assembler versions apparently fail to optimize JMPQ into a two-byte JMP
when possible, instead always using a 7-byte JMP with relocation.  For
some reason that makes the entry code explode with a #GP during boot.

Change it back to "JMP" as originally intended.

Fixes: 18ec54fdd6d1 ("x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:58 +02:00
Josh Poimboeuf
23e7a7b3a7 x86/speculation: Enable Spectre v1 swapgs mitigations
commit a2059825986a1c8143fd6698774fa9d83733bb11 upstream

The previous commit added macro calls in the entry code which mitigate the
Spectre v1 swapgs issue if the X86_FEATURE_FENCE_SWAPGS_* features are
enabled.  Enable those features where applicable.

The mitigations may be disabled with "nospectre_v1" or "mitigations=off".

There are different features which can affect the risk of attack:

- When FSGSBASE is enabled, unprivileged users are able to place any
  value in GS, using the wrgsbase instruction.  This means they can
  write a GS value which points to any value in kernel space, which can
  be useful with the following gadget in an interrupt/exception/NMI
  handler:

	if (coming from user space)
		swapgs
	mov %gs:<percpu_offset>, %reg1
	// dependent load or store based on the value of %reg
	// for example: mov %(reg1), %reg2

  If an interrupt is coming from user space, and the entry code
  speculatively skips the swapgs (due to user branch mistraining), it
  may speculatively execute the GS-based load and a subsequent dependent
  load or store, exposing the kernel data to an L1 side channel leak.

  Note that, on Intel, a similar attack exists in the above gadget when
  coming from kernel space, if the swapgs gets speculatively executed to
  switch back to the user GS.  On AMD, this variant isn't possible
  because swapgs is serializing with respect to future GS-based
  accesses.

  NOTE: The FSGSBASE patch set hasn't been merged yet, so the above case
	doesn't exist quite yet.

- When FSGSBASE is disabled, the issue is mitigated somewhat because
  unprivileged users must use prctl(ARCH_SET_GS) to set GS, which
  restricts GS values to user space addresses only.  That means the
  gadget would need an additional step, since the target kernel address
  needs to be read from user space first.  Something like:

	if (coming from user space)
		swapgs
	mov %gs:<percpu_offset>, %reg1
	mov (%reg1), %reg2
	// dependent load or store based on the value of %reg2
	// for example: mov %(reg2), %reg3

  It's difficult to audit for this gadget in all the handlers, so while
  there are no known instances of it, it's entirely possible that it
  exists somewhere (or could be introduced in the future).  Without
  tooling to analyze all such code paths, consider it vulnerable.

  Effects of SMAP on the !FSGSBASE case:

  - If SMAP is enabled, and the CPU reports RDCL_NO (i.e., not
    susceptible to Meltdown), the kernel is prevented from speculatively
    reading user space memory, even L1 cached values.  This effectively
    disables the !FSGSBASE attack vector.

  - If SMAP is enabled, but the CPU *is* susceptible to Meltdown, SMAP
    still prevents the kernel from speculatively reading user space
    memory.  But it does *not* prevent the kernel from reading the
    user value from L1, if it has already been cached.  This is probably
    only a small hurdle for an attacker to overcome.

Thanks to Dave Hansen for contributing the speculative_smap() function.

Thanks to Andrew Cooper for providing the inside scoop on whether swapgs
is serializing on AMD.

[ tglx: Fixed the USER fence decision and polished the comment as suggested
  	by Dave Hansen ]

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:58 +02:00
Josh Poimboeuf
befb822c06 x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations
commit 18ec54fdd6d18d92025af097cd042a75cf0ea24c upstream

Spectre v1 isn't only about array bounds checks.  It can affect any
conditional checks.  The kernel entry code interrupt, exception, and NMI
handlers all have conditional swapgs checks.  Those may be problematic in
the context of Spectre v1, as kernel code can speculatively run with a user
GS.

For example:

	if (coming from user space)
		swapgs
	mov %gs:<percpu_offset>, %reg
	mov (%reg), %reg1

When coming from user space, the CPU can speculatively skip the swapgs, and
then do a speculative percpu load using the user GS value.  So the user can
speculatively force a read of any kernel value.  If a gadget exists which
uses the percpu value as an address in another load/store, then the
contents of the kernel value may become visible via an L1 side channel
attack.

A similar attack exists when coming from kernel space.  The CPU can
speculatively do the swapgs, causing the user GS to get used for the rest
of the speculative window.

The mitigation is similar to a traditional Spectre v1 mitigation, except:

  a) index masking isn't possible; because the index (percpu offset)
     isn't user-controlled; and

  b) an lfence is needed in both the "from user" swapgs path and the
     "from kernel" non-swapgs path (because of the two attacks described
     above).

The user entry swapgs paths already have SWITCH_TO_KERNEL_CR3, which has a
CR3 write when PTI is enabled.  Since CR3 writes are serializing, the
lfences can be skipped in those cases.

On the other hand, the kernel entry swapgs paths don't depend on PTI.

To avoid unnecessary lfences for the user entry case, create two separate
features for alternative patching:

  X86_FEATURE_FENCE_SWAPGS_USER
  X86_FEATURE_FENCE_SWAPGS_KERNEL

Use these features in entry code to patch in lfences where needed.

The features aren't enabled yet, so there's no functional change.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:58 +02:00
Fenghua Yu
b5dd7f61fc x86/cpufeatures: Combine word 11 and 12 into a new scattered features word
commit acec0ce081de0c36459eea91647faf99296445a3 upstream

It's a waste for the four X86_FEATURE_CQM_* feature bits to occupy two
whole feature bits words. To better utilize feature words, re-define
word 11 to host scattered features and move the four X86_FEATURE_CQM_*
features into Linux defined word 11. More scattered features can be
added in word 11 in the future.

Rename leaf 11 in cpuid_leafs to CPUID_LNX_4 to reflect it's a
Linux-defined leaf.

Rename leaf 12 as CPUID_DUMMY which will be replaced by a meaningful
name in the next patch when CPUID.7.1:EAX occupies world 12.

Maximum number of RMID and cache occupancy scale are retrieved from
CPUID.0xf.1 after scattered CQM features are enumerated. Carve out the
code into a separate function.

KVM doesn't support resctrl now. So it's safe to move the
X86_FEATURE_CQM_* features to scattered features word 11 for KVM.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Aaron Lewis <aaronlewis@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Babu Moger <babu.moger@amd.com>
Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
Cc: "Sean J Christopherson" <sean.j.christopherson@intel.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Nadav Amit <namit@vmware.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
Cc: Sherry Hurwitz <sherry.hurwitz@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
Cc: x86 <x86@kernel.org>
Link: https://lkml.kernel.org/r/1560794416-217638-2-git-send-email-fenghua.yu@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:58 +02:00
Borislav Petkov
16ad0b63f3 x86/cpufeatures: Carve out CQM features retrieval
commit 45fc56e629caa451467e7664fbd4c797c434a6c4 upstream

... into a separate function for better readability. Split out from a
patch from Fenghua Yu <fenghua.yu@intel.com> to keep the mechanical,
sole code movement separate for easy review.

No functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: x86@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:57 +02:00
Suganath Prabu
9e034c6195 scsi: mpt3sas: Use 63-bit DMA addressing on SAS35 HBA
commit df9a606184bfdb5ae3ca9d226184e9489f5c24f7 upstream.

Although SAS3 & SAS3.5 IT HBA controllers support 64-bit DMA addressing, as
per hardware design, if DMA-able range contains all 64-bits
set (0xFFFFFFFF-FFFFFFFF) then it results in a firmware fault.

E.g. SGE's start address is 0xFFFFFFFF-FFFF000 and data length is 0x1000
bytes. when HBA tries to DMA the data at 0xFFFFFFFF-FFFFFFFF location then
HBA will fault the firmware.

Driver will set 63-bit DMA mask to ensure the above address will not be
used.

Cc: <stable@vger.kernel.org> # 4.19.63
Signed-off-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:57 +02:00
Andy Lutomirski
3732a473be x86/vdso: Prevent segfaults due to hoisted vclock reads
commit ff17bbe0bb405ad8b36e55815d381841f9fdeebc upstream.

GCC 5.5.0 sometimes cleverly hoists reads of the pvclock and/or hvclock
pages before the vclock mode checks.  This creates a path through
vclock_gettime() in which no vclock is enabled at all (due to disabled
TSC on old CPUs, for example) but the pvclock or hvclock page
nevertheless read.  This will segfault on bare metal.

This fixes commit 459e3a21535a ("gcc-9: properly declare the
{pv,hv}clock_page storage") in the sense that, before that commit, GCC
didn't seem to generate the offending code.  There was nothing wrong
with that commit per se, and -stable maintainers should backport this to
all supported kernels regardless of whether the offending commit was
present, since the same crash could just as easily be triggered by the
phase of the moon.

On GCC 9.1.1, this doesn't seem to affect the generated code at all, so
I'm not too concerned about performance regressions from this fix.

Cc: stable@vger.kernel.org
Cc: x86@kernel.org
Cc: Borislav Petkov <bp@alien8.de>
Reported-by: Duncan Roe <duncan_roe@optusnet.com.au>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:57 +02:00
Linus Torvalds
8320768d26 gcc-9: properly declare the {pv,hv}clock_page storage
commit 459e3a21535ae3c7a9a123650e54f5c882b8fcbf upstream.

The pvlock_page and hvclock_page variables are (as the name implies)
addresses to pages, created by the linker script.

But we declared them as just "extern u8" variables, which _works_, but
now that gcc does some more bounds checking, it causes warnings like

    warning: array subscript 1 is outside array bounds of ‘u8[1]’

when we then access more than one byte from those variables.

Fix this by simply making the declaration of the variables match
reality, which makes the compiler happy too.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:57 +02:00
Josh Poimboeuf
354887ae31 objtool: Support GCC 9 cold subfunction naming scheme
commit bcb6fb5da77c2a228adf07cc9cb1a0c2aa2001c6 upstream.

Starting with GCC 8, a lot of unlikely code was moved out of line to
"cold" subfunctions in .text.unlikely.

For example, the unlikely bits of:

  irq_do_set_affinity()

are moved out to the following subfunction:

  irq_do_set_affinity.cold.49()

Starting with GCC 9, the numbered suffix has been removed.  So in the
above example, the cold subfunction is instead:

  irq_do_set_affinity.cold()

Tweak the objtool subfunction detection logic so that it detects both
GCC 8 and GCC 9 naming schemes.

Reported-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/015e9544b1f188d36a7f02fa31e9e95629aa5f50.1541040800.git.jpoimboe@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:57 +02:00
Eugeniy Paltsev
89f3896b65 ARC: enable uboot support unconditionally
commit 493a2f812446e92bcb1e69a77381b4d39808d730 upstream.

After reworking U-boot args handling code and adding paranoid
arguments check we can eliminate CONFIG_ARC_UBOOT_SUPPORT and
enable uboot support unconditionally.

For JTAG case we can assume that core registers will come up
reset value of 0 or in worst case we rely on user passing
'-on=clear_regs' to Metaware debugger.

Cc: stable@vger.kernel.org
Tested-by: Corentin LABBE <clabbe@baylibre.com>
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:57 +02:00
Jean Delvare
8dd376273f eeprom: at24: make spd world-readable again
commit 25e5ef302c24a6fead369c0cfe88c073d7b97ca8 upstream.

The integration of the at24 driver into the nvmem framework broke the
world-readability of spd EEPROMs. Fix it.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: stable@vger.kernel.org
Fixes: 57d155506d ("eeprom: at24: extend driver to plug into the NVMEM framework")
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Bartosz Golaszewski <brgl@bgdev.pl>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
[Bartosz: backported to v4.19.y]
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:57 +02:00
Xiaolin Zhang
a7340d31ab drm/i915/gvt: fix incorrect cache entry for guest page mapping
commit 7366aeb77cd840f3edea02c65065d40affaa7f45 upstream.

GPU hang observed during the guest OCL conformance test which is caused
by THP GTT feature used durning the test.

It was observed the same GFN with different size (4K and 2M) requested
from the guest in GVT. So during the guest page dma map stage, it is
required to unmap first with orginal size and then remap again with
requested size.

Fixes: b901b252b6 ("drm/i915/gvt: Add 2M huge gtt support")
Cc: stable@vger.kernel.org
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:56 +02:00
John Fleck
a1c020ce26 IB/hfi1: Check for error on call to alloc_rsm_map_table
commit cd48a82087231fdba0e77521102386c6ed0168d6 upstream.

The call to alloc_rsm_map_table does not check if the kmalloc fails.
Check for a NULL on alloc, and bail if it fails.

Fixes: 372cc85a13 ("IB/hfi1: Extract RSM map table init from QOS")
Link: https://lore.kernel.org/r/20190715164521.74174.27047.stgit@awfm-01.aw.intel.com
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: John Fleck <john.fleck@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:56 +02:00
Yishai Hadas
e9cd4962a8 IB/mlx5: Fix RSS Toeplitz setup to be aligned with the HW specification
commit b7165bd0d6cbb93732559be6ea8774653b204480 upstream.

The specification for the Toeplitz function doesn't require to set the key
explicitly to be symmetric. In case a symmetric functionality is required
a symmetric key can be simply used.

Wrongly forcing the algorithm to symmetric causes the wrong packet
distribution and a performance degradation.

Link: https://lore.kernel.org/r/20190723065733.4899-7-leon@kernel.org
Cc: <stable@vger.kernel.org> # 4.7
Fixes: 28d6137008 ("IB/mlx5: Add RSS QP support")
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Alex Vainman <alexv@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:56 +02:00
Yishai Hadas
924308d2a6 IB/mlx5: Fix clean_mr() to work in the expected order
commit b9332dad987018745a0c0bb718d12dacfa760489 upstream.

Any dma map underlying the MR should only be freed once the MR is fenced
at the hardware.

As of the above we first destroy the MKEY and just after that can safely
call to dma_unmap_single().

Link: https://lore.kernel.org/r/20190723065733.4899-6-leon@kernel.org
Cc: <stable@vger.kernel.org> # 4.3
Fixes: 8a187ee52b ("IB/mlx5: Support the new memory registration API")
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:56 +02:00
Yishai Hadas
7e5ce9f394 IB/mlx5: Move MRs to a kernel PD when freeing them to the MR cache
commit 9ec4483a3f0f71a228a5933bc040441322bfb090 upstream.

Fix unreg_umr to move the MR to a kernel owned PD (i.e. the UMR PD) which
can't be accessed by userspace.

This ensures that nothing can continue to access the MR once it has been
placed in the kernels cache for reuse.

MRs in the cache continue to have their HW state, including DMA tables,
present. Even though the MR has been invalidated, changing the PD provides
an additional layer of protection against use of the MR.

Link: https://lore.kernel.org/r/20190723065733.4899-5-leon@kernel.org
Cc: <stable@vger.kernel.org> # 3.10
Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:56 +02:00
Yishai Hadas
3cfa1087a1 IB/mlx5: Use direct mkey destroy command upon UMR unreg failure
commit afd1417404fba6dbfa6c0a8e5763bd348da682e4 upstream.

Use a direct firmware command to destroy the mkey in case the unreg UMR
operation has failed.

This prevents a case that a mkey will leak out from the cache post a
failure to be destroyed by a UMR WR.

In case the MR cache limit didn't reach a call to add another entry to the
cache instead of the destroyed one is issued.

In addition, replaced a warn message to WARN_ON() as this flow is fatal
and can't happen unless some bug around.

Link: https://lore.kernel.org/r/20190723065733.4899-4-leon@kernel.org
Cc: <stable@vger.kernel.org> # 4.10
Fixes: 49780d42df ("IB/mlx5: Expose MR cache for mlx5_ib")
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:56 +02:00
Yishai Hadas
41be192805 IB/mlx5: Fix unreg_umr to ignore the mkey state
commit 6a053953739d23694474a5f9c81d1a30093da81a upstream.

Fix unreg_umr to ignore the mkey state and do not fail if was freed.  This
prevents a case that a user space application already changed the mkey
state to free and then the UMR operation will fail leaving the mkey in an
inappropriate state.

Link: https://lore.kernel.org/r/20190723065733.4899-3-leon@kernel.org
Cc: <stable@vger.kernel.org> # 3.19
Fixes: 968e78dd96 ("IB/mlx5: Enhance UMR support to allow partial page table update")
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:56 +02:00
Juergen Gross
04fdca1f2f xen/swiotlb: fix condition for calling xen_destroy_contiguous_region()
commit 50f6393f9654c561df4cdcf8e6cfba7260143601 upstream.

The condition in xen_swiotlb_free_coherent() for deciding whether to
call xen_destroy_contiguous_region() is wrong: in case the region to
be freed is not contiguous calling xen_destroy_contiguous_region() is
the wrong thing to do: it would result in inconsistent mappings of
multiple PFNs to the same MFN. This will lead to various strange
crashes or data corruption.

Instead of calling xen_destroy_contiguous_region() in that case a
warning should be issued as that situation should never occur.

Cc: stable@vger.kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:55 +02:00
Munehisa Kamata
eb828241b4 nbd: replace kill_bdev() with __invalidate_device() again
commit 2b5c8f0063e4b263cf2de82029798183cf85c320 upstream.

Commit abbbdf1249 ("replace kill_bdev() with __invalidate_device()")
once did this, but 29eaadc036 ("nbd: stop using the bdev everywhere")
resurrected kill_bdev() and it has been there since then. So buffer_head
mappings still get killed on a server disconnection, and we can still
hit the BUG_ON on a filesystem on the top of the nbd device.

  EXT4-fs (nbd0): mounted filesystem with ordered data mode. Opts: (null)
  block nbd0: Receive control failed (result -32)
  block nbd0: shutting down sockets
  print_req_error: I/O error, dev nbd0, sector 66264 flags 3000
  EXT4-fs warning (device nbd0): htree_dirblock_to_tree:979: inode #2: lblock 0: comm ls: error -5 reading directory block
  print_req_error: I/O error, dev nbd0, sector 2264 flags 3000
  EXT4-fs error (device nbd0): __ext4_get_inode_loc:4690: inode #2: block 283: comm ls: unable to read itable block
  EXT4-fs error (device nbd0) in ext4_reserve_inode_write:5894: IO failure
  ------------[ cut here ]------------
  kernel BUG at fs/buffer.c:3057!
  invalid opcode: 0000 [#1] SMP PTI
  CPU: 7 PID: 40045 Comm: jbd2/nbd0-8 Not tainted 5.1.0-rc3+ #4
  Hardware name: Amazon EC2 m5.12xlarge/, BIOS 1.0 10/16/2017
  RIP: 0010:submit_bh_wbc+0x18b/0x190
  ...
  Call Trace:
   jbd2_write_superblock+0xf1/0x230 [jbd2]
   ? account_entity_enqueue+0xc5/0xf0
   jbd2_journal_update_sb_log_tail+0x94/0xe0 [jbd2]
   jbd2_journal_commit_transaction+0x12f/0x1d20 [jbd2]
   ? __switch_to_asm+0x40/0x70
   ...
   ? lock_timer_base+0x67/0x80
   kjournald2+0x121/0x360 [jbd2]
   ? remove_wait_queue+0x60/0x60
   kthread+0xf8/0x130
   ? commit_timeout+0x10/0x10 [jbd2]
   ? kthread_bind+0x10/0x10
   ret_from_fork+0x35/0x40

With __invalidate_device(), I no longer hit the BUG_ON with sync or
unmount on the disconnected device.

Fixes: 29eaadc036 ("nbd: stop using the bdev everywhere")
Cc: linux-block@vger.kernel.org
Cc: Ratna Manoj Bolla <manoj.br@gmail.com>
Cc: nbd@other.debian.org
Cc: stable@vger.kernel.org
Cc: David Woodhouse <dwmw@amazon.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Munehisa Kamata <kamatam@amazon.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:55 +02:00
Will Deacon
8dfef0f442 arm64: cpufeature: Fix feature comparison for CTR_EL0.{CWG,ERG}
commit 147b9635e6347104b91f48ca9dca61eb0fbf2a54 upstream.

If CTR_EL0.{CWG,ERG} are 0b0000 then they must be interpreted to have
their architecturally maximum values, which defeats the use of
FTR_HIGHER_SAFE when sanitising CPU ID registers on heterogeneous
machines.

Introduce FTR_HIGHER_OR_ZERO_SAFE so that these fields effectively
saturate at zero.

Fixes: 3c739b5710 ("arm64: Keep track of CPU feature registers")
Cc: <stable@vger.kernel.org> # 4.4.x-
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:55 +02:00
Will Deacon
2bddc98565 arm64: compat: Allow single-byte watchpoints on all addresses
commit 849adec41203ac5837c40c2d7e08490ffdef3c2c upstream.

Commit d968d2b801 ("ARM: 7497/1: hw_breakpoint: allow single-byte
watchpoints on all addresses") changed the validation requirements for
hardware watchpoints on arch/arm/. Update our compat layer to implement
the same relaxation.

Cc: <stable@vger.kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:55 +02:00
Will Deacon
c385cda0e7 drivers/perf: arm_pmu: Fix failure path in PM notifier
commit 0d7fd70f26039bd4b33444ca47f0e69ce3ae0354 upstream.

Handling of the CPU_PM_ENTER_FAILED transition in the Arm PMU PM
notifier code incorrectly skips restoration of the counters. Fix the
logic so that CPU_PM_ENTER_FAILED follows the same path as CPU_PM_EXIT.

Cc: <stable@vger.kernel.org>
Fixes: da4e4f18af ("drivers/perf: arm_pmu: implement CPU_PM notifier")
Reported-by: Anders Roxell <anders.roxell@linaro.org>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:55 +02:00
Helge Deller
5f80ac50b9 parisc: Fix build of compressed kernel even with debug enabled
commit 3fe6c873af2f2247544debdbe51ec29f690a2ccf upstream.

With debug info enabled (CONFIG_DEBUG_INFO=y) the resulting vmlinux may get
that huge that we need to increase the start addresss for the decompression
text section otherwise one will face a linker error.

Reported-by: Sven Schnelle <svens@stackframe.org>
Tested-by: Sven Schnelle <svens@stackframe.org>
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:55 +02:00
Chris Down
001f93d95d cgroup: kselftest: relax fs_spec checks
commit b59b1baab789eacdde809135542e3d4f256f6878 upstream.

On my laptop most memcg kselftests were being skipped because it claimed
cgroup v2 hierarchy wasn't mounted, but this isn't correct.  Instead, it
seems current systemd HEAD mounts it with the name "cgroup2" instead of
"cgroup":

    % grep cgroup /proc/mounts
    cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate 0 0

I can't think of a reason to need to check fs_spec explicitly
since it's arbitrary, so we can just rely on fs_vfstype.

After these changes, `make TARGETS=cgroup kselftest` actually runs the
cgroup v2 tests in more cases.

Link: http://lkml.kernel.org/r/20190723210737.GA487@chrisdown.name
Signed-off-by: Chris Down <chris@chrisdown.name>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:55 +02:00
Stefan Haberland
6cb9e0d9ad s390/dasd: fix endless loop after read unit address configuration
commit 41995342b40c418a47603e1321256d2c4a2ed0fb upstream.

After getting a storage server event that causes the DASD device driver
to update its unit address configuration during a device shutdown there is
the possibility of an endless loop in the device driver.

In the system log there will be ongoing DASD error messages with RC: -19.

The reason is that the loop starting the ruac request only terminates when
the retry counter is decreased to 0. But in the sleep_on function there are
early exit paths that do not decrease the retry counter.

Prevent an endless loop by handling those cases separately.

Remove the unnecessary do..while loop since the sleep_on function takes
care of retries by itself.

Fixes: 8e09f21574 ("[S390] dasd: add hyper PAV support to DASD device driver, part 1")
Cc: stable@vger.kernel.org # 2.6.25+
Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:55 +02:00
Yang Shi
beb0cc781b mm: vmscan: check if mem cgroup is disabled or not before calling memcg slab shrinker
commit fa1e512fac717f34e7c12d7a384c46e90a647392 upstream.

Shakeel Butt reported premature oom on kernel with
"cgroup_disable=memory" since mem_cgroup_is_root() returns false even
though memcg is actually NULL.  The drop_caches is also broken.

It is because commit aeed1d325d ("mm/vmscan.c: generalize
shrink_slab() calls in shrink_node()") removed the !memcg check before
!mem_cgroup_is_root().  And, surprisingly root memcg is allocated even
though memory cgroup is disabled by kernel boot parameter.

Add mem_cgroup_disabled() check to make reclaimer work as expected.

Link: http://lkml.kernel.org/r/1563385526-20805-1-git-send-email-yang.shi@linux.alibaba.com
Fixes: aeed1d325d ("mm/vmscan.c: generalize shrink_slab() calls in shrink_node()")
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Reported-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Jan Hadrava <had@kam.mff.cuni.cz>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>	[4.19+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:54 +02:00
Samuel Thibault
72651bbdf3 ALSA: hda: Fix 1-minute detection delay when i915 module is not available
commit 74bf71ed792ab0f64631cc65ccdb54c356c36d45 upstream.

Distribution installation images such as Debian include different sets
of modules which can be downloaded dynamically.  Such images may notably
include the hda sound modules but not the i915 DRM module, even if the
latter was enabled at build time, as reported on
https://bugs.debian.org/931507

In such a case hdac_i915 would be linked in and try to load the i915
module, fail since it is not there, but still wait for a whole minute
before giving up binding with it.

This fixes such as case by only waiting for the binding if the module
was properly loaded (or module support is disabled, in which case i915
is already compiled-in anyway).

Fixes: f9b54e1961 ("ALSA: hda/i915: Allow delayed i915 audio component binding")
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:54 +02:00
Ondrej Mosnacek
46650ac2e1 selinux: fix memory leak in policydb_init()
commit 45385237f65aeee73641f1ef737d7273905a233f upstream.

Since roles_init() adds some entries to the role hash table, we need to
destroy also its keys/values on error, otherwise we get a memory leak in
the error path.

Cc: <stable@vger.kernel.org>
Reported-by: syzbot+fee3a14d4cdf92646287@syzkaller.appspotmail.com
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:54 +02:00
Marco Felsch
e7bb4c81b3 mtd: rawnand: micron: handle on-die "ECC-off" devices correctly
commit 8493b2a06fc5b77ef5c579dc32b12761f7b7a84c upstream.

Some devices are not supposed to support on-die ECC but experience
shows that internal ECC machinery can actually be enabled through the
"SET FEATURE (EFh)" command, even if a read of the "READ ID Parameter
Tables" returns that it is not.

Currently, the driver checks the "READ ID Parameter" field directly
after having enabled the feature. If the check fails it returns
immediately but leaves the ECC on. When using buggy chips like
MT29F2G08ABAGA and MT29F2G08ABBGA, all future read/program cycles will
go through the on-die ECC, confusing the host controller which is
supposed to be the one handling correction.

To address this in a common way we need to turn off the on-die ECC
directly after reading the "READ ID Parameter" and before checking the
"ECC status".

Cc: stable@vger.kernel.org
Fixes: dbc44edbf8 ("mtd: rawnand: micron: Fix on-die ECC detection logic")
Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:54 +02:00
Gustavo A. R. Silva
fafaeae4d3 IB/hfi1: Fix Spectre v1 vulnerability
commit 6497d0a9c53df6e98b25e2b79f2295d7caa47b6e upstream.

sl is controlled by user-space, hence leading to a potential
exploitation of the Spectre variant 1 vulnerability.

Fix this by sanitizing sl before using it to index ibp->sl_to_sc.

Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].

[1] https://lore.kernel.org/lkml/20180423164740.GY17484@dhcp22.suse.cz/

Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Link: https://lore.kernel.org/r/20190731175428.GA16736@embeddedor
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:54 +02:00
Michael Wu
fdb0fb56ad gpiolib: fix incorrect IRQ requesting of an active-low lineevent
commit 223ecaf140b1dd1c1d2a1a1d96281efc5c906984 upstream.

When a pin is active-low, logical trigger edge should be inverted to match
the same interrupt opportunity.

For example, a button pushed triggers falling edge in ACTIVE_HIGH case; in
ACTIVE_LOW case, the button pushed triggers rising edge. For user space the
IRQ requesting doesn't need to do any modification except to configuring
GPIOHANDLE_REQUEST_ACTIVE_LOW.

For example, we want to catch the event when the button is pushed. The
button on the original board drives level to be low when it is pushed, and
drives level to be high when it is released.

In user space we can do:

	req.handleflags = GPIOHANDLE_REQUEST_INPUT;
	req.eventflags = GPIOEVENT_REQUEST_FALLING_EDGE;

	while (1) {
		read(fd, &dat, sizeof(dat));
		if (dat.id == GPIOEVENT_EVENT_FALLING_EDGE)
			printf("button pushed\n");
	}

Run the same logic on another board which the polarity of the button is
inverted; it drives level to be high when pushed, and level to be low when
released. For this inversion we add flag GPIOHANDLE_REQUEST_ACTIVE_LOW:

	req.handleflags = GPIOHANDLE_REQUEST_INPUT |
		GPIOHANDLE_REQUEST_ACTIVE_LOW;
	req.eventflags = GPIOEVENT_REQUEST_FALLING_EDGE;

At the result, there are no any events caught when the button is pushed.
By the way, button releasing will emit a "falling" event. The timing of
"falling" catching is not expected.

Cc: stable@vger.kernel.org
Signed-off-by: Michael Wu <michael.wu@vatics.com>
Tested-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:54 +02:00
Joe Perches
7e3efb6550 mmc: meson-mx-sdio: Fix misuse of GENMASK macro
commit 665e985c2f41bebc3e6cee7e04c36a44afbc58f7 upstream.

Arguments are supposed to be ordered high then low.

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
Fixes: ed80a13bb4 ("mmc: meson-mx-sdio: Add a driver for the Amlogic
Meson8 and Meson8b SoCs")
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:54 +02:00
Douglas Anderson
29841b5c6a mmc: dw_mmc: Fix occasional hang after tuning on eMMC
commit ba2d139b02ba684c6c101de42fed782d6cd2b997 upstream.

In commit 46d179525a ("mmc: dw_mmc: Wait for data transfer after
response errors.") we fixed a tuning-induced hang that I saw when
stress testing tuning on certain SD cards.  I won't re-hash that whole
commit, but the summary is that as a normal part of tuning you need to
deal with transfer errors and there were cases where these transfer
errors was putting my system into a bad state causing all future
transfers to fail.  That commit fixed handling of the transfer errors
for me.

In downstream Chrome OS my fix landed and had the same behavior for
all SD/MMC commands.  However, it looks like when the commit landed
upstream we limited it to only SD tuning commands.  Presumably this
was to try to get around problems that Alim Akhtar reported on exynos
[1].

Unfortunately while stress testing reboots (and suspend/resume) on
some rk3288-based Chromebooks I found the same problem on the eMMC on
some of my Chromebooks (the ones with Hynix eMMC).  Since the eMMC
tuning command is different (MMC_SEND_TUNING_BLOCK_HS200
vs. MMC_SEND_TUNING_BLOCK) we were basically getting back into the
same situation.

I'm hoping that whatever problems exynos was having in the past are
somehow magically fixed now and we can make the behavior the same for
all commands.

[1] https://lkml.kernel.org/r/CAGOxZ53WfNbaMe0_AM0qBqU47kAfgmPBVZC8K8Y-_J3mDMqW4A@mail.gmail.com

Fixes: 46d179525a ("mmc: dw_mmc: Wait for data transfer after response errors.")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Alim Akhtar <alim.akhtar@gmail.com>
Cc: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:53 +02:00
Filipe Manana
50d700408a Btrfs: fix race leading to fs corruption after transaction abort
commit cb2d3daddbfb6318d170e79aac1f7d5e4d49f0d7 upstream.

When one transaction is finishing its commit, it is possible for another
transaction to start and enter its initial commit phase as well. If the
first ends up getting aborted, we have a small time window where the second
transaction commit does not notice that the previous transaction aborted
and ends up committing, writing a superblock that points to btrees that
reference extent buffers (nodes and leafs) that were not persisted to disk.
The consequence is that after mounting the filesystem again, we will be
unable to load some btree nodes/leafs, either because the content on disk
is either garbage (or just zeroes) or corresponds to the old content of a
previouly COWed or deleted node/leaf, resulting in the well known error
messages "parent transid verify failed on ...".
The following sequence diagram illustrates how this can happen.

        CPU 1                                           CPU 2

 <at transaction N>

 btrfs_commit_transaction()
   (...)
   --> sets transaction state to
       TRANS_STATE_UNBLOCKED
   --> sets fs_info->running_transaction
       to NULL

                                                    (...)
                                                    btrfs_start_transaction()
                                                      start_transaction()
                                                        wait_current_trans()
                                                          --> returns immediately
                                                              because
                                                              fs_info->running_transaction
                                                              is NULL
                                                        join_transaction()
                                                          --> creates transaction N + 1
                                                          --> sets
                                                              fs_info->running_transaction
                                                              to transaction N + 1
                                                          --> adds transaction N + 1 to
                                                              the fs_info->trans_list list
                                                        --> returns transaction handle
                                                            pointing to the new
                                                            transaction N + 1
                                                    (...)

                                                    btrfs_sync_file()
                                                      btrfs_start_transaction()
                                                        --> returns handle to
                                                            transaction N + 1
                                                      (...)

   btrfs_write_and_wait_transaction()
     --> writeback of some extent
         buffer fails, returns an
	 error
   btrfs_handle_fs_error()
     --> sets BTRFS_FS_STATE_ERROR in
         fs_info->fs_state
   --> jumps to label "scrub_continue"
   cleanup_transaction()
     btrfs_abort_transaction(N)
       --> sets BTRFS_FS_STATE_TRANS_ABORTED
           flag in fs_info->fs_state
       --> sets aborted field in the
           transaction and transaction
	   handle structures, for
           transaction N only
     --> removes transaction from the
         list fs_info->trans_list
                                                      btrfs_commit_transaction(N + 1)
                                                        --> transaction N + 1 was not
							    aborted, so it proceeds
                                                        (...)
                                                        --> sets the transaction's state
                                                            to TRANS_STATE_COMMIT_START
                                                        --> does not find the previous
                                                            transaction (N) in the
                                                            fs_info->trans_list, so it
                                                            doesn't know that transaction
                                                            was aborted, and the commit
                                                            of transaction N + 1 proceeds
                                                        (...)
                                                        --> sets transaction N + 1 state
                                                            to TRANS_STATE_UNBLOCKED
                                                        btrfs_write_and_wait_transaction()
                                                          --> succeeds writing all extent
                                                              buffers created in the
                                                              transaction N + 1
                                                        write_all_supers()
                                                           --> succeeds
                                                           --> we now have a superblock on
                                                               disk that points to trees
                                                               that refer to at least one
                                                               extent buffer that was
                                                               never persisted

So fix this by updating the transaction commit path to check if the flag
BTRFS_FS_STATE_TRANS_ABORTED is set on fs_info->fs_state if after setting
the transaction to the TRANS_STATE_COMMIT_START we do not find any previous
transaction in the fs_info->trans_list. If the flag is set, just fail the
transaction commit with -EROFS, as we do in other places. The exact error
code for the previous transaction abort was already logged and reported.

Fixes: 49b25e0540 ("btrfs: enhance transaction abort infrastructure")
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:53 +02:00
Filipe Manana
009d7a4eb3 Btrfs: fix incremental send failure after deduplication
commit b4f9a1a87a48c255bb90d8a6c3d555a1abb88130 upstream.

When doing an incremental send operation we can fail if we previously did
deduplication operations against a file that exists in both snapshots. In
that case we will fail the send operation with -EIO and print a message
to dmesg/syslog like the following:

  BTRFS error (device sdc): Send: inconsistent snapshot, found updated \
  extent for inode 257 without updated inode item, send root is 258, \
  parent root is 257

This requires that we deduplicate to the same file in both snapshots for
the same amount of times on each snapshot. The issue happens because a
deduplication only updates the iversion of an inode and does not update
any other field of the inode, therefore if we deduplicate the file on
each snapshot for the same amount of time, the inode will have the same
iversion value (stored as the "sequence" field on the inode item) on both
snapshots, therefore it will be seen as unchanged between in the send
snapshot while there are new/updated/deleted extent items when comparing
to the parent snapshot. This makes the send operation return -EIO and
print an error message.

Example reproducer:

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt

  # Create our first file. The first half of the file has several 64Kb
  # extents while the second half as a single 512Kb extent.
  $ xfs_io -f -s -c "pwrite -S 0xb8 -b 64K 0 512K" /mnt/foo
  $ xfs_io -c "pwrite -S 0xb8 512K 512K" /mnt/foo

  # Create the base snapshot and the parent send stream from it.
  $ btrfs subvolume snapshot -r /mnt /mnt/mysnap1
  $ btrfs send -f /tmp/1.snap /mnt/mysnap1

  # Create our second file, that has exactly the same data as the first
  # file.
  $ xfs_io -f -c "pwrite -S 0xb8 0 1M" /mnt/bar

  # Create the second snapshot, used for the incremental send, before
  # doing the file deduplication.
  $ btrfs subvolume snapshot -r /mnt /mnt/mysnap2

  # Now before creating the incremental send stream:
  #
  # 1) Deduplicate into a subrange of file foo in snapshot mysnap1. This
  #    will drop several extent items and add a new one, also updating
  #    the inode's iversion (sequence field in inode item) by 1, but not
  #    any other field of the inode;
  #
  # 2) Deduplicate into a different subrange of file foo in snapshot
  #    mysnap2. This will replace an extent item with a new one, also
  #    updating the inode's iversion by 1 but not any other field of the
  #    inode.
  #
  # After these two deduplication operations, the inode items, for file
  # foo, are identical in both snapshots, but we have different extent
  # items for this inode in both snapshots. We want to check this doesn't
  # cause send to fail with an error or produce an incorrect stream.

  $ xfs_io -r -c "dedupe /mnt/bar 0 0 512K" /mnt/mysnap1/foo
  $ xfs_io -r -c "dedupe /mnt/bar 512K 512K 512K" /mnt/mysnap2/foo

  # Create the incremental send stream.
  $ btrfs send -p /mnt/mysnap1 -f /tmp/2.snap /mnt/mysnap2
  ERROR: send ioctl failed with -5: Input/output error

This issue started happening back in 2015 when deduplication was updated
to not update the inode's ctime and mtime and update only the iversion.
Back then we would hit a BUG_ON() in send, but later in 2016 send was
updated to return -EIO and print the error message instead of doing the
BUG_ON().

A test case for fstests follows soon.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203933
Fixes: 1c919a5e13 ("btrfs: don't update mtime/ctime on deduped inodes")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:53 +02:00
Masahiro Yamada
4c5a442561 kbuild: initialize CLANG_FLAGS correctly in the top Makefile
commit 5241ab4cf42d3a93b933b55d3d53f43049081fa1 upstream.

CLANG_FLAGS is initialized by the following line:

  CLANG_FLAGS     := --target=$(notdir $(CROSS_COMPILE:%-=%))

..., which is run only when CROSS_COMPILE is set.

Some build targets (bindeb-pkg etc.) recurse to the top Makefile.

When you build the kernel with Clang but without CROSS_COMPILE,
the same compiler flags such as -no-integrated-as are accumulated
into CLANG_FLAGS.

If you run 'make CC=clang' and then 'make CC=clang bindeb-pkg',
Kbuild will recompile everything needlessly due to the build command
change.

Fix this by correctly initializing CLANG_FLAGS.

Fixes: 238bcbc4e07f ("kbuild: consolidate Clang compiler flags")
Cc: <stable@vger.kernel.org> # v5.0+
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:53 +02:00
M. Vefa Bicakci
3736612d25 kconfig: Clear "written" flag to avoid data loss
commit 0c5b6c28ed68becb692b43eae5e44d5aa7e160ce upstream.

Prior to this commit, starting nconfig, xconfig or gconfig, and saving
the .config file more than once caused data loss, where a .config file
that contained only comments would be written to disk starting from the
second save operation.

This bug manifests itself because the SYMBOL_WRITTEN flag is never
cleared after the first call to conf_write, and subsequent calls to
conf_write then skip all of the configuration symbols due to the
SYMBOL_WRITTEN flag being set.

This commit resolves this issue by clearing the SYMBOL_WRITTEN flag
from all symbols before conf_write returns.

Fixes: 8e2442a5f86e ("kconfig: fix missing choice values in auto.conf")
Cc: linux-stable <stable@vger.kernel.org> # 4.19+
Signed-off-by: M. Vefa Bicakci <m.v.b@runbox.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:53 +02:00
Yongxin Liu
4c6500b532 drm/nouveau: fix memory leak in nouveau_conn_reset()
[ Upstream commit 09b90e2fe35faeace2488234e2a7728f2ea8ba26 ]

In nouveau_conn_reset(), if connector->state is true,
__drm_atomic_helper_connector_destroy_state() will be called,
but the memory pointed by asyc isn't freed. Memory leak happens
in the following function __drm_atomic_helper_connector_reset(),
where newly allocated asyc->state will be assigned to connector->state.

So using nouveau_conn_atomic_destroy_state() instead of
__drm_atomic_helper_connector_destroy_state to free the "old" asyc.

Here the is the log showing memory leak.

unreferenced object 0xffff8c5480483c80 (size 192):
  comm "kworker/0:2", pid 188, jiffies 4294695279 (age 53.179s)
  hex dump (first 32 bytes):
    00 f0 ba 7b 54 8c ff ff 00 00 00 00 00 00 00 00  ...{T...........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<000000005005c0d0>] kmem_cache_alloc_trace+0x195/0x2c0
    [<00000000a122baed>] nouveau_conn_reset+0x25/0xc0 [nouveau]
    [<000000004fd189a2>] nouveau_connector_create+0x3a7/0x610 [nouveau]
    [<00000000c73343a8>] nv50_display_create+0x343/0x980 [nouveau]
    [<000000002e2b03c3>] nouveau_display_create+0x51f/0x660 [nouveau]
    [<00000000c924699b>] nouveau_drm_device_init+0x182/0x7f0 [nouveau]
    [<00000000cc029436>] nouveau_drm_probe+0x20c/0x2c0 [nouveau]
    [<000000007e961c3e>] local_pci_probe+0x47/0xa0
    [<00000000da14d569>] work_for_cpu_fn+0x1a/0x30
    [<0000000028da4805>] process_one_work+0x27c/0x660
    [<000000001d415b04>] worker_thread+0x22b/0x3f0
    [<0000000003b69f1f>] kthread+0x12f/0x150
    [<00000000c94c29b7>] ret_from_fork+0x3a/0x50

Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-06 19:06:53 +02:00
Zhenzhong Duan
84ce045222 x86, boot: Remove multiple copy of static function sanitize_boot_params()
[ Upstream commit 8c5477e8046ca139bac250386c08453da37ec1ae ]

Kernel build warns:
 'sanitize_boot_params' defined but not used [-Wunused-function]

at below files:
  arch/x86/boot/compressed/cmdline.c
  arch/x86/boot/compressed/error.c
  arch/x86/boot/compressed/early_serial_console.c
  arch/x86/boot/compressed/acpi.c

That's becausethey each include misc.h which includes a definition of
sanitize_boot_params() via bootparam_utils.h.

Remove the inclusion from misc.h and have the c file including
bootparam_utils.h directly.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/1563283092-1189-1-git-send-email-zhenzhong.duan@oracle.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-06 19:06:53 +02:00
Josh Poimboeuf
740e0167a3 x86/paravirt: Fix callee-saved function ELF sizes
[ Upstream commit 083db6764821996526970e42d09c1ab2f4155dd4 ]

The __raw_callee_save_*() functions have an ELF symbol size of zero,
which confuses objtool and other tools.

Fixes a bunch of warnings like the following:

  arch/x86/xen/mmu_pv.o: warning: objtool: __raw_callee_save_xen_pte_val() is missing an ELF size annotation
  arch/x86/xen/mmu_pv.o: warning: objtool: __raw_callee_save_xen_pgd_val() is missing an ELF size annotation
  arch/x86/xen/mmu_pv.o: warning: objtool: __raw_callee_save_xen_make_pte() is missing an ELF size annotation
  arch/x86/xen/mmu_pv.o: warning: objtool: __raw_callee_save_xen_make_pgd() is missing an ELF size annotation

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/afa6d49bb07497ca62e4fc3b27a2d0cece545b4e.1563413318.git.jpoimboe@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-06 19:06:52 +02:00
Josh Poimboeuf
ba5c072fbf x86/kvm: Don't call kvm_spurious_fault() from .fixup
[ Upstream commit 3901336ed9887b075531bffaeef7742ba614058b ]

After making a change to improve objtool's sibling call detection, it
started showing the following warning:

  arch/x86/kvm/vmx/nested.o: warning: objtool: .fixup+0x15: sibling call from callable instruction with modified stack frame

The problem is the ____kvm_handle_fault_on_reboot() macro.  It does a
fake call by pushing a fake RIP and doing a jump.  That tricks the
unwinder into printing the function which triggered the exception,
rather than the .fixup code.

Instead of the hack to make it look like the original function made the
call, just change the macro so that the original function actually does
make the call.  This allows removal of the hack, and also makes objtool
happy.

I triggered a vmx instruction exception and verified that the stack
trace is still sane:

  kernel BUG at arch/x86/kvm/x86.c:358!
  invalid opcode: 0000 [#1] SMP PTI
  CPU: 28 PID: 4096 Comm: qemu-kvm Not tainted 5.2.0+ #16
  Hardware name: Lenovo THINKSYSTEM SD530 -[7X2106Z000]-/-[7X2106Z000]-, BIOS -[TEE113Z-1.00]- 07/17/2017
  RIP: 0010:kvm_spurious_fault+0x5/0x10
  Code: 00 00 00 00 00 8b 44 24 10 89 d2 45 89 c9 48 89 44 24 10 8b 44 24 08 48 89 44 24 08 e9 d4 40 22 00 0f 1f 40 00 0f 1f 44 00 00 <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 55 49 89 fd 41
  RSP: 0018:ffffbf91c683bd00 EFLAGS: 00010246
  RAX: 000061f040000000 RBX: ffff9e159c77bba0 RCX: ffff9e15a5c87000
  RDX: 0000000665c87000 RSI: ffff9e15a5c87000 RDI: ffff9e159c77bba0
  RBP: 0000000000000000 R08: 0000000000000000 R09: ffff9e15a5c87000
  R10: 0000000000000000 R11: fffff8f2d99721c0 R12: ffff9e159c77bba0
  R13: ffffbf91c671d960 R14: ffff9e159c778000 R15: 0000000000000000
  FS:  00007fa341cbe700(0000) GS:ffff9e15b7400000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007fdd38356804 CR3: 00000006759de003 CR4: 00000000007606e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  PKRU: 55555554
  Call Trace:
   loaded_vmcs_init+0x4f/0xe0
   alloc_loaded_vmcs+0x38/0xd0
   vmx_create_vcpu+0xf7/0x600
   kvm_vm_ioctl+0x5e9/0x980
   ? __switch_to_asm+0x40/0x70
   ? __switch_to_asm+0x34/0x70
   ? __switch_to_asm+0x40/0x70
   ? __switch_to_asm+0x34/0x70
   ? free_one_page+0x13f/0x4e0
   do_vfs_ioctl+0xa4/0x630
   ksys_ioctl+0x60/0x90
   __x64_sys_ioctl+0x16/0x20
   do_syscall_64+0x55/0x1c0
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7fa349b1ee5b

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/64a9b64d127e87b6920a97afde8e96ea76f6524e.1563413318.git.jpoimboe@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-06 19:06:52 +02:00
Zhenzhong Duan
11cb9f8700 xen/pv: Fix a boot up hang revealed by int3 self test
[ Upstream commit b23e5844dfe78a80ba672793187d3f52e4b528d7 ]

Commit 7457c0da024b ("x86/alternatives: Add int3_emulate_call()
selftest") is used to ensure there is a gap setup in int3 exception stack
which could be used for inserting call return address.

This gap is missed in XEN PV int3 exception entry path, then below panic
triggered:

[    0.772876] general protection fault: 0000 [#1] SMP NOPTI
[    0.772886] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.2.0+ #11
[    0.772893] RIP: e030:int3_magic+0x0/0x7
[    0.772905] RSP: 3507:ffffffff82203e98 EFLAGS: 00000246
[    0.773334] Call Trace:
[    0.773334]  alternative_instructions+0x3d/0x12e
[    0.773334]  check_bugs+0x7c9/0x887
[    0.773334]  ? __get_locked_pte+0x178/0x1f0
[    0.773334]  start_kernel+0x4ff/0x535
[    0.773334]  ? set_init_arg+0x55/0x55
[    0.773334]  xen_start_kernel+0x571/0x57a

For 64bit PV guests, Xen's ABI enters the kernel with using SYSRET, with
%rcx/%r11 on the stack. To convert back to "normal" looking exceptions,
the xen thunks do 'xen_*: pop %rcx; pop %r11; jmp *'.

E.g. Extracting 'xen_pv_trap xenint3' we have:
xen_xenint3:
 pop %rcx;
 pop %r11;
 jmp xenint3

As xenint3 and int3 entry code are same except xenint3 doesn't generate
a gap, we can fix it by using int3 and drop useless xenint3.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-06 19:06:52 +02:00
Petr Machata
d3e36788d9 mlxsw: spectrum_dcb: Configure DSCP map as the last rule is removed
[ Upstream commit dedfde2fe1c4ccf27179fcb234e2112d065c39bb ]

Spectrum systems use DSCP rewrite map to update DSCP field in egressing
packets to correspond to priority that the packet has. Whether rewriting
will take place is determined at the point when the packet ingresses the
switch: if the port is in Trust L3 mode, packet priority is determined from
the DSCP map at the port, and DSCP rewrite will happen. If the port is in
Trust L2 mode, 802.1p is used for packet prioritization, and no DSCP
rewrite will happen.

The driver determines the port trust mode based on whether any DSCP
prioritization rules are in effect at given port. If there are any, trust
level is L3, otherwise it's L2. When the last DSCP rule is removed, the
port is switched to trust L2. Under that scenario, if DSCP of a packet
should be rewritten, it should be rewritten to 0.

However, when switching to Trust L2, the driver neglects to also update the
DSCP rewrite map. The last DSCP rule thus remains in effect, and packets
egressing through this port, if they have the right priority, will have
their DSCP set according to this rule.

Fix by first configuring the rewrite map, and only then switching to trust
L2 and bailing out.

Fixes: b2b1dab688 ("mlxsw: spectrum: Support ieee_setapp, ieee_delapp")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reported-by: Alex Veber <alexve@mellanox.com>
Tested-by: Alex Veber <alexve@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-06 19:06:52 +02:00
Kees Cook
48c5c4f0a4 ipc/mqueue.c: only perform resource calculation if user valid
[ Upstream commit a318f12ed8843cfac53198390c74a565c632f417 ]

Andreas Christoforou reported:

  UBSAN: Undefined behaviour in ipc/mqueue.c:414:49 signed integer overflow:
  9 * 2305843009213693951 cannot be represented in type 'long int'
  ...
  Call Trace:
    mqueue_evict_inode+0x8e7/0xa10 ipc/mqueue.c:414
    evict+0x472/0x8c0 fs/inode.c:558
    iput_final fs/inode.c:1547 [inline]
    iput+0x51d/0x8c0 fs/inode.c:1573
    mqueue_get_inode+0x8eb/0x1070 ipc/mqueue.c:320
    mqueue_create_attr+0x198/0x440 ipc/mqueue.c:459
    vfs_mkobj+0x39e/0x580 fs/namei.c:2892
    prepare_open ipc/mqueue.c:731 [inline]
    do_mq_open+0x6da/0x8e0 ipc/mqueue.c:771

Which could be triggered by:

        struct mq_attr attr = {
                .mq_flags = 0,
                .mq_maxmsg = 9,
                .mq_msgsize = 0x1fffffffffffffff,
                .mq_curmsgs = 0,
        };

        if (mq_open("/testing", 0x40, 3, &attr) == (mqd_t) -1)
                perror("mq_open");

mqueue_get_inode() was correctly rejecting the giant mq_msgsize, and
preparing to return -EINVAL.  During the cleanup, it calls
mqueue_evict_inode() which performed resource usage tracking math for
updating "user", before checking if there was a valid "user" at all
(which would indicate that the calculations would be sane).  Instead,
delay this check to after seeing a valid "user".

The overflow was real, but the results went unused, so while the flaw is
harmless, it's noisy for kernel fuzzers, so just fix it by moving the
calculation under the non-NULL "user" where it actually gets used.

Link: http://lkml.kernel.org/r/201906072207.ECB65450@keescook
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Andreas Christoforou <andreaschristofo@gmail.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-06 19:06:52 +02:00
Dan Carpenter
f873142252 drivers/rapidio/devices/rio_mport_cdev.c: NUL terminate some strings
[ Upstream commit 156e0b1a8112b76e351684ac948c59757037ac36 ]

The dev_info.name[] array has space for RIO_MAX_DEVNAME_SZ + 1
characters.  But the problem here is that we don't ensure that the user
put a NUL terminator on the end of the string.  It could lead to an out
of bounds read.

Link: http://lkml.kernel.org/r/20190529110601.GB19119@mwanda
Fixes: e8de370188 ("rapidio: add mport char device driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Alexandre Bounine <alex.bou9@gmail.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-06 19:06:52 +02:00