android_kernel_xiaomi_sm7250/Documentation
Daniel Sneddon 56cf3753a1 x86/speculation: Add RSB VM Exit protections
commit 2b1299322016731d56807aa49254a5ea3080b6b3 upstream.

tl;dr: The Enhanced IBRS mitigation for Spectre v2 does not work as
documented for RET instructions after VM exits. Mitigate it with a new
one-entry RSB stuffing mechanism and a new LFENCE.

== Background ==

Indirect Branch Restricted Speculation (IBRS) was designed to help
mitigate Branch Target Injection and Speculative Store Bypass, i.e.
Spectre, attacks. IBRS prevents software run in less privileged modes
from affecting branch prediction in more privileged modes. IBRS requires
the MSR to be written on every privilege level change.

To overcome some of the performance issues of IBRS, Enhanced IBRS was
introduced.  eIBRS is an "always on" IBRS, in other words, just turn
it on once instead of writing the MSR on every privilege level change.
When eIBRS is enabled, more privileged modes should be protected from
less privileged modes, including protecting VMMs from guests.

== Problem ==

Here's a simplification of how guests are run on Linux' KVM:

void run_kvm_guest(void)
{
	// Prepare to run guest
	VMRESUME();
	// Clean up after guest runs
}

The execution flow for that would look something like this to the
processor:

1. Host-side: call run_kvm_guest()
2. Host-side: VMRESUME
3. Guest runs, does "CALL guest_function"
4. VM exit, host runs again
5. Host might make some "cleanup" function calls
6. Host-side: RET from run_kvm_guest()

Now, when back on the host, there are a couple of possible scenarios of
post-guest activity the host needs to do before executing host code:

* on pre-eIBRS hardware (legacy IBRS, or nothing at all), the RSB is not
touched and Linux has to do a 32-entry stuffing.

* on eIBRS hardware, VM exit with IBRS enabled, or restoring the host
IBRS=1 shortly after VM exit, has a documented side effect of flushing
the RSB except in this PBRSB situation where the software needs to stuff
the last RSB entry "by hand".

IOW, with eIBRS supported, host RET instructions should no longer be
influenced by guest behavior after the host retires a single CALL
instruction.

However, if the RET instructions are "unbalanced" with CALLs after a VM
exit as is the RET in #6, it might speculatively use the address for the
instruction after the CALL in #3 as an RSB prediction. This is a problem
since the (untrusted) guest controls this address.

Balanced CALL/RET instruction pairs such as in step #5 are not affected.

== Solution ==

The PBRSB issue affects a wide variety of Intel processors which
support eIBRS. But not all of them need mitigation. Today,
X86_FEATURE_RSB_VMEXIT triggers an RSB filling sequence that mitigates
PBRSB. Systems setting RSB_VMEXIT need no further mitigation - i.e.,
eIBRS systems which enable legacy IBRS explicitly.

However, such systems (X86_FEATURE_IBRS_ENHANCED) do not set RSB_VMEXIT
and most of them need a new mitigation.

Therefore, introduce a new feature flag X86_FEATURE_RSB_VMEXIT_LITE
which triggers a lighter-weight PBRSB mitigation versus RSB_VMEXIT.

The lighter-weight mitigation performs a CALL instruction which is
immediately followed by a speculative execution barrier (INT3). This
steers speculative execution to the barrier -- just like a retpoline
-- which ensures that speculation can never reach an unbalanced RET.
Then, ensure this CALL is retired before continuing execution with an
LFENCE.

In other words, the window of exposure is opened at VM exit where RET
behavior is troublesome. While the window is open, force RSB predictions
sampling for RET targets to a dead end at the INT3. Close the window
with the LFENCE.

There is a subset of eIBRS systems which are not vulnerable to PBRSB.
Add these systems to the cpu_vuln_whitelist[] as NO_EIBRS_PBRSB.
Future systems that aren't vulnerable will set ARCH_CAP_PBRSB_NO.

  [ bp: Massage, incorporate review comments from Andy Cooper. ]

Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Co-developed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
[ bp: Adjust patch to account for kvm entry being in c ]
Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com>
Signed-off-by: Suleiman Souhlal <suleiman@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-23 07:53:47 +01:00
..
ABI iio: ABI: Fix wrong format of differential capacitance channel ABI. 2022-10-26 13:19:30 +02:00
accelerators ocxl: Document new OCXL IOCTLs 2018-06-03 20:40:33 +10:00
accounting
acpi ACPI: property: graph: Update graph documentation to use generic references 2018-07-23 12:44:52 +02:00
admin-guide x86/speculation: Add RSB VM Exit protections 2022-11-23 07:53:47 +01:00
aoe
arm ARM: 8833/1: Ensure that NEON code always compiles with Clang 2019-04-05 22:33:08 +02:00
arm64 arm64: errata: Remove AES hwcap for COMPAT tasks 2022-11-03 23:52:25 +09:00
auxdisplay Doc: misc-devices: move lcd-panel-cgram.txt to auxdisplay/ 2018-04-12 16:08:02 +02:00
backlight
block block: Track DISCARD statistics and output them in stat and diskstat 2018-07-18 08:44:22 -06:00
blockdev zram: introduce zram memory tracking 2018-06-07 17:34:34 -07:00
bpf Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2018-08-07 11:02:05 -07:00
bus-devices
cdrom
cgroup-v1 page cache: use xa_lock 2018-04-11 10:28:39 -07:00
cma
connector
console Documentation: corrections to console/console.txt 2018-08-10 16:09:40 -06:00
core-api idr: Change documentation license 2018-10-15 16:31:29 -04:00
cpu-freq cpufreq: Drop cpufreq_table_validate_and_show() 2018-04-10 08:40:45 +02:00
cpuidle cpuidle: Add definition of residency to sysfs documentation 2018-04-09 13:44:37 +02:00
crypto crypto: remove redundant type flags from tfm allocation 2018-07-09 00:30:29 +08:00
dev-tools doc: dev-tools: kselftest.rst: update contributing new tests 2018-06-29 09:01:50 -06:00
device-mapper dm integrity: conditionally disable "recalculate" feature 2021-01-30 13:32:13 +01:00
devicetree ARM: dts: fix Moxa SDIO 'compatible', remove 'sdhci' misnomer 2022-10-26 13:19:16 +02:00
doc-guide Documentation/sphinx: allow "functions" with no parameters 2018-06-30 07:52:42 -06:00
driver-api ata: make qc_prep return ata_completion_errors 2020-10-01 13:14:54 +02:00
driver-model dmaengine: add a new helper dmaenginem_async_device_register 2018-07-30 10:50:22 +05:30
early-userspace initramfs: move gen_initramfs_list.sh from scripts/ to usr/ 2018-08-22 23:21:44 +09:00
EDID
extcon
fault-injection
fb uvesafb: Fix URLs in the documentation 2018-09-26 18:11:23 +02:00
features ARM: 8777/1: Hook up SYNC_CORE functionality for sys_membarrier() 2018-07-11 11:02:08 +01:00
filesystems locks: print a warning when mount fails due to lack of "mand" support 2021-08-26 08:36:49 -04:00
firmware_class
fmc
fpga docs: fpga: add a document for FPGA Device Feature List (DFL) Framework Overview 2018-07-15 13:55:44 +02:00
gpio
gpu drm/msm/gpu: Add the buffer objects from the submit to the crash dump 2018-07-30 08:50:10 -04:00
hid HID: doc: fix wrong data structure reference for UHID_OUTPUT 2019-12-05 09:20:36 +01:00
hwmon Revert "hwmon: Make chip parameter for with_info API mandatory" 2022-06-25 11:49:18 +02:00
i2c i2c: i801: Add support for Intel Comet Lake 2019-05-04 09:20:15 +02:00
ia64
ide
iio
infiniband
input Input: iforce - add support for Boeder Force Feedback Wheel 2022-09-20 12:26:48 +02:00
ioctl Char/Misc driver patches for 4.19-rc1 2018-08-18 11:04:51 -07:00
isdn
kbuild kbuild: support LLVM=1 to switch the default tools to Clang/LLVM 2020-09-26 18:01:32 +02:00
kdump
kernel-hacking doc:it_IT: translation for kernel-hacking 2018-07-26 16:21:09 -06:00
laptops platform/x86: thinkpad_acpi: silence HKEY 0x6032, 0x60f0, 0x6030 2018-05-07 15:10:31 +03:00
leds
lightnvm
livepatch livepatch: Remove not longer valid limitations from the documentation 2018-05-24 15:37:57 +02:00
locking locking: Implement an algorithm choice for Wound-Wait mutexes 2018-07-03 09:44:36 +02:00
m68k
maintainer docs: Fix more broken references 2018-06-15 18:11:26 -03:00
md
media media: videodev2.h: RGB BT2020 and HSV are always full range 2020-11-05 11:08:40 +01:00
memory-devices
mic
mips
misc-devices pci_endpoint_test: Add 2 ioctl commands 2018-07-19 11:46:57 +01:00
mmc
mtd
namespaces
netlabel
networking Documentation: fix sctp_wmem in ip-sysctl.rst 2022-08-11 12:48:40 +02:00
nfc
nios2
nvdimm
nvmem
openrisc
parisc
PCI Merge branch 'remotes/lorenzo/pci/dwc' 2018-08-15 14:59:11 -05:00
pcmcia pcmcia: remove long deprecated pcmcia_request_exclusive_irq() function 2018-08-18 12:30:42 -07:00
perf
phy
platform
power PM / reboot: Eliminate race between reboot and suspend 2018-08-06 12:35:20 +02:00
powerpc powerpc: Document issues with TM on POWER9 2018-07-02 23:54:29 +10:00
pps
process docs: update mediator information in CoC docs 2022-10-26 13:19:16 +02:00
pti
ptp
rapidio
RCU rculist: Improve documentation for list_for_each_entry_from_rcu() 2018-07-12 15:39:25 -07:00
riscv perf: riscv: Add Document for Future Porting Guide 2018-06-04 14:02:11 -07:00
s390
scheduler sched/fair: Fix low cpu usage with high throttling by removing expiration of cpu-local slices 2019-11-12 19:20:50 +01:00
scsi scsi: documentation: add scsi_mod.use_blk_mq to scsi-parameters 2018-08-27 12:26:10 -04:00
security Kbuild: rename CC_STACKPROTECTOR[_STRONG] config variables 2018-06-14 12:21:18 +09:00
serial
sh
sound ALSA: hda/realtek - Remove now-unnecessary XPS 13 headphone noise fixups 2020-04-17 10:48:45 +02:00
sparc
sphinx tweewide: Fix most Shebang lines 2021-05-22 10:59:50 +02:00
sphinx-static
spi
sysctl random: fix sysctl documentation nits 2022-06-25 11:49:09 +02:00
target tweewide: Fix most Shebang lines 2021-05-22 10:59:50 +02:00
thermal
timers timekeeping.txt: Correct maxCount of n-bit binary counter 2018-07-23 09:33:06 -06:00
trace tracing/histogram: Update document for KEYS_MAX size 2022-11-10 17:46:55 +01:00
translations This was a moderately busy cycle for docs, with the usual collection of 2018-08-14 14:29:31 -07:00
usb USB: rio500: Remove Rio 500 kernel driver 2019-10-17 13:44:47 -07:00
userspace-api Documentation: Add section about CPU vulnerabilities for Spectre 2019-07-14 08:11:17 +02:00
virtual KVM: X86: MMU: Use the correct inherited permissions to get shadow page 2021-08-15 13:05:04 +02:00
vm mm/slub: clarify verification reporting 2021-06-30 08:48:26 -04:00
w1 w1: fix w1_ds2438 documentation 2018-07-07 17:27:13 +02:00
watchdog
wimax
x86 x86/speculation/taa: Add documentation for TSX Async Abort 2019-11-12 19:21:34 +01:00
xtensa
.gitignore
00-INDEX docs: admin-guide: add cgroup-v2 documentation 2018-05-10 15:42:41 -06:00
atomic_bitops.txt locking/atomic: Make test_and_*_bit() ordered on failure 2022-08-25 11:15:42 +02:00
atomic_t.txt x86/atomic: Fix smp_mb__{before,after}_atomic() 2019-07-26 09:14:08 +02:00
bt8xxgpio.txt
btmrvl.txt
bus-virt-phys-mapping.txt
Changes
clearing-warn-once.txt
CodingStyle
conf.py docs/conf.py: Cope with removal of language=None in Sphinx 5.0.0 2022-06-14 16:59:30 +02:00
cpu-load.txt
cputopology.txt
crc32.txt
dcdbas.txt
debugging-modules.txt
debugging-via-ohci1394.txt
dell_rbu.txt Documentation: remove stale firmware API reference 2018-05-14 16:44:41 +02:00
digsig.txt
DMA-API-HOWTO.txt
DMA-API.txt
DMA-attributes.txt Reinstate some of "swiotlb: rework "fix info leak with DMA_FROM_DEVICE"" 2022-05-25 09:10:41 +02:00
DMA-ISA-LPC.txt
docutils.conf
dontdiff
efi-stub.txt
eisa.txt
flexible-arrays.txt
futex-requeue-pi.txt
gcc-plugins.txt
highuid.txt
hw_random.txt
hwspinlock.txt
index.rst x86/speculation/mds: Add mds_clear_cpu_buffers() 2019-05-14 19:17:54 +02:00
intel_txt.txt
Intel-IOMMU.txt
io_ordering.txt
io-mapping.txt
iostats.txt block: Track DISCARD statistics and output them in stat and diskstat 2018-07-18 08:44:22 -06:00
IPMI.txt
IRQ-affinity.txt
IRQ-domain.txt
IRQ.txt
irqflags-tracing.txt
isa.txt
isapnp.txt
kernel-per-CPU-kthreads.txt
kobject.txt
kprobes.txt kprobes/Documentation: Fix various typos 2018-06-22 11:10:55 +02:00
kref.txt
ldm.txt
lockup-watchdogs.txt
logo.gif
logo.txt
lsm.txt
lzo.txt
mailbox.txt
Makefile
memory-barriers.txt sched/Documentation: Update wake_up() & co. memory-barrier guarantees 2018-07-17 09:30:34 +02:00
memory-hotplug.txt
men-chameleon-bus.txt
nommu-mmap.txt Documentation: nommu-map: Fix duplicate word typo 2018-06-26 09:01:27 -06:00
ntb.txt
numastat.txt
padata.txt
parport-lowlevel.txt
percpu-rw-semaphore.txt
phy.txt
pi-futex.txt
pnp.txt
preempt-locking.txt
pwm.txt
rbtree.txt
remoteproc.txt
rfkill.txt rfkill: Fix several typos in documentation 2018-06-15 13:36:08 +02:00
robust-futex-ABI.txt
robust-futexes.txt futex: Update comments and docs about return values of arch futex code 2019-07-03 13:14:49 +02:00
rpmsg.txt
rtc.txt
SAK.txt
sgi-ioc4.txt
siphash.txt
SM501.txt
smsc_ece1099.txt
speculation.txt
static-keys.txt
SubmittingPatches
svga.txt
switchtec.txt
sync_file.txt
tee.txt
this_cpu_ops.txt
unaligned-memory-access.txt
vfio-mediated-device.txt vfio/mdev: Check globally for duplicate devices 2018-06-08 10:24:27 -06:00
vfio.txt vfio: fix documentation 2018-05-08 09:16:41 -06:00
video-output.txt
xillybus.txt
xz.txt
zorro.txt