android_kernel_xiaomi_sm7250/fs
Yu Zhao 400395317f FROMLIST: mm: multi-gen LRU: support page table walks
To further exploit spatial locality, the aging prefers to walk page
tables to search for young PTEs and promote hot pages. A kill switch
will be added in the next patch to disable this behavior. When
disabled, the aging relies on the rmap only.

NB: this behavior has nothing in common with the page table scanning in
the 2.4 kernel [1], which searches page tables for old PTEs, adds cold
pages to swapcache and unmaps them.

To avoid confusion, the term "iteration" specifically means the
traversal of an entire mm_struct list; the term "walk" will be applied
to page tables and the rmap, as usual.

An mm_struct list is maintained for each memcg, and an mm_struct
follows its owner task to the new memcg when this task is migrated.
Given an lruvec, the aging iterates lruvec_memcg()->mm_list and calls
walk_page_range() with each mm_struct on this list to promote hot
pages before it increments max_seq.

When multiple page table walkers iterate the same list, each of them
gets a unique mm_struct; therefore they can run concurrently. Page
table walkers ignore any misplaced pages, e.g., if an mm_struct was
migrated, pages it left in the previous memcg will not be promoted
when its current memcg is under reclaim. Similarly, page table walkers
will not promote pages from nodes other than the one under reclaim.
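
The hand-out of mm_structs to concurrent walkers can be pictured with a
small userspace sketch. It is only an illustration under assumptions:
struct mm_sketch, struct mm_list_sketch, mm_list_init(), next_mm() and
finish_iteration() are hypothetical names, a pthread mutex stands in for
whatever locking the patch uses, and no page tables are walked.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an mm_struct linked on a per-memcg list. */
struct mm_sketch {
	int id;
	struct mm_sketch *next;
};

/* Hypothetical stand-in for the per-memcg mm_struct list. */
struct mm_list_sketch {
	pthread_mutex_t lock;
	struct mm_sketch *head;   /* first entry on the list */
	struct mm_sketch *cursor; /* next entry to hand out in this iteration */
	unsigned long seq;        /* number of completed iterations */
};

static void mm_list_init(struct mm_list_sketch *l, struct mm_sketch *head)
{
	pthread_mutex_init(&l->lock, NULL);
	l->head = head;
	l->cursor = head;
	l->seq = 0;
}

/*
 * Hand out the next entry, or NULL once the list is exhausted. The cursor
 * advances under the lock, so two walkers never receive the same entry.
 */
static struct mm_sketch *next_mm(struct mm_list_sketch *l)
{
	struct mm_sketch *mm;

	pthread_mutex_lock(&l->lock);
	mm = l->cursor;
	if (mm)
		l->cursor = mm->next;
	pthread_mutex_unlock(&l->lock);

	return mm;
}

/* End the iteration: bump the sequence number and rewind the cursor. */
static void finish_iteration(struct mm_list_sketch *l)
{
	pthread_mutex_lock(&l->lock);
	l->seq++;
	l->cursor = l->head;
	pthread_mutex_unlock(&l->lock);
}
```

In this sketch each walker would loop on next_mm() and walk the page
tables of whatever it receives; the sequence number is incremented only
after the list has been exhausted, mirroring "before it increments
max_seq" above.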

This patch uses the following optimizations when walking page tables:
1. It tracks the usage of mm_structs between context switches so that
   page table walkers can skip processes that have been sleeping since
   the last iteration.
2. It uses generational Bloom filters to record populated branches so
   that page table walkers can reduce their search space based on the
   query results, e.g., to skip page tables containing mostly holes or
   misplaced pages.
3. It takes advantage of the accessed bit in non-leaf PMD entries when
   CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG=y.
4. It does not zigzag between a PGD table and the same PMD table
   spanning multiple VMAs. IOW, it finishes all the VMAs within the
   range of the same PMD table before it returns to a PGD table. This
   improves the cache performance for workloads that have large
   numbers of tiny VMAs [2], especially when CONFIG_PGTABLE_LEVELS=5.
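
Optimization 2 can be illustrated with a simplified userspace sketch of a
generational Bloom filter. Everything here is an assumption made for the
example: two filters selected by the parity of a sequence number, a
filter size of 2^15 bits, two keys derived from a multiplicative hash,
and the names gen_bloom, bloom_reset(), bloom_add() and bloom_test(). It
is not the code from the patch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NR_FILTERS   2             /* assumed: one queried, one being built */
#define FILTER_SHIFT 15            /* assumed size: 2^15 bits per filter */
#define FILTER_BITS  (1u << FILTER_SHIFT)

struct gen_bloom {
	uint64_t bits[NR_FILTERS][FILTER_BITS / 64];
};

/* Derive two bit positions from an address (e.g., of a PMD entry). */
static void bloom_keys(const void *item, uint32_t key[2])
{
	uint64_t h = (uint64_t)(uintptr_t)item * 0x9E3779B97F4A7C15ull;

	key[0] = (uint32_t)(h >> (64 - FILTER_SHIFT));
	key[1] = (uint32_t)(h >> (64 - 2 * FILTER_SHIFT)) & (FILTER_BITS - 1);
}

/* Clear the filter that belongs to generation seq. */
static void bloom_reset(struct gen_bloom *f, unsigned long seq)
{
	memset(f->bits[seq % NR_FILTERS], 0, sizeof(f->bits[0]));
}

/* Record item as populated in the filter of generation seq. */
static void bloom_add(struct gen_bloom *f, unsigned long seq, const void *item)
{
	uint64_t *bits = f->bits[seq % NR_FILTERS];
	uint32_t key[2];

	bloom_keys(item, key);
	bits[key[0] / 64] |= 1ull << (key[0] % 64);
	bits[key[1] / 64] |= 1ull << (key[1] % 64);
}

/* False means definitely not recorded; true means possibly recorded. */
static bool bloom_test(const struct gen_bloom *f, unsigned long seq,
		       const void *item)
{
	const uint64_t *bits = f->bits[seq % NR_FILTERS];
	uint32_t key[2];

	bloom_keys(item, key);
	return (bits[key[0] / 64] >> (key[0] % 64) & 1) &&
	       (bits[key[1] / 64] >> (key[1] % 64) & 1);
}
```

Under these assumptions, a walker in iteration seq would call
bloom_test(f, seq, pmd) to decide whether a branch is worth descending
into and bloom_add(f, seq + 1, pmd) for branches it found populated, with
bloom_reset(f, seq + 1) run once before the iteration starts. A false
positive only costs an unnecessary walk of that branch.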

Server benchmark results:
  Single workload:
    fio (buffered I/O): no change

  Single workload:
    memcached (anon): +[5.5, 7.5]%
                         Ops/sec      KB/sec
      patch1-7:          1014393.57   39455.42
      patch1-8:          1078507.59   41949.15

  Configurations:
    no change

Client benchmark results:
  kswapd profiles:
    patch1-7
      45.54%  lzo1x_1_do_compress (real work)
       9.56%  page_vma_mapped_walk
       6.70%  _raw_spin_unlock_irq
       2.78%  ptep_clear_flush
       2.47%  do_raw_spin_lock
       2.22%  __zram_bvec_write
       1.87%  lru_gen_look_around
       1.78%  memmove
       1.77%  obj_malloc
       1.44%  free_unref_page_list

    patch1-8
      47.02%  lzo1x_1_do_compress (real work)
       6.73%  page_vma_mapped_walk
       6.14%  _raw_spin_unlock_irq
       3.39%  walk_pte_range
       2.63%  ptep_clear_flush
       2.29%  __zram_bvec_write
       2.10%  do_raw_spin_lock
       1.81%  memmove
       1.73%  obj_malloc
       1.53%  free_unref_page_list

  Configurations:
    no change

[1] https://lwn.net/Articles/23732/
[2] https://source.android.com/devices/tech/debug/scudo

Link: https://lore.kernel.org/r/20220309021230.721028-9-yuzhao@google.com/
Signed-off-by: Yu Zhao <yuzhao@google.com>
Acked-by: Brian Geffon <bgeffon@google.com>
Acked-by: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Acked-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Acked-by: Steven Barrett <steven@liquorix.net>
Acked-by: Suleiman Souhlal <suleiman@google.com>
Tested-by: Daniel Byrne <djbyrne@mtu.edu>
Tested-by: Donald Carr <d@chaos-reins.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Tested-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Tested-by: Shuang Zhai <szhai2@cs.rochester.edu>
Tested-by: Sofia Trinh <sofia.trinh@edi.works>
Tested-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Bug: 228114874
Change-Id: I5a3c97cf8ebf8d65d5f9528cd979a637c190053e
2022-11-12 11:21:16 +00:00
9p 9p: missing chunk of "fs/9p: Don't update file type when updating file attributes" 2022-06-25 11:48:57 +02:00
adfs fs/adfs: super: fix use-after-free bug 2019-08-06 19:06:49 +02:00
affs fs/affs: release old buffer head on error path 2021-03-04 09:39:55 +01:00
afs Merge remote-tracking branch 'aosp/android-4.19-stable' into android12-base 2022-07-09 10:34:17 +05:30
autofs autofs: fix a leak in autofs_expire_indirect() 2019-12-13 08:51:01 +01:00
befs
bfs bfs: add sanity check at bfs_fill_super() 2018-12-01 09:37:27 +01:00
btrfs btrfs: fix processing of delayed tree block refs during backref walking 2022-11-03 23:52:25 +09:00
cachefiles cachefiles: Handle readpage error correctly 2020-11-05 11:08:54 +01:00
ceph ceph: don't truncate file in atomic_open 2022-10-26 13:19:18 +02:00
cifs cifs: don't send down the destination address to sendmsg for a SOCK_STREAM 2022-09-28 11:02:52 +02:00
coda coda: add error handling for fget 2019-08-06 19:06:51 +02:00
configfs configfs: fix a race in configfs_{,un}register_subsystem() 2022-03-02 11:38:13 +01:00
cramfs
crypto Merge tag '5.11-rc1-4.19' of https://kernel.googlesource.com/pub/scm/linux/kernel/git/jaegeuk/f2fs-stable into HEAD 2022-07-03 13:50:05 +00:00
debugfs debugfs: add debugfs_lookup_and_remove() 2022-09-15 12:17:05 +02:00
devpts fs/devpts: always delete dcache dentry-s in dput() 2019-03-23 20:09:59 +01:00
dlm fs: dlm: handle -EBUSY first in lock arg validation 2022-10-26 13:19:21 +02:00
ecryptfs Revert "ecryptfs: replace BUG_ON with error handling code" 2021-05-26 11:48:34 +02:00
efivarfs efivarfs: revert "fix memory leak in efivarfs_create()" 2020-12-02 08:48:12 +01:00
efs
exfat Merge remote-tracking branch 'namjaejeon/linux-exfat-oot/master' into android12-base 2022-10-19 10:45:47 +05:30
exofs Merge android-4.19-q.88 (47d86d5) into msm-4.19 2020-01-28 03:20:43 -08:00
exportfs exportfs: fix 'passing zero to ERR_PTR()' warning 2020-01-27 14:50:02 +01:00
ext2 ext2: Add more validity checks for inode counts 2022-08-25 11:14:58 +02:00
ext4 Merge branch 'android-4.19-stable' of https://github.com/aosp-mirror/kernel_common into skizo-x 2022-11-12 11:18:12 +00:00
f2fs mm: mm_event: show MM/FS/IO/UFS latencies in fault flow 2022-11-12 11:20:48 +00:00
fat fat: add ratelimit to fat*_ent_bread() 2022-06-14 16:59:18 +02:00
freevxfs
fscache fscache: Fix cookie key hashing 2021-09-22 11:48:02 +02:00
fuse FROMLIST: mm: multi-gen LRU: groundwork 2022-11-12 11:21:15 +00:00
gfs2 Merge remote-tracking branch 'aosp/android-4.19-stable' into android12-base 2022-05-19 14:51:23 +05:30
hfs hfs: add lock nesting notation to hfs_find_init 2021-07-31 08:22:38 +02:00
hfsplus hfsplus: prevent corruption in shrinking truncate 2021-05-22 10:59:45 +02:00
hostfs
hpfs
hugetlbfs This is the 4.19.193 stable release 2021-06-03 09:05:30 +02:00
incfs ANDROID: incremental-fs: limit mount stack depth 2022-04-08 12:58:18 -07:00
isofs isofs: Fix out of bound access for corrupted isofs image 2021-11-12 14:40:50 +01:00
jbd2 Merge remote-tracking branch 'aosp/android-4.19-stable' into android12-base 2022-09-22 14:02:10 +05:30
jffs2 This is the 4.19.247 stable release 2022-06-14 17:16:36 +02:00
jfs fs: jfs: fix possible NULL pointer dereference in dbFree() 2022-06-14 16:59:17 +02:00
kernfs Revert "mm: zero-seek shrinkers" 2022-11-12 11:20:34 +00:00
lockd lockd: don't use interval-based rebinding over TCP 2020-12-30 11:25:59 +01:00
minix minix: fix bug when opening a file with O_DIRECT 2022-04-15 14:15:03 +02:00
nfs Merge remote-tracking branch 'aosp/android-4.19-stable' into android12-base 2022-09-22 14:02:10 +05:30
nfs_common nfs_common: need lock during iterate through the list 2020-12-30 11:26:02 +01:00
nfsd NFSD: Return nfserr_serverfault if splice_ok but buf->pages have data 2022-10-26 13:19:36 +02:00
nilfs2 nilfs2: fix use-after-free bug of struct nilfs_root 2022-10-26 13:19:22 +02:00
nls
notify Merge remote-tracking branch 'aosp/android-4.19-stable' into android12-base 2022-06-30 15:31:02 +05:30
ntfs ntfs: fix BUG_ON in ntfs_lookup_inode_by_name() 2022-10-05 10:36:44 +02:00
ocfs2 mass revert: clean 2022-11-12 11:18:57 +00:00
omfs
openpromfs
orangefs orangefs: Fix the size of a memory allocation in orangefs_bufmap_alloc() 2022-01-27 09:04:13 +01:00
overlayfs Merge 4cb392956a ("MIPS: cpuinfo: Fix a warning for CONFIG_CPUMASK_OFFSTACK") into android-mainline 2022-09-07 09:20:10 +02:00
proc CHROMIUM: mm: make perproc-recalim THP aware 2022-11-12 11:20:56 +00:00
pstore fs: pstore: import vangogh-r-oss support to get last_kmsg 2022-11-12 11:19:16 +00:00
qnx4 qnx4: work around gcc false positive warning bug 2021-10-06 15:31:20 +02:00
qnx6
quota quota: Check next/prev free block number after reading from quota file 2022-10-26 13:19:21 +02:00
ramfs ramfs: fix nommu mmap with gaps in the page cache 2020-10-30 10:38:21 +01:00
reiserfs reiserfs: check directory items on read from disk 2021-08-12 13:19:44 +02:00
romfs romfs: fix uninitialized memory leak in romfs_dev_read() 2020-08-26 10:30:59 +02:00
sdcardfs Restore sdcardfs feature 2020-08-21 15:15:18 +05:30
squashfs squashfs: fix divide error in calculate_skip() 2021-05-22 10:59:45 +02:00
sysfs This is the 4.19.236 stable release 2022-03-23 12:26:14 +01:00
sysv sysv: return 'err' instead of 0 in __sysv_write_inode 2018-12-17 09:24:30 +01:00
tracefs Merge 4.19.259 into android-4.19-stable 2022-09-21 11:46:01 +02:00
ubifs Merge tag '5.11-rc1-4.19' of https://kernel.googlesource.com/pub/scm/linux/kernel/git/jaegeuk/f2fs-stable into HEAD 2022-07-03 13:50:05 +00:00
udf udf: Fix NULL ptr deref when converting from inline format 2022-02-08 18:23:03 +01:00
ufs fs/ufs: avoid potential u32 multiplication overflow 2020-08-21 11:05:38 +02:00
unicode unicode: Add utf8_casefold_hash 2020-09-11 11:22:30 -07:00
verity Merge tag '5.12-rc1-4.19' of https://kernel.googlesource.com/pub/scm/linux/kernel/git/jaegeuk/f2fs-stable into HEAD 2022-07-03 14:22:19 +00:00
xfs mass revert: clean 2022-11-12 11:18:57 +00:00
aio.c aio: fix use-after-free due to missing POLLFREE handling 2022-02-28 19:11:35 +05:30
anon_inodes.c
attr.c Merge remote-tracking branch 'aosp/android-4.19-stable' into android12-base 2022-09-07 22:01:02 +05:30
bad_inode.c
binfmt_aout.c
binfmt_elf_fdpic.c
binfmt_elf.c elf: don't use MAP_FIXED_NOREPLACE for elf interpreter mappings 2021-10-06 15:31:24 +02:00
binfmt_em86.c
binfmt_flat.c binfmt_flat: do not stop relocating GOT entries prematurely on riscv 2022-06-14 16:59:13 +02:00
binfmt_misc.c binfmt_misc: fix possible deadlock in bm_register_write 2021-03-17 16:43:51 +01:00
binfmt_script.c exec: load_script: Do not exec truncated interpreter path 2019-11-06 13:05:37 +01:00
block_dev.c This is the 4.19.191 stable release 2021-05-22 11:54:36 +02:00
buffer.c Merge android-4.19-stable.157 (8ee67bc) into msm-4.19 2020-12-18 18:35:06 +05:30
char_dev.c chardev: Avoid potential use-after-free in 'chrdev_open()' 2020-01-24 14:28:27 +05:30
compat_binfmt_elf.c
compat_ioctl.c fix compat handling of FICLONERANGE, FIDEDUPERANGE and FS_IOC_FIEMAP 2020-01-09 10:19:07 +01:00
compat.c
coredump.c coredump: fix crash when umh is disabled 2020-09-08 01:59:16 -07:00
d_path.c
dax.c dax: fix cache flush on PMD-mapped pages 2022-06-14 16:59:24 +02:00
dcache.c fs, fscrypt: clear DCACHE_ENCRYPTED_NAME when unaliasing directory 2020-11-05 11:08:35 +01:00
dcookies.c
direct-io.c Merge 4.19.187 into android-4.19-stable 2021-04-16 07:42:26 +02:00
drop_caches.c fs: avoid softlockups in s_inodes iterators 2020-01-12 12:17:20 +01:00
eventfd.c eventfd: track eventfd_signal() recursion depth 2020-02-11 04:34:08 -08:00
eventpoll.c This is the 4.19.150 stable release 2020-10-07 08:45:35 +02:00
exec.c FROMLIST: mm: multi-gen LRU: support page table walks 2022-11-12 11:21:16 +00:00
fcntl.c fcntl: fix potential deadlock for &fasync_struct.fa_lock 2021-09-22 11:47:50 +02:00
fhandle.c
file_table.c Merge tag '5.10-rc1-4.19' of https://kernel.googlesource.com/pub/scm/linux/kernel/git/jaegeuk/f2fs-stable into HEAD 2022-07-03 13:23:06 +00:00
file.c disp: msm: sde: Force SDE fd to start from 1 2022-11-12 11:19:44 +00:00
filesystems.c fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once() 2020-04-17 10:48:51 +02:00
fs_pin.c
fs_struct.c Restore sdcardfs feature 2020-08-21 15:15:18 +05:30
fs-writeback.c fs-writeback: writeback_sb_inodes:Recalculate 'wrote' according skipped pages 2022-06-14 16:59:27 +02:00
inode.c Merge branch 'android-4.19-stable' of https://github.com/aosp-mirror/kernel_common into skizo-x 2022-11-12 11:18:12 +00:00
internal.h Restore sdcardfs feature 2020-08-21 15:15:18 +05:30
ioctl.c
iomap.c This is the 4.19.191 stable release 2021-05-22 11:54:36 +02:00
Kconfig fs: exfat: Add support for building inside kernel 2022-02-26 20:27:24 +05:30
Kconfig.binfmt
libfs.c Merge tag '5.12-rc1-4.19' of https://kernel.googlesource.com/pub/scm/linux/kernel/git/jaegeuk/f2fs-stable into HEAD 2022-07-03 14:22:19 +00:00
locks.c locks: print unsigned ino in /proc/locks 2020-01-09 10:19:00 +01:00
Makefile fs: exfat: Add support for building inside kernel 2022-02-26 20:27:24 +05:30
mbcache.c
mount.h
mpage.c f2fs: fix build error on android tracepoints 2019-08-17 00:18:14 +00:00
namei.c Merge remote-tracking branch 'aosp/android-4.19-stable' into android12-base 2022-09-07 22:01:02 +05:30
namespace.c Merge remote-tracking branch 'aosp/android-4.19-stable' into android12-base 2022-05-19 14:51:23 +05:30
no-block.c
nsfs.c dcache: sort the freeing-without-RCU-delay mess for good. 2019-05-25 18:23:26 +02:00
open.c Merge tag '5.10-rc1-4.19' of https://kernel.googlesource.com/pub/scm/linux/kernel/git/jaegeuk/f2fs-stable into HEAD 2022-07-03 13:23:06 +00:00
OWNERS ANDROID: Add OWNERS files referring to the respective android-mainline OWNERS 2021-04-03 14:09:44 +00:00
pipe.c pipe: increase minimum default pipe size to 2 pages 2021-08-12 13:19:43 +02:00
pnode.c This is the 4.19.120 stable release 2020-05-03 08:48:02 +02:00
pnode.h ANDROID: mnt: Add filesystem private data to mount points 2018-12-05 09:48:13 -08:00
posix_acl.c
proc_namespace.c ANDROID: vfs: Allow filesystems to access their private mount data 2018-12-05 09:48:13 -08:00
read_write.c Merge android-4.19-stable.136 (204dd19) into msm-4.19 2020-10-14 20:04:29 +05:30
readdir.c readdir: make sure to verify directory entry for legacy interfaces too 2021-04-28 13:16:50 +02:00
select.c Revert "Revert "select: use freezable blocking call"" 2022-11-12 11:19:33 +00:00
seq_file.c seq_file: disallow extremely large seq buffer allocations 2021-12-07 12:47:09 +05:30
signalfd.c signalfd: use wake_up_pollfree() 2022-02-28 19:00:41 +05:30
splice.c Revert "fs: check FMODE_LSEEK to control internal pipe splicing" 2022-10-26 13:19:21 +02:00
stack.c
stat.c stat: fix inconsistency between struct stat and struct compat_stat 2022-04-27 13:39:44 +02:00
statfs.c vfs: Fix EOVERFLOW testing in put_compat_statfs64 2019-10-11 18:21:39 +02:00
super.c Merge tag 'ASB-2022-03-05_4.19-stable' of https://github.com/aosp-mirror/kernel_common into android12-base 2022-03-08 06:44:12 +05:30
sync.c Merge tag '5.10-rc1-4.19' of https://kernel.googlesource.com/pub/scm/linux/kernel/git/jaegeuk/f2fs-stable into HEAD 2022-07-03 13:23:06 +00:00
timerfd.c
userfaultfd.c Revert "mm: protect VMA modifications using VMA sequence count" 2022-11-12 11:20:39 +00:00
utimes.c Restore sdcardfs feature 2020-08-21 15:15:18 +05:30
xattr.c Merge android-4.19-stable.146 (443485d) into msm-4.19 2020-10-16 11:06:31 +05:30