android_kernel_xiaomi_sm7250

Author	SHA1	Message	Date
Yue Hu	3fa8ac111c	erofs: directly use wrapper erofs_page_is_managed() when shrinking We already have the wrapper function to identify managed page. Link: https://lore.kernel.org/r/20210810065450.1320-1-zbestahu@gmail.com Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Yue Hu <huyue2@yulong.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Change-Id: Id68356b5fc0cd6f4da2b53c2e56f9075253d624d	2022-11-12 11:24:50 +00:00
Gao Xiang	027bb6d69b	erofs: fix 1 lcluster-sized pcluster for big pcluster If the 1st NONHEAD lcluster of a pcluster isn't CBLKCNT lcluster type rather than a HEAD or PLAIN type instead, which means its pclustersize _must_ be 1 lcluster (since its uncompressed size < 2 lclusters), as illustrated below: HEAD HEAD / PLAIN lcluster type ____________ ____________ \|_:__________\|_________:__\| file data (uncompressed) . . .____________. \|____________\| pcluster data (compressed) Such on-disk case was explained before [1] but missed to be handled properly in the runtime implementation. It can be observed if manually generating 1 lcluster-sized pcluster with 2 lclusters (thus CBLKCNT doesn't exist.) Let's fix it now. [1] https://lore.kernel.org/r/20210407043927.10623-1-xiang@kernel.org Link: https://lore.kernel.org/r/20210510064715.29123-1-xiang@kernel.org Fixes: cec6e93beadf ("erofs: support parsing big pcluster compress indexes") Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <xiang@kernel.org> Change-Id: I8a11d6d0c883a7222767e5c48f5da41c22698784	2022-11-12 11:24:50 +00:00
Gao Xiang	c94dab0246	erofs: enable big pcluster feature Enable COMPR_CFGS and BIG_PCLUSTER since the implementations are all settled properly. Link: https://lore.kernel.org/r/20210407043927.10623-11-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Change-Id: I3e12261cf03e62bad4d7287d5827f3309a45d2a6	2022-11-12 11:24:50 +00:00
Gao Xiang	6247dd5bfb	erofs: support decompress big pcluster for lz4 backend Prior to big pcluster, there was only one compressed page so it'd easy to map this. However, when big pcluster is enabled, more work needs to be done to handle multiple compressed pages. In detail, - (maptype 0) if there is only one compressed page + no need to copy inplace I/O, just map it directly what we did before; - (maptype 1) if there are more compressed pages + no need to copy inplace I/O, vmap such compressed pages instead; - (maptype 2) if inplace I/O needs to be copied, use per-CPU buffers for decompression then. Another thing is how to detect inplace decompression is feasable or not (it's still quite easy for non big pclusters), apart from the inplace margin calculation, inplace I/O page reusing order is also needed to be considered for each compressed page. Currently, if the compressed page is the xth page, it shouldn't be reused as [0 ... nrpages_out - nrpages_in + x], otherwise a full copy will be triggered. Although there are some extra optimization ideas for this, I'd like to make big pcluster work correctly first and obviously it can be further optimized later since it has nothing with the on-disk format at all. Link: https://lore.kernel.org/r/20210407043927.10623-10-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Change-Id: Icf073fd978b9a7f659651540e478a67d14450a09	2022-11-12 11:24:50 +00:00
Gao Xiang	7caadd02df	erofs: support parsing big pcluster compact indexes Different from non-compact indexes, several lclusters are packed as the compact form at once and an unique base blkaddr is stored for each pack, so each lcluster index would take less space on avarage (e.g. 2 bytes for COMPACT_2B.) btw, that is also why BIG_PCLUSTER switch should be consistent for compact head0/1. Prior to big pcluster, the size of all pclusters was 1 lcluster. Therefore, when a new HEAD lcluster was scanned, blkaddr would be bumped by 1 lcluster. However, that way doesn't work anymore for big pcluster since we actually don't know the compressed size of pclusters in advance (before reading CBLKCNT lcluster). So, instead, let blkaddr of each pack be the first pcluster blkaddr with a valid CBLKCNT, in detail, 1) if CBLKCNT starts at the pack, this first valid pcluster is itself, e.g. _____________________________________________________________ \|_CBLKCNT0_\|_NONHEAD_\| .. \|_HEAD_\|_CBLKCNT1_\| ... \|_HEAD_\| ... ^ = blkaddr base ^ += CBLKCNT0 ^ += CBLKCNT1 2) if CBLKCNT doesn't start at the pack, the first valid pcluster is the next pcluster, e.g. _________________________________________________________ \| NONHEAD_\| .. \|_HEAD_\|_CBLKCNT0_\| ... \|_HEAD_\|_HEAD_\| ... ^ = blkaddr base ^ += CBLKCNT0 ^ += 1 When a CBLKCNT is found, blkaddr will be increased by CBLKCNT lclusters, or a new HEAD is found immediately, bump blkaddr by 1 instead (see the picture above.) Also noted if CBLKCNT is the end of the pack, instead of storing delta1 (distance of the next HEAD lcluster) as normal NONHEADs, it still uses the compressed block count (delta0) since delta1 can be calculated indirectly but the block count can't. Adjust decoding logic to fit big pcluster compact indexes as well. Link: https://lore.kernel.org/r/20210407043927.10623-9-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Change-Id: Ia9dc0c8bf03b630edc4dfcedf61ef0632e1fcb05	2022-11-12 11:24:49 +00:00
Gao Xiang	05658e6d14	erofs: support parsing big pcluster compress indexes When INCOMPAT_BIG_PCLUSTER sb feature is enabled, legacy compress indexes will also have the same on-disk header compact indexes to keep per-file configurations instead of leaving it zeroed. If ADVISE_BIG_PCLUSTER is set for a file, CBLKCNT will be loaded for each pcluster in this file by parsing 1st non-head lcluster. Link: https://lore.kernel.org/r/20210407043927.10623-8-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Change-Id: I316bad6965e2cfa97e6fb1fb548dda8bc363b3b1	2022-11-12 11:24:49 +00:00
Gao Xiang	3a2d5cdb17	erofs: adjust per-CPU buffers according to max_pclusterblks Adjust per-CPU buffers on demand since big pcluster definition is available. Also, bail out unsupported pcluster size according to Z_EROFS_PCLUSTER_MAX_SIZE. Link: https://lore.kernel.org/r/20210407043927.10623-7-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Change-Id: I5679ca306856206222bba50d78985390252e6b1f	2022-11-12 11:24:49 +00:00
Gao Xiang	d0d8fc4059	erofs: add big physical cluster definition Big pcluster indicates the size of compressed data for each physical pcluster is no longer fixed as block size, but could be more than 1 block (more accurately, 1 logical pcluster) When big pcluster feature is enabled for head0/1, delta0 of the 1st non-head lcluster index will keep block count of this pcluster in lcluster size instead of 1. Or, the compressed size of pcluster should be 1 lcluster if pcluster has no non-head lcluster index. Also note that BIG_PCLUSTER feature reuses COMPR_CFGS feature since it depends on COMPR_CFGS and will be released together. Link: https://lore.kernel.org/r/20210407043927.10623-6-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Change-Id: I00c6ff3b9197f0d4c3b19698d4f1fb1899773734	2022-11-12 11:24:49 +00:00
Gao Xiang	fd0b2e82e9	erofs: fix up inplace I/O pointer for big pcluster When picking up inplace I/O pages, it should be traversed in reverse order in aligned with the traversal order of file-backed online pages. Also, index should be updated together when preloading compressed pages. Previously, only page-sized pclustersize was supported so no problem at all. Also rename `compressedpages' to `icpage_ptr' to reflect its functionality. Link: https://lore.kernel.org/r/20210407043927.10623-5-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Change-Id: Iefdad4fe34757134e4ae06dd4dcc16ada0a7880c	2022-11-12 11:24:49 +00:00
Gao Xiang	18f3dd8eb4	erofs: introduce physical cluster slab pools Since multiple pcluster sizes could be used at once, the number of compressed pages will become a variable factor. It's necessary to introduce slab pools rather than a single slab cache now. This limits the pclustersize to 1M (Z_EROFS_PCLUSTER_MAX_SIZE), and get rid of the obsolete EROFS_FS_CLUSTER_PAGE_LIMIT, which has no use now. Link: https://lore.kernel.org/r/20210407043927.10623-4-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Change-Id: I69bdcc92c379a66f71db38f0111d41a34526a586	2022-11-12 11:24:48 +00:00
Gao Xiang	089adbeb82	erofs: introduce multipage per-CPU buffers To deal the with the cases which inplace decompression is infeasible for some inplace I/O. Per-CPU buffers was introduced to get rid of page allocation latency and thrash for low-latency decompression algorithms such as lz4. For the big pcluster feature, introduce multipage per-CPU buffers to keep such inplace I/O pclusters temporarily as well but note that per-CPU pages are just consecutive virtually. When a new big pcluster fs is mounted, its max pclustersize will be read and per-CPU buffers can be growed if needed. Shrinking adjustable per-CPU buffers is more complex (because we don't know if such size is still be used), so currently just release them all when unloading. Link: https://lore.kernel.org/r/20210409190630.19569-1-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Change-Id: I7153344779a49710efc2ce57b0e2d8b9b991009b	2022-11-12 11:24:48 +00:00
Gao Xiang	d59de04be7	erofs: reserve physical_clusterbits[] Formal big pcluster design is actually more powerful / flexable than the previous thought whose pclustersize was fixed as power-of-2 blocks, which was obviously inefficient and space-wasting. Instead, pclustersize can now be set independently for each pcluster, so various pcluster sizes can also be used together in one file if mkfs wants (for example, according to data type and/or compression ratio). Let's get rid of previous physical_clusterbits[] setting (also notice that corresponding on-disk fields are still 0 for now). Therefore, head1/2 can be used for at most 2 different algorithms in one file and again pclustersize is now independent of these. Link: https://lore.kernel.org/r/20210407043927.10623-2-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Change-Id: Ie48ec3c8088d7aab59683ecce6049cc8c13f4bf2	2022-11-12 11:24:48 +00:00
Ruiqi Gong	272228d253	erofs: Clean up spelling mistakes found in fs/erofs zmap.c: s/correspoinding/corresponding zdata.c: s/endding/ending Link: https://lore.kernel.org/r/20210331093920.31923-1-gongruiqi1@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Ruiqi Gong <gongruiqi1@huawei.com> Reviewed-by: Gao Xiang <hsiangkao@redhat.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Change-Id: Ia6f036e2eb2aad4054fb7fa4d434623fa97ac0fd	2022-11-12 11:24:48 +00:00
Gao Xiang	52b1a55f3d	erofs: add on-disk compression configurations Add a bitmap for available compression algorithms and a variable-sized on-disk table for compression options in preparation for upcoming big pcluster and LZMA algorithm, which follows the end of super block. To parse the compression options, the bitmap is scanned one by one. For each available algorithm, there is data followed by 2-byte `length' correspondingly (it's enough for most cases, or entire fs blocks should be used.) With such available algorithm bitmap, kernel itself can also refuse to mount such filesystem if any unsupported compression algorithm exists. Note that COMPR_CFGS feature will be enabled with BIG_PCLUSTER. Link: https://lore.kernel.org/r/20210329100012.12980-1-hsiangkao@aol.com Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Change-Id: Ic8420a2f98774fb316075c54bd81103f62756e19	2022-11-12 11:24:48 +00:00
Gao Xiang	d45aa2f8fe	erofs: introduce on-disk lz4 fs configurations Introduce z_erofs_lz4_cfgs to store all lz4 configurations. Currently it's only max_distance, but will be used for new features later. Link: https://lore.kernel.org/r/20210329012308.28743-4-hsiangkao@aol.com Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Change-Id: Ie7306c702d33d62529630c3c6825d892a306a6fe	2022-11-12 11:24:48 +00:00
Huang Jianan	50412aa0b1	erofs: support adjust lz4 history window size lz4 uses LZ4_DISTANCE_MAX to record history preservation. When using rolling decompression, a block with a higher compression ratio will cause a larger memory allocation (up to 64k). It may cause a large resource burden in extreme cases on devices with small memory and a large number of concurrent IOs. So appropriately reducing this value can improve performance. Decreasing this value will reduce the compression ratio (except when input_size <LZ4_DISTANCE_MAX). But considering that erofs currently only supports 4k output, reducing this value will not significantly reduce the compression benefits. The maximum value of LZ4_DISTANCE_MAX defined by lz4 is 64k, and we can only reduce this value. For the old kernel, it just can't reduce the memory allocation during rolling decompression without affecting the decompression result. Link: https://lore.kernel.org/r/20210329012308.28743-3-hsiangkao@aol.com Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Huang Jianan <huangjianan@oppo.com> Signed-off-by: Guo Weichao <guoweichao@oppo.com> [ Gao Xiang: introduce struct erofs_sb_lz4_info for configurations. ] Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Change-Id: I19ebaefe144f3144e291225626ff7067ca646af6	2022-11-12 11:24:47 +00:00
Gao Xiang	a8f99e98e5	erofs: introduce erofs_sb_has_xxx() helpers Introduce erofs_sb_has_xxx() to make long checks short, especially for later big pcluster & LZMA features. Link: https://lore.kernel.org/r/20210329012308.28743-2-hsiangkao@aol.com Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Change-Id: Ibf0e8724fbbc787b53e512212c794ca99f31390b	2022-11-12 11:24:47 +00:00
Yue Hu	6f04dbe3a3	erofs: don't use erofs_map_blocks() any more Currently, erofs_map_blocks() will be called only from erofs_{bmap, read_raw_page} which are all for uncompressed files. So, the compression branch in erofs_map_blocks() is pointless. Let's remove it and use erofs_map_blocks_flatmode() directly. Also update related comments. Link: https://lore.kernel.org/r/20210325071008.573-1-zbestahu@gmail.com Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Yue Hu <huyue2@yulong.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Change-Id: I1804aa803d08d1df05e03811788ee8a2ace91f76	2022-11-12 11:24:47 +00:00
Gao Xiang	1a47f87d98	erofs: complete a missing case for inplace I/O Add a missing case which could cause unnecessary page allocation but not directly use inplace I/O instead, which increases runtime extra memory footprint. The detail is, considering an online file-backed page, the right half of the page is chosen to be cached (e.g. the end page of a readahead request) and some of its data doesn't exist in managed cache, so the pcluster will be definitely kept in the submission chain. (IOWs, it cannot be decompressed without I/O, e.g., due to the bypass queue). Currently, DELAYEDALLOC/TRYALLOC cases can be downgraded as NOINPLACE, and stop online pages from inplace I/O. After this patch, unneeded page allocations won't be observed in pickup_page_for_submission() then. Link: https://lore.kernel.org/r/20210321183227.5182-1-hsiangkao@aol.com Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Change-Id: I4ec6ee8ee275e142846f657dabe14934c0019ce2	2022-11-12 11:24:47 +00:00
Huang Jianan	1ffe1efa18	erofs: use workqueue decompression for atomic contexts only z_erofs_decompressqueue_endio may not be executed in the atomic context, for example, when dm-verity is turned on. In this scenario, data can be decompressed directly to get rid of additional kworker scheduling overhead. Link: https://lore.kernel.org/r/20210317035448.13921-2-huangjianan@oppo.com Reviewed-by: Gao Xiang <hsiangkao@redhat.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Huang Jianan <huangjianan@oppo.com> Signed-off-by: Guo Weichao <guoweichao@oppo.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Change-Id: I3dbb47bf28b420fe272de86a437d4fec09d69943	2022-11-12 11:24:47 +00:00
Huang Jianan	1913a8c2a4	erofs: avoid memory allocation failure during rolling decompression Currently, err would be treated as io error. Therefore, it'd be better to ensure memory allocation during rolling decompression to avoid such io error. In the long term, we might consider adding another !Uptodate case for such case. Link: https://lore.kernel.org/r/20210316031515.90954-1-huangjianan@oppo.com Reviewed-by: Gao Xiang <hsiangkao@redhat.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Huang Jianan <huangjianan@oppo.com> Signed-off-by: Guo Weichao <guoweichao@oppo.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Change-Id: I609eec373f765e897fa3f66f4b5a8b6549b88356	2022-11-12 11:24:46 +00:00
Gao Xiang	76042998ea	erofs: force inplace I/O under low memory scenario Try to forcely switch to inplace I/O under low memory scenario in order to avoid direct memory reclaim due to cached page allocation. Link: https://lore.kernel.org/r/20201209123717.12430-1-hsiangkao@aol.com Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Change-Id: I8ea2d3b59c68125271f66853cf5dc6ca39e7aaa9	2022-11-12 11:24:46 +00:00
Gao Xiang	6ddba74d77	erofs: simplify try_to_claim_pcluster() simplify try_to_claim_pcluster() by directly using cmpxchg() here (the retry loop caused more overhead.) Also, move the chain loop detection in and rename it to z_erofs_try_to_claim_pcluster(). Link: https://lore.kernel.org/r/20201208095834.3133565-3-hsiangkao@redhat.com Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Change-Id: I8d091ff44123b099ef199eaa4200a00b8854623f	2022-11-12 11:24:46 +00:00
Gao Xiang	d6e4efa5bd	erofs: insert to managed cache after adding to pcl Previously, it could be some concern to call add_to_page_cache_lru() with page->mapping == Z_EROFS_MAPPING_STAGING (!= NULL). In contrast, page->private is used instead now, so partially revert commit 5ddcee1f3a1c ("erofs: get rid of __stagingpage_alloc helper") with some adaption for simplicity. Link: https://lore.kernel.org/r/20201208095834.3133565-2-hsiangkao@redhat.com Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Change-Id: If250d62b47083649e96d0937eb1990b6c84d768f	2022-11-12 11:24:46 +00:00
Gao Xiang	5e6356e2a9	erofs: get rid of magical Z_EROFS_MAPPING_STAGING Previously, we played around with magical page->mapping for short-lived temporary pages since we need to identify different types of pages in the same pcluster but both invalidated and short-lived temporary pages can have page->mapping == NULL. It was considered as safe because that temporary pages are all non-LRU / non-movable pages. This patch tends to use specific page->private to identify short-lived pages instead so it won't rely on page->mapping anymore. Details are described in "compress.h" as well. Link: https://lore.kernel.org/r/20201208095834.3133565-1-hsiangkao@redhat.com Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Change-Id: I2c8650e80cb6016ed828d04f89f8bd3512ca3fb2	2022-11-12 11:24:46 +00:00
Vladimir Zapolskiy	bb7da106ff	erofs: remove a void EROFS_VERSION macro set in Makefile Since commit 4f761fa253b4 ("erofs: rename errln/infoln/debugln to erofs_{err, info, dbg}") the defined macro EROFS_VERSION has no affect, therefore removing it from the Makefile is a non-functional change. Link: https://lore.kernel.org/r/20201030122839.25431-1-vladimir@tuxera.com Reviewed-by: Gao Xiang <hsiangkao@redhat.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Vladimir Zapolskiy <vladimir@tuxera.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Change-Id: Id63ad279985db2a156d62be814bf381c9bea8342	2022-11-12 11:24:45 +00:00
Gao Xiang	be617d2193	erofs: fix unsafe pagevec reuse of hooked pclusters There are pclusters in runtime marked with Z_EROFS_PCLUSTER_TAIL before actual I/O submission. Thus, the decompression chain can be extended if the following pcluster chain hooks such tail pcluster. As the related comment mentioned, if some page is made of a hooked pcluster and another followed pcluster, it can be reused for in-place I/O (since I/O should be submitted anyway): _______________________________________________________________ \| tail (partial) page \| head (partial) page \| \|_____PRIMARY_HOOKED___\|____________PRIMARY_FOLLOWED____________\| However, it's by no means safe to reuse as pagevec since if such PRIMARY_HOOKED pclusters finally move into bypass chain without I/O submission. It's somewhat hard to reproduce with LZ4 and I just found it (general protection fault) by ro_fsstressing a LZMA image for long time. I'm going to actively clean up related code together with multi-page folio adaption in the next few months. Let's address it directly for easier backporting for now. Call trace for reference: z_erofs_decompress_pcluster+0x10a/0x8a0 [erofs] z_erofs_decompress_queue.isra.36+0x3c/0x60 [erofs] z_erofs_runqueue+0x5f3/0x840 [erofs] z_erofs_readahead+0x1e8/0x320 [erofs] read_pages+0x91/0x270 page_cache_ra_unbounded+0x18b/0x240 filemap_get_pages+0x10a/0x5f0 filemap_read+0xa9/0x330 new_sync_read+0x11b/0x1a0 vfs_read+0xf1/0x190 Link: https://lore.kernel.org/r/20211103182006.4040-1-xiang@kernel.org Fixes: `3883a79abd` ("staging: erofs: introduce VLE decompression support") Cc: <stable@vger.kernel.org> # 4.19+ Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Change-Id: Ieecf1f9ac4d2b2e1d285b7c157d9f37feddb79f7	2022-11-12 11:24:45 +00:00
Yue Hu	3fc031f2a6	erofs: remove the occupied parameter from z_erofs_pagevec_enqueue() No any behavior to variable occupied in z_erofs_attach_page() which is only caller to z_erofs_pagevec_enqueue(). Link: https://lore.kernel.org/r/20210419102623.2015-1-zbestahu@gmail.com Signed-off-by: Yue Hu <huyue2@yulong.com> Reviewed-by: Gao Xiang <xiang@kernel.org> Signed-off-by: Gao Xiang <xiang@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Change-Id: I1e3990f8b8c6bdce29dcadcf561aca0ee8657453	2022-11-12 11:24:45 +00:00
Gao Xiang	ba4695a03d	erofs: don't trigger WARN() when decompression fails syzbot reported a WARNING [1] due to corrupted compressed data. As Dmitry said, "If this is not a kernel bug, then the code should not use WARN. WARN if for kernel bugs and is recognized as such by all testing systems and humans." [1] https://lore.kernel.org/r/000000000000b3586105cf0ff45e@google.com Link: https://lore.kernel.org/r/20211025074311.130395-1-hsiangkao@linux.alibaba.com Cc: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Reported-by: syzbot+d8aaffc3719597e8cfb4@syzkaller.appspotmail.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Change-Id: If67aa741b0e1563465d162cf56ddc0ba6a50b0d1	2022-11-12 11:24:45 +00:00
Gao Xiang	7c6e563c72	erofs: move from drivers/staging/ to fs/ Since 5.4, erofs has been moved into fs/. Keep up with the 5.10 LTS kernel until the following commit: dbaf435ddf97 ("erofs: add unsupported inode i_format check") Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Tested-by: Liu Bo <bo.liu@linux.alibaba.com> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Change-Id: If4827beb38d76d65844e31f9a594010871849918	2022-11-12 11:24:45 +00:00
Gao Xiang	0c8e6c2ec1	erofs: sync up with kernel 5.10 Backport 5.10 LTS erofs codebase to 4.19 with adaption such as using old mount APIs, radix tree instead of XArray and reverting some bio interface changes. Keep up with the 5.10 LTS kernel until the following commit: dbaf435ddf97 ("erofs: add unsupported inode i_format check") Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Tested-by: Liu Bo <bo.liu@linux.alibaba.com> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Change-Id: Iac3047e822480a14c446ac2671f79369251bd7b6	2022-11-12 11:24:44 +00:00
Park Ju Hyung	038ea3002f	msm_geni_serial: skip flushing tx upon shutdown This is causing runtime PM to malfunction and prevents suspend indefinitely. This might be HAL related, but since flushing data by manually powering on the serial is unnecessary upon shutdown, skip it instead. Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com> Signed-off-by: Danny Lin <danny@kdrag0n.dev> Signed-off-by: celtare21 <celtare21@gmail.com>	2022-11-12 11:24:44 +00:00
Park Ju Hyung	56f3dfad34	msm_geni_serial: reduce wakelock timeout from ISR to 100ms Average userspace response time from ISR is less than 10ms. Whooping 2 seconds is way too long. Reduce it to 100ms. Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com> Change-Id: I78e1f434f6e5585497972ee82b2a29c5cfd87408	2022-11-12 11:24:44 +00:00
Arjan van de Ven	de7528f5bd	fs: ext4: fsync: optimize double-fsync() a bunch There are cases where EXT4 is a bit too conservative sending barriers down to the disk; there are cases where the transaction in progress is not the one that sent the barrier (in other words: the fsync is for a file for which the IO happened more time ago and all data was already sent to the disk). For that case, a more performing tradeoff can be made on SSD devices (which have the ability to flush their dram caches in a hurry on a power fail event) where the barrier gets sent to the disk, but we don't need to wait for the barrier to complete. Any consecutive IO will block on the barrier correctly. Signed-off-by: Diab Neiroukh <lazerl0rd@thezest.dev>	2022-11-12 11:24:44 +00:00
nathanchance	6ff225f1cd	block : makefile : disable align mismatch new clang introduced Walign mismatch fix compile block/blk-mq	2022-11-12 11:24:43 +00:00
Jens Axboe	f0042f8846	blk-mq: fix corruption with direct issue If we attempt a direct issue to a SCSI device, and it returns BUSY, then we queue the request up normally. However, the SCSI layer may have already setup SG tables etc for this particular command. If we later merge with this request, then the old tables are no longer valid. Once we issue the IO, we only read/write the original part of the request, not the new state of it. This causes data corruption, and is most often noticed with the file system complaining about the just read data being invalid: [ 235.934465] EXT4-fs error (device sda1): ext4_iget:4831: inode #7142: comm dpkg-query: bad extra_isize 24937 (inode size 256) because most of it is garbage... This doesn't happen from the normal issue path, as we will simply defer the request to the hardware queue dispatch list if we fail. Once it's on the dispatch list, we never merge with it. Fix this from the direct issue path by flagging the request as REQ_NOMERGE so we don't change the size of it before issue. See also: https://bugzilla.kernel.org/show_bug.cgi?id=201685 Tested-by: Guenter Roeck <linux@roeck-us.net> Fixes: `6ce3dd6eec` ("blk-mq: issue directly if hw queue isn't busy in case of 'none'") Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Diab Neiroukh <lazerl0rd@thezest.dev>	2022-11-12 11:24:43 +00:00
Arjan van de Ven	29729569a8	do accept() in LIFO order for cache efficiency Signed-off-by: Diab Neiroukh <lazerl0rd@thezest.dev>	2022-11-12 11:24:43 +00:00
Arjan van de Ven	c48ce61c87	kernel: time: reduce ntp wakeups Signed-off-by: Diab Neiroukh <lazerl0rd@thezest.dev>	2022-11-12 11:24:43 +00:00
Arjan van de Ven	1353e025a4	ipv4/tcp: allow the memory tuning for tcp to go a little bigger than default Signed-off-by: Diab Neiroukh <lazerl0rd@thezest.dev>	2022-11-12 11:24:43 +00:00
Arjan van de Ven	e27e70b196	Initialize ata before graphics ATA init is the long pole in the boot process, and its asynchronous. move the graphics init after it so that ata and graphics initialize in parallel Signed-off-by: Diab Neiroukh <lazerl0rd@thezest.dev>	2022-11-12 11:24:42 +00:00
Arjan van de Ven	5f4f5aceae	Increase the ext4 default commit age Both the VM and EXT4 have a "commit to disk after X seconds" time. Currently the EXT4 time is shorter than our VM time, which is a bit suboptional, it's better for performance to let the VM do the writeouts in bulk rather than something deep in the journalling layer. (DISTRO TWEAK -- NOT FOR UPSTREAM) Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com> Signed-off-by: Diab Neiroukh <lazerl0rd@thezest.dev>	2022-11-12 11:24:42 +00:00
spakkkk	eaa4e1547f	arm64: config: enable exfat	2022-11-12 11:24:42 +00:00
Ritesh Harjani	70cc7eeb24	ext4: optimize file overwrites In case if the file already has underlying blocks/extents allocated then we don't need to start a journal txn and can directly return the underlying mapping. Currently ext4_iomap_begin() is used by both DAX & DIO path. We can check if the write request is an overwrite & then directly return the mapping information. This could give a significant perf boost for multi-threaded writes specially random overwrites. On PPC64 VM with simulated pmem(DAX) device, ~10x perf improvement could be seen in random writes (overwrite). Also bcoz this optimizes away the spinlock contention during jbd2 slab cache allocation (jbd2_journal_handle). On x86 VM, ~2x perf improvement was observed. Reported-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Link: https://lore.kernel.org/r/88e795d8a4d5cd22165c7ebe857ba91d68d8813e.1600401668.git.riteshh@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2022-11-12 11:24:42 +00:00
Park Ju Hyung	4d8e9a4708	quota_tree: Avoid dynamic memory allocations Most allocations done here are rather small and can fit on the stack, eliminating the need to allocate them dynamically. Reserve a 1024B stack buffer for this purpose to avoid the overhead of dynamic memory allocation. 1024B covers most use cases, and higher values were observed to cause stack corruptions. Co-authored-by: Sultan Alsawaf <sultan@kerneltoast.com> Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>	2022-11-12 11:24:42 +00:00
Park Ju Hyung	33ce619f2f	techpack: display: add some bp hints to hot paths Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com> [@0ctobot: Adapted for 4.19] Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com>	2022-11-12 11:24:41 +00:00
Park Ju Hyung	aeef1bc25e	drm/msm: use kmem_cache pool for struct vblank_work These get allocated and freed millions of times on this kernel tree. Use a dedicated kmem_cache pool and avoid costly dynamic memory allocations. Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>	2022-11-12 11:24:41 +00:00
Park Ju Hyung	d2baa1092e	kthread: use buffer from the stack space struct kthread_create_info is small enough to fit perfectly under the stack space. Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com> Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com>	2022-11-12 11:24:41 +00:00
Park Ju Hyung	518acbdb88	exec: use bprm from the stack space struct linux_binprm isn't big and is safe to use from the stack space Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com> [@0ctobot: Adapted for 4.19] Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com>	2022-11-12 11:24:41 +00:00
Park Ju Hyung	60b66cad88	sched: do not allocate window cpu arrays separately These are allocated extremely frequently. Allocate them with CONFIG_NR_CPUS upon struct ravg's allocation. This will break walt debug tracings. Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com> Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com>	2022-11-12 11:24:41 +00:00
Park Ju Hyung	19d0fb968c	power_supply: don't allocate attrname healthd queries this extremely frequently and attrname is allocated and de-allocated repeatedly. Use the stack space instead. Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com> Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com>	2022-11-12 11:24:40 +00:00

870728 Commits