diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 80c2f0b25047..4fa822cb561e 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -566,7 +566,7 @@ loops can be debugged more effectively on production systems. - clearcpuid=BITNUM [X86] + clearcpuid=BITNUM[,BITNUM...] [X86] Disable CPUID feature X for the kernel. See arch/x86/include/asm/cpufeatures.h for the valid bit numbers. Note the Linux specific bits are not necessarily @@ -5302,6 +5302,14 @@ with /sys/devices/system/xen_memory/xen_memory0/scrub_pages. Default value controlled with CONFIG_XEN_SCRUB_PAGES_DEFAULT. + xen.event_eoi_delay= [XEN] + How long to delay EOI handling in case of event + storms (jiffies). Default is 10. + + xen.event_loop_timeout= [XEN] + After which time (jiffies) the event handling loop + should start to delay EOI handling. Default is 2. + xirc2ps_cs= [NET,PCMCIA] Format: ,,,,,[,[,[,]]] diff --git a/Documentation/core-api/timekeeping.rst b/Documentation/core-api/timekeeping.rst index 93cbeb9daec0..491a93c574a0 100644 --- a/Documentation/core-api/timekeeping.rst +++ b/Documentation/core-api/timekeeping.rst @@ -99,16 +99,20 @@ Coarse and fast_ns access Some additional variants exist for more specialized cases: -.. c:function:: ktime_t ktime_get_coarse_boottime( void ) +.. c:function:: ktime_t ktime_get_coarse( void ) + ktime_t ktime_get_coarse_boottime( void ) ktime_t ktime_get_coarse_real( void ) ktime_t ktime_get_coarse_clocktai( void ) - ktime_t ktime_get_coarse_raw( void ) + +.. c:function:: u64 ktime_get_coarse_ns( void ) + u64 ktime_get_coarse_boottime_ns( void ) + u64 ktime_get_coarse_real_ns( void ) + u64 ktime_get_coarse_clocktai_ns( void ) .. c:function:: void ktime_get_coarse_ts64( struct timespec64 * ) void ktime_get_coarse_boottime_ts64( struct timespec64 * ) void ktime_get_coarse_real_ts64( struct timespec64 * ) void ktime_get_coarse_clocktai_ts64( struct timespec64 * ) - void ktime_get_coarse_raw_ts64( struct timespec64 * ) These are quicker than the non-coarse versions, but less accurate, corresponding to CLOCK_MONONOTNIC_COARSE and CLOCK_REALTIME_COARSE diff --git a/Documentation/media/uapi/v4l/colorspaces-defs.rst b/Documentation/media/uapi/v4l/colorspaces-defs.rst index f24615544792..16e46bec8093 100644 --- a/Documentation/media/uapi/v4l/colorspaces-defs.rst +++ b/Documentation/media/uapi/v4l/colorspaces-defs.rst @@ -29,8 +29,7 @@ whole range, 0-255, dividing the angular value by 1.41. The enum :c:type:`v4l2_hsv_encoding` specifies which encoding is used. .. note:: The default R'G'B' quantization is full range for all - colorspaces except for BT.2020 which uses limited range R'G'B' - quantization. + colorspaces. HSV formats are always full range. .. tabularcolumns:: |p{6.0cm}|p{11.5cm}| @@ -162,8 +161,8 @@ whole range, 0-255, dividing the angular value by 1.41. The enum - Details * - ``V4L2_QUANTIZATION_DEFAULT`` - Use the default quantization encoding as defined by the - colorspace. This is always full range for R'G'B' (except for the - BT.2020 colorspace) and HSV. It is usually limited range for Y'CbCr. + colorspace. This is always full range for R'G'B' and HSV. + It is usually limited range for Y'CbCr. * - ``V4L2_QUANTIZATION_FULL_RANGE`` - Use the full range quantization encoding. I.e. the range [0…1] is mapped to [0…255] (with possible clipping to [1…254] to avoid the @@ -173,4 +172,4 @@ whole range, 0-255, dividing the angular value by 1.41. The enum * - ``V4L2_QUANTIZATION_LIM_RANGE`` - Use the limited range quantization encoding. I.e. the range [0…1] is mapped to [16…235]. Cb and Cr are mapped from [-0.5…0.5] to - [16…240]. + [16…240]. Limited Range cannot be used with HSV. diff --git a/Documentation/media/uapi/v4l/colorspaces-details.rst b/Documentation/media/uapi/v4l/colorspaces-details.rst index 09fabf4cd412..ca7176cae8dd 100644 --- a/Documentation/media/uapi/v4l/colorspaces-details.rst +++ b/Documentation/media/uapi/v4l/colorspaces-details.rst @@ -370,9 +370,8 @@ Colorspace BT.2020 (V4L2_COLORSPACE_BT2020) The :ref:`itu2020` standard defines the colorspace used by Ultra-high definition television (UHDTV). The default transfer function is ``V4L2_XFER_FUNC_709``. The default Y'CbCr encoding is -``V4L2_YCBCR_ENC_BT2020``. The default R'G'B' quantization is limited -range (!), and so is the default Y'CbCr quantization. The chromaticities -of the primary colors and the white reference are: +``V4L2_YCBCR_ENC_BT2020``. The default Y'CbCr quantization is limited range. +The chromaticities of the primary colors and the white reference are: diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index bf65e24225a0..975f617613b8 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -949,12 +949,14 @@ icmp_ratelimit - INTEGER icmp_msgs_per_sec - INTEGER Limit maximal number of ICMP packets sent per second from this host. Only messages whose type matches icmp_ratemask (see below) are - controlled by this limit. + controlled by this limit. For security reasons, the precise count + of messages per second is randomized. Default: 1000 icmp_msgs_burst - INTEGER icmp_msgs_per_sec controls number of ICMP packets sent per second, while icmp_msgs_burst controls the burst size of these packets. + For security reasons, the precise burst size is randomized. Default: 50 icmp_ratemask - INTEGER diff --git a/MAINTAINERS b/MAINTAINERS index 5ad444893d3e..b8a90bd17dc9 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3907,6 +3907,7 @@ F: crypto/ F: drivers/crypto/ F: include/crypto/ F: include/linux/crypto* +F: lib/crypto/ CRYPTOGRAPHIC RANDOM NUMBER GENERATOR M: Neil Horman @@ -15890,6 +15891,14 @@ L: linux-gpio@vger.kernel.org S: Maintained F: drivers/gpio/gpio-ws16c48.c +WIREGUARD SECURE NETWORK TUNNEL +M: Jason A. Donenfeld +S: Maintained +F: drivers/net/wireguard/ +F: tools/testing/selftests/wireguard/ +L: wireguard@lists.zx2c4.com +L: netdev@vger.kernel.org + WISTRON LAPTOP BUTTON DRIVER M: Miloslav Trmac S: Maintained diff --git a/Makefile b/Makefile index 8201f9cc9af4..e6ffc28de7c4 100644 --- a/Makefile +++ b/Makefile @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 VERSION = 4 PATCHLEVEL = 19 -SUBLEVEL = 152 +SUBLEVEL = 157 EXTRAVERSION = NAME = "People's Front" @@ -505,11 +505,7 @@ endif ifeq ($(cc-name),clang) ifneq ($(CROSS_COMPILE),) -CLANG_TRIPLE ?= $(CROSS_COMPILE) -CLANG_FLAGS += --target=$(notdir $(CLANG_TRIPLE:%-=%)) -ifeq ($(shell $(srctree)/scripts/clang-android.sh $(CC) $(CLANG_FLAGS)), y) -$(error "Clang with Android --target detected. Did you specify CLANG_TRIPLE?") -endif +CLANG_FLAGS += --target=$(notdir $(CROSS_COMPILE:%-=%)) GCC_TOOLCHAIN_DIR := $(dir $(shell which $(CROSS_COMPILE)elfedit)) CLANG_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)$(notdir $(CROSS_COMPILE)) GCC_TOOLCHAIN := $(realpath $(GCC_TOOLCHAIN_DIR)/..) diff --git a/android/abi_gki_aarch64.xml b/android/abi_gki_aarch64.xml index b81396432ec1..e7b884328a84 100644 --- a/android/abi_gki_aarch64.xml +++ b/android/abi_gki_aarch64.xml @@ -2259,6 +2259,7 @@ + @@ -7024,7 +7025,7 @@ - + @@ -8136,7 +8137,7 @@ - + @@ -9386,6 +9387,7 @@ + @@ -12204,13 +12206,13 @@ - - + + - + - + @@ -16175,6 +16177,61 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + @@ -20710,10 +20767,10 @@ - - - - + + + + @@ -23894,7 +23951,7 @@ - + @@ -27401,7 +27458,6 @@ - @@ -27992,7 +28048,6 @@ - @@ -28420,6 +28475,17 @@ + + + + + + + + + + + @@ -28553,6 +28619,17 @@ + + + + + + + + + + + @@ -28643,41 +28720,24 @@ - + - + - + - + - + - - - - + + - - - - - - - - - - - - - - - - - + + @@ -28721,18 +28781,27 @@ - + - + - + - - + + - - + + + + + + + + + + + @@ -28855,7 +28924,7 @@ - + @@ -28926,10 +28995,7 @@ - - - @@ -28952,14 +29018,14 @@ - + - + @@ -28970,14 +29036,14 @@ - + - + @@ -28998,44 +29064,44 @@ - - + + - + - + - + - + - + - + - + @@ -29047,27 +29113,27 @@ - + - - + + - + - + - + @@ -29078,11 +29144,11 @@ - + - + @@ -32254,7 +32320,6 @@ - @@ -32266,8 +32331,6 @@ - - @@ -32332,7 +32395,7 @@ - + @@ -34641,7 +34704,7 @@ - + @@ -35990,6 +36053,17 @@ + + + + + + + + + + + @@ -36442,21 +36516,21 @@ - - + + - - + + - - - + + + - - + + @@ -36500,6 +36574,11 @@ + + + + + @@ -40317,11 +40396,6 @@ - - - - - @@ -40484,6 +40558,17 @@ + + + + + + + + + + + @@ -40708,7 +40793,7 @@ - + @@ -40716,7 +40801,7 @@ - + @@ -40953,7 +41038,7 @@ - + @@ -41086,7 +41171,7 @@ - + @@ -41137,7 +41222,7 @@ - + @@ -41351,7 +41436,7 @@ - + @@ -41364,17 +41449,6 @@ - - - - - - - - - - - @@ -41964,7 +42038,7 @@ - + @@ -42260,7 +42334,7 @@ - + @@ -42287,12 +42361,12 @@ - + - + @@ -42342,7 +42416,7 @@ - + @@ -42350,13 +42424,13 @@ - + - + @@ -42484,7 +42558,7 @@ - + @@ -47463,7 +47537,7 @@ - + @@ -47483,7 +47557,7 @@ - + @@ -48397,6 +48471,26 @@ + + + + + + + + + + + + + + + + + + + + @@ -49661,7 +49755,7 @@ - + @@ -49692,74 +49786,6 @@ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - @@ -49798,7 +49824,20 @@ - + + + + + + + + + + + + + + @@ -51532,20 +51571,20 @@ - - + + - - + + - - + + - - + + @@ -55441,25 +55480,6 @@ - - - - - - - - - - - - - - - - - - - @@ -55484,10 +55504,6 @@ - - - - @@ -59165,17 +59181,6 @@ - - - - - - - - - - - @@ -59353,17 +59358,6 @@ - - - - - - - - - - - @@ -59483,6 +59477,17 @@ + + + + + + + + + + + @@ -63592,7 +63597,7 @@ - + @@ -65622,9 +65627,9 @@ - - - + + + @@ -68132,11 +68137,11 @@ - + - + @@ -68688,17 +68693,6 @@ - - - - - - - - - - - @@ -68868,7 +68862,7 @@ - + @@ -68896,7 +68890,7 @@ - + @@ -68905,20 +68899,20 @@ - - - - + + + + - - + + - - - - + + + + @@ -68936,7 +68930,6 @@ - @@ -70090,7 +70083,7 @@ - + @@ -70105,8 +70098,8 @@ - - + + @@ -76281,7 +76274,7 @@ - + @@ -80266,7 +80259,7 @@ - + @@ -80442,7 +80435,7 @@ - + @@ -80726,8 +80719,8 @@ - - + + @@ -82913,7 +82906,7 @@ - + @@ -82970,7 +82963,7 @@ - + @@ -83140,6 +83133,25 @@ + + + + + + + + + + + + + + + + + + + @@ -83345,17 +83357,6 @@ - - - - - - - - - - - @@ -83447,7 +83448,7 @@ - + @@ -83476,7 +83477,7 @@ - + @@ -83569,16 +83570,21 @@ - - + + - - + + + + + + + @@ -83661,7 +83667,7 @@ - + @@ -83734,7 +83740,7 @@ - + @@ -84726,12 +84732,12 @@ - + - + @@ -85107,43 +85113,43 @@ - - - + + + - - - - + + + + - - - - + + + + - - - + + + - - - + + + - - - - + + + + - - - - + + + + @@ -85313,28 +85319,28 @@ - - - - + + + + - - - - + + + + - - - - + + + + - - - - + + + + @@ -85433,42 +85439,42 @@ - - - - + + + + - - - + + + - - - - - + + + + + - - - - - + + + + + - - - + + + - - - + + + - + @@ -86289,11 +86295,6 @@ - - - - - @@ -87119,12 +87120,12 @@ - + - + - + @@ -87256,13 +87257,6 @@ - - - - - - - @@ -87556,7 +87550,7 @@ - + @@ -87795,12 +87789,12 @@ - + - + - + @@ -88472,6 +88466,76 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + @@ -88712,84 +88776,6 @@ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - @@ -88947,7 +88933,7 @@ - + @@ -89016,21 +89002,21 @@ - + - + - + @@ -90050,7 +90036,7 @@ - + @@ -94891,124 +94877,124 @@ - - - - - - - - + + + + + + + + - - - + + + - - - - - + + + + + - - + + - - - - + + + + - - - - - - + + + + + + - - - - - + + + + + - - - - - + + + + + - - - - - + + + + + - - - - - - - + + + + + + + - - - - - + + + + + - - - - - - + + + + + + - - - + + + - - - - - - + + + + + + - - - + + + - - - + + + - - - - + + + + - - - - + + + + @@ -96457,10 +96443,10 @@ - + - + @@ -97333,13 +97319,13 @@ - - + + - + @@ -97349,8 +97335,8 @@ - - + + @@ -97735,7 +97721,7 @@ - + @@ -97821,7 +97807,6 @@ - @@ -97832,27 +97817,27 @@ - - + + - + - + - + - + @@ -97860,7 +97845,7 @@ - + @@ -97868,14 +97853,14 @@ - + - + @@ -97883,7 +97868,7 @@ - + @@ -97892,7 +97877,7 @@ - + @@ -97901,11 +97886,11 @@ - + - + @@ -101293,6 +101278,6 @@ diff --git a/android/abi_gki_aarch64_qcom b/android/abi_gki_aarch64_qcom index 88d3768d176d..07ff4483e2be 100644 --- a/android/abi_gki_aarch64_qcom +++ b/android/abi_gki_aarch64_qcom @@ -2348,6 +2348,7 @@ __sock_recv_ts_and_drops sock_wake_async sock_wfree + timer_reduce unregister_net_sysctl_table __wake_up_sync_key __xfrm_policy_check diff --git a/arch/Kconfig b/arch/Kconfig index 7a06ece3d9bd..8a5149f875fe 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -366,6 +366,13 @@ config HAVE_RCU_TABLE_FREE config HAVE_RCU_TABLE_INVALIDATE bool +config ARCH_WANT_IRQS_OFF_ACTIVATE_MM + bool + help + Temporary select until all architectures can be converted to have + irqs disabled over activate_mm. Architectures that do IPI based TLB + shootdowns should enable this. + config ARCH_HAVE_NMI_SAFE_CMPXCHG bool diff --git a/arch/arc/kernel/entry.S b/arch/arc/kernel/entry.S index 705a68208423..85d9ea4a0acc 100644 --- a/arch/arc/kernel/entry.S +++ b/arch/arc/kernel/entry.S @@ -156,6 +156,7 @@ END(EV_Extension) tracesys: ; save EFA in case tracer wants the PC of traced task ; using ERET won't work since next-PC has already committed + lr r12, [efa] GET_CURR_TASK_FIELD_PTR TASK_THREAD, r11 st r12, [r11, THREAD_FAULT_ADDR] ; thread.fault_address @@ -198,9 +199,15 @@ tracesys_exit: ; Breakpoint TRAP ; --------------------------------------------- trap_with_param: - mov r0, r12 ; EFA in case ptracer/gdb wants stop_pc + + ; stop_pc info by gdb needs this info + lr r0, [efa] mov r1, sp + ; Now that we have read EFA, it is safe to do "fake" rtie + ; and get out of CPU exception mode + FAKE_RET_FROM_EXCPN + ; Save callee regs in case gdb wants to have a look ; SP will grow up by size of CALLEE Reg-File ; NOTE: clobbers r12 @@ -227,10 +234,6 @@ ENTRY(EV_Trap) EXCEPTION_PROLOGUE - lr r12, [efa] - - FAKE_RET_FROM_EXCPN - ;============ TRAP 1 :breakpoints ; Check ECR for trap with arg (PROLOGUE ensures r9 has ECR) bmsk.f 0, r9, 7 @@ -238,6 +241,9 @@ ENTRY(EV_Trap) ;============ TRAP (no param): syscall top level + ; First return from Exception to pure K mode (Exception/IRQs renabled) + FAKE_RET_FROM_EXCPN + ; If syscall tracing ongoing, invoke pre-post-hooks GET_CURR_THR_INFO_FLAGS r10 btst r10, TIF_SYSCALL_TRACE diff --git a/arch/arc/kernel/stacktrace.c b/arch/arc/kernel/stacktrace.c index bf40e06f3fb8..0fed32b95923 100644 --- a/arch/arc/kernel/stacktrace.c +++ b/arch/arc/kernel/stacktrace.c @@ -115,7 +115,7 @@ arc_unwind_core(struct task_struct *tsk, struct pt_regs *regs, int (*consumer_fn) (unsigned int, void *), void *arg) { #ifdef CONFIG_ARC_DW2_UNWIND - int ret = 0; + int ret = 0, cnt = 0; unsigned int address; struct unwind_frame_info frame_info; @@ -135,6 +135,11 @@ arc_unwind_core(struct task_struct *tsk, struct pt_regs *regs, break; frame_info.regs.r63 = frame_info.regs.r31; + + if (cnt++ > 128) { + printk("unwinder looping too long, aborting !\n"); + return 0; + } } return address; /* return the last address it saw */ diff --git a/arch/arc/plat-hsdk/Kconfig b/arch/arc/plat-hsdk/Kconfig index c285a83cbf08..df35ea1912e8 100644 --- a/arch/arc/plat-hsdk/Kconfig +++ b/arch/arc/plat-hsdk/Kconfig @@ -11,5 +11,6 @@ menuconfig ARC_SOC_HSDK select ARC_HAS_ACCL_REGS select ARC_IRQ_NO_AUTOSAVE select CLK_HSDK + select RESET_CONTROLLER select RESET_HSDK select MIGHT_HAVE_PCI diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 104262218319..793c8f003b4c 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -622,8 +622,10 @@ config ARCH_S3C24XX select HAVE_S3C2410_WATCHDOG if WATCHDOG select HAVE_S3C_RTC if RTC_CLASS select NEED_MACH_IO_H + select S3C2410_WATCHDOG select SAMSUNG_ATAGS select USE_OF + select WATCHDOG help Samsung S3C2410, S3C2412, S3C2413, S3C2416, S3C2440, S3C2442, S3C2443 and S3C2450 SoCs based systems, such as the Simtec Electronics BAST diff --git a/arch/arm/boot/dts/imx6sl.dtsi b/arch/arm/boot/dts/imx6sl.dtsi index 55d1872aa81a..9d19183f40e1 100644 --- a/arch/arm/boot/dts/imx6sl.dtsi +++ b/arch/arm/boot/dts/imx6sl.dtsi @@ -922,8 +922,10 @@ }; rngb: rngb@21b4000 { + compatible = "fsl,imx6sl-rngb", "fsl,imx25-rngb"; reg = <0x021b4000 0x4000>; interrupts = <0 5 IRQ_TYPE_LEVEL_HIGH>; + clocks = <&clks IMX6SL_CLK_DUMMY>; }; weim: weim@21b8000 { diff --git a/arch/arm/boot/dts/mt7623n-bananapi-bpi-r2.dts b/arch/arm/boot/dts/mt7623n-bananapi-bpi-r2.dts index 2b760f90f38c..5375c6699843 100644 --- a/arch/arm/boot/dts/mt7623n-bananapi-bpi-r2.dts +++ b/arch/arm/boot/dts/mt7623n-bananapi-bpi-r2.dts @@ -192,6 +192,7 @@ fixed-link { speed = <1000>; full-duplex; + pause; }; }; }; diff --git a/arch/arm/boot/dts/omap4.dtsi b/arch/arm/boot/dts/omap4.dtsi index 1a96d4317c97..8f907c235b02 100644 --- a/arch/arm/boot/dts/omap4.dtsi +++ b/arch/arm/boot/dts/omap4.dtsi @@ -516,7 +516,7 @@ status = "disabled"; }; - target-module@56000000 { + sgx_module: target-module@56000000 { compatible = "ti,sysc-omap4", "ti,sysc"; ti,hwmods = "gpu"; reg = <0x5601fc00 0x4>, diff --git a/arch/arm/boot/dts/omap443x.dtsi b/arch/arm/boot/dts/omap443x.dtsi index cbcdcb4e7d1c..86b9caf461df 100644 --- a/arch/arm/boot/dts/omap443x.dtsi +++ b/arch/arm/boot/dts/omap443x.dtsi @@ -74,3 +74,13 @@ }; /include/ "omap443x-clocks.dtsi" + +/* + * Use dpll_per for sgx at 153.6MHz like droid4 stock v3.0.8 Android kernel + */ +&sgx_module { + assigned-clocks = <&l3_gfx_clkctrl OMAP4_GPU_CLKCTRL 24>, + <&dpll_per_m7x2_ck>; + assigned-clock-rates = <0>, <153600000>; + assigned-clock-parents = <&dpll_per_m7x2_ck>; +}; diff --git a/arch/arm/boot/dts/owl-s500.dtsi b/arch/arm/boot/dts/owl-s500.dtsi index 43c9980a4260..75a76842c270 100644 --- a/arch/arm/boot/dts/owl-s500.dtsi +++ b/arch/arm/boot/dts/owl-s500.dtsi @@ -85,21 +85,21 @@ global_timer: timer@b0020200 { compatible = "arm,cortex-a9-global-timer"; reg = <0xb0020200 0x100>; - interrupts = ; + interrupts = ; status = "disabled"; }; twd_timer: timer@b0020600 { compatible = "arm,cortex-a9-twd-timer"; reg = <0xb0020600 0x20>; - interrupts = ; + interrupts = ; status = "disabled"; }; twd_wdt: wdt@b0020620 { compatible = "arm,cortex-a9-twd-wdt"; reg = <0xb0020620 0xe0>; - interrupts = ; + interrupts = ; status = "disabled"; }; diff --git a/arch/arm/boot/dts/s5pv210.dtsi b/arch/arm/boot/dts/s5pv210.dtsi index 67358562a6ea..020a864623ff 100644 --- a/arch/arm/boot/dts/s5pv210.dtsi +++ b/arch/arm/boot/dts/s5pv210.dtsi @@ -98,19 +98,16 @@ }; clocks: clock-controller@e0100000 { - compatible = "samsung,s5pv210-clock", "simple-bus"; + compatible = "samsung,s5pv210-clock"; reg = <0xe0100000 0x10000>; clock-names = "xxti", "xusbxti"; clocks = <&xxti>, <&xusbxti>; #clock-cells = <1>; - #address-cells = <1>; - #size-cells = <1>; - ranges; + }; - pmu_syscon: syscon@e0108000 { - compatible = "samsung-s5pv210-pmu", "syscon"; - reg = <0xe0108000 0x8000>; - }; + pmu_syscon: syscon@e0108000 { + compatible = "samsung-s5pv210-pmu", "syscon"; + reg = <0xe0108000 0x8000>; }; pinctrl0: pinctrl@e0200000 { @@ -126,35 +123,28 @@ }; }; - amba { - #address-cells = <1>; - #size-cells = <1>; - compatible = "simple-bus"; - ranges; + pdma0: dma@e0900000 { + compatible = "arm,pl330", "arm,primecell"; + reg = <0xe0900000 0x1000>; + interrupt-parent = <&vic0>; + interrupts = <19>; + clocks = <&clocks CLK_PDMA0>; + clock-names = "apb_pclk"; + #dma-cells = <1>; + #dma-channels = <8>; + #dma-requests = <32>; + }; - pdma0: dma@e0900000 { - compatible = "arm,pl330", "arm,primecell"; - reg = <0xe0900000 0x1000>; - interrupt-parent = <&vic0>; - interrupts = <19>; - clocks = <&clocks CLK_PDMA0>; - clock-names = "apb_pclk"; - #dma-cells = <1>; - #dma-channels = <8>; - #dma-requests = <32>; - }; - - pdma1: dma@e0a00000 { - compatible = "arm,pl330", "arm,primecell"; - reg = <0xe0a00000 0x1000>; - interrupt-parent = <&vic0>; - interrupts = <20>; - clocks = <&clocks CLK_PDMA1>; - clock-names = "apb_pclk"; - #dma-cells = <1>; - #dma-channels = <8>; - #dma-requests = <32>; - }; + pdma1: dma@e0a00000 { + compatible = "arm,pl330", "arm,primecell"; + reg = <0xe0a00000 0x1000>; + interrupt-parent = <&vic0>; + interrupts = <20>; + clocks = <&clocks CLK_PDMA1>; + clock-names = "apb_pclk"; + #dma-cells = <1>; + #dma-channels = <8>; + #dma-requests = <32>; }; spi0: spi@e1300000 { @@ -227,43 +217,36 @@ status = "disabled"; }; - audio-subsystem { - compatible = "samsung,s5pv210-audss", "simple-bus"; - #address-cells = <1>; - #size-cells = <1>; - ranges; + clk_audss: clock-controller@eee10000 { + compatible = "samsung,s5pv210-audss-clock"; + reg = <0xeee10000 0x1000>; + clock-names = "hclk", "xxti", + "fout_epll", + "sclk_audio0"; + clocks = <&clocks DOUT_HCLKP>, <&xxti>, + <&clocks FOUT_EPLL>, + <&clocks SCLK_AUDIO0>; + #clock-cells = <1>; + }; - clk_audss: clock-controller@eee10000 { - compatible = "samsung,s5pv210-audss-clock"; - reg = <0xeee10000 0x1000>; - clock-names = "hclk", "xxti", - "fout_epll", - "sclk_audio0"; - clocks = <&clocks DOUT_HCLKP>, <&xxti>, - <&clocks FOUT_EPLL>, - <&clocks SCLK_AUDIO0>; - #clock-cells = <1>; - }; - - i2s0: i2s@eee30000 { - compatible = "samsung,s5pv210-i2s"; - reg = <0xeee30000 0x1000>; - interrupt-parent = <&vic2>; - interrupts = <16>; - dma-names = "rx", "tx", "tx-sec"; - dmas = <&pdma1 9>, <&pdma1 10>, <&pdma1 11>; - clock-names = "iis", - "i2s_opclk0", - "i2s_opclk1"; - clocks = <&clk_audss CLK_I2S>, - <&clk_audss CLK_I2S>, - <&clk_audss CLK_DOUT_AUD_BUS>; - samsung,idma-addr = <0xc0010000>; - pinctrl-names = "default"; - pinctrl-0 = <&i2s0_bus>; - #sound-dai-cells = <0>; - status = "disabled"; - }; + i2s0: i2s@eee30000 { + compatible = "samsung,s5pv210-i2s"; + reg = <0xeee30000 0x1000>; + interrupt-parent = <&vic2>; + interrupts = <16>; + dma-names = "rx", "tx", "tx-sec"; + dmas = <&pdma1 9>, <&pdma1 10>, <&pdma1 11>; + clock-names = "iis", + "i2s_opclk0", + "i2s_opclk1"; + clocks = <&clk_audss CLK_I2S>, + <&clk_audss CLK_I2S>, + <&clk_audss CLK_DOUT_AUD_BUS>; + samsung,idma-addr = <0xc0010000>; + pinctrl-names = "default"; + pinctrl-0 = <&i2s0_bus>; + #sound-dai-cells = <0>; + status = "disabled"; }; i2s1: i2s@e2100000 { diff --git a/arch/arm/boot/dts/sun4i-a10.dtsi b/arch/arm/boot/dts/sun4i-a10.dtsi index 5d46bb0139fa..707ad5074878 100644 --- a/arch/arm/boot/dts/sun4i-a10.dtsi +++ b/arch/arm/boot/dts/sun4i-a10.dtsi @@ -143,7 +143,7 @@ trips { cpu_alert0: cpu-alert0 { /* milliCelsius */ - temperature = <850000>; + temperature = <85000>; hysteresis = <2000>; type = "passive"; }; diff --git a/arch/arm/boot/dts/sun8i-r40-bananapi-m2-ultra.dts b/arch/arm/boot/dts/sun8i-r40-bananapi-m2-ultra.dts index c39b9169ea64..b2a773a718e1 100644 --- a/arch/arm/boot/dts/sun8i-r40-bananapi-m2-ultra.dts +++ b/arch/arm/boot/dts/sun8i-r40-bananapi-m2-ultra.dts @@ -206,16 +206,16 @@ }; ®_dc1sw { - regulator-min-microvolt = <3000000>; - regulator-max-microvolt = <3000000>; + regulator-min-microvolt = <3300000>; + regulator-max-microvolt = <3300000>; regulator-name = "vcc-gmac-phy"; }; ®_dcdc1 { regulator-always-on; - regulator-min-microvolt = <3000000>; - regulator-max-microvolt = <3000000>; - regulator-name = "vcc-3v0"; + regulator-min-microvolt = <3300000>; + regulator-max-microvolt = <3300000>; + regulator-name = "vcc-3v3"; }; ®_dcdc2 { diff --git a/arch/arm/crypto/.gitignore b/arch/arm/crypto/.gitignore index 31e1f538df7d..a3c7ad52a469 100644 --- a/arch/arm/crypto/.gitignore +++ b/arch/arm/crypto/.gitignore @@ -1,3 +1,4 @@ aesbs-core.S sha256-core.S sha512-core.S +poly1305-core.S diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig index 5c278a8cce68..b98056ec10e5 100644 --- a/arch/arm/crypto/Kconfig +++ b/arch/arm/crypto/Kconfig @@ -125,14 +125,24 @@ config CRYPTO_CRC32_ARM_CE select CRYPTO_HASH config CRYPTO_CHACHA20_NEON - tristate "NEON accelerated ChaCha stream cipher algorithms" - depends on KERNEL_MODE_NEON + tristate "NEON and scalar accelerated ChaCha stream cipher algorithms" select CRYPTO_BLKCIPHER - select CRYPTO_CHACHA20 + select CRYPTO_ARCH_HAVE_LIB_CHACHA + +config CRYPTO_POLY1305_ARM + tristate "Accelerated scalar and SIMD Poly1305 hash implementations" + select CRYPTO_HASH + select CRYPTO_ARCH_HAVE_LIB_POLY1305 config CRYPTO_NHPOLY1305_NEON tristate "NEON accelerated NHPoly1305 hash function (for Adiantum)" depends on KERNEL_MODE_NEON select CRYPTO_NHPOLY1305 +config CRYPTO_CURVE25519_NEON + tristate "NEON accelerated Curve25519 scalar multiplication library" + depends on KERNEL_MODE_NEON + select CRYPTO_LIB_CURVE25519_GENERIC + select CRYPTO_ARCH_HAVE_LIB_CURVE25519 + endif diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile index b65d6bfab8e6..055b18cd9065 100644 --- a/arch/arm/crypto/Makefile +++ b/arch/arm/crypto/Makefile @@ -10,7 +10,9 @@ obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o obj-$(CONFIG_CRYPTO_SHA512_ARM) += sha512-arm.o obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha-neon.o +obj-$(CONFIG_CRYPTO_POLY1305_ARM) += poly1305-arm.o obj-$(CONFIG_CRYPTO_NHPOLY1305_NEON) += nhpoly1305-neon.o +obj-$(CONFIG_CRYPTO_CURVE25519_NEON) += curve25519-neon.o ce-obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o ce-obj-$(CONFIG_CRYPTO_SHA1_ARM_CE) += sha1-arm-ce.o @@ -53,13 +55,19 @@ aes-arm-ce-y := aes-ce-core.o aes-ce-glue.o ghash-arm-ce-y := ghash-ce-core.o ghash-ce-glue.o crct10dif-arm-ce-y := crct10dif-ce-core.o crct10dif-ce-glue.o crc32-arm-ce-y:= crc32-ce-core.o crc32-ce-glue.o -chacha-neon-y := chacha-neon-core.o chacha-neon-glue.o +chacha-neon-y := chacha-scalar-core.o chacha-glue.o +chacha-neon-$(CONFIG_KERNEL_MODE_NEON) += chacha-neon-core.o +poly1305-arm-y := poly1305-core.o poly1305-glue.o nhpoly1305-neon-y := nh-neon-core.o nhpoly1305-neon-glue.o +curve25519-neon-y := curve25519-core.o curve25519-glue.o ifdef REGENERATE_ARM_CRYPTO quiet_cmd_perl = PERL $@ cmd_perl = $(PERL) $(<) > $(@) +$(src)/poly1305-core.S_shipped: $(src)/poly1305-armv4.pl + $(call cmd,perl) + $(src)/sha256-core.S_shipped: $(src)/sha256-armv4.pl $(call cmd,perl) @@ -67,4 +75,9 @@ $(src)/sha512-core.S_shipped: $(src)/sha512-armv4.pl $(call cmd,perl) endif -targets += sha256-core.S sha512-core.S +targets += poly1305-core.S sha256-core.S sha512-core.S + +# massage the perlasm code a bit so we only get the NEON routine if we need it +poly1305-aflags-$(CONFIG_CPU_V7) := -U__LINUX_ARM_ARCH__ -D__LINUX_ARM_ARCH__=5 +poly1305-aflags-$(CONFIG_KERNEL_MODE_NEON) := -U__LINUX_ARM_ARCH__ -D__LINUX_ARM_ARCH__=7 +AFLAGS_poly1305-core.o += $(poly1305-aflags-y) diff --git a/arch/arm/crypto/chacha-glue.c b/arch/arm/crypto/chacha-glue.c new file mode 100644 index 000000000000..f8eee0f13388 --- /dev/null +++ b/arch/arm/crypto/chacha-glue.c @@ -0,0 +1,356 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * ARM NEON accelerated ChaCha and XChaCha stream ciphers, + * including ChaCha20 (RFC7539) + * + * Copyright (C) 2016-2019 Linaro, Ltd. + * Copyright (C) 2015 Martin Willi + */ + +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include + +asmlinkage void chacha_block_xor_neon(const u32 *state, u8 *dst, const u8 *src, + int nrounds); +asmlinkage void chacha_4block_xor_neon(const u32 *state, u8 *dst, const u8 *src, + int nrounds); +asmlinkage void hchacha_block_arm(const u32 *state, u32 *out, int nrounds); +asmlinkage void hchacha_block_neon(const u32 *state, u32 *out, int nrounds); + +asmlinkage void chacha_doarm(u8 *dst, const u8 *src, unsigned int bytes, + const u32 *state, int nrounds); + +static __ro_after_init DEFINE_STATIC_KEY_FALSE(use_neon); + +static inline bool neon_usable(void) +{ + return static_branch_likely(&use_neon) && may_use_simd(); +} + +static void chacha_doneon(u32 *state, u8 *dst, const u8 *src, + unsigned int bytes, int nrounds) +{ + u8 buf[CHACHA_BLOCK_SIZE]; + + while (bytes >= CHACHA_BLOCK_SIZE * 4) { + chacha_4block_xor_neon(state, dst, src, nrounds); + bytes -= CHACHA_BLOCK_SIZE * 4; + src += CHACHA_BLOCK_SIZE * 4; + dst += CHACHA_BLOCK_SIZE * 4; + state[12] += 4; + } + while (bytes >= CHACHA_BLOCK_SIZE) { + chacha_block_xor_neon(state, dst, src, nrounds); + bytes -= CHACHA_BLOCK_SIZE; + src += CHACHA_BLOCK_SIZE; + dst += CHACHA_BLOCK_SIZE; + state[12]++; + } + if (bytes) { + memcpy(buf, src, bytes); + chacha_block_xor_neon(state, buf, buf, nrounds); + memcpy(dst, buf, bytes); + } +} + +void hchacha_block_arch(const u32 *state, u32 *stream, int nrounds) +{ + if (!IS_ENABLED(CONFIG_KERNEL_MODE_NEON) || !neon_usable()) { + hchacha_block_arm(state, stream, nrounds); + } else { + kernel_neon_begin(); + hchacha_block_neon(state, stream, nrounds); + kernel_neon_end(); + } +} +EXPORT_SYMBOL(hchacha_block_arch); + +void chacha_init_arch(u32 *state, const u32 *key, const u8 *iv) +{ + chacha_init_generic(state, key, iv); +} +EXPORT_SYMBOL(chacha_init_arch); + +void chacha_crypt_arch(u32 *state, u8 *dst, const u8 *src, unsigned int bytes, + int nrounds) +{ + if (!IS_ENABLED(CONFIG_KERNEL_MODE_NEON) || !neon_usable() || + bytes <= CHACHA_BLOCK_SIZE) { + chacha_doarm(dst, src, bytes, state, nrounds); + state[12] += DIV_ROUND_UP(bytes, CHACHA_BLOCK_SIZE); + return; + } + + do { + unsigned int todo = min_t(unsigned int, bytes, SZ_4K); + + kernel_neon_begin(); + chacha_doneon(state, dst, src, todo, nrounds); + kernel_neon_end(); + + bytes -= todo; + src += todo; + dst += todo; + } while (bytes); +} +EXPORT_SYMBOL(chacha_crypt_arch); + +static int chacha_stream_xor(struct skcipher_request *req, + const struct chacha_ctx *ctx, const u8 *iv, + bool neon) +{ + struct skcipher_walk walk; + u32 state[16]; + int err; + + err = skcipher_walk_virt(&walk, req, false); + + chacha_init_generic(state, ctx->key, iv); + + while (walk.nbytes > 0) { + unsigned int nbytes = walk.nbytes; + + if (nbytes < walk.total) + nbytes = round_down(nbytes, walk.stride); + + if (!IS_ENABLED(CONFIG_KERNEL_MODE_NEON) || !neon) { + chacha_doarm(walk.dst.virt.addr, walk.src.virt.addr, + nbytes, state, ctx->nrounds); + state[12] += DIV_ROUND_UP(nbytes, CHACHA_BLOCK_SIZE); + } else { + kernel_neon_begin(); + chacha_doneon(state, walk.dst.virt.addr, + walk.src.virt.addr, nbytes, ctx->nrounds); + kernel_neon_end(); + } + err = skcipher_walk_done(&walk, walk.nbytes - nbytes); + } + + return err; +} + +static int do_chacha(struct skcipher_request *req, bool neon) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm); + + return chacha_stream_xor(req, ctx, req->iv, neon); +} + +static int chacha_arm(struct skcipher_request *req) +{ + return do_chacha(req, false); +} + +static int chacha_neon(struct skcipher_request *req) +{ + return do_chacha(req, neon_usable()); +} + +static int do_xchacha(struct skcipher_request *req, bool neon) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm); + struct chacha_ctx subctx; + u32 state[16]; + u8 real_iv[16]; + + chacha_init_generic(state, ctx->key, req->iv); + + if (!IS_ENABLED(CONFIG_KERNEL_MODE_NEON) || !neon) { + hchacha_block_arm(state, subctx.key, ctx->nrounds); + } else { + kernel_neon_begin(); + hchacha_block_neon(state, subctx.key, ctx->nrounds); + kernel_neon_end(); + } + subctx.nrounds = ctx->nrounds; + + memcpy(&real_iv[0], req->iv + 24, 8); + memcpy(&real_iv[8], req->iv + 16, 8); + return chacha_stream_xor(req, &subctx, real_iv, neon); +} + +static int xchacha_arm(struct skcipher_request *req) +{ + return do_xchacha(req, false); +} + +static int xchacha_neon(struct skcipher_request *req) +{ + return do_xchacha(req, neon_usable()); +} + +static struct skcipher_alg arm_algs[] = { + { + .base.cra_name = "chacha20", + .base.cra_driver_name = "chacha20-arm", + .base.cra_priority = 200, + .base.cra_blocksize = 1, + .base.cra_ctxsize = sizeof(struct chacha_ctx), + .base.cra_module = THIS_MODULE, + + .min_keysize = CHACHA_KEY_SIZE, + .max_keysize = CHACHA_KEY_SIZE, + .ivsize = CHACHA_IV_SIZE, + .chunksize = CHACHA_BLOCK_SIZE, + .setkey = chacha20_setkey, + .encrypt = chacha_arm, + .decrypt = chacha_arm, + }, { + .base.cra_name = "xchacha20", + .base.cra_driver_name = "xchacha20-arm", + .base.cra_priority = 200, + .base.cra_blocksize = 1, + .base.cra_ctxsize = sizeof(struct chacha_ctx), + .base.cra_module = THIS_MODULE, + + .min_keysize = CHACHA_KEY_SIZE, + .max_keysize = CHACHA_KEY_SIZE, + .ivsize = XCHACHA_IV_SIZE, + .chunksize = CHACHA_BLOCK_SIZE, + .setkey = chacha20_setkey, + .encrypt = xchacha_arm, + .decrypt = xchacha_arm, + }, { + .base.cra_name = "xchacha12", + .base.cra_driver_name = "xchacha12-arm", + .base.cra_priority = 200, + .base.cra_blocksize = 1, + .base.cra_ctxsize = sizeof(struct chacha_ctx), + .base.cra_module = THIS_MODULE, + + .min_keysize = CHACHA_KEY_SIZE, + .max_keysize = CHACHA_KEY_SIZE, + .ivsize = XCHACHA_IV_SIZE, + .chunksize = CHACHA_BLOCK_SIZE, + .setkey = chacha12_setkey, + .encrypt = xchacha_arm, + .decrypt = xchacha_arm, + }, +}; + +static struct skcipher_alg neon_algs[] = { + { + .base.cra_name = "chacha20", + .base.cra_driver_name = "chacha20-neon", + .base.cra_priority = 300, + .base.cra_blocksize = 1, + .base.cra_ctxsize = sizeof(struct chacha_ctx), + .base.cra_module = THIS_MODULE, + + .min_keysize = CHACHA_KEY_SIZE, + .max_keysize = CHACHA_KEY_SIZE, + .ivsize = CHACHA_IV_SIZE, + .chunksize = CHACHA_BLOCK_SIZE, + .walksize = 4 * CHACHA_BLOCK_SIZE, + .setkey = chacha20_setkey, + .encrypt = chacha_neon, + .decrypt = chacha_neon, + }, { + .base.cra_name = "xchacha20", + .base.cra_driver_name = "xchacha20-neon", + .base.cra_priority = 300, + .base.cra_blocksize = 1, + .base.cra_ctxsize = sizeof(struct chacha_ctx), + .base.cra_module = THIS_MODULE, + + .min_keysize = CHACHA_KEY_SIZE, + .max_keysize = CHACHA_KEY_SIZE, + .ivsize = XCHACHA_IV_SIZE, + .chunksize = CHACHA_BLOCK_SIZE, + .walksize = 4 * CHACHA_BLOCK_SIZE, + .setkey = chacha20_setkey, + .encrypt = xchacha_neon, + .decrypt = xchacha_neon, + }, { + .base.cra_name = "xchacha12", + .base.cra_driver_name = "xchacha12-neon", + .base.cra_priority = 300, + .base.cra_blocksize = 1, + .base.cra_ctxsize = sizeof(struct chacha_ctx), + .base.cra_module = THIS_MODULE, + + .min_keysize = CHACHA_KEY_SIZE, + .max_keysize = CHACHA_KEY_SIZE, + .ivsize = XCHACHA_IV_SIZE, + .chunksize = CHACHA_BLOCK_SIZE, + .walksize = 4 * CHACHA_BLOCK_SIZE, + .setkey = chacha12_setkey, + .encrypt = xchacha_neon, + .decrypt = xchacha_neon, + } +}; + +static int __init chacha_simd_mod_init(void) +{ + int err = 0; + + if (IS_REACHABLE(CONFIG_CRYPTO_BLKCIPHER)) { + err = crypto_register_skciphers(arm_algs, ARRAY_SIZE(arm_algs)); + if (err) + return err; + } + + if (IS_ENABLED(CONFIG_KERNEL_MODE_NEON) && (elf_hwcap & HWCAP_NEON)) { + int i; + + switch (read_cpuid_part()) { + case ARM_CPU_PART_CORTEX_A7: + case ARM_CPU_PART_CORTEX_A5: + /* + * The Cortex-A7 and Cortex-A5 do not perform well with + * the NEON implementation but do incredibly with the + * scalar one and use less power. + */ + for (i = 0; i < ARRAY_SIZE(neon_algs); i++) + neon_algs[i].base.cra_priority = 0; + break; + default: + static_branch_enable(&use_neon); + } + + if (IS_REACHABLE(CONFIG_CRYPTO_BLKCIPHER)) { + err = crypto_register_skciphers(neon_algs, ARRAY_SIZE(neon_algs)); + if (err) + crypto_unregister_skciphers(arm_algs, ARRAY_SIZE(arm_algs)); + } + } + return err; +} + +static void __exit chacha_simd_mod_fini(void) +{ + if (IS_REACHABLE(CONFIG_CRYPTO_BLKCIPHER)) { + crypto_unregister_skciphers(arm_algs, ARRAY_SIZE(arm_algs)); + if (IS_ENABLED(CONFIG_KERNEL_MODE_NEON) && (elf_hwcap & HWCAP_NEON)) + crypto_unregister_skciphers(neon_algs, ARRAY_SIZE(neon_algs)); + } +} + +module_init(chacha_simd_mod_init); +module_exit(chacha_simd_mod_fini); + +MODULE_DESCRIPTION("ChaCha and XChaCha stream ciphers (scalar and NEON accelerated)"); +MODULE_AUTHOR("Ard Biesheuvel "); +MODULE_LICENSE("GPL v2"); +MODULE_ALIAS_CRYPTO("chacha20"); +MODULE_ALIAS_CRYPTO("chacha20-arm"); +MODULE_ALIAS_CRYPTO("xchacha20"); +MODULE_ALIAS_CRYPTO("xchacha20-arm"); +MODULE_ALIAS_CRYPTO("xchacha12"); +MODULE_ALIAS_CRYPTO("xchacha12-arm"); +#ifdef CONFIG_KERNEL_MODE_NEON +MODULE_ALIAS_CRYPTO("chacha20-neon"); +MODULE_ALIAS_CRYPTO("xchacha20-neon"); +MODULE_ALIAS_CRYPTO("xchacha12-neon"); +#endif diff --git a/arch/arm/crypto/chacha-scalar-core.S b/arch/arm/crypto/chacha-scalar-core.S new file mode 100644 index 000000000000..2985b80a45b5 --- /dev/null +++ b/arch/arm/crypto/chacha-scalar-core.S @@ -0,0 +1,460 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2018 Google, Inc. + */ + +#include +#include + +/* + * Design notes: + * + * 16 registers would be needed to hold the state matrix, but only 14 are + * available because 'sp' and 'pc' cannot be used. So we spill the elements + * (x8, x9) to the stack and swap them out with (x10, x11). This adds one + * 'ldrd' and one 'strd' instruction per round. + * + * All rotates are performed using the implicit rotate operand accepted by the + * 'add' and 'eor' instructions. This is faster than using explicit rotate + * instructions. To make this work, we allow the values in the second and last + * rows of the ChaCha state matrix (rows 'b' and 'd') to temporarily have the + * wrong rotation amount. The rotation amount is then fixed up just in time + * when the values are used. 'brot' is the number of bits the values in row 'b' + * need to be rotated right to arrive at the correct values, and 'drot' + * similarly for row 'd'. (brot, drot) start out as (0, 0) but we make it such + * that they end up as (25, 24) after every round. + */ + + // ChaCha state registers + X0 .req r0 + X1 .req r1 + X2 .req r2 + X3 .req r3 + X4 .req r4 + X5 .req r5 + X6 .req r6 + X7 .req r7 + X8_X10 .req r8 // shared by x8 and x10 + X9_X11 .req r9 // shared by x9 and x11 + X12 .req r10 + X13 .req r11 + X14 .req r12 + X15 .req r14 + +.macro __rev out, in, t0, t1, t2 +.if __LINUX_ARM_ARCH__ >= 6 + rev \out, \in +.else + lsl \t0, \in, #24 + and \t1, \in, #0xff00 + and \t2, \in, #0xff0000 + orr \out, \t0, \in, lsr #24 + orr \out, \out, \t1, lsl #8 + orr \out, \out, \t2, lsr #8 +.endif +.endm + +.macro _le32_bswap x, t0, t1, t2 +#ifdef __ARMEB__ + __rev \x, \x, \t0, \t1, \t2 +#endif +.endm + +.macro _le32_bswap_4x a, b, c, d, t0, t1, t2 + _le32_bswap \a, \t0, \t1, \t2 + _le32_bswap \b, \t0, \t1, \t2 + _le32_bswap \c, \t0, \t1, \t2 + _le32_bswap \d, \t0, \t1, \t2 +.endm + +.macro __ldrd a, b, src, offset +#if __LINUX_ARM_ARCH__ >= 6 + ldrd \a, \b, [\src, #\offset] +#else + ldr \a, [\src, #\offset] + ldr \b, [\src, #\offset + 4] +#endif +.endm + +.macro __strd a, b, dst, offset +#if __LINUX_ARM_ARCH__ >= 6 + strd \a, \b, [\dst, #\offset] +#else + str \a, [\dst, #\offset] + str \b, [\dst, #\offset + 4] +#endif +.endm + +.macro _halfround a1, b1, c1, d1, a2, b2, c2, d2 + + // a += b; d ^= a; d = rol(d, 16); + add \a1, \a1, \b1, ror #brot + add \a2, \a2, \b2, ror #brot + eor \d1, \a1, \d1, ror #drot + eor \d2, \a2, \d2, ror #drot + // drot == 32 - 16 == 16 + + // c += d; b ^= c; b = rol(b, 12); + add \c1, \c1, \d1, ror #16 + add \c2, \c2, \d2, ror #16 + eor \b1, \c1, \b1, ror #brot + eor \b2, \c2, \b2, ror #brot + // brot == 32 - 12 == 20 + + // a += b; d ^= a; d = rol(d, 8); + add \a1, \a1, \b1, ror #20 + add \a2, \a2, \b2, ror #20 + eor \d1, \a1, \d1, ror #16 + eor \d2, \a2, \d2, ror #16 + // drot == 32 - 8 == 24 + + // c += d; b ^= c; b = rol(b, 7); + add \c1, \c1, \d1, ror #24 + add \c2, \c2, \d2, ror #24 + eor \b1, \c1, \b1, ror #20 + eor \b2, \c2, \b2, ror #20 + // brot == 32 - 7 == 25 +.endm + +.macro _doubleround + + // column round + + // quarterrounds: (x0, x4, x8, x12) and (x1, x5, x9, x13) + _halfround X0, X4, X8_X10, X12, X1, X5, X9_X11, X13 + + // save (x8, x9); restore (x10, x11) + __strd X8_X10, X9_X11, sp, 0 + __ldrd X8_X10, X9_X11, sp, 8 + + // quarterrounds: (x2, x6, x10, x14) and (x3, x7, x11, x15) + _halfround X2, X6, X8_X10, X14, X3, X7, X9_X11, X15 + + .set brot, 25 + .set drot, 24 + + // diagonal round + + // quarterrounds: (x0, x5, x10, x15) and (x1, x6, x11, x12) + _halfround X0, X5, X8_X10, X15, X1, X6, X9_X11, X12 + + // save (x10, x11); restore (x8, x9) + __strd X8_X10, X9_X11, sp, 8 + __ldrd X8_X10, X9_X11, sp, 0 + + // quarterrounds: (x2, x7, x8, x13) and (x3, x4, x9, x14) + _halfround X2, X7, X8_X10, X13, X3, X4, X9_X11, X14 +.endm + +.macro _chacha_permute nrounds + .set brot, 0 + .set drot, 0 + .rept \nrounds / 2 + _doubleround + .endr +.endm + +.macro _chacha nrounds + +.Lnext_block\@: + // Stack: unused0-unused1 x10-x11 x0-x15 OUT IN LEN + // Registers contain x0-x9,x12-x15. + + // Do the core ChaCha permutation to update x0-x15. + _chacha_permute \nrounds + + add sp, #8 + // Stack: x10-x11 orig_x0-orig_x15 OUT IN LEN + // Registers contain x0-x9,x12-x15. + // x4-x7 are rotated by 'brot'; x12-x15 are rotated by 'drot'. + + // Free up some registers (r8-r12,r14) by pushing (x8-x9,x12-x15). + push {X8_X10, X9_X11, X12, X13, X14, X15} + + // Load (OUT, IN, LEN). + ldr r14, [sp, #96] + ldr r12, [sp, #100] + ldr r11, [sp, #104] + + orr r10, r14, r12 + + // Use slow path if fewer than 64 bytes remain. + cmp r11, #64 + blt .Lxor_slowpath\@ + + // Use slow path if IN and/or OUT isn't 4-byte aligned. Needed even on + // ARMv6+, since ldmia and stmia (used below) still require alignment. + tst r10, #3 + bne .Lxor_slowpath\@ + + // Fast path: XOR 64 bytes of aligned data. + + // Stack: x8-x9 x12-x15 x10-x11 orig_x0-orig_x15 OUT IN LEN + // Registers: r0-r7 are x0-x7; r8-r11 are free; r12 is IN; r14 is OUT. + // x4-x7 are rotated by 'brot'; x12-x15 are rotated by 'drot'. + + // x0-x3 + __ldrd r8, r9, sp, 32 + __ldrd r10, r11, sp, 40 + add X0, X0, r8 + add X1, X1, r9 + add X2, X2, r10 + add X3, X3, r11 + _le32_bswap_4x X0, X1, X2, X3, r8, r9, r10 + ldmia r12!, {r8-r11} + eor X0, X0, r8 + eor X1, X1, r9 + eor X2, X2, r10 + eor X3, X3, r11 + stmia r14!, {X0-X3} + + // x4-x7 + __ldrd r8, r9, sp, 48 + __ldrd r10, r11, sp, 56 + add X4, r8, X4, ror #brot + add X5, r9, X5, ror #brot + ldmia r12!, {X0-X3} + add X6, r10, X6, ror #brot + add X7, r11, X7, ror #brot + _le32_bswap_4x X4, X5, X6, X7, r8, r9, r10 + eor X4, X4, X0 + eor X5, X5, X1 + eor X6, X6, X2 + eor X7, X7, X3 + stmia r14!, {X4-X7} + + // x8-x15 + pop {r0-r7} // (x8-x9,x12-x15,x10-x11) + __ldrd r8, r9, sp, 32 + __ldrd r10, r11, sp, 40 + add r0, r0, r8 // x8 + add r1, r1, r9 // x9 + add r6, r6, r10 // x10 + add r7, r7, r11 // x11 + _le32_bswap_4x r0, r1, r6, r7, r8, r9, r10 + ldmia r12!, {r8-r11} + eor r0, r0, r8 // x8 + eor r1, r1, r9 // x9 + eor r6, r6, r10 // x10 + eor r7, r7, r11 // x11 + stmia r14!, {r0,r1,r6,r7} + ldmia r12!, {r0,r1,r6,r7} + __ldrd r8, r9, sp, 48 + __ldrd r10, r11, sp, 56 + add r2, r8, r2, ror #drot // x12 + add r3, r9, r3, ror #drot // x13 + add r4, r10, r4, ror #drot // x14 + add r5, r11, r5, ror #drot // x15 + _le32_bswap_4x r2, r3, r4, r5, r9, r10, r11 + ldr r9, [sp, #72] // load LEN + eor r2, r2, r0 // x12 + eor r3, r3, r1 // x13 + eor r4, r4, r6 // x14 + eor r5, r5, r7 // x15 + subs r9, #64 // decrement and check LEN + stmia r14!, {r2-r5} + + beq .Ldone\@ + +.Lprepare_for_next_block\@: + + // Stack: x0-x15 OUT IN LEN + + // Increment block counter (x12) + add r8, #1 + + // Store updated (OUT, IN, LEN) + str r14, [sp, #64] + str r12, [sp, #68] + str r9, [sp, #72] + + mov r14, sp + + // Store updated block counter (x12) + str r8, [sp, #48] + + sub sp, #16 + + // Reload state and do next block + ldmia r14!, {r0-r11} // load x0-x11 + __strd r10, r11, sp, 8 // store x10-x11 before state + ldmia r14, {r10-r12,r14} // load x12-x15 + b .Lnext_block\@ + +.Lxor_slowpath\@: + // Slow path: < 64 bytes remaining, or unaligned input or output buffer. + // We handle it by storing the 64 bytes of keystream to the stack, then + // XOR-ing the needed portion with the data. + + // Allocate keystream buffer + sub sp, #64 + mov r14, sp + + // Stack: ks0-ks15 x8-x9 x12-x15 x10-x11 orig_x0-orig_x15 OUT IN LEN + // Registers: r0-r7 are x0-x7; r8-r11 are free; r12 is IN; r14 is &ks0. + // x4-x7 are rotated by 'brot'; x12-x15 are rotated by 'drot'. + + // Save keystream for x0-x3 + __ldrd r8, r9, sp, 96 + __ldrd r10, r11, sp, 104 + add X0, X0, r8 + add X1, X1, r9 + add X2, X2, r10 + add X3, X3, r11 + _le32_bswap_4x X0, X1, X2, X3, r8, r9, r10 + stmia r14!, {X0-X3} + + // Save keystream for x4-x7 + __ldrd r8, r9, sp, 112 + __ldrd r10, r11, sp, 120 + add X4, r8, X4, ror #brot + add X5, r9, X5, ror #brot + add X6, r10, X6, ror #brot + add X7, r11, X7, ror #brot + _le32_bswap_4x X4, X5, X6, X7, r8, r9, r10 + add r8, sp, #64 + stmia r14!, {X4-X7} + + // Save keystream for x8-x15 + ldm r8, {r0-r7} // (x8-x9,x12-x15,x10-x11) + __ldrd r8, r9, sp, 128 + __ldrd r10, r11, sp, 136 + add r0, r0, r8 // x8 + add r1, r1, r9 // x9 + add r6, r6, r10 // x10 + add r7, r7, r11 // x11 + _le32_bswap_4x r0, r1, r6, r7, r8, r9, r10 + stmia r14!, {r0,r1,r6,r7} + __ldrd r8, r9, sp, 144 + __ldrd r10, r11, sp, 152 + add r2, r8, r2, ror #drot // x12 + add r3, r9, r3, ror #drot // x13 + add r4, r10, r4, ror #drot // x14 + add r5, r11, r5, ror #drot // x15 + _le32_bswap_4x r2, r3, r4, r5, r9, r10, r11 + stmia r14, {r2-r5} + + // Stack: ks0-ks15 unused0-unused7 x0-x15 OUT IN LEN + // Registers: r8 is block counter, r12 is IN. + + ldr r9, [sp, #168] // LEN + ldr r14, [sp, #160] // OUT + cmp r9, #64 + mov r0, sp + movle r1, r9 + movgt r1, #64 + // r1 is number of bytes to XOR, in range [1, 64] + +.if __LINUX_ARM_ARCH__ < 6 + orr r2, r12, r14 + tst r2, #3 // IN or OUT misaligned? + bne .Lxor_next_byte\@ +.endif + + // XOR a word at a time +.rept 16 + subs r1, #4 + blt .Lxor_words_done\@ + ldr r2, [r12], #4 + ldr r3, [r0], #4 + eor r2, r2, r3 + str r2, [r14], #4 +.endr + b .Lxor_slowpath_done\@ +.Lxor_words_done\@: + ands r1, r1, #3 + beq .Lxor_slowpath_done\@ + + // XOR a byte at a time +.Lxor_next_byte\@: + ldrb r2, [r12], #1 + ldrb r3, [r0], #1 + eor r2, r2, r3 + strb r2, [r14], #1 + subs r1, #1 + bne .Lxor_next_byte\@ + +.Lxor_slowpath_done\@: + subs r9, #64 + add sp, #96 + bgt .Lprepare_for_next_block\@ + +.Ldone\@: +.endm // _chacha + +/* + * void chacha_doarm(u8 *dst, const u8 *src, unsigned int bytes, + * const u32 *state, int nrounds); + */ +ENTRY(chacha_doarm) + cmp r2, #0 // len == 0? + reteq lr + + ldr ip, [sp] + cmp ip, #12 + + push {r0-r2,r4-r11,lr} + + // Push state x0-x15 onto stack. + // Also store an extra copy of x10-x11 just before the state. + + add X12, r3, #48 + ldm X12, {X12,X13,X14,X15} + push {X12,X13,X14,X15} + sub sp, sp, #64 + + __ldrd X8_X10, X9_X11, r3, 40 + __strd X8_X10, X9_X11, sp, 8 + __strd X8_X10, X9_X11, sp, 56 + ldm r3, {X0-X9_X11} + __strd X0, X1, sp, 16 + __strd X2, X3, sp, 24 + __strd X4, X5, sp, 32 + __strd X6, X7, sp, 40 + __strd X8_X10, X9_X11, sp, 48 + + beq 1f + _chacha 20 + +0: add sp, #76 + pop {r4-r11, pc} + +1: _chacha 12 + b 0b +ENDPROC(chacha_doarm) + +/* + * void hchacha_block_arm(const u32 state[16], u32 out[8], int nrounds); + */ +ENTRY(hchacha_block_arm) + push {r1,r4-r11,lr} + + cmp r2, #12 // ChaCha12 ? + + mov r14, r0 + ldmia r14!, {r0-r11} // load x0-x11 + push {r10-r11} // store x10-x11 to stack + ldm r14, {r10-r12,r14} // load x12-x15 + sub sp, #8 + + beq 1f + _chacha_permute 20 + + // Skip over (unused0-unused1, x10-x11) +0: add sp, #16 + + // Fix up rotations of x12-x15 + ror X12, X12, #drot + ror X13, X13, #drot + pop {r4} // load 'out' + ror X14, X14, #drot + ror X15, X15, #drot + + // Store (x0-x3,x12-x15) to 'out' + stm r4, {X0,X1,X2,X3,X12,X13,X14,X15} + + pop {r4-r11,pc} + +1: _chacha_permute 12 + b 0b +ENDPROC(hchacha_block_arm) diff --git a/arch/arm/crypto/curve25519-core.S b/arch/arm/crypto/curve25519-core.S new file mode 100644 index 000000000000..be18af52e7dc --- /dev/null +++ b/arch/arm/crypto/curve25519-core.S @@ -0,0 +1,2062 @@ +/* SPDX-License-Identifier: GPL-2.0 OR MIT */ +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + * + * Based on public domain code from Daniel J. Bernstein and Peter Schwabe. This + * began from SUPERCOP's curve25519/neon2/scalarmult.s, but has subsequently been + * manually reworked for use in kernel space. + */ + +#include + +.text +.fpu neon +.arch armv7-a +.align 4 + +ENTRY(curve25519_neon) + push {r4-r11, lr} + mov ip, sp + sub r3, sp, #704 + and r3, r3, #0xfffffff0 + mov sp, r3 + movw r4, #0 + movw r5, #254 + vmov.i32 q0, #1 + vshr.u64 q1, q0, #7 + vshr.u64 q0, q0, #8 + vmov.i32 d4, #19 + vmov.i32 d5, #38 + add r6, sp, #480 + vst1.8 {d2-d3}, [r6, : 128]! + vst1.8 {d0-d1}, [r6, : 128]! + vst1.8 {d4-d5}, [r6, : 128] + add r6, r3, #0 + vmov.i32 q2, #0 + vst1.8 {d4-d5}, [r6, : 128]! + vst1.8 {d4-d5}, [r6, : 128]! + vst1.8 d4, [r6, : 64] + add r6, r3, #0 + movw r7, #960 + sub r7, r7, #2 + neg r7, r7 + sub r7, r7, r7, LSL #7 + str r7, [r6] + add r6, sp, #672 + vld1.8 {d4-d5}, [r1]! + vld1.8 {d6-d7}, [r1] + vst1.8 {d4-d5}, [r6, : 128]! + vst1.8 {d6-d7}, [r6, : 128] + sub r1, r6, #16 + ldrb r6, [r1] + and r6, r6, #248 + strb r6, [r1] + ldrb r6, [r1, #31] + and r6, r6, #127 + orr r6, r6, #64 + strb r6, [r1, #31] + vmov.i64 q2, #0xffffffff + vshr.u64 q3, q2, #7 + vshr.u64 q2, q2, #6 + vld1.8 {d8}, [r2] + vld1.8 {d10}, [r2] + add r2, r2, #6 + vld1.8 {d12}, [r2] + vld1.8 {d14}, [r2] + add r2, r2, #6 + vld1.8 {d16}, [r2] + add r2, r2, #4 + vld1.8 {d18}, [r2] + vld1.8 {d20}, [r2] + add r2, r2, #6 + vld1.8 {d22}, [r2] + add r2, r2, #2 + vld1.8 {d24}, [r2] + vld1.8 {d26}, [r2] + vshr.u64 q5, q5, #26 + vshr.u64 q6, q6, #3 + vshr.u64 q7, q7, #29 + vshr.u64 q8, q8, #6 + vshr.u64 q10, q10, #25 + vshr.u64 q11, q11, #3 + vshr.u64 q12, q12, #12 + vshr.u64 q13, q13, #38 + vand q4, q4, q2 + vand q6, q6, q2 + vand q8, q8, q2 + vand q10, q10, q2 + vand q2, q12, q2 + vand q5, q5, q3 + vand q7, q7, q3 + vand q9, q9, q3 + vand q11, q11, q3 + vand q3, q13, q3 + add r2, r3, #48 + vadd.i64 q12, q4, q1 + vadd.i64 q13, q10, q1 + vshr.s64 q12, q12, #26 + vshr.s64 q13, q13, #26 + vadd.i64 q5, q5, q12 + vshl.i64 q12, q12, #26 + vadd.i64 q14, q5, q0 + vadd.i64 q11, q11, q13 + vshl.i64 q13, q13, #26 + vadd.i64 q15, q11, q0 + vsub.i64 q4, q4, q12 + vshr.s64 q12, q14, #25 + vsub.i64 q10, q10, q13 + vshr.s64 q13, q15, #25 + vadd.i64 q6, q6, q12 + vshl.i64 q12, q12, #25 + vadd.i64 q14, q6, q1 + vadd.i64 q2, q2, q13 + vsub.i64 q5, q5, q12 + vshr.s64 q12, q14, #26 + vshl.i64 q13, q13, #25 + vadd.i64 q14, q2, q1 + vadd.i64 q7, q7, q12 + vshl.i64 q12, q12, #26 + vadd.i64 q15, q7, q0 + vsub.i64 q11, q11, q13 + vshr.s64 q13, q14, #26 + vsub.i64 q6, q6, q12 + vshr.s64 q12, q15, #25 + vadd.i64 q3, q3, q13 + vshl.i64 q13, q13, #26 + vadd.i64 q14, q3, q0 + vadd.i64 q8, q8, q12 + vshl.i64 q12, q12, #25 + vadd.i64 q15, q8, q1 + add r2, r2, #8 + vsub.i64 q2, q2, q13 + vshr.s64 q13, q14, #25 + vsub.i64 q7, q7, q12 + vshr.s64 q12, q15, #26 + vadd.i64 q14, q13, q13 + vadd.i64 q9, q9, q12 + vtrn.32 d12, d14 + vshl.i64 q12, q12, #26 + vtrn.32 d13, d15 + vadd.i64 q0, q9, q0 + vadd.i64 q4, q4, q14 + vst1.8 d12, [r2, : 64]! + vshl.i64 q6, q13, #4 + vsub.i64 q7, q8, q12 + vshr.s64 q0, q0, #25 + vadd.i64 q4, q4, q6 + vadd.i64 q6, q10, q0 + vshl.i64 q0, q0, #25 + vadd.i64 q8, q6, q1 + vadd.i64 q4, q4, q13 + vshl.i64 q10, q13, #25 + vadd.i64 q1, q4, q1 + vsub.i64 q0, q9, q0 + vshr.s64 q8, q8, #26 + vsub.i64 q3, q3, q10 + vtrn.32 d14, d0 + vshr.s64 q1, q1, #26 + vtrn.32 d15, d1 + vadd.i64 q0, q11, q8 + vst1.8 d14, [r2, : 64] + vshl.i64 q7, q8, #26 + vadd.i64 q5, q5, q1 + vtrn.32 d4, d6 + vshl.i64 q1, q1, #26 + vtrn.32 d5, d7 + vsub.i64 q3, q6, q7 + add r2, r2, #16 + vsub.i64 q1, q4, q1 + vst1.8 d4, [r2, : 64] + vtrn.32 d6, d0 + vtrn.32 d7, d1 + sub r2, r2, #8 + vtrn.32 d2, d10 + vtrn.32 d3, d11 + vst1.8 d6, [r2, : 64] + sub r2, r2, #24 + vst1.8 d2, [r2, : 64] + add r2, r3, #96 + vmov.i32 q0, #0 + vmov.i64 d2, #0xff + vmov.i64 d3, #0 + vshr.u32 q1, q1, #7 + vst1.8 {d2-d3}, [r2, : 128]! + vst1.8 {d0-d1}, [r2, : 128]! + vst1.8 d0, [r2, : 64] + add r2, r3, #144 + vmov.i32 q0, #0 + vst1.8 {d0-d1}, [r2, : 128]! + vst1.8 {d0-d1}, [r2, : 128]! + vst1.8 d0, [r2, : 64] + add r2, r3, #240 + vmov.i32 q0, #0 + vmov.i64 d2, #0xff + vmov.i64 d3, #0 + vshr.u32 q1, q1, #7 + vst1.8 {d2-d3}, [r2, : 128]! + vst1.8 {d0-d1}, [r2, : 128]! + vst1.8 d0, [r2, : 64] + add r2, r3, #48 + add r6, r3, #192 + vld1.8 {d0-d1}, [r2, : 128]! + vld1.8 {d2-d3}, [r2, : 128]! + vld1.8 {d4}, [r2, : 64] + vst1.8 {d0-d1}, [r6, : 128]! + vst1.8 {d2-d3}, [r6, : 128]! + vst1.8 d4, [r6, : 64] +.Lmainloop: + mov r2, r5, LSR #3 + and r6, r5, #7 + ldrb r2, [r1, r2] + mov r2, r2, LSR r6 + and r2, r2, #1 + str r5, [sp, #456] + eor r4, r4, r2 + str r2, [sp, #460] + neg r2, r4 + add r4, r3, #96 + add r5, r3, #192 + add r6, r3, #144 + vld1.8 {d8-d9}, [r4, : 128]! + add r7, r3, #240 + vld1.8 {d10-d11}, [r5, : 128]! + veor q6, q4, q5 + vld1.8 {d14-d15}, [r6, : 128]! + vdup.i32 q8, r2 + vld1.8 {d18-d19}, [r7, : 128]! + veor q10, q7, q9 + vld1.8 {d22-d23}, [r4, : 128]! + vand q6, q6, q8 + vld1.8 {d24-d25}, [r5, : 128]! + vand q10, q10, q8 + vld1.8 {d26-d27}, [r6, : 128]! + veor q4, q4, q6 + vld1.8 {d28-d29}, [r7, : 128]! + veor q5, q5, q6 + vld1.8 {d0}, [r4, : 64] + veor q6, q7, q10 + vld1.8 {d2}, [r5, : 64] + veor q7, q9, q10 + vld1.8 {d4}, [r6, : 64] + veor q9, q11, q12 + vld1.8 {d6}, [r7, : 64] + veor q10, q0, q1 + sub r2, r4, #32 + vand q9, q9, q8 + sub r4, r5, #32 + vand q10, q10, q8 + sub r5, r6, #32 + veor q11, q11, q9 + sub r6, r7, #32 + veor q0, q0, q10 + veor q9, q12, q9 + veor q1, q1, q10 + veor q10, q13, q14 + veor q12, q2, q3 + vand q10, q10, q8 + vand q8, q12, q8 + veor q12, q13, q10 + veor q2, q2, q8 + veor q10, q14, q10 + veor q3, q3, q8 + vadd.i32 q8, q4, q6 + vsub.i32 q4, q4, q6 + vst1.8 {d16-d17}, [r2, : 128]! + vadd.i32 q6, q11, q12 + vst1.8 {d8-d9}, [r5, : 128]! + vsub.i32 q4, q11, q12 + vst1.8 {d12-d13}, [r2, : 128]! + vadd.i32 q6, q0, q2 + vst1.8 {d8-d9}, [r5, : 128]! + vsub.i32 q0, q0, q2 + vst1.8 d12, [r2, : 64] + vadd.i32 q2, q5, q7 + vst1.8 d0, [r5, : 64] + vsub.i32 q0, q5, q7 + vst1.8 {d4-d5}, [r4, : 128]! + vadd.i32 q2, q9, q10 + vst1.8 {d0-d1}, [r6, : 128]! + vsub.i32 q0, q9, q10 + vst1.8 {d4-d5}, [r4, : 128]! + vadd.i32 q2, q1, q3 + vst1.8 {d0-d1}, [r6, : 128]! + vsub.i32 q0, q1, q3 + vst1.8 d4, [r4, : 64] + vst1.8 d0, [r6, : 64] + add r2, sp, #512 + add r4, r3, #96 + add r5, r3, #144 + vld1.8 {d0-d1}, [r2, : 128] + vld1.8 {d2-d3}, [r4, : 128]! + vld1.8 {d4-d5}, [r5, : 128]! + vzip.i32 q1, q2 + vld1.8 {d6-d7}, [r4, : 128]! + vld1.8 {d8-d9}, [r5, : 128]! + vshl.i32 q5, q1, #1 + vzip.i32 q3, q4 + vshl.i32 q6, q2, #1 + vld1.8 {d14}, [r4, : 64] + vshl.i32 q8, q3, #1 + vld1.8 {d15}, [r5, : 64] + vshl.i32 q9, q4, #1 + vmul.i32 d21, d7, d1 + vtrn.32 d14, d15 + vmul.i32 q11, q4, q0 + vmul.i32 q0, q7, q0 + vmull.s32 q12, d2, d2 + vmlal.s32 q12, d11, d1 + vmlal.s32 q12, d12, d0 + vmlal.s32 q12, d13, d23 + vmlal.s32 q12, d16, d22 + vmlal.s32 q12, d7, d21 + vmull.s32 q10, d2, d11 + vmlal.s32 q10, d4, d1 + vmlal.s32 q10, d13, d0 + vmlal.s32 q10, d6, d23 + vmlal.s32 q10, d17, d22 + vmull.s32 q13, d10, d4 + vmlal.s32 q13, d11, d3 + vmlal.s32 q13, d13, d1 + vmlal.s32 q13, d16, d0 + vmlal.s32 q13, d17, d23 + vmlal.s32 q13, d8, d22 + vmull.s32 q1, d10, d5 + vmlal.s32 q1, d11, d4 + vmlal.s32 q1, d6, d1 + vmlal.s32 q1, d17, d0 + vmlal.s32 q1, d8, d23 + vmull.s32 q14, d10, d6 + vmlal.s32 q14, d11, d13 + vmlal.s32 q14, d4, d4 + vmlal.s32 q14, d17, d1 + vmlal.s32 q14, d18, d0 + vmlal.s32 q14, d9, d23 + vmull.s32 q11, d10, d7 + vmlal.s32 q11, d11, d6 + vmlal.s32 q11, d12, d5 + vmlal.s32 q11, d8, d1 + vmlal.s32 q11, d19, d0 + vmull.s32 q15, d10, d8 + vmlal.s32 q15, d11, d17 + vmlal.s32 q15, d12, d6 + vmlal.s32 q15, d13, d5 + vmlal.s32 q15, d19, d1 + vmlal.s32 q15, d14, d0 + vmull.s32 q2, d10, d9 + vmlal.s32 q2, d11, d8 + vmlal.s32 q2, d12, d7 + vmlal.s32 q2, d13, d6 + vmlal.s32 q2, d14, d1 + vmull.s32 q0, d15, d1 + vmlal.s32 q0, d10, d14 + vmlal.s32 q0, d11, d19 + vmlal.s32 q0, d12, d8 + vmlal.s32 q0, d13, d17 + vmlal.s32 q0, d6, d6 + add r2, sp, #480 + vld1.8 {d18-d19}, [r2, : 128]! + vmull.s32 q3, d16, d7 + vmlal.s32 q3, d10, d15 + vmlal.s32 q3, d11, d14 + vmlal.s32 q3, d12, d9 + vmlal.s32 q3, d13, d8 + vld1.8 {d8-d9}, [r2, : 128] + vadd.i64 q5, q12, q9 + vadd.i64 q6, q15, q9 + vshr.s64 q5, q5, #26 + vshr.s64 q6, q6, #26 + vadd.i64 q7, q10, q5 + vshl.i64 q5, q5, #26 + vadd.i64 q8, q7, q4 + vadd.i64 q2, q2, q6 + vshl.i64 q6, q6, #26 + vadd.i64 q10, q2, q4 + vsub.i64 q5, q12, q5 + vshr.s64 q8, q8, #25 + vsub.i64 q6, q15, q6 + vshr.s64 q10, q10, #25 + vadd.i64 q12, q13, q8 + vshl.i64 q8, q8, #25 + vadd.i64 q13, q12, q9 + vadd.i64 q0, q0, q10 + vsub.i64 q7, q7, q8 + vshr.s64 q8, q13, #26 + vshl.i64 q10, q10, #25 + vadd.i64 q13, q0, q9 + vadd.i64 q1, q1, q8 + vshl.i64 q8, q8, #26 + vadd.i64 q15, q1, q4 + vsub.i64 q2, q2, q10 + vshr.s64 q10, q13, #26 + vsub.i64 q8, q12, q8 + vshr.s64 q12, q15, #25 + vadd.i64 q3, q3, q10 + vshl.i64 q10, q10, #26 + vadd.i64 q13, q3, q4 + vadd.i64 q14, q14, q12 + add r2, r3, #288 + vshl.i64 q12, q12, #25 + add r4, r3, #336 + vadd.i64 q15, q14, q9 + add r2, r2, #8 + vsub.i64 q0, q0, q10 + add r4, r4, #8 + vshr.s64 q10, q13, #25 + vsub.i64 q1, q1, q12 + vshr.s64 q12, q15, #26 + vadd.i64 q13, q10, q10 + vadd.i64 q11, q11, q12 + vtrn.32 d16, d2 + vshl.i64 q12, q12, #26 + vtrn.32 d17, d3 + vadd.i64 q1, q11, q4 + vadd.i64 q4, q5, q13 + vst1.8 d16, [r2, : 64]! + vshl.i64 q5, q10, #4 + vst1.8 d17, [r4, : 64]! + vsub.i64 q8, q14, q12 + vshr.s64 q1, q1, #25 + vadd.i64 q4, q4, q5 + vadd.i64 q5, q6, q1 + vshl.i64 q1, q1, #25 + vadd.i64 q6, q5, q9 + vadd.i64 q4, q4, q10 + vshl.i64 q10, q10, #25 + vadd.i64 q9, q4, q9 + vsub.i64 q1, q11, q1 + vshr.s64 q6, q6, #26 + vsub.i64 q3, q3, q10 + vtrn.32 d16, d2 + vshr.s64 q9, q9, #26 + vtrn.32 d17, d3 + vadd.i64 q1, q2, q6 + vst1.8 d16, [r2, : 64] + vshl.i64 q2, q6, #26 + vst1.8 d17, [r4, : 64] + vadd.i64 q6, q7, q9 + vtrn.32 d0, d6 + vshl.i64 q7, q9, #26 + vtrn.32 d1, d7 + vsub.i64 q2, q5, q2 + add r2, r2, #16 + vsub.i64 q3, q4, q7 + vst1.8 d0, [r2, : 64] + add r4, r4, #16 + vst1.8 d1, [r4, : 64] + vtrn.32 d4, d2 + vtrn.32 d5, d3 + sub r2, r2, #8 + sub r4, r4, #8 + vtrn.32 d6, d12 + vtrn.32 d7, d13 + vst1.8 d4, [r2, : 64] + vst1.8 d5, [r4, : 64] + sub r2, r2, #24 + sub r4, r4, #24 + vst1.8 d6, [r2, : 64] + vst1.8 d7, [r4, : 64] + add r2, r3, #240 + add r4, r3, #96 + vld1.8 {d0-d1}, [r4, : 128]! + vld1.8 {d2-d3}, [r4, : 128]! + vld1.8 {d4}, [r4, : 64] + add r4, r3, #144 + vld1.8 {d6-d7}, [r4, : 128]! + vtrn.32 q0, q3 + vld1.8 {d8-d9}, [r4, : 128]! + vshl.i32 q5, q0, #4 + vtrn.32 q1, q4 + vshl.i32 q6, q3, #4 + vadd.i32 q5, q5, q0 + vadd.i32 q6, q6, q3 + vshl.i32 q7, q1, #4 + vld1.8 {d5}, [r4, : 64] + vshl.i32 q8, q4, #4 + vtrn.32 d4, d5 + vadd.i32 q7, q7, q1 + vadd.i32 q8, q8, q4 + vld1.8 {d18-d19}, [r2, : 128]! + vshl.i32 q10, q2, #4 + vld1.8 {d22-d23}, [r2, : 128]! + vadd.i32 q10, q10, q2 + vld1.8 {d24}, [r2, : 64] + vadd.i32 q5, q5, q0 + add r2, r3, #192 + vld1.8 {d26-d27}, [r2, : 128]! + vadd.i32 q6, q6, q3 + vld1.8 {d28-d29}, [r2, : 128]! + vadd.i32 q8, q8, q4 + vld1.8 {d25}, [r2, : 64] + vadd.i32 q10, q10, q2 + vtrn.32 q9, q13 + vadd.i32 q7, q7, q1 + vadd.i32 q5, q5, q0 + vtrn.32 q11, q14 + vadd.i32 q6, q6, q3 + add r2, sp, #528 + vadd.i32 q10, q10, q2 + vtrn.32 d24, d25 + vst1.8 {d12-d13}, [r2, : 128]! + vshl.i32 q6, q13, #1 + vst1.8 {d20-d21}, [r2, : 128]! + vshl.i32 q10, q14, #1 + vst1.8 {d12-d13}, [r2, : 128]! + vshl.i32 q15, q12, #1 + vadd.i32 q8, q8, q4 + vext.32 d10, d31, d30, #0 + vadd.i32 q7, q7, q1 + vst1.8 {d16-d17}, [r2, : 128]! + vmull.s32 q8, d18, d5 + vmlal.s32 q8, d26, d4 + vmlal.s32 q8, d19, d9 + vmlal.s32 q8, d27, d3 + vmlal.s32 q8, d22, d8 + vmlal.s32 q8, d28, d2 + vmlal.s32 q8, d23, d7 + vmlal.s32 q8, d29, d1 + vmlal.s32 q8, d24, d6 + vmlal.s32 q8, d25, d0 + vst1.8 {d14-d15}, [r2, : 128]! + vmull.s32 q2, d18, d4 + vmlal.s32 q2, d12, d9 + vmlal.s32 q2, d13, d8 + vmlal.s32 q2, d19, d3 + vmlal.s32 q2, d22, d2 + vmlal.s32 q2, d23, d1 + vmlal.s32 q2, d24, d0 + vst1.8 {d20-d21}, [r2, : 128]! + vmull.s32 q7, d18, d9 + vmlal.s32 q7, d26, d3 + vmlal.s32 q7, d19, d8 + vmlal.s32 q7, d27, d2 + vmlal.s32 q7, d22, d7 + vmlal.s32 q7, d28, d1 + vmlal.s32 q7, d23, d6 + vmlal.s32 q7, d29, d0 + vst1.8 {d10-d11}, [r2, : 128]! + vmull.s32 q5, d18, d3 + vmlal.s32 q5, d19, d2 + vmlal.s32 q5, d22, d1 + vmlal.s32 q5, d23, d0 + vmlal.s32 q5, d12, d8 + vst1.8 {d16-d17}, [r2, : 128] + vmull.s32 q4, d18, d8 + vmlal.s32 q4, d26, d2 + vmlal.s32 q4, d19, d7 + vmlal.s32 q4, d27, d1 + vmlal.s32 q4, d22, d6 + vmlal.s32 q4, d28, d0 + vmull.s32 q8, d18, d7 + vmlal.s32 q8, d26, d1 + vmlal.s32 q8, d19, d6 + vmlal.s32 q8, d27, d0 + add r2, sp, #544 + vld1.8 {d20-d21}, [r2, : 128] + vmlal.s32 q7, d24, d21 + vmlal.s32 q7, d25, d20 + vmlal.s32 q4, d23, d21 + vmlal.s32 q4, d29, d20 + vmlal.s32 q8, d22, d21 + vmlal.s32 q8, d28, d20 + vmlal.s32 q5, d24, d20 + vst1.8 {d14-d15}, [r2, : 128] + vmull.s32 q7, d18, d6 + vmlal.s32 q7, d26, d0 + add r2, sp, #624 + vld1.8 {d30-d31}, [r2, : 128] + vmlal.s32 q2, d30, d21 + vmlal.s32 q7, d19, d21 + vmlal.s32 q7, d27, d20 + add r2, sp, #592 + vld1.8 {d26-d27}, [r2, : 128] + vmlal.s32 q4, d25, d27 + vmlal.s32 q8, d29, d27 + vmlal.s32 q8, d25, d26 + vmlal.s32 q7, d28, d27 + vmlal.s32 q7, d29, d26 + add r2, sp, #576 + vld1.8 {d28-d29}, [r2, : 128] + vmlal.s32 q4, d24, d29 + vmlal.s32 q8, d23, d29 + vmlal.s32 q8, d24, d28 + vmlal.s32 q7, d22, d29 + vmlal.s32 q7, d23, d28 + vst1.8 {d8-d9}, [r2, : 128] + add r2, sp, #528 + vld1.8 {d8-d9}, [r2, : 128] + vmlal.s32 q7, d24, d9 + vmlal.s32 q7, d25, d31 + vmull.s32 q1, d18, d2 + vmlal.s32 q1, d19, d1 + vmlal.s32 q1, d22, d0 + vmlal.s32 q1, d24, d27 + vmlal.s32 q1, d23, d20 + vmlal.s32 q1, d12, d7 + vmlal.s32 q1, d13, d6 + vmull.s32 q6, d18, d1 + vmlal.s32 q6, d19, d0 + vmlal.s32 q6, d23, d27 + vmlal.s32 q6, d22, d20 + vmlal.s32 q6, d24, d26 + vmull.s32 q0, d18, d0 + vmlal.s32 q0, d22, d27 + vmlal.s32 q0, d23, d26 + vmlal.s32 q0, d24, d31 + vmlal.s32 q0, d19, d20 + add r2, sp, #608 + vld1.8 {d18-d19}, [r2, : 128] + vmlal.s32 q2, d18, d7 + vmlal.s32 q5, d18, d6 + vmlal.s32 q1, d18, d21 + vmlal.s32 q0, d18, d28 + vmlal.s32 q6, d18, d29 + vmlal.s32 q2, d19, d6 + vmlal.s32 q5, d19, d21 + vmlal.s32 q1, d19, d29 + vmlal.s32 q0, d19, d9 + vmlal.s32 q6, d19, d28 + add r2, sp, #560 + vld1.8 {d18-d19}, [r2, : 128] + add r2, sp, #480 + vld1.8 {d22-d23}, [r2, : 128] + vmlal.s32 q5, d19, d7 + vmlal.s32 q0, d18, d21 + vmlal.s32 q0, d19, d29 + vmlal.s32 q6, d18, d6 + add r2, sp, #496 + vld1.8 {d6-d7}, [r2, : 128] + vmlal.s32 q6, d19, d21 + add r2, sp, #544 + vld1.8 {d18-d19}, [r2, : 128] + vmlal.s32 q0, d30, d8 + add r2, sp, #640 + vld1.8 {d20-d21}, [r2, : 128] + vmlal.s32 q5, d30, d29 + add r2, sp, #576 + vld1.8 {d24-d25}, [r2, : 128] + vmlal.s32 q1, d30, d28 + vadd.i64 q13, q0, q11 + vadd.i64 q14, q5, q11 + vmlal.s32 q6, d30, d9 + vshr.s64 q4, q13, #26 + vshr.s64 q13, q14, #26 + vadd.i64 q7, q7, q4 + vshl.i64 q4, q4, #26 + vadd.i64 q14, q7, q3 + vadd.i64 q9, q9, q13 + vshl.i64 q13, q13, #26 + vadd.i64 q15, q9, q3 + vsub.i64 q0, q0, q4 + vshr.s64 q4, q14, #25 + vsub.i64 q5, q5, q13 + vshr.s64 q13, q15, #25 + vadd.i64 q6, q6, q4 + vshl.i64 q4, q4, #25 + vadd.i64 q14, q6, q11 + vadd.i64 q2, q2, q13 + vsub.i64 q4, q7, q4 + vshr.s64 q7, q14, #26 + vshl.i64 q13, q13, #25 + vadd.i64 q14, q2, q11 + vadd.i64 q8, q8, q7 + vshl.i64 q7, q7, #26 + vadd.i64 q15, q8, q3 + vsub.i64 q9, q9, q13 + vshr.s64 q13, q14, #26 + vsub.i64 q6, q6, q7 + vshr.s64 q7, q15, #25 + vadd.i64 q10, q10, q13 + vshl.i64 q13, q13, #26 + vadd.i64 q14, q10, q3 + vadd.i64 q1, q1, q7 + add r2, r3, #144 + vshl.i64 q7, q7, #25 + add r4, r3, #96 + vadd.i64 q15, q1, q11 + add r2, r2, #8 + vsub.i64 q2, q2, q13 + add r4, r4, #8 + vshr.s64 q13, q14, #25 + vsub.i64 q7, q8, q7 + vshr.s64 q8, q15, #26 + vadd.i64 q14, q13, q13 + vadd.i64 q12, q12, q8 + vtrn.32 d12, d14 + vshl.i64 q8, q8, #26 + vtrn.32 d13, d15 + vadd.i64 q3, q12, q3 + vadd.i64 q0, q0, q14 + vst1.8 d12, [r2, : 64]! + vshl.i64 q7, q13, #4 + vst1.8 d13, [r4, : 64]! + vsub.i64 q1, q1, q8 + vshr.s64 q3, q3, #25 + vadd.i64 q0, q0, q7 + vadd.i64 q5, q5, q3 + vshl.i64 q3, q3, #25 + vadd.i64 q6, q5, q11 + vadd.i64 q0, q0, q13 + vshl.i64 q7, q13, #25 + vadd.i64 q8, q0, q11 + vsub.i64 q3, q12, q3 + vshr.s64 q6, q6, #26 + vsub.i64 q7, q10, q7 + vtrn.32 d2, d6 + vshr.s64 q8, q8, #26 + vtrn.32 d3, d7 + vadd.i64 q3, q9, q6 + vst1.8 d2, [r2, : 64] + vshl.i64 q6, q6, #26 + vst1.8 d3, [r4, : 64] + vadd.i64 q1, q4, q8 + vtrn.32 d4, d14 + vshl.i64 q4, q8, #26 + vtrn.32 d5, d15 + vsub.i64 q5, q5, q6 + add r2, r2, #16 + vsub.i64 q0, q0, q4 + vst1.8 d4, [r2, : 64] + add r4, r4, #16 + vst1.8 d5, [r4, : 64] + vtrn.32 d10, d6 + vtrn.32 d11, d7 + sub r2, r2, #8 + sub r4, r4, #8 + vtrn.32 d0, d2 + vtrn.32 d1, d3 + vst1.8 d10, [r2, : 64] + vst1.8 d11, [r4, : 64] + sub r2, r2, #24 + sub r4, r4, #24 + vst1.8 d0, [r2, : 64] + vst1.8 d1, [r4, : 64] + add r2, r3, #288 + add r4, r3, #336 + vld1.8 {d0-d1}, [r2, : 128]! + vld1.8 {d2-d3}, [r4, : 128]! + vsub.i32 q0, q0, q1 + vld1.8 {d2-d3}, [r2, : 128]! + vld1.8 {d4-d5}, [r4, : 128]! + vsub.i32 q1, q1, q2 + add r5, r3, #240 + vld1.8 {d4}, [r2, : 64] + vld1.8 {d6}, [r4, : 64] + vsub.i32 q2, q2, q3 + vst1.8 {d0-d1}, [r5, : 128]! + vst1.8 {d2-d3}, [r5, : 128]! + vst1.8 d4, [r5, : 64] + add r2, r3, #144 + add r4, r3, #96 + add r5, r3, #144 + add r6, r3, #192 + vld1.8 {d0-d1}, [r2, : 128]! + vld1.8 {d2-d3}, [r4, : 128]! + vsub.i32 q2, q0, q1 + vadd.i32 q0, q0, q1 + vld1.8 {d2-d3}, [r2, : 128]! + vld1.8 {d6-d7}, [r4, : 128]! + vsub.i32 q4, q1, q3 + vadd.i32 q1, q1, q3 + vld1.8 {d6}, [r2, : 64] + vld1.8 {d10}, [r4, : 64] + vsub.i32 q6, q3, q5 + vadd.i32 q3, q3, q5 + vst1.8 {d4-d5}, [r5, : 128]! + vst1.8 {d0-d1}, [r6, : 128]! + vst1.8 {d8-d9}, [r5, : 128]! + vst1.8 {d2-d3}, [r6, : 128]! + vst1.8 d12, [r5, : 64] + vst1.8 d6, [r6, : 64] + add r2, r3, #0 + add r4, r3, #240 + vld1.8 {d0-d1}, [r4, : 128]! + vld1.8 {d2-d3}, [r4, : 128]! + vld1.8 {d4}, [r4, : 64] + add r4, r3, #336 + vld1.8 {d6-d7}, [r4, : 128]! + vtrn.32 q0, q3 + vld1.8 {d8-d9}, [r4, : 128]! + vshl.i32 q5, q0, #4 + vtrn.32 q1, q4 + vshl.i32 q6, q3, #4 + vadd.i32 q5, q5, q0 + vadd.i32 q6, q6, q3 + vshl.i32 q7, q1, #4 + vld1.8 {d5}, [r4, : 64] + vshl.i32 q8, q4, #4 + vtrn.32 d4, d5 + vadd.i32 q7, q7, q1 + vadd.i32 q8, q8, q4 + vld1.8 {d18-d19}, [r2, : 128]! + vshl.i32 q10, q2, #4 + vld1.8 {d22-d23}, [r2, : 128]! + vadd.i32 q10, q10, q2 + vld1.8 {d24}, [r2, : 64] + vadd.i32 q5, q5, q0 + add r2, r3, #288 + vld1.8 {d26-d27}, [r2, : 128]! + vadd.i32 q6, q6, q3 + vld1.8 {d28-d29}, [r2, : 128]! + vadd.i32 q8, q8, q4 + vld1.8 {d25}, [r2, : 64] + vadd.i32 q10, q10, q2 + vtrn.32 q9, q13 + vadd.i32 q7, q7, q1 + vadd.i32 q5, q5, q0 + vtrn.32 q11, q14 + vadd.i32 q6, q6, q3 + add r2, sp, #528 + vadd.i32 q10, q10, q2 + vtrn.32 d24, d25 + vst1.8 {d12-d13}, [r2, : 128]! + vshl.i32 q6, q13, #1 + vst1.8 {d20-d21}, [r2, : 128]! + vshl.i32 q10, q14, #1 + vst1.8 {d12-d13}, [r2, : 128]! + vshl.i32 q15, q12, #1 + vadd.i32 q8, q8, q4 + vext.32 d10, d31, d30, #0 + vadd.i32 q7, q7, q1 + vst1.8 {d16-d17}, [r2, : 128]! + vmull.s32 q8, d18, d5 + vmlal.s32 q8, d26, d4 + vmlal.s32 q8, d19, d9 + vmlal.s32 q8, d27, d3 + vmlal.s32 q8, d22, d8 + vmlal.s32 q8, d28, d2 + vmlal.s32 q8, d23, d7 + vmlal.s32 q8, d29, d1 + vmlal.s32 q8, d24, d6 + vmlal.s32 q8, d25, d0 + vst1.8 {d14-d15}, [r2, : 128]! + vmull.s32 q2, d18, d4 + vmlal.s32 q2, d12, d9 + vmlal.s32 q2, d13, d8 + vmlal.s32 q2, d19, d3 + vmlal.s32 q2, d22, d2 + vmlal.s32 q2, d23, d1 + vmlal.s32 q2, d24, d0 + vst1.8 {d20-d21}, [r2, : 128]! + vmull.s32 q7, d18, d9 + vmlal.s32 q7, d26, d3 + vmlal.s32 q7, d19, d8 + vmlal.s32 q7, d27, d2 + vmlal.s32 q7, d22, d7 + vmlal.s32 q7, d28, d1 + vmlal.s32 q7, d23, d6 + vmlal.s32 q7, d29, d0 + vst1.8 {d10-d11}, [r2, : 128]! + vmull.s32 q5, d18, d3 + vmlal.s32 q5, d19, d2 + vmlal.s32 q5, d22, d1 + vmlal.s32 q5, d23, d0 + vmlal.s32 q5, d12, d8 + vst1.8 {d16-d17}, [r2, : 128]! + vmull.s32 q4, d18, d8 + vmlal.s32 q4, d26, d2 + vmlal.s32 q4, d19, d7 + vmlal.s32 q4, d27, d1 + vmlal.s32 q4, d22, d6 + vmlal.s32 q4, d28, d0 + vmull.s32 q8, d18, d7 + vmlal.s32 q8, d26, d1 + vmlal.s32 q8, d19, d6 + vmlal.s32 q8, d27, d0 + add r2, sp, #544 + vld1.8 {d20-d21}, [r2, : 128] + vmlal.s32 q7, d24, d21 + vmlal.s32 q7, d25, d20 + vmlal.s32 q4, d23, d21 + vmlal.s32 q4, d29, d20 + vmlal.s32 q8, d22, d21 + vmlal.s32 q8, d28, d20 + vmlal.s32 q5, d24, d20 + vst1.8 {d14-d15}, [r2, : 128] + vmull.s32 q7, d18, d6 + vmlal.s32 q7, d26, d0 + add r2, sp, #624 + vld1.8 {d30-d31}, [r2, : 128] + vmlal.s32 q2, d30, d21 + vmlal.s32 q7, d19, d21 + vmlal.s32 q7, d27, d20 + add r2, sp, #592 + vld1.8 {d26-d27}, [r2, : 128] + vmlal.s32 q4, d25, d27 + vmlal.s32 q8, d29, d27 + vmlal.s32 q8, d25, d26 + vmlal.s32 q7, d28, d27 + vmlal.s32 q7, d29, d26 + add r2, sp, #576 + vld1.8 {d28-d29}, [r2, : 128] + vmlal.s32 q4, d24, d29 + vmlal.s32 q8, d23, d29 + vmlal.s32 q8, d24, d28 + vmlal.s32 q7, d22, d29 + vmlal.s32 q7, d23, d28 + vst1.8 {d8-d9}, [r2, : 128] + add r2, sp, #528 + vld1.8 {d8-d9}, [r2, : 128] + vmlal.s32 q7, d24, d9 + vmlal.s32 q7, d25, d31 + vmull.s32 q1, d18, d2 + vmlal.s32 q1, d19, d1 + vmlal.s32 q1, d22, d0 + vmlal.s32 q1, d24, d27 + vmlal.s32 q1, d23, d20 + vmlal.s32 q1, d12, d7 + vmlal.s32 q1, d13, d6 + vmull.s32 q6, d18, d1 + vmlal.s32 q6, d19, d0 + vmlal.s32 q6, d23, d27 + vmlal.s32 q6, d22, d20 + vmlal.s32 q6, d24, d26 + vmull.s32 q0, d18, d0 + vmlal.s32 q0, d22, d27 + vmlal.s32 q0, d23, d26 + vmlal.s32 q0, d24, d31 + vmlal.s32 q0, d19, d20 + add r2, sp, #608 + vld1.8 {d18-d19}, [r2, : 128] + vmlal.s32 q2, d18, d7 + vmlal.s32 q5, d18, d6 + vmlal.s32 q1, d18, d21 + vmlal.s32 q0, d18, d28 + vmlal.s32 q6, d18, d29 + vmlal.s32 q2, d19, d6 + vmlal.s32 q5, d19, d21 + vmlal.s32 q1, d19, d29 + vmlal.s32 q0, d19, d9 + vmlal.s32 q6, d19, d28 + add r2, sp, #560 + vld1.8 {d18-d19}, [r2, : 128] + add r2, sp, #480 + vld1.8 {d22-d23}, [r2, : 128] + vmlal.s32 q5, d19, d7 + vmlal.s32 q0, d18, d21 + vmlal.s32 q0, d19, d29 + vmlal.s32 q6, d18, d6 + add r2, sp, #496 + vld1.8 {d6-d7}, [r2, : 128] + vmlal.s32 q6, d19, d21 + add r2, sp, #544 + vld1.8 {d18-d19}, [r2, : 128] + vmlal.s32 q0, d30, d8 + add r2, sp, #640 + vld1.8 {d20-d21}, [r2, : 128] + vmlal.s32 q5, d30, d29 + add r2, sp, #576 + vld1.8 {d24-d25}, [r2, : 128] + vmlal.s32 q1, d30, d28 + vadd.i64 q13, q0, q11 + vadd.i64 q14, q5, q11 + vmlal.s32 q6, d30, d9 + vshr.s64 q4, q13, #26 + vshr.s64 q13, q14, #26 + vadd.i64 q7, q7, q4 + vshl.i64 q4, q4, #26 + vadd.i64 q14, q7, q3 + vadd.i64 q9, q9, q13 + vshl.i64 q13, q13, #26 + vadd.i64 q15, q9, q3 + vsub.i64 q0, q0, q4 + vshr.s64 q4, q14, #25 + vsub.i64 q5, q5, q13 + vshr.s64 q13, q15, #25 + vadd.i64 q6, q6, q4 + vshl.i64 q4, q4, #25 + vadd.i64 q14, q6, q11 + vadd.i64 q2, q2, q13 + vsub.i64 q4, q7, q4 + vshr.s64 q7, q14, #26 + vshl.i64 q13, q13, #25 + vadd.i64 q14, q2, q11 + vadd.i64 q8, q8, q7 + vshl.i64 q7, q7, #26 + vadd.i64 q15, q8, q3 + vsub.i64 q9, q9, q13 + vshr.s64 q13, q14, #26 + vsub.i64 q6, q6, q7 + vshr.s64 q7, q15, #25 + vadd.i64 q10, q10, q13 + vshl.i64 q13, q13, #26 + vadd.i64 q14, q10, q3 + vadd.i64 q1, q1, q7 + add r2, r3, #288 + vshl.i64 q7, q7, #25 + add r4, r3, #96 + vadd.i64 q15, q1, q11 + add r2, r2, #8 + vsub.i64 q2, q2, q13 + add r4, r4, #8 + vshr.s64 q13, q14, #25 + vsub.i64 q7, q8, q7 + vshr.s64 q8, q15, #26 + vadd.i64 q14, q13, q13 + vadd.i64 q12, q12, q8 + vtrn.32 d12, d14 + vshl.i64 q8, q8, #26 + vtrn.32 d13, d15 + vadd.i64 q3, q12, q3 + vadd.i64 q0, q0, q14 + vst1.8 d12, [r2, : 64]! + vshl.i64 q7, q13, #4 + vst1.8 d13, [r4, : 64]! + vsub.i64 q1, q1, q8 + vshr.s64 q3, q3, #25 + vadd.i64 q0, q0, q7 + vadd.i64 q5, q5, q3 + vshl.i64 q3, q3, #25 + vadd.i64 q6, q5, q11 + vadd.i64 q0, q0, q13 + vshl.i64 q7, q13, #25 + vadd.i64 q8, q0, q11 + vsub.i64 q3, q12, q3 + vshr.s64 q6, q6, #26 + vsub.i64 q7, q10, q7 + vtrn.32 d2, d6 + vshr.s64 q8, q8, #26 + vtrn.32 d3, d7 + vadd.i64 q3, q9, q6 + vst1.8 d2, [r2, : 64] + vshl.i64 q6, q6, #26 + vst1.8 d3, [r4, : 64] + vadd.i64 q1, q4, q8 + vtrn.32 d4, d14 + vshl.i64 q4, q8, #26 + vtrn.32 d5, d15 + vsub.i64 q5, q5, q6 + add r2, r2, #16 + vsub.i64 q0, q0, q4 + vst1.8 d4, [r2, : 64] + add r4, r4, #16 + vst1.8 d5, [r4, : 64] + vtrn.32 d10, d6 + vtrn.32 d11, d7 + sub r2, r2, #8 + sub r4, r4, #8 + vtrn.32 d0, d2 + vtrn.32 d1, d3 + vst1.8 d10, [r2, : 64] + vst1.8 d11, [r4, : 64] + sub r2, r2, #24 + sub r4, r4, #24 + vst1.8 d0, [r2, : 64] + vst1.8 d1, [r4, : 64] + add r2, sp, #512 + add r4, r3, #144 + add r5, r3, #192 + vld1.8 {d0-d1}, [r2, : 128] + vld1.8 {d2-d3}, [r4, : 128]! + vld1.8 {d4-d5}, [r5, : 128]! + vzip.i32 q1, q2 + vld1.8 {d6-d7}, [r4, : 128]! + vld1.8 {d8-d9}, [r5, : 128]! + vshl.i32 q5, q1, #1 + vzip.i32 q3, q4 + vshl.i32 q6, q2, #1 + vld1.8 {d14}, [r4, : 64] + vshl.i32 q8, q3, #1 + vld1.8 {d15}, [r5, : 64] + vshl.i32 q9, q4, #1 + vmul.i32 d21, d7, d1 + vtrn.32 d14, d15 + vmul.i32 q11, q4, q0 + vmul.i32 q0, q7, q0 + vmull.s32 q12, d2, d2 + vmlal.s32 q12, d11, d1 + vmlal.s32 q12, d12, d0 + vmlal.s32 q12, d13, d23 + vmlal.s32 q12, d16, d22 + vmlal.s32 q12, d7, d21 + vmull.s32 q10, d2, d11 + vmlal.s32 q10, d4, d1 + vmlal.s32 q10, d13, d0 + vmlal.s32 q10, d6, d23 + vmlal.s32 q10, d17, d22 + vmull.s32 q13, d10, d4 + vmlal.s32 q13, d11, d3 + vmlal.s32 q13, d13, d1 + vmlal.s32 q13, d16, d0 + vmlal.s32 q13, d17, d23 + vmlal.s32 q13, d8, d22 + vmull.s32 q1, d10, d5 + vmlal.s32 q1, d11, d4 + vmlal.s32 q1, d6, d1 + vmlal.s32 q1, d17, d0 + vmlal.s32 q1, d8, d23 + vmull.s32 q14, d10, d6 + vmlal.s32 q14, d11, d13 + vmlal.s32 q14, d4, d4 + vmlal.s32 q14, d17, d1 + vmlal.s32 q14, d18, d0 + vmlal.s32 q14, d9, d23 + vmull.s32 q11, d10, d7 + vmlal.s32 q11, d11, d6 + vmlal.s32 q11, d12, d5 + vmlal.s32 q11, d8, d1 + vmlal.s32 q11, d19, d0 + vmull.s32 q15, d10, d8 + vmlal.s32 q15, d11, d17 + vmlal.s32 q15, d12, d6 + vmlal.s32 q15, d13, d5 + vmlal.s32 q15, d19, d1 + vmlal.s32 q15, d14, d0 + vmull.s32 q2, d10, d9 + vmlal.s32 q2, d11, d8 + vmlal.s32 q2, d12, d7 + vmlal.s32 q2, d13, d6 + vmlal.s32 q2, d14, d1 + vmull.s32 q0, d15, d1 + vmlal.s32 q0, d10, d14 + vmlal.s32 q0, d11, d19 + vmlal.s32 q0, d12, d8 + vmlal.s32 q0, d13, d17 + vmlal.s32 q0, d6, d6 + add r2, sp, #480 + vld1.8 {d18-d19}, [r2, : 128]! + vmull.s32 q3, d16, d7 + vmlal.s32 q3, d10, d15 + vmlal.s32 q3, d11, d14 + vmlal.s32 q3, d12, d9 + vmlal.s32 q3, d13, d8 + vld1.8 {d8-d9}, [r2, : 128] + vadd.i64 q5, q12, q9 + vadd.i64 q6, q15, q9 + vshr.s64 q5, q5, #26 + vshr.s64 q6, q6, #26 + vadd.i64 q7, q10, q5 + vshl.i64 q5, q5, #26 + vadd.i64 q8, q7, q4 + vadd.i64 q2, q2, q6 + vshl.i64 q6, q6, #26 + vadd.i64 q10, q2, q4 + vsub.i64 q5, q12, q5 + vshr.s64 q8, q8, #25 + vsub.i64 q6, q15, q6 + vshr.s64 q10, q10, #25 + vadd.i64 q12, q13, q8 + vshl.i64 q8, q8, #25 + vadd.i64 q13, q12, q9 + vadd.i64 q0, q0, q10 + vsub.i64 q7, q7, q8 + vshr.s64 q8, q13, #26 + vshl.i64 q10, q10, #25 + vadd.i64 q13, q0, q9 + vadd.i64 q1, q1, q8 + vshl.i64 q8, q8, #26 + vadd.i64 q15, q1, q4 + vsub.i64 q2, q2, q10 + vshr.s64 q10, q13, #26 + vsub.i64 q8, q12, q8 + vshr.s64 q12, q15, #25 + vadd.i64 q3, q3, q10 + vshl.i64 q10, q10, #26 + vadd.i64 q13, q3, q4 + vadd.i64 q14, q14, q12 + add r2, r3, #144 + vshl.i64 q12, q12, #25 + add r4, r3, #192 + vadd.i64 q15, q14, q9 + add r2, r2, #8 + vsub.i64 q0, q0, q10 + add r4, r4, #8 + vshr.s64 q10, q13, #25 + vsub.i64 q1, q1, q12 + vshr.s64 q12, q15, #26 + vadd.i64 q13, q10, q10 + vadd.i64 q11, q11, q12 + vtrn.32 d16, d2 + vshl.i64 q12, q12, #26 + vtrn.32 d17, d3 + vadd.i64 q1, q11, q4 + vadd.i64 q4, q5, q13 + vst1.8 d16, [r2, : 64]! + vshl.i64 q5, q10, #4 + vst1.8 d17, [r4, : 64]! + vsub.i64 q8, q14, q12 + vshr.s64 q1, q1, #25 + vadd.i64 q4, q4, q5 + vadd.i64 q5, q6, q1 + vshl.i64 q1, q1, #25 + vadd.i64 q6, q5, q9 + vadd.i64 q4, q4, q10 + vshl.i64 q10, q10, #25 + vadd.i64 q9, q4, q9 + vsub.i64 q1, q11, q1 + vshr.s64 q6, q6, #26 + vsub.i64 q3, q3, q10 + vtrn.32 d16, d2 + vshr.s64 q9, q9, #26 + vtrn.32 d17, d3 + vadd.i64 q1, q2, q6 + vst1.8 d16, [r2, : 64] + vshl.i64 q2, q6, #26 + vst1.8 d17, [r4, : 64] + vadd.i64 q6, q7, q9 + vtrn.32 d0, d6 + vshl.i64 q7, q9, #26 + vtrn.32 d1, d7 + vsub.i64 q2, q5, q2 + add r2, r2, #16 + vsub.i64 q3, q4, q7 + vst1.8 d0, [r2, : 64] + add r4, r4, #16 + vst1.8 d1, [r4, : 64] + vtrn.32 d4, d2 + vtrn.32 d5, d3 + sub r2, r2, #8 + sub r4, r4, #8 + vtrn.32 d6, d12 + vtrn.32 d7, d13 + vst1.8 d4, [r2, : 64] + vst1.8 d5, [r4, : 64] + sub r2, r2, #24 + sub r4, r4, #24 + vst1.8 d6, [r2, : 64] + vst1.8 d7, [r4, : 64] + add r2, r3, #336 + add r4, r3, #288 + vld1.8 {d0-d1}, [r2, : 128]! + vld1.8 {d2-d3}, [r4, : 128]! + vadd.i32 q0, q0, q1 + vld1.8 {d2-d3}, [r2, : 128]! + vld1.8 {d4-d5}, [r4, : 128]! + vadd.i32 q1, q1, q2 + add r5, r3, #288 + vld1.8 {d4}, [r2, : 64] + vld1.8 {d6}, [r4, : 64] + vadd.i32 q2, q2, q3 + vst1.8 {d0-d1}, [r5, : 128]! + vst1.8 {d2-d3}, [r5, : 128]! + vst1.8 d4, [r5, : 64] + add r2, r3, #48 + add r4, r3, #144 + vld1.8 {d0-d1}, [r4, : 128]! + vld1.8 {d2-d3}, [r4, : 128]! + vld1.8 {d4}, [r4, : 64] + add r4, r3, #288 + vld1.8 {d6-d7}, [r4, : 128]! + vtrn.32 q0, q3 + vld1.8 {d8-d9}, [r4, : 128]! + vshl.i32 q5, q0, #4 + vtrn.32 q1, q4 + vshl.i32 q6, q3, #4 + vadd.i32 q5, q5, q0 + vadd.i32 q6, q6, q3 + vshl.i32 q7, q1, #4 + vld1.8 {d5}, [r4, : 64] + vshl.i32 q8, q4, #4 + vtrn.32 d4, d5 + vadd.i32 q7, q7, q1 + vadd.i32 q8, q8, q4 + vld1.8 {d18-d19}, [r2, : 128]! + vshl.i32 q10, q2, #4 + vld1.8 {d22-d23}, [r2, : 128]! + vadd.i32 q10, q10, q2 + vld1.8 {d24}, [r2, : 64] + vadd.i32 q5, q5, q0 + add r2, r3, #240 + vld1.8 {d26-d27}, [r2, : 128]! + vadd.i32 q6, q6, q3 + vld1.8 {d28-d29}, [r2, : 128]! + vadd.i32 q8, q8, q4 + vld1.8 {d25}, [r2, : 64] + vadd.i32 q10, q10, q2 + vtrn.32 q9, q13 + vadd.i32 q7, q7, q1 + vadd.i32 q5, q5, q0 + vtrn.32 q11, q14 + vadd.i32 q6, q6, q3 + add r2, sp, #528 + vadd.i32 q10, q10, q2 + vtrn.32 d24, d25 + vst1.8 {d12-d13}, [r2, : 128]! + vshl.i32 q6, q13, #1 + vst1.8 {d20-d21}, [r2, : 128]! + vshl.i32 q10, q14, #1 + vst1.8 {d12-d13}, [r2, : 128]! + vshl.i32 q15, q12, #1 + vadd.i32 q8, q8, q4 + vext.32 d10, d31, d30, #0 + vadd.i32 q7, q7, q1 + vst1.8 {d16-d17}, [r2, : 128]! + vmull.s32 q8, d18, d5 + vmlal.s32 q8, d26, d4 + vmlal.s32 q8, d19, d9 + vmlal.s32 q8, d27, d3 + vmlal.s32 q8, d22, d8 + vmlal.s32 q8, d28, d2 + vmlal.s32 q8, d23, d7 + vmlal.s32 q8, d29, d1 + vmlal.s32 q8, d24, d6 + vmlal.s32 q8, d25, d0 + vst1.8 {d14-d15}, [r2, : 128]! + vmull.s32 q2, d18, d4 + vmlal.s32 q2, d12, d9 + vmlal.s32 q2, d13, d8 + vmlal.s32 q2, d19, d3 + vmlal.s32 q2, d22, d2 + vmlal.s32 q2, d23, d1 + vmlal.s32 q2, d24, d0 + vst1.8 {d20-d21}, [r2, : 128]! + vmull.s32 q7, d18, d9 + vmlal.s32 q7, d26, d3 + vmlal.s32 q7, d19, d8 + vmlal.s32 q7, d27, d2 + vmlal.s32 q7, d22, d7 + vmlal.s32 q7, d28, d1 + vmlal.s32 q7, d23, d6 + vmlal.s32 q7, d29, d0 + vst1.8 {d10-d11}, [r2, : 128]! + vmull.s32 q5, d18, d3 + vmlal.s32 q5, d19, d2 + vmlal.s32 q5, d22, d1 + vmlal.s32 q5, d23, d0 + vmlal.s32 q5, d12, d8 + vst1.8 {d16-d17}, [r2, : 128]! + vmull.s32 q4, d18, d8 + vmlal.s32 q4, d26, d2 + vmlal.s32 q4, d19, d7 + vmlal.s32 q4, d27, d1 + vmlal.s32 q4, d22, d6 + vmlal.s32 q4, d28, d0 + vmull.s32 q8, d18, d7 + vmlal.s32 q8, d26, d1 + vmlal.s32 q8, d19, d6 + vmlal.s32 q8, d27, d0 + add r2, sp, #544 + vld1.8 {d20-d21}, [r2, : 128] + vmlal.s32 q7, d24, d21 + vmlal.s32 q7, d25, d20 + vmlal.s32 q4, d23, d21 + vmlal.s32 q4, d29, d20 + vmlal.s32 q8, d22, d21 + vmlal.s32 q8, d28, d20 + vmlal.s32 q5, d24, d20 + vst1.8 {d14-d15}, [r2, : 128] + vmull.s32 q7, d18, d6 + vmlal.s32 q7, d26, d0 + add r2, sp, #624 + vld1.8 {d30-d31}, [r2, : 128] + vmlal.s32 q2, d30, d21 + vmlal.s32 q7, d19, d21 + vmlal.s32 q7, d27, d20 + add r2, sp, #592 + vld1.8 {d26-d27}, [r2, : 128] + vmlal.s32 q4, d25, d27 + vmlal.s32 q8, d29, d27 + vmlal.s32 q8, d25, d26 + vmlal.s32 q7, d28, d27 + vmlal.s32 q7, d29, d26 + add r2, sp, #576 + vld1.8 {d28-d29}, [r2, : 128] + vmlal.s32 q4, d24, d29 + vmlal.s32 q8, d23, d29 + vmlal.s32 q8, d24, d28 + vmlal.s32 q7, d22, d29 + vmlal.s32 q7, d23, d28 + vst1.8 {d8-d9}, [r2, : 128] + add r2, sp, #528 + vld1.8 {d8-d9}, [r2, : 128] + vmlal.s32 q7, d24, d9 + vmlal.s32 q7, d25, d31 + vmull.s32 q1, d18, d2 + vmlal.s32 q1, d19, d1 + vmlal.s32 q1, d22, d0 + vmlal.s32 q1, d24, d27 + vmlal.s32 q1, d23, d20 + vmlal.s32 q1, d12, d7 + vmlal.s32 q1, d13, d6 + vmull.s32 q6, d18, d1 + vmlal.s32 q6, d19, d0 + vmlal.s32 q6, d23, d27 + vmlal.s32 q6, d22, d20 + vmlal.s32 q6, d24, d26 + vmull.s32 q0, d18, d0 + vmlal.s32 q0, d22, d27 + vmlal.s32 q0, d23, d26 + vmlal.s32 q0, d24, d31 + vmlal.s32 q0, d19, d20 + add r2, sp, #608 + vld1.8 {d18-d19}, [r2, : 128] + vmlal.s32 q2, d18, d7 + vmlal.s32 q5, d18, d6 + vmlal.s32 q1, d18, d21 + vmlal.s32 q0, d18, d28 + vmlal.s32 q6, d18, d29 + vmlal.s32 q2, d19, d6 + vmlal.s32 q5, d19, d21 + vmlal.s32 q1, d19, d29 + vmlal.s32 q0, d19, d9 + vmlal.s32 q6, d19, d28 + add r2, sp, #560 + vld1.8 {d18-d19}, [r2, : 128] + add r2, sp, #480 + vld1.8 {d22-d23}, [r2, : 128] + vmlal.s32 q5, d19, d7 + vmlal.s32 q0, d18, d21 + vmlal.s32 q0, d19, d29 + vmlal.s32 q6, d18, d6 + add r2, sp, #496 + vld1.8 {d6-d7}, [r2, : 128] + vmlal.s32 q6, d19, d21 + add r2, sp, #544 + vld1.8 {d18-d19}, [r2, : 128] + vmlal.s32 q0, d30, d8 + add r2, sp, #640 + vld1.8 {d20-d21}, [r2, : 128] + vmlal.s32 q5, d30, d29 + add r2, sp, #576 + vld1.8 {d24-d25}, [r2, : 128] + vmlal.s32 q1, d30, d28 + vadd.i64 q13, q0, q11 + vadd.i64 q14, q5, q11 + vmlal.s32 q6, d30, d9 + vshr.s64 q4, q13, #26 + vshr.s64 q13, q14, #26 + vadd.i64 q7, q7, q4 + vshl.i64 q4, q4, #26 + vadd.i64 q14, q7, q3 + vadd.i64 q9, q9, q13 + vshl.i64 q13, q13, #26 + vadd.i64 q15, q9, q3 + vsub.i64 q0, q0, q4 + vshr.s64 q4, q14, #25 + vsub.i64 q5, q5, q13 + vshr.s64 q13, q15, #25 + vadd.i64 q6, q6, q4 + vshl.i64 q4, q4, #25 + vadd.i64 q14, q6, q11 + vadd.i64 q2, q2, q13 + vsub.i64 q4, q7, q4 + vshr.s64 q7, q14, #26 + vshl.i64 q13, q13, #25 + vadd.i64 q14, q2, q11 + vadd.i64 q8, q8, q7 + vshl.i64 q7, q7, #26 + vadd.i64 q15, q8, q3 + vsub.i64 q9, q9, q13 + vshr.s64 q13, q14, #26 + vsub.i64 q6, q6, q7 + vshr.s64 q7, q15, #25 + vadd.i64 q10, q10, q13 + vshl.i64 q13, q13, #26 + vadd.i64 q14, q10, q3 + vadd.i64 q1, q1, q7 + add r2, r3, #240 + vshl.i64 q7, q7, #25 + add r4, r3, #144 + vadd.i64 q15, q1, q11 + add r2, r2, #8 + vsub.i64 q2, q2, q13 + add r4, r4, #8 + vshr.s64 q13, q14, #25 + vsub.i64 q7, q8, q7 + vshr.s64 q8, q15, #26 + vadd.i64 q14, q13, q13 + vadd.i64 q12, q12, q8 + vtrn.32 d12, d14 + vshl.i64 q8, q8, #26 + vtrn.32 d13, d15 + vadd.i64 q3, q12, q3 + vadd.i64 q0, q0, q14 + vst1.8 d12, [r2, : 64]! + vshl.i64 q7, q13, #4 + vst1.8 d13, [r4, : 64]! + vsub.i64 q1, q1, q8 + vshr.s64 q3, q3, #25 + vadd.i64 q0, q0, q7 + vadd.i64 q5, q5, q3 + vshl.i64 q3, q3, #25 + vadd.i64 q6, q5, q11 + vadd.i64 q0, q0, q13 + vshl.i64 q7, q13, #25 + vadd.i64 q8, q0, q11 + vsub.i64 q3, q12, q3 + vshr.s64 q6, q6, #26 + vsub.i64 q7, q10, q7 + vtrn.32 d2, d6 + vshr.s64 q8, q8, #26 + vtrn.32 d3, d7 + vadd.i64 q3, q9, q6 + vst1.8 d2, [r2, : 64] + vshl.i64 q6, q6, #26 + vst1.8 d3, [r4, : 64] + vadd.i64 q1, q4, q8 + vtrn.32 d4, d14 + vshl.i64 q4, q8, #26 + vtrn.32 d5, d15 + vsub.i64 q5, q5, q6 + add r2, r2, #16 + vsub.i64 q0, q0, q4 + vst1.8 d4, [r2, : 64] + add r4, r4, #16 + vst1.8 d5, [r4, : 64] + vtrn.32 d10, d6 + vtrn.32 d11, d7 + sub r2, r2, #8 + sub r4, r4, #8 + vtrn.32 d0, d2 + vtrn.32 d1, d3 + vst1.8 d10, [r2, : 64] + vst1.8 d11, [r4, : 64] + sub r2, r2, #24 + sub r4, r4, #24 + vst1.8 d0, [r2, : 64] + vst1.8 d1, [r4, : 64] + ldr r2, [sp, #456] + ldr r4, [sp, #460] + subs r5, r2, #1 + bge .Lmainloop + add r1, r3, #144 + add r2, r3, #336 + vld1.8 {d0-d1}, [r1, : 128]! + vld1.8 {d2-d3}, [r1, : 128]! + vld1.8 {d4}, [r1, : 64] + vst1.8 {d0-d1}, [r2, : 128]! + vst1.8 {d2-d3}, [r2, : 128]! + vst1.8 d4, [r2, : 64] + movw r1, #0 +.Linvertloop: + add r2, r3, #144 + movw r4, #0 + movw r5, #2 + cmp r1, #1 + moveq r5, #1 + addeq r2, r3, #336 + addeq r4, r3, #48 + cmp r1, #2 + moveq r5, #1 + addeq r2, r3, #48 + cmp r1, #3 + moveq r5, #5 + addeq r4, r3, #336 + cmp r1, #4 + moveq r5, #10 + cmp r1, #5 + moveq r5, #20 + cmp r1, #6 + moveq r5, #10 + addeq r2, r3, #336 + addeq r4, r3, #336 + cmp r1, #7 + moveq r5, #50 + cmp r1, #8 + moveq r5, #100 + cmp r1, #9 + moveq r5, #50 + addeq r2, r3, #336 + cmp r1, #10 + moveq r5, #5 + addeq r2, r3, #48 + cmp r1, #11 + moveq r5, #0 + addeq r2, r3, #96 + add r6, r3, #144 + add r7, r3, #288 + vld1.8 {d0-d1}, [r6, : 128]! + vld1.8 {d2-d3}, [r6, : 128]! + vld1.8 {d4}, [r6, : 64] + vst1.8 {d0-d1}, [r7, : 128]! + vst1.8 {d2-d3}, [r7, : 128]! + vst1.8 d4, [r7, : 64] + cmp r5, #0 + beq .Lskipsquaringloop +.Lsquaringloop: + add r6, r3, #288 + add r7, r3, #288 + add r8, r3, #288 + vmov.i32 q0, #19 + vmov.i32 q1, #0 + vmov.i32 q2, #1 + vzip.i32 q1, q2 + vld1.8 {d4-d5}, [r7, : 128]! + vld1.8 {d6-d7}, [r7, : 128]! + vld1.8 {d9}, [r7, : 64] + vld1.8 {d10-d11}, [r6, : 128]! + add r7, sp, #384 + vld1.8 {d12-d13}, [r6, : 128]! + vmul.i32 q7, q2, q0 + vld1.8 {d8}, [r6, : 64] + vext.32 d17, d11, d10, #1 + vmul.i32 q9, q3, q0 + vext.32 d16, d10, d8, #1 + vshl.u32 q10, q5, q1 + vext.32 d22, d14, d4, #1 + vext.32 d24, d18, d6, #1 + vshl.u32 q13, q6, q1 + vshl.u32 d28, d8, d2 + vrev64.i32 d22, d22 + vmul.i32 d1, d9, d1 + vrev64.i32 d24, d24 + vext.32 d29, d8, d13, #1 + vext.32 d0, d1, d9, #1 + vrev64.i32 d0, d0 + vext.32 d2, d9, d1, #1 + vext.32 d23, d15, d5, #1 + vmull.s32 q4, d20, d4 + vrev64.i32 d23, d23 + vmlal.s32 q4, d21, d1 + vrev64.i32 d2, d2 + vmlal.s32 q4, d26, d19 + vext.32 d3, d5, d15, #1 + vmlal.s32 q4, d27, d18 + vrev64.i32 d3, d3 + vmlal.s32 q4, d28, d15 + vext.32 d14, d12, d11, #1 + vmull.s32 q5, d16, d23 + vext.32 d15, d13, d12, #1 + vmlal.s32 q5, d17, d4 + vst1.8 d8, [r7, : 64]! + vmlal.s32 q5, d14, d1 + vext.32 d12, d9, d8, #0 + vmlal.s32 q5, d15, d19 + vmov.i64 d13, #0 + vmlal.s32 q5, d29, d18 + vext.32 d25, d19, d7, #1 + vmlal.s32 q6, d20, d5 + vrev64.i32 d25, d25 + vmlal.s32 q6, d21, d4 + vst1.8 d11, [r7, : 64]! + vmlal.s32 q6, d26, d1 + vext.32 d9, d10, d10, #0 + vmlal.s32 q6, d27, d19 + vmov.i64 d8, #0 + vmlal.s32 q6, d28, d18 + vmlal.s32 q4, d16, d24 + vmlal.s32 q4, d17, d5 + vmlal.s32 q4, d14, d4 + vst1.8 d12, [r7, : 64]! + vmlal.s32 q4, d15, d1 + vext.32 d10, d13, d12, #0 + vmlal.s32 q4, d29, d19 + vmov.i64 d11, #0 + vmlal.s32 q5, d20, d6 + vmlal.s32 q5, d21, d5 + vmlal.s32 q5, d26, d4 + vext.32 d13, d8, d8, #0 + vmlal.s32 q5, d27, d1 + vmov.i64 d12, #0 + vmlal.s32 q5, d28, d19 + vst1.8 d9, [r7, : 64]! + vmlal.s32 q6, d16, d25 + vmlal.s32 q6, d17, d6 + vst1.8 d10, [r7, : 64] + vmlal.s32 q6, d14, d5 + vext.32 d8, d11, d10, #0 + vmlal.s32 q6, d15, d4 + vmov.i64 d9, #0 + vmlal.s32 q6, d29, d1 + vmlal.s32 q4, d20, d7 + vmlal.s32 q4, d21, d6 + vmlal.s32 q4, d26, d5 + vext.32 d11, d12, d12, #0 + vmlal.s32 q4, d27, d4 + vmov.i64 d10, #0 + vmlal.s32 q4, d28, d1 + vmlal.s32 q5, d16, d0 + sub r6, r7, #32 + vmlal.s32 q5, d17, d7 + vmlal.s32 q5, d14, d6 + vext.32 d30, d9, d8, #0 + vmlal.s32 q5, d15, d5 + vld1.8 {d31}, [r6, : 64]! + vmlal.s32 q5, d29, d4 + vmlal.s32 q15, d20, d0 + vext.32 d0, d6, d18, #1 + vmlal.s32 q15, d21, d25 + vrev64.i32 d0, d0 + vmlal.s32 q15, d26, d24 + vext.32 d1, d7, d19, #1 + vext.32 d7, d10, d10, #0 + vmlal.s32 q15, d27, d23 + vrev64.i32 d1, d1 + vld1.8 {d6}, [r6, : 64] + vmlal.s32 q15, d28, d22 + vmlal.s32 q3, d16, d4 + add r6, r6, #24 + vmlal.s32 q3, d17, d2 + vext.32 d4, d31, d30, #0 + vmov d17, d11 + vmlal.s32 q3, d14, d1 + vext.32 d11, d13, d13, #0 + vext.32 d13, d30, d30, #0 + vmlal.s32 q3, d15, d0 + vext.32 d1, d8, d8, #0 + vmlal.s32 q3, d29, d3 + vld1.8 {d5}, [r6, : 64] + sub r6, r6, #16 + vext.32 d10, d6, d6, #0 + vmov.i32 q1, #0xffffffff + vshl.i64 q4, q1, #25 + add r7, sp, #480 + vld1.8 {d14-d15}, [r7, : 128] + vadd.i64 q9, q2, q7 + vshl.i64 q1, q1, #26 + vshr.s64 q10, q9, #26 + vld1.8 {d0}, [r6, : 64]! + vadd.i64 q5, q5, q10 + vand q9, q9, q1 + vld1.8 {d16}, [r6, : 64]! + add r6, sp, #496 + vld1.8 {d20-d21}, [r6, : 128] + vadd.i64 q11, q5, q10 + vsub.i64 q2, q2, q9 + vshr.s64 q9, q11, #25 + vext.32 d12, d5, d4, #0 + vand q11, q11, q4 + vadd.i64 q0, q0, q9 + vmov d19, d7 + vadd.i64 q3, q0, q7 + vsub.i64 q5, q5, q11 + vshr.s64 q11, q3, #26 + vext.32 d18, d11, d10, #0 + vand q3, q3, q1 + vadd.i64 q8, q8, q11 + vadd.i64 q11, q8, q10 + vsub.i64 q0, q0, q3 + vshr.s64 q3, q11, #25 + vand q11, q11, q4 + vadd.i64 q3, q6, q3 + vadd.i64 q6, q3, q7 + vsub.i64 q8, q8, q11 + vshr.s64 q11, q6, #26 + vand q6, q6, q1 + vadd.i64 q9, q9, q11 + vadd.i64 d25, d19, d21 + vsub.i64 q3, q3, q6 + vshr.s64 d23, d25, #25 + vand q4, q12, q4 + vadd.i64 d21, d23, d23 + vshl.i64 d25, d23, #4 + vadd.i64 d21, d21, d23 + vadd.i64 d25, d25, d21 + vadd.i64 d4, d4, d25 + vzip.i32 q0, q8 + vadd.i64 d12, d4, d14 + add r6, r8, #8 + vst1.8 d0, [r6, : 64] + vsub.i64 d19, d19, d9 + add r6, r6, #16 + vst1.8 d16, [r6, : 64] + vshr.s64 d22, d12, #26 + vand q0, q6, q1 + vadd.i64 d10, d10, d22 + vzip.i32 q3, q9 + vsub.i64 d4, d4, d0 + sub r6, r6, #8 + vst1.8 d6, [r6, : 64] + add r6, r6, #16 + vst1.8 d18, [r6, : 64] + vzip.i32 q2, q5 + sub r6, r6, #32 + vst1.8 d4, [r6, : 64] + subs r5, r5, #1 + bhi .Lsquaringloop +.Lskipsquaringloop: + mov r2, r2 + add r5, r3, #288 + add r6, r3, #144 + vmov.i32 q0, #19 + vmov.i32 q1, #0 + vmov.i32 q2, #1 + vzip.i32 q1, q2 + vld1.8 {d4-d5}, [r5, : 128]! + vld1.8 {d6-d7}, [r5, : 128]! + vld1.8 {d9}, [r5, : 64] + vld1.8 {d10-d11}, [r2, : 128]! + add r5, sp, #384 + vld1.8 {d12-d13}, [r2, : 128]! + vmul.i32 q7, q2, q0 + vld1.8 {d8}, [r2, : 64] + vext.32 d17, d11, d10, #1 + vmul.i32 q9, q3, q0 + vext.32 d16, d10, d8, #1 + vshl.u32 q10, q5, q1 + vext.32 d22, d14, d4, #1 + vext.32 d24, d18, d6, #1 + vshl.u32 q13, q6, q1 + vshl.u32 d28, d8, d2 + vrev64.i32 d22, d22 + vmul.i32 d1, d9, d1 + vrev64.i32 d24, d24 + vext.32 d29, d8, d13, #1 + vext.32 d0, d1, d9, #1 + vrev64.i32 d0, d0 + vext.32 d2, d9, d1, #1 + vext.32 d23, d15, d5, #1 + vmull.s32 q4, d20, d4 + vrev64.i32 d23, d23 + vmlal.s32 q4, d21, d1 + vrev64.i32 d2, d2 + vmlal.s32 q4, d26, d19 + vext.32 d3, d5, d15, #1 + vmlal.s32 q4, d27, d18 + vrev64.i32 d3, d3 + vmlal.s32 q4, d28, d15 + vext.32 d14, d12, d11, #1 + vmull.s32 q5, d16, d23 + vext.32 d15, d13, d12, #1 + vmlal.s32 q5, d17, d4 + vst1.8 d8, [r5, : 64]! + vmlal.s32 q5, d14, d1 + vext.32 d12, d9, d8, #0 + vmlal.s32 q5, d15, d19 + vmov.i64 d13, #0 + vmlal.s32 q5, d29, d18 + vext.32 d25, d19, d7, #1 + vmlal.s32 q6, d20, d5 + vrev64.i32 d25, d25 + vmlal.s32 q6, d21, d4 + vst1.8 d11, [r5, : 64]! + vmlal.s32 q6, d26, d1 + vext.32 d9, d10, d10, #0 + vmlal.s32 q6, d27, d19 + vmov.i64 d8, #0 + vmlal.s32 q6, d28, d18 + vmlal.s32 q4, d16, d24 + vmlal.s32 q4, d17, d5 + vmlal.s32 q4, d14, d4 + vst1.8 d12, [r5, : 64]! + vmlal.s32 q4, d15, d1 + vext.32 d10, d13, d12, #0 + vmlal.s32 q4, d29, d19 + vmov.i64 d11, #0 + vmlal.s32 q5, d20, d6 + vmlal.s32 q5, d21, d5 + vmlal.s32 q5, d26, d4 + vext.32 d13, d8, d8, #0 + vmlal.s32 q5, d27, d1 + vmov.i64 d12, #0 + vmlal.s32 q5, d28, d19 + vst1.8 d9, [r5, : 64]! + vmlal.s32 q6, d16, d25 + vmlal.s32 q6, d17, d6 + vst1.8 d10, [r5, : 64] + vmlal.s32 q6, d14, d5 + vext.32 d8, d11, d10, #0 + vmlal.s32 q6, d15, d4 + vmov.i64 d9, #0 + vmlal.s32 q6, d29, d1 + vmlal.s32 q4, d20, d7 + vmlal.s32 q4, d21, d6 + vmlal.s32 q4, d26, d5 + vext.32 d11, d12, d12, #0 + vmlal.s32 q4, d27, d4 + vmov.i64 d10, #0 + vmlal.s32 q4, d28, d1 + vmlal.s32 q5, d16, d0 + sub r2, r5, #32 + vmlal.s32 q5, d17, d7 + vmlal.s32 q5, d14, d6 + vext.32 d30, d9, d8, #0 + vmlal.s32 q5, d15, d5 + vld1.8 {d31}, [r2, : 64]! + vmlal.s32 q5, d29, d4 + vmlal.s32 q15, d20, d0 + vext.32 d0, d6, d18, #1 + vmlal.s32 q15, d21, d25 + vrev64.i32 d0, d0 + vmlal.s32 q15, d26, d24 + vext.32 d1, d7, d19, #1 + vext.32 d7, d10, d10, #0 + vmlal.s32 q15, d27, d23 + vrev64.i32 d1, d1 + vld1.8 {d6}, [r2, : 64] + vmlal.s32 q15, d28, d22 + vmlal.s32 q3, d16, d4 + add r2, r2, #24 + vmlal.s32 q3, d17, d2 + vext.32 d4, d31, d30, #0 + vmov d17, d11 + vmlal.s32 q3, d14, d1 + vext.32 d11, d13, d13, #0 + vext.32 d13, d30, d30, #0 + vmlal.s32 q3, d15, d0 + vext.32 d1, d8, d8, #0 + vmlal.s32 q3, d29, d3 + vld1.8 {d5}, [r2, : 64] + sub r2, r2, #16 + vext.32 d10, d6, d6, #0 + vmov.i32 q1, #0xffffffff + vshl.i64 q4, q1, #25 + add r5, sp, #480 + vld1.8 {d14-d15}, [r5, : 128] + vadd.i64 q9, q2, q7 + vshl.i64 q1, q1, #26 + vshr.s64 q10, q9, #26 + vld1.8 {d0}, [r2, : 64]! + vadd.i64 q5, q5, q10 + vand q9, q9, q1 + vld1.8 {d16}, [r2, : 64]! + add r2, sp, #496 + vld1.8 {d20-d21}, [r2, : 128] + vadd.i64 q11, q5, q10 + vsub.i64 q2, q2, q9 + vshr.s64 q9, q11, #25 + vext.32 d12, d5, d4, #0 + vand q11, q11, q4 + vadd.i64 q0, q0, q9 + vmov d19, d7 + vadd.i64 q3, q0, q7 + vsub.i64 q5, q5, q11 + vshr.s64 q11, q3, #26 + vext.32 d18, d11, d10, #0 + vand q3, q3, q1 + vadd.i64 q8, q8, q11 + vadd.i64 q11, q8, q10 + vsub.i64 q0, q0, q3 + vshr.s64 q3, q11, #25 + vand q11, q11, q4 + vadd.i64 q3, q6, q3 + vadd.i64 q6, q3, q7 + vsub.i64 q8, q8, q11 + vshr.s64 q11, q6, #26 + vand q6, q6, q1 + vadd.i64 q9, q9, q11 + vadd.i64 d25, d19, d21 + vsub.i64 q3, q3, q6 + vshr.s64 d23, d25, #25 + vand q4, q12, q4 + vadd.i64 d21, d23, d23 + vshl.i64 d25, d23, #4 + vadd.i64 d21, d21, d23 + vadd.i64 d25, d25, d21 + vadd.i64 d4, d4, d25 + vzip.i32 q0, q8 + vadd.i64 d12, d4, d14 + add r2, r6, #8 + vst1.8 d0, [r2, : 64] + vsub.i64 d19, d19, d9 + add r2, r2, #16 + vst1.8 d16, [r2, : 64] + vshr.s64 d22, d12, #26 + vand q0, q6, q1 + vadd.i64 d10, d10, d22 + vzip.i32 q3, q9 + vsub.i64 d4, d4, d0 + sub r2, r2, #8 + vst1.8 d6, [r2, : 64] + add r2, r2, #16 + vst1.8 d18, [r2, : 64] + vzip.i32 q2, q5 + sub r2, r2, #32 + vst1.8 d4, [r2, : 64] + cmp r4, #0 + beq .Lskippostcopy + add r2, r3, #144 + mov r4, r4 + vld1.8 {d0-d1}, [r2, : 128]! + vld1.8 {d2-d3}, [r2, : 128]! + vld1.8 {d4}, [r2, : 64] + vst1.8 {d0-d1}, [r4, : 128]! + vst1.8 {d2-d3}, [r4, : 128]! + vst1.8 d4, [r4, : 64] +.Lskippostcopy: + cmp r1, #1 + bne .Lskipfinalcopy + add r2, r3, #288 + add r4, r3, #144 + vld1.8 {d0-d1}, [r2, : 128]! + vld1.8 {d2-d3}, [r2, : 128]! + vld1.8 {d4}, [r2, : 64] + vst1.8 {d0-d1}, [r4, : 128]! + vst1.8 {d2-d3}, [r4, : 128]! + vst1.8 d4, [r4, : 64] +.Lskipfinalcopy: + add r1, r1, #1 + cmp r1, #12 + blo .Linvertloop + add r1, r3, #144 + ldr r2, [r1], #4 + ldr r3, [r1], #4 + ldr r4, [r1], #4 + ldr r5, [r1], #4 + ldr r6, [r1], #4 + ldr r7, [r1], #4 + ldr r8, [r1], #4 + ldr r9, [r1], #4 + ldr r10, [r1], #4 + ldr r1, [r1] + add r11, r1, r1, LSL #4 + add r11, r11, r1, LSL #1 + add r11, r11, #16777216 + mov r11, r11, ASR #25 + add r11, r11, r2 + mov r11, r11, ASR #26 + add r11, r11, r3 + mov r11, r11, ASR #25 + add r11, r11, r4 + mov r11, r11, ASR #26 + add r11, r11, r5 + mov r11, r11, ASR #25 + add r11, r11, r6 + mov r11, r11, ASR #26 + add r11, r11, r7 + mov r11, r11, ASR #25 + add r11, r11, r8 + mov r11, r11, ASR #26 + add r11, r11, r9 + mov r11, r11, ASR #25 + add r11, r11, r10 + mov r11, r11, ASR #26 + add r11, r11, r1 + mov r11, r11, ASR #25 + add r2, r2, r11 + add r2, r2, r11, LSL #1 + add r2, r2, r11, LSL #4 + mov r11, r2, ASR #26 + add r3, r3, r11 + sub r2, r2, r11, LSL #26 + mov r11, r3, ASR #25 + add r4, r4, r11 + sub r3, r3, r11, LSL #25 + mov r11, r4, ASR #26 + add r5, r5, r11 + sub r4, r4, r11, LSL #26 + mov r11, r5, ASR #25 + add r6, r6, r11 + sub r5, r5, r11, LSL #25 + mov r11, r6, ASR #26 + add r7, r7, r11 + sub r6, r6, r11, LSL #26 + mov r11, r7, ASR #25 + add r8, r8, r11 + sub r7, r7, r11, LSL #25 + mov r11, r8, ASR #26 + add r9, r9, r11 + sub r8, r8, r11, LSL #26 + mov r11, r9, ASR #25 + add r10, r10, r11 + sub r9, r9, r11, LSL #25 + mov r11, r10, ASR #26 + add r1, r1, r11 + sub r10, r10, r11, LSL #26 + mov r11, r1, ASR #25 + sub r1, r1, r11, LSL #25 + add r2, r2, r3, LSL #26 + mov r3, r3, LSR #6 + add r3, r3, r4, LSL #19 + mov r4, r4, LSR #13 + add r4, r4, r5, LSL #13 + mov r5, r5, LSR #19 + add r5, r5, r6, LSL #6 + add r6, r7, r8, LSL #25 + mov r7, r8, LSR #7 + add r7, r7, r9, LSL #19 + mov r8, r9, LSR #13 + add r8, r8, r10, LSL #12 + mov r9, r10, LSR #20 + add r1, r9, r1, LSL #6 + str r2, [r0] + str r3, [r0, #4] + str r4, [r0, #8] + str r5, [r0, #12] + str r6, [r0, #16] + str r7, [r0, #20] + str r8, [r0, #24] + str r1, [r0, #28] + movw r0, #0 + mov sp, ip + pop {r4-r11, pc} +ENDPROC(curve25519_neon) diff --git a/arch/arm/crypto/curve25519-glue.c b/arch/arm/crypto/curve25519-glue.c new file mode 100644 index 000000000000..4f5cf1445ed0 --- /dev/null +++ b/arch/arm/crypto/curve25519-glue.c @@ -0,0 +1,135 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + * + * Based on public domain code from Daniel J. Bernstein and Peter Schwabe. This + * began from SUPERCOP's curve25519/neon2/scalarmult.s, but has subsequently been + * manually reworked for use in kernel space. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +asmlinkage void curve25519_neon(u8 mypublic[CURVE25519_KEY_SIZE], + const u8 secret[CURVE25519_KEY_SIZE], + const u8 basepoint[CURVE25519_KEY_SIZE]); + +static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_neon); + +void curve25519_arch(u8 out[CURVE25519_KEY_SIZE], + const u8 scalar[CURVE25519_KEY_SIZE], + const u8 point[CURVE25519_KEY_SIZE]) +{ + if (static_branch_likely(&have_neon) && may_use_simd()) { + kernel_neon_begin(); + curve25519_neon(out, scalar, point); + kernel_neon_end(); + } else { + curve25519_generic(out, scalar, point); + } +} +EXPORT_SYMBOL(curve25519_arch); + +void curve25519_base_arch(u8 pub[CURVE25519_KEY_SIZE], + const u8 secret[CURVE25519_KEY_SIZE]) +{ + return curve25519_arch(pub, secret, curve25519_base_point); +} +EXPORT_SYMBOL(curve25519_base_arch); + +static int curve25519_set_secret(struct crypto_kpp *tfm, const void *buf, + unsigned int len) +{ + u8 *secret = kpp_tfm_ctx(tfm); + + if (!len) + curve25519_generate_secret(secret); + else if (len == CURVE25519_KEY_SIZE && + crypto_memneq(buf, curve25519_null_point, CURVE25519_KEY_SIZE)) + memcpy(secret, buf, CURVE25519_KEY_SIZE); + else + return -EINVAL; + return 0; +} + +static int curve25519_compute_value(struct kpp_request *req) +{ + struct crypto_kpp *tfm = crypto_kpp_reqtfm(req); + const u8 *secret = kpp_tfm_ctx(tfm); + u8 public_key[CURVE25519_KEY_SIZE]; + u8 buf[CURVE25519_KEY_SIZE]; + int copied, nbytes; + u8 const *bp; + + if (req->src) { + copied = sg_copy_to_buffer(req->src, + sg_nents_for_len(req->src, + CURVE25519_KEY_SIZE), + public_key, CURVE25519_KEY_SIZE); + if (copied != CURVE25519_KEY_SIZE) + return -EINVAL; + bp = public_key; + } else { + bp = curve25519_base_point; + } + + curve25519_arch(buf, secret, bp); + + /* might want less than we've got */ + nbytes = min_t(size_t, CURVE25519_KEY_SIZE, req->dst_len); + copied = sg_copy_from_buffer(req->dst, sg_nents_for_len(req->dst, + nbytes), + buf, nbytes); + if (copied != nbytes) + return -EINVAL; + return 0; +} + +static unsigned int curve25519_max_size(struct crypto_kpp *tfm) +{ + return CURVE25519_KEY_SIZE; +} + +static struct kpp_alg curve25519_alg = { + .base.cra_name = "curve25519", + .base.cra_driver_name = "curve25519-neon", + .base.cra_priority = 200, + .base.cra_module = THIS_MODULE, + .base.cra_ctxsize = CURVE25519_KEY_SIZE, + + .set_secret = curve25519_set_secret, + .generate_public_key = curve25519_compute_value, + .compute_shared_secret = curve25519_compute_value, + .max_size = curve25519_max_size, +}; + +static int __init mod_init(void) +{ + if (elf_hwcap & HWCAP_NEON) { + static_branch_enable(&have_neon); + return IS_REACHABLE(CONFIG_CRYPTO_KPP) ? + crypto_register_kpp(&curve25519_alg) : 0; + } + return 0; +} + +static void __exit mod_exit(void) +{ + if (IS_REACHABLE(CONFIG_CRYPTO_KPP) && elf_hwcap & HWCAP_NEON) + crypto_unregister_kpp(&curve25519_alg); +} + +module_init(mod_init); +module_exit(mod_exit); + +MODULE_ALIAS_CRYPTO("curve25519"); +MODULE_ALIAS_CRYPTO("curve25519-neon"); +MODULE_LICENSE("GPL v2"); diff --git a/arch/arm/crypto/poly1305-armv4.pl b/arch/arm/crypto/poly1305-armv4.pl new file mode 100644 index 000000000000..6d79498d3115 --- /dev/null +++ b/arch/arm/crypto/poly1305-armv4.pl @@ -0,0 +1,1236 @@ +#!/usr/bin/env perl +# SPDX-License-Identifier: GPL-1.0+ OR BSD-3-Clause +# +# ==================================================================== +# Written by Andy Polyakov, @dot-asm, initially for the OpenSSL +# project. +# ==================================================================== +# +# IALU(*)/gcc-4.4 NEON +# +# ARM11xx(ARMv6) 7.78/+100% - +# Cortex-A5 6.35/+130% 3.00 +# Cortex-A8 6.25/+115% 2.36 +# Cortex-A9 5.10/+95% 2.55 +# Cortex-A15 3.85/+85% 1.25(**) +# Snapdragon S4 5.70/+100% 1.48(**) +# +# (*) this is for -march=armv6, i.e. with bunch of ldrb loading data; +# (**) these are trade-off results, they can be improved by ~8% but at +# the cost of 15/12% regression on Cortex-A5/A7, it's even possible +# to improve Cortex-A9 result, but then A5/A7 loose more than 20%; + +$flavour = shift; +if ($flavour=~/\w[\w\-]*\.\w+$/) { $output=$flavour; undef $flavour; } +else { while (($output=shift) && ($output!~/\w[\w\-]*\.\w+$/)) {} } + +if ($flavour && $flavour ne "void") { + $0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1; + ( $xlate="${dir}arm-xlate.pl" and -f $xlate ) or + ( $xlate="${dir}../../perlasm/arm-xlate.pl" and -f $xlate) or + die "can't locate arm-xlate.pl"; + + open STDOUT,"| \"$^X\" $xlate $flavour $output"; +} else { + open STDOUT,">$output"; +} + +($ctx,$inp,$len,$padbit)=map("r$_",(0..3)); + +$code.=<<___; +#ifndef __KERNEL__ +# include "arm_arch.h" +#else +# define __ARM_ARCH__ __LINUX_ARM_ARCH__ +# define __ARM_MAX_ARCH__ __LINUX_ARM_ARCH__ +# define poly1305_init poly1305_init_arm +# define poly1305_blocks poly1305_blocks_arm +# define poly1305_emit poly1305_emit_arm +.globl poly1305_blocks_neon +#endif + +#if defined(__thumb2__) +.syntax unified +.thumb +#else +.code 32 +#endif + +.text + +.globl poly1305_emit +.globl poly1305_blocks +.globl poly1305_init +.type poly1305_init,%function +.align 5 +poly1305_init: +.Lpoly1305_init: + stmdb sp!,{r4-r11} + + eor r3,r3,r3 + cmp $inp,#0 + str r3,[$ctx,#0] @ zero hash value + str r3,[$ctx,#4] + str r3,[$ctx,#8] + str r3,[$ctx,#12] + str r3,[$ctx,#16] + str r3,[$ctx,#36] @ clear is_base2_26 + add $ctx,$ctx,#20 + +#ifdef __thumb2__ + it eq +#endif + moveq r0,#0 + beq .Lno_key + +#if __ARM_MAX_ARCH__>=7 + mov r3,#-1 + str r3,[$ctx,#28] @ impossible key power value +# ifndef __KERNEL__ + adr r11,.Lpoly1305_init + ldr r12,.LOPENSSL_armcap +# endif +#endif + ldrb r4,[$inp,#0] + mov r10,#0x0fffffff + ldrb r5,[$inp,#1] + and r3,r10,#-4 @ 0x0ffffffc + ldrb r6,[$inp,#2] + ldrb r7,[$inp,#3] + orr r4,r4,r5,lsl#8 + ldrb r5,[$inp,#4] + orr r4,r4,r6,lsl#16 + ldrb r6,[$inp,#5] + orr r4,r4,r7,lsl#24 + ldrb r7,[$inp,#6] + and r4,r4,r10 + +#if __ARM_MAX_ARCH__>=7 && !defined(__KERNEL__) +# if !defined(_WIN32) + ldr r12,[r11,r12] @ OPENSSL_armcap_P +# endif +# if defined(__APPLE__) || defined(_WIN32) + ldr r12,[r12] +# endif +#endif + ldrb r8,[$inp,#7] + orr r5,r5,r6,lsl#8 + ldrb r6,[$inp,#8] + orr r5,r5,r7,lsl#16 + ldrb r7,[$inp,#9] + orr r5,r5,r8,lsl#24 + ldrb r8,[$inp,#10] + and r5,r5,r3 + +#if __ARM_MAX_ARCH__>=7 && !defined(__KERNEL__) + tst r12,#ARMV7_NEON @ check for NEON +# ifdef __thumb2__ + adr r9,.Lpoly1305_blocks_neon + adr r11,.Lpoly1305_blocks + it ne + movne r11,r9 + adr r12,.Lpoly1305_emit + orr r11,r11,#1 @ thumb-ify addresses + orr r12,r12,#1 +# else + add r12,r11,#(.Lpoly1305_emit-.Lpoly1305_init) + ite eq + addeq r11,r11,#(.Lpoly1305_blocks-.Lpoly1305_init) + addne r11,r11,#(.Lpoly1305_blocks_neon-.Lpoly1305_init) +# endif +#endif + ldrb r9,[$inp,#11] + orr r6,r6,r7,lsl#8 + ldrb r7,[$inp,#12] + orr r6,r6,r8,lsl#16 + ldrb r8,[$inp,#13] + orr r6,r6,r9,lsl#24 + ldrb r9,[$inp,#14] + and r6,r6,r3 + + ldrb r10,[$inp,#15] + orr r7,r7,r8,lsl#8 + str r4,[$ctx,#0] + orr r7,r7,r9,lsl#16 + str r5,[$ctx,#4] + orr r7,r7,r10,lsl#24 + str r6,[$ctx,#8] + and r7,r7,r3 + str r7,[$ctx,#12] +#if __ARM_MAX_ARCH__>=7 && !defined(__KERNEL__) + stmia r2,{r11,r12} @ fill functions table + mov r0,#1 +#else + mov r0,#0 +#endif +.Lno_key: + ldmia sp!,{r4-r11} +#if __ARM_ARCH__>=5 + ret @ bx lr +#else + tst lr,#1 + moveq pc,lr @ be binary compatible with V4, yet + bx lr @ interoperable with Thumb ISA:-) +#endif +.size poly1305_init,.-poly1305_init +___ +{ +my ($h0,$h1,$h2,$h3,$h4,$r0,$r1,$r2,$r3)=map("r$_",(4..12)); +my ($s1,$s2,$s3)=($r1,$r2,$r3); + +$code.=<<___; +.type poly1305_blocks,%function +.align 5 +poly1305_blocks: +.Lpoly1305_blocks: + stmdb sp!,{r3-r11,lr} + + ands $len,$len,#-16 + beq .Lno_data + + add $len,$len,$inp @ end pointer + sub sp,sp,#32 + +#if __ARM_ARCH__<7 + ldmia $ctx,{$h0-$r3} @ load context + add $ctx,$ctx,#20 + str $len,[sp,#16] @ offload stuff + str $ctx,[sp,#12] +#else + ldr lr,[$ctx,#36] @ is_base2_26 + ldmia $ctx!,{$h0-$h4} @ load hash value + str $len,[sp,#16] @ offload stuff + str $ctx,[sp,#12] + + adds $r0,$h0,$h1,lsl#26 @ base 2^26 -> base 2^32 + mov $r1,$h1,lsr#6 + adcs $r1,$r1,$h2,lsl#20 + mov $r2,$h2,lsr#12 + adcs $r2,$r2,$h3,lsl#14 + mov $r3,$h3,lsr#18 + adcs $r3,$r3,$h4,lsl#8 + mov $len,#0 + teq lr,#0 + str $len,[$ctx,#16] @ clear is_base2_26 + adc $len,$len,$h4,lsr#24 + + itttt ne + movne $h0,$r0 @ choose between radixes + movne $h1,$r1 + movne $h2,$r2 + movne $h3,$r3 + ldmia $ctx,{$r0-$r3} @ load key + it ne + movne $h4,$len +#endif + + mov lr,$inp + cmp $padbit,#0 + str $r1,[sp,#20] + str $r2,[sp,#24] + str $r3,[sp,#28] + b .Loop + +.align 4 +.Loop: +#if __ARM_ARCH__<7 + ldrb r0,[lr],#16 @ load input +# ifdef __thumb2__ + it hi +# endif + addhi $h4,$h4,#1 @ 1<<128 + ldrb r1,[lr,#-15] + ldrb r2,[lr,#-14] + ldrb r3,[lr,#-13] + orr r1,r0,r1,lsl#8 + ldrb r0,[lr,#-12] + orr r2,r1,r2,lsl#16 + ldrb r1,[lr,#-11] + orr r3,r2,r3,lsl#24 + ldrb r2,[lr,#-10] + adds $h0,$h0,r3 @ accumulate input + + ldrb r3,[lr,#-9] + orr r1,r0,r1,lsl#8 + ldrb r0,[lr,#-8] + orr r2,r1,r2,lsl#16 + ldrb r1,[lr,#-7] + orr r3,r2,r3,lsl#24 + ldrb r2,[lr,#-6] + adcs $h1,$h1,r3 + + ldrb r3,[lr,#-5] + orr r1,r0,r1,lsl#8 + ldrb r0,[lr,#-4] + orr r2,r1,r2,lsl#16 + ldrb r1,[lr,#-3] + orr r3,r2,r3,lsl#24 + ldrb r2,[lr,#-2] + adcs $h2,$h2,r3 + + ldrb r3,[lr,#-1] + orr r1,r0,r1,lsl#8 + str lr,[sp,#8] @ offload input pointer + orr r2,r1,r2,lsl#16 + add $s1,$r1,$r1,lsr#2 + orr r3,r2,r3,lsl#24 +#else + ldr r0,[lr],#16 @ load input + it hi + addhi $h4,$h4,#1 @ padbit + ldr r1,[lr,#-12] + ldr r2,[lr,#-8] + ldr r3,[lr,#-4] +# ifdef __ARMEB__ + rev r0,r0 + rev r1,r1 + rev r2,r2 + rev r3,r3 +# endif + adds $h0,$h0,r0 @ accumulate input + str lr,[sp,#8] @ offload input pointer + adcs $h1,$h1,r1 + add $s1,$r1,$r1,lsr#2 + adcs $h2,$h2,r2 +#endif + add $s2,$r2,$r2,lsr#2 + adcs $h3,$h3,r3 + add $s3,$r3,$r3,lsr#2 + + umull r2,r3,$h1,$r0 + adc $h4,$h4,#0 + umull r0,r1,$h0,$r0 + umlal r2,r3,$h4,$s1 + umlal r0,r1,$h3,$s1 + ldr $r1,[sp,#20] @ reload $r1 + umlal r2,r3,$h2,$s3 + umlal r0,r1,$h1,$s3 + umlal r2,r3,$h3,$s2 + umlal r0,r1,$h2,$s2 + umlal r2,r3,$h0,$r1 + str r0,[sp,#0] @ future $h0 + mul r0,$s2,$h4 + ldr $r2,[sp,#24] @ reload $r2 + adds r2,r2,r1 @ d1+=d0>>32 + eor r1,r1,r1 + adc lr,r3,#0 @ future $h2 + str r2,[sp,#4] @ future $h1 + + mul r2,$s3,$h4 + eor r3,r3,r3 + umlal r0,r1,$h3,$s3 + ldr $r3,[sp,#28] @ reload $r3 + umlal r2,r3,$h3,$r0 + umlal r0,r1,$h2,$r0 + umlal r2,r3,$h2,$r1 + umlal r0,r1,$h1,$r1 + umlal r2,r3,$h1,$r2 + umlal r0,r1,$h0,$r2 + umlal r2,r3,$h0,$r3 + ldr $h0,[sp,#0] + mul $h4,$r0,$h4 + ldr $h1,[sp,#4] + + adds $h2,lr,r0 @ d2+=d1>>32 + ldr lr,[sp,#8] @ reload input pointer + adc r1,r1,#0 + adds $h3,r2,r1 @ d3+=d2>>32 + ldr r0,[sp,#16] @ reload end pointer + adc r3,r3,#0 + add $h4,$h4,r3 @ h4+=d3>>32 + + and r1,$h4,#-4 + and $h4,$h4,#3 + add r1,r1,r1,lsr#2 @ *=5 + adds $h0,$h0,r1 + adcs $h1,$h1,#0 + adcs $h2,$h2,#0 + adcs $h3,$h3,#0 + adc $h4,$h4,#0 + + cmp r0,lr @ done yet? + bhi .Loop + + ldr $ctx,[sp,#12] + add sp,sp,#32 + stmdb $ctx,{$h0-$h4} @ store the result + +.Lno_data: +#if __ARM_ARCH__>=5 + ldmia sp!,{r3-r11,pc} +#else + ldmia sp!,{r3-r11,lr} + tst lr,#1 + moveq pc,lr @ be binary compatible with V4, yet + bx lr @ interoperable with Thumb ISA:-) +#endif +.size poly1305_blocks,.-poly1305_blocks +___ +} +{ +my ($ctx,$mac,$nonce)=map("r$_",(0..2)); +my ($h0,$h1,$h2,$h3,$h4,$g0,$g1,$g2,$g3)=map("r$_",(3..11)); +my $g4=$ctx; + +$code.=<<___; +.type poly1305_emit,%function +.align 5 +poly1305_emit: +.Lpoly1305_emit: + stmdb sp!,{r4-r11} + + ldmia $ctx,{$h0-$h4} + +#if __ARM_ARCH__>=7 + ldr ip,[$ctx,#36] @ is_base2_26 + + adds $g0,$h0,$h1,lsl#26 @ base 2^26 -> base 2^32 + mov $g1,$h1,lsr#6 + adcs $g1,$g1,$h2,lsl#20 + mov $g2,$h2,lsr#12 + adcs $g2,$g2,$h3,lsl#14 + mov $g3,$h3,lsr#18 + adcs $g3,$g3,$h4,lsl#8 + mov $g4,#0 + adc $g4,$g4,$h4,lsr#24 + + tst ip,ip + itttt ne + movne $h0,$g0 + movne $h1,$g1 + movne $h2,$g2 + movne $h3,$g3 + it ne + movne $h4,$g4 +#endif + + adds $g0,$h0,#5 @ compare to modulus + adcs $g1,$h1,#0 + adcs $g2,$h2,#0 + adcs $g3,$h3,#0 + adc $g4,$h4,#0 + tst $g4,#4 @ did it carry/borrow? + +#ifdef __thumb2__ + it ne +#endif + movne $h0,$g0 + ldr $g0,[$nonce,#0] +#ifdef __thumb2__ + it ne +#endif + movne $h1,$g1 + ldr $g1,[$nonce,#4] +#ifdef __thumb2__ + it ne +#endif + movne $h2,$g2 + ldr $g2,[$nonce,#8] +#ifdef __thumb2__ + it ne +#endif + movne $h3,$g3 + ldr $g3,[$nonce,#12] + + adds $h0,$h0,$g0 + adcs $h1,$h1,$g1 + adcs $h2,$h2,$g2 + adc $h3,$h3,$g3 + +#if __ARM_ARCH__>=7 +# ifdef __ARMEB__ + rev $h0,$h0 + rev $h1,$h1 + rev $h2,$h2 + rev $h3,$h3 +# endif + str $h0,[$mac,#0] + str $h1,[$mac,#4] + str $h2,[$mac,#8] + str $h3,[$mac,#12] +#else + strb $h0,[$mac,#0] + mov $h0,$h0,lsr#8 + strb $h1,[$mac,#4] + mov $h1,$h1,lsr#8 + strb $h2,[$mac,#8] + mov $h2,$h2,lsr#8 + strb $h3,[$mac,#12] + mov $h3,$h3,lsr#8 + + strb $h0,[$mac,#1] + mov $h0,$h0,lsr#8 + strb $h1,[$mac,#5] + mov $h1,$h1,lsr#8 + strb $h2,[$mac,#9] + mov $h2,$h2,lsr#8 + strb $h3,[$mac,#13] + mov $h3,$h3,lsr#8 + + strb $h0,[$mac,#2] + mov $h0,$h0,lsr#8 + strb $h1,[$mac,#6] + mov $h1,$h1,lsr#8 + strb $h2,[$mac,#10] + mov $h2,$h2,lsr#8 + strb $h3,[$mac,#14] + mov $h3,$h3,lsr#8 + + strb $h0,[$mac,#3] + strb $h1,[$mac,#7] + strb $h2,[$mac,#11] + strb $h3,[$mac,#15] +#endif + ldmia sp!,{r4-r11} +#if __ARM_ARCH__>=5 + ret @ bx lr +#else + tst lr,#1 + moveq pc,lr @ be binary compatible with V4, yet + bx lr @ interoperable with Thumb ISA:-) +#endif +.size poly1305_emit,.-poly1305_emit +___ +{ +my ($R0,$R1,$S1,$R2,$S2,$R3,$S3,$R4,$S4) = map("d$_",(0..9)); +my ($D0,$D1,$D2,$D3,$D4, $H0,$H1,$H2,$H3,$H4) = map("q$_",(5..14)); +my ($T0,$T1,$MASK) = map("q$_",(15,4,0)); + +my ($in2,$zeros,$tbl0,$tbl1) = map("r$_",(4..7)); + +$code.=<<___; +#if __ARM_MAX_ARCH__>=7 +.fpu neon + +.type poly1305_init_neon,%function +.align 5 +poly1305_init_neon: +.Lpoly1305_init_neon: + ldr r3,[$ctx,#48] @ first table element + cmp r3,#-1 @ is value impossible? + bne .Lno_init_neon + + ldr r4,[$ctx,#20] @ load key base 2^32 + ldr r5,[$ctx,#24] + ldr r6,[$ctx,#28] + ldr r7,[$ctx,#32] + + and r2,r4,#0x03ffffff @ base 2^32 -> base 2^26 + mov r3,r4,lsr#26 + mov r4,r5,lsr#20 + orr r3,r3,r5,lsl#6 + mov r5,r6,lsr#14 + orr r4,r4,r6,lsl#12 + mov r6,r7,lsr#8 + orr r5,r5,r7,lsl#18 + and r3,r3,#0x03ffffff + and r4,r4,#0x03ffffff + and r5,r5,#0x03ffffff + + vdup.32 $R0,r2 @ r^1 in both lanes + add r2,r3,r3,lsl#2 @ *5 + vdup.32 $R1,r3 + add r3,r4,r4,lsl#2 + vdup.32 $S1,r2 + vdup.32 $R2,r4 + add r4,r5,r5,lsl#2 + vdup.32 $S2,r3 + vdup.32 $R3,r5 + add r5,r6,r6,lsl#2 + vdup.32 $S3,r4 + vdup.32 $R4,r6 + vdup.32 $S4,r5 + + mov $zeros,#2 @ counter + +.Lsquare_neon: + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ d0 = h0*r0 + h4*5*r1 + h3*5*r2 + h2*5*r3 + h1*5*r4 + @ d1 = h1*r0 + h0*r1 + h4*5*r2 + h3*5*r3 + h2*5*r4 + @ d2 = h2*r0 + h1*r1 + h0*r2 + h4*5*r3 + h3*5*r4 + @ d3 = h3*r0 + h2*r1 + h1*r2 + h0*r3 + h4*5*r4 + @ d4 = h4*r0 + h3*r1 + h2*r2 + h1*r3 + h0*r4 + + vmull.u32 $D0,$R0,${R0}[1] + vmull.u32 $D1,$R1,${R0}[1] + vmull.u32 $D2,$R2,${R0}[1] + vmull.u32 $D3,$R3,${R0}[1] + vmull.u32 $D4,$R4,${R0}[1] + + vmlal.u32 $D0,$R4,${S1}[1] + vmlal.u32 $D1,$R0,${R1}[1] + vmlal.u32 $D2,$R1,${R1}[1] + vmlal.u32 $D3,$R2,${R1}[1] + vmlal.u32 $D4,$R3,${R1}[1] + + vmlal.u32 $D0,$R3,${S2}[1] + vmlal.u32 $D1,$R4,${S2}[1] + vmlal.u32 $D3,$R1,${R2}[1] + vmlal.u32 $D2,$R0,${R2}[1] + vmlal.u32 $D4,$R2,${R2}[1] + + vmlal.u32 $D0,$R2,${S3}[1] + vmlal.u32 $D3,$R0,${R3}[1] + vmlal.u32 $D1,$R3,${S3}[1] + vmlal.u32 $D2,$R4,${S3}[1] + vmlal.u32 $D4,$R1,${R3}[1] + + vmlal.u32 $D3,$R4,${S4}[1] + vmlal.u32 $D0,$R1,${S4}[1] + vmlal.u32 $D1,$R2,${S4}[1] + vmlal.u32 $D2,$R3,${S4}[1] + vmlal.u32 $D4,$R0,${R4}[1] + + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ lazy reduction as discussed in "NEON crypto" by D.J. Bernstein + @ and P. Schwabe + @ + @ H0>>+H1>>+H2>>+H3>>+H4 + @ H3>>+H4>>*5+H0>>+H1 + @ + @ Trivia. + @ + @ Result of multiplication of n-bit number by m-bit number is + @ n+m bits wide. However! Even though 2^n is a n+1-bit number, + @ m-bit number multiplied by 2^n is still n+m bits wide. + @ + @ Sum of two n-bit numbers is n+1 bits wide, sum of three - n+2, + @ and so is sum of four. Sum of 2^m n-m-bit numbers and n-bit + @ one is n+1 bits wide. + @ + @ >>+ denotes Hnext += Hn>>26, Hn &= 0x3ffffff. This means that + @ H0, H2, H3 are guaranteed to be 26 bits wide, while H1 and H4 + @ can be 27. However! In cases when their width exceeds 26 bits + @ they are limited by 2^26+2^6. This in turn means that *sum* + @ of the products with these values can still be viewed as sum + @ of 52-bit numbers as long as the amount of addends is not a + @ power of 2. For example, + @ + @ H4 = H4*R0 + H3*R1 + H2*R2 + H1*R3 + H0 * R4, + @ + @ which can't be larger than 5 * (2^26 + 2^6) * (2^26 + 2^6), or + @ 5 * (2^52 + 2*2^32 + 2^12), which in turn is smaller than + @ 8 * (2^52) or 2^55. However, the value is then multiplied by + @ by 5, so we should be looking at 5 * 5 * (2^52 + 2^33 + 2^12), + @ which is less than 32 * (2^52) or 2^57. And when processing + @ data we are looking at triple as many addends... + @ + @ In key setup procedure pre-reduced H0 is limited by 5*4+1 and + @ 5*H4 - by 5*5 52-bit addends, or 57 bits. But when hashing the + @ input H0 is limited by (5*4+1)*3 addends, or 58 bits, while + @ 5*H4 by 5*5*3, or 59[!] bits. How is this relevant? vmlal.u32 + @ instruction accepts 2x32-bit input and writes 2x64-bit result. + @ This means that result of reduction have to be compressed upon + @ loop wrap-around. This can be done in the process of reduction + @ to minimize amount of instructions [as well as amount of + @ 128-bit instructions, which benefits low-end processors], but + @ one has to watch for H2 (which is narrower than H0) and 5*H4 + @ not being wider than 58 bits, so that result of right shift + @ by 26 bits fits in 32 bits. This is also useful on x86, + @ because it allows to use paddd in place for paddq, which + @ benefits Atom, where paddq is ridiculously slow. + + vshr.u64 $T0,$D3,#26 + vmovn.i64 $D3#lo,$D3 + vshr.u64 $T1,$D0,#26 + vmovn.i64 $D0#lo,$D0 + vadd.i64 $D4,$D4,$T0 @ h3 -> h4 + vbic.i32 $D3#lo,#0xfc000000 @ &=0x03ffffff + vadd.i64 $D1,$D1,$T1 @ h0 -> h1 + vbic.i32 $D0#lo,#0xfc000000 + + vshrn.u64 $T0#lo,$D4,#26 + vmovn.i64 $D4#lo,$D4 + vshr.u64 $T1,$D1,#26 + vmovn.i64 $D1#lo,$D1 + vadd.i64 $D2,$D2,$T1 @ h1 -> h2 + vbic.i32 $D4#lo,#0xfc000000 + vbic.i32 $D1#lo,#0xfc000000 + + vadd.i32 $D0#lo,$D0#lo,$T0#lo + vshl.u32 $T0#lo,$T0#lo,#2 + vshrn.u64 $T1#lo,$D2,#26 + vmovn.i64 $D2#lo,$D2 + vadd.i32 $D0#lo,$D0#lo,$T0#lo @ h4 -> h0 + vadd.i32 $D3#lo,$D3#lo,$T1#lo @ h2 -> h3 + vbic.i32 $D2#lo,#0xfc000000 + + vshr.u32 $T0#lo,$D0#lo,#26 + vbic.i32 $D0#lo,#0xfc000000 + vshr.u32 $T1#lo,$D3#lo,#26 + vbic.i32 $D3#lo,#0xfc000000 + vadd.i32 $D1#lo,$D1#lo,$T0#lo @ h0 -> h1 + vadd.i32 $D4#lo,$D4#lo,$T1#lo @ h3 -> h4 + + subs $zeros,$zeros,#1 + beq .Lsquare_break_neon + + add $tbl0,$ctx,#(48+0*9*4) + add $tbl1,$ctx,#(48+1*9*4) + + vtrn.32 $R0,$D0#lo @ r^2:r^1 + vtrn.32 $R2,$D2#lo + vtrn.32 $R3,$D3#lo + vtrn.32 $R1,$D1#lo + vtrn.32 $R4,$D4#lo + + vshl.u32 $S2,$R2,#2 @ *5 + vshl.u32 $S3,$R3,#2 + vshl.u32 $S1,$R1,#2 + vshl.u32 $S4,$R4,#2 + vadd.i32 $S2,$S2,$R2 + vadd.i32 $S1,$S1,$R1 + vadd.i32 $S3,$S3,$R3 + vadd.i32 $S4,$S4,$R4 + + vst4.32 {${R0}[0],${R1}[0],${S1}[0],${R2}[0]},[$tbl0]! + vst4.32 {${R0}[1],${R1}[1],${S1}[1],${R2}[1]},[$tbl1]! + vst4.32 {${S2}[0],${R3}[0],${S3}[0],${R4}[0]},[$tbl0]! + vst4.32 {${S2}[1],${R3}[1],${S3}[1],${R4}[1]},[$tbl1]! + vst1.32 {${S4}[0]},[$tbl0,:32] + vst1.32 {${S4}[1]},[$tbl1,:32] + + b .Lsquare_neon + +.align 4 +.Lsquare_break_neon: + add $tbl0,$ctx,#(48+2*4*9) + add $tbl1,$ctx,#(48+3*4*9) + + vmov $R0,$D0#lo @ r^4:r^3 + vshl.u32 $S1,$D1#lo,#2 @ *5 + vmov $R1,$D1#lo + vshl.u32 $S2,$D2#lo,#2 + vmov $R2,$D2#lo + vshl.u32 $S3,$D3#lo,#2 + vmov $R3,$D3#lo + vshl.u32 $S4,$D4#lo,#2 + vmov $R4,$D4#lo + vadd.i32 $S1,$S1,$D1#lo + vadd.i32 $S2,$S2,$D2#lo + vadd.i32 $S3,$S3,$D3#lo + vadd.i32 $S4,$S4,$D4#lo + + vst4.32 {${R0}[0],${R1}[0],${S1}[0],${R2}[0]},[$tbl0]! + vst4.32 {${R0}[1],${R1}[1],${S1}[1],${R2}[1]},[$tbl1]! + vst4.32 {${S2}[0],${R3}[0],${S3}[0],${R4}[0]},[$tbl0]! + vst4.32 {${S2}[1],${R3}[1],${S3}[1],${R4}[1]},[$tbl1]! + vst1.32 {${S4}[0]},[$tbl0] + vst1.32 {${S4}[1]},[$tbl1] + +.Lno_init_neon: + ret @ bx lr +.size poly1305_init_neon,.-poly1305_init_neon + +.type poly1305_blocks_neon,%function +.align 5 +poly1305_blocks_neon: +.Lpoly1305_blocks_neon: + ldr ip,[$ctx,#36] @ is_base2_26 + + cmp $len,#64 + blo .Lpoly1305_blocks + + stmdb sp!,{r4-r7} + vstmdb sp!,{d8-d15} @ ABI specification says so + + tst ip,ip @ is_base2_26? + bne .Lbase2_26_neon + + stmdb sp!,{r1-r3,lr} + bl .Lpoly1305_init_neon + + ldr r4,[$ctx,#0] @ load hash value base 2^32 + ldr r5,[$ctx,#4] + ldr r6,[$ctx,#8] + ldr r7,[$ctx,#12] + ldr ip,[$ctx,#16] + + and r2,r4,#0x03ffffff @ base 2^32 -> base 2^26 + mov r3,r4,lsr#26 + veor $D0#lo,$D0#lo,$D0#lo + mov r4,r5,lsr#20 + orr r3,r3,r5,lsl#6 + veor $D1#lo,$D1#lo,$D1#lo + mov r5,r6,lsr#14 + orr r4,r4,r6,lsl#12 + veor $D2#lo,$D2#lo,$D2#lo + mov r6,r7,lsr#8 + orr r5,r5,r7,lsl#18 + veor $D3#lo,$D3#lo,$D3#lo + and r3,r3,#0x03ffffff + orr r6,r6,ip,lsl#24 + veor $D4#lo,$D4#lo,$D4#lo + and r4,r4,#0x03ffffff + mov r1,#1 + and r5,r5,#0x03ffffff + str r1,[$ctx,#36] @ set is_base2_26 + + vmov.32 $D0#lo[0],r2 + vmov.32 $D1#lo[0],r3 + vmov.32 $D2#lo[0],r4 + vmov.32 $D3#lo[0],r5 + vmov.32 $D4#lo[0],r6 + adr $zeros,.Lzeros + + ldmia sp!,{r1-r3,lr} + b .Lhash_loaded + +.align 4 +.Lbase2_26_neon: + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ load hash value + + veor $D0#lo,$D0#lo,$D0#lo + veor $D1#lo,$D1#lo,$D1#lo + veor $D2#lo,$D2#lo,$D2#lo + veor $D3#lo,$D3#lo,$D3#lo + veor $D4#lo,$D4#lo,$D4#lo + vld4.32 {$D0#lo[0],$D1#lo[0],$D2#lo[0],$D3#lo[0]},[$ctx]! + adr $zeros,.Lzeros + vld1.32 {$D4#lo[0]},[$ctx] + sub $ctx,$ctx,#16 @ rewind + +.Lhash_loaded: + add $in2,$inp,#32 + mov $padbit,$padbit,lsl#24 + tst $len,#31 + beq .Leven + + vld4.32 {$H0#lo[0],$H1#lo[0],$H2#lo[0],$H3#lo[0]},[$inp]! + vmov.32 $H4#lo[0],$padbit + sub $len,$len,#16 + add $in2,$inp,#32 + +# ifdef __ARMEB__ + vrev32.8 $H0,$H0 + vrev32.8 $H3,$H3 + vrev32.8 $H1,$H1 + vrev32.8 $H2,$H2 +# endif + vsri.u32 $H4#lo,$H3#lo,#8 @ base 2^32 -> base 2^26 + vshl.u32 $H3#lo,$H3#lo,#18 + + vsri.u32 $H3#lo,$H2#lo,#14 + vshl.u32 $H2#lo,$H2#lo,#12 + vadd.i32 $H4#hi,$H4#lo,$D4#lo @ add hash value and move to #hi + + vbic.i32 $H3#lo,#0xfc000000 + vsri.u32 $H2#lo,$H1#lo,#20 + vshl.u32 $H1#lo,$H1#lo,#6 + + vbic.i32 $H2#lo,#0xfc000000 + vsri.u32 $H1#lo,$H0#lo,#26 + vadd.i32 $H3#hi,$H3#lo,$D3#lo + + vbic.i32 $H0#lo,#0xfc000000 + vbic.i32 $H1#lo,#0xfc000000 + vadd.i32 $H2#hi,$H2#lo,$D2#lo + + vadd.i32 $H0#hi,$H0#lo,$D0#lo + vadd.i32 $H1#hi,$H1#lo,$D1#lo + + mov $tbl1,$zeros + add $tbl0,$ctx,#48 + + cmp $len,$len + b .Long_tail + +.align 4 +.Leven: + subs $len,$len,#64 + it lo + movlo $in2,$zeros + + vmov.i32 $H4,#1<<24 @ padbit, yes, always + vld4.32 {$H0#lo,$H1#lo,$H2#lo,$H3#lo},[$inp] @ inp[0:1] + add $inp,$inp,#64 + vld4.32 {$H0#hi,$H1#hi,$H2#hi,$H3#hi},[$in2] @ inp[2:3] (or 0) + add $in2,$in2,#64 + itt hi + addhi $tbl1,$ctx,#(48+1*9*4) + addhi $tbl0,$ctx,#(48+3*9*4) + +# ifdef __ARMEB__ + vrev32.8 $H0,$H0 + vrev32.8 $H3,$H3 + vrev32.8 $H1,$H1 + vrev32.8 $H2,$H2 +# endif + vsri.u32 $H4,$H3,#8 @ base 2^32 -> base 2^26 + vshl.u32 $H3,$H3,#18 + + vsri.u32 $H3,$H2,#14 + vshl.u32 $H2,$H2,#12 + + vbic.i32 $H3,#0xfc000000 + vsri.u32 $H2,$H1,#20 + vshl.u32 $H1,$H1,#6 + + vbic.i32 $H2,#0xfc000000 + vsri.u32 $H1,$H0,#26 + + vbic.i32 $H0,#0xfc000000 + vbic.i32 $H1,#0xfc000000 + + bls .Lskip_loop + + vld4.32 {${R0}[1],${R1}[1],${S1}[1],${R2}[1]},[$tbl1]! @ load r^2 + vld4.32 {${R0}[0],${R1}[0],${S1}[0],${R2}[0]},[$tbl0]! @ load r^4 + vld4.32 {${S2}[1],${R3}[1],${S3}[1],${R4}[1]},[$tbl1]! + vld4.32 {${S2}[0],${R3}[0],${S3}[0],${R4}[0]},[$tbl0]! + b .Loop_neon + +.align 5 +.Loop_neon: + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ ((inp[0]*r^4+inp[2]*r^2+inp[4])*r^4+inp[6]*r^2 + @ ((inp[1]*r^4+inp[3]*r^2+inp[5])*r^3+inp[7]*r + @ \___________________/ + @ ((inp[0]*r^4+inp[2]*r^2+inp[4])*r^4+inp[6]*r^2+inp[8])*r^2 + @ ((inp[1]*r^4+inp[3]*r^2+inp[5])*r^4+inp[7]*r^2+inp[9])*r + @ \___________________/ \____________________/ + @ + @ Note that we start with inp[2:3]*r^2. This is because it + @ doesn't depend on reduction in previous iteration. + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ d4 = h4*r0 + h3*r1 + h2*r2 + h1*r3 + h0*r4 + @ d3 = h3*r0 + h2*r1 + h1*r2 + h0*r3 + h4*5*r4 + @ d2 = h2*r0 + h1*r1 + h0*r2 + h4*5*r3 + h3*5*r4 + @ d1 = h1*r0 + h0*r1 + h4*5*r2 + h3*5*r3 + h2*5*r4 + @ d0 = h0*r0 + h4*5*r1 + h3*5*r2 + h2*5*r3 + h1*5*r4 + + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ inp[2:3]*r^2 + + vadd.i32 $H2#lo,$H2#lo,$D2#lo @ accumulate inp[0:1] + vmull.u32 $D2,$H2#hi,${R0}[1] + vadd.i32 $H0#lo,$H0#lo,$D0#lo + vmull.u32 $D0,$H0#hi,${R0}[1] + vadd.i32 $H3#lo,$H3#lo,$D3#lo + vmull.u32 $D3,$H3#hi,${R0}[1] + vmlal.u32 $D2,$H1#hi,${R1}[1] + vadd.i32 $H1#lo,$H1#lo,$D1#lo + vmull.u32 $D1,$H1#hi,${R0}[1] + + vadd.i32 $H4#lo,$H4#lo,$D4#lo + vmull.u32 $D4,$H4#hi,${R0}[1] + subs $len,$len,#64 + vmlal.u32 $D0,$H4#hi,${S1}[1] + it lo + movlo $in2,$zeros + vmlal.u32 $D3,$H2#hi,${R1}[1] + vld1.32 ${S4}[1],[$tbl1,:32] + vmlal.u32 $D1,$H0#hi,${R1}[1] + vmlal.u32 $D4,$H3#hi,${R1}[1] + + vmlal.u32 $D0,$H3#hi,${S2}[1] + vmlal.u32 $D3,$H1#hi,${R2}[1] + vmlal.u32 $D4,$H2#hi,${R2}[1] + vmlal.u32 $D1,$H4#hi,${S2}[1] + vmlal.u32 $D2,$H0#hi,${R2}[1] + + vmlal.u32 $D3,$H0#hi,${R3}[1] + vmlal.u32 $D0,$H2#hi,${S3}[1] + vmlal.u32 $D4,$H1#hi,${R3}[1] + vmlal.u32 $D1,$H3#hi,${S3}[1] + vmlal.u32 $D2,$H4#hi,${S3}[1] + + vmlal.u32 $D3,$H4#hi,${S4}[1] + vmlal.u32 $D0,$H1#hi,${S4}[1] + vmlal.u32 $D4,$H0#hi,${R4}[1] + vmlal.u32 $D1,$H2#hi,${S4}[1] + vmlal.u32 $D2,$H3#hi,${S4}[1] + + vld4.32 {$H0#hi,$H1#hi,$H2#hi,$H3#hi},[$in2] @ inp[2:3] (or 0) + add $in2,$in2,#64 + + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ (hash+inp[0:1])*r^4 and accumulate + + vmlal.u32 $D3,$H3#lo,${R0}[0] + vmlal.u32 $D0,$H0#lo,${R0}[0] + vmlal.u32 $D4,$H4#lo,${R0}[0] + vmlal.u32 $D1,$H1#lo,${R0}[0] + vmlal.u32 $D2,$H2#lo,${R0}[0] + vld1.32 ${S4}[0],[$tbl0,:32] + + vmlal.u32 $D3,$H2#lo,${R1}[0] + vmlal.u32 $D0,$H4#lo,${S1}[0] + vmlal.u32 $D4,$H3#lo,${R1}[0] + vmlal.u32 $D1,$H0#lo,${R1}[0] + vmlal.u32 $D2,$H1#lo,${R1}[0] + + vmlal.u32 $D3,$H1#lo,${R2}[0] + vmlal.u32 $D0,$H3#lo,${S2}[0] + vmlal.u32 $D4,$H2#lo,${R2}[0] + vmlal.u32 $D1,$H4#lo,${S2}[0] + vmlal.u32 $D2,$H0#lo,${R2}[0] + + vmlal.u32 $D3,$H0#lo,${R3}[0] + vmlal.u32 $D0,$H2#lo,${S3}[0] + vmlal.u32 $D4,$H1#lo,${R3}[0] + vmlal.u32 $D1,$H3#lo,${S3}[0] + vmlal.u32 $D3,$H4#lo,${S4}[0] + + vmlal.u32 $D2,$H4#lo,${S3}[0] + vmlal.u32 $D0,$H1#lo,${S4}[0] + vmlal.u32 $D4,$H0#lo,${R4}[0] + vmov.i32 $H4,#1<<24 @ padbit, yes, always + vmlal.u32 $D1,$H2#lo,${S4}[0] + vmlal.u32 $D2,$H3#lo,${S4}[0] + + vld4.32 {$H0#lo,$H1#lo,$H2#lo,$H3#lo},[$inp] @ inp[0:1] + add $inp,$inp,#64 +# ifdef __ARMEB__ + vrev32.8 $H0,$H0 + vrev32.8 $H1,$H1 + vrev32.8 $H2,$H2 + vrev32.8 $H3,$H3 +# endif + + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ lazy reduction interleaved with base 2^32 -> base 2^26 of + @ inp[0:3] previously loaded to $H0-$H3 and smashed to $H0-$H4. + + vshr.u64 $T0,$D3,#26 + vmovn.i64 $D3#lo,$D3 + vshr.u64 $T1,$D0,#26 + vmovn.i64 $D0#lo,$D0 + vadd.i64 $D4,$D4,$T0 @ h3 -> h4 + vbic.i32 $D3#lo,#0xfc000000 + vsri.u32 $H4,$H3,#8 @ base 2^32 -> base 2^26 + vadd.i64 $D1,$D1,$T1 @ h0 -> h1 + vshl.u32 $H3,$H3,#18 + vbic.i32 $D0#lo,#0xfc000000 + + vshrn.u64 $T0#lo,$D4,#26 + vmovn.i64 $D4#lo,$D4 + vshr.u64 $T1,$D1,#26 + vmovn.i64 $D1#lo,$D1 + vadd.i64 $D2,$D2,$T1 @ h1 -> h2 + vsri.u32 $H3,$H2,#14 + vbic.i32 $D4#lo,#0xfc000000 + vshl.u32 $H2,$H2,#12 + vbic.i32 $D1#lo,#0xfc000000 + + vadd.i32 $D0#lo,$D0#lo,$T0#lo + vshl.u32 $T0#lo,$T0#lo,#2 + vbic.i32 $H3,#0xfc000000 + vshrn.u64 $T1#lo,$D2,#26 + vmovn.i64 $D2#lo,$D2 + vaddl.u32 $D0,$D0#lo,$T0#lo @ h4 -> h0 [widen for a sec] + vsri.u32 $H2,$H1,#20 + vadd.i32 $D3#lo,$D3#lo,$T1#lo @ h2 -> h3 + vshl.u32 $H1,$H1,#6 + vbic.i32 $D2#lo,#0xfc000000 + vbic.i32 $H2,#0xfc000000 + + vshrn.u64 $T0#lo,$D0,#26 @ re-narrow + vmovn.i64 $D0#lo,$D0 + vsri.u32 $H1,$H0,#26 + vbic.i32 $H0,#0xfc000000 + vshr.u32 $T1#lo,$D3#lo,#26 + vbic.i32 $D3#lo,#0xfc000000 + vbic.i32 $D0#lo,#0xfc000000 + vadd.i32 $D1#lo,$D1#lo,$T0#lo @ h0 -> h1 + vadd.i32 $D4#lo,$D4#lo,$T1#lo @ h3 -> h4 + vbic.i32 $H1,#0xfc000000 + + bhi .Loop_neon + +.Lskip_loop: + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ multiply (inp[0:1]+hash) or inp[2:3] by r^2:r^1 + + add $tbl1,$ctx,#(48+0*9*4) + add $tbl0,$ctx,#(48+1*9*4) + adds $len,$len,#32 + it ne + movne $len,#0 + bne .Long_tail + + vadd.i32 $H2#hi,$H2#lo,$D2#lo @ add hash value and move to #hi + vadd.i32 $H0#hi,$H0#lo,$D0#lo + vadd.i32 $H3#hi,$H3#lo,$D3#lo + vadd.i32 $H1#hi,$H1#lo,$D1#lo + vadd.i32 $H4#hi,$H4#lo,$D4#lo + +.Long_tail: + vld4.32 {${R0}[1],${R1}[1],${S1}[1],${R2}[1]},[$tbl1]! @ load r^1 + vld4.32 {${R0}[0],${R1}[0],${S1}[0],${R2}[0]},[$tbl0]! @ load r^2 + + vadd.i32 $H2#lo,$H2#lo,$D2#lo @ can be redundant + vmull.u32 $D2,$H2#hi,$R0 + vadd.i32 $H0#lo,$H0#lo,$D0#lo + vmull.u32 $D0,$H0#hi,$R0 + vadd.i32 $H3#lo,$H3#lo,$D3#lo + vmull.u32 $D3,$H3#hi,$R0 + vadd.i32 $H1#lo,$H1#lo,$D1#lo + vmull.u32 $D1,$H1#hi,$R0 + vadd.i32 $H4#lo,$H4#lo,$D4#lo + vmull.u32 $D4,$H4#hi,$R0 + + vmlal.u32 $D0,$H4#hi,$S1 + vld4.32 {${S2}[1],${R3}[1],${S3}[1],${R4}[1]},[$tbl1]! + vmlal.u32 $D3,$H2#hi,$R1 + vld4.32 {${S2}[0],${R3}[0],${S3}[0],${R4}[0]},[$tbl0]! + vmlal.u32 $D1,$H0#hi,$R1 + vmlal.u32 $D4,$H3#hi,$R1 + vmlal.u32 $D2,$H1#hi,$R1 + + vmlal.u32 $D3,$H1#hi,$R2 + vld1.32 ${S4}[1],[$tbl1,:32] + vmlal.u32 $D0,$H3#hi,$S2 + vld1.32 ${S4}[0],[$tbl0,:32] + vmlal.u32 $D4,$H2#hi,$R2 + vmlal.u32 $D1,$H4#hi,$S2 + vmlal.u32 $D2,$H0#hi,$R2 + + vmlal.u32 $D3,$H0#hi,$R3 + it ne + addne $tbl1,$ctx,#(48+2*9*4) + vmlal.u32 $D0,$H2#hi,$S3 + it ne + addne $tbl0,$ctx,#(48+3*9*4) + vmlal.u32 $D4,$H1#hi,$R3 + vmlal.u32 $D1,$H3#hi,$S3 + vmlal.u32 $D2,$H4#hi,$S3 + + vmlal.u32 $D3,$H4#hi,$S4 + vorn $MASK,$MASK,$MASK @ all-ones, can be redundant + vmlal.u32 $D0,$H1#hi,$S4 + vshr.u64 $MASK,$MASK,#38 + vmlal.u32 $D4,$H0#hi,$R4 + vmlal.u32 $D1,$H2#hi,$S4 + vmlal.u32 $D2,$H3#hi,$S4 + + beq .Lshort_tail + + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ (hash+inp[0:1])*r^4:r^3 and accumulate + + vld4.32 {${R0}[1],${R1}[1],${S1}[1],${R2}[1]},[$tbl1]! @ load r^3 + vld4.32 {${R0}[0],${R1}[0],${S1}[0],${R2}[0]},[$tbl0]! @ load r^4 + + vmlal.u32 $D2,$H2#lo,$R0 + vmlal.u32 $D0,$H0#lo,$R0 + vmlal.u32 $D3,$H3#lo,$R0 + vmlal.u32 $D1,$H1#lo,$R0 + vmlal.u32 $D4,$H4#lo,$R0 + + vmlal.u32 $D0,$H4#lo,$S1 + vld4.32 {${S2}[1],${R3}[1],${S3}[1],${R4}[1]},[$tbl1]! + vmlal.u32 $D3,$H2#lo,$R1 + vld4.32 {${S2}[0],${R3}[0],${S3}[0],${R4}[0]},[$tbl0]! + vmlal.u32 $D1,$H0#lo,$R1 + vmlal.u32 $D4,$H3#lo,$R1 + vmlal.u32 $D2,$H1#lo,$R1 + + vmlal.u32 $D3,$H1#lo,$R2 + vld1.32 ${S4}[1],[$tbl1,:32] + vmlal.u32 $D0,$H3#lo,$S2 + vld1.32 ${S4}[0],[$tbl0,:32] + vmlal.u32 $D4,$H2#lo,$R2 + vmlal.u32 $D1,$H4#lo,$S2 + vmlal.u32 $D2,$H0#lo,$R2 + + vmlal.u32 $D3,$H0#lo,$R3 + vmlal.u32 $D0,$H2#lo,$S3 + vmlal.u32 $D4,$H1#lo,$R3 + vmlal.u32 $D1,$H3#lo,$S3 + vmlal.u32 $D2,$H4#lo,$S3 + + vmlal.u32 $D3,$H4#lo,$S4 + vorn $MASK,$MASK,$MASK @ all-ones + vmlal.u32 $D0,$H1#lo,$S4 + vshr.u64 $MASK,$MASK,#38 + vmlal.u32 $D4,$H0#lo,$R4 + vmlal.u32 $D1,$H2#lo,$S4 + vmlal.u32 $D2,$H3#lo,$S4 + +.Lshort_tail: + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ horizontal addition + + vadd.i64 $D3#lo,$D3#lo,$D3#hi + vadd.i64 $D0#lo,$D0#lo,$D0#hi + vadd.i64 $D4#lo,$D4#lo,$D4#hi + vadd.i64 $D1#lo,$D1#lo,$D1#hi + vadd.i64 $D2#lo,$D2#lo,$D2#hi + + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ lazy reduction, but without narrowing + + vshr.u64 $T0,$D3,#26 + vand.i64 $D3,$D3,$MASK + vshr.u64 $T1,$D0,#26 + vand.i64 $D0,$D0,$MASK + vadd.i64 $D4,$D4,$T0 @ h3 -> h4 + vadd.i64 $D1,$D1,$T1 @ h0 -> h1 + + vshr.u64 $T0,$D4,#26 + vand.i64 $D4,$D4,$MASK + vshr.u64 $T1,$D1,#26 + vand.i64 $D1,$D1,$MASK + vadd.i64 $D2,$D2,$T1 @ h1 -> h2 + + vadd.i64 $D0,$D0,$T0 + vshl.u64 $T0,$T0,#2 + vshr.u64 $T1,$D2,#26 + vand.i64 $D2,$D2,$MASK + vadd.i64 $D0,$D0,$T0 @ h4 -> h0 + vadd.i64 $D3,$D3,$T1 @ h2 -> h3 + + vshr.u64 $T0,$D0,#26 + vand.i64 $D0,$D0,$MASK + vshr.u64 $T1,$D3,#26 + vand.i64 $D3,$D3,$MASK + vadd.i64 $D1,$D1,$T0 @ h0 -> h1 + vadd.i64 $D4,$D4,$T1 @ h3 -> h4 + + cmp $len,#0 + bne .Leven + + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ store hash value + + vst4.32 {$D0#lo[0],$D1#lo[0],$D2#lo[0],$D3#lo[0]},[$ctx]! + vst1.32 {$D4#lo[0]},[$ctx] + + vldmia sp!,{d8-d15} @ epilogue + ldmia sp!,{r4-r7} + ret @ bx lr +.size poly1305_blocks_neon,.-poly1305_blocks_neon + +.align 5 +.Lzeros: +.long 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 +#ifndef __KERNEL__ +.LOPENSSL_armcap: +# ifdef _WIN32 +.word OPENSSL_armcap_P +# else +.word OPENSSL_armcap_P-.Lpoly1305_init +# endif +.comm OPENSSL_armcap_P,4,4 +.hidden OPENSSL_armcap_P +#endif +#endif +___ +} } +$code.=<<___; +.asciz "Poly1305 for ARMv4/NEON, CRYPTOGAMS by \@dot-asm" +.align 2 +___ + +foreach (split("\n",$code)) { + s/\`([^\`]*)\`/eval $1/geo; + + s/\bq([0-9]+)#(lo|hi)/sprintf "d%d",2*$1+($2 eq "hi")/geo or + s/\bret\b/bx lr/go or + s/\bbx\s+lr\b/.word\t0xe12fff1e/go; # make it possible to compile with -march=armv4 + + print $_,"\n"; +} +close STDOUT; # enforce flush diff --git a/arch/arm/crypto/poly1305-core.S_shipped b/arch/arm/crypto/poly1305-core.S_shipped new file mode 100644 index 000000000000..37b71d990293 --- /dev/null +++ b/arch/arm/crypto/poly1305-core.S_shipped @@ -0,0 +1,1158 @@ +#ifndef __KERNEL__ +# include "arm_arch.h" +#else +# define __ARM_ARCH__ __LINUX_ARM_ARCH__ +# define __ARM_MAX_ARCH__ __LINUX_ARM_ARCH__ +# define poly1305_init poly1305_init_arm +# define poly1305_blocks poly1305_blocks_arm +# define poly1305_emit poly1305_emit_arm +.globl poly1305_blocks_neon +#endif + +#if defined(__thumb2__) +.syntax unified +.thumb +#else +.code 32 +#endif + +.text + +.globl poly1305_emit +.globl poly1305_blocks +.globl poly1305_init +.type poly1305_init,%function +.align 5 +poly1305_init: +.Lpoly1305_init: + stmdb sp!,{r4-r11} + + eor r3,r3,r3 + cmp r1,#0 + str r3,[r0,#0] @ zero hash value + str r3,[r0,#4] + str r3,[r0,#8] + str r3,[r0,#12] + str r3,[r0,#16] + str r3,[r0,#36] @ clear is_base2_26 + add r0,r0,#20 + +#ifdef __thumb2__ + it eq +#endif + moveq r0,#0 + beq .Lno_key + +#if __ARM_MAX_ARCH__>=7 + mov r3,#-1 + str r3,[r0,#28] @ impossible key power value +# ifndef __KERNEL__ + adr r11,.Lpoly1305_init + ldr r12,.LOPENSSL_armcap +# endif +#endif + ldrb r4,[r1,#0] + mov r10,#0x0fffffff + ldrb r5,[r1,#1] + and r3,r10,#-4 @ 0x0ffffffc + ldrb r6,[r1,#2] + ldrb r7,[r1,#3] + orr r4,r4,r5,lsl#8 + ldrb r5,[r1,#4] + orr r4,r4,r6,lsl#16 + ldrb r6,[r1,#5] + orr r4,r4,r7,lsl#24 + ldrb r7,[r1,#6] + and r4,r4,r10 + +#if __ARM_MAX_ARCH__>=7 && !defined(__KERNEL__) +# if !defined(_WIN32) + ldr r12,[r11,r12] @ OPENSSL_armcap_P +# endif +# if defined(__APPLE__) || defined(_WIN32) + ldr r12,[r12] +# endif +#endif + ldrb r8,[r1,#7] + orr r5,r5,r6,lsl#8 + ldrb r6,[r1,#8] + orr r5,r5,r7,lsl#16 + ldrb r7,[r1,#9] + orr r5,r5,r8,lsl#24 + ldrb r8,[r1,#10] + and r5,r5,r3 + +#if __ARM_MAX_ARCH__>=7 && !defined(__KERNEL__) + tst r12,#ARMV7_NEON @ check for NEON +# ifdef __thumb2__ + adr r9,.Lpoly1305_blocks_neon + adr r11,.Lpoly1305_blocks + it ne + movne r11,r9 + adr r12,.Lpoly1305_emit + orr r11,r11,#1 @ thumb-ify addresses + orr r12,r12,#1 +# else + add r12,r11,#(.Lpoly1305_emit-.Lpoly1305_init) + ite eq + addeq r11,r11,#(.Lpoly1305_blocks-.Lpoly1305_init) + addne r11,r11,#(.Lpoly1305_blocks_neon-.Lpoly1305_init) +# endif +#endif + ldrb r9,[r1,#11] + orr r6,r6,r7,lsl#8 + ldrb r7,[r1,#12] + orr r6,r6,r8,lsl#16 + ldrb r8,[r1,#13] + orr r6,r6,r9,lsl#24 + ldrb r9,[r1,#14] + and r6,r6,r3 + + ldrb r10,[r1,#15] + orr r7,r7,r8,lsl#8 + str r4,[r0,#0] + orr r7,r7,r9,lsl#16 + str r5,[r0,#4] + orr r7,r7,r10,lsl#24 + str r6,[r0,#8] + and r7,r7,r3 + str r7,[r0,#12] +#if __ARM_MAX_ARCH__>=7 && !defined(__KERNEL__) + stmia r2,{r11,r12} @ fill functions table + mov r0,#1 +#else + mov r0,#0 +#endif +.Lno_key: + ldmia sp!,{r4-r11} +#if __ARM_ARCH__>=5 + bx lr @ bx lr +#else + tst lr,#1 + moveq pc,lr @ be binary compatible with V4, yet + .word 0xe12fff1e @ interoperable with Thumb ISA:-) +#endif +.size poly1305_init,.-poly1305_init +.type poly1305_blocks,%function +.align 5 +poly1305_blocks: +.Lpoly1305_blocks: + stmdb sp!,{r3-r11,lr} + + ands r2,r2,#-16 + beq .Lno_data + + add r2,r2,r1 @ end pointer + sub sp,sp,#32 + +#if __ARM_ARCH__<7 + ldmia r0,{r4-r12} @ load context + add r0,r0,#20 + str r2,[sp,#16] @ offload stuff + str r0,[sp,#12] +#else + ldr lr,[r0,#36] @ is_base2_26 + ldmia r0!,{r4-r8} @ load hash value + str r2,[sp,#16] @ offload stuff + str r0,[sp,#12] + + adds r9,r4,r5,lsl#26 @ base 2^26 -> base 2^32 + mov r10,r5,lsr#6 + adcs r10,r10,r6,lsl#20 + mov r11,r6,lsr#12 + adcs r11,r11,r7,lsl#14 + mov r12,r7,lsr#18 + adcs r12,r12,r8,lsl#8 + mov r2,#0 + teq lr,#0 + str r2,[r0,#16] @ clear is_base2_26 + adc r2,r2,r8,lsr#24 + + itttt ne + movne r4,r9 @ choose between radixes + movne r5,r10 + movne r6,r11 + movne r7,r12 + ldmia r0,{r9-r12} @ load key + it ne + movne r8,r2 +#endif + + mov lr,r1 + cmp r3,#0 + str r10,[sp,#20] + str r11,[sp,#24] + str r12,[sp,#28] + b .Loop + +.align 4 +.Loop: +#if __ARM_ARCH__<7 + ldrb r0,[lr],#16 @ load input +# ifdef __thumb2__ + it hi +# endif + addhi r8,r8,#1 @ 1<<128 + ldrb r1,[lr,#-15] + ldrb r2,[lr,#-14] + ldrb r3,[lr,#-13] + orr r1,r0,r1,lsl#8 + ldrb r0,[lr,#-12] + orr r2,r1,r2,lsl#16 + ldrb r1,[lr,#-11] + orr r3,r2,r3,lsl#24 + ldrb r2,[lr,#-10] + adds r4,r4,r3 @ accumulate input + + ldrb r3,[lr,#-9] + orr r1,r0,r1,lsl#8 + ldrb r0,[lr,#-8] + orr r2,r1,r2,lsl#16 + ldrb r1,[lr,#-7] + orr r3,r2,r3,lsl#24 + ldrb r2,[lr,#-6] + adcs r5,r5,r3 + + ldrb r3,[lr,#-5] + orr r1,r0,r1,lsl#8 + ldrb r0,[lr,#-4] + orr r2,r1,r2,lsl#16 + ldrb r1,[lr,#-3] + orr r3,r2,r3,lsl#24 + ldrb r2,[lr,#-2] + adcs r6,r6,r3 + + ldrb r3,[lr,#-1] + orr r1,r0,r1,lsl#8 + str lr,[sp,#8] @ offload input pointer + orr r2,r1,r2,lsl#16 + add r10,r10,r10,lsr#2 + orr r3,r2,r3,lsl#24 +#else + ldr r0,[lr],#16 @ load input + it hi + addhi r8,r8,#1 @ padbit + ldr r1,[lr,#-12] + ldr r2,[lr,#-8] + ldr r3,[lr,#-4] +# ifdef __ARMEB__ + rev r0,r0 + rev r1,r1 + rev r2,r2 + rev r3,r3 +# endif + adds r4,r4,r0 @ accumulate input + str lr,[sp,#8] @ offload input pointer + adcs r5,r5,r1 + add r10,r10,r10,lsr#2 + adcs r6,r6,r2 +#endif + add r11,r11,r11,lsr#2 + adcs r7,r7,r3 + add r12,r12,r12,lsr#2 + + umull r2,r3,r5,r9 + adc r8,r8,#0 + umull r0,r1,r4,r9 + umlal r2,r3,r8,r10 + umlal r0,r1,r7,r10 + ldr r10,[sp,#20] @ reload r10 + umlal r2,r3,r6,r12 + umlal r0,r1,r5,r12 + umlal r2,r3,r7,r11 + umlal r0,r1,r6,r11 + umlal r2,r3,r4,r10 + str r0,[sp,#0] @ future r4 + mul r0,r11,r8 + ldr r11,[sp,#24] @ reload r11 + adds r2,r2,r1 @ d1+=d0>>32 + eor r1,r1,r1 + adc lr,r3,#0 @ future r6 + str r2,[sp,#4] @ future r5 + + mul r2,r12,r8 + eor r3,r3,r3 + umlal r0,r1,r7,r12 + ldr r12,[sp,#28] @ reload r12 + umlal r2,r3,r7,r9 + umlal r0,r1,r6,r9 + umlal r2,r3,r6,r10 + umlal r0,r1,r5,r10 + umlal r2,r3,r5,r11 + umlal r0,r1,r4,r11 + umlal r2,r3,r4,r12 + ldr r4,[sp,#0] + mul r8,r9,r8 + ldr r5,[sp,#4] + + adds r6,lr,r0 @ d2+=d1>>32 + ldr lr,[sp,#8] @ reload input pointer + adc r1,r1,#0 + adds r7,r2,r1 @ d3+=d2>>32 + ldr r0,[sp,#16] @ reload end pointer + adc r3,r3,#0 + add r8,r8,r3 @ h4+=d3>>32 + + and r1,r8,#-4 + and r8,r8,#3 + add r1,r1,r1,lsr#2 @ *=5 + adds r4,r4,r1 + adcs r5,r5,#0 + adcs r6,r6,#0 + adcs r7,r7,#0 + adc r8,r8,#0 + + cmp r0,lr @ done yet? + bhi .Loop + + ldr r0,[sp,#12] + add sp,sp,#32 + stmdb r0,{r4-r8} @ store the result + +.Lno_data: +#if __ARM_ARCH__>=5 + ldmia sp!,{r3-r11,pc} +#else + ldmia sp!,{r3-r11,lr} + tst lr,#1 + moveq pc,lr @ be binary compatible with V4, yet + .word 0xe12fff1e @ interoperable with Thumb ISA:-) +#endif +.size poly1305_blocks,.-poly1305_blocks +.type poly1305_emit,%function +.align 5 +poly1305_emit: +.Lpoly1305_emit: + stmdb sp!,{r4-r11} + + ldmia r0,{r3-r7} + +#if __ARM_ARCH__>=7 + ldr ip,[r0,#36] @ is_base2_26 + + adds r8,r3,r4,lsl#26 @ base 2^26 -> base 2^32 + mov r9,r4,lsr#6 + adcs r9,r9,r5,lsl#20 + mov r10,r5,lsr#12 + adcs r10,r10,r6,lsl#14 + mov r11,r6,lsr#18 + adcs r11,r11,r7,lsl#8 + mov r0,#0 + adc r0,r0,r7,lsr#24 + + tst ip,ip + itttt ne + movne r3,r8 + movne r4,r9 + movne r5,r10 + movne r6,r11 + it ne + movne r7,r0 +#endif + + adds r8,r3,#5 @ compare to modulus + adcs r9,r4,#0 + adcs r10,r5,#0 + adcs r11,r6,#0 + adc r0,r7,#0 + tst r0,#4 @ did it carry/borrow? + +#ifdef __thumb2__ + it ne +#endif + movne r3,r8 + ldr r8,[r2,#0] +#ifdef __thumb2__ + it ne +#endif + movne r4,r9 + ldr r9,[r2,#4] +#ifdef __thumb2__ + it ne +#endif + movne r5,r10 + ldr r10,[r2,#8] +#ifdef __thumb2__ + it ne +#endif + movne r6,r11 + ldr r11,[r2,#12] + + adds r3,r3,r8 + adcs r4,r4,r9 + adcs r5,r5,r10 + adc r6,r6,r11 + +#if __ARM_ARCH__>=7 +# ifdef __ARMEB__ + rev r3,r3 + rev r4,r4 + rev r5,r5 + rev r6,r6 +# endif + str r3,[r1,#0] + str r4,[r1,#4] + str r5,[r1,#8] + str r6,[r1,#12] +#else + strb r3,[r1,#0] + mov r3,r3,lsr#8 + strb r4,[r1,#4] + mov r4,r4,lsr#8 + strb r5,[r1,#8] + mov r5,r5,lsr#8 + strb r6,[r1,#12] + mov r6,r6,lsr#8 + + strb r3,[r1,#1] + mov r3,r3,lsr#8 + strb r4,[r1,#5] + mov r4,r4,lsr#8 + strb r5,[r1,#9] + mov r5,r5,lsr#8 + strb r6,[r1,#13] + mov r6,r6,lsr#8 + + strb r3,[r1,#2] + mov r3,r3,lsr#8 + strb r4,[r1,#6] + mov r4,r4,lsr#8 + strb r5,[r1,#10] + mov r5,r5,lsr#8 + strb r6,[r1,#14] + mov r6,r6,lsr#8 + + strb r3,[r1,#3] + strb r4,[r1,#7] + strb r5,[r1,#11] + strb r6,[r1,#15] +#endif + ldmia sp!,{r4-r11} +#if __ARM_ARCH__>=5 + bx lr @ bx lr +#else + tst lr,#1 + moveq pc,lr @ be binary compatible with V4, yet + .word 0xe12fff1e @ interoperable with Thumb ISA:-) +#endif +.size poly1305_emit,.-poly1305_emit +#if __ARM_MAX_ARCH__>=7 +.fpu neon + +.type poly1305_init_neon,%function +.align 5 +poly1305_init_neon: +.Lpoly1305_init_neon: + ldr r3,[r0,#48] @ first table element + cmp r3,#-1 @ is value impossible? + bne .Lno_init_neon + + ldr r4,[r0,#20] @ load key base 2^32 + ldr r5,[r0,#24] + ldr r6,[r0,#28] + ldr r7,[r0,#32] + + and r2,r4,#0x03ffffff @ base 2^32 -> base 2^26 + mov r3,r4,lsr#26 + mov r4,r5,lsr#20 + orr r3,r3,r5,lsl#6 + mov r5,r6,lsr#14 + orr r4,r4,r6,lsl#12 + mov r6,r7,lsr#8 + orr r5,r5,r7,lsl#18 + and r3,r3,#0x03ffffff + and r4,r4,#0x03ffffff + and r5,r5,#0x03ffffff + + vdup.32 d0,r2 @ r^1 in both lanes + add r2,r3,r3,lsl#2 @ *5 + vdup.32 d1,r3 + add r3,r4,r4,lsl#2 + vdup.32 d2,r2 + vdup.32 d3,r4 + add r4,r5,r5,lsl#2 + vdup.32 d4,r3 + vdup.32 d5,r5 + add r5,r6,r6,lsl#2 + vdup.32 d6,r4 + vdup.32 d7,r6 + vdup.32 d8,r5 + + mov r5,#2 @ counter + +.Lsquare_neon: + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ d0 = h0*r0 + h4*5*r1 + h3*5*r2 + h2*5*r3 + h1*5*r4 + @ d1 = h1*r0 + h0*r1 + h4*5*r2 + h3*5*r3 + h2*5*r4 + @ d2 = h2*r0 + h1*r1 + h0*r2 + h4*5*r3 + h3*5*r4 + @ d3 = h3*r0 + h2*r1 + h1*r2 + h0*r3 + h4*5*r4 + @ d4 = h4*r0 + h3*r1 + h2*r2 + h1*r3 + h0*r4 + + vmull.u32 q5,d0,d0[1] + vmull.u32 q6,d1,d0[1] + vmull.u32 q7,d3,d0[1] + vmull.u32 q8,d5,d0[1] + vmull.u32 q9,d7,d0[1] + + vmlal.u32 q5,d7,d2[1] + vmlal.u32 q6,d0,d1[1] + vmlal.u32 q7,d1,d1[1] + vmlal.u32 q8,d3,d1[1] + vmlal.u32 q9,d5,d1[1] + + vmlal.u32 q5,d5,d4[1] + vmlal.u32 q6,d7,d4[1] + vmlal.u32 q8,d1,d3[1] + vmlal.u32 q7,d0,d3[1] + vmlal.u32 q9,d3,d3[1] + + vmlal.u32 q5,d3,d6[1] + vmlal.u32 q8,d0,d5[1] + vmlal.u32 q6,d5,d6[1] + vmlal.u32 q7,d7,d6[1] + vmlal.u32 q9,d1,d5[1] + + vmlal.u32 q8,d7,d8[1] + vmlal.u32 q5,d1,d8[1] + vmlal.u32 q6,d3,d8[1] + vmlal.u32 q7,d5,d8[1] + vmlal.u32 q9,d0,d7[1] + + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ lazy reduction as discussed in "NEON crypto" by D.J. Bernstein + @ and P. Schwabe + @ + @ H0>>+H1>>+H2>>+H3>>+H4 + @ H3>>+H4>>*5+H0>>+H1 + @ + @ Trivia. + @ + @ Result of multiplication of n-bit number by m-bit number is + @ n+m bits wide. However! Even though 2^n is a n+1-bit number, + @ m-bit number multiplied by 2^n is still n+m bits wide. + @ + @ Sum of two n-bit numbers is n+1 bits wide, sum of three - n+2, + @ and so is sum of four. Sum of 2^m n-m-bit numbers and n-bit + @ one is n+1 bits wide. + @ + @ >>+ denotes Hnext += Hn>>26, Hn &= 0x3ffffff. This means that + @ H0, H2, H3 are guaranteed to be 26 bits wide, while H1 and H4 + @ can be 27. However! In cases when their width exceeds 26 bits + @ they are limited by 2^26+2^6. This in turn means that *sum* + @ of the products with these values can still be viewed as sum + @ of 52-bit numbers as long as the amount of addends is not a + @ power of 2. For example, + @ + @ H4 = H4*R0 + H3*R1 + H2*R2 + H1*R3 + H0 * R4, + @ + @ which can't be larger than 5 * (2^26 + 2^6) * (2^26 + 2^6), or + @ 5 * (2^52 + 2*2^32 + 2^12), which in turn is smaller than + @ 8 * (2^52) or 2^55. However, the value is then multiplied by + @ by 5, so we should be looking at 5 * 5 * (2^52 + 2^33 + 2^12), + @ which is less than 32 * (2^52) or 2^57. And when processing + @ data we are looking at triple as many addends... + @ + @ In key setup procedure pre-reduced H0 is limited by 5*4+1 and + @ 5*H4 - by 5*5 52-bit addends, or 57 bits. But when hashing the + @ input H0 is limited by (5*4+1)*3 addends, or 58 bits, while + @ 5*H4 by 5*5*3, or 59[!] bits. How is this relevant? vmlal.u32 + @ instruction accepts 2x32-bit input and writes 2x64-bit result. + @ This means that result of reduction have to be compressed upon + @ loop wrap-around. This can be done in the process of reduction + @ to minimize amount of instructions [as well as amount of + @ 128-bit instructions, which benefits low-end processors], but + @ one has to watch for H2 (which is narrower than H0) and 5*H4 + @ not being wider than 58 bits, so that result of right shift + @ by 26 bits fits in 32 bits. This is also useful on x86, + @ because it allows to use paddd in place for paddq, which + @ benefits Atom, where paddq is ridiculously slow. + + vshr.u64 q15,q8,#26 + vmovn.i64 d16,q8 + vshr.u64 q4,q5,#26 + vmovn.i64 d10,q5 + vadd.i64 q9,q9,q15 @ h3 -> h4 + vbic.i32 d16,#0xfc000000 @ &=0x03ffffff + vadd.i64 q6,q6,q4 @ h0 -> h1 + vbic.i32 d10,#0xfc000000 + + vshrn.u64 d30,q9,#26 + vmovn.i64 d18,q9 + vshr.u64 q4,q6,#26 + vmovn.i64 d12,q6 + vadd.i64 q7,q7,q4 @ h1 -> h2 + vbic.i32 d18,#0xfc000000 + vbic.i32 d12,#0xfc000000 + + vadd.i32 d10,d10,d30 + vshl.u32 d30,d30,#2 + vshrn.u64 d8,q7,#26 + vmovn.i64 d14,q7 + vadd.i32 d10,d10,d30 @ h4 -> h0 + vadd.i32 d16,d16,d8 @ h2 -> h3 + vbic.i32 d14,#0xfc000000 + + vshr.u32 d30,d10,#26 + vbic.i32 d10,#0xfc000000 + vshr.u32 d8,d16,#26 + vbic.i32 d16,#0xfc000000 + vadd.i32 d12,d12,d30 @ h0 -> h1 + vadd.i32 d18,d18,d8 @ h3 -> h4 + + subs r5,r5,#1 + beq .Lsquare_break_neon + + add r6,r0,#(48+0*9*4) + add r7,r0,#(48+1*9*4) + + vtrn.32 d0,d10 @ r^2:r^1 + vtrn.32 d3,d14 + vtrn.32 d5,d16 + vtrn.32 d1,d12 + vtrn.32 d7,d18 + + vshl.u32 d4,d3,#2 @ *5 + vshl.u32 d6,d5,#2 + vshl.u32 d2,d1,#2 + vshl.u32 d8,d7,#2 + vadd.i32 d4,d4,d3 + vadd.i32 d2,d2,d1 + vadd.i32 d6,d6,d5 + vadd.i32 d8,d8,d7 + + vst4.32 {d0[0],d1[0],d2[0],d3[0]},[r6]! + vst4.32 {d0[1],d1[1],d2[1],d3[1]},[r7]! + vst4.32 {d4[0],d5[0],d6[0],d7[0]},[r6]! + vst4.32 {d4[1],d5[1],d6[1],d7[1]},[r7]! + vst1.32 {d8[0]},[r6,:32] + vst1.32 {d8[1]},[r7,:32] + + b .Lsquare_neon + +.align 4 +.Lsquare_break_neon: + add r6,r0,#(48+2*4*9) + add r7,r0,#(48+3*4*9) + + vmov d0,d10 @ r^4:r^3 + vshl.u32 d2,d12,#2 @ *5 + vmov d1,d12 + vshl.u32 d4,d14,#2 + vmov d3,d14 + vshl.u32 d6,d16,#2 + vmov d5,d16 + vshl.u32 d8,d18,#2 + vmov d7,d18 + vadd.i32 d2,d2,d12 + vadd.i32 d4,d4,d14 + vadd.i32 d6,d6,d16 + vadd.i32 d8,d8,d18 + + vst4.32 {d0[0],d1[0],d2[0],d3[0]},[r6]! + vst4.32 {d0[1],d1[1],d2[1],d3[1]},[r7]! + vst4.32 {d4[0],d5[0],d6[0],d7[0]},[r6]! + vst4.32 {d4[1],d5[1],d6[1],d7[1]},[r7]! + vst1.32 {d8[0]},[r6] + vst1.32 {d8[1]},[r7] + +.Lno_init_neon: + bx lr @ bx lr +.size poly1305_init_neon,.-poly1305_init_neon + +.type poly1305_blocks_neon,%function +.align 5 +poly1305_blocks_neon: +.Lpoly1305_blocks_neon: + ldr ip,[r0,#36] @ is_base2_26 + + cmp r2,#64 + blo .Lpoly1305_blocks + + stmdb sp!,{r4-r7} + vstmdb sp!,{d8-d15} @ ABI specification says so + + tst ip,ip @ is_base2_26? + bne .Lbase2_26_neon + + stmdb sp!,{r1-r3,lr} + bl .Lpoly1305_init_neon + + ldr r4,[r0,#0] @ load hash value base 2^32 + ldr r5,[r0,#4] + ldr r6,[r0,#8] + ldr r7,[r0,#12] + ldr ip,[r0,#16] + + and r2,r4,#0x03ffffff @ base 2^32 -> base 2^26 + mov r3,r4,lsr#26 + veor d10,d10,d10 + mov r4,r5,lsr#20 + orr r3,r3,r5,lsl#6 + veor d12,d12,d12 + mov r5,r6,lsr#14 + orr r4,r4,r6,lsl#12 + veor d14,d14,d14 + mov r6,r7,lsr#8 + orr r5,r5,r7,lsl#18 + veor d16,d16,d16 + and r3,r3,#0x03ffffff + orr r6,r6,ip,lsl#24 + veor d18,d18,d18 + and r4,r4,#0x03ffffff + mov r1,#1 + and r5,r5,#0x03ffffff + str r1,[r0,#36] @ set is_base2_26 + + vmov.32 d10[0],r2 + vmov.32 d12[0],r3 + vmov.32 d14[0],r4 + vmov.32 d16[0],r5 + vmov.32 d18[0],r6 + adr r5,.Lzeros + + ldmia sp!,{r1-r3,lr} + b .Lhash_loaded + +.align 4 +.Lbase2_26_neon: + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ load hash value + + veor d10,d10,d10 + veor d12,d12,d12 + veor d14,d14,d14 + veor d16,d16,d16 + veor d18,d18,d18 + vld4.32 {d10[0],d12[0],d14[0],d16[0]},[r0]! + adr r5,.Lzeros + vld1.32 {d18[0]},[r0] + sub r0,r0,#16 @ rewind + +.Lhash_loaded: + add r4,r1,#32 + mov r3,r3,lsl#24 + tst r2,#31 + beq .Leven + + vld4.32 {d20[0],d22[0],d24[0],d26[0]},[r1]! + vmov.32 d28[0],r3 + sub r2,r2,#16 + add r4,r1,#32 + +# ifdef __ARMEB__ + vrev32.8 q10,q10 + vrev32.8 q13,q13 + vrev32.8 q11,q11 + vrev32.8 q12,q12 +# endif + vsri.u32 d28,d26,#8 @ base 2^32 -> base 2^26 + vshl.u32 d26,d26,#18 + + vsri.u32 d26,d24,#14 + vshl.u32 d24,d24,#12 + vadd.i32 d29,d28,d18 @ add hash value and move to #hi + + vbic.i32 d26,#0xfc000000 + vsri.u32 d24,d22,#20 + vshl.u32 d22,d22,#6 + + vbic.i32 d24,#0xfc000000 + vsri.u32 d22,d20,#26 + vadd.i32 d27,d26,d16 + + vbic.i32 d20,#0xfc000000 + vbic.i32 d22,#0xfc000000 + vadd.i32 d25,d24,d14 + + vadd.i32 d21,d20,d10 + vadd.i32 d23,d22,d12 + + mov r7,r5 + add r6,r0,#48 + + cmp r2,r2 + b .Long_tail + +.align 4 +.Leven: + subs r2,r2,#64 + it lo + movlo r4,r5 + + vmov.i32 q14,#1<<24 @ padbit, yes, always + vld4.32 {d20,d22,d24,d26},[r1] @ inp[0:1] + add r1,r1,#64 + vld4.32 {d21,d23,d25,d27},[r4] @ inp[2:3] (or 0) + add r4,r4,#64 + itt hi + addhi r7,r0,#(48+1*9*4) + addhi r6,r0,#(48+3*9*4) + +# ifdef __ARMEB__ + vrev32.8 q10,q10 + vrev32.8 q13,q13 + vrev32.8 q11,q11 + vrev32.8 q12,q12 +# endif + vsri.u32 q14,q13,#8 @ base 2^32 -> base 2^26 + vshl.u32 q13,q13,#18 + + vsri.u32 q13,q12,#14 + vshl.u32 q12,q12,#12 + + vbic.i32 q13,#0xfc000000 + vsri.u32 q12,q11,#20 + vshl.u32 q11,q11,#6 + + vbic.i32 q12,#0xfc000000 + vsri.u32 q11,q10,#26 + + vbic.i32 q10,#0xfc000000 + vbic.i32 q11,#0xfc000000 + + bls .Lskip_loop + + vld4.32 {d0[1],d1[1],d2[1],d3[1]},[r7]! @ load r^2 + vld4.32 {d0[0],d1[0],d2[0],d3[0]},[r6]! @ load r^4 + vld4.32 {d4[1],d5[1],d6[1],d7[1]},[r7]! + vld4.32 {d4[0],d5[0],d6[0],d7[0]},[r6]! + b .Loop_neon + +.align 5 +.Loop_neon: + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ ((inp[0]*r^4+inp[2]*r^2+inp[4])*r^4+inp[6]*r^2 + @ ((inp[1]*r^4+inp[3]*r^2+inp[5])*r^3+inp[7]*r + @ ___________________/ + @ ((inp[0]*r^4+inp[2]*r^2+inp[4])*r^4+inp[6]*r^2+inp[8])*r^2 + @ ((inp[1]*r^4+inp[3]*r^2+inp[5])*r^4+inp[7]*r^2+inp[9])*r + @ ___________________/ ____________________/ + @ + @ Note that we start with inp[2:3]*r^2. This is because it + @ doesn't depend on reduction in previous iteration. + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ d4 = h4*r0 + h3*r1 + h2*r2 + h1*r3 + h0*r4 + @ d3 = h3*r0 + h2*r1 + h1*r2 + h0*r3 + h4*5*r4 + @ d2 = h2*r0 + h1*r1 + h0*r2 + h4*5*r3 + h3*5*r4 + @ d1 = h1*r0 + h0*r1 + h4*5*r2 + h3*5*r3 + h2*5*r4 + @ d0 = h0*r0 + h4*5*r1 + h3*5*r2 + h2*5*r3 + h1*5*r4 + + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ inp[2:3]*r^2 + + vadd.i32 d24,d24,d14 @ accumulate inp[0:1] + vmull.u32 q7,d25,d0[1] + vadd.i32 d20,d20,d10 + vmull.u32 q5,d21,d0[1] + vadd.i32 d26,d26,d16 + vmull.u32 q8,d27,d0[1] + vmlal.u32 q7,d23,d1[1] + vadd.i32 d22,d22,d12 + vmull.u32 q6,d23,d0[1] + + vadd.i32 d28,d28,d18 + vmull.u32 q9,d29,d0[1] + subs r2,r2,#64 + vmlal.u32 q5,d29,d2[1] + it lo + movlo r4,r5 + vmlal.u32 q8,d25,d1[1] + vld1.32 d8[1],[r7,:32] + vmlal.u32 q6,d21,d1[1] + vmlal.u32 q9,d27,d1[1] + + vmlal.u32 q5,d27,d4[1] + vmlal.u32 q8,d23,d3[1] + vmlal.u32 q9,d25,d3[1] + vmlal.u32 q6,d29,d4[1] + vmlal.u32 q7,d21,d3[1] + + vmlal.u32 q8,d21,d5[1] + vmlal.u32 q5,d25,d6[1] + vmlal.u32 q9,d23,d5[1] + vmlal.u32 q6,d27,d6[1] + vmlal.u32 q7,d29,d6[1] + + vmlal.u32 q8,d29,d8[1] + vmlal.u32 q5,d23,d8[1] + vmlal.u32 q9,d21,d7[1] + vmlal.u32 q6,d25,d8[1] + vmlal.u32 q7,d27,d8[1] + + vld4.32 {d21,d23,d25,d27},[r4] @ inp[2:3] (or 0) + add r4,r4,#64 + + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ (hash+inp[0:1])*r^4 and accumulate + + vmlal.u32 q8,d26,d0[0] + vmlal.u32 q5,d20,d0[0] + vmlal.u32 q9,d28,d0[0] + vmlal.u32 q6,d22,d0[0] + vmlal.u32 q7,d24,d0[0] + vld1.32 d8[0],[r6,:32] + + vmlal.u32 q8,d24,d1[0] + vmlal.u32 q5,d28,d2[0] + vmlal.u32 q9,d26,d1[0] + vmlal.u32 q6,d20,d1[0] + vmlal.u32 q7,d22,d1[0] + + vmlal.u32 q8,d22,d3[0] + vmlal.u32 q5,d26,d4[0] + vmlal.u32 q9,d24,d3[0] + vmlal.u32 q6,d28,d4[0] + vmlal.u32 q7,d20,d3[0] + + vmlal.u32 q8,d20,d5[0] + vmlal.u32 q5,d24,d6[0] + vmlal.u32 q9,d22,d5[0] + vmlal.u32 q6,d26,d6[0] + vmlal.u32 q8,d28,d8[0] + + vmlal.u32 q7,d28,d6[0] + vmlal.u32 q5,d22,d8[0] + vmlal.u32 q9,d20,d7[0] + vmov.i32 q14,#1<<24 @ padbit, yes, always + vmlal.u32 q6,d24,d8[0] + vmlal.u32 q7,d26,d8[0] + + vld4.32 {d20,d22,d24,d26},[r1] @ inp[0:1] + add r1,r1,#64 +# ifdef __ARMEB__ + vrev32.8 q10,q10 + vrev32.8 q11,q11 + vrev32.8 q12,q12 + vrev32.8 q13,q13 +# endif + + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ lazy reduction interleaved with base 2^32 -> base 2^26 of + @ inp[0:3] previously loaded to q10-q13 and smashed to q10-q14. + + vshr.u64 q15,q8,#26 + vmovn.i64 d16,q8 + vshr.u64 q4,q5,#26 + vmovn.i64 d10,q5 + vadd.i64 q9,q9,q15 @ h3 -> h4 + vbic.i32 d16,#0xfc000000 + vsri.u32 q14,q13,#8 @ base 2^32 -> base 2^26 + vadd.i64 q6,q6,q4 @ h0 -> h1 + vshl.u32 q13,q13,#18 + vbic.i32 d10,#0xfc000000 + + vshrn.u64 d30,q9,#26 + vmovn.i64 d18,q9 + vshr.u64 q4,q6,#26 + vmovn.i64 d12,q6 + vadd.i64 q7,q7,q4 @ h1 -> h2 + vsri.u32 q13,q12,#14 + vbic.i32 d18,#0xfc000000 + vshl.u32 q12,q12,#12 + vbic.i32 d12,#0xfc000000 + + vadd.i32 d10,d10,d30 + vshl.u32 d30,d30,#2 + vbic.i32 q13,#0xfc000000 + vshrn.u64 d8,q7,#26 + vmovn.i64 d14,q7 + vaddl.u32 q5,d10,d30 @ h4 -> h0 [widen for a sec] + vsri.u32 q12,q11,#20 + vadd.i32 d16,d16,d8 @ h2 -> h3 + vshl.u32 q11,q11,#6 + vbic.i32 d14,#0xfc000000 + vbic.i32 q12,#0xfc000000 + + vshrn.u64 d30,q5,#26 @ re-narrow + vmovn.i64 d10,q5 + vsri.u32 q11,q10,#26 + vbic.i32 q10,#0xfc000000 + vshr.u32 d8,d16,#26 + vbic.i32 d16,#0xfc000000 + vbic.i32 d10,#0xfc000000 + vadd.i32 d12,d12,d30 @ h0 -> h1 + vadd.i32 d18,d18,d8 @ h3 -> h4 + vbic.i32 q11,#0xfc000000 + + bhi .Loop_neon + +.Lskip_loop: + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ multiply (inp[0:1]+hash) or inp[2:3] by r^2:r^1 + + add r7,r0,#(48+0*9*4) + add r6,r0,#(48+1*9*4) + adds r2,r2,#32 + it ne + movne r2,#0 + bne .Long_tail + + vadd.i32 d25,d24,d14 @ add hash value and move to #hi + vadd.i32 d21,d20,d10 + vadd.i32 d27,d26,d16 + vadd.i32 d23,d22,d12 + vadd.i32 d29,d28,d18 + +.Long_tail: + vld4.32 {d0[1],d1[1],d2[1],d3[1]},[r7]! @ load r^1 + vld4.32 {d0[0],d1[0],d2[0],d3[0]},[r6]! @ load r^2 + + vadd.i32 d24,d24,d14 @ can be redundant + vmull.u32 q7,d25,d0 + vadd.i32 d20,d20,d10 + vmull.u32 q5,d21,d0 + vadd.i32 d26,d26,d16 + vmull.u32 q8,d27,d0 + vadd.i32 d22,d22,d12 + vmull.u32 q6,d23,d0 + vadd.i32 d28,d28,d18 + vmull.u32 q9,d29,d0 + + vmlal.u32 q5,d29,d2 + vld4.32 {d4[1],d5[1],d6[1],d7[1]},[r7]! + vmlal.u32 q8,d25,d1 + vld4.32 {d4[0],d5[0],d6[0],d7[0]},[r6]! + vmlal.u32 q6,d21,d1 + vmlal.u32 q9,d27,d1 + vmlal.u32 q7,d23,d1 + + vmlal.u32 q8,d23,d3 + vld1.32 d8[1],[r7,:32] + vmlal.u32 q5,d27,d4 + vld1.32 d8[0],[r6,:32] + vmlal.u32 q9,d25,d3 + vmlal.u32 q6,d29,d4 + vmlal.u32 q7,d21,d3 + + vmlal.u32 q8,d21,d5 + it ne + addne r7,r0,#(48+2*9*4) + vmlal.u32 q5,d25,d6 + it ne + addne r6,r0,#(48+3*9*4) + vmlal.u32 q9,d23,d5 + vmlal.u32 q6,d27,d6 + vmlal.u32 q7,d29,d6 + + vmlal.u32 q8,d29,d8 + vorn q0,q0,q0 @ all-ones, can be redundant + vmlal.u32 q5,d23,d8 + vshr.u64 q0,q0,#38 + vmlal.u32 q9,d21,d7 + vmlal.u32 q6,d25,d8 + vmlal.u32 q7,d27,d8 + + beq .Lshort_tail + + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ (hash+inp[0:1])*r^4:r^3 and accumulate + + vld4.32 {d0[1],d1[1],d2[1],d3[1]},[r7]! @ load r^3 + vld4.32 {d0[0],d1[0],d2[0],d3[0]},[r6]! @ load r^4 + + vmlal.u32 q7,d24,d0 + vmlal.u32 q5,d20,d0 + vmlal.u32 q8,d26,d0 + vmlal.u32 q6,d22,d0 + vmlal.u32 q9,d28,d0 + + vmlal.u32 q5,d28,d2 + vld4.32 {d4[1],d5[1],d6[1],d7[1]},[r7]! + vmlal.u32 q8,d24,d1 + vld4.32 {d4[0],d5[0],d6[0],d7[0]},[r6]! + vmlal.u32 q6,d20,d1 + vmlal.u32 q9,d26,d1 + vmlal.u32 q7,d22,d1 + + vmlal.u32 q8,d22,d3 + vld1.32 d8[1],[r7,:32] + vmlal.u32 q5,d26,d4 + vld1.32 d8[0],[r6,:32] + vmlal.u32 q9,d24,d3 + vmlal.u32 q6,d28,d4 + vmlal.u32 q7,d20,d3 + + vmlal.u32 q8,d20,d5 + vmlal.u32 q5,d24,d6 + vmlal.u32 q9,d22,d5 + vmlal.u32 q6,d26,d6 + vmlal.u32 q7,d28,d6 + + vmlal.u32 q8,d28,d8 + vorn q0,q0,q0 @ all-ones + vmlal.u32 q5,d22,d8 + vshr.u64 q0,q0,#38 + vmlal.u32 q9,d20,d7 + vmlal.u32 q6,d24,d8 + vmlal.u32 q7,d26,d8 + +.Lshort_tail: + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ horizontal addition + + vadd.i64 d16,d16,d17 + vadd.i64 d10,d10,d11 + vadd.i64 d18,d18,d19 + vadd.i64 d12,d12,d13 + vadd.i64 d14,d14,d15 + + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ lazy reduction, but without narrowing + + vshr.u64 q15,q8,#26 + vand.i64 q8,q8,q0 + vshr.u64 q4,q5,#26 + vand.i64 q5,q5,q0 + vadd.i64 q9,q9,q15 @ h3 -> h4 + vadd.i64 q6,q6,q4 @ h0 -> h1 + + vshr.u64 q15,q9,#26 + vand.i64 q9,q9,q0 + vshr.u64 q4,q6,#26 + vand.i64 q6,q6,q0 + vadd.i64 q7,q7,q4 @ h1 -> h2 + + vadd.i64 q5,q5,q15 + vshl.u64 q15,q15,#2 + vshr.u64 q4,q7,#26 + vand.i64 q7,q7,q0 + vadd.i64 q5,q5,q15 @ h4 -> h0 + vadd.i64 q8,q8,q4 @ h2 -> h3 + + vshr.u64 q15,q5,#26 + vand.i64 q5,q5,q0 + vshr.u64 q4,q8,#26 + vand.i64 q8,q8,q0 + vadd.i64 q6,q6,q15 @ h0 -> h1 + vadd.i64 q9,q9,q4 @ h3 -> h4 + + cmp r2,#0 + bne .Leven + + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ + @ store hash value + + vst4.32 {d10[0],d12[0],d14[0],d16[0]},[r0]! + vst1.32 {d18[0]},[r0] + + vldmia sp!,{d8-d15} @ epilogue + ldmia sp!,{r4-r7} + bx lr @ bx lr +.size poly1305_blocks_neon,.-poly1305_blocks_neon + +.align 5 +.Lzeros: +.long 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 +#ifndef __KERNEL__ +.LOPENSSL_armcap: +# ifdef _WIN32 +.word OPENSSL_armcap_P +# else +.word OPENSSL_armcap_P-.Lpoly1305_init +# endif +.comm OPENSSL_armcap_P,4,4 +.hidden OPENSSL_armcap_P +#endif +#endif +.asciz "Poly1305 for ARMv4/NEON, CRYPTOGAMS by @dot-asm" +.align 2 diff --git a/arch/arm/crypto/poly1305-glue.c b/arch/arm/crypto/poly1305-glue.c new file mode 100644 index 000000000000..066e6a8ca5b3 --- /dev/null +++ b/arch/arm/crypto/poly1305-glue.c @@ -0,0 +1,272 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * OpenSSL/Cryptogams accelerated Poly1305 transform for ARM + * + * Copyright (C) 2019 Linaro Ltd. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +void poly1305_init_arm(void *state, const u8 *key); +void poly1305_blocks_arm(void *state, const u8 *src, u32 len, u32 hibit); +void poly1305_blocks_neon(void *state, const u8 *src, u32 len, u32 hibit); +void poly1305_emit_arm(void *state, u8 *digest, const u32 *nonce); + +void __weak poly1305_blocks_neon(void *state, const u8 *src, u32 len, u32 hibit) +{ +} + +static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_neon); + +void poly1305_init_arch(struct poly1305_desc_ctx *dctx, const u8 *key) +{ + poly1305_init_arm(&dctx->h, key); + dctx->s[0] = get_unaligned_le32(key + 16); + dctx->s[1] = get_unaligned_le32(key + 20); + dctx->s[2] = get_unaligned_le32(key + 24); + dctx->s[3] = get_unaligned_le32(key + 28); + dctx->buflen = 0; +} +EXPORT_SYMBOL(poly1305_init_arch); + +static int arm_poly1305_init(struct shash_desc *desc) +{ + struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc); + + dctx->buflen = 0; + dctx->rset = 0; + dctx->sset = false; + + return 0; +} + +static void arm_poly1305_blocks(struct poly1305_desc_ctx *dctx, const u8 *src, + u32 len, u32 hibit, bool do_neon) +{ + if (unlikely(!dctx->sset)) { + if (!dctx->rset) { + poly1305_init_arm(&dctx->h, src); + src += POLY1305_BLOCK_SIZE; + len -= POLY1305_BLOCK_SIZE; + dctx->rset = 1; + } + if (len >= POLY1305_BLOCK_SIZE) { + dctx->s[0] = get_unaligned_le32(src + 0); + dctx->s[1] = get_unaligned_le32(src + 4); + dctx->s[2] = get_unaligned_le32(src + 8); + dctx->s[3] = get_unaligned_le32(src + 12); + src += POLY1305_BLOCK_SIZE; + len -= POLY1305_BLOCK_SIZE; + dctx->sset = true; + } + if (len < POLY1305_BLOCK_SIZE) + return; + } + + len &= ~(POLY1305_BLOCK_SIZE - 1); + + if (static_branch_likely(&have_neon) && likely(do_neon)) + poly1305_blocks_neon(&dctx->h, src, len, hibit); + else + poly1305_blocks_arm(&dctx->h, src, len, hibit); +} + +static void arm_poly1305_do_update(struct poly1305_desc_ctx *dctx, + const u8 *src, u32 len, bool do_neon) +{ + if (unlikely(dctx->buflen)) { + u32 bytes = min(len, POLY1305_BLOCK_SIZE - dctx->buflen); + + memcpy(dctx->buf + dctx->buflen, src, bytes); + src += bytes; + len -= bytes; + dctx->buflen += bytes; + + if (dctx->buflen == POLY1305_BLOCK_SIZE) { + arm_poly1305_blocks(dctx, dctx->buf, + POLY1305_BLOCK_SIZE, 1, false); + dctx->buflen = 0; + } + } + + if (likely(len >= POLY1305_BLOCK_SIZE)) { + arm_poly1305_blocks(dctx, src, len, 1, do_neon); + src += round_down(len, POLY1305_BLOCK_SIZE); + len %= POLY1305_BLOCK_SIZE; + } + + if (unlikely(len)) { + dctx->buflen = len; + memcpy(dctx->buf, src, len); + } +} + +static int arm_poly1305_update(struct shash_desc *desc, + const u8 *src, unsigned int srclen) +{ + struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc); + + arm_poly1305_do_update(dctx, src, srclen, false); + return 0; +} + +static int __maybe_unused arm_poly1305_update_neon(struct shash_desc *desc, + const u8 *src, + unsigned int srclen) +{ + struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc); + bool do_neon = may_use_simd() && srclen > 128; + + if (static_branch_likely(&have_neon) && do_neon) + kernel_neon_begin(); + arm_poly1305_do_update(dctx, src, srclen, do_neon); + if (static_branch_likely(&have_neon) && do_neon) + kernel_neon_end(); + return 0; +} + +void poly1305_update_arch(struct poly1305_desc_ctx *dctx, const u8 *src, + unsigned int nbytes) +{ + bool do_neon = IS_ENABLED(CONFIG_KERNEL_MODE_NEON) && + may_use_simd(); + + if (unlikely(dctx->buflen)) { + u32 bytes = min(nbytes, POLY1305_BLOCK_SIZE - dctx->buflen); + + memcpy(dctx->buf + dctx->buflen, src, bytes); + src += bytes; + nbytes -= bytes; + dctx->buflen += bytes; + + if (dctx->buflen == POLY1305_BLOCK_SIZE) { + poly1305_blocks_arm(&dctx->h, dctx->buf, + POLY1305_BLOCK_SIZE, 1); + dctx->buflen = 0; + } + } + + if (likely(nbytes >= POLY1305_BLOCK_SIZE)) { + unsigned int len = round_down(nbytes, POLY1305_BLOCK_SIZE); + + if (static_branch_likely(&have_neon) && do_neon) { + do { + unsigned int todo = min_t(unsigned int, len, SZ_4K); + + kernel_neon_begin(); + poly1305_blocks_neon(&dctx->h, src, todo, 1); + kernel_neon_end(); + + len -= todo; + src += todo; + } while (len); + } else { + poly1305_blocks_arm(&dctx->h, src, len, 1); + src += len; + } + nbytes %= POLY1305_BLOCK_SIZE; + } + + if (unlikely(nbytes)) { + dctx->buflen = nbytes; + memcpy(dctx->buf, src, nbytes); + } +} +EXPORT_SYMBOL(poly1305_update_arch); + +void poly1305_final_arch(struct poly1305_desc_ctx *dctx, u8 *dst) +{ + if (unlikely(dctx->buflen)) { + dctx->buf[dctx->buflen++] = 1; + memset(dctx->buf + dctx->buflen, 0, + POLY1305_BLOCK_SIZE - dctx->buflen); + poly1305_blocks_arm(&dctx->h, dctx->buf, POLY1305_BLOCK_SIZE, 0); + } + + poly1305_emit_arm(&dctx->h, dst, dctx->s); + *dctx = (struct poly1305_desc_ctx){}; +} +EXPORT_SYMBOL(poly1305_final_arch); + +static int arm_poly1305_final(struct shash_desc *desc, u8 *dst) +{ + struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc); + + if (unlikely(!dctx->sset)) + return -ENOKEY; + + poly1305_final_arch(dctx, dst); + return 0; +} + +static struct shash_alg arm_poly1305_algs[] = {{ + .init = arm_poly1305_init, + .update = arm_poly1305_update, + .final = arm_poly1305_final, + .digestsize = POLY1305_DIGEST_SIZE, + .descsize = sizeof(struct poly1305_desc_ctx), + + .base.cra_name = "poly1305", + .base.cra_driver_name = "poly1305-arm", + .base.cra_priority = 150, + .base.cra_blocksize = POLY1305_BLOCK_SIZE, + .base.cra_module = THIS_MODULE, +#ifdef CONFIG_KERNEL_MODE_NEON +}, { + .init = arm_poly1305_init, + .update = arm_poly1305_update_neon, + .final = arm_poly1305_final, + .digestsize = POLY1305_DIGEST_SIZE, + .descsize = sizeof(struct poly1305_desc_ctx), + + .base.cra_name = "poly1305", + .base.cra_driver_name = "poly1305-neon", + .base.cra_priority = 200, + .base.cra_blocksize = POLY1305_BLOCK_SIZE, + .base.cra_module = THIS_MODULE, +#endif +}}; + +static int __init arm_poly1305_mod_init(void) +{ + if (IS_ENABLED(CONFIG_KERNEL_MODE_NEON) && + (elf_hwcap & HWCAP_NEON)) + static_branch_enable(&have_neon); + else if (IS_REACHABLE(CONFIG_CRYPTO_HASH)) + /* register only the first entry */ + return crypto_register_shash(&arm_poly1305_algs[0]); + + return IS_REACHABLE(CONFIG_CRYPTO_HASH) ? + crypto_register_shashes(arm_poly1305_algs, + ARRAY_SIZE(arm_poly1305_algs)) : 0; +} + +static void __exit arm_poly1305_mod_exit(void) +{ + if (!IS_REACHABLE(CONFIG_CRYPTO_HASH)) + return; + if (!static_branch_likely(&have_neon)) { + crypto_unregister_shash(&arm_poly1305_algs[0]); + return; + } + crypto_unregister_shashes(arm_poly1305_algs, + ARRAY_SIZE(arm_poly1305_algs)); +} + +module_init(arm_poly1305_mod_init); +module_exit(arm_poly1305_mod_exit); + +MODULE_LICENSE("GPL v2"); +MODULE_ALIAS_CRYPTO("poly1305"); +MODULE_ALIAS_CRYPTO("poly1305-arm"); +MODULE_ALIAS_CRYPTO("poly1305-neon"); diff --git a/arch/arm/kernel/hw_breakpoint.c b/arch/arm/kernel/hw_breakpoint.c index 8a8470d36c65..97fa9c167757 100644 --- a/arch/arm/kernel/hw_breakpoint.c +++ b/arch/arm/kernel/hw_breakpoint.c @@ -688,6 +688,40 @@ static void disable_single_step(struct perf_event *bp) arch_install_hw_breakpoint(bp); } +/* + * Arm32 hardware does not always report a watchpoint hit address that matches + * one of the watchpoints set. It can also report an address "near" the + * watchpoint if a single instruction access both watched and unwatched + * addresses. There is no straight-forward way, short of disassembling the + * offending instruction, to map that address back to the watchpoint. This + * function computes the distance of the memory access from the watchpoint as a + * heuristic for the likelyhood that a given access triggered the watchpoint. + * + * See this same function in the arm64 platform code, which has the same + * problem. + * + * The function returns the distance of the address from the bytes watched by + * the watchpoint. In case of an exact match, it returns 0. + */ +static u32 get_distance_from_watchpoint(unsigned long addr, u32 val, + struct arch_hw_breakpoint_ctrl *ctrl) +{ + u32 wp_low, wp_high; + u32 lens, lene; + + lens = __ffs(ctrl->len); + lene = __fls(ctrl->len); + + wp_low = val + lens; + wp_high = val + lene; + if (addr < wp_low) + return wp_low - addr; + else if (addr > wp_high) + return addr - wp_high; + else + return 0; +} + static int watchpoint_fault_on_uaccess(struct pt_regs *regs, struct arch_hw_breakpoint *info) { @@ -697,23 +731,25 @@ static int watchpoint_fault_on_uaccess(struct pt_regs *regs, static void watchpoint_handler(unsigned long addr, unsigned int fsr, struct pt_regs *regs) { - int i, access; - u32 val, ctrl_reg, alignment_mask; + int i, access, closest_match = 0; + u32 min_dist = -1, dist; + u32 val, ctrl_reg; struct perf_event *wp, **slots; struct arch_hw_breakpoint *info; struct arch_hw_breakpoint_ctrl ctrl; slots = this_cpu_ptr(wp_on_reg); + /* + * Find all watchpoints that match the reported address. If no exact + * match is found. Attribute the hit to the closest watchpoint. + */ + rcu_read_lock(); for (i = 0; i < core_num_wrps; ++i) { - rcu_read_lock(); - wp = slots[i]; - if (wp == NULL) - goto unlock; + continue; - info = counter_arch_bp(wp); /* * The DFAR is an unknown value on debug architectures prior * to 7.1. Since we only allow a single watchpoint on these @@ -722,33 +758,31 @@ static void watchpoint_handler(unsigned long addr, unsigned int fsr, */ if (debug_arch < ARM_DEBUG_ARCH_V7_1) { BUG_ON(i > 0); + info = counter_arch_bp(wp); info->trigger = wp->attr.bp_addr; } else { - if (info->ctrl.len == ARM_BREAKPOINT_LEN_8) - alignment_mask = 0x7; - else - alignment_mask = 0x3; - - /* Check if the watchpoint value matches. */ - val = read_wb_reg(ARM_BASE_WVR + i); - if (val != (addr & ~alignment_mask)) - goto unlock; - - /* Possible match, check the byte address select. */ - ctrl_reg = read_wb_reg(ARM_BASE_WCR + i); - decode_ctrl_reg(ctrl_reg, &ctrl); - if (!((1 << (addr & alignment_mask)) & ctrl.len)) - goto unlock; - /* Check that the access type matches. */ if (debug_exception_updates_fsr()) { access = (fsr & ARM_FSR_ACCESS_MASK) ? HW_BREAKPOINT_W : HW_BREAKPOINT_R; if (!(access & hw_breakpoint_type(wp))) - goto unlock; + continue; } + val = read_wb_reg(ARM_BASE_WVR + i); + ctrl_reg = read_wb_reg(ARM_BASE_WCR + i); + decode_ctrl_reg(ctrl_reg, &ctrl); + dist = get_distance_from_watchpoint(addr, val, &ctrl); + if (dist < min_dist) { + min_dist = dist; + closest_match = i; + } + /* Is this an exact match? */ + if (dist != 0) + continue; + /* We have a winner. */ + info = counter_arch_bp(wp); info->trigger = addr; } @@ -770,13 +804,23 @@ static void watchpoint_handler(unsigned long addr, unsigned int fsr, * we can single-step over the watchpoint trigger. */ if (!is_default_overflow_handler(wp)) - goto unlock; - + continue; step: enable_single_step(wp, instruction_pointer(regs)); -unlock: - rcu_read_unlock(); } + + if (min_dist > 0 && min_dist != -1) { + /* No exact match found. */ + wp = slots[closest_match]; + info = counter_arch_bp(wp); + info->trigger = addr; + pr_debug("watchpoint fired: address = 0x%x\n", info->trigger); + perf_bp_event(wp, regs); + if (is_default_overflow_handler(wp)) + enable_single_step(wp, instruction_pointer(regs)); + } + + rcu_read_unlock(); } static void watchpoint_single_step_handler(unsigned long pc) diff --git a/arch/arm/mm/cache-l2x0.c b/arch/arm/mm/cache-l2x0.c index 808efbb89b88..02f613def40d 100644 --- a/arch/arm/mm/cache-l2x0.c +++ b/arch/arm/mm/cache-l2x0.c @@ -1261,20 +1261,28 @@ static void __init l2c310_of_parse(const struct device_node *np, ret = of_property_read_u32(np, "prefetch-data", &val); if (ret == 0) { - if (val) + if (val) { prefetch |= L310_PREFETCH_CTRL_DATA_PREFETCH; - else + *aux_val |= L310_PREFETCH_CTRL_DATA_PREFETCH; + } else { prefetch &= ~L310_PREFETCH_CTRL_DATA_PREFETCH; + *aux_val &= ~L310_PREFETCH_CTRL_DATA_PREFETCH; + } + *aux_mask &= ~L310_PREFETCH_CTRL_DATA_PREFETCH; } else if (ret != -EINVAL) { pr_err("L2C-310 OF prefetch-data property value is missing\n"); } ret = of_property_read_u32(np, "prefetch-instr", &val); if (ret == 0) { - if (val) + if (val) { prefetch |= L310_PREFETCH_CTRL_INSTR_PREFETCH; - else + *aux_val |= L310_PREFETCH_CTRL_INSTR_PREFETCH; + } else { prefetch &= ~L310_PREFETCH_CTRL_INSTR_PREFETCH; + *aux_val &= ~L310_PREFETCH_CTRL_INSTR_PREFETCH; + } + *aux_mask &= ~L310_PREFETCH_CTRL_INSTR_PREFETCH; } else if (ret != -EINVAL) { pr_err("L2C-310 OF prefetch-instr property value is missing\n"); } diff --git a/arch/arm/plat-samsung/Kconfig b/arch/arm/plat-samsung/Kconfig index 377ff9cda667..c83baa51289f 100644 --- a/arch/arm/plat-samsung/Kconfig +++ b/arch/arm/plat-samsung/Kconfig @@ -240,6 +240,7 @@ config SAMSUNG_PM_DEBUG bool "Samsung PM Suspend debug" depends on PM && DEBUG_KERNEL depends on DEBUG_EXYNOS_UART || DEBUG_S3C24XX_UART || DEBUG_S3C2410_UART + depends on DEBUG_LL && MMU help Say Y here if you want verbose debugging from the PM Suspend and Resume code. See diff --git a/arch/arm64/Kconfig.platforms b/arch/arm64/Kconfig.platforms index 77935524dc5b..eef5ae859f7b 100644 --- a/arch/arm64/Kconfig.platforms +++ b/arch/arm64/Kconfig.platforms @@ -46,6 +46,7 @@ config ARCH_BCM_IPROC config ARCH_BERLIN bool "Marvell Berlin SoC Family" select DW_APB_ICTL + select DW_APB_TIMER_OF select GPIOLIB select PINCTRL help diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile index 9fe11fe38f87..ddb607140f7c 100644 --- a/arch/arm64/Makefile +++ b/arch/arm64/Makefile @@ -10,7 +10,7 @@ # # Copyright (C) 1995-2001 by Russell King -LDFLAGS_vmlinux :=--no-undefined -X +LDFLAGS_vmlinux :=--no-undefined -X -z norelro CPPFLAGS_vmlinux.lds = -DTEXT_OFFSET=$(TEXT_OFFSET) GZFLAGS :=-9 @@ -18,7 +18,7 @@ ifeq ($(CONFIG_RELOCATABLE), y) # Pass --no-apply-dynamic-relocs to restore pre-binutils-2.27 behaviour # for relative relocs, since this leads to better Image compression # with the relocation offsets always being zero. -LDFLAGS_vmlinux += -shared -Bsymbolic -z notext -z norelro \ +LDFLAGS_vmlinux += -shared -Bsymbolic -z notext \ $(call ld-option, --no-apply-dynamic-relocs) endif diff --git a/arch/arm64/boot/dts/marvell/armada-3720-espressobin.dts b/arch/arm64/boot/dts/marvell/armada-3720-espressobin.dts index 6cbdd66921aa..1a3e6e3b04eb 100644 --- a/arch/arm64/boot/dts/marvell/armada-3720-espressobin.dts +++ b/arch/arm64/boot/dts/marvell/armada-3720-espressobin.dts @@ -21,6 +21,10 @@ aliases { ethernet0 = ð0; + /* for dsa slave device */ + ethernet1 = &switch0port1; + ethernet2 = &switch0port2; + ethernet3 = &switch0port3; serial0 = &uart0; serial1 = &uart1; }; @@ -136,25 +140,25 @@ #address-cells = <1>; #size-cells = <0>; - port@0 { + switch0port0: port@0 { reg = <0>; label = "cpu"; ethernet = <ð0>; }; - port@1 { + switch0port1: port@1 { reg = <1>; label = "wan"; phy-handle = <&switch0phy0>; }; - port@2 { + switch0port2: port@2 { reg = <2>; label = "lan0"; phy-handle = <&switch0phy1>; }; - port@3 { + switch0port3: port@3 { reg = <3>; label = "lan1"; phy-handle = <&switch0phy2>; diff --git a/arch/arm64/boot/dts/qcom/msm8916.dtsi b/arch/arm64/boot/dts/qcom/msm8916.dtsi index 8011e564a234..2c5193ae2027 100644 --- a/arch/arm64/boot/dts/qcom/msm8916.dtsi +++ b/arch/arm64/boot/dts/qcom/msm8916.dtsi @@ -877,7 +877,7 @@ reg-names = "mdp_phys"; interrupt-parent = <&mdss>; - interrupts = <0 0>; + interrupts = <0>; clocks = <&gcc GCC_MDSS_AHB_CLK>, <&gcc GCC_MDSS_AXI_CLK>, @@ -909,7 +909,7 @@ reg-names = "dsi_ctrl"; interrupt-parent = <&mdss>; - interrupts = <4 0>; + interrupts = <4>; assigned-clocks = <&gcc BYTE0_CLK_SRC>, <&gcc PCLK0_CLK_SRC>; diff --git a/arch/arm64/boot/dts/qcom/pm8916.dtsi b/arch/arm64/boot/dts/qcom/pm8916.dtsi index 196b1c0ceb9b..b968afa8da17 100644 --- a/arch/arm64/boot/dts/qcom/pm8916.dtsi +++ b/arch/arm64/boot/dts/qcom/pm8916.dtsi @@ -99,7 +99,7 @@ wcd_codec: codec@f000 { compatible = "qcom,pm8916-wcd-analog-codec"; - reg = <0xf000 0x200>; + reg = <0xf000>; reg-names = "pmic-codec-core"; clocks = <&gcc GCC_CODEC_DIGCODEC_CLK>; clock-names = "mclk"; diff --git a/arch/arm64/boot/dts/renesas/ulcb.dtsi b/arch/arm64/boot/dts/renesas/ulcb.dtsi index 0ead552d7eae..600adc25eaef 100644 --- a/arch/arm64/boot/dts/renesas/ulcb.dtsi +++ b/arch/arm64/boot/dts/renesas/ulcb.dtsi @@ -430,6 +430,7 @@ bus-width = <8>; mmc-hs200-1_8v; non-removable; + full-pwr-cycle-in-suspend; status = "okay"; }; diff --git a/arch/arm64/boot/dts/xilinx/zynqmp.dtsi b/arch/arm64/boot/dts/xilinx/zynqmp.dtsi index a516c0e01429..8a885ae647b7 100644 --- a/arch/arm64/boot/dts/xilinx/zynqmp.dtsi +++ b/arch/arm64/boot/dts/xilinx/zynqmp.dtsi @@ -411,7 +411,7 @@ }; i2c0: i2c@ff020000 { - compatible = "cdns,i2c-r1p14", "cdns,i2c-r1p10"; + compatible = "cdns,i2c-r1p14"; status = "disabled"; interrupt-parent = <&gic>; interrupts = <0 17 4>; @@ -421,7 +421,7 @@ }; i2c1: i2c@ff030000 { - compatible = "cdns,i2c-r1p14", "cdns,i2c-r1p10"; + compatible = "cdns,i2c-r1p14"; status = "disabled"; interrupt-parent = <&gic>; interrupts = <0 18 4>; diff --git a/arch/arm64/configs/gki_defconfig b/arch/arm64/configs/gki_defconfig index d79a0e2287ae..6f6ce66e9ae2 100644 --- a/arch/arm64/configs/gki_defconfig +++ b/arch/arm64/configs/gki_defconfig @@ -77,7 +77,6 @@ CONFIG_ARM_SCMI_PROTOCOL=y CONFIG_ARM_SCPI_PROTOCOL=y # CONFIG_ARM_SCPI_POWER_DOMAIN is not set # CONFIG_EFI_ARMSTUB_DTB_LOADER is not set -CONFIG_ARM64_CRYPTO=y CONFIG_CRYPTO_SHA2_ARM64_CE=y CONFIG_CRYPTO_AES_ARM64_CE_BLK=y CONFIG_JUMP_LABEL=y @@ -246,6 +245,7 @@ CONFIG_DM_VERITY_FEC=y CONFIG_DM_BOW=y CONFIG_NETDEVICES=y CONFIG_DUMMY=y +CONFIG_WIREGUARD=y CONFIG_TUN=y CONFIG_VETH=y # CONFIG_ETHERNET is not set @@ -358,6 +358,7 @@ CONFIG_HID_NINTENDO=y CONFIG_HID_SONY=y CONFIG_HID_STEAM=y CONFIG_USB_HIDDEV=y +CONFIG_USB_ANNOUNCE_NEW_DEVICES=y CONFIG_USB_OTG=y CONFIG_USB_XHCI_HCD=y CONFIG_USB_GADGET=y @@ -503,6 +504,7 @@ CONFIG_CRC8=y CONFIG_XZ_DEC=y CONFIG_PRINTK_TIME=y CONFIG_DEBUG_INFO=y +CONFIG_DEBUG_INFO_DWARF4=y # CONFIG_ENABLE_MUST_CHECK is not set # CONFIG_SECTION_MISMATCH_WARN_ONLY is not set CONFIG_MAGIC_SYSRQ=y diff --git a/arch/arm64/crypto/.gitignore b/arch/arm64/crypto/.gitignore index 879df8781ed5..e403b1343328 100644 --- a/arch/arm64/crypto/.gitignore +++ b/arch/arm64/crypto/.gitignore @@ -1,2 +1,3 @@ sha256-core.S sha512-core.S +poly1305-core.S diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig index d51944ff9f91..273d727fb3f1 100644 --- a/arch/arm64/crypto/Kconfig +++ b/arch/arm64/crypto/Kconfig @@ -106,10 +106,17 @@ config CRYPTO_AES_ARM64_NEON_BLK select CRYPTO_SIMD config CRYPTO_CHACHA20_NEON - tristate "NEON accelerated ChaCha20 symmetric cipher" + tristate "ChaCha20, XChaCha20, and XChaCha12 stream ciphers using NEON instructions" depends on KERNEL_MODE_NEON select CRYPTO_BLKCIPHER - select CRYPTO_CHACHA20 + select CRYPTO_LIB_CHACHA_GENERIC + select CRYPTO_ARCH_HAVE_LIB_CHACHA + +config CRYPTO_POLY1305_NEON + tristate "Poly1305 hash function using scalar or NEON instructions" + depends on KERNEL_MODE_NEON + select CRYPTO_HASH + select CRYPTO_ARCH_HAVE_LIB_POLY1305 config CRYPTO_AES_ARM64_BS tristate "AES in ECB/CBC/CTR/XTS modes using bit-sliced NEON algorithm" diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile index 7bc4bda6d9c6..dbb74e1f4119 100644 --- a/arch/arm64/crypto/Makefile +++ b/arch/arm64/crypto/Makefile @@ -53,8 +53,12 @@ sha256-arm64-y := sha256-glue.o sha256-core.o obj-$(CONFIG_CRYPTO_SHA512_ARM64) += sha512-arm64.o sha512-arm64-y := sha512-glue.o sha512-core.o -obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha20-neon.o -chacha20-neon-y := chacha20-neon-core.o chacha20-neon-glue.o +obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha-neon.o +chacha-neon-y := chacha-neon-core.o chacha-neon-glue.o + +obj-$(CONFIG_CRYPTO_POLY1305_NEON) += poly1305-neon.o +poly1305-neon-y := poly1305-core.o poly1305-glue.o +AFLAGS_poly1305-core.o += -Dpoly1305_init=poly1305_init_arm64 obj-$(CONFIG_CRYPTO_AES_ARM64) += aes-arm64.o aes-arm64-y := aes-cipher-core.o aes-cipher-glue.o @@ -71,6 +75,9 @@ ifdef REGENERATE_ARM64_CRYPTO quiet_cmd_perlasm = PERLASM $@ cmd_perlasm = $(PERL) $(<) void $(@) +$(src)/poly1305-core.S_shipped: $(src)/poly1305-armv8.pl + $(call cmd,perlasm) + $(src)/sha256-core.S_shipped: $(src)/sha512-armv8.pl $(call cmd,perlasm) @@ -78,4 +85,4 @@ $(src)/sha512-core.S_shipped: $(src)/sha512-armv8.pl $(call cmd,perlasm) endif -targets += sha256-core.S sha512-core.S +targets += poly1305-core.S sha256-core.S sha512-core.S diff --git a/arch/arm64/crypto/chacha20-neon-core.S b/arch/arm64/crypto/chacha-neon-core.S similarity index 51% rename from arch/arm64/crypto/chacha20-neon-core.S rename to arch/arm64/crypto/chacha-neon-core.S index 13c85e272c2a..706c4e10e9e2 100644 --- a/arch/arm64/crypto/chacha20-neon-core.S +++ b/arch/arm64/crypto/chacha-neon-core.S @@ -1,13 +1,13 @@ /* - * ChaCha20 256-bit cipher algorithm, RFC7539, arm64 NEON functions + * ChaCha/XChaCha NEON helper functions * - * Copyright (C) 2016 Linaro, Ltd. + * Copyright (C) 2016-2018 Linaro, Ltd. * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License version 2 as * published by the Free Software Foundation. * - * Based on: + * Originally based on: * ChaCha20 256-bit cipher algorithm, RFC7539, x64 SSSE3 functions * * Copyright (C) 2015 Martin Willi @@ -19,29 +19,27 @@ */ #include +#include +#include .text .align 6 -ENTRY(chacha20_block_xor_neon) - // x0: Input state matrix, s - // x1: 1 data block output, o - // x2: 1 data block input, i +/* + * chacha_permute - permute one block + * + * Permute one 64-byte block where the state matrix is stored in the four NEON + * registers v0-v3. It performs matrix operations on four words in parallel, + * but requires shuffling to rearrange the words after each round. + * + * The round count is given in w3. + * + * Clobbers: w3, x10, v4, v12 + */ +chacha_permute: - // - // This function encrypts one ChaCha20 block by loading the state matrix - // in four NEON registers. It performs matrix operation on four words in - // parallel, but requires shuffling to rearrange the words after each - // round. - // - - // x0..3 = s0..3 - adr x3, ROT8 - ld1 {v0.4s-v3.4s}, [x0] - ld1 {v8.4s-v11.4s}, [x0] - ld1 {v12.4s}, [x3] - - mov x3, #10 + adr_l x10, ROT8 + ld1 {v12.4s}, [x10] .Ldoubleround: // x0 += x1, x3 = rotl32(x3 ^ x0, 16) @@ -102,9 +100,27 @@ ENTRY(chacha20_block_xor_neon) // x3 = shuffle32(x3, MASK(0, 3, 2, 1)) ext v3.16b, v3.16b, v3.16b, #4 - subs x3, x3, #1 + subs w3, w3, #2 b.ne .Ldoubleround + ret +ENDPROC(chacha_permute) + +ENTRY(chacha_block_xor_neon) + // x0: Input state matrix, s + // x1: 1 data block output, o + // x2: 1 data block input, i + // w3: nrounds + + stp x29, x30, [sp, #-16]! + mov x29, sp + + // x0..3 = s0..3 + ld1 {v0.4s-v3.4s}, [x0] + ld1 {v8.4s-v11.4s}, [x0] + + bl chacha_permute + ld1 {v4.16b-v7.16b}, [x2] // o0 = i0 ^ (x0 + s0) @@ -125,71 +141,156 @@ ENTRY(chacha20_block_xor_neon) st1 {v0.16b-v3.16b}, [x1] + ldp x29, x30, [sp], #16 ret -ENDPROC(chacha20_block_xor_neon) +ENDPROC(chacha_block_xor_neon) + +ENTRY(hchacha_block_neon) + // x0: Input state matrix, s + // x1: output (8 32-bit words) + // w2: nrounds + + stp x29, x30, [sp, #-16]! + mov x29, sp + + ld1 {v0.4s-v3.4s}, [x0] + + mov w3, w2 + bl chacha_permute + + st1 {v0.4s}, [x1], #16 + st1 {v3.4s}, [x1] + + ldp x29, x30, [sp], #16 + ret +ENDPROC(hchacha_block_neon) + + a0 .req w12 + a1 .req w13 + a2 .req w14 + a3 .req w15 + a4 .req w16 + a5 .req w17 + a6 .req w19 + a7 .req w20 + a8 .req w21 + a9 .req w22 + a10 .req w23 + a11 .req w24 + a12 .req w25 + a13 .req w26 + a14 .req w27 + a15 .req w28 .align 6 -ENTRY(chacha20_4block_xor_neon) +ENTRY(chacha_4block_xor_neon) + frame_push 10 + // x0: Input state matrix, s // x1: 4 data blocks output, o // x2: 4 data blocks input, i + // w3: nrounds + // x4: byte count + + adr_l x10, .Lpermute + and x5, x4, #63 + add x10, x10, x5 + add x11, x10, #64 // - // This function encrypts four consecutive ChaCha20 blocks by loading + // This function encrypts four consecutive ChaCha blocks by loading // the state matrix in NEON registers four times. The algorithm performs // each operation on the corresponding word of each state matrix, hence // requires no word shuffling. For final XORing step we transpose the // matrix by interleaving 32- and then 64-bit words, which allows us to // do XOR in NEON registers. // - adr x3, CTRINC // ... and ROT8 - ld1 {v30.4s-v31.4s}, [x3] + // At the same time, a fifth block is encrypted in parallel using + // scalar registers + // + adr_l x9, CTRINC // ... and ROT8 + ld1 {v30.4s-v31.4s}, [x9] // x0..15[0-3] = s0..3[0..3] - mov x4, x0 - ld4r { v0.4s- v3.4s}, [x4], #16 - ld4r { v4.4s- v7.4s}, [x4], #16 - ld4r { v8.4s-v11.4s}, [x4], #16 - ld4r {v12.4s-v15.4s}, [x4] + add x8, x0, #16 + ld4r { v0.4s- v3.4s}, [x0] + ld4r { v4.4s- v7.4s}, [x8], #16 + ld4r { v8.4s-v11.4s}, [x8], #16 + ld4r {v12.4s-v15.4s}, [x8] - // x12 += counter values 0-3 + mov a0, v0.s[0] + mov a1, v1.s[0] + mov a2, v2.s[0] + mov a3, v3.s[0] + mov a4, v4.s[0] + mov a5, v5.s[0] + mov a6, v6.s[0] + mov a7, v7.s[0] + mov a8, v8.s[0] + mov a9, v9.s[0] + mov a10, v10.s[0] + mov a11, v11.s[0] + mov a12, v12.s[0] + mov a13, v13.s[0] + mov a14, v14.s[0] + mov a15, v15.s[0] + + // x12 += counter values 1-4 add v12.4s, v12.4s, v30.4s - mov x3, #10 - .Ldoubleround4: // x0 += x4, x12 = rotl32(x12 ^ x0, 16) // x1 += x5, x13 = rotl32(x13 ^ x1, 16) // x2 += x6, x14 = rotl32(x14 ^ x2, 16) // x3 += x7, x15 = rotl32(x15 ^ x3, 16) add v0.4s, v0.4s, v4.4s + add a0, a0, a4 add v1.4s, v1.4s, v5.4s + add a1, a1, a5 add v2.4s, v2.4s, v6.4s + add a2, a2, a6 add v3.4s, v3.4s, v7.4s + add a3, a3, a7 eor v12.16b, v12.16b, v0.16b + eor a12, a12, a0 eor v13.16b, v13.16b, v1.16b + eor a13, a13, a1 eor v14.16b, v14.16b, v2.16b + eor a14, a14, a2 eor v15.16b, v15.16b, v3.16b + eor a15, a15, a3 rev32 v12.8h, v12.8h + ror a12, a12, #16 rev32 v13.8h, v13.8h + ror a13, a13, #16 rev32 v14.8h, v14.8h + ror a14, a14, #16 rev32 v15.8h, v15.8h + ror a15, a15, #16 // x8 += x12, x4 = rotl32(x4 ^ x8, 12) // x9 += x13, x5 = rotl32(x5 ^ x9, 12) // x10 += x14, x6 = rotl32(x6 ^ x10, 12) // x11 += x15, x7 = rotl32(x7 ^ x11, 12) add v8.4s, v8.4s, v12.4s + add a8, a8, a12 add v9.4s, v9.4s, v13.4s + add a9, a9, a13 add v10.4s, v10.4s, v14.4s + add a10, a10, a14 add v11.4s, v11.4s, v15.4s + add a11, a11, a15 eor v16.16b, v4.16b, v8.16b + eor a4, a4, a8 eor v17.16b, v5.16b, v9.16b + eor a5, a5, a9 eor v18.16b, v6.16b, v10.16b + eor a6, a6, a10 eor v19.16b, v7.16b, v11.16b + eor a7, a7, a11 shl v4.4s, v16.4s, #12 shl v5.4s, v17.4s, #12 @@ -197,42 +298,66 @@ ENTRY(chacha20_4block_xor_neon) shl v7.4s, v19.4s, #12 sri v4.4s, v16.4s, #20 + ror a4, a4, #20 sri v5.4s, v17.4s, #20 + ror a5, a5, #20 sri v6.4s, v18.4s, #20 + ror a6, a6, #20 sri v7.4s, v19.4s, #20 + ror a7, a7, #20 // x0 += x4, x12 = rotl32(x12 ^ x0, 8) // x1 += x5, x13 = rotl32(x13 ^ x1, 8) // x2 += x6, x14 = rotl32(x14 ^ x2, 8) // x3 += x7, x15 = rotl32(x15 ^ x3, 8) add v0.4s, v0.4s, v4.4s + add a0, a0, a4 add v1.4s, v1.4s, v5.4s + add a1, a1, a5 add v2.4s, v2.4s, v6.4s + add a2, a2, a6 add v3.4s, v3.4s, v7.4s + add a3, a3, a7 eor v12.16b, v12.16b, v0.16b + eor a12, a12, a0 eor v13.16b, v13.16b, v1.16b + eor a13, a13, a1 eor v14.16b, v14.16b, v2.16b + eor a14, a14, a2 eor v15.16b, v15.16b, v3.16b + eor a15, a15, a3 tbl v12.16b, {v12.16b}, v31.16b + ror a12, a12, #24 tbl v13.16b, {v13.16b}, v31.16b + ror a13, a13, #24 tbl v14.16b, {v14.16b}, v31.16b + ror a14, a14, #24 tbl v15.16b, {v15.16b}, v31.16b + ror a15, a15, #24 // x8 += x12, x4 = rotl32(x4 ^ x8, 7) // x9 += x13, x5 = rotl32(x5 ^ x9, 7) // x10 += x14, x6 = rotl32(x6 ^ x10, 7) // x11 += x15, x7 = rotl32(x7 ^ x11, 7) add v8.4s, v8.4s, v12.4s + add a8, a8, a12 add v9.4s, v9.4s, v13.4s + add a9, a9, a13 add v10.4s, v10.4s, v14.4s + add a10, a10, a14 add v11.4s, v11.4s, v15.4s + add a11, a11, a15 eor v16.16b, v4.16b, v8.16b + eor a4, a4, a8 eor v17.16b, v5.16b, v9.16b + eor a5, a5, a9 eor v18.16b, v6.16b, v10.16b + eor a6, a6, a10 eor v19.16b, v7.16b, v11.16b + eor a7, a7, a11 shl v4.4s, v16.4s, #7 shl v5.4s, v17.4s, #7 @@ -240,42 +365,66 @@ ENTRY(chacha20_4block_xor_neon) shl v7.4s, v19.4s, #7 sri v4.4s, v16.4s, #25 + ror a4, a4, #25 sri v5.4s, v17.4s, #25 + ror a5, a5, #25 sri v6.4s, v18.4s, #25 + ror a6, a6, #25 sri v7.4s, v19.4s, #25 + ror a7, a7, #25 // x0 += x5, x15 = rotl32(x15 ^ x0, 16) // x1 += x6, x12 = rotl32(x12 ^ x1, 16) // x2 += x7, x13 = rotl32(x13 ^ x2, 16) // x3 += x4, x14 = rotl32(x14 ^ x3, 16) add v0.4s, v0.4s, v5.4s + add a0, a0, a5 add v1.4s, v1.4s, v6.4s + add a1, a1, a6 add v2.4s, v2.4s, v7.4s + add a2, a2, a7 add v3.4s, v3.4s, v4.4s + add a3, a3, a4 eor v15.16b, v15.16b, v0.16b + eor a15, a15, a0 eor v12.16b, v12.16b, v1.16b + eor a12, a12, a1 eor v13.16b, v13.16b, v2.16b + eor a13, a13, a2 eor v14.16b, v14.16b, v3.16b + eor a14, a14, a3 rev32 v15.8h, v15.8h + ror a15, a15, #16 rev32 v12.8h, v12.8h + ror a12, a12, #16 rev32 v13.8h, v13.8h + ror a13, a13, #16 rev32 v14.8h, v14.8h + ror a14, a14, #16 // x10 += x15, x5 = rotl32(x5 ^ x10, 12) // x11 += x12, x6 = rotl32(x6 ^ x11, 12) // x8 += x13, x7 = rotl32(x7 ^ x8, 12) // x9 += x14, x4 = rotl32(x4 ^ x9, 12) add v10.4s, v10.4s, v15.4s + add a10, a10, a15 add v11.4s, v11.4s, v12.4s + add a11, a11, a12 add v8.4s, v8.4s, v13.4s + add a8, a8, a13 add v9.4s, v9.4s, v14.4s + add a9, a9, a14 eor v16.16b, v5.16b, v10.16b + eor a5, a5, a10 eor v17.16b, v6.16b, v11.16b + eor a6, a6, a11 eor v18.16b, v7.16b, v8.16b + eor a7, a7, a8 eor v19.16b, v4.16b, v9.16b + eor a4, a4, a9 shl v5.4s, v16.4s, #12 shl v6.4s, v17.4s, #12 @@ -283,42 +432,66 @@ ENTRY(chacha20_4block_xor_neon) shl v4.4s, v19.4s, #12 sri v5.4s, v16.4s, #20 + ror a5, a5, #20 sri v6.4s, v17.4s, #20 + ror a6, a6, #20 sri v7.4s, v18.4s, #20 + ror a7, a7, #20 sri v4.4s, v19.4s, #20 + ror a4, a4, #20 // x0 += x5, x15 = rotl32(x15 ^ x0, 8) // x1 += x6, x12 = rotl32(x12 ^ x1, 8) // x2 += x7, x13 = rotl32(x13 ^ x2, 8) // x3 += x4, x14 = rotl32(x14 ^ x3, 8) add v0.4s, v0.4s, v5.4s + add a0, a0, a5 add v1.4s, v1.4s, v6.4s + add a1, a1, a6 add v2.4s, v2.4s, v7.4s + add a2, a2, a7 add v3.4s, v3.4s, v4.4s + add a3, a3, a4 eor v15.16b, v15.16b, v0.16b + eor a15, a15, a0 eor v12.16b, v12.16b, v1.16b + eor a12, a12, a1 eor v13.16b, v13.16b, v2.16b + eor a13, a13, a2 eor v14.16b, v14.16b, v3.16b + eor a14, a14, a3 tbl v15.16b, {v15.16b}, v31.16b + ror a15, a15, #24 tbl v12.16b, {v12.16b}, v31.16b + ror a12, a12, #24 tbl v13.16b, {v13.16b}, v31.16b + ror a13, a13, #24 tbl v14.16b, {v14.16b}, v31.16b + ror a14, a14, #24 // x10 += x15, x5 = rotl32(x5 ^ x10, 7) // x11 += x12, x6 = rotl32(x6 ^ x11, 7) // x8 += x13, x7 = rotl32(x7 ^ x8, 7) // x9 += x14, x4 = rotl32(x4 ^ x9, 7) add v10.4s, v10.4s, v15.4s + add a10, a10, a15 add v11.4s, v11.4s, v12.4s + add a11, a11, a12 add v8.4s, v8.4s, v13.4s + add a8, a8, a13 add v9.4s, v9.4s, v14.4s + add a9, a9, a14 eor v16.16b, v5.16b, v10.16b + eor a5, a5, a10 eor v17.16b, v6.16b, v11.16b + eor a6, a6, a11 eor v18.16b, v7.16b, v8.16b + eor a7, a7, a8 eor v19.16b, v4.16b, v9.16b + eor a4, a4, a9 shl v5.4s, v16.4s, #7 shl v6.4s, v17.4s, #7 @@ -326,11 +499,15 @@ ENTRY(chacha20_4block_xor_neon) shl v4.4s, v19.4s, #7 sri v5.4s, v16.4s, #25 + ror a5, a5, #25 sri v6.4s, v17.4s, #25 + ror a6, a6, #25 sri v7.4s, v18.4s, #25 + ror a7, a7, #25 sri v4.4s, v19.4s, #25 + ror a4, a4, #25 - subs x3, x3, #1 + subs w3, w3, #2 b.ne .Ldoubleround4 ld4r {v16.4s-v19.4s}, [x0], #16 @@ -344,9 +521,21 @@ ENTRY(chacha20_4block_xor_neon) // x2[0-3] += s0[2] // x3[0-3] += s0[3] add v0.4s, v0.4s, v16.4s + mov w6, v16.s[0] + mov w7, v17.s[0] add v1.4s, v1.4s, v17.4s + mov w8, v18.s[0] + mov w9, v19.s[0] add v2.4s, v2.4s, v18.4s + add a0, a0, w6 + add a1, a1, w7 add v3.4s, v3.4s, v19.4s + add a2, a2, w8 + add a3, a3, w9 +CPU_BE( rev a0, a0 ) +CPU_BE( rev a1, a1 ) +CPU_BE( rev a2, a2 ) +CPU_BE( rev a3, a3 ) ld4r {v24.4s-v27.4s}, [x0], #16 ld4r {v28.4s-v31.4s}, [x0] @@ -356,95 +545,316 @@ ENTRY(chacha20_4block_xor_neon) // x6[0-3] += s1[2] // x7[0-3] += s1[3] add v4.4s, v4.4s, v20.4s + mov w6, v20.s[0] + mov w7, v21.s[0] add v5.4s, v5.4s, v21.4s + mov w8, v22.s[0] + mov w9, v23.s[0] add v6.4s, v6.4s, v22.4s + add a4, a4, w6 + add a5, a5, w7 add v7.4s, v7.4s, v23.4s + add a6, a6, w8 + add a7, a7, w9 +CPU_BE( rev a4, a4 ) +CPU_BE( rev a5, a5 ) +CPU_BE( rev a6, a6 ) +CPU_BE( rev a7, a7 ) // x8[0-3] += s2[0] // x9[0-3] += s2[1] // x10[0-3] += s2[2] // x11[0-3] += s2[3] add v8.4s, v8.4s, v24.4s + mov w6, v24.s[0] + mov w7, v25.s[0] add v9.4s, v9.4s, v25.4s + mov w8, v26.s[0] + mov w9, v27.s[0] add v10.4s, v10.4s, v26.4s + add a8, a8, w6 + add a9, a9, w7 add v11.4s, v11.4s, v27.4s + add a10, a10, w8 + add a11, a11, w9 +CPU_BE( rev a8, a8 ) +CPU_BE( rev a9, a9 ) +CPU_BE( rev a10, a10 ) +CPU_BE( rev a11, a11 ) // x12[0-3] += s3[0] // x13[0-3] += s3[1] // x14[0-3] += s3[2] // x15[0-3] += s3[3] add v12.4s, v12.4s, v28.4s + mov w6, v28.s[0] + mov w7, v29.s[0] add v13.4s, v13.4s, v29.4s + mov w8, v30.s[0] + mov w9, v31.s[0] add v14.4s, v14.4s, v30.4s + add a12, a12, w6 + add a13, a13, w7 add v15.4s, v15.4s, v31.4s + add a14, a14, w8 + add a15, a15, w9 +CPU_BE( rev a12, a12 ) +CPU_BE( rev a13, a13 ) +CPU_BE( rev a14, a14 ) +CPU_BE( rev a15, a15 ) // interleave 32-bit words in state n, n+1 + ldp w6, w7, [x2], #64 zip1 v16.4s, v0.4s, v1.4s + ldp w8, w9, [x2, #-56] + eor a0, a0, w6 zip2 v17.4s, v0.4s, v1.4s + eor a1, a1, w7 zip1 v18.4s, v2.4s, v3.4s + eor a2, a2, w8 zip2 v19.4s, v2.4s, v3.4s + eor a3, a3, w9 + ldp w6, w7, [x2, #-48] zip1 v20.4s, v4.4s, v5.4s + ldp w8, w9, [x2, #-40] + eor a4, a4, w6 zip2 v21.4s, v4.4s, v5.4s + eor a5, a5, w7 zip1 v22.4s, v6.4s, v7.4s + eor a6, a6, w8 zip2 v23.4s, v6.4s, v7.4s + eor a7, a7, w9 + ldp w6, w7, [x2, #-32] zip1 v24.4s, v8.4s, v9.4s + ldp w8, w9, [x2, #-24] + eor a8, a8, w6 zip2 v25.4s, v8.4s, v9.4s + eor a9, a9, w7 zip1 v26.4s, v10.4s, v11.4s + eor a10, a10, w8 zip2 v27.4s, v10.4s, v11.4s + eor a11, a11, w9 + ldp w6, w7, [x2, #-16] zip1 v28.4s, v12.4s, v13.4s + ldp w8, w9, [x2, #-8] + eor a12, a12, w6 zip2 v29.4s, v12.4s, v13.4s + eor a13, a13, w7 zip1 v30.4s, v14.4s, v15.4s + eor a14, a14, w8 zip2 v31.4s, v14.4s, v15.4s + eor a15, a15, w9 + + mov x3, #64 + subs x5, x4, #128 + add x6, x5, x2 + csel x3, x3, xzr, ge + csel x2, x2, x6, ge // interleave 64-bit words in state n, n+2 zip1 v0.2d, v16.2d, v18.2d zip2 v4.2d, v16.2d, v18.2d + stp a0, a1, [x1], #64 zip1 v8.2d, v17.2d, v19.2d zip2 v12.2d, v17.2d, v19.2d - ld1 {v16.16b-v19.16b}, [x2], #64 + stp a2, a3, [x1, #-56] + ld1 {v16.16b-v19.16b}, [x2], x3 + + subs x6, x4, #192 + ccmp x3, xzr, #4, lt + add x7, x6, x2 + csel x3, x3, xzr, eq + csel x2, x2, x7, eq zip1 v1.2d, v20.2d, v22.2d zip2 v5.2d, v20.2d, v22.2d + stp a4, a5, [x1, #-48] zip1 v9.2d, v21.2d, v23.2d zip2 v13.2d, v21.2d, v23.2d - ld1 {v20.16b-v23.16b}, [x2], #64 + stp a6, a7, [x1, #-40] + ld1 {v20.16b-v23.16b}, [x2], x3 + + subs x7, x4, #256 + ccmp x3, xzr, #4, lt + add x8, x7, x2 + csel x3, x3, xzr, eq + csel x2, x2, x8, eq zip1 v2.2d, v24.2d, v26.2d zip2 v6.2d, v24.2d, v26.2d + stp a8, a9, [x1, #-32] zip1 v10.2d, v25.2d, v27.2d zip2 v14.2d, v25.2d, v27.2d - ld1 {v24.16b-v27.16b}, [x2], #64 + stp a10, a11, [x1, #-24] + ld1 {v24.16b-v27.16b}, [x2], x3 + + subs x8, x4, #320 + ccmp x3, xzr, #4, lt + add x9, x8, x2 + csel x2, x2, x9, eq zip1 v3.2d, v28.2d, v30.2d zip2 v7.2d, v28.2d, v30.2d + stp a12, a13, [x1, #-16] zip1 v11.2d, v29.2d, v31.2d zip2 v15.2d, v29.2d, v31.2d + stp a14, a15, [x1, #-8] ld1 {v28.16b-v31.16b}, [x2] // xor with corresponding input, write to output + tbnz x5, #63, 0f eor v16.16b, v16.16b, v0.16b eor v17.16b, v17.16b, v1.16b eor v18.16b, v18.16b, v2.16b eor v19.16b, v19.16b, v3.16b + st1 {v16.16b-v19.16b}, [x1], #64 + cbz x5, .Lout + + tbnz x6, #63, 1f eor v20.16b, v20.16b, v4.16b eor v21.16b, v21.16b, v5.16b - st1 {v16.16b-v19.16b}, [x1], #64 eor v22.16b, v22.16b, v6.16b eor v23.16b, v23.16b, v7.16b + st1 {v20.16b-v23.16b}, [x1], #64 + cbz x6, .Lout + + tbnz x7, #63, 2f eor v24.16b, v24.16b, v8.16b eor v25.16b, v25.16b, v9.16b - st1 {v20.16b-v23.16b}, [x1], #64 eor v26.16b, v26.16b, v10.16b eor v27.16b, v27.16b, v11.16b - eor v28.16b, v28.16b, v12.16b st1 {v24.16b-v27.16b}, [x1], #64 + cbz x7, .Lout + + tbnz x8, #63, 3f + eor v28.16b, v28.16b, v12.16b eor v29.16b, v29.16b, v13.16b eor v30.16b, v30.16b, v14.16b eor v31.16b, v31.16b, v15.16b st1 {v28.16b-v31.16b}, [x1] +.Lout: frame_pop ret -ENDPROC(chacha20_4block_xor_neon) -CTRINC: .word 0, 1, 2, 3 + // fewer than 128 bytes of in/output +0: ld1 {v8.16b}, [x10] + ld1 {v9.16b}, [x11] + movi v10.16b, #16 + sub x2, x1, #64 + add x1, x1, x5 + ld1 {v16.16b-v19.16b}, [x2] + tbl v4.16b, {v0.16b-v3.16b}, v8.16b + tbx v20.16b, {v16.16b-v19.16b}, v9.16b + add v8.16b, v8.16b, v10.16b + add v9.16b, v9.16b, v10.16b + tbl v5.16b, {v0.16b-v3.16b}, v8.16b + tbx v21.16b, {v16.16b-v19.16b}, v9.16b + add v8.16b, v8.16b, v10.16b + add v9.16b, v9.16b, v10.16b + tbl v6.16b, {v0.16b-v3.16b}, v8.16b + tbx v22.16b, {v16.16b-v19.16b}, v9.16b + add v8.16b, v8.16b, v10.16b + add v9.16b, v9.16b, v10.16b + tbl v7.16b, {v0.16b-v3.16b}, v8.16b + tbx v23.16b, {v16.16b-v19.16b}, v9.16b + + eor v20.16b, v20.16b, v4.16b + eor v21.16b, v21.16b, v5.16b + eor v22.16b, v22.16b, v6.16b + eor v23.16b, v23.16b, v7.16b + st1 {v20.16b-v23.16b}, [x1] + b .Lout + + // fewer than 192 bytes of in/output +1: ld1 {v8.16b}, [x10] + ld1 {v9.16b}, [x11] + movi v10.16b, #16 + add x1, x1, x6 + tbl v0.16b, {v4.16b-v7.16b}, v8.16b + tbx v20.16b, {v16.16b-v19.16b}, v9.16b + add v8.16b, v8.16b, v10.16b + add v9.16b, v9.16b, v10.16b + tbl v1.16b, {v4.16b-v7.16b}, v8.16b + tbx v21.16b, {v16.16b-v19.16b}, v9.16b + add v8.16b, v8.16b, v10.16b + add v9.16b, v9.16b, v10.16b + tbl v2.16b, {v4.16b-v7.16b}, v8.16b + tbx v22.16b, {v16.16b-v19.16b}, v9.16b + add v8.16b, v8.16b, v10.16b + add v9.16b, v9.16b, v10.16b + tbl v3.16b, {v4.16b-v7.16b}, v8.16b + tbx v23.16b, {v16.16b-v19.16b}, v9.16b + + eor v20.16b, v20.16b, v0.16b + eor v21.16b, v21.16b, v1.16b + eor v22.16b, v22.16b, v2.16b + eor v23.16b, v23.16b, v3.16b + st1 {v20.16b-v23.16b}, [x1] + b .Lout + + // fewer than 256 bytes of in/output +2: ld1 {v4.16b}, [x10] + ld1 {v5.16b}, [x11] + movi v6.16b, #16 + add x1, x1, x7 + tbl v0.16b, {v8.16b-v11.16b}, v4.16b + tbx v24.16b, {v20.16b-v23.16b}, v5.16b + add v4.16b, v4.16b, v6.16b + add v5.16b, v5.16b, v6.16b + tbl v1.16b, {v8.16b-v11.16b}, v4.16b + tbx v25.16b, {v20.16b-v23.16b}, v5.16b + add v4.16b, v4.16b, v6.16b + add v5.16b, v5.16b, v6.16b + tbl v2.16b, {v8.16b-v11.16b}, v4.16b + tbx v26.16b, {v20.16b-v23.16b}, v5.16b + add v4.16b, v4.16b, v6.16b + add v5.16b, v5.16b, v6.16b + tbl v3.16b, {v8.16b-v11.16b}, v4.16b + tbx v27.16b, {v20.16b-v23.16b}, v5.16b + + eor v24.16b, v24.16b, v0.16b + eor v25.16b, v25.16b, v1.16b + eor v26.16b, v26.16b, v2.16b + eor v27.16b, v27.16b, v3.16b + st1 {v24.16b-v27.16b}, [x1] + b .Lout + + // fewer than 320 bytes of in/output +3: ld1 {v4.16b}, [x10] + ld1 {v5.16b}, [x11] + movi v6.16b, #16 + add x1, x1, x8 + tbl v0.16b, {v12.16b-v15.16b}, v4.16b + tbx v28.16b, {v24.16b-v27.16b}, v5.16b + add v4.16b, v4.16b, v6.16b + add v5.16b, v5.16b, v6.16b + tbl v1.16b, {v12.16b-v15.16b}, v4.16b + tbx v29.16b, {v24.16b-v27.16b}, v5.16b + add v4.16b, v4.16b, v6.16b + add v5.16b, v5.16b, v6.16b + tbl v2.16b, {v12.16b-v15.16b}, v4.16b + tbx v30.16b, {v24.16b-v27.16b}, v5.16b + add v4.16b, v4.16b, v6.16b + add v5.16b, v5.16b, v6.16b + tbl v3.16b, {v12.16b-v15.16b}, v4.16b + tbx v31.16b, {v24.16b-v27.16b}, v5.16b + + eor v28.16b, v28.16b, v0.16b + eor v29.16b, v29.16b, v1.16b + eor v30.16b, v30.16b, v2.16b + eor v31.16b, v31.16b, v3.16b + st1 {v28.16b-v31.16b}, [x1] + b .Lout +ENDPROC(chacha_4block_xor_neon) + + .section ".rodata", "a", %progbits + .align L1_CACHE_SHIFT +.Lpermute: + .set .Li, 0 + .rept 192 + .byte (.Li - 64) + .set .Li, .Li + 1 + .endr + +CTRINC: .word 1, 2, 3, 4 ROT8: .word 0x02010003, 0x06050407, 0x0a09080b, 0x0e0d0c0f diff --git a/arch/arm/crypto/chacha-neon-glue.c b/arch/arm64/crypto/chacha-neon-glue.c similarity index 56% rename from arch/arm/crypto/chacha-neon-glue.c rename to arch/arm64/crypto/chacha-neon-glue.c index 9d6fda81986d..531567a52671 100644 --- a/arch/arm/crypto/chacha-neon-glue.c +++ b/arch/arm64/crypto/chacha-neon-glue.c @@ -1,8 +1,8 @@ /* - * ARM NEON accelerated ChaCha and XChaCha stream ciphers, + * ARM NEON and scalar accelerated ChaCha and XChaCha stream ciphers, * including ChaCha20 (RFC7539) * - * Copyright (C) 2016 Linaro, Ltd. + * Copyright (C) 2016 - 2017 Linaro, Ltd. * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License version 2 as @@ -20,8 +20,9 @@ */ #include -#include +#include #include +#include #include #include @@ -29,40 +30,78 @@ #include #include -asmlinkage void chacha_block_xor_neon(const u32 *state, u8 *dst, const u8 *src, +asmlinkage void chacha_block_xor_neon(u32 *state, u8 *dst, const u8 *src, int nrounds); -asmlinkage void chacha_4block_xor_neon(const u32 *state, u8 *dst, const u8 *src, - int nrounds); +asmlinkage void chacha_4block_xor_neon(u32 *state, u8 *dst, const u8 *src, + int nrounds, int bytes); asmlinkage void hchacha_block_neon(const u32 *state, u32 *out, int nrounds); -static void chacha_doneon(u32 *state, u8 *dst, const u8 *src, - unsigned int bytes, int nrounds) -{ - u8 buf[CHACHA_BLOCK_SIZE]; +static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_neon); - while (bytes >= CHACHA_BLOCK_SIZE * 4) { - chacha_4block_xor_neon(state, dst, src, nrounds); - bytes -= CHACHA_BLOCK_SIZE * 4; - src += CHACHA_BLOCK_SIZE * 4; - dst += CHACHA_BLOCK_SIZE * 4; - state[12] += 4; - } - while (bytes >= CHACHA_BLOCK_SIZE) { - chacha_block_xor_neon(state, dst, src, nrounds); - bytes -= CHACHA_BLOCK_SIZE; - src += CHACHA_BLOCK_SIZE; - dst += CHACHA_BLOCK_SIZE; - state[12]++; - } - if (bytes) { - memcpy(buf, src, bytes); - chacha_block_xor_neon(state, buf, buf, nrounds); - memcpy(dst, buf, bytes); +static void chacha_doneon(u32 *state, u8 *dst, const u8 *src, + int bytes, int nrounds) +{ + while (bytes > 0) { + int l = min(bytes, CHACHA_BLOCK_SIZE * 5); + + if (l <= CHACHA_BLOCK_SIZE) { + u8 buf[CHACHA_BLOCK_SIZE]; + + memcpy(buf, src, l); + chacha_block_xor_neon(state, buf, buf, nrounds); + memcpy(dst, buf, l); + state[12] += 1; + break; + } + chacha_4block_xor_neon(state, dst, src, nrounds, l); + bytes -= l; + src += l; + dst += l; + state[12] += DIV_ROUND_UP(l, CHACHA_BLOCK_SIZE); } } +void hchacha_block_arch(const u32 *state, u32 *stream, int nrounds) +{ + if (!static_branch_likely(&have_neon) || !may_use_simd()) { + hchacha_block_generic(state, stream, nrounds); + } else { + kernel_neon_begin(); + hchacha_block_neon(state, stream, nrounds); + kernel_neon_end(); + } +} +EXPORT_SYMBOL(hchacha_block_arch); + +void chacha_init_arch(u32 *state, const u32 *key, const u8 *iv) +{ + chacha_init_generic(state, key, iv); +} +EXPORT_SYMBOL(chacha_init_arch); + +void chacha_crypt_arch(u32 *state, u8 *dst, const u8 *src, unsigned int bytes, + int nrounds) +{ + if (!static_branch_likely(&have_neon) || bytes <= CHACHA_BLOCK_SIZE || + !may_use_simd()) + return chacha_crypt_generic(state, dst, src, bytes, nrounds); + + do { + unsigned int todo = min_t(unsigned int, bytes, SZ_4K); + + kernel_neon_begin(); + chacha_doneon(state, dst, src, todo, nrounds); + kernel_neon_end(); + + bytes -= todo; + src += todo; + dst += todo; + } while (bytes); +} +EXPORT_SYMBOL(chacha_crypt_arch); + static int chacha_neon_stream_xor(struct skcipher_request *req, - struct chacha_ctx *ctx, u8 *iv) + const struct chacha_ctx *ctx, const u8 *iv) { struct skcipher_walk walk; u32 state[16]; @@ -70,18 +109,25 @@ static int chacha_neon_stream_xor(struct skcipher_request *req, err = skcipher_walk_virt(&walk, req, false); - crypto_chacha_init(state, ctx, iv); + chacha_init_generic(state, ctx->key, iv); while (walk.nbytes > 0) { unsigned int nbytes = walk.nbytes; if (nbytes < walk.total) - nbytes = round_down(nbytes, walk.stride); + nbytes = rounddown(nbytes, walk.stride); - kernel_neon_begin(); - chacha_doneon(state, walk.dst.virt.addr, walk.src.virt.addr, - nbytes, ctx->nrounds); - kernel_neon_end(); + if (!static_branch_likely(&have_neon) || + !may_use_simd()) { + chacha_crypt_generic(state, walk.dst.virt.addr, + walk.src.virt.addr, nbytes, + ctx->nrounds); + } else { + kernel_neon_begin(); + chacha_doneon(state, walk.dst.virt.addr, + walk.src.virt.addr, nbytes, ctx->nrounds); + kernel_neon_end(); + } err = skcipher_walk_done(&walk, walk.nbytes - nbytes); } @@ -93,9 +139,6 @@ static int chacha_neon(struct skcipher_request *req) struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm); - if (req->cryptlen <= CHACHA_BLOCK_SIZE || !may_use_simd()) - return crypto_chacha_crypt(req); - return chacha_neon_stream_xor(req, ctx, req->iv); } @@ -107,14 +150,8 @@ static int xchacha_neon(struct skcipher_request *req) u32 state[16]; u8 real_iv[16]; - if (req->cryptlen <= CHACHA_BLOCK_SIZE || !may_use_simd()) - return crypto_xchacha_crypt(req); - - crypto_chacha_init(state, ctx, req->iv); - - kernel_neon_begin(); - hchacha_block_neon(state, subctx.key, ctx->nrounds); - kernel_neon_end(); + chacha_init_generic(state, ctx->key, req->iv); + hchacha_block_arch(state, subctx.key, ctx->nrounds); subctx.nrounds = ctx->nrounds; memcpy(&real_iv[0], req->iv + 24, 8); @@ -135,8 +172,8 @@ static struct skcipher_alg algs[] = { .max_keysize = CHACHA_KEY_SIZE, .ivsize = CHACHA_IV_SIZE, .chunksize = CHACHA_BLOCK_SIZE, - .walksize = 4 * CHACHA_BLOCK_SIZE, - .setkey = crypto_chacha20_setkey, + .walksize = 5 * CHACHA_BLOCK_SIZE, + .setkey = chacha20_setkey, .encrypt = chacha_neon, .decrypt = chacha_neon, }, { @@ -151,8 +188,8 @@ static struct skcipher_alg algs[] = { .max_keysize = CHACHA_KEY_SIZE, .ivsize = XCHACHA_IV_SIZE, .chunksize = CHACHA_BLOCK_SIZE, - .walksize = 4 * CHACHA_BLOCK_SIZE, - .setkey = crypto_chacha20_setkey, + .walksize = 5 * CHACHA_BLOCK_SIZE, + .setkey = chacha20_setkey, .encrypt = xchacha_neon, .decrypt = xchacha_neon, }, { @@ -167,8 +204,8 @@ static struct skcipher_alg algs[] = { .max_keysize = CHACHA_KEY_SIZE, .ivsize = XCHACHA_IV_SIZE, .chunksize = CHACHA_BLOCK_SIZE, - .walksize = 4 * CHACHA_BLOCK_SIZE, - .setkey = crypto_chacha12_setkey, + .walksize = 5 * CHACHA_BLOCK_SIZE, + .setkey = chacha12_setkey, .encrypt = xchacha_neon, .decrypt = xchacha_neon, } @@ -176,15 +213,19 @@ static struct skcipher_alg algs[] = { static int __init chacha_simd_mod_init(void) { - if (!(elf_hwcap & HWCAP_NEON)) - return -ENODEV; + if (!(elf_hwcap & HWCAP_ASIMD)) + return 0; - return crypto_register_skciphers(algs, ARRAY_SIZE(algs)); + static_branch_enable(&have_neon); + + return IS_REACHABLE(CONFIG_CRYPTO_BLKCIPHER) ? + crypto_register_skciphers(algs, ARRAY_SIZE(algs)) : 0; } static void __exit chacha_simd_mod_fini(void) { - crypto_unregister_skciphers(algs, ARRAY_SIZE(algs)); + if (IS_REACHABLE(CONFIG_CRYPTO_BLKCIPHER) && (elf_hwcap & HWCAP_ASIMD)) + crypto_unregister_skciphers(algs, ARRAY_SIZE(algs)); } module_init(chacha_simd_mod_init); diff --git a/arch/arm64/crypto/chacha20-neon-glue.c b/arch/arm64/crypto/chacha20-neon-glue.c deleted file mode 100644 index 96e0cfb8c3f5..000000000000 --- a/arch/arm64/crypto/chacha20-neon-glue.c +++ /dev/null @@ -1,133 +0,0 @@ -/* - * ChaCha20 256-bit cipher algorithm, RFC7539, arm64 NEON functions - * - * Copyright (C) 2016 - 2017 Linaro, Ltd. - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. - * - * Based on: - * ChaCha20 256-bit cipher algorithm, RFC7539, SIMD glue code - * - * Copyright (C) 2015 Martin Willi - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - */ - -#include -#include -#include -#include -#include - -#include -#include -#include - -asmlinkage void chacha20_block_xor_neon(u32 *state, u8 *dst, const u8 *src); -asmlinkage void chacha20_4block_xor_neon(u32 *state, u8 *dst, const u8 *src); - -static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src, - unsigned int bytes) -{ - u8 buf[CHACHA_BLOCK_SIZE]; - - while (bytes >= CHACHA_BLOCK_SIZE * 4) { - kernel_neon_begin(); - chacha20_4block_xor_neon(state, dst, src); - kernel_neon_end(); - bytes -= CHACHA_BLOCK_SIZE * 4; - src += CHACHA_BLOCK_SIZE * 4; - dst += CHACHA_BLOCK_SIZE * 4; - state[12] += 4; - } - - if (!bytes) - return; - - kernel_neon_begin(); - while (bytes >= CHACHA_BLOCK_SIZE) { - chacha20_block_xor_neon(state, dst, src); - bytes -= CHACHA_BLOCK_SIZE; - src += CHACHA_BLOCK_SIZE; - dst += CHACHA_BLOCK_SIZE; - state[12]++; - } - if (bytes) { - memcpy(buf, src, bytes); - chacha20_block_xor_neon(state, buf, buf); - memcpy(dst, buf, bytes); - } - kernel_neon_end(); -} - -static int chacha20_neon(struct skcipher_request *req) -{ - struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); - struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm); - struct skcipher_walk walk; - u32 state[16]; - int err; - - if (!may_use_simd() || req->cryptlen <= CHACHA_BLOCK_SIZE) - return crypto_chacha_crypt(req); - - err = skcipher_walk_virt(&walk, req, false); - - crypto_chacha_init(state, ctx, walk.iv); - - while (walk.nbytes > 0) { - unsigned int nbytes = walk.nbytes; - - if (nbytes < walk.total) - nbytes = round_down(nbytes, walk.stride); - - chacha20_doneon(state, walk.dst.virt.addr, walk.src.virt.addr, - nbytes); - err = skcipher_walk_done(&walk, walk.nbytes - nbytes); - } - - return err; -} - -static struct skcipher_alg alg = { - .base.cra_name = "chacha20", - .base.cra_driver_name = "chacha20-neon", - .base.cra_priority = 300, - .base.cra_blocksize = 1, - .base.cra_ctxsize = sizeof(struct chacha_ctx), - .base.cra_module = THIS_MODULE, - - .min_keysize = CHACHA_KEY_SIZE, - .max_keysize = CHACHA_KEY_SIZE, - .ivsize = CHACHA_IV_SIZE, - .chunksize = CHACHA_BLOCK_SIZE, - .walksize = 4 * CHACHA_BLOCK_SIZE, - .setkey = crypto_chacha20_setkey, - .encrypt = chacha20_neon, - .decrypt = chacha20_neon, -}; - -static int __init chacha20_simd_mod_init(void) -{ - if (!(elf_hwcap & HWCAP_ASIMD)) - return -ENODEV; - - return crypto_register_skcipher(&alg); -} - -static void __exit chacha20_simd_mod_fini(void) -{ - crypto_unregister_skcipher(&alg); -} - -module_init(chacha20_simd_mod_init); -module_exit(chacha20_simd_mod_fini); - -MODULE_AUTHOR("Ard Biesheuvel "); -MODULE_LICENSE("GPL v2"); -MODULE_ALIAS_CRYPTO("chacha20"); diff --git a/arch/arm64/crypto/poly1305-armv8.pl b/arch/arm64/crypto/poly1305-armv8.pl new file mode 100644 index 000000000000..cbc980fb02e3 --- /dev/null +++ b/arch/arm64/crypto/poly1305-armv8.pl @@ -0,0 +1,913 @@ +#!/usr/bin/env perl +# SPDX-License-Identifier: GPL-1.0+ OR BSD-3-Clause +# +# ==================================================================== +# Written by Andy Polyakov, @dot-asm, initially for the OpenSSL +# project. +# ==================================================================== +# +# This module implements Poly1305 hash for ARMv8. +# +# June 2015 +# +# Numbers are cycles per processed byte with poly1305_blocks alone. +# +# IALU/gcc-4.9 NEON +# +# Apple A7 1.86/+5% 0.72 +# Cortex-A53 2.69/+58% 1.47 +# Cortex-A57 2.70/+7% 1.14 +# Denver 1.64/+50% 1.18(*) +# X-Gene 2.13/+68% 2.27 +# Mongoose 1.77/+75% 1.12 +# Kryo 2.70/+55% 1.13 +# ThunderX2 1.17/+95% 1.36 +# +# (*) estimate based on resources availability is less than 1.0, +# i.e. measured result is worse than expected, presumably binary +# translator is not almighty; + +$flavour=shift; +$output=shift; + +if ($flavour && $flavour ne "void") { + $0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1; + ( $xlate="${dir}arm-xlate.pl" and -f $xlate ) or + ( $xlate="${dir}../../perlasm/arm-xlate.pl" and -f $xlate) or + die "can't locate arm-xlate.pl"; + + open STDOUT,"| \"$^X\" $xlate $flavour $output"; +} else { + open STDOUT,">$output"; +} + +my ($ctx,$inp,$len,$padbit) = map("x$_",(0..3)); +my ($mac,$nonce)=($inp,$len); + +my ($h0,$h1,$h2,$r0,$r1,$s1,$t0,$t1,$d0,$d1,$d2) = map("x$_",(4..14)); + +$code.=<<___; +#ifndef __KERNEL__ +# include "arm_arch.h" +.extern OPENSSL_armcap_P +#endif + +.text + +// forward "declarations" are required for Apple +.globl poly1305_blocks +.globl poly1305_emit + +.globl poly1305_init +.type poly1305_init,%function +.align 5 +poly1305_init: + cmp $inp,xzr + stp xzr,xzr,[$ctx] // zero hash value + stp xzr,xzr,[$ctx,#16] // [along with is_base2_26] + + csel x0,xzr,x0,eq + b.eq .Lno_key + +#ifndef __KERNEL__ + adrp x17,OPENSSL_armcap_P + ldr w17,[x17,#:lo12:OPENSSL_armcap_P] +#endif + + ldp $r0,$r1,[$inp] // load key + mov $s1,#0xfffffffc0fffffff + movk $s1,#0x0fff,lsl#48 +#ifdef __AARCH64EB__ + rev $r0,$r0 // flip bytes + rev $r1,$r1 +#endif + and $r0,$r0,$s1 // &=0ffffffc0fffffff + and $s1,$s1,#-4 + and $r1,$r1,$s1 // &=0ffffffc0ffffffc + mov w#$s1,#-1 + stp $r0,$r1,[$ctx,#32] // save key value + str w#$s1,[$ctx,#48] // impossible key power value + +#ifndef __KERNEL__ + tst w17,#ARMV7_NEON + + adr $d0,.Lpoly1305_blocks + adr $r0,.Lpoly1305_blocks_neon + adr $d1,.Lpoly1305_emit + + csel $d0,$d0,$r0,eq + +# ifdef __ILP32__ + stp w#$d0,w#$d1,[$len] +# else + stp $d0,$d1,[$len] +# endif +#endif + mov x0,#1 +.Lno_key: + ret +.size poly1305_init,.-poly1305_init + +.type poly1305_blocks,%function +.align 5 +poly1305_blocks: +.Lpoly1305_blocks: + ands $len,$len,#-16 + b.eq .Lno_data + + ldp $h0,$h1,[$ctx] // load hash value + ldp $h2,x17,[$ctx,#16] // [along with is_base2_26] + ldp $r0,$r1,[$ctx,#32] // load key value + +#ifdef __AARCH64EB__ + lsr $d0,$h0,#32 + mov w#$d1,w#$h0 + lsr $d2,$h1,#32 + mov w15,w#$h1 + lsr x16,$h2,#32 +#else + mov w#$d0,w#$h0 + lsr $d1,$h0,#32 + mov w#$d2,w#$h1 + lsr x15,$h1,#32 + mov w16,w#$h2 +#endif + + add $d0,$d0,$d1,lsl#26 // base 2^26 -> base 2^64 + lsr $d1,$d2,#12 + adds $d0,$d0,$d2,lsl#52 + add $d1,$d1,x15,lsl#14 + adc $d1,$d1,xzr + lsr $d2,x16,#24 + adds $d1,$d1,x16,lsl#40 + adc $d2,$d2,xzr + + cmp x17,#0 // is_base2_26? + add $s1,$r1,$r1,lsr#2 // s1 = r1 + (r1 >> 2) + csel $h0,$h0,$d0,eq // choose between radixes + csel $h1,$h1,$d1,eq + csel $h2,$h2,$d2,eq + +.Loop: + ldp $t0,$t1,[$inp],#16 // load input + sub $len,$len,#16 +#ifdef __AARCH64EB__ + rev $t0,$t0 + rev $t1,$t1 +#endif + adds $h0,$h0,$t0 // accumulate input + adcs $h1,$h1,$t1 + + mul $d0,$h0,$r0 // h0*r0 + adc $h2,$h2,$padbit + umulh $d1,$h0,$r0 + + mul $t0,$h1,$s1 // h1*5*r1 + umulh $t1,$h1,$s1 + + adds $d0,$d0,$t0 + mul $t0,$h0,$r1 // h0*r1 + adc $d1,$d1,$t1 + umulh $d2,$h0,$r1 + + adds $d1,$d1,$t0 + mul $t0,$h1,$r0 // h1*r0 + adc $d2,$d2,xzr + umulh $t1,$h1,$r0 + + adds $d1,$d1,$t0 + mul $t0,$h2,$s1 // h2*5*r1 + adc $d2,$d2,$t1 + mul $t1,$h2,$r0 // h2*r0 + + adds $d1,$d1,$t0 + adc $d2,$d2,$t1 + + and $t0,$d2,#-4 // final reduction + and $h2,$d2,#3 + add $t0,$t0,$d2,lsr#2 + adds $h0,$d0,$t0 + adcs $h1,$d1,xzr + adc $h2,$h2,xzr + + cbnz $len,.Loop + + stp $h0,$h1,[$ctx] // store hash value + stp $h2,xzr,[$ctx,#16] // [and clear is_base2_26] + +.Lno_data: + ret +.size poly1305_blocks,.-poly1305_blocks + +.type poly1305_emit,%function +.align 5 +poly1305_emit: +.Lpoly1305_emit: + ldp $h0,$h1,[$ctx] // load hash base 2^64 + ldp $h2,$r0,[$ctx,#16] // [along with is_base2_26] + ldp $t0,$t1,[$nonce] // load nonce + +#ifdef __AARCH64EB__ + lsr $d0,$h0,#32 + mov w#$d1,w#$h0 + lsr $d2,$h1,#32 + mov w15,w#$h1 + lsr x16,$h2,#32 +#else + mov w#$d0,w#$h0 + lsr $d1,$h0,#32 + mov w#$d2,w#$h1 + lsr x15,$h1,#32 + mov w16,w#$h2 +#endif + + add $d0,$d0,$d1,lsl#26 // base 2^26 -> base 2^64 + lsr $d1,$d2,#12 + adds $d0,$d0,$d2,lsl#52 + add $d1,$d1,x15,lsl#14 + adc $d1,$d1,xzr + lsr $d2,x16,#24 + adds $d1,$d1,x16,lsl#40 + adc $d2,$d2,xzr + + cmp $r0,#0 // is_base2_26? + csel $h0,$h0,$d0,eq // choose between radixes + csel $h1,$h1,$d1,eq + csel $h2,$h2,$d2,eq + + adds $d0,$h0,#5 // compare to modulus + adcs $d1,$h1,xzr + adc $d2,$h2,xzr + + tst $d2,#-4 // see if it's carried/borrowed + + csel $h0,$h0,$d0,eq + csel $h1,$h1,$d1,eq + +#ifdef __AARCH64EB__ + ror $t0,$t0,#32 // flip nonce words + ror $t1,$t1,#32 +#endif + adds $h0,$h0,$t0 // accumulate nonce + adc $h1,$h1,$t1 +#ifdef __AARCH64EB__ + rev $h0,$h0 // flip output bytes + rev $h1,$h1 +#endif + stp $h0,$h1,[$mac] // write result + + ret +.size poly1305_emit,.-poly1305_emit +___ +my ($R0,$R1,$S1,$R2,$S2,$R3,$S3,$R4,$S4) = map("v$_.4s",(0..8)); +my ($IN01_0,$IN01_1,$IN01_2,$IN01_3,$IN01_4) = map("v$_.2s",(9..13)); +my ($IN23_0,$IN23_1,$IN23_2,$IN23_3,$IN23_4) = map("v$_.2s",(14..18)); +my ($ACC0,$ACC1,$ACC2,$ACC3,$ACC4) = map("v$_.2d",(19..23)); +my ($H0,$H1,$H2,$H3,$H4) = map("v$_.2s",(24..28)); +my ($T0,$T1,$MASK) = map("v$_",(29..31)); + +my ($in2,$zeros)=("x16","x17"); +my $is_base2_26 = $zeros; # borrow + +$code.=<<___; +.type poly1305_mult,%function +.align 5 +poly1305_mult: + mul $d0,$h0,$r0 // h0*r0 + umulh $d1,$h0,$r0 + + mul $t0,$h1,$s1 // h1*5*r1 + umulh $t1,$h1,$s1 + + adds $d0,$d0,$t0 + mul $t0,$h0,$r1 // h0*r1 + adc $d1,$d1,$t1 + umulh $d2,$h0,$r1 + + adds $d1,$d1,$t0 + mul $t0,$h1,$r0 // h1*r0 + adc $d2,$d2,xzr + umulh $t1,$h1,$r0 + + adds $d1,$d1,$t0 + mul $t0,$h2,$s1 // h2*5*r1 + adc $d2,$d2,$t1 + mul $t1,$h2,$r0 // h2*r0 + + adds $d1,$d1,$t0 + adc $d2,$d2,$t1 + + and $t0,$d2,#-4 // final reduction + and $h2,$d2,#3 + add $t0,$t0,$d2,lsr#2 + adds $h0,$d0,$t0 + adcs $h1,$d1,xzr + adc $h2,$h2,xzr + + ret +.size poly1305_mult,.-poly1305_mult + +.type poly1305_splat,%function +.align 4 +poly1305_splat: + and x12,$h0,#0x03ffffff // base 2^64 -> base 2^26 + ubfx x13,$h0,#26,#26 + extr x14,$h1,$h0,#52 + and x14,x14,#0x03ffffff + ubfx x15,$h1,#14,#26 + extr x16,$h2,$h1,#40 + + str w12,[$ctx,#16*0] // r0 + add w12,w13,w13,lsl#2 // r1*5 + str w13,[$ctx,#16*1] // r1 + add w13,w14,w14,lsl#2 // r2*5 + str w12,[$ctx,#16*2] // s1 + str w14,[$ctx,#16*3] // r2 + add w14,w15,w15,lsl#2 // r3*5 + str w13,[$ctx,#16*4] // s2 + str w15,[$ctx,#16*5] // r3 + add w15,w16,w16,lsl#2 // r4*5 + str w14,[$ctx,#16*6] // s3 + str w16,[$ctx,#16*7] // r4 + str w15,[$ctx,#16*8] // s4 + + ret +.size poly1305_splat,.-poly1305_splat + +#ifdef __KERNEL__ +.globl poly1305_blocks_neon +#endif +.type poly1305_blocks_neon,%function +.align 5 +poly1305_blocks_neon: +.Lpoly1305_blocks_neon: + ldr $is_base2_26,[$ctx,#24] + cmp $len,#128 + b.lo .Lpoly1305_blocks + + .inst 0xd503233f // paciasp + stp x29,x30,[sp,#-80]! + add x29,sp,#0 + + stp d8,d9,[sp,#16] // meet ABI requirements + stp d10,d11,[sp,#32] + stp d12,d13,[sp,#48] + stp d14,d15,[sp,#64] + + cbz $is_base2_26,.Lbase2_64_neon + + ldp w10,w11,[$ctx] // load hash value base 2^26 + ldp w12,w13,[$ctx,#8] + ldr w14,[$ctx,#16] + + tst $len,#31 + b.eq .Leven_neon + + ldp $r0,$r1,[$ctx,#32] // load key value + + add $h0,x10,x11,lsl#26 // base 2^26 -> base 2^64 + lsr $h1,x12,#12 + adds $h0,$h0,x12,lsl#52 + add $h1,$h1,x13,lsl#14 + adc $h1,$h1,xzr + lsr $h2,x14,#24 + adds $h1,$h1,x14,lsl#40 + adc $d2,$h2,xzr // can be partially reduced... + + ldp $d0,$d1,[$inp],#16 // load input + sub $len,$len,#16 + add $s1,$r1,$r1,lsr#2 // s1 = r1 + (r1 >> 2) + +#ifdef __AARCH64EB__ + rev $d0,$d0 + rev $d1,$d1 +#endif + adds $h0,$h0,$d0 // accumulate input + adcs $h1,$h1,$d1 + adc $h2,$h2,$padbit + + bl poly1305_mult + + and x10,$h0,#0x03ffffff // base 2^64 -> base 2^26 + ubfx x11,$h0,#26,#26 + extr x12,$h1,$h0,#52 + and x12,x12,#0x03ffffff + ubfx x13,$h1,#14,#26 + extr x14,$h2,$h1,#40 + + b .Leven_neon + +.align 4 +.Lbase2_64_neon: + ldp $r0,$r1,[$ctx,#32] // load key value + + ldp $h0,$h1,[$ctx] // load hash value base 2^64 + ldr $h2,[$ctx,#16] + + tst $len,#31 + b.eq .Linit_neon + + ldp $d0,$d1,[$inp],#16 // load input + sub $len,$len,#16 + add $s1,$r1,$r1,lsr#2 // s1 = r1 + (r1 >> 2) +#ifdef __AARCH64EB__ + rev $d0,$d0 + rev $d1,$d1 +#endif + adds $h0,$h0,$d0 // accumulate input + adcs $h1,$h1,$d1 + adc $h2,$h2,$padbit + + bl poly1305_mult + +.Linit_neon: + ldr w17,[$ctx,#48] // first table element + and x10,$h0,#0x03ffffff // base 2^64 -> base 2^26 + ubfx x11,$h0,#26,#26 + extr x12,$h1,$h0,#52 + and x12,x12,#0x03ffffff + ubfx x13,$h1,#14,#26 + extr x14,$h2,$h1,#40 + + cmp w17,#-1 // is value impossible? + b.ne .Leven_neon + + fmov ${H0},x10 + fmov ${H1},x11 + fmov ${H2},x12 + fmov ${H3},x13 + fmov ${H4},x14 + + ////////////////////////////////// initialize r^n table + mov $h0,$r0 // r^1 + add $s1,$r1,$r1,lsr#2 // s1 = r1 + (r1 >> 2) + mov $h1,$r1 + mov $h2,xzr + add $ctx,$ctx,#48+12 + bl poly1305_splat + + bl poly1305_mult // r^2 + sub $ctx,$ctx,#4 + bl poly1305_splat + + bl poly1305_mult // r^3 + sub $ctx,$ctx,#4 + bl poly1305_splat + + bl poly1305_mult // r^4 + sub $ctx,$ctx,#4 + bl poly1305_splat + sub $ctx,$ctx,#48 // restore original $ctx + b .Ldo_neon + +.align 4 +.Leven_neon: + fmov ${H0},x10 + fmov ${H1},x11 + fmov ${H2},x12 + fmov ${H3},x13 + fmov ${H4},x14 + +.Ldo_neon: + ldp x8,x12,[$inp,#32] // inp[2:3] + subs $len,$len,#64 + ldp x9,x13,[$inp,#48] + add $in2,$inp,#96 + adr $zeros,.Lzeros + + lsl $padbit,$padbit,#24 + add x15,$ctx,#48 + +#ifdef __AARCH64EB__ + rev x8,x8 + rev x12,x12 + rev x9,x9 + rev x13,x13 +#endif + and x4,x8,#0x03ffffff // base 2^64 -> base 2^26 + and x5,x9,#0x03ffffff + ubfx x6,x8,#26,#26 + ubfx x7,x9,#26,#26 + add x4,x4,x5,lsl#32 // bfi x4,x5,#32,#32 + extr x8,x12,x8,#52 + extr x9,x13,x9,#52 + add x6,x6,x7,lsl#32 // bfi x6,x7,#32,#32 + fmov $IN23_0,x4 + and x8,x8,#0x03ffffff + and x9,x9,#0x03ffffff + ubfx x10,x12,#14,#26 + ubfx x11,x13,#14,#26 + add x12,$padbit,x12,lsr#40 + add x13,$padbit,x13,lsr#40 + add x8,x8,x9,lsl#32 // bfi x8,x9,#32,#32 + fmov $IN23_1,x6 + add x10,x10,x11,lsl#32 // bfi x10,x11,#32,#32 + add x12,x12,x13,lsl#32 // bfi x12,x13,#32,#32 + fmov $IN23_2,x8 + fmov $IN23_3,x10 + fmov $IN23_4,x12 + + ldp x8,x12,[$inp],#16 // inp[0:1] + ldp x9,x13,[$inp],#48 + + ld1 {$R0,$R1,$S1,$R2},[x15],#64 + ld1 {$S2,$R3,$S3,$R4},[x15],#64 + ld1 {$S4},[x15] + +#ifdef __AARCH64EB__ + rev x8,x8 + rev x12,x12 + rev x9,x9 + rev x13,x13 +#endif + and x4,x8,#0x03ffffff // base 2^64 -> base 2^26 + and x5,x9,#0x03ffffff + ubfx x6,x8,#26,#26 + ubfx x7,x9,#26,#26 + add x4,x4,x5,lsl#32 // bfi x4,x5,#32,#32 + extr x8,x12,x8,#52 + extr x9,x13,x9,#52 + add x6,x6,x7,lsl#32 // bfi x6,x7,#32,#32 + fmov $IN01_0,x4 + and x8,x8,#0x03ffffff + and x9,x9,#0x03ffffff + ubfx x10,x12,#14,#26 + ubfx x11,x13,#14,#26 + add x12,$padbit,x12,lsr#40 + add x13,$padbit,x13,lsr#40 + add x8,x8,x9,lsl#32 // bfi x8,x9,#32,#32 + fmov $IN01_1,x6 + add x10,x10,x11,lsl#32 // bfi x10,x11,#32,#32 + add x12,x12,x13,lsl#32 // bfi x12,x13,#32,#32 + movi $MASK.2d,#-1 + fmov $IN01_2,x8 + fmov $IN01_3,x10 + fmov $IN01_4,x12 + ushr $MASK.2d,$MASK.2d,#38 + + b.ls .Lskip_loop + +.align 4 +.Loop_neon: + //////////////////////////////////////////////////////////////// + // ((inp[0]*r^4+inp[2]*r^2+inp[4])*r^4+inp[6]*r^2 + // ((inp[1]*r^4+inp[3]*r^2+inp[5])*r^3+inp[7]*r + // \___________________/ + // ((inp[0]*r^4+inp[2]*r^2+inp[4])*r^4+inp[6]*r^2+inp[8])*r^2 + // ((inp[1]*r^4+inp[3]*r^2+inp[5])*r^4+inp[7]*r^2+inp[9])*r + // \___________________/ \____________________/ + // + // Note that we start with inp[2:3]*r^2. This is because it + // doesn't depend on reduction in previous iteration. + //////////////////////////////////////////////////////////////// + // d4 = h0*r4 + h1*r3 + h2*r2 + h3*r1 + h4*r0 + // d3 = h0*r3 + h1*r2 + h2*r1 + h3*r0 + h4*5*r4 + // d2 = h0*r2 + h1*r1 + h2*r0 + h3*5*r4 + h4*5*r3 + // d1 = h0*r1 + h1*r0 + h2*5*r4 + h3*5*r3 + h4*5*r2 + // d0 = h0*r0 + h1*5*r4 + h2*5*r3 + h3*5*r2 + h4*5*r1 + + subs $len,$len,#64 + umull $ACC4,$IN23_0,${R4}[2] + csel $in2,$zeros,$in2,lo + umull $ACC3,$IN23_0,${R3}[2] + umull $ACC2,$IN23_0,${R2}[2] + ldp x8,x12,[$in2],#16 // inp[2:3] (or zero) + umull $ACC1,$IN23_0,${R1}[2] + ldp x9,x13,[$in2],#48 + umull $ACC0,$IN23_0,${R0}[2] +#ifdef __AARCH64EB__ + rev x8,x8 + rev x12,x12 + rev x9,x9 + rev x13,x13 +#endif + + umlal $ACC4,$IN23_1,${R3}[2] + and x4,x8,#0x03ffffff // base 2^64 -> base 2^26 + umlal $ACC3,$IN23_1,${R2}[2] + and x5,x9,#0x03ffffff + umlal $ACC2,$IN23_1,${R1}[2] + ubfx x6,x8,#26,#26 + umlal $ACC1,$IN23_1,${R0}[2] + ubfx x7,x9,#26,#26 + umlal $ACC0,$IN23_1,${S4}[2] + add x4,x4,x5,lsl#32 // bfi x4,x5,#32,#32 + + umlal $ACC4,$IN23_2,${R2}[2] + extr x8,x12,x8,#52 + umlal $ACC3,$IN23_2,${R1}[2] + extr x9,x13,x9,#52 + umlal $ACC2,$IN23_2,${R0}[2] + add x6,x6,x7,lsl#32 // bfi x6,x7,#32,#32 + umlal $ACC1,$IN23_2,${S4}[2] + fmov $IN23_0,x4 + umlal $ACC0,$IN23_2,${S3}[2] + and x8,x8,#0x03ffffff + + umlal $ACC4,$IN23_3,${R1}[2] + and x9,x9,#0x03ffffff + umlal $ACC3,$IN23_3,${R0}[2] + ubfx x10,x12,#14,#26 + umlal $ACC2,$IN23_3,${S4}[2] + ubfx x11,x13,#14,#26 + umlal $ACC1,$IN23_3,${S3}[2] + add x8,x8,x9,lsl#32 // bfi x8,x9,#32,#32 + umlal $ACC0,$IN23_3,${S2}[2] + fmov $IN23_1,x6 + + add $IN01_2,$IN01_2,$H2 + add x12,$padbit,x12,lsr#40 + umlal $ACC4,$IN23_4,${R0}[2] + add x13,$padbit,x13,lsr#40 + umlal $ACC3,$IN23_4,${S4}[2] + add x10,x10,x11,lsl#32 // bfi x10,x11,#32,#32 + umlal $ACC2,$IN23_4,${S3}[2] + add x12,x12,x13,lsl#32 // bfi x12,x13,#32,#32 + umlal $ACC1,$IN23_4,${S2}[2] + fmov $IN23_2,x8 + umlal $ACC0,$IN23_4,${S1}[2] + fmov $IN23_3,x10 + + //////////////////////////////////////////////////////////////// + // (hash+inp[0:1])*r^4 and accumulate + + add $IN01_0,$IN01_0,$H0 + fmov $IN23_4,x12 + umlal $ACC3,$IN01_2,${R1}[0] + ldp x8,x12,[$inp],#16 // inp[0:1] + umlal $ACC0,$IN01_2,${S3}[0] + ldp x9,x13,[$inp],#48 + umlal $ACC4,$IN01_2,${R2}[0] + umlal $ACC1,$IN01_2,${S4}[0] + umlal $ACC2,$IN01_2,${R0}[0] +#ifdef __AARCH64EB__ + rev x8,x8 + rev x12,x12 + rev x9,x9 + rev x13,x13 +#endif + + add $IN01_1,$IN01_1,$H1 + umlal $ACC3,$IN01_0,${R3}[0] + umlal $ACC4,$IN01_0,${R4}[0] + and x4,x8,#0x03ffffff // base 2^64 -> base 2^26 + umlal $ACC2,$IN01_0,${R2}[0] + and x5,x9,#0x03ffffff + umlal $ACC0,$IN01_0,${R0}[0] + ubfx x6,x8,#26,#26 + umlal $ACC1,$IN01_0,${R1}[0] + ubfx x7,x9,#26,#26 + + add $IN01_3,$IN01_3,$H3 + add x4,x4,x5,lsl#32 // bfi x4,x5,#32,#32 + umlal $ACC3,$IN01_1,${R2}[0] + extr x8,x12,x8,#52 + umlal $ACC4,$IN01_1,${R3}[0] + extr x9,x13,x9,#52 + umlal $ACC0,$IN01_1,${S4}[0] + add x6,x6,x7,lsl#32 // bfi x6,x7,#32,#32 + umlal $ACC2,$IN01_1,${R1}[0] + fmov $IN01_0,x4 + umlal $ACC1,$IN01_1,${R0}[0] + and x8,x8,#0x03ffffff + + add $IN01_4,$IN01_4,$H4 + and x9,x9,#0x03ffffff + umlal $ACC3,$IN01_3,${R0}[0] + ubfx x10,x12,#14,#26 + umlal $ACC0,$IN01_3,${S2}[0] + ubfx x11,x13,#14,#26 + umlal $ACC4,$IN01_3,${R1}[0] + add x8,x8,x9,lsl#32 // bfi x8,x9,#32,#32 + umlal $ACC1,$IN01_3,${S3}[0] + fmov $IN01_1,x6 + umlal $ACC2,$IN01_3,${S4}[0] + add x12,$padbit,x12,lsr#40 + + umlal $ACC3,$IN01_4,${S4}[0] + add x13,$padbit,x13,lsr#40 + umlal $ACC0,$IN01_4,${S1}[0] + add x10,x10,x11,lsl#32 // bfi x10,x11,#32,#32 + umlal $ACC4,$IN01_4,${R0}[0] + add x12,x12,x13,lsl#32 // bfi x12,x13,#32,#32 + umlal $ACC1,$IN01_4,${S2}[0] + fmov $IN01_2,x8 + umlal $ACC2,$IN01_4,${S3}[0] + fmov $IN01_3,x10 + fmov $IN01_4,x12 + + ///////////////////////////////////////////////////////////////// + // lazy reduction as discussed in "NEON crypto" by D.J. Bernstein + // and P. Schwabe + // + // [see discussion in poly1305-armv4 module] + + ushr $T0.2d,$ACC3,#26 + xtn $H3,$ACC3 + ushr $T1.2d,$ACC0,#26 + and $ACC0,$ACC0,$MASK.2d + add $ACC4,$ACC4,$T0.2d // h3 -> h4 + bic $H3,#0xfc,lsl#24 // &=0x03ffffff + add $ACC1,$ACC1,$T1.2d // h0 -> h1 + + ushr $T0.2d,$ACC4,#26 + xtn $H4,$ACC4 + ushr $T1.2d,$ACC1,#26 + xtn $H1,$ACC1 + bic $H4,#0xfc,lsl#24 + add $ACC2,$ACC2,$T1.2d // h1 -> h2 + + add $ACC0,$ACC0,$T0.2d + shl $T0.2d,$T0.2d,#2 + shrn $T1.2s,$ACC2,#26 + xtn $H2,$ACC2 + add $ACC0,$ACC0,$T0.2d // h4 -> h0 + bic $H1,#0xfc,lsl#24 + add $H3,$H3,$T1.2s // h2 -> h3 + bic $H2,#0xfc,lsl#24 + + shrn $T0.2s,$ACC0,#26 + xtn $H0,$ACC0 + ushr $T1.2s,$H3,#26 + bic $H3,#0xfc,lsl#24 + bic $H0,#0xfc,lsl#24 + add $H1,$H1,$T0.2s // h0 -> h1 + add $H4,$H4,$T1.2s // h3 -> h4 + + b.hi .Loop_neon + +.Lskip_loop: + dup $IN23_2,${IN23_2}[0] + add $IN01_2,$IN01_2,$H2 + + //////////////////////////////////////////////////////////////// + // multiply (inp[0:1]+hash) or inp[2:3] by r^2:r^1 + + adds $len,$len,#32 + b.ne .Long_tail + + dup $IN23_2,${IN01_2}[0] + add $IN23_0,$IN01_0,$H0 + add $IN23_3,$IN01_3,$H3 + add $IN23_1,$IN01_1,$H1 + add $IN23_4,$IN01_4,$H4 + +.Long_tail: + dup $IN23_0,${IN23_0}[0] + umull2 $ACC0,$IN23_2,${S3} + umull2 $ACC3,$IN23_2,${R1} + umull2 $ACC4,$IN23_2,${R2} + umull2 $ACC2,$IN23_2,${R0} + umull2 $ACC1,$IN23_2,${S4} + + dup $IN23_1,${IN23_1}[0] + umlal2 $ACC0,$IN23_0,${R0} + umlal2 $ACC2,$IN23_0,${R2} + umlal2 $ACC3,$IN23_0,${R3} + umlal2 $ACC4,$IN23_0,${R4} + umlal2 $ACC1,$IN23_0,${R1} + + dup $IN23_3,${IN23_3}[0] + umlal2 $ACC0,$IN23_1,${S4} + umlal2 $ACC3,$IN23_1,${R2} + umlal2 $ACC2,$IN23_1,${R1} + umlal2 $ACC4,$IN23_1,${R3} + umlal2 $ACC1,$IN23_1,${R0} + + dup $IN23_4,${IN23_4}[0] + umlal2 $ACC3,$IN23_3,${R0} + umlal2 $ACC4,$IN23_3,${R1} + umlal2 $ACC0,$IN23_3,${S2} + umlal2 $ACC1,$IN23_3,${S3} + umlal2 $ACC2,$IN23_3,${S4} + + umlal2 $ACC3,$IN23_4,${S4} + umlal2 $ACC0,$IN23_4,${S1} + umlal2 $ACC4,$IN23_4,${R0} + umlal2 $ACC1,$IN23_4,${S2} + umlal2 $ACC2,$IN23_4,${S3} + + b.eq .Lshort_tail + + //////////////////////////////////////////////////////////////// + // (hash+inp[0:1])*r^4:r^3 and accumulate + + add $IN01_0,$IN01_0,$H0 + umlal $ACC3,$IN01_2,${R1} + umlal $ACC0,$IN01_2,${S3} + umlal $ACC4,$IN01_2,${R2} + umlal $ACC1,$IN01_2,${S4} + umlal $ACC2,$IN01_2,${R0} + + add $IN01_1,$IN01_1,$H1 + umlal $ACC3,$IN01_0,${R3} + umlal $ACC0,$IN01_0,${R0} + umlal $ACC4,$IN01_0,${R4} + umlal $ACC1,$IN01_0,${R1} + umlal $ACC2,$IN01_0,${R2} + + add $IN01_3,$IN01_3,$H3 + umlal $ACC3,$IN01_1,${R2} + umlal $ACC0,$IN01_1,${S4} + umlal $ACC4,$IN01_1,${R3} + umlal $ACC1,$IN01_1,${R0} + umlal $ACC2,$IN01_1,${R1} + + add $IN01_4,$IN01_4,$H4 + umlal $ACC3,$IN01_3,${R0} + umlal $ACC0,$IN01_3,${S2} + umlal $ACC4,$IN01_3,${R1} + umlal $ACC1,$IN01_3,${S3} + umlal $ACC2,$IN01_3,${S4} + + umlal $ACC3,$IN01_4,${S4} + umlal $ACC0,$IN01_4,${S1} + umlal $ACC4,$IN01_4,${R0} + umlal $ACC1,$IN01_4,${S2} + umlal $ACC2,$IN01_4,${S3} + +.Lshort_tail: + //////////////////////////////////////////////////////////////// + // horizontal add + + addp $ACC3,$ACC3,$ACC3 + ldp d8,d9,[sp,#16] // meet ABI requirements + addp $ACC0,$ACC0,$ACC0 + ldp d10,d11,[sp,#32] + addp $ACC4,$ACC4,$ACC4 + ldp d12,d13,[sp,#48] + addp $ACC1,$ACC1,$ACC1 + ldp d14,d15,[sp,#64] + addp $ACC2,$ACC2,$ACC2 + ldr x30,[sp,#8] + + //////////////////////////////////////////////////////////////// + // lazy reduction, but without narrowing + + ushr $T0.2d,$ACC3,#26 + and $ACC3,$ACC3,$MASK.2d + ushr $T1.2d,$ACC0,#26 + and $ACC0,$ACC0,$MASK.2d + + add $ACC4,$ACC4,$T0.2d // h3 -> h4 + add $ACC1,$ACC1,$T1.2d // h0 -> h1 + + ushr $T0.2d,$ACC4,#26 + and $ACC4,$ACC4,$MASK.2d + ushr $T1.2d,$ACC1,#26 + and $ACC1,$ACC1,$MASK.2d + add $ACC2,$ACC2,$T1.2d // h1 -> h2 + + add $ACC0,$ACC0,$T0.2d + shl $T0.2d,$T0.2d,#2 + ushr $T1.2d,$ACC2,#26 + and $ACC2,$ACC2,$MASK.2d + add $ACC0,$ACC0,$T0.2d // h4 -> h0 + add $ACC3,$ACC3,$T1.2d // h2 -> h3 + + ushr $T0.2d,$ACC0,#26 + and $ACC0,$ACC0,$MASK.2d + ushr $T1.2d,$ACC3,#26 + and $ACC3,$ACC3,$MASK.2d + add $ACC1,$ACC1,$T0.2d // h0 -> h1 + add $ACC4,$ACC4,$T1.2d // h3 -> h4 + + //////////////////////////////////////////////////////////////// + // write the result, can be partially reduced + + st4 {$ACC0,$ACC1,$ACC2,$ACC3}[0],[$ctx],#16 + mov x4,#1 + st1 {$ACC4}[0],[$ctx] + str x4,[$ctx,#8] // set is_base2_26 + + ldr x29,[sp],#80 + .inst 0xd50323bf // autiasp + ret +.size poly1305_blocks_neon,.-poly1305_blocks_neon + +.align 5 +.Lzeros: +.long 0,0,0,0,0,0,0,0 +.asciz "Poly1305 for ARMv8, CRYPTOGAMS by \@dot-asm" +.align 2 +#if !defined(__KERNEL__) && !defined(_WIN64) +.comm OPENSSL_armcap_P,4,4 +.hidden OPENSSL_armcap_P +#endif +___ + +foreach (split("\n",$code)) { + s/\b(shrn\s+v[0-9]+)\.[24]d/$1.2s/ or + s/\b(fmov\s+)v([0-9]+)[^,]*,\s*x([0-9]+)/$1d$2,x$3/ or + (m/\bdup\b/ and (s/\.[24]s/.2d/g or 1)) or + (m/\b(eor|and)/ and (s/\.[248][sdh]/.16b/g or 1)) or + (m/\bum(ul|la)l\b/ and (s/\.4s/.2s/g or 1)) or + (m/\bum(ul|la)l2\b/ and (s/\.2s/.4s/g or 1)) or + (m/\bst[1-4]\s+{[^}]+}\[/ and (s/\.[24]d/.s/g or 1)); + + s/\.[124]([sd])\[/.$1\[/; + s/w#x([0-9]+)/w$1/g; + + print $_,"\n"; +} +close STDOUT; diff --git a/arch/arm64/crypto/poly1305-core.S_shipped b/arch/arm64/crypto/poly1305-core.S_shipped new file mode 100644 index 000000000000..fb2822abf63a --- /dev/null +++ b/arch/arm64/crypto/poly1305-core.S_shipped @@ -0,0 +1,835 @@ +#ifndef __KERNEL__ +# include "arm_arch.h" +.extern OPENSSL_armcap_P +#endif + +.text + +// forward "declarations" are required for Apple +.globl poly1305_blocks +.globl poly1305_emit + +.globl poly1305_init +.type poly1305_init,%function +.align 5 +poly1305_init: + cmp x1,xzr + stp xzr,xzr,[x0] // zero hash value + stp xzr,xzr,[x0,#16] // [along with is_base2_26] + + csel x0,xzr,x0,eq + b.eq .Lno_key + +#ifndef __KERNEL__ + adrp x17,OPENSSL_armcap_P + ldr w17,[x17,#:lo12:OPENSSL_armcap_P] +#endif + + ldp x7,x8,[x1] // load key + mov x9,#0xfffffffc0fffffff + movk x9,#0x0fff,lsl#48 +#ifdef __AARCH64EB__ + rev x7,x7 // flip bytes + rev x8,x8 +#endif + and x7,x7,x9 // &=0ffffffc0fffffff + and x9,x9,#-4 + and x8,x8,x9 // &=0ffffffc0ffffffc + mov w9,#-1 + stp x7,x8,[x0,#32] // save key value + str w9,[x0,#48] // impossible key power value + +#ifndef __KERNEL__ + tst w17,#ARMV7_NEON + + adr x12,.Lpoly1305_blocks + adr x7,.Lpoly1305_blocks_neon + adr x13,.Lpoly1305_emit + + csel x12,x12,x7,eq + +# ifdef __ILP32__ + stp w12,w13,[x2] +# else + stp x12,x13,[x2] +# endif +#endif + mov x0,#1 +.Lno_key: + ret +.size poly1305_init,.-poly1305_init + +.type poly1305_blocks,%function +.align 5 +poly1305_blocks: +.Lpoly1305_blocks: + ands x2,x2,#-16 + b.eq .Lno_data + + ldp x4,x5,[x0] // load hash value + ldp x6,x17,[x0,#16] // [along with is_base2_26] + ldp x7,x8,[x0,#32] // load key value + +#ifdef __AARCH64EB__ + lsr x12,x4,#32 + mov w13,w4 + lsr x14,x5,#32 + mov w15,w5 + lsr x16,x6,#32 +#else + mov w12,w4 + lsr x13,x4,#32 + mov w14,w5 + lsr x15,x5,#32 + mov w16,w6 +#endif + + add x12,x12,x13,lsl#26 // base 2^26 -> base 2^64 + lsr x13,x14,#12 + adds x12,x12,x14,lsl#52 + add x13,x13,x15,lsl#14 + adc x13,x13,xzr + lsr x14,x16,#24 + adds x13,x13,x16,lsl#40 + adc x14,x14,xzr + + cmp x17,#0 // is_base2_26? + add x9,x8,x8,lsr#2 // s1 = r1 + (r1 >> 2) + csel x4,x4,x12,eq // choose between radixes + csel x5,x5,x13,eq + csel x6,x6,x14,eq + +.Loop: + ldp x10,x11,[x1],#16 // load input + sub x2,x2,#16 +#ifdef __AARCH64EB__ + rev x10,x10 + rev x11,x11 +#endif + adds x4,x4,x10 // accumulate input + adcs x5,x5,x11 + + mul x12,x4,x7 // h0*r0 + adc x6,x6,x3 + umulh x13,x4,x7 + + mul x10,x5,x9 // h1*5*r1 + umulh x11,x5,x9 + + adds x12,x12,x10 + mul x10,x4,x8 // h0*r1 + adc x13,x13,x11 + umulh x14,x4,x8 + + adds x13,x13,x10 + mul x10,x5,x7 // h1*r0 + adc x14,x14,xzr + umulh x11,x5,x7 + + adds x13,x13,x10 + mul x10,x6,x9 // h2*5*r1 + adc x14,x14,x11 + mul x11,x6,x7 // h2*r0 + + adds x13,x13,x10 + adc x14,x14,x11 + + and x10,x14,#-4 // final reduction + and x6,x14,#3 + add x10,x10,x14,lsr#2 + adds x4,x12,x10 + adcs x5,x13,xzr + adc x6,x6,xzr + + cbnz x2,.Loop + + stp x4,x5,[x0] // store hash value + stp x6,xzr,[x0,#16] // [and clear is_base2_26] + +.Lno_data: + ret +.size poly1305_blocks,.-poly1305_blocks + +.type poly1305_emit,%function +.align 5 +poly1305_emit: +.Lpoly1305_emit: + ldp x4,x5,[x0] // load hash base 2^64 + ldp x6,x7,[x0,#16] // [along with is_base2_26] + ldp x10,x11,[x2] // load nonce + +#ifdef __AARCH64EB__ + lsr x12,x4,#32 + mov w13,w4 + lsr x14,x5,#32 + mov w15,w5 + lsr x16,x6,#32 +#else + mov w12,w4 + lsr x13,x4,#32 + mov w14,w5 + lsr x15,x5,#32 + mov w16,w6 +#endif + + add x12,x12,x13,lsl#26 // base 2^26 -> base 2^64 + lsr x13,x14,#12 + adds x12,x12,x14,lsl#52 + add x13,x13,x15,lsl#14 + adc x13,x13,xzr + lsr x14,x16,#24 + adds x13,x13,x16,lsl#40 + adc x14,x14,xzr + + cmp x7,#0 // is_base2_26? + csel x4,x4,x12,eq // choose between radixes + csel x5,x5,x13,eq + csel x6,x6,x14,eq + + adds x12,x4,#5 // compare to modulus + adcs x13,x5,xzr + adc x14,x6,xzr + + tst x14,#-4 // see if it's carried/borrowed + + csel x4,x4,x12,eq + csel x5,x5,x13,eq + +#ifdef __AARCH64EB__ + ror x10,x10,#32 // flip nonce words + ror x11,x11,#32 +#endif + adds x4,x4,x10 // accumulate nonce + adc x5,x5,x11 +#ifdef __AARCH64EB__ + rev x4,x4 // flip output bytes + rev x5,x5 +#endif + stp x4,x5,[x1] // write result + + ret +.size poly1305_emit,.-poly1305_emit +.type poly1305_mult,%function +.align 5 +poly1305_mult: + mul x12,x4,x7 // h0*r0 + umulh x13,x4,x7 + + mul x10,x5,x9 // h1*5*r1 + umulh x11,x5,x9 + + adds x12,x12,x10 + mul x10,x4,x8 // h0*r1 + adc x13,x13,x11 + umulh x14,x4,x8 + + adds x13,x13,x10 + mul x10,x5,x7 // h1*r0 + adc x14,x14,xzr + umulh x11,x5,x7 + + adds x13,x13,x10 + mul x10,x6,x9 // h2*5*r1 + adc x14,x14,x11 + mul x11,x6,x7 // h2*r0 + + adds x13,x13,x10 + adc x14,x14,x11 + + and x10,x14,#-4 // final reduction + and x6,x14,#3 + add x10,x10,x14,lsr#2 + adds x4,x12,x10 + adcs x5,x13,xzr + adc x6,x6,xzr + + ret +.size poly1305_mult,.-poly1305_mult + +.type poly1305_splat,%function +.align 4 +poly1305_splat: + and x12,x4,#0x03ffffff // base 2^64 -> base 2^26 + ubfx x13,x4,#26,#26 + extr x14,x5,x4,#52 + and x14,x14,#0x03ffffff + ubfx x15,x5,#14,#26 + extr x16,x6,x5,#40 + + str w12,[x0,#16*0] // r0 + add w12,w13,w13,lsl#2 // r1*5 + str w13,[x0,#16*1] // r1 + add w13,w14,w14,lsl#2 // r2*5 + str w12,[x0,#16*2] // s1 + str w14,[x0,#16*3] // r2 + add w14,w15,w15,lsl#2 // r3*5 + str w13,[x0,#16*4] // s2 + str w15,[x0,#16*5] // r3 + add w15,w16,w16,lsl#2 // r4*5 + str w14,[x0,#16*6] // s3 + str w16,[x0,#16*7] // r4 + str w15,[x0,#16*8] // s4 + + ret +.size poly1305_splat,.-poly1305_splat + +#ifdef __KERNEL__ +.globl poly1305_blocks_neon +#endif +.type poly1305_blocks_neon,%function +.align 5 +poly1305_blocks_neon: +.Lpoly1305_blocks_neon: + ldr x17,[x0,#24] + cmp x2,#128 + b.lo .Lpoly1305_blocks + + .inst 0xd503233f // paciasp + stp x29,x30,[sp,#-80]! + add x29,sp,#0 + + stp d8,d9,[sp,#16] // meet ABI requirements + stp d10,d11,[sp,#32] + stp d12,d13,[sp,#48] + stp d14,d15,[sp,#64] + + cbz x17,.Lbase2_64_neon + + ldp w10,w11,[x0] // load hash value base 2^26 + ldp w12,w13,[x0,#8] + ldr w14,[x0,#16] + + tst x2,#31 + b.eq .Leven_neon + + ldp x7,x8,[x0,#32] // load key value + + add x4,x10,x11,lsl#26 // base 2^26 -> base 2^64 + lsr x5,x12,#12 + adds x4,x4,x12,lsl#52 + add x5,x5,x13,lsl#14 + adc x5,x5,xzr + lsr x6,x14,#24 + adds x5,x5,x14,lsl#40 + adc x14,x6,xzr // can be partially reduced... + + ldp x12,x13,[x1],#16 // load input + sub x2,x2,#16 + add x9,x8,x8,lsr#2 // s1 = r1 + (r1 >> 2) + +#ifdef __AARCH64EB__ + rev x12,x12 + rev x13,x13 +#endif + adds x4,x4,x12 // accumulate input + adcs x5,x5,x13 + adc x6,x6,x3 + + bl poly1305_mult + + and x10,x4,#0x03ffffff // base 2^64 -> base 2^26 + ubfx x11,x4,#26,#26 + extr x12,x5,x4,#52 + and x12,x12,#0x03ffffff + ubfx x13,x5,#14,#26 + extr x14,x6,x5,#40 + + b .Leven_neon + +.align 4 +.Lbase2_64_neon: + ldp x7,x8,[x0,#32] // load key value + + ldp x4,x5,[x0] // load hash value base 2^64 + ldr x6,[x0,#16] + + tst x2,#31 + b.eq .Linit_neon + + ldp x12,x13,[x1],#16 // load input + sub x2,x2,#16 + add x9,x8,x8,lsr#2 // s1 = r1 + (r1 >> 2) +#ifdef __AARCH64EB__ + rev x12,x12 + rev x13,x13 +#endif + adds x4,x4,x12 // accumulate input + adcs x5,x5,x13 + adc x6,x6,x3 + + bl poly1305_mult + +.Linit_neon: + ldr w17,[x0,#48] // first table element + and x10,x4,#0x03ffffff // base 2^64 -> base 2^26 + ubfx x11,x4,#26,#26 + extr x12,x5,x4,#52 + and x12,x12,#0x03ffffff + ubfx x13,x5,#14,#26 + extr x14,x6,x5,#40 + + cmp w17,#-1 // is value impossible? + b.ne .Leven_neon + + fmov d24,x10 + fmov d25,x11 + fmov d26,x12 + fmov d27,x13 + fmov d28,x14 + + ////////////////////////////////// initialize r^n table + mov x4,x7 // r^1 + add x9,x8,x8,lsr#2 // s1 = r1 + (r1 >> 2) + mov x5,x8 + mov x6,xzr + add x0,x0,#48+12 + bl poly1305_splat + + bl poly1305_mult // r^2 + sub x0,x0,#4 + bl poly1305_splat + + bl poly1305_mult // r^3 + sub x0,x0,#4 + bl poly1305_splat + + bl poly1305_mult // r^4 + sub x0,x0,#4 + bl poly1305_splat + sub x0,x0,#48 // restore original x0 + b .Ldo_neon + +.align 4 +.Leven_neon: + fmov d24,x10 + fmov d25,x11 + fmov d26,x12 + fmov d27,x13 + fmov d28,x14 + +.Ldo_neon: + ldp x8,x12,[x1,#32] // inp[2:3] + subs x2,x2,#64 + ldp x9,x13,[x1,#48] + add x16,x1,#96 + adr x17,.Lzeros + + lsl x3,x3,#24 + add x15,x0,#48 + +#ifdef __AARCH64EB__ + rev x8,x8 + rev x12,x12 + rev x9,x9 + rev x13,x13 +#endif + and x4,x8,#0x03ffffff // base 2^64 -> base 2^26 + and x5,x9,#0x03ffffff + ubfx x6,x8,#26,#26 + ubfx x7,x9,#26,#26 + add x4,x4,x5,lsl#32 // bfi x4,x5,#32,#32 + extr x8,x12,x8,#52 + extr x9,x13,x9,#52 + add x6,x6,x7,lsl#32 // bfi x6,x7,#32,#32 + fmov d14,x4 + and x8,x8,#0x03ffffff + and x9,x9,#0x03ffffff + ubfx x10,x12,#14,#26 + ubfx x11,x13,#14,#26 + add x12,x3,x12,lsr#40 + add x13,x3,x13,lsr#40 + add x8,x8,x9,lsl#32 // bfi x8,x9,#32,#32 + fmov d15,x6 + add x10,x10,x11,lsl#32 // bfi x10,x11,#32,#32 + add x12,x12,x13,lsl#32 // bfi x12,x13,#32,#32 + fmov d16,x8 + fmov d17,x10 + fmov d18,x12 + + ldp x8,x12,[x1],#16 // inp[0:1] + ldp x9,x13,[x1],#48 + + ld1 {v0.4s,v1.4s,v2.4s,v3.4s},[x15],#64 + ld1 {v4.4s,v5.4s,v6.4s,v7.4s},[x15],#64 + ld1 {v8.4s},[x15] + +#ifdef __AARCH64EB__ + rev x8,x8 + rev x12,x12 + rev x9,x9 + rev x13,x13 +#endif + and x4,x8,#0x03ffffff // base 2^64 -> base 2^26 + and x5,x9,#0x03ffffff + ubfx x6,x8,#26,#26 + ubfx x7,x9,#26,#26 + add x4,x4,x5,lsl#32 // bfi x4,x5,#32,#32 + extr x8,x12,x8,#52 + extr x9,x13,x9,#52 + add x6,x6,x7,lsl#32 // bfi x6,x7,#32,#32 + fmov d9,x4 + and x8,x8,#0x03ffffff + and x9,x9,#0x03ffffff + ubfx x10,x12,#14,#26 + ubfx x11,x13,#14,#26 + add x12,x3,x12,lsr#40 + add x13,x3,x13,lsr#40 + add x8,x8,x9,lsl#32 // bfi x8,x9,#32,#32 + fmov d10,x6 + add x10,x10,x11,lsl#32 // bfi x10,x11,#32,#32 + add x12,x12,x13,lsl#32 // bfi x12,x13,#32,#32 + movi v31.2d,#-1 + fmov d11,x8 + fmov d12,x10 + fmov d13,x12 + ushr v31.2d,v31.2d,#38 + + b.ls .Lskip_loop + +.align 4 +.Loop_neon: + //////////////////////////////////////////////////////////////// + // ((inp[0]*r^4+inp[2]*r^2+inp[4])*r^4+inp[6]*r^2 + // ((inp[1]*r^4+inp[3]*r^2+inp[5])*r^3+inp[7]*r + // ___________________/ + // ((inp[0]*r^4+inp[2]*r^2+inp[4])*r^4+inp[6]*r^2+inp[8])*r^2 + // ((inp[1]*r^4+inp[3]*r^2+inp[5])*r^4+inp[7]*r^2+inp[9])*r + // ___________________/ ____________________/ + // + // Note that we start with inp[2:3]*r^2. This is because it + // doesn't depend on reduction in previous iteration. + //////////////////////////////////////////////////////////////// + // d4 = h0*r4 + h1*r3 + h2*r2 + h3*r1 + h4*r0 + // d3 = h0*r3 + h1*r2 + h2*r1 + h3*r0 + h4*5*r4 + // d2 = h0*r2 + h1*r1 + h2*r0 + h3*5*r4 + h4*5*r3 + // d1 = h0*r1 + h1*r0 + h2*5*r4 + h3*5*r3 + h4*5*r2 + // d0 = h0*r0 + h1*5*r4 + h2*5*r3 + h3*5*r2 + h4*5*r1 + + subs x2,x2,#64 + umull v23.2d,v14.2s,v7.s[2] + csel x16,x17,x16,lo + umull v22.2d,v14.2s,v5.s[2] + umull v21.2d,v14.2s,v3.s[2] + ldp x8,x12,[x16],#16 // inp[2:3] (or zero) + umull v20.2d,v14.2s,v1.s[2] + ldp x9,x13,[x16],#48 + umull v19.2d,v14.2s,v0.s[2] +#ifdef __AARCH64EB__ + rev x8,x8 + rev x12,x12 + rev x9,x9 + rev x13,x13 +#endif + + umlal v23.2d,v15.2s,v5.s[2] + and x4,x8,#0x03ffffff // base 2^64 -> base 2^26 + umlal v22.2d,v15.2s,v3.s[2] + and x5,x9,#0x03ffffff + umlal v21.2d,v15.2s,v1.s[2] + ubfx x6,x8,#26,#26 + umlal v20.2d,v15.2s,v0.s[2] + ubfx x7,x9,#26,#26 + umlal v19.2d,v15.2s,v8.s[2] + add x4,x4,x5,lsl#32 // bfi x4,x5,#32,#32 + + umlal v23.2d,v16.2s,v3.s[2] + extr x8,x12,x8,#52 + umlal v22.2d,v16.2s,v1.s[2] + extr x9,x13,x9,#52 + umlal v21.2d,v16.2s,v0.s[2] + add x6,x6,x7,lsl#32 // bfi x6,x7,#32,#32 + umlal v20.2d,v16.2s,v8.s[2] + fmov d14,x4 + umlal v19.2d,v16.2s,v6.s[2] + and x8,x8,#0x03ffffff + + umlal v23.2d,v17.2s,v1.s[2] + and x9,x9,#0x03ffffff + umlal v22.2d,v17.2s,v0.s[2] + ubfx x10,x12,#14,#26 + umlal v21.2d,v17.2s,v8.s[2] + ubfx x11,x13,#14,#26 + umlal v20.2d,v17.2s,v6.s[2] + add x8,x8,x9,lsl#32 // bfi x8,x9,#32,#32 + umlal v19.2d,v17.2s,v4.s[2] + fmov d15,x6 + + add v11.2s,v11.2s,v26.2s + add x12,x3,x12,lsr#40 + umlal v23.2d,v18.2s,v0.s[2] + add x13,x3,x13,lsr#40 + umlal v22.2d,v18.2s,v8.s[2] + add x10,x10,x11,lsl#32 // bfi x10,x11,#32,#32 + umlal v21.2d,v18.2s,v6.s[2] + add x12,x12,x13,lsl#32 // bfi x12,x13,#32,#32 + umlal v20.2d,v18.2s,v4.s[2] + fmov d16,x8 + umlal v19.2d,v18.2s,v2.s[2] + fmov d17,x10 + + //////////////////////////////////////////////////////////////// + // (hash+inp[0:1])*r^4 and accumulate + + add v9.2s,v9.2s,v24.2s + fmov d18,x12 + umlal v22.2d,v11.2s,v1.s[0] + ldp x8,x12,[x1],#16 // inp[0:1] + umlal v19.2d,v11.2s,v6.s[0] + ldp x9,x13,[x1],#48 + umlal v23.2d,v11.2s,v3.s[0] + umlal v20.2d,v11.2s,v8.s[0] + umlal v21.2d,v11.2s,v0.s[0] +#ifdef __AARCH64EB__ + rev x8,x8 + rev x12,x12 + rev x9,x9 + rev x13,x13 +#endif + + add v10.2s,v10.2s,v25.2s + umlal v22.2d,v9.2s,v5.s[0] + umlal v23.2d,v9.2s,v7.s[0] + and x4,x8,#0x03ffffff // base 2^64 -> base 2^26 + umlal v21.2d,v9.2s,v3.s[0] + and x5,x9,#0x03ffffff + umlal v19.2d,v9.2s,v0.s[0] + ubfx x6,x8,#26,#26 + umlal v20.2d,v9.2s,v1.s[0] + ubfx x7,x9,#26,#26 + + add v12.2s,v12.2s,v27.2s + add x4,x4,x5,lsl#32 // bfi x4,x5,#32,#32 + umlal v22.2d,v10.2s,v3.s[0] + extr x8,x12,x8,#52 + umlal v23.2d,v10.2s,v5.s[0] + extr x9,x13,x9,#52 + umlal v19.2d,v10.2s,v8.s[0] + add x6,x6,x7,lsl#32 // bfi x6,x7,#32,#32 + umlal v21.2d,v10.2s,v1.s[0] + fmov d9,x4 + umlal v20.2d,v10.2s,v0.s[0] + and x8,x8,#0x03ffffff + + add v13.2s,v13.2s,v28.2s + and x9,x9,#0x03ffffff + umlal v22.2d,v12.2s,v0.s[0] + ubfx x10,x12,#14,#26 + umlal v19.2d,v12.2s,v4.s[0] + ubfx x11,x13,#14,#26 + umlal v23.2d,v12.2s,v1.s[0] + add x8,x8,x9,lsl#32 // bfi x8,x9,#32,#32 + umlal v20.2d,v12.2s,v6.s[0] + fmov d10,x6 + umlal v21.2d,v12.2s,v8.s[0] + add x12,x3,x12,lsr#40 + + umlal v22.2d,v13.2s,v8.s[0] + add x13,x3,x13,lsr#40 + umlal v19.2d,v13.2s,v2.s[0] + add x10,x10,x11,lsl#32 // bfi x10,x11,#32,#32 + umlal v23.2d,v13.2s,v0.s[0] + add x12,x12,x13,lsl#32 // bfi x12,x13,#32,#32 + umlal v20.2d,v13.2s,v4.s[0] + fmov d11,x8 + umlal v21.2d,v13.2s,v6.s[0] + fmov d12,x10 + fmov d13,x12 + + ///////////////////////////////////////////////////////////////// + // lazy reduction as discussed in "NEON crypto" by D.J. Bernstein + // and P. Schwabe + // + // [see discussion in poly1305-armv4 module] + + ushr v29.2d,v22.2d,#26 + xtn v27.2s,v22.2d + ushr v30.2d,v19.2d,#26 + and v19.16b,v19.16b,v31.16b + add v23.2d,v23.2d,v29.2d // h3 -> h4 + bic v27.2s,#0xfc,lsl#24 // &=0x03ffffff + add v20.2d,v20.2d,v30.2d // h0 -> h1 + + ushr v29.2d,v23.2d,#26 + xtn v28.2s,v23.2d + ushr v30.2d,v20.2d,#26 + xtn v25.2s,v20.2d + bic v28.2s,#0xfc,lsl#24 + add v21.2d,v21.2d,v30.2d // h1 -> h2 + + add v19.2d,v19.2d,v29.2d + shl v29.2d,v29.2d,#2 + shrn v30.2s,v21.2d,#26 + xtn v26.2s,v21.2d + add v19.2d,v19.2d,v29.2d // h4 -> h0 + bic v25.2s,#0xfc,lsl#24 + add v27.2s,v27.2s,v30.2s // h2 -> h3 + bic v26.2s,#0xfc,lsl#24 + + shrn v29.2s,v19.2d,#26 + xtn v24.2s,v19.2d + ushr v30.2s,v27.2s,#26 + bic v27.2s,#0xfc,lsl#24 + bic v24.2s,#0xfc,lsl#24 + add v25.2s,v25.2s,v29.2s // h0 -> h1 + add v28.2s,v28.2s,v30.2s // h3 -> h4 + + b.hi .Loop_neon + +.Lskip_loop: + dup v16.2d,v16.d[0] + add v11.2s,v11.2s,v26.2s + + //////////////////////////////////////////////////////////////// + // multiply (inp[0:1]+hash) or inp[2:3] by r^2:r^1 + + adds x2,x2,#32 + b.ne .Long_tail + + dup v16.2d,v11.d[0] + add v14.2s,v9.2s,v24.2s + add v17.2s,v12.2s,v27.2s + add v15.2s,v10.2s,v25.2s + add v18.2s,v13.2s,v28.2s + +.Long_tail: + dup v14.2d,v14.d[0] + umull2 v19.2d,v16.4s,v6.4s + umull2 v22.2d,v16.4s,v1.4s + umull2 v23.2d,v16.4s,v3.4s + umull2 v21.2d,v16.4s,v0.4s + umull2 v20.2d,v16.4s,v8.4s + + dup v15.2d,v15.d[0] + umlal2 v19.2d,v14.4s,v0.4s + umlal2 v21.2d,v14.4s,v3.4s + umlal2 v22.2d,v14.4s,v5.4s + umlal2 v23.2d,v14.4s,v7.4s + umlal2 v20.2d,v14.4s,v1.4s + + dup v17.2d,v17.d[0] + umlal2 v19.2d,v15.4s,v8.4s + umlal2 v22.2d,v15.4s,v3.4s + umlal2 v21.2d,v15.4s,v1.4s + umlal2 v23.2d,v15.4s,v5.4s + umlal2 v20.2d,v15.4s,v0.4s + + dup v18.2d,v18.d[0] + umlal2 v22.2d,v17.4s,v0.4s + umlal2 v23.2d,v17.4s,v1.4s + umlal2 v19.2d,v17.4s,v4.4s + umlal2 v20.2d,v17.4s,v6.4s + umlal2 v21.2d,v17.4s,v8.4s + + umlal2 v22.2d,v18.4s,v8.4s + umlal2 v19.2d,v18.4s,v2.4s + umlal2 v23.2d,v18.4s,v0.4s + umlal2 v20.2d,v18.4s,v4.4s + umlal2 v21.2d,v18.4s,v6.4s + + b.eq .Lshort_tail + + //////////////////////////////////////////////////////////////// + // (hash+inp[0:1])*r^4:r^3 and accumulate + + add v9.2s,v9.2s,v24.2s + umlal v22.2d,v11.2s,v1.2s + umlal v19.2d,v11.2s,v6.2s + umlal v23.2d,v11.2s,v3.2s + umlal v20.2d,v11.2s,v8.2s + umlal v21.2d,v11.2s,v0.2s + + add v10.2s,v10.2s,v25.2s + umlal v22.2d,v9.2s,v5.2s + umlal v19.2d,v9.2s,v0.2s + umlal v23.2d,v9.2s,v7.2s + umlal v20.2d,v9.2s,v1.2s + umlal v21.2d,v9.2s,v3.2s + + add v12.2s,v12.2s,v27.2s + umlal v22.2d,v10.2s,v3.2s + umlal v19.2d,v10.2s,v8.2s + umlal v23.2d,v10.2s,v5.2s + umlal v20.2d,v10.2s,v0.2s + umlal v21.2d,v10.2s,v1.2s + + add v13.2s,v13.2s,v28.2s + umlal v22.2d,v12.2s,v0.2s + umlal v19.2d,v12.2s,v4.2s + umlal v23.2d,v12.2s,v1.2s + umlal v20.2d,v12.2s,v6.2s + umlal v21.2d,v12.2s,v8.2s + + umlal v22.2d,v13.2s,v8.2s + umlal v19.2d,v13.2s,v2.2s + umlal v23.2d,v13.2s,v0.2s + umlal v20.2d,v13.2s,v4.2s + umlal v21.2d,v13.2s,v6.2s + +.Lshort_tail: + //////////////////////////////////////////////////////////////// + // horizontal add + + addp v22.2d,v22.2d,v22.2d + ldp d8,d9,[sp,#16] // meet ABI requirements + addp v19.2d,v19.2d,v19.2d + ldp d10,d11,[sp,#32] + addp v23.2d,v23.2d,v23.2d + ldp d12,d13,[sp,#48] + addp v20.2d,v20.2d,v20.2d + ldp d14,d15,[sp,#64] + addp v21.2d,v21.2d,v21.2d + ldr x30,[sp,#8] + + //////////////////////////////////////////////////////////////// + // lazy reduction, but without narrowing + + ushr v29.2d,v22.2d,#26 + and v22.16b,v22.16b,v31.16b + ushr v30.2d,v19.2d,#26 + and v19.16b,v19.16b,v31.16b + + add v23.2d,v23.2d,v29.2d // h3 -> h4 + add v20.2d,v20.2d,v30.2d // h0 -> h1 + + ushr v29.2d,v23.2d,#26 + and v23.16b,v23.16b,v31.16b + ushr v30.2d,v20.2d,#26 + and v20.16b,v20.16b,v31.16b + add v21.2d,v21.2d,v30.2d // h1 -> h2 + + add v19.2d,v19.2d,v29.2d + shl v29.2d,v29.2d,#2 + ushr v30.2d,v21.2d,#26 + and v21.16b,v21.16b,v31.16b + add v19.2d,v19.2d,v29.2d // h4 -> h0 + add v22.2d,v22.2d,v30.2d // h2 -> h3 + + ushr v29.2d,v19.2d,#26 + and v19.16b,v19.16b,v31.16b + ushr v30.2d,v22.2d,#26 + and v22.16b,v22.16b,v31.16b + add v20.2d,v20.2d,v29.2d // h0 -> h1 + add v23.2d,v23.2d,v30.2d // h3 -> h4 + + //////////////////////////////////////////////////////////////// + // write the result, can be partially reduced + + st4 {v19.s,v20.s,v21.s,v22.s}[0],[x0],#16 + mov x4,#1 + st1 {v23.s}[0],[x0] + str x4,[x0,#8] // set is_base2_26 + + ldr x29,[sp],#80 + .inst 0xd50323bf // autiasp + ret +.size poly1305_blocks_neon,.-poly1305_blocks_neon + +.align 5 +.Lzeros: +.long 0,0,0,0,0,0,0,0 +.asciz "Poly1305 for ARMv8, CRYPTOGAMS by @dot-asm" +.align 2 +#if !defined(__KERNEL__) && !defined(_WIN64) +.comm OPENSSL_armcap_P,4,4 +.hidden OPENSSL_armcap_P +#endif diff --git a/arch/arm64/crypto/poly1305-glue.c b/arch/arm64/crypto/poly1305-glue.c new file mode 100644 index 000000000000..2d348021574a --- /dev/null +++ b/arch/arm64/crypto/poly1305-glue.c @@ -0,0 +1,230 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * OpenSSL/Cryptogams accelerated Poly1305 transform for arm64 + * + * Copyright (C) 2019 Linaro Ltd. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +asmlinkage void poly1305_init_arm64(void *state, const u8 *key); +asmlinkage void poly1305_blocks(void *state, const u8 *src, u32 len, u32 hibit); +asmlinkage void poly1305_blocks_neon(void *state, const u8 *src, u32 len, u32 hibit); +asmlinkage void poly1305_emit(void *state, u8 *digest, const u32 *nonce); + +static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_neon); + +void poly1305_init_arch(struct poly1305_desc_ctx *dctx, const u8 *key) +{ + poly1305_init_arm64(&dctx->h, key); + dctx->s[0] = get_unaligned_le32(key + 16); + dctx->s[1] = get_unaligned_le32(key + 20); + dctx->s[2] = get_unaligned_le32(key + 24); + dctx->s[3] = get_unaligned_le32(key + 28); + dctx->buflen = 0; +} +EXPORT_SYMBOL(poly1305_init_arch); + +static int neon_poly1305_init(struct shash_desc *desc) +{ + struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc); + + dctx->buflen = 0; + dctx->rset = 0; + dctx->sset = false; + + return 0; +} + +static void neon_poly1305_blocks(struct poly1305_desc_ctx *dctx, const u8 *src, + u32 len, u32 hibit, bool do_neon) +{ + if (unlikely(!dctx->sset)) { + if (!dctx->rset) { + poly1305_init_arch(dctx, src); + src += POLY1305_BLOCK_SIZE; + len -= POLY1305_BLOCK_SIZE; + dctx->rset = 1; + } + if (len >= POLY1305_BLOCK_SIZE) { + dctx->s[0] = get_unaligned_le32(src + 0); + dctx->s[1] = get_unaligned_le32(src + 4); + dctx->s[2] = get_unaligned_le32(src + 8); + dctx->s[3] = get_unaligned_le32(src + 12); + src += POLY1305_BLOCK_SIZE; + len -= POLY1305_BLOCK_SIZE; + dctx->sset = true; + } + if (len < POLY1305_BLOCK_SIZE) + return; + } + + len &= ~(POLY1305_BLOCK_SIZE - 1); + + if (static_branch_likely(&have_neon) && likely(do_neon)) + poly1305_blocks_neon(&dctx->h, src, len, hibit); + else + poly1305_blocks(&dctx->h, src, len, hibit); +} + +static void neon_poly1305_do_update(struct poly1305_desc_ctx *dctx, + const u8 *src, u32 len, bool do_neon) +{ + if (unlikely(dctx->buflen)) { + u32 bytes = min(len, POLY1305_BLOCK_SIZE - dctx->buflen); + + memcpy(dctx->buf + dctx->buflen, src, bytes); + src += bytes; + len -= bytes; + dctx->buflen += bytes; + + if (dctx->buflen == POLY1305_BLOCK_SIZE) { + neon_poly1305_blocks(dctx, dctx->buf, + POLY1305_BLOCK_SIZE, 1, false); + dctx->buflen = 0; + } + } + + if (likely(len >= POLY1305_BLOCK_SIZE)) { + neon_poly1305_blocks(dctx, src, len, 1, do_neon); + src += round_down(len, POLY1305_BLOCK_SIZE); + len %= POLY1305_BLOCK_SIZE; + } + + if (unlikely(len)) { + dctx->buflen = len; + memcpy(dctx->buf, src, len); + } +} + +static int neon_poly1305_update(struct shash_desc *desc, + const u8 *src, unsigned int srclen) +{ + bool do_neon = may_use_simd() && srclen > 128; + struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc); + + if (static_branch_likely(&have_neon) && do_neon) + kernel_neon_begin(); + neon_poly1305_do_update(dctx, src, srclen, do_neon); + if (static_branch_likely(&have_neon) && do_neon) + kernel_neon_end(); + return 0; +} + +void poly1305_update_arch(struct poly1305_desc_ctx *dctx, const u8 *src, + unsigned int nbytes) +{ + if (unlikely(dctx->buflen)) { + u32 bytes = min(nbytes, POLY1305_BLOCK_SIZE - dctx->buflen); + + memcpy(dctx->buf + dctx->buflen, src, bytes); + src += bytes; + nbytes -= bytes; + dctx->buflen += bytes; + + if (dctx->buflen == POLY1305_BLOCK_SIZE) { + poly1305_blocks(&dctx->h, dctx->buf, POLY1305_BLOCK_SIZE, 1); + dctx->buflen = 0; + } + } + + if (likely(nbytes >= POLY1305_BLOCK_SIZE)) { + unsigned int len = round_down(nbytes, POLY1305_BLOCK_SIZE); + + if (static_branch_likely(&have_neon) && may_use_simd()) { + do { + unsigned int todo = min_t(unsigned int, len, SZ_4K); + + kernel_neon_begin(); + poly1305_blocks_neon(&dctx->h, src, todo, 1); + kernel_neon_end(); + + len -= todo; + src += todo; + } while (len); + } else { + poly1305_blocks(&dctx->h, src, len, 1); + src += len; + } + nbytes %= POLY1305_BLOCK_SIZE; + } + + if (unlikely(nbytes)) { + dctx->buflen = nbytes; + memcpy(dctx->buf, src, nbytes); + } +} +EXPORT_SYMBOL(poly1305_update_arch); + +void poly1305_final_arch(struct poly1305_desc_ctx *dctx, u8 *dst) +{ + if (unlikely(dctx->buflen)) { + dctx->buf[dctx->buflen++] = 1; + memset(dctx->buf + dctx->buflen, 0, + POLY1305_BLOCK_SIZE - dctx->buflen); + poly1305_blocks(&dctx->h, dctx->buf, POLY1305_BLOCK_SIZE, 0); + } + + poly1305_emit(&dctx->h, dst, dctx->s); + *dctx = (struct poly1305_desc_ctx){}; +} +EXPORT_SYMBOL(poly1305_final_arch); + +static int neon_poly1305_final(struct shash_desc *desc, u8 *dst) +{ + struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc); + + if (unlikely(!dctx->sset)) + return -ENOKEY; + + poly1305_final_arch(dctx, dst); + return 0; +} + +static struct shash_alg neon_poly1305_alg = { + .init = neon_poly1305_init, + .update = neon_poly1305_update, + .final = neon_poly1305_final, + .digestsize = POLY1305_DIGEST_SIZE, + .descsize = sizeof(struct poly1305_desc_ctx), + + .base.cra_name = "poly1305", + .base.cra_driver_name = "poly1305-neon", + .base.cra_priority = 200, + .base.cra_blocksize = POLY1305_BLOCK_SIZE, + .base.cra_module = THIS_MODULE, +}; + +static int __init neon_poly1305_mod_init(void) +{ + if (!(elf_hwcap & HWCAP_ASIMD)) + return 0; + + static_branch_enable(&have_neon); + + return IS_REACHABLE(CONFIG_CRYPTO_HASH) ? + crypto_register_shash(&neon_poly1305_alg) : 0; +} + +static void __exit neon_poly1305_mod_exit(void) +{ + if (IS_REACHABLE(CONFIG_CRYPTO_HASH) && (elf_hwcap & HWCAP_ASIMD)) + crypto_unregister_shash(&neon_poly1305_alg); +} + +module_init(neon_poly1305_mod_init); +module_exit(neon_poly1305_mod_exit); + +MODULE_LICENSE("GPL v2"); +MODULE_ALIAS_CRYPTO("poly1305"); +MODULE_ALIAS_CRYPTO("poly1305-neon"); diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 5e720742d647..c67cae9d5229 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -192,6 +192,7 @@ enum vcpu_sysreg { #define cp14_DBGWCR0 (DBGWCR0_EL1 * 2) #define cp14_DBGWVR0 (DBGWVR0_EL1 * 2) #define cp14_DBGDCCINT (MDCCINT_EL1 * 2) +#define cp14_DBGVCR (DBGVCR32_EL2 * 2) #define NR_COPRO_REGS (NR_SYS_REGS * 2) diff --git a/arch/arm64/include/asm/numa.h b/arch/arm64/include/asm/numa.h index 626ad01e83bf..dd870390d639 100644 --- a/arch/arm64/include/asm/numa.h +++ b/arch/arm64/include/asm/numa.h @@ -25,6 +25,9 @@ const struct cpumask *cpumask_of_node(int node); /* Returns a pointer to the cpumask of CPUs on Node 'node'. */ static inline const struct cpumask *cpumask_of_node(int node) { + if (node == NUMA_NO_NODE) + return cpu_all_mask; + return node_to_cpumask_map[node]; } #endif diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c index 90ba6d0b87e0..29f37effe7a4 100644 --- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -620,6 +620,12 @@ check_branch_predictor(const struct arm64_cpu_capabilities *entry, int scope) return (need_wa > 0); } +static void +cpu_enable_branch_predictor_hardening(const struct arm64_cpu_capabilities *cap) +{ + cap->matches(cap, SCOPE_LOCAL_CPU); +} + static const __maybe_unused struct midr_range tx2_family_cpus[] = { MIDR_ALL_VERSIONS(MIDR_BRCM_VULCAN), MIDR_ALL_VERSIONS(MIDR_CAVIUM_THUNDERX2), @@ -860,9 +866,11 @@ const struct arm64_cpu_capabilities arm64_errata[] = { }, #endif { + .desc = "Branch predictor hardening", .capability = ARM64_HARDEN_BRANCH_PREDICTOR, .type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM, .matches = check_branch_predictor, + .cpu_enable = cpu_enable_branch_predictor_hardening, }, #ifdef CONFIG_HARDEN_EL2_VECTORS { diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c index 2786c4e6061a..518ecb796b29 100644 --- a/arch/arm64/kernel/topology.c +++ b/arch/arm64/kernel/topology.c @@ -290,21 +290,23 @@ void store_cpu_topology(unsigned int cpuid) if (mpidr & MPIDR_UP_BITMASK) return; - /* Create cpu topology mapping based on MPIDR. */ - if (mpidr & MPIDR_MT_BITMASK) { - /* Multiprocessor system : Multi-threads per core */ - cpuid_topo->thread_id = MPIDR_AFFINITY_LEVEL(mpidr, 0); - cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 1); - cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 2) | - MPIDR_AFFINITY_LEVEL(mpidr, 3) << 8; - } else { - /* Multiprocessor system : Single-thread per core */ - cpuid_topo->thread_id = -1; - cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 0); - cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 1) | - MPIDR_AFFINITY_LEVEL(mpidr, 2) << 8 | - MPIDR_AFFINITY_LEVEL(mpidr, 3) << 16; - } + /* + * This would be the place to create cpu topology based on MPIDR. + * + * However, it cannot be trusted to depict the actual topology; some + * pieces of the architecture enforce an artificial cap on Aff0 values + * (e.g. GICv3's ICC_SGI1R_EL1 limits it to 15), leading to an + * artificial cycling of Aff1, Aff2 and Aff3 values. IOW, these end up + * having absolutely no relationship to the actual underlying system + * topology, and cannot be reasonably used as core / package ID. + * + * If the MT bit is set, Aff0 *could* be used to define a thread ID, but + * we still wouldn't be able to obtain a sane core ID. This means we + * need to entirely ignore MPIDR for any topology deduction. + */ + cpuid_topo->thread_id = -1; + cpuid_topo->core_id = cpuid; + cpuid_topo->package_id = cpu_to_node(cpuid); pr_debug("CPU%u: cluster %d core %d thread %d mpidr %#016llx\n", cpuid, cpuid_topo->package_id, cpuid_topo->core_id, diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 0c073f3ca122..b53d0ebb87fc 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -1555,9 +1555,9 @@ static const struct sys_reg_desc cp14_regs[] = { { Op1( 0), CRn( 0), CRm( 1), Op2( 0), trap_raz_wi }, DBG_BCR_BVR_WCR_WVR(1), /* DBGDCCINT */ - { Op1( 0), CRn( 0), CRm( 2), Op2( 0), trap_debug32 }, + { Op1( 0), CRn( 0), CRm( 2), Op2( 0), trap_debug32, NULL, cp14_DBGDCCINT }, /* DBGDSCRext */ - { Op1( 0), CRn( 0), CRm( 2), Op2( 2), trap_debug32 }, + { Op1( 0), CRn( 0), CRm( 2), Op2( 2), trap_debug32, NULL, cp14_DBGDSCRext }, DBG_BCR_BVR_WCR_WVR(2), /* DBGDTR[RT]Xint */ { Op1( 0), CRn( 0), CRm( 3), Op2( 0), trap_raz_wi }, @@ -1572,7 +1572,7 @@ static const struct sys_reg_desc cp14_regs[] = { { Op1( 0), CRn( 0), CRm( 6), Op2( 2), trap_raz_wi }, DBG_BCR_BVR_WCR_WVR(6), /* DBGVCR */ - { Op1( 0), CRn( 0), CRm( 7), Op2( 0), trap_debug32 }, + { Op1( 0), CRn( 0), CRm( 7), Op2( 0), trap_debug32, NULL, cp14_DBGVCR }, DBG_BCR_BVR_WCR_WVR(7), DBG_BCR_BVR_WCR_WVR(8), DBG_BCR_BVR_WCR_WVR(9), diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c index 54529b4ed513..15eaf1e09d0c 100644 --- a/arch/arm64/mm/numa.c +++ b/arch/arm64/mm/numa.c @@ -58,7 +58,11 @@ EXPORT_SYMBOL(node_to_cpumask_map); */ const struct cpumask *cpumask_of_node(int node) { - if (WARN_ON(node >= nr_node_ids)) + + if (node == NUMA_NO_NODE) + return cpu_all_mask; + + if (WARN_ON(node < 0 || node >= nr_node_ids)) return cpu_none_mask; if (WARN_ON(node_to_cpumask_map[node] == NULL)) diff --git a/arch/ia64/kernel/Makefile b/arch/ia64/kernel/Makefile index d0c0ccdd656a..03ee3ff3cefa 100644 --- a/arch/ia64/kernel/Makefile +++ b/arch/ia64/kernel/Makefile @@ -42,7 +42,7 @@ obj-y += esi_stub.o # must be in kernel proper endif obj-$(CONFIG_INTEL_IOMMU) += pci-dma.o -obj-$(CONFIG_BINFMT_ELF) += elfcore.o +obj-$(CONFIG_ELF_CORE) += elfcore.o # fp_emulate() expects f2-f5,f16-f31 to contain the user-level state. CFLAGS_traps.o += -mfixed-range=f2-f5,f16-f31 diff --git a/arch/ia64/kernel/kprobes.c b/arch/ia64/kernel/kprobes.c index aa41bd5cf9b7..8207b897b49d 100644 --- a/arch/ia64/kernel/kprobes.c +++ b/arch/ia64/kernel/kprobes.c @@ -409,83 +409,9 @@ static void kretprobe_trampoline(void) { } -/* - * At this point the target function has been tricked into - * returning into our trampoline. Lookup the associated instance - * and then: - * - call the handler function - * - cleanup by marking the instance as unused - * - long jump back to the original return address - */ int __kprobes trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs) { - struct kretprobe_instance *ri = NULL; - struct hlist_head *head, empty_rp; - struct hlist_node *tmp; - unsigned long flags, orig_ret_address = 0; - unsigned long trampoline_address = - ((struct fnptr *)kretprobe_trampoline)->ip; - - INIT_HLIST_HEAD(&empty_rp); - kretprobe_hash_lock(current, &head, &flags); - - /* - * It is possible to have multiple instances associated with a given - * task either because an multiple functions in the call path - * have a return probe installed on them, and/or more than one return - * return probe was registered for a target function. - * - * We can handle this because: - * - instances are always inserted at the head of the list - * - when multiple return probes are registered for the same - * function, the first instance's ret_addr will point to the - * real return address, and all the rest will point to - * kretprobe_trampoline - */ - hlist_for_each_entry_safe(ri, tmp, head, hlist) { - if (ri->task != current) - /* another task is sharing our hash bucket */ - continue; - - orig_ret_address = (unsigned long)ri->ret_addr; - if (orig_ret_address != trampoline_address) - /* - * This is the real return address. Any other - * instances associated with this task are for - * other calls deeper on the call stack - */ - break; - } - - regs->cr_iip = orig_ret_address; - - hlist_for_each_entry_safe(ri, tmp, head, hlist) { - if (ri->task != current) - /* another task is sharing our hash bucket */ - continue; - - if (ri->rp && ri->rp->handler) - ri->rp->handler(ri, regs); - - orig_ret_address = (unsigned long)ri->ret_addr; - recycle_rp_inst(ri, &empty_rp); - - if (orig_ret_address != trampoline_address) - /* - * This is the real return address. Any other - * instances associated with this task are for - * other calls deeper on the call stack - */ - break; - } - kretprobe_assert(ri, orig_ret_address, trampoline_address); - - kretprobe_hash_unlock(current, &flags); - - hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) { - hlist_del(&ri->hlist); - kfree(ri); - } + regs->cr_iip = __kretprobe_trampoline_handler(regs, kretprobe_trampoline, NULL); /* * By returning a non-zero value, we are telling * kprobe_handler() that we don't want the post_handler @@ -498,6 +424,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri, struct pt_regs *regs) { ri->ret_addr = (kprobe_opcode_t *)regs->b0; + ri->fp = NULL; /* Replace the return addr with trampoline addr */ regs->b0 = ((struct fnptr *)kretprobe_trampoline)->ip; diff --git a/arch/mips/Makefile b/arch/mips/Makefile index 63e2ad43bd6a..7683bb7d7a42 100644 --- a/arch/mips/Makefile +++ b/arch/mips/Makefile @@ -339,7 +339,7 @@ libs-y += arch/mips/math-emu/ # See arch/mips/Kbuild for content of core part of the kernel core-y += arch/mips/ -drivers-$(CONFIG_MIPS_CRC_SUPPORT) += arch/mips/crypto/ +drivers-y += arch/mips/crypto/ drivers-$(CONFIG_OPROFILE) += arch/mips/oprofile/ # suspend and hibernation support diff --git a/arch/mips/crypto/Makefile b/arch/mips/crypto/Makefile index e07aca572c2e..8e1deaf00e0c 100644 --- a/arch/mips/crypto/Makefile +++ b/arch/mips/crypto/Makefile @@ -4,3 +4,21 @@ # obj-$(CONFIG_CRYPTO_CRC32_MIPS) += crc32-mips.o + +obj-$(CONFIG_CRYPTO_CHACHA_MIPS) += chacha-mips.o +chacha-mips-y := chacha-core.o chacha-glue.o +AFLAGS_chacha-core.o += -O2 # needed to fill branch delay slots + +obj-$(CONFIG_CRYPTO_POLY1305_MIPS) += poly1305-mips.o +poly1305-mips-y := poly1305-core.o poly1305-glue.o + +perlasm-flavour-$(CONFIG_CPU_MIPS32) := o32 +perlasm-flavour-$(CONFIG_CPU_MIPS64) := 64 + +quiet_cmd_perlasm = PERLASM $@ + cmd_perlasm = $(PERL) $(<) $(perlasm-flavour-y) $(@) + +$(obj)/poly1305-core.S: $(src)/poly1305-mips.pl FORCE + $(call if_changed,perlasm) + +targets += poly1305-core.S diff --git a/arch/mips/crypto/chacha-core.S b/arch/mips/crypto/chacha-core.S new file mode 100644 index 000000000000..5755f69cfe00 --- /dev/null +++ b/arch/mips/crypto/chacha-core.S @@ -0,0 +1,497 @@ +/* SPDX-License-Identifier: GPL-2.0 OR MIT */ +/* + * Copyright (C) 2016-2018 René van Dorst . All Rights Reserved. + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#define MASK_U32 0x3c +#define CHACHA20_BLOCK_SIZE 64 +#define STACK_SIZE 32 + +#define X0 $t0 +#define X1 $t1 +#define X2 $t2 +#define X3 $t3 +#define X4 $t4 +#define X5 $t5 +#define X6 $t6 +#define X7 $t7 +#define X8 $t8 +#define X9 $t9 +#define X10 $v1 +#define X11 $s6 +#define X12 $s5 +#define X13 $s4 +#define X14 $s3 +#define X15 $s2 +/* Use regs which are overwritten on exit for Tx so we don't leak clear data. */ +#define T0 $s1 +#define T1 $s0 +#define T(n) T ## n +#define X(n) X ## n + +/* Input arguments */ +#define STATE $a0 +#define OUT $a1 +#define IN $a2 +#define BYTES $a3 + +/* Output argument */ +/* NONCE[0] is kept in a register and not in memory. + * We don't want to touch original value in memory. + * Must be incremented every loop iteration. + */ +#define NONCE_0 $v0 + +/* SAVED_X and SAVED_CA are set in the jump table. + * Use regs which are overwritten on exit else we don't leak clear data. + * They are used to handling the last bytes which are not multiple of 4. + */ +#define SAVED_X X15 +#define SAVED_CA $s7 + +#define IS_UNALIGNED $s7 + +#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ +#define MSB 0 +#define LSB 3 +#define ROTx rotl +#define ROTR(n) rotr n, 24 +#define CPU_TO_LE32(n) \ + wsbh n; \ + rotr n, 16; +#else +#define MSB 3 +#define LSB 0 +#define ROTx rotr +#define CPU_TO_LE32(n) +#define ROTR(n) +#endif + +#define FOR_EACH_WORD(x) \ + x( 0); \ + x( 1); \ + x( 2); \ + x( 3); \ + x( 4); \ + x( 5); \ + x( 6); \ + x( 7); \ + x( 8); \ + x( 9); \ + x(10); \ + x(11); \ + x(12); \ + x(13); \ + x(14); \ + x(15); + +#define FOR_EACH_WORD_REV(x) \ + x(15); \ + x(14); \ + x(13); \ + x(12); \ + x(11); \ + x(10); \ + x( 9); \ + x( 8); \ + x( 7); \ + x( 6); \ + x( 5); \ + x( 4); \ + x( 3); \ + x( 2); \ + x( 1); \ + x( 0); + +#define PLUS_ONE_0 1 +#define PLUS_ONE_1 2 +#define PLUS_ONE_2 3 +#define PLUS_ONE_3 4 +#define PLUS_ONE_4 5 +#define PLUS_ONE_5 6 +#define PLUS_ONE_6 7 +#define PLUS_ONE_7 8 +#define PLUS_ONE_8 9 +#define PLUS_ONE_9 10 +#define PLUS_ONE_10 11 +#define PLUS_ONE_11 12 +#define PLUS_ONE_12 13 +#define PLUS_ONE_13 14 +#define PLUS_ONE_14 15 +#define PLUS_ONE_15 16 +#define PLUS_ONE(x) PLUS_ONE_ ## x +#define _CONCAT3(a,b,c) a ## b ## c +#define CONCAT3(a,b,c) _CONCAT3(a,b,c) + +#define STORE_UNALIGNED(x) \ +CONCAT3(.Lchacha_mips_xor_unaligned_, PLUS_ONE(x), _b: ;) \ + .if (x != 12); \ + lw T0, (x*4)(STATE); \ + .endif; \ + lwl T1, (x*4)+MSB ## (IN); \ + lwr T1, (x*4)+LSB ## (IN); \ + .if (x == 12); \ + addu X ## x, NONCE_0; \ + .else; \ + addu X ## x, T0; \ + .endif; \ + CPU_TO_LE32(X ## x); \ + xor X ## x, T1; \ + swl X ## x, (x*4)+MSB ## (OUT); \ + swr X ## x, (x*4)+LSB ## (OUT); + +#define STORE_ALIGNED(x) \ +CONCAT3(.Lchacha_mips_xor_aligned_, PLUS_ONE(x), _b: ;) \ + .if (x != 12); \ + lw T0, (x*4)(STATE); \ + .endif; \ + lw T1, (x*4) ## (IN); \ + .if (x == 12); \ + addu X ## x, NONCE_0; \ + .else; \ + addu X ## x, T0; \ + .endif; \ + CPU_TO_LE32(X ## x); \ + xor X ## x, T1; \ + sw X ## x, (x*4) ## (OUT); + +/* Jump table macro. + * Used for setup and handling the last bytes, which are not multiple of 4. + * X15 is free to store Xn + * Every jumptable entry must be equal in size. + */ +#define JMPTBL_ALIGNED(x) \ +.Lchacha_mips_jmptbl_aligned_ ## x: ; \ + .set noreorder; \ + b .Lchacha_mips_xor_aligned_ ## x ## _b; \ + .if (x == 12); \ + addu SAVED_X, X ## x, NONCE_0; \ + .else; \ + addu SAVED_X, X ## x, SAVED_CA; \ + .endif; \ + .set reorder + +#define JMPTBL_UNALIGNED(x) \ +.Lchacha_mips_jmptbl_unaligned_ ## x: ; \ + .set noreorder; \ + b .Lchacha_mips_xor_unaligned_ ## x ## _b; \ + .if (x == 12); \ + addu SAVED_X, X ## x, NONCE_0; \ + .else; \ + addu SAVED_X, X ## x, SAVED_CA; \ + .endif; \ + .set reorder + +#define AXR(A, B, C, D, K, L, M, N, V, W, Y, Z, S) \ + addu X(A), X(K); \ + addu X(B), X(L); \ + addu X(C), X(M); \ + addu X(D), X(N); \ + xor X(V), X(A); \ + xor X(W), X(B); \ + xor X(Y), X(C); \ + xor X(Z), X(D); \ + rotl X(V), S; \ + rotl X(W), S; \ + rotl X(Y), S; \ + rotl X(Z), S; + +.text +.set reorder +.set noat +.globl chacha_crypt_arch +.ent chacha_crypt_arch +chacha_crypt_arch: + .frame $sp, STACK_SIZE, $ra + + /* Load number of rounds */ + lw $at, 16($sp) + + addiu $sp, -STACK_SIZE + + /* Return bytes = 0. */ + beqz BYTES, .Lchacha_mips_end + + lw NONCE_0, 48(STATE) + + /* Save s0-s7 */ + sw $s0, 0($sp) + sw $s1, 4($sp) + sw $s2, 8($sp) + sw $s3, 12($sp) + sw $s4, 16($sp) + sw $s5, 20($sp) + sw $s6, 24($sp) + sw $s7, 28($sp) + + /* Test IN or OUT is unaligned. + * IS_UNALIGNED = ( IN | OUT ) & 0x00000003 + */ + or IS_UNALIGNED, IN, OUT + andi IS_UNALIGNED, 0x3 + + b .Lchacha_rounds_start + +.align 4 +.Loop_chacha_rounds: + addiu IN, CHACHA20_BLOCK_SIZE + addiu OUT, CHACHA20_BLOCK_SIZE + addiu NONCE_0, 1 + +.Lchacha_rounds_start: + lw X0, 0(STATE) + lw X1, 4(STATE) + lw X2, 8(STATE) + lw X3, 12(STATE) + + lw X4, 16(STATE) + lw X5, 20(STATE) + lw X6, 24(STATE) + lw X7, 28(STATE) + lw X8, 32(STATE) + lw X9, 36(STATE) + lw X10, 40(STATE) + lw X11, 44(STATE) + + move X12, NONCE_0 + lw X13, 52(STATE) + lw X14, 56(STATE) + lw X15, 60(STATE) + +.Loop_chacha_xor_rounds: + addiu $at, -2 + AXR( 0, 1, 2, 3, 4, 5, 6, 7, 12,13,14,15, 16); + AXR( 8, 9,10,11, 12,13,14,15, 4, 5, 6, 7, 12); + AXR( 0, 1, 2, 3, 4, 5, 6, 7, 12,13,14,15, 8); + AXR( 8, 9,10,11, 12,13,14,15, 4, 5, 6, 7, 7); + AXR( 0, 1, 2, 3, 5, 6, 7, 4, 15,12,13,14, 16); + AXR(10,11, 8, 9, 15,12,13,14, 5, 6, 7, 4, 12); + AXR( 0, 1, 2, 3, 5, 6, 7, 4, 15,12,13,14, 8); + AXR(10,11, 8, 9, 15,12,13,14, 5, 6, 7, 4, 7); + bnez $at, .Loop_chacha_xor_rounds + + addiu BYTES, -(CHACHA20_BLOCK_SIZE) + + /* Is data src/dst unaligned? Jump */ + bnez IS_UNALIGNED, .Loop_chacha_unaligned + + /* Set number rounds here to fill delayslot. */ + lw $at, (STACK_SIZE+16)($sp) + + /* BYTES < 0, it has no full block. */ + bltz BYTES, .Lchacha_mips_no_full_block_aligned + + FOR_EACH_WORD_REV(STORE_ALIGNED) + + /* BYTES > 0? Loop again. */ + bgtz BYTES, .Loop_chacha_rounds + + /* Place this here to fill delay slot */ + addiu NONCE_0, 1 + + /* BYTES < 0? Handle last bytes */ + bltz BYTES, .Lchacha_mips_xor_bytes + +.Lchacha_mips_xor_done: + /* Restore used registers */ + lw $s0, 0($sp) + lw $s1, 4($sp) + lw $s2, 8($sp) + lw $s3, 12($sp) + lw $s4, 16($sp) + lw $s5, 20($sp) + lw $s6, 24($sp) + lw $s7, 28($sp) + + /* Write NONCE_0 back to right location in state */ + sw NONCE_0, 48(STATE) + +.Lchacha_mips_end: + addiu $sp, STACK_SIZE + jr $ra + +.Lchacha_mips_no_full_block_aligned: + /* Restore the offset on BYTES */ + addiu BYTES, CHACHA20_BLOCK_SIZE + + /* Get number of full WORDS */ + andi $at, BYTES, MASK_U32 + + /* Load upper half of jump table addr */ + lui T0, %hi(.Lchacha_mips_jmptbl_aligned_0) + + /* Calculate lower half jump table offset */ + ins T0, $at, 1, 6 + + /* Add offset to STATE */ + addu T1, STATE, $at + + /* Add lower half jump table addr */ + addiu T0, %lo(.Lchacha_mips_jmptbl_aligned_0) + + /* Read value from STATE */ + lw SAVED_CA, 0(T1) + + /* Store remaining bytecounter as negative value */ + subu BYTES, $at, BYTES + + jr T0 + + /* Jump table */ + FOR_EACH_WORD(JMPTBL_ALIGNED) + + +.Loop_chacha_unaligned: + /* Set number rounds here to fill delayslot. */ + lw $at, (STACK_SIZE+16)($sp) + + /* BYTES > 0, it has no full block. */ + bltz BYTES, .Lchacha_mips_no_full_block_unaligned + + FOR_EACH_WORD_REV(STORE_UNALIGNED) + + /* BYTES > 0? Loop again. */ + bgtz BYTES, .Loop_chacha_rounds + + /* Write NONCE_0 back to right location in state */ + sw NONCE_0, 48(STATE) + + .set noreorder + /* Fall through to byte handling */ + bgez BYTES, .Lchacha_mips_xor_done +.Lchacha_mips_xor_unaligned_0_b: +.Lchacha_mips_xor_aligned_0_b: + /* Place this here to fill delay slot */ + addiu NONCE_0, 1 + .set reorder + +.Lchacha_mips_xor_bytes: + addu IN, $at + addu OUT, $at + /* First byte */ + lbu T1, 0(IN) + addiu $at, BYTES, 1 + CPU_TO_LE32(SAVED_X) + ROTR(SAVED_X) + xor T1, SAVED_X + sb T1, 0(OUT) + beqz $at, .Lchacha_mips_xor_done + /* Second byte */ + lbu T1, 1(IN) + addiu $at, BYTES, 2 + ROTx SAVED_X, 8 + xor T1, SAVED_X + sb T1, 1(OUT) + beqz $at, .Lchacha_mips_xor_done + /* Third byte */ + lbu T1, 2(IN) + ROTx SAVED_X, 8 + xor T1, SAVED_X + sb T1, 2(OUT) + b .Lchacha_mips_xor_done + +.Lchacha_mips_no_full_block_unaligned: + /* Restore the offset on BYTES */ + addiu BYTES, CHACHA20_BLOCK_SIZE + + /* Get number of full WORDS */ + andi $at, BYTES, MASK_U32 + + /* Load upper half of jump table addr */ + lui T0, %hi(.Lchacha_mips_jmptbl_unaligned_0) + + /* Calculate lower half jump table offset */ + ins T0, $at, 1, 6 + + /* Add offset to STATE */ + addu T1, STATE, $at + + /* Add lower half jump table addr */ + addiu T0, %lo(.Lchacha_mips_jmptbl_unaligned_0) + + /* Read value from STATE */ + lw SAVED_CA, 0(T1) + + /* Store remaining bytecounter as negative value */ + subu BYTES, $at, BYTES + + jr T0 + + /* Jump table */ + FOR_EACH_WORD(JMPTBL_UNALIGNED) +.end chacha_crypt_arch +.set at + +/* Input arguments + * STATE $a0 + * OUT $a1 + * NROUND $a2 + */ + +#undef X12 +#undef X13 +#undef X14 +#undef X15 + +#define X12 $a3 +#define X13 $at +#define X14 $v0 +#define X15 STATE + +.set noat +.globl hchacha_block_arch +.ent hchacha_block_arch +hchacha_block_arch: + .frame $sp, STACK_SIZE, $ra + + addiu $sp, -STACK_SIZE + + /* Save X11(s6) */ + sw X11, 0($sp) + + lw X0, 0(STATE) + lw X1, 4(STATE) + lw X2, 8(STATE) + lw X3, 12(STATE) + lw X4, 16(STATE) + lw X5, 20(STATE) + lw X6, 24(STATE) + lw X7, 28(STATE) + lw X8, 32(STATE) + lw X9, 36(STATE) + lw X10, 40(STATE) + lw X11, 44(STATE) + lw X12, 48(STATE) + lw X13, 52(STATE) + lw X14, 56(STATE) + lw X15, 60(STATE) + +.Loop_hchacha_xor_rounds: + addiu $a2, -2 + AXR( 0, 1, 2, 3, 4, 5, 6, 7, 12,13,14,15, 16); + AXR( 8, 9,10,11, 12,13,14,15, 4, 5, 6, 7, 12); + AXR( 0, 1, 2, 3, 4, 5, 6, 7, 12,13,14,15, 8); + AXR( 8, 9,10,11, 12,13,14,15, 4, 5, 6, 7, 7); + AXR( 0, 1, 2, 3, 5, 6, 7, 4, 15,12,13,14, 16); + AXR(10,11, 8, 9, 15,12,13,14, 5, 6, 7, 4, 12); + AXR( 0, 1, 2, 3, 5, 6, 7, 4, 15,12,13,14, 8); + AXR(10,11, 8, 9, 15,12,13,14, 5, 6, 7, 4, 7); + bnez $a2, .Loop_hchacha_xor_rounds + + /* Restore used register */ + lw X11, 0($sp) + + sw X0, 0(OUT) + sw X1, 4(OUT) + sw X2, 8(OUT) + sw X3, 12(OUT) + sw X12, 16(OUT) + sw X13, 20(OUT) + sw X14, 24(OUT) + sw X15, 28(OUT) + + addiu $sp, STACK_SIZE + jr $ra +.end hchacha_block_arch +.set at diff --git a/arch/mips/crypto/chacha-glue.c b/arch/mips/crypto/chacha-glue.c new file mode 100644 index 000000000000..90896029d0cd --- /dev/null +++ b/arch/mips/crypto/chacha-glue.c @@ -0,0 +1,152 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * MIPS accelerated ChaCha and XChaCha stream ciphers, + * including ChaCha20 (RFC7539) + * + * Copyright (C) 2019 Linaro, Ltd. + */ + +#include +#include +#include +#include +#include +#include + +asmlinkage void chacha_crypt_arch(u32 *state, u8 *dst, const u8 *src, + unsigned int bytes, int nrounds); +EXPORT_SYMBOL(chacha_crypt_arch); + +asmlinkage void hchacha_block_arch(const u32 *state, u32 *stream, int nrounds); +EXPORT_SYMBOL(hchacha_block_arch); + +void chacha_init_arch(u32 *state, const u32 *key, const u8 *iv) +{ + chacha_init_generic(state, key, iv); +} +EXPORT_SYMBOL(chacha_init_arch); + +static int chacha_mips_stream_xor(struct skcipher_request *req, + const struct chacha_ctx *ctx, const u8 *iv) +{ + struct skcipher_walk walk; + u32 state[16]; + int err; + + err = skcipher_walk_virt(&walk, req, false); + + chacha_init_generic(state, ctx->key, iv); + + while (walk.nbytes > 0) { + unsigned int nbytes = walk.nbytes; + + if (nbytes < walk.total) + nbytes = round_down(nbytes, walk.stride); + + chacha_crypt(state, walk.dst.virt.addr, walk.src.virt.addr, + nbytes, ctx->nrounds); + err = skcipher_walk_done(&walk, walk.nbytes - nbytes); + } + + return err; +} + +static int chacha_mips(struct skcipher_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm); + + return chacha_mips_stream_xor(req, ctx, req->iv); +} + +static int xchacha_mips(struct skcipher_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm); + struct chacha_ctx subctx; + u32 state[16]; + u8 real_iv[16]; + + chacha_init_generic(state, ctx->key, req->iv); + + hchacha_block(state, subctx.key, ctx->nrounds); + subctx.nrounds = ctx->nrounds; + + memcpy(&real_iv[0], req->iv + 24, 8); + memcpy(&real_iv[8], req->iv + 16, 8); + return chacha_mips_stream_xor(req, &subctx, real_iv); +} + +static struct skcipher_alg algs[] = { + { + .base.cra_name = "chacha20", + .base.cra_driver_name = "chacha20-mips", + .base.cra_priority = 200, + .base.cra_blocksize = 1, + .base.cra_ctxsize = sizeof(struct chacha_ctx), + .base.cra_module = THIS_MODULE, + + .min_keysize = CHACHA_KEY_SIZE, + .max_keysize = CHACHA_KEY_SIZE, + .ivsize = CHACHA_IV_SIZE, + .chunksize = CHACHA_BLOCK_SIZE, + .setkey = chacha20_setkey, + .encrypt = chacha_mips, + .decrypt = chacha_mips, + }, { + .base.cra_name = "xchacha20", + .base.cra_driver_name = "xchacha20-mips", + .base.cra_priority = 200, + .base.cra_blocksize = 1, + .base.cra_ctxsize = sizeof(struct chacha_ctx), + .base.cra_module = THIS_MODULE, + + .min_keysize = CHACHA_KEY_SIZE, + .max_keysize = CHACHA_KEY_SIZE, + .ivsize = XCHACHA_IV_SIZE, + .chunksize = CHACHA_BLOCK_SIZE, + .setkey = chacha20_setkey, + .encrypt = xchacha_mips, + .decrypt = xchacha_mips, + }, { + .base.cra_name = "xchacha12", + .base.cra_driver_name = "xchacha12-mips", + .base.cra_priority = 200, + .base.cra_blocksize = 1, + .base.cra_ctxsize = sizeof(struct chacha_ctx), + .base.cra_module = THIS_MODULE, + + .min_keysize = CHACHA_KEY_SIZE, + .max_keysize = CHACHA_KEY_SIZE, + .ivsize = XCHACHA_IV_SIZE, + .chunksize = CHACHA_BLOCK_SIZE, + .setkey = chacha12_setkey, + .encrypt = xchacha_mips, + .decrypt = xchacha_mips, + } +}; + +static int __init chacha_simd_mod_init(void) +{ + return IS_REACHABLE(CONFIG_CRYPTO_BLKCIPHER) ? + crypto_register_skciphers(algs, ARRAY_SIZE(algs)) : 0; +} + +static void __exit chacha_simd_mod_fini(void) +{ + if (IS_REACHABLE(CONFIG_CRYPTO_BLKCIPHER)) + crypto_unregister_skciphers(algs, ARRAY_SIZE(algs)); +} + +module_init(chacha_simd_mod_init); +module_exit(chacha_simd_mod_fini); + +MODULE_DESCRIPTION("ChaCha and XChaCha stream ciphers (MIPS accelerated)"); +MODULE_AUTHOR("Ard Biesheuvel "); +MODULE_LICENSE("GPL v2"); +MODULE_ALIAS_CRYPTO("chacha20"); +MODULE_ALIAS_CRYPTO("chacha20-mips"); +MODULE_ALIAS_CRYPTO("xchacha20"); +MODULE_ALIAS_CRYPTO("xchacha20-mips"); +MODULE_ALIAS_CRYPTO("xchacha12"); +MODULE_ALIAS_CRYPTO("xchacha12-mips"); diff --git a/arch/mips/crypto/poly1305-glue.c b/arch/mips/crypto/poly1305-glue.c new file mode 100644 index 000000000000..fc881b46d911 --- /dev/null +++ b/arch/mips/crypto/poly1305-glue.c @@ -0,0 +1,191 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * OpenSSL/Cryptogams accelerated Poly1305 transform for MIPS + * + * Copyright (C) 2019 Linaro Ltd. + */ + +#include +#include +#include +#include +#include +#include +#include + +asmlinkage void poly1305_init_mips(void *state, const u8 *key); +asmlinkage void poly1305_blocks_mips(void *state, const u8 *src, u32 len, u32 hibit); +asmlinkage void poly1305_emit_mips(void *state, u8 *digest, const u32 *nonce); + +void poly1305_init_arch(struct poly1305_desc_ctx *dctx, const u8 *key) +{ + poly1305_init_mips(&dctx->h, key); + dctx->s[0] = get_unaligned_le32(key + 16); + dctx->s[1] = get_unaligned_le32(key + 20); + dctx->s[2] = get_unaligned_le32(key + 24); + dctx->s[3] = get_unaligned_le32(key + 28); + dctx->buflen = 0; +} +EXPORT_SYMBOL(poly1305_init_arch); + +static int mips_poly1305_init(struct shash_desc *desc) +{ + struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc); + + dctx->buflen = 0; + dctx->rset = 0; + dctx->sset = false; + + return 0; +} + +static void mips_poly1305_blocks(struct poly1305_desc_ctx *dctx, const u8 *src, + u32 len, u32 hibit) +{ + if (unlikely(!dctx->sset)) { + if (!dctx->rset) { + poly1305_init_mips(&dctx->h, src); + src += POLY1305_BLOCK_SIZE; + len -= POLY1305_BLOCK_SIZE; + dctx->rset = 1; + } + if (len >= POLY1305_BLOCK_SIZE) { + dctx->s[0] = get_unaligned_le32(src + 0); + dctx->s[1] = get_unaligned_le32(src + 4); + dctx->s[2] = get_unaligned_le32(src + 8); + dctx->s[3] = get_unaligned_le32(src + 12); + src += POLY1305_BLOCK_SIZE; + len -= POLY1305_BLOCK_SIZE; + dctx->sset = true; + } + if (len < POLY1305_BLOCK_SIZE) + return; + } + + len &= ~(POLY1305_BLOCK_SIZE - 1); + + poly1305_blocks_mips(&dctx->h, src, len, hibit); +} + +static int mips_poly1305_update(struct shash_desc *desc, const u8 *src, + unsigned int len) +{ + struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc); + + if (unlikely(dctx->buflen)) { + u32 bytes = min(len, POLY1305_BLOCK_SIZE - dctx->buflen); + + memcpy(dctx->buf + dctx->buflen, src, bytes); + src += bytes; + len -= bytes; + dctx->buflen += bytes; + + if (dctx->buflen == POLY1305_BLOCK_SIZE) { + mips_poly1305_blocks(dctx, dctx->buf, POLY1305_BLOCK_SIZE, 1); + dctx->buflen = 0; + } + } + + if (likely(len >= POLY1305_BLOCK_SIZE)) { + mips_poly1305_blocks(dctx, src, len, 1); + src += round_down(len, POLY1305_BLOCK_SIZE); + len %= POLY1305_BLOCK_SIZE; + } + + if (unlikely(len)) { + dctx->buflen = len; + memcpy(dctx->buf, src, len); + } + return 0; +} + +void poly1305_update_arch(struct poly1305_desc_ctx *dctx, const u8 *src, + unsigned int nbytes) +{ + if (unlikely(dctx->buflen)) { + u32 bytes = min(nbytes, POLY1305_BLOCK_SIZE - dctx->buflen); + + memcpy(dctx->buf + dctx->buflen, src, bytes); + src += bytes; + nbytes -= bytes; + dctx->buflen += bytes; + + if (dctx->buflen == POLY1305_BLOCK_SIZE) { + poly1305_blocks_mips(&dctx->h, dctx->buf, + POLY1305_BLOCK_SIZE, 1); + dctx->buflen = 0; + } + } + + if (likely(nbytes >= POLY1305_BLOCK_SIZE)) { + unsigned int len = round_down(nbytes, POLY1305_BLOCK_SIZE); + + poly1305_blocks_mips(&dctx->h, src, len, 1); + src += len; + nbytes %= POLY1305_BLOCK_SIZE; + } + + if (unlikely(nbytes)) { + dctx->buflen = nbytes; + memcpy(dctx->buf, src, nbytes); + } +} +EXPORT_SYMBOL(poly1305_update_arch); + +void poly1305_final_arch(struct poly1305_desc_ctx *dctx, u8 *dst) +{ + if (unlikely(dctx->buflen)) { + dctx->buf[dctx->buflen++] = 1; + memset(dctx->buf + dctx->buflen, 0, + POLY1305_BLOCK_SIZE - dctx->buflen); + poly1305_blocks_mips(&dctx->h, dctx->buf, POLY1305_BLOCK_SIZE, 0); + } + + poly1305_emit_mips(&dctx->h, dst, dctx->s); + *dctx = (struct poly1305_desc_ctx){}; +} +EXPORT_SYMBOL(poly1305_final_arch); + +static int mips_poly1305_final(struct shash_desc *desc, u8 *dst) +{ + struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc); + + if (unlikely(!dctx->sset)) + return -ENOKEY; + + poly1305_final_arch(dctx, dst); + return 0; +} + +static struct shash_alg mips_poly1305_alg = { + .init = mips_poly1305_init, + .update = mips_poly1305_update, + .final = mips_poly1305_final, + .digestsize = POLY1305_DIGEST_SIZE, + .descsize = sizeof(struct poly1305_desc_ctx), + + .base.cra_name = "poly1305", + .base.cra_driver_name = "poly1305-mips", + .base.cra_priority = 200, + .base.cra_blocksize = POLY1305_BLOCK_SIZE, + .base.cra_module = THIS_MODULE, +}; + +static int __init mips_poly1305_mod_init(void) +{ + return IS_REACHABLE(CONFIG_CRYPTO_HASH) ? + crypto_register_shash(&mips_poly1305_alg) : 0; +} + +static void __exit mips_poly1305_mod_exit(void) +{ + if (IS_REACHABLE(CONFIG_CRYPTO_HASH)) + crypto_unregister_shash(&mips_poly1305_alg); +} + +module_init(mips_poly1305_mod_init); +module_exit(mips_poly1305_mod_exit); + +MODULE_LICENSE("GPL v2"); +MODULE_ALIAS_CRYPTO("poly1305"); +MODULE_ALIAS_CRYPTO("poly1305-mips"); diff --git a/arch/mips/crypto/poly1305-mips.pl b/arch/mips/crypto/poly1305-mips.pl new file mode 100644 index 000000000000..b05bab884ed2 --- /dev/null +++ b/arch/mips/crypto/poly1305-mips.pl @@ -0,0 +1,1273 @@ +#!/usr/bin/env perl +# SPDX-License-Identifier: GPL-1.0+ OR BSD-3-Clause +# +# ==================================================================== +# Written by Andy Polyakov, @dot-asm, originally for the OpenSSL +# project. +# ==================================================================== + +# Poly1305 hash for MIPS. +# +# May 2016 +# +# Numbers are cycles per processed byte with poly1305_blocks alone. +# +# IALU/gcc +# R1x000 ~5.5/+130% (big-endian) +# Octeon II 2.50/+70% (little-endian) +# +# March 2019 +# +# Add 32-bit code path. +# +# October 2019 +# +# Modulo-scheduling reduction allows to omit dependency chain at the +# end of inner loop and improve performance. Also optimize MIPS32R2 +# code path for MIPS 1004K core. Per René von Dorst's suggestions. +# +# IALU/gcc +# R1x000 ~9.8/? (big-endian) +# Octeon II 3.65/+140% (little-endian) +# MT7621/1004K 4.75/? (little-endian) +# +###################################################################### +# There is a number of MIPS ABI in use, O32 and N32/64 are most +# widely used. Then there is a new contender: NUBI. It appears that if +# one picks the latter, it's possible to arrange code in ABI neutral +# manner. Therefore let's stick to NUBI register layout: +# +($zero,$at,$t0,$t1,$t2)=map("\$$_",(0..2,24,25)); +($a0,$a1,$a2,$a3,$a4,$a5,$a6,$a7)=map("\$$_",(4..11)); +($s0,$s1,$s2,$s3,$s4,$s5,$s6,$s7,$s8,$s9,$s10,$s11)=map("\$$_",(12..23)); +($gp,$tp,$sp,$fp,$ra)=map("\$$_",(3,28..31)); +# +# The return value is placed in $a0. Following coding rules facilitate +# interoperability: +# +# - never ever touch $tp, "thread pointer", former $gp [o32 can be +# excluded from the rule, because it's specified volatile]; +# - copy return value to $t0, former $v0 [or to $a0 if you're adapting +# old code]; +# - on O32 populate $a4-$a7 with 'lw $aN,4*N($sp)' if necessary; +# +# For reference here is register layout for N32/64 MIPS ABIs: +# +# ($zero,$at,$v0,$v1)=map("\$$_",(0..3)); +# ($a0,$a1,$a2,$a3,$a4,$a5,$a6,$a7)=map("\$$_",(4..11)); +# ($t0,$t1,$t2,$t3,$t8,$t9)=map("\$$_",(12..15,24,25)); +# ($s0,$s1,$s2,$s3,$s4,$s5,$s6,$s7)=map("\$$_",(16..23)); +# ($gp,$sp,$fp,$ra)=map("\$$_",(28..31)); +# +# +# +###################################################################### + +$flavour = shift || "64"; # supported flavours are o32,n32,64,nubi32,nubi64 + +$v0 = ($flavour =~ /nubi/i) ? $a0 : $t0; + +if ($flavour =~ /64|n32/i) {{{ +###################################################################### +# 64-bit code path +# + +my ($ctx,$inp,$len,$padbit) = ($a0,$a1,$a2,$a3); +my ($in0,$in1,$tmp0,$tmp1,$tmp2,$tmp3,$tmp4) = ($a4,$a5,$a6,$a7,$at,$t0,$t1); + +$code.=<<___; +#if (defined(_MIPS_ARCH_MIPS64R3) || defined(_MIPS_ARCH_MIPS64R5) || \\ + defined(_MIPS_ARCH_MIPS64R6)) \\ + && !defined(_MIPS_ARCH_MIPS64R2) +# define _MIPS_ARCH_MIPS64R2 +#endif + +#if defined(_MIPS_ARCH_MIPS64R6) +# define dmultu(rs,rt) +# define mflo(rd,rs,rt) dmulu rd,rs,rt +# define mfhi(rd,rs,rt) dmuhu rd,rs,rt +#else +# define dmultu(rs,rt) dmultu rs,rt +# define mflo(rd,rs,rt) mflo rd +# define mfhi(rd,rs,rt) mfhi rd +#endif + +#ifdef __KERNEL__ +# define poly1305_init poly1305_init_mips +# define poly1305_blocks poly1305_blocks_mips +# define poly1305_emit poly1305_emit_mips +#endif + +#if defined(__MIPSEB__) && !defined(MIPSEB) +# define MIPSEB +#endif + +#ifdef MIPSEB +# define MSB 0 +# define LSB 7 +#else +# define MSB 7 +# define LSB 0 +#endif + +.text +.set noat +.set noreorder + +.align 5 +.globl poly1305_init +.ent poly1305_init +poly1305_init: + .frame $sp,0,$ra + .set reorder + + sd $zero,0($ctx) + sd $zero,8($ctx) + sd $zero,16($ctx) + + beqz $inp,.Lno_key + +#if defined(_MIPS_ARCH_MIPS64R6) + andi $tmp0,$inp,7 # $inp % 8 + dsubu $inp,$inp,$tmp0 # align $inp + sll $tmp0,$tmp0,3 # byte to bit offset + ld $in0,0($inp) + ld $in1,8($inp) + beqz $tmp0,.Laligned_key + ld $tmp2,16($inp) + + subu $tmp1,$zero,$tmp0 +# ifdef MIPSEB + dsllv $in0,$in0,$tmp0 + dsrlv $tmp3,$in1,$tmp1 + dsllv $in1,$in1,$tmp0 + dsrlv $tmp2,$tmp2,$tmp1 +# else + dsrlv $in0,$in0,$tmp0 + dsllv $tmp3,$in1,$tmp1 + dsrlv $in1,$in1,$tmp0 + dsllv $tmp2,$tmp2,$tmp1 +# endif + or $in0,$in0,$tmp3 + or $in1,$in1,$tmp2 +.Laligned_key: +#else + ldl $in0,0+MSB($inp) + ldl $in1,8+MSB($inp) + ldr $in0,0+LSB($inp) + ldr $in1,8+LSB($inp) +#endif +#ifdef MIPSEB +# if defined(_MIPS_ARCH_MIPS64R2) + dsbh $in0,$in0 # byte swap + dsbh $in1,$in1 + dshd $in0,$in0 + dshd $in1,$in1 +# else + ori $tmp0,$zero,0xFF + dsll $tmp2,$tmp0,32 + or $tmp0,$tmp2 # 0x000000FF000000FF + + and $tmp1,$in0,$tmp0 # byte swap + and $tmp3,$in1,$tmp0 + dsrl $tmp2,$in0,24 + dsrl $tmp4,$in1,24 + dsll $tmp1,24 + dsll $tmp3,24 + and $tmp2,$tmp0 + and $tmp4,$tmp0 + dsll $tmp0,8 # 0x0000FF000000FF00 + or $tmp1,$tmp2 + or $tmp3,$tmp4 + and $tmp2,$in0,$tmp0 + and $tmp4,$in1,$tmp0 + dsrl $in0,8 + dsrl $in1,8 + dsll $tmp2,8 + dsll $tmp4,8 + and $in0,$tmp0 + and $in1,$tmp0 + or $tmp1,$tmp2 + or $tmp3,$tmp4 + or $in0,$tmp1 + or $in1,$tmp3 + dsrl $tmp1,$in0,32 + dsrl $tmp3,$in1,32 + dsll $in0,32 + dsll $in1,32 + or $in0,$tmp1 + or $in1,$tmp3 +# endif +#endif + li $tmp0,1 + dsll $tmp0,32 # 0x0000000100000000 + daddiu $tmp0,-63 # 0x00000000ffffffc1 + dsll $tmp0,28 # 0x0ffffffc10000000 + daddiu $tmp0,-1 # 0x0ffffffc0fffffff + + and $in0,$tmp0 + daddiu $tmp0,-3 # 0x0ffffffc0ffffffc + and $in1,$tmp0 + + sd $in0,24($ctx) + dsrl $tmp0,$in1,2 + sd $in1,32($ctx) + daddu $tmp0,$in1 # s1 = r1 + (r1 >> 2) + sd $tmp0,40($ctx) + +.Lno_key: + li $v0,0 # return 0 + jr $ra +.end poly1305_init +___ +{ +my $SAVED_REGS_MASK = ($flavour =~ /nubi/i) ? "0x0003f000" : "0x00030000"; + +my ($h0,$h1,$h2,$r0,$r1,$rs1,$d0,$d1,$d2) = + ($s0,$s1,$s2,$s3,$s4,$s5,$in0,$in1,$t2); +my ($shr,$shl) = ($s6,$s7); # used on R6 + +$code.=<<___; +.align 5 +.globl poly1305_blocks +.ent poly1305_blocks +poly1305_blocks: + .set noreorder + dsrl $len,4 # number of complete blocks + bnez $len,poly1305_blocks_internal + nop + jr $ra + nop +.end poly1305_blocks + +.align 5 +.ent poly1305_blocks_internal +poly1305_blocks_internal: + .set noreorder +#if defined(_MIPS_ARCH_MIPS64R6) + .frame $sp,8*8,$ra + .mask $SAVED_REGS_MASK|0x000c0000,-8 + dsubu $sp,8*8 + sd $s7,56($sp) + sd $s6,48($sp) +#else + .frame $sp,6*8,$ra + .mask $SAVED_REGS_MASK,-8 + dsubu $sp,6*8 +#endif + sd $s5,40($sp) + sd $s4,32($sp) +___ +$code.=<<___ if ($flavour =~ /nubi/i); # optimize non-nubi prologue + sd $s3,24($sp) + sd $s2,16($sp) + sd $s1,8($sp) + sd $s0,0($sp) +___ +$code.=<<___; + .set reorder + +#if defined(_MIPS_ARCH_MIPS64R6) + andi $shr,$inp,7 + dsubu $inp,$inp,$shr # align $inp + sll $shr,$shr,3 # byte to bit offset + subu $shl,$zero,$shr +#endif + + ld $h0,0($ctx) # load hash value + ld $h1,8($ctx) + ld $h2,16($ctx) + + ld $r0,24($ctx) # load key + ld $r1,32($ctx) + ld $rs1,40($ctx) + + dsll $len,4 + daddu $len,$inp # end of buffer + b .Loop + +.align 4 +.Loop: +#if defined(_MIPS_ARCH_MIPS64R6) + ld $in0,0($inp) # load input + ld $in1,8($inp) + beqz $shr,.Laligned_inp + + ld $tmp2,16($inp) +# ifdef MIPSEB + dsllv $in0,$in0,$shr + dsrlv $tmp3,$in1,$shl + dsllv $in1,$in1,$shr + dsrlv $tmp2,$tmp2,$shl +# else + dsrlv $in0,$in0,$shr + dsllv $tmp3,$in1,$shl + dsrlv $in1,$in1,$shr + dsllv $tmp2,$tmp2,$shl +# endif + or $in0,$in0,$tmp3 + or $in1,$in1,$tmp2 +.Laligned_inp: +#else + ldl $in0,0+MSB($inp) # load input + ldl $in1,8+MSB($inp) + ldr $in0,0+LSB($inp) + ldr $in1,8+LSB($inp) +#endif + daddiu $inp,16 +#ifdef MIPSEB +# if defined(_MIPS_ARCH_MIPS64R2) + dsbh $in0,$in0 # byte swap + dsbh $in1,$in1 + dshd $in0,$in0 + dshd $in1,$in1 +# else + ori $tmp0,$zero,0xFF + dsll $tmp2,$tmp0,32 + or $tmp0,$tmp2 # 0x000000FF000000FF + + and $tmp1,$in0,$tmp0 # byte swap + and $tmp3,$in1,$tmp0 + dsrl $tmp2,$in0,24 + dsrl $tmp4,$in1,24 + dsll $tmp1,24 + dsll $tmp3,24 + and $tmp2,$tmp0 + and $tmp4,$tmp0 + dsll $tmp0,8 # 0x0000FF000000FF00 + or $tmp1,$tmp2 + or $tmp3,$tmp4 + and $tmp2,$in0,$tmp0 + and $tmp4,$in1,$tmp0 + dsrl $in0,8 + dsrl $in1,8 + dsll $tmp2,8 + dsll $tmp4,8 + and $in0,$tmp0 + and $in1,$tmp0 + or $tmp1,$tmp2 + or $tmp3,$tmp4 + or $in0,$tmp1 + or $in1,$tmp3 + dsrl $tmp1,$in0,32 + dsrl $tmp3,$in1,32 + dsll $in0,32 + dsll $in1,32 + or $in0,$tmp1 + or $in1,$tmp3 +# endif +#endif + dsrl $tmp1,$h2,2 # modulo-scheduled reduction + andi $h2,$h2,3 + dsll $tmp0,$tmp1,2 + + daddu $d0,$h0,$in0 # accumulate input + daddu $tmp1,$tmp0 + sltu $tmp0,$d0,$h0 + daddu $d0,$d0,$tmp1 # ... and residue + sltu $tmp1,$d0,$tmp1 + daddu $d1,$h1,$in1 + daddu $tmp0,$tmp1 + sltu $tmp1,$d1,$h1 + daddu $d1,$tmp0 + + dmultu ($r0,$d0) # h0*r0 + daddu $d2,$h2,$padbit + sltu $tmp0,$d1,$tmp0 + mflo ($h0,$r0,$d0) + mfhi ($h1,$r0,$d0) + + dmultu ($rs1,$d1) # h1*5*r1 + daddu $d2,$tmp1 + daddu $d2,$tmp0 + mflo ($tmp0,$rs1,$d1) + mfhi ($tmp1,$rs1,$d1) + + dmultu ($r1,$d0) # h0*r1 + mflo ($tmp2,$r1,$d0) + mfhi ($h2,$r1,$d0) + daddu $h0,$tmp0 + daddu $h1,$tmp1 + sltu $tmp0,$h0,$tmp0 + + dmultu ($r0,$d1) # h1*r0 + daddu $h1,$tmp0 + daddu $h1,$tmp2 + mflo ($tmp0,$r0,$d1) + mfhi ($tmp1,$r0,$d1) + + dmultu ($rs1,$d2) # h2*5*r1 + sltu $tmp2,$h1,$tmp2 + daddu $h2,$tmp2 + mflo ($tmp2,$rs1,$d2) + + dmultu ($r0,$d2) # h2*r0 + daddu $h1,$tmp0 + daddu $h2,$tmp1 + mflo ($tmp3,$r0,$d2) + sltu $tmp0,$h1,$tmp0 + daddu $h2,$tmp0 + + daddu $h1,$tmp2 + sltu $tmp2,$h1,$tmp2 + daddu $h2,$tmp2 + daddu $h2,$tmp3 + + bne $inp,$len,.Loop + + sd $h0,0($ctx) # store hash value + sd $h1,8($ctx) + sd $h2,16($ctx) + + .set noreorder +#if defined(_MIPS_ARCH_MIPS64R6) + ld $s7,56($sp) + ld $s6,48($sp) +#endif + ld $s5,40($sp) # epilogue + ld $s4,32($sp) +___ +$code.=<<___ if ($flavour =~ /nubi/i); # optimize non-nubi epilogue + ld $s3,24($sp) + ld $s2,16($sp) + ld $s1,8($sp) + ld $s0,0($sp) +___ +$code.=<<___; + jr $ra +#if defined(_MIPS_ARCH_MIPS64R6) + daddu $sp,8*8 +#else + daddu $sp,6*8 +#endif +.end poly1305_blocks_internal +___ +} +{ +my ($ctx,$mac,$nonce) = ($a0,$a1,$a2); + +$code.=<<___; +.align 5 +.globl poly1305_emit +.ent poly1305_emit +poly1305_emit: + .frame $sp,0,$ra + .set reorder + + ld $tmp2,16($ctx) + ld $tmp0,0($ctx) + ld $tmp1,8($ctx) + + li $in0,-4 # final reduction + dsrl $in1,$tmp2,2 + and $in0,$tmp2 + andi $tmp2,$tmp2,3 + daddu $in0,$in1 + + daddu $tmp0,$tmp0,$in0 + sltu $in1,$tmp0,$in0 + daddiu $in0,$tmp0,5 # compare to modulus + daddu $tmp1,$tmp1,$in1 + sltiu $tmp3,$in0,5 + sltu $tmp4,$tmp1,$in1 + daddu $in1,$tmp1,$tmp3 + daddu $tmp2,$tmp2,$tmp4 + sltu $tmp3,$in1,$tmp3 + daddu $tmp2,$tmp2,$tmp3 + + dsrl $tmp2,2 # see if it carried/borrowed + dsubu $tmp2,$zero,$tmp2 + + xor $in0,$tmp0 + xor $in1,$tmp1 + and $in0,$tmp2 + and $in1,$tmp2 + xor $in0,$tmp0 + xor $in1,$tmp1 + + lwu $tmp0,0($nonce) # load nonce + lwu $tmp1,4($nonce) + lwu $tmp2,8($nonce) + lwu $tmp3,12($nonce) + dsll $tmp1,32 + dsll $tmp3,32 + or $tmp0,$tmp1 + or $tmp2,$tmp3 + + daddu $in0,$tmp0 # accumulate nonce + daddu $in1,$tmp2 + sltu $tmp0,$in0,$tmp0 + daddu $in1,$tmp0 + + dsrl $tmp0,$in0,8 # write mac value + dsrl $tmp1,$in0,16 + dsrl $tmp2,$in0,24 + sb $in0,0($mac) + dsrl $tmp3,$in0,32 + sb $tmp0,1($mac) + dsrl $tmp0,$in0,40 + sb $tmp1,2($mac) + dsrl $tmp1,$in0,48 + sb $tmp2,3($mac) + dsrl $tmp2,$in0,56 + sb $tmp3,4($mac) + dsrl $tmp3,$in1,8 + sb $tmp0,5($mac) + dsrl $tmp0,$in1,16 + sb $tmp1,6($mac) + dsrl $tmp1,$in1,24 + sb $tmp2,7($mac) + + sb $in1,8($mac) + dsrl $tmp2,$in1,32 + sb $tmp3,9($mac) + dsrl $tmp3,$in1,40 + sb $tmp0,10($mac) + dsrl $tmp0,$in1,48 + sb $tmp1,11($mac) + dsrl $tmp1,$in1,56 + sb $tmp2,12($mac) + sb $tmp3,13($mac) + sb $tmp0,14($mac) + sb $tmp1,15($mac) + + jr $ra +.end poly1305_emit +.rdata +.asciiz "Poly1305 for MIPS64, CRYPTOGAMS by \@dot-asm" +.align 2 +___ +} +}}} else {{{ +###################################################################### +# 32-bit code path +# + +my ($ctx,$inp,$len,$padbit) = ($a0,$a1,$a2,$a3); +my ($in0,$in1,$in2,$in3,$tmp0,$tmp1,$tmp2,$tmp3) = + ($a4,$a5,$a6,$a7,$at,$t0,$t1,$t2); + +$code.=<<___; +#if (defined(_MIPS_ARCH_MIPS32R3) || defined(_MIPS_ARCH_MIPS32R5) || \\ + defined(_MIPS_ARCH_MIPS32R6)) \\ + && !defined(_MIPS_ARCH_MIPS32R2) +# define _MIPS_ARCH_MIPS32R2 +#endif + +#if defined(_MIPS_ARCH_MIPS32R6) +# define multu(rs,rt) +# define mflo(rd,rs,rt) mulu rd,rs,rt +# define mfhi(rd,rs,rt) muhu rd,rs,rt +#else +# define multu(rs,rt) multu rs,rt +# define mflo(rd,rs,rt) mflo rd +# define mfhi(rd,rs,rt) mfhi rd +#endif + +#ifdef __KERNEL__ +# define poly1305_init poly1305_init_mips +# define poly1305_blocks poly1305_blocks_mips +# define poly1305_emit poly1305_emit_mips +#endif + +#if defined(__MIPSEB__) && !defined(MIPSEB) +# define MIPSEB +#endif + +#ifdef MIPSEB +# define MSB 0 +# define LSB 3 +#else +# define MSB 3 +# define LSB 0 +#endif + +.text +.set noat +.set noreorder + +.align 5 +.globl poly1305_init +.ent poly1305_init +poly1305_init: + .frame $sp,0,$ra + .set reorder + + sw $zero,0($ctx) + sw $zero,4($ctx) + sw $zero,8($ctx) + sw $zero,12($ctx) + sw $zero,16($ctx) + + beqz $inp,.Lno_key + +#if defined(_MIPS_ARCH_MIPS32R6) + andi $tmp0,$inp,3 # $inp % 4 + subu $inp,$inp,$tmp0 # align $inp + sll $tmp0,$tmp0,3 # byte to bit offset + lw $in0,0($inp) + lw $in1,4($inp) + lw $in2,8($inp) + lw $in3,12($inp) + beqz $tmp0,.Laligned_key + + lw $tmp2,16($inp) + subu $tmp1,$zero,$tmp0 +# ifdef MIPSEB + sllv $in0,$in0,$tmp0 + srlv $tmp3,$in1,$tmp1 + sllv $in1,$in1,$tmp0 + or $in0,$in0,$tmp3 + srlv $tmp3,$in2,$tmp1 + sllv $in2,$in2,$tmp0 + or $in1,$in1,$tmp3 + srlv $tmp3,$in3,$tmp1 + sllv $in3,$in3,$tmp0 + or $in2,$in2,$tmp3 + srlv $tmp2,$tmp2,$tmp1 + or $in3,$in3,$tmp2 +# else + srlv $in0,$in0,$tmp0 + sllv $tmp3,$in1,$tmp1 + srlv $in1,$in1,$tmp0 + or $in0,$in0,$tmp3 + sllv $tmp3,$in2,$tmp1 + srlv $in2,$in2,$tmp0 + or $in1,$in1,$tmp3 + sllv $tmp3,$in3,$tmp1 + srlv $in3,$in3,$tmp0 + or $in2,$in2,$tmp3 + sllv $tmp2,$tmp2,$tmp1 + or $in3,$in3,$tmp2 +# endif +.Laligned_key: +#else + lwl $in0,0+MSB($inp) + lwl $in1,4+MSB($inp) + lwl $in2,8+MSB($inp) + lwl $in3,12+MSB($inp) + lwr $in0,0+LSB($inp) + lwr $in1,4+LSB($inp) + lwr $in2,8+LSB($inp) + lwr $in3,12+LSB($inp) +#endif +#ifdef MIPSEB +# if defined(_MIPS_ARCH_MIPS32R2) + wsbh $in0,$in0 # byte swap + wsbh $in1,$in1 + wsbh $in2,$in2 + wsbh $in3,$in3 + rotr $in0,$in0,16 + rotr $in1,$in1,16 + rotr $in2,$in2,16 + rotr $in3,$in3,16 +# else + srl $tmp0,$in0,24 # byte swap + srl $tmp1,$in0,8 + andi $tmp2,$in0,0xFF00 + sll $in0,$in0,24 + andi $tmp1,0xFF00 + sll $tmp2,$tmp2,8 + or $in0,$tmp0 + srl $tmp0,$in1,24 + or $tmp1,$tmp2 + srl $tmp2,$in1,8 + or $in0,$tmp1 + andi $tmp1,$in1,0xFF00 + sll $in1,$in1,24 + andi $tmp2,0xFF00 + sll $tmp1,$tmp1,8 + or $in1,$tmp0 + srl $tmp0,$in2,24 + or $tmp2,$tmp1 + srl $tmp1,$in2,8 + or $in1,$tmp2 + andi $tmp2,$in2,0xFF00 + sll $in2,$in2,24 + andi $tmp1,0xFF00 + sll $tmp2,$tmp2,8 + or $in2,$tmp0 + srl $tmp0,$in3,24 + or $tmp1,$tmp2 + srl $tmp2,$in3,8 + or $in2,$tmp1 + andi $tmp1,$in3,0xFF00 + sll $in3,$in3,24 + andi $tmp2,0xFF00 + sll $tmp1,$tmp1,8 + or $in3,$tmp0 + or $tmp2,$tmp1 + or $in3,$tmp2 +# endif +#endif + lui $tmp0,0x0fff + ori $tmp0,0xffff # 0x0fffffff + and $in0,$in0,$tmp0 + subu $tmp0,3 # 0x0ffffffc + and $in1,$in1,$tmp0 + and $in2,$in2,$tmp0 + and $in3,$in3,$tmp0 + + sw $in0,20($ctx) + sw $in1,24($ctx) + sw $in2,28($ctx) + sw $in3,32($ctx) + + srl $tmp1,$in1,2 + srl $tmp2,$in2,2 + srl $tmp3,$in3,2 + addu $in1,$in1,$tmp1 # s1 = r1 + (r1 >> 2) + addu $in2,$in2,$tmp2 + addu $in3,$in3,$tmp3 + sw $in1,36($ctx) + sw $in2,40($ctx) + sw $in3,44($ctx) +.Lno_key: + li $v0,0 + jr $ra +.end poly1305_init +___ +{ +my $SAVED_REGS_MASK = ($flavour =~ /nubi/i) ? "0x00fff000" : "0x00ff0000"; + +my ($h0,$h1,$h2,$h3,$h4, $r0,$r1,$r2,$r3, $rs1,$rs2,$rs3) = + ($s0,$s1,$s2,$s3,$s4, $s5,$s6,$s7,$s8, $s9,$s10,$s11); +my ($d0,$d1,$d2,$d3) = + ($a4,$a5,$a6,$a7); +my $shr = $t2; # used on R6 +my $one = $t2; # used on R2 + +$code.=<<___; +.globl poly1305_blocks +.align 5 +.ent poly1305_blocks +poly1305_blocks: + .frame $sp,16*4,$ra + .mask $SAVED_REGS_MASK,-4 + .set noreorder + subu $sp, $sp,4*12 + sw $s11,4*11($sp) + sw $s10,4*10($sp) + sw $s9, 4*9($sp) + sw $s8, 4*8($sp) + sw $s7, 4*7($sp) + sw $s6, 4*6($sp) + sw $s5, 4*5($sp) + sw $s4, 4*4($sp) +___ +$code.=<<___ if ($flavour =~ /nubi/i); # optimize non-nubi prologue + sw $s3, 4*3($sp) + sw $s2, 4*2($sp) + sw $s1, 4*1($sp) + sw $s0, 4*0($sp) +___ +$code.=<<___; + .set reorder + + srl $len,4 # number of complete blocks + li $one,1 + beqz $len,.Labort + +#if defined(_MIPS_ARCH_MIPS32R6) + andi $shr,$inp,3 + subu $inp,$inp,$shr # align $inp + sll $shr,$shr,3 # byte to bit offset +#endif + + lw $h0,0($ctx) # load hash value + lw $h1,4($ctx) + lw $h2,8($ctx) + lw $h3,12($ctx) + lw $h4,16($ctx) + + lw $r0,20($ctx) # load key + lw $r1,24($ctx) + lw $r2,28($ctx) + lw $r3,32($ctx) + lw $rs1,36($ctx) + lw $rs2,40($ctx) + lw $rs3,44($ctx) + + sll $len,4 + addu $len,$len,$inp # end of buffer + b .Loop + +.align 4 +.Loop: +#if defined(_MIPS_ARCH_MIPS32R6) + lw $d0,0($inp) # load input + lw $d1,4($inp) + lw $d2,8($inp) + lw $d3,12($inp) + beqz $shr,.Laligned_inp + + lw $t0,16($inp) + subu $t1,$zero,$shr +# ifdef MIPSEB + sllv $d0,$d0,$shr + srlv $at,$d1,$t1 + sllv $d1,$d1,$shr + or $d0,$d0,$at + srlv $at,$d2,$t1 + sllv $d2,$d2,$shr + or $d1,$d1,$at + srlv $at,$d3,$t1 + sllv $d3,$d3,$shr + or $d2,$d2,$at + srlv $t0,$t0,$t1 + or $d3,$d3,$t0 +# else + srlv $d0,$d0,$shr + sllv $at,$d1,$t1 + srlv $d1,$d1,$shr + or $d0,$d0,$at + sllv $at,$d2,$t1 + srlv $d2,$d2,$shr + or $d1,$d1,$at + sllv $at,$d3,$t1 + srlv $d3,$d3,$shr + or $d2,$d2,$at + sllv $t0,$t0,$t1 + or $d3,$d3,$t0 +# endif +.Laligned_inp: +#else + lwl $d0,0+MSB($inp) # load input + lwl $d1,4+MSB($inp) + lwl $d2,8+MSB($inp) + lwl $d3,12+MSB($inp) + lwr $d0,0+LSB($inp) + lwr $d1,4+LSB($inp) + lwr $d2,8+LSB($inp) + lwr $d3,12+LSB($inp) +#endif +#ifdef MIPSEB +# if defined(_MIPS_ARCH_MIPS32R2) + wsbh $d0,$d0 # byte swap + wsbh $d1,$d1 + wsbh $d2,$d2 + wsbh $d3,$d3 + rotr $d0,$d0,16 + rotr $d1,$d1,16 + rotr $d2,$d2,16 + rotr $d3,$d3,16 +# else + srl $at,$d0,24 # byte swap + srl $t0,$d0,8 + andi $t1,$d0,0xFF00 + sll $d0,$d0,24 + andi $t0,0xFF00 + sll $t1,$t1,8 + or $d0,$at + srl $at,$d1,24 + or $t0,$t1 + srl $t1,$d1,8 + or $d0,$t0 + andi $t0,$d1,0xFF00 + sll $d1,$d1,24 + andi $t1,0xFF00 + sll $t0,$t0,8 + or $d1,$at + srl $at,$d2,24 + or $t1,$t0 + srl $t0,$d2,8 + or $d1,$t1 + andi $t1,$d2,0xFF00 + sll $d2,$d2,24 + andi $t0,0xFF00 + sll $t1,$t1,8 + or $d2,$at + srl $at,$d3,24 + or $t0,$t1 + srl $t1,$d3,8 + or $d2,$t0 + andi $t0,$d3,0xFF00 + sll $d3,$d3,24 + andi $t1,0xFF00 + sll $t0,$t0,8 + or $d3,$at + or $t1,$t0 + or $d3,$t1 +# endif +#endif + srl $t0,$h4,2 # modulo-scheduled reduction + andi $h4,$h4,3 + sll $at,$t0,2 + + addu $d0,$d0,$h0 # accumulate input + addu $t0,$t0,$at + sltu $h0,$d0,$h0 + addu $d0,$d0,$t0 # ... and residue + sltu $at,$d0,$t0 + + addu $d1,$d1,$h1 + addu $h0,$h0,$at # carry + sltu $h1,$d1,$h1 + addu $d1,$d1,$h0 + sltu $h0,$d1,$h0 + + addu $d2,$d2,$h2 + addu $h1,$h1,$h0 # carry + sltu $h2,$d2,$h2 + addu $d2,$d2,$h1 + sltu $h1,$d2,$h1 + + addu $d3,$d3,$h3 + addu $h2,$h2,$h1 # carry + sltu $h3,$d3,$h3 + addu $d3,$d3,$h2 + +#if defined(_MIPS_ARCH_MIPS32R2) && !defined(_MIPS_ARCH_MIPS32R6) + multu $r0,$d0 # d0*r0 + sltu $h2,$d3,$h2 + maddu $rs3,$d1 # d1*s3 + addu $h3,$h3,$h2 # carry + maddu $rs2,$d2 # d2*s2 + addu $h4,$h4,$padbit + maddu $rs1,$d3 # d3*s1 + addu $h4,$h4,$h3 + mfhi $at + mflo $h0 + + multu $r1,$d0 # d0*r1 + maddu $r0,$d1 # d1*r0 + maddu $rs3,$d2 # d2*s3 + maddu $rs2,$d3 # d3*s2 + maddu $rs1,$h4 # h4*s1 + maddu $at,$one # hi*1 + mfhi $at + mflo $h1 + + multu $r2,$d0 # d0*r2 + maddu $r1,$d1 # d1*r1 + maddu $r0,$d2 # d2*r0 + maddu $rs3,$d3 # d3*s3 + maddu $rs2,$h4 # h4*s2 + maddu $at,$one # hi*1 + mfhi $at + mflo $h2 + + mul $t0,$r0,$h4 # h4*r0 + + multu $r3,$d0 # d0*r3 + maddu $r2,$d1 # d1*r2 + maddu $r1,$d2 # d2*r1 + maddu $r0,$d3 # d3*r0 + maddu $rs3,$h4 # h4*s3 + maddu $at,$one # hi*1 + mfhi $at + mflo $h3 + + addiu $inp,$inp,16 + + addu $h4,$t0,$at +#else + multu ($r0,$d0) # d0*r0 + mflo ($h0,$r0,$d0) + mfhi ($h1,$r0,$d0) + + sltu $h2,$d3,$h2 + addu $h3,$h3,$h2 # carry + + multu ($rs3,$d1) # d1*s3 + mflo ($at,$rs3,$d1) + mfhi ($t0,$rs3,$d1) + + addu $h4,$h4,$padbit + addiu $inp,$inp,16 + addu $h4,$h4,$h3 + + multu ($rs2,$d2) # d2*s2 + mflo ($a3,$rs2,$d2) + mfhi ($t1,$rs2,$d2) + addu $h0,$h0,$at + addu $h1,$h1,$t0 + multu ($rs1,$d3) # d3*s1 + sltu $at,$h0,$at + addu $h1,$h1,$at + + mflo ($at,$rs1,$d3) + mfhi ($t0,$rs1,$d3) + addu $h0,$h0,$a3 + addu $h1,$h1,$t1 + multu ($r1,$d0) # d0*r1 + sltu $a3,$h0,$a3 + addu $h1,$h1,$a3 + + + mflo ($a3,$r1,$d0) + mfhi ($h2,$r1,$d0) + addu $h0,$h0,$at + addu $h1,$h1,$t0 + multu ($r0,$d1) # d1*r0 + sltu $at,$h0,$at + addu $h1,$h1,$at + + mflo ($at,$r0,$d1) + mfhi ($t0,$r0,$d1) + addu $h1,$h1,$a3 + sltu $a3,$h1,$a3 + multu ($rs3,$d2) # d2*s3 + addu $h2,$h2,$a3 + + mflo ($a3,$rs3,$d2) + mfhi ($t1,$rs3,$d2) + addu $h1,$h1,$at + addu $h2,$h2,$t0 + multu ($rs2,$d3) # d3*s2 + sltu $at,$h1,$at + addu $h2,$h2,$at + + mflo ($at,$rs2,$d3) + mfhi ($t0,$rs2,$d3) + addu $h1,$h1,$a3 + addu $h2,$h2,$t1 + multu ($rs1,$h4) # h4*s1 + sltu $a3,$h1,$a3 + addu $h2,$h2,$a3 + + mflo ($a3,$rs1,$h4) + addu $h1,$h1,$at + addu $h2,$h2,$t0 + multu ($r2,$d0) # d0*r2 + sltu $at,$h1,$at + addu $h2,$h2,$at + + + mflo ($at,$r2,$d0) + mfhi ($h3,$r2,$d0) + addu $h1,$h1,$a3 + sltu $a3,$h1,$a3 + multu ($r1,$d1) # d1*r1 + addu $h2,$h2,$a3 + + mflo ($a3,$r1,$d1) + mfhi ($t1,$r1,$d1) + addu $h2,$h2,$at + sltu $at,$h2,$at + multu ($r0,$d2) # d2*r0 + addu $h3,$h3,$at + + mflo ($at,$r0,$d2) + mfhi ($t0,$r0,$d2) + addu $h2,$h2,$a3 + addu $h3,$h3,$t1 + multu ($rs3,$d3) # d3*s3 + sltu $a3,$h2,$a3 + addu $h3,$h3,$a3 + + mflo ($a3,$rs3,$d3) + mfhi ($t1,$rs3,$d3) + addu $h2,$h2,$at + addu $h3,$h3,$t0 + multu ($rs2,$h4) # h4*s2 + sltu $at,$h2,$at + addu $h3,$h3,$at + + mflo ($at,$rs2,$h4) + addu $h2,$h2,$a3 + addu $h3,$h3,$t1 + multu ($r3,$d0) # d0*r3 + sltu $a3,$h2,$a3 + addu $h3,$h3,$a3 + + + mflo ($a3,$r3,$d0) + mfhi ($t1,$r3,$d0) + addu $h2,$h2,$at + sltu $at,$h2,$at + multu ($r2,$d1) # d1*r2 + addu $h3,$h3,$at + + mflo ($at,$r2,$d1) + mfhi ($t0,$r2,$d1) + addu $h3,$h3,$a3 + sltu $a3,$h3,$a3 + multu ($r0,$d3) # d3*r0 + addu $t1,$t1,$a3 + + mflo ($a3,$r0,$d3) + mfhi ($d3,$r0,$d3) + addu $h3,$h3,$at + addu $t1,$t1,$t0 + multu ($r1,$d2) # d2*r1 + sltu $at,$h3,$at + addu $t1,$t1,$at + + mflo ($at,$r1,$d2) + mfhi ($t0,$r1,$d2) + addu $h3,$h3,$a3 + addu $t1,$t1,$d3 + multu ($rs3,$h4) # h4*s3 + sltu $a3,$h3,$a3 + addu $t1,$t1,$a3 + + mflo ($a3,$rs3,$h4) + addu $h3,$h3,$at + addu $t1,$t1,$t0 + multu ($r0,$h4) # h4*r0 + sltu $at,$h3,$at + addu $t1,$t1,$at + + + mflo ($h4,$r0,$h4) + addu $h3,$h3,$a3 + sltu $a3,$h3,$a3 + addu $t1,$t1,$a3 + addu $h4,$h4,$t1 + + li $padbit,1 # if we loop, padbit is 1 +#endif + bne $inp,$len,.Loop + + sw $h0,0($ctx) # store hash value + sw $h1,4($ctx) + sw $h2,8($ctx) + sw $h3,12($ctx) + sw $h4,16($ctx) + + .set noreorder +.Labort: + lw $s11,4*11($sp) + lw $s10,4*10($sp) + lw $s9, 4*9($sp) + lw $s8, 4*8($sp) + lw $s7, 4*7($sp) + lw $s6, 4*6($sp) + lw $s5, 4*5($sp) + lw $s4, 4*4($sp) +___ +$code.=<<___ if ($flavour =~ /nubi/i); # optimize non-nubi prologue + lw $s3, 4*3($sp) + lw $s2, 4*2($sp) + lw $s1, 4*1($sp) + lw $s0, 4*0($sp) +___ +$code.=<<___; + jr $ra + addu $sp,$sp,4*12 +.end poly1305_blocks +___ +} +{ +my ($ctx,$mac,$nonce,$tmp4) = ($a0,$a1,$a2,$a3); + +$code.=<<___; +.align 5 +.globl poly1305_emit +.ent poly1305_emit +poly1305_emit: + .frame $sp,0,$ra + .set reorder + + lw $tmp4,16($ctx) + lw $tmp0,0($ctx) + lw $tmp1,4($ctx) + lw $tmp2,8($ctx) + lw $tmp3,12($ctx) + + li $in0,-4 # final reduction + srl $ctx,$tmp4,2 + and $in0,$in0,$tmp4 + andi $tmp4,$tmp4,3 + addu $ctx,$ctx,$in0 + + addu $tmp0,$tmp0,$ctx + sltu $ctx,$tmp0,$ctx + addiu $in0,$tmp0,5 # compare to modulus + addu $tmp1,$tmp1,$ctx + sltiu $in1,$in0,5 + sltu $ctx,$tmp1,$ctx + addu $in1,$in1,$tmp1 + addu $tmp2,$tmp2,$ctx + sltu $in2,$in1,$tmp1 + sltu $ctx,$tmp2,$ctx + addu $in2,$in2,$tmp2 + addu $tmp3,$tmp3,$ctx + sltu $in3,$in2,$tmp2 + sltu $ctx,$tmp3,$ctx + addu $in3,$in3,$tmp3 + addu $tmp4,$tmp4,$ctx + sltu $ctx,$in3,$tmp3 + addu $ctx,$tmp4 + + srl $ctx,2 # see if it carried/borrowed + subu $ctx,$zero,$ctx + + xor $in0,$tmp0 + xor $in1,$tmp1 + xor $in2,$tmp2 + xor $in3,$tmp3 + and $in0,$ctx + and $in1,$ctx + and $in2,$ctx + and $in3,$ctx + xor $in0,$tmp0 + xor $in1,$tmp1 + xor $in2,$tmp2 + xor $in3,$tmp3 + + lw $tmp0,0($nonce) # load nonce + lw $tmp1,4($nonce) + lw $tmp2,8($nonce) + lw $tmp3,12($nonce) + + addu $in0,$tmp0 # accumulate nonce + sltu $ctx,$in0,$tmp0 + + addu $in1,$tmp1 + sltu $tmp1,$in1,$tmp1 + addu $in1,$ctx + sltu $ctx,$in1,$ctx + addu $ctx,$tmp1 + + addu $in2,$tmp2 + sltu $tmp2,$in2,$tmp2 + addu $in2,$ctx + sltu $ctx,$in2,$ctx + addu $ctx,$tmp2 + + addu $in3,$tmp3 + addu $in3,$ctx + + srl $tmp0,$in0,8 # write mac value + srl $tmp1,$in0,16 + srl $tmp2,$in0,24 + sb $in0, 0($mac) + sb $tmp0,1($mac) + srl $tmp0,$in1,8 + sb $tmp1,2($mac) + srl $tmp1,$in1,16 + sb $tmp2,3($mac) + srl $tmp2,$in1,24 + sb $in1, 4($mac) + sb $tmp0,5($mac) + srl $tmp0,$in2,8 + sb $tmp1,6($mac) + srl $tmp1,$in2,16 + sb $tmp2,7($mac) + srl $tmp2,$in2,24 + sb $in2, 8($mac) + sb $tmp0,9($mac) + srl $tmp0,$in3,8 + sb $tmp1,10($mac) + srl $tmp1,$in3,16 + sb $tmp2,11($mac) + srl $tmp2,$in3,24 + sb $in3, 12($mac) + sb $tmp0,13($mac) + sb $tmp1,14($mac) + sb $tmp2,15($mac) + + jr $ra +.end poly1305_emit +.rdata +.asciiz "Poly1305 for MIPS32, CRYPTOGAMS by \@dot-asm" +.align 2 +___ +} +}}} + +$output=pop and open STDOUT,">$output"; +print $code; +close STDOUT; diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index f38d153d2586..d18ea3c1f4fa 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -152,6 +152,7 @@ config PPC select ARCH_USE_BUILTIN_BSWAP select ARCH_USE_CMPXCHG_LOCKREF if PPC64 select ARCH_WANT_IPC_PARSE_VERSION + select ARCH_WANT_IRQS_OFF_ACTIVATE_MM select ARCH_WEAK_RELEASE_ACQUIRE select BINFMT_ELF select BUILDTIME_EXTABLE_SORT @@ -1009,6 +1010,19 @@ config FSL_RIO source "drivers/rapidio/Kconfig" +config PPC_RTAS_FILTER + bool "Enable filtering of RTAS syscalls" + default y + depends on PPC_RTAS + help + The RTAS syscall API has security issues that could be used to + compromise system integrity. This option enforces restrictions on the + RTAS calls and arguments passed by userspace programs to mitigate + these issues. + + Say Y unless you know what you are doing and the filter is causing + problems for you. + endmenu config NONSTATIC_KERNEL diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h index 9e516fe3daab..8b196720e9a0 100644 --- a/arch/powerpc/include/asm/drmem.h +++ b/arch/powerpc/include/asm/drmem.h @@ -12,6 +12,8 @@ #ifndef _ASM_POWERPC_LMB_H #define _ASM_POWERPC_LMB_H +#include + struct drmem_lmb { u64 base_addr; u32 drc_index; @@ -22,13 +24,27 @@ struct drmem_lmb { struct drmem_lmb_info { struct drmem_lmb *lmbs; int n_lmbs; - u32 lmb_size; + u64 lmb_size; }; extern struct drmem_lmb_info *drmem_info; +static inline struct drmem_lmb *drmem_lmb_next(struct drmem_lmb *lmb, + const struct drmem_lmb *start) +{ + /* + * DLPAR code paths can take several milliseconds per element + * when interacting with firmware. Ensure that we don't + * unfairly monopolize the CPU. + */ + if (((++lmb - start) % 16) == 0) + cond_resched(); + + return lmb; +} + #define for_each_drmem_lmb_in_range(lmb, start, end) \ - for ((lmb) = (start); (lmb) < (end); (lmb)++) + for ((lmb) = (start); (lmb) < (end); lmb = drmem_lmb_next(lmb, start)) #define for_each_drmem_lmb(lmb) \ for_each_drmem_lmb_in_range((lmb), \ @@ -67,7 +83,7 @@ struct of_drconf_cell_v2 { #define DRCONF_MEM_AI_INVALID 0x00000040 #define DRCONF_MEM_RESERVED 0x00000080 -static inline u32 drmem_lmb_size(void) +static inline u64 drmem_lmb_size(void) { return drmem_info->lmb_size; } diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h index ae953958c0f3..d93bdcaa4a46 100644 --- a/arch/powerpc/include/asm/mmu_context.h +++ b/arch/powerpc/include/asm/mmu_context.h @@ -204,7 +204,7 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next, */ static inline void activate_mm(struct mm_struct *prev, struct mm_struct *next) { - switch_mm(prev, next, current); + switch_mm_irqs_off(prev, next, current); } /* We don't currently use enter_lazy_tlb() for anything */ diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h index af9971661512..494b0283f212 100644 --- a/arch/powerpc/include/asm/reg.h +++ b/arch/powerpc/include/asm/reg.h @@ -788,7 +788,7 @@ #define THRM1_TIN (1 << 31) #define THRM1_TIV (1 << 30) #define THRM1_THRES(x) ((x&0x7f)<<23) -#define THRM3_SITV(x) ((x&0x3fff)<<1) +#define THRM3_SITV(x) ((x & 0x1fff) << 1) #define THRM1_TID (1<<2) #define THRM1_TIE (1<<1) #define THRM1_V (1<<0) diff --git a/arch/powerpc/include/asm/tlb.h b/arch/powerpc/include/asm/tlb.h index f0e571b2dc7c..a6073fecdacd 100644 --- a/arch/powerpc/include/asm/tlb.h +++ b/arch/powerpc/include/asm/tlb.h @@ -76,19 +76,6 @@ static inline int mm_is_thread_local(struct mm_struct *mm) return false; return cpumask_test_cpu(smp_processor_id(), mm_cpumask(mm)); } -static inline void mm_reset_thread_local(struct mm_struct *mm) -{ - WARN_ON(atomic_read(&mm->context.copros) > 0); - /* - * It's possible for mm_access to take a reference on mm_users to - * access the remote mm from another thread, but it's not allowed - * to set mm_cpumask, so mm_users may be > 1 here. - */ - WARN_ON(current->mm != mm); - atomic_set(&mm->context.active_cpus, 1); - cpumask_clear(mm_cpumask(mm)); - cpumask_set_cpu(smp_processor_id(), mm_cpumask(mm)); -} #else /* CONFIG_PPC_BOOK3S_64 */ static inline int mm_is_thread_local(struct mm_struct *mm) { diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c index 95d1264ba795..7e0722b62cae 100644 --- a/arch/powerpc/kernel/rtas.c +++ b/arch/powerpc/kernel/rtas.c @@ -1057,6 +1057,147 @@ struct pseries_errorlog *get_pseries_errorlog(struct rtas_error_log *log, return NULL; } +#ifdef CONFIG_PPC_RTAS_FILTER + +/* + * The sys_rtas syscall, as originally designed, allows root to pass + * arbitrary physical addresses to RTAS calls. A number of RTAS calls + * can be abused to write to arbitrary memory and do other things that + * are potentially harmful to system integrity, and thus should only + * be used inside the kernel and not exposed to userspace. + * + * All known legitimate users of the sys_rtas syscall will only ever + * pass addresses that fall within the RMO buffer, and use a known + * subset of RTAS calls. + * + * Accordingly, we filter RTAS requests to check that the call is + * permitted, and that provided pointers fall within the RMO buffer. + * The rtas_filters list contains an entry for each permitted call, + * with the indexes of the parameters which are expected to contain + * addresses and sizes of buffers allocated inside the RMO buffer. + */ +struct rtas_filter { + const char *name; + int token; + /* Indexes into the args buffer, -1 if not used */ + int buf_idx1; + int size_idx1; + int buf_idx2; + int size_idx2; + + int fixed_size; +}; + +static struct rtas_filter rtas_filters[] __ro_after_init = { + { "ibm,activate-firmware", -1, -1, -1, -1, -1 }, + { "ibm,configure-connector", -1, 0, -1, 1, -1, 4096 }, /* Special cased */ + { "display-character", -1, -1, -1, -1, -1 }, + { "ibm,display-message", -1, 0, -1, -1, -1 }, + { "ibm,errinjct", -1, 2, -1, -1, -1, 1024 }, + { "ibm,close-errinjct", -1, -1, -1, -1, -1 }, + { "ibm,open-errinct", -1, -1, -1, -1, -1 }, + { "ibm,get-config-addr-info2", -1, -1, -1, -1, -1 }, + { "ibm,get-dynamic-sensor-state", -1, 1, -1, -1, -1 }, + { "ibm,get-indices", -1, 2, 3, -1, -1 }, + { "get-power-level", -1, -1, -1, -1, -1 }, + { "get-sensor-state", -1, -1, -1, -1, -1 }, + { "ibm,get-system-parameter", -1, 1, 2, -1, -1 }, + { "get-time-of-day", -1, -1, -1, -1, -1 }, + { "ibm,get-vpd", -1, 0, -1, 1, 2 }, + { "ibm,lpar-perftools", -1, 2, 3, -1, -1 }, + { "ibm,platform-dump", -1, 4, 5, -1, -1 }, + { "ibm,read-slot-reset-state", -1, -1, -1, -1, -1 }, + { "ibm,scan-log-dump", -1, 0, 1, -1, -1 }, + { "ibm,set-dynamic-indicator", -1, 2, -1, -1, -1 }, + { "ibm,set-eeh-option", -1, -1, -1, -1, -1 }, + { "set-indicator", -1, -1, -1, -1, -1 }, + { "set-power-level", -1, -1, -1, -1, -1 }, + { "set-time-for-power-on", -1, -1, -1, -1, -1 }, + { "ibm,set-system-parameter", -1, 1, -1, -1, -1 }, + { "set-time-of-day", -1, -1, -1, -1, -1 }, + { "ibm,suspend-me", -1, -1, -1, -1, -1 }, + { "ibm,update-nodes", -1, 0, -1, -1, -1, 4096 }, + { "ibm,update-properties", -1, 0, -1, -1, -1, 4096 }, + { "ibm,physical-attestation", -1, 0, 1, -1, -1 }, +}; + +static bool in_rmo_buf(u32 base, u32 end) +{ + return base >= rtas_rmo_buf && + base < (rtas_rmo_buf + RTAS_RMOBUF_MAX) && + base <= end && + end >= rtas_rmo_buf && + end < (rtas_rmo_buf + RTAS_RMOBUF_MAX); +} + +static bool block_rtas_call(int token, int nargs, + struct rtas_args *args) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(rtas_filters); i++) { + struct rtas_filter *f = &rtas_filters[i]; + u32 base, size, end; + + if (token != f->token) + continue; + + if (f->buf_idx1 != -1) { + base = be32_to_cpu(args->args[f->buf_idx1]); + if (f->size_idx1 != -1) + size = be32_to_cpu(args->args[f->size_idx1]); + else if (f->fixed_size) + size = f->fixed_size; + else + size = 1; + + end = base + size - 1; + if (!in_rmo_buf(base, end)) + goto err; + } + + if (f->buf_idx2 != -1) { + base = be32_to_cpu(args->args[f->buf_idx2]); + if (f->size_idx2 != -1) + size = be32_to_cpu(args->args[f->size_idx2]); + else if (f->fixed_size) + size = f->fixed_size; + else + size = 1; + end = base + size - 1; + + /* + * Special case for ibm,configure-connector where the + * address can be 0 + */ + if (!strcmp(f->name, "ibm,configure-connector") && + base == 0) + return false; + + if (!in_rmo_buf(base, end)) + goto err; + } + + return false; + } + +err: + pr_err_ratelimited("sys_rtas: RTAS call blocked - exploit attempt?\n"); + pr_err_ratelimited("sys_rtas: token=0x%x, nargs=%d (called by %s)\n", + token, nargs, current->comm); + return true; +} + +#else + +static bool block_rtas_call(int token, int nargs, + struct rtas_args *args) +{ + return false; +} + +#endif /* CONFIG_PPC_RTAS_FILTER */ + /* We assume to be passed big endian arguments */ SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs) { @@ -1094,6 +1235,9 @@ SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs) args.rets = &args.args[nargs]; memset(args.rets, 0, nret * sizeof(rtas_arg_t)); + if (block_rtas_call(token, nargs, &args)) + return -EINVAL; + /* Need to handle ibm,suspend_me call specially */ if (token == ibm_suspend_me_token) { @@ -1155,6 +1299,9 @@ void __init rtas_initialize(void) unsigned long rtas_region = RTAS_INSTANTIATE_MAX; u32 base, size, entry; int no_base, no_size, no_entry; +#ifdef CONFIG_PPC_RTAS_FILTER + int i; +#endif /* Get RTAS dev node and fill up our "rtas" structure with infos * about it. @@ -1190,6 +1337,12 @@ void __init rtas_initialize(void) #ifdef CONFIG_RTAS_ERROR_LOGGING rtas_last_error_token = rtas_token("rtas-last-error"); #endif + +#ifdef CONFIG_PPC_RTAS_FILTER + for (i = 0; i < ARRAY_SIZE(rtas_filters); i++) { + rtas_filters[i].token = rtas_token(rtas_filters[i].name); + } +#endif } int __init early_init_dt_scan_rtas(unsigned long node, diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c index 755dc98a57ae..6b107de10ffa 100644 --- a/arch/powerpc/kernel/sysfs.c +++ b/arch/powerpc/kernel/sysfs.c @@ -29,29 +29,27 @@ static DEFINE_PER_CPU(struct cpu, cpu_devices); -/* - * SMT snooze delay stuff, 64-bit only for now - */ - #ifdef CONFIG_PPC64 -/* Time in microseconds we delay before sleeping in the idle loop */ -static DEFINE_PER_CPU(long, smt_snooze_delay) = { 100 }; +/* + * Snooze delay has not been hooked up since 3fa8cad82b94 ("powerpc/pseries/cpuidle: + * smt-snooze-delay cleanup.") and has been broken even longer. As was foretold in + * 2014: + * + * "ppc64_util currently utilises it. Once we fix ppc64_util, propose to clean + * up the kernel code." + * + * powerpc-utils stopped using it as of 1.3.8. At some point in the future this + * code should be removed. + */ static ssize_t store_smt_snooze_delay(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { - struct cpu *cpu = container_of(dev, struct cpu, dev); - ssize_t ret; - long snooze; - - ret = sscanf(buf, "%ld", &snooze); - if (ret != 1) - return -EINVAL; - - per_cpu(smt_snooze_delay, cpu->dev.id) = snooze; + pr_warn_once("%s (%d) stored to unsupported smt_snooze_delay, which has no effect.\n", + current->comm, current->pid); return count; } @@ -59,9 +57,9 @@ static ssize_t show_smt_snooze_delay(struct device *dev, struct device_attribute *attr, char *buf) { - struct cpu *cpu = container_of(dev, struct cpu, dev); - - return sprintf(buf, "%ld\n", per_cpu(smt_snooze_delay, cpu->dev.id)); + pr_warn_once("%s (%d) read from unsupported smt_snooze_delay\n", + current->comm, current->pid); + return sprintf(buf, "100\n"); } static DEVICE_ATTR(smt_snooze_delay, 0644, show_smt_snooze_delay, @@ -69,16 +67,10 @@ static DEVICE_ATTR(smt_snooze_delay, 0644, show_smt_snooze_delay, static int __init setup_smt_snooze_delay(char *str) { - unsigned int cpu; - long snooze; - if (!cpu_has_feature(CPU_FTR_SMT)) return 1; - snooze = simple_strtol(str, NULL, 10); - for_each_possible_cpu(cpu) - per_cpu(smt_snooze_delay, cpu) = snooze; - + pr_warn("smt-snooze-delay command line option has no effect\n"); return 1; } __setup("smt-snooze-delay=", setup_smt_snooze_delay); diff --git a/arch/powerpc/kernel/tau_6xx.c b/arch/powerpc/kernel/tau_6xx.c index e2ab8a111b69..8ece45c2a1f6 100644 --- a/arch/powerpc/kernel/tau_6xx.c +++ b/arch/powerpc/kernel/tau_6xx.c @@ -13,13 +13,14 @@ */ #include -#include #include #include #include #include #include #include +#include +#include #include #include @@ -39,9 +40,7 @@ static struct tau_temp unsigned char grew; } tau[NR_CPUS]; -struct timer_list tau_timer; - -#undef DEBUG +static bool tau_int_enable; /* TODO: put these in a /proc interface, with some sanity checks, and maybe * dynamic adjustment to minimize # of interrupts */ @@ -50,72 +49,49 @@ struct timer_list tau_timer; #define step_size 2 /* step size when temp goes out of range */ #define window_expand 1 /* expand the window by this much */ /* configurable values for shrinking the window */ -#define shrink_timer 2*HZ /* period between shrinking the window */ +#define shrink_timer 2000 /* period between shrinking the window */ #define min_window 2 /* minimum window size, degrees C */ static void set_thresholds(unsigned long cpu) { -#ifdef CONFIG_TAU_INT - /* - * setup THRM1, - * threshold, valid bit, enable interrupts, interrupt when below threshold - */ - mtspr(SPRN_THRM1, THRM1_THRES(tau[cpu].low) | THRM1_V | THRM1_TIE | THRM1_TID); + u32 maybe_tie = tau_int_enable ? THRM1_TIE : 0; - /* setup THRM2, - * threshold, valid bit, enable interrupts, interrupt when above threshold - */ - mtspr (SPRN_THRM2, THRM1_THRES(tau[cpu].high) | THRM1_V | THRM1_TIE); -#else - /* same thing but don't enable interrupts */ - mtspr(SPRN_THRM1, THRM1_THRES(tau[cpu].low) | THRM1_V | THRM1_TID); - mtspr(SPRN_THRM2, THRM1_THRES(tau[cpu].high) | THRM1_V); -#endif + /* setup THRM1, threshold, valid bit, interrupt when below threshold */ + mtspr(SPRN_THRM1, THRM1_THRES(tau[cpu].low) | THRM1_V | maybe_tie | THRM1_TID); + + /* setup THRM2, threshold, valid bit, interrupt when above threshold */ + mtspr(SPRN_THRM2, THRM1_THRES(tau[cpu].high) | THRM1_V | maybe_tie); } static void TAUupdate(int cpu) { - unsigned thrm; - -#ifdef DEBUG - printk("TAUupdate "); -#endif + u32 thrm; + u32 bits = THRM1_TIV | THRM1_TIN | THRM1_V; /* if both thresholds are crossed, the step_sizes cancel out * and the window winds up getting expanded twice. */ - if((thrm = mfspr(SPRN_THRM1)) & THRM1_TIV){ /* is valid? */ - if(thrm & THRM1_TIN){ /* crossed low threshold */ - if (tau[cpu].low >= step_size){ - tau[cpu].low -= step_size; - tau[cpu].high -= (step_size - window_expand); - } - tau[cpu].grew = 1; -#ifdef DEBUG - printk("low threshold crossed "); -#endif + thrm = mfspr(SPRN_THRM1); + if ((thrm & bits) == bits) { + mtspr(SPRN_THRM1, 0); + + if (tau[cpu].low >= step_size) { + tau[cpu].low -= step_size; + tau[cpu].high -= (step_size - window_expand); } + tau[cpu].grew = 1; + pr_debug("%s: low threshold crossed\n", __func__); } - if((thrm = mfspr(SPRN_THRM2)) & THRM1_TIV){ /* is valid? */ - if(thrm & THRM1_TIN){ /* crossed high threshold */ - if (tau[cpu].high <= 127-step_size){ - tau[cpu].low += (step_size - window_expand); - tau[cpu].high += step_size; - } - tau[cpu].grew = 1; -#ifdef DEBUG - printk("high threshold crossed "); -#endif + thrm = mfspr(SPRN_THRM2); + if ((thrm & bits) == bits) { + mtspr(SPRN_THRM2, 0); + + if (tau[cpu].high <= 127 - step_size) { + tau[cpu].low += (step_size - window_expand); + tau[cpu].high += step_size; } + tau[cpu].grew = 1; + pr_debug("%s: high threshold crossed\n", __func__); } - -#ifdef DEBUG - printk("grew = %d\n", tau[cpu].grew); -#endif - -#ifndef CONFIG_TAU_INT /* tau_timeout will do this if not using interrupts */ - set_thresholds(cpu); -#endif - } #ifdef CONFIG_TAU_INT @@ -140,17 +116,16 @@ void TAUException(struct pt_regs * regs) static void tau_timeout(void * info) { int cpu; - unsigned long flags; int size; int shrink; - /* disabling interrupts *should* be okay */ - local_irq_save(flags); cpu = smp_processor_id(); -#ifndef CONFIG_TAU_INT - TAUupdate(cpu); -#endif + if (!tau_int_enable) + TAUupdate(cpu); + + /* Stop thermal sensor comparisons and interrupts */ + mtspr(SPRN_THRM3, 0); size = tau[cpu].high - tau[cpu].low; if (size > min_window && ! tau[cpu].grew) { @@ -173,32 +148,26 @@ static void tau_timeout(void * info) set_thresholds(cpu); - /* - * Do the enable every time, since otherwise a bunch of (relatively) - * complex sleep code needs to be added. One mtspr every time - * tau_timeout is called is probably not a big deal. - * - * Enable thermal sensor and set up sample interval timer - * need 20 us to do the compare.. until a nice 'cpu_speed' function - * call is implemented, just assume a 500 mhz clock. It doesn't really - * matter if we take too long for a compare since it's all interrupt - * driven anyway. - * - * use a extra long time.. (60 us @ 500 mhz) + /* Restart thermal sensor comparisons and interrupts. + * The "PowerPC 740 and PowerPC 750 Microprocessor Datasheet" + * recommends that "the maximum value be set in THRM3 under all + * conditions." */ - mtspr(SPRN_THRM3, THRM3_SITV(500*60) | THRM3_E); - - local_irq_restore(flags); + mtspr(SPRN_THRM3, THRM3_SITV(0x1fff) | THRM3_E); } -static void tau_timeout_smp(struct timer_list *unused) +static struct workqueue_struct *tau_workq; + +static void tau_work_func(struct work_struct *work) { - - /* schedule ourselves to be run again */ - mod_timer(&tau_timer, jiffies + shrink_timer) ; + msleep(shrink_timer); on_each_cpu(tau_timeout, NULL, 0); + /* schedule ourselves to be run again */ + queue_work(tau_workq, work); } +DECLARE_WORK(tau_work, tau_work_func); + /* * setup the TAU * @@ -231,21 +200,19 @@ static int __init TAU_init(void) return 1; } + tau_int_enable = IS_ENABLED(CONFIG_TAU_INT) && + !strcmp(cur_cpu_spec->platform, "ppc750"); - /* first, set up the window shrinking timer */ - timer_setup(&tau_timer, tau_timeout_smp, 0); - tau_timer.expires = jiffies + shrink_timer; - add_timer(&tau_timer); + tau_workq = alloc_workqueue("tau", WQ_UNBOUND, 1); + if (!tau_workq) + return -ENOMEM; on_each_cpu(TAU_init_smp, NULL, 0); - printk("Thermal assist unit "); -#ifdef CONFIG_TAU_INT - printk("using interrupts, "); -#else - printk("using timers, "); -#endif - printk("shrink_timer: %d jiffies\n", shrink_timer); + queue_work(tau_workq, &tau_work); + + pr_info("Thermal assist unit using %s, shrink_timer: %d ms\n", + tau_int_enable ? "interrupts" : "workqueue", shrink_timer); tau_initialized = 1; return 0; diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c index 7781f0168ce8..1b2d84cb373b 100644 --- a/arch/powerpc/kernel/traps.c +++ b/arch/powerpc/kernel/traps.c @@ -794,7 +794,7 @@ static void p9_hmi_special_emu(struct pt_regs *regs) { unsigned int ra, rb, t, i, sel, instr, rc; const void __user *addr; - u8 vbuf[16], *vdst; + u8 vbuf[16] __aligned(16), *vdst; unsigned long ea, msr, msr_mask; bool swap; diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c index 1749f15fc070..80b8fc4173de 100644 --- a/arch/powerpc/mm/tlb-radix.c +++ b/arch/powerpc/mm/tlb-radix.c @@ -598,19 +598,29 @@ static void do_exit_flush_lazy_tlb(void *arg) struct mm_struct *mm = arg; unsigned long pid = mm->context.id; + /* + * A kthread could have done a mmget_not_zero() after the flushing CPU + * checked mm_is_singlethreaded, and be in the process of + * kthread_use_mm when interrupted here. In that case, current->mm will + * be set to mm, because kthread_use_mm() setting ->mm and switching to + * the mm is done with interrupts off. + */ if (current->mm == mm) - return; /* Local CPU */ + goto out_flush; if (current->active_mm == mm) { - /* - * Must be a kernel thread because sender is single-threaded. - */ - BUG_ON(current->mm); + WARN_ON_ONCE(current->mm != NULL); + /* Is a kernel thread and is using mm as the lazy tlb */ mmgrab(&init_mm); - switch_mm(mm, &init_mm, current); current->active_mm = &init_mm; + switch_mm_irqs_off(mm, &init_mm, current); mmdrop(mm); } + + atomic_dec(&mm->context.active_cpus); + cpumask_clear_cpu(smp_processor_id(), mm_cpumask(mm)); + +out_flush: _tlbiel_pid(pid, RIC_FLUSH_ALL); } @@ -625,7 +635,6 @@ static void exit_flush_lazy_tlbs(struct mm_struct *mm) */ smp_call_function_many(mm_cpumask(mm), do_exit_flush_lazy_tlb, (void *)mm, 1); - mm_reset_thread_local(mm); } void radix__flush_tlb_mm(struct mm_struct *mm) diff --git a/arch/powerpc/perf/hv-gpci-requests.h b/arch/powerpc/perf/hv-gpci-requests.h index e608f9db12dd..8965b4463d43 100644 --- a/arch/powerpc/perf/hv-gpci-requests.h +++ b/arch/powerpc/perf/hv-gpci-requests.h @@ -95,7 +95,7 @@ REQUEST(__field(0, 8, partition_id) #define REQUEST_NAME system_performance_capabilities #define REQUEST_NUM 0x40 -#define REQUEST_IDX_KIND "starting_index=0xffffffffffffffff" +#define REQUEST_IDX_KIND "starting_index=0xffffffff" #include I(REQUEST_BEGIN) REQUEST(__field(0, 1, perf_collect_privileged) __field(0x1, 1, capability_mask) @@ -223,7 +223,7 @@ REQUEST(__field(0, 2, partition_id) #define REQUEST_NAME system_hypervisor_times #define REQUEST_NUM 0xF0 -#define REQUEST_IDX_KIND "starting_index=0xffffffffffffffff" +#define REQUEST_IDX_KIND "starting_index=0xffffffff" #include I(REQUEST_BEGIN) REQUEST(__count(0, 8, time_spent_to_dispatch_virtual_processors) __count(0x8, 8, time_spent_processing_virtual_processor_timers) @@ -234,7 +234,7 @@ REQUEST(__count(0, 8, time_spent_to_dispatch_virtual_processors) #define REQUEST_NAME system_tlbie_count_and_time #define REQUEST_NUM 0xF4 -#define REQUEST_IDX_KIND "starting_index=0xffffffffffffffff" +#define REQUEST_IDX_KIND "starting_index=0xffffffff" #include I(REQUEST_BEGIN) REQUEST(__count(0, 8, tlbie_instructions_issued) /* diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c index 053b8e9aa9e7..69a2dc2b16cf 100644 --- a/arch/powerpc/perf/isa207-common.c +++ b/arch/powerpc/perf/isa207-common.c @@ -273,6 +273,15 @@ int isa207_get_constraint(u64 event, unsigned long *maskp, unsigned long *valp) mask |= CNST_PMC_MASK(pmc); value |= CNST_PMC_VAL(pmc); + + /* + * PMC5 and PMC6 are used to count cycles and instructions and + * they do not support most of the constraint bits. Add a check + * to exclude PMC5/6 from most of the constraints except for + * EBB/BHRB. + */ + if (pmc >= 5) + goto ebb_bhrb; } if (pmc <= 4) { @@ -331,6 +340,7 @@ int isa207_get_constraint(u64 event, unsigned long *maskp, unsigned long *valp) } } +ebb_bhrb: if (!pmc && ebb) /* EBB events must specify the PMC */ return -1; diff --git a/arch/powerpc/platforms/Kconfig b/arch/powerpc/platforms/Kconfig index 14ef17e10ec9..9914544e6677 100644 --- a/arch/powerpc/platforms/Kconfig +++ b/arch/powerpc/platforms/Kconfig @@ -238,12 +238,11 @@ config TAU temperature within 2-4 degrees Celsius. This option shows the current on-die temperature in /proc/cpuinfo if the cpu supports it. - Unfortunately, on some chip revisions, this sensor is very inaccurate - and in many cases, does not work at all, so don't assume the cpu - temp is actually what /proc/cpuinfo says it is. + Unfortunately, this sensor is very inaccurate when uncalibrated, so + don't assume the cpu temp is actually what /proc/cpuinfo says it is. config TAU_INT - bool "Interrupt driven TAU driver (DANGEROUS)" + bool "Interrupt driven TAU driver (EXPERIMENTAL)" depends on TAU ---help--- The TAU supports an interrupt driven mode which causes an interrupt @@ -251,12 +250,7 @@ config TAU_INT to get notified the temp has exceeded a range. With this option off, a timer is used to re-check the temperature periodically. - However, on some cpus it appears that the TAU interrupt hardware - is buggy and can cause a situation which would lead unexplained hard - lockups. - - Unless you are extending the TAU driver, or enjoy kernel/hardware - debugging, leave this option off. + If in doubt, say N here. config TAU_AVERAGE bool "Average high and low temp" diff --git a/arch/powerpc/platforms/powernv/opal-dump.c b/arch/powerpc/platforms/powernv/opal-dump.c index 198143833f00..1dc2122a3cf5 100644 --- a/arch/powerpc/platforms/powernv/opal-dump.c +++ b/arch/powerpc/platforms/powernv/opal-dump.c @@ -322,15 +322,14 @@ static ssize_t dump_attr_read(struct file *filep, struct kobject *kobj, return count; } -static struct dump_obj *create_dump_obj(uint32_t id, size_t size, - uint32_t type) +static void create_dump_obj(uint32_t id, size_t size, uint32_t type) { struct dump_obj *dump; int rc; dump = kzalloc(sizeof(*dump), GFP_KERNEL); if (!dump) - return NULL; + return; dump->kobj.kset = dump_kset; @@ -350,21 +349,39 @@ static struct dump_obj *create_dump_obj(uint32_t id, size_t size, rc = kobject_add(&dump->kobj, NULL, "0x%x-0x%x", type, id); if (rc) { kobject_put(&dump->kobj); - return NULL; + return; } + /* + * As soon as the sysfs file for this dump is created/activated there is + * a chance the opal_errd daemon (or any userspace) might read and + * acknowledge the dump before kobject_uevent() is called. If that + * happens then there is a potential race between + * dump_ack_store->kobject_put() and kobject_uevent() which leads to a + * use-after-free of a kernfs object resulting in a kernel crash. + * + * To avoid that, we need to take a reference on behalf of the bin file, + * so that our reference remains valid while we call kobject_uevent(). + * We then drop our reference before exiting the function, leaving the + * bin file to drop the last reference (if it hasn't already). + */ + + /* Take a reference for the bin file */ + kobject_get(&dump->kobj); rc = sysfs_create_bin_file(&dump->kobj, &dump->dump_attr); - if (rc) { + if (rc == 0) { + kobject_uevent(&dump->kobj, KOBJ_ADD); + + pr_info("%s: New platform dump. ID = 0x%x Size %u\n", + __func__, dump->id, dump->size); + } else { + /* Drop reference count taken for bin file */ kobject_put(&dump->kobj); - return NULL; } - pr_info("%s: New platform dump. ID = 0x%x Size %u\n", - __func__, dump->id, dump->size); - - kobject_uevent(&dump->kobj, KOBJ_ADD); - - return dump; + /* Drop our reference */ + kobject_put(&dump->kobj); + return; } static irqreturn_t process_dump(int irq, void *data) diff --git a/arch/powerpc/platforms/powernv/opal-elog.c b/arch/powerpc/platforms/powernv/opal-elog.c index ba6e437abb4b..398a06631456 100644 --- a/arch/powerpc/platforms/powernv/opal-elog.c +++ b/arch/powerpc/platforms/powernv/opal-elog.c @@ -183,14 +183,14 @@ static ssize_t raw_attr_read(struct file *filep, struct kobject *kobj, return count; } -static struct elog_obj *create_elog_obj(uint64_t id, size_t size, uint64_t type) +static void create_elog_obj(uint64_t id, size_t size, uint64_t type) { struct elog_obj *elog; int rc; elog = kzalloc(sizeof(*elog), GFP_KERNEL); if (!elog) - return NULL; + return; elog->kobj.kset = elog_kset; @@ -223,18 +223,37 @@ static struct elog_obj *create_elog_obj(uint64_t id, size_t size, uint64_t type) rc = kobject_add(&elog->kobj, NULL, "0x%llx", id); if (rc) { kobject_put(&elog->kobj); - return NULL; + return; } + /* + * As soon as the sysfs file for this elog is created/activated there is + * a chance the opal_errd daemon (or any userspace) might read and + * acknowledge the elog before kobject_uevent() is called. If that + * happens then there is a potential race between + * elog_ack_store->kobject_put() and kobject_uevent() which leads to a + * use-after-free of a kernfs object resulting in a kernel crash. + * + * To avoid that, we need to take a reference on behalf of the bin file, + * so that our reference remains valid while we call kobject_uevent(). + * We then drop our reference before exiting the function, leaving the + * bin file to drop the last reference (if it hasn't already). + */ + + /* Take a reference for the bin file */ + kobject_get(&elog->kobj); rc = sysfs_create_bin_file(&elog->kobj, &elog->raw_attr); - if (rc) { + if (rc == 0) { + kobject_uevent(&elog->kobj, KOBJ_ADD); + } else { + /* Drop the reference taken for the bin file */ kobject_put(&elog->kobj); - return NULL; } - kobject_uevent(&elog->kobj, KOBJ_ADD); + /* Drop our reference */ + kobject_put(&elog->kobj); - return elog; + return; } static irqreturn_t elog_event(int irq, void *data) diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c index 8d49ba370c50..889c3dbec6fb 100644 --- a/arch/powerpc/platforms/powernv/smp.c +++ b/arch/powerpc/platforms/powernv/smp.c @@ -47,7 +47,7 @@ #include #define DBG(fmt...) udbg_printf(fmt) #else -#define DBG(fmt...) +#define DBG(fmt...) do { } while (0) #endif static void pnv_smp_setup_cpu(int cpu) diff --git a/arch/powerpc/platforms/pseries/rng.c b/arch/powerpc/platforms/pseries/rng.c index 31ca557af60b..262b8c5e1b9d 100644 --- a/arch/powerpc/platforms/pseries/rng.c +++ b/arch/powerpc/platforms/pseries/rng.c @@ -40,6 +40,7 @@ static __init int rng_init(void) ppc_md.get_random_seed = pseries_get_random_long; + of_node_put(dn); return 0; } machine_subsys_initcall(pseries, rng_init); diff --git a/arch/powerpc/sysdev/xics/icp-hv.c b/arch/powerpc/sysdev/xics/icp-hv.c index bbc839a98c41..003deaabb568 100644 --- a/arch/powerpc/sysdev/xics/icp-hv.c +++ b/arch/powerpc/sysdev/xics/icp-hv.c @@ -179,6 +179,7 @@ int icp_hv_init(void) icp_ops = &icp_hv_ops; + of_node_put(np); return 0; } diff --git a/arch/riscv/include/uapi/asm/auxvec.h b/arch/riscv/include/uapi/asm/auxvec.h index 1376515547cd..ed7bf7c7add5 100644 --- a/arch/riscv/include/uapi/asm/auxvec.h +++ b/arch/riscv/include/uapi/asm/auxvec.h @@ -21,4 +21,7 @@ /* vDSO location */ #define AT_SYSINFO_EHDR 33 +/* entries in ARCH_DLINFO */ +#define AT_VECTOR_SIZE_ARCH 1 + #endif /* _UAPI_ASM_RISCV_AUXVEC_H */ diff --git a/arch/s390/kernel/time.c b/arch/s390/kernel/time.c index 8ea9db599d38..11c32b228f51 100644 --- a/arch/s390/kernel/time.c +++ b/arch/s390/kernel/time.c @@ -354,8 +354,9 @@ static DEFINE_PER_CPU(atomic_t, clock_sync_word); static DEFINE_MUTEX(clock_sync_mutex); static unsigned long clock_sync_flags; -#define CLOCK_SYNC_HAS_STP 0 -#define CLOCK_SYNC_STP 1 +#define CLOCK_SYNC_HAS_STP 0 +#define CLOCK_SYNC_STP 1 +#define CLOCK_SYNC_STPINFO_VALID 2 /* * The get_clock function for the physical clock. It will get the current @@ -592,6 +593,22 @@ void stp_queue_work(void) queue_work(time_sync_wq, &stp_work); } +static int __store_stpinfo(void) +{ + int rc = chsc_sstpi(stp_page, &stp_info, sizeof(struct stp_sstpi)); + + if (rc) + clear_bit(CLOCK_SYNC_STPINFO_VALID, &clock_sync_flags); + else + set_bit(CLOCK_SYNC_STPINFO_VALID, &clock_sync_flags); + return rc; +} + +static int stpinfo_valid(void) +{ + return stp_online && test_bit(CLOCK_SYNC_STPINFO_VALID, &clock_sync_flags); +} + static int stp_sync_clock(void *data) { struct clock_sync_data *sync = data; @@ -613,8 +630,7 @@ static int stp_sync_clock(void *data) if (rc == 0) { sync->clock_delta = clock_delta; clock_sync_global(clock_delta); - rc = chsc_sstpi(stp_page, &stp_info, - sizeof(struct stp_sstpi)); + rc = __store_stpinfo(); if (rc == 0 && stp_info.tmd != 2) rc = -EAGAIN; } @@ -659,7 +675,7 @@ static void stp_work_fn(struct work_struct *work) if (rc) goto out_unlock; - rc = chsc_sstpi(stp_page, &stp_info, sizeof(struct stp_sstpi)); + rc = __store_stpinfo(); if (rc || stp_info.c == 0) goto out_unlock; @@ -696,10 +712,14 @@ static ssize_t stp_ctn_id_show(struct device *dev, struct device_attribute *attr, char *buf) { - if (!stp_online) - return -ENODATA; - return sprintf(buf, "%016llx\n", - *(unsigned long long *) stp_info.ctnid); + ssize_t ret = -ENODATA; + + mutex_lock(&stp_work_mutex); + if (stpinfo_valid()) + ret = sprintf(buf, "%016llx\n", + *(unsigned long long *) stp_info.ctnid); + mutex_unlock(&stp_work_mutex); + return ret; } static DEVICE_ATTR(ctn_id, 0400, stp_ctn_id_show, NULL); @@ -708,9 +728,13 @@ static ssize_t stp_ctn_type_show(struct device *dev, struct device_attribute *attr, char *buf) { - if (!stp_online) - return -ENODATA; - return sprintf(buf, "%i\n", stp_info.ctn); + ssize_t ret = -ENODATA; + + mutex_lock(&stp_work_mutex); + if (stpinfo_valid()) + ret = sprintf(buf, "%i\n", stp_info.ctn); + mutex_unlock(&stp_work_mutex); + return ret; } static DEVICE_ATTR(ctn_type, 0400, stp_ctn_type_show, NULL); @@ -719,9 +743,13 @@ static ssize_t stp_dst_offset_show(struct device *dev, struct device_attribute *attr, char *buf) { - if (!stp_online || !(stp_info.vbits & 0x2000)) - return -ENODATA; - return sprintf(buf, "%i\n", (int)(s16) stp_info.dsto); + ssize_t ret = -ENODATA; + + mutex_lock(&stp_work_mutex); + if (stpinfo_valid() && (stp_info.vbits & 0x2000)) + ret = sprintf(buf, "%i\n", (int)(s16) stp_info.dsto); + mutex_unlock(&stp_work_mutex); + return ret; } static DEVICE_ATTR(dst_offset, 0400, stp_dst_offset_show, NULL); @@ -730,9 +758,13 @@ static ssize_t stp_leap_seconds_show(struct device *dev, struct device_attribute *attr, char *buf) { - if (!stp_online || !(stp_info.vbits & 0x8000)) - return -ENODATA; - return sprintf(buf, "%i\n", (int)(s16) stp_info.leaps); + ssize_t ret = -ENODATA; + + mutex_lock(&stp_work_mutex); + if (stpinfo_valid() && (stp_info.vbits & 0x8000)) + ret = sprintf(buf, "%i\n", (int)(s16) stp_info.leaps); + mutex_unlock(&stp_work_mutex); + return ret; } static DEVICE_ATTR(leap_seconds, 0400, stp_leap_seconds_show, NULL); @@ -741,9 +773,13 @@ static ssize_t stp_stratum_show(struct device *dev, struct device_attribute *attr, char *buf) { - if (!stp_online) - return -ENODATA; - return sprintf(buf, "%i\n", (int)(s16) stp_info.stratum); + ssize_t ret = -ENODATA; + + mutex_lock(&stp_work_mutex); + if (stpinfo_valid()) + ret = sprintf(buf, "%i\n", (int)(s16) stp_info.stratum); + mutex_unlock(&stp_work_mutex); + return ret; } static DEVICE_ATTR(stratum, 0400, stp_stratum_show, NULL); @@ -752,9 +788,13 @@ static ssize_t stp_time_offset_show(struct device *dev, struct device_attribute *attr, char *buf) { - if (!stp_online || !(stp_info.vbits & 0x0800)) - return -ENODATA; - return sprintf(buf, "%i\n", (int) stp_info.tto); + ssize_t ret = -ENODATA; + + mutex_lock(&stp_work_mutex); + if (stpinfo_valid() && (stp_info.vbits & 0x0800)) + ret = sprintf(buf, "%i\n", (int) stp_info.tto); + mutex_unlock(&stp_work_mutex); + return ret; } static DEVICE_ATTR(time_offset, 0400, stp_time_offset_show, NULL); @@ -763,9 +803,13 @@ static ssize_t stp_time_zone_offset_show(struct device *dev, struct device_attribute *attr, char *buf) { - if (!stp_online || !(stp_info.vbits & 0x4000)) - return -ENODATA; - return sprintf(buf, "%i\n", (int)(s16) stp_info.tzo); + ssize_t ret = -ENODATA; + + mutex_lock(&stp_work_mutex); + if (stpinfo_valid() && (stp_info.vbits & 0x4000)) + ret = sprintf(buf, "%i\n", (int)(s16) stp_info.tzo); + mutex_unlock(&stp_work_mutex); + return ret; } static DEVICE_ATTR(time_zone_offset, 0400, @@ -775,9 +819,13 @@ static ssize_t stp_timing_mode_show(struct device *dev, struct device_attribute *attr, char *buf) { - if (!stp_online) - return -ENODATA; - return sprintf(buf, "%i\n", stp_info.tmd); + ssize_t ret = -ENODATA; + + mutex_lock(&stp_work_mutex); + if (stpinfo_valid()) + ret = sprintf(buf, "%i\n", stp_info.tmd); + mutex_unlock(&stp_work_mutex); + return ret; } static DEVICE_ATTR(timing_mode, 0400, stp_timing_mode_show, NULL); @@ -786,9 +834,13 @@ static ssize_t stp_timing_state_show(struct device *dev, struct device_attribute *attr, char *buf) { - if (!stp_online) - return -ENODATA; - return sprintf(buf, "%i\n", stp_info.tst); + ssize_t ret = -ENODATA; + + mutex_lock(&stp_work_mutex); + if (stpinfo_valid()) + ret = sprintf(buf, "%i\n", stp_info.tst); + mutex_unlock(&stp_work_mutex); + return ret; } static DEVICE_ATTR(timing_state, 0400, stp_timing_state_show, NULL); diff --git a/arch/sparc/kernel/smp_64.c b/arch/sparc/kernel/smp_64.c index d3ea1f3c06a0..a7d7b7ade42f 100644 --- a/arch/sparc/kernel/smp_64.c +++ b/arch/sparc/kernel/smp_64.c @@ -1039,38 +1039,9 @@ void smp_fetch_global_pmu(void) * are flush_tlb_*() routines, and these run after flush_cache_*() * which performs the flushw. * - * The SMP TLB coherency scheme we use works as follows: - * - * 1) mm->cpu_vm_mask is a bit mask of which cpus an address - * space has (potentially) executed on, this is the heuristic - * we use to avoid doing cross calls. - * - * Also, for flushing from kswapd and also for clones, we - * use cpu_vm_mask as the list of cpus to make run the TLB. - * - * 2) TLB context numbers are shared globally across all processors - * in the system, this allows us to play several games to avoid - * cross calls. - * - * One invariant is that when a cpu switches to a process, and - * that processes tsk->active_mm->cpu_vm_mask does not have the - * current cpu's bit set, that tlb context is flushed locally. - * - * If the address space is non-shared (ie. mm->count == 1) we avoid - * cross calls when we want to flush the currently running process's - * tlb state. This is done by clearing all cpu bits except the current - * processor's in current->mm->cpu_vm_mask and performing the - * flush locally only. This will force any subsequent cpus which run - * this task to flush the context from the local tlb if the process - * migrates to another cpu (again). - * - * 3) For shared address spaces (threads) and swapping we bite the - * bullet for most cases and perform the cross call (but only to - * the cpus listed in cpu_vm_mask). - * - * The performance gain from "optimizing" away the cross call for threads is - * questionable (in theory the big win for threads is the massive sharing of - * address space state across processors). + * mm->cpu_vm_mask is a bit mask of which cpus an address + * space has (potentially) executed on, this is the heuristic + * we use to limit cross calls. */ /* This currently is only used by the hugetlb arch pre-fault @@ -1080,18 +1051,13 @@ void smp_fetch_global_pmu(void) void smp_flush_tlb_mm(struct mm_struct *mm) { u32 ctx = CTX_HWBITS(mm->context); - int cpu = get_cpu(); - if (atomic_read(&mm->mm_users) == 1) { - cpumask_copy(mm_cpumask(mm), cpumask_of(cpu)); - goto local_flush_and_out; - } + get_cpu(); smp_cross_call_masked(&xcall_flush_tlb_mm, ctx, 0, 0, mm_cpumask(mm)); -local_flush_and_out: __flush_tlb_mm(ctx, SECONDARY_CONTEXT); put_cpu(); @@ -1114,17 +1080,15 @@ void smp_flush_tlb_pending(struct mm_struct *mm, unsigned long nr, unsigned long { u32 ctx = CTX_HWBITS(mm->context); struct tlb_pending_info info; - int cpu = get_cpu(); + + get_cpu(); info.ctx = ctx; info.nr = nr; info.vaddrs = vaddrs; - if (mm == current->mm && atomic_read(&mm->mm_users) == 1) - cpumask_copy(mm_cpumask(mm), cpumask_of(cpu)); - else - smp_call_function_many(mm_cpumask(mm), tlb_pending_func, - &info, 1); + smp_call_function_many(mm_cpumask(mm), tlb_pending_func, + &info, 1); __flush_tlb_pending(ctx, nr, vaddrs); @@ -1134,14 +1098,13 @@ void smp_flush_tlb_pending(struct mm_struct *mm, unsigned long nr, unsigned long void smp_flush_tlb_page(struct mm_struct *mm, unsigned long vaddr) { unsigned long context = CTX_HWBITS(mm->context); - int cpu = get_cpu(); - if (mm == current->mm && atomic_read(&mm->mm_users) == 1) - cpumask_copy(mm_cpumask(mm), cpumask_of(cpu)); - else - smp_cross_call_masked(&xcall_flush_tlb_page, - context, vaddr, 0, - mm_cpumask(mm)); + get_cpu(); + + smp_cross_call_masked(&xcall_flush_tlb_page, + context, vaddr, 0, + mm_cpumask(mm)); + __flush_tlb_page(context, vaddr); put_cpu(); diff --git a/arch/um/kernel/sigio.c b/arch/um/kernel/sigio.c index b5e0cbb34382..476ded92affa 100644 --- a/arch/um/kernel/sigio.c +++ b/arch/um/kernel/sigio.c @@ -36,14 +36,14 @@ int write_sigio_irq(int fd) } /* These are called from os-Linux/sigio.c to protect its pollfds arrays. */ -static DEFINE_SPINLOCK(sigio_spinlock); +static DEFINE_MUTEX(sigio_mutex); void sigio_lock(void) { - spin_lock(&sigio_spinlock); + mutex_lock(&sigio_mutex); } void sigio_unlock(void) { - spin_unlock(&sigio_spinlock); + mutex_unlock(&sigio_mutex); } diff --git a/arch/x86/Makefile b/arch/x86/Makefile index 5e400c74ddc7..7f233461a9cd 100644 --- a/arch/x86/Makefile +++ b/arch/x86/Makefile @@ -200,9 +200,10 @@ avx2_instr :=$(call as-instr,vpbroadcastb %xmm0$(comma)%ymm1,-DCONFIG_AS_AVX2=1) avx512_instr :=$(call as-instr,vpmovm2b %k1$(comma)%zmm5,-DCONFIG_AS_AVX512=1) sha1_ni_instr :=$(call as-instr,sha1msg1 %xmm0$(comma)%xmm1,-DCONFIG_AS_SHA1_NI=1) sha256_ni_instr :=$(call as-instr,sha256msg1 %xmm0$(comma)%xmm1,-DCONFIG_AS_SHA256_NI=1) +adx_instr := $(call as-instr,adox %r10$(comma)%r10,-DCONFIG_AS_ADX=1) -KBUILD_AFLAGS += $(cfi) $(cfi-sigframe) $(cfi-sections) $(asinstr) $(avx_instr) $(avx2_instr) $(avx512_instr) $(sha1_ni_instr) $(sha256_ni_instr) -KBUILD_CFLAGS += $(cfi) $(cfi-sigframe) $(cfi-sections) $(asinstr) $(avx_instr) $(avx2_instr) $(avx512_instr) $(sha1_ni_instr) $(sha256_ni_instr) +KBUILD_AFLAGS += $(cfi) $(cfi-sigframe) $(cfi-sections) $(asinstr) $(avx_instr) $(avx2_instr) $(avx512_instr) $(sha1_ni_instr) $(sha256_ni_instr) $(adx_instr) +KBUILD_CFLAGS += $(cfi) $(cfi-sigframe) $(cfi-sections) $(asinstr) $(avx_instr) $(avx2_instr) $(avx512_instr) $(sha1_ni_instr) $(sha256_ni_instr) $(adx_instr) KBUILD_LDFLAGS := -m elf_$(UTS_MACHINE) diff --git a/arch/x86/configs/gki_defconfig b/arch/x86/configs/gki_defconfig index 22ac03d6b83f..0607d16fcba2 100644 --- a/arch/x86/configs/gki_defconfig +++ b/arch/x86/configs/gki_defconfig @@ -40,6 +40,7 @@ CONFIG_EMBEDDED=y # CONFIG_SLAB_MERGE_DEFAULT is not set CONFIG_PROFILING=y CONFIG_SMP=y +CONFIG_X86_X2APIC=y CONFIG_HYPERVISOR_GUEST=y CONFIG_PARAVIRT=y CONFIG_NR_CPUS=32 @@ -213,6 +214,7 @@ CONFIG_DM_VERITY_FEC=y CONFIG_DM_BOW=y CONFIG_NETDEVICES=y CONFIG_DUMMY=y +CONFIG_WIREGUARD=y CONFIG_TUN=y CONFIG_VETH=y # CONFIG_ETHERNET is not set @@ -310,6 +312,7 @@ CONFIG_HID_NINTENDO=y CONFIG_HID_SONY=y CONFIG_HID_STEAM=y CONFIG_USB_HIDDEV=y +CONFIG_USB_ANNOUNCE_NEW_DEVICES=y CONFIG_USB_XHCI_HCD=y CONFIG_USB_GADGET=y CONFIG_USB_GADGET_VBUS_DRAW=500 @@ -436,6 +439,7 @@ CONFIG_CRC8=y CONFIG_XZ_DEC=y CONFIG_PRINTK_TIME=y CONFIG_DEBUG_INFO=y +CONFIG_DEBUG_INFO_DWARF4=y # CONFIG_ENABLE_MUST_CHECK is not set # CONFIG_SECTION_MISMATCH_WARN_ONLY is not set CONFIG_MAGIC_SYSRQ=y diff --git a/arch/x86/crypto/.gitignore b/arch/x86/crypto/.gitignore new file mode 100644 index 000000000000..30be0400a439 --- /dev/null +++ b/arch/x86/crypto/.gitignore @@ -0,0 +1 @@ +poly1305-x86_64-cryptogams.S diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile index a450ad573dcb..d7ebccf26cba 100644 --- a/arch/x86/crypto/Makefile +++ b/arch/x86/crypto/Makefile @@ -8,8 +8,10 @@ OBJECT_FILES_NON_STANDARD := y avx_supported := $(call as-instr,vpxor %xmm0$(comma)%xmm0$(comma)%xmm0,yes,no) avx2_supported := $(call as-instr,vpgatherdd %ymm0$(comma)(%eax$(comma)%ymm1\ $(comma)4)$(comma)%ymm2,yes,no) +avx512_supported :=$(call as-instr,vpmovm2b %k1$(comma)%zmm5,yes,no) sha1_ni_supported :=$(call as-instr,sha1msg1 %xmm0$(comma)%xmm1,yes,no) sha256_ni_supported :=$(call as-instr,sha256msg1 %xmm0$(comma)%xmm1,yes,no) +adx_supported := $(call as-instr,adox %r10$(comma)%r10,yes,no) obj-$(CONFIG_CRYPTO_GLUE_HELPER_X86) += glue_helper.o @@ -23,7 +25,7 @@ obj-$(CONFIG_CRYPTO_CAMELLIA_X86_64) += camellia-x86_64.o obj-$(CONFIG_CRYPTO_BLOWFISH_X86_64) += blowfish-x86_64.o obj-$(CONFIG_CRYPTO_TWOFISH_X86_64) += twofish-x86_64.o obj-$(CONFIG_CRYPTO_TWOFISH_X86_64_3WAY) += twofish-x86_64-3way.o -obj-$(CONFIG_CRYPTO_CHACHA20_X86_64) += chacha20-x86_64.o +obj-$(CONFIG_CRYPTO_CHACHA20_X86_64) += chacha-x86_64.o obj-$(CONFIG_CRYPTO_SERPENT_SSE2_X86_64) += serpent-sse2-x86_64.o obj-$(CONFIG_CRYPTO_AES_NI_INTEL) += aesni-intel.o obj-$(CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL) += ghash-clmulni-intel.o @@ -46,6 +48,11 @@ obj-$(CONFIG_CRYPTO_MORUS1280_GLUE) += morus1280_glue.o obj-$(CONFIG_CRYPTO_MORUS640_SSE2) += morus640-sse2.o obj-$(CONFIG_CRYPTO_MORUS1280_SSE2) += morus1280-sse2.o +# These modules require the assembler to support ADX. +ifeq ($(adx_supported),yes) + obj-$(CONFIG_CRYPTO_CURVE25519_X86) += curve25519-x86_64.o +endif + # These modules require assembler to support AVX. ifeq ($(avx_supported),yes) obj-$(CONFIG_CRYPTO_CAMELLIA_AESNI_AVX_X86_64) += \ @@ -54,6 +61,7 @@ ifeq ($(avx_supported),yes) obj-$(CONFIG_CRYPTO_CAST6_AVX_X86_64) += cast6-avx-x86_64.o obj-$(CONFIG_CRYPTO_TWOFISH_AVX_X86_64) += twofish-avx-x86_64.o obj-$(CONFIG_CRYPTO_SERPENT_AVX_X86_64) += serpent-avx-x86_64.o + obj-$(CONFIG_CRYPTO_BLAKE2S_X86) += blake2s-x86_64.o endif # These modules require assembler to support AVX2. @@ -77,7 +85,7 @@ camellia-x86_64-y := camellia-x86_64-asm_64.o camellia_glue.o blowfish-x86_64-y := blowfish-x86_64-asm_64.o blowfish_glue.o twofish-x86_64-y := twofish-x86_64-asm_64.o twofish_glue.o twofish-x86_64-3way-y := twofish-x86_64-asm_64-3way.o twofish_glue_3way.o -chacha20-x86_64-y := chacha20-ssse3-x86_64.o chacha20_glue.o +chacha-x86_64-y := chacha-ssse3-x86_64.o chacha_glue.o serpent-sse2-x86_64-y := serpent-sse2-x86_64-asm_64.o serpent_sse2_glue.o aegis128-aesni-y := aegis128-aesni-asm.o aegis128-aesni-glue.o @@ -87,6 +95,12 @@ aegis256-aesni-y := aegis256-aesni-asm.o aegis256-aesni-glue.o morus640-sse2-y := morus640-sse2-asm.o morus640-sse2-glue.o morus1280-sse2-y := morus1280-sse2-asm.o morus1280-sse2-glue.o +blake2s-x86_64-y := blake2s-core.o blake2s-glue.o +poly1305-x86_64-y := poly1305-x86_64-cryptogams.o poly1305_glue.o +ifneq ($(CONFIG_CRYPTO_POLY1305_X86_64),) +targets += poly1305-x86_64-cryptogams.S +endif + ifeq ($(avx_supported),yes) camellia-aesni-avx-x86_64-y := camellia-aesni-avx-asm_64.o \ camellia_aesni_avx_glue.o @@ -100,20 +114,22 @@ endif ifeq ($(avx2_supported),yes) camellia-aesni-avx2-y := camellia-aesni-avx2-asm_64.o camellia_aesni_avx2_glue.o - chacha20-x86_64-y += chacha20-avx2-x86_64.o + chacha-x86_64-y += chacha-avx2-x86_64.o serpent-avx2-y := serpent-avx2-asm_64.o serpent_avx2_glue.o morus1280-avx2-y := morus1280-avx2-asm.o morus1280-avx2-glue.o endif +ifeq ($(avx512_supported),yes) + chacha-x86_64-y += chacha-avx512vl-x86_64.o +endif + aesni-intel-y := aesni-intel_asm.o aesni-intel_glue.o fpu.o aesni-intel-$(CONFIG_64BIT) += aesni-intel_avx-x86_64.o aes_ctrby8_avx-x86_64.o ghash-clmulni-intel-y := ghash-clmulni-intel_asm.o ghash-clmulni-intel_glue.o sha1-ssse3-y := sha1_ssse3_asm.o sha1_ssse3_glue.o -poly1305-x86_64-y := poly1305-sse2-x86_64.o poly1305_glue.o ifeq ($(avx2_supported),yes) sha1-ssse3-y += sha1_avx2_x86_64_asm.o -poly1305-x86_64-y += poly1305-avx2-x86_64.o endif ifeq ($(sha1_ni_supported),yes) sha1-ssse3-y += sha1_ni_asm.o @@ -127,3 +143,8 @@ sha256-ssse3-y += sha256_ni_asm.o endif sha512-ssse3-y := sha512-ssse3-asm.o sha512-avx-asm.o sha512-avx2-asm.o sha512_ssse3_glue.o crct10dif-pclmul-y := crct10dif-pcl-asm_64.o crct10dif-pclmul_glue.o + +quiet_cmd_perlasm = PERLASM $@ + cmd_perlasm = $(PERL) $< > $@ +$(obj)/%.S: $(src)/%.pl FORCE + $(call if_changed,perlasm) diff --git a/arch/x86/crypto/blake2s-core.S b/arch/x86/crypto/blake2s-core.S new file mode 100644 index 000000000000..8591938eee26 --- /dev/null +++ b/arch/x86/crypto/blake2s-core.S @@ -0,0 +1,258 @@ +/* SPDX-License-Identifier: GPL-2.0 OR MIT */ +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + * Copyright (C) 2017-2019 Samuel Neves . All Rights Reserved. + */ + +#include + +.section .rodata.cst32.BLAKE2S_IV, "aM", @progbits, 32 +.align 32 +IV: .octa 0xA54FF53A3C6EF372BB67AE856A09E667 + .octa 0x5BE0CD191F83D9AB9B05688C510E527F +.section .rodata.cst16.ROT16, "aM", @progbits, 16 +.align 16 +ROT16: .octa 0x0D0C0F0E09080B0A0504070601000302 +.section .rodata.cst16.ROR328, "aM", @progbits, 16 +.align 16 +ROR328: .octa 0x0C0F0E0D080B0A090407060500030201 +.section .rodata.cst64.BLAKE2S_SIGMA, "aM", @progbits, 160 +.align 64 +SIGMA: +.byte 0, 2, 4, 6, 1, 3, 5, 7, 14, 8, 10, 12, 15, 9, 11, 13 +.byte 14, 4, 9, 13, 10, 8, 15, 6, 5, 1, 0, 11, 3, 12, 2, 7 +.byte 11, 12, 5, 15, 8, 0, 2, 13, 9, 10, 3, 7, 4, 14, 6, 1 +.byte 7, 3, 13, 11, 9, 1, 12, 14, 15, 2, 5, 4, 8, 6, 10, 0 +.byte 9, 5, 2, 10, 0, 7, 4, 15, 3, 14, 11, 6, 13, 1, 12, 8 +.byte 2, 6, 0, 8, 12, 10, 11, 3, 1, 4, 7, 15, 9, 13, 5, 14 +.byte 12, 1, 14, 4, 5, 15, 13, 10, 8, 0, 6, 9, 11, 7, 3, 2 +.byte 13, 7, 12, 3, 11, 14, 1, 9, 2, 5, 15, 8, 10, 0, 4, 6 +.byte 6, 14, 11, 0, 15, 9, 3, 8, 10, 12, 13, 1, 5, 2, 7, 4 +.byte 10, 8, 7, 1, 2, 4, 6, 5, 13, 15, 9, 3, 0, 11, 14, 12 +#ifdef CONFIG_AS_AVX512 +.section .rodata.cst64.BLAKE2S_SIGMA2, "aM", @progbits, 640 +.align 64 +SIGMA2: +.long 0, 2, 4, 6, 1, 3, 5, 7, 14, 8, 10, 12, 15, 9, 11, 13 +.long 8, 2, 13, 15, 10, 9, 12, 3, 6, 4, 0, 14, 5, 11, 1, 7 +.long 11, 13, 8, 6, 5, 10, 14, 3, 2, 4, 12, 15, 1, 0, 7, 9 +.long 11, 10, 7, 0, 8, 15, 1, 13, 3, 6, 2, 12, 4, 14, 9, 5 +.long 4, 10, 9, 14, 15, 0, 11, 8, 1, 7, 3, 13, 2, 5, 6, 12 +.long 2, 11, 4, 15, 14, 3, 10, 8, 13, 6, 5, 7, 0, 12, 1, 9 +.long 4, 8, 15, 9, 14, 11, 13, 5, 3, 2, 1, 12, 6, 10, 7, 0 +.long 6, 13, 0, 14, 12, 2, 1, 11, 15, 4, 5, 8, 7, 9, 3, 10 +.long 15, 5, 4, 13, 10, 7, 3, 11, 12, 2, 0, 6, 9, 8, 1, 14 +.long 8, 7, 14, 11, 13, 15, 0, 12, 10, 4, 5, 6, 3, 2, 1, 9 +#endif /* CONFIG_AS_AVX512 */ + +.text +#ifdef CONFIG_AS_SSSE3 +ENTRY(blake2s_compress_ssse3) + testq %rdx,%rdx + je .Lendofloop + movdqu (%rdi),%xmm0 + movdqu 0x10(%rdi),%xmm1 + movdqa ROT16(%rip),%xmm12 + movdqa ROR328(%rip),%xmm13 + movdqu 0x20(%rdi),%xmm14 + movq %rcx,%xmm15 + leaq SIGMA+0xa0(%rip),%r8 + jmp .Lbeginofloop + .align 32 +.Lbeginofloop: + movdqa %xmm0,%xmm10 + movdqa %xmm1,%xmm11 + paddq %xmm15,%xmm14 + movdqa IV(%rip),%xmm2 + movdqa %xmm14,%xmm3 + pxor IV+0x10(%rip),%xmm3 + leaq SIGMA(%rip),%rcx +.Lroundloop: + movzbl (%rcx),%eax + movd (%rsi,%rax,4),%xmm4 + movzbl 0x1(%rcx),%eax + movd (%rsi,%rax,4),%xmm5 + movzbl 0x2(%rcx),%eax + movd (%rsi,%rax,4),%xmm6 + movzbl 0x3(%rcx),%eax + movd (%rsi,%rax,4),%xmm7 + punpckldq %xmm5,%xmm4 + punpckldq %xmm7,%xmm6 + punpcklqdq %xmm6,%xmm4 + paddd %xmm4,%xmm0 + paddd %xmm1,%xmm0 + pxor %xmm0,%xmm3 + pshufb %xmm12,%xmm3 + paddd %xmm3,%xmm2 + pxor %xmm2,%xmm1 + movdqa %xmm1,%xmm8 + psrld $0xc,%xmm1 + pslld $0x14,%xmm8 + por %xmm8,%xmm1 + movzbl 0x4(%rcx),%eax + movd (%rsi,%rax,4),%xmm5 + movzbl 0x5(%rcx),%eax + movd (%rsi,%rax,4),%xmm6 + movzbl 0x6(%rcx),%eax + movd (%rsi,%rax,4),%xmm7 + movzbl 0x7(%rcx),%eax + movd (%rsi,%rax,4),%xmm4 + punpckldq %xmm6,%xmm5 + punpckldq %xmm4,%xmm7 + punpcklqdq %xmm7,%xmm5 + paddd %xmm5,%xmm0 + paddd %xmm1,%xmm0 + pxor %xmm0,%xmm3 + pshufb %xmm13,%xmm3 + paddd %xmm3,%xmm2 + pxor %xmm2,%xmm1 + movdqa %xmm1,%xmm8 + psrld $0x7,%xmm1 + pslld $0x19,%xmm8 + por %xmm8,%xmm1 + pshufd $0x93,%xmm0,%xmm0 + pshufd $0x4e,%xmm3,%xmm3 + pshufd $0x39,%xmm2,%xmm2 + movzbl 0x8(%rcx),%eax + movd (%rsi,%rax,4),%xmm6 + movzbl 0x9(%rcx),%eax + movd (%rsi,%rax,4),%xmm7 + movzbl 0xa(%rcx),%eax + movd (%rsi,%rax,4),%xmm4 + movzbl 0xb(%rcx),%eax + movd (%rsi,%rax,4),%xmm5 + punpckldq %xmm7,%xmm6 + punpckldq %xmm5,%xmm4 + punpcklqdq %xmm4,%xmm6 + paddd %xmm6,%xmm0 + paddd %xmm1,%xmm0 + pxor %xmm0,%xmm3 + pshufb %xmm12,%xmm3 + paddd %xmm3,%xmm2 + pxor %xmm2,%xmm1 + movdqa %xmm1,%xmm8 + psrld $0xc,%xmm1 + pslld $0x14,%xmm8 + por %xmm8,%xmm1 + movzbl 0xc(%rcx),%eax + movd (%rsi,%rax,4),%xmm7 + movzbl 0xd(%rcx),%eax + movd (%rsi,%rax,4),%xmm4 + movzbl 0xe(%rcx),%eax + movd (%rsi,%rax,4),%xmm5 + movzbl 0xf(%rcx),%eax + movd (%rsi,%rax,4),%xmm6 + punpckldq %xmm4,%xmm7 + punpckldq %xmm6,%xmm5 + punpcklqdq %xmm5,%xmm7 + paddd %xmm7,%xmm0 + paddd %xmm1,%xmm0 + pxor %xmm0,%xmm3 + pshufb %xmm13,%xmm3 + paddd %xmm3,%xmm2 + pxor %xmm2,%xmm1 + movdqa %xmm1,%xmm8 + psrld $0x7,%xmm1 + pslld $0x19,%xmm8 + por %xmm8,%xmm1 + pshufd $0x39,%xmm0,%xmm0 + pshufd $0x4e,%xmm3,%xmm3 + pshufd $0x93,%xmm2,%xmm2 + addq $0x10,%rcx + cmpq %r8,%rcx + jnz .Lroundloop + pxor %xmm2,%xmm0 + pxor %xmm3,%xmm1 + pxor %xmm10,%xmm0 + pxor %xmm11,%xmm1 + addq $0x40,%rsi + decq %rdx + jnz .Lbeginofloop + movdqu %xmm0,(%rdi) + movdqu %xmm1,0x10(%rdi) + movdqu %xmm14,0x20(%rdi) +.Lendofloop: + ret +ENDPROC(blake2s_compress_ssse3) +#endif /* CONFIG_AS_SSSE3 */ + +#ifdef CONFIG_AS_AVX512 +ENTRY(blake2s_compress_avx512) + vmovdqu (%rdi),%xmm0 + vmovdqu 0x10(%rdi),%xmm1 + vmovdqu 0x20(%rdi),%xmm4 + vmovq %rcx,%xmm5 + vmovdqa IV(%rip),%xmm14 + vmovdqa IV+16(%rip),%xmm15 + jmp .Lblake2s_compress_avx512_mainloop +.align 32 +.Lblake2s_compress_avx512_mainloop: + vmovdqa %xmm0,%xmm10 + vmovdqa %xmm1,%xmm11 + vpaddq %xmm5,%xmm4,%xmm4 + vmovdqa %xmm14,%xmm2 + vpxor %xmm15,%xmm4,%xmm3 + vmovdqu (%rsi),%ymm6 + vmovdqu 0x20(%rsi),%ymm7 + addq $0x40,%rsi + leaq SIGMA2(%rip),%rax + movb $0xa,%cl +.Lblake2s_compress_avx512_roundloop: + addq $0x40,%rax + vmovdqa -0x40(%rax),%ymm8 + vmovdqa -0x20(%rax),%ymm9 + vpermi2d %ymm7,%ymm6,%ymm8 + vpermi2d %ymm7,%ymm6,%ymm9 + vmovdqa %ymm8,%ymm6 + vmovdqa %ymm9,%ymm7 + vpaddd %xmm8,%xmm0,%xmm0 + vpaddd %xmm1,%xmm0,%xmm0 + vpxor %xmm0,%xmm3,%xmm3 + vprord $0x10,%xmm3,%xmm3 + vpaddd %xmm3,%xmm2,%xmm2 + vpxor %xmm2,%xmm1,%xmm1 + vprord $0xc,%xmm1,%xmm1 + vextracti128 $0x1,%ymm8,%xmm8 + vpaddd %xmm8,%xmm0,%xmm0 + vpaddd %xmm1,%xmm0,%xmm0 + vpxor %xmm0,%xmm3,%xmm3 + vprord $0x8,%xmm3,%xmm3 + vpaddd %xmm3,%xmm2,%xmm2 + vpxor %xmm2,%xmm1,%xmm1 + vprord $0x7,%xmm1,%xmm1 + vpshufd $0x93,%xmm0,%xmm0 + vpshufd $0x4e,%xmm3,%xmm3 + vpshufd $0x39,%xmm2,%xmm2 + vpaddd %xmm9,%xmm0,%xmm0 + vpaddd %xmm1,%xmm0,%xmm0 + vpxor %xmm0,%xmm3,%xmm3 + vprord $0x10,%xmm3,%xmm3 + vpaddd %xmm3,%xmm2,%xmm2 + vpxor %xmm2,%xmm1,%xmm1 + vprord $0xc,%xmm1,%xmm1 + vextracti128 $0x1,%ymm9,%xmm9 + vpaddd %xmm9,%xmm0,%xmm0 + vpaddd %xmm1,%xmm0,%xmm0 + vpxor %xmm0,%xmm3,%xmm3 + vprord $0x8,%xmm3,%xmm3 + vpaddd %xmm3,%xmm2,%xmm2 + vpxor %xmm2,%xmm1,%xmm1 + vprord $0x7,%xmm1,%xmm1 + vpshufd $0x39,%xmm0,%xmm0 + vpshufd $0x4e,%xmm3,%xmm3 + vpshufd $0x93,%xmm2,%xmm2 + decb %cl + jne .Lblake2s_compress_avx512_roundloop + vpxor %xmm10,%xmm0,%xmm0 + vpxor %xmm11,%xmm1,%xmm1 + vpxor %xmm2,%xmm0,%xmm0 + vpxor %xmm3,%xmm1,%xmm1 + decq %rdx + jne .Lblake2s_compress_avx512_mainloop + vmovdqu %xmm0,(%rdi) + vmovdqu %xmm1,0x10(%rdi) + vmovdqu %xmm4,0x20(%rdi) + vzeroupper + retq +ENDPROC(blake2s_compress_avx512) +#endif /* CONFIG_AS_AVX512 */ diff --git a/arch/x86/crypto/blake2s-glue.c b/arch/x86/crypto/blake2s-glue.c new file mode 100644 index 000000000000..5fe0a86d8fc9 --- /dev/null +++ b/arch/x86/crypto/blake2s-glue.c @@ -0,0 +1,232 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#include +#include + +#include +#include +#include +#include + +#include +#include +#include +#include + +asmlinkage void blake2s_compress_ssse3(struct blake2s_state *state, + const u8 *block, const size_t nblocks, + const u32 inc); +asmlinkage void blake2s_compress_avx512(struct blake2s_state *state, + const u8 *block, const size_t nblocks, + const u32 inc); + +static __ro_after_init DEFINE_STATIC_KEY_FALSE(blake2s_use_ssse3); +static __ro_after_init DEFINE_STATIC_KEY_FALSE(blake2s_use_avx512); + +void blake2s_compress_arch(struct blake2s_state *state, + const u8 *block, size_t nblocks, + const u32 inc) +{ + /* SIMD disables preemption, so relax after processing each page. */ + BUILD_BUG_ON(SZ_4K / BLAKE2S_BLOCK_SIZE < 8); + + if (!static_branch_likely(&blake2s_use_ssse3) || !may_use_simd()) { + blake2s_compress_generic(state, block, nblocks, inc); + return; + } + + do { + const size_t blocks = min_t(size_t, nblocks, + SZ_4K / BLAKE2S_BLOCK_SIZE); + + kernel_fpu_begin(); + if (IS_ENABLED(CONFIG_AS_AVX512) && + static_branch_likely(&blake2s_use_avx512)) + blake2s_compress_avx512(state, block, blocks, inc); + else + blake2s_compress_ssse3(state, block, blocks, inc); + kernel_fpu_end(); + + nblocks -= blocks; + block += blocks * BLAKE2S_BLOCK_SIZE; + } while (nblocks); +} +EXPORT_SYMBOL(blake2s_compress_arch); + +static int crypto_blake2s_setkey(struct crypto_shash *tfm, const u8 *key, + unsigned int keylen) +{ + struct blake2s_tfm_ctx *tctx = crypto_shash_ctx(tfm); + + if (keylen == 0 || keylen > BLAKE2S_KEY_SIZE) { + crypto_shash_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN); + return -EINVAL; + } + + memcpy(tctx->key, key, keylen); + tctx->keylen = keylen; + + return 0; +} + +static int crypto_blake2s_init(struct shash_desc *desc) +{ + struct blake2s_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm); + struct blake2s_state *state = shash_desc_ctx(desc); + const int outlen = crypto_shash_digestsize(desc->tfm); + + if (tctx->keylen) + blake2s_init_key(state, outlen, tctx->key, tctx->keylen); + else + blake2s_init(state, outlen); + + return 0; +} + +static int crypto_blake2s_update(struct shash_desc *desc, const u8 *in, + unsigned int inlen) +{ + struct blake2s_state *state = shash_desc_ctx(desc); + const size_t fill = BLAKE2S_BLOCK_SIZE - state->buflen; + + if (unlikely(!inlen)) + return 0; + if (inlen > fill) { + memcpy(state->buf + state->buflen, in, fill); + blake2s_compress_arch(state, state->buf, 1, BLAKE2S_BLOCK_SIZE); + state->buflen = 0; + in += fill; + inlen -= fill; + } + if (inlen > BLAKE2S_BLOCK_SIZE) { + const size_t nblocks = DIV_ROUND_UP(inlen, BLAKE2S_BLOCK_SIZE); + /* Hash one less (full) block than strictly possible */ + blake2s_compress_arch(state, in, nblocks - 1, BLAKE2S_BLOCK_SIZE); + in += BLAKE2S_BLOCK_SIZE * (nblocks - 1); + inlen -= BLAKE2S_BLOCK_SIZE * (nblocks - 1); + } + memcpy(state->buf + state->buflen, in, inlen); + state->buflen += inlen; + + return 0; +} + +static int crypto_blake2s_final(struct shash_desc *desc, u8 *out) +{ + struct blake2s_state *state = shash_desc_ctx(desc); + + blake2s_set_lastblock(state); + memset(state->buf + state->buflen, 0, + BLAKE2S_BLOCK_SIZE - state->buflen); /* Padding */ + blake2s_compress_arch(state, state->buf, 1, state->buflen); + cpu_to_le32_array(state->h, ARRAY_SIZE(state->h)); + memcpy(out, state->h, state->outlen); + memzero_explicit(state, sizeof(*state)); + + return 0; +} + +static struct shash_alg blake2s_algs[] = {{ + .base.cra_name = "blake2s-128", + .base.cra_driver_name = "blake2s-128-x86", + .base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, + .base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx), + .base.cra_priority = 200, + .base.cra_blocksize = BLAKE2S_BLOCK_SIZE, + .base.cra_module = THIS_MODULE, + + .digestsize = BLAKE2S_128_HASH_SIZE, + .setkey = crypto_blake2s_setkey, + .init = crypto_blake2s_init, + .update = crypto_blake2s_update, + .final = crypto_blake2s_final, + .descsize = sizeof(struct blake2s_state), +}, { + .base.cra_name = "blake2s-160", + .base.cra_driver_name = "blake2s-160-x86", + .base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, + .base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx), + .base.cra_priority = 200, + .base.cra_blocksize = BLAKE2S_BLOCK_SIZE, + .base.cra_module = THIS_MODULE, + + .digestsize = BLAKE2S_160_HASH_SIZE, + .setkey = crypto_blake2s_setkey, + .init = crypto_blake2s_init, + .update = crypto_blake2s_update, + .final = crypto_blake2s_final, + .descsize = sizeof(struct blake2s_state), +}, { + .base.cra_name = "blake2s-224", + .base.cra_driver_name = "blake2s-224-x86", + .base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, + .base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx), + .base.cra_priority = 200, + .base.cra_blocksize = BLAKE2S_BLOCK_SIZE, + .base.cra_module = THIS_MODULE, + + .digestsize = BLAKE2S_224_HASH_SIZE, + .setkey = crypto_blake2s_setkey, + .init = crypto_blake2s_init, + .update = crypto_blake2s_update, + .final = crypto_blake2s_final, + .descsize = sizeof(struct blake2s_state), +}, { + .base.cra_name = "blake2s-256", + .base.cra_driver_name = "blake2s-256-x86", + .base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, + .base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx), + .base.cra_priority = 200, + .base.cra_blocksize = BLAKE2S_BLOCK_SIZE, + .base.cra_module = THIS_MODULE, + + .digestsize = BLAKE2S_256_HASH_SIZE, + .setkey = crypto_blake2s_setkey, + .init = crypto_blake2s_init, + .update = crypto_blake2s_update, + .final = crypto_blake2s_final, + .descsize = sizeof(struct blake2s_state), +}}; + +static int __init blake2s_mod_init(void) +{ + if (!boot_cpu_has(X86_FEATURE_SSSE3)) + return 0; + + static_branch_enable(&blake2s_use_ssse3); + + if (IS_ENABLED(CONFIG_AS_AVX512) && + boot_cpu_has(X86_FEATURE_AVX) && + boot_cpu_has(X86_FEATURE_AVX2) && + boot_cpu_has(X86_FEATURE_AVX512F) && + boot_cpu_has(X86_FEATURE_AVX512VL) && + cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM | + XFEATURE_MASK_AVX512, NULL)) + static_branch_enable(&blake2s_use_avx512); + + return IS_REACHABLE(CONFIG_CRYPTO_HASH) ? + crypto_register_shashes(blake2s_algs, + ARRAY_SIZE(blake2s_algs)) : 0; +} + +static void __exit blake2s_mod_exit(void) +{ + if (IS_REACHABLE(CONFIG_CRYPTO_HASH) && boot_cpu_has(X86_FEATURE_SSSE3)) + crypto_unregister_shashes(blake2s_algs, ARRAY_SIZE(blake2s_algs)); +} + +module_init(blake2s_mod_init); +module_exit(blake2s_mod_exit); + +MODULE_ALIAS_CRYPTO("blake2s-128"); +MODULE_ALIAS_CRYPTO("blake2s-128-x86"); +MODULE_ALIAS_CRYPTO("blake2s-160"); +MODULE_ALIAS_CRYPTO("blake2s-160-x86"); +MODULE_ALIAS_CRYPTO("blake2s-224"); +MODULE_ALIAS_CRYPTO("blake2s-224-x86"); +MODULE_ALIAS_CRYPTO("blake2s-256"); +MODULE_ALIAS_CRYPTO("blake2s-256-x86"); +MODULE_LICENSE("GPL v2"); diff --git a/arch/x86/crypto/chacha-avx2-x86_64.S b/arch/x86/crypto/chacha-avx2-x86_64.S new file mode 100644 index 000000000000..32903fd450af --- /dev/null +++ b/arch/x86/crypto/chacha-avx2-x86_64.S @@ -0,0 +1,1025 @@ +/* + * ChaCha 256-bit cipher algorithm, x64 AVX2 functions + * + * Copyright (C) 2015 Martin Willi + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + */ + +#include + +.section .rodata.cst32.ROT8, "aM", @progbits, 32 +.align 32 +ROT8: .octa 0x0e0d0c0f0a09080b0605040702010003 + .octa 0x0e0d0c0f0a09080b0605040702010003 + +.section .rodata.cst32.ROT16, "aM", @progbits, 32 +.align 32 +ROT16: .octa 0x0d0c0f0e09080b0a0504070601000302 + .octa 0x0d0c0f0e09080b0a0504070601000302 + +.section .rodata.cst32.CTRINC, "aM", @progbits, 32 +.align 32 +CTRINC: .octa 0x00000003000000020000000100000000 + .octa 0x00000007000000060000000500000004 + +.section .rodata.cst32.CTR2BL, "aM", @progbits, 32 +.align 32 +CTR2BL: .octa 0x00000000000000000000000000000000 + .octa 0x00000000000000000000000000000001 + +.section .rodata.cst32.CTR4BL, "aM", @progbits, 32 +.align 32 +CTR4BL: .octa 0x00000000000000000000000000000002 + .octa 0x00000000000000000000000000000003 + +.text + +ENTRY(chacha_2block_xor_avx2) + # %rdi: Input state matrix, s + # %rsi: up to 2 data blocks output, o + # %rdx: up to 2 data blocks input, i + # %rcx: input/output length in bytes + # %r8d: nrounds + + # This function encrypts two ChaCha blocks by loading the state + # matrix twice across four AVX registers. It performs matrix operations + # on four words in each matrix in parallel, but requires shuffling to + # rearrange the words after each round. + + vzeroupper + + # x0..3[0-2] = s0..3 + vbroadcasti128 0x00(%rdi),%ymm0 + vbroadcasti128 0x10(%rdi),%ymm1 + vbroadcasti128 0x20(%rdi),%ymm2 + vbroadcasti128 0x30(%rdi),%ymm3 + + vpaddd CTR2BL(%rip),%ymm3,%ymm3 + + vmovdqa %ymm0,%ymm8 + vmovdqa %ymm1,%ymm9 + vmovdqa %ymm2,%ymm10 + vmovdqa %ymm3,%ymm11 + + vmovdqa ROT8(%rip),%ymm4 + vmovdqa ROT16(%rip),%ymm5 + + mov %rcx,%rax + +.Ldoubleround: + + # x0 += x1, x3 = rotl32(x3 ^ x0, 16) + vpaddd %ymm1,%ymm0,%ymm0 + vpxor %ymm0,%ymm3,%ymm3 + vpshufb %ymm5,%ymm3,%ymm3 + + # x2 += x3, x1 = rotl32(x1 ^ x2, 12) + vpaddd %ymm3,%ymm2,%ymm2 + vpxor %ymm2,%ymm1,%ymm1 + vmovdqa %ymm1,%ymm6 + vpslld $12,%ymm6,%ymm6 + vpsrld $20,%ymm1,%ymm1 + vpor %ymm6,%ymm1,%ymm1 + + # x0 += x1, x3 = rotl32(x3 ^ x0, 8) + vpaddd %ymm1,%ymm0,%ymm0 + vpxor %ymm0,%ymm3,%ymm3 + vpshufb %ymm4,%ymm3,%ymm3 + + # x2 += x3, x1 = rotl32(x1 ^ x2, 7) + vpaddd %ymm3,%ymm2,%ymm2 + vpxor %ymm2,%ymm1,%ymm1 + vmovdqa %ymm1,%ymm7 + vpslld $7,%ymm7,%ymm7 + vpsrld $25,%ymm1,%ymm1 + vpor %ymm7,%ymm1,%ymm1 + + # x1 = shuffle32(x1, MASK(0, 3, 2, 1)) + vpshufd $0x39,%ymm1,%ymm1 + # x2 = shuffle32(x2, MASK(1, 0, 3, 2)) + vpshufd $0x4e,%ymm2,%ymm2 + # x3 = shuffle32(x3, MASK(2, 1, 0, 3)) + vpshufd $0x93,%ymm3,%ymm3 + + # x0 += x1, x3 = rotl32(x3 ^ x0, 16) + vpaddd %ymm1,%ymm0,%ymm0 + vpxor %ymm0,%ymm3,%ymm3 + vpshufb %ymm5,%ymm3,%ymm3 + + # x2 += x3, x1 = rotl32(x1 ^ x2, 12) + vpaddd %ymm3,%ymm2,%ymm2 + vpxor %ymm2,%ymm1,%ymm1 + vmovdqa %ymm1,%ymm6 + vpslld $12,%ymm6,%ymm6 + vpsrld $20,%ymm1,%ymm1 + vpor %ymm6,%ymm1,%ymm1 + + # x0 += x1, x3 = rotl32(x3 ^ x0, 8) + vpaddd %ymm1,%ymm0,%ymm0 + vpxor %ymm0,%ymm3,%ymm3 + vpshufb %ymm4,%ymm3,%ymm3 + + # x2 += x3, x1 = rotl32(x1 ^ x2, 7) + vpaddd %ymm3,%ymm2,%ymm2 + vpxor %ymm2,%ymm1,%ymm1 + vmovdqa %ymm1,%ymm7 + vpslld $7,%ymm7,%ymm7 + vpsrld $25,%ymm1,%ymm1 + vpor %ymm7,%ymm1,%ymm1 + + # x1 = shuffle32(x1, MASK(2, 1, 0, 3)) + vpshufd $0x93,%ymm1,%ymm1 + # x2 = shuffle32(x2, MASK(1, 0, 3, 2)) + vpshufd $0x4e,%ymm2,%ymm2 + # x3 = shuffle32(x3, MASK(0, 3, 2, 1)) + vpshufd $0x39,%ymm3,%ymm3 + + sub $2,%r8d + jnz .Ldoubleround + + # o0 = i0 ^ (x0 + s0) + vpaddd %ymm8,%ymm0,%ymm7 + cmp $0x10,%rax + jl .Lxorpart2 + vpxor 0x00(%rdx),%xmm7,%xmm6 + vmovdqu %xmm6,0x00(%rsi) + vextracti128 $1,%ymm7,%xmm0 + # o1 = i1 ^ (x1 + s1) + vpaddd %ymm9,%ymm1,%ymm7 + cmp $0x20,%rax + jl .Lxorpart2 + vpxor 0x10(%rdx),%xmm7,%xmm6 + vmovdqu %xmm6,0x10(%rsi) + vextracti128 $1,%ymm7,%xmm1 + # o2 = i2 ^ (x2 + s2) + vpaddd %ymm10,%ymm2,%ymm7 + cmp $0x30,%rax + jl .Lxorpart2 + vpxor 0x20(%rdx),%xmm7,%xmm6 + vmovdqu %xmm6,0x20(%rsi) + vextracti128 $1,%ymm7,%xmm2 + # o3 = i3 ^ (x3 + s3) + vpaddd %ymm11,%ymm3,%ymm7 + cmp $0x40,%rax + jl .Lxorpart2 + vpxor 0x30(%rdx),%xmm7,%xmm6 + vmovdqu %xmm6,0x30(%rsi) + vextracti128 $1,%ymm7,%xmm3 + + # xor and write second block + vmovdqa %xmm0,%xmm7 + cmp $0x50,%rax + jl .Lxorpart2 + vpxor 0x40(%rdx),%xmm7,%xmm6 + vmovdqu %xmm6,0x40(%rsi) + + vmovdqa %xmm1,%xmm7 + cmp $0x60,%rax + jl .Lxorpart2 + vpxor 0x50(%rdx),%xmm7,%xmm6 + vmovdqu %xmm6,0x50(%rsi) + + vmovdqa %xmm2,%xmm7 + cmp $0x70,%rax + jl .Lxorpart2 + vpxor 0x60(%rdx),%xmm7,%xmm6 + vmovdqu %xmm6,0x60(%rsi) + + vmovdqa %xmm3,%xmm7 + cmp $0x80,%rax + jl .Lxorpart2 + vpxor 0x70(%rdx),%xmm7,%xmm6 + vmovdqu %xmm6,0x70(%rsi) + +.Ldone2: + vzeroupper + ret + +.Lxorpart2: + # xor remaining bytes from partial register into output + mov %rax,%r9 + and $0x0f,%r9 + jz .Ldone2 + and $~0x0f,%rax + + mov %rsi,%r11 + + lea 8(%rsp),%r10 + sub $0x10,%rsp + and $~31,%rsp + + lea (%rdx,%rax),%rsi + mov %rsp,%rdi + mov %r9,%rcx + rep movsb + + vpxor 0x00(%rsp),%xmm7,%xmm7 + vmovdqa %xmm7,0x00(%rsp) + + mov %rsp,%rsi + lea (%r11,%rax),%rdi + mov %r9,%rcx + rep movsb + + lea -8(%r10),%rsp + jmp .Ldone2 + +ENDPROC(chacha_2block_xor_avx2) + +ENTRY(chacha_4block_xor_avx2) + # %rdi: Input state matrix, s + # %rsi: up to 4 data blocks output, o + # %rdx: up to 4 data blocks input, i + # %rcx: input/output length in bytes + # %r8d: nrounds + + # This function encrypts four ChaCha blocks by loading the state + # matrix four times across eight AVX registers. It performs matrix + # operations on four words in two matrices in parallel, sequentially + # to the operations on the four words of the other two matrices. The + # required word shuffling has a rather high latency, we can do the + # arithmetic on two matrix-pairs without much slowdown. + + vzeroupper + + # x0..3[0-4] = s0..3 + vbroadcasti128 0x00(%rdi),%ymm0 + vbroadcasti128 0x10(%rdi),%ymm1 + vbroadcasti128 0x20(%rdi),%ymm2 + vbroadcasti128 0x30(%rdi),%ymm3 + + vmovdqa %ymm0,%ymm4 + vmovdqa %ymm1,%ymm5 + vmovdqa %ymm2,%ymm6 + vmovdqa %ymm3,%ymm7 + + vpaddd CTR2BL(%rip),%ymm3,%ymm3 + vpaddd CTR4BL(%rip),%ymm7,%ymm7 + + vmovdqa %ymm0,%ymm11 + vmovdqa %ymm1,%ymm12 + vmovdqa %ymm2,%ymm13 + vmovdqa %ymm3,%ymm14 + vmovdqa %ymm7,%ymm15 + + vmovdqa ROT8(%rip),%ymm8 + vmovdqa ROT16(%rip),%ymm9 + + mov %rcx,%rax + +.Ldoubleround4: + + # x0 += x1, x3 = rotl32(x3 ^ x0, 16) + vpaddd %ymm1,%ymm0,%ymm0 + vpxor %ymm0,%ymm3,%ymm3 + vpshufb %ymm9,%ymm3,%ymm3 + + vpaddd %ymm5,%ymm4,%ymm4 + vpxor %ymm4,%ymm7,%ymm7 + vpshufb %ymm9,%ymm7,%ymm7 + + # x2 += x3, x1 = rotl32(x1 ^ x2, 12) + vpaddd %ymm3,%ymm2,%ymm2 + vpxor %ymm2,%ymm1,%ymm1 + vmovdqa %ymm1,%ymm10 + vpslld $12,%ymm10,%ymm10 + vpsrld $20,%ymm1,%ymm1 + vpor %ymm10,%ymm1,%ymm1 + + vpaddd %ymm7,%ymm6,%ymm6 + vpxor %ymm6,%ymm5,%ymm5 + vmovdqa %ymm5,%ymm10 + vpslld $12,%ymm10,%ymm10 + vpsrld $20,%ymm5,%ymm5 + vpor %ymm10,%ymm5,%ymm5 + + # x0 += x1, x3 = rotl32(x3 ^ x0, 8) + vpaddd %ymm1,%ymm0,%ymm0 + vpxor %ymm0,%ymm3,%ymm3 + vpshufb %ymm8,%ymm3,%ymm3 + + vpaddd %ymm5,%ymm4,%ymm4 + vpxor %ymm4,%ymm7,%ymm7 + vpshufb %ymm8,%ymm7,%ymm7 + + # x2 += x3, x1 = rotl32(x1 ^ x2, 7) + vpaddd %ymm3,%ymm2,%ymm2 + vpxor %ymm2,%ymm1,%ymm1 + vmovdqa %ymm1,%ymm10 + vpslld $7,%ymm10,%ymm10 + vpsrld $25,%ymm1,%ymm1 + vpor %ymm10,%ymm1,%ymm1 + + vpaddd %ymm7,%ymm6,%ymm6 + vpxor %ymm6,%ymm5,%ymm5 + vmovdqa %ymm5,%ymm10 + vpslld $7,%ymm10,%ymm10 + vpsrld $25,%ymm5,%ymm5 + vpor %ymm10,%ymm5,%ymm5 + + # x1 = shuffle32(x1, MASK(0, 3, 2, 1)) + vpshufd $0x39,%ymm1,%ymm1 + vpshufd $0x39,%ymm5,%ymm5 + # x2 = shuffle32(x2, MASK(1, 0, 3, 2)) + vpshufd $0x4e,%ymm2,%ymm2 + vpshufd $0x4e,%ymm6,%ymm6 + # x3 = shuffle32(x3, MASK(2, 1, 0, 3)) + vpshufd $0x93,%ymm3,%ymm3 + vpshufd $0x93,%ymm7,%ymm7 + + # x0 += x1, x3 = rotl32(x3 ^ x0, 16) + vpaddd %ymm1,%ymm0,%ymm0 + vpxor %ymm0,%ymm3,%ymm3 + vpshufb %ymm9,%ymm3,%ymm3 + + vpaddd %ymm5,%ymm4,%ymm4 + vpxor %ymm4,%ymm7,%ymm7 + vpshufb %ymm9,%ymm7,%ymm7 + + # x2 += x3, x1 = rotl32(x1 ^ x2, 12) + vpaddd %ymm3,%ymm2,%ymm2 + vpxor %ymm2,%ymm1,%ymm1 + vmovdqa %ymm1,%ymm10 + vpslld $12,%ymm10,%ymm10 + vpsrld $20,%ymm1,%ymm1 + vpor %ymm10,%ymm1,%ymm1 + + vpaddd %ymm7,%ymm6,%ymm6 + vpxor %ymm6,%ymm5,%ymm5 + vmovdqa %ymm5,%ymm10 + vpslld $12,%ymm10,%ymm10 + vpsrld $20,%ymm5,%ymm5 + vpor %ymm10,%ymm5,%ymm5 + + # x0 += x1, x3 = rotl32(x3 ^ x0, 8) + vpaddd %ymm1,%ymm0,%ymm0 + vpxor %ymm0,%ymm3,%ymm3 + vpshufb %ymm8,%ymm3,%ymm3 + + vpaddd %ymm5,%ymm4,%ymm4 + vpxor %ymm4,%ymm7,%ymm7 + vpshufb %ymm8,%ymm7,%ymm7 + + # x2 += x3, x1 = rotl32(x1 ^ x2, 7) + vpaddd %ymm3,%ymm2,%ymm2 + vpxor %ymm2,%ymm1,%ymm1 + vmovdqa %ymm1,%ymm10 + vpslld $7,%ymm10,%ymm10 + vpsrld $25,%ymm1,%ymm1 + vpor %ymm10,%ymm1,%ymm1 + + vpaddd %ymm7,%ymm6,%ymm6 + vpxor %ymm6,%ymm5,%ymm5 + vmovdqa %ymm5,%ymm10 + vpslld $7,%ymm10,%ymm10 + vpsrld $25,%ymm5,%ymm5 + vpor %ymm10,%ymm5,%ymm5 + + # x1 = shuffle32(x1, MASK(2, 1, 0, 3)) + vpshufd $0x93,%ymm1,%ymm1 + vpshufd $0x93,%ymm5,%ymm5 + # x2 = shuffle32(x2, MASK(1, 0, 3, 2)) + vpshufd $0x4e,%ymm2,%ymm2 + vpshufd $0x4e,%ymm6,%ymm6 + # x3 = shuffle32(x3, MASK(0, 3, 2, 1)) + vpshufd $0x39,%ymm3,%ymm3 + vpshufd $0x39,%ymm7,%ymm7 + + sub $2,%r8d + jnz .Ldoubleround4 + + # o0 = i0 ^ (x0 + s0), first block + vpaddd %ymm11,%ymm0,%ymm10 + cmp $0x10,%rax + jl .Lxorpart4 + vpxor 0x00(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0x00(%rsi) + vextracti128 $1,%ymm10,%xmm0 + # o1 = i1 ^ (x1 + s1), first block + vpaddd %ymm12,%ymm1,%ymm10 + cmp $0x20,%rax + jl .Lxorpart4 + vpxor 0x10(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0x10(%rsi) + vextracti128 $1,%ymm10,%xmm1 + # o2 = i2 ^ (x2 + s2), first block + vpaddd %ymm13,%ymm2,%ymm10 + cmp $0x30,%rax + jl .Lxorpart4 + vpxor 0x20(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0x20(%rsi) + vextracti128 $1,%ymm10,%xmm2 + # o3 = i3 ^ (x3 + s3), first block + vpaddd %ymm14,%ymm3,%ymm10 + cmp $0x40,%rax + jl .Lxorpart4 + vpxor 0x30(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0x30(%rsi) + vextracti128 $1,%ymm10,%xmm3 + + # xor and write second block + vmovdqa %xmm0,%xmm10 + cmp $0x50,%rax + jl .Lxorpart4 + vpxor 0x40(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0x40(%rsi) + + vmovdqa %xmm1,%xmm10 + cmp $0x60,%rax + jl .Lxorpart4 + vpxor 0x50(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0x50(%rsi) + + vmovdqa %xmm2,%xmm10 + cmp $0x70,%rax + jl .Lxorpart4 + vpxor 0x60(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0x60(%rsi) + + vmovdqa %xmm3,%xmm10 + cmp $0x80,%rax + jl .Lxorpart4 + vpxor 0x70(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0x70(%rsi) + + # o0 = i0 ^ (x0 + s0), third block + vpaddd %ymm11,%ymm4,%ymm10 + cmp $0x90,%rax + jl .Lxorpart4 + vpxor 0x80(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0x80(%rsi) + vextracti128 $1,%ymm10,%xmm4 + # o1 = i1 ^ (x1 + s1), third block + vpaddd %ymm12,%ymm5,%ymm10 + cmp $0xa0,%rax + jl .Lxorpart4 + vpxor 0x90(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0x90(%rsi) + vextracti128 $1,%ymm10,%xmm5 + # o2 = i2 ^ (x2 + s2), third block + vpaddd %ymm13,%ymm6,%ymm10 + cmp $0xb0,%rax + jl .Lxorpart4 + vpxor 0xa0(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0xa0(%rsi) + vextracti128 $1,%ymm10,%xmm6 + # o3 = i3 ^ (x3 + s3), third block + vpaddd %ymm15,%ymm7,%ymm10 + cmp $0xc0,%rax + jl .Lxorpart4 + vpxor 0xb0(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0xb0(%rsi) + vextracti128 $1,%ymm10,%xmm7 + + # xor and write fourth block + vmovdqa %xmm4,%xmm10 + cmp $0xd0,%rax + jl .Lxorpart4 + vpxor 0xc0(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0xc0(%rsi) + + vmovdqa %xmm5,%xmm10 + cmp $0xe0,%rax + jl .Lxorpart4 + vpxor 0xd0(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0xd0(%rsi) + + vmovdqa %xmm6,%xmm10 + cmp $0xf0,%rax + jl .Lxorpart4 + vpxor 0xe0(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0xe0(%rsi) + + vmovdqa %xmm7,%xmm10 + cmp $0x100,%rax + jl .Lxorpart4 + vpxor 0xf0(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0xf0(%rsi) + +.Ldone4: + vzeroupper + ret + +.Lxorpart4: + # xor remaining bytes from partial register into output + mov %rax,%r9 + and $0x0f,%r9 + jz .Ldone4 + and $~0x0f,%rax + + mov %rsi,%r11 + + lea 8(%rsp),%r10 + sub $0x10,%rsp + and $~31,%rsp + + lea (%rdx,%rax),%rsi + mov %rsp,%rdi + mov %r9,%rcx + rep movsb + + vpxor 0x00(%rsp),%xmm10,%xmm10 + vmovdqa %xmm10,0x00(%rsp) + + mov %rsp,%rsi + lea (%r11,%rax),%rdi + mov %r9,%rcx + rep movsb + + lea -8(%r10),%rsp + jmp .Ldone4 + +ENDPROC(chacha_4block_xor_avx2) + +ENTRY(chacha_8block_xor_avx2) + # %rdi: Input state matrix, s + # %rsi: up to 8 data blocks output, o + # %rdx: up to 8 data blocks input, i + # %rcx: input/output length in bytes + # %r8d: nrounds + + # This function encrypts eight consecutive ChaCha blocks by loading + # the state matrix in AVX registers eight times. As we need some + # scratch registers, we save the first four registers on the stack. The + # algorithm performs each operation on the corresponding word of each + # state matrix, hence requires no word shuffling. For final XORing step + # we transpose the matrix by interleaving 32-, 64- and then 128-bit + # words, which allows us to do XOR in AVX registers. 8/16-bit word + # rotation is done with the slightly better performing byte shuffling, + # 7/12-bit word rotation uses traditional shift+OR. + + vzeroupper + # 4 * 32 byte stack, 32-byte aligned + lea 8(%rsp),%r10 + and $~31, %rsp + sub $0x80, %rsp + mov %rcx,%rax + + # x0..15[0-7] = s[0..15] + vpbroadcastd 0x00(%rdi),%ymm0 + vpbroadcastd 0x04(%rdi),%ymm1 + vpbroadcastd 0x08(%rdi),%ymm2 + vpbroadcastd 0x0c(%rdi),%ymm3 + vpbroadcastd 0x10(%rdi),%ymm4 + vpbroadcastd 0x14(%rdi),%ymm5 + vpbroadcastd 0x18(%rdi),%ymm6 + vpbroadcastd 0x1c(%rdi),%ymm7 + vpbroadcastd 0x20(%rdi),%ymm8 + vpbroadcastd 0x24(%rdi),%ymm9 + vpbroadcastd 0x28(%rdi),%ymm10 + vpbroadcastd 0x2c(%rdi),%ymm11 + vpbroadcastd 0x30(%rdi),%ymm12 + vpbroadcastd 0x34(%rdi),%ymm13 + vpbroadcastd 0x38(%rdi),%ymm14 + vpbroadcastd 0x3c(%rdi),%ymm15 + # x0..3 on stack + vmovdqa %ymm0,0x00(%rsp) + vmovdqa %ymm1,0x20(%rsp) + vmovdqa %ymm2,0x40(%rsp) + vmovdqa %ymm3,0x60(%rsp) + + vmovdqa CTRINC(%rip),%ymm1 + vmovdqa ROT8(%rip),%ymm2 + vmovdqa ROT16(%rip),%ymm3 + + # x12 += counter values 0-3 + vpaddd %ymm1,%ymm12,%ymm12 + +.Ldoubleround8: + # x0 += x4, x12 = rotl32(x12 ^ x0, 16) + vpaddd 0x00(%rsp),%ymm4,%ymm0 + vmovdqa %ymm0,0x00(%rsp) + vpxor %ymm0,%ymm12,%ymm12 + vpshufb %ymm3,%ymm12,%ymm12 + # x1 += x5, x13 = rotl32(x13 ^ x1, 16) + vpaddd 0x20(%rsp),%ymm5,%ymm0 + vmovdqa %ymm0,0x20(%rsp) + vpxor %ymm0,%ymm13,%ymm13 + vpshufb %ymm3,%ymm13,%ymm13 + # x2 += x6, x14 = rotl32(x14 ^ x2, 16) + vpaddd 0x40(%rsp),%ymm6,%ymm0 + vmovdqa %ymm0,0x40(%rsp) + vpxor %ymm0,%ymm14,%ymm14 + vpshufb %ymm3,%ymm14,%ymm14 + # x3 += x7, x15 = rotl32(x15 ^ x3, 16) + vpaddd 0x60(%rsp),%ymm7,%ymm0 + vmovdqa %ymm0,0x60(%rsp) + vpxor %ymm0,%ymm15,%ymm15 + vpshufb %ymm3,%ymm15,%ymm15 + + # x8 += x12, x4 = rotl32(x4 ^ x8, 12) + vpaddd %ymm12,%ymm8,%ymm8 + vpxor %ymm8,%ymm4,%ymm4 + vpslld $12,%ymm4,%ymm0 + vpsrld $20,%ymm4,%ymm4 + vpor %ymm0,%ymm4,%ymm4 + # x9 += x13, x5 = rotl32(x5 ^ x9, 12) + vpaddd %ymm13,%ymm9,%ymm9 + vpxor %ymm9,%ymm5,%ymm5 + vpslld $12,%ymm5,%ymm0 + vpsrld $20,%ymm5,%ymm5 + vpor %ymm0,%ymm5,%ymm5 + # x10 += x14, x6 = rotl32(x6 ^ x10, 12) + vpaddd %ymm14,%ymm10,%ymm10 + vpxor %ymm10,%ymm6,%ymm6 + vpslld $12,%ymm6,%ymm0 + vpsrld $20,%ymm6,%ymm6 + vpor %ymm0,%ymm6,%ymm6 + # x11 += x15, x7 = rotl32(x7 ^ x11, 12) + vpaddd %ymm15,%ymm11,%ymm11 + vpxor %ymm11,%ymm7,%ymm7 + vpslld $12,%ymm7,%ymm0 + vpsrld $20,%ymm7,%ymm7 + vpor %ymm0,%ymm7,%ymm7 + + # x0 += x4, x12 = rotl32(x12 ^ x0, 8) + vpaddd 0x00(%rsp),%ymm4,%ymm0 + vmovdqa %ymm0,0x00(%rsp) + vpxor %ymm0,%ymm12,%ymm12 + vpshufb %ymm2,%ymm12,%ymm12 + # x1 += x5, x13 = rotl32(x13 ^ x1, 8) + vpaddd 0x20(%rsp),%ymm5,%ymm0 + vmovdqa %ymm0,0x20(%rsp) + vpxor %ymm0,%ymm13,%ymm13 + vpshufb %ymm2,%ymm13,%ymm13 + # x2 += x6, x14 = rotl32(x14 ^ x2, 8) + vpaddd 0x40(%rsp),%ymm6,%ymm0 + vmovdqa %ymm0,0x40(%rsp) + vpxor %ymm0,%ymm14,%ymm14 + vpshufb %ymm2,%ymm14,%ymm14 + # x3 += x7, x15 = rotl32(x15 ^ x3, 8) + vpaddd 0x60(%rsp),%ymm7,%ymm0 + vmovdqa %ymm0,0x60(%rsp) + vpxor %ymm0,%ymm15,%ymm15 + vpshufb %ymm2,%ymm15,%ymm15 + + # x8 += x12, x4 = rotl32(x4 ^ x8, 7) + vpaddd %ymm12,%ymm8,%ymm8 + vpxor %ymm8,%ymm4,%ymm4 + vpslld $7,%ymm4,%ymm0 + vpsrld $25,%ymm4,%ymm4 + vpor %ymm0,%ymm4,%ymm4 + # x9 += x13, x5 = rotl32(x5 ^ x9, 7) + vpaddd %ymm13,%ymm9,%ymm9 + vpxor %ymm9,%ymm5,%ymm5 + vpslld $7,%ymm5,%ymm0 + vpsrld $25,%ymm5,%ymm5 + vpor %ymm0,%ymm5,%ymm5 + # x10 += x14, x6 = rotl32(x6 ^ x10, 7) + vpaddd %ymm14,%ymm10,%ymm10 + vpxor %ymm10,%ymm6,%ymm6 + vpslld $7,%ymm6,%ymm0 + vpsrld $25,%ymm6,%ymm6 + vpor %ymm0,%ymm6,%ymm6 + # x11 += x15, x7 = rotl32(x7 ^ x11, 7) + vpaddd %ymm15,%ymm11,%ymm11 + vpxor %ymm11,%ymm7,%ymm7 + vpslld $7,%ymm7,%ymm0 + vpsrld $25,%ymm7,%ymm7 + vpor %ymm0,%ymm7,%ymm7 + + # x0 += x5, x15 = rotl32(x15 ^ x0, 16) + vpaddd 0x00(%rsp),%ymm5,%ymm0 + vmovdqa %ymm0,0x00(%rsp) + vpxor %ymm0,%ymm15,%ymm15 + vpshufb %ymm3,%ymm15,%ymm15 + # x1 += x6, x12 = rotl32(x12 ^ x1, 16)%ymm0 + vpaddd 0x20(%rsp),%ymm6,%ymm0 + vmovdqa %ymm0,0x20(%rsp) + vpxor %ymm0,%ymm12,%ymm12 + vpshufb %ymm3,%ymm12,%ymm12 + # x2 += x7, x13 = rotl32(x13 ^ x2, 16) + vpaddd 0x40(%rsp),%ymm7,%ymm0 + vmovdqa %ymm0,0x40(%rsp) + vpxor %ymm0,%ymm13,%ymm13 + vpshufb %ymm3,%ymm13,%ymm13 + # x3 += x4, x14 = rotl32(x14 ^ x3, 16) + vpaddd 0x60(%rsp),%ymm4,%ymm0 + vmovdqa %ymm0,0x60(%rsp) + vpxor %ymm0,%ymm14,%ymm14 + vpshufb %ymm3,%ymm14,%ymm14 + + # x10 += x15, x5 = rotl32(x5 ^ x10, 12) + vpaddd %ymm15,%ymm10,%ymm10 + vpxor %ymm10,%ymm5,%ymm5 + vpslld $12,%ymm5,%ymm0 + vpsrld $20,%ymm5,%ymm5 + vpor %ymm0,%ymm5,%ymm5 + # x11 += x12, x6 = rotl32(x6 ^ x11, 12) + vpaddd %ymm12,%ymm11,%ymm11 + vpxor %ymm11,%ymm6,%ymm6 + vpslld $12,%ymm6,%ymm0 + vpsrld $20,%ymm6,%ymm6 + vpor %ymm0,%ymm6,%ymm6 + # x8 += x13, x7 = rotl32(x7 ^ x8, 12) + vpaddd %ymm13,%ymm8,%ymm8 + vpxor %ymm8,%ymm7,%ymm7 + vpslld $12,%ymm7,%ymm0 + vpsrld $20,%ymm7,%ymm7 + vpor %ymm0,%ymm7,%ymm7 + # x9 += x14, x4 = rotl32(x4 ^ x9, 12) + vpaddd %ymm14,%ymm9,%ymm9 + vpxor %ymm9,%ymm4,%ymm4 + vpslld $12,%ymm4,%ymm0 + vpsrld $20,%ymm4,%ymm4 + vpor %ymm0,%ymm4,%ymm4 + + # x0 += x5, x15 = rotl32(x15 ^ x0, 8) + vpaddd 0x00(%rsp),%ymm5,%ymm0 + vmovdqa %ymm0,0x00(%rsp) + vpxor %ymm0,%ymm15,%ymm15 + vpshufb %ymm2,%ymm15,%ymm15 + # x1 += x6, x12 = rotl32(x12 ^ x1, 8) + vpaddd 0x20(%rsp),%ymm6,%ymm0 + vmovdqa %ymm0,0x20(%rsp) + vpxor %ymm0,%ymm12,%ymm12 + vpshufb %ymm2,%ymm12,%ymm12 + # x2 += x7, x13 = rotl32(x13 ^ x2, 8) + vpaddd 0x40(%rsp),%ymm7,%ymm0 + vmovdqa %ymm0,0x40(%rsp) + vpxor %ymm0,%ymm13,%ymm13 + vpshufb %ymm2,%ymm13,%ymm13 + # x3 += x4, x14 = rotl32(x14 ^ x3, 8) + vpaddd 0x60(%rsp),%ymm4,%ymm0 + vmovdqa %ymm0,0x60(%rsp) + vpxor %ymm0,%ymm14,%ymm14 + vpshufb %ymm2,%ymm14,%ymm14 + + # x10 += x15, x5 = rotl32(x5 ^ x10, 7) + vpaddd %ymm15,%ymm10,%ymm10 + vpxor %ymm10,%ymm5,%ymm5 + vpslld $7,%ymm5,%ymm0 + vpsrld $25,%ymm5,%ymm5 + vpor %ymm0,%ymm5,%ymm5 + # x11 += x12, x6 = rotl32(x6 ^ x11, 7) + vpaddd %ymm12,%ymm11,%ymm11 + vpxor %ymm11,%ymm6,%ymm6 + vpslld $7,%ymm6,%ymm0 + vpsrld $25,%ymm6,%ymm6 + vpor %ymm0,%ymm6,%ymm6 + # x8 += x13, x7 = rotl32(x7 ^ x8, 7) + vpaddd %ymm13,%ymm8,%ymm8 + vpxor %ymm8,%ymm7,%ymm7 + vpslld $7,%ymm7,%ymm0 + vpsrld $25,%ymm7,%ymm7 + vpor %ymm0,%ymm7,%ymm7 + # x9 += x14, x4 = rotl32(x4 ^ x9, 7) + vpaddd %ymm14,%ymm9,%ymm9 + vpxor %ymm9,%ymm4,%ymm4 + vpslld $7,%ymm4,%ymm0 + vpsrld $25,%ymm4,%ymm4 + vpor %ymm0,%ymm4,%ymm4 + + sub $2,%r8d + jnz .Ldoubleround8 + + # x0..15[0-3] += s[0..15] + vpbroadcastd 0x00(%rdi),%ymm0 + vpaddd 0x00(%rsp),%ymm0,%ymm0 + vmovdqa %ymm0,0x00(%rsp) + vpbroadcastd 0x04(%rdi),%ymm0 + vpaddd 0x20(%rsp),%ymm0,%ymm0 + vmovdqa %ymm0,0x20(%rsp) + vpbroadcastd 0x08(%rdi),%ymm0 + vpaddd 0x40(%rsp),%ymm0,%ymm0 + vmovdqa %ymm0,0x40(%rsp) + vpbroadcastd 0x0c(%rdi),%ymm0 + vpaddd 0x60(%rsp),%ymm0,%ymm0 + vmovdqa %ymm0,0x60(%rsp) + vpbroadcastd 0x10(%rdi),%ymm0 + vpaddd %ymm0,%ymm4,%ymm4 + vpbroadcastd 0x14(%rdi),%ymm0 + vpaddd %ymm0,%ymm5,%ymm5 + vpbroadcastd 0x18(%rdi),%ymm0 + vpaddd %ymm0,%ymm6,%ymm6 + vpbroadcastd 0x1c(%rdi),%ymm0 + vpaddd %ymm0,%ymm7,%ymm7 + vpbroadcastd 0x20(%rdi),%ymm0 + vpaddd %ymm0,%ymm8,%ymm8 + vpbroadcastd 0x24(%rdi),%ymm0 + vpaddd %ymm0,%ymm9,%ymm9 + vpbroadcastd 0x28(%rdi),%ymm0 + vpaddd %ymm0,%ymm10,%ymm10 + vpbroadcastd 0x2c(%rdi),%ymm0 + vpaddd %ymm0,%ymm11,%ymm11 + vpbroadcastd 0x30(%rdi),%ymm0 + vpaddd %ymm0,%ymm12,%ymm12 + vpbroadcastd 0x34(%rdi),%ymm0 + vpaddd %ymm0,%ymm13,%ymm13 + vpbroadcastd 0x38(%rdi),%ymm0 + vpaddd %ymm0,%ymm14,%ymm14 + vpbroadcastd 0x3c(%rdi),%ymm0 + vpaddd %ymm0,%ymm15,%ymm15 + + # x12 += counter values 0-3 + vpaddd %ymm1,%ymm12,%ymm12 + + # interleave 32-bit words in state n, n+1 + vmovdqa 0x00(%rsp),%ymm0 + vmovdqa 0x20(%rsp),%ymm1 + vpunpckldq %ymm1,%ymm0,%ymm2 + vpunpckhdq %ymm1,%ymm0,%ymm1 + vmovdqa %ymm2,0x00(%rsp) + vmovdqa %ymm1,0x20(%rsp) + vmovdqa 0x40(%rsp),%ymm0 + vmovdqa 0x60(%rsp),%ymm1 + vpunpckldq %ymm1,%ymm0,%ymm2 + vpunpckhdq %ymm1,%ymm0,%ymm1 + vmovdqa %ymm2,0x40(%rsp) + vmovdqa %ymm1,0x60(%rsp) + vmovdqa %ymm4,%ymm0 + vpunpckldq %ymm5,%ymm0,%ymm4 + vpunpckhdq %ymm5,%ymm0,%ymm5 + vmovdqa %ymm6,%ymm0 + vpunpckldq %ymm7,%ymm0,%ymm6 + vpunpckhdq %ymm7,%ymm0,%ymm7 + vmovdqa %ymm8,%ymm0 + vpunpckldq %ymm9,%ymm0,%ymm8 + vpunpckhdq %ymm9,%ymm0,%ymm9 + vmovdqa %ymm10,%ymm0 + vpunpckldq %ymm11,%ymm0,%ymm10 + vpunpckhdq %ymm11,%ymm0,%ymm11 + vmovdqa %ymm12,%ymm0 + vpunpckldq %ymm13,%ymm0,%ymm12 + vpunpckhdq %ymm13,%ymm0,%ymm13 + vmovdqa %ymm14,%ymm0 + vpunpckldq %ymm15,%ymm0,%ymm14 + vpunpckhdq %ymm15,%ymm0,%ymm15 + + # interleave 64-bit words in state n, n+2 + vmovdqa 0x00(%rsp),%ymm0 + vmovdqa 0x40(%rsp),%ymm2 + vpunpcklqdq %ymm2,%ymm0,%ymm1 + vpunpckhqdq %ymm2,%ymm0,%ymm2 + vmovdqa %ymm1,0x00(%rsp) + vmovdqa %ymm2,0x40(%rsp) + vmovdqa 0x20(%rsp),%ymm0 + vmovdqa 0x60(%rsp),%ymm2 + vpunpcklqdq %ymm2,%ymm0,%ymm1 + vpunpckhqdq %ymm2,%ymm0,%ymm2 + vmovdqa %ymm1,0x20(%rsp) + vmovdqa %ymm2,0x60(%rsp) + vmovdqa %ymm4,%ymm0 + vpunpcklqdq %ymm6,%ymm0,%ymm4 + vpunpckhqdq %ymm6,%ymm0,%ymm6 + vmovdqa %ymm5,%ymm0 + vpunpcklqdq %ymm7,%ymm0,%ymm5 + vpunpckhqdq %ymm7,%ymm0,%ymm7 + vmovdqa %ymm8,%ymm0 + vpunpcklqdq %ymm10,%ymm0,%ymm8 + vpunpckhqdq %ymm10,%ymm0,%ymm10 + vmovdqa %ymm9,%ymm0 + vpunpcklqdq %ymm11,%ymm0,%ymm9 + vpunpckhqdq %ymm11,%ymm0,%ymm11 + vmovdqa %ymm12,%ymm0 + vpunpcklqdq %ymm14,%ymm0,%ymm12 + vpunpckhqdq %ymm14,%ymm0,%ymm14 + vmovdqa %ymm13,%ymm0 + vpunpcklqdq %ymm15,%ymm0,%ymm13 + vpunpckhqdq %ymm15,%ymm0,%ymm15 + + # interleave 128-bit words in state n, n+4 + # xor/write first four blocks + vmovdqa 0x00(%rsp),%ymm1 + vperm2i128 $0x20,%ymm4,%ymm1,%ymm0 + cmp $0x0020,%rax + jl .Lxorpart8 + vpxor 0x0000(%rdx),%ymm0,%ymm0 + vmovdqu %ymm0,0x0000(%rsi) + vperm2i128 $0x31,%ymm4,%ymm1,%ymm4 + + vperm2i128 $0x20,%ymm12,%ymm8,%ymm0 + cmp $0x0040,%rax + jl .Lxorpart8 + vpxor 0x0020(%rdx),%ymm0,%ymm0 + vmovdqu %ymm0,0x0020(%rsi) + vperm2i128 $0x31,%ymm12,%ymm8,%ymm12 + + vmovdqa 0x40(%rsp),%ymm1 + vperm2i128 $0x20,%ymm6,%ymm1,%ymm0 + cmp $0x0060,%rax + jl .Lxorpart8 + vpxor 0x0040(%rdx),%ymm0,%ymm0 + vmovdqu %ymm0,0x0040(%rsi) + vperm2i128 $0x31,%ymm6,%ymm1,%ymm6 + + vperm2i128 $0x20,%ymm14,%ymm10,%ymm0 + cmp $0x0080,%rax + jl .Lxorpart8 + vpxor 0x0060(%rdx),%ymm0,%ymm0 + vmovdqu %ymm0,0x0060(%rsi) + vperm2i128 $0x31,%ymm14,%ymm10,%ymm14 + + vmovdqa 0x20(%rsp),%ymm1 + vperm2i128 $0x20,%ymm5,%ymm1,%ymm0 + cmp $0x00a0,%rax + jl .Lxorpart8 + vpxor 0x0080(%rdx),%ymm0,%ymm0 + vmovdqu %ymm0,0x0080(%rsi) + vperm2i128 $0x31,%ymm5,%ymm1,%ymm5 + + vperm2i128 $0x20,%ymm13,%ymm9,%ymm0 + cmp $0x00c0,%rax + jl .Lxorpart8 + vpxor 0x00a0(%rdx),%ymm0,%ymm0 + vmovdqu %ymm0,0x00a0(%rsi) + vperm2i128 $0x31,%ymm13,%ymm9,%ymm13 + + vmovdqa 0x60(%rsp),%ymm1 + vperm2i128 $0x20,%ymm7,%ymm1,%ymm0 + cmp $0x00e0,%rax + jl .Lxorpart8 + vpxor 0x00c0(%rdx),%ymm0,%ymm0 + vmovdqu %ymm0,0x00c0(%rsi) + vperm2i128 $0x31,%ymm7,%ymm1,%ymm7 + + vperm2i128 $0x20,%ymm15,%ymm11,%ymm0 + cmp $0x0100,%rax + jl .Lxorpart8 + vpxor 0x00e0(%rdx),%ymm0,%ymm0 + vmovdqu %ymm0,0x00e0(%rsi) + vperm2i128 $0x31,%ymm15,%ymm11,%ymm15 + + # xor remaining blocks, write to output + vmovdqa %ymm4,%ymm0 + cmp $0x0120,%rax + jl .Lxorpart8 + vpxor 0x0100(%rdx),%ymm0,%ymm0 + vmovdqu %ymm0,0x0100(%rsi) + + vmovdqa %ymm12,%ymm0 + cmp $0x0140,%rax + jl .Lxorpart8 + vpxor 0x0120(%rdx),%ymm0,%ymm0 + vmovdqu %ymm0,0x0120(%rsi) + + vmovdqa %ymm6,%ymm0 + cmp $0x0160,%rax + jl .Lxorpart8 + vpxor 0x0140(%rdx),%ymm0,%ymm0 + vmovdqu %ymm0,0x0140(%rsi) + + vmovdqa %ymm14,%ymm0 + cmp $0x0180,%rax + jl .Lxorpart8 + vpxor 0x0160(%rdx),%ymm0,%ymm0 + vmovdqu %ymm0,0x0160(%rsi) + + vmovdqa %ymm5,%ymm0 + cmp $0x01a0,%rax + jl .Lxorpart8 + vpxor 0x0180(%rdx),%ymm0,%ymm0 + vmovdqu %ymm0,0x0180(%rsi) + + vmovdqa %ymm13,%ymm0 + cmp $0x01c0,%rax + jl .Lxorpart8 + vpxor 0x01a0(%rdx),%ymm0,%ymm0 + vmovdqu %ymm0,0x01a0(%rsi) + + vmovdqa %ymm7,%ymm0 + cmp $0x01e0,%rax + jl .Lxorpart8 + vpxor 0x01c0(%rdx),%ymm0,%ymm0 + vmovdqu %ymm0,0x01c0(%rsi) + + vmovdqa %ymm15,%ymm0 + cmp $0x0200,%rax + jl .Lxorpart8 + vpxor 0x01e0(%rdx),%ymm0,%ymm0 + vmovdqu %ymm0,0x01e0(%rsi) + +.Ldone8: + vzeroupper + lea -8(%r10),%rsp + ret + +.Lxorpart8: + # xor remaining bytes from partial register into output + mov %rax,%r9 + and $0x1f,%r9 + jz .Ldone8 + and $~0x1f,%rax + + mov %rsi,%r11 + + lea (%rdx,%rax),%rsi + mov %rsp,%rdi + mov %r9,%rcx + rep movsb + + vpxor 0x00(%rsp),%ymm0,%ymm0 + vmovdqa %ymm0,0x00(%rsp) + + mov %rsp,%rsi + lea (%r11,%rax),%rdi + mov %r9,%rcx + rep movsb + + jmp .Ldone8 + +ENDPROC(chacha_8block_xor_avx2) diff --git a/arch/x86/crypto/chacha-avx512vl-x86_64.S b/arch/x86/crypto/chacha-avx512vl-x86_64.S new file mode 100644 index 000000000000..848f9c75fd4f --- /dev/null +++ b/arch/x86/crypto/chacha-avx512vl-x86_64.S @@ -0,0 +1,836 @@ +/* SPDX-License-Identifier: GPL-2.0+ */ +/* + * ChaCha 256-bit cipher algorithm, x64 AVX-512VL functions + * + * Copyright (C) 2018 Martin Willi + */ + +#include + +.section .rodata.cst32.CTR2BL, "aM", @progbits, 32 +.align 32 +CTR2BL: .octa 0x00000000000000000000000000000000 + .octa 0x00000000000000000000000000000001 + +.section .rodata.cst32.CTR4BL, "aM", @progbits, 32 +.align 32 +CTR4BL: .octa 0x00000000000000000000000000000002 + .octa 0x00000000000000000000000000000003 + +.section .rodata.cst32.CTR8BL, "aM", @progbits, 32 +.align 32 +CTR8BL: .octa 0x00000003000000020000000100000000 + .octa 0x00000007000000060000000500000004 + +.text + +ENTRY(chacha_2block_xor_avx512vl) + # %rdi: Input state matrix, s + # %rsi: up to 2 data blocks output, o + # %rdx: up to 2 data blocks input, i + # %rcx: input/output length in bytes + # %r8d: nrounds + + # This function encrypts two ChaCha blocks by loading the state + # matrix twice across four AVX registers. It performs matrix operations + # on four words in each matrix in parallel, but requires shuffling to + # rearrange the words after each round. + + vzeroupper + + # x0..3[0-2] = s0..3 + vbroadcasti128 0x00(%rdi),%ymm0 + vbroadcasti128 0x10(%rdi),%ymm1 + vbroadcasti128 0x20(%rdi),%ymm2 + vbroadcasti128 0x30(%rdi),%ymm3 + + vpaddd CTR2BL(%rip),%ymm3,%ymm3 + + vmovdqa %ymm0,%ymm8 + vmovdqa %ymm1,%ymm9 + vmovdqa %ymm2,%ymm10 + vmovdqa %ymm3,%ymm11 + +.Ldoubleround: + + # x0 += x1, x3 = rotl32(x3 ^ x0, 16) + vpaddd %ymm1,%ymm0,%ymm0 + vpxord %ymm0,%ymm3,%ymm3 + vprold $16,%ymm3,%ymm3 + + # x2 += x3, x1 = rotl32(x1 ^ x2, 12) + vpaddd %ymm3,%ymm2,%ymm2 + vpxord %ymm2,%ymm1,%ymm1 + vprold $12,%ymm1,%ymm1 + + # x0 += x1, x3 = rotl32(x3 ^ x0, 8) + vpaddd %ymm1,%ymm0,%ymm0 + vpxord %ymm0,%ymm3,%ymm3 + vprold $8,%ymm3,%ymm3 + + # x2 += x3, x1 = rotl32(x1 ^ x2, 7) + vpaddd %ymm3,%ymm2,%ymm2 + vpxord %ymm2,%ymm1,%ymm1 + vprold $7,%ymm1,%ymm1 + + # x1 = shuffle32(x1, MASK(0, 3, 2, 1)) + vpshufd $0x39,%ymm1,%ymm1 + # x2 = shuffle32(x2, MASK(1, 0, 3, 2)) + vpshufd $0x4e,%ymm2,%ymm2 + # x3 = shuffle32(x3, MASK(2, 1, 0, 3)) + vpshufd $0x93,%ymm3,%ymm3 + + # x0 += x1, x3 = rotl32(x3 ^ x0, 16) + vpaddd %ymm1,%ymm0,%ymm0 + vpxord %ymm0,%ymm3,%ymm3 + vprold $16,%ymm3,%ymm3 + + # x2 += x3, x1 = rotl32(x1 ^ x2, 12) + vpaddd %ymm3,%ymm2,%ymm2 + vpxord %ymm2,%ymm1,%ymm1 + vprold $12,%ymm1,%ymm1 + + # x0 += x1, x3 = rotl32(x3 ^ x0, 8) + vpaddd %ymm1,%ymm0,%ymm0 + vpxord %ymm0,%ymm3,%ymm3 + vprold $8,%ymm3,%ymm3 + + # x2 += x3, x1 = rotl32(x1 ^ x2, 7) + vpaddd %ymm3,%ymm2,%ymm2 + vpxord %ymm2,%ymm1,%ymm1 + vprold $7,%ymm1,%ymm1 + + # x1 = shuffle32(x1, MASK(2, 1, 0, 3)) + vpshufd $0x93,%ymm1,%ymm1 + # x2 = shuffle32(x2, MASK(1, 0, 3, 2)) + vpshufd $0x4e,%ymm2,%ymm2 + # x3 = shuffle32(x3, MASK(0, 3, 2, 1)) + vpshufd $0x39,%ymm3,%ymm3 + + sub $2,%r8d + jnz .Ldoubleround + + # o0 = i0 ^ (x0 + s0) + vpaddd %ymm8,%ymm0,%ymm7 + cmp $0x10,%rcx + jl .Lxorpart2 + vpxord 0x00(%rdx),%xmm7,%xmm6 + vmovdqu %xmm6,0x00(%rsi) + vextracti128 $1,%ymm7,%xmm0 + # o1 = i1 ^ (x1 + s1) + vpaddd %ymm9,%ymm1,%ymm7 + cmp $0x20,%rcx + jl .Lxorpart2 + vpxord 0x10(%rdx),%xmm7,%xmm6 + vmovdqu %xmm6,0x10(%rsi) + vextracti128 $1,%ymm7,%xmm1 + # o2 = i2 ^ (x2 + s2) + vpaddd %ymm10,%ymm2,%ymm7 + cmp $0x30,%rcx + jl .Lxorpart2 + vpxord 0x20(%rdx),%xmm7,%xmm6 + vmovdqu %xmm6,0x20(%rsi) + vextracti128 $1,%ymm7,%xmm2 + # o3 = i3 ^ (x3 + s3) + vpaddd %ymm11,%ymm3,%ymm7 + cmp $0x40,%rcx + jl .Lxorpart2 + vpxord 0x30(%rdx),%xmm7,%xmm6 + vmovdqu %xmm6,0x30(%rsi) + vextracti128 $1,%ymm7,%xmm3 + + # xor and write second block + vmovdqa %xmm0,%xmm7 + cmp $0x50,%rcx + jl .Lxorpart2 + vpxord 0x40(%rdx),%xmm7,%xmm6 + vmovdqu %xmm6,0x40(%rsi) + + vmovdqa %xmm1,%xmm7 + cmp $0x60,%rcx + jl .Lxorpart2 + vpxord 0x50(%rdx),%xmm7,%xmm6 + vmovdqu %xmm6,0x50(%rsi) + + vmovdqa %xmm2,%xmm7 + cmp $0x70,%rcx + jl .Lxorpart2 + vpxord 0x60(%rdx),%xmm7,%xmm6 + vmovdqu %xmm6,0x60(%rsi) + + vmovdqa %xmm3,%xmm7 + cmp $0x80,%rcx + jl .Lxorpart2 + vpxord 0x70(%rdx),%xmm7,%xmm6 + vmovdqu %xmm6,0x70(%rsi) + +.Ldone2: + vzeroupper + ret + +.Lxorpart2: + # xor remaining bytes from partial register into output + mov %rcx,%rax + and $0xf,%rcx + jz .Ldone8 + mov %rax,%r9 + and $~0xf,%r9 + + mov $1,%rax + shld %cl,%rax,%rax + sub $1,%rax + kmovq %rax,%k1 + + vmovdqu8 (%rdx,%r9),%xmm1{%k1}{z} + vpxord %xmm7,%xmm1,%xmm1 + vmovdqu8 %xmm1,(%rsi,%r9){%k1} + + jmp .Ldone2 + +ENDPROC(chacha_2block_xor_avx512vl) + +ENTRY(chacha_4block_xor_avx512vl) + # %rdi: Input state matrix, s + # %rsi: up to 4 data blocks output, o + # %rdx: up to 4 data blocks input, i + # %rcx: input/output length in bytes + # %r8d: nrounds + + # This function encrypts four ChaCha blocks by loading the state + # matrix four times across eight AVX registers. It performs matrix + # operations on four words in two matrices in parallel, sequentially + # to the operations on the four words of the other two matrices. The + # required word shuffling has a rather high latency, we can do the + # arithmetic on two matrix-pairs without much slowdown. + + vzeroupper + + # x0..3[0-4] = s0..3 + vbroadcasti128 0x00(%rdi),%ymm0 + vbroadcasti128 0x10(%rdi),%ymm1 + vbroadcasti128 0x20(%rdi),%ymm2 + vbroadcasti128 0x30(%rdi),%ymm3 + + vmovdqa %ymm0,%ymm4 + vmovdqa %ymm1,%ymm5 + vmovdqa %ymm2,%ymm6 + vmovdqa %ymm3,%ymm7 + + vpaddd CTR2BL(%rip),%ymm3,%ymm3 + vpaddd CTR4BL(%rip),%ymm7,%ymm7 + + vmovdqa %ymm0,%ymm11 + vmovdqa %ymm1,%ymm12 + vmovdqa %ymm2,%ymm13 + vmovdqa %ymm3,%ymm14 + vmovdqa %ymm7,%ymm15 + +.Ldoubleround4: + + # x0 += x1, x3 = rotl32(x3 ^ x0, 16) + vpaddd %ymm1,%ymm0,%ymm0 + vpxord %ymm0,%ymm3,%ymm3 + vprold $16,%ymm3,%ymm3 + + vpaddd %ymm5,%ymm4,%ymm4 + vpxord %ymm4,%ymm7,%ymm7 + vprold $16,%ymm7,%ymm7 + + # x2 += x3, x1 = rotl32(x1 ^ x2, 12) + vpaddd %ymm3,%ymm2,%ymm2 + vpxord %ymm2,%ymm1,%ymm1 + vprold $12,%ymm1,%ymm1 + + vpaddd %ymm7,%ymm6,%ymm6 + vpxord %ymm6,%ymm5,%ymm5 + vprold $12,%ymm5,%ymm5 + + # x0 += x1, x3 = rotl32(x3 ^ x0, 8) + vpaddd %ymm1,%ymm0,%ymm0 + vpxord %ymm0,%ymm3,%ymm3 + vprold $8,%ymm3,%ymm3 + + vpaddd %ymm5,%ymm4,%ymm4 + vpxord %ymm4,%ymm7,%ymm7 + vprold $8,%ymm7,%ymm7 + + # x2 += x3, x1 = rotl32(x1 ^ x2, 7) + vpaddd %ymm3,%ymm2,%ymm2 + vpxord %ymm2,%ymm1,%ymm1 + vprold $7,%ymm1,%ymm1 + + vpaddd %ymm7,%ymm6,%ymm6 + vpxord %ymm6,%ymm5,%ymm5 + vprold $7,%ymm5,%ymm5 + + # x1 = shuffle32(x1, MASK(0, 3, 2, 1)) + vpshufd $0x39,%ymm1,%ymm1 + vpshufd $0x39,%ymm5,%ymm5 + # x2 = shuffle32(x2, MASK(1, 0, 3, 2)) + vpshufd $0x4e,%ymm2,%ymm2 + vpshufd $0x4e,%ymm6,%ymm6 + # x3 = shuffle32(x3, MASK(2, 1, 0, 3)) + vpshufd $0x93,%ymm3,%ymm3 + vpshufd $0x93,%ymm7,%ymm7 + + # x0 += x1, x3 = rotl32(x3 ^ x0, 16) + vpaddd %ymm1,%ymm0,%ymm0 + vpxord %ymm0,%ymm3,%ymm3 + vprold $16,%ymm3,%ymm3 + + vpaddd %ymm5,%ymm4,%ymm4 + vpxord %ymm4,%ymm7,%ymm7 + vprold $16,%ymm7,%ymm7 + + # x2 += x3, x1 = rotl32(x1 ^ x2, 12) + vpaddd %ymm3,%ymm2,%ymm2 + vpxord %ymm2,%ymm1,%ymm1 + vprold $12,%ymm1,%ymm1 + + vpaddd %ymm7,%ymm6,%ymm6 + vpxord %ymm6,%ymm5,%ymm5 + vprold $12,%ymm5,%ymm5 + + # x0 += x1, x3 = rotl32(x3 ^ x0, 8) + vpaddd %ymm1,%ymm0,%ymm0 + vpxord %ymm0,%ymm3,%ymm3 + vprold $8,%ymm3,%ymm3 + + vpaddd %ymm5,%ymm4,%ymm4 + vpxord %ymm4,%ymm7,%ymm7 + vprold $8,%ymm7,%ymm7 + + # x2 += x3, x1 = rotl32(x1 ^ x2, 7) + vpaddd %ymm3,%ymm2,%ymm2 + vpxord %ymm2,%ymm1,%ymm1 + vprold $7,%ymm1,%ymm1 + + vpaddd %ymm7,%ymm6,%ymm6 + vpxord %ymm6,%ymm5,%ymm5 + vprold $7,%ymm5,%ymm5 + + # x1 = shuffle32(x1, MASK(2, 1, 0, 3)) + vpshufd $0x93,%ymm1,%ymm1 + vpshufd $0x93,%ymm5,%ymm5 + # x2 = shuffle32(x2, MASK(1, 0, 3, 2)) + vpshufd $0x4e,%ymm2,%ymm2 + vpshufd $0x4e,%ymm6,%ymm6 + # x3 = shuffle32(x3, MASK(0, 3, 2, 1)) + vpshufd $0x39,%ymm3,%ymm3 + vpshufd $0x39,%ymm7,%ymm7 + + sub $2,%r8d + jnz .Ldoubleround4 + + # o0 = i0 ^ (x0 + s0), first block + vpaddd %ymm11,%ymm0,%ymm10 + cmp $0x10,%rcx + jl .Lxorpart4 + vpxord 0x00(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0x00(%rsi) + vextracti128 $1,%ymm10,%xmm0 + # o1 = i1 ^ (x1 + s1), first block + vpaddd %ymm12,%ymm1,%ymm10 + cmp $0x20,%rcx + jl .Lxorpart4 + vpxord 0x10(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0x10(%rsi) + vextracti128 $1,%ymm10,%xmm1 + # o2 = i2 ^ (x2 + s2), first block + vpaddd %ymm13,%ymm2,%ymm10 + cmp $0x30,%rcx + jl .Lxorpart4 + vpxord 0x20(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0x20(%rsi) + vextracti128 $1,%ymm10,%xmm2 + # o3 = i3 ^ (x3 + s3), first block + vpaddd %ymm14,%ymm3,%ymm10 + cmp $0x40,%rcx + jl .Lxorpart4 + vpxord 0x30(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0x30(%rsi) + vextracti128 $1,%ymm10,%xmm3 + + # xor and write second block + vmovdqa %xmm0,%xmm10 + cmp $0x50,%rcx + jl .Lxorpart4 + vpxord 0x40(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0x40(%rsi) + + vmovdqa %xmm1,%xmm10 + cmp $0x60,%rcx + jl .Lxorpart4 + vpxord 0x50(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0x50(%rsi) + + vmovdqa %xmm2,%xmm10 + cmp $0x70,%rcx + jl .Lxorpart4 + vpxord 0x60(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0x60(%rsi) + + vmovdqa %xmm3,%xmm10 + cmp $0x80,%rcx + jl .Lxorpart4 + vpxord 0x70(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0x70(%rsi) + + # o0 = i0 ^ (x0 + s0), third block + vpaddd %ymm11,%ymm4,%ymm10 + cmp $0x90,%rcx + jl .Lxorpart4 + vpxord 0x80(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0x80(%rsi) + vextracti128 $1,%ymm10,%xmm4 + # o1 = i1 ^ (x1 + s1), third block + vpaddd %ymm12,%ymm5,%ymm10 + cmp $0xa0,%rcx + jl .Lxorpart4 + vpxord 0x90(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0x90(%rsi) + vextracti128 $1,%ymm10,%xmm5 + # o2 = i2 ^ (x2 + s2), third block + vpaddd %ymm13,%ymm6,%ymm10 + cmp $0xb0,%rcx + jl .Lxorpart4 + vpxord 0xa0(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0xa0(%rsi) + vextracti128 $1,%ymm10,%xmm6 + # o3 = i3 ^ (x3 + s3), third block + vpaddd %ymm15,%ymm7,%ymm10 + cmp $0xc0,%rcx + jl .Lxorpart4 + vpxord 0xb0(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0xb0(%rsi) + vextracti128 $1,%ymm10,%xmm7 + + # xor and write fourth block + vmovdqa %xmm4,%xmm10 + cmp $0xd0,%rcx + jl .Lxorpart4 + vpxord 0xc0(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0xc0(%rsi) + + vmovdqa %xmm5,%xmm10 + cmp $0xe0,%rcx + jl .Lxorpart4 + vpxord 0xd0(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0xd0(%rsi) + + vmovdqa %xmm6,%xmm10 + cmp $0xf0,%rcx + jl .Lxorpart4 + vpxord 0xe0(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0xe0(%rsi) + + vmovdqa %xmm7,%xmm10 + cmp $0x100,%rcx + jl .Lxorpart4 + vpxord 0xf0(%rdx),%xmm10,%xmm9 + vmovdqu %xmm9,0xf0(%rsi) + +.Ldone4: + vzeroupper + ret + +.Lxorpart4: + # xor remaining bytes from partial register into output + mov %rcx,%rax + and $0xf,%rcx + jz .Ldone8 + mov %rax,%r9 + and $~0xf,%r9 + + mov $1,%rax + shld %cl,%rax,%rax + sub $1,%rax + kmovq %rax,%k1 + + vmovdqu8 (%rdx,%r9),%xmm1{%k1}{z} + vpxord %xmm10,%xmm1,%xmm1 + vmovdqu8 %xmm1,(%rsi,%r9){%k1} + + jmp .Ldone4 + +ENDPROC(chacha_4block_xor_avx512vl) + +ENTRY(chacha_8block_xor_avx512vl) + # %rdi: Input state matrix, s + # %rsi: up to 8 data blocks output, o + # %rdx: up to 8 data blocks input, i + # %rcx: input/output length in bytes + # %r8d: nrounds + + # This function encrypts eight consecutive ChaCha blocks by loading + # the state matrix in AVX registers eight times. Compared to AVX2, this + # mostly benefits from the new rotate instructions in VL and the + # additional registers. + + vzeroupper + + # x0..15[0-7] = s[0..15] + vpbroadcastd 0x00(%rdi),%ymm0 + vpbroadcastd 0x04(%rdi),%ymm1 + vpbroadcastd 0x08(%rdi),%ymm2 + vpbroadcastd 0x0c(%rdi),%ymm3 + vpbroadcastd 0x10(%rdi),%ymm4 + vpbroadcastd 0x14(%rdi),%ymm5 + vpbroadcastd 0x18(%rdi),%ymm6 + vpbroadcastd 0x1c(%rdi),%ymm7 + vpbroadcastd 0x20(%rdi),%ymm8 + vpbroadcastd 0x24(%rdi),%ymm9 + vpbroadcastd 0x28(%rdi),%ymm10 + vpbroadcastd 0x2c(%rdi),%ymm11 + vpbroadcastd 0x30(%rdi),%ymm12 + vpbroadcastd 0x34(%rdi),%ymm13 + vpbroadcastd 0x38(%rdi),%ymm14 + vpbroadcastd 0x3c(%rdi),%ymm15 + + # x12 += counter values 0-3 + vpaddd CTR8BL(%rip),%ymm12,%ymm12 + + vmovdqa64 %ymm0,%ymm16 + vmovdqa64 %ymm1,%ymm17 + vmovdqa64 %ymm2,%ymm18 + vmovdqa64 %ymm3,%ymm19 + vmovdqa64 %ymm4,%ymm20 + vmovdqa64 %ymm5,%ymm21 + vmovdqa64 %ymm6,%ymm22 + vmovdqa64 %ymm7,%ymm23 + vmovdqa64 %ymm8,%ymm24 + vmovdqa64 %ymm9,%ymm25 + vmovdqa64 %ymm10,%ymm26 + vmovdqa64 %ymm11,%ymm27 + vmovdqa64 %ymm12,%ymm28 + vmovdqa64 %ymm13,%ymm29 + vmovdqa64 %ymm14,%ymm30 + vmovdqa64 %ymm15,%ymm31 + +.Ldoubleround8: + # x0 += x4, x12 = rotl32(x12 ^ x0, 16) + vpaddd %ymm0,%ymm4,%ymm0 + vpxord %ymm0,%ymm12,%ymm12 + vprold $16,%ymm12,%ymm12 + # x1 += x5, x13 = rotl32(x13 ^ x1, 16) + vpaddd %ymm1,%ymm5,%ymm1 + vpxord %ymm1,%ymm13,%ymm13 + vprold $16,%ymm13,%ymm13 + # x2 += x6, x14 = rotl32(x14 ^ x2, 16) + vpaddd %ymm2,%ymm6,%ymm2 + vpxord %ymm2,%ymm14,%ymm14 + vprold $16,%ymm14,%ymm14 + # x3 += x7, x15 = rotl32(x15 ^ x3, 16) + vpaddd %ymm3,%ymm7,%ymm3 + vpxord %ymm3,%ymm15,%ymm15 + vprold $16,%ymm15,%ymm15 + + # x8 += x12, x4 = rotl32(x4 ^ x8, 12) + vpaddd %ymm12,%ymm8,%ymm8 + vpxord %ymm8,%ymm4,%ymm4 + vprold $12,%ymm4,%ymm4 + # x9 += x13, x5 = rotl32(x5 ^ x9, 12) + vpaddd %ymm13,%ymm9,%ymm9 + vpxord %ymm9,%ymm5,%ymm5 + vprold $12,%ymm5,%ymm5 + # x10 += x14, x6 = rotl32(x6 ^ x10, 12) + vpaddd %ymm14,%ymm10,%ymm10 + vpxord %ymm10,%ymm6,%ymm6 + vprold $12,%ymm6,%ymm6 + # x11 += x15, x7 = rotl32(x7 ^ x11, 12) + vpaddd %ymm15,%ymm11,%ymm11 + vpxord %ymm11,%ymm7,%ymm7 + vprold $12,%ymm7,%ymm7 + + # x0 += x4, x12 = rotl32(x12 ^ x0, 8) + vpaddd %ymm0,%ymm4,%ymm0 + vpxord %ymm0,%ymm12,%ymm12 + vprold $8,%ymm12,%ymm12 + # x1 += x5, x13 = rotl32(x13 ^ x1, 8) + vpaddd %ymm1,%ymm5,%ymm1 + vpxord %ymm1,%ymm13,%ymm13 + vprold $8,%ymm13,%ymm13 + # x2 += x6, x14 = rotl32(x14 ^ x2, 8) + vpaddd %ymm2,%ymm6,%ymm2 + vpxord %ymm2,%ymm14,%ymm14 + vprold $8,%ymm14,%ymm14 + # x3 += x7, x15 = rotl32(x15 ^ x3, 8) + vpaddd %ymm3,%ymm7,%ymm3 + vpxord %ymm3,%ymm15,%ymm15 + vprold $8,%ymm15,%ymm15 + + # x8 += x12, x4 = rotl32(x4 ^ x8, 7) + vpaddd %ymm12,%ymm8,%ymm8 + vpxord %ymm8,%ymm4,%ymm4 + vprold $7,%ymm4,%ymm4 + # x9 += x13, x5 = rotl32(x5 ^ x9, 7) + vpaddd %ymm13,%ymm9,%ymm9 + vpxord %ymm9,%ymm5,%ymm5 + vprold $7,%ymm5,%ymm5 + # x10 += x14, x6 = rotl32(x6 ^ x10, 7) + vpaddd %ymm14,%ymm10,%ymm10 + vpxord %ymm10,%ymm6,%ymm6 + vprold $7,%ymm6,%ymm6 + # x11 += x15, x7 = rotl32(x7 ^ x11, 7) + vpaddd %ymm15,%ymm11,%ymm11 + vpxord %ymm11,%ymm7,%ymm7 + vprold $7,%ymm7,%ymm7 + + # x0 += x5, x15 = rotl32(x15 ^ x0, 16) + vpaddd %ymm0,%ymm5,%ymm0 + vpxord %ymm0,%ymm15,%ymm15 + vprold $16,%ymm15,%ymm15 + # x1 += x6, x12 = rotl32(x12 ^ x1, 16) + vpaddd %ymm1,%ymm6,%ymm1 + vpxord %ymm1,%ymm12,%ymm12 + vprold $16,%ymm12,%ymm12 + # x2 += x7, x13 = rotl32(x13 ^ x2, 16) + vpaddd %ymm2,%ymm7,%ymm2 + vpxord %ymm2,%ymm13,%ymm13 + vprold $16,%ymm13,%ymm13 + # x3 += x4, x14 = rotl32(x14 ^ x3, 16) + vpaddd %ymm3,%ymm4,%ymm3 + vpxord %ymm3,%ymm14,%ymm14 + vprold $16,%ymm14,%ymm14 + + # x10 += x15, x5 = rotl32(x5 ^ x10, 12) + vpaddd %ymm15,%ymm10,%ymm10 + vpxord %ymm10,%ymm5,%ymm5 + vprold $12,%ymm5,%ymm5 + # x11 += x12, x6 = rotl32(x6 ^ x11, 12) + vpaddd %ymm12,%ymm11,%ymm11 + vpxord %ymm11,%ymm6,%ymm6 + vprold $12,%ymm6,%ymm6 + # x8 += x13, x7 = rotl32(x7 ^ x8, 12) + vpaddd %ymm13,%ymm8,%ymm8 + vpxord %ymm8,%ymm7,%ymm7 + vprold $12,%ymm7,%ymm7 + # x9 += x14, x4 = rotl32(x4 ^ x9, 12) + vpaddd %ymm14,%ymm9,%ymm9 + vpxord %ymm9,%ymm4,%ymm4 + vprold $12,%ymm4,%ymm4 + + # x0 += x5, x15 = rotl32(x15 ^ x0, 8) + vpaddd %ymm0,%ymm5,%ymm0 + vpxord %ymm0,%ymm15,%ymm15 + vprold $8,%ymm15,%ymm15 + # x1 += x6, x12 = rotl32(x12 ^ x1, 8) + vpaddd %ymm1,%ymm6,%ymm1 + vpxord %ymm1,%ymm12,%ymm12 + vprold $8,%ymm12,%ymm12 + # x2 += x7, x13 = rotl32(x13 ^ x2, 8) + vpaddd %ymm2,%ymm7,%ymm2 + vpxord %ymm2,%ymm13,%ymm13 + vprold $8,%ymm13,%ymm13 + # x3 += x4, x14 = rotl32(x14 ^ x3, 8) + vpaddd %ymm3,%ymm4,%ymm3 + vpxord %ymm3,%ymm14,%ymm14 + vprold $8,%ymm14,%ymm14 + + # x10 += x15, x5 = rotl32(x5 ^ x10, 7) + vpaddd %ymm15,%ymm10,%ymm10 + vpxord %ymm10,%ymm5,%ymm5 + vprold $7,%ymm5,%ymm5 + # x11 += x12, x6 = rotl32(x6 ^ x11, 7) + vpaddd %ymm12,%ymm11,%ymm11 + vpxord %ymm11,%ymm6,%ymm6 + vprold $7,%ymm6,%ymm6 + # x8 += x13, x7 = rotl32(x7 ^ x8, 7) + vpaddd %ymm13,%ymm8,%ymm8 + vpxord %ymm8,%ymm7,%ymm7 + vprold $7,%ymm7,%ymm7 + # x9 += x14, x4 = rotl32(x4 ^ x9, 7) + vpaddd %ymm14,%ymm9,%ymm9 + vpxord %ymm9,%ymm4,%ymm4 + vprold $7,%ymm4,%ymm4 + + sub $2,%r8d + jnz .Ldoubleround8 + + # x0..15[0-3] += s[0..15] + vpaddd %ymm16,%ymm0,%ymm0 + vpaddd %ymm17,%ymm1,%ymm1 + vpaddd %ymm18,%ymm2,%ymm2 + vpaddd %ymm19,%ymm3,%ymm3 + vpaddd %ymm20,%ymm4,%ymm4 + vpaddd %ymm21,%ymm5,%ymm5 + vpaddd %ymm22,%ymm6,%ymm6 + vpaddd %ymm23,%ymm7,%ymm7 + vpaddd %ymm24,%ymm8,%ymm8 + vpaddd %ymm25,%ymm9,%ymm9 + vpaddd %ymm26,%ymm10,%ymm10 + vpaddd %ymm27,%ymm11,%ymm11 + vpaddd %ymm28,%ymm12,%ymm12 + vpaddd %ymm29,%ymm13,%ymm13 + vpaddd %ymm30,%ymm14,%ymm14 + vpaddd %ymm31,%ymm15,%ymm15 + + # interleave 32-bit words in state n, n+1 + vpunpckldq %ymm1,%ymm0,%ymm16 + vpunpckhdq %ymm1,%ymm0,%ymm17 + vpunpckldq %ymm3,%ymm2,%ymm18 + vpunpckhdq %ymm3,%ymm2,%ymm19 + vpunpckldq %ymm5,%ymm4,%ymm20 + vpunpckhdq %ymm5,%ymm4,%ymm21 + vpunpckldq %ymm7,%ymm6,%ymm22 + vpunpckhdq %ymm7,%ymm6,%ymm23 + vpunpckldq %ymm9,%ymm8,%ymm24 + vpunpckhdq %ymm9,%ymm8,%ymm25 + vpunpckldq %ymm11,%ymm10,%ymm26 + vpunpckhdq %ymm11,%ymm10,%ymm27 + vpunpckldq %ymm13,%ymm12,%ymm28 + vpunpckhdq %ymm13,%ymm12,%ymm29 + vpunpckldq %ymm15,%ymm14,%ymm30 + vpunpckhdq %ymm15,%ymm14,%ymm31 + + # interleave 64-bit words in state n, n+2 + vpunpcklqdq %ymm18,%ymm16,%ymm0 + vpunpcklqdq %ymm19,%ymm17,%ymm1 + vpunpckhqdq %ymm18,%ymm16,%ymm2 + vpunpckhqdq %ymm19,%ymm17,%ymm3 + vpunpcklqdq %ymm22,%ymm20,%ymm4 + vpunpcklqdq %ymm23,%ymm21,%ymm5 + vpunpckhqdq %ymm22,%ymm20,%ymm6 + vpunpckhqdq %ymm23,%ymm21,%ymm7 + vpunpcklqdq %ymm26,%ymm24,%ymm8 + vpunpcklqdq %ymm27,%ymm25,%ymm9 + vpunpckhqdq %ymm26,%ymm24,%ymm10 + vpunpckhqdq %ymm27,%ymm25,%ymm11 + vpunpcklqdq %ymm30,%ymm28,%ymm12 + vpunpcklqdq %ymm31,%ymm29,%ymm13 + vpunpckhqdq %ymm30,%ymm28,%ymm14 + vpunpckhqdq %ymm31,%ymm29,%ymm15 + + # interleave 128-bit words in state n, n+4 + # xor/write first four blocks + vmovdqa64 %ymm0,%ymm16 + vperm2i128 $0x20,%ymm4,%ymm0,%ymm0 + cmp $0x0020,%rcx + jl .Lxorpart8 + vpxord 0x0000(%rdx),%ymm0,%ymm0 + vmovdqu64 %ymm0,0x0000(%rsi) + vmovdqa64 %ymm16,%ymm0 + vperm2i128 $0x31,%ymm4,%ymm0,%ymm4 + + vperm2i128 $0x20,%ymm12,%ymm8,%ymm0 + cmp $0x0040,%rcx + jl .Lxorpart8 + vpxord 0x0020(%rdx),%ymm0,%ymm0 + vmovdqu64 %ymm0,0x0020(%rsi) + vperm2i128 $0x31,%ymm12,%ymm8,%ymm12 + + vperm2i128 $0x20,%ymm6,%ymm2,%ymm0 + cmp $0x0060,%rcx + jl .Lxorpart8 + vpxord 0x0040(%rdx),%ymm0,%ymm0 + vmovdqu64 %ymm0,0x0040(%rsi) + vperm2i128 $0x31,%ymm6,%ymm2,%ymm6 + + vperm2i128 $0x20,%ymm14,%ymm10,%ymm0 + cmp $0x0080,%rcx + jl .Lxorpart8 + vpxord 0x0060(%rdx),%ymm0,%ymm0 + vmovdqu64 %ymm0,0x0060(%rsi) + vperm2i128 $0x31,%ymm14,%ymm10,%ymm14 + + vperm2i128 $0x20,%ymm5,%ymm1,%ymm0 + cmp $0x00a0,%rcx + jl .Lxorpart8 + vpxord 0x0080(%rdx),%ymm0,%ymm0 + vmovdqu64 %ymm0,0x0080(%rsi) + vperm2i128 $0x31,%ymm5,%ymm1,%ymm5 + + vperm2i128 $0x20,%ymm13,%ymm9,%ymm0 + cmp $0x00c0,%rcx + jl .Lxorpart8 + vpxord 0x00a0(%rdx),%ymm0,%ymm0 + vmovdqu64 %ymm0,0x00a0(%rsi) + vperm2i128 $0x31,%ymm13,%ymm9,%ymm13 + + vperm2i128 $0x20,%ymm7,%ymm3,%ymm0 + cmp $0x00e0,%rcx + jl .Lxorpart8 + vpxord 0x00c0(%rdx),%ymm0,%ymm0 + vmovdqu64 %ymm0,0x00c0(%rsi) + vperm2i128 $0x31,%ymm7,%ymm3,%ymm7 + + vperm2i128 $0x20,%ymm15,%ymm11,%ymm0 + cmp $0x0100,%rcx + jl .Lxorpart8 + vpxord 0x00e0(%rdx),%ymm0,%ymm0 + vmovdqu64 %ymm0,0x00e0(%rsi) + vperm2i128 $0x31,%ymm15,%ymm11,%ymm15 + + # xor remaining blocks, write to output + vmovdqa64 %ymm4,%ymm0 + cmp $0x0120,%rcx + jl .Lxorpart8 + vpxord 0x0100(%rdx),%ymm0,%ymm0 + vmovdqu64 %ymm0,0x0100(%rsi) + + vmovdqa64 %ymm12,%ymm0 + cmp $0x0140,%rcx + jl .Lxorpart8 + vpxord 0x0120(%rdx),%ymm0,%ymm0 + vmovdqu64 %ymm0,0x0120(%rsi) + + vmovdqa64 %ymm6,%ymm0 + cmp $0x0160,%rcx + jl .Lxorpart8 + vpxord 0x0140(%rdx),%ymm0,%ymm0 + vmovdqu64 %ymm0,0x0140(%rsi) + + vmovdqa64 %ymm14,%ymm0 + cmp $0x0180,%rcx + jl .Lxorpart8 + vpxord 0x0160(%rdx),%ymm0,%ymm0 + vmovdqu64 %ymm0,0x0160(%rsi) + + vmovdqa64 %ymm5,%ymm0 + cmp $0x01a0,%rcx + jl .Lxorpart8 + vpxord 0x0180(%rdx),%ymm0,%ymm0 + vmovdqu64 %ymm0,0x0180(%rsi) + + vmovdqa64 %ymm13,%ymm0 + cmp $0x01c0,%rcx + jl .Lxorpart8 + vpxord 0x01a0(%rdx),%ymm0,%ymm0 + vmovdqu64 %ymm0,0x01a0(%rsi) + + vmovdqa64 %ymm7,%ymm0 + cmp $0x01e0,%rcx + jl .Lxorpart8 + vpxord 0x01c0(%rdx),%ymm0,%ymm0 + vmovdqu64 %ymm0,0x01c0(%rsi) + + vmovdqa64 %ymm15,%ymm0 + cmp $0x0200,%rcx + jl .Lxorpart8 + vpxord 0x01e0(%rdx),%ymm0,%ymm0 + vmovdqu64 %ymm0,0x01e0(%rsi) + +.Ldone8: + vzeroupper + ret + +.Lxorpart8: + # xor remaining bytes from partial register into output + mov %rcx,%rax + and $0x1f,%rcx + jz .Ldone8 + mov %rax,%r9 + and $~0x1f,%r9 + + mov $1,%rax + shld %cl,%rax,%rax + sub $1,%rax + kmovq %rax,%k1 + + vmovdqu8 (%rdx,%r9),%ymm1{%k1}{z} + vpxord %ymm0,%ymm1,%ymm1 + vmovdqu8 %ymm1,(%rsi,%r9){%k1} + + jmp .Ldone8 + +ENDPROC(chacha_8block_xor_avx512vl) diff --git a/arch/x86/crypto/chacha20-ssse3-x86_64.S b/arch/x86/crypto/chacha-ssse3-x86_64.S similarity index 76% rename from arch/x86/crypto/chacha20-ssse3-x86_64.S rename to arch/x86/crypto/chacha-ssse3-x86_64.S index 512a2b500fd1..e5fd7b27c798 100644 --- a/arch/x86/crypto/chacha20-ssse3-x86_64.S +++ b/arch/x86/crypto/chacha-ssse3-x86_64.S @@ -1,5 +1,5 @@ /* - * ChaCha20 256-bit cipher algorithm, RFC7539, x64 SSSE3 functions + * ChaCha 256-bit cipher algorithm, x64 SSSE3 functions * * Copyright (C) 2015 Martin Willi * @@ -10,6 +10,7 @@ */ #include +#include .section .rodata.cst16.ROT8, "aM", @progbits, 16 .align 16 @@ -23,35 +24,25 @@ CTRINC: .octa 0x00000003000000020000000100000000 .text -ENTRY(chacha20_block_xor_ssse3) - # %rdi: Input state matrix, s - # %rsi: 1 data block output, o - # %rdx: 1 data block input, i - - # This function encrypts one ChaCha20 block by loading the state matrix - # in four SSE registers. It performs matrix operation on four words in - # parallel, but requireds shuffling to rearrange the words after each - # round. 8/16-bit word rotation is done with the slightly better - # performing SSSE3 byte shuffling, 7/12-bit word rotation uses - # traditional shift+OR. - - # x0..3 = s0..3 - movdqa 0x00(%rdi),%xmm0 - movdqa 0x10(%rdi),%xmm1 - movdqa 0x20(%rdi),%xmm2 - movdqa 0x30(%rdi),%xmm3 - movdqa %xmm0,%xmm8 - movdqa %xmm1,%xmm9 - movdqa %xmm2,%xmm10 - movdqa %xmm3,%xmm11 +/* + * chacha_permute - permute one block + * + * Permute one 64-byte block where the state matrix is in %xmm0-%xmm3. This + * function performs matrix operations on four words in parallel, but requires + * shuffling to rearrange the words after each round. 8/16-bit word rotation is + * done with the slightly better performing SSSE3 byte shuffling, 7/12-bit word + * rotation uses traditional shift+OR. + * + * The round count is given in %r8d. + * + * Clobbers: %r8d, %xmm4-%xmm7 + */ +chacha_permute: movdqa ROT8(%rip),%xmm4 movdqa ROT16(%rip),%xmm5 - mov $10,%ecx - .Ldoubleround: - # x0 += x1, x3 = rotl32(x3 ^ x0, 16) paddd %xmm1,%xmm0 pxor %xmm0,%xmm3 @@ -118,39 +109,129 @@ ENTRY(chacha20_block_xor_ssse3) # x3 = shuffle32(x3, MASK(0, 3, 2, 1)) pshufd $0x39,%xmm3,%xmm3 - dec %ecx + sub $2,%r8d jnz .Ldoubleround + ret +ENDPROC(chacha_permute) + +ENTRY(chacha_block_xor_ssse3) + # %rdi: Input state matrix, s + # %rsi: up to 1 data block output, o + # %rdx: up to 1 data block input, i + # %rcx: input/output length in bytes + # %r8d: nrounds + FRAME_BEGIN + + # x0..3 = s0..3 + movdqu 0x00(%rdi),%xmm0 + movdqu 0x10(%rdi),%xmm1 + movdqu 0x20(%rdi),%xmm2 + movdqu 0x30(%rdi),%xmm3 + movdqa %xmm0,%xmm8 + movdqa %xmm1,%xmm9 + movdqa %xmm2,%xmm10 + movdqa %xmm3,%xmm11 + + mov %rcx,%rax + call chacha_permute + # o0 = i0 ^ (x0 + s0) - movdqu 0x00(%rdx),%xmm4 paddd %xmm8,%xmm0 + cmp $0x10,%rax + jl .Lxorpart + movdqu 0x00(%rdx),%xmm4 pxor %xmm4,%xmm0 movdqu %xmm0,0x00(%rsi) # o1 = i1 ^ (x1 + s1) - movdqu 0x10(%rdx),%xmm5 paddd %xmm9,%xmm1 - pxor %xmm5,%xmm1 - movdqu %xmm1,0x10(%rsi) + movdqa %xmm1,%xmm0 + cmp $0x20,%rax + jl .Lxorpart + movdqu 0x10(%rdx),%xmm0 + pxor %xmm1,%xmm0 + movdqu %xmm0,0x10(%rsi) # o2 = i2 ^ (x2 + s2) - movdqu 0x20(%rdx),%xmm6 paddd %xmm10,%xmm2 - pxor %xmm6,%xmm2 - movdqu %xmm2,0x20(%rsi) + movdqa %xmm2,%xmm0 + cmp $0x30,%rax + jl .Lxorpart + movdqu 0x20(%rdx),%xmm0 + pxor %xmm2,%xmm0 + movdqu %xmm0,0x20(%rsi) # o3 = i3 ^ (x3 + s3) - movdqu 0x30(%rdx),%xmm7 paddd %xmm11,%xmm3 - pxor %xmm7,%xmm3 - movdqu %xmm3,0x30(%rsi) + movdqa %xmm3,%xmm0 + cmp $0x40,%rax + jl .Lxorpart + movdqu 0x30(%rdx),%xmm0 + pxor %xmm3,%xmm0 + movdqu %xmm0,0x30(%rsi) +.Ldone: + FRAME_END ret -ENDPROC(chacha20_block_xor_ssse3) -ENTRY(chacha20_4block_xor_ssse3) +.Lxorpart: + # xor remaining bytes from partial register into output + mov %rax,%r9 + and $0x0f,%r9 + jz .Ldone + and $~0x0f,%rax + + mov %rsi,%r11 + + lea 8(%rsp),%r10 + sub $0x10,%rsp + and $~31,%rsp + + lea (%rdx,%rax),%rsi + mov %rsp,%rdi + mov %r9,%rcx + rep movsb + + pxor 0x00(%rsp),%xmm0 + movdqa %xmm0,0x00(%rsp) + + mov %rsp,%rsi + lea (%r11,%rax),%rdi + mov %r9,%rcx + rep movsb + + lea -8(%r10),%rsp + jmp .Ldone + +ENDPROC(chacha_block_xor_ssse3) + +ENTRY(hchacha_block_ssse3) # %rdi: Input state matrix, s - # %rsi: 4 data blocks output, o - # %rdx: 4 data blocks input, i + # %rsi: output (8 32-bit words) + # %edx: nrounds + FRAME_BEGIN - # This function encrypts four consecutive ChaCha20 blocks by loading the + movdqu 0x00(%rdi),%xmm0 + movdqu 0x10(%rdi),%xmm1 + movdqu 0x20(%rdi),%xmm2 + movdqu 0x30(%rdi),%xmm3 + + mov %edx,%r8d + call chacha_permute + + movdqu %xmm0,0x00(%rsi) + movdqu %xmm3,0x10(%rsi) + + FRAME_END + ret +ENDPROC(hchacha_block_ssse3) + +ENTRY(chacha_4block_xor_ssse3) + # %rdi: Input state matrix, s + # %rsi: up to 4 data blocks output, o + # %rdx: up to 4 data blocks input, i + # %rcx: input/output length in bytes + # %r8d: nrounds + + # This function encrypts four consecutive ChaCha blocks by loading the # the state matrix in SSE registers four times. As we need some scratch # registers, we save the first four registers on the stack. The # algorithm performs each operation on the corresponding word of each @@ -163,6 +244,7 @@ ENTRY(chacha20_4block_xor_ssse3) lea 8(%rsp),%r10 sub $0x80,%rsp and $~63,%rsp + mov %rcx,%rax # x0..15[0-3] = s0..3[0..3] movq 0x00(%rdi),%xmm1 @@ -202,8 +284,6 @@ ENTRY(chacha20_4block_xor_ssse3) # x12 += counter values 0-3 paddd %xmm1,%xmm12 - mov $10,%ecx - .Ldoubleround4: # x0 += x4, x12 = rotl32(x12 ^ x0, 16) movdqa 0x00(%rsp),%xmm0 @@ -421,7 +501,7 @@ ENTRY(chacha20_4block_xor_ssse3) psrld $25,%xmm4 por %xmm0,%xmm4 - dec %ecx + sub $2,%r8d jnz .Ldoubleround4 # x0[0-3] += s0[0] @@ -573,58 +653,143 @@ ENTRY(chacha20_4block_xor_ssse3) # xor with corresponding input, write to output movdqa 0x00(%rsp),%xmm0 + cmp $0x10,%rax + jl .Lxorpart4 movdqu 0x00(%rdx),%xmm1 pxor %xmm1,%xmm0 movdqu %xmm0,0x00(%rsi) - movdqa 0x10(%rsp),%xmm0 - movdqu 0x80(%rdx),%xmm1 + + movdqu %xmm4,%xmm0 + cmp $0x20,%rax + jl .Lxorpart4 + movdqu 0x10(%rdx),%xmm1 pxor %xmm1,%xmm0 - movdqu %xmm0,0x80(%rsi) + movdqu %xmm0,0x10(%rsi) + + movdqu %xmm8,%xmm0 + cmp $0x30,%rax + jl .Lxorpart4 + movdqu 0x20(%rdx),%xmm1 + pxor %xmm1,%xmm0 + movdqu %xmm0,0x20(%rsi) + + movdqu %xmm12,%xmm0 + cmp $0x40,%rax + jl .Lxorpart4 + movdqu 0x30(%rdx),%xmm1 + pxor %xmm1,%xmm0 + movdqu %xmm0,0x30(%rsi) + movdqa 0x20(%rsp),%xmm0 + cmp $0x50,%rax + jl .Lxorpart4 movdqu 0x40(%rdx),%xmm1 pxor %xmm1,%xmm0 movdqu %xmm0,0x40(%rsi) + + movdqu %xmm6,%xmm0 + cmp $0x60,%rax + jl .Lxorpart4 + movdqu 0x50(%rdx),%xmm1 + pxor %xmm1,%xmm0 + movdqu %xmm0,0x50(%rsi) + + movdqu %xmm10,%xmm0 + cmp $0x70,%rax + jl .Lxorpart4 + movdqu 0x60(%rdx),%xmm1 + pxor %xmm1,%xmm0 + movdqu %xmm0,0x60(%rsi) + + movdqu %xmm14,%xmm0 + cmp $0x80,%rax + jl .Lxorpart4 + movdqu 0x70(%rdx),%xmm1 + pxor %xmm1,%xmm0 + movdqu %xmm0,0x70(%rsi) + + movdqa 0x10(%rsp),%xmm0 + cmp $0x90,%rax + jl .Lxorpart4 + movdqu 0x80(%rdx),%xmm1 + pxor %xmm1,%xmm0 + movdqu %xmm0,0x80(%rsi) + + movdqu %xmm5,%xmm0 + cmp $0xa0,%rax + jl .Lxorpart4 + movdqu 0x90(%rdx),%xmm1 + pxor %xmm1,%xmm0 + movdqu %xmm0,0x90(%rsi) + + movdqu %xmm9,%xmm0 + cmp $0xb0,%rax + jl .Lxorpart4 + movdqu 0xa0(%rdx),%xmm1 + pxor %xmm1,%xmm0 + movdqu %xmm0,0xa0(%rsi) + + movdqu %xmm13,%xmm0 + cmp $0xc0,%rax + jl .Lxorpart4 + movdqu 0xb0(%rdx),%xmm1 + pxor %xmm1,%xmm0 + movdqu %xmm0,0xb0(%rsi) + movdqa 0x30(%rsp),%xmm0 + cmp $0xd0,%rax + jl .Lxorpart4 movdqu 0xc0(%rdx),%xmm1 pxor %xmm1,%xmm0 movdqu %xmm0,0xc0(%rsi) - movdqu 0x10(%rdx),%xmm1 - pxor %xmm1,%xmm4 - movdqu %xmm4,0x10(%rsi) - movdqu 0x90(%rdx),%xmm1 - pxor %xmm1,%xmm5 - movdqu %xmm5,0x90(%rsi) - movdqu 0x50(%rdx),%xmm1 - pxor %xmm1,%xmm6 - movdqu %xmm6,0x50(%rsi) - movdqu 0xd0(%rdx),%xmm1 - pxor %xmm1,%xmm7 - movdqu %xmm7,0xd0(%rsi) - movdqu 0x20(%rdx),%xmm1 - pxor %xmm1,%xmm8 - movdqu %xmm8,0x20(%rsi) - movdqu 0xa0(%rdx),%xmm1 - pxor %xmm1,%xmm9 - movdqu %xmm9,0xa0(%rsi) - movdqu 0x60(%rdx),%xmm1 - pxor %xmm1,%xmm10 - movdqu %xmm10,0x60(%rsi) - movdqu 0xe0(%rdx),%xmm1 - pxor %xmm1,%xmm11 - movdqu %xmm11,0xe0(%rsi) - movdqu 0x30(%rdx),%xmm1 - pxor %xmm1,%xmm12 - movdqu %xmm12,0x30(%rsi) - movdqu 0xb0(%rdx),%xmm1 - pxor %xmm1,%xmm13 - movdqu %xmm13,0xb0(%rsi) - movdqu 0x70(%rdx),%xmm1 - pxor %xmm1,%xmm14 - movdqu %xmm14,0x70(%rsi) - movdqu 0xf0(%rdx),%xmm1 - pxor %xmm1,%xmm15 - movdqu %xmm15,0xf0(%rsi) + movdqu %xmm7,%xmm0 + cmp $0xe0,%rax + jl .Lxorpart4 + movdqu 0xd0(%rdx),%xmm1 + pxor %xmm1,%xmm0 + movdqu %xmm0,0xd0(%rsi) + + movdqu %xmm11,%xmm0 + cmp $0xf0,%rax + jl .Lxorpart4 + movdqu 0xe0(%rdx),%xmm1 + pxor %xmm1,%xmm0 + movdqu %xmm0,0xe0(%rsi) + + movdqu %xmm15,%xmm0 + cmp $0x100,%rax + jl .Lxorpart4 + movdqu 0xf0(%rdx),%xmm1 + pxor %xmm1,%xmm0 + movdqu %xmm0,0xf0(%rsi) + +.Ldone4: lea -8(%r10),%rsp ret -ENDPROC(chacha20_4block_xor_ssse3) + +.Lxorpart4: + # xor remaining bytes from partial register into output + mov %rax,%r9 + and $0x0f,%r9 + jz .Ldone4 + and $~0x0f,%rax + + mov %rsi,%r11 + + lea (%rdx,%rax),%rsi + mov %rsp,%rdi + mov %r9,%rcx + rep movsb + + pxor 0x00(%rsp),%xmm0 + movdqa %xmm0,0x00(%rsp) + + mov %rsp,%rsi + lea (%r11,%rax),%rdi + mov %r9,%rcx + rep movsb + + jmp .Ldone4 + +ENDPROC(chacha_4block_xor_ssse3) diff --git a/arch/x86/crypto/chacha20-avx2-x86_64.S b/arch/x86/crypto/chacha20-avx2-x86_64.S deleted file mode 100644 index f3cd26f48332..000000000000 --- a/arch/x86/crypto/chacha20-avx2-x86_64.S +++ /dev/null @@ -1,448 +0,0 @@ -/* - * ChaCha20 256-bit cipher algorithm, RFC7539, x64 AVX2 functions - * - * Copyright (C) 2015 Martin Willi - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - */ - -#include - -.section .rodata.cst32.ROT8, "aM", @progbits, 32 -.align 32 -ROT8: .octa 0x0e0d0c0f0a09080b0605040702010003 - .octa 0x0e0d0c0f0a09080b0605040702010003 - -.section .rodata.cst32.ROT16, "aM", @progbits, 32 -.align 32 -ROT16: .octa 0x0d0c0f0e09080b0a0504070601000302 - .octa 0x0d0c0f0e09080b0a0504070601000302 - -.section .rodata.cst32.CTRINC, "aM", @progbits, 32 -.align 32 -CTRINC: .octa 0x00000003000000020000000100000000 - .octa 0x00000007000000060000000500000004 - -.text - -ENTRY(chacha20_8block_xor_avx2) - # %rdi: Input state matrix, s - # %rsi: 8 data blocks output, o - # %rdx: 8 data blocks input, i - - # This function encrypts eight consecutive ChaCha20 blocks by loading - # the state matrix in AVX registers eight times. As we need some - # scratch registers, we save the first four registers on the stack. The - # algorithm performs each operation on the corresponding word of each - # state matrix, hence requires no word shuffling. For final XORing step - # we transpose the matrix by interleaving 32-, 64- and then 128-bit - # words, which allows us to do XOR in AVX registers. 8/16-bit word - # rotation is done with the slightly better performing byte shuffling, - # 7/12-bit word rotation uses traditional shift+OR. - - vzeroupper - # 4 * 32 byte stack, 32-byte aligned - lea 8(%rsp),%r10 - and $~31, %rsp - sub $0x80, %rsp - - # x0..15[0-7] = s[0..15] - vpbroadcastd 0x00(%rdi),%ymm0 - vpbroadcastd 0x04(%rdi),%ymm1 - vpbroadcastd 0x08(%rdi),%ymm2 - vpbroadcastd 0x0c(%rdi),%ymm3 - vpbroadcastd 0x10(%rdi),%ymm4 - vpbroadcastd 0x14(%rdi),%ymm5 - vpbroadcastd 0x18(%rdi),%ymm6 - vpbroadcastd 0x1c(%rdi),%ymm7 - vpbroadcastd 0x20(%rdi),%ymm8 - vpbroadcastd 0x24(%rdi),%ymm9 - vpbroadcastd 0x28(%rdi),%ymm10 - vpbroadcastd 0x2c(%rdi),%ymm11 - vpbroadcastd 0x30(%rdi),%ymm12 - vpbroadcastd 0x34(%rdi),%ymm13 - vpbroadcastd 0x38(%rdi),%ymm14 - vpbroadcastd 0x3c(%rdi),%ymm15 - # x0..3 on stack - vmovdqa %ymm0,0x00(%rsp) - vmovdqa %ymm1,0x20(%rsp) - vmovdqa %ymm2,0x40(%rsp) - vmovdqa %ymm3,0x60(%rsp) - - vmovdqa CTRINC(%rip),%ymm1 - vmovdqa ROT8(%rip),%ymm2 - vmovdqa ROT16(%rip),%ymm3 - - # x12 += counter values 0-3 - vpaddd %ymm1,%ymm12,%ymm12 - - mov $10,%ecx - -.Ldoubleround8: - # x0 += x4, x12 = rotl32(x12 ^ x0, 16) - vpaddd 0x00(%rsp),%ymm4,%ymm0 - vmovdqa %ymm0,0x00(%rsp) - vpxor %ymm0,%ymm12,%ymm12 - vpshufb %ymm3,%ymm12,%ymm12 - # x1 += x5, x13 = rotl32(x13 ^ x1, 16) - vpaddd 0x20(%rsp),%ymm5,%ymm0 - vmovdqa %ymm0,0x20(%rsp) - vpxor %ymm0,%ymm13,%ymm13 - vpshufb %ymm3,%ymm13,%ymm13 - # x2 += x6, x14 = rotl32(x14 ^ x2, 16) - vpaddd 0x40(%rsp),%ymm6,%ymm0 - vmovdqa %ymm0,0x40(%rsp) - vpxor %ymm0,%ymm14,%ymm14 - vpshufb %ymm3,%ymm14,%ymm14 - # x3 += x7, x15 = rotl32(x15 ^ x3, 16) - vpaddd 0x60(%rsp),%ymm7,%ymm0 - vmovdqa %ymm0,0x60(%rsp) - vpxor %ymm0,%ymm15,%ymm15 - vpshufb %ymm3,%ymm15,%ymm15 - - # x8 += x12, x4 = rotl32(x4 ^ x8, 12) - vpaddd %ymm12,%ymm8,%ymm8 - vpxor %ymm8,%ymm4,%ymm4 - vpslld $12,%ymm4,%ymm0 - vpsrld $20,%ymm4,%ymm4 - vpor %ymm0,%ymm4,%ymm4 - # x9 += x13, x5 = rotl32(x5 ^ x9, 12) - vpaddd %ymm13,%ymm9,%ymm9 - vpxor %ymm9,%ymm5,%ymm5 - vpslld $12,%ymm5,%ymm0 - vpsrld $20,%ymm5,%ymm5 - vpor %ymm0,%ymm5,%ymm5 - # x10 += x14, x6 = rotl32(x6 ^ x10, 12) - vpaddd %ymm14,%ymm10,%ymm10 - vpxor %ymm10,%ymm6,%ymm6 - vpslld $12,%ymm6,%ymm0 - vpsrld $20,%ymm6,%ymm6 - vpor %ymm0,%ymm6,%ymm6 - # x11 += x15, x7 = rotl32(x7 ^ x11, 12) - vpaddd %ymm15,%ymm11,%ymm11 - vpxor %ymm11,%ymm7,%ymm7 - vpslld $12,%ymm7,%ymm0 - vpsrld $20,%ymm7,%ymm7 - vpor %ymm0,%ymm7,%ymm7 - - # x0 += x4, x12 = rotl32(x12 ^ x0, 8) - vpaddd 0x00(%rsp),%ymm4,%ymm0 - vmovdqa %ymm0,0x00(%rsp) - vpxor %ymm0,%ymm12,%ymm12 - vpshufb %ymm2,%ymm12,%ymm12 - # x1 += x5, x13 = rotl32(x13 ^ x1, 8) - vpaddd 0x20(%rsp),%ymm5,%ymm0 - vmovdqa %ymm0,0x20(%rsp) - vpxor %ymm0,%ymm13,%ymm13 - vpshufb %ymm2,%ymm13,%ymm13 - # x2 += x6, x14 = rotl32(x14 ^ x2, 8) - vpaddd 0x40(%rsp),%ymm6,%ymm0 - vmovdqa %ymm0,0x40(%rsp) - vpxor %ymm0,%ymm14,%ymm14 - vpshufb %ymm2,%ymm14,%ymm14 - # x3 += x7, x15 = rotl32(x15 ^ x3, 8) - vpaddd 0x60(%rsp),%ymm7,%ymm0 - vmovdqa %ymm0,0x60(%rsp) - vpxor %ymm0,%ymm15,%ymm15 - vpshufb %ymm2,%ymm15,%ymm15 - - # x8 += x12, x4 = rotl32(x4 ^ x8, 7) - vpaddd %ymm12,%ymm8,%ymm8 - vpxor %ymm8,%ymm4,%ymm4 - vpslld $7,%ymm4,%ymm0 - vpsrld $25,%ymm4,%ymm4 - vpor %ymm0,%ymm4,%ymm4 - # x9 += x13, x5 = rotl32(x5 ^ x9, 7) - vpaddd %ymm13,%ymm9,%ymm9 - vpxor %ymm9,%ymm5,%ymm5 - vpslld $7,%ymm5,%ymm0 - vpsrld $25,%ymm5,%ymm5 - vpor %ymm0,%ymm5,%ymm5 - # x10 += x14, x6 = rotl32(x6 ^ x10, 7) - vpaddd %ymm14,%ymm10,%ymm10 - vpxor %ymm10,%ymm6,%ymm6 - vpslld $7,%ymm6,%ymm0 - vpsrld $25,%ymm6,%ymm6 - vpor %ymm0,%ymm6,%ymm6 - # x11 += x15, x7 = rotl32(x7 ^ x11, 7) - vpaddd %ymm15,%ymm11,%ymm11 - vpxor %ymm11,%ymm7,%ymm7 - vpslld $7,%ymm7,%ymm0 - vpsrld $25,%ymm7,%ymm7 - vpor %ymm0,%ymm7,%ymm7 - - # x0 += x5, x15 = rotl32(x15 ^ x0, 16) - vpaddd 0x00(%rsp),%ymm5,%ymm0 - vmovdqa %ymm0,0x00(%rsp) - vpxor %ymm0,%ymm15,%ymm15 - vpshufb %ymm3,%ymm15,%ymm15 - # x1 += x6, x12 = rotl32(x12 ^ x1, 16)%ymm0 - vpaddd 0x20(%rsp),%ymm6,%ymm0 - vmovdqa %ymm0,0x20(%rsp) - vpxor %ymm0,%ymm12,%ymm12 - vpshufb %ymm3,%ymm12,%ymm12 - # x2 += x7, x13 = rotl32(x13 ^ x2, 16) - vpaddd 0x40(%rsp),%ymm7,%ymm0 - vmovdqa %ymm0,0x40(%rsp) - vpxor %ymm0,%ymm13,%ymm13 - vpshufb %ymm3,%ymm13,%ymm13 - # x3 += x4, x14 = rotl32(x14 ^ x3, 16) - vpaddd 0x60(%rsp),%ymm4,%ymm0 - vmovdqa %ymm0,0x60(%rsp) - vpxor %ymm0,%ymm14,%ymm14 - vpshufb %ymm3,%ymm14,%ymm14 - - # x10 += x15, x5 = rotl32(x5 ^ x10, 12) - vpaddd %ymm15,%ymm10,%ymm10 - vpxor %ymm10,%ymm5,%ymm5 - vpslld $12,%ymm5,%ymm0 - vpsrld $20,%ymm5,%ymm5 - vpor %ymm0,%ymm5,%ymm5 - # x11 += x12, x6 = rotl32(x6 ^ x11, 12) - vpaddd %ymm12,%ymm11,%ymm11 - vpxor %ymm11,%ymm6,%ymm6 - vpslld $12,%ymm6,%ymm0 - vpsrld $20,%ymm6,%ymm6 - vpor %ymm0,%ymm6,%ymm6 - # x8 += x13, x7 = rotl32(x7 ^ x8, 12) - vpaddd %ymm13,%ymm8,%ymm8 - vpxor %ymm8,%ymm7,%ymm7 - vpslld $12,%ymm7,%ymm0 - vpsrld $20,%ymm7,%ymm7 - vpor %ymm0,%ymm7,%ymm7 - # x9 += x14, x4 = rotl32(x4 ^ x9, 12) - vpaddd %ymm14,%ymm9,%ymm9 - vpxor %ymm9,%ymm4,%ymm4 - vpslld $12,%ymm4,%ymm0 - vpsrld $20,%ymm4,%ymm4 - vpor %ymm0,%ymm4,%ymm4 - - # x0 += x5, x15 = rotl32(x15 ^ x0, 8) - vpaddd 0x00(%rsp),%ymm5,%ymm0 - vmovdqa %ymm0,0x00(%rsp) - vpxor %ymm0,%ymm15,%ymm15 - vpshufb %ymm2,%ymm15,%ymm15 - # x1 += x6, x12 = rotl32(x12 ^ x1, 8) - vpaddd 0x20(%rsp),%ymm6,%ymm0 - vmovdqa %ymm0,0x20(%rsp) - vpxor %ymm0,%ymm12,%ymm12 - vpshufb %ymm2,%ymm12,%ymm12 - # x2 += x7, x13 = rotl32(x13 ^ x2, 8) - vpaddd 0x40(%rsp),%ymm7,%ymm0 - vmovdqa %ymm0,0x40(%rsp) - vpxor %ymm0,%ymm13,%ymm13 - vpshufb %ymm2,%ymm13,%ymm13 - # x3 += x4, x14 = rotl32(x14 ^ x3, 8) - vpaddd 0x60(%rsp),%ymm4,%ymm0 - vmovdqa %ymm0,0x60(%rsp) - vpxor %ymm0,%ymm14,%ymm14 - vpshufb %ymm2,%ymm14,%ymm14 - - # x10 += x15, x5 = rotl32(x5 ^ x10, 7) - vpaddd %ymm15,%ymm10,%ymm10 - vpxor %ymm10,%ymm5,%ymm5 - vpslld $7,%ymm5,%ymm0 - vpsrld $25,%ymm5,%ymm5 - vpor %ymm0,%ymm5,%ymm5 - # x11 += x12, x6 = rotl32(x6 ^ x11, 7) - vpaddd %ymm12,%ymm11,%ymm11 - vpxor %ymm11,%ymm6,%ymm6 - vpslld $7,%ymm6,%ymm0 - vpsrld $25,%ymm6,%ymm6 - vpor %ymm0,%ymm6,%ymm6 - # x8 += x13, x7 = rotl32(x7 ^ x8, 7) - vpaddd %ymm13,%ymm8,%ymm8 - vpxor %ymm8,%ymm7,%ymm7 - vpslld $7,%ymm7,%ymm0 - vpsrld $25,%ymm7,%ymm7 - vpor %ymm0,%ymm7,%ymm7 - # x9 += x14, x4 = rotl32(x4 ^ x9, 7) - vpaddd %ymm14,%ymm9,%ymm9 - vpxor %ymm9,%ymm4,%ymm4 - vpslld $7,%ymm4,%ymm0 - vpsrld $25,%ymm4,%ymm4 - vpor %ymm0,%ymm4,%ymm4 - - dec %ecx - jnz .Ldoubleround8 - - # x0..15[0-3] += s[0..15] - vpbroadcastd 0x00(%rdi),%ymm0 - vpaddd 0x00(%rsp),%ymm0,%ymm0 - vmovdqa %ymm0,0x00(%rsp) - vpbroadcastd 0x04(%rdi),%ymm0 - vpaddd 0x20(%rsp),%ymm0,%ymm0 - vmovdqa %ymm0,0x20(%rsp) - vpbroadcastd 0x08(%rdi),%ymm0 - vpaddd 0x40(%rsp),%ymm0,%ymm0 - vmovdqa %ymm0,0x40(%rsp) - vpbroadcastd 0x0c(%rdi),%ymm0 - vpaddd 0x60(%rsp),%ymm0,%ymm0 - vmovdqa %ymm0,0x60(%rsp) - vpbroadcastd 0x10(%rdi),%ymm0 - vpaddd %ymm0,%ymm4,%ymm4 - vpbroadcastd 0x14(%rdi),%ymm0 - vpaddd %ymm0,%ymm5,%ymm5 - vpbroadcastd 0x18(%rdi),%ymm0 - vpaddd %ymm0,%ymm6,%ymm6 - vpbroadcastd 0x1c(%rdi),%ymm0 - vpaddd %ymm0,%ymm7,%ymm7 - vpbroadcastd 0x20(%rdi),%ymm0 - vpaddd %ymm0,%ymm8,%ymm8 - vpbroadcastd 0x24(%rdi),%ymm0 - vpaddd %ymm0,%ymm9,%ymm9 - vpbroadcastd 0x28(%rdi),%ymm0 - vpaddd %ymm0,%ymm10,%ymm10 - vpbroadcastd 0x2c(%rdi),%ymm0 - vpaddd %ymm0,%ymm11,%ymm11 - vpbroadcastd 0x30(%rdi),%ymm0 - vpaddd %ymm0,%ymm12,%ymm12 - vpbroadcastd 0x34(%rdi),%ymm0 - vpaddd %ymm0,%ymm13,%ymm13 - vpbroadcastd 0x38(%rdi),%ymm0 - vpaddd %ymm0,%ymm14,%ymm14 - vpbroadcastd 0x3c(%rdi),%ymm0 - vpaddd %ymm0,%ymm15,%ymm15 - - # x12 += counter values 0-3 - vpaddd %ymm1,%ymm12,%ymm12 - - # interleave 32-bit words in state n, n+1 - vmovdqa 0x00(%rsp),%ymm0 - vmovdqa 0x20(%rsp),%ymm1 - vpunpckldq %ymm1,%ymm0,%ymm2 - vpunpckhdq %ymm1,%ymm0,%ymm1 - vmovdqa %ymm2,0x00(%rsp) - vmovdqa %ymm1,0x20(%rsp) - vmovdqa 0x40(%rsp),%ymm0 - vmovdqa 0x60(%rsp),%ymm1 - vpunpckldq %ymm1,%ymm0,%ymm2 - vpunpckhdq %ymm1,%ymm0,%ymm1 - vmovdqa %ymm2,0x40(%rsp) - vmovdqa %ymm1,0x60(%rsp) - vmovdqa %ymm4,%ymm0 - vpunpckldq %ymm5,%ymm0,%ymm4 - vpunpckhdq %ymm5,%ymm0,%ymm5 - vmovdqa %ymm6,%ymm0 - vpunpckldq %ymm7,%ymm0,%ymm6 - vpunpckhdq %ymm7,%ymm0,%ymm7 - vmovdqa %ymm8,%ymm0 - vpunpckldq %ymm9,%ymm0,%ymm8 - vpunpckhdq %ymm9,%ymm0,%ymm9 - vmovdqa %ymm10,%ymm0 - vpunpckldq %ymm11,%ymm0,%ymm10 - vpunpckhdq %ymm11,%ymm0,%ymm11 - vmovdqa %ymm12,%ymm0 - vpunpckldq %ymm13,%ymm0,%ymm12 - vpunpckhdq %ymm13,%ymm0,%ymm13 - vmovdqa %ymm14,%ymm0 - vpunpckldq %ymm15,%ymm0,%ymm14 - vpunpckhdq %ymm15,%ymm0,%ymm15 - - # interleave 64-bit words in state n, n+2 - vmovdqa 0x00(%rsp),%ymm0 - vmovdqa 0x40(%rsp),%ymm2 - vpunpcklqdq %ymm2,%ymm0,%ymm1 - vpunpckhqdq %ymm2,%ymm0,%ymm2 - vmovdqa %ymm1,0x00(%rsp) - vmovdqa %ymm2,0x40(%rsp) - vmovdqa 0x20(%rsp),%ymm0 - vmovdqa 0x60(%rsp),%ymm2 - vpunpcklqdq %ymm2,%ymm0,%ymm1 - vpunpckhqdq %ymm2,%ymm0,%ymm2 - vmovdqa %ymm1,0x20(%rsp) - vmovdqa %ymm2,0x60(%rsp) - vmovdqa %ymm4,%ymm0 - vpunpcklqdq %ymm6,%ymm0,%ymm4 - vpunpckhqdq %ymm6,%ymm0,%ymm6 - vmovdqa %ymm5,%ymm0 - vpunpcklqdq %ymm7,%ymm0,%ymm5 - vpunpckhqdq %ymm7,%ymm0,%ymm7 - vmovdqa %ymm8,%ymm0 - vpunpcklqdq %ymm10,%ymm0,%ymm8 - vpunpckhqdq %ymm10,%ymm0,%ymm10 - vmovdqa %ymm9,%ymm0 - vpunpcklqdq %ymm11,%ymm0,%ymm9 - vpunpckhqdq %ymm11,%ymm0,%ymm11 - vmovdqa %ymm12,%ymm0 - vpunpcklqdq %ymm14,%ymm0,%ymm12 - vpunpckhqdq %ymm14,%ymm0,%ymm14 - vmovdqa %ymm13,%ymm0 - vpunpcklqdq %ymm15,%ymm0,%ymm13 - vpunpckhqdq %ymm15,%ymm0,%ymm15 - - # interleave 128-bit words in state n, n+4 - vmovdqa 0x00(%rsp),%ymm0 - vperm2i128 $0x20,%ymm4,%ymm0,%ymm1 - vperm2i128 $0x31,%ymm4,%ymm0,%ymm4 - vmovdqa %ymm1,0x00(%rsp) - vmovdqa 0x20(%rsp),%ymm0 - vperm2i128 $0x20,%ymm5,%ymm0,%ymm1 - vperm2i128 $0x31,%ymm5,%ymm0,%ymm5 - vmovdqa %ymm1,0x20(%rsp) - vmovdqa 0x40(%rsp),%ymm0 - vperm2i128 $0x20,%ymm6,%ymm0,%ymm1 - vperm2i128 $0x31,%ymm6,%ymm0,%ymm6 - vmovdqa %ymm1,0x40(%rsp) - vmovdqa 0x60(%rsp),%ymm0 - vperm2i128 $0x20,%ymm7,%ymm0,%ymm1 - vperm2i128 $0x31,%ymm7,%ymm0,%ymm7 - vmovdqa %ymm1,0x60(%rsp) - vperm2i128 $0x20,%ymm12,%ymm8,%ymm0 - vperm2i128 $0x31,%ymm12,%ymm8,%ymm12 - vmovdqa %ymm0,%ymm8 - vperm2i128 $0x20,%ymm13,%ymm9,%ymm0 - vperm2i128 $0x31,%ymm13,%ymm9,%ymm13 - vmovdqa %ymm0,%ymm9 - vperm2i128 $0x20,%ymm14,%ymm10,%ymm0 - vperm2i128 $0x31,%ymm14,%ymm10,%ymm14 - vmovdqa %ymm0,%ymm10 - vperm2i128 $0x20,%ymm15,%ymm11,%ymm0 - vperm2i128 $0x31,%ymm15,%ymm11,%ymm15 - vmovdqa %ymm0,%ymm11 - - # xor with corresponding input, write to output - vmovdqa 0x00(%rsp),%ymm0 - vpxor 0x0000(%rdx),%ymm0,%ymm0 - vmovdqu %ymm0,0x0000(%rsi) - vmovdqa 0x20(%rsp),%ymm0 - vpxor 0x0080(%rdx),%ymm0,%ymm0 - vmovdqu %ymm0,0x0080(%rsi) - vmovdqa 0x40(%rsp),%ymm0 - vpxor 0x0040(%rdx),%ymm0,%ymm0 - vmovdqu %ymm0,0x0040(%rsi) - vmovdqa 0x60(%rsp),%ymm0 - vpxor 0x00c0(%rdx),%ymm0,%ymm0 - vmovdqu %ymm0,0x00c0(%rsi) - vpxor 0x0100(%rdx),%ymm4,%ymm4 - vmovdqu %ymm4,0x0100(%rsi) - vpxor 0x0180(%rdx),%ymm5,%ymm5 - vmovdqu %ymm5,0x00180(%rsi) - vpxor 0x0140(%rdx),%ymm6,%ymm6 - vmovdqu %ymm6,0x0140(%rsi) - vpxor 0x01c0(%rdx),%ymm7,%ymm7 - vmovdqu %ymm7,0x01c0(%rsi) - vpxor 0x0020(%rdx),%ymm8,%ymm8 - vmovdqu %ymm8,0x0020(%rsi) - vpxor 0x00a0(%rdx),%ymm9,%ymm9 - vmovdqu %ymm9,0x00a0(%rsi) - vpxor 0x0060(%rdx),%ymm10,%ymm10 - vmovdqu %ymm10,0x0060(%rsi) - vpxor 0x00e0(%rdx),%ymm11,%ymm11 - vmovdqu %ymm11,0x00e0(%rsi) - vpxor 0x0120(%rdx),%ymm12,%ymm12 - vmovdqu %ymm12,0x0120(%rsi) - vpxor 0x01a0(%rdx),%ymm13,%ymm13 - vmovdqu %ymm13,0x01a0(%rsi) - vpxor 0x0160(%rdx),%ymm14,%ymm14 - vmovdqu %ymm14,0x0160(%rsi) - vpxor 0x01e0(%rdx),%ymm15,%ymm15 - vmovdqu %ymm15,0x01e0(%rsi) - - vzeroupper - lea -8(%r10),%rsp - ret -ENDPROC(chacha20_8block_xor_avx2) diff --git a/arch/x86/crypto/chacha20_glue.c b/arch/x86/crypto/chacha20_glue.c deleted file mode 100644 index bd249f0b29dc..000000000000 --- a/arch/x86/crypto/chacha20_glue.c +++ /dev/null @@ -1,146 +0,0 @@ -/* - * ChaCha20 256-bit cipher algorithm, RFC7539, SIMD glue code - * - * Copyright (C) 2015 Martin Willi - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - */ - -#include -#include -#include -#include -#include -#include -#include - -#define CHACHA20_STATE_ALIGN 16 - -asmlinkage void chacha20_block_xor_ssse3(u32 *state, u8 *dst, const u8 *src); -asmlinkage void chacha20_4block_xor_ssse3(u32 *state, u8 *dst, const u8 *src); -#ifdef CONFIG_AS_AVX2 -asmlinkage void chacha20_8block_xor_avx2(u32 *state, u8 *dst, const u8 *src); -static bool chacha20_use_avx2; -#endif - -static void chacha20_dosimd(u32 *state, u8 *dst, const u8 *src, - unsigned int bytes) -{ - u8 buf[CHACHA_BLOCK_SIZE]; - -#ifdef CONFIG_AS_AVX2 - if (chacha20_use_avx2) { - while (bytes >= CHACHA_BLOCK_SIZE * 8) { - chacha20_8block_xor_avx2(state, dst, src); - bytes -= CHACHA_BLOCK_SIZE * 8; - src += CHACHA_BLOCK_SIZE * 8; - dst += CHACHA_BLOCK_SIZE * 8; - state[12] += 8; - } - } -#endif - while (bytes >= CHACHA_BLOCK_SIZE * 4) { - chacha20_4block_xor_ssse3(state, dst, src); - bytes -= CHACHA_BLOCK_SIZE * 4; - src += CHACHA_BLOCK_SIZE * 4; - dst += CHACHA_BLOCK_SIZE * 4; - state[12] += 4; - } - while (bytes >= CHACHA_BLOCK_SIZE) { - chacha20_block_xor_ssse3(state, dst, src); - bytes -= CHACHA_BLOCK_SIZE; - src += CHACHA_BLOCK_SIZE; - dst += CHACHA_BLOCK_SIZE; - state[12]++; - } - if (bytes) { - memcpy(buf, src, bytes); - chacha20_block_xor_ssse3(state, buf, buf); - memcpy(dst, buf, bytes); - } -} - -static int chacha20_simd(struct skcipher_request *req) -{ - struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); - struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm); - u32 *state, state_buf[16 + 2] __aligned(8); - struct skcipher_walk walk; - int err; - - BUILD_BUG_ON(CHACHA20_STATE_ALIGN != 16); - state = PTR_ALIGN(state_buf + 0, CHACHA20_STATE_ALIGN); - - if (req->cryptlen <= CHACHA_BLOCK_SIZE || !may_use_simd()) - return crypto_chacha_crypt(req); - - err = skcipher_walk_virt(&walk, req, true); - - crypto_chacha_init(state, ctx, walk.iv); - - kernel_fpu_begin(); - - while (walk.nbytes >= CHACHA_BLOCK_SIZE) { - chacha20_dosimd(state, walk.dst.virt.addr, walk.src.virt.addr, - rounddown(walk.nbytes, CHACHA_BLOCK_SIZE)); - err = skcipher_walk_done(&walk, - walk.nbytes % CHACHA_BLOCK_SIZE); - } - - if (walk.nbytes) { - chacha20_dosimd(state, walk.dst.virt.addr, walk.src.virt.addr, - walk.nbytes); - err = skcipher_walk_done(&walk, 0); - } - - kernel_fpu_end(); - - return err; -} - -static struct skcipher_alg alg = { - .base.cra_name = "chacha20", - .base.cra_driver_name = "chacha20-simd", - .base.cra_priority = 300, - .base.cra_blocksize = 1, - .base.cra_ctxsize = sizeof(struct chacha_ctx), - .base.cra_module = THIS_MODULE, - - .min_keysize = CHACHA_KEY_SIZE, - .max_keysize = CHACHA_KEY_SIZE, - .ivsize = CHACHA_IV_SIZE, - .chunksize = CHACHA_BLOCK_SIZE, - .setkey = crypto_chacha20_setkey, - .encrypt = chacha20_simd, - .decrypt = chacha20_simd, -}; - -static int __init chacha20_simd_mod_init(void) -{ - if (!boot_cpu_has(X86_FEATURE_SSSE3)) - return -ENODEV; - -#ifdef CONFIG_AS_AVX2 - chacha20_use_avx2 = boot_cpu_has(X86_FEATURE_AVX) && - boot_cpu_has(X86_FEATURE_AVX2) && - cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL); -#endif - return crypto_register_skcipher(&alg); -} - -static void __exit chacha20_simd_mod_fini(void) -{ - crypto_unregister_skcipher(&alg); -} - -module_init(chacha20_simd_mod_init); -module_exit(chacha20_simd_mod_fini); - -MODULE_LICENSE("GPL"); -MODULE_AUTHOR("Martin Willi "); -MODULE_DESCRIPTION("chacha20 cipher algorithm, SIMD accelerated"); -MODULE_ALIAS_CRYPTO("chacha20"); -MODULE_ALIAS_CRYPTO("chacha20-simd"); diff --git a/arch/x86/crypto/chacha_glue.c b/arch/x86/crypto/chacha_glue.c new file mode 100644 index 000000000000..f198292ac39e --- /dev/null +++ b/arch/x86/crypto/chacha_glue.c @@ -0,0 +1,322 @@ +/* + * x64 SIMD accelerated ChaCha and XChaCha stream ciphers, + * including ChaCha20 (RFC7539) + * + * Copyright (C) 2015 Martin Willi + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + */ + +#include +#include +#include +#include +#include +#include +#include + +asmlinkage void chacha_block_xor_ssse3(u32 *state, u8 *dst, const u8 *src, + unsigned int len, int nrounds); +asmlinkage void chacha_4block_xor_ssse3(u32 *state, u8 *dst, const u8 *src, + unsigned int len, int nrounds); +asmlinkage void hchacha_block_ssse3(const u32 *state, u32 *out, int nrounds); + +asmlinkage void chacha_2block_xor_avx2(u32 *state, u8 *dst, const u8 *src, + unsigned int len, int nrounds); +asmlinkage void chacha_4block_xor_avx2(u32 *state, u8 *dst, const u8 *src, + unsigned int len, int nrounds); +asmlinkage void chacha_8block_xor_avx2(u32 *state, u8 *dst, const u8 *src, + unsigned int len, int nrounds); + +asmlinkage void chacha_2block_xor_avx512vl(u32 *state, u8 *dst, const u8 *src, + unsigned int len, int nrounds); +asmlinkage void chacha_4block_xor_avx512vl(u32 *state, u8 *dst, const u8 *src, + unsigned int len, int nrounds); +asmlinkage void chacha_8block_xor_avx512vl(u32 *state, u8 *dst, const u8 *src, + unsigned int len, int nrounds); + +static __ro_after_init DEFINE_STATIC_KEY_FALSE(chacha_use_simd); +static __ro_after_init DEFINE_STATIC_KEY_FALSE(chacha_use_avx2); +static __ro_after_init DEFINE_STATIC_KEY_FALSE(chacha_use_avx512vl); + +static unsigned int chacha_advance(unsigned int len, unsigned int maxblocks) +{ + len = min(len, maxblocks * CHACHA_BLOCK_SIZE); + return round_up(len, CHACHA_BLOCK_SIZE) / CHACHA_BLOCK_SIZE; +} + +static void chacha_dosimd(u32 *state, u8 *dst, const u8 *src, + unsigned int bytes, int nrounds) +{ + if (IS_ENABLED(CONFIG_AS_AVX512) && + static_branch_likely(&chacha_use_avx512vl)) { + while (bytes >= CHACHA_BLOCK_SIZE * 8) { + chacha_8block_xor_avx512vl(state, dst, src, bytes, + nrounds); + bytes -= CHACHA_BLOCK_SIZE * 8; + src += CHACHA_BLOCK_SIZE * 8; + dst += CHACHA_BLOCK_SIZE * 8; + state[12] += 8; + } + if (bytes > CHACHA_BLOCK_SIZE * 4) { + chacha_8block_xor_avx512vl(state, dst, src, bytes, + nrounds); + state[12] += chacha_advance(bytes, 8); + return; + } + if (bytes > CHACHA_BLOCK_SIZE * 2) { + chacha_4block_xor_avx512vl(state, dst, src, bytes, + nrounds); + state[12] += chacha_advance(bytes, 4); + return; + } + if (bytes) { + chacha_2block_xor_avx512vl(state, dst, src, bytes, + nrounds); + state[12] += chacha_advance(bytes, 2); + return; + } + } + + if (IS_ENABLED(CONFIG_AS_AVX2) && + static_branch_likely(&chacha_use_avx2)) { + while (bytes >= CHACHA_BLOCK_SIZE * 8) { + chacha_8block_xor_avx2(state, dst, src, bytes, nrounds); + bytes -= CHACHA_BLOCK_SIZE * 8; + src += CHACHA_BLOCK_SIZE * 8; + dst += CHACHA_BLOCK_SIZE * 8; + state[12] += 8; + } + if (bytes > CHACHA_BLOCK_SIZE * 4) { + chacha_8block_xor_avx2(state, dst, src, bytes, nrounds); + state[12] += chacha_advance(bytes, 8); + return; + } + if (bytes > CHACHA_BLOCK_SIZE * 2) { + chacha_4block_xor_avx2(state, dst, src, bytes, nrounds); + state[12] += chacha_advance(bytes, 4); + return; + } + if (bytes > CHACHA_BLOCK_SIZE) { + chacha_2block_xor_avx2(state, dst, src, bytes, nrounds); + state[12] += chacha_advance(bytes, 2); + return; + } + } + + while (bytes >= CHACHA_BLOCK_SIZE * 4) { + chacha_4block_xor_ssse3(state, dst, src, bytes, nrounds); + bytes -= CHACHA_BLOCK_SIZE * 4; + src += CHACHA_BLOCK_SIZE * 4; + dst += CHACHA_BLOCK_SIZE * 4; + state[12] += 4; + } + if (bytes > CHACHA_BLOCK_SIZE) { + chacha_4block_xor_ssse3(state, dst, src, bytes, nrounds); + state[12] += chacha_advance(bytes, 4); + return; + } + if (bytes) { + chacha_block_xor_ssse3(state, dst, src, bytes, nrounds); + state[12]++; + } +} + +void hchacha_block_arch(const u32 *state, u32 *stream, int nrounds) +{ + if (!static_branch_likely(&chacha_use_simd) || !may_use_simd()) { + hchacha_block_generic(state, stream, nrounds); + } else { + kernel_fpu_begin(); + hchacha_block_ssse3(state, stream, nrounds); + kernel_fpu_end(); + } +} +EXPORT_SYMBOL(hchacha_block_arch); + +void chacha_init_arch(u32 *state, const u32 *key, const u8 *iv) +{ + chacha_init_generic(state, key, iv); +} +EXPORT_SYMBOL(chacha_init_arch); + +void chacha_crypt_arch(u32 *state, u8 *dst, const u8 *src, unsigned int bytes, + int nrounds) +{ + if (!static_branch_likely(&chacha_use_simd) || !may_use_simd() || + bytes <= CHACHA_BLOCK_SIZE) + return chacha_crypt_generic(state, dst, src, bytes, nrounds); + + do { + unsigned int todo = min_t(unsigned int, bytes, SZ_4K); + + kernel_fpu_begin(); + chacha_dosimd(state, dst, src, todo, nrounds); + kernel_fpu_end(); + + bytes -= todo; + src += todo; + dst += todo; + } while (bytes); +} +EXPORT_SYMBOL(chacha_crypt_arch); + +static int chacha_simd_stream_xor(struct skcipher_request *req, + const struct chacha_ctx *ctx, const u8 *iv) +{ + u32 state[CHACHA_STATE_WORDS] __aligned(8); + struct skcipher_walk walk; + int err; + + err = skcipher_walk_virt(&walk, req, false); + + chacha_init_generic(state, ctx->key, iv); + + while (walk.nbytes > 0) { + unsigned int nbytes = walk.nbytes; + + if (nbytes < walk.total) + nbytes = round_down(nbytes, walk.stride); + + if (!static_branch_likely(&chacha_use_simd) || + !may_use_simd()) { + chacha_crypt_generic(state, walk.dst.virt.addr, + walk.src.virt.addr, nbytes, + ctx->nrounds); + } else { + kernel_fpu_begin(); + chacha_dosimd(state, walk.dst.virt.addr, + walk.src.virt.addr, nbytes, + ctx->nrounds); + kernel_fpu_end(); + } + err = skcipher_walk_done(&walk, walk.nbytes - nbytes); + } + + return err; +} + +static int chacha_simd(struct skcipher_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm); + + return chacha_simd_stream_xor(req, ctx, req->iv); +} + +static int xchacha_simd(struct skcipher_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm); + u32 state[CHACHA_STATE_WORDS] __aligned(8); + struct chacha_ctx subctx; + u8 real_iv[16]; + + chacha_init_generic(state, ctx->key, req->iv); + + if (req->cryptlen > CHACHA_BLOCK_SIZE && irq_fpu_usable()) { + kernel_fpu_begin(); + hchacha_block_ssse3(state, subctx.key, ctx->nrounds); + kernel_fpu_end(); + } else { + hchacha_block_generic(state, subctx.key, ctx->nrounds); + } + subctx.nrounds = ctx->nrounds; + + memcpy(&real_iv[0], req->iv + 24, 8); + memcpy(&real_iv[8], req->iv + 16, 8); + return chacha_simd_stream_xor(req, &subctx, real_iv); +} + +static struct skcipher_alg algs[] = { + { + .base.cra_name = "chacha20", + .base.cra_driver_name = "chacha20-simd", + .base.cra_priority = 300, + .base.cra_blocksize = 1, + .base.cra_ctxsize = sizeof(struct chacha_ctx), + .base.cra_module = THIS_MODULE, + + .min_keysize = CHACHA_KEY_SIZE, + .max_keysize = CHACHA_KEY_SIZE, + .ivsize = CHACHA_IV_SIZE, + .chunksize = CHACHA_BLOCK_SIZE, + .setkey = chacha20_setkey, + .encrypt = chacha_simd, + .decrypt = chacha_simd, + }, { + .base.cra_name = "xchacha20", + .base.cra_driver_name = "xchacha20-simd", + .base.cra_priority = 300, + .base.cra_blocksize = 1, + .base.cra_ctxsize = sizeof(struct chacha_ctx), + .base.cra_module = THIS_MODULE, + + .min_keysize = CHACHA_KEY_SIZE, + .max_keysize = CHACHA_KEY_SIZE, + .ivsize = XCHACHA_IV_SIZE, + .chunksize = CHACHA_BLOCK_SIZE, + .setkey = chacha20_setkey, + .encrypt = xchacha_simd, + .decrypt = xchacha_simd, + }, { + .base.cra_name = "xchacha12", + .base.cra_driver_name = "xchacha12-simd", + .base.cra_priority = 300, + .base.cra_blocksize = 1, + .base.cra_ctxsize = sizeof(struct chacha_ctx), + .base.cra_module = THIS_MODULE, + + .min_keysize = CHACHA_KEY_SIZE, + .max_keysize = CHACHA_KEY_SIZE, + .ivsize = XCHACHA_IV_SIZE, + .chunksize = CHACHA_BLOCK_SIZE, + .setkey = chacha12_setkey, + .encrypt = xchacha_simd, + .decrypt = xchacha_simd, + }, +}; + +static int __init chacha_simd_mod_init(void) +{ + if (!boot_cpu_has(X86_FEATURE_SSSE3)) + return 0; + + static_branch_enable(&chacha_use_simd); + + if (IS_ENABLED(CONFIG_AS_AVX2) && + boot_cpu_has(X86_FEATURE_AVX) && + boot_cpu_has(X86_FEATURE_AVX2) && + cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) { + static_branch_enable(&chacha_use_avx2); + + if (IS_ENABLED(CONFIG_AS_AVX512) && + boot_cpu_has(X86_FEATURE_AVX512VL) && + boot_cpu_has(X86_FEATURE_AVX512BW)) /* kmovq */ + static_branch_enable(&chacha_use_avx512vl); + } + return IS_REACHABLE(CONFIG_CRYPTO_BLKCIPHER) ? + crypto_register_skciphers(algs, ARRAY_SIZE(algs)) : 0; +} + +static void __exit chacha_simd_mod_fini(void) +{ + if (IS_REACHABLE(CONFIG_CRYPTO_BLKCIPHER) && boot_cpu_has(X86_FEATURE_SSSE3)) + crypto_unregister_skciphers(algs, ARRAY_SIZE(algs)); +} + +module_init(chacha_simd_mod_init); +module_exit(chacha_simd_mod_fini); + +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Martin Willi "); +MODULE_DESCRIPTION("ChaCha and XChaCha stream ciphers (x64 SIMD accelerated)"); +MODULE_ALIAS_CRYPTO("chacha20"); +MODULE_ALIAS_CRYPTO("chacha20-simd"); +MODULE_ALIAS_CRYPTO("xchacha20"); +MODULE_ALIAS_CRYPTO("xchacha20-simd"); +MODULE_ALIAS_CRYPTO("xchacha12"); +MODULE_ALIAS_CRYPTO("xchacha12-simd"); diff --git a/arch/x86/crypto/curve25519-x86_64.c b/arch/x86/crypto/curve25519-x86_64.c new file mode 100644 index 000000000000..a9edb6f8a0ba --- /dev/null +++ b/arch/x86/crypto/curve25519-x86_64.c @@ -0,0 +1,1512 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT +/* + * Copyright (C) 2020 Jason A. Donenfeld . All Rights Reserved. + * Copyright (c) 2016-2020 INRIA, CMU and Microsoft Corporation + */ + +#include +#include + +#include +#include +#include +#include + +#include +#include + +static __always_inline u64 eq_mask(u64 a, u64 b) +{ + u64 x = a ^ b; + u64 minus_x = ~x + (u64)1U; + u64 x_or_minus_x = x | minus_x; + u64 xnx = x_or_minus_x >> (u32)63U; + return xnx - (u64)1U; +} + +static __always_inline u64 gte_mask(u64 a, u64 b) +{ + u64 x = a; + u64 y = b; + u64 x_xor_y = x ^ y; + u64 x_sub_y = x - y; + u64 x_sub_y_xor_y = x_sub_y ^ y; + u64 q = x_xor_y | x_sub_y_xor_y; + u64 x_xor_q = x ^ q; + u64 x_xor_q_ = x_xor_q >> (u32)63U; + return x_xor_q_ - (u64)1U; +} + +/* Computes the addition of four-element f1 with value in f2 + * and returns the carry (if any) */ +static inline u64 add_scalar(u64 *out, const u64 *f1, u64 f2) +{ + u64 carry_r; + + asm volatile( + /* Clear registers to propagate the carry bit */ + " xor %%r8d, %%r8d;" + " xor %%r9d, %%r9d;" + " xor %%r10d, %%r10d;" + " xor %%r11d, %%r11d;" + " xor %k1, %k1;" + + /* Begin addition chain */ + " addq 0(%3), %0;" + " movq %0, 0(%2);" + " adcxq 8(%3), %%r8;" + " movq %%r8, 8(%2);" + " adcxq 16(%3), %%r9;" + " movq %%r9, 16(%2);" + " adcxq 24(%3), %%r10;" + " movq %%r10, 24(%2);" + + /* Return the carry bit in a register */ + " adcx %%r11, %1;" + : "+&r" (f2), "=&r" (carry_r) + : "r" (out), "r" (f1) + : "%r8", "%r9", "%r10", "%r11", "memory", "cc" + ); + + return carry_r; +} + +/* Computes the field addition of two field elements */ +static inline void fadd(u64 *out, const u64 *f1, const u64 *f2) +{ + asm volatile( + /* Compute the raw addition of f1 + f2 */ + " movq 0(%0), %%r8;" + " addq 0(%2), %%r8;" + " movq 8(%0), %%r9;" + " adcxq 8(%2), %%r9;" + " movq 16(%0), %%r10;" + " adcxq 16(%2), %%r10;" + " movq 24(%0), %%r11;" + " adcxq 24(%2), %%r11;" + + /* Wrap the result back into the field */ + + /* Step 1: Compute carry*38 */ + " mov $0, %%rax;" + " mov $38, %0;" + " cmovc %0, %%rax;" + + /* Step 2: Add carry*38 to the original sum */ + " xor %%ecx, %%ecx;" + " add %%rax, %%r8;" + " adcx %%rcx, %%r9;" + " movq %%r9, 8(%1);" + " adcx %%rcx, %%r10;" + " movq %%r10, 16(%1);" + " adcx %%rcx, %%r11;" + " movq %%r11, 24(%1);" + + /* Step 3: Fold the carry bit back in; guaranteed not to carry at this point */ + " mov $0, %%rax;" + " cmovc %0, %%rax;" + " add %%rax, %%r8;" + " movq %%r8, 0(%1);" + : "+&r" (f2) + : "r" (out), "r" (f1) + : "%rax", "%rcx", "%r8", "%r9", "%r10", "%r11", "memory", "cc" + ); +} + +/* Computes the field substraction of two field elements */ +static inline void fsub(u64 *out, const u64 *f1, const u64 *f2) +{ + asm volatile( + /* Compute the raw substraction of f1-f2 */ + " movq 0(%1), %%r8;" + " subq 0(%2), %%r8;" + " movq 8(%1), %%r9;" + " sbbq 8(%2), %%r9;" + " movq 16(%1), %%r10;" + " sbbq 16(%2), %%r10;" + " movq 24(%1), %%r11;" + " sbbq 24(%2), %%r11;" + + /* Wrap the result back into the field */ + + /* Step 1: Compute carry*38 */ + " mov $0, %%rax;" + " mov $38, %%rcx;" + " cmovc %%rcx, %%rax;" + + /* Step 2: Substract carry*38 from the original difference */ + " sub %%rax, %%r8;" + " sbb $0, %%r9;" + " sbb $0, %%r10;" + " sbb $0, %%r11;" + + /* Step 3: Fold the carry bit back in; guaranteed not to carry at this point */ + " mov $0, %%rax;" + " cmovc %%rcx, %%rax;" + " sub %%rax, %%r8;" + + /* Store the result */ + " movq %%r8, 0(%0);" + " movq %%r9, 8(%0);" + " movq %%r10, 16(%0);" + " movq %%r11, 24(%0);" + : + : "r" (out), "r" (f1), "r" (f2) + : "%rax", "%rcx", "%r8", "%r9", "%r10", "%r11", "memory", "cc" + ); +} + +/* Computes a field multiplication: out <- f1 * f2 + * Uses the 8-element buffer tmp for intermediate results */ +static inline void fmul(u64 *out, const u64 *f1, const u64 *f2, u64 *tmp) +{ + asm volatile( + /* Compute the raw multiplication: tmp <- src1 * src2 */ + + /* Compute src1[0] * src2 */ + " movq 0(%1), %%rdx;" + " mulxq 0(%3), %%r8, %%r9;" " xor %%r10d, %%r10d;" " movq %%r8, 0(%0);" + " mulxq 8(%3), %%r10, %%r11;" " adox %%r9, %%r10;" " movq %%r10, 8(%0);" + " mulxq 16(%3), %%rbx, %%r13;" " adox %%r11, %%rbx;" + " mulxq 24(%3), %%r14, %%rdx;" " adox %%r13, %%r14;" " mov $0, %%rax;" + " adox %%rdx, %%rax;" + /* Compute src1[1] * src2 */ + " movq 8(%1), %%rdx;" + " mulxq 0(%3), %%r8, %%r9;" " xor %%r10d, %%r10d;" " adcxq 8(%0), %%r8;" " movq %%r8, 8(%0);" + " mulxq 8(%3), %%r10, %%r11;" " adox %%r9, %%r10;" " adcx %%rbx, %%r10;" " movq %%r10, 16(%0);" + " mulxq 16(%3), %%rbx, %%r13;" " adox %%r11, %%rbx;" " adcx %%r14, %%rbx;" " mov $0, %%r8;" + " mulxq 24(%3), %%r14, %%rdx;" " adox %%r13, %%r14;" " adcx %%rax, %%r14;" " mov $0, %%rax;" + " adox %%rdx, %%rax;" " adcx %%r8, %%rax;" + /* Compute src1[2] * src2 */ + " movq 16(%1), %%rdx;" + " mulxq 0(%3), %%r8, %%r9;" " xor %%r10d, %%r10d;" " adcxq 16(%0), %%r8;" " movq %%r8, 16(%0);" + " mulxq 8(%3), %%r10, %%r11;" " adox %%r9, %%r10;" " adcx %%rbx, %%r10;" " movq %%r10, 24(%0);" + " mulxq 16(%3), %%rbx, %%r13;" " adox %%r11, %%rbx;" " adcx %%r14, %%rbx;" " mov $0, %%r8;" + " mulxq 24(%3), %%r14, %%rdx;" " adox %%r13, %%r14;" " adcx %%rax, %%r14;" " mov $0, %%rax;" + " adox %%rdx, %%rax;" " adcx %%r8, %%rax;" + /* Compute src1[3] * src2 */ + " movq 24(%1), %%rdx;" + " mulxq 0(%3), %%r8, %%r9;" " xor %%r10d, %%r10d;" " adcxq 24(%0), %%r8;" " movq %%r8, 24(%0);" + " mulxq 8(%3), %%r10, %%r11;" " adox %%r9, %%r10;" " adcx %%rbx, %%r10;" " movq %%r10, 32(%0);" + " mulxq 16(%3), %%rbx, %%r13;" " adox %%r11, %%rbx;" " adcx %%r14, %%rbx;" " movq %%rbx, 40(%0);" " mov $0, %%r8;" + " mulxq 24(%3), %%r14, %%rdx;" " adox %%r13, %%r14;" " adcx %%rax, %%r14;" " movq %%r14, 48(%0);" " mov $0, %%rax;" + " adox %%rdx, %%rax;" " adcx %%r8, %%rax;" " movq %%rax, 56(%0);" + /* Line up pointers */ + " mov %0, %1;" + " mov %2, %0;" + + /* Wrap the result back into the field */ + + /* Step 1: Compute dst + carry == tmp_hi * 38 + tmp_lo */ + " mov $38, %%rdx;" + " mulxq 32(%1), %%r8, %%r13;" + " xor %k3, %k3;" + " adoxq 0(%1), %%r8;" + " mulxq 40(%1), %%r9, %%rbx;" + " adcx %%r13, %%r9;" + " adoxq 8(%1), %%r9;" + " mulxq 48(%1), %%r10, %%r13;" + " adcx %%rbx, %%r10;" + " adoxq 16(%1), %%r10;" + " mulxq 56(%1), %%r11, %%rax;" + " adcx %%r13, %%r11;" + " adoxq 24(%1), %%r11;" + " adcx %3, %%rax;" + " adox %3, %%rax;" + " imul %%rdx, %%rax;" + + /* Step 2: Fold the carry back into dst */ + " add %%rax, %%r8;" + " adcx %3, %%r9;" + " movq %%r9, 8(%0);" + " adcx %3, %%r10;" + " movq %%r10, 16(%0);" + " adcx %3, %%r11;" + " movq %%r11, 24(%0);" + + /* Step 3: Fold the carry bit back in; guaranteed not to carry at this point */ + " mov $0, %%rax;" + " cmovc %%rdx, %%rax;" + " add %%rax, %%r8;" + " movq %%r8, 0(%0);" + : "+&r" (tmp), "+&r" (f1), "+&r" (out), "+&r" (f2) + : + : "%rax", "%rdx", "%r8", "%r9", "%r10", "%r11", "%rbx", "%r13", "%r14", "memory", "cc" + ); +} + +/* Computes two field multiplications: + * out[0] <- f1[0] * f2[0] + * out[1] <- f1[1] * f2[1] + * Uses the 16-element buffer tmp for intermediate results. */ +static inline void fmul2(u64 *out, const u64 *f1, const u64 *f2, u64 *tmp) +{ + asm volatile( + /* Compute the raw multiplication tmp[0] <- f1[0] * f2[0] */ + + /* Compute src1[0] * src2 */ + " movq 0(%1), %%rdx;" + " mulxq 0(%3), %%r8, %%r9;" " xor %%r10d, %%r10d;" " movq %%r8, 0(%0);" + " mulxq 8(%3), %%r10, %%r11;" " adox %%r9, %%r10;" " movq %%r10, 8(%0);" + " mulxq 16(%3), %%rbx, %%r13;" " adox %%r11, %%rbx;" + " mulxq 24(%3), %%r14, %%rdx;" " adox %%r13, %%r14;" " mov $0, %%rax;" + " adox %%rdx, %%rax;" + /* Compute src1[1] * src2 */ + " movq 8(%1), %%rdx;" + " mulxq 0(%3), %%r8, %%r9;" " xor %%r10d, %%r10d;" " adcxq 8(%0), %%r8;" " movq %%r8, 8(%0);" + " mulxq 8(%3), %%r10, %%r11;" " adox %%r9, %%r10;" " adcx %%rbx, %%r10;" " movq %%r10, 16(%0);" + " mulxq 16(%3), %%rbx, %%r13;" " adox %%r11, %%rbx;" " adcx %%r14, %%rbx;" " mov $0, %%r8;" + " mulxq 24(%3), %%r14, %%rdx;" " adox %%r13, %%r14;" " adcx %%rax, %%r14;" " mov $0, %%rax;" + " adox %%rdx, %%rax;" " adcx %%r8, %%rax;" + /* Compute src1[2] * src2 */ + " movq 16(%1), %%rdx;" + " mulxq 0(%3), %%r8, %%r9;" " xor %%r10d, %%r10d;" " adcxq 16(%0), %%r8;" " movq %%r8, 16(%0);" + " mulxq 8(%3), %%r10, %%r11;" " adox %%r9, %%r10;" " adcx %%rbx, %%r10;" " movq %%r10, 24(%0);" + " mulxq 16(%3), %%rbx, %%r13;" " adox %%r11, %%rbx;" " adcx %%r14, %%rbx;" " mov $0, %%r8;" + " mulxq 24(%3), %%r14, %%rdx;" " adox %%r13, %%r14;" " adcx %%rax, %%r14;" " mov $0, %%rax;" + " adox %%rdx, %%rax;" " adcx %%r8, %%rax;" + /* Compute src1[3] * src2 */ + " movq 24(%1), %%rdx;" + " mulxq 0(%3), %%r8, %%r9;" " xor %%r10d, %%r10d;" " adcxq 24(%0), %%r8;" " movq %%r8, 24(%0);" + " mulxq 8(%3), %%r10, %%r11;" " adox %%r9, %%r10;" " adcx %%rbx, %%r10;" " movq %%r10, 32(%0);" + " mulxq 16(%3), %%rbx, %%r13;" " adox %%r11, %%rbx;" " adcx %%r14, %%rbx;" " movq %%rbx, 40(%0);" " mov $0, %%r8;" + " mulxq 24(%3), %%r14, %%rdx;" " adox %%r13, %%r14;" " adcx %%rax, %%r14;" " movq %%r14, 48(%0);" " mov $0, %%rax;" + " adox %%rdx, %%rax;" " adcx %%r8, %%rax;" " movq %%rax, 56(%0);" + + /* Compute the raw multiplication tmp[1] <- f1[1] * f2[1] */ + + /* Compute src1[0] * src2 */ + " movq 32(%1), %%rdx;" + " mulxq 32(%3), %%r8, %%r9;" " xor %%r10d, %%r10d;" " movq %%r8, 64(%0);" + " mulxq 40(%3), %%r10, %%r11;" " adox %%r9, %%r10;" " movq %%r10, 72(%0);" + " mulxq 48(%3), %%rbx, %%r13;" " adox %%r11, %%rbx;" + " mulxq 56(%3), %%r14, %%rdx;" " adox %%r13, %%r14;" " mov $0, %%rax;" + " adox %%rdx, %%rax;" + /* Compute src1[1] * src2 */ + " movq 40(%1), %%rdx;" + " mulxq 32(%3), %%r8, %%r9;" " xor %%r10d, %%r10d;" " adcxq 72(%0), %%r8;" " movq %%r8, 72(%0);" + " mulxq 40(%3), %%r10, %%r11;" " adox %%r9, %%r10;" " adcx %%rbx, %%r10;" " movq %%r10, 80(%0);" + " mulxq 48(%3), %%rbx, %%r13;" " adox %%r11, %%rbx;" " adcx %%r14, %%rbx;" " mov $0, %%r8;" + " mulxq 56(%3), %%r14, %%rdx;" " adox %%r13, %%r14;" " adcx %%rax, %%r14;" " mov $0, %%rax;" + " adox %%rdx, %%rax;" " adcx %%r8, %%rax;" + /* Compute src1[2] * src2 */ + " movq 48(%1), %%rdx;" + " mulxq 32(%3), %%r8, %%r9;" " xor %%r10d, %%r10d;" " adcxq 80(%0), %%r8;" " movq %%r8, 80(%0);" + " mulxq 40(%3), %%r10, %%r11;" " adox %%r9, %%r10;" " adcx %%rbx, %%r10;" " movq %%r10, 88(%0);" + " mulxq 48(%3), %%rbx, %%r13;" " adox %%r11, %%rbx;" " adcx %%r14, %%rbx;" " mov $0, %%r8;" + " mulxq 56(%3), %%r14, %%rdx;" " adox %%r13, %%r14;" " adcx %%rax, %%r14;" " mov $0, %%rax;" + " adox %%rdx, %%rax;" " adcx %%r8, %%rax;" + /* Compute src1[3] * src2 */ + " movq 56(%1), %%rdx;" + " mulxq 32(%3), %%r8, %%r9;" " xor %%r10d, %%r10d;" " adcxq 88(%0), %%r8;" " movq %%r8, 88(%0);" + " mulxq 40(%3), %%r10, %%r11;" " adox %%r9, %%r10;" " adcx %%rbx, %%r10;" " movq %%r10, 96(%0);" + " mulxq 48(%3), %%rbx, %%r13;" " adox %%r11, %%rbx;" " adcx %%r14, %%rbx;" " movq %%rbx, 104(%0);" " mov $0, %%r8;" + " mulxq 56(%3), %%r14, %%rdx;" " adox %%r13, %%r14;" " adcx %%rax, %%r14;" " movq %%r14, 112(%0);" " mov $0, %%rax;" + " adox %%rdx, %%rax;" " adcx %%r8, %%rax;" " movq %%rax, 120(%0);" + /* Line up pointers */ + " mov %0, %1;" + " mov %2, %0;" + + /* Wrap the results back into the field */ + + /* Step 1: Compute dst + carry == tmp_hi * 38 + tmp_lo */ + " mov $38, %%rdx;" + " mulxq 32(%1), %%r8, %%r13;" + " xor %k3, %k3;" + " adoxq 0(%1), %%r8;" + " mulxq 40(%1), %%r9, %%rbx;" + " adcx %%r13, %%r9;" + " adoxq 8(%1), %%r9;" + " mulxq 48(%1), %%r10, %%r13;" + " adcx %%rbx, %%r10;" + " adoxq 16(%1), %%r10;" + " mulxq 56(%1), %%r11, %%rax;" + " adcx %%r13, %%r11;" + " adoxq 24(%1), %%r11;" + " adcx %3, %%rax;" + " adox %3, %%rax;" + " imul %%rdx, %%rax;" + + /* Step 2: Fold the carry back into dst */ + " add %%rax, %%r8;" + " adcx %3, %%r9;" + " movq %%r9, 8(%0);" + " adcx %3, %%r10;" + " movq %%r10, 16(%0);" + " adcx %3, %%r11;" + " movq %%r11, 24(%0);" + + /* Step 3: Fold the carry bit back in; guaranteed not to carry at this point */ + " mov $0, %%rax;" + " cmovc %%rdx, %%rax;" + " add %%rax, %%r8;" + " movq %%r8, 0(%0);" + + /* Step 1: Compute dst + carry == tmp_hi * 38 + tmp_lo */ + " mov $38, %%rdx;" + " mulxq 96(%1), %%r8, %%r13;" + " xor %k3, %k3;" + " adoxq 64(%1), %%r8;" + " mulxq 104(%1), %%r9, %%rbx;" + " adcx %%r13, %%r9;" + " adoxq 72(%1), %%r9;" + " mulxq 112(%1), %%r10, %%r13;" + " adcx %%rbx, %%r10;" + " adoxq 80(%1), %%r10;" + " mulxq 120(%1), %%r11, %%rax;" + " adcx %%r13, %%r11;" + " adoxq 88(%1), %%r11;" + " adcx %3, %%rax;" + " adox %3, %%rax;" + " imul %%rdx, %%rax;" + + /* Step 2: Fold the carry back into dst */ + " add %%rax, %%r8;" + " adcx %3, %%r9;" + " movq %%r9, 40(%0);" + " adcx %3, %%r10;" + " movq %%r10, 48(%0);" + " adcx %3, %%r11;" + " movq %%r11, 56(%0);" + + /* Step 3: Fold the carry bit back in; guaranteed not to carry at this point */ + " mov $0, %%rax;" + " cmovc %%rdx, %%rax;" + " add %%rax, %%r8;" + " movq %%r8, 32(%0);" + : "+&r" (tmp), "+&r" (f1), "+&r" (out), "+&r" (f2) + : + : "%rax", "%rdx", "%r8", "%r9", "%r10", "%r11", "%rbx", "%r13", "%r14", "memory", "cc" + ); +} + +/* Computes the field multiplication of four-element f1 with value in f2 */ +static inline void fmul_scalar(u64 *out, const u64 *f1, u64 f2) +{ + register u64 f2_r asm("rdx") = f2; + + asm volatile( + /* Compute the raw multiplication of f1*f2 */ + " mulxq 0(%2), %%r8, %%rcx;" /* f1[0]*f2 */ + " mulxq 8(%2), %%r9, %%rbx;" /* f1[1]*f2 */ + " add %%rcx, %%r9;" + " mov $0, %%rcx;" + " mulxq 16(%2), %%r10, %%r13;" /* f1[2]*f2 */ + " adcx %%rbx, %%r10;" + " mulxq 24(%2), %%r11, %%rax;" /* f1[3]*f2 */ + " adcx %%r13, %%r11;" + " adcx %%rcx, %%rax;" + + /* Wrap the result back into the field */ + + /* Step 1: Compute carry*38 */ + " mov $38, %%rdx;" + " imul %%rdx, %%rax;" + + /* Step 2: Fold the carry back into dst */ + " add %%rax, %%r8;" + " adcx %%rcx, %%r9;" + " movq %%r9, 8(%1);" + " adcx %%rcx, %%r10;" + " movq %%r10, 16(%1);" + " adcx %%rcx, %%r11;" + " movq %%r11, 24(%1);" + + /* Step 3: Fold the carry bit back in; guaranteed not to carry at this point */ + " mov $0, %%rax;" + " cmovc %%rdx, %%rax;" + " add %%rax, %%r8;" + " movq %%r8, 0(%1);" + : "+&r" (f2_r) + : "r" (out), "r" (f1) + : "%rax", "%rcx", "%r8", "%r9", "%r10", "%r11", "%rbx", "%r13", "memory", "cc" + ); +} + +/* Computes p1 <- bit ? p2 : p1 in constant time */ +static inline void cswap2(u64 bit, const u64 *p1, const u64 *p2) +{ + asm volatile( + /* Invert the polarity of bit to match cmov expectations */ + " add $18446744073709551615, %0;" + + /* cswap p1[0], p2[0] */ + " movq 0(%1), %%r8;" + " movq 0(%2), %%r9;" + " mov %%r8, %%r10;" + " cmovc %%r9, %%r8;" + " cmovc %%r10, %%r9;" + " movq %%r8, 0(%1);" + " movq %%r9, 0(%2);" + + /* cswap p1[1], p2[1] */ + " movq 8(%1), %%r8;" + " movq 8(%2), %%r9;" + " mov %%r8, %%r10;" + " cmovc %%r9, %%r8;" + " cmovc %%r10, %%r9;" + " movq %%r8, 8(%1);" + " movq %%r9, 8(%2);" + + /* cswap p1[2], p2[2] */ + " movq 16(%1), %%r8;" + " movq 16(%2), %%r9;" + " mov %%r8, %%r10;" + " cmovc %%r9, %%r8;" + " cmovc %%r10, %%r9;" + " movq %%r8, 16(%1);" + " movq %%r9, 16(%2);" + + /* cswap p1[3], p2[3] */ + " movq 24(%1), %%r8;" + " movq 24(%2), %%r9;" + " mov %%r8, %%r10;" + " cmovc %%r9, %%r8;" + " cmovc %%r10, %%r9;" + " movq %%r8, 24(%1);" + " movq %%r9, 24(%2);" + + /* cswap p1[4], p2[4] */ + " movq 32(%1), %%r8;" + " movq 32(%2), %%r9;" + " mov %%r8, %%r10;" + " cmovc %%r9, %%r8;" + " cmovc %%r10, %%r9;" + " movq %%r8, 32(%1);" + " movq %%r9, 32(%2);" + + /* cswap p1[5], p2[5] */ + " movq 40(%1), %%r8;" + " movq 40(%2), %%r9;" + " mov %%r8, %%r10;" + " cmovc %%r9, %%r8;" + " cmovc %%r10, %%r9;" + " movq %%r8, 40(%1);" + " movq %%r9, 40(%2);" + + /* cswap p1[6], p2[6] */ + " movq 48(%1), %%r8;" + " movq 48(%2), %%r9;" + " mov %%r8, %%r10;" + " cmovc %%r9, %%r8;" + " cmovc %%r10, %%r9;" + " movq %%r8, 48(%1);" + " movq %%r9, 48(%2);" + + /* cswap p1[7], p2[7] */ + " movq 56(%1), %%r8;" + " movq 56(%2), %%r9;" + " mov %%r8, %%r10;" + " cmovc %%r9, %%r8;" + " cmovc %%r10, %%r9;" + " movq %%r8, 56(%1);" + " movq %%r9, 56(%2);" + : "+&r" (bit) + : "r" (p1), "r" (p2) + : "%r8", "%r9", "%r10", "memory", "cc" + ); +} + +/* Computes the square of a field element: out <- f * f + * Uses the 8-element buffer tmp for intermediate results */ +static inline void fsqr(u64 *out, const u64 *f, u64 *tmp) +{ + asm volatile( + /* Compute the raw multiplication: tmp <- f * f */ + + /* Step 1: Compute all partial products */ + " movq 0(%1), %%rdx;" /* f[0] */ + " mulxq 8(%1), %%r8, %%r14;" " xor %%r15d, %%r15d;" /* f[1]*f[0] */ + " mulxq 16(%1), %%r9, %%r10;" " adcx %%r14, %%r9;" /* f[2]*f[0] */ + " mulxq 24(%1), %%rax, %%rcx;" " adcx %%rax, %%r10;" /* f[3]*f[0] */ + " movq 24(%1), %%rdx;" /* f[3] */ + " mulxq 8(%1), %%r11, %%rbx;" " adcx %%rcx, %%r11;" /* f[1]*f[3] */ + " mulxq 16(%1), %%rax, %%r13;" " adcx %%rax, %%rbx;" /* f[2]*f[3] */ + " movq 8(%1), %%rdx;" " adcx %%r15, %%r13;" /* f1 */ + " mulxq 16(%1), %%rax, %%rcx;" " mov $0, %%r14;" /* f[2]*f[1] */ + + /* Step 2: Compute two parallel carry chains */ + " xor %%r15d, %%r15d;" + " adox %%rax, %%r10;" + " adcx %%r8, %%r8;" + " adox %%rcx, %%r11;" + " adcx %%r9, %%r9;" + " adox %%r15, %%rbx;" + " adcx %%r10, %%r10;" + " adox %%r15, %%r13;" + " adcx %%r11, %%r11;" + " adox %%r15, %%r14;" + " adcx %%rbx, %%rbx;" + " adcx %%r13, %%r13;" + " adcx %%r14, %%r14;" + + /* Step 3: Compute intermediate squares */ + " movq 0(%1), %%rdx;" " mulx %%rdx, %%rax, %%rcx;" /* f[0]^2 */ + " movq %%rax, 0(%0);" + " add %%rcx, %%r8;" " movq %%r8, 8(%0);" + " movq 8(%1), %%rdx;" " mulx %%rdx, %%rax, %%rcx;" /* f[1]^2 */ + " adcx %%rax, %%r9;" " movq %%r9, 16(%0);" + " adcx %%rcx, %%r10;" " movq %%r10, 24(%0);" + " movq 16(%1), %%rdx;" " mulx %%rdx, %%rax, %%rcx;" /* f[2]^2 */ + " adcx %%rax, %%r11;" " movq %%r11, 32(%0);" + " adcx %%rcx, %%rbx;" " movq %%rbx, 40(%0);" + " movq 24(%1), %%rdx;" " mulx %%rdx, %%rax, %%rcx;" /* f[3]^2 */ + " adcx %%rax, %%r13;" " movq %%r13, 48(%0);" + " adcx %%rcx, %%r14;" " movq %%r14, 56(%0);" + + /* Line up pointers */ + " mov %0, %1;" + " mov %2, %0;" + + /* Wrap the result back into the field */ + + /* Step 1: Compute dst + carry == tmp_hi * 38 + tmp_lo */ + " mov $38, %%rdx;" + " mulxq 32(%1), %%r8, %%r13;" + " xor %%ecx, %%ecx;" + " adoxq 0(%1), %%r8;" + " mulxq 40(%1), %%r9, %%rbx;" + " adcx %%r13, %%r9;" + " adoxq 8(%1), %%r9;" + " mulxq 48(%1), %%r10, %%r13;" + " adcx %%rbx, %%r10;" + " adoxq 16(%1), %%r10;" + " mulxq 56(%1), %%r11, %%rax;" + " adcx %%r13, %%r11;" + " adoxq 24(%1), %%r11;" + " adcx %%rcx, %%rax;" + " adox %%rcx, %%rax;" + " imul %%rdx, %%rax;" + + /* Step 2: Fold the carry back into dst */ + " add %%rax, %%r8;" + " adcx %%rcx, %%r9;" + " movq %%r9, 8(%0);" + " adcx %%rcx, %%r10;" + " movq %%r10, 16(%0);" + " adcx %%rcx, %%r11;" + " movq %%r11, 24(%0);" + + /* Step 3: Fold the carry bit back in; guaranteed not to carry at this point */ + " mov $0, %%rax;" + " cmovc %%rdx, %%rax;" + " add %%rax, %%r8;" + " movq %%r8, 0(%0);" + : "+&r" (tmp), "+&r" (f), "+&r" (out) + : + : "%rax", "%rcx", "%rdx", "%r8", "%r9", "%r10", "%r11", "%rbx", "%r13", "%r14", "%r15", "memory", "cc" + ); +} + +/* Computes two field squarings: + * out[0] <- f[0] * f[0] + * out[1] <- f[1] * f[1] + * Uses the 16-element buffer tmp for intermediate results */ +static inline void fsqr2(u64 *out, const u64 *f, u64 *tmp) +{ + asm volatile( + /* Step 1: Compute all partial products */ + " movq 0(%1), %%rdx;" /* f[0] */ + " mulxq 8(%1), %%r8, %%r14;" " xor %%r15d, %%r15d;" /* f[1]*f[0] */ + " mulxq 16(%1), %%r9, %%r10;" " adcx %%r14, %%r9;" /* f[2]*f[0] */ + " mulxq 24(%1), %%rax, %%rcx;" " adcx %%rax, %%r10;" /* f[3]*f[0] */ + " movq 24(%1), %%rdx;" /* f[3] */ + " mulxq 8(%1), %%r11, %%rbx;" " adcx %%rcx, %%r11;" /* f[1]*f[3] */ + " mulxq 16(%1), %%rax, %%r13;" " adcx %%rax, %%rbx;" /* f[2]*f[3] */ + " movq 8(%1), %%rdx;" " adcx %%r15, %%r13;" /* f1 */ + " mulxq 16(%1), %%rax, %%rcx;" " mov $0, %%r14;" /* f[2]*f[1] */ + + /* Step 2: Compute two parallel carry chains */ + " xor %%r15d, %%r15d;" + " adox %%rax, %%r10;" + " adcx %%r8, %%r8;" + " adox %%rcx, %%r11;" + " adcx %%r9, %%r9;" + " adox %%r15, %%rbx;" + " adcx %%r10, %%r10;" + " adox %%r15, %%r13;" + " adcx %%r11, %%r11;" + " adox %%r15, %%r14;" + " adcx %%rbx, %%rbx;" + " adcx %%r13, %%r13;" + " adcx %%r14, %%r14;" + + /* Step 3: Compute intermediate squares */ + " movq 0(%1), %%rdx;" " mulx %%rdx, %%rax, %%rcx;" /* f[0]^2 */ + " movq %%rax, 0(%0);" + " add %%rcx, %%r8;" " movq %%r8, 8(%0);" + " movq 8(%1), %%rdx;" " mulx %%rdx, %%rax, %%rcx;" /* f[1]^2 */ + " adcx %%rax, %%r9;" " movq %%r9, 16(%0);" + " adcx %%rcx, %%r10;" " movq %%r10, 24(%0);" + " movq 16(%1), %%rdx;" " mulx %%rdx, %%rax, %%rcx;" /* f[2]^2 */ + " adcx %%rax, %%r11;" " movq %%r11, 32(%0);" + " adcx %%rcx, %%rbx;" " movq %%rbx, 40(%0);" + " movq 24(%1), %%rdx;" " mulx %%rdx, %%rax, %%rcx;" /* f[3]^2 */ + " adcx %%rax, %%r13;" " movq %%r13, 48(%0);" + " adcx %%rcx, %%r14;" " movq %%r14, 56(%0);" + + /* Step 1: Compute all partial products */ + " movq 32(%1), %%rdx;" /* f[0] */ + " mulxq 40(%1), %%r8, %%r14;" " xor %%r15d, %%r15d;" /* f[1]*f[0] */ + " mulxq 48(%1), %%r9, %%r10;" " adcx %%r14, %%r9;" /* f[2]*f[0] */ + " mulxq 56(%1), %%rax, %%rcx;" " adcx %%rax, %%r10;" /* f[3]*f[0] */ + " movq 56(%1), %%rdx;" /* f[3] */ + " mulxq 40(%1), %%r11, %%rbx;" " adcx %%rcx, %%r11;" /* f[1]*f[3] */ + " mulxq 48(%1), %%rax, %%r13;" " adcx %%rax, %%rbx;" /* f[2]*f[3] */ + " movq 40(%1), %%rdx;" " adcx %%r15, %%r13;" /* f1 */ + " mulxq 48(%1), %%rax, %%rcx;" " mov $0, %%r14;" /* f[2]*f[1] */ + + /* Step 2: Compute two parallel carry chains */ + " xor %%r15d, %%r15d;" + " adox %%rax, %%r10;" + " adcx %%r8, %%r8;" + " adox %%rcx, %%r11;" + " adcx %%r9, %%r9;" + " adox %%r15, %%rbx;" + " adcx %%r10, %%r10;" + " adox %%r15, %%r13;" + " adcx %%r11, %%r11;" + " adox %%r15, %%r14;" + " adcx %%rbx, %%rbx;" + " adcx %%r13, %%r13;" + " adcx %%r14, %%r14;" + + /* Step 3: Compute intermediate squares */ + " movq 32(%1), %%rdx;" " mulx %%rdx, %%rax, %%rcx;" /* f[0]^2 */ + " movq %%rax, 64(%0);" + " add %%rcx, %%r8;" " movq %%r8, 72(%0);" + " movq 40(%1), %%rdx;" " mulx %%rdx, %%rax, %%rcx;" /* f[1]^2 */ + " adcx %%rax, %%r9;" " movq %%r9, 80(%0);" + " adcx %%rcx, %%r10;" " movq %%r10, 88(%0);" + " movq 48(%1), %%rdx;" " mulx %%rdx, %%rax, %%rcx;" /* f[2]^2 */ + " adcx %%rax, %%r11;" " movq %%r11, 96(%0);" + " adcx %%rcx, %%rbx;" " movq %%rbx, 104(%0);" + " movq 56(%1), %%rdx;" " mulx %%rdx, %%rax, %%rcx;" /* f[3]^2 */ + " adcx %%rax, %%r13;" " movq %%r13, 112(%0);" + " adcx %%rcx, %%r14;" " movq %%r14, 120(%0);" + + /* Line up pointers */ + " mov %0, %1;" + " mov %2, %0;" + + /* Step 1: Compute dst + carry == tmp_hi * 38 + tmp_lo */ + " mov $38, %%rdx;" + " mulxq 32(%1), %%r8, %%r13;" + " xor %%ecx, %%ecx;" + " adoxq 0(%1), %%r8;" + " mulxq 40(%1), %%r9, %%rbx;" + " adcx %%r13, %%r9;" + " adoxq 8(%1), %%r9;" + " mulxq 48(%1), %%r10, %%r13;" + " adcx %%rbx, %%r10;" + " adoxq 16(%1), %%r10;" + " mulxq 56(%1), %%r11, %%rax;" + " adcx %%r13, %%r11;" + " adoxq 24(%1), %%r11;" + " adcx %%rcx, %%rax;" + " adox %%rcx, %%rax;" + " imul %%rdx, %%rax;" + + /* Step 2: Fold the carry back into dst */ + " add %%rax, %%r8;" + " adcx %%rcx, %%r9;" + " movq %%r9, 8(%0);" + " adcx %%rcx, %%r10;" + " movq %%r10, 16(%0);" + " adcx %%rcx, %%r11;" + " movq %%r11, 24(%0);" + + /* Step 3: Fold the carry bit back in; guaranteed not to carry at this point */ + " mov $0, %%rax;" + " cmovc %%rdx, %%rax;" + " add %%rax, %%r8;" + " movq %%r8, 0(%0);" + + /* Step 1: Compute dst + carry == tmp_hi * 38 + tmp_lo */ + " mov $38, %%rdx;" + " mulxq 96(%1), %%r8, %%r13;" + " xor %%ecx, %%ecx;" + " adoxq 64(%1), %%r8;" + " mulxq 104(%1), %%r9, %%rbx;" + " adcx %%r13, %%r9;" + " adoxq 72(%1), %%r9;" + " mulxq 112(%1), %%r10, %%r13;" + " adcx %%rbx, %%r10;" + " adoxq 80(%1), %%r10;" + " mulxq 120(%1), %%r11, %%rax;" + " adcx %%r13, %%r11;" + " adoxq 88(%1), %%r11;" + " adcx %%rcx, %%rax;" + " adox %%rcx, %%rax;" + " imul %%rdx, %%rax;" + + /* Step 2: Fold the carry back into dst */ + " add %%rax, %%r8;" + " adcx %%rcx, %%r9;" + " movq %%r9, 40(%0);" + " adcx %%rcx, %%r10;" + " movq %%r10, 48(%0);" + " adcx %%rcx, %%r11;" + " movq %%r11, 56(%0);" + + /* Step 3: Fold the carry bit back in; guaranteed not to carry at this point */ + " mov $0, %%rax;" + " cmovc %%rdx, %%rax;" + " add %%rax, %%r8;" + " movq %%r8, 32(%0);" + : "+&r" (tmp), "+&r" (f), "+&r" (out) + : + : "%rax", "%rcx", "%rdx", "%r8", "%r9", "%r10", "%r11", "%rbx", "%r13", "%r14", "%r15", "memory", "cc" + ); +} + +static void point_add_and_double(u64 *q, u64 *p01_tmp1, u64 *tmp2) +{ + u64 *nq = p01_tmp1; + u64 *nq_p1 = p01_tmp1 + (u32)8U; + u64 *tmp1 = p01_tmp1 + (u32)16U; + u64 *x1 = q; + u64 *x2 = nq; + u64 *z2 = nq + (u32)4U; + u64 *z3 = nq_p1 + (u32)4U; + u64 *a = tmp1; + u64 *b = tmp1 + (u32)4U; + u64 *ab = tmp1; + u64 *dc = tmp1 + (u32)8U; + u64 *x3; + u64 *z31; + u64 *d0; + u64 *c0; + u64 *a1; + u64 *b1; + u64 *d; + u64 *c; + u64 *ab1; + u64 *dc1; + fadd(a, x2, z2); + fsub(b, x2, z2); + x3 = nq_p1; + z31 = nq_p1 + (u32)4U; + d0 = dc; + c0 = dc + (u32)4U; + fadd(c0, x3, z31); + fsub(d0, x3, z31); + fmul2(dc, dc, ab, tmp2); + fadd(x3, d0, c0); + fsub(z31, d0, c0); + a1 = tmp1; + b1 = tmp1 + (u32)4U; + d = tmp1 + (u32)8U; + c = tmp1 + (u32)12U; + ab1 = tmp1; + dc1 = tmp1 + (u32)8U; + fsqr2(dc1, ab1, tmp2); + fsqr2(nq_p1, nq_p1, tmp2); + a1[0U] = c[0U]; + a1[1U] = c[1U]; + a1[2U] = c[2U]; + a1[3U] = c[3U]; + fsub(c, d, c); + fmul_scalar(b1, c, (u64)121665U); + fadd(b1, b1, d); + fmul2(nq, dc1, ab1, tmp2); + fmul(z3, z3, x1, tmp2); +} + +static void point_double(u64 *nq, u64 *tmp1, u64 *tmp2) +{ + u64 *x2 = nq; + u64 *z2 = nq + (u32)4U; + u64 *a = tmp1; + u64 *b = tmp1 + (u32)4U; + u64 *d = tmp1 + (u32)8U; + u64 *c = tmp1 + (u32)12U; + u64 *ab = tmp1; + u64 *dc = tmp1 + (u32)8U; + fadd(a, x2, z2); + fsub(b, x2, z2); + fsqr2(dc, ab, tmp2); + a[0U] = c[0U]; + a[1U] = c[1U]; + a[2U] = c[2U]; + a[3U] = c[3U]; + fsub(c, d, c); + fmul_scalar(b, c, (u64)121665U); + fadd(b, b, d); + fmul2(nq, dc, ab, tmp2); +} + +static void montgomery_ladder(u64 *out, const u8 *key, u64 *init1) +{ + u64 tmp2[16U] = { 0U }; + u64 p01_tmp1_swap[33U] = { 0U }; + u64 *p0 = p01_tmp1_swap; + u64 *p01 = p01_tmp1_swap; + u64 *p03 = p01; + u64 *p11 = p01 + (u32)8U; + u64 *x0; + u64 *z0; + u64 *p01_tmp1; + u64 *p01_tmp11; + u64 *nq10; + u64 *nq_p11; + u64 *swap1; + u64 sw0; + u64 *nq1; + u64 *tmp1; + memcpy(p11, init1, (u32)8U * sizeof(init1[0U])); + x0 = p03; + z0 = p03 + (u32)4U; + x0[0U] = (u64)1U; + x0[1U] = (u64)0U; + x0[2U] = (u64)0U; + x0[3U] = (u64)0U; + z0[0U] = (u64)0U; + z0[1U] = (u64)0U; + z0[2U] = (u64)0U; + z0[3U] = (u64)0U; + p01_tmp1 = p01_tmp1_swap; + p01_tmp11 = p01_tmp1_swap; + nq10 = p01_tmp1_swap; + nq_p11 = p01_tmp1_swap + (u32)8U; + swap1 = p01_tmp1_swap + (u32)32U; + cswap2((u64)1U, nq10, nq_p11); + point_add_and_double(init1, p01_tmp11, tmp2); + swap1[0U] = (u64)1U; + { + u32 i; + for (i = (u32)0U; i < (u32)251U; i = i + (u32)1U) { + u64 *p01_tmp12 = p01_tmp1_swap; + u64 *swap2 = p01_tmp1_swap + (u32)32U; + u64 *nq2 = p01_tmp12; + u64 *nq_p12 = p01_tmp12 + (u32)8U; + u64 bit = (u64)(key[((u32)253U - i) / (u32)8U] >> ((u32)253U - i) % (u32)8U & (u8)1U); + u64 sw = swap2[0U] ^ bit; + cswap2(sw, nq2, nq_p12); + point_add_and_double(init1, p01_tmp12, tmp2); + swap2[0U] = bit; + } + } + sw0 = swap1[0U]; + cswap2(sw0, nq10, nq_p11); + nq1 = p01_tmp1; + tmp1 = p01_tmp1 + (u32)16U; + point_double(nq1, tmp1, tmp2); + point_double(nq1, tmp1, tmp2); + point_double(nq1, tmp1, tmp2); + memcpy(out, p0, (u32)8U * sizeof(p0[0U])); + + memzero_explicit(tmp2, sizeof(tmp2)); + memzero_explicit(p01_tmp1_swap, sizeof(p01_tmp1_swap)); +} + +static void fsquare_times(u64 *o, const u64 *inp, u64 *tmp, u32 n1) +{ + u32 i; + fsqr(o, inp, tmp); + for (i = (u32)0U; i < n1 - (u32)1U; i = i + (u32)1U) + fsqr(o, o, tmp); +} + +static void finv(u64 *o, const u64 *i, u64 *tmp) +{ + u64 t1[16U] = { 0U }; + u64 *a0 = t1; + u64 *b = t1 + (u32)4U; + u64 *c = t1 + (u32)8U; + u64 *t00 = t1 + (u32)12U; + u64 *tmp1 = tmp; + u64 *a; + u64 *t0; + fsquare_times(a0, i, tmp1, (u32)1U); + fsquare_times(t00, a0, tmp1, (u32)2U); + fmul(b, t00, i, tmp); + fmul(a0, b, a0, tmp); + fsquare_times(t00, a0, tmp1, (u32)1U); + fmul(b, t00, b, tmp); + fsquare_times(t00, b, tmp1, (u32)5U); + fmul(b, t00, b, tmp); + fsquare_times(t00, b, tmp1, (u32)10U); + fmul(c, t00, b, tmp); + fsquare_times(t00, c, tmp1, (u32)20U); + fmul(t00, t00, c, tmp); + fsquare_times(t00, t00, tmp1, (u32)10U); + fmul(b, t00, b, tmp); + fsquare_times(t00, b, tmp1, (u32)50U); + fmul(c, t00, b, tmp); + fsquare_times(t00, c, tmp1, (u32)100U); + fmul(t00, t00, c, tmp); + fsquare_times(t00, t00, tmp1, (u32)50U); + fmul(t00, t00, b, tmp); + fsquare_times(t00, t00, tmp1, (u32)5U); + a = t1; + t0 = t1 + (u32)12U; + fmul(o, t0, a, tmp); +} + +static void store_felem(u64 *b, u64 *f) +{ + u64 f30 = f[3U]; + u64 top_bit0 = f30 >> (u32)63U; + u64 f31; + u64 top_bit; + u64 f0; + u64 f1; + u64 f2; + u64 f3; + u64 m0; + u64 m1; + u64 m2; + u64 m3; + u64 mask; + u64 f0_; + u64 f1_; + u64 f2_; + u64 f3_; + u64 o0; + u64 o1; + u64 o2; + u64 o3; + f[3U] = f30 & (u64)0x7fffffffffffffffU; + add_scalar(f, f, (u64)19U * top_bit0); + f31 = f[3U]; + top_bit = f31 >> (u32)63U; + f[3U] = f31 & (u64)0x7fffffffffffffffU; + add_scalar(f, f, (u64)19U * top_bit); + f0 = f[0U]; + f1 = f[1U]; + f2 = f[2U]; + f3 = f[3U]; + m0 = gte_mask(f0, (u64)0xffffffffffffffedU); + m1 = eq_mask(f1, (u64)0xffffffffffffffffU); + m2 = eq_mask(f2, (u64)0xffffffffffffffffU); + m3 = eq_mask(f3, (u64)0x7fffffffffffffffU); + mask = ((m0 & m1) & m2) & m3; + f0_ = f0 - (mask & (u64)0xffffffffffffffedU); + f1_ = f1 - (mask & (u64)0xffffffffffffffffU); + f2_ = f2 - (mask & (u64)0xffffffffffffffffU); + f3_ = f3 - (mask & (u64)0x7fffffffffffffffU); + o0 = f0_; + o1 = f1_; + o2 = f2_; + o3 = f3_; + b[0U] = o0; + b[1U] = o1; + b[2U] = o2; + b[3U] = o3; +} + +static void encode_point(u8 *o, const u64 *i) +{ + const u64 *x = i; + const u64 *z = i + (u32)4U; + u64 tmp[4U] = { 0U }; + u64 tmp_w[16U] = { 0U }; + finv(tmp, z, tmp_w); + fmul(tmp, tmp, x, tmp_w); + store_felem((u64 *)o, tmp); +} + +static void curve25519_ever64(u8 *out, const u8 *priv, const u8 *pub) +{ + u64 init1[8U] = { 0U }; + u64 tmp[4U] = { 0U }; + u64 tmp3; + u64 *x; + u64 *z; + { + u32 i; + for (i = (u32)0U; i < (u32)4U; i = i + (u32)1U) { + u64 *os = tmp; + const u8 *bj = pub + i * (u32)8U; + u64 u = *(u64 *)bj; + u64 r = u; + u64 x0 = r; + os[i] = x0; + } + } + tmp3 = tmp[3U]; + tmp[3U] = tmp3 & (u64)0x7fffffffffffffffU; + x = init1; + z = init1 + (u32)4U; + z[0U] = (u64)1U; + z[1U] = (u64)0U; + z[2U] = (u64)0U; + z[3U] = (u64)0U; + x[0U] = tmp[0U]; + x[1U] = tmp[1U]; + x[2U] = tmp[2U]; + x[3U] = tmp[3U]; + montgomery_ladder(init1, priv, init1); + encode_point(out, init1); +} + +/* The below constants were generated using this sage script: + * + * #!/usr/bin/env sage + * import sys + * from sage.all import * + * def limbs(n): + * n = int(n) + * l = ((n >> 0) % 2^64, (n >> 64) % 2^64, (n >> 128) % 2^64, (n >> 192) % 2^64) + * return "0x%016xULL, 0x%016xULL, 0x%016xULL, 0x%016xULL" % l + * ec = EllipticCurve(GF(2^255 - 19), [0, 486662, 0, 1, 0]) + * p_minus_s = (ec.lift_x(9) - ec.lift_x(1))[0] + * print("static const u64 p_minus_s[] = { %s };\n" % limbs(p_minus_s)) + * print("static const u64 table_ladder[] = {") + * p = ec.lift_x(9) + * for i in range(252): + * l = (p[0] + p[2]) / (p[0] - p[2]) + * print(("\t%s" + ("," if i != 251 else "")) % limbs(l)) + * p = p * 2 + * print("};") + * + */ + +static const u64 p_minus_s[] = { 0x816b1e0137d48290ULL, 0x440f6a51eb4d1207ULL, 0x52385f46dca2b71dULL, 0x215132111d8354cbULL }; + +static const u64 table_ladder[] = { + 0xfffffffffffffff3ULL, 0xffffffffffffffffULL, 0xffffffffffffffffULL, 0x5fffffffffffffffULL, + 0x6b8220f416aafe96ULL, 0x82ebeb2b4f566a34ULL, 0xd5a9a5b075a5950fULL, 0x5142b2cf4b2488f4ULL, + 0x6aaebc750069680cULL, 0x89cf7820a0f99c41ULL, 0x2a58d9183b56d0f4ULL, 0x4b5aca80e36011a4ULL, + 0x329132348c29745dULL, 0xf4a2e616e1642fd7ULL, 0x1e45bb03ff67bc34ULL, 0x306912d0f42a9b4aULL, + 0xff886507e6af7154ULL, 0x04f50e13dfeec82fULL, 0xaa512fe82abab5ceULL, 0x174e251a68d5f222ULL, + 0xcf96700d82028898ULL, 0x1743e3370a2c02c5ULL, 0x379eec98b4e86eaaULL, 0x0c59888a51e0482eULL, + 0xfbcbf1d699b5d189ULL, 0xacaef0d58e9fdc84ULL, 0xc1c20d06231f7614ULL, 0x2938218da274f972ULL, + 0xf6af49beff1d7f18ULL, 0xcc541c22387ac9c2ULL, 0x96fcc9ef4015c56bULL, 0x69c1627c690913a9ULL, + 0x7a86fd2f4733db0eULL, 0xfdb8c4f29e087de9ULL, 0x095e4b1a8ea2a229ULL, 0x1ad7a7c829b37a79ULL, + 0x342d89cad17ea0c0ULL, 0x67bedda6cced2051ULL, 0x19ca31bf2bb42f74ULL, 0x3df7b4c84980acbbULL, + 0xa8c6444dc80ad883ULL, 0xb91e440366e3ab85ULL, 0xc215cda00164f6d8ULL, 0x3d867c6ef247e668ULL, + 0xc7dd582bcc3e658cULL, 0xfd2c4748ee0e5528ULL, 0xa0fd9b95cc9f4f71ULL, 0x7529d871b0675ddfULL, + 0xb8f568b42d3cbd78ULL, 0x1233011b91f3da82ULL, 0x2dce6ccd4a7c3b62ULL, 0x75e7fc8e9e498603ULL, + 0x2f4f13f1fcd0b6ecULL, 0xf1a8ca1f29ff7a45ULL, 0xc249c1a72981e29bULL, 0x6ebe0dbb8c83b56aULL, + 0x7114fa8d170bb222ULL, 0x65a2dcd5bf93935fULL, 0xbdc41f68b59c979aULL, 0x2f0eef79a2ce9289ULL, + 0x42ecbf0c083c37ceULL, 0x2930bc09ec496322ULL, 0xf294b0c19cfeac0dULL, 0x3780aa4bedfabb80ULL, + 0x56c17d3e7cead929ULL, 0xe7cb4beb2e5722c5ULL, 0x0ce931732dbfe15aULL, 0x41b883c7621052f8ULL, + 0xdbf75ca0c3d25350ULL, 0x2936be086eb1e351ULL, 0xc936e03cb4a9b212ULL, 0x1d45bf82322225aaULL, + 0xe81ab1036a024cc5ULL, 0xe212201c304c9a72ULL, 0xc5d73fba6832b1fcULL, 0x20ffdb5a4d839581ULL, + 0xa283d367be5d0fadULL, 0x6c2b25ca8b164475ULL, 0x9d4935467caaf22eULL, 0x5166408eee85ff49ULL, + 0x3c67baa2fab4e361ULL, 0xb3e433c67ef35cefULL, 0x5259729241159b1cULL, 0x6a621892d5b0ab33ULL, + 0x20b74a387555cdcbULL, 0x532aa10e1208923fULL, 0xeaa17b7762281dd1ULL, 0x61ab3443f05c44bfULL, + 0x257a6c422324def8ULL, 0x131c6c1017e3cf7fULL, 0x23758739f630a257ULL, 0x295a407a01a78580ULL, + 0xf8c443246d5da8d9ULL, 0x19d775450c52fa5dULL, 0x2afcfc92731bf83dULL, 0x7d10c8e81b2b4700ULL, + 0xc8e0271f70baa20bULL, 0x993748867ca63957ULL, 0x5412efb3cb7ed4bbULL, 0x3196d36173e62975ULL, + 0xde5bcad141c7dffcULL, 0x47cc8cd2b395c848ULL, 0xa34cd942e11af3cbULL, 0x0256dbf2d04ecec2ULL, + 0x875ab7e94b0e667fULL, 0xcad4dd83c0850d10ULL, 0x47f12e8f4e72c79fULL, 0x5f1a87bb8c85b19bULL, + 0x7ae9d0b6437f51b8ULL, 0x12c7ce5518879065ULL, 0x2ade09fe5cf77aeeULL, 0x23a05a2f7d2c5627ULL, + 0x5908e128f17c169aULL, 0xf77498dd8ad0852dULL, 0x74b4c4ceab102f64ULL, 0x183abadd10139845ULL, + 0xb165ba8daa92aaacULL, 0xd5c5ef9599386705ULL, 0xbe2f8f0cf8fc40d1ULL, 0x2701e635ee204514ULL, + 0x629fa80020156514ULL, 0xf223868764a8c1ceULL, 0x5b894fff0b3f060eULL, 0x60d9944cf708a3faULL, + 0xaeea001a1c7a201fULL, 0xebf16a633ee2ce63ULL, 0x6f7709594c7a07e1ULL, 0x79b958150d0208cbULL, + 0x24b55e5301d410e7ULL, 0xe3a34edff3fdc84dULL, 0xd88768e4904032d8ULL, 0x131384427b3aaeecULL, + 0x8405e51286234f14ULL, 0x14dc4739adb4c529ULL, 0xb8a2b5b250634ffdULL, 0x2fe2a94ad8a7ff93ULL, + 0xec5c57efe843faddULL, 0x2843ce40f0bb9918ULL, 0xa4b561d6cf3d6305ULL, 0x743629bde8fb777eULL, + 0x343edd46bbaf738fULL, 0xed981828b101a651ULL, 0xa401760b882c797aULL, 0x1fc223e28dc88730ULL, + 0x48604e91fc0fba0eULL, 0xb637f78f052c6fa4ULL, 0x91ccac3d09e9239cULL, 0x23f7eed4437a687cULL, + 0x5173b1118d9bd800ULL, 0x29d641b63189d4a7ULL, 0xfdbf177988bbc586ULL, 0x2959894fcad81df5ULL, + 0xaebc8ef3b4bbc899ULL, 0x4148995ab26992b9ULL, 0x24e20b0134f92cfbULL, 0x40d158894a05dee8ULL, + 0x46b00b1185af76f6ULL, 0x26bac77873187a79ULL, 0x3dc0bf95ab8fff5fULL, 0x2a608bd8945524d7ULL, + 0x26449588bd446302ULL, 0x7c4bc21c0388439cULL, 0x8e98a4f383bd11b2ULL, 0x26218d7bc9d876b9ULL, + 0xe3081542997c178aULL, 0x3c2d29a86fb6606fULL, 0x5c217736fa279374ULL, 0x7dde05734afeb1faULL, + 0x3bf10e3906d42babULL, 0xe4f7803e1980649cULL, 0xe6053bf89595bf7aULL, 0x394faf38da245530ULL, + 0x7a8efb58896928f4ULL, 0xfbc778e9cc6a113cULL, 0x72670ce330af596fULL, 0x48f222a81d3d6cf7ULL, + 0xf01fce410d72caa7ULL, 0x5a20ecc7213b5595ULL, 0x7bc21165c1fa1483ULL, 0x07f89ae31da8a741ULL, + 0x05d2c2b4c6830ff9ULL, 0xd43e330fc6316293ULL, 0xa5a5590a96d3a904ULL, 0x705edb91a65333b6ULL, + 0x048ee15e0bb9a5f7ULL, 0x3240cfca9e0aaf5dULL, 0x8f4b71ceedc4a40bULL, 0x621c0da3de544a6dULL, + 0x92872836a08c4091ULL, 0xce8375b010c91445ULL, 0x8a72eb524f276394ULL, 0x2667fcfa7ec83635ULL, + 0x7f4c173345e8752aULL, 0x061b47feee7079a5ULL, 0x25dd9afa9f86ff34ULL, 0x3780cef5425dc89cULL, + 0x1a46035a513bb4e9ULL, 0x3e1ef379ac575adaULL, 0xc78c5f1c5fa24b50ULL, 0x321a967634fd9f22ULL, + 0x946707b8826e27faULL, 0x3dca84d64c506fd0ULL, 0xc189218075e91436ULL, 0x6d9284169b3b8484ULL, + 0x3a67e840383f2ddfULL, 0x33eec9a30c4f9b75ULL, 0x3ec7c86fa783ef47ULL, 0x26ec449fbac9fbc4ULL, + 0x5c0f38cba09b9e7dULL, 0x81168cc762a3478cULL, 0x3e23b0d306fc121cULL, 0x5a238aa0a5efdcddULL, + 0x1ba26121c4ea43ffULL, 0x36f8c77f7c8832b5ULL, 0x88fbea0b0adcf99aULL, 0x5ca9938ec25bebf9ULL, + 0xd5436a5e51fccda0ULL, 0x1dbc4797c2cd893bULL, 0x19346a65d3224a08ULL, 0x0f5034e49b9af466ULL, + 0xf23c3967a1e0b96eULL, 0xe58b08fa867a4d88ULL, 0xfb2fabc6a7341679ULL, 0x2a75381eb6026946ULL, + 0xc80a3be4c19420acULL, 0x66b1f6c681f2b6dcULL, 0x7cf7036761e93388ULL, 0x25abbbd8a660a4c4ULL, + 0x91ea12ba14fd5198ULL, 0x684950fc4a3cffa9ULL, 0xf826842130f5ad28ULL, 0x3ea988f75301a441ULL, + 0xc978109a695f8c6fULL, 0x1746eb4a0530c3f3ULL, 0x444d6d77b4459995ULL, 0x75952b8c054e5cc7ULL, + 0xa3703f7915f4d6aaULL, 0x66c346202f2647d8ULL, 0xd01469df811d644bULL, 0x77fea47d81a5d71fULL, + 0xc5e9529ef57ca381ULL, 0x6eeeb4b9ce2f881aULL, 0xb6e91a28e8009bd6ULL, 0x4b80be3e9afc3fecULL, + 0x7e3773c526aed2c5ULL, 0x1b4afcb453c9a49dULL, 0xa920bdd7baffb24dULL, 0x7c54699f122d400eULL, + 0xef46c8e14fa94bc8ULL, 0xe0b074ce2952ed5eULL, 0xbea450e1dbd885d5ULL, 0x61b68649320f712cULL, + 0x8a485f7309ccbdd1ULL, 0xbd06320d7d4d1a2dULL, 0x25232973322dbef4ULL, 0x445dc4758c17f770ULL, + 0xdb0434177cc8933cULL, 0xed6fe82175ea059fULL, 0x1efebefdc053db34ULL, 0x4adbe867c65daf99ULL, + 0x3acd71a2a90609dfULL, 0xe5e991856dd04050ULL, 0x1ec69b688157c23cULL, 0x697427f6885cfe4dULL, + 0xd7be7b9b65e1a851ULL, 0xa03d28d522c536ddULL, 0x28399d658fd2b645ULL, 0x49e5b7e17c2641e1ULL, + 0x6f8c3a98700457a4ULL, 0x5078f0a25ebb6778ULL, 0xd13c3ccbc382960fULL, 0x2e003258a7df84b1ULL, + 0x8ad1f39be6296a1cULL, 0xc1eeaa652a5fbfb2ULL, 0x33ee0673fd26f3cbULL, 0x59256173a69d2cccULL, + 0x41ea07aa4e18fc41ULL, 0xd9fc19527c87a51eULL, 0xbdaacb805831ca6fULL, 0x445b652dc916694fULL, + 0xce92a3a7f2172315ULL, 0x1edc282de11b9964ULL, 0xa1823aafe04c314aULL, 0x790a2d94437cf586ULL, + 0x71c447fb93f6e009ULL, 0x8922a56722845276ULL, 0xbf70903b204f5169ULL, 0x2f7a89891ba319feULL, + 0x02a08eb577e2140cULL, 0xed9a4ed4427bdcf4ULL, 0x5253ec44e4323cd1ULL, 0x3e88363c14e9355bULL, + 0xaa66c14277110b8cULL, 0x1ae0391610a23390ULL, 0x2030bd12c93fc2a2ULL, 0x3ee141579555c7abULL, + 0x9214de3a6d6e7d41ULL, 0x3ccdd88607f17efeULL, 0x674f1288f8e11217ULL, 0x5682250f329f93d0ULL, + 0x6cf00b136d2e396eULL, 0x6e4cf86f1014debfULL, 0x5930b1b5bfcc4e83ULL, 0x047069b48aba16b6ULL, + 0x0d4ce4ab69b20793ULL, 0xb24db91a97d0fb9eULL, 0xcdfa50f54e00d01dULL, 0x221b1085368bddb5ULL, + 0xe7e59468b1e3d8d2ULL, 0x53c56563bd122f93ULL, 0xeee8a903e0663f09ULL, 0x61efa662cbbe3d42ULL, + 0x2cf8ddddde6eab2aULL, 0x9bf80ad51435f231ULL, 0x5deadacec9f04973ULL, 0x29275b5d41d29b27ULL, + 0xcfde0f0895ebf14fULL, 0xb9aab96b054905a7ULL, 0xcae80dd9a1c420fdULL, 0x0a63bf2f1673bbc7ULL, + 0x092f6e11958fbc8cULL, 0x672a81e804822fadULL, 0xcac8351560d52517ULL, 0x6f3f7722c8f192f8ULL, + 0xf8ba90ccc2e894b7ULL, 0x2c7557a438ff9f0dULL, 0x894d1d855ae52359ULL, 0x68e122157b743d69ULL, + 0xd87e5570cfb919f3ULL, 0x3f2cdecd95798db9ULL, 0x2121154710c0a2ceULL, 0x3c66a115246dc5b2ULL, + 0xcbedc562294ecb72ULL, 0xba7143c36a280b16ULL, 0x9610c2efd4078b67ULL, 0x6144735d946a4b1eULL, + 0x536f111ed75b3350ULL, 0x0211db8c2041d81bULL, 0xf93cb1000e10413cULL, 0x149dfd3c039e8876ULL, + 0xd479dde46b63155bULL, 0xb66e15e93c837976ULL, 0xdafde43b1f13e038ULL, 0x5fafda1a2e4b0b35ULL, + 0x3600bbdf17197581ULL, 0x3972050bbe3cd2c2ULL, 0x5938906dbdd5be86ULL, 0x34fce5e43f9b860fULL, + 0x75a8a4cd42d14d02ULL, 0x828dabc53441df65ULL, 0x33dcabedd2e131d3ULL, 0x3ebad76fb814d25fULL, + 0xd4906f566f70e10fULL, 0x5d12f7aa51690f5aULL, 0x45adb16e76cefcf2ULL, 0x01f768aead232999ULL, + 0x2b6cc77b6248febdULL, 0x3cd30628ec3aaffdULL, 0xce1c0b80d4ef486aULL, 0x4c3bff2ea6f66c23ULL, + 0x3f2ec4094aeaeb5fULL, 0x61b19b286e372ca7ULL, 0x5eefa966de2a701dULL, 0x23b20565de55e3efULL, + 0xe301ca5279d58557ULL, 0x07b2d4ce27c2874fULL, 0xa532cd8a9dcf1d67ULL, 0x2a52fee23f2bff56ULL, + 0x8624efb37cd8663dULL, 0xbbc7ac20ffbd7594ULL, 0x57b85e9c82d37445ULL, 0x7b3052cb86a6ec66ULL, + 0x3482f0ad2525e91eULL, 0x2cb68043d28edca0ULL, 0xaf4f6d052e1b003aULL, 0x185f8c2529781b0aULL, + 0xaa41de5bd80ce0d6ULL, 0x9407b2416853e9d6ULL, 0x563ec36e357f4c3aULL, 0x4cc4b8dd0e297bceULL, + 0xa2fc1a52ffb8730eULL, 0x1811f16e67058e37ULL, 0x10f9a366cddf4ee1ULL, 0x72f4a0c4a0b9f099ULL, + 0x8c16c06f663f4ea7ULL, 0x693b3af74e970fbaULL, 0x2102e7f1d69ec345ULL, 0x0ba53cbc968a8089ULL, + 0xca3d9dc7fea15537ULL, 0x4c6824bb51536493ULL, 0xb9886314844006b1ULL, 0x40d2a72ab454cc60ULL, + 0x5936a1b712570975ULL, 0x91b9d648debda657ULL, 0x3344094bb64330eaULL, 0x006ba10d12ee51d0ULL, + 0x19228468f5de5d58ULL, 0x0eb12f4c38cc05b0ULL, 0xa1039f9dd5601990ULL, 0x4502d4ce4fff0e0bULL, + 0xeb2054106837c189ULL, 0xd0f6544c6dd3b93cULL, 0x40727064c416d74fULL, 0x6e15c6114b502ef0ULL, + 0x4df2a398cfb1a76bULL, 0x11256c7419f2f6b1ULL, 0x4a497962066e6043ULL, 0x705b3aab41355b44ULL, + 0x365ef536d797b1d8ULL, 0x00076bd622ddf0dbULL, 0x3bbf33b0e0575a88ULL, 0x3777aa05c8e4ca4dULL, + 0x392745c85578db5fULL, 0x6fda4149dbae5ae2ULL, 0xb1f0b00b8adc9867ULL, 0x09963437d36f1da3ULL, + 0x7e824e90a5dc3853ULL, 0xccb5f6641f135cbdULL, 0x6736d86c87ce8fccULL, 0x625f3ce26604249fULL, + 0xaf8ac8059502f63fULL, 0x0c05e70a2e351469ULL, 0x35292e9c764b6305ULL, 0x1a394360c7e23ac3ULL, + 0xd5c6d53251183264ULL, 0x62065abd43c2b74fULL, 0xb5fbf5d03b973f9bULL, 0x13a3da3661206e5eULL, + 0xc6bd5837725d94e5ULL, 0x18e30912205016c5ULL, 0x2088ce1570033c68ULL, 0x7fba1f495c837987ULL, + 0x5a8c7423f2f9079dULL, 0x1735157b34023fc5ULL, 0xe4f9b49ad2fab351ULL, 0x6691ff72c878e33cULL, + 0x122c2adedc5eff3eULL, 0xf8dd4bf1d8956cf4ULL, 0xeb86205d9e9e5bdaULL, 0x049b92b9d975c743ULL, + 0xa5379730b0f6c05aULL, 0x72a0ffacc6f3a553ULL, 0xb0032c34b20dcd6dULL, 0x470e9dbc88d5164aULL, + 0xb19cf10ca237c047ULL, 0xb65466711f6c81a2ULL, 0xb3321bd16dd80b43ULL, 0x48c14f600c5fbe8eULL, + 0x66451c264aa6c803ULL, 0xb66e3904a4fa7da6ULL, 0xd45f19b0b3128395ULL, 0x31602627c3c9bc10ULL, + 0x3120dc4832e4e10dULL, 0xeb20c46756c717f7ULL, 0x00f52e3f67280294ULL, 0x566d4fc14730c509ULL, + 0x7e3a5d40fd837206ULL, 0xc1e926dc7159547aULL, 0x216730fba68d6095ULL, 0x22e8c3843f69cea7ULL, + 0x33d074e8930e4b2bULL, 0xb6e4350e84d15816ULL, 0x5534c26ad6ba2365ULL, 0x7773c12f89f1f3f3ULL, + 0x8cba404da57962aaULL, 0x5b9897a81999ce56ULL, 0x508e862f121692fcULL, 0x3a81907fa093c291ULL, + 0x0dded0ff4725a510ULL, 0x10d8cc10673fc503ULL, 0x5b9d151c9f1f4e89ULL, 0x32a5c1d5cb09a44cULL, + 0x1e0aa442b90541fbULL, 0x5f85eb7cc1b485dbULL, 0xbee595ce8a9df2e5ULL, 0x25e496c722422236ULL, + 0x5edf3c46cd0fe5b9ULL, 0x34e75a7ed2a43388ULL, 0xe488de11d761e352ULL, 0x0e878a01a085545cULL, + 0xba493c77e021bb04ULL, 0x2b4d1843c7df899aULL, 0x9ea37a487ae80d67ULL, 0x67a9958011e41794ULL, + 0x4b58051a6697b065ULL, 0x47e33f7d8d6ba6d4ULL, 0xbb4da8d483ca46c1ULL, 0x68becaa181c2db0dULL, + 0x8d8980e90b989aa5ULL, 0xf95eb14a2c93c99bULL, 0x51c6c7c4796e73a2ULL, 0x6e228363b5efb569ULL, + 0xc6bbc0b02dd624c8ULL, 0x777eb47dec8170eeULL, 0x3cde15a004cfafa9ULL, 0x1dc6bc087160bf9bULL, + 0x2e07e043eec34002ULL, 0x18e9fc677a68dc7fULL, 0xd8da03188bd15b9aULL, 0x48fbc3bb00568253ULL, + 0x57547d4cfb654ce1ULL, 0xd3565b82a058e2adULL, 0xf63eaf0bbf154478ULL, 0x47531ef114dfbb18ULL, + 0xe1ec630a4278c587ULL, 0x5507d546ca8e83f3ULL, 0x85e135c63adc0c2bULL, 0x0aa7efa85682844eULL, + 0x72691ba8b3e1f615ULL, 0x32b4e9701fbe3ffaULL, 0x97b6d92e39bb7868ULL, 0x2cfe53dea02e39e8ULL, + 0x687392cd85cd52b0ULL, 0x27ff66c910e29831ULL, 0x97134556a9832d06ULL, 0x269bb0360a84f8a0ULL, + 0x706e55457643f85cULL, 0x3734a48c9b597d1bULL, 0x7aee91e8c6efa472ULL, 0x5cd6abc198a9d9e0ULL, + 0x0e04de06cb3ce41aULL, 0xd8c6eb893402e138ULL, 0x904659bb686e3772ULL, 0x7215c371746ba8c8ULL, + 0xfd12a97eeae4a2d9ULL, 0x9514b7516394f2c5ULL, 0x266fd5809208f294ULL, 0x5c847085619a26b9ULL, + 0x52985410fed694eaULL, 0x3c905b934a2ed254ULL, 0x10bb47692d3be467ULL, 0x063b3d2d69e5e9e1ULL, + 0x472726eedda57debULL, 0xefb6c4ae10f41891ULL, 0x2b1641917b307614ULL, 0x117c554fc4f45b7cULL, + 0xc07cf3118f9d8812ULL, 0x01dbd82050017939ULL, 0xd7e803f4171b2827ULL, 0x1015e87487d225eaULL, + 0xc58de3fed23acc4dULL, 0x50db91c294a7be2dULL, 0x0b94d43d1c9cf457ULL, 0x6b1640fa6e37524aULL, + 0x692f346c5fda0d09ULL, 0x200b1c59fa4d3151ULL, 0xb8c46f760777a296ULL, 0x4b38395f3ffdfbcfULL, + 0x18d25e00be54d671ULL, 0x60d50582bec8aba6ULL, 0x87ad8f263b78b982ULL, 0x50fdf64e9cda0432ULL, + 0x90f567aac578dcf0ULL, 0xef1e9b0ef2a3133bULL, 0x0eebba9242d9de71ULL, 0x15473c9bf03101c7ULL, + 0x7c77e8ae56b78095ULL, 0xb678e7666e6f078eULL, 0x2da0b9615348ba1fULL, 0x7cf931c1ff733f0bULL, + 0x26b357f50a0a366cULL, 0xe9708cf42b87d732ULL, 0xc13aeea5f91cb2c0ULL, 0x35d90c991143bb4cULL, + 0x47c1c404a9a0d9dcULL, 0x659e58451972d251ULL, 0x3875a8c473b38c31ULL, 0x1fbd9ed379561f24ULL, + 0x11fabc6fd41ec28dULL, 0x7ef8dfe3cd2a2dcaULL, 0x72e73b5d8c404595ULL, 0x6135fa4954b72f27ULL, + 0xccfc32a2de24b69cULL, 0x3f55698c1f095d88ULL, 0xbe3350ed5ac3f929ULL, 0x5e9bf806ca477eebULL, + 0xe9ce8fb63c309f68ULL, 0x5376f63565e1f9f4ULL, 0xd1afcfb35a6393f1ULL, 0x6632a1ede5623506ULL, + 0x0b7d6c390c2ded4cULL, 0x56cb3281df04cb1fULL, 0x66305a1249ecc3c7ULL, 0x5d588b60a38ca72aULL, + 0xa6ecbf78e8e5f42dULL, 0x86eeb44b3c8a3eecULL, 0xec219c48fbd21604ULL, 0x1aaf1af517c36731ULL, + 0xc306a2836769bde7ULL, 0x208280622b1e2adbULL, 0x8027f51ffbff94a6ULL, 0x76cfa1ce1124f26bULL, + 0x18eb00562422abb6ULL, 0xf377c4d58f8c29c3ULL, 0x4dbbc207f531561aULL, 0x0253b7f082128a27ULL, + 0x3d1f091cb62c17e0ULL, 0x4860e1abd64628a9ULL, 0x52d17436309d4253ULL, 0x356f97e13efae576ULL, + 0xd351e11aa150535bULL, 0x3e6b45bb1dd878ccULL, 0x0c776128bed92c98ULL, 0x1d34ae93032885b8ULL, + 0x4ba0488ca85ba4c3ULL, 0x985348c33c9ce6ceULL, 0x66124c6f97bda770ULL, 0x0f81a0290654124aULL, + 0x9ed09ca6569b86fdULL, 0x811009fd18af9a2dULL, 0xff08d03f93d8c20aULL, 0x52a148199faef26bULL, + 0x3e03f9dc2d8d1b73ULL, 0x4205801873961a70ULL, 0xc0d987f041a35970ULL, 0x07aa1f15a1c0d549ULL, + 0xdfd46ce08cd27224ULL, 0x6d0a024f934e4239ULL, 0x808a7a6399897b59ULL, 0x0a4556e9e13d95a2ULL, + 0xd21a991fe9c13045ULL, 0x9b0e8548fe7751b8ULL, 0x5da643cb4bf30035ULL, 0x77db28d63940f721ULL, + 0xfc5eeb614adc9011ULL, 0x5229419ae8c411ebULL, 0x9ec3e7787d1dcf74ULL, 0x340d053e216e4cb5ULL, + 0xcac7af39b48df2b4ULL, 0xc0faec2871a10a94ULL, 0x140a69245ca575edULL, 0x0cf1c37134273a4cULL, + 0xc8ee306ac224b8a5ULL, 0x57eaee7ccb4930b0ULL, 0xa1e806bdaacbe74fULL, 0x7d9a62742eeb657dULL, + 0x9eb6b6ef546c4830ULL, 0x885cca1fddb36e2eULL, 0xe6b9f383ef0d7105ULL, 0x58654fef9d2e0412ULL, + 0xa905c4ffbe0e8e26ULL, 0x942de5df9b31816eULL, 0x497d723f802e88e1ULL, 0x30684dea602f408dULL, + 0x21e5a278a3e6cb34ULL, 0xaefb6e6f5b151dc4ULL, 0xb30b8e049d77ca15ULL, 0x28c3c9cf53b98981ULL, + 0x287fb721556cdd2aULL, 0x0d317ca897022274ULL, 0x7468c7423a543258ULL, 0x4a7f11464eb5642fULL, + 0xa237a4774d193aa6ULL, 0xd865986ea92129a1ULL, 0x24c515ecf87c1a88ULL, 0x604003575f39f5ebULL, + 0x47b9f189570a9b27ULL, 0x2b98cede465e4b78ULL, 0x026df551dbb85c20ULL, 0x74fcd91047e21901ULL, + 0x13e2a90a23c1bfa3ULL, 0x0cb0074e478519f6ULL, 0x5ff1cbbe3af6cf44ULL, 0x67fe5438be812dbeULL, + 0xd13cf64fa40f05b0ULL, 0x054dfb2f32283787ULL, 0x4173915b7f0d2aeaULL, 0x482f144f1f610d4eULL, + 0xf6210201b47f8234ULL, 0x5d0ae1929e70b990ULL, 0xdcd7f455b049567cULL, 0x7e93d0f1f0916f01ULL, + 0xdd79cbf18a7db4faULL, 0xbe8391bf6f74c62fULL, 0x027145d14b8291bdULL, 0x585a73ea2cbf1705ULL, + 0x485ca03e928a0db2ULL, 0x10fc01a5742857e7ULL, 0x2f482edbd6d551a7ULL, 0x0f0433b5048fdb8aULL, + 0x60da2e8dd7dc6247ULL, 0x88b4c9d38cd4819aULL, 0x13033ac001f66697ULL, 0x273b24fe3b367d75ULL, + 0xc6e8f66a31b3b9d4ULL, 0x281514a494df49d5ULL, 0xd1726fdfc8b23da7ULL, 0x4b3ae7d103dee548ULL, + 0xc6256e19ce4b9d7eULL, 0xff5c5cf186e3c61cULL, 0xacc63ca34b8ec145ULL, 0x74621888fee66574ULL, + 0x956f409645290a1eULL, 0xef0bf8e3263a962eULL, 0xed6a50eb5ec2647bULL, 0x0694283a9dca7502ULL, + 0x769b963643a2dcd1ULL, 0x42b7c8ea09fc5353ULL, 0x4f002aee13397eabULL, 0x63005e2c19b7d63aULL, + 0xca6736da63023beaULL, 0x966c7f6db12a99b7ULL, 0xace09390c537c5e1ULL, 0x0b696063a1aa89eeULL, + 0xebb03e97288c56e5ULL, 0x432a9f9f938c8be8ULL, 0xa6a5a93d5b717f71ULL, 0x1a5fb4c3e18f9d97ULL, + 0x1c94e7ad1c60cdceULL, 0xee202a43fc02c4a0ULL, 0x8dafe4d867c46a20ULL, 0x0a10263c8ac27b58ULL, + 0xd0dea9dfe4432a4aULL, 0x856af87bbe9277c5ULL, 0xce8472acc212c71aULL, 0x6f151b6d9bbb1e91ULL, + 0x26776c527ceed56aULL, 0x7d211cb7fbf8faecULL, 0x37ae66a6fd4609ccULL, 0x1f81b702d2770c42ULL, + 0x2fb0b057eac58392ULL, 0xe1dd89fe29744e9dULL, 0xc964f8eb17beb4f8ULL, 0x29571073c9a2d41eULL, + 0xa948a18981c0e254ULL, 0x2df6369b65b22830ULL, 0xa33eb2d75fcfd3c6ULL, 0x078cd6ec4199a01fULL, + 0x4a584a41ad900d2fULL, 0x32142b78e2c74c52ULL, 0x68c4e8338431c978ULL, 0x7f69ea9008689fc2ULL, + 0x52f2c81e46a38265ULL, 0xfd78072d04a832fdULL, 0x8cd7d5fa25359e94ULL, 0x4de71b7454cc29d2ULL, + 0x42eb60ad1eda6ac9ULL, 0x0aad37dfdbc09c3aULL, 0x81004b71e33cc191ULL, 0x44e6be345122803cULL, + 0x03fe8388ba1920dbULL, 0xf5d57c32150db008ULL, 0x49c8c4281af60c29ULL, 0x21edb518de701aeeULL, + 0x7fb63e418f06dc99ULL, 0xa4460d99c166d7b8ULL, 0x24dd5248ce520a83ULL, 0x5ec3ad712b928358ULL, + 0x15022a5fbd17930fULL, 0xa4f64a77d82570e3ULL, 0x12bc8d6915783712ULL, 0x498194c0fc620abbULL, + 0x38a2d9d255686c82ULL, 0x785c6bd9193e21f0ULL, 0xe4d5c81ab24a5484ULL, 0x56307860b2e20989ULL, + 0x429d55f78b4d74c4ULL, 0x22f1834643350131ULL, 0x1e60c24598c71fffULL, 0x59f2f014979983efULL, + 0x46a47d56eb494a44ULL, 0x3e22a854d636a18eULL, 0xb346e15274491c3bULL, 0x2ceafd4e5390cde7ULL, + 0xba8a8538be0d6675ULL, 0x4b9074bb50818e23ULL, 0xcbdab89085d304c3ULL, 0x61a24fe0e56192c4ULL, + 0xcb7615e6db525bcbULL, 0xdd7d8c35a567e4caULL, 0xe6b4153acafcdd69ULL, 0x2d668e097f3c9766ULL, + 0xa57e7e265ce55ef0ULL, 0x5d9f4e527cd4b967ULL, 0xfbc83606492fd1e5ULL, 0x090d52beb7c3f7aeULL, + 0x09b9515a1e7b4d7cULL, 0x1f266a2599da44c0ULL, 0xa1c49548e2c55504ULL, 0x7ef04287126f15ccULL, + 0xfed1659dbd30ef15ULL, 0x8b4ab9eec4e0277bULL, 0x884d6236a5df3291ULL, 0x1fd96ea6bf5cf788ULL, + 0x42a161981f190d9aULL, 0x61d849507e6052c1ULL, 0x9fe113bf285a2cd5ULL, 0x7c22d676dbad85d8ULL, + 0x82e770ed2bfbd27dULL, 0x4c05b2ece996f5a5ULL, 0xcd40a9c2b0900150ULL, 0x5895319213d9bf64ULL, + 0xe7cc5d703fea2e08ULL, 0xb50c491258e2188cULL, 0xcce30baa48205bf0ULL, 0x537c659ccfa32d62ULL, + 0x37b6623a98cfc088ULL, 0xfe9bed1fa4d6aca4ULL, 0x04d29b8e56a8d1b0ULL, 0x725f71c40b519575ULL, + 0x28c7f89cd0339ce6ULL, 0x8367b14469ddc18bULL, 0x883ada83a6a1652cULL, 0x585f1974034d6c17ULL, + 0x89cfb266f1b19188ULL, 0xe63b4863e7c35217ULL, 0xd88c9da6b4c0526aULL, 0x3e035c9df0954635ULL, + 0xdd9d5412fb45de9dULL, 0xdd684532e4cff40dULL, 0x4b5c999b151d671cULL, 0x2d8c2cc811e7f690ULL, + 0x7f54be1d90055d40ULL, 0xa464c5df464aaf40ULL, 0x33979624f0e917beULL, 0x2c018dc527356b30ULL, + 0xa5415024e330b3d4ULL, 0x73ff3d96691652d3ULL, 0x94ec42c4ef9b59f1ULL, 0x0747201618d08e5aULL, + 0x4d6ca48aca411c53ULL, 0x66415f2fcfa66119ULL, 0x9c4dd40051e227ffULL, 0x59810bc09a02f7ebULL, + 0x2a7eb171b3dc101dULL, 0x441c5ab99ffef68eULL, 0x32025c9b93b359eaULL, 0x5e8ce0a71e9d112fULL, + 0xbfcccb92429503fdULL, 0xd271ba752f095d55ULL, 0x345ead5e972d091eULL, 0x18c8df11a83103baULL, + 0x90cd949a9aed0f4cULL, 0xc5d1f4cb6660e37eULL, 0xb8cac52d56c52e0bULL, 0x6e42e400c5808e0dULL, + 0xa3b46966eeaefd23ULL, 0x0c4f1f0be39ecdcaULL, 0x189dc8c9d683a51dULL, 0x51f27f054c09351bULL, + 0x4c487ccd2a320682ULL, 0x587ea95bb3df1c96ULL, 0xc8ccf79e555cb8e8ULL, 0x547dc829a206d73dULL, + 0xb822a6cd80c39b06ULL, 0xe96d54732000d4c6ULL, 0x28535b6f91463b4dULL, 0x228f4660e2486e1dULL, + 0x98799538de8d3abfULL, 0x8cd8330045ebca6eULL, 0x79952a008221e738ULL, 0x4322e1a7535cd2bbULL, + 0xb114c11819d1801cULL, 0x2016e4d84f3f5ec7ULL, 0xdd0e2df409260f4cULL, 0x5ec362c0ae5f7266ULL, + 0xc0462b18b8b2b4eeULL, 0x7cc8d950274d1afbULL, 0xf25f7105436b02d2ULL, 0x43bbf8dcbff9ccd3ULL, + 0xb6ad1767a039e9dfULL, 0xb0714da8f69d3583ULL, 0x5e55fa18b42931f5ULL, 0x4ed5558f33c60961ULL, + 0x1fe37901c647a5ddULL, 0x593ddf1f8081d357ULL, 0x0249a4fd813fd7a6ULL, 0x69acca274e9caf61ULL, + 0x047ba3ea330721c9ULL, 0x83423fc20e7e1ea0ULL, 0x1df4c0af01314a60ULL, 0x09a62dab89289527ULL, + 0xa5b325a49cc6cb00ULL, 0xe94b5dc654b56cb6ULL, 0x3be28779adc994a0ULL, 0x4296e8f8ba3a4aadULL, + 0x328689761e451eabULL, 0x2e4d598bff59594aULL, 0x49b96853d7a7084aULL, 0x4980a319601420a8ULL, + 0x9565b9e12f552c42ULL, 0x8a5318db7100fe96ULL, 0x05c90b4d43add0d7ULL, 0x538b4cd66a5d4edaULL, + 0xf4e94fc3e89f039fULL, 0x592c9af26f618045ULL, 0x08a36eb5fd4b9550ULL, 0x25fffaf6c2ed1419ULL, + 0x34434459cc79d354ULL, 0xeeecbfb4b1d5476bULL, 0xddeb34a061615d99ULL, 0x5129cecceb64b773ULL, + 0xee43215894993520ULL, 0x772f9c7cf14c0b3bULL, 0xd2e2fce306bedad5ULL, 0x715f42b546f06a97ULL, + 0x434ecdceda5b5f1aULL, 0x0da17115a49741a9ULL, 0x680bd77c73edad2eULL, 0x487c02354edd9041ULL, + 0xb8efeff3a70ed9c4ULL, 0x56a32aa3e857e302ULL, 0xdf3a68bd48a2a5a0ULL, 0x07f650b73176c444ULL, + 0xe38b9b1626e0ccb1ULL, 0x79e053c18b09fb36ULL, 0x56d90319c9f94964ULL, 0x1ca941e7ac9ff5c4ULL, + 0x49c4df29162fa0bbULL, 0x8488cf3282b33305ULL, 0x95dfda14cabb437dULL, 0x3391f78264d5ad86ULL, + 0x729ae06ae2b5095dULL, 0xd58a58d73259a946ULL, 0xe9834262d13921edULL, 0x27fedafaa54bb592ULL, + 0xa99dc5b829ad48bbULL, 0x5f025742499ee260ULL, 0x802c8ecd5d7513fdULL, 0x78ceb3ef3f6dd938ULL, + 0xc342f44f8a135d94ULL, 0x7b9edb44828cdda3ULL, 0x9436d11a0537cfe7ULL, 0x5064b164ec1ab4c8ULL, + 0x7020eccfd37eb2fcULL, 0x1f31ea3ed90d25fcULL, 0x1b930d7bdfa1bb34ULL, 0x5344467a48113044ULL, + 0x70073170f25e6dfbULL, 0xe385dc1a50114cc8ULL, 0x2348698ac8fc4f00ULL, 0x2a77a55284dd40d8ULL, + 0xfe06afe0c98c6ce4ULL, 0xc235df96dddfd6e4ULL, 0x1428d01e33bf1ed3ULL, 0x785768ec9300bdafULL, + 0x9702e57a91deb63bULL, 0x61bdb8bfe5ce8b80ULL, 0x645b426f3d1d58acULL, 0x4804a82227a557bcULL, + 0x8e57048ab44d2601ULL, 0x68d6501a4b3a6935ULL, 0xc39c9ec3f9e1c293ULL, 0x4172f257d4de63e2ULL, + 0xd368b450330c6401ULL, 0x040d3017418f2391ULL, 0x2c34bb6090b7d90dULL, 0x16f649228fdfd51fULL, + 0xbea6818e2b928ef5ULL, 0xe28ccf91cdc11e72ULL, 0x594aaa68e77a36cdULL, 0x313034806c7ffd0fULL, + 0x8a9d27ac2249bd65ULL, 0x19a3b464018e9512ULL, 0xc26ccff352b37ec7ULL, 0x056f68341d797b21ULL, + 0x5e79d6757efd2327ULL, 0xfabdbcb6553afe15ULL, 0xd3e7222c6eaf5a60ULL, 0x7046c76d4dae743bULL, + 0x660be872b18d4a55ULL, 0x19992518574e1496ULL, 0xc103053a302bdcbbULL, 0x3ed8e9800b218e8eULL, + 0x7b0b9239fa75e03eULL, 0xefe9fb684633c083ULL, 0x98a35fbe391a7793ULL, 0x6065510fe2d0fe34ULL, + 0x55cb668548abad0cULL, 0xb4584548da87e527ULL, 0x2c43ecea0107c1ddULL, 0x526028809372de35ULL, + 0x3415c56af9213b1fULL, 0x5bee1a4d017e98dbULL, 0x13f6b105b5cf709bULL, 0x5ff20e3482b29ab6ULL, + 0x0aa29c75cc2e6c90ULL, 0xfc7d73ca3a70e206ULL, 0x899fc38fc4b5c515ULL, 0x250386b124ffc207ULL, + 0x54ea28d5ae3d2b56ULL, 0x9913149dd6de60ceULL, 0x16694fc58f06d6c1ULL, 0x46b23975eb018fc7ULL, + 0x470a6a0fb4b7b4e2ULL, 0x5d92475a8f7253deULL, 0xabeee5b52fbd3adbULL, 0x7fa20801a0806968ULL, + 0x76f3faf19f7714d2ULL, 0xb3e840c12f4660c3ULL, 0x0fb4cd8df212744eULL, 0x4b065a251d3a2dd2ULL, + 0x5cebde383d77cd4aULL, 0x6adf39df882c9cb1ULL, 0xa2dd242eb09af759ULL, 0x3147c0e50e5f6422ULL, + 0x164ca5101d1350dbULL, 0xf8d13479c33fc962ULL, 0xe640ce4d13e5da08ULL, 0x4bdee0c45061f8baULL, + 0xd7c46dc1a4edb1c9ULL, 0x5514d7b6437fd98aULL, 0x58942f6bb2a1c00bULL, 0x2dffb2ab1d70710eULL, + 0xccdfcf2fc18b6d68ULL, 0xa8ebcba8b7806167ULL, 0x980697f95e2937e3ULL, 0x02fbba1cd0126e8cULL +}; + +static void curve25519_ever64_base(u8 *out, const u8 *priv) +{ + u64 swap = 1; + int i, j, k; + u64 tmp[16 + 32 + 4]; + u64 *x1 = &tmp[0]; + u64 *z1 = &tmp[4]; + u64 *x2 = &tmp[8]; + u64 *z2 = &tmp[12]; + u64 *xz1 = &tmp[0]; + u64 *xz2 = &tmp[8]; + u64 *a = &tmp[0 + 16]; + u64 *b = &tmp[4 + 16]; + u64 *c = &tmp[8 + 16]; + u64 *ab = &tmp[0 + 16]; + u64 *abcd = &tmp[0 + 16]; + u64 *ef = &tmp[16 + 16]; + u64 *efgh = &tmp[16 + 16]; + u64 *key = &tmp[0 + 16 + 32]; + + memcpy(key, priv, 32); + ((u8 *)key)[0] &= 248; + ((u8 *)key)[31] = (((u8 *)key)[31] & 127) | 64; + + x1[0] = 1, x1[1] = x1[2] = x1[3] = 0; + z1[0] = 1, z1[1] = z1[2] = z1[3] = 0; + z2[0] = 1, z2[1] = z2[2] = z2[3] = 0; + memcpy(x2, p_minus_s, sizeof(p_minus_s)); + + j = 3; + for (i = 0; i < 4; ++i) { + while (j < (const int[]){ 64, 64, 64, 63 }[i]) { + u64 bit = (key[i] >> j) & 1; + k = (64 * i + j - 3); + swap = swap ^ bit; + cswap2(swap, xz1, xz2); + swap = bit; + fsub(b, x1, z1); + fadd(a, x1, z1); + fmul(c, &table_ladder[4 * k], b, ef); + fsub(b, a, c); + fadd(a, a, c); + fsqr2(ab, ab, efgh); + fmul2(xz1, xz2, ab, efgh); + ++j; + } + j = 0; + } + + point_double(xz1, abcd, efgh); + point_double(xz1, abcd, efgh); + point_double(xz1, abcd, efgh); + encode_point(out, xz1); + + memzero_explicit(tmp, sizeof(tmp)); +} + +static __ro_after_init DEFINE_STATIC_KEY_FALSE(curve25519_use_bmi2_adx); + +void curve25519_arch(u8 mypublic[CURVE25519_KEY_SIZE], + const u8 secret[CURVE25519_KEY_SIZE], + const u8 basepoint[CURVE25519_KEY_SIZE]) +{ + if (static_branch_likely(&curve25519_use_bmi2_adx)) + curve25519_ever64(mypublic, secret, basepoint); + else + curve25519_generic(mypublic, secret, basepoint); +} +EXPORT_SYMBOL(curve25519_arch); + +void curve25519_base_arch(u8 pub[CURVE25519_KEY_SIZE], + const u8 secret[CURVE25519_KEY_SIZE]) +{ + if (static_branch_likely(&curve25519_use_bmi2_adx)) + curve25519_ever64_base(pub, secret); + else + curve25519_generic(pub, secret, curve25519_base_point); +} +EXPORT_SYMBOL(curve25519_base_arch); + +static int curve25519_set_secret(struct crypto_kpp *tfm, const void *buf, + unsigned int len) +{ + u8 *secret = kpp_tfm_ctx(tfm); + + if (!len) + curve25519_generate_secret(secret); + else if (len == CURVE25519_KEY_SIZE && + crypto_memneq(buf, curve25519_null_point, CURVE25519_KEY_SIZE)) + memcpy(secret, buf, CURVE25519_KEY_SIZE); + else + return -EINVAL; + return 0; +} + +static int curve25519_generate_public_key(struct kpp_request *req) +{ + struct crypto_kpp *tfm = crypto_kpp_reqtfm(req); + const u8 *secret = kpp_tfm_ctx(tfm); + u8 buf[CURVE25519_KEY_SIZE]; + int copied, nbytes; + + if (req->src) + return -EINVAL; + + curve25519_base_arch(buf, secret); + + /* might want less than we've got */ + nbytes = min_t(size_t, CURVE25519_KEY_SIZE, req->dst_len); + copied = sg_copy_from_buffer(req->dst, sg_nents_for_len(req->dst, + nbytes), + buf, nbytes); + if (copied != nbytes) + return -EINVAL; + return 0; +} + +static int curve25519_compute_shared_secret(struct kpp_request *req) +{ + struct crypto_kpp *tfm = crypto_kpp_reqtfm(req); + const u8 *secret = kpp_tfm_ctx(tfm); + u8 public_key[CURVE25519_KEY_SIZE]; + u8 buf[CURVE25519_KEY_SIZE]; + int copied, nbytes; + + if (!req->src) + return -EINVAL; + + copied = sg_copy_to_buffer(req->src, + sg_nents_for_len(req->src, + CURVE25519_KEY_SIZE), + public_key, CURVE25519_KEY_SIZE); + if (copied != CURVE25519_KEY_SIZE) + return -EINVAL; + + curve25519_arch(buf, secret, public_key); + + /* might want less than we've got */ + nbytes = min_t(size_t, CURVE25519_KEY_SIZE, req->dst_len); + copied = sg_copy_from_buffer(req->dst, sg_nents_for_len(req->dst, + nbytes), + buf, nbytes); + if (copied != nbytes) + return -EINVAL; + return 0; +} + +static unsigned int curve25519_max_size(struct crypto_kpp *tfm) +{ + return CURVE25519_KEY_SIZE; +} + +static struct kpp_alg curve25519_alg = { + .base.cra_name = "curve25519", + .base.cra_driver_name = "curve25519-x86", + .base.cra_priority = 200, + .base.cra_module = THIS_MODULE, + .base.cra_ctxsize = CURVE25519_KEY_SIZE, + + .set_secret = curve25519_set_secret, + .generate_public_key = curve25519_generate_public_key, + .compute_shared_secret = curve25519_compute_shared_secret, + .max_size = curve25519_max_size, +}; + + +static int __init curve25519_mod_init(void) +{ + if (boot_cpu_has(X86_FEATURE_BMI2) && boot_cpu_has(X86_FEATURE_ADX)) + static_branch_enable(&curve25519_use_bmi2_adx); + else + return 0; + return IS_REACHABLE(CONFIG_CRYPTO_KPP) ? + crypto_register_kpp(&curve25519_alg) : 0; +} + +static void __exit curve25519_mod_exit(void) +{ + if (IS_REACHABLE(CONFIG_CRYPTO_KPP) && + (boot_cpu_has(X86_FEATURE_BMI2) || boot_cpu_has(X86_FEATURE_ADX))) + crypto_unregister_kpp(&curve25519_alg); +} + +module_init(curve25519_mod_init); +module_exit(curve25519_mod_exit); + +MODULE_ALIAS_CRYPTO("curve25519"); +MODULE_ALIAS_CRYPTO("curve25519-x86"); +MODULE_LICENSE("GPL v2"); +MODULE_AUTHOR("Jason A. Donenfeld "); diff --git a/arch/x86/crypto/poly1305-avx2-x86_64.S b/arch/x86/crypto/poly1305-avx2-x86_64.S deleted file mode 100644 index 8457cdd47f75..000000000000 --- a/arch/x86/crypto/poly1305-avx2-x86_64.S +++ /dev/null @@ -1,394 +0,0 @@ -/* - * Poly1305 authenticator algorithm, RFC7539, x64 AVX2 functions - * - * Copyright (C) 2015 Martin Willi - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - */ - -#include - -.section .rodata.cst32.ANMASK, "aM", @progbits, 32 -.align 32 -ANMASK: .octa 0x0000000003ffffff0000000003ffffff - .octa 0x0000000003ffffff0000000003ffffff - -.section .rodata.cst32.ORMASK, "aM", @progbits, 32 -.align 32 -ORMASK: .octa 0x00000000010000000000000001000000 - .octa 0x00000000010000000000000001000000 - -.text - -#define h0 0x00(%rdi) -#define h1 0x04(%rdi) -#define h2 0x08(%rdi) -#define h3 0x0c(%rdi) -#define h4 0x10(%rdi) -#define r0 0x00(%rdx) -#define r1 0x04(%rdx) -#define r2 0x08(%rdx) -#define r3 0x0c(%rdx) -#define r4 0x10(%rdx) -#define u0 0x00(%r8) -#define u1 0x04(%r8) -#define u2 0x08(%r8) -#define u3 0x0c(%r8) -#define u4 0x10(%r8) -#define w0 0x14(%r8) -#define w1 0x18(%r8) -#define w2 0x1c(%r8) -#define w3 0x20(%r8) -#define w4 0x24(%r8) -#define y0 0x28(%r8) -#define y1 0x2c(%r8) -#define y2 0x30(%r8) -#define y3 0x34(%r8) -#define y4 0x38(%r8) -#define m %rsi -#define hc0 %ymm0 -#define hc1 %ymm1 -#define hc2 %ymm2 -#define hc3 %ymm3 -#define hc4 %ymm4 -#define hc0x %xmm0 -#define hc1x %xmm1 -#define hc2x %xmm2 -#define hc3x %xmm3 -#define hc4x %xmm4 -#define t1 %ymm5 -#define t2 %ymm6 -#define t1x %xmm5 -#define t2x %xmm6 -#define ruwy0 %ymm7 -#define ruwy1 %ymm8 -#define ruwy2 %ymm9 -#define ruwy3 %ymm10 -#define ruwy4 %ymm11 -#define ruwy0x %xmm7 -#define ruwy1x %xmm8 -#define ruwy2x %xmm9 -#define ruwy3x %xmm10 -#define ruwy4x %xmm11 -#define svxz1 %ymm12 -#define svxz2 %ymm13 -#define svxz3 %ymm14 -#define svxz4 %ymm15 -#define d0 %r9 -#define d1 %r10 -#define d2 %r11 -#define d3 %r12 -#define d4 %r13 - -ENTRY(poly1305_4block_avx2) - # %rdi: Accumulator h[5] - # %rsi: 64 byte input block m - # %rdx: Poly1305 key r[5] - # %rcx: Quadblock count - # %r8: Poly1305 derived key r^2 u[5], r^3 w[5], r^4 y[5], - - # This four-block variant uses loop unrolled block processing. It - # requires 4 Poly1305 keys: r, r^2, r^3 and r^4: - # h = (h + m) * r => h = (h + m1) * r^4 + m2 * r^3 + m3 * r^2 + m4 * r - - vzeroupper - push %rbx - push %r12 - push %r13 - - # combine r0,u0,w0,y0 - vmovd y0,ruwy0x - vmovd w0,t1x - vpunpcklqdq t1,ruwy0,ruwy0 - vmovd u0,t1x - vmovd r0,t2x - vpunpcklqdq t2,t1,t1 - vperm2i128 $0x20,t1,ruwy0,ruwy0 - - # combine r1,u1,w1,y1 and s1=r1*5,v1=u1*5,x1=w1*5,z1=y1*5 - vmovd y1,ruwy1x - vmovd w1,t1x - vpunpcklqdq t1,ruwy1,ruwy1 - vmovd u1,t1x - vmovd r1,t2x - vpunpcklqdq t2,t1,t1 - vperm2i128 $0x20,t1,ruwy1,ruwy1 - vpslld $2,ruwy1,svxz1 - vpaddd ruwy1,svxz1,svxz1 - - # combine r2,u2,w2,y2 and s2=r2*5,v2=u2*5,x2=w2*5,z2=y2*5 - vmovd y2,ruwy2x - vmovd w2,t1x - vpunpcklqdq t1,ruwy2,ruwy2 - vmovd u2,t1x - vmovd r2,t2x - vpunpcklqdq t2,t1,t1 - vperm2i128 $0x20,t1,ruwy2,ruwy2 - vpslld $2,ruwy2,svxz2 - vpaddd ruwy2,svxz2,svxz2 - - # combine r3,u3,w3,y3 and s3=r3*5,v3=u3*5,x3=w3*5,z3=y3*5 - vmovd y3,ruwy3x - vmovd w3,t1x - vpunpcklqdq t1,ruwy3,ruwy3 - vmovd u3,t1x - vmovd r3,t2x - vpunpcklqdq t2,t1,t1 - vperm2i128 $0x20,t1,ruwy3,ruwy3 - vpslld $2,ruwy3,svxz3 - vpaddd ruwy3,svxz3,svxz3 - - # combine r4,u4,w4,y4 and s4=r4*5,v4=u4*5,x4=w4*5,z4=y4*5 - vmovd y4,ruwy4x - vmovd w4,t1x - vpunpcklqdq t1,ruwy4,ruwy4 - vmovd u4,t1x - vmovd r4,t2x - vpunpcklqdq t2,t1,t1 - vperm2i128 $0x20,t1,ruwy4,ruwy4 - vpslld $2,ruwy4,svxz4 - vpaddd ruwy4,svxz4,svxz4 - -.Ldoblock4: - # hc0 = [m[48-51] & 0x3ffffff, m[32-35] & 0x3ffffff, - # m[16-19] & 0x3ffffff, m[ 0- 3] & 0x3ffffff + h0] - vmovd 0x00(m),hc0x - vmovd 0x10(m),t1x - vpunpcklqdq t1,hc0,hc0 - vmovd 0x20(m),t1x - vmovd 0x30(m),t2x - vpunpcklqdq t2,t1,t1 - vperm2i128 $0x20,t1,hc0,hc0 - vpand ANMASK(%rip),hc0,hc0 - vmovd h0,t1x - vpaddd t1,hc0,hc0 - # hc1 = [(m[51-54] >> 2) & 0x3ffffff, (m[35-38] >> 2) & 0x3ffffff, - # (m[19-22] >> 2) & 0x3ffffff, (m[ 3- 6] >> 2) & 0x3ffffff + h1] - vmovd 0x03(m),hc1x - vmovd 0x13(m),t1x - vpunpcklqdq t1,hc1,hc1 - vmovd 0x23(m),t1x - vmovd 0x33(m),t2x - vpunpcklqdq t2,t1,t1 - vperm2i128 $0x20,t1,hc1,hc1 - vpsrld $2,hc1,hc1 - vpand ANMASK(%rip),hc1,hc1 - vmovd h1,t1x - vpaddd t1,hc1,hc1 - # hc2 = [(m[54-57] >> 4) & 0x3ffffff, (m[38-41] >> 4) & 0x3ffffff, - # (m[22-25] >> 4) & 0x3ffffff, (m[ 6- 9] >> 4) & 0x3ffffff + h2] - vmovd 0x06(m),hc2x - vmovd 0x16(m),t1x - vpunpcklqdq t1,hc2,hc2 - vmovd 0x26(m),t1x - vmovd 0x36(m),t2x - vpunpcklqdq t2,t1,t1 - vperm2i128 $0x20,t1,hc2,hc2 - vpsrld $4,hc2,hc2 - vpand ANMASK(%rip),hc2,hc2 - vmovd h2,t1x - vpaddd t1,hc2,hc2 - # hc3 = [(m[57-60] >> 6) & 0x3ffffff, (m[41-44] >> 6) & 0x3ffffff, - # (m[25-28] >> 6) & 0x3ffffff, (m[ 9-12] >> 6) & 0x3ffffff + h3] - vmovd 0x09(m),hc3x - vmovd 0x19(m),t1x - vpunpcklqdq t1,hc3,hc3 - vmovd 0x29(m),t1x - vmovd 0x39(m),t2x - vpunpcklqdq t2,t1,t1 - vperm2i128 $0x20,t1,hc3,hc3 - vpsrld $6,hc3,hc3 - vpand ANMASK(%rip),hc3,hc3 - vmovd h3,t1x - vpaddd t1,hc3,hc3 - # hc4 = [(m[60-63] >> 8) | (1<<24), (m[44-47] >> 8) | (1<<24), - # (m[28-31] >> 8) | (1<<24), (m[12-15] >> 8) | (1<<24) + h4] - vmovd 0x0c(m),hc4x - vmovd 0x1c(m),t1x - vpunpcklqdq t1,hc4,hc4 - vmovd 0x2c(m),t1x - vmovd 0x3c(m),t2x - vpunpcklqdq t2,t1,t1 - vperm2i128 $0x20,t1,hc4,hc4 - vpsrld $8,hc4,hc4 - vpor ORMASK(%rip),hc4,hc4 - vmovd h4,t1x - vpaddd t1,hc4,hc4 - - # t1 = [ hc0[3] * r0, hc0[2] * u0, hc0[1] * w0, hc0[0] * y0 ] - vpmuludq hc0,ruwy0,t1 - # t1 += [ hc1[3] * s4, hc1[2] * v4, hc1[1] * x4, hc1[0] * z4 ] - vpmuludq hc1,svxz4,t2 - vpaddq t2,t1,t1 - # t1 += [ hc2[3] * s3, hc2[2] * v3, hc2[1] * x3, hc2[0] * z3 ] - vpmuludq hc2,svxz3,t2 - vpaddq t2,t1,t1 - # t1 += [ hc3[3] * s2, hc3[2] * v2, hc3[1] * x2, hc3[0] * z2 ] - vpmuludq hc3,svxz2,t2 - vpaddq t2,t1,t1 - # t1 += [ hc4[3] * s1, hc4[2] * v1, hc4[1] * x1, hc4[0] * z1 ] - vpmuludq hc4,svxz1,t2 - vpaddq t2,t1,t1 - # d0 = t1[0] + t1[1] + t[2] + t[3] - vpermq $0xee,t1,t2 - vpaddq t2,t1,t1 - vpsrldq $8,t1,t2 - vpaddq t2,t1,t1 - vmovq t1x,d0 - - # t1 = [ hc0[3] * r1, hc0[2] * u1,hc0[1] * w1, hc0[0] * y1 ] - vpmuludq hc0,ruwy1,t1 - # t1 += [ hc1[3] * r0, hc1[2] * u0, hc1[1] * w0, hc1[0] * y0 ] - vpmuludq hc1,ruwy0,t2 - vpaddq t2,t1,t1 - # t1 += [ hc2[3] * s4, hc2[2] * v4, hc2[1] * x4, hc2[0] * z4 ] - vpmuludq hc2,svxz4,t2 - vpaddq t2,t1,t1 - # t1 += [ hc3[3] * s3, hc3[2] * v3, hc3[1] * x3, hc3[0] * z3 ] - vpmuludq hc3,svxz3,t2 - vpaddq t2,t1,t1 - # t1 += [ hc4[3] * s2, hc4[2] * v2, hc4[1] * x2, hc4[0] * z2 ] - vpmuludq hc4,svxz2,t2 - vpaddq t2,t1,t1 - # d1 = t1[0] + t1[1] + t1[3] + t1[4] - vpermq $0xee,t1,t2 - vpaddq t2,t1,t1 - vpsrldq $8,t1,t2 - vpaddq t2,t1,t1 - vmovq t1x,d1 - - # t1 = [ hc0[3] * r2, hc0[2] * u2, hc0[1] * w2, hc0[0] * y2 ] - vpmuludq hc0,ruwy2,t1 - # t1 += [ hc1[3] * r1, hc1[2] * u1, hc1[1] * w1, hc1[0] * y1 ] - vpmuludq hc1,ruwy1,t2 - vpaddq t2,t1,t1 - # t1 += [ hc2[3] * r0, hc2[2] * u0, hc2[1] * w0, hc2[0] * y0 ] - vpmuludq hc2,ruwy0,t2 - vpaddq t2,t1,t1 - # t1 += [ hc3[3] * s4, hc3[2] * v4, hc3[1] * x4, hc3[0] * z4 ] - vpmuludq hc3,svxz4,t2 - vpaddq t2,t1,t1 - # t1 += [ hc4[3] * s3, hc4[2] * v3, hc4[1] * x3, hc4[0] * z3 ] - vpmuludq hc4,svxz3,t2 - vpaddq t2,t1,t1 - # d2 = t1[0] + t1[1] + t1[2] + t1[3] - vpermq $0xee,t1,t2 - vpaddq t2,t1,t1 - vpsrldq $8,t1,t2 - vpaddq t2,t1,t1 - vmovq t1x,d2 - - # t1 = [ hc0[3] * r3, hc0[2] * u3, hc0[1] * w3, hc0[0] * y3 ] - vpmuludq hc0,ruwy3,t1 - # t1 += [ hc1[3] * r2, hc1[2] * u2, hc1[1] * w2, hc1[0] * y2 ] - vpmuludq hc1,ruwy2,t2 - vpaddq t2,t1,t1 - # t1 += [ hc2[3] * r1, hc2[2] * u1, hc2[1] * w1, hc2[0] * y1 ] - vpmuludq hc2,ruwy1,t2 - vpaddq t2,t1,t1 - # t1 += [ hc3[3] * r0, hc3[2] * u0, hc3[1] * w0, hc3[0] * y0 ] - vpmuludq hc3,ruwy0,t2 - vpaddq t2,t1,t1 - # t1 += [ hc4[3] * s4, hc4[2] * v4, hc4[1] * x4, hc4[0] * z4 ] - vpmuludq hc4,svxz4,t2 - vpaddq t2,t1,t1 - # d3 = t1[0] + t1[1] + t1[2] + t1[3] - vpermq $0xee,t1,t2 - vpaddq t2,t1,t1 - vpsrldq $8,t1,t2 - vpaddq t2,t1,t1 - vmovq t1x,d3 - - # t1 = [ hc0[3] * r4, hc0[2] * u4, hc0[1] * w4, hc0[0] * y4 ] - vpmuludq hc0,ruwy4,t1 - # t1 += [ hc1[3] * r3, hc1[2] * u3, hc1[1] * w3, hc1[0] * y3 ] - vpmuludq hc1,ruwy3,t2 - vpaddq t2,t1,t1 - # t1 += [ hc2[3] * r2, hc2[2] * u2, hc2[1] * w2, hc2[0] * y2 ] - vpmuludq hc2,ruwy2,t2 - vpaddq t2,t1,t1 - # t1 += [ hc3[3] * r1, hc3[2] * u1, hc3[1] * w1, hc3[0] * y1 ] - vpmuludq hc3,ruwy1,t2 - vpaddq t2,t1,t1 - # t1 += [ hc4[3] * r0, hc4[2] * u0, hc4[1] * w0, hc4[0] * y0 ] - vpmuludq hc4,ruwy0,t2 - vpaddq t2,t1,t1 - # d4 = t1[0] + t1[1] + t1[2] + t1[3] - vpermq $0xee,t1,t2 - vpaddq t2,t1,t1 - vpsrldq $8,t1,t2 - vpaddq t2,t1,t1 - vmovq t1x,d4 - - # Now do a partial reduction mod (2^130)-5, carrying h0 -> h1 -> h2 -> - # h3 -> h4 -> h0 -> h1 to get h0,h2,h3,h4 < 2^26 and h1 < 2^26 + a small - # amount. Careful: we must not assume the carry bits 'd0 >> 26', - # 'd1 >> 26', 'd2 >> 26', 'd3 >> 26', and '(d4 >> 26) * 5' fit in 32-bit - # integers. It's true in a single-block implementation, but not here. - - # d1 += d0 >> 26 - mov d0,%rax - shr $26,%rax - add %rax,d1 - # h0 = d0 & 0x3ffffff - mov d0,%rbx - and $0x3ffffff,%ebx - - # d2 += d1 >> 26 - mov d1,%rax - shr $26,%rax - add %rax,d2 - # h1 = d1 & 0x3ffffff - mov d1,%rax - and $0x3ffffff,%eax - mov %eax,h1 - - # d3 += d2 >> 26 - mov d2,%rax - shr $26,%rax - add %rax,d3 - # h2 = d2 & 0x3ffffff - mov d2,%rax - and $0x3ffffff,%eax - mov %eax,h2 - - # d4 += d3 >> 26 - mov d3,%rax - shr $26,%rax - add %rax,d4 - # h3 = d3 & 0x3ffffff - mov d3,%rax - and $0x3ffffff,%eax - mov %eax,h3 - - # h0 += (d4 >> 26) * 5 - mov d4,%rax - shr $26,%rax - lea (%rax,%rax,4),%rax - add %rax,%rbx - # h4 = d4 & 0x3ffffff - mov d4,%rax - and $0x3ffffff,%eax - mov %eax,h4 - - # h1 += h0 >> 26 - mov %rbx,%rax - shr $26,%rax - add %eax,h1 - # h0 = h0 & 0x3ffffff - andl $0x3ffffff,%ebx - mov %ebx,h0 - - add $0x40,m - dec %rcx - jnz .Ldoblock4 - - vzeroupper - pop %r13 - pop %r12 - pop %rbx - ret -ENDPROC(poly1305_4block_avx2) diff --git a/arch/x86/crypto/poly1305-sse2-x86_64.S b/arch/x86/crypto/poly1305-sse2-x86_64.S deleted file mode 100644 index 5851c7418fb7..000000000000 --- a/arch/x86/crypto/poly1305-sse2-x86_64.S +++ /dev/null @@ -1,590 +0,0 @@ -/* - * Poly1305 authenticator algorithm, RFC7539, x64 SSE2 functions - * - * Copyright (C) 2015 Martin Willi - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - */ - -#include - -.section .rodata.cst16.ANMASK, "aM", @progbits, 16 -.align 16 -ANMASK: .octa 0x0000000003ffffff0000000003ffffff - -.section .rodata.cst16.ORMASK, "aM", @progbits, 16 -.align 16 -ORMASK: .octa 0x00000000010000000000000001000000 - -.text - -#define h0 0x00(%rdi) -#define h1 0x04(%rdi) -#define h2 0x08(%rdi) -#define h3 0x0c(%rdi) -#define h4 0x10(%rdi) -#define r0 0x00(%rdx) -#define r1 0x04(%rdx) -#define r2 0x08(%rdx) -#define r3 0x0c(%rdx) -#define r4 0x10(%rdx) -#define s1 0x00(%rsp) -#define s2 0x04(%rsp) -#define s3 0x08(%rsp) -#define s4 0x0c(%rsp) -#define m %rsi -#define h01 %xmm0 -#define h23 %xmm1 -#define h44 %xmm2 -#define t1 %xmm3 -#define t2 %xmm4 -#define t3 %xmm5 -#define t4 %xmm6 -#define mask %xmm7 -#define d0 %r8 -#define d1 %r9 -#define d2 %r10 -#define d3 %r11 -#define d4 %r12 - -ENTRY(poly1305_block_sse2) - # %rdi: Accumulator h[5] - # %rsi: 16 byte input block m - # %rdx: Poly1305 key r[5] - # %rcx: Block count - - # This single block variant tries to improve performance by doing two - # multiplications in parallel using SSE instructions. There is quite - # some quardword packing involved, hence the speedup is marginal. - - push %rbx - push %r12 - sub $0x10,%rsp - - # s1..s4 = r1..r4 * 5 - mov r1,%eax - lea (%eax,%eax,4),%eax - mov %eax,s1 - mov r2,%eax - lea (%eax,%eax,4),%eax - mov %eax,s2 - mov r3,%eax - lea (%eax,%eax,4),%eax - mov %eax,s3 - mov r4,%eax - lea (%eax,%eax,4),%eax - mov %eax,s4 - - movdqa ANMASK(%rip),mask - -.Ldoblock: - # h01 = [0, h1, 0, h0] - # h23 = [0, h3, 0, h2] - # h44 = [0, h4, 0, h4] - movd h0,h01 - movd h1,t1 - movd h2,h23 - movd h3,t2 - movd h4,h44 - punpcklqdq t1,h01 - punpcklqdq t2,h23 - punpcklqdq h44,h44 - - # h01 += [ (m[3-6] >> 2) & 0x3ffffff, m[0-3] & 0x3ffffff ] - movd 0x00(m),t1 - movd 0x03(m),t2 - psrld $2,t2 - punpcklqdq t2,t1 - pand mask,t1 - paddd t1,h01 - # h23 += [ (m[9-12] >> 6) & 0x3ffffff, (m[6-9] >> 4) & 0x3ffffff ] - movd 0x06(m),t1 - movd 0x09(m),t2 - psrld $4,t1 - psrld $6,t2 - punpcklqdq t2,t1 - pand mask,t1 - paddd t1,h23 - # h44 += [ (m[12-15] >> 8) | (1 << 24), (m[12-15] >> 8) | (1 << 24) ] - mov 0x0c(m),%eax - shr $8,%eax - or $0x01000000,%eax - movd %eax,t1 - pshufd $0xc4,t1,t1 - paddd t1,h44 - - # t1[0] = h0 * r0 + h2 * s3 - # t1[1] = h1 * s4 + h3 * s2 - movd r0,t1 - movd s4,t2 - punpcklqdq t2,t1 - pmuludq h01,t1 - movd s3,t2 - movd s2,t3 - punpcklqdq t3,t2 - pmuludq h23,t2 - paddq t2,t1 - # t2[0] = h0 * r1 + h2 * s4 - # t2[1] = h1 * r0 + h3 * s3 - movd r1,t2 - movd r0,t3 - punpcklqdq t3,t2 - pmuludq h01,t2 - movd s4,t3 - movd s3,t4 - punpcklqdq t4,t3 - pmuludq h23,t3 - paddq t3,t2 - # t3[0] = h4 * s1 - # t3[1] = h4 * s2 - movd s1,t3 - movd s2,t4 - punpcklqdq t4,t3 - pmuludq h44,t3 - # d0 = t1[0] + t1[1] + t3[0] - # d1 = t2[0] + t2[1] + t3[1] - movdqa t1,t4 - punpcklqdq t2,t4 - punpckhqdq t2,t1 - paddq t4,t1 - paddq t3,t1 - movq t1,d0 - psrldq $8,t1 - movq t1,d1 - - # t1[0] = h0 * r2 + h2 * r0 - # t1[1] = h1 * r1 + h3 * s4 - movd r2,t1 - movd r1,t2 - punpcklqdq t2,t1 - pmuludq h01,t1 - movd r0,t2 - movd s4,t3 - punpcklqdq t3,t2 - pmuludq h23,t2 - paddq t2,t1 - # t2[0] = h0 * r3 + h2 * r1 - # t2[1] = h1 * r2 + h3 * r0 - movd r3,t2 - movd r2,t3 - punpcklqdq t3,t2 - pmuludq h01,t2 - movd r1,t3 - movd r0,t4 - punpcklqdq t4,t3 - pmuludq h23,t3 - paddq t3,t2 - # t3[0] = h4 * s3 - # t3[1] = h4 * s4 - movd s3,t3 - movd s4,t4 - punpcklqdq t4,t3 - pmuludq h44,t3 - # d2 = t1[0] + t1[1] + t3[0] - # d3 = t2[0] + t2[1] + t3[1] - movdqa t1,t4 - punpcklqdq t2,t4 - punpckhqdq t2,t1 - paddq t4,t1 - paddq t3,t1 - movq t1,d2 - psrldq $8,t1 - movq t1,d3 - - # t1[0] = h0 * r4 + h2 * r2 - # t1[1] = h1 * r3 + h3 * r1 - movd r4,t1 - movd r3,t2 - punpcklqdq t2,t1 - pmuludq h01,t1 - movd r2,t2 - movd r1,t3 - punpcklqdq t3,t2 - pmuludq h23,t2 - paddq t2,t1 - # t3[0] = h4 * r0 - movd r0,t3 - pmuludq h44,t3 - # d4 = t1[0] + t1[1] + t3[0] - movdqa t1,t4 - psrldq $8,t4 - paddq t4,t1 - paddq t3,t1 - movq t1,d4 - - # d1 += d0 >> 26 - mov d0,%rax - shr $26,%rax - add %rax,d1 - # h0 = d0 & 0x3ffffff - mov d0,%rbx - and $0x3ffffff,%ebx - - # d2 += d1 >> 26 - mov d1,%rax - shr $26,%rax - add %rax,d2 - # h1 = d1 & 0x3ffffff - mov d1,%rax - and $0x3ffffff,%eax - mov %eax,h1 - - # d3 += d2 >> 26 - mov d2,%rax - shr $26,%rax - add %rax,d3 - # h2 = d2 & 0x3ffffff - mov d2,%rax - and $0x3ffffff,%eax - mov %eax,h2 - - # d4 += d3 >> 26 - mov d3,%rax - shr $26,%rax - add %rax,d4 - # h3 = d3 & 0x3ffffff - mov d3,%rax - and $0x3ffffff,%eax - mov %eax,h3 - - # h0 += (d4 >> 26) * 5 - mov d4,%rax - shr $26,%rax - lea (%rax,%rax,4),%rax - add %rax,%rbx - # h4 = d4 & 0x3ffffff - mov d4,%rax - and $0x3ffffff,%eax - mov %eax,h4 - - # h1 += h0 >> 26 - mov %rbx,%rax - shr $26,%rax - add %eax,h1 - # h0 = h0 & 0x3ffffff - andl $0x3ffffff,%ebx - mov %ebx,h0 - - add $0x10,m - dec %rcx - jnz .Ldoblock - - add $0x10,%rsp - pop %r12 - pop %rbx - ret -ENDPROC(poly1305_block_sse2) - - -#define u0 0x00(%r8) -#define u1 0x04(%r8) -#define u2 0x08(%r8) -#define u3 0x0c(%r8) -#define u4 0x10(%r8) -#define hc0 %xmm0 -#define hc1 %xmm1 -#define hc2 %xmm2 -#define hc3 %xmm5 -#define hc4 %xmm6 -#define ru0 %xmm7 -#define ru1 %xmm8 -#define ru2 %xmm9 -#define ru3 %xmm10 -#define ru4 %xmm11 -#define sv1 %xmm12 -#define sv2 %xmm13 -#define sv3 %xmm14 -#define sv4 %xmm15 -#undef d0 -#define d0 %r13 - -ENTRY(poly1305_2block_sse2) - # %rdi: Accumulator h[5] - # %rsi: 16 byte input block m - # %rdx: Poly1305 key r[5] - # %rcx: Doubleblock count - # %r8: Poly1305 derived key r^2 u[5] - - # This two-block variant further improves performance by using loop - # unrolled block processing. This is more straight forward and does - # less byte shuffling, but requires a second Poly1305 key r^2: - # h = (h + m) * r => h = (h + m1) * r^2 + m2 * r - - push %rbx - push %r12 - push %r13 - - # combine r0,u0 - movd u0,ru0 - movd r0,t1 - punpcklqdq t1,ru0 - - # combine r1,u1 and s1=r1*5,v1=u1*5 - movd u1,ru1 - movd r1,t1 - punpcklqdq t1,ru1 - movdqa ru1,sv1 - pslld $2,sv1 - paddd ru1,sv1 - - # combine r2,u2 and s2=r2*5,v2=u2*5 - movd u2,ru2 - movd r2,t1 - punpcklqdq t1,ru2 - movdqa ru2,sv2 - pslld $2,sv2 - paddd ru2,sv2 - - # combine r3,u3 and s3=r3*5,v3=u3*5 - movd u3,ru3 - movd r3,t1 - punpcklqdq t1,ru3 - movdqa ru3,sv3 - pslld $2,sv3 - paddd ru3,sv3 - - # combine r4,u4 and s4=r4*5,v4=u4*5 - movd u4,ru4 - movd r4,t1 - punpcklqdq t1,ru4 - movdqa ru4,sv4 - pslld $2,sv4 - paddd ru4,sv4 - -.Ldoblock2: - # hc0 = [ m[16-19] & 0x3ffffff, h0 + m[0-3] & 0x3ffffff ] - movd 0x00(m),hc0 - movd 0x10(m),t1 - punpcklqdq t1,hc0 - pand ANMASK(%rip),hc0 - movd h0,t1 - paddd t1,hc0 - # hc1 = [ (m[19-22] >> 2) & 0x3ffffff, h1 + (m[3-6] >> 2) & 0x3ffffff ] - movd 0x03(m),hc1 - movd 0x13(m),t1 - punpcklqdq t1,hc1 - psrld $2,hc1 - pand ANMASK(%rip),hc1 - movd h1,t1 - paddd t1,hc1 - # hc2 = [ (m[22-25] >> 4) & 0x3ffffff, h2 + (m[6-9] >> 4) & 0x3ffffff ] - movd 0x06(m),hc2 - movd 0x16(m),t1 - punpcklqdq t1,hc2 - psrld $4,hc2 - pand ANMASK(%rip),hc2 - movd h2,t1 - paddd t1,hc2 - # hc3 = [ (m[25-28] >> 6) & 0x3ffffff, h3 + (m[9-12] >> 6) & 0x3ffffff ] - movd 0x09(m),hc3 - movd 0x19(m),t1 - punpcklqdq t1,hc3 - psrld $6,hc3 - pand ANMASK(%rip),hc3 - movd h3,t1 - paddd t1,hc3 - # hc4 = [ (m[28-31] >> 8) | (1<<24), h4 + (m[12-15] >> 8) | (1<<24) ] - movd 0x0c(m),hc4 - movd 0x1c(m),t1 - punpcklqdq t1,hc4 - psrld $8,hc4 - por ORMASK(%rip),hc4 - movd h4,t1 - paddd t1,hc4 - - # t1 = [ hc0[1] * r0, hc0[0] * u0 ] - movdqa ru0,t1 - pmuludq hc0,t1 - # t1 += [ hc1[1] * s4, hc1[0] * v4 ] - movdqa sv4,t2 - pmuludq hc1,t2 - paddq t2,t1 - # t1 += [ hc2[1] * s3, hc2[0] * v3 ] - movdqa sv3,t2 - pmuludq hc2,t2 - paddq t2,t1 - # t1 += [ hc3[1] * s2, hc3[0] * v2 ] - movdqa sv2,t2 - pmuludq hc3,t2 - paddq t2,t1 - # t1 += [ hc4[1] * s1, hc4[0] * v1 ] - movdqa sv1,t2 - pmuludq hc4,t2 - paddq t2,t1 - # d0 = t1[0] + t1[1] - movdqa t1,t2 - psrldq $8,t2 - paddq t2,t1 - movq t1,d0 - - # t1 = [ hc0[1] * r1, hc0[0] * u1 ] - movdqa ru1,t1 - pmuludq hc0,t1 - # t1 += [ hc1[1] * r0, hc1[0] * u0 ] - movdqa ru0,t2 - pmuludq hc1,t2 - paddq t2,t1 - # t1 += [ hc2[1] * s4, hc2[0] * v4 ] - movdqa sv4,t2 - pmuludq hc2,t2 - paddq t2,t1 - # t1 += [ hc3[1] * s3, hc3[0] * v3 ] - movdqa sv3,t2 - pmuludq hc3,t2 - paddq t2,t1 - # t1 += [ hc4[1] * s2, hc4[0] * v2 ] - movdqa sv2,t2 - pmuludq hc4,t2 - paddq t2,t1 - # d1 = t1[0] + t1[1] - movdqa t1,t2 - psrldq $8,t2 - paddq t2,t1 - movq t1,d1 - - # t1 = [ hc0[1] * r2, hc0[0] * u2 ] - movdqa ru2,t1 - pmuludq hc0,t1 - # t1 += [ hc1[1] * r1, hc1[0] * u1 ] - movdqa ru1,t2 - pmuludq hc1,t2 - paddq t2,t1 - # t1 += [ hc2[1] * r0, hc2[0] * u0 ] - movdqa ru0,t2 - pmuludq hc2,t2 - paddq t2,t1 - # t1 += [ hc3[1] * s4, hc3[0] * v4 ] - movdqa sv4,t2 - pmuludq hc3,t2 - paddq t2,t1 - # t1 += [ hc4[1] * s3, hc4[0] * v3 ] - movdqa sv3,t2 - pmuludq hc4,t2 - paddq t2,t1 - # d2 = t1[0] + t1[1] - movdqa t1,t2 - psrldq $8,t2 - paddq t2,t1 - movq t1,d2 - - # t1 = [ hc0[1] * r3, hc0[0] * u3 ] - movdqa ru3,t1 - pmuludq hc0,t1 - # t1 += [ hc1[1] * r2, hc1[0] * u2 ] - movdqa ru2,t2 - pmuludq hc1,t2 - paddq t2,t1 - # t1 += [ hc2[1] * r1, hc2[0] * u1 ] - movdqa ru1,t2 - pmuludq hc2,t2 - paddq t2,t1 - # t1 += [ hc3[1] * r0, hc3[0] * u0 ] - movdqa ru0,t2 - pmuludq hc3,t2 - paddq t2,t1 - # t1 += [ hc4[1] * s4, hc4[0] * v4 ] - movdqa sv4,t2 - pmuludq hc4,t2 - paddq t2,t1 - # d3 = t1[0] + t1[1] - movdqa t1,t2 - psrldq $8,t2 - paddq t2,t1 - movq t1,d3 - - # t1 = [ hc0[1] * r4, hc0[0] * u4 ] - movdqa ru4,t1 - pmuludq hc0,t1 - # t1 += [ hc1[1] * r3, hc1[0] * u3 ] - movdqa ru3,t2 - pmuludq hc1,t2 - paddq t2,t1 - # t1 += [ hc2[1] * r2, hc2[0] * u2 ] - movdqa ru2,t2 - pmuludq hc2,t2 - paddq t2,t1 - # t1 += [ hc3[1] * r1, hc3[0] * u1 ] - movdqa ru1,t2 - pmuludq hc3,t2 - paddq t2,t1 - # t1 += [ hc4[1] * r0, hc4[0] * u0 ] - movdqa ru0,t2 - pmuludq hc4,t2 - paddq t2,t1 - # d4 = t1[0] + t1[1] - movdqa t1,t2 - psrldq $8,t2 - paddq t2,t1 - movq t1,d4 - - # Now do a partial reduction mod (2^130)-5, carrying h0 -> h1 -> h2 -> - # h3 -> h4 -> h0 -> h1 to get h0,h2,h3,h4 < 2^26 and h1 < 2^26 + a small - # amount. Careful: we must not assume the carry bits 'd0 >> 26', - # 'd1 >> 26', 'd2 >> 26', 'd3 >> 26', and '(d4 >> 26) * 5' fit in 32-bit - # integers. It's true in a single-block implementation, but not here. - - # d1 += d0 >> 26 - mov d0,%rax - shr $26,%rax - add %rax,d1 - # h0 = d0 & 0x3ffffff - mov d0,%rbx - and $0x3ffffff,%ebx - - # d2 += d1 >> 26 - mov d1,%rax - shr $26,%rax - add %rax,d2 - # h1 = d1 & 0x3ffffff - mov d1,%rax - and $0x3ffffff,%eax - mov %eax,h1 - - # d3 += d2 >> 26 - mov d2,%rax - shr $26,%rax - add %rax,d3 - # h2 = d2 & 0x3ffffff - mov d2,%rax - and $0x3ffffff,%eax - mov %eax,h2 - - # d4 += d3 >> 26 - mov d3,%rax - shr $26,%rax - add %rax,d4 - # h3 = d3 & 0x3ffffff - mov d3,%rax - and $0x3ffffff,%eax - mov %eax,h3 - - # h0 += (d4 >> 26) * 5 - mov d4,%rax - shr $26,%rax - lea (%rax,%rax,4),%rax - add %rax,%rbx - # h4 = d4 & 0x3ffffff - mov d4,%rax - and $0x3ffffff,%eax - mov %eax,h4 - - # h1 += h0 >> 26 - mov %rbx,%rax - shr $26,%rax - add %eax,h1 - # h0 = h0 & 0x3ffffff - andl $0x3ffffff,%ebx - mov %ebx,h0 - - add $0x20,m - dec %rcx - jnz .Ldoblock2 - - pop %r13 - pop %r12 - pop %rbx - ret -ENDPROC(poly1305_2block_sse2) diff --git a/arch/x86/crypto/poly1305-x86_64-cryptogams.pl b/arch/x86/crypto/poly1305-x86_64-cryptogams.pl new file mode 100644 index 000000000000..5b593990501d --- /dev/null +++ b/arch/x86/crypto/poly1305-x86_64-cryptogams.pl @@ -0,0 +1,4265 @@ +#!/usr/bin/env perl +# SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause +# +# Copyright (C) 2017-2018 Samuel Neves . All Rights Reserved. +# Copyright (C) 2017-2019 Jason A. Donenfeld . All Rights Reserved. +# Copyright (C) 2006-2017 CRYPTOGAMS by . All Rights Reserved. +# +# This code is taken from the OpenSSL project but the author, Andy Polyakov, +# has relicensed it under the licenses specified in the SPDX header above. +# The original headers, including the original license headers, are +# included below for completeness. +# +# ==================================================================== +# Written by Andy Polyakov for the OpenSSL +# project. The module is, however, dual licensed under OpenSSL and +# CRYPTOGAMS licenses depending on where you obtain it. For further +# details see http://www.openssl.org/~appro/cryptogams/. +# ==================================================================== +# +# This module implements Poly1305 hash for x86_64. +# +# March 2015 +# +# Initial release. +# +# December 2016 +# +# Add AVX512F+VL+BW code path. +# +# November 2017 +# +# Convert AVX512F+VL+BW code path to pure AVX512F, so that it can be +# executed even on Knights Landing. Trigger for modification was +# observation that AVX512 code paths can negatively affect overall +# Skylake-X system performance. Since we are likely to suppress +# AVX512F capability flag [at least on Skylake-X], conversion serves +# as kind of "investment protection". Note that next *lake processor, +# Cannonlake, has AVX512IFMA code path to execute... +# +# Numbers are cycles per processed byte with poly1305_blocks alone, +# measured with rdtsc at fixed clock frequency. +# +# IALU/gcc-4.8(*) AVX(**) AVX2 AVX-512 +# P4 4.46/+120% - +# Core 2 2.41/+90% - +# Westmere 1.88/+120% - +# Sandy Bridge 1.39/+140% 1.10 +# Haswell 1.14/+175% 1.11 0.65 +# Skylake[-X] 1.13/+120% 0.96 0.51 [0.35] +# Silvermont 2.83/+95% - +# Knights L 3.60/? 1.65 1.10 0.41(***) +# Goldmont 1.70/+180% - +# VIA Nano 1.82/+150% - +# Sledgehammer 1.38/+160% - +# Bulldozer 2.30/+130% 0.97 +# Ryzen 1.15/+200% 1.08 1.18 +# +# (*) improvement coefficients relative to clang are more modest and +# are ~50% on most processors, in both cases we are comparing to +# __int128 code; +# (**) SSE2 implementation was attempted, but among non-AVX processors +# it was faster than integer-only code only on older Intel P4 and +# Core processors, 50-30%, less newer processor is, but slower on +# contemporary ones, for example almost 2x slower on Atom, and as +# former are naturally disappearing, SSE2 is deemed unnecessary; +# (***) strangely enough performance seems to vary from core to core, +# listed result is best case; + +$flavour = shift; +$output = shift; +if ($flavour =~ /\./) { $output = $flavour; undef $flavour; } + +$win64=0; $win64=1 if ($flavour =~ /[nm]asm|mingw64/ || $output =~ /\.asm$/); +$kernel=0; $kernel=1 if (!$flavour && !$output); + +if (!$kernel) { + $0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1; + ( $xlate="${dir}x86_64-xlate.pl" and -f $xlate ) or + ( $xlate="${dir}../../perlasm/x86_64-xlate.pl" and -f $xlate) or + die "can't locate x86_64-xlate.pl"; + + open OUT,"| \"$^X\" \"$xlate\" $flavour \"$output\""; + *STDOUT=*OUT; + + if (`$ENV{CC} -Wa,-v -c -o /dev/null -x assembler /dev/null 2>&1` + =~ /GNU assembler version ([2-9]\.[0-9]+)/) { + $avx = ($1>=2.19) + ($1>=2.22) + ($1>=2.25); + } + + if (!$avx && $win64 && ($flavour =~ /nasm/ || $ENV{ASM} =~ /nasm/) && + `nasm -v 2>&1` =~ /NASM version ([2-9]\.[0-9]+)(?:\.([0-9]+))?/) { + $avx = ($1>=2.09) + ($1>=2.10) + ($1>=2.12); + $avx += 1 if ($1==2.11 && $2>=8); + } + + if (!$avx && $win64 && ($flavour =~ /masm/ || $ENV{ASM} =~ /ml64/) && + `ml64 2>&1` =~ /Version ([0-9]+)\./) { + $avx = ($1>=10) + ($1>=11); + } + + if (!$avx && `$ENV{CC} -v 2>&1` =~ /((?:^clang|LLVM) version|.*based on LLVM) ([3-9]\.[0-9]+)/) { + $avx = ($2>=3.0) + ($2>3.0); + } +} else { + $avx = 4; # The kernel uses ifdefs for this. +} + +sub declare_function() { + my ($name, $align, $nargs) = @_; + if($kernel) { + $code .= ".align $align\n"; + $code .= "ENTRY($name)\n"; + $code .= ".L$name:\n"; + } else { + $code .= ".globl $name\n"; + $code .= ".type $name,\@function,$nargs\n"; + $code .= ".align $align\n"; + $code .= "$name:\n"; + } +} + +sub end_function() { + my ($name) = @_; + if($kernel) { + $code .= "ENDPROC($name)\n"; + } else { + $code .= ".size $name,.-$name\n"; + } +} + +$code.=<<___ if $kernel; +#include +___ + +if ($avx) { +$code.=<<___ if $kernel; +.section .rodata +___ +$code.=<<___; +.align 64 +.Lconst: +.Lmask24: +.long 0x0ffffff,0,0x0ffffff,0,0x0ffffff,0,0x0ffffff,0 +.L129: +.long `1<<24`,0,`1<<24`,0,`1<<24`,0,`1<<24`,0 +.Lmask26: +.long 0x3ffffff,0,0x3ffffff,0,0x3ffffff,0,0x3ffffff,0 +.Lpermd_avx2: +.long 2,2,2,3,2,0,2,1 +.Lpermd_avx512: +.long 0,0,0,1, 0,2,0,3, 0,4,0,5, 0,6,0,7 + +.L2_44_inp_permd: +.long 0,1,1,2,2,3,7,7 +.L2_44_inp_shift: +.quad 0,12,24,64 +.L2_44_mask: +.quad 0xfffffffffff,0xfffffffffff,0x3ffffffffff,0xffffffffffffffff +.L2_44_shift_rgt: +.quad 44,44,42,64 +.L2_44_shift_lft: +.quad 8,8,10,64 + +.align 64 +.Lx_mask44: +.quad 0xfffffffffff,0xfffffffffff,0xfffffffffff,0xfffffffffff +.quad 0xfffffffffff,0xfffffffffff,0xfffffffffff,0xfffffffffff +.Lx_mask42: +.quad 0x3ffffffffff,0x3ffffffffff,0x3ffffffffff,0x3ffffffffff +.quad 0x3ffffffffff,0x3ffffffffff,0x3ffffffffff,0x3ffffffffff +___ +} +$code.=<<___ if (!$kernel); +.asciz "Poly1305 for x86_64, CRYPTOGAMS by " +.align 16 +___ + +my ($ctx,$inp,$len,$padbit)=("%rdi","%rsi","%rdx","%rcx"); +my ($mac,$nonce)=($inp,$len); # *_emit arguments +my ($d1,$d2,$d3, $r0,$r1,$s1)=("%r8","%r9","%rdi","%r11","%r12","%r13"); +my ($h0,$h1,$h2)=("%r14","%rbx","%r10"); + +sub poly1305_iteration { +# input: copy of $r1 in %rax, $h0-$h2, $r0-$r1 +# output: $h0-$h2 *= $r0-$r1 +$code.=<<___; + mulq $h0 # h0*r1 + mov %rax,$d2 + mov $r0,%rax + mov %rdx,$d3 + + mulq $h0 # h0*r0 + mov %rax,$h0 # future $h0 + mov $r0,%rax + mov %rdx,$d1 + + mulq $h1 # h1*r0 + add %rax,$d2 + mov $s1,%rax + adc %rdx,$d3 + + mulq $h1 # h1*s1 + mov $h2,$h1 # borrow $h1 + add %rax,$h0 + adc %rdx,$d1 + + imulq $s1,$h1 # h2*s1 + add $h1,$d2 + mov $d1,$h1 + adc \$0,$d3 + + imulq $r0,$h2 # h2*r0 + add $d2,$h1 + mov \$-4,%rax # mask value + adc $h2,$d3 + + and $d3,%rax # last reduction step + mov $d3,$h2 + shr \$2,$d3 + and \$3,$h2 + add $d3,%rax + add %rax,$h0 + adc \$0,$h1 + adc \$0,$h2 +___ +} + +######################################################################## +# Layout of opaque area is following. +# +# unsigned __int64 h[3]; # current hash value base 2^64 +# unsigned __int64 r[2]; # key value base 2^64 + +$code.=<<___; +.text +___ +$code.=<<___ if (!$kernel); +.extern OPENSSL_ia32cap_P + +.globl poly1305_init_x86_64 +.hidden poly1305_init_x86_64 +.globl poly1305_blocks_x86_64 +.hidden poly1305_blocks_x86_64 +.globl poly1305_emit_x86_64 +.hidden poly1305_emit_x86_64 +___ +&declare_function("poly1305_init_x86_64", 32, 3); +$code.=<<___; + xor %eax,%eax + mov %rax,0($ctx) # initialize hash value + mov %rax,8($ctx) + mov %rax,16($ctx) + + cmp \$0,$inp + je .Lno_key +___ +$code.=<<___ if (!$kernel); + lea poly1305_blocks_x86_64(%rip),%r10 + lea poly1305_emit_x86_64(%rip),%r11 +___ +$code.=<<___ if (!$kernel && $avx); + mov OPENSSL_ia32cap_P+4(%rip),%r9 + lea poly1305_blocks_avx(%rip),%rax + lea poly1305_emit_avx(%rip),%rcx + bt \$`60-32`,%r9 # AVX? + cmovc %rax,%r10 + cmovc %rcx,%r11 +___ +$code.=<<___ if (!$kernel && $avx>1); + lea poly1305_blocks_avx2(%rip),%rax + bt \$`5+32`,%r9 # AVX2? + cmovc %rax,%r10 +___ +$code.=<<___ if (!$kernel && $avx>3); + mov \$`(1<<31|1<<21|1<<16)`,%rax + shr \$32,%r9 + and %rax,%r9 + cmp %rax,%r9 + je .Linit_base2_44 +___ +$code.=<<___; + mov \$0x0ffffffc0fffffff,%rax + mov \$0x0ffffffc0ffffffc,%rcx + and 0($inp),%rax + and 8($inp),%rcx + mov %rax,24($ctx) + mov %rcx,32($ctx) +___ +$code.=<<___ if (!$kernel && $flavour !~ /elf32/); + mov %r10,0(%rdx) + mov %r11,8(%rdx) +___ +$code.=<<___ if (!$kernel && $flavour =~ /elf32/); + mov %r10d,0(%rdx) + mov %r11d,4(%rdx) +___ +$code.=<<___; + mov \$1,%eax +.Lno_key: + ret +___ +&end_function("poly1305_init_x86_64"); + +&declare_function("poly1305_blocks_x86_64", 32, 4); +$code.=<<___; +.cfi_startproc +.Lblocks: + shr \$4,$len + jz .Lno_data # too short + + push %rbx +.cfi_push %rbx + push %r12 +.cfi_push %r12 + push %r13 +.cfi_push %r13 + push %r14 +.cfi_push %r14 + push %r15 +.cfi_push %r15 + push $ctx +.cfi_push $ctx +.Lblocks_body: + + mov $len,%r15 # reassign $len + + mov 24($ctx),$r0 # load r + mov 32($ctx),$s1 + + mov 0($ctx),$h0 # load hash value + mov 8($ctx),$h1 + mov 16($ctx),$h2 + + mov $s1,$r1 + shr \$2,$s1 + mov $r1,%rax + add $r1,$s1 # s1 = r1 + (r1 >> 2) + jmp .Loop + +.align 32 +.Loop: + add 0($inp),$h0 # accumulate input + adc 8($inp),$h1 + lea 16($inp),$inp + adc $padbit,$h2 +___ + + &poly1305_iteration(); + +$code.=<<___; + mov $r1,%rax + dec %r15 # len-=16 + jnz .Loop + + mov 0(%rsp),$ctx +.cfi_restore $ctx + + mov $h0,0($ctx) # store hash value + mov $h1,8($ctx) + mov $h2,16($ctx) + + mov 8(%rsp),%r15 +.cfi_restore %r15 + mov 16(%rsp),%r14 +.cfi_restore %r14 + mov 24(%rsp),%r13 +.cfi_restore %r13 + mov 32(%rsp),%r12 +.cfi_restore %r12 + mov 40(%rsp),%rbx +.cfi_restore %rbx + lea 48(%rsp),%rsp +.cfi_adjust_cfa_offset -48 +.Lno_data: +.Lblocks_epilogue: + ret +.cfi_endproc +___ +&end_function("poly1305_blocks_x86_64"); + +&declare_function("poly1305_emit_x86_64", 32, 3); +$code.=<<___; +.Lemit: + mov 0($ctx),%r8 # load hash value + mov 8($ctx),%r9 + mov 16($ctx),%r10 + + mov %r8,%rax + add \$5,%r8 # compare to modulus + mov %r9,%rcx + adc \$0,%r9 + adc \$0,%r10 + shr \$2,%r10 # did 130-bit value overflow? + cmovnz %r8,%rax + cmovnz %r9,%rcx + + add 0($nonce),%rax # accumulate nonce + adc 8($nonce),%rcx + mov %rax,0($mac) # write result + mov %rcx,8($mac) + + ret +___ +&end_function("poly1305_emit_x86_64"); +if ($avx) { + +if($kernel) { + $code .= "#ifdef CONFIG_AS_AVX\n"; +} + +######################################################################## +# Layout of opaque area is following. +# +# unsigned __int32 h[5]; # current hash value base 2^26 +# unsigned __int32 is_base2_26; +# unsigned __int64 r[2]; # key value base 2^64 +# unsigned __int64 pad; +# struct { unsigned __int32 r^2, r^1, r^4, r^3; } r[9]; +# +# where r^n are base 2^26 digits of degrees of multiplier key. There are +# 5 digits, but last four are interleaved with multiples of 5, totalling +# in 9 elements: r0, r1, 5*r1, r2, 5*r2, r3, 5*r3, r4, 5*r4. + +my ($H0,$H1,$H2,$H3,$H4, $T0,$T1,$T2,$T3,$T4, $D0,$D1,$D2,$D3,$D4, $MASK) = + map("%xmm$_",(0..15)); + +$code.=<<___; +.type __poly1305_block,\@abi-omnipotent +.align 32 +__poly1305_block: + push $ctx +___ + &poly1305_iteration(); +$code.=<<___; + pop $ctx + ret +.size __poly1305_block,.-__poly1305_block + +.type __poly1305_init_avx,\@abi-omnipotent +.align 32 +__poly1305_init_avx: + push %rbp + mov %rsp,%rbp + mov $r0,$h0 + mov $r1,$h1 + xor $h2,$h2 + + lea 48+64($ctx),$ctx # size optimization + + mov $r1,%rax + call __poly1305_block # r^2 + + mov \$0x3ffffff,%eax # save interleaved r^2 and r base 2^26 + mov \$0x3ffffff,%edx + mov $h0,$d1 + and $h0#d,%eax + mov $r0,$d2 + and $r0#d,%edx + mov %eax,`16*0+0-64`($ctx) + shr \$26,$d1 + mov %edx,`16*0+4-64`($ctx) + shr \$26,$d2 + + mov \$0x3ffffff,%eax + mov \$0x3ffffff,%edx + and $d1#d,%eax + and $d2#d,%edx + mov %eax,`16*1+0-64`($ctx) + lea (%rax,%rax,4),%eax # *5 + mov %edx,`16*1+4-64`($ctx) + lea (%rdx,%rdx,4),%edx # *5 + mov %eax,`16*2+0-64`($ctx) + shr \$26,$d1 + mov %edx,`16*2+4-64`($ctx) + shr \$26,$d2 + + mov $h1,%rax + mov $r1,%rdx + shl \$12,%rax + shl \$12,%rdx + or $d1,%rax + or $d2,%rdx + and \$0x3ffffff,%eax + and \$0x3ffffff,%edx + mov %eax,`16*3+0-64`($ctx) + lea (%rax,%rax,4),%eax # *5 + mov %edx,`16*3+4-64`($ctx) + lea (%rdx,%rdx,4),%edx # *5 + mov %eax,`16*4+0-64`($ctx) + mov $h1,$d1 + mov %edx,`16*4+4-64`($ctx) + mov $r1,$d2 + + mov \$0x3ffffff,%eax + mov \$0x3ffffff,%edx + shr \$14,$d1 + shr \$14,$d2 + and $d1#d,%eax + and $d2#d,%edx + mov %eax,`16*5+0-64`($ctx) + lea (%rax,%rax,4),%eax # *5 + mov %edx,`16*5+4-64`($ctx) + lea (%rdx,%rdx,4),%edx # *5 + mov %eax,`16*6+0-64`($ctx) + shr \$26,$d1 + mov %edx,`16*6+4-64`($ctx) + shr \$26,$d2 + + mov $h2,%rax + shl \$24,%rax + or %rax,$d1 + mov $d1#d,`16*7+0-64`($ctx) + lea ($d1,$d1,4),$d1 # *5 + mov $d2#d,`16*7+4-64`($ctx) + lea ($d2,$d2,4),$d2 # *5 + mov $d1#d,`16*8+0-64`($ctx) + mov $d2#d,`16*8+4-64`($ctx) + + mov $r1,%rax + call __poly1305_block # r^3 + + mov \$0x3ffffff,%eax # save r^3 base 2^26 + mov $h0,$d1 + and $h0#d,%eax + shr \$26,$d1 + mov %eax,`16*0+12-64`($ctx) + + mov \$0x3ffffff,%edx + and $d1#d,%edx + mov %edx,`16*1+12-64`($ctx) + lea (%rdx,%rdx,4),%edx # *5 + shr \$26,$d1 + mov %edx,`16*2+12-64`($ctx) + + mov $h1,%rax + shl \$12,%rax + or $d1,%rax + and \$0x3ffffff,%eax + mov %eax,`16*3+12-64`($ctx) + lea (%rax,%rax,4),%eax # *5 + mov $h1,$d1 + mov %eax,`16*4+12-64`($ctx) + + mov \$0x3ffffff,%edx + shr \$14,$d1 + and $d1#d,%edx + mov %edx,`16*5+12-64`($ctx) + lea (%rdx,%rdx,4),%edx # *5 + shr \$26,$d1 + mov %edx,`16*6+12-64`($ctx) + + mov $h2,%rax + shl \$24,%rax + or %rax,$d1 + mov $d1#d,`16*7+12-64`($ctx) + lea ($d1,$d1,4),$d1 # *5 + mov $d1#d,`16*8+12-64`($ctx) + + mov $r1,%rax + call __poly1305_block # r^4 + + mov \$0x3ffffff,%eax # save r^4 base 2^26 + mov $h0,$d1 + and $h0#d,%eax + shr \$26,$d1 + mov %eax,`16*0+8-64`($ctx) + + mov \$0x3ffffff,%edx + and $d1#d,%edx + mov %edx,`16*1+8-64`($ctx) + lea (%rdx,%rdx,4),%edx # *5 + shr \$26,$d1 + mov %edx,`16*2+8-64`($ctx) + + mov $h1,%rax + shl \$12,%rax + or $d1,%rax + and \$0x3ffffff,%eax + mov %eax,`16*3+8-64`($ctx) + lea (%rax,%rax,4),%eax # *5 + mov $h1,$d1 + mov %eax,`16*4+8-64`($ctx) + + mov \$0x3ffffff,%edx + shr \$14,$d1 + and $d1#d,%edx + mov %edx,`16*5+8-64`($ctx) + lea (%rdx,%rdx,4),%edx # *5 + shr \$26,$d1 + mov %edx,`16*6+8-64`($ctx) + + mov $h2,%rax + shl \$24,%rax + or %rax,$d1 + mov $d1#d,`16*7+8-64`($ctx) + lea ($d1,$d1,4),$d1 # *5 + mov $d1#d,`16*8+8-64`($ctx) + + lea -48-64($ctx),$ctx # size [de-]optimization + pop %rbp + ret +.size __poly1305_init_avx,.-__poly1305_init_avx +___ + +&declare_function("poly1305_blocks_avx", 32, 4); +$code.=<<___; +.cfi_startproc + mov 20($ctx),%r8d # is_base2_26 + cmp \$128,$len + jae .Lblocks_avx + test %r8d,%r8d + jz .Lblocks + +.Lblocks_avx: + and \$-16,$len + jz .Lno_data_avx + + vzeroupper + + test %r8d,%r8d + jz .Lbase2_64_avx + + test \$31,$len + jz .Leven_avx + + push %rbp +.cfi_push %rbp + mov %rsp,%rbp + push %rbx +.cfi_push %rbx + push %r12 +.cfi_push %r12 + push %r13 +.cfi_push %r13 + push %r14 +.cfi_push %r14 + push %r15 +.cfi_push %r15 +.Lblocks_avx_body: + + mov $len,%r15 # reassign $len + + mov 0($ctx),$d1 # load hash value + mov 8($ctx),$d2 + mov 16($ctx),$h2#d + + mov 24($ctx),$r0 # load r + mov 32($ctx),$s1 + + ################################# base 2^26 -> base 2^64 + mov $d1#d,$h0#d + and \$`-1*(1<<31)`,$d1 + mov $d2,$r1 # borrow $r1 + mov $d2#d,$h1#d + and \$`-1*(1<<31)`,$d2 + + shr \$6,$d1 + shl \$52,$r1 + add $d1,$h0 + shr \$12,$h1 + shr \$18,$d2 + add $r1,$h0 + adc $d2,$h1 + + mov $h2,$d1 + shl \$40,$d1 + shr \$24,$h2 + add $d1,$h1 + adc \$0,$h2 # can be partially reduced... + + mov \$-4,$d2 # ... so reduce + mov $h2,$d1 + and $h2,$d2 + shr \$2,$d1 + and \$3,$h2 + add $d2,$d1 # =*5 + add $d1,$h0 + adc \$0,$h1 + adc \$0,$h2 + + mov $s1,$r1 + mov $s1,%rax + shr \$2,$s1 + add $r1,$s1 # s1 = r1 + (r1 >> 2) + + add 0($inp),$h0 # accumulate input + adc 8($inp),$h1 + lea 16($inp),$inp + adc $padbit,$h2 + + call __poly1305_block + + test $padbit,$padbit # if $padbit is zero, + jz .Lstore_base2_64_avx # store hash in base 2^64 format + + ################################# base 2^64 -> base 2^26 + mov $h0,%rax + mov $h0,%rdx + shr \$52,$h0 + mov $h1,$r0 + mov $h1,$r1 + shr \$26,%rdx + and \$0x3ffffff,%rax # h[0] + shl \$12,$r0 + and \$0x3ffffff,%rdx # h[1] + shr \$14,$h1 + or $r0,$h0 + shl \$24,$h2 + and \$0x3ffffff,$h0 # h[2] + shr \$40,$r1 + and \$0x3ffffff,$h1 # h[3] + or $r1,$h2 # h[4] + + sub \$16,%r15 + jz .Lstore_base2_26_avx + + vmovd %rax#d,$H0 + vmovd %rdx#d,$H1 + vmovd $h0#d,$H2 + vmovd $h1#d,$H3 + vmovd $h2#d,$H4 + jmp .Lproceed_avx + +.align 32 +.Lstore_base2_64_avx: + mov $h0,0($ctx) + mov $h1,8($ctx) + mov $h2,16($ctx) # note that is_base2_26 is zeroed + jmp .Ldone_avx + +.align 16 +.Lstore_base2_26_avx: + mov %rax#d,0($ctx) # store hash value base 2^26 + mov %rdx#d,4($ctx) + mov $h0#d,8($ctx) + mov $h1#d,12($ctx) + mov $h2#d,16($ctx) +.align 16 +.Ldone_avx: + pop %r15 +.cfi_restore %r15 + pop %r14 +.cfi_restore %r14 + pop %r13 +.cfi_restore %r13 + pop %r12 +.cfi_restore %r12 + pop %rbx +.cfi_restore %rbx + pop %rbp +.cfi_restore %rbp +.Lno_data_avx: +.Lblocks_avx_epilogue: + ret +.cfi_endproc + +.align 32 +.Lbase2_64_avx: +.cfi_startproc + push %rbp +.cfi_push %rbp + mov %rsp,%rbp + push %rbx +.cfi_push %rbx + push %r12 +.cfi_push %r12 + push %r13 +.cfi_push %r13 + push %r14 +.cfi_push %r14 + push %r15 +.cfi_push %r15 +.Lbase2_64_avx_body: + + mov $len,%r15 # reassign $len + + mov 24($ctx),$r0 # load r + mov 32($ctx),$s1 + + mov 0($ctx),$h0 # load hash value + mov 8($ctx),$h1 + mov 16($ctx),$h2#d + + mov $s1,$r1 + mov $s1,%rax + shr \$2,$s1 + add $r1,$s1 # s1 = r1 + (r1 >> 2) + + test \$31,$len + jz .Linit_avx + + add 0($inp),$h0 # accumulate input + adc 8($inp),$h1 + lea 16($inp),$inp + adc $padbit,$h2 + sub \$16,%r15 + + call __poly1305_block + +.Linit_avx: + ################################# base 2^64 -> base 2^26 + mov $h0,%rax + mov $h0,%rdx + shr \$52,$h0 + mov $h1,$d1 + mov $h1,$d2 + shr \$26,%rdx + and \$0x3ffffff,%rax # h[0] + shl \$12,$d1 + and \$0x3ffffff,%rdx # h[1] + shr \$14,$h1 + or $d1,$h0 + shl \$24,$h2 + and \$0x3ffffff,$h0 # h[2] + shr \$40,$d2 + and \$0x3ffffff,$h1 # h[3] + or $d2,$h2 # h[4] + + vmovd %rax#d,$H0 + vmovd %rdx#d,$H1 + vmovd $h0#d,$H2 + vmovd $h1#d,$H3 + vmovd $h2#d,$H4 + movl \$1,20($ctx) # set is_base2_26 + + call __poly1305_init_avx + +.Lproceed_avx: + mov %r15,$len + pop %r15 +.cfi_restore %r15 + pop %r14 +.cfi_restore %r14 + pop %r13 +.cfi_restore %r13 + pop %r12 +.cfi_restore %r12 + pop %rbx +.cfi_restore %rbx + pop %rbp +.cfi_restore %rbp +.Lbase2_64_avx_epilogue: + jmp .Ldo_avx +.cfi_endproc + +.align 32 +.Leven_avx: +.cfi_startproc + vmovd 4*0($ctx),$H0 # load hash value + vmovd 4*1($ctx),$H1 + vmovd 4*2($ctx),$H2 + vmovd 4*3($ctx),$H3 + vmovd 4*4($ctx),$H4 + +.Ldo_avx: +___ +$code.=<<___ if (!$win64); + lea 8(%rsp),%r10 +.cfi_def_cfa_register %r10 + and \$-32,%rsp + sub \$-8,%rsp + lea -0x58(%rsp),%r11 + sub \$0x178,%rsp +___ +$code.=<<___ if ($win64); + lea -0xf8(%rsp),%r11 + sub \$0x218,%rsp + vmovdqa %xmm6,0x50(%r11) + vmovdqa %xmm7,0x60(%r11) + vmovdqa %xmm8,0x70(%r11) + vmovdqa %xmm9,0x80(%r11) + vmovdqa %xmm10,0x90(%r11) + vmovdqa %xmm11,0xa0(%r11) + vmovdqa %xmm12,0xb0(%r11) + vmovdqa %xmm13,0xc0(%r11) + vmovdqa %xmm14,0xd0(%r11) + vmovdqa %xmm15,0xe0(%r11) +.Ldo_avx_body: +___ +$code.=<<___; + sub \$64,$len + lea -32($inp),%rax + cmovc %rax,$inp + + vmovdqu `16*3`($ctx),$D4 # preload r0^2 + lea `16*3+64`($ctx),$ctx # size optimization + lea .Lconst(%rip),%rcx + + ################################################################ + # load input + vmovdqu 16*2($inp),$T0 + vmovdqu 16*3($inp),$T1 + vmovdqa 64(%rcx),$MASK # .Lmask26 + + vpsrldq \$6,$T0,$T2 # splat input + vpsrldq \$6,$T1,$T3 + vpunpckhqdq $T1,$T0,$T4 # 4 + vpunpcklqdq $T1,$T0,$T0 # 0:1 + vpunpcklqdq $T3,$T2,$T3 # 2:3 + + vpsrlq \$40,$T4,$T4 # 4 + vpsrlq \$26,$T0,$T1 + vpand $MASK,$T0,$T0 # 0 + vpsrlq \$4,$T3,$T2 + vpand $MASK,$T1,$T1 # 1 + vpsrlq \$30,$T3,$T3 + vpand $MASK,$T2,$T2 # 2 + vpand $MASK,$T3,$T3 # 3 + vpor 32(%rcx),$T4,$T4 # padbit, yes, always + + jbe .Lskip_loop_avx + + # expand and copy pre-calculated table to stack + vmovdqu `16*1-64`($ctx),$D1 + vmovdqu `16*2-64`($ctx),$D2 + vpshufd \$0xEE,$D4,$D3 # 34xx -> 3434 + vpshufd \$0x44,$D4,$D0 # xx12 -> 1212 + vmovdqa $D3,-0x90(%r11) + vmovdqa $D0,0x00(%rsp) + vpshufd \$0xEE,$D1,$D4 + vmovdqu `16*3-64`($ctx),$D0 + vpshufd \$0x44,$D1,$D1 + vmovdqa $D4,-0x80(%r11) + vmovdqa $D1,0x10(%rsp) + vpshufd \$0xEE,$D2,$D3 + vmovdqu `16*4-64`($ctx),$D1 + vpshufd \$0x44,$D2,$D2 + vmovdqa $D3,-0x70(%r11) + vmovdqa $D2,0x20(%rsp) + vpshufd \$0xEE,$D0,$D4 + vmovdqu `16*5-64`($ctx),$D2 + vpshufd \$0x44,$D0,$D0 + vmovdqa $D4,-0x60(%r11) + vmovdqa $D0,0x30(%rsp) + vpshufd \$0xEE,$D1,$D3 + vmovdqu `16*6-64`($ctx),$D0 + vpshufd \$0x44,$D1,$D1 + vmovdqa $D3,-0x50(%r11) + vmovdqa $D1,0x40(%rsp) + vpshufd \$0xEE,$D2,$D4 + vmovdqu `16*7-64`($ctx),$D1 + vpshufd \$0x44,$D2,$D2 + vmovdqa $D4,-0x40(%r11) + vmovdqa $D2,0x50(%rsp) + vpshufd \$0xEE,$D0,$D3 + vmovdqu `16*8-64`($ctx),$D2 + vpshufd \$0x44,$D0,$D0 + vmovdqa $D3,-0x30(%r11) + vmovdqa $D0,0x60(%rsp) + vpshufd \$0xEE,$D1,$D4 + vpshufd \$0x44,$D1,$D1 + vmovdqa $D4,-0x20(%r11) + vmovdqa $D1,0x70(%rsp) + vpshufd \$0xEE,$D2,$D3 + vmovdqa 0x00(%rsp),$D4 # preload r0^2 + vpshufd \$0x44,$D2,$D2 + vmovdqa $D3,-0x10(%r11) + vmovdqa $D2,0x80(%rsp) + + jmp .Loop_avx + +.align 32 +.Loop_avx: + ################################################################ + # ((inp[0]*r^4+inp[2]*r^2+inp[4])*r^4+inp[6]*r^2 + # ((inp[1]*r^4+inp[3]*r^2+inp[5])*r^3+inp[7]*r + # \___________________/ + # ((inp[0]*r^4+inp[2]*r^2+inp[4])*r^4+inp[6]*r^2+inp[8])*r^2 + # ((inp[1]*r^4+inp[3]*r^2+inp[5])*r^4+inp[7]*r^2+inp[9])*r + # \___________________/ \____________________/ + # + # Note that we start with inp[2:3]*r^2. This is because it + # doesn't depend on reduction in previous iteration. + ################################################################ + # d4 = h4*r0 + h3*r1 + h2*r2 + h1*r3 + h0*r4 + # d3 = h3*r0 + h2*r1 + h1*r2 + h0*r3 + h4*5*r4 + # d2 = h2*r0 + h1*r1 + h0*r2 + h4*5*r3 + h3*5*r4 + # d1 = h1*r0 + h0*r1 + h4*5*r2 + h3*5*r3 + h2*5*r4 + # d0 = h0*r0 + h4*5*r1 + h3*5*r2 + h2*5*r3 + h1*5*r4 + # + # though note that $Tx and $Hx are "reversed" in this section, + # and $D4 is preloaded with r0^2... + + vpmuludq $T0,$D4,$D0 # d0 = h0*r0 + vpmuludq $T1,$D4,$D1 # d1 = h1*r0 + vmovdqa $H2,0x20(%r11) # offload hash + vpmuludq $T2,$D4,$D2 # d3 = h2*r0 + vmovdqa 0x10(%rsp),$H2 # r1^2 + vpmuludq $T3,$D4,$D3 # d3 = h3*r0 + vpmuludq $T4,$D4,$D4 # d4 = h4*r0 + + vmovdqa $H0,0x00(%r11) # + vpmuludq 0x20(%rsp),$T4,$H0 # h4*s1 + vmovdqa $H1,0x10(%r11) # + vpmuludq $T3,$H2,$H1 # h3*r1 + vpaddq $H0,$D0,$D0 # d0 += h4*s1 + vpaddq $H1,$D4,$D4 # d4 += h3*r1 + vmovdqa $H3,0x30(%r11) # + vpmuludq $T2,$H2,$H0 # h2*r1 + vpmuludq $T1,$H2,$H1 # h1*r1 + vpaddq $H0,$D3,$D3 # d3 += h2*r1 + vmovdqa 0x30(%rsp),$H3 # r2^2 + vpaddq $H1,$D2,$D2 # d2 += h1*r1 + vmovdqa $H4,0x40(%r11) # + vpmuludq $T0,$H2,$H2 # h0*r1 + vpmuludq $T2,$H3,$H0 # h2*r2 + vpaddq $H2,$D1,$D1 # d1 += h0*r1 + + vmovdqa 0x40(%rsp),$H4 # s2^2 + vpaddq $H0,$D4,$D4 # d4 += h2*r2 + vpmuludq $T1,$H3,$H1 # h1*r2 + vpmuludq $T0,$H3,$H3 # h0*r2 + vpaddq $H1,$D3,$D3 # d3 += h1*r2 + vmovdqa 0x50(%rsp),$H2 # r3^2 + vpaddq $H3,$D2,$D2 # d2 += h0*r2 + vpmuludq $T4,$H4,$H0 # h4*s2 + vpmuludq $T3,$H4,$H4 # h3*s2 + vpaddq $H0,$D1,$D1 # d1 += h4*s2 + vmovdqa 0x60(%rsp),$H3 # s3^2 + vpaddq $H4,$D0,$D0 # d0 += h3*s2 + + vmovdqa 0x80(%rsp),$H4 # s4^2 + vpmuludq $T1,$H2,$H1 # h1*r3 + vpmuludq $T0,$H2,$H2 # h0*r3 + vpaddq $H1,$D4,$D4 # d4 += h1*r3 + vpaddq $H2,$D3,$D3 # d3 += h0*r3 + vpmuludq $T4,$H3,$H0 # h4*s3 + vpmuludq $T3,$H3,$H1 # h3*s3 + vpaddq $H0,$D2,$D2 # d2 += h4*s3 + vmovdqu 16*0($inp),$H0 # load input + vpaddq $H1,$D1,$D1 # d1 += h3*s3 + vpmuludq $T2,$H3,$H3 # h2*s3 + vpmuludq $T2,$H4,$T2 # h2*s4 + vpaddq $H3,$D0,$D0 # d0 += h2*s3 + + vmovdqu 16*1($inp),$H1 # + vpaddq $T2,$D1,$D1 # d1 += h2*s4 + vpmuludq $T3,$H4,$T3 # h3*s4 + vpmuludq $T4,$H4,$T4 # h4*s4 + vpsrldq \$6,$H0,$H2 # splat input + vpaddq $T3,$D2,$D2 # d2 += h3*s4 + vpaddq $T4,$D3,$D3 # d3 += h4*s4 + vpsrldq \$6,$H1,$H3 # + vpmuludq 0x70(%rsp),$T0,$T4 # h0*r4 + vpmuludq $T1,$H4,$T0 # h1*s4 + vpunpckhqdq $H1,$H0,$H4 # 4 + vpaddq $T4,$D4,$D4 # d4 += h0*r4 + vmovdqa -0x90(%r11),$T4 # r0^4 + vpaddq $T0,$D0,$D0 # d0 += h1*s4 + + vpunpcklqdq $H1,$H0,$H0 # 0:1 + vpunpcklqdq $H3,$H2,$H3 # 2:3 + + #vpsrlq \$40,$H4,$H4 # 4 + vpsrldq \$`40/8`,$H4,$H4 # 4 + vpsrlq \$26,$H0,$H1 + vpand $MASK,$H0,$H0 # 0 + vpsrlq \$4,$H3,$H2 + vpand $MASK,$H1,$H1 # 1 + vpand 0(%rcx),$H4,$H4 # .Lmask24 + vpsrlq \$30,$H3,$H3 + vpand $MASK,$H2,$H2 # 2 + vpand $MASK,$H3,$H3 # 3 + vpor 32(%rcx),$H4,$H4 # padbit, yes, always + + vpaddq 0x00(%r11),$H0,$H0 # add hash value + vpaddq 0x10(%r11),$H1,$H1 + vpaddq 0x20(%r11),$H2,$H2 + vpaddq 0x30(%r11),$H3,$H3 + vpaddq 0x40(%r11),$H4,$H4 + + lea 16*2($inp),%rax + lea 16*4($inp),$inp + sub \$64,$len + cmovc %rax,$inp + + ################################################################ + # Now we accumulate (inp[0:1]+hash)*r^4 + ################################################################ + # d4 = h4*r0 + h3*r1 + h2*r2 + h1*r3 + h0*r4 + # d3 = h3*r0 + h2*r1 + h1*r2 + h0*r3 + h4*5*r4 + # d2 = h2*r0 + h1*r1 + h0*r2 + h4*5*r3 + h3*5*r4 + # d1 = h1*r0 + h0*r1 + h4*5*r2 + h3*5*r3 + h2*5*r4 + # d0 = h0*r0 + h4*5*r1 + h3*5*r2 + h2*5*r3 + h1*5*r4 + + vpmuludq $H0,$T4,$T0 # h0*r0 + vpmuludq $H1,$T4,$T1 # h1*r0 + vpaddq $T0,$D0,$D0 + vpaddq $T1,$D1,$D1 + vmovdqa -0x80(%r11),$T2 # r1^4 + vpmuludq $H2,$T4,$T0 # h2*r0 + vpmuludq $H3,$T4,$T1 # h3*r0 + vpaddq $T0,$D2,$D2 + vpaddq $T1,$D3,$D3 + vpmuludq $H4,$T4,$T4 # h4*r0 + vpmuludq -0x70(%r11),$H4,$T0 # h4*s1 + vpaddq $T4,$D4,$D4 + + vpaddq $T0,$D0,$D0 # d0 += h4*s1 + vpmuludq $H2,$T2,$T1 # h2*r1 + vpmuludq $H3,$T2,$T0 # h3*r1 + vpaddq $T1,$D3,$D3 # d3 += h2*r1 + vmovdqa -0x60(%r11),$T3 # r2^4 + vpaddq $T0,$D4,$D4 # d4 += h3*r1 + vpmuludq $H1,$T2,$T1 # h1*r1 + vpmuludq $H0,$T2,$T2 # h0*r1 + vpaddq $T1,$D2,$D2 # d2 += h1*r1 + vpaddq $T2,$D1,$D1 # d1 += h0*r1 + + vmovdqa -0x50(%r11),$T4 # s2^4 + vpmuludq $H2,$T3,$T0 # h2*r2 + vpmuludq $H1,$T3,$T1 # h1*r2 + vpaddq $T0,$D4,$D4 # d4 += h2*r2 + vpaddq $T1,$D3,$D3 # d3 += h1*r2 + vmovdqa -0x40(%r11),$T2 # r3^4 + vpmuludq $H0,$T3,$T3 # h0*r2 + vpmuludq $H4,$T4,$T0 # h4*s2 + vpaddq $T3,$D2,$D2 # d2 += h0*r2 + vpaddq $T0,$D1,$D1 # d1 += h4*s2 + vmovdqa -0x30(%r11),$T3 # s3^4 + vpmuludq $H3,$T4,$T4 # h3*s2 + vpmuludq $H1,$T2,$T1 # h1*r3 + vpaddq $T4,$D0,$D0 # d0 += h3*s2 + + vmovdqa -0x10(%r11),$T4 # s4^4 + vpaddq $T1,$D4,$D4 # d4 += h1*r3 + vpmuludq $H0,$T2,$T2 # h0*r3 + vpmuludq $H4,$T3,$T0 # h4*s3 + vpaddq $T2,$D3,$D3 # d3 += h0*r3 + vpaddq $T0,$D2,$D2 # d2 += h4*s3 + vmovdqu 16*2($inp),$T0 # load input + vpmuludq $H3,$T3,$T2 # h3*s3 + vpmuludq $H2,$T3,$T3 # h2*s3 + vpaddq $T2,$D1,$D1 # d1 += h3*s3 + vmovdqu 16*3($inp),$T1 # + vpaddq $T3,$D0,$D0 # d0 += h2*s3 + + vpmuludq $H2,$T4,$H2 # h2*s4 + vpmuludq $H3,$T4,$H3 # h3*s4 + vpsrldq \$6,$T0,$T2 # splat input + vpaddq $H2,$D1,$D1 # d1 += h2*s4 + vpmuludq $H4,$T4,$H4 # h4*s4 + vpsrldq \$6,$T1,$T3 # + vpaddq $H3,$D2,$H2 # h2 = d2 + h3*s4 + vpaddq $H4,$D3,$H3 # h3 = d3 + h4*s4 + vpmuludq -0x20(%r11),$H0,$H4 # h0*r4 + vpmuludq $H1,$T4,$H0 + vpunpckhqdq $T1,$T0,$T4 # 4 + vpaddq $H4,$D4,$H4 # h4 = d4 + h0*r4 + vpaddq $H0,$D0,$H0 # h0 = d0 + h1*s4 + + vpunpcklqdq $T1,$T0,$T0 # 0:1 + vpunpcklqdq $T3,$T2,$T3 # 2:3 + + #vpsrlq \$40,$T4,$T4 # 4 + vpsrldq \$`40/8`,$T4,$T4 # 4 + vpsrlq \$26,$T0,$T1 + vmovdqa 0x00(%rsp),$D4 # preload r0^2 + vpand $MASK,$T0,$T0 # 0 + vpsrlq \$4,$T3,$T2 + vpand $MASK,$T1,$T1 # 1 + vpand 0(%rcx),$T4,$T4 # .Lmask24 + vpsrlq \$30,$T3,$T3 + vpand $MASK,$T2,$T2 # 2 + vpand $MASK,$T3,$T3 # 3 + vpor 32(%rcx),$T4,$T4 # padbit, yes, always + + ################################################################ + # lazy reduction as discussed in "NEON crypto" by D.J. Bernstein + # and P. Schwabe + + vpsrlq \$26,$H3,$D3 + vpand $MASK,$H3,$H3 + vpaddq $D3,$H4,$H4 # h3 -> h4 + + vpsrlq \$26,$H0,$D0 + vpand $MASK,$H0,$H0 + vpaddq $D0,$D1,$H1 # h0 -> h1 + + vpsrlq \$26,$H4,$D0 + vpand $MASK,$H4,$H4 + + vpsrlq \$26,$H1,$D1 + vpand $MASK,$H1,$H1 + vpaddq $D1,$H2,$H2 # h1 -> h2 + + vpaddq $D0,$H0,$H0 + vpsllq \$2,$D0,$D0 + vpaddq $D0,$H0,$H0 # h4 -> h0 + + vpsrlq \$26,$H2,$D2 + vpand $MASK,$H2,$H2 + vpaddq $D2,$H3,$H3 # h2 -> h3 + + vpsrlq \$26,$H0,$D0 + vpand $MASK,$H0,$H0 + vpaddq $D0,$H1,$H1 # h0 -> h1 + + vpsrlq \$26,$H3,$D3 + vpand $MASK,$H3,$H3 + vpaddq $D3,$H4,$H4 # h3 -> h4 + + ja .Loop_avx + +.Lskip_loop_avx: + ################################################################ + # multiply (inp[0:1]+hash) or inp[2:3] by r^2:r^1 + + vpshufd \$0x10,$D4,$D4 # r0^n, xx12 -> x1x2 + add \$32,$len + jnz .Long_tail_avx + + vpaddq $H2,$T2,$T2 + vpaddq $H0,$T0,$T0 + vpaddq $H1,$T1,$T1 + vpaddq $H3,$T3,$T3 + vpaddq $H4,$T4,$T4 + +.Long_tail_avx: + vmovdqa $H2,0x20(%r11) + vmovdqa $H0,0x00(%r11) + vmovdqa $H1,0x10(%r11) + vmovdqa $H3,0x30(%r11) + vmovdqa $H4,0x40(%r11) + + # d4 = h4*r0 + h3*r1 + h2*r2 + h1*r3 + h0*r4 + # d3 = h3*r0 + h2*r1 + h1*r2 + h0*r3 + h4*5*r4 + # d2 = h2*r0 + h1*r1 + h0*r2 + h4*5*r3 + h3*5*r4 + # d1 = h1*r0 + h0*r1 + h4*5*r2 + h3*5*r3 + h2*5*r4 + # d0 = h0*r0 + h4*5*r1 + h3*5*r2 + h2*5*r3 + h1*5*r4 + + vpmuludq $T2,$D4,$D2 # d2 = h2*r0 + vpmuludq $T0,$D4,$D0 # d0 = h0*r0 + vpshufd \$0x10,`16*1-64`($ctx),$H2 # r1^n + vpmuludq $T1,$D4,$D1 # d1 = h1*r0 + vpmuludq $T3,$D4,$D3 # d3 = h3*r0 + vpmuludq $T4,$D4,$D4 # d4 = h4*r0 + + vpmuludq $T3,$H2,$H0 # h3*r1 + vpaddq $H0,$D4,$D4 # d4 += h3*r1 + vpshufd \$0x10,`16*2-64`($ctx),$H3 # s1^n + vpmuludq $T2,$H2,$H1 # h2*r1 + vpaddq $H1,$D3,$D3 # d3 += h2*r1 + vpshufd \$0x10,`16*3-64`($ctx),$H4 # r2^n + vpmuludq $T1,$H2,$H0 # h1*r1 + vpaddq $H0,$D2,$D2 # d2 += h1*r1 + vpmuludq $T0,$H2,$H2 # h0*r1 + vpaddq $H2,$D1,$D1 # d1 += h0*r1 + vpmuludq $T4,$H3,$H3 # h4*s1 + vpaddq $H3,$D0,$D0 # d0 += h4*s1 + + vpshufd \$0x10,`16*4-64`($ctx),$H2 # s2^n + vpmuludq $T2,$H4,$H1 # h2*r2 + vpaddq $H1,$D4,$D4 # d4 += h2*r2 + vpmuludq $T1,$H4,$H0 # h1*r2 + vpaddq $H0,$D3,$D3 # d3 += h1*r2 + vpshufd \$0x10,`16*5-64`($ctx),$H3 # r3^n + vpmuludq $T0,$H4,$H4 # h0*r2 + vpaddq $H4,$D2,$D2 # d2 += h0*r2 + vpmuludq $T4,$H2,$H1 # h4*s2 + vpaddq $H1,$D1,$D1 # d1 += h4*s2 + vpshufd \$0x10,`16*6-64`($ctx),$H4 # s3^n + vpmuludq $T3,$H2,$H2 # h3*s2 + vpaddq $H2,$D0,$D0 # d0 += h3*s2 + + vpmuludq $T1,$H3,$H0 # h1*r3 + vpaddq $H0,$D4,$D4 # d4 += h1*r3 + vpmuludq $T0,$H3,$H3 # h0*r3 + vpaddq $H3,$D3,$D3 # d3 += h0*r3 + vpshufd \$0x10,`16*7-64`($ctx),$H2 # r4^n + vpmuludq $T4,$H4,$H1 # h4*s3 + vpaddq $H1,$D2,$D2 # d2 += h4*s3 + vpshufd \$0x10,`16*8-64`($ctx),$H3 # s4^n + vpmuludq $T3,$H4,$H0 # h3*s3 + vpaddq $H0,$D1,$D1 # d1 += h3*s3 + vpmuludq $T2,$H4,$H4 # h2*s3 + vpaddq $H4,$D0,$D0 # d0 += h2*s3 + + vpmuludq $T0,$H2,$H2 # h0*r4 + vpaddq $H2,$D4,$D4 # h4 = d4 + h0*r4 + vpmuludq $T4,$H3,$H1 # h4*s4 + vpaddq $H1,$D3,$D3 # h3 = d3 + h4*s4 + vpmuludq $T3,$H3,$H0 # h3*s4 + vpaddq $H0,$D2,$D2 # h2 = d2 + h3*s4 + vpmuludq $T2,$H3,$H1 # h2*s4 + vpaddq $H1,$D1,$D1 # h1 = d1 + h2*s4 + vpmuludq $T1,$H3,$H3 # h1*s4 + vpaddq $H3,$D0,$D0 # h0 = d0 + h1*s4 + + jz .Lshort_tail_avx + + vmovdqu 16*0($inp),$H0 # load input + vmovdqu 16*1($inp),$H1 + + vpsrldq \$6,$H0,$H2 # splat input + vpsrldq \$6,$H1,$H3 + vpunpckhqdq $H1,$H0,$H4 # 4 + vpunpcklqdq $H1,$H0,$H0 # 0:1 + vpunpcklqdq $H3,$H2,$H3 # 2:3 + + vpsrlq \$40,$H4,$H4 # 4 + vpsrlq \$26,$H0,$H1 + vpand $MASK,$H0,$H0 # 0 + vpsrlq \$4,$H3,$H2 + vpand $MASK,$H1,$H1 # 1 + vpsrlq \$30,$H3,$H3 + vpand $MASK,$H2,$H2 # 2 + vpand $MASK,$H3,$H3 # 3 + vpor 32(%rcx),$H4,$H4 # padbit, yes, always + + vpshufd \$0x32,`16*0-64`($ctx),$T4 # r0^n, 34xx -> x3x4 + vpaddq 0x00(%r11),$H0,$H0 + vpaddq 0x10(%r11),$H1,$H1 + vpaddq 0x20(%r11),$H2,$H2 + vpaddq 0x30(%r11),$H3,$H3 + vpaddq 0x40(%r11),$H4,$H4 + + ################################################################ + # multiply (inp[0:1]+hash) by r^4:r^3 and accumulate + + vpmuludq $H0,$T4,$T0 # h0*r0 + vpaddq $T0,$D0,$D0 # d0 += h0*r0 + vpmuludq $H1,$T4,$T1 # h1*r0 + vpaddq $T1,$D1,$D1 # d1 += h1*r0 + vpmuludq $H2,$T4,$T0 # h2*r0 + vpaddq $T0,$D2,$D2 # d2 += h2*r0 + vpshufd \$0x32,`16*1-64`($ctx),$T2 # r1^n + vpmuludq $H3,$T4,$T1 # h3*r0 + vpaddq $T1,$D3,$D3 # d3 += h3*r0 + vpmuludq $H4,$T4,$T4 # h4*r0 + vpaddq $T4,$D4,$D4 # d4 += h4*r0 + + vpmuludq $H3,$T2,$T0 # h3*r1 + vpaddq $T0,$D4,$D4 # d4 += h3*r1 + vpshufd \$0x32,`16*2-64`($ctx),$T3 # s1 + vpmuludq $H2,$T2,$T1 # h2*r1 + vpaddq $T1,$D3,$D3 # d3 += h2*r1 + vpshufd \$0x32,`16*3-64`($ctx),$T4 # r2 + vpmuludq $H1,$T2,$T0 # h1*r1 + vpaddq $T0,$D2,$D2 # d2 += h1*r1 + vpmuludq $H0,$T2,$T2 # h0*r1 + vpaddq $T2,$D1,$D1 # d1 += h0*r1 + vpmuludq $H4,$T3,$T3 # h4*s1 + vpaddq $T3,$D0,$D0 # d0 += h4*s1 + + vpshufd \$0x32,`16*4-64`($ctx),$T2 # s2 + vpmuludq $H2,$T4,$T1 # h2*r2 + vpaddq $T1,$D4,$D4 # d4 += h2*r2 + vpmuludq $H1,$T4,$T0 # h1*r2 + vpaddq $T0,$D3,$D3 # d3 += h1*r2 + vpshufd \$0x32,`16*5-64`($ctx),$T3 # r3 + vpmuludq $H0,$T4,$T4 # h0*r2 + vpaddq $T4,$D2,$D2 # d2 += h0*r2 + vpmuludq $H4,$T2,$T1 # h4*s2 + vpaddq $T1,$D1,$D1 # d1 += h4*s2 + vpshufd \$0x32,`16*6-64`($ctx),$T4 # s3 + vpmuludq $H3,$T2,$T2 # h3*s2 + vpaddq $T2,$D0,$D0 # d0 += h3*s2 + + vpmuludq $H1,$T3,$T0 # h1*r3 + vpaddq $T0,$D4,$D4 # d4 += h1*r3 + vpmuludq $H0,$T3,$T3 # h0*r3 + vpaddq $T3,$D3,$D3 # d3 += h0*r3 + vpshufd \$0x32,`16*7-64`($ctx),$T2 # r4 + vpmuludq $H4,$T4,$T1 # h4*s3 + vpaddq $T1,$D2,$D2 # d2 += h4*s3 + vpshufd \$0x32,`16*8-64`($ctx),$T3 # s4 + vpmuludq $H3,$T4,$T0 # h3*s3 + vpaddq $T0,$D1,$D1 # d1 += h3*s3 + vpmuludq $H2,$T4,$T4 # h2*s3 + vpaddq $T4,$D0,$D0 # d0 += h2*s3 + + vpmuludq $H0,$T2,$T2 # h0*r4 + vpaddq $T2,$D4,$D4 # d4 += h0*r4 + vpmuludq $H4,$T3,$T1 # h4*s4 + vpaddq $T1,$D3,$D3 # d3 += h4*s4 + vpmuludq $H3,$T3,$T0 # h3*s4 + vpaddq $T0,$D2,$D2 # d2 += h3*s4 + vpmuludq $H2,$T3,$T1 # h2*s4 + vpaddq $T1,$D1,$D1 # d1 += h2*s4 + vpmuludq $H1,$T3,$T3 # h1*s4 + vpaddq $T3,$D0,$D0 # d0 += h1*s4 + +.Lshort_tail_avx: + ################################################################ + # horizontal addition + + vpsrldq \$8,$D4,$T4 + vpsrldq \$8,$D3,$T3 + vpsrldq \$8,$D1,$T1 + vpsrldq \$8,$D0,$T0 + vpsrldq \$8,$D2,$T2 + vpaddq $T3,$D3,$D3 + vpaddq $T4,$D4,$D4 + vpaddq $T0,$D0,$D0 + vpaddq $T1,$D1,$D1 + vpaddq $T2,$D2,$D2 + + ################################################################ + # lazy reduction + + vpsrlq \$26,$D3,$H3 + vpand $MASK,$D3,$D3 + vpaddq $H3,$D4,$D4 # h3 -> h4 + + vpsrlq \$26,$D0,$H0 + vpand $MASK,$D0,$D0 + vpaddq $H0,$D1,$D1 # h0 -> h1 + + vpsrlq \$26,$D4,$H4 + vpand $MASK,$D4,$D4 + + vpsrlq \$26,$D1,$H1 + vpand $MASK,$D1,$D1 + vpaddq $H1,$D2,$D2 # h1 -> h2 + + vpaddq $H4,$D0,$D0 + vpsllq \$2,$H4,$H4 + vpaddq $H4,$D0,$D0 # h4 -> h0 + + vpsrlq \$26,$D2,$H2 + vpand $MASK,$D2,$D2 + vpaddq $H2,$D3,$D3 # h2 -> h3 + + vpsrlq \$26,$D0,$H0 + vpand $MASK,$D0,$D0 + vpaddq $H0,$D1,$D1 # h0 -> h1 + + vpsrlq \$26,$D3,$H3 + vpand $MASK,$D3,$D3 + vpaddq $H3,$D4,$D4 # h3 -> h4 + + vmovd $D0,`4*0-48-64`($ctx) # save partially reduced + vmovd $D1,`4*1-48-64`($ctx) + vmovd $D2,`4*2-48-64`($ctx) + vmovd $D3,`4*3-48-64`($ctx) + vmovd $D4,`4*4-48-64`($ctx) +___ +$code.=<<___ if ($win64); + vmovdqa 0x50(%r11),%xmm6 + vmovdqa 0x60(%r11),%xmm7 + vmovdqa 0x70(%r11),%xmm8 + vmovdqa 0x80(%r11),%xmm9 + vmovdqa 0x90(%r11),%xmm10 + vmovdqa 0xa0(%r11),%xmm11 + vmovdqa 0xb0(%r11),%xmm12 + vmovdqa 0xc0(%r11),%xmm13 + vmovdqa 0xd0(%r11),%xmm14 + vmovdqa 0xe0(%r11),%xmm15 + lea 0xf8(%r11),%rsp +.Ldo_avx_epilogue: +___ +$code.=<<___ if (!$win64); + lea -8(%r10),%rsp +.cfi_def_cfa_register %rsp +___ +$code.=<<___; + vzeroupper + ret +.cfi_endproc +___ +&end_function("poly1305_blocks_avx"); + +&declare_function("poly1305_emit_avx", 32, 3); +$code.=<<___; + cmpl \$0,20($ctx) # is_base2_26? + je .Lemit + + mov 0($ctx),%eax # load hash value base 2^26 + mov 4($ctx),%ecx + mov 8($ctx),%r8d + mov 12($ctx),%r11d + mov 16($ctx),%r10d + + shl \$26,%rcx # base 2^26 -> base 2^64 + mov %r8,%r9 + shl \$52,%r8 + add %rcx,%rax + shr \$12,%r9 + add %rax,%r8 # h0 + adc \$0,%r9 + + shl \$14,%r11 + mov %r10,%rax + shr \$24,%r10 + add %r11,%r9 + shl \$40,%rax + add %rax,%r9 # h1 + adc \$0,%r10 # h2 + + mov %r10,%rax # could be partially reduced, so reduce + mov %r10,%rcx + and \$3,%r10 + shr \$2,%rax + and \$-4,%rcx + add %rcx,%rax + add %rax,%r8 + adc \$0,%r9 + adc \$0,%r10 + + mov %r8,%rax + add \$5,%r8 # compare to modulus + mov %r9,%rcx + adc \$0,%r9 + adc \$0,%r10 + shr \$2,%r10 # did 130-bit value overflow? + cmovnz %r8,%rax + cmovnz %r9,%rcx + + add 0($nonce),%rax # accumulate nonce + adc 8($nonce),%rcx + mov %rax,0($mac) # write result + mov %rcx,8($mac) + + ret +___ +&end_function("poly1305_emit_avx"); + +if ($kernel) { + $code .= "#endif\n"; +} + +if ($avx>1) { + +if ($kernel) { + $code .= "#ifdef CONFIG_AS_AVX2\n"; +} + +my ($H0,$H1,$H2,$H3,$H4, $MASK, $T4,$T0,$T1,$T2,$T3, $D0,$D1,$D2,$D3,$D4) = + map("%ymm$_",(0..15)); +my $S4=$MASK; + +sub poly1305_blocks_avxN { + my ($avx512) = @_; + my $suffix = $avx512 ? "_avx512" : ""; +$code.=<<___; +.cfi_startproc + mov 20($ctx),%r8d # is_base2_26 + cmp \$128,$len + jae .Lblocks_avx2$suffix + test %r8d,%r8d + jz .Lblocks + +.Lblocks_avx2$suffix: + and \$-16,$len + jz .Lno_data_avx2$suffix + + vzeroupper + + test %r8d,%r8d + jz .Lbase2_64_avx2$suffix + + test \$63,$len + jz .Leven_avx2$suffix + + push %rbp +.cfi_push %rbp + mov %rsp,%rbp + push %rbx +.cfi_push %rbx + push %r12 +.cfi_push %r12 + push %r13 +.cfi_push %r13 + push %r14 +.cfi_push %r14 + push %r15 +.cfi_push %r15 +.Lblocks_avx2_body$suffix: + + mov $len,%r15 # reassign $len + + mov 0($ctx),$d1 # load hash value + mov 8($ctx),$d2 + mov 16($ctx),$h2#d + + mov 24($ctx),$r0 # load r + mov 32($ctx),$s1 + + ################################# base 2^26 -> base 2^64 + mov $d1#d,$h0#d + and \$`-1*(1<<31)`,$d1 + mov $d2,$r1 # borrow $r1 + mov $d2#d,$h1#d + and \$`-1*(1<<31)`,$d2 + + shr \$6,$d1 + shl \$52,$r1 + add $d1,$h0 + shr \$12,$h1 + shr \$18,$d2 + add $r1,$h0 + adc $d2,$h1 + + mov $h2,$d1 + shl \$40,$d1 + shr \$24,$h2 + add $d1,$h1 + adc \$0,$h2 # can be partially reduced... + + mov \$-4,$d2 # ... so reduce + mov $h2,$d1 + and $h2,$d2 + shr \$2,$d1 + and \$3,$h2 + add $d2,$d1 # =*5 + add $d1,$h0 + adc \$0,$h1 + adc \$0,$h2 + + mov $s1,$r1 + mov $s1,%rax + shr \$2,$s1 + add $r1,$s1 # s1 = r1 + (r1 >> 2) + +.Lbase2_26_pre_avx2$suffix: + add 0($inp),$h0 # accumulate input + adc 8($inp),$h1 + lea 16($inp),$inp + adc $padbit,$h2 + sub \$16,%r15 + + call __poly1305_block + mov $r1,%rax + + test \$63,%r15 + jnz .Lbase2_26_pre_avx2$suffix + + test $padbit,$padbit # if $padbit is zero, + jz .Lstore_base2_64_avx2$suffix # store hash in base 2^64 format + + ################################# base 2^64 -> base 2^26 + mov $h0,%rax + mov $h0,%rdx + shr \$52,$h0 + mov $h1,$r0 + mov $h1,$r1 + shr \$26,%rdx + and \$0x3ffffff,%rax # h[0] + shl \$12,$r0 + and \$0x3ffffff,%rdx # h[1] + shr \$14,$h1 + or $r0,$h0 + shl \$24,$h2 + and \$0x3ffffff,$h0 # h[2] + shr \$40,$r1 + and \$0x3ffffff,$h1 # h[3] + or $r1,$h2 # h[4] + + test %r15,%r15 + jz .Lstore_base2_26_avx2$suffix + + vmovd %rax#d,%x#$H0 + vmovd %rdx#d,%x#$H1 + vmovd $h0#d,%x#$H2 + vmovd $h1#d,%x#$H3 + vmovd $h2#d,%x#$H4 + jmp .Lproceed_avx2$suffix + +.align 32 +.Lstore_base2_64_avx2$suffix: + mov $h0,0($ctx) + mov $h1,8($ctx) + mov $h2,16($ctx) # note that is_base2_26 is zeroed + jmp .Ldone_avx2$suffix + +.align 16 +.Lstore_base2_26_avx2$suffix: + mov %rax#d,0($ctx) # store hash value base 2^26 + mov %rdx#d,4($ctx) + mov $h0#d,8($ctx) + mov $h1#d,12($ctx) + mov $h2#d,16($ctx) +.align 16 +.Ldone_avx2$suffix: + pop %r15 +.cfi_restore %r15 + pop %r14 +.cfi_restore %r14 + pop %r13 +.cfi_restore %r13 + pop %r12 +.cfi_restore %r12 + pop %rbx +.cfi_restore %rbx + pop %rbp +.cfi_restore %rbp +.Lno_data_avx2$suffix: +.Lblocks_avx2_epilogue$suffix: + ret +.cfi_endproc + +.align 32 +.Lbase2_64_avx2$suffix: +.cfi_startproc + push %rbp +.cfi_push %rbp + mov %rsp,%rbp + push %rbx +.cfi_push %rbx + push %r12 +.cfi_push %r12 + push %r13 +.cfi_push %r13 + push %r14 +.cfi_push %r14 + push %r15 +.cfi_push %r15 +.Lbase2_64_avx2_body$suffix: + + mov $len,%r15 # reassign $len + + mov 24($ctx),$r0 # load r + mov 32($ctx),$s1 + + mov 0($ctx),$h0 # load hash value + mov 8($ctx),$h1 + mov 16($ctx),$h2#d + + mov $s1,$r1 + mov $s1,%rax + shr \$2,$s1 + add $r1,$s1 # s1 = r1 + (r1 >> 2) + + test \$63,$len + jz .Linit_avx2$suffix + +.Lbase2_64_pre_avx2$suffix: + add 0($inp),$h0 # accumulate input + adc 8($inp),$h1 + lea 16($inp),$inp + adc $padbit,$h2 + sub \$16,%r15 + + call __poly1305_block + mov $r1,%rax + + test \$63,%r15 + jnz .Lbase2_64_pre_avx2$suffix + +.Linit_avx2$suffix: + ################################# base 2^64 -> base 2^26 + mov $h0,%rax + mov $h0,%rdx + shr \$52,$h0 + mov $h1,$d1 + mov $h1,$d2 + shr \$26,%rdx + and \$0x3ffffff,%rax # h[0] + shl \$12,$d1 + and \$0x3ffffff,%rdx # h[1] + shr \$14,$h1 + or $d1,$h0 + shl \$24,$h2 + and \$0x3ffffff,$h0 # h[2] + shr \$40,$d2 + and \$0x3ffffff,$h1 # h[3] + or $d2,$h2 # h[4] + + vmovd %rax#d,%x#$H0 + vmovd %rdx#d,%x#$H1 + vmovd $h0#d,%x#$H2 + vmovd $h1#d,%x#$H3 + vmovd $h2#d,%x#$H4 + movl \$1,20($ctx) # set is_base2_26 + + call __poly1305_init_avx + +.Lproceed_avx2$suffix: + mov %r15,$len # restore $len +___ +$code.=<<___ if (!$kernel); + mov OPENSSL_ia32cap_P+8(%rip),%r9d + mov \$`(1<<31|1<<30|1<<16)`,%r11d +___ +$code.=<<___; + pop %r15 +.cfi_restore %r15 + pop %r14 +.cfi_restore %r14 + pop %r13 +.cfi_restore %r13 + pop %r12 +.cfi_restore %r12 + pop %rbx +.cfi_restore %rbx + pop %rbp +.cfi_restore %rbp +.Lbase2_64_avx2_epilogue$suffix: + jmp .Ldo_avx2$suffix +.cfi_endproc + +.align 32 +.Leven_avx2$suffix: +.cfi_startproc +___ +$code.=<<___ if (!$kernel); + mov OPENSSL_ia32cap_P+8(%rip),%r9d +___ +$code.=<<___; + vmovd 4*0($ctx),%x#$H0 # load hash value base 2^26 + vmovd 4*1($ctx),%x#$H1 + vmovd 4*2($ctx),%x#$H2 + vmovd 4*3($ctx),%x#$H3 + vmovd 4*4($ctx),%x#$H4 + +.Ldo_avx2$suffix: +___ +$code.=<<___ if (!$kernel && $avx>2); + cmp \$512,$len + jb .Lskip_avx512 + and %r11d,%r9d + test \$`1<<16`,%r9d # check for AVX512F + jnz .Lblocks_avx512 +.Lskip_avx512$suffix: +___ +$code.=<<___ if ($avx > 2 && $avx512 && $kernel); + cmp \$512,$len + jae .Lblocks_avx512 +___ +$code.=<<___ if (!$win64); + lea 8(%rsp),%r10 +.cfi_def_cfa_register %r10 + sub \$0x128,%rsp +___ +$code.=<<___ if ($win64); + lea 8(%rsp),%r10 + sub \$0x1c8,%rsp + vmovdqa %xmm6,-0xb0(%r10) + vmovdqa %xmm7,-0xa0(%r10) + vmovdqa %xmm8,-0x90(%r10) + vmovdqa %xmm9,-0x80(%r10) + vmovdqa %xmm10,-0x70(%r10) + vmovdqa %xmm11,-0x60(%r10) + vmovdqa %xmm12,-0x50(%r10) + vmovdqa %xmm13,-0x40(%r10) + vmovdqa %xmm14,-0x30(%r10) + vmovdqa %xmm15,-0x20(%r10) +.Ldo_avx2_body$suffix: +___ +$code.=<<___; + lea .Lconst(%rip),%rcx + lea 48+64($ctx),$ctx # size optimization + vmovdqa 96(%rcx),$T0 # .Lpermd_avx2 + + # expand and copy pre-calculated table to stack + vmovdqu `16*0-64`($ctx),%x#$T2 + and \$-512,%rsp + vmovdqu `16*1-64`($ctx),%x#$T3 + vmovdqu `16*2-64`($ctx),%x#$T4 + vmovdqu `16*3-64`($ctx),%x#$D0 + vmovdqu `16*4-64`($ctx),%x#$D1 + vmovdqu `16*5-64`($ctx),%x#$D2 + lea 0x90(%rsp),%rax # size optimization + vmovdqu `16*6-64`($ctx),%x#$D3 + vpermd $T2,$T0,$T2 # 00003412 -> 14243444 + vmovdqu `16*7-64`($ctx),%x#$D4 + vpermd $T3,$T0,$T3 + vmovdqu `16*8-64`($ctx),%x#$MASK + vpermd $T4,$T0,$T4 + vmovdqa $T2,0x00(%rsp) + vpermd $D0,$T0,$D0 + vmovdqa $T3,0x20-0x90(%rax) + vpermd $D1,$T0,$D1 + vmovdqa $T4,0x40-0x90(%rax) + vpermd $D2,$T0,$D2 + vmovdqa $D0,0x60-0x90(%rax) + vpermd $D3,$T0,$D3 + vmovdqa $D1,0x80-0x90(%rax) + vpermd $D4,$T0,$D4 + vmovdqa $D2,0xa0-0x90(%rax) + vpermd $MASK,$T0,$MASK + vmovdqa $D3,0xc0-0x90(%rax) + vmovdqa $D4,0xe0-0x90(%rax) + vmovdqa $MASK,0x100-0x90(%rax) + vmovdqa 64(%rcx),$MASK # .Lmask26 + + ################################################################ + # load input + vmovdqu 16*0($inp),%x#$T0 + vmovdqu 16*1($inp),%x#$T1 + vinserti128 \$1,16*2($inp),$T0,$T0 + vinserti128 \$1,16*3($inp),$T1,$T1 + lea 16*4($inp),$inp + + vpsrldq \$6,$T0,$T2 # splat input + vpsrldq \$6,$T1,$T3 + vpunpckhqdq $T1,$T0,$T4 # 4 + vpunpcklqdq $T3,$T2,$T2 # 2:3 + vpunpcklqdq $T1,$T0,$T0 # 0:1 + + vpsrlq \$30,$T2,$T3 + vpsrlq \$4,$T2,$T2 + vpsrlq \$26,$T0,$T1 + vpsrlq \$40,$T4,$T4 # 4 + vpand $MASK,$T2,$T2 # 2 + vpand $MASK,$T0,$T0 # 0 + vpand $MASK,$T1,$T1 # 1 + vpand $MASK,$T3,$T3 # 3 + vpor 32(%rcx),$T4,$T4 # padbit, yes, always + + vpaddq $H2,$T2,$H2 # accumulate input + sub \$64,$len + jz .Ltail_avx2$suffix + jmp .Loop_avx2$suffix + +.align 32 +.Loop_avx2$suffix: + ################################################################ + # ((inp[0]*r^4+inp[4])*r^4+inp[ 8])*r^4 + # ((inp[1]*r^4+inp[5])*r^4+inp[ 9])*r^3 + # ((inp[2]*r^4+inp[6])*r^4+inp[10])*r^2 + # ((inp[3]*r^4+inp[7])*r^4+inp[11])*r^1 + # \________/\__________/ + ################################################################ + #vpaddq $H2,$T2,$H2 # accumulate input + vpaddq $H0,$T0,$H0 + vmovdqa `32*0`(%rsp),$T0 # r0^4 + vpaddq $H1,$T1,$H1 + vmovdqa `32*1`(%rsp),$T1 # r1^4 + vpaddq $H3,$T3,$H3 + vmovdqa `32*3`(%rsp),$T2 # r2^4 + vpaddq $H4,$T4,$H4 + vmovdqa `32*6-0x90`(%rax),$T3 # s3^4 + vmovdqa `32*8-0x90`(%rax),$S4 # s4^4 + + # d4 = h4*r0 + h3*r1 + h2*r2 + h1*r3 + h0*r4 + # d3 = h3*r0 + h2*r1 + h1*r2 + h0*r3 + h4*5*r4 + # d2 = h2*r0 + h1*r1 + h0*r2 + h4*5*r3 + h3*5*r4 + # d1 = h1*r0 + h0*r1 + h4*5*r2 + h3*5*r3 + h2*5*r4 + # d0 = h0*r0 + h4*5*r1 + h3*5*r2 + h2*5*r3 + h1*5*r4 + # + # however, as h2 is "chronologically" first one available pull + # corresponding operations up, so it's + # + # d4 = h2*r2 + h4*r0 + h3*r1 + h1*r3 + h0*r4 + # d3 = h2*r1 + h3*r0 + h1*r2 + h0*r3 + h4*5*r4 + # d2 = h2*r0 + h1*r1 + h0*r2 + h4*5*r3 + h3*5*r4 + # d1 = h2*5*r4 + h1*r0 + h0*r1 + h4*5*r2 + h3*5*r3 + # d0 = h2*5*r3 + h0*r0 + h4*5*r1 + h3*5*r2 + h1*5*r4 + + vpmuludq $H2,$T0,$D2 # d2 = h2*r0 + vpmuludq $H2,$T1,$D3 # d3 = h2*r1 + vpmuludq $H2,$T2,$D4 # d4 = h2*r2 + vpmuludq $H2,$T3,$D0 # d0 = h2*s3 + vpmuludq $H2,$S4,$D1 # d1 = h2*s4 + + vpmuludq $H0,$T1,$T4 # h0*r1 + vpmuludq $H1,$T1,$H2 # h1*r1, borrow $H2 as temp + vpaddq $T4,$D1,$D1 # d1 += h0*r1 + vpaddq $H2,$D2,$D2 # d2 += h1*r1 + vpmuludq $H3,$T1,$T4 # h3*r1 + vpmuludq `32*2`(%rsp),$H4,$H2 # h4*s1 + vpaddq $T4,$D4,$D4 # d4 += h3*r1 + vpaddq $H2,$D0,$D0 # d0 += h4*s1 + vmovdqa `32*4-0x90`(%rax),$T1 # s2 + + vpmuludq $H0,$T0,$T4 # h0*r0 + vpmuludq $H1,$T0,$H2 # h1*r0 + vpaddq $T4,$D0,$D0 # d0 += h0*r0 + vpaddq $H2,$D1,$D1 # d1 += h1*r0 + vpmuludq $H3,$T0,$T4 # h3*r0 + vpmuludq $H4,$T0,$H2 # h4*r0 + vmovdqu 16*0($inp),%x#$T0 # load input + vpaddq $T4,$D3,$D3 # d3 += h3*r0 + vpaddq $H2,$D4,$D4 # d4 += h4*r0 + vinserti128 \$1,16*2($inp),$T0,$T0 + + vpmuludq $H3,$T1,$T4 # h3*s2 + vpmuludq $H4,$T1,$H2 # h4*s2 + vmovdqu 16*1($inp),%x#$T1 + vpaddq $T4,$D0,$D0 # d0 += h3*s2 + vpaddq $H2,$D1,$D1 # d1 += h4*s2 + vmovdqa `32*5-0x90`(%rax),$H2 # r3 + vpmuludq $H1,$T2,$T4 # h1*r2 + vpmuludq $H0,$T2,$T2 # h0*r2 + vpaddq $T4,$D3,$D3 # d3 += h1*r2 + vpaddq $T2,$D2,$D2 # d2 += h0*r2 + vinserti128 \$1,16*3($inp),$T1,$T1 + lea 16*4($inp),$inp + + vpmuludq $H1,$H2,$T4 # h1*r3 + vpmuludq $H0,$H2,$H2 # h0*r3 + vpsrldq \$6,$T0,$T2 # splat input + vpaddq $T4,$D4,$D4 # d4 += h1*r3 + vpaddq $H2,$D3,$D3 # d3 += h0*r3 + vpmuludq $H3,$T3,$T4 # h3*s3 + vpmuludq $H4,$T3,$H2 # h4*s3 + vpsrldq \$6,$T1,$T3 + vpaddq $T4,$D1,$D1 # d1 += h3*s3 + vpaddq $H2,$D2,$D2 # d2 += h4*s3 + vpunpckhqdq $T1,$T0,$T4 # 4 + + vpmuludq $H3,$S4,$H3 # h3*s4 + vpmuludq $H4,$S4,$H4 # h4*s4 + vpunpcklqdq $T1,$T0,$T0 # 0:1 + vpaddq $H3,$D2,$H2 # h2 = d2 + h3*r4 + vpaddq $H4,$D3,$H3 # h3 = d3 + h4*r4 + vpunpcklqdq $T3,$T2,$T3 # 2:3 + vpmuludq `32*7-0x90`(%rax),$H0,$H4 # h0*r4 + vpmuludq $H1,$S4,$H0 # h1*s4 + vmovdqa 64(%rcx),$MASK # .Lmask26 + vpaddq $H4,$D4,$H4 # h4 = d4 + h0*r4 + vpaddq $H0,$D0,$H0 # h0 = d0 + h1*s4 + + ################################################################ + # lazy reduction (interleaved with tail of input splat) + + vpsrlq \$26,$H3,$D3 + vpand $MASK,$H3,$H3 + vpaddq $D3,$H4,$H4 # h3 -> h4 + + vpsrlq \$26,$H0,$D0 + vpand $MASK,$H0,$H0 + vpaddq $D0,$D1,$H1 # h0 -> h1 + + vpsrlq \$26,$H4,$D4 + vpand $MASK,$H4,$H4 + + vpsrlq \$4,$T3,$T2 + + vpsrlq \$26,$H1,$D1 + vpand $MASK,$H1,$H1 + vpaddq $D1,$H2,$H2 # h1 -> h2 + + vpaddq $D4,$H0,$H0 + vpsllq \$2,$D4,$D4 + vpaddq $D4,$H0,$H0 # h4 -> h0 + + vpand $MASK,$T2,$T2 # 2 + vpsrlq \$26,$T0,$T1 + + vpsrlq \$26,$H2,$D2 + vpand $MASK,$H2,$H2 + vpaddq $D2,$H3,$H3 # h2 -> h3 + + vpaddq $T2,$H2,$H2 # modulo-scheduled + vpsrlq \$30,$T3,$T3 + + vpsrlq \$26,$H0,$D0 + vpand $MASK,$H0,$H0 + vpaddq $D0,$H1,$H1 # h0 -> h1 + + vpsrlq \$40,$T4,$T4 # 4 + + vpsrlq \$26,$H3,$D3 + vpand $MASK,$H3,$H3 + vpaddq $D3,$H4,$H4 # h3 -> h4 + + vpand $MASK,$T0,$T0 # 0 + vpand $MASK,$T1,$T1 # 1 + vpand $MASK,$T3,$T3 # 3 + vpor 32(%rcx),$T4,$T4 # padbit, yes, always + + sub \$64,$len + jnz .Loop_avx2$suffix + + .byte 0x66,0x90 +.Ltail_avx2$suffix: + ################################################################ + # while above multiplications were by r^4 in all lanes, in last + # iteration we multiply least significant lane by r^4 and most + # significant one by r, so copy of above except that references + # to the precomputed table are displaced by 4... + + #vpaddq $H2,$T2,$H2 # accumulate input + vpaddq $H0,$T0,$H0 + vmovdqu `32*0+4`(%rsp),$T0 # r0^4 + vpaddq $H1,$T1,$H1 + vmovdqu `32*1+4`(%rsp),$T1 # r1^4 + vpaddq $H3,$T3,$H3 + vmovdqu `32*3+4`(%rsp),$T2 # r2^4 + vpaddq $H4,$T4,$H4 + vmovdqu `32*6+4-0x90`(%rax),$T3 # s3^4 + vmovdqu `32*8+4-0x90`(%rax),$S4 # s4^4 + + vpmuludq $H2,$T0,$D2 # d2 = h2*r0 + vpmuludq $H2,$T1,$D3 # d3 = h2*r1 + vpmuludq $H2,$T2,$D4 # d4 = h2*r2 + vpmuludq $H2,$T3,$D0 # d0 = h2*s3 + vpmuludq $H2,$S4,$D1 # d1 = h2*s4 + + vpmuludq $H0,$T1,$T4 # h0*r1 + vpmuludq $H1,$T1,$H2 # h1*r1 + vpaddq $T4,$D1,$D1 # d1 += h0*r1 + vpaddq $H2,$D2,$D2 # d2 += h1*r1 + vpmuludq $H3,$T1,$T4 # h3*r1 + vpmuludq `32*2+4`(%rsp),$H4,$H2 # h4*s1 + vpaddq $T4,$D4,$D4 # d4 += h3*r1 + vpaddq $H2,$D0,$D0 # d0 += h4*s1 + + vpmuludq $H0,$T0,$T4 # h0*r0 + vpmuludq $H1,$T0,$H2 # h1*r0 + vpaddq $T4,$D0,$D0 # d0 += h0*r0 + vmovdqu `32*4+4-0x90`(%rax),$T1 # s2 + vpaddq $H2,$D1,$D1 # d1 += h1*r0 + vpmuludq $H3,$T0,$T4 # h3*r0 + vpmuludq $H4,$T0,$H2 # h4*r0 + vpaddq $T4,$D3,$D3 # d3 += h3*r0 + vpaddq $H2,$D4,$D4 # d4 += h4*r0 + + vpmuludq $H3,$T1,$T4 # h3*s2 + vpmuludq $H4,$T1,$H2 # h4*s2 + vpaddq $T4,$D0,$D0 # d0 += h3*s2 + vpaddq $H2,$D1,$D1 # d1 += h4*s2 + vmovdqu `32*5+4-0x90`(%rax),$H2 # r3 + vpmuludq $H1,$T2,$T4 # h1*r2 + vpmuludq $H0,$T2,$T2 # h0*r2 + vpaddq $T4,$D3,$D3 # d3 += h1*r2 + vpaddq $T2,$D2,$D2 # d2 += h0*r2 + + vpmuludq $H1,$H2,$T4 # h1*r3 + vpmuludq $H0,$H2,$H2 # h0*r3 + vpaddq $T4,$D4,$D4 # d4 += h1*r3 + vpaddq $H2,$D3,$D3 # d3 += h0*r3 + vpmuludq $H3,$T3,$T4 # h3*s3 + vpmuludq $H4,$T3,$H2 # h4*s3 + vpaddq $T4,$D1,$D1 # d1 += h3*s3 + vpaddq $H2,$D2,$D2 # d2 += h4*s3 + + vpmuludq $H3,$S4,$H3 # h3*s4 + vpmuludq $H4,$S4,$H4 # h4*s4 + vpaddq $H3,$D2,$H2 # h2 = d2 + h3*r4 + vpaddq $H4,$D3,$H3 # h3 = d3 + h4*r4 + vpmuludq `32*7+4-0x90`(%rax),$H0,$H4 # h0*r4 + vpmuludq $H1,$S4,$H0 # h1*s4 + vmovdqa 64(%rcx),$MASK # .Lmask26 + vpaddq $H4,$D4,$H4 # h4 = d4 + h0*r4 + vpaddq $H0,$D0,$H0 # h0 = d0 + h1*s4 + + ################################################################ + # horizontal addition + + vpsrldq \$8,$D1,$T1 + vpsrldq \$8,$H2,$T2 + vpsrldq \$8,$H3,$T3 + vpsrldq \$8,$H4,$T4 + vpsrldq \$8,$H0,$T0 + vpaddq $T1,$D1,$D1 + vpaddq $T2,$H2,$H2 + vpaddq $T3,$H3,$H3 + vpaddq $T4,$H4,$H4 + vpaddq $T0,$H0,$H0 + + vpermq \$0x2,$H3,$T3 + vpermq \$0x2,$H4,$T4 + vpermq \$0x2,$H0,$T0 + vpermq \$0x2,$D1,$T1 + vpermq \$0x2,$H2,$T2 + vpaddq $T3,$H3,$H3 + vpaddq $T4,$H4,$H4 + vpaddq $T0,$H0,$H0 + vpaddq $T1,$D1,$D1 + vpaddq $T2,$H2,$H2 + + ################################################################ + # lazy reduction + + vpsrlq \$26,$H3,$D3 + vpand $MASK,$H3,$H3 + vpaddq $D3,$H4,$H4 # h3 -> h4 + + vpsrlq \$26,$H0,$D0 + vpand $MASK,$H0,$H0 + vpaddq $D0,$D1,$H1 # h0 -> h1 + + vpsrlq \$26,$H4,$D4 + vpand $MASK,$H4,$H4 + + vpsrlq \$26,$H1,$D1 + vpand $MASK,$H1,$H1 + vpaddq $D1,$H2,$H2 # h1 -> h2 + + vpaddq $D4,$H0,$H0 + vpsllq \$2,$D4,$D4 + vpaddq $D4,$H0,$H0 # h4 -> h0 + + vpsrlq \$26,$H2,$D2 + vpand $MASK,$H2,$H2 + vpaddq $D2,$H3,$H3 # h2 -> h3 + + vpsrlq \$26,$H0,$D0 + vpand $MASK,$H0,$H0 + vpaddq $D0,$H1,$H1 # h0 -> h1 + + vpsrlq \$26,$H3,$D3 + vpand $MASK,$H3,$H3 + vpaddq $D3,$H4,$H4 # h3 -> h4 + + vmovd %x#$H0,`4*0-48-64`($ctx)# save partially reduced + vmovd %x#$H1,`4*1-48-64`($ctx) + vmovd %x#$H2,`4*2-48-64`($ctx) + vmovd %x#$H3,`4*3-48-64`($ctx) + vmovd %x#$H4,`4*4-48-64`($ctx) +___ +$code.=<<___ if ($win64); + vmovdqa -0xb0(%r10),%xmm6 + vmovdqa -0xa0(%r10),%xmm7 + vmovdqa -0x90(%r10),%xmm8 + vmovdqa -0x80(%r10),%xmm9 + vmovdqa -0x70(%r10),%xmm10 + vmovdqa -0x60(%r10),%xmm11 + vmovdqa -0x50(%r10),%xmm12 + vmovdqa -0x40(%r10),%xmm13 + vmovdqa -0x30(%r10),%xmm14 + vmovdqa -0x20(%r10),%xmm15 + lea -8(%r10),%rsp +.Ldo_avx2_epilogue$suffix: +___ +$code.=<<___ if (!$win64); + lea -8(%r10),%rsp +.cfi_def_cfa_register %rsp +___ +$code.=<<___; + vzeroupper + ret +.cfi_endproc +___ +if($avx > 2 && $avx512) { +my ($R0,$R1,$R2,$R3,$R4, $S1,$S2,$S3,$S4) = map("%zmm$_",(16..24)); +my ($M0,$M1,$M2,$M3,$M4) = map("%zmm$_",(25..29)); +my $PADBIT="%zmm30"; + +map(s/%y/%z/,($T4,$T0,$T1,$T2,$T3)); # switch to %zmm domain +map(s/%y/%z/,($D0,$D1,$D2,$D3,$D4)); +map(s/%y/%z/,($H0,$H1,$H2,$H3,$H4)); +map(s/%y/%z/,($MASK)); + +$code.=<<___; +.cfi_startproc +.Lblocks_avx512: + mov \$15,%eax + kmovw %eax,%k2 +___ +$code.=<<___ if (!$win64); + lea 8(%rsp),%r10 +.cfi_def_cfa_register %r10 + sub \$0x128,%rsp +___ +$code.=<<___ if ($win64); + lea 8(%rsp),%r10 + sub \$0x1c8,%rsp + vmovdqa %xmm6,-0xb0(%r10) + vmovdqa %xmm7,-0xa0(%r10) + vmovdqa %xmm8,-0x90(%r10) + vmovdqa %xmm9,-0x80(%r10) + vmovdqa %xmm10,-0x70(%r10) + vmovdqa %xmm11,-0x60(%r10) + vmovdqa %xmm12,-0x50(%r10) + vmovdqa %xmm13,-0x40(%r10) + vmovdqa %xmm14,-0x30(%r10) + vmovdqa %xmm15,-0x20(%r10) +.Ldo_avx512_body: +___ +$code.=<<___; + lea .Lconst(%rip),%rcx + lea 48+64($ctx),$ctx # size optimization + vmovdqa 96(%rcx),%y#$T2 # .Lpermd_avx2 + + # expand pre-calculated table + vmovdqu `16*0-64`($ctx),%x#$D0 # will become expanded ${R0} + and \$-512,%rsp + vmovdqu `16*1-64`($ctx),%x#$D1 # will become ... ${R1} + mov \$0x20,%rax + vmovdqu `16*2-64`($ctx),%x#$T0 # ... ${S1} + vmovdqu `16*3-64`($ctx),%x#$D2 # ... ${R2} + vmovdqu `16*4-64`($ctx),%x#$T1 # ... ${S2} + vmovdqu `16*5-64`($ctx),%x#$D3 # ... ${R3} + vmovdqu `16*6-64`($ctx),%x#$T3 # ... ${S3} + vmovdqu `16*7-64`($ctx),%x#$D4 # ... ${R4} + vmovdqu `16*8-64`($ctx),%x#$T4 # ... ${S4} + vpermd $D0,$T2,$R0 # 00003412 -> 14243444 + vpbroadcastq 64(%rcx),$MASK # .Lmask26 + vpermd $D1,$T2,$R1 + vpermd $T0,$T2,$S1 + vpermd $D2,$T2,$R2 + vmovdqa64 $R0,0x00(%rsp){%k2} # save in case $len%128 != 0 + vpsrlq \$32,$R0,$T0 # 14243444 -> 01020304 + vpermd $T1,$T2,$S2 + vmovdqu64 $R1,0x00(%rsp,%rax){%k2} + vpsrlq \$32,$R1,$T1 + vpermd $D3,$T2,$R3 + vmovdqa64 $S1,0x40(%rsp){%k2} + vpermd $T3,$T2,$S3 + vpermd $D4,$T2,$R4 + vmovdqu64 $R2,0x40(%rsp,%rax){%k2} + vpermd $T4,$T2,$S4 + vmovdqa64 $S2,0x80(%rsp){%k2} + vmovdqu64 $R3,0x80(%rsp,%rax){%k2} + vmovdqa64 $S3,0xc0(%rsp){%k2} + vmovdqu64 $R4,0xc0(%rsp,%rax){%k2} + vmovdqa64 $S4,0x100(%rsp){%k2} + + ################################################################ + # calculate 5th through 8th powers of the key + # + # d0 = r0'*r0 + r1'*5*r4 + r2'*5*r3 + r3'*5*r2 + r4'*5*r1 + # d1 = r0'*r1 + r1'*r0 + r2'*5*r4 + r3'*5*r3 + r4'*5*r2 + # d2 = r0'*r2 + r1'*r1 + r2'*r0 + r3'*5*r4 + r4'*5*r3 + # d3 = r0'*r3 + r1'*r2 + r2'*r1 + r3'*r0 + r4'*5*r4 + # d4 = r0'*r4 + r1'*r3 + r2'*r2 + r3'*r1 + r4'*r0 + + vpmuludq $T0,$R0,$D0 # d0 = r0'*r0 + vpmuludq $T0,$R1,$D1 # d1 = r0'*r1 + vpmuludq $T0,$R2,$D2 # d2 = r0'*r2 + vpmuludq $T0,$R3,$D3 # d3 = r0'*r3 + vpmuludq $T0,$R4,$D4 # d4 = r0'*r4 + vpsrlq \$32,$R2,$T2 + + vpmuludq $T1,$S4,$M0 + vpmuludq $T1,$R0,$M1 + vpmuludq $T1,$R1,$M2 + vpmuludq $T1,$R2,$M3 + vpmuludq $T1,$R3,$M4 + vpsrlq \$32,$R3,$T3 + vpaddq $M0,$D0,$D0 # d0 += r1'*5*r4 + vpaddq $M1,$D1,$D1 # d1 += r1'*r0 + vpaddq $M2,$D2,$D2 # d2 += r1'*r1 + vpaddq $M3,$D3,$D3 # d3 += r1'*r2 + vpaddq $M4,$D4,$D4 # d4 += r1'*r3 + + vpmuludq $T2,$S3,$M0 + vpmuludq $T2,$S4,$M1 + vpmuludq $T2,$R1,$M3 + vpmuludq $T2,$R2,$M4 + vpmuludq $T2,$R0,$M2 + vpsrlq \$32,$R4,$T4 + vpaddq $M0,$D0,$D0 # d0 += r2'*5*r3 + vpaddq $M1,$D1,$D1 # d1 += r2'*5*r4 + vpaddq $M3,$D3,$D3 # d3 += r2'*r1 + vpaddq $M4,$D4,$D4 # d4 += r2'*r2 + vpaddq $M2,$D2,$D2 # d2 += r2'*r0 + + vpmuludq $T3,$S2,$M0 + vpmuludq $T3,$R0,$M3 + vpmuludq $T3,$R1,$M4 + vpmuludq $T3,$S3,$M1 + vpmuludq $T3,$S4,$M2 + vpaddq $M0,$D0,$D0 # d0 += r3'*5*r2 + vpaddq $M3,$D3,$D3 # d3 += r3'*r0 + vpaddq $M4,$D4,$D4 # d4 += r3'*r1 + vpaddq $M1,$D1,$D1 # d1 += r3'*5*r3 + vpaddq $M2,$D2,$D2 # d2 += r3'*5*r4 + + vpmuludq $T4,$S4,$M3 + vpmuludq $T4,$R0,$M4 + vpmuludq $T4,$S1,$M0 + vpmuludq $T4,$S2,$M1 + vpmuludq $T4,$S3,$M2 + vpaddq $M3,$D3,$D3 # d3 += r2'*5*r4 + vpaddq $M4,$D4,$D4 # d4 += r2'*r0 + vpaddq $M0,$D0,$D0 # d0 += r2'*5*r1 + vpaddq $M1,$D1,$D1 # d1 += r2'*5*r2 + vpaddq $M2,$D2,$D2 # d2 += r2'*5*r3 + + ################################################################ + # load input + vmovdqu64 16*0($inp),%z#$T3 + vmovdqu64 16*4($inp),%z#$T4 + lea 16*8($inp),$inp + + ################################################################ + # lazy reduction + + vpsrlq \$26,$D3,$M3 + vpandq $MASK,$D3,$D3 + vpaddq $M3,$D4,$D4 # d3 -> d4 + + vpsrlq \$26,$D0,$M0 + vpandq $MASK,$D0,$D0 + vpaddq $M0,$D1,$D1 # d0 -> d1 + + vpsrlq \$26,$D4,$M4 + vpandq $MASK,$D4,$D4 + + vpsrlq \$26,$D1,$M1 + vpandq $MASK,$D1,$D1 + vpaddq $M1,$D2,$D2 # d1 -> d2 + + vpaddq $M4,$D0,$D0 + vpsllq \$2,$M4,$M4 + vpaddq $M4,$D0,$D0 # d4 -> d0 + + vpsrlq \$26,$D2,$M2 + vpandq $MASK,$D2,$D2 + vpaddq $M2,$D3,$D3 # d2 -> d3 + + vpsrlq \$26,$D0,$M0 + vpandq $MASK,$D0,$D0 + vpaddq $M0,$D1,$D1 # d0 -> d1 + + vpsrlq \$26,$D3,$M3 + vpandq $MASK,$D3,$D3 + vpaddq $M3,$D4,$D4 # d3 -> d4 + + ################################################################ + # at this point we have 14243444 in $R0-$S4 and 05060708 in + # $D0-$D4, ... + + vpunpcklqdq $T4,$T3,$T0 # transpose input + vpunpckhqdq $T4,$T3,$T4 + + # ... since input 64-bit lanes are ordered as 73625140, we could + # "vperm" it to 76543210 (here and in each loop iteration), *or* + # we could just flow along, hence the goal for $R0-$S4 is + # 1858286838784888 ... + + vmovdqa32 128(%rcx),$M0 # .Lpermd_avx512: + mov \$0x7777,%eax + kmovw %eax,%k1 + + vpermd $R0,$M0,$R0 # 14243444 -> 1---2---3---4--- + vpermd $R1,$M0,$R1 + vpermd $R2,$M0,$R2 + vpermd $R3,$M0,$R3 + vpermd $R4,$M0,$R4 + + vpermd $D0,$M0,${R0}{%k1} # 05060708 -> 1858286838784888 + vpermd $D1,$M0,${R1}{%k1} + vpermd $D2,$M0,${R2}{%k1} + vpermd $D3,$M0,${R3}{%k1} + vpermd $D4,$M0,${R4}{%k1} + + vpslld \$2,$R1,$S1 # *5 + vpslld \$2,$R2,$S2 + vpslld \$2,$R3,$S3 + vpslld \$2,$R4,$S4 + vpaddd $R1,$S1,$S1 + vpaddd $R2,$S2,$S2 + vpaddd $R3,$S3,$S3 + vpaddd $R4,$S4,$S4 + + vpbroadcastq 32(%rcx),$PADBIT # .L129 + + vpsrlq \$52,$T0,$T2 # splat input + vpsllq \$12,$T4,$T3 + vporq $T3,$T2,$T2 + vpsrlq \$26,$T0,$T1 + vpsrlq \$14,$T4,$T3 + vpsrlq \$40,$T4,$T4 # 4 + vpandq $MASK,$T2,$T2 # 2 + vpandq $MASK,$T0,$T0 # 0 + #vpandq $MASK,$T1,$T1 # 1 + #vpandq $MASK,$T3,$T3 # 3 + #vporq $PADBIT,$T4,$T4 # padbit, yes, always + + vpaddq $H2,$T2,$H2 # accumulate input + sub \$192,$len + jbe .Ltail_avx512 + jmp .Loop_avx512 + +.align 32 +.Loop_avx512: + ################################################################ + # ((inp[0]*r^8+inp[ 8])*r^8+inp[16])*r^8 + # ((inp[1]*r^8+inp[ 9])*r^8+inp[17])*r^7 + # ((inp[2]*r^8+inp[10])*r^8+inp[18])*r^6 + # ((inp[3]*r^8+inp[11])*r^8+inp[19])*r^5 + # ((inp[4]*r^8+inp[12])*r^8+inp[20])*r^4 + # ((inp[5]*r^8+inp[13])*r^8+inp[21])*r^3 + # ((inp[6]*r^8+inp[14])*r^8+inp[22])*r^2 + # ((inp[7]*r^8+inp[15])*r^8+inp[23])*r^1 + # \________/\___________/ + ################################################################ + #vpaddq $H2,$T2,$H2 # accumulate input + + # d4 = h4*r0 + h3*r1 + h2*r2 + h1*r3 + h0*r4 + # d3 = h3*r0 + h2*r1 + h1*r2 + h0*r3 + h4*5*r4 + # d2 = h2*r0 + h1*r1 + h0*r2 + h4*5*r3 + h3*5*r4 + # d1 = h1*r0 + h0*r1 + h4*5*r2 + h3*5*r3 + h2*5*r4 + # d0 = h0*r0 + h4*5*r1 + h3*5*r2 + h2*5*r3 + h1*5*r4 + # + # however, as h2 is "chronologically" first one available pull + # corresponding operations up, so it's + # + # d3 = h2*r1 + h0*r3 + h1*r2 + h3*r0 + h4*5*r4 + # d4 = h2*r2 + h0*r4 + h1*r3 + h3*r1 + h4*r0 + # d0 = h2*5*r3 + h0*r0 + h1*5*r4 + h3*5*r2 + h4*5*r1 + # d1 = h2*5*r4 + h0*r1 + h1*r0 + h3*5*r3 + h4*5*r2 + # d2 = h2*r0 + h0*r2 + h1*r1 + h3*5*r4 + h4*5*r3 + + vpmuludq $H2,$R1,$D3 # d3 = h2*r1 + vpaddq $H0,$T0,$H0 + vpmuludq $H2,$R2,$D4 # d4 = h2*r2 + vpandq $MASK,$T1,$T1 # 1 + vpmuludq $H2,$S3,$D0 # d0 = h2*s3 + vpandq $MASK,$T3,$T3 # 3 + vpmuludq $H2,$S4,$D1 # d1 = h2*s4 + vporq $PADBIT,$T4,$T4 # padbit, yes, always + vpmuludq $H2,$R0,$D2 # d2 = h2*r0 + vpaddq $H1,$T1,$H1 # accumulate input + vpaddq $H3,$T3,$H3 + vpaddq $H4,$T4,$H4 + + vmovdqu64 16*0($inp),$T3 # load input + vmovdqu64 16*4($inp),$T4 + lea 16*8($inp),$inp + vpmuludq $H0,$R3,$M3 + vpmuludq $H0,$R4,$M4 + vpmuludq $H0,$R0,$M0 + vpmuludq $H0,$R1,$M1 + vpaddq $M3,$D3,$D3 # d3 += h0*r3 + vpaddq $M4,$D4,$D4 # d4 += h0*r4 + vpaddq $M0,$D0,$D0 # d0 += h0*r0 + vpaddq $M1,$D1,$D1 # d1 += h0*r1 + + vpmuludq $H1,$R2,$M3 + vpmuludq $H1,$R3,$M4 + vpmuludq $H1,$S4,$M0 + vpmuludq $H0,$R2,$M2 + vpaddq $M3,$D3,$D3 # d3 += h1*r2 + vpaddq $M4,$D4,$D4 # d4 += h1*r3 + vpaddq $M0,$D0,$D0 # d0 += h1*s4 + vpaddq $M2,$D2,$D2 # d2 += h0*r2 + + vpunpcklqdq $T4,$T3,$T0 # transpose input + vpunpckhqdq $T4,$T3,$T4 + + vpmuludq $H3,$R0,$M3 + vpmuludq $H3,$R1,$M4 + vpmuludq $H1,$R0,$M1 + vpmuludq $H1,$R1,$M2 + vpaddq $M3,$D3,$D3 # d3 += h3*r0 + vpaddq $M4,$D4,$D4 # d4 += h3*r1 + vpaddq $M1,$D1,$D1 # d1 += h1*r0 + vpaddq $M2,$D2,$D2 # d2 += h1*r1 + + vpmuludq $H4,$S4,$M3 + vpmuludq $H4,$R0,$M4 + vpmuludq $H3,$S2,$M0 + vpmuludq $H3,$S3,$M1 + vpaddq $M3,$D3,$D3 # d3 += h4*s4 + vpmuludq $H3,$S4,$M2 + vpaddq $M4,$D4,$D4 # d4 += h4*r0 + vpaddq $M0,$D0,$D0 # d0 += h3*s2 + vpaddq $M1,$D1,$D1 # d1 += h3*s3 + vpaddq $M2,$D2,$D2 # d2 += h3*s4 + + vpmuludq $H4,$S1,$M0 + vpmuludq $H4,$S2,$M1 + vpmuludq $H4,$S3,$M2 + vpaddq $M0,$D0,$H0 # h0 = d0 + h4*s1 + vpaddq $M1,$D1,$H1 # h1 = d2 + h4*s2 + vpaddq $M2,$D2,$H2 # h2 = d3 + h4*s3 + + ################################################################ + # lazy reduction (interleaved with input splat) + + vpsrlq \$52,$T0,$T2 # splat input + vpsllq \$12,$T4,$T3 + + vpsrlq \$26,$D3,$H3 + vpandq $MASK,$D3,$D3 + vpaddq $H3,$D4,$H4 # h3 -> h4 + + vporq $T3,$T2,$T2 + + vpsrlq \$26,$H0,$D0 + vpandq $MASK,$H0,$H0 + vpaddq $D0,$H1,$H1 # h0 -> h1 + + vpandq $MASK,$T2,$T2 # 2 + + vpsrlq \$26,$H4,$D4 + vpandq $MASK,$H4,$H4 + + vpsrlq \$26,$H1,$D1 + vpandq $MASK,$H1,$H1 + vpaddq $D1,$H2,$H2 # h1 -> h2 + + vpaddq $D4,$H0,$H0 + vpsllq \$2,$D4,$D4 + vpaddq $D4,$H0,$H0 # h4 -> h0 + + vpaddq $T2,$H2,$H2 # modulo-scheduled + vpsrlq \$26,$T0,$T1 + + vpsrlq \$26,$H2,$D2 + vpandq $MASK,$H2,$H2 + vpaddq $D2,$D3,$H3 # h2 -> h3 + + vpsrlq \$14,$T4,$T3 + + vpsrlq \$26,$H0,$D0 + vpandq $MASK,$H0,$H0 + vpaddq $D0,$H1,$H1 # h0 -> h1 + + vpsrlq \$40,$T4,$T4 # 4 + + vpsrlq \$26,$H3,$D3 + vpandq $MASK,$H3,$H3 + vpaddq $D3,$H4,$H4 # h3 -> h4 + + vpandq $MASK,$T0,$T0 # 0 + #vpandq $MASK,$T1,$T1 # 1 + #vpandq $MASK,$T3,$T3 # 3 + #vporq $PADBIT,$T4,$T4 # padbit, yes, always + + sub \$128,$len + ja .Loop_avx512 + +.Ltail_avx512: + ################################################################ + # while above multiplications were by r^8 in all lanes, in last + # iteration we multiply least significant lane by r^8 and most + # significant one by r, that's why table gets shifted... + + vpsrlq \$32,$R0,$R0 # 0105020603070408 + vpsrlq \$32,$R1,$R1 + vpsrlq \$32,$R2,$R2 + vpsrlq \$32,$S3,$S3 + vpsrlq \$32,$S4,$S4 + vpsrlq \$32,$R3,$R3 + vpsrlq \$32,$R4,$R4 + vpsrlq \$32,$S1,$S1 + vpsrlq \$32,$S2,$S2 + + ################################################################ + # load either next or last 64 byte of input + lea ($inp,$len),$inp + + #vpaddq $H2,$T2,$H2 # accumulate input + vpaddq $H0,$T0,$H0 + + vpmuludq $H2,$R1,$D3 # d3 = h2*r1 + vpmuludq $H2,$R2,$D4 # d4 = h2*r2 + vpmuludq $H2,$S3,$D0 # d0 = h2*s3 + vpandq $MASK,$T1,$T1 # 1 + vpmuludq $H2,$S4,$D1 # d1 = h2*s4 + vpandq $MASK,$T3,$T3 # 3 + vpmuludq $H2,$R0,$D2 # d2 = h2*r0 + vporq $PADBIT,$T4,$T4 # padbit, yes, always + vpaddq $H1,$T1,$H1 # accumulate input + vpaddq $H3,$T3,$H3 + vpaddq $H4,$T4,$H4 + + vmovdqu 16*0($inp),%x#$T0 + vpmuludq $H0,$R3,$M3 + vpmuludq $H0,$R4,$M4 + vpmuludq $H0,$R0,$M0 + vpmuludq $H0,$R1,$M1 + vpaddq $M3,$D3,$D3 # d3 += h0*r3 + vpaddq $M4,$D4,$D4 # d4 += h0*r4 + vpaddq $M0,$D0,$D0 # d0 += h0*r0 + vpaddq $M1,$D1,$D1 # d1 += h0*r1 + + vmovdqu 16*1($inp),%x#$T1 + vpmuludq $H1,$R2,$M3 + vpmuludq $H1,$R3,$M4 + vpmuludq $H1,$S4,$M0 + vpmuludq $H0,$R2,$M2 + vpaddq $M3,$D3,$D3 # d3 += h1*r2 + vpaddq $M4,$D4,$D4 # d4 += h1*r3 + vpaddq $M0,$D0,$D0 # d0 += h1*s4 + vpaddq $M2,$D2,$D2 # d2 += h0*r2 + + vinserti128 \$1,16*2($inp),%y#$T0,%y#$T0 + vpmuludq $H3,$R0,$M3 + vpmuludq $H3,$R1,$M4 + vpmuludq $H1,$R0,$M1 + vpmuludq $H1,$R1,$M2 + vpaddq $M3,$D3,$D3 # d3 += h3*r0 + vpaddq $M4,$D4,$D4 # d4 += h3*r1 + vpaddq $M1,$D1,$D1 # d1 += h1*r0 + vpaddq $M2,$D2,$D2 # d2 += h1*r1 + + vinserti128 \$1,16*3($inp),%y#$T1,%y#$T1 + vpmuludq $H4,$S4,$M3 + vpmuludq $H4,$R0,$M4 + vpmuludq $H3,$S2,$M0 + vpmuludq $H3,$S3,$M1 + vpmuludq $H3,$S4,$M2 + vpaddq $M3,$D3,$H3 # h3 = d3 + h4*s4 + vpaddq $M4,$D4,$D4 # d4 += h4*r0 + vpaddq $M0,$D0,$D0 # d0 += h3*s2 + vpaddq $M1,$D1,$D1 # d1 += h3*s3 + vpaddq $M2,$D2,$D2 # d2 += h3*s4 + + vpmuludq $H4,$S1,$M0 + vpmuludq $H4,$S2,$M1 + vpmuludq $H4,$S3,$M2 + vpaddq $M0,$D0,$H0 # h0 = d0 + h4*s1 + vpaddq $M1,$D1,$H1 # h1 = d2 + h4*s2 + vpaddq $M2,$D2,$H2 # h2 = d3 + h4*s3 + + ################################################################ + # horizontal addition + + mov \$1,%eax + vpermq \$0xb1,$H3,$D3 + vpermq \$0xb1,$D4,$H4 + vpermq \$0xb1,$H0,$D0 + vpermq \$0xb1,$H1,$D1 + vpermq \$0xb1,$H2,$D2 + vpaddq $D3,$H3,$H3 + vpaddq $D4,$H4,$H4 + vpaddq $D0,$H0,$H0 + vpaddq $D1,$H1,$H1 + vpaddq $D2,$H2,$H2 + + kmovw %eax,%k3 + vpermq \$0x2,$H3,$D3 + vpermq \$0x2,$H4,$D4 + vpermq \$0x2,$H0,$D0 + vpermq \$0x2,$H1,$D1 + vpermq \$0x2,$H2,$D2 + vpaddq $D3,$H3,$H3 + vpaddq $D4,$H4,$H4 + vpaddq $D0,$H0,$H0 + vpaddq $D1,$H1,$H1 + vpaddq $D2,$H2,$H2 + + vextracti64x4 \$0x1,$H3,%y#$D3 + vextracti64x4 \$0x1,$H4,%y#$D4 + vextracti64x4 \$0x1,$H0,%y#$D0 + vextracti64x4 \$0x1,$H1,%y#$D1 + vextracti64x4 \$0x1,$H2,%y#$D2 + vpaddq $D3,$H3,${H3}{%k3}{z} # keep single qword in case + vpaddq $D4,$H4,${H4}{%k3}{z} # it's passed to .Ltail_avx2 + vpaddq $D0,$H0,${H0}{%k3}{z} + vpaddq $D1,$H1,${H1}{%k3}{z} + vpaddq $D2,$H2,${H2}{%k3}{z} +___ +map(s/%z/%y/,($T0,$T1,$T2,$T3,$T4, $PADBIT)); +map(s/%z/%y/,($H0,$H1,$H2,$H3,$H4, $D0,$D1,$D2,$D3,$D4, $MASK)); +$code.=<<___; + ################################################################ + # lazy reduction (interleaved with input splat) + + vpsrlq \$26,$H3,$D3 + vpand $MASK,$H3,$H3 + vpsrldq \$6,$T0,$T2 # splat input + vpsrldq \$6,$T1,$T3 + vpunpckhqdq $T1,$T0,$T4 # 4 + vpaddq $D3,$H4,$H4 # h3 -> h4 + + vpsrlq \$26,$H0,$D0 + vpand $MASK,$H0,$H0 + vpunpcklqdq $T3,$T2,$T2 # 2:3 + vpunpcklqdq $T1,$T0,$T0 # 0:1 + vpaddq $D0,$H1,$H1 # h0 -> h1 + + vpsrlq \$26,$H4,$D4 + vpand $MASK,$H4,$H4 + + vpsrlq \$26,$H1,$D1 + vpand $MASK,$H1,$H1 + vpsrlq \$30,$T2,$T3 + vpsrlq \$4,$T2,$T2 + vpaddq $D1,$H2,$H2 # h1 -> h2 + + vpaddq $D4,$H0,$H0 + vpsllq \$2,$D4,$D4 + vpsrlq \$26,$T0,$T1 + vpsrlq \$40,$T4,$T4 # 4 + vpaddq $D4,$H0,$H0 # h4 -> h0 + + vpsrlq \$26,$H2,$D2 + vpand $MASK,$H2,$H2 + vpand $MASK,$T2,$T2 # 2 + vpand $MASK,$T0,$T0 # 0 + vpaddq $D2,$H3,$H3 # h2 -> h3 + + vpsrlq \$26,$H0,$D0 + vpand $MASK,$H0,$H0 + vpaddq $H2,$T2,$H2 # accumulate input for .Ltail_avx2 + vpand $MASK,$T1,$T1 # 1 + vpaddq $D0,$H1,$H1 # h0 -> h1 + + vpsrlq \$26,$H3,$D3 + vpand $MASK,$H3,$H3 + vpand $MASK,$T3,$T3 # 3 + vpor 32(%rcx),$T4,$T4 # padbit, yes, always + vpaddq $D3,$H4,$H4 # h3 -> h4 + + lea 0x90(%rsp),%rax # size optimization for .Ltail_avx2 + add \$64,$len + jnz .Ltail_avx2$suffix + + vpsubq $T2,$H2,$H2 # undo input accumulation + vmovd %x#$H0,`4*0-48-64`($ctx)# save partially reduced + vmovd %x#$H1,`4*1-48-64`($ctx) + vmovd %x#$H2,`4*2-48-64`($ctx) + vmovd %x#$H3,`4*3-48-64`($ctx) + vmovd %x#$H4,`4*4-48-64`($ctx) + vzeroall +___ +$code.=<<___ if ($win64); + movdqa -0xb0(%r10),%xmm6 + movdqa -0xa0(%r10),%xmm7 + movdqa -0x90(%r10),%xmm8 + movdqa -0x80(%r10),%xmm9 + movdqa -0x70(%r10),%xmm10 + movdqa -0x60(%r10),%xmm11 + movdqa -0x50(%r10),%xmm12 + movdqa -0x40(%r10),%xmm13 + movdqa -0x30(%r10),%xmm14 + movdqa -0x20(%r10),%xmm15 + lea -8(%r10),%rsp +.Ldo_avx512_epilogue: +___ +$code.=<<___ if (!$win64); + lea -8(%r10),%rsp +.cfi_def_cfa_register %rsp +___ +$code.=<<___; + ret +.cfi_endproc +___ + +} + +} + +&declare_function("poly1305_blocks_avx2", 32, 4); +poly1305_blocks_avxN(0); +&end_function("poly1305_blocks_avx2"); + +if($kernel) { + $code .= "#endif\n"; +} + +####################################################################### +if ($avx>2) { +# On entry we have input length divisible by 64. But since inner loop +# processes 128 bytes per iteration, cases when length is not divisible +# by 128 are handled by passing tail 64 bytes to .Ltail_avx2. For this +# reason stack layout is kept identical to poly1305_blocks_avx2. If not +# for this tail, we wouldn't have to even allocate stack frame... + +if($kernel) { + $code .= "#ifdef CONFIG_AS_AVX512\n"; +} + +&declare_function("poly1305_blocks_avx512", 32, 4); +poly1305_blocks_avxN(1); +&end_function("poly1305_blocks_avx512"); + +if ($kernel) { + $code .= "#endif\n"; +} + +if (!$kernel && $avx>3) { +######################################################################## +# VPMADD52 version using 2^44 radix. +# +# One can argue that base 2^52 would be more natural. Well, even though +# some operations would be more natural, one has to recognize couple of +# things. Base 2^52 doesn't provide advantage over base 2^44 if you look +# at amount of multiply-n-accumulate operations. Secondly, it makes it +# impossible to pre-compute multiples of 5 [referred to as s[]/sN in +# reference implementations], which means that more such operations +# would have to be performed in inner loop, which in turn makes critical +# path longer. In other words, even though base 2^44 reduction might +# look less elegant, overall critical path is actually shorter... + +######################################################################## +# Layout of opaque area is following. +# +# unsigned __int64 h[3]; # current hash value base 2^44 +# unsigned __int64 s[2]; # key value*20 base 2^44 +# unsigned __int64 r[3]; # key value base 2^44 +# struct { unsigned __int64 r^1, r^3, r^2, r^4; } R[4]; +# # r^n positions reflect +# # placement in register, not +# # memory, R[3] is R[1]*20 + +$code.=<<___; +.type poly1305_init_base2_44,\@function,3 +.align 32 +poly1305_init_base2_44: + xor %eax,%eax + mov %rax,0($ctx) # initialize hash value + mov %rax,8($ctx) + mov %rax,16($ctx) + +.Linit_base2_44: + lea poly1305_blocks_vpmadd52(%rip),%r10 + lea poly1305_emit_base2_44(%rip),%r11 + + mov \$0x0ffffffc0fffffff,%rax + mov \$0x0ffffffc0ffffffc,%rcx + and 0($inp),%rax + mov \$0x00000fffffffffff,%r8 + and 8($inp),%rcx + mov \$0x00000fffffffffff,%r9 + and %rax,%r8 + shrd \$44,%rcx,%rax + mov %r8,40($ctx) # r0 + and %r9,%rax + shr \$24,%rcx + mov %rax,48($ctx) # r1 + lea (%rax,%rax,4),%rax # *5 + mov %rcx,56($ctx) # r2 + shl \$2,%rax # magic <<2 + lea (%rcx,%rcx,4),%rcx # *5 + shl \$2,%rcx # magic <<2 + mov %rax,24($ctx) # s1 + mov %rcx,32($ctx) # s2 + movq \$-1,64($ctx) # write impossible value +___ +$code.=<<___ if ($flavour !~ /elf32/); + mov %r10,0(%rdx) + mov %r11,8(%rdx) +___ +$code.=<<___ if ($flavour =~ /elf32/); + mov %r10d,0(%rdx) + mov %r11d,4(%rdx) +___ +$code.=<<___; + mov \$1,%eax + ret +.size poly1305_init_base2_44,.-poly1305_init_base2_44 +___ +{ +my ($H0,$H1,$H2,$r2r1r0,$r1r0s2,$r0s2s1,$Dlo,$Dhi) = map("%ymm$_",(0..5,16,17)); +my ($T0,$inp_permd,$inp_shift,$PAD) = map("%ymm$_",(18..21)); +my ($reduc_mask,$reduc_rght,$reduc_left) = map("%ymm$_",(22..25)); + +$code.=<<___; +.type poly1305_blocks_vpmadd52,\@function,4 +.align 32 +poly1305_blocks_vpmadd52: + shr \$4,$len + jz .Lno_data_vpmadd52 # too short + + shl \$40,$padbit + mov 64($ctx),%r8 # peek on power of the key + + # if powers of the key are not calculated yet, process up to 3 + # blocks with this single-block subroutine, otherwise ensure that + # length is divisible by 2 blocks and pass the rest down to next + # subroutine... + + mov \$3,%rax + mov \$1,%r10 + cmp \$4,$len # is input long + cmovae %r10,%rax + test %r8,%r8 # is power value impossible? + cmovns %r10,%rax + + and $len,%rax # is input of favourable length? + jz .Lblocks_vpmadd52_4x + + sub %rax,$len + mov \$7,%r10d + mov \$1,%r11d + kmovw %r10d,%k7 + lea .L2_44_inp_permd(%rip),%r10 + kmovw %r11d,%k1 + + vmovq $padbit,%x#$PAD + vmovdqa64 0(%r10),$inp_permd # .L2_44_inp_permd + vmovdqa64 32(%r10),$inp_shift # .L2_44_inp_shift + vpermq \$0xcf,$PAD,$PAD + vmovdqa64 64(%r10),$reduc_mask # .L2_44_mask + + vmovdqu64 0($ctx),${Dlo}{%k7}{z} # load hash value + vmovdqu64 40($ctx),${r2r1r0}{%k7}{z} # load keys + vmovdqu64 32($ctx),${r1r0s2}{%k7}{z} + vmovdqu64 24($ctx),${r0s2s1}{%k7}{z} + + vmovdqa64 96(%r10),$reduc_rght # .L2_44_shift_rgt + vmovdqa64 128(%r10),$reduc_left # .L2_44_shift_lft + + jmp .Loop_vpmadd52 + +.align 32 +.Loop_vpmadd52: + vmovdqu32 0($inp),%x#$T0 # load input as ----3210 + lea 16($inp),$inp + + vpermd $T0,$inp_permd,$T0 # ----3210 -> --322110 + vpsrlvq $inp_shift,$T0,$T0 + vpandq $reduc_mask,$T0,$T0 + vporq $PAD,$T0,$T0 + + vpaddq $T0,$Dlo,$Dlo # accumulate input + + vpermq \$0,$Dlo,${H0}{%k7}{z} # smash hash value + vpermq \$0b01010101,$Dlo,${H1}{%k7}{z} + vpermq \$0b10101010,$Dlo,${H2}{%k7}{z} + + vpxord $Dlo,$Dlo,$Dlo + vpxord $Dhi,$Dhi,$Dhi + + vpmadd52luq $r2r1r0,$H0,$Dlo + vpmadd52huq $r2r1r0,$H0,$Dhi + + vpmadd52luq $r1r0s2,$H1,$Dlo + vpmadd52huq $r1r0s2,$H1,$Dhi + + vpmadd52luq $r0s2s1,$H2,$Dlo + vpmadd52huq $r0s2s1,$H2,$Dhi + + vpsrlvq $reduc_rght,$Dlo,$T0 # 0 in topmost qword + vpsllvq $reduc_left,$Dhi,$Dhi # 0 in topmost qword + vpandq $reduc_mask,$Dlo,$Dlo + + vpaddq $T0,$Dhi,$Dhi + + vpermq \$0b10010011,$Dhi,$Dhi # 0 in lowest qword + + vpaddq $Dhi,$Dlo,$Dlo # note topmost qword :-) + + vpsrlvq $reduc_rght,$Dlo,$T0 # 0 in topmost word + vpandq $reduc_mask,$Dlo,$Dlo + + vpermq \$0b10010011,$T0,$T0 + + vpaddq $T0,$Dlo,$Dlo + + vpermq \$0b10010011,$Dlo,${T0}{%k1}{z} + + vpaddq $T0,$Dlo,$Dlo + vpsllq \$2,$T0,$T0 + + vpaddq $T0,$Dlo,$Dlo + + dec %rax # len-=16 + jnz .Loop_vpmadd52 + + vmovdqu64 $Dlo,0($ctx){%k7} # store hash value + + test $len,$len + jnz .Lblocks_vpmadd52_4x + +.Lno_data_vpmadd52: + ret +.size poly1305_blocks_vpmadd52,.-poly1305_blocks_vpmadd52 +___ +} +{ +######################################################################## +# As implied by its name 4x subroutine processes 4 blocks in parallel +# (but handles even 4*n+2 blocks lengths). It takes up to 4th key power +# and is handled in 256-bit %ymm registers. + +my ($H0,$H1,$H2,$R0,$R1,$R2,$S1,$S2) = map("%ymm$_",(0..5,16,17)); +my ($D0lo,$D0hi,$D1lo,$D1hi,$D2lo,$D2hi) = map("%ymm$_",(18..23)); +my ($T0,$T1,$T2,$T3,$mask44,$mask42,$tmp,$PAD) = map("%ymm$_",(24..31)); + +$code.=<<___; +.type poly1305_blocks_vpmadd52_4x,\@function,4 +.align 32 +poly1305_blocks_vpmadd52_4x: + shr \$4,$len + jz .Lno_data_vpmadd52_4x # too short + + shl \$40,$padbit + mov 64($ctx),%r8 # peek on power of the key + +.Lblocks_vpmadd52_4x: + vpbroadcastq $padbit,$PAD + + vmovdqa64 .Lx_mask44(%rip),$mask44 + mov \$5,%eax + vmovdqa64 .Lx_mask42(%rip),$mask42 + kmovw %eax,%k1 # used in 2x path + + test %r8,%r8 # is power value impossible? + js .Linit_vpmadd52 # if it is, then init R[4] + + vmovq 0($ctx),%x#$H0 # load current hash value + vmovq 8($ctx),%x#$H1 + vmovq 16($ctx),%x#$H2 + + test \$3,$len # is length 4*n+2? + jnz .Lblocks_vpmadd52_2x_do + +.Lblocks_vpmadd52_4x_do: + vpbroadcastq 64($ctx),$R0 # load 4th power of the key + vpbroadcastq 96($ctx),$R1 + vpbroadcastq 128($ctx),$R2 + vpbroadcastq 160($ctx),$S1 + +.Lblocks_vpmadd52_4x_key_loaded: + vpsllq \$2,$R2,$S2 # S2 = R2*5*4 + vpaddq $R2,$S2,$S2 + vpsllq \$2,$S2,$S2 + + test \$7,$len # is len 8*n? + jz .Lblocks_vpmadd52_8x + + vmovdqu64 16*0($inp),$T2 # load data + vmovdqu64 16*2($inp),$T3 + lea 16*4($inp),$inp + + vpunpcklqdq $T3,$T2,$T1 # transpose data + vpunpckhqdq $T3,$T2,$T3 + + # at this point 64-bit lanes are ordered as 3-1-2-0 + + vpsrlq \$24,$T3,$T2 # splat the data + vporq $PAD,$T2,$T2 + vpaddq $T2,$H2,$H2 # accumulate input + vpandq $mask44,$T1,$T0 + vpsrlq \$44,$T1,$T1 + vpsllq \$20,$T3,$T3 + vporq $T3,$T1,$T1 + vpandq $mask44,$T1,$T1 + + sub \$4,$len + jz .Ltail_vpmadd52_4x + jmp .Loop_vpmadd52_4x + ud2 + +.align 32 +.Linit_vpmadd52: + vmovq 24($ctx),%x#$S1 # load key + vmovq 56($ctx),%x#$H2 + vmovq 32($ctx),%x#$S2 + vmovq 40($ctx),%x#$R0 + vmovq 48($ctx),%x#$R1 + + vmovdqa $R0,$H0 + vmovdqa $R1,$H1 + vmovdqa $H2,$R2 + + mov \$2,%eax + +.Lmul_init_vpmadd52: + vpxorq $D0lo,$D0lo,$D0lo + vpmadd52luq $H2,$S1,$D0lo + vpxorq $D0hi,$D0hi,$D0hi + vpmadd52huq $H2,$S1,$D0hi + vpxorq $D1lo,$D1lo,$D1lo + vpmadd52luq $H2,$S2,$D1lo + vpxorq $D1hi,$D1hi,$D1hi + vpmadd52huq $H2,$S2,$D1hi + vpxorq $D2lo,$D2lo,$D2lo + vpmadd52luq $H2,$R0,$D2lo + vpxorq $D2hi,$D2hi,$D2hi + vpmadd52huq $H2,$R0,$D2hi + + vpmadd52luq $H0,$R0,$D0lo + vpmadd52huq $H0,$R0,$D0hi + vpmadd52luq $H0,$R1,$D1lo + vpmadd52huq $H0,$R1,$D1hi + vpmadd52luq $H0,$R2,$D2lo + vpmadd52huq $H0,$R2,$D2hi + + vpmadd52luq $H1,$S2,$D0lo + vpmadd52huq $H1,$S2,$D0hi + vpmadd52luq $H1,$R0,$D1lo + vpmadd52huq $H1,$R0,$D1hi + vpmadd52luq $H1,$R1,$D2lo + vpmadd52huq $H1,$R1,$D2hi + + ################################################################ + # partial reduction + vpsrlq \$44,$D0lo,$tmp + vpsllq \$8,$D0hi,$D0hi + vpandq $mask44,$D0lo,$H0 + vpaddq $tmp,$D0hi,$D0hi + + vpaddq $D0hi,$D1lo,$D1lo + + vpsrlq \$44,$D1lo,$tmp + vpsllq \$8,$D1hi,$D1hi + vpandq $mask44,$D1lo,$H1 + vpaddq $tmp,$D1hi,$D1hi + + vpaddq $D1hi,$D2lo,$D2lo + + vpsrlq \$42,$D2lo,$tmp + vpsllq \$10,$D2hi,$D2hi + vpandq $mask42,$D2lo,$H2 + vpaddq $tmp,$D2hi,$D2hi + + vpaddq $D2hi,$H0,$H0 + vpsllq \$2,$D2hi,$D2hi + + vpaddq $D2hi,$H0,$H0 + + vpsrlq \$44,$H0,$tmp # additional step + vpandq $mask44,$H0,$H0 + + vpaddq $tmp,$H1,$H1 + + dec %eax + jz .Ldone_init_vpmadd52 + + vpunpcklqdq $R1,$H1,$R1 # 1,2 + vpbroadcastq %x#$H1,%x#$H1 # 2,2 + vpunpcklqdq $R2,$H2,$R2 + vpbroadcastq %x#$H2,%x#$H2 + vpunpcklqdq $R0,$H0,$R0 + vpbroadcastq %x#$H0,%x#$H0 + + vpsllq \$2,$R1,$S1 # S1 = R1*5*4 + vpsllq \$2,$R2,$S2 # S2 = R2*5*4 + vpaddq $R1,$S1,$S1 + vpaddq $R2,$S2,$S2 + vpsllq \$2,$S1,$S1 + vpsllq \$2,$S2,$S2 + + jmp .Lmul_init_vpmadd52 + ud2 + +.align 32 +.Ldone_init_vpmadd52: + vinserti128 \$1,%x#$R1,$H1,$R1 # 1,2,3,4 + vinserti128 \$1,%x#$R2,$H2,$R2 + vinserti128 \$1,%x#$R0,$H0,$R0 + + vpermq \$0b11011000,$R1,$R1 # 1,3,2,4 + vpermq \$0b11011000,$R2,$R2 + vpermq \$0b11011000,$R0,$R0 + + vpsllq \$2,$R1,$S1 # S1 = R1*5*4 + vpaddq $R1,$S1,$S1 + vpsllq \$2,$S1,$S1 + + vmovq 0($ctx),%x#$H0 # load current hash value + vmovq 8($ctx),%x#$H1 + vmovq 16($ctx),%x#$H2 + + test \$3,$len # is length 4*n+2? + jnz .Ldone_init_vpmadd52_2x + + vmovdqu64 $R0,64($ctx) # save key powers + vpbroadcastq %x#$R0,$R0 # broadcast 4th power + vmovdqu64 $R1,96($ctx) + vpbroadcastq %x#$R1,$R1 + vmovdqu64 $R2,128($ctx) + vpbroadcastq %x#$R2,$R2 + vmovdqu64 $S1,160($ctx) + vpbroadcastq %x#$S1,$S1 + + jmp .Lblocks_vpmadd52_4x_key_loaded + ud2 + +.align 32 +.Ldone_init_vpmadd52_2x: + vmovdqu64 $R0,64($ctx) # save key powers + vpsrldq \$8,$R0,$R0 # 0-1-0-2 + vmovdqu64 $R1,96($ctx) + vpsrldq \$8,$R1,$R1 + vmovdqu64 $R2,128($ctx) + vpsrldq \$8,$R2,$R2 + vmovdqu64 $S1,160($ctx) + vpsrldq \$8,$S1,$S1 + jmp .Lblocks_vpmadd52_2x_key_loaded + ud2 + +.align 32 +.Lblocks_vpmadd52_2x_do: + vmovdqu64 128+8($ctx),${R2}{%k1}{z}# load 2nd and 1st key powers + vmovdqu64 160+8($ctx),${S1}{%k1}{z} + vmovdqu64 64+8($ctx),${R0}{%k1}{z} + vmovdqu64 96+8($ctx),${R1}{%k1}{z} + +.Lblocks_vpmadd52_2x_key_loaded: + vmovdqu64 16*0($inp),$T2 # load data + vpxorq $T3,$T3,$T3 + lea 16*2($inp),$inp + + vpunpcklqdq $T3,$T2,$T1 # transpose data + vpunpckhqdq $T3,$T2,$T3 + + # at this point 64-bit lanes are ordered as x-1-x-0 + + vpsrlq \$24,$T3,$T2 # splat the data + vporq $PAD,$T2,$T2 + vpaddq $T2,$H2,$H2 # accumulate input + vpandq $mask44,$T1,$T0 + vpsrlq \$44,$T1,$T1 + vpsllq \$20,$T3,$T3 + vporq $T3,$T1,$T1 + vpandq $mask44,$T1,$T1 + + jmp .Ltail_vpmadd52_2x + ud2 + +.align 32 +.Loop_vpmadd52_4x: + #vpaddq $T2,$H2,$H2 # accumulate input + vpaddq $T0,$H0,$H0 + vpaddq $T1,$H1,$H1 + + vpxorq $D0lo,$D0lo,$D0lo + vpmadd52luq $H2,$S1,$D0lo + vpxorq $D0hi,$D0hi,$D0hi + vpmadd52huq $H2,$S1,$D0hi + vpxorq $D1lo,$D1lo,$D1lo + vpmadd52luq $H2,$S2,$D1lo + vpxorq $D1hi,$D1hi,$D1hi + vpmadd52huq $H2,$S2,$D1hi + vpxorq $D2lo,$D2lo,$D2lo + vpmadd52luq $H2,$R0,$D2lo + vpxorq $D2hi,$D2hi,$D2hi + vpmadd52huq $H2,$R0,$D2hi + + vmovdqu64 16*0($inp),$T2 # load data + vmovdqu64 16*2($inp),$T3 + lea 16*4($inp),$inp + vpmadd52luq $H0,$R0,$D0lo + vpmadd52huq $H0,$R0,$D0hi + vpmadd52luq $H0,$R1,$D1lo + vpmadd52huq $H0,$R1,$D1hi + vpmadd52luq $H0,$R2,$D2lo + vpmadd52huq $H0,$R2,$D2hi + + vpunpcklqdq $T3,$T2,$T1 # transpose data + vpunpckhqdq $T3,$T2,$T3 + vpmadd52luq $H1,$S2,$D0lo + vpmadd52huq $H1,$S2,$D0hi + vpmadd52luq $H1,$R0,$D1lo + vpmadd52huq $H1,$R0,$D1hi + vpmadd52luq $H1,$R1,$D2lo + vpmadd52huq $H1,$R1,$D2hi + + ################################################################ + # partial reduction (interleaved with data splat) + vpsrlq \$44,$D0lo,$tmp + vpsllq \$8,$D0hi,$D0hi + vpandq $mask44,$D0lo,$H0 + vpaddq $tmp,$D0hi,$D0hi + + vpsrlq \$24,$T3,$T2 + vporq $PAD,$T2,$T2 + vpaddq $D0hi,$D1lo,$D1lo + + vpsrlq \$44,$D1lo,$tmp + vpsllq \$8,$D1hi,$D1hi + vpandq $mask44,$D1lo,$H1 + vpaddq $tmp,$D1hi,$D1hi + + vpandq $mask44,$T1,$T0 + vpsrlq \$44,$T1,$T1 + vpsllq \$20,$T3,$T3 + vpaddq $D1hi,$D2lo,$D2lo + + vpsrlq \$42,$D2lo,$tmp + vpsllq \$10,$D2hi,$D2hi + vpandq $mask42,$D2lo,$H2 + vpaddq $tmp,$D2hi,$D2hi + + vpaddq $T2,$H2,$H2 # accumulate input + vpaddq $D2hi,$H0,$H0 + vpsllq \$2,$D2hi,$D2hi + + vpaddq $D2hi,$H0,$H0 + vporq $T3,$T1,$T1 + vpandq $mask44,$T1,$T1 + + vpsrlq \$44,$H0,$tmp # additional step + vpandq $mask44,$H0,$H0 + + vpaddq $tmp,$H1,$H1 + + sub \$4,$len # len-=64 + jnz .Loop_vpmadd52_4x + +.Ltail_vpmadd52_4x: + vmovdqu64 128($ctx),$R2 # load all key powers + vmovdqu64 160($ctx),$S1 + vmovdqu64 64($ctx),$R0 + vmovdqu64 96($ctx),$R1 + +.Ltail_vpmadd52_2x: + vpsllq \$2,$R2,$S2 # S2 = R2*5*4 + vpaddq $R2,$S2,$S2 + vpsllq \$2,$S2,$S2 + + #vpaddq $T2,$H2,$H2 # accumulate input + vpaddq $T0,$H0,$H0 + vpaddq $T1,$H1,$H1 + + vpxorq $D0lo,$D0lo,$D0lo + vpmadd52luq $H2,$S1,$D0lo + vpxorq $D0hi,$D0hi,$D0hi + vpmadd52huq $H2,$S1,$D0hi + vpxorq $D1lo,$D1lo,$D1lo + vpmadd52luq $H2,$S2,$D1lo + vpxorq $D1hi,$D1hi,$D1hi + vpmadd52huq $H2,$S2,$D1hi + vpxorq $D2lo,$D2lo,$D2lo + vpmadd52luq $H2,$R0,$D2lo + vpxorq $D2hi,$D2hi,$D2hi + vpmadd52huq $H2,$R0,$D2hi + + vpmadd52luq $H0,$R0,$D0lo + vpmadd52huq $H0,$R0,$D0hi + vpmadd52luq $H0,$R1,$D1lo + vpmadd52huq $H0,$R1,$D1hi + vpmadd52luq $H0,$R2,$D2lo + vpmadd52huq $H0,$R2,$D2hi + + vpmadd52luq $H1,$S2,$D0lo + vpmadd52huq $H1,$S2,$D0hi + vpmadd52luq $H1,$R0,$D1lo + vpmadd52huq $H1,$R0,$D1hi + vpmadd52luq $H1,$R1,$D2lo + vpmadd52huq $H1,$R1,$D2hi + + ################################################################ + # horizontal addition + + mov \$1,%eax + kmovw %eax,%k1 + vpsrldq \$8,$D0lo,$T0 + vpsrldq \$8,$D0hi,$H0 + vpsrldq \$8,$D1lo,$T1 + vpsrldq \$8,$D1hi,$H1 + vpaddq $T0,$D0lo,$D0lo + vpaddq $H0,$D0hi,$D0hi + vpsrldq \$8,$D2lo,$T2 + vpsrldq \$8,$D2hi,$H2 + vpaddq $T1,$D1lo,$D1lo + vpaddq $H1,$D1hi,$D1hi + vpermq \$0x2,$D0lo,$T0 + vpermq \$0x2,$D0hi,$H0 + vpaddq $T2,$D2lo,$D2lo + vpaddq $H2,$D2hi,$D2hi + + vpermq \$0x2,$D1lo,$T1 + vpermq \$0x2,$D1hi,$H1 + vpaddq $T0,$D0lo,${D0lo}{%k1}{z} + vpaddq $H0,$D0hi,${D0hi}{%k1}{z} + vpermq \$0x2,$D2lo,$T2 + vpermq \$0x2,$D2hi,$H2 + vpaddq $T1,$D1lo,${D1lo}{%k1}{z} + vpaddq $H1,$D1hi,${D1hi}{%k1}{z} + vpaddq $T2,$D2lo,${D2lo}{%k1}{z} + vpaddq $H2,$D2hi,${D2hi}{%k1}{z} + + ################################################################ + # partial reduction + vpsrlq \$44,$D0lo,$tmp + vpsllq \$8,$D0hi,$D0hi + vpandq $mask44,$D0lo,$H0 + vpaddq $tmp,$D0hi,$D0hi + + vpaddq $D0hi,$D1lo,$D1lo + + vpsrlq \$44,$D1lo,$tmp + vpsllq \$8,$D1hi,$D1hi + vpandq $mask44,$D1lo,$H1 + vpaddq $tmp,$D1hi,$D1hi + + vpaddq $D1hi,$D2lo,$D2lo + + vpsrlq \$42,$D2lo,$tmp + vpsllq \$10,$D2hi,$D2hi + vpandq $mask42,$D2lo,$H2 + vpaddq $tmp,$D2hi,$D2hi + + vpaddq $D2hi,$H0,$H0 + vpsllq \$2,$D2hi,$D2hi + + vpaddq $D2hi,$H0,$H0 + + vpsrlq \$44,$H0,$tmp # additional step + vpandq $mask44,$H0,$H0 + + vpaddq $tmp,$H1,$H1 + # at this point $len is + # either 4*n+2 or 0... + sub \$2,$len # len-=32 + ja .Lblocks_vpmadd52_4x_do + + vmovq %x#$H0,0($ctx) + vmovq %x#$H1,8($ctx) + vmovq %x#$H2,16($ctx) + vzeroall + +.Lno_data_vpmadd52_4x: + ret +.size poly1305_blocks_vpmadd52_4x,.-poly1305_blocks_vpmadd52_4x +___ +} +{ +######################################################################## +# As implied by its name 8x subroutine processes 8 blocks in parallel... +# This is intermediate version, as it's used only in cases when input +# length is either 8*n, 8*n+1 or 8*n+2... + +my ($H0,$H1,$H2,$R0,$R1,$R2,$S1,$S2) = map("%ymm$_",(0..5,16,17)); +my ($D0lo,$D0hi,$D1lo,$D1hi,$D2lo,$D2hi) = map("%ymm$_",(18..23)); +my ($T0,$T1,$T2,$T3,$mask44,$mask42,$tmp,$PAD) = map("%ymm$_",(24..31)); +my ($RR0,$RR1,$RR2,$SS1,$SS2) = map("%ymm$_",(6..10)); + +$code.=<<___; +.type poly1305_blocks_vpmadd52_8x,\@function,4 +.align 32 +poly1305_blocks_vpmadd52_8x: + shr \$4,$len + jz .Lno_data_vpmadd52_8x # too short + + shl \$40,$padbit + mov 64($ctx),%r8 # peek on power of the key + + vmovdqa64 .Lx_mask44(%rip),$mask44 + vmovdqa64 .Lx_mask42(%rip),$mask42 + + test %r8,%r8 # is power value impossible? + js .Linit_vpmadd52 # if it is, then init R[4] + + vmovq 0($ctx),%x#$H0 # load current hash value + vmovq 8($ctx),%x#$H1 + vmovq 16($ctx),%x#$H2 + +.Lblocks_vpmadd52_8x: + ################################################################ + # fist we calculate more key powers + + vmovdqu64 128($ctx),$R2 # load 1-3-2-4 powers + vmovdqu64 160($ctx),$S1 + vmovdqu64 64($ctx),$R0 + vmovdqu64 96($ctx),$R1 + + vpsllq \$2,$R2,$S2 # S2 = R2*5*4 + vpaddq $R2,$S2,$S2 + vpsllq \$2,$S2,$S2 + + vpbroadcastq %x#$R2,$RR2 # broadcast 4th power + vpbroadcastq %x#$R0,$RR0 + vpbroadcastq %x#$R1,$RR1 + + vpxorq $D0lo,$D0lo,$D0lo + vpmadd52luq $RR2,$S1,$D0lo + vpxorq $D0hi,$D0hi,$D0hi + vpmadd52huq $RR2,$S1,$D0hi + vpxorq $D1lo,$D1lo,$D1lo + vpmadd52luq $RR2,$S2,$D1lo + vpxorq $D1hi,$D1hi,$D1hi + vpmadd52huq $RR2,$S2,$D1hi + vpxorq $D2lo,$D2lo,$D2lo + vpmadd52luq $RR2,$R0,$D2lo + vpxorq $D2hi,$D2hi,$D2hi + vpmadd52huq $RR2,$R0,$D2hi + + vpmadd52luq $RR0,$R0,$D0lo + vpmadd52huq $RR0,$R0,$D0hi + vpmadd52luq $RR0,$R1,$D1lo + vpmadd52huq $RR0,$R1,$D1hi + vpmadd52luq $RR0,$R2,$D2lo + vpmadd52huq $RR0,$R2,$D2hi + + vpmadd52luq $RR1,$S2,$D0lo + vpmadd52huq $RR1,$S2,$D0hi + vpmadd52luq $RR1,$R0,$D1lo + vpmadd52huq $RR1,$R0,$D1hi + vpmadd52luq $RR1,$R1,$D2lo + vpmadd52huq $RR1,$R1,$D2hi + + ################################################################ + # partial reduction + vpsrlq \$44,$D0lo,$tmp + vpsllq \$8,$D0hi,$D0hi + vpandq $mask44,$D0lo,$RR0 + vpaddq $tmp,$D0hi,$D0hi + + vpaddq $D0hi,$D1lo,$D1lo + + vpsrlq \$44,$D1lo,$tmp + vpsllq \$8,$D1hi,$D1hi + vpandq $mask44,$D1lo,$RR1 + vpaddq $tmp,$D1hi,$D1hi + + vpaddq $D1hi,$D2lo,$D2lo + + vpsrlq \$42,$D2lo,$tmp + vpsllq \$10,$D2hi,$D2hi + vpandq $mask42,$D2lo,$RR2 + vpaddq $tmp,$D2hi,$D2hi + + vpaddq $D2hi,$RR0,$RR0 + vpsllq \$2,$D2hi,$D2hi + + vpaddq $D2hi,$RR0,$RR0 + + vpsrlq \$44,$RR0,$tmp # additional step + vpandq $mask44,$RR0,$RR0 + + vpaddq $tmp,$RR1,$RR1 + + ################################################################ + # At this point Rx holds 1324 powers, RRx - 5768, and the goal + # is 15263748, which reflects how data is loaded... + + vpunpcklqdq $R2,$RR2,$T2 # 3748 + vpunpckhqdq $R2,$RR2,$R2 # 1526 + vpunpcklqdq $R0,$RR0,$T0 + vpunpckhqdq $R0,$RR0,$R0 + vpunpcklqdq $R1,$RR1,$T1 + vpunpckhqdq $R1,$RR1,$R1 +___ +######## switch to %zmm +map(s/%y/%z/, $H0,$H1,$H2,$R0,$R1,$R2,$S1,$S2); +map(s/%y/%z/, $D0lo,$D0hi,$D1lo,$D1hi,$D2lo,$D2hi); +map(s/%y/%z/, $T0,$T1,$T2,$T3,$mask44,$mask42,$tmp,$PAD); +map(s/%y/%z/, $RR0,$RR1,$RR2,$SS1,$SS2); + +$code.=<<___; + vshufi64x2 \$0x44,$R2,$T2,$RR2 # 15263748 + vshufi64x2 \$0x44,$R0,$T0,$RR0 + vshufi64x2 \$0x44,$R1,$T1,$RR1 + + vmovdqu64 16*0($inp),$T2 # load data + vmovdqu64 16*4($inp),$T3 + lea 16*8($inp),$inp + + vpsllq \$2,$RR2,$SS2 # S2 = R2*5*4 + vpsllq \$2,$RR1,$SS1 # S1 = R1*5*4 + vpaddq $RR2,$SS2,$SS2 + vpaddq $RR1,$SS1,$SS1 + vpsllq \$2,$SS2,$SS2 + vpsllq \$2,$SS1,$SS1 + + vpbroadcastq $padbit,$PAD + vpbroadcastq %x#$mask44,$mask44 + vpbroadcastq %x#$mask42,$mask42 + + vpbroadcastq %x#$SS1,$S1 # broadcast 8th power + vpbroadcastq %x#$SS2,$S2 + vpbroadcastq %x#$RR0,$R0 + vpbroadcastq %x#$RR1,$R1 + vpbroadcastq %x#$RR2,$R2 + + vpunpcklqdq $T3,$T2,$T1 # transpose data + vpunpckhqdq $T3,$T2,$T3 + + # at this point 64-bit lanes are ordered as 73625140 + + vpsrlq \$24,$T3,$T2 # splat the data + vporq $PAD,$T2,$T2 + vpaddq $T2,$H2,$H2 # accumulate input + vpandq $mask44,$T1,$T0 + vpsrlq \$44,$T1,$T1 + vpsllq \$20,$T3,$T3 + vporq $T3,$T1,$T1 + vpandq $mask44,$T1,$T1 + + sub \$8,$len + jz .Ltail_vpmadd52_8x + jmp .Loop_vpmadd52_8x + +.align 32 +.Loop_vpmadd52_8x: + #vpaddq $T2,$H2,$H2 # accumulate input + vpaddq $T0,$H0,$H0 + vpaddq $T1,$H1,$H1 + + vpxorq $D0lo,$D0lo,$D0lo + vpmadd52luq $H2,$S1,$D0lo + vpxorq $D0hi,$D0hi,$D0hi + vpmadd52huq $H2,$S1,$D0hi + vpxorq $D1lo,$D1lo,$D1lo + vpmadd52luq $H2,$S2,$D1lo + vpxorq $D1hi,$D1hi,$D1hi + vpmadd52huq $H2,$S2,$D1hi + vpxorq $D2lo,$D2lo,$D2lo + vpmadd52luq $H2,$R0,$D2lo + vpxorq $D2hi,$D2hi,$D2hi + vpmadd52huq $H2,$R0,$D2hi + + vmovdqu64 16*0($inp),$T2 # load data + vmovdqu64 16*4($inp),$T3 + lea 16*8($inp),$inp + vpmadd52luq $H0,$R0,$D0lo + vpmadd52huq $H0,$R0,$D0hi + vpmadd52luq $H0,$R1,$D1lo + vpmadd52huq $H0,$R1,$D1hi + vpmadd52luq $H0,$R2,$D2lo + vpmadd52huq $H0,$R2,$D2hi + + vpunpcklqdq $T3,$T2,$T1 # transpose data + vpunpckhqdq $T3,$T2,$T3 + vpmadd52luq $H1,$S2,$D0lo + vpmadd52huq $H1,$S2,$D0hi + vpmadd52luq $H1,$R0,$D1lo + vpmadd52huq $H1,$R0,$D1hi + vpmadd52luq $H1,$R1,$D2lo + vpmadd52huq $H1,$R1,$D2hi + + ################################################################ + # partial reduction (interleaved with data splat) + vpsrlq \$44,$D0lo,$tmp + vpsllq \$8,$D0hi,$D0hi + vpandq $mask44,$D0lo,$H0 + vpaddq $tmp,$D0hi,$D0hi + + vpsrlq \$24,$T3,$T2 + vporq $PAD,$T2,$T2 + vpaddq $D0hi,$D1lo,$D1lo + + vpsrlq \$44,$D1lo,$tmp + vpsllq \$8,$D1hi,$D1hi + vpandq $mask44,$D1lo,$H1 + vpaddq $tmp,$D1hi,$D1hi + + vpandq $mask44,$T1,$T0 + vpsrlq \$44,$T1,$T1 + vpsllq \$20,$T3,$T3 + vpaddq $D1hi,$D2lo,$D2lo + + vpsrlq \$42,$D2lo,$tmp + vpsllq \$10,$D2hi,$D2hi + vpandq $mask42,$D2lo,$H2 + vpaddq $tmp,$D2hi,$D2hi + + vpaddq $T2,$H2,$H2 # accumulate input + vpaddq $D2hi,$H0,$H0 + vpsllq \$2,$D2hi,$D2hi + + vpaddq $D2hi,$H0,$H0 + vporq $T3,$T1,$T1 + vpandq $mask44,$T1,$T1 + + vpsrlq \$44,$H0,$tmp # additional step + vpandq $mask44,$H0,$H0 + + vpaddq $tmp,$H1,$H1 + + sub \$8,$len # len-=128 + jnz .Loop_vpmadd52_8x + +.Ltail_vpmadd52_8x: + #vpaddq $T2,$H2,$H2 # accumulate input + vpaddq $T0,$H0,$H0 + vpaddq $T1,$H1,$H1 + + vpxorq $D0lo,$D0lo,$D0lo + vpmadd52luq $H2,$SS1,$D0lo + vpxorq $D0hi,$D0hi,$D0hi + vpmadd52huq $H2,$SS1,$D0hi + vpxorq $D1lo,$D1lo,$D1lo + vpmadd52luq $H2,$SS2,$D1lo + vpxorq $D1hi,$D1hi,$D1hi + vpmadd52huq $H2,$SS2,$D1hi + vpxorq $D2lo,$D2lo,$D2lo + vpmadd52luq $H2,$RR0,$D2lo + vpxorq $D2hi,$D2hi,$D2hi + vpmadd52huq $H2,$RR0,$D2hi + + vpmadd52luq $H0,$RR0,$D0lo + vpmadd52huq $H0,$RR0,$D0hi + vpmadd52luq $H0,$RR1,$D1lo + vpmadd52huq $H0,$RR1,$D1hi + vpmadd52luq $H0,$RR2,$D2lo + vpmadd52huq $H0,$RR2,$D2hi + + vpmadd52luq $H1,$SS2,$D0lo + vpmadd52huq $H1,$SS2,$D0hi + vpmadd52luq $H1,$RR0,$D1lo + vpmadd52huq $H1,$RR0,$D1hi + vpmadd52luq $H1,$RR1,$D2lo + vpmadd52huq $H1,$RR1,$D2hi + + ################################################################ + # horizontal addition + + mov \$1,%eax + kmovw %eax,%k1 + vpsrldq \$8,$D0lo,$T0 + vpsrldq \$8,$D0hi,$H0 + vpsrldq \$8,$D1lo,$T1 + vpsrldq \$8,$D1hi,$H1 + vpaddq $T0,$D0lo,$D0lo + vpaddq $H0,$D0hi,$D0hi + vpsrldq \$8,$D2lo,$T2 + vpsrldq \$8,$D2hi,$H2 + vpaddq $T1,$D1lo,$D1lo + vpaddq $H1,$D1hi,$D1hi + vpermq \$0x2,$D0lo,$T0 + vpermq \$0x2,$D0hi,$H0 + vpaddq $T2,$D2lo,$D2lo + vpaddq $H2,$D2hi,$D2hi + + vpermq \$0x2,$D1lo,$T1 + vpermq \$0x2,$D1hi,$H1 + vpaddq $T0,$D0lo,$D0lo + vpaddq $H0,$D0hi,$D0hi + vpermq \$0x2,$D2lo,$T2 + vpermq \$0x2,$D2hi,$H2 + vpaddq $T1,$D1lo,$D1lo + vpaddq $H1,$D1hi,$D1hi + vextracti64x4 \$1,$D0lo,%y#$T0 + vextracti64x4 \$1,$D0hi,%y#$H0 + vpaddq $T2,$D2lo,$D2lo + vpaddq $H2,$D2hi,$D2hi + + vextracti64x4 \$1,$D1lo,%y#$T1 + vextracti64x4 \$1,$D1hi,%y#$H1 + vextracti64x4 \$1,$D2lo,%y#$T2 + vextracti64x4 \$1,$D2hi,%y#$H2 +___ +######## switch back to %ymm +map(s/%z/%y/, $H0,$H1,$H2,$R0,$R1,$R2,$S1,$S2); +map(s/%z/%y/, $D0lo,$D0hi,$D1lo,$D1hi,$D2lo,$D2hi); +map(s/%z/%y/, $T0,$T1,$T2,$T3,$mask44,$mask42,$tmp,$PAD); + +$code.=<<___; + vpaddq $T0,$D0lo,${D0lo}{%k1}{z} + vpaddq $H0,$D0hi,${D0hi}{%k1}{z} + vpaddq $T1,$D1lo,${D1lo}{%k1}{z} + vpaddq $H1,$D1hi,${D1hi}{%k1}{z} + vpaddq $T2,$D2lo,${D2lo}{%k1}{z} + vpaddq $H2,$D2hi,${D2hi}{%k1}{z} + + ################################################################ + # partial reduction + vpsrlq \$44,$D0lo,$tmp + vpsllq \$8,$D0hi,$D0hi + vpandq $mask44,$D0lo,$H0 + vpaddq $tmp,$D0hi,$D0hi + + vpaddq $D0hi,$D1lo,$D1lo + + vpsrlq \$44,$D1lo,$tmp + vpsllq \$8,$D1hi,$D1hi + vpandq $mask44,$D1lo,$H1 + vpaddq $tmp,$D1hi,$D1hi + + vpaddq $D1hi,$D2lo,$D2lo + + vpsrlq \$42,$D2lo,$tmp + vpsllq \$10,$D2hi,$D2hi + vpandq $mask42,$D2lo,$H2 + vpaddq $tmp,$D2hi,$D2hi + + vpaddq $D2hi,$H0,$H0 + vpsllq \$2,$D2hi,$D2hi + + vpaddq $D2hi,$H0,$H0 + + vpsrlq \$44,$H0,$tmp # additional step + vpandq $mask44,$H0,$H0 + + vpaddq $tmp,$H1,$H1 + + ################################################################ + + vmovq %x#$H0,0($ctx) + vmovq %x#$H1,8($ctx) + vmovq %x#$H2,16($ctx) + vzeroall + +.Lno_data_vpmadd52_8x: + ret +.size poly1305_blocks_vpmadd52_8x,.-poly1305_blocks_vpmadd52_8x +___ +} +$code.=<<___; +.type poly1305_emit_base2_44,\@function,3 +.align 32 +poly1305_emit_base2_44: + mov 0($ctx),%r8 # load hash value + mov 8($ctx),%r9 + mov 16($ctx),%r10 + + mov %r9,%rax + shr \$20,%r9 + shl \$44,%rax + mov %r10,%rcx + shr \$40,%r10 + shl \$24,%rcx + + add %rax,%r8 + adc %rcx,%r9 + adc \$0,%r10 + + mov %r8,%rax + add \$5,%r8 # compare to modulus + mov %r9,%rcx + adc \$0,%r9 + adc \$0,%r10 + shr \$2,%r10 # did 130-bit value overflow? + cmovnz %r8,%rax + cmovnz %r9,%rcx + + add 0($nonce),%rax # accumulate nonce + adc 8($nonce),%rcx + mov %rax,0($mac) # write result + mov %rcx,8($mac) + + ret +.size poly1305_emit_base2_44,.-poly1305_emit_base2_44 +___ +} } } +} + +if (!$kernel) +{ # chacha20-poly1305 helpers +my ($out,$inp,$otp,$len)=$win64 ? ("%rcx","%rdx","%r8", "%r9") : # Win64 order + ("%rdi","%rsi","%rdx","%rcx"); # Unix order +$code.=<<___; +.globl xor128_encrypt_n_pad +.type xor128_encrypt_n_pad,\@abi-omnipotent +.align 16 +xor128_encrypt_n_pad: + sub $otp,$inp + sub $otp,$out + mov $len,%r10 # put len aside + shr \$4,$len # len / 16 + jz .Ltail_enc + nop +.Loop_enc_xmm: + movdqu ($inp,$otp),%xmm0 + pxor ($otp),%xmm0 + movdqu %xmm0,($out,$otp) + movdqa %xmm0,($otp) + lea 16($otp),$otp + dec $len + jnz .Loop_enc_xmm + + and \$15,%r10 # len % 16 + jz .Ldone_enc + +.Ltail_enc: + mov \$16,$len + sub %r10,$len + xor %eax,%eax +.Loop_enc_byte: + mov ($inp,$otp),%al + xor ($otp),%al + mov %al,($out,$otp) + mov %al,($otp) + lea 1($otp),$otp + dec %r10 + jnz .Loop_enc_byte + + xor %eax,%eax +.Loop_enc_pad: + mov %al,($otp) + lea 1($otp),$otp + dec $len + jnz .Loop_enc_pad + +.Ldone_enc: + mov $otp,%rax + ret +.size xor128_encrypt_n_pad,.-xor128_encrypt_n_pad + +.globl xor128_decrypt_n_pad +.type xor128_decrypt_n_pad,\@abi-omnipotent +.align 16 +xor128_decrypt_n_pad: + sub $otp,$inp + sub $otp,$out + mov $len,%r10 # put len aside + shr \$4,$len # len / 16 + jz .Ltail_dec + nop +.Loop_dec_xmm: + movdqu ($inp,$otp),%xmm0 + movdqa ($otp),%xmm1 + pxor %xmm0,%xmm1 + movdqu %xmm1,($out,$otp) + movdqa %xmm0,($otp) + lea 16($otp),$otp + dec $len + jnz .Loop_dec_xmm + + pxor %xmm1,%xmm1 + and \$15,%r10 # len % 16 + jz .Ldone_dec + +.Ltail_dec: + mov \$16,$len + sub %r10,$len + xor %eax,%eax + xor %r11d,%r11d +.Loop_dec_byte: + mov ($inp,$otp),%r11b + mov ($otp),%al + xor %r11b,%al + mov %al,($out,$otp) + mov %r11b,($otp) + lea 1($otp),$otp + dec %r10 + jnz .Loop_dec_byte + + xor %eax,%eax +.Loop_dec_pad: + mov %al,($otp) + lea 1($otp),$otp + dec $len + jnz .Loop_dec_pad + +.Ldone_dec: + mov $otp,%rax + ret +.size xor128_decrypt_n_pad,.-xor128_decrypt_n_pad +___ +} + +# EXCEPTION_DISPOSITION handler (EXCEPTION_RECORD *rec,ULONG64 frame, +# CONTEXT *context,DISPATCHER_CONTEXT *disp) +if ($win64) { +$rec="%rcx"; +$frame="%rdx"; +$context="%r8"; +$disp="%r9"; + +$code.=<<___; +.extern __imp_RtlVirtualUnwind +.type se_handler,\@abi-omnipotent +.align 16 +se_handler: + push %rsi + push %rdi + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 + pushfq + sub \$64,%rsp + + mov 120($context),%rax # pull context->Rax + mov 248($context),%rbx # pull context->Rip + + mov 8($disp),%rsi # disp->ImageBase + mov 56($disp),%r11 # disp->HandlerData + + mov 0(%r11),%r10d # HandlerData[0] + lea (%rsi,%r10),%r10 # prologue label + cmp %r10,%rbx # context->Rip<.Lprologue + jb .Lcommon_seh_tail + + mov 152($context),%rax # pull context->Rsp + + mov 4(%r11),%r10d # HandlerData[1] + lea (%rsi,%r10),%r10 # epilogue label + cmp %r10,%rbx # context->Rip>=.Lepilogue + jae .Lcommon_seh_tail + + lea 48(%rax),%rax + + mov -8(%rax),%rbx + mov -16(%rax),%rbp + mov -24(%rax),%r12 + mov -32(%rax),%r13 + mov -40(%rax),%r14 + mov -48(%rax),%r15 + mov %rbx,144($context) # restore context->Rbx + mov %rbp,160($context) # restore context->Rbp + mov %r12,216($context) # restore context->R12 + mov %r13,224($context) # restore context->R13 + mov %r14,232($context) # restore context->R14 + mov %r15,240($context) # restore context->R14 + + jmp .Lcommon_seh_tail +.size se_handler,.-se_handler + +.type avx_handler,\@abi-omnipotent +.align 16 +avx_handler: + push %rsi + push %rdi + push %rbx + push %rbp + push %r12 + push %r13 + push %r14 + push %r15 + pushfq + sub \$64,%rsp + + mov 120($context),%rax # pull context->Rax + mov 248($context),%rbx # pull context->Rip + + mov 8($disp),%rsi # disp->ImageBase + mov 56($disp),%r11 # disp->HandlerData + + mov 0(%r11),%r10d # HandlerData[0] + lea (%rsi,%r10),%r10 # prologue label + cmp %r10,%rbx # context->RipRsp + + mov 4(%r11),%r10d # HandlerData[1] + lea (%rsi,%r10),%r10 # epilogue label + cmp %r10,%rbx # context->Rip>=epilogue label + jae .Lcommon_seh_tail + + mov 208($context),%rax # pull context->R11 + + lea 0x50(%rax),%rsi + lea 0xf8(%rax),%rax + lea 512($context),%rdi # &context.Xmm6 + mov \$20,%ecx + .long 0xa548f3fc # cld; rep movsq + +.Lcommon_seh_tail: + mov 8(%rax),%rdi + mov 16(%rax),%rsi + mov %rax,152($context) # restore context->Rsp + mov %rsi,168($context) # restore context->Rsi + mov %rdi,176($context) # restore context->Rdi + + mov 40($disp),%rdi # disp->ContextRecord + mov $context,%rsi # context + mov \$154,%ecx # sizeof(CONTEXT) + .long 0xa548f3fc # cld; rep movsq + + mov $disp,%rsi + xor %ecx,%ecx # arg1, UNW_FLAG_NHANDLER + mov 8(%rsi),%rdx # arg2, disp->ImageBase + mov 0(%rsi),%r8 # arg3, disp->ControlPc + mov 16(%rsi),%r9 # arg4, disp->FunctionEntry + mov 40(%rsi),%r10 # disp->ContextRecord + lea 56(%rsi),%r11 # &disp->HandlerData + lea 24(%rsi),%r12 # &disp->EstablisherFrame + mov %r10,32(%rsp) # arg5 + mov %r11,40(%rsp) # arg6 + mov %r12,48(%rsp) # arg7 + mov %rcx,56(%rsp) # arg8, (NULL) + call *__imp_RtlVirtualUnwind(%rip) + + mov \$1,%eax # ExceptionContinueSearch + add \$64,%rsp + popfq + pop %r15 + pop %r14 + pop %r13 + pop %r12 + pop %rbp + pop %rbx + pop %rdi + pop %rsi + ret +.size avx_handler,.-avx_handler + +.section .pdata +.align 4 + .rva .LSEH_begin_poly1305_init_x86_64 + .rva .LSEH_end_poly1305_init_x86_64 + .rva .LSEH_info_poly1305_init_x86_64 + + .rva .LSEH_begin_poly1305_blocks_x86_64 + .rva .LSEH_end_poly1305_blocks_x86_64 + .rva .LSEH_info_poly1305_blocks_x86_64 + + .rva .LSEH_begin_poly1305_emit_x86_64 + .rva .LSEH_end_poly1305_emit_x86_64 + .rva .LSEH_info_poly1305_emit_x86_64 +___ +$code.=<<___ if ($avx); + .rva .LSEH_begin_poly1305_blocks_avx + .rva .Lbase2_64_avx + .rva .LSEH_info_poly1305_blocks_avx_1 + + .rva .Lbase2_64_avx + .rva .Leven_avx + .rva .LSEH_info_poly1305_blocks_avx_2 + + .rva .Leven_avx + .rva .LSEH_end_poly1305_blocks_avx + .rva .LSEH_info_poly1305_blocks_avx_3 + + .rva .LSEH_begin_poly1305_emit_avx + .rva .LSEH_end_poly1305_emit_avx + .rva .LSEH_info_poly1305_emit_avx +___ +$code.=<<___ if ($avx>1); + .rva .LSEH_begin_poly1305_blocks_avx2 + .rva .Lbase2_64_avx2 + .rva .LSEH_info_poly1305_blocks_avx2_1 + + .rva .Lbase2_64_avx2 + .rva .Leven_avx2 + .rva .LSEH_info_poly1305_blocks_avx2_2 + + .rva .Leven_avx2 + .rva .LSEH_end_poly1305_blocks_avx2 + .rva .LSEH_info_poly1305_blocks_avx2_3 +___ +$code.=<<___ if ($avx>2); + .rva .LSEH_begin_poly1305_blocks_avx512 + .rva .LSEH_end_poly1305_blocks_avx512 + .rva .LSEH_info_poly1305_blocks_avx512 +___ +$code.=<<___; +.section .xdata +.align 8 +.LSEH_info_poly1305_init_x86_64: + .byte 9,0,0,0 + .rva se_handler + .rva .LSEH_begin_poly1305_init_x86_64,.LSEH_begin_poly1305_init_x86_64 + +.LSEH_info_poly1305_blocks_x86_64: + .byte 9,0,0,0 + .rva se_handler + .rva .Lblocks_body,.Lblocks_epilogue + +.LSEH_info_poly1305_emit_x86_64: + .byte 9,0,0,0 + .rva se_handler + .rva .LSEH_begin_poly1305_emit_x86_64,.LSEH_begin_poly1305_emit_x86_64 +___ +$code.=<<___ if ($avx); +.LSEH_info_poly1305_blocks_avx_1: + .byte 9,0,0,0 + .rva se_handler + .rva .Lblocks_avx_body,.Lblocks_avx_epilogue # HandlerData[] + +.LSEH_info_poly1305_blocks_avx_2: + .byte 9,0,0,0 + .rva se_handler + .rva .Lbase2_64_avx_body,.Lbase2_64_avx_epilogue # HandlerData[] + +.LSEH_info_poly1305_blocks_avx_3: + .byte 9,0,0,0 + .rva avx_handler + .rva .Ldo_avx_body,.Ldo_avx_epilogue # HandlerData[] + +.LSEH_info_poly1305_emit_avx: + .byte 9,0,0,0 + .rva se_handler + .rva .LSEH_begin_poly1305_emit_avx,.LSEH_begin_poly1305_emit_avx +___ +$code.=<<___ if ($avx>1); +.LSEH_info_poly1305_blocks_avx2_1: + .byte 9,0,0,0 + .rva se_handler + .rva .Lblocks_avx2_body,.Lblocks_avx2_epilogue # HandlerData[] + +.LSEH_info_poly1305_blocks_avx2_2: + .byte 9,0,0,0 + .rva se_handler + .rva .Lbase2_64_avx2_body,.Lbase2_64_avx2_epilogue # HandlerData[] + +.LSEH_info_poly1305_blocks_avx2_3: + .byte 9,0,0,0 + .rva avx_handler + .rva .Ldo_avx2_body,.Ldo_avx2_epilogue # HandlerData[] +___ +$code.=<<___ if ($avx>2); +.LSEH_info_poly1305_blocks_avx512: + .byte 9,0,0,0 + .rva avx_handler + .rva .Ldo_avx512_body,.Ldo_avx512_epilogue # HandlerData[] +___ +} + +open SELF,$0; +while() { + next if (/^#!/); + last if (!s/^#/\/\// and !/^$/); + print; +} +close SELF; + +foreach (split('\n',$code)) { + s/\`([^\`]*)\`/eval($1)/ge; + s/%r([a-z]+)#d/%e$1/g; + s/%r([0-9]+)#d/%r$1d/g; + s/%x#%[yz]/%x/g or s/%y#%z/%y/g or s/%z#%[yz]/%z/g; + + if ($kernel) { + s/(^\.type.*),[0-9]+$/\1/; + s/(^\.type.*),\@abi-omnipotent+$/\1,\@function/; + next if /^\.cfi.*/; + } + + print $_,"\n"; +} +close STDOUT; diff --git a/arch/x86/crypto/poly1305_glue.c b/arch/x86/crypto/poly1305_glue.c index 88cc01506c84..e360ec1d9556 100644 --- a/arch/x86/crypto/poly1305_glue.c +++ b/arch/x86/crypto/poly1305_glue.c @@ -1,135 +1,175 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT /* - * Poly1305 authenticator algorithm, RFC7539, SIMD glue code - * - * Copyright (C) 2015 Martin Willi - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. */ #include #include -#include +#include #include +#include #include #include #include #include +#include -struct poly1305_simd_desc_ctx { - struct poly1305_desc_ctx base; - /* derived key u set? */ - bool uset; -#ifdef CONFIG_AS_AVX2 - /* derived keys r^3, r^4 set? */ - bool wset; -#endif - /* derived Poly1305 key r^2 */ - u32 u[5]; - /* ... silently appended r^3 and r^4 when using AVX2 */ +asmlinkage void poly1305_init_x86_64(void *ctx, + const u8 key[POLY1305_KEY_SIZE]); +asmlinkage void poly1305_blocks_x86_64(void *ctx, const u8 *inp, + const size_t len, const u32 padbit); +asmlinkage void poly1305_emit_x86_64(void *ctx, u8 mac[POLY1305_DIGEST_SIZE], + const u32 nonce[4]); +asmlinkage void poly1305_emit_avx(void *ctx, u8 mac[POLY1305_DIGEST_SIZE], + const u32 nonce[4]); +asmlinkage void poly1305_blocks_avx(void *ctx, const u8 *inp, const size_t len, + const u32 padbit); +asmlinkage void poly1305_blocks_avx2(void *ctx, const u8 *inp, const size_t len, + const u32 padbit); +asmlinkage void poly1305_blocks_avx512(void *ctx, const u8 *inp, + const size_t len, const u32 padbit); + +static __ro_after_init DEFINE_STATIC_KEY_FALSE(poly1305_use_avx); +static __ro_after_init DEFINE_STATIC_KEY_FALSE(poly1305_use_avx2); +static __ro_after_init DEFINE_STATIC_KEY_FALSE(poly1305_use_avx512); + +struct poly1305_arch_internal { + union { + struct { + u32 h[5]; + u32 is_base2_26; + }; + u64 hs[3]; + }; + u64 r[2]; + u64 pad; + struct { u32 r2, r1, r4, r3; } rn[9]; }; -asmlinkage void poly1305_block_sse2(u32 *h, const u8 *src, - const u32 *r, unsigned int blocks); -asmlinkage void poly1305_2block_sse2(u32 *h, const u8 *src, const u32 *r, - unsigned int blocks, const u32 *u); -#ifdef CONFIG_AS_AVX2 -asmlinkage void poly1305_4block_avx2(u32 *h, const u8 *src, const u32 *r, - unsigned int blocks, const u32 *u); -static bool poly1305_use_avx2; -#endif - -static int poly1305_simd_init(struct shash_desc *desc) +/* The AVX code uses base 2^26, while the scalar code uses base 2^64. If we hit + * the unfortunate situation of using AVX and then having to go back to scalar + * -- because the user is silly and has called the update function from two + * separate contexts -- then we need to convert back to the original base before + * proceeding. It is possible to reason that the initial reduction below is + * sufficient given the implementation invariants. However, for an avoidance of + * doubt and because this is not performance critical, we do the full reduction + * anyway. Z3 proof of below function: https://xn--4db.cc/ltPtHCKN/py + */ +static void convert_to_base2_64(void *ctx) { - struct poly1305_simd_desc_ctx *sctx = shash_desc_ctx(desc); + struct poly1305_arch_internal *state = ctx; + u32 cy; - sctx->uset = false; -#ifdef CONFIG_AS_AVX2 - sctx->wset = false; -#endif + if (!state->is_base2_26) + return; - return crypto_poly1305_init(desc); + cy = state->h[0] >> 26; state->h[0] &= 0x3ffffff; state->h[1] += cy; + cy = state->h[1] >> 26; state->h[1] &= 0x3ffffff; state->h[2] += cy; + cy = state->h[2] >> 26; state->h[2] &= 0x3ffffff; state->h[3] += cy; + cy = state->h[3] >> 26; state->h[3] &= 0x3ffffff; state->h[4] += cy; + state->hs[0] = ((u64)state->h[2] << 52) | ((u64)state->h[1] << 26) | state->h[0]; + state->hs[1] = ((u64)state->h[4] << 40) | ((u64)state->h[3] << 14) | (state->h[2] >> 12); + state->hs[2] = state->h[4] >> 24; +#define ULT(a, b) ((a ^ ((a ^ b) | ((a - b) ^ b))) >> (sizeof(a) * 8 - 1)) + cy = (state->hs[2] >> 2) + (state->hs[2] & ~3ULL); + state->hs[2] &= 3; + state->hs[0] += cy; + state->hs[1] += (cy = ULT(state->hs[0], cy)); + state->hs[2] += ULT(state->hs[1], cy); +#undef ULT + state->is_base2_26 = 0; } -static void poly1305_simd_mult(u32 *a, const u32 *b) +static void poly1305_simd_init(void *ctx, const u8 key[POLY1305_KEY_SIZE]) { - u8 m[POLY1305_BLOCK_SIZE]; - - memset(m, 0, sizeof(m)); - /* The poly1305 block function adds a hi-bit to the accumulator which - * we don't need for key multiplication; compensate for it. */ - a[4] -= 1 << 24; - poly1305_block_sse2(a, m, b, 1); + poly1305_init_x86_64(ctx, key); } -static unsigned int poly1305_simd_blocks(struct poly1305_desc_ctx *dctx, - const u8 *src, unsigned int srclen) +static void poly1305_simd_blocks(void *ctx, const u8 *inp, size_t len, + const u32 padbit) { - struct poly1305_simd_desc_ctx *sctx; - unsigned int blocks, datalen; + struct poly1305_arch_internal *state = ctx; - BUILD_BUG_ON(offsetof(struct poly1305_simd_desc_ctx, base)); - sctx = container_of(dctx, struct poly1305_simd_desc_ctx, base); + /* SIMD disables preemption, so relax after processing each page. */ + BUILD_BUG_ON(SZ_4K < POLY1305_BLOCK_SIZE || + SZ_4K % POLY1305_BLOCK_SIZE); + if (!IS_ENABLED(CONFIG_AS_AVX) || !static_branch_likely(&poly1305_use_avx) || + (len < (POLY1305_BLOCK_SIZE * 18) && !state->is_base2_26) || + !may_use_simd()) { + convert_to_base2_64(ctx); + poly1305_blocks_x86_64(ctx, inp, len, padbit); + return; + } + + do { + const size_t bytes = min_t(size_t, len, SZ_4K); + + kernel_fpu_begin(); + if (IS_ENABLED(CONFIG_AS_AVX512) && static_branch_likely(&poly1305_use_avx512)) + poly1305_blocks_avx512(ctx, inp, bytes, padbit); + else if (IS_ENABLED(CONFIG_AS_AVX2) && static_branch_likely(&poly1305_use_avx2)) + poly1305_blocks_avx2(ctx, inp, bytes, padbit); + else + poly1305_blocks_avx(ctx, inp, bytes, padbit); + kernel_fpu_end(); + + len -= bytes; + inp += bytes; + } while (len); +} + +static void poly1305_simd_emit(void *ctx, u8 mac[POLY1305_DIGEST_SIZE], + const u32 nonce[4]) +{ + if (!IS_ENABLED(CONFIG_AS_AVX) || !static_branch_likely(&poly1305_use_avx)) + poly1305_emit_x86_64(ctx, mac, nonce); + else + poly1305_emit_avx(ctx, mac, nonce); +} + +void poly1305_init_arch(struct poly1305_desc_ctx *dctx, const u8 *key) +{ + poly1305_simd_init(&dctx->h, key); + dctx->s[0] = get_unaligned_le32(&key[16]); + dctx->s[1] = get_unaligned_le32(&key[20]); + dctx->s[2] = get_unaligned_le32(&key[24]); + dctx->s[3] = get_unaligned_le32(&key[28]); + dctx->buflen = 0; + dctx->sset = true; +} +EXPORT_SYMBOL(poly1305_init_arch); + +static unsigned int crypto_poly1305_setdctxkey(struct poly1305_desc_ctx *dctx, + const u8 *inp, unsigned int len) +{ + unsigned int acc = 0; if (unlikely(!dctx->sset)) { - datalen = crypto_poly1305_setdesckey(dctx, src, srclen); - src += srclen - datalen; - srclen = datalen; - } - -#ifdef CONFIG_AS_AVX2 - if (poly1305_use_avx2 && srclen >= POLY1305_BLOCK_SIZE * 4) { - if (unlikely(!sctx->wset)) { - if (!sctx->uset) { - memcpy(sctx->u, dctx->r.r, sizeof(sctx->u)); - poly1305_simd_mult(sctx->u, dctx->r.r); - sctx->uset = true; - } - memcpy(sctx->u + 5, sctx->u, sizeof(sctx->u)); - poly1305_simd_mult(sctx->u + 5, dctx->r.r); - memcpy(sctx->u + 10, sctx->u + 5, sizeof(sctx->u)); - poly1305_simd_mult(sctx->u + 10, dctx->r.r); - sctx->wset = true; + if (!dctx->rset && len >= POLY1305_BLOCK_SIZE) { + poly1305_simd_init(&dctx->h, inp); + inp += POLY1305_BLOCK_SIZE; + len -= POLY1305_BLOCK_SIZE; + acc += POLY1305_BLOCK_SIZE; + dctx->rset = 1; } - blocks = srclen / (POLY1305_BLOCK_SIZE * 4); - poly1305_4block_avx2(dctx->h.h, src, dctx->r.r, blocks, - sctx->u); - src += POLY1305_BLOCK_SIZE * 4 * blocks; - srclen -= POLY1305_BLOCK_SIZE * 4 * blocks; - } -#endif - if (likely(srclen >= POLY1305_BLOCK_SIZE * 2)) { - if (unlikely(!sctx->uset)) { - memcpy(sctx->u, dctx->r.r, sizeof(sctx->u)); - poly1305_simd_mult(sctx->u, dctx->r.r); - sctx->uset = true; + if (len >= POLY1305_BLOCK_SIZE) { + dctx->s[0] = get_unaligned_le32(&inp[0]); + dctx->s[1] = get_unaligned_le32(&inp[4]); + dctx->s[2] = get_unaligned_le32(&inp[8]); + dctx->s[3] = get_unaligned_le32(&inp[12]); + inp += POLY1305_BLOCK_SIZE; + len -= POLY1305_BLOCK_SIZE; + acc += POLY1305_BLOCK_SIZE; + dctx->sset = true; } - blocks = srclen / (POLY1305_BLOCK_SIZE * 2); - poly1305_2block_sse2(dctx->h.h, src, dctx->r.r, blocks, - sctx->u); - src += POLY1305_BLOCK_SIZE * 2 * blocks; - srclen -= POLY1305_BLOCK_SIZE * 2 * blocks; } - if (srclen >= POLY1305_BLOCK_SIZE) { - poly1305_block_sse2(dctx->h.h, src, dctx->r.r, 1); - srclen -= POLY1305_BLOCK_SIZE; - } - return srclen; + return acc; } -static int poly1305_simd_update(struct shash_desc *desc, - const u8 *src, unsigned int srclen) +void poly1305_update_arch(struct poly1305_desc_ctx *dctx, const u8 *src, + unsigned int srclen) { - struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc); - unsigned int bytes; - - /* kernel_fpu_begin/end is costly, use fallback for small updates */ - if (srclen <= 288 || !may_use_simd()) - return crypto_poly1305_update(desc, src, srclen); - - kernel_fpu_begin(); + unsigned int bytes, used; if (unlikely(dctx->buflen)) { bytes = min(srclen, POLY1305_BLOCK_SIZE - dctx->buflen); @@ -139,34 +179,76 @@ static int poly1305_simd_update(struct shash_desc *desc, dctx->buflen += bytes; if (dctx->buflen == POLY1305_BLOCK_SIZE) { - poly1305_simd_blocks(dctx, dctx->buf, - POLY1305_BLOCK_SIZE); + if (likely(!crypto_poly1305_setdctxkey(dctx, dctx->buf, POLY1305_BLOCK_SIZE))) + poly1305_simd_blocks(&dctx->h, dctx->buf, POLY1305_BLOCK_SIZE, 1); dctx->buflen = 0; } } if (likely(srclen >= POLY1305_BLOCK_SIZE)) { - bytes = poly1305_simd_blocks(dctx, src, srclen); - src += srclen - bytes; - srclen = bytes; + bytes = round_down(srclen, POLY1305_BLOCK_SIZE); + srclen -= bytes; + used = crypto_poly1305_setdctxkey(dctx, src, bytes); + if (likely(bytes - used)) + poly1305_simd_blocks(&dctx->h, src + used, bytes - used, 1); + src += bytes; } - kernel_fpu_end(); - if (unlikely(srclen)) { dctx->buflen = srclen; memcpy(dctx->buf, src, srclen); } +} +EXPORT_SYMBOL(poly1305_update_arch); +void poly1305_final_arch(struct poly1305_desc_ctx *dctx, u8 *dst) +{ + if (unlikely(dctx->buflen)) { + dctx->buf[dctx->buflen++] = 1; + memset(dctx->buf + dctx->buflen, 0, + POLY1305_BLOCK_SIZE - dctx->buflen); + poly1305_simd_blocks(&dctx->h, dctx->buf, POLY1305_BLOCK_SIZE, 0); + } + + poly1305_simd_emit(&dctx->h, dst, dctx->s); + *dctx = (struct poly1305_desc_ctx){}; +} +EXPORT_SYMBOL(poly1305_final_arch); + +static int crypto_poly1305_init(struct shash_desc *desc) +{ + struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc); + + *dctx = (struct poly1305_desc_ctx){}; + return 0; +} + +static int crypto_poly1305_update(struct shash_desc *desc, + const u8 *src, unsigned int srclen) +{ + struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc); + + poly1305_update_arch(dctx, src, srclen); + return 0; +} + +static int crypto_poly1305_final(struct shash_desc *desc, u8 *dst) +{ + struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc); + + if (unlikely(!dctx->sset)) + return -ENOKEY; + + poly1305_final_arch(dctx, dst); return 0; } static struct shash_alg alg = { .digestsize = POLY1305_DIGEST_SIZE, - .init = poly1305_simd_init, - .update = poly1305_simd_update, + .init = crypto_poly1305_init, + .update = crypto_poly1305_update, .final = crypto_poly1305_final, - .descsize = sizeof(struct poly1305_simd_desc_ctx), + .descsize = sizeof(struct poly1305_desc_ctx), .base = { .cra_name = "poly1305", .cra_driver_name = "poly1305-simd", @@ -178,30 +260,33 @@ static struct shash_alg alg = { static int __init poly1305_simd_mod_init(void) { - if (!boot_cpu_has(X86_FEATURE_XMM2)) - return -ENODEV; - -#ifdef CONFIG_AS_AVX2 - poly1305_use_avx2 = boot_cpu_has(X86_FEATURE_AVX) && - boot_cpu_has(X86_FEATURE_AVX2) && - cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL); - alg.descsize = sizeof(struct poly1305_simd_desc_ctx); - if (poly1305_use_avx2) - alg.descsize += 10 * sizeof(u32); -#endif - return crypto_register_shash(&alg); + if (IS_ENABLED(CONFIG_AS_AVX) && boot_cpu_has(X86_FEATURE_AVX) && + cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) + static_branch_enable(&poly1305_use_avx); + if (IS_ENABLED(CONFIG_AS_AVX2) && boot_cpu_has(X86_FEATURE_AVX) && + boot_cpu_has(X86_FEATURE_AVX2) && + cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) + static_branch_enable(&poly1305_use_avx2); + if (IS_ENABLED(CONFIG_AS_AVX512) && boot_cpu_has(X86_FEATURE_AVX) && + boot_cpu_has(X86_FEATURE_AVX2) && boot_cpu_has(X86_FEATURE_AVX512F) && + cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM | XFEATURE_MASK_AVX512, NULL) && + /* Skylake downclocks unacceptably much when using zmm, but later generations are fast. */ + boot_cpu_data.x86_model != INTEL_FAM6_SKYLAKE_X) + static_branch_enable(&poly1305_use_avx512); + return IS_REACHABLE(CONFIG_CRYPTO_HASH) ? crypto_register_shash(&alg) : 0; } static void __exit poly1305_simd_mod_exit(void) { - crypto_unregister_shash(&alg); + if (IS_REACHABLE(CONFIG_CRYPTO_HASH)) + crypto_unregister_shash(&alg); } module_init(poly1305_simd_mod_init); module_exit(poly1305_simd_mod_exit); MODULE_LICENSE("GPL"); -MODULE_AUTHOR("Martin Willi "); +MODULE_AUTHOR("Jason A. Donenfeld "); MODULE_DESCRIPTION("Poly1305 authenticator"); MODULE_ALIAS_CRYPTO("poly1305"); MODULE_ALIAS_CRYPTO("poly1305-simd"); diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c index 07bf5517d9d8..2410bd4bb48f 100644 --- a/arch/x86/events/amd/ibs.c +++ b/arch/x86/events/amd/ibs.c @@ -89,6 +89,7 @@ struct perf_ibs { u64 max_period; unsigned long offset_mask[1]; int offset_max; + unsigned int fetch_count_reset_broken : 1; struct cpu_perf_ibs __percpu *pcpu; struct attribute **format_attrs; @@ -346,11 +347,15 @@ static u64 get_ibs_op_count(u64 config) { u64 count = 0; + /* + * If the internal 27-bit counter rolled over, the count is MaxCnt + * and the lower 7 bits of CurCnt are randomized. + * Otherwise CurCnt has the full 27-bit current counter value. + */ if (config & IBS_OP_VAL) - count += (config & IBS_OP_MAX_CNT) << 4; /* cnt rolled over */ - - if (ibs_caps & IBS_CAPS_RDWROPCNT) - count += (config & IBS_OP_CUR_CNT) >> 32; + count = (config & IBS_OP_MAX_CNT) << 4; + else if (ibs_caps & IBS_CAPS_RDWROPCNT) + count = (config & IBS_OP_CUR_CNT) >> 32; return count; } @@ -375,7 +380,12 @@ perf_ibs_event_update(struct perf_ibs *perf_ibs, struct perf_event *event, static inline void perf_ibs_enable_event(struct perf_ibs *perf_ibs, struct hw_perf_event *hwc, u64 config) { - wrmsrl(hwc->config_base, hwc->config | config | perf_ibs->enable_mask); + u64 tmp = hwc->config | config; + + if (perf_ibs->fetch_count_reset_broken) + wrmsrl(hwc->config_base, tmp & ~perf_ibs->enable_mask); + + wrmsrl(hwc->config_base, tmp | perf_ibs->enable_mask); } /* @@ -637,18 +647,24 @@ fail: perf_ibs->offset_max, offset + 1); } while (offset < offset_max); + /* + * Read IbsBrTarget, IbsOpData4, and IbsExtdCtl separately + * depending on their availability. + * Can't add to offset_max as they are staggered + */ if (event->attr.sample_type & PERF_SAMPLE_RAW) { - /* - * Read IbsBrTarget and IbsOpData4 separately - * depending on their availability. - * Can't add to offset_max as they are staggered - */ - if (ibs_caps & IBS_CAPS_BRNTRGT) { - rdmsrl(MSR_AMD64_IBSBRTARGET, *buf++); - size++; + if (perf_ibs == &perf_ibs_op) { + if (ibs_caps & IBS_CAPS_BRNTRGT) { + rdmsrl(MSR_AMD64_IBSBRTARGET, *buf++); + size++; + } + if (ibs_caps & IBS_CAPS_OPDATA4) { + rdmsrl(MSR_AMD64_IBSOPDATA4, *buf++); + size++; + } } - if (ibs_caps & IBS_CAPS_OPDATA4) { - rdmsrl(MSR_AMD64_IBSOPDATA4, *buf++); + if (perf_ibs == &perf_ibs_fetch && (ibs_caps & IBS_CAPS_FETCHCTLEXTD)) { + rdmsrl(MSR_AMD64_ICIBSEXTDCTL, *buf++); size++; } } @@ -744,6 +760,13 @@ static __init void perf_event_ibs_init(void) { struct attribute **attr = ibs_op_format_attrs; + /* + * Some chips fail to reset the fetch count when it is written; instead + * they need a 0-1 transition of IbsFetchEn. + */ + if (boot_cpu_data.x86 >= 0x16 && boot_cpu_data.x86 <= 0x18) + perf_ibs_fetch.fetch_count_reset_broken = 1; + perf_ibs_pmu_init(&perf_ibs_fetch, "ibs_fetch"); if (ibs_caps & IBS_CAPS_OPCNT) { diff --git a/arch/x86/events/amd/iommu.c b/arch/x86/events/amd/iommu.c index 3210fee27e7f..0014d26391fa 100644 --- a/arch/x86/events/amd/iommu.c +++ b/arch/x86/events/amd/iommu.c @@ -387,7 +387,7 @@ static __init int _init_events_attrs(void) while (amd_iommu_v2_event_descs[i].attr.attr.name) i++; - attrs = kcalloc(i + 1, sizeof(struct attribute **), GFP_KERNEL); + attrs = kcalloc(i + 1, sizeof(*attrs), GFP_KERNEL); if (!attrs) return -ENOMEM; diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 5bb11a8c245e..892af8ab95d7 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -377,6 +377,7 @@ #define MSR_AMD64_IBSOP_REG_MASK ((1UL<= 0 && - bit < NCAPINTS * 32) - setup_clear_cpu_cap(bit); + arglen = cmdline_find_option(boot_command_line, "clearcpuid", arg, sizeof(arg)); + if (arglen <= 0) + return; + + pr_info("Clearing CPUID bits:"); + do { + res = get_option(&argptr, &bit); + if (res == 0 || res == 3) + break; + + /* If the argument was too long, the last bit may be cut off */ + if (res == 1 && arglen >= sizeof(arg)) + break; + + if (bit >= 0 && bit < NCAPINTS * 32) { + pr_cont(" " X86_CAP_FMT, x86_cap_flag(bit)); + setup_clear_cpu_cap(bit); + } + } while (res == 2); + pr_cont("\n"); } /* diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c index 9490a2845f14..273687986a26 100644 --- a/arch/x86/kernel/kexec-bzimage64.c +++ b/arch/x86/kernel/kexec-bzimage64.c @@ -211,8 +211,7 @@ setup_boot_parameters(struct kimage *image, struct boot_params *params, params->hdr.hardware_subarch = boot_params.hdr.hardware_subarch; /* Copying screen_info will do? */ - memcpy(¶ms->screen_info, &boot_params.screen_info, - sizeof(struct screen_info)); + memcpy(¶ms->screen_info, &screen_info, sizeof(struct screen_info)); /* Fill in memsize later */ params->screen_info.ext_mem_k = 0; diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c index 0f8b9b900b0e..996eb53f8eb7 100644 --- a/arch/x86/kernel/nmi.c +++ b/arch/x86/kernel/nmi.c @@ -104,7 +104,6 @@ fs_initcall(nmi_warning_debugfs); static void nmi_check_duration(struct nmiaction *action, u64 duration) { - u64 whole_msecs = READ_ONCE(action->max_duration); int remainder_ns, decimal_msecs; if (duration < nmi_longest_ns || duration < action->max_duration) @@ -112,12 +111,12 @@ static void nmi_check_duration(struct nmiaction *action, u64 duration) action->max_duration = duration; - remainder_ns = do_div(whole_msecs, (1000 * 1000)); + remainder_ns = do_div(duration, (1000 * 1000)); decimal_msecs = remainder_ns / 1000; printk_ratelimited(KERN_INFO "INFO: NMI handler (%ps) took too long to run: %lld.%03d msecs\n", - action->handler, whole_msecs, decimal_msecs); + action->handler, duration, decimal_msecs); } static int nmi_handle(unsigned int type, struct pt_regs *regs) diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c index 1d264ba1e56d..8fa9ca3c3bd7 100644 --- a/arch/x86/kernel/unwind_orc.c +++ b/arch/x86/kernel/unwind_orc.c @@ -300,19 +300,12 @@ EXPORT_SYMBOL_GPL(unwind_get_return_address); unsigned long *unwind_get_return_address_ptr(struct unwind_state *state) { - struct task_struct *task = state->task; - if (unwind_done(state)) return NULL; if (state->regs) return &state->regs->ip; - if (task != current && state->sp == task->thread.sp) { - struct inactive_task_frame *frame = (void *)task->thread.sp; - return &frame->ret_addr; - } - if (state->sp) return (unsigned long *)state->sp - 1; @@ -634,7 +627,7 @@ void __unwind_start(struct unwind_state *state, struct task_struct *task, } else { struct inactive_task_frame *frame = (void *)task->thread.sp; - state->sp = task->thread.sp; + state->sp = task->thread.sp + sizeof(*frame); state->bp = READ_ONCE_NOCHECK(frame->bp); state->ip = READ_ONCE_NOCHECK(frame->ret_addr); state->signal = (void *)state->ip == ret_from_fork; diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index 210eabd71ab2..670c2aedcefa 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -3561,7 +3561,7 @@ static int em_rdpid(struct x86_emulate_ctxt *ctxt) u64 tsc_aux = 0; if (ctxt->ops->get_msr(ctxt, MSR_TSC_AUX, &tsc_aux)) - return emulate_gp(ctxt, 0); + return emulate_ud(ctxt); ctxt->dst.val = tsc_aux; return X86EMUL_CONTINUE; } diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index a2ff5c214738..5faa49a95ac9 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -6225,6 +6225,7 @@ static void kvm_recover_nx_lpages(struct kvm *kvm) cond_resched_lock(&kvm->mmu_lock); } } + kvm_mmu_commit_zap_page(kvm, &invalid_list); spin_unlock(&kvm->mmu_lock); srcu_read_unlock(&kvm->srcu, rcu_idx); diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index cb09a0ec8750..a0c3d1b4b295 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -5380,6 +5380,7 @@ static int svm_update_pi_irte(struct kvm *kvm, unsigned int host_irq, * - Tell IOMMU to use legacy mode for this interrupt. * - Retrieve ga_tag of prior interrupt remapping data. */ + pi.prev_ga_tag = 0; pi.is_guest_mode = false; ret = irq_set_vcpu_affinity(host_irq, &pi); diff --git a/arch/x86/pci/intel_mid_pci.c b/arch/x86/pci/intel_mid_pci.c index 43867bc85368..eea5a0f3b959 100644 --- a/arch/x86/pci/intel_mid_pci.c +++ b/arch/x86/pci/intel_mid_pci.c @@ -33,6 +33,7 @@ #include #include #include +#include #define PCIE_CAP_OFFSET 0x100 diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c index 76864ea59160..9f8995cd28f6 100644 --- a/arch/x86/xen/enlighten_pv.c +++ b/arch/x86/xen/enlighten_pv.c @@ -1383,6 +1383,15 @@ asmlinkage __visible void __init xen_start_kernel(void) x86_init.mpparse.get_smp_config = x86_init_uint_noop; xen_boot_params_init_edd(); + +#ifdef CONFIG_ACPI + /* + * Disable selecting "Firmware First mode" for correctable + * memory errors, as this is the duty of the hypervisor to + * decide. + */ + acpi_disable_cmcff = 1; +#endif } if (!boot_params.screen_info.orig_video_isVGA) diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c index a06547fe6f6b..85bd46e0a745 100644 --- a/block/blk-cgroup.c +++ b/block/blk-cgroup.c @@ -876,13 +876,20 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol, goto fail; } + if (radix_tree_preload(GFP_KERNEL)) { + blkg_free(new_blkg); + ret = -ENOMEM; + goto fail; + } + rcu_read_lock(); spin_lock_irq(q->queue_lock); blkg = blkg_lookup_check(pos, pol, q); if (IS_ERR(blkg)) { ret = PTR_ERR(blkg); - goto fail_unlock; + blkg_free(new_blkg); + goto fail_preloaded; } if (blkg) { @@ -891,10 +898,12 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol, blkg = blkg_create(pos, q, new_blkg); if (unlikely(IS_ERR(blkg))) { ret = PTR_ERR(blkg); - goto fail_unlock; + goto fail_preloaded; } } + radix_tree_preload_end(); + if (pos == blkcg) goto success; } @@ -904,6 +913,8 @@ success: ctx->body = body; return 0; +fail_preloaded: + radix_tree_preload_end(); fail_unlock: spin_unlock_irq(q->queue_lock); rcu_read_unlock(); diff --git a/build.config.aarch64 b/build.config.aarch64 index ce1709ac1812..569b9747661d 100644 --- a/build.config.aarch64 +++ b/build.config.aarch64 @@ -1,12 +1,12 @@ ARCH=arm64 -CLANG_TRIPLE=aarch64-linux-gnu- -CROSS_COMPILE=aarch64-linux-androidkernel- -CROSS_COMPILE_COMPAT=arm-linux-androidkernel- -LINUX_GCC_CROSS_COMPILE_PREBUILTS_BIN=prebuilts/gcc/linux-x86/aarch64/aarch64-linux-android-4.9/bin -LINUX_GCC_CROSS_COMPILE_COMPAT_PREBUILTS_BIN=prebuilts/gcc/linux-x86/arm/arm-linux-androideabi-4.9/bin/ +CROSS_COMPILE=aarch64-linux-gnu- +CROSS_COMPILE_COMPAT=arm-linux-gnueabi- +LINUX_GCC_CROSS_COMPILE_PREBUILTS_BIN=prebuilts/gas/linux-x86 +LINUX_GCC_CROSS_COMPILE_COMPAT_PREBUILTS_BIN=prebuilts/gas/linux-x86 FILES=" +arch/arm64/boot/Image arch/arm64/boot/Image.gz vmlinux System.map diff --git a/build.config.allmodconfig b/build.config.allmodconfig index 4ce14e5a7c5a..103d1cdf4edc 100644 --- a/build.config.allmodconfig +++ b/build.config.allmodconfig @@ -10,5 +10,5 @@ function update_config() { -e UNWINDER_FRAME_POINTER \ (cd ${OUT_DIR} && \ - make O=${OUT_DIR} $archsubarch CLANG_TRIPLE=${CLANG_TRIPLE} CROSS_COMPILE=${CROSS_COMPILE} "${TOOL_ARGS[@]}" ${MAKE_ARGS} olddefconfig) + make O=${OUT_DIR} $archsubarch CROSS_COMPILE=${CROSS_COMPILE} "${TOOL_ARGS[@]}" ${MAKE_ARGS} olddefconfig) } diff --git a/build.config.arm b/build.config.arm index 8eb547e21593..a9e9a6c2a418 100644 --- a/build.config.arm +++ b/build.config.arm @@ -1,8 +1,7 @@ ARCH=arm -CLANG_TRIPLE=arm-linux-gnueabi- -CROSS_COMPILE=arm-linux-androidkernel- -LINUX_GCC_CROSS_COMPILE_PREBUILTS_BIN=prebuilts/gcc/linux-x86/arm/arm-linux-androideabi-4.9/bin +CROSS_COMPILE=arm-linux-gnueabi- +LINUX_GCC_CROSS_COMPILE_PREBUILTS_BIN=prebuilts/gas/linux-x86 FILES=" arch/arm/boot/Image.gz diff --git a/build.config.common b/build.config.common index 1b0034f31452..c9467028ea7f 100644 --- a/build.config.common +++ b/build.config.common @@ -3,7 +3,7 @@ KMI_GENERATION=0 LLVM=1 DEPMOD=depmod -CLANG_PREBUILT_BIN=prebuilts-master/clang/host/linux-x86/clang-r383902/bin +CLANG_PREBUILT_BIN=prebuilts-master/clang/host/linux-x86/clang-r399163b/bin BUILDTOOLS_PREBUILT_BIN=build/build-tools/path/linux-x86 EXTRA_CMDS='' diff --git a/build.config.x86_64 b/build.config.x86_64 index df73a47e7220..b16ac8a68b25 100644 --- a/build.config.x86_64 +++ b/build.config.x86_64 @@ -1,8 +1,7 @@ ARCH=x86_64 -CLANG_TRIPLE=x86_64-linux-gnu- -CROSS_COMPILE=x86_64-linux-androidkernel- -LINUX_GCC_CROSS_COMPILE_PREBUILTS_BIN=prebuilts/gcc/linux-x86/x86/x86_64-linux-android-4.9/bin +CROSS_COMPILE=x86_64-linux-gnu- +LINUX_GCC_CROSS_COMPILE_PREBUILTS_BIN=prebuilts/gas/linux-x86 FILES=" arch/x86/boot/bzImage diff --git a/crypto/Kconfig b/crypto/Kconfig index fdaa0dce0fcc..3ebf3e1472a1 100644 --- a/crypto/Kconfig +++ b/crypto/Kconfig @@ -163,7 +163,6 @@ config CRYPTO_USER config CRYPTO_MANAGER_DISABLE_TESTS bool "Disable run-time self tests" default y - depends on CRYPTO_MANAGER2 help Disable run-time self tests that normally take place at algorithm registration. @@ -257,6 +256,17 @@ config CRYPTO_GLUE_HELPER_X86 config CRYPTO_ENGINE tristate +config CRYPTO_CURVE25519 + tristate "Curve25519 algorithm" + select CRYPTO_KPP + select CRYPTO_LIB_CURVE25519_GENERIC + +config CRYPTO_CURVE25519_X86 + tristate "x86_64 accelerated Curve25519 scalar multiplication library" + depends on X86 && 64BIT + select CRYPTO_LIB_CURVE25519_GENERIC + select CRYPTO_ARCH_HAVE_LIB_CURVE25519 + comment "Authenticated Encryption with Associated Data" config CRYPTO_CCM @@ -498,12 +508,12 @@ config CRYPTO_KEYWRAP config CRYPTO_NHPOLY1305 tristate select CRYPTO_HASH - select CRYPTO_POLY1305 + select CRYPTO_LIB_POLY1305_GENERIC config CRYPTO_ADIANTUM tristate "Adiantum support" select CRYPTO_CHACHA20 - select CRYPTO_POLY1305 + select CRYPTO_LIB_POLY1305_GENERIC select CRYPTO_NHPOLY1305 help Adiantum is a tweakable, length-preserving encryption mode @@ -638,6 +648,30 @@ config CRYPTO_CRC32_MIPS instructions, when available. +config CRYPTO_BLAKE2S + tristate "BLAKE2s digest algorithm" + select CRYPTO_LIB_BLAKE2S_GENERIC + select CRYPTO_HASH + help + Implementation of cryptographic hash function BLAKE2s + optimized for 8-32bit platforms and can produce digests of any size + between 1 to 32. The keyed hash is also implemented. + + This module provides the following algorithms: + + - blake2s-128 + - blake2s-160 + - blake2s-224 + - blake2s-256 + + See https://blake2.net for further information. + +config CRYPTO_BLAKE2S_X86 + tristate "BLAKE2s digest algorithm (x86 accelerated version)" + depends on X86 && 64BIT + select CRYPTO_LIB_BLAKE2S_GENERIC + select CRYPTO_ARCH_HAVE_LIB_BLAKE2S + config CRYPTO_CRCT10DIF tristate "CRCT10DIF algorithm" select CRYPTO_HASH @@ -684,6 +718,7 @@ config CRYPTO_GHASH config CRYPTO_POLY1305 tristate "Poly1305 authenticator algorithm" select CRYPTO_HASH + select CRYPTO_LIB_POLY1305_GENERIC help Poly1305 authenticator algorithm, RFC7539. @@ -694,7 +729,8 @@ config CRYPTO_POLY1305 config CRYPTO_POLY1305_X86_64 tristate "Poly1305 authenticator algorithm (x86_64/SSE2/AVX2)" depends on X86 && 64BIT - select CRYPTO_POLY1305 + select CRYPTO_LIB_POLY1305_GENERIC + select CRYPTO_ARCH_HAVE_LIB_POLY1305 help Poly1305 authenticator algorithm, RFC7539. @@ -703,6 +739,11 @@ config CRYPTO_POLY1305_X86_64 in IETF protocols. This is the x86_64 assembler implementation using SIMD instructions. +config CRYPTO_POLY1305_MIPS + tristate "Poly1305 authenticator algorithm (MIPS optimized)" + depends on CPU_MIPS32 || (CPU_MIPS64 && 64BIT) + select CRYPTO_ARCH_HAVE_LIB_POLY1305 + config CRYPTO_MD4 tristate "MD4 digest algorithm" select CRYPTO_HASH @@ -1467,6 +1508,7 @@ config CRYPTO_SALSA20 config CRYPTO_CHACHA20 tristate "ChaCha stream cipher algorithms" + select CRYPTO_LIB_CHACHA_GENERIC select CRYPTO_BLKCIPHER help The ChaCha20, XChaCha20, and XChaCha12 stream cipher algorithms. @@ -1487,19 +1529,20 @@ config CRYPTO_CHACHA20 in some performance-sensitive scenarios. config CRYPTO_CHACHA20_X86_64 - tristate "ChaCha20 cipher algorithm (x86_64/SSSE3/AVX2)" + tristate "ChaCha stream cipher algorithms (x86_64/SSSE3/AVX2/AVX-512VL)" depends on X86 && 64BIT select CRYPTO_BLKCIPHER - select CRYPTO_CHACHA20 + select CRYPTO_LIB_CHACHA_GENERIC + select CRYPTO_ARCH_HAVE_LIB_CHACHA help - ChaCha20 cipher algorithm, RFC7539. + SSSE3, AVX2, and AVX-512VL optimized implementations of the ChaCha20, + XChaCha20, and XChaCha12 stream ciphers. - ChaCha20 is a 256-bit high-speed stream cipher designed by Daniel J. - Bernstein and further specified in RFC7539 for use in IETF protocols. - This is the x86_64 assembler implementation using SIMD instructions. - - See also: - +config CRYPTO_CHACHA_MIPS + tristate "ChaCha stream cipher algorithms (MIPS 32r2 optimized)" + depends on CPU_MIPS32_R2 + select CRYPTO_BLKCIPHER + select CRYPTO_ARCH_HAVE_LIB_CHACHA config CRYPTO_SEED tristate "SEED cipher algorithm" @@ -1901,6 +1944,7 @@ config CRYPTO_USER_API_AEAD config CRYPTO_HASH_INFO bool +source "lib/crypto/Kconfig" source "drivers/crypto/Kconfig" source crypto/asymmetric_keys/Kconfig source certs/Kconfig diff --git a/crypto/Makefile b/crypto/Makefile index a56514cecb63..00e0bc20e508 100644 --- a/crypto/Makefile +++ b/crypto/Makefile @@ -73,6 +73,7 @@ obj-$(CONFIG_CRYPTO_SM3) += sm3_generic.o obj-$(CONFIG_CRYPTO_WP512) += wp512.o CFLAGS_wp512.o := $(call cc-option,-fno-schedule-insns) # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79149 obj-$(CONFIG_CRYPTO_TGR192) += tgr192.o +obj-$(CONFIG_CRYPTO_BLAKE2S) += blake2s_generic.o obj-$(CONFIG_CRYPTO_GF128MUL) += gf128mul.o obj-$(CONFIG_CRYPTO_ECB) += ecb.o obj-$(CONFIG_CRYPTO_CBC) += cbc.o @@ -149,6 +150,7 @@ ecdh_generic-y := ecc.o ecdh_generic-y += ecdh.o ecdh_generic-y += ecdh_helper.o obj-$(CONFIG_CRYPTO_ECDH) += ecdh_generic.o +obj-$(CONFIG_CRYPTO_CURVE25519) += curve25519-generic.o # # generic algorithms and the async_tx api diff --git a/crypto/adiantum.c b/crypto/adiantum.c index 5564e73266a6..3f129c6da447 100644 --- a/crypto/adiantum.c +++ b/crypto/adiantum.c @@ -33,6 +33,7 @@ #include #include #include +#include #include #include #include @@ -71,7 +72,7 @@ struct adiantum_tfm_ctx { struct crypto_skcipher *streamcipher; struct crypto_cipher *blockcipher; struct crypto_shash *hash; - struct poly1305_key header_hash_key; + struct poly1305_core_key header_hash_key; }; struct adiantum_request_ctx { @@ -242,13 +243,13 @@ static void adiantum_hash_header(struct skcipher_request *req) BUILD_BUG_ON(sizeof(header) % POLY1305_BLOCK_SIZE != 0); poly1305_core_blocks(&state, &tctx->header_hash_key, - &header, sizeof(header) / POLY1305_BLOCK_SIZE); + &header, sizeof(header) / POLY1305_BLOCK_SIZE, 1); BUILD_BUG_ON(TWEAK_SIZE % POLY1305_BLOCK_SIZE != 0); poly1305_core_blocks(&state, &tctx->header_hash_key, req->iv, - TWEAK_SIZE / POLY1305_BLOCK_SIZE); + TWEAK_SIZE / POLY1305_BLOCK_SIZE, 1); - poly1305_core_emit(&state, &rctx->header_hash); + poly1305_core_emit(&state, NULL, &rctx->header_hash); } /* Hash the left-hand part (the "bulk") of the message using NHPoly1305 */ diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c index e643d268c7ba..1a0e2401ac8a 100644 --- a/crypto/algif_aead.c +++ b/crypto/algif_aead.c @@ -82,7 +82,7 @@ static int crypto_aead_copy_sgl(struct crypto_sync_skcipher *null_tfm, SYNC_SKCIPHER_REQUEST_ON_STACK(skreq, null_tfm); skcipher_request_set_sync_tfm(skreq, null_tfm); - skcipher_request_set_callback(skreq, CRYPTO_TFM_REQ_MAY_BACKLOG, + skcipher_request_set_callback(skreq, CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL); skcipher_request_set_crypt(skreq, src, dst, len, NULL); @@ -295,19 +295,20 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg, areq->outlen = outlen; aead_request_set_callback(&areq->cra_u.aead_req, - CRYPTO_TFM_REQ_MAY_BACKLOG, + CRYPTO_TFM_REQ_MAY_SLEEP, af_alg_async_cb, areq); err = ctx->enc ? crypto_aead_encrypt(&areq->cra_u.aead_req) : crypto_aead_decrypt(&areq->cra_u.aead_req); /* AIO operation in progress */ - if (err == -EINPROGRESS || err == -EBUSY) + if (err == -EINPROGRESS) return -EIOCBQUEUED; sock_put(sk); } else { /* Synchronous operation */ aead_request_set_callback(&areq->cra_u.aead_req, + CRYPTO_TFM_REQ_MAY_SLEEP | CRYPTO_TFM_REQ_MAY_BACKLOG, crypto_req_done, &ctx->wait); err = crypto_wait_req(ctx->enc ? diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c index 1cb106c46043..9d2e9783c0d4 100644 --- a/crypto/algif_skcipher.c +++ b/crypto/algif_skcipher.c @@ -127,7 +127,7 @@ static int _skcipher_recvmsg(struct socket *sock, struct msghdr *msg, crypto_skcipher_decrypt(&areq->cra_u.skcipher_req); /* AIO operation in progress */ - if (err == -EINPROGRESS || err == -EBUSY) + if (err == -EINPROGRESS) return -EIOCBQUEUED; sock_put(sk); diff --git a/crypto/blake2s_generic.c b/crypto/blake2s_generic.c new file mode 100644 index 000000000000..139c745c934b --- /dev/null +++ b/crypto/blake2s_generic.c @@ -0,0 +1,170 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#include +#include + +#include +#include +#include +#include + +static int crypto_blake2s_setkey(struct crypto_shash *tfm, const u8 *key, + unsigned int keylen) +{ + struct blake2s_tfm_ctx *tctx = crypto_shash_ctx(tfm); + + if (keylen == 0 || keylen > BLAKE2S_KEY_SIZE) { + crypto_shash_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN); + return -EINVAL; + } + + memcpy(tctx->key, key, keylen); + tctx->keylen = keylen; + + return 0; +} + +static int crypto_blake2s_init(struct shash_desc *desc) +{ + struct blake2s_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm); + struct blake2s_state *state = shash_desc_ctx(desc); + const int outlen = crypto_shash_digestsize(desc->tfm); + + if (tctx->keylen) + blake2s_init_key(state, outlen, tctx->key, tctx->keylen); + else + blake2s_init(state, outlen); + + return 0; +} + +static int crypto_blake2s_update(struct shash_desc *desc, const u8 *in, + unsigned int inlen) +{ + struct blake2s_state *state = shash_desc_ctx(desc); + const size_t fill = BLAKE2S_BLOCK_SIZE - state->buflen; + + if (unlikely(!inlen)) + return 0; + if (inlen > fill) { + memcpy(state->buf + state->buflen, in, fill); + blake2s_compress_generic(state, state->buf, 1, BLAKE2S_BLOCK_SIZE); + state->buflen = 0; + in += fill; + inlen -= fill; + } + if (inlen > BLAKE2S_BLOCK_SIZE) { + const size_t nblocks = DIV_ROUND_UP(inlen, BLAKE2S_BLOCK_SIZE); + /* Hash one less (full) block than strictly possible */ + blake2s_compress_generic(state, in, nblocks - 1, BLAKE2S_BLOCK_SIZE); + in += BLAKE2S_BLOCK_SIZE * (nblocks - 1); + inlen -= BLAKE2S_BLOCK_SIZE * (nblocks - 1); + } + memcpy(state->buf + state->buflen, in, inlen); + state->buflen += inlen; + + return 0; +} + +static int crypto_blake2s_final(struct shash_desc *desc, u8 *out) +{ + struct blake2s_state *state = shash_desc_ctx(desc); + + blake2s_set_lastblock(state); + memset(state->buf + state->buflen, 0, + BLAKE2S_BLOCK_SIZE - state->buflen); /* Padding */ + blake2s_compress_generic(state, state->buf, 1, state->buflen); + cpu_to_le32_array(state->h, ARRAY_SIZE(state->h)); + memcpy(out, state->h, state->outlen); + memzero_explicit(state, sizeof(*state)); + + return 0; +} + +static struct shash_alg blake2s_algs[] = {{ + .base.cra_name = "blake2s-128", + .base.cra_driver_name = "blake2s-128-generic", + .base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, + .base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx), + .base.cra_priority = 200, + .base.cra_blocksize = BLAKE2S_BLOCK_SIZE, + .base.cra_module = THIS_MODULE, + + .digestsize = BLAKE2S_128_HASH_SIZE, + .setkey = crypto_blake2s_setkey, + .init = crypto_blake2s_init, + .update = crypto_blake2s_update, + .final = crypto_blake2s_final, + .descsize = sizeof(struct blake2s_state), +}, { + .base.cra_name = "blake2s-160", + .base.cra_driver_name = "blake2s-160-generic", + .base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, + .base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx), + .base.cra_priority = 200, + .base.cra_blocksize = BLAKE2S_BLOCK_SIZE, + .base.cra_module = THIS_MODULE, + + .digestsize = BLAKE2S_160_HASH_SIZE, + .setkey = crypto_blake2s_setkey, + .init = crypto_blake2s_init, + .update = crypto_blake2s_update, + .final = crypto_blake2s_final, + .descsize = sizeof(struct blake2s_state), +}, { + .base.cra_name = "blake2s-224", + .base.cra_driver_name = "blake2s-224-generic", + .base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, + .base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx), + .base.cra_priority = 200, + .base.cra_blocksize = BLAKE2S_BLOCK_SIZE, + .base.cra_module = THIS_MODULE, + + .digestsize = BLAKE2S_224_HASH_SIZE, + .setkey = crypto_blake2s_setkey, + .init = crypto_blake2s_init, + .update = crypto_blake2s_update, + .final = crypto_blake2s_final, + .descsize = sizeof(struct blake2s_state), +}, { + .base.cra_name = "blake2s-256", + .base.cra_driver_name = "blake2s-256-generic", + .base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, + .base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx), + .base.cra_priority = 200, + .base.cra_blocksize = BLAKE2S_BLOCK_SIZE, + .base.cra_module = THIS_MODULE, + + .digestsize = BLAKE2S_256_HASH_SIZE, + .setkey = crypto_blake2s_setkey, + .init = crypto_blake2s_init, + .update = crypto_blake2s_update, + .final = crypto_blake2s_final, + .descsize = sizeof(struct blake2s_state), +}}; + +static int __init blake2s_mod_init(void) +{ + return crypto_register_shashes(blake2s_algs, ARRAY_SIZE(blake2s_algs)); +} + +static void __exit blake2s_mod_exit(void) +{ + crypto_unregister_shashes(blake2s_algs, ARRAY_SIZE(blake2s_algs)); +} + +subsys_initcall(blake2s_mod_init); +module_exit(blake2s_mod_exit); + +MODULE_ALIAS_CRYPTO("blake2s-128"); +MODULE_ALIAS_CRYPTO("blake2s-128-generic"); +MODULE_ALIAS_CRYPTO("blake2s-160"); +MODULE_ALIAS_CRYPTO("blake2s-160-generic"); +MODULE_ALIAS_CRYPTO("blake2s-224"); +MODULE_ALIAS_CRYPTO("blake2s-224-generic"); +MODULE_ALIAS_CRYPTO("blake2s-256"); +MODULE_ALIAS_CRYPTO("blake2s-256-generic"); +MODULE_LICENSE("GPL v2"); diff --git a/crypto/chacha_generic.c b/crypto/chacha_generic.c index 35b583101f4f..daf03beba9de 100644 --- a/crypto/chacha_generic.c +++ b/crypto/chacha_generic.c @@ -12,33 +12,12 @@ #include #include -#include +#include #include #include -static void chacha_docrypt(u32 *state, u8 *dst, const u8 *src, - unsigned int bytes, int nrounds) -{ - /* aligned to potentially speed up crypto_xor() */ - u8 stream[CHACHA_BLOCK_SIZE] __aligned(sizeof(long)); - - if (dst != src) - memcpy(dst, src, bytes); - - while (bytes >= CHACHA_BLOCK_SIZE) { - chacha_block(state, stream, nrounds); - crypto_xor(dst, stream, CHACHA_BLOCK_SIZE); - bytes -= CHACHA_BLOCK_SIZE; - dst += CHACHA_BLOCK_SIZE; - } - if (bytes) { - chacha_block(state, stream, nrounds); - crypto_xor(dst, stream, bytes); - } -} - static int chacha_stream_xor(struct skcipher_request *req, - struct chacha_ctx *ctx, u8 *iv) + const struct chacha_ctx *ctx, const u8 *iv) { struct skcipher_walk walk; u32 state[16]; @@ -46,7 +25,7 @@ static int chacha_stream_xor(struct skcipher_request *req, err = skcipher_walk_virt(&walk, req, false); - crypto_chacha_init(state, ctx, iv); + chacha_init_generic(state, ctx->key, iv); while (walk.nbytes > 0) { unsigned int nbytes = walk.nbytes; @@ -54,75 +33,23 @@ static int chacha_stream_xor(struct skcipher_request *req, if (nbytes < walk.total) nbytes = round_down(nbytes, walk.stride); - chacha_docrypt(state, walk.dst.virt.addr, walk.src.virt.addr, - nbytes, ctx->nrounds); + chacha_crypt_generic(state, walk.dst.virt.addr, + walk.src.virt.addr, nbytes, ctx->nrounds); err = skcipher_walk_done(&walk, walk.nbytes - nbytes); } return err; } -void crypto_chacha_init(u32 *state, struct chacha_ctx *ctx, u8 *iv) -{ - state[0] = 0x61707865; /* "expa" */ - state[1] = 0x3320646e; /* "nd 3" */ - state[2] = 0x79622d32; /* "2-by" */ - state[3] = 0x6b206574; /* "te k" */ - state[4] = ctx->key[0]; - state[5] = ctx->key[1]; - state[6] = ctx->key[2]; - state[7] = ctx->key[3]; - state[8] = ctx->key[4]; - state[9] = ctx->key[5]; - state[10] = ctx->key[6]; - state[11] = ctx->key[7]; - state[12] = get_unaligned_le32(iv + 0); - state[13] = get_unaligned_le32(iv + 4); - state[14] = get_unaligned_le32(iv + 8); - state[15] = get_unaligned_le32(iv + 12); -} -EXPORT_SYMBOL_GPL(crypto_chacha_init); - -static int chacha_setkey(struct crypto_skcipher *tfm, const u8 *key, - unsigned int keysize, int nrounds) -{ - struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm); - int i; - - if (keysize != CHACHA_KEY_SIZE) - return -EINVAL; - - for (i = 0; i < ARRAY_SIZE(ctx->key); i++) - ctx->key[i] = get_unaligned_le32(key + i * sizeof(u32)); - - ctx->nrounds = nrounds; - return 0; -} - -int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key, - unsigned int keysize) -{ - return chacha_setkey(tfm, key, keysize, 20); -} -EXPORT_SYMBOL_GPL(crypto_chacha20_setkey); - -int crypto_chacha12_setkey(struct crypto_skcipher *tfm, const u8 *key, - unsigned int keysize) -{ - return chacha_setkey(tfm, key, keysize, 12); -} -EXPORT_SYMBOL_GPL(crypto_chacha12_setkey); - -int crypto_chacha_crypt(struct skcipher_request *req) +static int crypto_chacha_crypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm); return chacha_stream_xor(req, ctx, req->iv); } -EXPORT_SYMBOL_GPL(crypto_chacha_crypt); -int crypto_xchacha_crypt(struct skcipher_request *req) +static int crypto_xchacha_crypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm); @@ -131,8 +58,8 @@ int crypto_xchacha_crypt(struct skcipher_request *req) u8 real_iv[16]; /* Compute the subkey given the original key and first 128 nonce bits */ - crypto_chacha_init(state, ctx, req->iv); - hchacha_block(state, subctx.key, ctx->nrounds); + chacha_init_generic(state, ctx->key, req->iv); + hchacha_block_generic(state, subctx.key, ctx->nrounds); subctx.nrounds = ctx->nrounds; /* Build the real IV */ @@ -142,7 +69,6 @@ int crypto_xchacha_crypt(struct skcipher_request *req) /* Generate the stream and XOR it with the data */ return chacha_stream_xor(req, &subctx, real_iv); } -EXPORT_SYMBOL_GPL(crypto_xchacha_crypt); static struct skcipher_alg algs[] = { { @@ -157,7 +83,7 @@ static struct skcipher_alg algs[] = { .max_keysize = CHACHA_KEY_SIZE, .ivsize = CHACHA_IV_SIZE, .chunksize = CHACHA_BLOCK_SIZE, - .setkey = crypto_chacha20_setkey, + .setkey = chacha20_setkey, .encrypt = crypto_chacha_crypt, .decrypt = crypto_chacha_crypt, }, { @@ -172,7 +98,7 @@ static struct skcipher_alg algs[] = { .max_keysize = CHACHA_KEY_SIZE, .ivsize = XCHACHA_IV_SIZE, .chunksize = CHACHA_BLOCK_SIZE, - .setkey = crypto_chacha20_setkey, + .setkey = chacha20_setkey, .encrypt = crypto_xchacha_crypt, .decrypt = crypto_xchacha_crypt, }, { @@ -187,7 +113,7 @@ static struct skcipher_alg algs[] = { .max_keysize = CHACHA_KEY_SIZE, .ivsize = XCHACHA_IV_SIZE, .chunksize = CHACHA_BLOCK_SIZE, - .setkey = crypto_chacha12_setkey, + .setkey = chacha12_setkey, .encrypt = crypto_xchacha_crypt, .decrypt = crypto_xchacha_crypt, } diff --git a/crypto/curve25519-generic.c b/crypto/curve25519-generic.c new file mode 100644 index 000000000000..bd88fd571393 --- /dev/null +++ b/crypto/curve25519-generic.c @@ -0,0 +1,90 @@ +// SPDX-License-Identifier: GPL-2.0-or-later + +#include +#include +#include +#include +#include + +static int curve25519_set_secret(struct crypto_kpp *tfm, const void *buf, + unsigned int len) +{ + u8 *secret = kpp_tfm_ctx(tfm); + + if (!len) + curve25519_generate_secret(secret); + else if (len == CURVE25519_KEY_SIZE && + crypto_memneq(buf, curve25519_null_point, CURVE25519_KEY_SIZE)) + memcpy(secret, buf, CURVE25519_KEY_SIZE); + else + return -EINVAL; + return 0; +} + +static int curve25519_compute_value(struct kpp_request *req) +{ + struct crypto_kpp *tfm = crypto_kpp_reqtfm(req); + const u8 *secret = kpp_tfm_ctx(tfm); + u8 public_key[CURVE25519_KEY_SIZE]; + u8 buf[CURVE25519_KEY_SIZE]; + int copied, nbytes; + u8 const *bp; + + if (req->src) { + copied = sg_copy_to_buffer(req->src, + sg_nents_for_len(req->src, + CURVE25519_KEY_SIZE), + public_key, CURVE25519_KEY_SIZE); + if (copied != CURVE25519_KEY_SIZE) + return -EINVAL; + bp = public_key; + } else { + bp = curve25519_base_point; + } + + curve25519_generic(buf, secret, bp); + + /* might want less than we've got */ + nbytes = min_t(size_t, CURVE25519_KEY_SIZE, req->dst_len); + copied = sg_copy_from_buffer(req->dst, sg_nents_for_len(req->dst, + nbytes), + buf, nbytes); + if (copied != nbytes) + return -EINVAL; + return 0; +} + +static unsigned int curve25519_max_size(struct crypto_kpp *tfm) +{ + return CURVE25519_KEY_SIZE; +} + +static struct kpp_alg curve25519_alg = { + .base.cra_name = "curve25519", + .base.cra_driver_name = "curve25519-generic", + .base.cra_priority = 100, + .base.cra_module = THIS_MODULE, + .base.cra_ctxsize = CURVE25519_KEY_SIZE, + + .set_secret = curve25519_set_secret, + .generate_public_key = curve25519_compute_value, + .compute_shared_secret = curve25519_compute_value, + .max_size = curve25519_max_size, +}; + +static int curve25519_init(void) +{ + return crypto_register_kpp(&curve25519_alg); +} + +static void curve25519_exit(void) +{ + crypto_unregister_kpp(&curve25519_alg); +} + +subsys_initcall(curve25519_init); +module_exit(curve25519_exit); + +MODULE_ALIAS_CRYPTO("curve25519"); +MODULE_ALIAS_CRYPTO("curve25519-generic"); +MODULE_LICENSE("GPL"); diff --git a/crypto/nhpoly1305.c b/crypto/nhpoly1305.c index ec831a5594d8..52b94e9eb98d 100644 --- a/crypto/nhpoly1305.c +++ b/crypto/nhpoly1305.c @@ -33,6 +33,7 @@ #include #include #include +#include #include #include #include @@ -78,7 +79,7 @@ static void process_nh_hash_value(struct nhpoly1305_state *state, BUILD_BUG_ON(NH_HASH_BYTES % POLY1305_BLOCK_SIZE != 0); poly1305_core_blocks(&state->poly_state, &key->poly_key, state->nh_hash, - NH_HASH_BYTES / POLY1305_BLOCK_SIZE); + NH_HASH_BYTES / POLY1305_BLOCK_SIZE, 1); } /* @@ -209,7 +210,7 @@ int crypto_nhpoly1305_final_helper(struct shash_desc *desc, u8 *dst, nh_t nh_fn) if (state->nh_remaining) process_nh_hash_value(state, key); - poly1305_core_emit(&state->poly_state, dst); + poly1305_core_emit(&state->poly_state, NULL, dst); return 0; } EXPORT_SYMBOL(crypto_nhpoly1305_final_helper); diff --git a/crypto/poly1305_generic.c b/crypto/poly1305_generic.c index 2a06874204e8..56b16687578f 100644 --- a/crypto/poly1305_generic.c +++ b/crypto/poly1305_generic.c @@ -13,65 +13,33 @@ #include #include -#include +#include #include #include #include #include -static inline u64 mlt(u64 a, u64 b) -{ - return a * b; -} - -static inline u32 sr(u64 v, u_char n) -{ - return v >> n; -} - -static inline u32 and(u32 v, u32 mask) -{ - return v & mask; -} - -int crypto_poly1305_init(struct shash_desc *desc) +static int crypto_poly1305_init(struct shash_desc *desc) { struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc); poly1305_core_init(&dctx->h); dctx->buflen = 0; - dctx->rset = false; + dctx->rset = 0; dctx->sset = false; return 0; } -EXPORT_SYMBOL_GPL(crypto_poly1305_init); -void poly1305_core_setkey(struct poly1305_key *key, const u8 *raw_key) -{ - /* r &= 0xffffffc0ffffffc0ffffffc0fffffff */ - key->r[0] = (get_unaligned_le32(raw_key + 0) >> 0) & 0x3ffffff; - key->r[1] = (get_unaligned_le32(raw_key + 3) >> 2) & 0x3ffff03; - key->r[2] = (get_unaligned_le32(raw_key + 6) >> 4) & 0x3ffc0ff; - key->r[3] = (get_unaligned_le32(raw_key + 9) >> 6) & 0x3f03fff; - key->r[4] = (get_unaligned_le32(raw_key + 12) >> 8) & 0x00fffff; -} -EXPORT_SYMBOL_GPL(poly1305_core_setkey); - -/* - * Poly1305 requires a unique key for each tag, which implies that we can't set - * it on the tfm that gets accessed by multiple users simultaneously. Instead we - * expect the key as the first 32 bytes in the update() call. - */ -unsigned int crypto_poly1305_setdesckey(struct poly1305_desc_ctx *dctx, - const u8 *src, unsigned int srclen) +static unsigned int crypto_poly1305_setdesckey(struct poly1305_desc_ctx *dctx, + const u8 *src, unsigned int srclen) { if (!dctx->sset) { if (!dctx->rset && srclen >= POLY1305_BLOCK_SIZE) { - poly1305_core_setkey(&dctx->r, src); + poly1305_core_setkey(&dctx->core_r, src); src += POLY1305_BLOCK_SIZE; srclen -= POLY1305_BLOCK_SIZE; - dctx->rset = true; + dctx->rset = 2; } if (srclen >= POLY1305_BLOCK_SIZE) { dctx->s[0] = get_unaligned_le32(src + 0); @@ -85,86 +53,9 @@ unsigned int crypto_poly1305_setdesckey(struct poly1305_desc_ctx *dctx, } return srclen; } -EXPORT_SYMBOL_GPL(crypto_poly1305_setdesckey); -static void poly1305_blocks_internal(struct poly1305_state *state, - const struct poly1305_key *key, - const void *src, unsigned int nblocks, - u32 hibit) -{ - u32 r0, r1, r2, r3, r4; - u32 s1, s2, s3, s4; - u32 h0, h1, h2, h3, h4; - u64 d0, d1, d2, d3, d4; - - if (!nblocks) - return; - - r0 = key->r[0]; - r1 = key->r[1]; - r2 = key->r[2]; - r3 = key->r[3]; - r4 = key->r[4]; - - s1 = r1 * 5; - s2 = r2 * 5; - s3 = r3 * 5; - s4 = r4 * 5; - - h0 = state->h[0]; - h1 = state->h[1]; - h2 = state->h[2]; - h3 = state->h[3]; - h4 = state->h[4]; - - do { - /* h += m[i] */ - h0 += (get_unaligned_le32(src + 0) >> 0) & 0x3ffffff; - h1 += (get_unaligned_le32(src + 3) >> 2) & 0x3ffffff; - h2 += (get_unaligned_le32(src + 6) >> 4) & 0x3ffffff; - h3 += (get_unaligned_le32(src + 9) >> 6) & 0x3ffffff; - h4 += (get_unaligned_le32(src + 12) >> 8) | hibit; - - /* h *= r */ - d0 = mlt(h0, r0) + mlt(h1, s4) + mlt(h2, s3) + - mlt(h3, s2) + mlt(h4, s1); - d1 = mlt(h0, r1) + mlt(h1, r0) + mlt(h2, s4) + - mlt(h3, s3) + mlt(h4, s2); - d2 = mlt(h0, r2) + mlt(h1, r1) + mlt(h2, r0) + - mlt(h3, s4) + mlt(h4, s3); - d3 = mlt(h0, r3) + mlt(h1, r2) + mlt(h2, r1) + - mlt(h3, r0) + mlt(h4, s4); - d4 = mlt(h0, r4) + mlt(h1, r3) + mlt(h2, r2) + - mlt(h3, r1) + mlt(h4, r0); - - /* (partial) h %= p */ - d1 += sr(d0, 26); h0 = and(d0, 0x3ffffff); - d2 += sr(d1, 26); h1 = and(d1, 0x3ffffff); - d3 += sr(d2, 26); h2 = and(d2, 0x3ffffff); - d4 += sr(d3, 26); h3 = and(d3, 0x3ffffff); - h0 += sr(d4, 26) * 5; h4 = and(d4, 0x3ffffff); - h1 += h0 >> 26; h0 = h0 & 0x3ffffff; - - src += POLY1305_BLOCK_SIZE; - } while (--nblocks); - - state->h[0] = h0; - state->h[1] = h1; - state->h[2] = h2; - state->h[3] = h3; - state->h[4] = h4; -} - -void poly1305_core_blocks(struct poly1305_state *state, - const struct poly1305_key *key, - const void *src, unsigned int nblocks) -{ - poly1305_blocks_internal(state, key, src, nblocks, 1 << 24); -} -EXPORT_SYMBOL_GPL(poly1305_core_blocks); - -static void poly1305_blocks(struct poly1305_desc_ctx *dctx, - const u8 *src, unsigned int srclen, u32 hibit) +static void poly1305_blocks(struct poly1305_desc_ctx *dctx, const u8 *src, + unsigned int srclen) { unsigned int datalen; @@ -174,12 +65,12 @@ static void poly1305_blocks(struct poly1305_desc_ctx *dctx, srclen = datalen; } - poly1305_blocks_internal(&dctx->h, &dctx->r, - src, srclen / POLY1305_BLOCK_SIZE, hibit); + poly1305_core_blocks(&dctx->h, &dctx->core_r, src, + srclen / POLY1305_BLOCK_SIZE, 1); } -int crypto_poly1305_update(struct shash_desc *desc, - const u8 *src, unsigned int srclen) +static int crypto_poly1305_update(struct shash_desc *desc, + const u8 *src, unsigned int srclen) { struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc); unsigned int bytes; @@ -193,13 +84,13 @@ int crypto_poly1305_update(struct shash_desc *desc, if (dctx->buflen == POLY1305_BLOCK_SIZE) { poly1305_blocks(dctx, dctx->buf, - POLY1305_BLOCK_SIZE, 1 << 24); + POLY1305_BLOCK_SIZE); dctx->buflen = 0; } } if (likely(srclen >= POLY1305_BLOCK_SIZE)) { - poly1305_blocks(dctx, src, srclen, 1 << 24); + poly1305_blocks(dctx, src, srclen); src += srclen - (srclen % POLY1305_BLOCK_SIZE); srclen %= POLY1305_BLOCK_SIZE; } @@ -211,87 +102,17 @@ int crypto_poly1305_update(struct shash_desc *desc, return 0; } -EXPORT_SYMBOL_GPL(crypto_poly1305_update); -void poly1305_core_emit(const struct poly1305_state *state, void *dst) -{ - u32 h0, h1, h2, h3, h4; - u32 g0, g1, g2, g3, g4; - u32 mask; - - /* fully carry h */ - h0 = state->h[0]; - h1 = state->h[1]; - h2 = state->h[2]; - h3 = state->h[3]; - h4 = state->h[4]; - - h2 += (h1 >> 26); h1 = h1 & 0x3ffffff; - h3 += (h2 >> 26); h2 = h2 & 0x3ffffff; - h4 += (h3 >> 26); h3 = h3 & 0x3ffffff; - h0 += (h4 >> 26) * 5; h4 = h4 & 0x3ffffff; - h1 += (h0 >> 26); h0 = h0 & 0x3ffffff; - - /* compute h + -p */ - g0 = h0 + 5; - g1 = h1 + (g0 >> 26); g0 &= 0x3ffffff; - g2 = h2 + (g1 >> 26); g1 &= 0x3ffffff; - g3 = h3 + (g2 >> 26); g2 &= 0x3ffffff; - g4 = h4 + (g3 >> 26) - (1 << 26); g3 &= 0x3ffffff; - - /* select h if h < p, or h + -p if h >= p */ - mask = (g4 >> ((sizeof(u32) * 8) - 1)) - 1; - g0 &= mask; - g1 &= mask; - g2 &= mask; - g3 &= mask; - g4 &= mask; - mask = ~mask; - h0 = (h0 & mask) | g0; - h1 = (h1 & mask) | g1; - h2 = (h2 & mask) | g2; - h3 = (h3 & mask) | g3; - h4 = (h4 & mask) | g4; - - /* h = h % (2^128) */ - put_unaligned_le32((h0 >> 0) | (h1 << 26), dst + 0); - put_unaligned_le32((h1 >> 6) | (h2 << 20), dst + 4); - put_unaligned_le32((h2 >> 12) | (h3 << 14), dst + 8); - put_unaligned_le32((h3 >> 18) | (h4 << 8), dst + 12); -} -EXPORT_SYMBOL_GPL(poly1305_core_emit); - -int crypto_poly1305_final(struct shash_desc *desc, u8 *dst) +static int crypto_poly1305_final(struct shash_desc *desc, u8 *dst) { struct poly1305_desc_ctx *dctx = shash_desc_ctx(desc); - __le32 digest[4]; - u64 f = 0; if (unlikely(!dctx->sset)) return -ENOKEY; - if (unlikely(dctx->buflen)) { - dctx->buf[dctx->buflen++] = 1; - memset(dctx->buf + dctx->buflen, 0, - POLY1305_BLOCK_SIZE - dctx->buflen); - poly1305_blocks(dctx, dctx->buf, POLY1305_BLOCK_SIZE, 0); - } - - poly1305_core_emit(&dctx->h, digest); - - /* mac = (h + s) % (2^128) */ - f = (f >> 32) + le32_to_cpu(digest[0]) + dctx->s[0]; - put_unaligned_le32(f, dst + 0); - f = (f >> 32) + le32_to_cpu(digest[1]) + dctx->s[1]; - put_unaligned_le32(f, dst + 4); - f = (f >> 32) + le32_to_cpu(digest[2]) + dctx->s[2]; - put_unaligned_le32(f, dst + 8); - f = (f >> 32) + le32_to_cpu(digest[3]) + dctx->s[3]; - put_unaligned_le32(f, dst + 12); - + poly1305_final_generic(dctx, dst); return 0; } -EXPORT_SYMBOL_GPL(crypto_poly1305_final); static struct shash_alg poly1305_alg = { .digestsize = POLY1305_DIGEST_SIZE, diff --git a/crypto/testmgr.c b/crypto/testmgr.c index 8cc4cf8316fa..ea3524441176 100644 --- a/crypto/testmgr.c +++ b/crypto/testmgr.c @@ -2616,6 +2616,30 @@ static const struct alg_test_desc alg_test_descs[] = { .alg = "authenc(hmac(sha512),rfc3686(ctr(aes)))", .test = alg_test_null, .fips_allowed = 1, + }, { + .alg = "blake2s-128", + .test = alg_test_hash, + .suite = { + .hash = __VECS(blakes2s_128_tv_template) + } + }, { + .alg = "blake2s-160", + .test = alg_test_hash, + .suite = { + .hash = __VECS(blakes2s_160_tv_template) + } + }, { + .alg = "blake2s-224", + .test = alg_test_hash, + .suite = { + .hash = __VECS(blakes2s_224_tv_template) + } + }, { + .alg = "blake2s-256", + .test = alg_test_hash, + .suite = { + .hash = __VECS(blakes2s_256_tv_template) + } }, { .alg = "cbc(aes)", .test = alg_test_skcipher, @@ -2821,6 +2845,12 @@ static const struct alg_test_desc alg_test_descs[] = { .suite = { .cipher = __VECS(cts_mode_tv_template) } + }, { + .alg = "curve25519", + .test = alg_test_kpp, + .suite = { + .kpp = __VECS(curve25519_tv_template) + } }, { .alg = "deflate", .test = alg_test_comp, diff --git a/crypto/testmgr.h b/crypto/testmgr.h index 39652cda7a44..9466be39f61b 100644 --- a/crypto/testmgr.h +++ b/crypto/testmgr.h @@ -857,6 +857,1231 @@ static const struct kpp_testvec dh_tv_template[] = { } }; +static const struct kpp_testvec curve25519_tv_template[] = { +{ + .secret = (u8[32]){ 0x77, 0x07, 0x6d, 0x0a, 0x73, 0x18, 0xa5, 0x7d, + 0x3c, 0x16, 0xc1, 0x72, 0x51, 0xb2, 0x66, 0x45, + 0xdf, 0x4c, 0x2f, 0x87, 0xeb, 0xc0, 0x99, 0x2a, + 0xb1, 0x77, 0xfb, 0xa5, 0x1d, 0xb9, 0x2c, 0x2a }, + .b_public = (u8[32]){ 0xde, 0x9e, 0xdb, 0x7d, 0x7b, 0x7d, 0xc1, 0xb4, + 0xd3, 0x5b, 0x61, 0xc2, 0xec, 0xe4, 0x35, 0x37, + 0x3f, 0x83, 0x43, 0xc8, 0x5b, 0x78, 0x67, 0x4d, + 0xad, 0xfc, 0x7e, 0x14, 0x6f, 0x88, 0x2b, 0x4f }, + .expected_ss = (u8[32]){ 0x4a, 0x5d, 0x9d, 0x5b, 0xa4, 0xce, 0x2d, 0xe1, + 0x72, 0x8e, 0x3b, 0xf4, 0x80, 0x35, 0x0f, 0x25, + 0xe0, 0x7e, 0x21, 0xc9, 0x47, 0xd1, 0x9e, 0x33, + 0x76, 0xf0, 0x9b, 0x3c, 0x1e, 0x16, 0x17, 0x42 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +{ + .secret = (u8[32]){ 0x5d, 0xab, 0x08, 0x7e, 0x62, 0x4a, 0x8a, 0x4b, + 0x79, 0xe1, 0x7f, 0x8b, 0x83, 0x80, 0x0e, 0xe6, + 0x6f, 0x3b, 0xb1, 0x29, 0x26, 0x18, 0xb6, 0xfd, + 0x1c, 0x2f, 0x8b, 0x27, 0xff, 0x88, 0xe0, 0xeb }, + .b_public = (u8[32]){ 0x85, 0x20, 0xf0, 0x09, 0x89, 0x30, 0xa7, 0x54, + 0x74, 0x8b, 0x7d, 0xdc, 0xb4, 0x3e, 0xf7, 0x5a, + 0x0d, 0xbf, 0x3a, 0x0d, 0x26, 0x38, 0x1a, 0xf4, + 0xeb, 0xa4, 0xa9, 0x8e, 0xaa, 0x9b, 0x4e, 0x6a }, + .expected_ss = (u8[32]){ 0x4a, 0x5d, 0x9d, 0x5b, 0xa4, 0xce, 0x2d, 0xe1, + 0x72, 0x8e, 0x3b, 0xf4, 0x80, 0x35, 0x0f, 0x25, + 0xe0, 0x7e, 0x21, 0xc9, 0x47, 0xd1, 0x9e, 0x33, + 0x76, 0xf0, 0x9b, 0x3c, 0x1e, 0x16, 0x17, 0x42 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +{ + .secret = (u8[32]){ 1 }, + .b_public = (u8[32]){ 0x25, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .expected_ss = (u8[32]){ 0x3c, 0x77, 0x77, 0xca, 0xf9, 0x97, 0xb2, 0x64, + 0x41, 0x60, 0x77, 0x66, 0x5b, 0x4e, 0x22, 0x9d, + 0x0b, 0x95, 0x48, 0xdc, 0x0c, 0xd8, 0x19, 0x98, + 0xdd, 0xcd, 0xc5, 0xc8, 0x53, 0x3c, 0x79, 0x7f }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +{ + .secret = (u8[32]){ 1 }, + .b_public = (u8[32]){ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .expected_ss = (u8[32]){ 0xb3, 0x2d, 0x13, 0x62, 0xc2, 0x48, 0xd6, 0x2f, + 0xe6, 0x26, 0x19, 0xcf, 0xf0, 0x4d, 0xd4, 0x3d, + 0xb7, 0x3f, 0xfc, 0x1b, 0x63, 0x08, 0xed, 0xe3, + 0x0b, 0x78, 0xd8, 0x73, 0x80, 0xf1, 0xe8, 0x34 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +{ + .secret = (u8[32]){ 0xa5, 0x46, 0xe3, 0x6b, 0xf0, 0x52, 0x7c, 0x9d, + 0x3b, 0x16, 0x15, 0x4b, 0x82, 0x46, 0x5e, 0xdd, + 0x62, 0x14, 0x4c, 0x0a, 0xc1, 0xfc, 0x5a, 0x18, + 0x50, 0x6a, 0x22, 0x44, 0xba, 0x44, 0x9a, 0xc4 }, + .b_public = (u8[32]){ 0xe6, 0xdb, 0x68, 0x67, 0x58, 0x30, 0x30, 0xdb, + 0x35, 0x94, 0xc1, 0xa4, 0x24, 0xb1, 0x5f, 0x7c, + 0x72, 0x66, 0x24, 0xec, 0x26, 0xb3, 0x35, 0x3b, + 0x10, 0xa9, 0x03, 0xa6, 0xd0, 0xab, 0x1c, 0x4c }, + .expected_ss = (u8[32]){ 0xc3, 0xda, 0x55, 0x37, 0x9d, 0xe9, 0xc6, 0x90, + 0x8e, 0x94, 0xea, 0x4d, 0xf2, 0x8d, 0x08, 0x4f, + 0x32, 0xec, 0xcf, 0x03, 0x49, 0x1c, 0x71, 0xf7, + 0x54, 0xb4, 0x07, 0x55, 0x77, 0xa2, 0x85, 0x52 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +{ + .secret = (u8[32]){ 0xff, 0xff, 0xff, 0xff, 0x0a, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .b_public = (u8[32]){ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0x0a, 0x00, 0xfb, 0x9f }, + .expected_ss = (u8[32]){ 0x77, 0x52, 0xb6, 0x18, 0xc1, 0x2d, 0x48, 0xd2, + 0xc6, 0x93, 0x46, 0x83, 0x81, 0x7c, 0xc6, 0x57, + 0xf3, 0x31, 0x03, 0x19, 0x49, 0x48, 0x20, 0x05, + 0x42, 0x2b, 0x4e, 0xae, 0x8d, 0x1d, 0x43, 0x23 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +{ + .secret = (u8[32]){ 0x8e, 0x0a, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .b_public = (u8[32]){ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x8e, 0x06 }, + .expected_ss = (u8[32]){ 0x5a, 0xdf, 0xaa, 0x25, 0x86, 0x8e, 0x32, 0x3d, + 0xae, 0x49, 0x62, 0xc1, 0x01, 0x5c, 0xb3, 0x12, + 0xe1, 0xc5, 0xc7, 0x9e, 0x95, 0x3f, 0x03, 0x99, + 0xb0, 0xba, 0x16, 0x22, 0xf3, 0xb6, 0xf7, 0x0c }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - normal case */ +{ + .secret = (u8[32]){ 0x48, 0x52, 0x83, 0x4d, 0x9d, 0x6b, 0x77, 0xda, + 0xde, 0xab, 0xaa, 0xf2, 0xe1, 0x1d, 0xca, 0x66, + 0xd1, 0x9f, 0xe7, 0x49, 0x93, 0xa7, 0xbe, 0xc3, + 0x6c, 0x6e, 0x16, 0xa0, 0x98, 0x3f, 0xea, 0xba }, + .b_public = (u8[32]){ 0x9c, 0x64, 0x7d, 0x9a, 0xe5, 0x89, 0xb9, 0xf5, + 0x8f, 0xdc, 0x3c, 0xa4, 0x94, 0x7e, 0xfb, 0xc9, + 0x15, 0xc4, 0xb2, 0xe0, 0x8e, 0x74, 0x4a, 0x0e, + 0xdf, 0x46, 0x9d, 0xac, 0x59, 0xc8, 0xf8, 0x5a }, + .expected_ss = (u8[32]){ 0x87, 0xb7, 0xf2, 0x12, 0xb6, 0x27, 0xf7, 0xa5, + 0x4c, 0xa5, 0xe0, 0xbc, 0xda, 0xdd, 0xd5, 0x38, + 0x9d, 0x9d, 0xe6, 0x15, 0x6c, 0xdb, 0xcf, 0x8e, + 0xbe, 0x14, 0xff, 0xbc, 0xfb, 0x43, 0x65, 0x51 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - public key on twist */ +{ + .secret = (u8[32]){ 0x58, 0x8c, 0x06, 0x1a, 0x50, 0x80, 0x4a, 0xc4, + 0x88, 0xad, 0x77, 0x4a, 0xc7, 0x16, 0xc3, 0xf5, + 0xba, 0x71, 0x4b, 0x27, 0x12, 0xe0, 0x48, 0x49, + 0x13, 0x79, 0xa5, 0x00, 0x21, 0x19, 0x98, 0xa8 }, + .b_public = (u8[32]){ 0x63, 0xaa, 0x40, 0xc6, 0xe3, 0x83, 0x46, 0xc5, + 0xca, 0xf2, 0x3a, 0x6d, 0xf0, 0xa5, 0xe6, 0xc8, + 0x08, 0x89, 0xa0, 0x86, 0x47, 0xe5, 0x51, 0xb3, + 0x56, 0x34, 0x49, 0xbe, 0xfc, 0xfc, 0x97, 0x33 }, + .expected_ss = (u8[32]){ 0xb1, 0xa7, 0x07, 0x51, 0x94, 0x95, 0xff, 0xff, + 0xb2, 0x98, 0xff, 0x94, 0x17, 0x16, 0xb0, 0x6d, + 0xfa, 0xb8, 0x7c, 0xf8, 0xd9, 0x11, 0x23, 0xfe, + 0x2b, 0xe9, 0xa2, 0x33, 0xdd, 0xa2, 0x22, 0x12 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - public key on twist */ +{ + .secret = (u8[32]){ 0xb0, 0x5b, 0xfd, 0x32, 0xe5, 0x53, 0x25, 0xd9, + 0xfd, 0x64, 0x8c, 0xb3, 0x02, 0x84, 0x80, 0x39, + 0x00, 0x0b, 0x39, 0x0e, 0x44, 0xd5, 0x21, 0xe5, + 0x8a, 0xab, 0x3b, 0x29, 0xa6, 0x96, 0x0b, 0xa8 }, + .b_public = (u8[32]){ 0x0f, 0x83, 0xc3, 0x6f, 0xde, 0xd9, 0xd3, 0x2f, + 0xad, 0xf4, 0xef, 0xa3, 0xae, 0x93, 0xa9, 0x0b, + 0xb5, 0xcf, 0xa6, 0x68, 0x93, 0xbc, 0x41, 0x2c, + 0x43, 0xfa, 0x72, 0x87, 0xdb, 0xb9, 0x97, 0x79 }, + .expected_ss = (u8[32]){ 0x67, 0xdd, 0x4a, 0x6e, 0x16, 0x55, 0x33, 0x53, + 0x4c, 0x0e, 0x3f, 0x17, 0x2e, 0x4a, 0xb8, 0x57, + 0x6b, 0xca, 0x92, 0x3a, 0x5f, 0x07, 0xb2, 0xc0, + 0x69, 0xb4, 0xc3, 0x10, 0xff, 0x2e, 0x93, 0x5b }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - public key on twist */ +{ + .secret = (u8[32]){ 0x70, 0xe3, 0x4b, 0xcb, 0xe1, 0xf4, 0x7f, 0xbc, + 0x0f, 0xdd, 0xfd, 0x7c, 0x1e, 0x1a, 0xa5, 0x3d, + 0x57, 0xbf, 0xe0, 0xf6, 0x6d, 0x24, 0x30, 0x67, + 0xb4, 0x24, 0xbb, 0x62, 0x10, 0xbe, 0xd1, 0x9c }, + .b_public = (u8[32]){ 0x0b, 0x82, 0x11, 0xa2, 0xb6, 0x04, 0x90, 0x97, + 0xf6, 0x87, 0x1c, 0x6c, 0x05, 0x2d, 0x3c, 0x5f, + 0xc1, 0xba, 0x17, 0xda, 0x9e, 0x32, 0xae, 0x45, + 0x84, 0x03, 0xb0, 0x5b, 0xb2, 0x83, 0x09, 0x2a }, + .expected_ss = (u8[32]){ 0x4a, 0x06, 0x38, 0xcf, 0xaa, 0x9e, 0xf1, 0x93, + 0x3b, 0x47, 0xf8, 0x93, 0x92, 0x96, 0xa6, 0xb2, + 0x5b, 0xe5, 0x41, 0xef, 0x7f, 0x70, 0xe8, 0x44, + 0xc0, 0xbc, 0xc0, 0x0b, 0x13, 0x4d, 0xe6, 0x4a }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - public key on twist */ +{ + .secret = (u8[32]){ 0x68, 0xc1, 0xf3, 0xa6, 0x53, 0xa4, 0xcd, 0xb1, + 0xd3, 0x7b, 0xba, 0x94, 0x73, 0x8f, 0x8b, 0x95, + 0x7a, 0x57, 0xbe, 0xb2, 0x4d, 0x64, 0x6e, 0x99, + 0x4d, 0xc2, 0x9a, 0x27, 0x6a, 0xad, 0x45, 0x8d }, + .b_public = (u8[32]){ 0x34, 0x3a, 0xc2, 0x0a, 0x3b, 0x9c, 0x6a, 0x27, + 0xb1, 0x00, 0x81, 0x76, 0x50, 0x9a, 0xd3, 0x07, + 0x35, 0x85, 0x6e, 0xc1, 0xc8, 0xd8, 0xfc, 0xae, + 0x13, 0x91, 0x2d, 0x08, 0xd1, 0x52, 0xf4, 0x6c }, + .expected_ss = (u8[32]){ 0x39, 0x94, 0x91, 0xfc, 0xe8, 0xdf, 0xab, 0x73, + 0xb4, 0xf9, 0xf6, 0x11, 0xde, 0x8e, 0xa0, 0xb2, + 0x7b, 0x28, 0xf8, 0x59, 0x94, 0x25, 0x0b, 0x0f, + 0x47, 0x5d, 0x58, 0x5d, 0x04, 0x2a, 0xc2, 0x07 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - public key on twist */ +{ + .secret = (u8[32]){ 0xd8, 0x77, 0xb2, 0x6d, 0x06, 0xdf, 0xf9, 0xd9, + 0xf7, 0xfd, 0x4c, 0x5b, 0x37, 0x69, 0xf8, 0xcd, + 0xd5, 0xb3, 0x05, 0x16, 0xa5, 0xab, 0x80, 0x6b, + 0xe3, 0x24, 0xff, 0x3e, 0xb6, 0x9e, 0xa0, 0xb2 }, + .b_public = (u8[32]){ 0xfa, 0x69, 0x5f, 0xc7, 0xbe, 0x8d, 0x1b, 0xe5, + 0xbf, 0x70, 0x48, 0x98, 0xf3, 0x88, 0xc4, 0x52, + 0xba, 0xfd, 0xd3, 0xb8, 0xea, 0xe8, 0x05, 0xf8, + 0x68, 0x1a, 0x8d, 0x15, 0xc2, 0xd4, 0xe1, 0x42 }, + .expected_ss = (u8[32]){ 0x2c, 0x4f, 0xe1, 0x1d, 0x49, 0x0a, 0x53, 0x86, + 0x17, 0x76, 0xb1, 0x3b, 0x43, 0x54, 0xab, 0xd4, + 0xcf, 0x5a, 0x97, 0x69, 0x9d, 0xb6, 0xe6, 0xc6, + 0x8c, 0x16, 0x26, 0xd0, 0x76, 0x62, 0xf7, 0x58 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case on twist */ +{ + .secret = (u8[32]){ 0x38, 0xdd, 0xe9, 0xf3, 0xe7, 0xb7, 0x99, 0x04, + 0x5f, 0x9a, 0xc3, 0x79, 0x3d, 0x4a, 0x92, 0x77, + 0xda, 0xde, 0xad, 0xc4, 0x1b, 0xec, 0x02, 0x90, + 0xf8, 0x1f, 0x74, 0x4f, 0x73, 0x77, 0x5f, 0x84 }, + .b_public = (u8[32]){ 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .expected_ss = (u8[32]){ 0x9a, 0x2c, 0xfe, 0x84, 0xff, 0x9c, 0x4a, 0x97, + 0x39, 0x62, 0x5c, 0xae, 0x4a, 0x3b, 0x82, 0xa9, + 0x06, 0x87, 0x7a, 0x44, 0x19, 0x46, 0xf8, 0xd7, + 0xb3, 0xd7, 0x95, 0xfe, 0x8f, 0x5d, 0x16, 0x39 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case on twist */ +{ + .secret = (u8[32]){ 0x98, 0x57, 0xa9, 0x14, 0xe3, 0xc2, 0x90, 0x36, + 0xfd, 0x9a, 0x44, 0x2b, 0xa5, 0x26, 0xb5, 0xcd, + 0xcd, 0xf2, 0x82, 0x16, 0x15, 0x3e, 0x63, 0x6c, + 0x10, 0x67, 0x7a, 0xca, 0xb6, 0xbd, 0x6a, 0xa5 }, + .b_public = (u8[32]){ 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .expected_ss = (u8[32]){ 0x4d, 0xa4, 0xe0, 0xaa, 0x07, 0x2c, 0x23, 0x2e, + 0xe2, 0xf0, 0xfa, 0x4e, 0x51, 0x9a, 0xe5, 0x0b, + 0x52, 0xc1, 0xed, 0xd0, 0x8a, 0x53, 0x4d, 0x4e, + 0xf3, 0x46, 0xc2, 0xe1, 0x06, 0xd2, 0x1d, 0x60 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case on twist */ +{ + .secret = (u8[32]){ 0x48, 0xe2, 0x13, 0x0d, 0x72, 0x33, 0x05, 0xed, + 0x05, 0xe6, 0xe5, 0x89, 0x4d, 0x39, 0x8a, 0x5e, + 0x33, 0x36, 0x7a, 0x8c, 0x6a, 0xac, 0x8f, 0xcd, + 0xf0, 0xa8, 0x8e, 0x4b, 0x42, 0x82, 0x0d, 0xb7 }, + .b_public = (u8[32]){ 0xff, 0xff, 0xff, 0x03, 0x00, 0x00, 0xf8, 0xff, + 0xff, 0x1f, 0x00, 0x00, 0xc0, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0xfe, 0xff, 0xff, 0x07, 0x00, + 0x00, 0xf0, 0xff, 0xff, 0x3f, 0x00, 0x00, 0x00 }, + .expected_ss = (u8[32]){ 0x9e, 0xd1, 0x0c, 0x53, 0x74, 0x7f, 0x64, 0x7f, + 0x82, 0xf4, 0x51, 0x25, 0xd3, 0xde, 0x15, 0xa1, + 0xe6, 0xb8, 0x24, 0x49, 0x6a, 0xb4, 0x04, 0x10, + 0xff, 0xcc, 0x3c, 0xfe, 0x95, 0x76, 0x0f, 0x3b }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case on twist */ +{ + .secret = (u8[32]){ 0x28, 0xf4, 0x10, 0x11, 0x69, 0x18, 0x51, 0xb3, + 0xa6, 0x2b, 0x64, 0x15, 0x53, 0xb3, 0x0d, 0x0d, + 0xfd, 0xdc, 0xb8, 0xff, 0xfc, 0xf5, 0x37, 0x00, + 0xa7, 0xbe, 0x2f, 0x6a, 0x87, 0x2e, 0x9f, 0xb0 }, + .b_public = (u8[32]){ 0x00, 0x00, 0x00, 0xfc, 0xff, 0xff, 0x07, 0x00, + 0x00, 0xe0, 0xff, 0xff, 0x3f, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0x01, 0x00, 0x00, 0xf8, 0xff, + 0xff, 0x0f, 0x00, 0x00, 0xc0, 0xff, 0xff, 0x7f }, + .expected_ss = (u8[32]){ 0xcf, 0x72, 0xb4, 0xaa, 0x6a, 0xa1, 0xc9, 0xf8, + 0x94, 0xf4, 0x16, 0x5b, 0x86, 0x10, 0x9a, 0xa4, + 0x68, 0x51, 0x76, 0x48, 0xe1, 0xf0, 0xcc, 0x70, + 0xe1, 0xab, 0x08, 0x46, 0x01, 0x76, 0x50, 0x6b }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case on twist */ +{ + .secret = (u8[32]){ 0x18, 0xa9, 0x3b, 0x64, 0x99, 0xb9, 0xf6, 0xb3, + 0x22, 0x5c, 0xa0, 0x2f, 0xef, 0x41, 0x0e, 0x0a, + 0xde, 0xc2, 0x35, 0x32, 0x32, 0x1d, 0x2d, 0x8e, + 0xf1, 0xa6, 0xd6, 0x02, 0xa8, 0xc6, 0x5b, 0x83 }, + .b_public = (u8[32]){ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0x7f }, + .expected_ss = (u8[32]){ 0x5d, 0x50, 0xb6, 0x28, 0x36, 0xbb, 0x69, 0x57, + 0x94, 0x10, 0x38, 0x6c, 0xf7, 0xbb, 0x81, 0x1c, + 0x14, 0xbf, 0x85, 0xb1, 0xc7, 0xb1, 0x7e, 0x59, + 0x24, 0xc7, 0xff, 0xea, 0x91, 0xef, 0x9e, 0x12 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case on twist */ +{ + .secret = (u8[32]){ 0xc0, 0x1d, 0x13, 0x05, 0xa1, 0x33, 0x8a, 0x1f, + 0xca, 0xc2, 0xba, 0x7e, 0x2e, 0x03, 0x2b, 0x42, + 0x7e, 0x0b, 0x04, 0x90, 0x31, 0x65, 0xac, 0xa9, + 0x57, 0xd8, 0xd0, 0x55, 0x3d, 0x87, 0x17, 0xb0 }, + .b_public = (u8[32]){ 0xea, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f }, + .expected_ss = (u8[32]){ 0x19, 0x23, 0x0e, 0xb1, 0x48, 0xd5, 0xd6, 0x7c, + 0x3c, 0x22, 0xab, 0x1d, 0xae, 0xff, 0x80, 0xa5, + 0x7e, 0xae, 0x42, 0x65, 0xce, 0x28, 0x72, 0x65, + 0x7b, 0x2c, 0x80, 0x99, 0xfc, 0x69, 0x8e, 0x50 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case for public key */ +{ + .secret = (u8[32]){ 0x38, 0x6f, 0x7f, 0x16, 0xc5, 0x07, 0x31, 0xd6, + 0x4f, 0x82, 0xe6, 0xa1, 0x70, 0xb1, 0x42, 0xa4, + 0xe3, 0x4f, 0x31, 0xfd, 0x77, 0x68, 0xfc, 0xb8, + 0x90, 0x29, 0x25, 0xe7, 0xd1, 0xe2, 0x1a, 0xbe }, + .b_public = (u8[32]){ 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .expected_ss = (u8[32]){ 0x0f, 0xca, 0xb5, 0xd8, 0x42, 0xa0, 0x78, 0xd7, + 0xa7, 0x1f, 0xc5, 0x9b, 0x57, 0xbf, 0xb4, 0xca, + 0x0b, 0xe6, 0x87, 0x3b, 0x49, 0xdc, 0xdb, 0x9f, + 0x44, 0xe1, 0x4a, 0xe8, 0xfb, 0xdf, 0xa5, 0x42 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case for public key */ +{ + .secret = (u8[32]){ 0xe0, 0x23, 0xa2, 0x89, 0xbd, 0x5e, 0x90, 0xfa, + 0x28, 0x04, 0xdd, 0xc0, 0x19, 0xa0, 0x5e, 0xf3, + 0xe7, 0x9d, 0x43, 0x4b, 0xb6, 0xea, 0x2f, 0x52, + 0x2e, 0xcb, 0x64, 0x3a, 0x75, 0x29, 0x6e, 0x95 }, + .b_public = (u8[32]){ 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00 }, + .expected_ss = (u8[32]){ 0x54, 0xce, 0x8f, 0x22, 0x75, 0xc0, 0x77, 0xe3, + 0xb1, 0x30, 0x6a, 0x39, 0x39, 0xc5, 0xe0, 0x3e, + 0xef, 0x6b, 0xbb, 0x88, 0x06, 0x05, 0x44, 0x75, + 0x8d, 0x9f, 0xef, 0x59, 0xb0, 0xbc, 0x3e, 0x4f }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case for public key */ +{ + .secret = (u8[32]){ 0x68, 0xf0, 0x10, 0xd6, 0x2e, 0xe8, 0xd9, 0x26, + 0x05, 0x3a, 0x36, 0x1c, 0x3a, 0x75, 0xc6, 0xea, + 0x4e, 0xbd, 0xc8, 0x60, 0x6a, 0xb2, 0x85, 0x00, + 0x3a, 0x6f, 0x8f, 0x40, 0x76, 0xb0, 0x1e, 0x83 }, + .b_public = (u8[32]){ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x03 }, + .expected_ss = (u8[32]){ 0xf1, 0x36, 0x77, 0x5c, 0x5b, 0xeb, 0x0a, 0xf8, + 0x11, 0x0a, 0xf1, 0x0b, 0x20, 0x37, 0x23, 0x32, + 0x04, 0x3c, 0xab, 0x75, 0x24, 0x19, 0x67, 0x87, + 0x75, 0xa2, 0x23, 0xdf, 0x57, 0xc9, 0xd3, 0x0d }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case for public key */ +{ + .secret = (u8[32]){ 0x58, 0xeb, 0xcb, 0x35, 0xb0, 0xf8, 0x84, 0x5c, + 0xaf, 0x1e, 0xc6, 0x30, 0xf9, 0x65, 0x76, 0xb6, + 0x2c, 0x4b, 0x7b, 0x6c, 0x36, 0xb2, 0x9d, 0xeb, + 0x2c, 0xb0, 0x08, 0x46, 0x51, 0x75, 0x5c, 0x96 }, + .b_public = (u8[32]){ 0xff, 0xff, 0xff, 0xfb, 0xff, 0xff, 0xfb, 0xff, + 0xff, 0xdf, 0xff, 0xff, 0xdf, 0xff, 0xff, 0xff, + 0xfe, 0xff, 0xff, 0xfe, 0xff, 0xff, 0xf7, 0xff, + 0xff, 0xf7, 0xff, 0xff, 0xbf, 0xff, 0xff, 0x3f }, + .expected_ss = (u8[32]){ 0xbf, 0x9a, 0xff, 0xd0, 0x6b, 0x84, 0x40, 0x85, + 0x58, 0x64, 0x60, 0x96, 0x2e, 0xf2, 0x14, 0x6f, + 0xf3, 0xd4, 0x53, 0x3d, 0x94, 0x44, 0xaa, 0xb0, + 0x06, 0xeb, 0x88, 0xcc, 0x30, 0x54, 0x40, 0x7d }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case for public key */ +{ + .secret = (u8[32]){ 0x18, 0x8c, 0x4b, 0xc5, 0xb9, 0xc4, 0x4b, 0x38, + 0xbb, 0x65, 0x8b, 0x9b, 0x2a, 0xe8, 0x2d, 0x5b, + 0x01, 0x01, 0x5e, 0x09, 0x31, 0x84, 0xb1, 0x7c, + 0xb7, 0x86, 0x35, 0x03, 0xa7, 0x83, 0xe1, 0xbb }, + .b_public = (u8[32]){ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x3f }, + .expected_ss = (u8[32]){ 0xd4, 0x80, 0xde, 0x04, 0xf6, 0x99, 0xcb, 0x3b, + 0xe0, 0x68, 0x4a, 0x9c, 0xc2, 0xe3, 0x12, 0x81, + 0xea, 0x0b, 0xc5, 0xa9, 0xdc, 0xc1, 0x57, 0xd3, + 0xd2, 0x01, 0x58, 0xd4, 0x6c, 0xa5, 0x24, 0x6d }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case for public key */ +{ + .secret = (u8[32]){ 0xe0, 0x6c, 0x11, 0xbb, 0x2e, 0x13, 0xce, 0x3d, + 0xc7, 0x67, 0x3f, 0x67, 0xf5, 0x48, 0x22, 0x42, + 0x90, 0x94, 0x23, 0xa9, 0xae, 0x95, 0xee, 0x98, + 0x6a, 0x98, 0x8d, 0x98, 0xfa, 0xee, 0x23, 0xa2 }, + .b_public = (u8[32]){ 0xff, 0xff, 0xff, 0xff, 0xfe, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0xff, 0xfe, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0xff, 0xfe, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0xff, 0xfe, 0xff, 0xff, 0x7f }, + .expected_ss = (u8[32]){ 0x4c, 0x44, 0x01, 0xcc, 0xe6, 0xb5, 0x1e, 0x4c, + 0xb1, 0x8f, 0x27, 0x90, 0x24, 0x6c, 0x9b, 0xf9, + 0x14, 0xdb, 0x66, 0x77, 0x50, 0xa1, 0xcb, 0x89, + 0x06, 0x90, 0x92, 0xaf, 0x07, 0x29, 0x22, 0x76 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case for public key */ +{ + .secret = (u8[32]){ 0xc0, 0x65, 0x8c, 0x46, 0xdd, 0xe1, 0x81, 0x29, + 0x29, 0x38, 0x77, 0x53, 0x5b, 0x11, 0x62, 0xb6, + 0xf9, 0xf5, 0x41, 0x4a, 0x23, 0xcf, 0x4d, 0x2c, + 0xbc, 0x14, 0x0a, 0x4d, 0x99, 0xda, 0x2b, 0x8f }, + .b_public = (u8[32]){ 0xeb, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f }, + .expected_ss = (u8[32]){ 0x57, 0x8b, 0xa8, 0xcc, 0x2d, 0xbd, 0xc5, 0x75, + 0xaf, 0xcf, 0x9d, 0xf2, 0xb3, 0xee, 0x61, 0x89, + 0xf5, 0x33, 0x7d, 0x68, 0x54, 0xc7, 0x9b, 0x4c, + 0xe1, 0x65, 0xea, 0x12, 0x29, 0x3b, 0x3a, 0x0f }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - public key >= p */ +{ + .secret = (u8[32]){ 0xf0, 0x1e, 0x48, 0xda, 0xfa, 0xc9, 0xd7, 0xbc, + 0xf5, 0x89, 0xcb, 0xc3, 0x82, 0xc8, 0x78, 0xd1, + 0x8b, 0xda, 0x35, 0x50, 0x58, 0x9f, 0xfb, 0x5d, + 0x50, 0xb5, 0x23, 0xbe, 0xbe, 0x32, 0x9d, 0xae }, + .b_public = (u8[32]){ 0xef, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f }, + .expected_ss = (u8[32]){ 0xbd, 0x36, 0xa0, 0x79, 0x0e, 0xb8, 0x83, 0x09, + 0x8c, 0x98, 0x8b, 0x21, 0x78, 0x67, 0x73, 0xde, + 0x0b, 0x3a, 0x4d, 0xf1, 0x62, 0x28, 0x2c, 0xf1, + 0x10, 0xde, 0x18, 0xdd, 0x48, 0x4c, 0xe7, 0x4b }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - public key >= p */ +{ + .secret = (u8[32]){ 0x28, 0x87, 0x96, 0xbc, 0x5a, 0xff, 0x4b, 0x81, + 0xa3, 0x75, 0x01, 0x75, 0x7b, 0xc0, 0x75, 0x3a, + 0x3c, 0x21, 0x96, 0x47, 0x90, 0xd3, 0x86, 0x99, + 0x30, 0x8d, 0xeb, 0xc1, 0x7a, 0x6e, 0xaf, 0x8d }, + .b_public = (u8[32]){ 0xf0, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f }, + .expected_ss = (u8[32]){ 0xb4, 0xe0, 0xdd, 0x76, 0xda, 0x7b, 0x07, 0x17, + 0x28, 0xb6, 0x1f, 0x85, 0x67, 0x71, 0xaa, 0x35, + 0x6e, 0x57, 0xed, 0xa7, 0x8a, 0x5b, 0x16, 0x55, + 0xcc, 0x38, 0x20, 0xfb, 0x5f, 0x85, 0x4c, 0x5c }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - public key >= p */ +{ + .secret = (u8[32]){ 0x98, 0xdf, 0x84, 0x5f, 0x66, 0x51, 0xbf, 0x11, + 0x38, 0x22, 0x1f, 0x11, 0x90, 0x41, 0xf7, 0x2b, + 0x6d, 0xbc, 0x3c, 0x4a, 0xce, 0x71, 0x43, 0xd9, + 0x9f, 0xd5, 0x5a, 0xd8, 0x67, 0x48, 0x0d, 0xa8 }, + .b_public = (u8[32]){ 0xf1, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f }, + .expected_ss = (u8[32]){ 0x6f, 0xdf, 0x6c, 0x37, 0x61, 0x1d, 0xbd, 0x53, + 0x04, 0xdc, 0x0f, 0x2e, 0xb7, 0xc9, 0x51, 0x7e, + 0xb3, 0xc5, 0x0e, 0x12, 0xfd, 0x05, 0x0a, 0xc6, + 0xde, 0xc2, 0x70, 0x71, 0xd4, 0xbf, 0xc0, 0x34 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - public key >= p */ +{ + .secret = (u8[32]){ 0xf0, 0x94, 0x98, 0xe4, 0x6f, 0x02, 0xf8, 0x78, + 0x82, 0x9e, 0x78, 0xb8, 0x03, 0xd3, 0x16, 0xa2, + 0xed, 0x69, 0x5d, 0x04, 0x98, 0xa0, 0x8a, 0xbd, + 0xf8, 0x27, 0x69, 0x30, 0xe2, 0x4e, 0xdc, 0xb0 }, + .b_public = (u8[32]){ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f }, + .expected_ss = (u8[32]){ 0x4c, 0x8f, 0xc4, 0xb1, 0xc6, 0xab, 0x88, 0xfb, + 0x21, 0xf1, 0x8f, 0x6d, 0x4c, 0x81, 0x02, 0x40, + 0xd4, 0xe9, 0x46, 0x51, 0xba, 0x44, 0xf7, 0xa2, + 0xc8, 0x63, 0xce, 0xc7, 0xdc, 0x56, 0x60, 0x2d }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - public key >= p */ +{ + .secret = (u8[32]){ 0x18, 0x13, 0xc1, 0x0a, 0x5c, 0x7f, 0x21, 0xf9, + 0x6e, 0x17, 0xf2, 0x88, 0xc0, 0xcc, 0x37, 0x60, + 0x7c, 0x04, 0xc5, 0xf5, 0xae, 0xa2, 0xdb, 0x13, + 0x4f, 0x9e, 0x2f, 0xfc, 0x66, 0xbd, 0x9d, 0xb8 }, + .b_public = (u8[32]){ 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80 }, + .expected_ss = (u8[32]){ 0x1c, 0xd0, 0xb2, 0x82, 0x67, 0xdc, 0x54, 0x1c, + 0x64, 0x2d, 0x6d, 0x7d, 0xca, 0x44, 0xa8, 0xb3, + 0x8a, 0x63, 0x73, 0x6e, 0xef, 0x5c, 0x4e, 0x65, + 0x01, 0xff, 0xbb, 0xb1, 0x78, 0x0c, 0x03, 0x3c }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - public key >= p */ +{ + .secret = (u8[32]){ 0x78, 0x57, 0xfb, 0x80, 0x86, 0x53, 0x64, 0x5a, + 0x0b, 0xeb, 0x13, 0x8a, 0x64, 0xf5, 0xf4, 0xd7, + 0x33, 0xa4, 0x5e, 0xa8, 0x4c, 0x3c, 0xda, 0x11, + 0xa9, 0xc0, 0x6f, 0x7e, 0x71, 0x39, 0x14, 0x9e }, + .b_public = (u8[32]){ 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80 }, + .expected_ss = (u8[32]){ 0x87, 0x55, 0xbe, 0x01, 0xc6, 0x0a, 0x7e, 0x82, + 0x5c, 0xff, 0x3e, 0x0e, 0x78, 0xcb, 0x3a, 0xa4, + 0x33, 0x38, 0x61, 0x51, 0x6a, 0xa5, 0x9b, 0x1c, + 0x51, 0xa8, 0xb2, 0xa5, 0x43, 0xdf, 0xa8, 0x22 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - public key >= p */ +{ + .secret = (u8[32]){ 0xe0, 0x3a, 0xa8, 0x42, 0xe2, 0xab, 0xc5, 0x6e, + 0x81, 0xe8, 0x7b, 0x8b, 0x9f, 0x41, 0x7b, 0x2a, + 0x1e, 0x59, 0x13, 0xc7, 0x23, 0xee, 0xd2, 0x8d, + 0x75, 0x2f, 0x8d, 0x47, 0xa5, 0x9f, 0x49, 0x8f }, + .b_public = (u8[32]){ 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80 }, + .expected_ss = (u8[32]){ 0x54, 0xc9, 0xa1, 0xed, 0x95, 0xe5, 0x46, 0xd2, + 0x78, 0x22, 0xa3, 0x60, 0x93, 0x1d, 0xda, 0x60, + 0xa1, 0xdf, 0x04, 0x9d, 0xa6, 0xf9, 0x04, 0x25, + 0x3c, 0x06, 0x12, 0xbb, 0xdc, 0x08, 0x74, 0x76 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - public key >= p */ +{ + .secret = (u8[32]){ 0xf8, 0xf7, 0x07, 0xb7, 0x99, 0x9b, 0x18, 0xcb, + 0x0d, 0x6b, 0x96, 0x12, 0x4f, 0x20, 0x45, 0x97, + 0x2c, 0xa2, 0x74, 0xbf, 0xc1, 0x54, 0xad, 0x0c, + 0x87, 0x03, 0x8c, 0x24, 0xc6, 0xd0, 0xd4, 0xb2 }, + .b_public = (u8[32]){ 0xda, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .expected_ss = (u8[32]){ 0xcc, 0x1f, 0x40, 0xd7, 0x43, 0xcd, 0xc2, 0x23, + 0x0e, 0x10, 0x43, 0xda, 0xba, 0x8b, 0x75, 0xe8, + 0x10, 0xf1, 0xfb, 0xab, 0x7f, 0x25, 0x52, 0x69, + 0xbd, 0x9e, 0xbb, 0x29, 0xe6, 0xbf, 0x49, 0x4f }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - public key >= p */ +{ + .secret = (u8[32]){ 0xa0, 0x34, 0xf6, 0x84, 0xfa, 0x63, 0x1e, 0x1a, + 0x34, 0x81, 0x18, 0xc1, 0xce, 0x4c, 0x98, 0x23, + 0x1f, 0x2d, 0x9e, 0xec, 0x9b, 0xa5, 0x36, 0x5b, + 0x4a, 0x05, 0xd6, 0x9a, 0x78, 0x5b, 0x07, 0x96 }, + .b_public = (u8[32]){ 0xdb, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .expected_ss = (u8[32]){ 0x54, 0x99, 0x8e, 0xe4, 0x3a, 0x5b, 0x00, 0x7b, + 0xf4, 0x99, 0xf0, 0x78, 0xe7, 0x36, 0x52, 0x44, + 0x00, 0xa8, 0xb5, 0xc7, 0xe9, 0xb9, 0xb4, 0x37, + 0x71, 0x74, 0x8c, 0x7c, 0xdf, 0x88, 0x04, 0x12 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - public key >= p */ +{ + .secret = (u8[32]){ 0x30, 0xb6, 0xc6, 0xa0, 0xf2, 0xff, 0xa6, 0x80, + 0x76, 0x8f, 0x99, 0x2b, 0xa8, 0x9e, 0x15, 0x2d, + 0x5b, 0xc9, 0x89, 0x3d, 0x38, 0xc9, 0x11, 0x9b, + 0xe4, 0xf7, 0x67, 0xbf, 0xab, 0x6e, 0x0c, 0xa5 }, + .b_public = (u8[32]){ 0xdc, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .expected_ss = (u8[32]){ 0xea, 0xd9, 0xb3, 0x8e, 0xfd, 0xd7, 0x23, 0x63, + 0x79, 0x34, 0xe5, 0x5a, 0xb7, 0x17, 0xa7, 0xae, + 0x09, 0xeb, 0x86, 0xa2, 0x1d, 0xc3, 0x6a, 0x3f, + 0xee, 0xb8, 0x8b, 0x75, 0x9e, 0x39, 0x1e, 0x09 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - public key >= p */ +{ + .secret = (u8[32]){ 0x90, 0x1b, 0x9d, 0xcf, 0x88, 0x1e, 0x01, 0xe0, + 0x27, 0x57, 0x50, 0x35, 0xd4, 0x0b, 0x43, 0xbd, + 0xc1, 0xc5, 0x24, 0x2e, 0x03, 0x08, 0x47, 0x49, + 0x5b, 0x0c, 0x72, 0x86, 0x46, 0x9b, 0x65, 0x91 }, + .b_public = (u8[32]){ 0xea, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .expected_ss = (u8[32]){ 0x60, 0x2f, 0xf4, 0x07, 0x89, 0xb5, 0x4b, 0x41, + 0x80, 0x59, 0x15, 0xfe, 0x2a, 0x62, 0x21, 0xf0, + 0x7a, 0x50, 0xff, 0xc2, 0xc3, 0xfc, 0x94, 0xcf, + 0x61, 0xf1, 0x3d, 0x79, 0x04, 0xe8, 0x8e, 0x0e }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - public key >= p */ +{ + .secret = (u8[32]){ 0x80, 0x46, 0x67, 0x7c, 0x28, 0xfd, 0x82, 0xc9, + 0xa1, 0xbd, 0xb7, 0x1a, 0x1a, 0x1a, 0x34, 0xfa, + 0xba, 0x12, 0x25, 0xe2, 0x50, 0x7f, 0xe3, 0xf5, + 0x4d, 0x10, 0xbd, 0x5b, 0x0d, 0x86, 0x5f, 0x8e }, + .b_public = (u8[32]){ 0xeb, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .expected_ss = (u8[32]){ 0xe0, 0x0a, 0xe8, 0xb1, 0x43, 0x47, 0x12, 0x47, + 0xba, 0x24, 0xf1, 0x2c, 0x88, 0x55, 0x36, 0xc3, + 0xcb, 0x98, 0x1b, 0x58, 0xe1, 0xe5, 0x6b, 0x2b, + 0xaf, 0x35, 0xc1, 0x2a, 0xe1, 0xf7, 0x9c, 0x26 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - public key >= p */ +{ + .secret = (u8[32]){ 0x60, 0x2f, 0x7e, 0x2f, 0x68, 0xa8, 0x46, 0xb8, + 0x2c, 0xc2, 0x69, 0xb1, 0xd4, 0x8e, 0x93, 0x98, + 0x86, 0xae, 0x54, 0xfd, 0x63, 0x6c, 0x1f, 0xe0, + 0x74, 0xd7, 0x10, 0x12, 0x7d, 0x47, 0x24, 0x91 }, + .b_public = (u8[32]){ 0xef, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .expected_ss = (u8[32]){ 0x98, 0xcb, 0x9b, 0x50, 0xdd, 0x3f, 0xc2, 0xb0, + 0xd4, 0xf2, 0xd2, 0xbf, 0x7c, 0x5c, 0xfd, 0xd1, + 0x0c, 0x8f, 0xcd, 0x31, 0xfc, 0x40, 0xaf, 0x1a, + 0xd4, 0x4f, 0x47, 0xc1, 0x31, 0x37, 0x63, 0x62 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - public key >= p */ +{ + .secret = (u8[32]){ 0x60, 0x88, 0x7b, 0x3d, 0xc7, 0x24, 0x43, 0x02, + 0x6e, 0xbe, 0xdb, 0xbb, 0xb7, 0x06, 0x65, 0xf4, + 0x2b, 0x87, 0xad, 0xd1, 0x44, 0x0e, 0x77, 0x68, + 0xfb, 0xd7, 0xe8, 0xe2, 0xce, 0x5f, 0x63, 0x9d }, + .b_public = (u8[32]){ 0xf0, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .expected_ss = (u8[32]){ 0x38, 0xd6, 0x30, 0x4c, 0x4a, 0x7e, 0x6d, 0x9f, + 0x79, 0x59, 0x33, 0x4f, 0xb5, 0x24, 0x5b, 0xd2, + 0xc7, 0x54, 0x52, 0x5d, 0x4c, 0x91, 0xdb, 0x95, + 0x02, 0x06, 0x92, 0x62, 0x34, 0xc1, 0xf6, 0x33 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - public key >= p */ +{ + .secret = (u8[32]){ 0x78, 0xd3, 0x1d, 0xfa, 0x85, 0x44, 0x97, 0xd7, + 0x2d, 0x8d, 0xef, 0x8a, 0x1b, 0x7f, 0xb0, 0x06, + 0xce, 0xc2, 0xd8, 0xc4, 0x92, 0x46, 0x47, 0xc9, + 0x38, 0x14, 0xae, 0x56, 0xfa, 0xed, 0xa4, 0x95 }, + .b_public = (u8[32]){ 0xf1, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .expected_ss = (u8[32]){ 0x78, 0x6c, 0xd5, 0x49, 0x96, 0xf0, 0x14, 0xa5, + 0xa0, 0x31, 0xec, 0x14, 0xdb, 0x81, 0x2e, 0xd0, + 0x83, 0x55, 0x06, 0x1f, 0xdb, 0x5d, 0xe6, 0x80, + 0xa8, 0x00, 0xac, 0x52, 0x1f, 0x31, 0x8e, 0x23 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - public key >= p */ +{ + .secret = (u8[32]){ 0xc0, 0x4c, 0x5b, 0xae, 0xfa, 0x83, 0x02, 0xdd, + 0xde, 0xd6, 0xa4, 0xbb, 0x95, 0x77, 0x61, 0xb4, + 0xeb, 0x97, 0xae, 0xfa, 0x4f, 0xc3, 0xb8, 0x04, + 0x30, 0x85, 0xf9, 0x6a, 0x56, 0x59, 0xb3, 0xa5 }, + .b_public = (u8[32]){ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .expected_ss = (u8[32]){ 0x29, 0xae, 0x8b, 0xc7, 0x3e, 0x9b, 0x10, 0xa0, + 0x8b, 0x4f, 0x68, 0x1c, 0x43, 0xc3, 0xe0, 0xac, + 0x1a, 0x17, 0x1d, 0x31, 0xb3, 0x8f, 0x1a, 0x48, + 0xef, 0xba, 0x29, 0xae, 0x63, 0x9e, 0xa1, 0x34 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - RFC 7748 */ +{ + .secret = (u8[32]){ 0xa0, 0x46, 0xe3, 0x6b, 0xf0, 0x52, 0x7c, 0x9d, + 0x3b, 0x16, 0x15, 0x4b, 0x82, 0x46, 0x5e, 0xdd, + 0x62, 0x14, 0x4c, 0x0a, 0xc1, 0xfc, 0x5a, 0x18, + 0x50, 0x6a, 0x22, 0x44, 0xba, 0x44, 0x9a, 0x44 }, + .b_public = (u8[32]){ 0xe6, 0xdb, 0x68, 0x67, 0x58, 0x30, 0x30, 0xdb, + 0x35, 0x94, 0xc1, 0xa4, 0x24, 0xb1, 0x5f, 0x7c, + 0x72, 0x66, 0x24, 0xec, 0x26, 0xb3, 0x35, 0x3b, + 0x10, 0xa9, 0x03, 0xa6, 0xd0, 0xab, 0x1c, 0x4c }, + .expected_ss = (u8[32]){ 0xc3, 0xda, 0x55, 0x37, 0x9d, 0xe9, 0xc6, 0x90, + 0x8e, 0x94, 0xea, 0x4d, 0xf2, 0x8d, 0x08, 0x4f, + 0x32, 0xec, 0xcf, 0x03, 0x49, 0x1c, 0x71, 0xf7, + 0x54, 0xb4, 0x07, 0x55, 0x77, 0xa2, 0x85, 0x52 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - RFC 7748 */ +{ + .secret = (u8[32]){ 0x48, 0x66, 0xe9, 0xd4, 0xd1, 0xb4, 0x67, 0x3c, + 0x5a, 0xd2, 0x26, 0x91, 0x95, 0x7d, 0x6a, 0xf5, + 0xc1, 0x1b, 0x64, 0x21, 0xe0, 0xea, 0x01, 0xd4, + 0x2c, 0xa4, 0x16, 0x9e, 0x79, 0x18, 0xba, 0x4d }, + .b_public = (u8[32]){ 0xe5, 0x21, 0x0f, 0x12, 0x78, 0x68, 0x11, 0xd3, + 0xf4, 0xb7, 0x95, 0x9d, 0x05, 0x38, 0xae, 0x2c, + 0x31, 0xdb, 0xe7, 0x10, 0x6f, 0xc0, 0x3c, 0x3e, + 0xfc, 0x4c, 0xd5, 0x49, 0xc7, 0x15, 0xa4, 0x13 }, + .expected_ss = (u8[32]){ 0x95, 0xcb, 0xde, 0x94, 0x76, 0xe8, 0x90, 0x7d, + 0x7a, 0xad, 0xe4, 0x5c, 0xb4, 0xb8, 0x73, 0xf8, + 0x8b, 0x59, 0x5a, 0x68, 0x79, 0x9f, 0xa1, 0x52, + 0xe6, 0xf8, 0xf7, 0x64, 0x7a, 0xac, 0x79, 0x57 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case for shared secret */ +{ + .secret = (u8[32]){ 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .b_public = (u8[32]){ 0x0a, 0xb4, 0xe7, 0x63, 0x80, 0xd8, 0x4d, 0xde, + 0x4f, 0x68, 0x33, 0xc5, 0x8f, 0x2a, 0x9f, 0xb8, + 0xf8, 0x3b, 0xb0, 0x16, 0x9b, 0x17, 0x2b, 0xe4, + 0xb6, 0xe0, 0x59, 0x28, 0x87, 0x74, 0x1a, 0x36 }, + .expected_ss = (u8[32]){ 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case for shared secret */ +{ + .secret = (u8[32]){ 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .b_public = (u8[32]){ 0x89, 0xe1, 0x0d, 0x57, 0x01, 0xb4, 0x33, 0x7d, + 0x2d, 0x03, 0x21, 0x81, 0x53, 0x8b, 0x10, 0x64, + 0xbd, 0x40, 0x84, 0x40, 0x1c, 0xec, 0xa1, 0xfd, + 0x12, 0x66, 0x3a, 0x19, 0x59, 0x38, 0x80, 0x00 }, + .expected_ss = (u8[32]){ 0x09, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case for shared secret */ +{ + .secret = (u8[32]){ 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .b_public = (u8[32]){ 0x2b, 0x55, 0xd3, 0xaa, 0x4a, 0x8f, 0x80, 0xc8, + 0xc0, 0xb2, 0xae, 0x5f, 0x93, 0x3e, 0x85, 0xaf, + 0x49, 0xbe, 0xac, 0x36, 0xc2, 0xfa, 0x73, 0x94, + 0xba, 0xb7, 0x6c, 0x89, 0x33, 0xf8, 0xf8, 0x1d }, + .expected_ss = (u8[32]){ 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case for shared secret */ +{ + .secret = (u8[32]){ 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .b_public = (u8[32]){ 0x63, 0xe5, 0xb1, 0xfe, 0x96, 0x01, 0xfe, 0x84, + 0x38, 0x5d, 0x88, 0x66, 0xb0, 0x42, 0x12, 0x62, + 0xf7, 0x8f, 0xbf, 0xa5, 0xaf, 0xf9, 0x58, 0x5e, + 0x62, 0x66, 0x79, 0xb1, 0x85, 0x47, 0xd9, 0x59 }, + .expected_ss = (u8[32]){ 0xfe, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x3f }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case for shared secret */ +{ + .secret = (u8[32]){ 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .b_public = (u8[32]){ 0xe4, 0x28, 0xf3, 0xda, 0xc1, 0x78, 0x09, 0xf8, + 0x27, 0xa5, 0x22, 0xce, 0x32, 0x35, 0x50, 0x58, + 0xd0, 0x73, 0x69, 0x36, 0x4a, 0xa7, 0x89, 0x02, + 0xee, 0x10, 0x13, 0x9b, 0x9f, 0x9d, 0xd6, 0x53 }, + .expected_ss = (u8[32]){ 0xfc, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x3f }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case for shared secret */ +{ + .secret = (u8[32]){ 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .b_public = (u8[32]){ 0xb3, 0xb5, 0x0e, 0x3e, 0xd3, 0xa4, 0x07, 0xb9, + 0x5d, 0xe9, 0x42, 0xef, 0x74, 0x57, 0x5b, 0x5a, + 0xb8, 0xa1, 0x0c, 0x09, 0xee, 0x10, 0x35, 0x44, + 0xd6, 0x0b, 0xdf, 0xed, 0x81, 0x38, 0xab, 0x2b }, + .expected_ss = (u8[32]){ 0xf9, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x3f }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case for shared secret */ +{ + .secret = (u8[32]){ 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .b_public = (u8[32]){ 0x21, 0x3f, 0xff, 0xe9, 0x3d, 0x5e, 0xa8, 0xcd, + 0x24, 0x2e, 0x46, 0x28, 0x44, 0x02, 0x99, 0x22, + 0xc4, 0x3c, 0x77, 0xc9, 0xe3, 0xe4, 0x2f, 0x56, + 0x2f, 0x48, 0x5d, 0x24, 0xc5, 0x01, 0xa2, 0x0b }, + .expected_ss = (u8[32]){ 0xf3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x3f }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case for shared secret */ +{ + .secret = (u8[32]){ 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .b_public = (u8[32]){ 0x91, 0xb2, 0x32, 0xa1, 0x78, 0xb3, 0xcd, 0x53, + 0x09, 0x32, 0x44, 0x1e, 0x61, 0x39, 0x41, 0x8f, + 0x72, 0x17, 0x22, 0x92, 0xf1, 0xda, 0x4c, 0x18, + 0x34, 0xfc, 0x5e, 0xbf, 0xef, 0xb5, 0x1e, 0x3f }, + .expected_ss = (u8[32]){ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x03 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case for shared secret */ +{ + .secret = (u8[32]){ 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .b_public = (u8[32]){ 0x04, 0x5c, 0x6e, 0x11, 0xc5, 0xd3, 0x32, 0x55, + 0x6c, 0x78, 0x22, 0xfe, 0x94, 0xeb, 0xf8, 0x9b, + 0x56, 0xa3, 0x87, 0x8d, 0xc2, 0x7c, 0xa0, 0x79, + 0x10, 0x30, 0x58, 0x84, 0x9f, 0xab, 0xcb, 0x4f }, + .expected_ss = (u8[32]){ 0xe5, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case for shared secret */ +{ + .secret = (u8[32]){ 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .b_public = (u8[32]){ 0x1c, 0xa2, 0x19, 0x0b, 0x71, 0x16, 0x35, 0x39, + 0x06, 0x3c, 0x35, 0x77, 0x3b, 0xda, 0x0c, 0x9c, + 0x92, 0x8e, 0x91, 0x36, 0xf0, 0x62, 0x0a, 0xeb, + 0x09, 0x3f, 0x09, 0x91, 0x97, 0xb7, 0xf7, 0x4e }, + .expected_ss = (u8[32]){ 0xe3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case for shared secret */ +{ + .secret = (u8[32]){ 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .b_public = (u8[32]){ 0xf7, 0x6e, 0x90, 0x10, 0xac, 0x33, 0xc5, 0x04, + 0x3b, 0x2d, 0x3b, 0x76, 0xa8, 0x42, 0x17, 0x10, + 0x00, 0xc4, 0x91, 0x62, 0x22, 0xe9, 0xe8, 0x58, + 0x97, 0xa0, 0xae, 0xc7, 0xf6, 0x35, 0x0b, 0x3c }, + .expected_ss = (u8[32]){ 0xdd, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case for shared secret */ +{ + .secret = (u8[32]){ 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .b_public = (u8[32]){ 0xbb, 0x72, 0x68, 0x8d, 0x8f, 0x8a, 0xa7, 0xa3, + 0x9c, 0xd6, 0x06, 0x0c, 0xd5, 0xc8, 0x09, 0x3c, + 0xde, 0xc6, 0xfe, 0x34, 0x19, 0x37, 0xc3, 0x88, + 0x6a, 0x99, 0x34, 0x6c, 0xd0, 0x7f, 0xaa, 0x55 }, + .expected_ss = (u8[32]){ 0xdb, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case for shared secret */ +{ + .secret = (u8[32]){ 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .b_public = (u8[32]){ 0x88, 0xfd, 0xde, 0xa1, 0x93, 0x39, 0x1c, 0x6a, + 0x59, 0x33, 0xef, 0x9b, 0x71, 0x90, 0x15, 0x49, + 0x44, 0x72, 0x05, 0xaa, 0xe9, 0xda, 0x92, 0x8a, + 0x6b, 0x91, 0xa3, 0x52, 0xba, 0x10, 0xf4, 0x1f }, + .expected_ss = (u8[32]){ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - edge case for shared secret */ +{ + .secret = (u8[32]){ 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .b_public = (u8[32]){ 0x30, 0x3b, 0x39, 0x2f, 0x15, 0x31, 0x16, 0xca, + 0xd9, 0xcc, 0x68, 0x2a, 0x00, 0xcc, 0xc4, 0x4c, + 0x95, 0xff, 0x0d, 0x3b, 0xbe, 0x56, 0x8b, 0xeb, + 0x6c, 0x4e, 0x73, 0x9b, 0xaf, 0xdc, 0x2c, 0x68 }, + .expected_ss = (u8[32]){ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80, 0x00 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - checking for overflow */ +{ + .secret = (u8[32]){ 0xc8, 0x17, 0x24, 0x70, 0x40, 0x00, 0xb2, 0x6d, + 0x31, 0x70, 0x3c, 0xc9, 0x7e, 0x3a, 0x37, 0x8d, + 0x56, 0xfa, 0xd8, 0x21, 0x93, 0x61, 0xc8, 0x8c, + 0xca, 0x8b, 0xd7, 0xc5, 0x71, 0x9b, 0x12, 0xb2 }, + .b_public = (u8[32]){ 0xfd, 0x30, 0x0a, 0xeb, 0x40, 0xe1, 0xfa, 0x58, + 0x25, 0x18, 0x41, 0x2b, 0x49, 0xb2, 0x08, 0xa7, + 0x84, 0x2b, 0x1e, 0x1f, 0x05, 0x6a, 0x04, 0x01, + 0x78, 0xea, 0x41, 0x41, 0x53, 0x4f, 0x65, 0x2d }, + .expected_ss = (u8[32]){ 0xb7, 0x34, 0x10, 0x5d, 0xc2, 0x57, 0x58, 0x5d, + 0x73, 0xb5, 0x66, 0xcc, 0xb7, 0x6f, 0x06, 0x27, + 0x95, 0xcc, 0xbe, 0xc8, 0x91, 0x28, 0xe5, 0x2b, + 0x02, 0xf3, 0xe5, 0x96, 0x39, 0xf1, 0x3c, 0x46 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - checking for overflow */ +{ + .secret = (u8[32]){ 0xc8, 0x17, 0x24, 0x70, 0x40, 0x00, 0xb2, 0x6d, + 0x31, 0x70, 0x3c, 0xc9, 0x7e, 0x3a, 0x37, 0x8d, + 0x56, 0xfa, 0xd8, 0x21, 0x93, 0x61, 0xc8, 0x8c, + 0xca, 0x8b, 0xd7, 0xc5, 0x71, 0x9b, 0x12, 0xb2 }, + .b_public = (u8[32]){ 0xc8, 0xef, 0x79, 0xb5, 0x14, 0xd7, 0x68, 0x26, + 0x77, 0xbc, 0x79, 0x31, 0xe0, 0x6e, 0xe5, 0xc2, + 0x7c, 0x9b, 0x39, 0x2b, 0x4a, 0xe9, 0x48, 0x44, + 0x73, 0xf5, 0x54, 0xe6, 0x67, 0x8e, 0xcc, 0x2e }, + .expected_ss = (u8[32]){ 0x64, 0x7a, 0x46, 0xb6, 0xfc, 0x3f, 0x40, 0xd6, + 0x21, 0x41, 0xee, 0x3c, 0xee, 0x70, 0x6b, 0x4d, + 0x7a, 0x92, 0x71, 0x59, 0x3a, 0x7b, 0x14, 0x3e, + 0x8e, 0x2e, 0x22, 0x79, 0x88, 0x3e, 0x45, 0x50 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - checking for overflow */ +{ + .secret = (u8[32]){ 0xc8, 0x17, 0x24, 0x70, 0x40, 0x00, 0xb2, 0x6d, + 0x31, 0x70, 0x3c, 0xc9, 0x7e, 0x3a, 0x37, 0x8d, + 0x56, 0xfa, 0xd8, 0x21, 0x93, 0x61, 0xc8, 0x8c, + 0xca, 0x8b, 0xd7, 0xc5, 0x71, 0x9b, 0x12, 0xb2 }, + .b_public = (u8[32]){ 0x64, 0xae, 0xac, 0x25, 0x04, 0x14, 0x48, 0x61, + 0x53, 0x2b, 0x7b, 0xbc, 0xb6, 0xc8, 0x7d, 0x67, + 0xdd, 0x4c, 0x1f, 0x07, 0xeb, 0xc2, 0xe0, 0x6e, + 0xff, 0xb9, 0x5a, 0xec, 0xc6, 0x17, 0x0b, 0x2c }, + .expected_ss = (u8[32]){ 0x4f, 0xf0, 0x3d, 0x5f, 0xb4, 0x3c, 0xd8, 0x65, + 0x7a, 0x3c, 0xf3, 0x7c, 0x13, 0x8c, 0xad, 0xce, + 0xcc, 0xe5, 0x09, 0xe4, 0xeb, 0xa0, 0x89, 0xd0, + 0xef, 0x40, 0xb4, 0xe4, 0xfb, 0x94, 0x61, 0x55 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - checking for overflow */ +{ + .secret = (u8[32]){ 0xc8, 0x17, 0x24, 0x70, 0x40, 0x00, 0xb2, 0x6d, + 0x31, 0x70, 0x3c, 0xc9, 0x7e, 0x3a, 0x37, 0x8d, + 0x56, 0xfa, 0xd8, 0x21, 0x93, 0x61, 0xc8, 0x8c, + 0xca, 0x8b, 0xd7, 0xc5, 0x71, 0x9b, 0x12, 0xb2 }, + .b_public = (u8[32]){ 0xbf, 0x68, 0xe3, 0x5e, 0x9b, 0xdb, 0x7e, 0xee, + 0x1b, 0x50, 0x57, 0x02, 0x21, 0x86, 0x0f, 0x5d, + 0xcd, 0xad, 0x8a, 0xcb, 0xab, 0x03, 0x1b, 0x14, + 0x97, 0x4c, 0xc4, 0x90, 0x13, 0xc4, 0x98, 0x31 }, + .expected_ss = (u8[32]){ 0x21, 0xce, 0xe5, 0x2e, 0xfd, 0xbc, 0x81, 0x2e, + 0x1d, 0x02, 0x1a, 0x4a, 0xf1, 0xe1, 0xd8, 0xbc, + 0x4d, 0xb3, 0xc4, 0x00, 0xe4, 0xd2, 0xa2, 0xc5, + 0x6a, 0x39, 0x26, 0xdb, 0x4d, 0x99, 0xc6, 0x5b }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - checking for overflow */ +{ + .secret = (u8[32]){ 0xc8, 0x17, 0x24, 0x70, 0x40, 0x00, 0xb2, 0x6d, + 0x31, 0x70, 0x3c, 0xc9, 0x7e, 0x3a, 0x37, 0x8d, + 0x56, 0xfa, 0xd8, 0x21, 0x93, 0x61, 0xc8, 0x8c, + 0xca, 0x8b, 0xd7, 0xc5, 0x71, 0x9b, 0x12, 0xb2 }, + .b_public = (u8[32]){ 0x53, 0x47, 0xc4, 0x91, 0x33, 0x1a, 0x64, 0xb4, + 0x3d, 0xdc, 0x68, 0x30, 0x34, 0xe6, 0x77, 0xf5, + 0x3d, 0xc3, 0x2b, 0x52, 0xa5, 0x2a, 0x57, 0x7c, + 0x15, 0xa8, 0x3b, 0xf2, 0x98, 0xe9, 0x9f, 0x19 }, + .expected_ss = (u8[32]){ 0x18, 0xcb, 0x89, 0xe4, 0xe2, 0x0c, 0x0c, 0x2b, + 0xd3, 0x24, 0x30, 0x52, 0x45, 0x26, 0x6c, 0x93, + 0x27, 0x69, 0x0b, 0xbe, 0x79, 0xac, 0xb8, 0x8f, + 0x5b, 0x8f, 0xb3, 0xf7, 0x4e, 0xca, 0x3e, 0x52 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - private key == -1 (mod order) */ +{ + .secret = (u8[32]){ 0xa0, 0x23, 0xcd, 0xd0, 0x83, 0xef, 0x5b, 0xb8, + 0x2f, 0x10, 0xd6, 0x2e, 0x59, 0xe1, 0x5a, 0x68, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x50 }, + .b_public = (u8[32]){ 0x25, 0x8e, 0x04, 0x52, 0x3b, 0x8d, 0x25, 0x3e, + 0xe6, 0x57, 0x19, 0xfc, 0x69, 0x06, 0xc6, 0x57, + 0x19, 0x2d, 0x80, 0x71, 0x7e, 0xdc, 0x82, 0x8f, + 0xa0, 0xaf, 0x21, 0x68, 0x6e, 0x2f, 0xaa, 0x75 }, + .expected_ss = (u8[32]){ 0x25, 0x8e, 0x04, 0x52, 0x3b, 0x8d, 0x25, 0x3e, + 0xe6, 0x57, 0x19, 0xfc, 0x69, 0x06, 0xc6, 0x57, + 0x19, 0x2d, 0x80, 0x71, 0x7e, 0xdc, 0x82, 0x8f, + 0xa0, 0xaf, 0x21, 0x68, 0x6e, 0x2f, 0xaa, 0x75 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +}, +/* wycheproof - private key == 1 (mod order) on twist */ +{ + .secret = (u8[32]){ 0x58, 0x08, 0x3d, 0xd2, 0x61, 0xad, 0x91, 0xef, + 0xf9, 0x52, 0x32, 0x2e, 0xc8, 0x24, 0xc6, 0x82, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x5f }, + .b_public = (u8[32]){ 0x2e, 0xae, 0x5e, 0xc3, 0xdd, 0x49, 0x4e, 0x9f, + 0x2d, 0x37, 0xd2, 0x58, 0xf8, 0x73, 0xa8, 0xe6, + 0xe9, 0xd0, 0xdb, 0xd1, 0xe3, 0x83, 0xef, 0x64, + 0xd9, 0x8b, 0xb9, 0x1b, 0x3e, 0x0b, 0xe0, 0x35 }, + .expected_ss = (u8[32]){ 0x2e, 0xae, 0x5e, 0xc3, 0xdd, 0x49, 0x4e, 0x9f, + 0x2d, 0x37, 0xd2, 0x58, 0xf8, 0x73, 0xa8, 0xe6, + 0xe9, 0xd0, 0xdb, 0xd1, 0xe3, 0x83, 0xef, 0x64, + 0xd9, 0x8b, 0xb9, 0x1b, 0x3e, 0x0b, 0xe0, 0x35 }, + .secret_size = 32, + .b_public_size = 32, + .expected_ss_size = 32, + +} +}; + static const struct kpp_testvec ecdh_tv_template[] = { { #ifndef CONFIG_CRYPTO_FIPS @@ -32105,8 +33330,9 @@ static const struct cipher_testvec xchacha20_tv_template[] = { "\x57\x78\x8e\x6f\xae\x90\xfc\x31" "\x09\x7c\xfc", .len = 91, - }, { /* Taken from the ChaCha20 test vectors, appended 16 random bytes - to nonce, and recomputed the ciphertext with libsodium */ + }, { /* Taken from the ChaCha20 test vectors, appended 12 random bytes + to the nonce, zero-padded the stream position from 4 to 8 bytes, + and recomputed the ciphertext using libsodium's XChaCha20 */ .key = "\x00\x00\x00\x00\x00\x00\x00\x00" "\x00\x00\x00\x00\x00\x00\x00\x00" "\x00\x00\x00\x00\x00\x00\x00\x00" @@ -32133,8 +33359,7 @@ static const struct cipher_testvec xchacha20_tv_template[] = { "\x03\xdc\xf8\x2b\xc1\xe1\x75\x67" "\x23\x7b\xe6\xfc\xd4\x03\x86\x54", .len = 64, - }, { /* Taken from the ChaCha20 test vectors, appended 16 random bytes - to nonce, and recomputed the ciphertext with libsodium */ + }, { /* Derived from a ChaCha20 test vector, via the process above */ .key = "\x00\x00\x00\x00\x00\x00\x00\x00" "\x00\x00\x00\x00\x00\x00\x00\x00" "\x00\x00\x00\x00\x00\x00\x00\x00" @@ -32243,8 +33468,7 @@ static const struct cipher_testvec xchacha20_tv_template[] = { .np = 3, .tap = { 375 - 20, 4, 16 }, - }, { /* Taken from the ChaCha20 test vectors, appended 16 random bytes - to nonce, and recomputed the ciphertext with libsodium */ + }, { /* Derived from a ChaCha20 test vector, via the process above */ .key = "\x1c\x92\x40\xa5\xeb\x55\xd3\x8a" "\xf3\x33\x88\x86\x04\xf6\xb5\xf0" "\x47\x39\x17\xc1\x40\x2b\x80\x09" @@ -32287,8 +33511,7 @@ static const struct cipher_testvec xchacha20_tv_template[] = { "\x65\x03\xfa\x45\xf7\x9e\x53\x7a" "\x99\xf1\x82\x25\x4f\x8d\x07", .len = 127, - }, { /* Taken from the ChaCha20 test vectors, appended 16 random bytes - to nonce, and recomputed the ciphertext with libsodium */ + }, { /* Derived from a ChaCha20 test vector, via the process above */ .key = "\x1c\x92\x40\xa5\xeb\x55\xd3\x8a" "\xf3\x33\x88\x86\x04\xf6\xb5\xf0" "\x47\x39\x17\xc1\x40\x2b\x80\x09" @@ -32624,7 +33847,94 @@ static const struct cipher_testvec xchacha20_tv_template[] = { .also_non_np = 1, .np = 3, .tap = { 1200, 1, 80 }, - }, + }, { /* test vector from https://tools.ietf.org/html/draft-arciszewski-xchacha-02#appendix-A.3.2 */ + .key = "\x80\x81\x82\x83\x84\x85\x86\x87" + "\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f" + "\x90\x91\x92\x93\x94\x95\x96\x97" + "\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f", + .klen = 32, + .iv = "\x40\x41\x42\x43\x44\x45\x46\x47" + "\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f" + "\x50\x51\x52\x53\x54\x55\x56\x58" + "\x00\x00\x00\x00\x00\x00\x00\x00", + .ptext = "\x54\x68\x65\x20\x64\x68\x6f\x6c" + "\x65\x20\x28\x70\x72\x6f\x6e\x6f" + "\x75\x6e\x63\x65\x64\x20\x22\x64" + "\x6f\x6c\x65\x22\x29\x20\x69\x73" + "\x20\x61\x6c\x73\x6f\x20\x6b\x6e" + "\x6f\x77\x6e\x20\x61\x73\x20\x74" + "\x68\x65\x20\x41\x73\x69\x61\x74" + "\x69\x63\x20\x77\x69\x6c\x64\x20" + "\x64\x6f\x67\x2c\x20\x72\x65\x64" + "\x20\x64\x6f\x67\x2c\x20\x61\x6e" + "\x64\x20\x77\x68\x69\x73\x74\x6c" + "\x69\x6e\x67\x20\x64\x6f\x67\x2e" + "\x20\x49\x74\x20\x69\x73\x20\x61" + "\x62\x6f\x75\x74\x20\x74\x68\x65" + "\x20\x73\x69\x7a\x65\x20\x6f\x66" + "\x20\x61\x20\x47\x65\x72\x6d\x61" + "\x6e\x20\x73\x68\x65\x70\x68\x65" + "\x72\x64\x20\x62\x75\x74\x20\x6c" + "\x6f\x6f\x6b\x73\x20\x6d\x6f\x72" + "\x65\x20\x6c\x69\x6b\x65\x20\x61" + "\x20\x6c\x6f\x6e\x67\x2d\x6c\x65" + "\x67\x67\x65\x64\x20\x66\x6f\x78" + "\x2e\x20\x54\x68\x69\x73\x20\x68" + "\x69\x67\x68\x6c\x79\x20\x65\x6c" + "\x75\x73\x69\x76\x65\x20\x61\x6e" + "\x64\x20\x73\x6b\x69\x6c\x6c\x65" + "\x64\x20\x6a\x75\x6d\x70\x65\x72" + "\x20\x69\x73\x20\x63\x6c\x61\x73" + "\x73\x69\x66\x69\x65\x64\x20\x77" + "\x69\x74\x68\x20\x77\x6f\x6c\x76" + "\x65\x73\x2c\x20\x63\x6f\x79\x6f" + "\x74\x65\x73\x2c\x20\x6a\x61\x63" + "\x6b\x61\x6c\x73\x2c\x20\x61\x6e" + "\x64\x20\x66\x6f\x78\x65\x73\x20" + "\x69\x6e\x20\x74\x68\x65\x20\x74" + "\x61\x78\x6f\x6e\x6f\x6d\x69\x63" + "\x20\x66\x61\x6d\x69\x6c\x79\x20" + "\x43\x61\x6e\x69\x64\x61\x65\x2e", + .ctext = "\x45\x59\xab\xba\x4e\x48\xc1\x61" + "\x02\xe8\xbb\x2c\x05\xe6\x94\x7f" + "\x50\xa7\x86\xde\x16\x2f\x9b\x0b" + "\x7e\x59\x2a\x9b\x53\xd0\xd4\xe9" + "\x8d\x8d\x64\x10\xd5\x40\xa1\xa6" + "\x37\x5b\x26\xd8\x0d\xac\xe4\xfa" + "\xb5\x23\x84\xc7\x31\xac\xbf\x16" + "\xa5\x92\x3c\x0c\x48\xd3\x57\x5d" + "\x4d\x0d\x2c\x67\x3b\x66\x6f\xaa" + "\x73\x10\x61\x27\x77\x01\x09\x3a" + "\x6b\xf7\xa1\x58\xa8\x86\x42\x92" + "\xa4\x1c\x48\xe3\xa9\xb4\xc0\xda" + "\xec\xe0\xf8\xd9\x8d\x0d\x7e\x05" + "\xb3\x7a\x30\x7b\xbb\x66\x33\x31" + "\x64\xec\x9e\x1b\x24\xea\x0d\x6c" + "\x3f\xfd\xdc\xec\x4f\x68\xe7\x44" + "\x30\x56\x19\x3a\x03\xc8\x10\xe1" + "\x13\x44\xca\x06\xd8\xed\x8a\x2b" + "\xfb\x1e\x8d\x48\xcf\xa6\xbc\x0e" + "\xb4\xe2\x46\x4b\x74\x81\x42\x40" + "\x7c\x9f\x43\x1a\xee\x76\x99\x60" + "\xe1\x5b\xa8\xb9\x68\x90\x46\x6e" + "\xf2\x45\x75\x99\x85\x23\x85\xc6" + "\x61\xf7\x52\xce\x20\xf9\xda\x0c" + "\x09\xab\x6b\x19\xdf\x74\xe7\x6a" + "\x95\x96\x74\x46\xf8\xd0\xfd\x41" + "\x5e\x7b\xee\x2a\x12\xa1\x14\xc2" + "\x0e\xb5\x29\x2a\xe7\xa3\x49\xae" + "\x57\x78\x20\xd5\x52\x0a\x1f\x3f" + "\xb6\x2a\x17\xce\x6a\x7e\x68\xfa" + "\x7c\x79\x11\x1d\x88\x60\x92\x0b" + "\xc0\x48\xef\x43\xfe\x84\x48\x6c" + "\xcb\x87\xc2\x5f\x0a\xe0\x45\xf0" + "\xcc\xe1\xe7\x98\x9a\x9a\xa2\x20" + "\xa2\x8b\xdd\x48\x27\xe7\x51\xa2" + "\x4a\x6d\x5c\x62\xd7\x90\xa6\x63" + "\x93\xb9\x31\x11\xc1\xa5\x5d\xd7" + "\x42\x1a\x10\x18\x49\x74\xc7\xc5", + .len = 304, + } }; /* @@ -33202,7 +34512,94 @@ static const struct cipher_testvec xchacha12_tv_template[] = { .also_non_np = 1, .np = 3, .tap = { 1200, 1, 80 }, - }, + }, { + .key = "\x80\x81\x82\x83\x84\x85\x86\x87" + "\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f" + "\x90\x91\x92\x93\x94\x95\x96\x97" + "\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f", + .klen = 32, + .iv = "\x40\x41\x42\x43\x44\x45\x46\x47" + "\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f" + "\x50\x51\x52\x53\x54\x55\x56\x58" + "\x00\x00\x00\x00\x00\x00\x00\x00", + .ptext = "\x54\x68\x65\x20\x64\x68\x6f\x6c" + "\x65\x20\x28\x70\x72\x6f\x6e\x6f" + "\x75\x6e\x63\x65\x64\x20\x22\x64" + "\x6f\x6c\x65\x22\x29\x20\x69\x73" + "\x20\x61\x6c\x73\x6f\x20\x6b\x6e" + "\x6f\x77\x6e\x20\x61\x73\x20\x74" + "\x68\x65\x20\x41\x73\x69\x61\x74" + "\x69\x63\x20\x77\x69\x6c\x64\x20" + "\x64\x6f\x67\x2c\x20\x72\x65\x64" + "\x20\x64\x6f\x67\x2c\x20\x61\x6e" + "\x64\x20\x77\x68\x69\x73\x74\x6c" + "\x69\x6e\x67\x20\x64\x6f\x67\x2e" + "\x20\x49\x74\x20\x69\x73\x20\x61" + "\x62\x6f\x75\x74\x20\x74\x68\x65" + "\x20\x73\x69\x7a\x65\x20\x6f\x66" + "\x20\x61\x20\x47\x65\x72\x6d\x61" + "\x6e\x20\x73\x68\x65\x70\x68\x65" + "\x72\x64\x20\x62\x75\x74\x20\x6c" + "\x6f\x6f\x6b\x73\x20\x6d\x6f\x72" + "\x65\x20\x6c\x69\x6b\x65\x20\x61" + "\x20\x6c\x6f\x6e\x67\x2d\x6c\x65" + "\x67\x67\x65\x64\x20\x66\x6f\x78" + "\x2e\x20\x54\x68\x69\x73\x20\x68" + "\x69\x67\x68\x6c\x79\x20\x65\x6c" + "\x75\x73\x69\x76\x65\x20\x61\x6e" + "\x64\x20\x73\x6b\x69\x6c\x6c\x65" + "\x64\x20\x6a\x75\x6d\x70\x65\x72" + "\x20\x69\x73\x20\x63\x6c\x61\x73" + "\x73\x69\x66\x69\x65\x64\x20\x77" + "\x69\x74\x68\x20\x77\x6f\x6c\x76" + "\x65\x73\x2c\x20\x63\x6f\x79\x6f" + "\x74\x65\x73\x2c\x20\x6a\x61\x63" + "\x6b\x61\x6c\x73\x2c\x20\x61\x6e" + "\x64\x20\x66\x6f\x78\x65\x73\x20" + "\x69\x6e\x20\x74\x68\x65\x20\x74" + "\x61\x78\x6f\x6e\x6f\x6d\x69\x63" + "\x20\x66\x61\x6d\x69\x6c\x79\x20" + "\x43\x61\x6e\x69\x64\x61\x65\x2e", + .ctext = "\x9f\x1a\xab\x8a\x95\xf4\x7e\xcd" + "\xee\x34\xc0\x39\xd6\x23\x43\x94" + "\xf6\x01\xc1\x7f\x60\x91\xa5\x23" + "\x4a\x8a\xe6\xb1\x14\x8b\xd7\x58" + "\xee\x02\xad\xab\xce\x1e\x7d\xdf" + "\xf9\x49\x27\x69\xd0\x8d\x0c\x20" + "\x6e\x17\xc4\xae\x87\x7a\xc6\x61" + "\x91\xe2\x8e\x0a\x1d\x61\xcc\x38" + "\x02\x64\x43\x49\xc6\xb2\x59\x59" + "\x42\xe7\x9d\x83\x00\x60\x90\xd2" + "\xb9\xcd\x97\x6e\xc7\x95\x71\xbc" + "\x23\x31\x58\x07\xb3\xb4\xac\x0b" + "\x87\x64\x56\xe5\xe3\xec\x63\xa1" + "\x71\x8c\x08\x48\x33\x20\x29\x81" + "\xea\x01\x25\x20\xc3\xda\xe6\xee" + "\x6a\x03\xf6\x68\x4d\x26\xa0\x91" + "\x9e\x44\xb8\xc1\xc0\x8f\x5a\x6a" + "\xc0\xcd\xbf\x24\x5e\x40\x66\xd2" + "\x42\x24\xb5\xbf\xc1\xeb\x12\x60" + "\x56\xbe\xb1\xa6\xc4\x0f\xfc\x49" + "\x69\x9f\xcc\x06\x5c\xe3\x26\xd7" + "\x52\xc0\x42\xe8\xb4\x76\xc3\xee" + "\xb2\x97\xe3\x37\x61\x29\x5a\xb5" + "\x8e\xe8\x8c\xc5\x38\xcc\xcb\xec" + "\x64\x1a\xa9\x12\x5f\xf7\x79\xdf" + "\x64\xca\x77\x4e\xbd\xf9\x83\xa0" + "\x13\x27\x3f\x31\x03\x63\x30\x26" + "\x27\x0b\x3e\xb3\x23\x13\x61\x0b" + "\x70\x1d\xd4\xad\x85\x1e\xbf\xdf" + "\xc6\x8e\x4d\x08\xcc\x7e\x77\xbd" + "\x1e\x18\x77\x38\x3a\xfe\xc0\x5d" + "\x16\xfc\xf0\xa9\x2f\xe9\x17\xc7" + "\xd3\x23\x17\x18\xa3\xe6\x54\x77" + "\x6f\x1b\xbe\x8a\x6e\x7e\xca\x97" + "\x08\x05\x36\x76\xaf\x12\x7a\x42" + "\xf7\x7a\xc2\x35\xc3\xb4\x93\x40" + "\x54\x14\x90\xa0\x4d\x65\x1c\x37" + "\x50\x70\x44\x29\x6d\x6e\x62\x68", + .len = 304, + } }; /* Adiantum test vectors from https://github.com/google/adiantum */ @@ -35134,4 +36531,256 @@ static const struct comp_testvec zstd_decomp_tv_template[] = { "functions.", }, }; + +static const char blake2_ordered_sequence[] = + "\x00\x01\x02\x03\x04\x05\x06\x07" + "\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f" + "\x10\x11\x12\x13\x14\x15\x16\x17" + "\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f" + "\x20\x21\x22\x23\x24\x25\x26\x27" + "\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f" + "\x30\x31\x32\x33\x34\x35\x36\x37" + "\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f" + "\x40\x41\x42\x43\x44\x45\x46\x47" + "\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f" + "\x50\x51\x52\x53\x54\x55\x56\x57" + "\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f" + "\x60\x61\x62\x63\x64\x65\x66\x67" + "\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f" + "\x70\x71\x72\x73\x74\x75\x76\x77" + "\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f" + "\x80\x81\x82\x83\x84\x85\x86\x87" + "\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f" + "\x90\x91\x92\x93\x94\x95\x96\x97" + "\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f" + "\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7" + "\xa8\xa9\xaa\xab\xac\xad\xae\xaf" + "\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7" + "\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf" + "\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7" + "\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf" + "\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7" + "\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf" + "\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7" + "\xe8\xe9\xea\xeb\xec\xed\xee\xef" + "\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7" + "\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff"; + +static const struct hash_testvec blakes2s_128_tv_template[] = {{ + .digest = (u8[]){ 0x64, 0x55, 0x0d, 0x6f, 0xfe, 0x2c, 0x0a, 0x01, + 0xa1, 0x4a, 0xba, 0x1e, 0xad, 0xe0, 0x20, 0x0c, }, +}, { + .plaintext = blake2_ordered_sequence, + .psize = 64, + .digest = (u8[]){ 0xdc, 0x66, 0xca, 0x8f, 0x03, 0x86, 0x58, 0x01, + 0xb0, 0xff, 0xe0, 0x6e, 0xd8, 0xa1, 0xa9, 0x0e, }, +}, { + .ksize = 16, + .key = blake2_ordered_sequence, + .plaintext = blake2_ordered_sequence, + .psize = 1, + .digest = (u8[]){ 0x88, 0x1e, 0x42, 0xe7, 0xbb, 0x35, 0x80, 0x82, + 0x63, 0x7c, 0x0a, 0x0f, 0xd7, 0xec, 0x6c, 0x2f, }, +}, { + .ksize = 32, + .key = blake2_ordered_sequence, + .plaintext = blake2_ordered_sequence, + .psize = 7, + .digest = (u8[]){ 0xcf, 0x9e, 0x07, 0x2a, 0xd5, 0x22, 0xf2, 0xcd, + 0xa2, 0xd8, 0x25, 0x21, 0x80, 0x86, 0x73, 0x1c, }, +}, { + .ksize = 1, + .key = "B", + .plaintext = blake2_ordered_sequence, + .psize = 15, + .digest = (u8[]){ 0xf6, 0x33, 0x5a, 0x2c, 0x22, 0xa0, 0x64, 0xb2, + 0xb6, 0x3f, 0xeb, 0xbc, 0xd1, 0xc3, 0xe5, 0xb2, }, +}, { + .ksize = 16, + .key = blake2_ordered_sequence, + .plaintext = blake2_ordered_sequence, + .psize = 247, + .digest = (u8[]){ 0x72, 0x66, 0x49, 0x60, 0xf9, 0x4a, 0xea, 0xbe, + 0x1f, 0xf4, 0x60, 0xce, 0xb7, 0x81, 0xcb, 0x09, }, +}, { + .ksize = 32, + .key = blake2_ordered_sequence, + .plaintext = blake2_ordered_sequence, + .psize = 256, + .digest = (u8[]){ 0xd5, 0xa4, 0x0e, 0xc3, 0x16, 0xc7, 0x51, 0xa6, + 0x3c, 0xd0, 0xd9, 0x11, 0x57, 0xfa, 0x1e, 0xbb, }, +}}; + +static const struct hash_testvec blakes2s_160_tv_template[] = {{ + .plaintext = blake2_ordered_sequence, + .psize = 7, + .digest = (u8[]){ 0xb4, 0xf2, 0x03, 0x49, 0x37, 0xed, 0xb1, 0x3e, + 0x5b, 0x2a, 0xca, 0x64, 0x82, 0x74, 0xf6, 0x62, + 0xe3, 0xf2, 0x84, 0xff, }, +}, { + .plaintext = blake2_ordered_sequence, + .psize = 256, + .digest = (u8[]){ 0xaa, 0x56, 0x9b, 0xdc, 0x98, 0x17, 0x75, 0xf2, + 0xb3, 0x68, 0x83, 0xb7, 0x9b, 0x8d, 0x48, 0xb1, + 0x9b, 0x2d, 0x35, 0x05, }, +}, { + .ksize = 1, + .key = "B", + .digest = (u8[]){ 0x50, 0x16, 0xe7, 0x0c, 0x01, 0xd0, 0xd3, 0xc3, + 0xf4, 0x3e, 0xb1, 0x6e, 0x97, 0xa9, 0x4e, 0xd1, + 0x79, 0x65, 0x32, 0x93, }, +}, { + .ksize = 32, + .key = blake2_ordered_sequence, + .plaintext = blake2_ordered_sequence, + .psize = 1, + .digest = (u8[]){ 0x1c, 0x2b, 0xcd, 0x9a, 0x68, 0xca, 0x8c, 0x71, + 0x90, 0x29, 0x6c, 0x54, 0xfa, 0x56, 0x4a, 0xef, + 0xa2, 0x3a, 0x56, 0x9c, }, +}, { + .ksize = 16, + .key = blake2_ordered_sequence, + .plaintext = blake2_ordered_sequence, + .psize = 15, + .digest = (u8[]){ 0x36, 0xc3, 0x5f, 0x9a, 0xdc, 0x7e, 0xbf, 0x19, + 0x68, 0xaa, 0xca, 0xd8, 0x81, 0xbf, 0x09, 0x34, + 0x83, 0x39, 0x0f, 0x30, }, +}, { + .ksize = 1, + .key = "B", + .plaintext = blake2_ordered_sequence, + .psize = 64, + .digest = (u8[]){ 0x86, 0x80, 0x78, 0xa4, 0x14, 0xec, 0x03, 0xe5, + 0xb6, 0x9a, 0x52, 0x0e, 0x42, 0xee, 0x39, 0x9d, + 0xac, 0xa6, 0x81, 0x63, }, +}, { + .ksize = 32, + .key = blake2_ordered_sequence, + .plaintext = blake2_ordered_sequence, + .psize = 247, + .digest = (u8[]){ 0x2d, 0xd8, 0xd2, 0x53, 0x66, 0xfa, 0xa9, 0x01, + 0x1c, 0x9c, 0xaf, 0xa3, 0xe2, 0x9d, 0x9b, 0x10, + 0x0a, 0xf6, 0x73, 0xe8, }, +}}; + +static const struct hash_testvec blakes2s_224_tv_template[] = {{ + .plaintext = blake2_ordered_sequence, + .psize = 1, + .digest = (u8[]){ 0x61, 0xb9, 0x4e, 0xc9, 0x46, 0x22, 0xa3, 0x91, + 0xd2, 0xae, 0x42, 0xe6, 0x45, 0x6c, 0x90, 0x12, + 0xd5, 0x80, 0x07, 0x97, 0xb8, 0x86, 0x5a, 0xfc, + 0x48, 0x21, 0x97, 0xbb, }, +}, { + .plaintext = blake2_ordered_sequence, + .psize = 247, + .digest = (u8[]){ 0x9e, 0xda, 0xc7, 0x20, 0x2c, 0xd8, 0x48, 0x2e, + 0x31, 0x94, 0xab, 0x46, 0x6d, 0x94, 0xd8, 0xb4, + 0x69, 0xcd, 0xae, 0x19, 0x6d, 0x9e, 0x41, 0xcc, + 0x2b, 0xa4, 0xd5, 0xf6, }, +}, { + .ksize = 16, + .key = blake2_ordered_sequence, + .digest = (u8[]){ 0x32, 0xc0, 0xac, 0xf4, 0x3b, 0xd3, 0x07, 0x9f, + 0xbe, 0xfb, 0xfa, 0x4d, 0x6b, 0x4e, 0x56, 0xb3, + 0xaa, 0xd3, 0x27, 0xf6, 0x14, 0xbf, 0xb9, 0x32, + 0xa7, 0x19, 0xfc, 0xb8, }, +}, { + .ksize = 1, + .key = "B", + .plaintext = blake2_ordered_sequence, + .psize = 7, + .digest = (u8[]){ 0x73, 0xad, 0x5e, 0x6d, 0xb9, 0x02, 0x8e, 0x76, + 0xf2, 0x66, 0x42, 0x4b, 0x4c, 0xfa, 0x1f, 0xe6, + 0x2e, 0x56, 0x40, 0xe5, 0xa2, 0xb0, 0x3c, 0xe8, + 0x7b, 0x45, 0xfe, 0x05, }, +}, { + .ksize = 32, + .key = blake2_ordered_sequence, + .plaintext = blake2_ordered_sequence, + .psize = 15, + .digest = (u8[]){ 0x16, 0x60, 0xfb, 0x92, 0x54, 0xb3, 0x6e, 0x36, + 0x81, 0xf4, 0x16, 0x41, 0xc3, 0x3d, 0xd3, 0x43, + 0x84, 0xed, 0x10, 0x6f, 0x65, 0x80, 0x7a, 0x3e, + 0x25, 0xab, 0xc5, 0x02, }, +}, { + .ksize = 16, + .key = blake2_ordered_sequence, + .plaintext = blake2_ordered_sequence, + .psize = 64, + .digest = (u8[]){ 0xca, 0xaa, 0x39, 0x67, 0x9c, 0xf7, 0x6b, 0xc7, + 0xb6, 0x82, 0xca, 0x0e, 0x65, 0x36, 0x5b, 0x7c, + 0x24, 0x00, 0xfa, 0x5f, 0xda, 0x06, 0x91, 0x93, + 0x6a, 0x31, 0x83, 0xb5, }, +}, { + .ksize = 1, + .key = "B", + .plaintext = blake2_ordered_sequence, + .psize = 256, + .digest = (u8[]){ 0x90, 0x02, 0x26, 0xb5, 0x06, 0x9c, 0x36, 0x86, + 0x94, 0x91, 0x90, 0x1e, 0x7d, 0x2a, 0x71, 0xb2, + 0x48, 0xb5, 0xe8, 0x16, 0xfd, 0x64, 0x33, 0x45, + 0xb3, 0xd7, 0xec, 0xcc, }, +}}; + +static const struct hash_testvec blakes2s_256_tv_template[] = {{ + .plaintext = blake2_ordered_sequence, + .psize = 15, + .digest = (u8[]){ 0xd9, 0x7c, 0x82, 0x8d, 0x81, 0x82, 0xa7, 0x21, + 0x80, 0xa0, 0x6a, 0x78, 0x26, 0x83, 0x30, 0x67, + 0x3f, 0x7c, 0x4e, 0x06, 0x35, 0x94, 0x7c, 0x04, + 0xc0, 0x23, 0x23, 0xfd, 0x45, 0xc0, 0xa5, 0x2d, }, +}, { + .ksize = 32, + .key = blake2_ordered_sequence, + .digest = (u8[]){ 0x48, 0xa8, 0x99, 0x7d, 0xa4, 0x07, 0x87, 0x6b, + 0x3d, 0x79, 0xc0, 0xd9, 0x23, 0x25, 0xad, 0x3b, + 0x89, 0xcb, 0xb7, 0x54, 0xd8, 0x6a, 0xb7, 0x1a, + 0xee, 0x04, 0x7a, 0xd3, 0x45, 0xfd, 0x2c, 0x49, }, +}, { + .ksize = 1, + .key = "B", + .plaintext = blake2_ordered_sequence, + .psize = 1, + .digest = (u8[]){ 0x22, 0x27, 0xae, 0xaa, 0x6e, 0x81, 0x56, 0x03, + 0xa7, 0xe3, 0xa1, 0x18, 0xa5, 0x9a, 0x2c, 0x18, + 0xf4, 0x63, 0xbc, 0x16, 0x70, 0xf1, 0xe7, 0x4b, + 0x00, 0x6d, 0x66, 0x16, 0xae, 0x9e, 0x74, 0x4e, }, +}, { + .ksize = 16, + .key = blake2_ordered_sequence, + .plaintext = blake2_ordered_sequence, + .psize = 7, + .digest = (u8[]){ 0x58, 0x5d, 0xa8, 0x60, 0x1c, 0xa4, 0xd8, 0x03, + 0x86, 0x86, 0x84, 0x64, 0xd7, 0xa0, 0x8e, 0x15, + 0x2f, 0x05, 0xa2, 0x1b, 0xbc, 0xef, 0x7a, 0x34, + 0xb3, 0xc5, 0xbc, 0x4b, 0xf0, 0x32, 0xeb, 0x12, }, +}, { + .ksize = 32, + .key = blake2_ordered_sequence, + .plaintext = blake2_ordered_sequence, + .psize = 64, + .digest = (u8[]){ 0x89, 0x75, 0xb0, 0x57, 0x7f, 0xd3, 0x55, 0x66, + 0xd7, 0x50, 0xb3, 0x62, 0xb0, 0x89, 0x7a, 0x26, + 0xc3, 0x99, 0x13, 0x6d, 0xf0, 0x7b, 0xab, 0xab, + 0xbd, 0xe6, 0x20, 0x3f, 0xf2, 0x95, 0x4e, 0xd4, }, +}, { + .ksize = 1, + .key = "B", + .plaintext = blake2_ordered_sequence, + .psize = 247, + .digest = (u8[]){ 0x2e, 0x74, 0x1c, 0x1d, 0x03, 0xf4, 0x9d, 0x84, + 0x6f, 0xfc, 0x86, 0x32, 0x92, 0x49, 0x7e, 0x66, + 0xd7, 0xc3, 0x10, 0x88, 0xfe, 0x28, 0xb3, 0xe0, + 0xbf, 0x50, 0x75, 0xad, 0x8e, 0xa4, 0xe6, 0xb2, }, +}, { + .ksize = 16, + .key = blake2_ordered_sequence, + .plaintext = blake2_ordered_sequence, + .psize = 256, + .digest = (u8[]){ 0xb9, 0xd2, 0x81, 0x0e, 0x3a, 0xb1, 0x62, 0x9b, + 0xad, 0x44, 0x05, 0xf4, 0x92, 0x2e, 0x99, 0xc1, + 0x4a, 0x47, 0xbb, 0x5b, 0x6f, 0xb2, 0x96, 0xed, + 0xd5, 0x06, 0xb5, 0x3a, 0x7c, 0x7a, 0x65, 0x1d, }, +}}; + #endif /* _CRYPTO_TESTMGR_H */ diff --git a/drivers/acpi/acpi_dbg.c b/drivers/acpi/acpi_dbg.c index f21c99ec46ee..ed19d9822bc0 100644 --- a/drivers/acpi/acpi_dbg.c +++ b/drivers/acpi/acpi_dbg.c @@ -757,6 +757,9 @@ int __init acpi_aml_init(void) goto err_exit; } + if (acpi_disabled) + return -ENODEV; + /* Initialize AML IO interface */ mutex_init(&acpi_aml_io.lock); init_waitqueue_head(&acpi_aml_io.wait); diff --git a/drivers/acpi/acpi_extlog.c b/drivers/acpi/acpi_extlog.c index 560fdae8cc59..943b1dc2d0b3 100644 --- a/drivers/acpi/acpi_extlog.c +++ b/drivers/acpi/acpi_extlog.c @@ -224,9 +224,9 @@ static int __init extlog_init(void) u64 cap; int rc; - rdmsrl(MSR_IA32_MCG_CAP, cap); - - if (!(cap & MCG_ELOG_P) || !extlog_get_l1addr()) + if (rdmsrl_safe(MSR_IA32_MCG_CAP, &cap) || + !(cap & MCG_ELOG_P) || + !extlog_get_l1addr()) return -ENODEV; if (edac_get_report_status() == EDAC_REPORTING_FORCE) { diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c index d5c19e25ddf5..f43f5adc21b6 100644 --- a/drivers/acpi/button.c +++ b/drivers/acpi/button.c @@ -149,6 +149,7 @@ struct acpi_button { int last_state; ktime_t last_time; bool suspended; + bool lid_state_initialized; }; static BLOCKING_NOTIFIER_HEAD(acpi_lid_notifier); @@ -404,6 +405,8 @@ static int acpi_lid_update_state(struct acpi_device *device, static void acpi_lid_initialize_state(struct acpi_device *device) { + struct acpi_button *button = acpi_driver_data(device); + switch (lid_init_state) { case ACPI_BUTTON_LID_INIT_OPEN: (void)acpi_lid_notify_state(device, 1); @@ -415,13 +418,14 @@ static void acpi_lid_initialize_state(struct acpi_device *device) default: break; } + + button->lid_state_initialized = true; } static void acpi_button_notify(struct acpi_device *device, u32 event) { struct acpi_button *button = acpi_driver_data(device); struct input_dev *input; - int users; switch (event) { case ACPI_FIXED_HARDWARE_EVENT: @@ -430,10 +434,7 @@ static void acpi_button_notify(struct acpi_device *device, u32 event) case ACPI_BUTTON_NOTIFY_STATUS: input = button->input; if (button->type == ACPI_BUTTON_TYPE_LID) { - mutex_lock(&button->input->mutex); - users = button->input->users; - mutex_unlock(&button->input->mutex); - if (users) + if (button->lid_state_initialized) acpi_lid_update_state(device, true); } else { int keycode; @@ -478,7 +479,7 @@ static int acpi_button_resume(struct device *dev) struct acpi_button *button = acpi_driver_data(device); button->suspended = false; - if (button->type == ACPI_BUTTON_TYPE_LID && button->input->users) { + if (button->type == ACPI_BUTTON_TYPE_LID) { button->last_state = !!acpi_lid_evaluate_state(device); button->last_time = ktime_get(); acpi_lid_initialize_state(device); diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c index bce135d23f53..77471ed36032 100644 --- a/drivers/acpi/nfit/core.c +++ b/drivers/acpi/nfit/core.c @@ -1535,7 +1535,7 @@ static ssize_t format1_show(struct device *dev, le16_to_cpu(nfit_dcr->dcr->code)); break; } - if (rc != ENXIO) + if (rc != -ENXIO) break; } mutex_unlock(&acpi_desc->init_mutex); diff --git a/drivers/acpi/numa.c b/drivers/acpi/numa.c index 0da58f0bf7e5..a28ff3cfbc29 100644 --- a/drivers/acpi/numa.c +++ b/drivers/acpi/numa.c @@ -46,7 +46,7 @@ int acpi_numa __initdata; int pxm_to_node(int pxm) { - if (pxm < 0) + if (pxm < 0 || pxm >= MAX_PXM_DOMAINS || numa_off) return NUMA_NO_NODE; return pxm_to_node_map[pxm]; } diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c index ab1da5e6e7e3..86ffb4af4afc 100644 --- a/drivers/acpi/video_detect.c +++ b/drivers/acpi/video_detect.c @@ -274,6 +274,15 @@ static const struct dmi_system_id video_detect_dmi_table[] = { DMI_MATCH(DMI_PRODUCT_NAME, "530U4E/540U4E"), }, }, + /* https://bugs.launchpad.net/bugs/1894667 */ + { + .callback = video_detect_force_video, + .ident = "HP 635 Notebook", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Hewlett-Packard"), + DMI_MATCH(DMI_PRODUCT_NAME, "HP 635 Notebook PC"), + }, + }, /* Non win8 machines which need native backlight nevertheless */ { diff --git a/drivers/android/binder.c b/drivers/android/binder.c index 8eb783309e0e..8b485c39a08c 100644 --- a/drivers/android/binder.c +++ b/drivers/android/binder.c @@ -237,7 +237,7 @@ static struct binder_transaction_log_entry *binder_transaction_log_add( struct binder_work { struct list_head entry; - enum { + enum binder_work_type { BINDER_WORK_TRANSACTION = 1, BINDER_WORK_TRANSACTION_COMPLETE, BINDER_WORK_RETURN_ERROR, @@ -897,27 +897,6 @@ static struct binder_work *binder_dequeue_work_head_ilocked( return w; } -/** - * binder_dequeue_work_head() - Dequeues the item at head of list - * @proc: binder_proc associated with list - * @list: list to dequeue head - * - * Removes the head of the list if there are items on the list - * - * Return: pointer dequeued binder_work, NULL if list was empty - */ -static struct binder_work *binder_dequeue_work_head( - struct binder_proc *proc, - struct list_head *list) -{ - struct binder_work *w; - - binder_inner_proc_lock(proc); - w = binder_dequeue_work_head_ilocked(list); - binder_inner_proc_unlock(proc); - return w; -} - static void binder_defer_work(struct binder_proc *proc, enum binder_deferred_state defer); static void binder_free_thread(struct binder_thread *thread); @@ -4569,13 +4548,17 @@ static void binder_release_work(struct binder_proc *proc, struct list_head *list) { struct binder_work *w; + enum binder_work_type wtype; while (1) { - w = binder_dequeue_work_head(proc, list); + binder_inner_proc_lock(proc); + w = binder_dequeue_work_head_ilocked(list); + wtype = w ? w->type : 0; + binder_inner_proc_unlock(proc); if (!w) return; - switch (w->type) { + switch (wtype) { case BINDER_WORK_TRANSACTION: { struct binder_transaction *t; @@ -4609,9 +4592,11 @@ static void binder_release_work(struct binder_proc *proc, kfree(death); binder_stats_deleted(BINDER_STAT_DEATH); } break; + case BINDER_WORK_NODE: + break; default: pr_err("unexpected work type, %d, not freed\n", - w->type); + wtype); break; } } diff --git a/drivers/ata/sata_nv.c b/drivers/ata/sata_nv.c index 798d549435cc..2248a40631bf 100644 --- a/drivers/ata/sata_nv.c +++ b/drivers/ata/sata_nv.c @@ -2122,7 +2122,7 @@ static int nv_swncq_sdbfis(struct ata_port *ap) pp->dhfis_bits &= ~done_mask; pp->dmafis_bits &= ~done_mask; pp->sdbfis_bits |= done_mask; - ata_qc_complete_multiple(ap, ap->qc_active ^ done_mask); + ata_qc_complete_multiple(ap, ata_qc_get_active(ap) ^ done_mask); if (!ap->qc_active) { DPRINTK("over\n"); diff --git a/drivers/ata/sata_rcar.c b/drivers/ata/sata_rcar.c index 8323f88d17a5..4dcdf8ee0055 100644 --- a/drivers/ata/sata_rcar.c +++ b/drivers/ata/sata_rcar.c @@ -124,7 +124,7 @@ /* Descriptor table word 0 bit (when DTA32M = 1) */ #define SATA_RCAR_DTEND BIT(0) -#define SATA_RCAR_DMA_BOUNDARY 0x1FFFFFFEUL +#define SATA_RCAR_DMA_BOUNDARY 0x1FFFFFFFUL /* Gen2 Physical Layer Control Registers */ #define RCAR_GEN2_PHY_CTL1_REG 0x1704 diff --git a/drivers/base/core.c b/drivers/base/core.c index e9f67a7f41b9..8746d7d7329b 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -3802,6 +3802,7 @@ static inline bool fwnode_is_primary(struct fwnode_handle *fwnode) */ void set_primary_fwnode(struct device *dev, struct fwnode_handle *fwnode) { + struct device *parent = dev->parent; struct fwnode_handle *fn = dev->fwnode; if (fwnode) { @@ -3816,7 +3817,8 @@ void set_primary_fwnode(struct device *dev, struct fwnode_handle *fwnode) } else { if (fwnode_is_primary(fn)) { dev->fwnode = fn->secondary; - fn->secondary = NULL; + if (!(parent && fn == parent->fwnode)) + fn->secondary = ERR_PTR(-ENODEV); } else { dev->fwnode = NULL; } diff --git a/drivers/base/dd.c b/drivers/base/dd.c index cbfab9774f17..20b96029f2c9 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -952,6 +952,8 @@ static void __device_release_driver(struct device *dev, struct device *parent) drv = dev->driver; if (drv) { + pm_runtime_get_sync(dev); + while (device_links_busy(dev)) { device_unlock(dev); if (parent && dev->bus->need_parent_lock) @@ -967,11 +969,12 @@ static void __device_release_driver(struct device *dev, struct device *parent) * have released the driver successfully while this one * was waiting, so check for that. */ - if (dev->driver != drv) + if (dev->driver != drv) { + pm_runtime_put(dev); return; + } } - pm_runtime_get_sync(dev); pm_runtime_clean_up_links(dev); driver_sysfs_remove(dev); diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index d7c7232e438c..52e1e71e8124 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -740,9 +740,9 @@ static void recv_work(struct work_struct *work) blk_mq_complete_request(blk_mq_rq_from_pdu(cmd)); } + nbd_config_put(nbd); atomic_dec(&config->recv_threads); wake_up(&config->recv_wq); - nbd_config_put(nbd); kfree(args); } diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c index 3666afa639d1..b18f0162cb9c 100644 --- a/drivers/block/xen-blkback/blkback.c +++ b/drivers/block/xen-blkback/blkback.c @@ -202,7 +202,7 @@ static inline void shrink_free_pagepool(struct xen_blkif_ring *ring, int num) #define vaddr(page) ((unsigned long)pfn_to_kaddr(page_to_pfn(page))) -static int do_block_io_op(struct xen_blkif_ring *ring); +static int do_block_io_op(struct xen_blkif_ring *ring, unsigned int *eoi_flags); static int dispatch_rw_block_io(struct xen_blkif_ring *ring, struct blkif_request *req, struct pending_req *pending_req); @@ -615,6 +615,8 @@ int xen_blkif_schedule(void *arg) struct xen_vbd *vbd = &blkif->vbd; unsigned long timeout; int ret; + bool do_eoi; + unsigned int eoi_flags = XEN_EOI_FLAG_SPURIOUS; set_freezable(); while (!kthread_should_stop()) { @@ -639,16 +641,23 @@ int xen_blkif_schedule(void *arg) if (timeout == 0) goto purge_gnt_list; + do_eoi = ring->waiting_reqs; + ring->waiting_reqs = 0; smp_mb(); /* clear flag *before* checking for work */ - ret = do_block_io_op(ring); + ret = do_block_io_op(ring, &eoi_flags); if (ret > 0) ring->waiting_reqs = 1; if (ret == -EACCES) wait_event_interruptible(ring->shutdown_wq, kthread_should_stop()); + if (do_eoi && !ring->waiting_reqs) { + xen_irq_lateeoi(ring->irq, eoi_flags); + eoi_flags |= XEN_EOI_FLAG_SPURIOUS; + } + purge_gnt_list: if (blkif->vbd.feature_gnt_persistent && time_after(jiffies, ring->next_lru)) { @@ -1121,7 +1130,7 @@ static void end_block_io_op(struct bio *bio) * and transmute it to the block API to hand it over to the proper block disk. */ static int -__do_block_io_op(struct xen_blkif_ring *ring) +__do_block_io_op(struct xen_blkif_ring *ring, unsigned int *eoi_flags) { union blkif_back_rings *blk_rings = &ring->blk_rings; struct blkif_request req; @@ -1144,6 +1153,9 @@ __do_block_io_op(struct xen_blkif_ring *ring) if (RING_REQUEST_CONS_OVERFLOW(&blk_rings->common, rc)) break; + /* We've seen a request, so clear spurious eoi flag. */ + *eoi_flags &= ~XEN_EOI_FLAG_SPURIOUS; + if (kthread_should_stop()) { more_to_do = 1; break; @@ -1202,13 +1214,13 @@ done: } static int -do_block_io_op(struct xen_blkif_ring *ring) +do_block_io_op(struct xen_blkif_ring *ring, unsigned int *eoi_flags) { union blkif_back_rings *blk_rings = &ring->blk_rings; int more_to_do; do { - more_to_do = __do_block_io_op(ring); + more_to_do = __do_block_io_op(ring, eoi_flags); if (more_to_do) break; diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c index 25c41ce070a7..93896c992245 100644 --- a/drivers/block/xen-blkback/xenbus.c +++ b/drivers/block/xen-blkback/xenbus.c @@ -237,9 +237,8 @@ static int xen_blkif_map(struct xen_blkif_ring *ring, grant_ref_t *gref, BUG(); } - err = bind_interdomain_evtchn_to_irqhandler(blkif->domid, evtchn, - xen_blkif_be_int, 0, - "blkif-backend", ring); + err = bind_interdomain_evtchn_to_irqhandler_lateeoi(blkif->domid, + evtchn, xen_blkif_be_int, 0, "blkif-backend", ring); if (err < 0) { xenbus_unmap_ring_vfree(blkif->be->dev, ring->blk_ring); ring->blk_rings.common.sring = NULL; diff --git a/drivers/bluetooth/hci_ldisc.c b/drivers/bluetooth/hci_ldisc.c index efeb8137ec67..48560e646e53 100644 --- a/drivers/bluetooth/hci_ldisc.c +++ b/drivers/bluetooth/hci_ldisc.c @@ -545,6 +545,7 @@ static void hci_uart_tty_close(struct tty_struct *tty) clear_bit(HCI_UART_PROTO_READY, &hu->flags); percpu_up_write(&hu->proto_lock); + cancel_work_sync(&hu->init_ready); cancel_work_sync(&hu->write_work); if (hdev) { diff --git a/drivers/bluetooth/hci_serdev.c b/drivers/bluetooth/hci_serdev.c index d3fb0d657fa5..7b3aade431e5 100644 --- a/drivers/bluetooth/hci_serdev.c +++ b/drivers/bluetooth/hci_serdev.c @@ -369,6 +369,8 @@ void hci_uart_unregister_device(struct hci_uart *hu) struct hci_dev *hdev = hu->hdev; clear_bit(HCI_UART_PROTO_READY, &hu->flags); + + cancel_work_sync(&hu->init_ready); if (test_bit(HCI_UART_REGISTERED, &hu->flags)) hci_unregister_dev(hdev); hci_free_dev(hdev); diff --git a/drivers/bus/fsl-mc/mc-io.c b/drivers/bus/fsl-mc/mc-io.c index 7226cfc49b6f..3f806599748a 100644 --- a/drivers/bus/fsl-mc/mc-io.c +++ b/drivers/bus/fsl-mc/mc-io.c @@ -129,7 +129,12 @@ error_destroy_mc_io: */ void fsl_destroy_mc_io(struct fsl_mc_io *mc_io) { - struct fsl_mc_device *dpmcp_dev = mc_io->dpmcp_dev; + struct fsl_mc_device *dpmcp_dev; + + if (!mc_io) + return; + + dpmcp_dev = mc_io->dpmcp_dev; if (dpmcp_dev) fsl_mc_io_unset_dpmcp(mc_io); diff --git a/drivers/clk/at91/clk-main.c b/drivers/clk/at91/clk-main.c index c97efd43e315..9e5b73c34218 100644 --- a/drivers/clk/at91/clk-main.c +++ b/drivers/clk/at91/clk-main.c @@ -517,12 +517,17 @@ static int clk_sam9x5_main_set_parent(struct clk_hw *hw, u8 index) return -EINVAL; regmap_read(regmap, AT91_CKGR_MOR, &tmp); - tmp &= ~MOR_KEY_MASK; if (index && !(tmp & AT91_PMC_MOSCSEL)) - regmap_write(regmap, AT91_CKGR_MOR, tmp | AT91_PMC_MOSCSEL); + tmp = AT91_PMC_MOSCSEL; else if (!index && (tmp & AT91_PMC_MOSCSEL)) - regmap_write(regmap, AT91_CKGR_MOR, tmp & ~AT91_PMC_MOSCSEL); + tmp = 0; + else + return 0; + + regmap_update_bits(regmap, AT91_CKGR_MOR, + AT91_PMC_MOSCSEL | MOR_KEY_MASK, + tmp | AT91_PMC_KEY); while (!clk_sam9x5_main_ready(regmap)) cpu_relax(); diff --git a/drivers/clk/bcm/clk-bcm2835.c b/drivers/clk/bcm/clk-bcm2835.c index de895fc3dc93..f89ec88567c6 100644 --- a/drivers/clk/bcm/clk-bcm2835.c +++ b/drivers/clk/bcm/clk-bcm2835.c @@ -1319,8 +1319,10 @@ static struct clk_hw *bcm2835_register_pll(struct bcm2835_cprman *cprman, pll->hw.init = &init; ret = devm_clk_hw_register(cprman->dev, &pll->hw); - if (ret) + if (ret) { + kfree(pll); return NULL; + } return &pll->hw; } diff --git a/drivers/clk/rockchip/clk-half-divider.c b/drivers/clk/rockchip/clk-half-divider.c index b8da6e799423..6a371d05218d 100644 --- a/drivers/clk/rockchip/clk-half-divider.c +++ b/drivers/clk/rockchip/clk-half-divider.c @@ -166,7 +166,7 @@ struct clk *rockchip_clk_register_halfdiv(const char *name, unsigned long flags, spinlock_t *lock) { - struct clk *clk; + struct clk *clk = ERR_PTR(-ENOMEM); struct clk_mux *mux = NULL; struct clk_gate *gate = NULL; struct clk_divider *div = NULL; diff --git a/drivers/clk/ti/clockdomain.c b/drivers/clk/ti/clockdomain.c index 07a805125e98..11d92311e162 100644 --- a/drivers/clk/ti/clockdomain.c +++ b/drivers/clk/ti/clockdomain.c @@ -146,10 +146,12 @@ static void __init of_ti_clockdomain_setup(struct device_node *node) if (clk_hw_get_flags(clk_hw) & CLK_IS_BASIC) { pr_warn("can't setup clkdm for basic clk %s\n", __clk_get_name(clk)); + clk_put(clk); continue; } to_clk_hw_omap(clk_hw)->clkdm_name = clkdm_name; omap2_init_clk_clkdm(clk_hw); + clk_put(clk); } } diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c index aca30f45172e..9e86404a361f 100644 --- a/drivers/cpufreq/acpi-cpufreq.c +++ b/drivers/cpufreq/acpi-cpufreq.c @@ -701,7 +701,8 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy) cpumask_copy(policy->cpus, topology_core_cpumask(cpu)); } - if (check_amd_hwpstate_cpu(cpu) && !acpi_pstate_strict) { + if (check_amd_hwpstate_cpu(cpu) && boot_cpu_data.x86 < 0x19 && + !acpi_pstate_strict) { cpumask_clear(policy->cpus); cpumask_set_cpu(cpu, policy->cpus); cpumask_copy(data->freqdomain_cpus, diff --git a/drivers/cpufreq/armada-37xx-cpufreq.c b/drivers/cpufreq/armada-37xx-cpufreq.c index c5f98cafc25c..9b0b490d70ff 100644 --- a/drivers/cpufreq/armada-37xx-cpufreq.c +++ b/drivers/cpufreq/armada-37xx-cpufreq.c @@ -486,6 +486,12 @@ remove_opp: /* late_initcall, to guarantee the driver is loaded after A37xx clock driver */ late_initcall(armada37xx_cpufreq_driver_init); +static const struct of_device_id __maybe_unused armada37xx_cpufreq_of_match[] = { + { .compatible = "marvell,armada-3700-nb-pm" }, + { }, +}; +MODULE_DEVICE_TABLE(of, armada37xx_cpufreq_of_match); + MODULE_AUTHOR("Gregory CLEMENT "); MODULE_DESCRIPTION("Armada 37xx cpufreq driver"); MODULE_LICENSE("GPL"); diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c index 79942f705757..5da985604692 100644 --- a/drivers/cpufreq/powernv-cpufreq.c +++ b/drivers/cpufreq/powernv-cpufreq.c @@ -885,12 +885,15 @@ static int powernv_cpufreq_reboot_notifier(struct notifier_block *nb, unsigned long action, void *unused) { int cpu; - struct cpufreq_policy cpu_policy; + struct cpufreq_policy *cpu_policy; rebooting = true; for_each_online_cpu(cpu) { - cpufreq_get_policy(&cpu_policy, cpu); - powernv_cpufreq_target_index(&cpu_policy, get_nominal_index()); + cpu_policy = cpufreq_cpu_get(cpu); + if (!cpu_policy) + continue; + powernv_cpufreq_target_index(cpu_policy, get_nominal_index()); + cpufreq_cpu_put(cpu_policy); } return NOTIFY_DONE; diff --git a/drivers/cpufreq/sti-cpufreq.c b/drivers/cpufreq/sti-cpufreq.c index 47105735df12..6b5d241c30b7 100644 --- a/drivers/cpufreq/sti-cpufreq.c +++ b/drivers/cpufreq/sti-cpufreq.c @@ -144,7 +144,8 @@ static const struct reg_field sti_stih407_dvfs_regfields[DVFS_MAX_REGFIELDS] = { static const struct reg_field *sti_cpufreq_match(void) { if (of_machine_is_compatible("st,stih407") || - of_machine_is_compatible("st,stih410")) + of_machine_is_compatible("st,stih410") || + of_machine_is_compatible("st,stih418")) return sti_stih407_dvfs_regfields; return NULL; @@ -261,7 +262,8 @@ static int sti_cpufreq_init(void) int ret; if ((!of_machine_is_compatible("st,stih407")) && - (!of_machine_is_compatible("st,stih410"))) + (!of_machine_is_compatible("st,stih410")) && + (!of_machine_is_compatible("st,stih418"))) return -ENODEV; ddata.cpu = get_cpu_device(0); diff --git a/drivers/crypto/ccp/ccp-ops.c b/drivers/crypto/ccp/ccp-ops.c index 626b643d610e..20ca9c9e109e 100644 --- a/drivers/crypto/ccp/ccp-ops.c +++ b/drivers/crypto/ccp/ccp-ops.c @@ -1752,7 +1752,7 @@ ccp_run_sha_cmd(struct ccp_cmd_queue *cmd_q, struct ccp_cmd *cmd) break; default: ret = -EINVAL; - goto e_ctx; + goto e_data; } } else { /* Stash the context */ diff --git a/drivers/crypto/chelsio/chtls/chtls_cm.c b/drivers/crypto/chelsio/chtls/chtls_cm.c index 28d24118c645..35116cdb5e38 100644 --- a/drivers/crypto/chelsio/chtls/chtls_cm.c +++ b/drivers/crypto/chelsio/chtls/chtls_cm.c @@ -175,7 +175,7 @@ static struct sk_buff *alloc_ctrl_skb(struct sk_buff *skb, int len) { if (likely(skb && !skb_shared(skb) && !skb_cloned(skb))) { __skb_trim(skb, 0); - refcount_add(2, &skb->users); + refcount_inc(&skb->users); } else { skb = alloc_skb(len, GFP_KERNEL | __GFP_NOFAIL); } @@ -696,14 +696,13 @@ static int chtls_pass_open_rpl(struct chtls_dev *cdev, struct sk_buff *skb) if (rpl->status != CPL_ERR_NONE) { pr_info("Unexpected PASS_OPEN_RPL status %u for STID %u\n", rpl->status, stid); - return CPL_RET_BUF_DONE; + } else { + cxgb4_free_stid(cdev->tids, stid, listen_ctx->lsk->sk_family); + sock_put(listen_ctx->lsk); + kfree(listen_ctx); + module_put(THIS_MODULE); } - cxgb4_free_stid(cdev->tids, stid, listen_ctx->lsk->sk_family); - sock_put(listen_ctx->lsk); - kfree(listen_ctx); - module_put(THIS_MODULE); - - return 0; + return CPL_RET_BUF_DONE; } static int chtls_close_listsrv_rpl(struct chtls_dev *cdev, struct sk_buff *skb) @@ -720,15 +719,13 @@ static int chtls_close_listsrv_rpl(struct chtls_dev *cdev, struct sk_buff *skb) if (rpl->status != CPL_ERR_NONE) { pr_info("Unexpected CLOSE_LISTSRV_RPL status %u for STID %u\n", rpl->status, stid); - return CPL_RET_BUF_DONE; + } else { + cxgb4_free_stid(cdev->tids, stid, listen_ctx->lsk->sk_family); + sock_put(listen_ctx->lsk); + kfree(listen_ctx); + module_put(THIS_MODULE); } - - cxgb4_free_stid(cdev->tids, stid, listen_ctx->lsk->sk_family); - sock_put(listen_ctx->lsk); - kfree(listen_ctx); - module_put(THIS_MODULE); - - return 0; + return CPL_RET_BUF_DONE; } static void chtls_purge_wr_queue(struct sock *sk) @@ -1057,6 +1054,9 @@ static struct sock *chtls_recv_sock(struct sock *lsk, ndev = n->dev; if (!ndev) goto free_dst; + if (is_vlan_dev(ndev)) + ndev = vlan_dev_real_dev(ndev); + port_id = cxgb4_port_idx(ndev); csk = chtls_sock_create(cdev); @@ -1345,7 +1345,6 @@ static void add_to_reap_list(struct sock *sk) struct chtls_sock *csk = sk->sk_user_data; local_bh_disable(); - bh_lock_sock(sk); release_tcp_port(sk); /* release the port immediately */ spin_lock(&reap_list_lock); @@ -1354,7 +1353,6 @@ static void add_to_reap_list(struct sock *sk) if (!csk->passive_reap_next) schedule_work(&reap_task); spin_unlock(&reap_list_lock); - bh_unlock_sock(sk); local_bh_enable(); } diff --git a/drivers/crypto/chelsio/chtls/chtls_hw.c b/drivers/crypto/chelsio/chtls/chtls_hw.c index 64d24823c65a..2294fb06bef3 100644 --- a/drivers/crypto/chelsio/chtls/chtls_hw.c +++ b/drivers/crypto/chelsio/chtls/chtls_hw.c @@ -368,6 +368,9 @@ int chtls_setkey(struct chtls_sock *csk, u32 keylen, u32 optname) if (ret) goto out_notcb; + if (unlikely(csk_flag(sk, CSK_ABORT_SHUTDOWN))) + goto out_notcb; + set_wr_txq(skb, CPL_PRIORITY_DATA, csk->tlshws.txqid); csk->wr_credits -= DIV_ROUND_UP(len, 16); csk->wr_unacked += DIV_ROUND_UP(len, 16); diff --git a/drivers/crypto/chelsio/chtls/chtls_io.c b/drivers/crypto/chelsio/chtls/chtls_io.c index 2c1f3ddb0cc7..f9874da23a29 100644 --- a/drivers/crypto/chelsio/chtls/chtls_io.c +++ b/drivers/crypto/chelsio/chtls/chtls_io.c @@ -914,9 +914,9 @@ static int tls_header_read(struct tls_hdr *thdr, struct iov_iter *from) return (__force int)cpu_to_be16(thdr->length); } -static int csk_mem_free(struct chtls_dev *cdev, struct sock *sk) +static bool csk_mem_free(struct chtls_dev *cdev, struct sock *sk) { - return (cdev->max_host_sndbuf - sk->sk_wmem_queued); + return (cdev->max_host_sndbuf - sk->sk_wmem_queued > 0); } static int csk_wait_memory(struct chtls_dev *cdev, @@ -1217,6 +1217,7 @@ int chtls_sendpage(struct sock *sk, struct page *page, copied = 0; csk = rcu_dereference_sk_user_data(sk); cdev = csk->cdev; + lock_sock(sk); timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT); err = sk_stream_wait_connect(sk, &timeo); @@ -1548,6 +1549,7 @@ skip_copy: tp->urg_data = 0; if ((avail + offset) >= skb->len) { + struct sk_buff *next_skb; if (ULP_SKB_CB(skb)->flags & ULPCB_FLAG_TLS_HDR) { tp->copied_seq += skb->len; hws->rcvpld = skb->hdr_len; @@ -1557,8 +1559,10 @@ skip_copy: chtls_free_skb(sk, skb); buffers_freed++; hws->copied_seq = 0; - if (copied >= target && - !skb_peek(&sk->sk_receive_queue)) + next_skb = skb_peek(&sk->sk_receive_queue); + if (copied >= target && !next_skb) + break; + if (ULP_SKB_CB(next_skb)->flags & ULPCB_FLAG_TLS_HDR) break; } } while (len > 0); diff --git a/drivers/crypto/ixp4xx_crypto.c b/drivers/crypto/ixp4xx_crypto.c index 27f7dad2d45d..9b7b8558db31 100644 --- a/drivers/crypto/ixp4xx_crypto.c +++ b/drivers/crypto/ixp4xx_crypto.c @@ -530,7 +530,7 @@ static void release_ixp_crypto(struct device *dev) if (crypt_virt) { dma_free_coherent(dev, - NPE_QLEN_TOTAL * sizeof( struct crypt_ctl), + NPE_QLEN * sizeof(struct crypt_ctl), crypt_virt, crypt_phys); } } diff --git a/drivers/crypto/mediatek/mtk-platform.c b/drivers/crypto/mediatek/mtk-platform.c index ee0404e27a0f..e4d7ef3bfb61 100644 --- a/drivers/crypto/mediatek/mtk-platform.c +++ b/drivers/crypto/mediatek/mtk-platform.c @@ -446,7 +446,7 @@ static void mtk_desc_dma_free(struct mtk_cryp *cryp) static int mtk_desc_ring_alloc(struct mtk_cryp *cryp) { struct mtk_ring **ring = cryp->ring; - int i, err = ENOMEM; + int i; for (i = 0; i < MTK_RING_MAX; i++) { ring[i] = kzalloc(sizeof(**ring), GFP_KERNEL); @@ -473,14 +473,14 @@ static int mtk_desc_ring_alloc(struct mtk_cryp *cryp) return 0; err_cleanup: - for (; i--; ) { + do { dma_free_coherent(cryp->dev, MTK_DESC_RING_SZ, ring[i]->res_base, ring[i]->res_dma); dma_free_coherent(cryp->dev, MTK_DESC_RING_SZ, ring[i]->cmd_base, ring[i]->cmd_dma); kfree(ring[i]); - } - return err; + } while (i--); + return -ENOMEM; } static int mtk_crypto_probe(struct platform_device *pdev) diff --git a/drivers/crypto/omap-sham.c b/drivers/crypto/omap-sham.c index 2faaa4069cdd..4d31ef472436 100644 --- a/drivers/crypto/omap-sham.c +++ b/drivers/crypto/omap-sham.c @@ -456,6 +456,9 @@ static void omap_sham_write_ctrl_omap4(struct omap_sham_dev *dd, size_t length, struct omap_sham_reqctx *ctx = ahash_request_ctx(dd->req); u32 val, mask; + if (likely(ctx->digcnt)) + omap_sham_write(dd, SHA_REG_DIGCNT(dd), ctx->digcnt); + /* * Setting ALGO_CONST only for the first iteration and * CLOSE_HASH only for the last one. Note that flags mode bits diff --git a/drivers/crypto/picoxcell_crypto.c b/drivers/crypto/picoxcell_crypto.c index 1895452c2c2b..396606755381 100644 --- a/drivers/crypto/picoxcell_crypto.c +++ b/drivers/crypto/picoxcell_crypto.c @@ -1700,11 +1700,6 @@ static int spacc_probe(struct platform_device *pdev) goto err_clk_put; } - ret = device_create_file(&pdev->dev, &dev_attr_stat_irq_thresh); - if (ret) - goto err_clk_disable; - - /* * Use an IRQ threshold of 50% as a default. This seems to be a * reasonable trade off of latency against throughput but can be @@ -1712,6 +1707,10 @@ static int spacc_probe(struct platform_device *pdev) */ engine->stat_irq_thresh = (engine->fifo_sz / 2); + ret = device_create_file(&pdev->dev, &dev_attr_stat_irq_thresh); + if (ret) + goto err_clk_disable; + /* * Configure the interrupts. We only use the STAT_CNT interrupt as we * only submit a new packet for processing when we complete another in diff --git a/drivers/dma/dma-jz4780.c b/drivers/dma/dma-jz4780.c index edff93aacad3..0d0850b35e45 100644 --- a/drivers/dma/dma-jz4780.c +++ b/drivers/dma/dma-jz4780.c @@ -574,11 +574,11 @@ static enum dma_status jz4780_dma_tx_status(struct dma_chan *chan, enum dma_status status; unsigned long flags; + spin_lock_irqsave(&jzchan->vchan.lock, flags); + status = dma_cookie_status(chan, cookie, txstate); if ((status == DMA_COMPLETE) || (txstate == NULL)) - return status; - - spin_lock_irqsave(&jzchan->vchan.lock, flags); + goto out_unlock_irqrestore; vdesc = vchan_find_desc(&jzchan->vchan, cookie); if (vdesc) { @@ -595,6 +595,7 @@ static enum dma_status jz4780_dma_tx_status(struct dma_chan *chan, && jzchan->desc->status & (JZ_DMA_DCS_AR | JZ_DMA_DCS_HLT)) status = DMA_ERROR; +out_unlock_irqrestore: spin_unlock_irqrestore(&jzchan->vchan.lock, flags); return status; } diff --git a/drivers/edac/i5100_edac.c b/drivers/edac/i5100_edac.c index b506eef6b146..858ef4e15180 100644 --- a/drivers/edac/i5100_edac.c +++ b/drivers/edac/i5100_edac.c @@ -1072,16 +1072,15 @@ static int i5100_init_one(struct pci_dev *pdev, const struct pci_device_id *id) PCI_DEVICE_ID_INTEL_5100_19, 0); if (!einj) { ret = -ENODEV; - goto bail_einj; + goto bail_mc_free; } rc = pci_enable_device(einj); if (rc < 0) { ret = rc; - goto bail_disable_einj; + goto bail_einj; } - mci->pdev = &pdev->dev; priv = mci->pvt_info; @@ -1147,14 +1146,14 @@ static int i5100_init_one(struct pci_dev *pdev, const struct pci_device_id *id) bail_scrub: priv->scrub_enable = 0; cancel_delayed_work_sync(&(priv->i5100_scrubbing)); - edac_mc_free(mci); - -bail_disable_einj: pci_disable_device(einj); bail_einj: pci_dev_put(einj); +bail_mc_free: + edac_mc_free(mci); + bail_disable_ch1: pci_disable_device(ch1mm); diff --git a/drivers/edac/ti_edac.c b/drivers/edac/ti_edac.c index 6ac26d1b929f..324768946743 100644 --- a/drivers/edac/ti_edac.c +++ b/drivers/edac/ti_edac.c @@ -278,7 +278,8 @@ static int ti_edac_probe(struct platform_device *pdev) /* add EMIF ECC error handler */ error_irq = platform_get_irq(pdev, 0); - if (!error_irq) { + if (error_irq < 0) { + ret = error_irq; edac_printk(KERN_ERR, EDAC_MOD_NAME, "EMIF irq number not defined.\n"); goto err; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c index b54238406192..98d20efe54d0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c @@ -565,6 +565,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data, struct ww_acquire_ctx ticket; struct list_head list, duplicates; uint64_t va_flags; + uint64_t vm_size; int r = 0; if (args->va_address < AMDGPU_VA_RESERVED_SIZE) { @@ -585,6 +586,15 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data, args->va_address &= AMDGPU_VA_HOLE_MASK; + vm_size = adev->vm_manager.max_pfn * AMDGPU_GPU_PAGE_SIZE; + vm_size -= AMDGPU_VA_RESERVED_SIZE; + if (args->va_address + args->map_size > vm_size) { + dev_dbg(&dev->pdev->dev, + "va_address 0x%llx is in top reserved area 0x%llx\n", + args->va_address + args->map_size, vm_size); + return -EINVAL; + } + if ((args->flags & ~valid_flags) && (args->flags & ~prt_flags)) { dev_dbg(&dev->pdev->dev, "invalid flags combination 0x%08X\n", args->flags); diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c index 2fb2c683ad54..fa0e6c8e2447 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c @@ -2009,7 +2009,7 @@ enum dc_status dc_link_validate_mode_timing( /* A hack to avoid failing any modes for EDID override feature on * topology change such as lower quality cable for DP or different dongle */ - if (link->remote_sinks[0]) + if (link->remote_sinks[0] && link->remote_sinks[0]->sink_signal == SIGNAL_TYPE_VIRTUAL) return DC_OK; /* Passive Dongle */ diff --git a/drivers/gpu/drm/amd/display/dc/os_types.h b/drivers/gpu/drm/amd/display/dc/os_types.h index a407892905af..d4cb7db89192 100644 --- a/drivers/gpu/drm/amd/display/dc/os_types.h +++ b/drivers/gpu/drm/amd/display/dc/os_types.h @@ -57,7 +57,7 @@ * general debug capabilities * */ -#if defined(CONFIG_HAVE_KGDB) || defined(CONFIG_KGDB) +#if defined(CONFIG_DEBUG_KERNEL_DC) && (defined(CONFIG_HAVE_KGDB) || defined(CONFIG_KGDB)) #define ASSERT_CRITICAL(expr) do { \ if (WARN_ON(!(expr))) { \ kgdb_breakpoint(); \ diff --git a/drivers/gpu/drm/bridge/megachips-stdpxxxx-ge-b850v3-fw.c b/drivers/gpu/drm/bridge/megachips-stdpxxxx-ge-b850v3-fw.c index 2136c97aeb8e..dcf091f9d843 100644 --- a/drivers/gpu/drm/bridge/megachips-stdpxxxx-ge-b850v3-fw.c +++ b/drivers/gpu/drm/bridge/megachips-stdpxxxx-ge-b850v3-fw.c @@ -306,8 +306,12 @@ static int stdp4028_ge_b850v3_fw_probe(struct i2c_client *stdp4028_i2c, const struct i2c_device_id *id) { struct device *dev = &stdp4028_i2c->dev; + int ret; - ge_b850v3_lvds_init(dev); + ret = ge_b850v3_lvds_init(dev); + + if (ret) + return ret; ge_b850v3_lvds_ptr->stdp4028_i2c = stdp4028_i2c; i2c_set_clientdata(stdp4028_i2c, ge_b850v3_lvds_ptr); @@ -365,8 +369,12 @@ static int stdp2690_ge_b850v3_fw_probe(struct i2c_client *stdp2690_i2c, const struct i2c_device_id *id) { struct device *dev = &stdp2690_i2c->dev; + int ret; - ge_b850v3_lvds_init(dev); + ret = ge_b850v3_lvds_init(dev); + + if (ret) + return ret; ge_b850v3_lvds_ptr->stdp2690_i2c = stdp2690_i2c; i2c_set_clientdata(stdp2690_i2c, ge_b850v3_lvds_ptr); diff --git a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c index fd7999642cf8..8b5f9241a887 100644 --- a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c +++ b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c @@ -326,7 +326,6 @@ static void dw_mipi_message_config(struct dw_mipi_dsi *dsi, if (lpm) val |= CMD_MODE_ALL_LP; - dsi_write(dsi, DSI_LPCLK_CTRL, lpm ? 0 : PHY_TXREQUESTCLKHS); dsi_write(dsi, DSI_CMD_MODE_CFG, val); } @@ -488,16 +487,22 @@ static void dw_mipi_dsi_video_mode_config(struct dw_mipi_dsi *dsi) static void dw_mipi_dsi_set_mode(struct dw_mipi_dsi *dsi, unsigned long mode_flags) { + u32 val; + dsi_write(dsi, DSI_PWR_UP, RESET); if (mode_flags & MIPI_DSI_MODE_VIDEO) { dsi_write(dsi, DSI_MODE_CFG, ENABLE_VIDEO_MODE); dw_mipi_dsi_video_mode_config(dsi); - dsi_write(dsi, DSI_LPCLK_CTRL, PHY_TXREQUESTCLKHS); } else { dsi_write(dsi, DSI_MODE_CFG, ENABLE_CMD_MODE); } + val = PHY_TXREQUESTCLKHS; + if (dsi->mode_flags & MIPI_DSI_CLOCK_NON_CONTINUOUS) + val |= AUTO_CLKLANE_CTRL; + dsi_write(dsi, DSI_LPCLK_CTRL, val); + dsi_write(dsi, DSI_PWR_UP, POWERUP); } diff --git a/drivers/gpu/drm/gma500/cdv_intel_dp.c b/drivers/gpu/drm/gma500/cdv_intel_dp.c index 90ed20083009..05eba6dec5eb 100644 --- a/drivers/gpu/drm/gma500/cdv_intel_dp.c +++ b/drivers/gpu/drm/gma500/cdv_intel_dp.c @@ -2119,7 +2119,7 @@ cdv_intel_dp_init(struct drm_device *dev, struct psb_intel_mode_device *mode_dev intel_dp->dpcd, sizeof(intel_dp->dpcd)); cdv_intel_edp_panel_vdd_off(gma_encoder); - if (ret == 0) { + if (ret <= 0) { /* if this fails, presume the device is a ghost */ DRM_INFO("failed to retrieve link info, disabling eDP\n"); cdv_intel_dp_encoder_destroy(encoder); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index db2e9af49ae6..37c80cfecd09 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -33,6 +33,8 @@ #include #include +#include + #include #include #include @@ -2683,7 +2685,9 @@ static inline bool intel_vtd_active(void) if (intel_iommu_gfx_mapped) return true; #endif - return false; + + /* Running as a guest, we assume the host is enforcing VT'd */ + return !hypervisor_is_type(X86_HYPER_NATIVE); } static inline bool intel_scanout_needs_vtd_wa(struct drm_i915_private *dev_priv) diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index a262a64f5625..ba24ac698e8b 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -268,6 +268,8 @@ static int compress_page(struct compress *c, if (zlib_deflate(zstream, Z_NO_FLUSH) != Z_OK) return -EIO; + + cond_resched(); } while (zstream->avail_in); /* Fallback to uncompressed if we increase size? */ @@ -347,6 +349,7 @@ static int compress_page(struct compress *c, if (!i915_memcpy_from_wc(ptr, src, PAGE_SIZE)) memcpy(ptr, src, PAGE_SIZE); dst->pages[dst->page_count++] = ptr; + cond_resched(); return 0; } diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index 268f5a3b3122..81e076662c7a 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -671,7 +671,7 @@ bool ttm_bo_eviction_valuable(struct ttm_buffer_object *bo, /* Don't evict this BO if it's outside of the * requested placement range */ - if (place->fpfn >= (bo->mem.start + bo->mem.size) || + if (place->fpfn >= (bo->mem.start + bo->mem.num_pages) || (place->lpfn && place->lpfn <= bo->mem.start)) return false; diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c index 04270a14fcaa..868dd1ef3b69 100644 --- a/drivers/gpu/drm/vc4/vc4_drv.c +++ b/drivers/gpu/drm/vc4/vc4_drv.c @@ -312,6 +312,7 @@ unbind_all: component_unbind_all(dev, drm); gem_destroy: vc4_gem_destroy(drm); + drm_mode_config_cleanup(drm); vc4_bo_cache_destroy(drm); dev_put: drm_dev_put(drm); diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c b/drivers/gpu/drm/virtio/virtgpu_kms.c index 84b6a6bf00c6..6b330d555458 100644 --- a/drivers/gpu/drm/virtio/virtgpu_kms.c +++ b/drivers/gpu/drm/virtio/virtgpu_kms.c @@ -94,8 +94,10 @@ static void virtio_gpu_get_capsets(struct virtio_gpu_device *vgdev, vgdev->capsets[i].id > 0, 5 * HZ); if (ret == 0) { DRM_ERROR("timed out waiting for cap set %d\n", i); + spin_lock(&vgdev->display_info_lock); kfree(vgdev->capsets); vgdev->capsets = NULL; + spin_unlock(&vgdev->display_info_lock); return; } DRM_INFO("cap set %d: id %d, max-version %d, max-size %d\n", diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c b/drivers/gpu/drm/virtio/virtgpu_vq.c index 981ee16e3ee9..8c8dcfebc753 100644 --- a/drivers/gpu/drm/virtio/virtgpu_vq.c +++ b/drivers/gpu/drm/virtio/virtgpu_vq.c @@ -571,9 +571,13 @@ static void virtio_gpu_cmd_get_capset_info_cb(struct virtio_gpu_device *vgdev, int i = le32_to_cpu(cmd->capset_index); spin_lock(&vgdev->display_info_lock); - vgdev->capsets[i].id = le32_to_cpu(resp->capset_id); - vgdev->capsets[i].max_version = le32_to_cpu(resp->capset_max_version); - vgdev->capsets[i].max_size = le32_to_cpu(resp->capset_max_size); + if (vgdev->capsets) { + vgdev->capsets[i].id = le32_to_cpu(resp->capset_id); + vgdev->capsets[i].max_version = le32_to_cpu(resp->capset_max_version); + vgdev->capsets[i].max_size = le32_to_cpu(resp->capset_max_size); + } else { + DRM_ERROR("invalid capset memory."); + } spin_unlock(&vgdev->display_info_lock); wake_up(&vgdev->resp_wq); } diff --git a/drivers/hid/hid-input.c b/drivers/hid/hid-input.c index a9da1526c40a..11bd2ca22a2e 100644 --- a/drivers/hid/hid-input.c +++ b/drivers/hid/hid-input.c @@ -796,7 +796,7 @@ static void hidinput_configure_usage(struct hid_input *hidinput, struct hid_fiel case 0x3b: /* Battery Strength */ hidinput_setup_battery(device, HID_INPUT_REPORT, field); usage->type = EV_PWR; - goto ignore; + return; case 0x3c: /* Invert */ map_key_clear(BTN_TOOL_RUBBER); @@ -1052,7 +1052,7 @@ static void hidinput_configure_usage(struct hid_input *hidinput, struct hid_fiel case HID_DC_BATTERYSTRENGTH: hidinput_setup_battery(device, HID_INPUT_REPORT, field); usage->type = EV_PWR; - goto ignore; + return; } goto unknown; diff --git a/drivers/hid/hid-roccat-kone.c b/drivers/hid/hid-roccat-kone.c index bf4675a27396..9be8c31f613f 100644 --- a/drivers/hid/hid-roccat-kone.c +++ b/drivers/hid/hid-roccat-kone.c @@ -297,31 +297,40 @@ static ssize_t kone_sysfs_write_settings(struct file *fp, struct kobject *kobj, struct kone_device *kone = hid_get_drvdata(dev_get_drvdata(dev)); struct usb_device *usb_dev = interface_to_usbdev(to_usb_interface(dev)); int retval = 0, difference, old_profile; + struct kone_settings *settings = (struct kone_settings *)buf; /* I need to get my data in one piece */ if (off != 0 || count != sizeof(struct kone_settings)) return -EINVAL; mutex_lock(&kone->kone_lock); - difference = memcmp(buf, &kone->settings, sizeof(struct kone_settings)); + difference = memcmp(settings, &kone->settings, + sizeof(struct kone_settings)); if (difference) { - retval = kone_set_settings(usb_dev, - (struct kone_settings const *)buf); - if (retval) { - mutex_unlock(&kone->kone_lock); - return retval; + if (settings->startup_profile < 1 || + settings->startup_profile > 5) { + retval = -EINVAL; + goto unlock; } + retval = kone_set_settings(usb_dev, settings); + if (retval) + goto unlock; + old_profile = kone->settings.startup_profile; - memcpy(&kone->settings, buf, sizeof(struct kone_settings)); + memcpy(&kone->settings, settings, sizeof(struct kone_settings)); kone_profile_activated(kone, kone->settings.startup_profile); if (kone->settings.startup_profile != old_profile) kone_profile_report(kone, kone->settings.startup_profile); } +unlock: mutex_unlock(&kone->kone_lock); + if (retval) + return retval; + return sizeof(struct kone_settings); } static BIN_ATTR(settings, 0660, kone_sysfs_read_settings, diff --git a/drivers/hid/wacom_wac.c b/drivers/hid/wacom_wac.c index 77bb46948eea..da83884b90d2 100644 --- a/drivers/hid/wacom_wac.c +++ b/drivers/hid/wacom_wac.c @@ -2729,7 +2729,9 @@ static int wacom_wac_collection(struct hid_device *hdev, struct hid_report *repo if (report->type != HID_INPUT_REPORT) return -1; - if (WACOM_PEN_FIELD(field) && wacom->wacom_wac.pen_input) + if (WACOM_PAD_FIELD(field)) + return 0; + else if (WACOM_PEN_FIELD(field) && wacom->wacom_wac.pen_input) wacom_wac_pen_report(hdev, report); else if (WACOM_FINGER_FIELD(field) && wacom->wacom_wac.touch_input) wacom_wac_finger_report(hdev, report); diff --git a/drivers/hwmon/pmbus/max34440.c b/drivers/hwmon/pmbus/max34440.c index 47576c460010..9af5ab52ca31 100644 --- a/drivers/hwmon/pmbus/max34440.c +++ b/drivers/hwmon/pmbus/max34440.c @@ -400,7 +400,6 @@ static struct pmbus_driver_info max34440_info[] = { .func[18] = PMBUS_HAVE_TEMP | PMBUS_HAVE_STATUS_TEMP, .func[19] = PMBUS_HAVE_TEMP | PMBUS_HAVE_STATUS_TEMP, .func[20] = PMBUS_HAVE_TEMP | PMBUS_HAVE_STATUS_TEMP, - .read_byte_data = max34440_read_byte_data, .read_word_data = max34440_read_word_data, .write_word_data = max34440_write_word_data, }, @@ -431,7 +430,6 @@ static struct pmbus_driver_info max34440_info[] = { .func[15] = PMBUS_HAVE_TEMP | PMBUS_HAVE_STATUS_TEMP, .func[16] = PMBUS_HAVE_TEMP | PMBUS_HAVE_STATUS_TEMP, .func[17] = PMBUS_HAVE_TEMP | PMBUS_HAVE_STATUS_TEMP, - .read_byte_data = max34440_read_byte_data, .read_word_data = max34440_read_word_data, .write_word_data = max34440_write_word_data, }, @@ -467,7 +465,6 @@ static struct pmbus_driver_info max34440_info[] = { .func[19] = PMBUS_HAVE_TEMP | PMBUS_HAVE_STATUS_TEMP, .func[20] = PMBUS_HAVE_TEMP | PMBUS_HAVE_STATUS_TEMP, .func[21] = PMBUS_HAVE_TEMP | PMBUS_HAVE_STATUS_TEMP, - .read_byte_data = max34440_read_byte_data, .read_word_data = max34440_read_word_data, .write_word_data = max34440_write_word_data, }, diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig index a1c3f0fd7eea..193a07158ec6 100644 --- a/drivers/i2c/busses/Kconfig +++ b/drivers/i2c/busses/Kconfig @@ -1117,6 +1117,7 @@ config I2C_RCAR tristate "Renesas R-Car I2C Controller" depends on ARCH_RENESAS || COMPILE_TEST select I2C_SLAVE + select RESET_CONTROLLER if ARCH_RCAR_GEN3 help If you say yes to this option, support will be included for the R-Car I2C controller. diff --git a/drivers/i2c/busses/i2c-imx.c b/drivers/i2c/busses/i2c-imx.c index d4b72e4ffd71..0e7f9bd17a91 100644 --- a/drivers/i2c/busses/i2c-imx.c +++ b/drivers/i2c/busses/i2c-imx.c @@ -1101,14 +1101,6 @@ static int i2c_imx_probe(struct platform_device *pdev) return ret; } - /* Request IRQ */ - ret = devm_request_irq(&pdev->dev, irq, i2c_imx_isr, IRQF_SHARED, - pdev->name, i2c_imx); - if (ret) { - dev_err(&pdev->dev, "can't claim irq %d\n", irq); - goto clk_disable; - } - /* Init queue */ init_waitqueue_head(&i2c_imx->queue); @@ -1127,6 +1119,14 @@ static int i2c_imx_probe(struct platform_device *pdev) if (ret < 0) goto rpm_disable; + /* Request IRQ */ + ret = request_threaded_irq(irq, i2c_imx_isr, NULL, IRQF_SHARED, + pdev->name, i2c_imx); + if (ret) { + dev_err(&pdev->dev, "can't claim irq %d\n", irq); + goto rpm_disable; + } + /* Set up clock divider */ i2c_imx->bitrate = IMX_I2C_BIT_RATE; ret = of_property_read_u32(pdev->dev.of_node, @@ -1169,13 +1169,12 @@ static int i2c_imx_probe(struct platform_device *pdev) clk_notifier_unregister: clk_notifier_unregister(i2c_imx->clk, &i2c_imx->clk_change_nb); + free_irq(irq, i2c_imx); rpm_disable: pm_runtime_put_noidle(&pdev->dev); pm_runtime_disable(&pdev->dev); pm_runtime_set_suspended(&pdev->dev); pm_runtime_dont_use_autosuspend(&pdev->dev); - -clk_disable: clk_disable_unprepare(i2c_imx->clk); return ret; } @@ -1183,7 +1182,7 @@ clk_disable: static int i2c_imx_remove(struct platform_device *pdev) { struct imx_i2c_struct *i2c_imx = platform_get_drvdata(pdev); - int ret; + int irq, ret; ret = pm_runtime_get_sync(&pdev->dev); if (ret < 0) @@ -1203,6 +1202,9 @@ static int i2c_imx_remove(struct platform_device *pdev) imx_i2c_write_reg(0, i2c_imx, IMX_I2C_I2SR); clk_notifier_unregister(i2c_imx->clk, &i2c_imx->clk_change_nb); + irq = platform_get_irq(pdev, 0); + if (irq >= 0) + free_irq(irq, i2c_imx); clk_disable_unprepare(i2c_imx->clk); pm_runtime_put_noidle(&pdev->dev); diff --git a/drivers/i2c/i2c-core-acpi.c b/drivers/i2c/i2c-core-acpi.c index eb0569359387..8ba4122fb340 100644 --- a/drivers/i2c/i2c-core-acpi.c +++ b/drivers/i2c/i2c-core-acpi.c @@ -219,6 +219,7 @@ static acpi_status i2c_acpi_add_device(acpi_handle handle, u32 level, void i2c_acpi_register_devices(struct i2c_adapter *adap) { acpi_status status; + acpi_handle handle; if (!has_acpi_companion(&adap->dev)) return; @@ -229,6 +230,15 @@ void i2c_acpi_register_devices(struct i2c_adapter *adap) adap, NULL); if (ACPI_FAILURE(status)) dev_warn(&adap->dev, "failed to enumerate I2C slaves\n"); + + if (!adap->dev.parent) + return; + + handle = ACPI_HANDLE(adap->dev.parent); + if (!handle) + return; + + acpi_walk_dep_device_list(handle); } const struct acpi_device_id * @@ -693,7 +703,6 @@ int i2c_acpi_install_space_handler(struct i2c_adapter *adapter) return -ENOMEM; } - acpi_walk_dep_device_list(handle); return 0; } diff --git a/drivers/iio/adc/ti-adc0832.c b/drivers/iio/adc/ti-adc0832.c index 188dae705bf7..a408d97e2d2e 100644 --- a/drivers/iio/adc/ti-adc0832.c +++ b/drivers/iio/adc/ti-adc0832.c @@ -31,6 +31,12 @@ struct adc0832 { struct regulator *reg; struct mutex lock; u8 mux_bits; + /* + * Max size needed: 16x 1 byte ADC data + 8 bytes timestamp + * May be shorter if not all channels are enabled subject + * to the timestamp remaining 8 byte aligned. + */ + u8 data[24] __aligned(8); u8 tx_buf[2] ____cacheline_aligned; u8 rx_buf[2]; @@ -202,7 +208,6 @@ static irqreturn_t adc0832_trigger_handler(int irq, void *p) struct iio_poll_func *pf = p; struct iio_dev *indio_dev = pf->indio_dev; struct adc0832 *adc = iio_priv(indio_dev); - u8 data[24] = { }; /* 16x 1 byte ADC data + 8 bytes timestamp */ int scan_index; int i = 0; @@ -220,10 +225,10 @@ static irqreturn_t adc0832_trigger_handler(int irq, void *p) goto out; } - data[i] = ret; + adc->data[i] = ret; i++; } - iio_push_to_buffers_with_timestamp(indio_dev, data, + iio_push_to_buffers_with_timestamp(indio_dev, adc->data, iio_get_time_ns(indio_dev)); out: mutex_unlock(&adc->lock); diff --git a/drivers/iio/adc/ti-adc12138.c b/drivers/iio/adc/ti-adc12138.c index 703d68ae96b7..4517d7742bc3 100644 --- a/drivers/iio/adc/ti-adc12138.c +++ b/drivers/iio/adc/ti-adc12138.c @@ -50,6 +50,12 @@ struct adc12138 { struct completion complete; /* The number of cclk periods for the S/H's acquisition time */ unsigned int acquisition_time; + /* + * Maximum size needed: 16x 2 bytes ADC data + 8 bytes timestamp. + * Less may be need if not all channels are enabled, as long as + * the 8 byte alignment of the timestamp is maintained. + */ + __be16 data[20] __aligned(8); u8 tx_buf[2] ____cacheline_aligned; u8 rx_buf[2]; @@ -332,7 +338,6 @@ static irqreturn_t adc12138_trigger_handler(int irq, void *p) struct iio_poll_func *pf = p; struct iio_dev *indio_dev = pf->indio_dev; struct adc12138 *adc = iio_priv(indio_dev); - __be16 data[20] = { }; /* 16x 2 bytes ADC data + 8 bytes timestamp */ __be16 trash; int ret; int scan_index; @@ -348,7 +353,7 @@ static irqreturn_t adc12138_trigger_handler(int irq, void *p) reinit_completion(&adc->complete); ret = adc12138_start_and_read_conv(adc, scan_chan, - i ? &data[i - 1] : &trash); + i ? &adc->data[i - 1] : &trash); if (ret) { dev_warn(&adc->spi->dev, "failed to start conversion\n"); @@ -365,7 +370,7 @@ static irqreturn_t adc12138_trigger_handler(int irq, void *p) } if (i) { - ret = adc12138_read_conv_data(adc, &data[i - 1]); + ret = adc12138_read_conv_data(adc, &adc->data[i - 1]); if (ret) { dev_warn(&adc->spi->dev, "failed to get conversion data\n"); @@ -373,7 +378,7 @@ static irqreturn_t adc12138_trigger_handler(int irq, void *p) } } - iio_push_to_buffers_with_timestamp(indio_dev, data, + iio_push_to_buffers_with_timestamp(indio_dev, adc->data, iio_get_time_ns(indio_dev)); out: mutex_unlock(&adc->lock); diff --git a/drivers/iio/gyro/itg3200_buffer.c b/drivers/iio/gyro/itg3200_buffer.c index 59770e5b6660..b080362a8766 100644 --- a/drivers/iio/gyro/itg3200_buffer.c +++ b/drivers/iio/gyro/itg3200_buffer.c @@ -49,13 +49,20 @@ static irqreturn_t itg3200_trigger_handler(int irq, void *p) struct iio_poll_func *pf = p; struct iio_dev *indio_dev = pf->indio_dev; struct itg3200 *st = iio_priv(indio_dev); - __be16 buf[ITG3200_SCAN_ELEMENTS + sizeof(s64)/sizeof(u16)]; + /* + * Ensure correct alignment and padding including for the + * timestamp that may be inserted. + */ + struct { + __be16 buf[ITG3200_SCAN_ELEMENTS]; + s64 ts __aligned(8); + } scan; - int ret = itg3200_read_all_channels(st->i2c, buf); + int ret = itg3200_read_all_channels(st->i2c, scan.buf); if (ret < 0) goto error_ret; - iio_push_to_buffers_with_timestamp(indio_dev, buf, pf->timestamp); + iio_push_to_buffers_with_timestamp(indio_dev, &scan, pf->timestamp); iio_trigger_notify_done(indio_dev->trig); diff --git a/drivers/iio/light/si1145.c b/drivers/iio/light/si1145.c index 76f16f9c7616..31f78fd7f915 100644 --- a/drivers/iio/light/si1145.c +++ b/drivers/iio/light/si1145.c @@ -172,6 +172,7 @@ struct si1145_part_info { * @part_info: Part information * @trig: Pointer to iio trigger * @meas_rate: Value of MEAS_RATE register. Only set in HW in auto mode + * @buffer: Used to pack data read from sensor. */ struct si1145_data { struct i2c_client *client; @@ -183,6 +184,14 @@ struct si1145_data { bool autonomous; struct iio_trigger *trig; int meas_rate; + /* + * Ensure timestamp will be naturally aligned if present. + * Maximum buffer size (may be only partly used if not all + * channels are enabled): + * 6*2 bytes channels data + 4 bytes alignment + + * 8 bytes timestamp + */ + u8 buffer[24] __aligned(8); }; /** @@ -444,12 +453,6 @@ static irqreturn_t si1145_trigger_handler(int irq, void *private) struct iio_poll_func *pf = private; struct iio_dev *indio_dev = pf->indio_dev; struct si1145_data *data = iio_priv(indio_dev); - /* - * Maximum buffer size: - * 6*2 bytes channels data + 4 bytes alignment + - * 8 bytes timestamp - */ - u8 buffer[24]; int i, j = 0; int ret; u8 irq_status = 0; @@ -482,7 +485,7 @@ static irqreturn_t si1145_trigger_handler(int irq, void *private) ret = i2c_smbus_read_i2c_block_data_or_emulated( data->client, indio_dev->channels[i].address, - sizeof(u16) * run, &buffer[j]); + sizeof(u16) * run, &data->buffer[j]); if (ret < 0) goto done; j += run * sizeof(u16); @@ -497,7 +500,7 @@ static irqreturn_t si1145_trigger_handler(int irq, void *private) goto done; } - iio_push_to_buffers_with_timestamp(indio_dev, buffer, + iio_push_to_buffers_with_timestamp(indio_dev, data->buffer, iio_get_time_ns(indio_dev)); done: diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c index df8f5ceea2dd..30385ba7c5d9 100644 --- a/drivers/infiniband/core/addr.c +++ b/drivers/infiniband/core/addr.c @@ -571,13 +571,12 @@ static void process_one_req(struct work_struct *_work) req->callback = NULL; spin_lock_bh(&lock); + /* + * Although the work will normally have been canceled by the workqueue, + * it can still be requeued as long as it is on the req_list. + */ + cancel_delayed_work(&req->work); if (!list_empty(&req->list)) { - /* - * Although the work will normally have been canceled by the - * workqueue, it can still be requeued as long as it is on the - * req_list. - */ - cancel_delayed_work(&req->work); list_del_init(&req->list); kfree(req); } diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index 1f14cd4ce3db..8cdf933310d1 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -1678,19 +1678,30 @@ static void cma_release_port(struct rdma_id_private *id_priv) mutex_unlock(&lock); } -static void cma_leave_roce_mc_group(struct rdma_id_private *id_priv, - struct cma_multicast *mc) +static void destroy_mc(struct rdma_id_private *id_priv, + struct cma_multicast *mc) { - struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr; - struct net_device *ndev = NULL; - - if (dev_addr->bound_dev_if) - ndev = dev_get_by_index(dev_addr->net, dev_addr->bound_dev_if); - if (ndev) { - cma_igmp_send(ndev, &mc->multicast.ib->rec.mgid, false); - dev_put(ndev); + if (rdma_cap_ib_mcast(id_priv->id.device, id_priv->id.port_num)) { + ib_sa_free_multicast(mc->multicast.ib); + kfree(mc); + return; + } + + if (rdma_protocol_roce(id_priv->id.device, + id_priv->id.port_num)) { + struct rdma_dev_addr *dev_addr = + &id_priv->id.route.addr.dev_addr; + struct net_device *ndev = NULL; + + if (dev_addr->bound_dev_if) + ndev = dev_get_by_index(dev_addr->net, + dev_addr->bound_dev_if); + if (ndev) { + cma_igmp_send(ndev, &mc->multicast.ib->rec.mgid, false); + dev_put(ndev); + } + kref_put(&mc->mcref, release_mc); } - kref_put(&mc->mcref, release_mc); } static void cma_leave_mc_groups(struct rdma_id_private *id_priv) @@ -1698,16 +1709,10 @@ static void cma_leave_mc_groups(struct rdma_id_private *id_priv) struct cma_multicast *mc; while (!list_empty(&id_priv->mc_list)) { - mc = container_of(id_priv->mc_list.next, - struct cma_multicast, list); + mc = list_first_entry(&id_priv->mc_list, struct cma_multicast, + list); list_del(&mc->list); - if (rdma_cap_ib_mcast(id_priv->cma_dev->device, - id_priv->id.port_num)) { - ib_sa_free_multicast(mc->multicast.ib); - kfree(mc); - } else { - cma_leave_roce_mc_group(id_priv, mc); - } + destroy_mc(id_priv, mc); } } @@ -4020,16 +4025,6 @@ static int cma_ib_mc_handler(int status, struct ib_sa_multicast *multicast) else pr_debug_ratelimited("RDMA CM: MULTICAST_ERROR: failed to join multicast. status %d\n", status); - mutex_lock(&id_priv->qp_mutex); - if (!status && id_priv->id.qp) { - status = ib_attach_mcast(id_priv->id.qp, &multicast->rec.mgid, - be16_to_cpu(multicast->rec.mlid)); - if (status) - pr_debug_ratelimited("RDMA CM: MULTICAST_ERROR: failed to attach QP. status %d\n", - status); - } - mutex_unlock(&id_priv->qp_mutex); - event.status = status; event.param.ud.private_data = mc->context; if (!status) { @@ -4283,6 +4278,10 @@ int rdma_join_multicast(struct rdma_cm_id *id, struct sockaddr *addr, struct cma_multicast *mc; int ret; + /* Not supported for kernel QPs */ + if (WARN_ON(id->qp)) + return -EINVAL; + if (!id->device) return -EINVAL; @@ -4333,25 +4332,14 @@ void rdma_leave_multicast(struct rdma_cm_id *id, struct sockaddr *addr) id_priv = container_of(id, struct rdma_id_private, id); spin_lock_irq(&id_priv->lock); list_for_each_entry(mc, &id_priv->mc_list, list) { - if (!memcmp(&mc->addr, addr, rdma_addr_size(addr))) { - list_del(&mc->list); - spin_unlock_irq(&id_priv->lock); + if (memcmp(&mc->addr, addr, rdma_addr_size(addr)) != 0) + continue; + list_del(&mc->list); + spin_unlock_irq(&id_priv->lock); - if (id->qp) - ib_detach_mcast(id->qp, - &mc->multicast.ib->rec.mgid, - be16_to_cpu(mc->multicast.ib->rec.mlid)); - - BUG_ON(id_priv->cma_dev->device != id->device); - - if (rdma_cap_ib_mcast(id->device, id->port_num)) { - ib_sa_free_multicast(mc->multicast.ib); - kfree(mc); - } else if (rdma_protocol_roce(id->device, id->port_num)) { - cma_leave_roce_mc_group(id_priv, mc); - } - return; - } + WARN_ON(id_priv->cma_dev->device != id->device); + destroy_mc(id_priv, mc); + return; } spin_unlock_irq(&id_priv->lock); } diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c index 2acc30c3d5b2..01052de6bedb 100644 --- a/drivers/infiniband/core/ucma.c +++ b/drivers/infiniband/core/ucma.c @@ -588,6 +588,7 @@ static int ucma_free_ctx(struct ucma_context *ctx) list_move_tail(&uevent->list, &list); } list_del(&ctx->list); + events_reported = ctx->events_reported; mutex_unlock(&ctx->file->mut); list_for_each_entry_safe(uevent, tmp, &list, list) { @@ -597,7 +598,6 @@ static int ucma_free_ctx(struct ucma_context *ctx) kfree(uevent); } - events_reported = ctx->events_reported; mutex_destroy(&ctx->mutex); kfree(ctx); return events_reported; @@ -1476,7 +1476,9 @@ static ssize_t ucma_process_join(struct ucma_file *file, return 0; err3: + mutex_lock(&ctx->mutex); rdma_leave_multicast(ctx->cm_id, (struct sockaddr *) &mc->addr); + mutex_unlock(&ctx->mutex); ucma_cleanup_mc_events(mc); err2: mutex_lock(&mut); @@ -1644,7 +1646,9 @@ static ssize_t ucma_migrate_id(struct ucma_file *new_file, cur_file = ctx->file; if (cur_file == new_file) { + mutex_lock(&cur_file->mut); resp.events_reported = ctx->events_reported; + mutex_unlock(&cur_file->mut); goto response; } diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c index 081aa91fc162..620eaca2b831 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c @@ -274,7 +274,6 @@ static int hns_roce_v1_post_send(struct ib_qp *ibqp, ps_opcode = HNS_ROCE_WQE_OPCODE_SEND; break; case IB_WR_LOCAL_INV: - break; case IB_WR_ATOMIC_CMP_AND_SWP: case IB_WR_ATOMIC_FETCH_AND_ADD: case IB_WR_LSO: diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index 417de7ac0d5e..2a203e08d4c1 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -3821,6 +3821,7 @@ done: } qp_init_attr->cap = qp_attr->cap; + qp_init_attr->sq_sig_type = hr_qp->sq_signal_bits; out: mutex_unlock(&hr_qp->mutex); diff --git a/drivers/infiniband/hw/mlx4/cm.c b/drivers/infiniband/hw/mlx4/cm.c index 8c79a480f2b7..d3e11503e67c 100644 --- a/drivers/infiniband/hw/mlx4/cm.c +++ b/drivers/infiniband/hw/mlx4/cm.c @@ -307,6 +307,9 @@ static void schedule_delayed(struct ib_device *ibdev, struct id_map_entry *id) if (!sriov->is_going_down) { id->scheduled_delete = 1; schedule_delayed_work(&id->timeout, CM_CLEANUP_CACHE_TIMEOUT); + } else if (id->scheduled_delete) { + /* Adjust timeout if already scheduled */ + mod_delayed_work(system_wq, &id->timeout, CM_CLEANUP_CACHE_TIMEOUT); } spin_unlock_irqrestore(&sriov->going_down_lock, flags); spin_unlock(&sriov->id_map_lock); diff --git a/drivers/infiniband/hw/mlx4/mad.c b/drivers/infiniband/hw/mlx4/mad.c index 5aaa2a6c431b..418b9312fb2d 100644 --- a/drivers/infiniband/hw/mlx4/mad.c +++ b/drivers/infiniband/hw/mlx4/mad.c @@ -1305,6 +1305,18 @@ static void mlx4_ib_tunnel_comp_handler(struct ib_cq *cq, void *arg) spin_unlock_irqrestore(&dev->sriov.going_down_lock, flags); } +static void mlx4_ib_wire_comp_handler(struct ib_cq *cq, void *arg) +{ + unsigned long flags; + struct mlx4_ib_demux_pv_ctx *ctx = cq->cq_context; + struct mlx4_ib_dev *dev = to_mdev(ctx->ib_dev); + + spin_lock_irqsave(&dev->sriov.going_down_lock, flags); + if (!dev->sriov.is_going_down && ctx->state == DEMUX_PV_STATE_ACTIVE) + queue_work(ctx->wi_wq, &ctx->work); + spin_unlock_irqrestore(&dev->sriov.going_down_lock, flags); +} + static int mlx4_ib_post_pv_qp_buf(struct mlx4_ib_demux_pv_ctx *ctx, struct mlx4_ib_demux_pv_qp *tun_qp, int index) @@ -2000,7 +2012,8 @@ static int create_pv_resources(struct ib_device *ibdev, int slave, int port, cq_size *= 2; cq_attr.cqe = cq_size; - ctx->cq = ib_create_cq(ctx->ib_dev, mlx4_ib_tunnel_comp_handler, + ctx->cq = ib_create_cq(ctx->ib_dev, + create_tun ? mlx4_ib_tunnel_comp_handler : mlx4_ib_wire_comp_handler, NULL, ctx, &cq_attr); if (IS_ERR(ctx->cq)) { ret = PTR_ERR(ctx->cq); @@ -2037,6 +2050,7 @@ static int create_pv_resources(struct ib_device *ibdev, int slave, int port, INIT_WORK(&ctx->work, mlx4_ib_sqp_comp_worker); ctx->wq = to_mdev(ibdev)->sriov.demux[port - 1].wq; + ctx->wi_wq = to_mdev(ibdev)->sriov.demux[port - 1].wi_wq; ret = ib_req_notify_cq(ctx->cq, IB_CQ_NEXT_COMP); if (ret) { @@ -2180,7 +2194,7 @@ static int mlx4_ib_alloc_demux_ctx(struct mlx4_ib_dev *dev, goto err_mcg; } - snprintf(name, sizeof name, "mlx4_ibt%d", port); + snprintf(name, sizeof(name), "mlx4_ibt%d", port); ctx->wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM); if (!ctx->wq) { pr_err("Failed to create tunnelling WQ for port %d\n", port); @@ -2188,7 +2202,15 @@ static int mlx4_ib_alloc_demux_ctx(struct mlx4_ib_dev *dev, goto err_wq; } - snprintf(name, sizeof name, "mlx4_ibud%d", port); + snprintf(name, sizeof(name), "mlx4_ibwi%d", port); + ctx->wi_wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM); + if (!ctx->wi_wq) { + pr_err("Failed to create wire WQ for port %d\n", port); + ret = -ENOMEM; + goto err_wiwq; + } + + snprintf(name, sizeof(name), "mlx4_ibud%d", port); ctx->ud_wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM); if (!ctx->ud_wq) { pr_err("Failed to create up/down WQ for port %d\n", port); @@ -2199,6 +2221,10 @@ static int mlx4_ib_alloc_demux_ctx(struct mlx4_ib_dev *dev, return 0; err_udwq: + destroy_workqueue(ctx->wi_wq); + ctx->wi_wq = NULL; + +err_wiwq: destroy_workqueue(ctx->wq); ctx->wq = NULL; @@ -2246,12 +2272,14 @@ static void mlx4_ib_free_demux_ctx(struct mlx4_ib_demux_ctx *ctx) ctx->tun[i]->state = DEMUX_PV_STATE_DOWNING; } flush_workqueue(ctx->wq); + flush_workqueue(ctx->wi_wq); for (i = 0; i < dev->dev->caps.sqp_demux; i++) { destroy_pv_resources(dev, i, ctx->port, ctx->tun[i], 0); free_pv_object(dev, i, ctx->port); } kfree(ctx->tun); destroy_workqueue(ctx->ud_wq); + destroy_workqueue(ctx->wi_wq); destroy_workqueue(ctx->wq); } } diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h index e10dccc7958f..76ca67aa4015 100644 --- a/drivers/infiniband/hw/mlx4/mlx4_ib.h +++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h @@ -464,6 +464,7 @@ struct mlx4_ib_demux_pv_ctx { struct ib_pd *pd; struct work_struct work; struct workqueue_struct *wq; + struct workqueue_struct *wi_wq; struct mlx4_ib_demux_pv_qp qp[2]; }; @@ -471,6 +472,7 @@ struct mlx4_ib_demux_ctx { struct ib_device *ib_dev; int port; struct workqueue_struct *wq; + struct workqueue_struct *wi_wq; struct workqueue_struct *ud_wq; spinlock_t ud_lock; atomic64_t subnet_prefix; diff --git a/drivers/infiniband/hw/qedr/main.c b/drivers/infiniband/hw/qedr/main.c index d1680d3b5825..2a82661620fe 100644 --- a/drivers/infiniband/hw/qedr/main.c +++ b/drivers/infiniband/hw/qedr/main.c @@ -604,7 +604,7 @@ static int qedr_set_device_attr(struct qedr_dev *dev) qed_attr = dev->ops->rdma_query_device(dev->rdma_ctx); /* Part 2 - check capabilities */ - page_size = ~dev->attr.page_size_caps + 1; + page_size = ~qed_attr->page_size_caps + 1; if (page_size > PAGE_SIZE) { DP_ERR(dev, "Kernel PAGE_SIZE is %ld which is smaller than minimum page size (%d) required by qedr\n", diff --git a/drivers/infiniband/hw/qedr/qedr_iw_cm.c b/drivers/infiniband/hw/qedr/qedr_iw_cm.c index e908dfbaa137..1f1d6a000e5c 100644 --- a/drivers/infiniband/hw/qedr/qedr_iw_cm.c +++ b/drivers/infiniband/hw/qedr/qedr_iw_cm.c @@ -677,6 +677,7 @@ int qedr_iw_destroy_listen(struct iw_cm_id *cm_id) listener->qed_handle); cm_id->rem_ref(cm_id); + kfree(listener); return rc; } diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c index 7b26afc7fef3..f847f0a9f204 100644 --- a/drivers/infiniband/hw/qedr/verbs.c +++ b/drivers/infiniband/hw/qedr/verbs.c @@ -2522,7 +2522,7 @@ int qedr_query_qp(struct ib_qp *ibqp, qp_attr->cap.max_recv_wr = qp->rq.max_wr; qp_attr->cap.max_send_sge = qp->sq.max_sges; qp_attr->cap.max_recv_sge = qp->rq.max_sges; - qp_attr->cap.max_inline_data = ROCE_REQ_MAX_INLINE_DATA_SIZE; + qp_attr->cap.max_inline_data = dev->attr.max_inline; qp_init_attr->cap = qp_attr->cap; qp_attr->ah_attr.type = RDMA_AH_ATTR_TYPE_ROCE; diff --git a/drivers/infiniband/sw/rdmavt/vt.c b/drivers/infiniband/sw/rdmavt/vt.c index 17e4abc067af..541ee30727aa 100644 --- a/drivers/infiniband/sw/rdmavt/vt.c +++ b/drivers/infiniband/sw/rdmavt/vt.c @@ -95,9 +95,7 @@ struct rvt_dev_info *rvt_alloc_device(size_t size, int nports) if (!rdi) return rdi; - rdi->ports = kcalloc(nports, - sizeof(struct rvt_ibport **), - GFP_KERNEL); + rdi->ports = kcalloc(nports, sizeof(*rdi->ports), GFP_KERNEL); if (!rdi->ports) ib_dealloc_device(&rdi->ibdev); diff --git a/drivers/input/keyboard/ep93xx_keypad.c b/drivers/input/keyboard/ep93xx_keypad.c index f77b295e0123..01788a78041b 100644 --- a/drivers/input/keyboard/ep93xx_keypad.c +++ b/drivers/input/keyboard/ep93xx_keypad.c @@ -257,8 +257,8 @@ static int ep93xx_keypad_probe(struct platform_device *pdev) } keypad->irq = platform_get_irq(pdev, 0); - if (!keypad->irq) { - err = -ENXIO; + if (keypad->irq < 0) { + err = keypad->irq; goto failed_free; } diff --git a/drivers/input/keyboard/omap4-keypad.c b/drivers/input/keyboard/omap4-keypad.c index 840e53732753..aeeef50cef9b 100644 --- a/drivers/input/keyboard/omap4-keypad.c +++ b/drivers/input/keyboard/omap4-keypad.c @@ -253,10 +253,8 @@ static int omap4_keypad_probe(struct platform_device *pdev) } irq = platform_get_irq(pdev, 0); - if (!irq) { - dev_err(&pdev->dev, "no keyboard irq assigned\n"); - return -EINVAL; - } + if (irq < 0) + return irq; keypad_data = kzalloc(sizeof(struct omap4_keypad), GFP_KERNEL); if (!keypad_data) { diff --git a/drivers/input/keyboard/twl4030_keypad.c b/drivers/input/keyboard/twl4030_keypad.c index f9f98ef1d98e..8677dbe0fd20 100644 --- a/drivers/input/keyboard/twl4030_keypad.c +++ b/drivers/input/keyboard/twl4030_keypad.c @@ -63,7 +63,7 @@ struct twl4030_keypad { bool autorepeat; unsigned int n_rows; unsigned int n_cols; - unsigned int irq; + int irq; struct device *dbg_dev; struct input_dev *input; @@ -389,10 +389,8 @@ static int twl4030_kp_probe(struct platform_device *pdev) } kp->irq = platform_get_irq(pdev, 0); - if (!kp->irq) { - dev_err(&pdev->dev, "no keyboard irq assigned\n"); - return -EINVAL; - } + if (kp->irq < 0) + return kp->irq; error = matrix_keypad_build_keymap(keymap_data, NULL, TWL4030_MAX_ROWS, diff --git a/drivers/input/serio/hil_mlc.c b/drivers/input/serio/hil_mlc.c index e1423f7648d6..4c039e4125d9 100644 --- a/drivers/input/serio/hil_mlc.c +++ b/drivers/input/serio/hil_mlc.c @@ -74,7 +74,7 @@ EXPORT_SYMBOL(hil_mlc_unregister); static LIST_HEAD(hil_mlcs); static DEFINE_RWLOCK(hil_mlcs_lock); static struct timer_list hil_mlcs_kicker; -static int hil_mlcs_probe; +static int hil_mlcs_probe, hil_mlc_stop; static void hil_mlcs_process(unsigned long unused); static DECLARE_TASKLET_DISABLED(hil_mlcs_tasklet, hil_mlcs_process, 0); @@ -702,9 +702,13 @@ static int hilse_donode(hil_mlc *mlc) if (!mlc->ostarted) { mlc->ostarted = 1; mlc->opacket = pack; - mlc->out(mlc); + rc = mlc->out(mlc); nextidx = HILSEN_DOZE; write_unlock_irqrestore(&mlc->lock, flags); + if (rc) { + hil_mlc_stop = 1; + return 1; + } break; } mlc->ostarted = 0; @@ -715,8 +719,13 @@ static int hilse_donode(hil_mlc *mlc) case HILSE_CTS: write_lock_irqsave(&mlc->lock, flags); - nextidx = mlc->cts(mlc) ? node->bad : node->good; + rc = mlc->cts(mlc); + nextidx = rc ? node->bad : node->good; write_unlock_irqrestore(&mlc->lock, flags); + if (rc) { + hil_mlc_stop = 1; + return 1; + } break; default: @@ -780,6 +789,12 @@ static void hil_mlcs_process(unsigned long unused) static void hil_mlcs_timer(struct timer_list *unused) { + if (hil_mlc_stop) { + /* could not send packet - stop immediately. */ + pr_warn(PREFIX "HIL seems stuck - Disabling HIL MLC.\n"); + return; + } + hil_mlcs_probe = 1; tasklet_schedule(&hil_mlcs_tasklet); /* Re-insert the periodic task. */ diff --git a/drivers/input/serio/hp_sdc_mlc.c b/drivers/input/serio/hp_sdc_mlc.c index 232d30c825bd..3e85e9039374 100644 --- a/drivers/input/serio/hp_sdc_mlc.c +++ b/drivers/input/serio/hp_sdc_mlc.c @@ -210,7 +210,7 @@ static int hp_sdc_mlc_cts(hil_mlc *mlc) priv->tseq[2] = 1; priv->tseq[3] = 0; priv->tseq[4] = 0; - __hp_sdc_enqueue_transaction(&priv->trans); + return __hp_sdc_enqueue_transaction(&priv->trans); busy: return 1; done: @@ -219,7 +219,7 @@ static int hp_sdc_mlc_cts(hil_mlc *mlc) return 0; } -static void hp_sdc_mlc_out(hil_mlc *mlc) +static int hp_sdc_mlc_out(hil_mlc *mlc) { struct hp_sdc_mlc_priv_s *priv; @@ -234,7 +234,7 @@ static void hp_sdc_mlc_out(hil_mlc *mlc) do_data: if (priv->emtestmode) { up(&mlc->osem); - return; + return 0; } /* Shouldn't be sending commands when loop may be busy */ BUG_ON(down_trylock(&mlc->csem)); @@ -296,7 +296,7 @@ static void hp_sdc_mlc_out(hil_mlc *mlc) BUG_ON(down_trylock(&mlc->csem)); } enqueue: - hp_sdc_enqueue_transaction(&priv->trans); + return hp_sdc_enqueue_transaction(&priv->trans); } static int __init hp_sdc_mlc_init(void) diff --git a/drivers/input/serio/sun4i-ps2.c b/drivers/input/serio/sun4i-ps2.c index 04b96fe39339..46512b4d686a 100644 --- a/drivers/input/serio/sun4i-ps2.c +++ b/drivers/input/serio/sun4i-ps2.c @@ -210,7 +210,6 @@ static int sun4i_ps2_probe(struct platform_device *pdev) struct sun4i_ps2data *drvdata; struct serio *serio; struct device *dev = &pdev->dev; - unsigned int irq; int error; drvdata = kzalloc(sizeof(struct sun4i_ps2data), GFP_KERNEL); @@ -263,14 +262,12 @@ static int sun4i_ps2_probe(struct platform_device *pdev) writel(0, drvdata->reg_base + PS2_REG_GCTL); /* Get IRQ for the device */ - irq = platform_get_irq(pdev, 0); - if (!irq) { - dev_err(dev, "no IRQ found\n"); - error = -ENXIO; + drvdata->irq = platform_get_irq(pdev, 0); + if (drvdata->irq < 0) { + error = drvdata->irq; goto err_disable_clk; } - drvdata->irq = irq; drvdata->serio = serio; drvdata->dev = dev; diff --git a/drivers/input/touchscreen/imx6ul_tsc.c b/drivers/input/touchscreen/imx6ul_tsc.c index c10fc594f94d..6bfe42a11452 100644 --- a/drivers/input/touchscreen/imx6ul_tsc.c +++ b/drivers/input/touchscreen/imx6ul_tsc.c @@ -538,20 +538,25 @@ static int __maybe_unused imx6ul_tsc_resume(struct device *dev) mutex_lock(&input_dev->mutex); - if (input_dev->users) { - retval = clk_prepare_enable(tsc->adc_clk); - if (retval) - goto out; + if (!input_dev->users) + goto out; - retval = clk_prepare_enable(tsc->tsc_clk); - if (retval) { - clk_disable_unprepare(tsc->adc_clk); - goto out; - } + retval = clk_prepare_enable(tsc->adc_clk); + if (retval) + goto out; - retval = imx6ul_tsc_init(tsc); + retval = clk_prepare_enable(tsc->tsc_clk); + if (retval) { + clk_disable_unprepare(tsc->adc_clk); + goto out; } + retval = imx6ul_tsc_init(tsc); + if (retval) { + clk_disable_unprepare(tsc->tsc_clk); + clk_disable_unprepare(tsc->adc_clk); + goto out; + } out: mutex_unlock(&input_dev->mutex); return retval; diff --git a/drivers/input/touchscreen/stmfts.c b/drivers/input/touchscreen/stmfts.c index b6f95f20f924..cd8805d71d97 100644 --- a/drivers/input/touchscreen/stmfts.c +++ b/drivers/input/touchscreen/stmfts.c @@ -479,7 +479,7 @@ static ssize_t stmfts_sysfs_hover_enable_write(struct device *dev, mutex_lock(&sdata->mutex); - if (value & sdata->hover_enabled) + if (value && sdata->hover_enabled) goto out; if (sdata->running) diff --git a/drivers/leds/leds-bcm6328.c b/drivers/leds/leds-bcm6328.c index 2cfd9389ee96..b944ae828004 100644 --- a/drivers/leds/leds-bcm6328.c +++ b/drivers/leds/leds-bcm6328.c @@ -336,7 +336,7 @@ static int bcm6328_led(struct device *dev, struct device_node *nc, u32 reg, led->cdev.brightness_set = bcm6328_led_set; led->cdev.blink_set = bcm6328_blink_set; - rc = led_classdev_register(dev, &led->cdev); + rc = devm_led_classdev_register(dev, &led->cdev); if (rc < 0) return rc; diff --git a/drivers/leds/leds-bcm6358.c b/drivers/leds/leds-bcm6358.c index b2cc06618abe..a86ab6197a4e 100644 --- a/drivers/leds/leds-bcm6358.c +++ b/drivers/leds/leds-bcm6358.c @@ -141,7 +141,7 @@ static int bcm6358_led(struct device *dev, struct device_node *nc, u32 reg, led->cdev.brightness_set = bcm6358_led_set; - rc = led_classdev_register(dev, &led->cdev); + rc = devm_led_classdev_register(dev, &led->cdev); if (rc < 0) return rc; diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c index db5aa2905d8d..3980e5006b00 100644 --- a/drivers/mailbox/mailbox.c +++ b/drivers/mailbox/mailbox.c @@ -103,9 +103,12 @@ static void msg_submit(struct mbox_chan *chan) err = __msg_submit(chan); } while (err == -EAGAIN); - if (!err && (chan->txdone_method & TXDONE_BY_POLL)) - /* kick start the timer immediately to avoid delays */ - hrtimer_start(&chan->mbox->poll_hrt, 0, HRTIMER_MODE_REL); + /* kick start the timer immediately to avoid delays */ + if (!err && (chan->txdone_method & TXDONE_BY_POLL)) { + /* but only if not already active */ + if (!hrtimer_active(&chan->mbox->poll_hrt)) + hrtimer_start(&chan->mbox->poll_hrt, 0, HRTIMER_MODE_REL); + } } static void tx_tick(struct mbox_chan *chan, int r) @@ -143,11 +146,10 @@ static enum hrtimer_restart txdone_hrtimer(struct hrtimer *hrtimer) struct mbox_chan *chan = &mbox->chans[i]; if (chan->active_req && chan->cl) { + resched = true; txdone = chan->mbox->ops->last_tx_done(chan); if (txdone) tx_tick(chan, 0); - else - resched = true; } } diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c index fd8607124bdb..503f5e06fa86 100644 --- a/drivers/md/md-bitmap.c +++ b/drivers/md/md-bitmap.c @@ -1371,7 +1371,7 @@ __acquires(bitmap->lock) if (bitmap->bp[page].hijacked || bitmap->bp[page].map == NULL) csize = ((sector_t)1) << (bitmap->chunkshift + - PAGE_COUNTER_SHIFT - 1); + PAGE_COUNTER_SHIFT); else csize = ((sector_t)1) << bitmap->chunkshift; *blocks = csize - (offset & (csize - 1)); diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index d91154d65455..c7bda4b0bced 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -2417,8 +2417,6 @@ static int resize_stripes(struct r5conf *conf, int newsize) } else err = -ENOMEM; - mutex_unlock(&conf->cache_size_mutex); - conf->slab_cache = sc; conf->active_name = 1-conf->active_name; @@ -2441,6 +2439,8 @@ static int resize_stripes(struct r5conf *conf, int newsize) if (!err) conf->pool_size = newsize; + mutex_unlock(&conf->cache_size_mutex); + return err; } diff --git a/drivers/media/firewire/firedtv-fw.c b/drivers/media/firewire/firedtv-fw.c index eaf94b817dbc..2ac9d24d3f0c 100644 --- a/drivers/media/firewire/firedtv-fw.c +++ b/drivers/media/firewire/firedtv-fw.c @@ -271,8 +271,10 @@ static int node_probe(struct fw_unit *unit, const struct ieee1394_device_id *id) name_len = fw_csr_string(unit->directory, CSR_MODEL, name, sizeof(name)); - if (name_len < 0) - return name_len; + if (name_len < 0) { + err = name_len; + goto fail_free; + } for (i = ARRAY_SIZE(model_names); --i; ) if (strlen(model_names[i]) <= name_len && strncmp(name, model_names[i], name_len) == 0) diff --git a/drivers/media/i2c/imx274.c b/drivers/media/i2c/imx274.c index 8cc3bdb7f608..0fe8b869245b 100644 --- a/drivers/media/i2c/imx274.c +++ b/drivers/media/i2c/imx274.c @@ -1239,6 +1239,8 @@ static int imx274_s_frame_interval(struct v4l2_subdev *sd, ret = imx274_set_frame_interval(imx274, fi->interval); if (!ret) { + fi->interval = imx274->frame_interval; + /* * exposure time range is decided by frame interval * need to update it after frame interval changes @@ -1760,9 +1762,9 @@ static int imx274_set_frame_interval(struct stimx274 *priv, __func__, frame_interval.numerator, frame_interval.denominator); - if (frame_interval.numerator == 0) { - err = -EINVAL; - goto fail; + if (frame_interval.numerator == 0 || frame_interval.denominator == 0) { + frame_interval.denominator = IMX274_DEF_FRAME_RATE; + frame_interval.numerator = 1; } req_frame_rate = (u32)(frame_interval.denominator diff --git a/drivers/media/i2c/m5mols/m5mols_core.c b/drivers/media/i2c/m5mols/m5mols_core.c index 12e79f9e32d5..d9a964430609 100644 --- a/drivers/media/i2c/m5mols/m5mols_core.c +++ b/drivers/media/i2c/m5mols/m5mols_core.c @@ -768,7 +768,8 @@ static int m5mols_sensor_power(struct m5mols_info *info, bool enable) ret = regulator_bulk_enable(ARRAY_SIZE(supplies), supplies); if (ret) { - info->set_power(&client->dev, 0); + if (info->set_power) + info->set_power(&client->dev, 0); return ret; } diff --git a/drivers/media/i2c/tc358743.c b/drivers/media/i2c/tc358743.c index e4c0a27b636a..d9bc3851bf63 100644 --- a/drivers/media/i2c/tc358743.c +++ b/drivers/media/i2c/tc358743.c @@ -919,8 +919,8 @@ static const struct cec_adap_ops tc358743_cec_adap_ops = { .adap_monitor_all_enable = tc358743_cec_adap_monitor_all_enable, }; -static void tc358743_cec_isr(struct v4l2_subdev *sd, u16 intstatus, - bool *handled) +static void tc358743_cec_handler(struct v4l2_subdev *sd, u16 intstatus, + bool *handled) { struct tc358743_state *state = to_state(sd); unsigned int cec_rxint, cec_txint; @@ -953,7 +953,8 @@ static void tc358743_cec_isr(struct v4l2_subdev *sd, u16 intstatus, cec_transmit_attempt_done(state->cec_adap, CEC_TX_STATUS_ERROR); } - *handled = true; + if (handled) + *handled = true; } if ((intstatus & MASK_CEC_RINT) && (cec_rxint & MASK_CECRIEND)) { @@ -968,7 +969,8 @@ static void tc358743_cec_isr(struct v4l2_subdev *sd, u16 intstatus, msg.msg[i] = v & 0xff; } cec_received_msg(state->cec_adap, &msg); - *handled = true; + if (handled) + *handled = true; } i2c_wr16(sd, INTSTATUS, intstatus & (MASK_CEC_RINT | MASK_CEC_TINT)); @@ -1432,7 +1434,7 @@ static int tc358743_isr(struct v4l2_subdev *sd, u32 status, bool *handled) #ifdef CONFIG_VIDEO_TC358743_CEC if (intstatus & (MASK_CEC_RINT | MASK_CEC_TINT)) { - tc358743_cec_isr(sd, intstatus, handled); + tc358743_cec_handler(sd, intstatus, handled); i2c_wr16(sd, INTSTATUS, intstatus & (MASK_CEC_RINT | MASK_CEC_TINT)); intstatus &= ~(MASK_CEC_RINT | MASK_CEC_TINT); @@ -1461,7 +1463,7 @@ static int tc358743_isr(struct v4l2_subdev *sd, u32 status, bool *handled) static irqreturn_t tc358743_irq_handler(int irq, void *dev_id) { struct tc358743_state *state = dev_id; - bool handled; + bool handled = false; tc358743_isr(&state->sd, 0, &handled); diff --git a/drivers/media/pci/bt8xx/bttv-driver.c b/drivers/media/pci/bt8xx/bttv-driver.c index cf05e11da01b..4c042ba6de91 100644 --- a/drivers/media/pci/bt8xx/bttv-driver.c +++ b/drivers/media/pci/bt8xx/bttv-driver.c @@ -4055,11 +4055,13 @@ static int bttv_probe(struct pci_dev *dev, const struct pci_device_id *pci_id) btv->id = dev->device; if (pci_enable_device(dev)) { pr_warn("%d: Can't enable device\n", btv->c.nr); - return -EIO; + result = -EIO; + goto free_mem; } if (pci_set_dma_mask(dev, DMA_BIT_MASK(32))) { pr_warn("%d: No suitable DMA available\n", btv->c.nr); - return -EIO; + result = -EIO; + goto free_mem; } if (!request_mem_region(pci_resource_start(dev,0), pci_resource_len(dev,0), @@ -4067,7 +4069,8 @@ static int bttv_probe(struct pci_dev *dev, const struct pci_device_id *pci_id) pr_warn("%d: can't request iomem (0x%llx)\n", btv->c.nr, (unsigned long long)pci_resource_start(dev, 0)); - return -EBUSY; + result = -EBUSY; + goto free_mem; } pci_set_master(dev); pci_set_command(dev); @@ -4253,6 +4256,10 @@ fail0: release_mem_region(pci_resource_start(btv->c.pci,0), pci_resource_len(btv->c.pci,0)); pci_disable_device(btv->c.pci); + +free_mem: + bttvs[btv->c.nr] = NULL; + kfree(btv); return result; } diff --git a/drivers/media/pci/saa7134/saa7134-tvaudio.c b/drivers/media/pci/saa7134/saa7134-tvaudio.c index 68d400e1e240..8c3da6f7a60f 100644 --- a/drivers/media/pci/saa7134/saa7134-tvaudio.c +++ b/drivers/media/pci/saa7134/saa7134-tvaudio.c @@ -693,7 +693,8 @@ int saa_dsp_writel(struct saa7134_dev *dev, int reg, u32 value) { int err; - audio_dbg(2, "dsp write reg 0x%x = 0x%06x\n", reg << 2, value); + audio_dbg(2, "dsp write reg 0x%x = 0x%06x\n", + (reg << 2) & 0xffffffff, value); err = saa_dsp_wait_bit(dev,SAA7135_DSP_RWSTATE_WRR); if (err < 0) return err; diff --git a/drivers/media/pci/tw5864/tw5864-video.c b/drivers/media/pci/tw5864/tw5864-video.c index 6c40e60ac993..b0f8d1532b70 100644 --- a/drivers/media/pci/tw5864/tw5864-video.c +++ b/drivers/media/pci/tw5864/tw5864-video.c @@ -776,6 +776,9 @@ static int tw5864_enum_frameintervals(struct file *file, void *priv, fintv->type = V4L2_FRMIVAL_TYPE_STEPWISE; ret = tw5864_frameinterval_get(input, &frameinterval); + if (ret) + return ret; + fintv->stepwise.step = frameinterval; fintv->stepwise.min = frameinterval; fintv->stepwise.max = frameinterval; @@ -794,6 +797,9 @@ static int tw5864_g_parm(struct file *file, void *priv, cp->capability = V4L2_CAP_TIMEPERFRAME; ret = tw5864_frameinterval_get(input, &cp->timeperframe); + if (ret) + return ret; + cp->timeperframe.numerator *= input->frame_interval; cp->capturemode = 0; cp->readbuffers = 2; diff --git a/drivers/media/platform/exynos4-is/fimc-isp.c b/drivers/media/platform/exynos4-is/fimc-isp.c index 9a48c0f69320..1dbebdc1c2f8 100644 --- a/drivers/media/platform/exynos4-is/fimc-isp.c +++ b/drivers/media/platform/exynos4-is/fimc-isp.c @@ -311,8 +311,10 @@ static int fimc_isp_subdev_s_power(struct v4l2_subdev *sd, int on) if (on) { ret = pm_runtime_get_sync(&is->pdev->dev); - if (ret < 0) + if (ret < 0) { + pm_runtime_put(&is->pdev->dev); return ret; + } set_bit(IS_ST_PWR_ON, &is->state); ret = fimc_is_start_firmware(is); diff --git a/drivers/media/platform/exynos4-is/fimc-lite.c b/drivers/media/platform/exynos4-is/fimc-lite.c index 70d5f5586a5d..10fe7d2e8790 100644 --- a/drivers/media/platform/exynos4-is/fimc-lite.c +++ b/drivers/media/platform/exynos4-is/fimc-lite.c @@ -480,7 +480,7 @@ static int fimc_lite_open(struct file *file) set_bit(ST_FLITE_IN_USE, &fimc->state); ret = pm_runtime_get_sync(&fimc->pdev->dev); if (ret < 0) - goto unlock; + goto err_pm; ret = v4l2_fh_open(file); if (ret < 0) diff --git a/drivers/media/platform/exynos4-is/media-dev.c b/drivers/media/platform/exynos4-is/media-dev.c index 2d25a197dc65..3261dc72cc61 100644 --- a/drivers/media/platform/exynos4-is/media-dev.c +++ b/drivers/media/platform/exynos4-is/media-dev.c @@ -481,8 +481,10 @@ static int fimc_md_register_sensor_entities(struct fimc_md *fmd) return -ENXIO; ret = pm_runtime_get_sync(fmd->pmf); - if (ret < 0) + if (ret < 0) { + pm_runtime_put(fmd->pmf); return ret; + } fmd->num_sensors = 0; @@ -1257,11 +1259,9 @@ static int fimc_md_get_pinctrl(struct fimc_md *fmd) if (IS_ERR(pctl->state_default)) return PTR_ERR(pctl->state_default); + /* PINCTRL_STATE_IDLE is optional */ pctl->state_idle = pinctrl_lookup_state(pctl->pinctrl, PINCTRL_STATE_IDLE); - if (IS_ERR(pctl->state_idle)) - return PTR_ERR(pctl->state_idle); - return 0; } diff --git a/drivers/media/platform/exynos4-is/mipi-csis.c b/drivers/media/platform/exynos4-is/mipi-csis.c index b4e28a299e26..efab3ebc6756 100644 --- a/drivers/media/platform/exynos4-is/mipi-csis.c +++ b/drivers/media/platform/exynos4-is/mipi-csis.c @@ -513,8 +513,10 @@ static int s5pcsis_s_stream(struct v4l2_subdev *sd, int enable) if (enable) { s5pcsis_clear_counters(state); ret = pm_runtime_get_sync(&state->pdev->dev); - if (ret && ret != 1) + if (ret && ret != 1) { + pm_runtime_put_noidle(&state->pdev->dev); return ret; + } } mutex_lock(&state->lock); diff --git a/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c b/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c index 11429633b2fb..f0bca30a0a80 100644 --- a/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c +++ b/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c @@ -579,6 +579,13 @@ static int mtk_jpeg_queue_setup(struct vb2_queue *q, if (!q_data) return -EINVAL; + if (*num_planes) { + for (i = 0; i < *num_planes; i++) + if (sizes[i] < q_data->sizeimage[i]) + return -EINVAL; + return 0; + } + *num_planes = q_data->fmt->colplanes; for (i = 0; i < q_data->fmt->colplanes; i++) { sizes[i] = q_data->sizeimage[i]; diff --git a/drivers/media/platform/mx2_emmaprp.c b/drivers/media/platform/mx2_emmaprp.c index 419e1cb10dc6..f4be4c672d40 100644 --- a/drivers/media/platform/mx2_emmaprp.c +++ b/drivers/media/platform/mx2_emmaprp.c @@ -929,8 +929,11 @@ static int emmaprp_probe(struct platform_device *pdev) platform_set_drvdata(pdev, pcdev); irq = platform_get_irq(pdev, 0); - if (irq < 0) - return irq; + if (irq < 0) { + ret = irq; + goto rel_vdev; + } + ret = devm_request_irq(&pdev->dev, irq, emmaprp_irq, 0, dev_name(&pdev->dev), pcdev); if (ret) diff --git a/drivers/media/platform/omap3isp/isp.c b/drivers/media/platform/omap3isp/isp.c index addd03b51748..00e52f0b8251 100644 --- a/drivers/media/platform/omap3isp/isp.c +++ b/drivers/media/platform/omap3isp/isp.c @@ -2265,8 +2265,10 @@ static int isp_probe(struct platform_device *pdev) mem = platform_get_resource(pdev, IORESOURCE_MEM, i); isp->mmio_base[map_idx] = devm_ioremap_resource(isp->dev, mem); - if (IS_ERR(isp->mmio_base[map_idx])) - return PTR_ERR(isp->mmio_base[map_idx]); + if (IS_ERR(isp->mmio_base[map_idx])) { + ret = PTR_ERR(isp->mmio_base[map_idx]); + goto error; + } } ret = isp_get_clocks(isp); diff --git a/drivers/media/platform/qcom/camss/camss-csiphy.c b/drivers/media/platform/qcom/camss/camss-csiphy.c index 008afb85023b..3c5b9082ad72 100644 --- a/drivers/media/platform/qcom/camss/camss-csiphy.c +++ b/drivers/media/platform/qcom/camss/camss-csiphy.c @@ -176,8 +176,10 @@ static int csiphy_set_power(struct v4l2_subdev *sd, int on) int ret; ret = pm_runtime_get_sync(dev); - if (ret < 0) + if (ret < 0) { + pm_runtime_put_sync(dev); return ret; + } ret = csiphy_set_clock_rates(csiphy); if (ret < 0) { diff --git a/drivers/media/platform/qcom/venus/core.c b/drivers/media/platform/qcom/venus/core.c index 60069869596c..168f5af6abcc 100644 --- a/drivers/media/platform/qcom/venus/core.c +++ b/drivers/media/platform/qcom/venus/core.c @@ -321,8 +321,10 @@ static int venus_probe(struct platform_device *pdev) goto err_dev_unregister; ret = pm_runtime_put_sync(dev); - if (ret) + if (ret) { + pm_runtime_get_noresume(dev); goto err_dev_unregister; + } return 0; @@ -333,6 +335,7 @@ err_core_deinit: err_venus_shutdown: venus_shutdown(dev); err_runtime_disable: + pm_runtime_put_noidle(dev); pm_runtime_set_suspended(dev); pm_runtime_disable(dev); hfi_destroy(core); diff --git a/drivers/media/platform/rcar-fcp.c b/drivers/media/platform/rcar-fcp.c index 5c6b00737fe7..05c712e00a2a 100644 --- a/drivers/media/platform/rcar-fcp.c +++ b/drivers/media/platform/rcar-fcp.c @@ -103,8 +103,10 @@ int rcar_fcp_enable(struct rcar_fcp_device *fcp) return 0; ret = pm_runtime_get_sync(fcp->dev); - if (ret < 0) + if (ret < 0) { + pm_runtime_put_noidle(fcp->dev); return ret; + } return 0; } diff --git a/drivers/media/platform/rcar-vin/rcar-dma.c b/drivers/media/platform/rcar-vin/rcar-dma.c index 92323310f735..70a8cc433a03 100644 --- a/drivers/media/platform/rcar-vin/rcar-dma.c +++ b/drivers/media/platform/rcar-vin/rcar-dma.c @@ -1323,8 +1323,10 @@ int rvin_set_channel_routing(struct rvin_dev *vin, u8 chsel) int ret; ret = pm_runtime_get_sync(vin->dev); - if (ret < 0) + if (ret < 0) { + pm_runtime_put_noidle(vin->dev); return ret; + } /* Make register writes take effect immediately. */ vnmc = rvin_read(vin, VNMC_REG); diff --git a/drivers/media/platform/rockchip/rga/rga-buf.c b/drivers/media/platform/rockchip/rga/rga-buf.c index 356821c2dacf..0932f1445dea 100644 --- a/drivers/media/platform/rockchip/rga/rga-buf.c +++ b/drivers/media/platform/rockchip/rga/rga-buf.c @@ -89,6 +89,7 @@ static int rga_buf_start_streaming(struct vb2_queue *q, unsigned int count) ret = pm_runtime_get_sync(rga->dev); if (ret < 0) { + pm_runtime_put_noidle(rga->dev); rga_buf_return_buffers(q, VB2_BUF_STATE_QUEUED); return ret; } diff --git a/drivers/media/platform/s3c-camif/camif-core.c b/drivers/media/platform/s3c-camif/camif-core.c index 79bc0ef6bb41..8d8ed72bd0aa 100644 --- a/drivers/media/platform/s3c-camif/camif-core.c +++ b/drivers/media/platform/s3c-camif/camif-core.c @@ -476,7 +476,7 @@ static int s3c_camif_probe(struct platform_device *pdev) ret = camif_media_dev_init(camif); if (ret < 0) - goto err_alloc; + goto err_pm; ret = camif_register_sensor(camif); if (ret < 0) @@ -510,10 +510,9 @@ err_sens: media_device_unregister(&camif->media_dev); media_device_cleanup(&camif->media_dev); camif_unregister_media_entities(camif); -err_alloc: +err_pm: pm_runtime_put(dev); pm_runtime_disable(dev); -err_pm: camif_clk_put(camif); err_clk: s3c_camif_unregister_subdev(camif); diff --git a/drivers/media/platform/s5p-mfc/s5p_mfc_pm.c b/drivers/media/platform/s5p-mfc/s5p_mfc_pm.c index 5e080f32b0e8..95abf2bd7eba 100644 --- a/drivers/media/platform/s5p-mfc/s5p_mfc_pm.c +++ b/drivers/media/platform/s5p-mfc/s5p_mfc_pm.c @@ -83,8 +83,10 @@ int s5p_mfc_power_on(void) int i, ret = 0; ret = pm_runtime_get_sync(pm->device); - if (ret < 0) + if (ret < 0) { + pm_runtime_put_noidle(pm->device); return ret; + } /* clock control */ for (i = 0; i < pm->num_clocks; i++) { diff --git a/drivers/media/platform/sti/bdisp/bdisp-v4l2.c b/drivers/media/platform/sti/bdisp/bdisp-v4l2.c index 40c4eef71c34..00f6e3f06dac 100644 --- a/drivers/media/platform/sti/bdisp/bdisp-v4l2.c +++ b/drivers/media/platform/sti/bdisp/bdisp-v4l2.c @@ -1371,7 +1371,7 @@ static int bdisp_probe(struct platform_device *pdev) ret = pm_runtime_get_sync(dev); if (ret < 0) { dev_err(dev, "failed to set PM\n"); - goto err_dbg; + goto err_pm; } /* Filters */ @@ -1399,7 +1399,6 @@ err_filter: bdisp_hw_free_filters(bdisp->dev); err_pm: pm_runtime_put(dev); -err_dbg: bdisp_debugfs_remove(bdisp); err_v4l2: v4l2_device_unregister(&bdisp->v4l2_dev); diff --git a/drivers/media/platform/sti/delta/delta-v4l2.c b/drivers/media/platform/sti/delta/delta-v4l2.c index 0b42acd4e3a6..53dc6da2b09e 100644 --- a/drivers/media/platform/sti/delta/delta-v4l2.c +++ b/drivers/media/platform/sti/delta/delta-v4l2.c @@ -954,8 +954,10 @@ static void delta_run_work(struct work_struct *work) /* enable the hardware */ if (!dec->pm) { ret = delta_get_sync(ctx); - if (ret) + if (ret) { + delta_put_autosuspend(ctx); goto err; + } } /* decode this access unit */ diff --git a/drivers/media/platform/sti/hva/hva-hw.c b/drivers/media/platform/sti/hva/hva-hw.c index 7917fd2c4bd4..d826c011c095 100644 --- a/drivers/media/platform/sti/hva/hva-hw.c +++ b/drivers/media/platform/sti/hva/hva-hw.c @@ -272,6 +272,7 @@ static unsigned long int hva_hw_get_ip_version(struct hva_dev *hva) if (pm_runtime_get_sync(dev) < 0) { dev_err(dev, "%s failed to get pm_runtime\n", HVA_PREFIX); + pm_runtime_put_noidle(dev); mutex_unlock(&hva->protect_mutex); return -EFAULT; } @@ -392,7 +393,7 @@ int hva_hw_probe(struct platform_device *pdev, struct hva_dev *hva) ret = pm_runtime_get_sync(dev); if (ret < 0) { dev_err(dev, "%s failed to set PM\n", HVA_PREFIX); - goto err_clk; + goto err_pm; } /* check IP hardware version */ @@ -557,6 +558,7 @@ void hva_hw_dump_regs(struct hva_dev *hva, struct seq_file *s) if (pm_runtime_get_sync(dev) < 0) { seq_puts(s, "Cannot wake up IP\n"); + pm_runtime_put_noidle(dev); mutex_unlock(&hva->protect_mutex); return; } diff --git a/drivers/media/platform/stm32/stm32-dcmi.c b/drivers/media/platform/stm32/stm32-dcmi.c index 18d0b5641789..ee1a21179767 100644 --- a/drivers/media/platform/stm32/stm32-dcmi.c +++ b/drivers/media/platform/stm32/stm32-dcmi.c @@ -587,7 +587,7 @@ static int dcmi_start_streaming(struct vb2_queue *vq, unsigned int count) if (ret < 0) { dev_err(dcmi->dev, "%s: Failed to start streaming, cannot get sync (%d)\n", __func__, ret); - goto err_release_buffers; + goto err_pm_put; } /* Enable stream on the sub device */ @@ -682,8 +682,6 @@ err_subdev_streamoff: err_pm_put: pm_runtime_put(dcmi->dev); - -err_release_buffers: spin_lock_irq(&dcmi->irqlock); /* * Return all buffers to vb2 in QUEUED state. diff --git a/drivers/media/platform/ti-vpe/vpe.c b/drivers/media/platform/ti-vpe/vpe.c index a285b9db7ee8..70a8371b7e9a 100644 --- a/drivers/media/platform/ti-vpe/vpe.c +++ b/drivers/media/platform/ti-vpe/vpe.c @@ -2451,6 +2451,8 @@ static int vpe_runtime_get(struct platform_device *pdev) r = pm_runtime_get_sync(&pdev->dev); WARN_ON(r < 0); + if (r) + pm_runtime_put_noidle(&pdev->dev); return r < 0 ? r : 0; } diff --git a/drivers/media/platform/vsp1/vsp1_drv.c b/drivers/media/platform/vsp1/vsp1_drv.c index b6619c9c18bb..4e6530ee809a 100644 --- a/drivers/media/platform/vsp1/vsp1_drv.c +++ b/drivers/media/platform/vsp1/vsp1_drv.c @@ -562,7 +562,12 @@ int vsp1_device_get(struct vsp1_device *vsp1) int ret; ret = pm_runtime_get_sync(vsp1->dev); - return ret < 0 ? ret : 0; + if (ret < 0) { + pm_runtime_put_noidle(vsp1->dev); + return ret; + } + + return 0; } /* @@ -845,12 +850,12 @@ static int vsp1_probe(struct platform_device *pdev) /* Configure device parameters based on the version register. */ pm_runtime_enable(&pdev->dev); - ret = pm_runtime_get_sync(&pdev->dev); + ret = vsp1_device_get(vsp1); if (ret < 0) goto done; vsp1->version = vsp1_read(vsp1, VI6_IP_VERSION); - pm_runtime_put_sync(&pdev->dev); + vsp1_device_put(vsp1); for (i = 0; i < ARRAY_SIZE(vsp1_device_infos); ++i) { if ((vsp1->version & VI6_IP_VERSION_MODEL_MASK) == diff --git a/drivers/media/rc/ati_remote.c b/drivers/media/rc/ati_remote.c index 8e82610ffaad..01c82da8e9aa 100644 --- a/drivers/media/rc/ati_remote.c +++ b/drivers/media/rc/ati_remote.c @@ -845,6 +845,10 @@ static int ati_remote_probe(struct usb_interface *interface, err("%s: endpoint_in message size==0? \n", __func__); return -ENODEV; } + if (!usb_endpoint_is_int_out(endpoint_out)) { + err("%s: Unexpected endpoint_out\n", __func__); + return -ENODEV; + } ati_remote = kzalloc(sizeof (struct ati_remote), GFP_KERNEL); rc_dev = rc_allocate_device(RC_DRIVER_SCANCODE); diff --git a/drivers/media/tuners/tuner-simple.c b/drivers/media/tuners/tuner-simple.c index 29c1473f2e9f..81e24cf0c8b8 100644 --- a/drivers/media/tuners/tuner-simple.c +++ b/drivers/media/tuners/tuner-simple.c @@ -499,7 +499,7 @@ static int simple_radio_bandswitch(struct dvb_frontend *fe, u8 *buffer) case TUNER_TENA_9533_DI: case TUNER_YMEC_TVF_5533MF: tuner_dbg("This tuner doesn't have FM. Most cards have a TEA5767 for FM\n"); - return 0; + return -EINVAL; case TUNER_PHILIPS_FM1216ME_MK3: case TUNER_PHILIPS_FM1236_MK3: case TUNER_PHILIPS_FMD1216ME_MK3: @@ -701,7 +701,8 @@ static int simple_set_radio_freq(struct dvb_frontend *fe, TUNER_RATIO_SELECT_50; /* 50 kHz step */ /* Bandswitch byte */ - simple_radio_bandswitch(fe, &buffer[0]); + if (simple_radio_bandswitch(fe, &buffer[0])) + return 0; /* Convert from 1/16 kHz V4L steps to 1/20 MHz (=50 kHz) PLL steps freq * (1 Mhz / 16000 V4L steps) * (20 PLL steps / 1 MHz) = diff --git a/drivers/media/usb/uvc/uvc_ctrl.c b/drivers/media/usb/uvc/uvc_ctrl.c index 5357558bef39..919203723863 100644 --- a/drivers/media/usb/uvc/uvc_ctrl.c +++ b/drivers/media/usb/uvc/uvc_ctrl.c @@ -778,12 +778,16 @@ static s32 uvc_get_le_value(struct uvc_control_mapping *mapping, offset &= 7; mask = ((1LL << bits) - 1) << offset; - for (; bits > 0; data++) { + while (1) { u8 byte = *data & mask; value |= offset > 0 ? (byte >> offset) : (byte << (-offset)); bits -= 8 - (offset > 0 ? offset : 0); + if (bits <= 0) + break; + offset -= 8; mask = (1 << bits) - 1; + data++; } /* Sign-extend the value if needed. */ @@ -1849,30 +1853,35 @@ int uvc_xu_ctrl_query(struct uvc_video_chain *chain, { struct uvc_entity *entity; struct uvc_control *ctrl; - unsigned int i, found = 0; + unsigned int i; + bool found; u32 reqflags; u16 size; u8 *data = NULL; int ret; /* Find the extension unit. */ + found = false; list_for_each_entry(entity, &chain->entities, chain) { if (UVC_ENTITY_TYPE(entity) == UVC_VC_EXTENSION_UNIT && - entity->id == xqry->unit) + entity->id == xqry->unit) { + found = true; break; + } } - if (entity->id != xqry->unit) { + if (!found) { uvc_trace(UVC_TRACE_CONTROL, "Extension unit %u not found.\n", xqry->unit); return -ENOENT; } /* Find the control and perform delayed initialization if needed. */ + found = false; for (i = 0; i < entity->ncontrols; ++i) { ctrl = &entity->controls[i]; if (ctrl->index == xqry->selector - 1) { - found = 1; + found = true; break; } } @@ -2029,13 +2038,6 @@ static int uvc_ctrl_add_info(struct uvc_device *dev, struct uvc_control *ctrl, goto done; } - /* - * Retrieve control flags from the device. Ignore errors and work with - * default flag values from the uvc_ctrl array when the device doesn't - * properly implement GET_INFO on standard controls. - */ - uvc_ctrl_get_flags(dev, ctrl, &ctrl->info); - ctrl->initialized = 1; uvc_trace(UVC_TRACE_CONTROL, "Added control %pUl/%u to device %s " @@ -2261,6 +2263,13 @@ static void uvc_ctrl_init_ctrl(struct uvc_device *dev, struct uvc_control *ctrl) if (uvc_entity_match_guid(ctrl->entity, info->entity) && ctrl->index == info->index) { uvc_ctrl_add_info(dev, ctrl, info); + /* + * Retrieve control flags from the device. Ignore errors + * and work with default flag values from the uvc_ctrl + * array when the device doesn't properly implement + * GET_INFO on standard controls. + */ + uvc_ctrl_get_flags(dev, ctrl, &ctrl->info); break; } } diff --git a/drivers/media/usb/uvc/uvc_entity.c b/drivers/media/usb/uvc/uvc_entity.c index 554063c07d7a..f2457953f27c 100644 --- a/drivers/media/usb/uvc/uvc_entity.c +++ b/drivers/media/usb/uvc/uvc_entity.c @@ -78,10 +78,45 @@ static int uvc_mc_init_entity(struct uvc_video_chain *chain, int ret; if (UVC_ENTITY_TYPE(entity) != UVC_TT_STREAMING) { + u32 function; + v4l2_subdev_init(&entity->subdev, &uvc_subdev_ops); strlcpy(entity->subdev.name, entity->name, sizeof(entity->subdev.name)); + switch (UVC_ENTITY_TYPE(entity)) { + case UVC_VC_SELECTOR_UNIT: + function = MEDIA_ENT_F_VID_MUX; + break; + case UVC_VC_PROCESSING_UNIT: + case UVC_VC_EXTENSION_UNIT: + /* For lack of a better option. */ + function = MEDIA_ENT_F_PROC_VIDEO_PIXEL_FORMATTER; + break; + case UVC_COMPOSITE_CONNECTOR: + case UVC_COMPONENT_CONNECTOR: + function = MEDIA_ENT_F_CONN_COMPOSITE; + break; + case UVC_SVIDEO_CONNECTOR: + function = MEDIA_ENT_F_CONN_SVIDEO; + break; + case UVC_ITT_CAMERA: + function = MEDIA_ENT_F_CAM_SENSOR; + break; + case UVC_TT_VENDOR_SPECIFIC: + case UVC_ITT_VENDOR_SPECIFIC: + case UVC_ITT_MEDIA_TRANSPORT_INPUT: + case UVC_OTT_VENDOR_SPECIFIC: + case UVC_OTT_DISPLAY: + case UVC_OTT_MEDIA_TRANSPORT_OUTPUT: + case UVC_EXTERNAL_VENDOR_SPECIFIC: + default: + function = MEDIA_ENT_F_V4L2_SUBDEV_UNKNOWN; + break; + } + + entity->subdev.entity.function = function; + ret = media_entity_pads_init(&entity->subdev.entity, entity->num_pads, entity->pads); diff --git a/drivers/media/usb/uvc/uvc_v4l2.c b/drivers/media/usb/uvc/uvc_v4l2.c index 18a7384b50ee..0921c95a1dca 100644 --- a/drivers/media/usb/uvc/uvc_v4l2.c +++ b/drivers/media/usb/uvc/uvc_v4l2.c @@ -252,11 +252,41 @@ static int uvc_v4l2_try_format(struct uvc_streaming *stream, if (ret < 0) goto done; + /* After the probe, update fmt with the values returned from + * negotiation with the device. + */ + for (i = 0; i < stream->nformats; ++i) { + if (probe->bFormatIndex == stream->format[i].index) { + format = &stream->format[i]; + break; + } + } + + if (i == stream->nformats) { + uvc_trace(UVC_TRACE_FORMAT, "Unknown bFormatIndex %u\n", + probe->bFormatIndex); + return -EINVAL; + } + + for (i = 0; i < format->nframes; ++i) { + if (probe->bFrameIndex == format->frame[i].bFrameIndex) { + frame = &format->frame[i]; + break; + } + } + + if (i == format->nframes) { + uvc_trace(UVC_TRACE_FORMAT, "Unknown bFrameIndex %u\n", + probe->bFrameIndex); + return -EINVAL; + } + fmt->fmt.pix.width = frame->wWidth; fmt->fmt.pix.height = frame->wHeight; fmt->fmt.pix.field = V4L2_FIELD_NONE; fmt->fmt.pix.bytesperline = uvc_v4l2_get_bytesperline(format, frame); fmt->fmt.pix.sizeimage = probe->dwMaxVideoFrameSize; + fmt->fmt.pix.pixelformat = format->fcc; fmt->fmt.pix.colorspace = format->colorspace; fmt->fmt.pix.priv = 0; diff --git a/drivers/memory/emif.c b/drivers/memory/emif.c index 2f214440008c..1c6b2cc6269a 100644 --- a/drivers/memory/emif.c +++ b/drivers/memory/emif.c @@ -165,35 +165,12 @@ static const struct file_operations emif_mr4_fops = { static int __init_or_module emif_debugfs_init(struct emif_data *emif) { - struct dentry *dentry; - int ret; - - dentry = debugfs_create_dir(dev_name(emif->dev), NULL); - if (!dentry) { - ret = -ENOMEM; - goto err0; - } - emif->debugfs_root = dentry; - - dentry = debugfs_create_file("regcache_dump", S_IRUGO, - emif->debugfs_root, emif, &emif_regdump_fops); - if (!dentry) { - ret = -ENOMEM; - goto err1; - } - - dentry = debugfs_create_file("mr4", S_IRUGO, - emif->debugfs_root, emif, &emif_mr4_fops); - if (!dentry) { - ret = -ENOMEM; - goto err1; - } - + emif->debugfs_root = debugfs_create_dir(dev_name(emif->dev), NULL); + debugfs_create_file("regcache_dump", S_IRUGO, emif->debugfs_root, emif, + &emif_regdump_fops); + debugfs_create_file("mr4", S_IRUGO, emif->debugfs_root, emif, + &emif_mr4_fops); return 0; -err1: - debugfs_remove_recursive(emif->debugfs_root); -err0: - return ret; } static void __exit emif_debugfs_exit(struct emif_data *emif) diff --git a/drivers/memory/fsl-corenet-cf.c b/drivers/memory/fsl-corenet-cf.c index 662d050243be..2fbf8d09af36 100644 --- a/drivers/memory/fsl-corenet-cf.c +++ b/drivers/memory/fsl-corenet-cf.c @@ -215,10 +215,8 @@ static int ccf_probe(struct platform_device *pdev) dev_set_drvdata(&pdev->dev, ccf); irq = platform_get_irq(pdev, 0); - if (!irq) { - dev_err(&pdev->dev, "%s: no irq\n", __func__); - return -ENXIO; - } + if (irq < 0) + return irq; ret = devm_request_irq(&pdev->dev, irq, ccf_irq, 0, pdev->name, ccf); if (ret) { diff --git a/drivers/memory/omap-gpmc.c b/drivers/memory/omap-gpmc.c index 1c6a7c16e0c1..2ca507f3a58c 100644 --- a/drivers/memory/omap-gpmc.c +++ b/drivers/memory/omap-gpmc.c @@ -951,7 +951,7 @@ static int gpmc_cs_remap(int cs, u32 base) int ret; u32 old_base, size; - if (cs > gpmc_cs_num) { + if (cs >= gpmc_cs_num) { pr_err("%s: requested chip-select is disabled\n", __func__); return -ENODEV; } @@ -986,7 +986,7 @@ int gpmc_cs_request(int cs, unsigned long size, unsigned long *base) struct resource *res = &gpmc->mem; int r = -1; - if (cs > gpmc_cs_num) { + if (cs >= gpmc_cs_num) { pr_err("%s: requested chip-select is disabled\n", __func__); return -ENODEV; } @@ -2278,6 +2278,10 @@ static void gpmc_probe_dt_children(struct platform_device *pdev) } } #else +void gpmc_read_settings_dt(struct device_node *np, struct gpmc_settings *p) +{ + memset(p, 0, sizeof(*p)); +} static int gpmc_probe_dt(struct platform_device *pdev) { return 0; diff --git a/drivers/message/fusion/mptscsih.c b/drivers/message/fusion/mptscsih.c index 2af7ae13449d..cec867c10968 100644 --- a/drivers/message/fusion/mptscsih.c +++ b/drivers/message/fusion/mptscsih.c @@ -1174,8 +1174,10 @@ mptscsih_remove(struct pci_dev *pdev) MPT_SCSI_HOST *hd; int sz1; - if((hd = shost_priv(host)) == NULL) - return; + if (host == NULL) + hd = NULL; + else + hd = shost_priv(host); mptscsih_shutdown(pdev); @@ -1191,14 +1193,15 @@ mptscsih_remove(struct pci_dev *pdev) "Free'd ScsiLookup (%d) memory\n", ioc->name, sz1)); - kfree(hd->info_kbuf); + if (hd) + kfree(hd->info_kbuf); /* NULL the Scsi_Host pointer */ ioc->sh = NULL; - scsi_host_put(host); - + if (host) + scsi_host_put(host); mpt_detach(pdev); } diff --git a/drivers/mfd/sm501.c b/drivers/mfd/sm501.c index e0173bf4b0dc..ec1ac61a21ed 100644 --- a/drivers/mfd/sm501.c +++ b/drivers/mfd/sm501.c @@ -1429,8 +1429,14 @@ static int sm501_plat_probe(struct platform_device *dev) goto err_claim; } - return sm501_init_dev(sm); + ret = sm501_init_dev(sm); + if (ret) + goto err_unmap; + return 0; + + err_unmap: + iounmap(sm->regs); err_claim: release_resource(sm->regs_claim); kfree(sm->regs_claim); diff --git a/drivers/misc/cardreader/rtsx_pcr.c b/drivers/misc/cardreader/rtsx_pcr.c index 5c5d0241603a..3eb3c237f339 100644 --- a/drivers/misc/cardreader/rtsx_pcr.c +++ b/drivers/misc/cardreader/rtsx_pcr.c @@ -1524,12 +1524,14 @@ static int rtsx_pci_probe(struct pci_dev *pcidev, ret = mfd_add_devices(&pcidev->dev, pcr->id, rtsx_pcr_cells, ARRAY_SIZE(rtsx_pcr_cells), NULL, 0, NULL); if (ret < 0) - goto disable_irq; + goto free_slots; schedule_delayed_work(&pcr->idle_work, msecs_to_jiffies(200)); return 0; +free_slots: + kfree(pcr->slots); disable_irq: free_irq(pcr->irq, (void *)pcr); disable_msi: diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c index 787a69a2a726..b9d3c87e9318 100644 --- a/drivers/misc/cxl/pci.c +++ b/drivers/misc/cxl/pci.c @@ -397,8 +397,8 @@ int cxl_calc_capp_routing(struct pci_dev *dev, u64 *chipid, *capp_unit_id = get_capp_unit_id(np, *phb_index); of_node_put(np); if (!*capp_unit_id) { - pr_err("cxl: invalid capp unit id (phb_index: %d)\n", - *phb_index); + pr_err("cxl: No capp unit found for PHB[%lld,%d]. Make sure the adapter is on a capi-compatible slot\n", + *chipid, *phb_index); return -ENODEV; } diff --git a/drivers/misc/eeprom/at25.c b/drivers/misc/eeprom/at25.c index 840afb398f9e..8dd5c610c438 100644 --- a/drivers/misc/eeprom/at25.c +++ b/drivers/misc/eeprom/at25.c @@ -362,7 +362,7 @@ static int at25_probe(struct spi_device *spi) at25->nvmem_config.reg_read = at25_ee_read; at25->nvmem_config.reg_write = at25_ee_write; at25->nvmem_config.priv = at25; - at25->nvmem_config.stride = 4; + at25->nvmem_config.stride = 1; at25->nvmem_config.word_size = 1; at25->nvmem_config.size = chip.byte_len; diff --git a/drivers/misc/mic/scif/scif_rma.c b/drivers/misc/mic/scif/scif_rma.c index 0e4193cb08cf..e1f59b17715d 100644 --- a/drivers/misc/mic/scif/scif_rma.c +++ b/drivers/misc/mic/scif/scif_rma.c @@ -1403,6 +1403,8 @@ retry: NULL); up_write(&mm->mmap_sem); if (nr_pages != pinned_pages->nr_pages) { + if (pinned_pages->nr_pages < 0) + pinned_pages->nr_pages = 0; if (try_upgrade) { if (ulimit) __scif_dec_pinned_vm_lock(mm, @@ -1423,7 +1425,6 @@ retry: if (pinned_pages->nr_pages < nr_pages) { err = -EFAULT; - pinned_pages->nr_pages = nr_pages; goto dec_pinned; } @@ -1436,7 +1437,6 @@ dec_pinned: __scif_dec_pinned_vm_lock(mm, nr_pages, 0); /* Something went wrong! Rollback */ error_unmap: - pinned_pages->nr_pages = nr_pages; scif_destroy_pinned_pages(pinned_pages); *pages = NULL; dev_dbg(scif_info.mdev.this_device, diff --git a/drivers/misc/mic/vop/vop_main.c b/drivers/misc/mic/vop/vop_main.c index de7f035a176d..f4332a97c691 100644 --- a/drivers/misc/mic/vop/vop_main.c +++ b/drivers/misc/mic/vop/vop_main.c @@ -301,7 +301,7 @@ static struct virtqueue *vop_find_vq(struct virtio_device *dev, /* First assign the vring's allocated in host memory */ vqconfig = _vop_vq_config(vdev->desc) + index; memcpy_fromio(&config, vqconfig, sizeof(config)); - _vr_size = vring_size(le16_to_cpu(config.num), MIC_VIRTIO_RING_ALIGN); + _vr_size = round_up(vring_size(le16_to_cpu(config.num), MIC_VIRTIO_RING_ALIGN), 4); vr_size = PAGE_ALIGN(_vr_size + sizeof(struct _mic_vring_info)); va = vpdev->hw_ops->ioremap(vpdev, le64_to_cpu(config.address), vr_size); diff --git a/drivers/misc/mic/vop/vop_vringh.c b/drivers/misc/mic/vop/vop_vringh.c index cbc8ebcff5cf..a252c2199b93 100644 --- a/drivers/misc/mic/vop/vop_vringh.c +++ b/drivers/misc/mic/vop/vop_vringh.c @@ -308,7 +308,7 @@ static int vop_virtio_add_device(struct vop_vdev *vdev, num = le16_to_cpu(vqconfig[i].num); mutex_init(&vvr->vr_mutex); - vr_size = PAGE_ALIGN(vring_size(num, MIC_VIRTIO_RING_ALIGN) + + vr_size = PAGE_ALIGN(round_up(vring_size(num, MIC_VIRTIO_RING_ALIGN), 4) + sizeof(struct _mic_vring_info)); vr->va = (void *) __get_free_pages(GFP_KERNEL | __GFP_ZERO, @@ -320,7 +320,7 @@ static int vop_virtio_add_device(struct vop_vdev *vdev, goto err; } vr->len = vr_size; - vr->info = vr->va + vring_size(num, MIC_VIRTIO_RING_ALIGN); + vr->info = vr->va + round_up(vring_size(num, MIC_VIRTIO_RING_ALIGN), 4); vr->info->magic = cpu_to_le32(MIC_MAGIC + vdev->virtio_id + i); vr_addr = dma_map_single(&vpdev->dev, vr->va, vr_size, DMA_BIDIRECTIONAL); @@ -611,6 +611,7 @@ static int vop_virtio_copy_from_user(struct vop_vdev *vdev, void __user *ubuf, size_t partlen; bool dma = VOP_USE_DMA; int err = 0; + size_t offset = 0; if (daddr & (dma_alignment - 1)) { vdev->tx_dst_unaligned += len; @@ -659,13 +660,20 @@ memcpy: * We are copying to IO below and should ideally use something * like copy_from_user_toio(..) if it existed. */ - if (copy_from_user((void __force *)dbuf, ubuf, len)) { - err = -EFAULT; - dev_err(vop_dev(vdev), "%s %d err %d\n", - __func__, __LINE__, err); - goto err; + while (len) { + partlen = min_t(size_t, len, VOP_INT_DMA_BUF_SIZE); + + if (copy_from_user(vvr->buf, ubuf + offset, partlen)) { + err = -EFAULT; + dev_err(vop_dev(vdev), "%s %d err %d\n", + __func__, __LINE__, err); + goto err; + } + memcpy_toio(dbuf + offset, vvr->buf, partlen); + offset += partlen; + vdev->out_bytes += partlen; + len -= partlen; } - vdev->out_bytes += len; err = 0; err: vpdev->hw_ops->iounmap(vpdev, dbuf); diff --git a/drivers/misc/vmw_vmci/vmci_queue_pair.c b/drivers/misc/vmw_vmci/vmci_queue_pair.c index bd52f29b4a4e..5e0d1ac67f73 100644 --- a/drivers/misc/vmw_vmci/vmci_queue_pair.c +++ b/drivers/misc/vmw_vmci/vmci_queue_pair.c @@ -671,8 +671,9 @@ static int qp_host_get_user_memory(u64 produce_uva, if (retval < (int)produce_q->kernel_if->num_pages) { pr_debug("get_user_pages_fast(produce) failed (retval=%d)", retval); - qp_release_pages(produce_q->kernel_if->u.h.header_page, - retval, false); + if (retval > 0) + qp_release_pages(produce_q->kernel_if->u.h.header_page, + retval, false); err = VMCI_ERROR_NO_MEM; goto out; } @@ -683,8 +684,9 @@ static int qp_host_get_user_memory(u64 produce_uva, if (retval < (int)consume_q->kernel_if->num_pages) { pr_debug("get_user_pages_fast(consume) failed (retval=%d)", retval); - qp_release_pages(consume_q->kernel_if->u.h.header_page, - retval, false); + if (retval > 0) + qp_release_pages(consume_q->kernel_if->u.h.header_page, + retval, false); qp_release_pages(produce_q->kernel_if->u.h.header_page, produce_q->kernel_if->num_pages, false); err = VMCI_ERROR_NO_MEM; diff --git a/drivers/mmc/core/sdio_cis.c b/drivers/mmc/core/sdio_cis.c index 4197e69f351f..75311efc954e 100644 --- a/drivers/mmc/core/sdio_cis.c +++ b/drivers/mmc/core/sdio_cis.c @@ -30,6 +30,9 @@ static int cistpl_vers_1(struct mmc_card *card, struct sdio_func *func, unsigned i, nr_strings; char **buffer, *string; + if (size < 2) + return 0; + /* Find all null-terminated (including zero length) strings in the TPLLV1_INFO field. Trailing garbage is ignored. */ buf += 2; diff --git a/drivers/mmc/host/sdhci-acpi.c b/drivers/mmc/host/sdhci-acpi.c index 145143b6a0e6..6cc187ce3a32 100644 --- a/drivers/mmc/host/sdhci-acpi.c +++ b/drivers/mmc/host/sdhci-acpi.c @@ -546,6 +546,43 @@ static int sdhci_acpi_emmc_amd_probe_slot(struct platform_device *pdev, (host->mmc->caps & MMC_CAP_1_8V_DDR)) host->mmc->caps2 = MMC_CAP2_HS400_1_8V; + /* + * There are two types of presets out in the wild: + * 1) Default/broken presets. + * These presets have two sets of problems: + * a) The clock divisor for SDR12, SDR25, and SDR50 is too small. + * This results in clock frequencies that are 2x higher than + * acceptable. i.e., SDR12 = 25 MHz, SDR25 = 50 MHz, SDR50 = + * 100 MHz.x + * b) The HS200 and HS400 driver strengths don't match. + * By default, the SDR104 preset register has a driver strength of + * A, but the (internal) HS400 preset register has a driver + * strength of B. As part of initializing HS400, HS200 tuning + * needs to be performed. Having different driver strengths + * between tuning and operation is wrong. It results in different + * rise/fall times that lead to incorrect sampling. + * 2) Firmware with properly initialized presets. + * These presets have proper clock divisors. i.e., SDR12 => 12MHz, + * SDR25 => 25 MHz, SDR50 => 50 MHz. Additionally the HS200 and + * HS400 preset driver strengths match. + * + * Enabling presets for HS400 doesn't work for the following reasons: + * 1) sdhci_set_ios has a hard coded list of timings that are used + * to determine if presets should be enabled. + * 2) sdhci_get_preset_value is using a non-standard register to + * read out HS400 presets. The AMD controller doesn't support this + * non-standard register. In fact, it doesn't expose the HS400 + * preset register anywhere in the SDHCI memory map. This results + * in reading a garbage value and using the wrong presets. + * + * Since HS400 and HS200 presets must be identical, we could + * instead use the the SDR104 preset register. + * + * If the above issues are resolved we could remove this quirk for + * firmware that that has valid presets (i.e., SDR12 <= 12 MHz). + */ + host->quirks2 |= SDHCI_QUIRK2_PRESET_VALUE_BROKEN; + host->mmc_host_ops.select_drive_strength = amd_select_drive_strength; host->mmc_host_ops.set_ios = amd_set_ios; return 0; diff --git a/drivers/mmc/host/via-sdmmc.c b/drivers/mmc/host/via-sdmmc.c index 246dc6255e69..9fdb92729c28 100644 --- a/drivers/mmc/host/via-sdmmc.c +++ b/drivers/mmc/host/via-sdmmc.c @@ -1273,11 +1273,14 @@ static void via_init_sdc_pm(struct via_crdr_mmc_host *host) static int via_sd_suspend(struct pci_dev *pcidev, pm_message_t state) { struct via_crdr_mmc_host *host; + unsigned long flags; host = pci_get_drvdata(pcidev); + spin_lock_irqsave(&host->lock, flags); via_save_pcictrlreg(host); via_save_sdcreg(host); + spin_unlock_irqrestore(&host->lock, flags); pci_save_state(pcidev); pci_enable_wake(pcidev, pci_choose_state(pcidev, state), 0); diff --git a/drivers/mtd/lpddr/lpddr2_nvm.c b/drivers/mtd/lpddr/lpddr2_nvm.c index c950c880ad59..90e6cb64db69 100644 --- a/drivers/mtd/lpddr/lpddr2_nvm.c +++ b/drivers/mtd/lpddr/lpddr2_nvm.c @@ -402,6 +402,17 @@ static int lpddr2_nvm_lock(struct mtd_info *mtd, loff_t start_add, return lpddr2_nvm_do_block_op(mtd, start_add, len, LPDDR2_NVM_LOCK); } +static const struct mtd_info lpddr2_nvm_mtd_info = { + .type = MTD_RAM, + .writesize = 1, + .flags = (MTD_CAP_NVRAM | MTD_POWERUP_LOCK), + ._read = lpddr2_nvm_read, + ._write = lpddr2_nvm_write, + ._erase = lpddr2_nvm_erase, + ._unlock = lpddr2_nvm_unlock, + ._lock = lpddr2_nvm_lock, +}; + /* * lpddr2_nvm driver probe method */ @@ -442,6 +453,7 @@ static int lpddr2_nvm_probe(struct platform_device *pdev) .pfow_base = OW_BASE_ADDRESS, .fldrv_priv = pcm_data, }; + if (IS_ERR(map->virt)) return PTR_ERR(map->virt); @@ -453,22 +465,13 @@ static int lpddr2_nvm_probe(struct platform_device *pdev) return PTR_ERR(pcm_data->ctl_regs); /* Populate mtd_info data structure */ - *mtd = (struct mtd_info) { - .dev = { .parent = &pdev->dev }, - .name = pdev->dev.init_name, - .type = MTD_RAM, - .priv = map, - .size = resource_size(add_range), - .erasesize = ERASE_BLOCKSIZE * pcm_data->bus_width, - .writesize = 1, - .writebufsize = WRITE_BUFFSIZE * pcm_data->bus_width, - .flags = (MTD_CAP_NVRAM | MTD_POWERUP_LOCK), - ._read = lpddr2_nvm_read, - ._write = lpddr2_nvm_write, - ._erase = lpddr2_nvm_erase, - ._unlock = lpddr2_nvm_unlock, - ._lock = lpddr2_nvm_lock, - }; + *mtd = lpddr2_nvm_mtd_info; + mtd->dev.parent = &pdev->dev; + mtd->name = pdev->dev.init_name; + mtd->priv = map; + mtd->size = resource_size(add_range); + mtd->erasesize = ERASE_BLOCKSIZE * pcm_data->bus_width; + mtd->writebufsize = WRITE_BUFFSIZE * pcm_data->bus_width; /* Verify the presence of the device looking for PFOW string */ if (!lpddr2_nvm_pfow_present(map)) { diff --git a/drivers/mtd/mtdoops.c b/drivers/mtd/mtdoops.c index e078fc41aa61..feeffde2d4fa 100644 --- a/drivers/mtd/mtdoops.c +++ b/drivers/mtd/mtdoops.c @@ -293,12 +293,13 @@ static void mtdoops_do_dump(struct kmsg_dumper *dumper, kmsg_dump_get_buffer(dumper, true, cxt->oops_buf + MTDOOPS_HEADER_SIZE, record_size - MTDOOPS_HEADER_SIZE, NULL); - /* Panics must be written immediately */ - if (reason != KMSG_DUMP_OOPS) + if (reason != KMSG_DUMP_OOPS) { + /* Panics must be written immediately */ mtdoops_write(cxt, 1); - - /* For other cases, schedule work to write it "nicely" */ - schedule_work(&cxt->work_write); + } else { + /* For other cases, schedule work to write it "nicely" */ + schedule_work(&cxt->work_write); + } } static void mtdoops_notify_add(struct mtd_info *mtd) diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c index 80d64d7e7a8b..ac336164f625 100644 --- a/drivers/mtd/ubi/wl.c +++ b/drivers/mtd/ubi/wl.c @@ -1471,6 +1471,19 @@ int ubi_thread(void *u) !ubi->thread_enabled || ubi_dbg_is_bgt_disabled(ubi)) { set_current_state(TASK_INTERRUPTIBLE); spin_unlock(&ubi->wl_lock); + + /* + * Check kthread_should_stop() after we set the task + * state to guarantee that we either see the stop bit + * and exit or the task state is reset to runnable such + * that it's not scheduled out indefinitely and detects + * the stop bit at kthread_should_stop(). + */ + if (kthread_should_stop()) { + set_current_state(TASK_RUNNING); + break; + } + schedule(); continue; } diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig index 0652caad57ec..dcfc57b06eda 100644 --- a/drivers/net/Kconfig +++ b/drivers/net/Kconfig @@ -70,6 +70,49 @@ config DUMMY To compile this driver as a module, choose M here: the module will be called dummy. +config WIREGUARD + tristate "WireGuard secure network tunnel" + depends on NET && INET + depends on IPV6 || !IPV6 + select NET_UDP_TUNNEL + select DST_CACHE + select CRYPTO + select CRYPTO_LIB_CURVE25519 + select CRYPTO_LIB_CHACHA20POLY1305 + select CRYPTO_LIB_BLAKE2S + select CRYPTO_CHACHA20_X86_64 if X86 && 64BIT + select CRYPTO_POLY1305_X86_64 if X86 && 64BIT + select CRYPTO_BLAKE2S_X86 if X86 && 64BIT + select CRYPTO_CURVE25519_X86 if X86 && 64BIT + select ARM_CRYPTO if ARM + select ARM64_CRYPTO if ARM64 + select CRYPTO_CHACHA20_NEON if (ARM || ARM64) && KERNEL_MODE_NEON + select CRYPTO_POLY1305_NEON if ARM64 && KERNEL_MODE_NEON + select CRYPTO_POLY1305_ARM if ARM + select CRYPTO_CURVE25519_NEON if ARM && KERNEL_MODE_NEON + select CRYPTO_CHACHA_MIPS if CPU_MIPS32_R2 + select CRYPTO_POLY1305_MIPS if CPU_MIPS32 || (CPU_MIPS64 && 64BIT) + help + WireGuard is a secure, fast, and easy to use replacement for IPSec + that uses modern cryptography and clever networking tricks. It's + designed to be fairly general purpose and abstract enough to fit most + use cases, while at the same time remaining extremely simple to + configure. See www.wireguard.com for more info. + + It's safe to say Y or M here, as the driver is very lightweight and + is only in use when an administrator chooses to add an interface. + +config WIREGUARD_DEBUG + bool "Debugging checks and verbose messages" + depends on WIREGUARD + help + This will write log messages for handshake and other events + that occur for a WireGuard interface. It will also perform some + extra validation checks and unit tests at various points. This is + only useful for debugging. + + Say N here unless you know what you're doing. + config EQUALIZER tristate "EQL (serial line load balancing) support" ---help--- diff --git a/drivers/net/Makefile b/drivers/net/Makefile index 0d3ba056cda3..953b7c12f0b0 100644 --- a/drivers/net/Makefile +++ b/drivers/net/Makefile @@ -10,6 +10,7 @@ obj-$(CONFIG_BONDING) += bonding/ obj-$(CONFIG_IPVLAN) += ipvlan/ obj-$(CONFIG_IPVTAP) += ipvlan/ obj-$(CONFIG_DUMMY) += dummy.o +obj-$(CONFIG_WIREGUARD) += wireguard/ obj-$(CONFIG_EQUALIZER) += eql.o obj-$(CONFIG_IFB) += ifb.o obj-$(CONFIG_MACSEC) += macsec.o diff --git a/drivers/net/can/flexcan.c b/drivers/net/can/flexcan.c index bfe13c6627be..0be8db6ab319 100644 --- a/drivers/net/can/flexcan.c +++ b/drivers/net/can/flexcan.c @@ -1091,18 +1091,23 @@ static int flexcan_chip_start(struct net_device *dev) return err; } -/* flexcan_chip_stop +/* __flexcan_chip_stop * - * this functions is entered with clocks enabled + * this function is entered with clocks enabled */ -static void flexcan_chip_stop(struct net_device *dev) +static int __flexcan_chip_stop(struct net_device *dev, bool disable_on_error) { struct flexcan_priv *priv = netdev_priv(dev); struct flexcan_regs __iomem *regs = priv->regs; + int err; /* freeze + disable module */ - flexcan_chip_freeze(priv); - flexcan_chip_disable(priv); + err = flexcan_chip_freeze(priv); + if (err && !disable_on_error) + return err; + err = flexcan_chip_disable(priv); + if (err && !disable_on_error) + goto out_chip_unfreeze; /* Disable all interrupts */ priv->write(0, ®s->imask2); @@ -1112,6 +1117,23 @@ static void flexcan_chip_stop(struct net_device *dev) flexcan_transceiver_disable(priv); priv->can.state = CAN_STATE_STOPPED; + + return 0; + + out_chip_unfreeze: + flexcan_chip_unfreeze(priv); + + return err; +} + +static inline int flexcan_chip_stop_disable_on_error(struct net_device *dev) +{ + return __flexcan_chip_stop(dev, true); +} + +static inline int flexcan_chip_stop(struct net_device *dev) +{ + return __flexcan_chip_stop(dev, false); } static int flexcan_open(struct net_device *dev) @@ -1165,7 +1187,7 @@ static int flexcan_close(struct net_device *dev) netif_stop_queue(dev); can_rx_offload_disable(&priv->offload); - flexcan_chip_stop(dev); + flexcan_chip_stop_disable_on_error(dev); free_irq(dev->irq, dev); clk_disable_unprepare(priv->clk_per); diff --git a/drivers/net/dsa/realtek-smi.h b/drivers/net/dsa/realtek-smi.h index 9a63b51e1d82..6f2dab7e33d6 100644 --- a/drivers/net/dsa/realtek-smi.h +++ b/drivers/net/dsa/realtek-smi.h @@ -25,6 +25,9 @@ struct rtl8366_mib_counter { const char *name; }; +/** + * struct rtl8366_vlan_mc - Virtual LAN member configuration + */ struct rtl8366_vlan_mc { u16 vid; u16 untag; @@ -119,7 +122,6 @@ int realtek_smi_setup_mdio(struct realtek_smi *smi); int rtl8366_mc_is_used(struct realtek_smi *smi, int mc_index, int *used); int rtl8366_set_vlan(struct realtek_smi *smi, int vid, u32 member, u32 untag, u32 fid); -int rtl8366_get_pvid(struct realtek_smi *smi, int port, int *val); int rtl8366_set_pvid(struct realtek_smi *smi, unsigned int port, unsigned int vid); int rtl8366_enable_vlan4k(struct realtek_smi *smi, bool enable); diff --git a/drivers/net/dsa/rtl8366.c b/drivers/net/dsa/rtl8366.c index 430988f79722..dddbc86429bd 100644 --- a/drivers/net/dsa/rtl8366.c +++ b/drivers/net/dsa/rtl8366.c @@ -36,13 +36,114 @@ int rtl8366_mc_is_used(struct realtek_smi *smi, int mc_index, int *used) } EXPORT_SYMBOL_GPL(rtl8366_mc_is_used); -int rtl8366_set_vlan(struct realtek_smi *smi, int vid, u32 member, - u32 untag, u32 fid) +/** + * rtl8366_obtain_mc() - retrieve or allocate a VLAN member configuration + * @smi: the Realtek SMI device instance + * @vid: the VLAN ID to look up or allocate + * @vlanmc: the pointer will be assigned to a pointer to a valid member config + * if successful + * @return: index of a new member config or negative error number + */ +static int rtl8366_obtain_mc(struct realtek_smi *smi, int vid, + struct rtl8366_vlan_mc *vlanmc) { struct rtl8366_vlan_4k vlan4k; int ret; int i; + /* Try to find an existing member config entry for this VID */ + for (i = 0; i < smi->num_vlan_mc; i++) { + ret = smi->ops->get_vlan_mc(smi, i, vlanmc); + if (ret) { + dev_err(smi->dev, "error searching for VLAN MC %d for VID %d\n", + i, vid); + return ret; + } + + if (vid == vlanmc->vid) + return i; + } + + /* We have no MC entry for this VID, try to find an empty one */ + for (i = 0; i < smi->num_vlan_mc; i++) { + ret = smi->ops->get_vlan_mc(smi, i, vlanmc); + if (ret) { + dev_err(smi->dev, "error searching for VLAN MC %d for VID %d\n", + i, vid); + return ret; + } + + if (vlanmc->vid == 0 && vlanmc->member == 0) { + /* Update the entry from the 4K table */ + ret = smi->ops->get_vlan_4k(smi, vid, &vlan4k); + if (ret) { + dev_err(smi->dev, "error looking for 4K VLAN MC %d for VID %d\n", + i, vid); + return ret; + } + + vlanmc->vid = vid; + vlanmc->member = vlan4k.member; + vlanmc->untag = vlan4k.untag; + vlanmc->fid = vlan4k.fid; + ret = smi->ops->set_vlan_mc(smi, i, vlanmc); + if (ret) { + dev_err(smi->dev, "unable to set/update VLAN MC %d for VID %d\n", + i, vid); + return ret; + } + + dev_dbg(smi->dev, "created new MC at index %d for VID %d\n", + i, vid); + return i; + } + } + + /* MC table is full, try to find an unused entry and replace it */ + for (i = 0; i < smi->num_vlan_mc; i++) { + int used; + + ret = rtl8366_mc_is_used(smi, i, &used); + if (ret) + return ret; + + if (!used) { + /* Update the entry from the 4K table */ + ret = smi->ops->get_vlan_4k(smi, vid, &vlan4k); + if (ret) + return ret; + + vlanmc->vid = vid; + vlanmc->member = vlan4k.member; + vlanmc->untag = vlan4k.untag; + vlanmc->fid = vlan4k.fid; + ret = smi->ops->set_vlan_mc(smi, i, vlanmc); + if (ret) { + dev_err(smi->dev, "unable to set/update VLAN MC %d for VID %d\n", + i, vid); + return ret; + } + dev_dbg(smi->dev, "recycled MC at index %i for VID %d\n", + i, vid); + return i; + } + } + + dev_err(smi->dev, "all VLAN member configurations are in use\n"); + return -ENOSPC; +} + +int rtl8366_set_vlan(struct realtek_smi *smi, int vid, u32 member, + u32 untag, u32 fid) +{ + struct rtl8366_vlan_mc vlanmc; + struct rtl8366_vlan_4k vlan4k; + int mc; + int ret; + + if (!smi->ops->is_vlan_valid(smi, vid)) + return -EINVAL; + dev_dbg(smi->dev, "setting VLAN%d 4k members: 0x%02x, untagged: 0x%02x\n", vid, member, untag); @@ -63,133 +164,58 @@ int rtl8366_set_vlan(struct realtek_smi *smi, int vid, u32 member, "resulting VLAN%d 4k members: 0x%02x, untagged: 0x%02x\n", vid, vlan4k.member, vlan4k.untag); - /* Try to find an existing MC entry for this VID */ - for (i = 0; i < smi->num_vlan_mc; i++) { - struct rtl8366_vlan_mc vlanmc; + /* Find or allocate a member config for this VID */ + ret = rtl8366_obtain_mc(smi, vid, &vlanmc); + if (ret < 0) + return ret; + mc = ret; - ret = smi->ops->get_vlan_mc(smi, i, &vlanmc); - if (ret) - return ret; + /* Update the MC entry */ + vlanmc.member |= member; + vlanmc.untag |= untag; + vlanmc.fid = fid; - if (vid == vlanmc.vid) { - /* update the MC entry */ - vlanmc.member |= member; - vlanmc.untag |= untag; - vlanmc.fid = fid; - - ret = smi->ops->set_vlan_mc(smi, i, &vlanmc); - - dev_dbg(smi->dev, - "resulting VLAN%d MC members: 0x%02x, untagged: 0x%02x\n", - vid, vlanmc.member, vlanmc.untag); - - break; - } - } + /* Commit updates to the MC entry */ + ret = smi->ops->set_vlan_mc(smi, mc, &vlanmc); + if (ret) + dev_err(smi->dev, "failed to commit changes to VLAN MC index %d for VID %d\n", + mc, vid); + else + dev_dbg(smi->dev, + "resulting VLAN%d MC members: 0x%02x, untagged: 0x%02x\n", + vid, vlanmc.member, vlanmc.untag); return ret; } EXPORT_SYMBOL_GPL(rtl8366_set_vlan); -int rtl8366_get_pvid(struct realtek_smi *smi, int port, int *val) -{ - struct rtl8366_vlan_mc vlanmc; - int ret; - int index; - - ret = smi->ops->get_mc_index(smi, port, &index); - if (ret) - return ret; - - ret = smi->ops->get_vlan_mc(smi, index, &vlanmc); - if (ret) - return ret; - - *val = vlanmc.vid; - return 0; -} -EXPORT_SYMBOL_GPL(rtl8366_get_pvid); - int rtl8366_set_pvid(struct realtek_smi *smi, unsigned int port, unsigned int vid) { struct rtl8366_vlan_mc vlanmc; - struct rtl8366_vlan_4k vlan4k; + int mc; int ret; - int i; - /* Try to find an existing MC entry for this VID */ - for (i = 0; i < smi->num_vlan_mc; i++) { - ret = smi->ops->get_vlan_mc(smi, i, &vlanmc); - if (ret) - return ret; + if (!smi->ops->is_vlan_valid(smi, vid)) + return -EINVAL; - if (vid == vlanmc.vid) { - ret = smi->ops->set_vlan_mc(smi, i, &vlanmc); - if (ret) - return ret; + /* Find or allocate a member config for this VID */ + ret = rtl8366_obtain_mc(smi, vid, &vlanmc); + if (ret < 0) + return ret; + mc = ret; - ret = smi->ops->set_mc_index(smi, port, i); - return ret; - } + ret = smi->ops->set_mc_index(smi, port, mc); + if (ret) { + dev_err(smi->dev, "set PVID: failed to set MC index %d for port %d\n", + mc, port); + return ret; } - /* We have no MC entry for this VID, try to find an empty one */ - for (i = 0; i < smi->num_vlan_mc; i++) { - ret = smi->ops->get_vlan_mc(smi, i, &vlanmc); - if (ret) - return ret; + dev_dbg(smi->dev, "set PVID: the PVID for port %d set to %d using existing MC index %d\n", + port, vid, mc); - if (vlanmc.vid == 0 && vlanmc.member == 0) { - /* Update the entry from the 4K table */ - ret = smi->ops->get_vlan_4k(smi, vid, &vlan4k); - if (ret) - return ret; - - vlanmc.vid = vid; - vlanmc.member = vlan4k.member; - vlanmc.untag = vlan4k.untag; - vlanmc.fid = vlan4k.fid; - ret = smi->ops->set_vlan_mc(smi, i, &vlanmc); - if (ret) - return ret; - - ret = smi->ops->set_mc_index(smi, port, i); - return ret; - } - } - - /* MC table is full, try to find an unused entry and replace it */ - for (i = 0; i < smi->num_vlan_mc; i++) { - int used; - - ret = rtl8366_mc_is_used(smi, i, &used); - if (ret) - return ret; - - if (!used) { - /* Update the entry from the 4K table */ - ret = smi->ops->get_vlan_4k(smi, vid, &vlan4k); - if (ret) - return ret; - - vlanmc.vid = vid; - vlanmc.member = vlan4k.member; - vlanmc.untag = vlan4k.untag; - vlanmc.fid = vlan4k.fid; - ret = smi->ops->set_vlan_mc(smi, i, &vlanmc); - if (ret) - return ret; - - ret = smi->ops->set_mc_index(smi, port, i); - return ret; - } - } - - dev_err(smi->dev, - "all VLAN member configurations are in use\n"); - - return -ENOSPC; + return 0; } EXPORT_SYMBOL_GPL(rtl8366_set_pvid); @@ -389,7 +415,8 @@ void rtl8366_vlan_add(struct dsa_switch *ds, int port, if (!smi->ops->is_vlan_valid(smi, vid)) return; - dev_info(smi->dev, "add VLAN on port %d, %s, %s\n", + dev_info(smi->dev, "add VLAN %d on port %d, %s, %s\n", + vlan->vid_begin, port, untagged ? "untagged" : "tagged", pvid ? " PVID" : "no PVID"); @@ -398,34 +425,29 @@ void rtl8366_vlan_add(struct dsa_switch *ds, int port, dev_err(smi->dev, "port is DSA or CPU port\n"); for (vid = vlan->vid_begin; vid <= vlan->vid_end; vid++) { - int pvid_val = 0; - - dev_info(smi->dev, "add VLAN %04x\n", vid); member |= BIT(port); if (untagged) untag |= BIT(port); - /* To ensure that we have a valid MC entry for this VLAN, - * initialize the port VLAN ID here. - */ - ret = rtl8366_get_pvid(smi, port, &pvid_val); - if (ret < 0) { - dev_err(smi->dev, "could not lookup PVID for port %d\n", - port); - return; - } - if (pvid_val == 0) { - ret = rtl8366_set_pvid(smi, port, vid); - if (ret < 0) - return; - } - ret = rtl8366_set_vlan(smi, vid, member, untag, 0); if (ret) dev_err(smi->dev, "failed to set up VLAN %04x", vid); + + if (!pvid) + continue; + + ret = rtl8366_set_pvid(smi, port, vid); + if (ret) + dev_err(smi->dev, + "failed to set PVID on port %d to VLAN %04x", + port, vid); + + if (!ret) + dev_dbg(smi->dev, "VLAN add: added VLAN %d with PVID on port %d\n", + vid, port); } } EXPORT_SYMBOL_GPL(rtl8366_vlan_add); diff --git a/drivers/net/dsa/rtl8366rb.c b/drivers/net/dsa/rtl8366rb.c index f4b14b6acd22..5aefd7a4696a 100644 --- a/drivers/net/dsa/rtl8366rb.c +++ b/drivers/net/dsa/rtl8366rb.c @@ -1270,7 +1270,7 @@ static bool rtl8366rb_is_vlan_valid(struct realtek_smi *smi, unsigned int vlan) if (smi->vlan4k_enabled) max = RTL8366RB_NUM_VIDS - 1; - if (vlan == 0 || vlan >= max) + if (vlan == 0 || vlan > max) return false; return true; diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index c3f04fb31955..01d28ede1fb2 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -6326,6 +6326,11 @@ static void bnxt_report_link(struct bnxt *bp) u16 fec; netif_carrier_on(bp->dev); + speed = bnxt_fw_to_ethtool_speed(bp->link_info.link_speed); + if (speed == SPEED_UNKNOWN) { + netdev_info(bp->dev, "NIC Link is Up, speed unknown\n"); + return; + } if (bp->link_info.duplex == BNXT_LINK_DUPLEX_FULL) duplex = "full"; else @@ -6338,7 +6343,6 @@ static void bnxt_report_link(struct bnxt *bp) flow_ctrl = "ON - receive"; else flow_ctrl = "none"; - speed = bnxt_fw_to_ethtool_speed(bp->link_info.link_speed); netdev_info(bp->dev, "NIC Link is Up, %u Mbps %s duplex, Flow control: %s\n", speed, duplex, flow_ctrl); if (bp->flags & BNXT_FLAG_EEE_CAP) diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c index acae87f548a1..0374a1ba1010 100644 --- a/drivers/net/ethernet/cadence/macb_main.c +++ b/drivers/net/ethernet/cadence/macb_main.c @@ -1704,7 +1704,8 @@ static inline int macb_clear_csum(struct sk_buff *skb) static int macb_pad_and_fcs(struct sk_buff **skb, struct net_device *ndev) { - bool cloned = skb_cloned(*skb) || skb_header_cloned(*skb); + bool cloned = skb_cloned(*skb) || skb_header_cloned(*skb) || + skb_is_nonlinear(*skb); int padlen = ETH_ZLEN - (*skb)->len; int headroom = skb_headroom(*skb); int tailroom = skb_tailroom(*skb); diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c index bb3ee55cb72c..a62c96001761 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c @@ -145,13 +145,13 @@ static int configure_filter_smac(struct adapter *adap, struct filter_entry *f) int err; /* do a set-tcb for smac-sel and CWR bit.. */ - err = set_tcb_tflag(adap, f, f->tid, TF_CCTRL_CWR_S, 1, 1); - if (err) - goto smac_err; - err = set_tcb_field(adap, f, f->tid, TCB_SMAC_SEL_W, TCB_SMAC_SEL_V(TCB_SMAC_SEL_M), TCB_SMAC_SEL_V(f->smt->idx), 1); + if (err) + goto smac_err; + + err = set_tcb_tflag(adap, f, f->tid, TF_CCTRL_CWR_S, 1, 1); if (!err) return 0; @@ -608,6 +608,7 @@ int set_filter_wr(struct adapter *adapter, int fidx) FW_FILTER_WR_DIRSTEERHASH_V(f->fs.dirsteerhash) | FW_FILTER_WR_LPBK_V(f->fs.action == FILTER_SWITCH) | FW_FILTER_WR_DMAC_V(f->fs.newdmac) | + FW_FILTER_WR_SMAC_V(f->fs.newsmac) | FW_FILTER_WR_INSVLAN_V(f->fs.newvlan == VLAN_INSERT || f->fs.newvlan == VLAN_REWRITE) | FW_FILTER_WR_RMVLAN_V(f->fs.newvlan == VLAN_REMOVE || @@ -625,7 +626,7 @@ int set_filter_wr(struct adapter *adapter, int fidx) FW_FILTER_WR_OVLAN_VLD_V(f->fs.val.ovlan_vld) | FW_FILTER_WR_IVLAN_VLDM_V(f->fs.mask.ivlan_vld) | FW_FILTER_WR_OVLAN_VLDM_V(f->fs.mask.ovlan_vld)); - fwr->smac_sel = 0; + fwr->smac_sel = f->smt->idx; fwr->rx_chan_rx_rpl_iq = htons(FW_FILTER_WR_RX_CHAN_V(0) | FW_FILTER_WR_RX_RPL_IQ_V(adapter->sge.fw_evtq.abs_id)); @@ -1019,11 +1020,8 @@ static void mk_act_open_req6(struct filter_entry *f, struct sk_buff *skb, TX_QUEUE_V(f->fs.nat_mode) | T5_OPT_2_VALID_F | RX_CHANNEL_F | - CONG_CNTRL_V((f->fs.action == FILTER_DROP) | - (f->fs.dirsteer << 1)) | PACE_V((f->fs.maskhash) | - ((f->fs.dirsteerhash) << 1)) | - CCTRL_ECN_V(f->fs.action == FILTER_SWITCH)); + ((f->fs.dirsteerhash) << 1))); } static void mk_act_open_req(struct filter_entry *f, struct sk_buff *skb, @@ -1059,11 +1057,8 @@ static void mk_act_open_req(struct filter_entry *f, struct sk_buff *skb, TX_QUEUE_V(f->fs.nat_mode) | T5_OPT_2_VALID_F | RX_CHANNEL_F | - CONG_CNTRL_V((f->fs.action == FILTER_DROP) | - (f->fs.dirsteer << 1)) | PACE_V((f->fs.maskhash) | - ((f->fs.dirsteerhash) << 1)) | - CCTRL_ECN_V(f->fs.action == FILTER_SWITCH)); + ((f->fs.dirsteerhash) << 1))); } static int cxgb4_set_hash_filter(struct net_device *dev, @@ -1722,6 +1717,20 @@ void hash_filter_rpl(struct adapter *adap, const struct cpl_act_open_rpl *rpl) } return; } + switch (f->fs.action) { + case FILTER_PASS: + if (f->fs.dirsteer) + set_tcb_tflag(adap, f, tid, + TF_DIRECT_STEER_S, 1, 1); + break; + case FILTER_DROP: + set_tcb_tflag(adap, f, tid, TF_DROP_S, 1, 1); + break; + case FILTER_SWITCH: + set_tcb_tflag(adap, f, tid, TF_LPBK_S, 1, 1); + break; + } + break; default: @@ -1781,22 +1790,11 @@ void filter_rpl(struct adapter *adap, const struct cpl_set_tcb_rpl *rpl) if (ctx) ctx->result = 0; } else if (ret == FW_FILTER_WR_FLT_ADDED) { - int err = 0; - - if (f->fs.newsmac) - err = configure_filter_smac(adap, f); - - if (!err) { - f->pending = 0; /* async setup completed */ - f->valid = 1; - if (ctx) { - ctx->result = 0; - ctx->tid = idx; - } - } else { - clear_filter(adap, f); - if (ctx) - ctx->result = err; + f->pending = 0; /* async setup completed */ + f->valid = 1; + if (ctx) { + ctx->result = 0; + ctx->tid = idx; } } else { /* Something went wrong. Issue a warning about the diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_tcb.h b/drivers/net/ethernet/chelsio/cxgb4/t4_tcb.h index 3297ce025e8b..6ddc2baa7481 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/t4_tcb.h +++ b/drivers/net/ethernet/chelsio/cxgb4/t4_tcb.h @@ -42,6 +42,10 @@ #define TCB_T_FLAGS_W 1 +#define TF_DROP_S 22 +#define TF_DIRECT_STEER_S 23 +#define TF_LPBK_S 59 + #define TF_CCTRL_ECE_S 60 #define TF_CCTRL_CWR_S 61 #define TF_CCTRL_RFR_S 62 diff --git a/drivers/net/ethernet/cisco/enic/enic.h b/drivers/net/ethernet/cisco/enic/enic.h index 0dd64acd2a3f..08cac1bfacaf 100644 --- a/drivers/net/ethernet/cisco/enic/enic.h +++ b/drivers/net/ethernet/cisco/enic/enic.h @@ -171,6 +171,7 @@ struct enic { u16 num_vfs; #endif spinlock_t enic_api_lock; + bool enic_api_busy; struct enic_port_profile *pp; /* work queue cache line section */ diff --git a/drivers/net/ethernet/cisco/enic/enic_api.c b/drivers/net/ethernet/cisco/enic/enic_api.c index b161f24522b8..b028ea2dec2b 100644 --- a/drivers/net/ethernet/cisco/enic/enic_api.c +++ b/drivers/net/ethernet/cisco/enic/enic_api.c @@ -34,6 +34,12 @@ int enic_api_devcmd_proxy_by_index(struct net_device *netdev, int vf, struct vnic_dev *vdev = enic->vdev; spin_lock(&enic->enic_api_lock); + while (enic->enic_api_busy) { + spin_unlock(&enic->enic_api_lock); + cpu_relax(); + spin_lock(&enic->enic_api_lock); + } + spin_lock_bh(&enic->devcmd_lock); vnic_dev_cmd_proxy_by_index_start(vdev, vf); diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c index 026a3bd71204..810cbe221046 100644 --- a/drivers/net/ethernet/cisco/enic/enic_main.c +++ b/drivers/net/ethernet/cisco/enic/enic_main.c @@ -2142,8 +2142,6 @@ static int enic_dev_wait(struct vnic_dev *vdev, int done; int err; - BUG_ON(in_interrupt()); - err = start(vdev, arg); if (err) return err; @@ -2331,6 +2329,13 @@ static int enic_set_rss_nic_cfg(struct enic *enic) rss_hash_bits, rss_base_cpu, rss_enable); } +static void enic_set_api_busy(struct enic *enic, bool busy) +{ + spin_lock(&enic->enic_api_lock); + enic->enic_api_busy = busy; + spin_unlock(&enic->enic_api_lock); +} + static void enic_reset(struct work_struct *work) { struct enic *enic = container_of(work, struct enic, reset); @@ -2340,7 +2345,9 @@ static void enic_reset(struct work_struct *work) rtnl_lock(); - spin_lock(&enic->enic_api_lock); + /* Stop any activity from infiniband */ + enic_set_api_busy(enic, true); + enic_stop(enic->netdev); enic_dev_soft_reset(enic); enic_reset_addr_lists(enic); @@ -2348,7 +2355,10 @@ static void enic_reset(struct work_struct *work) enic_set_rss_nic_cfg(enic); enic_dev_set_ig_vlan_rewrite_mode(enic); enic_open(enic->netdev); - spin_unlock(&enic->enic_api_lock); + + /* Allow infiniband to fiddle with the device again */ + enic_set_api_busy(enic, false); + call_netdevice_notifiers(NETDEV_REBOOT, enic->netdev); rtnl_unlock(); @@ -2360,7 +2370,9 @@ static void enic_tx_hang_reset(struct work_struct *work) rtnl_lock(); - spin_lock(&enic->enic_api_lock); + /* Stop any activity from infiniband */ + enic_set_api_busy(enic, true); + enic_dev_hang_notify(enic); enic_stop(enic->netdev); enic_dev_hang_reset(enic); @@ -2369,7 +2381,10 @@ static void enic_tx_hang_reset(struct work_struct *work) enic_set_rss_nic_cfg(enic); enic_dev_set_ig_vlan_rewrite_mode(enic); enic_open(enic->netdev); - spin_unlock(&enic->enic_api_lock); + + /* Allow infiniband to fiddle with the device again */ + enic_set_api_busy(enic, false); + call_netdevice_notifiers(NETDEV_REBOOT, enic->netdev); rtnl_unlock(); diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index 3b6da228140e..7d1a669416f2 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -1897,6 +1897,27 @@ static int fec_enet_mdio_write(struct mii_bus *bus, int mii_id, int regnum, return ret; } +static void fec_enet_phy_reset_after_clk_enable(struct net_device *ndev) +{ + struct fec_enet_private *fep = netdev_priv(ndev); + struct phy_device *phy_dev = ndev->phydev; + + if (phy_dev) { + phy_reset_after_clk_enable(phy_dev); + } else if (fep->phy_node) { + /* + * If the PHY still is not bound to the MAC, but there is + * OF PHY node and a matching PHY device instance already, + * use the OF PHY node to obtain the PHY device instance, + * and then use that PHY device instance when triggering + * the PHY reset. + */ + phy_dev = of_phy_find_device(fep->phy_node); + phy_reset_after_clk_enable(phy_dev); + put_device(&phy_dev->mdio.dev); + } +} + static int fec_enet_clk_enable(struct net_device *ndev, bool enable) { struct fec_enet_private *fep = netdev_priv(ndev); @@ -1923,7 +1944,7 @@ static int fec_enet_clk_enable(struct net_device *ndev, bool enable) if (ret) goto failed_clk_ref; - phy_reset_after_clk_enable(ndev->phydev); + fec_enet_phy_reset_after_clk_enable(ndev); } else { clk_disable_unprepare(fep->clk_enet_out); if (fep->clk_ptp) { @@ -2929,16 +2950,16 @@ fec_enet_open(struct net_device *ndev) /* Init MAC prior to mii bus probe */ fec_restart(ndev); - /* Probe and connect to PHY when open the interface */ - ret = fec_enet_mii_probe(ndev); - if (ret) - goto err_enet_mii_probe; - /* Call phy_reset_after_clk_enable() again if it failed during * phy_reset_after_clk_enable() before because the PHY wasn't probed. */ if (reset_again) - phy_reset_after_clk_enable(ndev->phydev); + fec_enet_phy_reset_after_clk_enable(ndev); + + /* Probe and connect to PHY when open the interface */ + ret = fec_enet_mii_probe(ndev); + if (ret) + goto err_enet_mii_probe; if (fep->quirks & FEC_QUIRK_ERR006687) imx6q_cpuidle_fec_irqs_used(); diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c index 8243501c3757..8db0924ec681 100644 --- a/drivers/net/ethernet/freescale/gianfar.c +++ b/drivers/net/ethernet/freescale/gianfar.c @@ -1388,7 +1388,7 @@ static int gfar_probe(struct platform_device *ofdev) if (dev->features & NETIF_F_IP_CSUM || priv->device_flags & FSL_GIANFAR_DEV_HAS_TIMER) - dev->needed_headroom = GMAC_FCB_LEN; + dev->needed_headroom = GMAC_FCB_LEN + GMAC_TXPAL_LEN; /* Initializing some of the rx/tx queue level parameters */ for (i = 0; i < priv->num_tx_queues; i++) { @@ -2370,20 +2370,12 @@ static netdev_tx_t gfar_start_xmit(struct sk_buff *skb, struct net_device *dev) fcb_len = GMAC_FCB_LEN + GMAC_TXPAL_LEN; /* make space for additional header when fcb is needed */ - if (fcb_len && unlikely(skb_headroom(skb) < fcb_len)) { - struct sk_buff *skb_new; - - skb_new = skb_realloc_headroom(skb, fcb_len); - if (!skb_new) { + if (fcb_len) { + if (unlikely(skb_cow_head(skb, fcb_len))) { dev->stats.tx_errors++; dev_kfree_skb_any(skb); return NETDEV_TX_OK; } - - if (skb->sk) - skb_set_owner_w(skb_new, skb->sk); - dev_consume_skb_any(skb); - skb = skb_new; } /* total number of fragments in the SKB */ diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c index e2f6670d6eaf..75a1915d95aa 100644 --- a/drivers/net/ethernet/ibm/ibmveth.c +++ b/drivers/net/ethernet/ibm/ibmveth.c @@ -1330,6 +1330,7 @@ static int ibmveth_poll(struct napi_struct *napi, int budget) int offset = ibmveth_rxq_frame_offset(adapter); int csum_good = ibmveth_rxq_csum_good(adapter); int lrg_pkt = ibmveth_rxq_large_packet(adapter); + __sum16 iph_check = 0; skb = ibmveth_rxq_get_buffer(adapter); @@ -1366,16 +1367,26 @@ static int ibmveth_poll(struct napi_struct *napi, int budget) skb_put(skb, length); skb->protocol = eth_type_trans(skb, netdev); + /* PHYP without PLSO support places a -1 in the ip + * checksum for large send frames. + */ + if (skb->protocol == cpu_to_be16(ETH_P_IP)) { + struct iphdr *iph = (struct iphdr *)skb->data; + + iph_check = iph->check; + } + + if ((length > netdev->mtu + ETH_HLEN) || + lrg_pkt || iph_check == 0xffff) { + ibmveth_rx_mss_helper(skb, mss, lrg_pkt); + adapter->rx_large_packets++; + } + if (csum_good) { skb->ip_summed = CHECKSUM_UNNECESSARY; ibmveth_rx_csum_helper(skb, adapter); } - if (length > netdev->mtu + ETH_HLEN) { - ibmveth_rx_mss_helper(skb, mss, lrg_pkt); - adapter->rx_large_packets++; - } - napi_gro_receive(napi, skb); /* send it up */ netdev->stats.rx_packets++; diff --git a/drivers/net/ethernet/korina.c b/drivers/net/ethernet/korina.c index ae195f8adff5..993f495e2bf7 100644 --- a/drivers/net/ethernet/korina.c +++ b/drivers/net/ethernet/korina.c @@ -1113,7 +1113,7 @@ out: return rc; probe_err_register: - kfree(lp->td_ring); + kfree((struct dma_desc *)KSEG0ADDR(lp->td_ring)); probe_err_td_ring: iounmap(lp->tx_dma_regs); probe_err_dma_tx: @@ -1133,6 +1133,7 @@ static int korina_remove(struct platform_device *pdev) iounmap(lp->eth_regs); iounmap(lp->rx_dma_regs); iounmap(lp->tx_dma_regs); + kfree((struct dma_desc *)KSEG0ADDR(lp->td_ring)); unregister_netdev(bif->dev); free_netdev(bif->dev); diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c index 45d9a5f8fa1b..f509a6ce31db 100644 --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c @@ -945,6 +945,9 @@ int mlx4_en_poll_rx_cq(struct napi_struct *napi, int budget) bool clean_complete = true; int done; + if (!budget) + return 0; + if (priv->tx_ring_num[TX_XDP]) { xdp_tx_cq = priv->tx_cq[TX_XDP][cq->ring]; if (xdp_tx_cq->xdp_busy) { diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c index 1857ee0f0871..e58052d07e39 100644 --- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c @@ -343,7 +343,7 @@ u32 mlx4_en_recycle_tx_desc(struct mlx4_en_priv *priv, .dma = tx_info->map0_dma, }; - if (!mlx4_en_rx_recycle(ring->recycle_ring, &frame)) { + if (!napi_mode || !mlx4_en_rx_recycle(ring->recycle_ring, &frame)) { dma_unmap_page(priv->ddev, tx_info->map0_dma, PAGE_SIZE, priv->dma_dir); put_page(tx_info->page); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c index d359e850dbf0..0fd62510fb27 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c @@ -475,8 +475,9 @@ void mlx5_pps_event(struct mlx5_core_dev *mdev, switch (clock->ptp_info.pin_config[pin].func) { case PTP_PF_EXTTS: ptp_event.index = pin; - ptp_event.timestamp = timecounter_cyc2time(&clock->tc, - be64_to_cpu(eqe->data.pps.time_stamp)); + ptp_event.timestamp = + mlx5_timecounter_cyc2time(clock, + be64_to_cpu(eqe->data.pps.time_stamp)); if (clock->pps_info.enabled) { ptp_event.type = PTP_CLOCK_PPSUSR; ptp_event.pps_times.ts_real = diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.c b/drivers/net/ethernet/mellanox/mlxsw/core.c index d8e7ca48753f..423c3e9925d0 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/core.c +++ b/drivers/net/ethernet/mellanox/mlxsw/core.c @@ -488,6 +488,9 @@ static void mlxsw_emad_transmit_retry(struct mlxsw_core *mlxsw_core, err = mlxsw_emad_transmit(trans->core, trans); if (err == 0) return; + + if (!atomic_dec_and_test(&trans->active)) + return; } else { err = -EIO; } @@ -1111,6 +1114,8 @@ void mlxsw_core_bus_device_unregister(struct mlxsw_core *mlxsw_core, if (!reload) devlink_resources_unregister(devlink, NULL); mlxsw_core->bus->fini(mlxsw_core->bus_priv); + if (!reload) + devlink_free(devlink); return; diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c index 6df404e3dd27..1555c0dae490 100644 --- a/drivers/net/ethernet/realtek/r8169.c +++ b/drivers/net/ethernet/realtek/r8169.c @@ -4111,6 +4111,27 @@ static void rtl_rar_set(struct rtl8169_private *tp, u8 *addr) rtl_unlock_work(tp); } +static void rtl_init_rxcfg(struct rtl8169_private *tp) +{ + switch (tp->mac_version) { + case RTL_GIGA_MAC_VER_01 ... RTL_GIGA_MAC_VER_06: + case RTL_GIGA_MAC_VER_10 ... RTL_GIGA_MAC_VER_17: + RTL_W32(tp, RxConfig, RX_FIFO_THRESH | RX_DMA_BURST); + break; + case RTL_GIGA_MAC_VER_18 ... RTL_GIGA_MAC_VER_24: + case RTL_GIGA_MAC_VER_34 ... RTL_GIGA_MAC_VER_36: + case RTL_GIGA_MAC_VER_38: + RTL_W32(tp, RxConfig, RX128_INT_EN | RX_MULTI_EN | RX_DMA_BURST); + break; + case RTL_GIGA_MAC_VER_40 ... RTL_GIGA_MAC_VER_51: + RTL_W32(tp, RxConfig, RX128_INT_EN | RX_MULTI_EN | RX_DMA_BURST | RX_EARLY_OFF); + break; + default: + RTL_W32(tp, RxConfig, RX128_INT_EN | RX_DMA_BURST); + break; + } +} + static int rtl_set_mac_address(struct net_device *dev, void *p) { struct rtl8169_private *tp = netdev_priv(dev); @@ -4128,6 +4149,10 @@ static int rtl_set_mac_address(struct net_device *dev, void *p) pm_runtime_put_noidle(d); + /* Reportedly at least Asus X453MA truncates packets otherwise */ + if (tp->mac_version == RTL_GIGA_MAC_VER_37) + rtl_init_rxcfg(tp); + return 0; } @@ -4289,27 +4314,6 @@ static void rtl_pll_power_up(struct rtl8169_private *tp) } } -static void rtl_init_rxcfg(struct rtl8169_private *tp) -{ - switch (tp->mac_version) { - case RTL_GIGA_MAC_VER_01 ... RTL_GIGA_MAC_VER_06: - case RTL_GIGA_MAC_VER_10 ... RTL_GIGA_MAC_VER_17: - RTL_W32(tp, RxConfig, RX_FIFO_THRESH | RX_DMA_BURST); - break; - case RTL_GIGA_MAC_VER_18 ... RTL_GIGA_MAC_VER_24: - case RTL_GIGA_MAC_VER_34 ... RTL_GIGA_MAC_VER_36: - case RTL_GIGA_MAC_VER_38: - RTL_W32(tp, RxConfig, RX128_INT_EN | RX_MULTI_EN | RX_DMA_BURST); - break; - case RTL_GIGA_MAC_VER_40 ... RTL_GIGA_MAC_VER_51: - RTL_W32(tp, RxConfig, RX128_INT_EN | RX_MULTI_EN | RX_DMA_BURST | RX_EARLY_OFF); - break; - default: - RTL_W32(tp, RxConfig, RX128_INT_EN | RX_DMA_BURST); - break; - } -} - static void rtl8169_init_ring_indexes(struct rtl8169_private *tp) { tp->dirty_tx = tp->cur_tx = tp->cur_rx = 0; @@ -6626,7 +6630,7 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance) return IRQ_NONE; rtl_irq_disable(tp); - napi_schedule_irqoff(&tp->napi); + napi_schedule(&tp->napi); return IRQ_HANDLED; } @@ -6826,7 +6830,7 @@ static int rtl8169_close(struct net_device *dev) phy_disconnect(dev->phydev); - pci_free_irq(pdev, 0, tp); + free_irq(pci_irq_vector(pdev, 0), tp); dma_free_coherent(&pdev->dev, R8169_RX_RING_BYTES, tp->RxDescArray, tp->RxPhyAddr); @@ -6881,8 +6885,8 @@ static int rtl_open(struct net_device *dev) rtl_request_firmware(tp); - retval = pci_request_irq(pdev, 0, rtl8169_interrupt, NULL, tp, - dev->name); + retval = request_irq(pci_irq_vector(pdev, 0), rtl8169_interrupt, + IRQF_SHARED, dev->name, tp); if (retval < 0) goto err_release_fw_2; @@ -6915,7 +6919,7 @@ out: return retval; err_free_irq: - pci_free_irq(pdev, 0, tp); + free_irq(pci_irq_vector(pdev, 0), tp); err_release_fw_2: rtl_release_firmware(tp); rtl8169_rx_clear(tp); diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c index 569e698b5c80..c24b7ea37e39 100644 --- a/drivers/net/ethernet/renesas/ravb_main.c +++ b/drivers/net/ethernet/renesas/ravb_main.c @@ -1732,12 +1732,16 @@ static int ravb_hwtstamp_get(struct net_device *ndev, struct ifreq *req) config.flags = 0; config.tx_type = priv->tstamp_tx_ctrl ? HWTSTAMP_TX_ON : HWTSTAMP_TX_OFF; - if (priv->tstamp_rx_ctrl & RAVB_RXTSTAMP_TYPE_V2_L2_EVENT) + switch (priv->tstamp_rx_ctrl & RAVB_RXTSTAMP_TYPE) { + case RAVB_RXTSTAMP_TYPE_V2_L2_EVENT: config.rx_filter = HWTSTAMP_FILTER_PTP_V2_L2_EVENT; - else if (priv->tstamp_rx_ctrl & RAVB_RXTSTAMP_TYPE_ALL) + break; + case RAVB_RXTSTAMP_TYPE_ALL: config.rx_filter = HWTSTAMP_FILTER_ALL; - else + break; + default: config.rx_filter = HWTSTAMP_FILTER_NONE; + } return copy_to_user(req->ifr_data, &config, sizeof(config)) ? -EFAULT : 0; diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index c41879a955b5..2872684906e1 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -177,32 +177,6 @@ static void stmmac_enable_all_queues(struct stmmac_priv *priv) } } -/** - * stmmac_stop_all_queues - Stop all queues - * @priv: driver private structure - */ -static void stmmac_stop_all_queues(struct stmmac_priv *priv) -{ - u32 tx_queues_cnt = priv->plat->tx_queues_to_use; - u32 queue; - - for (queue = 0; queue < tx_queues_cnt; queue++) - netif_tx_stop_queue(netdev_get_tx_queue(priv->dev, queue)); -} - -/** - * stmmac_start_all_queues - Start all queues - * @priv: driver private structure - */ -static void stmmac_start_all_queues(struct stmmac_priv *priv) -{ - u32 tx_queues_cnt = priv->plat->tx_queues_to_use; - u32 queue; - - for (queue = 0; queue < tx_queues_cnt; queue++) - netif_tx_start_queue(netdev_get_tx_queue(priv->dev, queue)); -} - static void stmmac_service_event_schedule(struct stmmac_priv *priv) { if (!test_bit(STMMAC_DOWN, &priv->state) && @@ -2678,7 +2652,7 @@ static int stmmac_open(struct net_device *dev) } stmmac_enable_all_queues(priv); - stmmac_start_all_queues(priv); + netif_tx_start_all_queues(priv->dev); return 0; @@ -2724,8 +2698,6 @@ static int stmmac_release(struct net_device *dev) phy_disconnect(dev->phydev); } - stmmac_stop_all_queues(priv); - stmmac_disable_all_queues(priv); for (chan = 0; chan < priv->plat->tx_queues_to_use; chan++) @@ -4519,7 +4491,6 @@ int stmmac_suspend(struct device *dev) mutex_lock(&priv->lock); netif_device_detach(ndev); - stmmac_stop_all_queues(priv); stmmac_disable_all_queues(priv); @@ -4628,8 +4599,6 @@ int stmmac_resume(struct device *dev) stmmac_enable_all_queues(priv); - stmmac_start_all_queues(priv); - mutex_unlock(&priv->lock); if (ndev->phydev) diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c index f2fecb684220..bb9cd1262cc9 100644 --- a/drivers/net/gtp.c +++ b/drivers/net/gtp.c @@ -667,10 +667,6 @@ static int gtp_newlink(struct net *src_net, struct net_device *dev, gtp = netdev_priv(dev); - err = gtp_encap_enable(gtp, data); - if (err < 0) - return err; - if (!data[IFLA_GTP_PDP_HASHSIZE]) { hashsize = 1024; } else { @@ -681,12 +677,16 @@ static int gtp_newlink(struct net *src_net, struct net_device *dev, err = gtp_hashtable_new(gtp, hashsize); if (err < 0) - goto out_encap; + return err; + + err = gtp_encap_enable(gtp, data); + if (err < 0) + goto out_hashtable; err = register_netdevice(dev); if (err < 0) { netdev_dbg(dev, "failed to register new netdev %d\n", err); - goto out_hashtable; + goto out_encap; } gn = net_generic(dev_net(dev), gtp_net_id); @@ -697,11 +697,11 @@ static int gtp_newlink(struct net *src_net, struct net_device *dev, return 0; +out_encap: + gtp_encap_disable(gtp); out_hashtable: kfree(gtp->addr_hash); kfree(gtp->tid_hash); -out_encap: - gtp_encap_disable(gtp); return err; } diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c index 998d08ae7431..47d518e6d5d4 100644 --- a/drivers/net/phy/sfp.c +++ b/drivers/net/phy/sfp.c @@ -1886,7 +1886,8 @@ static int sfp_probe(struct platform_device *pdev) continue; irq = gpiod_to_irq(sfp->gpio[i]); - if (!irq) { + if (irq < 0) { + irq = 0; poll = true; continue; } diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c index af58bf54aa9b..6e0b3dc14aa4 100644 --- a/drivers/net/usb/qmi_wwan.c +++ b/drivers/net/usb/qmi_wwan.c @@ -1268,6 +1268,7 @@ static const struct usb_device_id products[] = { {QMI_FIXED_INTF(0x1bc7, 0x1101, 3)}, /* Telit ME910 dual modem */ {QMI_FIXED_INTF(0x1bc7, 0x1200, 5)}, /* Telit LE920 */ {QMI_QUIRK_SET_DTR(0x1bc7, 0x1201, 2)}, /* Telit LE920, LE920A4 */ + {QMI_QUIRK_SET_DTR(0x1bc7, 0x1230, 2)}, /* Telit LE910Cx */ {QMI_QUIRK_SET_DTR(0x1bc7, 0x1260, 2)}, /* Telit LE910Cx */ {QMI_QUIRK_SET_DTR(0x1bc7, 0x1261, 2)}, /* Telit LE910Cx */ {QMI_QUIRK_SET_DTR(0x1bc7, 0x1900, 1)}, /* Telit LN940 series */ @@ -1312,6 +1313,7 @@ static const struct usb_device_id products[] = { {QMI_QUIRK_SET_DTR(0x2cb7, 0x0104, 4)}, /* Fibocom NL678 series */ {QMI_FIXED_INTF(0x0489, 0xe0b4, 0)}, /* Foxconn T77W968 LTE */ {QMI_FIXED_INTF(0x0489, 0xe0b5, 0)}, /* Foxconn T77W968 LTE with eSIM support*/ + {QMI_FIXED_INTF(0x2692, 0x9025, 4)}, /* Cellient MPL200 (rebranded Qualcomm 05c6:9025) */ /* 4. Gobi 1000 devices */ {QMI_GOBI1K_DEVICE(0x05c6, 0x9212)}, /* Acer Gobi Modem Device */ diff --git a/drivers/net/wan/hdlc.c b/drivers/net/wan/hdlc.c index 7221a53b8b14..500463044b1a 100644 --- a/drivers/net/wan/hdlc.c +++ b/drivers/net/wan/hdlc.c @@ -49,7 +49,15 @@ static struct hdlc_proto *first_proto; static int hdlc_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *p, struct net_device *orig_dev) { - struct hdlc_device *hdlc = dev_to_hdlc(dev); + struct hdlc_device *hdlc; + + /* First make sure "dev" is an HDLC device */ + if (!(dev->priv_flags & IFF_WAN_HDLC)) { + kfree_skb(skb); + return NET_RX_SUCCESS; + } + + hdlc = dev_to_hdlc(dev); if (!net_eq(dev_net(dev), &init_net)) { kfree_skb(skb); diff --git a/drivers/net/wan/hdlc_fr.c b/drivers/net/wan/hdlc_fr.c index 03b5f5cce6f4..96b4ce13f3a5 100644 --- a/drivers/net/wan/hdlc_fr.c +++ b/drivers/net/wan/hdlc_fr.c @@ -276,63 +276,69 @@ static inline struct net_device **get_dev_p(struct pvc_device *pvc, static int fr_hard_header(struct sk_buff **skb_p, u16 dlci) { - u16 head_len; struct sk_buff *skb = *skb_p; - switch (skb->protocol) { - case cpu_to_be16(NLPID_CCITT_ANSI_LMI): - head_len = 4; - skb_push(skb, head_len); - skb->data[3] = NLPID_CCITT_ANSI_LMI; - break; + if (!skb->dev) { /* Control packets */ + switch (dlci) { + case LMI_CCITT_ANSI_DLCI: + skb_push(skb, 4); + skb->data[3] = NLPID_CCITT_ANSI_LMI; + break; - case cpu_to_be16(NLPID_CISCO_LMI): - head_len = 4; - skb_push(skb, head_len); - skb->data[3] = NLPID_CISCO_LMI; - break; + case LMI_CISCO_DLCI: + skb_push(skb, 4); + skb->data[3] = NLPID_CISCO_LMI; + break; - case cpu_to_be16(ETH_P_IP): - head_len = 4; - skb_push(skb, head_len); - skb->data[3] = NLPID_IP; - break; + default: + return -EINVAL; + } - case cpu_to_be16(ETH_P_IPV6): - head_len = 4; - skb_push(skb, head_len); - skb->data[3] = NLPID_IPV6; - break; + } else if (skb->dev->type == ARPHRD_DLCI) { + switch (skb->protocol) { + case htons(ETH_P_IP): + skb_push(skb, 4); + skb->data[3] = NLPID_IP; + break; - case cpu_to_be16(ETH_P_802_3): - head_len = 10; - if (skb_headroom(skb) < head_len) { - struct sk_buff *skb2 = skb_realloc_headroom(skb, - head_len); + case htons(ETH_P_IPV6): + skb_push(skb, 4); + skb->data[3] = NLPID_IPV6; + break; + + default: + skb_push(skb, 10); + skb->data[3] = FR_PAD; + skb->data[4] = NLPID_SNAP; + /* OUI 00-00-00 indicates an Ethertype follows */ + skb->data[5] = 0x00; + skb->data[6] = 0x00; + skb->data[7] = 0x00; + /* This should be an Ethertype: */ + *(__be16 *)(skb->data + 8) = skb->protocol; + } + + } else if (skb->dev->type == ARPHRD_ETHER) { + if (skb_headroom(skb) < 10) { + struct sk_buff *skb2 = skb_realloc_headroom(skb, 10); if (!skb2) return -ENOBUFS; dev_kfree_skb(skb); skb = *skb_p = skb2; } - skb_push(skb, head_len); + skb_push(skb, 10); skb->data[3] = FR_PAD; skb->data[4] = NLPID_SNAP; - skb->data[5] = FR_PAD; + /* OUI 00-80-C2 stands for the 802.1 organization */ + skb->data[5] = 0x00; skb->data[6] = 0x80; skb->data[7] = 0xC2; + /* PID 00-07 stands for Ethernet frames without FCS */ skb->data[8] = 0x00; - skb->data[9] = 0x07; /* bridged Ethernet frame w/out FCS */ - break; + skb->data[9] = 0x07; - default: - head_len = 10; - skb_push(skb, head_len); - skb->data[3] = FR_PAD; - skb->data[4] = NLPID_SNAP; - skb->data[5] = FR_PAD; - skb->data[6] = FR_PAD; - skb->data[7] = FR_PAD; - *(__be16*)(skb->data + 8) = skb->protocol; + } else { + return -EINVAL; } dlci_to_q922(skb->data, dlci); @@ -428,8 +434,8 @@ static netdev_tx_t pvc_xmit(struct sk_buff *skb, struct net_device *dev) skb_put(skb, pad); memset(skb->data + len, 0, pad); } - skb->protocol = cpu_to_be16(ETH_P_802_3); } + skb->dev = dev; if (!fr_hard_header(&skb, pvc->dlci)) { dev->stats.tx_bytes += skb->len; dev->stats.tx_packets++; @@ -497,10 +503,8 @@ static void fr_lmi_send(struct net_device *dev, int fullrep) memset(skb->data, 0, len); skb_reserve(skb, 4); if (lmi == LMI_CISCO) { - skb->protocol = cpu_to_be16(NLPID_CISCO_LMI); fr_hard_header(&skb, LMI_CISCO_DLCI); } else { - skb->protocol = cpu_to_be16(NLPID_CCITT_ANSI_LMI); fr_hard_header(&skb, LMI_CCITT_ANSI_DLCI); } data = skb_tail_pointer(skb); diff --git a/drivers/net/wan/hdlc_raw_eth.c b/drivers/net/wan/hdlc_raw_eth.c index 8bd3ed905813..676dea2918bf 100644 --- a/drivers/net/wan/hdlc_raw_eth.c +++ b/drivers/net/wan/hdlc_raw_eth.c @@ -102,6 +102,7 @@ static int raw_eth_ioctl(struct net_device *dev, struct ifreq *ifr) old_qlen = dev->tx_queue_len; ether_setup(dev); dev->tx_queue_len = old_qlen; + dev->priv_flags &= ~IFF_TX_SKB_SHARING; eth_hw_addr_random(dev); call_netdevice_notifiers(NETDEV_POST_TYPE_CHANGE, dev); netif_dormant_off(dev); diff --git a/drivers/net/wireguard/Makefile b/drivers/net/wireguard/Makefile new file mode 100644 index 000000000000..fc52b2cb500b --- /dev/null +++ b/drivers/net/wireguard/Makefile @@ -0,0 +1,18 @@ +ccflags-y := -O3 +ccflags-y += -D'pr_fmt(fmt)=KBUILD_MODNAME ": " fmt' +ccflags-$(CONFIG_WIREGUARD_DEBUG) += -DDEBUG +wireguard-y := main.o +wireguard-y += noise.o +wireguard-y += device.o +wireguard-y += peer.o +wireguard-y += timers.o +wireguard-y += queueing.o +wireguard-y += send.o +wireguard-y += receive.o +wireguard-y += socket.o +wireguard-y += peerlookup.o +wireguard-y += allowedips.o +wireguard-y += ratelimiter.o +wireguard-y += cookie.o +wireguard-y += netlink.o +obj-$(CONFIG_WIREGUARD) := wireguard.o diff --git a/drivers/net/wireguard/allowedips.c b/drivers/net/wireguard/allowedips.c new file mode 100644 index 000000000000..3725e9cd85f4 --- /dev/null +++ b/drivers/net/wireguard/allowedips.c @@ -0,0 +1,377 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#include "allowedips.h" +#include "peer.h" + +static void swap_endian(u8 *dst, const u8 *src, u8 bits) +{ + if (bits == 32) { + *(u32 *)dst = be32_to_cpu(*(const __be32 *)src); + } else if (bits == 128) { + ((u64 *)dst)[0] = be64_to_cpu(((const __be64 *)src)[0]); + ((u64 *)dst)[1] = be64_to_cpu(((const __be64 *)src)[1]); + } +} + +static void copy_and_assign_cidr(struct allowedips_node *node, const u8 *src, + u8 cidr, u8 bits) +{ + node->cidr = cidr; + node->bit_at_a = cidr / 8U; +#ifdef __LITTLE_ENDIAN + node->bit_at_a ^= (bits / 8U - 1U) % 8U; +#endif + node->bit_at_b = 7U - (cidr % 8U); + node->bitlen = bits; + memcpy(node->bits, src, bits / 8U); +} +#define CHOOSE_NODE(parent, key) \ + parent->bit[(key[parent->bit_at_a] >> parent->bit_at_b) & 1] + +static void push_rcu(struct allowedips_node **stack, + struct allowedips_node __rcu *p, unsigned int *len) +{ + if (rcu_access_pointer(p)) { + WARN_ON(IS_ENABLED(DEBUG) && *len >= 128); + stack[(*len)++] = rcu_dereference_raw(p); + } +} + +static void root_free_rcu(struct rcu_head *rcu) +{ + struct allowedips_node *node, *stack[128] = { + container_of(rcu, struct allowedips_node, rcu) }; + unsigned int len = 1; + + while (len > 0 && (node = stack[--len])) { + push_rcu(stack, node->bit[0], &len); + push_rcu(stack, node->bit[1], &len); + kfree(node); + } +} + +static void root_remove_peer_lists(struct allowedips_node *root) +{ + struct allowedips_node *node, *stack[128] = { root }; + unsigned int len = 1; + + while (len > 0 && (node = stack[--len])) { + push_rcu(stack, node->bit[0], &len); + push_rcu(stack, node->bit[1], &len); + if (rcu_access_pointer(node->peer)) + list_del(&node->peer_list); + } +} + +static void walk_remove_by_peer(struct allowedips_node __rcu **top, + struct wg_peer *peer, struct mutex *lock) +{ +#define REF(p) rcu_access_pointer(p) +#define DEREF(p) rcu_dereference_protected(*(p), lockdep_is_held(lock)) +#define PUSH(p) ({ \ + WARN_ON(IS_ENABLED(DEBUG) && len >= 128); \ + stack[len++] = p; \ + }) + + struct allowedips_node __rcu **stack[128], **nptr; + struct allowedips_node *node, *prev; + unsigned int len; + + if (unlikely(!peer || !REF(*top))) + return; + + for (prev = NULL, len = 0, PUSH(top); len > 0; prev = node) { + nptr = stack[len - 1]; + node = DEREF(nptr); + if (!node) { + --len; + continue; + } + if (!prev || REF(prev->bit[0]) == node || + REF(prev->bit[1]) == node) { + if (REF(node->bit[0])) + PUSH(&node->bit[0]); + else if (REF(node->bit[1])) + PUSH(&node->bit[1]); + } else if (REF(node->bit[0]) == prev) { + if (REF(node->bit[1])) + PUSH(&node->bit[1]); + } else { + if (rcu_dereference_protected(node->peer, + lockdep_is_held(lock)) == peer) { + RCU_INIT_POINTER(node->peer, NULL); + list_del_init(&node->peer_list); + if (!node->bit[0] || !node->bit[1]) { + rcu_assign_pointer(*nptr, DEREF( + &node->bit[!REF(node->bit[0])])); + kfree_rcu(node, rcu); + node = DEREF(nptr); + } + } + --len; + } + } + +#undef REF +#undef DEREF +#undef PUSH +} + +static unsigned int fls128(u64 a, u64 b) +{ + return a ? fls64(a) + 64U : fls64(b); +} + +static u8 common_bits(const struct allowedips_node *node, const u8 *key, + u8 bits) +{ + if (bits == 32) + return 32U - fls(*(const u32 *)node->bits ^ *(const u32 *)key); + else if (bits == 128) + return 128U - fls128( + *(const u64 *)&node->bits[0] ^ *(const u64 *)&key[0], + *(const u64 *)&node->bits[8] ^ *(const u64 *)&key[8]); + return 0; +} + +static bool prefix_matches(const struct allowedips_node *node, const u8 *key, + u8 bits) +{ + /* This could be much faster if it actually just compared the common + * bits properly, by precomputing a mask bswap(~0 << (32 - cidr)), and + * the rest, but it turns out that common_bits is already super fast on + * modern processors, even taking into account the unfortunate bswap. + * So, we just inline it like this instead. + */ + return common_bits(node, key, bits) >= node->cidr; +} + +static struct allowedips_node *find_node(struct allowedips_node *trie, u8 bits, + const u8 *key) +{ + struct allowedips_node *node = trie, *found = NULL; + + while (node && prefix_matches(node, key, bits)) { + if (rcu_access_pointer(node->peer)) + found = node; + if (node->cidr == bits) + break; + node = rcu_dereference_bh(CHOOSE_NODE(node, key)); + } + return found; +} + +/* Returns a strong reference to a peer */ +static struct wg_peer *lookup(struct allowedips_node __rcu *root, u8 bits, + const void *be_ip) +{ + /* Aligned so it can be passed to fls/fls64 */ + u8 ip[16] __aligned(__alignof(u64)); + struct allowedips_node *node; + struct wg_peer *peer = NULL; + + swap_endian(ip, be_ip, bits); + + rcu_read_lock_bh(); +retry: + node = find_node(rcu_dereference_bh(root), bits, ip); + if (node) { + peer = wg_peer_get_maybe_zero(rcu_dereference_bh(node->peer)); + if (!peer) + goto retry; + } + rcu_read_unlock_bh(); + return peer; +} + +static bool node_placement(struct allowedips_node __rcu *trie, const u8 *key, + u8 cidr, u8 bits, struct allowedips_node **rnode, + struct mutex *lock) +{ + struct allowedips_node *node = rcu_dereference_protected(trie, + lockdep_is_held(lock)); + struct allowedips_node *parent = NULL; + bool exact = false; + + while (node && node->cidr <= cidr && prefix_matches(node, key, bits)) { + parent = node; + if (parent->cidr == cidr) { + exact = true; + break; + } + node = rcu_dereference_protected(CHOOSE_NODE(parent, key), + lockdep_is_held(lock)); + } + *rnode = parent; + return exact; +} + +static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, + u8 cidr, struct wg_peer *peer, struct mutex *lock) +{ + struct allowedips_node *node, *parent, *down, *newnode; + + if (unlikely(cidr > bits || !peer)) + return -EINVAL; + + if (!rcu_access_pointer(*trie)) { + node = kzalloc(sizeof(*node), GFP_KERNEL); + if (unlikely(!node)) + return -ENOMEM; + RCU_INIT_POINTER(node->peer, peer); + list_add_tail(&node->peer_list, &peer->allowedips_list); + copy_and_assign_cidr(node, key, cidr, bits); + rcu_assign_pointer(*trie, node); + return 0; + } + if (node_placement(*trie, key, cidr, bits, &node, lock)) { + rcu_assign_pointer(node->peer, peer); + list_move_tail(&node->peer_list, &peer->allowedips_list); + return 0; + } + + newnode = kzalloc(sizeof(*newnode), GFP_KERNEL); + if (unlikely(!newnode)) + return -ENOMEM; + RCU_INIT_POINTER(newnode->peer, peer); + list_add_tail(&newnode->peer_list, &peer->allowedips_list); + copy_and_assign_cidr(newnode, key, cidr, bits); + + if (!node) { + down = rcu_dereference_protected(*trie, lockdep_is_held(lock)); + } else { + down = rcu_dereference_protected(CHOOSE_NODE(node, key), + lockdep_is_held(lock)); + if (!down) { + rcu_assign_pointer(CHOOSE_NODE(node, key), newnode); + return 0; + } + } + cidr = min(cidr, common_bits(down, key, bits)); + parent = node; + + if (newnode->cidr == cidr) { + rcu_assign_pointer(CHOOSE_NODE(newnode, down->bits), down); + if (!parent) + rcu_assign_pointer(*trie, newnode); + else + rcu_assign_pointer(CHOOSE_NODE(parent, newnode->bits), + newnode); + } else { + node = kzalloc(sizeof(*node), GFP_KERNEL); + if (unlikely(!node)) { + list_del(&newnode->peer_list); + kfree(newnode); + return -ENOMEM; + } + INIT_LIST_HEAD(&node->peer_list); + copy_and_assign_cidr(node, newnode->bits, cidr, bits); + + rcu_assign_pointer(CHOOSE_NODE(node, down->bits), down); + rcu_assign_pointer(CHOOSE_NODE(node, newnode->bits), newnode); + if (!parent) + rcu_assign_pointer(*trie, node); + else + rcu_assign_pointer(CHOOSE_NODE(parent, node->bits), + node); + } + return 0; +} + +void wg_allowedips_init(struct allowedips *table) +{ + table->root4 = table->root6 = NULL; + table->seq = 1; +} + +void wg_allowedips_free(struct allowedips *table, struct mutex *lock) +{ + struct allowedips_node __rcu *old4 = table->root4, *old6 = table->root6; + + ++table->seq; + RCU_INIT_POINTER(table->root4, NULL); + RCU_INIT_POINTER(table->root6, NULL); + if (rcu_access_pointer(old4)) { + struct allowedips_node *node = rcu_dereference_protected(old4, + lockdep_is_held(lock)); + + root_remove_peer_lists(node); + call_rcu(&node->rcu, root_free_rcu); + } + if (rcu_access_pointer(old6)) { + struct allowedips_node *node = rcu_dereference_protected(old6, + lockdep_is_held(lock)); + + root_remove_peer_lists(node); + call_rcu(&node->rcu, root_free_rcu); + } +} + +int wg_allowedips_insert_v4(struct allowedips *table, const struct in_addr *ip, + u8 cidr, struct wg_peer *peer, struct mutex *lock) +{ + /* Aligned so it can be passed to fls */ + u8 key[4] __aligned(__alignof(u32)); + + ++table->seq; + swap_endian(key, (const u8 *)ip, 32); + return add(&table->root4, 32, key, cidr, peer, lock); +} + +int wg_allowedips_insert_v6(struct allowedips *table, const struct in6_addr *ip, + u8 cidr, struct wg_peer *peer, struct mutex *lock) +{ + /* Aligned so it can be passed to fls64 */ + u8 key[16] __aligned(__alignof(u64)); + + ++table->seq; + swap_endian(key, (const u8 *)ip, 128); + return add(&table->root6, 128, key, cidr, peer, lock); +} + +void wg_allowedips_remove_by_peer(struct allowedips *table, + struct wg_peer *peer, struct mutex *lock) +{ + ++table->seq; + walk_remove_by_peer(&table->root4, peer, lock); + walk_remove_by_peer(&table->root6, peer, lock); +} + +int wg_allowedips_read_node(struct allowedips_node *node, u8 ip[16], u8 *cidr) +{ + const unsigned int cidr_bytes = DIV_ROUND_UP(node->cidr, 8U); + swap_endian(ip, node->bits, node->bitlen); + memset(ip + cidr_bytes, 0, node->bitlen / 8U - cidr_bytes); + if (node->cidr) + ip[cidr_bytes - 1U] &= ~0U << (-node->cidr % 8U); + + *cidr = node->cidr; + return node->bitlen == 32 ? AF_INET : AF_INET6; +} + +/* Returns a strong reference to a peer */ +struct wg_peer *wg_allowedips_lookup_dst(struct allowedips *table, + struct sk_buff *skb) +{ + if (skb->protocol == htons(ETH_P_IP)) + return lookup(table->root4, 32, &ip_hdr(skb)->daddr); + else if (skb->protocol == htons(ETH_P_IPV6)) + return lookup(table->root6, 128, &ipv6_hdr(skb)->daddr); + return NULL; +} + +/* Returns a strong reference to a peer */ +struct wg_peer *wg_allowedips_lookup_src(struct allowedips *table, + struct sk_buff *skb) +{ + if (skb->protocol == htons(ETH_P_IP)) + return lookup(table->root4, 32, &ip_hdr(skb)->saddr); + else if (skb->protocol == htons(ETH_P_IPV6)) + return lookup(table->root6, 128, &ipv6_hdr(skb)->saddr); + return NULL; +} + +#include "selftest/allowedips.c" diff --git a/drivers/net/wireguard/allowedips.h b/drivers/net/wireguard/allowedips.h new file mode 100644 index 000000000000..e5c83cafcef4 --- /dev/null +++ b/drivers/net/wireguard/allowedips.h @@ -0,0 +1,59 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#ifndef _WG_ALLOWEDIPS_H +#define _WG_ALLOWEDIPS_H + +#include +#include +#include + +struct wg_peer; + +struct allowedips_node { + struct wg_peer __rcu *peer; + struct allowedips_node __rcu *bit[2]; + /* While it may seem scandalous that we waste space for v4, + * we're alloc'ing to the nearest power of 2 anyway, so this + * doesn't actually make a difference. + */ + u8 bits[16] __aligned(__alignof(u64)); + u8 cidr, bit_at_a, bit_at_b, bitlen; + + /* Keep rarely used list at bottom to be beyond cache line. */ + union { + struct list_head peer_list; + struct rcu_head rcu; + }; +}; + +struct allowedips { + struct allowedips_node __rcu *root4; + struct allowedips_node __rcu *root6; + u64 seq; +}; + +void wg_allowedips_init(struct allowedips *table); +void wg_allowedips_free(struct allowedips *table, struct mutex *mutex); +int wg_allowedips_insert_v4(struct allowedips *table, const struct in_addr *ip, + u8 cidr, struct wg_peer *peer, struct mutex *lock); +int wg_allowedips_insert_v6(struct allowedips *table, const struct in6_addr *ip, + u8 cidr, struct wg_peer *peer, struct mutex *lock); +void wg_allowedips_remove_by_peer(struct allowedips *table, + struct wg_peer *peer, struct mutex *lock); +/* The ip input pointer should be __aligned(__alignof(u64))) */ +int wg_allowedips_read_node(struct allowedips_node *node, u8 ip[16], u8 *cidr); + +/* These return a strong reference to a peer: */ +struct wg_peer *wg_allowedips_lookup_dst(struct allowedips *table, + struct sk_buff *skb); +struct wg_peer *wg_allowedips_lookup_src(struct allowedips *table, + struct sk_buff *skb); + +#ifdef DEBUG +bool wg_allowedips_selftest(void); +#endif + +#endif /* _WG_ALLOWEDIPS_H */ diff --git a/drivers/net/wireguard/cookie.c b/drivers/net/wireguard/cookie.c new file mode 100644 index 000000000000..4956f0499c19 --- /dev/null +++ b/drivers/net/wireguard/cookie.c @@ -0,0 +1,236 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#include "cookie.h" +#include "peer.h" +#include "device.h" +#include "messages.h" +#include "ratelimiter.h" +#include "timers.h" + +#include +#include + +#include +#include + +void wg_cookie_checker_init(struct cookie_checker *checker, + struct wg_device *wg) +{ + init_rwsem(&checker->secret_lock); + checker->secret_birthdate = ktime_get_coarse_boottime_ns(); + get_random_bytes(checker->secret, NOISE_HASH_LEN); + checker->device = wg; +} + +enum { COOKIE_KEY_LABEL_LEN = 8 }; +static const u8 mac1_key_label[COOKIE_KEY_LABEL_LEN] = "mac1----"; +static const u8 cookie_key_label[COOKIE_KEY_LABEL_LEN] = "cookie--"; + +static void precompute_key(u8 key[NOISE_SYMMETRIC_KEY_LEN], + const u8 pubkey[NOISE_PUBLIC_KEY_LEN], + const u8 label[COOKIE_KEY_LABEL_LEN]) +{ + struct blake2s_state blake; + + blake2s_init(&blake, NOISE_SYMMETRIC_KEY_LEN); + blake2s_update(&blake, label, COOKIE_KEY_LABEL_LEN); + blake2s_update(&blake, pubkey, NOISE_PUBLIC_KEY_LEN); + blake2s_final(&blake, key); +} + +/* Must hold peer->handshake.static_identity->lock */ +void wg_cookie_checker_precompute_device_keys(struct cookie_checker *checker) +{ + if (likely(checker->device->static_identity.has_identity)) { + precompute_key(checker->cookie_encryption_key, + checker->device->static_identity.static_public, + cookie_key_label); + precompute_key(checker->message_mac1_key, + checker->device->static_identity.static_public, + mac1_key_label); + } else { + memset(checker->cookie_encryption_key, 0, + NOISE_SYMMETRIC_KEY_LEN); + memset(checker->message_mac1_key, 0, NOISE_SYMMETRIC_KEY_LEN); + } +} + +void wg_cookie_checker_precompute_peer_keys(struct wg_peer *peer) +{ + precompute_key(peer->latest_cookie.cookie_decryption_key, + peer->handshake.remote_static, cookie_key_label); + precompute_key(peer->latest_cookie.message_mac1_key, + peer->handshake.remote_static, mac1_key_label); +} + +void wg_cookie_init(struct cookie *cookie) +{ + memset(cookie, 0, sizeof(*cookie)); + init_rwsem(&cookie->lock); +} + +static void compute_mac1(u8 mac1[COOKIE_LEN], const void *message, size_t len, + const u8 key[NOISE_SYMMETRIC_KEY_LEN]) +{ + len = len - sizeof(struct message_macs) + + offsetof(struct message_macs, mac1); + blake2s(mac1, message, key, COOKIE_LEN, len, NOISE_SYMMETRIC_KEY_LEN); +} + +static void compute_mac2(u8 mac2[COOKIE_LEN], const void *message, size_t len, + const u8 cookie[COOKIE_LEN]) +{ + len = len - sizeof(struct message_macs) + + offsetof(struct message_macs, mac2); + blake2s(mac2, message, cookie, COOKIE_LEN, len, COOKIE_LEN); +} + +static void make_cookie(u8 cookie[COOKIE_LEN], struct sk_buff *skb, + struct cookie_checker *checker) +{ + struct blake2s_state state; + + if (wg_birthdate_has_expired(checker->secret_birthdate, + COOKIE_SECRET_MAX_AGE)) { + down_write(&checker->secret_lock); + checker->secret_birthdate = ktime_get_coarse_boottime_ns(); + get_random_bytes(checker->secret, NOISE_HASH_LEN); + up_write(&checker->secret_lock); + } + + down_read(&checker->secret_lock); + + blake2s_init_key(&state, COOKIE_LEN, checker->secret, NOISE_HASH_LEN); + if (skb->protocol == htons(ETH_P_IP)) + blake2s_update(&state, (u8 *)&ip_hdr(skb)->saddr, + sizeof(struct in_addr)); + else if (skb->protocol == htons(ETH_P_IPV6)) + blake2s_update(&state, (u8 *)&ipv6_hdr(skb)->saddr, + sizeof(struct in6_addr)); + blake2s_update(&state, (u8 *)&udp_hdr(skb)->source, sizeof(__be16)); + blake2s_final(&state, cookie); + + up_read(&checker->secret_lock); +} + +enum cookie_mac_state wg_cookie_validate_packet(struct cookie_checker *checker, + struct sk_buff *skb, + bool check_cookie) +{ + struct message_macs *macs = (struct message_macs *) + (skb->data + skb->len - sizeof(*macs)); + enum cookie_mac_state ret; + u8 computed_mac[COOKIE_LEN]; + u8 cookie[COOKIE_LEN]; + + ret = INVALID_MAC; + compute_mac1(computed_mac, skb->data, skb->len, + checker->message_mac1_key); + if (crypto_memneq(computed_mac, macs->mac1, COOKIE_LEN)) + goto out; + + ret = VALID_MAC_BUT_NO_COOKIE; + + if (!check_cookie) + goto out; + + make_cookie(cookie, skb, checker); + + compute_mac2(computed_mac, skb->data, skb->len, cookie); + if (crypto_memneq(computed_mac, macs->mac2, COOKIE_LEN)) + goto out; + + ret = VALID_MAC_WITH_COOKIE_BUT_RATELIMITED; + if (!wg_ratelimiter_allow(skb, dev_net(checker->device->dev))) + goto out; + + ret = VALID_MAC_WITH_COOKIE; + +out: + return ret; +} + +void wg_cookie_add_mac_to_packet(void *message, size_t len, + struct wg_peer *peer) +{ + struct message_macs *macs = (struct message_macs *) + ((u8 *)message + len - sizeof(*macs)); + + down_write(&peer->latest_cookie.lock); + compute_mac1(macs->mac1, message, len, + peer->latest_cookie.message_mac1_key); + memcpy(peer->latest_cookie.last_mac1_sent, macs->mac1, COOKIE_LEN); + peer->latest_cookie.have_sent_mac1 = true; + up_write(&peer->latest_cookie.lock); + + down_read(&peer->latest_cookie.lock); + if (peer->latest_cookie.is_valid && + !wg_birthdate_has_expired(peer->latest_cookie.birthdate, + COOKIE_SECRET_MAX_AGE - COOKIE_SECRET_LATENCY)) + compute_mac2(macs->mac2, message, len, + peer->latest_cookie.cookie); + else + memset(macs->mac2, 0, COOKIE_LEN); + up_read(&peer->latest_cookie.lock); +} + +void wg_cookie_message_create(struct message_handshake_cookie *dst, + struct sk_buff *skb, __le32 index, + struct cookie_checker *checker) +{ + struct message_macs *macs = (struct message_macs *) + ((u8 *)skb->data + skb->len - sizeof(*macs)); + u8 cookie[COOKIE_LEN]; + + dst->header.type = cpu_to_le32(MESSAGE_HANDSHAKE_COOKIE); + dst->receiver_index = index; + get_random_bytes_wait(dst->nonce, COOKIE_NONCE_LEN); + + make_cookie(cookie, skb, checker); + xchacha20poly1305_encrypt(dst->encrypted_cookie, cookie, COOKIE_LEN, + macs->mac1, COOKIE_LEN, dst->nonce, + checker->cookie_encryption_key); +} + +void wg_cookie_message_consume(struct message_handshake_cookie *src, + struct wg_device *wg) +{ + struct wg_peer *peer = NULL; + u8 cookie[COOKIE_LEN]; + bool ret; + + if (unlikely(!wg_index_hashtable_lookup(wg->index_hashtable, + INDEX_HASHTABLE_HANDSHAKE | + INDEX_HASHTABLE_KEYPAIR, + src->receiver_index, &peer))) + return; + + down_read(&peer->latest_cookie.lock); + if (unlikely(!peer->latest_cookie.have_sent_mac1)) { + up_read(&peer->latest_cookie.lock); + goto out; + } + ret = xchacha20poly1305_decrypt( + cookie, src->encrypted_cookie, sizeof(src->encrypted_cookie), + peer->latest_cookie.last_mac1_sent, COOKIE_LEN, src->nonce, + peer->latest_cookie.cookie_decryption_key); + up_read(&peer->latest_cookie.lock); + + if (ret) { + down_write(&peer->latest_cookie.lock); + memcpy(peer->latest_cookie.cookie, cookie, COOKIE_LEN); + peer->latest_cookie.birthdate = ktime_get_coarse_boottime_ns(); + peer->latest_cookie.is_valid = true; + peer->latest_cookie.have_sent_mac1 = false; + up_write(&peer->latest_cookie.lock); + } else { + net_dbg_ratelimited("%s: Could not decrypt invalid cookie response\n", + wg->dev->name); + } + +out: + wg_peer_put(peer); +} diff --git a/drivers/net/wireguard/cookie.h b/drivers/net/wireguard/cookie.h new file mode 100644 index 000000000000..c4bd61ca03f2 --- /dev/null +++ b/drivers/net/wireguard/cookie.h @@ -0,0 +1,59 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#ifndef _WG_COOKIE_H +#define _WG_COOKIE_H + +#include "messages.h" +#include + +struct wg_peer; + +struct cookie_checker { + u8 secret[NOISE_HASH_LEN]; + u8 cookie_encryption_key[NOISE_SYMMETRIC_KEY_LEN]; + u8 message_mac1_key[NOISE_SYMMETRIC_KEY_LEN]; + u64 secret_birthdate; + struct rw_semaphore secret_lock; + struct wg_device *device; +}; + +struct cookie { + u64 birthdate; + bool is_valid; + u8 cookie[COOKIE_LEN]; + bool have_sent_mac1; + u8 last_mac1_sent[COOKIE_LEN]; + u8 cookie_decryption_key[NOISE_SYMMETRIC_KEY_LEN]; + u8 message_mac1_key[NOISE_SYMMETRIC_KEY_LEN]; + struct rw_semaphore lock; +}; + +enum cookie_mac_state { + INVALID_MAC, + VALID_MAC_BUT_NO_COOKIE, + VALID_MAC_WITH_COOKIE_BUT_RATELIMITED, + VALID_MAC_WITH_COOKIE +}; + +void wg_cookie_checker_init(struct cookie_checker *checker, + struct wg_device *wg); +void wg_cookie_checker_precompute_device_keys(struct cookie_checker *checker); +void wg_cookie_checker_precompute_peer_keys(struct wg_peer *peer); +void wg_cookie_init(struct cookie *cookie); + +enum cookie_mac_state wg_cookie_validate_packet(struct cookie_checker *checker, + struct sk_buff *skb, + bool check_cookie); +void wg_cookie_add_mac_to_packet(void *message, size_t len, + struct wg_peer *peer); + +void wg_cookie_message_create(struct message_handshake_cookie *src, + struct sk_buff *skb, __le32 index, + struct cookie_checker *checker); +void wg_cookie_message_consume(struct message_handshake_cookie *src, + struct wg_device *wg); + +#endif /* _WG_COOKIE_H */ diff --git a/drivers/net/wireguard/device.c b/drivers/net/wireguard/device.c new file mode 100644 index 000000000000..de8a4fad4d3f --- /dev/null +++ b/drivers/net/wireguard/device.c @@ -0,0 +1,455 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#include "queueing.h" +#include "socket.h" +#include "timers.h" +#include "device.h" +#include "ratelimiter.h" +#include "peer.h" +#include "messages.h" + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +static LIST_HEAD(device_list); + +static int wg_open(struct net_device *dev) +{ + struct in_device *dev_v4 = __in_dev_get_rtnl(dev); + struct inet6_dev *dev_v6 = __in6_dev_get(dev); + struct wg_device *wg = netdev_priv(dev); + struct wg_peer *peer; + int ret; + + if (dev_v4) { + /* At some point we might put this check near the ip_rt_send_ + * redirect call of ip_forward in net/ipv4/ip_forward.c, similar + * to the current secpath check. + */ + IN_DEV_CONF_SET(dev_v4, SEND_REDIRECTS, false); + IPV4_DEVCONF_ALL(dev_net(dev), SEND_REDIRECTS) = false; + } + if (dev_v6) + dev_v6->cnf.addr_gen_mode = IN6_ADDR_GEN_MODE_NONE; + + mutex_lock(&wg->device_update_lock); + ret = wg_socket_init(wg, wg->incoming_port); + if (ret < 0) + goto out; + list_for_each_entry(peer, &wg->peer_list, peer_list) { + wg_packet_send_staged_packets(peer); + if (peer->persistent_keepalive_interval) + wg_packet_send_keepalive(peer); + } +out: + mutex_unlock(&wg->device_update_lock); + return ret; +} + +#ifdef CONFIG_PM_SLEEP +static int wg_pm_notification(struct notifier_block *nb, unsigned long action, + void *data) +{ + struct wg_device *wg; + struct wg_peer *peer; + + /* If the machine is constantly suspending and resuming, as part of + * its normal operation rather than as a somewhat rare event, then we + * don't actually want to clear keys. + */ + if (IS_ENABLED(CONFIG_PM_AUTOSLEEP) || IS_ENABLED(CONFIG_ANDROID)) + return 0; + + if (action != PM_HIBERNATION_PREPARE && action != PM_SUSPEND_PREPARE) + return 0; + + rtnl_lock(); + list_for_each_entry(wg, &device_list, device_list) { + mutex_lock(&wg->device_update_lock); + list_for_each_entry(peer, &wg->peer_list, peer_list) { + del_timer(&peer->timer_zero_key_material); + wg_noise_handshake_clear(&peer->handshake); + wg_noise_keypairs_clear(&peer->keypairs); + } + mutex_unlock(&wg->device_update_lock); + } + rtnl_unlock(); + rcu_barrier(); + return 0; +} + +static struct notifier_block pm_notifier = { .notifier_call = wg_pm_notification }; +#endif + +static int wg_stop(struct net_device *dev) +{ + struct wg_device *wg = netdev_priv(dev); + struct wg_peer *peer; + + mutex_lock(&wg->device_update_lock); + list_for_each_entry(peer, &wg->peer_list, peer_list) { + wg_packet_purge_staged_packets(peer); + wg_timers_stop(peer); + wg_noise_handshake_clear(&peer->handshake); + wg_noise_keypairs_clear(&peer->keypairs); + wg_noise_reset_last_sent_handshake(&peer->last_sent_handshake); + } + mutex_unlock(&wg->device_update_lock); + skb_queue_purge(&wg->incoming_handshakes); + wg_socket_reinit(wg, NULL, NULL); + return 0; +} + +static netdev_tx_t wg_xmit(struct sk_buff *skb, struct net_device *dev) +{ + struct wg_device *wg = netdev_priv(dev); + struct sk_buff_head packets; + struct wg_peer *peer; + struct sk_buff *next; + sa_family_t family; + u32 mtu; + int ret; + + if (unlikely(!wg_check_packet_protocol(skb))) { + ret = -EPROTONOSUPPORT; + net_dbg_ratelimited("%s: Invalid IP packet\n", dev->name); + goto err; + } + + peer = wg_allowedips_lookup_dst(&wg->peer_allowedips, skb); + if (unlikely(!peer)) { + ret = -ENOKEY; + if (skb->protocol == htons(ETH_P_IP)) + net_dbg_ratelimited("%s: No peer has allowed IPs matching %pI4\n", + dev->name, &ip_hdr(skb)->daddr); + else if (skb->protocol == htons(ETH_P_IPV6)) + net_dbg_ratelimited("%s: No peer has allowed IPs matching %pI6\n", + dev->name, &ipv6_hdr(skb)->daddr); + goto err; + } + + family = READ_ONCE(peer->endpoint.addr.sa_family); + if (unlikely(family != AF_INET && family != AF_INET6)) { + ret = -EDESTADDRREQ; + net_dbg_ratelimited("%s: No valid endpoint has been configured or discovered for peer %llu\n", + dev->name, peer->internal_id); + goto err_peer; + } + + mtu = skb_dst(skb) ? dst_mtu(skb_dst(skb)) : dev->mtu; + + __skb_queue_head_init(&packets); + if (!skb_is_gso(skb)) { + skb_mark_not_on_list(skb); + } else { + struct sk_buff *segs = skb_gso_segment(skb, 0); + + if (unlikely(IS_ERR(segs))) { + ret = PTR_ERR(segs); + goto err_peer; + } + dev_kfree_skb(skb); + skb = segs; + } + + skb_list_walk_safe(skb, skb, next) { + skb_mark_not_on_list(skb); + + skb = skb_share_check(skb, GFP_ATOMIC); + if (unlikely(!skb)) + continue; + + /* We only need to keep the original dst around for icmp, + * so at this point we're in a position to drop it. + */ + skb_dst_drop(skb); + + PACKET_CB(skb)->mtu = mtu; + + __skb_queue_tail(&packets, skb); + } + + spin_lock_bh(&peer->staged_packet_queue.lock); + /* If the queue is getting too big, we start removing the oldest packets + * until it's small again. We do this before adding the new packet, so + * we don't remove GSO segments that are in excess. + */ + while (skb_queue_len(&peer->staged_packet_queue) > MAX_STAGED_PACKETS) { + dev_kfree_skb(__skb_dequeue(&peer->staged_packet_queue)); + ++dev->stats.tx_dropped; + } + skb_queue_splice_tail(&packets, &peer->staged_packet_queue); + spin_unlock_bh(&peer->staged_packet_queue.lock); + + wg_packet_send_staged_packets(peer); + + wg_peer_put(peer); + return NETDEV_TX_OK; + +err_peer: + wg_peer_put(peer); +err: + ++dev->stats.tx_errors; + if (skb->protocol == htons(ETH_P_IP)) + icmp_ndo_send(skb, ICMP_DEST_UNREACH, ICMP_HOST_UNREACH, 0); + else if (skb->protocol == htons(ETH_P_IPV6)) + icmpv6_ndo_send(skb, ICMPV6_DEST_UNREACH, ICMPV6_ADDR_UNREACH, 0); + kfree_skb(skb); + return ret; +} + +static const struct net_device_ops netdev_ops = { + .ndo_open = wg_open, + .ndo_stop = wg_stop, + .ndo_start_xmit = wg_xmit, + .ndo_get_stats64 = ip_tunnel_get_stats64 +}; + +static void wg_destruct(struct net_device *dev) +{ + struct wg_device *wg = netdev_priv(dev); + + rtnl_lock(); + list_del(&wg->device_list); + rtnl_unlock(); + mutex_lock(&wg->device_update_lock); + rcu_assign_pointer(wg->creating_net, NULL); + wg->incoming_port = 0; + wg_socket_reinit(wg, NULL, NULL); + /* The final references are cleared in the below calls to destroy_workqueue. */ + wg_peer_remove_all(wg); + destroy_workqueue(wg->handshake_receive_wq); + destroy_workqueue(wg->handshake_send_wq); + destroy_workqueue(wg->packet_crypt_wq); + wg_packet_queue_free(&wg->decrypt_queue, true); + wg_packet_queue_free(&wg->encrypt_queue, true); + rcu_barrier(); /* Wait for all the peers to be actually freed. */ + wg_ratelimiter_uninit(); + memzero_explicit(&wg->static_identity, sizeof(wg->static_identity)); + skb_queue_purge(&wg->incoming_handshakes); + free_percpu(dev->tstats); + free_percpu(wg->incoming_handshakes_worker); + kvfree(wg->index_hashtable); + kvfree(wg->peer_hashtable); + mutex_unlock(&wg->device_update_lock); + + pr_debug("%s: Interface destroyed\n", dev->name); + free_netdev(dev); +} + +static const struct device_type device_type = { .name = KBUILD_MODNAME }; + +static void wg_setup(struct net_device *dev) +{ + struct wg_device *wg = netdev_priv(dev); + enum { WG_NETDEV_FEATURES = NETIF_F_HW_CSUM | NETIF_F_RXCSUM | + NETIF_F_SG | NETIF_F_GSO | + NETIF_F_GSO_SOFTWARE | NETIF_F_HIGHDMA }; + const int overhead = MESSAGE_MINIMUM_LENGTH + sizeof(struct udphdr) + + max(sizeof(struct ipv6hdr), sizeof(struct iphdr)); + + dev->netdev_ops = &netdev_ops; + dev->hard_header_len = 0; + dev->addr_len = 0; + dev->needed_headroom = DATA_PACKET_HEAD_ROOM; + dev->needed_tailroom = noise_encrypted_len(MESSAGE_PADDING_MULTIPLE); + dev->type = ARPHRD_NONE; + dev->flags = IFF_POINTOPOINT | IFF_NOARP; + dev->priv_flags |= IFF_NO_QUEUE; + dev->features |= NETIF_F_LLTX; + dev->features |= WG_NETDEV_FEATURES; + dev->hw_features |= WG_NETDEV_FEATURES; + dev->hw_enc_features |= WG_NETDEV_FEATURES; + dev->mtu = ETH_DATA_LEN - overhead; + dev->max_mtu = round_down(INT_MAX, MESSAGE_PADDING_MULTIPLE) - overhead; + + SET_NETDEV_DEVTYPE(dev, &device_type); + + /* We need to keep the dst around in case of icmp replies. */ + netif_keep_dst(dev); + + memset(wg, 0, sizeof(*wg)); + wg->dev = dev; +} + +static int wg_newlink(struct net *src_net, struct net_device *dev, + struct nlattr *tb[], struct nlattr *data[], + struct netlink_ext_ack *extack) +{ + struct wg_device *wg = netdev_priv(dev); + int ret = -ENOMEM; + + rcu_assign_pointer(wg->creating_net, src_net); + init_rwsem(&wg->static_identity.lock); + mutex_init(&wg->socket_update_lock); + mutex_init(&wg->device_update_lock); + skb_queue_head_init(&wg->incoming_handshakes); + wg_allowedips_init(&wg->peer_allowedips); + wg_cookie_checker_init(&wg->cookie_checker, wg); + INIT_LIST_HEAD(&wg->peer_list); + wg->device_update_gen = 1; + + wg->peer_hashtable = wg_pubkey_hashtable_alloc(); + if (!wg->peer_hashtable) + return ret; + + wg->index_hashtable = wg_index_hashtable_alloc(); + if (!wg->index_hashtable) + goto err_free_peer_hashtable; + + dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats); + if (!dev->tstats) + goto err_free_index_hashtable; + + wg->incoming_handshakes_worker = + wg_packet_percpu_multicore_worker_alloc( + wg_packet_handshake_receive_worker, wg); + if (!wg->incoming_handshakes_worker) + goto err_free_tstats; + + wg->handshake_receive_wq = alloc_workqueue("wg-kex-%s", + WQ_CPU_INTENSIVE | WQ_FREEZABLE, 0, dev->name); + if (!wg->handshake_receive_wq) + goto err_free_incoming_handshakes; + + wg->handshake_send_wq = alloc_workqueue("wg-kex-%s", + WQ_UNBOUND | WQ_FREEZABLE, 0, dev->name); + if (!wg->handshake_send_wq) + goto err_destroy_handshake_receive; + + wg->packet_crypt_wq = alloc_workqueue("wg-crypt-%s", + WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM, 0, dev->name); + if (!wg->packet_crypt_wq) + goto err_destroy_handshake_send; + + ret = wg_packet_queue_init(&wg->encrypt_queue, wg_packet_encrypt_worker, + true, MAX_QUEUED_PACKETS); + if (ret < 0) + goto err_destroy_packet_crypt; + + ret = wg_packet_queue_init(&wg->decrypt_queue, wg_packet_decrypt_worker, + true, MAX_QUEUED_PACKETS); + if (ret < 0) + goto err_free_encrypt_queue; + + ret = wg_ratelimiter_init(); + if (ret < 0) + goto err_free_decrypt_queue; + + ret = register_netdevice(dev); + if (ret < 0) + goto err_uninit_ratelimiter; + + list_add(&wg->device_list, &device_list); + + /* We wait until the end to assign priv_destructor, so that + * register_netdevice doesn't call it for us if it fails. + */ + dev->priv_destructor = wg_destruct; + + pr_debug("%s: Interface created\n", dev->name); + return ret; + +err_uninit_ratelimiter: + wg_ratelimiter_uninit(); +err_free_decrypt_queue: + wg_packet_queue_free(&wg->decrypt_queue, true); +err_free_encrypt_queue: + wg_packet_queue_free(&wg->encrypt_queue, true); +err_destroy_packet_crypt: + destroy_workqueue(wg->packet_crypt_wq); +err_destroy_handshake_send: + destroy_workqueue(wg->handshake_send_wq); +err_destroy_handshake_receive: + destroy_workqueue(wg->handshake_receive_wq); +err_free_incoming_handshakes: + free_percpu(wg->incoming_handshakes_worker); +err_free_tstats: + free_percpu(dev->tstats); +err_free_index_hashtable: + kvfree(wg->index_hashtable); +err_free_peer_hashtable: + kvfree(wg->peer_hashtable); + return ret; +} + +static struct rtnl_link_ops link_ops __read_mostly = { + .kind = KBUILD_MODNAME, + .priv_size = sizeof(struct wg_device), + .setup = wg_setup, + .newlink = wg_newlink, +}; + +static void wg_netns_exit(struct net *net) +{ + struct wg_device *wg; + + rtnl_lock(); + list_for_each_entry(wg, &device_list, device_list) { + if (rcu_access_pointer(wg->creating_net) == net) { + pr_debug("%s: Creating namespace exiting\n", wg->dev->name); + netif_carrier_off(wg->dev); + mutex_lock(&wg->device_update_lock); + rcu_assign_pointer(wg->creating_net, NULL); + wg_socket_reinit(wg, NULL, NULL); + mutex_unlock(&wg->device_update_lock); + } + } + rtnl_unlock(); +} + +static struct pernet_operations pernet_ops = { + .exit = wg_netns_exit +}; + +int __init wg_device_init(void) +{ + int ret; + +#ifdef CONFIG_PM_SLEEP + ret = register_pm_notifier(&pm_notifier); + if (ret) + return ret; +#endif + + ret = register_pernet_device(&pernet_ops); + if (ret) + goto error_pm; + + ret = rtnl_link_register(&link_ops); + if (ret) + goto error_pernet; + + return 0; + +error_pernet: + unregister_pernet_device(&pernet_ops); +error_pm: +#ifdef CONFIG_PM_SLEEP + unregister_pm_notifier(&pm_notifier); +#endif + return ret; +} + +void wg_device_uninit(void) +{ + rtnl_link_unregister(&link_ops); + unregister_pernet_device(&pernet_ops); +#ifdef CONFIG_PM_SLEEP + unregister_pm_notifier(&pm_notifier); +#endif + rcu_barrier(); +} diff --git a/drivers/net/wireguard/device.h b/drivers/net/wireguard/device.h new file mode 100644 index 000000000000..4d0144e16947 --- /dev/null +++ b/drivers/net/wireguard/device.h @@ -0,0 +1,64 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#ifndef _WG_DEVICE_H +#define _WG_DEVICE_H + +#include "noise.h" +#include "allowedips.h" +#include "peerlookup.h" +#include "cookie.h" + +#include +#include +#include +#include +#include +#include + +struct wg_device; + +struct multicore_worker { + void *ptr; + struct work_struct work; +}; + +struct crypt_queue { + struct ptr_ring ring; + union { + struct { + struct multicore_worker __percpu *worker; + int last_cpu; + }; + struct work_struct work; + }; +}; + +struct wg_device { + struct net_device *dev; + struct crypt_queue encrypt_queue, decrypt_queue; + struct sock __rcu *sock4, *sock6; + struct net __rcu *creating_net; + struct noise_static_identity static_identity; + struct workqueue_struct *handshake_receive_wq, *handshake_send_wq; + struct workqueue_struct *packet_crypt_wq; + struct sk_buff_head incoming_handshakes; + int incoming_handshake_cpu; + struct multicore_worker __percpu *incoming_handshakes_worker; + struct cookie_checker cookie_checker; + struct pubkey_hashtable *peer_hashtable; + struct index_hashtable *index_hashtable; + struct allowedips peer_allowedips; + struct mutex device_update_lock, socket_update_lock; + struct list_head device_list, peer_list; + unsigned int num_peers, device_update_gen; + u32 fwmark; + u16 incoming_port; +}; + +int wg_device_init(void); +void wg_device_uninit(void); + +#endif /* _WG_DEVICE_H */ diff --git a/drivers/net/wireguard/main.c b/drivers/net/wireguard/main.c new file mode 100644 index 000000000000..7a7d5f1a80fc --- /dev/null +++ b/drivers/net/wireguard/main.c @@ -0,0 +1,63 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#include "version.h" +#include "device.h" +#include "noise.h" +#include "queueing.h" +#include "ratelimiter.h" +#include "netlink.h" + +#include + +#include +#include +#include +#include + +static int __init mod_init(void) +{ + int ret; + +#ifdef DEBUG + if (!wg_allowedips_selftest() || !wg_packet_counter_selftest() || + !wg_ratelimiter_selftest()) + return -ENOTRECOVERABLE; +#endif + wg_noise_init(); + + ret = wg_device_init(); + if (ret < 0) + goto err_device; + + ret = wg_genetlink_init(); + if (ret < 0) + goto err_netlink; + + pr_info("WireGuard " WIREGUARD_VERSION " loaded. See www.wireguard.com for information.\n"); + pr_info("Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved.\n"); + + return 0; + +err_netlink: + wg_device_uninit(); +err_device: + return ret; +} + +static void __exit mod_exit(void) +{ + wg_genetlink_uninit(); + wg_device_uninit(); +} + +module_init(mod_init); +module_exit(mod_exit); +MODULE_LICENSE("GPL v2"); +MODULE_DESCRIPTION("WireGuard secure network tunnel"); +MODULE_AUTHOR("Jason A. Donenfeld "); +MODULE_VERSION(WIREGUARD_VERSION); +MODULE_ALIAS_RTNL_LINK(KBUILD_MODNAME); +MODULE_ALIAS_GENL_FAMILY(WG_GENL_NAME); diff --git a/drivers/net/wireguard/messages.h b/drivers/net/wireguard/messages.h new file mode 100644 index 000000000000..208da72673fc --- /dev/null +++ b/drivers/net/wireguard/messages.h @@ -0,0 +1,128 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#ifndef _WG_MESSAGES_H +#define _WG_MESSAGES_H + +#include +#include +#include + +#include +#include +#include + +enum noise_lengths { + NOISE_PUBLIC_KEY_LEN = CURVE25519_KEY_SIZE, + NOISE_SYMMETRIC_KEY_LEN = CHACHA20POLY1305_KEY_SIZE, + NOISE_TIMESTAMP_LEN = sizeof(u64) + sizeof(u32), + NOISE_AUTHTAG_LEN = CHACHA20POLY1305_AUTHTAG_SIZE, + NOISE_HASH_LEN = BLAKE2S_HASH_SIZE +}; + +#define noise_encrypted_len(plain_len) ((plain_len) + NOISE_AUTHTAG_LEN) + +enum cookie_values { + COOKIE_SECRET_MAX_AGE = 2 * 60, + COOKIE_SECRET_LATENCY = 5, + COOKIE_NONCE_LEN = XCHACHA20POLY1305_NONCE_SIZE, + COOKIE_LEN = 16 +}; + +enum counter_values { + COUNTER_BITS_TOTAL = 8192, + COUNTER_REDUNDANT_BITS = BITS_PER_LONG, + COUNTER_WINDOW_SIZE = COUNTER_BITS_TOTAL - COUNTER_REDUNDANT_BITS +}; + +enum limits { + REKEY_AFTER_MESSAGES = 1ULL << 60, + REJECT_AFTER_MESSAGES = U64_MAX - COUNTER_WINDOW_SIZE - 1, + REKEY_TIMEOUT = 5, + REKEY_TIMEOUT_JITTER_MAX_JIFFIES = HZ / 3, + REKEY_AFTER_TIME = 120, + REJECT_AFTER_TIME = 180, + INITIATIONS_PER_SECOND = 50, + MAX_PEERS_PER_DEVICE = 1U << 20, + KEEPALIVE_TIMEOUT = 10, + MAX_TIMER_HANDSHAKES = 90 / REKEY_TIMEOUT, + MAX_QUEUED_INCOMING_HANDSHAKES = 4096, /* TODO: replace this with DQL */ + MAX_STAGED_PACKETS = 128, + MAX_QUEUED_PACKETS = 1024 /* TODO: replace this with DQL */ +}; + +enum message_type { + MESSAGE_INVALID = 0, + MESSAGE_HANDSHAKE_INITIATION = 1, + MESSAGE_HANDSHAKE_RESPONSE = 2, + MESSAGE_HANDSHAKE_COOKIE = 3, + MESSAGE_DATA = 4 +}; + +struct message_header { + /* The actual layout of this that we want is: + * u8 type + * u8 reserved_zero[3] + * + * But it turns out that by encoding this as little endian, + * we achieve the same thing, and it makes checking faster. + */ + __le32 type; +}; + +struct message_macs { + u8 mac1[COOKIE_LEN]; + u8 mac2[COOKIE_LEN]; +}; + +struct message_handshake_initiation { + struct message_header header; + __le32 sender_index; + u8 unencrypted_ephemeral[NOISE_PUBLIC_KEY_LEN]; + u8 encrypted_static[noise_encrypted_len(NOISE_PUBLIC_KEY_LEN)]; + u8 encrypted_timestamp[noise_encrypted_len(NOISE_TIMESTAMP_LEN)]; + struct message_macs macs; +}; + +struct message_handshake_response { + struct message_header header; + __le32 sender_index; + __le32 receiver_index; + u8 unencrypted_ephemeral[NOISE_PUBLIC_KEY_LEN]; + u8 encrypted_nothing[noise_encrypted_len(0)]; + struct message_macs macs; +}; + +struct message_handshake_cookie { + struct message_header header; + __le32 receiver_index; + u8 nonce[COOKIE_NONCE_LEN]; + u8 encrypted_cookie[noise_encrypted_len(COOKIE_LEN)]; +}; + +struct message_data { + struct message_header header; + __le32 key_idx; + __le64 counter; + u8 encrypted_data[]; +}; + +#define message_data_len(plain_len) \ + (noise_encrypted_len(plain_len) + sizeof(struct message_data)) + +enum message_alignments { + MESSAGE_PADDING_MULTIPLE = 16, + MESSAGE_MINIMUM_LENGTH = message_data_len(0) +}; + +#define SKB_HEADER_LEN \ + (max(sizeof(struct iphdr), sizeof(struct ipv6hdr)) + \ + sizeof(struct udphdr) + NET_SKB_PAD) +#define DATA_PACKET_HEAD_ROOM \ + ALIGN(sizeof(struct message_data) + SKB_HEADER_LEN, 4) + +enum { HANDSHAKE_DSCP = 0x88 /* AF41, plus 00 ECN */ }; + +#endif /* _WG_MESSAGES_H */ diff --git a/drivers/net/wireguard/netlink.c b/drivers/net/wireguard/netlink.c new file mode 100644 index 000000000000..a4377add66d8 --- /dev/null +++ b/drivers/net/wireguard/netlink.c @@ -0,0 +1,651 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#include "netlink.h" +#include "device.h" +#include "peer.h" +#include "socket.h" +#include "queueing.h" +#include "messages.h" + +#include + +#include +#include +#include +#include + +struct __uapi_kernel_timespec { + int64_t tv_sec, tv_nsec; +}; + +static struct genl_family genl_family; + +static const struct nla_policy device_policy[WGDEVICE_A_MAX + 1] = { + [WGDEVICE_A_IFINDEX] = { .type = NLA_U32 }, + [WGDEVICE_A_IFNAME] = { .type = NLA_NUL_STRING, .len = IFNAMSIZ - 1 }, + [WGDEVICE_A_PRIVATE_KEY] = { .len = NOISE_PUBLIC_KEY_LEN }, + [WGDEVICE_A_PUBLIC_KEY] = { .len = NOISE_PUBLIC_KEY_LEN }, + [WGDEVICE_A_FLAGS] = { .type = NLA_U32 }, + [WGDEVICE_A_LISTEN_PORT] = { .type = NLA_U16 }, + [WGDEVICE_A_FWMARK] = { .type = NLA_U32 }, + [WGDEVICE_A_PEERS] = { .type = NLA_NESTED } +}; + +static const struct nla_policy peer_policy[WGPEER_A_MAX + 1] = { + [WGPEER_A_PUBLIC_KEY] = { .len = NOISE_PUBLIC_KEY_LEN }, + [WGPEER_A_PRESHARED_KEY] = { .len = NOISE_SYMMETRIC_KEY_LEN }, + [WGPEER_A_FLAGS] = { .type = NLA_U32 }, + [WGPEER_A_ENDPOINT] = { .len = sizeof(struct sockaddr) }, + [WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL] = { .type = NLA_U16 }, + [WGPEER_A_LAST_HANDSHAKE_TIME] = { .len = sizeof(struct __uapi_kernel_timespec) }, + [WGPEER_A_RX_BYTES] = { .type = NLA_U64 }, + [WGPEER_A_TX_BYTES] = { .type = NLA_U64 }, + [WGPEER_A_ALLOWEDIPS] = { .type = NLA_NESTED }, + [WGPEER_A_PROTOCOL_VERSION] = { .type = NLA_U32 } +}; + +static const struct nla_policy allowedip_policy[WGALLOWEDIP_A_MAX + 1] = { + [WGALLOWEDIP_A_FAMILY] = { .type = NLA_U16 }, + [WGALLOWEDIP_A_IPADDR] = { .len = sizeof(struct in_addr) }, + [WGALLOWEDIP_A_CIDR_MASK] = { .type = NLA_U8 } +}; + +static struct wg_device *lookup_interface(struct nlattr **attrs, + struct sk_buff *skb) +{ + struct net_device *dev = NULL; + + if (!attrs[WGDEVICE_A_IFINDEX] == !attrs[WGDEVICE_A_IFNAME]) + return ERR_PTR(-EBADR); + if (attrs[WGDEVICE_A_IFINDEX]) + dev = dev_get_by_index(sock_net(skb->sk), + nla_get_u32(attrs[WGDEVICE_A_IFINDEX])); + else if (attrs[WGDEVICE_A_IFNAME]) + dev = dev_get_by_name(sock_net(skb->sk), + nla_data(attrs[WGDEVICE_A_IFNAME])); + if (!dev) + return ERR_PTR(-ENODEV); + if (!dev->rtnl_link_ops || !dev->rtnl_link_ops->kind || + strcmp(dev->rtnl_link_ops->kind, KBUILD_MODNAME)) { + dev_put(dev); + return ERR_PTR(-EOPNOTSUPP); + } + return netdev_priv(dev); +} + +static int get_allowedips(struct sk_buff *skb, const u8 *ip, u8 cidr, + int family) +{ + struct nlattr *allowedip_nest; + + allowedip_nest = nla_nest_start(skb, 0); + if (!allowedip_nest) + return -EMSGSIZE; + + if (nla_put_u8(skb, WGALLOWEDIP_A_CIDR_MASK, cidr) || + nla_put_u16(skb, WGALLOWEDIP_A_FAMILY, family) || + nla_put(skb, WGALLOWEDIP_A_IPADDR, family == AF_INET6 ? + sizeof(struct in6_addr) : sizeof(struct in_addr), ip)) { + nla_nest_cancel(skb, allowedip_nest); + return -EMSGSIZE; + } + + nla_nest_end(skb, allowedip_nest); + return 0; +} + +struct dump_ctx { + struct wg_device *wg; + struct wg_peer *next_peer; + u64 allowedips_seq; + struct allowedips_node *next_allowedip; +}; + +#define DUMP_CTX(cb) ((struct dump_ctx *)(cb)->args) + +static int +get_peer(struct wg_peer *peer, struct sk_buff *skb, struct dump_ctx *ctx) +{ + + struct nlattr *allowedips_nest, *peer_nest = nla_nest_start(skb, 0); + struct allowedips_node *allowedips_node = ctx->next_allowedip; + bool fail; + + if (!peer_nest) + return -EMSGSIZE; + + down_read(&peer->handshake.lock); + fail = nla_put(skb, WGPEER_A_PUBLIC_KEY, NOISE_PUBLIC_KEY_LEN, + peer->handshake.remote_static); + up_read(&peer->handshake.lock); + if (fail) + goto err; + + if (!allowedips_node) { + const struct __uapi_kernel_timespec last_handshake = { + .tv_sec = peer->walltime_last_handshake.tv_sec, + .tv_nsec = peer->walltime_last_handshake.tv_nsec + }; + + down_read(&peer->handshake.lock); + fail = nla_put(skb, WGPEER_A_PRESHARED_KEY, + NOISE_SYMMETRIC_KEY_LEN, + peer->handshake.preshared_key); + up_read(&peer->handshake.lock); + if (fail) + goto err; + + if (nla_put(skb, WGPEER_A_LAST_HANDSHAKE_TIME, + sizeof(last_handshake), &last_handshake) || + nla_put_u16(skb, WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL, + peer->persistent_keepalive_interval) || + nla_put_u64_64bit(skb, WGPEER_A_TX_BYTES, peer->tx_bytes, + WGPEER_A_UNSPEC) || + nla_put_u64_64bit(skb, WGPEER_A_RX_BYTES, peer->rx_bytes, + WGPEER_A_UNSPEC) || + nla_put_u32(skb, WGPEER_A_PROTOCOL_VERSION, 1)) + goto err; + + read_lock_bh(&peer->endpoint_lock); + if (peer->endpoint.addr.sa_family == AF_INET) + fail = nla_put(skb, WGPEER_A_ENDPOINT, + sizeof(peer->endpoint.addr4), + &peer->endpoint.addr4); + else if (peer->endpoint.addr.sa_family == AF_INET6) + fail = nla_put(skb, WGPEER_A_ENDPOINT, + sizeof(peer->endpoint.addr6), + &peer->endpoint.addr6); + read_unlock_bh(&peer->endpoint_lock); + if (fail) + goto err; + allowedips_node = + list_first_entry_or_null(&peer->allowedips_list, + struct allowedips_node, peer_list); + } + if (!allowedips_node) + goto no_allowedips; + if (!ctx->allowedips_seq) + ctx->allowedips_seq = peer->device->peer_allowedips.seq; + else if (ctx->allowedips_seq != peer->device->peer_allowedips.seq) + goto no_allowedips; + + allowedips_nest = nla_nest_start(skb, WGPEER_A_ALLOWEDIPS); + if (!allowedips_nest) + goto err; + + list_for_each_entry_from(allowedips_node, &peer->allowedips_list, + peer_list) { + u8 cidr, ip[16] __aligned(__alignof(u64)); + int family; + + family = wg_allowedips_read_node(allowedips_node, ip, &cidr); + if (get_allowedips(skb, ip, cidr, family)) { + nla_nest_end(skb, allowedips_nest); + nla_nest_end(skb, peer_nest); + ctx->next_allowedip = allowedips_node; + return -EMSGSIZE; + } + } + nla_nest_end(skb, allowedips_nest); +no_allowedips: + nla_nest_end(skb, peer_nest); + ctx->next_allowedip = NULL; + ctx->allowedips_seq = 0; + return 0; +err: + nla_nest_cancel(skb, peer_nest); + return -EMSGSIZE; +} + +static int wg_get_device_start(struct netlink_callback *cb) +{ + struct nlattr **attrs = genl_family_attrbuf(&genl_family); + struct wg_device *wg; + int ret; + + ret = nlmsg_parse(cb->nlh, GENL_HDRLEN + genl_family.hdrsize, attrs, + genl_family.maxattr, device_policy, NULL); + if (ret < 0) + return ret; + wg = lookup_interface(attrs, cb->skb); + if (IS_ERR(wg)) + return PTR_ERR(wg); + DUMP_CTX(cb)->wg = wg; + return 0; +} + +static int wg_get_device_dump(struct sk_buff *skb, struct netlink_callback *cb) +{ + struct wg_peer *peer, *next_peer_cursor; + struct dump_ctx *ctx = DUMP_CTX(cb); + struct wg_device *wg = ctx->wg; + struct nlattr *peers_nest; + int ret = -EMSGSIZE; + bool done = true; + void *hdr; + + rtnl_lock(); + mutex_lock(&wg->device_update_lock); + cb->seq = wg->device_update_gen; + next_peer_cursor = ctx->next_peer; + + hdr = genlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq, + &genl_family, NLM_F_MULTI, WG_CMD_GET_DEVICE); + if (!hdr) + goto out; + genl_dump_check_consistent(cb, hdr); + + if (!ctx->next_peer) { + if (nla_put_u16(skb, WGDEVICE_A_LISTEN_PORT, + wg->incoming_port) || + nla_put_u32(skb, WGDEVICE_A_FWMARK, wg->fwmark) || + nla_put_u32(skb, WGDEVICE_A_IFINDEX, wg->dev->ifindex) || + nla_put_string(skb, WGDEVICE_A_IFNAME, wg->dev->name)) + goto out; + + down_read(&wg->static_identity.lock); + if (wg->static_identity.has_identity) { + if (nla_put(skb, WGDEVICE_A_PRIVATE_KEY, + NOISE_PUBLIC_KEY_LEN, + wg->static_identity.static_private) || + nla_put(skb, WGDEVICE_A_PUBLIC_KEY, + NOISE_PUBLIC_KEY_LEN, + wg->static_identity.static_public)) { + up_read(&wg->static_identity.lock); + goto out; + } + } + up_read(&wg->static_identity.lock); + } + + peers_nest = nla_nest_start(skb, WGDEVICE_A_PEERS); + if (!peers_nest) + goto out; + ret = 0; + /* If the last cursor was removed via list_del_init in peer_remove, then + * we just treat this the same as there being no more peers left. The + * reason is that seq_nr should indicate to userspace that this isn't a + * coherent dump anyway, so they'll try again. + */ + if (list_empty(&wg->peer_list) || + (ctx->next_peer && list_empty(&ctx->next_peer->peer_list))) { + nla_nest_cancel(skb, peers_nest); + goto out; + } + lockdep_assert_held(&wg->device_update_lock); + peer = list_prepare_entry(ctx->next_peer, &wg->peer_list, peer_list); + list_for_each_entry_continue(peer, &wg->peer_list, peer_list) { + if (get_peer(peer, skb, ctx)) { + done = false; + break; + } + next_peer_cursor = peer; + } + nla_nest_end(skb, peers_nest); + +out: + if (!ret && !done && next_peer_cursor) + wg_peer_get(next_peer_cursor); + wg_peer_put(ctx->next_peer); + mutex_unlock(&wg->device_update_lock); + rtnl_unlock(); + + if (ret) { + genlmsg_cancel(skb, hdr); + return ret; + } + genlmsg_end(skb, hdr); + if (done) { + ctx->next_peer = NULL; + return 0; + } + ctx->next_peer = next_peer_cursor; + return skb->len; + + /* At this point, we can't really deal ourselves with safely zeroing out + * the private key material after usage. This will need an additional API + * in the kernel for marking skbs as zero_on_free. + */ +} + +static int wg_get_device_done(struct netlink_callback *cb) +{ + struct dump_ctx *ctx = DUMP_CTX(cb); + + if (ctx->wg) + dev_put(ctx->wg->dev); + wg_peer_put(ctx->next_peer); + return 0; +} + +static int set_port(struct wg_device *wg, u16 port) +{ + struct wg_peer *peer; + + if (wg->incoming_port == port) + return 0; + list_for_each_entry(peer, &wg->peer_list, peer_list) + wg_socket_clear_peer_endpoint_src(peer); + if (!netif_running(wg->dev)) { + wg->incoming_port = port; + return 0; + } + return wg_socket_init(wg, port); +} + +static int set_allowedip(struct wg_peer *peer, struct nlattr **attrs) +{ + int ret = -EINVAL; + u16 family; + u8 cidr; + + if (!attrs[WGALLOWEDIP_A_FAMILY] || !attrs[WGALLOWEDIP_A_IPADDR] || + !attrs[WGALLOWEDIP_A_CIDR_MASK]) + return ret; + family = nla_get_u16(attrs[WGALLOWEDIP_A_FAMILY]); + cidr = nla_get_u8(attrs[WGALLOWEDIP_A_CIDR_MASK]); + + if (family == AF_INET && cidr <= 32 && + nla_len(attrs[WGALLOWEDIP_A_IPADDR]) == sizeof(struct in_addr)) + ret = wg_allowedips_insert_v4( + &peer->device->peer_allowedips, + nla_data(attrs[WGALLOWEDIP_A_IPADDR]), cidr, peer, + &peer->device->device_update_lock); + else if (family == AF_INET6 && cidr <= 128 && + nla_len(attrs[WGALLOWEDIP_A_IPADDR]) == sizeof(struct in6_addr)) + ret = wg_allowedips_insert_v6( + &peer->device->peer_allowedips, + nla_data(attrs[WGALLOWEDIP_A_IPADDR]), cidr, peer, + &peer->device->device_update_lock); + + return ret; +} + +static int set_peer(struct wg_device *wg, struct nlattr **attrs) +{ + u8 *public_key = NULL, *preshared_key = NULL; + struct wg_peer *peer = NULL; + u32 flags = 0; + int ret; + + ret = -EINVAL; + if (attrs[WGPEER_A_PUBLIC_KEY] && + nla_len(attrs[WGPEER_A_PUBLIC_KEY]) == NOISE_PUBLIC_KEY_LEN) + public_key = nla_data(attrs[WGPEER_A_PUBLIC_KEY]); + else + goto out; + if (attrs[WGPEER_A_PRESHARED_KEY] && + nla_len(attrs[WGPEER_A_PRESHARED_KEY]) == NOISE_SYMMETRIC_KEY_LEN) + preshared_key = nla_data(attrs[WGPEER_A_PRESHARED_KEY]); + + if (attrs[WGPEER_A_FLAGS]) + flags = nla_get_u32(attrs[WGPEER_A_FLAGS]); + ret = -EOPNOTSUPP; + if (flags & ~__WGPEER_F_ALL) + goto out; + + ret = -EPFNOSUPPORT; + if (attrs[WGPEER_A_PROTOCOL_VERSION]) { + if (nla_get_u32(attrs[WGPEER_A_PROTOCOL_VERSION]) != 1) + goto out; + } + + peer = wg_pubkey_hashtable_lookup(wg->peer_hashtable, + nla_data(attrs[WGPEER_A_PUBLIC_KEY])); + ret = 0; + if (!peer) { /* Peer doesn't exist yet. Add a new one. */ + if (flags & (WGPEER_F_REMOVE_ME | WGPEER_F_UPDATE_ONLY)) + goto out; + + /* The peer is new, so there aren't allowed IPs to remove. */ + flags &= ~WGPEER_F_REPLACE_ALLOWEDIPS; + + down_read(&wg->static_identity.lock); + if (wg->static_identity.has_identity && + !memcmp(nla_data(attrs[WGPEER_A_PUBLIC_KEY]), + wg->static_identity.static_public, + NOISE_PUBLIC_KEY_LEN)) { + /* We silently ignore peers that have the same public + * key as the device. The reason we do it silently is + * that we'd like for people to be able to reuse the + * same set of API calls across peers. + */ + up_read(&wg->static_identity.lock); + ret = 0; + goto out; + } + up_read(&wg->static_identity.lock); + + peer = wg_peer_create(wg, public_key, preshared_key); + if (IS_ERR(peer)) { + ret = PTR_ERR(peer); + peer = NULL; + goto out; + } + /* Take additional reference, as though we've just been + * looked up. + */ + wg_peer_get(peer); + } + + if (flags & WGPEER_F_REMOVE_ME) { + wg_peer_remove(peer); + goto out; + } + + if (preshared_key) { + down_write(&peer->handshake.lock); + memcpy(&peer->handshake.preshared_key, preshared_key, + NOISE_SYMMETRIC_KEY_LEN); + up_write(&peer->handshake.lock); + } + + if (attrs[WGPEER_A_ENDPOINT]) { + struct sockaddr *addr = nla_data(attrs[WGPEER_A_ENDPOINT]); + size_t len = nla_len(attrs[WGPEER_A_ENDPOINT]); + + if ((len == sizeof(struct sockaddr_in) && + addr->sa_family == AF_INET) || + (len == sizeof(struct sockaddr_in6) && + addr->sa_family == AF_INET6)) { + struct endpoint endpoint = { { { 0 } } }; + + memcpy(&endpoint.addr, addr, len); + wg_socket_set_peer_endpoint(peer, &endpoint); + } + } + + if (flags & WGPEER_F_REPLACE_ALLOWEDIPS) + wg_allowedips_remove_by_peer(&wg->peer_allowedips, peer, + &wg->device_update_lock); + + if (attrs[WGPEER_A_ALLOWEDIPS]) { + struct nlattr *attr, *allowedip[WGALLOWEDIP_A_MAX + 1]; + int rem; + + nla_for_each_nested(attr, attrs[WGPEER_A_ALLOWEDIPS], rem) { + ret = nla_parse_nested(allowedip, WGALLOWEDIP_A_MAX, + attr, allowedip_policy, NULL); + if (ret < 0) + goto out; + ret = set_allowedip(peer, allowedip); + if (ret < 0) + goto out; + } + } + + if (attrs[WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL]) { + const u16 persistent_keepalive_interval = nla_get_u16( + attrs[WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL]); + const bool send_keepalive = + !peer->persistent_keepalive_interval && + persistent_keepalive_interval && + netif_running(wg->dev); + + peer->persistent_keepalive_interval = persistent_keepalive_interval; + if (send_keepalive) + wg_packet_send_keepalive(peer); + } + + if (netif_running(wg->dev)) + wg_packet_send_staged_packets(peer); + +out: + wg_peer_put(peer); + if (attrs[WGPEER_A_PRESHARED_KEY]) + memzero_explicit(nla_data(attrs[WGPEER_A_PRESHARED_KEY]), + nla_len(attrs[WGPEER_A_PRESHARED_KEY])); + return ret; +} + +static int wg_set_device(struct sk_buff *skb, struct genl_info *info) +{ + struct wg_device *wg = lookup_interface(info->attrs, skb); + u32 flags = 0; + int ret; + + if (IS_ERR(wg)) { + ret = PTR_ERR(wg); + goto out_nodev; + } + + rtnl_lock(); + mutex_lock(&wg->device_update_lock); + + if (info->attrs[WGDEVICE_A_FLAGS]) + flags = nla_get_u32(info->attrs[WGDEVICE_A_FLAGS]); + ret = -EOPNOTSUPP; + if (flags & ~__WGDEVICE_F_ALL) + goto out; + + if (info->attrs[WGDEVICE_A_LISTEN_PORT] || info->attrs[WGDEVICE_A_FWMARK]) { + struct net *net; + rcu_read_lock(); + net = rcu_dereference(wg->creating_net); + ret = !net || !ns_capable(net->user_ns, CAP_NET_ADMIN) ? -EPERM : 0; + rcu_read_unlock(); + if (ret) + goto out; + } + + ++wg->device_update_gen; + + if (info->attrs[WGDEVICE_A_FWMARK]) { + struct wg_peer *peer; + + wg->fwmark = nla_get_u32(info->attrs[WGDEVICE_A_FWMARK]); + list_for_each_entry(peer, &wg->peer_list, peer_list) + wg_socket_clear_peer_endpoint_src(peer); + } + + if (info->attrs[WGDEVICE_A_LISTEN_PORT]) { + ret = set_port(wg, + nla_get_u16(info->attrs[WGDEVICE_A_LISTEN_PORT])); + if (ret) + goto out; + } + + if (flags & WGDEVICE_F_REPLACE_PEERS) + wg_peer_remove_all(wg); + + if (info->attrs[WGDEVICE_A_PRIVATE_KEY] && + nla_len(info->attrs[WGDEVICE_A_PRIVATE_KEY]) == + NOISE_PUBLIC_KEY_LEN) { + u8 *private_key = nla_data(info->attrs[WGDEVICE_A_PRIVATE_KEY]); + u8 public_key[NOISE_PUBLIC_KEY_LEN]; + struct wg_peer *peer, *temp; + + if (!crypto_memneq(wg->static_identity.static_private, + private_key, NOISE_PUBLIC_KEY_LEN)) + goto skip_set_private_key; + + /* We remove before setting, to prevent race, which means doing + * two 25519-genpub ops. + */ + if (curve25519_generate_public(public_key, private_key)) { + peer = wg_pubkey_hashtable_lookup(wg->peer_hashtable, + public_key); + if (peer) { + wg_peer_put(peer); + wg_peer_remove(peer); + } + } + + down_write(&wg->static_identity.lock); + wg_noise_set_static_identity_private_key(&wg->static_identity, + private_key); + list_for_each_entry_safe(peer, temp, &wg->peer_list, + peer_list) { + wg_noise_precompute_static_static(peer); + wg_noise_expire_current_peer_keypairs(peer); + } + wg_cookie_checker_precompute_device_keys(&wg->cookie_checker); + up_write(&wg->static_identity.lock); + } +skip_set_private_key: + + if (info->attrs[WGDEVICE_A_PEERS]) { + struct nlattr *attr, *peer[WGPEER_A_MAX + 1]; + int rem; + + nla_for_each_nested(attr, info->attrs[WGDEVICE_A_PEERS], rem) { + ret = nla_parse_nested(peer, WGPEER_A_MAX, attr, + peer_policy, NULL); + if (ret < 0) + goto out; + ret = set_peer(wg, peer); + if (ret < 0) + goto out; + } + } + ret = 0; + +out: + mutex_unlock(&wg->device_update_lock); + rtnl_unlock(); + dev_put(wg->dev); +out_nodev: + if (info->attrs[WGDEVICE_A_PRIVATE_KEY]) + memzero_explicit(nla_data(info->attrs[WGDEVICE_A_PRIVATE_KEY]), + nla_len(info->attrs[WGDEVICE_A_PRIVATE_KEY])); + return ret; +} + +static const struct genl_ops genl_ops[] = { + { + .cmd = WG_CMD_GET_DEVICE, + .start = wg_get_device_start, + .dumpit = wg_get_device_dump, + .done = wg_get_device_done, + .flags = GENL_UNS_ADMIN_PERM, + .policy = device_policy + }, { + .cmd = WG_CMD_SET_DEVICE, + .doit = wg_set_device, + .flags = GENL_UNS_ADMIN_PERM, + .policy = device_policy + } +}; + +static struct genl_family genl_family __ro_after_init = { + .ops = genl_ops, + .n_ops = ARRAY_SIZE(genl_ops), + .name = WG_GENL_NAME, + .version = WG_GENL_VERSION, + .maxattr = WGDEVICE_A_MAX, + .module = THIS_MODULE, + .netnsok = true +}; + +int __init wg_genetlink_init(void) +{ + return genl_register_family(&genl_family); +} + +void __exit wg_genetlink_uninit(void) +{ + genl_unregister_family(&genl_family); +} diff --git a/drivers/net/wireguard/netlink.h b/drivers/net/wireguard/netlink.h new file mode 100644 index 000000000000..15100d92e2e3 --- /dev/null +++ b/drivers/net/wireguard/netlink.h @@ -0,0 +1,12 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#ifndef _WG_NETLINK_H +#define _WG_NETLINK_H + +int wg_genetlink_init(void); +void wg_genetlink_uninit(void); + +#endif /* _WG_NETLINK_H */ diff --git a/drivers/net/wireguard/noise.c b/drivers/net/wireguard/noise.c new file mode 100644 index 000000000000..27cb5045bed2 --- /dev/null +++ b/drivers/net/wireguard/noise.c @@ -0,0 +1,828 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#include "noise.h" +#include "device.h" +#include "peer.h" +#include "messages.h" +#include "queueing.h" +#include "peerlookup.h" + +#include +#include +#include +#include +#include +#include + +/* This implements Noise_IKpsk2: + * + * <- s + * ****** + * -> e, es, s, ss, {t} + * <- e, ee, se, psk, {} + */ + +static const u8 handshake_name[37] = "Noise_IKpsk2_25519_ChaChaPoly_BLAKE2s"; +static const u8 identifier_name[34] = "WireGuard v1 zx2c4 Jason@zx2c4.com"; +static u8 handshake_init_hash[NOISE_HASH_LEN] __ro_after_init; +static u8 handshake_init_chaining_key[NOISE_HASH_LEN] __ro_after_init; +static atomic64_t keypair_counter = ATOMIC64_INIT(0); + +void __init wg_noise_init(void) +{ + struct blake2s_state blake; + + blake2s(handshake_init_chaining_key, handshake_name, NULL, + NOISE_HASH_LEN, sizeof(handshake_name), 0); + blake2s_init(&blake, NOISE_HASH_LEN); + blake2s_update(&blake, handshake_init_chaining_key, NOISE_HASH_LEN); + blake2s_update(&blake, identifier_name, sizeof(identifier_name)); + blake2s_final(&blake, handshake_init_hash); +} + +/* Must hold peer->handshake.static_identity->lock */ +void wg_noise_precompute_static_static(struct wg_peer *peer) +{ + down_write(&peer->handshake.lock); + if (!peer->handshake.static_identity->has_identity || + !curve25519(peer->handshake.precomputed_static_static, + peer->handshake.static_identity->static_private, + peer->handshake.remote_static)) + memset(peer->handshake.precomputed_static_static, 0, + NOISE_PUBLIC_KEY_LEN); + up_write(&peer->handshake.lock); +} + +void wg_noise_handshake_init(struct noise_handshake *handshake, + struct noise_static_identity *static_identity, + const u8 peer_public_key[NOISE_PUBLIC_KEY_LEN], + const u8 peer_preshared_key[NOISE_SYMMETRIC_KEY_LEN], + struct wg_peer *peer) +{ + memset(handshake, 0, sizeof(*handshake)); + init_rwsem(&handshake->lock); + handshake->entry.type = INDEX_HASHTABLE_HANDSHAKE; + handshake->entry.peer = peer; + memcpy(handshake->remote_static, peer_public_key, NOISE_PUBLIC_KEY_LEN); + if (peer_preshared_key) + memcpy(handshake->preshared_key, peer_preshared_key, + NOISE_SYMMETRIC_KEY_LEN); + handshake->static_identity = static_identity; + handshake->state = HANDSHAKE_ZEROED; + wg_noise_precompute_static_static(peer); +} + +static void handshake_zero(struct noise_handshake *handshake) +{ + memset(&handshake->ephemeral_private, 0, NOISE_PUBLIC_KEY_LEN); + memset(&handshake->remote_ephemeral, 0, NOISE_PUBLIC_KEY_LEN); + memset(&handshake->hash, 0, NOISE_HASH_LEN); + memset(&handshake->chaining_key, 0, NOISE_HASH_LEN); + handshake->remote_index = 0; + handshake->state = HANDSHAKE_ZEROED; +} + +void wg_noise_handshake_clear(struct noise_handshake *handshake) +{ + down_write(&handshake->lock); + wg_index_hashtable_remove( + handshake->entry.peer->device->index_hashtable, + &handshake->entry); + handshake_zero(handshake); + up_write(&handshake->lock); +} + +static struct noise_keypair *keypair_create(struct wg_peer *peer) +{ + struct noise_keypair *keypair = kzalloc(sizeof(*keypair), GFP_KERNEL); + + if (unlikely(!keypair)) + return NULL; + spin_lock_init(&keypair->receiving_counter.lock); + keypair->internal_id = atomic64_inc_return(&keypair_counter); + keypair->entry.type = INDEX_HASHTABLE_KEYPAIR; + keypair->entry.peer = peer; + kref_init(&keypair->refcount); + return keypair; +} + +static void keypair_free_rcu(struct rcu_head *rcu) +{ + kzfree(container_of(rcu, struct noise_keypair, rcu)); +} + +static void keypair_free_kref(struct kref *kref) +{ + struct noise_keypair *keypair = + container_of(kref, struct noise_keypair, refcount); + + net_dbg_ratelimited("%s: Keypair %llu destroyed for peer %llu\n", + keypair->entry.peer->device->dev->name, + keypair->internal_id, + keypair->entry.peer->internal_id); + wg_index_hashtable_remove(keypair->entry.peer->device->index_hashtable, + &keypair->entry); + call_rcu(&keypair->rcu, keypair_free_rcu); +} + +void wg_noise_keypair_put(struct noise_keypair *keypair, bool unreference_now) +{ + if (unlikely(!keypair)) + return; + if (unlikely(unreference_now)) + wg_index_hashtable_remove( + keypair->entry.peer->device->index_hashtable, + &keypair->entry); + kref_put(&keypair->refcount, keypair_free_kref); +} + +struct noise_keypair *wg_noise_keypair_get(struct noise_keypair *keypair) +{ + RCU_LOCKDEP_WARN(!rcu_read_lock_bh_held(), + "Taking noise keypair reference without holding the RCU BH read lock"); + if (unlikely(!keypair || !kref_get_unless_zero(&keypair->refcount))) + return NULL; + return keypair; +} + +void wg_noise_keypairs_clear(struct noise_keypairs *keypairs) +{ + struct noise_keypair *old; + + spin_lock_bh(&keypairs->keypair_update_lock); + + /* We zero the next_keypair before zeroing the others, so that + * wg_noise_received_with_keypair returns early before subsequent ones + * are zeroed. + */ + old = rcu_dereference_protected(keypairs->next_keypair, + lockdep_is_held(&keypairs->keypair_update_lock)); + RCU_INIT_POINTER(keypairs->next_keypair, NULL); + wg_noise_keypair_put(old, true); + + old = rcu_dereference_protected(keypairs->previous_keypair, + lockdep_is_held(&keypairs->keypair_update_lock)); + RCU_INIT_POINTER(keypairs->previous_keypair, NULL); + wg_noise_keypair_put(old, true); + + old = rcu_dereference_protected(keypairs->current_keypair, + lockdep_is_held(&keypairs->keypair_update_lock)); + RCU_INIT_POINTER(keypairs->current_keypair, NULL); + wg_noise_keypair_put(old, true); + + spin_unlock_bh(&keypairs->keypair_update_lock); +} + +void wg_noise_expire_current_peer_keypairs(struct wg_peer *peer) +{ + struct noise_keypair *keypair; + + wg_noise_handshake_clear(&peer->handshake); + wg_noise_reset_last_sent_handshake(&peer->last_sent_handshake); + + spin_lock_bh(&peer->keypairs.keypair_update_lock); + keypair = rcu_dereference_protected(peer->keypairs.next_keypair, + lockdep_is_held(&peer->keypairs.keypair_update_lock)); + if (keypair) + keypair->sending.is_valid = false; + keypair = rcu_dereference_protected(peer->keypairs.current_keypair, + lockdep_is_held(&peer->keypairs.keypair_update_lock)); + if (keypair) + keypair->sending.is_valid = false; + spin_unlock_bh(&peer->keypairs.keypair_update_lock); +} + +static void add_new_keypair(struct noise_keypairs *keypairs, + struct noise_keypair *new_keypair) +{ + struct noise_keypair *previous_keypair, *next_keypair, *current_keypair; + + spin_lock_bh(&keypairs->keypair_update_lock); + previous_keypair = rcu_dereference_protected(keypairs->previous_keypair, + lockdep_is_held(&keypairs->keypair_update_lock)); + next_keypair = rcu_dereference_protected(keypairs->next_keypair, + lockdep_is_held(&keypairs->keypair_update_lock)); + current_keypair = rcu_dereference_protected(keypairs->current_keypair, + lockdep_is_held(&keypairs->keypair_update_lock)); + if (new_keypair->i_am_the_initiator) { + /* If we're the initiator, it means we've sent a handshake, and + * received a confirmation response, which means this new + * keypair can now be used. + */ + if (next_keypair) { + /* If there already was a next keypair pending, we + * demote it to be the previous keypair, and free the + * existing current. Note that this means KCI can result + * in this transition. It would perhaps be more sound to + * always just get rid of the unused next keypair + * instead of putting it in the previous slot, but this + * might be a bit less robust. Something to think about + * for the future. + */ + RCU_INIT_POINTER(keypairs->next_keypair, NULL); + rcu_assign_pointer(keypairs->previous_keypair, + next_keypair); + wg_noise_keypair_put(current_keypair, true); + } else /* If there wasn't an existing next keypair, we replace + * the previous with the current one. + */ + rcu_assign_pointer(keypairs->previous_keypair, + current_keypair); + /* At this point we can get rid of the old previous keypair, and + * set up the new keypair. + */ + wg_noise_keypair_put(previous_keypair, true); + rcu_assign_pointer(keypairs->current_keypair, new_keypair); + } else { + /* If we're the responder, it means we can't use the new keypair + * until we receive confirmation via the first data packet, so + * we get rid of the existing previous one, the possibly + * existing next one, and slide in the new next one. + */ + rcu_assign_pointer(keypairs->next_keypair, new_keypair); + wg_noise_keypair_put(next_keypair, true); + RCU_INIT_POINTER(keypairs->previous_keypair, NULL); + wg_noise_keypair_put(previous_keypair, true); + } + spin_unlock_bh(&keypairs->keypair_update_lock); +} + +bool wg_noise_received_with_keypair(struct noise_keypairs *keypairs, + struct noise_keypair *received_keypair) +{ + struct noise_keypair *old_keypair; + bool key_is_new; + + /* We first check without taking the spinlock. */ + key_is_new = received_keypair == + rcu_access_pointer(keypairs->next_keypair); + if (likely(!key_is_new)) + return false; + + spin_lock_bh(&keypairs->keypair_update_lock); + /* After locking, we double check that things didn't change from + * beneath us. + */ + if (unlikely(received_keypair != + rcu_dereference_protected(keypairs->next_keypair, + lockdep_is_held(&keypairs->keypair_update_lock)))) { + spin_unlock_bh(&keypairs->keypair_update_lock); + return false; + } + + /* When we've finally received the confirmation, we slide the next + * into the current, the current into the previous, and get rid of + * the old previous. + */ + old_keypair = rcu_dereference_protected(keypairs->previous_keypair, + lockdep_is_held(&keypairs->keypair_update_lock)); + rcu_assign_pointer(keypairs->previous_keypair, + rcu_dereference_protected(keypairs->current_keypair, + lockdep_is_held(&keypairs->keypair_update_lock))); + wg_noise_keypair_put(old_keypair, true); + rcu_assign_pointer(keypairs->current_keypair, received_keypair); + RCU_INIT_POINTER(keypairs->next_keypair, NULL); + + spin_unlock_bh(&keypairs->keypair_update_lock); + return true; +} + +/* Must hold static_identity->lock */ +void wg_noise_set_static_identity_private_key( + struct noise_static_identity *static_identity, + const u8 private_key[NOISE_PUBLIC_KEY_LEN]) +{ + memcpy(static_identity->static_private, private_key, + NOISE_PUBLIC_KEY_LEN); + curve25519_clamp_secret(static_identity->static_private); + static_identity->has_identity = curve25519_generate_public( + static_identity->static_public, private_key); +} + +/* This is Hugo Krawczyk's HKDF: + * - https://eprint.iacr.org/2010/264.pdf + * - https://tools.ietf.org/html/rfc5869 + */ +static void kdf(u8 *first_dst, u8 *second_dst, u8 *third_dst, const u8 *data, + size_t first_len, size_t second_len, size_t third_len, + size_t data_len, const u8 chaining_key[NOISE_HASH_LEN]) +{ + u8 output[BLAKE2S_HASH_SIZE + 1]; + u8 secret[BLAKE2S_HASH_SIZE]; + + WARN_ON(IS_ENABLED(DEBUG) && + (first_len > BLAKE2S_HASH_SIZE || + second_len > BLAKE2S_HASH_SIZE || + third_len > BLAKE2S_HASH_SIZE || + ((second_len || second_dst || third_len || third_dst) && + (!first_len || !first_dst)) || + ((third_len || third_dst) && (!second_len || !second_dst)))); + + /* Extract entropy from data into secret */ + blake2s256_hmac(secret, data, chaining_key, data_len, NOISE_HASH_LEN); + + if (!first_dst || !first_len) + goto out; + + /* Expand first key: key = secret, data = 0x1 */ + output[0] = 1; + blake2s256_hmac(output, output, secret, 1, BLAKE2S_HASH_SIZE); + memcpy(first_dst, output, first_len); + + if (!second_dst || !second_len) + goto out; + + /* Expand second key: key = secret, data = first-key || 0x2 */ + output[BLAKE2S_HASH_SIZE] = 2; + blake2s256_hmac(output, output, secret, BLAKE2S_HASH_SIZE + 1, + BLAKE2S_HASH_SIZE); + memcpy(second_dst, output, second_len); + + if (!third_dst || !third_len) + goto out; + + /* Expand third key: key = secret, data = second-key || 0x3 */ + output[BLAKE2S_HASH_SIZE] = 3; + blake2s256_hmac(output, output, secret, BLAKE2S_HASH_SIZE + 1, + BLAKE2S_HASH_SIZE); + memcpy(third_dst, output, third_len); + +out: + /* Clear sensitive data from stack */ + memzero_explicit(secret, BLAKE2S_HASH_SIZE); + memzero_explicit(output, BLAKE2S_HASH_SIZE + 1); +} + +static void derive_keys(struct noise_symmetric_key *first_dst, + struct noise_symmetric_key *second_dst, + const u8 chaining_key[NOISE_HASH_LEN]) +{ + u64 birthdate = ktime_get_coarse_boottime_ns(); + kdf(first_dst->key, second_dst->key, NULL, NULL, + NOISE_SYMMETRIC_KEY_LEN, NOISE_SYMMETRIC_KEY_LEN, 0, 0, + chaining_key); + first_dst->birthdate = second_dst->birthdate = birthdate; + first_dst->is_valid = second_dst->is_valid = true; +} + +static bool __must_check mix_dh(u8 chaining_key[NOISE_HASH_LEN], + u8 key[NOISE_SYMMETRIC_KEY_LEN], + const u8 private[NOISE_PUBLIC_KEY_LEN], + const u8 public[NOISE_PUBLIC_KEY_LEN]) +{ + u8 dh_calculation[NOISE_PUBLIC_KEY_LEN]; + + if (unlikely(!curve25519(dh_calculation, private, public))) + return false; + kdf(chaining_key, key, NULL, dh_calculation, NOISE_HASH_LEN, + NOISE_SYMMETRIC_KEY_LEN, 0, NOISE_PUBLIC_KEY_LEN, chaining_key); + memzero_explicit(dh_calculation, NOISE_PUBLIC_KEY_LEN); + return true; +} + +static bool __must_check mix_precomputed_dh(u8 chaining_key[NOISE_HASH_LEN], + u8 key[NOISE_SYMMETRIC_KEY_LEN], + const u8 precomputed[NOISE_PUBLIC_KEY_LEN]) +{ + static u8 zero_point[NOISE_PUBLIC_KEY_LEN]; + if (unlikely(!crypto_memneq(precomputed, zero_point, NOISE_PUBLIC_KEY_LEN))) + return false; + kdf(chaining_key, key, NULL, precomputed, NOISE_HASH_LEN, + NOISE_SYMMETRIC_KEY_LEN, 0, NOISE_PUBLIC_KEY_LEN, + chaining_key); + return true; +} + +static void mix_hash(u8 hash[NOISE_HASH_LEN], const u8 *src, size_t src_len) +{ + struct blake2s_state blake; + + blake2s_init(&blake, NOISE_HASH_LEN); + blake2s_update(&blake, hash, NOISE_HASH_LEN); + blake2s_update(&blake, src, src_len); + blake2s_final(&blake, hash); +} + +static void mix_psk(u8 chaining_key[NOISE_HASH_LEN], u8 hash[NOISE_HASH_LEN], + u8 key[NOISE_SYMMETRIC_KEY_LEN], + const u8 psk[NOISE_SYMMETRIC_KEY_LEN]) +{ + u8 temp_hash[NOISE_HASH_LEN]; + + kdf(chaining_key, temp_hash, key, psk, NOISE_HASH_LEN, NOISE_HASH_LEN, + NOISE_SYMMETRIC_KEY_LEN, NOISE_SYMMETRIC_KEY_LEN, chaining_key); + mix_hash(hash, temp_hash, NOISE_HASH_LEN); + memzero_explicit(temp_hash, NOISE_HASH_LEN); +} + +static void handshake_init(u8 chaining_key[NOISE_HASH_LEN], + u8 hash[NOISE_HASH_LEN], + const u8 remote_static[NOISE_PUBLIC_KEY_LEN]) +{ + memcpy(hash, handshake_init_hash, NOISE_HASH_LEN); + memcpy(chaining_key, handshake_init_chaining_key, NOISE_HASH_LEN); + mix_hash(hash, remote_static, NOISE_PUBLIC_KEY_LEN); +} + +static void message_encrypt(u8 *dst_ciphertext, const u8 *src_plaintext, + size_t src_len, u8 key[NOISE_SYMMETRIC_KEY_LEN], + u8 hash[NOISE_HASH_LEN]) +{ + chacha20poly1305_encrypt(dst_ciphertext, src_plaintext, src_len, hash, + NOISE_HASH_LEN, + 0 /* Always zero for Noise_IK */, key); + mix_hash(hash, dst_ciphertext, noise_encrypted_len(src_len)); +} + +static bool message_decrypt(u8 *dst_plaintext, const u8 *src_ciphertext, + size_t src_len, u8 key[NOISE_SYMMETRIC_KEY_LEN], + u8 hash[NOISE_HASH_LEN]) +{ + if (!chacha20poly1305_decrypt(dst_plaintext, src_ciphertext, src_len, + hash, NOISE_HASH_LEN, + 0 /* Always zero for Noise_IK */, key)) + return false; + mix_hash(hash, src_ciphertext, src_len); + return true; +} + +static void message_ephemeral(u8 ephemeral_dst[NOISE_PUBLIC_KEY_LEN], + const u8 ephemeral_src[NOISE_PUBLIC_KEY_LEN], + u8 chaining_key[NOISE_HASH_LEN], + u8 hash[NOISE_HASH_LEN]) +{ + if (ephemeral_dst != ephemeral_src) + memcpy(ephemeral_dst, ephemeral_src, NOISE_PUBLIC_KEY_LEN); + mix_hash(hash, ephemeral_src, NOISE_PUBLIC_KEY_LEN); + kdf(chaining_key, NULL, NULL, ephemeral_src, NOISE_HASH_LEN, 0, 0, + NOISE_PUBLIC_KEY_LEN, chaining_key); +} + +static void tai64n_now(u8 output[NOISE_TIMESTAMP_LEN]) +{ + struct timespec64 now; + + ktime_get_real_ts64(&now); + + /* In order to prevent some sort of infoleak from precise timers, we + * round down the nanoseconds part to the closest rounded-down power of + * two to the maximum initiations per second allowed anyway by the + * implementation. + */ + now.tv_nsec = ALIGN_DOWN(now.tv_nsec, + rounddown_pow_of_two(NSEC_PER_SEC / INITIATIONS_PER_SECOND)); + + /* https://cr.yp.to/libtai/tai64.html */ + *(__be64 *)output = cpu_to_be64(0x400000000000000aULL + now.tv_sec); + *(__be32 *)(output + sizeof(__be64)) = cpu_to_be32(now.tv_nsec); +} + +bool +wg_noise_handshake_create_initiation(struct message_handshake_initiation *dst, + struct noise_handshake *handshake) +{ + u8 timestamp[NOISE_TIMESTAMP_LEN]; + u8 key[NOISE_SYMMETRIC_KEY_LEN]; + bool ret = false; + + /* We need to wait for crng _before_ taking any locks, since + * curve25519_generate_secret uses get_random_bytes_wait. + */ + wait_for_random_bytes(); + + down_read(&handshake->static_identity->lock); + down_write(&handshake->lock); + + if (unlikely(!handshake->static_identity->has_identity)) + goto out; + + dst->header.type = cpu_to_le32(MESSAGE_HANDSHAKE_INITIATION); + + handshake_init(handshake->chaining_key, handshake->hash, + handshake->remote_static); + + /* e */ + curve25519_generate_secret(handshake->ephemeral_private); + if (!curve25519_generate_public(dst->unencrypted_ephemeral, + handshake->ephemeral_private)) + goto out; + message_ephemeral(dst->unencrypted_ephemeral, + dst->unencrypted_ephemeral, handshake->chaining_key, + handshake->hash); + + /* es */ + if (!mix_dh(handshake->chaining_key, key, handshake->ephemeral_private, + handshake->remote_static)) + goto out; + + /* s */ + message_encrypt(dst->encrypted_static, + handshake->static_identity->static_public, + NOISE_PUBLIC_KEY_LEN, key, handshake->hash); + + /* ss */ + if (!mix_precomputed_dh(handshake->chaining_key, key, + handshake->precomputed_static_static)) + goto out; + + /* {t} */ + tai64n_now(timestamp); + message_encrypt(dst->encrypted_timestamp, timestamp, + NOISE_TIMESTAMP_LEN, key, handshake->hash); + + dst->sender_index = wg_index_hashtable_insert( + handshake->entry.peer->device->index_hashtable, + &handshake->entry); + + handshake->state = HANDSHAKE_CREATED_INITIATION; + ret = true; + +out: + up_write(&handshake->lock); + up_read(&handshake->static_identity->lock); + memzero_explicit(key, NOISE_SYMMETRIC_KEY_LEN); + return ret; +} + +struct wg_peer * +wg_noise_handshake_consume_initiation(struct message_handshake_initiation *src, + struct wg_device *wg) +{ + struct wg_peer *peer = NULL, *ret_peer = NULL; + struct noise_handshake *handshake; + bool replay_attack, flood_attack; + u8 key[NOISE_SYMMETRIC_KEY_LEN]; + u8 chaining_key[NOISE_HASH_LEN]; + u8 hash[NOISE_HASH_LEN]; + u8 s[NOISE_PUBLIC_KEY_LEN]; + u8 e[NOISE_PUBLIC_KEY_LEN]; + u8 t[NOISE_TIMESTAMP_LEN]; + u64 initiation_consumption; + + down_read(&wg->static_identity.lock); + if (unlikely(!wg->static_identity.has_identity)) + goto out; + + handshake_init(chaining_key, hash, wg->static_identity.static_public); + + /* e */ + message_ephemeral(e, src->unencrypted_ephemeral, chaining_key, hash); + + /* es */ + if (!mix_dh(chaining_key, key, wg->static_identity.static_private, e)) + goto out; + + /* s */ + if (!message_decrypt(s, src->encrypted_static, + sizeof(src->encrypted_static), key, hash)) + goto out; + + /* Lookup which peer we're actually talking to */ + peer = wg_pubkey_hashtable_lookup(wg->peer_hashtable, s); + if (!peer) + goto out; + handshake = &peer->handshake; + + /* ss */ + if (!mix_precomputed_dh(chaining_key, key, + handshake->precomputed_static_static)) + goto out; + + /* {t} */ + if (!message_decrypt(t, src->encrypted_timestamp, + sizeof(src->encrypted_timestamp), key, hash)) + goto out; + + down_read(&handshake->lock); + replay_attack = memcmp(t, handshake->latest_timestamp, + NOISE_TIMESTAMP_LEN) <= 0; + flood_attack = (s64)handshake->last_initiation_consumption + + NSEC_PER_SEC / INITIATIONS_PER_SECOND > + (s64)ktime_get_coarse_boottime_ns(); + up_read(&handshake->lock); + if (replay_attack || flood_attack) + goto out; + + /* Success! Copy everything to peer */ + down_write(&handshake->lock); + memcpy(handshake->remote_ephemeral, e, NOISE_PUBLIC_KEY_LEN); + if (memcmp(t, handshake->latest_timestamp, NOISE_TIMESTAMP_LEN) > 0) + memcpy(handshake->latest_timestamp, t, NOISE_TIMESTAMP_LEN); + memcpy(handshake->hash, hash, NOISE_HASH_LEN); + memcpy(handshake->chaining_key, chaining_key, NOISE_HASH_LEN); + handshake->remote_index = src->sender_index; + initiation_consumption = ktime_get_coarse_boottime_ns(); + if ((s64)(handshake->last_initiation_consumption - initiation_consumption) < 0) + handshake->last_initiation_consumption = initiation_consumption; + handshake->state = HANDSHAKE_CONSUMED_INITIATION; + up_write(&handshake->lock); + ret_peer = peer; + +out: + memzero_explicit(key, NOISE_SYMMETRIC_KEY_LEN); + memzero_explicit(hash, NOISE_HASH_LEN); + memzero_explicit(chaining_key, NOISE_HASH_LEN); + up_read(&wg->static_identity.lock); + if (!ret_peer) + wg_peer_put(peer); + return ret_peer; +} + +bool wg_noise_handshake_create_response(struct message_handshake_response *dst, + struct noise_handshake *handshake) +{ + u8 key[NOISE_SYMMETRIC_KEY_LEN]; + bool ret = false; + + /* We need to wait for crng _before_ taking any locks, since + * curve25519_generate_secret uses get_random_bytes_wait. + */ + wait_for_random_bytes(); + + down_read(&handshake->static_identity->lock); + down_write(&handshake->lock); + + if (handshake->state != HANDSHAKE_CONSUMED_INITIATION) + goto out; + + dst->header.type = cpu_to_le32(MESSAGE_HANDSHAKE_RESPONSE); + dst->receiver_index = handshake->remote_index; + + /* e */ + curve25519_generate_secret(handshake->ephemeral_private); + if (!curve25519_generate_public(dst->unencrypted_ephemeral, + handshake->ephemeral_private)) + goto out; + message_ephemeral(dst->unencrypted_ephemeral, + dst->unencrypted_ephemeral, handshake->chaining_key, + handshake->hash); + + /* ee */ + if (!mix_dh(handshake->chaining_key, NULL, handshake->ephemeral_private, + handshake->remote_ephemeral)) + goto out; + + /* se */ + if (!mix_dh(handshake->chaining_key, NULL, handshake->ephemeral_private, + handshake->remote_static)) + goto out; + + /* psk */ + mix_psk(handshake->chaining_key, handshake->hash, key, + handshake->preshared_key); + + /* {} */ + message_encrypt(dst->encrypted_nothing, NULL, 0, key, handshake->hash); + + dst->sender_index = wg_index_hashtable_insert( + handshake->entry.peer->device->index_hashtable, + &handshake->entry); + + handshake->state = HANDSHAKE_CREATED_RESPONSE; + ret = true; + +out: + up_write(&handshake->lock); + up_read(&handshake->static_identity->lock); + memzero_explicit(key, NOISE_SYMMETRIC_KEY_LEN); + return ret; +} + +struct wg_peer * +wg_noise_handshake_consume_response(struct message_handshake_response *src, + struct wg_device *wg) +{ + enum noise_handshake_state state = HANDSHAKE_ZEROED; + struct wg_peer *peer = NULL, *ret_peer = NULL; + struct noise_handshake *handshake; + u8 key[NOISE_SYMMETRIC_KEY_LEN]; + u8 hash[NOISE_HASH_LEN]; + u8 chaining_key[NOISE_HASH_LEN]; + u8 e[NOISE_PUBLIC_KEY_LEN]; + u8 ephemeral_private[NOISE_PUBLIC_KEY_LEN]; + u8 static_private[NOISE_PUBLIC_KEY_LEN]; + u8 preshared_key[NOISE_SYMMETRIC_KEY_LEN]; + + down_read(&wg->static_identity.lock); + + if (unlikely(!wg->static_identity.has_identity)) + goto out; + + handshake = (struct noise_handshake *)wg_index_hashtable_lookup( + wg->index_hashtable, INDEX_HASHTABLE_HANDSHAKE, + src->receiver_index, &peer); + if (unlikely(!handshake)) + goto out; + + down_read(&handshake->lock); + state = handshake->state; + memcpy(hash, handshake->hash, NOISE_HASH_LEN); + memcpy(chaining_key, handshake->chaining_key, NOISE_HASH_LEN); + memcpy(ephemeral_private, handshake->ephemeral_private, + NOISE_PUBLIC_KEY_LEN); + memcpy(preshared_key, handshake->preshared_key, + NOISE_SYMMETRIC_KEY_LEN); + up_read(&handshake->lock); + + if (state != HANDSHAKE_CREATED_INITIATION) + goto fail; + + /* e */ + message_ephemeral(e, src->unencrypted_ephemeral, chaining_key, hash); + + /* ee */ + if (!mix_dh(chaining_key, NULL, ephemeral_private, e)) + goto fail; + + /* se */ + if (!mix_dh(chaining_key, NULL, wg->static_identity.static_private, e)) + goto fail; + + /* psk */ + mix_psk(chaining_key, hash, key, preshared_key); + + /* {} */ + if (!message_decrypt(NULL, src->encrypted_nothing, + sizeof(src->encrypted_nothing), key, hash)) + goto fail; + + /* Success! Copy everything to peer */ + down_write(&handshake->lock); + /* It's important to check that the state is still the same, while we + * have an exclusive lock. + */ + if (handshake->state != state) { + up_write(&handshake->lock); + goto fail; + } + memcpy(handshake->remote_ephemeral, e, NOISE_PUBLIC_KEY_LEN); + memcpy(handshake->hash, hash, NOISE_HASH_LEN); + memcpy(handshake->chaining_key, chaining_key, NOISE_HASH_LEN); + handshake->remote_index = src->sender_index; + handshake->state = HANDSHAKE_CONSUMED_RESPONSE; + up_write(&handshake->lock); + ret_peer = peer; + goto out; + +fail: + wg_peer_put(peer); +out: + memzero_explicit(key, NOISE_SYMMETRIC_KEY_LEN); + memzero_explicit(hash, NOISE_HASH_LEN); + memzero_explicit(chaining_key, NOISE_HASH_LEN); + memzero_explicit(ephemeral_private, NOISE_PUBLIC_KEY_LEN); + memzero_explicit(static_private, NOISE_PUBLIC_KEY_LEN); + memzero_explicit(preshared_key, NOISE_SYMMETRIC_KEY_LEN); + up_read(&wg->static_identity.lock); + return ret_peer; +} + +bool wg_noise_handshake_begin_session(struct noise_handshake *handshake, + struct noise_keypairs *keypairs) +{ + struct noise_keypair *new_keypair; + bool ret = false; + + down_write(&handshake->lock); + if (handshake->state != HANDSHAKE_CREATED_RESPONSE && + handshake->state != HANDSHAKE_CONSUMED_RESPONSE) + goto out; + + new_keypair = keypair_create(handshake->entry.peer); + if (!new_keypair) + goto out; + new_keypair->i_am_the_initiator = handshake->state == + HANDSHAKE_CONSUMED_RESPONSE; + new_keypair->remote_index = handshake->remote_index; + + if (new_keypair->i_am_the_initiator) + derive_keys(&new_keypair->sending, &new_keypair->receiving, + handshake->chaining_key); + else + derive_keys(&new_keypair->receiving, &new_keypair->sending, + handshake->chaining_key); + + handshake_zero(handshake); + rcu_read_lock_bh(); + if (likely(!READ_ONCE(container_of(handshake, struct wg_peer, + handshake)->is_dead))) { + add_new_keypair(keypairs, new_keypair); + net_dbg_ratelimited("%s: Keypair %llu created for peer %llu\n", + handshake->entry.peer->device->dev->name, + new_keypair->internal_id, + handshake->entry.peer->internal_id); + ret = wg_index_hashtable_replace( + handshake->entry.peer->device->index_hashtable, + &handshake->entry, &new_keypair->entry); + } else { + kzfree(new_keypair); + } + rcu_read_unlock_bh(); + +out: + up_write(&handshake->lock); + return ret; +} diff --git a/drivers/net/wireguard/noise.h b/drivers/net/wireguard/noise.h new file mode 100644 index 000000000000..c527253dba80 --- /dev/null +++ b/drivers/net/wireguard/noise.h @@ -0,0 +1,135 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ +#ifndef _WG_NOISE_H +#define _WG_NOISE_H + +#include "messages.h" +#include "peerlookup.h" + +#include +#include +#include +#include +#include +#include + +struct noise_replay_counter { + u64 counter; + spinlock_t lock; + unsigned long backtrack[COUNTER_BITS_TOTAL / BITS_PER_LONG]; +}; + +struct noise_symmetric_key { + u8 key[NOISE_SYMMETRIC_KEY_LEN]; + u64 birthdate; + bool is_valid; +}; + +struct noise_keypair { + struct index_hashtable_entry entry; + struct noise_symmetric_key sending; + atomic64_t sending_counter; + struct noise_symmetric_key receiving; + struct noise_replay_counter receiving_counter; + __le32 remote_index; + bool i_am_the_initiator; + struct kref refcount; + struct rcu_head rcu; + u64 internal_id; +}; + +struct noise_keypairs { + struct noise_keypair __rcu *current_keypair; + struct noise_keypair __rcu *previous_keypair; + struct noise_keypair __rcu *next_keypair; + spinlock_t keypair_update_lock; +}; + +struct noise_static_identity { + u8 static_public[NOISE_PUBLIC_KEY_LEN]; + u8 static_private[NOISE_PUBLIC_KEY_LEN]; + struct rw_semaphore lock; + bool has_identity; +}; + +enum noise_handshake_state { + HANDSHAKE_ZEROED, + HANDSHAKE_CREATED_INITIATION, + HANDSHAKE_CONSUMED_INITIATION, + HANDSHAKE_CREATED_RESPONSE, + HANDSHAKE_CONSUMED_RESPONSE +}; + +struct noise_handshake { + struct index_hashtable_entry entry; + + enum noise_handshake_state state; + u64 last_initiation_consumption; + + struct noise_static_identity *static_identity; + + u8 ephemeral_private[NOISE_PUBLIC_KEY_LEN]; + u8 remote_static[NOISE_PUBLIC_KEY_LEN]; + u8 remote_ephemeral[NOISE_PUBLIC_KEY_LEN]; + u8 precomputed_static_static[NOISE_PUBLIC_KEY_LEN]; + + u8 preshared_key[NOISE_SYMMETRIC_KEY_LEN]; + + u8 hash[NOISE_HASH_LEN]; + u8 chaining_key[NOISE_HASH_LEN]; + + u8 latest_timestamp[NOISE_TIMESTAMP_LEN]; + __le32 remote_index; + + /* Protects all members except the immutable (after noise_handshake_ + * init): remote_static, precomputed_static_static, static_identity. + */ + struct rw_semaphore lock; +}; + +struct wg_device; + +void wg_noise_init(void); +void wg_noise_handshake_init(struct noise_handshake *handshake, + struct noise_static_identity *static_identity, + const u8 peer_public_key[NOISE_PUBLIC_KEY_LEN], + const u8 peer_preshared_key[NOISE_SYMMETRIC_KEY_LEN], + struct wg_peer *peer); +void wg_noise_handshake_clear(struct noise_handshake *handshake); +static inline void wg_noise_reset_last_sent_handshake(atomic64_t *handshake_ns) +{ + atomic64_set(handshake_ns, ktime_get_coarse_boottime_ns() - + (u64)(REKEY_TIMEOUT + 1) * NSEC_PER_SEC); +} + +void wg_noise_keypair_put(struct noise_keypair *keypair, bool unreference_now); +struct noise_keypair *wg_noise_keypair_get(struct noise_keypair *keypair); +void wg_noise_keypairs_clear(struct noise_keypairs *keypairs); +bool wg_noise_received_with_keypair(struct noise_keypairs *keypairs, + struct noise_keypair *received_keypair); +void wg_noise_expire_current_peer_keypairs(struct wg_peer *peer); + +void wg_noise_set_static_identity_private_key( + struct noise_static_identity *static_identity, + const u8 private_key[NOISE_PUBLIC_KEY_LEN]); +void wg_noise_precompute_static_static(struct wg_peer *peer); + +bool +wg_noise_handshake_create_initiation(struct message_handshake_initiation *dst, + struct noise_handshake *handshake); +struct wg_peer * +wg_noise_handshake_consume_initiation(struct message_handshake_initiation *src, + struct wg_device *wg); + +bool wg_noise_handshake_create_response(struct message_handshake_response *dst, + struct noise_handshake *handshake); +struct wg_peer * +wg_noise_handshake_consume_response(struct message_handshake_response *src, + struct wg_device *wg); + +bool wg_noise_handshake_begin_session(struct noise_handshake *handshake, + struct noise_keypairs *keypairs); + +#endif /* _WG_NOISE_H */ diff --git a/drivers/net/wireguard/peer.c b/drivers/net/wireguard/peer.c new file mode 100644 index 000000000000..1d634bd3038f --- /dev/null +++ b/drivers/net/wireguard/peer.c @@ -0,0 +1,237 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#include "peer.h" +#include "device.h" +#include "queueing.h" +#include "timers.h" +#include "peerlookup.h" +#include "noise.h" + +#include +#include +#include +#include + +static atomic64_t peer_counter = ATOMIC64_INIT(0); + +struct wg_peer *wg_peer_create(struct wg_device *wg, + const u8 public_key[NOISE_PUBLIC_KEY_LEN], + const u8 preshared_key[NOISE_SYMMETRIC_KEY_LEN]) +{ + struct wg_peer *peer; + int ret = -ENOMEM; + + lockdep_assert_held(&wg->device_update_lock); + + if (wg->num_peers >= MAX_PEERS_PER_DEVICE) + return ERR_PTR(ret); + + peer = kzalloc(sizeof(*peer), GFP_KERNEL); + if (unlikely(!peer)) + return ERR_PTR(ret); + peer->device = wg; + + wg_noise_handshake_init(&peer->handshake, &wg->static_identity, + public_key, preshared_key, peer); + if (dst_cache_init(&peer->endpoint_cache, GFP_KERNEL)) + goto err_1; + if (wg_packet_queue_init(&peer->tx_queue, wg_packet_tx_worker, false, + MAX_QUEUED_PACKETS)) + goto err_2; + if (wg_packet_queue_init(&peer->rx_queue, NULL, false, + MAX_QUEUED_PACKETS)) + goto err_3; + + peer->internal_id = atomic64_inc_return(&peer_counter); + peer->serial_work_cpu = nr_cpumask_bits; + wg_cookie_init(&peer->latest_cookie); + wg_timers_init(peer); + wg_cookie_checker_precompute_peer_keys(peer); + spin_lock_init(&peer->keypairs.keypair_update_lock); + INIT_WORK(&peer->transmit_handshake_work, + wg_packet_handshake_send_worker); + rwlock_init(&peer->endpoint_lock); + kref_init(&peer->refcount); + skb_queue_head_init(&peer->staged_packet_queue); + wg_noise_reset_last_sent_handshake(&peer->last_sent_handshake); + set_bit(NAPI_STATE_NO_BUSY_POLL, &peer->napi.state); + netif_napi_add(wg->dev, &peer->napi, wg_packet_rx_poll, + NAPI_POLL_WEIGHT); + napi_enable(&peer->napi); + list_add_tail(&peer->peer_list, &wg->peer_list); + INIT_LIST_HEAD(&peer->allowedips_list); + wg_pubkey_hashtable_add(wg->peer_hashtable, peer); + ++wg->num_peers; + pr_debug("%s: Peer %llu created\n", wg->dev->name, peer->internal_id); + return peer; + +err_3: + wg_packet_queue_free(&peer->tx_queue, false); +err_2: + dst_cache_destroy(&peer->endpoint_cache); +err_1: + kfree(peer); + return ERR_PTR(ret); +} + +struct wg_peer *wg_peer_get_maybe_zero(struct wg_peer *peer) +{ + RCU_LOCKDEP_WARN(!rcu_read_lock_bh_held(), + "Taking peer reference without holding the RCU read lock"); + if (unlikely(!peer || !kref_get_unless_zero(&peer->refcount))) + return NULL; + return peer; +} + +static void peer_make_dead(struct wg_peer *peer) +{ + /* Remove from configuration-time lookup structures. */ + list_del_init(&peer->peer_list); + wg_allowedips_remove_by_peer(&peer->device->peer_allowedips, peer, + &peer->device->device_update_lock); + wg_pubkey_hashtable_remove(peer->device->peer_hashtable, peer); + + /* Mark as dead, so that we don't allow jumping contexts after. */ + WRITE_ONCE(peer->is_dead, true); + + /* The caller must now synchronize_rcu() for this to take effect. */ +} + +static void peer_remove_after_dead(struct wg_peer *peer) +{ + WARN_ON(!peer->is_dead); + + /* No more keypairs can be created for this peer, since is_dead protects + * add_new_keypair, so we can now destroy existing ones. + */ + wg_noise_keypairs_clear(&peer->keypairs); + + /* Destroy all ongoing timers that were in-flight at the beginning of + * this function. + */ + wg_timers_stop(peer); + + /* The transition between packet encryption/decryption queues isn't + * guarded by is_dead, but each reference's life is strictly bounded by + * two generations: once for parallel crypto and once for serial + * ingestion, so we can simply flush twice, and be sure that we no + * longer have references inside these queues. + */ + + /* a) For encrypt/decrypt. */ + flush_workqueue(peer->device->packet_crypt_wq); + /* b.1) For send (but not receive, since that's napi). */ + flush_workqueue(peer->device->packet_crypt_wq); + /* b.2.1) For receive (but not send, since that's wq). */ + napi_disable(&peer->napi); + /* b.2.1) It's now safe to remove the napi struct, which must be done + * here from process context. + */ + netif_napi_del(&peer->napi); + + /* Ensure any workstructs we own (like transmit_handshake_work or + * clear_peer_work) no longer are in use. + */ + flush_workqueue(peer->device->handshake_send_wq); + + /* After the above flushes, a peer might still be active in a few + * different contexts: 1) from xmit(), before hitting is_dead and + * returning, 2) from wg_packet_consume_data(), before hitting is_dead + * and returning, 3) from wg_receive_handshake_packet() after a point + * where it has processed an incoming handshake packet, but where + * all calls to pass it off to timers fails because of is_dead. We won't + * have new references in (1) eventually, because we're removed from + * allowedips; we won't have new references in (2) eventually, because + * wg_index_hashtable_lookup will always return NULL, since we removed + * all existing keypairs and no more can be created; we won't have new + * references in (3) eventually, because we're removed from the pubkey + * hash table, which allows for a maximum of one handshake response, + * via the still-uncleared index hashtable entry, but not more than one, + * and in wg_cookie_message_consume, the lookup eventually gets a peer + * with a refcount of zero, so no new reference is taken. + */ + + --peer->device->num_peers; + wg_peer_put(peer); +} + +/* We have a separate "remove" function make sure that all active places where + * a peer is currently operating will eventually come to an end and not pass + * their reference onto another context. + */ +void wg_peer_remove(struct wg_peer *peer) +{ + if (unlikely(!peer)) + return; + lockdep_assert_held(&peer->device->device_update_lock); + + peer_make_dead(peer); + synchronize_rcu(); + peer_remove_after_dead(peer); +} + +void wg_peer_remove_all(struct wg_device *wg) +{ + struct wg_peer *peer, *temp; + LIST_HEAD(dead_peers); + + lockdep_assert_held(&wg->device_update_lock); + + /* Avoid having to traverse individually for each one. */ + wg_allowedips_free(&wg->peer_allowedips, &wg->device_update_lock); + + list_for_each_entry_safe(peer, temp, &wg->peer_list, peer_list) { + peer_make_dead(peer); + list_add_tail(&peer->peer_list, &dead_peers); + } + synchronize_rcu(); + list_for_each_entry_safe(peer, temp, &dead_peers, peer_list) + peer_remove_after_dead(peer); +} + +static void rcu_release(struct rcu_head *rcu) +{ + struct wg_peer *peer = container_of(rcu, struct wg_peer, rcu); + + dst_cache_destroy(&peer->endpoint_cache); + wg_packet_queue_free(&peer->rx_queue, false); + wg_packet_queue_free(&peer->tx_queue, false); + + /* The final zeroing takes care of clearing any remaining handshake key + * material and other potentially sensitive information. + */ + kzfree(peer); +} + +static void kref_release(struct kref *refcount) +{ + struct wg_peer *peer = container_of(refcount, struct wg_peer, refcount); + + pr_debug("%s: Peer %llu (%pISpfsc) destroyed\n", + peer->device->dev->name, peer->internal_id, + &peer->endpoint.addr); + + /* Remove ourself from dynamic runtime lookup structures, now that the + * last reference is gone. + */ + wg_index_hashtable_remove(peer->device->index_hashtable, + &peer->handshake.entry); + + /* Remove any lingering packets that didn't have a chance to be + * transmitted. + */ + wg_packet_purge_staged_packets(peer); + + /* Free the memory used. */ + call_rcu(&peer->rcu, rcu_release); +} + +void wg_peer_put(struct wg_peer *peer) +{ + if (unlikely(!peer)) + return; + kref_put(&peer->refcount, kref_release); +} diff --git a/drivers/net/wireguard/peer.h b/drivers/net/wireguard/peer.h new file mode 100644 index 000000000000..23af40922997 --- /dev/null +++ b/drivers/net/wireguard/peer.h @@ -0,0 +1,83 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#ifndef _WG_PEER_H +#define _WG_PEER_H + +#include "device.h" +#include "noise.h" +#include "cookie.h" + +#include +#include +#include +#include +#include + +struct wg_device; + +struct endpoint { + union { + struct sockaddr addr; + struct sockaddr_in addr4; + struct sockaddr_in6 addr6; + }; + union { + struct { + struct in_addr src4; + /* Essentially the same as addr6->scope_id */ + int src_if4; + }; + struct in6_addr src6; + }; +}; + +struct wg_peer { + struct wg_device *device; + struct crypt_queue tx_queue, rx_queue; + struct sk_buff_head staged_packet_queue; + int serial_work_cpu; + struct noise_keypairs keypairs; + struct endpoint endpoint; + struct dst_cache endpoint_cache; + rwlock_t endpoint_lock; + struct noise_handshake handshake; + atomic64_t last_sent_handshake; + struct work_struct transmit_handshake_work, clear_peer_work; + struct cookie latest_cookie; + struct hlist_node pubkey_hash; + u64 rx_bytes, tx_bytes; + struct timer_list timer_retransmit_handshake, timer_send_keepalive; + struct timer_list timer_new_handshake, timer_zero_key_material; + struct timer_list timer_persistent_keepalive; + unsigned int timer_handshake_attempts; + u16 persistent_keepalive_interval; + bool timer_need_another_keepalive; + bool sent_lastminute_handshake; + struct timespec64 walltime_last_handshake; + struct kref refcount; + struct rcu_head rcu; + struct list_head peer_list; + struct list_head allowedips_list; + u64 internal_id; + struct napi_struct napi; + bool is_dead; +}; + +struct wg_peer *wg_peer_create(struct wg_device *wg, + const u8 public_key[NOISE_PUBLIC_KEY_LEN], + const u8 preshared_key[NOISE_SYMMETRIC_KEY_LEN]); + +struct wg_peer *__must_check wg_peer_get_maybe_zero(struct wg_peer *peer); +static inline struct wg_peer *wg_peer_get(struct wg_peer *peer) +{ + kref_get(&peer->refcount); + return peer; +} +void wg_peer_put(struct wg_peer *peer); +void wg_peer_remove(struct wg_peer *peer); +void wg_peer_remove_all(struct wg_device *wg); + +#endif /* _WG_PEER_H */ diff --git a/drivers/net/wireguard/peerlookup.c b/drivers/net/wireguard/peerlookup.c new file mode 100644 index 000000000000..f2783aa7a88f --- /dev/null +++ b/drivers/net/wireguard/peerlookup.c @@ -0,0 +1,226 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#include "peerlookup.h" +#include "peer.h" +#include "noise.h" + +static struct hlist_head *pubkey_bucket(struct pubkey_hashtable *table, + const u8 pubkey[NOISE_PUBLIC_KEY_LEN]) +{ + /* siphash gives us a secure 64bit number based on a random key. Since + * the bits are uniformly distributed, we can then mask off to get the + * bits we need. + */ + const u64 hash = siphash(pubkey, NOISE_PUBLIC_KEY_LEN, &table->key); + + return &table->hashtable[hash & (HASH_SIZE(table->hashtable) - 1)]; +} + +struct pubkey_hashtable *wg_pubkey_hashtable_alloc(void) +{ + struct pubkey_hashtable *table = kvmalloc(sizeof(*table), GFP_KERNEL); + + if (!table) + return NULL; + + get_random_bytes(&table->key, sizeof(table->key)); + hash_init(table->hashtable); + mutex_init(&table->lock); + return table; +} + +void wg_pubkey_hashtable_add(struct pubkey_hashtable *table, + struct wg_peer *peer) +{ + mutex_lock(&table->lock); + hlist_add_head_rcu(&peer->pubkey_hash, + pubkey_bucket(table, peer->handshake.remote_static)); + mutex_unlock(&table->lock); +} + +void wg_pubkey_hashtable_remove(struct pubkey_hashtable *table, + struct wg_peer *peer) +{ + mutex_lock(&table->lock); + hlist_del_init_rcu(&peer->pubkey_hash); + mutex_unlock(&table->lock); +} + +/* Returns a strong reference to a peer */ +struct wg_peer * +wg_pubkey_hashtable_lookup(struct pubkey_hashtable *table, + const u8 pubkey[NOISE_PUBLIC_KEY_LEN]) +{ + struct wg_peer *iter_peer, *peer = NULL; + + rcu_read_lock_bh(); + hlist_for_each_entry_rcu_bh(iter_peer, pubkey_bucket(table, pubkey), + pubkey_hash) { + if (!memcmp(pubkey, iter_peer->handshake.remote_static, + NOISE_PUBLIC_KEY_LEN)) { + peer = iter_peer; + break; + } + } + peer = wg_peer_get_maybe_zero(peer); + rcu_read_unlock_bh(); + return peer; +} + +static struct hlist_head *index_bucket(struct index_hashtable *table, + const __le32 index) +{ + /* Since the indices are random and thus all bits are uniformly + * distributed, we can find its bucket simply by masking. + */ + return &table->hashtable[(__force u32)index & + (HASH_SIZE(table->hashtable) - 1)]; +} + +struct index_hashtable *wg_index_hashtable_alloc(void) +{ + struct index_hashtable *table = kvmalloc(sizeof(*table), GFP_KERNEL); + + if (!table) + return NULL; + + hash_init(table->hashtable); + spin_lock_init(&table->lock); + return table; +} + +/* At the moment, we limit ourselves to 2^20 total peers, which generally might + * amount to 2^20*3 items in this hashtable. The algorithm below works by + * picking a random number and testing it. We can see that these limits mean we + * usually succeed pretty quickly: + * + * >>> def calculation(tries, size): + * ... return (size / 2**32)**(tries - 1) * (1 - (size / 2**32)) + * ... + * >>> calculation(1, 2**20 * 3) + * 0.999267578125 + * >>> calculation(2, 2**20 * 3) + * 0.0007318854331970215 + * >>> calculation(3, 2**20 * 3) + * 5.360489012673497e-07 + * >>> calculation(4, 2**20 * 3) + * 3.9261394135792216e-10 + * + * At the moment, we don't do any masking, so this algorithm isn't exactly + * constant time in either the random guessing or in the hash list lookup. We + * could require a minimum of 3 tries, which would successfully mask the + * guessing. this would not, however, help with the growing hash lengths, which + * is another thing to consider moving forward. + */ + +__le32 wg_index_hashtable_insert(struct index_hashtable *table, + struct index_hashtable_entry *entry) +{ + struct index_hashtable_entry *existing_entry; + + spin_lock_bh(&table->lock); + hlist_del_init_rcu(&entry->index_hash); + spin_unlock_bh(&table->lock); + + rcu_read_lock_bh(); + +search_unused_slot: + /* First we try to find an unused slot, randomly, while unlocked. */ + entry->index = (__force __le32)get_random_u32(); + hlist_for_each_entry_rcu_bh(existing_entry, + index_bucket(table, entry->index), + index_hash) { + if (existing_entry->index == entry->index) + /* If it's already in use, we continue searching. */ + goto search_unused_slot; + } + + /* Once we've found an unused slot, we lock it, and then double-check + * that nobody else stole it from us. + */ + spin_lock_bh(&table->lock); + hlist_for_each_entry_rcu_bh(existing_entry, + index_bucket(table, entry->index), + index_hash) { + if (existing_entry->index == entry->index) { + spin_unlock_bh(&table->lock); + /* If it was stolen, we start over. */ + goto search_unused_slot; + } + } + /* Otherwise, we know we have it exclusively (since we're locked), + * so we insert. + */ + hlist_add_head_rcu(&entry->index_hash, + index_bucket(table, entry->index)); + spin_unlock_bh(&table->lock); + + rcu_read_unlock_bh(); + + return entry->index; +} + +bool wg_index_hashtable_replace(struct index_hashtable *table, + struct index_hashtable_entry *old, + struct index_hashtable_entry *new) +{ + bool ret; + + spin_lock_bh(&table->lock); + ret = !hlist_unhashed(&old->index_hash); + if (unlikely(!ret)) + goto out; + + new->index = old->index; + hlist_replace_rcu(&old->index_hash, &new->index_hash); + + /* Calling init here NULLs out index_hash, and in fact after this + * function returns, it's theoretically possible for this to get + * reinserted elsewhere. That means the RCU lookup below might either + * terminate early or jump between buckets, in which case the packet + * simply gets dropped, which isn't terrible. + */ + INIT_HLIST_NODE(&old->index_hash); +out: + spin_unlock_bh(&table->lock); + return ret; +} + +void wg_index_hashtable_remove(struct index_hashtable *table, + struct index_hashtable_entry *entry) +{ + spin_lock_bh(&table->lock); + hlist_del_init_rcu(&entry->index_hash); + spin_unlock_bh(&table->lock); +} + +/* Returns a strong reference to a entry->peer */ +struct index_hashtable_entry * +wg_index_hashtable_lookup(struct index_hashtable *table, + const enum index_hashtable_type type_mask, + const __le32 index, struct wg_peer **peer) +{ + struct index_hashtable_entry *iter_entry, *entry = NULL; + + rcu_read_lock_bh(); + hlist_for_each_entry_rcu_bh(iter_entry, index_bucket(table, index), + index_hash) { + if (iter_entry->index == index) { + if (likely(iter_entry->type & type_mask)) + entry = iter_entry; + break; + } + } + if (likely(entry)) { + entry->peer = wg_peer_get_maybe_zero(entry->peer); + if (likely(entry->peer)) + *peer = entry->peer; + else + entry = NULL; + } + rcu_read_unlock_bh(); + return entry; +} diff --git a/drivers/net/wireguard/peerlookup.h b/drivers/net/wireguard/peerlookup.h new file mode 100644 index 000000000000..ced811797680 --- /dev/null +++ b/drivers/net/wireguard/peerlookup.h @@ -0,0 +1,64 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#ifndef _WG_PEERLOOKUP_H +#define _WG_PEERLOOKUP_H + +#include "messages.h" + +#include +#include +#include + +struct wg_peer; + +struct pubkey_hashtable { + /* TODO: move to rhashtable */ + DECLARE_HASHTABLE(hashtable, 11); + siphash_key_t key; + struct mutex lock; +}; + +struct pubkey_hashtable *wg_pubkey_hashtable_alloc(void); +void wg_pubkey_hashtable_add(struct pubkey_hashtable *table, + struct wg_peer *peer); +void wg_pubkey_hashtable_remove(struct pubkey_hashtable *table, + struct wg_peer *peer); +struct wg_peer * +wg_pubkey_hashtable_lookup(struct pubkey_hashtable *table, + const u8 pubkey[NOISE_PUBLIC_KEY_LEN]); + +struct index_hashtable { + /* TODO: move to rhashtable */ + DECLARE_HASHTABLE(hashtable, 13); + spinlock_t lock; +}; + +enum index_hashtable_type { + INDEX_HASHTABLE_HANDSHAKE = 1U << 0, + INDEX_HASHTABLE_KEYPAIR = 1U << 1 +}; + +struct index_hashtable_entry { + struct wg_peer *peer; + struct hlist_node index_hash; + enum index_hashtable_type type; + __le32 index; +}; + +struct index_hashtable *wg_index_hashtable_alloc(void); +__le32 wg_index_hashtable_insert(struct index_hashtable *table, + struct index_hashtable_entry *entry); +bool wg_index_hashtable_replace(struct index_hashtable *table, + struct index_hashtable_entry *old, + struct index_hashtable_entry *new); +void wg_index_hashtable_remove(struct index_hashtable *table, + struct index_hashtable_entry *entry); +struct index_hashtable_entry * +wg_index_hashtable_lookup(struct index_hashtable *table, + const enum index_hashtable_type type_mask, + const __le32 index, struct wg_peer **peer); + +#endif /* _WG_PEERLOOKUP_H */ diff --git a/drivers/net/wireguard/queueing.c b/drivers/net/wireguard/queueing.c new file mode 100644 index 000000000000..71b8e80b58e1 --- /dev/null +++ b/drivers/net/wireguard/queueing.c @@ -0,0 +1,55 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#include "queueing.h" + +struct multicore_worker __percpu * +wg_packet_percpu_multicore_worker_alloc(work_func_t function, void *ptr) +{ + int cpu; + struct multicore_worker __percpu *worker = + alloc_percpu(struct multicore_worker); + + if (!worker) + return NULL; + + for_each_possible_cpu(cpu) { + per_cpu_ptr(worker, cpu)->ptr = ptr; + INIT_WORK(&per_cpu_ptr(worker, cpu)->work, function); + } + return worker; +} + +int wg_packet_queue_init(struct crypt_queue *queue, work_func_t function, + bool multicore, unsigned int len) +{ + int ret; + + memset(queue, 0, sizeof(*queue)); + ret = ptr_ring_init(&queue->ring, len, GFP_KERNEL); + if (ret) + return ret; + if (function) { + if (multicore) { + queue->worker = wg_packet_percpu_multicore_worker_alloc( + function, queue); + if (!queue->worker) { + ptr_ring_cleanup(&queue->ring, NULL); + return -ENOMEM; + } + } else { + INIT_WORK(&queue->work, function); + } + } + return 0; +} + +void wg_packet_queue_free(struct crypt_queue *queue, bool multicore) +{ + if (multicore) + free_percpu(queue->worker); + WARN_ON(!__ptr_ring_empty(&queue->ring)); + ptr_ring_cleanup(&queue->ring, NULL); +} diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h new file mode 100644 index 000000000000..866d46865f13 --- /dev/null +++ b/drivers/net/wireguard/queueing.h @@ -0,0 +1,193 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#ifndef _WG_QUEUEING_H +#define _WG_QUEUEING_H + +#include "peer.h" +#include +#include +#include +#include +#include + +struct wg_device; +struct wg_peer; +struct multicore_worker; +struct crypt_queue; +struct sk_buff; + +/* queueing.c APIs: */ +int wg_packet_queue_init(struct crypt_queue *queue, work_func_t function, + bool multicore, unsigned int len); +void wg_packet_queue_free(struct crypt_queue *queue, bool multicore); +struct multicore_worker __percpu * +wg_packet_percpu_multicore_worker_alloc(work_func_t function, void *ptr); + +/* receive.c APIs: */ +void wg_packet_receive(struct wg_device *wg, struct sk_buff *skb); +void wg_packet_handshake_receive_worker(struct work_struct *work); +/* NAPI poll function: */ +int wg_packet_rx_poll(struct napi_struct *napi, int budget); +/* Workqueue worker: */ +void wg_packet_decrypt_worker(struct work_struct *work); + +/* send.c APIs: */ +void wg_packet_send_queued_handshake_initiation(struct wg_peer *peer, + bool is_retry); +void wg_packet_send_handshake_response(struct wg_peer *peer); +void wg_packet_send_handshake_cookie(struct wg_device *wg, + struct sk_buff *initiating_skb, + __le32 sender_index); +void wg_packet_send_keepalive(struct wg_peer *peer); +void wg_packet_purge_staged_packets(struct wg_peer *peer); +void wg_packet_send_staged_packets(struct wg_peer *peer); +/* Workqueue workers: */ +void wg_packet_handshake_send_worker(struct work_struct *work); +void wg_packet_tx_worker(struct work_struct *work); +void wg_packet_encrypt_worker(struct work_struct *work); + +enum packet_state { + PACKET_STATE_UNCRYPTED, + PACKET_STATE_CRYPTED, + PACKET_STATE_DEAD +}; + +struct packet_cb { + u64 nonce; + struct noise_keypair *keypair; + atomic_t state; + u32 mtu; + u8 ds; +}; + +#define PACKET_CB(skb) ((struct packet_cb *)((skb)->cb)) +#define PACKET_PEER(skb) (PACKET_CB(skb)->keypair->entry.peer) + +static inline bool wg_check_packet_protocol(struct sk_buff *skb) +{ + __be16 real_protocol = ip_tunnel_parse_protocol(skb); + return real_protocol && skb->protocol == real_protocol; +} + +static inline void wg_reset_packet(struct sk_buff *skb, bool encapsulating) +{ + u8 l4_hash = skb->l4_hash; + u8 sw_hash = skb->sw_hash; + u32 hash = skb->hash; + skb_scrub_packet(skb, true); + memset(&skb->headers_start, 0, + offsetof(struct sk_buff, headers_end) - + offsetof(struct sk_buff, headers_start)); + if (encapsulating) { + skb->l4_hash = l4_hash; + skb->sw_hash = sw_hash; + skb->hash = hash; + } + skb->queue_mapping = 0; + skb->nohdr = 0; + skb->peeked = 0; + skb->mac_len = 0; + skb->dev = NULL; +#ifdef CONFIG_NET_SCHED + skb->tc_index = 0; + skb_reset_tc(skb); +#endif + skb->hdr_len = skb_headroom(skb); + skb_reset_mac_header(skb); + skb_reset_network_header(skb); + skb_reset_transport_header(skb); + skb_probe_transport_header(skb, 0); + skb_reset_inner_headers(skb); +} + +static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id) +{ + unsigned int cpu = *stored_cpu, cpu_index, i; + + if (unlikely(cpu == nr_cpumask_bits || + !cpumask_test_cpu(cpu, cpu_online_mask))) { + cpu_index = id % cpumask_weight(cpu_online_mask); + cpu = cpumask_first(cpu_online_mask); + for (i = 0; i < cpu_index; ++i) + cpu = cpumask_next(cpu, cpu_online_mask); + *stored_cpu = cpu; + } + return cpu; +} + +/* This function is racy, in the sense that next is unlocked, so it could return + * the same CPU twice. A race-free version of this would be to instead store an + * atomic sequence number, do an increment-and-return, and then iterate through + * every possible CPU until we get to that index -- choose_cpu. However that's + * a bit slower, and it doesn't seem like this potential race actually + * introduces any performance loss, so we live with it. + */ +static inline int wg_cpumask_next_online(int *next) +{ + int cpu = *next; + + while (unlikely(!cpumask_test_cpu(cpu, cpu_online_mask))) + cpu = cpumask_next(cpu, cpu_online_mask) % nr_cpumask_bits; + *next = cpumask_next(cpu, cpu_online_mask) % nr_cpumask_bits; + return cpu; +} + +static inline int wg_queue_enqueue_per_device_and_peer( + struct crypt_queue *device_queue, struct crypt_queue *peer_queue, + struct sk_buff *skb, struct workqueue_struct *wq, int *next_cpu) +{ + int cpu; + + atomic_set_release(&PACKET_CB(skb)->state, PACKET_STATE_UNCRYPTED); + /* We first queue this up for the peer ingestion, but the consumer + * will wait for the state to change to CRYPTED or DEAD before. + */ + if (unlikely(ptr_ring_produce_bh(&peer_queue->ring, skb))) + return -ENOSPC; + /* Then we queue it up in the device queue, which consumes the + * packet as soon as it can. + */ + cpu = wg_cpumask_next_online(next_cpu); + if (unlikely(ptr_ring_produce_bh(&device_queue->ring, skb))) + return -EPIPE; + queue_work_on(cpu, wq, &per_cpu_ptr(device_queue->worker, cpu)->work); + return 0; +} + +static inline void wg_queue_enqueue_per_peer(struct crypt_queue *queue, + struct sk_buff *skb, + enum packet_state state) +{ + /* We take a reference, because as soon as we call atomic_set, the + * peer can be freed from below us. + */ + struct wg_peer *peer = wg_peer_get(PACKET_PEER(skb)); + + atomic_set_release(&PACKET_CB(skb)->state, state); + queue_work_on(wg_cpumask_choose_online(&peer->serial_work_cpu, + peer->internal_id), + peer->device->packet_crypt_wq, &queue->work); + wg_peer_put(peer); +} + +static inline void wg_queue_enqueue_per_peer_napi(struct sk_buff *skb, + enum packet_state state) +{ + /* We take a reference, because as soon as we call atomic_set, the + * peer can be freed from below us. + */ + struct wg_peer *peer = wg_peer_get(PACKET_PEER(skb)); + + atomic_set_release(&PACKET_CB(skb)->state, state); + napi_schedule(&peer->napi); + wg_peer_put(peer); +} + +#ifdef DEBUG +bool wg_packet_counter_selftest(void); +#endif + +#endif /* _WG_QUEUEING_H */ diff --git a/drivers/net/wireguard/ratelimiter.c b/drivers/net/wireguard/ratelimiter.c new file mode 100644 index 000000000000..9baa930c8157 --- /dev/null +++ b/drivers/net/wireguard/ratelimiter.c @@ -0,0 +1,223 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#include "ratelimiter.h" +#include +#include +#include +#include + +static struct kmem_cache *entry_cache; +static hsiphash_key_t key; +static spinlock_t table_lock = __SPIN_LOCK_UNLOCKED("ratelimiter_table_lock"); +static DEFINE_MUTEX(init_lock); +static u64 init_refcnt; /* Protected by init_lock, hence not atomic. */ +static atomic_t total_entries = ATOMIC_INIT(0); +static unsigned int max_entries, table_size; +static void wg_ratelimiter_gc_entries(struct work_struct *); +static DECLARE_DEFERRABLE_WORK(gc_work, wg_ratelimiter_gc_entries); +static struct hlist_head *table_v4; +#if IS_ENABLED(CONFIG_IPV6) +static struct hlist_head *table_v6; +#endif + +struct ratelimiter_entry { + u64 last_time_ns, tokens, ip; + void *net; + spinlock_t lock; + struct hlist_node hash; + struct rcu_head rcu; +}; + +enum { + PACKETS_PER_SECOND = 20, + PACKETS_BURSTABLE = 5, + PACKET_COST = NSEC_PER_SEC / PACKETS_PER_SECOND, + TOKEN_MAX = PACKET_COST * PACKETS_BURSTABLE +}; + +static void entry_free(struct rcu_head *rcu) +{ + kmem_cache_free(entry_cache, + container_of(rcu, struct ratelimiter_entry, rcu)); + atomic_dec(&total_entries); +} + +static void entry_uninit(struct ratelimiter_entry *entry) +{ + hlist_del_rcu(&entry->hash); + call_rcu(&entry->rcu, entry_free); +} + +/* Calling this function with a NULL work uninits all entries. */ +static void wg_ratelimiter_gc_entries(struct work_struct *work) +{ + const u64 now = ktime_get_coarse_boottime_ns(); + struct ratelimiter_entry *entry; + struct hlist_node *temp; + unsigned int i; + + for (i = 0; i < table_size; ++i) { + spin_lock(&table_lock); + hlist_for_each_entry_safe(entry, temp, &table_v4[i], hash) { + if (unlikely(!work) || + now - entry->last_time_ns > NSEC_PER_SEC) + entry_uninit(entry); + } +#if IS_ENABLED(CONFIG_IPV6) + hlist_for_each_entry_safe(entry, temp, &table_v6[i], hash) { + if (unlikely(!work) || + now - entry->last_time_ns > NSEC_PER_SEC) + entry_uninit(entry); + } +#endif + spin_unlock(&table_lock); + if (likely(work)) + cond_resched(); + } + if (likely(work)) + queue_delayed_work(system_power_efficient_wq, &gc_work, HZ); +} + +bool wg_ratelimiter_allow(struct sk_buff *skb, struct net *net) +{ + /* We only take the bottom half of the net pointer, so that we can hash + * 3 words in the end. This way, siphash's len param fits into the final + * u32, and we don't incur an extra round. + */ + const u32 net_word = (unsigned long)net; + struct ratelimiter_entry *entry; + struct hlist_head *bucket; + u64 ip; + + if (skb->protocol == htons(ETH_P_IP)) { + ip = (u64 __force)ip_hdr(skb)->saddr; + bucket = &table_v4[hsiphash_2u32(net_word, ip, &key) & + (table_size - 1)]; + } +#if IS_ENABLED(CONFIG_IPV6) + else if (skb->protocol == htons(ETH_P_IPV6)) { + /* Only use 64 bits, so as to ratelimit the whole /64. */ + memcpy(&ip, &ipv6_hdr(skb)->saddr, sizeof(ip)); + bucket = &table_v6[hsiphash_3u32(net_word, ip >> 32, ip, &key) & + (table_size - 1)]; + } +#endif + else + return false; + rcu_read_lock(); + hlist_for_each_entry_rcu(entry, bucket, hash) { + if (entry->net == net && entry->ip == ip) { + u64 now, tokens; + bool ret; + /* Quasi-inspired by nft_limit.c, but this is actually a + * slightly different algorithm. Namely, we incorporate + * the burst as part of the maximum tokens, rather than + * as part of the rate. + */ + spin_lock(&entry->lock); + now = ktime_get_coarse_boottime_ns(); + tokens = min_t(u64, TOKEN_MAX, + entry->tokens + now - + entry->last_time_ns); + entry->last_time_ns = now; + ret = tokens >= PACKET_COST; + entry->tokens = ret ? tokens - PACKET_COST : tokens; + spin_unlock(&entry->lock); + rcu_read_unlock(); + return ret; + } + } + rcu_read_unlock(); + + if (atomic_inc_return(&total_entries) > max_entries) + goto err_oom; + + entry = kmem_cache_alloc(entry_cache, GFP_KERNEL); + if (unlikely(!entry)) + goto err_oom; + + entry->net = net; + entry->ip = ip; + INIT_HLIST_NODE(&entry->hash); + spin_lock_init(&entry->lock); + entry->last_time_ns = ktime_get_coarse_boottime_ns(); + entry->tokens = TOKEN_MAX - PACKET_COST; + spin_lock(&table_lock); + hlist_add_head_rcu(&entry->hash, bucket); + spin_unlock(&table_lock); + return true; + +err_oom: + atomic_dec(&total_entries); + return false; +} + +int wg_ratelimiter_init(void) +{ + mutex_lock(&init_lock); + if (++init_refcnt != 1) + goto out; + + entry_cache = KMEM_CACHE(ratelimiter_entry, 0); + if (!entry_cache) + goto err; + + /* xt_hashlimit.c uses a slightly different algorithm for ratelimiting, + * but what it shares in common is that it uses a massive hashtable. So, + * we borrow their wisdom about good table sizes on different systems + * dependent on RAM. This calculation here comes from there. + */ + table_size = (totalram_pages > (1U << 30) / PAGE_SIZE) ? 8192 : + max_t(unsigned long, 16, roundup_pow_of_two( + (totalram_pages << PAGE_SHIFT) / + (1U << 14) / sizeof(struct hlist_head))); + max_entries = table_size * 8; + + table_v4 = kvzalloc(table_size * sizeof(*table_v4), GFP_KERNEL); + if (unlikely(!table_v4)) + goto err_kmemcache; + +#if IS_ENABLED(CONFIG_IPV6) + table_v6 = kvzalloc(table_size * sizeof(*table_v6), GFP_KERNEL); + if (unlikely(!table_v6)) { + kvfree(table_v4); + goto err_kmemcache; + } +#endif + + queue_delayed_work(system_power_efficient_wq, &gc_work, HZ); + get_random_bytes(&key, sizeof(key)); +out: + mutex_unlock(&init_lock); + return 0; + +err_kmemcache: + kmem_cache_destroy(entry_cache); +err: + --init_refcnt; + mutex_unlock(&init_lock); + return -ENOMEM; +} + +void wg_ratelimiter_uninit(void) +{ + mutex_lock(&init_lock); + if (!init_refcnt || --init_refcnt) + goto out; + + cancel_delayed_work_sync(&gc_work); + wg_ratelimiter_gc_entries(NULL); + rcu_barrier(); + kvfree(table_v4); +#if IS_ENABLED(CONFIG_IPV6) + kvfree(table_v6); +#endif + kmem_cache_destroy(entry_cache); +out: + mutex_unlock(&init_lock); +} + +#include "selftest/ratelimiter.c" diff --git a/drivers/net/wireguard/ratelimiter.h b/drivers/net/wireguard/ratelimiter.h new file mode 100644 index 000000000000..83067f71ea99 --- /dev/null +++ b/drivers/net/wireguard/ratelimiter.h @@ -0,0 +1,19 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#ifndef _WG_RATELIMITER_H +#define _WG_RATELIMITER_H + +#include + +int wg_ratelimiter_init(void); +void wg_ratelimiter_uninit(void); +bool wg_ratelimiter_allow(struct sk_buff *skb, struct net *net); + +#ifdef DEBUG +bool wg_ratelimiter_selftest(void); +#endif + +#endif /* _WG_RATELIMITER_H */ diff --git a/drivers/net/wireguard/receive.c b/drivers/net/wireguard/receive.c new file mode 100644 index 000000000000..2c9551ea6dc7 --- /dev/null +++ b/drivers/net/wireguard/receive.c @@ -0,0 +1,590 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#include "queueing.h" +#include "device.h" +#include "peer.h" +#include "timers.h" +#include "messages.h" +#include "cookie.h" +#include "socket.h" + +#include +#include +#include +#include + +/* Must be called with bh disabled. */ +static void update_rx_stats(struct wg_peer *peer, size_t len) +{ + struct pcpu_sw_netstats *tstats = + get_cpu_ptr(peer->device->dev->tstats); + + u64_stats_update_begin(&tstats->syncp); + ++tstats->rx_packets; + tstats->rx_bytes += len; + peer->rx_bytes += len; + u64_stats_update_end(&tstats->syncp); + put_cpu_ptr(tstats); +} + +#define SKB_TYPE_LE32(skb) (((struct message_header *)(skb)->data)->type) + +static size_t validate_header_len(struct sk_buff *skb) +{ + if (unlikely(skb->len < sizeof(struct message_header))) + return 0; + if (SKB_TYPE_LE32(skb) == cpu_to_le32(MESSAGE_DATA) && + skb->len >= MESSAGE_MINIMUM_LENGTH) + return sizeof(struct message_data); + if (SKB_TYPE_LE32(skb) == cpu_to_le32(MESSAGE_HANDSHAKE_INITIATION) && + skb->len == sizeof(struct message_handshake_initiation)) + return sizeof(struct message_handshake_initiation); + if (SKB_TYPE_LE32(skb) == cpu_to_le32(MESSAGE_HANDSHAKE_RESPONSE) && + skb->len == sizeof(struct message_handshake_response)) + return sizeof(struct message_handshake_response); + if (SKB_TYPE_LE32(skb) == cpu_to_le32(MESSAGE_HANDSHAKE_COOKIE) && + skb->len == sizeof(struct message_handshake_cookie)) + return sizeof(struct message_handshake_cookie); + return 0; +} + +static int prepare_skb_header(struct sk_buff *skb, struct wg_device *wg) +{ + size_t data_offset, data_len, header_len; + struct udphdr *udp; + + if (unlikely(!wg_check_packet_protocol(skb) || + skb_transport_header(skb) < skb->head || + (skb_transport_header(skb) + sizeof(struct udphdr)) > + skb_tail_pointer(skb))) + return -EINVAL; /* Bogus IP header */ + udp = udp_hdr(skb); + data_offset = (u8 *)udp - skb->data; + if (unlikely(data_offset > U16_MAX || + data_offset + sizeof(struct udphdr) > skb->len)) + /* Packet has offset at impossible location or isn't big enough + * to have UDP fields. + */ + return -EINVAL; + data_len = ntohs(udp->len); + if (unlikely(data_len < sizeof(struct udphdr) || + data_len > skb->len - data_offset)) + /* UDP packet is reporting too small of a size or lying about + * its size. + */ + return -EINVAL; + data_len -= sizeof(struct udphdr); + data_offset = (u8 *)udp + sizeof(struct udphdr) - skb->data; + if (unlikely(!pskb_may_pull(skb, + data_offset + sizeof(struct message_header)) || + pskb_trim(skb, data_len + data_offset) < 0)) + return -EINVAL; + skb_pull(skb, data_offset); + if (unlikely(skb->len != data_len)) + /* Final len does not agree with calculated len */ + return -EINVAL; + header_len = validate_header_len(skb); + if (unlikely(!header_len)) + return -EINVAL; + __skb_push(skb, data_offset); + if (unlikely(!pskb_may_pull(skb, data_offset + header_len))) + return -EINVAL; + __skb_pull(skb, data_offset); + return 0; +} + +static void wg_receive_handshake_packet(struct wg_device *wg, + struct sk_buff *skb) +{ + enum cookie_mac_state mac_state; + struct wg_peer *peer = NULL; + /* This is global, so that our load calculation applies to the whole + * system. We don't care about races with it at all. + */ + static u64 last_under_load; + bool packet_needs_cookie; + bool under_load; + + if (SKB_TYPE_LE32(skb) == cpu_to_le32(MESSAGE_HANDSHAKE_COOKIE)) { + net_dbg_skb_ratelimited("%s: Receiving cookie response from %pISpfsc\n", + wg->dev->name, skb); + wg_cookie_message_consume( + (struct message_handshake_cookie *)skb->data, wg); + return; + } + + under_load = skb_queue_len(&wg->incoming_handshakes) >= + MAX_QUEUED_INCOMING_HANDSHAKES / 8; + if (under_load) { + last_under_load = ktime_get_coarse_boottime_ns(); + } else if (last_under_load) { + under_load = !wg_birthdate_has_expired(last_under_load, 1); + if (!under_load) + last_under_load = 0; + } + mac_state = wg_cookie_validate_packet(&wg->cookie_checker, skb, + under_load); + if ((under_load && mac_state == VALID_MAC_WITH_COOKIE) || + (!under_load && mac_state == VALID_MAC_BUT_NO_COOKIE)) { + packet_needs_cookie = false; + } else if (under_load && mac_state == VALID_MAC_BUT_NO_COOKIE) { + packet_needs_cookie = true; + } else { + net_dbg_skb_ratelimited("%s: Invalid MAC of handshake, dropping packet from %pISpfsc\n", + wg->dev->name, skb); + return; + } + + switch (SKB_TYPE_LE32(skb)) { + case cpu_to_le32(MESSAGE_HANDSHAKE_INITIATION): { + struct message_handshake_initiation *message = + (struct message_handshake_initiation *)skb->data; + + if (packet_needs_cookie) { + wg_packet_send_handshake_cookie(wg, skb, + message->sender_index); + return; + } + peer = wg_noise_handshake_consume_initiation(message, wg); + if (unlikely(!peer)) { + net_dbg_skb_ratelimited("%s: Invalid handshake initiation from %pISpfsc\n", + wg->dev->name, skb); + return; + } + wg_socket_set_peer_endpoint_from_skb(peer, skb); + net_dbg_ratelimited("%s: Receiving handshake initiation from peer %llu (%pISpfsc)\n", + wg->dev->name, peer->internal_id, + &peer->endpoint.addr); + wg_packet_send_handshake_response(peer); + break; + } + case cpu_to_le32(MESSAGE_HANDSHAKE_RESPONSE): { + struct message_handshake_response *message = + (struct message_handshake_response *)skb->data; + + if (packet_needs_cookie) { + wg_packet_send_handshake_cookie(wg, skb, + message->sender_index); + return; + } + peer = wg_noise_handshake_consume_response(message, wg); + if (unlikely(!peer)) { + net_dbg_skb_ratelimited("%s: Invalid handshake response from %pISpfsc\n", + wg->dev->name, skb); + return; + } + wg_socket_set_peer_endpoint_from_skb(peer, skb); + net_dbg_ratelimited("%s: Receiving handshake response from peer %llu (%pISpfsc)\n", + wg->dev->name, peer->internal_id, + &peer->endpoint.addr); + if (wg_noise_handshake_begin_session(&peer->handshake, + &peer->keypairs)) { + wg_timers_session_derived(peer); + wg_timers_handshake_complete(peer); + /* Calling this function will either send any existing + * packets in the queue and not send a keepalive, which + * is the best case, Or, if there's nothing in the + * queue, it will send a keepalive, in order to give + * immediate confirmation of the session. + */ + wg_packet_send_keepalive(peer); + } + break; + } + } + + if (unlikely(!peer)) { + WARN(1, "Somehow a wrong type of packet wound up in the handshake queue!\n"); + return; + } + + local_bh_disable(); + update_rx_stats(peer, skb->len); + local_bh_enable(); + + wg_timers_any_authenticated_packet_received(peer); + wg_timers_any_authenticated_packet_traversal(peer); + wg_peer_put(peer); +} + +void wg_packet_handshake_receive_worker(struct work_struct *work) +{ + struct wg_device *wg = container_of(work, struct multicore_worker, + work)->ptr; + struct sk_buff *skb; + + while ((skb = skb_dequeue(&wg->incoming_handshakes)) != NULL) { + wg_receive_handshake_packet(wg, skb); + dev_kfree_skb(skb); + cond_resched(); + } +} + +static void keep_key_fresh(struct wg_peer *peer) +{ + struct noise_keypair *keypair; + bool send; + + if (peer->sent_lastminute_handshake) + return; + + rcu_read_lock_bh(); + keypair = rcu_dereference_bh(peer->keypairs.current_keypair); + send = keypair && READ_ONCE(keypair->sending.is_valid) && + keypair->i_am_the_initiator && + wg_birthdate_has_expired(keypair->sending.birthdate, + REJECT_AFTER_TIME - KEEPALIVE_TIMEOUT - REKEY_TIMEOUT); + rcu_read_unlock_bh(); + + if (unlikely(send)) { + peer->sent_lastminute_handshake = true; + wg_packet_send_queued_handshake_initiation(peer, false); + } +} + +static bool decrypt_packet(struct sk_buff *skb, struct noise_keypair *keypair) +{ + struct scatterlist sg[MAX_SKB_FRAGS + 8]; + struct sk_buff *trailer; + unsigned int offset; + int num_frags; + + if (unlikely(!keypair)) + return false; + + if (unlikely(!READ_ONCE(keypair->receiving.is_valid) || + wg_birthdate_has_expired(keypair->receiving.birthdate, REJECT_AFTER_TIME) || + keypair->receiving_counter.counter >= REJECT_AFTER_MESSAGES)) { + WRITE_ONCE(keypair->receiving.is_valid, false); + return false; + } + + PACKET_CB(skb)->nonce = + le64_to_cpu(((struct message_data *)skb->data)->counter); + + /* We ensure that the network header is part of the packet before we + * call skb_cow_data, so that there's no chance that data is removed + * from the skb, so that later we can extract the original endpoint. + */ + offset = skb->data - skb_network_header(skb); + skb_push(skb, offset); + num_frags = skb_cow_data(skb, 0, &trailer); + offset += sizeof(struct message_data); + skb_pull(skb, offset); + if (unlikely(num_frags < 0 || num_frags > ARRAY_SIZE(sg))) + return false; + + sg_init_table(sg, num_frags); + if (skb_to_sgvec(skb, sg, 0, skb->len) <= 0) + return false; + + if (!chacha20poly1305_decrypt_sg_inplace(sg, skb->len, NULL, 0, + PACKET_CB(skb)->nonce, + keypair->receiving.key)) + return false; + + /* Another ugly situation of pushing and pulling the header so as to + * keep endpoint information intact. + */ + skb_push(skb, offset); + if (pskb_trim(skb, skb->len - noise_encrypted_len(0))) + return false; + skb_pull(skb, offset); + + return true; +} + +/* This is RFC6479, a replay detection bitmap algorithm that avoids bitshifts */ +static bool counter_validate(struct noise_replay_counter *counter, u64 their_counter) +{ + unsigned long index, index_current, top, i; + bool ret = false; + + spin_lock_bh(&counter->lock); + + if (unlikely(counter->counter >= REJECT_AFTER_MESSAGES + 1 || + their_counter >= REJECT_AFTER_MESSAGES)) + goto out; + + ++their_counter; + + if (unlikely((COUNTER_WINDOW_SIZE + their_counter) < + counter->counter)) + goto out; + + index = their_counter >> ilog2(BITS_PER_LONG); + + if (likely(their_counter > counter->counter)) { + index_current = counter->counter >> ilog2(BITS_PER_LONG); + top = min_t(unsigned long, index - index_current, + COUNTER_BITS_TOTAL / BITS_PER_LONG); + for (i = 1; i <= top; ++i) + counter->backtrack[(i + index_current) & + ((COUNTER_BITS_TOTAL / BITS_PER_LONG) - 1)] = 0; + counter->counter = their_counter; + } + + index &= (COUNTER_BITS_TOTAL / BITS_PER_LONG) - 1; + ret = !test_and_set_bit(their_counter & (BITS_PER_LONG - 1), + &counter->backtrack[index]); + +out: + spin_unlock_bh(&counter->lock); + return ret; +} + +#include "selftest/counter.c" + +static void wg_packet_consume_data_done(struct wg_peer *peer, + struct sk_buff *skb, + struct endpoint *endpoint) +{ + struct net_device *dev = peer->device->dev; + unsigned int len, len_before_trim; + struct wg_peer *routed_peer; + + wg_socket_set_peer_endpoint(peer, endpoint); + + if (unlikely(wg_noise_received_with_keypair(&peer->keypairs, + PACKET_CB(skb)->keypair))) { + wg_timers_handshake_complete(peer); + wg_packet_send_staged_packets(peer); + } + + keep_key_fresh(peer); + + wg_timers_any_authenticated_packet_received(peer); + wg_timers_any_authenticated_packet_traversal(peer); + + /* A packet with length 0 is a keepalive packet */ + if (unlikely(!skb->len)) { + update_rx_stats(peer, message_data_len(0)); + net_dbg_ratelimited("%s: Receiving keepalive packet from peer %llu (%pISpfsc)\n", + dev->name, peer->internal_id, + &peer->endpoint.addr); + goto packet_processed; + } + + wg_timers_data_received(peer); + + if (unlikely(skb_network_header(skb) < skb->head)) + goto dishonest_packet_size; + if (unlikely(!(pskb_network_may_pull(skb, sizeof(struct iphdr)) && + (ip_hdr(skb)->version == 4 || + (ip_hdr(skb)->version == 6 && + pskb_network_may_pull(skb, sizeof(struct ipv6hdr))))))) + goto dishonest_packet_type; + + skb->dev = dev; + /* We've already verified the Poly1305 auth tag, which means this packet + * was not modified in transit. We can therefore tell the networking + * stack that all checksums of every layer of encapsulation have already + * been checked "by the hardware" and therefore is unnecessary to check + * again in software. + */ + skb->ip_summed = CHECKSUM_UNNECESSARY; + skb->csum_level = ~0; /* All levels */ + skb->protocol = ip_tunnel_parse_protocol(skb); + if (skb->protocol == htons(ETH_P_IP)) { + len = ntohs(ip_hdr(skb)->tot_len); + if (unlikely(len < sizeof(struct iphdr))) + goto dishonest_packet_size; + INET_ECN_decapsulate(skb, PACKET_CB(skb)->ds, ip_hdr(skb)->tos); + } else if (skb->protocol == htons(ETH_P_IPV6)) { + len = ntohs(ipv6_hdr(skb)->payload_len) + + sizeof(struct ipv6hdr); + INET_ECN_decapsulate(skb, PACKET_CB(skb)->ds, ipv6_get_dsfield(ipv6_hdr(skb))); + } else { + goto dishonest_packet_type; + } + + if (unlikely(len > skb->len)) + goto dishonest_packet_size; + len_before_trim = skb->len; + if (unlikely(pskb_trim(skb, len))) + goto packet_processed; + + routed_peer = wg_allowedips_lookup_src(&peer->device->peer_allowedips, + skb); + wg_peer_put(routed_peer); /* We don't need the extra reference. */ + + if (unlikely(routed_peer != peer)) + goto dishonest_packet_peer; + + napi_gro_receive(&peer->napi, skb); + update_rx_stats(peer, message_data_len(len_before_trim)); + return; + +dishonest_packet_peer: + net_dbg_skb_ratelimited("%s: Packet has unallowed src IP (%pISc) from peer %llu (%pISpfsc)\n", + dev->name, skb, peer->internal_id, + &peer->endpoint.addr); + ++dev->stats.rx_errors; + ++dev->stats.rx_frame_errors; + goto packet_processed; +dishonest_packet_type: + net_dbg_ratelimited("%s: Packet is neither ipv4 nor ipv6 from peer %llu (%pISpfsc)\n", + dev->name, peer->internal_id, &peer->endpoint.addr); + ++dev->stats.rx_errors; + ++dev->stats.rx_frame_errors; + goto packet_processed; +dishonest_packet_size: + net_dbg_ratelimited("%s: Packet has incorrect size from peer %llu (%pISpfsc)\n", + dev->name, peer->internal_id, &peer->endpoint.addr); + ++dev->stats.rx_errors; + ++dev->stats.rx_length_errors; + goto packet_processed; +packet_processed: + dev_kfree_skb(skb); +} + +int wg_packet_rx_poll(struct napi_struct *napi, int budget) +{ + struct wg_peer *peer = container_of(napi, struct wg_peer, napi); + struct crypt_queue *queue = &peer->rx_queue; + struct noise_keypair *keypair; + struct endpoint endpoint; + enum packet_state state; + struct sk_buff *skb; + int work_done = 0; + bool free; + + if (unlikely(budget <= 0)) + return 0; + + while ((skb = __ptr_ring_peek(&queue->ring)) != NULL && + (state = atomic_read_acquire(&PACKET_CB(skb)->state)) != + PACKET_STATE_UNCRYPTED) { + __ptr_ring_discard_one(&queue->ring); + peer = PACKET_PEER(skb); + keypair = PACKET_CB(skb)->keypair; + free = true; + + if (unlikely(state != PACKET_STATE_CRYPTED)) + goto next; + + if (unlikely(!counter_validate(&keypair->receiving_counter, + PACKET_CB(skb)->nonce))) { + net_dbg_ratelimited("%s: Packet has invalid nonce %llu (max %llu)\n", + peer->device->dev->name, + PACKET_CB(skb)->nonce, + keypair->receiving_counter.counter); + goto next; + } + + if (unlikely(wg_socket_endpoint_from_skb(&endpoint, skb))) + goto next; + + wg_reset_packet(skb, false); + wg_packet_consume_data_done(peer, skb, &endpoint); + free = false; + +next: + wg_noise_keypair_put(keypair, false); + wg_peer_put(peer); + if (unlikely(free)) + dev_kfree_skb(skb); + + if (++work_done >= budget) + break; + } + + if (work_done < budget) + napi_complete_done(napi, work_done); + + return work_done; +} + +void wg_packet_decrypt_worker(struct work_struct *work) +{ + struct crypt_queue *queue = container_of(work, struct multicore_worker, + work)->ptr; + struct sk_buff *skb; + + while ((skb = ptr_ring_consume_bh(&queue->ring)) != NULL) { + enum packet_state state = + likely(decrypt_packet(skb, PACKET_CB(skb)->keypair)) ? + PACKET_STATE_CRYPTED : PACKET_STATE_DEAD; + wg_queue_enqueue_per_peer_napi(skb, state); + if (need_resched()) + cond_resched(); + } +} + +static void wg_packet_consume_data(struct wg_device *wg, struct sk_buff *skb) +{ + __le32 idx = ((struct message_data *)skb->data)->key_idx; + struct wg_peer *peer = NULL; + int ret; + + rcu_read_lock_bh(); + PACKET_CB(skb)->keypair = + (struct noise_keypair *)wg_index_hashtable_lookup( + wg->index_hashtable, INDEX_HASHTABLE_KEYPAIR, idx, + &peer); + if (unlikely(!wg_noise_keypair_get(PACKET_CB(skb)->keypair))) + goto err_keypair; + + if (unlikely(READ_ONCE(peer->is_dead))) + goto err; + + ret = wg_queue_enqueue_per_device_and_peer(&wg->decrypt_queue, + &peer->rx_queue, skb, + wg->packet_crypt_wq, + &wg->decrypt_queue.last_cpu); + if (unlikely(ret == -EPIPE)) + wg_queue_enqueue_per_peer_napi(skb, PACKET_STATE_DEAD); + if (likely(!ret || ret == -EPIPE)) { + rcu_read_unlock_bh(); + return; + } +err: + wg_noise_keypair_put(PACKET_CB(skb)->keypair, false); +err_keypair: + rcu_read_unlock_bh(); + wg_peer_put(peer); + dev_kfree_skb(skb); +} + +void wg_packet_receive(struct wg_device *wg, struct sk_buff *skb) +{ + if (unlikely(prepare_skb_header(skb, wg) < 0)) + goto err; + switch (SKB_TYPE_LE32(skb)) { + case cpu_to_le32(MESSAGE_HANDSHAKE_INITIATION): + case cpu_to_le32(MESSAGE_HANDSHAKE_RESPONSE): + case cpu_to_le32(MESSAGE_HANDSHAKE_COOKIE): { + int cpu; + + if (skb_queue_len(&wg->incoming_handshakes) > + MAX_QUEUED_INCOMING_HANDSHAKES || + unlikely(!rng_is_initialized())) { + net_dbg_skb_ratelimited("%s: Dropping handshake packet from %pISpfsc\n", + wg->dev->name, skb); + goto err; + } + skb_queue_tail(&wg->incoming_handshakes, skb); + /* Queues up a call to packet_process_queued_handshake_ + * packets(skb): + */ + cpu = wg_cpumask_next_online(&wg->incoming_handshake_cpu); + queue_work_on(cpu, wg->handshake_receive_wq, + &per_cpu_ptr(wg->incoming_handshakes_worker, cpu)->work); + break; + } + case cpu_to_le32(MESSAGE_DATA): + PACKET_CB(skb)->ds = ip_tunnel_get_dsfield(ip_hdr(skb), skb); + wg_packet_consume_data(wg, skb); + break; + default: + WARN(1, "Non-exhaustive parsing of packet header lead to unknown packet type!\n"); + goto err; + } + return; + +err: + dev_kfree_skb(skb); +} diff --git a/drivers/net/wireguard/selftest/allowedips.c b/drivers/net/wireguard/selftest/allowedips.c new file mode 100644 index 000000000000..846db14cb046 --- /dev/null +++ b/drivers/net/wireguard/selftest/allowedips.c @@ -0,0 +1,683 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + * + * This contains some basic static unit tests for the allowedips data structure. + * It also has two additional modes that are disabled and meant to be used by + * folks directly playing with this file. If you define the macro + * DEBUG_PRINT_TRIE_GRAPHVIZ to be 1, then every time there's a full tree in + * memory, it will be printed out as KERN_DEBUG in a format that can be passed + * to graphviz (the dot command) to visualize it. If you define the macro + * DEBUG_RANDOM_TRIE to be 1, then there will be an extremely costly set of + * randomized tests done against a trivial implementation, which may take + * upwards of a half-hour to complete. There's no set of users who should be + * enabling these, and the only developers that should go anywhere near these + * nobs are the ones who are reading this comment. + */ + +#ifdef DEBUG + +#include + +static __init void swap_endian_and_apply_cidr(u8 *dst, const u8 *src, u8 bits, + u8 cidr) +{ + swap_endian(dst, src, bits); + memset(dst + (cidr + 7) / 8, 0, bits / 8 - (cidr + 7) / 8); + if (cidr) + dst[(cidr + 7) / 8 - 1] &= ~0U << ((8 - (cidr % 8)) % 8); +} + +static __init void print_node(struct allowedips_node *node, u8 bits) +{ + char *fmt_connection = KERN_DEBUG "\t\"%p/%d\" -> \"%p/%d\";\n"; + char *fmt_declaration = KERN_DEBUG + "\t\"%p/%d\"[style=%s, color=\"#%06x\"];\n"; + char *style = "dotted"; + u8 ip1[16], ip2[16]; + u32 color = 0; + + if (bits == 32) { + fmt_connection = KERN_DEBUG "\t\"%pI4/%d\" -> \"%pI4/%d\";\n"; + fmt_declaration = KERN_DEBUG + "\t\"%pI4/%d\"[style=%s, color=\"#%06x\"];\n"; + } else if (bits == 128) { + fmt_connection = KERN_DEBUG "\t\"%pI6/%d\" -> \"%pI6/%d\";\n"; + fmt_declaration = KERN_DEBUG + "\t\"%pI6/%d\"[style=%s, color=\"#%06x\"];\n"; + } + if (node->peer) { + hsiphash_key_t key = { { 0 } }; + + memcpy(&key, &node->peer, sizeof(node->peer)); + color = hsiphash_1u32(0xdeadbeef, &key) % 200 << 16 | + hsiphash_1u32(0xbabecafe, &key) % 200 << 8 | + hsiphash_1u32(0xabad1dea, &key) % 200; + style = "bold"; + } + swap_endian_and_apply_cidr(ip1, node->bits, bits, node->cidr); + printk(fmt_declaration, ip1, node->cidr, style, color); + if (node->bit[0]) { + swap_endian_and_apply_cidr(ip2, + rcu_dereference_raw(node->bit[0])->bits, bits, + node->cidr); + printk(fmt_connection, ip1, node->cidr, ip2, + rcu_dereference_raw(node->bit[0])->cidr); + print_node(rcu_dereference_raw(node->bit[0]), bits); + } + if (node->bit[1]) { + swap_endian_and_apply_cidr(ip2, + rcu_dereference_raw(node->bit[1])->bits, + bits, node->cidr); + printk(fmt_connection, ip1, node->cidr, ip2, + rcu_dereference_raw(node->bit[1])->cidr); + print_node(rcu_dereference_raw(node->bit[1]), bits); + } +} + +static __init void print_tree(struct allowedips_node __rcu *top, u8 bits) +{ + printk(KERN_DEBUG "digraph trie {\n"); + print_node(rcu_dereference_raw(top), bits); + printk(KERN_DEBUG "}\n"); +} + +enum { + NUM_PEERS = 2000, + NUM_RAND_ROUTES = 400, + NUM_MUTATED_ROUTES = 100, + NUM_QUERIES = NUM_RAND_ROUTES * NUM_MUTATED_ROUTES * 30 +}; + +struct horrible_allowedips { + struct hlist_head head; +}; + +struct horrible_allowedips_node { + struct hlist_node table; + union nf_inet_addr ip; + union nf_inet_addr mask; + u8 ip_version; + void *value; +}; + +static __init void horrible_allowedips_init(struct horrible_allowedips *table) +{ + INIT_HLIST_HEAD(&table->head); +} + +static __init void horrible_allowedips_free(struct horrible_allowedips *table) +{ + struct horrible_allowedips_node *node; + struct hlist_node *h; + + hlist_for_each_entry_safe(node, h, &table->head, table) { + hlist_del(&node->table); + kfree(node); + } +} + +static __init inline union nf_inet_addr horrible_cidr_to_mask(u8 cidr) +{ + union nf_inet_addr mask; + + memset(&mask, 0x00, 128 / 8); + memset(&mask, 0xff, cidr / 8); + if (cidr % 32) + mask.all[cidr / 32] = (__force u32)htonl( + (0xFFFFFFFFUL << (32 - (cidr % 32))) & 0xFFFFFFFFUL); + return mask; +} + +static __init inline u8 horrible_mask_to_cidr(union nf_inet_addr subnet) +{ + return hweight32(subnet.all[0]) + hweight32(subnet.all[1]) + + hweight32(subnet.all[2]) + hweight32(subnet.all[3]); +} + +static __init inline void +horrible_mask_self(struct horrible_allowedips_node *node) +{ + if (node->ip_version == 4) { + node->ip.ip &= node->mask.ip; + } else if (node->ip_version == 6) { + node->ip.ip6[0] &= node->mask.ip6[0]; + node->ip.ip6[1] &= node->mask.ip6[1]; + node->ip.ip6[2] &= node->mask.ip6[2]; + node->ip.ip6[3] &= node->mask.ip6[3]; + } +} + +static __init inline bool +horrible_match_v4(const struct horrible_allowedips_node *node, + struct in_addr *ip) +{ + return (ip->s_addr & node->mask.ip) == node->ip.ip; +} + +static __init inline bool +horrible_match_v6(const struct horrible_allowedips_node *node, + struct in6_addr *ip) +{ + return (ip->in6_u.u6_addr32[0] & node->mask.ip6[0]) == + node->ip.ip6[0] && + (ip->in6_u.u6_addr32[1] & node->mask.ip6[1]) == + node->ip.ip6[1] && + (ip->in6_u.u6_addr32[2] & node->mask.ip6[2]) == + node->ip.ip6[2] && + (ip->in6_u.u6_addr32[3] & node->mask.ip6[3]) == node->ip.ip6[3]; +} + +static __init void +horrible_insert_ordered(struct horrible_allowedips *table, + struct horrible_allowedips_node *node) +{ + struct horrible_allowedips_node *other = NULL, *where = NULL; + u8 my_cidr = horrible_mask_to_cidr(node->mask); + + hlist_for_each_entry(other, &table->head, table) { + if (!memcmp(&other->mask, &node->mask, + sizeof(union nf_inet_addr)) && + !memcmp(&other->ip, &node->ip, + sizeof(union nf_inet_addr)) && + other->ip_version == node->ip_version) { + other->value = node->value; + kfree(node); + return; + } + where = other; + if (horrible_mask_to_cidr(other->mask) <= my_cidr) + break; + } + if (!other && !where) + hlist_add_head(&node->table, &table->head); + else if (!other) + hlist_add_behind(&node->table, &where->table); + else + hlist_add_before(&node->table, &where->table); +} + +static __init int +horrible_allowedips_insert_v4(struct horrible_allowedips *table, + struct in_addr *ip, u8 cidr, void *value) +{ + struct horrible_allowedips_node *node = kzalloc(sizeof(*node), + GFP_KERNEL); + + if (unlikely(!node)) + return -ENOMEM; + node->ip.in = *ip; + node->mask = horrible_cidr_to_mask(cidr); + node->ip_version = 4; + node->value = value; + horrible_mask_self(node); + horrible_insert_ordered(table, node); + return 0; +} + +static __init int +horrible_allowedips_insert_v6(struct horrible_allowedips *table, + struct in6_addr *ip, u8 cidr, void *value) +{ + struct horrible_allowedips_node *node = kzalloc(sizeof(*node), + GFP_KERNEL); + + if (unlikely(!node)) + return -ENOMEM; + node->ip.in6 = *ip; + node->mask = horrible_cidr_to_mask(cidr); + node->ip_version = 6; + node->value = value; + horrible_mask_self(node); + horrible_insert_ordered(table, node); + return 0; +} + +static __init void * +horrible_allowedips_lookup_v4(struct horrible_allowedips *table, + struct in_addr *ip) +{ + struct horrible_allowedips_node *node; + void *ret = NULL; + + hlist_for_each_entry(node, &table->head, table) { + if (node->ip_version != 4) + continue; + if (horrible_match_v4(node, ip)) { + ret = node->value; + break; + } + } + return ret; +} + +static __init void * +horrible_allowedips_lookup_v6(struct horrible_allowedips *table, + struct in6_addr *ip) +{ + struct horrible_allowedips_node *node; + void *ret = NULL; + + hlist_for_each_entry(node, &table->head, table) { + if (node->ip_version != 6) + continue; + if (horrible_match_v6(node, ip)) { + ret = node->value; + break; + } + } + return ret; +} + +static __init bool randomized_test(void) +{ + unsigned int i, j, k, mutate_amount, cidr; + u8 ip[16], mutate_mask[16], mutated[16]; + struct wg_peer **peers, *peer; + struct horrible_allowedips h; + DEFINE_MUTEX(mutex); + struct allowedips t; + bool ret = false; + + mutex_init(&mutex); + + wg_allowedips_init(&t); + horrible_allowedips_init(&h); + + peers = kcalloc(NUM_PEERS, sizeof(*peers), GFP_KERNEL); + if (unlikely(!peers)) { + pr_err("allowedips random self-test malloc: FAIL\n"); + goto free; + } + for (i = 0; i < NUM_PEERS; ++i) { + peers[i] = kzalloc(sizeof(*peers[i]), GFP_KERNEL); + if (unlikely(!peers[i])) { + pr_err("allowedips random self-test malloc: FAIL\n"); + goto free; + } + kref_init(&peers[i]->refcount); + } + + mutex_lock(&mutex); + + for (i = 0; i < NUM_RAND_ROUTES; ++i) { + prandom_bytes(ip, 4); + cidr = prandom_u32_max(32) + 1; + peer = peers[prandom_u32_max(NUM_PEERS)]; + if (wg_allowedips_insert_v4(&t, (struct in_addr *)ip, cidr, + peer, &mutex) < 0) { + pr_err("allowedips random self-test malloc: FAIL\n"); + goto free_locked; + } + if (horrible_allowedips_insert_v4(&h, (struct in_addr *)ip, + cidr, peer) < 0) { + pr_err("allowedips random self-test malloc: FAIL\n"); + goto free_locked; + } + for (j = 0; j < NUM_MUTATED_ROUTES; ++j) { + memcpy(mutated, ip, 4); + prandom_bytes(mutate_mask, 4); + mutate_amount = prandom_u32_max(32); + for (k = 0; k < mutate_amount / 8; ++k) + mutate_mask[k] = 0xff; + mutate_mask[k] = 0xff + << ((8 - (mutate_amount % 8)) % 8); + for (; k < 4; ++k) + mutate_mask[k] = 0; + for (k = 0; k < 4; ++k) + mutated[k] = (mutated[k] & mutate_mask[k]) | + (~mutate_mask[k] & + prandom_u32_max(256)); + cidr = prandom_u32_max(32) + 1; + peer = peers[prandom_u32_max(NUM_PEERS)]; + if (wg_allowedips_insert_v4(&t, + (struct in_addr *)mutated, + cidr, peer, &mutex) < 0) { + pr_err("allowedips random malloc: FAIL\n"); + goto free_locked; + } + if (horrible_allowedips_insert_v4(&h, + (struct in_addr *)mutated, cidr, peer)) { + pr_err("allowedips random self-test malloc: FAIL\n"); + goto free_locked; + } + } + } + + for (i = 0; i < NUM_RAND_ROUTES; ++i) { + prandom_bytes(ip, 16); + cidr = prandom_u32_max(128) + 1; + peer = peers[prandom_u32_max(NUM_PEERS)]; + if (wg_allowedips_insert_v6(&t, (struct in6_addr *)ip, cidr, + peer, &mutex) < 0) { + pr_err("allowedips random self-test malloc: FAIL\n"); + goto free_locked; + } + if (horrible_allowedips_insert_v6(&h, (struct in6_addr *)ip, + cidr, peer) < 0) { + pr_err("allowedips random self-test malloc: FAIL\n"); + goto free_locked; + } + for (j = 0; j < NUM_MUTATED_ROUTES; ++j) { + memcpy(mutated, ip, 16); + prandom_bytes(mutate_mask, 16); + mutate_amount = prandom_u32_max(128); + for (k = 0; k < mutate_amount / 8; ++k) + mutate_mask[k] = 0xff; + mutate_mask[k] = 0xff + << ((8 - (mutate_amount % 8)) % 8); + for (; k < 4; ++k) + mutate_mask[k] = 0; + for (k = 0; k < 4; ++k) + mutated[k] = (mutated[k] & mutate_mask[k]) | + (~mutate_mask[k] & + prandom_u32_max(256)); + cidr = prandom_u32_max(128) + 1; + peer = peers[prandom_u32_max(NUM_PEERS)]; + if (wg_allowedips_insert_v6(&t, + (struct in6_addr *)mutated, + cidr, peer, &mutex) < 0) { + pr_err("allowedips random self-test malloc: FAIL\n"); + goto free_locked; + } + if (horrible_allowedips_insert_v6( + &h, (struct in6_addr *)mutated, cidr, + peer)) { + pr_err("allowedips random self-test malloc: FAIL\n"); + goto free_locked; + } + } + } + + mutex_unlock(&mutex); + + if (IS_ENABLED(DEBUG_PRINT_TRIE_GRAPHVIZ)) { + print_tree(t.root4, 32); + print_tree(t.root6, 128); + } + + for (i = 0; i < NUM_QUERIES; ++i) { + prandom_bytes(ip, 4); + if (lookup(t.root4, 32, ip) != + horrible_allowedips_lookup_v4(&h, (struct in_addr *)ip)) { + pr_err("allowedips random self-test: FAIL\n"); + goto free; + } + } + + for (i = 0; i < NUM_QUERIES; ++i) { + prandom_bytes(ip, 16); + if (lookup(t.root6, 128, ip) != + horrible_allowedips_lookup_v6(&h, (struct in6_addr *)ip)) { + pr_err("allowedips random self-test: FAIL\n"); + goto free; + } + } + ret = true; + +free: + mutex_lock(&mutex); +free_locked: + wg_allowedips_free(&t, &mutex); + mutex_unlock(&mutex); + horrible_allowedips_free(&h); + if (peers) { + for (i = 0; i < NUM_PEERS; ++i) + kfree(peers[i]); + } + kfree(peers); + return ret; +} + +static __init inline struct in_addr *ip4(u8 a, u8 b, u8 c, u8 d) +{ + static struct in_addr ip; + u8 *split = (u8 *)&ip; + + split[0] = a; + split[1] = b; + split[2] = c; + split[3] = d; + return &ip; +} + +static __init inline struct in6_addr *ip6(u32 a, u32 b, u32 c, u32 d) +{ + static struct in6_addr ip; + __be32 *split = (__be32 *)&ip; + + split[0] = cpu_to_be32(a); + split[1] = cpu_to_be32(b); + split[2] = cpu_to_be32(c); + split[3] = cpu_to_be32(d); + return &ip; +} + +static __init struct wg_peer *init_peer(void) +{ + struct wg_peer *peer = kzalloc(sizeof(*peer), GFP_KERNEL); + + if (!peer) + return NULL; + kref_init(&peer->refcount); + INIT_LIST_HEAD(&peer->allowedips_list); + return peer; +} + +#define insert(version, mem, ipa, ipb, ipc, ipd, cidr) \ + wg_allowedips_insert_v##version(&t, ip##version(ipa, ipb, ipc, ipd), \ + cidr, mem, &mutex) + +#define maybe_fail() do { \ + ++i; \ + if (!_s) { \ + pr_info("allowedips self-test %zu: FAIL\n", i); \ + success = false; \ + } \ + } while (0) + +#define test(version, mem, ipa, ipb, ipc, ipd) do { \ + bool _s = lookup(t.root##version, (version) == 4 ? 32 : 128, \ + ip##version(ipa, ipb, ipc, ipd)) == (mem); \ + maybe_fail(); \ + } while (0) + +#define test_negative(version, mem, ipa, ipb, ipc, ipd) do { \ + bool _s = lookup(t.root##version, (version) == 4 ? 32 : 128, \ + ip##version(ipa, ipb, ipc, ipd)) != (mem); \ + maybe_fail(); \ + } while (0) + +#define test_boolean(cond) do { \ + bool _s = (cond); \ + maybe_fail(); \ + } while (0) + +bool __init wg_allowedips_selftest(void) +{ + bool found_a = false, found_b = false, found_c = false, found_d = false, + found_e = false, found_other = false; + struct wg_peer *a = init_peer(), *b = init_peer(), *c = init_peer(), + *d = init_peer(), *e = init_peer(), *f = init_peer(), + *g = init_peer(), *h = init_peer(); + struct allowedips_node *iter_node; + bool success = false; + struct allowedips t; + DEFINE_MUTEX(mutex); + struct in6_addr ip; + size_t i = 0, count = 0; + __be64 part; + + mutex_init(&mutex); + mutex_lock(&mutex); + wg_allowedips_init(&t); + + if (!a || !b || !c || !d || !e || !f || !g || !h) { + pr_err("allowedips self-test malloc: FAIL\n"); + goto free; + } + + insert(4, a, 192, 168, 4, 0, 24); + insert(4, b, 192, 168, 4, 4, 32); + insert(4, c, 192, 168, 0, 0, 16); + insert(4, d, 192, 95, 5, 64, 27); + /* replaces previous entry, and maskself is required */ + insert(4, c, 192, 95, 5, 65, 27); + insert(6, d, 0x26075300, 0x60006b00, 0, 0xc05f0543, 128); + insert(6, c, 0x26075300, 0x60006b00, 0, 0, 64); + insert(4, e, 0, 0, 0, 0, 0); + insert(6, e, 0, 0, 0, 0, 0); + /* replaces previous entry */ + insert(6, f, 0, 0, 0, 0, 0); + insert(6, g, 0x24046800, 0, 0, 0, 32); + /* maskself is required */ + insert(6, h, 0x24046800, 0x40040800, 0xdeadbeef, 0xdeadbeef, 64); + insert(6, a, 0x24046800, 0x40040800, 0xdeadbeef, 0xdeadbeef, 128); + insert(6, c, 0x24446800, 0x40e40800, 0xdeaebeef, 0xdefbeef, 128); + insert(6, b, 0x24446800, 0xf0e40800, 0xeeaebeef, 0, 98); + insert(4, g, 64, 15, 112, 0, 20); + /* maskself is required */ + insert(4, h, 64, 15, 123, 211, 25); + insert(4, a, 10, 0, 0, 0, 25); + insert(4, b, 10, 0, 0, 128, 25); + insert(4, a, 10, 1, 0, 0, 30); + insert(4, b, 10, 1, 0, 4, 30); + insert(4, c, 10, 1, 0, 8, 29); + insert(4, d, 10, 1, 0, 16, 29); + + if (IS_ENABLED(DEBUG_PRINT_TRIE_GRAPHVIZ)) { + print_tree(t.root4, 32); + print_tree(t.root6, 128); + } + + success = true; + + test(4, a, 192, 168, 4, 20); + test(4, a, 192, 168, 4, 0); + test(4, b, 192, 168, 4, 4); + test(4, c, 192, 168, 200, 182); + test(4, c, 192, 95, 5, 68); + test(4, e, 192, 95, 5, 96); + test(6, d, 0x26075300, 0x60006b00, 0, 0xc05f0543); + test(6, c, 0x26075300, 0x60006b00, 0, 0xc02e01ee); + test(6, f, 0x26075300, 0x60006b01, 0, 0); + test(6, g, 0x24046800, 0x40040806, 0, 0x1006); + test(6, g, 0x24046800, 0x40040806, 0x1234, 0x5678); + test(6, f, 0x240467ff, 0x40040806, 0x1234, 0x5678); + test(6, f, 0x24046801, 0x40040806, 0x1234, 0x5678); + test(6, h, 0x24046800, 0x40040800, 0x1234, 0x5678); + test(6, h, 0x24046800, 0x40040800, 0, 0); + test(6, h, 0x24046800, 0x40040800, 0x10101010, 0x10101010); + test(6, a, 0x24046800, 0x40040800, 0xdeadbeef, 0xdeadbeef); + test(4, g, 64, 15, 116, 26); + test(4, g, 64, 15, 127, 3); + test(4, g, 64, 15, 123, 1); + test(4, h, 64, 15, 123, 128); + test(4, h, 64, 15, 123, 129); + test(4, a, 10, 0, 0, 52); + test(4, b, 10, 0, 0, 220); + test(4, a, 10, 1, 0, 2); + test(4, b, 10, 1, 0, 6); + test(4, c, 10, 1, 0, 10); + test(4, d, 10, 1, 0, 20); + + insert(4, a, 1, 0, 0, 0, 32); + insert(4, a, 64, 0, 0, 0, 32); + insert(4, a, 128, 0, 0, 0, 32); + insert(4, a, 192, 0, 0, 0, 32); + insert(4, a, 255, 0, 0, 0, 32); + wg_allowedips_remove_by_peer(&t, a, &mutex); + test_negative(4, a, 1, 0, 0, 0); + test_negative(4, a, 64, 0, 0, 0); + test_negative(4, a, 128, 0, 0, 0); + test_negative(4, a, 192, 0, 0, 0); + test_negative(4, a, 255, 0, 0, 0); + + wg_allowedips_free(&t, &mutex); + wg_allowedips_init(&t); + insert(4, a, 192, 168, 0, 0, 16); + insert(4, a, 192, 168, 0, 0, 24); + wg_allowedips_remove_by_peer(&t, a, &mutex); + test_negative(4, a, 192, 168, 0, 1); + + /* These will hit the WARN_ON(len >= 128) in free_node if something + * goes wrong. + */ + for (i = 0; i < 128; ++i) { + part = cpu_to_be64(~(1LLU << (i % 64))); + memset(&ip, 0xff, 16); + memcpy((u8 *)&ip + (i < 64) * 8, &part, 8); + wg_allowedips_insert_v6(&t, &ip, 128, a, &mutex); + } + + wg_allowedips_free(&t, &mutex); + + wg_allowedips_init(&t); + insert(4, a, 192, 95, 5, 93, 27); + insert(6, a, 0x26075300, 0x60006b00, 0, 0xc05f0543, 128); + insert(4, a, 10, 1, 0, 20, 29); + insert(6, a, 0x26075300, 0x6d8a6bf8, 0xdab1f1df, 0xc05f1523, 83); + insert(6, a, 0x26075300, 0x6d8a6bf8, 0xdab1f1df, 0xc05f1523, 21); + list_for_each_entry(iter_node, &a->allowedips_list, peer_list) { + u8 cidr, ip[16] __aligned(__alignof(u64)); + int family = wg_allowedips_read_node(iter_node, ip, &cidr); + + count++; + + if (cidr == 27 && family == AF_INET && + !memcmp(ip, ip4(192, 95, 5, 64), sizeof(struct in_addr))) + found_a = true; + else if (cidr == 128 && family == AF_INET6 && + !memcmp(ip, ip6(0x26075300, 0x60006b00, 0, 0xc05f0543), + sizeof(struct in6_addr))) + found_b = true; + else if (cidr == 29 && family == AF_INET && + !memcmp(ip, ip4(10, 1, 0, 16), sizeof(struct in_addr))) + found_c = true; + else if (cidr == 83 && family == AF_INET6 && + !memcmp(ip, ip6(0x26075300, 0x6d8a6bf8, 0xdab1e000, 0), + sizeof(struct in6_addr))) + found_d = true; + else if (cidr == 21 && family == AF_INET6 && + !memcmp(ip, ip6(0x26075000, 0, 0, 0), + sizeof(struct in6_addr))) + found_e = true; + else + found_other = true; + } + test_boolean(count == 5); + test_boolean(found_a); + test_boolean(found_b); + test_boolean(found_c); + test_boolean(found_d); + test_boolean(found_e); + test_boolean(!found_other); + + if (IS_ENABLED(DEBUG_RANDOM_TRIE) && success) + success = randomized_test(); + + if (success) + pr_info("allowedips self-tests: pass\n"); + +free: + wg_allowedips_free(&t, &mutex); + kfree(a); + kfree(b); + kfree(c); + kfree(d); + kfree(e); + kfree(f); + kfree(g); + kfree(h); + mutex_unlock(&mutex); + + return success; +} + +#undef test_negative +#undef test +#undef remove +#undef insert +#undef init_peer + +#endif diff --git a/drivers/net/wireguard/selftest/counter.c b/drivers/net/wireguard/selftest/counter.c new file mode 100644 index 000000000000..ec3c156bf91b --- /dev/null +++ b/drivers/net/wireguard/selftest/counter.c @@ -0,0 +1,111 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#ifdef DEBUG +bool __init wg_packet_counter_selftest(void) +{ + struct noise_replay_counter *counter; + unsigned int test_num = 0, i; + bool success = true; + + counter = kmalloc(sizeof(*counter), GFP_KERNEL); + if (unlikely(!counter)) { + pr_err("nonce counter self-test malloc: FAIL\n"); + return false; + } + +#define T_INIT do { \ + memset(counter, 0, sizeof(*counter)); \ + spin_lock_init(&counter->lock); \ + } while (0) +#define T_LIM (COUNTER_WINDOW_SIZE + 1) +#define T(n, v) do { \ + ++test_num; \ + if (counter_validate(counter, n) != (v)) { \ + pr_err("nonce counter self-test %u: FAIL\n", \ + test_num); \ + success = false; \ + } \ + } while (0) + + T_INIT; + /* 1 */ T(0, true); + /* 2 */ T(1, true); + /* 3 */ T(1, false); + /* 4 */ T(9, true); + /* 5 */ T(8, true); + /* 6 */ T(7, true); + /* 7 */ T(7, false); + /* 8 */ T(T_LIM, true); + /* 9 */ T(T_LIM - 1, true); + /* 10 */ T(T_LIM - 1, false); + /* 11 */ T(T_LIM - 2, true); + /* 12 */ T(2, true); + /* 13 */ T(2, false); + /* 14 */ T(T_LIM + 16, true); + /* 15 */ T(3, false); + /* 16 */ T(T_LIM + 16, false); + /* 17 */ T(T_LIM * 4, true); + /* 18 */ T(T_LIM * 4 - (T_LIM - 1), true); + /* 19 */ T(10, false); + /* 20 */ T(T_LIM * 4 - T_LIM, false); + /* 21 */ T(T_LIM * 4 - (T_LIM + 1), false); + /* 22 */ T(T_LIM * 4 - (T_LIM - 2), true); + /* 23 */ T(T_LIM * 4 + 1 - T_LIM, false); + /* 24 */ T(0, false); + /* 25 */ T(REJECT_AFTER_MESSAGES, false); + /* 26 */ T(REJECT_AFTER_MESSAGES - 1, true); + /* 27 */ T(REJECT_AFTER_MESSAGES, false); + /* 28 */ T(REJECT_AFTER_MESSAGES - 1, false); + /* 29 */ T(REJECT_AFTER_MESSAGES - 2, true); + /* 30 */ T(REJECT_AFTER_MESSAGES + 1, false); + /* 31 */ T(REJECT_AFTER_MESSAGES + 2, false); + /* 32 */ T(REJECT_AFTER_MESSAGES - 2, false); + /* 33 */ T(REJECT_AFTER_MESSAGES - 3, true); + /* 34 */ T(0, false); + + T_INIT; + for (i = 1; i <= COUNTER_WINDOW_SIZE; ++i) + T(i, true); + T(0, true); + T(0, false); + + T_INIT; + for (i = 2; i <= COUNTER_WINDOW_SIZE + 1; ++i) + T(i, true); + T(1, true); + T(0, false); + + T_INIT; + for (i = COUNTER_WINDOW_SIZE + 1; i-- > 0;) + T(i, true); + + T_INIT; + for (i = COUNTER_WINDOW_SIZE + 2; i-- > 1;) + T(i, true); + T(0, false); + + T_INIT; + for (i = COUNTER_WINDOW_SIZE + 1; i-- > 1;) + T(i, true); + T(COUNTER_WINDOW_SIZE + 1, true); + T(0, false); + + T_INIT; + for (i = COUNTER_WINDOW_SIZE + 1; i-- > 1;) + T(i, true); + T(0, true); + T(COUNTER_WINDOW_SIZE + 1, true); + +#undef T +#undef T_LIM +#undef T_INIT + + if (success) + pr_info("nonce counter self-tests: pass\n"); + kfree(counter); + return success; +} +#endif diff --git a/drivers/net/wireguard/selftest/ratelimiter.c b/drivers/net/wireguard/selftest/ratelimiter.c new file mode 100644 index 000000000000..007cd4457c5f --- /dev/null +++ b/drivers/net/wireguard/selftest/ratelimiter.c @@ -0,0 +1,226 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#ifdef DEBUG + +#include + +static const struct { + bool result; + unsigned int msec_to_sleep_before; +} expected_results[] __initconst = { + [0 ... PACKETS_BURSTABLE - 1] = { true, 0 }, + [PACKETS_BURSTABLE] = { false, 0 }, + [PACKETS_BURSTABLE + 1] = { true, MSEC_PER_SEC / PACKETS_PER_SECOND }, + [PACKETS_BURSTABLE + 2] = { false, 0 }, + [PACKETS_BURSTABLE + 3] = { true, (MSEC_PER_SEC / PACKETS_PER_SECOND) * 2 }, + [PACKETS_BURSTABLE + 4] = { true, 0 }, + [PACKETS_BURSTABLE + 5] = { false, 0 } +}; + +static __init unsigned int maximum_jiffies_at_index(int index) +{ + unsigned int total_msecs = 2 * MSEC_PER_SEC / PACKETS_PER_SECOND / 3; + int i; + + for (i = 0; i <= index; ++i) + total_msecs += expected_results[i].msec_to_sleep_before; + return msecs_to_jiffies(total_msecs); +} + +static __init int timings_test(struct sk_buff *skb4, struct iphdr *hdr4, + struct sk_buff *skb6, struct ipv6hdr *hdr6, + int *test) +{ + unsigned long loop_start_time; + int i; + + wg_ratelimiter_gc_entries(NULL); + rcu_barrier(); + loop_start_time = jiffies; + + for (i = 0; i < ARRAY_SIZE(expected_results); ++i) { + if (expected_results[i].msec_to_sleep_before) + msleep(expected_results[i].msec_to_sleep_before); + + if (time_is_before_jiffies(loop_start_time + + maximum_jiffies_at_index(i))) + return -ETIMEDOUT; + if (wg_ratelimiter_allow(skb4, &init_net) != + expected_results[i].result) + return -EXFULL; + ++(*test); + + hdr4->saddr = htonl(ntohl(hdr4->saddr) + i + 1); + if (time_is_before_jiffies(loop_start_time + + maximum_jiffies_at_index(i))) + return -ETIMEDOUT; + if (!wg_ratelimiter_allow(skb4, &init_net)) + return -EXFULL; + ++(*test); + + hdr4->saddr = htonl(ntohl(hdr4->saddr) - i - 1); + +#if IS_ENABLED(CONFIG_IPV6) + hdr6->saddr.in6_u.u6_addr32[2] = htonl(i); + hdr6->saddr.in6_u.u6_addr32[3] = htonl(i); + if (time_is_before_jiffies(loop_start_time + + maximum_jiffies_at_index(i))) + return -ETIMEDOUT; + if (wg_ratelimiter_allow(skb6, &init_net) != + expected_results[i].result) + return -EXFULL; + ++(*test); + + hdr6->saddr.in6_u.u6_addr32[0] = + htonl(ntohl(hdr6->saddr.in6_u.u6_addr32[0]) + i + 1); + if (time_is_before_jiffies(loop_start_time + + maximum_jiffies_at_index(i))) + return -ETIMEDOUT; + if (!wg_ratelimiter_allow(skb6, &init_net)) + return -EXFULL; + ++(*test); + + hdr6->saddr.in6_u.u6_addr32[0] = + htonl(ntohl(hdr6->saddr.in6_u.u6_addr32[0]) - i - 1); + + if (time_is_before_jiffies(loop_start_time + + maximum_jiffies_at_index(i))) + return -ETIMEDOUT; +#endif + } + return 0; +} + +static __init int capacity_test(struct sk_buff *skb4, struct iphdr *hdr4, + int *test) +{ + int i; + + wg_ratelimiter_gc_entries(NULL); + rcu_barrier(); + + if (atomic_read(&total_entries)) + return -EXFULL; + ++(*test); + + for (i = 0; i <= max_entries; ++i) { + hdr4->saddr = htonl(i); + if (wg_ratelimiter_allow(skb4, &init_net) != (i != max_entries)) + return -EXFULL; + ++(*test); + } + return 0; +} + +bool __init wg_ratelimiter_selftest(void) +{ + enum { TRIALS_BEFORE_GIVING_UP = 5000 }; + bool success = false; + int test = 0, trials; + struct sk_buff *skb4, *skb6 = NULL; + struct iphdr *hdr4; + struct ipv6hdr *hdr6 = NULL; + + if (IS_ENABLED(CONFIG_KASAN) || IS_ENABLED(CONFIG_UBSAN)) + return true; + + BUILD_BUG_ON(MSEC_PER_SEC % PACKETS_PER_SECOND != 0); + + if (wg_ratelimiter_init()) + goto out; + ++test; + if (wg_ratelimiter_init()) { + wg_ratelimiter_uninit(); + goto out; + } + ++test; + if (wg_ratelimiter_init()) { + wg_ratelimiter_uninit(); + wg_ratelimiter_uninit(); + goto out; + } + ++test; + + skb4 = alloc_skb(sizeof(struct iphdr), GFP_KERNEL); + if (unlikely(!skb4)) + goto err_nofree; + skb4->protocol = htons(ETH_P_IP); + hdr4 = (struct iphdr *)skb_put(skb4, sizeof(*hdr4)); + hdr4->saddr = htonl(8182); + skb_reset_network_header(skb4); + ++test; + +#if IS_ENABLED(CONFIG_IPV6) + skb6 = alloc_skb(sizeof(struct ipv6hdr), GFP_KERNEL); + if (unlikely(!skb6)) { + kfree_skb(skb4); + goto err_nofree; + } + skb6->protocol = htons(ETH_P_IPV6); + hdr6 = (struct ipv6hdr *)skb_put(skb6, sizeof(*hdr6)); + hdr6->saddr.in6_u.u6_addr32[0] = htonl(1212); + hdr6->saddr.in6_u.u6_addr32[1] = htonl(289188); + skb_reset_network_header(skb6); + ++test; +#endif + + for (trials = TRIALS_BEFORE_GIVING_UP;;) { + int test_count = 0, ret; + + ret = timings_test(skb4, hdr4, skb6, hdr6, &test_count); + if (ret == -ETIMEDOUT) { + if (!trials--) { + test += test_count; + goto err; + } + msleep(500); + continue; + } else if (ret < 0) { + test += test_count; + goto err; + } else { + test += test_count; + break; + } + } + + for (trials = TRIALS_BEFORE_GIVING_UP;;) { + int test_count = 0; + + if (capacity_test(skb4, hdr4, &test_count) < 0) { + if (!trials--) { + test += test_count; + goto err; + } + msleep(50); + continue; + } + test += test_count; + break; + } + + success = true; + +err: + kfree_skb(skb4); +#if IS_ENABLED(CONFIG_IPV6) + kfree_skb(skb6); +#endif +err_nofree: + wg_ratelimiter_uninit(); + wg_ratelimiter_uninit(); + wg_ratelimiter_uninit(); + /* Uninit one extra time to check underflow detection. */ + wg_ratelimiter_uninit(); +out: + if (success) + pr_info("ratelimiter self-tests: pass\n"); + else + pr_err("ratelimiter self-test %d: FAIL\n", test); + + return success; +} +#endif diff --git a/drivers/net/wireguard/send.c b/drivers/net/wireguard/send.c new file mode 100644 index 000000000000..f74b9341ab0f --- /dev/null +++ b/drivers/net/wireguard/send.c @@ -0,0 +1,422 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#include "queueing.h" +#include "timers.h" +#include "device.h" +#include "peer.h" +#include "socket.h" +#include "messages.h" +#include "cookie.h" + +#include +#include +#include +#include +#include +#include + +static void wg_packet_send_handshake_initiation(struct wg_peer *peer) +{ + struct message_handshake_initiation packet; + + if (!wg_birthdate_has_expired(atomic64_read(&peer->last_sent_handshake), + REKEY_TIMEOUT)) + return; /* This function is rate limited. */ + + atomic64_set(&peer->last_sent_handshake, ktime_get_coarse_boottime_ns()); + net_dbg_ratelimited("%s: Sending handshake initiation to peer %llu (%pISpfsc)\n", + peer->device->dev->name, peer->internal_id, + &peer->endpoint.addr); + + if (wg_noise_handshake_create_initiation(&packet, &peer->handshake)) { + wg_cookie_add_mac_to_packet(&packet, sizeof(packet), peer); + wg_timers_any_authenticated_packet_traversal(peer); + wg_timers_any_authenticated_packet_sent(peer); + atomic64_set(&peer->last_sent_handshake, + ktime_get_coarse_boottime_ns()); + wg_socket_send_buffer_to_peer(peer, &packet, sizeof(packet), + HANDSHAKE_DSCP); + wg_timers_handshake_initiated(peer); + } +} + +void wg_packet_handshake_send_worker(struct work_struct *work) +{ + struct wg_peer *peer = container_of(work, struct wg_peer, + transmit_handshake_work); + + wg_packet_send_handshake_initiation(peer); + wg_peer_put(peer); +} + +void wg_packet_send_queued_handshake_initiation(struct wg_peer *peer, + bool is_retry) +{ + if (!is_retry) + peer->timer_handshake_attempts = 0; + + rcu_read_lock_bh(); + /* We check last_sent_handshake here in addition to the actual function + * we're queueing up, so that we don't queue things if not strictly + * necessary: + */ + if (!wg_birthdate_has_expired(atomic64_read(&peer->last_sent_handshake), + REKEY_TIMEOUT) || + unlikely(READ_ONCE(peer->is_dead))) + goto out; + + wg_peer_get(peer); + /* Queues up calling packet_send_queued_handshakes(peer), where we do a + * peer_put(peer) after: + */ + if (!queue_work(peer->device->handshake_send_wq, + &peer->transmit_handshake_work)) + /* If the work was already queued, we want to drop the + * extra reference: + */ + wg_peer_put(peer); +out: + rcu_read_unlock_bh(); +} + +void wg_packet_send_handshake_response(struct wg_peer *peer) +{ + struct message_handshake_response packet; + + atomic64_set(&peer->last_sent_handshake, ktime_get_coarse_boottime_ns()); + net_dbg_ratelimited("%s: Sending handshake response to peer %llu (%pISpfsc)\n", + peer->device->dev->name, peer->internal_id, + &peer->endpoint.addr); + + if (wg_noise_handshake_create_response(&packet, &peer->handshake)) { + wg_cookie_add_mac_to_packet(&packet, sizeof(packet), peer); + if (wg_noise_handshake_begin_session(&peer->handshake, + &peer->keypairs)) { + wg_timers_session_derived(peer); + wg_timers_any_authenticated_packet_traversal(peer); + wg_timers_any_authenticated_packet_sent(peer); + atomic64_set(&peer->last_sent_handshake, + ktime_get_coarse_boottime_ns()); + wg_socket_send_buffer_to_peer(peer, &packet, + sizeof(packet), + HANDSHAKE_DSCP); + } + } +} + +void wg_packet_send_handshake_cookie(struct wg_device *wg, + struct sk_buff *initiating_skb, + __le32 sender_index) +{ + struct message_handshake_cookie packet; + + net_dbg_skb_ratelimited("%s: Sending cookie response for denied handshake message for %pISpfsc\n", + wg->dev->name, initiating_skb); + wg_cookie_message_create(&packet, initiating_skb, sender_index, + &wg->cookie_checker); + wg_socket_send_buffer_as_reply_to_skb(wg, initiating_skb, &packet, + sizeof(packet)); +} + +static void keep_key_fresh(struct wg_peer *peer) +{ + struct noise_keypair *keypair; + bool send; + + rcu_read_lock_bh(); + keypair = rcu_dereference_bh(peer->keypairs.current_keypair); + send = keypair && READ_ONCE(keypair->sending.is_valid) && + (atomic64_read(&keypair->sending_counter) > REKEY_AFTER_MESSAGES || + (keypair->i_am_the_initiator && + wg_birthdate_has_expired(keypair->sending.birthdate, REKEY_AFTER_TIME))); + rcu_read_unlock_bh(); + + if (unlikely(send)) + wg_packet_send_queued_handshake_initiation(peer, false); +} + +static unsigned int calculate_skb_padding(struct sk_buff *skb) +{ + unsigned int padded_size, last_unit = skb->len; + + if (unlikely(!PACKET_CB(skb)->mtu)) + return ALIGN(last_unit, MESSAGE_PADDING_MULTIPLE) - last_unit; + + /* We do this modulo business with the MTU, just in case the networking + * layer gives us a packet that's bigger than the MTU. In that case, we + * wouldn't want the final subtraction to overflow in the case of the + * padded_size being clamped. Fortunately, that's very rarely the case, + * so we optimize for that not happening. + */ + if (unlikely(last_unit > PACKET_CB(skb)->mtu)) + last_unit %= PACKET_CB(skb)->mtu; + + padded_size = min(PACKET_CB(skb)->mtu, + ALIGN(last_unit, MESSAGE_PADDING_MULTIPLE)); + return padded_size - last_unit; +} + +static bool encrypt_packet(struct sk_buff *skb, struct noise_keypair *keypair) +{ + unsigned int padding_len, plaintext_len, trailer_len; + struct scatterlist sg[MAX_SKB_FRAGS + 8]; + struct message_data *header; + struct sk_buff *trailer; + int num_frags; + + /* Force hash calculation before encryption so that flow analysis is + * consistent over the inner packet. + */ + skb_get_hash(skb); + + /* Calculate lengths. */ + padding_len = calculate_skb_padding(skb); + trailer_len = padding_len + noise_encrypted_len(0); + plaintext_len = skb->len + padding_len; + + /* Expand data section to have room for padding and auth tag. */ + num_frags = skb_cow_data(skb, trailer_len, &trailer); + if (unlikely(num_frags < 0 || num_frags > ARRAY_SIZE(sg))) + return false; + + /* Set the padding to zeros, and make sure it and the auth tag are part + * of the skb. + */ + memset(skb_tail_pointer(trailer), 0, padding_len); + + /* Expand head section to have room for our header and the network + * stack's headers. + */ + if (unlikely(skb_cow_head(skb, DATA_PACKET_HEAD_ROOM) < 0)) + return false; + + /* Finalize checksum calculation for the inner packet, if required. */ + if (unlikely(skb->ip_summed == CHECKSUM_PARTIAL && + skb_checksum_help(skb))) + return false; + + /* Only after checksumming can we safely add on the padding at the end + * and the header. + */ + skb_set_inner_network_header(skb, 0); + header = (struct message_data *)skb_push(skb, sizeof(*header)); + header->header.type = cpu_to_le32(MESSAGE_DATA); + header->key_idx = keypair->remote_index; + header->counter = cpu_to_le64(PACKET_CB(skb)->nonce); + pskb_put(skb, trailer, trailer_len); + + /* Now we can encrypt the scattergather segments */ + sg_init_table(sg, num_frags); + if (skb_to_sgvec(skb, sg, sizeof(struct message_data), + noise_encrypted_len(plaintext_len)) <= 0) + return false; + return chacha20poly1305_encrypt_sg_inplace(sg, plaintext_len, NULL, 0, + PACKET_CB(skb)->nonce, + keypair->sending.key); +} + +void wg_packet_send_keepalive(struct wg_peer *peer) +{ + struct sk_buff *skb; + + if (skb_queue_empty(&peer->staged_packet_queue)) { + skb = alloc_skb(DATA_PACKET_HEAD_ROOM + MESSAGE_MINIMUM_LENGTH, + GFP_ATOMIC); + if (unlikely(!skb)) + return; + skb_reserve(skb, DATA_PACKET_HEAD_ROOM); + skb->dev = peer->device->dev; + PACKET_CB(skb)->mtu = skb->dev->mtu; + skb_queue_tail(&peer->staged_packet_queue, skb); + net_dbg_ratelimited("%s: Sending keepalive packet to peer %llu (%pISpfsc)\n", + peer->device->dev->name, peer->internal_id, + &peer->endpoint.addr); + } + + wg_packet_send_staged_packets(peer); +} + +static void wg_packet_create_data_done(struct sk_buff *first, + struct wg_peer *peer) +{ + struct sk_buff *skb, *next; + bool is_keepalive, data_sent = false; + + wg_timers_any_authenticated_packet_traversal(peer); + wg_timers_any_authenticated_packet_sent(peer); + skb_list_walk_safe(first, skb, next) { + is_keepalive = skb->len == message_data_len(0); + if (likely(!wg_socket_send_skb_to_peer(peer, skb, + PACKET_CB(skb)->ds) && !is_keepalive)) + data_sent = true; + } + + if (likely(data_sent)) + wg_timers_data_sent(peer); + + keep_key_fresh(peer); +} + +void wg_packet_tx_worker(struct work_struct *work) +{ + struct crypt_queue *queue = container_of(work, struct crypt_queue, + work); + struct noise_keypair *keypair; + enum packet_state state; + struct sk_buff *first; + struct wg_peer *peer; + + while ((first = __ptr_ring_peek(&queue->ring)) != NULL && + (state = atomic_read_acquire(&PACKET_CB(first)->state)) != + PACKET_STATE_UNCRYPTED) { + __ptr_ring_discard_one(&queue->ring); + peer = PACKET_PEER(first); + keypair = PACKET_CB(first)->keypair; + + if (likely(state == PACKET_STATE_CRYPTED)) + wg_packet_create_data_done(first, peer); + else + kfree_skb_list(first); + + wg_noise_keypair_put(keypair, false); + wg_peer_put(peer); + if (need_resched()) + cond_resched(); + } +} + +void wg_packet_encrypt_worker(struct work_struct *work) +{ + struct crypt_queue *queue = container_of(work, struct multicore_worker, + work)->ptr; + struct sk_buff *first, *skb, *next; + + while ((first = ptr_ring_consume_bh(&queue->ring)) != NULL) { + enum packet_state state = PACKET_STATE_CRYPTED; + + skb_list_walk_safe(first, skb, next) { + if (likely(encrypt_packet(skb, + PACKET_CB(first)->keypair))) { + wg_reset_packet(skb, true); + } else { + state = PACKET_STATE_DEAD; + break; + } + } + wg_queue_enqueue_per_peer(&PACKET_PEER(first)->tx_queue, first, + state); + if (need_resched()) + cond_resched(); + } +} + +static void wg_packet_create_data(struct sk_buff *first) +{ + struct wg_peer *peer = PACKET_PEER(first); + struct wg_device *wg = peer->device; + int ret = -EINVAL; + + rcu_read_lock_bh(); + if (unlikely(READ_ONCE(peer->is_dead))) + goto err; + + ret = wg_queue_enqueue_per_device_and_peer(&wg->encrypt_queue, + &peer->tx_queue, first, + wg->packet_crypt_wq, + &wg->encrypt_queue.last_cpu); + if (unlikely(ret == -EPIPE)) + wg_queue_enqueue_per_peer(&peer->tx_queue, first, + PACKET_STATE_DEAD); +err: + rcu_read_unlock_bh(); + if (likely(!ret || ret == -EPIPE)) + return; + wg_noise_keypair_put(PACKET_CB(first)->keypair, false); + wg_peer_put(peer); + kfree_skb_list(first); +} + +void wg_packet_purge_staged_packets(struct wg_peer *peer) +{ + spin_lock_bh(&peer->staged_packet_queue.lock); + peer->device->dev->stats.tx_dropped += peer->staged_packet_queue.qlen; + __skb_queue_purge(&peer->staged_packet_queue); + spin_unlock_bh(&peer->staged_packet_queue.lock); +} + +void wg_packet_send_staged_packets(struct wg_peer *peer) +{ + struct noise_keypair *keypair; + struct sk_buff_head packets; + struct sk_buff *skb; + + /* Steal the current queue into our local one. */ + __skb_queue_head_init(&packets); + spin_lock_bh(&peer->staged_packet_queue.lock); + skb_queue_splice_init(&peer->staged_packet_queue, &packets); + spin_unlock_bh(&peer->staged_packet_queue.lock); + if (unlikely(skb_queue_empty(&packets))) + return; + + /* First we make sure we have a valid reference to a valid key. */ + rcu_read_lock_bh(); + keypair = wg_noise_keypair_get( + rcu_dereference_bh(peer->keypairs.current_keypair)); + rcu_read_unlock_bh(); + if (unlikely(!keypair)) + goto out_nokey; + if (unlikely(!READ_ONCE(keypair->sending.is_valid))) + goto out_nokey; + if (unlikely(wg_birthdate_has_expired(keypair->sending.birthdate, + REJECT_AFTER_TIME))) + goto out_invalid; + + /* After we know we have a somewhat valid key, we now try to assign + * nonces to all of the packets in the queue. If we can't assign nonces + * for all of them, we just consider it a failure and wait for the next + * handshake. + */ + skb_queue_walk(&packets, skb) { + /* 0 for no outer TOS: no leak. TODO: at some later point, we + * might consider using flowi->tos as outer instead. + */ + PACKET_CB(skb)->ds = ip_tunnel_ecn_encap(0, ip_hdr(skb), skb); + PACKET_CB(skb)->nonce = + atomic64_inc_return(&keypair->sending_counter) - 1; + if (unlikely(PACKET_CB(skb)->nonce >= REJECT_AFTER_MESSAGES)) + goto out_invalid; + } + + packets.prev->next = NULL; + wg_peer_get(keypair->entry.peer); + PACKET_CB(packets.next)->keypair = keypair; + wg_packet_create_data(packets.next); + return; + +out_invalid: + WRITE_ONCE(keypair->sending.is_valid, false); +out_nokey: + wg_noise_keypair_put(keypair, false); + + /* We orphan the packets if we're waiting on a handshake, so that they + * don't block a socket's pool. + */ + skb_queue_walk(&packets, skb) + skb_orphan(skb); + /* Then we put them back on the top of the queue. We're not too + * concerned about accidentally getting things a little out of order if + * packets are being added really fast, because this queue is for before + * packets can even be sent and it's small anyway. + */ + spin_lock_bh(&peer->staged_packet_queue.lock); + skb_queue_splice(&packets, &peer->staged_packet_queue); + spin_unlock_bh(&peer->staged_packet_queue.lock); + + /* If we're exiting because there's something wrong with the key, it + * means we should initiate a new handshake. + */ + wg_packet_send_queued_handshake_initiation(peer, false); +} diff --git a/drivers/net/wireguard/socket.c b/drivers/net/wireguard/socket.c new file mode 100644 index 000000000000..c33e2c81635f --- /dev/null +++ b/drivers/net/wireguard/socket.c @@ -0,0 +1,436 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#include "device.h" +#include "peer.h" +#include "socket.h" +#include "queueing.h" +#include "messages.h" + +#include +#include +#include +#include +#include +#include +#include + +static int send4(struct wg_device *wg, struct sk_buff *skb, + struct endpoint *endpoint, u8 ds, struct dst_cache *cache) +{ + struct flowi4 fl = { + .saddr = endpoint->src4.s_addr, + .daddr = endpoint->addr4.sin_addr.s_addr, + .fl4_dport = endpoint->addr4.sin_port, + .flowi4_mark = wg->fwmark, + .flowi4_proto = IPPROTO_UDP + }; + struct rtable *rt = NULL; + struct sock *sock; + int ret = 0; + + skb_mark_not_on_list(skb); + skb->dev = wg->dev; + skb->mark = wg->fwmark; + + rcu_read_lock_bh(); + sock = rcu_dereference_bh(wg->sock4); + + if (unlikely(!sock)) { + ret = -ENONET; + goto err; + } + + fl.fl4_sport = inet_sk(sock)->inet_sport; + + if (cache) + rt = dst_cache_get_ip4(cache, &fl.saddr); + + if (!rt) { + security_sk_classify_flow(sock, flowi4_to_flowi(&fl)); + if (unlikely(!inet_confirm_addr(sock_net(sock), NULL, 0, + fl.saddr, RT_SCOPE_HOST))) { + endpoint->src4.s_addr = 0; + *(__force __be32 *)&endpoint->src_if4 = 0; + fl.saddr = 0; + if (cache) + dst_cache_reset(cache); + } + rt = ip_route_output_flow(sock_net(sock), &fl, sock); + if (unlikely(endpoint->src_if4 && ((IS_ERR(rt) && + PTR_ERR(rt) == -EINVAL) || (!IS_ERR(rt) && + rt->dst.dev->ifindex != endpoint->src_if4)))) { + endpoint->src4.s_addr = 0; + *(__force __be32 *)&endpoint->src_if4 = 0; + fl.saddr = 0; + if (cache) + dst_cache_reset(cache); + if (!IS_ERR(rt)) + ip_rt_put(rt); + rt = ip_route_output_flow(sock_net(sock), &fl, sock); + } + if (unlikely(IS_ERR(rt))) { + ret = PTR_ERR(rt); + net_dbg_ratelimited("%s: No route to %pISpfsc, error %d\n", + wg->dev->name, &endpoint->addr, ret); + goto err; + } + if (cache) + dst_cache_set_ip4(cache, &rt->dst, fl.saddr); + } + + skb->ignore_df = 1; + udp_tunnel_xmit_skb(rt, sock, skb, fl.saddr, fl.daddr, ds, + ip4_dst_hoplimit(&rt->dst), 0, fl.fl4_sport, + fl.fl4_dport, false, false); + goto out; + +err: + kfree_skb(skb); +out: + rcu_read_unlock_bh(); + return ret; +} + +static int send6(struct wg_device *wg, struct sk_buff *skb, + struct endpoint *endpoint, u8 ds, struct dst_cache *cache) +{ +#if IS_ENABLED(CONFIG_IPV6) + struct flowi6 fl = { + .saddr = endpoint->src6, + .daddr = endpoint->addr6.sin6_addr, + .fl6_dport = endpoint->addr6.sin6_port, + .flowi6_mark = wg->fwmark, + .flowi6_oif = endpoint->addr6.sin6_scope_id, + .flowi6_proto = IPPROTO_UDP + /* TODO: addr->sin6_flowinfo */ + }; + struct dst_entry *dst = NULL; + struct sock *sock; + int ret = 0; + + skb_mark_not_on_list(skb); + skb->dev = wg->dev; + skb->mark = wg->fwmark; + + rcu_read_lock_bh(); + sock = rcu_dereference_bh(wg->sock6); + + if (unlikely(!sock)) { + ret = -ENONET; + goto err; + } + + fl.fl6_sport = inet_sk(sock)->inet_sport; + + if (cache) + dst = dst_cache_get_ip6(cache, &fl.saddr); + + if (!dst) { + security_sk_classify_flow(sock, flowi6_to_flowi(&fl)); + if (unlikely(!ipv6_addr_any(&fl.saddr) && + !ipv6_chk_addr(sock_net(sock), &fl.saddr, NULL, 0))) { + endpoint->src6 = fl.saddr = in6addr_any; + if (cache) + dst_cache_reset(cache); + } + dst = ipv6_stub->ipv6_dst_lookup_flow(sock_net(sock), sock, &fl, + NULL); + if (unlikely(IS_ERR(dst))) { + ret = PTR_ERR(dst); + net_dbg_ratelimited("%s: No route to %pISpfsc, error %d\n", + wg->dev->name, &endpoint->addr, ret); + goto err; + } + if (cache) + dst_cache_set_ip6(cache, dst, &fl.saddr); + } + + skb->ignore_df = 1; + udp_tunnel6_xmit_skb(dst, sock, skb, skb->dev, &fl.saddr, &fl.daddr, ds, + ip6_dst_hoplimit(dst), 0, fl.fl6_sport, + fl.fl6_dport, false); + goto out; + +err: + kfree_skb(skb); +out: + rcu_read_unlock_bh(); + return ret; +#else + return -EAFNOSUPPORT; +#endif +} + +int wg_socket_send_skb_to_peer(struct wg_peer *peer, struct sk_buff *skb, u8 ds) +{ + size_t skb_len = skb->len; + int ret = -EAFNOSUPPORT; + + read_lock_bh(&peer->endpoint_lock); + if (peer->endpoint.addr.sa_family == AF_INET) + ret = send4(peer->device, skb, &peer->endpoint, ds, + &peer->endpoint_cache); + else if (peer->endpoint.addr.sa_family == AF_INET6) + ret = send6(peer->device, skb, &peer->endpoint, ds, + &peer->endpoint_cache); + else + dev_kfree_skb(skb); + if (likely(!ret)) + peer->tx_bytes += skb_len; + read_unlock_bh(&peer->endpoint_lock); + + return ret; +} + +int wg_socket_send_buffer_to_peer(struct wg_peer *peer, void *buffer, + size_t len, u8 ds) +{ + struct sk_buff *skb = alloc_skb(len + SKB_HEADER_LEN, GFP_ATOMIC); + + if (unlikely(!skb)) + return -ENOMEM; + + skb_reserve(skb, SKB_HEADER_LEN); + skb_set_inner_network_header(skb, 0); + skb_put_data(skb, buffer, len); + return wg_socket_send_skb_to_peer(peer, skb, ds); +} + +int wg_socket_send_buffer_as_reply_to_skb(struct wg_device *wg, + struct sk_buff *in_skb, void *buffer, + size_t len) +{ + int ret = 0; + struct sk_buff *skb; + struct endpoint endpoint; + + if (unlikely(!in_skb)) + return -EINVAL; + ret = wg_socket_endpoint_from_skb(&endpoint, in_skb); + if (unlikely(ret < 0)) + return ret; + + skb = alloc_skb(len + SKB_HEADER_LEN, GFP_ATOMIC); + if (unlikely(!skb)) + return -ENOMEM; + skb_reserve(skb, SKB_HEADER_LEN); + skb_set_inner_network_header(skb, 0); + skb_put_data(skb, buffer, len); + + if (endpoint.addr.sa_family == AF_INET) + ret = send4(wg, skb, &endpoint, 0, NULL); + else if (endpoint.addr.sa_family == AF_INET6) + ret = send6(wg, skb, &endpoint, 0, NULL); + /* No other possibilities if the endpoint is valid, which it is, + * as we checked above. + */ + + return ret; +} + +int wg_socket_endpoint_from_skb(struct endpoint *endpoint, + const struct sk_buff *skb) +{ + memset(endpoint, 0, sizeof(*endpoint)); + if (skb->protocol == htons(ETH_P_IP)) { + endpoint->addr4.sin_family = AF_INET; + endpoint->addr4.sin_port = udp_hdr(skb)->source; + endpoint->addr4.sin_addr.s_addr = ip_hdr(skb)->saddr; + endpoint->src4.s_addr = ip_hdr(skb)->daddr; + endpoint->src_if4 = skb->skb_iif; + } else if (skb->protocol == htons(ETH_P_IPV6)) { + endpoint->addr6.sin6_family = AF_INET6; + endpoint->addr6.sin6_port = udp_hdr(skb)->source; + endpoint->addr6.sin6_addr = ipv6_hdr(skb)->saddr; + endpoint->addr6.sin6_scope_id = ipv6_iface_scope_id( + &ipv6_hdr(skb)->saddr, skb->skb_iif); + endpoint->src6 = ipv6_hdr(skb)->daddr; + } else { + return -EINVAL; + } + return 0; +} + +static bool endpoint_eq(const struct endpoint *a, const struct endpoint *b) +{ + return (a->addr.sa_family == AF_INET && b->addr.sa_family == AF_INET && + a->addr4.sin_port == b->addr4.sin_port && + a->addr4.sin_addr.s_addr == b->addr4.sin_addr.s_addr && + a->src4.s_addr == b->src4.s_addr && a->src_if4 == b->src_if4) || + (a->addr.sa_family == AF_INET6 && + b->addr.sa_family == AF_INET6 && + a->addr6.sin6_port == b->addr6.sin6_port && + ipv6_addr_equal(&a->addr6.sin6_addr, &b->addr6.sin6_addr) && + a->addr6.sin6_scope_id == b->addr6.sin6_scope_id && + ipv6_addr_equal(&a->src6, &b->src6)) || + unlikely(!a->addr.sa_family && !b->addr.sa_family); +} + +void wg_socket_set_peer_endpoint(struct wg_peer *peer, + const struct endpoint *endpoint) +{ + /* First we check unlocked, in order to optimize, since it's pretty rare + * that an endpoint will change. If we happen to be mid-write, and two + * CPUs wind up writing the same thing or something slightly different, + * it doesn't really matter much either. + */ + if (endpoint_eq(endpoint, &peer->endpoint)) + return; + write_lock_bh(&peer->endpoint_lock); + if (endpoint->addr.sa_family == AF_INET) { + peer->endpoint.addr4 = endpoint->addr4; + peer->endpoint.src4 = endpoint->src4; + peer->endpoint.src_if4 = endpoint->src_if4; + } else if (endpoint->addr.sa_family == AF_INET6) { + peer->endpoint.addr6 = endpoint->addr6; + peer->endpoint.src6 = endpoint->src6; + } else { + goto out; + } + dst_cache_reset(&peer->endpoint_cache); +out: + write_unlock_bh(&peer->endpoint_lock); +} + +void wg_socket_set_peer_endpoint_from_skb(struct wg_peer *peer, + const struct sk_buff *skb) +{ + struct endpoint endpoint; + + if (!wg_socket_endpoint_from_skb(&endpoint, skb)) + wg_socket_set_peer_endpoint(peer, &endpoint); +} + +void wg_socket_clear_peer_endpoint_src(struct wg_peer *peer) +{ + write_lock_bh(&peer->endpoint_lock); + memset(&peer->endpoint.src6, 0, sizeof(peer->endpoint.src6)); + dst_cache_reset(&peer->endpoint_cache); + write_unlock_bh(&peer->endpoint_lock); +} + +static int wg_receive(struct sock *sk, struct sk_buff *skb) +{ + struct wg_device *wg; + + if (unlikely(!sk)) + goto err; + wg = sk->sk_user_data; + if (unlikely(!wg)) + goto err; + skb_mark_not_on_list(skb); + wg_packet_receive(wg, skb); + return 0; + +err: + kfree_skb(skb); + return 0; +} + +static void sock_free(struct sock *sock) +{ + if (unlikely(!sock)) + return; + sk_clear_memalloc(sock); + udp_tunnel_sock_release(sock->sk_socket); +} + +static void set_sock_opts(struct socket *sock) +{ + sock->sk->sk_allocation = GFP_ATOMIC; + sock->sk->sk_sndbuf = INT_MAX; + sk_set_memalloc(sock->sk); +} + +int wg_socket_init(struct wg_device *wg, u16 port) +{ + struct net *net; + int ret; + struct udp_tunnel_sock_cfg cfg = { + .sk_user_data = wg, + .encap_type = 1, + .encap_rcv = wg_receive + }; + struct socket *new4 = NULL, *new6 = NULL; + struct udp_port_cfg port4 = { + .family = AF_INET, + .local_ip.s_addr = htonl(INADDR_ANY), + .local_udp_port = htons(port), + .use_udp_checksums = true + }; +#if IS_ENABLED(CONFIG_IPV6) + int retries = 0; + struct udp_port_cfg port6 = { + .family = AF_INET6, + .local_ip6 = IN6ADDR_ANY_INIT, + .use_udp6_tx_checksums = true, + .use_udp6_rx_checksums = true, + .ipv6_v6only = true + }; +#endif + + rcu_read_lock(); + net = rcu_dereference(wg->creating_net); + net = net ? maybe_get_net(net) : NULL; + rcu_read_unlock(); + if (unlikely(!net)) + return -ENONET; + +#if IS_ENABLED(CONFIG_IPV6) +retry: +#endif + + ret = udp_sock_create(net, &port4, &new4); + if (ret < 0) { + pr_err("%s: Could not create IPv4 socket\n", wg->dev->name); + goto out; + } + set_sock_opts(new4); + setup_udp_tunnel_sock(net, new4, &cfg); + +#if IS_ENABLED(CONFIG_IPV6) + if (ipv6_mod_enabled()) { + port6.local_udp_port = inet_sk(new4->sk)->inet_sport; + ret = udp_sock_create(net, &port6, &new6); + if (ret < 0) { + udp_tunnel_sock_release(new4); + if (ret == -EADDRINUSE && !port && retries++ < 100) + goto retry; + pr_err("%s: Could not create IPv6 socket\n", + wg->dev->name); + goto out; + } + set_sock_opts(new6); + setup_udp_tunnel_sock(net, new6, &cfg); + } +#endif + + wg_socket_reinit(wg, new4->sk, new6 ? new6->sk : NULL); + ret = 0; +out: + put_net(net); + return ret; +} + +void wg_socket_reinit(struct wg_device *wg, struct sock *new4, + struct sock *new6) +{ + struct sock *old4, *old6; + + mutex_lock(&wg->socket_update_lock); + old4 = rcu_dereference_protected(wg->sock4, + lockdep_is_held(&wg->socket_update_lock)); + old6 = rcu_dereference_protected(wg->sock6, + lockdep_is_held(&wg->socket_update_lock)); + rcu_assign_pointer(wg->sock4, new4); + rcu_assign_pointer(wg->sock6, new6); + if (new4) + wg->incoming_port = ntohs(inet_sk(new4)->inet_sport); + mutex_unlock(&wg->socket_update_lock); + synchronize_rcu(); + sock_free(old4); + sock_free(old6); +} diff --git a/drivers/net/wireguard/socket.h b/drivers/net/wireguard/socket.h new file mode 100644 index 000000000000..bab5848efbcd --- /dev/null +++ b/drivers/net/wireguard/socket.h @@ -0,0 +1,44 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#ifndef _WG_SOCKET_H +#define _WG_SOCKET_H + +#include +#include +#include +#include + +int wg_socket_init(struct wg_device *wg, u16 port); +void wg_socket_reinit(struct wg_device *wg, struct sock *new4, + struct sock *new6); +int wg_socket_send_buffer_to_peer(struct wg_peer *peer, void *data, + size_t len, u8 ds); +int wg_socket_send_skb_to_peer(struct wg_peer *peer, struct sk_buff *skb, + u8 ds); +int wg_socket_send_buffer_as_reply_to_skb(struct wg_device *wg, + struct sk_buff *in_skb, + void *out_buffer, size_t len); + +int wg_socket_endpoint_from_skb(struct endpoint *endpoint, + const struct sk_buff *skb); +void wg_socket_set_peer_endpoint(struct wg_peer *peer, + const struct endpoint *endpoint); +void wg_socket_set_peer_endpoint_from_skb(struct wg_peer *peer, + const struct sk_buff *skb); +void wg_socket_clear_peer_endpoint_src(struct wg_peer *peer); + +#if defined(CONFIG_DYNAMIC_DEBUG) || defined(DEBUG) +#define net_dbg_skb_ratelimited(fmt, dev, skb, ...) do { \ + struct endpoint __endpoint; \ + wg_socket_endpoint_from_skb(&__endpoint, skb); \ + net_dbg_ratelimited(fmt, dev, &__endpoint.addr, \ + ##__VA_ARGS__); \ + } while (0) +#else +#define net_dbg_skb_ratelimited(fmt, skb, ...) +#endif + +#endif /* _WG_SOCKET_H */ diff --git a/drivers/net/wireguard/timers.c b/drivers/net/wireguard/timers.c new file mode 100644 index 000000000000..d54d32ac9bc4 --- /dev/null +++ b/drivers/net/wireguard/timers.c @@ -0,0 +1,243 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#include "timers.h" +#include "device.h" +#include "peer.h" +#include "queueing.h" +#include "socket.h" + +/* + * - Timer for retransmitting the handshake if we don't hear back after + * `REKEY_TIMEOUT + jitter` ms. + * + * - Timer for sending empty packet if we have received a packet but after have + * not sent one for `KEEPALIVE_TIMEOUT` ms. + * + * - Timer for initiating new handshake if we have sent a packet but after have + * not received one (even empty) for `(KEEPALIVE_TIMEOUT + REKEY_TIMEOUT) + + * jitter` ms. + * + * - Timer for zeroing out all ephemeral keys after `(REJECT_AFTER_TIME * 3)` ms + * if no new keys have been received. + * + * - Timer for, if enabled, sending an empty authenticated packet every user- + * specified seconds. + */ + +static inline void mod_peer_timer(struct wg_peer *peer, + struct timer_list *timer, + unsigned long expires) +{ + rcu_read_lock_bh(); + if (likely(netif_running(peer->device->dev) && + !READ_ONCE(peer->is_dead))) + mod_timer(timer, expires); + rcu_read_unlock_bh(); +} + +static void wg_expired_retransmit_handshake(struct timer_list *timer) +{ + struct wg_peer *peer = from_timer(peer, timer, + timer_retransmit_handshake); + + if (peer->timer_handshake_attempts > MAX_TIMER_HANDSHAKES) { + pr_debug("%s: Handshake for peer %llu (%pISpfsc) did not complete after %d attempts, giving up\n", + peer->device->dev->name, peer->internal_id, + &peer->endpoint.addr, MAX_TIMER_HANDSHAKES + 2); + + del_timer(&peer->timer_send_keepalive); + /* We drop all packets without a keypair and don't try again, + * if we try unsuccessfully for too long to make a handshake. + */ + wg_packet_purge_staged_packets(peer); + + /* We set a timer for destroying any residue that might be left + * of a partial exchange. + */ + if (!timer_pending(&peer->timer_zero_key_material)) + mod_peer_timer(peer, &peer->timer_zero_key_material, + jiffies + REJECT_AFTER_TIME * 3 * HZ); + } else { + ++peer->timer_handshake_attempts; + pr_debug("%s: Handshake for peer %llu (%pISpfsc) did not complete after %d seconds, retrying (try %d)\n", + peer->device->dev->name, peer->internal_id, + &peer->endpoint.addr, REKEY_TIMEOUT, + peer->timer_handshake_attempts + 1); + + /* We clear the endpoint address src address, in case this is + * the cause of trouble. + */ + wg_socket_clear_peer_endpoint_src(peer); + + wg_packet_send_queued_handshake_initiation(peer, true); + } +} + +static void wg_expired_send_keepalive(struct timer_list *timer) +{ + struct wg_peer *peer = from_timer(peer, timer, timer_send_keepalive); + + wg_packet_send_keepalive(peer); + if (peer->timer_need_another_keepalive) { + peer->timer_need_another_keepalive = false; + mod_peer_timer(peer, &peer->timer_send_keepalive, + jiffies + KEEPALIVE_TIMEOUT * HZ); + } +} + +static void wg_expired_new_handshake(struct timer_list *timer) +{ + struct wg_peer *peer = from_timer(peer, timer, timer_new_handshake); + + pr_debug("%s: Retrying handshake with peer %llu (%pISpfsc) because we stopped hearing back after %d seconds\n", + peer->device->dev->name, peer->internal_id, + &peer->endpoint.addr, KEEPALIVE_TIMEOUT + REKEY_TIMEOUT); + /* We clear the endpoint address src address, in case this is the cause + * of trouble. + */ + wg_socket_clear_peer_endpoint_src(peer); + wg_packet_send_queued_handshake_initiation(peer, false); +} + +static void wg_expired_zero_key_material(struct timer_list *timer) +{ + struct wg_peer *peer = from_timer(peer, timer, timer_zero_key_material); + + rcu_read_lock_bh(); + if (!READ_ONCE(peer->is_dead)) { + wg_peer_get(peer); + if (!queue_work(peer->device->handshake_send_wq, + &peer->clear_peer_work)) + /* If the work was already on the queue, we want to drop + * the extra reference. + */ + wg_peer_put(peer); + } + rcu_read_unlock_bh(); +} + +static void wg_queued_expired_zero_key_material(struct work_struct *work) +{ + struct wg_peer *peer = container_of(work, struct wg_peer, + clear_peer_work); + + pr_debug("%s: Zeroing out all keys for peer %llu (%pISpfsc), since we haven't received a new one in %d seconds\n", + peer->device->dev->name, peer->internal_id, + &peer->endpoint.addr, REJECT_AFTER_TIME * 3); + wg_noise_handshake_clear(&peer->handshake); + wg_noise_keypairs_clear(&peer->keypairs); + wg_peer_put(peer); +} + +static void wg_expired_send_persistent_keepalive(struct timer_list *timer) +{ + struct wg_peer *peer = from_timer(peer, timer, + timer_persistent_keepalive); + + if (likely(peer->persistent_keepalive_interval)) + wg_packet_send_keepalive(peer); +} + +/* Should be called after an authenticated data packet is sent. */ +void wg_timers_data_sent(struct wg_peer *peer) +{ + if (!timer_pending(&peer->timer_new_handshake)) + mod_peer_timer(peer, &peer->timer_new_handshake, + jiffies + (KEEPALIVE_TIMEOUT + REKEY_TIMEOUT) * HZ + + prandom_u32_max(REKEY_TIMEOUT_JITTER_MAX_JIFFIES)); +} + +/* Should be called after an authenticated data packet is received. */ +void wg_timers_data_received(struct wg_peer *peer) +{ + if (likely(netif_running(peer->device->dev))) { + if (!timer_pending(&peer->timer_send_keepalive)) + mod_peer_timer(peer, &peer->timer_send_keepalive, + jiffies + KEEPALIVE_TIMEOUT * HZ); + else + peer->timer_need_another_keepalive = true; + } +} + +/* Should be called after any type of authenticated packet is sent, whether + * keepalive, data, or handshake. + */ +void wg_timers_any_authenticated_packet_sent(struct wg_peer *peer) +{ + del_timer(&peer->timer_send_keepalive); +} + +/* Should be called after any type of authenticated packet is received, whether + * keepalive, data, or handshake. + */ +void wg_timers_any_authenticated_packet_received(struct wg_peer *peer) +{ + del_timer(&peer->timer_new_handshake); +} + +/* Should be called after a handshake initiation message is sent. */ +void wg_timers_handshake_initiated(struct wg_peer *peer) +{ + mod_peer_timer(peer, &peer->timer_retransmit_handshake, + jiffies + REKEY_TIMEOUT * HZ + + prandom_u32_max(REKEY_TIMEOUT_JITTER_MAX_JIFFIES)); +} + +/* Should be called after a handshake response message is received and processed + * or when getting key confirmation via the first data message. + */ +void wg_timers_handshake_complete(struct wg_peer *peer) +{ + del_timer(&peer->timer_retransmit_handshake); + peer->timer_handshake_attempts = 0; + peer->sent_lastminute_handshake = false; + ktime_get_real_ts64(&peer->walltime_last_handshake); +} + +/* Should be called after an ephemeral key is created, which is before sending a + * handshake response or after receiving a handshake response. + */ +void wg_timers_session_derived(struct wg_peer *peer) +{ + mod_peer_timer(peer, &peer->timer_zero_key_material, + jiffies + REJECT_AFTER_TIME * 3 * HZ); +} + +/* Should be called before a packet with authentication, whether + * keepalive, data, or handshakem is sent, or after one is received. + */ +void wg_timers_any_authenticated_packet_traversal(struct wg_peer *peer) +{ + if (peer->persistent_keepalive_interval) + mod_peer_timer(peer, &peer->timer_persistent_keepalive, + jiffies + peer->persistent_keepalive_interval * HZ); +} + +void wg_timers_init(struct wg_peer *peer) +{ + timer_setup(&peer->timer_retransmit_handshake, + wg_expired_retransmit_handshake, 0); + timer_setup(&peer->timer_send_keepalive, wg_expired_send_keepalive, 0); + timer_setup(&peer->timer_new_handshake, wg_expired_new_handshake, 0); + timer_setup(&peer->timer_zero_key_material, + wg_expired_zero_key_material, 0); + timer_setup(&peer->timer_persistent_keepalive, + wg_expired_send_persistent_keepalive, 0); + INIT_WORK(&peer->clear_peer_work, wg_queued_expired_zero_key_material); + peer->timer_handshake_attempts = 0; + peer->sent_lastminute_handshake = false; + peer->timer_need_another_keepalive = false; +} + +void wg_timers_stop(struct wg_peer *peer) +{ + del_timer_sync(&peer->timer_retransmit_handshake); + del_timer_sync(&peer->timer_send_keepalive); + del_timer_sync(&peer->timer_new_handshake); + del_timer_sync(&peer->timer_zero_key_material); + del_timer_sync(&peer->timer_persistent_keepalive); + flush_work(&peer->clear_peer_work); +} diff --git a/drivers/net/wireguard/timers.h b/drivers/net/wireguard/timers.h new file mode 100644 index 000000000000..f0653dcb1326 --- /dev/null +++ b/drivers/net/wireguard/timers.h @@ -0,0 +1,31 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#ifndef _WG_TIMERS_H +#define _WG_TIMERS_H + +#include + +struct wg_peer; + +void wg_timers_init(struct wg_peer *peer); +void wg_timers_stop(struct wg_peer *peer); +void wg_timers_data_sent(struct wg_peer *peer); +void wg_timers_data_received(struct wg_peer *peer); +void wg_timers_any_authenticated_packet_sent(struct wg_peer *peer); +void wg_timers_any_authenticated_packet_received(struct wg_peer *peer); +void wg_timers_handshake_initiated(struct wg_peer *peer); +void wg_timers_handshake_complete(struct wg_peer *peer); +void wg_timers_session_derived(struct wg_peer *peer); +void wg_timers_any_authenticated_packet_traversal(struct wg_peer *peer); + +static inline bool wg_birthdate_has_expired(u64 birthday_nanoseconds, + u64 expiration_seconds) +{ + return (s64)(birthday_nanoseconds + expiration_seconds * NSEC_PER_SEC) + <= (s64)ktime_get_coarse_boottime_ns(); +} + +#endif /* _WG_TIMERS_H */ diff --git a/drivers/net/wireguard/version.h b/drivers/net/wireguard/version.h new file mode 100644 index 000000000000..a1a269a11634 --- /dev/null +++ b/drivers/net/wireguard/version.h @@ -0,0 +1 @@ +#define WIREGUARD_VERSION "1.0.0" diff --git a/drivers/net/wireless/ath/ath10k/ce.c b/drivers/net/wireless/ath/ath10k/ce.c index f761d651c16e..2276d608bca3 100644 --- a/drivers/net/wireless/ath/ath10k/ce.c +++ b/drivers/net/wireless/ath/ath10k/ce.c @@ -1453,7 +1453,7 @@ ath10k_ce_alloc_src_ring(struct ath10k *ar, unsigned int ce_id, ret = ath10k_ce_alloc_shadow_base(ar, src_ring, nentries); if (ret) { dma_free_coherent(ar->dev, - (nentries * sizeof(struct ce_desc_64) + + (nentries * sizeof(struct ce_desc) + CE_DESC_RING_ALIGN), src_ring->base_addr_owner_space_unaligned, base_addr); diff --git a/drivers/net/wireless/ath/ath10k/htt_rx.c b/drivers/net/wireless/ath/ath10k/htt_rx.c index 03d4cc6f35bc..68cda1564c77 100644 --- a/drivers/net/wireless/ath/ath10k/htt_rx.c +++ b/drivers/net/wireless/ath/ath10k/htt_rx.c @@ -153,6 +153,14 @@ static int __ath10k_htt_rx_ring_fill_n(struct ath10k_htt *htt, int num) BUILD_BUG_ON(HTT_RX_RING_FILL_LEVEL >= HTT_RX_RING_SIZE / 2); idx = __le32_to_cpu(*htt->rx_ring.alloc_idx.vaddr); + + if (idx < 0 || idx >= htt->rx_ring.size) { + ath10k_err(htt->ar, "rx ring index is not valid, firmware malfunctioning?\n"); + idx &= htt->rx_ring.size_mask; + ret = -ENOMEM; + goto fail; + } + while (num > 0) { skb = dev_alloc_skb(HTT_RX_BUF_SIZE + HTT_RX_DESC_ALIGN); if (!skb) { @@ -759,6 +767,7 @@ static void ath10k_htt_rx_h_rates(struct ath10k *ar, u8 preamble = 0; u8 group_id; u32 info1, info2, info3; + u32 stbc, nsts_su; info1 = __le32_to_cpu(rxd->ppdu_start.info1); info2 = __le32_to_cpu(rxd->ppdu_start.info2); @@ -803,11 +812,16 @@ static void ath10k_htt_rx_h_rates(struct ath10k *ar, */ bw = info2 & 3; sgi = info3 & 1; + stbc = (info2 >> 3) & 1; group_id = (info2 >> 4) & 0x3F; if (GROUP_ID_IS_SU_MIMO(group_id)) { mcs = (info3 >> 4) & 0x0F; - nss = ((info2 >> 10) & 0x07) + 1; + nsts_su = ((info2 >> 10) & 0x07); + if (stbc) + nss = (nsts_su >> 2) + 1; + else + nss = (nsts_su + 1); } else { /* Hardware doesn't decode VHT-SIG-B into Rx descriptor * so it's impossible to decode MCS. Also since diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c index 81af403c19c2..faaca7fe9ad1 100644 --- a/drivers/net/wireless/ath/ath10k/mac.c +++ b/drivers/net/wireless/ath/ath10k/mac.c @@ -6862,7 +6862,7 @@ ath10k_mac_update_bss_chan_survey(struct ath10k *ar, struct ieee80211_channel *channel) { int ret; - enum wmi_bss_survey_req_type type = WMI_BSS_SURVEY_REQ_TYPE_READ_CLEAR; + enum wmi_bss_survey_req_type type = WMI_BSS_SURVEY_REQ_TYPE_READ; lockdep_assert_held(&ar->conf_mutex); diff --git a/drivers/net/wireless/ath/ath10k/sdio.c b/drivers/net/wireless/ath/ath10k/sdio.c index 0cdaecb0e28a..28d86da65c05 100644 --- a/drivers/net/wireless/ath/ath10k/sdio.c +++ b/drivers/net/wireless/ath/ath10k/sdio.c @@ -561,6 +561,10 @@ static int ath10k_sdio_mbox_rx_alloc(struct ath10k *ar, le16_to_cpu(htc_hdr->len), ATH10K_HTC_MBOX_MAX_PAYLOAD_LENGTH); ret = -ENOMEM; + + queue_work(ar->workqueue, &ar->restart_work); + ath10k_warn(ar, "exceeds length, start recovery\n"); + goto err; } diff --git a/drivers/net/wireless/ath/ath6kl/main.c b/drivers/net/wireless/ath/ath6kl/main.c index 0c61dbaa62a4..702c4761006c 100644 --- a/drivers/net/wireless/ath/ath6kl/main.c +++ b/drivers/net/wireless/ath/ath6kl/main.c @@ -429,6 +429,9 @@ void ath6kl_connect_ap_mode_sta(struct ath6kl_vif *vif, u16 aid, u8 *mac_addr, ath6kl_dbg(ATH6KL_DBG_TRC, "new station %pM aid=%d\n", mac_addr, aid); + if (aid < 1 || aid > AP_MAX_NUM_STA) + return; + if (assoc_req_len > sizeof(struct ieee80211_hdr_3addr)) { struct ieee80211_mgmt *mgmt = (struct ieee80211_mgmt *) assoc_info; diff --git a/drivers/net/wireless/ath/ath6kl/wmi.c b/drivers/net/wireless/ath/ath6kl/wmi.c index bc7916f2add0..987ebae8ea0e 100644 --- a/drivers/net/wireless/ath/ath6kl/wmi.c +++ b/drivers/net/wireless/ath/ath6kl/wmi.c @@ -2648,6 +2648,11 @@ int ath6kl_wmi_delete_pstream_cmd(struct wmi *wmi, u8 if_idx, u8 traffic_class, return -EINVAL; } + if (tsid >= 16) { + ath6kl_err("invalid tsid: %d\n", tsid); + return -EINVAL; + } + skb = ath6kl_wmi_get_new_buf(sizeof(*cmd)); if (!skb) return -ENOMEM; diff --git a/drivers/net/wireless/ath/ath9k/hif_usb.c b/drivers/net/wireless/ath/ath9k/hif_usb.c index 3f563e02d17d..2ed98aaed6fb 100644 --- a/drivers/net/wireless/ath/ath9k/hif_usb.c +++ b/drivers/net/wireless/ath/ath9k/hif_usb.c @@ -449,10 +449,19 @@ static void hif_usb_stop(void *hif_handle) spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags); /* The pending URBs have to be canceled. */ + spin_lock_irqsave(&hif_dev->tx.tx_lock, flags); list_for_each_entry_safe(tx_buf, tx_buf_tmp, &hif_dev->tx.tx_pending, list) { + usb_get_urb(tx_buf->urb); + spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags); usb_kill_urb(tx_buf->urb); + list_del(&tx_buf->list); + usb_free_urb(tx_buf->urb); + kfree(tx_buf->buf); + kfree(tx_buf); + spin_lock_irqsave(&hif_dev->tx.tx_lock, flags); } + spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags); usb_kill_anchored_urbs(&hif_dev->mgmt_submitted); } @@ -762,27 +771,37 @@ static void ath9k_hif_usb_dealloc_tx_urbs(struct hif_device_usb *hif_dev) struct tx_buf *tx_buf = NULL, *tx_buf_tmp = NULL; unsigned long flags; + spin_lock_irqsave(&hif_dev->tx.tx_lock, flags); list_for_each_entry_safe(tx_buf, tx_buf_tmp, &hif_dev->tx.tx_buf, list) { + usb_get_urb(tx_buf->urb); + spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags); usb_kill_urb(tx_buf->urb); list_del(&tx_buf->list); usb_free_urb(tx_buf->urb); kfree(tx_buf->buf); kfree(tx_buf); + spin_lock_irqsave(&hif_dev->tx.tx_lock, flags); } + spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags); spin_lock_irqsave(&hif_dev->tx.tx_lock, flags); hif_dev->tx.flags |= HIF_USB_TX_FLUSH; spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags); + spin_lock_irqsave(&hif_dev->tx.tx_lock, flags); list_for_each_entry_safe(tx_buf, tx_buf_tmp, &hif_dev->tx.tx_pending, list) { + usb_get_urb(tx_buf->urb); + spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags); usb_kill_urb(tx_buf->urb); list_del(&tx_buf->list); usb_free_urb(tx_buf->urb); kfree(tx_buf->buf); kfree(tx_buf); + spin_lock_irqsave(&hif_dev->tx.tx_lock, flags); } + spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags); usb_kill_anchored_urbs(&hif_dev->mgmt_submitted); } diff --git a/drivers/net/wireless/ath/ath9k/htc_hst.c b/drivers/net/wireless/ath/ath9k/htc_hst.c index f705f0e1cb5b..05fca38b38ed 100644 --- a/drivers/net/wireless/ath/ath9k/htc_hst.c +++ b/drivers/net/wireless/ath/ath9k/htc_hst.c @@ -342,6 +342,8 @@ void ath9k_htc_txcompletion_cb(struct htc_target *htc_handle, if (skb) { htc_hdr = (struct htc_frame_hdr *) skb->data; + if (htc_hdr->endpoint_id >= ARRAY_SIZE(htc_handle->endpoint)) + goto ret; endpoint = &htc_handle->endpoint[htc_hdr->endpoint_id]; skb_pull(skb, sizeof(struct htc_frame_hdr)); diff --git a/drivers/net/wireless/ath/wcn36xx/main.c b/drivers/net/wireless/ath/wcn36xx/main.c index ad051f34e65b..46ae4ec4ad47 100644 --- a/drivers/net/wireless/ath/wcn36xx/main.c +++ b/drivers/net/wireless/ath/wcn36xx/main.c @@ -163,7 +163,7 @@ static struct ieee80211_supported_band wcn_band_5ghz = { .ampdu_density = IEEE80211_HT_MPDU_DENSITY_16, .mcs = { .rx_mask = { 0xff, 0, 0, 0, 0, 0, 0, 0, 0, 0, }, - .rx_highest = cpu_to_le16(72), + .rx_highest = cpu_to_le16(150), .tx_params = IEEE80211_HT_MCS_TX_DEFINED, } } diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c index 9d7b8834b854..db4c541f58ae 100644 --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c @@ -438,7 +438,7 @@ static int brcmf_rx_hdrpull(struct brcmf_pub *drvr, struct sk_buff *skb, ret = brcmf_proto_hdrpull(drvr, true, skb, ifp); if (ret || !(*ifp) || !(*ifp)->ndev) { - if (ret != -ENODATA && *ifp) + if (ret != -ENODATA && *ifp && (*ifp)->ndev) (*ifp)->ndev->stats.rx_errors++; brcmu_pkt_buf_free_skb(skb); return -ENODATA; diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/msgbuf.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/msgbuf.c index ee922b052561..768a99c15c08 100644 --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/msgbuf.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/msgbuf.c @@ -1563,6 +1563,8 @@ fail: BRCMF_TX_IOCTL_MAX_MSG_SIZE, msgbuf->ioctbuf, msgbuf->ioctbuf_handle); + if (msgbuf->txflow_wq) + destroy_workqueue(msgbuf->txflow_wq); kfree(msgbuf); } return -ENOMEM; diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_lcn.c b/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_lcn.c index 9fb0d9fbd939..d532decc1538 100644 --- a/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_lcn.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_lcn.c @@ -5085,8 +5085,10 @@ bool wlc_phy_attach_lcnphy(struct brcms_phy *pi) pi->pi_fptr.radioloftget = wlc_lcnphy_get_radio_loft; pi->pi_fptr.detach = wlc_phy_detach_lcnphy; - if (!wlc_phy_txpwr_srom_read_lcnphy(pi)) + if (!wlc_phy_txpwr_srom_read_lcnphy(pi)) { + kfree(pi->u.pi_lcnphy); return false; + } if (LCNREV_IS(pi->pubpi.phy_rev, 1)) { if (pi_lcn->lcnphy_tempsense_option == 3) { diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c b/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c index 18be402d1a21..470fe628f6ba 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c @@ -3460,9 +3460,12 @@ static int iwl_mvm_send_aux_roc_cmd(struct iwl_mvm *mvm, aux_roc_req.apply_time_max_delay = cpu_to_le32(delay); IWL_DEBUG_TE(mvm, - "ROC: Requesting to remain on channel %u for %ums (requested = %ums, max_delay = %ums, dtim_interval = %ums)\n", - channel->hw_value, req_dur, duration, delay, - dtim_interval); + "ROC: Requesting to remain on channel %u for %ums\n", + channel->hw_value, req_dur); + IWL_DEBUG_TE(mvm, + "\t(requested = %ums, max_delay = %ums, dtim_interval = %ums)\n", + duration, delay, dtim_interval); + /* Set the node address */ memcpy(aux_roc_req.node_addr, vif->addr, ETH_ALEN); diff --git a/drivers/net/wireless/intersil/p54/p54pci.c b/drivers/net/wireless/intersil/p54/p54pci.c index 57ad56435dda..8bc0286b4f8c 100644 --- a/drivers/net/wireless/intersil/p54/p54pci.c +++ b/drivers/net/wireless/intersil/p54/p54pci.c @@ -332,10 +332,12 @@ static void p54p_tx(struct ieee80211_hw *dev, struct sk_buff *skb) struct p54p_desc *desc; dma_addr_t mapping; u32 idx, i; + __le32 device_addr; spin_lock_irqsave(&priv->lock, flags); idx = le32_to_cpu(ring_control->host_idx[1]); i = idx % ARRAY_SIZE(ring_control->tx_data); + device_addr = ((struct p54_hdr *)skb->data)->req_id; mapping = pci_map_single(priv->pdev, skb->data, skb->len, PCI_DMA_TODEVICE); @@ -349,7 +351,7 @@ static void p54p_tx(struct ieee80211_hw *dev, struct sk_buff *skb) desc = &ring_control->tx_data[i]; desc->host_addr = cpu_to_le32(mapping); - desc->device_addr = ((struct p54_hdr *)skb->data)->req_id; + desc->device_addr = device_addr; desc->len = cpu_to_le16(skb->len); desc->flags = 0; diff --git a/drivers/net/wireless/marvell/mwifiex/scan.c b/drivers/net/wireless/marvell/mwifiex/scan.c index 85d6d5f3dce5..c9f6cd291969 100644 --- a/drivers/net/wireless/marvell/mwifiex/scan.c +++ b/drivers/net/wireless/marvell/mwifiex/scan.c @@ -1895,7 +1895,7 @@ mwifiex_parse_single_response_buf(struct mwifiex_private *priv, u8 **bss_info, chan, CFG80211_BSS_FTYPE_UNKNOWN, bssid, timestamp, cap_info_bitmap, beacon_period, - ie_buf, ie_len, rssi, GFP_KERNEL); + ie_buf, ie_len, rssi, GFP_ATOMIC); if (bss) { bss_priv = (struct mwifiex_bss_priv *)bss->priv; bss_priv->band = band; diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c index bfbe3aa058d9..0773d81072aa 100644 --- a/drivers/net/wireless/marvell/mwifiex/sdio.c +++ b/drivers/net/wireless/marvell/mwifiex/sdio.c @@ -1985,6 +1985,8 @@ error: kfree(card->mpa_rx.buf); card->mpa_tx.buf_size = 0; card->mpa_rx.buf_size = 0; + card->mpa_tx.buf = NULL; + card->mpa_rx.buf = NULL; } return ret; diff --git a/drivers/net/wireless/marvell/mwifiex/usb.c b/drivers/net/wireless/marvell/mwifiex/usb.c index d445acc4786b..2a8d40ce463d 100644 --- a/drivers/net/wireless/marvell/mwifiex/usb.c +++ b/drivers/net/wireless/marvell/mwifiex/usb.c @@ -1355,7 +1355,8 @@ static void mwifiex_usb_cleanup_tx_aggr(struct mwifiex_adapter *adapter) skb_dequeue(&port->tx_aggr.aggr_list))) mwifiex_write_data_complete(adapter, skb_tmp, 0, -1); - del_timer_sync(&port->tx_aggr.timer_cnxt.hold_timer); + if (port->tx_aggr.timer_cnxt.hold_timer.function) + del_timer_sync(&port->tx_aggr.timer_cnxt.hold_timer); port->tx_aggr.timer_cnxt.is_hold_timer_set = false; port->tx_aggr.timer_cnxt.hold_tmo_msecs = 0; } diff --git a/drivers/net/wireless/quantenna/qtnfmac/commands.c b/drivers/net/wireless/quantenna/qtnfmac/commands.c index 734844b34c26..dd473b206f12 100644 --- a/drivers/net/wireless/quantenna/qtnfmac/commands.c +++ b/drivers/net/wireless/quantenna/qtnfmac/commands.c @@ -894,6 +894,7 @@ int qtnf_cmd_send_del_intf(struct qtnf_vif *vif) default: pr_warn("VIF%u.%u: unsupported iftype %d\n", vif->mac->macid, vif->vifid, vif->wdev.iftype); + dev_kfree_skb(cmd_skb); ret = -EINVAL; goto out; } @@ -2212,6 +2213,7 @@ int qtnf_cmd_send_change_sta(struct qtnf_vif *vif, const u8 *mac, break; default: pr_err("unsupported iftype %d\n", vif->wdev.iftype); + dev_kfree_skb(cmd_skb); ret = -EINVAL; goto out; } diff --git a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c index 070ea0f456ab..b80cff96dea1 100644 --- a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c +++ b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c @@ -5453,7 +5453,6 @@ static int rtl8xxxu_submit_int_urb(struct ieee80211_hw *hw) ret = usb_submit_urb(urb, GFP_KERNEL); if (ret) { usb_unanchor_urb(urb); - usb_free_urb(urb); goto error; } @@ -5462,6 +5461,7 @@ static int rtl8xxxu_submit_int_urb(struct ieee80211_hw *hw) rtl8xxxu_write32(priv, REG_USB_HIMR, val32); error: + usb_free_urb(urb); return ret; } @@ -5787,6 +5787,7 @@ static int rtl8xxxu_start(struct ieee80211_hw *hw) struct rtl8xxxu_priv *priv = hw->priv; struct rtl8xxxu_rx_urb *rx_urb; struct rtl8xxxu_tx_urb *tx_urb; + struct sk_buff *skb; unsigned long flags; int ret, i; @@ -5837,6 +5838,13 @@ static int rtl8xxxu_start(struct ieee80211_hw *hw) rx_urb->hw = hw; ret = rtl8xxxu_submit_rx_urb(priv, rx_urb); + if (ret) { + if (ret != -ENOMEM) { + skb = (struct sk_buff *)rx_urb->urb.context; + dev_kfree_skb(skb); + } + rtl8xxxu_queue_rx_urb(priv, rx_urb); + } } exit: /* diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h index 936c0b3e0ba2..86d23d0f563c 100644 --- a/drivers/net/xen-netback/common.h +++ b/drivers/net/xen-netback/common.h @@ -140,6 +140,20 @@ struct xenvif_queue { /* Per-queue data for xenvif */ char name[QUEUE_NAME_SIZE]; /* DEVNAME-qN */ struct xenvif *vif; /* Parent VIF */ + /* + * TX/RX common EOI handling. + * When feature-split-event-channels = 0, interrupt handler sets + * NETBK_COMMON_EOI, otherwise NETBK_RX_EOI and NETBK_TX_EOI are set + * by the RX and TX interrupt handlers. + * RX and TX handler threads will issue an EOI when either + * NETBK_COMMON_EOI or their specific bits (NETBK_RX_EOI or + * NETBK_TX_EOI) are set and they will reset those bits. + */ + atomic_t eoi_pending; +#define NETBK_RX_EOI 0x01 +#define NETBK_TX_EOI 0x02 +#define NETBK_COMMON_EOI 0x04 + /* Use NAPI for guest TX */ struct napi_struct napi; /* When feature-split-event-channels = 0, tx_irq = rx_irq. */ @@ -357,6 +371,7 @@ int xenvif_dealloc_kthread(void *data); irqreturn_t xenvif_ctrl_irq_fn(int irq, void *data); +bool xenvif_have_rx_work(struct xenvif_queue *queue, bool test_kthread); void xenvif_rx_action(struct xenvif_queue *queue); void xenvif_rx_queue_tail(struct xenvif_queue *queue, struct sk_buff *skb); diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c index 4cafc31b98b7..c960cb7e3251 100644 --- a/drivers/net/xen-netback/interface.c +++ b/drivers/net/xen-netback/interface.c @@ -77,12 +77,28 @@ int xenvif_schedulable(struct xenvif *vif) !vif->disabled; } +static bool xenvif_handle_tx_interrupt(struct xenvif_queue *queue) +{ + bool rc; + + rc = RING_HAS_UNCONSUMED_REQUESTS(&queue->tx); + if (rc) + napi_schedule(&queue->napi); + return rc; +} + static irqreturn_t xenvif_tx_interrupt(int irq, void *dev_id) { struct xenvif_queue *queue = dev_id; + int old; - if (RING_HAS_UNCONSUMED_REQUESTS(&queue->tx)) - napi_schedule(&queue->napi); + old = atomic_fetch_or(NETBK_TX_EOI, &queue->eoi_pending); + WARN(old & NETBK_TX_EOI, "Interrupt while EOI pending\n"); + + if (!xenvif_handle_tx_interrupt(queue)) { + atomic_andnot(NETBK_TX_EOI, &queue->eoi_pending); + xen_irq_lateeoi(irq, XEN_EOI_FLAG_SPURIOUS); + } return IRQ_HANDLED; } @@ -116,19 +132,46 @@ static int xenvif_poll(struct napi_struct *napi, int budget) return work_done; } +static bool xenvif_handle_rx_interrupt(struct xenvif_queue *queue) +{ + bool rc; + + rc = xenvif_have_rx_work(queue, false); + if (rc) + xenvif_kick_thread(queue); + return rc; +} + static irqreturn_t xenvif_rx_interrupt(int irq, void *dev_id) { struct xenvif_queue *queue = dev_id; + int old; - xenvif_kick_thread(queue); + old = atomic_fetch_or(NETBK_RX_EOI, &queue->eoi_pending); + WARN(old & NETBK_RX_EOI, "Interrupt while EOI pending\n"); + + if (!xenvif_handle_rx_interrupt(queue)) { + atomic_andnot(NETBK_RX_EOI, &queue->eoi_pending); + xen_irq_lateeoi(irq, XEN_EOI_FLAG_SPURIOUS); + } return IRQ_HANDLED; } irqreturn_t xenvif_interrupt(int irq, void *dev_id) { - xenvif_tx_interrupt(irq, dev_id); - xenvif_rx_interrupt(irq, dev_id); + struct xenvif_queue *queue = dev_id; + int old; + + old = atomic_fetch_or(NETBK_COMMON_EOI, &queue->eoi_pending); + WARN(old, "Interrupt while EOI pending\n"); + + /* Use bitwise or as we need to call both functions. */ + if ((!xenvif_handle_tx_interrupt(queue) | + !xenvif_handle_rx_interrupt(queue))) { + atomic_andnot(NETBK_COMMON_EOI, &queue->eoi_pending); + xen_irq_lateeoi(irq, XEN_EOI_FLAG_SPURIOUS); + } return IRQ_HANDLED; } @@ -595,7 +638,7 @@ int xenvif_connect_ctrl(struct xenvif *vif, grant_ref_t ring_ref, shared = (struct xen_netif_ctrl_sring *)addr; BACK_RING_INIT(&vif->ctrl, shared, XEN_PAGE_SIZE); - err = bind_interdomain_evtchn_to_irq(vif->domid, evtchn); + err = bind_interdomain_evtchn_to_irq_lateeoi(vif->domid, evtchn); if (err < 0) goto err_unmap; @@ -653,7 +696,7 @@ int xenvif_connect_data(struct xenvif_queue *queue, if (tx_evtchn == rx_evtchn) { /* feature-split-event-channels == 0 */ - err = bind_interdomain_evtchn_to_irqhandler( + err = bind_interdomain_evtchn_to_irqhandler_lateeoi( queue->vif->domid, tx_evtchn, xenvif_interrupt, 0, queue->name, queue); if (err < 0) @@ -664,7 +707,7 @@ int xenvif_connect_data(struct xenvif_queue *queue, /* feature-split-event-channels == 1 */ snprintf(queue->tx_irq_name, sizeof(queue->tx_irq_name), "%s-tx", queue->name); - err = bind_interdomain_evtchn_to_irqhandler( + err = bind_interdomain_evtchn_to_irqhandler_lateeoi( queue->vif->domid, tx_evtchn, xenvif_tx_interrupt, 0, queue->tx_irq_name, queue); if (err < 0) @@ -674,7 +717,7 @@ int xenvif_connect_data(struct xenvif_queue *queue, snprintf(queue->rx_irq_name, sizeof(queue->rx_irq_name), "%s-rx", queue->name); - err = bind_interdomain_evtchn_to_irqhandler( + err = bind_interdomain_evtchn_to_irqhandler_lateeoi( queue->vif->domid, rx_evtchn, xenvif_rx_interrupt, 0, queue->rx_irq_name, queue); if (err < 0) diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 1c849106b793..f228298c3bd0 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -162,6 +162,10 @@ void xenvif_napi_schedule_or_enable_events(struct xenvif_queue *queue) if (more_to_do) napi_schedule(&queue->napi); + else if (atomic_fetch_andnot(NETBK_TX_EOI | NETBK_COMMON_EOI, + &queue->eoi_pending) & + (NETBK_TX_EOI | NETBK_COMMON_EOI)) + xen_irq_lateeoi(queue->tx_irq, 0); } static void tx_add_credit(struct xenvif_queue *queue) @@ -1613,9 +1617,14 @@ static bool xenvif_ctrl_work_todo(struct xenvif *vif) irqreturn_t xenvif_ctrl_irq_fn(int irq, void *data) { struct xenvif *vif = data; + unsigned int eoi_flag = XEN_EOI_FLAG_SPURIOUS; - while (xenvif_ctrl_work_todo(vif)) + while (xenvif_ctrl_work_todo(vif)) { xenvif_ctrl_action(vif); + eoi_flag = 0; + } + + xen_irq_lateeoi(irq, eoi_flag); return IRQ_HANDLED; } diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-netback/rx.c index ef5887037b22..9b62f65b630e 100644 --- a/drivers/net/xen-netback/rx.c +++ b/drivers/net/xen-netback/rx.c @@ -490,13 +490,13 @@ static bool xenvif_rx_queue_ready(struct xenvif_queue *queue) return queue->stalled && prod - cons >= 1; } -static bool xenvif_have_rx_work(struct xenvif_queue *queue) +bool xenvif_have_rx_work(struct xenvif_queue *queue, bool test_kthread) { return xenvif_rx_ring_slots_available(queue) || (queue->vif->stall_timeout && (xenvif_rx_queue_stalled(queue) || xenvif_rx_queue_ready(queue))) || - kthread_should_stop() || + (test_kthread && kthread_should_stop()) || queue->vif->disabled; } @@ -527,15 +527,20 @@ static void xenvif_wait_for_rx_work(struct xenvif_queue *queue) { DEFINE_WAIT(wait); - if (xenvif_have_rx_work(queue)) + if (xenvif_have_rx_work(queue, true)) return; for (;;) { long ret; prepare_to_wait(&queue->wq, &wait, TASK_INTERRUPTIBLE); - if (xenvif_have_rx_work(queue)) + if (xenvif_have_rx_work(queue, true)) break; + if (atomic_fetch_andnot(NETBK_RX_EOI | NETBK_COMMON_EOI, + &queue->eoi_pending) & + (NETBK_RX_EOI | NETBK_COMMON_EOI)) + xen_irq_lateeoi(queue->rx_irq, 0); + ret = schedule_timeout(xenvif_rx_queue_timeout(queue)); if (!ret) break; diff --git a/drivers/ntb/hw/amd/ntb_hw_amd.c b/drivers/ntb/hw/amd/ntb_hw_amd.c index efb214fc545a..0b1fbb5dba9b 100644 --- a/drivers/ntb/hw/amd/ntb_hw_amd.c +++ b/drivers/ntb/hw/amd/ntb_hw_amd.c @@ -1036,6 +1036,7 @@ static int amd_ntb_init_pci(struct amd_ntb_dev *ndev, err_dma_mask: pci_clear_master(pdev); + pci_release_regions(pdev); err_pci_regions: pci_disable_device(pdev); err_pci_enable: diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c index 077c67816665..134e14e778f8 100644 --- a/drivers/nvme/host/rdma.c +++ b/drivers/nvme/host/rdma.c @@ -1640,7 +1640,6 @@ static int nvme_rdma_cm_handler(struct rdma_cm_id *cm_id, complete(&queue->cm_done); return 0; case RDMA_CM_EVENT_REJECTED: - nvme_rdma_destroy_queue_ib(queue); cm_error = nvme_rdma_conn_rejected(queue, ev); break; case RDMA_CM_EVENT_ROUTE_ERROR: diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c index f28df233dfcd..2b492ad55f0e 100644 --- a/drivers/nvme/target/core.c +++ b/drivers/nvme/target/core.c @@ -787,7 +787,8 @@ static void nvmet_start_ctrl(struct nvmet_ctrl *ctrl) * in case a host died before it enabled the controller. Hence, simply * reset the keep alive timer when the controller is enabled. */ - mod_delayed_work(system_wq, &ctrl->ka_work, ctrl->kato * HZ); + if (ctrl->kato) + mod_delayed_work(system_wq, &ctrl->ka_work, ctrl->kato * HZ); } static void nvmet_clear_ctrl(struct nvmet_ctrl *ctrl) diff --git a/drivers/of/of_reserved_mem.c b/drivers/of/of_reserved_mem.c index 24ae36142670..9518d6faa998 100644 --- a/drivers/of/of_reserved_mem.c +++ b/drivers/of/of_reserved_mem.c @@ -221,6 +221,16 @@ static int __init __rmem_cmp(const void *a, const void *b) if (ra->base > rb->base) return 1; + /* + * Put the dynamic allocations (address == 0, size == 0) before static + * allocations at address 0x0 so that overlap detection works + * correctly. + */ + if (ra->size < rb->size) + return -1; + if (ra->size > rb->size) + return 1; + return 0; } @@ -238,8 +248,7 @@ static void __init __rmem_check_for_overlap(void) this = &reserved_mem[i]; next = &reserved_mem[i + 1]; - if (!(this->base && next->base)) - continue; + if (this->base + this->size > next->base) { phys_addr_t this_end, next_end; diff --git a/drivers/pci/controller/pcie-iproc-msi.c b/drivers/pci/controller/pcie-iproc-msi.c index 9deb56989d72..ea612382599c 100644 --- a/drivers/pci/controller/pcie-iproc-msi.c +++ b/drivers/pci/controller/pcie-iproc-msi.c @@ -209,15 +209,20 @@ static int iproc_msi_irq_set_affinity(struct irq_data *data, struct iproc_msi *msi = irq_data_get_irq_chip_data(data); int target_cpu = cpumask_first(mask); int curr_cpu; + int ret; curr_cpu = hwirq_to_cpu(msi, data->hwirq); if (curr_cpu == target_cpu) - return IRQ_SET_MASK_OK_DONE; + ret = IRQ_SET_MASK_OK_DONE; + else { + /* steer MSI to the target CPU */ + data->hwirq = hwirq_to_canonical_hwirq(msi, data->hwirq) + target_cpu; + ret = IRQ_SET_MASK_OK; + } - /* steer MSI to the target CPU */ - data->hwirq = hwirq_to_canonical_hwirq(msi, data->hwirq) + target_cpu; + irq_data_update_effective_affinity(data, cpumask_of(target_cpu)); - return IRQ_SET_MASK_OK; + return ret; } static void iproc_msi_irq_compose_msi_msg(struct irq_data *data, diff --git a/drivers/perf/xgene_pmu.c b/drivers/perf/xgene_pmu.c index 0e31f1392a53..949b07e29c06 100644 --- a/drivers/perf/xgene_pmu.c +++ b/drivers/perf/xgene_pmu.c @@ -1474,17 +1474,6 @@ static char *xgene_pmu_dev_name(struct device *dev, u32 type, int id) } #if defined(CONFIG_ACPI) -static int acpi_pmu_dev_add_resource(struct acpi_resource *ares, void *data) -{ - struct resource *res = data; - - if (ares->type == ACPI_RESOURCE_TYPE_FIXED_MEMORY32) - acpi_dev_resource_memory(ares, res); - - /* Always tell the ACPI core to skip this resource */ - return 1; -} - static struct xgene_pmu_dev_ctx *acpi_get_pmu_hw_inf(struct xgene_pmu *xgene_pmu, struct acpi_device *adev, u32 type) @@ -1496,6 +1485,7 @@ xgene_pmu_dev_ctx *acpi_get_pmu_hw_inf(struct xgene_pmu *xgene_pmu, struct hw_pmu_info *inf; void __iomem *dev_csr; struct resource res; + struct resource_entry *rentry; int enable_bit; int rc; @@ -1504,11 +1494,23 @@ xgene_pmu_dev_ctx *acpi_get_pmu_hw_inf(struct xgene_pmu *xgene_pmu, return NULL; INIT_LIST_HEAD(&resource_list); - rc = acpi_dev_get_resources(adev, &resource_list, - acpi_pmu_dev_add_resource, &res); + rc = acpi_dev_get_resources(adev, &resource_list, NULL, NULL); + if (rc <= 0) { + dev_err(dev, "PMU type %d: No resources found\n", type); + return NULL; + } + + list_for_each_entry(rentry, &resource_list, node) { + if (resource_type(rentry->res) == IORESOURCE_MEM) { + res = *rentry->res; + rentry = NULL; + break; + } + } acpi_dev_free_resource_list(&resource_list); - if (rc < 0) { - dev_err(dev, "PMU type %d: No resource address found\n", type); + + if (rentry) { + dev_err(dev, "PMU type %d: No memory resource found\n", type); return NULL; } diff --git a/drivers/pinctrl/bcm/Kconfig b/drivers/pinctrl/bcm/Kconfig index 0f38d51f47c6..e6cd314919de 100644 --- a/drivers/pinctrl/bcm/Kconfig +++ b/drivers/pinctrl/bcm/Kconfig @@ -21,6 +21,7 @@ config PINCTRL_BCM2835 select PINMUX select PINCONF select GENERIC_PINCONF + select GPIOLIB select GPIOLIB_IRQCHIP config PINCTRL_IPROC_GPIO diff --git a/drivers/pinctrl/pinctrl-mcp23s08.c b/drivers/pinctrl/pinctrl-mcp23s08.c index 33c3eca0ece9..5b5a4323ae63 100644 --- a/drivers/pinctrl/pinctrl-mcp23s08.c +++ b/drivers/pinctrl/pinctrl-mcp23s08.c @@ -120,7 +120,7 @@ static const struct regmap_config mcp23x08_regmap = { .max_register = MCP_OLAT, }; -static const struct reg_default mcp23x16_defaults[] = { +static const struct reg_default mcp23x17_defaults[] = { {.reg = MCP_IODIR << 1, .def = 0xffff}, {.reg = MCP_IPOL << 1, .def = 0x0000}, {.reg = MCP_GPINTEN << 1, .def = 0x0000}, @@ -131,23 +131,23 @@ static const struct reg_default mcp23x16_defaults[] = { {.reg = MCP_OLAT << 1, .def = 0x0000}, }; -static const struct regmap_range mcp23x16_volatile_range = { +static const struct regmap_range mcp23x17_volatile_range = { .range_min = MCP_INTF << 1, .range_max = MCP_GPIO << 1, }; -static const struct regmap_access_table mcp23x16_volatile_table = { - .yes_ranges = &mcp23x16_volatile_range, +static const struct regmap_access_table mcp23x17_volatile_table = { + .yes_ranges = &mcp23x17_volatile_range, .n_yes_ranges = 1, }; -static const struct regmap_range mcp23x16_precious_range = { - .range_min = MCP_GPIO << 1, +static const struct regmap_range mcp23x17_precious_range = { + .range_min = MCP_INTCAP << 1, .range_max = MCP_GPIO << 1, }; -static const struct regmap_access_table mcp23x16_precious_table = { - .yes_ranges = &mcp23x16_precious_range, +static const struct regmap_access_table mcp23x17_precious_table = { + .yes_ranges = &mcp23x17_precious_range, .n_yes_ranges = 1, }; @@ -157,10 +157,10 @@ static const struct regmap_config mcp23x17_regmap = { .reg_stride = 2, .max_register = MCP_OLAT << 1, - .volatile_table = &mcp23x16_volatile_table, - .precious_table = &mcp23x16_precious_table, - .reg_defaults = mcp23x16_defaults, - .num_reg_defaults = ARRAY_SIZE(mcp23x16_defaults), + .volatile_table = &mcp23x17_volatile_table, + .precious_table = &mcp23x17_precious_table, + .reg_defaults = mcp23x17_defaults, + .num_reg_defaults = ARRAY_SIZE(mcp23x17_defaults), .cache_type = REGCACHE_FLAT, .val_format_endian = REGMAP_ENDIAN_LITTLE, }; diff --git a/drivers/platform/x86/mlx-platform.c b/drivers/platform/x86/mlx-platform.c index 69e28c12d591..0c72de95b5cc 100644 --- a/drivers/platform/x86/mlx-platform.c +++ b/drivers/platform/x86/mlx-platform.c @@ -221,15 +221,6 @@ static struct i2c_board_info mlxplat_mlxcpld_psu[] = { }, }; -static struct i2c_board_info mlxplat_mlxcpld_ng_psu[] = { - { - I2C_BOARD_INFO("24c32", 0x51), - }, - { - I2C_BOARD_INFO("24c32", 0x50), - }, -}; - static struct i2c_board_info mlxplat_mlxcpld_pwr[] = { { I2C_BOARD_INFO("dps460", 0x59), @@ -589,15 +580,13 @@ static struct mlxreg_core_data mlxplat_mlxcpld_default_ng_psu_items_data[] = { .label = "psu1", .reg = MLXPLAT_CPLD_LPC_REG_PSU_OFFSET, .mask = BIT(0), - .hpdev.brdinfo = &mlxplat_mlxcpld_ng_psu[0], - .hpdev.nr = MLXPLAT_CPLD_PSU_MSNXXXX_NR, + .hpdev.nr = MLXPLAT_CPLD_NR_NONE, }, { .label = "psu2", .reg = MLXPLAT_CPLD_LPC_REG_PSU_OFFSET, .mask = BIT(1), - .hpdev.brdinfo = &mlxplat_mlxcpld_ng_psu[1], - .hpdev.nr = MLXPLAT_CPLD_PSU_MSNXXXX_NR, + .hpdev.nr = MLXPLAT_CPLD_NR_NONE, }, }; diff --git a/drivers/power/supply/bq27xxx_battery.c b/drivers/power/supply/bq27xxx_battery.c index ff02a917556a..93e3d9c747aa 100644 --- a/drivers/power/supply/bq27xxx_battery.c +++ b/drivers/power/supply/bq27xxx_battery.c @@ -1680,8 +1680,6 @@ static int bq27xxx_battery_status(struct bq27xxx_device_info *di, status = POWER_SUPPLY_STATUS_FULL; else if (di->cache.flags & BQ27000_FLAG_CHGS) status = POWER_SUPPLY_STATUS_CHARGING; - else if (power_supply_am_i_supplied(di->bat) > 0) - status = POWER_SUPPLY_STATUS_NOT_CHARGING; else status = POWER_SUPPLY_STATUS_DISCHARGING; } else { @@ -1693,6 +1691,10 @@ static int bq27xxx_battery_status(struct bq27xxx_device_info *di, status = POWER_SUPPLY_STATUS_CHARGING; } + if ((status == POWER_SUPPLY_STATUS_DISCHARGING) && + (power_supply_am_i_supplied(di->bat) > 0)) + status = POWER_SUPPLY_STATUS_NOT_CHARGING; + val->intval = status; return 0; diff --git a/drivers/power/supply/test_power.c b/drivers/power/supply/test_power.c index 57246cdbd042..925abec45380 100644 --- a/drivers/power/supply/test_power.c +++ b/drivers/power/supply/test_power.c @@ -344,6 +344,7 @@ static int param_set_ac_online(const char *key, const struct kernel_param *kp) static int param_get_ac_online(char *buffer, const struct kernel_param *kp) { strcpy(buffer, map_get_key(map_ac_online, ac_online, "unknown")); + strcat(buffer, "\n"); return strlen(buffer); } @@ -357,6 +358,7 @@ static int param_set_usb_online(const char *key, const struct kernel_param *kp) static int param_get_usb_online(char *buffer, const struct kernel_param *kp) { strcpy(buffer, map_get_key(map_ac_online, usb_online, "unknown")); + strcat(buffer, "\n"); return strlen(buffer); } @@ -371,6 +373,7 @@ static int param_set_battery_status(const char *key, static int param_get_battery_status(char *buffer, const struct kernel_param *kp) { strcpy(buffer, map_get_key(map_status, battery_status, "unknown")); + strcat(buffer, "\n"); return strlen(buffer); } @@ -385,6 +388,7 @@ static int param_set_battery_health(const char *key, static int param_get_battery_health(char *buffer, const struct kernel_param *kp) { strcpy(buffer, map_get_key(map_health, battery_health, "unknown")); + strcat(buffer, "\n"); return strlen(buffer); } @@ -400,6 +404,7 @@ static int param_get_battery_present(char *buffer, const struct kernel_param *kp) { strcpy(buffer, map_get_key(map_present, battery_present, "unknown")); + strcat(buffer, "\n"); return strlen(buffer); } @@ -417,6 +422,7 @@ static int param_get_battery_technology(char *buffer, { strcpy(buffer, map_get_key(map_technology, battery_technology, "unknown")); + strcat(buffer, "\n"); return strlen(buffer); } diff --git a/drivers/powercap/powercap_sys.c b/drivers/powercap/powercap_sys.c index 9e2f274bd44f..60c8375c3c81 100644 --- a/drivers/powercap/powercap_sys.c +++ b/drivers/powercap/powercap_sys.c @@ -379,9 +379,9 @@ static void create_power_zone_common_attributes( &dev_attr_max_energy_range_uj.attr; if (power_zone->ops->get_energy_uj) { if (power_zone->ops->reset_energy_uj) - dev_attr_energy_uj.attr.mode = S_IWUSR | S_IRUGO; + dev_attr_energy_uj.attr.mode = S_IWUSR | S_IRUSR; else - dev_attr_energy_uj.attr.mode = S_IRUGO; + dev_attr_energy_uj.attr.mode = S_IRUSR; power_zone->zone_dev_attrs[count++] = &dev_attr_energy_uj.attr; } diff --git a/drivers/pwm/pwm-img.c b/drivers/pwm/pwm-img.c index da72b2866e88..3b0a097ce2ab 100644 --- a/drivers/pwm/pwm-img.c +++ b/drivers/pwm/pwm-img.c @@ -280,6 +280,8 @@ static int img_pwm_probe(struct platform_device *pdev) return PTR_ERR(pwm->pwm_clk); } + platform_set_drvdata(pdev, pwm); + pm_runtime_set_autosuspend_delay(&pdev->dev, IMG_PWM_PM_TIMEOUT); pm_runtime_use_autosuspend(&pdev->dev); pm_runtime_enable(&pdev->dev); @@ -316,7 +318,6 @@ static int img_pwm_probe(struct platform_device *pdev) goto err_suspend; } - platform_set_drvdata(pdev, pwm); return 0; err_suspend: diff --git a/drivers/pwm/pwm-lpss.c b/drivers/pwm/pwm-lpss.c index 7a4a6406cf69..69f8be065919 100644 --- a/drivers/pwm/pwm-lpss.c +++ b/drivers/pwm/pwm-lpss.c @@ -105,10 +105,12 @@ static void pwm_lpss_prepare(struct pwm_lpss_chip *lpwm, struct pwm_device *pwm, * The equation is: * base_unit = round(base_unit_range * freq / c) */ - base_unit_range = BIT(lpwm->info->base_unit_bits) - 1; + base_unit_range = BIT(lpwm->info->base_unit_bits); freq *= base_unit_range; base_unit = DIV_ROUND_CLOSEST_ULL(freq, c); + /* base_unit must not be 0 and we also want to avoid overflowing it */ + base_unit = clamp_val(base_unit, 1, base_unit_range - 1); on_time_div = 255ULL * duty_ns; do_div(on_time_div, period_ns); @@ -116,8 +118,7 @@ static void pwm_lpss_prepare(struct pwm_lpss_chip *lpwm, struct pwm_device *pwm, orig_ctrl = ctrl = pwm_lpss_read(pwm); ctrl &= ~PWM_ON_TIME_DIV_MASK; - ctrl &= ~(base_unit_range << PWM_BASE_UNIT_SHIFT); - base_unit &= base_unit_range; + ctrl &= ~((base_unit_range - 1) << PWM_BASE_UNIT_SHIFT); ctrl |= (u32) base_unit << PWM_BASE_UNIT_SHIFT; ctrl |= on_time_div; diff --git a/drivers/rapidio/devices/rio_mport_cdev.c b/drivers/rapidio/devices/rio_mport_cdev.c index f36a8a5261a1..a136a7ae7714 100644 --- a/drivers/rapidio/devices/rio_mport_cdev.c +++ b/drivers/rapidio/devices/rio_mport_cdev.c @@ -875,15 +875,16 @@ rio_dma_transfer(struct file *filp, u32 transfer_mode, rmcd_error("get_user_pages_unlocked err=%ld", pinned); nr_pages = 0; - } else + } else { rmcd_error("pinned %ld out of %ld pages", pinned, nr_pages); + /* + * Set nr_pages up to mean "how many pages to unpin, in + * the error handler: + */ + nr_pages = pinned; + } ret = -EFAULT; - /* - * Set nr_pages up to mean "how many pages to unpin, in - * the error handler: - */ - nr_pages = pinned; goto err_pg; } @@ -1684,6 +1685,7 @@ static int rio_mport_add_riodev(struct mport_cdev_priv *priv, struct rio_dev *rdev; struct rio_switch *rswitch = NULL; struct rio_mport *mport; + struct device *dev; size_t size; u32 rval; u32 swpinfo = 0; @@ -1698,8 +1700,10 @@ static int rio_mport_add_riodev(struct mport_cdev_priv *priv, rmcd_debug(RDEV, "name:%s ct:0x%x did:0x%x hc:0x%x", dev_info.name, dev_info.comptag, dev_info.destid, dev_info.hopcount); - if (bus_find_device_by_name(&rio_bus_type, NULL, dev_info.name)) { + dev = bus_find_device_by_name(&rio_bus_type, NULL, dev_info.name); + if (dev) { rmcd_debug(RDEV, "device %s already exists", dev_info.name); + put_device(dev); return -EEXIST; } diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c index 09d3e2fb2a70..4fb3d05c474e 100644 --- a/drivers/regulator/core.c +++ b/drivers/regulator/core.c @@ -4785,15 +4785,20 @@ regulator_register(const struct regulator_desc *regulator_desc, else if (regulator_desc->supply_name) rdev->supply_name = regulator_desc->supply_name; - /* - * Attempt to resolve the regulator supply, if specified, - * but don't return an error if we fail because we will try - * to resolve it again later as more regulators are added. - */ - if (regulator_resolve_supply(rdev)) - rdev_dbg(rdev, "unable to resolve supply\n"); - ret = set_machine_constraints(rdev, constraints); + if (ret == -EPROBE_DEFER) { + /* Regulator might be in bypass mode and so needs its supply + * to set the constraints */ + /* FIXME: this currently triggers a chicken-and-egg problem + * when creating -SUPPLY symlink in sysfs to a regulator + * that is just being created */ + ret = regulator_resolve_supply(rdev); + if (!ret) + ret = set_machine_constraints(rdev, constraints); + else + rdev_dbg(rdev, "unable to resolve supply early: %pe\n", + ERR_PTR(ret)); + } if (ret < 0) goto wash; diff --git a/drivers/rpmsg/qcom_smd.c b/drivers/rpmsg/qcom_smd.c index b2e5a6abf7d5..aa008fa11002 100644 --- a/drivers/rpmsg/qcom_smd.c +++ b/drivers/rpmsg/qcom_smd.c @@ -1338,7 +1338,7 @@ static int qcom_smd_parse_edge(struct device *dev, ret = of_property_read_u32(node, key, &edge->edge_id); if (ret) { dev_err(dev, "edge missing %s property\n", key); - return -EINVAL; + goto put_node; } edge->remote_pid = QCOM_SMEM_HOST_ANY; @@ -1349,32 +1349,37 @@ static int qcom_smd_parse_edge(struct device *dev, edge->mbox_client.knows_txdone = true; edge->mbox_chan = mbox_request_channel(&edge->mbox_client, 0); if (IS_ERR(edge->mbox_chan)) { - if (PTR_ERR(edge->mbox_chan) != -ENODEV) - return PTR_ERR(edge->mbox_chan); + if (PTR_ERR(edge->mbox_chan) != -ENODEV) { + ret = PTR_ERR(edge->mbox_chan); + goto put_node; + } edge->mbox_chan = NULL; syscon_np = of_parse_phandle(node, "qcom,ipc", 0); if (!syscon_np) { dev_err(dev, "no qcom,ipc node\n"); - return -ENODEV; + ret = -ENODEV; + goto put_node; } edge->ipc_regmap = syscon_node_to_regmap(syscon_np); - if (IS_ERR(edge->ipc_regmap)) - return PTR_ERR(edge->ipc_regmap); + if (IS_ERR(edge->ipc_regmap)) { + ret = PTR_ERR(edge->ipc_regmap); + goto put_node; + } key = "qcom,ipc"; ret = of_property_read_u32_index(node, key, 1, &edge->ipc_offset); if (ret < 0) { dev_err(dev, "no offset in %s\n", key); - return -EINVAL; + goto put_node; } ret = of_property_read_u32_index(node, key, 2, &edge->ipc_bit); if (ret < 0) { dev_err(dev, "no bit in %s\n", key); - return -EINVAL; + goto put_node; } } @@ -1385,7 +1390,8 @@ static int qcom_smd_parse_edge(struct device *dev, irq = irq_of_parse_and_map(node, 0); if (irq < 0) { dev_err(dev, "required smd interrupt missing\n"); - return -EINVAL; + ret = irq; + goto put_node; } ret = devm_request_irq(dev, irq, @@ -1393,12 +1399,18 @@ static int qcom_smd_parse_edge(struct device *dev, node->name, edge); if (ret) { dev_err(dev, "failed to request smd irq\n"); - return ret; + goto put_node; } edge->irq = irq; return 0; + +put_node: + of_node_put(node); + edge->of_node = NULL; + + return ret; } /* diff --git a/drivers/rtc/rtc-rx8010.c b/drivers/rtc/rtc-rx8010.c index 7ddc22eb5b0f..f4db80f9c1b1 100644 --- a/drivers/rtc/rtc-rx8010.c +++ b/drivers/rtc/rtc-rx8010.c @@ -428,16 +428,26 @@ static int rx8010_ioctl(struct device *dev, unsigned int cmd, unsigned long arg) } } -static struct rtc_class_ops rx8010_rtc_ops = { +static const struct rtc_class_ops rx8010_rtc_ops_default = { .read_time = rx8010_get_time, .set_time = rx8010_set_time, .ioctl = rx8010_ioctl, }; +static const struct rtc_class_ops rx8010_rtc_ops_alarm = { + .read_time = rx8010_get_time, + .set_time = rx8010_set_time, + .ioctl = rx8010_ioctl, + .read_alarm = rx8010_read_alarm, + .set_alarm = rx8010_set_alarm, + .alarm_irq_enable = rx8010_alarm_irq_enable, +}; + static int rx8010_probe(struct i2c_client *client, const struct i2c_device_id *id) { struct i2c_adapter *adapter = to_i2c_adapter(client->dev.parent); + const struct rtc_class_ops *rtc_ops; struct rx8010_data *rx8010; int err = 0; @@ -468,16 +478,16 @@ static int rx8010_probe(struct i2c_client *client, if (err) { dev_err(&client->dev, "unable to request IRQ\n"); - client->irq = 0; - } else { - rx8010_rtc_ops.read_alarm = rx8010_read_alarm; - rx8010_rtc_ops.set_alarm = rx8010_set_alarm; - rx8010_rtc_ops.alarm_irq_enable = rx8010_alarm_irq_enable; + return err; } + + rtc_ops = &rx8010_rtc_ops_alarm; + } else { + rtc_ops = &rx8010_rtc_ops_default; } rx8010->rtc = devm_rtc_device_register(&client->dev, client->name, - &rx8010_rtc_ops, THIS_MODULE); + rtc_ops, THIS_MODULE); if (IS_ERR(rx8010->rtc)) { dev_err(&client->dev, "unable to register the class device\n"); diff --git a/drivers/scsi/be2iscsi/be_main.c b/drivers/scsi/be2iscsi/be_main.c index 3660059784f7..6221a8372cee 100644 --- a/drivers/scsi/be2iscsi/be_main.c +++ b/drivers/scsi/be2iscsi/be_main.c @@ -3039,6 +3039,7 @@ static int beiscsi_create_eqs(struct beiscsi_hba *phba, goto create_eq_error; } + mem->dma = paddr; mem->va = eq_vaddress; ret = be_fill_queue(eq, phba->params.num_eq_entries, sizeof(struct be_eq_entry), eq_vaddress); @@ -3048,7 +3049,6 @@ static int beiscsi_create_eqs(struct beiscsi_hba *phba, goto create_eq_error; } - mem->dma = paddr; ret = beiscsi_cmd_eq_create(&phba->ctrl, eq, BEISCSI_EQ_DELAY_DEF); if (ret) { @@ -3105,6 +3105,7 @@ static int beiscsi_create_cqs(struct beiscsi_hba *phba, goto create_cq_error; } + mem->dma = paddr; ret = be_fill_queue(cq, phba->params.num_cq_entries, sizeof(struct sol_cqe), cq_vaddress); if (ret) { @@ -3114,7 +3115,6 @@ static int beiscsi_create_cqs(struct beiscsi_hba *phba, goto create_cq_error; } - mem->dma = paddr; ret = beiscsi_cmd_cq_create(&phba->ctrl, cq, eq, false, false, 0); if (ret) { diff --git a/drivers/scsi/csiostor/csio_hw.c b/drivers/scsi/csiostor/csio_hw.c index e51923886475..1b6f9351b43f 100644 --- a/drivers/scsi/csiostor/csio_hw.c +++ b/drivers/scsi/csiostor/csio_hw.c @@ -2384,7 +2384,7 @@ static int csio_hw_prep_fw(struct csio_hw *hw, struct fw_info *fw_info, FW_HDR_FW_VER_MICRO_G(c), FW_HDR_FW_VER_BUILD_G(c), FW_HDR_FW_VER_MAJOR_G(k), FW_HDR_FW_VER_MINOR_G(k), FW_HDR_FW_VER_MICRO_G(k), FW_HDR_FW_VER_BUILD_G(k)); - ret = EINVAL; + ret = -EINVAL; goto bye; } diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c index 71d53bb239e2..090ab377f65e 100644 --- a/drivers/scsi/ibmvscsi/ibmvfc.c +++ b/drivers/scsi/ibmvscsi/ibmvfc.c @@ -4795,6 +4795,7 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id) if (IS_ERR(vhost->work_thread)) { dev_err(dev, "Couldn't create kernel thread: %ld\n", PTR_ERR(vhost->work_thread)); + rc = PTR_ERR(vhost->work_thread); goto free_host_mem; } diff --git a/drivers/scsi/mvumi.c b/drivers/scsi/mvumi.c index b3cd9a6b1d30..b3df114a1200 100644 --- a/drivers/scsi/mvumi.c +++ b/drivers/scsi/mvumi.c @@ -2439,6 +2439,7 @@ static int mvumi_io_attach(struct mvumi_hba *mhba) if (IS_ERR(mhba->dm_thread)) { dev_err(&mhba->pdev->dev, "failed to create device scan thread\n"); + ret = PTR_ERR(mhba->dm_thread); mutex_unlock(&mhba->sas_discovery_mutex); goto fail_create_thread; } diff --git a/drivers/scsi/qedi/qedi_fw.c b/drivers/scsi/qedi/qedi_fw.c index 25d763ae5d5a..357a0acc5ed2 100644 --- a/drivers/scsi/qedi/qedi_fw.c +++ b/drivers/scsi/qedi/qedi_fw.c @@ -62,6 +62,7 @@ static void qedi_process_logout_resp(struct qedi_ctx *qedi, "Freeing tid=0x%x for cid=0x%x\n", cmd->task_id, qedi_conn->iscsi_conn_id); + spin_lock(&qedi_conn->list_lock); if (likely(cmd->io_cmd_in_list)) { cmd->io_cmd_in_list = false; list_del_init(&cmd->io_cmd); @@ -72,6 +73,7 @@ static void qedi_process_logout_resp(struct qedi_ctx *qedi, cmd->task_id, qedi_conn->iscsi_conn_id, &cmd->io_cmd); } + spin_unlock(&qedi_conn->list_lock); cmd->state = RESPONSE_RECEIVED; qedi_clear_task_idx(qedi, cmd->task_id); @@ -125,6 +127,7 @@ static void qedi_process_text_resp(struct qedi_ctx *qedi, "Freeing tid=0x%x for cid=0x%x\n", cmd->task_id, qedi_conn->iscsi_conn_id); + spin_lock(&qedi_conn->list_lock); if (likely(cmd->io_cmd_in_list)) { cmd->io_cmd_in_list = false; list_del_init(&cmd->io_cmd); @@ -135,6 +138,7 @@ static void qedi_process_text_resp(struct qedi_ctx *qedi, cmd->task_id, qedi_conn->iscsi_conn_id, &cmd->io_cmd); } + spin_unlock(&qedi_conn->list_lock); cmd->state = RESPONSE_RECEIVED; qedi_clear_task_idx(qedi, cmd->task_id); @@ -227,11 +231,13 @@ static void qedi_process_tmf_resp(struct qedi_ctx *qedi, tmf_hdr = (struct iscsi_tm *)qedi_cmd->task->hdr; + spin_lock(&qedi_conn->list_lock); if (likely(qedi_cmd->io_cmd_in_list)) { qedi_cmd->io_cmd_in_list = false; list_del_init(&qedi_cmd->io_cmd); qedi_conn->active_cmd_count--; } + spin_unlock(&qedi_conn->list_lock); if (((tmf_hdr->flags & ISCSI_FLAG_TM_FUNC_MASK) == ISCSI_TM_FUNC_LOGICAL_UNIT_RESET) || @@ -293,11 +299,13 @@ static void qedi_process_login_resp(struct qedi_ctx *qedi, ISCSI_LOGIN_RESPONSE_HDR_DATA_SEG_LEN_MASK; qedi_conn->gen_pdu.resp_wr_ptr = qedi_conn->gen_pdu.resp_buf + pld_len; + spin_lock(&qedi_conn->list_lock); if (likely(cmd->io_cmd_in_list)) { cmd->io_cmd_in_list = false; list_del_init(&cmd->io_cmd); qedi_conn->active_cmd_count--; } + spin_unlock(&qedi_conn->list_lock); memset(task_ctx, '\0', sizeof(*task_ctx)); @@ -829,8 +837,11 @@ static void qedi_process_cmd_cleanup_resp(struct qedi_ctx *qedi, qedi_clear_task_idx(qedi_conn->qedi, rtid); spin_lock(&qedi_conn->list_lock); - list_del_init(&dbg_cmd->io_cmd); - qedi_conn->active_cmd_count--; + if (likely(dbg_cmd->io_cmd_in_list)) { + dbg_cmd->io_cmd_in_list = false; + list_del_init(&dbg_cmd->io_cmd); + qedi_conn->active_cmd_count--; + } spin_unlock(&qedi_conn->list_lock); qedi_cmd->state = CLEANUP_RECV; wake_up_interruptible(&qedi_conn->wait_queue); @@ -1249,6 +1260,7 @@ int qedi_cleanup_all_io(struct qedi_ctx *qedi, struct qedi_conn *qedi_conn, qedi_conn->cmd_cleanup_req++; qedi_iscsi_cleanup_task(ctask, true); + cmd->io_cmd_in_list = false; list_del_init(&cmd->io_cmd); qedi_conn->active_cmd_count--; QEDI_WARN(&qedi->dbg_ctx, @@ -1462,8 +1474,11 @@ ldel_exit: spin_unlock_bh(&qedi_conn->tmf_work_lock); spin_lock(&qedi_conn->list_lock); - list_del_init(&cmd->io_cmd); - qedi_conn->active_cmd_count--; + if (likely(cmd->io_cmd_in_list)) { + cmd->io_cmd_in_list = false; + list_del_init(&cmd->io_cmd); + qedi_conn->active_cmd_count--; + } spin_unlock(&qedi_conn->list_lock); clear_bit(QEDI_CONN_FW_CLEANUP, &qedi_conn->flags); diff --git a/drivers/scsi/qedi/qedi_iscsi.c b/drivers/scsi/qedi/qedi_iscsi.c index aa451c8b49e5..4e8c5fcbded6 100644 --- a/drivers/scsi/qedi/qedi_iscsi.c +++ b/drivers/scsi/qedi/qedi_iscsi.c @@ -976,11 +976,13 @@ static void qedi_cleanup_active_cmd_list(struct qedi_conn *qedi_conn) { struct qedi_cmd *cmd, *cmd_tmp; + spin_lock(&qedi_conn->list_lock); list_for_each_entry_safe(cmd, cmd_tmp, &qedi_conn->active_cmd_list, io_cmd) { list_del_init(&cmd->io_cmd); qedi_conn->active_cmd_count--; } + spin_unlock(&qedi_conn->list_lock); } static void qedi_ep_disconnect(struct iscsi_endpoint *ep) diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c index 3e2f8ce1d9a9..7821c1695e82 100644 --- a/drivers/scsi/qla2xxx/qla_nvme.c +++ b/drivers/scsi/qla2xxx/qla_nvme.c @@ -676,7 +676,7 @@ int qla_nvme_register_hba(struct scsi_qla_host *vha) struct nvme_fc_port_template *tmpl; struct qla_hw_data *ha; struct nvme_fc_port_info pinfo; - int ret = EINVAL; + int ret = -EINVAL; if (!IS_ENABLED(CONFIG_NVME_FC)) return ret; diff --git a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c index 29b79e85fa7f..eb6112eb475e 100644 --- a/drivers/scsi/qla2xxx/qla_target.c +++ b/drivers/scsi/qla2xxx/qla_target.c @@ -1228,14 +1228,15 @@ void qlt_schedule_sess_for_deletion(struct fc_port *sess) case DSC_DELETE_PEND: return; case DSC_DELETED: - if (tgt && tgt->tgt_stop && (tgt->sess_count == 0)) - wake_up_all(&tgt->waitQ); - if (sess->vha->fcport_count == 0) - wake_up_all(&sess->vha->fcport_waitQ); - if (!sess->plogi_link[QLT_PLOGI_LINK_SAME_WWN] && - !sess->plogi_link[QLT_PLOGI_LINK_CONFLICT]) + !sess->plogi_link[QLT_PLOGI_LINK_CONFLICT]) { + if (tgt && tgt->tgt_stop && tgt->sess_count == 0) + wake_up_all(&tgt->waitQ); + + if (sess->vha->fcport_count == 0) + wake_up_all(&sess->vha->fcport_waitQ); return; + } break; case DSC_UPD_FCPORT: /* diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c index f59b8982b288..4ba9f46fcf74 100644 --- a/drivers/scsi/qla4xxx/ql4_os.c +++ b/drivers/scsi/qla4xxx/ql4_os.c @@ -1221,7 +1221,7 @@ static int qla4xxx_get_host_stats(struct Scsi_Host *shost, char *buf, int len) le64_to_cpu(ql_iscsi_stats->iscsi_sequence_error); exit_host_stats: if (ql_iscsi_stats) - dma_free_coherent(&ha->pdev->dev, host_stats_size, + dma_free_coherent(&ha->pdev->dev, stats_size, ql_iscsi_stats, iscsi_stats_dma); ql4_printk(KERN_INFO, ha, "%s: Get host stats done\n", diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c index 1c7d1e5dc654..9266ebf3feab 100644 --- a/drivers/scsi/scsi_scan.c +++ b/drivers/scsi/scsi_scan.c @@ -1721,15 +1721,16 @@ static void scsi_sysfs_add_devices(struct Scsi_Host *shost) */ static struct async_scan_data *scsi_prep_async_scan(struct Scsi_Host *shost) { - struct async_scan_data *data; + struct async_scan_data *data = NULL; unsigned long flags; if (strncmp(scsi_scan_type, "sync", 4) == 0) return NULL; + mutex_lock(&shost->scan_mutex); if (shost->async_scan) { shost_printk(KERN_DEBUG, shost, "%s called twice\n", __func__); - return NULL; + goto err; } data = kmalloc(sizeof(*data), GFP_KERNEL); @@ -1740,7 +1741,6 @@ static struct async_scan_data *scsi_prep_async_scan(struct Scsi_Host *shost) goto err; init_completion(&data->prev_finished); - mutex_lock(&shost->scan_mutex); spin_lock_irqsave(shost->host_lock, flags); shost->async_scan = 1; spin_unlock_irqrestore(shost->host_lock, flags); @@ -1755,6 +1755,7 @@ static struct async_scan_data *scsi_prep_async_scan(struct Scsi_Host *shost) return data; err: + mutex_unlock(&shost->scan_mutex); kfree(data); return NULL; } diff --git a/drivers/slimbus/core.c b/drivers/slimbus/core.c index 943172806a8a..3e63e4ce45b0 100644 --- a/drivers/slimbus/core.c +++ b/drivers/slimbus/core.c @@ -255,8 +255,6 @@ int slim_unregister_controller(struct slim_controller *ctrl) { /* Remove all clients */ device_for_each_child(ctrl->dev, NULL, slim_ctrl_remove_device); - /* Enter Clock Pause */ - slim_ctrl_clk_pause(ctrl, false, 0); ida_simple_remove(&ctrl_ida, ctrl->id); return 0; @@ -297,8 +295,8 @@ void slim_report_absent(struct slim_device *sbdev) mutex_lock(&ctrl->lock); sbdev->is_laddr_valid = false; mutex_unlock(&ctrl->lock); - - ida_simple_remove(&ctrl->laddr_ida, sbdev->laddr); + if (!ctrl->get_laddr) + ida_simple_remove(&ctrl->laddr_ida, sbdev->laddr); slim_device_update_status(sbdev, SLIM_DEVICE_STATUS_DOWN); } EXPORT_SYMBOL_GPL(slim_report_absent); diff --git a/drivers/spi/spi-s3c64xx.c b/drivers/spi/spi-s3c64xx.c index 7b7151ec14c8..1d948fee1a03 100644 --- a/drivers/spi/spi-s3c64xx.c +++ b/drivers/spi/spi-s3c64xx.c @@ -122,6 +122,7 @@ struct s3c64xx_spi_dma_data { struct dma_chan *ch; + dma_cookie_t cookie; enum dma_transfer_direction direction; }; @@ -264,12 +265,13 @@ static void s3c64xx_spi_dmacb(void *data) spin_unlock_irqrestore(&sdd->lock, flags); } -static void prepare_dma(struct s3c64xx_spi_dma_data *dma, +static int prepare_dma(struct s3c64xx_spi_dma_data *dma, struct sg_table *sgt) { struct s3c64xx_spi_driver_data *sdd; struct dma_slave_config config; struct dma_async_tx_descriptor *desc; + int ret; memset(&config, 0, sizeof(config)); @@ -293,12 +295,24 @@ static void prepare_dma(struct s3c64xx_spi_dma_data *dma, desc = dmaengine_prep_slave_sg(dma->ch, sgt->sgl, sgt->nents, dma->direction, DMA_PREP_INTERRUPT); + if (!desc) { + dev_err(&sdd->pdev->dev, "unable to prepare %s scatterlist", + dma->direction == DMA_DEV_TO_MEM ? "rx" : "tx"); + return -ENOMEM; + } desc->callback = s3c64xx_spi_dmacb; desc->callback_param = dma; - dmaengine_submit(desc); + dma->cookie = dmaengine_submit(desc); + ret = dma_submit_error(dma->cookie); + if (ret) { + dev_err(&sdd->pdev->dev, "DMA submission failed"); + return -EIO; + } + dma_async_issue_pending(dma->ch); + return 0; } static void s3c64xx_spi_set_cs(struct spi_device *spi, bool enable) @@ -348,11 +362,12 @@ static bool s3c64xx_spi_can_dma(struct spi_master *master, return xfer->len > (FIFO_LVL_MASK(sdd) >> 1) + 1; } -static void s3c64xx_enable_datapath(struct s3c64xx_spi_driver_data *sdd, +static int s3c64xx_enable_datapath(struct s3c64xx_spi_driver_data *sdd, struct spi_transfer *xfer, int dma_mode) { void __iomem *regs = sdd->regs; u32 modecfg, chcfg; + int ret = 0; modecfg = readl(regs + S3C64XX_SPI_MODE_CFG); modecfg &= ~(S3C64XX_SPI_MODE_TXDMA_ON | S3C64XX_SPI_MODE_RXDMA_ON); @@ -378,7 +393,7 @@ static void s3c64xx_enable_datapath(struct s3c64xx_spi_driver_data *sdd, chcfg |= S3C64XX_SPI_CH_TXCH_ON; if (dma_mode) { modecfg |= S3C64XX_SPI_MODE_TXDMA_ON; - prepare_dma(&sdd->tx_dma, &xfer->tx_sg); + ret = prepare_dma(&sdd->tx_dma, &xfer->tx_sg); } else { switch (sdd->cur_bpw) { case 32: @@ -410,12 +425,17 @@ static void s3c64xx_enable_datapath(struct s3c64xx_spi_driver_data *sdd, writel(((xfer->len * 8 / sdd->cur_bpw) & 0xffff) | S3C64XX_SPI_PACKET_CNT_EN, regs + S3C64XX_SPI_PACKET_CNT); - prepare_dma(&sdd->rx_dma, &xfer->rx_sg); + ret = prepare_dma(&sdd->rx_dma, &xfer->rx_sg); } } + if (ret) + return ret; + writel(modecfg, regs + S3C64XX_SPI_MODE_CFG); writel(chcfg, regs + S3C64XX_SPI_CH_CFG); + + return 0; } static u32 s3c64xx_spi_wait_for_timeout(struct s3c64xx_spi_driver_data *sdd, @@ -548,9 +568,10 @@ static int s3c64xx_wait_for_pio(struct s3c64xx_spi_driver_data *sdd, return 0; } -static void s3c64xx_spi_config(struct s3c64xx_spi_driver_data *sdd) +static int s3c64xx_spi_config(struct s3c64xx_spi_driver_data *sdd) { void __iomem *regs = sdd->regs; + int ret; u32 val; /* Disable Clock */ @@ -598,7 +619,9 @@ static void s3c64xx_spi_config(struct s3c64xx_spi_driver_data *sdd) if (sdd->port_conf->clk_from_cmu) { /* The src_clk clock is divided internally by 2 */ - clk_set_rate(sdd->src_clk, sdd->cur_speed * 2); + ret = clk_set_rate(sdd->src_clk, sdd->cur_speed * 2); + if (ret) + return ret; } else { /* Configure Clock */ val = readl(regs + S3C64XX_SPI_CLK_CFG); @@ -612,6 +635,8 @@ static void s3c64xx_spi_config(struct s3c64xx_spi_driver_data *sdd) val |= S3C64XX_SPI_ENCLK_ENABLE; writel(val, regs + S3C64XX_SPI_CLK_CFG); } + + return 0; } #define XFER_DMAADDR_INVALID DMA_BIT_MASK(32) @@ -654,7 +679,9 @@ static int s3c64xx_spi_transfer_one(struct spi_master *master, sdd->cur_bpw = bpw; sdd->cur_speed = speed; sdd->cur_mode = spi->mode; - s3c64xx_spi_config(sdd); + status = s3c64xx_spi_config(sdd); + if (status) + return status; } if (!is_polling(sdd) && (xfer->len > fifo_len) && @@ -678,13 +705,18 @@ static int s3c64xx_spi_transfer_one(struct spi_master *master, sdd->state &= ~RXBUSY; sdd->state &= ~TXBUSY; - s3c64xx_enable_datapath(sdd, xfer, use_dma); - /* Start the signals */ s3c64xx_spi_set_cs(spi, true); + status = s3c64xx_enable_datapath(sdd, xfer, use_dma); + spin_unlock_irqrestore(&sdd->lock, flags); + if (status) { + dev_err(&spi->dev, "failed to enable data path for transfer: %d\n", status); + break; + } + if (use_dma) status = s3c64xx_wait_for_dma(sdd, xfer); else diff --git a/drivers/staging/comedi/drivers/cb_pcidas.c b/drivers/staging/comedi/drivers/cb_pcidas.c index 8429d57087fd..9b716c696477 100644 --- a/drivers/staging/comedi/drivers/cb_pcidas.c +++ b/drivers/staging/comedi/drivers/cb_pcidas.c @@ -1342,6 +1342,7 @@ static int cb_pcidas_auto_attach(struct comedi_device *dev, if (dev->irq && board->has_ao_fifo) { dev->write_subdev = s; s->subdev_flags |= SDF_CMD_WRITE; + s->len_chanlist = s->n_chan; s->do_cmdtest = cb_pcidas_ao_cmdtest; s->do_cmd = cb_pcidas_ao_cmd; s->cancel = cb_pcidas_ao_cancel; diff --git a/drivers/staging/octeon/ethernet-mdio.c b/drivers/staging/octeon/ethernet-mdio.c index f67f95043887..5761a31e2318 100644 --- a/drivers/staging/octeon/ethernet-mdio.c +++ b/drivers/staging/octeon/ethernet-mdio.c @@ -152,12 +152,6 @@ int cvm_oct_phy_setup_device(struct net_device *dev) phy_node = of_parse_phandle(priv->of_node, "phy-handle", 0); if (!phy_node && of_phy_is_fixed_link(priv->of_node)) { - int rc; - - rc = of_phy_register_fixed_link(priv->of_node); - if (rc) - return rc; - phy_node = of_node_get(priv->of_node); } if (!phy_node) diff --git a/drivers/staging/octeon/ethernet-rx.c b/drivers/staging/octeon/ethernet-rx.c index 5e271245273c..6c644ef6f3ff 100644 --- a/drivers/staging/octeon/ethernet-rx.c +++ b/drivers/staging/octeon/ethernet-rx.c @@ -80,15 +80,17 @@ static inline int cvm_oct_check_rcv_error(cvmx_wqe_t *work) else port = work->word1.cn38xx.ipprt; - if ((work->word2.snoip.err_code == 10) && (work->word1.len <= 64)) { + if ((work->word2.snoip.err_code == 10) && (work->word1.len <= 64)) /* * Ignore length errors on min size packets. Some * equipment incorrectly pads packets to 64+4FCS * instead of 60+4FCS. Note these packets still get * counted as frame errors. */ - } else if (work->word2.snoip.err_code == 5 || - work->word2.snoip.err_code == 7) { + return 0; + + if (work->word2.snoip.err_code == 5 || + work->word2.snoip.err_code == 7) { /* * We received a packet with either an alignment error * or a FCS error. This may be signalling that we are @@ -119,7 +121,10 @@ static inline int cvm_oct_check_rcv_error(cvmx_wqe_t *work) /* Port received 0xd5 preamble */ work->packet_ptr.s.addr += i + 1; work->word1.len -= i + 5; - } else if ((*ptr & 0xf) == 0xd) { + return 0; + } + + if ((*ptr & 0xf) == 0xd) { /* Port received 0xd preamble */ work->packet_ptr.s.addr += i; work->word1.len -= i + 4; @@ -129,21 +134,20 @@ static inline int cvm_oct_check_rcv_error(cvmx_wqe_t *work) ((*(ptr + 1) & 0xf) << 4); ptr++; } - } else { - printk_ratelimited("Port %d unknown preamble, packet dropped\n", - port); - cvm_oct_free_work(work); - return 1; + return 0; } + + printk_ratelimited("Port %d unknown preamble, packet dropped\n", + port); + cvm_oct_free_work(work); + return 1; } - } else { - printk_ratelimited("Port %d receive error code %d, packet dropped\n", - port, work->word2.snoip.err_code); - cvm_oct_free_work(work); - return 1; } - return 0; + printk_ratelimited("Port %d receive error code %d, packet dropped\n", + port, work->word2.snoip.err_code); + cvm_oct_free_work(work); + return 1; } static void copy_segments_to_skb(cvmx_wqe_t *work, struct sk_buff *skb) diff --git a/drivers/staging/octeon/ethernet.c b/drivers/staging/octeon/ethernet.c index 9b15c9ed844b..b680e5785ae3 100644 --- a/drivers/staging/octeon/ethernet.c +++ b/drivers/staging/octeon/ethernet.c @@ -13,6 +13,7 @@ #include #include #include +#include #include #include #include @@ -875,6 +876,14 @@ static int cvm_oct_probe(struct platform_device *pdev) break; } + if (priv->of_node && of_phy_is_fixed_link(priv->of_node)) { + if (of_phy_register_fixed_link(priv->of_node)) { + netdev_err(dev, "Failed to register fixed link for interface %d, port %d\n", + interface, priv->port); + dev->netdev_ops = NULL; + } + } + if (!dev->netdev_ops) { free_netdev(dev); } else if (register_netdev(dev) < 0) { diff --git a/drivers/staging/rtl8192u/ieee80211/ieee80211_rx.c b/drivers/staging/rtl8192u/ieee80211/ieee80211_rx.c index 28cae82d795c..fb824c517449 100644 --- a/drivers/staging/rtl8192u/ieee80211/ieee80211_rx.c +++ b/drivers/staging/rtl8192u/ieee80211/ieee80211_rx.c @@ -599,7 +599,7 @@ static void RxReorderIndicatePacket(struct ieee80211_device *ieee, prxbIndicateArray = kmalloc_array(REORDER_WIN_SIZE, sizeof(struct ieee80211_rxb *), - GFP_KERNEL); + GFP_ATOMIC); if (!prxbIndicateArray) return; diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c index 99314e516244..0219b5a865be 100644 --- a/drivers/target/target_core_user.c +++ b/drivers/target/target_core_user.c @@ -680,7 +680,7 @@ static void scatter_data_area(struct tcmu_dev *udev, void *from, *to = NULL; size_t copy_bytes, to_offset, offset; struct scatterlist *sg; - struct page *page; + struct page *page = NULL; for_each_sg(data_sg, sg, data_nents, i) { int sg_remaining = sg->length; diff --git a/drivers/tty/hvc/hvcs.c b/drivers/tty/hvc/hvcs.c index cb4db1b3ca3c..7853c6375325 100644 --- a/drivers/tty/hvc/hvcs.c +++ b/drivers/tty/hvc/hvcs.c @@ -1218,13 +1218,6 @@ static void hvcs_close(struct tty_struct *tty, struct file *filp) tty_wait_until_sent(tty, HVCS_CLOSE_WAIT); - /* - * This line is important because it tells hvcs_open that this - * device needs to be re-configured the next time hvcs_open is - * called. - */ - tty->driver_data = NULL; - free_irq(irq, hvcsd); return; } else if (hvcsd->port.count < 0) { @@ -1239,6 +1232,13 @@ static void hvcs_cleanup(struct tty_struct * tty) { struct hvcs_struct *hvcsd = tty->driver_data; + /* + * This line is important because it tells hvcs_open that this + * device needs to be re-configured the next time hvcs_open is + * called. + */ + tty->driver_data = NULL; + tty_port_put(&hvcsd->port); } diff --git a/drivers/tty/ipwireless/network.c b/drivers/tty/ipwireless/network.c index cf20616340a1..fe569f6294a2 100644 --- a/drivers/tty/ipwireless/network.c +++ b/drivers/tty/ipwireless/network.c @@ -117,7 +117,7 @@ static int ipwireless_ppp_start_xmit(struct ppp_channel *ppp_channel, skb->len, notify_packet_sent, network); - if (ret == -1) { + if (ret < 0) { skb_pull(skb, 2); return 0; } @@ -134,7 +134,7 @@ static int ipwireless_ppp_start_xmit(struct ppp_channel *ppp_channel, notify_packet_sent, network); kfree(buf); - if (ret == -1) + if (ret < 0) return 0; } kfree_skb(skb); diff --git a/drivers/tty/ipwireless/tty.c b/drivers/tty/ipwireless/tty.c index 1ef751c27ac6..cb0497184330 100644 --- a/drivers/tty/ipwireless/tty.c +++ b/drivers/tty/ipwireless/tty.c @@ -218,7 +218,7 @@ static int ipw_write(struct tty_struct *linux_tty, ret = ipwireless_send_packet(tty->hardware, IPW_CHANNEL_RAS, buf, count, ipw_write_packet_sent_callback, tty); - if (ret == -1) { + if (ret < 0) { mutex_unlock(&tty->ipw_tty_mutex); return 0; } diff --git a/drivers/tty/pty.c b/drivers/tty/pty.c index 00099a8439d2..c6a1d8c4e689 100644 --- a/drivers/tty/pty.c +++ b/drivers/tty/pty.c @@ -120,10 +120,10 @@ static int pty_write(struct tty_struct *tty, const unsigned char *buf, int c) spin_lock_irqsave(&to->port->lock, flags); /* Stuff the data into the input queue of the other end */ c = tty_insert_flip_string(to->port, buf, c); + spin_unlock_irqrestore(&to->port->lock, flags); /* And shovel */ if (c) tty_flip_buffer_push(to->port); - spin_unlock_irqrestore(&to->port->lock, flags); } return c; } diff --git a/drivers/tty/serial/8250/8250_mtk.c b/drivers/tty/serial/8250/8250_mtk.c index 159169a48deb..cc4f0c5e5ddf 100644 --- a/drivers/tty/serial/8250/8250_mtk.c +++ b/drivers/tty/serial/8250/8250_mtk.c @@ -47,7 +47,7 @@ mtk8250_set_termios(struct uart_port *port, struct ktermios *termios, */ baud = tty_termios_baud_rate(termios); - serial8250_do_set_termios(port, termios, old); + serial8250_do_set_termios(port, termios, NULL); tty_termios_encode_baud_rate(termios, baud, baud); diff --git a/drivers/tty/serial/Kconfig b/drivers/tty/serial/Kconfig index f470201039b3..2deae534c6ef 100644 --- a/drivers/tty/serial/Kconfig +++ b/drivers/tty/serial/Kconfig @@ -9,6 +9,7 @@ menu "Serial drivers" config SERIAL_EARLYCON bool + depends on SERIAL_CORE help Support for early consoles with the earlycon parameter. This enables the console before standard serial driver is probed. The console is diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c index 45e4f2952143..1306ce5c5d9b 100644 --- a/drivers/tty/serial/amba-pl011.c +++ b/drivers/tty/serial/amba-pl011.c @@ -313,8 +313,9 @@ static void pl011_write(unsigned int val, const struct uart_amba_port *uap, */ static int pl011_fifo_to_tty(struct uart_amba_port *uap) { - u16 status; unsigned int ch, flag, fifotaken; + int sysrq; + u16 status; for (fifotaken = 0; fifotaken != 256; fifotaken++) { status = pl011_read(uap, REG_FR); @@ -349,10 +350,12 @@ static int pl011_fifo_to_tty(struct uart_amba_port *uap) flag = TTY_FRAME; } - if (uart_handle_sysrq_char(&uap->port, ch & 255)) - continue; + spin_unlock(&uap->port.lock); + sysrq = uart_handle_sysrq_char(&uap->port, ch & 255); + spin_lock(&uap->port.lock); - uart_insert_char(&uap->port, ch, UART011_DR_OE, ch, flag); + if (!sysrq) + uart_insert_char(&uap->port, ch, UART011_DR_OE, ch, flag); } return fifotaken; diff --git a/drivers/tty/serial/fsl_lpuart.c b/drivers/tty/serial/fsl_lpuart.c index 2daccb10ae2f..4b9f42269477 100644 --- a/drivers/tty/serial/fsl_lpuart.c +++ b/drivers/tty/serial/fsl_lpuart.c @@ -563,7 +563,7 @@ static void lpuart32_poll_put_char(struct uart_port *port, unsigned char c) static int lpuart32_poll_get_char(struct uart_port *port) { - if (!(lpuart32_read(port, UARTSTAT) & UARTSTAT_RDRF)) + if (!(lpuart32_read(port, UARTWATER) >> UARTWATER_RXCNT_OFF)) return NO_POLL_CHAR; return lpuart32_read(port, UARTDATA); diff --git a/drivers/tty/serial/serial_txx9.c b/drivers/tty/serial/serial_txx9.c index 1b4008d022bf..2f7ef64536a0 100644 --- a/drivers/tty/serial/serial_txx9.c +++ b/drivers/tty/serial/serial_txx9.c @@ -1284,6 +1284,9 @@ static int __init serial_txx9_init(void) #ifdef ENABLE_SERIAL_TXX9_PCI ret = pci_register_driver(&serial_txx9_pci_driver); + if (ret) { + platform_driver_unregister(&serial_txx9_plat_driver); + } #endif if (ret == 0) goto out; diff --git a/drivers/tty/vt/keyboard.c b/drivers/tty/vt/keyboard.c index a7455f8a4235..94cad9f86ff9 100644 --- a/drivers/tty/vt/keyboard.c +++ b/drivers/tty/vt/keyboard.c @@ -742,8 +742,13 @@ static void k_fn(struct vc_data *vc, unsigned char value, char up_flag) return; if ((unsigned)value < ARRAY_SIZE(func_table)) { + unsigned long flags; + + spin_lock_irqsave(&func_buf_lock, flags); if (func_table[value]) puts_queue(vc, func_table[value]); + spin_unlock_irqrestore(&func_buf_lock, flags); + } else pr_err("k_fn called with value=%d\n", value); } @@ -1990,13 +1995,11 @@ out: #undef s #undef v -/* FIXME: This one needs untangling and locking */ +/* FIXME: This one needs untangling */ int vt_do_kdgkb_ioctl(int cmd, struct kbsentry __user *user_kdgkb, int perm) { struct kbsentry *kbs; - char *p; u_char *q; - u_char __user *up; int sz, fnw_sz; int delta; char *first_free, *fj, *fnw; @@ -2022,23 +2025,19 @@ int vt_do_kdgkb_ioctl(int cmd, struct kbsentry __user *user_kdgkb, int perm) i = kbs->kb_func; switch (cmd) { - case KDGKBSENT: - sz = sizeof(kbs->kb_string) - 1; /* sz should have been - a struct member */ - up = user_kdgkb->kb_string; - p = func_table[i]; - if(p) - for ( ; *p && sz; p++, sz--) - if (put_user(*p, up++)) { - ret = -EFAULT; - goto reterr; - } - if (put_user('\0', up)) { - ret = -EFAULT; - goto reterr; - } - kfree(kbs); - return ((p && *p) ? -EOVERFLOW : 0); + case KDGKBSENT: { + /* size should have been a struct member */ + ssize_t len = sizeof(user_kdgkb->kb_string); + + spin_lock_irqsave(&func_buf_lock, flags); + len = strlcpy(kbs->kb_string, func_table[i] ? : "", len); + spin_unlock_irqrestore(&func_buf_lock, flags); + + ret = copy_to_user(user_kdgkb->kb_string, kbs->kb_string, + len + 1) ? -EFAULT : 0; + + goto reterr; + } case KDSKBSENT: if (!perm) { ret = -EPERM; diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c index 758f522f331e..13ea0579f104 100644 --- a/drivers/tty/vt/vt.c +++ b/drivers/tty/vt/vt.c @@ -4574,27 +4574,6 @@ static int con_font_default(struct vc_data *vc, struct console_font_op *op) return rc; } -static int con_font_copy(struct vc_data *vc, struct console_font_op *op) -{ - int con = op->height; - int rc; - - - console_lock(); - if (vc->vc_mode != KD_TEXT) - rc = -EINVAL; - else if (!vc->vc_sw->con_font_copy) - rc = -ENOSYS; - else if (con < 0 || !vc_cons_allocated(con)) - rc = -ENOTTY; - else if (con == vc->vc_num) /* nothing to do */ - rc = 0; - else - rc = vc->vc_sw->con_font_copy(vc, con); - console_unlock(); - return rc; -} - int con_font_op(struct vc_data *vc, struct console_font_op *op) { switch (op->op) { @@ -4605,7 +4584,8 @@ int con_font_op(struct vc_data *vc, struct console_font_op *op) case KD_FONT_OP_SET_DEFAULT: return con_font_default(vc, op); case KD_FONT_OP_COPY: - return con_font_copy(vc, op); + /* was buggy and never really used */ + return -EINVAL; } return -ENOSYS; } diff --git a/drivers/tty/vt/vt_ioctl.c b/drivers/tty/vt/vt_ioctl.c index 6a82030cf1ef..2e959563af53 100644 --- a/drivers/tty/vt/vt_ioctl.c +++ b/drivers/tty/vt/vt_ioctl.c @@ -244,7 +244,7 @@ int vt_waitactive(int n) static inline int -do_fontx_ioctl(int cmd, struct consolefontdesc __user *user_cfd, int perm, struct console_font_op *op) +do_fontx_ioctl(struct vc_data *vc, int cmd, struct consolefontdesc __user *user_cfd, int perm, struct console_font_op *op) { struct consolefontdesc cfdarg; int i; @@ -262,15 +262,16 @@ do_fontx_ioctl(int cmd, struct consolefontdesc __user *user_cfd, int perm, struc op->height = cfdarg.charheight; op->charcount = cfdarg.charcount; op->data = cfdarg.chardata; - return con_font_op(vc_cons[fg_console].d, op); - case GIO_FONTX: { + return con_font_op(vc, op); + + case GIO_FONTX: op->op = KD_FONT_OP_GET; op->flags = KD_FONT_FLAG_OLD; op->width = 8; op->height = cfdarg.charheight; op->charcount = cfdarg.charcount; op->data = cfdarg.chardata; - i = con_font_op(vc_cons[fg_console].d, op); + i = con_font_op(vc, op); if (i) return i; cfdarg.charheight = op->height; @@ -278,7 +279,6 @@ do_fontx_ioctl(int cmd, struct consolefontdesc __user *user_cfd, int perm, struc if (copy_to_user(user_cfd, &cfdarg, sizeof(struct consolefontdesc))) return -EFAULT; return 0; - } } return -EINVAL; } @@ -924,7 +924,7 @@ int vt_ioctl(struct tty_struct *tty, op.height = 0; op.charcount = 256; op.data = up; - ret = con_font_op(vc_cons[fg_console].d, &op); + ret = con_font_op(vc, &op); break; } @@ -935,7 +935,7 @@ int vt_ioctl(struct tty_struct *tty, op.height = 32; op.charcount = 256; op.data = up; - ret = con_font_op(vc_cons[fg_console].d, &op); + ret = con_font_op(vc, &op); break; } @@ -952,7 +952,7 @@ int vt_ioctl(struct tty_struct *tty, case PIO_FONTX: case GIO_FONTX: - ret = do_fontx_ioctl(cmd, up, perm, &op); + ret = do_fontx_ioctl(vc, cmd, up, perm, &op); break; case PIO_FONTRESET: @@ -969,11 +969,11 @@ int vt_ioctl(struct tty_struct *tty, { op.op = KD_FONT_OP_SET_DEFAULT; op.data = NULL; - ret = con_font_op(vc_cons[fg_console].d, &op); + ret = con_font_op(vc, &op); if (ret) break; console_lock(); - con_set_default_unimap(vc_cons[fg_console].d); + con_set_default_unimap(vc); console_unlock(); break; } @@ -1100,8 +1100,9 @@ struct compat_consolefontdesc { }; static inline int -compat_fontx_ioctl(int cmd, struct compat_consolefontdesc __user *user_cfd, - int perm, struct console_font_op *op) +compat_fontx_ioctl(struct vc_data *vc, int cmd, + struct compat_consolefontdesc __user *user_cfd, + int perm, struct console_font_op *op) { struct compat_consolefontdesc cfdarg; int i; @@ -1119,7 +1120,8 @@ compat_fontx_ioctl(int cmd, struct compat_consolefontdesc __user *user_cfd, op->height = cfdarg.charheight; op->charcount = cfdarg.charcount; op->data = compat_ptr(cfdarg.chardata); - return con_font_op(vc_cons[fg_console].d, op); + return con_font_op(vc, op); + case GIO_FONTX: op->op = KD_FONT_OP_GET; op->flags = KD_FONT_FLAG_OLD; @@ -1127,7 +1129,7 @@ compat_fontx_ioctl(int cmd, struct compat_consolefontdesc __user *user_cfd, op->height = cfdarg.charheight; op->charcount = cfdarg.charcount; op->data = compat_ptr(cfdarg.chardata); - i = con_font_op(vc_cons[fg_console].d, op); + i = con_font_op(vc, op); if (i) return i; cfdarg.charheight = op->height; @@ -1218,7 +1220,7 @@ long vt_compat_ioctl(struct tty_struct *tty, */ case PIO_FONTX: case GIO_FONTX: - ret = compat_fontx_ioctl(cmd, up, perm, &op); + ret = compat_fontx_ioctl(vc, cmd, up, perm, &op); break; case KDFONTOP: diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c index 9c788748bdc6..3926be659147 100644 --- a/drivers/uio/uio.c +++ b/drivers/uio/uio.c @@ -1008,8 +1008,6 @@ void uio_unregister_device(struct uio_info *info) idev = info->uio_dev; - uio_free_minor(idev); - mutex_lock(&idev->info_lock); uio_dev_del_attributes(idev); @@ -1021,6 +1019,8 @@ void uio_unregister_device(struct uio_info *info) device_unregister(&idev->dev); + uio_free_minor(idev); + return; } EXPORT_SYMBOL_GPL(uio_unregister_device); diff --git a/drivers/usb/class/cdc-acm.c b/drivers/usb/class/cdc-acm.c index 41453bf6fc0b..08751d1a765f 100644 --- a/drivers/usb/class/cdc-acm.c +++ b/drivers/usb/class/cdc-acm.c @@ -508,6 +508,7 @@ static void acm_read_bulk_callback(struct urb *urb) "%s - cooling babbling device\n", __func__); usb_mark_last_busy(acm->dev); set_bit(rb->index, &acm->urbs_in_error_delay); + set_bit(ACM_ERROR_DELAY, &acm->flags); cooldown = true; break; default: @@ -533,7 +534,7 @@ static void acm_read_bulk_callback(struct urb *urb) if (stopped || stalled || cooldown) { if (stalled) - schedule_work(&acm->work); + schedule_delayed_work(&acm->dwork, 0); else if (cooldown) schedule_delayed_work(&acm->dwork, HZ / 2); return; @@ -568,13 +569,13 @@ static void acm_write_bulk(struct urb *urb) acm_write_done(acm, wb); spin_unlock_irqrestore(&acm->write_lock, flags); set_bit(EVENT_TTY_WAKEUP, &acm->flags); - schedule_work(&acm->work); + schedule_delayed_work(&acm->dwork, 0); } static void acm_softint(struct work_struct *work) { int i; - struct acm *acm = container_of(work, struct acm, work); + struct acm *acm = container_of(work, struct acm, dwork.work); if (test_bit(EVENT_RX_STALL, &acm->flags)) { smp_mb(); /* against acm_suspend() */ @@ -590,7 +591,7 @@ static void acm_softint(struct work_struct *work) if (test_and_clear_bit(ACM_ERROR_DELAY, &acm->flags)) { for (i = 0; i < acm->rx_buflimit; i++) if (test_and_clear_bit(i, &acm->urbs_in_error_delay)) - acm_submit_read_urb(acm, i, GFP_NOIO); + acm_submit_read_urb(acm, i, GFP_KERNEL); } if (test_and_clear_bit(EVENT_TTY_WAKEUP, &acm->flags)) @@ -1275,9 +1276,21 @@ static int acm_probe(struct usb_interface *intf, } } } else { + int class = -1; + data_intf_num = union_header->bSlaveInterface0; control_interface = usb_ifnum_to_if(usb_dev, union_header->bMasterInterface0); data_interface = usb_ifnum_to_if(usb_dev, data_intf_num); + + if (control_interface) + class = control_interface->cur_altsetting->desc.bInterfaceClass; + + if (class != USB_CLASS_COMM && class != USB_CLASS_CDC_DATA) { + dev_dbg(&intf->dev, "Broken union descriptor, assuming single interface\n"); + combined_interfaces = 1; + control_interface = data_interface = intf; + goto look_for_collapsed_interface; + } } if (!control_interface || !data_interface) { @@ -1384,7 +1397,6 @@ made_compressed_probe: acm->ctrlsize = ctrlsize; acm->readsize = readsize; acm->rx_buflimit = num_rx_buf; - INIT_WORK(&acm->work, acm_softint); INIT_DELAYED_WORK(&acm->dwork, acm_softint); init_waitqueue_head(&acm->wioctl); spin_lock_init(&acm->write_lock); @@ -1594,7 +1606,6 @@ static void acm_disconnect(struct usb_interface *intf) } acm_kill_urbs(acm); - cancel_work_sync(&acm->work); cancel_delayed_work_sync(&acm->dwork); tty_unregister_device(acm_tty_driver, acm->minor); @@ -1637,7 +1648,6 @@ static int acm_suspend(struct usb_interface *intf, pm_message_t message) return 0; acm_kill_urbs(acm); - cancel_work_sync(&acm->work); cancel_delayed_work_sync(&acm->dwork); acm->urbs_in_error_delay = 0; @@ -1932,6 +1942,17 @@ static const struct usb_device_id acm_ids[] = { .driver_info = IGNORE_DEVICE, }, + /* Exclude ETAS ES58x */ + { USB_DEVICE(0x108c, 0x0159), /* ES581.4 */ + .driver_info = IGNORE_DEVICE, + }, + { USB_DEVICE(0x108c, 0x0168), /* ES582.1 */ + .driver_info = IGNORE_DEVICE, + }, + { USB_DEVICE(0x108c, 0x0169), /* ES584.1 */ + .driver_info = IGNORE_DEVICE, + }, + { USB_DEVICE(0x1bc7, 0x0021), /* Telit 3G ACM only composition */ .driver_info = SEND_ZERO_PACKET, }, diff --git a/drivers/usb/class/cdc-acm.h b/drivers/usb/class/cdc-acm.h index 30380d28a504..d8f8651425c4 100644 --- a/drivers/usb/class/cdc-acm.h +++ b/drivers/usb/class/cdc-acm.h @@ -111,8 +111,7 @@ struct acm { # define ACM_ERROR_DELAY 3 unsigned long urbs_in_error_delay; /* these need to be restarted after a delay */ struct usb_cdc_line_coding line; /* bits, stop, parity */ - struct work_struct work; /* work queue entry for various purposes*/ - struct delayed_work dwork; /* for cool downs needed in error recovery */ + struct delayed_work dwork; /* work queue entry for various purposes */ unsigned int ctrlin; /* input control lines (DCD, DSR, RI, break, overruns) */ unsigned int ctrlout; /* output control lines (DTR, RTS) */ struct async_icount iocount; /* counters for control line changes */ diff --git a/drivers/usb/class/cdc-wdm.c b/drivers/usb/class/cdc-wdm.c index 4929c5883068..55ad4c43b380 100644 --- a/drivers/usb/class/cdc-wdm.c +++ b/drivers/usb/class/cdc-wdm.c @@ -58,6 +58,9 @@ MODULE_DEVICE_TABLE (usb, wdm_ids); #define WDM_MAX 16 +/* we cannot wait forever at flush() */ +#define WDM_FLUSH_TIMEOUT (30 * HZ) + /* CDC-WMC r1.1 requires wMaxCommand to be "at least 256 decimal (0x100)" */ #define WDM_DEFAULT_BUFSIZE 256 @@ -151,7 +154,7 @@ static void wdm_out_callback(struct urb *urb) kfree(desc->outbuf); desc->outbuf = NULL; clear_bit(WDM_IN_USE, &desc->flags); - wake_up(&desc->wait); + wake_up_all(&desc->wait); } static void wdm_in_callback(struct urb *urb) @@ -393,6 +396,9 @@ static ssize_t wdm_write if (test_bit(WDM_RESETTING, &desc->flags)) r = -EIO; + if (test_bit(WDM_DISCONNECTING, &desc->flags)) + r = -ENODEV; + if (r < 0) { rv = r; goto out_free_mem_pm; @@ -424,6 +430,7 @@ static ssize_t wdm_write if (rv < 0) { desc->outbuf = NULL; clear_bit(WDM_IN_USE, &desc->flags); + wake_up_all(&desc->wait); /* for wdm_wait_for_response() */ dev_err(&desc->intf->dev, "Tx URB error: %d\n", rv); rv = usb_translate_errors(rv); goto out_free_mem_pm; @@ -583,28 +590,58 @@ err: return rv; } -static int wdm_flush(struct file *file, fl_owner_t id) +static int wdm_wait_for_response(struct file *file, long timeout) { struct wdm_device *desc = file->private_data; + long rv; /* Use long here because (int) MAX_SCHEDULE_TIMEOUT < 0. */ - wait_event(desc->wait, - /* - * needs both flags. We cannot do with one - * because resetting it would cause a race - * with write() yet we need to signal - * a disconnect - */ - !test_bit(WDM_IN_USE, &desc->flags) || - test_bit(WDM_DISCONNECTING, &desc->flags)); + /* + * Needs both flags. We cannot do with one because resetting it would + * cause a race with write() yet we need to signal a disconnect. + */ + rv = wait_event_interruptible_timeout(desc->wait, + !test_bit(WDM_IN_USE, &desc->flags) || + test_bit(WDM_DISCONNECTING, &desc->flags), + timeout); - /* cannot dereference desc->intf if WDM_DISCONNECTING */ + /* + * To report the correct error. This is best effort. + * We are inevitably racing with the hardware. + */ if (test_bit(WDM_DISCONNECTING, &desc->flags)) return -ENODEV; - if (desc->werr < 0) - dev_err(&desc->intf->dev, "Error in flush path: %d\n", - desc->werr); + if (!rv) + return -EIO; + if (rv < 0) + return -EINTR; - return usb_translate_errors(desc->werr); + spin_lock_irq(&desc->iuspin); + rv = desc->werr; + desc->werr = 0; + spin_unlock_irq(&desc->iuspin); + + return usb_translate_errors(rv); + +} + +/* + * You need to send a signal when you react to malicious or defective hardware. + * Also, don't abort when fsync() returned -EINVAL, for older kernels which do + * not implement wdm_flush() will return -EINVAL. + */ +static int wdm_fsync(struct file *file, loff_t start, loff_t end, int datasync) +{ + return wdm_wait_for_response(file, MAX_SCHEDULE_TIMEOUT); +} + +/* + * Same with wdm_fsync(), except it uses finite timeout in order to react to + * malicious or defective hardware which ceased communication after close() was + * implicitly called due to process termination. + */ +static int wdm_flush(struct file *file, fl_owner_t id) +{ + return wdm_wait_for_response(file, WDM_FLUSH_TIMEOUT); } static __poll_t wdm_poll(struct file *file, struct poll_table_struct *wait) @@ -729,6 +766,7 @@ static const struct file_operations wdm_fops = { .owner = THIS_MODULE, .read = wdm_read, .write = wdm_write, + .fsync = wdm_fsync, .open = wdm_open, .flush = wdm_flush, .release = wdm_release, diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c index 4ee810531098..5ad14cdd9762 100644 --- a/drivers/usb/core/quirks.c +++ b/drivers/usb/core/quirks.c @@ -378,6 +378,9 @@ static const struct usb_device_id usb_quirk_list[] = { { USB_DEVICE(0x0926, 0x3333), .driver_info = USB_QUIRK_CONFIG_INTF_STRINGS }, + /* Kingston DataTraveler 3.0 */ + { USB_DEVICE(0x0951, 0x1666), .driver_info = USB_QUIRK_NO_LPM }, + /* X-Rite/Gretag-Macbeth Eye-One Pro display colorimeter */ { USB_DEVICE(0x0971, 0x2000), .driver_info = USB_QUIRK_NO_SET_INTF }, diff --git a/drivers/usb/core/urb.c b/drivers/usb/core/urb.c index 5e844097a9e3..3cd7732c086e 100644 --- a/drivers/usb/core/urb.c +++ b/drivers/usb/core/urb.c @@ -773,11 +773,12 @@ void usb_block_urb(struct urb *urb) EXPORT_SYMBOL_GPL(usb_block_urb); /** - * usb_kill_anchored_urbs - cancel transfer requests en masse + * usb_kill_anchored_urbs - kill all URBs associated with an anchor * @anchor: anchor the requests are bound to * - * this allows all outstanding URBs to be killed starting - * from the back of the queue + * This kills all outstanding URBs starting from the back of the queue, + * with guarantee that no completer callbacks will take place from the + * anchor after this function returns. * * This routine should not be called by a driver after its disconnect * method has returned. @@ -785,20 +786,26 @@ EXPORT_SYMBOL_GPL(usb_block_urb); void usb_kill_anchored_urbs(struct usb_anchor *anchor) { struct urb *victim; + int surely_empty; - spin_lock_irq(&anchor->lock); - while (!list_empty(&anchor->urb_list)) { - victim = list_entry(anchor->urb_list.prev, struct urb, - anchor_list); - /* we must make sure the URB isn't freed before we kill it*/ - usb_get_urb(victim); - spin_unlock_irq(&anchor->lock); - /* this will unanchor the URB */ - usb_kill_urb(victim); - usb_put_urb(victim); + do { spin_lock_irq(&anchor->lock); - } - spin_unlock_irq(&anchor->lock); + while (!list_empty(&anchor->urb_list)) { + victim = list_entry(anchor->urb_list.prev, + struct urb, anchor_list); + /* make sure the URB isn't freed before we kill it */ + usb_get_urb(victim); + spin_unlock_irq(&anchor->lock); + /* this will unanchor the URB */ + usb_kill_urb(victim); + usb_put_urb(victim); + spin_lock_irq(&anchor->lock); + } + surely_empty = usb_anchor_check_wakeup(anchor); + + spin_unlock_irq(&anchor->lock); + cpu_relax(); + } while (!surely_empty); } EXPORT_SYMBOL_GPL(usb_kill_anchored_urbs); @@ -817,21 +824,27 @@ EXPORT_SYMBOL_GPL(usb_kill_anchored_urbs); void usb_poison_anchored_urbs(struct usb_anchor *anchor) { struct urb *victim; + int surely_empty; - spin_lock_irq(&anchor->lock); - anchor->poisoned = 1; - while (!list_empty(&anchor->urb_list)) { - victim = list_entry(anchor->urb_list.prev, struct urb, - anchor_list); - /* we must make sure the URB isn't freed before we kill it*/ - usb_get_urb(victim); - spin_unlock_irq(&anchor->lock); - /* this will unanchor the URB */ - usb_poison_urb(victim); - usb_put_urb(victim); + do { spin_lock_irq(&anchor->lock); - } - spin_unlock_irq(&anchor->lock); + anchor->poisoned = 1; + while (!list_empty(&anchor->urb_list)) { + victim = list_entry(anchor->urb_list.prev, + struct urb, anchor_list); + /* make sure the URB isn't freed before we kill it */ + usb_get_urb(victim); + spin_unlock_irq(&anchor->lock); + /* this will unanchor the URB */ + usb_poison_urb(victim); + usb_put_urb(victim); + spin_lock_irq(&anchor->lock); + } + surely_empty = usb_anchor_check_wakeup(anchor); + + spin_unlock_irq(&anchor->lock); + cpu_relax(); + } while (!surely_empty); } EXPORT_SYMBOL_GPL(usb_poison_anchored_urbs); @@ -971,14 +984,20 @@ void usb_scuttle_anchored_urbs(struct usb_anchor *anchor) { struct urb *victim; unsigned long flags; + int surely_empty; - spin_lock_irqsave(&anchor->lock, flags); - while (!list_empty(&anchor->urb_list)) { - victim = list_entry(anchor->urb_list.prev, struct urb, - anchor_list); - __usb_unanchor_urb(victim, anchor); - } - spin_unlock_irqrestore(&anchor->lock, flags); + do { + spin_lock_irqsave(&anchor->lock, flags); + while (!list_empty(&anchor->urb_list)) { + victim = list_entry(anchor->urb_list.prev, + struct urb, anchor_list); + __usb_unanchor_urb(victim, anchor); + } + surely_empty = usb_anchor_check_wakeup(anchor); + + spin_unlock_irqrestore(&anchor->lock, flags); + cpu_relax(); + } while (!surely_empty); } EXPORT_SYMBOL_GPL(usb_scuttle_anchored_urbs); diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c index f18aa3f59e51..8e98b4df9b10 100644 --- a/drivers/usb/dwc2/gadget.c +++ b/drivers/usb/dwc2/gadget.c @@ -671,8 +671,11 @@ static u32 dwc2_hsotg_read_frameno(struct dwc2_hsotg *hsotg) */ static unsigned int dwc2_gadget_get_chain_limit(struct dwc2_hsotg_ep *hs_ep) { + const struct usb_endpoint_descriptor *ep_desc = hs_ep->ep.desc; int is_isoc = hs_ep->isochronous; unsigned int maxsize; + u32 mps = hs_ep->ep.maxpacket; + int dir_in = hs_ep->dir_in; if (is_isoc) maxsize = (hs_ep->dir_in ? DEV_DMA_ISOC_TX_NBYTES_LIMIT : @@ -681,6 +684,11 @@ static unsigned int dwc2_gadget_get_chain_limit(struct dwc2_hsotg_ep *hs_ep) else maxsize = DEV_DMA_NBYTES_LIMIT * MAX_DMA_DESC_NUM_GENERIC; + /* Interrupt OUT EP with mps not multiple of 4 */ + if (hs_ep->index) + if (usb_endpoint_xfer_int(ep_desc) && !dir_in && (mps % 4)) + maxsize = mps * MAX_DMA_DESC_NUM_GENERIC; + return maxsize; } @@ -696,11 +704,14 @@ static unsigned int dwc2_gadget_get_chain_limit(struct dwc2_hsotg_ep *hs_ep) * Isochronous - descriptor rx/tx bytes bitfield limit, * Control In/Bulk/Interrupt - multiple of mps. This will allow to not * have concatenations from various descriptors within one packet. + * Interrupt OUT - if mps not multiple of 4 then a single packet corresponds + * to a single descriptor. * * Selects corresponding mask for RX/TX bytes as well. */ static u32 dwc2_gadget_get_desc_params(struct dwc2_hsotg_ep *hs_ep, u32 *mask) { + const struct usb_endpoint_descriptor *ep_desc = hs_ep->ep.desc; u32 mps = hs_ep->ep.maxpacket; int dir_in = hs_ep->dir_in; u32 desc_size = 0; @@ -724,6 +735,13 @@ static u32 dwc2_gadget_get_desc_params(struct dwc2_hsotg_ep *hs_ep, u32 *mask) desc_size -= desc_size % mps; } + /* Interrupt OUT EP with mps not multiple of 4 */ + if (hs_ep->index) + if (usb_endpoint_xfer_int(ep_desc) && !dir_in && (mps % 4)) { + desc_size = mps; + *mask = DEV_DMA_NBYTES_MASK; + } + return desc_size; } @@ -1044,13 +1062,7 @@ static void dwc2_hsotg_start_req(struct dwc2_hsotg *hsotg, length += (mps - (length % mps)); } - /* - * If more data to send, adjust DMA for EP0 out data stage. - * ureq->dma stays unchanged, hence increment it by already - * passed passed data count before starting new transaction. - */ - if (!index && hsotg->ep0_state == DWC2_EP0_DATA_OUT && - continuing) + if (continuing) offset = ureq->actual; /* Fill DDMA chain entries */ @@ -2220,22 +2232,36 @@ static void dwc2_hsotg_change_ep_iso_parity(struct dwc2_hsotg *hsotg, */ static unsigned int dwc2_gadget_get_xfersize_ddma(struct dwc2_hsotg_ep *hs_ep) { + const struct usb_endpoint_descriptor *ep_desc = hs_ep->ep.desc; struct dwc2_hsotg *hsotg = hs_ep->parent; unsigned int bytes_rem = 0; + unsigned int bytes_rem_correction = 0; struct dwc2_dma_desc *desc = hs_ep->desc_list; int i; u32 status; + u32 mps = hs_ep->ep.maxpacket; + int dir_in = hs_ep->dir_in; if (!desc) return -EINVAL; + /* Interrupt OUT EP with mps not multiple of 4 */ + if (hs_ep->index) + if (usb_endpoint_xfer_int(ep_desc) && !dir_in && (mps % 4)) + bytes_rem_correction = 4 - (mps % 4); + for (i = 0; i < hs_ep->desc_count; ++i) { status = desc->status; bytes_rem += status & DEV_DMA_NBYTES_MASK; + bytes_rem -= bytes_rem_correction; if (status & DEV_DMA_STS_MASK) dev_err(hsotg->dev, "descriptor %d closed with %x\n", i, status & DEV_DMA_STS_MASK); + + if (status & DEV_DMA_L) + break; + desc++; } diff --git a/drivers/usb/dwc2/params.c b/drivers/usb/dwc2/params.c index a93415f33bf3..6d7861cba3f5 100644 --- a/drivers/usb/dwc2/params.c +++ b/drivers/usb/dwc2/params.c @@ -808,7 +808,7 @@ int dwc2_get_hwparams(struct dwc2_hsotg *hsotg) int dwc2_init_params(struct dwc2_hsotg *hsotg) { const struct of_device_id *match; - void (*set_params)(void *data); + void (*set_params)(struct dwc2_hsotg *data); dwc2_set_default_params(hsotg); dwc2_get_device_properties(hsotg); diff --git a/drivers/usb/dwc3/dwc3-of-simple.c b/drivers/usb/dwc3/dwc3-of-simple.c index 4c2771c5e727..1ef89a4317c8 100644 --- a/drivers/usb/dwc3/dwc3-of-simple.c +++ b/drivers/usb/dwc3/dwc3-of-simple.c @@ -243,6 +243,7 @@ static const struct of_device_id of_dwc3_simple_match[] = { { .compatible = "amlogic,meson-axg-dwc3" }, { .compatible = "amlogic,meson-gxl-dwc3" }, { .compatible = "allwinner,sun50i-h6-dwc3" }, + { .compatible = "hisilicon,hi3670-dwc3" }, { /* Sentinel */ } }; MODULE_DEVICE_TABLE(of, of_dwc3_simple_match); diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c index 1b70b8da2a72..7db46a37605d 100644 --- a/drivers/usb/dwc3/ep0.c +++ b/drivers/usb/dwc3/ep0.c @@ -1033,12 +1033,16 @@ static void dwc3_ep0_xfer_complete(struct dwc3 *dwc, static void __dwc3_ep0_do_control_data(struct dwc3 *dwc, struct dwc3_ep *dep, struct dwc3_request *req) { + unsigned int trb_length = 0; int ret; req->direction = !!dep->number; if (req->request.length == 0) { - dwc3_ep0_prepare_one_trb(dep, dwc->ep0_trb_addr, 0, + if (!req->direction) + trb_length = dep->endpoint.maxpacket; + + dwc3_ep0_prepare_one_trb(dep, dwc->bounce_addr, trb_length, DWC3_TRBCTL_CONTROL_DATA, false); ret = dwc3_ep0_start_trans(dep); } else if (!IS_ALIGNED(req->request.length, dep->endpoint.maxpacket) @@ -1088,9 +1092,12 @@ static void __dwc3_ep0_do_control_data(struct dwc3 *dwc, req->trb = &dwc->ep0_trb[dep->trb_enqueue - 1]; + if (!req->direction) + trb_length = dep->endpoint.maxpacket; + /* Now prepare one extra TRB to align transfer size */ dwc3_ep0_prepare_one_trb(dep, dwc->bounce_addr, - 0, DWC3_TRBCTL_CONTROL_DATA, + trb_length, DWC3_TRBCTL_CONTROL_DATA, false); ret = dwc3_ep0_start_trans(dep); } else { diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c index 20c25fbfb9a7..e87f089d98ff 100644 --- a/drivers/usb/dwc3/gadget.c +++ b/drivers/usb/dwc3/gadget.c @@ -1283,6 +1283,8 @@ static void dwc3_prepare_one_trb_sg(struct dwc3_ep *dep, struct scatterlist *s; int i; unsigned int length = req->request.length; + unsigned int maxp = usb_endpoint_maxp(dep->endpoint.desc); + unsigned int rem = length % maxp; unsigned int remaining = req->request.num_mapped_sgs - req->num_queued_sgs; @@ -1294,8 +1296,6 @@ static void dwc3_prepare_one_trb_sg(struct dwc3_ep *dep, length -= sg_dma_len(s); for_each_sg(sg, s, remaining, i) { - unsigned int maxp = usb_endpoint_maxp(dep->endpoint.desc); - unsigned int rem = length % maxp; unsigned int trb_length; unsigned chain = true; diff --git a/drivers/usb/gadget/function/f_ncm.c b/drivers/usb/gadget/function/f_ncm.c index e7a4c3b67d03..4ba859295524 100644 --- a/drivers/usb/gadget/function/f_ncm.c +++ b/drivers/usb/gadget/function/f_ncm.c @@ -86,8 +86,10 @@ static inline struct f_ncm *func_to_ncm(struct usb_function *f) /* peak (theoretical) bulk transfer rate in bits-per-second */ static inline unsigned ncm_bitrate(struct usb_gadget *g) { - if (gadget_is_superspeed(g) && g->speed == USB_SPEED_SUPER) - return 13 * 1024 * 8 * 1000 * 8; + if (gadget_is_superspeed(g) && g->speed >= USB_SPEED_SUPER_PLUS) + return 4250000000U; + else if (gadget_is_superspeed(g) && g->speed == USB_SPEED_SUPER) + return 3750000000U; else if (gadget_is_dualspeed(g) && g->speed == USB_SPEED_HIGH) return 13 * 512 * 8 * 1000 * 8; else diff --git a/drivers/usb/gadget/function/f_printer.c b/drivers/usb/gadget/function/f_printer.c index 9c7ed2539ff7..8ed1295d7e35 100644 --- a/drivers/usb/gadget/function/f_printer.c +++ b/drivers/usb/gadget/function/f_printer.c @@ -31,6 +31,7 @@ #include #include #include +#include #include #include @@ -64,7 +65,7 @@ struct printer_dev { struct usb_gadget *gadget; s8 interface; struct usb_ep *in_ep, *out_ep; - + struct kref kref; struct list_head rx_reqs; /* List of free RX structs */ struct list_head rx_reqs_active; /* List of Active RX xfers */ struct list_head rx_buffers; /* List of completed xfers */ @@ -218,6 +219,13 @@ static inline struct usb_endpoint_descriptor *ep_desc(struct usb_gadget *gadget, /*-------------------------------------------------------------------------*/ +static void printer_dev_free(struct kref *kref) +{ + struct printer_dev *dev = container_of(kref, struct printer_dev, kref); + + kfree(dev); +} + static struct usb_request * printer_req_alloc(struct usb_ep *ep, unsigned len, gfp_t gfp_flags) { @@ -348,6 +356,7 @@ printer_open(struct inode *inode, struct file *fd) spin_unlock_irqrestore(&dev->lock, flags); + kref_get(&dev->kref); DBG(dev, "printer_open returned %x\n", ret); return ret; } @@ -365,6 +374,7 @@ printer_close(struct inode *inode, struct file *fd) dev->printer_status &= ~PRINTER_SELECTED; spin_unlock_irqrestore(&dev->lock, flags); + kref_put(&dev->kref, printer_dev_free); DBG(dev, "printer_close\n"); return 0; @@ -1350,7 +1360,8 @@ static void gprinter_free(struct usb_function *f) struct f_printer_opts *opts; opts = container_of(f->fi, struct f_printer_opts, func_inst); - kfree(dev); + + kref_put(&dev->kref, printer_dev_free); mutex_lock(&opts->lock); --opts->refcnt; mutex_unlock(&opts->lock); @@ -1419,6 +1430,7 @@ static struct usb_function *gprinter_alloc(struct usb_function_instance *fi) return ERR_PTR(-ENOMEM); } + kref_init(&dev->kref); ++opts->refcnt; dev->minor = opts->minor; dev->pnp_string = opts->pnp_string; diff --git a/drivers/usb/gadget/function/u_ether.c b/drivers/usb/gadget/function/u_ether.c index 540724b6f4b8..ddab9e1fa082 100644 --- a/drivers/usb/gadget/function/u_ether.c +++ b/drivers/usb/gadget/function/u_ether.c @@ -94,7 +94,7 @@ struct eth_dev { static inline int qlen(struct usb_gadget *gadget, unsigned qmult) { if (gadget_is_dualspeed(gadget) && (gadget->speed == USB_SPEED_HIGH || - gadget->speed == USB_SPEED_SUPER)) + gadget->speed >= USB_SPEED_SUPER)) return qmult * DEFAULT_QLEN; else return DEFAULT_QLEN; diff --git a/drivers/usb/host/fsl-mph-dr-of.c b/drivers/usb/host/fsl-mph-dr-of.c index 677f9d592109..de922022b83a 100644 --- a/drivers/usb/host/fsl-mph-dr-of.c +++ b/drivers/usb/host/fsl-mph-dr-of.c @@ -94,10 +94,13 @@ static struct platform_device *fsl_usb2_device_register( pdev->dev.coherent_dma_mask = ofdev->dev.coherent_dma_mask; - if (!pdev->dev.dma_mask) + if (!pdev->dev.dma_mask) { pdev->dev.dma_mask = &ofdev->dev.coherent_dma_mask; - else - dma_set_mask(&pdev->dev, DMA_BIT_MASK(32)); + } else { + retval = dma_set_mask(&pdev->dev, DMA_BIT_MASK(32)); + if (retval) + goto error; + } retval = platform_device_add_data(pdev, pdata, sizeof(*pdata)); if (retval) diff --git a/drivers/usb/host/ohci-hcd.c b/drivers/usb/host/ohci-hcd.c index af11887f5f9e..e88486d8084a 100644 --- a/drivers/usb/host/ohci-hcd.c +++ b/drivers/usb/host/ohci-hcd.c @@ -665,20 +665,24 @@ retry: /* handle root hub init quirks ... */ val = roothub_a (ohci); - val &= ~(RH_A_PSM | RH_A_OCPM); + /* Configure for per-port over-current protection by default */ + val &= ~RH_A_NOCP; + val |= RH_A_OCPM; if (ohci->flags & OHCI_QUIRK_SUPERIO) { - /* NSC 87560 and maybe others */ + /* NSC 87560 and maybe others. + * Ganged power switching, no over-current protection. + */ val |= RH_A_NOCP; - val &= ~(RH_A_POTPGT | RH_A_NPS); - ohci_writel (ohci, val, &ohci->regs->roothub.a); + val &= ~(RH_A_POTPGT | RH_A_NPS | RH_A_PSM | RH_A_OCPM); } else if ((ohci->flags & OHCI_QUIRK_AMD756) || (ohci->flags & OHCI_QUIRK_HUB_POWER)) { /* hub power always on; required for AMD-756 and some - * Mac platforms. ganged overcurrent reporting, if any. + * Mac platforms. */ val |= RH_A_NPS; - ohci_writel (ohci, val, &ohci->regs->roothub.a); } + ohci_writel(ohci, val, &ohci->regs->roothub.a); + ohci_writel (ohci, RH_HS_LPSC, &ohci->regs->roothub.status); ohci_writel (ohci, (val & RH_A_NPS) ? 0 : RH_B_PPCM, &ohci->regs->roothub.b); diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c index 1a6a23e57201..0c6b6f14b169 100644 --- a/drivers/usb/host/xhci-pci.c +++ b/drivers/usb/host/xhci-pci.c @@ -21,6 +21,8 @@ #define SSIC_PORT_CFG2_OFFSET 0x30 #define PROG_DONE (1 << 30) #define SSIC_PORT_UNUSED (1 << 31) +#define SPARSE_DISABLE_BIT 17 +#define SPARSE_CNTL_ENABLE 0xC12C /* Device for a quirk */ #define PCI_VENDOR_ID_FRESCO_LOGIC 0x1b73 @@ -141,6 +143,9 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci) (pdev->device == 0x15e0 || pdev->device == 0x15e1)) xhci->quirks |= XHCI_SNPS_BROKEN_SUSPEND; + if (pdev->vendor == PCI_VENDOR_ID_AMD && pdev->device == 0x15e5) + xhci->quirks |= XHCI_DISABLE_SPARSE; + if (pdev->vendor == PCI_VENDOR_ID_AMD) xhci->quirks |= XHCI_TRUST_TX_LENGTH; @@ -441,6 +446,15 @@ static void xhci_pme_quirk(struct usb_hcd *hcd) readl(reg); } +static void xhci_sparse_control_quirk(struct usb_hcd *hcd) +{ + u32 reg; + + reg = readl(hcd->regs + SPARSE_CNTL_ENABLE); + reg &= ~BIT(SPARSE_DISABLE_BIT); + writel(reg, hcd->regs + SPARSE_CNTL_ENABLE); +} + static int xhci_pci_suspend(struct usb_hcd *hcd, bool do_wakeup) { struct xhci_hcd *xhci = hcd_to_xhci(hcd); @@ -460,6 +474,9 @@ static int xhci_pci_suspend(struct usb_hcd *hcd, bool do_wakeup) if (xhci->quirks & XHCI_SSIC_PORT_UNUSED) xhci_ssic_port_unused_quirk(hcd, true); + if (xhci->quirks & XHCI_DISABLE_SPARSE) + xhci_sparse_control_quirk(hcd); + ret = xhci_suspend(xhci, do_wakeup); if (ret && (xhci->quirks & XHCI_SSIC_PORT_UNUSED)) xhci_ssic_port_unused_quirk(hcd, false); diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c index 299ff2d3b429..64a69b3c8ce9 100644 --- a/drivers/usb/host/xhci.c +++ b/drivers/usb/host/xhci.c @@ -1001,12 +1001,15 @@ int xhci_suspend(struct xhci_hcd *xhci, bool do_wakeup) xhci->shared_hcd->state != HC_STATE_SUSPENDED) return -EINVAL; - xhci_dbc_suspend(xhci); - /* Clear root port wake on bits if wakeup not allowed. */ if (!do_wakeup) xhci_disable_port_wake_on_bits(xhci); + if (!HCD_HW_ACCESSIBLE(hcd)) + return 0; + + xhci_dbc_suspend(xhci); + /* Don't poll the roothubs on bus suspend. */ xhci_dbg(xhci, "%s: stopping port polling.\n", __func__); clear_bit(HCD_FLAG_POLL_RH, &hcd->flags); diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h index d5be443e182e..2baedad172e7 100644 --- a/drivers/usb/host/xhci.h +++ b/drivers/usb/host/xhci.h @@ -1881,6 +1881,7 @@ struct xhci_hcd { #define XHCI_ZERO_64B_REGS BIT_ULL(32) #define XHCI_RESET_PLL_ON_DISCONNECT BIT_ULL(34) #define XHCI_SNPS_BROKEN_SUSPEND BIT_ULL(35) +#define XHCI_DISABLE_SPARSE BIT_ULL(38) unsigned int num_active_eps; unsigned int limit_active_eps; diff --git a/drivers/usb/misc/adutux.c b/drivers/usb/misc/adutux.c index b8073f36ffdc..62fdfde4ad03 100644 --- a/drivers/usb/misc/adutux.c +++ b/drivers/usb/misc/adutux.c @@ -209,6 +209,7 @@ static void adu_interrupt_out_callback(struct urb *urb) if (status != 0) { if ((status != -ENOENT) && + (status != -ESHUTDOWN) && (status != -ECONNRESET)) { dev_dbg(&dev->udev->dev, "%s :nonzero status received: %d\n", __func__, diff --git a/drivers/usb/mtu3/mtu3_gadget.c b/drivers/usb/mtu3/mtu3_gadget.c index bbcd3332471d..3828ed6d38a2 100644 --- a/drivers/usb/mtu3/mtu3_gadget.c +++ b/drivers/usb/mtu3/mtu3_gadget.c @@ -573,6 +573,7 @@ static int mtu3_gadget_stop(struct usb_gadget *g) spin_unlock_irqrestore(&mtu->lock, flags); + synchronize_irq(mtu->irq); return 0; } diff --git a/drivers/usb/serial/cyberjack.c b/drivers/usb/serial/cyberjack.c index ebd76ab07b72..36dd688b5795 100644 --- a/drivers/usb/serial/cyberjack.c +++ b/drivers/usb/serial/cyberjack.c @@ -357,11 +357,12 @@ static void cyberjack_write_bulk_callback(struct urb *urb) struct device *dev = &port->dev; int status = urb->status; unsigned long flags; + bool resubmitted = false; - set_bit(0, &port->write_urbs_free); if (status) { dev_dbg(dev, "%s - nonzero write bulk status received: %d\n", __func__, status); + set_bit(0, &port->write_urbs_free); return; } @@ -394,6 +395,8 @@ static void cyberjack_write_bulk_callback(struct urb *urb) goto exit; } + resubmitted = true; + dev_dbg(dev, "%s - priv->wrsent=%d\n", __func__, priv->wrsent); dev_dbg(dev, "%s - priv->wrfilled=%d\n", __func__, priv->wrfilled); @@ -410,6 +413,8 @@ static void cyberjack_write_bulk_callback(struct urb *urb) exit: spin_unlock_irqrestore(&priv->lock, flags); + if (!resubmitted) + set_bit(0, &port->write_urbs_free); usb_serial_port_softint(port); } diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c index c773db129bf9..f28344f03141 100644 --- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -250,6 +250,7 @@ static void option_instat_callback(struct urb *urb); #define QUECTEL_PRODUCT_EP06 0x0306 #define QUECTEL_PRODUCT_EM12 0x0512 #define QUECTEL_PRODUCT_RM500Q 0x0800 +#define QUECTEL_PRODUCT_EC200T 0x6026 #define CMOTECH_VENDOR_ID 0x16d8 #define CMOTECH_PRODUCT_6001 0x6001 @@ -1117,6 +1118,7 @@ static const struct usb_device_id option_ids[] = { { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_RM500Q, 0xff, 0, 0) }, { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_RM500Q, 0xff, 0xff, 0x10), .driver_info = ZLP }, + { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EC200T, 0xff, 0, 0) }, { USB_DEVICE(CMOTECH_VENDOR_ID, CMOTECH_PRODUCT_6001) }, { USB_DEVICE(CMOTECH_VENDOR_ID, CMOTECH_PRODUCT_CMU_300) }, @@ -1189,6 +1191,8 @@ static const struct usb_device_id option_ids[] = { .driver_info = NCTRL(0) | RSVD(1) }, { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1054, 0xff), /* Telit FT980-KS */ .driver_info = NCTRL(2) | RSVD(3) }, + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1055, 0xff), /* Telit FN980 (PCIe) */ + .driver_info = NCTRL(0) | RSVD(1) }, { USB_DEVICE(TELIT_VENDOR_ID, TELIT_PRODUCT_ME910), .driver_info = NCTRL(0) | RSVD(1) | RSVD(3) }, { USB_DEVICE(TELIT_VENDOR_ID, TELIT_PRODUCT_ME910_DUAL_MODEM), @@ -1201,6 +1205,8 @@ static const struct usb_device_id option_ids[] = { .driver_info = NCTRL(0) }, { USB_DEVICE(TELIT_VENDOR_ID, TELIT_PRODUCT_LE910), .driver_info = NCTRL(0) | RSVD(1) | RSVD(2) }, + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1203, 0xff), /* Telit LE910Cx (RNDIS) */ + .driver_info = NCTRL(2) | RSVD(3) }, { USB_DEVICE(TELIT_VENDOR_ID, TELIT_PRODUCT_LE910_USBCFG4), .driver_info = NCTRL(0) | RSVD(1) | RSVD(2) | RSVD(3) }, { USB_DEVICE(TELIT_VENDOR_ID, TELIT_PRODUCT_LE920), @@ -1215,6 +1221,10 @@ static const struct usb_device_id option_ids[] = { { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, TELIT_PRODUCT_LE920A4_1213, 0xff) }, { USB_DEVICE(TELIT_VENDOR_ID, TELIT_PRODUCT_LE920A4_1214), .driver_info = NCTRL(0) | RSVD(1) | RSVD(2) | RSVD(3) }, + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1230, 0xff), /* Telit LE910Cx (rmnet) */ + .driver_info = NCTRL(0) | RSVD(1) | RSVD(2) }, + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1231, 0xff), /* Telit LE910Cx (RNDIS) */ + .driver_info = NCTRL(2) | RSVD(3) }, { USB_DEVICE(TELIT_VENDOR_ID, 0x1260), .driver_info = NCTRL(0) | RSVD(1) | RSVD(2) }, { USB_DEVICE(TELIT_VENDOR_ID, 0x1261), diff --git a/drivers/usb/typec/tcpm.c b/drivers/usb/typec/tcpm.c index 29d72e9b0f01..af41d4dce3ad 100644 --- a/drivers/usb/typec/tcpm.c +++ b/drivers/usb/typec/tcpm.c @@ -2727,12 +2727,12 @@ static void tcpm_reset_port(struct tcpm_port *port) static void tcpm_detach(struct tcpm_port *port) { - if (!port->attached) - return; - if (tcpm_port_is_disconnected(port)) port->hard_reset_count = 0; + if (!port->attached) + return; + tcpm_reset_port(port); } @@ -3486,7 +3486,7 @@ static void run_state_machine(struct tcpm_port *port) */ tcpm_set_pwr_role(port, TYPEC_SOURCE); tcpm_pd_send_control(port, PD_CTRL_PS_RDY); - tcpm_set_state(port, SRC_STARTUP, 0); + tcpm_set_state(port, SRC_STARTUP, PD_T_SWAP_SRC_START); break; case VCONN_SWAP_ACCEPT: diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index bdfdd506bc58..c989f777bf77 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -355,11 +355,13 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev, vdev->ctx[vector].producer.token = trigger; vdev->ctx[vector].producer.irq = irq; ret = irq_bypass_register_producer(&vdev->ctx[vector].producer); - if (unlikely(ret)) + if (unlikely(ret)) { dev_info(&pdev->dev, "irq bypass producer (token %p) registration fails: %d\n", vdev->ctx[vector].producer.token, ret); + vdev->ctx[vector].producer.token = NULL; + } vdev->ctx[vector].trigger = trigger; return 0; diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index 482e2d27b47b..3f20cc205666 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -638,7 +638,8 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data, ret = vfio_add_to_pfn_list(dma, iova, phys_pfn[i]); if (ret) { - vfio_unpin_page_external(dma, iova, do_accounting); + if (put_pfn(phys_pfn[i], dma->prot) && do_accounting) + vfio_lock_acct(dma, -1, true); goto pin_unwind; } } diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c index a94d700a4503..59c61744dcc1 100644 --- a/drivers/vhost/vringh.c +++ b/drivers/vhost/vringh.c @@ -273,13 +273,14 @@ __vringh_iov(struct vringh *vrh, u16 i, desc_max = vrh->vring.num; up_next = -1; + /* You must want something! */ + if (WARN_ON(!riov && !wiov)) + return -EINVAL; + if (riov) riov->i = riov->used = 0; - else if (wiov) + if (wiov) wiov->i = wiov->used = 0; - else - /* You must want something! */ - BUG(); for (;;) { void *addr; diff --git a/drivers/video/backlight/sky81452-backlight.c b/drivers/video/backlight/sky81452-backlight.c index d414c7a3acf5..a2f77625b717 100644 --- a/drivers/video/backlight/sky81452-backlight.c +++ b/drivers/video/backlight/sky81452-backlight.c @@ -207,6 +207,7 @@ static struct sky81452_bl_platform_data *sky81452_bl_parse_dt( num_entry); if (ret < 0) { dev_err(dev, "led-sources node is invalid.\n"); + of_node_put(np); return ERR_PTR(-EINVAL); } diff --git a/drivers/video/fbdev/aty/radeon_base.c b/drivers/video/fbdev/aty/radeon_base.c index e8594bbaea60..c6109a385cac 100644 --- a/drivers/video/fbdev/aty/radeon_base.c +++ b/drivers/video/fbdev/aty/radeon_base.c @@ -2327,7 +2327,7 @@ static int radeonfb_pci_register(struct pci_dev *pdev, ret = radeon_kick_out_firmware_fb(pdev); if (ret) - return ret; + goto err_release_fb; /* request the mem regions */ ret = pci_request_region(pdev, 0, "radeonfb framebuffer"); diff --git a/drivers/video/fbdev/pvr2fb.c b/drivers/video/fbdev/pvr2fb.c index 8a53d1de611d..3fd2cb4cdfa9 100644 --- a/drivers/video/fbdev/pvr2fb.c +++ b/drivers/video/fbdev/pvr2fb.c @@ -1027,6 +1027,8 @@ static int __init pvr2fb_setup(char *options) if (!options || !*options) return 0; + cable_arg[0] = output_arg[0] = 0; + while ((this_opt = strsep(&options, ","))) { if (!*this_opt) continue; diff --git a/drivers/video/fbdev/sis/init.c b/drivers/video/fbdev/sis/init.c index dfe3eb769638..fde27feae5d0 100644 --- a/drivers/video/fbdev/sis/init.c +++ b/drivers/video/fbdev/sis/init.c @@ -2428,6 +2428,11 @@ SiS_SetCRT1FIFO_630(struct SiS_Private *SiS_Pr, unsigned short ModeNo, i = 0; + if (SiS_Pr->ChipType == SIS_730) + queuedata = &FQBQData730[0]; + else + queuedata = &FQBQData[0]; + if(ModeNo > 0x13) { /* Get VCLK */ @@ -2445,12 +2450,6 @@ SiS_SetCRT1FIFO_630(struct SiS_Private *SiS_Pr, unsigned short ModeNo, /* Get half colordepth */ colorth = colortharray[(SiS_Pr->SiS_ModeType - ModeEGA)]; - if(SiS_Pr->ChipType == SIS_730) { - queuedata = &FQBQData730[0]; - } else { - queuedata = &FQBQData[0]; - } - do { templ = SiS_CalcDelay2(SiS_Pr, queuedata[i]) * VCLK * colorth; diff --git a/drivers/video/fbdev/vga16fb.c b/drivers/video/fbdev/vga16fb.c index 4b83109202b1..3c4d20618de4 100644 --- a/drivers/video/fbdev/vga16fb.c +++ b/drivers/video/fbdev/vga16fb.c @@ -243,7 +243,7 @@ static void vga16fb_update_fix(struct fb_info *info) } static void vga16fb_clock_chip(struct vga16fb_par *par, - unsigned int pixclock, + unsigned int *pixclock, const struct fb_info *info, int mul, int div) { @@ -259,14 +259,14 @@ static void vga16fb_clock_chip(struct vga16fb_par *par, { 0 /* bad */, 0x00, 0x00}}; int err; - pixclock = (pixclock * mul) / div; + *pixclock = (*pixclock * mul) / div; best = vgaclocks; - err = pixclock - best->pixclock; + err = *pixclock - best->pixclock; if (err < 0) err = -err; for (ptr = vgaclocks + 1; ptr->pixclock; ptr++) { int tmp; - tmp = pixclock - ptr->pixclock; + tmp = *pixclock - ptr->pixclock; if (tmp < 0) tmp = -tmp; if (tmp < err) { err = tmp; @@ -275,7 +275,7 @@ static void vga16fb_clock_chip(struct vga16fb_par *par, } par->misc |= best->misc; par->clkdiv = best->seq_clock_mode; - pixclock = (best->pixclock * div) / mul; + *pixclock = (best->pixclock * div) / mul; } #define FAIL(X) return -EINVAL @@ -497,10 +497,10 @@ static int vga16fb_check_var(struct fb_var_screeninfo *var, if (mode & MODE_8BPP) /* pixel clock == vga clock / 2 */ - vga16fb_clock_chip(par, var->pixclock, info, 1, 2); + vga16fb_clock_chip(par, &var->pixclock, info, 1, 2); else /* pixel clock == vga clock */ - vga16fb_clock_chip(par, var->pixclock, info, 1, 1); + vga16fb_clock_chip(par, &var->pixclock, info, 1, 1); var->red.offset = var->green.offset = var->blue.offset = var->transp.offset = 0; diff --git a/drivers/virt/fsl_hypervisor.c b/drivers/virt/fsl_hypervisor.c index 1bbd910d4ddb..2a7f7f47fe89 100644 --- a/drivers/virt/fsl_hypervisor.c +++ b/drivers/virt/fsl_hypervisor.c @@ -157,7 +157,7 @@ static long ioctl_memcpy(struct fsl_hv_ioctl_memcpy __user *p) unsigned int i; long ret = 0; - int num_pinned; /* return value from get_user_pages() */ + int num_pinned = 0; /* return value from get_user_pages_fast() */ phys_addr_t remote_paddr; /* The next address in the remote buffer */ uint32_t count; /* The number of bytes left to copy */ @@ -174,7 +174,7 @@ static long ioctl_memcpy(struct fsl_hv_ioctl_memcpy __user *p) return -EINVAL; /* - * The array of pages returned by get_user_pages() covers only + * The array of pages returned by get_user_pages_fast() covers only * page-aligned memory. Since the user buffer is probably not * page-aligned, we need to handle the discrepancy. * @@ -224,7 +224,7 @@ static long ioctl_memcpy(struct fsl_hv_ioctl_memcpy __user *p) /* * 'pages' is an array of struct page pointers that's initialized by - * get_user_pages(). + * get_user_pages_fast(). */ pages = kcalloc(num_pages, sizeof(struct page *), GFP_KERNEL); if (!pages) { @@ -241,7 +241,7 @@ static long ioctl_memcpy(struct fsl_hv_ioctl_memcpy __user *p) if (!sg_list_unaligned) { pr_debug("fsl-hv: could not allocate S/G list\n"); ret = -ENOMEM; - goto exit; + goto free_pages; } sg_list = PTR_ALIGN(sg_list_unaligned, sizeof(struct fh_sg_list)); @@ -250,7 +250,6 @@ static long ioctl_memcpy(struct fsl_hv_ioctl_memcpy __user *p) num_pages, param.source != -1, pages); if (num_pinned != num_pages) { - /* get_user_pages() failed */ pr_debug("fsl-hv: could not lock source buffer\n"); ret = (num_pinned < 0) ? num_pinned : -EFAULT; goto exit; @@ -292,13 +291,13 @@ static long ioctl_memcpy(struct fsl_hv_ioctl_memcpy __user *p) virt_to_phys(sg_list), num_pages); exit: - if (pages) { - for (i = 0; i < num_pages; i++) - if (pages[i]) - put_page(pages[i]); + if (pages && (num_pinned > 0)) { + for (i = 0; i < num_pinned; i++) + put_page(pages[i]); } kfree(sg_list_unaligned); +free_pages: kfree(pages); if (!ret) diff --git a/drivers/w1/masters/mxc_w1.c b/drivers/w1/masters/mxc_w1.c index 50b46c4399ea..075454053e5e 100644 --- a/drivers/w1/masters/mxc_w1.c +++ b/drivers/w1/masters/mxc_w1.c @@ -15,7 +15,7 @@ #include #include #include -#include +#include #include #include #include @@ -48,12 +48,12 @@ struct mxc_w1_device { static u8 mxc_w1_ds2_reset_bus(void *data) { struct mxc_w1_device *dev = data; - unsigned long timeout; + ktime_t timeout; writeb(MXC_W1_CONTROL_RPP, dev->regs + MXC_W1_CONTROL); /* Wait for reset sequence 511+512us, use 1500us for sure */ - timeout = jiffies + usecs_to_jiffies(1500); + timeout = ktime_add_us(ktime_get(), 1500); udelay(511 + 512); @@ -63,7 +63,7 @@ static u8 mxc_w1_ds2_reset_bus(void *data) /* PST bit is valid after the RPP bit is self-cleared */ if (!(ctrl & MXC_W1_CONTROL_RPP)) return !(ctrl & MXC_W1_CONTROL_PST); - } while (time_is_after_jiffies(timeout)); + } while (ktime_before(ktime_get(), timeout)); return 1; } @@ -76,12 +76,12 @@ static u8 mxc_w1_ds2_reset_bus(void *data) static u8 mxc_w1_ds2_touch_bit(void *data, u8 bit) { struct mxc_w1_device *dev = data; - unsigned long timeout; + ktime_t timeout; writeb(MXC_W1_CONTROL_WR(bit), dev->regs + MXC_W1_CONTROL); /* Wait for read/write bit (60us, Max 120us), use 200us for sure */ - timeout = jiffies + usecs_to_jiffies(200); + timeout = ktime_add_us(ktime_get(), 200); udelay(60); @@ -91,7 +91,7 @@ static u8 mxc_w1_ds2_touch_bit(void *data, u8 bit) /* RDST bit is valid after the WR1/RD bit is self-cleared */ if (!(ctrl & MXC_W1_CONTROL_WR(bit))) return !!(ctrl & MXC_W1_CONTROL_RDST); - } while (time_is_after_jiffies(timeout)); + } while (ktime_before(ktime_get(), timeout)); return 0; } diff --git a/drivers/watchdog/rdc321x_wdt.c b/drivers/watchdog/rdc321x_wdt.c index a281aa84bfb1..4c3b4ea4e17f 100644 --- a/drivers/watchdog/rdc321x_wdt.c +++ b/drivers/watchdog/rdc321x_wdt.c @@ -244,6 +244,8 @@ static int rdc321x_wdt_probe(struct platform_device *pdev) rdc321x_wdt_device.sb_pdev = pdata->sb_pdev; rdc321x_wdt_device.base_reg = r->start; + rdc321x_wdt_device.queue = 0; + rdc321x_wdt_device.default_ticks = ticks; err = misc_register(&rdc321x_wdt_misc); if (err < 0) { @@ -258,14 +260,11 @@ static int rdc321x_wdt_probe(struct platform_device *pdev) rdc321x_wdt_device.base_reg, RDC_WDT_RST); init_completion(&rdc321x_wdt_device.stop); - rdc321x_wdt_device.queue = 0; clear_bit(0, &rdc321x_wdt_device.inuse); timer_setup(&rdc321x_wdt_device.timer, rdc321x_wdt_trigger, 0); - rdc321x_wdt_device.default_ticks = ticks; - dev_info(&pdev->dev, "watchdog init success\n"); return 0; diff --git a/drivers/watchdog/sp5100_tco.h b/drivers/watchdog/sp5100_tco.h index 87eaf357ae01..adf015aa4126 100644 --- a/drivers/watchdog/sp5100_tco.h +++ b/drivers/watchdog/sp5100_tco.h @@ -70,7 +70,7 @@ #define EFCH_PM_DECODEEN_WDT_TMREN BIT(7) -#define EFCH_PM_DECODEEN3 0x00 +#define EFCH_PM_DECODEEN3 0x03 #define EFCH_PM_DECODEEN_SECOND_RES GENMASK(1, 0) #define EFCH_PM_WATCHDOG_DISABLE ((u8)GENMASK(3, 2)) diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c index 1c322caecf7f..8fe59b7d8eec 100644 --- a/drivers/watchdog/watchdog_dev.c +++ b/drivers/watchdog/watchdog_dev.c @@ -944,8 +944,10 @@ static int watchdog_cdev_register(struct watchdog_device *wdd) wd_data->wdd = wdd; wdd->wd_data = wd_data; - if (IS_ERR_OR_NULL(watchdog_kworker)) + if (IS_ERR_OR_NULL(watchdog_kworker)) { + kfree(wd_data); return -ENODEV; + } device_initialize(&wd_data->dev); wd_data->dev.devt = MKDEV(MAJOR(watchdog_devt), wdd->id); @@ -971,7 +973,7 @@ static int watchdog_cdev_register(struct watchdog_device *wdd) pr_err("%s: a legacy watchdog module is probably present.\n", wdd->info->identity); old_wd_data = NULL; - kfree(wd_data); + put_device(&wd_data->dev); return err; } } diff --git a/drivers/xen/events/events_2l.c b/drivers/xen/events/events_2l.c index 8edef51c92e5..f026624898e7 100644 --- a/drivers/xen/events/events_2l.c +++ b/drivers/xen/events/events_2l.c @@ -91,6 +91,8 @@ static void evtchn_2l_unmask(unsigned port) BUG_ON(!irqs_disabled()); + smp_wmb(); /* All writes before unmask must be visible. */ + if (unlikely((cpu != cpu_from_evtchn(port)))) do_hypercall = 1; else { @@ -159,7 +161,7 @@ static inline xen_ulong_t active_evtchns(unsigned int cpu, * a bitset of words which contain pending event bits. The second * level is a bitset of pending events themselves. */ -static void evtchn_2l_handle_events(unsigned cpu) +static void evtchn_2l_handle_events(unsigned cpu, struct evtchn_loop_ctrl *ctrl) { int irq; xen_ulong_t pending_words; @@ -240,10 +242,7 @@ static void evtchn_2l_handle_events(unsigned cpu) /* Process port. */ port = (word_idx * BITS_PER_EVTCHN_WORD) + bit_idx; - irq = get_evtchn_to_irq(port); - - if (irq != -1) - generic_handle_irq(irq); + handle_irq_for_port(port, ctrl); bit_idx = (bit_idx + 1) % BITS_PER_EVTCHN_WORD; diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index 95e5a9300ff0..aca845675279 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -32,6 +32,10 @@ #include #include #include +#include +#include +#include +#include #ifdef CONFIG_X86 #include @@ -61,6 +65,15 @@ #include "events_internal.h" +#undef MODULE_PARAM_PREFIX +#define MODULE_PARAM_PREFIX "xen." + +static uint __read_mostly event_loop_timeout = 2; +module_param(event_loop_timeout, uint, 0644); + +static uint __read_mostly event_eoi_delay = 10; +module_param(event_eoi_delay, uint, 0644); + const struct evtchn_ops *evtchn_ops; /* @@ -69,6 +82,24 @@ const struct evtchn_ops *evtchn_ops; */ static DEFINE_MUTEX(irq_mapping_update_lock); +/* + * Lock protecting event handling loop against removing event channels. + * Adding of event channels is no issue as the associated IRQ becomes active + * only after everything is setup (before request_[threaded_]irq() the handler + * can't be entered for an event, as the event channel will be unmasked only + * then). + */ +static DEFINE_RWLOCK(evtchn_rwlock); + +/* + * Lock hierarchy: + * + * irq_mapping_update_lock + * evtchn_rwlock + * IRQ-desc lock + * percpu eoi_list_lock + */ + static LIST_HEAD(xen_irq_list_head); /* IRQ <-> VIRQ mapping. */ @@ -90,18 +121,23 @@ static bool (*pirq_needs_eoi)(unsigned irq); /* Xen will never allocate port zero for any purpose. */ #define VALID_EVTCHN(chn) ((chn) != 0) +static struct irq_info *legacy_info_ptrs[NR_IRQS_LEGACY]; + static struct irq_chip xen_dynamic_chip; +static struct irq_chip xen_lateeoi_chip; static struct irq_chip xen_percpu_chip; static struct irq_chip xen_pirq_chip; static void enable_dynirq(struct irq_data *data); static void disable_dynirq(struct irq_data *data); +static DEFINE_PER_CPU(unsigned int, irq_epoch); + static void clear_evtchn_to_irq_row(unsigned row) { unsigned col; for (col = 0; col < EVTCHN_PER_ROW; col++) - evtchn_to_irq[row][col] = -1; + WRITE_ONCE(evtchn_to_irq[row][col], -1); } static void clear_evtchn_to_irq_all(void) @@ -138,7 +174,7 @@ static int set_evtchn_to_irq(unsigned evtchn, unsigned irq) clear_evtchn_to_irq_row(row); } - evtchn_to_irq[row][col] = irq; + WRITE_ONCE(evtchn_to_irq[row][col], irq); return 0; } @@ -148,13 +184,24 @@ int get_evtchn_to_irq(unsigned evtchn) return -1; if (evtchn_to_irq[EVTCHN_ROW(evtchn)] == NULL) return -1; - return evtchn_to_irq[EVTCHN_ROW(evtchn)][EVTCHN_COL(evtchn)]; + return READ_ONCE(evtchn_to_irq[EVTCHN_ROW(evtchn)][EVTCHN_COL(evtchn)]); } /* Get info for IRQ */ struct irq_info *info_for_irq(unsigned irq) { - return irq_get_chip_data(irq); + if (irq < nr_legacy_irqs()) + return legacy_info_ptrs[irq]; + else + return irq_get_chip_data(irq); +} + +static void set_info_for_irq(unsigned int irq, struct irq_info *info) +{ + if (irq < nr_legacy_irqs()) + legacy_info_ptrs[irq] = info; + else + irq_set_chip_data(irq, info); } /* Constructors for packed IRQ information. */ @@ -246,10 +293,14 @@ static void xen_irq_info_cleanup(struct irq_info *info) */ unsigned int evtchn_from_irq(unsigned irq) { - if (unlikely(WARN(irq >= nr_irqs, "Invalid irq %d!\n", irq))) + const struct irq_info *info = NULL; + + if (likely(irq < nr_irqs)) + info = info_for_irq(irq); + if (!info) return 0; - return info_for_irq(irq)->evtchn; + return info->evtchn; } unsigned irq_from_evtchn(unsigned int evtchn) @@ -360,9 +411,157 @@ void notify_remote_via_irq(int irq) } EXPORT_SYMBOL_GPL(notify_remote_via_irq); +struct lateeoi_work { + struct delayed_work delayed; + spinlock_t eoi_list_lock; + struct list_head eoi_list; +}; + +static DEFINE_PER_CPU(struct lateeoi_work, lateeoi); + +static void lateeoi_list_del(struct irq_info *info) +{ + struct lateeoi_work *eoi = &per_cpu(lateeoi, info->eoi_cpu); + unsigned long flags; + + spin_lock_irqsave(&eoi->eoi_list_lock, flags); + list_del_init(&info->eoi_list); + spin_unlock_irqrestore(&eoi->eoi_list_lock, flags); +} + +static void lateeoi_list_add(struct irq_info *info) +{ + struct lateeoi_work *eoi = &per_cpu(lateeoi, info->eoi_cpu); + struct irq_info *elem; + u64 now = get_jiffies_64(); + unsigned long delay; + unsigned long flags; + + if (now < info->eoi_time) + delay = info->eoi_time - now; + else + delay = 1; + + spin_lock_irqsave(&eoi->eoi_list_lock, flags); + + if (list_empty(&eoi->eoi_list)) { + list_add(&info->eoi_list, &eoi->eoi_list); + mod_delayed_work_on(info->eoi_cpu, system_wq, + &eoi->delayed, delay); + } else { + list_for_each_entry_reverse(elem, &eoi->eoi_list, eoi_list) { + if (elem->eoi_time <= info->eoi_time) + break; + } + list_add(&info->eoi_list, &elem->eoi_list); + } + + spin_unlock_irqrestore(&eoi->eoi_list_lock, flags); +} + +static void xen_irq_lateeoi_locked(struct irq_info *info, bool spurious) +{ + evtchn_port_t evtchn; + unsigned int cpu; + unsigned int delay = 0; + + evtchn = info->evtchn; + if (!VALID_EVTCHN(evtchn) || !list_empty(&info->eoi_list)) + return; + + if (spurious) { + if ((1 << info->spurious_cnt) < (HZ << 2)) + info->spurious_cnt++; + if (info->spurious_cnt > 1) { + delay = 1 << (info->spurious_cnt - 2); + if (delay > HZ) + delay = HZ; + if (!info->eoi_time) + info->eoi_cpu = smp_processor_id(); + info->eoi_time = get_jiffies_64() + delay; + } + } else { + info->spurious_cnt = 0; + } + + cpu = info->eoi_cpu; + if (info->eoi_time && + (info->irq_epoch == per_cpu(irq_epoch, cpu) || delay)) { + lateeoi_list_add(info); + return; + } + + info->eoi_time = 0; + unmask_evtchn(evtchn); +} + +static void xen_irq_lateeoi_worker(struct work_struct *work) +{ + struct lateeoi_work *eoi; + struct irq_info *info; + u64 now = get_jiffies_64(); + unsigned long flags; + + eoi = container_of(to_delayed_work(work), struct lateeoi_work, delayed); + + read_lock_irqsave(&evtchn_rwlock, flags); + + while (true) { + spin_lock(&eoi->eoi_list_lock); + + info = list_first_entry_or_null(&eoi->eoi_list, struct irq_info, + eoi_list); + + if (info == NULL || now < info->eoi_time) { + spin_unlock(&eoi->eoi_list_lock); + break; + } + + list_del_init(&info->eoi_list); + + spin_unlock(&eoi->eoi_list_lock); + + info->eoi_time = 0; + + xen_irq_lateeoi_locked(info, false); + } + + if (info) + mod_delayed_work_on(info->eoi_cpu, system_wq, + &eoi->delayed, info->eoi_time - now); + + read_unlock_irqrestore(&evtchn_rwlock, flags); +} + +static void xen_cpu_init_eoi(unsigned int cpu) +{ + struct lateeoi_work *eoi = &per_cpu(lateeoi, cpu); + + INIT_DELAYED_WORK(&eoi->delayed, xen_irq_lateeoi_worker); + spin_lock_init(&eoi->eoi_list_lock); + INIT_LIST_HEAD(&eoi->eoi_list); +} + +void xen_irq_lateeoi(unsigned int irq, unsigned int eoi_flags) +{ + struct irq_info *info; + unsigned long flags; + + read_lock_irqsave(&evtchn_rwlock, flags); + + info = info_for_irq(irq); + + if (info) + xen_irq_lateeoi_locked(info, eoi_flags & XEN_EOI_FLAG_SPURIOUS); + + read_unlock_irqrestore(&evtchn_rwlock, flags); +} +EXPORT_SYMBOL_GPL(xen_irq_lateeoi); + static void xen_irq_init(unsigned irq) { struct irq_info *info; + #ifdef CONFIG_SMP /* By default all event channels notify CPU#0. */ cpumask_copy(irq_get_affinity_mask(irq), cpumask_of(0)); @@ -375,8 +574,9 @@ static void xen_irq_init(unsigned irq) info->type = IRQT_UNBOUND; info->refcnt = -1; - irq_set_chip_data(irq, info); + set_info_for_irq(irq, info); + INIT_LIST_HEAD(&info->eoi_list); list_add_tail(&info->list, &xen_irq_list_head); } @@ -424,17 +624,25 @@ static int __must_check xen_allocate_irq_gsi(unsigned gsi) static void xen_free_irq(unsigned irq) { - struct irq_info *info = irq_get_chip_data(irq); + struct irq_info *info = info_for_irq(irq); + unsigned long flags; if (WARN_ON(!info)) return; + write_lock_irqsave(&evtchn_rwlock, flags); + + if (!list_empty(&info->eoi_list)) + lateeoi_list_del(info); + list_del(&info->list); - irq_set_chip_data(irq, NULL); + set_info_for_irq(irq, NULL); WARN_ON(info->refcnt > 0); + write_unlock_irqrestore(&evtchn_rwlock, flags); + kfree(info); /* Legacy IRQ descriptors are managed by the arch. */ @@ -601,7 +809,7 @@ EXPORT_SYMBOL_GPL(xen_irq_from_gsi); static void __unbind_from_irq(unsigned int irq) { int evtchn = evtchn_from_irq(irq); - struct irq_info *info = irq_get_chip_data(irq); + struct irq_info *info = info_for_irq(irq); if (info->refcnt > 0) { info->refcnt--; @@ -826,7 +1034,7 @@ int xen_pirq_from_irq(unsigned irq) } EXPORT_SYMBOL_GPL(xen_pirq_from_irq); -int bind_evtchn_to_irq(unsigned int evtchn) +static int bind_evtchn_to_irq_chip(evtchn_port_t evtchn, struct irq_chip *chip) { int irq; int ret; @@ -843,7 +1051,7 @@ int bind_evtchn_to_irq(unsigned int evtchn) if (irq < 0) goto out; - irq_set_chip_and_handler_name(irq, &xen_dynamic_chip, + irq_set_chip_and_handler_name(irq, chip, handle_edge_irq, "event"); ret = xen_irq_info_evtchn_setup(irq, evtchn); @@ -864,8 +1072,19 @@ out: return irq; } + +int bind_evtchn_to_irq(evtchn_port_t evtchn) +{ + return bind_evtchn_to_irq_chip(evtchn, &xen_dynamic_chip); +} EXPORT_SYMBOL_GPL(bind_evtchn_to_irq); +int bind_evtchn_to_irq_lateeoi(evtchn_port_t evtchn) +{ + return bind_evtchn_to_irq_chip(evtchn, &xen_lateeoi_chip); +} +EXPORT_SYMBOL_GPL(bind_evtchn_to_irq_lateeoi); + static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu) { struct evtchn_bind_ipi bind_ipi; @@ -907,8 +1126,9 @@ static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu) return irq; } -int bind_interdomain_evtchn_to_irq(unsigned int remote_domain, - unsigned int remote_port) +static int bind_interdomain_evtchn_to_irq_chip(unsigned int remote_domain, + evtchn_port_t remote_port, + struct irq_chip *chip) { struct evtchn_bind_interdomain bind_interdomain; int err; @@ -919,10 +1139,26 @@ int bind_interdomain_evtchn_to_irq(unsigned int remote_domain, err = HYPERVISOR_event_channel_op(EVTCHNOP_bind_interdomain, &bind_interdomain); - return err ? : bind_evtchn_to_irq(bind_interdomain.local_port); + return err ? : bind_evtchn_to_irq_chip(bind_interdomain.local_port, + chip); +} + +int bind_interdomain_evtchn_to_irq(unsigned int remote_domain, + evtchn_port_t remote_port) +{ + return bind_interdomain_evtchn_to_irq_chip(remote_domain, remote_port, + &xen_dynamic_chip); } EXPORT_SYMBOL_GPL(bind_interdomain_evtchn_to_irq); +int bind_interdomain_evtchn_to_irq_lateeoi(unsigned int remote_domain, + evtchn_port_t remote_port) +{ + return bind_interdomain_evtchn_to_irq_chip(remote_domain, remote_port, + &xen_lateeoi_chip); +} +EXPORT_SYMBOL_GPL(bind_interdomain_evtchn_to_irq_lateeoi); + static int find_virq(unsigned int virq, unsigned int cpu) { struct evtchn_status status; @@ -1018,14 +1254,15 @@ static void unbind_from_irq(unsigned int irq) mutex_unlock(&irq_mapping_update_lock); } -int bind_evtchn_to_irqhandler(unsigned int evtchn, - irq_handler_t handler, - unsigned long irqflags, - const char *devname, void *dev_id) +static int bind_evtchn_to_irqhandler_chip(evtchn_port_t evtchn, + irq_handler_t handler, + unsigned long irqflags, + const char *devname, void *dev_id, + struct irq_chip *chip) { int irq, retval; - irq = bind_evtchn_to_irq(evtchn); + irq = bind_evtchn_to_irq_chip(evtchn, chip); if (irq < 0) return irq; retval = request_irq(irq, handler, irqflags, devname, dev_id); @@ -1036,31 +1273,76 @@ int bind_evtchn_to_irqhandler(unsigned int evtchn, return irq; } + +int bind_evtchn_to_irqhandler(evtchn_port_t evtchn, + irq_handler_t handler, + unsigned long irqflags, + const char *devname, void *dev_id) +{ + return bind_evtchn_to_irqhandler_chip(evtchn, handler, irqflags, + devname, dev_id, + &xen_dynamic_chip); +} EXPORT_SYMBOL_GPL(bind_evtchn_to_irqhandler); +int bind_evtchn_to_irqhandler_lateeoi(evtchn_port_t evtchn, + irq_handler_t handler, + unsigned long irqflags, + const char *devname, void *dev_id) +{ + return bind_evtchn_to_irqhandler_chip(evtchn, handler, irqflags, + devname, dev_id, + &xen_lateeoi_chip); +} +EXPORT_SYMBOL_GPL(bind_evtchn_to_irqhandler_lateeoi); + +static int bind_interdomain_evtchn_to_irqhandler_chip( + unsigned int remote_domain, evtchn_port_t remote_port, + irq_handler_t handler, unsigned long irqflags, + const char *devname, void *dev_id, struct irq_chip *chip) +{ + int irq, retval; + + irq = bind_interdomain_evtchn_to_irq_chip(remote_domain, remote_port, + chip); + if (irq < 0) + return irq; + + retval = request_irq(irq, handler, irqflags, devname, dev_id); + if (retval != 0) { + unbind_from_irq(irq); + return retval; + } + + return irq; +} + int bind_interdomain_evtchn_to_irqhandler(unsigned int remote_domain, - unsigned int remote_port, + evtchn_port_t remote_port, irq_handler_t handler, unsigned long irqflags, const char *devname, void *dev_id) { - int irq, retval; - - irq = bind_interdomain_evtchn_to_irq(remote_domain, remote_port); - if (irq < 0) - return irq; - - retval = request_irq(irq, handler, irqflags, devname, dev_id); - if (retval != 0) { - unbind_from_irq(irq); - return retval; - } - - return irq; + return bind_interdomain_evtchn_to_irqhandler_chip(remote_domain, + remote_port, handler, irqflags, devname, + dev_id, &xen_dynamic_chip); } EXPORT_SYMBOL_GPL(bind_interdomain_evtchn_to_irqhandler); +int bind_interdomain_evtchn_to_irqhandler_lateeoi(unsigned int remote_domain, + evtchn_port_t remote_port, + irq_handler_t handler, + unsigned long irqflags, + const char *devname, + void *dev_id) +{ + return bind_interdomain_evtchn_to_irqhandler_chip(remote_domain, + remote_port, handler, irqflags, devname, + dev_id, &xen_lateeoi_chip); +} +EXPORT_SYMBOL_GPL(bind_interdomain_evtchn_to_irqhandler_lateeoi); + int bind_virq_to_irqhandler(unsigned int virq, unsigned int cpu, irq_handler_t handler, unsigned long irqflags, const char *devname, void *dev_id) @@ -1105,7 +1387,7 @@ int bind_ipi_to_irqhandler(enum ipi_vector ipi, void unbind_from_irqhandler(unsigned int irq, void *dev_id) { - struct irq_info *info = irq_get_chip_data(irq); + struct irq_info *info = info_for_irq(irq); if (WARN_ON(!info)) return; @@ -1139,7 +1421,7 @@ int evtchn_make_refcounted(unsigned int evtchn) if (irq == -1) return -ENOENT; - info = irq_get_chip_data(irq); + info = info_for_irq(irq); if (!info) return -ENOENT; @@ -1167,13 +1449,13 @@ int evtchn_get(unsigned int evtchn) if (irq == -1) goto done; - info = irq_get_chip_data(irq); + info = info_for_irq(irq); if (!info) goto done; err = -EINVAL; - if (info->refcnt <= 0) + if (info->refcnt <= 0 || info->refcnt == SHRT_MAX) goto done; info->refcnt++; @@ -1212,6 +1494,54 @@ void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector) notify_remote_via_irq(irq); } +struct evtchn_loop_ctrl { + ktime_t timeout; + unsigned count; + bool defer_eoi; +}; + +void handle_irq_for_port(evtchn_port_t port, struct evtchn_loop_ctrl *ctrl) +{ + int irq; + struct irq_info *info; + + irq = get_evtchn_to_irq(port); + if (irq == -1) + return; + + /* + * Check for timeout every 256 events. + * We are setting the timeout value only after the first 256 + * events in order to not hurt the common case of few loop + * iterations. The 256 is basically an arbitrary value. + * + * In case we are hitting the timeout we need to defer all further + * EOIs in order to ensure to leave the event handling loop rather + * sooner than later. + */ + if (!ctrl->defer_eoi && !(++ctrl->count & 0xff)) { + ktime_t kt = ktime_get(); + + if (!ctrl->timeout) { + kt = ktime_add_ms(kt, + jiffies_to_msecs(event_loop_timeout)); + ctrl->timeout = kt; + } else if (kt > ctrl->timeout) { + ctrl->defer_eoi = true; + } + } + + info = info_for_irq(irq); + + if (ctrl->defer_eoi) { + info->eoi_cpu = smp_processor_id(); + info->irq_epoch = __this_cpu_read(irq_epoch); + info->eoi_time = get_jiffies_64() + event_eoi_delay; + } + + generic_handle_irq(irq); +} + static DEFINE_PER_CPU(unsigned, xed_nesting_count); static void __xen_evtchn_do_upcall(void) @@ -1219,6 +1549,9 @@ static void __xen_evtchn_do_upcall(void) struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu); int cpu = get_cpu(); unsigned count; + struct evtchn_loop_ctrl ctrl = { 0 }; + + read_lock(&evtchn_rwlock); do { vcpu_info->evtchn_upcall_pending = 0; @@ -1226,7 +1559,7 @@ static void __xen_evtchn_do_upcall(void) if (__this_cpu_inc_return(xed_nesting_count) - 1) goto out; - xen_evtchn_handle_events(cpu); + xen_evtchn_handle_events(cpu, &ctrl); BUG_ON(!irqs_disabled()); @@ -1235,6 +1568,14 @@ static void __xen_evtchn_do_upcall(void) } while (count != 1 || vcpu_info->evtchn_upcall_pending); out: + read_unlock(&evtchn_rwlock); + + /* + * Increment irq_epoch only now to defer EOIs only for + * xen_irq_lateeoi() invocations occurring from inside the loop + * above. + */ + __this_cpu_inc(irq_epoch); put_cpu(); } @@ -1601,6 +1942,21 @@ static struct irq_chip xen_dynamic_chip __read_mostly = { .irq_retrigger = retrigger_dynirq, }; +static struct irq_chip xen_lateeoi_chip __read_mostly = { + /* The chip name needs to contain "xen-dyn" for irqbalance to work. */ + .name = "xen-dyn-lateeoi", + + .irq_disable = disable_dynirq, + .irq_mask = disable_dynirq, + .irq_unmask = enable_dynirq, + + .irq_ack = mask_ack_dynirq, + .irq_mask_ack = mask_ack_dynirq, + + .irq_set_affinity = set_affinity_irq, + .irq_retrigger = retrigger_dynirq, +}; + static struct irq_chip xen_pirq_chip __read_mostly = { .name = "xen-pirq", @@ -1667,12 +2023,31 @@ void xen_callback_vector(void) void xen_callback_vector(void) {} #endif -#undef MODULE_PARAM_PREFIX -#define MODULE_PARAM_PREFIX "xen." - static bool fifo_events = true; module_param(fifo_events, bool, 0); +static int xen_evtchn_cpu_prepare(unsigned int cpu) +{ + int ret = 0; + + xen_cpu_init_eoi(cpu); + + if (evtchn_ops->percpu_init) + ret = evtchn_ops->percpu_init(cpu); + + return ret; +} + +static int xen_evtchn_cpu_dead(unsigned int cpu) +{ + int ret = 0; + + if (evtchn_ops->percpu_deinit) + ret = evtchn_ops->percpu_deinit(cpu); + + return ret; +} + void __init xen_init_IRQ(void) { int ret = -EINVAL; @@ -1683,6 +2058,12 @@ void __init xen_init_IRQ(void) if (ret < 0) xen_evtchn_2l_init(); + xen_cpu_init_eoi(smp_processor_id()); + + cpuhp_setup_state_nocalls(CPUHP_XEN_EVTCHN_PREPARE, + "xen/evtchn:prepare", + xen_evtchn_cpu_prepare, xen_evtchn_cpu_dead); + evtchn_to_irq = kcalloc(EVTCHN_ROW(xen_evtchn_max_channels()), sizeof(*evtchn_to_irq), GFP_KERNEL); BUG_ON(!evtchn_to_irq); diff --git a/drivers/xen/events/events_fifo.c b/drivers/xen/events/events_fifo.c index 76b318e88382..33462521bfd0 100644 --- a/drivers/xen/events/events_fifo.c +++ b/drivers/xen/events/events_fifo.c @@ -227,19 +227,25 @@ static bool evtchn_fifo_is_masked(unsigned port) return sync_test_bit(EVTCHN_FIFO_BIT(MASKED, word), BM(word)); } /* - * Clear MASKED, spinning if BUSY is set. + * Clear MASKED if not PENDING, spinning if BUSY is set. + * Return true if mask was cleared. */ -static void clear_masked(volatile event_word_t *word) +static bool clear_masked_cond(volatile event_word_t *word) { event_word_t new, old, w; w = *word; do { + if (w & (1 << EVTCHN_FIFO_PENDING)) + return false; + old = w & ~(1 << EVTCHN_FIFO_BUSY); new = old & ~(1 << EVTCHN_FIFO_MASKED); w = sync_cmpxchg(word, old, new); } while (w != old); + + return true; } static void evtchn_fifo_unmask(unsigned port) @@ -248,8 +254,7 @@ static void evtchn_fifo_unmask(unsigned port) BUG_ON(!irqs_disabled()); - clear_masked(word); - if (evtchn_fifo_is_pending(port)) { + if (!clear_masked_cond(word)) { struct evtchn_unmask unmask = { .port = port }; (void)HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask); } @@ -270,19 +275,9 @@ static uint32_t clear_linked(volatile event_word_t *word) return w & EVTCHN_FIFO_LINK_MASK; } -static void handle_irq_for_port(unsigned port) -{ - int irq; - - irq = get_evtchn_to_irq(port); - if (irq != -1) - generic_handle_irq(irq); -} - -static void consume_one_event(unsigned cpu, +static void consume_one_event(unsigned cpu, struct evtchn_loop_ctrl *ctrl, struct evtchn_fifo_control_block *control_block, - unsigned priority, unsigned long *ready, - bool drop) + unsigned priority, unsigned long *ready) { struct evtchn_fifo_queue *q = &per_cpu(cpu_queue, cpu); uint32_t head; @@ -315,16 +310,17 @@ static void consume_one_event(unsigned cpu, clear_bit(priority, ready); if (evtchn_fifo_is_pending(port) && !evtchn_fifo_is_masked(port)) { - if (unlikely(drop)) + if (unlikely(!ctrl)) pr_warn("Dropping pending event for port %u\n", port); else - handle_irq_for_port(port); + handle_irq_for_port(port, ctrl); } q->head[priority] = head; } -static void __evtchn_fifo_handle_events(unsigned cpu, bool drop) +static void __evtchn_fifo_handle_events(unsigned cpu, + struct evtchn_loop_ctrl *ctrl) { struct evtchn_fifo_control_block *control_block; unsigned long ready; @@ -336,14 +332,15 @@ static void __evtchn_fifo_handle_events(unsigned cpu, bool drop) while (ready) { q = find_first_bit(&ready, EVTCHN_FIFO_MAX_QUEUES); - consume_one_event(cpu, control_block, q, &ready, drop); + consume_one_event(cpu, ctrl, control_block, q, &ready); ready |= xchg(&control_block->ready, 0); } } -static void evtchn_fifo_handle_events(unsigned cpu) +static void evtchn_fifo_handle_events(unsigned cpu, + struct evtchn_loop_ctrl *ctrl) { - __evtchn_fifo_handle_events(cpu, false); + __evtchn_fifo_handle_events(cpu, ctrl); } static void evtchn_fifo_resume(void) @@ -380,21 +377,6 @@ static void evtchn_fifo_resume(void) event_array_pages = 0; } -static const struct evtchn_ops evtchn_ops_fifo = { - .max_channels = evtchn_fifo_max_channels, - .nr_channels = evtchn_fifo_nr_channels, - .setup = evtchn_fifo_setup, - .bind_to_cpu = evtchn_fifo_bind_to_cpu, - .clear_pending = evtchn_fifo_clear_pending, - .set_pending = evtchn_fifo_set_pending, - .is_pending = evtchn_fifo_is_pending, - .test_and_set_mask = evtchn_fifo_test_and_set_mask, - .mask = evtchn_fifo_mask, - .unmask = evtchn_fifo_unmask, - .handle_events = evtchn_fifo_handle_events, - .resume = evtchn_fifo_resume, -}; - static int evtchn_fifo_alloc_control_block(unsigned cpu) { void *control_block = NULL; @@ -417,19 +399,36 @@ static int evtchn_fifo_alloc_control_block(unsigned cpu) return ret; } -static int xen_evtchn_cpu_prepare(unsigned int cpu) +static int evtchn_fifo_percpu_init(unsigned int cpu) { if (!per_cpu(cpu_control_block, cpu)) return evtchn_fifo_alloc_control_block(cpu); return 0; } -static int xen_evtchn_cpu_dead(unsigned int cpu) +static int evtchn_fifo_percpu_deinit(unsigned int cpu) { - __evtchn_fifo_handle_events(cpu, true); + __evtchn_fifo_handle_events(cpu, NULL); return 0; } +static const struct evtchn_ops evtchn_ops_fifo = { + .max_channels = evtchn_fifo_max_channels, + .nr_channels = evtchn_fifo_nr_channels, + .setup = evtchn_fifo_setup, + .bind_to_cpu = evtchn_fifo_bind_to_cpu, + .clear_pending = evtchn_fifo_clear_pending, + .set_pending = evtchn_fifo_set_pending, + .is_pending = evtchn_fifo_is_pending, + .test_and_set_mask = evtchn_fifo_test_and_set_mask, + .mask = evtchn_fifo_mask, + .unmask = evtchn_fifo_unmask, + .handle_events = evtchn_fifo_handle_events, + .resume = evtchn_fifo_resume, + .percpu_init = evtchn_fifo_percpu_init, + .percpu_deinit = evtchn_fifo_percpu_deinit, +}; + int __init xen_evtchn_fifo_init(void) { int cpu = smp_processor_id(); @@ -443,9 +442,5 @@ int __init xen_evtchn_fifo_init(void) evtchn_ops = &evtchn_ops_fifo; - cpuhp_setup_state_nocalls(CPUHP_XEN_EVTCHN_PREPARE, - "xen/evtchn:prepare", - xen_evtchn_cpu_prepare, xen_evtchn_cpu_dead); - return ret; } diff --git a/drivers/xen/events/events_internal.h b/drivers/xen/events/events_internal.h index 50c2050a1e32..b9b4f5919893 100644 --- a/drivers/xen/events/events_internal.h +++ b/drivers/xen/events/events_internal.h @@ -32,11 +32,16 @@ enum xen_irq_type { */ struct irq_info { struct list_head list; - int refcnt; + struct list_head eoi_list; + short refcnt; + short spurious_cnt; enum xen_irq_type type; /* type */ unsigned irq; unsigned int evtchn; /* event channel */ unsigned short cpu; /* cpu bound */ + unsigned short eoi_cpu; /* EOI must happen on this cpu */ + unsigned int irq_epoch; /* If eoi_cpu valid: irq_epoch of event */ + u64 eoi_time; /* Time in jiffies when to EOI. */ union { unsigned short virq; @@ -55,6 +60,8 @@ struct irq_info { #define PIRQ_SHAREABLE (1 << 1) #define PIRQ_MSI_GROUP (1 << 2) +struct evtchn_loop_ctrl; + struct evtchn_ops { unsigned (*max_channels)(void); unsigned (*nr_channels)(void); @@ -69,14 +76,18 @@ struct evtchn_ops { void (*mask)(unsigned port); void (*unmask)(unsigned port); - void (*handle_events)(unsigned cpu); + void (*handle_events)(unsigned cpu, struct evtchn_loop_ctrl *ctrl); void (*resume)(void); + + int (*percpu_init)(unsigned int cpu); + int (*percpu_deinit)(unsigned int cpu); }; extern const struct evtchn_ops *evtchn_ops; extern int **evtchn_to_irq; int get_evtchn_to_irq(unsigned int evtchn); +void handle_irq_for_port(evtchn_port_t port, struct evtchn_loop_ctrl *ctrl); struct irq_info *info_for_irq(unsigned irq); unsigned cpu_from_irq(unsigned irq); @@ -134,9 +145,10 @@ static inline void unmask_evtchn(unsigned port) return evtchn_ops->unmask(port); } -static inline void xen_evtchn_handle_events(unsigned cpu) +static inline void xen_evtchn_handle_events(unsigned cpu, + struct evtchn_loop_ctrl *ctrl) { - return evtchn_ops->handle_events(cpu); + return evtchn_ops->handle_events(cpu, ctrl); } static inline void xen_evtchn_resume(void) diff --git a/drivers/xen/evtchn.c b/drivers/xen/evtchn.c index 47c70b826a6a..4b11e60e37a3 100644 --- a/drivers/xen/evtchn.c +++ b/drivers/xen/evtchn.c @@ -166,7 +166,6 @@ static irqreturn_t evtchn_interrupt(int irq, void *data) "Interrupt for port %d, but apparently not enabled; per-user %p\n", evtchn->port, u); - disable_irq_nosync(irq); evtchn->enabled = false; spin_lock(&u->ring_prod_lock); @@ -292,7 +291,7 @@ static ssize_t evtchn_write(struct file *file, const char __user *buf, evtchn = find_evtchn(u, port); if (evtchn && !evtchn->enabled) { evtchn->enabled = true; - enable_irq(irq_from_evtchn(port)); + xen_irq_lateeoi(irq_from_evtchn(port), 0); } } @@ -392,8 +391,8 @@ static int evtchn_bind_to_user(struct per_user_data *u, int port) if (rc < 0) goto err; - rc = bind_evtchn_to_irqhandler(port, evtchn_interrupt, 0, - u->name, evtchn); + rc = bind_evtchn_to_irqhandler_lateeoi(port, evtchn_interrupt, 0, + u->name, evtchn); if (rc < 0) goto err; diff --git a/drivers/xen/pvcalls-back.c b/drivers/xen/pvcalls-back.c index 398fd8b1639d..f94bb6034a5a 100644 --- a/drivers/xen/pvcalls-back.c +++ b/drivers/xen/pvcalls-back.c @@ -75,6 +75,7 @@ struct sock_mapping { atomic_t write; atomic_t io; atomic_t release; + atomic_t eoi; void (*saved_data_ready)(struct sock *sk); struct pvcalls_ioworker ioworker; }; @@ -96,7 +97,7 @@ static int pvcalls_back_release_active(struct xenbus_device *dev, struct pvcalls_fedata *fedata, struct sock_mapping *map); -static void pvcalls_conn_back_read(void *opaque) +static bool pvcalls_conn_back_read(void *opaque) { struct sock_mapping *map = (struct sock_mapping *)opaque; struct msghdr msg; @@ -116,17 +117,17 @@ static void pvcalls_conn_back_read(void *opaque) virt_mb(); if (error) - return; + return false; size = pvcalls_queued(prod, cons, array_size); if (size >= array_size) - return; + return false; spin_lock_irqsave(&map->sock->sk->sk_receive_queue.lock, flags); if (skb_queue_empty(&map->sock->sk->sk_receive_queue)) { atomic_set(&map->read, 0); spin_unlock_irqrestore(&map->sock->sk->sk_receive_queue.lock, flags); - return; + return true; } spin_unlock_irqrestore(&map->sock->sk->sk_receive_queue.lock, flags); wanted = array_size - size; @@ -150,7 +151,7 @@ static void pvcalls_conn_back_read(void *opaque) ret = inet_recvmsg(map->sock, &msg, wanted, MSG_DONTWAIT); WARN_ON(ret > wanted); if (ret == -EAGAIN) /* shouldn't happen */ - return; + return true; if (!ret) ret = -ENOTCONN; spin_lock_irqsave(&map->sock->sk->sk_receive_queue.lock, flags); @@ -169,10 +170,10 @@ static void pvcalls_conn_back_read(void *opaque) virt_wmb(); notify_remote_via_irq(map->irq); - return; + return true; } -static void pvcalls_conn_back_write(struct sock_mapping *map) +static bool pvcalls_conn_back_write(struct sock_mapping *map) { struct pvcalls_data_intf *intf = map->ring; struct pvcalls_data *data = &map->data; @@ -189,7 +190,7 @@ static void pvcalls_conn_back_write(struct sock_mapping *map) array_size = XEN_FLEX_RING_SIZE(map->ring_order); size = pvcalls_queued(prod, cons, array_size); if (size == 0) - return; + return false; memset(&msg, 0, sizeof(msg)); msg.msg_flags |= MSG_DONTWAIT; @@ -207,12 +208,11 @@ static void pvcalls_conn_back_write(struct sock_mapping *map) atomic_set(&map->write, 0); ret = inet_sendmsg(map->sock, &msg, size); - if (ret == -EAGAIN || (ret >= 0 && ret < size)) { + if (ret == -EAGAIN) { atomic_inc(&map->write); atomic_inc(&map->io); + return true; } - if (ret == -EAGAIN) - return; /* write the data, then update the indexes */ virt_wmb(); @@ -225,9 +225,13 @@ static void pvcalls_conn_back_write(struct sock_mapping *map) } /* update the indexes, then notify the other end */ virt_wmb(); - if (prod != cons + ret) + if (prod != cons + ret) { atomic_inc(&map->write); + atomic_inc(&map->io); + } notify_remote_via_irq(map->irq); + + return true; } static void pvcalls_back_ioworker(struct work_struct *work) @@ -236,6 +240,7 @@ static void pvcalls_back_ioworker(struct work_struct *work) struct pvcalls_ioworker, register_work); struct sock_mapping *map = container_of(ioworker, struct sock_mapping, ioworker); + unsigned int eoi_flags = XEN_EOI_FLAG_SPURIOUS; while (atomic_read(&map->io) > 0) { if (atomic_read(&map->release) > 0) { @@ -243,10 +248,18 @@ static void pvcalls_back_ioworker(struct work_struct *work) return; } - if (atomic_read(&map->read) > 0) - pvcalls_conn_back_read(map); - if (atomic_read(&map->write) > 0) - pvcalls_conn_back_write(map); + if (atomic_read(&map->read) > 0 && + pvcalls_conn_back_read(map)) + eoi_flags = 0; + if (atomic_read(&map->write) > 0 && + pvcalls_conn_back_write(map)) + eoi_flags = 0; + + if (atomic_read(&map->eoi) > 0 && !atomic_read(&map->write)) { + atomic_set(&map->eoi, 0); + xen_irq_lateeoi(map->irq, eoi_flags); + eoi_flags = XEN_EOI_FLAG_SPURIOUS; + } atomic_dec(&map->io); } @@ -343,12 +356,9 @@ static struct sock_mapping *pvcalls_new_active_socket( goto out; map->bytes = page; - ret = bind_interdomain_evtchn_to_irqhandler(fedata->dev->otherend_id, - evtchn, - pvcalls_back_conn_event, - 0, - "pvcalls-backend", - map); + ret = bind_interdomain_evtchn_to_irqhandler_lateeoi( + fedata->dev->otherend_id, evtchn, + pvcalls_back_conn_event, 0, "pvcalls-backend", map); if (ret < 0) goto out; map->irq = ret; @@ -882,15 +892,18 @@ static irqreturn_t pvcalls_back_event(int irq, void *dev_id) { struct xenbus_device *dev = dev_id; struct pvcalls_fedata *fedata = NULL; + unsigned int eoi_flags = XEN_EOI_FLAG_SPURIOUS; - if (dev == NULL) - return IRQ_HANDLED; + if (dev) { + fedata = dev_get_drvdata(&dev->dev); + if (fedata) { + pvcalls_back_work(fedata); + eoi_flags = 0; + } + } - fedata = dev_get_drvdata(&dev->dev); - if (fedata == NULL) - return IRQ_HANDLED; + xen_irq_lateeoi(irq, eoi_flags); - pvcalls_back_work(fedata); return IRQ_HANDLED; } @@ -900,12 +913,15 @@ static irqreturn_t pvcalls_back_conn_event(int irq, void *sock_map) struct pvcalls_ioworker *iow; if (map == NULL || map->sock == NULL || map->sock->sk == NULL || - map->sock->sk->sk_user_data != map) + map->sock->sk->sk_user_data != map) { + xen_irq_lateeoi(irq, 0); return IRQ_HANDLED; + } iow = &map->ioworker; atomic_inc(&map->write); + atomic_inc(&map->eoi); atomic_inc(&map->io); queue_work(iow->wq, &iow->register_work); @@ -940,7 +956,7 @@ static int backend_connect(struct xenbus_device *dev) goto error; } - err = bind_interdomain_evtchn_to_irq(dev->otherend_id, evtchn); + err = bind_interdomain_evtchn_to_irq_lateeoi(dev->otherend_id, evtchn); if (err < 0) goto error; fedata->irq = err; diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c index 097410a7cdb7..adf3aae2939f 100644 --- a/drivers/xen/xen-pciback/pci_stub.c +++ b/drivers/xen/xen-pciback/pci_stub.c @@ -733,10 +733,17 @@ static pci_ers_result_t common_process(struct pcistub_device *psdev, wmb(); notify_remote_via_irq(pdev->evtchn_irq); + /* Enable IRQ to signal "request done". */ + xen_pcibk_lateeoi(pdev, 0); + ret = wait_event_timeout(xen_pcibk_aer_wait_queue, !(test_bit(_XEN_PCIB_active, (unsigned long *) &sh_info->flags)), 300*HZ); + /* Enable IRQ for pcifront request if not already active. */ + if (!test_bit(_PDEVF_op_active, &pdev->flags)) + xen_pcibk_lateeoi(pdev, 0); + if (!ret) { if (test_bit(_XEN_PCIB_active, (unsigned long *)&sh_info->flags)) { @@ -750,13 +757,6 @@ static pci_ers_result_t common_process(struct pcistub_device *psdev, } clear_bit(_PCIB_op_pending, (unsigned long *)&pdev->flags); - if (test_bit(_XEN_PCIF_active, - (unsigned long *)&sh_info->flags)) { - dev_dbg(&psdev->dev->dev, - "schedule pci_conf service in " DRV_NAME "\n"); - xen_pcibk_test_and_schedule_op(psdev->pdev); - } - res = (pci_ers_result_t)aer_op->err; return res; } diff --git a/drivers/xen/xen-pciback/pciback.h b/drivers/xen/xen-pciback/pciback.h index 263c059bff90..235cdfe13494 100644 --- a/drivers/xen/xen-pciback/pciback.h +++ b/drivers/xen/xen-pciback/pciback.h @@ -14,6 +14,7 @@ #include #include #include +#include #include #define DRV_NAME "xen-pciback" @@ -27,6 +28,8 @@ struct pci_dev_entry { #define PDEVF_op_active (1<<(_PDEVF_op_active)) #define _PCIB_op_pending (1) #define PCIB_op_pending (1<<(_PCIB_op_pending)) +#define _EOI_pending (2) +#define EOI_pending (1<<(_EOI_pending)) struct xen_pcibk_device { void *pci_dev_data; @@ -182,12 +185,17 @@ static inline void xen_pcibk_release_devices(struct xen_pcibk_device *pdev) irqreturn_t xen_pcibk_handle_event(int irq, void *dev_id); void xen_pcibk_do_op(struct work_struct *data); +static inline void xen_pcibk_lateeoi(struct xen_pcibk_device *pdev, + unsigned int eoi_flag) +{ + if (test_and_clear_bit(_EOI_pending, &pdev->flags)) + xen_irq_lateeoi(pdev->evtchn_irq, eoi_flag); +} + int xen_pcibk_xenbus_register(void); void xen_pcibk_xenbus_unregister(void); extern int verbose_request; - -void xen_pcibk_test_and_schedule_op(struct xen_pcibk_device *pdev); #endif /* Handles shared IRQs that can to device domain and control domain. */ diff --git a/drivers/xen/xen-pciback/pciback_ops.c b/drivers/xen/xen-pciback/pciback_ops.c index 787966f44589..c4ed2c634ca7 100644 --- a/drivers/xen/xen-pciback/pciback_ops.c +++ b/drivers/xen/xen-pciback/pciback_ops.c @@ -297,26 +297,41 @@ int xen_pcibk_disable_msix(struct xen_pcibk_device *pdev, return 0; } #endif + +static inline bool xen_pcibk_test_op_pending(struct xen_pcibk_device *pdev) +{ + return test_bit(_XEN_PCIF_active, + (unsigned long *)&pdev->sh_info->flags) && + !test_and_set_bit(_PDEVF_op_active, &pdev->flags); +} + /* * Now the same evtchn is used for both pcifront conf_read_write request * as well as pcie aer front end ack. We use a new work_queue to schedule * xen_pcibk conf_read_write service for avoiding confict with aer_core * do_recovery job which also use the system default work_queue */ -void xen_pcibk_test_and_schedule_op(struct xen_pcibk_device *pdev) +static void xen_pcibk_test_and_schedule_op(struct xen_pcibk_device *pdev) { + bool eoi = true; + /* Check that frontend is requesting an operation and that we are not * already processing a request */ - if (test_bit(_XEN_PCIF_active, (unsigned long *)&pdev->sh_info->flags) - && !test_and_set_bit(_PDEVF_op_active, &pdev->flags)) { + if (xen_pcibk_test_op_pending(pdev)) { schedule_work(&pdev->op_work); + eoi = false; } /*_XEN_PCIB_active should have been cleared by pcifront. And also make sure xen_pcibk is waiting for ack by checking _PCIB_op_pending*/ if (!test_bit(_XEN_PCIB_active, (unsigned long *)&pdev->sh_info->flags) && test_bit(_PCIB_op_pending, &pdev->flags)) { wake_up(&xen_pcibk_aer_wait_queue); + eoi = false; } + + /* EOI if there was nothing to do. */ + if (eoi) + xen_pcibk_lateeoi(pdev, XEN_EOI_FLAG_SPURIOUS); } /* Performing the configuration space reads/writes must not be done in atomic @@ -324,10 +339,8 @@ void xen_pcibk_test_and_schedule_op(struct xen_pcibk_device *pdev) * use of semaphores). This function is intended to be called from a work * queue in process context taking a struct xen_pcibk_device as a parameter */ -void xen_pcibk_do_op(struct work_struct *data) +static void xen_pcibk_do_one_op(struct xen_pcibk_device *pdev) { - struct xen_pcibk_device *pdev = - container_of(data, struct xen_pcibk_device, op_work); struct pci_dev *dev; struct xen_pcibk_dev_data *dev_data = NULL; struct xen_pci_op *op = &pdev->op; @@ -400,16 +413,31 @@ void xen_pcibk_do_op(struct work_struct *data) smp_mb__before_atomic(); /* /after/ clearing PCIF_active */ clear_bit(_PDEVF_op_active, &pdev->flags); smp_mb__after_atomic(); /* /before/ final check for work */ +} - /* Check to see if the driver domain tried to start another request in - * between clearing _XEN_PCIF_active and clearing _PDEVF_op_active. - */ - xen_pcibk_test_and_schedule_op(pdev); +void xen_pcibk_do_op(struct work_struct *data) +{ + struct xen_pcibk_device *pdev = + container_of(data, struct xen_pcibk_device, op_work); + + do { + xen_pcibk_do_one_op(pdev); + } while (xen_pcibk_test_op_pending(pdev)); + + xen_pcibk_lateeoi(pdev, 0); } irqreturn_t xen_pcibk_handle_event(int irq, void *dev_id) { struct xen_pcibk_device *pdev = dev_id; + bool eoi; + + /* IRQs might come in before pdev->evtchn_irq is written. */ + if (unlikely(pdev->evtchn_irq != irq)) + pdev->evtchn_irq = irq; + + eoi = test_and_set_bit(_EOI_pending, &pdev->flags); + WARN(eoi, "IRQ while EOI pending\n"); xen_pcibk_test_and_schedule_op(pdev); diff --git a/drivers/xen/xen-pciback/xenbus.c b/drivers/xen/xen-pciback/xenbus.c index 581c4e1a8b82..3bbed47da3fa 100644 --- a/drivers/xen/xen-pciback/xenbus.c +++ b/drivers/xen/xen-pciback/xenbus.c @@ -123,7 +123,7 @@ static int xen_pcibk_do_attach(struct xen_pcibk_device *pdev, int gnt_ref, pdev->sh_info = vaddr; - err = bind_interdomain_evtchn_to_irqhandler( + err = bind_interdomain_evtchn_to_irqhandler_lateeoi( pdev->xdev->otherend_id, remote_evtchn, xen_pcibk_handle_event, 0, DRV_NAME, pdev); if (err < 0) { diff --git a/drivers/xen/xen-scsiback.c b/drivers/xen/xen-scsiback.c index 14a3d4cbc2a7..1abc0a55b8d9 100644 --- a/drivers/xen/xen-scsiback.c +++ b/drivers/xen/xen-scsiback.c @@ -91,7 +91,6 @@ struct vscsibk_info { unsigned int irq; struct vscsiif_back_ring ring; - int ring_error; spinlock_t ring_lock; atomic_t nr_unreplied_reqs; @@ -722,7 +721,8 @@ static struct vscsibk_pend *prepare_pending_reqs(struct vscsibk_info *info, return pending_req; } -static int scsiback_do_cmd_fn(struct vscsibk_info *info) +static int scsiback_do_cmd_fn(struct vscsibk_info *info, + unsigned int *eoi_flags) { struct vscsiif_back_ring *ring = &info->ring; struct vscsiif_request ring_req; @@ -739,11 +739,12 @@ static int scsiback_do_cmd_fn(struct vscsibk_info *info) rc = ring->rsp_prod_pvt; pr_warn("Dom%d provided bogus ring requests (%#x - %#x = %u). Halting ring processing\n", info->domid, rp, rc, rp - rc); - info->ring_error = 1; - return 0; + return -EINVAL; } while ((rc != rp)) { + *eoi_flags &= ~XEN_EOI_FLAG_SPURIOUS; + if (RING_REQUEST_CONS_OVERFLOW(ring, rc)) break; @@ -802,13 +803,16 @@ static int scsiback_do_cmd_fn(struct vscsibk_info *info) static irqreturn_t scsiback_irq_fn(int irq, void *dev_id) { struct vscsibk_info *info = dev_id; + int rc; + unsigned int eoi_flags = XEN_EOI_FLAG_SPURIOUS; - if (info->ring_error) - return IRQ_HANDLED; - - while (scsiback_do_cmd_fn(info)) + while ((rc = scsiback_do_cmd_fn(info, &eoi_flags)) > 0) cond_resched(); + /* In case of a ring error we keep the event channel masked. */ + if (!rc) + xen_irq_lateeoi(irq, eoi_flags); + return IRQ_HANDLED; } @@ -829,7 +833,7 @@ static int scsiback_init_sring(struct vscsibk_info *info, grant_ref_t ring_ref, sring = (struct vscsiif_sring *)area; BACK_RING_INIT(&info->ring, sring, PAGE_SIZE); - err = bind_interdomain_evtchn_to_irq(info->domid, evtchn); + err = bind_interdomain_evtchn_to_irq_lateeoi(info->domid, evtchn); if (err < 0) goto unmap_page; @@ -1252,7 +1256,6 @@ static int scsiback_probe(struct xenbus_device *dev, info->domid = dev->otherend_id; spin_lock_init(&info->ring_lock); - info->ring_error = 0; atomic_set(&info->nr_unreplied_reqs, 0); init_waitqueue_head(&info->waiting_to_free); info->dev = dev; diff --git a/fs/9p/vfs_file.c b/fs/9p/vfs_file.c index 550d0b169d7c..61e0c552083f 100644 --- a/fs/9p/vfs_file.c +++ b/fs/9p/vfs_file.c @@ -624,9 +624,9 @@ static void v9fs_mmap_vm_close(struct vm_area_struct *vma) struct writeback_control wbc = { .nr_to_write = LONG_MAX, .sync_mode = WB_SYNC_ALL, - .range_start = vma->vm_pgoff * PAGE_SIZE, + .range_start = (loff_t)vma->vm_pgoff * PAGE_SIZE, /* absolute end, byte at end included */ - .range_end = vma->vm_pgoff * PAGE_SIZE + + .range_end = (loff_t)vma->vm_pgoff * PAGE_SIZE + (vma->vm_end - vma->vm_start - 1), }; diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index 8007b6aacec6..f36b2a386aae 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -1110,6 +1110,8 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans, ret = update_ref_for_cow(trans, root, buf, cow, &last_ref); if (ret) { + btrfs_tree_unlock(cow); + free_extent_buffer(cow); btrfs_abort_transaction(trans, ret); return ret; } @@ -1117,6 +1119,8 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans, if (test_bit(BTRFS_ROOT_REF_COWS, &root->state)) { ret = btrfs_reloc_cow_block(trans, root, buf, cow); if (ret) { + btrfs_tree_unlock(cow); + free_extent_buffer(cow); btrfs_abort_transaction(trans, ret); return ret; } @@ -1149,6 +1153,8 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans, if (last_ref) { ret = tree_mod_log_free_eb(buf); if (ret) { + btrfs_tree_unlock(cow); + free_extent_buffer(cow); btrfs_abort_transaction(trans, ret); return ret; } diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 554727d82d43..4d1c12faada8 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -1459,6 +1459,21 @@ do { \ #define BTRFS_INODE_ROOT_ITEM_INIT (1 << 31) +#define BTRFS_INODE_FLAG_MASK \ + (BTRFS_INODE_NODATASUM | \ + BTRFS_INODE_NODATACOW | \ + BTRFS_INODE_READONLY | \ + BTRFS_INODE_NOCOMPRESS | \ + BTRFS_INODE_PREALLOC | \ + BTRFS_INODE_SYNC | \ + BTRFS_INODE_IMMUTABLE | \ + BTRFS_INODE_APPEND | \ + BTRFS_INODE_NODUMP | \ + BTRFS_INODE_NOATIME | \ + BTRFS_INODE_DIRSYNC | \ + BTRFS_INODE_COMPRESS | \ + BTRFS_INODE_ROOT_ITEM_INIT) + struct btrfs_map_token { const struct extent_buffer *eb; char *kaddr; diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c index 7374fb23381c..14855972dee3 100644 --- a/fs/btrfs/delayed-inode.c +++ b/fs/btrfs/delayed-inode.c @@ -620,8 +620,7 @@ static int btrfs_delayed_inode_reserve_metadata( */ if (!src_rsv || (!trans->bytes_reserved && src_rsv->type != BTRFS_BLOCK_RSV_DELALLOC)) { - ret = btrfs_qgroup_reserve_meta_prealloc(root, - fs_info->nodesize, true); + ret = btrfs_qgroup_reserve_meta_prealloc(root, num_bytes, true); if (ret < 0) return ret; ret = btrfs_block_rsv_add(root, dst_rsv, num_bytes, diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c index 1b9c8ffb038f..36c0490156ac 100644 --- a/fs/btrfs/dev-replace.c +++ b/fs/btrfs/dev-replace.c @@ -190,7 +190,7 @@ static int btrfs_init_dev_replace_tgtdev(struct btrfs_fs_info *fs_info, int ret = 0; *device_out = NULL; - if (fs_info->fs_devices->seeding) { + if (srcdev->fs_devices->seeding) { btrfs_err(fs_info, "the filesystem is a seed filesystem!"); return -EINVAL; } diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 82d597b16152..301111922a1a 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -138,7 +138,61 @@ static int add_extent_changeset(struct extent_state *state, unsigned bits, return ret; } -static void flush_write_bio(struct extent_page_data *epd); +static int __must_check submit_one_bio(struct bio *bio, int mirror_num, + unsigned long bio_flags) +{ + blk_status_t ret = 0; + struct bio_vec *bvec = bio_last_bvec_all(bio); + struct page *page = bvec->bv_page; + struct extent_io_tree *tree = bio->bi_private; + u64 start; + + start = page_offset(page) + bvec->bv_offset; + + bio->bi_private = NULL; + + if (tree->ops) + ret = tree->ops->submit_bio_hook(tree->private_data, bio, + mirror_num, bio_flags, start); + else + btrfsic_submit_bio(bio); + + return blk_status_to_errno(ret); +} + +/* Cleanup unsubmitted bios */ +static void end_write_bio(struct extent_page_data *epd, int ret) +{ + if (epd->bio) { + epd->bio->bi_status = errno_to_blk_status(ret); + bio_endio(epd->bio); + epd->bio = NULL; + } +} + +/* + * Submit bio from extent page data via submit_one_bio + * + * Return 0 if everything is OK. + * Return <0 for error. + */ +static int __must_check flush_write_bio(struct extent_page_data *epd) +{ + int ret = 0; + + if (epd->bio) { + ret = submit_one_bio(epd->bio, 0, 0); + /* + * Clean up of epd->bio is handled by its endio function. + * And endio is either triggered by successful bio execution + * or the error handler of submit bio hook. + * So at this point, no matter what happened, we don't need + * to clean up epd->bio. + */ + epd->bio = NULL; + } + return ret; +} int __init extent_io_init(void) { @@ -2710,28 +2764,6 @@ struct bio *btrfs_bio_clone_partial(struct bio *orig, int offset, int size) return bio; } -static int __must_check submit_one_bio(struct bio *bio, int mirror_num, - unsigned long bio_flags) -{ - blk_status_t ret = 0; - struct bio_vec *bvec = bio_last_bvec_all(bio); - struct page *page = bvec->bv_page; - struct extent_io_tree *tree = bio->bi_private; - u64 start; - - start = page_offset(page) + bvec->bv_offset; - - bio->bi_private = NULL; - - if (tree->ops) - ret = tree->ops->submit_bio_hook(tree->private_data, bio, - mirror_num, bio_flags, start); - else - btrfsic_submit_bio(bio); - - return blk_status_to_errno(ret); -} - /* * @opf: bio REQ_OP_* and REQ_* flags as one value * @tree: tree so we can call our merge_bio hook @@ -3439,6 +3471,9 @@ done: * records are inserted to lock ranges in the tree, and as dirty areas * are found, they are marked writeback. Then the lock bits are removed * and the end_io handler clears the writeback ranges + * + * Return 0 if everything goes well. + * Return <0 for error. */ static int __extent_writepage(struct page *page, struct writeback_control *wbc, struct extent_page_data *epd) @@ -3506,6 +3541,7 @@ done: end_extent_writepage(page, ret, start, page_end); } unlock_page(page); + ASSERT(ret <= 0); return ret; done_unlocked: @@ -3518,18 +3554,34 @@ void wait_on_extent_buffer_writeback(struct extent_buffer *eb) TASK_UNINTERRUPTIBLE); } +static void end_extent_buffer_writeback(struct extent_buffer *eb) +{ + clear_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags); + smp_mb__after_atomic(); + wake_up_bit(&eb->bflags, EXTENT_BUFFER_WRITEBACK); +} + +/* + * Lock eb pages and flush the bio if we can't the locks + * + * Return 0 if nothing went wrong + * Return >0 is same as 0, except bio is not submitted + * Return <0 if something went wrong, no page is locked + */ static noinline_for_stack int lock_extent_buffer_for_io(struct extent_buffer *eb, struct btrfs_fs_info *fs_info, struct extent_page_data *epd) { - int i, num_pages; + int i, num_pages, failed_page_nr; int flush = 0; int ret = 0; if (!btrfs_try_tree_write_lock(eb)) { + ret = flush_write_bio(epd); + if (ret < 0) + return ret; flush = 1; - flush_write_bio(epd); btrfs_tree_lock(eb); } @@ -3538,7 +3590,9 @@ lock_extent_buffer_for_io(struct extent_buffer *eb, if (!epd->sync_io) return 0; if (!flush) { - flush_write_bio(epd); + ret = flush_write_bio(epd); + if (ret < 0) + return ret; flush = 1; } while (1) { @@ -3579,7 +3633,14 @@ lock_extent_buffer_for_io(struct extent_buffer *eb, if (!trylock_page(p)) { if (!flush) { - flush_write_bio(epd); + int err; + + err = flush_write_bio(epd); + if (err < 0) { + ret = err; + failed_page_nr = i; + goto err_unlock; + } flush = 1; } lock_page(p); @@ -3587,13 +3648,25 @@ lock_extent_buffer_for_io(struct extent_buffer *eb, } return ret; -} - -static void end_extent_buffer_writeback(struct extent_buffer *eb) -{ - clear_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags); - smp_mb__after_atomic(); - wake_up_bit(&eb->bflags, EXTENT_BUFFER_WRITEBACK); +err_unlock: + /* Unlock already locked pages */ + for (i = 0; i < failed_page_nr; i++) + unlock_page(eb->pages[i]); + /* + * Clear EXTENT_BUFFER_WRITEBACK and wake up anyone waiting on it. + * Also set back EXTENT_BUFFER_DIRTY so future attempts to this eb can + * be made and undo everything done before. + */ + btrfs_tree_lock(eb); + spin_lock(&eb->refs_lock); + set_bit(EXTENT_BUFFER_DIRTY, &eb->bflags); + end_extent_buffer_writeback(eb); + spin_unlock(&eb->refs_lock); + percpu_counter_add_batch(&fs_info->dirty_metadata_bytes, eb->len, + fs_info->dirty_metadata_batch); + btrfs_clear_header_flag(eb, BTRFS_HEADER_FLAG_WRITTEN); + btrfs_tree_unlock(eb); + return ret; } static void set_btree_ioerr(struct page *page) @@ -3869,7 +3942,44 @@ retry: index = 0; goto retry; } - flush_write_bio(&epd); + ASSERT(ret <= 0); + if (ret < 0) { + end_write_bio(&epd, ret); + return ret; + } + /* + * If something went wrong, don't allow any metadata write bio to be + * submitted. + * + * This would prevent use-after-free if we had dirty pages not + * cleaned up, which can still happen by fuzzed images. + * + * - Bad extent tree + * Allowing existing tree block to be allocated for other trees. + * + * - Log tree operations + * Exiting tree blocks get allocated to log tree, bumps its + * generation, then get cleaned in tree re-balance. + * Such tree block will not be written back, since it's clean, + * thus no WRITTEN flag set. + * And after log writes back, this tree block is not traced by + * any dirty extent_io_tree. + * + * - Offending tree block gets re-dirtied from its original owner + * Since it has bumped generation, no WRITTEN flag, it can be + * reused without COWing. This tree block will not be traced + * by btrfs_transaction::dirty_pages. + * + * Now such dirty tree block will not be cleaned by any dirty + * extent io tree. Thus we don't want to submit such wild eb + * if the fs already has error. + */ + if (!test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state)) { + ret = flush_write_bio(&epd); + } else { + ret = -EUCLEAN; + end_write_bio(&epd, ret); + } return ret; } @@ -3966,7 +4076,8 @@ retry: * tmpfs file mapping */ if (!trylock_page(page)) { - flush_write_bio(epd); + ret = flush_write_bio(epd); + BUG_ON(ret < 0); lock_page(page); } @@ -3976,8 +4087,10 @@ retry: } if (wbc->sync_mode != WB_SYNC_NONE) { - if (PageWriteback(page)) - flush_write_bio(epd); + if (PageWriteback(page)) { + ret = flush_write_bio(epd); + BUG_ON(ret < 0); + } wait_on_page_writeback(page); } @@ -4022,8 +4135,9 @@ retry: * page in our current bio, and thus deadlock, so flush the * write bio here. */ - flush_write_bio(epd); - goto retry; + ret = flush_write_bio(epd); + if (!ret) + goto retry; } if (wbc->range_cyclic || (wbc->nr_to_write > 0 && range_whole)) @@ -4033,17 +4147,6 @@ retry: return ret; } -static void flush_write_bio(struct extent_page_data *epd) -{ - if (epd->bio) { - int ret; - - ret = submit_one_bio(epd->bio, 0, 0); - BUG_ON(ret < 0); /* -ENOMEM */ - epd->bio = NULL; - } -} - int extent_write_full_page(struct page *page, struct writeback_control *wbc) { int ret; @@ -4055,8 +4158,14 @@ int extent_write_full_page(struct page *page, struct writeback_control *wbc) }; ret = __extent_writepage(page, wbc, &epd); + ASSERT(ret <= 0); + if (ret < 0) { + end_write_bio(&epd, ret); + return ret; + } - flush_write_bio(&epd); + ret = flush_write_bio(&epd); + ASSERT(ret <= 0); return ret; } @@ -4064,6 +4173,7 @@ int extent_write_locked_range(struct inode *inode, u64 start, u64 end, int mode) { int ret = 0; + int flush_ret; struct address_space *mapping = inode->i_mapping; struct extent_io_tree *tree = &BTRFS_I(inode)->io_tree; struct page *page; @@ -4098,7 +4208,8 @@ int extent_write_locked_range(struct inode *inode, u64 start, u64 end, start += PAGE_SIZE; } - flush_write_bio(&epd); + flush_ret = flush_write_bio(&epd); + BUG_ON(flush_ret < 0); return ret; } @@ -4106,6 +4217,7 @@ int extent_writepages(struct address_space *mapping, struct writeback_control *wbc) { int ret = 0; + int flush_ret; struct extent_page_data epd = { .bio = NULL, .tree = &BTRFS_I(mapping->host)->io_tree, @@ -4114,7 +4226,8 @@ int extent_writepages(struct address_space *mapping, }; ret = extent_write_cache_pages(mapping, wbc, &epd); - flush_write_bio(&epd); + flush_ret = flush_write_bio(&epd); + BUG_ON(flush_ret < 0); return ret; } diff --git a/fs/btrfs/reada.c b/fs/btrfs/reada.c index 4c81ffe12385..368c349c5669 100644 --- a/fs/btrfs/reada.c +++ b/fs/btrfs/reada.c @@ -442,6 +442,8 @@ static struct reada_extent *reada_find_extent(struct btrfs_fs_info *fs_info, } have_zone = 1; } + if (!have_zone) + radix_tree_delete(&fs_info->reada_tree, index); spin_unlock(&fs_info->reada_lock); btrfs_dev_replace_read_unlock(&fs_info->dev_replace); diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 2bc80d0b56db..ed61c0daef41 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -3810,6 +3810,72 @@ static int update_ref_path(struct send_ctx *sctx, struct recorded_ref *ref) return 0; } +/* + * When processing the new references for an inode we may orphanize an existing + * directory inode because its old name conflicts with one of the new references + * of the current inode. Later, when processing another new reference of our + * inode, we might need to orphanize another inode, but the path we have in the + * reference reflects the pre-orphanization name of the directory we previously + * orphanized. For example: + * + * parent snapshot looks like: + * + * . (ino 256) + * |----- f1 (ino 257) + * |----- f2 (ino 258) + * |----- d1/ (ino 259) + * |----- d2/ (ino 260) + * + * send snapshot looks like: + * + * . (ino 256) + * |----- d1 (ino 258) + * |----- f2/ (ino 259) + * |----- f2_link/ (ino 260) + * | |----- f1 (ino 257) + * | + * |----- d2 (ino 258) + * + * When processing inode 257 we compute the name for inode 259 as "d1", and we + * cache it in the name cache. Later when we start processing inode 258, when + * collecting all its new references we set a full path of "d1/d2" for its new + * reference with name "d2". When we start processing the new references we + * start by processing the new reference with name "d1", and this results in + * orphanizing inode 259, since its old reference causes a conflict. Then we + * move on the next new reference, with name "d2", and we find out we must + * orphanize inode 260, as its old reference conflicts with ours - but for the + * orphanization we use a source path corresponding to the path we stored in the + * new reference, which is "d1/d2" and not "o259-6-0/d2" - this makes the + * receiver fail since the path component "d1/" no longer exists, it was renamed + * to "o259-6-0/" when processing the previous new reference. So in this case we + * must recompute the path in the new reference and use it for the new + * orphanization operation. + */ +static int refresh_ref_path(struct send_ctx *sctx, struct recorded_ref *ref) +{ + char *name; + int ret; + + name = kmemdup(ref->name, ref->name_len, GFP_KERNEL); + if (!name) + return -ENOMEM; + + fs_path_reset(ref->full_path); + ret = get_cur_path(sctx, ref->dir, ref->dir_gen, ref->full_path); + if (ret < 0) + goto out; + + ret = fs_path_add(ref->full_path, name, ref->name_len); + if (ret < 0) + goto out; + + /* Update the reference's base name pointer. */ + set_ref_path(ref, ref->full_path); +out: + kfree(name); + return ret; +} + /* * This does all the move/link/unlink/rmdir magic. */ @@ -3940,6 +4006,12 @@ static int process_recorded_refs(struct send_ctx *sctx, int *pending_move) struct name_cache_entry *nce; struct waiting_dir_move *wdm; + if (orphanized_dir) { + ret = refresh_ref_path(sctx, cur); + if (ret < 0) + goto out; + } + ret = orphanize_inode(sctx, ow_inode, ow_gen, cur->full_path); if (ret < 0) @@ -6787,7 +6859,7 @@ long btrfs_ioctl_send(struct file *mnt_file, struct btrfs_ioctl_send_args *arg) alloc_size = sizeof(struct clone_root) * (arg->clone_sources_count + 1); - sctx->clone_roots = kzalloc(alloc_size, GFP_KERNEL); + sctx->clone_roots = kvzalloc(alloc_size, GFP_KERNEL); if (!sctx->clone_roots) { ret = -ENOMEM; goto out; diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c index d98ec885b72a..9023e6b46396 100644 --- a/fs/btrfs/tree-checker.c +++ b/fs/btrfs/tree-checker.c @@ -448,6 +448,320 @@ static int check_block_group_item(struct btrfs_fs_info *fs_info, return 0; } +__printf(5, 6) +__cold +static void chunk_err(const struct btrfs_fs_info *fs_info, + const struct extent_buffer *leaf, + const struct btrfs_chunk *chunk, u64 logical, + const char *fmt, ...) +{ + bool is_sb; + struct va_format vaf; + va_list args; + int i; + int slot = -1; + + /* Only superblock eb is able to have such small offset */ + is_sb = (leaf->start == BTRFS_SUPER_INFO_OFFSET); + + if (!is_sb) { + /* + * Get the slot number by iterating through all slots, this + * would provide better readability. + */ + for (i = 0; i < btrfs_header_nritems(leaf); i++) { + if (btrfs_item_ptr_offset(leaf, i) == + (unsigned long)chunk) { + slot = i; + break; + } + } + } + va_start(args, fmt); + vaf.fmt = fmt; + vaf.va = &args; + + if (is_sb) + btrfs_crit(fs_info, + "corrupt superblock syschunk array: chunk_start=%llu, %pV", + logical, &vaf); + else + btrfs_crit(fs_info, + "corrupt leaf: root=%llu block=%llu slot=%d chunk_start=%llu, %pV", + BTRFS_CHUNK_TREE_OBJECTID, leaf->start, slot, + logical, &vaf); + va_end(args); +} + +/* + * The common chunk check which could also work on super block sys chunk array. + * + * Return -EUCLEAN if anything is corrupted. + * Return 0 if everything is OK. + */ +int btrfs_check_chunk_valid(struct btrfs_fs_info *fs_info, + struct extent_buffer *leaf, + struct btrfs_chunk *chunk, u64 logical) +{ + u64 length; + u64 stripe_len; + u16 num_stripes; + u16 sub_stripes; + u64 type; + u64 features; + bool mixed = false; + + length = btrfs_chunk_length(leaf, chunk); + stripe_len = btrfs_chunk_stripe_len(leaf, chunk); + num_stripes = btrfs_chunk_num_stripes(leaf, chunk); + sub_stripes = btrfs_chunk_sub_stripes(leaf, chunk); + type = btrfs_chunk_type(leaf, chunk); + + if (!num_stripes) { + chunk_err(fs_info, leaf, chunk, logical, + "invalid chunk num_stripes, have %u", num_stripes); + return -EUCLEAN; + } + if (!IS_ALIGNED(logical, fs_info->sectorsize)) { + chunk_err(fs_info, leaf, chunk, logical, + "invalid chunk logical, have %llu should aligned to %u", + logical, fs_info->sectorsize); + return -EUCLEAN; + } + if (btrfs_chunk_sector_size(leaf, chunk) != fs_info->sectorsize) { + chunk_err(fs_info, leaf, chunk, logical, + "invalid chunk sectorsize, have %u expect %u", + btrfs_chunk_sector_size(leaf, chunk), + fs_info->sectorsize); + return -EUCLEAN; + } + if (!length || !IS_ALIGNED(length, fs_info->sectorsize)) { + chunk_err(fs_info, leaf, chunk, logical, + "invalid chunk length, have %llu", length); + return -EUCLEAN; + } + if (!is_power_of_2(stripe_len) || stripe_len != BTRFS_STRIPE_LEN) { + chunk_err(fs_info, leaf, chunk, logical, + "invalid chunk stripe length: %llu", + stripe_len); + return -EUCLEAN; + } + if (~(BTRFS_BLOCK_GROUP_TYPE_MASK | BTRFS_BLOCK_GROUP_PROFILE_MASK) & + type) { + chunk_err(fs_info, leaf, chunk, logical, + "unrecognized chunk type: 0x%llx", + ~(BTRFS_BLOCK_GROUP_TYPE_MASK | + BTRFS_BLOCK_GROUP_PROFILE_MASK) & + btrfs_chunk_type(leaf, chunk)); + return -EUCLEAN; + } + + if (!is_power_of_2(type & BTRFS_BLOCK_GROUP_PROFILE_MASK) && + (type & BTRFS_BLOCK_GROUP_PROFILE_MASK) != 0) { + chunk_err(fs_info, leaf, chunk, logical, + "invalid chunk profile flag: 0x%llx, expect 0 or 1 bit set", + type & BTRFS_BLOCK_GROUP_PROFILE_MASK); + return -EUCLEAN; + } + if ((type & BTRFS_BLOCK_GROUP_TYPE_MASK) == 0) { + chunk_err(fs_info, leaf, chunk, logical, + "missing chunk type flag, have 0x%llx one bit must be set in 0x%llx", + type, BTRFS_BLOCK_GROUP_TYPE_MASK); + return -EUCLEAN; + } + + if ((type & BTRFS_BLOCK_GROUP_SYSTEM) && + (type & (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_DATA))) { + chunk_err(fs_info, leaf, chunk, logical, + "system chunk with data or metadata type: 0x%llx", + type); + return -EUCLEAN; + } + + features = btrfs_super_incompat_flags(fs_info->super_copy); + if (features & BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS) + mixed = true; + + if (!mixed) { + if ((type & BTRFS_BLOCK_GROUP_METADATA) && + (type & BTRFS_BLOCK_GROUP_DATA)) { + chunk_err(fs_info, leaf, chunk, logical, + "mixed chunk type in non-mixed mode: 0x%llx", type); + return -EUCLEAN; + } + } + + if ((type & BTRFS_BLOCK_GROUP_RAID10 && sub_stripes != 2) || + (type & BTRFS_BLOCK_GROUP_RAID1 && num_stripes != 2) || + (type & BTRFS_BLOCK_GROUP_RAID5 && num_stripes < 2) || + (type & BTRFS_BLOCK_GROUP_RAID6 && num_stripes < 3) || + (type & BTRFS_BLOCK_GROUP_DUP && num_stripes != 2) || + ((type & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0 && num_stripes != 1)) { + chunk_err(fs_info, leaf, chunk, logical, + "invalid num_stripes:sub_stripes %u:%u for profile %llu", + num_stripes, sub_stripes, + type & BTRFS_BLOCK_GROUP_PROFILE_MASK); + return -EUCLEAN; + } + + return 0; +} + +__printf(4, 5) +__cold +static void dev_item_err(const struct btrfs_fs_info *fs_info, + const struct extent_buffer *eb, int slot, + const char *fmt, ...) +{ + struct btrfs_key key; + struct va_format vaf; + va_list args; + + btrfs_item_key_to_cpu(eb, &key, slot); + va_start(args, fmt); + + vaf.fmt = fmt; + vaf.va = &args; + + btrfs_crit(fs_info, + "corrupt %s: root=%llu block=%llu slot=%d devid=%llu %pV", + btrfs_header_level(eb) == 0 ? "leaf" : "node", + btrfs_header_owner(eb), btrfs_header_bytenr(eb), slot, + key.objectid, &vaf); + va_end(args); +} + +static int check_dev_item(struct btrfs_fs_info *fs_info, + struct extent_buffer *leaf, + struct btrfs_key *key, int slot) +{ + struct btrfs_dev_item *ditem; + + if (key->objectid != BTRFS_DEV_ITEMS_OBJECTID) { + dev_item_err(fs_info, leaf, slot, + "invalid objectid: has=%llu expect=%llu", + key->objectid, BTRFS_DEV_ITEMS_OBJECTID); + return -EUCLEAN; + } + ditem = btrfs_item_ptr(leaf, slot, struct btrfs_dev_item); + if (btrfs_device_id(leaf, ditem) != key->offset) { + dev_item_err(fs_info, leaf, slot, + "devid mismatch: key has=%llu item has=%llu", + key->offset, btrfs_device_id(leaf, ditem)); + return -EUCLEAN; + } + + /* + * For device total_bytes, we don't have reliable way to check it, as + * it can be 0 for device removal. Device size check can only be done + * by dev extents check. + */ + if (btrfs_device_bytes_used(leaf, ditem) > + btrfs_device_total_bytes(leaf, ditem)) { + dev_item_err(fs_info, leaf, slot, + "invalid bytes used: have %llu expect [0, %llu]", + btrfs_device_bytes_used(leaf, ditem), + btrfs_device_total_bytes(leaf, ditem)); + return -EUCLEAN; + } + /* + * Remaining members like io_align/type/gen/dev_group aren't really + * utilized. Skip them to make later usage of them easier. + */ + return 0; +} + +/* Inode item error output has the same format as dir_item_err() */ +#define inode_item_err(fs_info, eb, slot, fmt, ...) \ + dir_item_err(fs_info, eb, slot, fmt, __VA_ARGS__) + +static int check_inode_item(struct btrfs_fs_info *fs_info, + struct extent_buffer *leaf, + struct btrfs_key *key, int slot) +{ + struct btrfs_inode_item *iitem; + u64 super_gen = btrfs_super_generation(fs_info->super_copy); + u32 valid_mask = (S_IFMT | S_ISUID | S_ISGID | S_ISVTX | 0777); + u32 mode; + + if ((key->objectid < BTRFS_FIRST_FREE_OBJECTID || + key->objectid > BTRFS_LAST_FREE_OBJECTID) && + key->objectid != BTRFS_ROOT_TREE_DIR_OBJECTID && + key->objectid != BTRFS_FREE_INO_OBJECTID) { + generic_err(fs_info, leaf, slot, + "invalid key objectid: has %llu expect %llu or [%llu, %llu] or %llu", + key->objectid, BTRFS_ROOT_TREE_DIR_OBJECTID, + BTRFS_FIRST_FREE_OBJECTID, + BTRFS_LAST_FREE_OBJECTID, + BTRFS_FREE_INO_OBJECTID); + return -EUCLEAN; + } + if (key->offset != 0) { + inode_item_err(fs_info, leaf, slot, + "invalid key offset: has %llu expect 0", + key->offset); + return -EUCLEAN; + } + iitem = btrfs_item_ptr(leaf, slot, struct btrfs_inode_item); + + /* Here we use super block generation + 1 to handle log tree */ + if (btrfs_inode_generation(leaf, iitem) > super_gen + 1) { + inode_item_err(fs_info, leaf, slot, + "invalid inode generation: has %llu expect (0, %llu]", + btrfs_inode_generation(leaf, iitem), + super_gen + 1); + return -EUCLEAN; + } + /* Note for ROOT_TREE_DIR_ITEM, mkfs could set its transid 0 */ + if (btrfs_inode_transid(leaf, iitem) > super_gen + 1) { + inode_item_err(fs_info, leaf, slot, + "invalid inode transid: has %llu expect [0, %llu]", + btrfs_inode_transid(leaf, iitem), super_gen + 1); + return -EUCLEAN; + } + + /* + * For size and nbytes it's better not to be too strict, as for dir + * item its size/nbytes can easily get wrong, but doesn't affect + * anything in the fs. So here we skip the check. + */ + mode = btrfs_inode_mode(leaf, iitem); + if (mode & ~valid_mask) { + inode_item_err(fs_info, leaf, slot, + "unknown mode bit detected: 0x%x", + mode & ~valid_mask); + return -EUCLEAN; + } + + /* + * S_IFMT is not bit mapped so we can't completely rely on is_power_of_2, + * but is_power_of_2() can save us from checking FIFO/CHR/DIR/REG. + * Only needs to check BLK, LNK and SOCKS + */ + if (!is_power_of_2(mode & S_IFMT)) { + if (!S_ISLNK(mode) && !S_ISBLK(mode) && !S_ISSOCK(mode)) { + inode_item_err(fs_info, leaf, slot, + "invalid mode: has 0%o expect valid S_IF* bit(s)", + mode & S_IFMT); + return -EUCLEAN; + } + } + if (S_ISDIR(mode) && btrfs_inode_nlink(leaf, iitem) > 1) { + inode_item_err(fs_info, leaf, slot, + "invalid nlink: has %u expect no more than 1 for dir", + btrfs_inode_nlink(leaf, iitem)); + return -EUCLEAN; + } + if (btrfs_inode_flags(leaf, iitem) & ~BTRFS_INODE_FLAG_MASK) { + inode_item_err(fs_info, leaf, slot, + "unknown flags detected: 0x%llx", + btrfs_inode_flags(leaf, iitem) & + ~BTRFS_INODE_FLAG_MASK); + return -EUCLEAN; + } + return 0; +} + /* * Common point to switch the item-specific validation. */ @@ -456,6 +770,7 @@ static int check_leaf_item(struct btrfs_fs_info *fs_info, struct btrfs_key *key, int slot) { int ret = 0; + struct btrfs_chunk *chunk; switch (key->type) { case BTRFS_EXTENT_DATA_KEY: @@ -472,6 +787,17 @@ static int check_leaf_item(struct btrfs_fs_info *fs_info, case BTRFS_BLOCK_GROUP_ITEM_KEY: ret = check_block_group_item(fs_info, leaf, key, slot); break; + case BTRFS_CHUNK_ITEM_KEY: + chunk = btrfs_item_ptr(leaf, slot, struct btrfs_chunk); + ret = btrfs_check_chunk_valid(fs_info, leaf, chunk, + key->offset); + break; + case BTRFS_DEV_ITEM_KEY: + ret = check_dev_item(fs_info, leaf, key, slot); + break; + case BTRFS_INODE_ITEM_KEY: + ret = check_inode_item(fs_info, leaf, key, slot); + break; } return ret; } diff --git a/fs/btrfs/tree-checker.h b/fs/btrfs/tree-checker.h index ff043275b784..4df45e8a6659 100644 --- a/fs/btrfs/tree-checker.h +++ b/fs/btrfs/tree-checker.h @@ -25,4 +25,8 @@ int btrfs_check_leaf_relaxed(struct btrfs_fs_info *fs_info, struct extent_buffer *leaf); int btrfs_check_node(struct btrfs_fs_info *fs_info, struct extent_buffer *node); +int btrfs_check_chunk_valid(struct btrfs_fs_info *fs_info, + struct extent_buffer *leaf, + struct btrfs_chunk *chunk, u64 logical); + #endif diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index 3e903e6a3387..7b940264c7b9 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -3589,6 +3589,7 @@ static noinline int log_dir_items(struct btrfs_trans_handle *trans, * search and this search we'll not find the key again and can just * bail. */ +search: ret = btrfs_search_slot(NULL, root, &min_key, path, 0, 0); if (ret != 0) goto done; @@ -3608,6 +3609,13 @@ static noinline int log_dir_items(struct btrfs_trans_handle *trans, if (min_key.objectid != ino || min_key.type != key_type) goto done; + + if (need_resched()) { + btrfs_release_path(path); + cond_resched(); + goto search; + } + ret = overwrite_item(trans, log, dst_path, src, i, &min_key); if (ret) { diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 815b655b8f10..05daa2b816c3 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -28,6 +28,7 @@ #include "math.h" #include "dev-replace.h" #include "sysfs.h" +#include "tree-checker.h" const struct btrfs_raid_attr btrfs_raid_array[BTRFS_NR_RAID_TYPES] = { [BTRFS_RAID_RAID10] = { @@ -857,16 +858,18 @@ static noinline struct btrfs_device *device_list_add(const char *path, bdput(path_bdev); mutex_unlock(&fs_devices->device_list_mutex); btrfs_warn_in_rcu(device->fs_info, - "duplicate device fsid:devid for %pU:%llu old:%s new:%s", - disk_super->fsid, devid, - rcu_str_deref(device->name), path); + "duplicate device %s devid %llu generation %llu scanned by %s (%d)", + path, devid, found_transid, + current->comm, + task_pid_nr(current)); return ERR_PTR(-EEXIST); } bdput(path_bdev); btrfs_info_in_rcu(device->fs_info, - "device fsid %pU devid %llu moved old:%s new:%s", - disk_super->fsid, devid, - rcu_str_deref(device->name), path); + "devid %llu device path %s changed to %s scanned by %s (%d)", + devid, rcu_str_deref(device->name), + path, current->comm, + task_pid_nr(current)); } name = rcu_string_strdup(path, GFP_NOFS); @@ -4603,15 +4606,6 @@ static void check_raid56_incompat_flag(struct btrfs_fs_info *info, u64 type) btrfs_set_fs_incompat(info, RAID56); } -#define BTRFS_MAX_DEVS(info) ((BTRFS_MAX_ITEM_SIZE(info) \ - - sizeof(struct btrfs_chunk)) \ - / sizeof(struct btrfs_stripe) + 1) - -#define BTRFS_MAX_DEVS_SYS_CHUNK ((BTRFS_SYSTEM_CHUNK_ARRAY_SIZE \ - - 2 * sizeof(struct btrfs_disk_key) \ - - 2 * sizeof(struct btrfs_chunk)) \ - / sizeof(struct btrfs_stripe) + 1) - static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans, u64 start, u64 type) { @@ -6368,99 +6362,6 @@ struct btrfs_device *btrfs_alloc_device(struct btrfs_fs_info *fs_info, return dev; } -/* Return -EIO if any error, otherwise return 0. */ -static int btrfs_check_chunk_valid(struct btrfs_fs_info *fs_info, - struct extent_buffer *leaf, - struct btrfs_chunk *chunk, u64 logical) -{ - u64 length; - u64 stripe_len; - u16 num_stripes; - u16 sub_stripes; - u64 type; - u64 features; - bool mixed = false; - - length = btrfs_chunk_length(leaf, chunk); - stripe_len = btrfs_chunk_stripe_len(leaf, chunk); - num_stripes = btrfs_chunk_num_stripes(leaf, chunk); - sub_stripes = btrfs_chunk_sub_stripes(leaf, chunk); - type = btrfs_chunk_type(leaf, chunk); - - if (!num_stripes) { - btrfs_err(fs_info, "invalid chunk num_stripes: %u", - num_stripes); - return -EIO; - } - if (!IS_ALIGNED(logical, fs_info->sectorsize)) { - btrfs_err(fs_info, "invalid chunk logical %llu", logical); - return -EIO; - } - if (btrfs_chunk_sector_size(leaf, chunk) != fs_info->sectorsize) { - btrfs_err(fs_info, "invalid chunk sectorsize %u", - btrfs_chunk_sector_size(leaf, chunk)); - return -EIO; - } - if (!length || !IS_ALIGNED(length, fs_info->sectorsize)) { - btrfs_err(fs_info, "invalid chunk length %llu", length); - return -EIO; - } - if (!is_power_of_2(stripe_len) || stripe_len != BTRFS_STRIPE_LEN) { - btrfs_err(fs_info, "invalid chunk stripe length: %llu", - stripe_len); - return -EIO; - } - if (~(BTRFS_BLOCK_GROUP_TYPE_MASK | BTRFS_BLOCK_GROUP_PROFILE_MASK) & - type) { - btrfs_err(fs_info, "unrecognized chunk type: %llu", - ~(BTRFS_BLOCK_GROUP_TYPE_MASK | - BTRFS_BLOCK_GROUP_PROFILE_MASK) & - btrfs_chunk_type(leaf, chunk)); - return -EIO; - } - - if ((type & BTRFS_BLOCK_GROUP_TYPE_MASK) == 0) { - btrfs_err(fs_info, "missing chunk type flag: 0x%llx", type); - return -EIO; - } - - if ((type & BTRFS_BLOCK_GROUP_SYSTEM) && - (type & (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_DATA))) { - btrfs_err(fs_info, - "system chunk with data or metadata type: 0x%llx", type); - return -EIO; - } - - features = btrfs_super_incompat_flags(fs_info->super_copy); - if (features & BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS) - mixed = true; - - if (!mixed) { - if ((type & BTRFS_BLOCK_GROUP_METADATA) && - (type & BTRFS_BLOCK_GROUP_DATA)) { - btrfs_err(fs_info, - "mixed chunk type in non-mixed mode: 0x%llx", type); - return -EIO; - } - } - - if ((type & BTRFS_BLOCK_GROUP_RAID10 && sub_stripes != 2) || - (type & BTRFS_BLOCK_GROUP_RAID1 && num_stripes != 2) || - (type & BTRFS_BLOCK_GROUP_RAID5 && num_stripes < 2) || - (type & BTRFS_BLOCK_GROUP_RAID6 && num_stripes < 3) || - (type & BTRFS_BLOCK_GROUP_DUP && num_stripes != 2) || - ((type & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0 && - num_stripes != 1)) { - btrfs_err(fs_info, - "invalid num_stripes:sub_stripes %u:%u for profile %llu", - num_stripes, sub_stripes, - type & BTRFS_BLOCK_GROUP_PROFILE_MASK); - return -EIO; - } - - return 0; -} - static void btrfs_report_missing_device(struct btrfs_fs_info *fs_info, u64 devid, u8 *uuid, bool error) { @@ -6491,9 +6392,15 @@ static int read_one_chunk(struct btrfs_fs_info *fs_info, struct btrfs_key *key, length = btrfs_chunk_length(leaf, chunk); num_stripes = btrfs_chunk_num_stripes(leaf, chunk); - ret = btrfs_check_chunk_valid(fs_info, leaf, chunk, logical); - if (ret) - return ret; + /* + * Only need to verify chunk item if we're reading from sys chunk array, + * as chunk item in tree block is already verified by tree-checker. + */ + if (leaf->start == BTRFS_SUPER_INFO_OFFSET) { + ret = btrfs_check_chunk_valid(fs_info, leaf, chunk, logical); + if (ret) + return ret; + } read_lock(&map_tree->map_tree.lock); em = lookup_extent_mapping(&map_tree->map_tree, logical, 1); diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index 8e8bf3246de1..65cd023b097c 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -257,6 +257,15 @@ struct btrfs_fs_devices { #define BTRFS_BIO_INLINE_CSUM_SIZE 64 +#define BTRFS_MAX_DEVS(info) ((BTRFS_MAX_ITEM_SIZE(info) \ + - sizeof(struct btrfs_chunk)) \ + / sizeof(struct btrfs_stripe) + 1) + +#define BTRFS_MAX_DEVS_SYS_CHUNK ((BTRFS_SYSTEM_CHUNK_ARRAY_SIZE \ + - 2 * sizeof(struct btrfs_disk_key) \ + - 2 * sizeof(struct btrfs_chunk)) \ + / sizeof(struct btrfs_stripe) + 1) + /* * we need the mirror number and stripe index to be passed around * the call chain while we are processing end_io (especially errors). diff --git a/fs/buffer.c b/fs/buffer.c index 4f2c847c5c2d..8d7bfb4510ec 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -2779,16 +2779,6 @@ int nobh_writepage(struct page *page, get_block_t *get_block, /* Is the page fully outside i_size? (truncate in progress) */ offset = i_size & (PAGE_SIZE-1); if (page->index >= end_index+1 || !offset) { - /* - * The page may have dirty, unmapped buffers. For example, - * they may have been added in ext3_writepage(). Make them - * freeable here, so the page does not leak. - */ -#if 0 - /* Not really sure about this - do we need this ? */ - if (page->mapping->a_ops->invalidatepage) - page->mapping->a_ops->invalidatepage(page, offset); -#endif unlock_page(page); return 0; /* don't care */ } @@ -2983,12 +2973,6 @@ int block_write_full_page(struct page *page, get_block_t *get_block, /* Is the page fully outside i_size? (truncate in progress) */ offset = i_size & (PAGE_SIZE-1); if (page->index >= end_index+1 || !offset) { - /* - * The page may have dirty, unmapped buffers. For example, - * they may have been added in ext3_writepage(). Make them - * freeable here, so the page does not leak. - */ - do_invalidatepage(page, 0, PAGE_SIZE); unlock_page(page); return 0; /* don't care */ } diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c index f822ac9e3cb0..f5bf10729a87 100644 --- a/fs/cachefiles/rdwr.c +++ b/fs/cachefiles/rdwr.c @@ -125,7 +125,7 @@ static int cachefiles_read_reissue(struct cachefiles_object *object, _debug("reissue read"); ret = bmapping->a_ops->readpage(NULL, backpage); if (ret < 0) - goto unlock_discard; + goto discard; } /* but the page may have been read before the monitor was installed, so @@ -142,6 +142,7 @@ static int cachefiles_read_reissue(struct cachefiles_object *object, unlock_discard: unlock_page(backpage); +discard: spin_lock_irq(&object->work_lock); list_del(&monitor->op_link); spin_unlock_irq(&object->work_lock); diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c index 476728bdae8c..e59b2f53a81f 100644 --- a/fs/ceph/addr.c +++ b/fs/ceph/addr.c @@ -1437,7 +1437,7 @@ static vm_fault_t ceph_filemap_fault(struct vm_fault *vmf) struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_file_info *fi = vma->vm_file->private_data; struct page *pinned_page = NULL; - loff_t off = vmf->pgoff << PAGE_SHIFT; + loff_t off = (loff_t)vmf->pgoff << PAGE_SHIFT; int want, got, err; sigset_t oldset; vm_fault_t ret = VM_FAULT_SIGBUS; diff --git a/fs/cifs/asn1.c b/fs/cifs/asn1.c index 3d19595eb352..4a9b53229fba 100644 --- a/fs/cifs/asn1.c +++ b/fs/cifs/asn1.c @@ -541,8 +541,8 @@ decode_negTokenInit(unsigned char *security_blob, int length, return 0; } else if ((cls != ASN1_CTX) || (con != ASN1_CON) || (tag != ASN1_EOC)) { - cifs_dbg(FYI, "cls = %d con = %d tag = %d end = %p (%d) exit 0\n", - cls, con, tag, end, *end); + cifs_dbg(FYI, "cls = %d con = %d tag = %d end = %p exit 0\n", + cls, con, tag, end); return 0; } @@ -552,8 +552,8 @@ decode_negTokenInit(unsigned char *security_blob, int length, return 0; } else if ((cls != ASN1_UNI) || (con != ASN1_CON) || (tag != ASN1_SEQ)) { - cifs_dbg(FYI, "cls = %d con = %d tag = %d end = %p (%d) exit 1\n", - cls, con, tag, end, *end); + cifs_dbg(FYI, "cls = %d con = %d tag = %d end = %p exit 1\n", + cls, con, tag, end); return 0; } @@ -563,8 +563,8 @@ decode_negTokenInit(unsigned char *security_blob, int length, return 0; } else if ((cls != ASN1_CTX) || (con != ASN1_CON) || (tag != ASN1_EOC)) { - cifs_dbg(FYI, "cls = %d con = %d tag = %d end = %p (%d) exit 0\n", - cls, con, tag, end, *end); + cifs_dbg(FYI, "cls = %d con = %d tag = %d end = %p exit 0\n", + cls, con, tag, end); return 0; } @@ -575,8 +575,8 @@ decode_negTokenInit(unsigned char *security_blob, int length, return 0; } else if ((cls != ASN1_UNI) || (con != ASN1_CON) || (tag != ASN1_SEQ)) { - cifs_dbg(FYI, "cls = %d con = %d tag = %d end = %p (%d) exit 1\n", - cls, con, tag, end, *end); + cifs_dbg(FYI, "cls = %d con = %d tag = %d sequence_end = %p exit 1\n", + cls, con, tag, sequence_end); return 0; } diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c index 4a38f16d944d..d30eb4350656 100644 --- a/fs/cifs/inode.c +++ b/fs/cifs/inode.c @@ -2550,13 +2550,18 @@ cifs_setattr(struct dentry *direntry, struct iattr *attrs) { struct cifs_sb_info *cifs_sb = CIFS_SB(direntry->d_sb); struct cifs_tcon *pTcon = cifs_sb_master_tcon(cifs_sb); + int rc, retries = 0; - if (pTcon->unix_ext) - return cifs_setattr_unix(direntry, attrs); - - return cifs_setattr_nounix(direntry, attrs); + do { + if (pTcon->unix_ext) + rc = cifs_setattr_unix(direntry, attrs); + else + rc = cifs_setattr_nounix(direntry, attrs); + retries++; + } while (is_retryable_error(rc) && retries < 2); /* BB: add cifs_setattr_legacy for really old servers */ + return rc; } #if 0 diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c index 3d63c76ed098..e20d170d13f6 100644 --- a/fs/cifs/smb2ops.c +++ b/fs/cifs/smb2ops.c @@ -2730,7 +2730,7 @@ crypt_message(struct TCP_Server_Info *server, int num_rqst, if (rc) { cifs_dbg(VFS, "%s: Could not get %scryption key\n", __func__, enc ? "en" : "de"); - return 0; + return rc; } rc = smb3_crypto_aead_allocate(server); diff --git a/fs/dlm/config.c b/fs/dlm/config.c index 1270551d24e3..f13d86524450 100644 --- a/fs/dlm/config.c +++ b/fs/dlm/config.c @@ -218,6 +218,7 @@ struct dlm_space { struct list_head members; struct mutex members_lock; int members_count; + struct dlm_nodes *nds; }; struct dlm_comms { @@ -426,6 +427,7 @@ static struct config_group *make_space(struct config_group *g, const char *name) INIT_LIST_HEAD(&sp->members); mutex_init(&sp->members_lock); sp->members_count = 0; + sp->nds = nds; return &sp->group; fail: @@ -447,6 +449,7 @@ static void drop_space(struct config_group *g, struct config_item *i) static void release_space(struct config_item *i) { struct dlm_space *sp = config_item_to_space(i); + kfree(sp->nds); kfree(sp); } diff --git a/fs/efivarfs/super.c b/fs/efivarfs/super.c index 5b68e4294faa..834615f13f3e 100644 --- a/fs/efivarfs/super.c +++ b/fs/efivarfs/super.c @@ -145,6 +145,9 @@ static int efivarfs_callback(efi_char16_t *name16, efi_guid_t vendor, name[len + EFI_VARIABLE_GUID_LEN+1] = '\0'; + /* replace invalid slashes like kobject_set_name_vargs does for /sys/firmware/efi/vars. */ + strreplace(name, '/', '!'); + inode = efivarfs_get_inode(sb, d_inode(root), S_IFREG | 0644, 0, is_removable); if (!inode) diff --git a/fs/exec.c b/fs/exec.c index e438bf91a54b..17d7f8a66167 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1028,10 +1028,23 @@ static int exec_mmap(struct mm_struct *mm) } } task_lock(tsk); + + local_irq_disable(); active_mm = tsk->active_mm; - tsk->mm = mm; tsk->active_mm = mm; + tsk->mm = mm; + /* + * This prevents preemption while active_mm is being loaded and + * it and mm are being updated, which could cause problems for + * lazy tlb mm refcounting when these are updated by context + * switches. Not all architectures can handle irqs off over + * activate_mm yet. + */ + if (!IS_ENABLED(CONFIG_ARCH_WANT_IRQS_OFF_ACTIVATE_MM)) + local_irq_enable(); activate_mm(active_mm, mm); + if (IS_ENABLED(CONFIG_ARCH_WANT_IRQS_OFF_ACTIVATE_MM)) + local_irq_enable(); tsk->mm->vmacache_seqnum = 0; vmacache_flush(tsk); task_unlock(tsk); diff --git a/fs/ext4/fsmap.c b/fs/ext4/fsmap.c index 4b99e2db95b8..6f3f245f3a80 100644 --- a/fs/ext4/fsmap.c +++ b/fs/ext4/fsmap.c @@ -108,6 +108,9 @@ static int ext4_getfsmap_helper(struct super_block *sb, /* Are we just counting mappings? */ if (info->gfi_head->fmh_count == 0) { + if (info->gfi_head->fmh_entries == UINT_MAX) + return EXT4_QUERY_RANGE_ABORT; + if (rec_fsblk > info->gfi_next_fsblk) info->gfi_head->fmh_entries++; diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index cee152522c28..16e5479f1095 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -5356,6 +5356,12 @@ static int ext4_do_update_inode(handle_t *handle, if (ext4_test_inode_state(inode, EXT4_STATE_NEW)) memset(raw_inode, 0, EXT4_SB(inode->i_sb)->s_inode_size); + err = ext4_inode_blocks_set(handle, raw_inode, ei); + if (err) { + spin_unlock(&ei->i_raw_lock); + goto out_brelse; + } + raw_inode->i_mode = cpu_to_le16(inode->i_mode); i_uid = i_uid_read(inode); i_gid = i_gid_read(inode); @@ -5389,11 +5395,6 @@ static int ext4_do_update_inode(handle_t *handle, EXT4_INODE_SET_XTIME(i_atime, inode, raw_inode); EXT4_EINODE_SET_XTIME(i_crtime, ei, raw_inode); - err = ext4_inode_blocks_set(handle, raw_inode, ei); - if (err) { - spin_unlock(&ei->i_raw_lock); - goto out_brelse; - } raw_inode->i_dtime = cpu_to_le32(ei->i_dtime); raw_inode->i_flags = cpu_to_le32(ei->i_flags & 0xFFFFFFFF); if (likely(!test_opt2(inode->i_sb, HURD_COMPAT))) diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c index ef552d93708e..8098255c2801 100644 --- a/fs/ext4/resize.c +++ b/fs/ext4/resize.c @@ -861,8 +861,10 @@ static int add_new_gdb(handle_t *handle, struct inode *inode, BUFFER_TRACE(dind, "get_write_access"); err = ext4_journal_get_write_access(handle, dind); - if (unlikely(err)) + if (unlikely(err)) { ext4_std_error(sb, err); + goto errout; + } /* ext4_reserve_inode_write() gets a reference on the iloc */ err = ext4_reserve_inode_write(handle, inode, &iloc); diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 9b9c539e8897..cf44e0e3f2b0 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -4750,6 +4750,7 @@ cantfind_ext4: failed_mount8: ext4_unregister_sysfs(sb); + kobject_put(&sbi->s_kobj); failed_mount7: ext4_unregister_li_request(sb); failed_mount6: @@ -5914,6 +5915,11 @@ static int ext4_quota_on(struct super_block *sb, int type, int format_id, /* Quotafile not on the same filesystem? */ if (path->dentry->d_sb != sb) return -EXDEV; + + /* Quota already enabled for this file? */ + if (IS_NOQUOTA(d_inode(path->dentry))) + return -EBUSY; + /* Journaling quota? */ if (EXT4_SB(sb)->s_qf_names[type]) { /* Quotafile not in fs root? */ diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c index fd010e2c440e..7aeb9456aec7 100644 --- a/fs/f2fs/checkpoint.c +++ b/fs/f2fs/checkpoint.c @@ -243,6 +243,8 @@ int f2fs_ra_meta_pages(struct f2fs_sb_info *sbi, block_t start, int nrpages, blkno * NAT_ENTRY_PER_BLOCK); break; case META_SIT: + if (unlikely(blkno >= TOTAL_SEGS(sbi))) + goto out; /* get sit block addr */ fio.new_blkaddr = current_sit_addr(sbi, blkno * SIT_ENTRY_PER_BLOCK); @@ -1047,8 +1049,12 @@ int f2fs_sync_dirty_inodes(struct f2fs_sb_info *sbi, enum inode_type type) get_pages(sbi, is_dir ? F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA)); retry: - if (unlikely(f2fs_cp_error(sbi))) + if (unlikely(f2fs_cp_error(sbi))) { + trace_f2fs_sync_dirty_inodes_exit(sbi->sb, is_dir, + get_pages(sbi, is_dir ? + F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA)); return -EIO; + } spin_lock(&sbi->inode_lock[type]); diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c index 959c3461da6a..aa363840f1b3 100644 --- a/fs/f2fs/dir.c +++ b/fs/f2fs/dir.c @@ -380,16 +380,15 @@ struct f2fs_dir_entry *__f2fs_find_entry(struct inode *dir, unsigned int max_depth; unsigned int level; + *res_page = NULL; + if (f2fs_has_inline_dentry(dir)) { - *res_page = NULL; de = f2fs_find_in_inline_dir(dir, fname, res_page); goto out; } - if (npages == 0) { - *res_page = NULL; + if (npages == 0) goto out; - } max_depth = F2FS_I(dir)->i_current_depth; if (unlikely(max_depth > MAX_DIR_HASH_DEPTH)) { @@ -400,7 +399,6 @@ struct f2fs_dir_entry *__f2fs_find_entry(struct inode *dir, } for (level = 0; level < max_depth; level++) { - *res_page = NULL; de = find_in_level(dir, level, fname, res_page); if (de || IS_ERR(*res_page)) break; diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c index fcb5cc026f48..d90bf4cb6333 100644 --- a/fs/f2fs/sysfs.c +++ b/fs/f2fs/sysfs.c @@ -957,4 +957,5 @@ void f2fs_unregister_sysfs(struct f2fs_sb_info *sbi) } kobject_del(&sbi->s_kobj); kobject_put(&sbi->s_kobj); + wait_for_completion(&sbi->s_kobj_unregister); } diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c index adbe8fc904e1..9861204da06f 100644 --- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -574,6 +574,7 @@ ssize_t fuse_simple_request(struct fuse_conn *fc, struct fuse_args *args) req->out.numargs = args->out.numargs; memcpy(req->out.args, args->out.args, args->out.numargs * sizeof(struct fuse_arg)); + req->out.canonical_path = args->out.canonical_path; fuse_request_send(fc, req); ret = req->out.h.error; if (!ret && args->out.argvar) { @@ -1931,13 +1932,14 @@ static ssize_t fuse_dev_do_write(struct fuse_dev *fud, cs->move_pages = 0; err = copy_out_args(cs, &req->out, nbytes); - if (req->in.h.opcode == FUSE_CANONICAL_PATH) { + fuse_copy_finish(cs); + + if (!err && req->in.h.opcode == FUSE_CANONICAL_PATH) { char *path = (char *)req->out.args[0].value; path[req->out.args[0].size - 1] = 0; - req->out.h.error = kern_path(path, 0, req->canonical_path); + req->out.h.error = kern_path(path, 0, req->out.canonical_path); } - fuse_copy_finish(cs); spin_lock(&fpq->lock); clear_bit(FR_LOCKED, &req->flags); diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c index 2c4ceb75482e..fcf5fc63f3af 100644 --- a/fs/fuse/dir.c +++ b/fs/fuse/dir.c @@ -263,50 +263,6 @@ invalid: goto out; } -/* - * Get the canonical path. Since we must translate to a path, this must be done - * in the context of the userspace daemon, however, the userspace daemon cannot - * look up paths on its own. Instead, we handle the lookup as a special case - * inside of the write request. - */ -static void fuse_dentry_canonical_path(const struct path *path, struct path *canonical_path) { - struct inode *inode = path->dentry->d_inode; - struct fuse_conn *fc = get_fuse_conn(inode); - struct fuse_req *req; - int err; - char *path_name; - - req = fuse_get_req(fc, 1); - err = PTR_ERR(req); - if (IS_ERR(req)) - goto default_path; - - path_name = (char*)__get_free_page(GFP_KERNEL); - if (!path_name) { - fuse_put_request(fc, req); - goto default_path; - } - - req->in.h.opcode = FUSE_CANONICAL_PATH; - req->in.h.nodeid = get_node_id(inode); - req->in.numargs = 0; - req->out.numargs = 1; - req->out.args[0].size = PATH_MAX; - req->out.args[0].value = path_name; - req->canonical_path = canonical_path; - req->out.argvar = 1; - fuse_request_send(fc, req); - err = req->out.h.error; - fuse_put_request(fc, req); - free_page((unsigned long)path_name); - if (!err) - return; -default_path: - canonical_path->dentry = path->dentry; - canonical_path->mnt = path->mnt; - path_get(canonical_path); -} - static int invalid_nodeid(u64 nodeid) { return !nodeid || nodeid == FUSE_ROOT_ID; @@ -325,6 +281,44 @@ static void fuse_dentry_release(struct dentry *dentry) kfree_rcu(fd, rcu); } +/* + * Get the canonical path. Since we must translate to a path, this must be done + * in the context of the userspace daemon, however, the userspace daemon cannot + * look up paths on its own. Instead, we handle the lookup as a special case + * inside of the write request. + */ +static void fuse_dentry_canonical_path(const struct path *path, + struct path *canonical_path) +{ + struct inode *inode = d_inode(path->dentry); + struct fuse_conn *fc = get_fuse_conn(inode); + FUSE_ARGS(args); + char *path_name; + int err; + + path_name = (char *)__get_free_page(GFP_KERNEL); + if (!path_name) + goto default_path; + + args.in.h.opcode = FUSE_CANONICAL_PATH; + args.in.h.nodeid = get_node_id(inode); + args.in.numargs = 0; + args.out.numargs = 1; + args.out.args[0].size = PATH_MAX; + args.out.args[0].value = path_name; + args.out.argvar = 1; + args.out.canonical_path = canonical_path; + + err = fuse_simple_request(fc, &args); + free_page((unsigned long)path_name); + if (err > 0) + return; +default_path: + canonical_path->dentry = path->dentry; + canonical_path->mnt = path->mnt; + path_get(canonical_path); +} + const struct dentry_operations fuse_dentry_operations = { .d_revalidate = fuse_dentry_revalidate, .d_init = fuse_dentry_init, diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h index 3d53a17b1f64..123f87804f51 100644 --- a/fs/fuse/fuse_i.h +++ b/fs/fuse/fuse_i.h @@ -213,6 +213,9 @@ struct fuse_out { /** Array of arguments */ struct fuse_arg args[2]; + + /* Path used for completing d_canonical_path */ + struct path *canonical_path; }; /** FUSE page descriptor */ @@ -235,6 +238,9 @@ struct fuse_args { unsigned argvar:1; unsigned numargs; struct fuse_arg args[2]; + + /* Path used for completing d_canonical_path */ + struct path *canonical_path; } out; }; @@ -371,9 +377,6 @@ struct fuse_req { /** Inode used in the request or NULL */ struct inode *inode; - /** Path used for completing d_canonical_path */ - struct path *canonical_path; - /** AIO control block */ struct fuse_io_priv *io; diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c index ccdd8c821abd..c20d71d86812 100644 --- a/fs/gfs2/glock.c +++ b/fs/gfs2/glock.c @@ -870,7 +870,8 @@ int gfs2_glock_get(struct gfs2_sbd *sdp, u64 number, out_free: kfree(gl->gl_lksb.sb_lvbptr); kmem_cache_free(cachep, gl); - atomic_dec(&sdp->sd_glock_disposal); + if (atomic_dec_and_test(&sdp->sd_glock_disposal)) + wake_up(&sdp->sd_glock_wait); out: return ret; diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c index 9448c8461e57..17001f4e9f84 100644 --- a/fs/gfs2/ops_fstype.c +++ b/fs/gfs2/ops_fstype.c @@ -161,15 +161,19 @@ static int gfs2_check_sb(struct gfs2_sbd *sdp, int silent) return -EINVAL; } - /* If format numbers match exactly, we're done. */ + if (sb->sb_fs_format != GFS2_FORMAT_FS || + sb->sb_multihost_format != GFS2_FORMAT_MULTI) { + fs_warn(sdp, "Unknown on-disk format, unable to mount\n"); + return -EINVAL; + } - if (sb->sb_fs_format == GFS2_FORMAT_FS && - sb->sb_multihost_format == GFS2_FORMAT_MULTI) - return 0; + if (sb->sb_bsize < 512 || sb->sb_bsize > PAGE_SIZE || + (sb->sb_bsize & (sb->sb_bsize - 1))) { + pr_warn("Invalid superblock size\n"); + return -EINVAL; + } - fs_warn(sdp, "Unknown on-disk format, unable to mount\n"); - - return -EINVAL; + return 0; } static void end_bio_io_page(struct bio *bio) diff --git a/fs/nfs/namespace.c b/fs/nfs/namespace.c index e5686be67be8..d57d453aecc2 100644 --- a/fs/nfs/namespace.c +++ b/fs/nfs/namespace.c @@ -30,9 +30,9 @@ int nfs_mountpoint_expiry_timeout = 500 * HZ; /* * nfs_path - reconstruct the path given an arbitrary dentry * @base - used to return pointer to the end of devname part of path - * @dentry - pointer to dentry + * @dentry_in - pointer to dentry * @buffer - result buffer - * @buflen - length of buffer + * @buflen_in - length of buffer * @flags - options (see below) * * Helper function for constructing the server pathname @@ -47,15 +47,19 @@ int nfs_mountpoint_expiry_timeout = 500 * HZ; * the original device (export) name * (if unset, the original name is returned verbatim) */ -char *nfs_path(char **p, struct dentry *dentry, char *buffer, ssize_t buflen, - unsigned flags) +char *nfs_path(char **p, struct dentry *dentry_in, char *buffer, + ssize_t buflen_in, unsigned flags) { char *end; int namelen; unsigned seq; const char *base; + struct dentry *dentry; + ssize_t buflen; rename_retry: + buflen = buflen_in; + dentry = dentry_in; end = buffer+buflen; *--end = '\0'; buflen--; diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index b2a2ff3f22a4..fe7b42c277ac 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -7600,9 +7600,11 @@ int nfs4_proc_secinfo(struct inode *dir, const struct qstr *name, * both PNFS and NON_PNFS flags set, and not having one of NON_PNFS, PNFS, or * DS flags set. */ -static int nfs4_check_cl_exchange_flags(u32 flags) +static int nfs4_check_cl_exchange_flags(u32 flags, u32 version) { - if (flags & ~EXCHGID4_FLAG_MASK_R) + if (version >= 2 && (flags & ~EXCHGID4_2_FLAG_MASK_R)) + goto out_inval; + else if (version < 2 && (flags & ~EXCHGID4_FLAG_MASK_R)) goto out_inval; if ((flags & EXCHGID4_FLAG_USE_PNFS_MDS) && (flags & EXCHGID4_FLAG_USE_NON_PNFS)) @@ -7997,7 +7999,8 @@ static int _nfs4_proc_exchange_id(struct nfs_client *clp, struct rpc_cred *cred, if (status != 0) goto out; - status = nfs4_check_cl_exchange_flags(resp->flags); + status = nfs4_check_cl_exchange_flags(resp->flags, + clp->cl_mvops->minor_version); if (status != 0) goto out; diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c index 0d20fd161225..01f7ce1ae127 100644 --- a/fs/nfsd/nfsproc.c +++ b/fs/nfsd/nfsproc.c @@ -118,6 +118,13 @@ done: return nfsd_return_attrs(nfserr, resp); } +/* Obsolete, replaced by MNTPROC_MNT. */ +static __be32 +nfsd_proc_root(struct svc_rqst *rqstp) +{ + return nfs_ok; +} + /* * Look up a path name component * Note: the dentry in the resp->fh may be negative if the file @@ -201,6 +208,13 @@ nfsd_proc_read(struct svc_rqst *rqstp) return fh_getattr(&resp->fh, &resp->stat); } +/* Reserved */ +static __be32 +nfsd_proc_writecache(struct svc_rqst *rqstp) +{ + return nfs_ok; +} + /* * Write data to a file * N.B. After this call resp->fh needs an fh_put @@ -615,6 +629,7 @@ static const struct svc_procedure nfsd_procedures2[18] = { .pc_xdrressize = ST+AT, }, [NFSPROC_ROOT] = { + .pc_func = nfsd_proc_root, .pc_decode = nfssvc_decode_void, .pc_encode = nfssvc_encode_void, .pc_argsize = sizeof(struct nfsd_void), @@ -652,6 +667,7 @@ static const struct svc_procedure nfsd_procedures2[18] = { .pc_xdrressize = ST+AT+1+NFSSVC_MAXBLKSIZE_V2/4, }, [NFSPROC_WRITECACHE] = { + .pc_func = nfsd_proc_writecache, .pc_decode = nfssvc_decode_void, .pc_encode = nfssvc_encode_void, .pc_argsize = sizeof(struct nfsd_void), diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c index a415ffa73a6d..9530a2dfabc3 100644 --- a/fs/notify/inotify/inotify_user.c +++ b/fs/notify/inotify/inotify_user.c @@ -750,9 +750,10 @@ SYSCALL_DEFINE3(inotify_add_watch, int, fd, const char __user *, pathname, goto fput_and_out; /* support stacked filesystems */ - if(path.dentry && path.dentry->d_op) { + if (path.dentry && path.dentry->d_op) { if (path.dentry->d_op->d_canonical_path) { - path.dentry->d_op->d_canonical_path(&path, &alteredpath); + path.dentry->d_op->d_canonical_path(&path, + &alteredpath); canonical_path = &alteredpath; path_put(&path); } diff --git a/fs/ntfs/inode.c b/fs/ntfs/inode.c index bd3221cbdd95..0d4b5b9843b6 100644 --- a/fs/ntfs/inode.c +++ b/fs/ntfs/inode.c @@ -1835,6 +1835,12 @@ int ntfs_read_inode_mount(struct inode *vi) brelse(bh); } + if (le32_to_cpu(m->bytes_allocated) != vol->mft_record_size) { + ntfs_error(sb, "Incorrect mft record size %u in superblock, should be %u.", + le32_to_cpu(m->bytes_allocated), vol->mft_record_size); + goto err_out; + } + /* Apply the mst fixups. */ if (post_read_mst_fixup((NTFS_RECORD*)m, vol->mft_record_size)) { /* FIXME: Try to use the $MFTMirr now. */ diff --git a/fs/quota/quota_v2.c b/fs/quota/quota_v2.c index a73e5b34db41..5d4dc0f84f20 100644 --- a/fs/quota/quota_v2.c +++ b/fs/quota/quota_v2.c @@ -283,6 +283,7 @@ static void v2r1_mem2diskdqb(void *dp, struct dquot *dquot) d->dqb_curspace = cpu_to_le64(m->dqb_curspace); d->dqb_btime = cpu_to_le64(m->dqb_btime); d->dqb_id = cpu_to_le32(from_kqid(&init_user_ns, dquot->dq_id)); + d->dqb_pad = 0; if (qtree_entry_unused(info, dp)) d->dqb_itime = cpu_to_le64(1); } diff --git a/fs/ramfs/file-nommu.c b/fs/ramfs/file-nommu.c index 3ac1f2387083..5e1ebbe639eb 100644 --- a/fs/ramfs/file-nommu.c +++ b/fs/ramfs/file-nommu.c @@ -228,7 +228,7 @@ static unsigned long ramfs_nommu_get_unmapped_area(struct file *file, if (!pages) goto out_free; - nr = find_get_pages(inode->i_mapping, &pgoff, lpages, pages); + nr = find_get_pages_contig(inode->i_mapping, pgoff, lpages, pages); if (nr != lpages) goto out_free_pages; /* leave if some pages were missing */ diff --git a/fs/reiserfs/inode.c b/fs/reiserfs/inode.c index 70387650436c..ac35ddf0dd60 100644 --- a/fs/reiserfs/inode.c +++ b/fs/reiserfs/inode.c @@ -2161,7 +2161,8 @@ out_end_trans: out_inserted_sd: clear_nlink(inode); th->t_trans_id = 0; /* so the caller can't use this handle later */ - unlock_new_inode(inode); /* OK to do even if we hadn't locked it */ + if (inode->i_state & I_NEW) + unlock_new_inode(inode); iput(inode); return err; } diff --git a/fs/reiserfs/super.c b/fs/reiserfs/super.c index de5eda33c92a..ec5716dd58c2 100644 --- a/fs/reiserfs/super.c +++ b/fs/reiserfs/super.c @@ -1264,6 +1264,10 @@ static int reiserfs_parse_options(struct super_block *s, "turned on."); return 0; } + if (qf_names[qtype] != + REISERFS_SB(s)->s_qf_names[qtype]) + kfree(qf_names[qtype]); + qf_names[qtype] = NULL; if (*arg) { /* Some filename specified? */ if (REISERFS_SB(s)->s_qf_names[qtype] && strcmp(REISERFS_SB(s)->s_qf_names[qtype], @@ -1293,10 +1297,6 @@ static int reiserfs_parse_options(struct super_block *s, else *mount_options |= 1 << REISERFS_GRPQUOTA; } else { - if (qf_names[qtype] != - REISERFS_SB(s)->s_qf_names[qtype]) - kfree(qf_names[qtype]); - qf_names[qtype] = NULL; if (qtype == USRQUOTA) *mount_options &= ~(1 << REISERFS_USRQUOTA); else diff --git a/fs/ubifs/debug.c b/fs/ubifs/debug.c index 564e330d05b1..24bbecd4752b 100644 --- a/fs/ubifs/debug.c +++ b/fs/ubifs/debug.c @@ -1129,6 +1129,7 @@ int dbg_check_dir(struct ubifs_info *c, const struct inode *dir) err = PTR_ERR(dent); if (err == -ENOENT) break; + kfree(pdent); return err; } diff --git a/fs/udf/inode.c b/fs/udf/inode.c index 4c46ebf0e773..3bf89a633836 100644 --- a/fs/udf/inode.c +++ b/fs/udf/inode.c @@ -132,21 +132,24 @@ void udf_evict_inode(struct inode *inode) struct udf_inode_info *iinfo = UDF_I(inode); int want_delete = 0; - if (!inode->i_nlink && !is_bad_inode(inode)) { - want_delete = 1; - udf_setsize(inode, 0); - udf_update_inode(inode, IS_SYNC(inode)); + if (!is_bad_inode(inode)) { + if (!inode->i_nlink) { + want_delete = 1; + udf_setsize(inode, 0); + udf_update_inode(inode, IS_SYNC(inode)); + } + if (iinfo->i_alloc_type != ICBTAG_FLAG_AD_IN_ICB && + inode->i_size != iinfo->i_lenExtents) { + udf_warn(inode->i_sb, + "Inode %lu (mode %o) has inode size %llu different from extent length %llu. Filesystem need not be standards compliant.\n", + inode->i_ino, inode->i_mode, + (unsigned long long)inode->i_size, + (unsigned long long)iinfo->i_lenExtents); + } } truncate_inode_pages_final(&inode->i_data); invalidate_inode_buffers(inode); clear_inode(inode); - if (iinfo->i_alloc_type != ICBTAG_FLAG_AD_IN_ICB && - inode->i_size != iinfo->i_lenExtents) { - udf_warn(inode->i_sb, "Inode %lu (mode %o) has inode size %llu different from extent length %llu. Filesystem need not be standards compliant.\n", - inode->i_ino, inode->i_mode, - (unsigned long long)inode->i_size, - (unsigned long long)iinfo->i_lenExtents); - } kfree(iinfo->i_ext.i_data); iinfo->i_ext.i_data = NULL; udf_clear_extent_cache(inode); diff --git a/fs/udf/super.c b/fs/udf/super.c index 1676a175cd7a..c7f6243f318b 100644 --- a/fs/udf/super.c +++ b/fs/udf/super.c @@ -1349,6 +1349,12 @@ static int udf_load_sparable_map(struct super_block *sb, (int)spm->numSparingTables); return -EIO; } + if (le32_to_cpu(spm->sizeSparingTable) > sb->s_blocksize) { + udf_err(sb, "error loading logical volume descriptor: " + "Too big sparing table size (%u)\n", + le32_to_cpu(spm->sizeSparingTable)); + return -EIO; + } for (i = 0; i < spm->numSparingTables; i++) { loc = le32_to_cpu(spm->locSparingTable[i]); @@ -1679,7 +1685,8 @@ static noinline int udf_process_sequence( "Pointers (max %u supported)\n", UDF_MAX_TD_NESTING); brelse(bh); - return -EIO; + ret = -EIO; + goto out; } vdp = (struct volDescPtr *)bh->b_data; @@ -1699,7 +1706,8 @@ static noinline int udf_process_sequence( curr = get_volume_descriptor_record(ident, bh, &data); if (IS_ERR(curr)) { brelse(bh); - return PTR_ERR(curr); + ret = PTR_ERR(curr); + goto out; } /* Descriptor we don't care about? */ if (!curr) @@ -1721,28 +1729,31 @@ static noinline int udf_process_sequence( */ if (!data.vds[VDS_POS_PRIMARY_VOL_DESC].block) { udf_err(sb, "Primary Volume Descriptor not found!\n"); - return -EAGAIN; + ret = -EAGAIN; + goto out; } ret = udf_load_pvoldesc(sb, data.vds[VDS_POS_PRIMARY_VOL_DESC].block); if (ret < 0) - return ret; + goto out; if (data.vds[VDS_POS_LOGICAL_VOL_DESC].block) { ret = udf_load_logicalvol(sb, data.vds[VDS_POS_LOGICAL_VOL_DESC].block, fileset); if (ret < 0) - return ret; + goto out; } /* Now handle prevailing Partition Descriptors */ for (i = 0; i < data.num_part_descs; i++) { ret = udf_load_partdesc(sb, data.part_descs_loc[i].rec.block); if (ret < 0) - return ret; + goto out; } - - return 0; + ret = 0; +out: + kfree(data.part_descs_loc); + return ret; } /* diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index f35e1801f1c9..fc9950a505e6 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -4920,20 +4920,25 @@ xfs_bmap_del_extent_real( flags = XFS_ILOG_CORE; if (whichfork == XFS_DATA_FORK && XFS_IS_REALTIME_INODE(ip)) { - xfs_fsblock_t bno; xfs_filblks_t len; xfs_extlen_t mod; - bno = div_u64_rem(del->br_startblock, mp->m_sb.sb_rextsize, - &mod); - ASSERT(mod == 0); len = div_u64_rem(del->br_blockcount, mp->m_sb.sb_rextsize, &mod); ASSERT(mod == 0); - error = xfs_rtfree_extent(tp, bno, (xfs_extlen_t)len); - if (error) - goto done; + if (!(bflags & XFS_BMAPI_REMAP)) { + xfs_fsblock_t bno; + + bno = div_u64_rem(del->br_startblock, + mp->m_sb.sb_rextsize, &mod); + ASSERT(mod == 0); + + error = xfs_rtfree_extent(tp, bno, (xfs_extlen_t)len); + if (error) + goto done; + } + do_fx = 0; nblks = len * mp->m_sb.sb_rextsize; qfield = XFS_TRANS_DQ_RTBCOUNT; diff --git a/fs/xfs/libxfs/xfs_rtbitmap.c b/fs/xfs/libxfs/xfs_rtbitmap.c index b228c821bae6..fe7323032e78 100644 --- a/fs/xfs/libxfs/xfs_rtbitmap.c +++ b/fs/xfs/libxfs/xfs_rtbitmap.c @@ -1020,7 +1020,6 @@ xfs_rtalloc_query_range( struct xfs_mount *mp = tp->t_mountp; xfs_rtblock_t rtstart; xfs_rtblock_t rtend; - xfs_rtblock_t rem; int is_free; int error = 0; @@ -1029,13 +1028,12 @@ xfs_rtalloc_query_range( if (low_rec->ar_startext >= mp->m_sb.sb_rextents || low_rec->ar_startext == high_rec->ar_startext) return 0; - if (high_rec->ar_startext > mp->m_sb.sb_rextents) - high_rec->ar_startext = mp->m_sb.sb_rextents; + high_rec->ar_startext = min(high_rec->ar_startext, + mp->m_sb.sb_rextents - 1); /* Iterate the bitmap, looking for discrepancies. */ rtstart = low_rec->ar_startext; - rem = high_rec->ar_startext - rtstart; - while (rem) { + while (rtstart <= high_rec->ar_startext) { /* Is the first block free? */ error = xfs_rtcheck_range(mp, tp, rtstart, 1, 1, &rtend, &is_free); @@ -1044,7 +1042,7 @@ xfs_rtalloc_query_range( /* How long does the extent go for? */ error = xfs_rtfind_forw(mp, tp, rtstart, - high_rec->ar_startext - 1, &rtend); + high_rec->ar_startext, &rtend); if (error) break; @@ -1057,7 +1055,6 @@ xfs_rtalloc_query_range( break; } - rem -= rtend - rtstart + 1; rtstart = rtend + 1; } diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c index 3d76a9e35870..75b57b683d3e 100644 --- a/fs/xfs/xfs_fsmap.c +++ b/fs/xfs/xfs_fsmap.c @@ -259,6 +259,9 @@ xfs_getfsmap_helper( /* Are we just counting mappings? */ if (info->head->fmh_count == 0) { + if (info->head->fmh_entries == UINT_MAX) + return -ECANCELED; + if (rec_daddr > info->next_daddr) info->head->fmh_entries++; diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index 484eb0adcefb..280965fc9bbd 100644 --- a/fs/xfs/xfs_rtalloc.c +++ b/fs/xfs/xfs_rtalloc.c @@ -245,6 +245,9 @@ xfs_rtallocate_extent_block( end = XFS_BLOCKTOBIT(mp, bbno + 1) - 1; i <= end; i++) { + /* Make sure we don't scan off the end of the rt volume. */ + maxlen = min(mp->m_sb.sb_rextents, i + maxlen) - i; + /* * See if there's a free extent of maxlen starting at i. * If it's not so then next will contain the first non-free. @@ -440,6 +443,14 @@ xfs_rtallocate_extent_near( */ if (bno >= mp->m_sb.sb_rextents) bno = mp->m_sb.sb_rextents - 1; + + /* Make sure we don't run off the end of the rt volume. */ + maxlen = min(mp->m_sb.sb_rextents, bno + maxlen) - bno; + if (maxlen < minlen) { + *rtblock = NULLRTBLOCK; + return 0; + } + /* * Try the exact allocation first. */ @@ -987,10 +998,13 @@ xfs_growfs_rt( xfs_ilock(mp->m_rbmip, XFS_ILOCK_EXCL); xfs_trans_ijoin(tp, mp->m_rbmip, XFS_ILOCK_EXCL); /* - * Update the bitmap inode's size. + * Update the bitmap inode's size ondisk and incore. We need + * to update the incore size so that inode inactivation won't + * punch what it thinks are "posteof" blocks. */ mp->m_rbmip->i_d.di_size = nsbp->sb_rbmblocks * nsbp->sb_blocksize; + i_size_write(VFS_I(mp->m_rbmip), mp->m_rbmip->i_d.di_size); xfs_trans_log_inode(tp, mp->m_rbmip, XFS_ILOG_CORE); /* * Get the summary inode into the transaction. @@ -998,9 +1012,12 @@ xfs_growfs_rt( xfs_ilock(mp->m_rsumip, XFS_ILOCK_EXCL); xfs_trans_ijoin(tp, mp->m_rsumip, XFS_ILOCK_EXCL); /* - * Update the summary inode's size. + * Update the summary inode's size. We need to update the + * incore size so that inode inactivation won't punch what it + * thinks are "posteof" blocks. */ mp->m_rsumip->i_d.di_size = nmp->m_rsumsize; + i_size_write(VFS_I(mp->m_rsumip), mp->m_rsumip->i_d.di_size); xfs_trans_log_inode(tp, mp->m_rsumip, XFS_ILOG_CORE); /* * Copy summary data from old to new sizes. diff --git a/gen_headers_arm.bp b/gen_headers_arm.bp index 1446700ba42f..b5a546f124ac 100644 --- a/gen_headers_arm.bp +++ b/gen_headers_arm.bp @@ -231,6 +231,7 @@ gen_headers_out_arm = [ "linux/fou.h", "linux/fpga-dfl.h", "linux/fs.h", + "linux/fscrypt.h", "linux/fsi.h", "linux/fsl_hypervisor.h", "linux/fsmap.h", @@ -633,6 +634,7 @@ gen_headers_out_arm = [ "linux/watchdog.h", "linux/wil6210_uapi.h", "linux/wimax.h", + "linux/wireguard.h", "linux/wireless.h", "linux/wmi.h", "linux/x25.h", @@ -641,28 +643,27 @@ gen_headers_out_arm = [ "linux/xilinx-v4l2-controls.h", "linux/zorro.h", "linux/zorro_ids.h", - "linux/fscrypt.h", + "media/msm_cam_sensor.h", + "media/msm_camera.h", + "media/msm_camsensor_sdk.h", "media/msm_cvp_private.h", "media/msm_cvp_utils.h", + "media/msm_fd.h", + "media/msm_jpeg.h", + "media/msm_jpeg_dma.h", "media/msm_media_info.h", "media/msm_sde_rotator.h", "media/msm_vidc.h", "media/msm_vidc_private.h", "media/msm_vidc_utils.h", + "media/msmb_camera.h", + "media/msmb_generic_buf_mgr.h", + "media/msmb_isp.h", + "media/msmb_ispif.h", + "media/msmb_pproc.h", "media/radio-iris-commands.h", "media/radio-iris.h", "media/synx.h", - "media/msmb_isp.h", - "media/msm_camsensor_sdk.h", - "media/msm_cam_sensor.h", - "media/msmb_camera.h", - "media/msmb_generic_buf_mgr.h", - "media/msmb_ispif.h", - "media/msmb_pproc.h", - "media/msm_jpeg.h", - "media/msm_fd.h", - "media/msm_jpeg_dma.h", - "media/msm_camera.h", "misc/cxl.h", "misc/ocxl.h", "misc/wigig_sensing_uapi.h", diff --git a/gen_headers_arm64.bp b/gen_headers_arm64.bp index d73e96d1d767..17616f395cf2 100644 --- a/gen_headers_arm64.bp +++ b/gen_headers_arm64.bp @@ -226,6 +226,7 @@ gen_headers_out_arm64 = [ "linux/fou.h", "linux/fpga-dfl.h", "linux/fs.h", + "linux/fscrypt.h", "linux/fsi.h", "linux/fsl_hypervisor.h", "linux/fsmap.h", @@ -627,6 +628,7 @@ gen_headers_out_arm64 = [ "linux/watchdog.h", "linux/wil6210_uapi.h", "linux/wimax.h", + "linux/wireguard.h", "linux/wireless.h", "linux/wmi.h", "linux/x25.h", @@ -635,28 +637,27 @@ gen_headers_out_arm64 = [ "linux/xilinx-v4l2-controls.h", "linux/zorro.h", "linux/zorro_ids.h", - "linux/fscrypt.h", + "media/msm_cam_sensor.h", + "media/msm_camera.h", + "media/msm_camsensor_sdk.h", "media/msm_cvp_private.h", "media/msm_cvp_utils.h", + "media/msm_fd.h", + "media/msm_jpeg.h", + "media/msm_jpeg_dma.h", "media/msm_media_info.h", "media/msm_sde_rotator.h", + "media/msm_vidc.h", "media/msm_vidc_private.h", "media/msm_vidc_utils.h", - "media/msm_vidc.h", + "media/msmb_camera.h", + "media/msmb_generic_buf_mgr.h", + "media/msmb_isp.h", + "media/msmb_ispif.h", + "media/msmb_pproc.h", "media/radio-iris-commands.h", "media/radio-iris.h", "media/synx.h", - "media/msmb_isp.h", - "media/msm_camsensor_sdk.h", - "media/msm_cam_sensor.h", - "media/msmb_camera.h", - "media/msmb_generic_buf_mgr.h", - "media/msmb_ispif.h", - "media/msmb_pproc.h", - "media/msm_jpeg.h", - "media/msm_fd.h", - "media/msm_jpeg_dma.h", - "media/msm_camera.h", "misc/cxl.h", "misc/ocxl.h", "misc/wigig_sensing_uapi.h", diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h index 15fd0277ffa6..5901b0005929 100644 --- a/include/asm-generic/pgtable.h +++ b/include/asm-generic/pgtable.h @@ -1115,10 +1115,6 @@ static inline bool arch_has_pfn_modify_check(void) #endif /* !__ASSEMBLY__ */ -#ifndef io_remap_pfn_range -#define io_remap_pfn_range remap_pfn_range -#endif - #ifndef has_transparent_hugepage #ifdef CONFIG_TRANSPARENT_HUGEPAGE #define has_transparent_hugepage() 1 diff --git a/include/crypto/blake2s.h b/include/crypto/blake2s.h new file mode 100644 index 000000000000..b471deac28ff --- /dev/null +++ b/include/crypto/blake2s.h @@ -0,0 +1,106 @@ +/* SPDX-License-Identifier: GPL-2.0 OR MIT */ +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#ifndef BLAKE2S_H +#define BLAKE2S_H + +#include +#include +#include + +#include + +enum blake2s_lengths { + BLAKE2S_BLOCK_SIZE = 64, + BLAKE2S_HASH_SIZE = 32, + BLAKE2S_KEY_SIZE = 32, + + BLAKE2S_128_HASH_SIZE = 16, + BLAKE2S_160_HASH_SIZE = 20, + BLAKE2S_224_HASH_SIZE = 28, + BLAKE2S_256_HASH_SIZE = 32, +}; + +struct blake2s_state { + u32 h[8]; + u32 t[2]; + u32 f[2]; + u8 buf[BLAKE2S_BLOCK_SIZE]; + unsigned int buflen; + unsigned int outlen; +}; + +enum blake2s_iv { + BLAKE2S_IV0 = 0x6A09E667UL, + BLAKE2S_IV1 = 0xBB67AE85UL, + BLAKE2S_IV2 = 0x3C6EF372UL, + BLAKE2S_IV3 = 0xA54FF53AUL, + BLAKE2S_IV4 = 0x510E527FUL, + BLAKE2S_IV5 = 0x9B05688CUL, + BLAKE2S_IV6 = 0x1F83D9ABUL, + BLAKE2S_IV7 = 0x5BE0CD19UL, +}; + +void blake2s_update(struct blake2s_state *state, const u8 *in, size_t inlen); +void blake2s_final(struct blake2s_state *state, u8 *out); + +static inline void blake2s_init_param(struct blake2s_state *state, + const u32 param) +{ + *state = (struct blake2s_state){{ + BLAKE2S_IV0 ^ param, + BLAKE2S_IV1, + BLAKE2S_IV2, + BLAKE2S_IV3, + BLAKE2S_IV4, + BLAKE2S_IV5, + BLAKE2S_IV6, + BLAKE2S_IV7, + }}; +} + +static inline void blake2s_init(struct blake2s_state *state, + const size_t outlen) +{ + blake2s_init_param(state, 0x01010000 | outlen); + state->outlen = outlen; +} + +static inline void blake2s_init_key(struct blake2s_state *state, + const size_t outlen, const void *key, + const size_t keylen) +{ + WARN_ON(IS_ENABLED(DEBUG) && (!outlen || outlen > BLAKE2S_HASH_SIZE || + !key || !keylen || keylen > BLAKE2S_KEY_SIZE)); + + blake2s_init_param(state, 0x01010000 | keylen << 8 | outlen); + memcpy(state->buf, key, keylen); + state->buflen = BLAKE2S_BLOCK_SIZE; + state->outlen = outlen; +} + +static inline void blake2s(u8 *out, const u8 *in, const u8 *key, + const size_t outlen, const size_t inlen, + const size_t keylen) +{ + struct blake2s_state state; + + WARN_ON(IS_ENABLED(DEBUG) && ((!in && inlen > 0) || !out || !outlen || + outlen > BLAKE2S_HASH_SIZE || keylen > BLAKE2S_KEY_SIZE || + (!key && keylen))); + + if (keylen) + blake2s_init_key(&state, outlen, key, keylen); + else + blake2s_init(&state, outlen); + + blake2s_update(&state, in, inlen); + blake2s_final(&state, out); +} + +void blake2s256_hmac(u8 *out, const u8 *in, const u8 *key, const size_t inlen, + const size_t keylen); + +#endif /* BLAKE2S_H */ diff --git a/include/crypto/chacha.h b/include/crypto/chacha.h index 3d261f5cd156..d64f022e508a 100644 --- a/include/crypto/chacha.h +++ b/include/crypto/chacha.h @@ -15,9 +15,8 @@ #ifndef _CRYPTO_CHACHA_H #define _CRYPTO_CHACHA_H -#include +#include #include -#include /* 32-bit stream position, then 96-bit nonce (RFC7539 convention) */ #define CHACHA_IV_SIZE 16 @@ -25,29 +24,75 @@ #define CHACHA_KEY_SIZE 32 #define CHACHA_BLOCK_SIZE 64 +#define CHACHA_STATE_WORDS (CHACHA_BLOCK_SIZE / sizeof(u32)) + /* 192-bit nonce, then 64-bit stream position */ #define XCHACHA_IV_SIZE 32 -struct chacha_ctx { - u32 key[8]; - int nrounds; -}; - -void chacha_block(u32 *state, u8 *stream, int nrounds); +void chacha_block_generic(u32 *state, u8 *stream, int nrounds); static inline void chacha20_block(u32 *state, u8 *stream) { - chacha_block(state, stream, 20); + chacha_block_generic(state, stream, 20); } -void hchacha_block(const u32 *in, u32 *out, int nrounds); -void crypto_chacha_init(u32 *state, struct chacha_ctx *ctx, u8 *iv); +void hchacha_block_arch(const u32 *state, u32 *out, int nrounds); +void hchacha_block_generic(const u32 *state, u32 *out, int nrounds); -int crypto_chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key, - unsigned int keysize); -int crypto_chacha12_setkey(struct crypto_skcipher *tfm, const u8 *key, - unsigned int keysize); +static inline void hchacha_block(const u32 *state, u32 *out, int nrounds) +{ + if (IS_ENABLED(CONFIG_CRYPTO_ARCH_HAVE_LIB_CHACHA)) + hchacha_block_arch(state, out, nrounds); + else + hchacha_block_generic(state, out, nrounds); +} -int crypto_chacha_crypt(struct skcipher_request *req); -int crypto_xchacha_crypt(struct skcipher_request *req); +void chacha_init_arch(u32 *state, const u32 *key, const u8 *iv); +static inline void chacha_init_generic(u32 *state, const u32 *key, const u8 *iv) +{ + state[0] = 0x61707865; /* "expa" */ + state[1] = 0x3320646e; /* "nd 3" */ + state[2] = 0x79622d32; /* "2-by" */ + state[3] = 0x6b206574; /* "te k" */ + state[4] = key[0]; + state[5] = key[1]; + state[6] = key[2]; + state[7] = key[3]; + state[8] = key[4]; + state[9] = key[5]; + state[10] = key[6]; + state[11] = key[7]; + state[12] = get_unaligned_le32(iv + 0); + state[13] = get_unaligned_le32(iv + 4); + state[14] = get_unaligned_le32(iv + 8); + state[15] = get_unaligned_le32(iv + 12); +} + +static inline void chacha_init(u32 *state, const u32 *key, const u8 *iv) +{ + if (IS_ENABLED(CONFIG_CRYPTO_ARCH_HAVE_LIB_CHACHA)) + chacha_init_arch(state, key, iv); + else + chacha_init_generic(state, key, iv); +} + +void chacha_crypt_arch(u32 *state, u8 *dst, const u8 *src, + unsigned int bytes, int nrounds); +void chacha_crypt_generic(u32 *state, u8 *dst, const u8 *src, + unsigned int bytes, int nrounds); + +static inline void chacha_crypt(u32 *state, u8 *dst, const u8 *src, + unsigned int bytes, int nrounds) +{ + if (IS_ENABLED(CONFIG_CRYPTO_ARCH_HAVE_LIB_CHACHA)) + chacha_crypt_arch(state, dst, src, bytes, nrounds); + else + chacha_crypt_generic(state, dst, src, bytes, nrounds); +} + +static inline void chacha20_crypt(u32 *state, u8 *dst, const u8 *src, + unsigned int bytes) +{ + chacha_crypt(state, dst, src, bytes, 20); +} #endif /* _CRYPTO_CHACHA_H */ diff --git a/include/crypto/chacha20poly1305.h b/include/crypto/chacha20poly1305.h new file mode 100644 index 000000000000..d2ac3ff7dc1e --- /dev/null +++ b/include/crypto/chacha20poly1305.h @@ -0,0 +1,50 @@ +/* SPDX-License-Identifier: GPL-2.0 OR MIT */ +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#ifndef __CHACHA20POLY1305_H +#define __CHACHA20POLY1305_H + +#include +#include + +enum chacha20poly1305_lengths { + XCHACHA20POLY1305_NONCE_SIZE = 24, + CHACHA20POLY1305_KEY_SIZE = 32, + CHACHA20POLY1305_AUTHTAG_SIZE = 16 +}; + +void chacha20poly1305_encrypt(u8 *dst, const u8 *src, const size_t src_len, + const u8 *ad, const size_t ad_len, + const u64 nonce, + const u8 key[CHACHA20POLY1305_KEY_SIZE]); + +bool __must_check +chacha20poly1305_decrypt(u8 *dst, const u8 *src, const size_t src_len, + const u8 *ad, const size_t ad_len, const u64 nonce, + const u8 key[CHACHA20POLY1305_KEY_SIZE]); + +void xchacha20poly1305_encrypt(u8 *dst, const u8 *src, const size_t src_len, + const u8 *ad, const size_t ad_len, + const u8 nonce[XCHACHA20POLY1305_NONCE_SIZE], + const u8 key[CHACHA20POLY1305_KEY_SIZE]); + +bool __must_check xchacha20poly1305_decrypt( + u8 *dst, const u8 *src, const size_t src_len, const u8 *ad, + const size_t ad_len, const u8 nonce[XCHACHA20POLY1305_NONCE_SIZE], + const u8 key[CHACHA20POLY1305_KEY_SIZE]); + +bool chacha20poly1305_encrypt_sg_inplace(struct scatterlist *src, size_t src_len, + const u8 *ad, const size_t ad_len, + const u64 nonce, + const u8 key[CHACHA20POLY1305_KEY_SIZE]); + +bool chacha20poly1305_decrypt_sg_inplace(struct scatterlist *src, size_t src_len, + const u8 *ad, const size_t ad_len, + const u64 nonce, + const u8 key[CHACHA20POLY1305_KEY_SIZE]); + +bool chacha20poly1305_selftest(void); + +#endif /* __CHACHA20POLY1305_H */ diff --git a/include/crypto/curve25519.h b/include/crypto/curve25519.h new file mode 100644 index 000000000000..9ecb3c1f0f15 --- /dev/null +++ b/include/crypto/curve25519.h @@ -0,0 +1,73 @@ +/* SPDX-License-Identifier: GPL-2.0 OR MIT */ +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#ifndef CURVE25519_H +#define CURVE25519_H + +#include // For crypto_memneq. +#include +#include + +enum curve25519_lengths { + CURVE25519_KEY_SIZE = 32 +}; + +extern const u8 curve25519_null_point[]; +extern const u8 curve25519_base_point[]; + +void curve25519_generic(u8 out[CURVE25519_KEY_SIZE], + const u8 scalar[CURVE25519_KEY_SIZE], + const u8 point[CURVE25519_KEY_SIZE]); + +void curve25519_arch(u8 out[CURVE25519_KEY_SIZE], + const u8 scalar[CURVE25519_KEY_SIZE], + const u8 point[CURVE25519_KEY_SIZE]); + +void curve25519_base_arch(u8 pub[CURVE25519_KEY_SIZE], + const u8 secret[CURVE25519_KEY_SIZE]); + +static inline +bool __must_check curve25519(u8 mypublic[CURVE25519_KEY_SIZE], + const u8 secret[CURVE25519_KEY_SIZE], + const u8 basepoint[CURVE25519_KEY_SIZE]) +{ + if (IS_ENABLED(CONFIG_CRYPTO_ARCH_HAVE_LIB_CURVE25519) && + (!IS_ENABLED(CONFIG_CRYPTO_CURVE25519_X86) || IS_ENABLED(CONFIG_AS_ADX))) + curve25519_arch(mypublic, secret, basepoint); + else + curve25519_generic(mypublic, secret, basepoint); + return crypto_memneq(mypublic, curve25519_null_point, + CURVE25519_KEY_SIZE); +} + +static inline bool +__must_check curve25519_generate_public(u8 pub[CURVE25519_KEY_SIZE], + const u8 secret[CURVE25519_KEY_SIZE]) +{ + if (unlikely(!crypto_memneq(secret, curve25519_null_point, + CURVE25519_KEY_SIZE))) + return false; + + if (IS_ENABLED(CONFIG_CRYPTO_ARCH_HAVE_LIB_CURVE25519) && + (!IS_ENABLED(CONFIG_CRYPTO_CURVE25519_X86) || IS_ENABLED(CONFIG_AS_ADX))) + curve25519_base_arch(pub, secret); + else + curve25519_generic(pub, secret, curve25519_base_point); + return crypto_memneq(pub, curve25519_null_point, CURVE25519_KEY_SIZE); +} + +static inline void curve25519_clamp_secret(u8 secret[CURVE25519_KEY_SIZE]) +{ + secret[0] &= 248; + secret[31] = (secret[31] & 127) | 64; +} + +static inline void curve25519_generate_secret(u8 secret[CURVE25519_KEY_SIZE]) +{ + get_random_bytes_wait(secret, CURVE25519_KEY_SIZE); + curve25519_clamp_secret(secret); +} + +#endif /* CURVE25519_H */ diff --git a/include/crypto/internal/blake2s.h b/include/crypto/internal/blake2s.h new file mode 100644 index 000000000000..74ff77032e52 --- /dev/null +++ b/include/crypto/internal/blake2s.h @@ -0,0 +1,24 @@ +/* SPDX-License-Identifier: GPL-2.0 OR MIT */ + +#ifndef BLAKE2S_INTERNAL_H +#define BLAKE2S_INTERNAL_H + +#include + +struct blake2s_tfm_ctx { + u8 key[BLAKE2S_KEY_SIZE]; + unsigned int keylen; +}; + +void blake2s_compress_generic(struct blake2s_state *state,const u8 *block, + size_t nblocks, const u32 inc); + +void blake2s_compress_arch(struct blake2s_state *state,const u8 *block, + size_t nblocks, const u32 inc); + +static inline void blake2s_set_lastblock(struct blake2s_state *state) +{ + state->f[0] = -1; +} + +#endif /* BLAKE2S_INTERNAL_H */ diff --git a/include/crypto/internal/chacha.h b/include/crypto/internal/chacha.h new file mode 100644 index 000000000000..b085dc1ac151 --- /dev/null +++ b/include/crypto/internal/chacha.h @@ -0,0 +1,43 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef _CRYPTO_INTERNAL_CHACHA_H +#define _CRYPTO_INTERNAL_CHACHA_H + +#include +#include +#include + +struct chacha_ctx { + u32 key[8]; + int nrounds; +}; + +static inline int chacha_setkey(struct crypto_skcipher *tfm, const u8 *key, + unsigned int keysize, int nrounds) +{ + struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm); + int i; + + if (keysize != CHACHA_KEY_SIZE) + return -EINVAL; + + for (i = 0; i < ARRAY_SIZE(ctx->key); i++) + ctx->key[i] = get_unaligned_le32(key + i * sizeof(u32)); + + ctx->nrounds = nrounds; + return 0; +} + +static inline int chacha20_setkey(struct crypto_skcipher *tfm, const u8 *key, + unsigned int keysize) +{ + return chacha_setkey(tfm, key, keysize, 20); +} + +static inline int chacha12_setkey(struct crypto_skcipher *tfm, const u8 *key, + unsigned int keysize) +{ + return chacha_setkey(tfm, key, keysize, 12); +} + +#endif /* _CRYPTO_CHACHA_H */ diff --git a/include/crypto/internal/poly1305.h b/include/crypto/internal/poly1305.h new file mode 100644 index 000000000000..064e52ca5248 --- /dev/null +++ b/include/crypto/internal/poly1305.h @@ -0,0 +1,33 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Common values for the Poly1305 algorithm + */ + +#ifndef _CRYPTO_INTERNAL_POLY1305_H +#define _CRYPTO_INTERNAL_POLY1305_H + +#include +#include +#include + +/* + * Poly1305 core functions. These only accept whole blocks; the caller must + * handle any needed block buffering and padding. 'hibit' must be 1 for any + * full blocks, or 0 for the final block if it had to be padded. If 'nonce' is + * non-NULL, then it's added at the end to compute the Poly1305 MAC. Otherwise, + * only the ε-almost-∆-universal hash function (not the full MAC) is computed. + */ + +void poly1305_core_setkey(struct poly1305_core_key *key, const u8 *raw_key); +static inline void poly1305_core_init(struct poly1305_state *state) +{ + *state = (struct poly1305_state){}; +} + +void poly1305_core_blocks(struct poly1305_state *state, + const struct poly1305_core_key *key, const void *src, + unsigned int nblocks, u32 hibit); +void poly1305_core_emit(const struct poly1305_state *state, const u32 nonce[4], + void *dst); + +#endif diff --git a/include/crypto/nhpoly1305.h b/include/crypto/nhpoly1305.h index 53c04423c582..306925fea190 100644 --- a/include/crypto/nhpoly1305.h +++ b/include/crypto/nhpoly1305.h @@ -7,7 +7,7 @@ #define _NHPOLY1305_H #include -#include +#include /* NH parameterization: */ @@ -33,7 +33,7 @@ #define NHPOLY1305_KEY_SIZE (POLY1305_BLOCK_SIZE + NH_KEY_BYTES) struct nhpoly1305_key { - struct poly1305_key poly_key; + struct poly1305_core_key poly_key; u32 nh_key[NH_KEY_WORDS]; }; diff --git a/include/crypto/poly1305.h b/include/crypto/poly1305.h index 34317ed2071e..f1f67fc749cf 100644 --- a/include/crypto/poly1305.h +++ b/include/crypto/poly1305.h @@ -13,52 +13,85 @@ #define POLY1305_KEY_SIZE 32 #define POLY1305_DIGEST_SIZE 16 +/* The poly1305_key and poly1305_state types are mostly opaque and + * implementation-defined. Limbs might be in base 2^64 or base 2^26, or + * different yet. The union type provided keeps these 64-bit aligned for the + * case in which this is implemented using 64x64 multiplies. + */ + struct poly1305_key { - u32 r[5]; /* key, base 2^26 */ + union { + u32 r[5]; + u64 r64[3]; + }; +}; + +struct poly1305_core_key { + struct poly1305_key key; + struct poly1305_key precomputed_s; }; struct poly1305_state { - u32 h[5]; /* accumulator, base 2^26 */ + union { + u32 h[5]; + u64 h64[3]; + }; }; struct poly1305_desc_ctx { - /* key */ - struct poly1305_key r; - /* finalize key */ - u32 s[4]; - /* accumulator */ - struct poly1305_state h; /* partial buffer */ u8 buf[POLY1305_BLOCK_SIZE]; /* bytes used in partial buffer */ unsigned int buflen; - /* r key has been set */ - bool rset; - /* s key has been set */ + /* how many keys have been set in r[] */ + unsigned short rset; + /* whether s[] has been set */ bool sset; + /* finalize key */ + u32 s[4]; + /* accumulator */ + struct poly1305_state h; + /* key */ + union { + struct poly1305_key opaque_r[CONFIG_CRYPTO_LIB_POLY1305_RSIZE]; + struct poly1305_core_key core_r; + }; }; -/* - * Poly1305 core functions. These implement the ε-almost-∆-universal hash - * function underlying the Poly1305 MAC, i.e. they don't add an encrypted nonce - * ("s key") at the end. They also only support block-aligned inputs. - */ -void poly1305_core_setkey(struct poly1305_key *key, const u8 *raw_key); -static inline void poly1305_core_init(struct poly1305_state *state) -{ - memset(state->h, 0, sizeof(state->h)); -} -void poly1305_core_blocks(struct poly1305_state *state, - const struct poly1305_key *key, - const void *src, unsigned int nblocks); -void poly1305_core_emit(const struct poly1305_state *state, void *dst); +void poly1305_init_arch(struct poly1305_desc_ctx *desc, const u8 *key); +void poly1305_init_generic(struct poly1305_desc_ctx *desc, const u8 *key); -/* Crypto API helper functions for the Poly1305 MAC */ -int crypto_poly1305_init(struct shash_desc *desc); -unsigned int crypto_poly1305_setdesckey(struct poly1305_desc_ctx *dctx, - const u8 *src, unsigned int srclen); -int crypto_poly1305_update(struct shash_desc *desc, - const u8 *src, unsigned int srclen); -int crypto_poly1305_final(struct shash_desc *desc, u8 *dst); +static inline void poly1305_init(struct poly1305_desc_ctx *desc, const u8 *key) +{ + if (IS_ENABLED(CONFIG_CRYPTO_ARCH_HAVE_LIB_POLY1305)) + poly1305_init_arch(desc, key); + else + poly1305_init_generic(desc, key); +} + +void poly1305_update_arch(struct poly1305_desc_ctx *desc, const u8 *src, + unsigned int nbytes); +void poly1305_update_generic(struct poly1305_desc_ctx *desc, const u8 *src, + unsigned int nbytes); + +static inline void poly1305_update(struct poly1305_desc_ctx *desc, + const u8 *src, unsigned int nbytes) +{ + if (IS_ENABLED(CONFIG_CRYPTO_ARCH_HAVE_LIB_POLY1305)) + poly1305_update_arch(desc, src, nbytes); + else + poly1305_update_generic(desc, src, nbytes); +} + +void poly1305_final_arch(struct poly1305_desc_ctx *desc, u8 *digest); +void poly1305_final_generic(struct poly1305_desc_ctx *desc, u8 *digest); + +static inline void poly1305_final(struct poly1305_desc_ctx *desc, u8 *digest) +{ + if (IS_ENABLED(CONFIG_CRYPTO_ARCH_HAVE_LIB_POLY1305)) + poly1305_final_arch(desc, digest); + else + poly1305_final_generic(desc, digest); +} #endif diff --git a/include/linux/dcache.h b/include/linux/dcache.h index 45b20e0d6f77..ee2d177dc0e8 100644 --- a/include/linux/dcache.h +++ b/include/linux/dcache.h @@ -150,8 +150,9 @@ struct dentry_operations { struct vfsmount *(*d_automount)(struct path *); int (*d_manage)(const struct path *, bool); struct dentry *(*d_real)(struct dentry *, const struct inode *); - void (*d_canonical_path)(const struct path *, struct path *); - ANDROID_KABI_RESERVE(1); + + ANDROID_KABI_USE(1, void (*d_canonical_path)(const struct path *, + struct path *)); ANDROID_KABI_RESERVE(2); ANDROID_KABI_RESERVE(3); ANDROID_KABI_RESERVE(4); diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h index fd1ce10553bf..f3eb2b2e05f4 100644 --- a/include/linux/fsnotify.h +++ b/include/linux/fsnotify.h @@ -211,11 +211,20 @@ static inline void fsnotify_open(struct file *file) { const struct path *path = &file->f_path; struct inode *inode = file_inode(file); + struct path lower_path; __u32 mask = FS_OPEN; if (S_ISDIR(inode->i_mode)) mask |= FS_ISDIR; + if (path->dentry->d_op && path->dentry->d_op->d_canonical_path) { + path->dentry->d_op->d_canonical_path(path, &lower_path); + fsnotify_parent(&lower_path, NULL, mask); + fsnotify(lower_path.dentry->d_inode, mask, &lower_path, + FSNOTIFY_EVENT_PATH, NULL, 0); + path_put(&lower_path); + } + fsnotify_parent(path, NULL, mask); fsnotify(inode, mask, path, FSNOTIFY_EVENT_PATH, NULL, 0); } diff --git a/include/linux/hil_mlc.h b/include/linux/hil_mlc.h index 774f7d3b8f6a..369221fd5518 100644 --- a/include/linux/hil_mlc.h +++ b/include/linux/hil_mlc.h @@ -103,7 +103,7 @@ struct hilse_node { /* Methods for back-end drivers, e.g. hp_sdc_mlc */ typedef int (hil_mlc_cts) (hil_mlc *mlc); -typedef void (hil_mlc_out) (hil_mlc *mlc); +typedef int (hil_mlc_out) (hil_mlc *mlc); typedef int (hil_mlc_in) (hil_mlc *mlc, suseconds_t timeout); struct hil_mlc_devinfo { diff --git a/include/linux/icmpv6.h b/include/linux/icmpv6.h index a8f888976137..024b7a4cd98e 100644 --- a/include/linux/icmpv6.h +++ b/include/linux/icmpv6.h @@ -22,12 +22,22 @@ extern int inet6_unregister_icmp_sender(ip6_icmp_send_t *fn); int ip6_err_gen_icmpv6_unreach(struct sk_buff *skb, int nhs, int type, unsigned int data_len); +#if IS_ENABLED(CONFIG_NF_NAT) +void icmpv6_ndo_send(struct sk_buff *skb_in, u8 type, u8 code, __u32 info); +#else +#define icmpv6_ndo_send icmpv6_send +#endif + #else static inline void icmpv6_send(struct sk_buff *skb, u8 type, u8 code, __u32 info) { +} +static inline void icmpv6_ndo_send(struct sk_buff *skb, + u8 type, u8 code, __u32 info) +{ } #endif diff --git a/include/linux/mm.h b/include/linux/mm.h index 803c7e7589e2..ac4819fe36c0 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2745,6 +2745,15 @@ static inline vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, return VM_FAULT_NOPAGE; } +#ifndef io_remap_pfn_range +static inline int io_remap_pfn_range(struct vm_area_struct *vma, + unsigned long addr, unsigned long pfn, + unsigned long size, pgprot_t prot) +{ + return remap_pfn_range(vma, addr, pfn, size, pgprot_decrypted(prot)); +} +#endif + static inline vm_fault_t vmf_error(int err) { if (err == -ENOMEM) diff --git a/include/linux/mtd/pfow.h b/include/linux/mtd/pfow.h index 122f3439e1af..c65d7a3be3c6 100644 --- a/include/linux/mtd/pfow.h +++ b/include/linux/mtd/pfow.h @@ -128,7 +128,7 @@ static inline void print_drs_error(unsigned dsr) if (!(dsr & DSR_AVAILABLE)) printk(KERN_NOTICE"DSR.15: (0) Device not Available\n"); - if (prog_status & 0x03) + if ((prog_status & 0x03) == 0x03) printk(KERN_NOTICE"DSR.9,8: (11) Attempt to program invalid " "half with 41h command\n"); else if (prog_status & 0x02) diff --git a/include/linux/overflow.h b/include/linux/overflow.h index 15eb85de9226..4564a175e681 100644 --- a/include/linux/overflow.h +++ b/include/linux/overflow.h @@ -3,6 +3,7 @@ #define __LINUX_OVERFLOW_H #include +#include /* * In the fallback code below, we need to compute the minimum and diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 667eb126fade..184ffd40cb82 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -1363,6 +1363,11 @@ static inline void skb_mark_not_on_list(struct sk_buff *skb) skb->next = NULL; } +/* Iterate through singly-linked GSO fragments of an skb. */ +#define skb_list_walk_safe(first, skb, next_skb) \ + for ((skb) = (first), (next_skb) = (skb) ? (skb)->next : NULL; (skb); \ + (skb) = (next_skb), (next_skb) = (skb) ? (skb)->next : NULL) + static inline void skb_list_del_init(struct sk_buff *skb) { __list_del_entry(&skb->list); diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h index a5a3cfc3c2fa..992fa3dc1cd5 100644 --- a/include/linux/timekeeping.h +++ b/include/linux/timekeeping.h @@ -113,6 +113,34 @@ static inline ktime_t ktime_get_coarse_clocktai(void) return ktime_get_coarse_with_offset(TK_OFFS_TAI); } +static inline ktime_t ktime_get_coarse(void) +{ + struct timespec64 ts; + + ktime_get_coarse_ts64(&ts); + return timespec64_to_ktime(ts); +} + +static inline u64 ktime_get_coarse_ns(void) +{ + return ktime_to_ns(ktime_get_coarse()); +} + +static inline u64 ktime_get_coarse_real_ns(void) +{ + return ktime_to_ns(ktime_get_coarse_real()); +} + +static inline u64 ktime_get_coarse_boottime_ns(void) +{ + return ktime_to_ns(ktime_get_coarse_boottime()); +} + +static inline u64 ktime_get_coarse_clocktai_ns(void) +{ + return ktime_to_ns(ktime_get_coarse_clocktai()); +} + /** * ktime_mono_to_real - Convert monotonic time to clock realtime */ diff --git a/include/linux/usb/pd.h b/include/linux/usb/pd.h index f2162e0fe531..bdf4c88d2aa0 100644 --- a/include/linux/usb/pd.h +++ b/include/linux/usb/pd.h @@ -451,6 +451,7 @@ static inline unsigned int rdo_max_power(u32 rdo) #define PD_T_ERROR_RECOVERY 100 /* minimum 25 is insufficient */ #define PD_T_SRCSWAPSTDBY 625 /* Maximum of 650ms */ #define PD_T_NEWSRC 250 /* Maximum of 275ms */ +#define PD_T_SWAP_SRC_START 20 /* Minimum of 20ms */ #define PD_T_DRP_TRY 100 /* 75 - 150 ms */ #define PD_T_DRP_TRYWAIT 600 /* 400 - 800 ms */ diff --git a/include/net/dsa.h b/include/net/dsa.h index 461e8a7661b7..d28df7adf948 100644 --- a/include/net/dsa.h +++ b/include/net/dsa.h @@ -196,6 +196,7 @@ struct dsa_port { unsigned int index; const char *name; const struct dsa_port *cpu_dp; + const char *mac; struct device_node *dn; unsigned int ageing_time; u8 stp_state; diff --git a/include/net/icmp.h b/include/net/icmp.h index 8665bf24e3b7..9c344e2655d2 100644 --- a/include/net/icmp.h +++ b/include/net/icmp.h @@ -47,6 +47,12 @@ static inline void icmp_send(struct sk_buff *skb_in, int type, int code, __be32 __icmp_send(skb_in, type, code, info, &IPCB(skb_in)->opt); } +#if IS_ENABLED(CONFIG_NF_NAT) +void icmp_ndo_send(struct sk_buff *skb_in, int type, int code, __be32 info); +#else +#define icmp_ndo_send icmp_send +#endif + int icmp_rcv(struct sk_buff *skb); void icmp_err(struct sk_buff *skb, u32 info); int icmp_init(void); diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h index e11423530d64..6f05af7eff0d 100644 --- a/include/net/ip_tunnels.h +++ b/include/net/ip_tunnels.h @@ -307,6 +307,8 @@ int ip_tunnel_newlink(struct net_device *dev, struct nlattr *tb[], struct ip_tunnel_parm *p, __u32 fwmark); void ip_tunnel_setup(struct net_device *dev, unsigned int net_id); +__be16 ip_tunnel_parse_protocol(const struct sk_buff *skb); + struct ip_tunnel_encap_ops { size_t (*encap_hlen)(struct ip_tunnel_encap *e); int (*build_header)(struct sk_buff *skb, struct ip_tunnel_encap *e, diff --git a/include/net/netfilter/nf_log.h b/include/net/netfilter/nf_log.h index 0d3920896d50..716db4a0fed8 100644 --- a/include/net/netfilter/nf_log.h +++ b/include/net/netfilter/nf_log.h @@ -108,6 +108,7 @@ int nf_log_dump_tcp_header(struct nf_log_buf *m, const struct sk_buff *skb, unsigned int logflags); void nf_log_dump_sk_uid_gid(struct net *net, struct nf_log_buf *m, struct sock *sk); +void nf_log_dump_vlan(struct nf_log_buf *m, const struct sk_buff *skb); void nf_log_dump_packet_common(struct nf_log_buf *m, u_int8_t pf, unsigned int hooknum, const struct sk_buff *skb, const struct net_device *in, diff --git a/include/net/xfrm.h b/include/net/xfrm.h index fe8bed557691..1078828580d2 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -2093,4 +2093,38 @@ static inline int xfrm_tunnel_check(struct sk_buff *skb, struct xfrm_state *x, return 0; } + +extern const int xfrm_msg_min[XFRM_NR_MSGTYPES]; +extern const struct nla_policy xfrma_policy[XFRMA_MAX+1]; + +struct xfrm_translator { + /* Allocate frag_list and put compat translation there */ + int (*alloc_compat)(struct sk_buff *skb, const struct nlmsghdr *src); + + /* Allocate nlmsg with 64-bit translaton of received 32-bit message */ + struct nlmsghdr *(*rcv_msg_compat)(const struct nlmsghdr *nlh, + int maxtype, const struct nla_policy *policy, + struct netlink_ext_ack *extack); + + /* Translate 32-bit user_policy from sockptr */ + int (*xlate_user_policy_sockptr)(u8 **pdata32, int optlen); + + struct module *owner; +}; + +#if IS_ENABLED(CONFIG_XFRM_USER_COMPAT) +extern int xfrm_register_translator(struct xfrm_translator *xtr); +extern int xfrm_unregister_translator(struct xfrm_translator *xtr); +extern struct xfrm_translator *xfrm_get_translator(void); +extern void xfrm_put_translator(struct xfrm_translator *xtr); +#else +static inline struct xfrm_translator *xfrm_get_translator(void) +{ + return NULL; +} +static inline void xfrm_put_translator(struct xfrm_translator *xtr) +{ +} +#endif + #endif /* _NET_XFRM_H */ diff --git a/include/scsi/scsi_common.h b/include/scsi/scsi_common.h index 731ac09ed231..5b567b43e1b1 100644 --- a/include/scsi/scsi_common.h +++ b/include/scsi/scsi_common.h @@ -25,6 +25,13 @@ scsi_command_size(const unsigned char *cmnd) scsi_varlen_cdb_length(cmnd) : COMMAND_SIZE(cmnd[0]); } +static inline unsigned char +scsi_command_control(const unsigned char *cmnd) +{ + return (cmnd[0] == VARIABLE_LENGTH_CMD) ? + cmnd[1] : cmnd[COMMAND_SIZE(cmnd[0]) - 1]; +} + /* Returns a human-readable name for the device */ extern const char *scsi_device_type(unsigned type); diff --git a/include/trace/events/target.h b/include/trace/events/target.h index 914a872dd343..e87a3716b0ac 100644 --- a/include/trace/events/target.h +++ b/include/trace/events/target.h @@ -140,6 +140,7 @@ TRACE_EVENT(target_sequencer_start, __field( unsigned int, opcode ) __field( unsigned int, data_length ) __field( unsigned int, task_attribute ) + __field( unsigned char, control ) __array( unsigned char, cdb, TCM_MAX_COMMAND_SIZE ) __string( initiator, cmd->se_sess->se_node_acl->initiatorname ) ), @@ -149,6 +150,7 @@ TRACE_EVENT(target_sequencer_start, __entry->opcode = cmd->t_task_cdb[0]; __entry->data_length = cmd->data_length; __entry->task_attribute = cmd->sam_task_attr; + __entry->control = scsi_command_control(cmd->t_task_cdb); memcpy(__entry->cdb, cmd->t_task_cdb, TCM_MAX_COMMAND_SIZE); __assign_str(initiator, cmd->se_sess->se_node_acl->initiatorname); ), @@ -158,9 +160,7 @@ TRACE_EVENT(target_sequencer_start, show_opcode_name(__entry->opcode), __entry->data_length, __print_hex(__entry->cdb, 16), show_task_attribute_name(__entry->task_attribute), - scsi_command_size(__entry->cdb) <= 16 ? - __entry->cdb[scsi_command_size(__entry->cdb) - 1] : - __entry->cdb[1] + __entry->control ) ); @@ -175,6 +175,7 @@ TRACE_EVENT(target_cmd_complete, __field( unsigned int, opcode ) __field( unsigned int, data_length ) __field( unsigned int, task_attribute ) + __field( unsigned char, control ) __field( unsigned char, scsi_status ) __field( unsigned char, sense_length ) __array( unsigned char, cdb, TCM_MAX_COMMAND_SIZE ) @@ -187,6 +188,7 @@ TRACE_EVENT(target_cmd_complete, __entry->opcode = cmd->t_task_cdb[0]; __entry->data_length = cmd->data_length; __entry->task_attribute = cmd->sam_task_attr; + __entry->control = scsi_command_control(cmd->t_task_cdb); __entry->scsi_status = cmd->scsi_status; __entry->sense_length = cmd->scsi_status == SAM_STAT_CHECK_CONDITION ? min(18, ((u8 *) cmd->sense_buffer)[SPC_ADD_SENSE_LEN_OFFSET] + 8) : 0; @@ -203,9 +205,7 @@ TRACE_EVENT(target_cmd_complete, show_opcode_name(__entry->opcode), __entry->data_length, __print_hex(__entry->cdb, 16), show_task_attribute_name(__entry->task_attribute), - scsi_command_size(__entry->cdb) <= 16 ? - __entry->cdb[scsi_command_size(__entry->cdb) - 1] : - __entry->cdb[1] + __entry->control ) ); diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index d143e277cdaf..71ca8c4dc290 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -1193,8 +1193,8 @@ union bpf_attr { * Return * The return value depends on the result of the test, and can be: * - * * 0, if the *skb* task belongs to the cgroup2. - * * 1, if the *skb* task does not belong to the cgroup2. + * * 0, if current task belongs to the cgroup2. + * * 1, if current task does not belong to the cgroup2. * * A negative error code, if an error occurred. * * int bpf_skb_change_tail(struct sk_buff *skb, u32 len, u64 flags) diff --git a/include/uapi/linux/nfs4.h b/include/uapi/linux/nfs4.h index 8572930cf5b0..54a78529c8b3 100644 --- a/include/uapi/linux/nfs4.h +++ b/include/uapi/linux/nfs4.h @@ -136,6 +136,8 @@ #define EXCHGID4_FLAG_UPD_CONFIRMED_REC_A 0x40000000 #define EXCHGID4_FLAG_CONFIRMED_R 0x80000000 + +#define EXCHGID4_FLAG_SUPP_FENCE_OPS 0x00000004 /* * Since the validity of these bits depends on whether * they're set in the argument or response, have separate @@ -143,6 +145,7 @@ */ #define EXCHGID4_FLAG_MASK_A 0x40070103 #define EXCHGID4_FLAG_MASK_R 0x80070103 +#define EXCHGID4_2_FLAG_MASK_R 0x80070107 #define SEQ4_STATUS_CB_PATH_DOWN 0x00000001 #define SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRING 0x00000002 diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index f35eb72739c0..5fb4cdf37100 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -1079,7 +1079,7 @@ union perf_mem_data_src { #define PERF_MEM_SNOOPX_FWD 0x01 /* forward */ /* 1 free */ -#define PERF_MEM_SNOOPX_SHIFT 37 +#define PERF_MEM_SNOOPX_SHIFT 38 /* locked instruction */ #define PERF_MEM_LOCK_NA 0x01 /* not available */ diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h index 7b3e201a4dc8..df99783c9eb0 100644 --- a/include/uapi/linux/videodev2.h +++ b/include/uapi/linux/videodev2.h @@ -362,9 +362,9 @@ enum v4l2_hsv_encoding { enum v4l2_quantization { /* - * The default for R'G'B' quantization is always full range, except - * for the BT2020 colorspace. For Y'CbCr the quantization is always - * limited range, except for COLORSPACE_JPEG: this is full range. + * The default for R'G'B' quantization is always full range. + * For Y'CbCr the quantization is always limited range, except + * for COLORSPACE_JPEG: this is full range. */ V4L2_QUANTIZATION_DEFAULT = 0, V4L2_QUANTIZATION_FULL_RANGE = 1, @@ -373,14 +373,13 @@ enum v4l2_quantization { /* * Determine how QUANTIZATION_DEFAULT should map to a proper quantization. - * This depends on whether the image is RGB or not, the colorspace and the - * Y'CbCr encoding. + * This depends on whether the image is RGB or not, the colorspace. + * The Y'CbCr encoding is not used anymore, but is still there for backwards + * compatibility. */ #define V4L2_MAP_QUANTIZATION_DEFAULT(is_rgb_or_hsv, colsp, ycbcr_enc) \ - (((is_rgb_or_hsv) && (colsp) == V4L2_COLORSPACE_BT2020) ? \ - V4L2_QUANTIZATION_LIM_RANGE : \ - (((is_rgb_or_hsv) || (colsp) == V4L2_COLORSPACE_JPEG) ? \ - V4L2_QUANTIZATION_FULL_RANGE : V4L2_QUANTIZATION_LIM_RANGE)) + (((is_rgb_or_hsv) || (colsp) == V4L2_COLORSPACE_JPEG) ? \ + V4L2_QUANTIZATION_FULL_RANGE : V4L2_QUANTIZATION_LIM_RANGE) /* * Deprecated names for opRGB colorspace (IEC 61966-2-5) diff --git a/include/uapi/linux/wireguard.h b/include/uapi/linux/wireguard.h new file mode 100644 index 000000000000..ae88be14c947 --- /dev/null +++ b/include/uapi/linux/wireguard.h @@ -0,0 +1,196 @@ +/* SPDX-License-Identifier: (GPL-2.0 WITH Linux-syscall-note) OR MIT */ +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + * + * Documentation + * ============= + * + * The below enums and macros are for interfacing with WireGuard, using generic + * netlink, with family WG_GENL_NAME and version WG_GENL_VERSION. It defines two + * methods: get and set. Note that while they share many common attributes, + * these two functions actually accept a slightly different set of inputs and + * outputs. + * + * WG_CMD_GET_DEVICE + * ----------------- + * + * May only be called via NLM_F_REQUEST | NLM_F_DUMP. The command should contain + * one but not both of: + * + * WGDEVICE_A_IFINDEX: NLA_U32 + * WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMSIZ - 1 + * + * The kernel will then return several messages (NLM_F_MULTI) containing the + * following tree of nested items: + * + * WGDEVICE_A_IFINDEX: NLA_U32 + * WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMSIZ - 1 + * WGDEVICE_A_PRIVATE_KEY: NLA_EXACT_LEN, len WG_KEY_LEN + * WGDEVICE_A_PUBLIC_KEY: NLA_EXACT_LEN, len WG_KEY_LEN + * WGDEVICE_A_LISTEN_PORT: NLA_U16 + * WGDEVICE_A_FWMARK: NLA_U32 + * WGDEVICE_A_PEERS: NLA_NESTED + * 0: NLA_NESTED + * WGPEER_A_PUBLIC_KEY: NLA_EXACT_LEN, len WG_KEY_LEN + * WGPEER_A_PRESHARED_KEY: NLA_EXACT_LEN, len WG_KEY_LEN + * WGPEER_A_ENDPOINT: NLA_MIN_LEN(struct sockaddr), struct sockaddr_in or struct sockaddr_in6 + * WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL: NLA_U16 + * WGPEER_A_LAST_HANDSHAKE_TIME: NLA_EXACT_LEN, struct __kernel_timespec + * WGPEER_A_RX_BYTES: NLA_U64 + * WGPEER_A_TX_BYTES: NLA_U64 + * WGPEER_A_ALLOWEDIPS: NLA_NESTED + * 0: NLA_NESTED + * WGALLOWEDIP_A_FAMILY: NLA_U16 + * WGALLOWEDIP_A_IPADDR: NLA_MIN_LEN(struct in_addr), struct in_addr or struct in6_addr + * WGALLOWEDIP_A_CIDR_MASK: NLA_U8 + * 0: NLA_NESTED + * ... + * 0: NLA_NESTED + * ... + * ... + * WGPEER_A_PROTOCOL_VERSION: NLA_U32 + * 0: NLA_NESTED + * ... + * ... + * + * It is possible that all of the allowed IPs of a single peer will not + * fit within a single netlink message. In that case, the same peer will + * be written in the following message, except it will only contain + * WGPEER_A_PUBLIC_KEY and WGPEER_A_ALLOWEDIPS. This may occur several + * times in a row for the same peer. It is then up to the receiver to + * coalesce adjacent peers. Likewise, it is possible that all peers will + * not fit within a single message. So, subsequent peers will be sent + * in following messages, except those will only contain WGDEVICE_A_IFNAME + * and WGDEVICE_A_PEERS. It is then up to the receiver to coalesce these + * messages to form the complete list of peers. + * + * Since this is an NLA_F_DUMP command, the final message will always be + * NLMSG_DONE, even if an error occurs. However, this NLMSG_DONE message + * contains an integer error code. It is either zero or a negative error + * code corresponding to the errno. + * + * WG_CMD_SET_DEVICE + * ----------------- + * + * May only be called via NLM_F_REQUEST. The command should contain the + * following tree of nested items, containing one but not both of + * WGDEVICE_A_IFINDEX and WGDEVICE_A_IFNAME: + * + * WGDEVICE_A_IFINDEX: NLA_U32 + * WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMSIZ - 1 + * WGDEVICE_A_FLAGS: NLA_U32, 0 or WGDEVICE_F_REPLACE_PEERS if all current + * peers should be removed prior to adding the list below. + * WGDEVICE_A_PRIVATE_KEY: len WG_KEY_LEN, all zeros to remove + * WGDEVICE_A_LISTEN_PORT: NLA_U16, 0 to choose randomly + * WGDEVICE_A_FWMARK: NLA_U32, 0 to disable + * WGDEVICE_A_PEERS: NLA_NESTED + * 0: NLA_NESTED + * WGPEER_A_PUBLIC_KEY: len WG_KEY_LEN + * WGPEER_A_FLAGS: NLA_U32, 0 and/or WGPEER_F_REMOVE_ME if the + * specified peer should not exist at the end of the + * operation, rather than added/updated and/or + * WGPEER_F_REPLACE_ALLOWEDIPS if all current allowed + * IPs of this peer should be removed prior to adding + * the list below and/or WGPEER_F_UPDATE_ONLY if the + * peer should only be set if it already exists. + * WGPEER_A_PRESHARED_KEY: len WG_KEY_LEN, all zeros to remove + * WGPEER_A_ENDPOINT: struct sockaddr_in or struct sockaddr_in6 + * WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL: NLA_U16, 0 to disable + * WGPEER_A_ALLOWEDIPS: NLA_NESTED + * 0: NLA_NESTED + * WGALLOWEDIP_A_FAMILY: NLA_U16 + * WGALLOWEDIP_A_IPADDR: struct in_addr or struct in6_addr + * WGALLOWEDIP_A_CIDR_MASK: NLA_U8 + * 0: NLA_NESTED + * ... + * 0: NLA_NESTED + * ... + * ... + * WGPEER_A_PROTOCOL_VERSION: NLA_U32, should not be set or used at + * all by most users of this API, as the + * most recent protocol will be used when + * this is unset. Otherwise, must be set + * to 1. + * 0: NLA_NESTED + * ... + * ... + * + * It is possible that the amount of configuration data exceeds that of + * the maximum message length accepted by the kernel. In that case, several + * messages should be sent one after another, with each successive one + * filling in information not contained in the prior. Note that if + * WGDEVICE_F_REPLACE_PEERS is specified in the first message, it probably + * should not be specified in fragments that come after, so that the list + * of peers is only cleared the first time but appended after. Likewise for + * peers, if WGPEER_F_REPLACE_ALLOWEDIPS is specified in the first message + * of a peer, it likely should not be specified in subsequent fragments. + * + * If an error occurs, NLMSG_ERROR will reply containing an errno. + */ + +#ifndef _WG_UAPI_WIREGUARD_H +#define _WG_UAPI_WIREGUARD_H + +#define WG_GENL_NAME "wireguard" +#define WG_GENL_VERSION 1 + +#define WG_KEY_LEN 32 + +enum wg_cmd { + WG_CMD_GET_DEVICE, + WG_CMD_SET_DEVICE, + __WG_CMD_MAX +}; +#define WG_CMD_MAX (__WG_CMD_MAX - 1) + +enum wgdevice_flag { + WGDEVICE_F_REPLACE_PEERS = 1U << 0, + __WGDEVICE_F_ALL = WGDEVICE_F_REPLACE_PEERS +}; +enum wgdevice_attribute { + WGDEVICE_A_UNSPEC, + WGDEVICE_A_IFINDEX, + WGDEVICE_A_IFNAME, + WGDEVICE_A_PRIVATE_KEY, + WGDEVICE_A_PUBLIC_KEY, + WGDEVICE_A_FLAGS, + WGDEVICE_A_LISTEN_PORT, + WGDEVICE_A_FWMARK, + WGDEVICE_A_PEERS, + __WGDEVICE_A_LAST +}; +#define WGDEVICE_A_MAX (__WGDEVICE_A_LAST - 1) + +enum wgpeer_flag { + WGPEER_F_REMOVE_ME = 1U << 0, + WGPEER_F_REPLACE_ALLOWEDIPS = 1U << 1, + WGPEER_F_UPDATE_ONLY = 1U << 2, + __WGPEER_F_ALL = WGPEER_F_REMOVE_ME | WGPEER_F_REPLACE_ALLOWEDIPS | + WGPEER_F_UPDATE_ONLY +}; +enum wgpeer_attribute { + WGPEER_A_UNSPEC, + WGPEER_A_PUBLIC_KEY, + WGPEER_A_PRESHARED_KEY, + WGPEER_A_FLAGS, + WGPEER_A_ENDPOINT, + WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL, + WGPEER_A_LAST_HANDSHAKE_TIME, + WGPEER_A_RX_BYTES, + WGPEER_A_TX_BYTES, + WGPEER_A_ALLOWEDIPS, + WGPEER_A_PROTOCOL_VERSION, + __WGPEER_A_LAST +}; +#define WGPEER_A_MAX (__WGPEER_A_LAST - 1) + +enum wgallowedip_attribute { + WGALLOWEDIP_A_UNSPEC, + WGALLOWEDIP_A_FAMILY, + WGALLOWEDIP_A_IPADDR, + WGALLOWEDIP_A_CIDR_MASK, + __WGALLOWEDIP_A_LAST +}; +#define WGALLOWEDIP_A_MAX (__WGALLOWEDIP_A_LAST - 1) + +#endif /* _WG_UAPI_WIREGUARD_H */ diff --git a/include/xen/events.h b/include/xen/events.h index 1650d39decae..d8255ed2052c 100644 --- a/include/xen/events.h +++ b/include/xen/events.h @@ -14,11 +14,16 @@ unsigned xen_evtchn_nr_channels(void); -int bind_evtchn_to_irq(unsigned int evtchn); -int bind_evtchn_to_irqhandler(unsigned int evtchn, +int bind_evtchn_to_irq(evtchn_port_t evtchn); +int bind_evtchn_to_irq_lateeoi(evtchn_port_t evtchn); +int bind_evtchn_to_irqhandler(evtchn_port_t evtchn, irq_handler_t handler, unsigned long irqflags, const char *devname, void *dev_id); +int bind_evtchn_to_irqhandler_lateeoi(evtchn_port_t evtchn, + irq_handler_t handler, + unsigned long irqflags, const char *devname, + void *dev_id); int bind_virq_to_irq(unsigned int virq, unsigned int cpu, bool percpu); int bind_virq_to_irqhandler(unsigned int virq, unsigned int cpu, irq_handler_t handler, @@ -31,13 +36,21 @@ int bind_ipi_to_irqhandler(enum ipi_vector ipi, const char *devname, void *dev_id); int bind_interdomain_evtchn_to_irq(unsigned int remote_domain, - unsigned int remote_port); + evtchn_port_t remote_port); +int bind_interdomain_evtchn_to_irq_lateeoi(unsigned int remote_domain, + evtchn_port_t remote_port); int bind_interdomain_evtchn_to_irqhandler(unsigned int remote_domain, - unsigned int remote_port, + evtchn_port_t remote_port, irq_handler_t handler, unsigned long irqflags, const char *devname, void *dev_id); +int bind_interdomain_evtchn_to_irqhandler_lateeoi(unsigned int remote_domain, + evtchn_port_t remote_port, + irq_handler_t handler, + unsigned long irqflags, + const char *devname, + void *dev_id); /* * Common unbind function for all event sources. Takes IRQ to unbind from. @@ -46,6 +59,14 @@ int bind_interdomain_evtchn_to_irqhandler(unsigned int remote_domain, */ void unbind_from_irqhandler(unsigned int irq, void *dev_id); +/* + * Send late EOI for an IRQ bound to an event channel via one of the *_lateeoi + * functions above. + */ +void xen_irq_lateeoi(unsigned int irq, unsigned int eoi_flags); +/* Signal an event was spurious, i.e. there was no action resulting from it. */ +#define XEN_EOI_FLAG_SPURIOUS 0x00000001 + #define XEN_IRQ_PRIORITY_MAX EVTCHN_FIFO_PRIORITY_MAX #define XEN_IRQ_PRIORITY_DEFAULT EVTCHN_FIFO_PRIORITY_DEFAULT #define XEN_IRQ_PRIORITY_MIN EVTCHN_FIFO_PRIORITY_MIN diff --git a/init/Kconfig b/init/Kconfig index ff9c4c6381c3..98f695e0e808 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -613,7 +613,8 @@ config IKHEADERS config LOG_BUF_SHIFT int "Kernel log buffer size (16 => 64KB, 17 => 128KB)" - range 12 25 + range 12 25 if !H8300 + range 12 19 if H8300 default 17 depends on PRINTK help diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c index fbb1bfdd2fa5..8c76141c99c8 100644 --- a/kernel/debug/debug_core.c +++ b/kernel/debug/debug_core.c @@ -95,14 +95,6 @@ int dbg_switch_cpu; /* Use kdb or gdbserver mode */ int dbg_kdb_mode = 1; -static int __init opt_kgdb_con(char *str) -{ - kgdb_use_con = 1; - return 0; -} - -early_param("kgdbcon", opt_kgdb_con); - module_param(kgdb_use_con, int, 0644); module_param(kgdbreboot, int, 0644); @@ -820,6 +812,20 @@ static struct console kgdbcons = { .index = -1, }; +static int __init opt_kgdb_con(char *str) +{ + kgdb_use_con = 1; + + if (kgdb_io_module_registered && !kgdb_con_registered) { + register_console(&kgdbcons); + kgdb_con_registered = 1; + } + + return 0; +} + +early_param("kgdbcon", opt_kgdb_con); + #ifdef CONFIG_MAGIC_SYSRQ static void sysrq_handle_dbg(int key) { diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c index 048ebbe43350..738b755605f3 100644 --- a/kernel/debug/kdb/kdb_io.c +++ b/kernel/debug/kdb/kdb_io.c @@ -688,12 +688,16 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap) size_avail = sizeof(kdb_buffer) - len; goto kdb_print_out; } - if (kdb_grepping_flag >= KDB_GREPPING_FLAG_SEARCH) + if (kdb_grepping_flag >= KDB_GREPPING_FLAG_SEARCH) { /* * This was a interactive search (using '/' at more - * prompt) and it has completed. Clear the flag. + * prompt) and it has completed. Replace the \0 with + * its original value to ensure multi-line strings + * are handled properly, and return to normal mode. */ + *cphold = replaced_byte; kdb_grepping_flag = 0; + } /* * at this point the string is a full line and * should be printed, up to the null. diff --git a/kernel/events/core.c b/kernel/events/core.c index 6992a7abcd3c..2c6b8409d56e 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -9237,6 +9237,7 @@ perf_event_parse_addr_filter(struct perf_event *event, char *fstr, if (token == IF_SRC_FILE || token == IF_SRC_FILEADDR) { int fpos = token == IF_SRC_FILE ? 2 : 1; + kfree(filename); filename = match_strdup(&args[fpos]); if (!filename) { ret = -ENOMEM; @@ -9283,16 +9284,13 @@ perf_event_parse_addr_filter(struct perf_event *event, char *fstr, */ ret = -EOPNOTSUPP; if (!event->ctx->task) - goto fail_free_name; + goto fail; /* look up the path and grab its inode */ ret = kern_path(filename, LOOKUP_FOLLOW, &filter->path); if (ret) - goto fail_free_name; - - kfree(filename); - filename = NULL; + goto fail; ret = -EINVAL; if (!filter->path.dentry || @@ -9312,13 +9310,13 @@ perf_event_parse_addr_filter(struct perf_event *event, char *fstr, if (state != IF_STATE_ACTION) goto fail; + kfree(filename); kfree(orig); return 0; -fail_free_name: - kfree(filename); fail: + kfree(filename); free_filters_list(filters); kfree(orig); diff --git a/kernel/fork.c b/kernel/fork.c index ef4295ac552e..c0c42ffb8aa1 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -2141,14 +2141,9 @@ static __latent_entropy struct task_struct *copy_process( /* ok, now we should be set up.. */ p->pid = pid_nr(pid); if (clone_flags & CLONE_THREAD) { - p->exit_signal = -1; p->group_leader = current->group_leader; p->tgid = current->tgid; } else { - if (clone_flags & CLONE_PARENT) - p->exit_signal = current->group_leader->exit_signal; - else - p->exit_signal = (clone_flags & CSIGNAL); p->group_leader = p; p->tgid = p->pid; } @@ -2193,9 +2188,14 @@ static __latent_entropy struct task_struct *copy_process( if (clone_flags & (CLONE_PARENT|CLONE_THREAD)) { p->real_parent = current->real_parent; p->parent_exec_id = current->parent_exec_id; + if (clone_flags & CLONE_THREAD) + p->exit_signal = -1; + else + p->exit_signal = current->group_leader->exit_signal; } else { p->real_parent = current; p->parent_exec_id = current->self_exec_id; + p->exit_signal = (clone_flags & CSIGNAL); } klp_copy_process(p); diff --git a/kernel/futex.c b/kernel/futex.c index 920d853a8e9e..52f641c00a65 100644 --- a/kernel/futex.c +++ b/kernel/futex.c @@ -1517,8 +1517,10 @@ static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_pi_state *pi_ */ newval = FUTEX_WAITERS | task_pid_vnr(new_owner); - if (unlikely(should_fail_futex(true))) + if (unlikely(should_fail_futex(true))) { ret = -EFAULT; + goto out_unlock; + } ret = cmpxchg_futex_value_locked(&curval, uaddr, uval, newval); if (!ret && (curval != uval)) { @@ -2415,10 +2417,22 @@ retry: } /* - * Since we just failed the trylock; there must be an owner. + * The trylock just failed, so either there is an owner or + * there is a higher priority waiter than this one. */ newowner = rt_mutex_owner(&pi_state->pi_mutex); - BUG_ON(!newowner); + /* + * If the higher priority waiter has not yet taken over the + * rtmutex then newowner is NULL. We can't return here with + * that state because it's inconsistent vs. the user space + * state. So drop the locks and try again. It's a valid + * situation and not any different from the other retry + * conditions. + */ + if (unlikely(!newowner)) { + err = -EAGAIN; + goto handle_err; + } } else { WARN_ON_ONCE(argowner != current); if (oldowner == current) { diff --git a/kernel/kthread.c b/kernel/kthread.c index 4b3644b85760..b7e2254e99a6 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -864,7 +864,8 @@ void kthread_delayed_work_timer_fn(struct timer_list *t) /* Move the work from worker->delayed_work_list. */ WARN_ON_ONCE(list_empty(&work->node)); list_del_init(&work->node); - kthread_insert_work(worker, work, &worker->work_list); + if (!work->canceling) + kthread_insert_work(worker, work, &worker->work_list); spin_unlock(&worker->lock); } diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c index 537a2a3c1dea..28db51274ed0 100644 --- a/kernel/power/hibernate.c +++ b/kernel/power/hibernate.c @@ -842,17 +842,6 @@ static int software_resume(void) /* Check if the device is there */ swsusp_resume_device = name_to_dev_t(resume_file); - - /* - * name_to_dev_t is ineffective to verify parition if resume_file is in - * integer format. (e.g. major:minor) - */ - if (isdigit(resume_file[0]) && resume_wait) { - int partno; - while (!get_gendisk(swsusp_resume_device, &partno)) - msleep(10); - } - if (!swsusp_resume_device) { /* * Some device discovery might still be in progress; we need diff --git a/kernel/sched/core.c b/kernel/sched/core.c index c09b3545208c..ce496b142acb 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -30,7 +30,7 @@ DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues); -#if defined(CONFIG_SCHED_DEBUG) && defined(CONFIG_JUMP_LABEL) +#ifdef CONFIG_SCHED_DEBUG /* * Debugging: various feature bits * diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 3aa2efdf6791..bbff6abcdd27 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -1681,7 +1681,7 @@ enum { #undef SCHED_FEAT -#if defined(CONFIG_SCHED_DEBUG) && defined(CONFIG_JUMP_LABEL) +#ifdef CONFIG_SCHED_DEBUG /* * To support run-time toggling of sched features, all the translation units @@ -1689,6 +1689,7 @@ enum { */ extern const_debug unsigned int sysctl_sched_features; +#ifdef CONFIG_JUMP_LABEL #define SCHED_FEAT(name, enabled) \ static __always_inline bool static_branch_##name(struct static_key *key) \ { \ @@ -1701,7 +1702,13 @@ static __always_inline bool static_branch_##name(struct static_key *key) \ extern struct static_key sched_feat_keys[__SCHED_FEAT_NR]; #define sched_feat(x) (static_branch_##x(&sched_feat_keys[__SCHED_FEAT_##x])) -#else /* !(SCHED_DEBUG && CONFIG_JUMP_LABEL) */ +#else /* !CONFIG_JUMP_LABEL */ + +#define sched_feat(x) (sysctl_sched_features & (1UL << __SCHED_FEAT_##x)) + +#endif /* CONFIG_JUMP_LABEL */ + +#else /* !SCHED_DEBUG */ /* * Each translation unit has its own copy of sysctl_sched_features to allow @@ -1717,7 +1724,7 @@ static const_debug __maybe_unused unsigned int sysctl_sched_features = #define sched_feat(x) !!(sysctl_sched_features & (1UL << __SCHED_FEAT_##x)) -#endif /* SCHED_DEBUG && CONFIG_JUMP_LABEL */ +#endif /* SCHED_DEBUG */ extern struct static_key_false sched_numa_balancing; extern struct static_key_false sched_schedstats; diff --git a/kernel/signal.c b/kernel/signal.c index 89c41b76f975..c82ead187fff 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -391,16 +391,17 @@ static bool task_participate_group_stop(struct task_struct *task) void task_join_group_stop(struct task_struct *task) { + unsigned long mask = current->jobctl & JOBCTL_STOP_SIGMASK; + struct signal_struct *sig = current->signal; + + if (sig->group_stop_count) { + sig->group_stop_count++; + mask |= JOBCTL_STOP_CONSUME; + } else if (!(sig->flags & SIGNAL_STOP_STOPPED)) + return; + /* Have the new thread join an on-going signal group stop */ - unsigned long jobctl = current->jobctl; - if (jobctl & JOBCTL_STOP_PENDING) { - struct signal_struct *sig = current->signal; - unsigned long signr = jobctl & JOBCTL_STOP_SIGMASK; - unsigned long gstop = JOBCTL_STOP_PENDING | JOBCTL_STOP_CONSUME; - if (task_set_jobctl_pending(task, signr | gstop)) { - sig->group_stop_count++; - } - } + task_set_jobctl_pending(task, mask | JOBCTL_STOP_PENDING); } /* diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c index 1442f6152abc..645048bb1e86 100644 --- a/kernel/trace/blktrace.c +++ b/kernel/trace/blktrace.c @@ -521,10 +521,18 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev, if (!bt->msg_data) goto err; - ret = -ENOENT; - - dir = debugfs_lookup(buts->name, blk_debugfs_root); - if (!dir) +#ifdef CONFIG_BLK_DEBUG_FS + /* + * When tracing whole make_request drivers (multiqueue) block devices, + * reuse the existing debugfs directory created by the block layer on + * init. For request-based block devices, all partitions block devices, + * and scsi-generic block devices we create a temporary new debugfs + * directory that will be removed once the trace ends. + */ + if (q->mq_ops && bdev && bdev == bdev->bd_contains) + dir = q->debugfs_dir; + else +#endif bt->dir = dir = debugfs_create_dir(buts->name, blk_debugfs_root); if (!dir) goto err; @@ -583,8 +591,6 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev, ret = 0; err: - if (dir && !bt->dir) - dput(dir); if (ret) blk_trace_free(bt); return ret; diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index f0a4024425cc..360129e47540 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -444,14 +444,16 @@ struct rb_event_info { /* * Used for which event context the event is in. - * NMI = 0 - * IRQ = 1 - * SOFTIRQ = 2 - * NORMAL = 3 + * TRANSITION = 0 + * NMI = 1 + * IRQ = 2 + * SOFTIRQ = 3 + * NORMAL = 4 * * See trace_recursive_lock() comment below for more details. */ enum { + RB_CTX_TRANSITION, RB_CTX_NMI, RB_CTX_IRQ, RB_CTX_SOFTIRQ, @@ -1692,18 +1694,18 @@ int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size, { struct ring_buffer_per_cpu *cpu_buffer; unsigned long nr_pages; - int cpu, err = 0; + int cpu, err; /* * Always succeed at resizing a non-existent buffer: */ if (!buffer) - return size; + return 0; /* Make sure the requested buffer exists */ if (cpu_id != RING_BUFFER_ALL_CPUS && !cpumask_test_cpu(cpu_id, buffer->cpumask)) - return size; + return 0; nr_pages = DIV_ROUND_UP(size, BUF_PAGE_SIZE); @@ -1843,7 +1845,7 @@ int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size, } mutex_unlock(&buffer->mutex); - return size; + return 0; out_err: for_each_buffer_cpu(buffer, cpu) { @@ -2620,10 +2622,10 @@ rb_wakeups(struct ring_buffer *buffer, struct ring_buffer_per_cpu *cpu_buffer) * a bit of overhead in something as critical as function tracing, * we use a bitmask trick. * - * bit 0 = NMI context - * bit 1 = IRQ context - * bit 2 = SoftIRQ context - * bit 3 = normal context. + * bit 1 = NMI context + * bit 2 = IRQ context + * bit 3 = SoftIRQ context + * bit 4 = normal context. * * This works because this is the order of contexts that can * preempt other contexts. A SoftIRQ never preempts an IRQ @@ -2646,6 +2648,30 @@ rb_wakeups(struct ring_buffer *buffer, struct ring_buffer_per_cpu *cpu_buffer) * The least significant bit can be cleared this way, and it * just so happens that it is the same bit corresponding to * the current context. + * + * Now the TRANSITION bit breaks the above slightly. The TRANSITION bit + * is set when a recursion is detected at the current context, and if + * the TRANSITION bit is already set, it will fail the recursion. + * This is needed because there's a lag between the changing of + * interrupt context and updating the preempt count. In this case, + * a false positive will be found. To handle this, one extra recursion + * is allowed, and this is done by the TRANSITION bit. If the TRANSITION + * bit is already set, then it is considered a recursion and the function + * ends. Otherwise, the TRANSITION bit is set, and that bit is returned. + * + * On the trace_recursive_unlock(), the TRANSITION bit will be the first + * to be cleared. Even if it wasn't the context that set it. That is, + * if an interrupt comes in while NORMAL bit is set and the ring buffer + * is called before preempt_count() is updated, since the check will + * be on the NORMAL bit, the TRANSITION bit will then be set. If an + * NMI then comes in, it will set the NMI bit, but when the NMI code + * does the trace_recursive_unlock() it will clear the TRANSTION bit + * and leave the NMI bit set. But this is fine, because the interrupt + * code that set the TRANSITION bit will then clear the NMI bit when it + * calls trace_recursive_unlock(). If another NMI comes in, it will + * set the TRANSITION bit and continue. + * + * Note: The TRANSITION bit only handles a single transition between context. */ static __always_inline int @@ -2661,8 +2687,16 @@ trace_recursive_lock(struct ring_buffer_per_cpu *cpu_buffer) bit = pc & NMI_MASK ? RB_CTX_NMI : pc & HARDIRQ_MASK ? RB_CTX_IRQ : RB_CTX_SOFTIRQ; - if (unlikely(val & (1 << (bit + cpu_buffer->nest)))) - return 1; + if (unlikely(val & (1 << (bit + cpu_buffer->nest)))) { + /* + * It is possible that this was called by transitioning + * between interrupt context, and preempt_count() has not + * been updated yet. In this case, use the TRANSITION bit. + */ + bit = RB_CTX_TRANSITION; + if (val & (1 << (bit + cpu_buffer->nest))) + return 1; + } val |= (1 << (bit + cpu_buffer->nest)); cpu_buffer->current_context = val; @@ -2677,8 +2711,8 @@ trace_recursive_unlock(struct ring_buffer_per_cpu *cpu_buffer) cpu_buffer->current_context - (1 << cpu_buffer->nest); } -/* The recursive locking above uses 4 bits */ -#define NESTED_BITS 4 +/* The recursive locking above uses 5 bits */ +#define NESTED_BITS 5 /** * ring_buffer_nest_start - Allow to trace while nested diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 13eeeb9d7c4c..868d6280d27a 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -2821,7 +2821,7 @@ static char *get_trace_buf(void) /* Interrupts must see nesting incremented before we use the buffer */ barrier(); - return &buffer->buffer[buffer->nesting][0]; + return &buffer->buffer[buffer->nesting - 1][0]; } static void put_trace_buf(void) diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h index 8d3378fd387b..e791fc26d752 100644 --- a/kernel/trace/trace.h +++ b/kernel/trace/trace.h @@ -534,6 +534,12 @@ enum { TRACE_GRAPH_DEPTH_START_BIT, TRACE_GRAPH_DEPTH_END_BIT, + + /* + * When transitioning between context, the preempt_count() may + * not be correct. Allow for a single recursion to cover this case. + */ + TRACE_TRANSITION_BIT, }; #define trace_recursion_set(bit) do { (current)->trace_recursion |= (1<<(bit)); } while (0) @@ -588,14 +594,27 @@ static __always_inline int trace_test_and_set_recursion(int start, int max) return 0; bit = trace_get_context_bit() + start; - if (unlikely(val & (1 << bit))) - return -1; + if (unlikely(val & (1 << bit))) { + /* + * It could be that preempt_count has not been updated during + * a switch between contexts. Allow for a single recursion. + */ + bit = TRACE_TRANSITION_BIT; + if (trace_recursion_test(bit)) + return -1; + trace_recursion_set(bit); + barrier(); + return bit + 1; + } + + /* Normal check passed, clear the transition to allow it again */ + trace_recursion_clear(TRACE_TRANSITION_BIT); val |= 1 << bit; current->trace_recursion = val; barrier(); - return bit; + return bit + 1; } static __always_inline void trace_clear_recursion(int bit) @@ -605,6 +624,7 @@ static __always_inline void trace_clear_recursion(int bit) if (!bit) return; + bit--; bit = 1 << bit; val &= ~bit; diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c index 11e9daa4a568..330ba2b081ed 100644 --- a/kernel/trace/trace_selftest.c +++ b/kernel/trace/trace_selftest.c @@ -492,8 +492,13 @@ trace_selftest_function_recursion(void) unregister_ftrace_function(&test_rec_probe); ret = -1; - if (trace_selftest_recursion_cnt != 1) { - pr_cont("*callback not called once (%d)* ", + /* + * Recursion allows for transitions between context, + * and may call the callback twice. + */ + if (trace_selftest_recursion_cnt != 1 && + trace_selftest_recursion_cnt != 2) { + pr_cont("*callback not called once (or twice) (%d)* ", trace_selftest_recursion_cnt); goto out; } diff --git a/lib/Makefile b/lib/Makefile index c5460e8659d5..424494d6160d 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -31,7 +31,7 @@ endif lib-y := ctype.o string.o vsprintf.o cmdline.o \ rbtree.o radix-tree.o timerqueue.o\ idr.o int_sqrt.o extable.o \ - sha1.o chacha.o irq_regs.o argv_split.o \ + sha1.o irq_regs.o argv_split.o \ flex_proportions.o ratelimit.o show_mem.o \ is_single_threaded.o plist.o decompress.o kobject_uevent.o \ earlycpio.o seq_buf.o siphash.o dec_and_lock.o \ @@ -97,6 +97,8 @@ endif obj-$(CONFIG_DEBUG_INFO_REDUCED) += debug_info.o CFLAGS_debug_info.o += $(call cc-option, -femit-struct-debug-detailed=any) +obj-y += crypto/ + obj-$(CONFIG_GENERIC_IOMAP) += iomap.o obj-$(CONFIG_GENERIC_PCI_IOMAP) += pci_iomap.o obj-$(CONFIG_HAS_IOMEM) += iomap_copy.o devres.o diff --git a/lib/crc32.c b/lib/crc32.c index a6c9afafc8c8..1a5d08470044 100644 --- a/lib/crc32.c +++ b/lib/crc32.c @@ -328,7 +328,7 @@ static inline u32 __pure crc32_be_generic(u32 crc, unsigned char const *p, return crc; } -#if CRC_LE_BITS == 1 +#if CRC_BE_BITS == 1 u32 __pure crc32_be(u32 crc, unsigned char const *p, size_t len) { return crc32_be_generic(crc, p, len, NULL, CRC32_POLY_BE); diff --git a/lib/crc32test.c b/lib/crc32test.c index 97d6a57cefcc..61ddce2cff77 100644 --- a/lib/crc32test.c +++ b/lib/crc32test.c @@ -683,7 +683,6 @@ static int __init crc32c_test(void) /* reduce OS noise */ local_irq_save(flags); - local_irq_disable(); nsec = ktime_get_ns(); for (i = 0; i < 100; i++) { @@ -694,7 +693,6 @@ static int __init crc32c_test(void) nsec = ktime_get_ns() - nsec; local_irq_restore(flags); - local_irq_enable(); pr_info("crc32c: CRC_LE_BITS = %d\n", CRC_LE_BITS); @@ -768,7 +766,6 @@ static int __init crc32_test(void) /* reduce OS noise */ local_irq_save(flags); - local_irq_disable(); nsec = ktime_get_ns(); for (i = 0; i < 100; i++) { @@ -783,7 +780,6 @@ static int __init crc32_test(void) nsec = ktime_get_ns() - nsec; local_irq_restore(flags); - local_irq_enable(); pr_info("crc32: CRC_LE_BITS = %d, CRC_BE BITS = %d\n", CRC_LE_BITS, CRC_BE_BITS); diff --git a/lib/crypto/Kconfig b/lib/crypto/Kconfig new file mode 100644 index 000000000000..e1059fa5fbdc --- /dev/null +++ b/lib/crypto/Kconfig @@ -0,0 +1,118 @@ +# SPDX-License-Identifier: GPL-2.0 + +comment "Crypto library routines" + +config CRYPTO_ARCH_HAVE_LIB_BLAKE2S + tristate + help + Declares whether the architecture provides an arch-specific + accelerated implementation of the Blake2s library interface, + either builtin or as a module. + +config CRYPTO_LIB_BLAKE2S_GENERIC + tristate + help + This symbol can be depended upon by arch implementations of the + Blake2s library interface that require the generic code as a + fallback, e.g., for SIMD implementations. If no arch specific + implementation is enabled, this implementation serves the users + of CRYPTO_LIB_BLAKE2S. + +config CRYPTO_LIB_BLAKE2S + tristate "BLAKE2s hash function library" + depends on CRYPTO_ARCH_HAVE_LIB_BLAKE2S || !CRYPTO_ARCH_HAVE_LIB_BLAKE2S + select CRYPTO_LIB_BLAKE2S_GENERIC if CRYPTO_ARCH_HAVE_LIB_BLAKE2S=n + help + Enable the Blake2s library interface. This interface may be fulfilled + by either the generic implementation or an arch-specific one, if one + is available and enabled. + +config CRYPTO_ARCH_HAVE_LIB_CHACHA + tristate + help + Declares whether the architecture provides an arch-specific + accelerated implementation of the ChaCha library interface, + either builtin or as a module. + +config CRYPTO_LIB_CHACHA_GENERIC + tristate + select CRYPTO_ALGAPI + help + This symbol can be depended upon by arch implementations of the + ChaCha library interface that require the generic code as a + fallback, e.g., for SIMD implementations. If no arch specific + implementation is enabled, this implementation serves the users + of CRYPTO_LIB_CHACHA. + +config CRYPTO_LIB_CHACHA + tristate "ChaCha library interface" + depends on CRYPTO_ARCH_HAVE_LIB_CHACHA || !CRYPTO_ARCH_HAVE_LIB_CHACHA + select CRYPTO_LIB_CHACHA_GENERIC if CRYPTO_ARCH_HAVE_LIB_CHACHA=n + help + Enable the ChaCha library interface. This interface may be fulfilled + by either the generic implementation or an arch-specific one, if one + is available and enabled. + +config CRYPTO_LIB_POLY1305_RSIZE + int + default 2 if MIPS + default 11 if X86_64 + default 9 if ARM || ARM64 + default 1 + +config CRYPTO_ARCH_HAVE_LIB_POLY1305 + tristate + help + Declares whether the architecture provides an arch-specific + accelerated implementation of the Poly1305 library interface, + either builtin or as a module. + +config CRYPTO_LIB_POLY1305_GENERIC + tristate + help + This symbol can be depended upon by arch implementations of the + Poly1305 library interface that require the generic code as a + fallback, e.g., for SIMD implementations. If no arch specific + implementation is enabled, this implementation serves the users + of CRYPTO_LIB_POLY1305. + +config CRYPTO_LIB_POLY1305 + tristate "Poly1305 library interface" + depends on CRYPTO_ARCH_HAVE_LIB_POLY1305 || !CRYPTO_ARCH_HAVE_LIB_POLY1305 + select CRYPTO_LIB_POLY1305_GENERIC if CRYPTO_ARCH_HAVE_LIB_POLY1305=n + help + Enable the Poly1305 library interface. This interface may be fulfilled + by either the generic implementation or an arch-specific one, if one + is available and enabled. + +config CRYPTO_ARCH_HAVE_LIB_CURVE25519 + tristate + help + Declares whether the architecture provides an arch-specific + accelerated implementation of the Curve25519 library interface, + either builtin or as a module. + +config CRYPTO_LIB_CURVE25519_GENERIC + tristate + help + This symbol can be depended upon by arch implementations of the + Curve25519 library interface that require the generic code as a + fallback, e.g., for SIMD implementations. If no arch specific + implementation is enabled, this implementation serves the users + of CRYPTO_LIB_CURVE25519. + +config CRYPTO_LIB_CURVE25519 + tristate "Curve25519 scalar multiplication library" + depends on CRYPTO_ARCH_HAVE_LIB_CURVE25519 || !CRYPTO_ARCH_HAVE_LIB_CURVE25519 + select CRYPTO_LIB_CURVE25519_GENERIC if CRYPTO_ARCH_HAVE_LIB_CURVE25519=n + help + Enable the Curve25519 library interface. This interface may be + fulfilled by either the generic implementation or an arch-specific + one, if one is available and enabled. + +config CRYPTO_LIB_CHACHA20POLY1305 + tristate "ChaCha20-Poly1305 AEAD support (8-byte nonce library version)" + depends on CRYPTO_ARCH_HAVE_LIB_CHACHA || !CRYPTO_ARCH_HAVE_LIB_CHACHA + depends on CRYPTO_ARCH_HAVE_LIB_POLY1305 || !CRYPTO_ARCH_HAVE_LIB_POLY1305 + select CRYPTO_LIB_CHACHA + select CRYPTO_LIB_POLY1305 diff --git a/lib/crypto/Makefile b/lib/crypto/Makefile new file mode 100644 index 000000000000..f96fa8c1b957 --- /dev/null +++ b/lib/crypto/Makefile @@ -0,0 +1,33 @@ +# SPDX-License-Identifier: GPL-2.0 + +# chacha is used by the /dev/random driver which is always builtin +obj-y += chacha.o +obj-$(CONFIG_CRYPTO_LIB_CHACHA_GENERIC) += libchacha.o + +obj-$(CONFIG_CRYPTO_LIB_POLY1305_GENERIC) += libpoly1305.o +libpoly1305-y := poly1305-donna32.o +libpoly1305-$(CONFIG_ARCH_SUPPORTS_INT128) := poly1305-donna64.o +libpoly1305-y += poly1305.o + +obj-$(CONFIG_CRYPTO_LIB_BLAKE2S_GENERIC) += libblake2s-generic.o +libblake2s-generic-y += blake2s-generic.o + +obj-$(CONFIG_CRYPTO_LIB_BLAKE2S) += libblake2s.o +libblake2s-y += blake2s.o + +obj-$(CONFIG_CRYPTO_LIB_CHACHA20POLY1305) += libchacha20poly1305.o +libchacha20poly1305-y += chacha20poly1305.o + +obj-$(CONFIG_CRYPTO_LIB_CURVE25519_GENERIC) += libcurve25519-generic.o +libcurve25519-generic-y := curve25519-fiat32.o +libcurve25519-generic-$(CONFIG_ARCH_SUPPORTS_INT128) := curve25519-hacl64.o +libcurve25519-generic-y += curve25519-generic.o + +obj-$(CONFIG_CRYPTO_LIB_CURVE25519) += libcurve25519.o +libcurve25519-y += curve25519.o + +ifneq ($(CONFIG_CRYPTO_MANAGER_DISABLE_TESTS),y) +libblake2s-y += blake2s-selftest.o +libchacha20poly1305-y += chacha20poly1305-selftest.o +libcurve25519-y += curve25519-selftest.o +endif diff --git a/lib/crypto/blake2s-generic.c b/lib/crypto/blake2s-generic.c new file mode 100644 index 000000000000..04ff8df24513 --- /dev/null +++ b/lib/crypto/blake2s-generic.c @@ -0,0 +1,111 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + * + * This is an implementation of the BLAKE2s hash and PRF functions. + * + * Information: https://blake2.net/ + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +static const u8 blake2s_sigma[10][16] = { + { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 }, + { 14, 10, 4, 8, 9, 15, 13, 6, 1, 12, 0, 2, 11, 7, 5, 3 }, + { 11, 8, 12, 0, 5, 2, 15, 13, 10, 14, 3, 6, 7, 1, 9, 4 }, + { 7, 9, 3, 1, 13, 12, 11, 14, 2, 6, 5, 10, 4, 0, 15, 8 }, + { 9, 0, 5, 7, 2, 4, 10, 15, 14, 1, 11, 12, 6, 8, 3, 13 }, + { 2, 12, 6, 10, 0, 11, 8, 3, 4, 13, 7, 5, 15, 14, 1, 9 }, + { 12, 5, 1, 15, 14, 13, 4, 10, 0, 7, 6, 3, 9, 2, 8, 11 }, + { 13, 11, 7, 14, 12, 1, 3, 9, 5, 0, 15, 4, 8, 6, 2, 10 }, + { 6, 15, 14, 9, 11, 3, 0, 8, 12, 2, 13, 7, 1, 4, 10, 5 }, + { 10, 2, 8, 4, 7, 6, 1, 5, 15, 11, 9, 14, 3, 12, 13, 0 }, +}; + +static inline void blake2s_increment_counter(struct blake2s_state *state, + const u32 inc) +{ + state->t[0] += inc; + state->t[1] += (state->t[0] < inc); +} + +void blake2s_compress_generic(struct blake2s_state *state,const u8 *block, + size_t nblocks, const u32 inc) +{ + u32 m[16]; + u32 v[16]; + int i; + + WARN_ON(IS_ENABLED(DEBUG) && + (nblocks > 1 && inc != BLAKE2S_BLOCK_SIZE)); + + while (nblocks > 0) { + blake2s_increment_counter(state, inc); + memcpy(m, block, BLAKE2S_BLOCK_SIZE); + le32_to_cpu_array(m, ARRAY_SIZE(m)); + memcpy(v, state->h, 32); + v[ 8] = BLAKE2S_IV0; + v[ 9] = BLAKE2S_IV1; + v[10] = BLAKE2S_IV2; + v[11] = BLAKE2S_IV3; + v[12] = BLAKE2S_IV4 ^ state->t[0]; + v[13] = BLAKE2S_IV5 ^ state->t[1]; + v[14] = BLAKE2S_IV6 ^ state->f[0]; + v[15] = BLAKE2S_IV7 ^ state->f[1]; + +#define G(r, i, a, b, c, d) do { \ + a += b + m[blake2s_sigma[r][2 * i + 0]]; \ + d = ror32(d ^ a, 16); \ + c += d; \ + b = ror32(b ^ c, 12); \ + a += b + m[blake2s_sigma[r][2 * i + 1]]; \ + d = ror32(d ^ a, 8); \ + c += d; \ + b = ror32(b ^ c, 7); \ +} while (0) + +#define ROUND(r) do { \ + G(r, 0, v[0], v[ 4], v[ 8], v[12]); \ + G(r, 1, v[1], v[ 5], v[ 9], v[13]); \ + G(r, 2, v[2], v[ 6], v[10], v[14]); \ + G(r, 3, v[3], v[ 7], v[11], v[15]); \ + G(r, 4, v[0], v[ 5], v[10], v[15]); \ + G(r, 5, v[1], v[ 6], v[11], v[12]); \ + G(r, 6, v[2], v[ 7], v[ 8], v[13]); \ + G(r, 7, v[3], v[ 4], v[ 9], v[14]); \ +} while (0) + ROUND(0); + ROUND(1); + ROUND(2); + ROUND(3); + ROUND(4); + ROUND(5); + ROUND(6); + ROUND(7); + ROUND(8); + ROUND(9); + +#undef G +#undef ROUND + + for (i = 0; i < 8; ++i) + state->h[i] ^= v[i] ^ v[i + 8]; + + block += BLAKE2S_BLOCK_SIZE; + --nblocks; + } +} + +EXPORT_SYMBOL(blake2s_compress_generic); + +MODULE_LICENSE("GPL v2"); +MODULE_DESCRIPTION("BLAKE2s hash function"); +MODULE_AUTHOR("Jason A. Donenfeld "); diff --git a/lib/crypto/blake2s-selftest.c b/lib/crypto/blake2s-selftest.c new file mode 100644 index 000000000000..79ef404a990d --- /dev/null +++ b/lib/crypto/blake2s-selftest.c @@ -0,0 +1,622 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#include +#include + +/* + * blake2s_testvecs[] generated with the program below (using libb2-dev and + * libssl-dev [OpenSSL]) + * + * #include + * #include + * #include + * + * #include + * #include + * + * #define BLAKE2S_TESTVEC_COUNT 256 + * + * static void print_vec(const uint8_t vec[], int len) + * { + * int i; + * + * printf(" { "); + * for (i = 0; i < len; i++) { + * if (i && (i % 12) == 0) + * printf("\n "); + * printf("0x%02x, ", vec[i]); + * } + * printf("},\n"); + * } + * + * int main(void) + * { + * uint8_t key[BLAKE2S_KEYBYTES]; + * uint8_t buf[BLAKE2S_TESTVEC_COUNT]; + * uint8_t hash[BLAKE2S_OUTBYTES]; + * int i, j; + * + * key[0] = key[1] = 1; + * for (i = 2; i < BLAKE2S_KEYBYTES; ++i) + * key[i] = key[i - 2] + key[i - 1]; + * + * for (i = 0; i < BLAKE2S_TESTVEC_COUNT; ++i) + * buf[i] = (uint8_t)i; + * + * printf("static const u8 blake2s_testvecs[][BLAKE2S_HASH_SIZE] __initconst = {\n"); + * + * for (i = 0; i < BLAKE2S_TESTVEC_COUNT; ++i) { + * int outlen = 1 + i % BLAKE2S_OUTBYTES; + * int keylen = (13 * i) % (BLAKE2S_KEYBYTES + 1); + * + * blake2s(hash, buf, key + BLAKE2S_KEYBYTES - keylen, outlen, i, + * keylen); + * print_vec(hash, outlen); + * } + * printf("};\n\n"); + * + * printf("static const u8 blake2s_hmac_testvecs[][BLAKE2S_HASH_SIZE] __initconst = {\n"); + * + * HMAC(EVP_blake2s256(), key, sizeof(key), buf, sizeof(buf), hash, NULL); + * print_vec(hash, BLAKE2S_OUTBYTES); + * + * HMAC(EVP_blake2s256(), buf, sizeof(buf), key, sizeof(key), hash, NULL); + * print_vec(hash, BLAKE2S_OUTBYTES); + * + * printf("};\n"); + * + * return 0; + *} + */ +static const u8 blake2s_testvecs[][BLAKE2S_HASH_SIZE] __initconst = { + { 0xa1, }, + { 0x7c, 0x89, }, + { 0x74, 0x0e, 0xd4, }, + { 0x47, 0x0c, 0x21, 0x15, }, + { 0x18, 0xd6, 0x9c, 0xa6, 0xc4, }, + { 0x13, 0x5d, 0x16, 0x63, 0x2e, 0xf9, }, + { 0x2c, 0xb5, 0x04, 0xb7, 0x99, 0xe2, 0x73, }, + { 0x9a, 0x0f, 0xd2, 0x39, 0xd6, 0x68, 0x1b, 0x92, }, + { 0xc8, 0xde, 0x7a, 0xea, 0x2f, 0xf4, 0xd2, 0xe3, 0x2b, }, + { 0x5b, 0xf9, 0x43, 0x52, 0x0c, 0x12, 0xba, 0xb5, 0x93, 0x9f, }, + { 0xc6, 0x2c, 0x4e, 0x80, 0xfc, 0x32, 0x5b, 0x33, 0xb8, 0xb8, 0x0a, }, + { 0xa7, 0x5c, 0xfd, 0x3a, 0xcc, 0xbf, 0x90, 0xca, 0xb7, 0x97, 0xde, 0xd8, }, + { 0x66, 0xca, 0x3c, 0xc4, 0x19, 0xef, 0x92, 0x66, 0x3f, 0x21, 0x8f, 0xda, + 0xb7, }, + { 0xba, 0xe5, 0xbb, 0x30, 0x25, 0x94, 0x6d, 0xc3, 0x89, 0x09, 0xc4, 0x25, + 0x52, 0x3e, }, + { 0xa2, 0xef, 0x0e, 0x52, 0x0b, 0x5f, 0xa2, 0x01, 0x6d, 0x0a, 0x25, 0xbc, + 0x57, 0xe2, 0x27, }, + { 0x4f, 0xe0, 0xf9, 0x52, 0x12, 0xda, 0x84, 0xb7, 0xab, 0xae, 0xb0, 0xa6, + 0x47, 0x2a, 0xc7, 0xf5, }, + { 0x56, 0xe7, 0xa8, 0x1c, 0x4c, 0xca, 0xed, 0x90, 0x31, 0xec, 0x87, 0x43, + 0xe7, 0x72, 0x08, 0xec, 0xbe, }, + { 0x7e, 0xdf, 0x80, 0x1c, 0x93, 0x33, 0xfd, 0x53, 0x44, 0xba, 0xfd, 0x96, + 0xe1, 0xbb, 0xb5, 0x65, 0xa5, 0x00, }, + { 0xec, 0x6b, 0xed, 0xf7, 0x7b, 0x62, 0x1d, 0x7d, 0xf4, 0x82, 0xf3, 0x1e, + 0x18, 0xff, 0x2b, 0xc4, 0x06, 0x20, 0x2a, }, + { 0x74, 0x98, 0xd7, 0x68, 0x63, 0xed, 0x87, 0xe4, 0x5d, 0x8d, 0x9e, 0x1d, + 0xfd, 0x2a, 0xbb, 0x86, 0xac, 0xe9, 0x2a, 0x89, }, + { 0x89, 0xc3, 0x88, 0xce, 0x2b, 0x33, 0x1e, 0x10, 0xd1, 0x37, 0x20, 0x86, + 0x28, 0x43, 0x70, 0xd9, 0xfb, 0x96, 0xd9, 0xb5, 0xd3, }, + { 0xcb, 0x56, 0x74, 0x41, 0x8d, 0x80, 0x01, 0x9a, 0x6b, 0x38, 0xe1, 0x41, + 0xad, 0x9c, 0x62, 0x74, 0xce, 0x35, 0xd5, 0x6c, 0x89, 0x6e, }, + { 0x79, 0xaf, 0x94, 0x59, 0x99, 0x26, 0xe1, 0xc9, 0x34, 0xfe, 0x7c, 0x22, + 0xf7, 0x43, 0xd7, 0x65, 0xd4, 0x48, 0x18, 0xac, 0x3d, 0xfd, 0x93, }, + { 0x85, 0x0d, 0xff, 0xb8, 0x3e, 0x87, 0x41, 0xb0, 0x95, 0xd3, 0x3d, 0x00, + 0x47, 0x55, 0x9e, 0xd2, 0x69, 0xea, 0xbf, 0xe9, 0x7a, 0x2d, 0x61, 0x45, }, + { 0x03, 0xe0, 0x85, 0xec, 0x54, 0xb5, 0x16, 0x53, 0xa8, 0xc4, 0x71, 0xe9, + 0x6a, 0xe7, 0xcb, 0xc4, 0x15, 0x02, 0xfc, 0x34, 0xa4, 0xa4, 0x28, 0x13, + 0xd1, }, + { 0xe3, 0x34, 0x4b, 0xe1, 0xd0, 0x4b, 0x55, 0x61, 0x8f, 0xc0, 0x24, 0x05, + 0xe6, 0xe0, 0x3d, 0x70, 0x24, 0x4d, 0xda, 0xb8, 0x91, 0x05, 0x29, 0x07, + 0x01, 0x3e, }, + { 0x61, 0xff, 0x01, 0x72, 0xb1, 0x4d, 0xf6, 0xfe, 0xd1, 0xd1, 0x08, 0x74, + 0xe6, 0x91, 0x44, 0xeb, 0x61, 0xda, 0x40, 0xaf, 0xfc, 0x8c, 0x91, 0x6b, + 0xec, 0x13, 0xed, }, + { 0xd4, 0x40, 0xd2, 0xa0, 0x7f, 0xc1, 0x58, 0x0c, 0x85, 0xa0, 0x86, 0xc7, + 0x86, 0xb9, 0x61, 0xc9, 0xea, 0x19, 0x86, 0x1f, 0xab, 0x07, 0xce, 0x37, + 0x72, 0x67, 0x09, 0xfc, }, + { 0x9e, 0xf8, 0x18, 0x67, 0x93, 0x10, 0x9b, 0x39, 0x75, 0xe8, 0x8b, 0x38, + 0x82, 0x7d, 0xb8, 0xb7, 0xa5, 0xaf, 0xe6, 0x6a, 0x22, 0x5e, 0x1f, 0x9c, + 0x95, 0x29, 0x19, 0xf2, 0x4b, }, + { 0xc8, 0x62, 0x25, 0xf5, 0x98, 0xc9, 0xea, 0xe5, 0x29, 0x3a, 0xd3, 0x22, + 0xeb, 0xeb, 0x07, 0x7c, 0x15, 0x07, 0xee, 0x15, 0x61, 0xbb, 0x05, 0x30, + 0x99, 0x7f, 0x11, 0xf6, 0x0a, 0x1d, }, + { 0x68, 0x70, 0xf7, 0x90, 0xa1, 0x8b, 0x1f, 0x0f, 0xbb, 0xce, 0xd2, 0x0e, + 0x33, 0x1f, 0x7f, 0xa9, 0x78, 0xa8, 0xa6, 0x81, 0x66, 0xab, 0x8d, 0xcd, + 0x58, 0x55, 0x3a, 0x0b, 0x7a, 0xdb, 0xb5, }, + { 0xdd, 0x35, 0xd2, 0xb4, 0xf6, 0xc7, 0xea, 0xab, 0x64, 0x24, 0x4e, 0xfe, + 0xe5, 0x3d, 0x4e, 0x95, 0x8b, 0x6d, 0x6c, 0xbc, 0xb0, 0xf8, 0x88, 0x61, + 0x09, 0xb7, 0x78, 0xa3, 0x31, 0xfe, 0xd9, 0x2f, }, + { 0x0a, }, + { 0x6e, 0xd4, }, + { 0x64, 0xe9, 0xd1, }, + { 0x30, 0xdd, 0x71, 0xef, }, + { 0x11, 0xb5, 0x0c, 0x87, 0xc9, }, + { 0x06, 0x1c, 0x6d, 0x04, 0x82, 0xd0, }, + { 0x5c, 0x42, 0x0b, 0xee, 0xc5, 0x9c, 0xb2, }, + { 0xe8, 0x29, 0xd6, 0xb4, 0x5d, 0xf7, 0x2b, 0x93, }, + { 0x18, 0xca, 0x27, 0x72, 0x43, 0x39, 0x16, 0xbc, 0x6a, }, + { 0x39, 0x8f, 0xfd, 0x64, 0xf5, 0x57, 0x23, 0xb0, 0x45, 0xf8, }, + { 0xbb, 0x3a, 0x78, 0x6b, 0x02, 0x1d, 0x0b, 0x16, 0xe3, 0xb2, 0x9a, }, + { 0xb8, 0xb4, 0x0b, 0xe5, 0xd4, 0x1d, 0x0d, 0x85, 0x49, 0x91, 0x35, 0xfa, }, + { 0x6d, 0x48, 0x2a, 0x0c, 0x42, 0x08, 0xbd, 0xa9, 0x78, 0x6f, 0x18, 0xaf, + 0xe2, }, + { 0x10, 0x45, 0xd4, 0x58, 0x88, 0xec, 0x4e, 0x1e, 0xf6, 0x14, 0x92, 0x64, + 0x7e, 0xb0, }, + { 0x8b, 0x0b, 0x95, 0xee, 0x92, 0xc6, 0x3b, 0x91, 0xf1, 0x1e, 0xeb, 0x51, + 0x98, 0x0a, 0x8d, }, + { 0xa3, 0x50, 0x4d, 0xa5, 0x1d, 0x03, 0x68, 0xe9, 0x57, 0x78, 0xd6, 0x04, + 0xf1, 0xc3, 0x94, 0xd8, }, + { 0xb8, 0x66, 0x6e, 0xdd, 0x46, 0x15, 0xae, 0x3d, 0x83, 0x7e, 0xcf, 0xe7, + 0x2c, 0xe8, 0x8f, 0xc7, 0x34, }, + { 0x2e, 0xc0, 0x1f, 0x29, 0xea, 0xf6, 0xb9, 0xe2, 0xc2, 0x93, 0xeb, 0x41, + 0x0d, 0xf0, 0x0a, 0x13, 0x0e, 0xa2, }, + { 0x71, 0xb8, 0x33, 0xa9, 0x1b, 0xac, 0xf1, 0xb5, 0x42, 0x8f, 0x5e, 0x81, + 0x34, 0x43, 0xb7, 0xa4, 0x18, 0x5c, 0x47, }, + { 0xda, 0x45, 0xb8, 0x2e, 0x82, 0x1e, 0xc0, 0x59, 0x77, 0x9d, 0xfa, 0xb4, + 0x1c, 0x5e, 0xa0, 0x2b, 0x33, 0x96, 0x5a, 0x58, }, + { 0xe3, 0x09, 0x05, 0xa9, 0xeb, 0x48, 0x13, 0xad, 0x71, 0x88, 0x81, 0x9a, + 0x3e, 0x2c, 0xe1, 0x23, 0x99, 0x13, 0x35, 0x9f, 0xb5, }, + { 0xb7, 0x86, 0x2d, 0x16, 0xe1, 0x04, 0x00, 0x47, 0x47, 0x61, 0x31, 0xfb, + 0x14, 0xac, 0xd8, 0xe9, 0xe3, 0x49, 0xbd, 0xf7, 0x9c, 0x3f, }, + { 0x7f, 0xd9, 0x95, 0xa8, 0xa7, 0xa0, 0xcc, 0xba, 0xef, 0xb1, 0x0a, 0xa9, + 0x21, 0x62, 0x08, 0x0f, 0x1b, 0xff, 0x7b, 0x9d, 0xae, 0xb2, 0x95, }, + { 0x85, 0x99, 0xea, 0x33, 0xe0, 0x56, 0xff, 0x13, 0xc6, 0x61, 0x8c, 0xf9, + 0x57, 0x05, 0x03, 0x11, 0xf9, 0xfb, 0x3a, 0xf7, 0xce, 0xbb, 0x52, 0x30, }, + { 0xb2, 0x72, 0x9c, 0xf8, 0x77, 0x4e, 0x8f, 0x6b, 0x01, 0x6c, 0xff, 0x4e, + 0x4f, 0x02, 0xd2, 0xbc, 0xeb, 0x51, 0x28, 0x99, 0x50, 0xab, 0xc4, 0x42, + 0xe3, }, + { 0x8b, 0x0a, 0xb5, 0x90, 0x8f, 0xf5, 0x7b, 0xdd, 0xba, 0x47, 0x37, 0xc9, + 0x2a, 0xd5, 0x4b, 0x25, 0x08, 0x8b, 0x02, 0x17, 0xa7, 0x9e, 0x6b, 0x6e, + 0xe3, 0x90, }, + { 0x90, 0xdd, 0xf7, 0x75, 0xa7, 0xa3, 0x99, 0x5e, 0x5b, 0x7d, 0x75, 0xc3, + 0x39, 0x6b, 0xa0, 0xe2, 0x44, 0x53, 0xb1, 0x9e, 0xc8, 0xf1, 0x77, 0x10, + 0x58, 0x06, 0x9a, }, + { 0x99, 0x52, 0xf0, 0x49, 0xa8, 0x8c, 0xec, 0xa6, 0x97, 0x32, 0x13, 0xb5, + 0xf7, 0xa3, 0x8e, 0xfb, 0x4b, 0x59, 0x31, 0x3d, 0x01, 0x59, 0x98, 0x5d, + 0x53, 0x03, 0x1a, 0x39, }, + { 0x9f, 0xe0, 0xc2, 0xe5, 0x5d, 0x93, 0xd6, 0x9b, 0x47, 0x8f, 0x9b, 0xe0, + 0x26, 0x35, 0x84, 0x20, 0x1d, 0xc5, 0x53, 0x10, 0x0f, 0x22, 0xb9, 0xb5, + 0xd4, 0x36, 0xb1, 0xac, 0x73, }, + { 0x30, 0x32, 0x20, 0x3b, 0x10, 0x28, 0xec, 0x1f, 0x4f, 0x9b, 0x47, 0x59, + 0xeb, 0x7b, 0xee, 0x45, 0xfb, 0x0c, 0x49, 0xd8, 0x3d, 0x69, 0xbd, 0x90, + 0x2c, 0xf0, 0x9e, 0x8d, 0xbf, 0xd5, }, + { 0x2a, 0x37, 0x73, 0x7f, 0xf9, 0x96, 0x19, 0xaa, 0x25, 0xd8, 0x13, 0x28, + 0x01, 0x29, 0x89, 0xdf, 0x6e, 0x0c, 0x9b, 0x43, 0x44, 0x51, 0xe9, 0x75, + 0x26, 0x0c, 0xb7, 0x87, 0x66, 0x0b, 0x5f, }, + { 0x23, 0xdf, 0x96, 0x68, 0x91, 0x86, 0xd0, 0x93, 0x55, 0x33, 0x24, 0xf6, + 0xba, 0x08, 0x75, 0x5b, 0x59, 0x11, 0x69, 0xb8, 0xb9, 0xe5, 0x2c, 0x77, + 0x02, 0xf6, 0x47, 0xee, 0x81, 0xdd, 0xb9, 0x06, }, + { 0x9d, }, + { 0x9d, 0x7d, }, + { 0xfd, 0xc3, 0xda, }, + { 0xe8, 0x82, 0xcd, 0x21, }, + { 0xc3, 0x1d, 0x42, 0x4c, 0x74, }, + { 0xe9, 0xda, 0xf1, 0xa2, 0xe5, 0x7c, }, + { 0x52, 0xb8, 0x6f, 0x81, 0x5c, 0x3a, 0x4c, }, + { 0x5b, 0x39, 0x26, 0xfc, 0x92, 0x5e, 0xe0, 0x49, }, + { 0x59, 0xe4, 0x7c, 0x93, 0x1c, 0xf9, 0x28, 0x93, 0xde, }, + { 0xde, 0xdf, 0xb2, 0x43, 0x61, 0x0b, 0x86, 0x16, 0x4c, 0x2e, }, + { 0x14, 0x8f, 0x75, 0x51, 0xaf, 0xb9, 0xee, 0x51, 0x5a, 0xae, 0x23, }, + { 0x43, 0x5f, 0x50, 0xd5, 0x70, 0xb0, 0x5b, 0x87, 0xf5, 0xd9, 0xb3, 0x6d, }, + { 0x66, 0x0a, 0x64, 0x93, 0x79, 0x71, 0x94, 0x40, 0xb7, 0x68, 0x2d, 0xd3, + 0x63, }, + { 0x15, 0x00, 0xc4, 0x0c, 0x7d, 0x1b, 0x10, 0xa9, 0x73, 0x1b, 0x90, 0x6f, + 0xe6, 0xa9, }, + { 0x34, 0x75, 0xf3, 0x86, 0x8f, 0x56, 0xcf, 0x2a, 0x0a, 0xf2, 0x62, 0x0a, + 0xf6, 0x0e, 0x20, }, + { 0xb1, 0xde, 0xc9, 0xf5, 0xdb, 0xf3, 0x2f, 0x4c, 0xd6, 0x41, 0x7d, 0x39, + 0x18, 0x3e, 0xc7, 0xc3, }, + { 0xc5, 0x89, 0xb2, 0xf8, 0xb8, 0xc0, 0xa3, 0xb9, 0x3b, 0x10, 0x6d, 0x7c, + 0x92, 0xfc, 0x7f, 0x34, 0x41, }, + { 0xc4, 0xd8, 0xef, 0xba, 0xef, 0xd2, 0xaa, 0xc5, 0x6c, 0x8e, 0x3e, 0xbb, + 0x12, 0xfc, 0x0f, 0x72, 0xbf, 0x0f, }, + { 0xdd, 0x91, 0xd1, 0x15, 0x9e, 0x7d, 0xf8, 0xc1, 0xb9, 0x14, 0x63, 0x96, + 0xb5, 0xcb, 0x83, 0x1d, 0x35, 0x1c, 0xec, }, + { 0xa9, 0xf8, 0x52, 0xc9, 0x67, 0x76, 0x2b, 0xad, 0xfb, 0xd8, 0x3a, 0xa6, + 0x74, 0x02, 0xae, 0xb8, 0x25, 0x2c, 0x63, 0x49, }, + { 0x77, 0x1f, 0x66, 0x70, 0xfd, 0x50, 0x29, 0xaa, 0xeb, 0xdc, 0xee, 0xba, + 0x75, 0x98, 0xdc, 0x93, 0x12, 0x3f, 0xdc, 0x7c, 0x38, }, + { 0xe2, 0xe1, 0x89, 0x5c, 0x37, 0x38, 0x6a, 0xa3, 0x40, 0xac, 0x3f, 0xb0, + 0xca, 0xfc, 0xa7, 0xf3, 0xea, 0xf9, 0x0f, 0x5d, 0x8e, 0x39, }, + { 0x0f, 0x67, 0xc8, 0x38, 0x01, 0xb1, 0xb7, 0xb8, 0xa2, 0xe7, 0x0a, 0x6d, + 0xd2, 0x63, 0x69, 0x9e, 0xcc, 0xf0, 0xf2, 0xbe, 0x9b, 0x98, 0xdd, }, + { 0x13, 0xe1, 0x36, 0x30, 0xfe, 0xc6, 0x01, 0x8a, 0xa1, 0x63, 0x96, 0x59, + 0xc2, 0xa9, 0x68, 0x3f, 0x58, 0xd4, 0x19, 0x0c, 0x40, 0xf3, 0xde, 0x02, }, + { 0xa3, 0x9e, 0xce, 0xda, 0x42, 0xee, 0x8c, 0x6c, 0x5a, 0x7d, 0xdc, 0x89, + 0x02, 0x77, 0xdd, 0xe7, 0x95, 0xbb, 0xff, 0x0d, 0xa4, 0xb5, 0x38, 0x1e, + 0xaf, }, + { 0x9a, 0xf6, 0xb5, 0x9a, 0x4f, 0xa9, 0x4f, 0x2c, 0x35, 0x3c, 0x24, 0xdc, + 0x97, 0x6f, 0xd9, 0xa1, 0x7d, 0x1a, 0x85, 0x0b, 0xf5, 0xda, 0x2e, 0xe7, + 0xb1, 0x1d, }, + { 0x84, 0x1e, 0x8e, 0x3d, 0x45, 0xa5, 0xf2, 0x27, 0xf3, 0x31, 0xfe, 0xb9, + 0xfb, 0xc5, 0x45, 0x99, 0x99, 0xdd, 0x93, 0x43, 0x02, 0xee, 0x58, 0xaf, + 0xee, 0x6a, 0xbe, }, + { 0x07, 0x2f, 0xc0, 0xa2, 0x04, 0xc4, 0xab, 0x7c, 0x26, 0xbb, 0xa8, 0xd8, + 0xe3, 0x1c, 0x75, 0x15, 0x64, 0x5d, 0x02, 0x6a, 0xf0, 0x86, 0xe9, 0xcd, + 0x5c, 0xef, 0xa3, 0x25, }, + { 0x2f, 0x3b, 0x1f, 0xb5, 0x91, 0x8f, 0x86, 0xe0, 0xdc, 0x31, 0x48, 0xb6, + 0xa1, 0x8c, 0xfd, 0x75, 0xbb, 0x7d, 0x3d, 0xc1, 0xf0, 0x10, 0x9a, 0xd8, + 0x4b, 0x0e, 0xe3, 0x94, 0x9f, }, + { 0x29, 0xbb, 0x8f, 0x6c, 0xd1, 0xf2, 0xb6, 0xaf, 0xe5, 0xe3, 0x2d, 0xdc, + 0x6f, 0xa4, 0x53, 0x88, 0xd8, 0xcf, 0x4d, 0x45, 0x42, 0x62, 0xdb, 0xdf, + 0xf8, 0x45, 0xc2, 0x13, 0xec, 0x35, }, + { 0x06, 0x3c, 0xe3, 0x2c, 0x15, 0xc6, 0x43, 0x03, 0x81, 0xfb, 0x08, 0x76, + 0x33, 0xcb, 0x02, 0xc1, 0xba, 0x33, 0xe5, 0xe0, 0xd1, 0x92, 0xa8, 0x46, + 0x28, 0x3f, 0x3e, 0x9d, 0x2c, 0x44, 0x54, }, + { 0xea, 0xbb, 0x96, 0xf8, 0xd1, 0x8b, 0x04, 0x11, 0x40, 0x78, 0x42, 0x02, + 0x19, 0xd1, 0xbc, 0x65, 0x92, 0xd3, 0xc3, 0xd6, 0xd9, 0x19, 0xe7, 0xc3, + 0x40, 0x97, 0xbd, 0xd4, 0xed, 0xfa, 0x5e, 0x28, }, + { 0x02, }, + { 0x52, 0xa8, }, + { 0x38, 0x25, 0x0d, }, + { 0xe3, 0x04, 0xd4, 0x92, }, + { 0x97, 0xdb, 0xf7, 0x81, 0xca, }, + { 0x8a, 0x56, 0x9d, 0x62, 0x56, 0xcc, }, + { 0xa1, 0x8e, 0x3c, 0x72, 0x8f, 0x63, 0x03, }, + { 0xf7, 0xf3, 0x39, 0x09, 0x0a, 0xa1, 0xbb, 0x23, }, + { 0x6b, 0x03, 0xc0, 0xe9, 0xd9, 0x83, 0x05, 0x22, 0x01, }, + { 0x1b, 0x4b, 0xf5, 0xd6, 0x4f, 0x05, 0x75, 0x91, 0x4c, 0x7f, }, + { 0x4c, 0x8c, 0x25, 0x20, 0x21, 0xcb, 0xc2, 0x4b, 0x3a, 0x5b, 0x8d, }, + { 0x56, 0xe2, 0x77, 0xa0, 0xb6, 0x9f, 0x81, 0xec, 0x83, 0x75, 0xc4, 0xf9, }, + { 0x71, 0x70, 0x0f, 0xad, 0x4d, 0x35, 0x81, 0x9d, 0x88, 0x69, 0xf9, 0xaa, + 0xd3, }, + { 0x50, 0x6e, 0x86, 0x6e, 0x43, 0xc0, 0xc2, 0x44, 0xc2, 0xe2, 0xa0, 0x1c, + 0xb7, 0x9a, }, + { 0xe4, 0x7e, 0x72, 0xc6, 0x12, 0x8e, 0x7c, 0xfc, 0xbd, 0xe2, 0x08, 0x31, + 0x3d, 0x47, 0x3d, }, + { 0x08, 0x97, 0x5b, 0x80, 0xae, 0xc4, 0x1d, 0x50, 0x77, 0xdf, 0x1f, 0xd0, + 0x24, 0xf0, 0x17, 0xc0, }, + { 0x01, 0xb6, 0x29, 0xf4, 0xaf, 0x78, 0x5f, 0xb6, 0x91, 0xdd, 0x76, 0x76, + 0xd2, 0xfd, 0x0c, 0x47, 0x40, }, + { 0xa1, 0xd8, 0x09, 0x97, 0x7a, 0xa6, 0xc8, 0x94, 0xf6, 0x91, 0x7b, 0xae, + 0x2b, 0x9f, 0x0d, 0x83, 0x48, 0xf7, }, + { 0x12, 0xd5, 0x53, 0x7d, 0x9a, 0xb0, 0xbe, 0xd9, 0xed, 0xe9, 0x9e, 0xee, + 0x61, 0x5b, 0x42, 0xf2, 0xc0, 0x73, 0xc0, }, + { 0xd5, 0x77, 0xd6, 0x5c, 0x6e, 0xa5, 0x69, 0x2b, 0x3b, 0x8c, 0xd6, 0x7d, + 0x1d, 0xbe, 0x2c, 0xa1, 0x02, 0x21, 0xcd, 0x29, }, + { 0xa4, 0x98, 0x80, 0xca, 0x22, 0xcf, 0x6a, 0xab, 0x5e, 0x40, 0x0d, 0x61, + 0x08, 0x21, 0xef, 0xc0, 0x6c, 0x52, 0xb4, 0xb0, 0x53, }, + { 0xbf, 0xaf, 0x8f, 0x3b, 0x7a, 0x97, 0x33, 0xe5, 0xca, 0x07, 0x37, 0xfd, + 0x15, 0xdf, 0xce, 0x26, 0x2a, 0xb1, 0xa7, 0x0b, 0xb3, 0xac, }, + { 0x16, 0x22, 0xe1, 0xbc, 0x99, 0x4e, 0x01, 0xf0, 0xfa, 0xff, 0x8f, 0xa5, + 0x0c, 0x61, 0xb0, 0xad, 0xcc, 0xb1, 0xe1, 0x21, 0x46, 0xfa, 0x2e, }, + { 0x11, 0x5b, 0x0b, 0x2b, 0xe6, 0x14, 0xc1, 0xd5, 0x4d, 0x71, 0x5e, 0x17, + 0xea, 0x23, 0xdd, 0x6c, 0xbd, 0x1d, 0xbe, 0x12, 0x1b, 0xee, 0x4c, 0x1a, }, + { 0x40, 0x88, 0x22, 0xf3, 0x20, 0x6c, 0xed, 0xe1, 0x36, 0x34, 0x62, 0x2c, + 0x98, 0x83, 0x52, 0xe2, 0x25, 0xee, 0xe9, 0xf5, 0xe1, 0x17, 0xf0, 0x5c, + 0xae, }, + { 0xc3, 0x76, 0x37, 0xde, 0x95, 0x8c, 0xca, 0x2b, 0x0c, 0x23, 0xe7, 0xb5, + 0x38, 0x70, 0x61, 0xcc, 0xff, 0xd3, 0x95, 0x7b, 0xf3, 0xff, 0x1f, 0x9d, + 0x59, 0x00, }, + { 0x0c, 0x19, 0x52, 0x05, 0x22, 0x53, 0xcb, 0x48, 0xd7, 0x10, 0x0e, 0x7e, + 0x14, 0x69, 0xb5, 0xa2, 0x92, 0x43, 0xa3, 0x9e, 0x4b, 0x8f, 0x51, 0x2c, + 0x5a, 0x2c, 0x3b, }, + { 0xe1, 0x9d, 0x70, 0x70, 0x28, 0xec, 0x86, 0x40, 0x55, 0x33, 0x56, 0xda, + 0x88, 0xca, 0xee, 0xc8, 0x6a, 0x20, 0xb1, 0xe5, 0x3d, 0x57, 0xf8, 0x3c, + 0x10, 0x07, 0x2a, 0xc4, }, + { 0x0b, 0xae, 0xf1, 0xc4, 0x79, 0xee, 0x1b, 0x3d, 0x27, 0x35, 0x8d, 0x14, + 0xd6, 0xae, 0x4e, 0x3c, 0xe9, 0x53, 0x50, 0xb5, 0xcc, 0x0c, 0xf7, 0xdf, + 0xee, 0xa1, 0x74, 0xd6, 0x71, }, + { 0xe6, 0xa4, 0xf4, 0x99, 0x98, 0xb9, 0x80, 0xea, 0x96, 0x7f, 0x4f, 0x33, + 0xcf, 0x74, 0x25, 0x6f, 0x17, 0x6c, 0xbf, 0xf5, 0x5c, 0x38, 0xd0, 0xff, + 0x96, 0xcb, 0x13, 0xf9, 0xdf, 0xfd, }, + { 0xbe, 0x92, 0xeb, 0xba, 0x44, 0x2c, 0x24, 0x74, 0xd4, 0x03, 0x27, 0x3c, + 0x5d, 0x5b, 0x03, 0x30, 0x87, 0x63, 0x69, 0xe0, 0xb8, 0x94, 0xf4, 0x44, + 0x7e, 0xad, 0xcd, 0x20, 0x12, 0x16, 0x79, }, + { 0x30, 0xf1, 0xc4, 0x8e, 0x05, 0x90, 0x2a, 0x97, 0x63, 0x94, 0x46, 0xff, + 0xce, 0xd8, 0x67, 0xa7, 0xac, 0x33, 0x8c, 0x95, 0xb7, 0xcd, 0xa3, 0x23, + 0x98, 0x9d, 0x76, 0x6c, 0x9d, 0xa8, 0xd6, 0x8a, }, + { 0xbe, }, + { 0x17, 0x6c, }, + { 0x1a, 0x42, 0x4f, }, + { 0xba, 0xaf, 0xb7, 0x65, }, + { 0xc2, 0x63, 0x43, 0x6a, 0xea, }, + { 0xe4, 0x4d, 0xad, 0xf2, 0x0b, 0x02, }, + { 0x04, 0xc7, 0xc4, 0x7f, 0xa9, 0x2b, 0xce, }, + { 0x66, 0xf6, 0x67, 0xcb, 0x03, 0x53, 0xc8, 0xf1, }, + { 0x56, 0xa3, 0x60, 0x78, 0xc9, 0x5f, 0x70, 0x1b, 0x5e, }, + { 0x99, 0xff, 0x81, 0x7c, 0x13, 0x3c, 0x29, 0x79, 0x4b, 0x65, }, + { 0x51, 0x10, 0x50, 0x93, 0x01, 0x93, 0xb7, 0x01, 0xc9, 0x18, 0xb7, }, + { 0x8e, 0x3c, 0x42, 0x1e, 0x5e, 0x7d, 0xc1, 0x50, 0x70, 0x1f, 0x00, 0x98, }, + { 0x5f, 0xd9, 0x9b, 0xc8, 0xd7, 0xb2, 0x72, 0x62, 0x1a, 0x1e, 0xba, 0x92, + 0xe9, }, + { 0x70, 0x2b, 0xba, 0xfe, 0xad, 0x5d, 0x96, 0x3f, 0x27, 0xc2, 0x41, 0x6d, + 0xc4, 0xb3, }, + { 0xae, 0xe0, 0xd5, 0xd4, 0xc7, 0xae, 0x15, 0x5e, 0xdc, 0xdd, 0x33, 0x60, + 0xd7, 0xd3, 0x5e, }, + { 0x79, 0x8e, 0xbc, 0x9e, 0x20, 0xb9, 0x19, 0x4b, 0x63, 0x80, 0xf3, 0x16, + 0xaf, 0x39, 0xbd, 0x92, }, + { 0xc2, 0x0e, 0x85, 0xa0, 0x0b, 0x9a, 0xb0, 0xec, 0xde, 0x38, 0xd3, 0x10, + 0xd9, 0xa7, 0x66, 0x27, 0xcf, }, + { 0x0e, 0x3b, 0x75, 0x80, 0x67, 0x14, 0x0c, 0x02, 0x90, 0xd6, 0xb3, 0x02, + 0x81, 0xf6, 0xa6, 0x87, 0xce, 0x58, }, + { 0x79, 0xb5, 0xe9, 0x5d, 0x52, 0x4d, 0xf7, 0x59, 0xf4, 0x2e, 0x27, 0xdd, + 0xb3, 0xed, 0x57, 0x5b, 0x82, 0xea, 0x6f, }, + { 0xa2, 0x97, 0xf5, 0x80, 0x02, 0x3d, 0xde, 0xa3, 0xf9, 0xf6, 0xab, 0xe3, + 0x57, 0x63, 0x7b, 0x9b, 0x10, 0x42, 0x6f, 0xf2, }, + { 0x12, 0x7a, 0xfc, 0xb7, 0x67, 0x06, 0x0c, 0x78, 0x1a, 0xfe, 0x88, 0x4f, + 0xc6, 0xac, 0x52, 0x96, 0x64, 0x28, 0x97, 0x84, 0x06, }, + { 0xc5, 0x04, 0x44, 0x6b, 0xb2, 0xa5, 0xa4, 0x66, 0xe1, 0x76, 0xa2, 0x51, + 0xf9, 0x59, 0x69, 0x97, 0x56, 0x0b, 0xbf, 0x50, 0xb3, 0x34, }, + { 0x21, 0x32, 0x6b, 0x42, 0xb5, 0xed, 0x71, 0x8d, 0xf7, 0x5a, 0x35, 0xe3, + 0x90, 0xe2, 0xee, 0xaa, 0x89, 0xf6, 0xc9, 0x9c, 0x4d, 0x73, 0xf4, }, + { 0x4c, 0xa6, 0x09, 0xf4, 0x48, 0xe7, 0x46, 0xbc, 0x49, 0xfc, 0xe5, 0xda, + 0xd1, 0x87, 0x13, 0x17, 0x4c, 0x59, 0x71, 0x26, 0x5b, 0x2c, 0x42, 0xb7, }, + { 0x13, 0x63, 0xf3, 0x40, 0x02, 0xe5, 0xa3, 0x3a, 0x5e, 0x8e, 0xf8, 0xb6, + 0x8a, 0x49, 0x60, 0x76, 0x34, 0x72, 0x94, 0x73, 0xf6, 0xd9, 0x21, 0x6a, + 0x26, }, + { 0xdf, 0x75, 0x16, 0x10, 0x1b, 0x5e, 0x81, 0xc3, 0xc8, 0xde, 0x34, 0x24, + 0xb0, 0x98, 0xeb, 0x1b, 0x8f, 0xa1, 0x9b, 0x05, 0xee, 0xa5, 0xe9, 0x35, + 0xf4, 0x1d, }, + { 0xcd, 0x21, 0x93, 0x6e, 0x5b, 0xa0, 0x26, 0x2b, 0x21, 0x0e, 0xa0, 0xb9, + 0x1c, 0xb5, 0xbb, 0xb8, 0xf8, 0x1e, 0xff, 0x5c, 0xa8, 0xf9, 0x39, 0x46, + 0x4e, 0x29, 0x26, }, + { 0x73, 0x7f, 0x0e, 0x3b, 0x0b, 0x5c, 0xf9, 0x60, 0xaa, 0x88, 0xa1, 0x09, + 0xb1, 0x5d, 0x38, 0x7b, 0x86, 0x8f, 0x13, 0x7a, 0x8d, 0x72, 0x7a, 0x98, + 0x1a, 0x5b, 0xff, 0xc9, }, + { 0xd3, 0x3c, 0x61, 0x71, 0x44, 0x7e, 0x31, 0x74, 0x98, 0x9d, 0x9a, 0xd2, + 0x27, 0xf3, 0x46, 0x43, 0x42, 0x51, 0xd0, 0x5f, 0xe9, 0x1c, 0x5c, 0x69, + 0xbf, 0xf6, 0xbe, 0x3c, 0x40, }, + { 0x31, 0x99, 0x31, 0x9f, 0xaa, 0x43, 0x2e, 0x77, 0x3e, 0x74, 0x26, 0x31, + 0x5e, 0x61, 0xf1, 0x87, 0xe2, 0xeb, 0x9b, 0xcd, 0xd0, 0x3a, 0xee, 0x20, + 0x7e, 0x10, 0x0a, 0x0b, 0x7e, 0xfa, }, + { 0xa4, 0x27, 0x80, 0x67, 0x81, 0x2a, 0xa7, 0x62, 0xf7, 0x6e, 0xda, 0xd4, + 0x5c, 0x39, 0x74, 0xad, 0x7e, 0xbe, 0xad, 0xa5, 0x84, 0x7f, 0xa9, 0x30, + 0x5d, 0xdb, 0xe2, 0x05, 0x43, 0xf7, 0x1b, }, + { 0x0b, 0x37, 0xd8, 0x02, 0xe1, 0x83, 0xd6, 0x80, 0xf2, 0x35, 0xc2, 0xb0, + 0x37, 0xef, 0xef, 0x5e, 0x43, 0x93, 0xf0, 0x49, 0x45, 0x0a, 0xef, 0xb5, + 0x76, 0x70, 0x12, 0x44, 0xc4, 0xdb, 0xf5, 0x7a, }, + { 0x1f, }, + { 0x82, 0x60, }, + { 0xcc, 0xe3, 0x08, }, + { 0x56, 0x17, 0xe4, 0x59, }, + { 0xe2, 0xd7, 0x9e, 0xc4, 0x4c, }, + { 0xb2, 0xad, 0xd3, 0x78, 0x58, 0x5a, }, + { 0xce, 0x43, 0xb4, 0x02, 0x96, 0xab, 0x3c, }, + { 0xe6, 0x05, 0x1a, 0x73, 0x22, 0x32, 0xbb, 0x77, }, + { 0x23, 0xe7, 0xda, 0xfe, 0x2c, 0xef, 0x8c, 0x22, 0xec, }, + { 0xe9, 0x8e, 0x55, 0x38, 0xd1, 0xd7, 0x35, 0x23, 0x98, 0xc7, }, + { 0xb5, 0x81, 0x1a, 0xe5, 0xb5, 0xa5, 0xd9, 0x4d, 0xca, 0x41, 0xe7, }, + { 0x41, 0x16, 0x16, 0x95, 0x8d, 0x9e, 0x0c, 0xea, 0x8c, 0x71, 0x9a, 0xc1, }, + { 0x7c, 0x33, 0xc0, 0xa4, 0x00, 0x62, 0xea, 0x60, 0x67, 0xe4, 0x20, 0xbc, + 0x5b, }, + { 0xdb, 0xb1, 0xdc, 0xfd, 0x08, 0xc0, 0xde, 0x82, 0xd1, 0xde, 0x38, 0xc0, + 0x90, 0x48, }, + { 0x37, 0x18, 0x2e, 0x0d, 0x61, 0xaa, 0x61, 0xd7, 0x86, 0x20, 0x16, 0x60, + 0x04, 0xd9, 0xd5, }, + { 0xb0, 0xcf, 0x2c, 0x4c, 0x5e, 0x5b, 0x4f, 0x2a, 0x23, 0x25, 0x58, 0x47, + 0xe5, 0x31, 0x06, 0x70, }, + { 0x91, 0xa0, 0xa3, 0x86, 0x4e, 0xe0, 0x72, 0x38, 0x06, 0x67, 0x59, 0x5c, + 0x70, 0x25, 0xdb, 0x33, 0x27, }, + { 0x44, 0x58, 0x66, 0xb8, 0x58, 0xc7, 0x13, 0xed, 0x4c, 0xc0, 0xf4, 0x9a, + 0x1e, 0x67, 0x75, 0x33, 0xb6, 0xb8, }, + { 0x7f, 0x98, 0x4a, 0x8e, 0x50, 0xa2, 0x5c, 0xcd, 0x59, 0xde, 0x72, 0xb3, + 0x9d, 0xc3, 0x09, 0x8a, 0xab, 0x56, 0xf1, }, + { 0x80, 0x96, 0x49, 0x1a, 0x59, 0xa2, 0xc5, 0xd5, 0xa7, 0x20, 0x8a, 0xb7, + 0x27, 0x62, 0x84, 0x43, 0xc6, 0xe1, 0x1b, 0x5d, }, + { 0x6b, 0xb7, 0x2b, 0x26, 0x62, 0x14, 0x70, 0x19, 0x3d, 0x4d, 0xac, 0xac, + 0x63, 0x58, 0x5e, 0x94, 0xb5, 0xb7, 0xe8, 0xe8, 0xa2, }, + { 0x20, 0xa8, 0xc0, 0xfd, 0x63, 0x3d, 0x6e, 0x98, 0xcf, 0x0c, 0x49, 0x98, + 0xe4, 0x5a, 0xfe, 0x8c, 0xaa, 0x70, 0x82, 0x1c, 0x7b, 0x74, }, + { 0xc8, 0xe8, 0xdd, 0xdf, 0x69, 0x30, 0x01, 0xc2, 0x0f, 0x7e, 0x2f, 0x11, + 0xcc, 0x3e, 0x17, 0xa5, 0x69, 0x40, 0x3f, 0x0e, 0x79, 0x7f, 0xcf, }, + { 0xdb, 0x61, 0xc0, 0xe2, 0x2e, 0x49, 0x07, 0x31, 0x1d, 0x91, 0x42, 0x8a, + 0xfc, 0x5e, 0xd3, 0xf8, 0x56, 0x1f, 0x2b, 0x73, 0xfd, 0x9f, 0xb2, 0x8e, }, + { 0x0c, 0x89, 0x55, 0x0c, 0x1f, 0x59, 0x2c, 0x9d, 0x1b, 0x29, 0x1d, 0x41, + 0x1d, 0xe6, 0x47, 0x8f, 0x8c, 0x2b, 0xea, 0x8f, 0xf0, 0xff, 0x21, 0x70, + 0x88, }, + { 0x12, 0x18, 0x95, 0xa6, 0x59, 0xb1, 0x31, 0x24, 0x45, 0x67, 0x55, 0xa4, + 0x1a, 0x2d, 0x48, 0x67, 0x1b, 0x43, 0x88, 0x2d, 0x8e, 0xa0, 0x70, 0xb3, + 0xc6, 0xbb, }, + { 0xe7, 0xb1, 0x1d, 0xb2, 0x76, 0x4d, 0x68, 0x68, 0x68, 0x23, 0x02, 0x55, + 0x3a, 0xe2, 0xe5, 0xd5, 0x4b, 0x43, 0xf9, 0x34, 0x77, 0x5c, 0xa1, 0xf5, + 0x55, 0xfd, 0x4f, }, + { 0x8c, 0x87, 0x5a, 0x08, 0x3a, 0x73, 0xad, 0x61, 0xe1, 0xe7, 0x99, 0x7e, + 0xf0, 0x5d, 0xe9, 0x5d, 0x16, 0x43, 0x80, 0x2f, 0xd0, 0x66, 0x34, 0xe2, + 0x42, 0x64, 0x3b, 0x1a, }, + { 0x39, 0xc1, 0x99, 0xcf, 0x22, 0xbf, 0x16, 0x8f, 0x9f, 0x80, 0x7f, 0x95, + 0x0a, 0x05, 0x67, 0x27, 0xe7, 0x15, 0xdf, 0x9d, 0xb2, 0xfe, 0x1c, 0xb5, + 0x1d, 0x60, 0x8f, 0x8a, 0x1d, }, + { 0x9b, 0x6e, 0x08, 0x09, 0x06, 0x73, 0xab, 0x68, 0x02, 0x62, 0x1a, 0xe4, + 0xd4, 0xdf, 0xc7, 0x02, 0x4c, 0x6a, 0x5f, 0xfd, 0x23, 0xac, 0xae, 0x6d, + 0x43, 0xa4, 0x7a, 0x50, 0x60, 0x3c, }, + { 0x1d, 0xb4, 0xc6, 0xe1, 0xb1, 0x4b, 0xe3, 0xf2, 0xe2, 0x1a, 0x73, 0x1b, + 0xa0, 0x92, 0xa7, 0xf5, 0xff, 0x8f, 0x8b, 0x5d, 0xdf, 0xa8, 0x04, 0xb3, + 0xb0, 0xf7, 0xcc, 0x12, 0xfa, 0x35, 0x46, }, + { 0x49, 0x45, 0x97, 0x11, 0x0f, 0x1c, 0x60, 0x8e, 0xe8, 0x47, 0x30, 0xcf, + 0x60, 0xa8, 0x71, 0xc5, 0x1b, 0xe9, 0x39, 0x4d, 0x49, 0xb6, 0x12, 0x1f, + 0x24, 0xab, 0x37, 0xff, 0x83, 0xc2, 0xe1, 0x3a, }, + { 0x60, }, + { 0x24, 0x26, }, + { 0x47, 0xeb, 0xc9, }, + { 0x4a, 0xd0, 0xbc, 0xf0, }, + { 0x8e, 0x2b, 0xc9, 0x85, 0x3c, }, + { 0xa2, 0x07, 0x15, 0xb8, 0x12, 0x74, }, + { 0x0f, 0xdb, 0x5b, 0x33, 0x69, 0xfe, 0x4b, }, + { 0xa2, 0x86, 0x54, 0xf4, 0xfd, 0xb2, 0xd4, 0xe6, }, + { 0xbb, 0x84, 0x78, 0x49, 0x27, 0x8e, 0x61, 0xda, 0x60, }, + { 0x04, 0xc3, 0xcd, 0xaa, 0x8f, 0xa7, 0x03, 0xc9, 0xf9, 0xb6, }, + { 0xf8, 0x27, 0x1d, 0x61, 0xdc, 0x21, 0x42, 0xdd, 0xad, 0x92, 0x40, }, + { 0x12, 0x87, 0xdf, 0xc2, 0x41, 0x45, 0x5a, 0x36, 0x48, 0x5b, 0x51, 0x2b, }, + { 0xbb, 0x37, 0x5d, 0x1f, 0xf1, 0x68, 0x7a, 0xc4, 0xa5, 0xd2, 0xa4, 0x91, + 0x8d, }, + { 0x5b, 0x27, 0xd1, 0x04, 0x54, 0x52, 0x9f, 0xa3, 0x47, 0x86, 0x33, 0x33, + 0xbf, 0xa0, }, + { 0xcf, 0x04, 0xea, 0xf8, 0x03, 0x2a, 0x43, 0xff, 0xa6, 0x68, 0x21, 0x4c, + 0xd5, 0x4b, 0xed, }, + { 0xaf, 0xb8, 0xbc, 0x63, 0x0f, 0x18, 0x4d, 0xe2, 0x7a, 0xdd, 0x46, 0x44, + 0xc8, 0x24, 0x0a, 0xb7, }, + { 0x3e, 0xdc, 0x36, 0xe4, 0x89, 0xb1, 0xfa, 0xc6, 0x40, 0x93, 0x2e, 0x75, + 0xb2, 0x15, 0xd1, 0xb1, 0x10, }, + { 0x6c, 0xd8, 0x20, 0x3b, 0x82, 0x79, 0xf9, 0xc8, 0xbc, 0x9d, 0xe0, 0x35, + 0xbe, 0x1b, 0x49, 0x1a, 0xbc, 0x3a, }, + { 0x78, 0x65, 0x2c, 0xbe, 0x35, 0x67, 0xdc, 0x78, 0xd4, 0x41, 0xf6, 0xc9, + 0xde, 0xde, 0x1f, 0x18, 0x13, 0x31, 0x11, }, + { 0x8a, 0x7f, 0xb1, 0x33, 0x8f, 0x0c, 0x3c, 0x0a, 0x06, 0x61, 0xf0, 0x47, + 0x29, 0x1b, 0x29, 0xbc, 0x1c, 0x47, 0xef, 0x7a, }, + { 0x65, 0x91, 0xf1, 0xe6, 0xb3, 0x96, 0xd3, 0x8c, 0xc2, 0x4a, 0x59, 0x35, + 0x72, 0x8e, 0x0b, 0x9a, 0x87, 0xca, 0x34, 0x7b, 0x63, }, + { 0x5f, 0x08, 0x87, 0x80, 0x56, 0x25, 0x89, 0x77, 0x61, 0x8c, 0x64, 0xa1, + 0x59, 0x6d, 0x59, 0x62, 0xe8, 0x4a, 0xc8, 0x58, 0x99, 0xd1, }, + { 0x23, 0x87, 0x1d, 0xed, 0x6f, 0xf2, 0x91, 0x90, 0xe2, 0xfe, 0x43, 0x21, + 0xaf, 0x97, 0xc6, 0xbc, 0xd7, 0x15, 0xc7, 0x2d, 0x08, 0x77, 0x91, }, + { 0x90, 0x47, 0x9a, 0x9e, 0x3a, 0xdf, 0xf3, 0xc9, 0x4c, 0x1e, 0xa7, 0xd4, + 0x6a, 0x32, 0x90, 0xfe, 0xb7, 0xb6, 0x7b, 0xfa, 0x96, 0x61, 0xfb, 0xa4, }, + { 0xb1, 0x67, 0x60, 0x45, 0xb0, 0x96, 0xc5, 0x15, 0x9f, 0x4d, 0x26, 0xd7, + 0x9d, 0xf1, 0xf5, 0x6d, 0x21, 0x00, 0x94, 0x31, 0x64, 0x94, 0xd3, 0xa7, + 0xd3, }, + { 0x02, 0x3e, 0xaf, 0xf3, 0x79, 0x73, 0xa5, 0xf5, 0xcc, 0x7a, 0x7f, 0xfb, + 0x79, 0x2b, 0x85, 0x8c, 0x88, 0x72, 0x06, 0xbe, 0xfe, 0xaf, 0xc1, 0x16, + 0xa6, 0xd6, }, + { 0x2a, 0xb0, 0x1a, 0xe5, 0xaa, 0x6e, 0xb3, 0xae, 0x53, 0x85, 0x33, 0x80, + 0x75, 0xae, 0x30, 0xe6, 0xb8, 0x72, 0x42, 0xf6, 0x25, 0x4f, 0x38, 0x88, + 0x55, 0xd1, 0xa9, }, + { 0x90, 0xd8, 0x0c, 0xc0, 0x93, 0x4b, 0x4f, 0x9e, 0x65, 0x6c, 0xa1, 0x54, + 0xa6, 0xf6, 0x6e, 0xca, 0xd2, 0xbb, 0x7e, 0x6a, 0x1c, 0xd3, 0xce, 0x46, + 0xef, 0xb0, 0x00, 0x8d, }, + { 0xed, 0x9c, 0x49, 0xcd, 0xc2, 0xde, 0x38, 0x0e, 0xe9, 0x98, 0x6c, 0xc8, + 0x90, 0x9e, 0x3c, 0xd4, 0xd3, 0xeb, 0x88, 0x32, 0xc7, 0x28, 0xe3, 0x94, + 0x1c, 0x9f, 0x8b, 0xf3, 0xcb, }, + { 0xac, 0xe7, 0x92, 0x16, 0xb4, 0x14, 0xa0, 0xe4, 0x04, 0x79, 0xa2, 0xf4, + 0x31, 0xe6, 0x0c, 0x26, 0xdc, 0xbf, 0x2f, 0x69, 0x1b, 0x55, 0x94, 0x67, + 0xda, 0x0c, 0xd7, 0x32, 0x1f, 0xef, }, + { 0x68, 0x63, 0x85, 0x57, 0x95, 0x9e, 0x42, 0x27, 0x41, 0x43, 0x42, 0x02, + 0xa5, 0x78, 0xa7, 0xc6, 0x43, 0xc1, 0x6a, 0xba, 0x70, 0x80, 0xcd, 0x04, + 0xb6, 0x78, 0x76, 0x29, 0xf3, 0xe8, 0xa0, }, + { 0xe6, 0xac, 0x8d, 0x9d, 0xf0, 0xc0, 0xf7, 0xf7, 0xe3, 0x3e, 0x4e, 0x28, + 0x0f, 0x59, 0xb2, 0x67, 0x9e, 0x84, 0x34, 0x42, 0x96, 0x30, 0x2b, 0xca, + 0x49, 0xb6, 0xc5, 0x9a, 0x84, 0x59, 0xa7, 0x81, }, + { 0x7e, }, + { 0x1e, 0x21, }, + { 0x26, 0xd3, 0xdd, }, + { 0x2c, 0xd4, 0xb3, 0x3d, }, + { 0x86, 0x7b, 0x76, 0x3c, 0xf0, }, + { 0x12, 0xc3, 0x70, 0x1d, 0x55, 0x18, }, + { 0x96, 0xc2, 0xbd, 0x61, 0x55, 0xf4, 0x24, }, + { 0x20, 0x51, 0xf7, 0x86, 0x58, 0x8f, 0x07, 0x2a, }, + { 0x93, 0x15, 0xa8, 0x1d, 0xda, 0x97, 0xee, 0x0e, 0x6c, }, + { 0x39, 0x93, 0xdf, 0xd5, 0x0e, 0xca, 0xdc, 0x7a, 0x92, 0xce, }, + { 0x60, 0xd5, 0xfd, 0xf5, 0x1b, 0x26, 0x82, 0x26, 0x73, 0x02, 0xbc, }, + { 0x98, 0xf2, 0x34, 0xe1, 0xf5, 0xfb, 0x00, 0xac, 0x10, 0x4a, 0x38, 0x9f, }, + { 0xda, 0x3a, 0x92, 0x8a, 0xd0, 0xcd, 0x12, 0xcd, 0x15, 0xbb, 0xab, 0x77, + 0x66, }, + { 0xa2, 0x92, 0x1a, 0xe5, 0xca, 0x0c, 0x30, 0x75, 0xeb, 0xaf, 0x00, 0x31, + 0x55, 0x66, }, + { 0x06, 0xea, 0xfd, 0x3e, 0x86, 0x38, 0x62, 0x4e, 0xa9, 0x12, 0xa4, 0x12, + 0x43, 0xbf, 0xa1, }, + { 0xe4, 0x71, 0x7b, 0x94, 0xdb, 0xa0, 0xd2, 0xff, 0x9b, 0xeb, 0xad, 0x8e, + 0x95, 0x8a, 0xc5, 0xed, }, + { 0x25, 0x5a, 0x77, 0x71, 0x41, 0x0e, 0x7a, 0xe9, 0xed, 0x0c, 0x10, 0xef, + 0xf6, 0x2b, 0x3a, 0xba, 0x60, }, + { 0xee, 0xe2, 0xa3, 0x67, 0x64, 0x1d, 0xc6, 0x04, 0xc4, 0xe1, 0x68, 0xd2, + 0x6e, 0xd2, 0x91, 0x75, 0x53, 0x07, }, + { 0xe0, 0xf6, 0x4d, 0x8f, 0x68, 0xfc, 0x06, 0x7e, 0x18, 0x79, 0x7f, 0x2b, + 0x6d, 0xef, 0x46, 0x7f, 0xab, 0xb2, 0xad, }, + { 0x3d, 0x35, 0x88, 0x9f, 0x2e, 0xcf, 0x96, 0x45, 0x07, 0x60, 0x71, 0x94, + 0x00, 0x8d, 0xbf, 0xf4, 0xef, 0x46, 0x2e, 0x3c, }, + { 0x43, 0xcf, 0x98, 0xf7, 0x2d, 0xf4, 0x17, 0xe7, 0x8c, 0x05, 0x2d, 0x9b, + 0x24, 0xfb, 0x4d, 0xea, 0x4a, 0xec, 0x01, 0x25, 0x29, }, + { 0x8e, 0x73, 0x9a, 0x78, 0x11, 0xfe, 0x48, 0xa0, 0x3b, 0x1a, 0x26, 0xdf, + 0x25, 0xe9, 0x59, 0x1c, 0x70, 0x07, 0x9f, 0xdc, 0xa0, 0xa6, }, + { 0xe8, 0x47, 0x71, 0xc7, 0x3e, 0xdf, 0xb5, 0x13, 0xb9, 0x85, 0x13, 0xa8, + 0x54, 0x47, 0x6e, 0x59, 0x96, 0x09, 0x13, 0x5f, 0x82, 0x16, 0x0b, }, + { 0xfb, 0xc0, 0x8c, 0x03, 0x21, 0xb3, 0xc4, 0xb5, 0x43, 0x32, 0x6c, 0xea, + 0x7f, 0xa8, 0x43, 0x91, 0xe8, 0x4e, 0x3f, 0xbf, 0x45, 0x58, 0x6a, 0xa3, }, + { 0x55, 0xf8, 0xf3, 0x00, 0x76, 0x09, 0xef, 0x69, 0x5d, 0xd2, 0x8a, 0xf2, + 0x65, 0xc3, 0xcb, 0x9b, 0x43, 0xfd, 0xb1, 0x7e, 0x7f, 0xa1, 0x94, 0xb0, + 0xd7, }, + { 0xaa, 0x13, 0xc1, 0x51, 0x40, 0x6d, 0x8d, 0x4c, 0x0a, 0x95, 0x64, 0x7b, + 0xd1, 0x96, 0xb6, 0x56, 0xb4, 0x5b, 0xcf, 0xd6, 0xd9, 0x15, 0x97, 0xdd, + 0xb6, 0xef, }, + { 0xaf, 0xb7, 0x36, 0xb0, 0x04, 0xdb, 0xd7, 0x9c, 0x9a, 0x44, 0xc4, 0xf6, + 0x1f, 0x12, 0x21, 0x2d, 0x59, 0x30, 0x54, 0xab, 0x27, 0x61, 0xa3, 0x57, + 0xef, 0xf8, 0x53, }, + { 0x97, 0x34, 0x45, 0x3e, 0xce, 0x7c, 0x35, 0xa2, 0xda, 0x9f, 0x4b, 0x46, + 0x6c, 0x11, 0x67, 0xff, 0x2f, 0x76, 0x58, 0x15, 0x71, 0xfa, 0x44, 0x89, + 0x89, 0xfd, 0xf7, 0x99, }, + { 0x1f, 0xb1, 0x62, 0xeb, 0x83, 0xc5, 0x9c, 0x89, 0xf9, 0x2c, 0xd2, 0x03, + 0x61, 0xbc, 0xbb, 0xa5, 0x74, 0x0e, 0x9b, 0x7e, 0x82, 0x3e, 0x70, 0x0a, + 0xa9, 0x8f, 0x2b, 0x59, 0xfb, }, + { 0xf8, 0xca, 0x5e, 0x3a, 0x4f, 0x9e, 0x10, 0x69, 0x10, 0xd5, 0x4c, 0xeb, + 0x1a, 0x0f, 0x3c, 0x6a, 0x98, 0xf5, 0xb0, 0x97, 0x5b, 0x37, 0x2f, 0x0d, + 0xbd, 0x42, 0x4b, 0x69, 0xa1, 0x82, }, + { 0x12, 0x8c, 0x6d, 0x52, 0x08, 0xef, 0x74, 0xb2, 0xe6, 0xaa, 0xd3, 0xb0, + 0x26, 0xb0, 0xd9, 0x94, 0xb6, 0x11, 0x45, 0x0e, 0x36, 0x71, 0x14, 0x2d, + 0x41, 0x8c, 0x21, 0x53, 0x31, 0xe9, 0x68, }, + { 0xee, 0xea, 0x0d, 0x89, 0x47, 0x7e, 0x72, 0xd1, 0xd8, 0xce, 0x58, 0x4c, + 0x94, 0x1f, 0x0d, 0x51, 0x08, 0xa3, 0xb6, 0x3d, 0xe7, 0x82, 0x46, 0x92, + 0xd6, 0x98, 0x6b, 0x07, 0x10, 0x65, 0x52, 0x65, }, +}; + +static const u8 blake2s_hmac_testvecs[][BLAKE2S_HASH_SIZE] __initconst = { + { 0xce, 0xe1, 0x57, 0x69, 0x82, 0xdc, 0xbf, 0x43, 0xad, 0x56, 0x4c, 0x70, + 0xed, 0x68, 0x16, 0x96, 0xcf, 0xa4, 0x73, 0xe8, 0xe8, 0xfc, 0x32, 0x79, + 0x08, 0x0a, 0x75, 0x82, 0xda, 0x3f, 0x05, 0x11, }, + { 0x77, 0x2f, 0x0c, 0x71, 0x41, 0xf4, 0x4b, 0x2b, 0xb3, 0xc6, 0xb6, 0xf9, + 0x60, 0xde, 0xe4, 0x52, 0x38, 0x66, 0xe8, 0xbf, 0x9b, 0x96, 0xc4, 0x9f, + 0x60, 0xd9, 0x24, 0x37, 0x99, 0xd6, 0xec, 0x31, }, +}; + +bool __init blake2s_selftest(void) +{ + u8 key[BLAKE2S_KEY_SIZE]; + u8 buf[ARRAY_SIZE(blake2s_testvecs)]; + u8 hash[BLAKE2S_HASH_SIZE]; + struct blake2s_state state; + bool success = true; + int i, l; + + key[0] = key[1] = 1; + for (i = 2; i < sizeof(key); ++i) + key[i] = key[i - 2] + key[i - 1]; + + for (i = 0; i < sizeof(buf); ++i) + buf[i] = (u8)i; + + for (i = l = 0; i < ARRAY_SIZE(blake2s_testvecs); l = (l + 37) % ++i) { + int outlen = 1 + i % BLAKE2S_HASH_SIZE; + int keylen = (13 * i) % (BLAKE2S_KEY_SIZE + 1); + + blake2s(hash, buf, key + BLAKE2S_KEY_SIZE - keylen, outlen, i, + keylen); + if (memcmp(hash, blake2s_testvecs[i], outlen)) { + pr_err("blake2s self-test %d: FAIL\n", i + 1); + success = false; + } + + if (!keylen) + blake2s_init(&state, outlen); + else + blake2s_init_key(&state, outlen, + key + BLAKE2S_KEY_SIZE - keylen, + keylen); + + blake2s_update(&state, buf, l); + blake2s_update(&state, buf + l, i - l); + blake2s_final(&state, hash); + if (memcmp(hash, blake2s_testvecs[i], outlen)) { + pr_err("blake2s init/update/final self-test %d: FAIL\n", + i + 1); + success = false; + } + } + + if (success) { + blake2s256_hmac(hash, buf, key, sizeof(buf), sizeof(key)); + success &= !memcmp(hash, blake2s_hmac_testvecs[0], BLAKE2S_HASH_SIZE); + + blake2s256_hmac(hash, key, buf, sizeof(key), sizeof(buf)); + success &= !memcmp(hash, blake2s_hmac_testvecs[1], BLAKE2S_HASH_SIZE); + + if (!success) + pr_err("blake2s256_hmac self-test: FAIL\n"); + } + + return success; +} diff --git a/lib/crypto/blake2s.c b/lib/crypto/blake2s.c new file mode 100644 index 000000000000..41025a30c524 --- /dev/null +++ b/lib/crypto/blake2s.c @@ -0,0 +1,126 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + * + * This is an implementation of the BLAKE2s hash and PRF functions. + * + * Information: https://blake2.net/ + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +bool blake2s_selftest(void); + +void blake2s_update(struct blake2s_state *state, const u8 *in, size_t inlen) +{ + const size_t fill = BLAKE2S_BLOCK_SIZE - state->buflen; + + if (unlikely(!inlen)) + return; + if (inlen > fill) { + memcpy(state->buf + state->buflen, in, fill); + if (IS_ENABLED(CONFIG_CRYPTO_ARCH_HAVE_LIB_BLAKE2S)) + blake2s_compress_arch(state, state->buf, 1, + BLAKE2S_BLOCK_SIZE); + else + blake2s_compress_generic(state, state->buf, 1, + BLAKE2S_BLOCK_SIZE); + state->buflen = 0; + in += fill; + inlen -= fill; + } + if (inlen > BLAKE2S_BLOCK_SIZE) { + const size_t nblocks = DIV_ROUND_UP(inlen, BLAKE2S_BLOCK_SIZE); + /* Hash one less (full) block than strictly possible */ + if (IS_ENABLED(CONFIG_CRYPTO_ARCH_HAVE_LIB_BLAKE2S)) + blake2s_compress_arch(state, in, nblocks - 1, + BLAKE2S_BLOCK_SIZE); + else + blake2s_compress_generic(state, in, nblocks - 1, + BLAKE2S_BLOCK_SIZE); + in += BLAKE2S_BLOCK_SIZE * (nblocks - 1); + inlen -= BLAKE2S_BLOCK_SIZE * (nblocks - 1); + } + memcpy(state->buf + state->buflen, in, inlen); + state->buflen += inlen; +} +EXPORT_SYMBOL(blake2s_update); + +void blake2s_final(struct blake2s_state *state, u8 *out) +{ + WARN_ON(IS_ENABLED(DEBUG) && !out); + blake2s_set_lastblock(state); + memset(state->buf + state->buflen, 0, + BLAKE2S_BLOCK_SIZE - state->buflen); /* Padding */ + if (IS_ENABLED(CONFIG_CRYPTO_ARCH_HAVE_LIB_BLAKE2S)) + blake2s_compress_arch(state, state->buf, 1, state->buflen); + else + blake2s_compress_generic(state, state->buf, 1, state->buflen); + cpu_to_le32_array(state->h, ARRAY_SIZE(state->h)); + memcpy(out, state->h, state->outlen); + memzero_explicit(state, sizeof(*state)); +} +EXPORT_SYMBOL(blake2s_final); + +void blake2s256_hmac(u8 *out, const u8 *in, const u8 *key, const size_t inlen, + const size_t keylen) +{ + struct blake2s_state state; + u8 x_key[BLAKE2S_BLOCK_SIZE] __aligned(__alignof__(u32)) = { 0 }; + u8 i_hash[BLAKE2S_HASH_SIZE] __aligned(__alignof__(u32)); + int i; + + if (keylen > BLAKE2S_BLOCK_SIZE) { + blake2s_init(&state, BLAKE2S_HASH_SIZE); + blake2s_update(&state, key, keylen); + blake2s_final(&state, x_key); + } else + memcpy(x_key, key, keylen); + + for (i = 0; i < BLAKE2S_BLOCK_SIZE; ++i) + x_key[i] ^= 0x36; + + blake2s_init(&state, BLAKE2S_HASH_SIZE); + blake2s_update(&state, x_key, BLAKE2S_BLOCK_SIZE); + blake2s_update(&state, in, inlen); + blake2s_final(&state, i_hash); + + for (i = 0; i < BLAKE2S_BLOCK_SIZE; ++i) + x_key[i] ^= 0x5c ^ 0x36; + + blake2s_init(&state, BLAKE2S_HASH_SIZE); + blake2s_update(&state, x_key, BLAKE2S_BLOCK_SIZE); + blake2s_update(&state, i_hash, BLAKE2S_HASH_SIZE); + blake2s_final(&state, i_hash); + + memcpy(out, i_hash, BLAKE2S_HASH_SIZE); + memzero_explicit(x_key, BLAKE2S_BLOCK_SIZE); + memzero_explicit(i_hash, BLAKE2S_HASH_SIZE); +} +EXPORT_SYMBOL(blake2s256_hmac); + +static int __init mod_init(void) +{ + if (!IS_ENABLED(CONFIG_CRYPTO_MANAGER_DISABLE_TESTS) && + WARN_ON(!blake2s_selftest())) + return -ENODEV; + return 0; +} + +static void __exit mod_exit(void) +{ +} + +module_init(mod_init); +module_exit(mod_exit); +MODULE_LICENSE("GPL v2"); +MODULE_DESCRIPTION("BLAKE2s hash function"); +MODULE_AUTHOR("Jason A. Donenfeld "); diff --git a/lib/chacha.c b/lib/crypto/chacha.c similarity index 88% rename from lib/chacha.c rename to lib/crypto/chacha.c index a46d2832dbab..f5c9e0692a96 100644 --- a/lib/chacha.c +++ b/lib/crypto/chacha.c @@ -9,9 +9,11 @@ * (at your option) any later version. */ +#include #include #include #include +#include #include #include #include @@ -76,7 +78,7 @@ static void chacha_permute(u32 *x, int nrounds) * The caller has already converted the endianness of the input. This function * also handles incrementing the block counter in the input matrix. */ -void chacha_block(u32 *state, u8 *stream, int nrounds) +void chacha_block_generic(u32 *state, u8 *stream, int nrounds) { u32 x[16]; int i; @@ -90,11 +92,11 @@ void chacha_block(u32 *state, u8 *stream, int nrounds) state[12]++; } -EXPORT_SYMBOL(chacha_block); +EXPORT_SYMBOL(chacha_block_generic); /** - * hchacha_block - abbreviated ChaCha core, for XChaCha - * @in: input state matrix (16 32-bit words) + * hchacha_block_generic - abbreviated ChaCha core, for XChaCha + * @state: input state matrix (16 32-bit words) * @out: output (8 32-bit words) * @nrounds: number of rounds (20 or 12; 20 is recommended) * @@ -103,15 +105,15 @@ EXPORT_SYMBOL(chacha_block); * skips the final addition of the initial state, and outputs only certain words * of the state. It should not be used for streaming directly. */ -void hchacha_block(const u32 *in, u32 *out, int nrounds) +void hchacha_block_generic(const u32 *state, u32 *stream, int nrounds) { u32 x[16]; - memcpy(x, in, 64); + memcpy(x, state, 64); chacha_permute(x, nrounds); - memcpy(&out[0], &x[0], 16); - memcpy(&out[4], &x[12], 16); + memcpy(&stream[0], &x[0], 16); + memcpy(&stream[4], &x[12], 16); } -EXPORT_SYMBOL(hchacha_block); +EXPORT_SYMBOL(hchacha_block_generic); diff --git a/lib/crypto/chacha20poly1305-selftest.c b/lib/crypto/chacha20poly1305-selftest.c new file mode 100644 index 000000000000..fa43deda2660 --- /dev/null +++ b/lib/crypto/chacha20poly1305-selftest.c @@ -0,0 +1,9082 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#include +#include +#include + +#include +#include +#include +#include +#include +#include + +struct chacha20poly1305_testvec { + const u8 *input, *output, *assoc, *nonce, *key; + size_t ilen, alen, nlen; + bool failure; +}; + +/* The first of these are the ChaCha20-Poly1305 AEAD test vectors from RFC7539 + * 2.8.2. After they are generated by reference implementations. And the final + * marked ones are taken from wycheproof, but we only do these for the encrypt + * side, because mostly we're stressing the primitives rather than the actual + * chapoly construction. + */ + +static const u8 enc_input001[] __initconst = { + 0x49, 0x6e, 0x74, 0x65, 0x72, 0x6e, 0x65, 0x74, + 0x2d, 0x44, 0x72, 0x61, 0x66, 0x74, 0x73, 0x20, + 0x61, 0x72, 0x65, 0x20, 0x64, 0x72, 0x61, 0x66, + 0x74, 0x20, 0x64, 0x6f, 0x63, 0x75, 0x6d, 0x65, + 0x6e, 0x74, 0x73, 0x20, 0x76, 0x61, 0x6c, 0x69, + 0x64, 0x20, 0x66, 0x6f, 0x72, 0x20, 0x61, 0x20, + 0x6d, 0x61, 0x78, 0x69, 0x6d, 0x75, 0x6d, 0x20, + 0x6f, 0x66, 0x20, 0x73, 0x69, 0x78, 0x20, 0x6d, + 0x6f, 0x6e, 0x74, 0x68, 0x73, 0x20, 0x61, 0x6e, + 0x64, 0x20, 0x6d, 0x61, 0x79, 0x20, 0x62, 0x65, + 0x20, 0x75, 0x70, 0x64, 0x61, 0x74, 0x65, 0x64, + 0x2c, 0x20, 0x72, 0x65, 0x70, 0x6c, 0x61, 0x63, + 0x65, 0x64, 0x2c, 0x20, 0x6f, 0x72, 0x20, 0x6f, + 0x62, 0x73, 0x6f, 0x6c, 0x65, 0x74, 0x65, 0x64, + 0x20, 0x62, 0x79, 0x20, 0x6f, 0x74, 0x68, 0x65, + 0x72, 0x20, 0x64, 0x6f, 0x63, 0x75, 0x6d, 0x65, + 0x6e, 0x74, 0x73, 0x20, 0x61, 0x74, 0x20, 0x61, + 0x6e, 0x79, 0x20, 0x74, 0x69, 0x6d, 0x65, 0x2e, + 0x20, 0x49, 0x74, 0x20, 0x69, 0x73, 0x20, 0x69, + 0x6e, 0x61, 0x70, 0x70, 0x72, 0x6f, 0x70, 0x72, + 0x69, 0x61, 0x74, 0x65, 0x20, 0x74, 0x6f, 0x20, + 0x75, 0x73, 0x65, 0x20, 0x49, 0x6e, 0x74, 0x65, + 0x72, 0x6e, 0x65, 0x74, 0x2d, 0x44, 0x72, 0x61, + 0x66, 0x74, 0x73, 0x20, 0x61, 0x73, 0x20, 0x72, + 0x65, 0x66, 0x65, 0x72, 0x65, 0x6e, 0x63, 0x65, + 0x20, 0x6d, 0x61, 0x74, 0x65, 0x72, 0x69, 0x61, + 0x6c, 0x20, 0x6f, 0x72, 0x20, 0x74, 0x6f, 0x20, + 0x63, 0x69, 0x74, 0x65, 0x20, 0x74, 0x68, 0x65, + 0x6d, 0x20, 0x6f, 0x74, 0x68, 0x65, 0x72, 0x20, + 0x74, 0x68, 0x61, 0x6e, 0x20, 0x61, 0x73, 0x20, + 0x2f, 0xe2, 0x80, 0x9c, 0x77, 0x6f, 0x72, 0x6b, + 0x20, 0x69, 0x6e, 0x20, 0x70, 0x72, 0x6f, 0x67, + 0x72, 0x65, 0x73, 0x73, 0x2e, 0x2f, 0xe2, 0x80, + 0x9d +}; +static const u8 enc_output001[] __initconst = { + 0x64, 0xa0, 0x86, 0x15, 0x75, 0x86, 0x1a, 0xf4, + 0x60, 0xf0, 0x62, 0xc7, 0x9b, 0xe6, 0x43, 0xbd, + 0x5e, 0x80, 0x5c, 0xfd, 0x34, 0x5c, 0xf3, 0x89, + 0xf1, 0x08, 0x67, 0x0a, 0xc7, 0x6c, 0x8c, 0xb2, + 0x4c, 0x6c, 0xfc, 0x18, 0x75, 0x5d, 0x43, 0xee, + 0xa0, 0x9e, 0xe9, 0x4e, 0x38, 0x2d, 0x26, 0xb0, + 0xbd, 0xb7, 0xb7, 0x3c, 0x32, 0x1b, 0x01, 0x00, + 0xd4, 0xf0, 0x3b, 0x7f, 0x35, 0x58, 0x94, 0xcf, + 0x33, 0x2f, 0x83, 0x0e, 0x71, 0x0b, 0x97, 0xce, + 0x98, 0xc8, 0xa8, 0x4a, 0xbd, 0x0b, 0x94, 0x81, + 0x14, 0xad, 0x17, 0x6e, 0x00, 0x8d, 0x33, 0xbd, + 0x60, 0xf9, 0x82, 0xb1, 0xff, 0x37, 0xc8, 0x55, + 0x97, 0x97, 0xa0, 0x6e, 0xf4, 0xf0, 0xef, 0x61, + 0xc1, 0x86, 0x32, 0x4e, 0x2b, 0x35, 0x06, 0x38, + 0x36, 0x06, 0x90, 0x7b, 0x6a, 0x7c, 0x02, 0xb0, + 0xf9, 0xf6, 0x15, 0x7b, 0x53, 0xc8, 0x67, 0xe4, + 0xb9, 0x16, 0x6c, 0x76, 0x7b, 0x80, 0x4d, 0x46, + 0xa5, 0x9b, 0x52, 0x16, 0xcd, 0xe7, 0xa4, 0xe9, + 0x90, 0x40, 0xc5, 0xa4, 0x04, 0x33, 0x22, 0x5e, + 0xe2, 0x82, 0xa1, 0xb0, 0xa0, 0x6c, 0x52, 0x3e, + 0xaf, 0x45, 0x34, 0xd7, 0xf8, 0x3f, 0xa1, 0x15, + 0x5b, 0x00, 0x47, 0x71, 0x8c, 0xbc, 0x54, 0x6a, + 0x0d, 0x07, 0x2b, 0x04, 0xb3, 0x56, 0x4e, 0xea, + 0x1b, 0x42, 0x22, 0x73, 0xf5, 0x48, 0x27, 0x1a, + 0x0b, 0xb2, 0x31, 0x60, 0x53, 0xfa, 0x76, 0x99, + 0x19, 0x55, 0xeb, 0xd6, 0x31, 0x59, 0x43, 0x4e, + 0xce, 0xbb, 0x4e, 0x46, 0x6d, 0xae, 0x5a, 0x10, + 0x73, 0xa6, 0x72, 0x76, 0x27, 0x09, 0x7a, 0x10, + 0x49, 0xe6, 0x17, 0xd9, 0x1d, 0x36, 0x10, 0x94, + 0xfa, 0x68, 0xf0, 0xff, 0x77, 0x98, 0x71, 0x30, + 0x30, 0x5b, 0xea, 0xba, 0x2e, 0xda, 0x04, 0xdf, + 0x99, 0x7b, 0x71, 0x4d, 0x6c, 0x6f, 0x2c, 0x29, + 0xa6, 0xad, 0x5c, 0xb4, 0x02, 0x2b, 0x02, 0x70, + 0x9b, 0xee, 0xad, 0x9d, 0x67, 0x89, 0x0c, 0xbb, + 0x22, 0x39, 0x23, 0x36, 0xfe, 0xa1, 0x85, 0x1f, + 0x38 +}; +static const u8 enc_assoc001[] __initconst = { + 0xf3, 0x33, 0x88, 0x86, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x4e, 0x91 +}; +static const u8 enc_nonce001[] __initconst = { + 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08 +}; +static const u8 enc_key001[] __initconst = { + 0x1c, 0x92, 0x40, 0xa5, 0xeb, 0x55, 0xd3, 0x8a, + 0xf3, 0x33, 0x88, 0x86, 0x04, 0xf6, 0xb5, 0xf0, + 0x47, 0x39, 0x17, 0xc1, 0x40, 0x2b, 0x80, 0x09, + 0x9d, 0xca, 0x5c, 0xbc, 0x20, 0x70, 0x75, 0xc0 +}; + +static const u8 enc_input002[] __initconst = { }; +static const u8 enc_output002[] __initconst = { + 0xea, 0xe0, 0x1e, 0x9e, 0x2c, 0x91, 0xaa, 0xe1, + 0xdb, 0x5d, 0x99, 0x3f, 0x8a, 0xf7, 0x69, 0x92 +}; +static const u8 enc_assoc002[] __initconst = { }; +static const u8 enc_nonce002[] __initconst = { + 0xca, 0xbf, 0x33, 0x71, 0x32, 0x45, 0x77, 0x8e +}; +static const u8 enc_key002[] __initconst = { + 0x4c, 0xf5, 0x96, 0x83, 0x38, 0xe6, 0xae, 0x7f, + 0x2d, 0x29, 0x25, 0x76, 0xd5, 0x75, 0x27, 0x86, + 0x91, 0x9a, 0x27, 0x7a, 0xfb, 0x46, 0xc5, 0xef, + 0x94, 0x81, 0x79, 0x57, 0x14, 0x59, 0x40, 0x68 +}; + +static const u8 enc_input003[] __initconst = { }; +static const u8 enc_output003[] __initconst = { + 0xdd, 0x6b, 0x3b, 0x82, 0xce, 0x5a, 0xbd, 0xd6, + 0xa9, 0x35, 0x83, 0xd8, 0x8c, 0x3d, 0x85, 0x77 +}; +static const u8 enc_assoc003[] __initconst = { + 0x33, 0x10, 0x41, 0x12, 0x1f, 0xf3, 0xd2, 0x6b +}; +static const u8 enc_nonce003[] __initconst = { + 0x3d, 0x86, 0xb5, 0x6b, 0xc8, 0xa3, 0x1f, 0x1d +}; +static const u8 enc_key003[] __initconst = { + 0x2d, 0xb0, 0x5d, 0x40, 0xc8, 0xed, 0x44, 0x88, + 0x34, 0xd1, 0x13, 0xaf, 0x57, 0xa1, 0xeb, 0x3a, + 0x2a, 0x80, 0x51, 0x36, 0xec, 0x5b, 0xbc, 0x08, + 0x93, 0x84, 0x21, 0xb5, 0x13, 0x88, 0x3c, 0x0d +}; + +static const u8 enc_input004[] __initconst = { + 0xa4 +}; +static const u8 enc_output004[] __initconst = { + 0xb7, 0x1b, 0xb0, 0x73, 0x59, 0xb0, 0x84, 0xb2, + 0x6d, 0x8e, 0xab, 0x94, 0x31, 0xa1, 0xae, 0xac, + 0x89 +}; +static const u8 enc_assoc004[] __initconst = { + 0x6a, 0xe2, 0xad, 0x3f, 0x88, 0x39, 0x5a, 0x40 +}; +static const u8 enc_nonce004[] __initconst = { + 0xd2, 0x32, 0x1f, 0x29, 0x28, 0xc6, 0xc4, 0xc4 +}; +static const u8 enc_key004[] __initconst = { + 0x4b, 0x28, 0x4b, 0xa3, 0x7b, 0xbe, 0xe9, 0xf8, + 0x31, 0x80, 0x82, 0xd7, 0xd8, 0xe8, 0xb5, 0xa1, + 0xe2, 0x18, 0x18, 0x8a, 0x9c, 0xfa, 0xa3, 0x3d, + 0x25, 0x71, 0x3e, 0x40, 0xbc, 0x54, 0x7a, 0x3e +}; + +static const u8 enc_input005[] __initconst = { + 0x2d +}; +static const u8 enc_output005[] __initconst = { + 0xbf, 0xe1, 0x5b, 0x0b, 0xdb, 0x6b, 0xf5, 0x5e, + 0x6c, 0x5d, 0x84, 0x44, 0x39, 0x81, 0xc1, 0x9c, + 0xac +}; +static const u8 enc_assoc005[] __initconst = { }; +static const u8 enc_nonce005[] __initconst = { + 0x20, 0x1c, 0xaa, 0x5f, 0x9c, 0xbf, 0x92, 0x30 +}; +static const u8 enc_key005[] __initconst = { + 0x66, 0xca, 0x9c, 0x23, 0x2a, 0x4b, 0x4b, 0x31, + 0x0e, 0x92, 0x89, 0x8b, 0xf4, 0x93, 0xc7, 0x87, + 0x98, 0xa3, 0xd8, 0x39, 0xf8, 0xf4, 0xa7, 0x01, + 0xc0, 0x2e, 0x0a, 0xa6, 0x7e, 0x5a, 0x78, 0x87 +}; + +static const u8 enc_input006[] __initconst = { + 0x33, 0x2f, 0x94, 0xc1, 0xa4, 0xef, 0xcc, 0x2a, + 0x5b, 0xa6, 0xe5, 0x8f, 0x1d, 0x40, 0xf0, 0x92, + 0x3c, 0xd9, 0x24, 0x11, 0xa9, 0x71, 0xf9, 0x37, + 0x14, 0x99, 0xfa, 0xbe, 0xe6, 0x80, 0xde, 0x50, + 0xc9, 0x96, 0xd4, 0xb0, 0xec, 0x9e, 0x17, 0xec, + 0xd2, 0x5e, 0x72, 0x99, 0xfc, 0x0a, 0xe1, 0xcb, + 0x48, 0xd2, 0x85, 0xdd, 0x2f, 0x90, 0xe0, 0x66, + 0x3b, 0xe6, 0x20, 0x74, 0xbe, 0x23, 0x8f, 0xcb, + 0xb4, 0xe4, 0xda, 0x48, 0x40, 0xa6, 0xd1, 0x1b, + 0xc7, 0x42, 0xce, 0x2f, 0x0c, 0xa6, 0x85, 0x6e, + 0x87, 0x37, 0x03, 0xb1, 0x7c, 0x25, 0x96, 0xa3, + 0x05, 0xd8, 0xb0, 0xf4, 0xed, 0xea, 0xc2, 0xf0, + 0x31, 0x98, 0x6c, 0xd1, 0x14, 0x25, 0xc0, 0xcb, + 0x01, 0x74, 0xd0, 0x82, 0xf4, 0x36, 0xf5, 0x41, + 0xd5, 0xdc, 0xca, 0xc5, 0xbb, 0x98, 0xfe, 0xfc, + 0x69, 0x21, 0x70, 0xd8, 0xa4, 0x4b, 0xc8, 0xde, + 0x8f +}; +static const u8 enc_output006[] __initconst = { + 0x8b, 0x06, 0xd3, 0x31, 0xb0, 0x93, 0x45, 0xb1, + 0x75, 0x6e, 0x26, 0xf9, 0x67, 0xbc, 0x90, 0x15, + 0x81, 0x2c, 0xb5, 0xf0, 0xc6, 0x2b, 0xc7, 0x8c, + 0x56, 0xd1, 0xbf, 0x69, 0x6c, 0x07, 0xa0, 0xda, + 0x65, 0x27, 0xc9, 0x90, 0x3d, 0xef, 0x4b, 0x11, + 0x0f, 0x19, 0x07, 0xfd, 0x29, 0x92, 0xd9, 0xc8, + 0xf7, 0x99, 0x2e, 0x4a, 0xd0, 0xb8, 0x2c, 0xdc, + 0x93, 0xf5, 0x9e, 0x33, 0x78, 0xd1, 0x37, 0xc3, + 0x66, 0xd7, 0x5e, 0xbc, 0x44, 0xbf, 0x53, 0xa5, + 0xbc, 0xc4, 0xcb, 0x7b, 0x3a, 0x8e, 0x7f, 0x02, + 0xbd, 0xbb, 0xe7, 0xca, 0xa6, 0x6c, 0x6b, 0x93, + 0x21, 0x93, 0x10, 0x61, 0xe7, 0x69, 0xd0, 0x78, + 0xf3, 0x07, 0x5a, 0x1a, 0x8f, 0x73, 0xaa, 0xb1, + 0x4e, 0xd3, 0xda, 0x4f, 0xf3, 0x32, 0xe1, 0x66, + 0x3e, 0x6c, 0xc6, 0x13, 0xba, 0x06, 0x5b, 0xfc, + 0x6a, 0xe5, 0x6f, 0x60, 0xfb, 0x07, 0x40, 0xb0, + 0x8c, 0x9d, 0x84, 0x43, 0x6b, 0xc1, 0xf7, 0x8d, + 0x8d, 0x31, 0xf7, 0x7a, 0x39, 0x4d, 0x8f, 0x9a, + 0xeb +}; +static const u8 enc_assoc006[] __initconst = { + 0x70, 0xd3, 0x33, 0xf3, 0x8b, 0x18, 0x0b +}; +static const u8 enc_nonce006[] __initconst = { + 0xdf, 0x51, 0x84, 0x82, 0x42, 0x0c, 0x75, 0x9c +}; +static const u8 enc_key006[] __initconst = { + 0x68, 0x7b, 0x8d, 0x8e, 0xe3, 0xc4, 0xdd, 0xae, + 0xdf, 0x72, 0x7f, 0x53, 0x72, 0x25, 0x1e, 0x78, + 0x91, 0xcb, 0x69, 0x76, 0x1f, 0x49, 0x93, 0xf9, + 0x6f, 0x21, 0xcc, 0x39, 0x9c, 0xad, 0xb1, 0x01 +}; + +static const u8 enc_input007[] __initconst = { + 0x9b, 0x18, 0xdb, 0xdd, 0x9a, 0x0f, 0x3e, 0xa5, + 0x15, 0x17, 0xde, 0xdf, 0x08, 0x9d, 0x65, 0x0a, + 0x67, 0x30, 0x12, 0xe2, 0x34, 0x77, 0x4b, 0xc1, + 0xd9, 0xc6, 0x1f, 0xab, 0xc6, 0x18, 0x50, 0x17, + 0xa7, 0x9d, 0x3c, 0xa6, 0xc5, 0x35, 0x8c, 0x1c, + 0xc0, 0xa1, 0x7c, 0x9f, 0x03, 0x89, 0xca, 0xe1, + 0xe6, 0xe9, 0xd4, 0xd3, 0x88, 0xdb, 0xb4, 0x51, + 0x9d, 0xec, 0xb4, 0xfc, 0x52, 0xee, 0x6d, 0xf1, + 0x75, 0x42, 0xc6, 0xfd, 0xbd, 0x7a, 0x8e, 0x86, + 0xfc, 0x44, 0xb3, 0x4f, 0xf3, 0xea, 0x67, 0x5a, + 0x41, 0x13, 0xba, 0xb0, 0xdc, 0xe1, 0xd3, 0x2a, + 0x7c, 0x22, 0xb3, 0xca, 0xac, 0x6a, 0x37, 0x98, + 0x3e, 0x1d, 0x40, 0x97, 0xf7, 0x9b, 0x1d, 0x36, + 0x6b, 0xb3, 0x28, 0xbd, 0x60, 0x82, 0x47, 0x34, + 0xaa, 0x2f, 0x7d, 0xe9, 0xa8, 0x70, 0x81, 0x57, + 0xd4, 0xb9, 0x77, 0x0a, 0x9d, 0x29, 0xa7, 0x84, + 0x52, 0x4f, 0xc2, 0x4a, 0x40, 0x3b, 0x3c, 0xd4, + 0xc9, 0x2a, 0xdb, 0x4a, 0x53, 0xc4, 0xbe, 0x80, + 0xe9, 0x51, 0x7f, 0x8f, 0xc7, 0xa2, 0xce, 0x82, + 0x5c, 0x91, 0x1e, 0x74, 0xd9, 0xd0, 0xbd, 0xd5, + 0xf3, 0xfd, 0xda, 0x4d, 0x25, 0xb4, 0xbb, 0x2d, + 0xac, 0x2f, 0x3d, 0x71, 0x85, 0x7b, 0xcf, 0x3c, + 0x7b, 0x3e, 0x0e, 0x22, 0x78, 0x0c, 0x29, 0xbf, + 0xe4, 0xf4, 0x57, 0xb3, 0xcb, 0x49, 0xa0, 0xfc, + 0x1e, 0x05, 0x4e, 0x16, 0xbc, 0xd5, 0xa8, 0xa3, + 0xee, 0x05, 0x35, 0xc6, 0x7c, 0xab, 0x60, 0x14, + 0x55, 0x1a, 0x8e, 0xc5, 0x88, 0x5d, 0xd5, 0x81, + 0xc2, 0x81, 0xa5, 0xc4, 0x60, 0xdb, 0xaf, 0x77, + 0x91, 0xe1, 0xce, 0xa2, 0x7e, 0x7f, 0x42, 0xe3, + 0xb0, 0x13, 0x1c, 0x1f, 0x25, 0x60, 0x21, 0xe2, + 0x40, 0x5f, 0x99, 0xb7, 0x73, 0xec, 0x9b, 0x2b, + 0xf0, 0x65, 0x11, 0xc8, 0xd0, 0x0a, 0x9f, 0xd3 +}; +static const u8 enc_output007[] __initconst = { + 0x85, 0x04, 0xc2, 0xed, 0x8d, 0xfd, 0x97, 0x5c, + 0xd2, 0xb7, 0xe2, 0xc1, 0x6b, 0xa3, 0xba, 0xf8, + 0xc9, 0x50, 0xc3, 0xc6, 0xa5, 0xe3, 0xa4, 0x7c, + 0xc3, 0x23, 0x49, 0x5e, 0xa9, 0xb9, 0x32, 0xeb, + 0x8a, 0x7c, 0xca, 0xe5, 0xec, 0xfb, 0x7c, 0xc0, + 0xcb, 0x7d, 0xdc, 0x2c, 0x9d, 0x92, 0x55, 0x21, + 0x0a, 0xc8, 0x43, 0x63, 0x59, 0x0a, 0x31, 0x70, + 0x82, 0x67, 0x41, 0x03, 0xf8, 0xdf, 0xf2, 0xac, + 0xa7, 0x02, 0xd4, 0xd5, 0x8a, 0x2d, 0xc8, 0x99, + 0x19, 0x66, 0xd0, 0xf6, 0x88, 0x2c, 0x77, 0xd9, + 0xd4, 0x0d, 0x6c, 0xbd, 0x98, 0xde, 0xe7, 0x7f, + 0xad, 0x7e, 0x8a, 0xfb, 0xe9, 0x4b, 0xe5, 0xf7, + 0xe5, 0x50, 0xa0, 0x90, 0x3f, 0xd6, 0x22, 0x53, + 0xe3, 0xfe, 0x1b, 0xcc, 0x79, 0x3b, 0xec, 0x12, + 0x47, 0x52, 0xa7, 0xd6, 0x04, 0xe3, 0x52, 0xe6, + 0x93, 0x90, 0x91, 0x32, 0x73, 0x79, 0xb8, 0xd0, + 0x31, 0xde, 0x1f, 0x9f, 0x2f, 0x05, 0x38, 0x54, + 0x2f, 0x35, 0x04, 0x39, 0xe0, 0xa7, 0xba, 0xc6, + 0x52, 0xf6, 0x37, 0x65, 0x4c, 0x07, 0xa9, 0x7e, + 0xb3, 0x21, 0x6f, 0x74, 0x8c, 0xc9, 0xde, 0xdb, + 0x65, 0x1b, 0x9b, 0xaa, 0x60, 0xb1, 0x03, 0x30, + 0x6b, 0xb2, 0x03, 0xc4, 0x1c, 0x04, 0xf8, 0x0f, + 0x64, 0xaf, 0x46, 0xe4, 0x65, 0x99, 0x49, 0xe2, + 0xea, 0xce, 0x78, 0x00, 0xd8, 0x8b, 0xd5, 0x2e, + 0xcf, 0xfc, 0x40, 0x49, 0xe8, 0x58, 0xdc, 0x34, + 0x9c, 0x8c, 0x61, 0xbf, 0x0a, 0x8e, 0xec, 0x39, + 0xa9, 0x30, 0x05, 0x5a, 0xd2, 0x56, 0x01, 0xc7, + 0xda, 0x8f, 0x4e, 0xbb, 0x43, 0xa3, 0x3a, 0xf9, + 0x15, 0x2a, 0xd0, 0xa0, 0x7a, 0x87, 0x34, 0x82, + 0xfe, 0x8a, 0xd1, 0x2d, 0x5e, 0xc7, 0xbf, 0x04, + 0x53, 0x5f, 0x3b, 0x36, 0xd4, 0x25, 0x5c, 0x34, + 0x7a, 0x8d, 0xd5, 0x05, 0xce, 0x72, 0xca, 0xef, + 0x7a, 0x4b, 0xbc, 0xb0, 0x10, 0x5c, 0x96, 0x42, + 0x3a, 0x00, 0x98, 0xcd, 0x15, 0xe8, 0xb7, 0x53 +}; +static const u8 enc_assoc007[] __initconst = { }; +static const u8 enc_nonce007[] __initconst = { + 0xde, 0x7b, 0xef, 0xc3, 0x65, 0x1b, 0x68, 0xb0 +}; +static const u8 enc_key007[] __initconst = { + 0x8d, 0xb8, 0x91, 0x48, 0xf0, 0xe7, 0x0a, 0xbd, + 0xf9, 0x3f, 0xcd, 0xd9, 0xa0, 0x1e, 0x42, 0x4c, + 0xe7, 0xde, 0x25, 0x3d, 0xa3, 0xd7, 0x05, 0x80, + 0x8d, 0xf2, 0x82, 0xac, 0x44, 0x16, 0x51, 0x01 +}; + +static const u8 enc_input008[] __initconst = { + 0xc3, 0x09, 0x94, 0x62, 0xe6, 0x46, 0x2e, 0x10, + 0xbe, 0x00, 0xe4, 0xfc, 0xf3, 0x40, 0xa3, 0xe2, + 0x0f, 0xc2, 0x8b, 0x28, 0xdc, 0xba, 0xb4, 0x3c, + 0xe4, 0x21, 0x58, 0x61, 0xcd, 0x8b, 0xcd, 0xfb, + 0xac, 0x94, 0xa1, 0x45, 0xf5, 0x1c, 0xe1, 0x12, + 0xe0, 0x3b, 0x67, 0x21, 0x54, 0x5e, 0x8c, 0xaa, + 0xcf, 0xdb, 0xb4, 0x51, 0xd4, 0x13, 0xda, 0xe6, + 0x83, 0x89, 0xb6, 0x92, 0xe9, 0x21, 0x76, 0xa4, + 0x93, 0x7d, 0x0e, 0xfd, 0x96, 0x36, 0x03, 0x91, + 0x43, 0x5c, 0x92, 0x49, 0x62, 0x61, 0x7b, 0xeb, + 0x43, 0x89, 0xb8, 0x12, 0x20, 0x43, 0xd4, 0x47, + 0x06, 0x84, 0xee, 0x47, 0xe9, 0x8a, 0x73, 0x15, + 0x0f, 0x72, 0xcf, 0xed, 0xce, 0x96, 0xb2, 0x7f, + 0x21, 0x45, 0x76, 0xeb, 0x26, 0x28, 0x83, 0x6a, + 0xad, 0xaa, 0xa6, 0x81, 0xd8, 0x55, 0xb1, 0xa3, + 0x85, 0xb3, 0x0c, 0xdf, 0xf1, 0x69, 0x2d, 0x97, + 0x05, 0x2a, 0xbc, 0x7c, 0x7b, 0x25, 0xf8, 0x80, + 0x9d, 0x39, 0x25, 0xf3, 0x62, 0xf0, 0x66, 0x5e, + 0xf4, 0xa0, 0xcf, 0xd8, 0xfd, 0x4f, 0xb1, 0x1f, + 0x60, 0x3a, 0x08, 0x47, 0xaf, 0xe1, 0xf6, 0x10, + 0x77, 0x09, 0xa7, 0x27, 0x8f, 0x9a, 0x97, 0x5a, + 0x26, 0xfa, 0xfe, 0x41, 0x32, 0x83, 0x10, 0xe0, + 0x1d, 0xbf, 0x64, 0x0d, 0xf4, 0x1c, 0x32, 0x35, + 0xe5, 0x1b, 0x36, 0xef, 0xd4, 0x4a, 0x93, 0x4d, + 0x00, 0x7c, 0xec, 0x02, 0x07, 0x8b, 0x5d, 0x7d, + 0x1b, 0x0e, 0xd1, 0xa6, 0xa5, 0x5d, 0x7d, 0x57, + 0x88, 0xa8, 0xcc, 0x81, 0xb4, 0x86, 0x4e, 0xb4, + 0x40, 0xe9, 0x1d, 0xc3, 0xb1, 0x24, 0x3e, 0x7f, + 0xcc, 0x8a, 0x24, 0x9b, 0xdf, 0x6d, 0xf0, 0x39, + 0x69, 0x3e, 0x4c, 0xc0, 0x96, 0xe4, 0x13, 0xda, + 0x90, 0xda, 0xf4, 0x95, 0x66, 0x8b, 0x17, 0x17, + 0xfe, 0x39, 0x43, 0x25, 0xaa, 0xda, 0xa0, 0x43, + 0x3c, 0xb1, 0x41, 0x02, 0xa3, 0xf0, 0xa7, 0x19, + 0x59, 0xbc, 0x1d, 0x7d, 0x6c, 0x6d, 0x91, 0x09, + 0x5c, 0xb7, 0x5b, 0x01, 0xd1, 0x6f, 0x17, 0x21, + 0x97, 0xbf, 0x89, 0x71, 0xa5, 0xb0, 0x6e, 0x07, + 0x45, 0xfd, 0x9d, 0xea, 0x07, 0xf6, 0x7a, 0x9f, + 0x10, 0x18, 0x22, 0x30, 0x73, 0xac, 0xd4, 0x6b, + 0x72, 0x44, 0xed, 0xd9, 0x19, 0x9b, 0x2d, 0x4a, + 0x41, 0xdd, 0xd1, 0x85, 0x5e, 0x37, 0x19, 0xed, + 0xd2, 0x15, 0x8f, 0x5e, 0x91, 0xdb, 0x33, 0xf2, + 0xe4, 0xdb, 0xff, 0x98, 0xfb, 0xa3, 0xb5, 0xca, + 0x21, 0x69, 0x08, 0xe7, 0x8a, 0xdf, 0x90, 0xff, + 0x3e, 0xe9, 0x20, 0x86, 0x3c, 0xe9, 0xfc, 0x0b, + 0xfe, 0x5c, 0x61, 0xaa, 0x13, 0x92, 0x7f, 0x7b, + 0xec, 0xe0, 0x6d, 0xa8, 0x23, 0x22, 0xf6, 0x6b, + 0x77, 0xc4, 0xfe, 0x40, 0x07, 0x3b, 0xb6, 0xf6, + 0x8e, 0x5f, 0xd4, 0xb9, 0xb7, 0x0f, 0x21, 0x04, + 0xef, 0x83, 0x63, 0x91, 0x69, 0x40, 0xa3, 0x48, + 0x5c, 0xd2, 0x60, 0xf9, 0x4f, 0x6c, 0x47, 0x8b, + 0x3b, 0xb1, 0x9f, 0x8e, 0xee, 0x16, 0x8a, 0x13, + 0xfc, 0x46, 0x17, 0xc3, 0xc3, 0x32, 0x56, 0xf8, + 0x3c, 0x85, 0x3a, 0xb6, 0x3e, 0xaa, 0x89, 0x4f, + 0xb3, 0xdf, 0x38, 0xfd, 0xf1, 0xe4, 0x3a, 0xc0, + 0xe6, 0x58, 0xb5, 0x8f, 0xc5, 0x29, 0xa2, 0x92, + 0x4a, 0xb6, 0xa0, 0x34, 0x7f, 0xab, 0xb5, 0x8a, + 0x90, 0xa1, 0xdb, 0x4d, 0xca, 0xb6, 0x2c, 0x41, + 0x3c, 0xf7, 0x2b, 0x21, 0xc3, 0xfd, 0xf4, 0x17, + 0x5c, 0xb5, 0x33, 0x17, 0x68, 0x2b, 0x08, 0x30, + 0xf3, 0xf7, 0x30, 0x3c, 0x96, 0xe6, 0x6a, 0x20, + 0x97, 0xe7, 0x4d, 0x10, 0x5f, 0x47, 0x5f, 0x49, + 0x96, 0x09, 0xf0, 0x27, 0x91, 0xc8, 0xf8, 0x5a, + 0x2e, 0x79, 0xb5, 0xe2, 0xb8, 0xe8, 0xb9, 0x7b, + 0xd5, 0x10, 0xcb, 0xff, 0x5d, 0x14, 0x73, 0xf3 +}; +static const u8 enc_output008[] __initconst = { + 0x14, 0xf6, 0x41, 0x37, 0xa6, 0xd4, 0x27, 0xcd, + 0xdb, 0x06, 0x3e, 0x9a, 0x4e, 0xab, 0xd5, 0xb1, + 0x1e, 0x6b, 0xd2, 0xbc, 0x11, 0xf4, 0x28, 0x93, + 0x63, 0x54, 0xef, 0xbb, 0x5e, 0x1d, 0x3a, 0x1d, + 0x37, 0x3c, 0x0a, 0x6c, 0x1e, 0xc2, 0xd1, 0x2c, + 0xb5, 0xa3, 0xb5, 0x7b, 0xb8, 0x8f, 0x25, 0xa6, + 0x1b, 0x61, 0x1c, 0xec, 0x28, 0x58, 0x26, 0xa4, + 0xa8, 0x33, 0x28, 0x25, 0x5c, 0x45, 0x05, 0xe5, + 0x6c, 0x99, 0xe5, 0x45, 0xc4, 0xa2, 0x03, 0x84, + 0x03, 0x73, 0x1e, 0x8c, 0x49, 0xac, 0x20, 0xdd, + 0x8d, 0xb3, 0xc4, 0xf5, 0xe7, 0x4f, 0xf1, 0xed, + 0xa1, 0x98, 0xde, 0xa4, 0x96, 0xdd, 0x2f, 0xab, + 0xab, 0x97, 0xcf, 0x3e, 0xd2, 0x9e, 0xb8, 0x13, + 0x07, 0x28, 0x29, 0x19, 0xaf, 0xfd, 0xf2, 0x49, + 0x43, 0xea, 0x49, 0x26, 0x91, 0xc1, 0x07, 0xd6, + 0xbb, 0x81, 0x75, 0x35, 0x0d, 0x24, 0x7f, 0xc8, + 0xda, 0xd4, 0xb7, 0xeb, 0xe8, 0x5c, 0x09, 0xa2, + 0x2f, 0xdc, 0x28, 0x7d, 0x3a, 0x03, 0xfa, 0x94, + 0xb5, 0x1d, 0x17, 0x99, 0x36, 0xc3, 0x1c, 0x18, + 0x34, 0xe3, 0x9f, 0xf5, 0x55, 0x7c, 0xb0, 0x60, + 0x9d, 0xff, 0xac, 0xd4, 0x61, 0xf2, 0xad, 0xf8, + 0xce, 0xc7, 0xbe, 0x5c, 0xd2, 0x95, 0xa8, 0x4b, + 0x77, 0x13, 0x19, 0x59, 0x26, 0xc9, 0xb7, 0x8f, + 0x6a, 0xcb, 0x2d, 0x37, 0x91, 0xea, 0x92, 0x9c, + 0x94, 0x5b, 0xda, 0x0b, 0xce, 0xfe, 0x30, 0x20, + 0xf8, 0x51, 0xad, 0xf2, 0xbe, 0xe7, 0xc7, 0xff, + 0xb3, 0x33, 0x91, 0x6a, 0xc9, 0x1a, 0x41, 0xc9, + 0x0f, 0xf3, 0x10, 0x0e, 0xfd, 0x53, 0xff, 0x6c, + 0x16, 0x52, 0xd9, 0xf3, 0xf7, 0x98, 0x2e, 0xc9, + 0x07, 0x31, 0x2c, 0x0c, 0x72, 0xd7, 0xc5, 0xc6, + 0x08, 0x2a, 0x7b, 0xda, 0xbd, 0x7e, 0x02, 0xea, + 0x1a, 0xbb, 0xf2, 0x04, 0x27, 0x61, 0x28, 0x8e, + 0xf5, 0x04, 0x03, 0x1f, 0x4c, 0x07, 0x55, 0x82, + 0xec, 0x1e, 0xd7, 0x8b, 0x2f, 0x65, 0x56, 0xd1, + 0xd9, 0x1e, 0x3c, 0xe9, 0x1f, 0x5e, 0x98, 0x70, + 0x38, 0x4a, 0x8c, 0x49, 0xc5, 0x43, 0xa0, 0xa1, + 0x8b, 0x74, 0x9d, 0x4c, 0x62, 0x0d, 0x10, 0x0c, + 0xf4, 0x6c, 0x8f, 0xe0, 0xaa, 0x9a, 0x8d, 0xb7, + 0xe0, 0xbe, 0x4c, 0x87, 0xf1, 0x98, 0x2f, 0xcc, + 0xed, 0xc0, 0x52, 0x29, 0xdc, 0x83, 0xf8, 0xfc, + 0x2c, 0x0e, 0xa8, 0x51, 0x4d, 0x80, 0x0d, 0xa3, + 0xfe, 0xd8, 0x37, 0xe7, 0x41, 0x24, 0xfc, 0xfb, + 0x75, 0xe3, 0x71, 0x7b, 0x57, 0x45, 0xf5, 0x97, + 0x73, 0x65, 0x63, 0x14, 0x74, 0xb8, 0x82, 0x9f, + 0xf8, 0x60, 0x2f, 0x8a, 0xf2, 0x4e, 0xf1, 0x39, + 0xda, 0x33, 0x91, 0xf8, 0x36, 0xe0, 0x8d, 0x3f, + 0x1f, 0x3b, 0x56, 0xdc, 0xa0, 0x8f, 0x3c, 0x9d, + 0x71, 0x52, 0xa7, 0xb8, 0xc0, 0xa5, 0xc6, 0xa2, + 0x73, 0xda, 0xf4, 0x4b, 0x74, 0x5b, 0x00, 0x3d, + 0x99, 0xd7, 0x96, 0xba, 0xe6, 0xe1, 0xa6, 0x96, + 0x38, 0xad, 0xb3, 0xc0, 0xd2, 0xba, 0x91, 0x6b, + 0xf9, 0x19, 0xdd, 0x3b, 0xbe, 0xbe, 0x9c, 0x20, + 0x50, 0xba, 0xa1, 0xd0, 0xce, 0x11, 0xbd, 0x95, + 0xd8, 0xd1, 0xdd, 0x33, 0x85, 0x74, 0xdc, 0xdb, + 0x66, 0x76, 0x44, 0xdc, 0x03, 0x74, 0x48, 0x35, + 0x98, 0xb1, 0x18, 0x47, 0x94, 0x7d, 0xff, 0x62, + 0xe4, 0x58, 0x78, 0xab, 0xed, 0x95, 0x36, 0xd9, + 0x84, 0x91, 0x82, 0x64, 0x41, 0xbb, 0x58, 0xe6, + 0x1c, 0x20, 0x6d, 0x15, 0x6b, 0x13, 0x96, 0xe8, + 0x35, 0x7f, 0xdc, 0x40, 0x2c, 0xe9, 0xbc, 0x8a, + 0x4f, 0x92, 0xec, 0x06, 0x2d, 0x50, 0xdf, 0x93, + 0x5d, 0x65, 0x5a, 0xa8, 0xfc, 0x20, 0x50, 0x14, + 0xa9, 0x8a, 0x7e, 0x1d, 0x08, 0x1f, 0xe2, 0x99, + 0xd0, 0xbe, 0xfb, 0x3a, 0x21, 0x9d, 0xad, 0x86, + 0x54, 0xfd, 0x0d, 0x98, 0x1c, 0x5a, 0x6f, 0x1f, + 0x9a, 0x40, 0xcd, 0xa2, 0xff, 0x6a, 0xf1, 0x54 +}; +static const u8 enc_assoc008[] __initconst = { }; +static const u8 enc_nonce008[] __initconst = { + 0x0e, 0x0d, 0x57, 0xbb, 0x7b, 0x40, 0x54, 0x02 +}; +static const u8 enc_key008[] __initconst = { + 0xf2, 0xaa, 0x4f, 0x99, 0xfd, 0x3e, 0xa8, 0x53, + 0xc1, 0x44, 0xe9, 0x81, 0x18, 0xdc, 0xf5, 0xf0, + 0x3e, 0x44, 0x15, 0x59, 0xe0, 0xc5, 0x44, 0x86, + 0xc3, 0x91, 0xa8, 0x75, 0xc0, 0x12, 0x46, 0xba +}; + +static const u8 enc_input009[] __initconst = { + 0xe6, 0xc3, 0xdb, 0x63, 0x55, 0x15, 0xe3, 0x5b, + 0xb7, 0x4b, 0x27, 0x8b, 0x5a, 0xdd, 0xc2, 0xe8, + 0x3a, 0x6b, 0xd7, 0x81, 0x96, 0x35, 0x97, 0xca, + 0xd7, 0x68, 0xe8, 0xef, 0xce, 0xab, 0xda, 0x09, + 0x6e, 0xd6, 0x8e, 0xcb, 0x55, 0xb5, 0xe1, 0xe5, + 0x57, 0xfd, 0xc4, 0xe3, 0xe0, 0x18, 0x4f, 0x85, + 0xf5, 0x3f, 0x7e, 0x4b, 0x88, 0xc9, 0x52, 0x44, + 0x0f, 0xea, 0xaf, 0x1f, 0x71, 0x48, 0x9f, 0x97, + 0x6d, 0xb9, 0x6f, 0x00, 0xa6, 0xde, 0x2b, 0x77, + 0x8b, 0x15, 0xad, 0x10, 0xa0, 0x2b, 0x7b, 0x41, + 0x90, 0x03, 0x2d, 0x69, 0xae, 0xcc, 0x77, 0x7c, + 0xa5, 0x9d, 0x29, 0x22, 0xc2, 0xea, 0xb4, 0x00, + 0x1a, 0xd2, 0x7a, 0x98, 0x8a, 0xf9, 0xf7, 0x82, + 0xb0, 0xab, 0xd8, 0xa6, 0x94, 0x8d, 0x58, 0x2f, + 0x01, 0x9e, 0x00, 0x20, 0xfc, 0x49, 0xdc, 0x0e, + 0x03, 0xe8, 0x45, 0x10, 0xd6, 0xa8, 0xda, 0x55, + 0x10, 0x9a, 0xdf, 0x67, 0x22, 0x8b, 0x43, 0xab, + 0x00, 0xbb, 0x02, 0xc8, 0xdd, 0x7b, 0x97, 0x17, + 0xd7, 0x1d, 0x9e, 0x02, 0x5e, 0x48, 0xde, 0x8e, + 0xcf, 0x99, 0x07, 0x95, 0x92, 0x3c, 0x5f, 0x9f, + 0xc5, 0x8a, 0xc0, 0x23, 0xaa, 0xd5, 0x8c, 0x82, + 0x6e, 0x16, 0x92, 0xb1, 0x12, 0x17, 0x07, 0xc3, + 0xfb, 0x36, 0xf5, 0x6c, 0x35, 0xd6, 0x06, 0x1f, + 0x9f, 0xa7, 0x94, 0xa2, 0x38, 0x63, 0x9c, 0xb0, + 0x71, 0xb3, 0xa5, 0xd2, 0xd8, 0xba, 0x9f, 0x08, + 0x01, 0xb3, 0xff, 0x04, 0x97, 0x73, 0x45, 0x1b, + 0xd5, 0xa9, 0x9c, 0x80, 0xaf, 0x04, 0x9a, 0x85, + 0xdb, 0x32, 0x5b, 0x5d, 0x1a, 0xc1, 0x36, 0x28, + 0x10, 0x79, 0xf1, 0x3c, 0xbf, 0x1a, 0x41, 0x5c, + 0x4e, 0xdf, 0xb2, 0x7c, 0x79, 0x3b, 0x7a, 0x62, + 0x3d, 0x4b, 0xc9, 0x9b, 0x2a, 0x2e, 0x7c, 0xa2, + 0xb1, 0x11, 0x98, 0xa7, 0x34, 0x1a, 0x00, 0xf3, + 0xd1, 0xbc, 0x18, 0x22, 0xba, 0x02, 0x56, 0x62, + 0x31, 0x10, 0x11, 0x6d, 0xe0, 0x54, 0x9d, 0x40, + 0x1f, 0x26, 0x80, 0x41, 0xca, 0x3f, 0x68, 0x0f, + 0x32, 0x1d, 0x0a, 0x8e, 0x79, 0xd8, 0xa4, 0x1b, + 0x29, 0x1c, 0x90, 0x8e, 0xc5, 0xe3, 0xb4, 0x91, + 0x37, 0x9a, 0x97, 0x86, 0x99, 0xd5, 0x09, 0xc5, + 0xbb, 0xa3, 0x3f, 0x21, 0x29, 0x82, 0x14, 0x5c, + 0xab, 0x25, 0xfb, 0xf2, 0x4f, 0x58, 0x26, 0xd4, + 0x83, 0xaa, 0x66, 0x89, 0x67, 0x7e, 0xc0, 0x49, + 0xe1, 0x11, 0x10, 0x7f, 0x7a, 0xda, 0x29, 0x04, + 0xff, 0xf0, 0xcb, 0x09, 0x7c, 0x9d, 0xfa, 0x03, + 0x6f, 0x81, 0x09, 0x31, 0x60, 0xfb, 0x08, 0xfa, + 0x74, 0xd3, 0x64, 0x44, 0x7c, 0x55, 0x85, 0xec, + 0x9c, 0x6e, 0x25, 0xb7, 0x6c, 0xc5, 0x37, 0xb6, + 0x83, 0x87, 0x72, 0x95, 0x8b, 0x9d, 0xe1, 0x69, + 0x5c, 0x31, 0x95, 0x42, 0xa6, 0x2c, 0xd1, 0x36, + 0x47, 0x1f, 0xec, 0x54, 0xab, 0xa2, 0x1c, 0xd8, + 0x00, 0xcc, 0xbc, 0x0d, 0x65, 0xe2, 0x67, 0xbf, + 0xbc, 0xea, 0xee, 0x9e, 0xe4, 0x36, 0x95, 0xbe, + 0x73, 0xd9, 0xa6, 0xd9, 0x0f, 0xa0, 0xcc, 0x82, + 0x76, 0x26, 0xad, 0x5b, 0x58, 0x6c, 0x4e, 0xab, + 0x29, 0x64, 0xd3, 0xd9, 0xa9, 0x08, 0x8c, 0x1d, + 0xa1, 0x4f, 0x80, 0xd8, 0x3f, 0x94, 0xfb, 0xd3, + 0x7b, 0xfc, 0xd1, 0x2b, 0xc3, 0x21, 0xeb, 0xe5, + 0x1c, 0x84, 0x23, 0x7f, 0x4b, 0xfa, 0xdb, 0x34, + 0x18, 0xa2, 0xc2, 0xe5, 0x13, 0xfe, 0x6c, 0x49, + 0x81, 0xd2, 0x73, 0xe7, 0xe2, 0xd7, 0xe4, 0x4f, + 0x4b, 0x08, 0x6e, 0xb1, 0x12, 0x22, 0x10, 0x9d, + 0xac, 0x51, 0x1e, 0x17, 0xd9, 0x8a, 0x0b, 0x42, + 0x88, 0x16, 0x81, 0x37, 0x7c, 0x6a, 0xf7, 0xef, + 0x2d, 0xe3, 0xd9, 0xf8, 0x5f, 0xe0, 0x53, 0x27, + 0x74, 0xb9, 0xe2, 0xd6, 0x1c, 0x80, 0x2c, 0x52, + 0x65 +}; +static const u8 enc_output009[] __initconst = { + 0xfd, 0x81, 0x8d, 0xd0, 0x3d, 0xb4, 0xd5, 0xdf, + 0xd3, 0x42, 0x47, 0x5a, 0x6d, 0x19, 0x27, 0x66, + 0x4b, 0x2e, 0x0c, 0x27, 0x9c, 0x96, 0x4c, 0x72, + 0x02, 0xa3, 0x65, 0xc3, 0xb3, 0x6f, 0x2e, 0xbd, + 0x63, 0x8a, 0x4a, 0x5d, 0x29, 0xa2, 0xd0, 0x28, + 0x48, 0xc5, 0x3d, 0x98, 0xa3, 0xbc, 0xe0, 0xbe, + 0x3b, 0x3f, 0xe6, 0x8a, 0xa4, 0x7f, 0x53, 0x06, + 0xfa, 0x7f, 0x27, 0x76, 0x72, 0x31, 0xa1, 0xf5, + 0xd6, 0x0c, 0x52, 0x47, 0xba, 0xcd, 0x4f, 0xd7, + 0xeb, 0x05, 0x48, 0x0d, 0x7c, 0x35, 0x4a, 0x09, + 0xc9, 0x76, 0x71, 0x02, 0xa3, 0xfb, 0xb7, 0x1a, + 0x65, 0xb7, 0xed, 0x98, 0xc6, 0x30, 0x8a, 0x00, + 0xae, 0xa1, 0x31, 0xe5, 0xb5, 0x9e, 0x6d, 0x62, + 0xda, 0xda, 0x07, 0x0f, 0x38, 0x38, 0xd3, 0xcb, + 0xc1, 0xb0, 0xad, 0xec, 0x72, 0xec, 0xb1, 0xa2, + 0x7b, 0x59, 0xf3, 0x3d, 0x2b, 0xef, 0xcd, 0x28, + 0x5b, 0x83, 0xcc, 0x18, 0x91, 0x88, 0xb0, 0x2e, + 0xf9, 0x29, 0x31, 0x18, 0xf9, 0x4e, 0xe9, 0x0a, + 0x91, 0x92, 0x9f, 0xae, 0x2d, 0xad, 0xf4, 0xe6, + 0x1a, 0xe2, 0xa4, 0xee, 0x47, 0x15, 0xbf, 0x83, + 0x6e, 0xd7, 0x72, 0x12, 0x3b, 0x2d, 0x24, 0xe9, + 0xb2, 0x55, 0xcb, 0x3c, 0x10, 0xf0, 0x24, 0x8a, + 0x4a, 0x02, 0xea, 0x90, 0x25, 0xf0, 0xb4, 0x79, + 0x3a, 0xef, 0x6e, 0xf5, 0x52, 0xdf, 0xb0, 0x0a, + 0xcd, 0x24, 0x1c, 0xd3, 0x2e, 0x22, 0x74, 0xea, + 0x21, 0x6f, 0xe9, 0xbd, 0xc8, 0x3e, 0x36, 0x5b, + 0x19, 0xf1, 0xca, 0x99, 0x0a, 0xb4, 0xa7, 0x52, + 0x1a, 0x4e, 0xf2, 0xad, 0x8d, 0x56, 0x85, 0xbb, + 0x64, 0x89, 0xba, 0x26, 0xf9, 0xc7, 0xe1, 0x89, + 0x19, 0x22, 0x77, 0xc3, 0xa8, 0xfc, 0xff, 0xad, + 0xfe, 0xb9, 0x48, 0xae, 0x12, 0x30, 0x9f, 0x19, + 0xfb, 0x1b, 0xef, 0x14, 0x87, 0x8a, 0x78, 0x71, + 0xf3, 0xf4, 0xb7, 0x00, 0x9c, 0x1d, 0xb5, 0x3d, + 0x49, 0x00, 0x0c, 0x06, 0xd4, 0x50, 0xf9, 0x54, + 0x45, 0xb2, 0x5b, 0x43, 0xdb, 0x6d, 0xcf, 0x1a, + 0xe9, 0x7a, 0x7a, 0xcf, 0xfc, 0x8a, 0x4e, 0x4d, + 0x0b, 0x07, 0x63, 0x28, 0xd8, 0xe7, 0x08, 0x95, + 0xdf, 0xa6, 0x72, 0x93, 0x2e, 0xbb, 0xa0, 0x42, + 0x89, 0x16, 0xf1, 0xd9, 0x0c, 0xf9, 0xa1, 0x16, + 0xfd, 0xd9, 0x03, 0xb4, 0x3b, 0x8a, 0xf5, 0xf6, + 0xe7, 0x6b, 0x2e, 0x8e, 0x4c, 0x3d, 0xe2, 0xaf, + 0x08, 0x45, 0x03, 0xff, 0x09, 0xb6, 0xeb, 0x2d, + 0xc6, 0x1b, 0x88, 0x94, 0xac, 0x3e, 0xf1, 0x9f, + 0x0e, 0x0e, 0x2b, 0xd5, 0x00, 0x4d, 0x3f, 0x3b, + 0x53, 0xae, 0xaf, 0x1c, 0x33, 0x5f, 0x55, 0x6e, + 0x8d, 0xaf, 0x05, 0x7a, 0x10, 0x34, 0xc9, 0xf4, + 0x66, 0xcb, 0x62, 0x12, 0xa6, 0xee, 0xe8, 0x1c, + 0x5d, 0x12, 0x86, 0xdb, 0x6f, 0x1c, 0x33, 0xc4, + 0x1c, 0xda, 0x82, 0x2d, 0x3b, 0x59, 0xfe, 0xb1, + 0xa4, 0x59, 0x41, 0x86, 0xd0, 0xef, 0xae, 0xfb, + 0xda, 0x6d, 0x11, 0xb8, 0xca, 0xe9, 0x6e, 0xff, + 0xf7, 0xa9, 0xd9, 0x70, 0x30, 0xfc, 0x53, 0xe2, + 0xd7, 0xa2, 0x4e, 0xc7, 0x91, 0xd9, 0x07, 0x06, + 0xaa, 0xdd, 0xb0, 0x59, 0x28, 0x1d, 0x00, 0x66, + 0xc5, 0x54, 0xc2, 0xfc, 0x06, 0xda, 0x05, 0x90, + 0x52, 0x1d, 0x37, 0x66, 0xee, 0xf0, 0xb2, 0x55, + 0x8a, 0x5d, 0xd2, 0x38, 0x86, 0x94, 0x9b, 0xfc, + 0x10, 0x4c, 0xa1, 0xb9, 0x64, 0x3e, 0x44, 0xb8, + 0x5f, 0xb0, 0x0c, 0xec, 0xe0, 0xc9, 0xe5, 0x62, + 0x75, 0x3f, 0x09, 0xd5, 0xf5, 0xd9, 0x26, 0xba, + 0x9e, 0xd2, 0xf4, 0xb9, 0x48, 0x0a, 0xbc, 0xa2, + 0xd6, 0x7c, 0x36, 0x11, 0x7d, 0x26, 0x81, 0x89, + 0xcf, 0xa4, 0xad, 0x73, 0x0e, 0xee, 0xcc, 0x06, + 0xa9, 0xdb, 0xb1, 0xfd, 0xfb, 0x09, 0x7f, 0x90, + 0x42, 0x37, 0x2f, 0xe1, 0x9c, 0x0f, 0x6f, 0xcf, + 0x43, 0xb5, 0xd9, 0x90, 0xe1, 0x85, 0xf5, 0xa8, + 0xae +}; +static const u8 enc_assoc009[] __initconst = { + 0x5a, 0x27, 0xff, 0xeb, 0xdf, 0x84, 0xb2, 0x9e, + 0xef +}; +static const u8 enc_nonce009[] __initconst = { + 0xef, 0x2d, 0x63, 0xee, 0x6b, 0x80, 0x8b, 0x78 +}; +static const u8 enc_key009[] __initconst = { + 0xea, 0xbc, 0x56, 0x99, 0xe3, 0x50, 0xff, 0xc5, + 0xcc, 0x1a, 0xd7, 0xc1, 0x57, 0x72, 0xea, 0x86, + 0x5b, 0x89, 0x88, 0x61, 0x3d, 0x2f, 0x9b, 0xb2, + 0xe7, 0x9c, 0xec, 0x74, 0x6e, 0x3e, 0xf4, 0x3b +}; + +static const u8 enc_input010[] __initconst = { + 0x42, 0x93, 0xe4, 0xeb, 0x97, 0xb0, 0x57, 0xbf, + 0x1a, 0x8b, 0x1f, 0xe4, 0x5f, 0x36, 0x20, 0x3c, + 0xef, 0x0a, 0xa9, 0x48, 0x5f, 0x5f, 0x37, 0x22, + 0x3a, 0xde, 0xe3, 0xae, 0xbe, 0xad, 0x07, 0xcc, + 0xb1, 0xf6, 0xf5, 0xf9, 0x56, 0xdd, 0xe7, 0x16, + 0x1e, 0x7f, 0xdf, 0x7a, 0x9e, 0x75, 0xb7, 0xc7, + 0xbe, 0xbe, 0x8a, 0x36, 0x04, 0xc0, 0x10, 0xf4, + 0x95, 0x20, 0x03, 0xec, 0xdc, 0x05, 0xa1, 0x7d, + 0xc4, 0xa9, 0x2c, 0x82, 0xd0, 0xbc, 0x8b, 0xc5, + 0xc7, 0x45, 0x50, 0xf6, 0xa2, 0x1a, 0xb5, 0x46, + 0x3b, 0x73, 0x02, 0xa6, 0x83, 0x4b, 0x73, 0x82, + 0x58, 0x5e, 0x3b, 0x65, 0x2f, 0x0e, 0xfd, 0x2b, + 0x59, 0x16, 0xce, 0xa1, 0x60, 0x9c, 0xe8, 0x3a, + 0x99, 0xed, 0x8d, 0x5a, 0xcf, 0xf6, 0x83, 0xaf, + 0xba, 0xd7, 0x73, 0x73, 0x40, 0x97, 0x3d, 0xca, + 0xef, 0x07, 0x57, 0xe6, 0xd9, 0x70, 0x0e, 0x95, + 0xae, 0xa6, 0x8d, 0x04, 0xcc, 0xee, 0xf7, 0x09, + 0x31, 0x77, 0x12, 0xa3, 0x23, 0x97, 0x62, 0xb3, + 0x7b, 0x32, 0xfb, 0x80, 0x14, 0x48, 0x81, 0xc3, + 0xe5, 0xea, 0x91, 0x39, 0x52, 0x81, 0xa2, 0x4f, + 0xe4, 0xb3, 0x09, 0xff, 0xde, 0x5e, 0xe9, 0x58, + 0x84, 0x6e, 0xf9, 0x3d, 0xdf, 0x25, 0xea, 0xad, + 0xae, 0xe6, 0x9a, 0xd1, 0x89, 0x55, 0xd3, 0xde, + 0x6c, 0x52, 0xdb, 0x70, 0xfe, 0x37, 0xce, 0x44, + 0x0a, 0xa8, 0x25, 0x5f, 0x92, 0xc1, 0x33, 0x4a, + 0x4f, 0x9b, 0x62, 0x35, 0xff, 0xce, 0xc0, 0xa9, + 0x60, 0xce, 0x52, 0x00, 0x97, 0x51, 0x35, 0x26, + 0x2e, 0xb9, 0x36, 0xa9, 0x87, 0x6e, 0x1e, 0xcc, + 0x91, 0x78, 0x53, 0x98, 0x86, 0x5b, 0x9c, 0x74, + 0x7d, 0x88, 0x33, 0xe1, 0xdf, 0x37, 0x69, 0x2b, + 0xbb, 0xf1, 0x4d, 0xf4, 0xd1, 0xf1, 0x39, 0x93, + 0x17, 0x51, 0x19, 0xe3, 0x19, 0x1e, 0x76, 0x37, + 0x25, 0xfb, 0x09, 0x27, 0x6a, 0xab, 0x67, 0x6f, + 0x14, 0x12, 0x64, 0xe7, 0xc4, 0x07, 0xdf, 0x4d, + 0x17, 0xbb, 0x6d, 0xe0, 0xe9, 0xb9, 0xab, 0xca, + 0x10, 0x68, 0xaf, 0x7e, 0xb7, 0x33, 0x54, 0x73, + 0x07, 0x6e, 0xf7, 0x81, 0x97, 0x9c, 0x05, 0x6f, + 0x84, 0x5f, 0xd2, 0x42, 0xfb, 0x38, 0xcf, 0xd1, + 0x2f, 0x14, 0x30, 0x88, 0x98, 0x4d, 0x5a, 0xa9, + 0x76, 0xd5, 0x4f, 0x3e, 0x70, 0x6c, 0x85, 0x76, + 0xd7, 0x01, 0xa0, 0x1a, 0xc8, 0x4e, 0xaa, 0xac, + 0x78, 0xfe, 0x46, 0xde, 0x6a, 0x05, 0x46, 0xa7, + 0x43, 0x0c, 0xb9, 0xde, 0xb9, 0x68, 0xfb, 0xce, + 0x42, 0x99, 0x07, 0x4d, 0x0b, 0x3b, 0x5a, 0x30, + 0x35, 0xa8, 0xf9, 0x3a, 0x73, 0xef, 0x0f, 0xdb, + 0x1e, 0x16, 0x42, 0xc4, 0xba, 0xae, 0x58, 0xaa, + 0xf8, 0xe5, 0x75, 0x2f, 0x1b, 0x15, 0x5c, 0xfd, + 0x0a, 0x97, 0xd0, 0xe4, 0x37, 0x83, 0x61, 0x5f, + 0x43, 0xa6, 0xc7, 0x3f, 0x38, 0x59, 0xe6, 0xeb, + 0xa3, 0x90, 0xc3, 0xaa, 0xaa, 0x5a, 0xd3, 0x34, + 0xd4, 0x17, 0xc8, 0x65, 0x3e, 0x57, 0xbc, 0x5e, + 0xdd, 0x9e, 0xb7, 0xf0, 0x2e, 0x5b, 0xb2, 0x1f, + 0x8a, 0x08, 0x0d, 0x45, 0x91, 0x0b, 0x29, 0x53, + 0x4f, 0x4c, 0x5a, 0x73, 0x56, 0xfe, 0xaf, 0x41, + 0x01, 0x39, 0x0a, 0x24, 0x3c, 0x7e, 0xbe, 0x4e, + 0x53, 0xf3, 0xeb, 0x06, 0x66, 0x51, 0x28, 0x1d, + 0xbd, 0x41, 0x0a, 0x01, 0xab, 0x16, 0x47, 0x27, + 0x47, 0x47, 0xf7, 0xcb, 0x46, 0x0a, 0x70, 0x9e, + 0x01, 0x9c, 0x09, 0xe1, 0x2a, 0x00, 0x1a, 0xd8, + 0xd4, 0x79, 0x9d, 0x80, 0x15, 0x8e, 0x53, 0x2a, + 0x65, 0x83, 0x78, 0x3e, 0x03, 0x00, 0x07, 0x12, + 0x1f, 0x33, 0x3e, 0x7b, 0x13, 0x37, 0xf1, 0xc3, + 0xef, 0xb7, 0xc1, 0x20, 0x3c, 0x3e, 0x67, 0x66, + 0x5d, 0x88, 0xa7, 0x7d, 0x33, 0x50, 0x77, 0xb0, + 0x28, 0x8e, 0xe7, 0x2c, 0x2e, 0x7a, 0xf4, 0x3c, + 0x8d, 0x74, 0x83, 0xaf, 0x8e, 0x87, 0x0f, 0xe4, + 0x50, 0xff, 0x84, 0x5c, 0x47, 0x0c, 0x6a, 0x49, + 0xbf, 0x42, 0x86, 0x77, 0x15, 0x48, 0xa5, 0x90, + 0x5d, 0x93, 0xd6, 0x2a, 0x11, 0xd5, 0xd5, 0x11, + 0xaa, 0xce, 0xe7, 0x6f, 0xa5, 0xb0, 0x09, 0x2c, + 0x8d, 0xd3, 0x92, 0xf0, 0x5a, 0x2a, 0xda, 0x5b, + 0x1e, 0xd5, 0x9a, 0xc4, 0xc4, 0xf3, 0x49, 0x74, + 0x41, 0xca, 0xe8, 0xc1, 0xf8, 0x44, 0xd6, 0x3c, + 0xae, 0x6c, 0x1d, 0x9a, 0x30, 0x04, 0x4d, 0x27, + 0x0e, 0xb1, 0x5f, 0x59, 0xa2, 0x24, 0xe8, 0xe1, + 0x98, 0xc5, 0x6a, 0x4c, 0xfe, 0x41, 0xd2, 0x27, + 0x42, 0x52, 0xe1, 0xe9, 0x7d, 0x62, 0xe4, 0x88, + 0x0f, 0xad, 0xb2, 0x70, 0xcb, 0x9d, 0x4c, 0x27, + 0x2e, 0x76, 0x1e, 0x1a, 0x63, 0x65, 0xf5, 0x3b, + 0xf8, 0x57, 0x69, 0xeb, 0x5b, 0x38, 0x26, 0x39, + 0x33, 0x25, 0x45, 0x3e, 0x91, 0xb8, 0xd8, 0xc7, + 0xd5, 0x42, 0xc0, 0x22, 0x31, 0x74, 0xf4, 0xbc, + 0x0c, 0x23, 0xf1, 0xca, 0xc1, 0x8d, 0xd7, 0xbe, + 0xc9, 0x62, 0xe4, 0x08, 0x1a, 0xcf, 0x36, 0xd5, + 0xfe, 0x55, 0x21, 0x59, 0x91, 0x87, 0x87, 0xdf, + 0x06, 0xdb, 0xdf, 0x96, 0x45, 0x58, 0xda, 0x05, + 0xcd, 0x50, 0x4d, 0xd2, 0x7d, 0x05, 0x18, 0x73, + 0x6a, 0x8d, 0x11, 0x85, 0xa6, 0x88, 0xe8, 0xda, + 0xe6, 0x30, 0x33, 0xa4, 0x89, 0x31, 0x75, 0xbe, + 0x69, 0x43, 0x84, 0x43, 0x50, 0x87, 0xdd, 0x71, + 0x36, 0x83, 0xc3, 0x78, 0x74, 0x24, 0x0a, 0xed, + 0x7b, 0xdb, 0xa4, 0x24, 0x0b, 0xb9, 0x7e, 0x5d, + 0xff, 0xde, 0xb1, 0xef, 0x61, 0x5a, 0x45, 0x33, + 0xf6, 0x17, 0x07, 0x08, 0x98, 0x83, 0x92, 0x0f, + 0x23, 0x6d, 0xe6, 0xaa, 0x17, 0x54, 0xad, 0x6a, + 0xc8, 0xdb, 0x26, 0xbe, 0xb8, 0xb6, 0x08, 0xfa, + 0x68, 0xf1, 0xd7, 0x79, 0x6f, 0x18, 0xb4, 0x9e, + 0x2d, 0x3f, 0x1b, 0x64, 0xaf, 0x8d, 0x06, 0x0e, + 0x49, 0x28, 0xe0, 0x5d, 0x45, 0x68, 0x13, 0x87, + 0xfa, 0xde, 0x40, 0x7b, 0xd2, 0xc3, 0x94, 0xd5, + 0xe1, 0xd9, 0xc2, 0xaf, 0x55, 0x89, 0xeb, 0xb4, + 0x12, 0x59, 0xa8, 0xd4, 0xc5, 0x29, 0x66, 0x38, + 0xe6, 0xac, 0x22, 0x22, 0xd9, 0x64, 0x9b, 0x34, + 0x0a, 0x32, 0x9f, 0xc2, 0xbf, 0x17, 0x6c, 0x3f, + 0x71, 0x7a, 0x38, 0x6b, 0x98, 0xfb, 0x49, 0x36, + 0x89, 0xc9, 0xe2, 0xd6, 0xc7, 0x5d, 0xd0, 0x69, + 0x5f, 0x23, 0x35, 0xc9, 0x30, 0xe2, 0xfd, 0x44, + 0x58, 0x39, 0xd7, 0x97, 0xfb, 0x5c, 0x00, 0xd5, + 0x4f, 0x7a, 0x1a, 0x95, 0x8b, 0x62, 0x4b, 0xce, + 0xe5, 0x91, 0x21, 0x7b, 0x30, 0x00, 0xd6, 0xdd, + 0x6d, 0x02, 0x86, 0x49, 0x0f, 0x3c, 0x1a, 0x27, + 0x3c, 0xd3, 0x0e, 0x71, 0xf2, 0xff, 0xf5, 0x2f, + 0x87, 0xac, 0x67, 0x59, 0x81, 0xa3, 0xf7, 0xf8, + 0xd6, 0x11, 0x0c, 0x84, 0xa9, 0x03, 0xee, 0x2a, + 0xc4, 0xf3, 0x22, 0xab, 0x7c, 0xe2, 0x25, 0xf5, + 0x67, 0xa3, 0xe4, 0x11, 0xe0, 0x59, 0xb3, 0xca, + 0x87, 0xa0, 0xae, 0xc9, 0xa6, 0x62, 0x1b, 0x6e, + 0x4d, 0x02, 0x6b, 0x07, 0x9d, 0xfd, 0xd0, 0x92, + 0x06, 0xe1, 0xb2, 0x9a, 0x4a, 0x1f, 0x1f, 0x13, + 0x49, 0x99, 0x97, 0x08, 0xde, 0x7f, 0x98, 0xaf, + 0x51, 0x98, 0xee, 0x2c, 0xcb, 0xf0, 0x0b, 0xc6, + 0xb6, 0xb7, 0x2d, 0x9a, 0xb1, 0xac, 0xa6, 0xe3, + 0x15, 0x77, 0x9d, 0x6b, 0x1a, 0xe4, 0xfc, 0x8b, + 0xf2, 0x17, 0x59, 0x08, 0x04, 0x58, 0x81, 0x9d, + 0x1b, 0x1b, 0x69, 0x55, 0xc2, 0xb4, 0x3c, 0x1f, + 0x50, 0xf1, 0x7f, 0x77, 0x90, 0x4c, 0x66, 0x40, + 0x5a, 0xc0, 0x33, 0x1f, 0xcb, 0x05, 0x6d, 0x5c, + 0x06, 0x87, 0x52, 0xa2, 0x8f, 0x26, 0xd5, 0x4f +}; +static const u8 enc_output010[] __initconst = { + 0xe5, 0x26, 0xa4, 0x3d, 0xbd, 0x33, 0xd0, 0x4b, + 0x6f, 0x05, 0xa7, 0x6e, 0x12, 0x7a, 0xd2, 0x74, + 0xa6, 0xdd, 0xbd, 0x95, 0xeb, 0xf9, 0xa4, 0xf1, + 0x59, 0x93, 0x91, 0x70, 0xd9, 0xfe, 0x9a, 0xcd, + 0x53, 0x1f, 0x3a, 0xab, 0xa6, 0x7c, 0x9f, 0xa6, + 0x9e, 0xbd, 0x99, 0xd9, 0xb5, 0x97, 0x44, 0xd5, + 0x14, 0x48, 0x4d, 0x9d, 0xc0, 0xd0, 0x05, 0x96, + 0xeb, 0x4c, 0x78, 0x55, 0x09, 0x08, 0x01, 0x02, + 0x30, 0x90, 0x7b, 0x96, 0x7a, 0x7b, 0x5f, 0x30, + 0x41, 0x24, 0xce, 0x68, 0x61, 0x49, 0x86, 0x57, + 0x82, 0xdd, 0x53, 0x1c, 0x51, 0x28, 0x2b, 0x53, + 0x6e, 0x2d, 0xc2, 0x20, 0x4c, 0xdd, 0x8f, 0x65, + 0x10, 0x20, 0x50, 0xdd, 0x9d, 0x50, 0xe5, 0x71, + 0x40, 0x53, 0x69, 0xfc, 0x77, 0x48, 0x11, 0xb9, + 0xde, 0xa4, 0x8d, 0x58, 0xe4, 0xa6, 0x1a, 0x18, + 0x47, 0x81, 0x7e, 0xfc, 0xdd, 0xf6, 0xef, 0xce, + 0x2f, 0x43, 0x68, 0xd6, 0x06, 0xe2, 0x74, 0x6a, + 0xad, 0x90, 0xf5, 0x37, 0xf3, 0x3d, 0x82, 0x69, + 0x40, 0xe9, 0x6b, 0xa7, 0x3d, 0xa8, 0x1e, 0xd2, + 0x02, 0x7c, 0xb7, 0x9b, 0xe4, 0xda, 0x8f, 0x95, + 0x06, 0xc5, 0xdf, 0x73, 0xa3, 0x20, 0x9a, 0x49, + 0xde, 0x9c, 0xbc, 0xee, 0x14, 0x3f, 0x81, 0x5e, + 0xf8, 0x3b, 0x59, 0x3c, 0xe1, 0x68, 0x12, 0x5a, + 0x3a, 0x76, 0x3a, 0x3f, 0xf7, 0x87, 0x33, 0x0a, + 0x01, 0xb8, 0xd4, 0xed, 0xb6, 0xbe, 0x94, 0x5e, + 0x70, 0x40, 0x56, 0x67, 0x1f, 0x50, 0x44, 0x19, + 0xce, 0x82, 0x70, 0x10, 0x87, 0x13, 0x20, 0x0b, + 0x4c, 0x5a, 0xb6, 0xf6, 0xa7, 0xae, 0x81, 0x75, + 0x01, 0x81, 0xe6, 0x4b, 0x57, 0x7c, 0xdd, 0x6d, + 0xf8, 0x1c, 0x29, 0x32, 0xf7, 0xda, 0x3c, 0x2d, + 0xf8, 0x9b, 0x25, 0x6e, 0x00, 0xb4, 0xf7, 0x2f, + 0xf7, 0x04, 0xf7, 0xa1, 0x56, 0xac, 0x4f, 0x1a, + 0x64, 0xb8, 0x47, 0x55, 0x18, 0x7b, 0x07, 0x4d, + 0xbd, 0x47, 0x24, 0x80, 0x5d, 0xa2, 0x70, 0xc5, + 0xdd, 0x8e, 0x82, 0xd4, 0xeb, 0xec, 0xb2, 0x0c, + 0x39, 0xd2, 0x97, 0xc1, 0xcb, 0xeb, 0xf4, 0x77, + 0x59, 0xb4, 0x87, 0xef, 0xcb, 0x43, 0x2d, 0x46, + 0x54, 0xd1, 0xa7, 0xd7, 0x15, 0x99, 0x0a, 0x43, + 0xa1, 0xe0, 0x99, 0x33, 0x71, 0xc1, 0xed, 0xfe, + 0x72, 0x46, 0x33, 0x8e, 0x91, 0x08, 0x9f, 0xc8, + 0x2e, 0xca, 0xfa, 0xdc, 0x59, 0xd5, 0xc3, 0x76, + 0x84, 0x9f, 0xa3, 0x37, 0x68, 0xc3, 0xf0, 0x47, + 0x2c, 0x68, 0xdb, 0x5e, 0xc3, 0x49, 0x4c, 0xe8, + 0x92, 0x85, 0xe2, 0x23, 0xd3, 0x3f, 0xad, 0x32, + 0xe5, 0x2b, 0x82, 0xd7, 0x8f, 0x99, 0x0a, 0x59, + 0x5c, 0x45, 0xd9, 0xb4, 0x51, 0x52, 0xc2, 0xae, + 0xbf, 0x80, 0xcf, 0xc9, 0xc9, 0x51, 0x24, 0x2a, + 0x3b, 0x3a, 0x4d, 0xae, 0xeb, 0xbd, 0x22, 0xc3, + 0x0e, 0x0f, 0x59, 0x25, 0x92, 0x17, 0xe9, 0x74, + 0xc7, 0x8b, 0x70, 0x70, 0x36, 0x55, 0x95, 0x75, + 0x4b, 0xad, 0x61, 0x2b, 0x09, 0xbc, 0x82, 0xf2, + 0x6e, 0x94, 0x43, 0xae, 0xc3, 0xd5, 0xcd, 0x8e, + 0xfe, 0x5b, 0x9a, 0x88, 0x43, 0x01, 0x75, 0xb2, + 0x23, 0x09, 0xf7, 0x89, 0x83, 0xe7, 0xfa, 0xf9, + 0xb4, 0x9b, 0xf8, 0xef, 0xbd, 0x1c, 0x92, 0xc1, + 0xda, 0x7e, 0xfe, 0x05, 0xba, 0x5a, 0xcd, 0x07, + 0x6a, 0x78, 0x9e, 0x5d, 0xfb, 0x11, 0x2f, 0x79, + 0x38, 0xb6, 0xc2, 0x5b, 0x6b, 0x51, 0xb4, 0x71, + 0xdd, 0xf7, 0x2a, 0xe4, 0xf4, 0x72, 0x76, 0xad, + 0xc2, 0xdd, 0x64, 0x5d, 0x79, 0xb6, 0xf5, 0x7a, + 0x77, 0x20, 0x05, 0x3d, 0x30, 0x06, 0xd4, 0x4c, + 0x0a, 0x2c, 0x98, 0x5a, 0xb9, 0xd4, 0x98, 0xa9, + 0x3f, 0xc6, 0x12, 0xea, 0x3b, 0x4b, 0xc5, 0x79, + 0x64, 0x63, 0x6b, 0x09, 0x54, 0x3b, 0x14, 0x27, + 0xba, 0x99, 0x80, 0xc8, 0x72, 0xa8, 0x12, 0x90, + 0x29, 0xba, 0x40, 0x54, 0x97, 0x2b, 0x7b, 0xfe, + 0xeb, 0xcd, 0x01, 0x05, 0x44, 0x72, 0xdb, 0x99, + 0xe4, 0x61, 0xc9, 0x69, 0xd6, 0xb9, 0x28, 0xd1, + 0x05, 0x3e, 0xf9, 0x0b, 0x49, 0x0a, 0x49, 0xe9, + 0x8d, 0x0e, 0xa7, 0x4a, 0x0f, 0xaf, 0x32, 0xd0, + 0xe0, 0xb2, 0x3a, 0x55, 0x58, 0xfe, 0x5c, 0x28, + 0x70, 0x51, 0x23, 0xb0, 0x7b, 0x6a, 0x5f, 0x1e, + 0xb8, 0x17, 0xd7, 0x94, 0x15, 0x8f, 0xee, 0x20, + 0xc7, 0x42, 0x25, 0x3e, 0x9a, 0x14, 0xd7, 0x60, + 0x72, 0x39, 0x47, 0x48, 0xa9, 0xfe, 0xdd, 0x47, + 0x0a, 0xb1, 0xe6, 0x60, 0x28, 0x8c, 0x11, 0x68, + 0xe1, 0xff, 0xd7, 0xce, 0xc8, 0xbe, 0xb3, 0xfe, + 0x27, 0x30, 0x09, 0x70, 0xd7, 0xfa, 0x02, 0x33, + 0x3a, 0x61, 0x2e, 0xc7, 0xff, 0xa4, 0x2a, 0xa8, + 0x6e, 0xb4, 0x79, 0x35, 0x6d, 0x4c, 0x1e, 0x38, + 0xf8, 0xee, 0xd4, 0x84, 0x4e, 0x6e, 0x28, 0xa7, + 0xce, 0xc8, 0xc1, 0xcf, 0x80, 0x05, 0xf3, 0x04, + 0xef, 0xc8, 0x18, 0x28, 0x2e, 0x8d, 0x5e, 0x0c, + 0xdf, 0xb8, 0x5f, 0x96, 0xe8, 0xc6, 0x9c, 0x2f, + 0xe5, 0xa6, 0x44, 0xd7, 0xe7, 0x99, 0x44, 0x0c, + 0xec, 0xd7, 0x05, 0x60, 0x97, 0xbb, 0x74, 0x77, + 0x58, 0xd5, 0xbb, 0x48, 0xde, 0x5a, 0xb2, 0x54, + 0x7f, 0x0e, 0x46, 0x70, 0x6a, 0x6f, 0x78, 0xa5, + 0x08, 0x89, 0x05, 0x4e, 0x7e, 0xa0, 0x69, 0xb4, + 0x40, 0x60, 0x55, 0x77, 0x75, 0x9b, 0x19, 0xf2, + 0xd5, 0x13, 0x80, 0x77, 0xf9, 0x4b, 0x3f, 0x1e, + 0xee, 0xe6, 0x76, 0x84, 0x7b, 0x8c, 0xe5, 0x27, + 0xa8, 0x0a, 0x91, 0x01, 0x68, 0x71, 0x8a, 0x3f, + 0x06, 0xab, 0xf6, 0xa9, 0xa5, 0xe6, 0x72, 0x92, + 0xe4, 0x67, 0xe2, 0xa2, 0x46, 0x35, 0x84, 0x55, + 0x7d, 0xca, 0xa8, 0x85, 0xd0, 0xf1, 0x3f, 0xbe, + 0xd7, 0x34, 0x64, 0xfc, 0xae, 0xe3, 0xe4, 0x04, + 0x9f, 0x66, 0x02, 0xb9, 0x88, 0x10, 0xd9, 0xc4, + 0x4c, 0x31, 0x43, 0x7a, 0x93, 0xe2, 0x9b, 0x56, + 0x43, 0x84, 0xdc, 0xdc, 0xde, 0x1d, 0xa4, 0x02, + 0x0e, 0xc2, 0xef, 0xc3, 0xf8, 0x78, 0xd1, 0xb2, + 0x6b, 0x63, 0x18, 0xc9, 0xa9, 0xe5, 0x72, 0xd8, + 0xf3, 0xb9, 0xd1, 0x8a, 0xc7, 0x1a, 0x02, 0x27, + 0x20, 0x77, 0x10, 0xe5, 0xc8, 0xd4, 0x4a, 0x47, + 0xe5, 0xdf, 0x5f, 0x01, 0xaa, 0xb0, 0xd4, 0x10, + 0xbb, 0x69, 0xe3, 0x36, 0xc8, 0xe1, 0x3d, 0x43, + 0xfb, 0x86, 0xcd, 0xcc, 0xbf, 0xf4, 0x88, 0xe0, + 0x20, 0xca, 0xb7, 0x1b, 0xf1, 0x2f, 0x5c, 0xee, + 0xd4, 0xd3, 0xa3, 0xcc, 0xa4, 0x1e, 0x1c, 0x47, + 0xfb, 0xbf, 0xfc, 0xa2, 0x41, 0x55, 0x9d, 0xf6, + 0x5a, 0x5e, 0x65, 0x32, 0x34, 0x7b, 0x52, 0x8d, + 0xd5, 0xd0, 0x20, 0x60, 0x03, 0xab, 0x3f, 0x8c, + 0xd4, 0x21, 0xea, 0x2a, 0xd9, 0xc4, 0xd0, 0xd3, + 0x65, 0xd8, 0x7a, 0x13, 0x28, 0x62, 0x32, 0x4b, + 0x2c, 0x87, 0x93, 0xa8, 0xb4, 0x52, 0x45, 0x09, + 0x44, 0xec, 0xec, 0xc3, 0x17, 0xdb, 0x9a, 0x4d, + 0x5c, 0xa9, 0x11, 0xd4, 0x7d, 0xaf, 0x9e, 0xf1, + 0x2d, 0xb2, 0x66, 0xc5, 0x1d, 0xed, 0xb7, 0xcd, + 0x0b, 0x25, 0x5e, 0x30, 0x47, 0x3f, 0x40, 0xf4, + 0xa1, 0xa0, 0x00, 0x94, 0x10, 0xc5, 0x6a, 0x63, + 0x1a, 0xd5, 0x88, 0x92, 0x8e, 0x82, 0x39, 0x87, + 0x3c, 0x78, 0x65, 0x58, 0x42, 0x75, 0x5b, 0xdd, + 0x77, 0x3e, 0x09, 0x4e, 0x76, 0x5b, 0xe6, 0x0e, + 0x4d, 0x38, 0xb2, 0xc0, 0xb8, 0x95, 0x01, 0x7a, + 0x10, 0xe0, 0xfb, 0x07, 0xf2, 0xab, 0x2d, 0x8c, + 0x32, 0xed, 0x2b, 0xc0, 0x46, 0xc2, 0xf5, 0x38, + 0x83, 0xf0, 0x17, 0xec, 0xc1, 0x20, 0x6a, 0x9a, + 0x0b, 0x00, 0xa0, 0x98, 0x22, 0x50, 0x23, 0xd5, + 0x80, 0x6b, 0xf6, 0x1f, 0xc3, 0xcc, 0x97, 0xc9, + 0x24, 0x9f, 0xf3, 0xaf, 0x43, 0x14, 0xd5, 0xa0 +}; +static const u8 enc_assoc010[] __initconst = { + 0xd2, 0xa1, 0x70, 0xdb, 0x7a, 0xf8, 0xfa, 0x27, + 0xba, 0x73, 0x0f, 0xbf, 0x3d, 0x1e, 0x82, 0xb2 +}; +static const u8 enc_nonce010[] __initconst = { + 0xdb, 0x92, 0x0f, 0x7f, 0x17, 0x54, 0x0c, 0x30 +}; +static const u8 enc_key010[] __initconst = { + 0x47, 0x11, 0xeb, 0x86, 0x2b, 0x2c, 0xab, 0x44, + 0x34, 0xda, 0x7f, 0x57, 0x03, 0x39, 0x0c, 0xaf, + 0x2c, 0x14, 0xfd, 0x65, 0x23, 0xe9, 0x8e, 0x74, + 0xd5, 0x08, 0x68, 0x08, 0xe7, 0xb4, 0x72, 0xd7 +}; + +static const u8 enc_input011[] __initconst = { + 0x7a, 0x57, 0xf2, 0xc7, 0x06, 0x3f, 0x50, 0x7b, + 0x36, 0x1a, 0x66, 0x5c, 0xb9, 0x0e, 0x5e, 0x3b, + 0x45, 0x60, 0xbe, 0x9a, 0x31, 0x9f, 0xff, 0x5d, + 0x66, 0x34, 0xb4, 0xdc, 0xfb, 0x9d, 0x8e, 0xee, + 0x6a, 0x33, 0xa4, 0x07, 0x3c, 0xf9, 0x4c, 0x30, + 0xa1, 0x24, 0x52, 0xf9, 0x50, 0x46, 0x88, 0x20, + 0x02, 0x32, 0x3a, 0x0e, 0x99, 0x63, 0xaf, 0x1f, + 0x15, 0x28, 0x2a, 0x05, 0xff, 0x57, 0x59, 0x5e, + 0x18, 0xa1, 0x1f, 0xd0, 0x92, 0x5c, 0x88, 0x66, + 0x1b, 0x00, 0x64, 0xa5, 0x93, 0x8d, 0x06, 0x46, + 0xb0, 0x64, 0x8b, 0x8b, 0xef, 0x99, 0x05, 0x35, + 0x85, 0xb3, 0xf3, 0x33, 0xbb, 0xec, 0x66, 0xb6, + 0x3d, 0x57, 0x42, 0xe3, 0xb4, 0xc6, 0xaa, 0xb0, + 0x41, 0x2a, 0xb9, 0x59, 0xa9, 0xf6, 0x3e, 0x15, + 0x26, 0x12, 0x03, 0x21, 0x4c, 0x74, 0x43, 0x13, + 0x2a, 0x03, 0x27, 0x09, 0xb4, 0xfb, 0xe7, 0xb7, + 0x40, 0xff, 0x5e, 0xce, 0x48, 0x9a, 0x60, 0xe3, + 0x8b, 0x80, 0x8c, 0x38, 0x2d, 0xcb, 0x93, 0x37, + 0x74, 0x05, 0x52, 0x6f, 0x73, 0x3e, 0xc3, 0xbc, + 0xca, 0x72, 0x0a, 0xeb, 0xf1, 0x3b, 0xa0, 0x95, + 0xdc, 0x8a, 0xc4, 0xa9, 0xdc, 0xca, 0x44, 0xd8, + 0x08, 0x63, 0x6a, 0x36, 0xd3, 0x3c, 0xb8, 0xac, + 0x46, 0x7d, 0xfd, 0xaa, 0xeb, 0x3e, 0x0f, 0x45, + 0x8f, 0x49, 0xda, 0x2b, 0xf2, 0x12, 0xbd, 0xaf, + 0x67, 0x8a, 0x63, 0x48, 0x4b, 0x55, 0x5f, 0x6d, + 0x8c, 0xb9, 0x76, 0x34, 0x84, 0xae, 0xc2, 0xfc, + 0x52, 0x64, 0x82, 0xf7, 0xb0, 0x06, 0xf0, 0x45, + 0x73, 0x12, 0x50, 0x30, 0x72, 0xea, 0x78, 0x9a, + 0xa8, 0xaf, 0xb5, 0xe3, 0xbb, 0x77, 0x52, 0xec, + 0x59, 0x84, 0xbf, 0x6b, 0x8f, 0xce, 0x86, 0x5e, + 0x1f, 0x23, 0xe9, 0xfb, 0x08, 0x86, 0xf7, 0x10, + 0xb9, 0xf2, 0x44, 0x96, 0x44, 0x63, 0xa9, 0xa8, + 0x78, 0x00, 0x23, 0xd6, 0xc7, 0xe7, 0x6e, 0x66, + 0x4f, 0xcc, 0xee, 0x15, 0xb3, 0xbd, 0x1d, 0xa0, + 0xe5, 0x9c, 0x1b, 0x24, 0x2c, 0x4d, 0x3c, 0x62, + 0x35, 0x9c, 0x88, 0x59, 0x09, 0xdd, 0x82, 0x1b, + 0xcf, 0x0a, 0x83, 0x6b, 0x3f, 0xae, 0x03, 0xc4, + 0xb4, 0xdd, 0x7e, 0x5b, 0x28, 0x76, 0x25, 0x96, + 0xd9, 0xc9, 0x9d, 0x5f, 0x86, 0xfa, 0xf6, 0xd7, + 0xd2, 0xe6, 0x76, 0x1d, 0x0f, 0xa1, 0xdc, 0x74, + 0x05, 0x1b, 0x1d, 0xe0, 0xcd, 0x16, 0xb0, 0xa8, + 0x8a, 0x34, 0x7b, 0x15, 0x11, 0x77, 0xe5, 0x7b, + 0x7e, 0x20, 0xf7, 0xda, 0x38, 0xda, 0xce, 0x70, + 0xe9, 0xf5, 0x6c, 0xd9, 0xbe, 0x0c, 0x4c, 0x95, + 0x4c, 0xc2, 0x9b, 0x34, 0x55, 0x55, 0xe1, 0xf3, + 0x46, 0x8e, 0x48, 0x74, 0x14, 0x4f, 0x9d, 0xc9, + 0xf5, 0xe8, 0x1a, 0xf0, 0x11, 0x4a, 0xc1, 0x8d, + 0xe0, 0x93, 0xa0, 0xbe, 0x09, 0x1c, 0x2b, 0x4e, + 0x0f, 0xb2, 0x87, 0x8b, 0x84, 0xfe, 0x92, 0x32, + 0x14, 0xd7, 0x93, 0xdf, 0xe7, 0x44, 0xbc, 0xc5, + 0xae, 0x53, 0x69, 0xd8, 0xb3, 0x79, 0x37, 0x80, + 0xe3, 0x17, 0x5c, 0xec, 0x53, 0x00, 0x9a, 0xe3, + 0x8e, 0xdc, 0x38, 0xb8, 0x66, 0xf0, 0xd3, 0xad, + 0x1d, 0x02, 0x96, 0x86, 0x3e, 0x9d, 0x3b, 0x5d, + 0xa5, 0x7f, 0x21, 0x10, 0xf1, 0x1f, 0x13, 0x20, + 0xf9, 0x57, 0x87, 0x20, 0xf5, 0x5f, 0xf1, 0x17, + 0x48, 0x0a, 0x51, 0x5a, 0xcd, 0x19, 0x03, 0xa6, + 0x5a, 0xd1, 0x12, 0x97, 0xe9, 0x48, 0xe2, 0x1d, + 0x83, 0x75, 0x50, 0xd9, 0x75, 0x7d, 0x6a, 0x82, + 0xa1, 0xf9, 0x4e, 0x54, 0x87, 0x89, 0xc9, 0x0c, + 0xb7, 0x5b, 0x6a, 0x91, 0xc1, 0x9c, 0xb2, 0xa9, + 0xdc, 0x9a, 0xa4, 0x49, 0x0a, 0x6d, 0x0d, 0xbb, + 0xde, 0x86, 0x44, 0xdd, 0x5d, 0x89, 0x2b, 0x96, + 0x0f, 0x23, 0x95, 0xad, 0xcc, 0xa2, 0xb3, 0xb9, + 0x7e, 0x74, 0x38, 0xba, 0x9f, 0x73, 0xae, 0x5f, + 0xf8, 0x68, 0xa2, 0xe0, 0xa9, 0xce, 0xbd, 0x40, + 0xd4, 0x4c, 0x6b, 0xd2, 0x56, 0x62, 0xb0, 0xcc, + 0x63, 0x7e, 0x5b, 0xd3, 0xae, 0xd1, 0x75, 0xce, + 0xbb, 0xb4, 0x5b, 0xa8, 0xf8, 0xb4, 0xac, 0x71, + 0x75, 0xaa, 0xc9, 0x9f, 0xbb, 0x6c, 0xad, 0x0f, + 0x55, 0x5d, 0xe8, 0x85, 0x7d, 0xf9, 0x21, 0x35, + 0xea, 0x92, 0x85, 0x2b, 0x00, 0xec, 0x84, 0x90, + 0x0a, 0x63, 0x96, 0xe4, 0x6b, 0xa9, 0x77, 0xb8, + 0x91, 0xf8, 0x46, 0x15, 0x72, 0x63, 0x70, 0x01, + 0x40, 0xa3, 0xa5, 0x76, 0x62, 0x2b, 0xbf, 0xf1, + 0xe5, 0x8d, 0x9f, 0xa3, 0xfa, 0x9b, 0x03, 0xbe, + 0xfe, 0x65, 0x6f, 0xa2, 0x29, 0x0d, 0x54, 0xb4, + 0x71, 0xce, 0xa9, 0xd6, 0x3d, 0x88, 0xf9, 0xaf, + 0x6b, 0xa8, 0x9e, 0xf4, 0x16, 0x96, 0x36, 0xb9, + 0x00, 0xdc, 0x10, 0xab, 0xb5, 0x08, 0x31, 0x1f, + 0x00, 0xb1, 0x3c, 0xd9, 0x38, 0x3e, 0xc6, 0x04, + 0xa7, 0x4e, 0xe8, 0xae, 0xed, 0x98, 0xc2, 0xf7, + 0xb9, 0x00, 0x5f, 0x8c, 0x60, 0xd1, 0xe5, 0x15, + 0xf7, 0xae, 0x1e, 0x84, 0x88, 0xd1, 0xf6, 0xbc, + 0x3a, 0x89, 0x35, 0x22, 0x83, 0x7c, 0xca, 0xf0, + 0x33, 0x82, 0x4c, 0x79, 0x3c, 0xfd, 0xb1, 0xae, + 0x52, 0x62, 0x55, 0xd2, 0x41, 0x60, 0xc6, 0xbb, + 0xfa, 0x0e, 0x59, 0xd6, 0xa8, 0xfe, 0x5d, 0xed, + 0x47, 0x3d, 0xe0, 0xea, 0x1f, 0x6e, 0x43, 0x51, + 0xec, 0x10, 0x52, 0x56, 0x77, 0x42, 0x6b, 0x52, + 0x87, 0xd8, 0xec, 0xe0, 0xaa, 0x76, 0xa5, 0x84, + 0x2a, 0x22, 0x24, 0xfd, 0x92, 0x40, 0x88, 0xd5, + 0x85, 0x1c, 0x1f, 0x6b, 0x47, 0xa0, 0xc4, 0xe4, + 0xef, 0xf4, 0xea, 0xd7, 0x59, 0xac, 0x2a, 0x9e, + 0x8c, 0xfa, 0x1f, 0x42, 0x08, 0xfe, 0x4f, 0x74, + 0xa0, 0x26, 0xf5, 0xb3, 0x84, 0xf6, 0x58, 0x5f, + 0x26, 0x66, 0x3e, 0xd7, 0xe4, 0x22, 0x91, 0x13, + 0xc8, 0xac, 0x25, 0x96, 0x23, 0xd8, 0x09, 0xea, + 0x45, 0x75, 0x23, 0xb8, 0x5f, 0xc2, 0x90, 0x8b, + 0x09, 0xc4, 0xfc, 0x47, 0x6c, 0x6d, 0x0a, 0xef, + 0x69, 0xa4, 0x38, 0x19, 0xcf, 0x7d, 0xf9, 0x09, + 0x73, 0x9b, 0x60, 0x5a, 0xf7, 0x37, 0xb5, 0xfe, + 0x9f, 0xe3, 0x2b, 0x4c, 0x0d, 0x6e, 0x19, 0xf1, + 0xd6, 0xc0, 0x70, 0xf3, 0x9d, 0x22, 0x3c, 0xf9, + 0x49, 0xce, 0x30, 0x8e, 0x44, 0xb5, 0x76, 0x15, + 0x8f, 0x52, 0xfd, 0xa5, 0x04, 0xb8, 0x55, 0x6a, + 0x36, 0x59, 0x7c, 0xc4, 0x48, 0xb8, 0xd7, 0xab, + 0x05, 0x66, 0xe9, 0x5e, 0x21, 0x6f, 0x6b, 0x36, + 0x29, 0xbb, 0xe9, 0xe3, 0xa2, 0x9a, 0xa8, 0xcd, + 0x55, 0x25, 0x11, 0xba, 0x5a, 0x58, 0xa0, 0xde, + 0xae, 0x19, 0x2a, 0x48, 0x5a, 0xff, 0x36, 0xcd, + 0x6d, 0x16, 0x7a, 0x73, 0x38, 0x46, 0xe5, 0x47, + 0x59, 0xc8, 0xa2, 0xf6, 0xe2, 0x6c, 0x83, 0xc5, + 0x36, 0x2c, 0x83, 0x7d, 0xb4, 0x01, 0x05, 0x69, + 0xe7, 0xaf, 0x5c, 0xc4, 0x64, 0x82, 0x12, 0x21, + 0xef, 0xf7, 0xd1, 0x7d, 0xb8, 0x8d, 0x8c, 0x98, + 0x7c, 0x5f, 0x7d, 0x92, 0x88, 0xb9, 0x94, 0x07, + 0x9c, 0xd8, 0xe9, 0x9c, 0x17, 0x38, 0xe3, 0x57, + 0x6c, 0xe0, 0xdc, 0xa5, 0x92, 0x42, 0xb3, 0xbd, + 0x50, 0xa2, 0x7e, 0xb5, 0xb1, 0x52, 0x72, 0x03, + 0x97, 0xd8, 0xaa, 0x9a, 0x1e, 0x75, 0x41, 0x11, + 0xa3, 0x4f, 0xcc, 0xd4, 0xe3, 0x73, 0xad, 0x96, + 0xdc, 0x47, 0x41, 0x9f, 0xb0, 0xbe, 0x79, 0x91, + 0xf5, 0xb6, 0x18, 0xfe, 0xc2, 0x83, 0x18, 0x7d, + 0x73, 0xd9, 0x4f, 0x83, 0x84, 0x03, 0xb3, 0xf0, + 0x77, 0x66, 0x3d, 0x83, 0x63, 0x2e, 0x2c, 0xf9, + 0xdd, 0xa6, 0x1f, 0x89, 0x82, 0xb8, 0x23, 0x42, + 0xeb, 0xe2, 0xca, 0x70, 0x82, 0x61, 0x41, 0x0a, + 0x6d, 0x5f, 0x75, 0xc5, 0xe2, 0xc4, 0x91, 0x18, + 0x44, 0x22, 0xfa, 0x34, 0x10, 0xf5, 0x20, 0xdc, + 0xb7, 0xdd, 0x2a, 0x20, 0x77, 0xf5, 0xf9, 0xce, + 0xdb, 0xa0, 0x0a, 0x52, 0x2a, 0x4e, 0xdd, 0xcc, + 0x97, 0xdf, 0x05, 0xe4, 0x5e, 0xb7, 0xaa, 0xf0, + 0xe2, 0x80, 0xff, 0xba, 0x1a, 0x0f, 0xac, 0xdf, + 0x02, 0x32, 0xe6, 0xf7, 0xc7, 0x17, 0x13, 0xb7, + 0xfc, 0x98, 0x48, 0x8c, 0x0d, 0x82, 0xc9, 0x80, + 0x7a, 0xe2, 0x0a, 0xc5, 0xb4, 0xde, 0x7c, 0x3c, + 0x79, 0x81, 0x0e, 0x28, 0x65, 0x79, 0x67, 0x82, + 0x69, 0x44, 0x66, 0x09, 0xf7, 0x16, 0x1a, 0xf9, + 0x7d, 0x80, 0xa1, 0x79, 0x14, 0xa9, 0xc8, 0x20, + 0xfb, 0xa2, 0x46, 0xbe, 0x08, 0x35, 0x17, 0x58, + 0xc1, 0x1a, 0xda, 0x2a, 0x6b, 0x2e, 0x1e, 0xe6, + 0x27, 0x55, 0x7b, 0x19, 0xe2, 0xfb, 0x64, 0xfc, + 0x5e, 0x15, 0x54, 0x3c, 0xe7, 0xc2, 0x11, 0x50, + 0x30, 0xb8, 0x72, 0x03, 0x0b, 0x1a, 0x9f, 0x86, + 0x27, 0x11, 0x5c, 0x06, 0x2b, 0xbd, 0x75, 0x1a, + 0x0a, 0xda, 0x01, 0xfa, 0x5c, 0x4a, 0xc1, 0x80, + 0x3a, 0x6e, 0x30, 0xc8, 0x2c, 0xeb, 0x56, 0xec, + 0x89, 0xfa, 0x35, 0x7b, 0xb2, 0xf0, 0x97, 0x08, + 0x86, 0x53, 0xbe, 0xbd, 0x40, 0x41, 0x38, 0x1c, + 0xb4, 0x8b, 0x79, 0x2e, 0x18, 0x96, 0x94, 0xde, + 0xe8, 0xca, 0xe5, 0x9f, 0x92, 0x9f, 0x15, 0x5d, + 0x56, 0x60, 0x5c, 0x09, 0xf9, 0x16, 0xf4, 0x17, + 0x0f, 0xf6, 0x4c, 0xda, 0xe6, 0x67, 0x89, 0x9f, + 0xca, 0x6c, 0xe7, 0x9b, 0x04, 0x62, 0x0e, 0x26, + 0xa6, 0x52, 0xbd, 0x29, 0xff, 0xc7, 0xa4, 0x96, + 0xe6, 0x6a, 0x02, 0xa5, 0x2e, 0x7b, 0xfe, 0x97, + 0x68, 0x3e, 0x2e, 0x5f, 0x3b, 0x0f, 0x36, 0xd6, + 0x98, 0x19, 0x59, 0x48, 0xd2, 0xc6, 0xe1, 0x55, + 0x1a, 0x6e, 0xd6, 0xed, 0x2c, 0xba, 0xc3, 0x9e, + 0x64, 0xc9, 0x95, 0x86, 0x35, 0x5e, 0x3e, 0x88, + 0x69, 0x99, 0x4b, 0xee, 0xbe, 0x9a, 0x99, 0xb5, + 0x6e, 0x58, 0xae, 0xdd, 0x22, 0xdb, 0xdd, 0x6b, + 0xfc, 0xaf, 0x90, 0xa3, 0x3d, 0xa4, 0xc1, 0x15, + 0x92, 0x18, 0x8d, 0xd2, 0x4b, 0x7b, 0x06, 0xd1, + 0x37, 0xb5, 0xe2, 0x7c, 0x2c, 0xf0, 0x25, 0xe4, + 0x94, 0x2a, 0xbd, 0xe3, 0x82, 0x70, 0x78, 0xa3, + 0x82, 0x10, 0x5a, 0x90, 0xd7, 0xa4, 0xfa, 0xaf, + 0x1a, 0x88, 0x59, 0xdc, 0x74, 0x12, 0xb4, 0x8e, + 0xd7, 0x19, 0x46, 0xf4, 0x84, 0x69, 0x9f, 0xbb, + 0x70, 0xa8, 0x4c, 0x52, 0x81, 0xa9, 0xff, 0x76, + 0x1c, 0xae, 0xd8, 0x11, 0x3d, 0x7f, 0x7d, 0xc5, + 0x12, 0x59, 0x28, 0x18, 0xc2, 0xa2, 0xb7, 0x1c, + 0x88, 0xf8, 0xd6, 0x1b, 0xa6, 0x7d, 0x9e, 0xde, + 0x29, 0xf8, 0xed, 0xff, 0xeb, 0x92, 0x24, 0x4f, + 0x05, 0xaa, 0xd9, 0x49, 0xba, 0x87, 0x59, 0x51, + 0xc9, 0x20, 0x5c, 0x9b, 0x74, 0xcf, 0x03, 0xd9, + 0x2d, 0x34, 0xc7, 0x5b, 0xa5, 0x40, 0xb2, 0x99, + 0xf5, 0xcb, 0xb4, 0xf6, 0xb7, 0x72, 0x4a, 0xd6, + 0xbd, 0xb0, 0xf3, 0x93, 0xe0, 0x1b, 0xa8, 0x04, + 0x1e, 0x35, 0xd4, 0x80, 0x20, 0xf4, 0x9c, 0x31, + 0x6b, 0x45, 0xb9, 0x15, 0xb0, 0x5e, 0xdd, 0x0a, + 0x33, 0x9c, 0x83, 0xcd, 0x58, 0x89, 0x50, 0x56, + 0xbb, 0x81, 0x00, 0x91, 0x32, 0xf3, 0x1b, 0x3e, + 0xcf, 0x45, 0xe1, 0xf9, 0xe1, 0x2c, 0x26, 0x78, + 0x93, 0x9a, 0x60, 0x46, 0xc9, 0xb5, 0x5e, 0x6a, + 0x28, 0x92, 0x87, 0x3f, 0x63, 0x7b, 0xdb, 0xf7, + 0xd0, 0x13, 0x9d, 0x32, 0x40, 0x5e, 0xcf, 0xfb, + 0x79, 0x68, 0x47, 0x4c, 0xfd, 0x01, 0x17, 0xe6, + 0x97, 0x93, 0x78, 0xbb, 0xa6, 0x27, 0xa3, 0xe8, + 0x1a, 0xe8, 0x94, 0x55, 0x7d, 0x08, 0xe5, 0xdc, + 0x66, 0xa3, 0x69, 0xc8, 0xca, 0xc5, 0xa1, 0x84, + 0x55, 0xde, 0x08, 0x91, 0x16, 0x3a, 0x0c, 0x86, + 0xab, 0x27, 0x2b, 0x64, 0x34, 0x02, 0x6c, 0x76, + 0x8b, 0xc6, 0xaf, 0xcc, 0xe1, 0xd6, 0x8c, 0x2a, + 0x18, 0x3d, 0xa6, 0x1b, 0x37, 0x75, 0x45, 0x73, + 0xc2, 0x75, 0xd7, 0x53, 0x78, 0x3a, 0xd6, 0xe8, + 0x29, 0xd2, 0x4a, 0xa8, 0x1e, 0x82, 0xf6, 0xb6, + 0x81, 0xde, 0x21, 0xed, 0x2b, 0x56, 0xbb, 0xf2, + 0xd0, 0x57, 0xc1, 0x7c, 0xd2, 0x6a, 0xd2, 0x56, + 0xf5, 0x13, 0x5f, 0x1c, 0x6a, 0x0b, 0x74, 0xfb, + 0xe9, 0xfe, 0x9e, 0xea, 0x95, 0xb2, 0x46, 0xab, + 0x0a, 0xfc, 0xfd, 0xf3, 0xbb, 0x04, 0x2b, 0x76, + 0x1b, 0xa4, 0x74, 0xb0, 0xc1, 0x78, 0xc3, 0x69, + 0xe2, 0xb0, 0x01, 0xe1, 0xde, 0x32, 0x4c, 0x8d, + 0x1a, 0xb3, 0x38, 0x08, 0xd5, 0xfc, 0x1f, 0xdc, + 0x0e, 0x2c, 0x9c, 0xb1, 0xa1, 0x63, 0x17, 0x22, + 0xf5, 0x6c, 0x93, 0x70, 0x74, 0x00, 0xf8, 0x39, + 0x01, 0x94, 0xd1, 0x32, 0x23, 0x56, 0x5d, 0xa6, + 0x02, 0x76, 0x76, 0x93, 0xce, 0x2f, 0x19, 0xe9, + 0x17, 0x52, 0xae, 0x6e, 0x2c, 0x6d, 0x61, 0x7f, + 0x3b, 0xaa, 0xe0, 0x52, 0x85, 0xc5, 0x65, 0xc1, + 0xbb, 0x8e, 0x5b, 0x21, 0xd5, 0xc9, 0x78, 0x83, + 0x07, 0x97, 0x4c, 0x62, 0x61, 0x41, 0xd4, 0xfc, + 0xc9, 0x39, 0xe3, 0x9b, 0xd0, 0xcc, 0x75, 0xc4, + 0x97, 0xe6, 0xdd, 0x2a, 0x5f, 0xa6, 0xe8, 0x59, + 0x6c, 0x98, 0xb9, 0x02, 0xe2, 0xa2, 0xd6, 0x68, + 0xee, 0x3b, 0x1d, 0xe3, 0x4d, 0x5b, 0x30, 0xef, + 0x03, 0xf2, 0xeb, 0x18, 0x57, 0x36, 0xe8, 0xa1, + 0xf4, 0x47, 0xfb, 0xcb, 0x8f, 0xcb, 0xc8, 0xf3, + 0x4f, 0x74, 0x9d, 0x9d, 0xb1, 0x8d, 0x14, 0x44, + 0xd9, 0x19, 0xb4, 0x54, 0x4f, 0x75, 0x19, 0x09, + 0xa0, 0x75, 0xbc, 0x3b, 0x82, 0xc6, 0x3f, 0xb8, + 0x83, 0x19, 0x6e, 0xd6, 0x37, 0xfe, 0x6e, 0x8a, + 0x4e, 0xe0, 0x4a, 0xab, 0x7b, 0xc8, 0xb4, 0x1d, + 0xf4, 0xed, 0x27, 0x03, 0x65, 0xa2, 0xa1, 0xae, + 0x11, 0xe7, 0x98, 0x78, 0x48, 0x91, 0xd2, 0xd2, + 0xd4, 0x23, 0x78, 0x50, 0xb1, 0x5b, 0x85, 0x10, + 0x8d, 0xca, 0x5f, 0x0f, 0x71, 0xae, 0x72, 0x9a, + 0xf6, 0x25, 0x19, 0x60, 0x06, 0xf7, 0x10, 0x34, + 0x18, 0x0d, 0xc9, 0x9f, 0x7b, 0x0c, 0x9b, 0x8f, + 0x91, 0x1b, 0x9f, 0xcd, 0x10, 0xee, 0x75, 0xf9, + 0x97, 0x66, 0xfc, 0x4d, 0x33, 0x6e, 0x28, 0x2b, + 0x92, 0x85, 0x4f, 0xab, 0x43, 0x8d, 0x8f, 0x7d, + 0x86, 0xa7, 0xc7, 0xd8, 0xd3, 0x0b, 0x8b, 0x57, + 0xb6, 0x1d, 0x95, 0x0d, 0xe9, 0xbc, 0xd9, 0x03, + 0xd9, 0x10, 0x19, 0xc3, 0x46, 0x63, 0x55, 0x87, + 0x61, 0x79, 0x6c, 0x95, 0x0e, 0x9c, 0xdd, 0xca, + 0xc3, 0xf3, 0x64, 0xf0, 0x7d, 0x76, 0xb7, 0x53, + 0x67, 0x2b, 0x1e, 0x44, 0x56, 0x81, 0xea, 0x8f, + 0x5c, 0x42, 0x16, 0xb8, 0x28, 0xeb, 0x1b, 0x61, + 0x10, 0x1e, 0xbf, 0xec, 0xa8 +}; +static const u8 enc_output011[] __initconst = { + 0x6a, 0xfc, 0x4b, 0x25, 0xdf, 0xc0, 0xe4, 0xe8, + 0x17, 0x4d, 0x4c, 0xc9, 0x7e, 0xde, 0x3a, 0xcc, + 0x3c, 0xba, 0x6a, 0x77, 0x47, 0xdb, 0xe3, 0x74, + 0x7a, 0x4d, 0x5f, 0x8d, 0x37, 0x55, 0x80, 0x73, + 0x90, 0x66, 0x5d, 0x3a, 0x7d, 0x5d, 0x86, 0x5e, + 0x8d, 0xfd, 0x83, 0xff, 0x4e, 0x74, 0x6f, 0xf9, + 0xe6, 0x70, 0x17, 0x70, 0x3e, 0x96, 0xa7, 0x7e, + 0xcb, 0xab, 0x8f, 0x58, 0x24, 0x9b, 0x01, 0xfd, + 0xcb, 0xe6, 0x4d, 0x9b, 0xf0, 0x88, 0x94, 0x57, + 0x66, 0xef, 0x72, 0x4c, 0x42, 0x6e, 0x16, 0x19, + 0x15, 0xea, 0x70, 0x5b, 0xac, 0x13, 0xdb, 0x9f, + 0x18, 0xe2, 0x3c, 0x26, 0x97, 0xbc, 0xdc, 0x45, + 0x8c, 0x6c, 0x24, 0x69, 0x9c, 0xf7, 0x65, 0x1e, + 0x18, 0x59, 0x31, 0x7c, 0xe4, 0x73, 0xbc, 0x39, + 0x62, 0xc6, 0x5c, 0x9f, 0xbf, 0xfa, 0x90, 0x03, + 0xc9, 0x72, 0x26, 0xb6, 0x1b, 0xc2, 0xb7, 0x3f, + 0xf2, 0x13, 0x77, 0xf2, 0x8d, 0xb9, 0x47, 0xd0, + 0x53, 0xdd, 0xc8, 0x91, 0x83, 0x8b, 0xb1, 0xce, + 0xa3, 0xfe, 0xcd, 0xd9, 0xdd, 0x92, 0x7b, 0xdb, + 0xb8, 0xfb, 0xc9, 0x2d, 0x01, 0x59, 0x39, 0x52, + 0xad, 0x1b, 0xec, 0xcf, 0xd7, 0x70, 0x13, 0x21, + 0xf5, 0x47, 0xaa, 0x18, 0x21, 0x5c, 0xc9, 0x9a, + 0xd2, 0x6b, 0x05, 0x9c, 0x01, 0xa1, 0xda, 0x35, + 0x5d, 0xb3, 0x70, 0xe6, 0xa9, 0x80, 0x8b, 0x91, + 0xb7, 0xb3, 0x5f, 0x24, 0x9a, 0xb7, 0xd1, 0x6b, + 0xa1, 0x1c, 0x50, 0xba, 0x49, 0xe0, 0xee, 0x2e, + 0x75, 0xac, 0x69, 0xc0, 0xeb, 0x03, 0xdd, 0x19, + 0xe5, 0xf6, 0x06, 0xdd, 0xc3, 0xd7, 0x2b, 0x07, + 0x07, 0x30, 0xa7, 0x19, 0x0c, 0xbf, 0xe6, 0x18, + 0xcc, 0xb1, 0x01, 0x11, 0x85, 0x77, 0x1d, 0x96, + 0xa7, 0xa3, 0x00, 0x84, 0x02, 0xa2, 0x83, 0x68, + 0xda, 0x17, 0x27, 0xc8, 0x7f, 0x23, 0xb7, 0xf4, + 0x13, 0x85, 0xcf, 0xdd, 0x7a, 0x7d, 0x24, 0x57, + 0xfe, 0x05, 0x93, 0xf5, 0x74, 0xce, 0xed, 0x0c, + 0x20, 0x98, 0x8d, 0x92, 0x30, 0xa1, 0x29, 0x23, + 0x1a, 0xa0, 0x4f, 0x69, 0x56, 0x4c, 0xe1, 0xc8, + 0xce, 0xf6, 0x9a, 0x0c, 0xa4, 0xfa, 0x04, 0xf6, + 0x62, 0x95, 0xf2, 0xfa, 0xc7, 0x40, 0x68, 0x40, + 0x8f, 0x41, 0xda, 0xb4, 0x26, 0x6f, 0x70, 0xab, + 0x40, 0x61, 0xa4, 0x0e, 0x75, 0xfb, 0x86, 0xeb, + 0x9d, 0x9a, 0x1f, 0xec, 0x76, 0x99, 0xe7, 0xea, + 0xaa, 0x1e, 0x2d, 0xb5, 0xd4, 0xa6, 0x1a, 0xb8, + 0x61, 0x0a, 0x1d, 0x16, 0x5b, 0x98, 0xc2, 0x31, + 0x40, 0xe7, 0x23, 0x1d, 0x66, 0x99, 0xc8, 0xc0, + 0xd7, 0xce, 0xf3, 0x57, 0x40, 0x04, 0x3f, 0xfc, + 0xea, 0xb3, 0xfc, 0xd2, 0xd3, 0x99, 0xa4, 0x94, + 0x69, 0xa0, 0xef, 0xd1, 0x85, 0xb3, 0xa6, 0xb1, + 0x28, 0xbf, 0x94, 0x67, 0x22, 0xc3, 0x36, 0x46, + 0xf8, 0xd2, 0x0f, 0x5f, 0xf4, 0x59, 0x80, 0xe6, + 0x2d, 0x43, 0x08, 0x7d, 0x19, 0x09, 0x97, 0xa7, + 0x4c, 0x3d, 0x8d, 0xba, 0x65, 0x62, 0xa3, 0x71, + 0x33, 0x29, 0x62, 0xdb, 0xc1, 0x33, 0x34, 0x1a, + 0x63, 0x33, 0x16, 0xb6, 0x64, 0x7e, 0xab, 0x33, + 0xf0, 0xe6, 0x26, 0x68, 0xba, 0x1d, 0x2e, 0x38, + 0x08, 0xe6, 0x02, 0xd3, 0x25, 0x2c, 0x47, 0x23, + 0x58, 0x34, 0x0f, 0x9d, 0x63, 0x4f, 0x63, 0xbb, + 0x7f, 0x3b, 0x34, 0x38, 0xa7, 0xb5, 0x8d, 0x65, + 0xd9, 0x9f, 0x79, 0x55, 0x3e, 0x4d, 0xe7, 0x73, + 0xd8, 0xf6, 0x98, 0x97, 0x84, 0x60, 0x9c, 0xc8, + 0xa9, 0x3c, 0xf6, 0xdc, 0x12, 0x5c, 0xe1, 0xbb, + 0x0b, 0x8b, 0x98, 0x9c, 0x9d, 0x26, 0x7c, 0x4a, + 0xe6, 0x46, 0x36, 0x58, 0x21, 0x4a, 0xee, 0xca, + 0xd7, 0x3b, 0xc2, 0x6c, 0x49, 0x2f, 0xe5, 0xd5, + 0x03, 0x59, 0x84, 0x53, 0xcb, 0xfe, 0x92, 0x71, + 0x2e, 0x7c, 0x21, 0xcc, 0x99, 0x85, 0x7f, 0xb8, + 0x74, 0x90, 0x13, 0x42, 0x3f, 0xe0, 0x6b, 0x1d, + 0xf2, 0x4d, 0x54, 0xd4, 0xfc, 0x3a, 0x05, 0xe6, + 0x74, 0xaf, 0xa6, 0xa0, 0x2a, 0x20, 0x23, 0x5d, + 0x34, 0x5c, 0xd9, 0x3e, 0x4e, 0xfa, 0x93, 0xe7, + 0xaa, 0xe9, 0x6f, 0x08, 0x43, 0x67, 0x41, 0xc5, + 0xad, 0xfb, 0x31, 0x95, 0x82, 0x73, 0x32, 0xd8, + 0xa6, 0xa3, 0xed, 0x0e, 0x2d, 0xf6, 0x5f, 0xfd, + 0x80, 0xa6, 0x7a, 0xe0, 0xdf, 0x78, 0x15, 0x29, + 0x74, 0x33, 0xd0, 0x9e, 0x83, 0x86, 0x72, 0x22, + 0x57, 0x29, 0xb9, 0x9e, 0x5d, 0xd3, 0x1a, 0xb5, + 0x96, 0x72, 0x41, 0x3d, 0xf1, 0x64, 0x43, 0x67, + 0xee, 0xaa, 0x5c, 0xd3, 0x9a, 0x96, 0x13, 0x11, + 0x5d, 0xf3, 0x0c, 0x87, 0x82, 0x1e, 0x41, 0x9e, + 0xd0, 0x27, 0xd7, 0x54, 0x3b, 0x67, 0x73, 0x09, + 0x91, 0xe9, 0xd5, 0x36, 0xa7, 0xb5, 0x55, 0xe4, + 0xf3, 0x21, 0x51, 0x49, 0x22, 0x07, 0x55, 0x4f, + 0x44, 0x4b, 0xd2, 0x15, 0x93, 0x17, 0x2a, 0xfa, + 0x4d, 0x4a, 0x57, 0xdb, 0x4c, 0xa6, 0xeb, 0xec, + 0x53, 0x25, 0x6c, 0x21, 0xed, 0x00, 0x4c, 0x3b, + 0xca, 0x14, 0x57, 0xa9, 0xd6, 0x6a, 0xcd, 0x8d, + 0x5e, 0x74, 0xac, 0x72, 0xc1, 0x97, 0xe5, 0x1b, + 0x45, 0x4e, 0xda, 0xfc, 0xcc, 0x40, 0xe8, 0x48, + 0x88, 0x0b, 0xa3, 0xe3, 0x8d, 0x83, 0x42, 0xc3, + 0x23, 0xfd, 0x68, 0xb5, 0x8e, 0xf1, 0x9d, 0x63, + 0x77, 0xe9, 0xa3, 0x8e, 0x8c, 0x26, 0x6b, 0xbd, + 0x72, 0x73, 0x35, 0x0c, 0x03, 0xf8, 0x43, 0x78, + 0x52, 0x71, 0x15, 0x1f, 0x71, 0x5d, 0x6e, 0xed, + 0xb9, 0xcc, 0x86, 0x30, 0xdb, 0x2b, 0xd3, 0x82, + 0x88, 0x23, 0x71, 0x90, 0x53, 0x5c, 0xa9, 0x2f, + 0x76, 0x01, 0xb7, 0x9a, 0xfe, 0x43, 0x55, 0xa3, + 0x04, 0x9b, 0x0e, 0xe4, 0x59, 0xdf, 0xc9, 0xe9, + 0xb1, 0xea, 0x29, 0x28, 0x3c, 0x5c, 0xae, 0x72, + 0x84, 0xb6, 0xc6, 0xeb, 0x0c, 0x27, 0x07, 0x74, + 0x90, 0x0d, 0x31, 0xb0, 0x00, 0x77, 0xe9, 0x40, + 0x70, 0x6f, 0x68, 0xa7, 0xfd, 0x06, 0xec, 0x4b, + 0xc0, 0xb7, 0xac, 0xbc, 0x33, 0xb7, 0x6d, 0x0a, + 0xbd, 0x12, 0x1b, 0x59, 0xcb, 0xdd, 0x32, 0xf5, + 0x1d, 0x94, 0x57, 0x76, 0x9e, 0x0c, 0x18, 0x98, + 0x71, 0xd7, 0x2a, 0xdb, 0x0b, 0x7b, 0xa7, 0x71, + 0xb7, 0x67, 0x81, 0x23, 0x96, 0xae, 0xb9, 0x7e, + 0x32, 0x43, 0x92, 0x8a, 0x19, 0xa0, 0xc4, 0xd4, + 0x3b, 0x57, 0xf9, 0x4a, 0x2c, 0xfb, 0x51, 0x46, + 0xbb, 0xcb, 0x5d, 0xb3, 0xef, 0x13, 0x93, 0x6e, + 0x68, 0x42, 0x54, 0x57, 0xd3, 0x6a, 0x3a, 0x8f, + 0x9d, 0x66, 0xbf, 0xbd, 0x36, 0x23, 0xf5, 0x93, + 0x83, 0x7b, 0x9c, 0xc0, 0xdd, 0xc5, 0x49, 0xc0, + 0x64, 0xed, 0x07, 0x12, 0xb3, 0xe6, 0xe4, 0xe5, + 0x38, 0x95, 0x23, 0xb1, 0xa0, 0x3b, 0x1a, 0x61, + 0xda, 0x17, 0xac, 0xc3, 0x58, 0xdd, 0x74, 0x64, + 0x22, 0x11, 0xe8, 0x32, 0x1d, 0x16, 0x93, 0x85, + 0x99, 0xa5, 0x9c, 0x34, 0x55, 0xb1, 0xe9, 0x20, + 0x72, 0xc9, 0x28, 0x7b, 0x79, 0x00, 0xa1, 0xa6, + 0xa3, 0x27, 0x40, 0x18, 0x8a, 0x54, 0xe0, 0xcc, + 0xe8, 0x4e, 0x8e, 0x43, 0x96, 0xe7, 0x3f, 0xc8, + 0xe9, 0xb2, 0xf9, 0xc9, 0xda, 0x04, 0x71, 0x50, + 0x47, 0xe4, 0xaa, 0xce, 0xa2, 0x30, 0xc8, 0xe4, + 0xac, 0xc7, 0x0d, 0x06, 0x2e, 0xe6, 0xe8, 0x80, + 0x36, 0x29, 0x9e, 0x01, 0xb8, 0xc3, 0xf0, 0xa0, + 0x5d, 0x7a, 0xca, 0x4d, 0xa0, 0x57, 0xbd, 0x2a, + 0x45, 0xa7, 0x7f, 0x9c, 0x93, 0x07, 0x8f, 0x35, + 0x67, 0x92, 0xe3, 0xe9, 0x7f, 0xa8, 0x61, 0x43, + 0x9e, 0x25, 0x4f, 0x33, 0x76, 0x13, 0x6e, 0x12, + 0xb9, 0xdd, 0xa4, 0x7c, 0x08, 0x9f, 0x7c, 0xe7, + 0x0a, 0x8d, 0x84, 0x06, 0xa4, 0x33, 0x17, 0x34, + 0x5e, 0x10, 0x7c, 0xc0, 0xa8, 0x3d, 0x1f, 0x42, + 0x20, 0x51, 0x65, 0x5d, 0x09, 0xc3, 0xaa, 0xc0, + 0xc8, 0x0d, 0xf0, 0x79, 0xbc, 0x20, 0x1b, 0x95, + 0xe7, 0x06, 0x7d, 0x47, 0x20, 0x03, 0x1a, 0x74, + 0xdd, 0xe2, 0xd4, 0xae, 0x38, 0x71, 0x9b, 0xf5, + 0x80, 0xec, 0x08, 0x4e, 0x56, 0xba, 0x76, 0x12, + 0x1a, 0xdf, 0x48, 0xf3, 0xae, 0xb3, 0xe6, 0xe6, + 0xbe, 0xc0, 0x91, 0x2e, 0x01, 0xb3, 0x01, 0x86, + 0xa2, 0xb9, 0x52, 0xd1, 0x21, 0xae, 0xd4, 0x97, + 0x1d, 0xef, 0x41, 0x12, 0x95, 0x3d, 0x48, 0x45, + 0x1c, 0x56, 0x32, 0x8f, 0xb8, 0x43, 0xbb, 0x19, + 0xf3, 0xca, 0xe9, 0xeb, 0x6d, 0x84, 0xbe, 0x86, + 0x06, 0xe2, 0x36, 0xb2, 0x62, 0x9d, 0xd3, 0x4c, + 0x48, 0x18, 0x54, 0x13, 0x4e, 0xcf, 0xfd, 0xba, + 0x84, 0xb9, 0x30, 0x53, 0xcf, 0xfb, 0xb9, 0x29, + 0x8f, 0xdc, 0x9f, 0xef, 0x60, 0x0b, 0x64, 0xf6, + 0x8b, 0xee, 0xa6, 0x91, 0xc2, 0x41, 0x6c, 0xf6, + 0xfa, 0x79, 0x67, 0x4b, 0xc1, 0x3f, 0xaf, 0x09, + 0x81, 0xd4, 0x5d, 0xcb, 0x09, 0xdf, 0x36, 0x31, + 0xc0, 0x14, 0x3c, 0x7c, 0x0e, 0x65, 0x95, 0x99, + 0x6d, 0xa3, 0xf4, 0xd7, 0x38, 0xee, 0x1a, 0x2b, + 0x37, 0xe2, 0xa4, 0x3b, 0x4b, 0xd0, 0x65, 0xca, + 0xf8, 0xc3, 0xe8, 0x15, 0x20, 0xef, 0xf2, 0x00, + 0xfd, 0x01, 0x09, 0xc5, 0xc8, 0x17, 0x04, 0x93, + 0xd0, 0x93, 0x03, 0x55, 0xc5, 0xfe, 0x32, 0xa3, + 0x3e, 0x28, 0x2d, 0x3b, 0x93, 0x8a, 0xcc, 0x07, + 0x72, 0x80, 0x8b, 0x74, 0x16, 0x24, 0xbb, 0xda, + 0x94, 0x39, 0x30, 0x8f, 0xb1, 0xcd, 0x4a, 0x90, + 0x92, 0x7c, 0x14, 0x8f, 0x95, 0x4e, 0xac, 0x9b, + 0xd8, 0x8f, 0x1a, 0x87, 0xa4, 0x32, 0x27, 0x8a, + 0xba, 0xf7, 0x41, 0xcf, 0x84, 0x37, 0x19, 0xe6, + 0x06, 0xf5, 0x0e, 0xcf, 0x36, 0xf5, 0x9e, 0x6c, + 0xde, 0xbc, 0xff, 0x64, 0x7e, 0x4e, 0x59, 0x57, + 0x48, 0xfe, 0x14, 0xf7, 0x9c, 0x93, 0x5d, 0x15, + 0xad, 0xcc, 0x11, 0xb1, 0x17, 0x18, 0xb2, 0x7e, + 0xcc, 0xab, 0xe9, 0xce, 0x7d, 0x77, 0x5b, 0x51, + 0x1b, 0x1e, 0x20, 0xa8, 0x32, 0x06, 0x0e, 0x75, + 0x93, 0xac, 0xdb, 0x35, 0x37, 0x1f, 0xe9, 0x19, + 0x1d, 0xb4, 0x71, 0x97, 0xd6, 0x4e, 0x2c, 0x08, + 0xa5, 0x13, 0xf9, 0x0e, 0x7e, 0x78, 0x6e, 0x14, + 0xe0, 0xa9, 0xb9, 0x96, 0x4c, 0x80, 0x82, 0xba, + 0x17, 0xb3, 0x9d, 0x69, 0xb0, 0x84, 0x46, 0xff, + 0xf9, 0x52, 0x79, 0x94, 0x58, 0x3a, 0x62, 0x90, + 0x15, 0x35, 0x71, 0x10, 0x37, 0xed, 0xa1, 0x8e, + 0x53, 0x6e, 0xf4, 0x26, 0x57, 0x93, 0x15, 0x93, + 0xf6, 0x81, 0x2c, 0x5a, 0x10, 0xda, 0x92, 0xad, + 0x2f, 0xdb, 0x28, 0x31, 0x2d, 0x55, 0x04, 0xd2, + 0x06, 0x28, 0x8c, 0x1e, 0xdc, 0xea, 0x54, 0xac, + 0xff, 0xb7, 0x6c, 0x30, 0x15, 0xd4, 0xb4, 0x0d, + 0x00, 0x93, 0x57, 0xdd, 0xd2, 0x07, 0x07, 0x06, + 0xd9, 0x43, 0x9b, 0xcd, 0x3a, 0xf4, 0x7d, 0x4c, + 0x36, 0x5d, 0x23, 0xa2, 0xcc, 0x57, 0x40, 0x91, + 0xe9, 0x2c, 0x2f, 0x2c, 0xd5, 0x30, 0x9b, 0x17, + 0xb0, 0xc9, 0xf7, 0xa7, 0x2f, 0xd1, 0x93, 0x20, + 0x6b, 0xc6, 0xc1, 0xe4, 0x6f, 0xcb, 0xd1, 0xe7, + 0x09, 0x0f, 0x9e, 0xdc, 0xaa, 0x9f, 0x2f, 0xdf, + 0x56, 0x9f, 0xd4, 0x33, 0x04, 0xaf, 0xd3, 0x6c, + 0x58, 0x61, 0xf0, 0x30, 0xec, 0xf2, 0x7f, 0xf2, + 0x9c, 0xdf, 0x39, 0xbb, 0x6f, 0xa2, 0x8c, 0x7e, + 0xc4, 0x22, 0x51, 0x71, 0xc0, 0x4d, 0x14, 0x1a, + 0xc4, 0xcd, 0x04, 0xd9, 0x87, 0x08, 0x50, 0x05, + 0xcc, 0xaf, 0xf6, 0xf0, 0x8f, 0x92, 0x54, 0x58, + 0xc2, 0xc7, 0x09, 0x7a, 0x59, 0x02, 0x05, 0xe8, + 0xb0, 0x86, 0xd9, 0xbf, 0x7b, 0x35, 0x51, 0x4d, + 0xaf, 0x08, 0x97, 0x2c, 0x65, 0xda, 0x2a, 0x71, + 0x3a, 0xa8, 0x51, 0xcc, 0xf2, 0x73, 0x27, 0xc3, + 0xfd, 0x62, 0xcf, 0xe3, 0xb2, 0xca, 0xcb, 0xbe, + 0x1a, 0x0a, 0xa1, 0x34, 0x7b, 0x77, 0xc4, 0x62, + 0x68, 0x78, 0x5f, 0x94, 0x07, 0x04, 0x65, 0x16, + 0x4b, 0x61, 0xcb, 0xff, 0x75, 0x26, 0x50, 0x66, + 0x1f, 0x6e, 0x93, 0xf8, 0xc5, 0x51, 0xeb, 0xa4, + 0x4a, 0x48, 0x68, 0x6b, 0xe2, 0x5e, 0x44, 0xb2, + 0x50, 0x2c, 0x6c, 0xae, 0x79, 0x4e, 0x66, 0x35, + 0x81, 0x50, 0xac, 0xbc, 0x3f, 0xb1, 0x0c, 0xf3, + 0x05, 0x3c, 0x4a, 0xa3, 0x6c, 0x2a, 0x79, 0xb4, + 0xb7, 0xab, 0xca, 0xc7, 0x9b, 0x8e, 0xcd, 0x5f, + 0x11, 0x03, 0xcb, 0x30, 0xa3, 0xab, 0xda, 0xfe, + 0x64, 0xb9, 0xbb, 0xd8, 0x5e, 0x3a, 0x1a, 0x56, + 0xe5, 0x05, 0x48, 0x90, 0x1e, 0x61, 0x69, 0x1b, + 0x22, 0xe6, 0x1a, 0x3c, 0x75, 0xad, 0x1f, 0x37, + 0x28, 0xdc, 0xe4, 0x6d, 0xbd, 0x42, 0xdc, 0xd3, + 0xc8, 0xb6, 0x1c, 0x48, 0xfe, 0x94, 0x77, 0x7f, + 0xbd, 0x62, 0xac, 0xa3, 0x47, 0x27, 0xcf, 0x5f, + 0xd9, 0xdb, 0xaf, 0xec, 0xf7, 0x5e, 0xc1, 0xb0, + 0x9d, 0x01, 0x26, 0x99, 0x7e, 0x8f, 0x03, 0x70, + 0xb5, 0x42, 0xbe, 0x67, 0x28, 0x1b, 0x7c, 0xbd, + 0x61, 0x21, 0x97, 0xcc, 0x5c, 0xe1, 0x97, 0x8f, + 0x8d, 0xde, 0x2b, 0xaa, 0xa7, 0x71, 0x1d, 0x1e, + 0x02, 0x73, 0x70, 0x58, 0x32, 0x5b, 0x1d, 0x67, + 0x3d, 0xe0, 0x74, 0x4f, 0x03, 0xf2, 0x70, 0x51, + 0x79, 0xf1, 0x61, 0x70, 0x15, 0x74, 0x9d, 0x23, + 0x89, 0xde, 0xac, 0xfd, 0xde, 0xd0, 0x1f, 0xc3, + 0x87, 0x44, 0x35, 0x4b, 0xe5, 0xb0, 0x60, 0xc5, + 0x22, 0xe4, 0x9e, 0xca, 0xeb, 0xd5, 0x3a, 0x09, + 0x45, 0xa4, 0xdb, 0xfa, 0x3f, 0xeb, 0x1b, 0xc7, + 0xc8, 0x14, 0x99, 0x51, 0x92, 0x10, 0xed, 0xed, + 0x28, 0xe0, 0xa1, 0xf8, 0x26, 0xcf, 0xcd, 0xcb, + 0x63, 0xa1, 0x3b, 0xe3, 0xdf, 0x7e, 0xfe, 0xa6, + 0xf0, 0x81, 0x9a, 0xbf, 0x55, 0xde, 0x54, 0xd5, + 0x56, 0x60, 0x98, 0x10, 0x68, 0xf4, 0x38, 0x96, + 0x8e, 0x6f, 0x1d, 0x44, 0x7f, 0xd6, 0x2f, 0xfe, + 0x55, 0xfb, 0x0c, 0x7e, 0x67, 0xe2, 0x61, 0x44, + 0xed, 0xf2, 0x35, 0x30, 0x5d, 0xe9, 0xc7, 0xd6, + 0x6d, 0xe0, 0xa0, 0xed, 0xf3, 0xfc, 0xd8, 0x3e, + 0x0a, 0x7b, 0xcd, 0xaf, 0x65, 0x68, 0x18, 0xc0, + 0xec, 0x04, 0x1c, 0x74, 0x6d, 0xe2, 0x6e, 0x79, + 0xd4, 0x11, 0x2b, 0x62, 0xd5, 0x27, 0xad, 0x4f, + 0x01, 0x59, 0x73, 0xcc, 0x6a, 0x53, 0xfb, 0x2d, + 0xd5, 0x4e, 0x99, 0x21, 0x65, 0x4d, 0xf5, 0x82, + 0xf7, 0xd8, 0x42, 0xce, 0x6f, 0x3d, 0x36, 0x47, + 0xf1, 0x05, 0x16, 0xe8, 0x1b, 0x6a, 0x8f, 0x93, + 0xf2, 0x8f, 0x37, 0x40, 0x12, 0x28, 0xa3, 0xe6, + 0xb9, 0x17, 0x4a, 0x1f, 0xb1, 0xd1, 0x66, 0x69, + 0x86, 0xc4, 0xfc, 0x97, 0xae, 0x3f, 0x8f, 0x1e, + 0x2b, 0xdf, 0xcd, 0xf9, 0x3c +}; +static const u8 enc_assoc011[] __initconst = { + 0xd6, 0x31, 0xda, 0x5d, 0x42, 0x5e, 0xd7 +}; +static const u8 enc_nonce011[] __initconst = { + 0xfd, 0x87, 0xd4, 0xd8, 0x62, 0xfd, 0xec, 0xaa +}; +static const u8 enc_key011[] __initconst = { + 0x35, 0x4e, 0xb5, 0x70, 0x50, 0x42, 0x8a, 0x85, + 0xf2, 0xfb, 0xed, 0x7b, 0xd0, 0x9e, 0x97, 0xca, + 0xfa, 0x98, 0x66, 0x63, 0xee, 0x37, 0xcc, 0x52, + 0xfe, 0xd1, 0xdf, 0x95, 0x15, 0x34, 0x29, 0x38 +}; + +static const u8 enc_input012[] __initconst = { + 0x74, 0xa6, 0x3e, 0xe4, 0xb1, 0xcb, 0xaf, 0xb0, + 0x40, 0xe5, 0x0f, 0x9e, 0xf1, 0xf2, 0x89, 0xb5, + 0x42, 0x34, 0x8a, 0xa1, 0x03, 0xb7, 0xe9, 0x57, + 0x46, 0xbe, 0x20, 0xe4, 0x6e, 0xb0, 0xeb, 0xff, + 0xea, 0x07, 0x7e, 0xef, 0xe2, 0x55, 0x9f, 0xe5, + 0x78, 0x3a, 0xb7, 0x83, 0xc2, 0x18, 0x40, 0x7b, + 0xeb, 0xcd, 0x81, 0xfb, 0x90, 0x12, 0x9e, 0x46, + 0xa9, 0xd6, 0x4a, 0xba, 0xb0, 0x62, 0xdb, 0x6b, + 0x99, 0xc4, 0xdb, 0x54, 0x4b, 0xb8, 0xa5, 0x71, + 0xcb, 0xcd, 0x63, 0x32, 0x55, 0xfb, 0x31, 0xf0, + 0x38, 0xf5, 0xbe, 0x78, 0xe4, 0x45, 0xce, 0x1b, + 0x6a, 0x5b, 0x0e, 0xf4, 0x16, 0xe4, 0xb1, 0x3d, + 0xf6, 0x63, 0x7b, 0xa7, 0x0c, 0xde, 0x6f, 0x8f, + 0x74, 0xdf, 0xe0, 0x1e, 0x9d, 0xce, 0x8f, 0x24, + 0xef, 0x23, 0x35, 0x33, 0x7b, 0x83, 0x34, 0x23, + 0x58, 0x74, 0x14, 0x77, 0x1f, 0xc2, 0x4f, 0x4e, + 0xc6, 0x89, 0xf9, 0x52, 0x09, 0x37, 0x64, 0x14, + 0xc4, 0x01, 0x6b, 0x9d, 0x77, 0xe8, 0x90, 0x5d, + 0xa8, 0x4a, 0x2a, 0xef, 0x5c, 0x7f, 0xeb, 0xbb, + 0xb2, 0xc6, 0x93, 0x99, 0x66, 0xdc, 0x7f, 0xd4, + 0x9e, 0x2a, 0xca, 0x8d, 0xdb, 0xe7, 0x20, 0xcf, + 0xe4, 0x73, 0xae, 0x49, 0x7d, 0x64, 0x0f, 0x0e, + 0x28, 0x46, 0xa9, 0xa8, 0x32, 0xe4, 0x0e, 0xf6, + 0x51, 0x53, 0xb8, 0x3c, 0xb1, 0xff, 0xa3, 0x33, + 0x41, 0x75, 0xff, 0xf1, 0x6f, 0xf1, 0xfb, 0xbb, + 0x83, 0x7f, 0x06, 0x9b, 0xe7, 0x1b, 0x0a, 0xe0, + 0x5c, 0x33, 0x60, 0x5b, 0xdb, 0x5b, 0xed, 0xfe, + 0xa5, 0x16, 0x19, 0x72, 0xa3, 0x64, 0x23, 0x00, + 0x02, 0xc7, 0xf3, 0x6a, 0x81, 0x3e, 0x44, 0x1d, + 0x79, 0x15, 0x5f, 0x9a, 0xde, 0xe2, 0xfd, 0x1b, + 0x73, 0xc1, 0xbc, 0x23, 0xba, 0x31, 0xd2, 0x50, + 0xd5, 0xad, 0x7f, 0x74, 0xa7, 0xc9, 0xf8, 0x3e, + 0x2b, 0x26, 0x10, 0xf6, 0x03, 0x36, 0x74, 0xe4, + 0x0e, 0x6a, 0x72, 0xb7, 0x73, 0x0a, 0x42, 0x28, + 0xc2, 0xad, 0x5e, 0x03, 0xbe, 0xb8, 0x0b, 0xa8, + 0x5b, 0xd4, 0xb8, 0xba, 0x52, 0x89, 0xb1, 0x9b, + 0xc1, 0xc3, 0x65, 0x87, 0xed, 0xa5, 0xf4, 0x86, + 0xfd, 0x41, 0x80, 0x91, 0x27, 0x59, 0x53, 0x67, + 0x15, 0x78, 0x54, 0x8b, 0x2d, 0x3d, 0xc7, 0xff, + 0x02, 0x92, 0x07, 0x5f, 0x7a, 0x4b, 0x60, 0x59, + 0x3c, 0x6f, 0x5c, 0xd8, 0xec, 0x95, 0xd2, 0xfe, + 0xa0, 0x3b, 0xd8, 0x3f, 0xd1, 0x69, 0xa6, 0xd6, + 0x41, 0xb2, 0xf4, 0x4d, 0x12, 0xf4, 0x58, 0x3e, + 0x66, 0x64, 0x80, 0x31, 0x9b, 0xa8, 0x4c, 0x8b, + 0x07, 0xb2, 0xec, 0x66, 0x94, 0x66, 0x47, 0x50, + 0x50, 0x5f, 0x18, 0x0b, 0x0e, 0xd6, 0xc0, 0x39, + 0x21, 0x13, 0x9e, 0x33, 0xbc, 0x79, 0x36, 0x02, + 0x96, 0x70, 0xf0, 0x48, 0x67, 0x2f, 0x26, 0xe9, + 0x6d, 0x10, 0xbb, 0xd6, 0x3f, 0xd1, 0x64, 0x7a, + 0x2e, 0xbe, 0x0c, 0x61, 0xf0, 0x75, 0x42, 0x38, + 0x23, 0xb1, 0x9e, 0x9f, 0x7c, 0x67, 0x66, 0xd9, + 0x58, 0x9a, 0xf1, 0xbb, 0x41, 0x2a, 0x8d, 0x65, + 0x84, 0x94, 0xfc, 0xdc, 0x6a, 0x50, 0x64, 0xdb, + 0x56, 0x33, 0x76, 0x00, 0x10, 0xed, 0xbe, 0xd2, + 0x12, 0xf6, 0xf6, 0x1b, 0xa2, 0x16, 0xde, 0xae, + 0x31, 0x95, 0xdd, 0xb1, 0x08, 0x7e, 0x4e, 0xee, + 0xe7, 0xf9, 0xa5, 0xfb, 0x5b, 0x61, 0x43, 0x00, + 0x40, 0xf6, 0x7e, 0x02, 0x04, 0x32, 0x4e, 0x0c, + 0xe2, 0x66, 0x0d, 0xd7, 0x07, 0x98, 0x0e, 0xf8, + 0x72, 0x34, 0x6d, 0x95, 0x86, 0xd7, 0xcb, 0x31, + 0x54, 0x47, 0xd0, 0x38, 0x29, 0x9c, 0x5a, 0x68, + 0xd4, 0x87, 0x76, 0xc9, 0xe7, 0x7e, 0xe3, 0xf4, + 0x81, 0x6d, 0x18, 0xcb, 0xc9, 0x05, 0xaf, 0xa0, + 0xfb, 0x66, 0xf7, 0xf1, 0x1c, 0xc6, 0x14, 0x11, + 0x4f, 0x2b, 0x79, 0x42, 0x8b, 0xbc, 0xac, 0xe7, + 0x6c, 0xfe, 0x0f, 0x58, 0xe7, 0x7c, 0x78, 0x39, + 0x30, 0xb0, 0x66, 0x2c, 0x9b, 0x6d, 0x3a, 0xe1, + 0xcf, 0xc9, 0xa4, 0x0e, 0x6d, 0x6d, 0x8a, 0xa1, + 0x3a, 0xe7, 0x28, 0xd4, 0x78, 0x4c, 0xa6, 0xa2, + 0x2a, 0xa6, 0x03, 0x30, 0xd7, 0xa8, 0x25, 0x66, + 0x87, 0x2f, 0x69, 0x5c, 0x4e, 0xdd, 0xa5, 0x49, + 0x5d, 0x37, 0x4a, 0x59, 0xc4, 0xaf, 0x1f, 0xa2, + 0xe4, 0xf8, 0xa6, 0x12, 0x97, 0xd5, 0x79, 0xf5, + 0xe2, 0x4a, 0x2b, 0x5f, 0x61, 0xe4, 0x9e, 0xe3, + 0xee, 0xb8, 0xa7, 0x5b, 0x2f, 0xf4, 0x9e, 0x6c, + 0xfb, 0xd1, 0xc6, 0x56, 0x77, 0xba, 0x75, 0xaa, + 0x3d, 0x1a, 0xa8, 0x0b, 0xb3, 0x68, 0x24, 0x00, + 0x10, 0x7f, 0xfd, 0xd7, 0xa1, 0x8d, 0x83, 0x54, + 0x4f, 0x1f, 0xd8, 0x2a, 0xbe, 0x8a, 0x0c, 0x87, + 0xab, 0xa2, 0xde, 0xc3, 0x39, 0xbf, 0x09, 0x03, + 0xa5, 0xf3, 0x05, 0x28, 0xe1, 0xe1, 0xee, 0x39, + 0x70, 0x9c, 0xd8, 0x81, 0x12, 0x1e, 0x02, 0x40, + 0xd2, 0x6e, 0xf0, 0xeb, 0x1b, 0x3d, 0x22, 0xc6, + 0xe5, 0xe3, 0xb4, 0x5a, 0x98, 0xbb, 0xf0, 0x22, + 0x28, 0x8d, 0xe5, 0xd3, 0x16, 0x48, 0x24, 0xa5, + 0xe6, 0x66, 0x0c, 0xf9, 0x08, 0xf9, 0x7e, 0x1e, + 0xe1, 0x28, 0x26, 0x22, 0xc7, 0xc7, 0x0a, 0x32, + 0x47, 0xfa, 0xa3, 0xbe, 0x3c, 0xc4, 0xc5, 0x53, + 0x0a, 0xd5, 0x94, 0x4a, 0xd7, 0x93, 0xd8, 0x42, + 0x99, 0xb9, 0x0a, 0xdb, 0x56, 0xf7, 0xb9, 0x1c, + 0x53, 0x4f, 0xfa, 0xd3, 0x74, 0xad, 0xd9, 0x68, + 0xf1, 0x1b, 0xdf, 0x61, 0xc6, 0x5e, 0xa8, 0x48, + 0xfc, 0xd4, 0x4a, 0x4c, 0x3c, 0x32, 0xf7, 0x1c, + 0x96, 0x21, 0x9b, 0xf9, 0xa3, 0xcc, 0x5a, 0xce, + 0xd5, 0xd7, 0x08, 0x24, 0xf6, 0x1c, 0xfd, 0xdd, + 0x38, 0xc2, 0x32, 0xe9, 0xb8, 0xe7, 0xb6, 0xfa, + 0x9d, 0x45, 0x13, 0x2c, 0x83, 0xfd, 0x4a, 0x69, + 0x82, 0xcd, 0xdc, 0xb3, 0x76, 0x0c, 0x9e, 0xd8, + 0xf4, 0x1b, 0x45, 0x15, 0xb4, 0x97, 0xe7, 0x58, + 0x34, 0xe2, 0x03, 0x29, 0x5a, 0xbf, 0xb6, 0xe0, + 0x5d, 0x13, 0xd9, 0x2b, 0xb4, 0x80, 0xb2, 0x45, + 0x81, 0x6a, 0x2e, 0x6c, 0x89, 0x7d, 0xee, 0xbb, + 0x52, 0xdd, 0x1f, 0x18, 0xe7, 0x13, 0x6b, 0x33, + 0x0e, 0xea, 0x36, 0x92, 0x77, 0x7b, 0x6d, 0x9c, + 0x5a, 0x5f, 0x45, 0x7b, 0x7b, 0x35, 0x62, 0x23, + 0xd1, 0xbf, 0x0f, 0xd0, 0x08, 0x1b, 0x2b, 0x80, + 0x6b, 0x7e, 0xf1, 0x21, 0x47, 0xb0, 0x57, 0xd1, + 0x98, 0x72, 0x90, 0x34, 0x1c, 0x20, 0x04, 0xff, + 0x3d, 0x5c, 0xee, 0x0e, 0x57, 0x5f, 0x6f, 0x24, + 0x4e, 0x3c, 0xea, 0xfc, 0xa5, 0xa9, 0x83, 0xc9, + 0x61, 0xb4, 0x51, 0x24, 0xf8, 0x27, 0x5e, 0x46, + 0x8c, 0xb1, 0x53, 0x02, 0x96, 0x35, 0xba, 0xb8, + 0x4c, 0x71, 0xd3, 0x15, 0x59, 0x35, 0x22, 0x20, + 0xad, 0x03, 0x9f, 0x66, 0x44, 0x3b, 0x9c, 0x35, + 0x37, 0x1f, 0x9b, 0xbb, 0xf3, 0xdb, 0x35, 0x63, + 0x30, 0x64, 0xaa, 0xa2, 0x06, 0xa8, 0x5d, 0xbb, + 0xe1, 0x9f, 0x70, 0xec, 0x82, 0x11, 0x06, 0x36, + 0xec, 0x8b, 0x69, 0x66, 0x24, 0x44, 0xc9, 0x4a, + 0x57, 0xbb, 0x9b, 0x78, 0x13, 0xce, 0x9c, 0x0c, + 0xba, 0x92, 0x93, 0x63, 0xb8, 0xe2, 0x95, 0x0f, + 0x0f, 0x16, 0x39, 0x52, 0xfd, 0x3a, 0x6d, 0x02, + 0x4b, 0xdf, 0x13, 0xd3, 0x2a, 0x22, 0xb4, 0x03, + 0x7c, 0x54, 0x49, 0x96, 0x68, 0x54, 0x10, 0xfa, + 0xef, 0xaa, 0x6c, 0xe8, 0x22, 0xdc, 0x71, 0x16, + 0x13, 0x1a, 0xf6, 0x28, 0xe5, 0x6d, 0x77, 0x3d, + 0xcd, 0x30, 0x63, 0xb1, 0x70, 0x52, 0xa1, 0xc5, + 0x94, 0x5f, 0xcf, 0xe8, 0xb8, 0x26, 0x98, 0xf7, + 0x06, 0xa0, 0x0a, 0x70, 0xfa, 0x03, 0x80, 0xac, + 0xc1, 0xec, 0xd6, 0x4c, 0x54, 0xd7, 0xfe, 0x47, + 0xb6, 0x88, 0x4a, 0xf7, 0x71, 0x24, 0xee, 0xf3, + 0xd2, 0xc2, 0x4a, 0x7f, 0xfe, 0x61, 0xc7, 0x35, + 0xc9, 0x37, 0x67, 0xcb, 0x24, 0x35, 0xda, 0x7e, + 0xca, 0x5f, 0xf3, 0x8d, 0xd4, 0x13, 0x8e, 0xd6, + 0xcb, 0x4d, 0x53, 0x8f, 0x53, 0x1f, 0xc0, 0x74, + 0xf7, 0x53, 0xb9, 0x5e, 0x23, 0x37, 0xba, 0x6e, + 0xe3, 0x9d, 0x07, 0x55, 0x25, 0x7b, 0xe6, 0x2a, + 0x64, 0xd1, 0x32, 0xdd, 0x54, 0x1b, 0x4b, 0xc0, + 0xe1, 0xd7, 0x69, 0x58, 0xf8, 0x93, 0x29, 0xc4, + 0xdd, 0x23, 0x2f, 0xa5, 0xfc, 0x9d, 0x7e, 0xf8, + 0xd4, 0x90, 0xcd, 0x82, 0x55, 0xdc, 0x16, 0x16, + 0x9f, 0x07, 0x52, 0x9b, 0x9d, 0x25, 0xed, 0x32, + 0xc5, 0x7b, 0xdf, 0xf6, 0x83, 0x46, 0x3d, 0x65, + 0xb7, 0xef, 0x87, 0x7a, 0x12, 0x69, 0x8f, 0x06, + 0x7c, 0x51, 0x15, 0x4a, 0x08, 0xe8, 0xac, 0x9a, + 0x0c, 0x24, 0xa7, 0x27, 0xd8, 0x46, 0x2f, 0xe7, + 0x01, 0x0e, 0x1c, 0xc6, 0x91, 0xb0, 0x6e, 0x85, + 0x65, 0xf0, 0x29, 0x0d, 0x2e, 0x6b, 0x3b, 0xfb, + 0x4b, 0xdf, 0xe4, 0x80, 0x93, 0x03, 0x66, 0x46, + 0x3e, 0x8a, 0x6e, 0xf3, 0x5e, 0x4d, 0x62, 0x0e, + 0x49, 0x05, 0xaf, 0xd4, 0xf8, 0x21, 0x20, 0x61, + 0x1d, 0x39, 0x17, 0xf4, 0x61, 0x47, 0x95, 0xfb, + 0x15, 0x2e, 0xb3, 0x4f, 0xd0, 0x5d, 0xf5, 0x7d, + 0x40, 0xda, 0x90, 0x3c, 0x6b, 0xcb, 0x17, 0x00, + 0x13, 0x3b, 0x64, 0x34, 0x1b, 0xf0, 0xf2, 0xe5, + 0x3b, 0xb2, 0xc7, 0xd3, 0x5f, 0x3a, 0x44, 0xa6, + 0x9b, 0xb7, 0x78, 0x0e, 0x42, 0x5d, 0x4c, 0xc1, + 0xe9, 0xd2, 0xcb, 0xb7, 0x78, 0xd1, 0xfe, 0x9a, + 0xb5, 0x07, 0xe9, 0xe0, 0xbe, 0xe2, 0x8a, 0xa7, + 0x01, 0x83, 0x00, 0x8c, 0x5c, 0x08, 0xe6, 0x63, + 0x12, 0x92, 0xb7, 0xb7, 0xa6, 0x19, 0x7d, 0x38, + 0x13, 0x38, 0x92, 0x87, 0x24, 0xf9, 0x48, 0xb3, + 0x5e, 0x87, 0x6a, 0x40, 0x39, 0x5c, 0x3f, 0xed, + 0x8f, 0xee, 0xdb, 0x15, 0x82, 0x06, 0xda, 0x49, + 0x21, 0x2b, 0xb5, 0xbf, 0x32, 0x7c, 0x9f, 0x42, + 0x28, 0x63, 0xcf, 0xaf, 0x1e, 0xf8, 0xc6, 0xa0, + 0xd1, 0x02, 0x43, 0x57, 0x62, 0xec, 0x9b, 0x0f, + 0x01, 0x9e, 0x71, 0xd8, 0x87, 0x9d, 0x01, 0xc1, + 0x58, 0x77, 0xd9, 0xaf, 0xb1, 0x10, 0x7e, 0xdd, + 0xa6, 0x50, 0x96, 0xe5, 0xf0, 0x72, 0x00, 0x6d, + 0x4b, 0xf8, 0x2a, 0x8f, 0x19, 0xf3, 0x22, 0x88, + 0x11, 0x4a, 0x8b, 0x7c, 0xfd, 0xb7, 0xed, 0xe1, + 0xf6, 0x40, 0x39, 0xe0, 0xe9, 0xf6, 0x3d, 0x25, + 0xe6, 0x74, 0x3c, 0x58, 0x57, 0x7f, 0xe1, 0x22, + 0x96, 0x47, 0x31, 0x91, 0xba, 0x70, 0x85, 0x28, + 0x6b, 0x9f, 0x6e, 0x25, 0xac, 0x23, 0x66, 0x2f, + 0x29, 0x88, 0x28, 0xce, 0x8c, 0x5c, 0x88, 0x53, + 0xd1, 0x3b, 0xcc, 0x6a, 0x51, 0xb2, 0xe1, 0x28, + 0x3f, 0x91, 0xb4, 0x0d, 0x00, 0x3a, 0xe3, 0xf8, + 0xc3, 0x8f, 0xd7, 0x96, 0x62, 0x0e, 0x2e, 0xfc, + 0xc8, 0x6c, 0x77, 0xa6, 0x1d, 0x22, 0xc1, 0xb8, + 0xe6, 0x61, 0xd7, 0x67, 0x36, 0x13, 0x7b, 0xbb, + 0x9b, 0x59, 0x09, 0xa6, 0xdf, 0xf7, 0x6b, 0xa3, + 0x40, 0x1a, 0xf5, 0x4f, 0xb4, 0xda, 0xd3, 0xf3, + 0x81, 0x93, 0xc6, 0x18, 0xd9, 0x26, 0xee, 0xac, + 0xf0, 0xaa, 0xdf, 0xc5, 0x9c, 0xca, 0xc2, 0xa2, + 0xcc, 0x7b, 0x5c, 0x24, 0xb0, 0xbc, 0xd0, 0x6a, + 0x4d, 0x89, 0x09, 0xb8, 0x07, 0xfe, 0x87, 0xad, + 0x0a, 0xea, 0xb8, 0x42, 0xf9, 0x5e, 0xb3, 0x3e, + 0x36, 0x4c, 0xaf, 0x75, 0x9e, 0x1c, 0xeb, 0xbd, + 0xbc, 0xbb, 0x80, 0x40, 0xa7, 0x3a, 0x30, 0xbf, + 0xa8, 0x44, 0xf4, 0xeb, 0x38, 0xad, 0x29, 0xba, + 0x23, 0xed, 0x41, 0x0c, 0xea, 0xd2, 0xbb, 0x41, + 0x18, 0xd6, 0xb9, 0xba, 0x65, 0x2b, 0xa3, 0x91, + 0x6d, 0x1f, 0xa9, 0xf4, 0xd1, 0x25, 0x8d, 0x4d, + 0x38, 0xff, 0x64, 0xa0, 0xec, 0xde, 0xa6, 0xb6, + 0x79, 0xab, 0x8e, 0x33, 0x6c, 0x47, 0xde, 0xaf, + 0x94, 0xa4, 0xa5, 0x86, 0x77, 0x55, 0x09, 0x92, + 0x81, 0x31, 0x76, 0xc7, 0x34, 0x22, 0x89, 0x8e, + 0x3d, 0x26, 0x26, 0xd7, 0xfc, 0x1e, 0x16, 0x72, + 0x13, 0x33, 0x63, 0xd5, 0x22, 0xbe, 0xb8, 0x04, + 0x34, 0x84, 0x41, 0xbb, 0x80, 0xd0, 0x9f, 0x46, + 0x48, 0x07, 0xa7, 0xfc, 0x2b, 0x3a, 0x75, 0x55, + 0x8c, 0xc7, 0x6a, 0xbd, 0x7e, 0x46, 0x08, 0x84, + 0x0f, 0xd5, 0x74, 0xc0, 0x82, 0x8e, 0xaa, 0x61, + 0x05, 0x01, 0xb2, 0x47, 0x6e, 0x20, 0x6a, 0x2d, + 0x58, 0x70, 0x48, 0x32, 0xa7, 0x37, 0xd2, 0xb8, + 0x82, 0x1a, 0x51, 0xb9, 0x61, 0xdd, 0xfd, 0x9d, + 0x6b, 0x0e, 0x18, 0x97, 0xf8, 0x45, 0x5f, 0x87, + 0x10, 0xcf, 0x34, 0x72, 0x45, 0x26, 0x49, 0x70, + 0xe7, 0xa3, 0x78, 0xe0, 0x52, 0x89, 0x84, 0x94, + 0x83, 0x82, 0xc2, 0x69, 0x8f, 0xe3, 0xe1, 0x3f, + 0x60, 0x74, 0x88, 0xc4, 0xf7, 0x75, 0x2c, 0xfb, + 0xbd, 0xb6, 0xc4, 0x7e, 0x10, 0x0a, 0x6c, 0x90, + 0x04, 0x9e, 0xc3, 0x3f, 0x59, 0x7c, 0xce, 0x31, + 0x18, 0x60, 0x57, 0x73, 0x46, 0x94, 0x7d, 0x06, + 0xa0, 0x6d, 0x44, 0xec, 0xa2, 0x0a, 0x9e, 0x05, + 0x15, 0xef, 0xca, 0x5c, 0xbf, 0x00, 0xeb, 0xf7, + 0x3d, 0x32, 0xd4, 0xa5, 0xef, 0x49, 0x89, 0x5e, + 0x46, 0xb0, 0xa6, 0x63, 0x5b, 0x8a, 0x73, 0xae, + 0x6f, 0xd5, 0x9d, 0xf8, 0x4f, 0x40, 0xb5, 0xb2, + 0x6e, 0xd3, 0xb6, 0x01, 0xa9, 0x26, 0xa2, 0x21, + 0xcf, 0x33, 0x7a, 0x3a, 0xa4, 0x23, 0x13, 0xb0, + 0x69, 0x6a, 0xee, 0xce, 0xd8, 0x9d, 0x01, 0x1d, + 0x50, 0xc1, 0x30, 0x6c, 0xb1, 0xcd, 0xa0, 0xf0, + 0xf0, 0xa2, 0x64, 0x6f, 0xbb, 0xbf, 0x5e, 0xe6, + 0xab, 0x87, 0xb4, 0x0f, 0x4f, 0x15, 0xaf, 0xb5, + 0x25, 0xa1, 0xb2, 0xd0, 0x80, 0x2c, 0xfb, 0xf9, + 0xfe, 0xd2, 0x33, 0xbb, 0x76, 0xfe, 0x7c, 0xa8, + 0x66, 0xf7, 0xe7, 0x85, 0x9f, 0x1f, 0x85, 0x57, + 0x88, 0xe1, 0xe9, 0x63, 0xe4, 0xd8, 0x1c, 0xa1, + 0xfb, 0xda, 0x44, 0x05, 0x2e, 0x1d, 0x3a, 0x1c, + 0xff, 0xc8, 0x3b, 0xc0, 0xfe, 0xda, 0x22, 0x0b, + 0x43, 0xd6, 0x88, 0x39, 0x4c, 0x4a, 0xa6, 0x69, + 0x18, 0x93, 0x42, 0x4e, 0xb5, 0xcc, 0x66, 0x0d, + 0x09, 0xf8, 0x1e, 0x7c, 0xd3, 0x3c, 0x99, 0x0d, + 0x50, 0x1d, 0x62, 0xe9, 0x57, 0x06, 0xbf, 0x19, + 0x88, 0xdd, 0xad, 0x7b, 0x4f, 0xf9, 0xc7, 0x82, + 0x6d, 0x8d, 0xc8, 0xc4, 0xc5, 0x78, 0x17, 0x20, + 0x15, 0xc5, 0x52, 0x41, 0xcf, 0x5b, 0xd6, 0x7f, + 0x94, 0x02, 0x41, 0xe0, 0x40, 0x22, 0x03, 0x5e, + 0xd1, 0x53, 0xd4, 0x86, 0xd3, 0x2c, 0x9f, 0x0f, + 0x96, 0xe3, 0x6b, 0x9a, 0x76, 0x32, 0x06, 0x47, + 0x4b, 0x11, 0xb3, 0xdd, 0x03, 0x65, 0xbd, 0x9b, + 0x01, 0xda, 0x9c, 0xb9, 0x7e, 0x3f, 0x6a, 0xc4, + 0x7b, 0xea, 0xd4, 0x3c, 0xb9, 0xfb, 0x5c, 0x6b, + 0x64, 0x33, 0x52, 0xba, 0x64, 0x78, 0x8f, 0xa4, + 0xaf, 0x7a, 0x61, 0x8d, 0xbc, 0xc5, 0x73, 0xe9, + 0x6b, 0x58, 0x97, 0x4b, 0xbf, 0x63, 0x22, 0xd3, + 0x37, 0x02, 0x54, 0xc5, 0xb9, 0x16, 0x4a, 0xf0, + 0x19, 0xd8, 0x94, 0x57, 0xb8, 0x8a, 0xb3, 0x16, + 0x3b, 0xd0, 0x84, 0x8e, 0x67, 0xa6, 0xa3, 0x7d, + 0x78, 0xec, 0x00 +}; +static const u8 enc_output012[] __initconst = { + 0x52, 0x34, 0xb3, 0x65, 0x3b, 0xb7, 0xe5, 0xd3, + 0xab, 0x49, 0x17, 0x60, 0xd2, 0x52, 0x56, 0xdf, + 0xdf, 0x34, 0x56, 0x82, 0xe2, 0xbe, 0xe5, 0xe1, + 0x28, 0xd1, 0x4e, 0x5f, 0x4f, 0x01, 0x7d, 0x3f, + 0x99, 0x6b, 0x30, 0x6e, 0x1a, 0x7c, 0x4c, 0x8e, + 0x62, 0x81, 0xae, 0x86, 0x3f, 0x6b, 0xd0, 0xb5, + 0xa9, 0xcf, 0x50, 0xf1, 0x02, 0x12, 0xa0, 0x0b, + 0x24, 0xe9, 0xe6, 0x72, 0x89, 0x2c, 0x52, 0x1b, + 0x34, 0x38, 0xf8, 0x75, 0x5f, 0xa0, 0x74, 0xe2, + 0x99, 0xdd, 0xa6, 0x4b, 0x14, 0x50, 0x4e, 0xf1, + 0xbe, 0xd6, 0x9e, 0xdb, 0xb2, 0x24, 0x27, 0x74, + 0x12, 0x4a, 0x78, 0x78, 0x17, 0xa5, 0x58, 0x8e, + 0x2f, 0xf9, 0xf4, 0x8d, 0xee, 0x03, 0x88, 0xae, + 0xb8, 0x29, 0xa1, 0x2f, 0x4b, 0xee, 0x92, 0xbd, + 0x87, 0xb3, 0xce, 0x34, 0x21, 0x57, 0x46, 0x04, + 0x49, 0x0c, 0x80, 0xf2, 0x01, 0x13, 0xa1, 0x55, + 0xb3, 0xff, 0x44, 0x30, 0x3c, 0x1c, 0xd0, 0xef, + 0xbc, 0x18, 0x74, 0x26, 0xad, 0x41, 0x5b, 0x5b, + 0x3e, 0x9a, 0x7a, 0x46, 0x4f, 0x16, 0xd6, 0x74, + 0x5a, 0xb7, 0x3a, 0x28, 0x31, 0xd8, 0xae, 0x26, + 0xac, 0x50, 0x53, 0x86, 0xf2, 0x56, 0xd7, 0x3f, + 0x29, 0xbc, 0x45, 0x68, 0x8e, 0xcb, 0x98, 0x64, + 0xdd, 0xc9, 0xba, 0xb8, 0x4b, 0x7b, 0x82, 0xdd, + 0x14, 0xa7, 0xcb, 0x71, 0x72, 0x00, 0x5c, 0xad, + 0x7b, 0x6a, 0x89, 0xa4, 0x3d, 0xbf, 0xb5, 0x4b, + 0x3e, 0x7c, 0x5a, 0xcf, 0xb8, 0xa1, 0xc5, 0x6e, + 0xc8, 0xb6, 0x31, 0x57, 0x7b, 0xdf, 0xa5, 0x7e, + 0xb1, 0xd6, 0x42, 0x2a, 0x31, 0x36, 0xd1, 0xd0, + 0x3f, 0x7a, 0xe5, 0x94, 0xd6, 0x36, 0xa0, 0x6f, + 0xb7, 0x40, 0x7d, 0x37, 0xc6, 0x55, 0x7c, 0x50, + 0x40, 0x6d, 0x29, 0x89, 0xe3, 0x5a, 0xae, 0x97, + 0xe7, 0x44, 0x49, 0x6e, 0xbd, 0x81, 0x3d, 0x03, + 0x93, 0x06, 0x12, 0x06, 0xe2, 0x41, 0x12, 0x4a, + 0xf1, 0x6a, 0xa4, 0x58, 0xa2, 0xfb, 0xd2, 0x15, + 0xba, 0xc9, 0x79, 0xc9, 0xce, 0x5e, 0x13, 0xbb, + 0xf1, 0x09, 0x04, 0xcc, 0xfd, 0xe8, 0x51, 0x34, + 0x6a, 0xe8, 0x61, 0x88, 0xda, 0xed, 0x01, 0x47, + 0x84, 0xf5, 0x73, 0x25, 0xf9, 0x1c, 0x42, 0x86, + 0x07, 0xf3, 0x5b, 0x1a, 0x01, 0xb3, 0xeb, 0x24, + 0x32, 0x8d, 0xf6, 0xed, 0x7c, 0x4b, 0xeb, 0x3c, + 0x36, 0x42, 0x28, 0xdf, 0xdf, 0xb6, 0xbe, 0xd9, + 0x8c, 0x52, 0xd3, 0x2b, 0x08, 0x90, 0x8c, 0xe7, + 0x98, 0x31, 0xe2, 0x32, 0x8e, 0xfc, 0x11, 0x48, + 0x00, 0xa8, 0x6a, 0x42, 0x4a, 0x02, 0xc6, 0x4b, + 0x09, 0xf1, 0xe3, 0x49, 0xf3, 0x45, 0x1f, 0x0e, + 0xbc, 0x56, 0xe2, 0xe4, 0xdf, 0xfb, 0xeb, 0x61, + 0xfa, 0x24, 0xc1, 0x63, 0x75, 0xbb, 0x47, 0x75, + 0xaf, 0xe1, 0x53, 0x16, 0x96, 0x21, 0x85, 0x26, + 0x11, 0xb3, 0x76, 0xe3, 0x23, 0xa1, 0x6b, 0x74, + 0x37, 0xd0, 0xde, 0x06, 0x90, 0x71, 0x5d, 0x43, + 0x88, 0x9b, 0x00, 0x54, 0xa6, 0x75, 0x2f, 0xa1, + 0xc2, 0x0b, 0x73, 0x20, 0x1d, 0xb6, 0x21, 0x79, + 0x57, 0x3f, 0xfa, 0x09, 0xbe, 0x8a, 0x33, 0xc3, + 0x52, 0xf0, 0x1d, 0x82, 0x31, 0xd1, 0x55, 0xb5, + 0x6c, 0x99, 0x25, 0xcf, 0x5c, 0x32, 0xce, 0xe9, + 0x0d, 0xfa, 0x69, 0x2c, 0xd5, 0x0d, 0xc5, 0x6d, + 0x86, 0xd0, 0x0c, 0x3b, 0x06, 0x50, 0x79, 0xe8, + 0xc3, 0xae, 0x04, 0xe6, 0xcd, 0x51, 0xe4, 0x26, + 0x9b, 0x4f, 0x7e, 0xa6, 0x0f, 0xab, 0xd8, 0xe5, + 0xde, 0xa9, 0x00, 0x95, 0xbe, 0xa3, 0x9d, 0x5d, + 0xb2, 0x09, 0x70, 0x18, 0x1c, 0xf0, 0xac, 0x29, + 0x23, 0x02, 0x29, 0x28, 0xd2, 0x74, 0x35, 0x57, + 0x62, 0x0f, 0x24, 0xea, 0x5e, 0x33, 0xc2, 0x92, + 0xf3, 0x78, 0x4d, 0x30, 0x1e, 0xa1, 0x99, 0xa9, + 0x82, 0xb0, 0x42, 0x31, 0x8d, 0xad, 0x8a, 0xbc, + 0xfc, 0xd4, 0x57, 0x47, 0x3e, 0xb4, 0x50, 0xdd, + 0x6e, 0x2c, 0x80, 0x4d, 0x22, 0xf1, 0xfb, 0x57, + 0xc4, 0xdd, 0x17, 0xe1, 0x8a, 0x36, 0x4a, 0xb3, + 0x37, 0xca, 0xc9, 0x4e, 0xab, 0xd5, 0x69, 0xc4, + 0xf4, 0xbc, 0x0b, 0x3b, 0x44, 0x4b, 0x29, 0x9c, + 0xee, 0xd4, 0x35, 0x22, 0x21, 0xb0, 0x1f, 0x27, + 0x64, 0xa8, 0x51, 0x1b, 0xf0, 0x9f, 0x19, 0x5c, + 0xfb, 0x5a, 0x64, 0x74, 0x70, 0x45, 0x09, 0xf5, + 0x64, 0xfe, 0x1a, 0x2d, 0xc9, 0x14, 0x04, 0x14, + 0xcf, 0xd5, 0x7d, 0x60, 0xaf, 0x94, 0x39, 0x94, + 0xe2, 0x7d, 0x79, 0x82, 0xd0, 0x65, 0x3b, 0x6b, + 0x9c, 0x19, 0x84, 0xb4, 0x6d, 0xb3, 0x0c, 0x99, + 0xc0, 0x56, 0xa8, 0xbd, 0x73, 0xce, 0x05, 0x84, + 0x3e, 0x30, 0xaa, 0xc4, 0x9b, 0x1b, 0x04, 0x2a, + 0x9f, 0xd7, 0x43, 0x2b, 0x23, 0xdf, 0xbf, 0xaa, + 0xd5, 0xc2, 0x43, 0x2d, 0x70, 0xab, 0xdc, 0x75, + 0xad, 0xac, 0xf7, 0xc0, 0xbe, 0x67, 0xb2, 0x74, + 0xed, 0x67, 0x10, 0x4a, 0x92, 0x60, 0xc1, 0x40, + 0x50, 0x19, 0x8a, 0x8a, 0x8c, 0x09, 0x0e, 0x72, + 0xe1, 0x73, 0x5e, 0xe8, 0x41, 0x85, 0x63, 0x9f, + 0x3f, 0xd7, 0x7d, 0xc4, 0xfb, 0x22, 0x5d, 0x92, + 0x6c, 0xb3, 0x1e, 0xe2, 0x50, 0x2f, 0x82, 0xa8, + 0x28, 0xc0, 0xb5, 0xd7, 0x5f, 0x68, 0x0d, 0x2c, + 0x2d, 0xaf, 0x7e, 0xfa, 0x2e, 0x08, 0x0f, 0x1f, + 0x70, 0x9f, 0xe9, 0x19, 0x72, 0x55, 0xf8, 0xfb, + 0x51, 0xd2, 0x33, 0x5d, 0xa0, 0xd3, 0x2b, 0x0a, + 0x6c, 0xbc, 0x4e, 0xcf, 0x36, 0x4d, 0xdc, 0x3b, + 0xe9, 0x3e, 0x81, 0x7c, 0x61, 0xdb, 0x20, 0x2d, + 0x3a, 0xc3, 0xb3, 0x0c, 0x1e, 0x00, 0xb9, 0x7c, + 0xf5, 0xca, 0x10, 0x5f, 0x3a, 0x71, 0xb3, 0xe4, + 0x20, 0xdb, 0x0c, 0x2a, 0x98, 0x63, 0x45, 0x00, + 0x58, 0xf6, 0x68, 0xe4, 0x0b, 0xda, 0x13, 0x3b, + 0x60, 0x5c, 0x76, 0xdb, 0xb9, 0x97, 0x71, 0xe4, + 0xd9, 0xb7, 0xdb, 0xbd, 0x68, 0xc7, 0x84, 0x84, + 0xaa, 0x7c, 0x68, 0x62, 0x5e, 0x16, 0xfc, 0xba, + 0x72, 0xaa, 0x9a, 0xa9, 0xeb, 0x7c, 0x75, 0x47, + 0x97, 0x7e, 0xad, 0xe2, 0xd9, 0x91, 0xe8, 0xe4, + 0xa5, 0x31, 0xd7, 0x01, 0x8e, 0xa2, 0x11, 0x88, + 0x95, 0xb9, 0xf2, 0x9b, 0xd3, 0x7f, 0x1b, 0x81, + 0x22, 0xf7, 0x98, 0x60, 0x0a, 0x64, 0xa6, 0xc1, + 0xf6, 0x49, 0xc7, 0xe3, 0x07, 0x4d, 0x94, 0x7a, + 0xcf, 0x6e, 0x68, 0x0c, 0x1b, 0x3f, 0x6e, 0x2e, + 0xee, 0x92, 0xfa, 0x52, 0xb3, 0x59, 0xf8, 0xf1, + 0x8f, 0x6a, 0x66, 0xa3, 0x82, 0x76, 0x4a, 0x07, + 0x1a, 0xc7, 0xdd, 0xf5, 0xda, 0x9c, 0x3c, 0x24, + 0xbf, 0xfd, 0x42, 0xa1, 0x10, 0x64, 0x6a, 0x0f, + 0x89, 0xee, 0x36, 0xa5, 0xce, 0x99, 0x48, 0x6a, + 0xf0, 0x9f, 0x9e, 0x69, 0xa4, 0x40, 0x20, 0xe9, + 0x16, 0x15, 0xf7, 0xdb, 0x75, 0x02, 0xcb, 0xe9, + 0x73, 0x8b, 0x3b, 0x49, 0x2f, 0xf0, 0xaf, 0x51, + 0x06, 0x5c, 0xdf, 0x27, 0x27, 0x49, 0x6a, 0xd1, + 0xcc, 0xc7, 0xb5, 0x63, 0xb5, 0xfc, 0xb8, 0x5c, + 0x87, 0x7f, 0x84, 0xb4, 0xcc, 0x14, 0xa9, 0x53, + 0xda, 0xa4, 0x56, 0xf8, 0xb6, 0x1b, 0xcc, 0x40, + 0x27, 0x52, 0x06, 0x5a, 0x13, 0x81, 0xd7, 0x3a, + 0xd4, 0x3b, 0xfb, 0x49, 0x65, 0x31, 0x33, 0xb2, + 0xfa, 0xcd, 0xad, 0x58, 0x4e, 0x2b, 0xae, 0xd2, + 0x20, 0xfb, 0x1a, 0x48, 0xb4, 0x3f, 0x9a, 0xd8, + 0x7a, 0x35, 0x4a, 0xc8, 0xee, 0x88, 0x5e, 0x07, + 0x66, 0x54, 0xb9, 0xec, 0x9f, 0xa3, 0xe3, 0xb9, + 0x37, 0xaa, 0x49, 0x76, 0x31, 0xda, 0x74, 0x2d, + 0x3c, 0xa4, 0x65, 0x10, 0x32, 0x38, 0xf0, 0xde, + 0xd3, 0x99, 0x17, 0xaa, 0x71, 0xaa, 0x8f, 0x0f, + 0x8c, 0xaf, 0xa2, 0xf8, 0x5d, 0x64, 0xba, 0x1d, + 0xa3, 0xef, 0x96, 0x73, 0xe8, 0xa1, 0x02, 0x8d, + 0x0c, 0x6d, 0xb8, 0x06, 0x90, 0xb8, 0x08, 0x56, + 0x2c, 0xa7, 0x06, 0xc9, 0xc2, 0x38, 0xdb, 0x7c, + 0x63, 0xb1, 0x57, 0x8e, 0xea, 0x7c, 0x79, 0xf3, + 0x49, 0x1d, 0xfe, 0x9f, 0xf3, 0x6e, 0xb1, 0x1d, + 0xba, 0x19, 0x80, 0x1a, 0x0a, 0xd3, 0xb0, 0x26, + 0x21, 0x40, 0xb1, 0x7c, 0xf9, 0x4d, 0x8d, 0x10, + 0xc1, 0x7e, 0xf4, 0xf6, 0x3c, 0xa8, 0xfd, 0x7c, + 0xa3, 0x92, 0xb2, 0x0f, 0xaa, 0xcc, 0xa6, 0x11, + 0xfe, 0x04, 0xe3, 0xd1, 0x7a, 0x32, 0x89, 0xdf, + 0x0d, 0xc4, 0x8f, 0x79, 0x6b, 0xca, 0x16, 0x7c, + 0x6e, 0xf9, 0xad, 0x0f, 0xf6, 0xfe, 0x27, 0xdb, + 0xc4, 0x13, 0x70, 0xf1, 0x62, 0x1a, 0x4f, 0x79, + 0x40, 0xc9, 0x9b, 0x8b, 0x21, 0xea, 0x84, 0xfa, + 0xf5, 0xf1, 0x89, 0xce, 0xb7, 0x55, 0x0a, 0x80, + 0x39, 0x2f, 0x55, 0x36, 0x16, 0x9c, 0x7b, 0x08, + 0xbd, 0x87, 0x0d, 0xa5, 0x32, 0xf1, 0x52, 0x7c, + 0xe8, 0x55, 0x60, 0x5b, 0xd7, 0x69, 0xe4, 0xfc, + 0xfa, 0x12, 0x85, 0x96, 0xea, 0x50, 0x28, 0xab, + 0x8a, 0xf7, 0xbb, 0x0e, 0x53, 0x74, 0xca, 0xa6, + 0x27, 0x09, 0xc2, 0xb5, 0xde, 0x18, 0x14, 0xd9, + 0xea, 0xe5, 0x29, 0x1c, 0x40, 0x56, 0xcf, 0xd7, + 0xae, 0x05, 0x3f, 0x65, 0xaf, 0x05, 0x73, 0xe2, + 0x35, 0x96, 0x27, 0x07, 0x14, 0xc0, 0xad, 0x33, + 0xf1, 0xdc, 0x44, 0x7a, 0x89, 0x17, 0x77, 0xd2, + 0x9c, 0x58, 0x60, 0xf0, 0x3f, 0x7b, 0x2d, 0x2e, + 0x57, 0x95, 0x54, 0x87, 0xed, 0xf2, 0xc7, 0x4c, + 0xf0, 0xae, 0x56, 0x29, 0x19, 0x7d, 0x66, 0x4b, + 0x9b, 0x83, 0x84, 0x42, 0x3b, 0x01, 0x25, 0x66, + 0x8e, 0x02, 0xde, 0xb9, 0x83, 0x54, 0x19, 0xf6, + 0x9f, 0x79, 0x0d, 0x67, 0xc5, 0x1d, 0x7a, 0x44, + 0x02, 0x98, 0xa7, 0x16, 0x1c, 0x29, 0x0d, 0x74, + 0xff, 0x85, 0x40, 0x06, 0xef, 0x2c, 0xa9, 0xc6, + 0xf5, 0x53, 0x07, 0x06, 0xae, 0xe4, 0xfa, 0x5f, + 0xd8, 0x39, 0x4d, 0xf1, 0x9b, 0x6b, 0xd9, 0x24, + 0x84, 0xfe, 0x03, 0x4c, 0xb2, 0x3f, 0xdf, 0xa1, + 0x05, 0x9e, 0x50, 0x14, 0x5a, 0xd9, 0x1a, 0xa2, + 0xa7, 0xfa, 0xfa, 0x17, 0xf7, 0x78, 0xd6, 0xb5, + 0x92, 0x61, 0x91, 0xac, 0x36, 0xfa, 0x56, 0x0d, + 0x38, 0x32, 0x18, 0x85, 0x08, 0x58, 0x37, 0xf0, + 0x4b, 0xdb, 0x59, 0xe7, 0xa4, 0x34, 0xc0, 0x1b, + 0x01, 0xaf, 0x2d, 0xde, 0xa1, 0xaa, 0x5d, 0xd3, + 0xec, 0xe1, 0xd4, 0xf7, 0xe6, 0x54, 0x68, 0xf0, + 0x51, 0x97, 0xa7, 0x89, 0xea, 0x24, 0xad, 0xd3, + 0x6e, 0x47, 0x93, 0x8b, 0x4b, 0xb4, 0xf7, 0x1c, + 0x42, 0x06, 0x67, 0xe8, 0x99, 0xf6, 0xf5, 0x7b, + 0x85, 0xb5, 0x65, 0xb5, 0xb5, 0xd2, 0x37, 0xf5, + 0xf3, 0x02, 0xa6, 0x4d, 0x11, 0xa7, 0xdc, 0x51, + 0x09, 0x7f, 0xa0, 0xd8, 0x88, 0x1c, 0x13, 0x71, + 0xae, 0x9c, 0xb7, 0x7b, 0x34, 0xd6, 0x4e, 0x68, + 0x26, 0x83, 0x51, 0xaf, 0x1d, 0xee, 0x8b, 0xbb, + 0x69, 0x43, 0x2b, 0x9e, 0x8a, 0xbc, 0x02, 0x0e, + 0xa0, 0x1b, 0xe0, 0xa8, 0x5f, 0x6f, 0xaf, 0x1b, + 0x8f, 0xe7, 0x64, 0x71, 0x74, 0x11, 0x7e, 0xa8, + 0xd8, 0xf9, 0x97, 0x06, 0xc3, 0xb6, 0xfb, 0xfb, + 0xb7, 0x3d, 0x35, 0x9d, 0x3b, 0x52, 0xed, 0x54, + 0xca, 0xf4, 0x81, 0x01, 0x2d, 0x1b, 0xc3, 0xa7, + 0x00, 0x3d, 0x1a, 0x39, 0x54, 0xe1, 0xf6, 0xff, + 0xed, 0x6f, 0x0b, 0x5a, 0x68, 0xda, 0x58, 0xdd, + 0xa9, 0xcf, 0x5c, 0x4a, 0xe5, 0x09, 0x4e, 0xde, + 0x9d, 0xbc, 0x3e, 0xee, 0x5a, 0x00, 0x3b, 0x2c, + 0x87, 0x10, 0x65, 0x60, 0xdd, 0xd7, 0x56, 0xd1, + 0x4c, 0x64, 0x45, 0xe4, 0x21, 0xec, 0x78, 0xf8, + 0x25, 0x7a, 0x3e, 0x16, 0x5d, 0x09, 0x53, 0x14, + 0xbe, 0x4f, 0xae, 0x87, 0xd8, 0xd1, 0xaa, 0x3c, + 0xf6, 0x3e, 0xa4, 0x70, 0x8c, 0x5e, 0x70, 0xa4, + 0xb3, 0x6b, 0x66, 0x73, 0xd3, 0xbf, 0x31, 0x06, + 0x19, 0x62, 0x93, 0x15, 0xf2, 0x86, 0xe4, 0x52, + 0x7e, 0x53, 0x4c, 0x12, 0x38, 0xcc, 0x34, 0x7d, + 0x57, 0xf6, 0x42, 0x93, 0x8a, 0xc4, 0xee, 0x5c, + 0x8a, 0xe1, 0x52, 0x8f, 0x56, 0x64, 0xf6, 0xa6, + 0xd1, 0x91, 0x57, 0x70, 0xcd, 0x11, 0x76, 0xf5, + 0x59, 0x60, 0x60, 0x3c, 0xc1, 0xc3, 0x0b, 0x7f, + 0x58, 0x1a, 0x50, 0x91, 0xf1, 0x68, 0x8f, 0x6e, + 0x74, 0x74, 0xa8, 0x51, 0x0b, 0xf7, 0x7a, 0x98, + 0x37, 0xf2, 0x0a, 0x0e, 0xa4, 0x97, 0x04, 0xb8, + 0x9b, 0xfd, 0xa0, 0xea, 0xf7, 0x0d, 0xe1, 0xdb, + 0x03, 0xf0, 0x31, 0x29, 0xf8, 0xdd, 0x6b, 0x8b, + 0x5d, 0xd8, 0x59, 0xa9, 0x29, 0xcf, 0x9a, 0x79, + 0x89, 0x19, 0x63, 0x46, 0x09, 0x79, 0x6a, 0x11, + 0xda, 0x63, 0x68, 0x48, 0x77, 0x23, 0xfb, 0x7d, + 0x3a, 0x43, 0xcb, 0x02, 0x3b, 0x7a, 0x6d, 0x10, + 0x2a, 0x9e, 0xac, 0xf1, 0xd4, 0x19, 0xf8, 0x23, + 0x64, 0x1d, 0x2c, 0x5f, 0xf2, 0xb0, 0x5c, 0x23, + 0x27, 0xf7, 0x27, 0x30, 0x16, 0x37, 0xb1, 0x90, + 0xab, 0x38, 0xfb, 0x55, 0xcd, 0x78, 0x58, 0xd4, + 0x7d, 0x43, 0xf6, 0x45, 0x5e, 0x55, 0x8d, 0xb1, + 0x02, 0x65, 0x58, 0xb4, 0x13, 0x4b, 0x36, 0xf7, + 0xcc, 0xfe, 0x3d, 0x0b, 0x82, 0xe2, 0x12, 0x11, + 0xbb, 0xe6, 0xb8, 0x3a, 0x48, 0x71, 0xc7, 0x50, + 0x06, 0x16, 0x3a, 0xe6, 0x7c, 0x05, 0xc7, 0xc8, + 0x4d, 0x2f, 0x08, 0x6a, 0x17, 0x9a, 0x95, 0x97, + 0x50, 0x68, 0xdc, 0x28, 0x18, 0xc4, 0x61, 0x38, + 0xb9, 0xe0, 0x3e, 0x78, 0xdb, 0x29, 0xe0, 0x9f, + 0x52, 0xdd, 0xf8, 0x4f, 0x91, 0xc1, 0xd0, 0x33, + 0xa1, 0x7a, 0x8e, 0x30, 0x13, 0x82, 0x07, 0x9f, + 0xd3, 0x31, 0x0f, 0x23, 0xbe, 0x32, 0x5a, 0x75, + 0xcf, 0x96, 0xb2, 0xec, 0xb5, 0x32, 0xac, 0x21, + 0xd1, 0x82, 0x33, 0xd3, 0x15, 0x74, 0xbd, 0x90, + 0xf1, 0x2c, 0xe6, 0x5f, 0x8d, 0xe3, 0x02, 0xe8, + 0xe9, 0xc4, 0xca, 0x96, 0xeb, 0x0e, 0xbc, 0x91, + 0xf4, 0xb9, 0xea, 0xd9, 0x1b, 0x75, 0xbd, 0xe1, + 0xac, 0x2a, 0x05, 0x37, 0x52, 0x9b, 0x1b, 0x3f, + 0x5a, 0xdc, 0x21, 0xc3, 0x98, 0xbb, 0xaf, 0xa3, + 0xf2, 0x00, 0xbf, 0x0d, 0x30, 0x89, 0x05, 0xcc, + 0xa5, 0x76, 0xf5, 0x06, 0xf0, 0xc6, 0x54, 0x8a, + 0x5d, 0xd4, 0x1e, 0xc1, 0xf2, 0xce, 0xb0, 0x62, + 0xc8, 0xfc, 0x59, 0x42, 0x9a, 0x90, 0x60, 0x55, + 0xfe, 0x88, 0xa5, 0x8b, 0xb8, 0x33, 0x0c, 0x23, + 0x24, 0x0d, 0x15, 0x70, 0x37, 0x1e, 0x3d, 0xf6, + 0xd2, 0xea, 0x92, 0x10, 0xb2, 0xc4, 0x51, 0xac, + 0xf2, 0xac, 0xf3, 0x6b, 0x6c, 0xaa, 0xcf, 0x12, + 0xc5, 0x6c, 0x90, 0x50, 0xb5, 0x0c, 0xfc, 0x1a, + 0x15, 0x52, 0xe9, 0x26, 0xc6, 0x52, 0xa4, 0xe7, + 0x81, 0x69, 0xe1, 0xe7, 0x9e, 0x30, 0x01, 0xec, + 0x84, 0x89, 0xb2, 0x0d, 0x66, 0xdd, 0xce, 0x28, + 0x5c, 0xec, 0x98, 0x46, 0x68, 0x21, 0x9f, 0x88, + 0x3f, 0x1f, 0x42, 0x77, 0xce, 0xd0, 0x61, 0xd4, + 0x20, 0xa7, 0xff, 0x53, 0xad, 0x37, 0xd0, 0x17, + 0x35, 0xc9, 0xfc, 0xba, 0x0a, 0x78, 0x3f, 0xf2, + 0xcc, 0x86, 0x89, 0xe8, 0x4b, 0x3c, 0x48, 0x33, + 0x09, 0x7f, 0xc6, 0xc0, 0xdd, 0xb8, 0xfd, 0x7a, + 0x66, 0x66, 0x65, 0xeb, 0x47, 0xa7, 0x04, 0x28, + 0xa3, 0x19, 0x8e, 0xa9, 0xb1, 0x13, 0x67, 0x62, + 0x70, 0xcf, 0xd6 +}; +static const u8 enc_assoc012[] __initconst = { + 0xb1, 0x69, 0x83, 0x87, 0x30, 0xaa, 0x5d, 0xb8, + 0x77, 0xe8, 0x21, 0xff, 0x06, 0x59, 0x35, 0xce, + 0x75, 0xfe, 0x38, 0xef, 0xb8, 0x91, 0x43, 0x8c, + 0xcf, 0x70, 0xdd, 0x0a, 0x68, 0xbf, 0xd4, 0xbc, + 0x16, 0x76, 0x99, 0x36, 0x1e, 0x58, 0x79, 0x5e, + 0xd4, 0x29, 0xf7, 0x33, 0x93, 0x48, 0xdb, 0x5f, + 0x01, 0xae, 0x9c, 0xb6, 0xe4, 0x88, 0x6d, 0x2b, + 0x76, 0x75, 0xe0, 0xf3, 0x74, 0xe2, 0xc9 +}; +static const u8 enc_nonce012[] __initconst = { + 0x05, 0xa3, 0x93, 0xed, 0x30, 0xc5, 0xa2, 0x06 +}; +static const u8 enc_key012[] __initconst = { + 0xb3, 0x35, 0x50, 0x03, 0x54, 0x2e, 0x40, 0x5e, + 0x8f, 0x59, 0x8e, 0xc5, 0x90, 0xd5, 0x27, 0x2d, + 0xba, 0x29, 0x2e, 0xcb, 0x1b, 0x70, 0x44, 0x1e, + 0x65, 0x91, 0x6e, 0x2a, 0x79, 0x22, 0xda, 0x64 +}; + +/* wycheproof - rfc7539 */ +static const u8 enc_input013[] __initconst = { + 0x4c, 0x61, 0x64, 0x69, 0x65, 0x73, 0x20, 0x61, + 0x6e, 0x64, 0x20, 0x47, 0x65, 0x6e, 0x74, 0x6c, + 0x65, 0x6d, 0x65, 0x6e, 0x20, 0x6f, 0x66, 0x20, + 0x74, 0x68, 0x65, 0x20, 0x63, 0x6c, 0x61, 0x73, + 0x73, 0x20, 0x6f, 0x66, 0x20, 0x27, 0x39, 0x39, + 0x3a, 0x20, 0x49, 0x66, 0x20, 0x49, 0x20, 0x63, + 0x6f, 0x75, 0x6c, 0x64, 0x20, 0x6f, 0x66, 0x66, + 0x65, 0x72, 0x20, 0x79, 0x6f, 0x75, 0x20, 0x6f, + 0x6e, 0x6c, 0x79, 0x20, 0x6f, 0x6e, 0x65, 0x20, + 0x74, 0x69, 0x70, 0x20, 0x66, 0x6f, 0x72, 0x20, + 0x74, 0x68, 0x65, 0x20, 0x66, 0x75, 0x74, 0x75, + 0x72, 0x65, 0x2c, 0x20, 0x73, 0x75, 0x6e, 0x73, + 0x63, 0x72, 0x65, 0x65, 0x6e, 0x20, 0x77, 0x6f, + 0x75, 0x6c, 0x64, 0x20, 0x62, 0x65, 0x20, 0x69, + 0x74, 0x2e +}; +static const u8 enc_output013[] __initconst = { + 0xd3, 0x1a, 0x8d, 0x34, 0x64, 0x8e, 0x60, 0xdb, + 0x7b, 0x86, 0xaf, 0xbc, 0x53, 0xef, 0x7e, 0xc2, + 0xa4, 0xad, 0xed, 0x51, 0x29, 0x6e, 0x08, 0xfe, + 0xa9, 0xe2, 0xb5, 0xa7, 0x36, 0xee, 0x62, 0xd6, + 0x3d, 0xbe, 0xa4, 0x5e, 0x8c, 0xa9, 0x67, 0x12, + 0x82, 0xfa, 0xfb, 0x69, 0xda, 0x92, 0x72, 0x8b, + 0x1a, 0x71, 0xde, 0x0a, 0x9e, 0x06, 0x0b, 0x29, + 0x05, 0xd6, 0xa5, 0xb6, 0x7e, 0xcd, 0x3b, 0x36, + 0x92, 0xdd, 0xbd, 0x7f, 0x2d, 0x77, 0x8b, 0x8c, + 0x98, 0x03, 0xae, 0xe3, 0x28, 0x09, 0x1b, 0x58, + 0xfa, 0xb3, 0x24, 0xe4, 0xfa, 0xd6, 0x75, 0x94, + 0x55, 0x85, 0x80, 0x8b, 0x48, 0x31, 0xd7, 0xbc, + 0x3f, 0xf4, 0xde, 0xf0, 0x8e, 0x4b, 0x7a, 0x9d, + 0xe5, 0x76, 0xd2, 0x65, 0x86, 0xce, 0xc6, 0x4b, + 0x61, 0x16, 0x1a, 0xe1, 0x0b, 0x59, 0x4f, 0x09, + 0xe2, 0x6a, 0x7e, 0x90, 0x2e, 0xcb, 0xd0, 0x60, + 0x06, 0x91 +}; +static const u8 enc_assoc013[] __initconst = { + 0x50, 0x51, 0x52, 0x53, 0xc0, 0xc1, 0xc2, 0xc3, + 0xc4, 0xc5, 0xc6, 0xc7 +}; +static const u8 enc_nonce013[] __initconst = { + 0x07, 0x00, 0x00, 0x00, 0x40, 0x41, 0x42, 0x43, + 0x44, 0x45, 0x46, 0x47 +}; +static const u8 enc_key013[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - misc */ +static const u8 enc_input014[] __initconst = { }; +static const u8 enc_output014[] __initconst = { + 0x76, 0xac, 0xb3, 0x42, 0xcf, 0x31, 0x66, 0xa5, + 0xb6, 0x3c, 0x0c, 0x0e, 0xa1, 0x38, 0x3c, 0x8d +}; +static const u8 enc_assoc014[] __initconst = { }; +static const u8 enc_nonce014[] __initconst = { + 0x4d, 0xa5, 0xbf, 0x8d, 0xfd, 0x58, 0x52, 0xc1, + 0xea, 0x12, 0x37, 0x9d +}; +static const u8 enc_key014[] __initconst = { + 0x80, 0xba, 0x31, 0x92, 0xc8, 0x03, 0xce, 0x96, + 0x5e, 0xa3, 0x71, 0xd5, 0xff, 0x07, 0x3c, 0xf0, + 0xf4, 0x3b, 0x6a, 0x2a, 0xb5, 0x76, 0xb2, 0x08, + 0x42, 0x6e, 0x11, 0x40, 0x9c, 0x09, 0xb9, 0xb0 +}; + +/* wycheproof - misc */ +static const u8 enc_input015[] __initconst = { }; +static const u8 enc_output015[] __initconst = { + 0x90, 0x6f, 0xa6, 0x28, 0x4b, 0x52, 0xf8, 0x7b, + 0x73, 0x59, 0xcb, 0xaa, 0x75, 0x63, 0xc7, 0x09 +}; +static const u8 enc_assoc015[] __initconst = { + 0xbd, 0x50, 0x67, 0x64, 0xf2, 0xd2, 0xc4, 0x10 +}; +static const u8 enc_nonce015[] __initconst = { + 0xa9, 0x2e, 0xf0, 0xac, 0x99, 0x1d, 0xd5, 0x16, + 0xa3, 0xc6, 0xf6, 0x89 +}; +static const u8 enc_key015[] __initconst = { + 0x7a, 0x4c, 0xd7, 0x59, 0x17, 0x2e, 0x02, 0xeb, + 0x20, 0x4d, 0xb2, 0xc3, 0xf5, 0xc7, 0x46, 0x22, + 0x7d, 0xf5, 0x84, 0xfc, 0x13, 0x45, 0x19, 0x63, + 0x91, 0xdb, 0xb9, 0x57, 0x7a, 0x25, 0x07, 0x42 +}; + +/* wycheproof - misc */ +static const u8 enc_input016[] __initconst = { + 0x2a +}; +static const u8 enc_output016[] __initconst = { + 0x3a, 0xca, 0xc2, 0x7d, 0xec, 0x09, 0x68, 0x80, + 0x1e, 0x9f, 0x6e, 0xde, 0xd6, 0x9d, 0x80, 0x75, + 0x22 +}; +static const u8 enc_assoc016[] __initconst = { }; +static const u8 enc_nonce016[] __initconst = { + 0x99, 0xe2, 0x3e, 0xc4, 0x89, 0x85, 0xbc, 0xcd, + 0xee, 0xab, 0x60, 0xf1 +}; +static const u8 enc_key016[] __initconst = { + 0xcc, 0x56, 0xb6, 0x80, 0x55, 0x2e, 0xb7, 0x50, + 0x08, 0xf5, 0x48, 0x4b, 0x4c, 0xb8, 0x03, 0xfa, + 0x50, 0x63, 0xeb, 0xd6, 0xea, 0xb9, 0x1f, 0x6a, + 0xb6, 0xae, 0xf4, 0x91, 0x6a, 0x76, 0x62, 0x73 +}; + +/* wycheproof - misc */ +static const u8 enc_input017[] __initconst = { + 0x51 +}; +static const u8 enc_output017[] __initconst = { + 0xc4, 0x16, 0x83, 0x10, 0xca, 0x45, 0xb1, 0xf7, + 0xc6, 0x6c, 0xad, 0x4e, 0x99, 0xe4, 0x3f, 0x72, + 0xb9 +}; +static const u8 enc_assoc017[] __initconst = { + 0x91, 0xca, 0x6c, 0x59, 0x2c, 0xbc, 0xca, 0x53 +}; +static const u8 enc_nonce017[] __initconst = { + 0xab, 0x0d, 0xca, 0x71, 0x6e, 0xe0, 0x51, 0xd2, + 0x78, 0x2f, 0x44, 0x03 +}; +static const u8 enc_key017[] __initconst = { + 0x46, 0xf0, 0x25, 0x49, 0x65, 0xf7, 0x69, 0xd5, + 0x2b, 0xdb, 0x4a, 0x70, 0xb4, 0x43, 0x19, 0x9f, + 0x8e, 0xf2, 0x07, 0x52, 0x0d, 0x12, 0x20, 0xc5, + 0x5e, 0x4b, 0x70, 0xf0, 0xfd, 0xa6, 0x20, 0xee +}; + +/* wycheproof - misc */ +static const u8 enc_input018[] __initconst = { + 0x5c, 0x60 +}; +static const u8 enc_output018[] __initconst = { + 0x4d, 0x13, 0x91, 0xe8, 0xb6, 0x1e, 0xfb, 0x39, + 0xc1, 0x22, 0x19, 0x54, 0x53, 0x07, 0x7b, 0x22, + 0xe5, 0xe2 +}; +static const u8 enc_assoc018[] __initconst = { }; +static const u8 enc_nonce018[] __initconst = { + 0x46, 0x1a, 0xf1, 0x22, 0xe9, 0xf2, 0xe0, 0x34, + 0x7e, 0x03, 0xf2, 0xdb +}; +static const u8 enc_key018[] __initconst = { + 0x2f, 0x7f, 0x7e, 0x4f, 0x59, 0x2b, 0xb3, 0x89, + 0x19, 0x49, 0x89, 0x74, 0x35, 0x07, 0xbf, 0x3e, + 0xe9, 0xcb, 0xde, 0x17, 0x86, 0xb6, 0x69, 0x5f, + 0xe6, 0xc0, 0x25, 0xfd, 0x9b, 0xa4, 0xc1, 0x00 +}; + +/* wycheproof - misc */ +static const u8 enc_input019[] __initconst = { + 0xdd, 0xf2 +}; +static const u8 enc_output019[] __initconst = { + 0xb6, 0x0d, 0xea, 0xd0, 0xfd, 0x46, 0x97, 0xec, + 0x2e, 0x55, 0x58, 0x23, 0x77, 0x19, 0xd0, 0x24, + 0x37, 0xa2 +}; +static const u8 enc_assoc019[] __initconst = { + 0x88, 0x36, 0x4f, 0xc8, 0x06, 0x05, 0x18, 0xbf +}; +static const u8 enc_nonce019[] __initconst = { + 0x61, 0x54, 0x6b, 0xa5, 0xf1, 0x72, 0x05, 0x90, + 0xb6, 0x04, 0x0a, 0xc6 +}; +static const u8 enc_key019[] __initconst = { + 0xc8, 0x83, 0x3d, 0xce, 0x5e, 0xa9, 0xf2, 0x48, + 0xaa, 0x20, 0x30, 0xea, 0xcf, 0xe7, 0x2b, 0xff, + 0xe6, 0x9a, 0x62, 0x0c, 0xaf, 0x79, 0x33, 0x44, + 0xe5, 0x71, 0x8f, 0xe0, 0xd7, 0xab, 0x1a, 0x58 +}; + +/* wycheproof - misc */ +static const u8 enc_input020[] __initconst = { + 0xab, 0x85, 0xe9, 0xc1, 0x57, 0x17, 0x31 +}; +static const u8 enc_output020[] __initconst = { + 0x5d, 0xfe, 0x34, 0x40, 0xdb, 0xb3, 0xc3, 0xed, + 0x7a, 0x43, 0x4e, 0x26, 0x02, 0xd3, 0x94, 0x28, + 0x1e, 0x0a, 0xfa, 0x9f, 0xb7, 0xaa, 0x42 +}; +static const u8 enc_assoc020[] __initconst = { }; +static const u8 enc_nonce020[] __initconst = { + 0x3c, 0x4e, 0x65, 0x4d, 0x66, 0x3f, 0xa4, 0x59, + 0x6d, 0xc5, 0x5b, 0xb7 +}; +static const u8 enc_key020[] __initconst = { + 0x55, 0x56, 0x81, 0x58, 0xd3, 0xa6, 0x48, 0x3f, + 0x1f, 0x70, 0x21, 0xea, 0xb6, 0x9b, 0x70, 0x3f, + 0x61, 0x42, 0x51, 0xca, 0xdc, 0x1a, 0xf5, 0xd3, + 0x4a, 0x37, 0x4f, 0xdb, 0xfc, 0x5a, 0xda, 0xc7 +}; + +/* wycheproof - misc */ +static const u8 enc_input021[] __initconst = { + 0x4e, 0xe5, 0xcd, 0xa2, 0x0d, 0x42, 0x90 +}; +static const u8 enc_output021[] __initconst = { + 0x4b, 0xd4, 0x72, 0x12, 0x94, 0x1c, 0xe3, 0x18, + 0x5f, 0x14, 0x08, 0xee, 0x7f, 0xbf, 0x18, 0xf5, + 0xab, 0xad, 0x6e, 0x22, 0x53, 0xa1, 0xba +}; +static const u8 enc_assoc021[] __initconst = { + 0x84, 0xe4, 0x6b, 0xe8, 0xc0, 0x91, 0x90, 0x53 +}; +static const u8 enc_nonce021[] __initconst = { + 0x58, 0x38, 0x93, 0x75, 0xc6, 0x9e, 0xe3, 0x98, + 0xde, 0x94, 0x83, 0x96 +}; +static const u8 enc_key021[] __initconst = { + 0xe3, 0xc0, 0x9e, 0x7f, 0xab, 0x1a, 0xef, 0xb5, + 0x16, 0xda, 0x6a, 0x33, 0x02, 0x2a, 0x1d, 0xd4, + 0xeb, 0x27, 0x2c, 0x80, 0xd5, 0x40, 0xc5, 0xda, + 0x52, 0xa7, 0x30, 0xf3, 0x4d, 0x84, 0x0d, 0x7f +}; + +/* wycheproof - misc */ +static const u8 enc_input022[] __initconst = { + 0xbe, 0x33, 0x08, 0xf7, 0x2a, 0x2c, 0x6a, 0xed +}; +static const u8 enc_output022[] __initconst = { + 0x8e, 0x94, 0x39, 0xa5, 0x6e, 0xee, 0xc8, 0x17, + 0xfb, 0xe8, 0xa6, 0xed, 0x8f, 0xab, 0xb1, 0x93, + 0x75, 0x39, 0xdd, 0x6c, 0x00, 0xe9, 0x00, 0x21 +}; +static const u8 enc_assoc022[] __initconst = { }; +static const u8 enc_nonce022[] __initconst = { + 0x4f, 0x07, 0xaf, 0xed, 0xfd, 0xc3, 0xb6, 0xc2, + 0x36, 0x18, 0x23, 0xd3 +}; +static const u8 enc_key022[] __initconst = { + 0x51, 0xe4, 0xbf, 0x2b, 0xad, 0x92, 0xb7, 0xaf, + 0xf1, 0xa4, 0xbc, 0x05, 0x55, 0x0b, 0xa8, 0x1d, + 0xf4, 0xb9, 0x6f, 0xab, 0xf4, 0x1c, 0x12, 0xc7, + 0xb0, 0x0e, 0x60, 0xe4, 0x8d, 0xb7, 0xe1, 0x52 +}; + +/* wycheproof - misc */ +static const u8 enc_input023[] __initconst = { + 0xa4, 0xc9, 0xc2, 0x80, 0x1b, 0x71, 0xf7, 0xdf +}; +static const u8 enc_output023[] __initconst = { + 0xb9, 0xb9, 0x10, 0x43, 0x3a, 0xf0, 0x52, 0xb0, + 0x45, 0x30, 0xf5, 0x1a, 0xee, 0xe0, 0x24, 0xe0, + 0xa4, 0x45, 0xa6, 0x32, 0x8f, 0xa6, 0x7a, 0x18 +}; +static const u8 enc_assoc023[] __initconst = { + 0x66, 0xc0, 0xae, 0x70, 0x07, 0x6c, 0xb1, 0x4d +}; +static const u8 enc_nonce023[] __initconst = { + 0xb4, 0xea, 0x66, 0x6e, 0xe1, 0x19, 0x56, 0x33, + 0x66, 0x48, 0x4a, 0x78 +}; +static const u8 enc_key023[] __initconst = { + 0x11, 0x31, 0xc1, 0x41, 0x85, 0x77, 0xa0, 0x54, + 0xde, 0x7a, 0x4a, 0xc5, 0x51, 0x95, 0x0f, 0x1a, + 0x05, 0x3f, 0x9a, 0xe4, 0x6e, 0x5b, 0x75, 0xfe, + 0x4a, 0xbd, 0x56, 0x08, 0xd7, 0xcd, 0xda, 0xdd +}; + +/* wycheproof - misc */ +static const u8 enc_input024[] __initconst = { + 0x42, 0xba, 0xae, 0x59, 0x78, 0xfe, 0xaf, 0x5c, + 0x36, 0x8d, 0x14, 0xe0 +}; +static const u8 enc_output024[] __initconst = { + 0xff, 0x7d, 0xc2, 0x03, 0xb2, 0x6c, 0x46, 0x7a, + 0x6b, 0x50, 0xdb, 0x33, 0x57, 0x8c, 0x0f, 0x27, + 0x58, 0xc2, 0xe1, 0x4e, 0x36, 0xd4, 0xfc, 0x10, + 0x6d, 0xcb, 0x29, 0xb4 +}; +static const u8 enc_assoc024[] __initconst = { }; +static const u8 enc_nonce024[] __initconst = { + 0x9a, 0x59, 0xfc, 0xe2, 0x6d, 0xf0, 0x00, 0x5e, + 0x07, 0x53, 0x86, 0x56 +}; +static const u8 enc_key024[] __initconst = { + 0x99, 0xb6, 0x2b, 0xd5, 0xaf, 0xbe, 0x3f, 0xb0, + 0x15, 0xbd, 0xe9, 0x3f, 0x0a, 0xbf, 0x48, 0x39, + 0x57, 0xa1, 0xc3, 0xeb, 0x3c, 0xa5, 0x9c, 0xb5, + 0x0b, 0x39, 0xf7, 0xf8, 0xa9, 0xcc, 0x51, 0xbe +}; + +/* wycheproof - misc */ +static const u8 enc_input025[] __initconst = { + 0xfd, 0xc8, 0x5b, 0x94, 0xa4, 0xb2, 0xa6, 0xb7, + 0x59, 0xb1, 0xa0, 0xda +}; +static const u8 enc_output025[] __initconst = { + 0x9f, 0x88, 0x16, 0xde, 0x09, 0x94, 0xe9, 0x38, + 0xd9, 0xe5, 0x3f, 0x95, 0xd0, 0x86, 0xfc, 0x6c, + 0x9d, 0x8f, 0xa9, 0x15, 0xfd, 0x84, 0x23, 0xa7, + 0xcf, 0x05, 0x07, 0x2f +}; +static const u8 enc_assoc025[] __initconst = { + 0xa5, 0x06, 0xe1, 0xa5, 0xc6, 0x90, 0x93, 0xf9 +}; +static const u8 enc_nonce025[] __initconst = { + 0x58, 0xdb, 0xd4, 0xad, 0x2c, 0x4a, 0xd3, 0x5d, + 0xd9, 0x06, 0xe9, 0xce +}; +static const u8 enc_key025[] __initconst = { + 0x85, 0xf3, 0x5b, 0x62, 0x82, 0xcf, 0xf4, 0x40, + 0xbc, 0x10, 0x20, 0xc8, 0x13, 0x6f, 0xf2, 0x70, + 0x31, 0x11, 0x0f, 0xa6, 0x3e, 0xc1, 0x6f, 0x1e, + 0x82, 0x51, 0x18, 0xb0, 0x06, 0xb9, 0x12, 0x57 +}; + +/* wycheproof - misc */ +static const u8 enc_input026[] __initconst = { + 0x51, 0xf8, 0xc1, 0xf7, 0x31, 0xea, 0x14, 0xac, + 0xdb, 0x21, 0x0a, 0x6d, 0x97, 0x3e, 0x07 +}; +static const u8 enc_output026[] __initconst = { + 0x0b, 0x29, 0x63, 0x8e, 0x1f, 0xbd, 0xd6, 0xdf, + 0x53, 0x97, 0x0b, 0xe2, 0x21, 0x00, 0x42, 0x2a, + 0x91, 0x34, 0x08, 0x7d, 0x67, 0xa4, 0x6e, 0x79, + 0x17, 0x8d, 0x0a, 0x93, 0xf5, 0xe1, 0xd2 +}; +static const u8 enc_assoc026[] __initconst = { }; +static const u8 enc_nonce026[] __initconst = { + 0x68, 0xab, 0x7f, 0xdb, 0xf6, 0x19, 0x01, 0xda, + 0xd4, 0x61, 0xd2, 0x3c +}; +static const u8 enc_key026[] __initconst = { + 0x67, 0x11, 0x96, 0x27, 0xbd, 0x98, 0x8e, 0xda, + 0x90, 0x62, 0x19, 0xe0, 0x8c, 0x0d, 0x0d, 0x77, + 0x9a, 0x07, 0xd2, 0x08, 0xce, 0x8a, 0x4f, 0xe0, + 0x70, 0x9a, 0xf7, 0x55, 0xee, 0xec, 0x6d, 0xcb +}; + +/* wycheproof - misc */ +static const u8 enc_input027[] __initconst = { + 0x97, 0x46, 0x9d, 0xa6, 0x67, 0xd6, 0x11, 0x0f, + 0x9c, 0xbd, 0xa1, 0xd1, 0xa2, 0x06, 0x73 +}; +static const u8 enc_output027[] __initconst = { + 0x32, 0xdb, 0x66, 0xc4, 0xa3, 0x81, 0x9d, 0x81, + 0x55, 0x74, 0x55, 0xe5, 0x98, 0x0f, 0xed, 0xfe, + 0xae, 0x30, 0xde, 0xc9, 0x4e, 0x6a, 0xd3, 0xa9, + 0xee, 0xa0, 0x6a, 0x0d, 0x70, 0x39, 0x17 +}; +static const u8 enc_assoc027[] __initconst = { + 0x64, 0x53, 0xa5, 0x33, 0x84, 0x63, 0x22, 0x12 +}; +static const u8 enc_nonce027[] __initconst = { + 0xd9, 0x5b, 0x32, 0x43, 0xaf, 0xae, 0xf7, 0x14, + 0xc5, 0x03, 0x5b, 0x6a +}; +static const u8 enc_key027[] __initconst = { + 0xe6, 0xf1, 0x11, 0x8d, 0x41, 0xe4, 0xb4, 0x3f, + 0xb5, 0x82, 0x21, 0xb7, 0xed, 0x79, 0x67, 0x38, + 0x34, 0xe0, 0xd8, 0xac, 0x5c, 0x4f, 0xa6, 0x0b, + 0xbc, 0x8b, 0xc4, 0x89, 0x3a, 0x58, 0x89, 0x4d +}; + +/* wycheproof - misc */ +static const u8 enc_input028[] __initconst = { + 0x54, 0x9b, 0x36, 0x5a, 0xf9, 0x13, 0xf3, 0xb0, + 0x81, 0x13, 0x1c, 0xcb, 0x6b, 0x82, 0x55, 0x88 +}; +static const u8 enc_output028[] __initconst = { + 0xe9, 0x11, 0x0e, 0x9f, 0x56, 0xab, 0x3c, 0xa4, + 0x83, 0x50, 0x0c, 0xea, 0xba, 0xb6, 0x7a, 0x13, + 0x83, 0x6c, 0xca, 0xbf, 0x15, 0xa6, 0xa2, 0x2a, + 0x51, 0xc1, 0x07, 0x1c, 0xfa, 0x68, 0xfa, 0x0c +}; +static const u8 enc_assoc028[] __initconst = { }; +static const u8 enc_nonce028[] __initconst = { + 0x2f, 0xcb, 0x1b, 0x38, 0xa9, 0x9e, 0x71, 0xb8, + 0x47, 0x40, 0xad, 0x9b +}; +static const u8 enc_key028[] __initconst = { + 0x59, 0xd4, 0xea, 0xfb, 0x4d, 0xe0, 0xcf, 0xc7, + 0xd3, 0xdb, 0x99, 0xa8, 0xf5, 0x4b, 0x15, 0xd7, + 0xb3, 0x9f, 0x0a, 0xcc, 0x8d, 0xa6, 0x97, 0x63, + 0xb0, 0x19, 0xc1, 0x69, 0x9f, 0x87, 0x67, 0x4a +}; + +/* wycheproof - misc */ +static const u8 enc_input029[] __initconst = { + 0x55, 0xa4, 0x65, 0x64, 0x4f, 0x5b, 0x65, 0x09, + 0x28, 0xcb, 0xee, 0x7c, 0x06, 0x32, 0x14, 0xd6 +}; +static const u8 enc_output029[] __initconst = { + 0xe4, 0xb1, 0x13, 0xcb, 0x77, 0x59, 0x45, 0xf3, + 0xd3, 0xa8, 0xae, 0x9e, 0xc1, 0x41, 0xc0, 0x0c, + 0x7c, 0x43, 0xf1, 0x6c, 0xe0, 0x96, 0xd0, 0xdc, + 0x27, 0xc9, 0x58, 0x49, 0xdc, 0x38, 0x3b, 0x7d +}; +static const u8 enc_assoc029[] __initconst = { + 0x03, 0x45, 0x85, 0x62, 0x1a, 0xf8, 0xd7, 0xff +}; +static const u8 enc_nonce029[] __initconst = { + 0x11, 0x8a, 0x69, 0x64, 0xc2, 0xd3, 0xe3, 0x80, + 0x07, 0x1f, 0x52, 0x66 +}; +static const u8 enc_key029[] __initconst = { + 0xb9, 0x07, 0xa4, 0x50, 0x75, 0x51, 0x3f, 0xe8, + 0xa8, 0x01, 0x9e, 0xde, 0xe3, 0xf2, 0x59, 0x14, + 0x87, 0xb2, 0xa0, 0x30, 0xb0, 0x3c, 0x6e, 0x1d, + 0x77, 0x1c, 0x86, 0x25, 0x71, 0xd2, 0xea, 0x1e +}; + +/* wycheproof - misc */ +static const u8 enc_input030[] __initconst = { + 0x3f, 0xf1, 0x51, 0x4b, 0x1c, 0x50, 0x39, 0x15, + 0x91, 0x8f, 0x0c, 0x0c, 0x31, 0x09, 0x4a, 0x6e, + 0x1f +}; +static const u8 enc_output030[] __initconst = { + 0x02, 0xcc, 0x3a, 0xcb, 0x5e, 0xe1, 0xfc, 0xdd, + 0x12, 0xa0, 0x3b, 0xb8, 0x57, 0x97, 0x64, 0x74, + 0xd3, 0xd8, 0x3b, 0x74, 0x63, 0xa2, 0xc3, 0x80, + 0x0f, 0xe9, 0x58, 0xc2, 0x8e, 0xaa, 0x29, 0x08, + 0x13 +}; +static const u8 enc_assoc030[] __initconst = { }; +static const u8 enc_nonce030[] __initconst = { + 0x45, 0xaa, 0xa3, 0xe5, 0xd1, 0x6d, 0x2d, 0x42, + 0xdc, 0x03, 0x44, 0x5d +}; +static const u8 enc_key030[] __initconst = { + 0x3b, 0x24, 0x58, 0xd8, 0x17, 0x6e, 0x16, 0x21, + 0xc0, 0xcc, 0x24, 0xc0, 0xc0, 0xe2, 0x4c, 0x1e, + 0x80, 0xd7, 0x2f, 0x7e, 0xe9, 0x14, 0x9a, 0x4b, + 0x16, 0x61, 0x76, 0x62, 0x96, 0x16, 0xd0, 0x11 +}; + +/* wycheproof - misc */ +static const u8 enc_input031[] __initconst = { + 0x63, 0x85, 0x8c, 0xa3, 0xe2, 0xce, 0x69, 0x88, + 0x7b, 0x57, 0x8a, 0x3c, 0x16, 0x7b, 0x42, 0x1c, + 0x9c +}; +static const u8 enc_output031[] __initconst = { + 0x35, 0x76, 0x64, 0x88, 0xd2, 0xbc, 0x7c, 0x2b, + 0x8d, 0x17, 0xcb, 0xbb, 0x9a, 0xbf, 0xad, 0x9e, + 0x6d, 0x1f, 0x39, 0x1e, 0x65, 0x7b, 0x27, 0x38, + 0xdd, 0xa0, 0x84, 0x48, 0xcb, 0xa2, 0x81, 0x1c, + 0xeb +}; +static const u8 enc_assoc031[] __initconst = { + 0x9a, 0xaf, 0x29, 0x9e, 0xee, 0xa7, 0x8f, 0x79 +}; +static const u8 enc_nonce031[] __initconst = { + 0xf0, 0x38, 0x4f, 0xb8, 0x76, 0x12, 0x14, 0x10, + 0x63, 0x3d, 0x99, 0x3d +}; +static const u8 enc_key031[] __initconst = { + 0xf6, 0x0c, 0x6a, 0x1b, 0x62, 0x57, 0x25, 0xf7, + 0x6c, 0x70, 0x37, 0xb4, 0x8f, 0xe3, 0x57, 0x7f, + 0xa7, 0xf7, 0xb8, 0x7b, 0x1b, 0xd5, 0xa9, 0x82, + 0x17, 0x6d, 0x18, 0x23, 0x06, 0xff, 0xb8, 0x70 +}; + +/* wycheproof - misc */ +static const u8 enc_input032[] __initconst = { + 0x10, 0xf1, 0xec, 0xf9, 0xc6, 0x05, 0x84, 0x66, + 0x5d, 0x9a, 0xe5, 0xef, 0xe2, 0x79, 0xe7, 0xf7, + 0x37, 0x7e, 0xea, 0x69, 0x16, 0xd2, 0xb1, 0x11 +}; +static const u8 enc_output032[] __initconst = { + 0x42, 0xf2, 0x6c, 0x56, 0xcb, 0x4b, 0xe2, 0x1d, + 0x9d, 0x8d, 0x0c, 0x80, 0xfc, 0x99, 0xdd, 0xe0, + 0x0d, 0x75, 0xf3, 0x80, 0x74, 0xbf, 0xe7, 0x64, + 0x54, 0xaa, 0x7e, 0x13, 0xd4, 0x8f, 0xff, 0x7d, + 0x75, 0x57, 0x03, 0x94, 0x57, 0x04, 0x0a, 0x3a +}; +static const u8 enc_assoc032[] __initconst = { }; +static const u8 enc_nonce032[] __initconst = { + 0xe6, 0xb1, 0xad, 0xf2, 0xfd, 0x58, 0xa8, 0x76, + 0x2c, 0x65, 0xf3, 0x1b +}; +static const u8 enc_key032[] __initconst = { + 0x02, 0x12, 0xa8, 0xde, 0x50, 0x07, 0xed, 0x87, + 0xb3, 0x3f, 0x1a, 0x70, 0x90, 0xb6, 0x11, 0x4f, + 0x9e, 0x08, 0xce, 0xfd, 0x96, 0x07, 0xf2, 0xc2, + 0x76, 0xbd, 0xcf, 0xdb, 0xc5, 0xce, 0x9c, 0xd7 +}; + +/* wycheproof - misc */ +static const u8 enc_input033[] __initconst = { + 0x92, 0x22, 0xf9, 0x01, 0x8e, 0x54, 0xfd, 0x6d, + 0xe1, 0x20, 0x08, 0x06, 0xa9, 0xee, 0x8e, 0x4c, + 0xc9, 0x04, 0xd2, 0x9f, 0x25, 0xcb, 0xa1, 0x93 +}; +static const u8 enc_output033[] __initconst = { + 0x12, 0x30, 0x32, 0x43, 0x7b, 0x4b, 0xfd, 0x69, + 0x20, 0xe8, 0xf7, 0xe7, 0xe0, 0x08, 0x7a, 0xe4, + 0x88, 0x9e, 0xbe, 0x7a, 0x0a, 0xd0, 0xe9, 0x00, + 0x3c, 0xf6, 0x8f, 0x17, 0x95, 0x50, 0xda, 0x63, + 0xd3, 0xb9, 0x6c, 0x2d, 0x55, 0x41, 0x18, 0x65 +}; +static const u8 enc_assoc033[] __initconst = { + 0x3e, 0x8b, 0xc5, 0xad, 0xe1, 0x82, 0xff, 0x08 +}; +static const u8 enc_nonce033[] __initconst = { + 0x6b, 0x28, 0x2e, 0xbe, 0xcc, 0x54, 0x1b, 0xcd, + 0x78, 0x34, 0xed, 0x55 +}; +static const u8 enc_key033[] __initconst = { + 0xc5, 0xbc, 0x09, 0x56, 0x56, 0x46, 0xe7, 0xed, + 0xda, 0x95, 0x4f, 0x1f, 0x73, 0x92, 0x23, 0xda, + 0xda, 0x20, 0xb9, 0x5c, 0x44, 0xab, 0x03, 0x3d, + 0x0f, 0xae, 0x4b, 0x02, 0x83, 0xd1, 0x8b, 0xe3 +}; + +/* wycheproof - misc */ +static const u8 enc_input034[] __initconst = { + 0xb0, 0x53, 0x99, 0x92, 0x86, 0xa2, 0x82, 0x4f, + 0x42, 0xcc, 0x8c, 0x20, 0x3a, 0xb2, 0x4e, 0x2c, + 0x97, 0xa6, 0x85, 0xad, 0xcc, 0x2a, 0xd3, 0x26, + 0x62, 0x55, 0x8e, 0x55, 0xa5, 0xc7, 0x29 +}; +static const u8 enc_output034[] __initconst = { + 0x45, 0xc7, 0xd6, 0xb5, 0x3a, 0xca, 0xd4, 0xab, + 0xb6, 0x88, 0x76, 0xa6, 0xe9, 0x6a, 0x48, 0xfb, + 0x59, 0x52, 0x4d, 0x2c, 0x92, 0xc9, 0xd8, 0xa1, + 0x89, 0xc9, 0xfd, 0x2d, 0xb9, 0x17, 0x46, 0x56, + 0x6d, 0x3c, 0xa1, 0x0e, 0x31, 0x1b, 0x69, 0x5f, + 0x3e, 0xae, 0x15, 0x51, 0x65, 0x24, 0x93 +}; +static const u8 enc_assoc034[] __initconst = { }; +static const u8 enc_nonce034[] __initconst = { + 0x04, 0xa9, 0xbe, 0x03, 0x50, 0x8a, 0x5f, 0x31, + 0x37, 0x1a, 0x6f, 0xd2 +}; +static const u8 enc_key034[] __initconst = { + 0x2e, 0xb5, 0x1c, 0x46, 0x9a, 0xa8, 0xeb, 0x9e, + 0x6c, 0x54, 0xa8, 0x34, 0x9b, 0xae, 0x50, 0xa2, + 0x0f, 0x0e, 0x38, 0x27, 0x11, 0xbb, 0xa1, 0x15, + 0x2c, 0x42, 0x4f, 0x03, 0xb6, 0x67, 0x1d, 0x71 +}; + +/* wycheproof - misc */ +static const u8 enc_input035[] __initconst = { + 0xf4, 0x52, 0x06, 0xab, 0xc2, 0x55, 0x52, 0xb2, + 0xab, 0xc9, 0xab, 0x7f, 0xa2, 0x43, 0x03, 0x5f, + 0xed, 0xaa, 0xdd, 0xc3, 0xb2, 0x29, 0x39, 0x56, + 0xf1, 0xea, 0x6e, 0x71, 0x56, 0xe7, 0xeb +}; +static const u8 enc_output035[] __initconst = { + 0x46, 0xa8, 0x0c, 0x41, 0x87, 0x02, 0x47, 0x20, + 0x08, 0x46, 0x27, 0x58, 0x00, 0x80, 0xdd, 0xe5, + 0xa3, 0xf4, 0xa1, 0x10, 0x93, 0xa7, 0x07, 0x6e, + 0xd6, 0xf3, 0xd3, 0x26, 0xbc, 0x7b, 0x70, 0x53, + 0x4d, 0x4a, 0xa2, 0x83, 0x5a, 0x52, 0xe7, 0x2d, + 0x14, 0xdf, 0x0e, 0x4f, 0x47, 0xf2, 0x5f +}; +static const u8 enc_assoc035[] __initconst = { + 0x37, 0x46, 0x18, 0xa0, 0x6e, 0xa9, 0x8a, 0x48 +}; +static const u8 enc_nonce035[] __initconst = { + 0x47, 0x0a, 0x33, 0x9e, 0xcb, 0x32, 0x19, 0xb8, + 0xb8, 0x1a, 0x1f, 0x8b +}; +static const u8 enc_key035[] __initconst = { + 0x7f, 0x5b, 0x74, 0xc0, 0x7e, 0xd1, 0xb4, 0x0f, + 0xd1, 0x43, 0x58, 0xfe, 0x2f, 0xf2, 0xa7, 0x40, + 0xc1, 0x16, 0xc7, 0x70, 0x65, 0x10, 0xe6, 0xa4, + 0x37, 0xf1, 0x9e, 0xa4, 0x99, 0x11, 0xce, 0xc4 +}; + +/* wycheproof - misc */ +static const u8 enc_input036[] __initconst = { + 0xb9, 0xc5, 0x54, 0xcb, 0xc3, 0x6a, 0xc1, 0x8a, + 0xe8, 0x97, 0xdf, 0x7b, 0xee, 0xca, 0xc1, 0xdb, + 0xeb, 0x4e, 0xaf, 0xa1, 0x56, 0xbb, 0x60, 0xce, + 0x2e, 0x5d, 0x48, 0xf0, 0x57, 0x15, 0xe6, 0x78 +}; +static const u8 enc_output036[] __initconst = { + 0xea, 0x29, 0xaf, 0xa4, 0x9d, 0x36, 0xe8, 0x76, + 0x0f, 0x5f, 0xe1, 0x97, 0x23, 0xb9, 0x81, 0x1e, + 0xd5, 0xd5, 0x19, 0x93, 0x4a, 0x44, 0x0f, 0x50, + 0x81, 0xac, 0x43, 0x0b, 0x95, 0x3b, 0x0e, 0x21, + 0x22, 0x25, 0x41, 0xaf, 0x46, 0xb8, 0x65, 0x33, + 0xc6, 0xb6, 0x8d, 0x2f, 0xf1, 0x08, 0xa7, 0xea +}; +static const u8 enc_assoc036[] __initconst = { }; +static const u8 enc_nonce036[] __initconst = { + 0x72, 0xcf, 0xd9, 0x0e, 0xf3, 0x02, 0x6c, 0xa2, + 0x2b, 0x7e, 0x6e, 0x6a +}; +static const u8 enc_key036[] __initconst = { + 0xe1, 0x73, 0x1d, 0x58, 0x54, 0xe1, 0xb7, 0x0c, + 0xb3, 0xff, 0xe8, 0xb7, 0x86, 0xa2, 0xb3, 0xeb, + 0xf0, 0x99, 0x43, 0x70, 0x95, 0x47, 0x57, 0xb9, + 0xdc, 0x8c, 0x7b, 0xc5, 0x35, 0x46, 0x34, 0xa3 +}; + +/* wycheproof - misc */ +static const u8 enc_input037[] __initconst = { + 0x6b, 0x26, 0x04, 0x99, 0x6c, 0xd3, 0x0c, 0x14, + 0xa1, 0x3a, 0x52, 0x57, 0xed, 0x6c, 0xff, 0xd3, + 0xbc, 0x5e, 0x29, 0xd6, 0xb9, 0x7e, 0xb1, 0x79, + 0x9e, 0xb3, 0x35, 0xe2, 0x81, 0xea, 0x45, 0x1e +}; +static const u8 enc_output037[] __initconst = { + 0x6d, 0xad, 0x63, 0x78, 0x97, 0x54, 0x4d, 0x8b, + 0xf6, 0xbe, 0x95, 0x07, 0xed, 0x4d, 0x1b, 0xb2, + 0xe9, 0x54, 0xbc, 0x42, 0x7e, 0x5d, 0xe7, 0x29, + 0xda, 0xf5, 0x07, 0x62, 0x84, 0x6f, 0xf2, 0xf4, + 0x7b, 0x99, 0x7d, 0x93, 0xc9, 0x82, 0x18, 0x9d, + 0x70, 0x95, 0xdc, 0x79, 0x4c, 0x74, 0x62, 0x32 +}; +static const u8 enc_assoc037[] __initconst = { + 0x23, 0x33, 0xe5, 0xce, 0x0f, 0x93, 0xb0, 0x59 +}; +static const u8 enc_nonce037[] __initconst = { + 0x26, 0x28, 0x80, 0xd4, 0x75, 0xf3, 0xda, 0xc5, + 0x34, 0x0d, 0xd1, 0xb8 +}; +static const u8 enc_key037[] __initconst = { + 0x27, 0xd8, 0x60, 0x63, 0x1b, 0x04, 0x85, 0xa4, + 0x10, 0x70, 0x2f, 0xea, 0x61, 0xbc, 0x87, 0x3f, + 0x34, 0x42, 0x26, 0x0c, 0xad, 0xed, 0x4a, 0xbd, + 0xe2, 0x5b, 0x78, 0x6a, 0x2d, 0x97, 0xf1, 0x45 +}; + +/* wycheproof - misc */ +static const u8 enc_input038[] __initconst = { + 0x97, 0x3d, 0x0c, 0x75, 0x38, 0x26, 0xba, 0xe4, + 0x66, 0xcf, 0x9a, 0xbb, 0x34, 0x93, 0x15, 0x2e, + 0x9d, 0xe7, 0x81, 0x9e, 0x2b, 0xd0, 0xc7, 0x11, + 0x71, 0x34, 0x6b, 0x4d, 0x2c, 0xeb, 0xf8, 0x04, + 0x1a, 0xa3, 0xce, 0xdc, 0x0d, 0xfd, 0x7b, 0x46, + 0x7e, 0x26, 0x22, 0x8b, 0xc8, 0x6c, 0x9a +}; +static const u8 enc_output038[] __initconst = { + 0xfb, 0xa7, 0x8a, 0xe4, 0xf9, 0xd8, 0x08, 0xa6, + 0x2e, 0x3d, 0xa4, 0x0b, 0xe2, 0xcb, 0x77, 0x00, + 0xc3, 0x61, 0x3d, 0x9e, 0xb2, 0xc5, 0x29, 0xc6, + 0x52, 0xe7, 0x6a, 0x43, 0x2c, 0x65, 0x8d, 0x27, + 0x09, 0x5f, 0x0e, 0xb8, 0xf9, 0x40, 0xc3, 0x24, + 0x98, 0x1e, 0xa9, 0x35, 0xe5, 0x07, 0xf9, 0x8f, + 0x04, 0x69, 0x56, 0xdb, 0x3a, 0x51, 0x29, 0x08, + 0xbd, 0x7a, 0xfc, 0x8f, 0x2a, 0xb0, 0xa9 +}; +static const u8 enc_assoc038[] __initconst = { }; +static const u8 enc_nonce038[] __initconst = { + 0xe7, 0x4a, 0x51, 0x5e, 0x7e, 0x21, 0x02, 0xb9, + 0x0b, 0xef, 0x55, 0xd2 +}; +static const u8 enc_key038[] __initconst = { + 0xcf, 0x0d, 0x40, 0xa4, 0x64, 0x4e, 0x5f, 0x51, + 0x81, 0x51, 0x65, 0xd5, 0x30, 0x1b, 0x22, 0x63, + 0x1f, 0x45, 0x44, 0xc4, 0x9a, 0x18, 0x78, 0xe3, + 0xa0, 0xa5, 0xe8, 0xe1, 0xaa, 0xe0, 0xf2, 0x64 +}; + +/* wycheproof - misc */ +static const u8 enc_input039[] __initconst = { + 0xa9, 0x89, 0x95, 0x50, 0x4d, 0xf1, 0x6f, 0x74, + 0x8b, 0xfb, 0x77, 0x85, 0xff, 0x91, 0xee, 0xb3, + 0xb6, 0x60, 0xea, 0x9e, 0xd3, 0x45, 0x0c, 0x3d, + 0x5e, 0x7b, 0x0e, 0x79, 0xef, 0x65, 0x36, 0x59, + 0xa9, 0x97, 0x8d, 0x75, 0x54, 0x2e, 0xf9, 0x1c, + 0x45, 0x67, 0x62, 0x21, 0x56, 0x40, 0xb9 +}; +static const u8 enc_output039[] __initconst = { + 0xa1, 0xff, 0xed, 0x80, 0x76, 0x18, 0x29, 0xec, + 0xce, 0x24, 0x2e, 0x0e, 0x88, 0xb1, 0x38, 0x04, + 0x90, 0x16, 0xbc, 0xa0, 0x18, 0xda, 0x2b, 0x6e, + 0x19, 0x98, 0x6b, 0x3e, 0x31, 0x8c, 0xae, 0x8d, + 0x80, 0x61, 0x98, 0xfb, 0x4c, 0x52, 0x7c, 0xc3, + 0x93, 0x50, 0xeb, 0xdd, 0xea, 0xc5, 0x73, 0xc4, + 0xcb, 0xf0, 0xbe, 0xfd, 0xa0, 0xb7, 0x02, 0x42, + 0xc6, 0x40, 0xd7, 0xcd, 0x02, 0xd7, 0xa3 +}; +static const u8 enc_assoc039[] __initconst = { + 0xb3, 0xe4, 0x06, 0x46, 0x83, 0xb0, 0x2d, 0x84 +}; +static const u8 enc_nonce039[] __initconst = { + 0xd4, 0xd8, 0x07, 0x34, 0x16, 0x83, 0x82, 0x5b, + 0x31, 0xcd, 0x4d, 0x95 +}; +static const u8 enc_key039[] __initconst = { + 0x6c, 0xbf, 0xd7, 0x1c, 0x64, 0x5d, 0x18, 0x4c, + 0xf5, 0xd2, 0x3c, 0x40, 0x2b, 0xdb, 0x0d, 0x25, + 0xec, 0x54, 0x89, 0x8c, 0x8a, 0x02, 0x73, 0xd4, + 0x2e, 0xb5, 0xbe, 0x10, 0x9f, 0xdc, 0xb2, 0xac +}; + +/* wycheproof - misc */ +static const u8 enc_input040[] __initconst = { + 0xd0, 0x96, 0x80, 0x31, 0x81, 0xbe, 0xef, 0x9e, + 0x00, 0x8f, 0xf8, 0x5d, 0x5d, 0xdc, 0x38, 0xdd, + 0xac, 0xf0, 0xf0, 0x9e, 0xe5, 0xf7, 0xe0, 0x7f, + 0x1e, 0x40, 0x79, 0xcb, 0x64, 0xd0, 0xdc, 0x8f, + 0x5e, 0x67, 0x11, 0xcd, 0x49, 0x21, 0xa7, 0x88, + 0x7d, 0xe7, 0x6e, 0x26, 0x78, 0xfd, 0xc6, 0x76, + 0x18, 0xf1, 0x18, 0x55, 0x86, 0xbf, 0xea, 0x9d, + 0x4c, 0x68, 0x5d, 0x50, 0xe4, 0xbb, 0x9a, 0x82 +}; +static const u8 enc_output040[] __initconst = { + 0x9a, 0x4e, 0xf2, 0x2b, 0x18, 0x16, 0x77, 0xb5, + 0x75, 0x5c, 0x08, 0xf7, 0x47, 0xc0, 0xf8, 0xd8, + 0xe8, 0xd4, 0xc1, 0x8a, 0x9c, 0xc2, 0x40, 0x5c, + 0x12, 0xbb, 0x51, 0xbb, 0x18, 0x72, 0xc8, 0xe8, + 0xb8, 0x77, 0x67, 0x8b, 0xec, 0x44, 0x2c, 0xfc, + 0xbb, 0x0f, 0xf4, 0x64, 0xa6, 0x4b, 0x74, 0x33, + 0x2c, 0xf0, 0x72, 0x89, 0x8c, 0x7e, 0x0e, 0xdd, + 0xf6, 0x23, 0x2e, 0xa6, 0xe2, 0x7e, 0xfe, 0x50, + 0x9f, 0xf3, 0x42, 0x7a, 0x0f, 0x32, 0xfa, 0x56, + 0x6d, 0x9c, 0xa0, 0xa7, 0x8a, 0xef, 0xc0, 0x13 +}; +static const u8 enc_assoc040[] __initconst = { }; +static const u8 enc_nonce040[] __initconst = { + 0xd6, 0x10, 0x40, 0xa3, 0x13, 0xed, 0x49, 0x28, + 0x23, 0xcc, 0x06, 0x5b +}; +static const u8 enc_key040[] __initconst = { + 0x5b, 0x1d, 0x10, 0x35, 0xc0, 0xb1, 0x7e, 0xe0, + 0xb0, 0x44, 0x47, 0x67, 0xf8, 0x0a, 0x25, 0xb8, + 0xc1, 0xb7, 0x41, 0xf4, 0xb5, 0x0a, 0x4d, 0x30, + 0x52, 0x22, 0x6b, 0xaa, 0x1c, 0x6f, 0xb7, 0x01 +}; + +/* wycheproof - misc */ +static const u8 enc_input041[] __initconst = { + 0x94, 0xee, 0x16, 0x6d, 0x6d, 0x6e, 0xcf, 0x88, + 0x32, 0x43, 0x71, 0x36, 0xb4, 0xae, 0x80, 0x5d, + 0x42, 0x88, 0x64, 0x35, 0x95, 0x86, 0xd9, 0x19, + 0x3a, 0x25, 0x01, 0x62, 0x93, 0xed, 0xba, 0x44, + 0x3c, 0x58, 0xe0, 0x7e, 0x7b, 0x71, 0x95, 0xec, + 0x5b, 0xd8, 0x45, 0x82, 0xa9, 0xd5, 0x6c, 0x8d, + 0x4a, 0x10, 0x8c, 0x7d, 0x7c, 0xe3, 0x4e, 0x6c, + 0x6f, 0x8e, 0xa1, 0xbe, 0xc0, 0x56, 0x73, 0x17 +}; +static const u8 enc_output041[] __initconst = { + 0x5f, 0xbb, 0xde, 0xcc, 0x34, 0xbe, 0x20, 0x16, + 0x14, 0xf6, 0x36, 0x03, 0x1e, 0xeb, 0x42, 0xf1, + 0xca, 0xce, 0x3c, 0x79, 0xa1, 0x2c, 0xff, 0xd8, + 0x71, 0xee, 0x8e, 0x73, 0x82, 0x0c, 0x82, 0x97, + 0x49, 0xf1, 0xab, 0xb4, 0x29, 0x43, 0x67, 0x84, + 0x9f, 0xb6, 0xc2, 0xaa, 0x56, 0xbd, 0xa8, 0xa3, + 0x07, 0x8f, 0x72, 0x3d, 0x7c, 0x1c, 0x85, 0x20, + 0x24, 0xb0, 0x17, 0xb5, 0x89, 0x73, 0xfb, 0x1e, + 0x09, 0x26, 0x3d, 0xa7, 0xb4, 0xcb, 0x92, 0x14, + 0x52, 0xf9, 0x7d, 0xca, 0x40, 0xf5, 0x80, 0xec +}; +static const u8 enc_assoc041[] __initconst = { + 0x71, 0x93, 0xf6, 0x23, 0x66, 0x33, 0x21, 0xa2 +}; +static const u8 enc_nonce041[] __initconst = { + 0xd3, 0x1c, 0x21, 0xab, 0xa1, 0x75, 0xb7, 0x0d, + 0xe4, 0xeb, 0xb1, 0x9c +}; +static const u8 enc_key041[] __initconst = { + 0x97, 0xd6, 0x35, 0xc4, 0xf4, 0x75, 0x74, 0xd9, + 0x99, 0x8a, 0x90, 0x87, 0x5d, 0xa1, 0xd3, 0xa2, + 0x84, 0xb7, 0x55, 0xb2, 0xd3, 0x92, 0x97, 0xa5, + 0x72, 0x52, 0x35, 0x19, 0x0e, 0x10, 0xa9, 0x7e +}; + +/* wycheproof - misc */ +static const u8 enc_input042[] __initconst = { + 0xb4, 0x29, 0xeb, 0x80, 0xfb, 0x8f, 0xe8, 0xba, + 0xed, 0xa0, 0xc8, 0x5b, 0x9c, 0x33, 0x34, 0x58, + 0xe7, 0xc2, 0x99, 0x2e, 0x55, 0x84, 0x75, 0x06, + 0x9d, 0x12, 0xd4, 0x5c, 0x22, 0x21, 0x75, 0x64, + 0x12, 0x15, 0x88, 0x03, 0x22, 0x97, 0xef, 0xf5, + 0x67, 0x83, 0x74, 0x2a, 0x5f, 0xc2, 0x2d, 0x74, + 0x10, 0xff, 0xb2, 0x9d, 0x66, 0x09, 0x86, 0x61, + 0xd7, 0x6f, 0x12, 0x6c, 0x3c, 0x27, 0x68, 0x9e, + 0x43, 0xb3, 0x72, 0x67, 0xca, 0xc5, 0xa3, 0xa6, + 0xd3, 0xab, 0x49, 0xe3, 0x91, 0xda, 0x29, 0xcd, + 0x30, 0x54, 0xa5, 0x69, 0x2e, 0x28, 0x07, 0xe4, + 0xc3, 0xea, 0x46, 0xc8, 0x76, 0x1d, 0x50, 0xf5, + 0x92 +}; +static const u8 enc_output042[] __initconst = { + 0xd0, 0x10, 0x2f, 0x6c, 0x25, 0x8b, 0xf4, 0x97, + 0x42, 0xce, 0xc3, 0x4c, 0xf2, 0xd0, 0xfe, 0xdf, + 0x23, 0xd1, 0x05, 0xfb, 0x4c, 0x84, 0xcf, 0x98, + 0x51, 0x5e, 0x1b, 0xc9, 0xa6, 0x4f, 0x8a, 0xd5, + 0xbe, 0x8f, 0x07, 0x21, 0xbd, 0xe5, 0x06, 0x45, + 0xd0, 0x00, 0x83, 0xc3, 0xa2, 0x63, 0xa3, 0x10, + 0x53, 0xb7, 0x60, 0x24, 0x5f, 0x52, 0xae, 0x28, + 0x66, 0xa5, 0xec, 0x83, 0xb1, 0x9f, 0x61, 0xbe, + 0x1d, 0x30, 0xd5, 0xc5, 0xd9, 0xfe, 0xcc, 0x4c, + 0xbb, 0xe0, 0x8f, 0xd3, 0x85, 0x81, 0x3a, 0x2a, + 0xa3, 0x9a, 0x00, 0xff, 0x9c, 0x10, 0xf7, 0xf2, + 0x37, 0x02, 0xad, 0xd1, 0xe4, 0xb2, 0xff, 0xa3, + 0x1c, 0x41, 0x86, 0x5f, 0xc7, 0x1d, 0xe1, 0x2b, + 0x19, 0x61, 0x21, 0x27, 0xce, 0x49, 0x99, 0x3b, + 0xb0 +}; +static const u8 enc_assoc042[] __initconst = { }; +static const u8 enc_nonce042[] __initconst = { + 0x17, 0xc8, 0x6a, 0x8a, 0xbb, 0xb7, 0xe0, 0x03, + 0xac, 0xde, 0x27, 0x99 +}; +static const u8 enc_key042[] __initconst = { + 0xfe, 0x6e, 0x55, 0xbd, 0xae, 0xd1, 0xf7, 0x28, + 0x4c, 0xa5, 0xfc, 0x0f, 0x8c, 0x5f, 0x2b, 0x8d, + 0xf5, 0x6d, 0xc0, 0xf4, 0x9e, 0x8c, 0xa6, 0x6a, + 0x41, 0x99, 0x5e, 0x78, 0x33, 0x51, 0xf9, 0x01 +}; + +/* wycheproof - misc */ +static const u8 enc_input043[] __initconst = { + 0xce, 0xb5, 0x34, 0xce, 0x50, 0xdc, 0x23, 0xff, + 0x63, 0x8a, 0xce, 0x3e, 0xf6, 0x3a, 0xb2, 0xcc, + 0x29, 0x73, 0xee, 0xad, 0xa8, 0x07, 0x85, 0xfc, + 0x16, 0x5d, 0x06, 0xc2, 0xf5, 0x10, 0x0f, 0xf5, + 0xe8, 0xab, 0x28, 0x82, 0xc4, 0x75, 0xaf, 0xcd, + 0x05, 0xcc, 0xd4, 0x9f, 0x2e, 0x7d, 0x8f, 0x55, + 0xef, 0x3a, 0x72, 0xe3, 0xdc, 0x51, 0xd6, 0x85, + 0x2b, 0x8e, 0x6b, 0x9e, 0x7a, 0xec, 0xe5, 0x7b, + 0xe6, 0x55, 0x6b, 0x0b, 0x6d, 0x94, 0x13, 0xe3, + 0x3f, 0xc5, 0xfc, 0x24, 0xa9, 0xa2, 0x05, 0xad, + 0x59, 0x57, 0x4b, 0xb3, 0x9d, 0x94, 0x4a, 0x92, + 0xdc, 0x47, 0x97, 0x0d, 0x84, 0xa6, 0xad, 0x31, + 0x76 +}; +static const u8 enc_output043[] __initconst = { + 0x75, 0x45, 0x39, 0x1b, 0x51, 0xde, 0x01, 0xd5, + 0xc5, 0x3d, 0xfa, 0xca, 0x77, 0x79, 0x09, 0x06, + 0x3e, 0x58, 0xed, 0xee, 0x4b, 0xb1, 0x22, 0x7e, + 0x71, 0x10, 0xac, 0x4d, 0x26, 0x20, 0xc2, 0xae, + 0xc2, 0xf8, 0x48, 0xf5, 0x6d, 0xee, 0xb0, 0x37, + 0xa8, 0xdc, 0xed, 0x75, 0xaf, 0xa8, 0xa6, 0xc8, + 0x90, 0xe2, 0xde, 0xe4, 0x2f, 0x95, 0x0b, 0xb3, + 0x3d, 0x9e, 0x24, 0x24, 0xd0, 0x8a, 0x50, 0x5d, + 0x89, 0x95, 0x63, 0x97, 0x3e, 0xd3, 0x88, 0x70, + 0xf3, 0xde, 0x6e, 0xe2, 0xad, 0xc7, 0xfe, 0x07, + 0x2c, 0x36, 0x6c, 0x14, 0xe2, 0xcf, 0x7c, 0xa6, + 0x2f, 0xb3, 0xd3, 0x6b, 0xee, 0x11, 0x68, 0x54, + 0x61, 0xb7, 0x0d, 0x44, 0xef, 0x8c, 0x66, 0xc5, + 0xc7, 0xbb, 0xf1, 0x0d, 0xca, 0xdd, 0x7f, 0xac, + 0xf6 +}; +static const u8 enc_assoc043[] __initconst = { + 0xa1, 0x1c, 0x40, 0xb6, 0x03, 0x76, 0x73, 0x30 +}; +static const u8 enc_nonce043[] __initconst = { + 0x46, 0x36, 0x2f, 0x45, 0xd6, 0x37, 0x9e, 0x63, + 0xe5, 0x22, 0x94, 0x60 +}; +static const u8 enc_key043[] __initconst = { + 0xaa, 0xbc, 0x06, 0x34, 0x74, 0xe6, 0x5c, 0x4c, + 0x3e, 0x9b, 0xdc, 0x48, 0x0d, 0xea, 0x97, 0xb4, + 0x51, 0x10, 0xc8, 0x61, 0x88, 0x46, 0xff, 0x6b, + 0x15, 0xbd, 0xd2, 0xa4, 0xa5, 0x68, 0x2c, 0x4e +}; + +/* wycheproof - misc */ +static const u8 enc_input044[] __initconst = { + 0xe5, 0xcc, 0xaa, 0x44, 0x1b, 0xc8, 0x14, 0x68, + 0x8f, 0x8f, 0x6e, 0x8f, 0x28, 0xb5, 0x00, 0xb2 +}; +static const u8 enc_output044[] __initconst = { + 0x7e, 0x72, 0xf5, 0xa1, 0x85, 0xaf, 0x16, 0xa6, + 0x11, 0x92, 0x1b, 0x43, 0x8f, 0x74, 0x9f, 0x0b, + 0x12, 0x42, 0xc6, 0x70, 0x73, 0x23, 0x34, 0x02, + 0x9a, 0xdf, 0xe1, 0xc5, 0x00, 0x16, 0x51, 0xe4 +}; +static const u8 enc_assoc044[] __initconst = { + 0x02 +}; +static const u8 enc_nonce044[] __initconst = { + 0x87, 0x34, 0x5f, 0x10, 0x55, 0xfd, 0x9e, 0x21, + 0x02, 0xd5, 0x06, 0x56 +}; +static const u8 enc_key044[] __initconst = { + 0x7d, 0x00, 0xb4, 0x80, 0x95, 0xad, 0xfa, 0x32, + 0x72, 0x05, 0x06, 0x07, 0xb2, 0x64, 0x18, 0x50, + 0x02, 0xba, 0x99, 0x95, 0x7c, 0x49, 0x8b, 0xe0, + 0x22, 0x77, 0x0f, 0x2c, 0xe2, 0xf3, 0x14, 0x3c +}; + +/* wycheproof - misc */ +static const u8 enc_input045[] __initconst = { + 0x02, 0xcd, 0xe1, 0x68, 0xfb, 0xa3, 0xf5, 0x44, + 0xbb, 0xd0, 0x33, 0x2f, 0x7a, 0xde, 0xad, 0xa8 +}; +static const u8 enc_output045[] __initconst = { + 0x85, 0xf2, 0x9a, 0x71, 0x95, 0x57, 0xcd, 0xd1, + 0x4d, 0x1f, 0x8f, 0xff, 0xab, 0x6d, 0x9e, 0x60, + 0x73, 0x2c, 0xa3, 0x2b, 0xec, 0xd5, 0x15, 0xa1, + 0xed, 0x35, 0x3f, 0x54, 0x2e, 0x99, 0x98, 0x58 +}; +static const u8 enc_assoc045[] __initconst = { + 0xb6, 0x48 +}; +static const u8 enc_nonce045[] __initconst = { + 0x87, 0xa3, 0x16, 0x3e, 0xc0, 0x59, 0x8a, 0xd9, + 0x5b, 0x3a, 0xa7, 0x13 +}; +static const u8 enc_key045[] __initconst = { + 0x64, 0x32, 0x71, 0x7f, 0x1d, 0xb8, 0x5e, 0x41, + 0xac, 0x78, 0x36, 0xbc, 0xe2, 0x51, 0x85, 0xa0, + 0x80, 0xd5, 0x76, 0x2b, 0x9e, 0x2b, 0x18, 0x44, + 0x4b, 0x6e, 0xc7, 0x2c, 0x3b, 0xd8, 0xe4, 0xdc +}; + +/* wycheproof - misc */ +static const u8 enc_input046[] __initconst = { + 0x16, 0xdd, 0xd2, 0x3f, 0xf5, 0x3f, 0x3d, 0x23, + 0xc0, 0x63, 0x34, 0x48, 0x70, 0x40, 0xeb, 0x47 +}; +static const u8 enc_output046[] __initconst = { + 0xc1, 0xb2, 0x95, 0x93, 0x6d, 0x56, 0xfa, 0xda, + 0xc0, 0x3e, 0x5f, 0x74, 0x2b, 0xff, 0x73, 0xa1, + 0x39, 0xc4, 0x57, 0xdb, 0xab, 0x66, 0x38, 0x2b, + 0xab, 0xb3, 0xb5, 0x58, 0x00, 0xcd, 0xa5, 0xb8 +}; +static const u8 enc_assoc046[] __initconst = { + 0xbd, 0x4c, 0xd0, 0x2f, 0xc7, 0x50, 0x2b, 0xbd, + 0xbd, 0xf6, 0xc9, 0xa3, 0xcb, 0xe8, 0xf0 +}; +static const u8 enc_nonce046[] __initconst = { + 0x6f, 0x57, 0x3a, 0xa8, 0x6b, 0xaa, 0x49, 0x2b, + 0xa4, 0x65, 0x96, 0xdf +}; +static const u8 enc_key046[] __initconst = { + 0x8e, 0x34, 0xcf, 0x73, 0xd2, 0x45, 0xa1, 0x08, + 0x2a, 0x92, 0x0b, 0x86, 0x36, 0x4e, 0xb8, 0x96, + 0xc4, 0x94, 0x64, 0x67, 0xbc, 0xb3, 0xd5, 0x89, + 0x29, 0xfc, 0xb3, 0x66, 0x90, 0xe6, 0x39, 0x4f +}; + +/* wycheproof - misc */ +static const u8 enc_input047[] __initconst = { + 0x62, 0x3b, 0x78, 0x50, 0xc3, 0x21, 0xe2, 0xcf, + 0x0c, 0x6f, 0xbc, 0xc8, 0xdf, 0xd1, 0xaf, 0xf2 +}; +static const u8 enc_output047[] __initconst = { + 0xc8, 0x4c, 0x9b, 0xb7, 0xc6, 0x1c, 0x1b, 0xcb, + 0x17, 0x77, 0x2a, 0x1c, 0x50, 0x0c, 0x50, 0x95, + 0xdb, 0xad, 0xf7, 0xa5, 0x13, 0x8c, 0xa0, 0x34, + 0x59, 0xa2, 0xcd, 0x65, 0x83, 0x1e, 0x09, 0x2f +}; +static const u8 enc_assoc047[] __initconst = { + 0x89, 0xcc, 0xe9, 0xfb, 0x47, 0x44, 0x1d, 0x07, + 0xe0, 0x24, 0x5a, 0x66, 0xfe, 0x8b, 0x77, 0x8b +}; +static const u8 enc_nonce047[] __initconst = { + 0x1a, 0x65, 0x18, 0xf0, 0x2e, 0xde, 0x1d, 0xa6, + 0x80, 0x92, 0x66, 0xd9 +}; +static const u8 enc_key047[] __initconst = { + 0xcb, 0x55, 0x75, 0xf5, 0xc7, 0xc4, 0x5c, 0x91, + 0xcf, 0x32, 0x0b, 0x13, 0x9f, 0xb5, 0x94, 0x23, + 0x75, 0x60, 0xd0, 0xa3, 0xe6, 0xf8, 0x65, 0xa6, + 0x7d, 0x4f, 0x63, 0x3f, 0x2c, 0x08, 0xf0, 0x16 +}; + +/* wycheproof - misc */ +static const u8 enc_input048[] __initconst = { + 0x87, 0xb3, 0xa4, 0xd7, 0xb2, 0x6d, 0x8d, 0x32, + 0x03, 0xa0, 0xde, 0x1d, 0x64, 0xef, 0x82, 0xe3 +}; +static const u8 enc_output048[] __initconst = { + 0x94, 0xbc, 0x80, 0x62, 0x1e, 0xd1, 0xe7, 0x1b, + 0x1f, 0xd2, 0xb5, 0xc3, 0xa1, 0x5e, 0x35, 0x68, + 0x33, 0x35, 0x11, 0x86, 0x17, 0x96, 0x97, 0x84, + 0x01, 0x59, 0x8b, 0x96, 0x37, 0x22, 0xf5, 0xb3 +}; +static const u8 enc_assoc048[] __initconst = { + 0xd1, 0x9f, 0x2d, 0x98, 0x90, 0x95, 0xf7, 0xab, + 0x03, 0xa5, 0xfd, 0xe8, 0x44, 0x16, 0xe0, 0x0c, + 0x0e +}; +static const u8 enc_nonce048[] __initconst = { + 0x56, 0x4d, 0xee, 0x49, 0xab, 0x00, 0xd2, 0x40, + 0xfc, 0x10, 0x68, 0xc3 +}; +static const u8 enc_key048[] __initconst = { + 0xa5, 0x56, 0x9e, 0x72, 0x9a, 0x69, 0xb2, 0x4b, + 0xa6, 0xe0, 0xff, 0x15, 0xc4, 0x62, 0x78, 0x97, + 0x43, 0x68, 0x24, 0xc9, 0x41, 0xe9, 0xd0, 0x0b, + 0x2e, 0x93, 0xfd, 0xdc, 0x4b, 0xa7, 0x76, 0x57 +}; + +/* wycheproof - misc */ +static const u8 enc_input049[] __initconst = { + 0xe6, 0x01, 0xb3, 0x85, 0x57, 0x79, 0x7d, 0xa2, + 0xf8, 0xa4, 0x10, 0x6a, 0x08, 0x9d, 0x1d, 0xa6 +}; +static const u8 enc_output049[] __initconst = { + 0x29, 0x9b, 0x5d, 0x3f, 0x3d, 0x03, 0xc0, 0x87, + 0x20, 0x9a, 0x16, 0xe2, 0x85, 0x14, 0x31, 0x11, + 0x4b, 0x45, 0x4e, 0xd1, 0x98, 0xde, 0x11, 0x7e, + 0x83, 0xec, 0x49, 0xfa, 0x8d, 0x85, 0x08, 0xd6 +}; +static const u8 enc_assoc049[] __initconst = { + 0x5e, 0x64, 0x70, 0xfa, 0xcd, 0x99, 0xc1, 0xd8, + 0x1e, 0x37, 0xcd, 0x44, 0x01, 0x5f, 0xe1, 0x94, + 0x80, 0xa2, 0xa4, 0xd3, 0x35, 0x2a, 0x4f, 0xf5, + 0x60, 0xc0, 0x64, 0x0f, 0xdb, 0xda +}; +static const u8 enc_nonce049[] __initconst = { + 0xdf, 0x87, 0x13, 0xe8, 0x7e, 0xc3, 0xdb, 0xcf, + 0xad, 0x14, 0xd5, 0x3e +}; +static const u8 enc_key049[] __initconst = { + 0x56, 0x20, 0x74, 0x65, 0xb4, 0xe4, 0x8e, 0x6d, + 0x04, 0x63, 0x0f, 0x4a, 0x42, 0xf3, 0x5c, 0xfc, + 0x16, 0x3a, 0xb2, 0x89, 0xc2, 0x2a, 0x2b, 0x47, + 0x84, 0xf6, 0xf9, 0x29, 0x03, 0x30, 0xbe, 0xe0 +}; + +/* wycheproof - misc */ +static const u8 enc_input050[] __initconst = { + 0xdc, 0x9e, 0x9e, 0xaf, 0x11, 0xe3, 0x14, 0x18, + 0x2d, 0xf6, 0xa4, 0xeb, 0xa1, 0x7a, 0xec, 0x9c +}; +static const u8 enc_output050[] __initconst = { + 0x60, 0x5b, 0xbf, 0x90, 0xae, 0xb9, 0x74, 0xf6, + 0x60, 0x2b, 0xc7, 0x78, 0x05, 0x6f, 0x0d, 0xca, + 0x38, 0xea, 0x23, 0xd9, 0x90, 0x54, 0xb4, 0x6b, + 0x42, 0xff, 0xe0, 0x04, 0x12, 0x9d, 0x22, 0x04 +}; +static const u8 enc_assoc050[] __initconst = { + 0xba, 0x44, 0x6f, 0x6f, 0x9a, 0x0c, 0xed, 0x22, + 0x45, 0x0f, 0xeb, 0x10, 0x73, 0x7d, 0x90, 0x07, + 0xfd, 0x69, 0xab, 0xc1, 0x9b, 0x1d, 0x4d, 0x90, + 0x49, 0xa5, 0x55, 0x1e, 0x86, 0xec, 0x2b, 0x37 +}; +static const u8 enc_nonce050[] __initconst = { + 0x8d, 0xf4, 0xb1, 0x5a, 0x88, 0x8c, 0x33, 0x28, + 0x6a, 0x7b, 0x76, 0x51 +}; +static const u8 enc_key050[] __initconst = { + 0x39, 0x37, 0x98, 0x6a, 0xf8, 0x6d, 0xaf, 0xc1, + 0xba, 0x0c, 0x46, 0x72, 0xd8, 0xab, 0xc4, 0x6c, + 0x20, 0x70, 0x62, 0x68, 0x2d, 0x9c, 0x26, 0x4a, + 0xb0, 0x6d, 0x6c, 0x58, 0x07, 0x20, 0x51, 0x30 +}; + +/* wycheproof - misc */ +static const u8 enc_input051[] __initconst = { + 0x81, 0xce, 0x84, 0xed, 0xe9, 0xb3, 0x58, 0x59, + 0xcc, 0x8c, 0x49, 0xa8, 0xf6, 0xbe, 0x7d, 0xc6 +}; +static const u8 enc_output051[] __initconst = { + 0x7b, 0x7c, 0xe0, 0xd8, 0x24, 0x80, 0x9a, 0x70, + 0xde, 0x32, 0x56, 0x2c, 0xcf, 0x2c, 0x2b, 0xbd, + 0x15, 0xd4, 0x4a, 0x00, 0xce, 0x0d, 0x19, 0xb4, + 0x23, 0x1f, 0x92, 0x1e, 0x22, 0xbc, 0x0a, 0x43 +}; +static const u8 enc_assoc051[] __initconst = { + 0xd4, 0x1a, 0x82, 0x8d, 0x5e, 0x71, 0x82, 0x92, + 0x47, 0x02, 0x19, 0x05, 0x40, 0x2e, 0xa2, 0x57, + 0xdc, 0xcb, 0xc3, 0xb8, 0x0f, 0xcd, 0x56, 0x75, + 0x05, 0x6b, 0x68, 0xbb, 0x59, 0xe6, 0x2e, 0x88, + 0x73 +}; +static const u8 enc_nonce051[] __initconst = { + 0xbe, 0x40, 0xe5, 0xf1, 0xa1, 0x18, 0x17, 0xa0, + 0xa8, 0xfa, 0x89, 0x49 +}; +static const u8 enc_key051[] __initconst = { + 0x36, 0x37, 0x2a, 0xbc, 0xdb, 0x78, 0xe0, 0x27, + 0x96, 0x46, 0xac, 0x3d, 0x17, 0x6b, 0x96, 0x74, + 0xe9, 0x15, 0x4e, 0xec, 0xf0, 0xd5, 0x46, 0x9c, + 0x65, 0x1e, 0xc7, 0xe1, 0x6b, 0x4c, 0x11, 0x99 +}; + +/* wycheproof - misc */ +static const u8 enc_input052[] __initconst = { + 0xa6, 0x67, 0x47, 0xc8, 0x9e, 0x85, 0x7a, 0xf3, + 0xa1, 0x8e, 0x2c, 0x79, 0x50, 0x00, 0x87, 0xed +}; +static const u8 enc_output052[] __initconst = { + 0xca, 0x82, 0xbf, 0xf3, 0xe2, 0xf3, 0x10, 0xcc, + 0xc9, 0x76, 0x67, 0x2c, 0x44, 0x15, 0xe6, 0x9b, + 0x57, 0x63, 0x8c, 0x62, 0xa5, 0xd8, 0x5d, 0xed, + 0x77, 0x4f, 0x91, 0x3c, 0x81, 0x3e, 0xa0, 0x32 +}; +static const u8 enc_assoc052[] __initconst = { + 0x3f, 0x2d, 0xd4, 0x9b, 0xbf, 0x09, 0xd6, 0x9a, + 0x78, 0xa3, 0xd8, 0x0e, 0xa2, 0x56, 0x66, 0x14, + 0xfc, 0x37, 0x94, 0x74, 0x19, 0x6c, 0x1a, 0xae, + 0x84, 0x58, 0x3d, 0xa7, 0x3d, 0x7f, 0xf8, 0x5c, + 0x6f, 0x42, 0xca, 0x42, 0x05, 0x6a, 0x97, 0x92, + 0xcc, 0x1b, 0x9f, 0xb3, 0xc7, 0xd2, 0x61 +}; +static const u8 enc_nonce052[] __initconst = { + 0x84, 0xc8, 0x7d, 0xae, 0x4e, 0xee, 0x27, 0x73, + 0x0e, 0xc3, 0x5d, 0x12 +}; +static const u8 enc_key052[] __initconst = { + 0x9f, 0x14, 0x79, 0xed, 0x09, 0x7d, 0x7f, 0xe5, + 0x29, 0xc1, 0x1f, 0x2f, 0x5a, 0xdd, 0x9a, 0xaf, + 0xf4, 0xa1, 0xca, 0x0b, 0x68, 0x99, 0x7a, 0x2c, + 0xb7, 0xf7, 0x97, 0x49, 0xbd, 0x90, 0xaa, 0xf4 +}; + +/* wycheproof - misc */ +static const u8 enc_input053[] __initconst = { + 0x25, 0x6d, 0x40, 0x88, 0x80, 0x94, 0x17, 0x83, + 0x55, 0xd3, 0x04, 0x84, 0x64, 0x43, 0xfe, 0xe8, + 0xdf, 0x99, 0x47, 0x03, 0x03, 0xfb, 0x3b, 0x7b, + 0x80, 0xe0, 0x30, 0xbe, 0xeb, 0xd3, 0x29, 0xbe +}; +static const u8 enc_output053[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0xe6, 0xd3, 0xd7, 0x32, 0x4a, 0x1c, 0xbb, 0xa7, + 0x77, 0xbb, 0xb0, 0xec, 0xdd, 0xa3, 0x78, 0x07 +}; +static const u8 enc_assoc053[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 +}; +static const u8 enc_nonce053[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00 +}; +static const u8 enc_key053[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - misc */ +static const u8 enc_input054[] __initconst = { + 0x25, 0x6d, 0x40, 0x88, 0x80, 0x94, 0x17, 0x83, + 0x55, 0xd3, 0x04, 0x84, 0x64, 0x43, 0xfe, 0xe8, + 0xdf, 0x99, 0x47, 0x03, 0x03, 0xfb, 0x3b, 0x7b, + 0x80, 0xe0, 0x30, 0xbe, 0xeb, 0xd3, 0x29, 0xbe, + 0xe3, 0xbc, 0xdb, 0x5b, 0x1e, 0xde, 0xfc, 0xfe, + 0x8b, 0xcd, 0xa1, 0xb6, 0xa1, 0x5c, 0x8c, 0x2b, + 0x08, 0x69, 0xff, 0xd2, 0xec, 0x5e, 0x26, 0xe5, + 0x53, 0xb7, 0xb2, 0x27, 0xfe, 0x87, 0xfd, 0xbd +}; +static const u8 enc_output054[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x06, 0x2d, 0xe6, 0x79, 0x5f, 0x27, 0x4f, 0xd2, + 0xa3, 0x05, 0xd7, 0x69, 0x80, 0xbc, 0x9c, 0xce +}; +static const u8 enc_assoc054[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 +}; +static const u8 enc_nonce054[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00 +}; +static const u8 enc_key054[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - misc */ +static const u8 enc_input055[] __initconst = { + 0x25, 0x6d, 0x40, 0x88, 0x80, 0x94, 0x17, 0x83, + 0x55, 0xd3, 0x04, 0x84, 0x64, 0x43, 0xfe, 0xe8, + 0xdf, 0x99, 0x47, 0x03, 0x03, 0xfb, 0x3b, 0x7b, + 0x80, 0xe0, 0x30, 0xbe, 0xeb, 0xd3, 0x29, 0xbe, + 0xe3, 0xbc, 0xdb, 0x5b, 0x1e, 0xde, 0xfc, 0xfe, + 0x8b, 0xcd, 0xa1, 0xb6, 0xa1, 0x5c, 0x8c, 0x2b, + 0x08, 0x69, 0xff, 0xd2, 0xec, 0x5e, 0x26, 0xe5, + 0x53, 0xb7, 0xb2, 0x27, 0xfe, 0x87, 0xfd, 0xbd, + 0x7a, 0xda, 0x44, 0x42, 0x42, 0x69, 0xbf, 0xfa, + 0x55, 0x27, 0xf2, 0x70, 0xac, 0xf6, 0x85, 0x02, + 0xb7, 0x4c, 0x5a, 0xe2, 0xe6, 0x0c, 0x05, 0x80, + 0x98, 0x1a, 0x49, 0x38, 0x45, 0x93, 0x92, 0xc4, + 0x9b, 0xb2, 0xf2, 0x84, 0xb6, 0x46, 0xef, 0xc7, + 0xf3, 0xf0, 0xb1, 0x36, 0x1d, 0xc3, 0x48, 0xed, + 0x77, 0xd3, 0x0b, 0xc5, 0x76, 0x92, 0xed, 0x38, + 0xfb, 0xac, 0x01, 0x88, 0x38, 0x04, 0x88, 0xc7 +}; +static const u8 enc_output055[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0xd8, 0xb4, 0x79, 0x02, 0xba, 0xae, 0xaf, 0xb3, + 0x42, 0x03, 0x05, 0x15, 0x29, 0xaf, 0x28, 0x2e +}; +static const u8 enc_assoc055[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 +}; +static const u8 enc_nonce055[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00 +}; +static const u8 enc_key055[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - misc */ +static const u8 enc_input056[] __initconst = { + 0xda, 0x92, 0xbf, 0x77, 0x7f, 0x6b, 0xe8, 0x7c, + 0xaa, 0x2c, 0xfb, 0x7b, 0x9b, 0xbc, 0x01, 0x17, + 0x20, 0x66, 0xb8, 0xfc, 0xfc, 0x04, 0xc4, 0x84, + 0x7f, 0x1f, 0xcf, 0x41, 0x14, 0x2c, 0xd6, 0x41 +}; +static const u8 enc_output056[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xb3, 0x89, 0x1c, 0x84, 0x9c, 0xb5, 0x2c, 0x27, + 0x74, 0x7e, 0xdf, 0xcf, 0x31, 0x21, 0x3b, 0xb6 +}; +static const u8 enc_assoc056[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce056[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00 +}; +static const u8 enc_key056[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - misc */ +static const u8 enc_input057[] __initconst = { + 0xda, 0x92, 0xbf, 0x77, 0x7f, 0x6b, 0xe8, 0x7c, + 0xaa, 0x2c, 0xfb, 0x7b, 0x9b, 0xbc, 0x01, 0x17, + 0x20, 0x66, 0xb8, 0xfc, 0xfc, 0x04, 0xc4, 0x84, + 0x7f, 0x1f, 0xcf, 0x41, 0x14, 0x2c, 0xd6, 0x41, + 0x1c, 0x43, 0x24, 0xa4, 0xe1, 0x21, 0x03, 0x01, + 0x74, 0x32, 0x5e, 0x49, 0x5e, 0xa3, 0x73, 0xd4, + 0xf7, 0x96, 0x00, 0x2d, 0x13, 0xa1, 0xd9, 0x1a, + 0xac, 0x48, 0x4d, 0xd8, 0x01, 0x78, 0x02, 0x42 +}; +static const u8 enc_output057[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xf0, 0xc1, 0x2d, 0x26, 0xef, 0x03, 0x02, 0x9b, + 0x62, 0xc0, 0x08, 0xda, 0x27, 0xc5, 0xdc, 0x68 +}; +static const u8 enc_assoc057[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce057[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00 +}; +static const u8 enc_key057[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - misc */ +static const u8 enc_input058[] __initconst = { + 0xda, 0x92, 0xbf, 0x77, 0x7f, 0x6b, 0xe8, 0x7c, + 0xaa, 0x2c, 0xfb, 0x7b, 0x9b, 0xbc, 0x01, 0x17, + 0x20, 0x66, 0xb8, 0xfc, 0xfc, 0x04, 0xc4, 0x84, + 0x7f, 0x1f, 0xcf, 0x41, 0x14, 0x2c, 0xd6, 0x41, + 0x1c, 0x43, 0x24, 0xa4, 0xe1, 0x21, 0x03, 0x01, + 0x74, 0x32, 0x5e, 0x49, 0x5e, 0xa3, 0x73, 0xd4, + 0xf7, 0x96, 0x00, 0x2d, 0x13, 0xa1, 0xd9, 0x1a, + 0xac, 0x48, 0x4d, 0xd8, 0x01, 0x78, 0x02, 0x42, + 0x85, 0x25, 0xbb, 0xbd, 0xbd, 0x96, 0x40, 0x05, + 0xaa, 0xd8, 0x0d, 0x8f, 0x53, 0x09, 0x7a, 0xfd, + 0x48, 0xb3, 0xa5, 0x1d, 0x19, 0xf3, 0xfa, 0x7f, + 0x67, 0xe5, 0xb6, 0xc7, 0xba, 0x6c, 0x6d, 0x3b, + 0x64, 0x4d, 0x0d, 0x7b, 0x49, 0xb9, 0x10, 0x38, + 0x0c, 0x0f, 0x4e, 0xc9, 0xe2, 0x3c, 0xb7, 0x12, + 0x88, 0x2c, 0xf4, 0x3a, 0x89, 0x6d, 0x12, 0xc7, + 0x04, 0x53, 0xfe, 0x77, 0xc7, 0xfb, 0x77, 0x38 +}; +static const u8 enc_output058[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xee, 0x65, 0x78, 0x30, 0x01, 0xc2, 0x56, 0x91, + 0xfa, 0x28, 0xd0, 0xf5, 0xf1, 0xc1, 0xd7, 0x62 +}; +static const u8 enc_assoc058[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce058[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00 +}; +static const u8 enc_key058[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - misc */ +static const u8 enc_input059[] __initconst = { + 0x25, 0x6d, 0x40, 0x08, 0x80, 0x94, 0x17, 0x03, + 0x55, 0xd3, 0x04, 0x04, 0x64, 0x43, 0xfe, 0x68, + 0xdf, 0x99, 0x47, 0x83, 0x03, 0xfb, 0x3b, 0xfb, + 0x80, 0xe0, 0x30, 0x3e, 0xeb, 0xd3, 0x29, 0x3e +}; +static const u8 enc_output059[] __initconst = { + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x79, 0xba, 0x7a, 0x29, 0xf5, 0xa7, 0xbb, 0x75, + 0x79, 0x7a, 0xf8, 0x7a, 0x61, 0x01, 0x29, 0xa4 +}; +static const u8 enc_assoc059[] __initconst = { + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80 +}; +static const u8 enc_nonce059[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00 +}; +static const u8 enc_key059[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - misc */ +static const u8 enc_input060[] __initconst = { + 0x25, 0x6d, 0x40, 0x08, 0x80, 0x94, 0x17, 0x03, + 0x55, 0xd3, 0x04, 0x04, 0x64, 0x43, 0xfe, 0x68, + 0xdf, 0x99, 0x47, 0x83, 0x03, 0xfb, 0x3b, 0xfb, + 0x80, 0xe0, 0x30, 0x3e, 0xeb, 0xd3, 0x29, 0x3e, + 0xe3, 0xbc, 0xdb, 0xdb, 0x1e, 0xde, 0xfc, 0x7e, + 0x8b, 0xcd, 0xa1, 0x36, 0xa1, 0x5c, 0x8c, 0xab, + 0x08, 0x69, 0xff, 0x52, 0xec, 0x5e, 0x26, 0x65, + 0x53, 0xb7, 0xb2, 0xa7, 0xfe, 0x87, 0xfd, 0x3d +}; +static const u8 enc_output060[] __initconst = { + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x36, 0xb1, 0x74, 0x38, 0x19, 0xe1, 0xb9, 0xba, + 0x15, 0x51, 0xe8, 0xed, 0x92, 0x2a, 0x95, 0x9a +}; +static const u8 enc_assoc060[] __initconst = { + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80 +}; +static const u8 enc_nonce060[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00 +}; +static const u8 enc_key060[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - misc */ +static const u8 enc_input061[] __initconst = { + 0x25, 0x6d, 0x40, 0x08, 0x80, 0x94, 0x17, 0x03, + 0x55, 0xd3, 0x04, 0x04, 0x64, 0x43, 0xfe, 0x68, + 0xdf, 0x99, 0x47, 0x83, 0x03, 0xfb, 0x3b, 0xfb, + 0x80, 0xe0, 0x30, 0x3e, 0xeb, 0xd3, 0x29, 0x3e, + 0xe3, 0xbc, 0xdb, 0xdb, 0x1e, 0xde, 0xfc, 0x7e, + 0x8b, 0xcd, 0xa1, 0x36, 0xa1, 0x5c, 0x8c, 0xab, + 0x08, 0x69, 0xff, 0x52, 0xec, 0x5e, 0x26, 0x65, + 0x53, 0xb7, 0xb2, 0xa7, 0xfe, 0x87, 0xfd, 0x3d, + 0x7a, 0xda, 0x44, 0xc2, 0x42, 0x69, 0xbf, 0x7a, + 0x55, 0x27, 0xf2, 0xf0, 0xac, 0xf6, 0x85, 0x82, + 0xb7, 0x4c, 0x5a, 0x62, 0xe6, 0x0c, 0x05, 0x00, + 0x98, 0x1a, 0x49, 0xb8, 0x45, 0x93, 0x92, 0x44, + 0x9b, 0xb2, 0xf2, 0x04, 0xb6, 0x46, 0xef, 0x47, + 0xf3, 0xf0, 0xb1, 0xb6, 0x1d, 0xc3, 0x48, 0x6d, + 0x77, 0xd3, 0x0b, 0x45, 0x76, 0x92, 0xed, 0xb8, + 0xfb, 0xac, 0x01, 0x08, 0x38, 0x04, 0x88, 0x47 +}; +static const u8 enc_output061[] __initconst = { + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0xfe, 0xac, 0x49, 0x55, 0x55, 0x4e, 0x80, 0x6f, + 0x3a, 0x19, 0x02, 0xe2, 0x44, 0x32, 0xc0, 0x8a +}; +static const u8 enc_assoc061[] __initconst = { + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80 +}; +static const u8 enc_nonce061[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00 +}; +static const u8 enc_key061[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - misc */ +static const u8 enc_input062[] __initconst = { + 0xda, 0x92, 0xbf, 0xf7, 0x7f, 0x6b, 0xe8, 0xfc, + 0xaa, 0x2c, 0xfb, 0xfb, 0x9b, 0xbc, 0x01, 0x97, + 0x20, 0x66, 0xb8, 0x7c, 0xfc, 0x04, 0xc4, 0x04, + 0x7f, 0x1f, 0xcf, 0xc1, 0x14, 0x2c, 0xd6, 0xc1 +}; +static const u8 enc_output062[] __initconst = { + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0x20, 0xa3, 0x79, 0x8d, 0xf1, 0x29, 0x2c, 0x59, + 0x72, 0xbf, 0x97, 0x41, 0xae, 0xc3, 0x8a, 0x19 +}; +static const u8 enc_assoc062[] __initconst = { + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f +}; +static const u8 enc_nonce062[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00 +}; +static const u8 enc_key062[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - misc */ +static const u8 enc_input063[] __initconst = { + 0xda, 0x92, 0xbf, 0xf7, 0x7f, 0x6b, 0xe8, 0xfc, + 0xaa, 0x2c, 0xfb, 0xfb, 0x9b, 0xbc, 0x01, 0x97, + 0x20, 0x66, 0xb8, 0x7c, 0xfc, 0x04, 0xc4, 0x04, + 0x7f, 0x1f, 0xcf, 0xc1, 0x14, 0x2c, 0xd6, 0xc1, + 0x1c, 0x43, 0x24, 0x24, 0xe1, 0x21, 0x03, 0x81, + 0x74, 0x32, 0x5e, 0xc9, 0x5e, 0xa3, 0x73, 0x54, + 0xf7, 0x96, 0x00, 0xad, 0x13, 0xa1, 0xd9, 0x9a, + 0xac, 0x48, 0x4d, 0x58, 0x01, 0x78, 0x02, 0xc2 +}; +static const u8 enc_output063[] __initconst = { + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xc0, 0x3d, 0x9f, 0x67, 0x35, 0x4a, 0x97, 0xb2, + 0xf0, 0x74, 0xf7, 0x55, 0x15, 0x57, 0xe4, 0x9c +}; +static const u8 enc_assoc063[] __initconst = { + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f +}; +static const u8 enc_nonce063[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00 +}; +static const u8 enc_key063[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - misc */ +static const u8 enc_input064[] __initconst = { + 0xda, 0x92, 0xbf, 0xf7, 0x7f, 0x6b, 0xe8, 0xfc, + 0xaa, 0x2c, 0xfb, 0xfb, 0x9b, 0xbc, 0x01, 0x97, + 0x20, 0x66, 0xb8, 0x7c, 0xfc, 0x04, 0xc4, 0x04, + 0x7f, 0x1f, 0xcf, 0xc1, 0x14, 0x2c, 0xd6, 0xc1, + 0x1c, 0x43, 0x24, 0x24, 0xe1, 0x21, 0x03, 0x81, + 0x74, 0x32, 0x5e, 0xc9, 0x5e, 0xa3, 0x73, 0x54, + 0xf7, 0x96, 0x00, 0xad, 0x13, 0xa1, 0xd9, 0x9a, + 0xac, 0x48, 0x4d, 0x58, 0x01, 0x78, 0x02, 0xc2, + 0x85, 0x25, 0xbb, 0x3d, 0xbd, 0x96, 0x40, 0x85, + 0xaa, 0xd8, 0x0d, 0x0f, 0x53, 0x09, 0x7a, 0x7d, + 0x48, 0xb3, 0xa5, 0x9d, 0x19, 0xf3, 0xfa, 0xff, + 0x67, 0xe5, 0xb6, 0x47, 0xba, 0x6c, 0x6d, 0xbb, + 0x64, 0x4d, 0x0d, 0xfb, 0x49, 0xb9, 0x10, 0xb8, + 0x0c, 0x0f, 0x4e, 0x49, 0xe2, 0x3c, 0xb7, 0x92, + 0x88, 0x2c, 0xf4, 0xba, 0x89, 0x6d, 0x12, 0x47, + 0x04, 0x53, 0xfe, 0xf7, 0xc7, 0xfb, 0x77, 0xb8 +}; +static const u8 enc_output064[] __initconst = { + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xc8, 0x6d, 0xa8, 0xdd, 0x65, 0x22, 0x86, 0xd5, + 0x02, 0x13, 0xd3, 0x28, 0xd6, 0x3e, 0x40, 0x06 +}; +static const u8 enc_assoc064[] __initconst = { + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f +}; +static const u8 enc_nonce064[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00 +}; +static const u8 enc_key064[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - misc */ +static const u8 enc_input065[] __initconst = { + 0x5a, 0x92, 0xbf, 0x77, 0xff, 0x6b, 0xe8, 0x7c, + 0x2a, 0x2c, 0xfb, 0x7b, 0x1b, 0xbc, 0x01, 0x17, + 0xa0, 0x66, 0xb8, 0xfc, 0x7c, 0x04, 0xc4, 0x84, + 0xff, 0x1f, 0xcf, 0x41, 0x94, 0x2c, 0xd6, 0x41 +}; +static const u8 enc_output065[] __initconst = { + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0xbe, 0xde, 0x90, 0x83, 0xce, 0xb3, 0x6d, 0xdf, + 0xe5, 0xfa, 0x81, 0x1f, 0x95, 0x47, 0x1c, 0x67 +}; +static const u8 enc_assoc065[] __initconst = { + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce065[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00 +}; +static const u8 enc_key065[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - misc */ +static const u8 enc_input066[] __initconst = { + 0x5a, 0x92, 0xbf, 0x77, 0xff, 0x6b, 0xe8, 0x7c, + 0x2a, 0x2c, 0xfb, 0x7b, 0x1b, 0xbc, 0x01, 0x17, + 0xa0, 0x66, 0xb8, 0xfc, 0x7c, 0x04, 0xc4, 0x84, + 0xff, 0x1f, 0xcf, 0x41, 0x94, 0x2c, 0xd6, 0x41, + 0x9c, 0x43, 0x24, 0xa4, 0x61, 0x21, 0x03, 0x01, + 0xf4, 0x32, 0x5e, 0x49, 0xde, 0xa3, 0x73, 0xd4, + 0x77, 0x96, 0x00, 0x2d, 0x93, 0xa1, 0xd9, 0x1a, + 0x2c, 0x48, 0x4d, 0xd8, 0x81, 0x78, 0x02, 0x42 +}; +static const u8 enc_output066[] __initconst = { + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x30, 0x08, 0x74, 0xbb, 0x06, 0x92, 0xb6, 0x89, + 0xde, 0xad, 0x9a, 0xe1, 0x5b, 0x06, 0x73, 0x90 +}; +static const u8 enc_assoc066[] __initconst = { + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce066[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00 +}; +static const u8 enc_key066[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - misc */ +static const u8 enc_input067[] __initconst = { + 0x5a, 0x92, 0xbf, 0x77, 0xff, 0x6b, 0xe8, 0x7c, + 0x2a, 0x2c, 0xfb, 0x7b, 0x1b, 0xbc, 0x01, 0x17, + 0xa0, 0x66, 0xb8, 0xfc, 0x7c, 0x04, 0xc4, 0x84, + 0xff, 0x1f, 0xcf, 0x41, 0x94, 0x2c, 0xd6, 0x41, + 0x9c, 0x43, 0x24, 0xa4, 0x61, 0x21, 0x03, 0x01, + 0xf4, 0x32, 0x5e, 0x49, 0xde, 0xa3, 0x73, 0xd4, + 0x77, 0x96, 0x00, 0x2d, 0x93, 0xa1, 0xd9, 0x1a, + 0x2c, 0x48, 0x4d, 0xd8, 0x81, 0x78, 0x02, 0x42, + 0x05, 0x25, 0xbb, 0xbd, 0x3d, 0x96, 0x40, 0x05, + 0x2a, 0xd8, 0x0d, 0x8f, 0xd3, 0x09, 0x7a, 0xfd, + 0xc8, 0xb3, 0xa5, 0x1d, 0x99, 0xf3, 0xfa, 0x7f, + 0xe7, 0xe5, 0xb6, 0xc7, 0x3a, 0x6c, 0x6d, 0x3b, + 0xe4, 0x4d, 0x0d, 0x7b, 0xc9, 0xb9, 0x10, 0x38, + 0x8c, 0x0f, 0x4e, 0xc9, 0x62, 0x3c, 0xb7, 0x12, + 0x08, 0x2c, 0xf4, 0x3a, 0x09, 0x6d, 0x12, 0xc7, + 0x84, 0x53, 0xfe, 0x77, 0x47, 0xfb, 0x77, 0x38 +}; +static const u8 enc_output067[] __initconst = { + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x99, 0xca, 0xd8, 0x5f, 0x45, 0xca, 0x40, 0x94, + 0x2d, 0x0d, 0x4d, 0x5e, 0x95, 0x0a, 0xde, 0x22 +}; +static const u8 enc_assoc067[] __initconst = { + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, + 0x7f, 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce067[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00 +}; +static const u8 enc_key067[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - misc */ +static const u8 enc_input068[] __initconst = { + 0x25, 0x6d, 0x40, 0x88, 0x7f, 0x6b, 0xe8, 0x7c, + 0x55, 0xd3, 0x04, 0x84, 0x9b, 0xbc, 0x01, 0x17, + 0xdf, 0x99, 0x47, 0x03, 0xfc, 0x04, 0xc4, 0x84, + 0x80, 0xe0, 0x30, 0xbe, 0x14, 0x2c, 0xd6, 0x41 +}; +static const u8 enc_output068[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x8b, 0xbe, 0x14, 0x52, 0x72, 0xe7, 0xc2, 0xd9, + 0xa1, 0x89, 0x1a, 0x3a, 0xb0, 0x98, 0x3d, 0x9d +}; +static const u8 enc_assoc068[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce068[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00 +}; +static const u8 enc_key068[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - misc */ +static const u8 enc_input069[] __initconst = { + 0x25, 0x6d, 0x40, 0x88, 0x7f, 0x6b, 0xe8, 0x7c, + 0x55, 0xd3, 0x04, 0x84, 0x9b, 0xbc, 0x01, 0x17, + 0xdf, 0x99, 0x47, 0x03, 0xfc, 0x04, 0xc4, 0x84, + 0x80, 0xe0, 0x30, 0xbe, 0x14, 0x2c, 0xd6, 0x41, + 0xe3, 0xbc, 0xdb, 0x5b, 0xe1, 0x21, 0x03, 0x01, + 0x8b, 0xcd, 0xa1, 0xb6, 0x5e, 0xa3, 0x73, 0xd4, + 0x08, 0x69, 0xff, 0xd2, 0x13, 0xa1, 0xd9, 0x1a, + 0x53, 0xb7, 0xb2, 0x27, 0x01, 0x78, 0x02, 0x42 +}; +static const u8 enc_output069[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x3b, 0x41, 0x86, 0x19, 0x13, 0xa8, 0xf6, 0xde, + 0x7f, 0x61, 0xe2, 0x25, 0x63, 0x1b, 0xc3, 0x82 +}; +static const u8 enc_assoc069[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce069[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00 +}; +static const u8 enc_key069[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - misc */ +static const u8 enc_input070[] __initconst = { + 0x25, 0x6d, 0x40, 0x88, 0x7f, 0x6b, 0xe8, 0x7c, + 0x55, 0xd3, 0x04, 0x84, 0x9b, 0xbc, 0x01, 0x17, + 0xdf, 0x99, 0x47, 0x03, 0xfc, 0x04, 0xc4, 0x84, + 0x80, 0xe0, 0x30, 0xbe, 0x14, 0x2c, 0xd6, 0x41, + 0xe3, 0xbc, 0xdb, 0x5b, 0xe1, 0x21, 0x03, 0x01, + 0x8b, 0xcd, 0xa1, 0xb6, 0x5e, 0xa3, 0x73, 0xd4, + 0x08, 0x69, 0xff, 0xd2, 0x13, 0xa1, 0xd9, 0x1a, + 0x53, 0xb7, 0xb2, 0x27, 0x01, 0x78, 0x02, 0x42, + 0x7a, 0xda, 0x44, 0x42, 0xbd, 0x96, 0x40, 0x05, + 0x55, 0x27, 0xf2, 0x70, 0x53, 0x09, 0x7a, 0xfd, + 0xb7, 0x4c, 0x5a, 0xe2, 0x19, 0xf3, 0xfa, 0x7f, + 0x98, 0x1a, 0x49, 0x38, 0xba, 0x6c, 0x6d, 0x3b, + 0x9b, 0xb2, 0xf2, 0x84, 0x49, 0xb9, 0x10, 0x38, + 0xf3, 0xf0, 0xb1, 0x36, 0xe2, 0x3c, 0xb7, 0x12, + 0x77, 0xd3, 0x0b, 0xc5, 0x89, 0x6d, 0x12, 0xc7, + 0xfb, 0xac, 0x01, 0x88, 0xc7, 0xfb, 0x77, 0x38 +}; +static const u8 enc_output070[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x84, 0x28, 0xbc, 0xf0, 0x23, 0xec, 0x6b, 0xf3, + 0x1f, 0xd9, 0xef, 0xb2, 0x03, 0xff, 0x08, 0x71 +}; +static const u8 enc_assoc070[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce070[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00 +}; +static const u8 enc_key070[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - misc */ +static const u8 enc_input071[] __initconst = { + 0xda, 0x92, 0xbf, 0x77, 0x80, 0x94, 0x17, 0x83, + 0xaa, 0x2c, 0xfb, 0x7b, 0x64, 0x43, 0xfe, 0xe8, + 0x20, 0x66, 0xb8, 0xfc, 0x03, 0xfb, 0x3b, 0x7b, + 0x7f, 0x1f, 0xcf, 0x41, 0xeb, 0xd3, 0x29, 0xbe +}; +static const u8 enc_output071[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0x13, 0x9f, 0xdf, 0x64, 0x74, 0xea, 0x24, 0xf5, + 0x49, 0xb0, 0x75, 0x82, 0x5f, 0x2c, 0x76, 0x20 +}; +static const u8 enc_assoc071[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00 +}; +static const u8 enc_nonce071[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00 +}; +static const u8 enc_key071[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - misc */ +static const u8 enc_input072[] __initconst = { + 0xda, 0x92, 0xbf, 0x77, 0x80, 0x94, 0x17, 0x83, + 0xaa, 0x2c, 0xfb, 0x7b, 0x64, 0x43, 0xfe, 0xe8, + 0x20, 0x66, 0xb8, 0xfc, 0x03, 0xfb, 0x3b, 0x7b, + 0x7f, 0x1f, 0xcf, 0x41, 0xeb, 0xd3, 0x29, 0xbe, + 0x1c, 0x43, 0x24, 0xa4, 0x1e, 0xde, 0xfc, 0xfe, + 0x74, 0x32, 0x5e, 0x49, 0xa1, 0x5c, 0x8c, 0x2b, + 0xf7, 0x96, 0x00, 0x2d, 0xec, 0x5e, 0x26, 0xe5, + 0xac, 0x48, 0x4d, 0xd8, 0xfe, 0x87, 0xfd, 0xbd +}; +static const u8 enc_output072[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xbb, 0xad, 0x8d, 0x86, 0x3b, 0x83, 0x5a, 0x8e, + 0x86, 0x64, 0xfd, 0x1d, 0x45, 0x66, 0xb6, 0xb4 +}; +static const u8 enc_assoc072[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00 +}; +static const u8 enc_nonce072[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00 +}; +static const u8 enc_key072[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - misc */ +static const u8 enc_input073[] __initconst = { + 0xda, 0x92, 0xbf, 0x77, 0x80, 0x94, 0x17, 0x83, + 0xaa, 0x2c, 0xfb, 0x7b, 0x64, 0x43, 0xfe, 0xe8, + 0x20, 0x66, 0xb8, 0xfc, 0x03, 0xfb, 0x3b, 0x7b, + 0x7f, 0x1f, 0xcf, 0x41, 0xeb, 0xd3, 0x29, 0xbe, + 0x1c, 0x43, 0x24, 0xa4, 0x1e, 0xde, 0xfc, 0xfe, + 0x74, 0x32, 0x5e, 0x49, 0xa1, 0x5c, 0x8c, 0x2b, + 0xf7, 0x96, 0x00, 0x2d, 0xec, 0x5e, 0x26, 0xe5, + 0xac, 0x48, 0x4d, 0xd8, 0xfe, 0x87, 0xfd, 0xbd, + 0x85, 0x25, 0xbb, 0xbd, 0x42, 0x69, 0xbf, 0xfa, + 0xaa, 0xd8, 0x0d, 0x8f, 0xac, 0xf6, 0x85, 0x02, + 0x48, 0xb3, 0xa5, 0x1d, 0xe6, 0x0c, 0x05, 0x80, + 0x67, 0xe5, 0xb6, 0xc7, 0x45, 0x93, 0x92, 0xc4, + 0x64, 0x4d, 0x0d, 0x7b, 0xb6, 0x46, 0xef, 0xc7, + 0x0c, 0x0f, 0x4e, 0xc9, 0x1d, 0xc3, 0x48, 0xed, + 0x88, 0x2c, 0xf4, 0x3a, 0x76, 0x92, 0xed, 0x38, + 0x04, 0x53, 0xfe, 0x77, 0x38, 0x04, 0x88, 0xc7 +}; +static const u8 enc_output073[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0x42, 0xf2, 0x35, 0x42, 0x97, 0x84, 0x9a, 0x51, + 0x1d, 0x53, 0xe5, 0x57, 0x17, 0x72, 0xf7, 0x1f +}; +static const u8 enc_assoc073[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00 +}; +static const u8 enc_nonce073[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x01, 0xee, 0x32, 0x00 +}; +static const u8 enc_key073[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - checking for int overflows */ +static const u8 enc_input074[] __initconst = { + 0xd4, 0x50, 0x0b, 0xf0, 0x09, 0x49, 0x35, 0x51, + 0xc3, 0x80, 0xad, 0xf5, 0x2c, 0x57, 0x3a, 0x69, + 0xdf, 0x7e, 0x8b, 0x76, 0x24, 0x63, 0x33, 0x0f, + 0xac, 0xc1, 0x6a, 0x57, 0x26, 0xbe, 0x71, 0x90, + 0xc6, 0x3c, 0x5a, 0x1c, 0x92, 0x65, 0x84, 0xa0, + 0x96, 0x75, 0x68, 0x28, 0xdc, 0xdc, 0x64, 0xac, + 0xdf, 0x96, 0x3d, 0x93, 0x1b, 0xf1, 0xda, 0xe2, + 0x38, 0xf3, 0xf1, 0x57, 0x22, 0x4a, 0xc4, 0xb5, + 0x42, 0xd7, 0x85, 0xb0, 0xdd, 0x84, 0xdb, 0x6b, + 0xe3, 0xbc, 0x5a, 0x36, 0x63, 0xe8, 0x41, 0x49, + 0xff, 0xbe, 0xd0, 0x9e, 0x54, 0xf7, 0x8f, 0x16, + 0xa8, 0x22, 0x3b, 0x24, 0xcb, 0x01, 0x9f, 0x58, + 0xb2, 0x1b, 0x0e, 0x55, 0x1e, 0x7a, 0xa0, 0x73, + 0x27, 0x62, 0x95, 0x51, 0x37, 0x6c, 0xcb, 0xc3, + 0x93, 0x76, 0x71, 0xa0, 0x62, 0x9b, 0xd9, 0x5c, + 0x99, 0x15, 0xc7, 0x85, 0x55, 0x77, 0x1e, 0x7a +}; +static const u8 enc_output074[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x0b, 0x30, 0x0d, 0x8d, 0xa5, 0x6c, 0x21, 0x85, + 0x75, 0x52, 0x79, 0x55, 0x3c, 0x4c, 0x82, 0xca +}; +static const u8 enc_assoc074[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce074[] __initconst = { + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x00, 0x02, 0x50, 0x6e +}; +static const u8 enc_key074[] __initconst = { + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30 +}; + +/* wycheproof - checking for int overflows */ +static const u8 enc_input075[] __initconst = { + 0x7d, 0xe8, 0x7f, 0x67, 0x29, 0x94, 0x52, 0x75, + 0xd0, 0x65, 0x5d, 0xa4, 0xc7, 0xfd, 0xe4, 0x56, + 0x9e, 0x16, 0xf1, 0x11, 0xb5, 0xeb, 0x26, 0xc2, + 0x2d, 0x85, 0x9e, 0x3f, 0xf8, 0x22, 0xec, 0xed, + 0x3a, 0x6d, 0xd9, 0xa6, 0x0f, 0x22, 0x95, 0x7f, + 0x7b, 0x7c, 0x85, 0x7e, 0x88, 0x22, 0xeb, 0x9f, + 0xe0, 0xb8, 0xd7, 0x02, 0x21, 0x41, 0xf2, 0xd0, + 0xb4, 0x8f, 0x4b, 0x56, 0x12, 0xd3, 0x22, 0xa8, + 0x8d, 0xd0, 0xfe, 0x0b, 0x4d, 0x91, 0x79, 0x32, + 0x4f, 0x7c, 0x6c, 0x9e, 0x99, 0x0e, 0xfb, 0xd8, + 0x0e, 0x5e, 0xd6, 0x77, 0x58, 0x26, 0x49, 0x8b, + 0x1e, 0xfe, 0x0f, 0x71, 0xa0, 0xf3, 0xec, 0x5b, + 0x29, 0xcb, 0x28, 0xc2, 0x54, 0x0a, 0x7d, 0xcd, + 0x51, 0xb7, 0xda, 0xae, 0xe0, 0xff, 0x4a, 0x7f, + 0x3a, 0xc1, 0xee, 0x54, 0xc2, 0x9e, 0xe4, 0xc1, + 0x70, 0xde, 0x40, 0x8f, 0x66, 0x69, 0x21, 0x94 +}; +static const u8 enc_output075[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xc5, 0x78, 0xe2, 0xaa, 0x44, 0xd3, 0x09, 0xb7, + 0xb6, 0xa5, 0x19, 0x3b, 0xdc, 0x61, 0x18, 0xf5 +}; +static const u8 enc_assoc075[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce075[] __initconst = { + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x00, 0x03, 0x18, 0xa5 +}; +static const u8 enc_key075[] __initconst = { + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30 +}; + +/* wycheproof - checking for int overflows */ +static const u8 enc_input076[] __initconst = { + 0x1b, 0x99, 0x6f, 0x9a, 0x3c, 0xcc, 0x67, 0x85, + 0xde, 0x22, 0xff, 0x5b, 0x8a, 0xdd, 0x95, 0x02, + 0xce, 0x03, 0xa0, 0xfa, 0xf5, 0x99, 0x2a, 0x09, + 0x52, 0x2c, 0xdd, 0x12, 0x06, 0xd2, 0x20, 0xb8, + 0xf8, 0xbd, 0x07, 0xd1, 0xf1, 0xf5, 0xa1, 0xbd, + 0x9a, 0x71, 0xd1, 0x1c, 0x7f, 0x57, 0x9b, 0x85, + 0x58, 0x18, 0xc0, 0x8d, 0x4d, 0xe0, 0x36, 0x39, + 0x31, 0x83, 0xb7, 0xf5, 0x90, 0xb3, 0x35, 0xae, + 0xd8, 0xde, 0x5b, 0x57, 0xb1, 0x3c, 0x5f, 0xed, + 0xe2, 0x44, 0x1c, 0x3e, 0x18, 0x4a, 0xa9, 0xd4, + 0x6e, 0x61, 0x59, 0x85, 0x06, 0xb3, 0xe1, 0x1c, + 0x43, 0xc6, 0x2c, 0xbc, 0xac, 0xec, 0xed, 0x33, + 0x19, 0x08, 0x75, 0xb0, 0x12, 0x21, 0x8b, 0x19, + 0x30, 0xfb, 0x7c, 0x38, 0xec, 0x45, 0xac, 0x11, + 0xc3, 0x53, 0xd0, 0xcf, 0x93, 0x8d, 0xcc, 0xb9, + 0xef, 0xad, 0x8f, 0xed, 0xbe, 0x46, 0xda, 0xa5 +}; +static const u8 enc_output076[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x4b, 0x0b, 0xda, 0x8a, 0xd0, 0x43, 0x83, 0x0d, + 0x83, 0x19, 0xab, 0x82, 0xc5, 0x0c, 0x76, 0x63 +}; +static const u8 enc_assoc076[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce076[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x00, 0x07, 0xb4, 0xf0 +}; +static const u8 enc_key076[] __initconst = { + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30 +}; + +/* wycheproof - checking for int overflows */ +static const u8 enc_input077[] __initconst = { + 0x86, 0xcb, 0xac, 0xae, 0x4d, 0x3f, 0x74, 0xae, + 0x01, 0x21, 0x3e, 0x05, 0x51, 0xcc, 0x15, 0x16, + 0x0e, 0xa1, 0xbe, 0x84, 0x08, 0xe3, 0xd5, 0xd7, + 0x4f, 0x01, 0x46, 0x49, 0x95, 0xa6, 0x9e, 0x61, + 0x76, 0xcb, 0x9e, 0x02, 0xb2, 0x24, 0x7e, 0xd2, + 0x99, 0x89, 0x2f, 0x91, 0x82, 0xa4, 0x5c, 0xaf, + 0x4c, 0x69, 0x40, 0x56, 0x11, 0x76, 0x6e, 0xdf, + 0xaf, 0xdc, 0x28, 0x55, 0x19, 0xea, 0x30, 0x48, + 0x0c, 0x44, 0xf0, 0x5e, 0x78, 0x1e, 0xac, 0xf8, + 0xfc, 0xec, 0xc7, 0x09, 0x0a, 0xbb, 0x28, 0xfa, + 0x5f, 0xd5, 0x85, 0xac, 0x8c, 0xda, 0x7e, 0x87, + 0x72, 0xe5, 0x94, 0xe4, 0xce, 0x6c, 0x88, 0x32, + 0x81, 0x93, 0x2e, 0x0f, 0x89, 0xf8, 0x77, 0xa1, + 0xf0, 0x4d, 0x9c, 0x32, 0xb0, 0x6c, 0xf9, 0x0b, + 0x0e, 0x76, 0x2b, 0x43, 0x0c, 0x4d, 0x51, 0x7c, + 0x97, 0x10, 0x70, 0x68, 0xf4, 0x98, 0xef, 0x7f +}; +static const u8 enc_output077[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x4b, 0xc9, 0x8f, 0x72, 0xc4, 0x94, 0xc2, 0xa4, + 0x3c, 0x2b, 0x15, 0xa1, 0x04, 0x3f, 0x1c, 0xfa +}; +static const u8 enc_assoc077[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce077[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x00, 0x20, 0xfb, 0x66 +}; +static const u8 enc_key077[] __initconst = { + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30 +}; + +/* wycheproof - checking for int overflows */ +static const u8 enc_input078[] __initconst = { + 0xfa, 0xb1, 0xcd, 0xdf, 0x4f, 0xe1, 0x98, 0xef, + 0x63, 0xad, 0xd8, 0x81, 0xd6, 0xea, 0xd6, 0xc5, + 0x76, 0x37, 0xbb, 0xe9, 0x20, 0x18, 0xca, 0x7c, + 0x0b, 0x96, 0xfb, 0xa0, 0x87, 0x1e, 0x93, 0x2d, + 0xb1, 0xfb, 0xf9, 0x07, 0x61, 0xbe, 0x25, 0xdf, + 0x8d, 0xfa, 0xf9, 0x31, 0xce, 0x57, 0x57, 0xe6, + 0x17, 0xb3, 0xd7, 0xa9, 0xf0, 0xbf, 0x0f, 0xfe, + 0x5d, 0x59, 0x1a, 0x33, 0xc1, 0x43, 0xb8, 0xf5, + 0x3f, 0xd0, 0xb5, 0xa1, 0x96, 0x09, 0xfd, 0x62, + 0xe5, 0xc2, 0x51, 0xa4, 0x28, 0x1a, 0x20, 0x0c, + 0xfd, 0xc3, 0x4f, 0x28, 0x17, 0x10, 0x40, 0x6f, + 0x4e, 0x37, 0x62, 0x54, 0x46, 0xff, 0x6e, 0xf2, + 0x24, 0x91, 0x3d, 0xeb, 0x0d, 0x89, 0xaf, 0x33, + 0x71, 0x28, 0xe3, 0xd1, 0x55, 0xd1, 0x6d, 0x3e, + 0xc3, 0x24, 0x60, 0x41, 0x43, 0x21, 0x43, 0xe9, + 0xab, 0x3a, 0x6d, 0x2c, 0xcc, 0x2f, 0x4d, 0x62 +}; +static const u8 enc_output078[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xf7, 0xe9, 0xe1, 0x51, 0xb0, 0x25, 0x33, 0xc7, + 0x46, 0x58, 0xbf, 0xc7, 0x73, 0x7c, 0x68, 0x0d +}; +static const u8 enc_assoc078[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce078[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x00, 0x38, 0xbb, 0x90 +}; +static const u8 enc_key078[] __initconst = { + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30 +}; + +/* wycheproof - checking for int overflows */ +static const u8 enc_input079[] __initconst = { + 0x22, 0x72, 0x02, 0xbe, 0x7f, 0x35, 0x15, 0xe9, + 0xd1, 0xc0, 0x2e, 0xea, 0x2f, 0x19, 0x50, 0xb6, + 0x48, 0x1b, 0x04, 0x8a, 0x4c, 0x91, 0x50, 0x6c, + 0xb4, 0x0d, 0x50, 0x4e, 0x6c, 0x94, 0x9f, 0x82, + 0xd1, 0x97, 0xc2, 0x5a, 0xd1, 0x7d, 0xc7, 0x21, + 0x65, 0x11, 0x25, 0x78, 0x2a, 0xc7, 0xa7, 0x12, + 0x47, 0xfe, 0xae, 0xf3, 0x2f, 0x1f, 0x25, 0x0c, + 0xe4, 0xbb, 0x8f, 0x79, 0xac, 0xaa, 0x17, 0x9d, + 0x45, 0xa7, 0xb0, 0x54, 0x5f, 0x09, 0x24, 0x32, + 0x5e, 0xfa, 0x87, 0xd5, 0xe4, 0x41, 0xd2, 0x84, + 0x78, 0xc6, 0x1f, 0x22, 0x23, 0xee, 0x67, 0xc3, + 0xb4, 0x1f, 0x43, 0x94, 0x53, 0x5e, 0x2a, 0x24, + 0x36, 0x9a, 0x2e, 0x16, 0x61, 0x3c, 0x45, 0x94, + 0x90, 0xc1, 0x4f, 0xb1, 0xd7, 0x55, 0xfe, 0x53, + 0xfb, 0xe1, 0xee, 0x45, 0xb1, 0xb2, 0x1f, 0x71, + 0x62, 0xe2, 0xfc, 0xaa, 0x74, 0x2a, 0xbe, 0xfd +}; +static const u8 enc_output079[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x79, 0x5b, 0xcf, 0xf6, 0x47, 0xc5, 0x53, 0xc2, + 0xe4, 0xeb, 0x6e, 0x0e, 0xaf, 0xd9, 0xe0, 0x4e +}; +static const u8 enc_assoc079[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce079[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x00, 0x70, 0x48, 0x4a +}; +static const u8 enc_key079[] __initconst = { + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30 +}; + +/* wycheproof - checking for int overflows */ +static const u8 enc_input080[] __initconst = { + 0xfa, 0xe5, 0x83, 0x45, 0xc1, 0x6c, 0xb0, 0xf5, + 0xcc, 0x53, 0x7f, 0x2b, 0x1b, 0x34, 0x69, 0xc9, + 0x69, 0x46, 0x3b, 0x3e, 0xa7, 0x1b, 0xcf, 0x6b, + 0x98, 0xd6, 0x69, 0xa8, 0xe6, 0x0e, 0x04, 0xfc, + 0x08, 0xd5, 0xfd, 0x06, 0x9c, 0x36, 0x26, 0x38, + 0xe3, 0x40, 0x0e, 0xf4, 0xcb, 0x24, 0x2e, 0x27, + 0xe2, 0x24, 0x5e, 0x68, 0xcb, 0x9e, 0xc5, 0x83, + 0xda, 0x53, 0x40, 0xb1, 0x2e, 0xdf, 0x42, 0x3b, + 0x73, 0x26, 0xad, 0x20, 0xfe, 0xeb, 0x57, 0xda, + 0xca, 0x2e, 0x04, 0x67, 0xa3, 0x28, 0x99, 0xb4, + 0x2d, 0xf8, 0xe5, 0x6d, 0x84, 0xe0, 0x06, 0xbc, + 0x8a, 0x7a, 0xcc, 0x73, 0x1e, 0x7c, 0x1f, 0x6b, + 0xec, 0xb5, 0x71, 0x9f, 0x70, 0x77, 0xf0, 0xd4, + 0xf4, 0xc6, 0x1a, 0xb1, 0x1e, 0xba, 0xc1, 0x00, + 0x18, 0x01, 0xce, 0x33, 0xc4, 0xe4, 0xa7, 0x7d, + 0x83, 0x1d, 0x3c, 0xe3, 0x4e, 0x84, 0x10, 0xe1 +}; +static const u8 enc_output080[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x19, 0x46, 0xd6, 0x53, 0x96, 0x0f, 0x94, 0x7a, + 0x74, 0xd3, 0xe8, 0x09, 0x3c, 0xf4, 0x85, 0x02 +}; +static const u8 enc_assoc080[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce080[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x00, 0x93, 0x2f, 0x40 +}; +static const u8 enc_key080[] __initconst = { + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30 +}; + +/* wycheproof - checking for int overflows */ +static const u8 enc_input081[] __initconst = { + 0xeb, 0xb2, 0x16, 0xdd, 0xd7, 0xca, 0x70, 0x92, + 0x15, 0xf5, 0x03, 0xdf, 0x9c, 0xe6, 0x3c, 0x5c, + 0xd2, 0x19, 0x4e, 0x7d, 0x90, 0x99, 0xe8, 0xa9, + 0x0b, 0x2a, 0xfa, 0xad, 0x5e, 0xba, 0x35, 0x06, + 0x99, 0x25, 0xa6, 0x03, 0xfd, 0xbc, 0x34, 0x1a, + 0xae, 0xd4, 0x15, 0x05, 0xb1, 0x09, 0x41, 0xfa, + 0x38, 0x56, 0xa7, 0xe2, 0x47, 0xb1, 0x04, 0x07, + 0x09, 0x74, 0x6c, 0xfc, 0x20, 0x96, 0xca, 0xa6, + 0x31, 0xb2, 0xff, 0xf4, 0x1c, 0x25, 0x05, 0x06, + 0xd8, 0x89, 0xc1, 0xc9, 0x06, 0x71, 0xad, 0xe8, + 0x53, 0xee, 0x63, 0x94, 0xc1, 0x91, 0x92, 0xa5, + 0xcf, 0x37, 0x10, 0xd1, 0x07, 0x30, 0x99, 0xe5, + 0xbc, 0x94, 0x65, 0x82, 0xfc, 0x0f, 0xab, 0x9f, + 0x54, 0x3c, 0x71, 0x6a, 0xe2, 0x48, 0x6a, 0x86, + 0x83, 0xfd, 0xca, 0x39, 0xd2, 0xe1, 0x4f, 0x23, + 0xd0, 0x0a, 0x58, 0x26, 0x64, 0xf4, 0xec, 0xb1 +}; +static const u8 enc_output081[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x36, 0xc3, 0x00, 0x29, 0x85, 0xdd, 0x21, 0xba, + 0xf8, 0x95, 0xd6, 0x33, 0x57, 0x3f, 0x12, 0xc0 +}; +static const u8 enc_assoc081[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce081[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x00, 0xe2, 0x93, 0x35 +}; +static const u8 enc_key081[] __initconst = { + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, + 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30 +}; + +/* wycheproof - checking for int overflows */ +static const u8 enc_input082[] __initconst = { + 0x40, 0x8a, 0xe6, 0xef, 0x1c, 0x7e, 0xf0, 0xfb, + 0x2c, 0x2d, 0x61, 0x08, 0x16, 0xfc, 0x78, 0x49, + 0xef, 0xa5, 0x8f, 0x78, 0x27, 0x3f, 0x5f, 0x16, + 0x6e, 0xa6, 0x5f, 0x81, 0xb5, 0x75, 0x74, 0x7d, + 0x03, 0x5b, 0x30, 0x40, 0xfe, 0xde, 0x1e, 0xb9, + 0x45, 0x97, 0x88, 0x66, 0x97, 0x88, 0x40, 0x8e, + 0x00, 0x41, 0x3b, 0x3e, 0x37, 0x6d, 0x15, 0x2d, + 0x20, 0x4a, 0xa2, 0xb7, 0xa8, 0x35, 0x58, 0xfc, + 0xd4, 0x8a, 0x0e, 0xf7, 0xa2, 0x6b, 0x1c, 0xd6, + 0xd3, 0x5d, 0x23, 0xb3, 0xf5, 0xdf, 0xe0, 0xca, + 0x77, 0xa4, 0xce, 0x32, 0xb9, 0x4a, 0xbf, 0x83, + 0xda, 0x2a, 0xef, 0xca, 0xf0, 0x68, 0x38, 0x08, + 0x79, 0xe8, 0x9f, 0xb0, 0xa3, 0x82, 0x95, 0x95, + 0xcf, 0x44, 0xc3, 0x85, 0x2a, 0xe2, 0xcc, 0x66, + 0x2b, 0x68, 0x9f, 0x93, 0x55, 0xd9, 0xc1, 0x83, + 0x80, 0x1f, 0x6a, 0xcc, 0x31, 0x3f, 0x89, 0x07 +}; +static const u8 enc_output082[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x65, 0x14, 0x51, 0x8e, 0x0a, 0x26, 0x41, 0x42, + 0xe0, 0xb7, 0x35, 0x1f, 0x96, 0x7f, 0xc2, 0xae +}; +static const u8 enc_assoc082[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce082[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x00, 0x0e, 0xf7, 0xd5 +}; +static const u8 enc_key082[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - checking for int overflows */ +static const u8 enc_input083[] __initconst = { + 0x0a, 0x0a, 0x24, 0x49, 0x9b, 0xca, 0xde, 0x58, + 0xcf, 0x15, 0x76, 0xc3, 0x12, 0xac, 0xa9, 0x84, + 0x71, 0x8c, 0xb4, 0xcc, 0x7e, 0x01, 0x53, 0xf5, + 0xa9, 0x01, 0x58, 0x10, 0x85, 0x96, 0x44, 0xdf, + 0xc0, 0x21, 0x17, 0x4e, 0x0b, 0x06, 0x0a, 0x39, + 0x74, 0x48, 0xde, 0x8b, 0x48, 0x4a, 0x86, 0x03, + 0xbe, 0x68, 0x0a, 0x69, 0x34, 0xc0, 0x90, 0x6f, + 0x30, 0xdd, 0x17, 0xea, 0xe2, 0xd4, 0xc5, 0xfa, + 0xa7, 0x77, 0xf8, 0xca, 0x53, 0x37, 0x0e, 0x08, + 0x33, 0x1b, 0x88, 0xc3, 0x42, 0xba, 0xc9, 0x59, + 0x78, 0x7b, 0xbb, 0x33, 0x93, 0x0e, 0x3b, 0x56, + 0xbe, 0x86, 0xda, 0x7f, 0x2a, 0x6e, 0xb1, 0xf9, + 0x40, 0x89, 0xd1, 0xd1, 0x81, 0x07, 0x4d, 0x43, + 0x02, 0xf8, 0xe0, 0x55, 0x2d, 0x0d, 0xe1, 0xfa, + 0xb3, 0x06, 0xa2, 0x1b, 0x42, 0xd4, 0xc3, 0xba, + 0x6e, 0x6f, 0x0c, 0xbc, 0xc8, 0x1e, 0x87, 0x7a +}; +static const u8 enc_output083[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x4c, 0x19, 0x4d, 0xa6, 0xa9, 0x9f, 0xd6, 0x5b, + 0x40, 0xe9, 0xca, 0xd7, 0x98, 0xf4, 0x4b, 0x19 +}; +static const u8 enc_assoc083[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce083[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x00, 0x3d, 0xfc, 0xe4 +}; +static const u8 enc_key083[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - checking for int overflows */ +static const u8 enc_input084[] __initconst = { + 0x4a, 0x0a, 0xaf, 0xf8, 0x49, 0x47, 0x29, 0x18, + 0x86, 0x91, 0x70, 0x13, 0x40, 0xf3, 0xce, 0x2b, + 0x8a, 0x78, 0xee, 0xd3, 0xa0, 0xf0, 0x65, 0x99, + 0x4b, 0x72, 0x48, 0x4e, 0x79, 0x91, 0xd2, 0x5c, + 0x29, 0xaa, 0x07, 0x5e, 0xb1, 0xfc, 0x16, 0xde, + 0x93, 0xfe, 0x06, 0x90, 0x58, 0x11, 0x2a, 0xb2, + 0x84, 0xa3, 0xed, 0x18, 0x78, 0x03, 0x26, 0xd1, + 0x25, 0x8a, 0x47, 0x22, 0x2f, 0xa6, 0x33, 0xd8, + 0xb2, 0x9f, 0x3b, 0xd9, 0x15, 0x0b, 0x23, 0x9b, + 0x15, 0x46, 0xc2, 0xbb, 0x9b, 0x9f, 0x41, 0x0f, + 0xeb, 0xea, 0xd3, 0x96, 0x00, 0x0e, 0xe4, 0x77, + 0x70, 0x15, 0x32, 0xc3, 0xd0, 0xf5, 0xfb, 0xf8, + 0x95, 0xd2, 0x80, 0x19, 0x6d, 0x2f, 0x73, 0x7c, + 0x5e, 0x9f, 0xec, 0x50, 0xd9, 0x2b, 0xb0, 0xdf, + 0x5d, 0x7e, 0x51, 0x3b, 0xe5, 0xb8, 0xea, 0x97, + 0x13, 0x10, 0xd5, 0xbf, 0x16, 0xba, 0x7a, 0xee +}; +static const u8 enc_output084[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xc8, 0xae, 0x77, 0x88, 0xcd, 0x28, 0x74, 0xab, + 0xc1, 0x38, 0x54, 0x1e, 0x11, 0xfd, 0x05, 0x87 +}; +static const u8 enc_assoc084[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce084[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x01, 0x84, 0x86, 0xa8 +}; +static const u8 enc_key084[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - checking for int overflows */ +static const u8 enc_input085[] __initconst = { + 0xff, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66, + 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c, + 0x78, 0x3d, 0x35, 0xf6, 0x13, 0xe6, 0xd9, 0x09, + 0x3d, 0x38, 0xe9, 0x75, 0xc3, 0x8f, 0xe3, 0xb8, + 0x9f, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca, + 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64, + 0x9c, 0xf6, 0x37, 0x81, 0x71, 0xea, 0xe4, 0x39, + 0x6e, 0xa1, 0x5d, 0xc2, 0x40, 0xd1, 0xab, 0xf4, + 0x47, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2, + 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73, + 0xd4, 0xd2, 0x06, 0x61, 0x6f, 0x92, 0x93, 0xf6, + 0x5b, 0x45, 0xdb, 0xbc, 0x74, 0xe7, 0xc2, 0xed, + 0xfb, 0xcb, 0xbf, 0x1c, 0xfb, 0x67, 0x9b, 0xb7, + 0x39, 0xa5, 0x86, 0x2d, 0xe2, 0xbc, 0xb9, 0x37, + 0xf7, 0x4d, 0x5b, 0xf8, 0x67, 0x1c, 0x5a, 0x8a, + 0x50, 0x92, 0xf6, 0x1d, 0x54, 0xc9, 0xaa, 0x5b +}; +static const u8 enc_output085[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x93, 0x3a, 0x51, 0x63, 0xc7, 0xf6, 0x23, 0x68, + 0x32, 0x7b, 0x3f, 0xbc, 0x10, 0x36, 0xc9, 0x43 +}; +static const u8 enc_assoc085[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce085[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key085[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - special case tag */ +static const u8 enc_input086[] __initconst = { + 0x9a, 0x49, 0xc4, 0x0f, 0x8b, 0x48, 0xd7, 0xc6, + 0x6d, 0x1d, 0xb4, 0xe5, 0x3f, 0x20, 0xf2, 0xdd, + 0x4a, 0xaa, 0x24, 0x1d, 0xda, 0xb2, 0x6b, 0x5b, + 0xc0, 0xe2, 0x18, 0xb7, 0x2c, 0x33, 0x90, 0xf2, + 0xdf, 0x3e, 0xbd, 0x01, 0x76, 0x70, 0x44, 0x19, + 0x97, 0x2b, 0xcd, 0xbc, 0x6b, 0xbc, 0xb3, 0xe4, + 0xe7, 0x4a, 0x71, 0x52, 0x8e, 0xf5, 0x12, 0x63, + 0xce, 0x24, 0xe0, 0xd5, 0x75, 0xe0, 0xe4, 0x4d +}; +static const u8 enc_output086[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, + 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f +}; +static const u8 enc_assoc086[] __initconst = { + 0x85, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xa6, 0x90, 0x2f, 0xcb, 0xc8, 0x83, 0xbb, 0xc1, + 0x80, 0xb2, 0x56, 0xae, 0x34, 0xad, 0x7f, 0x00 +}; +static const u8 enc_nonce086[] __initconst = { + 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, + 0x08, 0x09, 0x0a, 0x0b +}; +static const u8 enc_key086[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - special case tag */ +static const u8 enc_input087[] __initconst = { + 0x9a, 0x49, 0xc4, 0x0f, 0x8b, 0x48, 0xd7, 0xc6, + 0x6d, 0x1d, 0xb4, 0xe5, 0x3f, 0x20, 0xf2, 0xdd, + 0x4a, 0xaa, 0x24, 0x1d, 0xda, 0xb2, 0x6b, 0x5b, + 0xc0, 0xe2, 0x18, 0xb7, 0x2c, 0x33, 0x90, 0xf2, + 0xdf, 0x3e, 0xbd, 0x01, 0x76, 0x70, 0x44, 0x19, + 0x97, 0x2b, 0xcd, 0xbc, 0x6b, 0xbc, 0xb3, 0xe4, + 0xe7, 0x4a, 0x71, 0x52, 0x8e, 0xf5, 0x12, 0x63, + 0xce, 0x24, 0xe0, 0xd5, 0x75, 0xe0, 0xe4, 0x4d +}; +static const u8 enc_output087[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 +}; +static const u8 enc_assoc087[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x24, 0x7e, 0x50, 0x64, 0x2a, 0x1c, 0x0a, 0x2f, + 0x8f, 0x77, 0x21, 0x96, 0x09, 0xdb, 0xa9, 0x58 +}; +static const u8 enc_nonce087[] __initconst = { + 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, + 0x08, 0x09, 0x0a, 0x0b +}; +static const u8 enc_key087[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - special case tag */ +static const u8 enc_input088[] __initconst = { + 0x9a, 0x49, 0xc4, 0x0f, 0x8b, 0x48, 0xd7, 0xc6, + 0x6d, 0x1d, 0xb4, 0xe5, 0x3f, 0x20, 0xf2, 0xdd, + 0x4a, 0xaa, 0x24, 0x1d, 0xda, 0xb2, 0x6b, 0x5b, + 0xc0, 0xe2, 0x18, 0xb7, 0x2c, 0x33, 0x90, 0xf2, + 0xdf, 0x3e, 0xbd, 0x01, 0x76, 0x70, 0x44, 0x19, + 0x97, 0x2b, 0xcd, 0xbc, 0x6b, 0xbc, 0xb3, 0xe4, + 0xe7, 0x4a, 0x71, 0x52, 0x8e, 0xf5, 0x12, 0x63, + 0xce, 0x24, 0xe0, 0xd5, 0x75, 0xe0, 0xe4, 0x4d +}; +static const u8 enc_output088[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_assoc088[] __initconst = { + 0x7c, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xd9, 0xe7, 0x2c, 0x06, 0x4a, 0xc8, 0x96, 0x1f, + 0x3f, 0xa5, 0x85, 0xe0, 0xe2, 0xab, 0xd6, 0x00 +}; +static const u8 enc_nonce088[] __initconst = { + 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, + 0x08, 0x09, 0x0a, 0x0b +}; +static const u8 enc_key088[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - special case tag */ +static const u8 enc_input089[] __initconst = { + 0x9a, 0x49, 0xc4, 0x0f, 0x8b, 0x48, 0xd7, 0xc6, + 0x6d, 0x1d, 0xb4, 0xe5, 0x3f, 0x20, 0xf2, 0xdd, + 0x4a, 0xaa, 0x24, 0x1d, 0xda, 0xb2, 0x6b, 0x5b, + 0xc0, 0xe2, 0x18, 0xb7, 0x2c, 0x33, 0x90, 0xf2, + 0xdf, 0x3e, 0xbd, 0x01, 0x76, 0x70, 0x44, 0x19, + 0x97, 0x2b, 0xcd, 0xbc, 0x6b, 0xbc, 0xb3, 0xe4, + 0xe7, 0x4a, 0x71, 0x52, 0x8e, 0xf5, 0x12, 0x63, + 0xce, 0x24, 0xe0, 0xd5, 0x75, 0xe0, 0xe4, 0x4d +}; +static const u8 enc_output089[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80, + 0x00, 0x00, 0x00, 0x80, 0x00, 0x00, 0x00, 0x80 +}; +static const u8 enc_assoc089[] __initconst = { + 0x65, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x95, 0xaf, 0x0f, 0x4d, 0x0b, 0x68, 0x6e, 0xae, + 0xcc, 0xca, 0x43, 0x07, 0xd5, 0x96, 0xf5, 0x02 +}; +static const u8 enc_nonce089[] __initconst = { + 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, + 0x08, 0x09, 0x0a, 0x0b +}; +static const u8 enc_key089[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - special case tag */ +static const u8 enc_input090[] __initconst = { + 0x9a, 0x49, 0xc4, 0x0f, 0x8b, 0x48, 0xd7, 0xc6, + 0x6d, 0x1d, 0xb4, 0xe5, 0x3f, 0x20, 0xf2, 0xdd, + 0x4a, 0xaa, 0x24, 0x1d, 0xda, 0xb2, 0x6b, 0x5b, + 0xc0, 0xe2, 0x18, 0xb7, 0x2c, 0x33, 0x90, 0xf2, + 0xdf, 0x3e, 0xbd, 0x01, 0x76, 0x70, 0x44, 0x19, + 0x97, 0x2b, 0xcd, 0xbc, 0x6b, 0xbc, 0xb3, 0xe4, + 0xe7, 0x4a, 0x71, 0x52, 0x8e, 0xf5, 0x12, 0x63, + 0xce, 0x24, 0xe0, 0xd5, 0x75, 0xe0, 0xe4, 0x4d +}; +static const u8 enc_output090[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0x7f, 0xff, 0xff, 0xff, 0x7f +}; +static const u8 enc_assoc090[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x85, 0x40, 0xb4, 0x64, 0x35, 0x77, 0x07, 0xbe, + 0x3a, 0x39, 0xd5, 0x5c, 0x34, 0xf8, 0xbc, 0xb3 +}; +static const u8 enc_nonce090[] __initconst = { + 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, + 0x08, 0x09, 0x0a, 0x0b +}; +static const u8 enc_key090[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - special case tag */ +static const u8 enc_input091[] __initconst = { + 0x9a, 0x49, 0xc4, 0x0f, 0x8b, 0x48, 0xd7, 0xc6, + 0x6d, 0x1d, 0xb4, 0xe5, 0x3f, 0x20, 0xf2, 0xdd, + 0x4a, 0xaa, 0x24, 0x1d, 0xda, 0xb2, 0x6b, 0x5b, + 0xc0, 0xe2, 0x18, 0xb7, 0x2c, 0x33, 0x90, 0xf2, + 0xdf, 0x3e, 0xbd, 0x01, 0x76, 0x70, 0x44, 0x19, + 0x97, 0x2b, 0xcd, 0xbc, 0x6b, 0xbc, 0xb3, 0xe4, + 0xe7, 0x4a, 0x71, 0x52, 0x8e, 0xf5, 0x12, 0x63, + 0xce, 0x24, 0xe0, 0xd5, 0x75, 0xe0, 0xe4, 0x4d +}; +static const u8 enc_output091[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x01, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, + 0x01, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00 +}; +static const u8 enc_assoc091[] __initconst = { + 0x4f, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x66, 0x23, 0xd9, 0x90, 0xb8, 0x98, 0xd8, 0x30, + 0xd2, 0x12, 0xaf, 0x23, 0x83, 0x33, 0x07, 0x01 +}; +static const u8 enc_nonce091[] __initconst = { + 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, + 0x08, 0x09, 0x0a, 0x0b +}; +static const u8 enc_key091[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - special case tag */ +static const u8 enc_input092[] __initconst = { + 0x9a, 0x49, 0xc4, 0x0f, 0x8b, 0x48, 0xd7, 0xc6, + 0x6d, 0x1d, 0xb4, 0xe5, 0x3f, 0x20, 0xf2, 0xdd, + 0x4a, 0xaa, 0x24, 0x1d, 0xda, 0xb2, 0x6b, 0x5b, + 0xc0, 0xe2, 0x18, 0xb7, 0x2c, 0x33, 0x90, 0xf2, + 0xdf, 0x3e, 0xbd, 0x01, 0x76, 0x70, 0x44, 0x19, + 0x97, 0x2b, 0xcd, 0xbc, 0x6b, 0xbc, 0xb3, 0xe4, + 0xe7, 0x4a, 0x71, 0x52, 0x8e, 0xf5, 0x12, 0x63, + 0xce, 0x24, 0xe0, 0xd5, 0x75, 0xe0, 0xe4, 0x4d +}; +static const u8 enc_output092[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 +}; +static const u8 enc_assoc092[] __initconst = { + 0x83, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x5f, 0x16, 0xd0, 0x9f, 0x17, 0x78, 0x72, 0x11, + 0xb7, 0xd4, 0x84, 0xe0, 0x24, 0xf8, 0x97, 0x01 +}; +static const u8 enc_nonce092[] __initconst = { + 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, + 0x08, 0x09, 0x0a, 0x0b +}; +static const u8 enc_key092[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input093[] __initconst = { + 0x00, 0x52, 0x35, 0xd2, 0xa9, 0x19, 0xf2, 0x8d, + 0x3d, 0xb7, 0x66, 0x4a, 0x34, 0xae, 0x6b, 0x44, + 0x4d, 0x3d, 0x35, 0xf6, 0x13, 0xe6, 0xd9, 0x09, + 0x3d, 0x38, 0xe9, 0x75, 0xc3, 0x8f, 0xe3, 0xb8, + 0x5b, 0x8b, 0x94, 0x50, 0x9e, 0x2b, 0x74, 0xa3, + 0x6d, 0x34, 0x6e, 0x33, 0xd5, 0x72, 0x65, 0x9b, + 0xa9, 0xf6, 0x37, 0x81, 0x71, 0xea, 0xe4, 0x39, + 0x6e, 0xa1, 0x5d, 0xc2, 0x40, 0xd1, 0xab, 0xf4, + 0x83, 0xdc, 0xe9, 0xf3, 0x07, 0x3e, 0xfa, 0xdb, + 0x7d, 0x23, 0xb8, 0x7a, 0xce, 0x35, 0x16, 0x8c +}; +static const u8 enc_output093[] __initconst = { + 0x00, 0x39, 0xe2, 0xfd, 0x2f, 0xd3, 0x12, 0x14, + 0x9e, 0x98, 0x98, 0x80, 0x88, 0x48, 0x13, 0xe7, + 0xca, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x3b, 0x0e, 0x86, 0x9a, 0xaa, 0x8e, 0xa4, 0x96, + 0x32, 0xff, 0xff, 0x37, 0xb9, 0xe8, 0xce, 0x00, + 0xca, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x3b, 0x0e, 0x86, 0x9a, 0xaa, 0x8e, 0xa4, 0x96, + 0x32, 0xff, 0xff, 0x37, 0xb9, 0xe8, 0xce, 0x00, + 0xa5, 0x19, 0xac, 0x1a, 0x35, 0xb4, 0xa5, 0x77, + 0x87, 0x51, 0x0a, 0xf7, 0x8d, 0x8d, 0x20, 0x0a +}; +static const u8 enc_assoc093[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce093[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key093[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input094[] __initconst = { + 0xd3, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66, + 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c, + 0xe5, 0xda, 0x78, 0x76, 0x6f, 0xa1, 0x92, 0x90, + 0xc0, 0x31, 0xf7, 0x52, 0x08, 0x50, 0x67, 0x45, + 0xae, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca, + 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64, + 0x49, 0x6d, 0xde, 0xb0, 0x55, 0x09, 0xc6, 0xef, + 0xff, 0xab, 0x75, 0xeb, 0x2d, 0xf4, 0xab, 0x09, + 0x76, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2, + 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73, + 0x01, 0x49, 0xef, 0x50, 0x4b, 0x71, 0xb1, 0x20, + 0xca, 0x4f, 0xf3, 0x95, 0x19, 0xc2, 0xc2, 0x10 +}; +static const u8 enc_output094[] __initconst = { + 0xd3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x62, 0x18, 0xb2, 0x7f, 0x83, 0xb8, 0xb4, 0x66, + 0x02, 0xf6, 0xe1, 0xd8, 0x34, 0x20, 0x7b, 0x02, + 0xce, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x2a, 0x64, 0x16, 0xce, 0xdb, 0x1c, 0xdd, 0x29, + 0x6e, 0xf5, 0xd7, 0xd6, 0x92, 0xda, 0xff, 0x02, + 0xce, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x2a, 0x64, 0x16, 0xce, 0xdb, 0x1c, 0xdd, 0x29, + 0x6e, 0xf5, 0xd7, 0xd6, 0x92, 0xda, 0xff, 0x02, + 0x30, 0x2f, 0xe8, 0x2a, 0xb0, 0xa0, 0x9a, 0xf6, + 0x44, 0x00, 0xd0, 0x15, 0xae, 0x83, 0xd9, 0xcc +}; +static const u8 enc_assoc094[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce094[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key094[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input095[] __initconst = { + 0xe9, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66, + 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c, + 0x6d, 0xf1, 0x39, 0x4e, 0xdc, 0x53, 0x9b, 0x5b, + 0x3a, 0x09, 0x57, 0xbe, 0x0f, 0xb8, 0x59, 0x46, + 0x80, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca, + 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64, + 0xd1, 0x76, 0x9f, 0xe8, 0x06, 0xbb, 0xfe, 0xb6, + 0xf5, 0x90, 0x95, 0x0f, 0x2e, 0xac, 0x9e, 0x0a, + 0x58, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2, + 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73, + 0x99, 0x52, 0xae, 0x08, 0x18, 0xc3, 0x89, 0x79, + 0xc0, 0x74, 0x13, 0x71, 0x1a, 0x9a, 0xf7, 0x13 +}; +static const u8 enc_output095[] __initconst = { + 0xe9, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xea, 0x33, 0xf3, 0x47, 0x30, 0x4a, 0xbd, 0xad, + 0xf8, 0xce, 0x41, 0x34, 0x33, 0xc8, 0x45, 0x01, + 0xe0, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xb2, 0x7f, 0x57, 0x96, 0x88, 0xae, 0xe5, 0x70, + 0x64, 0xce, 0x37, 0x32, 0x91, 0x82, 0xca, 0x01, + 0xe0, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xb2, 0x7f, 0x57, 0x96, 0x88, 0xae, 0xe5, 0x70, + 0x64, 0xce, 0x37, 0x32, 0x91, 0x82, 0xca, 0x01, + 0x98, 0xa7, 0xe8, 0x36, 0xe0, 0xee, 0x4d, 0x02, + 0x35, 0x00, 0xd0, 0x55, 0x7e, 0xc2, 0xcb, 0xe0 +}; +static const u8 enc_assoc095[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce095[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key095[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input096[] __initconst = { + 0xff, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66, + 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c, + 0x64, 0xf9, 0x0f, 0x5b, 0x26, 0x92, 0xb8, 0x60, + 0xd4, 0x59, 0x6f, 0xf4, 0xb3, 0x40, 0x2c, 0x5c, + 0x00, 0xb9, 0xbb, 0x53, 0x70, 0x7a, 0xa6, 0x67, + 0xd3, 0x56, 0xfe, 0x50, 0xc7, 0x19, 0x96, 0x94, + 0x03, 0x35, 0x61, 0xe7, 0xca, 0xca, 0x6d, 0x94, + 0x1d, 0xc3, 0xcd, 0x69, 0x14, 0xad, 0x69, 0x04 +}; +static const u8 enc_output096[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xe3, 0x3b, 0xc5, 0x52, 0xca, 0x8b, 0x9e, 0x96, + 0x16, 0x9e, 0x79, 0x7e, 0x8f, 0x30, 0x30, 0x1b, + 0x60, 0x3c, 0xa9, 0x99, 0x44, 0xdf, 0x76, 0x52, + 0x8c, 0x9d, 0x6f, 0x54, 0xab, 0x83, 0x3d, 0x0f, + 0x60, 0x3c, 0xa9, 0x99, 0x44, 0xdf, 0x76, 0x52, + 0x8c, 0x9d, 0x6f, 0x54, 0xab, 0x83, 0x3d, 0x0f, + 0x6a, 0xb8, 0xdc, 0xe2, 0xc5, 0x9d, 0xa4, 0x73, + 0x71, 0x30, 0xb0, 0x25, 0x2f, 0x68, 0xa8, 0xd8 +}; +static const u8 enc_assoc096[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce096[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key096[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input097[] __initconst = { + 0x68, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66, + 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c, + 0xb0, 0x8f, 0x25, 0x67, 0x5b, 0x9b, 0xcb, 0xf6, + 0xe3, 0x84, 0x07, 0xde, 0x2e, 0xc7, 0x5a, 0x47, + 0x9f, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca, + 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64, + 0x2d, 0x2a, 0xf7, 0xcd, 0x6b, 0x08, 0x05, 0x01, + 0xd3, 0x1b, 0xa5, 0x4f, 0xb2, 0xeb, 0x75, 0x96, + 0x47, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2, + 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73, + 0x65, 0x0e, 0xc6, 0x2d, 0x75, 0x70, 0x72, 0xce, + 0xe6, 0xff, 0x23, 0x31, 0x86, 0xdd, 0x1c, 0x8f +}; +static const u8 enc_output097[] __initconst = { + 0x68, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x37, 0x4d, 0xef, 0x6e, 0xb7, 0x82, 0xed, 0x00, + 0x21, 0x43, 0x11, 0x54, 0x12, 0xb7, 0x46, 0x00, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x4e, 0x23, 0x3f, 0xb3, 0xe5, 0x1d, 0x1e, 0xc7, + 0x42, 0x45, 0x07, 0x72, 0x0d, 0xc5, 0x21, 0x9d, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x4e, 0x23, 0x3f, 0xb3, 0xe5, 0x1d, 0x1e, 0xc7, + 0x42, 0x45, 0x07, 0x72, 0x0d, 0xc5, 0x21, 0x9d, + 0x04, 0x4d, 0xea, 0x60, 0x88, 0x80, 0x41, 0x2b, + 0xfd, 0xff, 0xcf, 0x35, 0x57, 0x9e, 0x9b, 0x26 +}; +static const u8 enc_assoc097[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce097[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key097[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input098[] __initconst = { + 0x6d, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66, + 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c, + 0xa1, 0x61, 0xb5, 0xab, 0x04, 0x09, 0x00, 0x62, + 0x9e, 0xfe, 0xff, 0x78, 0xd7, 0xd8, 0x6b, 0x45, + 0x9f, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca, + 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64, + 0xc6, 0xf8, 0x07, 0x8c, 0xc8, 0xef, 0x12, 0xa0, + 0xff, 0x65, 0x7d, 0x6d, 0x08, 0xdb, 0x10, 0xb8, + 0x47, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2, + 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73, + 0x8e, 0xdc, 0x36, 0x6c, 0xd6, 0x97, 0x65, 0x6f, + 0xca, 0x81, 0xfb, 0x13, 0x3c, 0xed, 0x79, 0xa1 +}; +static const u8 enc_output098[] __initconst = { + 0x6d, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x26, 0xa3, 0x7f, 0xa2, 0xe8, 0x10, 0x26, 0x94, + 0x5c, 0x39, 0xe9, 0xf2, 0xeb, 0xa8, 0x77, 0x02, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xa5, 0xf1, 0xcf, 0xf2, 0x46, 0xfa, 0x09, 0x66, + 0x6e, 0x3b, 0xdf, 0x50, 0xb7, 0xf5, 0x44, 0xb3, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xa5, 0xf1, 0xcf, 0xf2, 0x46, 0xfa, 0x09, 0x66, + 0x6e, 0x3b, 0xdf, 0x50, 0xb7, 0xf5, 0x44, 0xb3, + 0x1e, 0x6b, 0xea, 0x63, 0x14, 0x54, 0x2e, 0x2e, + 0xf9, 0xff, 0xcf, 0x45, 0x0b, 0x2e, 0x98, 0x2b +}; +static const u8 enc_assoc098[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce098[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key098[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input099[] __initconst = { + 0xff, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66, + 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c, + 0xfc, 0x01, 0xb8, 0x91, 0xe5, 0xf0, 0xf9, 0x12, + 0x8d, 0x7d, 0x1c, 0x57, 0x91, 0x92, 0xb6, 0x98, + 0x63, 0x41, 0x44, 0x15, 0xb6, 0x99, 0x68, 0x95, + 0x9a, 0x72, 0x91, 0xb7, 0xa5, 0xaf, 0x13, 0x48, + 0x60, 0xcd, 0x9e, 0xa1, 0x0c, 0x29, 0xa3, 0x66, + 0x54, 0xe7, 0xa2, 0x8e, 0x76, 0x1b, 0xec, 0xd8 +}; +static const u8 enc_output099[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x7b, 0xc3, 0x72, 0x98, 0x09, 0xe9, 0xdf, 0xe4, + 0x4f, 0xba, 0x0a, 0xdd, 0xad, 0xe2, 0xaa, 0xdf, + 0x03, 0xc4, 0x56, 0xdf, 0x82, 0x3c, 0xb8, 0xa0, + 0xc5, 0xb9, 0x00, 0xb3, 0xc9, 0x35, 0xb8, 0xd3, + 0x03, 0xc4, 0x56, 0xdf, 0x82, 0x3c, 0xb8, 0xa0, + 0xc5, 0xb9, 0x00, 0xb3, 0xc9, 0x35, 0xb8, 0xd3, + 0xed, 0x20, 0x17, 0xc8, 0xdb, 0xa4, 0x77, 0x56, + 0x29, 0x04, 0x9d, 0x78, 0x6e, 0x3b, 0xce, 0xb1 +}; +static const u8 enc_assoc099[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce099[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key099[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input100[] __initconst = { + 0xff, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66, + 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c, + 0x6b, 0x6d, 0xc9, 0xd2, 0x1a, 0x81, 0x9e, 0x70, + 0xb5, 0x77, 0xf4, 0x41, 0x37, 0xd3, 0xd6, 0xbd, + 0x13, 0x35, 0xf5, 0xeb, 0x44, 0x49, 0x40, 0x77, + 0xb2, 0x64, 0x49, 0xa5, 0x4b, 0x6c, 0x7c, 0x75, + 0x10, 0xb9, 0x2f, 0x5f, 0xfe, 0xf9, 0x8b, 0x84, + 0x7c, 0xf1, 0x7a, 0x9c, 0x98, 0xd8, 0x83, 0xe5 +}; +static const u8 enc_output100[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xec, 0xaf, 0x03, 0xdb, 0xf6, 0x98, 0xb8, 0x86, + 0x77, 0xb0, 0xe2, 0xcb, 0x0b, 0xa3, 0xca, 0xfa, + 0x73, 0xb0, 0xe7, 0x21, 0x70, 0xec, 0x90, 0x42, + 0xed, 0xaf, 0xd8, 0xa1, 0x27, 0xf6, 0xd7, 0xee, + 0x73, 0xb0, 0xe7, 0x21, 0x70, 0xec, 0x90, 0x42, + 0xed, 0xaf, 0xd8, 0xa1, 0x27, 0xf6, 0xd7, 0xee, + 0x07, 0x3f, 0x17, 0xcb, 0x67, 0x78, 0x64, 0x59, + 0x25, 0x04, 0x9d, 0x88, 0x22, 0xcb, 0xca, 0xb6 +}; +static const u8 enc_assoc100[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce100[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key100[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input101[] __initconst = { + 0xff, 0xcb, 0x2b, 0x11, 0x06, 0xf8, 0x23, 0x4c, + 0x5e, 0x99, 0xd4, 0xdb, 0x4c, 0x70, 0x48, 0xde, + 0x32, 0x3d, 0x35, 0xf6, 0x13, 0xe6, 0xd9, 0x09, + 0x3d, 0x38, 0xe9, 0x75, 0xc3, 0x8f, 0xe3, 0xb8, + 0x16, 0xe9, 0x88, 0x4a, 0x11, 0x4f, 0x0e, 0x92, + 0x66, 0xce, 0xa3, 0x88, 0x5f, 0xe3, 0x6b, 0x9f, + 0xd6, 0xf6, 0x37, 0x81, 0x71, 0xea, 0xe4, 0x39, + 0x6e, 0xa1, 0x5d, 0xc2, 0x40, 0xd1, 0xab, 0xf4, + 0xce, 0xbe, 0xf5, 0xe9, 0x88, 0x5a, 0x80, 0xea, + 0x76, 0xd9, 0x75, 0xc1, 0x44, 0xa4, 0x18, 0x88 +}; +static const u8 enc_output101[] __initconst = { + 0xff, 0xa0, 0xfc, 0x3e, 0x80, 0x32, 0xc3, 0xd5, + 0xfd, 0xb6, 0x2a, 0x11, 0xf0, 0x96, 0x30, 0x7d, + 0xb5, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x76, 0x6c, 0x9a, 0x80, 0x25, 0xea, 0xde, 0xa7, + 0x39, 0x05, 0x32, 0x8c, 0x33, 0x79, 0xc0, 0x04, + 0xb5, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x76, 0x6c, 0x9a, 0x80, 0x25, 0xea, 0xde, 0xa7, + 0x39, 0x05, 0x32, 0x8c, 0x33, 0x79, 0xc0, 0x04, + 0x8b, 0x9b, 0xb4, 0xb4, 0x86, 0x12, 0x89, 0x65, + 0x8c, 0x69, 0x6a, 0x83, 0x40, 0x15, 0x04, 0x05 +}; +static const u8 enc_assoc101[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce101[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key101[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input102[] __initconst = { + 0x6f, 0x9e, 0x70, 0xed, 0x3b, 0x8b, 0xac, 0xa0, + 0x26, 0xe4, 0x6a, 0x5a, 0x09, 0x43, 0x15, 0x8d, + 0x21, 0x3d, 0x35, 0xf6, 0x13, 0xe6, 0xd9, 0x09, + 0x3d, 0x38, 0xe9, 0x75, 0xc3, 0x8f, 0xe3, 0xb8, + 0x0c, 0x61, 0x2c, 0x5e, 0x8d, 0x89, 0xa8, 0x73, + 0xdb, 0xca, 0xad, 0x5b, 0x73, 0x46, 0x42, 0x9b, + 0xc5, 0xf6, 0x37, 0x81, 0x71, 0xea, 0xe4, 0x39, + 0x6e, 0xa1, 0x5d, 0xc2, 0x40, 0xd1, 0xab, 0xf4, + 0xd4, 0x36, 0x51, 0xfd, 0x14, 0x9c, 0x26, 0x0b, + 0xcb, 0xdd, 0x7b, 0x12, 0x68, 0x01, 0x31, 0x8c +}; +static const u8 enc_output102[] __initconst = { + 0x6f, 0xf5, 0xa7, 0xc2, 0xbd, 0x41, 0x4c, 0x39, + 0x85, 0xcb, 0x94, 0x90, 0xb5, 0xa5, 0x6d, 0x2e, + 0xa6, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x6c, 0xe4, 0x3e, 0x94, 0xb9, 0x2c, 0x78, 0x46, + 0x84, 0x01, 0x3c, 0x5f, 0x1f, 0xdc, 0xe9, 0x00, + 0xa6, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x6c, 0xe4, 0x3e, 0x94, 0xb9, 0x2c, 0x78, 0x46, + 0x84, 0x01, 0x3c, 0x5f, 0x1f, 0xdc, 0xe9, 0x00, + 0x8b, 0x3b, 0xbd, 0x51, 0x64, 0x44, 0x59, 0x56, + 0x8d, 0x81, 0xca, 0x1f, 0xa7, 0x2c, 0xe4, 0x04 +}; +static const u8 enc_assoc102[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce102[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key102[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input103[] __initconst = { + 0x41, 0x2b, 0x08, 0x0a, 0x3e, 0x19, 0xc1, 0x0d, + 0x44, 0xa1, 0xaf, 0x1e, 0xab, 0xde, 0xb4, 0xce, + 0x35, 0x3d, 0x35, 0xf6, 0x13, 0xe6, 0xd9, 0x09, + 0x3d, 0x38, 0xe9, 0x75, 0xc3, 0x8f, 0xe3, 0xb8, + 0x6b, 0x83, 0x94, 0x33, 0x09, 0x21, 0x48, 0x6c, + 0xa1, 0x1d, 0x29, 0x1c, 0x3e, 0x97, 0xee, 0x9a, + 0xd1, 0xf6, 0x37, 0x81, 0x71, 0xea, 0xe4, 0x39, + 0x6e, 0xa1, 0x5d, 0xc2, 0x40, 0xd1, 0xab, 0xf4, + 0xb3, 0xd4, 0xe9, 0x90, 0x90, 0x34, 0xc6, 0x14, + 0xb1, 0x0a, 0xff, 0x55, 0x25, 0xd0, 0x9d, 0x8d +}; +static const u8 enc_output103[] __initconst = { + 0x41, 0x40, 0xdf, 0x25, 0xb8, 0xd3, 0x21, 0x94, + 0xe7, 0x8e, 0x51, 0xd4, 0x17, 0x38, 0xcc, 0x6d, + 0xb2, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x0b, 0x06, 0x86, 0xf9, 0x3d, 0x84, 0x98, 0x59, + 0xfe, 0xd6, 0xb8, 0x18, 0x52, 0x0d, 0x45, 0x01, + 0xb2, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x0b, 0x06, 0x86, 0xf9, 0x3d, 0x84, 0x98, 0x59, + 0xfe, 0xd6, 0xb8, 0x18, 0x52, 0x0d, 0x45, 0x01, + 0x86, 0xfb, 0xab, 0x2b, 0x4a, 0x94, 0xf4, 0x7a, + 0xa5, 0x6f, 0x0a, 0xea, 0x65, 0xd1, 0x10, 0x08 +}; +static const u8 enc_assoc103[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce103[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key103[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input104[] __initconst = { + 0xb2, 0x47, 0xa7, 0x47, 0x23, 0x49, 0x1a, 0xac, + 0xac, 0xaa, 0xd7, 0x09, 0xc9, 0x1e, 0x93, 0x2b, + 0x31, 0x3d, 0x35, 0xf6, 0x13, 0xe6, 0xd9, 0x09, + 0x3d, 0x38, 0xe9, 0x75, 0xc3, 0x8f, 0xe3, 0xb8, + 0x9a, 0xde, 0x04, 0xe7, 0x5b, 0xb7, 0x01, 0xd9, + 0x66, 0x06, 0x01, 0xb3, 0x47, 0x65, 0xde, 0x98, + 0xd5, 0xf6, 0x37, 0x81, 0x71, 0xea, 0xe4, 0x39, + 0x6e, 0xa1, 0x5d, 0xc2, 0x40, 0xd1, 0xab, 0xf4, + 0x42, 0x89, 0x79, 0x44, 0xc2, 0xa2, 0x8f, 0xa1, + 0x76, 0x11, 0xd7, 0xfa, 0x5c, 0x22, 0xad, 0x8f +}; +static const u8 enc_output104[] __initconst = { + 0xb2, 0x2c, 0x70, 0x68, 0xa5, 0x83, 0xfa, 0x35, + 0x0f, 0x85, 0x29, 0xc3, 0x75, 0xf8, 0xeb, 0x88, + 0xb6, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xfa, 0x5b, 0x16, 0x2d, 0x6f, 0x12, 0xd1, 0xec, + 0x39, 0xcd, 0x90, 0xb7, 0x2b, 0xff, 0x75, 0x03, + 0xb6, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xfa, 0x5b, 0x16, 0x2d, 0x6f, 0x12, 0xd1, 0xec, + 0x39, 0xcd, 0x90, 0xb7, 0x2b, 0xff, 0x75, 0x03, + 0xa0, 0x19, 0xac, 0x2e, 0xd6, 0x67, 0xe1, 0x7d, + 0xa1, 0x6f, 0x0a, 0xfa, 0x19, 0x61, 0x0d, 0x0d +}; +static const u8 enc_assoc104[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce104[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key104[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input105[] __initconst = { + 0x74, 0x0f, 0x9e, 0x49, 0xf6, 0x10, 0xef, 0xa5, + 0x85, 0xb6, 0x59, 0xca, 0x6e, 0xd8, 0xb4, 0x99, + 0x2d, 0x3d, 0x35, 0xf6, 0x13, 0xe6, 0xd9, 0x09, + 0x3d, 0x38, 0xe9, 0x75, 0xc3, 0x8f, 0xe3, 0xb8, + 0x41, 0x2d, 0x96, 0xaf, 0xbe, 0x80, 0xec, 0x3e, + 0x79, 0xd4, 0x51, 0xb0, 0x0a, 0x2d, 0xb2, 0x9a, + 0xc9, 0xf6, 0x37, 0x81, 0x71, 0xea, 0xe4, 0x39, + 0x6e, 0xa1, 0x5d, 0xc2, 0x40, 0xd1, 0xab, 0xf4, + 0x99, 0x7a, 0xeb, 0x0c, 0x27, 0x95, 0x62, 0x46, + 0x69, 0xc3, 0x87, 0xf9, 0x11, 0x6a, 0xc1, 0x8d +}; +static const u8 enc_output105[] __initconst = { + 0x74, 0x64, 0x49, 0x66, 0x70, 0xda, 0x0f, 0x3c, + 0x26, 0x99, 0xa7, 0x00, 0xd2, 0x3e, 0xcc, 0x3a, + 0xaa, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x21, 0xa8, 0x84, 0x65, 0x8a, 0x25, 0x3c, 0x0b, + 0x26, 0x1f, 0xc0, 0xb4, 0x66, 0xb7, 0x19, 0x01, + 0xaa, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x21, 0xa8, 0x84, 0x65, 0x8a, 0x25, 0x3c, 0x0b, + 0x26, 0x1f, 0xc0, 0xb4, 0x66, 0xb7, 0x19, 0x01, + 0x73, 0x6e, 0x18, 0x18, 0x16, 0x96, 0xa5, 0x88, + 0x9c, 0x31, 0x59, 0xfa, 0xab, 0xab, 0x20, 0xfd +}; +static const u8 enc_assoc105[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce105[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key105[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input106[] __initconst = { + 0xad, 0xba, 0x5d, 0x10, 0x5b, 0xc8, 0xaa, 0x06, + 0x2c, 0x23, 0x36, 0xcb, 0x88, 0x9d, 0xdb, 0xd5, + 0x37, 0x3d, 0x35, 0xf6, 0x13, 0xe6, 0xd9, 0x09, + 0x3d, 0x38, 0xe9, 0x75, 0xc3, 0x8f, 0xe3, 0xb8, + 0x17, 0x7c, 0x5f, 0xfe, 0x28, 0x75, 0xf4, 0x68, + 0xf6, 0xc2, 0x96, 0x57, 0x48, 0xf3, 0x59, 0x9a, + 0xd3, 0xf6, 0x37, 0x81, 0x71, 0xea, 0xe4, 0x39, + 0x6e, 0xa1, 0x5d, 0xc2, 0x40, 0xd1, 0xab, 0xf4, + 0xcf, 0x2b, 0x22, 0x5d, 0xb1, 0x60, 0x7a, 0x10, + 0xe6, 0xd5, 0x40, 0x1e, 0x53, 0xb4, 0x2a, 0x8d +}; +static const u8 enc_output106[] __initconst = { + 0xad, 0xd1, 0x8a, 0x3f, 0xdd, 0x02, 0x4a, 0x9f, + 0x8f, 0x0c, 0xc8, 0x01, 0x34, 0x7b, 0xa3, 0x76, + 0xb0, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x77, 0xf9, 0x4d, 0x34, 0x1c, 0xd0, 0x24, 0x5d, + 0xa9, 0x09, 0x07, 0x53, 0x24, 0x69, 0xf2, 0x01, + 0xb0, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x77, 0xf9, 0x4d, 0x34, 0x1c, 0xd0, 0x24, 0x5d, + 0xa9, 0x09, 0x07, 0x53, 0x24, 0x69, 0xf2, 0x01, + 0xba, 0xd5, 0x8f, 0x10, 0xa9, 0x1e, 0x6a, 0x88, + 0x9a, 0xba, 0x32, 0xfd, 0x17, 0xd8, 0x33, 0x1a +}; +static const u8 enc_assoc106[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce106[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key106[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input107[] __initconst = { + 0xfe, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66, + 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c, + 0xc0, 0x01, 0xed, 0xc5, 0xda, 0x44, 0x2e, 0x71, + 0x9b, 0xce, 0x9a, 0xbe, 0x27, 0x3a, 0xf1, 0x44, + 0xb4, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca, + 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64, + 0x48, 0x02, 0x5f, 0x41, 0xfa, 0x4e, 0x33, 0x6c, + 0x78, 0x69, 0x57, 0xa2, 0xa7, 0xc4, 0x93, 0x0a, + 0x6c, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2, + 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73, + 0x00, 0x26, 0x6e, 0xa1, 0xe4, 0x36, 0x44, 0xa3, + 0x4d, 0x8d, 0xd1, 0xdc, 0x93, 0xf2, 0xfa, 0x13 +}; +static const u8 enc_output107[] __initconst = { + 0xfe, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x47, 0xc3, 0x27, 0xcc, 0x36, 0x5d, 0x08, 0x87, + 0x59, 0x09, 0x8c, 0x34, 0x1b, 0x4a, 0xed, 0x03, + 0xd4, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x2b, 0x0b, 0x97, 0x3f, 0x74, 0x5b, 0x28, 0xaa, + 0xe9, 0x37, 0xf5, 0x9f, 0x18, 0xea, 0xc7, 0x01, + 0xd4, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x2b, 0x0b, 0x97, 0x3f, 0x74, 0x5b, 0x28, 0xaa, + 0xe9, 0x37, 0xf5, 0x9f, 0x18, 0xea, 0xc7, 0x01, + 0xd6, 0x8c, 0xe1, 0x74, 0x07, 0x9a, 0xdd, 0x02, + 0x8d, 0xd0, 0x5c, 0xf8, 0x14, 0x63, 0x04, 0x88 +}; +static const u8 enc_assoc107[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce107[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key107[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input108[] __initconst = { + 0xb5, 0x13, 0xb0, 0x6a, 0xb9, 0xac, 0x14, 0x43, + 0x5a, 0xcb, 0x8a, 0xa3, 0xa3, 0x7a, 0xfd, 0xb6, + 0x54, 0x3d, 0x35, 0xf6, 0x13, 0xe6, 0xd9, 0x09, + 0x3d, 0x38, 0xe9, 0x75, 0xc3, 0x8f, 0xe3, 0xb8, + 0x61, 0x95, 0x01, 0x93, 0xb1, 0xbf, 0x03, 0x11, + 0xff, 0x11, 0x79, 0x89, 0xae, 0xd9, 0xa9, 0x99, + 0xb0, 0xf6, 0x37, 0x81, 0x71, 0xea, 0xe4, 0x39, + 0x6e, 0xa1, 0x5d, 0xc2, 0x40, 0xd1, 0xab, 0xf4, + 0xb9, 0xc2, 0x7c, 0x30, 0x28, 0xaa, 0x8d, 0x69, + 0xef, 0x06, 0xaf, 0xc0, 0xb5, 0x9e, 0xda, 0x8e +}; +static const u8 enc_output108[] __initconst = { + 0xb5, 0x78, 0x67, 0x45, 0x3f, 0x66, 0xf4, 0xda, + 0xf9, 0xe4, 0x74, 0x69, 0x1f, 0x9c, 0x85, 0x15, + 0xd3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x01, 0x10, 0x13, 0x59, 0x85, 0x1a, 0xd3, 0x24, + 0xa0, 0xda, 0xe8, 0x8d, 0xc2, 0x43, 0x02, 0x02, + 0xd3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x01, 0x10, 0x13, 0x59, 0x85, 0x1a, 0xd3, 0x24, + 0xa0, 0xda, 0xe8, 0x8d, 0xc2, 0x43, 0x02, 0x02, + 0xaa, 0x48, 0xa3, 0x88, 0x7d, 0x4b, 0x05, 0x96, + 0x99, 0xc2, 0xfd, 0xf9, 0xc6, 0x78, 0x7e, 0x0a +}; +static const u8 enc_assoc108[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce108[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key108[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input109[] __initconst = { + 0xff, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66, + 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c, + 0xd4, 0xf1, 0x09, 0xe8, 0x14, 0xce, 0xa8, 0x5a, + 0x08, 0xc0, 0x11, 0xd8, 0x50, 0xdd, 0x1d, 0xcb, + 0xcf, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca, + 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64, + 0x53, 0x40, 0xb8, 0x5a, 0x9a, 0xa0, 0x82, 0x96, + 0xb7, 0x7a, 0x5f, 0xc3, 0x96, 0x1f, 0x66, 0x0f, + 0x17, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2, + 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73, + 0x1b, 0x64, 0x89, 0xba, 0x84, 0xd8, 0xf5, 0x59, + 0x82, 0x9e, 0xd9, 0xbd, 0xa2, 0x29, 0x0f, 0x16 +}; +static const u8 enc_output109[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x53, 0x33, 0xc3, 0xe1, 0xf8, 0xd7, 0x8e, 0xac, + 0xca, 0x07, 0x07, 0x52, 0x6c, 0xad, 0x01, 0x8c, + 0xaf, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x30, 0x49, 0x70, 0x24, 0x14, 0xb5, 0x99, 0x50, + 0x26, 0x24, 0xfd, 0xfe, 0x29, 0x31, 0x32, 0x04, + 0xaf, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x30, 0x49, 0x70, 0x24, 0x14, 0xb5, 0x99, 0x50, + 0x26, 0x24, 0xfd, 0xfe, 0x29, 0x31, 0x32, 0x04, + 0xb9, 0x36, 0xa8, 0x17, 0xf2, 0x21, 0x1a, 0xf1, + 0x29, 0xe2, 0xcf, 0x16, 0x0f, 0xd4, 0x2b, 0xcb +}; +static const u8 enc_assoc109[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce109[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key109[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input110[] __initconst = { + 0xff, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66, + 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c, + 0xdf, 0x4c, 0x62, 0x03, 0x2d, 0x41, 0x19, 0xb5, + 0x88, 0x47, 0x7e, 0x99, 0x92, 0x5a, 0x56, 0xd9, + 0xd6, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca, + 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64, + 0xfa, 0x84, 0xf0, 0x64, 0x55, 0x36, 0x42, 0x1b, + 0x2b, 0xb9, 0x24, 0x6e, 0xc2, 0x19, 0xed, 0x0b, + 0x0e, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2, + 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73, + 0xb2, 0xa0, 0xc1, 0x84, 0x4b, 0x4e, 0x35, 0xd4, + 0x1e, 0x5d, 0xa2, 0x10, 0xf6, 0x2f, 0x84, 0x12 +}; +static const u8 enc_output110[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x58, 0x8e, 0xa8, 0x0a, 0xc1, 0x58, 0x3f, 0x43, + 0x4a, 0x80, 0x68, 0x13, 0xae, 0x2a, 0x4a, 0x9e, + 0xb6, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x99, 0x8d, 0x38, 0x1a, 0xdb, 0x23, 0x59, 0xdd, + 0xba, 0xe7, 0x86, 0x53, 0x7d, 0x37, 0xb9, 0x00, + 0xb6, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x99, 0x8d, 0x38, 0x1a, 0xdb, 0x23, 0x59, 0xdd, + 0xba, 0xe7, 0x86, 0x53, 0x7d, 0x37, 0xb9, 0x00, + 0x9f, 0x7a, 0xc4, 0x35, 0x1f, 0x6b, 0x91, 0xe6, + 0x30, 0x97, 0xa7, 0x13, 0x11, 0x5d, 0x05, 0xbe +}; +static const u8 enc_assoc110[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce110[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key110[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input111[] __initconst = { + 0xff, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66, + 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c, + 0x13, 0xf8, 0x0a, 0x00, 0x6d, 0xc1, 0xbb, 0xda, + 0xd6, 0x39, 0xa9, 0x2f, 0xc7, 0xec, 0xa6, 0x55, + 0xf7, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca, + 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64, + 0x63, 0x48, 0xb8, 0xfd, 0x29, 0xbf, 0x96, 0xd5, + 0x63, 0xa5, 0x17, 0xe2, 0x7d, 0x7b, 0xfc, 0x0f, + 0x2f, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2, + 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73, + 0x2b, 0x6c, 0x89, 0x1d, 0x37, 0xc7, 0xe1, 0x1a, + 0x56, 0x41, 0x91, 0x9c, 0x49, 0x4d, 0x95, 0x16 +}; +static const u8 enc_output111[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x94, 0x3a, 0xc0, 0x09, 0x81, 0xd8, 0x9d, 0x2c, + 0x14, 0xfe, 0xbf, 0xa5, 0xfb, 0x9c, 0xba, 0x12, + 0x97, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x41, 0x70, 0x83, 0xa7, 0xaa, 0x8d, 0x13, + 0xf2, 0xfb, 0xb5, 0xdf, 0xc2, 0x55, 0xa8, 0x04, + 0x97, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x41, 0x70, 0x83, 0xa7, 0xaa, 0x8d, 0x13, + 0xf2, 0xfb, 0xb5, 0xdf, 0xc2, 0x55, 0xa8, 0x04, + 0x9a, 0x18, 0xa8, 0x28, 0x07, 0x02, 0x69, 0xf4, + 0x47, 0x00, 0xd0, 0x09, 0xe7, 0x17, 0x1c, 0xc9 +}; +static const u8 enc_assoc111[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce111[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key111[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input112[] __initconst = { + 0xff, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66, + 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c, + 0x82, 0xe5, 0x9b, 0x45, 0x82, 0x91, 0x50, 0x38, + 0xf9, 0x33, 0x81, 0x1e, 0x65, 0x2d, 0xc6, 0x6a, + 0xfc, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca, + 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64, + 0xb6, 0x71, 0xc8, 0xca, 0xc2, 0x70, 0xc2, 0x65, + 0xa0, 0xac, 0x2f, 0x53, 0x57, 0x99, 0x88, 0x0a, + 0x24, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2, + 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73, + 0xfe, 0x55, 0xf9, 0x2a, 0xdc, 0x08, 0xb5, 0xaa, + 0x95, 0x48, 0xa9, 0x2d, 0x63, 0xaf, 0xe1, 0x13 +}; +static const u8 enc_output112[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x05, 0x27, 0x51, 0x4c, 0x6e, 0x88, 0x76, 0xce, + 0x3b, 0xf4, 0x97, 0x94, 0x59, 0x5d, 0xda, 0x2d, + 0x9c, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xd5, 0x78, 0x00, 0xb4, 0x4c, 0x65, 0xd9, 0xa3, + 0x31, 0xf2, 0x8d, 0x6e, 0xe8, 0xb7, 0xdc, 0x01, + 0x9c, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xd5, 0x78, 0x00, 0xb4, 0x4c, 0x65, 0xd9, 0xa3, + 0x31, 0xf2, 0x8d, 0x6e, 0xe8, 0xb7, 0xdc, 0x01, + 0xb4, 0x36, 0xa8, 0x2b, 0x93, 0xd5, 0x55, 0xf7, + 0x43, 0x00, 0xd0, 0x19, 0x9b, 0xa7, 0x18, 0xce +}; +static const u8 enc_assoc112[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce112[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key112[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input113[] __initconst = { + 0xff, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66, + 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c, + 0xf1, 0xd1, 0x28, 0x87, 0xb7, 0x21, 0x69, 0x86, + 0xa1, 0x2d, 0x79, 0x09, 0x8b, 0x6d, 0xe6, 0x0f, + 0xc0, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca, + 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64, + 0xa7, 0xc7, 0x58, 0x99, 0xf3, 0xe6, 0x0a, 0xf1, + 0xfc, 0xb6, 0xc7, 0x30, 0x7d, 0x87, 0x59, 0x0f, + 0x18, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2, + 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73, + 0xef, 0xe3, 0x69, 0x79, 0xed, 0x9e, 0x7d, 0x3e, + 0xc9, 0x52, 0x41, 0x4e, 0x49, 0xb1, 0x30, 0x16 +}; +static const u8 enc_output113[] __initconst = { + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x76, 0x13, 0xe2, 0x8e, 0x5b, 0x38, 0x4f, 0x70, + 0x63, 0xea, 0x6f, 0x83, 0xb7, 0x1d, 0xfa, 0x48, + 0xa0, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xc4, 0xce, 0x90, 0xe7, 0x7d, 0xf3, 0x11, 0x37, + 0x6d, 0xe8, 0x65, 0x0d, 0xc2, 0xa9, 0x0d, 0x04, + 0xa0, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xc4, 0xce, 0x90, 0xe7, 0x7d, 0xf3, 0x11, 0x37, + 0x6d, 0xe8, 0x65, 0x0d, 0xc2, 0xa9, 0x0d, 0x04, + 0xce, 0x54, 0xa8, 0x2e, 0x1f, 0xa9, 0x42, 0xfa, + 0x3f, 0x00, 0xd0, 0x29, 0x4f, 0x37, 0x15, 0xd3 +}; +static const u8 enc_assoc113[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce113[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key113[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input114[] __initconst = { + 0xcb, 0xf1, 0xda, 0x9e, 0x0b, 0xa9, 0x37, 0x73, + 0x74, 0xe6, 0x9e, 0x1c, 0x0e, 0x60, 0x0c, 0xfc, + 0x34, 0x3d, 0x35, 0xf6, 0x13, 0xe6, 0xd9, 0x09, + 0x3d, 0x38, 0xe9, 0x75, 0xc3, 0x8f, 0xe3, 0xb8, + 0xbe, 0x3f, 0xa6, 0x6b, 0x6c, 0xe7, 0x80, 0x8a, + 0xa3, 0xe4, 0x59, 0x49, 0xf9, 0x44, 0x64, 0x9f, + 0xd0, 0xf6, 0x37, 0x81, 0x71, 0xea, 0xe4, 0x39, + 0x6e, 0xa1, 0x5d, 0xc2, 0x40, 0xd1, 0xab, 0xf4, + 0x66, 0x68, 0xdb, 0xc8, 0xf5, 0xf2, 0x0e, 0xf2, + 0xb3, 0xf3, 0x8f, 0x00, 0xe2, 0x03, 0x17, 0x88 +}; +static const u8 enc_output114[] __initconst = { + 0xcb, 0x9a, 0x0d, 0xb1, 0x8d, 0x63, 0xd7, 0xea, + 0xd7, 0xc9, 0x60, 0xd6, 0xb2, 0x86, 0x74, 0x5f, + 0xb3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xde, 0xba, 0xb4, 0xa1, 0x58, 0x42, 0x50, 0xbf, + 0xfc, 0x2f, 0xc8, 0x4d, 0x95, 0xde, 0xcf, 0x04, + 0xb3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xde, 0xba, 0xb4, 0xa1, 0x58, 0x42, 0x50, 0xbf, + 0xfc, 0x2f, 0xc8, 0x4d, 0x95, 0xde, 0xcf, 0x04, + 0x23, 0x83, 0xab, 0x0b, 0x79, 0x92, 0x05, 0x69, + 0x9b, 0x51, 0x0a, 0xa7, 0x09, 0xbf, 0x31, 0xf1 +}; +static const u8 enc_assoc114[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce114[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key114[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input115[] __initconst = { + 0x8f, 0x27, 0x86, 0x94, 0xc4, 0xe9, 0xda, 0xeb, + 0xd5, 0x8d, 0x3e, 0x5b, 0x96, 0x6e, 0x8b, 0x68, + 0x42, 0x3d, 0x35, 0xf6, 0x13, 0xe6, 0xd9, 0x09, + 0x3d, 0x38, 0xe9, 0x75, 0xc3, 0x8f, 0xe3, 0xb8, + 0x06, 0x53, 0xe7, 0xa3, 0x31, 0x71, 0x88, 0x33, + 0xac, 0xc3, 0xb9, 0xad, 0xff, 0x1c, 0x31, 0x98, + 0xa6, 0xf6, 0x37, 0x81, 0x71, 0xea, 0xe4, 0x39, + 0x6e, 0xa1, 0x5d, 0xc2, 0x40, 0xd1, 0xab, 0xf4, + 0xde, 0x04, 0x9a, 0x00, 0xa8, 0x64, 0x06, 0x4b, + 0xbc, 0xd4, 0x6f, 0xe4, 0xe4, 0x5b, 0x42, 0x8f +}; +static const u8 enc_output115[] __initconst = { + 0x8f, 0x4c, 0x51, 0xbb, 0x42, 0x23, 0x3a, 0x72, + 0x76, 0xa2, 0xc0, 0x91, 0x2a, 0x88, 0xf3, 0xcb, + 0xc5, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x66, 0xd6, 0xf5, 0x69, 0x05, 0xd4, 0x58, 0x06, + 0xf3, 0x08, 0x28, 0xa9, 0x93, 0x86, 0x9a, 0x03, + 0xc5, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x66, 0xd6, 0xf5, 0x69, 0x05, 0xd4, 0x58, 0x06, + 0xf3, 0x08, 0x28, 0xa9, 0x93, 0x86, 0x9a, 0x03, + 0x8b, 0xfb, 0xab, 0x17, 0xa9, 0xe0, 0xb8, 0x74, + 0x8b, 0x51, 0x0a, 0xe7, 0xd9, 0xfd, 0x23, 0x05 +}; +static const u8 enc_assoc115[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce115[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key115[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input116[] __initconst = { + 0xd5, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66, + 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c, + 0x9a, 0x22, 0xd7, 0x0a, 0x48, 0xe2, 0x4f, 0xdd, + 0xcd, 0xd4, 0x41, 0x9d, 0xe6, 0x4c, 0x8f, 0x44, + 0xfc, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca, + 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64, + 0x77, 0xb5, 0xc9, 0x07, 0xd9, 0xc9, 0xe1, 0xea, + 0x51, 0x85, 0x1a, 0x20, 0x4a, 0xad, 0x9f, 0x0a, + 0x24, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2, + 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73, + 0x3f, 0x91, 0xf8, 0xe7, 0xc7, 0xb1, 0x96, 0x25, + 0x64, 0x61, 0x9c, 0x5e, 0x7e, 0x9b, 0xf6, 0x13 +}; +static const u8 enc_output116[] __initconst = { + 0xd5, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x1d, 0xe0, 0x1d, 0x03, 0xa4, 0xfb, 0x69, 0x2b, + 0x0f, 0x13, 0x57, 0x17, 0xda, 0x3c, 0x93, 0x03, + 0x9c, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x14, 0xbc, 0x01, 0x79, 0x57, 0xdc, 0xfa, 0x2c, + 0xc0, 0xdb, 0xb8, 0x1d, 0xf5, 0x83, 0xcb, 0x01, + 0x9c, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x14, 0xbc, 0x01, 0x79, 0x57, 0xdc, 0xfa, 0x2c, + 0xc0, 0xdb, 0xb8, 0x1d, 0xf5, 0x83, 0xcb, 0x01, + 0x49, 0xbc, 0x6e, 0x9f, 0xc5, 0x1c, 0x4d, 0x50, + 0x30, 0x36, 0x64, 0x4d, 0x84, 0x27, 0x73, 0xd2 +}; +static const u8 enc_assoc116[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce116[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key116[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input117[] __initconst = { + 0xdb, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66, + 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c, + 0x75, 0xd5, 0x64, 0x3a, 0xa5, 0xaf, 0x93, 0x4d, + 0x8c, 0xce, 0x39, 0x2c, 0xc3, 0xee, 0xdb, 0x47, + 0xc0, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca, + 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64, + 0x60, 0x1b, 0x5a, 0xd2, 0x06, 0x7f, 0x28, 0x06, + 0x6a, 0x8f, 0x32, 0x81, 0x71, 0x5b, 0xa8, 0x08, + 0x18, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2, + 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73, + 0x28, 0x3f, 0x6b, 0x32, 0x18, 0x07, 0x5f, 0xc9, + 0x5f, 0x6b, 0xb4, 0xff, 0x45, 0x6d, 0xc1, 0x11 +}; +static const u8 enc_output117[] __initconst = { + 0xdb, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xf2, 0x17, 0xae, 0x33, 0x49, 0xb6, 0xb5, 0xbb, + 0x4e, 0x09, 0x2f, 0xa6, 0xff, 0x9e, 0xc7, 0x00, + 0xa0, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x03, 0x12, 0x92, 0xac, 0x88, 0x6a, 0x33, 0xc0, + 0xfb, 0xd1, 0x90, 0xbc, 0xce, 0x75, 0xfc, 0x03, + 0xa0, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0x03, 0x12, 0x92, 0xac, 0x88, 0x6a, 0x33, 0xc0, + 0xfb, 0xd1, 0x90, 0xbc, 0xce, 0x75, 0xfc, 0x03, + 0x63, 0xda, 0x6e, 0xa2, 0x51, 0xf0, 0x39, 0x53, + 0x2c, 0x36, 0x64, 0x5d, 0x38, 0xb7, 0x6f, 0xd7 +}; +static const u8 enc_assoc117[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce117[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key117[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +/* wycheproof - edge case intermediate sums in poly1305 */ +static const u8 enc_input118[] __initconst = { + 0x93, 0x94, 0x28, 0xd0, 0x79, 0x35, 0x1f, 0x66, + 0x5c, 0xd0, 0x01, 0x35, 0x43, 0x19, 0x87, 0x5c, + 0x62, 0x48, 0x39, 0x60, 0x42, 0x16, 0xe4, 0x03, + 0xeb, 0xcc, 0x6a, 0xf5, 0x59, 0xec, 0x8b, 0x43, + 0x97, 0x7a, 0xed, 0x35, 0xcb, 0x5a, 0x2f, 0xca, + 0xa0, 0x34, 0x6e, 0xfb, 0x93, 0x65, 0x54, 0x64, + 0xd8, 0xc8, 0xc3, 0xfa, 0x1a, 0x9e, 0x47, 0x4a, + 0xbe, 0x52, 0xd0, 0x2c, 0x81, 0x87, 0xe9, 0x0f, + 0x4f, 0x2d, 0x90, 0x96, 0x52, 0x4f, 0xa1, 0xb2, + 0xb0, 0x23, 0xb8, 0xb2, 0x88, 0x22, 0x27, 0x73, + 0x90, 0xec, 0xf2, 0x1a, 0x04, 0xe6, 0x30, 0x85, + 0x8b, 0xb6, 0x56, 0x52, 0xb5, 0xb1, 0x80, 0x16 +}; +static const u8 enc_output118[] __initconst = { + 0x93, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xe5, 0x8a, 0xf3, 0x69, 0xae, 0x0f, 0xc2, 0xf5, + 0x29, 0x0b, 0x7c, 0x7f, 0x65, 0x9c, 0x97, 0x04, + 0xf7, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xbb, 0xc1, 0x0b, 0x84, 0x94, 0x8b, 0x5c, 0x8c, + 0x2f, 0x0c, 0x72, 0x11, 0x3e, 0xa9, 0xbd, 0x04, + 0xf7, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xbb, 0xc1, 0x0b, 0x84, 0x94, 0x8b, 0x5c, 0x8c, + 0x2f, 0x0c, 0x72, 0x11, 0x3e, 0xa9, 0xbd, 0x04, + 0x73, 0xeb, 0x27, 0x24, 0xb5, 0xc4, 0x05, 0xf0, + 0x4d, 0x00, 0xd0, 0xf1, 0x58, 0x40, 0xa1, 0xc1 +}; +static const u8 enc_assoc118[] __initconst = { + 0xff, 0xff, 0xff, 0xff +}; +static const u8 enc_nonce118[] __initconst = { + 0x00, 0x00, 0x00, 0x00, 0x06, 0x4c, 0x2d, 0x52 +}; +static const u8 enc_key118[] __initconst = { + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f +}; + +static const struct chacha20poly1305_testvec +chacha20poly1305_enc_vectors[] __initconst = { + { enc_input001, enc_output001, enc_assoc001, enc_nonce001, enc_key001, + sizeof(enc_input001), sizeof(enc_assoc001), sizeof(enc_nonce001) }, + { enc_input002, enc_output002, enc_assoc002, enc_nonce002, enc_key002, + sizeof(enc_input002), sizeof(enc_assoc002), sizeof(enc_nonce002) }, + { enc_input003, enc_output003, enc_assoc003, enc_nonce003, enc_key003, + sizeof(enc_input003), sizeof(enc_assoc003), sizeof(enc_nonce003) }, + { enc_input004, enc_output004, enc_assoc004, enc_nonce004, enc_key004, + sizeof(enc_input004), sizeof(enc_assoc004), sizeof(enc_nonce004) }, + { enc_input005, enc_output005, enc_assoc005, enc_nonce005, enc_key005, + sizeof(enc_input005), sizeof(enc_assoc005), sizeof(enc_nonce005) }, + { enc_input006, enc_output006, enc_assoc006, enc_nonce006, enc_key006, + sizeof(enc_input006), sizeof(enc_assoc006), sizeof(enc_nonce006) }, + { enc_input007, enc_output007, enc_assoc007, enc_nonce007, enc_key007, + sizeof(enc_input007), sizeof(enc_assoc007), sizeof(enc_nonce007) }, + { enc_input008, enc_output008, enc_assoc008, enc_nonce008, enc_key008, + sizeof(enc_input008), sizeof(enc_assoc008), sizeof(enc_nonce008) }, + { enc_input009, enc_output009, enc_assoc009, enc_nonce009, enc_key009, + sizeof(enc_input009), sizeof(enc_assoc009), sizeof(enc_nonce009) }, + { enc_input010, enc_output010, enc_assoc010, enc_nonce010, enc_key010, + sizeof(enc_input010), sizeof(enc_assoc010), sizeof(enc_nonce010) }, + { enc_input011, enc_output011, enc_assoc011, enc_nonce011, enc_key011, + sizeof(enc_input011), sizeof(enc_assoc011), sizeof(enc_nonce011) }, + { enc_input012, enc_output012, enc_assoc012, enc_nonce012, enc_key012, + sizeof(enc_input012), sizeof(enc_assoc012), sizeof(enc_nonce012) }, + { enc_input013, enc_output013, enc_assoc013, enc_nonce013, enc_key013, + sizeof(enc_input013), sizeof(enc_assoc013), sizeof(enc_nonce013) }, + { enc_input014, enc_output014, enc_assoc014, enc_nonce014, enc_key014, + sizeof(enc_input014), sizeof(enc_assoc014), sizeof(enc_nonce014) }, + { enc_input015, enc_output015, enc_assoc015, enc_nonce015, enc_key015, + sizeof(enc_input015), sizeof(enc_assoc015), sizeof(enc_nonce015) }, + { enc_input016, enc_output016, enc_assoc016, enc_nonce016, enc_key016, + sizeof(enc_input016), sizeof(enc_assoc016), sizeof(enc_nonce016) }, + { enc_input017, enc_output017, enc_assoc017, enc_nonce017, enc_key017, + sizeof(enc_input017), sizeof(enc_assoc017), sizeof(enc_nonce017) }, + { enc_input018, enc_output018, enc_assoc018, enc_nonce018, enc_key018, + sizeof(enc_input018), sizeof(enc_assoc018), sizeof(enc_nonce018) }, + { enc_input019, enc_output019, enc_assoc019, enc_nonce019, enc_key019, + sizeof(enc_input019), sizeof(enc_assoc019), sizeof(enc_nonce019) }, + { enc_input020, enc_output020, enc_assoc020, enc_nonce020, enc_key020, + sizeof(enc_input020), sizeof(enc_assoc020), sizeof(enc_nonce020) }, + { enc_input021, enc_output021, enc_assoc021, enc_nonce021, enc_key021, + sizeof(enc_input021), sizeof(enc_assoc021), sizeof(enc_nonce021) }, + { enc_input022, enc_output022, enc_assoc022, enc_nonce022, enc_key022, + sizeof(enc_input022), sizeof(enc_assoc022), sizeof(enc_nonce022) }, + { enc_input023, enc_output023, enc_assoc023, enc_nonce023, enc_key023, + sizeof(enc_input023), sizeof(enc_assoc023), sizeof(enc_nonce023) }, + { enc_input024, enc_output024, enc_assoc024, enc_nonce024, enc_key024, + sizeof(enc_input024), sizeof(enc_assoc024), sizeof(enc_nonce024) }, + { enc_input025, enc_output025, enc_assoc025, enc_nonce025, enc_key025, + sizeof(enc_input025), sizeof(enc_assoc025), sizeof(enc_nonce025) }, + { enc_input026, enc_output026, enc_assoc026, enc_nonce026, enc_key026, + sizeof(enc_input026), sizeof(enc_assoc026), sizeof(enc_nonce026) }, + { enc_input027, enc_output027, enc_assoc027, enc_nonce027, enc_key027, + sizeof(enc_input027), sizeof(enc_assoc027), sizeof(enc_nonce027) }, + { enc_input028, enc_output028, enc_assoc028, enc_nonce028, enc_key028, + sizeof(enc_input028), sizeof(enc_assoc028), sizeof(enc_nonce028) }, + { enc_input029, enc_output029, enc_assoc029, enc_nonce029, enc_key029, + sizeof(enc_input029), sizeof(enc_assoc029), sizeof(enc_nonce029) }, + { enc_input030, enc_output030, enc_assoc030, enc_nonce030, enc_key030, + sizeof(enc_input030), sizeof(enc_assoc030), sizeof(enc_nonce030) }, + { enc_input031, enc_output031, enc_assoc031, enc_nonce031, enc_key031, + sizeof(enc_input031), sizeof(enc_assoc031), sizeof(enc_nonce031) }, + { enc_input032, enc_output032, enc_assoc032, enc_nonce032, enc_key032, + sizeof(enc_input032), sizeof(enc_assoc032), sizeof(enc_nonce032) }, + { enc_input033, enc_output033, enc_assoc033, enc_nonce033, enc_key033, + sizeof(enc_input033), sizeof(enc_assoc033), sizeof(enc_nonce033) }, + { enc_input034, enc_output034, enc_assoc034, enc_nonce034, enc_key034, + sizeof(enc_input034), sizeof(enc_assoc034), sizeof(enc_nonce034) }, + { enc_input035, enc_output035, enc_assoc035, enc_nonce035, enc_key035, + sizeof(enc_input035), sizeof(enc_assoc035), sizeof(enc_nonce035) }, + { enc_input036, enc_output036, enc_assoc036, enc_nonce036, enc_key036, + sizeof(enc_input036), sizeof(enc_assoc036), sizeof(enc_nonce036) }, + { enc_input037, enc_output037, enc_assoc037, enc_nonce037, enc_key037, + sizeof(enc_input037), sizeof(enc_assoc037), sizeof(enc_nonce037) }, + { enc_input038, enc_output038, enc_assoc038, enc_nonce038, enc_key038, + sizeof(enc_input038), sizeof(enc_assoc038), sizeof(enc_nonce038) }, + { enc_input039, enc_output039, enc_assoc039, enc_nonce039, enc_key039, + sizeof(enc_input039), sizeof(enc_assoc039), sizeof(enc_nonce039) }, + { enc_input040, enc_output040, enc_assoc040, enc_nonce040, enc_key040, + sizeof(enc_input040), sizeof(enc_assoc040), sizeof(enc_nonce040) }, + { enc_input041, enc_output041, enc_assoc041, enc_nonce041, enc_key041, + sizeof(enc_input041), sizeof(enc_assoc041), sizeof(enc_nonce041) }, + { enc_input042, enc_output042, enc_assoc042, enc_nonce042, enc_key042, + sizeof(enc_input042), sizeof(enc_assoc042), sizeof(enc_nonce042) }, + { enc_input043, enc_output043, enc_assoc043, enc_nonce043, enc_key043, + sizeof(enc_input043), sizeof(enc_assoc043), sizeof(enc_nonce043) }, + { enc_input044, enc_output044, enc_assoc044, enc_nonce044, enc_key044, + sizeof(enc_input044), sizeof(enc_assoc044), sizeof(enc_nonce044) }, + { enc_input045, enc_output045, enc_assoc045, enc_nonce045, enc_key045, + sizeof(enc_input045), sizeof(enc_assoc045), sizeof(enc_nonce045) }, + { enc_input046, enc_output046, enc_assoc046, enc_nonce046, enc_key046, + sizeof(enc_input046), sizeof(enc_assoc046), sizeof(enc_nonce046) }, + { enc_input047, enc_output047, enc_assoc047, enc_nonce047, enc_key047, + sizeof(enc_input047), sizeof(enc_assoc047), sizeof(enc_nonce047) }, + { enc_input048, enc_output048, enc_assoc048, enc_nonce048, enc_key048, + sizeof(enc_input048), sizeof(enc_assoc048), sizeof(enc_nonce048) }, + { enc_input049, enc_output049, enc_assoc049, enc_nonce049, enc_key049, + sizeof(enc_input049), sizeof(enc_assoc049), sizeof(enc_nonce049) }, + { enc_input050, enc_output050, enc_assoc050, enc_nonce050, enc_key050, + sizeof(enc_input050), sizeof(enc_assoc050), sizeof(enc_nonce050) }, + { enc_input051, enc_output051, enc_assoc051, enc_nonce051, enc_key051, + sizeof(enc_input051), sizeof(enc_assoc051), sizeof(enc_nonce051) }, + { enc_input052, enc_output052, enc_assoc052, enc_nonce052, enc_key052, + sizeof(enc_input052), sizeof(enc_assoc052), sizeof(enc_nonce052) }, + { enc_input053, enc_output053, enc_assoc053, enc_nonce053, enc_key053, + sizeof(enc_input053), sizeof(enc_assoc053), sizeof(enc_nonce053) }, + { enc_input054, enc_output054, enc_assoc054, enc_nonce054, enc_key054, + sizeof(enc_input054), sizeof(enc_assoc054), sizeof(enc_nonce054) }, + { enc_input055, enc_output055, enc_assoc055, enc_nonce055, enc_key055, + sizeof(enc_input055), sizeof(enc_assoc055), sizeof(enc_nonce055) }, + { enc_input056, enc_output056, enc_assoc056, enc_nonce056, enc_key056, + sizeof(enc_input056), sizeof(enc_assoc056), sizeof(enc_nonce056) }, + { enc_input057, enc_output057, enc_assoc057, enc_nonce057, enc_key057, + sizeof(enc_input057), sizeof(enc_assoc057), sizeof(enc_nonce057) }, + { enc_input058, enc_output058, enc_assoc058, enc_nonce058, enc_key058, + sizeof(enc_input058), sizeof(enc_assoc058), sizeof(enc_nonce058) }, + { enc_input059, enc_output059, enc_assoc059, enc_nonce059, enc_key059, + sizeof(enc_input059), sizeof(enc_assoc059), sizeof(enc_nonce059) }, + { enc_input060, enc_output060, enc_assoc060, enc_nonce060, enc_key060, + sizeof(enc_input060), sizeof(enc_assoc060), sizeof(enc_nonce060) }, + { enc_input061, enc_output061, enc_assoc061, enc_nonce061, enc_key061, + sizeof(enc_input061), sizeof(enc_assoc061), sizeof(enc_nonce061) }, + { enc_input062, enc_output062, enc_assoc062, enc_nonce062, enc_key062, + sizeof(enc_input062), sizeof(enc_assoc062), sizeof(enc_nonce062) }, + { enc_input063, enc_output063, enc_assoc063, enc_nonce063, enc_key063, + sizeof(enc_input063), sizeof(enc_assoc063), sizeof(enc_nonce063) }, + { enc_input064, enc_output064, enc_assoc064, enc_nonce064, enc_key064, + sizeof(enc_input064), sizeof(enc_assoc064), sizeof(enc_nonce064) }, + { enc_input065, enc_output065, enc_assoc065, enc_nonce065, enc_key065, + sizeof(enc_input065), sizeof(enc_assoc065), sizeof(enc_nonce065) }, + { enc_input066, enc_output066, enc_assoc066, enc_nonce066, enc_key066, + sizeof(enc_input066), sizeof(enc_assoc066), sizeof(enc_nonce066) }, + { enc_input067, enc_output067, enc_assoc067, enc_nonce067, enc_key067, + sizeof(enc_input067), sizeof(enc_assoc067), sizeof(enc_nonce067) }, + { enc_input068, enc_output068, enc_assoc068, enc_nonce068, enc_key068, + sizeof(enc_input068), sizeof(enc_assoc068), sizeof(enc_nonce068) }, + { enc_input069, enc_output069, enc_assoc069, enc_nonce069, enc_key069, + sizeof(enc_input069), sizeof(enc_assoc069), sizeof(enc_nonce069) }, + { enc_input070, enc_output070, enc_assoc070, enc_nonce070, enc_key070, + sizeof(enc_input070), sizeof(enc_assoc070), sizeof(enc_nonce070) }, + { enc_input071, enc_output071, enc_assoc071, enc_nonce071, enc_key071, + sizeof(enc_input071), sizeof(enc_assoc071), sizeof(enc_nonce071) }, + { enc_input072, enc_output072, enc_assoc072, enc_nonce072, enc_key072, + sizeof(enc_input072), sizeof(enc_assoc072), sizeof(enc_nonce072) }, + { enc_input073, enc_output073, enc_assoc073, enc_nonce073, enc_key073, + sizeof(enc_input073), sizeof(enc_assoc073), sizeof(enc_nonce073) }, + { enc_input074, enc_output074, enc_assoc074, enc_nonce074, enc_key074, + sizeof(enc_input074), sizeof(enc_assoc074), sizeof(enc_nonce074) }, + { enc_input075, enc_output075, enc_assoc075, enc_nonce075, enc_key075, + sizeof(enc_input075), sizeof(enc_assoc075), sizeof(enc_nonce075) }, + { enc_input076, enc_output076, enc_assoc076, enc_nonce076, enc_key076, + sizeof(enc_input076), sizeof(enc_assoc076), sizeof(enc_nonce076) }, + { enc_input077, enc_output077, enc_assoc077, enc_nonce077, enc_key077, + sizeof(enc_input077), sizeof(enc_assoc077), sizeof(enc_nonce077) }, + { enc_input078, enc_output078, enc_assoc078, enc_nonce078, enc_key078, + sizeof(enc_input078), sizeof(enc_assoc078), sizeof(enc_nonce078) }, + { enc_input079, enc_output079, enc_assoc079, enc_nonce079, enc_key079, + sizeof(enc_input079), sizeof(enc_assoc079), sizeof(enc_nonce079) }, + { enc_input080, enc_output080, enc_assoc080, enc_nonce080, enc_key080, + sizeof(enc_input080), sizeof(enc_assoc080), sizeof(enc_nonce080) }, + { enc_input081, enc_output081, enc_assoc081, enc_nonce081, enc_key081, + sizeof(enc_input081), sizeof(enc_assoc081), sizeof(enc_nonce081) }, + { enc_input082, enc_output082, enc_assoc082, enc_nonce082, enc_key082, + sizeof(enc_input082), sizeof(enc_assoc082), sizeof(enc_nonce082) }, + { enc_input083, enc_output083, enc_assoc083, enc_nonce083, enc_key083, + sizeof(enc_input083), sizeof(enc_assoc083), sizeof(enc_nonce083) }, + { enc_input084, enc_output084, enc_assoc084, enc_nonce084, enc_key084, + sizeof(enc_input084), sizeof(enc_assoc084), sizeof(enc_nonce084) }, + { enc_input085, enc_output085, enc_assoc085, enc_nonce085, enc_key085, + sizeof(enc_input085), sizeof(enc_assoc085), sizeof(enc_nonce085) }, + { enc_input086, enc_output086, enc_assoc086, enc_nonce086, enc_key086, + sizeof(enc_input086), sizeof(enc_assoc086), sizeof(enc_nonce086) }, + { enc_input087, enc_output087, enc_assoc087, enc_nonce087, enc_key087, + sizeof(enc_input087), sizeof(enc_assoc087), sizeof(enc_nonce087) }, + { enc_input088, enc_output088, enc_assoc088, enc_nonce088, enc_key088, + sizeof(enc_input088), sizeof(enc_assoc088), sizeof(enc_nonce088) }, + { enc_input089, enc_output089, enc_assoc089, enc_nonce089, enc_key089, + sizeof(enc_input089), sizeof(enc_assoc089), sizeof(enc_nonce089) }, + { enc_input090, enc_output090, enc_assoc090, enc_nonce090, enc_key090, + sizeof(enc_input090), sizeof(enc_assoc090), sizeof(enc_nonce090) }, + { enc_input091, enc_output091, enc_assoc091, enc_nonce091, enc_key091, + sizeof(enc_input091), sizeof(enc_assoc091), sizeof(enc_nonce091) }, + { enc_input092, enc_output092, enc_assoc092, enc_nonce092, enc_key092, + sizeof(enc_input092), sizeof(enc_assoc092), sizeof(enc_nonce092) }, + { enc_input093, enc_output093, enc_assoc093, enc_nonce093, enc_key093, + sizeof(enc_input093), sizeof(enc_assoc093), sizeof(enc_nonce093) }, + { enc_input094, enc_output094, enc_assoc094, enc_nonce094, enc_key094, + sizeof(enc_input094), sizeof(enc_assoc094), sizeof(enc_nonce094) }, + { enc_input095, enc_output095, enc_assoc095, enc_nonce095, enc_key095, + sizeof(enc_input095), sizeof(enc_assoc095), sizeof(enc_nonce095) }, + { enc_input096, enc_output096, enc_assoc096, enc_nonce096, enc_key096, + sizeof(enc_input096), sizeof(enc_assoc096), sizeof(enc_nonce096) }, + { enc_input097, enc_output097, enc_assoc097, enc_nonce097, enc_key097, + sizeof(enc_input097), sizeof(enc_assoc097), sizeof(enc_nonce097) }, + { enc_input098, enc_output098, enc_assoc098, enc_nonce098, enc_key098, + sizeof(enc_input098), sizeof(enc_assoc098), sizeof(enc_nonce098) }, + { enc_input099, enc_output099, enc_assoc099, enc_nonce099, enc_key099, + sizeof(enc_input099), sizeof(enc_assoc099), sizeof(enc_nonce099) }, + { enc_input100, enc_output100, enc_assoc100, enc_nonce100, enc_key100, + sizeof(enc_input100), sizeof(enc_assoc100), sizeof(enc_nonce100) }, + { enc_input101, enc_output101, enc_assoc101, enc_nonce101, enc_key101, + sizeof(enc_input101), sizeof(enc_assoc101), sizeof(enc_nonce101) }, + { enc_input102, enc_output102, enc_assoc102, enc_nonce102, enc_key102, + sizeof(enc_input102), sizeof(enc_assoc102), sizeof(enc_nonce102) }, + { enc_input103, enc_output103, enc_assoc103, enc_nonce103, enc_key103, + sizeof(enc_input103), sizeof(enc_assoc103), sizeof(enc_nonce103) }, + { enc_input104, enc_output104, enc_assoc104, enc_nonce104, enc_key104, + sizeof(enc_input104), sizeof(enc_assoc104), sizeof(enc_nonce104) }, + { enc_input105, enc_output105, enc_assoc105, enc_nonce105, enc_key105, + sizeof(enc_input105), sizeof(enc_assoc105), sizeof(enc_nonce105) }, + { enc_input106, enc_output106, enc_assoc106, enc_nonce106, enc_key106, + sizeof(enc_input106), sizeof(enc_assoc106), sizeof(enc_nonce106) }, + { enc_input107, enc_output107, enc_assoc107, enc_nonce107, enc_key107, + sizeof(enc_input107), sizeof(enc_assoc107), sizeof(enc_nonce107) }, + { enc_input108, enc_output108, enc_assoc108, enc_nonce108, enc_key108, + sizeof(enc_input108), sizeof(enc_assoc108), sizeof(enc_nonce108) }, + { enc_input109, enc_output109, enc_assoc109, enc_nonce109, enc_key109, + sizeof(enc_input109), sizeof(enc_assoc109), sizeof(enc_nonce109) }, + { enc_input110, enc_output110, enc_assoc110, enc_nonce110, enc_key110, + sizeof(enc_input110), sizeof(enc_assoc110), sizeof(enc_nonce110) }, + { enc_input111, enc_output111, enc_assoc111, enc_nonce111, enc_key111, + sizeof(enc_input111), sizeof(enc_assoc111), sizeof(enc_nonce111) }, + { enc_input112, enc_output112, enc_assoc112, enc_nonce112, enc_key112, + sizeof(enc_input112), sizeof(enc_assoc112), sizeof(enc_nonce112) }, + { enc_input113, enc_output113, enc_assoc113, enc_nonce113, enc_key113, + sizeof(enc_input113), sizeof(enc_assoc113), sizeof(enc_nonce113) }, + { enc_input114, enc_output114, enc_assoc114, enc_nonce114, enc_key114, + sizeof(enc_input114), sizeof(enc_assoc114), sizeof(enc_nonce114) }, + { enc_input115, enc_output115, enc_assoc115, enc_nonce115, enc_key115, + sizeof(enc_input115), sizeof(enc_assoc115), sizeof(enc_nonce115) }, + { enc_input116, enc_output116, enc_assoc116, enc_nonce116, enc_key116, + sizeof(enc_input116), sizeof(enc_assoc116), sizeof(enc_nonce116) }, + { enc_input117, enc_output117, enc_assoc117, enc_nonce117, enc_key117, + sizeof(enc_input117), sizeof(enc_assoc117), sizeof(enc_nonce117) }, + { enc_input118, enc_output118, enc_assoc118, enc_nonce118, enc_key118, + sizeof(enc_input118), sizeof(enc_assoc118), sizeof(enc_nonce118) } +}; + +static const u8 dec_input001[] __initconst = { + 0x64, 0xa0, 0x86, 0x15, 0x75, 0x86, 0x1a, 0xf4, + 0x60, 0xf0, 0x62, 0xc7, 0x9b, 0xe6, 0x43, 0xbd, + 0x5e, 0x80, 0x5c, 0xfd, 0x34, 0x5c, 0xf3, 0x89, + 0xf1, 0x08, 0x67, 0x0a, 0xc7, 0x6c, 0x8c, 0xb2, + 0x4c, 0x6c, 0xfc, 0x18, 0x75, 0x5d, 0x43, 0xee, + 0xa0, 0x9e, 0xe9, 0x4e, 0x38, 0x2d, 0x26, 0xb0, + 0xbd, 0xb7, 0xb7, 0x3c, 0x32, 0x1b, 0x01, 0x00, + 0xd4, 0xf0, 0x3b, 0x7f, 0x35, 0x58, 0x94, 0xcf, + 0x33, 0x2f, 0x83, 0x0e, 0x71, 0x0b, 0x97, 0xce, + 0x98, 0xc8, 0xa8, 0x4a, 0xbd, 0x0b, 0x94, 0x81, + 0x14, 0xad, 0x17, 0x6e, 0x00, 0x8d, 0x33, 0xbd, + 0x60, 0xf9, 0x82, 0xb1, 0xff, 0x37, 0xc8, 0x55, + 0x97, 0x97, 0xa0, 0x6e, 0xf4, 0xf0, 0xef, 0x61, + 0xc1, 0x86, 0x32, 0x4e, 0x2b, 0x35, 0x06, 0x38, + 0x36, 0x06, 0x90, 0x7b, 0x6a, 0x7c, 0x02, 0xb0, + 0xf9, 0xf6, 0x15, 0x7b, 0x53, 0xc8, 0x67, 0xe4, + 0xb9, 0x16, 0x6c, 0x76, 0x7b, 0x80, 0x4d, 0x46, + 0xa5, 0x9b, 0x52, 0x16, 0xcd, 0xe7, 0xa4, 0xe9, + 0x90, 0x40, 0xc5, 0xa4, 0x04, 0x33, 0x22, 0x5e, + 0xe2, 0x82, 0xa1, 0xb0, 0xa0, 0x6c, 0x52, 0x3e, + 0xaf, 0x45, 0x34, 0xd7, 0xf8, 0x3f, 0xa1, 0x15, + 0x5b, 0x00, 0x47, 0x71, 0x8c, 0xbc, 0x54, 0x6a, + 0x0d, 0x07, 0x2b, 0x04, 0xb3, 0x56, 0x4e, 0xea, + 0x1b, 0x42, 0x22, 0x73, 0xf5, 0x48, 0x27, 0x1a, + 0x0b, 0xb2, 0x31, 0x60, 0x53, 0xfa, 0x76, 0x99, + 0x19, 0x55, 0xeb, 0xd6, 0x31, 0x59, 0x43, 0x4e, + 0xce, 0xbb, 0x4e, 0x46, 0x6d, 0xae, 0x5a, 0x10, + 0x73, 0xa6, 0x72, 0x76, 0x27, 0x09, 0x7a, 0x10, + 0x49, 0xe6, 0x17, 0xd9, 0x1d, 0x36, 0x10, 0x94, + 0xfa, 0x68, 0xf0, 0xff, 0x77, 0x98, 0x71, 0x30, + 0x30, 0x5b, 0xea, 0xba, 0x2e, 0xda, 0x04, 0xdf, + 0x99, 0x7b, 0x71, 0x4d, 0x6c, 0x6f, 0x2c, 0x29, + 0xa6, 0xad, 0x5c, 0xb4, 0x02, 0x2b, 0x02, 0x70, + 0x9b, 0xee, 0xad, 0x9d, 0x67, 0x89, 0x0c, 0xbb, + 0x22, 0x39, 0x23, 0x36, 0xfe, 0xa1, 0x85, 0x1f, + 0x38 +}; +static const u8 dec_output001[] __initconst = { + 0x49, 0x6e, 0x74, 0x65, 0x72, 0x6e, 0x65, 0x74, + 0x2d, 0x44, 0x72, 0x61, 0x66, 0x74, 0x73, 0x20, + 0x61, 0x72, 0x65, 0x20, 0x64, 0x72, 0x61, 0x66, + 0x74, 0x20, 0x64, 0x6f, 0x63, 0x75, 0x6d, 0x65, + 0x6e, 0x74, 0x73, 0x20, 0x76, 0x61, 0x6c, 0x69, + 0x64, 0x20, 0x66, 0x6f, 0x72, 0x20, 0x61, 0x20, + 0x6d, 0x61, 0x78, 0x69, 0x6d, 0x75, 0x6d, 0x20, + 0x6f, 0x66, 0x20, 0x73, 0x69, 0x78, 0x20, 0x6d, + 0x6f, 0x6e, 0x74, 0x68, 0x73, 0x20, 0x61, 0x6e, + 0x64, 0x20, 0x6d, 0x61, 0x79, 0x20, 0x62, 0x65, + 0x20, 0x75, 0x70, 0x64, 0x61, 0x74, 0x65, 0x64, + 0x2c, 0x20, 0x72, 0x65, 0x70, 0x6c, 0x61, 0x63, + 0x65, 0x64, 0x2c, 0x20, 0x6f, 0x72, 0x20, 0x6f, + 0x62, 0x73, 0x6f, 0x6c, 0x65, 0x74, 0x65, 0x64, + 0x20, 0x62, 0x79, 0x20, 0x6f, 0x74, 0x68, 0x65, + 0x72, 0x20, 0x64, 0x6f, 0x63, 0x75, 0x6d, 0x65, + 0x6e, 0x74, 0x73, 0x20, 0x61, 0x74, 0x20, 0x61, + 0x6e, 0x79, 0x20, 0x74, 0x69, 0x6d, 0x65, 0x2e, + 0x20, 0x49, 0x74, 0x20, 0x69, 0x73, 0x20, 0x69, + 0x6e, 0x61, 0x70, 0x70, 0x72, 0x6f, 0x70, 0x72, + 0x69, 0x61, 0x74, 0x65, 0x20, 0x74, 0x6f, 0x20, + 0x75, 0x73, 0x65, 0x20, 0x49, 0x6e, 0x74, 0x65, + 0x72, 0x6e, 0x65, 0x74, 0x2d, 0x44, 0x72, 0x61, + 0x66, 0x74, 0x73, 0x20, 0x61, 0x73, 0x20, 0x72, + 0x65, 0x66, 0x65, 0x72, 0x65, 0x6e, 0x63, 0x65, + 0x20, 0x6d, 0x61, 0x74, 0x65, 0x72, 0x69, 0x61, + 0x6c, 0x20, 0x6f, 0x72, 0x20, 0x74, 0x6f, 0x20, + 0x63, 0x69, 0x74, 0x65, 0x20, 0x74, 0x68, 0x65, + 0x6d, 0x20, 0x6f, 0x74, 0x68, 0x65, 0x72, 0x20, + 0x74, 0x68, 0x61, 0x6e, 0x20, 0x61, 0x73, 0x20, + 0x2f, 0xe2, 0x80, 0x9c, 0x77, 0x6f, 0x72, 0x6b, + 0x20, 0x69, 0x6e, 0x20, 0x70, 0x72, 0x6f, 0x67, + 0x72, 0x65, 0x73, 0x73, 0x2e, 0x2f, 0xe2, 0x80, + 0x9d +}; +static const u8 dec_assoc001[] __initconst = { + 0xf3, 0x33, 0x88, 0x86, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x4e, 0x91 +}; +static const u8 dec_nonce001[] __initconst = { + 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08 +}; +static const u8 dec_key001[] __initconst = { + 0x1c, 0x92, 0x40, 0xa5, 0xeb, 0x55, 0xd3, 0x8a, + 0xf3, 0x33, 0x88, 0x86, 0x04, 0xf6, 0xb5, 0xf0, + 0x47, 0x39, 0x17, 0xc1, 0x40, 0x2b, 0x80, 0x09, + 0x9d, 0xca, 0x5c, 0xbc, 0x20, 0x70, 0x75, 0xc0 +}; + +static const u8 dec_input002[] __initconst = { + 0xea, 0xe0, 0x1e, 0x9e, 0x2c, 0x91, 0xaa, 0xe1, + 0xdb, 0x5d, 0x99, 0x3f, 0x8a, 0xf7, 0x69, 0x92 +}; +static const u8 dec_output002[] __initconst = { }; +static const u8 dec_assoc002[] __initconst = { }; +static const u8 dec_nonce002[] __initconst = { + 0xca, 0xbf, 0x33, 0x71, 0x32, 0x45, 0x77, 0x8e +}; +static const u8 dec_key002[] __initconst = { + 0x4c, 0xf5, 0x96, 0x83, 0x38, 0xe6, 0xae, 0x7f, + 0x2d, 0x29, 0x25, 0x76, 0xd5, 0x75, 0x27, 0x86, + 0x91, 0x9a, 0x27, 0x7a, 0xfb, 0x46, 0xc5, 0xef, + 0x94, 0x81, 0x79, 0x57, 0x14, 0x59, 0x40, 0x68 +}; + +static const u8 dec_input003[] __initconst = { + 0xdd, 0x6b, 0x3b, 0x82, 0xce, 0x5a, 0xbd, 0xd6, + 0xa9, 0x35, 0x83, 0xd8, 0x8c, 0x3d, 0x85, 0x77 +}; +static const u8 dec_output003[] __initconst = { }; +static const u8 dec_assoc003[] __initconst = { + 0x33, 0x10, 0x41, 0x12, 0x1f, 0xf3, 0xd2, 0x6b +}; +static const u8 dec_nonce003[] __initconst = { + 0x3d, 0x86, 0xb5, 0x6b, 0xc8, 0xa3, 0x1f, 0x1d +}; +static const u8 dec_key003[] __initconst = { + 0x2d, 0xb0, 0x5d, 0x40, 0xc8, 0xed, 0x44, 0x88, + 0x34, 0xd1, 0x13, 0xaf, 0x57, 0xa1, 0xeb, 0x3a, + 0x2a, 0x80, 0x51, 0x36, 0xec, 0x5b, 0xbc, 0x08, + 0x93, 0x84, 0x21, 0xb5, 0x13, 0x88, 0x3c, 0x0d +}; + +static const u8 dec_input004[] __initconst = { + 0xb7, 0x1b, 0xb0, 0x73, 0x59, 0xb0, 0x84, 0xb2, + 0x6d, 0x8e, 0xab, 0x94, 0x31, 0xa1, 0xae, 0xac, + 0x89 +}; +static const u8 dec_output004[] __initconst = { + 0xa4 +}; +static const u8 dec_assoc004[] __initconst = { + 0x6a, 0xe2, 0xad, 0x3f, 0x88, 0x39, 0x5a, 0x40 +}; +static const u8 dec_nonce004[] __initconst = { + 0xd2, 0x32, 0x1f, 0x29, 0x28, 0xc6, 0xc4, 0xc4 +}; +static const u8 dec_key004[] __initconst = { + 0x4b, 0x28, 0x4b, 0xa3, 0x7b, 0xbe, 0xe9, 0xf8, + 0x31, 0x80, 0x82, 0xd7, 0xd8, 0xe8, 0xb5, 0xa1, + 0xe2, 0x18, 0x18, 0x8a, 0x9c, 0xfa, 0xa3, 0x3d, + 0x25, 0x71, 0x3e, 0x40, 0xbc, 0x54, 0x7a, 0x3e +}; + +static const u8 dec_input005[] __initconst = { + 0xbf, 0xe1, 0x5b, 0x0b, 0xdb, 0x6b, 0xf5, 0x5e, + 0x6c, 0x5d, 0x84, 0x44, 0x39, 0x81, 0xc1, 0x9c, + 0xac +}; +static const u8 dec_output005[] __initconst = { + 0x2d +}; +static const u8 dec_assoc005[] __initconst = { }; +static const u8 dec_nonce005[] __initconst = { + 0x20, 0x1c, 0xaa, 0x5f, 0x9c, 0xbf, 0x92, 0x30 +}; +static const u8 dec_key005[] __initconst = { + 0x66, 0xca, 0x9c, 0x23, 0x2a, 0x4b, 0x4b, 0x31, + 0x0e, 0x92, 0x89, 0x8b, 0xf4, 0x93, 0xc7, 0x87, + 0x98, 0xa3, 0xd8, 0x39, 0xf8, 0xf4, 0xa7, 0x01, + 0xc0, 0x2e, 0x0a, 0xa6, 0x7e, 0x5a, 0x78, 0x87 +}; + +static const u8 dec_input006[] __initconst = { + 0x8b, 0x06, 0xd3, 0x31, 0xb0, 0x93, 0x45, 0xb1, + 0x75, 0x6e, 0x26, 0xf9, 0x67, 0xbc, 0x90, 0x15, + 0x81, 0x2c, 0xb5, 0xf0, 0xc6, 0x2b, 0xc7, 0x8c, + 0x56, 0xd1, 0xbf, 0x69, 0x6c, 0x07, 0xa0, 0xda, + 0x65, 0x27, 0xc9, 0x90, 0x3d, 0xef, 0x4b, 0x11, + 0x0f, 0x19, 0x07, 0xfd, 0x29, 0x92, 0xd9, 0xc8, + 0xf7, 0x99, 0x2e, 0x4a, 0xd0, 0xb8, 0x2c, 0xdc, + 0x93, 0xf5, 0x9e, 0x33, 0x78, 0xd1, 0x37, 0xc3, + 0x66, 0xd7, 0x5e, 0xbc, 0x44, 0xbf, 0x53, 0xa5, + 0xbc, 0xc4, 0xcb, 0x7b, 0x3a, 0x8e, 0x7f, 0x02, + 0xbd, 0xbb, 0xe7, 0xca, 0xa6, 0x6c, 0x6b, 0x93, + 0x21, 0x93, 0x10, 0x61, 0xe7, 0x69, 0xd0, 0x78, + 0xf3, 0x07, 0x5a, 0x1a, 0x8f, 0x73, 0xaa, 0xb1, + 0x4e, 0xd3, 0xda, 0x4f, 0xf3, 0x32, 0xe1, 0x66, + 0x3e, 0x6c, 0xc6, 0x13, 0xba, 0x06, 0x5b, 0xfc, + 0x6a, 0xe5, 0x6f, 0x60, 0xfb, 0x07, 0x40, 0xb0, + 0x8c, 0x9d, 0x84, 0x43, 0x6b, 0xc1, 0xf7, 0x8d, + 0x8d, 0x31, 0xf7, 0x7a, 0x39, 0x4d, 0x8f, 0x9a, + 0xeb +}; +static const u8 dec_output006[] __initconst = { + 0x33, 0x2f, 0x94, 0xc1, 0xa4, 0xef, 0xcc, 0x2a, + 0x5b, 0xa6, 0xe5, 0x8f, 0x1d, 0x40, 0xf0, 0x92, + 0x3c, 0xd9, 0x24, 0x11, 0xa9, 0x71, 0xf9, 0x37, + 0x14, 0x99, 0xfa, 0xbe, 0xe6, 0x80, 0xde, 0x50, + 0xc9, 0x96, 0xd4, 0xb0, 0xec, 0x9e, 0x17, 0xec, + 0xd2, 0x5e, 0x72, 0x99, 0xfc, 0x0a, 0xe1, 0xcb, + 0x48, 0xd2, 0x85, 0xdd, 0x2f, 0x90, 0xe0, 0x66, + 0x3b, 0xe6, 0x20, 0x74, 0xbe, 0x23, 0x8f, 0xcb, + 0xb4, 0xe4, 0xda, 0x48, 0x40, 0xa6, 0xd1, 0x1b, + 0xc7, 0x42, 0xce, 0x2f, 0x0c, 0xa6, 0x85, 0x6e, + 0x87, 0x37, 0x03, 0xb1, 0x7c, 0x25, 0x96, 0xa3, + 0x05, 0xd8, 0xb0, 0xf4, 0xed, 0xea, 0xc2, 0xf0, + 0x31, 0x98, 0x6c, 0xd1, 0x14, 0x25, 0xc0, 0xcb, + 0x01, 0x74, 0xd0, 0x82, 0xf4, 0x36, 0xf5, 0x41, + 0xd5, 0xdc, 0xca, 0xc5, 0xbb, 0x98, 0xfe, 0xfc, + 0x69, 0x21, 0x70, 0xd8, 0xa4, 0x4b, 0xc8, 0xde, + 0x8f +}; +static const u8 dec_assoc006[] __initconst = { + 0x70, 0xd3, 0x33, 0xf3, 0x8b, 0x18, 0x0b +}; +static const u8 dec_nonce006[] __initconst = { + 0xdf, 0x51, 0x84, 0x82, 0x42, 0x0c, 0x75, 0x9c +}; +static const u8 dec_key006[] __initconst = { + 0x68, 0x7b, 0x8d, 0x8e, 0xe3, 0xc4, 0xdd, 0xae, + 0xdf, 0x72, 0x7f, 0x53, 0x72, 0x25, 0x1e, 0x78, + 0x91, 0xcb, 0x69, 0x76, 0x1f, 0x49, 0x93, 0xf9, + 0x6f, 0x21, 0xcc, 0x39, 0x9c, 0xad, 0xb1, 0x01 +}; + +static const u8 dec_input007[] __initconst = { + 0x85, 0x04, 0xc2, 0xed, 0x8d, 0xfd, 0x97, 0x5c, + 0xd2, 0xb7, 0xe2, 0xc1, 0x6b, 0xa3, 0xba, 0xf8, + 0xc9, 0x50, 0xc3, 0xc6, 0xa5, 0xe3, 0xa4, 0x7c, + 0xc3, 0x23, 0x49, 0x5e, 0xa9, 0xb9, 0x32, 0xeb, + 0x8a, 0x7c, 0xca, 0xe5, 0xec, 0xfb, 0x7c, 0xc0, + 0xcb, 0x7d, 0xdc, 0x2c, 0x9d, 0x92, 0x55, 0x21, + 0x0a, 0xc8, 0x43, 0x63, 0x59, 0x0a, 0x31, 0x70, + 0x82, 0x67, 0x41, 0x03, 0xf8, 0xdf, 0xf2, 0xac, + 0xa7, 0x02, 0xd4, 0xd5, 0x8a, 0x2d, 0xc8, 0x99, + 0x19, 0x66, 0xd0, 0xf6, 0x88, 0x2c, 0x77, 0xd9, + 0xd4, 0x0d, 0x6c, 0xbd, 0x98, 0xde, 0xe7, 0x7f, + 0xad, 0x7e, 0x8a, 0xfb, 0xe9, 0x4b, 0xe5, 0xf7, + 0xe5, 0x50, 0xa0, 0x90, 0x3f, 0xd6, 0x22, 0x53, + 0xe3, 0xfe, 0x1b, 0xcc, 0x79, 0x3b, 0xec, 0x12, + 0x47, 0x52, 0xa7, 0xd6, 0x04, 0xe3, 0x52, 0xe6, + 0x93, 0x90, 0x91, 0x32, 0x73, 0x79, 0xb8, 0xd0, + 0x31, 0xde, 0x1f, 0x9f, 0x2f, 0x05, 0x38, 0x54, + 0x2f, 0x35, 0x04, 0x39, 0xe0, 0xa7, 0xba, 0xc6, + 0x52, 0xf6, 0x37, 0x65, 0x4c, 0x07, 0xa9, 0x7e, + 0xb3, 0x21, 0x6f, 0x74, 0x8c, 0xc9, 0xde, 0xdb, + 0x65, 0x1b, 0x9b, 0xaa, 0x60, 0xb1, 0x03, 0x30, + 0x6b, 0xb2, 0x03, 0xc4, 0x1c, 0x04, 0xf8, 0x0f, + 0x64, 0xaf, 0x46, 0xe4, 0x65, 0x99, 0x49, 0xe2, + 0xea, 0xce, 0x78, 0x00, 0xd8, 0x8b, 0xd5, 0x2e, + 0xcf, 0xfc, 0x40, 0x49, 0xe8, 0x58, 0xdc, 0x34, + 0x9c, 0x8c, 0x61, 0xbf, 0x0a, 0x8e, 0xec, 0x39, + 0xa9, 0x30, 0x05, 0x5a, 0xd2, 0x56, 0x01, 0xc7, + 0xda, 0x8f, 0x4e, 0xbb, 0x43, 0xa3, 0x3a, 0xf9, + 0x15, 0x2a, 0xd0, 0xa0, 0x7a, 0x87, 0x34, 0x82, + 0xfe, 0x8a, 0xd1, 0x2d, 0x5e, 0xc7, 0xbf, 0x04, + 0x53, 0x5f, 0x3b, 0x36, 0xd4, 0x25, 0x5c, 0x34, + 0x7a, 0x8d, 0xd5, 0x05, 0xce, 0x72, 0xca, 0xef, + 0x7a, 0x4b, 0xbc, 0xb0, 0x10, 0x5c, 0x96, 0x42, + 0x3a, 0x00, 0x98, 0xcd, 0x15, 0xe8, 0xb7, 0x53 +}; +static const u8 dec_output007[] __initconst = { + 0x9b, 0x18, 0xdb, 0xdd, 0x9a, 0x0f, 0x3e, 0xa5, + 0x15, 0x17, 0xde, 0xdf, 0x08, 0x9d, 0x65, 0x0a, + 0x67, 0x30, 0x12, 0xe2, 0x34, 0x77, 0x4b, 0xc1, + 0xd9, 0xc6, 0x1f, 0xab, 0xc6, 0x18, 0x50, 0x17, + 0xa7, 0x9d, 0x3c, 0xa6, 0xc5, 0x35, 0x8c, 0x1c, + 0xc0, 0xa1, 0x7c, 0x9f, 0x03, 0x89, 0xca, 0xe1, + 0xe6, 0xe9, 0xd4, 0xd3, 0x88, 0xdb, 0xb4, 0x51, + 0x9d, 0xec, 0xb4, 0xfc, 0x52, 0xee, 0x6d, 0xf1, + 0x75, 0x42, 0xc6, 0xfd, 0xbd, 0x7a, 0x8e, 0x86, + 0xfc, 0x44, 0xb3, 0x4f, 0xf3, 0xea, 0x67, 0x5a, + 0x41, 0x13, 0xba, 0xb0, 0xdc, 0xe1, 0xd3, 0x2a, + 0x7c, 0x22, 0xb3, 0xca, 0xac, 0x6a, 0x37, 0x98, + 0x3e, 0x1d, 0x40, 0x97, 0xf7, 0x9b, 0x1d, 0x36, + 0x6b, 0xb3, 0x28, 0xbd, 0x60, 0x82, 0x47, 0x34, + 0xaa, 0x2f, 0x7d, 0xe9, 0xa8, 0x70, 0x81, 0x57, + 0xd4, 0xb9, 0x77, 0x0a, 0x9d, 0x29, 0xa7, 0x84, + 0x52, 0x4f, 0xc2, 0x4a, 0x40, 0x3b, 0x3c, 0xd4, + 0xc9, 0x2a, 0xdb, 0x4a, 0x53, 0xc4, 0xbe, 0x80, + 0xe9, 0x51, 0x7f, 0x8f, 0xc7, 0xa2, 0xce, 0x82, + 0x5c, 0x91, 0x1e, 0x74, 0xd9, 0xd0, 0xbd, 0xd5, + 0xf3, 0xfd, 0xda, 0x4d, 0x25, 0xb4, 0xbb, 0x2d, + 0xac, 0x2f, 0x3d, 0x71, 0x85, 0x7b, 0xcf, 0x3c, + 0x7b, 0x3e, 0x0e, 0x22, 0x78, 0x0c, 0x29, 0xbf, + 0xe4, 0xf4, 0x57, 0xb3, 0xcb, 0x49, 0xa0, 0xfc, + 0x1e, 0x05, 0x4e, 0x16, 0xbc, 0xd5, 0xa8, 0xa3, + 0xee, 0x05, 0x35, 0xc6, 0x7c, 0xab, 0x60, 0x14, + 0x55, 0x1a, 0x8e, 0xc5, 0x88, 0x5d, 0xd5, 0x81, + 0xc2, 0x81, 0xa5, 0xc4, 0x60, 0xdb, 0xaf, 0x77, + 0x91, 0xe1, 0xce, 0xa2, 0x7e, 0x7f, 0x42, 0xe3, + 0xb0, 0x13, 0x1c, 0x1f, 0x25, 0x60, 0x21, 0xe2, + 0x40, 0x5f, 0x99, 0xb7, 0x73, 0xec, 0x9b, 0x2b, + 0xf0, 0x65, 0x11, 0xc8, 0xd0, 0x0a, 0x9f, 0xd3 +}; +static const u8 dec_assoc007[] __initconst = { }; +static const u8 dec_nonce007[] __initconst = { + 0xde, 0x7b, 0xef, 0xc3, 0x65, 0x1b, 0x68, 0xb0 +}; +static const u8 dec_key007[] __initconst = { + 0x8d, 0xb8, 0x91, 0x48, 0xf0, 0xe7, 0x0a, 0xbd, + 0xf9, 0x3f, 0xcd, 0xd9, 0xa0, 0x1e, 0x42, 0x4c, + 0xe7, 0xde, 0x25, 0x3d, 0xa3, 0xd7, 0x05, 0x80, + 0x8d, 0xf2, 0x82, 0xac, 0x44, 0x16, 0x51, 0x01 +}; + +static const u8 dec_input008[] __initconst = { + 0x14, 0xf6, 0x41, 0x37, 0xa6, 0xd4, 0x27, 0xcd, + 0xdb, 0x06, 0x3e, 0x9a, 0x4e, 0xab, 0xd5, 0xb1, + 0x1e, 0x6b, 0xd2, 0xbc, 0x11, 0xf4, 0x28, 0x93, + 0x63, 0x54, 0xef, 0xbb, 0x5e, 0x1d, 0x3a, 0x1d, + 0x37, 0x3c, 0x0a, 0x6c, 0x1e, 0xc2, 0xd1, 0x2c, + 0xb5, 0xa3, 0xb5, 0x7b, 0xb8, 0x8f, 0x25, 0xa6, + 0x1b, 0x61, 0x1c, 0xec, 0x28, 0x58, 0x26, 0xa4, + 0xa8, 0x33, 0x28, 0x25, 0x5c, 0x45, 0x05, 0xe5, + 0x6c, 0x99, 0xe5, 0x45, 0xc4, 0xa2, 0x03, 0x84, + 0x03, 0x73, 0x1e, 0x8c, 0x49, 0xac, 0x20, 0xdd, + 0x8d, 0xb3, 0xc4, 0xf5, 0xe7, 0x4f, 0xf1, 0xed, + 0xa1, 0x98, 0xde, 0xa4, 0x96, 0xdd, 0x2f, 0xab, + 0xab, 0x97, 0xcf, 0x3e, 0xd2, 0x9e, 0xb8, 0x13, + 0x07, 0x28, 0x29, 0x19, 0xaf, 0xfd, 0xf2, 0x49, + 0x43, 0xea, 0x49, 0x26, 0x91, 0xc1, 0x07, 0xd6, + 0xbb, 0x81, 0x75, 0x35, 0x0d, 0x24, 0x7f, 0xc8, + 0xda, 0xd4, 0xb7, 0xeb, 0xe8, 0x5c, 0x09, 0xa2, + 0x2f, 0xdc, 0x28, 0x7d, 0x3a, 0x03, 0xfa, 0x94, + 0xb5, 0x1d, 0x17, 0x99, 0x36, 0xc3, 0x1c, 0x18, + 0x34, 0xe3, 0x9f, 0xf5, 0x55, 0x7c, 0xb0, 0x60, + 0x9d, 0xff, 0xac, 0xd4, 0x61, 0xf2, 0xad, 0xf8, + 0xce, 0xc7, 0xbe, 0x5c, 0xd2, 0x95, 0xa8, 0x4b, + 0x77, 0x13, 0x19, 0x59, 0x26, 0xc9, 0xb7, 0x8f, + 0x6a, 0xcb, 0x2d, 0x37, 0x91, 0xea, 0x92, 0x9c, + 0x94, 0x5b, 0xda, 0x0b, 0xce, 0xfe, 0x30, 0x20, + 0xf8, 0x51, 0xad, 0xf2, 0xbe, 0xe7, 0xc7, 0xff, + 0xb3, 0x33, 0x91, 0x6a, 0xc9, 0x1a, 0x41, 0xc9, + 0x0f, 0xf3, 0x10, 0x0e, 0xfd, 0x53, 0xff, 0x6c, + 0x16, 0x52, 0xd9, 0xf3, 0xf7, 0x98, 0x2e, 0xc9, + 0x07, 0x31, 0x2c, 0x0c, 0x72, 0xd7, 0xc5, 0xc6, + 0x08, 0x2a, 0x7b, 0xda, 0xbd, 0x7e, 0x02, 0xea, + 0x1a, 0xbb, 0xf2, 0x04, 0x27, 0x61, 0x28, 0x8e, + 0xf5, 0x04, 0x03, 0x1f, 0x4c, 0x07, 0x55, 0x82, + 0xec, 0x1e, 0xd7, 0x8b, 0x2f, 0x65, 0x56, 0xd1, + 0xd9, 0x1e, 0x3c, 0xe9, 0x1f, 0x5e, 0x98, 0x70, + 0x38, 0x4a, 0x8c, 0x49, 0xc5, 0x43, 0xa0, 0xa1, + 0x8b, 0x74, 0x9d, 0x4c, 0x62, 0x0d, 0x10, 0x0c, + 0xf4, 0x6c, 0x8f, 0xe0, 0xaa, 0x9a, 0x8d, 0xb7, + 0xe0, 0xbe, 0x4c, 0x87, 0xf1, 0x98, 0x2f, 0xcc, + 0xed, 0xc0, 0x52, 0x29, 0xdc, 0x83, 0xf8, 0xfc, + 0x2c, 0x0e, 0xa8, 0x51, 0x4d, 0x80, 0x0d, 0xa3, + 0xfe, 0xd8, 0x37, 0xe7, 0x41, 0x24, 0xfc, 0xfb, + 0x75, 0xe3, 0x71, 0x7b, 0x57, 0x45, 0xf5, 0x97, + 0x73, 0x65, 0x63, 0x14, 0x74, 0xb8, 0x82, 0x9f, + 0xf8, 0x60, 0x2f, 0x8a, 0xf2, 0x4e, 0xf1, 0x39, + 0xda, 0x33, 0x91, 0xf8, 0x36, 0xe0, 0x8d, 0x3f, + 0x1f, 0x3b, 0x56, 0xdc, 0xa0, 0x8f, 0x3c, 0x9d, + 0x71, 0x52, 0xa7, 0xb8, 0xc0, 0xa5, 0xc6, 0xa2, + 0x73, 0xda, 0xf4, 0x4b, 0x74, 0x5b, 0x00, 0x3d, + 0x99, 0xd7, 0x96, 0xba, 0xe6, 0xe1, 0xa6, 0x96, + 0x38, 0xad, 0xb3, 0xc0, 0xd2, 0xba, 0x91, 0x6b, + 0xf9, 0x19, 0xdd, 0x3b, 0xbe, 0xbe, 0x9c, 0x20, + 0x50, 0xba, 0xa1, 0xd0, 0xce, 0x11, 0xbd, 0x95, + 0xd8, 0xd1, 0xdd, 0x33, 0x85, 0x74, 0xdc, 0xdb, + 0x66, 0x76, 0x44, 0xdc, 0x03, 0x74, 0x48, 0x35, + 0x98, 0xb1, 0x18, 0x47, 0x94, 0x7d, 0xff, 0x62, + 0xe4, 0x58, 0x78, 0xab, 0xed, 0x95, 0x36, 0xd9, + 0x84, 0x91, 0x82, 0x64, 0x41, 0xbb, 0x58, 0xe6, + 0x1c, 0x20, 0x6d, 0x15, 0x6b, 0x13, 0x96, 0xe8, + 0x35, 0x7f, 0xdc, 0x40, 0x2c, 0xe9, 0xbc, 0x8a, + 0x4f, 0x92, 0xec, 0x06, 0x2d, 0x50, 0xdf, 0x93, + 0x5d, 0x65, 0x5a, 0xa8, 0xfc, 0x20, 0x50, 0x14, + 0xa9, 0x8a, 0x7e, 0x1d, 0x08, 0x1f, 0xe2, 0x99, + 0xd0, 0xbe, 0xfb, 0x3a, 0x21, 0x9d, 0xad, 0x86, + 0x54, 0xfd, 0x0d, 0x98, 0x1c, 0x5a, 0x6f, 0x1f, + 0x9a, 0x40, 0xcd, 0xa2, 0xff, 0x6a, 0xf1, 0x54 +}; +static const u8 dec_output008[] __initconst = { + 0xc3, 0x09, 0x94, 0x62, 0xe6, 0x46, 0x2e, 0x10, + 0xbe, 0x00, 0xe4, 0xfc, 0xf3, 0x40, 0xa3, 0xe2, + 0x0f, 0xc2, 0x8b, 0x28, 0xdc, 0xba, 0xb4, 0x3c, + 0xe4, 0x21, 0x58, 0x61, 0xcd, 0x8b, 0xcd, 0xfb, + 0xac, 0x94, 0xa1, 0x45, 0xf5, 0x1c, 0xe1, 0x12, + 0xe0, 0x3b, 0x67, 0x21, 0x54, 0x5e, 0x8c, 0xaa, + 0xcf, 0xdb, 0xb4, 0x51, 0xd4, 0x13, 0xda, 0xe6, + 0x83, 0x89, 0xb6, 0x92, 0xe9, 0x21, 0x76, 0xa4, + 0x93, 0x7d, 0x0e, 0xfd, 0x96, 0x36, 0x03, 0x91, + 0x43, 0x5c, 0x92, 0x49, 0x62, 0x61, 0x7b, 0xeb, + 0x43, 0x89, 0xb8, 0x12, 0x20, 0x43, 0xd4, 0x47, + 0x06, 0x84, 0xee, 0x47, 0xe9, 0x8a, 0x73, 0x15, + 0x0f, 0x72, 0xcf, 0xed, 0xce, 0x96, 0xb2, 0x7f, + 0x21, 0x45, 0x76, 0xeb, 0x26, 0x28, 0x83, 0x6a, + 0xad, 0xaa, 0xa6, 0x81, 0xd8, 0x55, 0xb1, 0xa3, + 0x85, 0xb3, 0x0c, 0xdf, 0xf1, 0x69, 0x2d, 0x97, + 0x05, 0x2a, 0xbc, 0x7c, 0x7b, 0x25, 0xf8, 0x80, + 0x9d, 0x39, 0x25, 0xf3, 0x62, 0xf0, 0x66, 0x5e, + 0xf4, 0xa0, 0xcf, 0xd8, 0xfd, 0x4f, 0xb1, 0x1f, + 0x60, 0x3a, 0x08, 0x47, 0xaf, 0xe1, 0xf6, 0x10, + 0x77, 0x09, 0xa7, 0x27, 0x8f, 0x9a, 0x97, 0x5a, + 0x26, 0xfa, 0xfe, 0x41, 0x32, 0x83, 0x10, 0xe0, + 0x1d, 0xbf, 0x64, 0x0d, 0xf4, 0x1c, 0x32, 0x35, + 0xe5, 0x1b, 0x36, 0xef, 0xd4, 0x4a, 0x93, 0x4d, + 0x00, 0x7c, 0xec, 0x02, 0x07, 0x8b, 0x5d, 0x7d, + 0x1b, 0x0e, 0xd1, 0xa6, 0xa5, 0x5d, 0x7d, 0x57, + 0x88, 0xa8, 0xcc, 0x81, 0xb4, 0x86, 0x4e, 0xb4, + 0x40, 0xe9, 0x1d, 0xc3, 0xb1, 0x24, 0x3e, 0x7f, + 0xcc, 0x8a, 0x24, 0x9b, 0xdf, 0x6d, 0xf0, 0x39, + 0x69, 0x3e, 0x4c, 0xc0, 0x96, 0xe4, 0x13, 0xda, + 0x90, 0xda, 0xf4, 0x95, 0x66, 0x8b, 0x17, 0x17, + 0xfe, 0x39, 0x43, 0x25, 0xaa, 0xda, 0xa0, 0x43, + 0x3c, 0xb1, 0x41, 0x02, 0xa3, 0xf0, 0xa7, 0x19, + 0x59, 0xbc, 0x1d, 0x7d, 0x6c, 0x6d, 0x91, 0x09, + 0x5c, 0xb7, 0x5b, 0x01, 0xd1, 0x6f, 0x17, 0x21, + 0x97, 0xbf, 0x89, 0x71, 0xa5, 0xb0, 0x6e, 0x07, + 0x45, 0xfd, 0x9d, 0xea, 0x07, 0xf6, 0x7a, 0x9f, + 0x10, 0x18, 0x22, 0x30, 0x73, 0xac, 0xd4, 0x6b, + 0x72, 0x44, 0xed, 0xd9, 0x19, 0x9b, 0x2d, 0x4a, + 0x41, 0xdd, 0xd1, 0x85, 0x5e, 0x37, 0x19, 0xed, + 0xd2, 0x15, 0x8f, 0x5e, 0x91, 0xdb, 0x33, 0xf2, + 0xe4, 0xdb, 0xff, 0x98, 0xfb, 0xa3, 0xb5, 0xca, + 0x21, 0x69, 0x08, 0xe7, 0x8a, 0xdf, 0x90, 0xff, + 0x3e, 0xe9, 0x20, 0x86, 0x3c, 0xe9, 0xfc, 0x0b, + 0xfe, 0x5c, 0x61, 0xaa, 0x13, 0x92, 0x7f, 0x7b, + 0xec, 0xe0, 0x6d, 0xa8, 0x23, 0x22, 0xf6, 0x6b, + 0x77, 0xc4, 0xfe, 0x40, 0x07, 0x3b, 0xb6, 0xf6, + 0x8e, 0x5f, 0xd4, 0xb9, 0xb7, 0x0f, 0x21, 0x04, + 0xef, 0x83, 0x63, 0x91, 0x69, 0x40, 0xa3, 0x48, + 0x5c, 0xd2, 0x60, 0xf9, 0x4f, 0x6c, 0x47, 0x8b, + 0x3b, 0xb1, 0x9f, 0x8e, 0xee, 0x16, 0x8a, 0x13, + 0xfc, 0x46, 0x17, 0xc3, 0xc3, 0x32, 0x56, 0xf8, + 0x3c, 0x85, 0x3a, 0xb6, 0x3e, 0xaa, 0x89, 0x4f, + 0xb3, 0xdf, 0x38, 0xfd, 0xf1, 0xe4, 0x3a, 0xc0, + 0xe6, 0x58, 0xb5, 0x8f, 0xc5, 0x29, 0xa2, 0x92, + 0x4a, 0xb6, 0xa0, 0x34, 0x7f, 0xab, 0xb5, 0x8a, + 0x90, 0xa1, 0xdb, 0x4d, 0xca, 0xb6, 0x2c, 0x41, + 0x3c, 0xf7, 0x2b, 0x21, 0xc3, 0xfd, 0xf4, 0x17, + 0x5c, 0xb5, 0x33, 0x17, 0x68, 0x2b, 0x08, 0x30, + 0xf3, 0xf7, 0x30, 0x3c, 0x96, 0xe6, 0x6a, 0x20, + 0x97, 0xe7, 0x4d, 0x10, 0x5f, 0x47, 0x5f, 0x49, + 0x96, 0x09, 0xf0, 0x27, 0x91, 0xc8, 0xf8, 0x5a, + 0x2e, 0x79, 0xb5, 0xe2, 0xb8, 0xe8, 0xb9, 0x7b, + 0xd5, 0x10, 0xcb, 0xff, 0x5d, 0x14, 0x73, 0xf3 +}; +static const u8 dec_assoc008[] __initconst = { }; +static const u8 dec_nonce008[] __initconst = { + 0x0e, 0x0d, 0x57, 0xbb, 0x7b, 0x40, 0x54, 0x02 +}; +static const u8 dec_key008[] __initconst = { + 0xf2, 0xaa, 0x4f, 0x99, 0xfd, 0x3e, 0xa8, 0x53, + 0xc1, 0x44, 0xe9, 0x81, 0x18, 0xdc, 0xf5, 0xf0, + 0x3e, 0x44, 0x15, 0x59, 0xe0, 0xc5, 0x44, 0x86, + 0xc3, 0x91, 0xa8, 0x75, 0xc0, 0x12, 0x46, 0xba +}; + +static const u8 dec_input009[] __initconst = { + 0xfd, 0x81, 0x8d, 0xd0, 0x3d, 0xb4, 0xd5, 0xdf, + 0xd3, 0x42, 0x47, 0x5a, 0x6d, 0x19, 0x27, 0x66, + 0x4b, 0x2e, 0x0c, 0x27, 0x9c, 0x96, 0x4c, 0x72, + 0x02, 0xa3, 0x65, 0xc3, 0xb3, 0x6f, 0x2e, 0xbd, + 0x63, 0x8a, 0x4a, 0x5d, 0x29, 0xa2, 0xd0, 0x28, + 0x48, 0xc5, 0x3d, 0x98, 0xa3, 0xbc, 0xe0, 0xbe, + 0x3b, 0x3f, 0xe6, 0x8a, 0xa4, 0x7f, 0x53, 0x06, + 0xfa, 0x7f, 0x27, 0x76, 0x72, 0x31, 0xa1, 0xf5, + 0xd6, 0x0c, 0x52, 0x47, 0xba, 0xcd, 0x4f, 0xd7, + 0xeb, 0x05, 0x48, 0x0d, 0x7c, 0x35, 0x4a, 0x09, + 0xc9, 0x76, 0x71, 0x02, 0xa3, 0xfb, 0xb7, 0x1a, + 0x65, 0xb7, 0xed, 0x98, 0xc6, 0x30, 0x8a, 0x00, + 0xae, 0xa1, 0x31, 0xe5, 0xb5, 0x9e, 0x6d, 0x62, + 0xda, 0xda, 0x07, 0x0f, 0x38, 0x38, 0xd3, 0xcb, + 0xc1, 0xb0, 0xad, 0xec, 0x72, 0xec, 0xb1, 0xa2, + 0x7b, 0x59, 0xf3, 0x3d, 0x2b, 0xef, 0xcd, 0x28, + 0x5b, 0x83, 0xcc, 0x18, 0x91, 0x88, 0xb0, 0x2e, + 0xf9, 0x29, 0x31, 0x18, 0xf9, 0x4e, 0xe9, 0x0a, + 0x91, 0x92, 0x9f, 0xae, 0x2d, 0xad, 0xf4, 0xe6, + 0x1a, 0xe2, 0xa4, 0xee, 0x47, 0x15, 0xbf, 0x83, + 0x6e, 0xd7, 0x72, 0x12, 0x3b, 0x2d, 0x24, 0xe9, + 0xb2, 0x55, 0xcb, 0x3c, 0x10, 0xf0, 0x24, 0x8a, + 0x4a, 0x02, 0xea, 0x90, 0x25, 0xf0, 0xb4, 0x79, + 0x3a, 0xef, 0x6e, 0xf5, 0x52, 0xdf, 0xb0, 0x0a, + 0xcd, 0x24, 0x1c, 0xd3, 0x2e, 0x22, 0x74, 0xea, + 0x21, 0x6f, 0xe9, 0xbd, 0xc8, 0x3e, 0x36, 0x5b, + 0x19, 0xf1, 0xca, 0x99, 0x0a, 0xb4, 0xa7, 0x52, + 0x1a, 0x4e, 0xf2, 0xad, 0x8d, 0x56, 0x85, 0xbb, + 0x64, 0x89, 0xba, 0x26, 0xf9, 0xc7, 0xe1, 0x89, + 0x19, 0x22, 0x77, 0xc3, 0xa8, 0xfc, 0xff, 0xad, + 0xfe, 0xb9, 0x48, 0xae, 0x12, 0x30, 0x9f, 0x19, + 0xfb, 0x1b, 0xef, 0x14, 0x87, 0x8a, 0x78, 0x71, + 0xf3, 0xf4, 0xb7, 0x00, 0x9c, 0x1d, 0xb5, 0x3d, + 0x49, 0x00, 0x0c, 0x06, 0xd4, 0x50, 0xf9, 0x54, + 0x45, 0xb2, 0x5b, 0x43, 0xdb, 0x6d, 0xcf, 0x1a, + 0xe9, 0x7a, 0x7a, 0xcf, 0xfc, 0x8a, 0x4e, 0x4d, + 0x0b, 0x07, 0x63, 0x28, 0xd8, 0xe7, 0x08, 0x95, + 0xdf, 0xa6, 0x72, 0x93, 0x2e, 0xbb, 0xa0, 0x42, + 0x89, 0x16, 0xf1, 0xd9, 0x0c, 0xf9, 0xa1, 0x16, + 0xfd, 0xd9, 0x03, 0xb4, 0x3b, 0x8a, 0xf5, 0xf6, + 0xe7, 0x6b, 0x2e, 0x8e, 0x4c, 0x3d, 0xe2, 0xaf, + 0x08, 0x45, 0x03, 0xff, 0x09, 0xb6, 0xeb, 0x2d, + 0xc6, 0x1b, 0x88, 0x94, 0xac, 0x3e, 0xf1, 0x9f, + 0x0e, 0x0e, 0x2b, 0xd5, 0x00, 0x4d, 0x3f, 0x3b, + 0x53, 0xae, 0xaf, 0x1c, 0x33, 0x5f, 0x55, 0x6e, + 0x8d, 0xaf, 0x05, 0x7a, 0x10, 0x34, 0xc9, 0xf4, + 0x66, 0xcb, 0x62, 0x12, 0xa6, 0xee, 0xe8, 0x1c, + 0x5d, 0x12, 0x86, 0xdb, 0x6f, 0x1c, 0x33, 0xc4, + 0x1c, 0xda, 0x82, 0x2d, 0x3b, 0x59, 0xfe, 0xb1, + 0xa4, 0x59, 0x41, 0x86, 0xd0, 0xef, 0xae, 0xfb, + 0xda, 0x6d, 0x11, 0xb8, 0xca, 0xe9, 0x6e, 0xff, + 0xf7, 0xa9, 0xd9, 0x70, 0x30, 0xfc, 0x53, 0xe2, + 0xd7, 0xa2, 0x4e, 0xc7, 0x91, 0xd9, 0x07, 0x06, + 0xaa, 0xdd, 0xb0, 0x59, 0x28, 0x1d, 0x00, 0x66, + 0xc5, 0x54, 0xc2, 0xfc, 0x06, 0xda, 0x05, 0x90, + 0x52, 0x1d, 0x37, 0x66, 0xee, 0xf0, 0xb2, 0x55, + 0x8a, 0x5d, 0xd2, 0x38, 0x86, 0x94, 0x9b, 0xfc, + 0x10, 0x4c, 0xa1, 0xb9, 0x64, 0x3e, 0x44, 0xb8, + 0x5f, 0xb0, 0x0c, 0xec, 0xe0, 0xc9, 0xe5, 0x62, + 0x75, 0x3f, 0x09, 0xd5, 0xf5, 0xd9, 0x26, 0xba, + 0x9e, 0xd2, 0xf4, 0xb9, 0x48, 0x0a, 0xbc, 0xa2, + 0xd6, 0x7c, 0x36, 0x11, 0x7d, 0x26, 0x81, 0x89, + 0xcf, 0xa4, 0xad, 0x73, 0x0e, 0xee, 0xcc, 0x06, + 0xa9, 0xdb, 0xb1, 0xfd, 0xfb, 0x09, 0x7f, 0x90, + 0x42, 0x37, 0x2f, 0xe1, 0x9c, 0x0f, 0x6f, 0xcf, + 0x43, 0xb5, 0xd9, 0x90, 0xe1, 0x85, 0xf5, 0xa8, + 0xae +}; +static const u8 dec_output009[] __initconst = { + 0xe6, 0xc3, 0xdb, 0x63, 0x55, 0x15, 0xe3, 0x5b, + 0xb7, 0x4b, 0x27, 0x8b, 0x5a, 0xdd, 0xc2, 0xe8, + 0x3a, 0x6b, 0xd7, 0x81, 0x96, 0x35, 0x97, 0xca, + 0xd7, 0x68, 0xe8, 0xef, 0xce, 0xab, 0xda, 0x09, + 0x6e, 0xd6, 0x8e, 0xcb, 0x55, 0xb5, 0xe1, 0xe5, + 0x57, 0xfd, 0xc4, 0xe3, 0xe0, 0x18, 0x4f, 0x85, + 0xf5, 0x3f, 0x7e, 0x4b, 0x88, 0xc9, 0x52, 0x44, + 0x0f, 0xea, 0xaf, 0x1f, 0x71, 0x48, 0x9f, 0x97, + 0x6d, 0xb9, 0x6f, 0x00, 0xa6, 0xde, 0x2b, 0x77, + 0x8b, 0x15, 0xad, 0x10, 0xa0, 0x2b, 0x7b, 0x41, + 0x90, 0x03, 0x2d, 0x69, 0xae, 0xcc, 0x77, 0x7c, + 0xa5, 0x9d, 0x29, 0x22, 0xc2, 0xea, 0xb4, 0x00, + 0x1a, 0xd2, 0x7a, 0x98, 0x8a, 0xf9, 0xf7, 0x82, + 0xb0, 0xab, 0xd8, 0xa6, 0x94, 0x8d, 0x58, 0x2f, + 0x01, 0x9e, 0x00, 0x20, 0xfc, 0x49, 0xdc, 0x0e, + 0x03, 0xe8, 0x45, 0x10, 0xd6, 0xa8, 0xda, 0x55, + 0x10, 0x9a, 0xdf, 0x67, 0x22, 0x8b, 0x43, 0xab, + 0x00, 0xbb, 0x02, 0xc8, 0xdd, 0x7b, 0x97, 0x17, + 0xd7, 0x1d, 0x9e, 0x02, 0x5e, 0x48, 0xde, 0x8e, + 0xcf, 0x99, 0x07, 0x95, 0x92, 0x3c, 0x5f, 0x9f, + 0xc5, 0x8a, 0xc0, 0x23, 0xaa, 0xd5, 0x8c, 0x82, + 0x6e, 0x16, 0x92, 0xb1, 0x12, 0x17, 0x07, 0xc3, + 0xfb, 0x36, 0xf5, 0x6c, 0x35, 0xd6, 0x06, 0x1f, + 0x9f, 0xa7, 0x94, 0xa2, 0x38, 0x63, 0x9c, 0xb0, + 0x71, 0xb3, 0xa5, 0xd2, 0xd8, 0xba, 0x9f, 0x08, + 0x01, 0xb3, 0xff, 0x04, 0x97, 0x73, 0x45, 0x1b, + 0xd5, 0xa9, 0x9c, 0x80, 0xaf, 0x04, 0x9a, 0x85, + 0xdb, 0x32, 0x5b, 0x5d, 0x1a, 0xc1, 0x36, 0x28, + 0x10, 0x79, 0xf1, 0x3c, 0xbf, 0x1a, 0x41, 0x5c, + 0x4e, 0xdf, 0xb2, 0x7c, 0x79, 0x3b, 0x7a, 0x62, + 0x3d, 0x4b, 0xc9, 0x9b, 0x2a, 0x2e, 0x7c, 0xa2, + 0xb1, 0x11, 0x98, 0xa7, 0x34, 0x1a, 0x00, 0xf3, + 0xd1, 0xbc, 0x18, 0x22, 0xba, 0x02, 0x56, 0x62, + 0x31, 0x10, 0x11, 0x6d, 0xe0, 0x54, 0x9d, 0x40, + 0x1f, 0x26, 0x80, 0x41, 0xca, 0x3f, 0x68, 0x0f, + 0x32, 0x1d, 0x0a, 0x8e, 0x79, 0xd8, 0xa4, 0x1b, + 0x29, 0x1c, 0x90, 0x8e, 0xc5, 0xe3, 0xb4, 0x91, + 0x37, 0x9a, 0x97, 0x86, 0x99, 0xd5, 0x09, 0xc5, + 0xbb, 0xa3, 0x3f, 0x21, 0x29, 0x82, 0x14, 0x5c, + 0xab, 0x25, 0xfb, 0xf2, 0x4f, 0x58, 0x26, 0xd4, + 0x83, 0xaa, 0x66, 0x89, 0x67, 0x7e, 0xc0, 0x49, + 0xe1, 0x11, 0x10, 0x7f, 0x7a, 0xda, 0x29, 0x04, + 0xff, 0xf0, 0xcb, 0x09, 0x7c, 0x9d, 0xfa, 0x03, + 0x6f, 0x81, 0x09, 0x31, 0x60, 0xfb, 0x08, 0xfa, + 0x74, 0xd3, 0x64, 0x44, 0x7c, 0x55, 0x85, 0xec, + 0x9c, 0x6e, 0x25, 0xb7, 0x6c, 0xc5, 0x37, 0xb6, + 0x83, 0x87, 0x72, 0x95, 0x8b, 0x9d, 0xe1, 0x69, + 0x5c, 0x31, 0x95, 0x42, 0xa6, 0x2c, 0xd1, 0x36, + 0x47, 0x1f, 0xec, 0x54, 0xab, 0xa2, 0x1c, 0xd8, + 0x00, 0xcc, 0xbc, 0x0d, 0x65, 0xe2, 0x67, 0xbf, + 0xbc, 0xea, 0xee, 0x9e, 0xe4, 0x36, 0x95, 0xbe, + 0x73, 0xd9, 0xa6, 0xd9, 0x0f, 0xa0, 0xcc, 0x82, + 0x76, 0x26, 0xad, 0x5b, 0x58, 0x6c, 0x4e, 0xab, + 0x29, 0x64, 0xd3, 0xd9, 0xa9, 0x08, 0x8c, 0x1d, + 0xa1, 0x4f, 0x80, 0xd8, 0x3f, 0x94, 0xfb, 0xd3, + 0x7b, 0xfc, 0xd1, 0x2b, 0xc3, 0x21, 0xeb, 0xe5, + 0x1c, 0x84, 0x23, 0x7f, 0x4b, 0xfa, 0xdb, 0x34, + 0x18, 0xa2, 0xc2, 0xe5, 0x13, 0xfe, 0x6c, 0x49, + 0x81, 0xd2, 0x73, 0xe7, 0xe2, 0xd7, 0xe4, 0x4f, + 0x4b, 0x08, 0x6e, 0xb1, 0x12, 0x22, 0x10, 0x9d, + 0xac, 0x51, 0x1e, 0x17, 0xd9, 0x8a, 0x0b, 0x42, + 0x88, 0x16, 0x81, 0x37, 0x7c, 0x6a, 0xf7, 0xef, + 0x2d, 0xe3, 0xd9, 0xf8, 0x5f, 0xe0, 0x53, 0x27, + 0x74, 0xb9, 0xe2, 0xd6, 0x1c, 0x80, 0x2c, 0x52, + 0x65 +}; +static const u8 dec_assoc009[] __initconst = { + 0x5a, 0x27, 0xff, 0xeb, 0xdf, 0x84, 0xb2, 0x9e, + 0xef +}; +static const u8 dec_nonce009[] __initconst = { + 0xef, 0x2d, 0x63, 0xee, 0x6b, 0x80, 0x8b, 0x78 +}; +static const u8 dec_key009[] __initconst = { + 0xea, 0xbc, 0x56, 0x99, 0xe3, 0x50, 0xff, 0xc5, + 0xcc, 0x1a, 0xd7, 0xc1, 0x57, 0x72, 0xea, 0x86, + 0x5b, 0x89, 0x88, 0x61, 0x3d, 0x2f, 0x9b, 0xb2, + 0xe7, 0x9c, 0xec, 0x74, 0x6e, 0x3e, 0xf4, 0x3b +}; + +static const u8 dec_input010[] __initconst = { + 0xe5, 0x26, 0xa4, 0x3d, 0xbd, 0x33, 0xd0, 0x4b, + 0x6f, 0x05, 0xa7, 0x6e, 0x12, 0x7a, 0xd2, 0x74, + 0xa6, 0xdd, 0xbd, 0x95, 0xeb, 0xf9, 0xa4, 0xf1, + 0x59, 0x93, 0x91, 0x70, 0xd9, 0xfe, 0x9a, 0xcd, + 0x53, 0x1f, 0x3a, 0xab, 0xa6, 0x7c, 0x9f, 0xa6, + 0x9e, 0xbd, 0x99, 0xd9, 0xb5, 0x97, 0x44, 0xd5, + 0x14, 0x48, 0x4d, 0x9d, 0xc0, 0xd0, 0x05, 0x96, + 0xeb, 0x4c, 0x78, 0x55, 0x09, 0x08, 0x01, 0x02, + 0x30, 0x90, 0x7b, 0x96, 0x7a, 0x7b, 0x5f, 0x30, + 0x41, 0x24, 0xce, 0x68, 0x61, 0x49, 0x86, 0x57, + 0x82, 0xdd, 0x53, 0x1c, 0x51, 0x28, 0x2b, 0x53, + 0x6e, 0x2d, 0xc2, 0x20, 0x4c, 0xdd, 0x8f, 0x65, + 0x10, 0x20, 0x50, 0xdd, 0x9d, 0x50, 0xe5, 0x71, + 0x40, 0x53, 0x69, 0xfc, 0x77, 0x48, 0x11, 0xb9, + 0xde, 0xa4, 0x8d, 0x58, 0xe4, 0xa6, 0x1a, 0x18, + 0x47, 0x81, 0x7e, 0xfc, 0xdd, 0xf6, 0xef, 0xce, + 0x2f, 0x43, 0x68, 0xd6, 0x06, 0xe2, 0x74, 0x6a, + 0xad, 0x90, 0xf5, 0x37, 0xf3, 0x3d, 0x82, 0x69, + 0x40, 0xe9, 0x6b, 0xa7, 0x3d, 0xa8, 0x1e, 0xd2, + 0x02, 0x7c, 0xb7, 0x9b, 0xe4, 0xda, 0x8f, 0x95, + 0x06, 0xc5, 0xdf, 0x73, 0xa3, 0x20, 0x9a, 0x49, + 0xde, 0x9c, 0xbc, 0xee, 0x14, 0x3f, 0x81, 0x5e, + 0xf8, 0x3b, 0x59, 0x3c, 0xe1, 0x68, 0x12, 0x5a, + 0x3a, 0x76, 0x3a, 0x3f, 0xf7, 0x87, 0x33, 0x0a, + 0x01, 0xb8, 0xd4, 0xed, 0xb6, 0xbe, 0x94, 0x5e, + 0x70, 0x40, 0x56, 0x67, 0x1f, 0x50, 0x44, 0x19, + 0xce, 0x82, 0x70, 0x10, 0x87, 0x13, 0x20, 0x0b, + 0x4c, 0x5a, 0xb6, 0xf6, 0xa7, 0xae, 0x81, 0x75, + 0x01, 0x81, 0xe6, 0x4b, 0x57, 0x7c, 0xdd, 0x6d, + 0xf8, 0x1c, 0x29, 0x32, 0xf7, 0xda, 0x3c, 0x2d, + 0xf8, 0x9b, 0x25, 0x6e, 0x00, 0xb4, 0xf7, 0x2f, + 0xf7, 0x04, 0xf7, 0xa1, 0x56, 0xac, 0x4f, 0x1a, + 0x64, 0xb8, 0x47, 0x55, 0x18, 0x7b, 0x07, 0x4d, + 0xbd, 0x47, 0x24, 0x80, 0x5d, 0xa2, 0x70, 0xc5, + 0xdd, 0x8e, 0x82, 0xd4, 0xeb, 0xec, 0xb2, 0x0c, + 0x39, 0xd2, 0x97, 0xc1, 0xcb, 0xeb, 0xf4, 0x77, + 0x59, 0xb4, 0x87, 0xef, 0xcb, 0x43, 0x2d, 0x46, + 0x54, 0xd1, 0xa7, 0xd7, 0x15, 0x99, 0x0a, 0x43, + 0xa1, 0xe0, 0x99, 0x33, 0x71, 0xc1, 0xed, 0xfe, + 0x72, 0x46, 0x33, 0x8e, 0x91, 0x08, 0x9f, 0xc8, + 0x2e, 0xca, 0xfa, 0xdc, 0x59, 0xd5, 0xc3, 0x76, + 0x84, 0x9f, 0xa3, 0x37, 0x68, 0xc3, 0xf0, 0x47, + 0x2c, 0x68, 0xdb, 0x5e, 0xc3, 0x49, 0x4c, 0xe8, + 0x92, 0x85, 0xe2, 0x23, 0xd3, 0x3f, 0xad, 0x32, + 0xe5, 0x2b, 0x82, 0xd7, 0x8f, 0x99, 0x0a, 0x59, + 0x5c, 0x45, 0xd9, 0xb4, 0x51, 0x52, 0xc2, 0xae, + 0xbf, 0x80, 0xcf, 0xc9, 0xc9, 0x51, 0x24, 0x2a, + 0x3b, 0x3a, 0x4d, 0xae, 0xeb, 0xbd, 0x22, 0xc3, + 0x0e, 0x0f, 0x59, 0x25, 0x92, 0x17, 0xe9, 0x74, + 0xc7, 0x8b, 0x70, 0x70, 0x36, 0x55, 0x95, 0x75, + 0x4b, 0xad, 0x61, 0x2b, 0x09, 0xbc, 0x82, 0xf2, + 0x6e, 0x94, 0x43, 0xae, 0xc3, 0xd5, 0xcd, 0x8e, + 0xfe, 0x5b, 0x9a, 0x88, 0x43, 0x01, 0x75, 0xb2, + 0x23, 0x09, 0xf7, 0x89, 0x83, 0xe7, 0xfa, 0xf9, + 0xb4, 0x9b, 0xf8, 0xef, 0xbd, 0x1c, 0x92, 0xc1, + 0xda, 0x7e, 0xfe, 0x05, 0xba, 0x5a, 0xcd, 0x07, + 0x6a, 0x78, 0x9e, 0x5d, 0xfb, 0x11, 0x2f, 0x79, + 0x38, 0xb6, 0xc2, 0x5b, 0x6b, 0x51, 0xb4, 0x71, + 0xdd, 0xf7, 0x2a, 0xe4, 0xf4, 0x72, 0x76, 0xad, + 0xc2, 0xdd, 0x64, 0x5d, 0x79, 0xb6, 0xf5, 0x7a, + 0x77, 0x20, 0x05, 0x3d, 0x30, 0x06, 0xd4, 0x4c, + 0x0a, 0x2c, 0x98, 0x5a, 0xb9, 0xd4, 0x98, 0xa9, + 0x3f, 0xc6, 0x12, 0xea, 0x3b, 0x4b, 0xc5, 0x79, + 0x64, 0x63, 0x6b, 0x09, 0x54, 0x3b, 0x14, 0x27, + 0xba, 0x99, 0x80, 0xc8, 0x72, 0xa8, 0x12, 0x90, + 0x29, 0xba, 0x40, 0x54, 0x97, 0x2b, 0x7b, 0xfe, + 0xeb, 0xcd, 0x01, 0x05, 0x44, 0x72, 0xdb, 0x99, + 0xe4, 0x61, 0xc9, 0x69, 0xd6, 0xb9, 0x28, 0xd1, + 0x05, 0x3e, 0xf9, 0x0b, 0x49, 0x0a, 0x49, 0xe9, + 0x8d, 0x0e, 0xa7, 0x4a, 0x0f, 0xaf, 0x32, 0xd0, + 0xe0, 0xb2, 0x3a, 0x55, 0x58, 0xfe, 0x5c, 0x28, + 0x70, 0x51, 0x23, 0xb0, 0x7b, 0x6a, 0x5f, 0x1e, + 0xb8, 0x17, 0xd7, 0x94, 0x15, 0x8f, 0xee, 0x20, + 0xc7, 0x42, 0x25, 0x3e, 0x9a, 0x14, 0xd7, 0x60, + 0x72, 0x39, 0x47, 0x48, 0xa9, 0xfe, 0xdd, 0x47, + 0x0a, 0xb1, 0xe6, 0x60, 0x28, 0x8c, 0x11, 0x68, + 0xe1, 0xff, 0xd7, 0xce, 0xc8, 0xbe, 0xb3, 0xfe, + 0x27, 0x30, 0x09, 0x70, 0xd7, 0xfa, 0x02, 0x33, + 0x3a, 0x61, 0x2e, 0xc7, 0xff, 0xa4, 0x2a, 0xa8, + 0x6e, 0xb4, 0x79, 0x35, 0x6d, 0x4c, 0x1e, 0x38, + 0xf8, 0xee, 0xd4, 0x84, 0x4e, 0x6e, 0x28, 0xa7, + 0xce, 0xc8, 0xc1, 0xcf, 0x80, 0x05, 0xf3, 0x04, + 0xef, 0xc8, 0x18, 0x28, 0x2e, 0x8d, 0x5e, 0x0c, + 0xdf, 0xb8, 0x5f, 0x96, 0xe8, 0xc6, 0x9c, 0x2f, + 0xe5, 0xa6, 0x44, 0xd7, 0xe7, 0x99, 0x44, 0x0c, + 0xec, 0xd7, 0x05, 0x60, 0x97, 0xbb, 0x74, 0x77, + 0x58, 0xd5, 0xbb, 0x48, 0xde, 0x5a, 0xb2, 0x54, + 0x7f, 0x0e, 0x46, 0x70, 0x6a, 0x6f, 0x78, 0xa5, + 0x08, 0x89, 0x05, 0x4e, 0x7e, 0xa0, 0x69, 0xb4, + 0x40, 0x60, 0x55, 0x77, 0x75, 0x9b, 0x19, 0xf2, + 0xd5, 0x13, 0x80, 0x77, 0xf9, 0x4b, 0x3f, 0x1e, + 0xee, 0xe6, 0x76, 0x84, 0x7b, 0x8c, 0xe5, 0x27, + 0xa8, 0x0a, 0x91, 0x01, 0x68, 0x71, 0x8a, 0x3f, + 0x06, 0xab, 0xf6, 0xa9, 0xa5, 0xe6, 0x72, 0x92, + 0xe4, 0x67, 0xe2, 0xa2, 0x46, 0x35, 0x84, 0x55, + 0x7d, 0xca, 0xa8, 0x85, 0xd0, 0xf1, 0x3f, 0xbe, + 0xd7, 0x34, 0x64, 0xfc, 0xae, 0xe3, 0xe4, 0x04, + 0x9f, 0x66, 0x02, 0xb9, 0x88, 0x10, 0xd9, 0xc4, + 0x4c, 0x31, 0x43, 0x7a, 0x93, 0xe2, 0x9b, 0x56, + 0x43, 0x84, 0xdc, 0xdc, 0xde, 0x1d, 0xa4, 0x02, + 0x0e, 0xc2, 0xef, 0xc3, 0xf8, 0x78, 0xd1, 0xb2, + 0x6b, 0x63, 0x18, 0xc9, 0xa9, 0xe5, 0x72, 0xd8, + 0xf3, 0xb9, 0xd1, 0x8a, 0xc7, 0x1a, 0x02, 0x27, + 0x20, 0x77, 0x10, 0xe5, 0xc8, 0xd4, 0x4a, 0x47, + 0xe5, 0xdf, 0x5f, 0x01, 0xaa, 0xb0, 0xd4, 0x10, + 0xbb, 0x69, 0xe3, 0x36, 0xc8, 0xe1, 0x3d, 0x43, + 0xfb, 0x86, 0xcd, 0xcc, 0xbf, 0xf4, 0x88, 0xe0, + 0x20, 0xca, 0xb7, 0x1b, 0xf1, 0x2f, 0x5c, 0xee, + 0xd4, 0xd3, 0xa3, 0xcc, 0xa4, 0x1e, 0x1c, 0x47, + 0xfb, 0xbf, 0xfc, 0xa2, 0x41, 0x55, 0x9d, 0xf6, + 0x5a, 0x5e, 0x65, 0x32, 0x34, 0x7b, 0x52, 0x8d, + 0xd5, 0xd0, 0x20, 0x60, 0x03, 0xab, 0x3f, 0x8c, + 0xd4, 0x21, 0xea, 0x2a, 0xd9, 0xc4, 0xd0, 0xd3, + 0x65, 0xd8, 0x7a, 0x13, 0x28, 0x62, 0x32, 0x4b, + 0x2c, 0x87, 0x93, 0xa8, 0xb4, 0x52, 0x45, 0x09, + 0x44, 0xec, 0xec, 0xc3, 0x17, 0xdb, 0x9a, 0x4d, + 0x5c, 0xa9, 0x11, 0xd4, 0x7d, 0xaf, 0x9e, 0xf1, + 0x2d, 0xb2, 0x66, 0xc5, 0x1d, 0xed, 0xb7, 0xcd, + 0x0b, 0x25, 0x5e, 0x30, 0x47, 0x3f, 0x40, 0xf4, + 0xa1, 0xa0, 0x00, 0x94, 0x10, 0xc5, 0x6a, 0x63, + 0x1a, 0xd5, 0x88, 0x92, 0x8e, 0x82, 0x39, 0x87, + 0x3c, 0x78, 0x65, 0x58, 0x42, 0x75, 0x5b, 0xdd, + 0x77, 0x3e, 0x09, 0x4e, 0x76, 0x5b, 0xe6, 0x0e, + 0x4d, 0x38, 0xb2, 0xc0, 0xb8, 0x95, 0x01, 0x7a, + 0x10, 0xe0, 0xfb, 0x07, 0xf2, 0xab, 0x2d, 0x8c, + 0x32, 0xed, 0x2b, 0xc0, 0x46, 0xc2, 0xf5, 0x38, + 0x83, 0xf0, 0x17, 0xec, 0xc1, 0x20, 0x6a, 0x9a, + 0x0b, 0x00, 0xa0, 0x98, 0x22, 0x50, 0x23, 0xd5, + 0x80, 0x6b, 0xf6, 0x1f, 0xc3, 0xcc, 0x97, 0xc9, + 0x24, 0x9f, 0xf3, 0xaf, 0x43, 0x14, 0xd5, 0xa0 +}; +static const u8 dec_output010[] __initconst = { + 0x42, 0x93, 0xe4, 0xeb, 0x97, 0xb0, 0x57, 0xbf, + 0x1a, 0x8b, 0x1f, 0xe4, 0x5f, 0x36, 0x20, 0x3c, + 0xef, 0x0a, 0xa9, 0x48, 0x5f, 0x5f, 0x37, 0x22, + 0x3a, 0xde, 0xe3, 0xae, 0xbe, 0xad, 0x07, 0xcc, + 0xb1, 0xf6, 0xf5, 0xf9, 0x56, 0xdd, 0xe7, 0x16, + 0x1e, 0x7f, 0xdf, 0x7a, 0x9e, 0x75, 0xb7, 0xc7, + 0xbe, 0xbe, 0x8a, 0x36, 0x04, 0xc0, 0x10, 0xf4, + 0x95, 0x20, 0x03, 0xec, 0xdc, 0x05, 0xa1, 0x7d, + 0xc4, 0xa9, 0x2c, 0x82, 0xd0, 0xbc, 0x8b, 0xc5, + 0xc7, 0x45, 0x50, 0xf6, 0xa2, 0x1a, 0xb5, 0x46, + 0x3b, 0x73, 0x02, 0xa6, 0x83, 0x4b, 0x73, 0x82, + 0x58, 0x5e, 0x3b, 0x65, 0x2f, 0x0e, 0xfd, 0x2b, + 0x59, 0x16, 0xce, 0xa1, 0x60, 0x9c, 0xe8, 0x3a, + 0x99, 0xed, 0x8d, 0x5a, 0xcf, 0xf6, 0x83, 0xaf, + 0xba, 0xd7, 0x73, 0x73, 0x40, 0x97, 0x3d, 0xca, + 0xef, 0x07, 0x57, 0xe6, 0xd9, 0x70, 0x0e, 0x95, + 0xae, 0xa6, 0x8d, 0x04, 0xcc, 0xee, 0xf7, 0x09, + 0x31, 0x77, 0x12, 0xa3, 0x23, 0x97, 0x62, 0xb3, + 0x7b, 0x32, 0xfb, 0x80, 0x14, 0x48, 0x81, 0xc3, + 0xe5, 0xea, 0x91, 0x39, 0x52, 0x81, 0xa2, 0x4f, + 0xe4, 0xb3, 0x09, 0xff, 0xde, 0x5e, 0xe9, 0x58, + 0x84, 0x6e, 0xf9, 0x3d, 0xdf, 0x25, 0xea, 0xad, + 0xae, 0xe6, 0x9a, 0xd1, 0x89, 0x55, 0xd3, 0xde, + 0x6c, 0x52, 0xdb, 0x70, 0xfe, 0x37, 0xce, 0x44, + 0x0a, 0xa8, 0x25, 0x5f, 0x92, 0xc1, 0x33, 0x4a, + 0x4f, 0x9b, 0x62, 0x35, 0xff, 0xce, 0xc0, 0xa9, + 0x60, 0xce, 0x52, 0x00, 0x97, 0x51, 0x35, 0x26, + 0x2e, 0xb9, 0x36, 0xa9, 0x87, 0x6e, 0x1e, 0xcc, + 0x91, 0x78, 0x53, 0x98, 0x86, 0x5b, 0x9c, 0x74, + 0x7d, 0x88, 0x33, 0xe1, 0xdf, 0x37, 0x69, 0x2b, + 0xbb, 0xf1, 0x4d, 0xf4, 0xd1, 0xf1, 0x39, 0x93, + 0x17, 0x51, 0x19, 0xe3, 0x19, 0x1e, 0x76, 0x37, + 0x25, 0xfb, 0x09, 0x27, 0x6a, 0xab, 0x67, 0x6f, + 0x14, 0x12, 0x64, 0xe7, 0xc4, 0x07, 0xdf, 0x4d, + 0x17, 0xbb, 0x6d, 0xe0, 0xe9, 0xb9, 0xab, 0xca, + 0x10, 0x68, 0xaf, 0x7e, 0xb7, 0x33, 0x54, 0x73, + 0x07, 0x6e, 0xf7, 0x81, 0x97, 0x9c, 0x05, 0x6f, + 0x84, 0x5f, 0xd2, 0x42, 0xfb, 0x38, 0xcf, 0xd1, + 0x2f, 0x14, 0x30, 0x88, 0x98, 0x4d, 0x5a, 0xa9, + 0x76, 0xd5, 0x4f, 0x3e, 0x70, 0x6c, 0x85, 0x76, + 0xd7, 0x01, 0xa0, 0x1a, 0xc8, 0x4e, 0xaa, 0xac, + 0x78, 0xfe, 0x46, 0xde, 0x6a, 0x05, 0x46, 0xa7, + 0x43, 0x0c, 0xb9, 0xde, 0xb9, 0x68, 0xfb, 0xce, + 0x42, 0x99, 0x07, 0x4d, 0x0b, 0x3b, 0x5a, 0x30, + 0x35, 0xa8, 0xf9, 0x3a, 0x73, 0xef, 0x0f, 0xdb, + 0x1e, 0x16, 0x42, 0xc4, 0xba, 0xae, 0x58, 0xaa, + 0xf8, 0xe5, 0x75, 0x2f, 0x1b, 0x15, 0x5c, 0xfd, + 0x0a, 0x97, 0xd0, 0xe4, 0x37, 0x83, 0x61, 0x5f, + 0x43, 0xa6, 0xc7, 0x3f, 0x38, 0x59, 0xe6, 0xeb, + 0xa3, 0x90, 0xc3, 0xaa, 0xaa, 0x5a, 0xd3, 0x34, + 0xd4, 0x17, 0xc8, 0x65, 0x3e, 0x57, 0xbc, 0x5e, + 0xdd, 0x9e, 0xb7, 0xf0, 0x2e, 0x5b, 0xb2, 0x1f, + 0x8a, 0x08, 0x0d, 0x45, 0x91, 0x0b, 0x29, 0x53, + 0x4f, 0x4c, 0x5a, 0x73, 0x56, 0xfe, 0xaf, 0x41, + 0x01, 0x39, 0x0a, 0x24, 0x3c, 0x7e, 0xbe, 0x4e, + 0x53, 0xf3, 0xeb, 0x06, 0x66, 0x51, 0x28, 0x1d, + 0xbd, 0x41, 0x0a, 0x01, 0xab, 0x16, 0x47, 0x27, + 0x47, 0x47, 0xf7, 0xcb, 0x46, 0x0a, 0x70, 0x9e, + 0x01, 0x9c, 0x09, 0xe1, 0x2a, 0x00, 0x1a, 0xd8, + 0xd4, 0x79, 0x9d, 0x80, 0x15, 0x8e, 0x53, 0x2a, + 0x65, 0x83, 0x78, 0x3e, 0x03, 0x00, 0x07, 0x12, + 0x1f, 0x33, 0x3e, 0x7b, 0x13, 0x37, 0xf1, 0xc3, + 0xef, 0xb7, 0xc1, 0x20, 0x3c, 0x3e, 0x67, 0x66, + 0x5d, 0x88, 0xa7, 0x7d, 0x33, 0x50, 0x77, 0xb0, + 0x28, 0x8e, 0xe7, 0x2c, 0x2e, 0x7a, 0xf4, 0x3c, + 0x8d, 0x74, 0x83, 0xaf, 0x8e, 0x87, 0x0f, 0xe4, + 0x50, 0xff, 0x84, 0x5c, 0x47, 0x0c, 0x6a, 0x49, + 0xbf, 0x42, 0x86, 0x77, 0x15, 0x48, 0xa5, 0x90, + 0x5d, 0x93, 0xd6, 0x2a, 0x11, 0xd5, 0xd5, 0x11, + 0xaa, 0xce, 0xe7, 0x6f, 0xa5, 0xb0, 0x09, 0x2c, + 0x8d, 0xd3, 0x92, 0xf0, 0x5a, 0x2a, 0xda, 0x5b, + 0x1e, 0xd5, 0x9a, 0xc4, 0xc4, 0xf3, 0x49, 0x74, + 0x41, 0xca, 0xe8, 0xc1, 0xf8, 0x44, 0xd6, 0x3c, + 0xae, 0x6c, 0x1d, 0x9a, 0x30, 0x04, 0x4d, 0x27, + 0x0e, 0xb1, 0x5f, 0x59, 0xa2, 0x24, 0xe8, 0xe1, + 0x98, 0xc5, 0x6a, 0x4c, 0xfe, 0x41, 0xd2, 0x27, + 0x42, 0x52, 0xe1, 0xe9, 0x7d, 0x62, 0xe4, 0x88, + 0x0f, 0xad, 0xb2, 0x70, 0xcb, 0x9d, 0x4c, 0x27, + 0x2e, 0x76, 0x1e, 0x1a, 0x63, 0x65, 0xf5, 0x3b, + 0xf8, 0x57, 0x69, 0xeb, 0x5b, 0x38, 0x26, 0x39, + 0x33, 0x25, 0x45, 0x3e, 0x91, 0xb8, 0xd8, 0xc7, + 0xd5, 0x42, 0xc0, 0x22, 0x31, 0x74, 0xf4, 0xbc, + 0x0c, 0x23, 0xf1, 0xca, 0xc1, 0x8d, 0xd7, 0xbe, + 0xc9, 0x62, 0xe4, 0x08, 0x1a, 0xcf, 0x36, 0xd5, + 0xfe, 0x55, 0x21, 0x59, 0x91, 0x87, 0x87, 0xdf, + 0x06, 0xdb, 0xdf, 0x96, 0x45, 0x58, 0xda, 0x05, + 0xcd, 0x50, 0x4d, 0xd2, 0x7d, 0x05, 0x18, 0x73, + 0x6a, 0x8d, 0x11, 0x85, 0xa6, 0x88, 0xe8, 0xda, + 0xe6, 0x30, 0x33, 0xa4, 0x89, 0x31, 0x75, 0xbe, + 0x69, 0x43, 0x84, 0x43, 0x50, 0x87, 0xdd, 0x71, + 0x36, 0x83, 0xc3, 0x78, 0x74, 0x24, 0x0a, 0xed, + 0x7b, 0xdb, 0xa4, 0x24, 0x0b, 0xb9, 0x7e, 0x5d, + 0xff, 0xde, 0xb1, 0xef, 0x61, 0x5a, 0x45, 0x33, + 0xf6, 0x17, 0x07, 0x08, 0x98, 0x83, 0x92, 0x0f, + 0x23, 0x6d, 0xe6, 0xaa, 0x17, 0x54, 0xad, 0x6a, + 0xc8, 0xdb, 0x26, 0xbe, 0xb8, 0xb6, 0x08, 0xfa, + 0x68, 0xf1, 0xd7, 0x79, 0x6f, 0x18, 0xb4, 0x9e, + 0x2d, 0x3f, 0x1b, 0x64, 0xaf, 0x8d, 0x06, 0x0e, + 0x49, 0x28, 0xe0, 0x5d, 0x45, 0x68, 0x13, 0x87, + 0xfa, 0xde, 0x40, 0x7b, 0xd2, 0xc3, 0x94, 0xd5, + 0xe1, 0xd9, 0xc2, 0xaf, 0x55, 0x89, 0xeb, 0xb4, + 0x12, 0x59, 0xa8, 0xd4, 0xc5, 0x29, 0x66, 0x38, + 0xe6, 0xac, 0x22, 0x22, 0xd9, 0x64, 0x9b, 0x34, + 0x0a, 0x32, 0x9f, 0xc2, 0xbf, 0x17, 0x6c, 0x3f, + 0x71, 0x7a, 0x38, 0x6b, 0x98, 0xfb, 0x49, 0x36, + 0x89, 0xc9, 0xe2, 0xd6, 0xc7, 0x5d, 0xd0, 0x69, + 0x5f, 0x23, 0x35, 0xc9, 0x30, 0xe2, 0xfd, 0x44, + 0x58, 0x39, 0xd7, 0x97, 0xfb, 0x5c, 0x00, 0xd5, + 0x4f, 0x7a, 0x1a, 0x95, 0x8b, 0x62, 0x4b, 0xce, + 0xe5, 0x91, 0x21, 0x7b, 0x30, 0x00, 0xd6, 0xdd, + 0x6d, 0x02, 0x86, 0x49, 0x0f, 0x3c, 0x1a, 0x27, + 0x3c, 0xd3, 0x0e, 0x71, 0xf2, 0xff, 0xf5, 0x2f, + 0x87, 0xac, 0x67, 0x59, 0x81, 0xa3, 0xf7, 0xf8, + 0xd6, 0x11, 0x0c, 0x84, 0xa9, 0x03, 0xee, 0x2a, + 0xc4, 0xf3, 0x22, 0xab, 0x7c, 0xe2, 0x25, 0xf5, + 0x67, 0xa3, 0xe4, 0x11, 0xe0, 0x59, 0xb3, 0xca, + 0x87, 0xa0, 0xae, 0xc9, 0xa6, 0x62, 0x1b, 0x6e, + 0x4d, 0x02, 0x6b, 0x07, 0x9d, 0xfd, 0xd0, 0x92, + 0x06, 0xe1, 0xb2, 0x9a, 0x4a, 0x1f, 0x1f, 0x13, + 0x49, 0x99, 0x97, 0x08, 0xde, 0x7f, 0x98, 0xaf, + 0x51, 0x98, 0xee, 0x2c, 0xcb, 0xf0, 0x0b, 0xc6, + 0xb6, 0xb7, 0x2d, 0x9a, 0xb1, 0xac, 0xa6, 0xe3, + 0x15, 0x77, 0x9d, 0x6b, 0x1a, 0xe4, 0xfc, 0x8b, + 0xf2, 0x17, 0x59, 0x08, 0x04, 0x58, 0x81, 0x9d, + 0x1b, 0x1b, 0x69, 0x55, 0xc2, 0xb4, 0x3c, 0x1f, + 0x50, 0xf1, 0x7f, 0x77, 0x90, 0x4c, 0x66, 0x40, + 0x5a, 0xc0, 0x33, 0x1f, 0xcb, 0x05, 0x6d, 0x5c, + 0x06, 0x87, 0x52, 0xa2, 0x8f, 0x26, 0xd5, 0x4f +}; +static const u8 dec_assoc010[] __initconst = { + 0xd2, 0xa1, 0x70, 0xdb, 0x7a, 0xf8, 0xfa, 0x27, + 0xba, 0x73, 0x0f, 0xbf, 0x3d, 0x1e, 0x82, 0xb2 +}; +static const u8 dec_nonce010[] __initconst = { + 0xdb, 0x92, 0x0f, 0x7f, 0x17, 0x54, 0x0c, 0x30 +}; +static const u8 dec_key010[] __initconst = { + 0x47, 0x11, 0xeb, 0x86, 0x2b, 0x2c, 0xab, 0x44, + 0x34, 0xda, 0x7f, 0x57, 0x03, 0x39, 0x0c, 0xaf, + 0x2c, 0x14, 0xfd, 0x65, 0x23, 0xe9, 0x8e, 0x74, + 0xd5, 0x08, 0x68, 0x08, 0xe7, 0xb4, 0x72, 0xd7 +}; + +static const u8 dec_input011[] __initconst = { + 0x6a, 0xfc, 0x4b, 0x25, 0xdf, 0xc0, 0xe4, 0xe8, + 0x17, 0x4d, 0x4c, 0xc9, 0x7e, 0xde, 0x3a, 0xcc, + 0x3c, 0xba, 0x6a, 0x77, 0x47, 0xdb, 0xe3, 0x74, + 0x7a, 0x4d, 0x5f, 0x8d, 0x37, 0x55, 0x80, 0x73, + 0x90, 0x66, 0x5d, 0x3a, 0x7d, 0x5d, 0x86, 0x5e, + 0x8d, 0xfd, 0x83, 0xff, 0x4e, 0x74, 0x6f, 0xf9, + 0xe6, 0x70, 0x17, 0x70, 0x3e, 0x96, 0xa7, 0x7e, + 0xcb, 0xab, 0x8f, 0x58, 0x24, 0x9b, 0x01, 0xfd, + 0xcb, 0xe6, 0x4d, 0x9b, 0xf0, 0x88, 0x94, 0x57, + 0x66, 0xef, 0x72, 0x4c, 0x42, 0x6e, 0x16, 0x19, + 0x15, 0xea, 0x70, 0x5b, 0xac, 0x13, 0xdb, 0x9f, + 0x18, 0xe2, 0x3c, 0x26, 0x97, 0xbc, 0xdc, 0x45, + 0x8c, 0x6c, 0x24, 0x69, 0x9c, 0xf7, 0x65, 0x1e, + 0x18, 0x59, 0x31, 0x7c, 0xe4, 0x73, 0xbc, 0x39, + 0x62, 0xc6, 0x5c, 0x9f, 0xbf, 0xfa, 0x90, 0x03, + 0xc9, 0x72, 0x26, 0xb6, 0x1b, 0xc2, 0xb7, 0x3f, + 0xf2, 0x13, 0x77, 0xf2, 0x8d, 0xb9, 0x47, 0xd0, + 0x53, 0xdd, 0xc8, 0x91, 0x83, 0x8b, 0xb1, 0xce, + 0xa3, 0xfe, 0xcd, 0xd9, 0xdd, 0x92, 0x7b, 0xdb, + 0xb8, 0xfb, 0xc9, 0x2d, 0x01, 0x59, 0x39, 0x52, + 0xad, 0x1b, 0xec, 0xcf, 0xd7, 0x70, 0x13, 0x21, + 0xf5, 0x47, 0xaa, 0x18, 0x21, 0x5c, 0xc9, 0x9a, + 0xd2, 0x6b, 0x05, 0x9c, 0x01, 0xa1, 0xda, 0x35, + 0x5d, 0xb3, 0x70, 0xe6, 0xa9, 0x80, 0x8b, 0x91, + 0xb7, 0xb3, 0x5f, 0x24, 0x9a, 0xb7, 0xd1, 0x6b, + 0xa1, 0x1c, 0x50, 0xba, 0x49, 0xe0, 0xee, 0x2e, + 0x75, 0xac, 0x69, 0xc0, 0xeb, 0x03, 0xdd, 0x19, + 0xe5, 0xf6, 0x06, 0xdd, 0xc3, 0xd7, 0x2b, 0x07, + 0x07, 0x30, 0xa7, 0x19, 0x0c, 0xbf, 0xe6, 0x18, + 0xcc, 0xb1, 0x01, 0x11, 0x85, 0x77, 0x1d, 0x96, + 0xa7, 0xa3, 0x00, 0x84, 0x02, 0xa2, 0x83, 0x68, + 0xda, 0x17, 0x27, 0xc8, 0x7f, 0x23, 0xb7, 0xf4, + 0x13, 0x85, 0xcf, 0xdd, 0x7a, 0x7d, 0x24, 0x57, + 0xfe, 0x05, 0x93, 0xf5, 0x74, 0xce, 0xed, 0x0c, + 0x20, 0x98, 0x8d, 0x92, 0x30, 0xa1, 0x29, 0x23, + 0x1a, 0xa0, 0x4f, 0x69, 0x56, 0x4c, 0xe1, 0xc8, + 0xce, 0xf6, 0x9a, 0x0c, 0xa4, 0xfa, 0x04, 0xf6, + 0x62, 0x95, 0xf2, 0xfa, 0xc7, 0x40, 0x68, 0x40, + 0x8f, 0x41, 0xda, 0xb4, 0x26, 0x6f, 0x70, 0xab, + 0x40, 0x61, 0xa4, 0x0e, 0x75, 0xfb, 0x86, 0xeb, + 0x9d, 0x9a, 0x1f, 0xec, 0x76, 0x99, 0xe7, 0xea, + 0xaa, 0x1e, 0x2d, 0xb5, 0xd4, 0xa6, 0x1a, 0xb8, + 0x61, 0x0a, 0x1d, 0x16, 0x5b, 0x98, 0xc2, 0x31, + 0x40, 0xe7, 0x23, 0x1d, 0x66, 0x99, 0xc8, 0xc0, + 0xd7, 0xce, 0xf3, 0x57, 0x40, 0x04, 0x3f, 0xfc, + 0xea, 0xb3, 0xfc, 0xd2, 0xd3, 0x99, 0xa4, 0x94, + 0x69, 0xa0, 0xef, 0xd1, 0x85, 0xb3, 0xa6, 0xb1, + 0x28, 0xbf, 0x94, 0x67, 0x22, 0xc3, 0x36, 0x46, + 0xf8, 0xd2, 0x0f, 0x5f, 0xf4, 0x59, 0x80, 0xe6, + 0x2d, 0x43, 0x08, 0x7d, 0x19, 0x09, 0x97, 0xa7, + 0x4c, 0x3d, 0x8d, 0xba, 0x65, 0x62, 0xa3, 0x71, + 0x33, 0x29, 0x62, 0xdb, 0xc1, 0x33, 0x34, 0x1a, + 0x63, 0x33, 0x16, 0xb6, 0x64, 0x7e, 0xab, 0x33, + 0xf0, 0xe6, 0x26, 0x68, 0xba, 0x1d, 0x2e, 0x38, + 0x08, 0xe6, 0x02, 0xd3, 0x25, 0x2c, 0x47, 0x23, + 0x58, 0x34, 0x0f, 0x9d, 0x63, 0x4f, 0x63, 0xbb, + 0x7f, 0x3b, 0x34, 0x38, 0xa7, 0xb5, 0x8d, 0x65, + 0xd9, 0x9f, 0x79, 0x55, 0x3e, 0x4d, 0xe7, 0x73, + 0xd8, 0xf6, 0x98, 0x97, 0x84, 0x60, 0x9c, 0xc8, + 0xa9, 0x3c, 0xf6, 0xdc, 0x12, 0x5c, 0xe1, 0xbb, + 0x0b, 0x8b, 0x98, 0x9c, 0x9d, 0x26, 0x7c, 0x4a, + 0xe6, 0x46, 0x36, 0x58, 0x21, 0x4a, 0xee, 0xca, + 0xd7, 0x3b, 0xc2, 0x6c, 0x49, 0x2f, 0xe5, 0xd5, + 0x03, 0x59, 0x84, 0x53, 0xcb, 0xfe, 0x92, 0x71, + 0x2e, 0x7c, 0x21, 0xcc, 0x99, 0x85, 0x7f, 0xb8, + 0x74, 0x90, 0x13, 0x42, 0x3f, 0xe0, 0x6b, 0x1d, + 0xf2, 0x4d, 0x54, 0xd4, 0xfc, 0x3a, 0x05, 0xe6, + 0x74, 0xaf, 0xa6, 0xa0, 0x2a, 0x20, 0x23, 0x5d, + 0x34, 0x5c, 0xd9, 0x3e, 0x4e, 0xfa, 0x93, 0xe7, + 0xaa, 0xe9, 0x6f, 0x08, 0x43, 0x67, 0x41, 0xc5, + 0xad, 0xfb, 0x31, 0x95, 0x82, 0x73, 0x32, 0xd8, + 0xa6, 0xa3, 0xed, 0x0e, 0x2d, 0xf6, 0x5f, 0xfd, + 0x80, 0xa6, 0x7a, 0xe0, 0xdf, 0x78, 0x15, 0x29, + 0x74, 0x33, 0xd0, 0x9e, 0x83, 0x86, 0x72, 0x22, + 0x57, 0x29, 0xb9, 0x9e, 0x5d, 0xd3, 0x1a, 0xb5, + 0x96, 0x72, 0x41, 0x3d, 0xf1, 0x64, 0x43, 0x67, + 0xee, 0xaa, 0x5c, 0xd3, 0x9a, 0x96, 0x13, 0x11, + 0x5d, 0xf3, 0x0c, 0x87, 0x82, 0x1e, 0x41, 0x9e, + 0xd0, 0x27, 0xd7, 0x54, 0x3b, 0x67, 0x73, 0x09, + 0x91, 0xe9, 0xd5, 0x36, 0xa7, 0xb5, 0x55, 0xe4, + 0xf3, 0x21, 0x51, 0x49, 0x22, 0x07, 0x55, 0x4f, + 0x44, 0x4b, 0xd2, 0x15, 0x93, 0x17, 0x2a, 0xfa, + 0x4d, 0x4a, 0x57, 0xdb, 0x4c, 0xa6, 0xeb, 0xec, + 0x53, 0x25, 0x6c, 0x21, 0xed, 0x00, 0x4c, 0x3b, + 0xca, 0x14, 0x57, 0xa9, 0xd6, 0x6a, 0xcd, 0x8d, + 0x5e, 0x74, 0xac, 0x72, 0xc1, 0x97, 0xe5, 0x1b, + 0x45, 0x4e, 0xda, 0xfc, 0xcc, 0x40, 0xe8, 0x48, + 0x88, 0x0b, 0xa3, 0xe3, 0x8d, 0x83, 0x42, 0xc3, + 0x23, 0xfd, 0x68, 0xb5, 0x8e, 0xf1, 0x9d, 0x63, + 0x77, 0xe9, 0xa3, 0x8e, 0x8c, 0x26, 0x6b, 0xbd, + 0x72, 0x73, 0x35, 0x0c, 0x03, 0xf8, 0x43, 0x78, + 0x52, 0x71, 0x15, 0x1f, 0x71, 0x5d, 0x6e, 0xed, + 0xb9, 0xcc, 0x86, 0x30, 0xdb, 0x2b, 0xd3, 0x82, + 0x88, 0x23, 0x71, 0x90, 0x53, 0x5c, 0xa9, 0x2f, + 0x76, 0x01, 0xb7, 0x9a, 0xfe, 0x43, 0x55, 0xa3, + 0x04, 0x9b, 0x0e, 0xe4, 0x59, 0xdf, 0xc9, 0xe9, + 0xb1, 0xea, 0x29, 0x28, 0x3c, 0x5c, 0xae, 0x72, + 0x84, 0xb6, 0xc6, 0xeb, 0x0c, 0x27, 0x07, 0x74, + 0x90, 0x0d, 0x31, 0xb0, 0x00, 0x77, 0xe9, 0x40, + 0x70, 0x6f, 0x68, 0xa7, 0xfd, 0x06, 0xec, 0x4b, + 0xc0, 0xb7, 0xac, 0xbc, 0x33, 0xb7, 0x6d, 0x0a, + 0xbd, 0x12, 0x1b, 0x59, 0xcb, 0xdd, 0x32, 0xf5, + 0x1d, 0x94, 0x57, 0x76, 0x9e, 0x0c, 0x18, 0x98, + 0x71, 0xd7, 0x2a, 0xdb, 0x0b, 0x7b, 0xa7, 0x71, + 0xb7, 0x67, 0x81, 0x23, 0x96, 0xae, 0xb9, 0x7e, + 0x32, 0x43, 0x92, 0x8a, 0x19, 0xa0, 0xc4, 0xd4, + 0x3b, 0x57, 0xf9, 0x4a, 0x2c, 0xfb, 0x51, 0x46, + 0xbb, 0xcb, 0x5d, 0xb3, 0xef, 0x13, 0x93, 0x6e, + 0x68, 0x42, 0x54, 0x57, 0xd3, 0x6a, 0x3a, 0x8f, + 0x9d, 0x66, 0xbf, 0xbd, 0x36, 0x23, 0xf5, 0x93, + 0x83, 0x7b, 0x9c, 0xc0, 0xdd, 0xc5, 0x49, 0xc0, + 0x64, 0xed, 0x07, 0x12, 0xb3, 0xe6, 0xe4, 0xe5, + 0x38, 0x95, 0x23, 0xb1, 0xa0, 0x3b, 0x1a, 0x61, + 0xda, 0x17, 0xac, 0xc3, 0x58, 0xdd, 0x74, 0x64, + 0x22, 0x11, 0xe8, 0x32, 0x1d, 0x16, 0x93, 0x85, + 0x99, 0xa5, 0x9c, 0x34, 0x55, 0xb1, 0xe9, 0x20, + 0x72, 0xc9, 0x28, 0x7b, 0x79, 0x00, 0xa1, 0xa6, + 0xa3, 0x27, 0x40, 0x18, 0x8a, 0x54, 0xe0, 0xcc, + 0xe8, 0x4e, 0x8e, 0x43, 0x96, 0xe7, 0x3f, 0xc8, + 0xe9, 0xb2, 0xf9, 0xc9, 0xda, 0x04, 0x71, 0x50, + 0x47, 0xe4, 0xaa, 0xce, 0xa2, 0x30, 0xc8, 0xe4, + 0xac, 0xc7, 0x0d, 0x06, 0x2e, 0xe6, 0xe8, 0x80, + 0x36, 0x29, 0x9e, 0x01, 0xb8, 0xc3, 0xf0, 0xa0, + 0x5d, 0x7a, 0xca, 0x4d, 0xa0, 0x57, 0xbd, 0x2a, + 0x45, 0xa7, 0x7f, 0x9c, 0x93, 0x07, 0x8f, 0x35, + 0x67, 0x92, 0xe3, 0xe9, 0x7f, 0xa8, 0x61, 0x43, + 0x9e, 0x25, 0x4f, 0x33, 0x76, 0x13, 0x6e, 0x12, + 0xb9, 0xdd, 0xa4, 0x7c, 0x08, 0x9f, 0x7c, 0xe7, + 0x0a, 0x8d, 0x84, 0x06, 0xa4, 0x33, 0x17, 0x34, + 0x5e, 0x10, 0x7c, 0xc0, 0xa8, 0x3d, 0x1f, 0x42, + 0x20, 0x51, 0x65, 0x5d, 0x09, 0xc3, 0xaa, 0xc0, + 0xc8, 0x0d, 0xf0, 0x79, 0xbc, 0x20, 0x1b, 0x95, + 0xe7, 0x06, 0x7d, 0x47, 0x20, 0x03, 0x1a, 0x74, + 0xdd, 0xe2, 0xd4, 0xae, 0x38, 0x71, 0x9b, 0xf5, + 0x80, 0xec, 0x08, 0x4e, 0x56, 0xba, 0x76, 0x12, + 0x1a, 0xdf, 0x48, 0xf3, 0xae, 0xb3, 0xe6, 0xe6, + 0xbe, 0xc0, 0x91, 0x2e, 0x01, 0xb3, 0x01, 0x86, + 0xa2, 0xb9, 0x52, 0xd1, 0x21, 0xae, 0xd4, 0x97, + 0x1d, 0xef, 0x41, 0x12, 0x95, 0x3d, 0x48, 0x45, + 0x1c, 0x56, 0x32, 0x8f, 0xb8, 0x43, 0xbb, 0x19, + 0xf3, 0xca, 0xe9, 0xeb, 0x6d, 0x84, 0xbe, 0x86, + 0x06, 0xe2, 0x36, 0xb2, 0x62, 0x9d, 0xd3, 0x4c, + 0x48, 0x18, 0x54, 0x13, 0x4e, 0xcf, 0xfd, 0xba, + 0x84, 0xb9, 0x30, 0x53, 0xcf, 0xfb, 0xb9, 0x29, + 0x8f, 0xdc, 0x9f, 0xef, 0x60, 0x0b, 0x64, 0xf6, + 0x8b, 0xee, 0xa6, 0x91, 0xc2, 0x41, 0x6c, 0xf6, + 0xfa, 0x79, 0x67, 0x4b, 0xc1, 0x3f, 0xaf, 0x09, + 0x81, 0xd4, 0x5d, 0xcb, 0x09, 0xdf, 0x36, 0x31, + 0xc0, 0x14, 0x3c, 0x7c, 0x0e, 0x65, 0x95, 0x99, + 0x6d, 0xa3, 0xf4, 0xd7, 0x38, 0xee, 0x1a, 0x2b, + 0x37, 0xe2, 0xa4, 0x3b, 0x4b, 0xd0, 0x65, 0xca, + 0xf8, 0xc3, 0xe8, 0x15, 0x20, 0xef, 0xf2, 0x00, + 0xfd, 0x01, 0x09, 0xc5, 0xc8, 0x17, 0x04, 0x93, + 0xd0, 0x93, 0x03, 0x55, 0xc5, 0xfe, 0x32, 0xa3, + 0x3e, 0x28, 0x2d, 0x3b, 0x93, 0x8a, 0xcc, 0x07, + 0x72, 0x80, 0x8b, 0x74, 0x16, 0x24, 0xbb, 0xda, + 0x94, 0x39, 0x30, 0x8f, 0xb1, 0xcd, 0x4a, 0x90, + 0x92, 0x7c, 0x14, 0x8f, 0x95, 0x4e, 0xac, 0x9b, + 0xd8, 0x8f, 0x1a, 0x87, 0xa4, 0x32, 0x27, 0x8a, + 0xba, 0xf7, 0x41, 0xcf, 0x84, 0x37, 0x19, 0xe6, + 0x06, 0xf5, 0x0e, 0xcf, 0x36, 0xf5, 0x9e, 0x6c, + 0xde, 0xbc, 0xff, 0x64, 0x7e, 0x4e, 0x59, 0x57, + 0x48, 0xfe, 0x14, 0xf7, 0x9c, 0x93, 0x5d, 0x15, + 0xad, 0xcc, 0x11, 0xb1, 0x17, 0x18, 0xb2, 0x7e, + 0xcc, 0xab, 0xe9, 0xce, 0x7d, 0x77, 0x5b, 0x51, + 0x1b, 0x1e, 0x20, 0xa8, 0x32, 0x06, 0x0e, 0x75, + 0x93, 0xac, 0xdb, 0x35, 0x37, 0x1f, 0xe9, 0x19, + 0x1d, 0xb4, 0x71, 0x97, 0xd6, 0x4e, 0x2c, 0x08, + 0xa5, 0x13, 0xf9, 0x0e, 0x7e, 0x78, 0x6e, 0x14, + 0xe0, 0xa9, 0xb9, 0x96, 0x4c, 0x80, 0x82, 0xba, + 0x17, 0xb3, 0x9d, 0x69, 0xb0, 0x84, 0x46, 0xff, + 0xf9, 0x52, 0x79, 0x94, 0x58, 0x3a, 0x62, 0x90, + 0x15, 0x35, 0x71, 0x10, 0x37, 0xed, 0xa1, 0x8e, + 0x53, 0x6e, 0xf4, 0x26, 0x57, 0x93, 0x15, 0x93, + 0xf6, 0x81, 0x2c, 0x5a, 0x10, 0xda, 0x92, 0xad, + 0x2f, 0xdb, 0x28, 0x31, 0x2d, 0x55, 0x04, 0xd2, + 0x06, 0x28, 0x8c, 0x1e, 0xdc, 0xea, 0x54, 0xac, + 0xff, 0xb7, 0x6c, 0x30, 0x15, 0xd4, 0xb4, 0x0d, + 0x00, 0x93, 0x57, 0xdd, 0xd2, 0x07, 0x07, 0x06, + 0xd9, 0x43, 0x9b, 0xcd, 0x3a, 0xf4, 0x7d, 0x4c, + 0x36, 0x5d, 0x23, 0xa2, 0xcc, 0x57, 0x40, 0x91, + 0xe9, 0x2c, 0x2f, 0x2c, 0xd5, 0x30, 0x9b, 0x17, + 0xb0, 0xc9, 0xf7, 0xa7, 0x2f, 0xd1, 0x93, 0x20, + 0x6b, 0xc6, 0xc1, 0xe4, 0x6f, 0xcb, 0xd1, 0xe7, + 0x09, 0x0f, 0x9e, 0xdc, 0xaa, 0x9f, 0x2f, 0xdf, + 0x56, 0x9f, 0xd4, 0x33, 0x04, 0xaf, 0xd3, 0x6c, + 0x58, 0x61, 0xf0, 0x30, 0xec, 0xf2, 0x7f, 0xf2, + 0x9c, 0xdf, 0x39, 0xbb, 0x6f, 0xa2, 0x8c, 0x7e, + 0xc4, 0x22, 0x51, 0x71, 0xc0, 0x4d, 0x14, 0x1a, + 0xc4, 0xcd, 0x04, 0xd9, 0x87, 0x08, 0x50, 0x05, + 0xcc, 0xaf, 0xf6, 0xf0, 0x8f, 0x92, 0x54, 0x58, + 0xc2, 0xc7, 0x09, 0x7a, 0x59, 0x02, 0x05, 0xe8, + 0xb0, 0x86, 0xd9, 0xbf, 0x7b, 0x35, 0x51, 0x4d, + 0xaf, 0x08, 0x97, 0x2c, 0x65, 0xda, 0x2a, 0x71, + 0x3a, 0xa8, 0x51, 0xcc, 0xf2, 0x73, 0x27, 0xc3, + 0xfd, 0x62, 0xcf, 0xe3, 0xb2, 0xca, 0xcb, 0xbe, + 0x1a, 0x0a, 0xa1, 0x34, 0x7b, 0x77, 0xc4, 0x62, + 0x68, 0x78, 0x5f, 0x94, 0x07, 0x04, 0x65, 0x16, + 0x4b, 0x61, 0xcb, 0xff, 0x75, 0x26, 0x50, 0x66, + 0x1f, 0x6e, 0x93, 0xf8, 0xc5, 0x51, 0xeb, 0xa4, + 0x4a, 0x48, 0x68, 0x6b, 0xe2, 0x5e, 0x44, 0xb2, + 0x50, 0x2c, 0x6c, 0xae, 0x79, 0x4e, 0x66, 0x35, + 0x81, 0x50, 0xac, 0xbc, 0x3f, 0xb1, 0x0c, 0xf3, + 0x05, 0x3c, 0x4a, 0xa3, 0x6c, 0x2a, 0x79, 0xb4, + 0xb7, 0xab, 0xca, 0xc7, 0x9b, 0x8e, 0xcd, 0x5f, + 0x11, 0x03, 0xcb, 0x30, 0xa3, 0xab, 0xda, 0xfe, + 0x64, 0xb9, 0xbb, 0xd8, 0x5e, 0x3a, 0x1a, 0x56, + 0xe5, 0x05, 0x48, 0x90, 0x1e, 0x61, 0x69, 0x1b, + 0x22, 0xe6, 0x1a, 0x3c, 0x75, 0xad, 0x1f, 0x37, + 0x28, 0xdc, 0xe4, 0x6d, 0xbd, 0x42, 0xdc, 0xd3, + 0xc8, 0xb6, 0x1c, 0x48, 0xfe, 0x94, 0x77, 0x7f, + 0xbd, 0x62, 0xac, 0xa3, 0x47, 0x27, 0xcf, 0x5f, + 0xd9, 0xdb, 0xaf, 0xec, 0xf7, 0x5e, 0xc1, 0xb0, + 0x9d, 0x01, 0x26, 0x99, 0x7e, 0x8f, 0x03, 0x70, + 0xb5, 0x42, 0xbe, 0x67, 0x28, 0x1b, 0x7c, 0xbd, + 0x61, 0x21, 0x97, 0xcc, 0x5c, 0xe1, 0x97, 0x8f, + 0x8d, 0xde, 0x2b, 0xaa, 0xa7, 0x71, 0x1d, 0x1e, + 0x02, 0x73, 0x70, 0x58, 0x32, 0x5b, 0x1d, 0x67, + 0x3d, 0xe0, 0x74, 0x4f, 0x03, 0xf2, 0x70, 0x51, + 0x79, 0xf1, 0x61, 0x70, 0x15, 0x74, 0x9d, 0x23, + 0x89, 0xde, 0xac, 0xfd, 0xde, 0xd0, 0x1f, 0xc3, + 0x87, 0x44, 0x35, 0x4b, 0xe5, 0xb0, 0x60, 0xc5, + 0x22, 0xe4, 0x9e, 0xca, 0xeb, 0xd5, 0x3a, 0x09, + 0x45, 0xa4, 0xdb, 0xfa, 0x3f, 0xeb, 0x1b, 0xc7, + 0xc8, 0x14, 0x99, 0x51, 0x92, 0x10, 0xed, 0xed, + 0x28, 0xe0, 0xa1, 0xf8, 0x26, 0xcf, 0xcd, 0xcb, + 0x63, 0xa1, 0x3b, 0xe3, 0xdf, 0x7e, 0xfe, 0xa6, + 0xf0, 0x81, 0x9a, 0xbf, 0x55, 0xde, 0x54, 0xd5, + 0x56, 0x60, 0x98, 0x10, 0x68, 0xf4, 0x38, 0x96, + 0x8e, 0x6f, 0x1d, 0x44, 0x7f, 0xd6, 0x2f, 0xfe, + 0x55, 0xfb, 0x0c, 0x7e, 0x67, 0xe2, 0x61, 0x44, + 0xed, 0xf2, 0x35, 0x30, 0x5d, 0xe9, 0xc7, 0xd6, + 0x6d, 0xe0, 0xa0, 0xed, 0xf3, 0xfc, 0xd8, 0x3e, + 0x0a, 0x7b, 0xcd, 0xaf, 0x65, 0x68, 0x18, 0xc0, + 0xec, 0x04, 0x1c, 0x74, 0x6d, 0xe2, 0x6e, 0x79, + 0xd4, 0x11, 0x2b, 0x62, 0xd5, 0x27, 0xad, 0x4f, + 0x01, 0x59, 0x73, 0xcc, 0x6a, 0x53, 0xfb, 0x2d, + 0xd5, 0x4e, 0x99, 0x21, 0x65, 0x4d, 0xf5, 0x82, + 0xf7, 0xd8, 0x42, 0xce, 0x6f, 0x3d, 0x36, 0x47, + 0xf1, 0x05, 0x16, 0xe8, 0x1b, 0x6a, 0x8f, 0x93, + 0xf2, 0x8f, 0x37, 0x40, 0x12, 0x28, 0xa3, 0xe6, + 0xb9, 0x17, 0x4a, 0x1f, 0xb1, 0xd1, 0x66, 0x69, + 0x86, 0xc4, 0xfc, 0x97, 0xae, 0x3f, 0x8f, 0x1e, + 0x2b, 0xdf, 0xcd, 0xf9, 0x3c +}; +static const u8 dec_output011[] __initconst = { + 0x7a, 0x57, 0xf2, 0xc7, 0x06, 0x3f, 0x50, 0x7b, + 0x36, 0x1a, 0x66, 0x5c, 0xb9, 0x0e, 0x5e, 0x3b, + 0x45, 0x60, 0xbe, 0x9a, 0x31, 0x9f, 0xff, 0x5d, + 0x66, 0x34, 0xb4, 0xdc, 0xfb, 0x9d, 0x8e, 0xee, + 0x6a, 0x33, 0xa4, 0x07, 0x3c, 0xf9, 0x4c, 0x30, + 0xa1, 0x24, 0x52, 0xf9, 0x50, 0x46, 0x88, 0x20, + 0x02, 0x32, 0x3a, 0x0e, 0x99, 0x63, 0xaf, 0x1f, + 0x15, 0x28, 0x2a, 0x05, 0xff, 0x57, 0x59, 0x5e, + 0x18, 0xa1, 0x1f, 0xd0, 0x92, 0x5c, 0x88, 0x66, + 0x1b, 0x00, 0x64, 0xa5, 0x93, 0x8d, 0x06, 0x46, + 0xb0, 0x64, 0x8b, 0x8b, 0xef, 0x99, 0x05, 0x35, + 0x85, 0xb3, 0xf3, 0x33, 0xbb, 0xec, 0x66, 0xb6, + 0x3d, 0x57, 0x42, 0xe3, 0xb4, 0xc6, 0xaa, 0xb0, + 0x41, 0x2a, 0xb9, 0x59, 0xa9, 0xf6, 0x3e, 0x15, + 0x26, 0x12, 0x03, 0x21, 0x4c, 0x74, 0x43, 0x13, + 0x2a, 0x03, 0x27, 0x09, 0xb4, 0xfb, 0xe7, 0xb7, + 0x40, 0xff, 0x5e, 0xce, 0x48, 0x9a, 0x60, 0xe3, + 0x8b, 0x80, 0x8c, 0x38, 0x2d, 0xcb, 0x93, 0x37, + 0x74, 0x05, 0x52, 0x6f, 0x73, 0x3e, 0xc3, 0xbc, + 0xca, 0x72, 0x0a, 0xeb, 0xf1, 0x3b, 0xa0, 0x95, + 0xdc, 0x8a, 0xc4, 0xa9, 0xdc, 0xca, 0x44, 0xd8, + 0x08, 0x63, 0x6a, 0x36, 0xd3, 0x3c, 0xb8, 0xac, + 0x46, 0x7d, 0xfd, 0xaa, 0xeb, 0x3e, 0x0f, 0x45, + 0x8f, 0x49, 0xda, 0x2b, 0xf2, 0x12, 0xbd, 0xaf, + 0x67, 0x8a, 0x63, 0x48, 0x4b, 0x55, 0x5f, 0x6d, + 0x8c, 0xb9, 0x76, 0x34, 0x84, 0xae, 0xc2, 0xfc, + 0x52, 0x64, 0x82, 0xf7, 0xb0, 0x06, 0xf0, 0x45, + 0x73, 0x12, 0x50, 0x30, 0x72, 0xea, 0x78, 0x9a, + 0xa8, 0xaf, 0xb5, 0xe3, 0xbb, 0x77, 0x52, 0xec, + 0x59, 0x84, 0xbf, 0x6b, 0x8f, 0xce, 0x86, 0x5e, + 0x1f, 0x23, 0xe9, 0xfb, 0x08, 0x86, 0xf7, 0x10, + 0xb9, 0xf2, 0x44, 0x96, 0x44, 0x63, 0xa9, 0xa8, + 0x78, 0x00, 0x23, 0xd6, 0xc7, 0xe7, 0x6e, 0x66, + 0x4f, 0xcc, 0xee, 0x15, 0xb3, 0xbd, 0x1d, 0xa0, + 0xe5, 0x9c, 0x1b, 0x24, 0x2c, 0x4d, 0x3c, 0x62, + 0x35, 0x9c, 0x88, 0x59, 0x09, 0xdd, 0x82, 0x1b, + 0xcf, 0x0a, 0x83, 0x6b, 0x3f, 0xae, 0x03, 0xc4, + 0xb4, 0xdd, 0x7e, 0x5b, 0x28, 0x76, 0x25, 0x96, + 0xd9, 0xc9, 0x9d, 0x5f, 0x86, 0xfa, 0xf6, 0xd7, + 0xd2, 0xe6, 0x76, 0x1d, 0x0f, 0xa1, 0xdc, 0x74, + 0x05, 0x1b, 0x1d, 0xe0, 0xcd, 0x16, 0xb0, 0xa8, + 0x8a, 0x34, 0x7b, 0x15, 0x11, 0x77, 0xe5, 0x7b, + 0x7e, 0x20, 0xf7, 0xda, 0x38, 0xda, 0xce, 0x70, + 0xe9, 0xf5, 0x6c, 0xd9, 0xbe, 0x0c, 0x4c, 0x95, + 0x4c, 0xc2, 0x9b, 0x34, 0x55, 0x55, 0xe1, 0xf3, + 0x46, 0x8e, 0x48, 0x74, 0x14, 0x4f, 0x9d, 0xc9, + 0xf5, 0xe8, 0x1a, 0xf0, 0x11, 0x4a, 0xc1, 0x8d, + 0xe0, 0x93, 0xa0, 0xbe, 0x09, 0x1c, 0x2b, 0x4e, + 0x0f, 0xb2, 0x87, 0x8b, 0x84, 0xfe, 0x92, 0x32, + 0x14, 0xd7, 0x93, 0xdf, 0xe7, 0x44, 0xbc, 0xc5, + 0xae, 0x53, 0x69, 0xd8, 0xb3, 0x79, 0x37, 0x80, + 0xe3, 0x17, 0x5c, 0xec, 0x53, 0x00, 0x9a, 0xe3, + 0x8e, 0xdc, 0x38, 0xb8, 0x66, 0xf0, 0xd3, 0xad, + 0x1d, 0x02, 0x96, 0x86, 0x3e, 0x9d, 0x3b, 0x5d, + 0xa5, 0x7f, 0x21, 0x10, 0xf1, 0x1f, 0x13, 0x20, + 0xf9, 0x57, 0x87, 0x20, 0xf5, 0x5f, 0xf1, 0x17, + 0x48, 0x0a, 0x51, 0x5a, 0xcd, 0x19, 0x03, 0xa6, + 0x5a, 0xd1, 0x12, 0x97, 0xe9, 0x48, 0xe2, 0x1d, + 0x83, 0x75, 0x50, 0xd9, 0x75, 0x7d, 0x6a, 0x82, + 0xa1, 0xf9, 0x4e, 0x54, 0x87, 0x89, 0xc9, 0x0c, + 0xb7, 0x5b, 0x6a, 0x91, 0xc1, 0x9c, 0xb2, 0xa9, + 0xdc, 0x9a, 0xa4, 0x49, 0x0a, 0x6d, 0x0d, 0xbb, + 0xde, 0x86, 0x44, 0xdd, 0x5d, 0x89, 0x2b, 0x96, + 0x0f, 0x23, 0x95, 0xad, 0xcc, 0xa2, 0xb3, 0xb9, + 0x7e, 0x74, 0x38, 0xba, 0x9f, 0x73, 0xae, 0x5f, + 0xf8, 0x68, 0xa2, 0xe0, 0xa9, 0xce, 0xbd, 0x40, + 0xd4, 0x4c, 0x6b, 0xd2, 0x56, 0x62, 0xb0, 0xcc, + 0x63, 0x7e, 0x5b, 0xd3, 0xae, 0xd1, 0x75, 0xce, + 0xbb, 0xb4, 0x5b, 0xa8, 0xf8, 0xb4, 0xac, 0x71, + 0x75, 0xaa, 0xc9, 0x9f, 0xbb, 0x6c, 0xad, 0x0f, + 0x55, 0x5d, 0xe8, 0x85, 0x7d, 0xf9, 0x21, 0x35, + 0xea, 0x92, 0x85, 0x2b, 0x00, 0xec, 0x84, 0x90, + 0x0a, 0x63, 0x96, 0xe4, 0x6b, 0xa9, 0x77, 0xb8, + 0x91, 0xf8, 0x46, 0x15, 0x72, 0x63, 0x70, 0x01, + 0x40, 0xa3, 0xa5, 0x76, 0x62, 0x2b, 0xbf, 0xf1, + 0xe5, 0x8d, 0x9f, 0xa3, 0xfa, 0x9b, 0x03, 0xbe, + 0xfe, 0x65, 0x6f, 0xa2, 0x29, 0x0d, 0x54, 0xb4, + 0x71, 0xce, 0xa9, 0xd6, 0x3d, 0x88, 0xf9, 0xaf, + 0x6b, 0xa8, 0x9e, 0xf4, 0x16, 0x96, 0x36, 0xb9, + 0x00, 0xdc, 0x10, 0xab, 0xb5, 0x08, 0x31, 0x1f, + 0x00, 0xb1, 0x3c, 0xd9, 0x38, 0x3e, 0xc6, 0x04, + 0xa7, 0x4e, 0xe8, 0xae, 0xed, 0x98, 0xc2, 0xf7, + 0xb9, 0x00, 0x5f, 0x8c, 0x60, 0xd1, 0xe5, 0x15, + 0xf7, 0xae, 0x1e, 0x84, 0x88, 0xd1, 0xf6, 0xbc, + 0x3a, 0x89, 0x35, 0x22, 0x83, 0x7c, 0xca, 0xf0, + 0x33, 0x82, 0x4c, 0x79, 0x3c, 0xfd, 0xb1, 0xae, + 0x52, 0x62, 0x55, 0xd2, 0x41, 0x60, 0xc6, 0xbb, + 0xfa, 0x0e, 0x59, 0xd6, 0xa8, 0xfe, 0x5d, 0xed, + 0x47, 0x3d, 0xe0, 0xea, 0x1f, 0x6e, 0x43, 0x51, + 0xec, 0x10, 0x52, 0x56, 0x77, 0x42, 0x6b, 0x52, + 0x87, 0xd8, 0xec, 0xe0, 0xaa, 0x76, 0xa5, 0x84, + 0x2a, 0x22, 0x24, 0xfd, 0x92, 0x40, 0x88, 0xd5, + 0x85, 0x1c, 0x1f, 0x6b, 0x47, 0xa0, 0xc4, 0xe4, + 0xef, 0xf4, 0xea, 0xd7, 0x59, 0xac, 0x2a, 0x9e, + 0x8c, 0xfa, 0x1f, 0x42, 0x08, 0xfe, 0x4f, 0x74, + 0xa0, 0x26, 0xf5, 0xb3, 0x84, 0xf6, 0x58, 0x5f, + 0x26, 0x66, 0x3e, 0xd7, 0xe4, 0x22, 0x91, 0x13, + 0xc8, 0xac, 0x25, 0x96, 0x23, 0xd8, 0x09, 0xea, + 0x45, 0x75, 0x23, 0xb8, 0x5f, 0xc2, 0x90, 0x8b, + 0x09, 0xc4, 0xfc, 0x47, 0x6c, 0x6d, 0x0a, 0xef, + 0x69, 0xa4, 0x38, 0x19, 0xcf, 0x7d, 0xf9, 0x09, + 0x73, 0x9b, 0x60, 0x5a, 0xf7, 0x37, 0xb5, 0xfe, + 0x9f, 0xe3, 0x2b, 0x4c, 0x0d, 0x6e, 0x19, 0xf1, + 0xd6, 0xc0, 0x70, 0xf3, 0x9d, 0x22, 0x3c, 0xf9, + 0x49, 0xce, 0x30, 0x8e, 0x44, 0xb5, 0x76, 0x15, + 0x8f, 0x52, 0xfd, 0xa5, 0x04, 0xb8, 0x55, 0x6a, + 0x36, 0x59, 0x7c, 0xc4, 0x48, 0xb8, 0xd7, 0xab, + 0x05, 0x66, 0xe9, 0x5e, 0x21, 0x6f, 0x6b, 0x36, + 0x29, 0xbb, 0xe9, 0xe3, 0xa2, 0x9a, 0xa8, 0xcd, + 0x55, 0x25, 0x11, 0xba, 0x5a, 0x58, 0xa0, 0xde, + 0xae, 0x19, 0x2a, 0x48, 0x5a, 0xff, 0x36, 0xcd, + 0x6d, 0x16, 0x7a, 0x73, 0x38, 0x46, 0xe5, 0x47, + 0x59, 0xc8, 0xa2, 0xf6, 0xe2, 0x6c, 0x83, 0xc5, + 0x36, 0x2c, 0x83, 0x7d, 0xb4, 0x01, 0x05, 0x69, + 0xe7, 0xaf, 0x5c, 0xc4, 0x64, 0x82, 0x12, 0x21, + 0xef, 0xf7, 0xd1, 0x7d, 0xb8, 0x8d, 0x8c, 0x98, + 0x7c, 0x5f, 0x7d, 0x92, 0x88, 0xb9, 0x94, 0x07, + 0x9c, 0xd8, 0xe9, 0x9c, 0x17, 0x38, 0xe3, 0x57, + 0x6c, 0xe0, 0xdc, 0xa5, 0x92, 0x42, 0xb3, 0xbd, + 0x50, 0xa2, 0x7e, 0xb5, 0xb1, 0x52, 0x72, 0x03, + 0x97, 0xd8, 0xaa, 0x9a, 0x1e, 0x75, 0x41, 0x11, + 0xa3, 0x4f, 0xcc, 0xd4, 0xe3, 0x73, 0xad, 0x96, + 0xdc, 0x47, 0x41, 0x9f, 0xb0, 0xbe, 0x79, 0x91, + 0xf5, 0xb6, 0x18, 0xfe, 0xc2, 0x83, 0x18, 0x7d, + 0x73, 0xd9, 0x4f, 0x83, 0x84, 0x03, 0xb3, 0xf0, + 0x77, 0x66, 0x3d, 0x83, 0x63, 0x2e, 0x2c, 0xf9, + 0xdd, 0xa6, 0x1f, 0x89, 0x82, 0xb8, 0x23, 0x42, + 0xeb, 0xe2, 0xca, 0x70, 0x82, 0x61, 0x41, 0x0a, + 0x6d, 0x5f, 0x75, 0xc5, 0xe2, 0xc4, 0x91, 0x18, + 0x44, 0x22, 0xfa, 0x34, 0x10, 0xf5, 0x20, 0xdc, + 0xb7, 0xdd, 0x2a, 0x20, 0x77, 0xf5, 0xf9, 0xce, + 0xdb, 0xa0, 0x0a, 0x52, 0x2a, 0x4e, 0xdd, 0xcc, + 0x97, 0xdf, 0x05, 0xe4, 0x5e, 0xb7, 0xaa, 0xf0, + 0xe2, 0x80, 0xff, 0xba, 0x1a, 0x0f, 0xac, 0xdf, + 0x02, 0x32, 0xe6, 0xf7, 0xc7, 0x17, 0x13, 0xb7, + 0xfc, 0x98, 0x48, 0x8c, 0x0d, 0x82, 0xc9, 0x80, + 0x7a, 0xe2, 0x0a, 0xc5, 0xb4, 0xde, 0x7c, 0x3c, + 0x79, 0x81, 0x0e, 0x28, 0x65, 0x79, 0x67, 0x82, + 0x69, 0x44, 0x66, 0x09, 0xf7, 0x16, 0x1a, 0xf9, + 0x7d, 0x80, 0xa1, 0x79, 0x14, 0xa9, 0xc8, 0x20, + 0xfb, 0xa2, 0x46, 0xbe, 0x08, 0x35, 0x17, 0x58, + 0xc1, 0x1a, 0xda, 0x2a, 0x6b, 0x2e, 0x1e, 0xe6, + 0x27, 0x55, 0x7b, 0x19, 0xe2, 0xfb, 0x64, 0xfc, + 0x5e, 0x15, 0x54, 0x3c, 0xe7, 0xc2, 0x11, 0x50, + 0x30, 0xb8, 0x72, 0x03, 0x0b, 0x1a, 0x9f, 0x86, + 0x27, 0x11, 0x5c, 0x06, 0x2b, 0xbd, 0x75, 0x1a, + 0x0a, 0xda, 0x01, 0xfa, 0x5c, 0x4a, 0xc1, 0x80, + 0x3a, 0x6e, 0x30, 0xc8, 0x2c, 0xeb, 0x56, 0xec, + 0x89, 0xfa, 0x35, 0x7b, 0xb2, 0xf0, 0x97, 0x08, + 0x86, 0x53, 0xbe, 0xbd, 0x40, 0x41, 0x38, 0x1c, + 0xb4, 0x8b, 0x79, 0x2e, 0x18, 0x96, 0x94, 0xde, + 0xe8, 0xca, 0xe5, 0x9f, 0x92, 0x9f, 0x15, 0x5d, + 0x56, 0x60, 0x5c, 0x09, 0xf9, 0x16, 0xf4, 0x17, + 0x0f, 0xf6, 0x4c, 0xda, 0xe6, 0x67, 0x89, 0x9f, + 0xca, 0x6c, 0xe7, 0x9b, 0x04, 0x62, 0x0e, 0x26, + 0xa6, 0x52, 0xbd, 0x29, 0xff, 0xc7, 0xa4, 0x96, + 0xe6, 0x6a, 0x02, 0xa5, 0x2e, 0x7b, 0xfe, 0x97, + 0x68, 0x3e, 0x2e, 0x5f, 0x3b, 0x0f, 0x36, 0xd6, + 0x98, 0x19, 0x59, 0x48, 0xd2, 0xc6, 0xe1, 0x55, + 0x1a, 0x6e, 0xd6, 0xed, 0x2c, 0xba, 0xc3, 0x9e, + 0x64, 0xc9, 0x95, 0x86, 0x35, 0x5e, 0x3e, 0x88, + 0x69, 0x99, 0x4b, 0xee, 0xbe, 0x9a, 0x99, 0xb5, + 0x6e, 0x58, 0xae, 0xdd, 0x22, 0xdb, 0xdd, 0x6b, + 0xfc, 0xaf, 0x90, 0xa3, 0x3d, 0xa4, 0xc1, 0x15, + 0x92, 0x18, 0x8d, 0xd2, 0x4b, 0x7b, 0x06, 0xd1, + 0x37, 0xb5, 0xe2, 0x7c, 0x2c, 0xf0, 0x25, 0xe4, + 0x94, 0x2a, 0xbd, 0xe3, 0x82, 0x70, 0x78, 0xa3, + 0x82, 0x10, 0x5a, 0x90, 0xd7, 0xa4, 0xfa, 0xaf, + 0x1a, 0x88, 0x59, 0xdc, 0x74, 0x12, 0xb4, 0x8e, + 0xd7, 0x19, 0x46, 0xf4, 0x84, 0x69, 0x9f, 0xbb, + 0x70, 0xa8, 0x4c, 0x52, 0x81, 0xa9, 0xff, 0x76, + 0x1c, 0xae, 0xd8, 0x11, 0x3d, 0x7f, 0x7d, 0xc5, + 0x12, 0x59, 0x28, 0x18, 0xc2, 0xa2, 0xb7, 0x1c, + 0x88, 0xf8, 0xd6, 0x1b, 0xa6, 0x7d, 0x9e, 0xde, + 0x29, 0xf8, 0xed, 0xff, 0xeb, 0x92, 0x24, 0x4f, + 0x05, 0xaa, 0xd9, 0x49, 0xba, 0x87, 0x59, 0x51, + 0xc9, 0x20, 0x5c, 0x9b, 0x74, 0xcf, 0x03, 0xd9, + 0x2d, 0x34, 0xc7, 0x5b, 0xa5, 0x40, 0xb2, 0x99, + 0xf5, 0xcb, 0xb4, 0xf6, 0xb7, 0x72, 0x4a, 0xd6, + 0xbd, 0xb0, 0xf3, 0x93, 0xe0, 0x1b, 0xa8, 0x04, + 0x1e, 0x35, 0xd4, 0x80, 0x20, 0xf4, 0x9c, 0x31, + 0x6b, 0x45, 0xb9, 0x15, 0xb0, 0x5e, 0xdd, 0x0a, + 0x33, 0x9c, 0x83, 0xcd, 0x58, 0x89, 0x50, 0x56, + 0xbb, 0x81, 0x00, 0x91, 0x32, 0xf3, 0x1b, 0x3e, + 0xcf, 0x45, 0xe1, 0xf9, 0xe1, 0x2c, 0x26, 0x78, + 0x93, 0x9a, 0x60, 0x46, 0xc9, 0xb5, 0x5e, 0x6a, + 0x28, 0x92, 0x87, 0x3f, 0x63, 0x7b, 0xdb, 0xf7, + 0xd0, 0x13, 0x9d, 0x32, 0x40, 0x5e, 0xcf, 0xfb, + 0x79, 0x68, 0x47, 0x4c, 0xfd, 0x01, 0x17, 0xe6, + 0x97, 0x93, 0x78, 0xbb, 0xa6, 0x27, 0xa3, 0xe8, + 0x1a, 0xe8, 0x94, 0x55, 0x7d, 0x08, 0xe5, 0xdc, + 0x66, 0xa3, 0x69, 0xc8, 0xca, 0xc5, 0xa1, 0x84, + 0x55, 0xde, 0x08, 0x91, 0x16, 0x3a, 0x0c, 0x86, + 0xab, 0x27, 0x2b, 0x64, 0x34, 0x02, 0x6c, 0x76, + 0x8b, 0xc6, 0xaf, 0xcc, 0xe1, 0xd6, 0x8c, 0x2a, + 0x18, 0x3d, 0xa6, 0x1b, 0x37, 0x75, 0x45, 0x73, + 0xc2, 0x75, 0xd7, 0x53, 0x78, 0x3a, 0xd6, 0xe8, + 0x29, 0xd2, 0x4a, 0xa8, 0x1e, 0x82, 0xf6, 0xb6, + 0x81, 0xde, 0x21, 0xed, 0x2b, 0x56, 0xbb, 0xf2, + 0xd0, 0x57, 0xc1, 0x7c, 0xd2, 0x6a, 0xd2, 0x56, + 0xf5, 0x13, 0x5f, 0x1c, 0x6a, 0x0b, 0x74, 0xfb, + 0xe9, 0xfe, 0x9e, 0xea, 0x95, 0xb2, 0x46, 0xab, + 0x0a, 0xfc, 0xfd, 0xf3, 0xbb, 0x04, 0x2b, 0x76, + 0x1b, 0xa4, 0x74, 0xb0, 0xc1, 0x78, 0xc3, 0x69, + 0xe2, 0xb0, 0x01, 0xe1, 0xde, 0x32, 0x4c, 0x8d, + 0x1a, 0xb3, 0x38, 0x08, 0xd5, 0xfc, 0x1f, 0xdc, + 0x0e, 0x2c, 0x9c, 0xb1, 0xa1, 0x63, 0x17, 0x22, + 0xf5, 0x6c, 0x93, 0x70, 0x74, 0x00, 0xf8, 0x39, + 0x01, 0x94, 0xd1, 0x32, 0x23, 0x56, 0x5d, 0xa6, + 0x02, 0x76, 0x76, 0x93, 0xce, 0x2f, 0x19, 0xe9, + 0x17, 0x52, 0xae, 0x6e, 0x2c, 0x6d, 0x61, 0x7f, + 0x3b, 0xaa, 0xe0, 0x52, 0x85, 0xc5, 0x65, 0xc1, + 0xbb, 0x8e, 0x5b, 0x21, 0xd5, 0xc9, 0x78, 0x83, + 0x07, 0x97, 0x4c, 0x62, 0x61, 0x41, 0xd4, 0xfc, + 0xc9, 0x39, 0xe3, 0x9b, 0xd0, 0xcc, 0x75, 0xc4, + 0x97, 0xe6, 0xdd, 0x2a, 0x5f, 0xa6, 0xe8, 0x59, + 0x6c, 0x98, 0xb9, 0x02, 0xe2, 0xa2, 0xd6, 0x68, + 0xee, 0x3b, 0x1d, 0xe3, 0x4d, 0x5b, 0x30, 0xef, + 0x03, 0xf2, 0xeb, 0x18, 0x57, 0x36, 0xe8, 0xa1, + 0xf4, 0x47, 0xfb, 0xcb, 0x8f, 0xcb, 0xc8, 0xf3, + 0x4f, 0x74, 0x9d, 0x9d, 0xb1, 0x8d, 0x14, 0x44, + 0xd9, 0x19, 0xb4, 0x54, 0x4f, 0x75, 0x19, 0x09, + 0xa0, 0x75, 0xbc, 0x3b, 0x82, 0xc6, 0x3f, 0xb8, + 0x83, 0x19, 0x6e, 0xd6, 0x37, 0xfe, 0x6e, 0x8a, + 0x4e, 0xe0, 0x4a, 0xab, 0x7b, 0xc8, 0xb4, 0x1d, + 0xf4, 0xed, 0x27, 0x03, 0x65, 0xa2, 0xa1, 0xae, + 0x11, 0xe7, 0x98, 0x78, 0x48, 0x91, 0xd2, 0xd2, + 0xd4, 0x23, 0x78, 0x50, 0xb1, 0x5b, 0x85, 0x10, + 0x8d, 0xca, 0x5f, 0x0f, 0x71, 0xae, 0x72, 0x9a, + 0xf6, 0x25, 0x19, 0x60, 0x06, 0xf7, 0x10, 0x34, + 0x18, 0x0d, 0xc9, 0x9f, 0x7b, 0x0c, 0x9b, 0x8f, + 0x91, 0x1b, 0x9f, 0xcd, 0x10, 0xee, 0x75, 0xf9, + 0x97, 0x66, 0xfc, 0x4d, 0x33, 0x6e, 0x28, 0x2b, + 0x92, 0x85, 0x4f, 0xab, 0x43, 0x8d, 0x8f, 0x7d, + 0x86, 0xa7, 0xc7, 0xd8, 0xd3, 0x0b, 0x8b, 0x57, + 0xb6, 0x1d, 0x95, 0x0d, 0xe9, 0xbc, 0xd9, 0x03, + 0xd9, 0x10, 0x19, 0xc3, 0x46, 0x63, 0x55, 0x87, + 0x61, 0x79, 0x6c, 0x95, 0x0e, 0x9c, 0xdd, 0xca, + 0xc3, 0xf3, 0x64, 0xf0, 0x7d, 0x76, 0xb7, 0x53, + 0x67, 0x2b, 0x1e, 0x44, 0x56, 0x81, 0xea, 0x8f, + 0x5c, 0x42, 0x16, 0xb8, 0x28, 0xeb, 0x1b, 0x61, + 0x10, 0x1e, 0xbf, 0xec, 0xa8 +}; +static const u8 dec_assoc011[] __initconst = { + 0xd6, 0x31, 0xda, 0x5d, 0x42, 0x5e, 0xd7 +}; +static const u8 dec_nonce011[] __initconst = { + 0xfd, 0x87, 0xd4, 0xd8, 0x62, 0xfd, 0xec, 0xaa +}; +static const u8 dec_key011[] __initconst = { + 0x35, 0x4e, 0xb5, 0x70, 0x50, 0x42, 0x8a, 0x85, + 0xf2, 0xfb, 0xed, 0x7b, 0xd0, 0x9e, 0x97, 0xca, + 0xfa, 0x98, 0x66, 0x63, 0xee, 0x37, 0xcc, 0x52, + 0xfe, 0xd1, 0xdf, 0x95, 0x15, 0x34, 0x29, 0x38 +}; + +static const u8 dec_input012[] __initconst = { + 0x52, 0x34, 0xb3, 0x65, 0x3b, 0xb7, 0xe5, 0xd3, + 0xab, 0x49, 0x17, 0x60, 0xd2, 0x52, 0x56, 0xdf, + 0xdf, 0x34, 0x56, 0x82, 0xe2, 0xbe, 0xe5, 0xe1, + 0x28, 0xd1, 0x4e, 0x5f, 0x4f, 0x01, 0x7d, 0x3f, + 0x99, 0x6b, 0x30, 0x6e, 0x1a, 0x7c, 0x4c, 0x8e, + 0x62, 0x81, 0xae, 0x86, 0x3f, 0x6b, 0xd0, 0xb5, + 0xa9, 0xcf, 0x50, 0xf1, 0x02, 0x12, 0xa0, 0x0b, + 0x24, 0xe9, 0xe6, 0x72, 0x89, 0x2c, 0x52, 0x1b, + 0x34, 0x38, 0xf8, 0x75, 0x5f, 0xa0, 0x74, 0xe2, + 0x99, 0xdd, 0xa6, 0x4b, 0x14, 0x50, 0x4e, 0xf1, + 0xbe, 0xd6, 0x9e, 0xdb, 0xb2, 0x24, 0x27, 0x74, + 0x12, 0x4a, 0x78, 0x78, 0x17, 0xa5, 0x58, 0x8e, + 0x2f, 0xf9, 0xf4, 0x8d, 0xee, 0x03, 0x88, 0xae, + 0xb8, 0x29, 0xa1, 0x2f, 0x4b, 0xee, 0x92, 0xbd, + 0x87, 0xb3, 0xce, 0x34, 0x21, 0x57, 0x46, 0x04, + 0x49, 0x0c, 0x80, 0xf2, 0x01, 0x13, 0xa1, 0x55, + 0xb3, 0xff, 0x44, 0x30, 0x3c, 0x1c, 0xd0, 0xef, + 0xbc, 0x18, 0x74, 0x26, 0xad, 0x41, 0x5b, 0x5b, + 0x3e, 0x9a, 0x7a, 0x46, 0x4f, 0x16, 0xd6, 0x74, + 0x5a, 0xb7, 0x3a, 0x28, 0x31, 0xd8, 0xae, 0x26, + 0xac, 0x50, 0x53, 0x86, 0xf2, 0x56, 0xd7, 0x3f, + 0x29, 0xbc, 0x45, 0x68, 0x8e, 0xcb, 0x98, 0x64, + 0xdd, 0xc9, 0xba, 0xb8, 0x4b, 0x7b, 0x82, 0xdd, + 0x14, 0xa7, 0xcb, 0x71, 0x72, 0x00, 0x5c, 0xad, + 0x7b, 0x6a, 0x89, 0xa4, 0x3d, 0xbf, 0xb5, 0x4b, + 0x3e, 0x7c, 0x5a, 0xcf, 0xb8, 0xa1, 0xc5, 0x6e, + 0xc8, 0xb6, 0x31, 0x57, 0x7b, 0xdf, 0xa5, 0x7e, + 0xb1, 0xd6, 0x42, 0x2a, 0x31, 0x36, 0xd1, 0xd0, + 0x3f, 0x7a, 0xe5, 0x94, 0xd6, 0x36, 0xa0, 0x6f, + 0xb7, 0x40, 0x7d, 0x37, 0xc6, 0x55, 0x7c, 0x50, + 0x40, 0x6d, 0x29, 0x89, 0xe3, 0x5a, 0xae, 0x97, + 0xe7, 0x44, 0x49, 0x6e, 0xbd, 0x81, 0x3d, 0x03, + 0x93, 0x06, 0x12, 0x06, 0xe2, 0x41, 0x12, 0x4a, + 0xf1, 0x6a, 0xa4, 0x58, 0xa2, 0xfb, 0xd2, 0x15, + 0xba, 0xc9, 0x79, 0xc9, 0xce, 0x5e, 0x13, 0xbb, + 0xf1, 0x09, 0x04, 0xcc, 0xfd, 0xe8, 0x51, 0x34, + 0x6a, 0xe8, 0x61, 0x88, 0xda, 0xed, 0x01, 0x47, + 0x84, 0xf5, 0x73, 0x25, 0xf9, 0x1c, 0x42, 0x86, + 0x07, 0xf3, 0x5b, 0x1a, 0x01, 0xb3, 0xeb, 0x24, + 0x32, 0x8d, 0xf6, 0xed, 0x7c, 0x4b, 0xeb, 0x3c, + 0x36, 0x42, 0x28, 0xdf, 0xdf, 0xb6, 0xbe, 0xd9, + 0x8c, 0x52, 0xd3, 0x2b, 0x08, 0x90, 0x8c, 0xe7, + 0x98, 0x31, 0xe2, 0x32, 0x8e, 0xfc, 0x11, 0x48, + 0x00, 0xa8, 0x6a, 0x42, 0x4a, 0x02, 0xc6, 0x4b, + 0x09, 0xf1, 0xe3, 0x49, 0xf3, 0x45, 0x1f, 0x0e, + 0xbc, 0x56, 0xe2, 0xe4, 0xdf, 0xfb, 0xeb, 0x61, + 0xfa, 0x24, 0xc1, 0x63, 0x75, 0xbb, 0x47, 0x75, + 0xaf, 0xe1, 0x53, 0x16, 0x96, 0x21, 0x85, 0x26, + 0x11, 0xb3, 0x76, 0xe3, 0x23, 0xa1, 0x6b, 0x74, + 0x37, 0xd0, 0xde, 0x06, 0x90, 0x71, 0x5d, 0x43, + 0x88, 0x9b, 0x00, 0x54, 0xa6, 0x75, 0x2f, 0xa1, + 0xc2, 0x0b, 0x73, 0x20, 0x1d, 0xb6, 0x21, 0x79, + 0x57, 0x3f, 0xfa, 0x09, 0xbe, 0x8a, 0x33, 0xc3, + 0x52, 0xf0, 0x1d, 0x82, 0x31, 0xd1, 0x55, 0xb5, + 0x6c, 0x99, 0x25, 0xcf, 0x5c, 0x32, 0xce, 0xe9, + 0x0d, 0xfa, 0x69, 0x2c, 0xd5, 0x0d, 0xc5, 0x6d, + 0x86, 0xd0, 0x0c, 0x3b, 0x06, 0x50, 0x79, 0xe8, + 0xc3, 0xae, 0x04, 0xe6, 0xcd, 0x51, 0xe4, 0x26, + 0x9b, 0x4f, 0x7e, 0xa6, 0x0f, 0xab, 0xd8, 0xe5, + 0xde, 0xa9, 0x00, 0x95, 0xbe, 0xa3, 0x9d, 0x5d, + 0xb2, 0x09, 0x70, 0x18, 0x1c, 0xf0, 0xac, 0x29, + 0x23, 0x02, 0x29, 0x28, 0xd2, 0x74, 0x35, 0x57, + 0x62, 0x0f, 0x24, 0xea, 0x5e, 0x33, 0xc2, 0x92, + 0xf3, 0x78, 0x4d, 0x30, 0x1e, 0xa1, 0x99, 0xa9, + 0x82, 0xb0, 0x42, 0x31, 0x8d, 0xad, 0x8a, 0xbc, + 0xfc, 0xd4, 0x57, 0x47, 0x3e, 0xb4, 0x50, 0xdd, + 0x6e, 0x2c, 0x80, 0x4d, 0x22, 0xf1, 0xfb, 0x57, + 0xc4, 0xdd, 0x17, 0xe1, 0x8a, 0x36, 0x4a, 0xb3, + 0x37, 0xca, 0xc9, 0x4e, 0xab, 0xd5, 0x69, 0xc4, + 0xf4, 0xbc, 0x0b, 0x3b, 0x44, 0x4b, 0x29, 0x9c, + 0xee, 0xd4, 0x35, 0x22, 0x21, 0xb0, 0x1f, 0x27, + 0x64, 0xa8, 0x51, 0x1b, 0xf0, 0x9f, 0x19, 0x5c, + 0xfb, 0x5a, 0x64, 0x74, 0x70, 0x45, 0x09, 0xf5, + 0x64, 0xfe, 0x1a, 0x2d, 0xc9, 0x14, 0x04, 0x14, + 0xcf, 0xd5, 0x7d, 0x60, 0xaf, 0x94, 0x39, 0x94, + 0xe2, 0x7d, 0x79, 0x82, 0xd0, 0x65, 0x3b, 0x6b, + 0x9c, 0x19, 0x84, 0xb4, 0x6d, 0xb3, 0x0c, 0x99, + 0xc0, 0x56, 0xa8, 0xbd, 0x73, 0xce, 0x05, 0x84, + 0x3e, 0x30, 0xaa, 0xc4, 0x9b, 0x1b, 0x04, 0x2a, + 0x9f, 0xd7, 0x43, 0x2b, 0x23, 0xdf, 0xbf, 0xaa, + 0xd5, 0xc2, 0x43, 0x2d, 0x70, 0xab, 0xdc, 0x75, + 0xad, 0xac, 0xf7, 0xc0, 0xbe, 0x67, 0xb2, 0x74, + 0xed, 0x67, 0x10, 0x4a, 0x92, 0x60, 0xc1, 0x40, + 0x50, 0x19, 0x8a, 0x8a, 0x8c, 0x09, 0x0e, 0x72, + 0xe1, 0x73, 0x5e, 0xe8, 0x41, 0x85, 0x63, 0x9f, + 0x3f, 0xd7, 0x7d, 0xc4, 0xfb, 0x22, 0x5d, 0x92, + 0x6c, 0xb3, 0x1e, 0xe2, 0x50, 0x2f, 0x82, 0xa8, + 0x28, 0xc0, 0xb5, 0xd7, 0x5f, 0x68, 0x0d, 0x2c, + 0x2d, 0xaf, 0x7e, 0xfa, 0x2e, 0x08, 0x0f, 0x1f, + 0x70, 0x9f, 0xe9, 0x19, 0x72, 0x55, 0xf8, 0xfb, + 0x51, 0xd2, 0x33, 0x5d, 0xa0, 0xd3, 0x2b, 0x0a, + 0x6c, 0xbc, 0x4e, 0xcf, 0x36, 0x4d, 0xdc, 0x3b, + 0xe9, 0x3e, 0x81, 0x7c, 0x61, 0xdb, 0x20, 0x2d, + 0x3a, 0xc3, 0xb3, 0x0c, 0x1e, 0x00, 0xb9, 0x7c, + 0xf5, 0xca, 0x10, 0x5f, 0x3a, 0x71, 0xb3, 0xe4, + 0x20, 0xdb, 0x0c, 0x2a, 0x98, 0x63, 0x45, 0x00, + 0x58, 0xf6, 0x68, 0xe4, 0x0b, 0xda, 0x13, 0x3b, + 0x60, 0x5c, 0x76, 0xdb, 0xb9, 0x97, 0x71, 0xe4, + 0xd9, 0xb7, 0xdb, 0xbd, 0x68, 0xc7, 0x84, 0x84, + 0xaa, 0x7c, 0x68, 0x62, 0x5e, 0x16, 0xfc, 0xba, + 0x72, 0xaa, 0x9a, 0xa9, 0xeb, 0x7c, 0x75, 0x47, + 0x97, 0x7e, 0xad, 0xe2, 0xd9, 0x91, 0xe8, 0xe4, + 0xa5, 0x31, 0xd7, 0x01, 0x8e, 0xa2, 0x11, 0x88, + 0x95, 0xb9, 0xf2, 0x9b, 0xd3, 0x7f, 0x1b, 0x81, + 0x22, 0xf7, 0x98, 0x60, 0x0a, 0x64, 0xa6, 0xc1, + 0xf6, 0x49, 0xc7, 0xe3, 0x07, 0x4d, 0x94, 0x7a, + 0xcf, 0x6e, 0x68, 0x0c, 0x1b, 0x3f, 0x6e, 0x2e, + 0xee, 0x92, 0xfa, 0x52, 0xb3, 0x59, 0xf8, 0xf1, + 0x8f, 0x6a, 0x66, 0xa3, 0x82, 0x76, 0x4a, 0x07, + 0x1a, 0xc7, 0xdd, 0xf5, 0xda, 0x9c, 0x3c, 0x24, + 0xbf, 0xfd, 0x42, 0xa1, 0x10, 0x64, 0x6a, 0x0f, + 0x89, 0xee, 0x36, 0xa5, 0xce, 0x99, 0x48, 0x6a, + 0xf0, 0x9f, 0x9e, 0x69, 0xa4, 0x40, 0x20, 0xe9, + 0x16, 0x15, 0xf7, 0xdb, 0x75, 0x02, 0xcb, 0xe9, + 0x73, 0x8b, 0x3b, 0x49, 0x2f, 0xf0, 0xaf, 0x51, + 0x06, 0x5c, 0xdf, 0x27, 0x27, 0x49, 0x6a, 0xd1, + 0xcc, 0xc7, 0xb5, 0x63, 0xb5, 0xfc, 0xb8, 0x5c, + 0x87, 0x7f, 0x84, 0xb4, 0xcc, 0x14, 0xa9, 0x53, + 0xda, 0xa4, 0x56, 0xf8, 0xb6, 0x1b, 0xcc, 0x40, + 0x27, 0x52, 0x06, 0x5a, 0x13, 0x81, 0xd7, 0x3a, + 0xd4, 0x3b, 0xfb, 0x49, 0x65, 0x31, 0x33, 0xb2, + 0xfa, 0xcd, 0xad, 0x58, 0x4e, 0x2b, 0xae, 0xd2, + 0x20, 0xfb, 0x1a, 0x48, 0xb4, 0x3f, 0x9a, 0xd8, + 0x7a, 0x35, 0x4a, 0xc8, 0xee, 0x88, 0x5e, 0x07, + 0x66, 0x54, 0xb9, 0xec, 0x9f, 0xa3, 0xe3, 0xb9, + 0x37, 0xaa, 0x49, 0x76, 0x31, 0xda, 0x74, 0x2d, + 0x3c, 0xa4, 0x65, 0x10, 0x32, 0x38, 0xf0, 0xde, + 0xd3, 0x99, 0x17, 0xaa, 0x71, 0xaa, 0x8f, 0x0f, + 0x8c, 0xaf, 0xa2, 0xf8, 0x5d, 0x64, 0xba, 0x1d, + 0xa3, 0xef, 0x96, 0x73, 0xe8, 0xa1, 0x02, 0x8d, + 0x0c, 0x6d, 0xb8, 0x06, 0x90, 0xb8, 0x08, 0x56, + 0x2c, 0xa7, 0x06, 0xc9, 0xc2, 0x38, 0xdb, 0x7c, + 0x63, 0xb1, 0x57, 0x8e, 0xea, 0x7c, 0x79, 0xf3, + 0x49, 0x1d, 0xfe, 0x9f, 0xf3, 0x6e, 0xb1, 0x1d, + 0xba, 0x19, 0x80, 0x1a, 0x0a, 0xd3, 0xb0, 0x26, + 0x21, 0x40, 0xb1, 0x7c, 0xf9, 0x4d, 0x8d, 0x10, + 0xc1, 0x7e, 0xf4, 0xf6, 0x3c, 0xa8, 0xfd, 0x7c, + 0xa3, 0x92, 0xb2, 0x0f, 0xaa, 0xcc, 0xa6, 0x11, + 0xfe, 0x04, 0xe3, 0xd1, 0x7a, 0x32, 0x89, 0xdf, + 0x0d, 0xc4, 0x8f, 0x79, 0x6b, 0xca, 0x16, 0x7c, + 0x6e, 0xf9, 0xad, 0x0f, 0xf6, 0xfe, 0x27, 0xdb, + 0xc4, 0x13, 0x70, 0xf1, 0x62, 0x1a, 0x4f, 0x79, + 0x40, 0xc9, 0x9b, 0x8b, 0x21, 0xea, 0x84, 0xfa, + 0xf5, 0xf1, 0x89, 0xce, 0xb7, 0x55, 0x0a, 0x80, + 0x39, 0x2f, 0x55, 0x36, 0x16, 0x9c, 0x7b, 0x08, + 0xbd, 0x87, 0x0d, 0xa5, 0x32, 0xf1, 0x52, 0x7c, + 0xe8, 0x55, 0x60, 0x5b, 0xd7, 0x69, 0xe4, 0xfc, + 0xfa, 0x12, 0x85, 0x96, 0xea, 0x50, 0x28, 0xab, + 0x8a, 0xf7, 0xbb, 0x0e, 0x53, 0x74, 0xca, 0xa6, + 0x27, 0x09, 0xc2, 0xb5, 0xde, 0x18, 0x14, 0xd9, + 0xea, 0xe5, 0x29, 0x1c, 0x40, 0x56, 0xcf, 0xd7, + 0xae, 0x05, 0x3f, 0x65, 0xaf, 0x05, 0x73, 0xe2, + 0x35, 0x96, 0x27, 0x07, 0x14, 0xc0, 0xad, 0x33, + 0xf1, 0xdc, 0x44, 0x7a, 0x89, 0x17, 0x77, 0xd2, + 0x9c, 0x58, 0x60, 0xf0, 0x3f, 0x7b, 0x2d, 0x2e, + 0x57, 0x95, 0x54, 0x87, 0xed, 0xf2, 0xc7, 0x4c, + 0xf0, 0xae, 0x56, 0x29, 0x19, 0x7d, 0x66, 0x4b, + 0x9b, 0x83, 0x84, 0x42, 0x3b, 0x01, 0x25, 0x66, + 0x8e, 0x02, 0xde, 0xb9, 0x83, 0x54, 0x19, 0xf6, + 0x9f, 0x79, 0x0d, 0x67, 0xc5, 0x1d, 0x7a, 0x44, + 0x02, 0x98, 0xa7, 0x16, 0x1c, 0x29, 0x0d, 0x74, + 0xff, 0x85, 0x40, 0x06, 0xef, 0x2c, 0xa9, 0xc6, + 0xf5, 0x53, 0x07, 0x06, 0xae, 0xe4, 0xfa, 0x5f, + 0xd8, 0x39, 0x4d, 0xf1, 0x9b, 0x6b, 0xd9, 0x24, + 0x84, 0xfe, 0x03, 0x4c, 0xb2, 0x3f, 0xdf, 0xa1, + 0x05, 0x9e, 0x50, 0x14, 0x5a, 0xd9, 0x1a, 0xa2, + 0xa7, 0xfa, 0xfa, 0x17, 0xf7, 0x78, 0xd6, 0xb5, + 0x92, 0x61, 0x91, 0xac, 0x36, 0xfa, 0x56, 0x0d, + 0x38, 0x32, 0x18, 0x85, 0x08, 0x58, 0x37, 0xf0, + 0x4b, 0xdb, 0x59, 0xe7, 0xa4, 0x34, 0xc0, 0x1b, + 0x01, 0xaf, 0x2d, 0xde, 0xa1, 0xaa, 0x5d, 0xd3, + 0xec, 0xe1, 0xd4, 0xf7, 0xe6, 0x54, 0x68, 0xf0, + 0x51, 0x97, 0xa7, 0x89, 0xea, 0x24, 0xad, 0xd3, + 0x6e, 0x47, 0x93, 0x8b, 0x4b, 0xb4, 0xf7, 0x1c, + 0x42, 0x06, 0x67, 0xe8, 0x99, 0xf6, 0xf5, 0x7b, + 0x85, 0xb5, 0x65, 0xb5, 0xb5, 0xd2, 0x37, 0xf5, + 0xf3, 0x02, 0xa6, 0x4d, 0x11, 0xa7, 0xdc, 0x51, + 0x09, 0x7f, 0xa0, 0xd8, 0x88, 0x1c, 0x13, 0x71, + 0xae, 0x9c, 0xb7, 0x7b, 0x34, 0xd6, 0x4e, 0x68, + 0x26, 0x83, 0x51, 0xaf, 0x1d, 0xee, 0x8b, 0xbb, + 0x69, 0x43, 0x2b, 0x9e, 0x8a, 0xbc, 0x02, 0x0e, + 0xa0, 0x1b, 0xe0, 0xa8, 0x5f, 0x6f, 0xaf, 0x1b, + 0x8f, 0xe7, 0x64, 0x71, 0x74, 0x11, 0x7e, 0xa8, + 0xd8, 0xf9, 0x97, 0x06, 0xc3, 0xb6, 0xfb, 0xfb, + 0xb7, 0x3d, 0x35, 0x9d, 0x3b, 0x52, 0xed, 0x54, + 0xca, 0xf4, 0x81, 0x01, 0x2d, 0x1b, 0xc3, 0xa7, + 0x00, 0x3d, 0x1a, 0x39, 0x54, 0xe1, 0xf6, 0xff, + 0xed, 0x6f, 0x0b, 0x5a, 0x68, 0xda, 0x58, 0xdd, + 0xa9, 0xcf, 0x5c, 0x4a, 0xe5, 0x09, 0x4e, 0xde, + 0x9d, 0xbc, 0x3e, 0xee, 0x5a, 0x00, 0x3b, 0x2c, + 0x87, 0x10, 0x65, 0x60, 0xdd, 0xd7, 0x56, 0xd1, + 0x4c, 0x64, 0x45, 0xe4, 0x21, 0xec, 0x78, 0xf8, + 0x25, 0x7a, 0x3e, 0x16, 0x5d, 0x09, 0x53, 0x14, + 0xbe, 0x4f, 0xae, 0x87, 0xd8, 0xd1, 0xaa, 0x3c, + 0xf6, 0x3e, 0xa4, 0x70, 0x8c, 0x5e, 0x70, 0xa4, + 0xb3, 0x6b, 0x66, 0x73, 0xd3, 0xbf, 0x31, 0x06, + 0x19, 0x62, 0x93, 0x15, 0xf2, 0x86, 0xe4, 0x52, + 0x7e, 0x53, 0x4c, 0x12, 0x38, 0xcc, 0x34, 0x7d, + 0x57, 0xf6, 0x42, 0x93, 0x8a, 0xc4, 0xee, 0x5c, + 0x8a, 0xe1, 0x52, 0x8f, 0x56, 0x64, 0xf6, 0xa6, + 0xd1, 0x91, 0x57, 0x70, 0xcd, 0x11, 0x76, 0xf5, + 0x59, 0x60, 0x60, 0x3c, 0xc1, 0xc3, 0x0b, 0x7f, + 0x58, 0x1a, 0x50, 0x91, 0xf1, 0x68, 0x8f, 0x6e, + 0x74, 0x74, 0xa8, 0x51, 0x0b, 0xf7, 0x7a, 0x98, + 0x37, 0xf2, 0x0a, 0x0e, 0xa4, 0x97, 0x04, 0xb8, + 0x9b, 0xfd, 0xa0, 0xea, 0xf7, 0x0d, 0xe1, 0xdb, + 0x03, 0xf0, 0x31, 0x29, 0xf8, 0xdd, 0x6b, 0x8b, + 0x5d, 0xd8, 0x59, 0xa9, 0x29, 0xcf, 0x9a, 0x79, + 0x89, 0x19, 0x63, 0x46, 0x09, 0x79, 0x6a, 0x11, + 0xda, 0x63, 0x68, 0x48, 0x77, 0x23, 0xfb, 0x7d, + 0x3a, 0x43, 0xcb, 0x02, 0x3b, 0x7a, 0x6d, 0x10, + 0x2a, 0x9e, 0xac, 0xf1, 0xd4, 0x19, 0xf8, 0x23, + 0x64, 0x1d, 0x2c, 0x5f, 0xf2, 0xb0, 0x5c, 0x23, + 0x27, 0xf7, 0x27, 0x30, 0x16, 0x37, 0xb1, 0x90, + 0xab, 0x38, 0xfb, 0x55, 0xcd, 0x78, 0x58, 0xd4, + 0x7d, 0x43, 0xf6, 0x45, 0x5e, 0x55, 0x8d, 0xb1, + 0x02, 0x65, 0x58, 0xb4, 0x13, 0x4b, 0x36, 0xf7, + 0xcc, 0xfe, 0x3d, 0x0b, 0x82, 0xe2, 0x12, 0x11, + 0xbb, 0xe6, 0xb8, 0x3a, 0x48, 0x71, 0xc7, 0x50, + 0x06, 0x16, 0x3a, 0xe6, 0x7c, 0x05, 0xc7, 0xc8, + 0x4d, 0x2f, 0x08, 0x6a, 0x17, 0x9a, 0x95, 0x97, + 0x50, 0x68, 0xdc, 0x28, 0x18, 0xc4, 0x61, 0x38, + 0xb9, 0xe0, 0x3e, 0x78, 0xdb, 0x29, 0xe0, 0x9f, + 0x52, 0xdd, 0xf8, 0x4f, 0x91, 0xc1, 0xd0, 0x33, + 0xa1, 0x7a, 0x8e, 0x30, 0x13, 0x82, 0x07, 0x9f, + 0xd3, 0x31, 0x0f, 0x23, 0xbe, 0x32, 0x5a, 0x75, + 0xcf, 0x96, 0xb2, 0xec, 0xb5, 0x32, 0xac, 0x21, + 0xd1, 0x82, 0x33, 0xd3, 0x15, 0x74, 0xbd, 0x90, + 0xf1, 0x2c, 0xe6, 0x5f, 0x8d, 0xe3, 0x02, 0xe8, + 0xe9, 0xc4, 0xca, 0x96, 0xeb, 0x0e, 0xbc, 0x91, + 0xf4, 0xb9, 0xea, 0xd9, 0x1b, 0x75, 0xbd, 0xe1, + 0xac, 0x2a, 0x05, 0x37, 0x52, 0x9b, 0x1b, 0x3f, + 0x5a, 0xdc, 0x21, 0xc3, 0x98, 0xbb, 0xaf, 0xa3, + 0xf2, 0x00, 0xbf, 0x0d, 0x30, 0x89, 0x05, 0xcc, + 0xa5, 0x76, 0xf5, 0x06, 0xf0, 0xc6, 0x54, 0x8a, + 0x5d, 0xd4, 0x1e, 0xc1, 0xf2, 0xce, 0xb0, 0x62, + 0xc8, 0xfc, 0x59, 0x42, 0x9a, 0x90, 0x60, 0x55, + 0xfe, 0x88, 0xa5, 0x8b, 0xb8, 0x33, 0x0c, 0x23, + 0x24, 0x0d, 0x15, 0x70, 0x37, 0x1e, 0x3d, 0xf6, + 0xd2, 0xea, 0x92, 0x10, 0xb2, 0xc4, 0x51, 0xac, + 0xf2, 0xac, 0xf3, 0x6b, 0x6c, 0xaa, 0xcf, 0x12, + 0xc5, 0x6c, 0x90, 0x50, 0xb5, 0x0c, 0xfc, 0x1a, + 0x15, 0x52, 0xe9, 0x26, 0xc6, 0x52, 0xa4, 0xe7, + 0x81, 0x69, 0xe1, 0xe7, 0x9e, 0x30, 0x01, 0xec, + 0x84, 0x89, 0xb2, 0x0d, 0x66, 0xdd, 0xce, 0x28, + 0x5c, 0xec, 0x98, 0x46, 0x68, 0x21, 0x9f, 0x88, + 0x3f, 0x1f, 0x42, 0x77, 0xce, 0xd0, 0x61, 0xd4, + 0x20, 0xa7, 0xff, 0x53, 0xad, 0x37, 0xd0, 0x17, + 0x35, 0xc9, 0xfc, 0xba, 0x0a, 0x78, 0x3f, 0xf2, + 0xcc, 0x86, 0x89, 0xe8, 0x4b, 0x3c, 0x48, 0x33, + 0x09, 0x7f, 0xc6, 0xc0, 0xdd, 0xb8, 0xfd, 0x7a, + 0x66, 0x66, 0x65, 0xeb, 0x47, 0xa7, 0x04, 0x28, + 0xa3, 0x19, 0x8e, 0xa9, 0xb1, 0x13, 0x67, 0x62, + 0x70, 0xcf, 0xd6 +}; +static const u8 dec_output012[] __initconst = { + 0x74, 0xa6, 0x3e, 0xe4, 0xb1, 0xcb, 0xaf, 0xb0, + 0x40, 0xe5, 0x0f, 0x9e, 0xf1, 0xf2, 0x89, 0xb5, + 0x42, 0x34, 0x8a, 0xa1, 0x03, 0xb7, 0xe9, 0x57, + 0x46, 0xbe, 0x20, 0xe4, 0x6e, 0xb0, 0xeb, 0xff, + 0xea, 0x07, 0x7e, 0xef, 0xe2, 0x55, 0x9f, 0xe5, + 0x78, 0x3a, 0xb7, 0x83, 0xc2, 0x18, 0x40, 0x7b, + 0xeb, 0xcd, 0x81, 0xfb, 0x90, 0x12, 0x9e, 0x46, + 0xa9, 0xd6, 0x4a, 0xba, 0xb0, 0x62, 0xdb, 0x6b, + 0x99, 0xc4, 0xdb, 0x54, 0x4b, 0xb8, 0xa5, 0x71, + 0xcb, 0xcd, 0x63, 0x32, 0x55, 0xfb, 0x31, 0xf0, + 0x38, 0xf5, 0xbe, 0x78, 0xe4, 0x45, 0xce, 0x1b, + 0x6a, 0x5b, 0x0e, 0xf4, 0x16, 0xe4, 0xb1, 0x3d, + 0xf6, 0x63, 0x7b, 0xa7, 0x0c, 0xde, 0x6f, 0x8f, + 0x74, 0xdf, 0xe0, 0x1e, 0x9d, 0xce, 0x8f, 0x24, + 0xef, 0x23, 0x35, 0x33, 0x7b, 0x83, 0x34, 0x23, + 0x58, 0x74, 0x14, 0x77, 0x1f, 0xc2, 0x4f, 0x4e, + 0xc6, 0x89, 0xf9, 0x52, 0x09, 0x37, 0x64, 0x14, + 0xc4, 0x01, 0x6b, 0x9d, 0x77, 0xe8, 0x90, 0x5d, + 0xa8, 0x4a, 0x2a, 0xef, 0x5c, 0x7f, 0xeb, 0xbb, + 0xb2, 0xc6, 0x93, 0x99, 0x66, 0xdc, 0x7f, 0xd4, + 0x9e, 0x2a, 0xca, 0x8d, 0xdb, 0xe7, 0x20, 0xcf, + 0xe4, 0x73, 0xae, 0x49, 0x7d, 0x64, 0x0f, 0x0e, + 0x28, 0x46, 0xa9, 0xa8, 0x32, 0xe4, 0x0e, 0xf6, + 0x51, 0x53, 0xb8, 0x3c, 0xb1, 0xff, 0xa3, 0x33, + 0x41, 0x75, 0xff, 0xf1, 0x6f, 0xf1, 0xfb, 0xbb, + 0x83, 0x7f, 0x06, 0x9b, 0xe7, 0x1b, 0x0a, 0xe0, + 0x5c, 0x33, 0x60, 0x5b, 0xdb, 0x5b, 0xed, 0xfe, + 0xa5, 0x16, 0x19, 0x72, 0xa3, 0x64, 0x23, 0x00, + 0x02, 0xc7, 0xf3, 0x6a, 0x81, 0x3e, 0x44, 0x1d, + 0x79, 0x15, 0x5f, 0x9a, 0xde, 0xe2, 0xfd, 0x1b, + 0x73, 0xc1, 0xbc, 0x23, 0xba, 0x31, 0xd2, 0x50, + 0xd5, 0xad, 0x7f, 0x74, 0xa7, 0xc9, 0xf8, 0x3e, + 0x2b, 0x26, 0x10, 0xf6, 0x03, 0x36, 0x74, 0xe4, + 0x0e, 0x6a, 0x72, 0xb7, 0x73, 0x0a, 0x42, 0x28, + 0xc2, 0xad, 0x5e, 0x03, 0xbe, 0xb8, 0x0b, 0xa8, + 0x5b, 0xd4, 0xb8, 0xba, 0x52, 0x89, 0xb1, 0x9b, + 0xc1, 0xc3, 0x65, 0x87, 0xed, 0xa5, 0xf4, 0x86, + 0xfd, 0x41, 0x80, 0x91, 0x27, 0x59, 0x53, 0x67, + 0x15, 0x78, 0x54, 0x8b, 0x2d, 0x3d, 0xc7, 0xff, + 0x02, 0x92, 0x07, 0x5f, 0x7a, 0x4b, 0x60, 0x59, + 0x3c, 0x6f, 0x5c, 0xd8, 0xec, 0x95, 0xd2, 0xfe, + 0xa0, 0x3b, 0xd8, 0x3f, 0xd1, 0x69, 0xa6, 0xd6, + 0x41, 0xb2, 0xf4, 0x4d, 0x12, 0xf4, 0x58, 0x3e, + 0x66, 0x64, 0x80, 0x31, 0x9b, 0xa8, 0x4c, 0x8b, + 0x07, 0xb2, 0xec, 0x66, 0x94, 0x66, 0x47, 0x50, + 0x50, 0x5f, 0x18, 0x0b, 0x0e, 0xd6, 0xc0, 0x39, + 0x21, 0x13, 0x9e, 0x33, 0xbc, 0x79, 0x36, 0x02, + 0x96, 0x70, 0xf0, 0x48, 0x67, 0x2f, 0x26, 0xe9, + 0x6d, 0x10, 0xbb, 0xd6, 0x3f, 0xd1, 0x64, 0x7a, + 0x2e, 0xbe, 0x0c, 0x61, 0xf0, 0x75, 0x42, 0x38, + 0x23, 0xb1, 0x9e, 0x9f, 0x7c, 0x67, 0x66, 0xd9, + 0x58, 0x9a, 0xf1, 0xbb, 0x41, 0x2a, 0x8d, 0x65, + 0x84, 0x94, 0xfc, 0xdc, 0x6a, 0x50, 0x64, 0xdb, + 0x56, 0x33, 0x76, 0x00, 0x10, 0xed, 0xbe, 0xd2, + 0x12, 0xf6, 0xf6, 0x1b, 0xa2, 0x16, 0xde, 0xae, + 0x31, 0x95, 0xdd, 0xb1, 0x08, 0x7e, 0x4e, 0xee, + 0xe7, 0xf9, 0xa5, 0xfb, 0x5b, 0x61, 0x43, 0x00, + 0x40, 0xf6, 0x7e, 0x02, 0x04, 0x32, 0x4e, 0x0c, + 0xe2, 0x66, 0x0d, 0xd7, 0x07, 0x98, 0x0e, 0xf8, + 0x72, 0x34, 0x6d, 0x95, 0x86, 0xd7, 0xcb, 0x31, + 0x54, 0x47, 0xd0, 0x38, 0x29, 0x9c, 0x5a, 0x68, + 0xd4, 0x87, 0x76, 0xc9, 0xe7, 0x7e, 0xe3, 0xf4, + 0x81, 0x6d, 0x18, 0xcb, 0xc9, 0x05, 0xaf, 0xa0, + 0xfb, 0x66, 0xf7, 0xf1, 0x1c, 0xc6, 0x14, 0x11, + 0x4f, 0x2b, 0x79, 0x42, 0x8b, 0xbc, 0xac, 0xe7, + 0x6c, 0xfe, 0x0f, 0x58, 0xe7, 0x7c, 0x78, 0x39, + 0x30, 0xb0, 0x66, 0x2c, 0x9b, 0x6d, 0x3a, 0xe1, + 0xcf, 0xc9, 0xa4, 0x0e, 0x6d, 0x6d, 0x8a, 0xa1, + 0x3a, 0xe7, 0x28, 0xd4, 0x78, 0x4c, 0xa6, 0xa2, + 0x2a, 0xa6, 0x03, 0x30, 0xd7, 0xa8, 0x25, 0x66, + 0x87, 0x2f, 0x69, 0x5c, 0x4e, 0xdd, 0xa5, 0x49, + 0x5d, 0x37, 0x4a, 0x59, 0xc4, 0xaf, 0x1f, 0xa2, + 0xe4, 0xf8, 0xa6, 0x12, 0x97, 0xd5, 0x79, 0xf5, + 0xe2, 0x4a, 0x2b, 0x5f, 0x61, 0xe4, 0x9e, 0xe3, + 0xee, 0xb8, 0xa7, 0x5b, 0x2f, 0xf4, 0x9e, 0x6c, + 0xfb, 0xd1, 0xc6, 0x56, 0x77, 0xba, 0x75, 0xaa, + 0x3d, 0x1a, 0xa8, 0x0b, 0xb3, 0x68, 0x24, 0x00, + 0x10, 0x7f, 0xfd, 0xd7, 0xa1, 0x8d, 0x83, 0x54, + 0x4f, 0x1f, 0xd8, 0x2a, 0xbe, 0x8a, 0x0c, 0x87, + 0xab, 0xa2, 0xde, 0xc3, 0x39, 0xbf, 0x09, 0x03, + 0xa5, 0xf3, 0x05, 0x28, 0xe1, 0xe1, 0xee, 0x39, + 0x70, 0x9c, 0xd8, 0x81, 0x12, 0x1e, 0x02, 0x40, + 0xd2, 0x6e, 0xf0, 0xeb, 0x1b, 0x3d, 0x22, 0xc6, + 0xe5, 0xe3, 0xb4, 0x5a, 0x98, 0xbb, 0xf0, 0x22, + 0x28, 0x8d, 0xe5, 0xd3, 0x16, 0x48, 0x24, 0xa5, + 0xe6, 0x66, 0x0c, 0xf9, 0x08, 0xf9, 0x7e, 0x1e, + 0xe1, 0x28, 0x26, 0x22, 0xc7, 0xc7, 0x0a, 0x32, + 0x47, 0xfa, 0xa3, 0xbe, 0x3c, 0xc4, 0xc5, 0x53, + 0x0a, 0xd5, 0x94, 0x4a, 0xd7, 0x93, 0xd8, 0x42, + 0x99, 0xb9, 0x0a, 0xdb, 0x56, 0xf7, 0xb9, 0x1c, + 0x53, 0x4f, 0xfa, 0xd3, 0x74, 0xad, 0xd9, 0x68, + 0xf1, 0x1b, 0xdf, 0x61, 0xc6, 0x5e, 0xa8, 0x48, + 0xfc, 0xd4, 0x4a, 0x4c, 0x3c, 0x32, 0xf7, 0x1c, + 0x96, 0x21, 0x9b, 0xf9, 0xa3, 0xcc, 0x5a, 0xce, + 0xd5, 0xd7, 0x08, 0x24, 0xf6, 0x1c, 0xfd, 0xdd, + 0x38, 0xc2, 0x32, 0xe9, 0xb8, 0xe7, 0xb6, 0xfa, + 0x9d, 0x45, 0x13, 0x2c, 0x83, 0xfd, 0x4a, 0x69, + 0x82, 0xcd, 0xdc, 0xb3, 0x76, 0x0c, 0x9e, 0xd8, + 0xf4, 0x1b, 0x45, 0x15, 0xb4, 0x97, 0xe7, 0x58, + 0x34, 0xe2, 0x03, 0x29, 0x5a, 0xbf, 0xb6, 0xe0, + 0x5d, 0x13, 0xd9, 0x2b, 0xb4, 0x80, 0xb2, 0x45, + 0x81, 0x6a, 0x2e, 0x6c, 0x89, 0x7d, 0xee, 0xbb, + 0x52, 0xdd, 0x1f, 0x18, 0xe7, 0x13, 0x6b, 0x33, + 0x0e, 0xea, 0x36, 0x92, 0x77, 0x7b, 0x6d, 0x9c, + 0x5a, 0x5f, 0x45, 0x7b, 0x7b, 0x35, 0x62, 0x23, + 0xd1, 0xbf, 0x0f, 0xd0, 0x08, 0x1b, 0x2b, 0x80, + 0x6b, 0x7e, 0xf1, 0x21, 0x47, 0xb0, 0x57, 0xd1, + 0x98, 0x72, 0x90, 0x34, 0x1c, 0x20, 0x04, 0xff, + 0x3d, 0x5c, 0xee, 0x0e, 0x57, 0x5f, 0x6f, 0x24, + 0x4e, 0x3c, 0xea, 0xfc, 0xa5, 0xa9, 0x83, 0xc9, + 0x61, 0xb4, 0x51, 0x24, 0xf8, 0x27, 0x5e, 0x46, + 0x8c, 0xb1, 0x53, 0x02, 0x96, 0x35, 0xba, 0xb8, + 0x4c, 0x71, 0xd3, 0x15, 0x59, 0x35, 0x22, 0x20, + 0xad, 0x03, 0x9f, 0x66, 0x44, 0x3b, 0x9c, 0x35, + 0x37, 0x1f, 0x9b, 0xbb, 0xf3, 0xdb, 0x35, 0x63, + 0x30, 0x64, 0xaa, 0xa2, 0x06, 0xa8, 0x5d, 0xbb, + 0xe1, 0x9f, 0x70, 0xec, 0x82, 0x11, 0x06, 0x36, + 0xec, 0x8b, 0x69, 0x66, 0x24, 0x44, 0xc9, 0x4a, + 0x57, 0xbb, 0x9b, 0x78, 0x13, 0xce, 0x9c, 0x0c, + 0xba, 0x92, 0x93, 0x63, 0xb8, 0xe2, 0x95, 0x0f, + 0x0f, 0x16, 0x39, 0x52, 0xfd, 0x3a, 0x6d, 0x02, + 0x4b, 0xdf, 0x13, 0xd3, 0x2a, 0x22, 0xb4, 0x03, + 0x7c, 0x54, 0x49, 0x96, 0x68, 0x54, 0x10, 0xfa, + 0xef, 0xaa, 0x6c, 0xe8, 0x22, 0xdc, 0x71, 0x16, + 0x13, 0x1a, 0xf6, 0x28, 0xe5, 0x6d, 0x77, 0x3d, + 0xcd, 0x30, 0x63, 0xb1, 0x70, 0x52, 0xa1, 0xc5, + 0x94, 0x5f, 0xcf, 0xe8, 0xb8, 0x26, 0x98, 0xf7, + 0x06, 0xa0, 0x0a, 0x70, 0xfa, 0x03, 0x80, 0xac, + 0xc1, 0xec, 0xd6, 0x4c, 0x54, 0xd7, 0xfe, 0x47, + 0xb6, 0x88, 0x4a, 0xf7, 0x71, 0x24, 0xee, 0xf3, + 0xd2, 0xc2, 0x4a, 0x7f, 0xfe, 0x61, 0xc7, 0x35, + 0xc9, 0x37, 0x67, 0xcb, 0x24, 0x35, 0xda, 0x7e, + 0xca, 0x5f, 0xf3, 0x8d, 0xd4, 0x13, 0x8e, 0xd6, + 0xcb, 0x4d, 0x53, 0x8f, 0x53, 0x1f, 0xc0, 0x74, + 0xf7, 0x53, 0xb9, 0x5e, 0x23, 0x37, 0xba, 0x6e, + 0xe3, 0x9d, 0x07, 0x55, 0x25, 0x7b, 0xe6, 0x2a, + 0x64, 0xd1, 0x32, 0xdd, 0x54, 0x1b, 0x4b, 0xc0, + 0xe1, 0xd7, 0x69, 0x58, 0xf8, 0x93, 0x29, 0xc4, + 0xdd, 0x23, 0x2f, 0xa5, 0xfc, 0x9d, 0x7e, 0xf8, + 0xd4, 0x90, 0xcd, 0x82, 0x55, 0xdc, 0x16, 0x16, + 0x9f, 0x07, 0x52, 0x9b, 0x9d, 0x25, 0xed, 0x32, + 0xc5, 0x7b, 0xdf, 0xf6, 0x83, 0x46, 0x3d, 0x65, + 0xb7, 0xef, 0x87, 0x7a, 0x12, 0x69, 0x8f, 0x06, + 0x7c, 0x51, 0x15, 0x4a, 0x08, 0xe8, 0xac, 0x9a, + 0x0c, 0x24, 0xa7, 0x27, 0xd8, 0x46, 0x2f, 0xe7, + 0x01, 0x0e, 0x1c, 0xc6, 0x91, 0xb0, 0x6e, 0x85, + 0x65, 0xf0, 0x29, 0x0d, 0x2e, 0x6b, 0x3b, 0xfb, + 0x4b, 0xdf, 0xe4, 0x80, 0x93, 0x03, 0x66, 0x46, + 0x3e, 0x8a, 0x6e, 0xf3, 0x5e, 0x4d, 0x62, 0x0e, + 0x49, 0x05, 0xaf, 0xd4, 0xf8, 0x21, 0x20, 0x61, + 0x1d, 0x39, 0x17, 0xf4, 0x61, 0x47, 0x95, 0xfb, + 0x15, 0x2e, 0xb3, 0x4f, 0xd0, 0x5d, 0xf5, 0x7d, + 0x40, 0xda, 0x90, 0x3c, 0x6b, 0xcb, 0x17, 0x00, + 0x13, 0x3b, 0x64, 0x34, 0x1b, 0xf0, 0xf2, 0xe5, + 0x3b, 0xb2, 0xc7, 0xd3, 0x5f, 0x3a, 0x44, 0xa6, + 0x9b, 0xb7, 0x78, 0x0e, 0x42, 0x5d, 0x4c, 0xc1, + 0xe9, 0xd2, 0xcb, 0xb7, 0x78, 0xd1, 0xfe, 0x9a, + 0xb5, 0x07, 0xe9, 0xe0, 0xbe, 0xe2, 0x8a, 0xa7, + 0x01, 0x83, 0x00, 0x8c, 0x5c, 0x08, 0xe6, 0x63, + 0x12, 0x92, 0xb7, 0xb7, 0xa6, 0x19, 0x7d, 0x38, + 0x13, 0x38, 0x92, 0x87, 0x24, 0xf9, 0x48, 0xb3, + 0x5e, 0x87, 0x6a, 0x40, 0x39, 0x5c, 0x3f, 0xed, + 0x8f, 0xee, 0xdb, 0x15, 0x82, 0x06, 0xda, 0x49, + 0x21, 0x2b, 0xb5, 0xbf, 0x32, 0x7c, 0x9f, 0x42, + 0x28, 0x63, 0xcf, 0xaf, 0x1e, 0xf8, 0xc6, 0xa0, + 0xd1, 0x02, 0x43, 0x57, 0x62, 0xec, 0x9b, 0x0f, + 0x01, 0x9e, 0x71, 0xd8, 0x87, 0x9d, 0x01, 0xc1, + 0x58, 0x77, 0xd9, 0xaf, 0xb1, 0x10, 0x7e, 0xdd, + 0xa6, 0x50, 0x96, 0xe5, 0xf0, 0x72, 0x00, 0x6d, + 0x4b, 0xf8, 0x2a, 0x8f, 0x19, 0xf3, 0x22, 0x88, + 0x11, 0x4a, 0x8b, 0x7c, 0xfd, 0xb7, 0xed, 0xe1, + 0xf6, 0x40, 0x39, 0xe0, 0xe9, 0xf6, 0x3d, 0x25, + 0xe6, 0x74, 0x3c, 0x58, 0x57, 0x7f, 0xe1, 0x22, + 0x96, 0x47, 0x31, 0x91, 0xba, 0x70, 0x85, 0x28, + 0x6b, 0x9f, 0x6e, 0x25, 0xac, 0x23, 0x66, 0x2f, + 0x29, 0x88, 0x28, 0xce, 0x8c, 0x5c, 0x88, 0x53, + 0xd1, 0x3b, 0xcc, 0x6a, 0x51, 0xb2, 0xe1, 0x28, + 0x3f, 0x91, 0xb4, 0x0d, 0x00, 0x3a, 0xe3, 0xf8, + 0xc3, 0x8f, 0xd7, 0x96, 0x62, 0x0e, 0x2e, 0xfc, + 0xc8, 0x6c, 0x77, 0xa6, 0x1d, 0x22, 0xc1, 0xb8, + 0xe6, 0x61, 0xd7, 0x67, 0x36, 0x13, 0x7b, 0xbb, + 0x9b, 0x59, 0x09, 0xa6, 0xdf, 0xf7, 0x6b, 0xa3, + 0x40, 0x1a, 0xf5, 0x4f, 0xb4, 0xda, 0xd3, 0xf3, + 0x81, 0x93, 0xc6, 0x18, 0xd9, 0x26, 0xee, 0xac, + 0xf0, 0xaa, 0xdf, 0xc5, 0x9c, 0xca, 0xc2, 0xa2, + 0xcc, 0x7b, 0x5c, 0x24, 0xb0, 0xbc, 0xd0, 0x6a, + 0x4d, 0x89, 0x09, 0xb8, 0x07, 0xfe, 0x87, 0xad, + 0x0a, 0xea, 0xb8, 0x42, 0xf9, 0x5e, 0xb3, 0x3e, + 0x36, 0x4c, 0xaf, 0x75, 0x9e, 0x1c, 0xeb, 0xbd, + 0xbc, 0xbb, 0x80, 0x40, 0xa7, 0x3a, 0x30, 0xbf, + 0xa8, 0x44, 0xf4, 0xeb, 0x38, 0xad, 0x29, 0xba, + 0x23, 0xed, 0x41, 0x0c, 0xea, 0xd2, 0xbb, 0x41, + 0x18, 0xd6, 0xb9, 0xba, 0x65, 0x2b, 0xa3, 0x91, + 0x6d, 0x1f, 0xa9, 0xf4, 0xd1, 0x25, 0x8d, 0x4d, + 0x38, 0xff, 0x64, 0xa0, 0xec, 0xde, 0xa6, 0xb6, + 0x79, 0xab, 0x8e, 0x33, 0x6c, 0x47, 0xde, 0xaf, + 0x94, 0xa4, 0xa5, 0x86, 0x77, 0x55, 0x09, 0x92, + 0x81, 0x31, 0x76, 0xc7, 0x34, 0x22, 0x89, 0x8e, + 0x3d, 0x26, 0x26, 0xd7, 0xfc, 0x1e, 0x16, 0x72, + 0x13, 0x33, 0x63, 0xd5, 0x22, 0xbe, 0xb8, 0x04, + 0x34, 0x84, 0x41, 0xbb, 0x80, 0xd0, 0x9f, 0x46, + 0x48, 0x07, 0xa7, 0xfc, 0x2b, 0x3a, 0x75, 0x55, + 0x8c, 0xc7, 0x6a, 0xbd, 0x7e, 0x46, 0x08, 0x84, + 0x0f, 0xd5, 0x74, 0xc0, 0x82, 0x8e, 0xaa, 0x61, + 0x05, 0x01, 0xb2, 0x47, 0x6e, 0x20, 0x6a, 0x2d, + 0x58, 0x70, 0x48, 0x32, 0xa7, 0x37, 0xd2, 0xb8, + 0x82, 0x1a, 0x51, 0xb9, 0x61, 0xdd, 0xfd, 0x9d, + 0x6b, 0x0e, 0x18, 0x97, 0xf8, 0x45, 0x5f, 0x87, + 0x10, 0xcf, 0x34, 0x72, 0x45, 0x26, 0x49, 0x70, + 0xe7, 0xa3, 0x78, 0xe0, 0x52, 0x89, 0x84, 0x94, + 0x83, 0x82, 0xc2, 0x69, 0x8f, 0xe3, 0xe1, 0x3f, + 0x60, 0x74, 0x88, 0xc4, 0xf7, 0x75, 0x2c, 0xfb, + 0xbd, 0xb6, 0xc4, 0x7e, 0x10, 0x0a, 0x6c, 0x90, + 0x04, 0x9e, 0xc3, 0x3f, 0x59, 0x7c, 0xce, 0x31, + 0x18, 0x60, 0x57, 0x73, 0x46, 0x94, 0x7d, 0x06, + 0xa0, 0x6d, 0x44, 0xec, 0xa2, 0x0a, 0x9e, 0x05, + 0x15, 0xef, 0xca, 0x5c, 0xbf, 0x00, 0xeb, 0xf7, + 0x3d, 0x32, 0xd4, 0xa5, 0xef, 0x49, 0x89, 0x5e, + 0x46, 0xb0, 0xa6, 0x63, 0x5b, 0x8a, 0x73, 0xae, + 0x6f, 0xd5, 0x9d, 0xf8, 0x4f, 0x40, 0xb5, 0xb2, + 0x6e, 0xd3, 0xb6, 0x01, 0xa9, 0x26, 0xa2, 0x21, + 0xcf, 0x33, 0x7a, 0x3a, 0xa4, 0x23, 0x13, 0xb0, + 0x69, 0x6a, 0xee, 0xce, 0xd8, 0x9d, 0x01, 0x1d, + 0x50, 0xc1, 0x30, 0x6c, 0xb1, 0xcd, 0xa0, 0xf0, + 0xf0, 0xa2, 0x64, 0x6f, 0xbb, 0xbf, 0x5e, 0xe6, + 0xab, 0x87, 0xb4, 0x0f, 0x4f, 0x15, 0xaf, 0xb5, + 0x25, 0xa1, 0xb2, 0xd0, 0x80, 0x2c, 0xfb, 0xf9, + 0xfe, 0xd2, 0x33, 0xbb, 0x76, 0xfe, 0x7c, 0xa8, + 0x66, 0xf7, 0xe7, 0x85, 0x9f, 0x1f, 0x85, 0x57, + 0x88, 0xe1, 0xe9, 0x63, 0xe4, 0xd8, 0x1c, 0xa1, + 0xfb, 0xda, 0x44, 0x05, 0x2e, 0x1d, 0x3a, 0x1c, + 0xff, 0xc8, 0x3b, 0xc0, 0xfe, 0xda, 0x22, 0x0b, + 0x43, 0xd6, 0x88, 0x39, 0x4c, 0x4a, 0xa6, 0x69, + 0x18, 0x93, 0x42, 0x4e, 0xb5, 0xcc, 0x66, 0x0d, + 0x09, 0xf8, 0x1e, 0x7c, 0xd3, 0x3c, 0x99, 0x0d, + 0x50, 0x1d, 0x62, 0xe9, 0x57, 0x06, 0xbf, 0x19, + 0x88, 0xdd, 0xad, 0x7b, 0x4f, 0xf9, 0xc7, 0x82, + 0x6d, 0x8d, 0xc8, 0xc4, 0xc5, 0x78, 0x17, 0x20, + 0x15, 0xc5, 0x52, 0x41, 0xcf, 0x5b, 0xd6, 0x7f, + 0x94, 0x02, 0x41, 0xe0, 0x40, 0x22, 0x03, 0x5e, + 0xd1, 0x53, 0xd4, 0x86, 0xd3, 0x2c, 0x9f, 0x0f, + 0x96, 0xe3, 0x6b, 0x9a, 0x76, 0x32, 0x06, 0x47, + 0x4b, 0x11, 0xb3, 0xdd, 0x03, 0x65, 0xbd, 0x9b, + 0x01, 0xda, 0x9c, 0xb9, 0x7e, 0x3f, 0x6a, 0xc4, + 0x7b, 0xea, 0xd4, 0x3c, 0xb9, 0xfb, 0x5c, 0x6b, + 0x64, 0x33, 0x52, 0xba, 0x64, 0x78, 0x8f, 0xa4, + 0xaf, 0x7a, 0x61, 0x8d, 0xbc, 0xc5, 0x73, 0xe9, + 0x6b, 0x58, 0x97, 0x4b, 0xbf, 0x63, 0x22, 0xd3, + 0x37, 0x02, 0x54, 0xc5, 0xb9, 0x16, 0x4a, 0xf0, + 0x19, 0xd8, 0x94, 0x57, 0xb8, 0x8a, 0xb3, 0x16, + 0x3b, 0xd0, 0x84, 0x8e, 0x67, 0xa6, 0xa3, 0x7d, + 0x78, 0xec, 0x00 +}; +static const u8 dec_assoc012[] __initconst = { + 0xb1, 0x69, 0x83, 0x87, 0x30, 0xaa, 0x5d, 0xb8, + 0x77, 0xe8, 0x21, 0xff, 0x06, 0x59, 0x35, 0xce, + 0x75, 0xfe, 0x38, 0xef, 0xb8, 0x91, 0x43, 0x8c, + 0xcf, 0x70, 0xdd, 0x0a, 0x68, 0xbf, 0xd4, 0xbc, + 0x16, 0x76, 0x99, 0x36, 0x1e, 0x58, 0x79, 0x5e, + 0xd4, 0x29, 0xf7, 0x33, 0x93, 0x48, 0xdb, 0x5f, + 0x01, 0xae, 0x9c, 0xb6, 0xe4, 0x88, 0x6d, 0x2b, + 0x76, 0x75, 0xe0, 0xf3, 0x74, 0xe2, 0xc9 +}; +static const u8 dec_nonce012[] __initconst = { + 0x05, 0xa3, 0x93, 0xed, 0x30, 0xc5, 0xa2, 0x06 +}; +static const u8 dec_key012[] __initconst = { + 0xb3, 0x35, 0x50, 0x03, 0x54, 0x2e, 0x40, 0x5e, + 0x8f, 0x59, 0x8e, 0xc5, 0x90, 0xd5, 0x27, 0x2d, + 0xba, 0x29, 0x2e, 0xcb, 0x1b, 0x70, 0x44, 0x1e, + 0x65, 0x91, 0x6e, 0x2a, 0x79, 0x22, 0xda, 0x64 +}; + +static const u8 dec_input013[] __initconst = { + 0x52, 0x34, 0xb3, 0x65, 0x3b, 0xb7, 0xe5, 0xd3, + 0xab, 0x49, 0x17, 0x60, 0xd2, 0x52, 0x56, 0xdf, + 0xdf, 0x34, 0x56, 0x82, 0xe2, 0xbe, 0xe5, 0xe1, + 0x28, 0xd1, 0x4e, 0x5f, 0x4f, 0x01, 0x7d, 0x3f, + 0x99, 0x6b, 0x30, 0x6e, 0x1a, 0x7c, 0x4c, 0x8e, + 0x62, 0x81, 0xae, 0x86, 0x3f, 0x6b, 0xd0, 0xb5, + 0xa9, 0xcf, 0x50, 0xf1, 0x02, 0x12, 0xa0, 0x0b, + 0x24, 0xe9, 0xe6, 0x72, 0x89, 0x2c, 0x52, 0x1b, + 0x34, 0x38, 0xf8, 0x75, 0x5f, 0xa0, 0x74, 0xe2, + 0x99, 0xdd, 0xa6, 0x4b, 0x14, 0x50, 0x4e, 0xf1, + 0xbe, 0xd6, 0x9e, 0xdb, 0xb2, 0x24, 0x27, 0x74, + 0x12, 0x4a, 0x78, 0x78, 0x17, 0xa5, 0x58, 0x8e, + 0x2f, 0xf9, 0xf4, 0x8d, 0xee, 0x03, 0x88, 0xae, + 0xb8, 0x29, 0xa1, 0x2f, 0x4b, 0xee, 0x92, 0xbd, + 0x87, 0xb3, 0xce, 0x34, 0x21, 0x57, 0x46, 0x04, + 0x49, 0x0c, 0x80, 0xf2, 0x01, 0x13, 0xa1, 0x55, + 0xb3, 0xff, 0x44, 0x30, 0x3c, 0x1c, 0xd0, 0xef, + 0xbc, 0x18, 0x74, 0x26, 0xad, 0x41, 0x5b, 0x5b, + 0x3e, 0x9a, 0x7a, 0x46, 0x4f, 0x16, 0xd6, 0x74, + 0x5a, 0xb7, 0x3a, 0x28, 0x31, 0xd8, 0xae, 0x26, + 0xac, 0x50, 0x53, 0x86, 0xf2, 0x56, 0xd7, 0x3f, + 0x29, 0xbc, 0x45, 0x68, 0x8e, 0xcb, 0x98, 0x64, + 0xdd, 0xc9, 0xba, 0xb8, 0x4b, 0x7b, 0x82, 0xdd, + 0x14, 0xa7, 0xcb, 0x71, 0x72, 0x00, 0x5c, 0xad, + 0x7b, 0x6a, 0x89, 0xa4, 0x3d, 0xbf, 0xb5, 0x4b, + 0x3e, 0x7c, 0x5a, 0xcf, 0xb8, 0xa1, 0xc5, 0x6e, + 0xc8, 0xb6, 0x31, 0x57, 0x7b, 0xdf, 0xa5, 0x7e, + 0xb1, 0xd6, 0x42, 0x2a, 0x31, 0x36, 0xd1, 0xd0, + 0x3f, 0x7a, 0xe5, 0x94, 0xd6, 0x36, 0xa0, 0x6f, + 0xb7, 0x40, 0x7d, 0x37, 0xc6, 0x55, 0x7c, 0x50, + 0x40, 0x6d, 0x29, 0x89, 0xe3, 0x5a, 0xae, 0x97, + 0xe7, 0x44, 0x49, 0x6e, 0xbd, 0x81, 0x3d, 0x03, + 0x93, 0x06, 0x12, 0x06, 0xe2, 0x41, 0x12, 0x4a, + 0xf1, 0x6a, 0xa4, 0x58, 0xa2, 0xfb, 0xd2, 0x15, + 0xba, 0xc9, 0x79, 0xc9, 0xce, 0x5e, 0x13, 0xbb, + 0xf1, 0x09, 0x04, 0xcc, 0xfd, 0xe8, 0x51, 0x34, + 0x6a, 0xe8, 0x61, 0x88, 0xda, 0xed, 0x01, 0x47, + 0x84, 0xf5, 0x73, 0x25, 0xf9, 0x1c, 0x42, 0x86, + 0x07, 0xf3, 0x5b, 0x1a, 0x01, 0xb3, 0xeb, 0x24, + 0x32, 0x8d, 0xf6, 0xed, 0x7c, 0x4b, 0xeb, 0x3c, + 0x36, 0x42, 0x28, 0xdf, 0xdf, 0xb6, 0xbe, 0xd9, + 0x8c, 0x52, 0xd3, 0x2b, 0x08, 0x90, 0x8c, 0xe7, + 0x98, 0x31, 0xe2, 0x32, 0x8e, 0xfc, 0x11, 0x48, + 0x00, 0xa8, 0x6a, 0x42, 0x4a, 0x02, 0xc6, 0x4b, + 0x09, 0xf1, 0xe3, 0x49, 0xf3, 0x45, 0x1f, 0x0e, + 0xbc, 0x56, 0xe2, 0xe4, 0xdf, 0xfb, 0xeb, 0x61, + 0xfa, 0x24, 0xc1, 0x63, 0x75, 0xbb, 0x47, 0x75, + 0xaf, 0xe1, 0x53, 0x16, 0x96, 0x21, 0x85, 0x26, + 0x11, 0xb3, 0x76, 0xe3, 0x23, 0xa1, 0x6b, 0x74, + 0x37, 0xd0, 0xde, 0x06, 0x90, 0x71, 0x5d, 0x43, + 0x88, 0x9b, 0x00, 0x54, 0xa6, 0x75, 0x2f, 0xa1, + 0xc2, 0x0b, 0x73, 0x20, 0x1d, 0xb6, 0x21, 0x79, + 0x57, 0x3f, 0xfa, 0x09, 0xbe, 0x8a, 0x33, 0xc3, + 0x52, 0xf0, 0x1d, 0x82, 0x31, 0xd1, 0x55, 0xb5, + 0x6c, 0x99, 0x25, 0xcf, 0x5c, 0x32, 0xce, 0xe9, + 0x0d, 0xfa, 0x69, 0x2c, 0xd5, 0x0d, 0xc5, 0x6d, + 0x86, 0xd0, 0x0c, 0x3b, 0x06, 0x50, 0x79, 0xe8, + 0xc3, 0xae, 0x04, 0xe6, 0xcd, 0x51, 0xe4, 0x26, + 0x9b, 0x4f, 0x7e, 0xa6, 0x0f, 0xab, 0xd8, 0xe5, + 0xde, 0xa9, 0x00, 0x95, 0xbe, 0xa3, 0x9d, 0x5d, + 0xb2, 0x09, 0x70, 0x18, 0x1c, 0xf0, 0xac, 0x29, + 0x23, 0x02, 0x29, 0x28, 0xd2, 0x74, 0x35, 0x57, + 0x62, 0x0f, 0x24, 0xea, 0x5e, 0x33, 0xc2, 0x92, + 0xf3, 0x78, 0x4d, 0x30, 0x1e, 0xa1, 0x99, 0xa9, + 0x82, 0xb0, 0x42, 0x31, 0x8d, 0xad, 0x8a, 0xbc, + 0xfc, 0xd4, 0x57, 0x47, 0x3e, 0xb4, 0x50, 0xdd, + 0x6e, 0x2c, 0x80, 0x4d, 0x22, 0xf1, 0xfb, 0x57, + 0xc4, 0xdd, 0x17, 0xe1, 0x8a, 0x36, 0x4a, 0xb3, + 0x37, 0xca, 0xc9, 0x4e, 0xab, 0xd5, 0x69, 0xc4, + 0xf4, 0xbc, 0x0b, 0x3b, 0x44, 0x4b, 0x29, 0x9c, + 0xee, 0xd4, 0x35, 0x22, 0x21, 0xb0, 0x1f, 0x27, + 0x64, 0xa8, 0x51, 0x1b, 0xf0, 0x9f, 0x19, 0x5c, + 0xfb, 0x5a, 0x64, 0x74, 0x70, 0x45, 0x09, 0xf5, + 0x64, 0xfe, 0x1a, 0x2d, 0xc9, 0x14, 0x04, 0x14, + 0xcf, 0xd5, 0x7d, 0x60, 0xaf, 0x94, 0x39, 0x94, + 0xe2, 0x7d, 0x79, 0x82, 0xd0, 0x65, 0x3b, 0x6b, + 0x9c, 0x19, 0x84, 0xb4, 0x6d, 0xb3, 0x0c, 0x99, + 0xc0, 0x56, 0xa8, 0xbd, 0x73, 0xce, 0x05, 0x84, + 0x3e, 0x30, 0xaa, 0xc4, 0x9b, 0x1b, 0x04, 0x2a, + 0x9f, 0xd7, 0x43, 0x2b, 0x23, 0xdf, 0xbf, 0xaa, + 0xd5, 0xc2, 0x43, 0x2d, 0x70, 0xab, 0xdc, 0x75, + 0xad, 0xac, 0xf7, 0xc0, 0xbe, 0x67, 0xb2, 0x74, + 0xed, 0x67, 0x10, 0x4a, 0x92, 0x60, 0xc1, 0x40, + 0x50, 0x19, 0x8a, 0x8a, 0x8c, 0x09, 0x0e, 0x72, + 0xe1, 0x73, 0x5e, 0xe8, 0x41, 0x85, 0x63, 0x9f, + 0x3f, 0xd7, 0x7d, 0xc4, 0xfb, 0x22, 0x5d, 0x92, + 0x6c, 0xb3, 0x1e, 0xe2, 0x50, 0x2f, 0x82, 0xa8, + 0x28, 0xc0, 0xb5, 0xd7, 0x5f, 0x68, 0x0d, 0x2c, + 0x2d, 0xaf, 0x7e, 0xfa, 0x2e, 0x08, 0x0f, 0x1f, + 0x70, 0x9f, 0xe9, 0x19, 0x72, 0x55, 0xf8, 0xfb, + 0x51, 0xd2, 0x33, 0x5d, 0xa0, 0xd3, 0x2b, 0x0a, + 0x6c, 0xbc, 0x4e, 0xcf, 0x36, 0x4d, 0xdc, 0x3b, + 0xe9, 0x3e, 0x81, 0x7c, 0x61, 0xdb, 0x20, 0x2d, + 0x3a, 0xc3, 0xb3, 0x0c, 0x1e, 0x00, 0xb9, 0x7c, + 0xf5, 0xca, 0x10, 0x5f, 0x3a, 0x71, 0xb3, 0xe4, + 0x20, 0xdb, 0x0c, 0x2a, 0x98, 0x63, 0x45, 0x00, + 0x58, 0xf6, 0x68, 0xe4, 0x0b, 0xda, 0x13, 0x3b, + 0x60, 0x5c, 0x76, 0xdb, 0xb9, 0x97, 0x71, 0xe4, + 0xd9, 0xb7, 0xdb, 0xbd, 0x68, 0xc7, 0x84, 0x84, + 0xaa, 0x7c, 0x68, 0x62, 0x5e, 0x16, 0xfc, 0xba, + 0x72, 0xaa, 0x9a, 0xa9, 0xeb, 0x7c, 0x75, 0x47, + 0x97, 0x7e, 0xad, 0xe2, 0xd9, 0x91, 0xe8, 0xe4, + 0xa5, 0x31, 0xd7, 0x01, 0x8e, 0xa2, 0x11, 0x88, + 0x95, 0xb9, 0xf2, 0x9b, 0xd3, 0x7f, 0x1b, 0x81, + 0x22, 0xf7, 0x98, 0x60, 0x0a, 0x64, 0xa6, 0xc1, + 0xf6, 0x49, 0xc7, 0xe3, 0x07, 0x4d, 0x94, 0x7a, + 0xcf, 0x6e, 0x68, 0x0c, 0x1b, 0x3f, 0x6e, 0x2e, + 0xee, 0x92, 0xfa, 0x52, 0xb3, 0x59, 0xf8, 0xf1, + 0x8f, 0x6a, 0x66, 0xa3, 0x82, 0x76, 0x4a, 0x07, + 0x1a, 0xc7, 0xdd, 0xf5, 0xda, 0x9c, 0x3c, 0x24, + 0xbf, 0xfd, 0x42, 0xa1, 0x10, 0x64, 0x6a, 0x0f, + 0x89, 0xee, 0x36, 0xa5, 0xce, 0x99, 0x48, 0x6a, + 0xf0, 0x9f, 0x9e, 0x69, 0xa4, 0x40, 0x20, 0xe9, + 0x16, 0x15, 0xf7, 0xdb, 0x75, 0x02, 0xcb, 0xe9, + 0x73, 0x8b, 0x3b, 0x49, 0x2f, 0xf0, 0xaf, 0x51, + 0x06, 0x5c, 0xdf, 0x27, 0x27, 0x49, 0x6a, 0xd1, + 0xcc, 0xc7, 0xb5, 0x63, 0xb5, 0xfc, 0xb8, 0x5c, + 0x87, 0x7f, 0x84, 0xb4, 0xcc, 0x14, 0xa9, 0x53, + 0xda, 0xa4, 0x56, 0xf8, 0xb6, 0x1b, 0xcc, 0x40, + 0x27, 0x52, 0x06, 0x5a, 0x13, 0x81, 0xd7, 0x3a, + 0xd4, 0x3b, 0xfb, 0x49, 0x65, 0x31, 0x33, 0xb2, + 0xfa, 0xcd, 0xad, 0x58, 0x4e, 0x2b, 0xae, 0xd2, + 0x20, 0xfb, 0x1a, 0x48, 0xb4, 0x3f, 0x9a, 0xd8, + 0x7a, 0x35, 0x4a, 0xc8, 0xee, 0x88, 0x5e, 0x07, + 0x66, 0x54, 0xb9, 0xec, 0x9f, 0xa3, 0xe3, 0xb9, + 0x37, 0xaa, 0x49, 0x76, 0x31, 0xda, 0x74, 0x2d, + 0x3c, 0xa4, 0x65, 0x10, 0x32, 0x38, 0xf0, 0xde, + 0xd3, 0x99, 0x17, 0xaa, 0x71, 0xaa, 0x8f, 0x0f, + 0x8c, 0xaf, 0xa2, 0xf8, 0x5d, 0x64, 0xba, 0x1d, + 0xa3, 0xef, 0x96, 0x73, 0xe8, 0xa1, 0x02, 0x8d, + 0x0c, 0x6d, 0xb8, 0x06, 0x90, 0xb8, 0x08, 0x56, + 0x2c, 0xa7, 0x06, 0xc9, 0xc2, 0x38, 0xdb, 0x7c, + 0x63, 0xb1, 0x57, 0x8e, 0xea, 0x7c, 0x79, 0xf3, + 0x49, 0x1d, 0xfe, 0x9f, 0xf3, 0x6e, 0xb1, 0x1d, + 0xba, 0x19, 0x80, 0x1a, 0x0a, 0xd3, 0xb0, 0x26, + 0x21, 0x40, 0xb1, 0x7c, 0xf9, 0x4d, 0x8d, 0x10, + 0xc1, 0x7e, 0xf4, 0xf6, 0x3c, 0xa8, 0xfd, 0x7c, + 0xa3, 0x92, 0xb2, 0x0f, 0xaa, 0xcc, 0xa6, 0x11, + 0xfe, 0x04, 0xe3, 0xd1, 0x7a, 0x32, 0x89, 0xdf, + 0x0d, 0xc4, 0x8f, 0x79, 0x6b, 0xca, 0x16, 0x7c, + 0x6e, 0xf9, 0xad, 0x0f, 0xf6, 0xfe, 0x27, 0xdb, + 0xc4, 0x13, 0x70, 0xf1, 0x62, 0x1a, 0x4f, 0x79, + 0x40, 0xc9, 0x9b, 0x8b, 0x21, 0xea, 0x84, 0xfa, + 0xf5, 0xf1, 0x89, 0xce, 0xb7, 0x55, 0x0a, 0x80, + 0x39, 0x2f, 0x55, 0x36, 0x16, 0x9c, 0x7b, 0x08, + 0xbd, 0x87, 0x0d, 0xa5, 0x32, 0xf1, 0x52, 0x7c, + 0xe8, 0x55, 0x60, 0x5b, 0xd7, 0x69, 0xe4, 0xfc, + 0xfa, 0x12, 0x85, 0x96, 0xea, 0x50, 0x28, 0xab, + 0x8a, 0xf7, 0xbb, 0x0e, 0x53, 0x74, 0xca, 0xa6, + 0x27, 0x09, 0xc2, 0xb5, 0xde, 0x18, 0x14, 0xd9, + 0xea, 0xe5, 0x29, 0x1c, 0x40, 0x56, 0xcf, 0xd7, + 0xae, 0x05, 0x3f, 0x65, 0xaf, 0x05, 0x73, 0xe2, + 0x35, 0x96, 0x27, 0x07, 0x14, 0xc0, 0xad, 0x33, + 0xf1, 0xdc, 0x44, 0x7a, 0x89, 0x17, 0x77, 0xd2, + 0x9c, 0x58, 0x60, 0xf0, 0x3f, 0x7b, 0x2d, 0x2e, + 0x57, 0x95, 0x54, 0x87, 0xed, 0xf2, 0xc7, 0x4c, + 0xf0, 0xae, 0x56, 0x29, 0x19, 0x7d, 0x66, 0x4b, + 0x9b, 0x83, 0x84, 0x42, 0x3b, 0x01, 0x25, 0x66, + 0x8e, 0x02, 0xde, 0xb9, 0x83, 0x54, 0x19, 0xf6, + 0x9f, 0x79, 0x0d, 0x67, 0xc5, 0x1d, 0x7a, 0x44, + 0x02, 0x98, 0xa7, 0x16, 0x1c, 0x29, 0x0d, 0x74, + 0xff, 0x85, 0x40, 0x06, 0xef, 0x2c, 0xa9, 0xc6, + 0xf5, 0x53, 0x07, 0x06, 0xae, 0xe4, 0xfa, 0x5f, + 0xd8, 0x39, 0x4d, 0xf1, 0x9b, 0x6b, 0xd9, 0x24, + 0x84, 0xfe, 0x03, 0x4c, 0xb2, 0x3f, 0xdf, 0xa1, + 0x05, 0x9e, 0x50, 0x14, 0x5a, 0xd9, 0x1a, 0xa2, + 0xa7, 0xfa, 0xfa, 0x17, 0xf7, 0x78, 0xd6, 0xb5, + 0x92, 0x61, 0x91, 0xac, 0x36, 0xfa, 0x56, 0x0d, + 0x38, 0x32, 0x18, 0x85, 0x08, 0x58, 0x37, 0xf0, + 0x4b, 0xdb, 0x59, 0xe7, 0xa4, 0x34, 0xc0, 0x1b, + 0x01, 0xaf, 0x2d, 0xde, 0xa1, 0xaa, 0x5d, 0xd3, + 0xec, 0xe1, 0xd4, 0xf7, 0xe6, 0x54, 0x68, 0xf0, + 0x51, 0x97, 0xa7, 0x89, 0xea, 0x24, 0xad, 0xd3, + 0x6e, 0x47, 0x93, 0x8b, 0x4b, 0xb4, 0xf7, 0x1c, + 0x42, 0x06, 0x67, 0xe8, 0x99, 0xf6, 0xf5, 0x7b, + 0x85, 0xb5, 0x65, 0xb5, 0xb5, 0xd2, 0x37, 0xf5, + 0xf3, 0x02, 0xa6, 0x4d, 0x11, 0xa7, 0xdc, 0x51, + 0x09, 0x7f, 0xa0, 0xd8, 0x88, 0x1c, 0x13, 0x71, + 0xae, 0x9c, 0xb7, 0x7b, 0x34, 0xd6, 0x4e, 0x68, + 0x26, 0x83, 0x51, 0xaf, 0x1d, 0xee, 0x8b, 0xbb, + 0x69, 0x43, 0x2b, 0x9e, 0x8a, 0xbc, 0x02, 0x0e, + 0xa0, 0x1b, 0xe0, 0xa8, 0x5f, 0x6f, 0xaf, 0x1b, + 0x8f, 0xe7, 0x64, 0x71, 0x74, 0x11, 0x7e, 0xa8, + 0xd8, 0xf9, 0x97, 0x06, 0xc3, 0xb6, 0xfb, 0xfb, + 0xb7, 0x3d, 0x35, 0x9d, 0x3b, 0x52, 0xed, 0x54, + 0xca, 0xf4, 0x81, 0x01, 0x2d, 0x1b, 0xc3, 0xa7, + 0x00, 0x3d, 0x1a, 0x39, 0x54, 0xe1, 0xf6, 0xff, + 0xed, 0x6f, 0x0b, 0x5a, 0x68, 0xda, 0x58, 0xdd, + 0xa9, 0xcf, 0x5c, 0x4a, 0xe5, 0x09, 0x4e, 0xde, + 0x9d, 0xbc, 0x3e, 0xee, 0x5a, 0x00, 0x3b, 0x2c, + 0x87, 0x10, 0x65, 0x60, 0xdd, 0xd7, 0x56, 0xd1, + 0x4c, 0x64, 0x45, 0xe4, 0x21, 0xec, 0x78, 0xf8, + 0x25, 0x7a, 0x3e, 0x16, 0x5d, 0x09, 0x53, 0x14, + 0xbe, 0x4f, 0xae, 0x87, 0xd8, 0xd1, 0xaa, 0x3c, + 0xf6, 0x3e, 0xa4, 0x70, 0x8c, 0x5e, 0x70, 0xa4, + 0xb3, 0x6b, 0x66, 0x73, 0xd3, 0xbf, 0x31, 0x06, + 0x19, 0x62, 0x93, 0x15, 0xf2, 0x86, 0xe4, 0x52, + 0x7e, 0x53, 0x4c, 0x12, 0x38, 0xcc, 0x34, 0x7d, + 0x57, 0xf6, 0x42, 0x93, 0x8a, 0xc4, 0xee, 0x5c, + 0x8a, 0xe1, 0x52, 0x8f, 0x56, 0x64, 0xf6, 0xa6, + 0xd1, 0x91, 0x57, 0x70, 0xcd, 0x11, 0x76, 0xf5, + 0x59, 0x60, 0x60, 0x3c, 0xc1, 0xc3, 0x0b, 0x7f, + 0x58, 0x1a, 0x50, 0x91, 0xf1, 0x68, 0x8f, 0x6e, + 0x74, 0x74, 0xa8, 0x51, 0x0b, 0xf7, 0x7a, 0x98, + 0x37, 0xf2, 0x0a, 0x0e, 0xa4, 0x97, 0x04, 0xb8, + 0x9b, 0xfd, 0xa0, 0xea, 0xf7, 0x0d, 0xe1, 0xdb, + 0x03, 0xf0, 0x31, 0x29, 0xf8, 0xdd, 0x6b, 0x8b, + 0x5d, 0xd8, 0x59, 0xa9, 0x29, 0xcf, 0x9a, 0x79, + 0x89, 0x19, 0x63, 0x46, 0x09, 0x79, 0x6a, 0x11, + 0xda, 0x63, 0x68, 0x48, 0x77, 0x23, 0xfb, 0x7d, + 0x3a, 0x43, 0xcb, 0x02, 0x3b, 0x7a, 0x6d, 0x10, + 0x2a, 0x9e, 0xac, 0xf1, 0xd4, 0x19, 0xf8, 0x23, + 0x64, 0x1d, 0x2c, 0x5f, 0xf2, 0xb0, 0x5c, 0x23, + 0x27, 0xf7, 0x27, 0x30, 0x16, 0x37, 0xb1, 0x90, + 0xab, 0x38, 0xfb, 0x55, 0xcd, 0x78, 0x58, 0xd4, + 0x7d, 0x43, 0xf6, 0x45, 0x5e, 0x55, 0x8d, 0xb1, + 0x02, 0x65, 0x58, 0xb4, 0x13, 0x4b, 0x36, 0xf7, + 0xcc, 0xfe, 0x3d, 0x0b, 0x82, 0xe2, 0x12, 0x11, + 0xbb, 0xe6, 0xb8, 0x3a, 0x48, 0x71, 0xc7, 0x50, + 0x06, 0x16, 0x3a, 0xe6, 0x7c, 0x05, 0xc7, 0xc8, + 0x4d, 0x2f, 0x08, 0x6a, 0x17, 0x9a, 0x95, 0x97, + 0x50, 0x68, 0xdc, 0x28, 0x18, 0xc4, 0x61, 0x38, + 0xb9, 0xe0, 0x3e, 0x78, 0xdb, 0x29, 0xe0, 0x9f, + 0x52, 0xdd, 0xf8, 0x4f, 0x91, 0xc1, 0xd0, 0x33, + 0xa1, 0x7a, 0x8e, 0x30, 0x13, 0x82, 0x07, 0x9f, + 0xd3, 0x31, 0x0f, 0x23, 0xbe, 0x32, 0x5a, 0x75, + 0xcf, 0x96, 0xb2, 0xec, 0xb5, 0x32, 0xac, 0x21, + 0xd1, 0x82, 0x33, 0xd3, 0x15, 0x74, 0xbd, 0x90, + 0xf1, 0x2c, 0xe6, 0x5f, 0x8d, 0xe3, 0x02, 0xe8, + 0xe9, 0xc4, 0xca, 0x96, 0xeb, 0x0e, 0xbc, 0x91, + 0xf4, 0xb9, 0xea, 0xd9, 0x1b, 0x75, 0xbd, 0xe1, + 0xac, 0x2a, 0x05, 0x37, 0x52, 0x9b, 0x1b, 0x3f, + 0x5a, 0xdc, 0x21, 0xc3, 0x98, 0xbb, 0xaf, 0xa3, + 0xf2, 0x00, 0xbf, 0x0d, 0x30, 0x89, 0x05, 0xcc, + 0xa5, 0x76, 0xf5, 0x06, 0xf0, 0xc6, 0x54, 0x8a, + 0x5d, 0xd4, 0x1e, 0xc1, 0xf2, 0xce, 0xb0, 0x62, + 0xc8, 0xfc, 0x59, 0x42, 0x9a, 0x90, 0x60, 0x55, + 0xfe, 0x88, 0xa5, 0x8b, 0xb8, 0x33, 0x0c, 0x23, + 0x24, 0x0d, 0x15, 0x70, 0x37, 0x1e, 0x3d, 0xf6, + 0xd2, 0xea, 0x92, 0x10, 0xb2, 0xc4, 0x51, 0xac, + 0xf2, 0xac, 0xf3, 0x6b, 0x6c, 0xaa, 0xcf, 0x12, + 0xc5, 0x6c, 0x90, 0x50, 0xb5, 0x0c, 0xfc, 0x1a, + 0x15, 0x52, 0xe9, 0x26, 0xc6, 0x52, 0xa4, 0xe7, + 0x81, 0x69, 0xe1, 0xe7, 0x9e, 0x30, 0x01, 0xec, + 0x84, 0x89, 0xb2, 0x0d, 0x66, 0xdd, 0xce, 0x28, + 0x5c, 0xec, 0x98, 0x46, 0x68, 0x21, 0x9f, 0x88, + 0x3f, 0x1f, 0x42, 0x77, 0xce, 0xd0, 0x61, 0xd4, + 0x20, 0xa7, 0xff, 0x53, 0xad, 0x37, 0xd0, 0x17, + 0x35, 0xc9, 0xfc, 0xba, 0x0a, 0x78, 0x3f, 0xf2, + 0xcc, 0x86, 0x89, 0xe8, 0x4b, 0x3c, 0x48, 0x33, + 0x09, 0x7f, 0xc6, 0xc0, 0xdd, 0xb8, 0xfd, 0x7a, + 0x66, 0x66, 0x65, 0xeb, 0x47, 0xa7, 0x04, 0x28, + 0xa3, 0x19, 0x8e, 0xa9, 0xb1, 0x13, 0x67, 0x62, + 0x70, 0xcf, 0xd7 +}; +static const u8 dec_output013[] __initconst = { + 0x74, 0xa6, 0x3e, 0xe4, 0xb1, 0xcb, 0xaf, 0xb0, + 0x40, 0xe5, 0x0f, 0x9e, 0xf1, 0xf2, 0x89, 0xb5, + 0x42, 0x34, 0x8a, 0xa1, 0x03, 0xb7, 0xe9, 0x57, + 0x46, 0xbe, 0x20, 0xe4, 0x6e, 0xb0, 0xeb, 0xff, + 0xea, 0x07, 0x7e, 0xef, 0xe2, 0x55, 0x9f, 0xe5, + 0x78, 0x3a, 0xb7, 0x83, 0xc2, 0x18, 0x40, 0x7b, + 0xeb, 0xcd, 0x81, 0xfb, 0x90, 0x12, 0x9e, 0x46, + 0xa9, 0xd6, 0x4a, 0xba, 0xb0, 0x62, 0xdb, 0x6b, + 0x99, 0xc4, 0xdb, 0x54, 0x4b, 0xb8, 0xa5, 0x71, + 0xcb, 0xcd, 0x63, 0x32, 0x55, 0xfb, 0x31, 0xf0, + 0x38, 0xf5, 0xbe, 0x78, 0xe4, 0x45, 0xce, 0x1b, + 0x6a, 0x5b, 0x0e, 0xf4, 0x16, 0xe4, 0xb1, 0x3d, + 0xf6, 0x63, 0x7b, 0xa7, 0x0c, 0xde, 0x6f, 0x8f, + 0x74, 0xdf, 0xe0, 0x1e, 0x9d, 0xce, 0x8f, 0x24, + 0xef, 0x23, 0x35, 0x33, 0x7b, 0x83, 0x34, 0x23, + 0x58, 0x74, 0x14, 0x77, 0x1f, 0xc2, 0x4f, 0x4e, + 0xc6, 0x89, 0xf9, 0x52, 0x09, 0x37, 0x64, 0x14, + 0xc4, 0x01, 0x6b, 0x9d, 0x77, 0xe8, 0x90, 0x5d, + 0xa8, 0x4a, 0x2a, 0xef, 0x5c, 0x7f, 0xeb, 0xbb, + 0xb2, 0xc6, 0x93, 0x99, 0x66, 0xdc, 0x7f, 0xd4, + 0x9e, 0x2a, 0xca, 0x8d, 0xdb, 0xe7, 0x20, 0xcf, + 0xe4, 0x73, 0xae, 0x49, 0x7d, 0x64, 0x0f, 0x0e, + 0x28, 0x46, 0xa9, 0xa8, 0x32, 0xe4, 0x0e, 0xf6, + 0x51, 0x53, 0xb8, 0x3c, 0xb1, 0xff, 0xa3, 0x33, + 0x41, 0x75, 0xff, 0xf1, 0x6f, 0xf1, 0xfb, 0xbb, + 0x83, 0x7f, 0x06, 0x9b, 0xe7, 0x1b, 0x0a, 0xe0, + 0x5c, 0x33, 0x60, 0x5b, 0xdb, 0x5b, 0xed, 0xfe, + 0xa5, 0x16, 0x19, 0x72, 0xa3, 0x64, 0x23, 0x00, + 0x02, 0xc7, 0xf3, 0x6a, 0x81, 0x3e, 0x44, 0x1d, + 0x79, 0x15, 0x5f, 0x9a, 0xde, 0xe2, 0xfd, 0x1b, + 0x73, 0xc1, 0xbc, 0x23, 0xba, 0x31, 0xd2, 0x50, + 0xd5, 0xad, 0x7f, 0x74, 0xa7, 0xc9, 0xf8, 0x3e, + 0x2b, 0x26, 0x10, 0xf6, 0x03, 0x36, 0x74, 0xe4, + 0x0e, 0x6a, 0x72, 0xb7, 0x73, 0x0a, 0x42, 0x28, + 0xc2, 0xad, 0x5e, 0x03, 0xbe, 0xb8, 0x0b, 0xa8, + 0x5b, 0xd4, 0xb8, 0xba, 0x52, 0x89, 0xb1, 0x9b, + 0xc1, 0xc3, 0x65, 0x87, 0xed, 0xa5, 0xf4, 0x86, + 0xfd, 0x41, 0x80, 0x91, 0x27, 0x59, 0x53, 0x67, + 0x15, 0x78, 0x54, 0x8b, 0x2d, 0x3d, 0xc7, 0xff, + 0x02, 0x92, 0x07, 0x5f, 0x7a, 0x4b, 0x60, 0x59, + 0x3c, 0x6f, 0x5c, 0xd8, 0xec, 0x95, 0xd2, 0xfe, + 0xa0, 0x3b, 0xd8, 0x3f, 0xd1, 0x69, 0xa6, 0xd6, + 0x41, 0xb2, 0xf4, 0x4d, 0x12, 0xf4, 0x58, 0x3e, + 0x66, 0x64, 0x80, 0x31, 0x9b, 0xa8, 0x4c, 0x8b, + 0x07, 0xb2, 0xec, 0x66, 0x94, 0x66, 0x47, 0x50, + 0x50, 0x5f, 0x18, 0x0b, 0x0e, 0xd6, 0xc0, 0x39, + 0x21, 0x13, 0x9e, 0x33, 0xbc, 0x79, 0x36, 0x02, + 0x96, 0x70, 0xf0, 0x48, 0x67, 0x2f, 0x26, 0xe9, + 0x6d, 0x10, 0xbb, 0xd6, 0x3f, 0xd1, 0x64, 0x7a, + 0x2e, 0xbe, 0x0c, 0x61, 0xf0, 0x75, 0x42, 0x38, + 0x23, 0xb1, 0x9e, 0x9f, 0x7c, 0x67, 0x66, 0xd9, + 0x58, 0x9a, 0xf1, 0xbb, 0x41, 0x2a, 0x8d, 0x65, + 0x84, 0x94, 0xfc, 0xdc, 0x6a, 0x50, 0x64, 0xdb, + 0x56, 0x33, 0x76, 0x00, 0x10, 0xed, 0xbe, 0xd2, + 0x12, 0xf6, 0xf6, 0x1b, 0xa2, 0x16, 0xde, 0xae, + 0x31, 0x95, 0xdd, 0xb1, 0x08, 0x7e, 0x4e, 0xee, + 0xe7, 0xf9, 0xa5, 0xfb, 0x5b, 0x61, 0x43, 0x00, + 0x40, 0xf6, 0x7e, 0x02, 0x04, 0x32, 0x4e, 0x0c, + 0xe2, 0x66, 0x0d, 0xd7, 0x07, 0x98, 0x0e, 0xf8, + 0x72, 0x34, 0x6d, 0x95, 0x86, 0xd7, 0xcb, 0x31, + 0x54, 0x47, 0xd0, 0x38, 0x29, 0x9c, 0x5a, 0x68, + 0xd4, 0x87, 0x76, 0xc9, 0xe7, 0x7e, 0xe3, 0xf4, + 0x81, 0x6d, 0x18, 0xcb, 0xc9, 0x05, 0xaf, 0xa0, + 0xfb, 0x66, 0xf7, 0xf1, 0x1c, 0xc6, 0x14, 0x11, + 0x4f, 0x2b, 0x79, 0x42, 0x8b, 0xbc, 0xac, 0xe7, + 0x6c, 0xfe, 0x0f, 0x58, 0xe7, 0x7c, 0x78, 0x39, + 0x30, 0xb0, 0x66, 0x2c, 0x9b, 0x6d, 0x3a, 0xe1, + 0xcf, 0xc9, 0xa4, 0x0e, 0x6d, 0x6d, 0x8a, 0xa1, + 0x3a, 0xe7, 0x28, 0xd4, 0x78, 0x4c, 0xa6, 0xa2, + 0x2a, 0xa6, 0x03, 0x30, 0xd7, 0xa8, 0x25, 0x66, + 0x87, 0x2f, 0x69, 0x5c, 0x4e, 0xdd, 0xa5, 0x49, + 0x5d, 0x37, 0x4a, 0x59, 0xc4, 0xaf, 0x1f, 0xa2, + 0xe4, 0xf8, 0xa6, 0x12, 0x97, 0xd5, 0x79, 0xf5, + 0xe2, 0x4a, 0x2b, 0x5f, 0x61, 0xe4, 0x9e, 0xe3, + 0xee, 0xb8, 0xa7, 0x5b, 0x2f, 0xf4, 0x9e, 0x6c, + 0xfb, 0xd1, 0xc6, 0x56, 0x77, 0xba, 0x75, 0xaa, + 0x3d, 0x1a, 0xa8, 0x0b, 0xb3, 0x68, 0x24, 0x00, + 0x10, 0x7f, 0xfd, 0xd7, 0xa1, 0x8d, 0x83, 0x54, + 0x4f, 0x1f, 0xd8, 0x2a, 0xbe, 0x8a, 0x0c, 0x87, + 0xab, 0xa2, 0xde, 0xc3, 0x39, 0xbf, 0x09, 0x03, + 0xa5, 0xf3, 0x05, 0x28, 0xe1, 0xe1, 0xee, 0x39, + 0x70, 0x9c, 0xd8, 0x81, 0x12, 0x1e, 0x02, 0x40, + 0xd2, 0x6e, 0xf0, 0xeb, 0x1b, 0x3d, 0x22, 0xc6, + 0xe5, 0xe3, 0xb4, 0x5a, 0x98, 0xbb, 0xf0, 0x22, + 0x28, 0x8d, 0xe5, 0xd3, 0x16, 0x48, 0x24, 0xa5, + 0xe6, 0x66, 0x0c, 0xf9, 0x08, 0xf9, 0x7e, 0x1e, + 0xe1, 0x28, 0x26, 0x22, 0xc7, 0xc7, 0x0a, 0x32, + 0x47, 0xfa, 0xa3, 0xbe, 0x3c, 0xc4, 0xc5, 0x53, + 0x0a, 0xd5, 0x94, 0x4a, 0xd7, 0x93, 0xd8, 0x42, + 0x99, 0xb9, 0x0a, 0xdb, 0x56, 0xf7, 0xb9, 0x1c, + 0x53, 0x4f, 0xfa, 0xd3, 0x74, 0xad, 0xd9, 0x68, + 0xf1, 0x1b, 0xdf, 0x61, 0xc6, 0x5e, 0xa8, 0x48, + 0xfc, 0xd4, 0x4a, 0x4c, 0x3c, 0x32, 0xf7, 0x1c, + 0x96, 0x21, 0x9b, 0xf9, 0xa3, 0xcc, 0x5a, 0xce, + 0xd5, 0xd7, 0x08, 0x24, 0xf6, 0x1c, 0xfd, 0xdd, + 0x38, 0xc2, 0x32, 0xe9, 0xb8, 0xe7, 0xb6, 0xfa, + 0x9d, 0x45, 0x13, 0x2c, 0x83, 0xfd, 0x4a, 0x69, + 0x82, 0xcd, 0xdc, 0xb3, 0x76, 0x0c, 0x9e, 0xd8, + 0xf4, 0x1b, 0x45, 0x15, 0xb4, 0x97, 0xe7, 0x58, + 0x34, 0xe2, 0x03, 0x29, 0x5a, 0xbf, 0xb6, 0xe0, + 0x5d, 0x13, 0xd9, 0x2b, 0xb4, 0x80, 0xb2, 0x45, + 0x81, 0x6a, 0x2e, 0x6c, 0x89, 0x7d, 0xee, 0xbb, + 0x52, 0xdd, 0x1f, 0x18, 0xe7, 0x13, 0x6b, 0x33, + 0x0e, 0xea, 0x36, 0x92, 0x77, 0x7b, 0x6d, 0x9c, + 0x5a, 0x5f, 0x45, 0x7b, 0x7b, 0x35, 0x62, 0x23, + 0xd1, 0xbf, 0x0f, 0xd0, 0x08, 0x1b, 0x2b, 0x80, + 0x6b, 0x7e, 0xf1, 0x21, 0x47, 0xb0, 0x57, 0xd1, + 0x98, 0x72, 0x90, 0x34, 0x1c, 0x20, 0x04, 0xff, + 0x3d, 0x5c, 0xee, 0x0e, 0x57, 0x5f, 0x6f, 0x24, + 0x4e, 0x3c, 0xea, 0xfc, 0xa5, 0xa9, 0x83, 0xc9, + 0x61, 0xb4, 0x51, 0x24, 0xf8, 0x27, 0x5e, 0x46, + 0x8c, 0xb1, 0x53, 0x02, 0x96, 0x35, 0xba, 0xb8, + 0x4c, 0x71, 0xd3, 0x15, 0x59, 0x35, 0x22, 0x20, + 0xad, 0x03, 0x9f, 0x66, 0x44, 0x3b, 0x9c, 0x35, + 0x37, 0x1f, 0x9b, 0xbb, 0xf3, 0xdb, 0x35, 0x63, + 0x30, 0x64, 0xaa, 0xa2, 0x06, 0xa8, 0x5d, 0xbb, + 0xe1, 0x9f, 0x70, 0xec, 0x82, 0x11, 0x06, 0x36, + 0xec, 0x8b, 0x69, 0x66, 0x24, 0x44, 0xc9, 0x4a, + 0x57, 0xbb, 0x9b, 0x78, 0x13, 0xce, 0x9c, 0x0c, + 0xba, 0x92, 0x93, 0x63, 0xb8, 0xe2, 0x95, 0x0f, + 0x0f, 0x16, 0x39, 0x52, 0xfd, 0x3a, 0x6d, 0x02, + 0x4b, 0xdf, 0x13, 0xd3, 0x2a, 0x22, 0xb4, 0x03, + 0x7c, 0x54, 0x49, 0x96, 0x68, 0x54, 0x10, 0xfa, + 0xef, 0xaa, 0x6c, 0xe8, 0x22, 0xdc, 0x71, 0x16, + 0x13, 0x1a, 0xf6, 0x28, 0xe5, 0x6d, 0x77, 0x3d, + 0xcd, 0x30, 0x63, 0xb1, 0x70, 0x52, 0xa1, 0xc5, + 0x94, 0x5f, 0xcf, 0xe8, 0xb8, 0x26, 0x98, 0xf7, + 0x06, 0xa0, 0x0a, 0x70, 0xfa, 0x03, 0x80, 0xac, + 0xc1, 0xec, 0xd6, 0x4c, 0x54, 0xd7, 0xfe, 0x47, + 0xb6, 0x88, 0x4a, 0xf7, 0x71, 0x24, 0xee, 0xf3, + 0xd2, 0xc2, 0x4a, 0x7f, 0xfe, 0x61, 0xc7, 0x35, + 0xc9, 0x37, 0x67, 0xcb, 0x24, 0x35, 0xda, 0x7e, + 0xca, 0x5f, 0xf3, 0x8d, 0xd4, 0x13, 0x8e, 0xd6, + 0xcb, 0x4d, 0x53, 0x8f, 0x53, 0x1f, 0xc0, 0x74, + 0xf7, 0x53, 0xb9, 0x5e, 0x23, 0x37, 0xba, 0x6e, + 0xe3, 0x9d, 0x07, 0x55, 0x25, 0x7b, 0xe6, 0x2a, + 0x64, 0xd1, 0x32, 0xdd, 0x54, 0x1b, 0x4b, 0xc0, + 0xe1, 0xd7, 0x69, 0x58, 0xf8, 0x93, 0x29, 0xc4, + 0xdd, 0x23, 0x2f, 0xa5, 0xfc, 0x9d, 0x7e, 0xf8, + 0xd4, 0x90, 0xcd, 0x82, 0x55, 0xdc, 0x16, 0x16, + 0x9f, 0x07, 0x52, 0x9b, 0x9d, 0x25, 0xed, 0x32, + 0xc5, 0x7b, 0xdf, 0xf6, 0x83, 0x46, 0x3d, 0x65, + 0xb7, 0xef, 0x87, 0x7a, 0x12, 0x69, 0x8f, 0x06, + 0x7c, 0x51, 0x15, 0x4a, 0x08, 0xe8, 0xac, 0x9a, + 0x0c, 0x24, 0xa7, 0x27, 0xd8, 0x46, 0x2f, 0xe7, + 0x01, 0x0e, 0x1c, 0xc6, 0x91, 0xb0, 0x6e, 0x85, + 0x65, 0xf0, 0x29, 0x0d, 0x2e, 0x6b, 0x3b, 0xfb, + 0x4b, 0xdf, 0xe4, 0x80, 0x93, 0x03, 0x66, 0x46, + 0x3e, 0x8a, 0x6e, 0xf3, 0x5e, 0x4d, 0x62, 0x0e, + 0x49, 0x05, 0xaf, 0xd4, 0xf8, 0x21, 0x20, 0x61, + 0x1d, 0x39, 0x17, 0xf4, 0x61, 0x47, 0x95, 0xfb, + 0x15, 0x2e, 0xb3, 0x4f, 0xd0, 0x5d, 0xf5, 0x7d, + 0x40, 0xda, 0x90, 0x3c, 0x6b, 0xcb, 0x17, 0x00, + 0x13, 0x3b, 0x64, 0x34, 0x1b, 0xf0, 0xf2, 0xe5, + 0x3b, 0xb2, 0xc7, 0xd3, 0x5f, 0x3a, 0x44, 0xa6, + 0x9b, 0xb7, 0x78, 0x0e, 0x42, 0x5d, 0x4c, 0xc1, + 0xe9, 0xd2, 0xcb, 0xb7, 0x78, 0xd1, 0xfe, 0x9a, + 0xb5, 0x07, 0xe9, 0xe0, 0xbe, 0xe2, 0x8a, 0xa7, + 0x01, 0x83, 0x00, 0x8c, 0x5c, 0x08, 0xe6, 0x63, + 0x12, 0x92, 0xb7, 0xb7, 0xa6, 0x19, 0x7d, 0x38, + 0x13, 0x38, 0x92, 0x87, 0x24, 0xf9, 0x48, 0xb3, + 0x5e, 0x87, 0x6a, 0x40, 0x39, 0x5c, 0x3f, 0xed, + 0x8f, 0xee, 0xdb, 0x15, 0x82, 0x06, 0xda, 0x49, + 0x21, 0x2b, 0xb5, 0xbf, 0x32, 0x7c, 0x9f, 0x42, + 0x28, 0x63, 0xcf, 0xaf, 0x1e, 0xf8, 0xc6, 0xa0, + 0xd1, 0x02, 0x43, 0x57, 0x62, 0xec, 0x9b, 0x0f, + 0x01, 0x9e, 0x71, 0xd8, 0x87, 0x9d, 0x01, 0xc1, + 0x58, 0x77, 0xd9, 0xaf, 0xb1, 0x10, 0x7e, 0xdd, + 0xa6, 0x50, 0x96, 0xe5, 0xf0, 0x72, 0x00, 0x6d, + 0x4b, 0xf8, 0x2a, 0x8f, 0x19, 0xf3, 0x22, 0x88, + 0x11, 0x4a, 0x8b, 0x7c, 0xfd, 0xb7, 0xed, 0xe1, + 0xf6, 0x40, 0x39, 0xe0, 0xe9, 0xf6, 0x3d, 0x25, + 0xe6, 0x74, 0x3c, 0x58, 0x57, 0x7f, 0xe1, 0x22, + 0x96, 0x47, 0x31, 0x91, 0xba, 0x70, 0x85, 0x28, + 0x6b, 0x9f, 0x6e, 0x25, 0xac, 0x23, 0x66, 0x2f, + 0x29, 0x88, 0x28, 0xce, 0x8c, 0x5c, 0x88, 0x53, + 0xd1, 0x3b, 0xcc, 0x6a, 0x51, 0xb2, 0xe1, 0x28, + 0x3f, 0x91, 0xb4, 0x0d, 0x00, 0x3a, 0xe3, 0xf8, + 0xc3, 0x8f, 0xd7, 0x96, 0x62, 0x0e, 0x2e, 0xfc, + 0xc8, 0x6c, 0x77, 0xa6, 0x1d, 0x22, 0xc1, 0xb8, + 0xe6, 0x61, 0xd7, 0x67, 0x36, 0x13, 0x7b, 0xbb, + 0x9b, 0x59, 0x09, 0xa6, 0xdf, 0xf7, 0x6b, 0xa3, + 0x40, 0x1a, 0xf5, 0x4f, 0xb4, 0xda, 0xd3, 0xf3, + 0x81, 0x93, 0xc6, 0x18, 0xd9, 0x26, 0xee, 0xac, + 0xf0, 0xaa, 0xdf, 0xc5, 0x9c, 0xca, 0xc2, 0xa2, + 0xcc, 0x7b, 0x5c, 0x24, 0xb0, 0xbc, 0xd0, 0x6a, + 0x4d, 0x89, 0x09, 0xb8, 0x07, 0xfe, 0x87, 0xad, + 0x0a, 0xea, 0xb8, 0x42, 0xf9, 0x5e, 0xb3, 0x3e, + 0x36, 0x4c, 0xaf, 0x75, 0x9e, 0x1c, 0xeb, 0xbd, + 0xbc, 0xbb, 0x80, 0x40, 0xa7, 0x3a, 0x30, 0xbf, + 0xa8, 0x44, 0xf4, 0xeb, 0x38, 0xad, 0x29, 0xba, + 0x23, 0xed, 0x41, 0x0c, 0xea, 0xd2, 0xbb, 0x41, + 0x18, 0xd6, 0xb9, 0xba, 0x65, 0x2b, 0xa3, 0x91, + 0x6d, 0x1f, 0xa9, 0xf4, 0xd1, 0x25, 0x8d, 0x4d, + 0x38, 0xff, 0x64, 0xa0, 0xec, 0xde, 0xa6, 0xb6, + 0x79, 0xab, 0x8e, 0x33, 0x6c, 0x47, 0xde, 0xaf, + 0x94, 0xa4, 0xa5, 0x86, 0x77, 0x55, 0x09, 0x92, + 0x81, 0x31, 0x76, 0xc7, 0x34, 0x22, 0x89, 0x8e, + 0x3d, 0x26, 0x26, 0xd7, 0xfc, 0x1e, 0x16, 0x72, + 0x13, 0x33, 0x63, 0xd5, 0x22, 0xbe, 0xb8, 0x04, + 0x34, 0x84, 0x41, 0xbb, 0x80, 0xd0, 0x9f, 0x46, + 0x48, 0x07, 0xa7, 0xfc, 0x2b, 0x3a, 0x75, 0x55, + 0x8c, 0xc7, 0x6a, 0xbd, 0x7e, 0x46, 0x08, 0x84, + 0x0f, 0xd5, 0x74, 0xc0, 0x82, 0x8e, 0xaa, 0x61, + 0x05, 0x01, 0xb2, 0x47, 0x6e, 0x20, 0x6a, 0x2d, + 0x58, 0x70, 0x48, 0x32, 0xa7, 0x37, 0xd2, 0xb8, + 0x82, 0x1a, 0x51, 0xb9, 0x61, 0xdd, 0xfd, 0x9d, + 0x6b, 0x0e, 0x18, 0x97, 0xf8, 0x45, 0x5f, 0x87, + 0x10, 0xcf, 0x34, 0x72, 0x45, 0x26, 0x49, 0x70, + 0xe7, 0xa3, 0x78, 0xe0, 0x52, 0x89, 0x84, 0x94, + 0x83, 0x82, 0xc2, 0x69, 0x8f, 0xe3, 0xe1, 0x3f, + 0x60, 0x74, 0x88, 0xc4, 0xf7, 0x75, 0x2c, 0xfb, + 0xbd, 0xb6, 0xc4, 0x7e, 0x10, 0x0a, 0x6c, 0x90, + 0x04, 0x9e, 0xc3, 0x3f, 0x59, 0x7c, 0xce, 0x31, + 0x18, 0x60, 0x57, 0x73, 0x46, 0x94, 0x7d, 0x06, + 0xa0, 0x6d, 0x44, 0xec, 0xa2, 0x0a, 0x9e, 0x05, + 0x15, 0xef, 0xca, 0x5c, 0xbf, 0x00, 0xeb, 0xf7, + 0x3d, 0x32, 0xd4, 0xa5, 0xef, 0x49, 0x89, 0x5e, + 0x46, 0xb0, 0xa6, 0x63, 0x5b, 0x8a, 0x73, 0xae, + 0x6f, 0xd5, 0x9d, 0xf8, 0x4f, 0x40, 0xb5, 0xb2, + 0x6e, 0xd3, 0xb6, 0x01, 0xa9, 0x26, 0xa2, 0x21, + 0xcf, 0x33, 0x7a, 0x3a, 0xa4, 0x23, 0x13, 0xb0, + 0x69, 0x6a, 0xee, 0xce, 0xd8, 0x9d, 0x01, 0x1d, + 0x50, 0xc1, 0x30, 0x6c, 0xb1, 0xcd, 0xa0, 0xf0, + 0xf0, 0xa2, 0x64, 0x6f, 0xbb, 0xbf, 0x5e, 0xe6, + 0xab, 0x87, 0xb4, 0x0f, 0x4f, 0x15, 0xaf, 0xb5, + 0x25, 0xa1, 0xb2, 0xd0, 0x80, 0x2c, 0xfb, 0xf9, + 0xfe, 0xd2, 0x33, 0xbb, 0x76, 0xfe, 0x7c, 0xa8, + 0x66, 0xf7, 0xe7, 0x85, 0x9f, 0x1f, 0x85, 0x57, + 0x88, 0xe1, 0xe9, 0x63, 0xe4, 0xd8, 0x1c, 0xa1, + 0xfb, 0xda, 0x44, 0x05, 0x2e, 0x1d, 0x3a, 0x1c, + 0xff, 0xc8, 0x3b, 0xc0, 0xfe, 0xda, 0x22, 0x0b, + 0x43, 0xd6, 0x88, 0x39, 0x4c, 0x4a, 0xa6, 0x69, + 0x18, 0x93, 0x42, 0x4e, 0xb5, 0xcc, 0x66, 0x0d, + 0x09, 0xf8, 0x1e, 0x7c, 0xd3, 0x3c, 0x99, 0x0d, + 0x50, 0x1d, 0x62, 0xe9, 0x57, 0x06, 0xbf, 0x19, + 0x88, 0xdd, 0xad, 0x7b, 0x4f, 0xf9, 0xc7, 0x82, + 0x6d, 0x8d, 0xc8, 0xc4, 0xc5, 0x78, 0x17, 0x20, + 0x15, 0xc5, 0x52, 0x41, 0xcf, 0x5b, 0xd6, 0x7f, + 0x94, 0x02, 0x41, 0xe0, 0x40, 0x22, 0x03, 0x5e, + 0xd1, 0x53, 0xd4, 0x86, 0xd3, 0x2c, 0x9f, 0x0f, + 0x96, 0xe3, 0x6b, 0x9a, 0x76, 0x32, 0x06, 0x47, + 0x4b, 0x11, 0xb3, 0xdd, 0x03, 0x65, 0xbd, 0x9b, + 0x01, 0xda, 0x9c, 0xb9, 0x7e, 0x3f, 0x6a, 0xc4, + 0x7b, 0xea, 0xd4, 0x3c, 0xb9, 0xfb, 0x5c, 0x6b, + 0x64, 0x33, 0x52, 0xba, 0x64, 0x78, 0x8f, 0xa4, + 0xaf, 0x7a, 0x61, 0x8d, 0xbc, 0xc5, 0x73, 0xe9, + 0x6b, 0x58, 0x97, 0x4b, 0xbf, 0x63, 0x22, 0xd3, + 0x37, 0x02, 0x54, 0xc5, 0xb9, 0x16, 0x4a, 0xf0, + 0x19, 0xd8, 0x94, 0x57, 0xb8, 0x8a, 0xb3, 0x16, + 0x3b, 0xd0, 0x84, 0x8e, 0x67, 0xa6, 0xa3, 0x7d, + 0x78, 0xec, 0x00 +}; +static const u8 dec_assoc013[] __initconst = { + 0xb1, 0x69, 0x83, 0x87, 0x30, 0xaa, 0x5d, 0xb8, + 0x77, 0xe8, 0x21, 0xff, 0x06, 0x59, 0x35, 0xce, + 0x75, 0xfe, 0x38, 0xef, 0xb8, 0x91, 0x43, 0x8c, + 0xcf, 0x70, 0xdd, 0x0a, 0x68, 0xbf, 0xd4, 0xbc, + 0x16, 0x76, 0x99, 0x36, 0x1e, 0x58, 0x79, 0x5e, + 0xd4, 0x29, 0xf7, 0x33, 0x93, 0x48, 0xdb, 0x5f, + 0x01, 0xae, 0x9c, 0xb6, 0xe4, 0x88, 0x6d, 0x2b, + 0x76, 0x75, 0xe0, 0xf3, 0x74, 0xe2, 0xc9 +}; +static const u8 dec_nonce013[] __initconst = { + 0x05, 0xa3, 0x93, 0xed, 0x30, 0xc5, 0xa2, 0x06 +}; +static const u8 dec_key013[] __initconst = { + 0xb3, 0x35, 0x50, 0x03, 0x54, 0x2e, 0x40, 0x5e, + 0x8f, 0x59, 0x8e, 0xc5, 0x90, 0xd5, 0x27, 0x2d, + 0xba, 0x29, 0x2e, 0xcb, 0x1b, 0x70, 0x44, 0x1e, + 0x65, 0x91, 0x6e, 0x2a, 0x79, 0x22, 0xda, 0x64 +}; + +static const struct chacha20poly1305_testvec +chacha20poly1305_dec_vectors[] __initconst = { + { dec_input001, dec_output001, dec_assoc001, dec_nonce001, dec_key001, + sizeof(dec_input001), sizeof(dec_assoc001), sizeof(dec_nonce001) }, + { dec_input002, dec_output002, dec_assoc002, dec_nonce002, dec_key002, + sizeof(dec_input002), sizeof(dec_assoc002), sizeof(dec_nonce002) }, + { dec_input003, dec_output003, dec_assoc003, dec_nonce003, dec_key003, + sizeof(dec_input003), sizeof(dec_assoc003), sizeof(dec_nonce003) }, + { dec_input004, dec_output004, dec_assoc004, dec_nonce004, dec_key004, + sizeof(dec_input004), sizeof(dec_assoc004), sizeof(dec_nonce004) }, + { dec_input005, dec_output005, dec_assoc005, dec_nonce005, dec_key005, + sizeof(dec_input005), sizeof(dec_assoc005), sizeof(dec_nonce005) }, + { dec_input006, dec_output006, dec_assoc006, dec_nonce006, dec_key006, + sizeof(dec_input006), sizeof(dec_assoc006), sizeof(dec_nonce006) }, + { dec_input007, dec_output007, dec_assoc007, dec_nonce007, dec_key007, + sizeof(dec_input007), sizeof(dec_assoc007), sizeof(dec_nonce007) }, + { dec_input008, dec_output008, dec_assoc008, dec_nonce008, dec_key008, + sizeof(dec_input008), sizeof(dec_assoc008), sizeof(dec_nonce008) }, + { dec_input009, dec_output009, dec_assoc009, dec_nonce009, dec_key009, + sizeof(dec_input009), sizeof(dec_assoc009), sizeof(dec_nonce009) }, + { dec_input010, dec_output010, dec_assoc010, dec_nonce010, dec_key010, + sizeof(dec_input010), sizeof(dec_assoc010), sizeof(dec_nonce010) }, + { dec_input011, dec_output011, dec_assoc011, dec_nonce011, dec_key011, + sizeof(dec_input011), sizeof(dec_assoc011), sizeof(dec_nonce011) }, + { dec_input012, dec_output012, dec_assoc012, dec_nonce012, dec_key012, + sizeof(dec_input012), sizeof(dec_assoc012), sizeof(dec_nonce012) }, + { dec_input013, dec_output013, dec_assoc013, dec_nonce013, dec_key013, + sizeof(dec_input013), sizeof(dec_assoc013), sizeof(dec_nonce013), + true } +}; + +static const u8 xenc_input001[] __initconst = { + 0x49, 0x6e, 0x74, 0x65, 0x72, 0x6e, 0x65, 0x74, + 0x2d, 0x44, 0x72, 0x61, 0x66, 0x74, 0x73, 0x20, + 0x61, 0x72, 0x65, 0x20, 0x64, 0x72, 0x61, 0x66, + 0x74, 0x20, 0x64, 0x6f, 0x63, 0x75, 0x6d, 0x65, + 0x6e, 0x74, 0x73, 0x20, 0x76, 0x61, 0x6c, 0x69, + 0x64, 0x20, 0x66, 0x6f, 0x72, 0x20, 0x61, 0x20, + 0x6d, 0x61, 0x78, 0x69, 0x6d, 0x75, 0x6d, 0x20, + 0x6f, 0x66, 0x20, 0x73, 0x69, 0x78, 0x20, 0x6d, + 0x6f, 0x6e, 0x74, 0x68, 0x73, 0x20, 0x61, 0x6e, + 0x64, 0x20, 0x6d, 0x61, 0x79, 0x20, 0x62, 0x65, + 0x20, 0x75, 0x70, 0x64, 0x61, 0x74, 0x65, 0x64, + 0x2c, 0x20, 0x72, 0x65, 0x70, 0x6c, 0x61, 0x63, + 0x65, 0x64, 0x2c, 0x20, 0x6f, 0x72, 0x20, 0x6f, + 0x62, 0x73, 0x6f, 0x6c, 0x65, 0x74, 0x65, 0x64, + 0x20, 0x62, 0x79, 0x20, 0x6f, 0x74, 0x68, 0x65, + 0x72, 0x20, 0x64, 0x6f, 0x63, 0x75, 0x6d, 0x65, + 0x6e, 0x74, 0x73, 0x20, 0x61, 0x74, 0x20, 0x61, + 0x6e, 0x79, 0x20, 0x74, 0x69, 0x6d, 0x65, 0x2e, + 0x20, 0x49, 0x74, 0x20, 0x69, 0x73, 0x20, 0x69, + 0x6e, 0x61, 0x70, 0x70, 0x72, 0x6f, 0x70, 0x72, + 0x69, 0x61, 0x74, 0x65, 0x20, 0x74, 0x6f, 0x20, + 0x75, 0x73, 0x65, 0x20, 0x49, 0x6e, 0x74, 0x65, + 0x72, 0x6e, 0x65, 0x74, 0x2d, 0x44, 0x72, 0x61, + 0x66, 0x74, 0x73, 0x20, 0x61, 0x73, 0x20, 0x72, + 0x65, 0x66, 0x65, 0x72, 0x65, 0x6e, 0x63, 0x65, + 0x20, 0x6d, 0x61, 0x74, 0x65, 0x72, 0x69, 0x61, + 0x6c, 0x20, 0x6f, 0x72, 0x20, 0x74, 0x6f, 0x20, + 0x63, 0x69, 0x74, 0x65, 0x20, 0x74, 0x68, 0x65, + 0x6d, 0x20, 0x6f, 0x74, 0x68, 0x65, 0x72, 0x20, + 0x74, 0x68, 0x61, 0x6e, 0x20, 0x61, 0x73, 0x20, + 0x2f, 0xe2, 0x80, 0x9c, 0x77, 0x6f, 0x72, 0x6b, + 0x20, 0x69, 0x6e, 0x20, 0x70, 0x72, 0x6f, 0x67, + 0x72, 0x65, 0x73, 0x73, 0x2e, 0x2f, 0xe2, 0x80, + 0x9d +}; +static const u8 xenc_output001[] __initconst = { + 0x1a, 0x6e, 0x3a, 0xd9, 0xfd, 0x41, 0x3f, 0x77, + 0x54, 0x72, 0x0a, 0x70, 0x9a, 0xa0, 0x29, 0x92, + 0x2e, 0xed, 0x93, 0xcf, 0x0f, 0x71, 0x88, 0x18, + 0x7a, 0x9d, 0x2d, 0x24, 0xe0, 0xf5, 0xea, 0x3d, + 0x55, 0x64, 0xd7, 0xad, 0x2a, 0x1a, 0x1f, 0x7e, + 0x86, 0x6d, 0xb0, 0xce, 0x80, 0x41, 0x72, 0x86, + 0x26, 0xee, 0x84, 0xd7, 0xef, 0x82, 0x9e, 0xe2, + 0x60, 0x9d, 0x5a, 0xfc, 0xf0, 0xe4, 0x19, 0x85, + 0xea, 0x09, 0xc6, 0xfb, 0xb3, 0xa9, 0x50, 0x09, + 0xec, 0x5e, 0x11, 0x90, 0xa1, 0xc5, 0x4e, 0x49, + 0xef, 0x50, 0xd8, 0x8f, 0xe0, 0x78, 0xd7, 0xfd, + 0xb9, 0x3b, 0xc9, 0xf2, 0x91, 0xc8, 0x25, 0xc8, + 0xa7, 0x63, 0x60, 0xce, 0x10, 0xcd, 0xc6, 0x7f, + 0xf8, 0x16, 0xf8, 0xe1, 0x0a, 0xd9, 0xde, 0x79, + 0x50, 0x33, 0xf2, 0x16, 0x0f, 0x17, 0xba, 0xb8, + 0x5d, 0xd8, 0xdf, 0x4e, 0x51, 0xa8, 0x39, 0xd0, + 0x85, 0xca, 0x46, 0x6a, 0x10, 0xa7, 0xa3, 0x88, + 0xef, 0x79, 0xb9, 0xf8, 0x24, 0xf3, 0xe0, 0x71, + 0x7b, 0x76, 0x28, 0x46, 0x3a, 0x3a, 0x1b, 0x91, + 0xb6, 0xd4, 0x3e, 0x23, 0xe5, 0x44, 0x15, 0xbf, + 0x60, 0x43, 0x9d, 0xa4, 0xbb, 0xd5, 0x5f, 0x89, + 0xeb, 0xef, 0x8e, 0xfd, 0xdd, 0xb4, 0x0d, 0x46, + 0xf0, 0x69, 0x23, 0x63, 0xae, 0x94, 0xf5, 0x5e, + 0xa5, 0xad, 0x13, 0x1c, 0x41, 0x76, 0xe6, 0x90, + 0xd6, 0x6d, 0xa2, 0x8f, 0x97, 0x4c, 0xa8, 0x0b, + 0xcf, 0x8d, 0x43, 0x2b, 0x9c, 0x9b, 0xc5, 0x58, + 0xa5, 0xb6, 0x95, 0x9a, 0xbf, 0x81, 0xc6, 0x54, + 0xc9, 0x66, 0x0c, 0xe5, 0x4f, 0x6a, 0x53, 0xa1, + 0xe5, 0x0c, 0xba, 0x31, 0xde, 0x34, 0x64, 0x73, + 0x8a, 0x3b, 0xbd, 0x92, 0x01, 0xdb, 0x71, 0x69, + 0xf3, 0x58, 0x99, 0xbc, 0xd1, 0xcb, 0x4a, 0x05, + 0xe2, 0x58, 0x9c, 0x25, 0x17, 0xcd, 0xdc, 0x83, + 0xb7, 0xff, 0xfb, 0x09, 0x61, 0xad, 0xbf, 0x13, + 0x5b, 0x5e, 0xed, 0x46, 0x82, 0x6f, 0x22, 0xd8, + 0x93, 0xa6, 0x85, 0x5b, 0x40, 0x39, 0x5c, 0xc5, + 0x9c +}; +static const u8 xenc_assoc001[] __initconst = { + 0xf3, 0x33, 0x88, 0x86, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x4e, 0x91 +}; +static const u8 xenc_nonce001[] __initconst = { + 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, + 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, + 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17 +}; +static const u8 xenc_key001[] __initconst = { + 0x1c, 0x92, 0x40, 0xa5, 0xeb, 0x55, 0xd3, 0x8a, + 0xf3, 0x33, 0x88, 0x86, 0x04, 0xf6, 0xb5, 0xf0, + 0x47, 0x39, 0x17, 0xc1, 0x40, 0x2b, 0x80, 0x09, + 0x9d, 0xca, 0x5c, 0xbc, 0x20, 0x70, 0x75, 0xc0 +}; + +static const struct chacha20poly1305_testvec +xchacha20poly1305_enc_vectors[] __initconst = { + { xenc_input001, xenc_output001, xenc_assoc001, xenc_nonce001, xenc_key001, + sizeof(xenc_input001), sizeof(xenc_assoc001), sizeof(xenc_nonce001) } +}; + +static const u8 xdec_input001[] __initconst = { + 0x1a, 0x6e, 0x3a, 0xd9, 0xfd, 0x41, 0x3f, 0x77, + 0x54, 0x72, 0x0a, 0x70, 0x9a, 0xa0, 0x29, 0x92, + 0x2e, 0xed, 0x93, 0xcf, 0x0f, 0x71, 0x88, 0x18, + 0x7a, 0x9d, 0x2d, 0x24, 0xe0, 0xf5, 0xea, 0x3d, + 0x55, 0x64, 0xd7, 0xad, 0x2a, 0x1a, 0x1f, 0x7e, + 0x86, 0x6d, 0xb0, 0xce, 0x80, 0x41, 0x72, 0x86, + 0x26, 0xee, 0x84, 0xd7, 0xef, 0x82, 0x9e, 0xe2, + 0x60, 0x9d, 0x5a, 0xfc, 0xf0, 0xe4, 0x19, 0x85, + 0xea, 0x09, 0xc6, 0xfb, 0xb3, 0xa9, 0x50, 0x09, + 0xec, 0x5e, 0x11, 0x90, 0xa1, 0xc5, 0x4e, 0x49, + 0xef, 0x50, 0xd8, 0x8f, 0xe0, 0x78, 0xd7, 0xfd, + 0xb9, 0x3b, 0xc9, 0xf2, 0x91, 0xc8, 0x25, 0xc8, + 0xa7, 0x63, 0x60, 0xce, 0x10, 0xcd, 0xc6, 0x7f, + 0xf8, 0x16, 0xf8, 0xe1, 0x0a, 0xd9, 0xde, 0x79, + 0x50, 0x33, 0xf2, 0x16, 0x0f, 0x17, 0xba, 0xb8, + 0x5d, 0xd8, 0xdf, 0x4e, 0x51, 0xa8, 0x39, 0xd0, + 0x85, 0xca, 0x46, 0x6a, 0x10, 0xa7, 0xa3, 0x88, + 0xef, 0x79, 0xb9, 0xf8, 0x24, 0xf3, 0xe0, 0x71, + 0x7b, 0x76, 0x28, 0x46, 0x3a, 0x3a, 0x1b, 0x91, + 0xb6, 0xd4, 0x3e, 0x23, 0xe5, 0x44, 0x15, 0xbf, + 0x60, 0x43, 0x9d, 0xa4, 0xbb, 0xd5, 0x5f, 0x89, + 0xeb, 0xef, 0x8e, 0xfd, 0xdd, 0xb4, 0x0d, 0x46, + 0xf0, 0x69, 0x23, 0x63, 0xae, 0x94, 0xf5, 0x5e, + 0xa5, 0xad, 0x13, 0x1c, 0x41, 0x76, 0xe6, 0x90, + 0xd6, 0x6d, 0xa2, 0x8f, 0x97, 0x4c, 0xa8, 0x0b, + 0xcf, 0x8d, 0x43, 0x2b, 0x9c, 0x9b, 0xc5, 0x58, + 0xa5, 0xb6, 0x95, 0x9a, 0xbf, 0x81, 0xc6, 0x54, + 0xc9, 0x66, 0x0c, 0xe5, 0x4f, 0x6a, 0x53, 0xa1, + 0xe5, 0x0c, 0xba, 0x31, 0xde, 0x34, 0x64, 0x73, + 0x8a, 0x3b, 0xbd, 0x92, 0x01, 0xdb, 0x71, 0x69, + 0xf3, 0x58, 0x99, 0xbc, 0xd1, 0xcb, 0x4a, 0x05, + 0xe2, 0x58, 0x9c, 0x25, 0x17, 0xcd, 0xdc, 0x83, + 0xb7, 0xff, 0xfb, 0x09, 0x61, 0xad, 0xbf, 0x13, + 0x5b, 0x5e, 0xed, 0x46, 0x82, 0x6f, 0x22, 0xd8, + 0x93, 0xa6, 0x85, 0x5b, 0x40, 0x39, 0x5c, 0xc5, + 0x9c +}; +static const u8 xdec_output001[] __initconst = { + 0x49, 0x6e, 0x74, 0x65, 0x72, 0x6e, 0x65, 0x74, + 0x2d, 0x44, 0x72, 0x61, 0x66, 0x74, 0x73, 0x20, + 0x61, 0x72, 0x65, 0x20, 0x64, 0x72, 0x61, 0x66, + 0x74, 0x20, 0x64, 0x6f, 0x63, 0x75, 0x6d, 0x65, + 0x6e, 0x74, 0x73, 0x20, 0x76, 0x61, 0x6c, 0x69, + 0x64, 0x20, 0x66, 0x6f, 0x72, 0x20, 0x61, 0x20, + 0x6d, 0x61, 0x78, 0x69, 0x6d, 0x75, 0x6d, 0x20, + 0x6f, 0x66, 0x20, 0x73, 0x69, 0x78, 0x20, 0x6d, + 0x6f, 0x6e, 0x74, 0x68, 0x73, 0x20, 0x61, 0x6e, + 0x64, 0x20, 0x6d, 0x61, 0x79, 0x20, 0x62, 0x65, + 0x20, 0x75, 0x70, 0x64, 0x61, 0x74, 0x65, 0x64, + 0x2c, 0x20, 0x72, 0x65, 0x70, 0x6c, 0x61, 0x63, + 0x65, 0x64, 0x2c, 0x20, 0x6f, 0x72, 0x20, 0x6f, + 0x62, 0x73, 0x6f, 0x6c, 0x65, 0x74, 0x65, 0x64, + 0x20, 0x62, 0x79, 0x20, 0x6f, 0x74, 0x68, 0x65, + 0x72, 0x20, 0x64, 0x6f, 0x63, 0x75, 0x6d, 0x65, + 0x6e, 0x74, 0x73, 0x20, 0x61, 0x74, 0x20, 0x61, + 0x6e, 0x79, 0x20, 0x74, 0x69, 0x6d, 0x65, 0x2e, + 0x20, 0x49, 0x74, 0x20, 0x69, 0x73, 0x20, 0x69, + 0x6e, 0x61, 0x70, 0x70, 0x72, 0x6f, 0x70, 0x72, + 0x69, 0x61, 0x74, 0x65, 0x20, 0x74, 0x6f, 0x20, + 0x75, 0x73, 0x65, 0x20, 0x49, 0x6e, 0x74, 0x65, + 0x72, 0x6e, 0x65, 0x74, 0x2d, 0x44, 0x72, 0x61, + 0x66, 0x74, 0x73, 0x20, 0x61, 0x73, 0x20, 0x72, + 0x65, 0x66, 0x65, 0x72, 0x65, 0x6e, 0x63, 0x65, + 0x20, 0x6d, 0x61, 0x74, 0x65, 0x72, 0x69, 0x61, + 0x6c, 0x20, 0x6f, 0x72, 0x20, 0x74, 0x6f, 0x20, + 0x63, 0x69, 0x74, 0x65, 0x20, 0x74, 0x68, 0x65, + 0x6d, 0x20, 0x6f, 0x74, 0x68, 0x65, 0x72, 0x20, + 0x74, 0x68, 0x61, 0x6e, 0x20, 0x61, 0x73, 0x20, + 0x2f, 0xe2, 0x80, 0x9c, 0x77, 0x6f, 0x72, 0x6b, + 0x20, 0x69, 0x6e, 0x20, 0x70, 0x72, 0x6f, 0x67, + 0x72, 0x65, 0x73, 0x73, 0x2e, 0x2f, 0xe2, 0x80, + 0x9d +}; +static const u8 xdec_assoc001[] __initconst = { + 0xf3, 0x33, 0x88, 0x86, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x4e, 0x91 +}; +static const u8 xdec_nonce001[] __initconst = { + 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, + 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, + 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17 +}; +static const u8 xdec_key001[] __initconst = { + 0x1c, 0x92, 0x40, 0xa5, 0xeb, 0x55, 0xd3, 0x8a, + 0xf3, 0x33, 0x88, 0x86, 0x04, 0xf6, 0xb5, 0xf0, + 0x47, 0x39, 0x17, 0xc1, 0x40, 0x2b, 0x80, 0x09, + 0x9d, 0xca, 0x5c, 0xbc, 0x20, 0x70, 0x75, 0xc0 +}; + +static const struct chacha20poly1305_testvec +xchacha20poly1305_dec_vectors[] __initconst = { + { xdec_input001, xdec_output001, xdec_assoc001, xdec_nonce001, xdec_key001, + sizeof(xdec_input001), sizeof(xdec_assoc001), sizeof(xdec_nonce001) } +}; + +/* This is for the selftests-only, since it is only useful for the purpose of + * testing the underlying primitives and interactions. + */ +static void __init +chacha20poly1305_encrypt_bignonce(u8 *dst, const u8 *src, const size_t src_len, + const u8 *ad, const size_t ad_len, + const u8 nonce[12], + const u8 key[CHACHA20POLY1305_KEY_SIZE]) +{ + const u8 *pad0 = page_address(ZERO_PAGE(0)); + struct poly1305_desc_ctx poly1305_state; + u32 chacha20_state[CHACHA_STATE_WORDS]; + union { + u8 block0[POLY1305_KEY_SIZE]; + __le64 lens[2]; + } b = {{ 0 }}; + u8 bottom_row[16] = { 0 }; + u32 le_key[8]; + int i; + + memcpy(&bottom_row[4], nonce, 12); + for (i = 0; i < 8; ++i) + le_key[i] = get_unaligned_le32(key + sizeof(le_key[i]) * i); + chacha_init(chacha20_state, le_key, bottom_row); + chacha20_crypt(chacha20_state, b.block0, b.block0, sizeof(b.block0)); + poly1305_init(&poly1305_state, b.block0); + poly1305_update(&poly1305_state, ad, ad_len); + poly1305_update(&poly1305_state, pad0, (0x10 - ad_len) & 0xf); + chacha20_crypt(chacha20_state, dst, src, src_len); + poly1305_update(&poly1305_state, dst, src_len); + poly1305_update(&poly1305_state, pad0, (0x10 - src_len) & 0xf); + b.lens[0] = cpu_to_le64(ad_len); + b.lens[1] = cpu_to_le64(src_len); + poly1305_update(&poly1305_state, (u8 *)b.lens, sizeof(b.lens)); + poly1305_final(&poly1305_state, dst + src_len); +} + +static void __init +chacha20poly1305_selftest_encrypt(u8 *dst, const u8 *src, const size_t src_len, + const u8 *ad, const size_t ad_len, + const u8 *nonce, const size_t nonce_len, + const u8 key[CHACHA20POLY1305_KEY_SIZE]) +{ + if (nonce_len == 8) + chacha20poly1305_encrypt(dst, src, src_len, ad, ad_len, + get_unaligned_le64(nonce), key); + else if (nonce_len == 12) + chacha20poly1305_encrypt_bignonce(dst, src, src_len, ad, + ad_len, nonce, key); + else + BUG(); +} + +static bool __init +decryption_success(bool func_ret, bool expect_failure, int memcmp_result) +{ + if (expect_failure) + return !func_ret; + return func_ret && !memcmp_result; +} + +bool __init chacha20poly1305_selftest(void) +{ + enum { MAXIMUM_TEST_BUFFER_LEN = 1UL << 12 }; + size_t i, j, k, total_len; + u8 *computed_output = NULL, *input = NULL; + bool success = true, ret; + struct scatterlist sg_src[3]; + + computed_output = kmalloc(MAXIMUM_TEST_BUFFER_LEN, GFP_KERNEL); + input = kmalloc(MAXIMUM_TEST_BUFFER_LEN, GFP_KERNEL); + if (!computed_output || !input) { + pr_err("chacha20poly1305 self-test malloc: FAIL\n"); + success = false; + goto out; + } + + for (i = 0; i < ARRAY_SIZE(chacha20poly1305_enc_vectors); ++i) { + memset(computed_output, 0, MAXIMUM_TEST_BUFFER_LEN); + chacha20poly1305_selftest_encrypt(computed_output, + chacha20poly1305_enc_vectors[i].input, + chacha20poly1305_enc_vectors[i].ilen, + chacha20poly1305_enc_vectors[i].assoc, + chacha20poly1305_enc_vectors[i].alen, + chacha20poly1305_enc_vectors[i].nonce, + chacha20poly1305_enc_vectors[i].nlen, + chacha20poly1305_enc_vectors[i].key); + if (memcmp(computed_output, + chacha20poly1305_enc_vectors[i].output, + chacha20poly1305_enc_vectors[i].ilen + + POLY1305_DIGEST_SIZE)) { + pr_err("chacha20poly1305 encryption self-test %zu: FAIL\n", + i + 1); + success = false; + } + } + + for (i = 0; i < ARRAY_SIZE(chacha20poly1305_enc_vectors); ++i) { + if (chacha20poly1305_enc_vectors[i].nlen != 8) + continue; + memcpy(computed_output, chacha20poly1305_enc_vectors[i].input, + chacha20poly1305_enc_vectors[i].ilen); + sg_init_one(sg_src, computed_output, + chacha20poly1305_enc_vectors[i].ilen + POLY1305_DIGEST_SIZE); + ret = chacha20poly1305_encrypt_sg_inplace(sg_src, + chacha20poly1305_enc_vectors[i].ilen, + chacha20poly1305_enc_vectors[i].assoc, + chacha20poly1305_enc_vectors[i].alen, + get_unaligned_le64(chacha20poly1305_enc_vectors[i].nonce), + chacha20poly1305_enc_vectors[i].key); + if (!ret || memcmp(computed_output, + chacha20poly1305_enc_vectors[i].output, + chacha20poly1305_enc_vectors[i].ilen + + POLY1305_DIGEST_SIZE)) { + pr_err("chacha20poly1305 sg encryption self-test %zu: FAIL\n", + i + 1); + success = false; + } + } + + for (i = 0; i < ARRAY_SIZE(chacha20poly1305_dec_vectors); ++i) { + memset(computed_output, 0, MAXIMUM_TEST_BUFFER_LEN); + ret = chacha20poly1305_decrypt(computed_output, + chacha20poly1305_dec_vectors[i].input, + chacha20poly1305_dec_vectors[i].ilen, + chacha20poly1305_dec_vectors[i].assoc, + chacha20poly1305_dec_vectors[i].alen, + get_unaligned_le64(chacha20poly1305_dec_vectors[i].nonce), + chacha20poly1305_dec_vectors[i].key); + if (!decryption_success(ret, + chacha20poly1305_dec_vectors[i].failure, + memcmp(computed_output, + chacha20poly1305_dec_vectors[i].output, + chacha20poly1305_dec_vectors[i].ilen - + POLY1305_DIGEST_SIZE))) { + pr_err("chacha20poly1305 decryption self-test %zu: FAIL\n", + i + 1); + success = false; + } + } + + for (i = 0; i < ARRAY_SIZE(chacha20poly1305_dec_vectors); ++i) { + memcpy(computed_output, chacha20poly1305_dec_vectors[i].input, + chacha20poly1305_dec_vectors[i].ilen); + sg_init_one(sg_src, computed_output, + chacha20poly1305_dec_vectors[i].ilen); + ret = chacha20poly1305_decrypt_sg_inplace(sg_src, + chacha20poly1305_dec_vectors[i].ilen, + chacha20poly1305_dec_vectors[i].assoc, + chacha20poly1305_dec_vectors[i].alen, + get_unaligned_le64(chacha20poly1305_dec_vectors[i].nonce), + chacha20poly1305_dec_vectors[i].key); + if (!decryption_success(ret, + chacha20poly1305_dec_vectors[i].failure, + memcmp(computed_output, chacha20poly1305_dec_vectors[i].output, + chacha20poly1305_dec_vectors[i].ilen - + POLY1305_DIGEST_SIZE))) { + pr_err("chacha20poly1305 sg decryption self-test %zu: FAIL\n", + i + 1); + success = false; + } + } + + for (i = 0; i < ARRAY_SIZE(xchacha20poly1305_enc_vectors); ++i) { + memset(computed_output, 0, MAXIMUM_TEST_BUFFER_LEN); + xchacha20poly1305_encrypt(computed_output, + xchacha20poly1305_enc_vectors[i].input, + xchacha20poly1305_enc_vectors[i].ilen, + xchacha20poly1305_enc_vectors[i].assoc, + xchacha20poly1305_enc_vectors[i].alen, + xchacha20poly1305_enc_vectors[i].nonce, + xchacha20poly1305_enc_vectors[i].key); + if (memcmp(computed_output, + xchacha20poly1305_enc_vectors[i].output, + xchacha20poly1305_enc_vectors[i].ilen + + POLY1305_DIGEST_SIZE)) { + pr_err("xchacha20poly1305 encryption self-test %zu: FAIL\n", + i + 1); + success = false; + } + } + + for (i = 0; i < ARRAY_SIZE(xchacha20poly1305_dec_vectors); ++i) { + memset(computed_output, 0, MAXIMUM_TEST_BUFFER_LEN); + ret = xchacha20poly1305_decrypt(computed_output, + xchacha20poly1305_dec_vectors[i].input, + xchacha20poly1305_dec_vectors[i].ilen, + xchacha20poly1305_dec_vectors[i].assoc, + xchacha20poly1305_dec_vectors[i].alen, + xchacha20poly1305_dec_vectors[i].nonce, + xchacha20poly1305_dec_vectors[i].key); + if (!decryption_success(ret, + xchacha20poly1305_dec_vectors[i].failure, + memcmp(computed_output, + xchacha20poly1305_dec_vectors[i].output, + xchacha20poly1305_dec_vectors[i].ilen - + POLY1305_DIGEST_SIZE))) { + pr_err("xchacha20poly1305 decryption self-test %zu: FAIL\n", + i + 1); + success = false; + } + } + + for (total_len = POLY1305_DIGEST_SIZE; IS_ENABLED(DEBUG_CHACHA20POLY1305_SLOW_CHUNK_TEST) + && total_len <= 1 << 10; ++total_len) { + for (i = 0; i <= total_len; ++i) { + for (j = i; j <= total_len; ++j) { + k = 0; + sg_init_table(sg_src, 3); + if (i) + sg_set_buf(&sg_src[k++], input, i); + if (j - i) + sg_set_buf(&sg_src[k++], input + i, j - i); + if (total_len - j) + sg_set_buf(&sg_src[k++], input + j, total_len - j); + sg_init_marker(sg_src, k); + memset(computed_output, 0, total_len); + memset(input, 0, total_len); + + if (!chacha20poly1305_encrypt_sg_inplace(sg_src, + total_len - POLY1305_DIGEST_SIZE, NULL, 0, + 0, enc_key001)) + goto chunkfail; + chacha20poly1305_encrypt(computed_output, + computed_output, + total_len - POLY1305_DIGEST_SIZE, NULL, 0, 0, + enc_key001); + if (memcmp(computed_output, input, total_len)) + goto chunkfail; + if (!chacha20poly1305_decrypt(computed_output, + input, total_len, NULL, 0, 0, enc_key001)) + goto chunkfail; + for (k = 0; k < total_len - POLY1305_DIGEST_SIZE; ++k) { + if (computed_output[k]) + goto chunkfail; + } + if (!chacha20poly1305_decrypt_sg_inplace(sg_src, + total_len, NULL, 0, 0, enc_key001)) + goto chunkfail; + for (k = 0; k < total_len - POLY1305_DIGEST_SIZE; ++k) { + if (input[k]) + goto chunkfail; + } + continue; + + chunkfail: + pr_err("chacha20poly1305 chunked self-test %zu/%zu/%zu: FAIL\n", + total_len, i, j); + success = false; + } + + } + } + +out: + kfree(computed_output); + kfree(input); + return success; +} diff --git a/lib/crypto/chacha20poly1305.c b/lib/crypto/chacha20poly1305.c new file mode 100644 index 000000000000..431e04280332 --- /dev/null +++ b/lib/crypto/chacha20poly1305.c @@ -0,0 +1,370 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + * + * This is an implementation of the ChaCha20Poly1305 AEAD construction. + * + * Information: https://tools.ietf.org/html/rfc8439 + */ + +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#define CHACHA_KEY_WORDS (CHACHA_KEY_SIZE / sizeof(u32)) + +static void chacha_load_key(u32 *k, const u8 *in) +{ + k[0] = get_unaligned_le32(in); + k[1] = get_unaligned_le32(in + 4); + k[2] = get_unaligned_le32(in + 8); + k[3] = get_unaligned_le32(in + 12); + k[4] = get_unaligned_le32(in + 16); + k[5] = get_unaligned_le32(in + 20); + k[6] = get_unaligned_le32(in + 24); + k[7] = get_unaligned_le32(in + 28); +} + +static void xchacha_init(u32 *chacha_state, const u8 *key, const u8 *nonce) +{ + u32 k[CHACHA_KEY_WORDS]; + u8 iv[CHACHA_IV_SIZE]; + + memset(iv, 0, 8); + memcpy(iv + 8, nonce + 16, 8); + + chacha_load_key(k, key); + + /* Compute the subkey given the original key and first 128 nonce bits */ + chacha_init(chacha_state, k, nonce); + hchacha_block(chacha_state, k, 20); + + chacha_init(chacha_state, k, iv); + + memzero_explicit(k, sizeof(k)); + memzero_explicit(iv, sizeof(iv)); +} + +static void +__chacha20poly1305_encrypt(u8 *dst, const u8 *src, const size_t src_len, + const u8 *ad, const size_t ad_len, u32 *chacha_state) +{ + const u8 *pad0 = page_address(ZERO_PAGE(0)); + struct poly1305_desc_ctx poly1305_state; + union { + u8 block0[POLY1305_KEY_SIZE]; + __le64 lens[2]; + } b; + + chacha20_crypt(chacha_state, b.block0, pad0, sizeof(b.block0)); + poly1305_init(&poly1305_state, b.block0); + + poly1305_update(&poly1305_state, ad, ad_len); + if (ad_len & 0xf) + poly1305_update(&poly1305_state, pad0, 0x10 - (ad_len & 0xf)); + + chacha20_crypt(chacha_state, dst, src, src_len); + + poly1305_update(&poly1305_state, dst, src_len); + if (src_len & 0xf) + poly1305_update(&poly1305_state, pad0, 0x10 - (src_len & 0xf)); + + b.lens[0] = cpu_to_le64(ad_len); + b.lens[1] = cpu_to_le64(src_len); + poly1305_update(&poly1305_state, (u8 *)b.lens, sizeof(b.lens)); + + poly1305_final(&poly1305_state, dst + src_len); + + memzero_explicit(chacha_state, CHACHA_STATE_WORDS * sizeof(u32)); + memzero_explicit(&b, sizeof(b)); +} + +void chacha20poly1305_encrypt(u8 *dst, const u8 *src, const size_t src_len, + const u8 *ad, const size_t ad_len, + const u64 nonce, + const u8 key[CHACHA20POLY1305_KEY_SIZE]) +{ + u32 chacha_state[CHACHA_STATE_WORDS]; + u32 k[CHACHA_KEY_WORDS]; + __le64 iv[2]; + + chacha_load_key(k, key); + + iv[0] = 0; + iv[1] = cpu_to_le64(nonce); + + chacha_init(chacha_state, k, (u8 *)iv); + __chacha20poly1305_encrypt(dst, src, src_len, ad, ad_len, chacha_state); + + memzero_explicit(iv, sizeof(iv)); + memzero_explicit(k, sizeof(k)); +} +EXPORT_SYMBOL(chacha20poly1305_encrypt); + +void xchacha20poly1305_encrypt(u8 *dst, const u8 *src, const size_t src_len, + const u8 *ad, const size_t ad_len, + const u8 nonce[XCHACHA20POLY1305_NONCE_SIZE], + const u8 key[CHACHA20POLY1305_KEY_SIZE]) +{ + u32 chacha_state[CHACHA_STATE_WORDS]; + + xchacha_init(chacha_state, key, nonce); + __chacha20poly1305_encrypt(dst, src, src_len, ad, ad_len, chacha_state); +} +EXPORT_SYMBOL(xchacha20poly1305_encrypt); + +static bool +__chacha20poly1305_decrypt(u8 *dst, const u8 *src, const size_t src_len, + const u8 *ad, const size_t ad_len, u32 *chacha_state) +{ + const u8 *pad0 = page_address(ZERO_PAGE(0)); + struct poly1305_desc_ctx poly1305_state; + size_t dst_len; + int ret; + union { + u8 block0[POLY1305_KEY_SIZE]; + u8 mac[POLY1305_DIGEST_SIZE]; + __le64 lens[2]; + } b; + + if (unlikely(src_len < POLY1305_DIGEST_SIZE)) + return false; + + chacha20_crypt(chacha_state, b.block0, pad0, sizeof(b.block0)); + poly1305_init(&poly1305_state, b.block0); + + poly1305_update(&poly1305_state, ad, ad_len); + if (ad_len & 0xf) + poly1305_update(&poly1305_state, pad0, 0x10 - (ad_len & 0xf)); + + dst_len = src_len - POLY1305_DIGEST_SIZE; + poly1305_update(&poly1305_state, src, dst_len); + if (dst_len & 0xf) + poly1305_update(&poly1305_state, pad0, 0x10 - (dst_len & 0xf)); + + b.lens[0] = cpu_to_le64(ad_len); + b.lens[1] = cpu_to_le64(dst_len); + poly1305_update(&poly1305_state, (u8 *)b.lens, sizeof(b.lens)); + + poly1305_final(&poly1305_state, b.mac); + + ret = crypto_memneq(b.mac, src + dst_len, POLY1305_DIGEST_SIZE); + if (likely(!ret)) + chacha20_crypt(chacha_state, dst, src, dst_len); + + memzero_explicit(&b, sizeof(b)); + + return !ret; +} + +bool chacha20poly1305_decrypt(u8 *dst, const u8 *src, const size_t src_len, + const u8 *ad, const size_t ad_len, + const u64 nonce, + const u8 key[CHACHA20POLY1305_KEY_SIZE]) +{ + u32 chacha_state[CHACHA_STATE_WORDS]; + u32 k[CHACHA_KEY_WORDS]; + __le64 iv[2]; + bool ret; + + chacha_load_key(k, key); + + iv[0] = 0; + iv[1] = cpu_to_le64(nonce); + + chacha_init(chacha_state, k, (u8 *)iv); + ret = __chacha20poly1305_decrypt(dst, src, src_len, ad, ad_len, + chacha_state); + + memzero_explicit(chacha_state, sizeof(chacha_state)); + memzero_explicit(iv, sizeof(iv)); + memzero_explicit(k, sizeof(k)); + return ret; +} +EXPORT_SYMBOL(chacha20poly1305_decrypt); + +bool xchacha20poly1305_decrypt(u8 *dst, const u8 *src, const size_t src_len, + const u8 *ad, const size_t ad_len, + const u8 nonce[XCHACHA20POLY1305_NONCE_SIZE], + const u8 key[CHACHA20POLY1305_KEY_SIZE]) +{ + u32 chacha_state[CHACHA_STATE_WORDS]; + + xchacha_init(chacha_state, key, nonce); + return __chacha20poly1305_decrypt(dst, src, src_len, ad, ad_len, + chacha_state); +} +EXPORT_SYMBOL(xchacha20poly1305_decrypt); + +static +bool chacha20poly1305_crypt_sg_inplace(struct scatterlist *src, + const size_t src_len, + const u8 *ad, const size_t ad_len, + const u64 nonce, + const u8 key[CHACHA20POLY1305_KEY_SIZE], + int encrypt) +{ + const u8 *pad0 = page_address(ZERO_PAGE(0)); + struct poly1305_desc_ctx poly1305_state; + u32 chacha_state[CHACHA_STATE_WORDS]; + struct sg_mapping_iter miter; + size_t partial = 0; + unsigned int flags; + bool ret = true; + int sl; + union { + struct { + u32 k[CHACHA_KEY_WORDS]; + __le64 iv[2]; + }; + u8 block0[POLY1305_KEY_SIZE]; + u8 chacha_stream[CHACHA_BLOCK_SIZE]; + struct { + u8 mac[2][POLY1305_DIGEST_SIZE]; + }; + __le64 lens[2]; + } b __aligned(16); + + if (WARN_ON(src_len > INT_MAX)) + return false; + + chacha_load_key(b.k, key); + + b.iv[0] = 0; + b.iv[1] = cpu_to_le64(nonce); + + chacha_init(chacha_state, b.k, (u8 *)b.iv); + chacha20_crypt(chacha_state, b.block0, pad0, sizeof(b.block0)); + poly1305_init(&poly1305_state, b.block0); + + if (unlikely(ad_len)) { + poly1305_update(&poly1305_state, ad, ad_len); + if (ad_len & 0xf) + poly1305_update(&poly1305_state, pad0, 0x10 - (ad_len & 0xf)); + } + + flags = SG_MITER_TO_SG; + if (!preemptible()) + flags |= SG_MITER_ATOMIC; + + sg_miter_start(&miter, src, sg_nents(src), flags); + + for (sl = src_len; sl > 0 && sg_miter_next(&miter); sl -= miter.length) { + u8 *addr = miter.addr; + size_t length = min_t(size_t, sl, miter.length); + + if (!encrypt) + poly1305_update(&poly1305_state, addr, length); + + if (unlikely(partial)) { + size_t l = min(length, CHACHA_BLOCK_SIZE - partial); + + crypto_xor(addr, b.chacha_stream + partial, l); + partial = (partial + l) & (CHACHA_BLOCK_SIZE - 1); + + addr += l; + length -= l; + } + + if (likely(length >= CHACHA_BLOCK_SIZE || length == sl)) { + size_t l = length; + + if (unlikely(length < sl)) + l &= ~(CHACHA_BLOCK_SIZE - 1); + chacha20_crypt(chacha_state, addr, addr, l); + addr += l; + length -= l; + } + + if (unlikely(length > 0)) { + chacha20_crypt(chacha_state, b.chacha_stream, pad0, + CHACHA_BLOCK_SIZE); + crypto_xor(addr, b.chacha_stream, length); + partial = length; + } + + if (encrypt) + poly1305_update(&poly1305_state, miter.addr, + min_t(size_t, sl, miter.length)); + } + + if (src_len & 0xf) + poly1305_update(&poly1305_state, pad0, 0x10 - (src_len & 0xf)); + + b.lens[0] = cpu_to_le64(ad_len); + b.lens[1] = cpu_to_le64(src_len); + poly1305_update(&poly1305_state, (u8 *)b.lens, sizeof(b.lens)); + + if (likely(sl <= -POLY1305_DIGEST_SIZE)) { + if (encrypt) { + poly1305_final(&poly1305_state, + miter.addr + miter.length + sl); + ret = true; + } else { + poly1305_final(&poly1305_state, b.mac[0]); + ret = !crypto_memneq(b.mac[0], + miter.addr + miter.length + sl, + POLY1305_DIGEST_SIZE); + } + } + + sg_miter_stop(&miter); + + if (unlikely(sl > -POLY1305_DIGEST_SIZE)) { + poly1305_final(&poly1305_state, b.mac[1]); + scatterwalk_map_and_copy(b.mac[encrypt], src, src_len, + sizeof(b.mac[1]), encrypt); + ret = encrypt || + !crypto_memneq(b.mac[0], b.mac[1], POLY1305_DIGEST_SIZE); + } + + memzero_explicit(chacha_state, sizeof(chacha_state)); + memzero_explicit(&b, sizeof(b)); + + return ret; +} + +bool chacha20poly1305_encrypt_sg_inplace(struct scatterlist *src, size_t src_len, + const u8 *ad, const size_t ad_len, + const u64 nonce, + const u8 key[CHACHA20POLY1305_KEY_SIZE]) +{ + return chacha20poly1305_crypt_sg_inplace(src, src_len, ad, ad_len, + nonce, key, 1); +} +EXPORT_SYMBOL(chacha20poly1305_encrypt_sg_inplace); + +bool chacha20poly1305_decrypt_sg_inplace(struct scatterlist *src, size_t src_len, + const u8 *ad, const size_t ad_len, + const u64 nonce, + const u8 key[CHACHA20POLY1305_KEY_SIZE]) +{ + if (unlikely(src_len < POLY1305_DIGEST_SIZE)) + return false; + + return chacha20poly1305_crypt_sg_inplace(src, + src_len - POLY1305_DIGEST_SIZE, + ad, ad_len, nonce, key, 0); +} +EXPORT_SYMBOL(chacha20poly1305_decrypt_sg_inplace); + +static int __init mod_init(void) +{ + if (!IS_ENABLED(CONFIG_CRYPTO_MANAGER_DISABLE_TESTS) && + WARN_ON(!chacha20poly1305_selftest())) + return -ENODEV; + return 0; +} + +module_init(mod_init); +MODULE_LICENSE("GPL v2"); +MODULE_DESCRIPTION("ChaCha20Poly1305 AEAD construction"); +MODULE_AUTHOR("Jason A. Donenfeld "); diff --git a/lib/crypto/curve25519-fiat32.c b/lib/crypto/curve25519-fiat32.c new file mode 100644 index 000000000000..2fde0ec33dbd --- /dev/null +++ b/lib/crypto/curve25519-fiat32.c @@ -0,0 +1,864 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT +/* + * Copyright (C) 2015-2016 The fiat-crypto Authors. + * Copyright (C) 2018-2019 Jason A. Donenfeld . All Rights Reserved. + * + * This is a machine-generated formally verified implementation of Curve25519 + * ECDH from: . Though originally + * machine generated, it has been tweaked to be suitable for use in the kernel. + * It is optimized for 32-bit machines and machines that cannot work efficiently + * with 128-bit integer types. + */ + +#include +#include +#include + +/* fe means field element. Here the field is \Z/(2^255-19). An element t, + * entries t[0]...t[9], represents the integer t[0]+2^26 t[1]+2^51 t[2]+2^77 + * t[3]+2^102 t[4]+...+2^230 t[9]. + * fe limbs are bounded by 1.125*2^26,1.125*2^25,1.125*2^26,1.125*2^25,etc. + * Multiplication and carrying produce fe from fe_loose. + */ +typedef struct fe { u32 v[10]; } fe; + +/* fe_loose limbs are bounded by 3.375*2^26,3.375*2^25,3.375*2^26,3.375*2^25,etc + * Addition and subtraction produce fe_loose from (fe, fe). + */ +typedef struct fe_loose { u32 v[10]; } fe_loose; + +static __always_inline void fe_frombytes_impl(u32 h[10], const u8 *s) +{ + /* Ignores top bit of s. */ + u32 a0 = get_unaligned_le32(s); + u32 a1 = get_unaligned_le32(s+4); + u32 a2 = get_unaligned_le32(s+8); + u32 a3 = get_unaligned_le32(s+12); + u32 a4 = get_unaligned_le32(s+16); + u32 a5 = get_unaligned_le32(s+20); + u32 a6 = get_unaligned_le32(s+24); + u32 a7 = get_unaligned_le32(s+28); + h[0] = a0&((1<<26)-1); /* 26 used, 32-26 left. 26 */ + h[1] = (a0>>26) | ((a1&((1<<19)-1))<< 6); /* (32-26) + 19 = 6+19 = 25 */ + h[2] = (a1>>19) | ((a2&((1<<13)-1))<<13); /* (32-19) + 13 = 13+13 = 26 */ + h[3] = (a2>>13) | ((a3&((1<< 6)-1))<<19); /* (32-13) + 6 = 19+ 6 = 25 */ + h[4] = (a3>> 6); /* (32- 6) = 26 */ + h[5] = a4&((1<<25)-1); /* 25 */ + h[6] = (a4>>25) | ((a5&((1<<19)-1))<< 7); /* (32-25) + 19 = 7+19 = 26 */ + h[7] = (a5>>19) | ((a6&((1<<12)-1))<<13); /* (32-19) + 12 = 13+12 = 25 */ + h[8] = (a6>>12) | ((a7&((1<< 6)-1))<<20); /* (32-12) + 6 = 20+ 6 = 26 */ + h[9] = (a7>> 6)&((1<<25)-1); /* 25 */ +} + +static __always_inline void fe_frombytes(fe *h, const u8 *s) +{ + fe_frombytes_impl(h->v, s); +} + +static __always_inline u8 /*bool*/ +addcarryx_u25(u8 /*bool*/ c, u32 a, u32 b, u32 *low) +{ + /* This function extracts 25 bits of result and 1 bit of carry + * (26 total), so a 32-bit intermediate is sufficient. + */ + u32 x = a + b + c; + *low = x & ((1 << 25) - 1); + return (x >> 25) & 1; +} + +static __always_inline u8 /*bool*/ +addcarryx_u26(u8 /*bool*/ c, u32 a, u32 b, u32 *low) +{ + /* This function extracts 26 bits of result and 1 bit of carry + * (27 total), so a 32-bit intermediate is sufficient. + */ + u32 x = a + b + c; + *low = x & ((1 << 26) - 1); + return (x >> 26) & 1; +} + +static __always_inline u8 /*bool*/ +subborrow_u25(u8 /*bool*/ c, u32 a, u32 b, u32 *low) +{ + /* This function extracts 25 bits of result and 1 bit of borrow + * (26 total), so a 32-bit intermediate is sufficient. + */ + u32 x = a - b - c; + *low = x & ((1 << 25) - 1); + return x >> 31; +} + +static __always_inline u8 /*bool*/ +subborrow_u26(u8 /*bool*/ c, u32 a, u32 b, u32 *low) +{ + /* This function extracts 26 bits of result and 1 bit of borrow + *(27 total), so a 32-bit intermediate is sufficient. + */ + u32 x = a - b - c; + *low = x & ((1 << 26) - 1); + return x >> 31; +} + +static __always_inline u32 cmovznz32(u32 t, u32 z, u32 nz) +{ + t = -!!t; /* all set if nonzero, 0 if 0 */ + return (t&nz) | ((~t)&z); +} + +static __always_inline void fe_freeze(u32 out[10], const u32 in1[10]) +{ + { const u32 x17 = in1[9]; + { const u32 x18 = in1[8]; + { const u32 x16 = in1[7]; + { const u32 x14 = in1[6]; + { const u32 x12 = in1[5]; + { const u32 x10 = in1[4]; + { const u32 x8 = in1[3]; + { const u32 x6 = in1[2]; + { const u32 x4 = in1[1]; + { const u32 x2 = in1[0]; + { u32 x20; u8/*bool*/ x21 = subborrow_u26(0x0, x2, 0x3ffffed, &x20); + { u32 x23; u8/*bool*/ x24 = subborrow_u25(x21, x4, 0x1ffffff, &x23); + { u32 x26; u8/*bool*/ x27 = subborrow_u26(x24, x6, 0x3ffffff, &x26); + { u32 x29; u8/*bool*/ x30 = subborrow_u25(x27, x8, 0x1ffffff, &x29); + { u32 x32; u8/*bool*/ x33 = subborrow_u26(x30, x10, 0x3ffffff, &x32); + { u32 x35; u8/*bool*/ x36 = subborrow_u25(x33, x12, 0x1ffffff, &x35); + { u32 x38; u8/*bool*/ x39 = subborrow_u26(x36, x14, 0x3ffffff, &x38); + { u32 x41; u8/*bool*/ x42 = subborrow_u25(x39, x16, 0x1ffffff, &x41); + { u32 x44; u8/*bool*/ x45 = subborrow_u26(x42, x18, 0x3ffffff, &x44); + { u32 x47; u8/*bool*/ x48 = subborrow_u25(x45, x17, 0x1ffffff, &x47); + { u32 x49 = cmovznz32(x48, 0x0, 0xffffffff); + { u32 x50 = (x49 & 0x3ffffed); + { u32 x52; u8/*bool*/ x53 = addcarryx_u26(0x0, x20, x50, &x52); + { u32 x54 = (x49 & 0x1ffffff); + { u32 x56; u8/*bool*/ x57 = addcarryx_u25(x53, x23, x54, &x56); + { u32 x58 = (x49 & 0x3ffffff); + { u32 x60; u8/*bool*/ x61 = addcarryx_u26(x57, x26, x58, &x60); + { u32 x62 = (x49 & 0x1ffffff); + { u32 x64; u8/*bool*/ x65 = addcarryx_u25(x61, x29, x62, &x64); + { u32 x66 = (x49 & 0x3ffffff); + { u32 x68; u8/*bool*/ x69 = addcarryx_u26(x65, x32, x66, &x68); + { u32 x70 = (x49 & 0x1ffffff); + { u32 x72; u8/*bool*/ x73 = addcarryx_u25(x69, x35, x70, &x72); + { u32 x74 = (x49 & 0x3ffffff); + { u32 x76; u8/*bool*/ x77 = addcarryx_u26(x73, x38, x74, &x76); + { u32 x78 = (x49 & 0x1ffffff); + { u32 x80; u8/*bool*/ x81 = addcarryx_u25(x77, x41, x78, &x80); + { u32 x82 = (x49 & 0x3ffffff); + { u32 x84; u8/*bool*/ x85 = addcarryx_u26(x81, x44, x82, &x84); + { u32 x86 = (x49 & 0x1ffffff); + { u32 x88; addcarryx_u25(x85, x47, x86, &x88); + out[0] = x52; + out[1] = x56; + out[2] = x60; + out[3] = x64; + out[4] = x68; + out[5] = x72; + out[6] = x76; + out[7] = x80; + out[8] = x84; + out[9] = x88; + }}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}} +} + +static __always_inline void fe_tobytes(u8 s[32], const fe *f) +{ + u32 h[10]; + fe_freeze(h, f->v); + s[0] = h[0] >> 0; + s[1] = h[0] >> 8; + s[2] = h[0] >> 16; + s[3] = (h[0] >> 24) | (h[1] << 2); + s[4] = h[1] >> 6; + s[5] = h[1] >> 14; + s[6] = (h[1] >> 22) | (h[2] << 3); + s[7] = h[2] >> 5; + s[8] = h[2] >> 13; + s[9] = (h[2] >> 21) | (h[3] << 5); + s[10] = h[3] >> 3; + s[11] = h[3] >> 11; + s[12] = (h[3] >> 19) | (h[4] << 6); + s[13] = h[4] >> 2; + s[14] = h[4] >> 10; + s[15] = h[4] >> 18; + s[16] = h[5] >> 0; + s[17] = h[5] >> 8; + s[18] = h[5] >> 16; + s[19] = (h[5] >> 24) | (h[6] << 1); + s[20] = h[6] >> 7; + s[21] = h[6] >> 15; + s[22] = (h[6] >> 23) | (h[7] << 3); + s[23] = h[7] >> 5; + s[24] = h[7] >> 13; + s[25] = (h[7] >> 21) | (h[8] << 4); + s[26] = h[8] >> 4; + s[27] = h[8] >> 12; + s[28] = (h[8] >> 20) | (h[9] << 6); + s[29] = h[9] >> 2; + s[30] = h[9] >> 10; + s[31] = h[9] >> 18; +} + +/* h = f */ +static __always_inline void fe_copy(fe *h, const fe *f) +{ + memmove(h, f, sizeof(u32) * 10); +} + +static __always_inline void fe_copy_lt(fe_loose *h, const fe *f) +{ + memmove(h, f, sizeof(u32) * 10); +} + +/* h = 0 */ +static __always_inline void fe_0(fe *h) +{ + memset(h, 0, sizeof(u32) * 10); +} + +/* h = 1 */ +static __always_inline void fe_1(fe *h) +{ + memset(h, 0, sizeof(u32) * 10); + h->v[0] = 1; +} + +static noinline void fe_add_impl(u32 out[10], const u32 in1[10], const u32 in2[10]) +{ + { const u32 x20 = in1[9]; + { const u32 x21 = in1[8]; + { const u32 x19 = in1[7]; + { const u32 x17 = in1[6]; + { const u32 x15 = in1[5]; + { const u32 x13 = in1[4]; + { const u32 x11 = in1[3]; + { const u32 x9 = in1[2]; + { const u32 x7 = in1[1]; + { const u32 x5 = in1[0]; + { const u32 x38 = in2[9]; + { const u32 x39 = in2[8]; + { const u32 x37 = in2[7]; + { const u32 x35 = in2[6]; + { const u32 x33 = in2[5]; + { const u32 x31 = in2[4]; + { const u32 x29 = in2[3]; + { const u32 x27 = in2[2]; + { const u32 x25 = in2[1]; + { const u32 x23 = in2[0]; + out[0] = (x5 + x23); + out[1] = (x7 + x25); + out[2] = (x9 + x27); + out[3] = (x11 + x29); + out[4] = (x13 + x31); + out[5] = (x15 + x33); + out[6] = (x17 + x35); + out[7] = (x19 + x37); + out[8] = (x21 + x39); + out[9] = (x20 + x38); + }}}}}}}}}}}}}}}}}}}} +} + +/* h = f + g + * Can overlap h with f or g. + */ +static __always_inline void fe_add(fe_loose *h, const fe *f, const fe *g) +{ + fe_add_impl(h->v, f->v, g->v); +} + +static noinline void fe_sub_impl(u32 out[10], const u32 in1[10], const u32 in2[10]) +{ + { const u32 x20 = in1[9]; + { const u32 x21 = in1[8]; + { const u32 x19 = in1[7]; + { const u32 x17 = in1[6]; + { const u32 x15 = in1[5]; + { const u32 x13 = in1[4]; + { const u32 x11 = in1[3]; + { const u32 x9 = in1[2]; + { const u32 x7 = in1[1]; + { const u32 x5 = in1[0]; + { const u32 x38 = in2[9]; + { const u32 x39 = in2[8]; + { const u32 x37 = in2[7]; + { const u32 x35 = in2[6]; + { const u32 x33 = in2[5]; + { const u32 x31 = in2[4]; + { const u32 x29 = in2[3]; + { const u32 x27 = in2[2]; + { const u32 x25 = in2[1]; + { const u32 x23 = in2[0]; + out[0] = ((0x7ffffda + x5) - x23); + out[1] = ((0x3fffffe + x7) - x25); + out[2] = ((0x7fffffe + x9) - x27); + out[3] = ((0x3fffffe + x11) - x29); + out[4] = ((0x7fffffe + x13) - x31); + out[5] = ((0x3fffffe + x15) - x33); + out[6] = ((0x7fffffe + x17) - x35); + out[7] = ((0x3fffffe + x19) - x37); + out[8] = ((0x7fffffe + x21) - x39); + out[9] = ((0x3fffffe + x20) - x38); + }}}}}}}}}}}}}}}}}}}} +} + +/* h = f - g + * Can overlap h with f or g. + */ +static __always_inline void fe_sub(fe_loose *h, const fe *f, const fe *g) +{ + fe_sub_impl(h->v, f->v, g->v); +} + +static noinline void fe_mul_impl(u32 out[10], const u32 in1[10], const u32 in2[10]) +{ + { const u32 x20 = in1[9]; + { const u32 x21 = in1[8]; + { const u32 x19 = in1[7]; + { const u32 x17 = in1[6]; + { const u32 x15 = in1[5]; + { const u32 x13 = in1[4]; + { const u32 x11 = in1[3]; + { const u32 x9 = in1[2]; + { const u32 x7 = in1[1]; + { const u32 x5 = in1[0]; + { const u32 x38 = in2[9]; + { const u32 x39 = in2[8]; + { const u32 x37 = in2[7]; + { const u32 x35 = in2[6]; + { const u32 x33 = in2[5]; + { const u32 x31 = in2[4]; + { const u32 x29 = in2[3]; + { const u32 x27 = in2[2]; + { const u32 x25 = in2[1]; + { const u32 x23 = in2[0]; + { u64 x40 = ((u64)x23 * x5); + { u64 x41 = (((u64)x23 * x7) + ((u64)x25 * x5)); + { u64 x42 = ((((u64)(0x2 * x25) * x7) + ((u64)x23 * x9)) + ((u64)x27 * x5)); + { u64 x43 = (((((u64)x25 * x9) + ((u64)x27 * x7)) + ((u64)x23 * x11)) + ((u64)x29 * x5)); + { u64 x44 = (((((u64)x27 * x9) + (0x2 * (((u64)x25 * x11) + ((u64)x29 * x7)))) + ((u64)x23 * x13)) + ((u64)x31 * x5)); + { u64 x45 = (((((((u64)x27 * x11) + ((u64)x29 * x9)) + ((u64)x25 * x13)) + ((u64)x31 * x7)) + ((u64)x23 * x15)) + ((u64)x33 * x5)); + { u64 x46 = (((((0x2 * ((((u64)x29 * x11) + ((u64)x25 * x15)) + ((u64)x33 * x7))) + ((u64)x27 * x13)) + ((u64)x31 * x9)) + ((u64)x23 * x17)) + ((u64)x35 * x5)); + { u64 x47 = (((((((((u64)x29 * x13) + ((u64)x31 * x11)) + ((u64)x27 * x15)) + ((u64)x33 * x9)) + ((u64)x25 * x17)) + ((u64)x35 * x7)) + ((u64)x23 * x19)) + ((u64)x37 * x5)); + { u64 x48 = (((((((u64)x31 * x13) + (0x2 * (((((u64)x29 * x15) + ((u64)x33 * x11)) + ((u64)x25 * x19)) + ((u64)x37 * x7)))) + ((u64)x27 * x17)) + ((u64)x35 * x9)) + ((u64)x23 * x21)) + ((u64)x39 * x5)); + { u64 x49 = (((((((((((u64)x31 * x15) + ((u64)x33 * x13)) + ((u64)x29 * x17)) + ((u64)x35 * x11)) + ((u64)x27 * x19)) + ((u64)x37 * x9)) + ((u64)x25 * x21)) + ((u64)x39 * x7)) + ((u64)x23 * x20)) + ((u64)x38 * x5)); + { u64 x50 = (((((0x2 * ((((((u64)x33 * x15) + ((u64)x29 * x19)) + ((u64)x37 * x11)) + ((u64)x25 * x20)) + ((u64)x38 * x7))) + ((u64)x31 * x17)) + ((u64)x35 * x13)) + ((u64)x27 * x21)) + ((u64)x39 * x9)); + { u64 x51 = (((((((((u64)x33 * x17) + ((u64)x35 * x15)) + ((u64)x31 * x19)) + ((u64)x37 * x13)) + ((u64)x29 * x21)) + ((u64)x39 * x11)) + ((u64)x27 * x20)) + ((u64)x38 * x9)); + { u64 x52 = (((((u64)x35 * x17) + (0x2 * (((((u64)x33 * x19) + ((u64)x37 * x15)) + ((u64)x29 * x20)) + ((u64)x38 * x11)))) + ((u64)x31 * x21)) + ((u64)x39 * x13)); + { u64 x53 = (((((((u64)x35 * x19) + ((u64)x37 * x17)) + ((u64)x33 * x21)) + ((u64)x39 * x15)) + ((u64)x31 * x20)) + ((u64)x38 * x13)); + { u64 x54 = (((0x2 * ((((u64)x37 * x19) + ((u64)x33 * x20)) + ((u64)x38 * x15))) + ((u64)x35 * x21)) + ((u64)x39 * x17)); + { u64 x55 = (((((u64)x37 * x21) + ((u64)x39 * x19)) + ((u64)x35 * x20)) + ((u64)x38 * x17)); + { u64 x56 = (((u64)x39 * x21) + (0x2 * (((u64)x37 * x20) + ((u64)x38 * x19)))); + { u64 x57 = (((u64)x39 * x20) + ((u64)x38 * x21)); + { u64 x58 = ((u64)(0x2 * x38) * x20); + { u64 x59 = (x48 + (x58 << 0x4)); + { u64 x60 = (x59 + (x58 << 0x1)); + { u64 x61 = (x60 + x58); + { u64 x62 = (x47 + (x57 << 0x4)); + { u64 x63 = (x62 + (x57 << 0x1)); + { u64 x64 = (x63 + x57); + { u64 x65 = (x46 + (x56 << 0x4)); + { u64 x66 = (x65 + (x56 << 0x1)); + { u64 x67 = (x66 + x56); + { u64 x68 = (x45 + (x55 << 0x4)); + { u64 x69 = (x68 + (x55 << 0x1)); + { u64 x70 = (x69 + x55); + { u64 x71 = (x44 + (x54 << 0x4)); + { u64 x72 = (x71 + (x54 << 0x1)); + { u64 x73 = (x72 + x54); + { u64 x74 = (x43 + (x53 << 0x4)); + { u64 x75 = (x74 + (x53 << 0x1)); + { u64 x76 = (x75 + x53); + { u64 x77 = (x42 + (x52 << 0x4)); + { u64 x78 = (x77 + (x52 << 0x1)); + { u64 x79 = (x78 + x52); + { u64 x80 = (x41 + (x51 << 0x4)); + { u64 x81 = (x80 + (x51 << 0x1)); + { u64 x82 = (x81 + x51); + { u64 x83 = (x40 + (x50 << 0x4)); + { u64 x84 = (x83 + (x50 << 0x1)); + { u64 x85 = (x84 + x50); + { u64 x86 = (x85 >> 0x1a); + { u32 x87 = ((u32)x85 & 0x3ffffff); + { u64 x88 = (x86 + x82); + { u64 x89 = (x88 >> 0x19); + { u32 x90 = ((u32)x88 & 0x1ffffff); + { u64 x91 = (x89 + x79); + { u64 x92 = (x91 >> 0x1a); + { u32 x93 = ((u32)x91 & 0x3ffffff); + { u64 x94 = (x92 + x76); + { u64 x95 = (x94 >> 0x19); + { u32 x96 = ((u32)x94 & 0x1ffffff); + { u64 x97 = (x95 + x73); + { u64 x98 = (x97 >> 0x1a); + { u32 x99 = ((u32)x97 & 0x3ffffff); + { u64 x100 = (x98 + x70); + { u64 x101 = (x100 >> 0x19); + { u32 x102 = ((u32)x100 & 0x1ffffff); + { u64 x103 = (x101 + x67); + { u64 x104 = (x103 >> 0x1a); + { u32 x105 = ((u32)x103 & 0x3ffffff); + { u64 x106 = (x104 + x64); + { u64 x107 = (x106 >> 0x19); + { u32 x108 = ((u32)x106 & 0x1ffffff); + { u64 x109 = (x107 + x61); + { u64 x110 = (x109 >> 0x1a); + { u32 x111 = ((u32)x109 & 0x3ffffff); + { u64 x112 = (x110 + x49); + { u64 x113 = (x112 >> 0x19); + { u32 x114 = ((u32)x112 & 0x1ffffff); + { u64 x115 = (x87 + (0x13 * x113)); + { u32 x116 = (u32) (x115 >> 0x1a); + { u32 x117 = ((u32)x115 & 0x3ffffff); + { u32 x118 = (x116 + x90); + { u32 x119 = (x118 >> 0x19); + { u32 x120 = (x118 & 0x1ffffff); + out[0] = x117; + out[1] = x120; + out[2] = (x119 + x93); + out[3] = x96; + out[4] = x99; + out[5] = x102; + out[6] = x105; + out[7] = x108; + out[8] = x111; + out[9] = x114; + }}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}} +} + +static __always_inline void fe_mul_ttt(fe *h, const fe *f, const fe *g) +{ + fe_mul_impl(h->v, f->v, g->v); +} + +static __always_inline void fe_mul_tlt(fe *h, const fe_loose *f, const fe *g) +{ + fe_mul_impl(h->v, f->v, g->v); +} + +static __always_inline void +fe_mul_tll(fe *h, const fe_loose *f, const fe_loose *g) +{ + fe_mul_impl(h->v, f->v, g->v); +} + +static noinline void fe_sqr_impl(u32 out[10], const u32 in1[10]) +{ + { const u32 x17 = in1[9]; + { const u32 x18 = in1[8]; + { const u32 x16 = in1[7]; + { const u32 x14 = in1[6]; + { const u32 x12 = in1[5]; + { const u32 x10 = in1[4]; + { const u32 x8 = in1[3]; + { const u32 x6 = in1[2]; + { const u32 x4 = in1[1]; + { const u32 x2 = in1[0]; + { u64 x19 = ((u64)x2 * x2); + { u64 x20 = ((u64)(0x2 * x2) * x4); + { u64 x21 = (0x2 * (((u64)x4 * x4) + ((u64)x2 * x6))); + { u64 x22 = (0x2 * (((u64)x4 * x6) + ((u64)x2 * x8))); + { u64 x23 = ((((u64)x6 * x6) + ((u64)(0x4 * x4) * x8)) + ((u64)(0x2 * x2) * x10)); + { u64 x24 = (0x2 * ((((u64)x6 * x8) + ((u64)x4 * x10)) + ((u64)x2 * x12))); + { u64 x25 = (0x2 * (((((u64)x8 * x8) + ((u64)x6 * x10)) + ((u64)x2 * x14)) + ((u64)(0x2 * x4) * x12))); + { u64 x26 = (0x2 * (((((u64)x8 * x10) + ((u64)x6 * x12)) + ((u64)x4 * x14)) + ((u64)x2 * x16))); + { u64 x27 = (((u64)x10 * x10) + (0x2 * ((((u64)x6 * x14) + ((u64)x2 * x18)) + (0x2 * (((u64)x4 * x16) + ((u64)x8 * x12)))))); + { u64 x28 = (0x2 * ((((((u64)x10 * x12) + ((u64)x8 * x14)) + ((u64)x6 * x16)) + ((u64)x4 * x18)) + ((u64)x2 * x17))); + { u64 x29 = (0x2 * (((((u64)x12 * x12) + ((u64)x10 * x14)) + ((u64)x6 * x18)) + (0x2 * (((u64)x8 * x16) + ((u64)x4 * x17))))); + { u64 x30 = (0x2 * (((((u64)x12 * x14) + ((u64)x10 * x16)) + ((u64)x8 * x18)) + ((u64)x6 * x17))); + { u64 x31 = (((u64)x14 * x14) + (0x2 * (((u64)x10 * x18) + (0x2 * (((u64)x12 * x16) + ((u64)x8 * x17)))))); + { u64 x32 = (0x2 * ((((u64)x14 * x16) + ((u64)x12 * x18)) + ((u64)x10 * x17))); + { u64 x33 = (0x2 * ((((u64)x16 * x16) + ((u64)x14 * x18)) + ((u64)(0x2 * x12) * x17))); + { u64 x34 = (0x2 * (((u64)x16 * x18) + ((u64)x14 * x17))); + { u64 x35 = (((u64)x18 * x18) + ((u64)(0x4 * x16) * x17)); + { u64 x36 = ((u64)(0x2 * x18) * x17); + { u64 x37 = ((u64)(0x2 * x17) * x17); + { u64 x38 = (x27 + (x37 << 0x4)); + { u64 x39 = (x38 + (x37 << 0x1)); + { u64 x40 = (x39 + x37); + { u64 x41 = (x26 + (x36 << 0x4)); + { u64 x42 = (x41 + (x36 << 0x1)); + { u64 x43 = (x42 + x36); + { u64 x44 = (x25 + (x35 << 0x4)); + { u64 x45 = (x44 + (x35 << 0x1)); + { u64 x46 = (x45 + x35); + { u64 x47 = (x24 + (x34 << 0x4)); + { u64 x48 = (x47 + (x34 << 0x1)); + { u64 x49 = (x48 + x34); + { u64 x50 = (x23 + (x33 << 0x4)); + { u64 x51 = (x50 + (x33 << 0x1)); + { u64 x52 = (x51 + x33); + { u64 x53 = (x22 + (x32 << 0x4)); + { u64 x54 = (x53 + (x32 << 0x1)); + { u64 x55 = (x54 + x32); + { u64 x56 = (x21 + (x31 << 0x4)); + { u64 x57 = (x56 + (x31 << 0x1)); + { u64 x58 = (x57 + x31); + { u64 x59 = (x20 + (x30 << 0x4)); + { u64 x60 = (x59 + (x30 << 0x1)); + { u64 x61 = (x60 + x30); + { u64 x62 = (x19 + (x29 << 0x4)); + { u64 x63 = (x62 + (x29 << 0x1)); + { u64 x64 = (x63 + x29); + { u64 x65 = (x64 >> 0x1a); + { u32 x66 = ((u32)x64 & 0x3ffffff); + { u64 x67 = (x65 + x61); + { u64 x68 = (x67 >> 0x19); + { u32 x69 = ((u32)x67 & 0x1ffffff); + { u64 x70 = (x68 + x58); + { u64 x71 = (x70 >> 0x1a); + { u32 x72 = ((u32)x70 & 0x3ffffff); + { u64 x73 = (x71 + x55); + { u64 x74 = (x73 >> 0x19); + { u32 x75 = ((u32)x73 & 0x1ffffff); + { u64 x76 = (x74 + x52); + { u64 x77 = (x76 >> 0x1a); + { u32 x78 = ((u32)x76 & 0x3ffffff); + { u64 x79 = (x77 + x49); + { u64 x80 = (x79 >> 0x19); + { u32 x81 = ((u32)x79 & 0x1ffffff); + { u64 x82 = (x80 + x46); + { u64 x83 = (x82 >> 0x1a); + { u32 x84 = ((u32)x82 & 0x3ffffff); + { u64 x85 = (x83 + x43); + { u64 x86 = (x85 >> 0x19); + { u32 x87 = ((u32)x85 & 0x1ffffff); + { u64 x88 = (x86 + x40); + { u64 x89 = (x88 >> 0x1a); + { u32 x90 = ((u32)x88 & 0x3ffffff); + { u64 x91 = (x89 + x28); + { u64 x92 = (x91 >> 0x19); + { u32 x93 = ((u32)x91 & 0x1ffffff); + { u64 x94 = (x66 + (0x13 * x92)); + { u32 x95 = (u32) (x94 >> 0x1a); + { u32 x96 = ((u32)x94 & 0x3ffffff); + { u32 x97 = (x95 + x69); + { u32 x98 = (x97 >> 0x19); + { u32 x99 = (x97 & 0x1ffffff); + out[0] = x96; + out[1] = x99; + out[2] = (x98 + x72); + out[3] = x75; + out[4] = x78; + out[5] = x81; + out[6] = x84; + out[7] = x87; + out[8] = x90; + out[9] = x93; + }}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}} +} + +static __always_inline void fe_sq_tl(fe *h, const fe_loose *f) +{ + fe_sqr_impl(h->v, f->v); +} + +static __always_inline void fe_sq_tt(fe *h, const fe *f) +{ + fe_sqr_impl(h->v, f->v); +} + +static __always_inline void fe_loose_invert(fe *out, const fe_loose *z) +{ + fe t0; + fe t1; + fe t2; + fe t3; + int i; + + fe_sq_tl(&t0, z); + fe_sq_tt(&t1, &t0); + for (i = 1; i < 2; ++i) + fe_sq_tt(&t1, &t1); + fe_mul_tlt(&t1, z, &t1); + fe_mul_ttt(&t0, &t0, &t1); + fe_sq_tt(&t2, &t0); + fe_mul_ttt(&t1, &t1, &t2); + fe_sq_tt(&t2, &t1); + for (i = 1; i < 5; ++i) + fe_sq_tt(&t2, &t2); + fe_mul_ttt(&t1, &t2, &t1); + fe_sq_tt(&t2, &t1); + for (i = 1; i < 10; ++i) + fe_sq_tt(&t2, &t2); + fe_mul_ttt(&t2, &t2, &t1); + fe_sq_tt(&t3, &t2); + for (i = 1; i < 20; ++i) + fe_sq_tt(&t3, &t3); + fe_mul_ttt(&t2, &t3, &t2); + fe_sq_tt(&t2, &t2); + for (i = 1; i < 10; ++i) + fe_sq_tt(&t2, &t2); + fe_mul_ttt(&t1, &t2, &t1); + fe_sq_tt(&t2, &t1); + for (i = 1; i < 50; ++i) + fe_sq_tt(&t2, &t2); + fe_mul_ttt(&t2, &t2, &t1); + fe_sq_tt(&t3, &t2); + for (i = 1; i < 100; ++i) + fe_sq_tt(&t3, &t3); + fe_mul_ttt(&t2, &t3, &t2); + fe_sq_tt(&t2, &t2); + for (i = 1; i < 50; ++i) + fe_sq_tt(&t2, &t2); + fe_mul_ttt(&t1, &t2, &t1); + fe_sq_tt(&t1, &t1); + for (i = 1; i < 5; ++i) + fe_sq_tt(&t1, &t1); + fe_mul_ttt(out, &t1, &t0); +} + +static __always_inline void fe_invert(fe *out, const fe *z) +{ + fe_loose l; + fe_copy_lt(&l, z); + fe_loose_invert(out, &l); +} + +/* Replace (f,g) with (g,f) if b == 1; + * replace (f,g) with (f,g) if b == 0. + * + * Preconditions: b in {0,1} + */ +static noinline void fe_cswap(fe *f, fe *g, unsigned int b) +{ + unsigned i; + b = 0 - b; + for (i = 0; i < 10; i++) { + u32 x = f->v[i] ^ g->v[i]; + x &= b; + f->v[i] ^= x; + g->v[i] ^= x; + } +} + +/* NOTE: based on fiat-crypto fe_mul, edited for in2=121666, 0, 0.*/ +static __always_inline void fe_mul_121666_impl(u32 out[10], const u32 in1[10]) +{ + { const u32 x20 = in1[9]; + { const u32 x21 = in1[8]; + { const u32 x19 = in1[7]; + { const u32 x17 = in1[6]; + { const u32 x15 = in1[5]; + { const u32 x13 = in1[4]; + { const u32 x11 = in1[3]; + { const u32 x9 = in1[2]; + { const u32 x7 = in1[1]; + { const u32 x5 = in1[0]; + { const u32 x38 = 0; + { const u32 x39 = 0; + { const u32 x37 = 0; + { const u32 x35 = 0; + { const u32 x33 = 0; + { const u32 x31 = 0; + { const u32 x29 = 0; + { const u32 x27 = 0; + { const u32 x25 = 0; + { const u32 x23 = 121666; + { u64 x40 = ((u64)x23 * x5); + { u64 x41 = (((u64)x23 * x7) + ((u64)x25 * x5)); + { u64 x42 = ((((u64)(0x2 * x25) * x7) + ((u64)x23 * x9)) + ((u64)x27 * x5)); + { u64 x43 = (((((u64)x25 * x9) + ((u64)x27 * x7)) + ((u64)x23 * x11)) + ((u64)x29 * x5)); + { u64 x44 = (((((u64)x27 * x9) + (0x2 * (((u64)x25 * x11) + ((u64)x29 * x7)))) + ((u64)x23 * x13)) + ((u64)x31 * x5)); + { u64 x45 = (((((((u64)x27 * x11) + ((u64)x29 * x9)) + ((u64)x25 * x13)) + ((u64)x31 * x7)) + ((u64)x23 * x15)) + ((u64)x33 * x5)); + { u64 x46 = (((((0x2 * ((((u64)x29 * x11) + ((u64)x25 * x15)) + ((u64)x33 * x7))) + ((u64)x27 * x13)) + ((u64)x31 * x9)) + ((u64)x23 * x17)) + ((u64)x35 * x5)); + { u64 x47 = (((((((((u64)x29 * x13) + ((u64)x31 * x11)) + ((u64)x27 * x15)) + ((u64)x33 * x9)) + ((u64)x25 * x17)) + ((u64)x35 * x7)) + ((u64)x23 * x19)) + ((u64)x37 * x5)); + { u64 x48 = (((((((u64)x31 * x13) + (0x2 * (((((u64)x29 * x15) + ((u64)x33 * x11)) + ((u64)x25 * x19)) + ((u64)x37 * x7)))) + ((u64)x27 * x17)) + ((u64)x35 * x9)) + ((u64)x23 * x21)) + ((u64)x39 * x5)); + { u64 x49 = (((((((((((u64)x31 * x15) + ((u64)x33 * x13)) + ((u64)x29 * x17)) + ((u64)x35 * x11)) + ((u64)x27 * x19)) + ((u64)x37 * x9)) + ((u64)x25 * x21)) + ((u64)x39 * x7)) + ((u64)x23 * x20)) + ((u64)x38 * x5)); + { u64 x50 = (((((0x2 * ((((((u64)x33 * x15) + ((u64)x29 * x19)) + ((u64)x37 * x11)) + ((u64)x25 * x20)) + ((u64)x38 * x7))) + ((u64)x31 * x17)) + ((u64)x35 * x13)) + ((u64)x27 * x21)) + ((u64)x39 * x9)); + { u64 x51 = (((((((((u64)x33 * x17) + ((u64)x35 * x15)) + ((u64)x31 * x19)) + ((u64)x37 * x13)) + ((u64)x29 * x21)) + ((u64)x39 * x11)) + ((u64)x27 * x20)) + ((u64)x38 * x9)); + { u64 x52 = (((((u64)x35 * x17) + (0x2 * (((((u64)x33 * x19) + ((u64)x37 * x15)) + ((u64)x29 * x20)) + ((u64)x38 * x11)))) + ((u64)x31 * x21)) + ((u64)x39 * x13)); + { u64 x53 = (((((((u64)x35 * x19) + ((u64)x37 * x17)) + ((u64)x33 * x21)) + ((u64)x39 * x15)) + ((u64)x31 * x20)) + ((u64)x38 * x13)); + { u64 x54 = (((0x2 * ((((u64)x37 * x19) + ((u64)x33 * x20)) + ((u64)x38 * x15))) + ((u64)x35 * x21)) + ((u64)x39 * x17)); + { u64 x55 = (((((u64)x37 * x21) + ((u64)x39 * x19)) + ((u64)x35 * x20)) + ((u64)x38 * x17)); + { u64 x56 = (((u64)x39 * x21) + (0x2 * (((u64)x37 * x20) + ((u64)x38 * x19)))); + { u64 x57 = (((u64)x39 * x20) + ((u64)x38 * x21)); + { u64 x58 = ((u64)(0x2 * x38) * x20); + { u64 x59 = (x48 + (x58 << 0x4)); + { u64 x60 = (x59 + (x58 << 0x1)); + { u64 x61 = (x60 + x58); + { u64 x62 = (x47 + (x57 << 0x4)); + { u64 x63 = (x62 + (x57 << 0x1)); + { u64 x64 = (x63 + x57); + { u64 x65 = (x46 + (x56 << 0x4)); + { u64 x66 = (x65 + (x56 << 0x1)); + { u64 x67 = (x66 + x56); + { u64 x68 = (x45 + (x55 << 0x4)); + { u64 x69 = (x68 + (x55 << 0x1)); + { u64 x70 = (x69 + x55); + { u64 x71 = (x44 + (x54 << 0x4)); + { u64 x72 = (x71 + (x54 << 0x1)); + { u64 x73 = (x72 + x54); + { u64 x74 = (x43 + (x53 << 0x4)); + { u64 x75 = (x74 + (x53 << 0x1)); + { u64 x76 = (x75 + x53); + { u64 x77 = (x42 + (x52 << 0x4)); + { u64 x78 = (x77 + (x52 << 0x1)); + { u64 x79 = (x78 + x52); + { u64 x80 = (x41 + (x51 << 0x4)); + { u64 x81 = (x80 + (x51 << 0x1)); + { u64 x82 = (x81 + x51); + { u64 x83 = (x40 + (x50 << 0x4)); + { u64 x84 = (x83 + (x50 << 0x1)); + { u64 x85 = (x84 + x50); + { u64 x86 = (x85 >> 0x1a); + { u32 x87 = ((u32)x85 & 0x3ffffff); + { u64 x88 = (x86 + x82); + { u64 x89 = (x88 >> 0x19); + { u32 x90 = ((u32)x88 & 0x1ffffff); + { u64 x91 = (x89 + x79); + { u64 x92 = (x91 >> 0x1a); + { u32 x93 = ((u32)x91 & 0x3ffffff); + { u64 x94 = (x92 + x76); + { u64 x95 = (x94 >> 0x19); + { u32 x96 = ((u32)x94 & 0x1ffffff); + { u64 x97 = (x95 + x73); + { u64 x98 = (x97 >> 0x1a); + { u32 x99 = ((u32)x97 & 0x3ffffff); + { u64 x100 = (x98 + x70); + { u64 x101 = (x100 >> 0x19); + { u32 x102 = ((u32)x100 & 0x1ffffff); + { u64 x103 = (x101 + x67); + { u64 x104 = (x103 >> 0x1a); + { u32 x105 = ((u32)x103 & 0x3ffffff); + { u64 x106 = (x104 + x64); + { u64 x107 = (x106 >> 0x19); + { u32 x108 = ((u32)x106 & 0x1ffffff); + { u64 x109 = (x107 + x61); + { u64 x110 = (x109 >> 0x1a); + { u32 x111 = ((u32)x109 & 0x3ffffff); + { u64 x112 = (x110 + x49); + { u64 x113 = (x112 >> 0x19); + { u32 x114 = ((u32)x112 & 0x1ffffff); + { u64 x115 = (x87 + (0x13 * x113)); + { u32 x116 = (u32) (x115 >> 0x1a); + { u32 x117 = ((u32)x115 & 0x3ffffff); + { u32 x118 = (x116 + x90); + { u32 x119 = (x118 >> 0x19); + { u32 x120 = (x118 & 0x1ffffff); + out[0] = x117; + out[1] = x120; + out[2] = (x119 + x93); + out[3] = x96; + out[4] = x99; + out[5] = x102; + out[6] = x105; + out[7] = x108; + out[8] = x111; + out[9] = x114; + }}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}} +} + +static __always_inline void fe_mul121666(fe *h, const fe_loose *f) +{ + fe_mul_121666_impl(h->v, f->v); +} + +void curve25519_generic(u8 out[CURVE25519_KEY_SIZE], + const u8 scalar[CURVE25519_KEY_SIZE], + const u8 point[CURVE25519_KEY_SIZE]) +{ + fe x1, x2, z2, x3, z3; + fe_loose x2l, z2l, x3l; + unsigned swap = 0; + int pos; + u8 e[32]; + + memcpy(e, scalar, 32); + curve25519_clamp_secret(e); + + /* The following implementation was transcribed to Coq and proven to + * correspond to unary scalar multiplication in affine coordinates given + * that x1 != 0 is the x coordinate of some point on the curve. It was + * also checked in Coq that doing a ladderstep with x1 = x3 = 0 gives + * z2' = z3' = 0, and z2 = z3 = 0 gives z2' = z3' = 0. The statement was + * quantified over the underlying field, so it applies to Curve25519 + * itself and the quadratic twist of Curve25519. It was not proven in + * Coq that prime-field arithmetic correctly simulates extension-field + * arithmetic on prime-field values. The decoding of the byte array + * representation of e was not considered. + * + * Specification of Montgomery curves in affine coordinates: + * + * + * Proof that these form a group that is isomorphic to a Weierstrass + * curve: + * + * + * Coq transcription and correctness proof of the loop + * (where scalarbits=255): + * + * + * preconditions: 0 <= e < 2^255 (not necessarily e < order), + * fe_invert(0) = 0 + */ + fe_frombytes(&x1, point); + fe_1(&x2); + fe_0(&z2); + fe_copy(&x3, &x1); + fe_1(&z3); + + for (pos = 254; pos >= 0; --pos) { + fe tmp0, tmp1; + fe_loose tmp0l, tmp1l; + /* loop invariant as of right before the test, for the case + * where x1 != 0: + * pos >= -1; if z2 = 0 then x2 is nonzero; if z3 = 0 then x3 + * is nonzero + * let r := e >> (pos+1) in the following equalities of + * projective points: + * to_xz (r*P) === if swap then (x3, z3) else (x2, z2) + * to_xz ((r+1)*P) === if swap then (x2, z2) else (x3, z3) + * x1 is the nonzero x coordinate of the nonzero + * point (r*P-(r+1)*P) + */ + unsigned b = 1 & (e[pos / 8] >> (pos & 7)); + swap ^= b; + fe_cswap(&x2, &x3, swap); + fe_cswap(&z2, &z3, swap); + swap = b; + /* Coq transcription of ladderstep formula (called from + * transcribed loop): + * + * + * x1 != 0 + * x1 = 0 + */ + fe_sub(&tmp0l, &x3, &z3); + fe_sub(&tmp1l, &x2, &z2); + fe_add(&x2l, &x2, &z2); + fe_add(&z2l, &x3, &z3); + fe_mul_tll(&z3, &tmp0l, &x2l); + fe_mul_tll(&z2, &z2l, &tmp1l); + fe_sq_tl(&tmp0, &tmp1l); + fe_sq_tl(&tmp1, &x2l); + fe_add(&x3l, &z3, &z2); + fe_sub(&z2l, &z3, &z2); + fe_mul_ttt(&x2, &tmp1, &tmp0); + fe_sub(&tmp1l, &tmp1, &tmp0); + fe_sq_tl(&z2, &z2l); + fe_mul121666(&z3, &tmp1l); + fe_sq_tl(&x3, &x3l); + fe_add(&tmp0l, &tmp0, &z3); + fe_mul_ttt(&z3, &x1, &z2); + fe_mul_tll(&z2, &tmp1l, &tmp0l); + } + /* here pos=-1, so r=e, so to_xz (e*P) === if swap then (x3, z3) + * else (x2, z2) + */ + fe_cswap(&x2, &x3, swap); + fe_cswap(&z2, &z3, swap); + + fe_invert(&z2, &z2); + fe_mul_ttt(&x2, &x2, &z2); + fe_tobytes(out, &x2); + + memzero_explicit(&x1, sizeof(x1)); + memzero_explicit(&x2, sizeof(x2)); + memzero_explicit(&z2, sizeof(z2)); + memzero_explicit(&x3, sizeof(x3)); + memzero_explicit(&z3, sizeof(z3)); + memzero_explicit(&x2l, sizeof(x2l)); + memzero_explicit(&z2l, sizeof(z2l)); + memzero_explicit(&x3l, sizeof(x3l)); + memzero_explicit(&e, sizeof(e)); +} diff --git a/lib/crypto/curve25519-generic.c b/lib/crypto/curve25519-generic.c new file mode 100644 index 000000000000..de7c99172fa2 --- /dev/null +++ b/lib/crypto/curve25519-generic.c @@ -0,0 +1,24 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + * + * This is an implementation of the Curve25519 ECDH algorithm, using either + * a 32-bit implementation or a 64-bit implementation with 128-bit integers, + * depending on what is supported by the target compiler. + * + * Information: https://cr.yp.to/ecdh.html + */ + +#include +#include + +const u8 curve25519_null_point[CURVE25519_KEY_SIZE] __aligned(32) = { 0 }; +const u8 curve25519_base_point[CURVE25519_KEY_SIZE] __aligned(32) = { 9 }; + +EXPORT_SYMBOL(curve25519_null_point); +EXPORT_SYMBOL(curve25519_base_point); +EXPORT_SYMBOL(curve25519_generic); + +MODULE_LICENSE("GPL v2"); +MODULE_DESCRIPTION("Curve25519 scalar multiplication"); +MODULE_AUTHOR("Jason A. Donenfeld "); diff --git a/lib/crypto/curve25519-hacl64.c b/lib/crypto/curve25519-hacl64.c new file mode 100644 index 000000000000..771d82dc5f14 --- /dev/null +++ b/lib/crypto/curve25519-hacl64.c @@ -0,0 +1,788 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT +/* + * Copyright (C) 2016-2017 INRIA and Microsoft Corporation. + * Copyright (C) 2018-2019 Jason A. Donenfeld . All Rights Reserved. + * + * This is a machine-generated formally verified implementation of Curve25519 + * ECDH from: . Though originally machine + * generated, it has been tweaked to be suitable for use in the kernel. It is + * optimized for 64-bit machines that can efficiently work with 128-bit + * integer types. + */ + +#include +#include +#include + +typedef __uint128_t u128; + +static __always_inline u64 u64_eq_mask(u64 a, u64 b) +{ + u64 x = a ^ b; + u64 minus_x = ~x + (u64)1U; + u64 x_or_minus_x = x | minus_x; + u64 xnx = x_or_minus_x >> (u32)63U; + u64 c = xnx - (u64)1U; + return c; +} + +static __always_inline u64 u64_gte_mask(u64 a, u64 b) +{ + u64 x = a; + u64 y = b; + u64 x_xor_y = x ^ y; + u64 x_sub_y = x - y; + u64 x_sub_y_xor_y = x_sub_y ^ y; + u64 q = x_xor_y | x_sub_y_xor_y; + u64 x_xor_q = x ^ q; + u64 x_xor_q_ = x_xor_q >> (u32)63U; + u64 c = x_xor_q_ - (u64)1U; + return c; +} + +static __always_inline void modulo_carry_top(u64 *b) +{ + u64 b4 = b[4]; + u64 b0 = b[0]; + u64 b4_ = b4 & 0x7ffffffffffffLLU; + u64 b0_ = b0 + 19 * (b4 >> 51); + b[4] = b4_; + b[0] = b0_; +} + +static __always_inline void fproduct_copy_from_wide_(u64 *output, u128 *input) +{ + { + u128 xi = input[0]; + output[0] = ((u64)(xi)); + } + { + u128 xi = input[1]; + output[1] = ((u64)(xi)); + } + { + u128 xi = input[2]; + output[2] = ((u64)(xi)); + } + { + u128 xi = input[3]; + output[3] = ((u64)(xi)); + } + { + u128 xi = input[4]; + output[4] = ((u64)(xi)); + } +} + +static __always_inline void +fproduct_sum_scalar_multiplication_(u128 *output, u64 *input, u64 s) +{ + output[0] += (u128)input[0] * s; + output[1] += (u128)input[1] * s; + output[2] += (u128)input[2] * s; + output[3] += (u128)input[3] * s; + output[4] += (u128)input[4] * s; +} + +static __always_inline void fproduct_carry_wide_(u128 *tmp) +{ + { + u32 ctr = 0; + u128 tctr = tmp[ctr]; + u128 tctrp1 = tmp[ctr + 1]; + u64 r0 = ((u64)(tctr)) & 0x7ffffffffffffLLU; + u128 c = ((tctr) >> (51)); + tmp[ctr] = ((u128)(r0)); + tmp[ctr + 1] = ((tctrp1) + (c)); + } + { + u32 ctr = 1; + u128 tctr = tmp[ctr]; + u128 tctrp1 = tmp[ctr + 1]; + u64 r0 = ((u64)(tctr)) & 0x7ffffffffffffLLU; + u128 c = ((tctr) >> (51)); + tmp[ctr] = ((u128)(r0)); + tmp[ctr + 1] = ((tctrp1) + (c)); + } + + { + u32 ctr = 2; + u128 tctr = tmp[ctr]; + u128 tctrp1 = tmp[ctr + 1]; + u64 r0 = ((u64)(tctr)) & 0x7ffffffffffffLLU; + u128 c = ((tctr) >> (51)); + tmp[ctr] = ((u128)(r0)); + tmp[ctr + 1] = ((tctrp1) + (c)); + } + { + u32 ctr = 3; + u128 tctr = tmp[ctr]; + u128 tctrp1 = tmp[ctr + 1]; + u64 r0 = ((u64)(tctr)) & 0x7ffffffffffffLLU; + u128 c = ((tctr) >> (51)); + tmp[ctr] = ((u128)(r0)); + tmp[ctr + 1] = ((tctrp1) + (c)); + } +} + +static __always_inline void fmul_shift_reduce(u64 *output) +{ + u64 tmp = output[4]; + u64 b0; + { + u32 ctr = 5 - 0 - 1; + u64 z = output[ctr - 1]; + output[ctr] = z; + } + { + u32 ctr = 5 - 1 - 1; + u64 z = output[ctr - 1]; + output[ctr] = z; + } + { + u32 ctr = 5 - 2 - 1; + u64 z = output[ctr - 1]; + output[ctr] = z; + } + { + u32 ctr = 5 - 3 - 1; + u64 z = output[ctr - 1]; + output[ctr] = z; + } + output[0] = tmp; + b0 = output[0]; + output[0] = 19 * b0; +} + +static __always_inline void fmul_mul_shift_reduce_(u128 *output, u64 *input, + u64 *input21) +{ + u32 i; + u64 input2i; + { + u64 input2i = input21[0]; + fproduct_sum_scalar_multiplication_(output, input, input2i); + fmul_shift_reduce(input); + } + { + u64 input2i = input21[1]; + fproduct_sum_scalar_multiplication_(output, input, input2i); + fmul_shift_reduce(input); + } + { + u64 input2i = input21[2]; + fproduct_sum_scalar_multiplication_(output, input, input2i); + fmul_shift_reduce(input); + } + { + u64 input2i = input21[3]; + fproduct_sum_scalar_multiplication_(output, input, input2i); + fmul_shift_reduce(input); + } + i = 4; + input2i = input21[i]; + fproduct_sum_scalar_multiplication_(output, input, input2i); +} + +static __always_inline void fmul_fmul(u64 *output, u64 *input, u64 *input21) +{ + u64 tmp[5] = { input[0], input[1], input[2], input[3], input[4] }; + { + u128 b4; + u128 b0; + u128 b4_; + u128 b0_; + u64 i0; + u64 i1; + u64 i0_; + u64 i1_; + u128 t[5] = { 0 }; + fmul_mul_shift_reduce_(t, tmp, input21); + fproduct_carry_wide_(t); + b4 = t[4]; + b0 = t[0]; + b4_ = ((b4) & (((u128)(0x7ffffffffffffLLU)))); + b0_ = ((b0) + (((u128)(19) * (((u64)(((b4) >> (51)))))))); + t[4] = b4_; + t[0] = b0_; + fproduct_copy_from_wide_(output, t); + i0 = output[0]; + i1 = output[1]; + i0_ = i0 & 0x7ffffffffffffLLU; + i1_ = i1 + (i0 >> 51); + output[0] = i0_; + output[1] = i1_; + } +} + +static __always_inline void fsquare_fsquare__(u128 *tmp, u64 *output) +{ + u64 r0 = output[0]; + u64 r1 = output[1]; + u64 r2 = output[2]; + u64 r3 = output[3]; + u64 r4 = output[4]; + u64 d0 = r0 * 2; + u64 d1 = r1 * 2; + u64 d2 = r2 * 2 * 19; + u64 d419 = r4 * 19; + u64 d4 = d419 * 2; + u128 s0 = ((((((u128)(r0) * (r0))) + (((u128)(d4) * (r1))))) + + (((u128)(d2) * (r3)))); + u128 s1 = ((((((u128)(d0) * (r1))) + (((u128)(d4) * (r2))))) + + (((u128)(r3 * 19) * (r3)))); + u128 s2 = ((((((u128)(d0) * (r2))) + (((u128)(r1) * (r1))))) + + (((u128)(d4) * (r3)))); + u128 s3 = ((((((u128)(d0) * (r3))) + (((u128)(d1) * (r2))))) + + (((u128)(r4) * (d419)))); + u128 s4 = ((((((u128)(d0) * (r4))) + (((u128)(d1) * (r3))))) + + (((u128)(r2) * (r2)))); + tmp[0] = s0; + tmp[1] = s1; + tmp[2] = s2; + tmp[3] = s3; + tmp[4] = s4; +} + +static __always_inline void fsquare_fsquare_(u128 *tmp, u64 *output) +{ + u128 b4; + u128 b0; + u128 b4_; + u128 b0_; + u64 i0; + u64 i1; + u64 i0_; + u64 i1_; + fsquare_fsquare__(tmp, output); + fproduct_carry_wide_(tmp); + b4 = tmp[4]; + b0 = tmp[0]; + b4_ = ((b4) & (((u128)(0x7ffffffffffffLLU)))); + b0_ = ((b0) + (((u128)(19) * (((u64)(((b4) >> (51)))))))); + tmp[4] = b4_; + tmp[0] = b0_; + fproduct_copy_from_wide_(output, tmp); + i0 = output[0]; + i1 = output[1]; + i0_ = i0 & 0x7ffffffffffffLLU; + i1_ = i1 + (i0 >> 51); + output[0] = i0_; + output[1] = i1_; +} + +static __always_inline void fsquare_fsquare_times_(u64 *output, u128 *tmp, + u32 count1) +{ + u32 i; + fsquare_fsquare_(tmp, output); + for (i = 1; i < count1; ++i) + fsquare_fsquare_(tmp, output); +} + +static __always_inline void fsquare_fsquare_times(u64 *output, u64 *input, + u32 count1) +{ + u128 t[5]; + memcpy(output, input, 5 * sizeof(*input)); + fsquare_fsquare_times_(output, t, count1); +} + +static __always_inline void fsquare_fsquare_times_inplace(u64 *output, + u32 count1) +{ + u128 t[5]; + fsquare_fsquare_times_(output, t, count1); +} + +static __always_inline void crecip_crecip(u64 *out, u64 *z) +{ + u64 buf[20] = { 0 }; + u64 *a0 = buf; + u64 *t00 = buf + 5; + u64 *b0 = buf + 10; + u64 *t01; + u64 *b1; + u64 *c0; + u64 *a; + u64 *t0; + u64 *b; + u64 *c; + fsquare_fsquare_times(a0, z, 1); + fsquare_fsquare_times(t00, a0, 2); + fmul_fmul(b0, t00, z); + fmul_fmul(a0, b0, a0); + fsquare_fsquare_times(t00, a0, 1); + fmul_fmul(b0, t00, b0); + fsquare_fsquare_times(t00, b0, 5); + t01 = buf + 5; + b1 = buf + 10; + c0 = buf + 15; + fmul_fmul(b1, t01, b1); + fsquare_fsquare_times(t01, b1, 10); + fmul_fmul(c0, t01, b1); + fsquare_fsquare_times(t01, c0, 20); + fmul_fmul(t01, t01, c0); + fsquare_fsquare_times_inplace(t01, 10); + fmul_fmul(b1, t01, b1); + fsquare_fsquare_times(t01, b1, 50); + a = buf; + t0 = buf + 5; + b = buf + 10; + c = buf + 15; + fmul_fmul(c, t0, b); + fsquare_fsquare_times(t0, c, 100); + fmul_fmul(t0, t0, c); + fsquare_fsquare_times_inplace(t0, 50); + fmul_fmul(t0, t0, b); + fsquare_fsquare_times_inplace(t0, 5); + fmul_fmul(out, t0, a); +} + +static __always_inline void fsum(u64 *a, u64 *b) +{ + a[0] += b[0]; + a[1] += b[1]; + a[2] += b[2]; + a[3] += b[3]; + a[4] += b[4]; +} + +static __always_inline void fdifference(u64 *a, u64 *b) +{ + u64 tmp[5] = { 0 }; + u64 b0; + u64 b1; + u64 b2; + u64 b3; + u64 b4; + memcpy(tmp, b, 5 * sizeof(*b)); + b0 = tmp[0]; + b1 = tmp[1]; + b2 = tmp[2]; + b3 = tmp[3]; + b4 = tmp[4]; + tmp[0] = b0 + 0x3fffffffffff68LLU; + tmp[1] = b1 + 0x3ffffffffffff8LLU; + tmp[2] = b2 + 0x3ffffffffffff8LLU; + tmp[3] = b3 + 0x3ffffffffffff8LLU; + tmp[4] = b4 + 0x3ffffffffffff8LLU; + { + u64 xi = a[0]; + u64 yi = tmp[0]; + a[0] = yi - xi; + } + { + u64 xi = a[1]; + u64 yi = tmp[1]; + a[1] = yi - xi; + } + { + u64 xi = a[2]; + u64 yi = tmp[2]; + a[2] = yi - xi; + } + { + u64 xi = a[3]; + u64 yi = tmp[3]; + a[3] = yi - xi; + } + { + u64 xi = a[4]; + u64 yi = tmp[4]; + a[4] = yi - xi; + } +} + +static __always_inline void fscalar(u64 *output, u64 *b, u64 s) +{ + u128 tmp[5]; + u128 b4; + u128 b0; + u128 b4_; + u128 b0_; + { + u64 xi = b[0]; + tmp[0] = ((u128)(xi) * (s)); + } + { + u64 xi = b[1]; + tmp[1] = ((u128)(xi) * (s)); + } + { + u64 xi = b[2]; + tmp[2] = ((u128)(xi) * (s)); + } + { + u64 xi = b[3]; + tmp[3] = ((u128)(xi) * (s)); + } + { + u64 xi = b[4]; + tmp[4] = ((u128)(xi) * (s)); + } + fproduct_carry_wide_(tmp); + b4 = tmp[4]; + b0 = tmp[0]; + b4_ = ((b4) & (((u128)(0x7ffffffffffffLLU)))); + b0_ = ((b0) + (((u128)(19) * (((u64)(((b4) >> (51)))))))); + tmp[4] = b4_; + tmp[0] = b0_; + fproduct_copy_from_wide_(output, tmp); +} + +static __always_inline void fmul(u64 *output, u64 *a, u64 *b) +{ + fmul_fmul(output, a, b); +} + +static __always_inline void crecip(u64 *output, u64 *input) +{ + crecip_crecip(output, input); +} + +static __always_inline void point_swap_conditional_step(u64 *a, u64 *b, + u64 swap1, u32 ctr) +{ + u32 i = ctr - 1; + u64 ai = a[i]; + u64 bi = b[i]; + u64 x = swap1 & (ai ^ bi); + u64 ai1 = ai ^ x; + u64 bi1 = bi ^ x; + a[i] = ai1; + b[i] = bi1; +} + +static __always_inline void point_swap_conditional5(u64 *a, u64 *b, u64 swap1) +{ + point_swap_conditional_step(a, b, swap1, 5); + point_swap_conditional_step(a, b, swap1, 4); + point_swap_conditional_step(a, b, swap1, 3); + point_swap_conditional_step(a, b, swap1, 2); + point_swap_conditional_step(a, b, swap1, 1); +} + +static __always_inline void point_swap_conditional(u64 *a, u64 *b, u64 iswap) +{ + u64 swap1 = 0 - iswap; + point_swap_conditional5(a, b, swap1); + point_swap_conditional5(a + 5, b + 5, swap1); +} + +static __always_inline void point_copy(u64 *output, u64 *input) +{ + memcpy(output, input, 5 * sizeof(*input)); + memcpy(output + 5, input + 5, 5 * sizeof(*input)); +} + +static __always_inline void addanddouble_fmonty(u64 *pp, u64 *ppq, u64 *p, + u64 *pq, u64 *qmqp) +{ + u64 *qx = qmqp; + u64 *x2 = pp; + u64 *z2 = pp + 5; + u64 *x3 = ppq; + u64 *z3 = ppq + 5; + u64 *x = p; + u64 *z = p + 5; + u64 *xprime = pq; + u64 *zprime = pq + 5; + u64 buf[40] = { 0 }; + u64 *origx = buf; + u64 *origxprime0 = buf + 5; + u64 *xxprime0; + u64 *zzprime0; + u64 *origxprime; + xxprime0 = buf + 25; + zzprime0 = buf + 30; + memcpy(origx, x, 5 * sizeof(*x)); + fsum(x, z); + fdifference(z, origx); + memcpy(origxprime0, xprime, 5 * sizeof(*xprime)); + fsum(xprime, zprime); + fdifference(zprime, origxprime0); + fmul(xxprime0, xprime, z); + fmul(zzprime0, x, zprime); + origxprime = buf + 5; + { + u64 *xx0; + u64 *zz0; + u64 *xxprime; + u64 *zzprime; + u64 *zzzprime; + xx0 = buf + 15; + zz0 = buf + 20; + xxprime = buf + 25; + zzprime = buf + 30; + zzzprime = buf + 35; + memcpy(origxprime, xxprime, 5 * sizeof(*xxprime)); + fsum(xxprime, zzprime); + fdifference(zzprime, origxprime); + fsquare_fsquare_times(x3, xxprime, 1); + fsquare_fsquare_times(zzzprime, zzprime, 1); + fmul(z3, zzzprime, qx); + fsquare_fsquare_times(xx0, x, 1); + fsquare_fsquare_times(zz0, z, 1); + { + u64 *zzz; + u64 *xx; + u64 *zz; + u64 scalar; + zzz = buf + 10; + xx = buf + 15; + zz = buf + 20; + fmul(x2, xx, zz); + fdifference(zz, xx); + scalar = 121665; + fscalar(zzz, zz, scalar); + fsum(zzz, xx); + fmul(z2, zzz, zz); + } + } +} + +static __always_inline void +ladder_smallloop_cmult_small_loop_step(u64 *nq, u64 *nqpq, u64 *nq2, u64 *nqpq2, + u64 *q, u8 byt) +{ + u64 bit0 = (u64)(byt >> 7); + u64 bit; + point_swap_conditional(nq, nqpq, bit0); + addanddouble_fmonty(nq2, nqpq2, nq, nqpq, q); + bit = (u64)(byt >> 7); + point_swap_conditional(nq2, nqpq2, bit); +} + +static __always_inline void +ladder_smallloop_cmult_small_loop_double_step(u64 *nq, u64 *nqpq, u64 *nq2, + u64 *nqpq2, u64 *q, u8 byt) +{ + u8 byt1; + ladder_smallloop_cmult_small_loop_step(nq, nqpq, nq2, nqpq2, q, byt); + byt1 = byt << 1; + ladder_smallloop_cmult_small_loop_step(nq2, nqpq2, nq, nqpq, q, byt1); +} + +static __always_inline void +ladder_smallloop_cmult_small_loop(u64 *nq, u64 *nqpq, u64 *nq2, u64 *nqpq2, + u64 *q, u8 byt, u32 i) +{ + while (i--) { + ladder_smallloop_cmult_small_loop_double_step(nq, nqpq, nq2, + nqpq2, q, byt); + byt <<= 2; + } +} + +static __always_inline void ladder_bigloop_cmult_big_loop(u8 *n1, u64 *nq, + u64 *nqpq, u64 *nq2, + u64 *nqpq2, u64 *q, + u32 i) +{ + while (i--) { + u8 byte = n1[i]; + ladder_smallloop_cmult_small_loop(nq, nqpq, nq2, nqpq2, q, + byte, 4); + } +} + +static void ladder_cmult(u64 *result, u8 *n1, u64 *q) +{ + u64 point_buf[40] = { 0 }; + u64 *nq = point_buf; + u64 *nqpq = point_buf + 10; + u64 *nq2 = point_buf + 20; + u64 *nqpq2 = point_buf + 30; + point_copy(nqpq, q); + nq[0] = 1; + ladder_bigloop_cmult_big_loop(n1, nq, nqpq, nq2, nqpq2, q, 32); + point_copy(result, nq); +} + +static __always_inline void format_fexpand(u64 *output, const u8 *input) +{ + const u8 *x00 = input + 6; + const u8 *x01 = input + 12; + const u8 *x02 = input + 19; + const u8 *x0 = input + 24; + u64 i0, i1, i2, i3, i4, output0, output1, output2, output3, output4; + i0 = get_unaligned_le64(input); + i1 = get_unaligned_le64(x00); + i2 = get_unaligned_le64(x01); + i3 = get_unaligned_le64(x02); + i4 = get_unaligned_le64(x0); + output0 = i0 & 0x7ffffffffffffLLU; + output1 = i1 >> 3 & 0x7ffffffffffffLLU; + output2 = i2 >> 6 & 0x7ffffffffffffLLU; + output3 = i3 >> 1 & 0x7ffffffffffffLLU; + output4 = i4 >> 12 & 0x7ffffffffffffLLU; + output[0] = output0; + output[1] = output1; + output[2] = output2; + output[3] = output3; + output[4] = output4; +} + +static __always_inline void format_fcontract_first_carry_pass(u64 *input) +{ + u64 t0 = input[0]; + u64 t1 = input[1]; + u64 t2 = input[2]; + u64 t3 = input[3]; + u64 t4 = input[4]; + u64 t1_ = t1 + (t0 >> 51); + u64 t0_ = t0 & 0x7ffffffffffffLLU; + u64 t2_ = t2 + (t1_ >> 51); + u64 t1__ = t1_ & 0x7ffffffffffffLLU; + u64 t3_ = t3 + (t2_ >> 51); + u64 t2__ = t2_ & 0x7ffffffffffffLLU; + u64 t4_ = t4 + (t3_ >> 51); + u64 t3__ = t3_ & 0x7ffffffffffffLLU; + input[0] = t0_; + input[1] = t1__; + input[2] = t2__; + input[3] = t3__; + input[4] = t4_; +} + +static __always_inline void format_fcontract_first_carry_full(u64 *input) +{ + format_fcontract_first_carry_pass(input); + modulo_carry_top(input); +} + +static __always_inline void format_fcontract_second_carry_pass(u64 *input) +{ + u64 t0 = input[0]; + u64 t1 = input[1]; + u64 t2 = input[2]; + u64 t3 = input[3]; + u64 t4 = input[4]; + u64 t1_ = t1 + (t0 >> 51); + u64 t0_ = t0 & 0x7ffffffffffffLLU; + u64 t2_ = t2 + (t1_ >> 51); + u64 t1__ = t1_ & 0x7ffffffffffffLLU; + u64 t3_ = t3 + (t2_ >> 51); + u64 t2__ = t2_ & 0x7ffffffffffffLLU; + u64 t4_ = t4 + (t3_ >> 51); + u64 t3__ = t3_ & 0x7ffffffffffffLLU; + input[0] = t0_; + input[1] = t1__; + input[2] = t2__; + input[3] = t3__; + input[4] = t4_; +} + +static __always_inline void format_fcontract_second_carry_full(u64 *input) +{ + u64 i0; + u64 i1; + u64 i0_; + u64 i1_; + format_fcontract_second_carry_pass(input); + modulo_carry_top(input); + i0 = input[0]; + i1 = input[1]; + i0_ = i0 & 0x7ffffffffffffLLU; + i1_ = i1 + (i0 >> 51); + input[0] = i0_; + input[1] = i1_; +} + +static __always_inline void format_fcontract_trim(u64 *input) +{ + u64 a0 = input[0]; + u64 a1 = input[1]; + u64 a2 = input[2]; + u64 a3 = input[3]; + u64 a4 = input[4]; + u64 mask0 = u64_gte_mask(a0, 0x7ffffffffffedLLU); + u64 mask1 = u64_eq_mask(a1, 0x7ffffffffffffLLU); + u64 mask2 = u64_eq_mask(a2, 0x7ffffffffffffLLU); + u64 mask3 = u64_eq_mask(a3, 0x7ffffffffffffLLU); + u64 mask4 = u64_eq_mask(a4, 0x7ffffffffffffLLU); + u64 mask = (((mask0 & mask1) & mask2) & mask3) & mask4; + u64 a0_ = a0 - (0x7ffffffffffedLLU & mask); + u64 a1_ = a1 - (0x7ffffffffffffLLU & mask); + u64 a2_ = a2 - (0x7ffffffffffffLLU & mask); + u64 a3_ = a3 - (0x7ffffffffffffLLU & mask); + u64 a4_ = a4 - (0x7ffffffffffffLLU & mask); + input[0] = a0_; + input[1] = a1_; + input[2] = a2_; + input[3] = a3_; + input[4] = a4_; +} + +static __always_inline void format_fcontract_store(u8 *output, u64 *input) +{ + u64 t0 = input[0]; + u64 t1 = input[1]; + u64 t2 = input[2]; + u64 t3 = input[3]; + u64 t4 = input[4]; + u64 o0 = t1 << 51 | t0; + u64 o1 = t2 << 38 | t1 >> 13; + u64 o2 = t3 << 25 | t2 >> 26; + u64 o3 = t4 << 12 | t3 >> 39; + u8 *b0 = output; + u8 *b1 = output + 8; + u8 *b2 = output + 16; + u8 *b3 = output + 24; + put_unaligned_le64(o0, b0); + put_unaligned_le64(o1, b1); + put_unaligned_le64(o2, b2); + put_unaligned_le64(o3, b3); +} + +static __always_inline void format_fcontract(u8 *output, u64 *input) +{ + format_fcontract_first_carry_full(input); + format_fcontract_second_carry_full(input); + format_fcontract_trim(input); + format_fcontract_store(output, input); +} + +static __always_inline void format_scalar_of_point(u8 *scalar, u64 *point) +{ + u64 *x = point; + u64 *z = point + 5; + u64 buf[10] __aligned(32) = { 0 }; + u64 *zmone = buf; + u64 *sc = buf + 5; + crecip(zmone, z); + fmul(sc, x, zmone); + format_fcontract(scalar, sc); +} + +void curve25519_generic(u8 mypublic[CURVE25519_KEY_SIZE], + const u8 secret[CURVE25519_KEY_SIZE], + const u8 basepoint[CURVE25519_KEY_SIZE]) +{ + u64 buf0[10] __aligned(32) = { 0 }; + u64 *x0 = buf0; + u64 *z = buf0 + 5; + u64 *q; + format_fexpand(x0, basepoint); + z[0] = 1; + q = buf0; + { + u8 e[32] __aligned(32) = { 0 }; + u8 *scalar; + memcpy(e, secret, 32); + curve25519_clamp_secret(e); + scalar = e; + { + u64 buf[15] = { 0 }; + u64 *nq = buf; + u64 *x = nq; + x[0] = 1; + ladder_cmult(nq, scalar, q); + format_scalar_of_point(mypublic, nq); + memzero_explicit(buf, sizeof(buf)); + } + memzero_explicit(e, sizeof(e)); + } + memzero_explicit(buf0, sizeof(buf0)); +} diff --git a/lib/crypto/curve25519-selftest.c b/lib/crypto/curve25519-selftest.c new file mode 100644 index 000000000000..c85e85381e78 --- /dev/null +++ b/lib/crypto/curve25519-selftest.c @@ -0,0 +1,1321 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#include + +struct curve25519_test_vector { + u8 private[CURVE25519_KEY_SIZE]; + u8 public[CURVE25519_KEY_SIZE]; + u8 result[CURVE25519_KEY_SIZE]; + bool valid; +}; +static const struct curve25519_test_vector curve25519_test_vectors[] __initconst = { + { + .private = { 0x77, 0x07, 0x6d, 0x0a, 0x73, 0x18, 0xa5, 0x7d, + 0x3c, 0x16, 0xc1, 0x72, 0x51, 0xb2, 0x66, 0x45, + 0xdf, 0x4c, 0x2f, 0x87, 0xeb, 0xc0, 0x99, 0x2a, + 0xb1, 0x77, 0xfb, 0xa5, 0x1d, 0xb9, 0x2c, 0x2a }, + .public = { 0xde, 0x9e, 0xdb, 0x7d, 0x7b, 0x7d, 0xc1, 0xb4, + 0xd3, 0x5b, 0x61, 0xc2, 0xec, 0xe4, 0x35, 0x37, + 0x3f, 0x83, 0x43, 0xc8, 0x5b, 0x78, 0x67, 0x4d, + 0xad, 0xfc, 0x7e, 0x14, 0x6f, 0x88, 0x2b, 0x4f }, + .result = { 0x4a, 0x5d, 0x9d, 0x5b, 0xa4, 0xce, 0x2d, 0xe1, + 0x72, 0x8e, 0x3b, 0xf4, 0x80, 0x35, 0x0f, 0x25, + 0xe0, 0x7e, 0x21, 0xc9, 0x47, 0xd1, 0x9e, 0x33, + 0x76, 0xf0, 0x9b, 0x3c, 0x1e, 0x16, 0x17, 0x42 }, + .valid = true + }, + { + .private = { 0x5d, 0xab, 0x08, 0x7e, 0x62, 0x4a, 0x8a, 0x4b, + 0x79, 0xe1, 0x7f, 0x8b, 0x83, 0x80, 0x0e, 0xe6, + 0x6f, 0x3b, 0xb1, 0x29, 0x26, 0x18, 0xb6, 0xfd, + 0x1c, 0x2f, 0x8b, 0x27, 0xff, 0x88, 0xe0, 0xeb }, + .public = { 0x85, 0x20, 0xf0, 0x09, 0x89, 0x30, 0xa7, 0x54, + 0x74, 0x8b, 0x7d, 0xdc, 0xb4, 0x3e, 0xf7, 0x5a, + 0x0d, 0xbf, 0x3a, 0x0d, 0x26, 0x38, 0x1a, 0xf4, + 0xeb, 0xa4, 0xa9, 0x8e, 0xaa, 0x9b, 0x4e, 0x6a }, + .result = { 0x4a, 0x5d, 0x9d, 0x5b, 0xa4, 0xce, 0x2d, 0xe1, + 0x72, 0x8e, 0x3b, 0xf4, 0x80, 0x35, 0x0f, 0x25, + 0xe0, 0x7e, 0x21, 0xc9, 0x47, 0xd1, 0x9e, 0x33, + 0x76, 0xf0, 0x9b, 0x3c, 0x1e, 0x16, 0x17, 0x42 }, + .valid = true + }, + { + .private = { 1 }, + .public = { 0x25, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .result = { 0x3c, 0x77, 0x77, 0xca, 0xf9, 0x97, 0xb2, 0x64, + 0x41, 0x60, 0x77, 0x66, 0x5b, 0x4e, 0x22, 0x9d, + 0x0b, 0x95, 0x48, 0xdc, 0x0c, 0xd8, 0x19, 0x98, + 0xdd, 0xcd, 0xc5, 0xc8, 0x53, 0x3c, 0x79, 0x7f }, + .valid = true + }, + { + .private = { 1 }, + .public = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .result = { 0xb3, 0x2d, 0x13, 0x62, 0xc2, 0x48, 0xd6, 0x2f, + 0xe6, 0x26, 0x19, 0xcf, 0xf0, 0x4d, 0xd4, 0x3d, + 0xb7, 0x3f, 0xfc, 0x1b, 0x63, 0x08, 0xed, 0xe3, + 0x0b, 0x78, 0xd8, 0x73, 0x80, 0xf1, 0xe8, 0x34 }, + .valid = true + }, + { + .private = { 0xa5, 0x46, 0xe3, 0x6b, 0xf0, 0x52, 0x7c, 0x9d, + 0x3b, 0x16, 0x15, 0x4b, 0x82, 0x46, 0x5e, 0xdd, + 0x62, 0x14, 0x4c, 0x0a, 0xc1, 0xfc, 0x5a, 0x18, + 0x50, 0x6a, 0x22, 0x44, 0xba, 0x44, 0x9a, 0xc4 }, + .public = { 0xe6, 0xdb, 0x68, 0x67, 0x58, 0x30, 0x30, 0xdb, + 0x35, 0x94, 0xc1, 0xa4, 0x24, 0xb1, 0x5f, 0x7c, + 0x72, 0x66, 0x24, 0xec, 0x26, 0xb3, 0x35, 0x3b, + 0x10, 0xa9, 0x03, 0xa6, 0xd0, 0xab, 0x1c, 0x4c }, + .result = { 0xc3, 0xda, 0x55, 0x37, 0x9d, 0xe9, 0xc6, 0x90, + 0x8e, 0x94, 0xea, 0x4d, 0xf2, 0x8d, 0x08, 0x4f, + 0x32, 0xec, 0xcf, 0x03, 0x49, 0x1c, 0x71, 0xf7, + 0x54, 0xb4, 0x07, 0x55, 0x77, 0xa2, 0x85, 0x52 }, + .valid = true + }, + { + .private = { 1, 2, 3, 4 }, + .public = { 0 }, + .result = { 0 }, + .valid = false + }, + { + .private = { 2, 4, 6, 8 }, + .public = { 0xe0, 0xeb, 0x7a, 0x7c, 0x3b, 0x41, 0xb8, 0xae, + 0x16, 0x56, 0xe3, 0xfa, 0xf1, 0x9f, 0xc4, 0x6a, + 0xda, 0x09, 0x8d, 0xeb, 0x9c, 0x32, 0xb1, 0xfd, + 0x86, 0x62, 0x05, 0x16, 0x5f, 0x49, 0xb8 }, + .result = { 0 }, + .valid = false + }, + { + .private = { 0xff, 0xff, 0xff, 0xff, 0x0a, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .public = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0x0a, 0x00, 0xfb, 0x9f }, + .result = { 0x77, 0x52, 0xb6, 0x18, 0xc1, 0x2d, 0x48, 0xd2, + 0xc6, 0x93, 0x46, 0x83, 0x81, 0x7c, 0xc6, 0x57, + 0xf3, 0x31, 0x03, 0x19, 0x49, 0x48, 0x20, 0x05, + 0x42, 0x2b, 0x4e, 0xae, 0x8d, 0x1d, 0x43, 0x23 }, + .valid = true + }, + { + .private = { 0x8e, 0x0a, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .public = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x8e, 0x06 }, + .result = { 0x5a, 0xdf, 0xaa, 0x25, 0x86, 0x8e, 0x32, 0x3d, + 0xae, 0x49, 0x62, 0xc1, 0x01, 0x5c, 0xb3, 0x12, + 0xe1, 0xc5, 0xc7, 0x9e, 0x95, 0x3f, 0x03, 0x99, + 0xb0, 0xba, 0x16, 0x22, 0xf3, 0xb6, 0xf7, 0x0c }, + .valid = true + }, + /* wycheproof - normal case */ + { + .private = { 0x48, 0x52, 0x83, 0x4d, 0x9d, 0x6b, 0x77, 0xda, + 0xde, 0xab, 0xaa, 0xf2, 0xe1, 0x1d, 0xca, 0x66, + 0xd1, 0x9f, 0xe7, 0x49, 0x93, 0xa7, 0xbe, 0xc3, + 0x6c, 0x6e, 0x16, 0xa0, 0x98, 0x3f, 0xea, 0xba }, + .public = { 0x9c, 0x64, 0x7d, 0x9a, 0xe5, 0x89, 0xb9, 0xf5, + 0x8f, 0xdc, 0x3c, 0xa4, 0x94, 0x7e, 0xfb, 0xc9, + 0x15, 0xc4, 0xb2, 0xe0, 0x8e, 0x74, 0x4a, 0x0e, + 0xdf, 0x46, 0x9d, 0xac, 0x59, 0xc8, 0xf8, 0x5a }, + .result = { 0x87, 0xb7, 0xf2, 0x12, 0xb6, 0x27, 0xf7, 0xa5, + 0x4c, 0xa5, 0xe0, 0xbc, 0xda, 0xdd, 0xd5, 0x38, + 0x9d, 0x9d, 0xe6, 0x15, 0x6c, 0xdb, 0xcf, 0x8e, + 0xbe, 0x14, 0xff, 0xbc, 0xfb, 0x43, 0x65, 0x51 }, + .valid = true + }, + /* wycheproof - public key on twist */ + { + .private = { 0x58, 0x8c, 0x06, 0x1a, 0x50, 0x80, 0x4a, 0xc4, + 0x88, 0xad, 0x77, 0x4a, 0xc7, 0x16, 0xc3, 0xf5, + 0xba, 0x71, 0x4b, 0x27, 0x12, 0xe0, 0x48, 0x49, + 0x13, 0x79, 0xa5, 0x00, 0x21, 0x19, 0x98, 0xa8 }, + .public = { 0x63, 0xaa, 0x40, 0xc6, 0xe3, 0x83, 0x46, 0xc5, + 0xca, 0xf2, 0x3a, 0x6d, 0xf0, 0xa5, 0xe6, 0xc8, + 0x08, 0x89, 0xa0, 0x86, 0x47, 0xe5, 0x51, 0xb3, + 0x56, 0x34, 0x49, 0xbe, 0xfc, 0xfc, 0x97, 0x33 }, + .result = { 0xb1, 0xa7, 0x07, 0x51, 0x94, 0x95, 0xff, 0xff, + 0xb2, 0x98, 0xff, 0x94, 0x17, 0x16, 0xb0, 0x6d, + 0xfa, 0xb8, 0x7c, 0xf8, 0xd9, 0x11, 0x23, 0xfe, + 0x2b, 0xe9, 0xa2, 0x33, 0xdd, 0xa2, 0x22, 0x12 }, + .valid = true + }, + /* wycheproof - public key on twist */ + { + .private = { 0xb0, 0x5b, 0xfd, 0x32, 0xe5, 0x53, 0x25, 0xd9, + 0xfd, 0x64, 0x8c, 0xb3, 0x02, 0x84, 0x80, 0x39, + 0x00, 0x0b, 0x39, 0x0e, 0x44, 0xd5, 0x21, 0xe5, + 0x8a, 0xab, 0x3b, 0x29, 0xa6, 0x96, 0x0b, 0xa8 }, + .public = { 0x0f, 0x83, 0xc3, 0x6f, 0xde, 0xd9, 0xd3, 0x2f, + 0xad, 0xf4, 0xef, 0xa3, 0xae, 0x93, 0xa9, 0x0b, + 0xb5, 0xcf, 0xa6, 0x68, 0x93, 0xbc, 0x41, 0x2c, + 0x43, 0xfa, 0x72, 0x87, 0xdb, 0xb9, 0x97, 0x79 }, + .result = { 0x67, 0xdd, 0x4a, 0x6e, 0x16, 0x55, 0x33, 0x53, + 0x4c, 0x0e, 0x3f, 0x17, 0x2e, 0x4a, 0xb8, 0x57, + 0x6b, 0xca, 0x92, 0x3a, 0x5f, 0x07, 0xb2, 0xc0, + 0x69, 0xb4, 0xc3, 0x10, 0xff, 0x2e, 0x93, 0x5b }, + .valid = true + }, + /* wycheproof - public key on twist */ + { + .private = { 0x70, 0xe3, 0x4b, 0xcb, 0xe1, 0xf4, 0x7f, 0xbc, + 0x0f, 0xdd, 0xfd, 0x7c, 0x1e, 0x1a, 0xa5, 0x3d, + 0x57, 0xbf, 0xe0, 0xf6, 0x6d, 0x24, 0x30, 0x67, + 0xb4, 0x24, 0xbb, 0x62, 0x10, 0xbe, 0xd1, 0x9c }, + .public = { 0x0b, 0x82, 0x11, 0xa2, 0xb6, 0x04, 0x90, 0x97, + 0xf6, 0x87, 0x1c, 0x6c, 0x05, 0x2d, 0x3c, 0x5f, + 0xc1, 0xba, 0x17, 0xda, 0x9e, 0x32, 0xae, 0x45, + 0x84, 0x03, 0xb0, 0x5b, 0xb2, 0x83, 0x09, 0x2a }, + .result = { 0x4a, 0x06, 0x38, 0xcf, 0xaa, 0x9e, 0xf1, 0x93, + 0x3b, 0x47, 0xf8, 0x93, 0x92, 0x96, 0xa6, 0xb2, + 0x5b, 0xe5, 0x41, 0xef, 0x7f, 0x70, 0xe8, 0x44, + 0xc0, 0xbc, 0xc0, 0x0b, 0x13, 0x4d, 0xe6, 0x4a }, + .valid = true + }, + /* wycheproof - public key on twist */ + { + .private = { 0x68, 0xc1, 0xf3, 0xa6, 0x53, 0xa4, 0xcd, 0xb1, + 0xd3, 0x7b, 0xba, 0x94, 0x73, 0x8f, 0x8b, 0x95, + 0x7a, 0x57, 0xbe, 0xb2, 0x4d, 0x64, 0x6e, 0x99, + 0x4d, 0xc2, 0x9a, 0x27, 0x6a, 0xad, 0x45, 0x8d }, + .public = { 0x34, 0x3a, 0xc2, 0x0a, 0x3b, 0x9c, 0x6a, 0x27, + 0xb1, 0x00, 0x81, 0x76, 0x50, 0x9a, 0xd3, 0x07, + 0x35, 0x85, 0x6e, 0xc1, 0xc8, 0xd8, 0xfc, 0xae, + 0x13, 0x91, 0x2d, 0x08, 0xd1, 0x52, 0xf4, 0x6c }, + .result = { 0x39, 0x94, 0x91, 0xfc, 0xe8, 0xdf, 0xab, 0x73, + 0xb4, 0xf9, 0xf6, 0x11, 0xde, 0x8e, 0xa0, 0xb2, + 0x7b, 0x28, 0xf8, 0x59, 0x94, 0x25, 0x0b, 0x0f, + 0x47, 0x5d, 0x58, 0x5d, 0x04, 0x2a, 0xc2, 0x07 }, + .valid = true + }, + /* wycheproof - public key on twist */ + { + .private = { 0xd8, 0x77, 0xb2, 0x6d, 0x06, 0xdf, 0xf9, 0xd9, + 0xf7, 0xfd, 0x4c, 0x5b, 0x37, 0x69, 0xf8, 0xcd, + 0xd5, 0xb3, 0x05, 0x16, 0xa5, 0xab, 0x80, 0x6b, + 0xe3, 0x24, 0xff, 0x3e, 0xb6, 0x9e, 0xa0, 0xb2 }, + .public = { 0xfa, 0x69, 0x5f, 0xc7, 0xbe, 0x8d, 0x1b, 0xe5, + 0xbf, 0x70, 0x48, 0x98, 0xf3, 0x88, 0xc4, 0x52, + 0xba, 0xfd, 0xd3, 0xb8, 0xea, 0xe8, 0x05, 0xf8, + 0x68, 0x1a, 0x8d, 0x15, 0xc2, 0xd4, 0xe1, 0x42 }, + .result = { 0x2c, 0x4f, 0xe1, 0x1d, 0x49, 0x0a, 0x53, 0x86, + 0x17, 0x76, 0xb1, 0x3b, 0x43, 0x54, 0xab, 0xd4, + 0xcf, 0x5a, 0x97, 0x69, 0x9d, 0xb6, 0xe6, 0xc6, + 0x8c, 0x16, 0x26, 0xd0, 0x76, 0x62, 0xf7, 0x58 }, + .valid = true + }, + /* wycheproof - public key = 0 */ + { + .private = { 0x20, 0x74, 0x94, 0x03, 0x8f, 0x2b, 0xb8, 0x11, + 0xd4, 0x78, 0x05, 0xbc, 0xdf, 0x04, 0xa2, 0xac, + 0x58, 0x5a, 0xda, 0x7f, 0x2f, 0x23, 0x38, 0x9b, + 0xfd, 0x46, 0x58, 0xf9, 0xdd, 0xd4, 0xde, 0xbc }, + .public = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .valid = false + }, + /* wycheproof - public key = 1 */ + { + .private = { 0x20, 0x2e, 0x89, 0x72, 0xb6, 0x1c, 0x7e, 0x61, + 0x93, 0x0e, 0xb9, 0x45, 0x0b, 0x50, 0x70, 0xea, + 0xe1, 0xc6, 0x70, 0x47, 0x56, 0x85, 0x54, 0x1f, + 0x04, 0x76, 0x21, 0x7e, 0x48, 0x18, 0xcf, 0xab }, + .public = { 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .valid = false + }, + /* wycheproof - edge case on twist */ + { + .private = { 0x38, 0xdd, 0xe9, 0xf3, 0xe7, 0xb7, 0x99, 0x04, + 0x5f, 0x9a, 0xc3, 0x79, 0x3d, 0x4a, 0x92, 0x77, + 0xda, 0xde, 0xad, 0xc4, 0x1b, 0xec, 0x02, 0x90, + 0xf8, 0x1f, 0x74, 0x4f, 0x73, 0x77, 0x5f, 0x84 }, + .public = { 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .result = { 0x9a, 0x2c, 0xfe, 0x84, 0xff, 0x9c, 0x4a, 0x97, + 0x39, 0x62, 0x5c, 0xae, 0x4a, 0x3b, 0x82, 0xa9, + 0x06, 0x87, 0x7a, 0x44, 0x19, 0x46, 0xf8, 0xd7, + 0xb3, 0xd7, 0x95, 0xfe, 0x8f, 0x5d, 0x16, 0x39 }, + .valid = true + }, + /* wycheproof - edge case on twist */ + { + .private = { 0x98, 0x57, 0xa9, 0x14, 0xe3, 0xc2, 0x90, 0x36, + 0xfd, 0x9a, 0x44, 0x2b, 0xa5, 0x26, 0xb5, 0xcd, + 0xcd, 0xf2, 0x82, 0x16, 0x15, 0x3e, 0x63, 0x6c, + 0x10, 0x67, 0x7a, 0xca, 0xb6, 0xbd, 0x6a, 0xa5 }, + .public = { 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .result = { 0x4d, 0xa4, 0xe0, 0xaa, 0x07, 0x2c, 0x23, 0x2e, + 0xe2, 0xf0, 0xfa, 0x4e, 0x51, 0x9a, 0xe5, 0x0b, + 0x52, 0xc1, 0xed, 0xd0, 0x8a, 0x53, 0x4d, 0x4e, + 0xf3, 0x46, 0xc2, 0xe1, 0x06, 0xd2, 0x1d, 0x60 }, + .valid = true + }, + /* wycheproof - edge case on twist */ + { + .private = { 0x48, 0xe2, 0x13, 0x0d, 0x72, 0x33, 0x05, 0xed, + 0x05, 0xe6, 0xe5, 0x89, 0x4d, 0x39, 0x8a, 0x5e, + 0x33, 0x36, 0x7a, 0x8c, 0x6a, 0xac, 0x8f, 0xcd, + 0xf0, 0xa8, 0x8e, 0x4b, 0x42, 0x82, 0x0d, 0xb7 }, + .public = { 0xff, 0xff, 0xff, 0x03, 0x00, 0x00, 0xf8, 0xff, + 0xff, 0x1f, 0x00, 0x00, 0xc0, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0xfe, 0xff, 0xff, 0x07, 0x00, + 0x00, 0xf0, 0xff, 0xff, 0x3f, 0x00, 0x00, 0x00 }, + .result = { 0x9e, 0xd1, 0x0c, 0x53, 0x74, 0x7f, 0x64, 0x7f, + 0x82, 0xf4, 0x51, 0x25, 0xd3, 0xde, 0x15, 0xa1, + 0xe6, 0xb8, 0x24, 0x49, 0x6a, 0xb4, 0x04, 0x10, + 0xff, 0xcc, 0x3c, 0xfe, 0x95, 0x76, 0x0f, 0x3b }, + .valid = true + }, + /* wycheproof - edge case on twist */ + { + .private = { 0x28, 0xf4, 0x10, 0x11, 0x69, 0x18, 0x51, 0xb3, + 0xa6, 0x2b, 0x64, 0x15, 0x53, 0xb3, 0x0d, 0x0d, + 0xfd, 0xdc, 0xb8, 0xff, 0xfc, 0xf5, 0x37, 0x00, + 0xa7, 0xbe, 0x2f, 0x6a, 0x87, 0x2e, 0x9f, 0xb0 }, + .public = { 0x00, 0x00, 0x00, 0xfc, 0xff, 0xff, 0x07, 0x00, + 0x00, 0xe0, 0xff, 0xff, 0x3f, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0x01, 0x00, 0x00, 0xf8, 0xff, + 0xff, 0x0f, 0x00, 0x00, 0xc0, 0xff, 0xff, 0x7f }, + .result = { 0xcf, 0x72, 0xb4, 0xaa, 0x6a, 0xa1, 0xc9, 0xf8, + 0x94, 0xf4, 0x16, 0x5b, 0x86, 0x10, 0x9a, 0xa4, + 0x68, 0x51, 0x76, 0x48, 0xe1, 0xf0, 0xcc, 0x70, + 0xe1, 0xab, 0x08, 0x46, 0x01, 0x76, 0x50, 0x6b }, + .valid = true + }, + /* wycheproof - edge case on twist */ + { + .private = { 0x18, 0xa9, 0x3b, 0x64, 0x99, 0xb9, 0xf6, 0xb3, + 0x22, 0x5c, 0xa0, 0x2f, 0xef, 0x41, 0x0e, 0x0a, + 0xde, 0xc2, 0x35, 0x32, 0x32, 0x1d, 0x2d, 0x8e, + 0xf1, 0xa6, 0xd6, 0x02, 0xa8, 0xc6, 0x5b, 0x83 }, + .public = { 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0x7f }, + .result = { 0x5d, 0x50, 0xb6, 0x28, 0x36, 0xbb, 0x69, 0x57, + 0x94, 0x10, 0x38, 0x6c, 0xf7, 0xbb, 0x81, 0x1c, + 0x14, 0xbf, 0x85, 0xb1, 0xc7, 0xb1, 0x7e, 0x59, + 0x24, 0xc7, 0xff, 0xea, 0x91, 0xef, 0x9e, 0x12 }, + .valid = true + }, + /* wycheproof - edge case on twist */ + { + .private = { 0xc0, 0x1d, 0x13, 0x05, 0xa1, 0x33, 0x8a, 0x1f, + 0xca, 0xc2, 0xba, 0x7e, 0x2e, 0x03, 0x2b, 0x42, + 0x7e, 0x0b, 0x04, 0x90, 0x31, 0x65, 0xac, 0xa9, + 0x57, 0xd8, 0xd0, 0x55, 0x3d, 0x87, 0x17, 0xb0 }, + .public = { 0xea, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f }, + .result = { 0x19, 0x23, 0x0e, 0xb1, 0x48, 0xd5, 0xd6, 0x7c, + 0x3c, 0x22, 0xab, 0x1d, 0xae, 0xff, 0x80, 0xa5, + 0x7e, 0xae, 0x42, 0x65, 0xce, 0x28, 0x72, 0x65, + 0x7b, 0x2c, 0x80, 0x99, 0xfc, 0x69, 0x8e, 0x50 }, + .valid = true + }, + /* wycheproof - edge case for public key */ + { + .private = { 0x38, 0x6f, 0x7f, 0x16, 0xc5, 0x07, 0x31, 0xd6, + 0x4f, 0x82, 0xe6, 0xa1, 0x70, 0xb1, 0x42, 0xa4, + 0xe3, 0x4f, 0x31, 0xfd, 0x77, 0x68, 0xfc, 0xb8, + 0x90, 0x29, 0x25, 0xe7, 0xd1, 0xe2, 0x1a, 0xbe }, + .public = { 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .result = { 0x0f, 0xca, 0xb5, 0xd8, 0x42, 0xa0, 0x78, 0xd7, + 0xa7, 0x1f, 0xc5, 0x9b, 0x57, 0xbf, 0xb4, 0xca, + 0x0b, 0xe6, 0x87, 0x3b, 0x49, 0xdc, 0xdb, 0x9f, + 0x44, 0xe1, 0x4a, 0xe8, 0xfb, 0xdf, 0xa5, 0x42 }, + .valid = true + }, + /* wycheproof - edge case for public key */ + { + .private = { 0xe0, 0x23, 0xa2, 0x89, 0xbd, 0x5e, 0x90, 0xfa, + 0x28, 0x04, 0xdd, 0xc0, 0x19, 0xa0, 0x5e, 0xf3, + 0xe7, 0x9d, 0x43, 0x4b, 0xb6, 0xea, 0x2f, 0x52, + 0x2e, 0xcb, 0x64, 0x3a, 0x75, 0x29, 0x6e, 0x95 }, + .public = { 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, + 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00 }, + .result = { 0x54, 0xce, 0x8f, 0x22, 0x75, 0xc0, 0x77, 0xe3, + 0xb1, 0x30, 0x6a, 0x39, 0x39, 0xc5, 0xe0, 0x3e, + 0xef, 0x6b, 0xbb, 0x88, 0x06, 0x05, 0x44, 0x75, + 0x8d, 0x9f, 0xef, 0x59, 0xb0, 0xbc, 0x3e, 0x4f }, + .valid = true + }, + /* wycheproof - edge case for public key */ + { + .private = { 0x68, 0xf0, 0x10, 0xd6, 0x2e, 0xe8, 0xd9, 0x26, + 0x05, 0x3a, 0x36, 0x1c, 0x3a, 0x75, 0xc6, 0xea, + 0x4e, 0xbd, 0xc8, 0x60, 0x6a, 0xb2, 0x85, 0x00, + 0x3a, 0x6f, 0x8f, 0x40, 0x76, 0xb0, 0x1e, 0x83 }, + .public = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x03 }, + .result = { 0xf1, 0x36, 0x77, 0x5c, 0x5b, 0xeb, 0x0a, 0xf8, + 0x11, 0x0a, 0xf1, 0x0b, 0x20, 0x37, 0x23, 0x32, + 0x04, 0x3c, 0xab, 0x75, 0x24, 0x19, 0x67, 0x87, + 0x75, 0xa2, 0x23, 0xdf, 0x57, 0xc9, 0xd3, 0x0d }, + .valid = true + }, + /* wycheproof - edge case for public key */ + { + .private = { 0x58, 0xeb, 0xcb, 0x35, 0xb0, 0xf8, 0x84, 0x5c, + 0xaf, 0x1e, 0xc6, 0x30, 0xf9, 0x65, 0x76, 0xb6, + 0x2c, 0x4b, 0x7b, 0x6c, 0x36, 0xb2, 0x9d, 0xeb, + 0x2c, 0xb0, 0x08, 0x46, 0x51, 0x75, 0x5c, 0x96 }, + .public = { 0xff, 0xff, 0xff, 0xfb, 0xff, 0xff, 0xfb, 0xff, + 0xff, 0xdf, 0xff, 0xff, 0xdf, 0xff, 0xff, 0xff, + 0xfe, 0xff, 0xff, 0xfe, 0xff, 0xff, 0xf7, 0xff, + 0xff, 0xf7, 0xff, 0xff, 0xbf, 0xff, 0xff, 0x3f }, + .result = { 0xbf, 0x9a, 0xff, 0xd0, 0x6b, 0x84, 0x40, 0x85, + 0x58, 0x64, 0x60, 0x96, 0x2e, 0xf2, 0x14, 0x6f, + 0xf3, 0xd4, 0x53, 0x3d, 0x94, 0x44, 0xaa, 0xb0, + 0x06, 0xeb, 0x88, 0xcc, 0x30, 0x54, 0x40, 0x7d }, + .valid = true + }, + /* wycheproof - edge case for public key */ + { + .private = { 0x18, 0x8c, 0x4b, 0xc5, 0xb9, 0xc4, 0x4b, 0x38, + 0xbb, 0x65, 0x8b, 0x9b, 0x2a, 0xe8, 0x2d, 0x5b, + 0x01, 0x01, 0x5e, 0x09, 0x31, 0x84, 0xb1, 0x7c, + 0xb7, 0x86, 0x35, 0x03, 0xa7, 0x83, 0xe1, 0xbb }, + .public = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x3f }, + .result = { 0xd4, 0x80, 0xde, 0x04, 0xf6, 0x99, 0xcb, 0x3b, + 0xe0, 0x68, 0x4a, 0x9c, 0xc2, 0xe3, 0x12, 0x81, + 0xea, 0x0b, 0xc5, 0xa9, 0xdc, 0xc1, 0x57, 0xd3, + 0xd2, 0x01, 0x58, 0xd4, 0x6c, 0xa5, 0x24, 0x6d }, + .valid = true + }, + /* wycheproof - edge case for public key */ + { + .private = { 0xe0, 0x6c, 0x11, 0xbb, 0x2e, 0x13, 0xce, 0x3d, + 0xc7, 0x67, 0x3f, 0x67, 0xf5, 0x48, 0x22, 0x42, + 0x90, 0x94, 0x23, 0xa9, 0xae, 0x95, 0xee, 0x98, + 0x6a, 0x98, 0x8d, 0x98, 0xfa, 0xee, 0x23, 0xa2 }, + .public = { 0xff, 0xff, 0xff, 0xff, 0xfe, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0xff, 0xfe, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0xff, 0xfe, 0xff, 0xff, 0x7f, + 0xff, 0xff, 0xff, 0xff, 0xfe, 0xff, 0xff, 0x7f }, + .result = { 0x4c, 0x44, 0x01, 0xcc, 0xe6, 0xb5, 0x1e, 0x4c, + 0xb1, 0x8f, 0x27, 0x90, 0x24, 0x6c, 0x9b, 0xf9, + 0x14, 0xdb, 0x66, 0x77, 0x50, 0xa1, 0xcb, 0x89, + 0x06, 0x90, 0x92, 0xaf, 0x07, 0x29, 0x22, 0x76 }, + .valid = true + }, + /* wycheproof - edge case for public key */ + { + .private = { 0xc0, 0x65, 0x8c, 0x46, 0xdd, 0xe1, 0x81, 0x29, + 0x29, 0x38, 0x77, 0x53, 0x5b, 0x11, 0x62, 0xb6, + 0xf9, 0xf5, 0x41, 0x4a, 0x23, 0xcf, 0x4d, 0x2c, + 0xbc, 0x14, 0x0a, 0x4d, 0x99, 0xda, 0x2b, 0x8f }, + .public = { 0xeb, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f }, + .result = { 0x57, 0x8b, 0xa8, 0xcc, 0x2d, 0xbd, 0xc5, 0x75, + 0xaf, 0xcf, 0x9d, 0xf2, 0xb3, 0xee, 0x61, 0x89, + 0xf5, 0x33, 0x7d, 0x68, 0x54, 0xc7, 0x9b, 0x4c, + 0xe1, 0x65, 0xea, 0x12, 0x29, 0x3b, 0x3a, 0x0f }, + .valid = true + }, + /* wycheproof - public key with low order */ + { + .private = { 0x10, 0x25, 0x5c, 0x92, 0x30, 0xa9, 0x7a, 0x30, + 0xa4, 0x58, 0xca, 0x28, 0x4a, 0x62, 0x96, 0x69, + 0x29, 0x3a, 0x31, 0x89, 0x0c, 0xda, 0x9d, 0x14, + 0x7f, 0xeb, 0xc7, 0xd1, 0xe2, 0x2d, 0x6b, 0xb1 }, + .public = { 0xe0, 0xeb, 0x7a, 0x7c, 0x3b, 0x41, 0xb8, 0xae, + 0x16, 0x56, 0xe3, 0xfa, 0xf1, 0x9f, 0xc4, 0x6a, + 0xda, 0x09, 0x8d, 0xeb, 0x9c, 0x32, 0xb1, 0xfd, + 0x86, 0x62, 0x05, 0x16, 0x5f, 0x49, 0xb8, 0x00 }, + .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .valid = false + }, + /* wycheproof - public key with low order */ + { + .private = { 0x78, 0xf1, 0xe8, 0xed, 0xf1, 0x44, 0x81, 0xb3, + 0x89, 0x44, 0x8d, 0xac, 0x8f, 0x59, 0xc7, 0x0b, + 0x03, 0x8e, 0x7c, 0xf9, 0x2e, 0xf2, 0xc7, 0xef, + 0xf5, 0x7a, 0x72, 0x46, 0x6e, 0x11, 0x52, 0x96 }, + .public = { 0x5f, 0x9c, 0x95, 0xbc, 0xa3, 0x50, 0x8c, 0x24, + 0xb1, 0xd0, 0xb1, 0x55, 0x9c, 0x83, 0xef, 0x5b, + 0x04, 0x44, 0x5c, 0xc4, 0x58, 0x1c, 0x8e, 0x86, + 0xd8, 0x22, 0x4e, 0xdd, 0xd0, 0x9f, 0x11, 0x57 }, + .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .valid = false + }, + /* wycheproof - public key with low order */ + { + .private = { 0xa0, 0xa0, 0x5a, 0x3e, 0x8f, 0x9f, 0x44, 0x20, + 0x4d, 0x5f, 0x80, 0x59, 0xa9, 0x4a, 0xc7, 0xdf, + 0xc3, 0x9a, 0x49, 0xac, 0x01, 0x6d, 0xd7, 0x43, + 0xdb, 0xfa, 0x43, 0xc5, 0xd6, 0x71, 0xfd, 0x88 }, + .public = { 0xec, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f }, + .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .valid = false + }, + /* wycheproof - public key with low order */ + { + .private = { 0xd0, 0xdb, 0xb3, 0xed, 0x19, 0x06, 0x66, 0x3f, + 0x15, 0x42, 0x0a, 0xf3, 0x1f, 0x4e, 0xaf, 0x65, + 0x09, 0xd9, 0xa9, 0x94, 0x97, 0x23, 0x50, 0x06, + 0x05, 0xad, 0x7c, 0x1c, 0x6e, 0x74, 0x50, 0xa9 }, + .public = { 0xed, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f }, + .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .valid = false + }, + /* wycheproof - public key with low order */ + { + .private = { 0xc0, 0xb1, 0xd0, 0xeb, 0x22, 0xb2, 0x44, 0xfe, + 0x32, 0x91, 0x14, 0x00, 0x72, 0xcd, 0xd9, 0xd9, + 0x89, 0xb5, 0xf0, 0xec, 0xd9, 0x6c, 0x10, 0x0f, + 0xeb, 0x5b, 0xca, 0x24, 0x1c, 0x1d, 0x9f, 0x8f }, + .public = { 0xee, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f }, + .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .valid = false + }, + /* wycheproof - public key with low order */ + { + .private = { 0x48, 0x0b, 0xf4, 0x5f, 0x59, 0x49, 0x42, 0xa8, + 0xbc, 0x0f, 0x33, 0x53, 0xc6, 0xe8, 0xb8, 0x85, + 0x3d, 0x77, 0xf3, 0x51, 0xf1, 0xc2, 0xca, 0x6c, + 0x2d, 0x1a, 0xbf, 0x8a, 0x00, 0xb4, 0x22, 0x9c }, + .public = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80 }, + .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .valid = false + }, + /* wycheproof - public key with low order */ + { + .private = { 0x30, 0xf9, 0x93, 0xfc, 0xf8, 0x51, 0x4f, 0xc8, + 0x9b, 0xd8, 0xdb, 0x14, 0xcd, 0x43, 0xba, 0x0d, + 0x4b, 0x25, 0x30, 0xe7, 0x3c, 0x42, 0x76, 0xa0, + 0x5e, 0x1b, 0x14, 0x5d, 0x42, 0x0c, 0xed, 0xb4 }, + .public = { 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80 }, + .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .valid = false + }, + /* wycheproof - public key with low order */ + { + .private = { 0xc0, 0x49, 0x74, 0xb7, 0x58, 0x38, 0x0e, 0x2a, + 0x5b, 0x5d, 0xf6, 0xeb, 0x09, 0xbb, 0x2f, 0x6b, + 0x34, 0x34, 0xf9, 0x82, 0x72, 0x2a, 0x8e, 0x67, + 0x6d, 0x3d, 0xa2, 0x51, 0xd1, 0xb3, 0xde, 0x83 }, + .public = { 0xe0, 0xeb, 0x7a, 0x7c, 0x3b, 0x41, 0xb8, 0xae, + 0x16, 0x56, 0xe3, 0xfa, 0xf1, 0x9f, 0xc4, 0x6a, + 0xda, 0x09, 0x8d, 0xeb, 0x9c, 0x32, 0xb1, 0xfd, + 0x86, 0x62, 0x05, 0x16, 0x5f, 0x49, 0xb8, 0x80 }, + .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .valid = false + }, + /* wycheproof - public key with low order */ + { + .private = { 0x50, 0x2a, 0x31, 0x37, 0x3d, 0xb3, 0x24, 0x46, + 0x84, 0x2f, 0xe5, 0xad, 0xd3, 0xe0, 0x24, 0x02, + 0x2e, 0xa5, 0x4f, 0x27, 0x41, 0x82, 0xaf, 0xc3, + 0xd9, 0xf1, 0xbb, 0x3d, 0x39, 0x53, 0x4e, 0xb5 }, + .public = { 0x5f, 0x9c, 0x95, 0xbc, 0xa3, 0x50, 0x8c, 0x24, + 0xb1, 0xd0, 0xb1, 0x55, 0x9c, 0x83, 0xef, 0x5b, + 0x04, 0x44, 0x5c, 0xc4, 0x58, 0x1c, 0x8e, 0x86, + 0xd8, 0x22, 0x4e, 0xdd, 0xd0, 0x9f, 0x11, 0xd7 }, + .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .valid = false + }, + /* wycheproof - public key with low order */ + { + .private = { 0x90, 0xfa, 0x64, 0x17, 0xb0, 0xe3, 0x70, 0x30, + 0xfd, 0x6e, 0x43, 0xef, 0xf2, 0xab, 0xae, 0xf1, + 0x4c, 0x67, 0x93, 0x11, 0x7a, 0x03, 0x9c, 0xf6, + 0x21, 0x31, 0x8b, 0xa9, 0x0f, 0x4e, 0x98, 0xbe }, + .public = { 0xec, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .valid = false + }, + /* wycheproof - public key with low order */ + { + .private = { 0x78, 0xad, 0x3f, 0x26, 0x02, 0x7f, 0x1c, 0x9f, + 0xdd, 0x97, 0x5a, 0x16, 0x13, 0xb9, 0x47, 0x77, + 0x9b, 0xad, 0x2c, 0xf2, 0xb7, 0x41, 0xad, 0xe0, + 0x18, 0x40, 0x88, 0x5a, 0x30, 0xbb, 0x97, 0x9c }, + .public = { 0xed, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .valid = false + }, + /* wycheproof - public key with low order */ + { + .private = { 0x98, 0xe2, 0x3d, 0xe7, 0xb1, 0xe0, 0x92, 0x6e, + 0xd9, 0xc8, 0x7e, 0x7b, 0x14, 0xba, 0xf5, 0x5f, + 0x49, 0x7a, 0x1d, 0x70, 0x96, 0xf9, 0x39, 0x77, + 0x68, 0x0e, 0x44, 0xdc, 0x1c, 0x7b, 0x7b, 0x8b }, + .public = { 0xee, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .valid = false + }, + /* wycheproof - public key >= p */ + { + .private = { 0xf0, 0x1e, 0x48, 0xda, 0xfa, 0xc9, 0xd7, 0xbc, + 0xf5, 0x89, 0xcb, 0xc3, 0x82, 0xc8, 0x78, 0xd1, + 0x8b, 0xda, 0x35, 0x50, 0x58, 0x9f, 0xfb, 0x5d, + 0x50, 0xb5, 0x23, 0xbe, 0xbe, 0x32, 0x9d, 0xae }, + .public = { 0xef, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f }, + .result = { 0xbd, 0x36, 0xa0, 0x79, 0x0e, 0xb8, 0x83, 0x09, + 0x8c, 0x98, 0x8b, 0x21, 0x78, 0x67, 0x73, 0xde, + 0x0b, 0x3a, 0x4d, 0xf1, 0x62, 0x28, 0x2c, 0xf1, + 0x10, 0xde, 0x18, 0xdd, 0x48, 0x4c, 0xe7, 0x4b }, + .valid = true + }, + /* wycheproof - public key >= p */ + { + .private = { 0x28, 0x87, 0x96, 0xbc, 0x5a, 0xff, 0x4b, 0x81, + 0xa3, 0x75, 0x01, 0x75, 0x7b, 0xc0, 0x75, 0x3a, + 0x3c, 0x21, 0x96, 0x47, 0x90, 0xd3, 0x86, 0x99, + 0x30, 0x8d, 0xeb, 0xc1, 0x7a, 0x6e, 0xaf, 0x8d }, + .public = { 0xf0, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f }, + .result = { 0xb4, 0xe0, 0xdd, 0x76, 0xda, 0x7b, 0x07, 0x17, + 0x28, 0xb6, 0x1f, 0x85, 0x67, 0x71, 0xaa, 0x35, + 0x6e, 0x57, 0xed, 0xa7, 0x8a, 0x5b, 0x16, 0x55, + 0xcc, 0x38, 0x20, 0xfb, 0x5f, 0x85, 0x4c, 0x5c }, + .valid = true + }, + /* wycheproof - public key >= p */ + { + .private = { 0x98, 0xdf, 0x84, 0x5f, 0x66, 0x51, 0xbf, 0x11, + 0x38, 0x22, 0x1f, 0x11, 0x90, 0x41, 0xf7, 0x2b, + 0x6d, 0xbc, 0x3c, 0x4a, 0xce, 0x71, 0x43, 0xd9, + 0x9f, 0xd5, 0x5a, 0xd8, 0x67, 0x48, 0x0d, 0xa8 }, + .public = { 0xf1, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f }, + .result = { 0x6f, 0xdf, 0x6c, 0x37, 0x61, 0x1d, 0xbd, 0x53, + 0x04, 0xdc, 0x0f, 0x2e, 0xb7, 0xc9, 0x51, 0x7e, + 0xb3, 0xc5, 0x0e, 0x12, 0xfd, 0x05, 0x0a, 0xc6, + 0xde, 0xc2, 0x70, 0x71, 0xd4, 0xbf, 0xc0, 0x34 }, + .valid = true + }, + /* wycheproof - public key >= p */ + { + .private = { 0xf0, 0x94, 0x98, 0xe4, 0x6f, 0x02, 0xf8, 0x78, + 0x82, 0x9e, 0x78, 0xb8, 0x03, 0xd3, 0x16, 0xa2, + 0xed, 0x69, 0x5d, 0x04, 0x98, 0xa0, 0x8a, 0xbd, + 0xf8, 0x27, 0x69, 0x30, 0xe2, 0x4e, 0xdc, 0xb0 }, + .public = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f }, + .result = { 0x4c, 0x8f, 0xc4, 0xb1, 0xc6, 0xab, 0x88, 0xfb, + 0x21, 0xf1, 0x8f, 0x6d, 0x4c, 0x81, 0x02, 0x40, + 0xd4, 0xe9, 0x46, 0x51, 0xba, 0x44, 0xf7, 0xa2, + 0xc8, 0x63, 0xce, 0xc7, 0xdc, 0x56, 0x60, 0x2d }, + .valid = true + }, + /* wycheproof - public key >= p */ + { + .private = { 0x18, 0x13, 0xc1, 0x0a, 0x5c, 0x7f, 0x21, 0xf9, + 0x6e, 0x17, 0xf2, 0x88, 0xc0, 0xcc, 0x37, 0x60, + 0x7c, 0x04, 0xc5, 0xf5, 0xae, 0xa2, 0xdb, 0x13, + 0x4f, 0x9e, 0x2f, 0xfc, 0x66, 0xbd, 0x9d, 0xb8 }, + .public = { 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80 }, + .result = { 0x1c, 0xd0, 0xb2, 0x82, 0x67, 0xdc, 0x54, 0x1c, + 0x64, 0x2d, 0x6d, 0x7d, 0xca, 0x44, 0xa8, 0xb3, + 0x8a, 0x63, 0x73, 0x6e, 0xef, 0x5c, 0x4e, 0x65, + 0x01, 0xff, 0xbb, 0xb1, 0x78, 0x0c, 0x03, 0x3c }, + .valid = true + }, + /* wycheproof - public key >= p */ + { + .private = { 0x78, 0x57, 0xfb, 0x80, 0x86, 0x53, 0x64, 0x5a, + 0x0b, 0xeb, 0x13, 0x8a, 0x64, 0xf5, 0xf4, 0xd7, + 0x33, 0xa4, 0x5e, 0xa8, 0x4c, 0x3c, 0xda, 0x11, + 0xa9, 0xc0, 0x6f, 0x7e, 0x71, 0x39, 0x14, 0x9e }, + .public = { 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80 }, + .result = { 0x87, 0x55, 0xbe, 0x01, 0xc6, 0x0a, 0x7e, 0x82, + 0x5c, 0xff, 0x3e, 0x0e, 0x78, 0xcb, 0x3a, 0xa4, + 0x33, 0x38, 0x61, 0x51, 0x6a, 0xa5, 0x9b, 0x1c, + 0x51, 0xa8, 0xb2, 0xa5, 0x43, 0xdf, 0xa8, 0x22 }, + .valid = true + }, + /* wycheproof - public key >= p */ + { + .private = { 0xe0, 0x3a, 0xa8, 0x42, 0xe2, 0xab, 0xc5, 0x6e, + 0x81, 0xe8, 0x7b, 0x8b, 0x9f, 0x41, 0x7b, 0x2a, + 0x1e, 0x59, 0x13, 0xc7, 0x23, 0xee, 0xd2, 0x8d, + 0x75, 0x2f, 0x8d, 0x47, 0xa5, 0x9f, 0x49, 0x8f }, + .public = { 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80 }, + .result = { 0x54, 0xc9, 0xa1, 0xed, 0x95, 0xe5, 0x46, 0xd2, + 0x78, 0x22, 0xa3, 0x60, 0x93, 0x1d, 0xda, 0x60, + 0xa1, 0xdf, 0x04, 0x9d, 0xa6, 0xf9, 0x04, 0x25, + 0x3c, 0x06, 0x12, 0xbb, 0xdc, 0x08, 0x74, 0x76 }, + .valid = true + }, + /* wycheproof - public key >= p */ + { + .private = { 0xf8, 0xf7, 0x07, 0xb7, 0x99, 0x9b, 0x18, 0xcb, + 0x0d, 0x6b, 0x96, 0x12, 0x4f, 0x20, 0x45, 0x97, + 0x2c, 0xa2, 0x74, 0xbf, 0xc1, 0x54, 0xad, 0x0c, + 0x87, 0x03, 0x8c, 0x24, 0xc6, 0xd0, 0xd4, 0xb2 }, + .public = { 0xda, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .result = { 0xcc, 0x1f, 0x40, 0xd7, 0x43, 0xcd, 0xc2, 0x23, + 0x0e, 0x10, 0x43, 0xda, 0xba, 0x8b, 0x75, 0xe8, + 0x10, 0xf1, 0xfb, 0xab, 0x7f, 0x25, 0x52, 0x69, + 0xbd, 0x9e, 0xbb, 0x29, 0xe6, 0xbf, 0x49, 0x4f }, + .valid = true + }, + /* wycheproof - public key >= p */ + { + .private = { 0xa0, 0x34, 0xf6, 0x84, 0xfa, 0x63, 0x1e, 0x1a, + 0x34, 0x81, 0x18, 0xc1, 0xce, 0x4c, 0x98, 0x23, + 0x1f, 0x2d, 0x9e, 0xec, 0x9b, 0xa5, 0x36, 0x5b, + 0x4a, 0x05, 0xd6, 0x9a, 0x78, 0x5b, 0x07, 0x96 }, + .public = { 0xdb, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .result = { 0x54, 0x99, 0x8e, 0xe4, 0x3a, 0x5b, 0x00, 0x7b, + 0xf4, 0x99, 0xf0, 0x78, 0xe7, 0x36, 0x52, 0x44, + 0x00, 0xa8, 0xb5, 0xc7, 0xe9, 0xb9, 0xb4, 0x37, + 0x71, 0x74, 0x8c, 0x7c, 0xdf, 0x88, 0x04, 0x12 }, + .valid = true + }, + /* wycheproof - public key >= p */ + { + .private = { 0x30, 0xb6, 0xc6, 0xa0, 0xf2, 0xff, 0xa6, 0x80, + 0x76, 0x8f, 0x99, 0x2b, 0xa8, 0x9e, 0x15, 0x2d, + 0x5b, 0xc9, 0x89, 0x3d, 0x38, 0xc9, 0x11, 0x9b, + 0xe4, 0xf7, 0x67, 0xbf, 0xab, 0x6e, 0x0c, 0xa5 }, + .public = { 0xdc, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .result = { 0xea, 0xd9, 0xb3, 0x8e, 0xfd, 0xd7, 0x23, 0x63, + 0x79, 0x34, 0xe5, 0x5a, 0xb7, 0x17, 0xa7, 0xae, + 0x09, 0xeb, 0x86, 0xa2, 0x1d, 0xc3, 0x6a, 0x3f, + 0xee, 0xb8, 0x8b, 0x75, 0x9e, 0x39, 0x1e, 0x09 }, + .valid = true + }, + /* wycheproof - public key >= p */ + { + .private = { 0x90, 0x1b, 0x9d, 0xcf, 0x88, 0x1e, 0x01, 0xe0, + 0x27, 0x57, 0x50, 0x35, 0xd4, 0x0b, 0x43, 0xbd, + 0xc1, 0xc5, 0x24, 0x2e, 0x03, 0x08, 0x47, 0x49, + 0x5b, 0x0c, 0x72, 0x86, 0x46, 0x9b, 0x65, 0x91 }, + .public = { 0xea, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .result = { 0x60, 0x2f, 0xf4, 0x07, 0x89, 0xb5, 0x4b, 0x41, + 0x80, 0x59, 0x15, 0xfe, 0x2a, 0x62, 0x21, 0xf0, + 0x7a, 0x50, 0xff, 0xc2, 0xc3, 0xfc, 0x94, 0xcf, + 0x61, 0xf1, 0x3d, 0x79, 0x04, 0xe8, 0x8e, 0x0e }, + .valid = true + }, + /* wycheproof - public key >= p */ + { + .private = { 0x80, 0x46, 0x67, 0x7c, 0x28, 0xfd, 0x82, 0xc9, + 0xa1, 0xbd, 0xb7, 0x1a, 0x1a, 0x1a, 0x34, 0xfa, + 0xba, 0x12, 0x25, 0xe2, 0x50, 0x7f, 0xe3, 0xf5, + 0x4d, 0x10, 0xbd, 0x5b, 0x0d, 0x86, 0x5f, 0x8e }, + .public = { 0xeb, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .result = { 0xe0, 0x0a, 0xe8, 0xb1, 0x43, 0x47, 0x12, 0x47, + 0xba, 0x24, 0xf1, 0x2c, 0x88, 0x55, 0x36, 0xc3, + 0xcb, 0x98, 0x1b, 0x58, 0xe1, 0xe5, 0x6b, 0x2b, + 0xaf, 0x35, 0xc1, 0x2a, 0xe1, 0xf7, 0x9c, 0x26 }, + .valid = true + }, + /* wycheproof - public key >= p */ + { + .private = { 0x60, 0x2f, 0x7e, 0x2f, 0x68, 0xa8, 0x46, 0xb8, + 0x2c, 0xc2, 0x69, 0xb1, 0xd4, 0x8e, 0x93, 0x98, + 0x86, 0xae, 0x54, 0xfd, 0x63, 0x6c, 0x1f, 0xe0, + 0x74, 0xd7, 0x10, 0x12, 0x7d, 0x47, 0x24, 0x91 }, + .public = { 0xef, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .result = { 0x98, 0xcb, 0x9b, 0x50, 0xdd, 0x3f, 0xc2, 0xb0, + 0xd4, 0xf2, 0xd2, 0xbf, 0x7c, 0x5c, 0xfd, 0xd1, + 0x0c, 0x8f, 0xcd, 0x31, 0xfc, 0x40, 0xaf, 0x1a, + 0xd4, 0x4f, 0x47, 0xc1, 0x31, 0x37, 0x63, 0x62 }, + .valid = true + }, + /* wycheproof - public key >= p */ + { + .private = { 0x60, 0x88, 0x7b, 0x3d, 0xc7, 0x24, 0x43, 0x02, + 0x6e, 0xbe, 0xdb, 0xbb, 0xb7, 0x06, 0x65, 0xf4, + 0x2b, 0x87, 0xad, 0xd1, 0x44, 0x0e, 0x77, 0x68, + 0xfb, 0xd7, 0xe8, 0xe2, 0xce, 0x5f, 0x63, 0x9d }, + .public = { 0xf0, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .result = { 0x38, 0xd6, 0x30, 0x4c, 0x4a, 0x7e, 0x6d, 0x9f, + 0x79, 0x59, 0x33, 0x4f, 0xb5, 0x24, 0x5b, 0xd2, + 0xc7, 0x54, 0x52, 0x5d, 0x4c, 0x91, 0xdb, 0x95, + 0x02, 0x06, 0x92, 0x62, 0x34, 0xc1, 0xf6, 0x33 }, + .valid = true + }, + /* wycheproof - public key >= p */ + { + .private = { 0x78, 0xd3, 0x1d, 0xfa, 0x85, 0x44, 0x97, 0xd7, + 0x2d, 0x8d, 0xef, 0x8a, 0x1b, 0x7f, 0xb0, 0x06, + 0xce, 0xc2, 0xd8, 0xc4, 0x92, 0x46, 0x47, 0xc9, + 0x38, 0x14, 0xae, 0x56, 0xfa, 0xed, 0xa4, 0x95 }, + .public = { 0xf1, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .result = { 0x78, 0x6c, 0xd5, 0x49, 0x96, 0xf0, 0x14, 0xa5, + 0xa0, 0x31, 0xec, 0x14, 0xdb, 0x81, 0x2e, 0xd0, + 0x83, 0x55, 0x06, 0x1f, 0xdb, 0x5d, 0xe6, 0x80, + 0xa8, 0x00, 0xac, 0x52, 0x1f, 0x31, 0x8e, 0x23 }, + .valid = true + }, + /* wycheproof - public key >= p */ + { + .private = { 0xc0, 0x4c, 0x5b, 0xae, 0xfa, 0x83, 0x02, 0xdd, + 0xde, 0xd6, 0xa4, 0xbb, 0x95, 0x77, 0x61, 0xb4, + 0xeb, 0x97, 0xae, 0xfa, 0x4f, 0xc3, 0xb8, 0x04, + 0x30, 0x85, 0xf9, 0x6a, 0x56, 0x59, 0xb3, 0xa5 }, + .public = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }, + .result = { 0x29, 0xae, 0x8b, 0xc7, 0x3e, 0x9b, 0x10, 0xa0, + 0x8b, 0x4f, 0x68, 0x1c, 0x43, 0xc3, 0xe0, 0xac, + 0x1a, 0x17, 0x1d, 0x31, 0xb3, 0x8f, 0x1a, 0x48, + 0xef, 0xba, 0x29, 0xae, 0x63, 0x9e, 0xa1, 0x34 }, + .valid = true + }, + /* wycheproof - RFC 7748 */ + { + .private = { 0xa0, 0x46, 0xe3, 0x6b, 0xf0, 0x52, 0x7c, 0x9d, + 0x3b, 0x16, 0x15, 0x4b, 0x82, 0x46, 0x5e, 0xdd, + 0x62, 0x14, 0x4c, 0x0a, 0xc1, 0xfc, 0x5a, 0x18, + 0x50, 0x6a, 0x22, 0x44, 0xba, 0x44, 0x9a, 0x44 }, + .public = { 0xe6, 0xdb, 0x68, 0x67, 0x58, 0x30, 0x30, 0xdb, + 0x35, 0x94, 0xc1, 0xa4, 0x24, 0xb1, 0x5f, 0x7c, + 0x72, 0x66, 0x24, 0xec, 0x26, 0xb3, 0x35, 0x3b, + 0x10, 0xa9, 0x03, 0xa6, 0xd0, 0xab, 0x1c, 0x4c }, + .result = { 0xc3, 0xda, 0x55, 0x37, 0x9d, 0xe9, 0xc6, 0x90, + 0x8e, 0x94, 0xea, 0x4d, 0xf2, 0x8d, 0x08, 0x4f, + 0x32, 0xec, 0xcf, 0x03, 0x49, 0x1c, 0x71, 0xf7, + 0x54, 0xb4, 0x07, 0x55, 0x77, 0xa2, 0x85, 0x52 }, + .valid = true + }, + /* wycheproof - RFC 7748 */ + { + .private = { 0x48, 0x66, 0xe9, 0xd4, 0xd1, 0xb4, 0x67, 0x3c, + 0x5a, 0xd2, 0x26, 0x91, 0x95, 0x7d, 0x6a, 0xf5, + 0xc1, 0x1b, 0x64, 0x21, 0xe0, 0xea, 0x01, 0xd4, + 0x2c, 0xa4, 0x16, 0x9e, 0x79, 0x18, 0xba, 0x4d }, + .public = { 0xe5, 0x21, 0x0f, 0x12, 0x78, 0x68, 0x11, 0xd3, + 0xf4, 0xb7, 0x95, 0x9d, 0x05, 0x38, 0xae, 0x2c, + 0x31, 0xdb, 0xe7, 0x10, 0x6f, 0xc0, 0x3c, 0x3e, + 0xfc, 0x4c, 0xd5, 0x49, 0xc7, 0x15, 0xa4, 0x13 }, + .result = { 0x95, 0xcb, 0xde, 0x94, 0x76, 0xe8, 0x90, 0x7d, + 0x7a, 0xad, 0xe4, 0x5c, 0xb4, 0xb8, 0x73, 0xf8, + 0x8b, 0x59, 0x5a, 0x68, 0x79, 0x9f, 0xa1, 0x52, + 0xe6, 0xf8, 0xf7, 0x64, 0x7a, 0xac, 0x79, 0x57 }, + .valid = true + }, + /* wycheproof - edge case for shared secret */ + { + .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .public = { 0x0a, 0xb4, 0xe7, 0x63, 0x80, 0xd8, 0x4d, 0xde, + 0x4f, 0x68, 0x33, 0xc5, 0x8f, 0x2a, 0x9f, 0xb8, + 0xf8, 0x3b, 0xb0, 0x16, 0x9b, 0x17, 0x2b, 0xe4, + 0xb6, 0xe0, 0x59, 0x28, 0x87, 0x74, 0x1a, 0x36 }, + .result = { 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .valid = true + }, + /* wycheproof - edge case for shared secret */ + { + .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .public = { 0x89, 0xe1, 0x0d, 0x57, 0x01, 0xb4, 0x33, 0x7d, + 0x2d, 0x03, 0x21, 0x81, 0x53, 0x8b, 0x10, 0x64, + 0xbd, 0x40, 0x84, 0x40, 0x1c, 0xec, 0xa1, 0xfd, + 0x12, 0x66, 0x3a, 0x19, 0x59, 0x38, 0x80, 0x00 }, + .result = { 0x09, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .valid = true + }, + /* wycheproof - edge case for shared secret */ + { + .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .public = { 0x2b, 0x55, 0xd3, 0xaa, 0x4a, 0x8f, 0x80, 0xc8, + 0xc0, 0xb2, 0xae, 0x5f, 0x93, 0x3e, 0x85, 0xaf, + 0x49, 0xbe, 0xac, 0x36, 0xc2, 0xfa, 0x73, 0x94, + 0xba, 0xb7, 0x6c, 0x89, 0x33, 0xf8, 0xf8, 0x1d }, + .result = { 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, + .valid = true + }, + /* wycheproof - edge case for shared secret */ + { + .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .public = { 0x63, 0xe5, 0xb1, 0xfe, 0x96, 0x01, 0xfe, 0x84, + 0x38, 0x5d, 0x88, 0x66, 0xb0, 0x42, 0x12, 0x62, + 0xf7, 0x8f, 0xbf, 0xa5, 0xaf, 0xf9, 0x58, 0x5e, + 0x62, 0x66, 0x79, 0xb1, 0x85, 0x47, 0xd9, 0x59 }, + .result = { 0xfe, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x3f }, + .valid = true + }, + /* wycheproof - edge case for shared secret */ + { + .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .public = { 0xe4, 0x28, 0xf3, 0xda, 0xc1, 0x78, 0x09, 0xf8, + 0x27, 0xa5, 0x22, 0xce, 0x32, 0x35, 0x50, 0x58, + 0xd0, 0x73, 0x69, 0x36, 0x4a, 0xa7, 0x89, 0x02, + 0xee, 0x10, 0x13, 0x9b, 0x9f, 0x9d, 0xd6, 0x53 }, + .result = { 0xfc, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x3f }, + .valid = true + }, + /* wycheproof - edge case for shared secret */ + { + .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .public = { 0xb3, 0xb5, 0x0e, 0x3e, 0xd3, 0xa4, 0x07, 0xb9, + 0x5d, 0xe9, 0x42, 0xef, 0x74, 0x57, 0x5b, 0x5a, + 0xb8, 0xa1, 0x0c, 0x09, 0xee, 0x10, 0x35, 0x44, + 0xd6, 0x0b, 0xdf, 0xed, 0x81, 0x38, 0xab, 0x2b }, + .result = { 0xf9, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x3f }, + .valid = true + }, + /* wycheproof - edge case for shared secret */ + { + .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .public = { 0x21, 0x3f, 0xff, 0xe9, 0x3d, 0x5e, 0xa8, 0xcd, + 0x24, 0x2e, 0x46, 0x28, 0x44, 0x02, 0x99, 0x22, + 0xc4, 0x3c, 0x77, 0xc9, 0xe3, 0xe4, 0x2f, 0x56, + 0x2f, 0x48, 0x5d, 0x24, 0xc5, 0x01, 0xa2, 0x0b }, + .result = { 0xf3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x3f }, + .valid = true + }, + /* wycheproof - edge case for shared secret */ + { + .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .public = { 0x91, 0xb2, 0x32, 0xa1, 0x78, 0xb3, 0xcd, 0x53, + 0x09, 0x32, 0x44, 0x1e, 0x61, 0x39, 0x41, 0x8f, + 0x72, 0x17, 0x22, 0x92, 0xf1, 0xda, 0x4c, 0x18, + 0x34, 0xfc, 0x5e, 0xbf, 0xef, 0xb5, 0x1e, 0x3f }, + .result = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x03 }, + .valid = true + }, + /* wycheproof - edge case for shared secret */ + { + .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .public = { 0x04, 0x5c, 0x6e, 0x11, 0xc5, 0xd3, 0x32, 0x55, + 0x6c, 0x78, 0x22, 0xfe, 0x94, 0xeb, 0xf8, 0x9b, + 0x56, 0xa3, 0x87, 0x8d, 0xc2, 0x7c, 0xa0, 0x79, + 0x10, 0x30, 0x58, 0x84, 0x9f, 0xab, 0xcb, 0x4f }, + .result = { 0xe5, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f }, + .valid = true + }, + /* wycheproof - edge case for shared secret */ + { + .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .public = { 0x1c, 0xa2, 0x19, 0x0b, 0x71, 0x16, 0x35, 0x39, + 0x06, 0x3c, 0x35, 0x77, 0x3b, 0xda, 0x0c, 0x9c, + 0x92, 0x8e, 0x91, 0x36, 0xf0, 0x62, 0x0a, 0xeb, + 0x09, 0x3f, 0x09, 0x91, 0x97, 0xb7, 0xf7, 0x4e }, + .result = { 0xe3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f }, + .valid = true + }, + /* wycheproof - edge case for shared secret */ + { + .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .public = { 0xf7, 0x6e, 0x90, 0x10, 0xac, 0x33, 0xc5, 0x04, + 0x3b, 0x2d, 0x3b, 0x76, 0xa8, 0x42, 0x17, 0x10, + 0x00, 0xc4, 0x91, 0x62, 0x22, 0xe9, 0xe8, 0x58, + 0x97, 0xa0, 0xae, 0xc7, 0xf6, 0x35, 0x0b, 0x3c }, + .result = { 0xdd, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f }, + .valid = true + }, + /* wycheproof - edge case for shared secret */ + { + .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .public = { 0xbb, 0x72, 0x68, 0x8d, 0x8f, 0x8a, 0xa7, 0xa3, + 0x9c, 0xd6, 0x06, 0x0c, 0xd5, 0xc8, 0x09, 0x3c, + 0xde, 0xc6, 0xfe, 0x34, 0x19, 0x37, 0xc3, 0x88, + 0x6a, 0x99, 0x34, 0x6c, 0xd0, 0x7f, 0xaa, 0x55 }, + .result = { 0xdb, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f }, + .valid = true + }, + /* wycheproof - edge case for shared secret */ + { + .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .public = { 0x88, 0xfd, 0xde, 0xa1, 0x93, 0x39, 0x1c, 0x6a, + 0x59, 0x33, 0xef, 0x9b, 0x71, 0x90, 0x15, 0x49, + 0x44, 0x72, 0x05, 0xaa, 0xe9, 0xda, 0x92, 0x8a, + 0x6b, 0x91, 0xa3, 0x52, 0xba, 0x10, 0xf4, 0x1f }, + .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02 }, + .valid = true + }, + /* wycheproof - edge case for shared secret */ + { + .private = { 0xa0, 0xa4, 0xf1, 0x30, 0xb9, 0x8a, 0x5b, 0xe4, + 0xb1, 0xce, 0xdb, 0x7c, 0xb8, 0x55, 0x84, 0xa3, + 0x52, 0x0e, 0x14, 0x2d, 0x47, 0x4d, 0xc9, 0xcc, + 0xb9, 0x09, 0xa0, 0x73, 0xa9, 0x76, 0xbf, 0x63 }, + .public = { 0x30, 0x3b, 0x39, 0x2f, 0x15, 0x31, 0x16, 0xca, + 0xd9, 0xcc, 0x68, 0x2a, 0x00, 0xcc, 0xc4, 0x4c, + 0x95, 0xff, 0x0d, 0x3b, 0xbe, 0x56, 0x8b, 0xeb, + 0x6c, 0x4e, 0x73, 0x9b, 0xaf, 0xdc, 0x2c, 0x68 }, + .result = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80, 0x00 }, + .valid = true + }, + /* wycheproof - checking for overflow */ + { + .private = { 0xc8, 0x17, 0x24, 0x70, 0x40, 0x00, 0xb2, 0x6d, + 0x31, 0x70, 0x3c, 0xc9, 0x7e, 0x3a, 0x37, 0x8d, + 0x56, 0xfa, 0xd8, 0x21, 0x93, 0x61, 0xc8, 0x8c, + 0xca, 0x8b, 0xd7, 0xc5, 0x71, 0x9b, 0x12, 0xb2 }, + .public = { 0xfd, 0x30, 0x0a, 0xeb, 0x40, 0xe1, 0xfa, 0x58, + 0x25, 0x18, 0x41, 0x2b, 0x49, 0xb2, 0x08, 0xa7, + 0x84, 0x2b, 0x1e, 0x1f, 0x05, 0x6a, 0x04, 0x01, + 0x78, 0xea, 0x41, 0x41, 0x53, 0x4f, 0x65, 0x2d }, + .result = { 0xb7, 0x34, 0x10, 0x5d, 0xc2, 0x57, 0x58, 0x5d, + 0x73, 0xb5, 0x66, 0xcc, 0xb7, 0x6f, 0x06, 0x27, + 0x95, 0xcc, 0xbe, 0xc8, 0x91, 0x28, 0xe5, 0x2b, + 0x02, 0xf3, 0xe5, 0x96, 0x39, 0xf1, 0x3c, 0x46 }, + .valid = true + }, + /* wycheproof - checking for overflow */ + { + .private = { 0xc8, 0x17, 0x24, 0x70, 0x40, 0x00, 0xb2, 0x6d, + 0x31, 0x70, 0x3c, 0xc9, 0x7e, 0x3a, 0x37, 0x8d, + 0x56, 0xfa, 0xd8, 0x21, 0x93, 0x61, 0xc8, 0x8c, + 0xca, 0x8b, 0xd7, 0xc5, 0x71, 0x9b, 0x12, 0xb2 }, + .public = { 0xc8, 0xef, 0x79, 0xb5, 0x14, 0xd7, 0x68, 0x26, + 0x77, 0xbc, 0x79, 0x31, 0xe0, 0x6e, 0xe5, 0xc2, + 0x7c, 0x9b, 0x39, 0x2b, 0x4a, 0xe9, 0x48, 0x44, + 0x73, 0xf5, 0x54, 0xe6, 0x67, 0x8e, 0xcc, 0x2e }, + .result = { 0x64, 0x7a, 0x46, 0xb6, 0xfc, 0x3f, 0x40, 0xd6, + 0x21, 0x41, 0xee, 0x3c, 0xee, 0x70, 0x6b, 0x4d, + 0x7a, 0x92, 0x71, 0x59, 0x3a, 0x7b, 0x14, 0x3e, + 0x8e, 0x2e, 0x22, 0x79, 0x88, 0x3e, 0x45, 0x50 }, + .valid = true + }, + /* wycheproof - checking for overflow */ + { + .private = { 0xc8, 0x17, 0x24, 0x70, 0x40, 0x00, 0xb2, 0x6d, + 0x31, 0x70, 0x3c, 0xc9, 0x7e, 0x3a, 0x37, 0x8d, + 0x56, 0xfa, 0xd8, 0x21, 0x93, 0x61, 0xc8, 0x8c, + 0xca, 0x8b, 0xd7, 0xc5, 0x71, 0x9b, 0x12, 0xb2 }, + .public = { 0x64, 0xae, 0xac, 0x25, 0x04, 0x14, 0x48, 0x61, + 0x53, 0x2b, 0x7b, 0xbc, 0xb6, 0xc8, 0x7d, 0x67, + 0xdd, 0x4c, 0x1f, 0x07, 0xeb, 0xc2, 0xe0, 0x6e, + 0xff, 0xb9, 0x5a, 0xec, 0xc6, 0x17, 0x0b, 0x2c }, + .result = { 0x4f, 0xf0, 0x3d, 0x5f, 0xb4, 0x3c, 0xd8, 0x65, + 0x7a, 0x3c, 0xf3, 0x7c, 0x13, 0x8c, 0xad, 0xce, + 0xcc, 0xe5, 0x09, 0xe4, 0xeb, 0xa0, 0x89, 0xd0, + 0xef, 0x40, 0xb4, 0xe4, 0xfb, 0x94, 0x61, 0x55 }, + .valid = true + }, + /* wycheproof - checking for overflow */ + { + .private = { 0xc8, 0x17, 0x24, 0x70, 0x40, 0x00, 0xb2, 0x6d, + 0x31, 0x70, 0x3c, 0xc9, 0x7e, 0x3a, 0x37, 0x8d, + 0x56, 0xfa, 0xd8, 0x21, 0x93, 0x61, 0xc8, 0x8c, + 0xca, 0x8b, 0xd7, 0xc5, 0x71, 0x9b, 0x12, 0xb2 }, + .public = { 0xbf, 0x68, 0xe3, 0x5e, 0x9b, 0xdb, 0x7e, 0xee, + 0x1b, 0x50, 0x57, 0x02, 0x21, 0x86, 0x0f, 0x5d, + 0xcd, 0xad, 0x8a, 0xcb, 0xab, 0x03, 0x1b, 0x14, + 0x97, 0x4c, 0xc4, 0x90, 0x13, 0xc4, 0x98, 0x31 }, + .result = { 0x21, 0xce, 0xe5, 0x2e, 0xfd, 0xbc, 0x81, 0x2e, + 0x1d, 0x02, 0x1a, 0x4a, 0xf1, 0xe1, 0xd8, 0xbc, + 0x4d, 0xb3, 0xc4, 0x00, 0xe4, 0xd2, 0xa2, 0xc5, + 0x6a, 0x39, 0x26, 0xdb, 0x4d, 0x99, 0xc6, 0x5b }, + .valid = true + }, + /* wycheproof - checking for overflow */ + { + .private = { 0xc8, 0x17, 0x24, 0x70, 0x40, 0x00, 0xb2, 0x6d, + 0x31, 0x70, 0x3c, 0xc9, 0x7e, 0x3a, 0x37, 0x8d, + 0x56, 0xfa, 0xd8, 0x21, 0x93, 0x61, 0xc8, 0x8c, + 0xca, 0x8b, 0xd7, 0xc5, 0x71, 0x9b, 0x12, 0xb2 }, + .public = { 0x53, 0x47, 0xc4, 0x91, 0x33, 0x1a, 0x64, 0xb4, + 0x3d, 0xdc, 0x68, 0x30, 0x34, 0xe6, 0x77, 0xf5, + 0x3d, 0xc3, 0x2b, 0x52, 0xa5, 0x2a, 0x57, 0x7c, + 0x15, 0xa8, 0x3b, 0xf2, 0x98, 0xe9, 0x9f, 0x19 }, + .result = { 0x18, 0xcb, 0x89, 0xe4, 0xe2, 0x0c, 0x0c, 0x2b, + 0xd3, 0x24, 0x30, 0x52, 0x45, 0x26, 0x6c, 0x93, + 0x27, 0x69, 0x0b, 0xbe, 0x79, 0xac, 0xb8, 0x8f, + 0x5b, 0x8f, 0xb3, 0xf7, 0x4e, 0xca, 0x3e, 0x52 }, + .valid = true + }, + /* wycheproof - private key == -1 (mod order) */ + { + .private = { 0xa0, 0x23, 0xcd, 0xd0, 0x83, 0xef, 0x5b, 0xb8, + 0x2f, 0x10, 0xd6, 0x2e, 0x59, 0xe1, 0x5a, 0x68, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x50 }, + .public = { 0x25, 0x8e, 0x04, 0x52, 0x3b, 0x8d, 0x25, 0x3e, + 0xe6, 0x57, 0x19, 0xfc, 0x69, 0x06, 0xc6, 0x57, + 0x19, 0x2d, 0x80, 0x71, 0x7e, 0xdc, 0x82, 0x8f, + 0xa0, 0xaf, 0x21, 0x68, 0x6e, 0x2f, 0xaa, 0x75 }, + .result = { 0x25, 0x8e, 0x04, 0x52, 0x3b, 0x8d, 0x25, 0x3e, + 0xe6, 0x57, 0x19, 0xfc, 0x69, 0x06, 0xc6, 0x57, + 0x19, 0x2d, 0x80, 0x71, 0x7e, 0xdc, 0x82, 0x8f, + 0xa0, 0xaf, 0x21, 0x68, 0x6e, 0x2f, 0xaa, 0x75 }, + .valid = true + }, + /* wycheproof - private key == 1 (mod order) on twist */ + { + .private = { 0x58, 0x08, 0x3d, 0xd2, 0x61, 0xad, 0x91, 0xef, + 0xf9, 0x52, 0x32, 0x2e, 0xc8, 0x24, 0xc6, 0x82, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, + 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x5f }, + .public = { 0x2e, 0xae, 0x5e, 0xc3, 0xdd, 0x49, 0x4e, 0x9f, + 0x2d, 0x37, 0xd2, 0x58, 0xf8, 0x73, 0xa8, 0xe6, + 0xe9, 0xd0, 0xdb, 0xd1, 0xe3, 0x83, 0xef, 0x64, + 0xd9, 0x8b, 0xb9, 0x1b, 0x3e, 0x0b, 0xe0, 0x35 }, + .result = { 0x2e, 0xae, 0x5e, 0xc3, 0xdd, 0x49, 0x4e, 0x9f, + 0x2d, 0x37, 0xd2, 0x58, 0xf8, 0x73, 0xa8, 0xe6, + 0xe9, 0xd0, 0xdb, 0xd1, 0xe3, 0x83, 0xef, 0x64, + 0xd9, 0x8b, 0xb9, 0x1b, 0x3e, 0x0b, 0xe0, 0x35 }, + .valid = true + } +}; + +bool __init curve25519_selftest(void) +{ + bool success = true, ret, ret2; + size_t i = 0, j; + u8 in[CURVE25519_KEY_SIZE]; + u8 out[CURVE25519_KEY_SIZE], out2[CURVE25519_KEY_SIZE], + out3[CURVE25519_KEY_SIZE]; + + for (i = 0; i < ARRAY_SIZE(curve25519_test_vectors); ++i) { + memset(out, 0, CURVE25519_KEY_SIZE); + ret = curve25519(out, curve25519_test_vectors[i].private, + curve25519_test_vectors[i].public); + if (ret != curve25519_test_vectors[i].valid || + memcmp(out, curve25519_test_vectors[i].result, + CURVE25519_KEY_SIZE)) { + pr_err("curve25519 self-test %zu: FAIL\n", i + 1); + success = false; + } + } + + for (i = 0; i < 5; ++i) { + get_random_bytes(in, sizeof(in)); + ret = curve25519_generate_public(out, in); + ret2 = curve25519(out2, in, (u8[CURVE25519_KEY_SIZE]){ 9 }); + curve25519_generic(out3, in, (u8[CURVE25519_KEY_SIZE]){ 9 }); + if (ret != ret2 || + memcmp(out, out2, CURVE25519_KEY_SIZE) || + memcmp(out, out3, CURVE25519_KEY_SIZE)) { + pr_err("curve25519 basepoint self-test %zu: FAIL: input - 0x", + i + 1); + for (j = CURVE25519_KEY_SIZE; j-- > 0;) + printk(KERN_CONT "%02x", in[j]); + printk(KERN_CONT "\n"); + success = false; + } + } + + return success; +} diff --git a/lib/crypto/curve25519.c b/lib/crypto/curve25519.c new file mode 100644 index 000000000000..288a62cd29b2 --- /dev/null +++ b/lib/crypto/curve25519.c @@ -0,0 +1,35 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + * + * This is an implementation of the Curve25519 ECDH algorithm, using either + * a 32-bit implementation or a 64-bit implementation with 128-bit integers, + * depending on what is supported by the target compiler. + * + * Information: https://cr.yp.to/ecdh.html + */ + +#include +#include +#include + +bool curve25519_selftest(void); + +static int __init mod_init(void) +{ + if (!IS_ENABLED(CONFIG_CRYPTO_MANAGER_DISABLE_TESTS) && + WARN_ON(!curve25519_selftest())) + return -ENODEV; + return 0; +} + +static void __exit mod_exit(void) +{ +} + +module_init(mod_init); +module_exit(mod_exit); + +MODULE_LICENSE("GPL v2"); +MODULE_DESCRIPTION("Curve25519 scalar multiplication"); +MODULE_AUTHOR("Jason A. Donenfeld "); diff --git a/lib/crypto/libchacha.c b/lib/crypto/libchacha.c new file mode 100644 index 000000000000..dabc3accae05 --- /dev/null +++ b/lib/crypto/libchacha.c @@ -0,0 +1,35 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * The ChaCha stream cipher (RFC7539) + * + * Copyright (C) 2015 Martin Willi + */ + +#include +#include +#include + +#include // for crypto_xor_cpy +#include + +void chacha_crypt_generic(u32 *state, u8 *dst, const u8 *src, + unsigned int bytes, int nrounds) +{ + /* aligned to potentially speed up crypto_xor() */ + u8 stream[CHACHA_BLOCK_SIZE] __aligned(sizeof(long)); + + while (bytes >= CHACHA_BLOCK_SIZE) { + chacha_block_generic(state, stream, nrounds); + crypto_xor_cpy(dst, src, stream, CHACHA_BLOCK_SIZE); + bytes -= CHACHA_BLOCK_SIZE; + dst += CHACHA_BLOCK_SIZE; + src += CHACHA_BLOCK_SIZE; + } + if (bytes) { + chacha_block_generic(state, stream, nrounds); + crypto_xor_cpy(dst, src, stream, bytes); + } +} +EXPORT_SYMBOL(chacha_crypt_generic); + +MODULE_LICENSE("GPL"); diff --git a/lib/crypto/poly1305-donna32.c b/lib/crypto/poly1305-donna32.c new file mode 100644 index 000000000000..3cc77d94390b --- /dev/null +++ b/lib/crypto/poly1305-donna32.c @@ -0,0 +1,204 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + * + * This is based in part on Andrew Moon's poly1305-donna, which is in the + * public domain. + */ + +#include +#include +#include + +void poly1305_core_setkey(struct poly1305_core_key *key, const u8 raw_key[16]) +{ + /* r &= 0xffffffc0ffffffc0ffffffc0fffffff */ + key->key.r[0] = (get_unaligned_le32(&raw_key[0])) & 0x3ffffff; + key->key.r[1] = (get_unaligned_le32(&raw_key[3]) >> 2) & 0x3ffff03; + key->key.r[2] = (get_unaligned_le32(&raw_key[6]) >> 4) & 0x3ffc0ff; + key->key.r[3] = (get_unaligned_le32(&raw_key[9]) >> 6) & 0x3f03fff; + key->key.r[4] = (get_unaligned_le32(&raw_key[12]) >> 8) & 0x00fffff; + + /* s = 5*r */ + key->precomputed_s.r[0] = key->key.r[1] * 5; + key->precomputed_s.r[1] = key->key.r[2] * 5; + key->precomputed_s.r[2] = key->key.r[3] * 5; + key->precomputed_s.r[3] = key->key.r[4] * 5; +} +EXPORT_SYMBOL(poly1305_core_setkey); + +void poly1305_core_blocks(struct poly1305_state *state, + const struct poly1305_core_key *key, const void *src, + unsigned int nblocks, u32 hibit) +{ + const u8 *input = src; + u32 r0, r1, r2, r3, r4; + u32 s1, s2, s3, s4; + u32 h0, h1, h2, h3, h4; + u64 d0, d1, d2, d3, d4; + u32 c; + + if (!nblocks) + return; + + hibit <<= 24; + + r0 = key->key.r[0]; + r1 = key->key.r[1]; + r2 = key->key.r[2]; + r3 = key->key.r[3]; + r4 = key->key.r[4]; + + s1 = key->precomputed_s.r[0]; + s2 = key->precomputed_s.r[1]; + s3 = key->precomputed_s.r[2]; + s4 = key->precomputed_s.r[3]; + + h0 = state->h[0]; + h1 = state->h[1]; + h2 = state->h[2]; + h3 = state->h[3]; + h4 = state->h[4]; + + do { + /* h += m[i] */ + h0 += (get_unaligned_le32(&input[0])) & 0x3ffffff; + h1 += (get_unaligned_le32(&input[3]) >> 2) & 0x3ffffff; + h2 += (get_unaligned_le32(&input[6]) >> 4) & 0x3ffffff; + h3 += (get_unaligned_le32(&input[9]) >> 6) & 0x3ffffff; + h4 += (get_unaligned_le32(&input[12]) >> 8) | hibit; + + /* h *= r */ + d0 = ((u64)h0 * r0) + ((u64)h1 * s4) + + ((u64)h2 * s3) + ((u64)h3 * s2) + + ((u64)h4 * s1); + d1 = ((u64)h0 * r1) + ((u64)h1 * r0) + + ((u64)h2 * s4) + ((u64)h3 * s3) + + ((u64)h4 * s2); + d2 = ((u64)h0 * r2) + ((u64)h1 * r1) + + ((u64)h2 * r0) + ((u64)h3 * s4) + + ((u64)h4 * s3); + d3 = ((u64)h0 * r3) + ((u64)h1 * r2) + + ((u64)h2 * r1) + ((u64)h3 * r0) + + ((u64)h4 * s4); + d4 = ((u64)h0 * r4) + ((u64)h1 * r3) + + ((u64)h2 * r2) + ((u64)h3 * r1) + + ((u64)h4 * r0); + + /* (partial) h %= p */ + c = (u32)(d0 >> 26); + h0 = (u32)d0 & 0x3ffffff; + d1 += c; + c = (u32)(d1 >> 26); + h1 = (u32)d1 & 0x3ffffff; + d2 += c; + c = (u32)(d2 >> 26); + h2 = (u32)d2 & 0x3ffffff; + d3 += c; + c = (u32)(d3 >> 26); + h3 = (u32)d3 & 0x3ffffff; + d4 += c; + c = (u32)(d4 >> 26); + h4 = (u32)d4 & 0x3ffffff; + h0 += c * 5; + c = (h0 >> 26); + h0 = h0 & 0x3ffffff; + h1 += c; + + input += POLY1305_BLOCK_SIZE; + } while (--nblocks); + + state->h[0] = h0; + state->h[1] = h1; + state->h[2] = h2; + state->h[3] = h3; + state->h[4] = h4; +} +EXPORT_SYMBOL(poly1305_core_blocks); + +void poly1305_core_emit(const struct poly1305_state *state, const u32 nonce[4], + void *dst) +{ + u8 *mac = dst; + u32 h0, h1, h2, h3, h4, c; + u32 g0, g1, g2, g3, g4; + u64 f; + u32 mask; + + /* fully carry h */ + h0 = state->h[0]; + h1 = state->h[1]; + h2 = state->h[2]; + h3 = state->h[3]; + h4 = state->h[4]; + + c = h1 >> 26; + h1 = h1 & 0x3ffffff; + h2 += c; + c = h2 >> 26; + h2 = h2 & 0x3ffffff; + h3 += c; + c = h3 >> 26; + h3 = h3 & 0x3ffffff; + h4 += c; + c = h4 >> 26; + h4 = h4 & 0x3ffffff; + h0 += c * 5; + c = h0 >> 26; + h0 = h0 & 0x3ffffff; + h1 += c; + + /* compute h + -p */ + g0 = h0 + 5; + c = g0 >> 26; + g0 &= 0x3ffffff; + g1 = h1 + c; + c = g1 >> 26; + g1 &= 0x3ffffff; + g2 = h2 + c; + c = g2 >> 26; + g2 &= 0x3ffffff; + g3 = h3 + c; + c = g3 >> 26; + g3 &= 0x3ffffff; + g4 = h4 + c - (1UL << 26); + + /* select h if h < p, or h + -p if h >= p */ + mask = (g4 >> ((sizeof(u32) * 8) - 1)) - 1; + g0 &= mask; + g1 &= mask; + g2 &= mask; + g3 &= mask; + g4 &= mask; + mask = ~mask; + + h0 = (h0 & mask) | g0; + h1 = (h1 & mask) | g1; + h2 = (h2 & mask) | g2; + h3 = (h3 & mask) | g3; + h4 = (h4 & mask) | g4; + + /* h = h % (2^128) */ + h0 = ((h0) | (h1 << 26)) & 0xffffffff; + h1 = ((h1 >> 6) | (h2 << 20)) & 0xffffffff; + h2 = ((h2 >> 12) | (h3 << 14)) & 0xffffffff; + h3 = ((h3 >> 18) | (h4 << 8)) & 0xffffffff; + + if (likely(nonce)) { + /* mac = (h + nonce) % (2^128) */ + f = (u64)h0 + nonce[0]; + h0 = (u32)f; + f = (u64)h1 + nonce[1] + (f >> 32); + h1 = (u32)f; + f = (u64)h2 + nonce[2] + (f >> 32); + h2 = (u32)f; + f = (u64)h3 + nonce[3] + (f >> 32); + h3 = (u32)f; + } + + put_unaligned_le32(h0, &mac[0]); + put_unaligned_le32(h1, &mac[4]); + put_unaligned_le32(h2, &mac[8]); + put_unaligned_le32(h3, &mac[12]); +} +EXPORT_SYMBOL(poly1305_core_emit); diff --git a/lib/crypto/poly1305-donna64.c b/lib/crypto/poly1305-donna64.c new file mode 100644 index 000000000000..6ae181bb4345 --- /dev/null +++ b/lib/crypto/poly1305-donna64.c @@ -0,0 +1,185 @@ +// SPDX-License-Identifier: GPL-2.0 OR MIT +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + * + * This is based in part on Andrew Moon's poly1305-donna, which is in the + * public domain. + */ + +#include +#include +#include + +typedef __uint128_t u128; + +void poly1305_core_setkey(struct poly1305_core_key *key, const u8 raw_key[16]) +{ + u64 t0, t1; + + /* r &= 0xffffffc0ffffffc0ffffffc0fffffff */ + t0 = get_unaligned_le64(&raw_key[0]); + t1 = get_unaligned_le64(&raw_key[8]); + + key->key.r64[0] = t0 & 0xffc0fffffffULL; + key->key.r64[1] = ((t0 >> 44) | (t1 << 20)) & 0xfffffc0ffffULL; + key->key.r64[2] = ((t1 >> 24)) & 0x00ffffffc0fULL; + + /* s = 20*r */ + key->precomputed_s.r64[0] = key->key.r64[1] * 20; + key->precomputed_s.r64[1] = key->key.r64[2] * 20; +} +EXPORT_SYMBOL(poly1305_core_setkey); + +void poly1305_core_blocks(struct poly1305_state *state, + const struct poly1305_core_key *key, const void *src, + unsigned int nblocks, u32 hibit) +{ + const u8 *input = src; + u64 hibit64; + u64 r0, r1, r2; + u64 s1, s2; + u64 h0, h1, h2; + u64 c; + u128 d0, d1, d2, d; + + if (!nblocks) + return; + + hibit64 = ((u64)hibit) << 40; + + r0 = key->key.r64[0]; + r1 = key->key.r64[1]; + r2 = key->key.r64[2]; + + h0 = state->h64[0]; + h1 = state->h64[1]; + h2 = state->h64[2]; + + s1 = key->precomputed_s.r64[0]; + s2 = key->precomputed_s.r64[1]; + + do { + u64 t0, t1; + + /* h += m[i] */ + t0 = get_unaligned_le64(&input[0]); + t1 = get_unaligned_le64(&input[8]); + + h0 += t0 & 0xfffffffffffULL; + h1 += ((t0 >> 44) | (t1 << 20)) & 0xfffffffffffULL; + h2 += (((t1 >> 24)) & 0x3ffffffffffULL) | hibit64; + + /* h *= r */ + d0 = (u128)h0 * r0; + d = (u128)h1 * s2; + d0 += d; + d = (u128)h2 * s1; + d0 += d; + d1 = (u128)h0 * r1; + d = (u128)h1 * r0; + d1 += d; + d = (u128)h2 * s2; + d1 += d; + d2 = (u128)h0 * r2; + d = (u128)h1 * r1; + d2 += d; + d = (u128)h2 * r0; + d2 += d; + + /* (partial) h %= p */ + c = (u64)(d0 >> 44); + h0 = (u64)d0 & 0xfffffffffffULL; + d1 += c; + c = (u64)(d1 >> 44); + h1 = (u64)d1 & 0xfffffffffffULL; + d2 += c; + c = (u64)(d2 >> 42); + h2 = (u64)d2 & 0x3ffffffffffULL; + h0 += c * 5; + c = h0 >> 44; + h0 = h0 & 0xfffffffffffULL; + h1 += c; + + input += POLY1305_BLOCK_SIZE; + } while (--nblocks); + + state->h64[0] = h0; + state->h64[1] = h1; + state->h64[2] = h2; +} +EXPORT_SYMBOL(poly1305_core_blocks); + +void poly1305_core_emit(const struct poly1305_state *state, const u32 nonce[4], + void *dst) +{ + u8 *mac = dst; + u64 h0, h1, h2, c; + u64 g0, g1, g2; + u64 t0, t1; + + /* fully carry h */ + h0 = state->h64[0]; + h1 = state->h64[1]; + h2 = state->h64[2]; + + c = h1 >> 44; + h1 &= 0xfffffffffffULL; + h2 += c; + c = h2 >> 42; + h2 &= 0x3ffffffffffULL; + h0 += c * 5; + c = h0 >> 44; + h0 &= 0xfffffffffffULL; + h1 += c; + c = h1 >> 44; + h1 &= 0xfffffffffffULL; + h2 += c; + c = h2 >> 42; + h2 &= 0x3ffffffffffULL; + h0 += c * 5; + c = h0 >> 44; + h0 &= 0xfffffffffffULL; + h1 += c; + + /* compute h + -p */ + g0 = h0 + 5; + c = g0 >> 44; + g0 &= 0xfffffffffffULL; + g1 = h1 + c; + c = g1 >> 44; + g1 &= 0xfffffffffffULL; + g2 = h2 + c - (1ULL << 42); + + /* select h if h < p, or h + -p if h >= p */ + c = (g2 >> ((sizeof(u64) * 8) - 1)) - 1; + g0 &= c; + g1 &= c; + g2 &= c; + c = ~c; + h0 = (h0 & c) | g0; + h1 = (h1 & c) | g1; + h2 = (h2 & c) | g2; + + if (likely(nonce)) { + /* h = (h + nonce) */ + t0 = ((u64)nonce[1] << 32) | nonce[0]; + t1 = ((u64)nonce[3] << 32) | nonce[2]; + + h0 += t0 & 0xfffffffffffULL; + c = h0 >> 44; + h0 &= 0xfffffffffffULL; + h1 += (((t0 >> 44) | (t1 << 20)) & 0xfffffffffffULL) + c; + c = h1 >> 44; + h1 &= 0xfffffffffffULL; + h2 += (((t1 >> 24)) & 0x3ffffffffffULL) + c; + h2 &= 0x3ffffffffffULL; + } + + /* mac = h % (2^128) */ + h0 = h0 | (h1 << 44); + h1 = (h1 >> 20) | (h2 << 24); + + put_unaligned_le64(h0, &mac[0]); + put_unaligned_le64(h1, &mac[8]); +} +EXPORT_SYMBOL(poly1305_core_emit); diff --git a/lib/crypto/poly1305.c b/lib/crypto/poly1305.c new file mode 100644 index 000000000000..9d2d14df0fee --- /dev/null +++ b/lib/crypto/poly1305.c @@ -0,0 +1,77 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Poly1305 authenticator algorithm, RFC7539 + * + * Copyright (C) 2015 Martin Willi + * + * Based on public domain code by Andrew Moon and Daniel J. Bernstein. + */ + +#include +#include +#include +#include + +void poly1305_init_generic(struct poly1305_desc_ctx *desc, const u8 *key) +{ + poly1305_core_setkey(&desc->core_r, key); + desc->s[0] = get_unaligned_le32(key + 16); + desc->s[1] = get_unaligned_le32(key + 20); + desc->s[2] = get_unaligned_le32(key + 24); + desc->s[3] = get_unaligned_le32(key + 28); + poly1305_core_init(&desc->h); + desc->buflen = 0; + desc->sset = true; + desc->rset = 2; +} +EXPORT_SYMBOL_GPL(poly1305_init_generic); + +void poly1305_update_generic(struct poly1305_desc_ctx *desc, const u8 *src, + unsigned int nbytes) +{ + unsigned int bytes; + + if (unlikely(desc->buflen)) { + bytes = min(nbytes, POLY1305_BLOCK_SIZE - desc->buflen); + memcpy(desc->buf + desc->buflen, src, bytes); + src += bytes; + nbytes -= bytes; + desc->buflen += bytes; + + if (desc->buflen == POLY1305_BLOCK_SIZE) { + poly1305_core_blocks(&desc->h, &desc->core_r, desc->buf, + 1, 1); + desc->buflen = 0; + } + } + + if (likely(nbytes >= POLY1305_BLOCK_SIZE)) { + poly1305_core_blocks(&desc->h, &desc->core_r, src, + nbytes / POLY1305_BLOCK_SIZE, 1); + src += nbytes - (nbytes % POLY1305_BLOCK_SIZE); + nbytes %= POLY1305_BLOCK_SIZE; + } + + if (unlikely(nbytes)) { + desc->buflen = nbytes; + memcpy(desc->buf, src, nbytes); + } +} +EXPORT_SYMBOL_GPL(poly1305_update_generic); + +void poly1305_final_generic(struct poly1305_desc_ctx *desc, u8 *dst) +{ + if (unlikely(desc->buflen)) { + desc->buf[desc->buflen++] = 1; + memset(desc->buf + desc->buflen, 0, + POLY1305_BLOCK_SIZE - desc->buflen); + poly1305_core_blocks(&desc->h, &desc->core_r, desc->buf, 1, 0); + } + + poly1305_core_emit(&desc->h, desc->s, dst); + *desc = (struct poly1305_desc_ctx){}; +} +EXPORT_SYMBOL_GPL(poly1305_final_generic); + +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Martin Willi "); diff --git a/lib/fonts/font_10x18.c b/lib/fonts/font_10x18.c index 0e2deac97da0..e02f9df24d1e 100644 --- a/lib/fonts/font_10x18.c +++ b/lib/fonts/font_10x18.c @@ -8,7 +8,7 @@ #define FONTDATAMAX 9216 -static struct font_data fontdata_10x18 = { +static const struct font_data fontdata_10x18 = { { 0, 0, FONTDATAMAX, 0 }, { /* 0 0x00 '^@' */ 0x00, 0x00, /* 0000000000 */ diff --git a/lib/fonts/font_6x10.c b/lib/fonts/font_6x10.c index 87da8acd07db..6e3c4b7691c8 100644 --- a/lib/fonts/font_6x10.c +++ b/lib/fonts/font_6x10.c @@ -3,7 +3,7 @@ #define FONTDATAMAX 2560 -static struct font_data fontdata_6x10 = { +static const struct font_data fontdata_6x10 = { { 0, 0, FONTDATAMAX, 0 }, { /* 0 0x00 '^@' */ 0x00, /* 00000000 */ diff --git a/lib/fonts/font_6x11.c b/lib/fonts/font_6x11.c index 5e975dfa10a5..2d22a24e816f 100644 --- a/lib/fonts/font_6x11.c +++ b/lib/fonts/font_6x11.c @@ -9,7 +9,7 @@ #define FONTDATAMAX (11*256) -static struct font_data fontdata_6x11 = { +static const struct font_data fontdata_6x11 = { { 0, 0, FONTDATAMAX, 0 }, { /* 0 0x00 '^@' */ 0x00, /* 00000000 */ diff --git a/lib/fonts/font_7x14.c b/lib/fonts/font_7x14.c index 86d298f38505..9cc7ae2e03f7 100644 --- a/lib/fonts/font_7x14.c +++ b/lib/fonts/font_7x14.c @@ -8,7 +8,7 @@ #define FONTDATAMAX 3584 -static struct font_data fontdata_7x14 = { +static const struct font_data fontdata_7x14 = { { 0, 0, FONTDATAMAX, 0 }, { /* 0 0x00 '^@' */ 0x00, /* 0000000 */ diff --git a/lib/fonts/font_8x16.c b/lib/fonts/font_8x16.c index 37cedd36ca5e..bab25dc59e8d 100644 --- a/lib/fonts/font_8x16.c +++ b/lib/fonts/font_8x16.c @@ -10,7 +10,7 @@ #define FONTDATAMAX 4096 -static struct font_data fontdata_8x16 = { +static const struct font_data fontdata_8x16 = { { 0, 0, FONTDATAMAX, 0 }, { /* 0 0x00 '^@' */ 0x00, /* 00000000 */ diff --git a/lib/fonts/font_8x8.c b/lib/fonts/font_8x8.c index 8ab695538395..109d0572368f 100644 --- a/lib/fonts/font_8x8.c +++ b/lib/fonts/font_8x8.c @@ -9,7 +9,7 @@ #define FONTDATAMAX 2048 -static struct font_data fontdata_8x8 = { +static const struct font_data fontdata_8x8 = { { 0, 0, FONTDATAMAX, 0 }, { /* 0 0x00 '^@' */ 0x00, /* 00000000 */ diff --git a/lib/fonts/font_acorn_8x8.c b/lib/fonts/font_acorn_8x8.c index 069b3e80c434..fb395f0d4031 100644 --- a/lib/fonts/font_acorn_8x8.c +++ b/lib/fonts/font_acorn_8x8.c @@ -5,7 +5,7 @@ #define FONTDATAMAX 2048 -static struct font_data acorndata_8x8 = { +static const struct font_data acorndata_8x8 = { { 0, 0, FONTDATAMAX, 0 }, { /* 00 */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* ^@ */ /* 01 */ 0x7e, 0x81, 0xa5, 0x81, 0xbd, 0x99, 0x81, 0x7e, /* ^A */ diff --git a/lib/fonts/font_mini_4x6.c b/lib/fonts/font_mini_4x6.c index 1449876c6a27..592774a90917 100644 --- a/lib/fonts/font_mini_4x6.c +++ b/lib/fonts/font_mini_4x6.c @@ -43,7 +43,7 @@ __END__; #define FONTDATAMAX 1536 -static struct font_data fontdata_mini_4x6 = { +static const struct font_data fontdata_mini_4x6 = { { 0, 0, FONTDATAMAX, 0 }, { /*{*/ /* Char 0: ' ' */ diff --git a/lib/fonts/font_pearl_8x8.c b/lib/fonts/font_pearl_8x8.c index 32d65551e7ed..a6f95ebce950 100644 --- a/lib/fonts/font_pearl_8x8.c +++ b/lib/fonts/font_pearl_8x8.c @@ -14,7 +14,7 @@ #define FONTDATAMAX 2048 -static struct font_data fontdata_pearl8x8 = { +static const struct font_data fontdata_pearl8x8 = { { 0, 0, FONTDATAMAX, 0 }, { /* 0 0x00 '^@' */ 0x00, /* 00000000 */ diff --git a/lib/fonts/font_sun12x22.c b/lib/fonts/font_sun12x22.c index 641a6b4dca42..a5b65bd49604 100644 --- a/lib/fonts/font_sun12x22.c +++ b/lib/fonts/font_sun12x22.c @@ -3,7 +3,7 @@ #define FONTDATAMAX 11264 -static struct font_data fontdata_sun12x22 = { +static const struct font_data fontdata_sun12x22 = { { 0, 0, FONTDATAMAX, 0 }, { /* 0 0x00 '^@' */ 0x00, 0x00, /* 000000000000 */ diff --git a/lib/fonts/font_sun8x16.c b/lib/fonts/font_sun8x16.c index 193fe6d988e0..e577e76a6a7c 100644 --- a/lib/fonts/font_sun8x16.c +++ b/lib/fonts/font_sun8x16.c @@ -3,7 +3,7 @@ #define FONTDATAMAX 4096 -static struct font_data fontdata_sun8x16 = { +static const struct font_data fontdata_sun8x16 = { { 0, 0, FONTDATAMAX, 0 }, { /* */ 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, /* */ 0x00,0x00,0x7e,0x81,0xa5,0x81,0x81,0xbd,0x99,0x81,0x81,0x7e,0x00,0x00,0x00,0x00, diff --git a/lib/scatterlist.c b/lib/scatterlist.c index 60e7eca2f4be..3b859201f84c 100644 --- a/lib/scatterlist.c +++ b/lib/scatterlist.c @@ -506,7 +506,7 @@ struct scatterlist *sgl_alloc_order(unsigned long long length, elem_len = min_t(u64, length, PAGE_SIZE << order); page = alloc_pages(gfp, order); if (!page) { - sgl_free(sgl); + sgl_free_order(sgl, order); return NULL; } diff --git a/mm/memcontrol.c b/mm/memcontrol.c index aa730a3d5c25..87cd5bf1b487 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -4780,7 +4780,7 @@ static struct page *mc_handle_swap_pte(struct vm_area_struct *vma, struct page *page = NULL; swp_entry_t ent = pte_to_swp_entry(ptent); - if (!(mc.flags & MOVE_ANON) || non_swap_entry(ent)) + if (!(mc.flags & MOVE_ANON)) return NULL; /* @@ -4799,6 +4799,9 @@ static struct page *mc_handle_swap_pte(struct vm_area_struct *vma, return page; } + if (non_swap_entry(ent)) + return NULL; + /* * Because lookup_swap_cache() updates some statistics counter, * we call find_get_page() with swapper_space directly. diff --git a/mm/mempolicy.c b/mm/mempolicy.c index 19d0c550a30f..06492ac031ee 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -499,7 +499,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr, unsigned long flags = qp->flags; int ret; bool has_unmovable = false; - pte_t *pte; + pte_t *pte, *mapped_pte; spinlock_t *ptl; ptl = pmd_trans_huge_lock(pmd, vma); @@ -513,7 +513,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr, if (pmd_trans_unstable(pmd)) return 0; - pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl); + mapped_pte = pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl); for (; addr != end; pte++, addr += PAGE_SIZE) { if (!pte_present(*pte)) continue; @@ -545,7 +545,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr, } else break; } - pte_unmap_unlock(pte - 1, ptl); + pte_unmap_unlock(mapped_pte, ptl); cond_resched(); if (has_unmovable) diff --git a/mm/slob.c b/mm/slob.c index 307c2c9feb44..3b18cde8e6a3 100644 --- a/mm/slob.c +++ b/mm/slob.c @@ -474,6 +474,7 @@ void *__kmalloc_track_caller(size_t size, gfp_t gfp, unsigned long caller) { return __do_kmalloc_node(size, gfp, NUMA_NO_NODE, caller); } +EXPORT_SYMBOL(__kmalloc_track_caller); #ifdef CONFIG_NUMA void *__kmalloc_node_track_caller(size_t size, gfp_t gfp, @@ -481,6 +482,7 @@ void *__kmalloc_node_track_caller(size_t size, gfp_t gfp, { return __do_kmalloc_node(size, gfp, node, caller); } +EXPORT_SYMBOL(__kmalloc_node_track_caller); #endif void kfree(const void *block) diff --git a/mm/slub.c b/mm/slub.c index 363a41f34166..dc50ed7f3a88 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -4430,6 +4430,7 @@ void *__kmalloc_track_caller(size_t size, gfp_t gfpflags, unsigned long caller) return ret; } +EXPORT_SYMBOL(__kmalloc_track_caller); #ifdef CONFIG_NUMA void *__kmalloc_node_track_caller(size_t size, gfp_t gfpflags, @@ -4460,6 +4461,7 @@ void *__kmalloc_node_track_caller(size_t size, gfp_t gfpflags, return ret; } +EXPORT_SYMBOL(__kmalloc_node_track_caller); #endif #ifdef CONFIG_SYSFS diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c index b6dcb40fa8a7..9268f808afc0 100644 --- a/net/9p/trans_fd.c +++ b/net/9p/trans_fd.c @@ -1038,7 +1038,7 @@ p9_fd_create_unix(struct p9_client *client, const char *addr, char *args) csocket = NULL; - if (addr == NULL) + if (!addr || !strlen(addr)) return -EINVAL; if (strlen(addr) >= UNIX_PATH_MAX) { diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c index 2a85dc3be8bf..198a1fdd6709 100644 --- a/net/bluetooth/l2cap_sock.c +++ b/net/bluetooth/l2cap_sock.c @@ -1341,8 +1341,6 @@ static void l2cap_sock_teardown_cb(struct l2cap_chan *chan, int err) parent = bt_sk(sk)->parent; - sock_set_flag(sk, SOCK_ZAPPED); - switch (chan->state) { case BT_OPEN: case BT_BOUND: @@ -1369,8 +1367,11 @@ static void l2cap_sock_teardown_cb(struct l2cap_chan *chan, int err) break; } - release_sock(sk); + + /* Only zap after cleanup to avoid use after free race */ + sock_set_flag(sk, SOCK_ZAPPED); + } static void l2cap_sock_state_change_cb(struct l2cap_chan *chan, int state, diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c index f7d7f32ac673..21bd37ec5511 100644 --- a/net/ceph/messenger.c +++ b/net/ceph/messenger.c @@ -3037,6 +3037,11 @@ static void con_fault(struct ceph_connection *con) ceph_msg_put(con->in_msg); con->in_msg = NULL; } + if (con->out_msg) { + BUG_ON(con->out_msg->con != con); + ceph_msg_put(con->out_msg); + con->out_msg = NULL; + } /* Requeue anything that hasn't been acked */ list_splice_init(&con->out_sent, &con->out_queue); diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c index b036c55f5cb1..7c10bc4dacd3 100644 --- a/net/dsa/dsa2.c +++ b/net/dsa/dsa2.c @@ -261,6 +261,7 @@ static int dsa_port_setup(struct dsa_port *dp) int err = 0; memset(&dp->devlink_port, 0, sizeof(dp->devlink_port)); + dp->mac = of_get_mac_address(dp->dn); if (dp->type != DSA_PORT_TYPE_UNUSED) err = devlink_port_register(ds->devlink, &dp->devlink_port, diff --git a/net/dsa/slave.c b/net/dsa/slave.c index 8ee28b6016d8..d03c67e761df 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -1313,7 +1313,10 @@ int dsa_slave_create(struct dsa_port *port) slave_dev->features = master->vlan_features | NETIF_F_HW_TC; slave_dev->hw_features |= NETIF_F_HW_TC; slave_dev->ethtool_ops = &dsa_slave_ethtool_ops; - eth_hw_addr_inherit(slave_dev, master); + if (port->mac && is_valid_ether_addr(port->mac)) + ether_addr_copy(slave_dev->dev_addr, port->mac); + else + eth_hw_addr_inherit(slave_dev, master); slave_dev->priv_flags |= IFF_NO_QUEUE; slave_dev->netdev_ops = &dsa_slave_netdev_ops; slave_dev->switchdev_ops = &dsa_slave_switchdev_ops; diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c index 4efa5e33513e..84b2f2146499 100644 --- a/net/ipv4/icmp.c +++ b/net/ipv4/icmp.c @@ -244,7 +244,7 @@ static struct { /** * icmp_global_allow - Are we allowed to send one more ICMP message ? * - * Uses a token bucket to limit our ICMP messages to sysctl_icmp_msgs_per_sec. + * Uses a token bucket to limit our ICMP messages to ~sysctl_icmp_msgs_per_sec. * Returns false if we reached the limit and can not send another packet. * Note: called with BH disabled */ @@ -272,7 +272,10 @@ bool icmp_global_allow(void) } credit = min_t(u32, icmp_global.credit + incr, sysctl_icmp_msgs_burst); if (credit) { - credit--; + /* We want to use a credit of one in average, but need to randomize + * it for security reasons. + */ + credit = max_t(int, credit - prandom_u32_max(3), 0); rc = true; } WRITE_ONCE(icmp_global.credit, credit); @@ -752,6 +755,39 @@ out:; } EXPORT_SYMBOL(__icmp_send); +#if IS_ENABLED(CONFIG_NF_NAT) +#include +void icmp_ndo_send(struct sk_buff *skb_in, int type, int code, __be32 info) +{ + struct sk_buff *cloned_skb = NULL; + enum ip_conntrack_info ctinfo; + struct nf_conn *ct; + __be32 orig_ip; + + ct = nf_ct_get(skb_in, &ctinfo); + if (!ct || !(ct->status & IPS_SRC_NAT)) { + icmp_send(skb_in, type, code, info); + return; + } + + if (skb_shared(skb_in)) + skb_in = cloned_skb = skb_clone(skb_in, GFP_ATOMIC); + + if (unlikely(!skb_in || skb_network_header(skb_in) < skb_in->head || + (skb_network_header(skb_in) + sizeof(struct iphdr)) > + skb_tail_pointer(skb_in) || skb_ensure_writable(skb_in, + skb_network_offset(skb_in) + sizeof(struct iphdr)))) + goto out; + + orig_ip = ip_hdr(skb_in)->saddr; + ip_hdr(skb_in)->saddr = ct->tuplehash[0].tuple.src.u3.ip; + icmp_send(skb_in, type, code, info); + ip_hdr(skb_in)->saddr = orig_ip; +out: + consume_skb(cloned_skb); +} +EXPORT_SYMBOL(icmp_ndo_send); +#endif static void icmp_socket_deliver(struct sk_buff *skb, u32 info) { diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c index ffcb5983107d..de6f89511a21 100644 --- a/net/ipv4/ip_gre.c +++ b/net/ipv4/ip_gre.c @@ -680,9 +680,7 @@ static netdev_tx_t ipgre_xmit(struct sk_buff *skb, } if (dev->header_ops) { - /* Need space for new headers */ - if (skb_cow_head(skb, dev->needed_headroom - - (tunnel->hlen + sizeof(struct iphdr)))) + if (skb_cow_head(skb, 0)) goto free_skb; tnl_params = (const struct iphdr *)skb->data; @@ -800,7 +798,11 @@ static void ipgre_link_update(struct net_device *dev, bool set_mtu) len = tunnel->tun_hlen - len; tunnel->hlen = tunnel->hlen + len; - dev->needed_headroom = dev->needed_headroom + len; + if (dev->header_ops) + dev->hard_header_len += len; + else + dev->needed_headroom += len; + if (set_mtu) dev->mtu = max_t(int, dev->mtu - len, 68); @@ -1003,6 +1005,7 @@ static void __gre_tunnel_init(struct net_device *dev) tunnel->parms.iph.protocol = IPPROTO_GRE; tunnel->hlen = tunnel->tun_hlen + tunnel->encap_hlen; + dev->needed_headroom = tunnel->hlen + sizeof(tunnel->parms.iph); dev->features |= GRE_FEATURES; dev->hw_features |= GRE_FEATURES; @@ -1046,10 +1049,14 @@ static int ipgre_tunnel_init(struct net_device *dev) return -EINVAL; dev->flags = IFF_BROADCAST; dev->header_ops = &ipgre_header_ops; + dev->hard_header_len = tunnel->hlen + sizeof(*iph); + dev->needed_headroom = 0; } #endif } else if (!tunnel->collect_md) { dev->header_ops = &ipgre_header_ops; + dev->hard_header_len = tunnel->hlen + sizeof(*iph); + dev->needed_headroom = 0; } return ip_tunnel_init(dev); diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c index 67ef9d853d90..e36d10285bfa 100644 --- a/net/ipv4/ip_tunnel_core.c +++ b/net/ipv4/ip_tunnel_core.c @@ -440,3 +440,18 @@ void ip_tunnel_unneed_metadata(void) static_branch_dec(&ip_tunnel_metadata_cnt); } EXPORT_SYMBOL_GPL(ip_tunnel_unneed_metadata); + +/* Returns either the correct skb->protocol value, or 0 if invalid. */ +__be16 ip_tunnel_parse_protocol(const struct sk_buff *skb) +{ + if (skb_network_header(skb) >= skb->head && + (skb_network_header(skb) + sizeof(struct iphdr)) <= skb_tail_pointer(skb) && + ip_hdr(skb)->version == 4) + return htons(ETH_P_IP); + if (skb_network_header(skb) >= skb->head && + (skb_network_header(skb) + sizeof(struct ipv6hdr)) <= skb_tail_pointer(skb) && + ipv6_hdr(skb)->version == 6) + return htons(ETH_P_IPV6); + return 0; +} +EXPORT_SYMBOL(ip_tunnel_parse_protocol); diff --git a/net/ipv4/netfilter/nf_log_arp.c b/net/ipv4/netfilter/nf_log_arp.c index df5c2a2061a4..19fff2c589fa 100644 --- a/net/ipv4/netfilter/nf_log_arp.c +++ b/net/ipv4/netfilter/nf_log_arp.c @@ -46,16 +46,31 @@ static void dump_arp_packet(struct nf_log_buf *m, const struct nf_loginfo *info, const struct sk_buff *skb, unsigned int nhoff) { - const struct arphdr *ah; - struct arphdr _arph; const struct arppayload *ap; struct arppayload _arpp; + const struct arphdr *ah; + unsigned int logflags; + struct arphdr _arph; ah = skb_header_pointer(skb, 0, sizeof(_arph), &_arph); if (ah == NULL) { nf_log_buf_add(m, "TRUNCATED"); return; } + + if (info->type == NF_LOG_TYPE_LOG) + logflags = info->u.log.logflags; + else + logflags = NF_LOG_DEFAULT_MASK; + + if (logflags & NF_LOG_MACDECODE) { + nf_log_buf_add(m, "MACSRC=%pM MACDST=%pM ", + eth_hdr(skb)->h_source, eth_hdr(skb)->h_dest); + nf_log_dump_vlan(m, skb); + nf_log_buf_add(m, "MACPROTO=%04x ", + ntohs(eth_hdr(skb)->h_proto)); + } + nf_log_buf_add(m, "ARP HTYPE=%d PTYPE=0x%04x OPCODE=%d", ntohs(ah->ar_hrd), ntohs(ah->ar_pro), ntohs(ah->ar_op)); diff --git a/net/ipv4/netfilter/nf_log_ipv4.c b/net/ipv4/netfilter/nf_log_ipv4.c index 1e6f28c97d3a..cde1918607e9 100644 --- a/net/ipv4/netfilter/nf_log_ipv4.c +++ b/net/ipv4/netfilter/nf_log_ipv4.c @@ -287,8 +287,10 @@ static void dump_ipv4_mac_header(struct nf_log_buf *m, switch (dev->type) { case ARPHRD_ETHER: - nf_log_buf_add(m, "MACSRC=%pM MACDST=%pM MACPROTO=%04x ", - eth_hdr(skb)->h_source, eth_hdr(skb)->h_dest, + nf_log_buf_add(m, "MACSRC=%pM MACDST=%pM ", + eth_hdr(skb)->h_source, eth_hdr(skb)->h_dest); + nf_log_dump_vlan(m, skb); + nf_log_buf_add(m, "MACPROTO=%04x ", ntohs(eth_hdr(skb)->h_proto)); return; default: diff --git a/net/ipv4/route.c b/net/ipv4/route.c index 3db428242b22..48081e6d50b4 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -2634,10 +2634,12 @@ struct rtable *ip_route_output_flow(struct net *net, struct flowi4 *flp4, if (IS_ERR(rt)) return rt; - if (flp4->flowi4_proto) + if (flp4->flowi4_proto) { + flp4->flowi4_oif = rt->dst.dev->ifindex; rt = (struct rtable *)xfrm_lookup_route(net, &rt->dst, flowi4_to_flowi(flp4), sk, 0); + } return rt; } diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 6aaa4d54e1a9..629ac16c0e1e 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -500,6 +500,8 @@ static inline bool tcp_stream_is_readable(const struct tcp_sock *tp, return true; if (tcp_rmem_pressure(sk)) return true; + if (tcp_receive_window(tp) <= inet_csk(sk)->icsk_ack.rcv_mss) + return true; } if (sk->sk_prot->stream_memory_read) return sk->sk_prot->stream_memory_read(sk); diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 314ec6e01b05..b42e72ced1dc 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -4704,7 +4704,8 @@ void tcp_data_ready(struct sock *sk) int avail = tp->rcv_nxt - tp->copied_seq; if (avail < sk->sk_rcvlowat && !tcp_rmem_pressure(sk) && - !sock_flag(sk, SOCK_DONE)) + !sock_flag(sk, SOCK_DONE) && + tcp_receive_window(tp) > inet_csk(sk)->icsk_ack.rcv_mss) return; sk->sk_data_ready(sk); @@ -5632,6 +5633,8 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb) tcp_data_snd_check(sk); if (!inet_csk_ack_scheduled(sk)) goto no_ack; + } else { + tcp_update_wl(tp, TCP_SKB_CB(skb)->seq); } __tcp_ack_snd_check(sk, 0); diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index b924941b96a3..8b5459b34bc4 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -2417,8 +2417,10 @@ static void *ipv6_route_seq_start(struct seq_file *seq, loff_t *pos) iter->skip = *pos; if (iter->tbl) { + loff_t p = 0; + ipv6_route_seq_setup_walk(iter, net); - return ipv6_route_seq_next(seq, NULL, pos); + return ipv6_route_seq_next(seq, NULL, &p); } else { return NULL; } diff --git a/net/ipv6/ip6_icmp.c b/net/ipv6/ip6_icmp.c index 02045494c24c..e0086758b6ee 100644 --- a/net/ipv6/ip6_icmp.c +++ b/net/ipv6/ip6_icmp.c @@ -45,4 +45,38 @@ out: rcu_read_unlock(); } EXPORT_SYMBOL(icmpv6_send); + +#if IS_ENABLED(CONFIG_NF_NAT) +#include +void icmpv6_ndo_send(struct sk_buff *skb_in, u8 type, u8 code, __u32 info) +{ + struct sk_buff *cloned_skb = NULL; + enum ip_conntrack_info ctinfo; + struct in6_addr orig_ip; + struct nf_conn *ct; + + ct = nf_ct_get(skb_in, &ctinfo); + if (!ct || !(ct->status & IPS_SRC_NAT)) { + icmpv6_send(skb_in, type, code, info); + return; + } + + if (skb_shared(skb_in)) + skb_in = cloned_skb = skb_clone(skb_in, GFP_ATOMIC); + + if (unlikely(!skb_in || skb_network_header(skb_in) < skb_in->head || + (skb_network_header(skb_in) + sizeof(struct ipv6hdr)) > + skb_tail_pointer(skb_in) || skb_ensure_writable(skb_in, + skb_network_offset(skb_in) + sizeof(struct ipv6hdr)))) + goto out; + + orig_ip = ipv6_hdr(skb_in)->saddr; + ipv6_hdr(skb_in)->saddr = ct->tuplehash[0].tuple.src.u3.in6; + icmpv6_send(skb_in, type, code, info); + ipv6_hdr(skb_in)->saddr = orig_ip; +out: + consume_skb(cloned_skb); +} +EXPORT_SYMBOL(icmpv6_ndo_send); +#endif #endif diff --git a/net/ipv6/netfilter/nf_log_ipv6.c b/net/ipv6/netfilter/nf_log_ipv6.c index c6bf580d0f33..c456e2f902b9 100644 --- a/net/ipv6/netfilter/nf_log_ipv6.c +++ b/net/ipv6/netfilter/nf_log_ipv6.c @@ -300,9 +300,11 @@ static void dump_ipv6_mac_header(struct nf_log_buf *m, switch (dev->type) { case ARPHRD_ETHER: - nf_log_buf_add(m, "MACSRC=%pM MACDST=%pM MACPROTO=%04x ", - eth_hdr(skb)->h_source, eth_hdr(skb)->h_dest, - ntohs(eth_hdr(skb)->h_proto)); + nf_log_buf_add(m, "MACSRC=%pM MACDST=%pM ", + eth_hdr(skb)->h_source, eth_hdr(skb)->h_dest); + nf_log_dump_vlan(m, skb); + nf_log_buf_add(m, "MACPROTO=%04x ", + ntohs(eth_hdr(skb)->h_proto)); return; default: break; diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c index b6670e74aeb7..9926455dd546 100644 --- a/net/mac80211/cfg.c +++ b/net/mac80211/cfg.c @@ -664,7 +664,8 @@ void sta_set_rate_info_tx(struct sta_info *sta, u16 brate; sband = ieee80211_get_sband(sta->sdata); - if (sband) { + WARN_ON_ONCE(sband && !sband->bitrates); + if (sband && sband->bitrates) { brate = sband->bitrates[rate->idx].bitrate; rinfo->legacy = DIV_ROUND_UP(brate, 1 << shift); } diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c index 2a82d438991b..9968b8a976f1 100644 --- a/net/mac80211/sta_info.c +++ b/net/mac80211/sta_info.c @@ -2009,6 +2009,10 @@ static void sta_stats_decode_rate(struct ieee80211_local *local, u32 rate, int rate_idx = STA_STATS_GET(LEGACY_IDX, rate); sband = local->hw.wiphy->bands[band]; + + if (WARN_ON_ONCE(!sband->bitrates)) + break; + brate = sband->bitrates[rate_idx].bitrate; if (rinfo->bw == RATE_INFO_BW_5) shift = 2; diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c index c339b5e386b7..3ad1de081e3c 100644 --- a/net/netfilter/ipvs/ip_vs_ctl.c +++ b/net/netfilter/ipvs/ip_vs_ctl.c @@ -2393,6 +2393,10 @@ do_ip_vs_set_ctl(struct sock *sk, int cmd, void __user *user, unsigned int len) /* Set timeout values for (tcp tcpfin udp) */ ret = ip_vs_set_timeout(ipvs, (struct ip_vs_timeout_user *)arg); goto out_unlock; + } else if (!len) { + /* No more commands with len == 0 below */ + ret = -EINVAL; + goto out_unlock; } usvc_compat = (struct ip_vs_service_user *)arg; @@ -2469,9 +2473,6 @@ do_ip_vs_set_ctl(struct sock *sk, int cmd, void __user *user, unsigned int len) break; case IP_VS_SO_SET_DELDEST: ret = ip_vs_del_dest(svc, &udest); - break; - default: - ret = -EINVAL; } out_unlock: diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c index 3f75cd947045..11f7c546e57b 100644 --- a/net/netfilter/ipvs/ip_vs_xmit.c +++ b/net/netfilter/ipvs/ip_vs_xmit.c @@ -586,6 +586,8 @@ static inline int ip_vs_tunnel_xmit_prepare(struct sk_buff *skb, if (ret == NF_ACCEPT) { nf_reset(skb); skb_forward_csum(skb); + if (skb->dev) + skb->tstamp = 0; } return ret; } @@ -626,6 +628,8 @@ static inline int ip_vs_nat_send_or_cont(int pf, struct sk_buff *skb, if (!local) { skb_forward_csum(skb); + if (skb->dev) + skb->tstamp = 0; NF_HOOK(pf, NF_INET_LOCAL_OUT, cp->ipvs->net, NULL, skb, NULL, skb_dst(skb)->dev, dst_output); } else @@ -646,6 +650,8 @@ static inline int ip_vs_send_or_cont(int pf, struct sk_buff *skb, if (!local) { ip_vs_drop_early_demux_sk(skb); skb_forward_csum(skb); + if (skb->dev) + skb->tstamp = 0; NF_HOOK(pf, NF_INET_LOCAL_OUT, cp->ipvs->net, NULL, skb, NULL, skb_dst(skb)->dev, dst_output); } else diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c index 7011ab27c437..40f8a1252394 100644 --- a/net/netfilter/nf_conntrack_proto_tcp.c +++ b/net/netfilter/nf_conntrack_proto_tcp.c @@ -549,13 +549,20 @@ static bool tcp_in_window(const struct nf_conn *ct, swin = win << sender->td_scale; sender->td_maxwin = (swin == 0 ? 1 : swin); sender->td_maxend = end + sender->td_maxwin; - /* - * We haven't seen traffic in the other direction yet - * but we have to tweak window tracking to pass III - * and IV until that happens. - */ - if (receiver->td_maxwin == 0) + if (receiver->td_maxwin == 0) { + /* We haven't seen traffic in the other + * direction yet but we have to tweak window + * tracking to pass III and IV until that + * happens. + */ receiver->td_end = receiver->td_maxend = sack; + } else if (sack == receiver->td_end + 1) { + /* Likely a reply to a keepalive. + * Needed for III. + */ + receiver->td_end++; + } + } } else if (((state->state == TCP_CONNTRACK_SYN_SENT && dir == IP_CT_DIR_ORIGINAL) diff --git a/net/netfilter/nf_dup_netdev.c b/net/netfilter/nf_dup_netdev.c index f4a566e67213..98d117f3340c 100644 --- a/net/netfilter/nf_dup_netdev.c +++ b/net/netfilter/nf_dup_netdev.c @@ -21,6 +21,7 @@ static void nf_do_netdev_egress(struct sk_buff *skb, struct net_device *dev) skb_push(skb, skb->mac_len); skb->dev = dev; + skb->tstamp = 0; dev_queue_xmit(skb); } diff --git a/net/netfilter/nf_log_common.c b/net/netfilter/nf_log_common.c index a8c5c846aec1..b164a0e1e053 100644 --- a/net/netfilter/nf_log_common.c +++ b/net/netfilter/nf_log_common.c @@ -176,6 +176,18 @@ nf_log_dump_packet_common(struct nf_log_buf *m, u_int8_t pf, } EXPORT_SYMBOL_GPL(nf_log_dump_packet_common); +void nf_log_dump_vlan(struct nf_log_buf *m, const struct sk_buff *skb) +{ + u16 vid; + + if (!skb_vlan_tag_present(skb)) + return; + + vid = skb_vlan_tag_get(skb); + nf_log_buf_add(m, "VPROTO=%04x VID=%u ", ntohs(skb->vlan_proto), vid); +} +EXPORT_SYMBOL_GPL(nf_log_dump_vlan); + /* bridge and netdev logging families share this code. */ void nf_log_l2packet(struct net *net, u_int8_t pf, __be16 protocol, diff --git a/net/netfilter/nft_fwd_netdev.c b/net/netfilter/nft_fwd_netdev.c index 649edbe77a20..10a12e094929 100644 --- a/net/netfilter/nft_fwd_netdev.c +++ b/net/netfilter/nft_fwd_netdev.c @@ -129,6 +129,7 @@ static void nft_fwd_neigh_eval(const struct nft_expr *expr, return; skb->dev = dev; + skb->tstamp = 0; neigh_xmit(neigh_table, dev, addr, skb); out: regs->verdict.code = verdict; diff --git a/net/nfc/netlink.c b/net/nfc/netlink.c index a65a5a5f434b..310872a9d660 100644 --- a/net/nfc/netlink.c +++ b/net/nfc/netlink.c @@ -1235,7 +1235,7 @@ static int nfc_genl_fw_download(struct sk_buff *skb, struct genl_info *info) u32 idx; char firmware_name[NFC_FIRMWARE_NAME_MAXSIZE + 1]; - if (!info->attrs[NFC_ATTR_DEVICE_INDEX]) + if (!info->attrs[NFC_ATTR_DEVICE_INDEX] || !info->attrs[NFC_ATTR_FIRMWARE_NAME]) return -EINVAL; idx = nla_get_u32(info->attrs[NFC_ATTR_DEVICE_INDEX]); diff --git a/net/sched/act_tunnel_key.c b/net/sched/act_tunnel_key.c index e4fc6b2bc29d..f43234be5695 100644 --- a/net/sched/act_tunnel_key.c +++ b/net/sched/act_tunnel_key.c @@ -314,7 +314,7 @@ static int tunnel_key_init(struct net *net, struct nlattr *nla, metadata = __ipv6_tun_set_dst(&saddr, &daddr, tos, ttl, dst_port, 0, flags, - key_id, 0); + key_id, opts_len); } else { NL_SET_ERR_MSG(extack, "Missing either ipv4 or ipv6 src and dst"); ret = -EINVAL; diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c index 014a28d8dd4f..02d8d3fd84a5 100644 --- a/net/sched/sch_netem.c +++ b/net/sched/sch_netem.c @@ -330,7 +330,7 @@ static s64 tabledist(s64 mu, s32 sigma, /* default uniform distribution */ if (dist == NULL) - return ((rnd % (2 * sigma)) + mu) - sigma; + return ((rnd % (2 * (u32)sigma)) + mu) - sigma; t = dist->table[rnd % dist->size]; x = (sigma % NETEM_DIST_SCALE) * t; @@ -787,6 +787,10 @@ static void get_slot(struct netem_sched_data *q, const struct nlattr *attr) q->slot_config.max_packets = INT_MAX; if (q->slot_config.max_bytes == 0) q->slot_config.max_bytes = INT_MAX; + + /* capping dist_jitter to the range acceptable by tabledist() */ + q->slot_config.dist_jitter = min_t(__s64, INT_MAX, abs(q->slot_config.dist_jitter)); + q->slot.packets_left = q->slot_config.max_packets; q->slot.bytes_left = q->slot_config.max_bytes; if (q->slot_config.min_delay | q->slot_config.max_delay | @@ -1011,6 +1015,9 @@ static int netem_change(struct Qdisc *sch, struct nlattr *opt, if (tb[TCA_NETEM_SLOT]) get_slot(q, tb[TCA_NETEM_SLOT]); + /* capping jitter to the range acceptable by tabledist() */ + q->jitter = min_t(s64, abs(q->jitter), INT_MAX); + return ret; get_table_failure: diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c index 567517e44811..bb4f1d0da153 100644 --- a/net/sctp/sm_sideeffect.c +++ b/net/sctp/sm_sideeffect.c @@ -1615,12 +1615,12 @@ static int sctp_cmd_interpreter(enum sctp_event event_type, break; case SCTP_CMD_INIT_FAILED: - sctp_cmd_init_failed(commands, asoc, cmd->obj.u32); + sctp_cmd_init_failed(commands, asoc, cmd->obj.u16); break; case SCTP_CMD_ASSOC_FAILED: sctp_cmd_assoc_failed(commands, asoc, event_type, - subtype, chunk, cmd->obj.u32); + subtype, chunk, cmd->obj.u16); break; case SCTP_CMD_INIT_COUNTER_INC: diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c index 2c9baf8bf118..e7a6c8dcf6b8 100644 --- a/net/smc/smc_core.c +++ b/net/smc/smc_core.c @@ -770,7 +770,7 @@ static struct smc_buf_desc *smcr_new_buf_create(struct smc_link_group *lgr, return buf_desc; } -#define SMCD_DMBE_SIZES 7 /* 0 -> 16KB, 1 -> 32KB, .. 6 -> 1MB */ +#define SMCD_DMBE_SIZES 6 /* 0 -> 16KB, 1 -> 32KB, .. 6 -> 1MB */ static struct smc_buf_desc *smcd_new_buf_create(struct smc_link_group *lgr, bool is_dmb, int bufsize) diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c index 68259eec6afd..ab086081be9c 100644 --- a/net/sunrpc/auth_gss/svcauth_gss.c +++ b/net/sunrpc/auth_gss/svcauth_gss.c @@ -1079,9 +1079,9 @@ static int gss_read_proxy_verf(struct svc_rqst *rqstp, struct gssp_in_token *in_token) { struct kvec *argv = &rqstp->rq_arg.head[0]; - unsigned int page_base, length; - int pages, i, res; - size_t inlen; + unsigned int length, pgto_offs, pgfrom_offs; + int pages, i, res, pgto, pgfrom; + size_t inlen, to_offs, from_offs; res = gss_read_common_verf(gc, argv, authp, in_handle); if (res) @@ -1109,17 +1109,24 @@ static int gss_read_proxy_verf(struct svc_rqst *rqstp, memcpy(page_address(in_token->pages[0]), argv->iov_base, length); inlen -= length; - i = 1; - page_base = rqstp->rq_arg.page_base; + to_offs = length; + from_offs = rqstp->rq_arg.page_base; while (inlen) { - length = min_t(unsigned int, inlen, PAGE_SIZE); - memcpy(page_address(in_token->pages[i]), - page_address(rqstp->rq_arg.pages[i]) + page_base, + pgto = to_offs >> PAGE_SHIFT; + pgfrom = from_offs >> PAGE_SHIFT; + pgto_offs = to_offs & ~PAGE_MASK; + pgfrom_offs = from_offs & ~PAGE_MASK; + + length = min_t(unsigned int, inlen, + min_t(unsigned int, PAGE_SIZE - pgto_offs, + PAGE_SIZE - pgfrom_offs)); + memcpy(page_address(in_token->pages[pgto]) + pgto_offs, + page_address(rqstp->rq_arg.pages[pgfrom]) + pgfrom_offs, length); + to_offs += length; + from_offs += length; inlen -= length; - page_base = 0; - i++; } return 0; } diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c b/net/sunrpc/xprtrdma/svc_rdma_sendto.c index aa4d19a780d7..4062cd624b26 100644 --- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c +++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c @@ -639,10 +639,11 @@ static int svc_rdma_pull_up_reply_msg(struct svcxprt_rdma *rdma, while (remaining) { len = min_t(u32, PAGE_SIZE - pageoff, remaining); - memcpy(dst, page_address(*ppages), len); + memcpy(dst, page_address(*ppages) + pageoff, len); remaining -= len; dst += len; pageoff = 0; + ppages++; } } diff --git a/net/tipc/core.c b/net/tipc/core.c index 49bb9fc0aa06..ce0f067d0faf 100644 --- a/net/tipc/core.c +++ b/net/tipc/core.c @@ -93,6 +93,11 @@ out_sk_rht: static void __net_exit tipc_exit_net(struct net *net) { tipc_net_stop(net); + + /* Make sure the tipc_net_finalize_work stopped + * before releasing the resources. + */ + flush_scheduled_work(); tipc_bcast_stop(net); tipc_nametbl_stop(net); tipc_sk_rht_destroy(net); diff --git a/net/tipc/msg.c b/net/tipc/msg.c index b078b77620f1..f04843ca8216 100644 --- a/net/tipc/msg.c +++ b/net/tipc/msg.c @@ -140,11 +140,11 @@ int tipc_buf_append(struct sk_buff **headbuf, struct sk_buff **buf) if (fragid == FIRST_FRAGMENT) { if (unlikely(head)) goto err; + *buf = NULL; frag = skb_unshare(frag, GFP_ATOMIC); if (unlikely(!frag)) goto err; head = *headbuf = frag; - *buf = NULL; TIPC_SKB_CB(head)->tail = NULL; if (skb_is_nonlinear(head)) { skb_walk_frags(head, tail) { diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c index 575d62130578..dd0fc2aa6875 100644 --- a/net/tls/tls_device.c +++ b/net/tls/tls_device.c @@ -351,13 +351,13 @@ static int tls_push_data(struct sock *sk, struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_offload_context_tx *ctx = tls_offload_ctx_tx(tls_ctx); int tls_push_record_flags = flags | MSG_SENDPAGE_NOTLAST; - int more = flags & (MSG_SENDPAGE_NOTLAST | MSG_MORE); struct tls_record_info *record = ctx->open_record; struct page_frag *pfrag; size_t orig_size = size; u32 max_open_record_len; - int copy, rc = 0; + bool more = false; bool done = false; + int copy, rc = 0; long timeo; if (flags & @@ -422,9 +422,8 @@ handle_error: if (!size) { last_record: tls_push_record_flags = flags; - if (more) { - tls_ctx->pending_open_record_frags = - record->num_frags; + if (flags & (MSG_SENDPAGE_NOTLAST | MSG_MORE)) { + more = true; break; } @@ -445,6 +444,8 @@ last_record: } } while (!done); + tls_ctx->pending_open_record_frags = more; + if (orig_size - size > 0) rc = orig_size - size; diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index c88dc8ee3144..02374459c417 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -629,7 +629,7 @@ struct sock *__vsock_create(struct net *net, vsk->owner = get_cred(psk->owner); vsk->connect_timeout = psk->connect_timeout; } else { - vsk->trusted = capable(CAP_NET_ADMIN); + vsk->trusted = ns_capable_noaudit(&init_user_ns, CAP_NET_ADMIN); vsk->owner = get_current_cred(); vsk->connect_timeout = VSOCK_DEFAULT_CONNECT_TIMEOUT; } diff --git a/net/xfrm/Kconfig b/net/xfrm/Kconfig index 372c91faa283..a552b73c5a27 100644 --- a/net/xfrm/Kconfig +++ b/net/xfrm/Kconfig @@ -27,6 +27,17 @@ config XFRM_USER If unsure, say Y. +config XFRM_USER_COMPAT + tristate "Compatible ABI support" + depends on XFRM_USER && COMPAT_FOR_U64_ALIGNMENT && \ + HAVE_EFFICIENT_UNALIGNED_ACCESS + select WANT_COMPAT_NETLINK_MESSAGES + help + Transformation(XFRM) user configuration interface like IPsec + used by compatible Linux applications. + + If unsure, say N. + config XFRM_INTERFACE tristate "Transformation virtual interface" depends on XFRM && IPV6 diff --git a/net/xfrm/Makefile b/net/xfrm/Makefile index fbc4552d17b8..f32511588d56 100644 --- a/net/xfrm/Makefile +++ b/net/xfrm/Makefile @@ -9,5 +9,6 @@ obj-$(CONFIG_XFRM) := xfrm_policy.o xfrm_state.o xfrm_hash.o \ obj-$(CONFIG_XFRM_STATISTICS) += xfrm_proc.o obj-$(CONFIG_XFRM_ALGO) += xfrm_algo.o obj-$(CONFIG_XFRM_USER) += xfrm_user.o +obj-$(CONFIG_XFRM_USER_COMPAT) += xfrm_compat.o obj-$(CONFIG_XFRM_IPCOMP) += xfrm_ipcomp.o obj-$(CONFIG_XFRM_INTERFACE) += xfrm_interface.o diff --git a/net/xfrm/xfrm_compat.c b/net/xfrm/xfrm_compat.c new file mode 100644 index 000000000000..6f597ba266ca --- /dev/null +++ b/net/xfrm/xfrm_compat.c @@ -0,0 +1,625 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * XFRM compat layer + * Author: Dmitry Safonov + * Based on code and translator idea by: Florian Westphal + */ +#include +#include +#include + +struct compat_xfrm_lifetime_cfg { + compat_u64 soft_byte_limit, hard_byte_limit; + compat_u64 soft_packet_limit, hard_packet_limit; + compat_u64 soft_add_expires_seconds, hard_add_expires_seconds; + compat_u64 soft_use_expires_seconds, hard_use_expires_seconds; +}; /* same size on 32bit, but only 4 byte alignment required */ + +struct compat_xfrm_lifetime_cur { + compat_u64 bytes, packets, add_time, use_time; +}; /* same size on 32bit, but only 4 byte alignment required */ + +struct compat_xfrm_userpolicy_info { + struct xfrm_selector sel; + struct compat_xfrm_lifetime_cfg lft; + struct compat_xfrm_lifetime_cur curlft; + __u32 priority, index; + u8 dir, action, flags, share; + /* 4 bytes additional padding on 64bit */ +}; + +struct compat_xfrm_usersa_info { + struct xfrm_selector sel; + struct xfrm_id id; + xfrm_address_t saddr; + struct compat_xfrm_lifetime_cfg lft; + struct compat_xfrm_lifetime_cur curlft; + struct xfrm_stats stats; + __u32 seq, reqid; + u16 family; + u8 mode, replay_window, flags; + /* 4 bytes additional padding on 64bit */ +}; + +struct compat_xfrm_user_acquire { + struct xfrm_id id; + xfrm_address_t saddr; + struct xfrm_selector sel; + struct compat_xfrm_userpolicy_info policy; + /* 4 bytes additional padding on 64bit */ + __u32 aalgos, ealgos, calgos, seq; +}; + +struct compat_xfrm_userspi_info { + struct compat_xfrm_usersa_info info; + /* 4 bytes additional padding on 64bit */ + __u32 min, max; +}; + +struct compat_xfrm_user_expire { + struct compat_xfrm_usersa_info state; + /* 8 bytes additional padding on 64bit */ + u8 hard; +}; + +struct compat_xfrm_user_polexpire { + struct compat_xfrm_userpolicy_info pol; + /* 8 bytes additional padding on 64bit */ + u8 hard; +}; + +#define XMSGSIZE(type) sizeof(struct type) + +static const int compat_msg_min[XFRM_NR_MSGTYPES] = { + [XFRM_MSG_NEWSA - XFRM_MSG_BASE] = XMSGSIZE(compat_xfrm_usersa_info), + [XFRM_MSG_DELSA - XFRM_MSG_BASE] = XMSGSIZE(xfrm_usersa_id), + [XFRM_MSG_GETSA - XFRM_MSG_BASE] = XMSGSIZE(xfrm_usersa_id), + [XFRM_MSG_NEWPOLICY - XFRM_MSG_BASE] = XMSGSIZE(compat_xfrm_userpolicy_info), + [XFRM_MSG_DELPOLICY - XFRM_MSG_BASE] = XMSGSIZE(xfrm_userpolicy_id), + [XFRM_MSG_GETPOLICY - XFRM_MSG_BASE] = XMSGSIZE(xfrm_userpolicy_id), + [XFRM_MSG_ALLOCSPI - XFRM_MSG_BASE] = XMSGSIZE(compat_xfrm_userspi_info), + [XFRM_MSG_ACQUIRE - XFRM_MSG_BASE] = XMSGSIZE(compat_xfrm_user_acquire), + [XFRM_MSG_EXPIRE - XFRM_MSG_BASE] = XMSGSIZE(compat_xfrm_user_expire), + [XFRM_MSG_UPDPOLICY - XFRM_MSG_BASE] = XMSGSIZE(compat_xfrm_userpolicy_info), + [XFRM_MSG_UPDSA - XFRM_MSG_BASE] = XMSGSIZE(compat_xfrm_usersa_info), + [XFRM_MSG_POLEXPIRE - XFRM_MSG_BASE] = XMSGSIZE(compat_xfrm_user_polexpire), + [XFRM_MSG_FLUSHSA - XFRM_MSG_BASE] = XMSGSIZE(xfrm_usersa_flush), + [XFRM_MSG_FLUSHPOLICY - XFRM_MSG_BASE] = 0, + [XFRM_MSG_NEWAE - XFRM_MSG_BASE] = XMSGSIZE(xfrm_aevent_id), + [XFRM_MSG_GETAE - XFRM_MSG_BASE] = XMSGSIZE(xfrm_aevent_id), + [XFRM_MSG_REPORT - XFRM_MSG_BASE] = XMSGSIZE(xfrm_user_report), + [XFRM_MSG_MIGRATE - XFRM_MSG_BASE] = XMSGSIZE(xfrm_userpolicy_id), + [XFRM_MSG_NEWSADINFO - XFRM_MSG_BASE] = sizeof(u32), + [XFRM_MSG_GETSADINFO - XFRM_MSG_BASE] = sizeof(u32), + [XFRM_MSG_NEWSPDINFO - XFRM_MSG_BASE] = sizeof(u32), + [XFRM_MSG_GETSPDINFO - XFRM_MSG_BASE] = sizeof(u32), + [XFRM_MSG_MAPPING - XFRM_MSG_BASE] = XMSGSIZE(xfrm_user_mapping) +}; + +static const struct nla_policy compat_policy[XFRMA_MAX+1] = { + [XFRMA_SA] = { .len = XMSGSIZE(compat_xfrm_usersa_info)}, + [XFRMA_POLICY] = { .len = XMSGSIZE(compat_xfrm_userpolicy_info)}, + [XFRMA_LASTUSED] = { .type = NLA_U64}, + [XFRMA_ALG_AUTH_TRUNC] = { .len = sizeof(struct xfrm_algo_auth)}, + [XFRMA_ALG_AEAD] = { .len = sizeof(struct xfrm_algo_aead) }, + [XFRMA_ALG_AUTH] = { .len = sizeof(struct xfrm_algo) }, + [XFRMA_ALG_CRYPT] = { .len = sizeof(struct xfrm_algo) }, + [XFRMA_ALG_COMP] = { .len = sizeof(struct xfrm_algo) }, + [XFRMA_ENCAP] = { .len = sizeof(struct xfrm_encap_tmpl) }, + [XFRMA_TMPL] = { .len = sizeof(struct xfrm_user_tmpl) }, + [XFRMA_SEC_CTX] = { .len = sizeof(struct xfrm_sec_ctx) }, + [XFRMA_LTIME_VAL] = { .len = sizeof(struct xfrm_lifetime_cur) }, + [XFRMA_REPLAY_VAL] = { .len = sizeof(struct xfrm_replay_state) }, + [XFRMA_REPLAY_THRESH] = { .type = NLA_U32 }, + [XFRMA_ETIMER_THRESH] = { .type = NLA_U32 }, + [XFRMA_SRCADDR] = { .len = sizeof(xfrm_address_t) }, + [XFRMA_COADDR] = { .len = sizeof(xfrm_address_t) }, + [XFRMA_POLICY_TYPE] = { .len = sizeof(struct xfrm_userpolicy_type)}, + [XFRMA_MIGRATE] = { .len = sizeof(struct xfrm_user_migrate) }, + [XFRMA_KMADDRESS] = { .len = sizeof(struct xfrm_user_kmaddress) }, + [XFRMA_MARK] = { .len = sizeof(struct xfrm_mark) }, + [XFRMA_TFCPAD] = { .type = NLA_U32 }, + [XFRMA_REPLAY_ESN_VAL] = { .len = sizeof(struct xfrm_replay_state_esn) }, + [XFRMA_SA_EXTRA_FLAGS] = { .type = NLA_U32 }, + [XFRMA_PROTO] = { .type = NLA_U8 }, + [XFRMA_ADDRESS_FILTER] = { .len = sizeof(struct xfrm_address_filter) }, + [XFRMA_OFFLOAD_DEV] = { .len = sizeof(struct xfrm_user_offload) }, + [XFRMA_SET_MARK] = { .type = NLA_U32 }, + [XFRMA_SET_MARK_MASK] = { .type = NLA_U32 }, + [XFRMA_IF_ID] = { .type = NLA_U32 }, +}; + +static struct nlmsghdr *xfrm_nlmsg_put_compat(struct sk_buff *skb, + const struct nlmsghdr *nlh_src, u16 type) +{ + int payload = compat_msg_min[type]; + int src_len = xfrm_msg_min[type]; + struct nlmsghdr *nlh_dst; + + /* Compat messages are shorter or equal to native (+padding) */ + if (WARN_ON_ONCE(src_len < payload)) + return ERR_PTR(-EMSGSIZE); + + nlh_dst = nlmsg_put(skb, nlh_src->nlmsg_pid, nlh_src->nlmsg_seq, + nlh_src->nlmsg_type, payload, nlh_src->nlmsg_flags); + if (!nlh_dst) + return ERR_PTR(-EMSGSIZE); + + memset(nlmsg_data(nlh_dst), 0, payload); + + switch (nlh_src->nlmsg_type) { + /* Compat message has the same layout as native */ + case XFRM_MSG_DELSA: + case XFRM_MSG_DELPOLICY: + case XFRM_MSG_FLUSHSA: + case XFRM_MSG_FLUSHPOLICY: + case XFRM_MSG_NEWAE: + case XFRM_MSG_REPORT: + case XFRM_MSG_MIGRATE: + case XFRM_MSG_NEWSADINFO: + case XFRM_MSG_NEWSPDINFO: + case XFRM_MSG_MAPPING: + WARN_ON_ONCE(src_len != payload); + memcpy(nlmsg_data(nlh_dst), nlmsg_data(nlh_src), src_len); + break; + /* 4 byte alignment for trailing u64 on native, but not on compat */ + case XFRM_MSG_NEWSA: + case XFRM_MSG_NEWPOLICY: + case XFRM_MSG_UPDSA: + case XFRM_MSG_UPDPOLICY: + WARN_ON_ONCE(src_len != payload + 4); + memcpy(nlmsg_data(nlh_dst), nlmsg_data(nlh_src), payload); + break; + case XFRM_MSG_EXPIRE: { + const struct xfrm_user_expire *src_ue = nlmsg_data(nlh_src); + struct compat_xfrm_user_expire *dst_ue = nlmsg_data(nlh_dst); + + /* compat_xfrm_user_expire has 4-byte smaller state */ + memcpy(dst_ue, src_ue, sizeof(dst_ue->state)); + dst_ue->hard = src_ue->hard; + break; + } + case XFRM_MSG_ACQUIRE: { + const struct xfrm_user_acquire *src_ua = nlmsg_data(nlh_src); + struct compat_xfrm_user_acquire *dst_ua = nlmsg_data(nlh_dst); + + memcpy(dst_ua, src_ua, offsetof(struct compat_xfrm_user_acquire, aalgos)); + dst_ua->aalgos = src_ua->aalgos; + dst_ua->ealgos = src_ua->ealgos; + dst_ua->calgos = src_ua->calgos; + dst_ua->seq = src_ua->seq; + break; + } + case XFRM_MSG_POLEXPIRE: { + const struct xfrm_user_polexpire *src_upe = nlmsg_data(nlh_src); + struct compat_xfrm_user_polexpire *dst_upe = nlmsg_data(nlh_dst); + + /* compat_xfrm_user_polexpire has 4-byte smaller state */ + memcpy(dst_upe, src_upe, sizeof(dst_upe->pol)); + dst_upe->hard = src_upe->hard; + break; + } + case XFRM_MSG_ALLOCSPI: { + const struct xfrm_userspi_info *src_usi = nlmsg_data(nlh_src); + struct compat_xfrm_userspi_info *dst_usi = nlmsg_data(nlh_dst); + + /* compat_xfrm_user_polexpire has 4-byte smaller state */ + memcpy(dst_usi, src_usi, sizeof(src_usi->info)); + dst_usi->min = src_usi->min; + dst_usi->max = src_usi->max; + break; + } + /* Not being sent by kernel */ + case XFRM_MSG_GETSA: + case XFRM_MSG_GETPOLICY: + case XFRM_MSG_GETAE: + case XFRM_MSG_GETSADINFO: + case XFRM_MSG_GETSPDINFO: + default: + WARN_ONCE(1, "unsupported nlmsg_type %d", nlh_src->nlmsg_type); + return ERR_PTR(-EOPNOTSUPP); + } + + return nlh_dst; +} + +static int xfrm_nla_cpy(struct sk_buff *dst, const struct nlattr *src, int len) +{ + return nla_put(dst, src->nla_type, len, nla_data(src)); +} + +static int xfrm_xlate64_attr(struct sk_buff *dst, const struct nlattr *src) +{ + switch (src->nla_type) { + case XFRMA_PAD: + /* Ignore */ + return 0; + case XFRMA_ALG_AUTH: + case XFRMA_ALG_CRYPT: + case XFRMA_ALG_COMP: + case XFRMA_ENCAP: + case XFRMA_TMPL: + return xfrm_nla_cpy(dst, src, nla_len(src)); + case XFRMA_SA: + return xfrm_nla_cpy(dst, src, XMSGSIZE(compat_xfrm_usersa_info)); + case XFRMA_POLICY: + return xfrm_nla_cpy(dst, src, XMSGSIZE(compat_xfrm_userpolicy_info)); + case XFRMA_SEC_CTX: + return xfrm_nla_cpy(dst, src, nla_len(src)); + case XFRMA_LTIME_VAL: + return nla_put_64bit(dst, src->nla_type, nla_len(src), + nla_data(src), XFRMA_PAD); + case XFRMA_REPLAY_VAL: + case XFRMA_REPLAY_THRESH: + case XFRMA_ETIMER_THRESH: + case XFRMA_SRCADDR: + case XFRMA_COADDR: + return xfrm_nla_cpy(dst, src, nla_len(src)); + case XFRMA_LASTUSED: + return nla_put_64bit(dst, src->nla_type, nla_len(src), + nla_data(src), XFRMA_PAD); + case XFRMA_POLICY_TYPE: + case XFRMA_MIGRATE: + case XFRMA_ALG_AEAD: + case XFRMA_KMADDRESS: + case XFRMA_ALG_AUTH_TRUNC: + case XFRMA_MARK: + case XFRMA_TFCPAD: + case XFRMA_REPLAY_ESN_VAL: + case XFRMA_SA_EXTRA_FLAGS: + case XFRMA_PROTO: + case XFRMA_ADDRESS_FILTER: + case XFRMA_OFFLOAD_DEV: + case XFRMA_SET_MARK: + case XFRMA_SET_MARK_MASK: + case XFRMA_IF_ID: + return xfrm_nla_cpy(dst, src, nla_len(src)); + default: + BUILD_BUG_ON(XFRMA_MAX != XFRMA_IF_ID); + WARN_ONCE(1, "unsupported nla_type %d", src->nla_type); + return -EOPNOTSUPP; + } +} + +/* Take kernel-built (64bit layout) and create 32bit layout for userspace */ +static int xfrm_xlate64(struct sk_buff *dst, const struct nlmsghdr *nlh_src) +{ + u16 type = nlh_src->nlmsg_type - XFRM_MSG_BASE; + const struct nlattr *nla, *attrs; + struct nlmsghdr *nlh_dst; + int len, remaining; + + nlh_dst = xfrm_nlmsg_put_compat(dst, nlh_src, type); + if (IS_ERR(nlh_dst)) + return PTR_ERR(nlh_dst); + + attrs = nlmsg_attrdata(nlh_src, xfrm_msg_min[type]); + len = nlmsg_attrlen(nlh_src, xfrm_msg_min[type]); + + nla_for_each_attr(nla, attrs, len, remaining) { + int err = xfrm_xlate64_attr(dst, nla); + + if (err) + return err; + } + + nlmsg_end(dst, nlh_dst); + + return 0; +} + +static int xfrm_alloc_compat(struct sk_buff *skb, const struct nlmsghdr *nlh_src) +{ + u16 type = nlh_src->nlmsg_type - XFRM_MSG_BASE; + struct sk_buff *new = NULL; + int err; + + if (WARN_ON_ONCE(type >= ARRAY_SIZE(xfrm_msg_min))) + return -EOPNOTSUPP; + + if (skb_shinfo(skb)->frag_list == NULL) { + new = alloc_skb(skb->len + skb_tailroom(skb), GFP_ATOMIC); + if (!new) + return -ENOMEM; + skb_shinfo(skb)->frag_list = new; + } + + err = xfrm_xlate64(skb_shinfo(skb)->frag_list, nlh_src); + if (err) { + if (new) { + kfree_skb(new); + skb_shinfo(skb)->frag_list = NULL; + } + return err; + } + + return 0; +} + +/* Calculates len of translated 64-bit message. */ +static size_t xfrm_user_rcv_calculate_len64(const struct nlmsghdr *src, + struct nlattr *attrs[XFRMA_MAX+1]) +{ + size_t len = nlmsg_len(src); + + switch (src->nlmsg_type) { + case XFRM_MSG_NEWSA: + case XFRM_MSG_NEWPOLICY: + case XFRM_MSG_ALLOCSPI: + case XFRM_MSG_ACQUIRE: + case XFRM_MSG_UPDPOLICY: + case XFRM_MSG_UPDSA: + len += 4; + break; + case XFRM_MSG_EXPIRE: + case XFRM_MSG_POLEXPIRE: + len += 8; + break; + default: + break; + } + + if (attrs[XFRMA_SA]) + len += 4; + if (attrs[XFRMA_POLICY]) + len += 4; + + /* XXX: some attrs may need to be realigned + * if !CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS + */ + + return len; +} + +static int xfrm_attr_cpy32(void *dst, size_t *pos, const struct nlattr *src, + size_t size, int copy_len, int payload) +{ + struct nlmsghdr *nlmsg = dst; + struct nlattr *nla; + + if (WARN_ON_ONCE(copy_len > payload)) + copy_len = payload; + + if (size - *pos < nla_attr_size(payload)) + return -ENOBUFS; + + nla = dst + *pos; + + memcpy(nla, src, nla_attr_size(copy_len)); + nla->nla_len = nla_attr_size(payload); + *pos += nla_attr_size(payload); + nlmsg->nlmsg_len += nla->nla_len; + + memset(dst + *pos, 0, payload - copy_len); + *pos += payload - copy_len; + + return 0; +} + +static int xfrm_xlate32_attr(void *dst, const struct nlattr *nla, + size_t *pos, size_t size, + struct netlink_ext_ack *extack) +{ + int type = nla_type(nla); + u16 pol_len32, pol_len64; + int err; + + if (type > XFRMA_MAX) { + BUILD_BUG_ON(XFRMA_MAX != XFRMA_IF_ID); + NL_SET_ERR_MSG(extack, "Bad attribute"); + return -EOPNOTSUPP; + } + if (nla_len(nla) < compat_policy[type].len) { + NL_SET_ERR_MSG(extack, "Attribute bad length"); + return -EOPNOTSUPP; + } + + pol_len32 = compat_policy[type].len; + pol_len64 = xfrma_policy[type].len; + + /* XFRMA_SA and XFRMA_POLICY - need to know how-to translate */ + if (pol_len32 != pol_len64) { + if (nla_len(nla) != compat_policy[type].len) { + NL_SET_ERR_MSG(extack, "Attribute bad length"); + return -EOPNOTSUPP; + } + err = xfrm_attr_cpy32(dst, pos, nla, size, pol_len32, pol_len64); + if (err) + return err; + } + + return xfrm_attr_cpy32(dst, pos, nla, size, nla_len(nla), nla_len(nla)); +} + +static int xfrm_xlate32(struct nlmsghdr *dst, const struct nlmsghdr *src, + struct nlattr *attrs[XFRMA_MAX+1], + size_t size, u8 type, struct netlink_ext_ack *extack) +{ + size_t pos; + int i; + + memcpy(dst, src, NLMSG_HDRLEN); + dst->nlmsg_len = NLMSG_HDRLEN + xfrm_msg_min[type]; + memset(nlmsg_data(dst), 0, xfrm_msg_min[type]); + + switch (src->nlmsg_type) { + /* Compat message has the same layout as native */ + case XFRM_MSG_DELSA: + case XFRM_MSG_GETSA: + case XFRM_MSG_DELPOLICY: + case XFRM_MSG_GETPOLICY: + case XFRM_MSG_FLUSHSA: + case XFRM_MSG_FLUSHPOLICY: + case XFRM_MSG_NEWAE: + case XFRM_MSG_GETAE: + case XFRM_MSG_REPORT: + case XFRM_MSG_MIGRATE: + case XFRM_MSG_NEWSADINFO: + case XFRM_MSG_GETSADINFO: + case XFRM_MSG_NEWSPDINFO: + case XFRM_MSG_GETSPDINFO: + case XFRM_MSG_MAPPING: + memcpy(nlmsg_data(dst), nlmsg_data(src), compat_msg_min[type]); + break; + /* 4 byte alignment for trailing u64 on native, but not on compat */ + case XFRM_MSG_NEWSA: + case XFRM_MSG_NEWPOLICY: + case XFRM_MSG_UPDSA: + case XFRM_MSG_UPDPOLICY: + memcpy(nlmsg_data(dst), nlmsg_data(src), compat_msg_min[type]); + break; + case XFRM_MSG_EXPIRE: { + const struct compat_xfrm_user_expire *src_ue = nlmsg_data(src); + struct xfrm_user_expire *dst_ue = nlmsg_data(dst); + + /* compat_xfrm_user_expire has 4-byte smaller state */ + memcpy(dst_ue, src_ue, sizeof(src_ue->state)); + dst_ue->hard = src_ue->hard; + break; + } + case XFRM_MSG_ACQUIRE: { + const struct compat_xfrm_user_acquire *src_ua = nlmsg_data(src); + struct xfrm_user_acquire *dst_ua = nlmsg_data(dst); + + memcpy(dst_ua, src_ua, offsetof(struct compat_xfrm_user_acquire, aalgos)); + dst_ua->aalgos = src_ua->aalgos; + dst_ua->ealgos = src_ua->ealgos; + dst_ua->calgos = src_ua->calgos; + dst_ua->seq = src_ua->seq; + break; + } + case XFRM_MSG_POLEXPIRE: { + const struct compat_xfrm_user_polexpire *src_upe = nlmsg_data(src); + struct xfrm_user_polexpire *dst_upe = nlmsg_data(dst); + + /* compat_xfrm_user_polexpire has 4-byte smaller state */ + memcpy(dst_upe, src_upe, sizeof(src_upe->pol)); + dst_upe->hard = src_upe->hard; + break; + } + case XFRM_MSG_ALLOCSPI: { + const struct compat_xfrm_userspi_info *src_usi = nlmsg_data(src); + struct xfrm_userspi_info *dst_usi = nlmsg_data(dst); + + /* compat_xfrm_user_polexpire has 4-byte smaller state */ + memcpy(dst_usi, src_usi, sizeof(src_usi->info)); + dst_usi->min = src_usi->min; + dst_usi->max = src_usi->max; + break; + } + default: + NL_SET_ERR_MSG(extack, "Unsupported message type"); + return -EOPNOTSUPP; + } + pos = dst->nlmsg_len; + + for (i = 1; i < XFRMA_MAX + 1; i++) { + int err; + + if (i == XFRMA_PAD) + continue; + + if (!attrs[i]) + continue; + + err = xfrm_xlate32_attr(dst, attrs[i], &pos, size, extack); + if (err) + return err; + } + + return 0; +} + +static struct nlmsghdr *xfrm_user_rcv_msg_compat(const struct nlmsghdr *h32, + int maxtype, const struct nla_policy *policy, + struct netlink_ext_ack *extack) +{ + /* netlink_rcv_skb() checks if a message has full (struct nlmsghdr) */ + u16 type = h32->nlmsg_type - XFRM_MSG_BASE; + struct nlattr *attrs[XFRMA_MAX+1]; + struct nlmsghdr *h64; + size_t len; + int err; + + BUILD_BUG_ON(ARRAY_SIZE(xfrm_msg_min) != ARRAY_SIZE(compat_msg_min)); + + if (type >= ARRAY_SIZE(xfrm_msg_min)) + return ERR_PTR(-EINVAL); + + /* Don't call parse: the message might have only nlmsg header */ + if ((h32->nlmsg_type == XFRM_MSG_GETSA || + h32->nlmsg_type == XFRM_MSG_GETPOLICY) && + (h32->nlmsg_flags & NLM_F_DUMP)) + return NULL; + + err = nlmsg_parse(h32, compat_msg_min[type], attrs, + maxtype ? : XFRMA_MAX, policy ? : compat_policy, extack); + if (err < 0) + return ERR_PTR(err); + + len = xfrm_user_rcv_calculate_len64(h32, attrs); + /* The message doesn't need translation */ + if (len == nlmsg_len(h32)) + return NULL; + + len += NLMSG_HDRLEN; + h64 = kvmalloc(len, GFP_KERNEL | __GFP_ZERO); + if (!h64) + return ERR_PTR(-ENOMEM); + + err = xfrm_xlate32(h64, h32, attrs, len, type, extack); + if (err < 0) { + kvfree(h64); + return ERR_PTR(err); + } + + return h64; +} + +static int xfrm_user_policy_compat(u8 **pdata32, int optlen) +{ + struct compat_xfrm_userpolicy_info *p = (void *)*pdata32; + u8 *src_templates, *dst_templates; + u8 *data64; + + if (optlen < sizeof(*p)) + return -EINVAL; + + data64 = kmalloc(optlen + 4, GFP_USER | __GFP_NOWARN); + if (!data64) + return -ENOMEM; + + memcpy(data64, *pdata32, sizeof(*p)); + memset(data64 + sizeof(*p), 0, 4); + + src_templates = *pdata32 + sizeof(*p); + dst_templates = data64 + sizeof(*p) + 4; + memcpy(dst_templates, src_templates, optlen - sizeof(*p)); + + kfree(*pdata32); + *pdata32 = data64; + return 0; +} + +static struct xfrm_translator xfrm_translator = { + .owner = THIS_MODULE, + .alloc_compat = xfrm_alloc_compat, + .rcv_msg_compat = xfrm_user_rcv_msg_compat, + .xlate_user_policy_sockptr = xfrm_user_policy_compat, +}; + +static int __init xfrm_compat_init(void) +{ + return xfrm_register_translator(&xfrm_translator); +} + +static void __exit xfrm_compat_exit(void) +{ + xfrm_unregister_translator(&xfrm_translator); +} + +module_init(xfrm_compat_init); +module_exit(xfrm_compat_exit); +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Dmitry Safonov"); +MODULE_DESCRIPTION("XFRM 32-bit compatibility layer"); diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index c7c8d08a9148..c42df9dcad6e 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -2118,6 +2118,66 @@ bool km_is_alive(const struct km_event *c) } EXPORT_SYMBOL(km_is_alive); +#if IS_ENABLED(CONFIG_XFRM_USER_COMPAT) +static DEFINE_SPINLOCK(xfrm_translator_lock); +static struct xfrm_translator __rcu *xfrm_translator; + +struct xfrm_translator *xfrm_get_translator(void) +{ + struct xfrm_translator *xtr; + + rcu_read_lock(); + xtr = rcu_dereference(xfrm_translator); + if (unlikely(!xtr)) + goto out; + if (!try_module_get(xtr->owner)) + xtr = NULL; +out: + rcu_read_unlock(); + return xtr; +} +EXPORT_SYMBOL_GPL(xfrm_get_translator); + +void xfrm_put_translator(struct xfrm_translator *xtr) +{ + module_put(xtr->owner); +} +EXPORT_SYMBOL_GPL(xfrm_put_translator); + +int xfrm_register_translator(struct xfrm_translator *xtr) +{ + int err = 0; + + spin_lock_bh(&xfrm_translator_lock); + if (unlikely(xfrm_translator != NULL)) + err = -EEXIST; + else + rcu_assign_pointer(xfrm_translator, xtr); + spin_unlock_bh(&xfrm_translator_lock); + + return err; +} +EXPORT_SYMBOL_GPL(xfrm_register_translator); + +int xfrm_unregister_translator(struct xfrm_translator *xtr) +{ + int err = 0; + + spin_lock_bh(&xfrm_translator_lock); + if (likely(xfrm_translator != NULL)) { + if (rcu_access_pointer(xfrm_translator) != xtr) + err = -EINVAL; + else + RCU_INIT_POINTER(xfrm_translator, NULL); + } + spin_unlock_bh(&xfrm_translator_lock); + synchronize_rcu(); + + return err; +} +EXPORT_SYMBOL_GPL(xfrm_unregister_translator); +#endif + int xfrm_user_policy(struct sock *sk, int optname, u8 __user *optval, int optlen) { int err; @@ -2139,6 +2199,23 @@ int xfrm_user_policy(struct sock *sk, int optname, u8 __user *optval, int optlen if (IS_ERR(data)) return PTR_ERR(data); + /* Use the 64-bit / untranslated format on Android, even for compat */ + if (!IS_ENABLED(CONFIG_ANDROID) || IS_ENABLED(CONFIG_XFRM_USER_COMPAT)) { + if (in_compat_syscall()) { + struct xfrm_translator *xtr = xfrm_get_translator(); + + if (!xtr) + return -EOPNOTSUPP; + + err = xtr->xlate_user_policy_sockptr(&data, optlen); + xfrm_put_translator(xtr); + if (err) { + kfree(data); + return err; + } + } + } + err = -EINVAL; rcu_read_lock(); list_for_each_entry_rcu(km, &xfrm_km_list, list) { diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index fc853e106dbd..fe7ff38f2374 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -974,6 +974,7 @@ static int dump_one_state(struct xfrm_state *x, int count, void *ptr) struct xfrm_dump_info *sp = ptr; struct sk_buff *in_skb = sp->in_skb; struct sk_buff *skb = sp->out_skb; + struct xfrm_translator *xtr; struct xfrm_usersa_info *p; struct nlmsghdr *nlh; int err; @@ -991,6 +992,18 @@ static int dump_one_state(struct xfrm_state *x, int count, void *ptr) return err; } nlmsg_end(skb, nlh); + + xtr = xfrm_get_translator(); + if (xtr) { + err = xtr->alloc_compat(skb, nlh); + + xfrm_put_translator(xtr); + if (err) { + nlmsg_cancel(skb, nlh); + return err; + } + } + return 0; } @@ -1005,7 +1018,6 @@ static int xfrm_dump_sa_done(struct netlink_callback *cb) return 0; } -static const struct nla_policy xfrma_policy[XFRMA_MAX+1]; static int xfrm_dump_sa(struct sk_buff *skb, struct netlink_callback *cb) { struct net *net = sock_net(skb->sk); @@ -1082,12 +1094,24 @@ static inline int xfrm_nlmsg_multicast(struct net *net, struct sk_buff *skb, u32 pid, unsigned int group) { struct sock *nlsk = rcu_dereference(net->xfrm.nlsk); + struct xfrm_translator *xtr; if (!nlsk) { kfree_skb(skb); return -EPIPE; } + xtr = xfrm_get_translator(); + if (xtr) { + int err = xtr->alloc_compat(skb, nlmsg_hdr(skb)); + + xfrm_put_translator(xtr); + if (err) { + kfree_skb(skb); + return err; + } + } + return nlmsg_multicast(nlsk, skb, pid, group, GFP_ATOMIC); } @@ -1307,6 +1331,7 @@ static int xfrm_alloc_userspi(struct sk_buff *skb, struct nlmsghdr *nlh, struct net *net = sock_net(skb->sk); struct xfrm_state *x; struct xfrm_userspi_info *p; + struct xfrm_translator *xtr; struct sk_buff *resp_skb; xfrm_address_t *daddr; int family; @@ -1357,6 +1382,17 @@ static int xfrm_alloc_userspi(struct sk_buff *skb, struct nlmsghdr *nlh, goto out; } + xtr = xfrm_get_translator(); + if (xtr) { + err = xtr->alloc_compat(skb, nlmsg_hdr(skb)); + + xfrm_put_translator(xtr); + if (err) { + kfree_skb(resp_skb); + goto out; + } + } + err = nlmsg_unicast(net->xfrm.nlsk, resp_skb, NETLINK_CB(skb).portid); out: @@ -1763,6 +1799,7 @@ static int dump_one_policy(struct xfrm_policy *xp, int dir, int count, void *ptr struct xfrm_userpolicy_info *p; struct sk_buff *in_skb = sp->in_skb; struct sk_buff *skb = sp->out_skb; + struct xfrm_translator *xtr; struct nlmsghdr *nlh; int err; @@ -1787,6 +1824,18 @@ static int dump_one_policy(struct xfrm_policy *xp, int dir, int count, void *ptr return err; } nlmsg_end(skb, nlh); + + xtr = xfrm_get_translator(); + if (xtr) { + err = xtr->alloc_compat(skb, nlh); + + xfrm_put_translator(xtr); + if (err) { + nlmsg_cancel(skb, nlh); + return err; + } + } + return 0; } @@ -2528,7 +2577,7 @@ static int xfrm_send_migrate(const struct xfrm_selector *sel, u8 dir, u8 type, #define XMSGSIZE(type) sizeof(struct type) -static const int xfrm_msg_min[XFRM_NR_MSGTYPES] = { +const int xfrm_msg_min[XFRM_NR_MSGTYPES] = { [XFRM_MSG_NEWSA - XFRM_MSG_BASE] = XMSGSIZE(xfrm_usersa_info), [XFRM_MSG_DELSA - XFRM_MSG_BASE] = XMSGSIZE(xfrm_usersa_id), [XFRM_MSG_GETSA - XFRM_MSG_BASE] = XMSGSIZE(xfrm_usersa_id), @@ -2551,10 +2600,11 @@ static const int xfrm_msg_min[XFRM_NR_MSGTYPES] = { [XFRM_MSG_NEWSPDINFO - XFRM_MSG_BASE] = sizeof(u32), [XFRM_MSG_GETSPDINFO - XFRM_MSG_BASE] = sizeof(u32), }; +EXPORT_SYMBOL_GPL(xfrm_msg_min); #undef XMSGSIZE -static const struct nla_policy xfrma_policy[XFRMA_MAX+1] = { +const struct nla_policy xfrma_policy[XFRMA_MAX+1] = { [XFRMA_SA] = { .len = sizeof(struct xfrm_usersa_info)}, [XFRMA_POLICY] = { .len = sizeof(struct xfrm_userpolicy_info)}, [XFRMA_LASTUSED] = { .type = NLA_U64}, @@ -2586,6 +2636,7 @@ static const struct nla_policy xfrma_policy[XFRMA_MAX+1] = { [XFRMA_SET_MARK_MASK] = { .type = NLA_U32 }, [XFRMA_IF_ID] = { .type = NLA_U32 }, }; +EXPORT_SYMBOL_GPL(xfrma_policy); static const struct nla_policy xfrma_spd_policy[XFRMA_SPD_MAX+1] = { [XFRMA_SPD_IPV4_HTHRESH] = { .len = sizeof(struct xfrmu_spdhthresh) }, @@ -2635,6 +2686,7 @@ static int xfrm_user_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, struct net *net = sock_net(skb->sk); struct nlattr *attrs[XFRMA_MAX+1]; const struct xfrm_link *link; + struct nlmsghdr *nlh64 = NULL; int type, err; type = nlh->nlmsg_type; @@ -2648,32 +2700,58 @@ static int xfrm_user_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, if (!netlink_net_capable(skb, CAP_NET_ADMIN)) return -EPERM; + /* Use the 64-bit / untranslated format on Android, even for compat */ + if (!IS_ENABLED(CONFIG_ANDROID) || IS_ENABLED(CONFIG_XFRM_USER_COMPAT)) { + if (in_compat_syscall()) { + struct xfrm_translator *xtr = xfrm_get_translator(); + + if (!xtr) + return -EOPNOTSUPP; + + nlh64 = xtr->rcv_msg_compat(nlh, link->nla_max, + link->nla_pol, extack); + xfrm_put_translator(xtr); + if (IS_ERR(nlh64)) + return PTR_ERR(nlh64); + if (nlh64) + nlh = nlh64; + } + } + if ((type == (XFRM_MSG_GETSA - XFRM_MSG_BASE) || type == (XFRM_MSG_GETPOLICY - XFRM_MSG_BASE)) && (nlh->nlmsg_flags & NLM_F_DUMP)) { - if (link->dump == NULL) - return -EINVAL; + struct netlink_dump_control c = { + .start = link->start, + .dump = link->dump, + .done = link->done, + }; - { - struct netlink_dump_control c = { - .start = link->start, - .dump = link->dump, - .done = link->done, - }; - return netlink_dump_start(net->xfrm.nlsk, skb, nlh, &c); + if (link->dump == NULL) { + err = -EINVAL; + goto err; } + + err = netlink_dump_start(net->xfrm.nlsk, skb, nlh, &c); + goto err; } err = nlmsg_parse(nlh, xfrm_msg_min[type], attrs, link->nla_max ? : XFRMA_MAX, link->nla_pol ? : xfrma_policy, extack); if (err < 0) - return err; + goto err; - if (link->doit == NULL) - return -EINVAL; + if (link->doit == NULL) { + err = -EINVAL; + goto err; + } - return link->doit(skb, nlh, attrs); + err = link->doit(skb, nlh, attrs); + +err: + kvfree(nlh64); + return err; } static void xfrm_netlink_rcv(struct sk_buff *skb) diff --git a/samples/mic/mpssd/mpssd.c b/samples/mic/mpssd/mpssd.c index f42ce551bb48..a50d27473e12 100644 --- a/samples/mic/mpssd/mpssd.c +++ b/samples/mic/mpssd/mpssd.c @@ -414,9 +414,9 @@ mic_virtio_copy(struct mic_info *mic, int fd, static inline unsigned _vring_size(unsigned int num, unsigned long align) { - return ((sizeof(struct vring_desc) * num + sizeof(__u16) * (3 + num) + return _ALIGN_UP(((sizeof(struct vring_desc) * num + sizeof(__u16) * (3 + num) + align - 1) & ~(align - 1)) - + sizeof(__u16) * 3 + sizeof(struct vring_used_elem) * num; + + sizeof(__u16) * 3 + sizeof(struct vring_used_elem) * num, 4); } /* diff --git a/scripts/clang-android.sh b/scripts/clang-android.sh deleted file mode 100755 index 9186c4f48576..000000000000 --- a/scripts/clang-android.sh +++ /dev/null @@ -1,4 +0,0 @@ -#!/bin/sh -# SPDX-License-Identifier: GPL-2.0 - -$* -dM -E - &1 | grep -q __ANDROID__ && echo "y" diff --git a/scripts/setlocalversion b/scripts/setlocalversion index 081013eb0d5e..c2e8455b3aeb 100755 --- a/scripts/setlocalversion +++ b/scripts/setlocalversion @@ -65,7 +65,7 @@ scm_version() # Check for git and a git repo. if test -z "$(git rev-parse --show-cdup 2>/dev/null)" && - head=`git rev-parse --verify --short HEAD 2>/dev/null`; then + head=$(git rev-parse --verify HEAD 2>/dev/null); then if [ -n "$android_release" ] && [ -n "$kmi_generation" ]; then printf '%s' "-$android_release-$kmi_generation" @@ -83,11 +83,22 @@ scm_version() fi # If we are past a tagged commit (like # "v2.6.30-rc5-302-g72357d5"), we pretty print it. - if atag="`git describe 2>/dev/null`"; then - echo "$atag" | awk -F- '{printf("-%05d-%s", $(NF-1),$(NF))}' + # + # Ensure the abbreviated sha1 has exactly 12 + # hex characters, to make the output + # independent of git version, local + # core.abbrev settings and/or total number of + # objects in the current repository - passing + # --abbrev=12 ensures a minimum of 12, and the + # awk substr() then picks the 'g' and first 12 + # hex chars. + if atag="$(git describe --abbrev=12 2>/dev/null)"; then + echo "$atag" | awk -F- '{printf("-%05d-%s", $(NF-1),substr($(NF),0,13))}' - # If we don't have a tag at all we print -g{commitish}. + # If we don't have a tag at all we print -g{commitish}, + # again using exactly 12 hex chars. else + head="$(echo $head | cut -c1-12)" printf '%s%s' -g $head fi fi diff --git a/security/integrity/evm/evm_main.c b/security/integrity/evm/evm_main.c index e11d860fdce4..651c0127c00d 100644 --- a/security/integrity/evm/evm_main.c +++ b/security/integrity/evm/evm_main.c @@ -186,6 +186,12 @@ static enum integrity_status evm_verify_hmac(struct dentry *dentry, break; case EVM_IMA_XATTR_DIGSIG: case EVM_XATTR_PORTABLE_DIGSIG: + /* accept xattr with non-empty signature field */ + if (xattr_len <= sizeof(struct signature_v2_hdr)) { + evm_status = INTEGRITY_FAIL; + goto out; + } + hdr = (struct signature_v2_hdr *)xattr_data; digest.hdr.algo = hdr->hash_algo; rc = evm_calc_hash(dentry, xattr_name, xattr_value, diff --git a/security/integrity/ima/ima_crypto.c b/security/integrity/ima/ima_crypto.c index c5dd05ace28c..f4f3de5f06ca 100644 --- a/security/integrity/ima/ima_crypto.c +++ b/security/integrity/ima/ima_crypto.c @@ -682,6 +682,8 @@ static int ima_calc_boot_aggregate_tfm(char *digest, ima_pcrread(i, pcr_i); /* now accumulate with current aggregate */ rc = crypto_shash_update(shash, pcr_i, TPM_DIGEST_SIZE); + if (rc != 0) + return rc; } if (!rc) crypto_shash_final(shash, digest); diff --git a/sound/core/seq/oss/seq_oss.c b/sound/core/seq/oss/seq_oss.c index ed5bca0db3e7..f4a9d9972330 100644 --- a/sound/core/seq/oss/seq_oss.c +++ b/sound/core/seq/oss/seq_oss.c @@ -187,9 +187,12 @@ odev_ioctl(struct file *file, unsigned int cmd, unsigned long arg) if (snd_BUG_ON(!dp)) return -ENXIO; - mutex_lock(®ister_mutex); + if (cmd != SNDCTL_SEQ_SYNC && + mutex_lock_interruptible(®ister_mutex)) + return -ERESTARTSYS; rc = snd_seq_oss_ioctl(dp, cmd, arg); - mutex_unlock(®ister_mutex); + if (cmd != SNDCTL_SEQ_SYNC) + mutex_unlock(®ister_mutex); return rc; } diff --git a/sound/firewire/bebob/bebob_hwdep.c b/sound/firewire/bebob/bebob_hwdep.c index 04c321e08c62..a04b5880926c 100644 --- a/sound/firewire/bebob/bebob_hwdep.c +++ b/sound/firewire/bebob/bebob_hwdep.c @@ -37,12 +37,11 @@ hwdep_read(struct snd_hwdep *hwdep, char __user *buf, long count, } memset(&event, 0, sizeof(event)); + count = min_t(long, count, sizeof(event.lock_status)); if (bebob->dev_lock_changed) { event.lock_status.type = SNDRV_FIREWIRE_EVENT_LOCK_STATUS; event.lock_status.status = (bebob->dev_lock_count > 0); bebob->dev_lock_changed = false; - - count = min_t(long, count, sizeof(event.lock_status)); } spin_unlock_irq(&bebob->lock); diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 24bc9e446047..382a8d179eb0 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -1906,6 +1906,8 @@ enum { ALC1220_FIXUP_CLEVO_P950, ALC1220_FIXUP_CLEVO_PB51ED, ALC1220_FIXUP_CLEVO_PB51ED_PINS, + ALC887_FIXUP_ASUS_AUDIO, + ALC887_FIXUP_ASUS_HMIC, }; static void alc889_fixup_coef(struct hda_codec *codec, @@ -2118,6 +2120,31 @@ static void alc1220_fixup_clevo_pb51ed(struct hda_codec *codec, alc_fixup_headset_mode_no_hp_mic(codec, fix, action); } +static void alc887_asus_hp_automute_hook(struct hda_codec *codec, + struct hda_jack_callback *jack) +{ + struct alc_spec *spec = codec->spec; + unsigned int vref; + + snd_hda_gen_hp_automute(codec, jack); + + if (spec->gen.hp_jack_present) + vref = AC_PINCTL_VREF_80; + else + vref = AC_PINCTL_VREF_HIZ; + snd_hda_set_pin_ctl(codec, 0x19, PIN_HP | vref); +} + +static void alc887_fixup_asus_jack(struct hda_codec *codec, + const struct hda_fixup *fix, int action) +{ + struct alc_spec *spec = codec->spec; + if (action != HDA_FIXUP_ACT_PROBE) + return; + snd_hda_set_pin_ctl_cache(codec, 0x1b, PIN_HP); + spec->gen.hp_automute_hook = alc887_asus_hp_automute_hook; +} + static const struct hda_fixup alc882_fixups[] = { [ALC882_FIXUP_ABIT_AW9D_MAX] = { .type = HDA_FIXUP_PINS, @@ -2375,6 +2402,20 @@ static const struct hda_fixup alc882_fixups[] = { .chained = true, .chain_id = ALC1220_FIXUP_CLEVO_PB51ED, }, + [ALC887_FIXUP_ASUS_AUDIO] = { + .type = HDA_FIXUP_PINS, + .v.pins = (const struct hda_pintbl[]) { + { 0x15, 0x02a14150 }, /* use as headset mic, without its own jack detect */ + { 0x19, 0x22219420 }, + {} + }, + }, + [ALC887_FIXUP_ASUS_HMIC] = { + .type = HDA_FIXUP_FUNC, + .v.func = alc887_fixup_asus_jack, + .chained = true, + .chain_id = ALC887_FIXUP_ASUS_AUDIO, + }, }; static const struct snd_pci_quirk alc882_fixup_tbl[] = { @@ -2408,6 +2449,7 @@ static const struct snd_pci_quirk alc882_fixup_tbl[] = { SND_PCI_QUIRK(0x1043, 0x13c2, "Asus A7M", ALC882_FIXUP_EAPD), SND_PCI_QUIRK(0x1043, 0x1873, "ASUS W90V", ALC882_FIXUP_ASUS_W90V), SND_PCI_QUIRK(0x1043, 0x1971, "Asus W2JC", ALC882_FIXUP_ASUS_W2JC), + SND_PCI_QUIRK(0x1043, 0x2390, "Asus D700SA", ALC887_FIXUP_ASUS_HMIC), SND_PCI_QUIRK(0x1043, 0x835f, "Asus Eee 1601", ALC888_FIXUP_EEE1601), SND_PCI_QUIRK(0x1043, 0x84bc, "ASUS ET2700", ALC887_FIXUP_ASUS_BASS), SND_PCI_QUIRK(0x1043, 0x8691, "ASUS ROG Ranger VIII", ALC882_FIXUP_GPIO3), diff --git a/sound/soc/qcom/lpass-cpu.c b/sound/soc/qcom/lpass-cpu.c index 292b103abada..475579a9830a 100644 --- a/sound/soc/qcom/lpass-cpu.c +++ b/sound/soc/qcom/lpass-cpu.c @@ -182,21 +182,6 @@ static int lpass_cpu_daiops_hw_params(struct snd_pcm_substream *substream, return 0; } -static int lpass_cpu_daiops_hw_free(struct snd_pcm_substream *substream, - struct snd_soc_dai *dai) -{ - struct lpass_data *drvdata = snd_soc_dai_get_drvdata(dai); - int ret; - - ret = regmap_write(drvdata->lpaif_map, - LPAIF_I2SCTL_REG(drvdata->variant, dai->driver->id), - 0); - if (ret) - dev_err(dai->dev, "error writing to i2sctl reg: %d\n", ret); - - return ret; -} - static int lpass_cpu_daiops_prepare(struct snd_pcm_substream *substream, struct snd_soc_dai *dai) { @@ -277,7 +262,6 @@ const struct snd_soc_dai_ops asoc_qcom_lpass_cpu_dai_ops = { .startup = lpass_cpu_daiops_startup, .shutdown = lpass_cpu_daiops_shutdown, .hw_params = lpass_cpu_daiops_hw_params, - .hw_free = lpass_cpu_daiops_hw_free, .prepare = lpass_cpu_daiops_prepare, .trigger = lpass_cpu_daiops_trigger, }; diff --git a/sound/soc/qcom/lpass-platform.c b/sound/soc/qcom/lpass-platform.c index d07271ea4c45..2f2967247789 100644 --- a/sound/soc/qcom/lpass-platform.c +++ b/sound/soc/qcom/lpass-platform.c @@ -69,7 +69,7 @@ static int lpass_platform_pcmops_open(struct snd_pcm_substream *substream) int ret, dma_ch, dir = substream->stream; struct lpass_pcm_data *data; - data = devm_kzalloc(soc_runtime->dev, sizeof(*data), GFP_KERNEL); + data = kzalloc(sizeof(*data), GFP_KERNEL); if (!data) return -ENOMEM; @@ -127,6 +127,7 @@ static int lpass_platform_pcmops_close(struct snd_pcm_substream *substream) if (v->free_dma_channel) v->free_dma_channel(drvdata, data->dma_ch); + kfree(data); return 0; } diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c index 2ded31004487..94e52495888d 100644 --- a/sound/usb/pcm.c +++ b/sound/usb/pcm.c @@ -399,6 +399,7 @@ static int set_sync_ep_implicit_fb_quirk(struct snd_usb_substream *subs, switch (subs->stream->chip->usb_id) { case USB_ID(0x0763, 0x2030): /* M-Audio Fast Track C400 */ case USB_ID(0x0763, 0x2031): /* M-Audio Fast Track C600 */ + case USB_ID(0x22f0, 0x0006): /* Allen&Heath Qu-16 */ ep = 0x81; ifnum = 3; goto add_sync_ep_from_ifnum; @@ -408,6 +409,7 @@ static int set_sync_ep_implicit_fb_quirk(struct snd_usb_substream *subs, ifnum = 2; goto add_sync_ep_from_ifnum; case USB_ID(0x2466, 0x8003): /* Fractal Audio Axe-Fx II */ + case USB_ID(0x0499, 0x172a): /* Yamaha MODX */ ep = 0x86; ifnum = 2; goto add_sync_ep_from_ifnum; @@ -415,6 +417,10 @@ static int set_sync_ep_implicit_fb_quirk(struct snd_usb_substream *subs, ep = 0x81; ifnum = 2; goto add_sync_ep_from_ifnum; + case USB_ID(0x1686, 0xf029): /* Zoom UAC-2 */ + ep = 0x82; + ifnum = 2; + goto add_sync_ep_from_ifnum; case USB_ID(0x1397, 0x0001): /* Behringer UFX1604 */ case USB_ID(0x1397, 0x0002): /* Behringer UFX1204 */ ep = 0x81; diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c index 87078e5d672b..49fe37ad8cb5 100644 --- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -1463,6 +1463,7 @@ u64 snd_usb_interface_dsd_format_quirks(struct snd_usb_audio *chip, case 0x278b: /* Rotel? */ case 0x292b: /* Gustard/Ess based devices */ case 0x2ab6: /* T+A devices */ + case 0x3353: /* Khadas devices */ case 0x3842: /* EVGA */ case 0xc502: /* HiBy devices */ if (fp->dsd_raw) diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index bf4cd924aed5..13944978ada5 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -1191,8 +1191,8 @@ union bpf_attr { * Return * The return value depends on the result of the test, and can be: * - * * 0, if the *skb* task belongs to the cgroup2. - * * 1, if the *skb* task does not belong to the cgroup2. + * * 0, if current task belongs to the cgroup2. + * * 1, if current task does not belong to the cgroup2. * * A negative error code, if an error occurred. * * int bpf_skb_change_tail(struct sk_buff *skb, u32 len, u64 flags) diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c index 3f98dcfbc177..0b1ba8e7d18a 100644 --- a/tools/objtool/orc_gen.c +++ b/tools/objtool/orc_gen.c @@ -100,11 +100,6 @@ static int create_orc_entry(struct section *u_sec, struct section *ip_relasec, struct orc_entry *orc; struct rela *rela; - if (!insn_sec->sym) { - WARN("missing symbol for section %s", insn_sec->name); - return -1; - } - /* populate ORC data */ orc = (struct orc_entry *)u_sec->data->d_buf + idx; memcpy(orc, o, sizeof(*orc)); @@ -117,8 +112,32 @@ static int create_orc_entry(struct section *u_sec, struct section *ip_relasec, } memset(rela, 0, sizeof(*rela)); - rela->sym = insn_sec->sym; - rela->addend = insn_off; + if (insn_sec->sym) { + rela->sym = insn_sec->sym; + rela->addend = insn_off; + } else { + /* + * The Clang assembler doesn't produce section symbols, so we + * have to reference the function symbol instead: + */ + rela->sym = find_symbol_containing(insn_sec, insn_off); + if (!rela->sym) { + /* + * Hack alert. This happens when we need to reference + * the NOP pad insn immediately after the function. + */ + rela->sym = find_symbol_containing(insn_sec, + insn_off - 1); + } + if (!rela->sym) { + WARN("missing symbol for insn at offset 0x%lx\n", + insn_off); + return -1; + } + + rela->addend = insn_off - rela->sym->offset; + } + rela->type = R_X86_64_PC32; rela->offset = idx * sizeof(int); diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c index ff2c41ea94c8..2434a0014491 100644 --- a/tools/perf/util/intel-pt.c +++ b/tools/perf/util/intel-pt.c @@ -876,6 +876,8 @@ static void intel_pt_set_pid_tid_cpu(struct intel_pt *pt, if (queue->tid == -1 || pt->have_sched_switch) { ptq->tid = machine__get_current_tid(pt->machine, ptq->cpu); + if (ptq->tid == -1) + ptq->pid = -1; thread__zput(ptq->thread); } @@ -1915,10 +1917,8 @@ static int intel_pt_context_switch(struct intel_pt *pt, union perf_event *event, tid = sample->tid; } - if (tid == -1) { - pr_err("context_switch event has no tid\n"); - return -EINVAL; - } + if (tid == -1) + intel_pt_log("context_switch event has no tid\n"); intel_pt_log("context_switch: cpu %d pid %d tid %d time %"PRIu64" tsc %#"PRIx64"\n", cpu, pid, tid, sample->time, perf_time_to_tsc(sample->time, diff --git a/tools/perf/util/print_binary.c b/tools/perf/util/print_binary.c index 23e367063446..71aeaf6f45cc 100644 --- a/tools/perf/util/print_binary.c +++ b/tools/perf/util/print_binary.c @@ -50,7 +50,7 @@ int is_printable_array(char *p, unsigned int len) len--; - for (i = 0; i < len; i++) { + for (i = 0; i < len && p[i]; i++) { if (!isprint(p[i]) && !isspace(p[i])) return 0; } diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h index dc58254a2b69..8c01b2cfdb1a 100644 --- a/tools/perf/util/util.h +++ b/tools/perf/util/util.h @@ -22,7 +22,7 @@ static inline void *zalloc(size_t size) return calloc(1, size); } -#define zfree(ptr) ({ free(*ptr); *ptr = NULL; }) +#define zfree(ptr) ({ free((void *)*ptr); *ptr = NULL; }) struct dirent; struct nsinfo; diff --git a/tools/testing/selftests/wireguard/netns.sh b/tools/testing/selftests/wireguard/netns.sh new file mode 100755 index 000000000000..d77f4829f1e0 --- /dev/null +++ b/tools/testing/selftests/wireguard/netns.sh @@ -0,0 +1,614 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# +# Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. +# +# This script tests the below topology: +# +# ┌─────────────────────┐ ┌──────────────────────────────────┐ ┌─────────────────────┐ +# │ $ns1 namespace │ │ $ns0 namespace │ │ $ns2 namespace │ +# │ │ │ │ │ │ +# │┌────────┐ │ │ ┌────────┐ │ │ ┌────────┐│ +# ││ wg0 │───────────┼───┼────────────│ lo │────────────┼───┼───────────│ wg0 ││ +# │├────────┴──────────┐│ │ ┌───────┴────────┴────────┐ │ │┌──────────┴────────┤│ +# ││192.168.241.1/24 ││ │ │(ns1) (ns2) │ │ ││192.168.241.2/24 ││ +# ││fd00::1/24 ││ │ │127.0.0.1:1 127.0.0.1:2│ │ ││fd00::2/24 ││ +# │└───────────────────┘│ │ │[::]:1 [::]:2 │ │ │└───────────────────┘│ +# └─────────────────────┘ │ └─────────────────────────┘ │ └─────────────────────┘ +# └──────────────────────────────────┘ +# +# After the topology is prepared we run a series of TCP/UDP iperf3 tests between the +# wireguard peers in $ns1 and $ns2. Note that $ns0 is the endpoint for the wg0 +# interfaces in $ns1 and $ns2. See https://www.wireguard.com/netns/ for further +# details on how this is accomplished. +set -e + +exec 3>&1 +export LANG=C +export WG_HIDE_KEYS=never +netns0="wg-test-$$-0" +netns1="wg-test-$$-1" +netns2="wg-test-$$-2" +pretty() { echo -e "\x1b[32m\x1b[1m[+] ${1:+NS$1: }${2}\x1b[0m" >&3; } +pp() { pretty "" "$*"; "$@"; } +maybe_exec() { if [[ $BASHPID -eq $$ ]]; then "$@"; else exec "$@"; fi; } +n0() { pretty 0 "$*"; maybe_exec ip netns exec $netns0 "$@"; } +n1() { pretty 1 "$*"; maybe_exec ip netns exec $netns1 "$@"; } +n2() { pretty 2 "$*"; maybe_exec ip netns exec $netns2 "$@"; } +ip0() { pretty 0 "ip $*"; ip -n $netns0 "$@"; } +ip1() { pretty 1 "ip $*"; ip -n $netns1 "$@"; } +ip2() { pretty 2 "ip $*"; ip -n $netns2 "$@"; } +sleep() { read -t "$1" -N 1 || true; } +waitiperf() { pretty "${1//*-}" "wait for iperf:5201 pid $2"; while [[ $(ss -N "$1" -tlpH 'sport = 5201') != *\"iperf3\",pid=$2,fd=* ]]; do sleep 0.1; done; } +waitncatudp() { pretty "${1//*-}" "wait for udp:1111 pid $2"; while [[ $(ss -N "$1" -ulpH 'sport = 1111') != *\"ncat\",pid=$2,fd=* ]]; do sleep 0.1; done; } +waitiface() { pretty "${1//*-}" "wait for $2 to come up"; ip netns exec "$1" bash -c "while [[ \$(< \"/sys/class/net/$2/operstate\") != up ]]; do read -t .1 -N 0 || true; done;"; } + +cleanup() { + set +e + exec 2>/dev/null + printf "$orig_message_cost" > /proc/sys/net/core/message_cost + ip0 link del dev wg0 + ip0 link del dev wg1 + ip1 link del dev wg0 + ip1 link del dev wg1 + ip2 link del dev wg0 + ip2 link del dev wg1 + local to_kill="$(ip netns pids $netns0) $(ip netns pids $netns1) $(ip netns pids $netns2)" + [[ -n $to_kill ]] && kill $to_kill + pp ip netns del $netns1 + pp ip netns del $netns2 + pp ip netns del $netns0 + exit +} + +orig_message_cost="$(< /proc/sys/net/core/message_cost)" +trap cleanup EXIT +printf 0 > /proc/sys/net/core/message_cost + +ip netns del $netns0 2>/dev/null || true +ip netns del $netns1 2>/dev/null || true +ip netns del $netns2 2>/dev/null || true +pp ip netns add $netns0 +pp ip netns add $netns1 +pp ip netns add $netns2 +ip0 link set up dev lo + +ip0 link add dev wg0 type wireguard +ip0 link set wg0 netns $netns1 +ip0 link add dev wg0 type wireguard +ip0 link set wg0 netns $netns2 +key1="$(pp wg genkey)" +key2="$(pp wg genkey)" +key3="$(pp wg genkey)" +key4="$(pp wg genkey)" +pub1="$(pp wg pubkey <<<"$key1")" +pub2="$(pp wg pubkey <<<"$key2")" +pub3="$(pp wg pubkey <<<"$key3")" +pub4="$(pp wg pubkey <<<"$key4")" +psk="$(pp wg genpsk)" +[[ -n $key1 && -n $key2 && -n $psk ]] + +configure_peers() { + ip1 addr add 192.168.241.1/24 dev wg0 + ip1 addr add fd00::1/112 dev wg0 + + ip2 addr add 192.168.241.2/24 dev wg0 + ip2 addr add fd00::2/112 dev wg0 + + n1 wg set wg0 \ + private-key <(echo "$key1") \ + listen-port 1 \ + peer "$pub2" \ + preshared-key <(echo "$psk") \ + allowed-ips 192.168.241.2/32,fd00::2/128 + n2 wg set wg0 \ + private-key <(echo "$key2") \ + listen-port 2 \ + peer "$pub1" \ + preshared-key <(echo "$psk") \ + allowed-ips 192.168.241.1/32,fd00::1/128 + + ip1 link set up dev wg0 + ip2 link set up dev wg0 +} +configure_peers + +tests() { + # Ping over IPv4 + n2 ping -c 10 -f -W 1 192.168.241.1 + n1 ping -c 10 -f -W 1 192.168.241.2 + + # Ping over IPv6 + n2 ping6 -c 10 -f -W 1 fd00::1 + n1 ping6 -c 10 -f -W 1 fd00::2 + + # TCP over IPv4 + n2 iperf3 -s -1 -B 192.168.241.2 & + waitiperf $netns2 $! + n1 iperf3 -Z -t 3 -c 192.168.241.2 + + # TCP over IPv6 + n1 iperf3 -s -1 -B fd00::1 & + waitiperf $netns1 $! + n2 iperf3 -Z -t 3 -c fd00::1 + + # UDP over IPv4 + n1 iperf3 -s -1 -B 192.168.241.1 & + waitiperf $netns1 $! + n2 iperf3 -Z -t 3 -b 0 -u -c 192.168.241.1 + + # UDP over IPv6 + n2 iperf3 -s -1 -B fd00::2 & + waitiperf $netns2 $! + n1 iperf3 -Z -t 3 -b 0 -u -c fd00::2 +} + +[[ $(ip1 link show dev wg0) =~ mtu\ ([0-9]+) ]] && orig_mtu="${BASH_REMATCH[1]}" +big_mtu=$(( 34816 - 1500 + $orig_mtu )) + +# Test using IPv4 as outer transport +n1 wg set wg0 peer "$pub2" endpoint 127.0.0.1:2 +n2 wg set wg0 peer "$pub1" endpoint 127.0.0.1:1 +# Before calling tests, we first make sure that the stats counters and timestamper are working +n2 ping -c 10 -f -W 1 192.168.241.1 +{ read _; read _; read _; read rx_bytes _; read _; read tx_bytes _; } < <(ip2 -stats link show dev wg0) +(( rx_bytes == 1372 && (tx_bytes == 1428 || tx_bytes == 1460) )) +{ read _; read _; read _; read rx_bytes _; read _; read tx_bytes _; } < <(ip1 -stats link show dev wg0) +(( tx_bytes == 1372 && (rx_bytes == 1428 || rx_bytes == 1460) )) +read _ rx_bytes tx_bytes < <(n2 wg show wg0 transfer) +(( rx_bytes == 1372 && (tx_bytes == 1428 || tx_bytes == 1460) )) +read _ rx_bytes tx_bytes < <(n1 wg show wg0 transfer) +(( tx_bytes == 1372 && (rx_bytes == 1428 || rx_bytes == 1460) )) +read _ timestamp < <(n1 wg show wg0 latest-handshakes) +(( timestamp != 0 )) + +tests +ip1 link set wg0 mtu $big_mtu +ip2 link set wg0 mtu $big_mtu +tests + +ip1 link set wg0 mtu $orig_mtu +ip2 link set wg0 mtu $orig_mtu + +# Test using IPv6 as outer transport +n1 wg set wg0 peer "$pub2" endpoint [::1]:2 +n2 wg set wg0 peer "$pub1" endpoint [::1]:1 +tests +ip1 link set wg0 mtu $big_mtu +ip2 link set wg0 mtu $big_mtu +tests + +# Test that route MTUs work with the padding +ip1 link set wg0 mtu 1300 +ip2 link set wg0 mtu 1300 +n1 wg set wg0 peer "$pub2" endpoint 127.0.0.1:2 +n2 wg set wg0 peer "$pub1" endpoint 127.0.0.1:1 +n0 iptables -A INPUT -m length --length 1360 -j DROP +n1 ip route add 192.168.241.2/32 dev wg0 mtu 1299 +n2 ip route add 192.168.241.1/32 dev wg0 mtu 1299 +n2 ping -c 1 -W 1 -s 1269 192.168.241.1 +n2 ip route delete 192.168.241.1/32 dev wg0 mtu 1299 +n1 ip route delete 192.168.241.2/32 dev wg0 mtu 1299 +n0 iptables -F INPUT + +ip1 link set wg0 mtu $orig_mtu +ip2 link set wg0 mtu $orig_mtu + +# Test using IPv4 that roaming works +ip0 -4 addr del 127.0.0.1/8 dev lo +ip0 -4 addr add 127.212.121.99/8 dev lo +n1 wg set wg0 listen-port 9999 +n1 wg set wg0 peer "$pub2" endpoint 127.0.0.1:2 +n1 ping6 -W 1 -c 1 fd00::2 +[[ $(n2 wg show wg0 endpoints) == "$pub1 127.212.121.99:9999" ]] + +# Test using IPv6 that roaming works +n1 wg set wg0 listen-port 9998 +n1 wg set wg0 peer "$pub2" endpoint [::1]:2 +n1 ping -W 1 -c 1 192.168.241.2 +[[ $(n2 wg show wg0 endpoints) == "$pub1 [::1]:9998" ]] + +# Test that crypto-RP filter works +n1 wg set wg0 peer "$pub2" allowed-ips 192.168.241.0/24 +exec 4< <(n1 ncat -l -u -p 1111) +ncat_pid=$! +waitncatudp $netns1 $ncat_pid +n2 ncat -u 192.168.241.1 1111 <<<"X" +read -r -N 1 -t 1 out <&4 && [[ $out == "X" ]] +kill $ncat_pid +more_specific_key="$(pp wg genkey | pp wg pubkey)" +n1 wg set wg0 peer "$more_specific_key" allowed-ips 192.168.241.2/32 +n2 wg set wg0 listen-port 9997 +exec 4< <(n1 ncat -l -u -p 1111) +ncat_pid=$! +waitncatudp $netns1 $ncat_pid +n2 ncat -u 192.168.241.1 1111 <<<"X" +! read -r -N 1 -t 1 out <&4 || false +kill $ncat_pid +n1 wg set wg0 peer "$more_specific_key" remove +[[ $(n1 wg show wg0 endpoints) == "$pub2 [::1]:9997" ]] + +# Test that we can change private keys keys and immediately handshake +n1 wg set wg0 private-key <(echo "$key1") peer "$pub2" preshared-key <(echo "$psk") allowed-ips 192.168.241.2/32 endpoint 127.0.0.1:2 +n2 wg set wg0 private-key <(echo "$key2") listen-port 2 peer "$pub1" preshared-key <(echo "$psk") allowed-ips 192.168.241.1/32 +n1 ping -W 1 -c 1 192.168.241.2 +n1 wg set wg0 private-key <(echo "$key3") +n2 wg set wg0 peer "$pub3" preshared-key <(echo "$psk") allowed-ips 192.168.241.1/32 peer "$pub1" remove +n1 ping -W 1 -c 1 192.168.241.2 +n2 wg set wg0 peer "$pub3" remove + +# Test that we can route wg through wg +ip1 addr flush dev wg0 +ip2 addr flush dev wg0 +ip1 addr add fd00::5:1/112 dev wg0 +ip2 addr add fd00::5:2/112 dev wg0 +n1 wg set wg0 private-key <(echo "$key1") peer "$pub2" preshared-key <(echo "$psk") allowed-ips fd00::5:2/128 endpoint 127.0.0.1:2 +n2 wg set wg0 private-key <(echo "$key2") listen-port 2 peer "$pub1" preshared-key <(echo "$psk") allowed-ips fd00::5:1/128 endpoint 127.212.121.99:9998 +ip1 link add wg1 type wireguard +ip2 link add wg1 type wireguard +ip1 addr add 192.168.241.1/24 dev wg1 +ip1 addr add fd00::1/112 dev wg1 +ip2 addr add 192.168.241.2/24 dev wg1 +ip2 addr add fd00::2/112 dev wg1 +ip1 link set mtu 1340 up dev wg1 +ip2 link set mtu 1340 up dev wg1 +n1 wg set wg1 listen-port 5 private-key <(echo "$key3") peer "$pub4" allowed-ips 192.168.241.2/32,fd00::2/128 endpoint [fd00::5:2]:5 +n2 wg set wg1 listen-port 5 private-key <(echo "$key4") peer "$pub3" allowed-ips 192.168.241.1/32,fd00::1/128 endpoint [fd00::5:1]:5 +tests +# Try to set up a routing loop between the two namespaces +ip1 link set netns $netns0 dev wg1 +ip0 addr add 192.168.241.1/24 dev wg1 +ip0 link set up dev wg1 +n0 ping -W 1 -c 1 192.168.241.2 +n1 wg set wg0 peer "$pub2" endpoint 192.168.241.2:7 +ip2 link del wg0 +ip2 link del wg1 +! n0 ping -W 1 -c 10 -f 192.168.241.2 || false # Should not crash kernel + +ip0 link del wg1 +ip1 link del wg0 + +# Test using NAT. We now change the topology to this: +# ┌────────────────────────────────────────┐ ┌────────────────────────────────────────────────┐ ┌────────────────────────────────────────┐ +# │ $ns1 namespace │ │ $ns0 namespace │ │ $ns2 namespace │ +# │ │ │ │ │ │ +# │ ┌─────┐ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ ┌─────┐ │ +# │ │ wg0 │─────────────│vethc│───────────┼────┼────│vethrc│ │vethrs│──────────────┼─────┼──│veths│────────────│ wg0 │ │ +# │ ├─────┴──────────┐ ├─────┴──────────┐│ │ ├──────┴─────────┐ ├──────┴────────────┐ │ │ ├─────┴──────────┐ ├─────┴──────────┐ │ +# │ │192.168.241.1/24│ │192.168.1.100/24││ │ │192.168.1.1/24 │ │10.0.0.1/24 │ │ │ │10.0.0.100/24 │ │192.168.241.2/24│ │ +# │ │fd00::1/24 │ │ ││ │ │ │ │SNAT:192.168.1.0/24│ │ │ │ │ │fd00::2/24 │ │ +# │ └────────────────┘ └────────────────┘│ │ └────────────────┘ └───────────────────┘ │ │ └────────────────┘ └────────────────┘ │ +# └────────────────────────────────────────┘ └────────────────────────────────────────────────┘ └────────────────────────────────────────┘ + +ip1 link add dev wg0 type wireguard +ip2 link add dev wg0 type wireguard +configure_peers + +ip0 link add vethrc type veth peer name vethc +ip0 link add vethrs type veth peer name veths +ip0 link set vethc netns $netns1 +ip0 link set veths netns $netns2 +ip0 link set vethrc up +ip0 link set vethrs up +ip0 addr add 192.168.1.1/24 dev vethrc +ip0 addr add 10.0.0.1/24 dev vethrs +ip1 addr add 192.168.1.100/24 dev vethc +ip1 link set vethc up +ip1 route add default via 192.168.1.1 +ip2 addr add 10.0.0.100/24 dev veths +ip2 link set veths up +waitiface $netns0 vethrc +waitiface $netns0 vethrs +waitiface $netns1 vethc +waitiface $netns2 veths + +n0 bash -c 'printf 1 > /proc/sys/net/ipv4/ip_forward' +n0 bash -c 'printf 2 > /proc/sys/net/netfilter/nf_conntrack_udp_timeout' +n0 bash -c 'printf 2 > /proc/sys/net/netfilter/nf_conntrack_udp_timeout_stream' +n0 iptables -t nat -A POSTROUTING -s 192.168.1.0/24 -d 10.0.0.0/24 -j SNAT --to 10.0.0.1 + +n1 wg set wg0 peer "$pub2" endpoint 10.0.0.100:2 persistent-keepalive 1 +n1 ping -W 1 -c 1 192.168.241.2 +n2 ping -W 1 -c 1 192.168.241.1 +[[ $(n2 wg show wg0 endpoints) == "$pub1 10.0.0.1:1" ]] +# Demonstrate n2 can still send packets to n1, since persistent-keepalive will prevent connection tracking entry from expiring (to see entries: `n0 conntrack -L`). +pp sleep 3 +n2 ping -W 1 -c 1 192.168.241.1 +n1 wg set wg0 peer "$pub2" persistent-keepalive 0 + +# Test that onion routing works, even when it loops +n1 wg set wg0 peer "$pub3" allowed-ips 192.168.242.2/32 endpoint 192.168.241.2:5 +ip1 addr add 192.168.242.1/24 dev wg0 +ip2 link add wg1 type wireguard +ip2 addr add 192.168.242.2/24 dev wg1 +n2 wg set wg1 private-key <(echo "$key3") listen-port 5 peer "$pub1" allowed-ips 192.168.242.1/32 +ip2 link set wg1 up +n1 ping -W 1 -c 1 192.168.242.2 +ip2 link del wg1 +n1 wg set wg0 peer "$pub3" endpoint 192.168.242.2:5 +! n1 ping -W 1 -c 1 192.168.242.2 || false # Should not crash kernel +n1 wg set wg0 peer "$pub3" remove +ip1 addr del 192.168.242.1/24 dev wg0 + +# Do a wg-quick(8)-style policy routing for the default route, making sure vethc has a v6 address to tease out bugs. +ip1 -6 addr add fc00::9/96 dev vethc +ip1 -6 route add default via fc00::1 +ip2 -4 addr add 192.168.99.7/32 dev wg0 +ip2 -6 addr add abab::1111/128 dev wg0 +n1 wg set wg0 fwmark 51820 peer "$pub2" allowed-ips 192.168.99.7,abab::1111 +ip1 -6 route add default dev wg0 table 51820 +ip1 -6 rule add not fwmark 51820 table 51820 +ip1 -6 rule add table main suppress_prefixlength 0 +ip1 -4 route add default dev wg0 table 51820 +ip1 -4 rule add not fwmark 51820 table 51820 +ip1 -4 rule add table main suppress_prefixlength 0 +# Flood the pings instead of sending just one, to trigger routing table reference counting bugs. +n1 ping -W 1 -c 100 -f 192.168.99.7 +n1 ping -W 1 -c 100 -f abab::1111 + +# Have ns2 NAT into wg0 packets from ns0, but return an icmp error along the right route. +n2 iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -d 192.168.241.0/24 -j SNAT --to 192.168.241.2 +n0 iptables -t filter -A INPUT \! -s 10.0.0.0/24 -i vethrs -j DROP # Manual rpfilter just to be explicit. +n2 bash -c 'printf 1 > /proc/sys/net/ipv4/ip_forward' +ip0 -4 route add 192.168.241.1 via 10.0.0.100 +n2 wg set wg0 peer "$pub1" remove +[[ $(! n0 ping -W 1 -c 1 192.168.241.1 || false) == *"From 10.0.0.100 icmp_seq=1 Destination Host Unreachable"* ]] + +n0 iptables -t nat -F +n0 iptables -t filter -F +n2 iptables -t nat -F +ip0 link del vethrc +ip0 link del vethrs +ip1 link del wg0 +ip2 link del wg0 + +# Test that saddr routing is sticky but not too sticky, changing to this topology: +# ┌────────────────────────────────────────┐ ┌────────────────────────────────────────┐ +# │ $ns1 namespace │ │ $ns2 namespace │ +# │ │ │ │ +# │ ┌─────┐ ┌─────┐ │ │ ┌─────┐ ┌─────┐ │ +# │ │ wg0 │─────────────│veth1│───────────┼────┼──│veth2│────────────│ wg0 │ │ +# │ ├─────┴──────────┐ ├─────┴──────────┐│ │ ├─────┴──────────┐ ├─────┴──────────┐ │ +# │ │192.168.241.1/24│ │10.0.0.1/24 ││ │ │10.0.0.2/24 │ │192.168.241.2/24│ │ +# │ │fd00::1/24 │ │fd00:aa::1/96 ││ │ │fd00:aa::2/96 │ │fd00::2/24 │ │ +# │ └────────────────┘ └────────────────┘│ │ └────────────────┘ └────────────────┘ │ +# └────────────────────────────────────────┘ └────────────────────────────────────────┘ + +ip1 link add dev wg0 type wireguard +ip2 link add dev wg0 type wireguard +configure_peers +ip1 link add veth1 type veth peer name veth2 +ip1 link set veth2 netns $netns2 +n1 bash -c 'printf 0 > /proc/sys/net/ipv6/conf/all/accept_dad' +n2 bash -c 'printf 0 > /proc/sys/net/ipv6/conf/all/accept_dad' +n1 bash -c 'printf 0 > /proc/sys/net/ipv6/conf/veth1/accept_dad' +n2 bash -c 'printf 0 > /proc/sys/net/ipv6/conf/veth2/accept_dad' +n1 bash -c 'printf 1 > /proc/sys/net/ipv4/conf/veth1/promote_secondaries' + +# First we check that we aren't overly sticky and can fall over to new IPs when old ones are removed +ip1 addr add 10.0.0.1/24 dev veth1 +ip1 addr add fd00:aa::1/96 dev veth1 +ip2 addr add 10.0.0.2/24 dev veth2 +ip2 addr add fd00:aa::2/96 dev veth2 +ip1 link set veth1 up +ip2 link set veth2 up +waitiface $netns1 veth1 +waitiface $netns2 veth2 +n1 wg set wg0 peer "$pub2" endpoint 10.0.0.2:2 +n1 ping -W 1 -c 1 192.168.241.2 +ip1 addr add 10.0.0.10/24 dev veth1 +ip1 addr del 10.0.0.1/24 dev veth1 +n1 ping -W 1 -c 1 192.168.241.2 +n1 wg set wg0 peer "$pub2" endpoint [fd00:aa::2]:2 +n1 ping -W 1 -c 1 192.168.241.2 +ip1 addr add fd00:aa::10/96 dev veth1 +ip1 addr del fd00:aa::1/96 dev veth1 +n1 ping -W 1 -c 1 192.168.241.2 + +# Now we show that we can successfully do reply to sender routing +ip1 link set veth1 down +ip2 link set veth2 down +ip1 addr flush dev veth1 +ip2 addr flush dev veth2 +ip1 addr add 10.0.0.1/24 dev veth1 +ip1 addr add 10.0.0.2/24 dev veth1 +ip1 addr add fd00:aa::1/96 dev veth1 +ip1 addr add fd00:aa::2/96 dev veth1 +ip2 addr add 10.0.0.3/24 dev veth2 +ip2 addr add fd00:aa::3/96 dev veth2 +ip1 link set veth1 up +ip2 link set veth2 up +waitiface $netns1 veth1 +waitiface $netns2 veth2 +n2 wg set wg0 peer "$pub1" endpoint 10.0.0.1:1 +n2 ping -W 1 -c 1 192.168.241.1 +[[ $(n2 wg show wg0 endpoints) == "$pub1 10.0.0.1:1" ]] +n2 wg set wg0 peer "$pub1" endpoint [fd00:aa::1]:1 +n2 ping -W 1 -c 1 192.168.241.1 +[[ $(n2 wg show wg0 endpoints) == "$pub1 [fd00:aa::1]:1" ]] +n2 wg set wg0 peer "$pub1" endpoint 10.0.0.2:1 +n2 ping -W 1 -c 1 192.168.241.1 +[[ $(n2 wg show wg0 endpoints) == "$pub1 10.0.0.2:1" ]] +n2 wg set wg0 peer "$pub1" endpoint [fd00:aa::2]:1 +n2 ping -W 1 -c 1 192.168.241.1 +[[ $(n2 wg show wg0 endpoints) == "$pub1 [fd00:aa::2]:1" ]] + +# What happens if the inbound destination address belongs to a different interface as the default route? +ip1 link add dummy0 type dummy +ip1 addr add 10.50.0.1/24 dev dummy0 +ip1 link set dummy0 up +ip2 route add 10.50.0.0/24 dev veth2 +n2 wg set wg0 peer "$pub1" endpoint 10.50.0.1:1 +n2 ping -W 1 -c 1 192.168.241.1 +[[ $(n2 wg show wg0 endpoints) == "$pub1 10.50.0.1:1" ]] + +ip1 link del dummy0 +ip1 addr flush dev veth1 +ip2 addr flush dev veth2 +ip1 route flush dev veth1 +ip2 route flush dev veth2 + +# Now we see what happens if another interface route takes precedence over an ongoing one +ip1 link add veth3 type veth peer name veth4 +ip1 link set veth4 netns $netns2 +ip1 addr add 10.0.0.1/24 dev veth1 +ip2 addr add 10.0.0.2/24 dev veth2 +ip1 addr add 10.0.0.3/24 dev veth3 +ip1 link set veth1 up +ip2 link set veth2 up +ip1 link set veth3 up +ip2 link set veth4 up +waitiface $netns1 veth1 +waitiface $netns2 veth2 +waitiface $netns1 veth3 +waitiface $netns2 veth4 +ip1 route flush dev veth1 +ip1 route flush dev veth3 +ip1 route add 10.0.0.0/24 dev veth1 src 10.0.0.1 metric 2 +n1 wg set wg0 peer "$pub2" endpoint 10.0.0.2:2 +n1 ping -W 1 -c 1 192.168.241.2 +[[ $(n2 wg show wg0 endpoints) == "$pub1 10.0.0.1:1" ]] +ip1 route add 10.0.0.0/24 dev veth3 src 10.0.0.3 metric 1 +n1 bash -c 'printf 0 > /proc/sys/net/ipv4/conf/veth1/rp_filter' +n2 bash -c 'printf 0 > /proc/sys/net/ipv4/conf/veth4/rp_filter' +n1 bash -c 'printf 0 > /proc/sys/net/ipv4/conf/all/rp_filter' +n2 bash -c 'printf 0 > /proc/sys/net/ipv4/conf/all/rp_filter' +n1 ping -W 1 -c 1 192.168.241.2 +[[ $(n2 wg show wg0 endpoints) == "$pub1 10.0.0.3:1" ]] + +ip1 link del veth1 +ip1 link del veth3 +ip1 link del wg0 +ip2 link del wg0 + +# We test that Netlink/IPC is working properly by doing things that usually cause split responses +ip0 link add dev wg0 type wireguard +config=( "[Interface]" "PrivateKey=$(wg genkey)" "[Peer]" "PublicKey=$(wg genkey)" ) +for a in {1..255}; do + for b in {0..255}; do + config+=( "AllowedIPs=$a.$b.0.0/16,$a::$b/128" ) + done +done +n0 wg setconf wg0 <(printf '%s\n' "${config[@]}") +i=0 +for ip in $(n0 wg show wg0 allowed-ips); do + ((++i)) +done +((i == 255*256*2+1)) +ip0 link del wg0 +ip0 link add dev wg0 type wireguard +config=( "[Interface]" "PrivateKey=$(wg genkey)" ) +for a in {1..40}; do + config+=( "[Peer]" "PublicKey=$(wg genkey)" ) + for b in {1..52}; do + config+=( "AllowedIPs=$a.$b.0.0/16" ) + done +done +n0 wg setconf wg0 <(printf '%s\n' "${config[@]}") +i=0 +while read -r line; do + j=0 + for ip in $line; do + ((++j)) + done + ((j == 53)) + ((++i)) +done < <(n0 wg show wg0 allowed-ips) +((i == 40)) +ip0 link del wg0 +ip0 link add wg0 type wireguard +config=( ) +for i in {1..29}; do + config+=( "[Peer]" "PublicKey=$(wg genkey)" ) +done +config+=( "[Peer]" "PublicKey=$(wg genkey)" "AllowedIPs=255.2.3.4/32,abcd::255/128" ) +n0 wg setconf wg0 <(printf '%s\n' "${config[@]}") +n0 wg showconf wg0 > /dev/null +ip0 link del wg0 + +allowedips=( ) +for i in {1..197}; do + allowedips+=( abcd::$i ) +done +saved_ifs="$IFS" +IFS=, +allowedips="${allowedips[*]}" +IFS="$saved_ifs" +ip0 link add wg0 type wireguard +n0 wg set wg0 peer "$pub1" +n0 wg set wg0 peer "$pub2" allowed-ips "$allowedips" +{ + read -r pub allowedips + [[ $pub == "$pub1" && $allowedips == "(none)" ]] + read -r pub allowedips + [[ $pub == "$pub2" ]] + i=0 + for _ in $allowedips; do + ((++i)) + done + ((i == 197)) +} < <(n0 wg show wg0 allowed-ips) +ip0 link del wg0 + +! n0 wg show doesnotexist || false + +ip0 link add wg0 type wireguard +n0 wg set wg0 private-key <(echo "$key1") peer "$pub2" preshared-key <(echo "$psk") +[[ $(n0 wg show wg0 private-key) == "$key1" ]] +[[ $(n0 wg show wg0 preshared-keys) == "$pub2 $psk" ]] +n0 wg set wg0 private-key /dev/null peer "$pub2" preshared-key /dev/null +[[ $(n0 wg show wg0 private-key) == "(none)" ]] +[[ $(n0 wg show wg0 preshared-keys) == "$pub2 (none)" ]] +n0 wg set wg0 peer "$pub2" +n0 wg set wg0 private-key <(echo "$key2") +[[ $(n0 wg show wg0 public-key) == "$pub2" ]] +[[ -z $(n0 wg show wg0 peers) ]] +n0 wg set wg0 peer "$pub2" +[[ -z $(n0 wg show wg0 peers) ]] +n0 wg set wg0 private-key <(echo "$key1") +n0 wg set wg0 peer "$pub2" +[[ $(n0 wg show wg0 peers) == "$pub2" ]] +n0 wg set wg0 private-key <(echo "/${key1:1}") +[[ $(n0 wg show wg0 private-key) == "+${key1:1}" ]] +n0 wg set wg0 peer "$pub2" allowed-ips 0.0.0.0/0,10.0.0.0/8,100.0.0.0/10,172.16.0.0/12,192.168.0.0/16 +n0 wg set wg0 peer "$pub2" allowed-ips 0.0.0.0/0 +n0 wg set wg0 peer "$pub2" allowed-ips ::/0,1700::/111,5000::/4,e000::/37,9000::/75 +n0 wg set wg0 peer "$pub2" allowed-ips ::/0 +n0 wg set wg0 peer "$pub2" remove +for low_order_point in AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA= AQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA= 4Ot6fDtBuK4WVuP68Z/EatoJjeucMrH9hmIFFl9JuAA= X5yVvKNQjCSx0LFVnIPvWwREXMRYHI6G2CJO3dCfEVc= 7P///////////////////////////////////////38= 7f///////////////////////////////////////38= 7v///////////////////////////////////////38=; do + n0 wg set wg0 peer "$low_order_point" persistent-keepalive 1 endpoint 127.0.0.1:1111 +done +[[ -n $(n0 wg show wg0 peers) ]] +exec 4< <(n0 ncat -l -u -p 1111) +ncat_pid=$! +waitncatudp $netns0 $ncat_pid +ip0 link set wg0 up +! read -r -n 1 -t 2 <&4 || false +kill $ncat_pid +ip0 link del wg0 + +# Ensure there aren't circular reference loops +ip1 link add wg1 type wireguard +ip2 link add wg2 type wireguard +ip1 link set wg1 netns $netns2 +ip2 link set wg2 netns $netns1 +pp ip netns delete $netns1 +pp ip netns delete $netns2 +pp ip netns add $netns1 +pp ip netns add $netns2 + +sleep 2 # Wait for cleanup and grace periods +declare -A objects +while read -t 0.1 -r line 2>/dev/null || [[ $? -ne 142 ]]; do + [[ $line =~ .*(wg[0-9]+:\ [A-Z][a-z]+\ ?[0-9]*)\ .*(created|destroyed).* ]] || continue + objects["${BASH_REMATCH[1]}"]+="${BASH_REMATCH[2]}" +done < /dev/kmsg +alldeleted=1 +for object in "${!objects[@]}"; do + if [[ ${objects["$object"]} != *createddestroyed ]]; then + echo "Error: $object: merely ${objects["$object"]}" >&3 + alldeleted=0 + fi +done +[[ $alldeleted -eq 1 ]] +pretty "" "Objects that were created were also destroyed." diff --git a/tools/testing/selftests/wireguard/qemu/.gitignore b/tools/testing/selftests/wireguard/qemu/.gitignore new file mode 100644 index 000000000000..415b542a9d59 --- /dev/null +++ b/tools/testing/selftests/wireguard/qemu/.gitignore @@ -0,0 +1,2 @@ +build/ +distfiles/ diff --git a/tools/testing/selftests/wireguard/qemu/Makefile b/tools/testing/selftests/wireguard/qemu/Makefile new file mode 100644 index 000000000000..2dab4f57516d --- /dev/null +++ b/tools/testing/selftests/wireguard/qemu/Makefile @@ -0,0 +1,377 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + +PWD := $(shell pwd) + +CHOST := $(shell gcc -dumpmachine) +HOST_ARCH := $(firstword $(subst -, ,$(CHOST))) +ifneq (,$(ARCH)) +CBUILD := $(subst -gcc,,$(lastword $(subst /, ,$(firstword $(wildcard $(foreach bindir,$(subst :, ,$(PATH)),$(bindir)/$(ARCH)-*-gcc)))))) +ifeq (,$(CBUILD)) +$(error The toolchain for $(ARCH) is not installed) +endif +else +CBUILD := $(CHOST) +ARCH := $(firstword $(subst -, ,$(CBUILD))) +endif + +# Set these from the environment to override +KERNEL_PATH ?= $(PWD)/../../../../.. +BUILD_PATH ?= $(PWD)/build/$(ARCH) +DISTFILES_PATH ?= $(PWD)/distfiles +NR_CPUS ?= 4 + +MIRROR := https://download.wireguard.com/qemu-test/distfiles/ + +default: qemu + +# variable name, tarball project name, version, tarball extension, default URI base +define tar_download = +$(1)_VERSION := $(3) +$(1)_NAME := $(2)-$$($(1)_VERSION) +$(1)_TAR := $(DISTFILES_PATH)/$$($(1)_NAME)$(4) +$(1)_PATH := $(BUILD_PATH)/$$($(1)_NAME) +$(call file_download,$$($(1)_NAME)$(4),$(5),$(6)) +endef + +define file_download = +$(DISTFILES_PATH)/$(1): + mkdir -p $(DISTFILES_PATH) + flock -x $$@.lock -c '[ -f $$@ ] && exit 0; wget -O $$@.tmp $(MIRROR)$(1) || wget -O $$@.tmp $(2)$(1) || rm -f $$@.tmp; [ -f $$@.tmp ] || exit 1; if echo "$(3) $$@.tmp" | sha256sum -c -; then mv $$@.tmp $$@; else rm -f $$@.tmp; exit 71; fi' +endef + +$(eval $(call tar_download,MUSL,musl,1.1.24,.tar.gz,https://www.musl-libc.org/releases/,1370c9a812b2cf2a7d92802510cca0058cc37e66a7bedd70051f0a34015022a3)) +$(eval $(call tar_download,IPERF,iperf,3.7,.tar.gz,https://downloads.es.net/pub/iperf/,d846040224317caf2f75c843d309a950a7db23f9b44b94688ccbe557d6d1710c)) +$(eval $(call tar_download,BASH,bash,5.0,.tar.gz,https://ftp.gnu.org/gnu/bash/,b4a80f2ac66170b2913efbfb9f2594f1f76c7b1afd11f799e22035d63077fb4d)) +$(eval $(call tar_download,IPROUTE2,iproute2,5.6.0,.tar.xz,https://www.kernel.org/pub/linux/utils/net/iproute2/,1b5b0e25ce6e23da7526ea1da044e814ad85ba761b10dd29c2b027c056b04692)) +$(eval $(call tar_download,IPTABLES,iptables,1.8.4,.tar.bz2,https://www.netfilter.org/projects/iptables/files/,993a3a5490a544c2cbf2ef15cf7e7ed21af1845baf228318d5c36ef8827e157c)) +$(eval $(call tar_download,NMAP,nmap,7.80,.tar.bz2,https://nmap.org/dist/,fcfa5a0e42099e12e4bf7a68ebe6fde05553383a682e816a7ec9256ab4773faa)) +$(eval $(call tar_download,IPUTILS,iputils,s20190709,.tar.gz,https://github.com/iputils/iputils/archive/s20190709.tar.gz/#,a15720dd741d7538dd2645f9f516d193636ae4300ff7dbc8bfca757bf166490a)) +$(eval $(call tar_download,WIREGUARD_TOOLS,wireguard-tools,1.0.20200206,.tar.xz,https://git.zx2c4.com/wireguard-tools/snapshot/,f5207248c6a3c3e3bfc9ab30b91c1897b00802ed861e1f9faaed873366078c64)) + +KERNEL_BUILD_PATH := $(BUILD_PATH)/kernel$(if $(findstring yes,$(DEBUG_KERNEL)),-debug) +rwildcard=$(foreach d,$(wildcard $1*),$(call rwildcard,$d/,$2) $(filter $(subst *,%,$2),$d)) +WIREGUARD_SOURCES := $(call rwildcard,$(KERNEL_PATH)/drivers/net/wireguard/,*) + +export CFLAGS ?= -O3 -pipe +export LDFLAGS ?= +export CPPFLAGS := -I$(BUILD_PATH)/include + +ifeq ($(HOST_ARCH),$(ARCH)) +CROSS_COMPILE_FLAG := --host=$(CHOST) +CFLAGS += -march=native +STRIP := strip +else +$(info Cross compilation: building for $(CBUILD) using $(CHOST)) +CROSS_COMPILE_FLAG := --build=$(CBUILD) --host=$(CHOST) +export CROSS_COMPILE=$(CBUILD)- +STRIP := $(CBUILD)-strip +endif +ifeq ($(ARCH),aarch64) +QEMU_ARCH := aarch64 +KERNEL_ARCH := arm64 +KERNEL_BZIMAGE := $(KERNEL_BUILD_PATH)/arch/arm64/boot/Image +ifeq ($(HOST_ARCH),$(ARCH)) +QEMU_MACHINE := -cpu host -machine virt,gic_version=host,accel=kvm +else +QEMU_MACHINE := -cpu cortex-a53 -machine virt +CFLAGS += -march=armv8-a -mtune=cortex-a53 +endif +else ifeq ($(ARCH),aarch64_be) +QEMU_ARCH := aarch64 +KERNEL_ARCH := arm64 +KERNEL_BZIMAGE := $(KERNEL_BUILD_PATH)/arch/arm64/boot/Image +ifeq ($(HOST_ARCH),$(ARCH)) +QEMU_MACHINE := -cpu host -machine virt,gic_version=host,accel=kvm +else +QEMU_MACHINE := -cpu cortex-a53 -machine virt +CFLAGS += -march=armv8-a -mtune=cortex-a53 +endif +else ifeq ($(ARCH),arm) +QEMU_ARCH := arm +KERNEL_ARCH := arm +KERNEL_BZIMAGE := $(KERNEL_BUILD_PATH)/arch/arm/boot/zImage +ifeq ($(HOST_ARCH),$(ARCH)) +QEMU_MACHINE := -cpu host -machine virt,gic_version=host,accel=kvm +else +QEMU_MACHINE := -cpu cortex-a15 -machine virt +CFLAGS += -march=armv7-a -mtune=cortex-a15 -mabi=aapcs-linux +endif +else ifeq ($(ARCH),armeb) +QEMU_ARCH := arm +KERNEL_ARCH := arm +KERNEL_BZIMAGE := $(KERNEL_BUILD_PATH)/arch/arm/boot/zImage +ifeq ($(HOST_ARCH),$(ARCH)) +QEMU_MACHINE := -cpu host -machine virt,gic_version=host,accel=kvm +else +QEMU_MACHINE := -cpu cortex-a15 -machine virt +CFLAGS += -march=armv7-a -mabi=aapcs-linux # We don't pass -mtune=cortex-a15 due to a compiler bug on big endian. +LDFLAGS += -Wl,--be8 +endif +else ifeq ($(ARCH),x86_64) +QEMU_ARCH := x86_64 +KERNEL_ARCH := x86_64 +KERNEL_BZIMAGE := $(KERNEL_BUILD_PATH)/arch/x86/boot/bzImage +ifeq ($(HOST_ARCH),$(ARCH)) +QEMU_MACHINE := -cpu host -machine q35,accel=kvm +else +QEMU_MACHINE := -cpu Skylake-Server -machine q35 +CFLAGS += -march=skylake-avx512 +endif +else ifeq ($(ARCH),i686) +QEMU_ARCH := i386 +KERNEL_ARCH := x86 +KERNEL_BZIMAGE := $(KERNEL_BUILD_PATH)/arch/x86/boot/bzImage +ifeq ($(subst x86_64,i686,$(HOST_ARCH)),$(ARCH)) +QEMU_MACHINE := -cpu host -machine q35,accel=kvm +else +QEMU_MACHINE := -cpu coreduo -machine q35 +CFLAGS += -march=prescott +endif +else ifeq ($(ARCH),mips64) +QEMU_ARCH := mips64 +KERNEL_ARCH := mips +KERNEL_BZIMAGE := $(KERNEL_BUILD_PATH)/vmlinux +ifeq ($(HOST_ARCH),$(ARCH)) +QEMU_MACHINE := -cpu host -machine malta,accel=kvm +CFLAGS += -EB +else +QEMU_MACHINE := -cpu MIPS64R2-generic -machine malta -smp 1 +CFLAGS += -march=mips64r2 -EB +endif +else ifeq ($(ARCH),mips64el) +QEMU_ARCH := mips64el +KERNEL_ARCH := mips +KERNEL_BZIMAGE := $(KERNEL_BUILD_PATH)/vmlinux +ifeq ($(HOST_ARCH),$(ARCH)) +QEMU_MACHINE := -cpu host -machine malta,accel=kvm +CFLAGS += -EL +else +QEMU_MACHINE := -cpu MIPS64R2-generic -machine malta -smp 1 +CFLAGS += -march=mips64r2 -EL +endif +else ifeq ($(ARCH),mips) +QEMU_ARCH := mips +KERNEL_ARCH := mips +KERNEL_BZIMAGE := $(KERNEL_BUILD_PATH)/vmlinux +ifeq ($(HOST_ARCH),$(ARCH)) +QEMU_MACHINE := -cpu host -machine malta,accel=kvm +CFLAGS += -EB +else +QEMU_MACHINE := -cpu 24Kf -machine malta -smp 1 +CFLAGS += -march=mips32r2 -EB +endif +else ifeq ($(ARCH),mipsel) +QEMU_ARCH := mipsel +KERNEL_ARCH := mips +KERNEL_BZIMAGE := $(KERNEL_BUILD_PATH)/vmlinux +ifeq ($(HOST_ARCH),$(ARCH)) +QEMU_MACHINE := -cpu host -machine malta,accel=kvm +CFLAGS += -EL +else +QEMU_MACHINE := -cpu 24Kf -machine malta -smp 1 +CFLAGS += -march=mips32r2 -EL +endif +else ifeq ($(ARCH),powerpc64le) +QEMU_ARCH := ppc64 +KERNEL_ARCH := powerpc +KERNEL_BZIMAGE := $(KERNEL_BUILD_PATH)/vmlinux +ifeq ($(HOST_ARCH),$(ARCH)) +QEMU_MACHINE := -cpu host,accel=kvm -machine pseries +else +QEMU_MACHINE := -machine pseries +endif +CFLAGS += -mcpu=powerpc64le -mlong-double-64 +else ifeq ($(ARCH),powerpc) +QEMU_ARCH := ppc +KERNEL_ARCH := powerpc +KERNEL_BZIMAGE := $(KERNEL_BUILD_PATH)/arch/powerpc/boot/uImage +ifeq ($(HOST_ARCH),$(ARCH)) +QEMU_MACHINE := -cpu host,accel=kvm -machine ppce500 +else +QEMU_MACHINE := -machine ppce500 +endif +CFLAGS += -mcpu=powerpc -mlong-double-64 -msecure-plt +else ifeq ($(ARCH),m68k) +QEMU_ARCH := m68k +KERNEL_ARCH := m68k +KERNEL_BZIMAGE := $(KERNEL_BUILD_PATH)/vmlinux +KERNEL_CMDLINE := $(shell sed -n 's/CONFIG_CMDLINE=\(.*\)/\1/p' arch/m68k.config) +ifeq ($(HOST_ARCH),$(ARCH)) +QEMU_MACHINE := -cpu host,accel=kvm -machine q800 -smp 1 -append $(KERNEL_CMDLINE) +else +QEMU_MACHINE := -machine q800 -smp 1 -append $(KERNEL_CMDLINE) +endif +else +$(error I only build: x86_64, i686, arm, armeb, aarch64, aarch64_be, mips, mipsel, mips64, mips64el, powerpc64le, powerpc, m68k) +endif + +REAL_CC := $(CBUILD)-gcc +MUSL_CC := $(BUILD_PATH)/musl-gcc +export CC := $(MUSL_CC) +USERSPACE_DEPS := $(MUSL_CC) $(BUILD_PATH)/include/.installed $(BUILD_PATH)/include/linux/.installed + +build: $(KERNEL_BZIMAGE) +qemu: $(KERNEL_BZIMAGE) + rm -f $(BUILD_PATH)/result + timeout --foreground 20m qemu-system-$(QEMU_ARCH) \ + -nodefaults \ + -nographic \ + -smp $(NR_CPUS) \ + $(QEMU_MACHINE) \ + -m $$(grep -q CONFIG_DEBUG_KMEMLEAK=y $(KERNEL_BUILD_PATH)/.config && echo 1G || echo 256M) \ + -serial stdio \ + -serial file:$(BUILD_PATH)/result \ + -no-reboot \ + -monitor none \ + -kernel $< + grep -Fq success $(BUILD_PATH)/result + +$(BUILD_PATH)/init-cpio-spec.txt: + mkdir -p $(BUILD_PATH) + echo "file /init $(BUILD_PATH)/init 755 0 0" > $@ + echo "file /init.sh $(PWD)/../netns.sh 755 0 0" >> $@ + echo "dir /dev 755 0 0" >> $@ + echo "nod /dev/console 644 0 0 c 5 1" >> $@ + echo "dir /bin 755 0 0" >> $@ + echo "file /bin/iperf3 $(IPERF_PATH)/src/iperf3 755 0 0" >> $@ + echo "file /bin/wg $(WIREGUARD_TOOLS_PATH)/src/wg 755 0 0" >> $@ + echo "file /bin/bash $(BASH_PATH)/bash 755 0 0" >> $@ + echo "file /bin/ip $(IPROUTE2_PATH)/ip/ip 755 0 0" >> $@ + echo "file /bin/ss $(IPROUTE2_PATH)/misc/ss 755 0 0" >> $@ + echo "file /bin/ping $(IPUTILS_PATH)/ping 755 0 0" >> $@ + echo "file /bin/ncat $(NMAP_PATH)/ncat/ncat 755 0 0" >> $@ + echo "file /bin/xtables-legacy-multi $(IPTABLES_PATH)/iptables/xtables-legacy-multi 755 0 0" >> $@ + echo "slink /bin/iptables xtables-legacy-multi 777 0 0" >> $@ + echo "slink /bin/ping6 ping 777 0 0" >> $@ + echo "dir /lib 755 0 0" >> $@ + echo "file /lib/libc.so $(MUSL_PATH)/lib/libc.so 755 0 0" >> $@ + echo "slink /lib/ld-linux.so.1 libc.so 777 0 0" >> $@ + +$(KERNEL_BUILD_PATH)/.config: kernel.config arch/$(ARCH).config + mkdir -p $(KERNEL_BUILD_PATH) + cp kernel.config $(KERNEL_BUILD_PATH)/minimal.config + printf 'CONFIG_NR_CPUS=$(NR_CPUS)\nCONFIG_INITRAMFS_SOURCE="$(BUILD_PATH)/init-cpio-spec.txt"\n' >> $(KERNEL_BUILD_PATH)/minimal.config + cat arch/$(ARCH).config >> $(KERNEL_BUILD_PATH)/minimal.config + $(MAKE) -C $(KERNEL_PATH) O=$(KERNEL_BUILD_PATH) ARCH=$(KERNEL_ARCH) allnoconfig + cd $(KERNEL_BUILD_PATH) && ARCH=$(KERNEL_ARCH) $(KERNEL_PATH)/scripts/kconfig/merge_config.sh -n $(KERNEL_BUILD_PATH)/.config $(KERNEL_BUILD_PATH)/minimal.config + $(if $(findstring yes,$(DEBUG_KERNEL)),cp debug.config $(KERNEL_BUILD_PATH) && cd $(KERNEL_BUILD_PATH) && ARCH=$(KERNEL_ARCH) $(KERNEL_PATH)/scripts/kconfig/merge_config.sh -n $(KERNEL_BUILD_PATH)/.config debug.config,) + +$(KERNEL_BZIMAGE): $(KERNEL_BUILD_PATH)/.config $(BUILD_PATH)/init-cpio-spec.txt $(MUSL_PATH)/lib/libc.so $(IPERF_PATH)/src/iperf3 $(IPUTILS_PATH)/ping $(BASH_PATH)/bash $(IPROUTE2_PATH)/misc/ss $(IPROUTE2_PATH)/ip/ip $(IPTABLES_PATH)/iptables/xtables-legacy-multi $(NMAP_PATH)/ncat/ncat $(WIREGUARD_TOOLS_PATH)/src/wg $(BUILD_PATH)/init ../netns.sh $(WIREGUARD_SOURCES) + $(MAKE) -C $(KERNEL_PATH) O=$(KERNEL_BUILD_PATH) ARCH=$(KERNEL_ARCH) CROSS_COMPILE=$(CROSS_COMPILE) + +$(BUILD_PATH)/include/linux/.installed: | $(KERNEL_BUILD_PATH)/.config + $(MAKE) -C $(KERNEL_PATH) O=$(KERNEL_BUILD_PATH) INSTALL_HDR_PATH=$(BUILD_PATH) ARCH=$(KERNEL_ARCH) CROSS_COMPILE=$(CROSS_COMPILE) headers_install + touch $@ + +$(MUSL_PATH)/lib/libc.so: $(MUSL_TAR) + mkdir -p $(BUILD_PATH) + flock -s $<.lock tar -C $(BUILD_PATH) -xf $< + cd $(MUSL_PATH) && CC=$(REAL_CC) ./configure --prefix=/ --disable-static --build=$(CBUILD) + $(MAKE) -C $(MUSL_PATH) + $(STRIP) -s $@ + +$(BUILD_PATH)/include/.installed: $(MUSL_PATH)/lib/libc.so + $(MAKE) -C $(MUSL_PATH) DESTDIR=$(BUILD_PATH) install-headers + touch $@ + +$(MUSL_CC): $(MUSL_PATH)/lib/libc.so + sh $(MUSL_PATH)/tools/musl-gcc.specs.sh $(BUILD_PATH)/include $(MUSL_PATH)/lib /lib/ld-linux.so.1 > $(BUILD_PATH)/musl-gcc.specs + printf '#!/bin/sh\nexec "$(REAL_CC)" --specs="$(BUILD_PATH)/musl-gcc.specs" "$$@"\n' > $(BUILD_PATH)/musl-gcc + chmod +x $(BUILD_PATH)/musl-gcc + +$(IPERF_PATH)/.installed: $(IPERF_TAR) + mkdir -p $(BUILD_PATH) + flock -s $<.lock tar -C $(BUILD_PATH) -xf $< + sed -i '1s/^/#include /' $(IPERF_PATH)/src/cjson.h $(IPERF_PATH)/src/timer.h + sed -i -r 's/-p?g//g' $(IPERF_PATH)/src/Makefile* + touch $@ + +$(IPERF_PATH)/src/iperf3: | $(IPERF_PATH)/.installed $(USERSPACE_DEPS) + cd $(IPERF_PATH) && CFLAGS="$(CFLAGS) -D_GNU_SOURCE" ./configure --prefix=/ $(CROSS_COMPILE_FLAG) --enable-static --disable-shared --with-openssl=no + $(MAKE) -C $(IPERF_PATH) + $(STRIP) -s $@ + +$(WIREGUARD_TOOLS_PATH)/.installed: $(WIREGUARD_TOOLS_TAR) + mkdir -p $(BUILD_PATH) + flock -s $<.lock tar -C $(BUILD_PATH) -xf $< + touch $@ + +$(WIREGUARD_TOOLS_PATH)/src/wg: | $(WIREGUARD_TOOLS_PATH)/.installed $(USERSPACE_DEPS) + $(MAKE) -C $(WIREGUARD_TOOLS_PATH)/src wg + $(STRIP) -s $@ + +$(BUILD_PATH)/init: init.c | $(USERSPACE_DEPS) + mkdir -p $(BUILD_PATH) + $(MUSL_CC) -o $@ $(CFLAGS) $(LDFLAGS) -std=gnu11 $< + $(STRIP) -s $@ + +$(IPUTILS_PATH)/.installed: $(IPUTILS_TAR) + mkdir -p $(BUILD_PATH) + flock -s $<.lock tar -C $(BUILD_PATH) -xf $< + touch $@ + +$(IPUTILS_PATH)/ping: | $(IPUTILS_PATH)/.installed $(USERSPACE_DEPS) + sed -i /atexit/d $(IPUTILS_PATH)/ping.c + cd $(IPUTILS_PATH) && $(CC) $(CFLAGS) -std=c99 -o $@ ping.c ping_common.c ping6_common.c iputils_common.c -D_GNU_SOURCE -D'IPUTILS_VERSION(f)=f' -lresolv $(LDFLAGS) + $(STRIP) -s $@ + +$(BASH_PATH)/.installed: $(BASH_TAR) + mkdir -p $(BUILD_PATH) + flock -s $<.lock tar -C $(BUILD_PATH) -xf $< + touch $@ + +$(BASH_PATH)/bash: | $(BASH_PATH)/.installed $(USERSPACE_DEPS) + cd $(BASH_PATH) && ./configure --prefix=/ $(CROSS_COMPILE_FLAG) --without-bash-malloc --disable-debugger --disable-help-builtin --disable-history --disable-multibyte --disable-progcomp --disable-readline --disable-mem-scramble + $(MAKE) -C $(BASH_PATH) + $(STRIP) -s $@ + +$(IPROUTE2_PATH)/.installed: $(IPROUTE2_TAR) + mkdir -p $(BUILD_PATH) + flock -s $<.lock tar -C $(BUILD_PATH) -xf $< + printf 'CC:=$(CC)\nPKG_CONFIG:=pkg-config\nTC_CONFIG_XT:=n\nTC_CONFIG_ATM:=n\nTC_CONFIG_IPSET:=n\nIP_CONFIG_SETNS:=y\nHAVE_ELF:=n\nHAVE_MNL:=n\nHAVE_BERKELEY_DB:=n\nHAVE_LATEX:=n\nHAVE_PDFLATEX:=n\nCFLAGS+=-DHAVE_SETNS\n' > $(IPROUTE2_PATH)/config.mk + printf 'lib: snapshot\n\t$$(MAKE) -C lib\nip/ip: lib\n\t$$(MAKE) -C ip ip\nmisc/ss: lib\n\t$$(MAKE) -C misc ss\n' >> $(IPROUTE2_PATH)/Makefile + touch $@ + +$(IPROUTE2_PATH)/ip/ip: | $(IPROUTE2_PATH)/.installed $(USERSPACE_DEPS) + $(MAKE) -C $(IPROUTE2_PATH) PREFIX=/ ip/ip + $(STRIP) -s $@ + +$(IPROUTE2_PATH)/misc/ss: | $(IPROUTE2_PATH)/.installed $(USERSPACE_DEPS) + $(MAKE) -C $(IPROUTE2_PATH) PREFIX=/ misc/ss + $(STRIP) -s $@ + +$(IPTABLES_PATH)/.installed: $(IPTABLES_TAR) + mkdir -p $(BUILD_PATH) + flock -s $<.lock tar -C $(BUILD_PATH) -xf $< + sed -i -e "/nfnetlink=[01]/s:=[01]:=0:" -e "/nfconntrack=[01]/s:=[01]:=0:" $(IPTABLES_PATH)/configure + touch $@ + +$(IPTABLES_PATH)/iptables/xtables-legacy-multi: | $(IPTABLES_PATH)/.installed $(USERSPACE_DEPS) + cd $(IPTABLES_PATH) && ./configure --prefix=/ $(CROSS_COMPILE_FLAG) --enable-static --disable-shared --disable-nftables --disable-bpf-compiler --disable-nfsynproxy --disable-libipq --disable-connlabel --with-kernel=$(BUILD_PATH)/include + $(MAKE) -C $(IPTABLES_PATH) + $(STRIP) -s $@ + +$(NMAP_PATH)/.installed: $(NMAP_TAR) + mkdir -p $(BUILD_PATH) + flock -s $<.lock tar -C $(BUILD_PATH) -xf $< + touch $@ + +$(NMAP_PATH)/ncat/ncat: | $(NMAP_PATH)/.installed $(USERSPACE_DEPS) + cd $(NMAP_PATH) && ./configure --prefix=/ $(CROSS_COMPILE_FLAG) --enable-static --disable-shared --without-ndiff --without-zenmap --without-nping --with-libpcap=included --with-libpcre=included --with-libdnet=included --without-liblua --with-liblinear=included --without-nmap-update --without-openssl --with-pcap=linux --without-libssh + $(MAKE) -C $(NMAP_PATH)/libpcap + $(MAKE) -C $(NMAP_PATH)/ncat + $(STRIP) -s $@ + +clean: + rm -rf $(BUILD_PATH) + +distclean: clean + rm -rf $(DISTFILES_PATH) + +menuconfig: $(KERNEL_BUILD_PATH)/.config + $(MAKE) -C $(KERNEL_PATH) O=$(KERNEL_BUILD_PATH) ARCH=$(KERNEL_ARCH) CROSS_COMPILE=$(CROSS_COMPILE) menuconfig + +.PHONY: qemu build clean distclean menuconfig +.DELETE_ON_ERROR: diff --git a/tools/testing/selftests/wireguard/qemu/arch/aarch64.config b/tools/testing/selftests/wireguard/qemu/arch/aarch64.config new file mode 100644 index 000000000000..3d063bb247bb --- /dev/null +++ b/tools/testing/selftests/wireguard/qemu/arch/aarch64.config @@ -0,0 +1,5 @@ +CONFIG_SERIAL_AMBA_PL011=y +CONFIG_SERIAL_AMBA_PL011_CONSOLE=y +CONFIG_CMDLINE_BOOL=y +CONFIG_CMDLINE="console=ttyAMA0 wg.success=ttyAMA1" +CONFIG_FRAME_WARN=1280 diff --git a/tools/testing/selftests/wireguard/qemu/arch/aarch64_be.config b/tools/testing/selftests/wireguard/qemu/arch/aarch64_be.config new file mode 100644 index 000000000000..dbdc7e406a7b --- /dev/null +++ b/tools/testing/selftests/wireguard/qemu/arch/aarch64_be.config @@ -0,0 +1,6 @@ +CONFIG_CPU_BIG_ENDIAN=y +CONFIG_SERIAL_AMBA_PL011=y +CONFIG_SERIAL_AMBA_PL011_CONSOLE=y +CONFIG_CMDLINE_BOOL=y +CONFIG_CMDLINE="console=ttyAMA0 wg.success=ttyAMA1" +CONFIG_FRAME_WARN=1280 diff --git a/tools/testing/selftests/wireguard/qemu/arch/arm.config b/tools/testing/selftests/wireguard/qemu/arch/arm.config new file mode 100644 index 000000000000..148f49905418 --- /dev/null +++ b/tools/testing/selftests/wireguard/qemu/arch/arm.config @@ -0,0 +1,9 @@ +CONFIG_MMU=y +CONFIG_ARCH_MULTI_V7=y +CONFIG_ARCH_VIRT=y +CONFIG_THUMB2_KERNEL=n +CONFIG_SERIAL_AMBA_PL011=y +CONFIG_SERIAL_AMBA_PL011_CONSOLE=y +CONFIG_CMDLINE_BOOL=y +CONFIG_CMDLINE="console=ttyAMA0 wg.success=ttyAMA1" +CONFIG_FRAME_WARN=1024 diff --git a/tools/testing/selftests/wireguard/qemu/arch/armeb.config b/tools/testing/selftests/wireguard/qemu/arch/armeb.config new file mode 100644 index 000000000000..bd76b07d00a2 --- /dev/null +++ b/tools/testing/selftests/wireguard/qemu/arch/armeb.config @@ -0,0 +1,10 @@ +CONFIG_MMU=y +CONFIG_ARCH_MULTI_V7=y +CONFIG_ARCH_VIRT=y +CONFIG_THUMB2_KERNEL=n +CONFIG_SERIAL_AMBA_PL011=y +CONFIG_SERIAL_AMBA_PL011_CONSOLE=y +CONFIG_CMDLINE_BOOL=y +CONFIG_CMDLINE="console=ttyAMA0 wg.success=ttyAMA1" +CONFIG_CPU_BIG_ENDIAN=y +CONFIG_FRAME_WARN=1024 diff --git a/tools/testing/selftests/wireguard/qemu/arch/i686.config b/tools/testing/selftests/wireguard/qemu/arch/i686.config new file mode 100644 index 000000000000..a85025d7206e --- /dev/null +++ b/tools/testing/selftests/wireguard/qemu/arch/i686.config @@ -0,0 +1,5 @@ +CONFIG_SERIAL_8250=y +CONFIG_SERIAL_8250_CONSOLE=y +CONFIG_CMDLINE_BOOL=y +CONFIG_CMDLINE="console=ttyS0 wg.success=ttyS1" +CONFIG_FRAME_WARN=1024 diff --git a/tools/testing/selftests/wireguard/qemu/arch/m68k.config b/tools/testing/selftests/wireguard/qemu/arch/m68k.config new file mode 100644 index 000000000000..62a15bdb877e --- /dev/null +++ b/tools/testing/selftests/wireguard/qemu/arch/m68k.config @@ -0,0 +1,9 @@ +CONFIG_MMU=y +CONFIG_M68KCLASSIC=y +CONFIG_M68040=y +CONFIG_MAC=y +CONFIG_SERIAL_PMACZILOG=y +CONFIG_SERIAL_PMACZILOG_TTYS=y +CONFIG_SERIAL_PMACZILOG_CONSOLE=y +CONFIG_CMDLINE="console=ttyS0 wg.success=ttyS1" +CONFIG_FRAME_WARN=1024 diff --git a/tools/testing/selftests/wireguard/qemu/arch/mips.config b/tools/testing/selftests/wireguard/qemu/arch/mips.config new file mode 100644 index 000000000000..df71d6b95546 --- /dev/null +++ b/tools/testing/selftests/wireguard/qemu/arch/mips.config @@ -0,0 +1,11 @@ +CONFIG_CPU_MIPS32_R2=y +CONFIG_MIPS_MALTA=y +CONFIG_MIPS_CPS=y +CONFIG_MIPS_FP_SUPPORT=y +CONFIG_POWER_RESET=y +CONFIG_POWER_RESET_SYSCON=y +CONFIG_SERIAL_8250=y +CONFIG_SERIAL_8250_CONSOLE=y +CONFIG_CMDLINE_BOOL=y +CONFIG_CMDLINE="console=ttyS0 wg.success=ttyS1" +CONFIG_FRAME_WARN=1024 diff --git a/tools/testing/selftests/wireguard/qemu/arch/mips64.config b/tools/testing/selftests/wireguard/qemu/arch/mips64.config new file mode 100644 index 000000000000..90c783f725c4 --- /dev/null +++ b/tools/testing/selftests/wireguard/qemu/arch/mips64.config @@ -0,0 +1,14 @@ +CONFIG_64BIT=y +CONFIG_CPU_MIPS64_R2=y +CONFIG_MIPS32_N32=y +CONFIG_CPU_HAS_MSA=y +CONFIG_MIPS_MALTA=y +CONFIG_MIPS_CPS=y +CONFIG_MIPS_FP_SUPPORT=y +CONFIG_POWER_RESET=y +CONFIG_POWER_RESET_SYSCON=y +CONFIG_SERIAL_8250=y +CONFIG_SERIAL_8250_CONSOLE=y +CONFIG_CMDLINE_BOOL=y +CONFIG_CMDLINE="console=ttyS0 wg.success=ttyS1" +CONFIG_FRAME_WARN=1280 diff --git a/tools/testing/selftests/wireguard/qemu/arch/mips64el.config b/tools/testing/selftests/wireguard/qemu/arch/mips64el.config new file mode 100644 index 000000000000..435b0b43e00c --- /dev/null +++ b/tools/testing/selftests/wireguard/qemu/arch/mips64el.config @@ -0,0 +1,15 @@ +CONFIG_64BIT=y +CONFIG_CPU_MIPS64_R2=y +CONFIG_MIPS32_N32=y +CONFIG_CPU_HAS_MSA=y +CONFIG_MIPS_MALTA=y +CONFIG_CPU_LITTLE_ENDIAN=y +CONFIG_MIPS_CPS=y +CONFIG_MIPS_FP_SUPPORT=y +CONFIG_POWER_RESET=y +CONFIG_POWER_RESET_SYSCON=y +CONFIG_SERIAL_8250=y +CONFIG_SERIAL_8250_CONSOLE=y +CONFIG_CMDLINE_BOOL=y +CONFIG_CMDLINE="console=ttyS0 wg.success=ttyS1" +CONFIG_FRAME_WARN=1280 diff --git a/tools/testing/selftests/wireguard/qemu/arch/mipsel.config b/tools/testing/selftests/wireguard/qemu/arch/mipsel.config new file mode 100644 index 000000000000..62bb50c4a85f --- /dev/null +++ b/tools/testing/selftests/wireguard/qemu/arch/mipsel.config @@ -0,0 +1,12 @@ +CONFIG_CPU_MIPS32_R2=y +CONFIG_MIPS_MALTA=y +CONFIG_CPU_LITTLE_ENDIAN=y +CONFIG_MIPS_CPS=y +CONFIG_MIPS_FP_SUPPORT=y +CONFIG_POWER_RESET=y +CONFIG_POWER_RESET_SYSCON=y +CONFIG_SERIAL_8250=y +CONFIG_SERIAL_8250_CONSOLE=y +CONFIG_CMDLINE_BOOL=y +CONFIG_CMDLINE="console=ttyS0 wg.success=ttyS1" +CONFIG_FRAME_WARN=1024 diff --git a/tools/testing/selftests/wireguard/qemu/arch/powerpc.config b/tools/testing/selftests/wireguard/qemu/arch/powerpc.config new file mode 100644 index 000000000000..57957093b71b --- /dev/null +++ b/tools/testing/selftests/wireguard/qemu/arch/powerpc.config @@ -0,0 +1,10 @@ +CONFIG_PPC_QEMU_E500=y +CONFIG_FSL_SOC_BOOKE=y +CONFIG_PPC_85xx=y +CONFIG_PHYS_64BIT=y +CONFIG_SERIAL_8250=y +CONFIG_SERIAL_8250_CONSOLE=y +CONFIG_MATH_EMULATION=y +CONFIG_CMDLINE_BOOL=y +CONFIG_CMDLINE="console=ttyS0 wg.success=ttyS1" +CONFIG_FRAME_WARN=1024 diff --git a/tools/testing/selftests/wireguard/qemu/arch/powerpc64le.config b/tools/testing/selftests/wireguard/qemu/arch/powerpc64le.config new file mode 100644 index 000000000000..f52f1e2bc7f6 --- /dev/null +++ b/tools/testing/selftests/wireguard/qemu/arch/powerpc64le.config @@ -0,0 +1,13 @@ +CONFIG_PPC64=y +CONFIG_PPC_PSERIES=y +CONFIG_ALTIVEC=y +CONFIG_VSX=y +CONFIG_PPC_OF_BOOT_TRAMPOLINE=y +CONFIG_PPC_RADIX_MMU=y +CONFIG_HVC_CONSOLE=y +CONFIG_CPU_LITTLE_ENDIAN=y +CONFIG_CMDLINE_BOOL=y +CONFIG_CMDLINE="console=hvc0 wg.success=hvc1" +CONFIG_SECTION_MISMATCH_WARN_ONLY=y +CONFIG_FRAME_WARN=1280 +CONFIG_THREAD_SHIFT=14 diff --git a/tools/testing/selftests/wireguard/qemu/arch/x86_64.config b/tools/testing/selftests/wireguard/qemu/arch/x86_64.config new file mode 100644 index 000000000000..00a1ef4869d5 --- /dev/null +++ b/tools/testing/selftests/wireguard/qemu/arch/x86_64.config @@ -0,0 +1,5 @@ +CONFIG_SERIAL_8250=y +CONFIG_SERIAL_8250_CONSOLE=y +CONFIG_CMDLINE_BOOL=y +CONFIG_CMDLINE="console=ttyS0 wg.success=ttyS1" +CONFIG_FRAME_WARN=1280 diff --git a/tools/testing/selftests/wireguard/qemu/debug.config b/tools/testing/selftests/wireguard/qemu/debug.config new file mode 100644 index 000000000000..b9c72706fe4d --- /dev/null +++ b/tools/testing/selftests/wireguard/qemu/debug.config @@ -0,0 +1,67 @@ +CONFIG_LOCALVERSION="-debug" +CONFIG_ENABLE_WARN_DEPRECATED=y +CONFIG_ENABLE_MUST_CHECK=y +CONFIG_FRAME_POINTER=y +CONFIG_STACK_VALIDATION=y +CONFIG_DEBUG_KERNEL=y +CONFIG_DEBUG_INFO=y +CONFIG_DEBUG_INFO_DWARF4=y +CONFIG_PAGE_EXTENSION=y +CONFIG_PAGE_POISONING=y +CONFIG_DEBUG_OBJECTS=y +CONFIG_DEBUG_OBJECTS_FREE=y +CONFIG_DEBUG_OBJECTS_TIMERS=y +CONFIG_DEBUG_OBJECTS_WORK=y +CONFIG_DEBUG_OBJECTS_RCU_HEAD=y +CONFIG_DEBUG_OBJECTS_PERCPU_COUNTER=y +CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT=1 +CONFIG_SLUB_DEBUG_ON=y +CONFIG_DEBUG_VM=y +CONFIG_DEBUG_MEMORY_INIT=y +CONFIG_HAVE_DEBUG_STACKOVERFLOW=y +CONFIG_DEBUG_STACKOVERFLOW=y +CONFIG_HAVE_ARCH_KMEMCHECK=y +CONFIG_HAVE_ARCH_KASAN=y +CONFIG_KASAN=y +CONFIG_KASAN_INLINE=y +CONFIG_UBSAN=y +CONFIG_UBSAN_SANITIZE_ALL=y +CONFIG_UBSAN_NO_ALIGNMENT=y +CONFIG_UBSAN_NULL=y +CONFIG_DEBUG_KMEMLEAK=y +CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=8192 +CONFIG_DEBUG_STACK_USAGE=y +CONFIG_DEBUG_SHIRQ=y +CONFIG_WQ_WATCHDOG=y +CONFIG_SCHED_DEBUG=y +CONFIG_SCHED_INFO=y +CONFIG_SCHEDSTATS=y +CONFIG_SCHED_STACK_END_CHECK=y +CONFIG_DEBUG_TIMEKEEPING=y +CONFIG_TIMER_STATS=y +CONFIG_DEBUG_PREEMPT=y +CONFIG_DEBUG_RT_MUTEXES=y +CONFIG_DEBUG_SPINLOCK=y +CONFIG_DEBUG_MUTEXES=y +CONFIG_DEBUG_LOCK_ALLOC=y +CONFIG_PROVE_LOCKING=y +CONFIG_LOCKDEP=y +CONFIG_DEBUG_ATOMIC_SLEEP=y +CONFIG_TRACE_IRQFLAGS=y +CONFIG_DEBUG_BUGVERBOSE=y +CONFIG_DEBUG_LIST=y +CONFIG_DEBUG_PI_LIST=y +CONFIG_PROVE_RCU=y +CONFIG_SPARSE_RCU_POINTER=y +CONFIG_RCU_CPU_STALL_TIMEOUT=21 +CONFIG_RCU_TRACE=y +CONFIG_RCU_EQS_DEBUG=y +CONFIG_USER_STACKTRACE_SUPPORT=y +CONFIG_DEBUG_SG=y +CONFIG_DEBUG_NOTIFIERS=y +CONFIG_DOUBLEFAULT=y +CONFIG_X86_DEBUG_FPU=y +CONFIG_DEBUG_SECTION_MISMATCH=y +CONFIG_DEBUG_PAGEALLOC=y +CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT=y +CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y diff --git a/tools/testing/selftests/wireguard/qemu/init.c b/tools/testing/selftests/wireguard/qemu/init.c new file mode 100644 index 000000000000..c9698120ac9d --- /dev/null +++ b/tools/testing/selftests/wireguard/qemu/init.c @@ -0,0 +1,284 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + */ + +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +__attribute__((noreturn)) static void poweroff(void) +{ + fflush(stdout); + fflush(stderr); + reboot(RB_AUTOBOOT); + sleep(30); + fprintf(stderr, "\x1b[37m\x1b[41m\x1b[1mFailed to power off!!!\x1b[0m\n"); + exit(1); +} + +static void panic(const char *what) +{ + fprintf(stderr, "\n\n\x1b[37m\x1b[41m\x1b[1mSOMETHING WENT HORRIBLY WRONG\x1b[0m\n\n \x1b[31m\x1b[1m%s: %s\x1b[0m\n\n\x1b[37m\x1b[44m\x1b[1mPower off...\x1b[0m\n\n", what, strerror(errno)); + poweroff(); +} + +#define pretty_message(msg) puts("\x1b[32m\x1b[1m" msg "\x1b[0m") + +static void print_banner(void) +{ + struct utsname utsname; + int len; + + if (uname(&utsname) < 0) + panic("uname"); + + len = strlen(" WireGuard Test Suite on ") + strlen(utsname.sysname) + strlen(utsname.release) + strlen(utsname.machine); + printf("\x1b[45m\x1b[33m\x1b[1m%*.s\x1b[0m\n\x1b[45m\x1b[33m\x1b[1m WireGuard Test Suite on %s %s %s \x1b[0m\n\x1b[45m\x1b[33m\x1b[1m%*.s\x1b[0m\n\n", len, "", utsname.sysname, utsname.release, utsname.machine, len, ""); +} + +static void seed_rng(void) +{ + int fd; + struct { + int entropy_count; + int buffer_size; + unsigned char buffer[256]; + } entropy = { + .entropy_count = sizeof(entropy.buffer) * 8, + .buffer_size = sizeof(entropy.buffer), + .buffer = "Adding real entropy is not actually important for these tests. Don't try this at home, kids!" + }; + + if (mknod("/dev/urandom", S_IFCHR | 0644, makedev(1, 9))) + panic("mknod(/dev/urandom)"); + fd = open("/dev/urandom", O_WRONLY); + if (fd < 0) + panic("open(urandom)"); + for (int i = 0; i < 256; ++i) { + if (ioctl(fd, RNDADDENTROPY, &entropy) < 0) + panic("ioctl(urandom)"); + } + close(fd); +} + +static void mount_filesystems(void) +{ + pretty_message("[+] Mounting filesystems..."); + mkdir("/dev", 0755); + mkdir("/proc", 0755); + mkdir("/sys", 0755); + mkdir("/tmp", 0755); + mkdir("/run", 0755); + mkdir("/var", 0755); + if (mount("none", "/dev", "devtmpfs", 0, NULL)) + panic("devtmpfs mount"); + if (mount("none", "/proc", "proc", 0, NULL)) + panic("procfs mount"); + if (mount("none", "/sys", "sysfs", 0, NULL)) + panic("sysfs mount"); + if (mount("none", "/tmp", "tmpfs", 0, NULL)) + panic("tmpfs mount"); + if (mount("none", "/run", "tmpfs", 0, NULL)) + panic("tmpfs mount"); + if (mount("none", "/sys/kernel/debug", "debugfs", 0, NULL)) + ; /* Not a problem if it fails.*/ + if (symlink("/run", "/var/run")) + panic("run symlink"); + if (symlink("/proc/self/fd", "/dev/fd")) + panic("fd symlink"); +} + +static void enable_logging(void) +{ + int fd; + pretty_message("[+] Enabling logging..."); + fd = open("/proc/sys/kernel/printk", O_WRONLY); + if (fd >= 0) { + if (write(fd, "9\n", 2) != 2) + panic("write(printk)"); + close(fd); + } + fd = open("/proc/sys/debug/exception-trace", O_WRONLY); + if (fd >= 0) { + if (write(fd, "1\n", 2) != 2) + panic("write(exception-trace)"); + close(fd); + } + fd = open("/proc/sys/kernel/panic_on_warn", O_WRONLY); + if (fd >= 0) { + if (write(fd, "1\n", 2) != 2) + panic("write(panic_on_warn)"); + close(fd); + } +} + +static void kmod_selftests(void) +{ + FILE *file; + char line[2048], *start, *pass; + bool success = true; + pretty_message("[+] Module self-tests:"); + file = fopen("/proc/kmsg", "r"); + if (!file) + panic("fopen(kmsg)"); + if (fcntl(fileno(file), F_SETFL, O_NONBLOCK) < 0) + panic("fcntl(kmsg, nonblock)"); + while (fgets(line, sizeof(line), file)) { + start = strstr(line, "wireguard: "); + if (!start) + continue; + start += 11; + *strchrnul(start, '\n') = '\0'; + if (strstr(start, "www.wireguard.com")) + break; + pass = strstr(start, ": pass"); + if (!pass || pass[6] != '\0') { + success = false; + printf(" \x1b[31m* %s\x1b[0m\n", start); + } else + printf(" \x1b[32m* %s\x1b[0m\n", start); + } + fclose(file); + if (!success) { + puts("\x1b[31m\x1b[1m[-] Tests failed! \u2639\x1b[0m"); + poweroff(); + } +} + +static void launch_tests(void) +{ + char cmdline[4096], *success_dev; + int status, fd; + pid_t pid; + + pretty_message("[+] Launching tests..."); + pid = fork(); + if (pid == -1) + panic("fork"); + else if (pid == 0) { + execl("/init.sh", "init", NULL); + panic("exec"); + } + if (waitpid(pid, &status, 0) < 0) + panic("waitpid"); + if (WIFEXITED(status) && WEXITSTATUS(status) == 0) { + pretty_message("[+] Tests successful! :-)"); + fd = open("/proc/cmdline", O_RDONLY); + if (fd < 0) + panic("open(/proc/cmdline)"); + if (read(fd, cmdline, sizeof(cmdline) - 1) <= 0) + panic("read(/proc/cmdline)"); + cmdline[sizeof(cmdline) - 1] = '\0'; + for (success_dev = strtok(cmdline, " \n"); success_dev; success_dev = strtok(NULL, " \n")) { + if (strncmp(success_dev, "wg.success=", 11)) + continue; + memcpy(success_dev + 11 - 5, "/dev/", 5); + success_dev += 11 - 5; + break; + } + if (!success_dev || !strlen(success_dev)) + panic("Unable to find success device"); + + fd = open(success_dev, O_WRONLY); + if (fd < 0) + panic("open(success_dev)"); + if (write(fd, "success\n", 8) != 8) + panic("write(success_dev)"); + close(fd); + } else { + const char *why = "unknown cause"; + int what = -1; + + if (WIFEXITED(status)) { + why = "exit code"; + what = WEXITSTATUS(status); + } else if (WIFSIGNALED(status)) { + why = "signal"; + what = WTERMSIG(status); + } + printf("\x1b[31m\x1b[1m[-] Tests failed with %s %d! \u2639\x1b[0m\n", why, what); + } +} + +static void ensure_console(void) +{ + for (unsigned int i = 0; i < 1000; ++i) { + int fd = open("/dev/console", O_RDWR); + if (fd < 0) { + usleep(50000); + continue; + } + dup2(fd, 0); + dup2(fd, 1); + dup2(fd, 2); + close(fd); + if (write(1, "\0\0\0\0\n", 5) == 5) + return; + } + panic("Unable to open console device"); +} + +static void clear_leaks(void) +{ + int fd; + + fd = open("/sys/kernel/debug/kmemleak", O_WRONLY); + if (fd < 0) + return; + pretty_message("[+] Starting memory leak detection..."); + write(fd, "clear\n", 5); + close(fd); +} + +static void check_leaks(void) +{ + int fd; + + fd = open("/sys/kernel/debug/kmemleak", O_WRONLY); + if (fd < 0) + return; + pretty_message("[+] Scanning for memory leaks..."); + sleep(2); /* Wait for any grace periods. */ + write(fd, "scan\n", 5); + close(fd); + + fd = open("/sys/kernel/debug/kmemleak", O_RDONLY); + if (fd < 0) + return; + if (sendfile(1, fd, NULL, 0x7ffff000) > 0) + panic("Memory leaks encountered"); + close(fd); +} + +int main(int argc, char *argv[]) +{ + seed_rng(); + ensure_console(); + print_banner(); + mount_filesystems(); + kmod_selftests(); + enable_logging(); + clear_leaks(); + launch_tests(); + check_leaks(); + poweroff(); + return 1; +} diff --git a/tools/testing/selftests/wireguard/qemu/kernel.config b/tools/testing/selftests/wireguard/qemu/kernel.config new file mode 100644 index 000000000000..af9323a0b6e0 --- /dev/null +++ b/tools/testing/selftests/wireguard/qemu/kernel.config @@ -0,0 +1,88 @@ +CONFIG_LOCALVERSION="" +CONFIG_NET=y +CONFIG_NETDEVICES=y +CONFIG_NET_CORE=y +CONFIG_NET_IPIP=y +CONFIG_DUMMY=y +CONFIG_VETH=y +CONFIG_MULTIUSER=y +CONFIG_NAMESPACES=y +CONFIG_NET_NS=y +CONFIG_UNIX=y +CONFIG_INET=y +CONFIG_IPV6=y +CONFIG_NETFILTER=y +CONFIG_NETFILTER_ADVANCED=y +CONFIG_NF_CONNTRACK=y +CONFIG_NF_NAT=y +CONFIG_NETFILTER_XTABLES=y +CONFIG_NETFILTER_XT_NAT=y +CONFIG_NETFILTER_XT_MATCH_LENGTH=y +CONFIG_NF_CONNTRACK_IPV4=y +CONFIG_NF_NAT_IPV4=y +CONFIG_IP_NF_IPTABLES=y +CONFIG_IP_NF_FILTER=y +CONFIG_IP_NF_NAT=y +CONFIG_IP_ADVANCED_ROUTER=y +CONFIG_IP_MULTIPLE_TABLES=y +CONFIG_IPV6_MULTIPLE_TABLES=y +CONFIG_TTY=y +CONFIG_BINFMT_ELF=y +CONFIG_BINFMT_SCRIPT=y +CONFIG_VDSO=y +CONFIG_VIRTUALIZATION=y +CONFIG_HYPERVISOR_GUEST=y +CONFIG_PARAVIRT=y +CONFIG_KVM_GUEST=y +CONFIG_PARAVIRT_SPINLOCKS=y +CONFIG_PRINTK=y +CONFIG_KALLSYMS=y +CONFIG_BUG=y +CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y +CONFIG_JUMP_LABEL=y +CONFIG_EMBEDDED=n +CONFIG_BASE_FULL=y +CONFIG_FUTEX=y +CONFIG_SHMEM=y +CONFIG_SLUB=y +CONFIG_SPARSEMEM_VMEMMAP=y +CONFIG_SMP=y +CONFIG_SCHED_SMT=y +CONFIG_SCHED_MC=y +CONFIG_NUMA=y +CONFIG_PREEMPT=y +CONFIG_NO_HZ=y +CONFIG_NO_HZ_IDLE=y +CONFIG_NO_HZ_FULL=n +CONFIG_HZ_PERIODIC=n +CONFIG_HIGH_RES_TIMERS=y +CONFIG_COMPAT_32BIT_TIME=y +CONFIG_ARCH_RANDOM=y +CONFIG_FILE_LOCKING=y +CONFIG_POSIX_TIMERS=y +CONFIG_DEVTMPFS=y +CONFIG_PROC_FS=y +CONFIG_PROC_SYSCTL=y +CONFIG_SYSFS=y +CONFIG_TMPFS=y +CONFIG_CONSOLE_LOGLEVEL_DEFAULT=15 +CONFIG_PRINTK_TIME=y +CONFIG_BLK_DEV_INITRD=y +CONFIG_LEGACY_VSYSCALL_NONE=y +CONFIG_KERNEL_GZIP=y +CONFIG_PANIC_ON_OOPS=y +CONFIG_BUG_ON_DATA_CORRUPTION=y +CONFIG_LOCKUP_DETECTOR=y +CONFIG_SOFTLOCKUP_DETECTOR=y +CONFIG_HARDLOCKUP_DETECTOR=y +CONFIG_WQ_WATCHDOG=y +CONFIG_DETECT_HUNG_TASK=y +CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y +CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y +CONFIG_BOOTPARAM_HUNG_TASK_PANIC=y +CONFIG_PANIC_TIMEOUT=-1 +CONFIG_STACKTRACE=y +CONFIG_EARLY_PRINTK=y +CONFIG_GDB_SCRIPTS=y +CONFIG_WIREGUARD=y +CONFIG_WIREGUARD_DEBUG=y