- Jul 03, 2014
-
-
Jarod Wilson authored
Per further discussion with NIST, the requirements for FIPS state that we only need to panic the system on failed kernel module signature checks for crypto subsystem modules. This moves the fips-mode-only module signature check out of the generic module loading code and into the crypto subsystem, at points where we can catch both algorithm module loads and mode module loads. At the same time, make CONFIG_CRYPTO_FIPS dependent on CONFIG_MODULE_SIG, as this is strictly necessary for FIPS mode. v2: remove extraneous blank line, perform checks in a static inline function, drop the no-longer-necessary fips.h include. CC: "David S. Miller" <davem@davemloft.net> CC: Rusty Russell <rusty@rustcorp.com.au> CC: Stephan Mueller <stephan.mueller@atsec.com> Signed-off-by:
Jarod Wilson <jarod@redhat.com> Acked-by:
Neil Horman <nhorman@tuxdriver.com> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
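For illustration, a minimal sketch of the kind of check described above, assuming the helper is named module_sig_ok() and is invoked from the crypto registration paths; this is not presented as the exact patch contents.

#include <linux/fips.h>		/* fips_enabled */
#include <linux/kernel.h>	/* panic() */
#include <linux/module.h>	/* struct module, module_name() */

/*
 * Sketch: called from crypto algorithm/template registration, so only
 * crypto subsystem modules can trigger the panic.  module_sig_ok() is
 * assumed to report whether the loaded module's signature verified.
 */
static inline void crypto_check_module_sig(struct module *mod)
{
	if (fips_enabled && mod && !module_sig_ok(mod))
		panic("Module %s signature verification failed in FIPS mode\n",
		      module_name(mod));
}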
-
- Jun 20, 2014
-
-
Jussi Kivilinna authored
Patch adds an x86_64 assembly implementation of the Triple DES EDE cipher algorithm. Two assembly implementations are provided. The first is a regular 'one block at a time' encrypt/decrypt function; the second is a 'three blocks at a time' function that gains a performance increase on out-of-order CPUs.

tcrypt test results:

Intel Core i5-4570, des3_ede-asm vs des3_ede-generic:
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec
16B     1.21x   1.22x   1.27x   1.36x   1.25x   1.25x
64B     1.98x   1.96x   1.23x   2.04x   2.01x   2.00x
256B    2.34x   2.37x   1.21x   2.40x   2.38x   2.39x
1024B   2.50x   2.47x   1.22x   2.51x   2.52x   2.51x
8192B   2.51x   2.53x   1.21x   2.56x   2.54x   2.55x

Signed-off-by:
Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Stephan Mueller authored
The different DRBG types (CTR, Hash, HMAC) can be enabled or disabled at compile time. At least one DRBG type must be selected. The default is the HMAC DRBG, as its code base is the smallest. Signed-off-by:
Stephan Mueller <smueller@chronox.de> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
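As a usage illustration, a hedged sketch of how a caller might request the default HMAC DRBG through the kernel's crypto RNG API; the algorithm name string follows the DRBG naming convention (no prediction resistance, HMAC type, SHA-256 core) and is an assumption, not a quote from the patch.

#include <crypto/rng.h>
#include <linux/err.h>

/* Fill buf with len bytes from the HMAC DRBG (assumed algorithm name). */
static int example_drbg_bytes(u8 *buf, unsigned int len)
{
	struct crypto_rng *rng;
	int err;

	rng = crypto_alloc_rng("drbg_nopr_hmac_sha256", 0, 0);
	if (IS_ERR(rng))
		return PTR_ERR(rng);

	err = crypto_rng_get_bytes(rng, buf, len);

	crypto_free_rng(rng);
	return err;
}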
-
- Mar 21, 2014
-
-
chandramouli narayanan authored
This patch adds an x86_64 AVX2 optimization of the SHA1 transform to the crypto subsystem. The patch has been tested with the 3.14.0-rc1 kernel. On a Haswell desktop, with turbo disabled and all CPUs running at maximum frequency, tcrypt shows an AVX2 performance improvement over the AVX implementation ranging from 3% for 256-byte updates to 16% for 1024-byte updates. The patch adds sha1_avx2_transform() plus the glue, build and configuration changes needed for the AVX2 optimization. sha1-ssse3 is the single module that provides the optimized (SSSE3/AVX/AVX2) low-level SHA1 transform functions; the best available transform overrides the generic one. In the AVX2 case, for performance reasons across data block sizes, either the AVX or the AVX2 transform function is chosen at run-time, whichever suits best. The Makefile change therefore appends the necessary objects to the linkage; the patch merely adds the AVX2 transform to the existing build mix and Kconfig support, and leaves the configuration/build support otherwise as is. Signed-off-by:
Chandramouli Narayanan <mouli@linux.intel.com> Reviewed-by:
Marek Vasut <marex@denx.de> Acked-by:
H. Peter Anvin <hpa@linux.intel.com> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
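To illustrate the run-time selection described above, a hedged sketch of glue code that picks a transform at module init based on CPU features; the assembly routine names and the shash definition are assumptions, and the real module additionally switches between AVX and AVX2 per update size.

#include <asm/cpufeature.h>	/* boot_cpu_has(), X86_FEATURE_* */
#include <crypto/internal/hash.h>
#include <linux/linkage.h>

/* Assumed prototypes for the assembly transforms (illustrative names). */
asmlinkage void sha1_transform_ssse3(u32 *digest, const char *data,
				     unsigned int rounds);
asmlinkage void sha1_transform_avx(u32 *digest, const char *data,
				   unsigned int rounds);
asmlinkage void sha1_transform_avx2(u32 *digest, const char *data,
				    unsigned int rounds);

static void (*sha1_transform_asm)(u32 *digest, const char *data,
				  unsigned int rounds);

static struct shash_alg sha1_ssse3_alg;	/* definition omitted in this sketch */

static int __init sha1_ssse3_mod_init(void)
{
	/* Pick the best transform the CPU supports, once, at load time. */
	if (boot_cpu_has(X86_FEATURE_AVX2))
		sha1_transform_asm = sha1_transform_avx2;
	else if (boot_cpu_has(X86_FEATURE_AVX))
		sha1_transform_asm = sha1_transform_avx;
	else if (boot_cpu_has(X86_FEATURE_SSSE3))
		sha1_transform_asm = sha1_transform_ssse3;
	else
		return -ENODEV;	/* no suitable CPU feature */

	return crypto_register_shash(&sha1_ssse3_alg);
}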
-
- Oct 25, 2013
-
-
Dmitry Kasatkin authored
This patch provides a single place for information about hash algorithms, such as hash sizes and kernel driver names, which will be used by IMA and the public key code. Changelog: - Fix sparse and checkpatch warnings - Move hash algo enums to uapi for userspace signing functions. Signed-off-by:
Dmitry Kasatkin <d.kasatkin@samsung.com> Signed-off-by:
Mimi Zohar <zohar@linux.vnet.ibm.com> Acked-by:
Herbert Xu <herbert@gondor.apana.org.au>
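For illustration, a self-contained sketch of the pattern this provides: one table mapping an algorithm identifier to its crypto driver name and digest size, so callers such as IMA stop duplicating that knowledge. The identifiers and array names below are local to the example, not the kernel's.

#include <linux/kernel.h>

enum example_hash_algo {
	EXAMPLE_HASH_SHA1,
	EXAMPLE_HASH_SHA256,
	EXAMPLE_HASH_SHA512,
	EXAMPLE_HASH__LAST,
};

/* Crypto driver name for each algorithm, as passed to crypto_alloc_shash(). */
static const char * const example_hash_name[EXAMPLE_HASH__LAST] = {
	[EXAMPLE_HASH_SHA1]	= "sha1",
	[EXAMPLE_HASH_SHA256]	= "sha256",
	[EXAMPLE_HASH_SHA512]	= "sha512",
};

/* Digest size in bytes for each algorithm. */
static const int example_hash_digest_size[EXAMPLE_HASH__LAST] = {
	[EXAMPLE_HASH_SHA1]	= 20,
	[EXAMPLE_HASH_SHA256]	= 32,
	[EXAMPLE_HASH_SHA512]	= 64,
};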
-
- Oct 04, 2013
-
-
Ard Biesheuvel authored
Bit sliced AES gives around 45% speedup on Cortex-A15 for encryption and around 25% for decryption. This implementation of the AES algorithm does not rely on any lookup tables, so it is believed to be invulnerable to cache timing attacks. This algorithm processes up to 8 blocks in parallel in constant time. This means that it is not usable by chaining modes that are strictly sequential in nature, such as CBC encryption. CBC decryption, however, can benefit from this implementation and runs about 25% faster. The other chaining modes implemented in this module, XTS and CTR, can execute fully in parallel in both directions. The core code has been adopted from the OpenSSL project (in collaboration with the original author, on cc). For ease of maintenance, this version is identical to the upstream OpenSSL code, i.e., all modifications that were required to make it suitable for inclusion into the kernel have been made upstream. The original can be found here: http://git.openssl.org/gitweb/?p=openssl.git;a=commit;h=6f6a6130 Note to integrators: While this implementation is significantly faster than the existing table based ones (generic or ARM asm), especially in CTR mode, the effects on power efficiency are as yet unclear. This code does fundamentally more work, by calculating values that the table based code obtains by a simple lookup; only by doing all of that work in a SIMD fashion does it manage to perform better. Cc: Andy Polyakov <appro@openssl.org> Acked-by:
Nicolas Pitre <nico@linaro.org> Signed-off-by:
Ard Biesheuvel <ard.biesheuvel@linaro.org>
-
- Sep 23, 2013
-
-
Ard Biesheuvel authored
Move all users of ablk_helper under x86/ to the generic version and delete the x86 specific version. Acked-by:
Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by:
Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Create a generic version of ablk_helper so it can be reused by other architectures. Acked-by:
Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by:
Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
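As context for the helper, a hedged sketch of how an architecture glue module of this era might register an async algorithm on top of it; the algorithm and driver names ('ecb(foo)', '__driver-ecb-foo-simd'), key sizes and priority are placeholders.

#include <crypto/ablk_helper.h>
#include <crypto/cryptd.h>
#include <linux/crypto.h>
#include <linux/module.h>

/* Bind the async wrapper to an (assumed) internal synchronous driver. */
static int foo_ablk_init(struct crypto_tfm *tfm)
{
	return ablk_init_common(tfm, "__driver-ecb-foo-simd");
}

static struct crypto_alg foo_ablk_alg = {
	.cra_name		= "ecb(foo)",
	.cra_driver_name	= "ecb-foo-simd",
	.cra_priority		= 400,
	.cra_flags		= CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC,
	.cra_blocksize		= 16,
	.cra_ctxsize		= sizeof(struct async_helper_ctx),
	.cra_type		= &crypto_ablkcipher_type,
	.cra_module		= THIS_MODULE,
	.cra_init		= foo_ablk_init,	/* wraps ablk_init_common() */
	.cra_exit		= ablk_exit,		/* from ablk_helper */
	.cra_u = {
		.ablkcipher = {
			.min_keysize	= 16,
			.max_keysize	= 32,
			.setkey		= ablk_set_key,	/* from ablk_helper */
			.encrypt	= ablk_encrypt,
			.decrypt	= ablk_decrypt,
		},
	},
};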
-
- Sep 07, 2013
-
-
Herbert Xu authored
This patch reinstates commits 67822649 39761214 0b95a7f8 31d93962 2d31e518. Now that module softdeps are in the kernel, we can use them to resolve the boot issue which caused the revert. Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
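The mechanism referenced is the module soft-dependency tag; a minimal sketch of how the library module can declare one so that modprobe and initramfs tooling pull in the accelerated implementation (the module name below is an assumption for illustration).

#include <linux/module.h>

/* Hint to userspace (modprobe, initramfs generators) that the accelerated
 * backend should be installed and loaded before this module. */
MODULE_SOFTDEP("pre: crct10dif-pclmul");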
-
- Jul 24, 2013
-
-
Herbert Xu authored
This reverts commits 67822649 39761214 0b95a7f8 31d93962 2d31e518. Unfortunately this change broke boot on some systems that used an initrd which did not include the newly created crct10dif modules. As these modules are required by sd_mod under certain configurations, this is a serious problem. Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- Jul 09, 2013
-
-
Chanho Min authored
Add support for the lz4 and lz4hc compression algorithms using the lib/lz4/* codebase. [akpm@linux-foundation.org: fix warnings] Signed-off-by:
Chanho Min <chanho.min@lge.com> Cc: "Darrick J. Wong" <djwong@us.ibm.com> Cc: Bob Pearson <rpearson@systemfabricworks.com> Cc: Richard Weinberger <richard@nod.at> Cc: Herbert Xu <herbert@gondor.hengli.com.au> Cc: Yann Collet <yann.collet.73@gmail.com> Cc: Kyungsik Lee <kyungsik.lee@lge.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
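As a usage illustration, a hedged sketch of compressing a buffer through the crypto compression API with the newly registered "lz4" algorithm ("lz4hc" is used the same way); the helper name is hypothetical.

#include <linux/crypto.h>
#include <linux/err.h>

/* Compress src into dst; *dlen holds the dst capacity on entry and the
 * compressed length on success. */
static int example_lz4_compress(const u8 *src, unsigned int slen,
				u8 *dst, unsigned int *dlen)
{
	struct crypto_comp *tfm;
	int err;

	tfm = crypto_alloc_comp("lz4", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	err = crypto_comp_compress(tfm, src, slen, dst, dlen);

	crypto_free_comp(tfm);
	return err;
}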
-
- Jun 21, 2013
-
-
Jussi Kivilinna authored
This reverts commit cf1521a1. The instruction (vpgatherdd) that this implementation relied on turned out to be a slow performer on real hardware (i5-4570). The previous 8-way twofish/AVX implementation is therefore faster, and this implementation should be removed. Converting this implementation to use the same method as twofish/AVX for table look-ups would give an additional ~3% speed-up vs twofish/AVX, but would hardly be worth the added code and binary size. Signed-off-by:
Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Jussi Kivilinna authored
This reverts commit 60488010. The instruction (vpgatherdd) that this implementation relied on turned out to be a slow performer on real hardware (i5-4570). The previous 4-way blowfish implementation is therefore faster, and this implementation should be removed. Signed-off-by:
Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- Jun 05, 2013
-
-
Jussi Kivilinna authored
It appears that the performance of 'vpgatherdd' is suboptimal for this kind of workload (tested on Core i5-4570) and causes blowfish-avx2 to be significantly slower than blowfish-amd64. So disable the AVX2 implementation to avoid performance regressions. Signed-off-by:
Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Jussi Kivilinna authored
It appears that the performance of 'vpgatherdd' is suboptimal for this kind of workload (tested on Core i5-4570) and causes twofish_avx2 to be significantly slower than twofish_avx. So disable the AVX2 implementation to avoid performance regressions. Signed-off-by:
Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- May 24, 2013
-
-
Tim Chen authored
Glue code that plugs the PCLMULQDQ accelerated CRC T10 DIF hash into the crypto framework. The config CRYPTO_CRCT10DIF_PCLMUL should be turned on to enable the feature. The crc_t10dif crypto library function will use this faster algorithm when crct10dif_pclmul module is loaded. Signed-off-by:
Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- May 20, 2013
-
-
Tim Chen authored
When CRC T10 DIF is calculated using the crypto transform framework, we wrap the crc_t10dif function call to utilize it. This allows us to take advantage of any accelerated CRC T10 DIF transform that is plugged into the crypto framework. Signed-off-by:
Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
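In practice callers keep using the library interface; a minimal sketch (helper name hypothetical) of the call that now transparently benefits from any accelerated backend plugged into the crypto framework.

#include <linux/crc-t10dif.h>

static __u16 example_dif_checksum(const unsigned char *buf, size_t len)
{
	/* Dispatches to whichever crct10dif implementation is loaded. */
	return crc_t10dif(buf, len);
}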
-
- Apr 25, 2013
-
-
Jussi Kivilinna authored
Patch adds an AVX2/AES-NI/x86-64 implementation of the Camellia cipher, requiring 32 parallel blocks of input (512 bytes). Compared to the AVX implementation, this version is extended to use the 256-bit wide YMM registers. For the AES-NI instructions, data is split into two 128-bit registers and merged afterwards. Even with this additional handling, performance should be higher than the AES-NI/AVX implementation. Signed-off-by:
Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Jussi Kivilinna authored
Patch adds an AVX2/x86-64 implementation of the Serpent cipher, requiring 16 parallel blocks of input (256 bytes). The implementation is based on the AVX implementation and extends it to use the 256-bit wide YMM registers. Since Serpent does not use table look-ups, this implementation should be close to two times faster than the AVX implementation. Signed-off-by:
Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Jussi Kivilinna authored
Patch adds AVX2/x86-64 implementation of Twofish cipher, requiring 16 parallel blocks for input (256 bytes). Table look-ups are performed using vpgatherdd instruction directly from vector registers and thus should be faster than earlier implementations. Implementation also uses 256-bit wide YMM registers, which should give additional speed up compared to the AVX implementation. Signed-off-by:
Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Jussi Kivilinna authored
Patch adds AVX2/x86-64 implementation of Blowfish cipher, requiring 32 parallel blocks for input (256 bytes). Table look-ups are performed using vpgatherdd instruction directly from vector registers and thus should be faster than earlier implementations. Signed-off-by:
Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Jussi Kivilinna authored
The Kconfig setting for glue helper module is CRYPTO_GLUE_HELPER_X86, but recent change for aesni_intel used CRYPTO_GLUE_HELPER instead. Patch corrects this issue. Cc: kbuild-all@01.org Reported-by:
kbuild test robot <fengguang.wu@intel.com> Signed-off-by:
Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Jussi Kivilinna authored
Add more optimized XTS code for aesni_intel in 64-bit mode, for smaller stack usage and a boost in speed.

tcrypt results, with Intel i5-2450M:

256-bit key     enc     dec
16B             0.98x   0.99x
64B             0.64x   0.63x
256B            1.29x   1.32x
1024B           1.54x   1.58x
8192B           1.57x   1.60x

512-bit key     enc     dec
16B             0.98x   0.99x
64B             0.60x   0.59x
256B            1.24x   1.25x
1024B           1.39x   1.42x
8192B           1.38x   1.42x

I chose not to optimize below a block size of 256 bytes, since XTS is practically always used with data blocks of 512 bytes. This is why performance is reduced in tcrypt for 64-byte blocks. Cc: Huang Ying <ying.huang@intel.com> Signed-off-by:
Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Jussi Kivilinna authored
Patch adds support for NIST recommended block cipher mode CMAC to CryptoAPI. This work is based on Tom St Denis' earlier patch, http://marc.info/?l=linux-crypto-vger&m=135877306305466&w=2 Cc: Tom St Denis <tstdenis@elliptictech.com> Signed-off-by:
Jussi Kivilinna <jussi.kivilinna@iki.fi> Acked-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
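For illustration, a hedged sketch of computing an AES-CMAC through the shash interface of kernels from this period; the helper name is hypothetical and error handling is kept minimal.

#include <crypto/hash.h>
#include <linux/err.h>
#include <linux/slab.h>

/* mac must have room for the AES block size (16 bytes). */
static int example_cmac_aes(const u8 *key, unsigned int keylen,
			    const u8 *msg, unsigned int msglen, u8 *mac)
{
	struct crypto_shash *tfm;
	struct shash_desc *desc;
	int err;

	tfm = crypto_alloc_shash("cmac(aes)", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	err = crypto_shash_setkey(tfm, key, keylen);
	if (err)
		goto out_free_tfm;

	desc = kzalloc(sizeof(*desc) + crypto_shash_descsize(tfm), GFP_KERNEL);
	if (!desc) {
		err = -ENOMEM;
		goto out_free_tfm;
	}
	desc->tfm = tfm;

	err = crypto_shash_digest(desc, msg, msglen, mac);

	kfree(desc);
out_free_tfm:
	crypto_free_shash(tfm);
	return err;
}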
-
Jussi Kivilinna authored
The GMAC code assumes that dst==src, which causes problems when trying to add rfc4543(gcm(aes)) test vectors. So fix this code to work when source and destination buffer are different. Signed-off-by:
Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Tim Chen authored
crypto: sha512 - Create a module providing optimized SHA512 routines using SSSE3, AVX or AVX2 instructions. We added glue code and config options to create a crypto module that uses the SSE/AVX/AVX2 optimized SHA512 x86_64 assembly routines. Signed-off-by:
Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
Tim Chen authored
crypto: sha256 - Create a module providing optimized SHA256 routines using SSSE3, AVX or AVX2 instructions. We added glue code and config options to create a crypto module that uses the SSE/AVX/AVX2 optimized SHA256 x86_64 assembly routines. Signed-off-by:
Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- Feb 26, 2013
-
-
Herbert Xu authored
This bool option can never be set to anything other than y. So let's just kill it. Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- Jan 19, 2013
-
-
Alexander Boyko authored
This patch adds crc32 algorithms to the shash crypto API. One is a wrapper around the generic crc32_le function. The second is a crc32 PCLMULQDQ implementation, which uses the hardware-provided PCLMULQDQ instruction to accelerate the CRC32 computation. This instruction is present from Intel Westmere and AMD Bulldozer CPUs onwards. On an Intel Core i5 I got 450 MB/s for the table implementation and 2100 MB/s for the pclmulqdq implementation. Signed-off-by:
Alexander Boyko <alexander_boyko@xyratex.com> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
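For context, the table-based variant boils down to the existing library routine; a minimal sketch, with the seed value chosen for illustration. The "crc32-pclmul" driver computes the same checksum using PCLMULQDQ.

#include <linux/crc32.h>
#include <linux/types.h>

static u32 example_crc32(u32 seed, const u8 *data, size_t len)
{
	/* What the generic "crc32" shash wraps; accelerated drivers
	 * produce the same result faster. */
	return crc32_le(seed, data, len);
}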
-
- Jan 11, 2013
-
-
Kees Cook authored
The CONFIG_EXPERIMENTAL config item has not carried much meaning for a while now and is almost always enabled by default. As agreed during the Linux kernel summit, remove it from any "depends on" lines in Kconfigs. CC: Herbert Xu <herbert@gondor.apana.org.au> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by:
Kees Cook <keescook@chromium.org> Acked-by:
David S. Miller <davem@davemloft.net>
-
- Jan 10, 2013
-
-
Michael Ellerman authored
This patch adds a crypto driver which provides a powerpc accelerated implementation of SHA-1, accelerated in that it is written in asm. Original patch by Paul, minor fixups for upstream by moi. Lightly tested on 64-bit with the test program here: http://michael.ellerman.id.au/files/junkcode/sha1test.c Seems to work, and is "not slower" than the generic version. Needs testing on 32-bit. Signed-off-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Michael Ellerman <michael@ellerman.id.au> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
- Dec 06, 2012
-
-
Jussi Kivilinna authored
CAST5 and CAST6 both use the same lookup tables, which can be moved to a shared module, 'cast_common'. Signed-off-by:
Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Acked-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- Nov 09, 2012
-
-
Jussi Kivilinna authored
This patch adds an AES-NI/AVX/x86_64 assembler implementation of the Camellia block cipher. The implementation processes data in sixteen-block chunks, which are byte-sliced; the AES SubBytes operation is reused for the Camellia s-box with the help of pre- and post-filtering. The patch has been tested with tcrypt and automated filesystem tests.

tcrypt test results:

Intel Core i5-2450M, camellia-aesni-avx vs camellia-asm-x86_64-2way:

128bit key:                                             (lrw:256bit)    (xts:256bit)
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B     0.98x   0.96x   0.99x   0.96x   0.96x   0.95x   0.95x   0.94x   0.97x   0.98x
64B     0.99x   0.98x   1.00x   0.98x   0.98x   0.99x   0.98x   0.93x   0.99x   0.98x
256B    2.28x   2.28x   1.01x   2.29x   2.25x   2.24x   1.96x   1.97x   1.91x   1.90x
1024B   2.57x   2.56x   1.00x   2.57x   2.51x   2.53x   2.19x   2.17x   2.19x   2.22x
8192B   2.49x   2.49x   1.00x   2.53x   2.48x   2.49x   2.17x   2.17x   2.22x   2.22x

256bit key:                                             (lrw:384bit)    (xts:512bit)
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B     0.97x   0.98x   0.99x   0.97x   0.97x   0.96x   0.97x   0.98x   0.98x   0.99x
64B     1.00x   1.00x   1.01x   0.99x   0.98x   0.99x   0.99x   0.99x   0.99x   0.99x
256B    2.37x   2.37x   1.01x   2.39x   2.35x   2.33x   2.10x   2.11x   1.99x   2.02x
1024B   2.58x   2.60x   1.00x   2.58x   2.56x   2.56x   2.28x   2.29x   2.28x   2.29x
8192B   2.50x   2.52x   1.00x   2.56x   2.51x   2.51x   2.24x   2.25x   2.26x   2.29x

Signed-off-by:
Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Acked-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- Oct 15, 2012
-
-
Tim Chen authored
This patch adds the crc_pcl function, which calculates the CRC32C checksum using the PCLMULQDQ instruction on processors that support this feature. This provides a speedup over using the CRC32 instruction only. The usage of PCLMULQDQ necessitates the invocation of kernel_fpu_begin and kernel_fpu_end and incurs some overhead, so the new crc_pcl function is only invoked for buffer sizes of 512 bytes or more. Larger buffers can expect to see a greater speedup. This feature is best used coupled with eager_fpu, which reduces the kernel_fpu_begin/end overhead. For a buffer size of 1K the speedup is around 1.6x, and for buffer sizes greater than 4K the speedup is around 3x, compared to the original implementation in the crc32c-intel module. The test was performed on a Sandy Bridge based platform with the CPU set to a constant frequency. A white paper detailing the algorithm can be found here: http://download.intel.com/design/intarch/papers/323405.pdf Signed-off-by:
Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
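A hedged sketch of the dispatch logic described above, assuming a crc_pcl() entry point with the shown signature and an illustrative crc32c_hw() helper for the CRC32-instruction path; the 512-byte threshold is taken from the commit message.

#include <asm/i387.h>		/* kernel_fpu_begin/end, irq_fpu_usable (kernels of this era) */
#include <linux/linkage.h>
#include <linux/types.h>

#define CRC32C_PCL_BREAKEVEN	512	/* buffers below this use the CRC32 insn */

/* Assumed prototype of the PCLMULQDQ routine added by the patch. */
asmlinkage unsigned int crc_pcl(const u8 *buffer, int len, unsigned int crc_init);

/* Illustrative helper using the CRC32 instruction, defined elsewhere. */
u32 crc32c_hw(u32 crc, const u8 *data, unsigned int len);

static u32 crc32c_intel_update(u32 crc, const u8 *data, unsigned int len)
{
	/* Only pay the kernel_fpu_begin/end overhead when PCLMULQDQ wins. */
	if (len >= CRC32C_PCL_BREAKEVEN && irq_fpu_usable()) {
		kernel_fpu_begin();
		crc = crc_pcl(data, len, crc);
		kernel_fpu_end();
	} else {
		crc = crc32c_hw(crc, data, len);
	}
	return crc;
}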
-
- Oct 08, 2012
-
-
David Howells authored
Create a key type that can be used to represent an asymmetric key type for use in appropriate cryptographic operations, such as encryption, decryption, signature generation and signature verification. The key type is "asymmetric" and can provide access to a variety of cryptographic algorithms. Possibly, this would be better as "public_key" - but that has the disadvantage that "public key" is an overloaded term. Signed-off-by:
David Howells <dhowells@redhat.com> Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
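As a rough userspace illustration (heavily hedged: which payload formats are accepted depends on the parsers built into the kernel, and the given description may be superseded by one derived from the payload), keyutils' add_key() can instantiate a key of this type.

#include <keyutils.h>
#include <stdio.h>

/* Load a DER-encoded blob (e.g. an X.509 certificate, if that parser is
 * enabled) as an "asymmetric" key on the calling user's keyring. */
static int load_asymmetric_key(const void *payload, size_t plen)
{
	key_serial_t key;

	key = add_key("asymmetric", "example-key", payload, plen,
		      KEY_SPEC_USER_KEYRING);
	if (key < 0) {
		perror("add_key");
		return -1;
	}
	printf("asymmetric key loaded, serial %d\n", key);
	return 0;
}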
-
- Oct 03, 2012
-
-
Dave Jones authored
Asking for this option on x86 seems a bit pointless. Signed-off-by:
Dave Jones <davej@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Sep 06, 2012
-
-
David McCullough authored
Add assembler versions of AES and SHA1 for ARM platforms. This has provided up to a 50% improvement in IPsec/TCP throughput for tunnels using AES128/SHA1.

Platform   CPU Speed   Endian   Before (bps)   After (bps)   Improvement
IXP425     533 MHz     big      11217042       15566294      ~38%
KS8695     166 MHz     little   3828549        5795373       ~51%

Signed-off-by:
David McCullough <ucdevel@gmail.com> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- Aug 29, 2012
-
-
David S. Miller authored
Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Aug 26, 2012
-
-
David S. Miller authored
Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Aug 23, 2012
-
-
David S. Miller authored
Signed-off-by:
David S. Miller <davem@davemloft.net>
-