Skip to content
Snippets Groups Projects
  1. Nov 09, 2012
    • Jussi Kivilinna's avatar
      crypto: camellia - add AES-NI/AVX/x86_64 assembler implementation of camellia cipher · d9b1d2e7
      Jussi Kivilinna authored
      
      This patch adds AES-NI/AVX/x86_64 assembler implementation of Camellia block
      cipher. Implementation process data in sixteen block chunks, which are
      byte-sliced and AES SubBytes is reused for Camellia s-box with help of pre-
      and post-filtering.
      
      Patch has been tested with tcrypt and automated filesystem tests.
      
      tcrypt test results:
      
      Intel Core i5-2450M:
      
      camellia-aesni-avx vs camellia-asm-x86_64-2way:
      128bit key:                                             (lrw:256bit)    (xts:256bit)
      size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
      16B     0.98x   0.96x   0.99x   0.96x   0.96x   0.95x   0.95x   0.94x   0.97x   0.98x
      64B     0.99x   0.98x   1.00x   0.98x   0.98x   0.99x   0.98x   0.93x   0.99x   0.98x
      256B    2.28x   2.28x   1.01x   2.29x   2.25x   2.24x   1.96x   1.97x   1.91x   1.90x
      1024B   2.57x   2.56x   1.00x   2.57x   2.51x   2.53x   2.19x   2.17x   2.19x   2.22x
      8192B   2.49x   2.49x   1.00x   2.53x   2.48x   2.49x   2.17x   2.17x   2.22x   2.22x
      
      256bit key:                                             (lrw:384bit)    (xts:512bit)
      size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
      16B     0.97x   0.98x   0.99x   0.97x   0.97x   0.96x   0.97x   0.98x   0.98x   0.99x
      64B     1.00x   1.00x   1.01x   0.99x   0.98x   0.99x   0.99x   0.99x   0.99x   0.99x
      256B    2.37x   2.37x   1.01x   2.39x   2.35x   2.33x   2.10x   2.11x   1.99x   2.02x
      1024B   2.58x   2.60x   1.00x   2.58x   2.56x   2.56x   2.28x   2.29x   2.28x   2.29x
      8192B   2.50x   2.52x   1.00x   2.56x   2.51x   2.51x   2.24x   2.25x   2.26x   2.29x
      
      Signed-off-by: default avatarJussi Kivilinna <jussi.kivilinna@mbnet.fi>
      Acked-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      d9b1d2e7
  2. Oct 15, 2012
    • Tim Chen's avatar
      crypto: crc32c - Optimize CRC32C calculation with PCLMULQDQ instruction · 6a8ce1ef
      Tim Chen authored
      This patch adds the crc_pcl function that calculates CRC32C checksum using the
      PCLMULQDQ instruction on processors that support this feature. This will
      provide speedup over using CRC32 instruction only.
      The usage of PCLMULQDQ necessitate the invocation of kernel_fpu_begin and
      kernel_fpu_end and incur some overhead.  So the new crc_pcl function is only
      invoked for buffer size of 512 bytes or more.  Larger sized
      buffers will expect to see greater speedup.  This feature is best used coupled
      with eager_fpu which reduces the kernel_fpu_begin/end overhead.  For
      buffer size of 1K the speedup is around 1.6x and for buffer size greater than
      4K, the speedup is around 3x compared to original implementation in crc32c-intel
      module. Test was performed on Sandy Bridge based platform with constant frequency
      set for cpu.
      
      A white paper detailing the algorithm can be found here:
      http://download.intel.com/design/intarch/papers/323405.pdf
      
      
      
      Signed-off-by: default avatarTim Chen <tim.c.chen@linux.intel.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      6a8ce1ef
  3. Oct 08, 2012
    • David Howells's avatar
      KEYS: Implement asymmetric key type · 964f3b3b
      David Howells authored
      
      Create a key type that can be used to represent an asymmetric key type for use
      in appropriate cryptographic operations, such as encryption, decryption,
      signature generation and signature verification.
      
      The key type is "asymmetric" and can provide access to a variety of
      cryptographic algorithms.
      
      Possibly, this would be better as "public_key" - but that has the disadvantage
      that "public key" is an overloaded term.
      
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      964f3b3b
  4. Oct 03, 2012
  5. Sep 06, 2012
  6. Aug 29, 2012
  7. Aug 26, 2012
  8. Aug 23, 2012
  9. Aug 22, 2012
  10. Aug 20, 2012
  11. Aug 01, 2012
    • Seth Jennings's avatar
      powerpc/crypto: add 842 crypto driver · 35a1fc18
      Seth Jennings authored
      
      This patch add the 842 cryptographic API driver that
      submits compression requests to the 842 hardware compression
      accelerator driver (nx-compress).
      
      If the hardware accelerator goes offline for any reason
      (dynamic disable, migration, etc...), this driver will use LZO
      as a software failover for all future compression requests.
      For decompression requests, the 842 hardware driver contains
      a software implementation of the 842 decompressor to support
      the decompression of data that was compressed before the accelerator
      went offline.
      
      Signed-off-by: default avatarRobert Jennings <rcj@linux.vnet.ibm.com>
      Signed-off-by: default avatarSeth Jennings <sjenning@linux.vnet.ibm.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      35a1fc18
    • Johannes Goetzfried's avatar
      crypto: cast6 - add x86_64/avx assembler implementation · 4ea1277d
      Johannes Goetzfried authored
      
      This patch adds a x86_64/avx assembler implementation of the Cast6 block
      cipher. The implementation processes eight blocks in parallel (two 4 block
      chunk AVX operations). The table-lookups are done in general-purpose registers.
      For small blocksizes the functions from the generic module are called. A good
      performance increase is provided for blocksizes greater or equal to 128B.
      
      Patch has been tested with tcrypt and automated filesystem tests.
      
      Tcrypt benchmark results:
      
      Intel Core i5-2500 CPU (fam:6, model:42, step:7)
      
      cast6-avx-x86_64 vs. cast6-generic
      128bit key:                                             (lrw:256bit)    (xts:256bit)
      size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
      16B     0.97x   1.00x   1.01x   1.01x   0.99x   0.97x   0.98x   1.01x   0.96x   0.98x
      64B     0.98x   0.99x   1.02x   1.01x   0.99x   1.00x   1.01x   0.99x   1.00x   0.99x
      256B    1.77x   1.84x   0.99x   1.85x   1.77x   1.77x   1.70x   1.74x   1.69x   1.72x
      1024B   1.93x   1.95x   0.99x   1.96x   1.93x   1.93x   1.84x   1.85x   1.89x   1.87x
      8192B   1.91x   1.95x   0.99x   1.97x   1.95x   1.91x   1.86x   1.87x   1.93x   1.90x
      
      256bit key:                                             (lrw:384bit)    (xts:512bit)
      size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
      16B     0.97x   0.99x   1.02x   1.01x   0.98x   0.99x   1.00x   1.00x   0.98x   0.98x
      64B     0.98x   0.99x   1.01x   1.00x   1.00x   1.00x   1.01x   1.01x   0.97x   1.00x
      256B    1.77x   1.83x   1.00x   1.86x   1.79x   1.78x   1.70x   1.76x   1.71x   1.69x
      1024B   1.92x   1.95x   0.99x   1.96x   1.93x   1.93x   1.83x   1.86x   1.89x   1.87x
      8192B   1.94x   1.95x   0.99x   1.97x   1.95x   1.95x   1.87x   1.87x   1.93x   1.91x
      
      Signed-off-by: default avatarJohannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      4ea1277d
    • Johannes Goetzfried's avatar
      crypto: cast5 - add x86_64/avx assembler implementation · 4d6d6a2c
      Johannes Goetzfried authored
      
      This patch adds a x86_64/avx assembler implementation of the Cast5 block
      cipher. The implementation processes sixteen blocks in parallel (four 4 block
      chunk AVX operations). The table-lookups are done in general-purpose registers.
      For small blocksizes the functions from the generic module are called. A good
      performance increase is provided for blocksizes greater or equal to 128B.
      
      Patch has been tested with tcrypt and automated filesystem tests.
      
      Tcrypt benchmark results:
      
      Intel Core i5-2500 CPU (fam:6, model:42, step:7)
      
      cast5-avx-x86_64 vs. cast5-generic
      64bit key:
      size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec
      16B     0.99x   0.99x   1.00x   1.00x   1.02x   1.01x
      64B     1.00x   1.00x   0.98x   1.00x   1.01x   1.02x
      256B    2.03x   2.01x   0.95x   2.11x   2.12x   2.13x
      1024B   2.30x   2.24x   0.95x   2.29x   2.35x   2.35x
      8192B   2.31x   2.27x   0.95x   2.31x   2.39x   2.39x
      
      128bit key:
      size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec
      16B     0.99x   0.99x   1.00x   1.00x   1.01x   1.01x
      64B     1.00x   1.00x   0.98x   1.01x   1.02x   1.01x
      256B    2.17x   2.13x   0.96x   2.19x   2.19x   2.19x
      1024B   2.29x   2.32x   0.95x   2.34x   2.37x   2.38x
      8192B   2.35x   2.32x   0.95x   2.35x   2.39x   2.39x
      
      Signed-off-by: default avatarJohannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      4d6d6a2c
  12. Jun 27, 2012
  13. Jun 12, 2012
    • Johannes Goetzfried's avatar
      crypto: serpent - add x86_64/avx assembler implementation · 7efe4076
      Johannes Goetzfried authored
      
      This patch adds a x86_64/avx assembler implementation of the Serpent block
      cipher. The implementation is very similar to the sse2 implementation and
      processes eight blocks in parallel. Because of the new non-destructive three
      operand syntax all move-instructions can be removed and therefore a little
      performance increase is provided.
      
      Patch has been tested with tcrypt and automated filesystem tests.
      
      Tcrypt benchmark results:
      
      Intel Core i5-2500 CPU (fam:6, model:42, step:7)
      
      serpent-avx-x86_64 vs. serpent-sse2-x86_64
      128bit key:                                             (lrw:256bit)    (xts:256bit)
      size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
      16B     1.03x   1.01x   1.01x   1.01x   1.00x   1.00x   1.00x   1.00x   1.00x   1.01x
      64B     1.00x   1.00x   1.00x   1.00x   1.00x   0.99x   1.00x   1.01x   1.00x   1.00x
      256B    1.05x   1.03x   1.00x   1.02x   1.05x   1.06x   1.05x   1.02x   1.05x   1.02x
      1024B   1.05x   1.02x   1.00x   1.02x   1.05x   1.06x   1.05x   1.03x   1.05x   1.02x
      8192B   1.05x   1.02x   1.00x   1.02x   1.06x   1.06x   1.04x   1.03x   1.04x   1.02x
      
      256bit key:                                             (lrw:384bit)    (xts:512bit)
      size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
      16B     1.01x   1.00x   1.01x   1.01x   1.00x   1.00x   0.99x   1.03x   1.01x   1.01x
      64B     1.00x   1.00x   1.00x   1.00x   1.00x   1.00x   1.00x   1.01x   1.00x   1.02x
      256B    1.05x   1.02x   1.00x   1.02x   1.05x   1.02x   1.04x   1.05x   1.05x   1.02x
      1024B   1.06x   1.02x   1.00x   1.02x   1.07x   1.06x   1.05x   1.04x   1.05x   1.02x
      8192B   1.05x   1.02x   1.00x   1.02x   1.06x   1.06x   1.04x   1.05x   1.05x   1.02x
      
      serpent-avx-x86_64 vs aes-asm (8kB block):
               128bit  256bit
      ecb-enc  1.26x   1.73x
      ecb-dec  1.20x   1.64x
      cbc-enc  0.33x   0.45x
      cbc-dec  1.24x   1.67x
      ctr-enc  1.32x   1.76x
      ctr-dec  1.32x   1.76x
      lrw-enc  1.20x   1.60x
      lrw-dec  1.15x   1.54x
      xts-enc  1.22x   1.64x
      xts-dec  1.17x   1.57x
      
      Signed-off-by: default avatarJohannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      7efe4076
    • Johannes Goetzfried's avatar
      crypto: twofish - add x86_64/avx assembler implementation · 107778b5
      Johannes Goetzfried authored
      
      This patch adds a x86_64/avx assembler implementation of the Twofish block
      cipher. The implementation processes eight blocks in parallel (two 4 block
      chunk AVX operations). The table-lookups are done in general-purpose registers.
      For small blocksizes the 3way-parallel functions from the twofish-x86_64-3way
      module are called. A good performance increase is provided for blocksizes
      greater or equal to 128B.
      
      Patch has been tested with tcrypt and automated filesystem tests.
      
      Tcrypt benchmark results:
      
      Intel Core i5-2500 CPU (fam:6, model:42, step:7)
      
      twofish-avx-x86_64 vs. twofish-x86_64-3way
      128bit key:                                             (lrw:256bit)    (xts:256bit)
      size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
      16B     0.96x   0.97x   1.00x   0.95x   0.97x   0.97x   0.96x   0.95x   0.95x   0.98x
      64B     0.99x   0.99x   1.00x   0.99x   0.98x   0.98x   0.99x   0.98x   0.99x   0.98x
      256B    1.20x   1.21x   1.00x   1.19x   1.15x   1.14x   1.19x   1.20x   1.18x   1.19x
      1024B   1.29x   1.30x   1.00x   1.28x   1.23x   1.24x   1.26x   1.28x   1.26x   1.27x
      8192B   1.31x   1.32x   1.00x   1.31x   1.25x   1.25x   1.28x   1.29x   1.28x   1.30x
      
      256bit key:                                             (lrw:384bit)    (xts:512bit)
      size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
      16B     0.96x   0.96x   1.00x   0.96x   0.97x   0.98x   0.95x   0.95x   0.95x   0.96x
      64B     1.00x   0.99x   1.00x   0.98x   0.98x   1.01x   0.98x   0.98x   0.98x   0.98x
      256B    1.20x   1.21x   1.00x   1.21x   1.15x   1.15x   1.19x   1.20x   1.18x   1.19x
      1024B   1.29x   1.30x   1.00x   1.28x   1.23x   1.23x   1.26x   1.27x   1.26x   1.27x
      8192B   1.31x   1.33x   1.00x   1.31x   1.26x   1.26x   1.29x   1.29x   1.28x   1.30x
      
      twofish-avx-x86_64 vs aes-asm (8kB block):
               128bit  256bit
      ecb-enc  1.19x   1.63x
      ecb-dec  1.18x   1.62x
      cbc-enc  0.75x   1.03x
      cbc-dec  1.23x   1.67x
      ctr-enc  1.24x   1.65x
      ctr-dec  1.24x   1.65x
      lrw-enc  1.15x   1.53x
      lrw-dec  1.14x   1.52x
      xts-enc  1.16x   1.56x
      xts-dec  1.16x   1.56x
      
      Signed-off-by: default avatarJohannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      107778b5
  14. Apr 09, 2012
  15. Mar 23, 2012
  16. Mar 14, 2012
    • Jussi Kivilinna's avatar
      crypto: camellia - add assembler implementation for x86_64 · 0b95ec56
      Jussi Kivilinna authored
      
      Patch adds x86_64 assembler implementation of Camellia block cipher. Two set of
      functions are provided. First set is regular 'one-block at time' encrypt/decrypt
      functions. Second is 'two-block at time' functions that gain performance increase
      on out-of-order CPUs. Performance of 2-way functions should be equal to 1-way
      functions with in-order CPUs.
      
      Patch has been tested with tcrypt and automated filesystem tests.
      
      Tcrypt benchmark results:
      
      AMD Phenom II 1055T (fam:16, model:10):
      
      camellia-asm vs camellia_generic:
      128bit key:                                             (lrw:256bit)    (xts:256bit)
      size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
      16B     1.27x   1.22x   1.30x   1.42x   1.30x   1.34x   1.19x   1.05x   1.23x   1.24x
      64B     1.74x   1.79x   1.43x   1.87x   1.81x   1.87x   1.48x   1.38x   1.55x   1.62x
      256B    1.90x   1.87x   1.43x   1.94x   1.94x   1.95x   1.63x   1.62x   1.67x   1.70x
      1024B   1.96x   1.93x   1.43x   1.95x   1.98x   2.01x   1.67x   1.69x   1.74x   1.80x
      8192B   1.96x   1.96x   1.39x   1.93x   2.01x   2.03x   1.72x   1.64x   1.71x   1.76x
      
      256bit key:                                             (lrw:384bit)    (xts:512bit)
      size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
      16B     1.23x   1.23x   1.33x   1.39x   1.34x   1.38x   1.04x   1.18x   1.21x   1.29x
      64B     1.72x   1.69x   1.42x   1.78x   1.81x   1.89x   1.57x   1.52x   1.56x   1.65x
      256B    1.85x   1.88x   1.42x   1.86x   1.93x   1.96x   1.69x   1.65x   1.70x   1.75x
      1024B   1.88x   1.86x   1.45x   1.95x   1.96x   1.95x   1.77x   1.71x   1.77x   1.78x
      8192B   1.91x   1.86x   1.42x   1.91x   2.03x   1.98x   1.73x   1.71x   1.78x   1.76x
      
      camellia-asm vs aes-asm (8kB block):
               128bit  256bit
      ecb-enc  1.15x   1.22x
      ecb-dec  1.16x   1.16x
      cbc-enc  0.85x   0.90x
      cbc-dec  1.20x   1.23x
      ctr-enc  1.28x   1.30x
      ctr-dec  1.27x   1.28x
      lrw-enc  1.12x   1.16x
      lrw-dec  1.08x   1.10x
      xts-enc  1.11x   1.15x
      xts-dec  1.14x   1.15x
      
      Intel Core2 T8100 (fam:6, model:23, step:6):
      
      camellia-asm vs camellia_generic:
      128bit key:                                             (lrw:256bit)    (xts:256bit)
      size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
      16B     1.10x   1.12x   1.14x   1.16x   1.16x   1.15x   1.02x   1.02x   1.08x   1.08x
      64B     1.61x   1.60x   1.17x   1.68x   1.67x   1.66x   1.43x   1.42x   1.44x   1.42x
      256B    1.65x   1.73x   1.17x   1.77x   1.81x   1.80x   1.54x   1.53x   1.58x   1.54x
      1024B   1.76x   1.74x   1.18x   1.80x   1.85x   1.85x   1.60x   1.59x   1.65x   1.60x
      8192B   1.77x   1.75x   1.19x   1.81x   1.85x   1.86x   1.63x   1.61x   1.66x   1.62x
      
      256bit key:                                             (lrw:384bit)    (xts:512bit)
      size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
      16B     1.10x   1.07x   1.13x   1.16x   1.11x   1.16x   1.03x   1.02x   1.08x   1.07x
      64B     1.61x   1.62x   1.15x   1.66x   1.63x   1.68x   1.47x   1.46x   1.47x   1.44x
      256B    1.71x   1.70x   1.16x   1.75x   1.69x   1.79x   1.58x   1.57x   1.59x   1.55x
      1024B   1.78x   1.72x   1.17x   1.75x   1.80x   1.80x   1.63x   1.62x   1.65x   1.62x
      8192B   1.76x   1.73x   1.17x   1.78x   1.80x   1.81x   1.64x   1.62x   1.68x   1.64x
      
      camellia-asm vs aes-asm (8kB block):
               128bit  256bit
      ecb-enc  1.17x   1.21x
      ecb-dec  1.17x   1.20x
      cbc-enc  0.80x   0.82x
      cbc-dec  1.22x   1.24x
      ctr-enc  1.25x   1.26x
      ctr-dec  1.25x   1.26x
      lrw-enc  1.14x   1.18x
      lrw-dec  1.13x   1.17x
      xts-enc  1.14x   1.18x
      xts-dec  1.14x   1.17x
      
      Signed-off-by: default avatarJussi Kivilinna <jussi.kivilinna@mbnet.fi>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      0b95ec56
  17. Dec 20, 2011
  18. Nov 30, 2011
  19. Nov 21, 2011
  20. Nov 13, 2011
Loading