1. 10 Jul, 2019 2 commits
  2. 21 May, 2019 6 commits
    • Eric Biggers's avatar
      crypto: ccm - fix incompatibility between "ccm" and "ccm_base" · 654a509d
      Eric Biggers authored
      commit 6a1faa4a43f5fabf9cbeaa742d916e7b5e73120f upstream.
      
      CCM instances can be created by either the "ccm" template, which only
      allows choosing the block cipher, e.g. "ccm(aes)"; or by "ccm_base",
      which allows choosing the ctr and cbcmac implementations, e.g.
      "ccm_base(ctr(aes-generic),cbcmac(aes-generic))".
      
      However, a "ccm_base" instance prevents a "ccm" instance from being
      registered using the same implementations.  Nor will the instance be
      found by lookups of "ccm".  This can be used as a denial of service.
      Moreover, "ccm_base" instances are never tested by the crypto
      self-tests, even if there are compatible "ccm" tests.
      
      The root cause of these problems is that instances of the two templates
      use different cra_names.  Therefore, fix these problems by making
      "ccm_base" instances set the same cra_name as "ccm" instances, e.g.
      "ccm(aes)" instead of "ccm_base(ctr(aes-generic),cbcmac(aes-generic))".
      
      This requires extracting the block cipher name from the name of the ctr
      and cbcmac algorithms.  It also requires starting to verify that the
      algorithms are really ctr and cbcmac using the same block cipher, not
      something else entirely.  But it would be bizarre if anyone were
      actually using non-ccm-compatible algorithms with ccm_base, so this
      shouldn't break anyone in practice.
      
      Fixes: 4a49b499 ("[CRYPTO] ccm: Added CCM mode")
      Cc: stable@vger.kernel.org
      Signed-off-by: 's avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: 's avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      654a509d
    • Eric Biggers's avatar
      crypto: salsa20 - don't access already-freed walk.iv · c4cf8de7
      Eric Biggers authored
      commit edaf28e996af69222b2cb40455dbb5459c2b875a upstream.
      
      If the user-provided IV needs to be aligned to the algorithm's
      alignmask, then skcipher_walk_virt() copies the IV into a new aligned
      buffer walk.iv.  But skcipher_walk_virt() can fail afterwards, and then
      if the caller unconditionally accesses walk.iv, it's a use-after-free.
      
      salsa20-generic doesn't set an alignmask, so currently it isn't affected
      by this despite unconditionally accessing walk.iv.  However this is more
      subtle than desired, and it was actually broken prior to the alignmask
      being removed by commit b62b3db7 ("crypto: salsa20-generic - cleanup
      and convert to skcipher API").
      
      Since salsa20-generic does not update the IV and does not need any IV
      alignment, update it to use req->iv instead of walk.iv.
      
      Fixes: 2407d608 ("[CRYPTO] salsa20: Salsa20 stream cipher")
      Cc: stable@vger.kernel.org
      Signed-off-by: 's avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: 's avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      c4cf8de7
    • Eric Biggers's avatar
      crypto: gcm - fix incompatibility between "gcm" and "gcm_base" · b5344a0e
      Eric Biggers authored
      commit f699594d436960160f6d5ba84ed4a222f20d11cd upstream.
      
      GCM instances can be created by either the "gcm" template, which only
      allows choosing the block cipher, e.g. "gcm(aes)"; or by "gcm_base",
      which allows choosing the ctr and ghash implementations, e.g.
      "gcm_base(ctr(aes-generic),ghash-generic)".
      
      However, a "gcm_base" instance prevents a "gcm" instance from being
      registered using the same implementations.  Nor will the instance be
      found by lookups of "gcm".  This can be used as a denial of service.
      Moreover, "gcm_base" instances are never tested by the crypto
      self-tests, even if there are compatible "gcm" tests.
      
      The root cause of these problems is that instances of the two templates
      use different cra_names.  Therefore, fix these problems by making
      "gcm_base" instances set the same cra_name as "gcm" instances, e.g.
      "gcm(aes)" instead of "gcm_base(ctr(aes-generic),ghash-generic)".
      
      This requires extracting the block cipher name from the name of the ctr
      algorithm.  It also requires starting to verify that the algorithms are
      really ctr and ghash, not something else entirely.  But it would be
      bizarre if anyone were actually using non-gcm-compatible algorithms with
      gcm_base, so this shouldn't break anyone in practice.
      
      Fixes: d00aa19b ("[CRYPTO] gcm: Allow block cipher parameter")
      Cc: stable@vger.kernel.org
      Signed-off-by: 's avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: 's avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b5344a0e
    • Eric Biggers's avatar
      crypto: crct10dif-generic - fix use via crypto_shash_digest() · e4bd7d1d
      Eric Biggers authored
      commit 307508d1072979f4435416f87936f87eaeb82054 upstream.
      
      The ->digest() method of crct10dif-generic reads the current CRC value
      from the shash_desc context.  But this value is uninitialized, causing
      crypto_shash_digest() to compute the wrong result.  Fix it.
      
      Probably this wasn't noticed before because lib/crc-t10dif.c only uses
      crypto_shash_update(), not crypto_shash_digest().  Likewise,
      crypto_shash_digest() is not yet tested by the crypto self-tests because
      those only test the ahash API which only uses shash init/update/final.
      
      This bug was detected by my patches that improve testmgr to fuzz
      algorithms against their generic implementation.
      
      Fixes: 2d31e518 ("crypto: crct10dif - Wrap crc_t10dif function all to use crypto transform framework")
      Cc: <stable@vger.kernel.org> # v3.11+
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Signed-off-by: 's avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: 's avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e4bd7d1d
    • Eric Biggers's avatar
      crypto: skcipher - don't WARN on unprocessed data after slow walk step · d418ee2c
      Eric Biggers authored
      commit dcaca01a42cc2c425154a13412b4124293a6e11e upstream.
      
      skcipher_walk_done() assumes it's a bug if, after the "slow" path is
      executed where the next chunk of data is processed via a bounce buffer,
      the algorithm says it didn't process all bytes.  Thus it WARNs on this.
      
      However, this can happen legitimately when the message needs to be
      evenly divisible into "blocks" but isn't, and the algorithm has a
      'walksize' greater than the block size.  For example, ecb-aes-neonbs
      sets 'walksize' to 128 bytes and only supports messages evenly divisible
      into 16-byte blocks.  If, say, 17 message bytes remain but they straddle
      scatterlist elements, the skcipher_walk code will take the "slow" path
      and pass the algorithm all 17 bytes in the bounce buffer.  But the
      algorithm will only be able to process 16 bytes, triggering the WARN.
      
      Fix this by just removing the WARN_ON().  Returning -EINVAL, as the code
      already does, is the right behavior.
      
      This bug was detected by my patches that improve testmgr to fuzz
      algorithms against their generic implementation.
      
      Fixes: b286d8b1 ("crypto: skcipher - Add skcipher walk interface")
      Cc: <stable@vger.kernel.org> # v4.10+
      Signed-off-by: 's avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: 's avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d418ee2c
    • Eric Biggers's avatar
      crypto: chacha20poly1305 - set cra_name correctly · 559c1730
      Eric Biggers authored
      commit 5e27f38f1f3f45a0c938299c3a34a2d2db77165a upstream.
      
      If the rfc7539 template is instantiated with specific implementations,
      e.g. "rfc7539(chacha20-generic,poly1305-generic)" rather than
      "rfc7539(chacha20,poly1305)", then the implementation names end up
      included in the instance's cra_name.  This is incorrect because it then
      prevents all users from allocating "rfc7539(chacha20,poly1305)", if the
      highest priority implementations of chacha20 and poly1305 were selected.
      Also, the self-tests aren't run on an instance allocated in this way.
      
      Fix it by setting the instance's cra_name from the underlying
      algorithms' actual cra_names, rather than from the requested names.
      This matches what other templates do.
      
      Fixes: 71ebc4d1 ("crypto: chacha20poly1305 - Add a ChaCha20-Poly1305 AEAD construction, RFC7539")
      Cc: <stable@vger.kernel.org> # v4.2+
      Cc: Martin Willi <martin@strongswan.org>
      Signed-off-by: 's avatarEric Biggers <ebiggers@google.com>
      Reviewed-by: 's avatarMartin Willi <martin@strongswan.org>
      Signed-off-by: 's avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      559c1730
  3. 27 Apr, 2019 1 commit
    • Eric Biggers's avatar
      crypto: x86/poly1305 - fix overflow during partial reduction · 4de25ac0
      Eric Biggers authored
      commit 678cce4019d746da6c680c48ba9e6d417803e127 upstream.
      
      The x86_64 implementation of Poly1305 produces the wrong result on some
      inputs because poly1305_4block_avx2() incorrectly assumes that when
      partially reducing the accumulator, the bits carried from limb 'd4' to
      limb 'h0' fit in a 32-bit integer.  This is true for poly1305-generic
      which processes only one block at a time.  However, it's not true for
      the AVX2 implementation, which processes 4 blocks at a time and
      therefore can produce intermediate limbs about 4x larger.
      
      Fix it by making the relevant calculations use 64-bit arithmetic rather
      than 32-bit.  Note that most of the carries already used 64-bit
      arithmetic, but the d4 -> h0 carry was different for some reason.
      
      To be safe I also made the same change to the corresponding SSE2 code,
      though that only operates on 1 or 2 blocks at a time.  I don't think
      it's really needed for poly1305_block_sse2(), but it doesn't hurt
      because it's already x86_64 code.  It *might* be needed for
      poly1305_2block_sse2(), but overflows aren't easy to reproduce there.
      
      This bug was originally detected by my patches that improve testmgr to
      fuzz algorithms against their generic implementation.  But also add a
      test vector which reproduces it directly (in the AVX2 case).
      
      Fixes: b1ccc8f4 ("crypto: poly1305 - Add a four block AVX2 variant for x86_64")
      Fixes: c70f4abe ("crypto: poly1305 - Add a SSE2 SIMD variant for x86_64")
      Cc: <stable@vger.kernel.org> # v4.3+
      Cc: Martin Willi <martin@strongswan.org>
      Cc: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: 's avatarEric Biggers <ebiggers@google.com>
      Reviewed-by: 's avatarMartin Willi <martin@strongswan.org>
      Signed-off-by: 's avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4de25ac0
  4. 23 Mar, 2019 4 commits
    • Eric Biggers's avatar
      crypto: pcbc - remove bogus memcpy()s with src == dest · 05c02839
      Eric Biggers authored
      commit 251b7aea34ba3c4d4fdfa9447695642eb8b8b098 upstream.
      
      The memcpy()s in the PCBC implementation use walk->iv as both the source
      and destination, which has undefined behavior.  These memcpy()'s are
      actually unneeded, because walk->iv is already used to hold the previous
      plaintext block XOR'd with the previous ciphertext block.  Thus,
      walk->iv is already updated to its final value.
      
      So remove the broken and unnecessary memcpy()s.
      
      Fixes: 91652be5 ("[CRYPTO] pcbc: Add Propagated CBC template")
      Cc: <stable@vger.kernel.org> # v2.6.21+
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: 's avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: 's avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: 's avatarMaxim Zhukov <mussitantesmortem@gmail.com>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      05c02839
    • Eric Biggers's avatar
      crypto: testmgr - skip crc32c context test for ahash algorithms · 29ef5d0f
      Eric Biggers authored
      commit eb5e6730db98fcc4b51148b4a819fa4bf864ae54 upstream.
      
      Instantiating "cryptd(crc32c)" causes a crypto self-test failure because
      the crypto_alloc_shash() in alg_test_crc32c() fails.  This is because
      cryptd(crc32c) is an ahash algorithm, not a shash algorithm; so it can
      only be accessed through the ahash API, unlike shash algorithms which
      can be accessed through both the ahash and shash APIs.
      
      As the test is testing the shash descriptor format which is only
      applicable to shash algorithms, skip it for ahash algorithms.
      
      (Note that it's still important to fix crypto self-test failures even
       for weird algorithm instantiations like cryptd(crc32c) that no one
       would really use; in fips_enabled mode unprivileged users can use them
       to panic the kernel, and also they prevent treating a crypto self-test
       failure as a bug when fuzzing the kernel.)
      
      Fixes: 8e3ee85e ("crypto: crc32c - Test descriptor context format")
      Cc: stable@vger.kernel.org
      Signed-off-by: 's avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: 's avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      29ef5d0f
    • Eric Biggers's avatar
      crypto: hash - set CRYPTO_TFM_NEED_KEY if ->setkey() fails · 3a48ec70
      Eric Biggers authored
      commit ba7d7433a0e998c902132bd47330e355a1eaa894 upstream.
      
      Some algorithms have a ->setkey() method that is not atomic, in the
      sense that setting a key can fail after changes were already made to the
      tfm context.  In this case, if a key was already set the tfm can end up
      in a state that corresponds to neither the old key nor the new key.
      
      It's not feasible to make all ->setkey() methods atomic, especially ones
      that have to key multiple sub-tfms.  Therefore, make the crypto API set
      CRYPTO_TFM_NEED_KEY if ->setkey() fails and the algorithm requires a
      key, to prevent the tfm from being used until a new key is set.
      
      Note: we can't set CRYPTO_TFM_NEED_KEY for OPTIONAL_KEY algorithms, so
      ->setkey() for those must nevertheless be atomic.  That's fine for now
      since only the crc32 and crc32c algorithms set OPTIONAL_KEY, and it's
      not intended that OPTIONAL_KEY be used much.
      
      [Cc stable mainly because when introducing the NEED_KEY flag I changed
       AF_ALG to rely on it; and unlike in-kernel crypto API users, AF_ALG
       previously didn't have this problem.  So these "incompletely keyed"
       states became theoretically accessible via AF_ALG -- though, the
       opportunities for causing real mischief seem pretty limited.]
      
      Fixes: 9fa68f62 ("crypto: hash - prevent using keyed hashes without setting key")
      Cc: stable@vger.kernel.org
      Signed-off-by: 's avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: 's avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3a48ec70
    • Eric Biggers's avatar
      crypto: ahash - fix another early termination in hash walk · 09024fed
      Eric Biggers authored
      commit 77568e535af7c4f97eaef1e555bf0af83772456c upstream.
      
      Hash algorithms with an alignmask set, e.g. "xcbc(aes-aesni)" and
      "michael_mic", fail the improved hash tests because they sometimes
      produce the wrong digest.  The bug is that in the case where a
      scatterlist element crosses pages, not all the data is actually hashed
      because the scatterlist walk terminates too early.  This happens because
      the 'nbytes' variable in crypto_hash_walk_done() is assigned the number
      of bytes remaining in the page, then later interpreted as the number of
      bytes remaining in the scatterlist element.  Fix it.
      
      Fixes: 900a081f ("crypto: ahash - Fix early termination in hash walk")
      Cc: stable@vger.kernel.org
      Signed-off-by: 's avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: 's avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      09024fed
  5. 23 Feb, 2019 1 commit
    • Mao Wenan's avatar
      net: crypto set sk to NULL when af_alg_release. · 6e4c01ee
      Mao Wenan authored
      [ Upstream commit 9060cb719e61b685ec0102574e10337fa5f445ea ]
      
      KASAN has found use-after-free in sockfs_setattr.
      The existed commit 6d8c50dc ("socket: close race condition between sock_close()
      and sockfs_setattr()") is to fix this simillar issue, but it seems to ignore
      that crypto module forgets to set the sk to NULL after af_alg_release.
      
      KASAN report details as below:
      BUG: KASAN: use-after-free in sockfs_setattr+0x120/0x150
      Write of size 4 at addr ffff88837b956128 by task syz-executor0/4186
      
      CPU: 2 PID: 4186 Comm: syz-executor0 Not tainted xxx + #1
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
      1.10.2-1ubuntu1 04/01/2014
      Call Trace:
       dump_stack+0xca/0x13e
       print_address_description+0x79/0x330
       ? vprintk_func+0x5e/0xf0
       kasan_report+0x18a/0x2e0
       ? sockfs_setattr+0x120/0x150
       sockfs_setattr+0x120/0x150
       ? sock_register+0x2d0/0x2d0
       notify_change+0x90c/0xd40
       ? chown_common+0x2ef/0x510
       chown_common+0x2ef/0x510
       ? chmod_common+0x3b0/0x3b0
       ? __lock_is_held+0xbc/0x160
       ? __sb_start_write+0x13d/0x2b0
       ? __mnt_want_write+0x19a/0x250
       do_fchownat+0x15c/0x190
       ? __ia32_sys_chmod+0x80/0x80
       ? trace_hardirqs_on_thunk+0x1a/0x1c
       __x64_sys_fchownat+0xbf/0x160
       ? lockdep_hardirqs_on+0x39a/0x5e0
       do_syscall_64+0xc8/0x580
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x462589
      Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89
      f7 48 89 d6 48 89
      ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3
      48 c7 c1 bc ff ff
      ff f7 d8 64 89 01 48
      RSP: 002b:00007fb4b2c83c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000104
      RAX: ffffffffffffffda RBX: 000000000072bfa0 RCX: 0000000000462589
      RDX: 0000000000000000 RSI: 00000000200000c0 RDI: 0000000000000007
      RBP: 0000000000000005 R08: 0000000000001000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007fb4b2c846bc
      R13: 00000000004bc733 R14: 00000000006f5138 R15: 00000000ffffffff
      
      Allocated by task 4185:
       kasan_kmalloc+0xa0/0xd0
       __kmalloc+0x14a/0x350
       sk_prot_alloc+0xf6/0x290
       sk_alloc+0x3d/0xc00
       af_alg_accept+0x9e/0x670
       hash_accept+0x4a3/0x650
       __sys_accept4+0x306/0x5c0
       __x64_sys_accept4+0x98/0x100
       do_syscall_64+0xc8/0x580
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Freed by task 4184:
       __kasan_slab_free+0x12e/0x180
       kfree+0xeb/0x2f0
       __sk_destruct+0x4e6/0x6a0
       sk_destruct+0x48/0x70
       __sk_free+0xa9/0x270
       sk_free+0x2a/0x30
       af_alg_release+0x5c/0x70
       __sock_release+0xd3/0x280
       sock_close+0x1a/0x20
       __fput+0x27f/0x7f0
       task_work_run+0x136/0x1b0
       exit_to_usermode_loop+0x1a7/0x1d0
       do_syscall_64+0x461/0x580
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Syzkaller reproducer:
      r0 = perf_event_open(&(0x7f0000000000)={0x0, 0x70, 0x0, 0x0, 0x0, 0x0,
      0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
      0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
      0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, @perf_config_ext}, 0x0, 0x0,
      0xffffffffffffffff, 0x0)
      r1 = socket$alg(0x26, 0x5, 0x0)
      getrusage(0x0, 0x0)
      bind(r1, &(0x7f00000001c0)=@alg={0x26, 'hash\x00', 0x0, 0x0,
      'sha256-ssse3\x00'}, 0x80)
      r2 = accept(r1, 0x0, 0x0)
      r3 = accept4$unix(r2, 0x0, 0x0, 0x0)
      r4 = dup3(r3, r0, 0x0)
      fchownat(r4, &(0x7f00000000c0)='\x00', 0x0, 0x0, 0x1000)
      
      Fixes: 6d8c50dc ("socket: close race condition between sock_close() and sockfs_setattr()")
      Signed-off-by: 's avatarMao Wenan <maowenan@huawei.com>
      Signed-off-by: 's avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6e4c01ee
  6. 12 Feb, 2019 1 commit
  7. 23 Jan, 2019 2 commits
    • Eric Biggers's avatar
      crypto: authenc - fix parsing key with misaligned rta_len · b9119fd2
      Eric Biggers authored
      commit 8f9c469348487844328e162db57112f7d347c49f upstream.
      
      Keys for "authenc" AEADs are formatted as an rtattr containing a 4-byte
      'enckeylen', followed by an authentication key and an encryption key.
      crypto_authenc_extractkeys() parses the key to find the inner keys.
      
      However, it fails to consider the case where the rtattr's payload is
      longer than 4 bytes but not 4-byte aligned, and where the key ends
      before the next 4-byte aligned boundary.  In this case, 'keylen -=
      RTA_ALIGN(rta->rta_len);' underflows to a value near UINT_MAX.  This
      causes a buffer overread and crash during crypto_ahash_setkey().
      
      Fix it by restricting the rtattr payload to the expected size.
      
      Reproducer using AF_ALG:
      
      	#include <linux/if_alg.h>
      	#include <linux/rtnetlink.h>
      	#include <sys/socket.h>
      
      	int main()
      	{
      		int fd;
      		struct sockaddr_alg addr = {
      			.salg_type = "aead",
      			.salg_name = "authenc(hmac(sha256),cbc(aes))",
      		};
      		struct {
      			struct rtattr attr;
      			__be32 enckeylen;
      			char keys[1];
      		} __attribute__((packed)) key = {
      			.attr.rta_len = sizeof(key),
      			.attr.rta_type = 1 /* CRYPTO_AUTHENC_KEYA_PARAM */,
      		};
      
      		fd = socket(AF_ALG, SOCK_SEQPACKET, 0);
      		bind(fd, (void *)&addr, sizeof(addr));
      		setsockopt(fd, SOL_ALG, ALG_SET_KEY, &key, sizeof(key));
      	}
      
      It caused:
      
      	BUG: unable to handle kernel paging request at ffff88007ffdc000
      	PGD 2e01067 P4D 2e01067 PUD 2e04067 PMD 2e05067 PTE 0
      	Oops: 0000 [#1] SMP
      	CPU: 0 PID: 883 Comm: authenc Not tainted 4.20.0-rc1-00108-g00c9fe37a7f27 #13
      	Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181126_142135-anatol 04/01/2014
      	RIP: 0010:sha256_ni_transform+0xb3/0x330 arch/x86/crypto/sha256_ni_asm.S:155
      	[...]
      	Call Trace:
      	 sha256_ni_finup+0x10/0x20 arch/x86/crypto/sha256_ssse3_glue.c:321
      	 crypto_shash_finup+0x1a/0x30 crypto/shash.c:178
      	 shash_digest_unaligned+0x45/0x60 crypto/shash.c:186
      	 crypto_shash_digest+0x24/0x40 crypto/shash.c:202
      	 hmac_setkey+0x135/0x1e0 crypto/hmac.c:66
      	 crypto_shash_setkey+0x2b/0xb0 crypto/shash.c:66
      	 shash_async_setkey+0x10/0x20 crypto/shash.c:223
      	 crypto_ahash_setkey+0x2d/0xa0 crypto/ahash.c:202
      	 crypto_authenc_setkey+0x68/0x100 crypto/authenc.c:96
      	 crypto_aead_setkey+0x2a/0xc0 crypto/aead.c:62
      	 aead_setkey+0xc/0x10 crypto/algif_aead.c:526
      	 alg_setkey crypto/af_alg.c:223 [inline]
      	 alg_setsockopt+0xfe/0x130 crypto/af_alg.c:256
      	 __sys_setsockopt+0x6d/0xd0 net/socket.c:1902
      	 __do_sys_setsockopt net/socket.c:1913 [inline]
      	 __se_sys_setsockopt net/socket.c:1910 [inline]
      	 __x64_sys_setsockopt+0x1f/0x30 net/socket.c:1910
      	 do_syscall_64+0x4a/0x180 arch/x86/entry/common.c:290
      	 entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: e236d4a8 ("[CRYPTO] authenc: Move enckeylen into key itself")
      Cc: <stable@vger.kernel.org> # v2.6.25+
      Signed-off-by: 's avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: 's avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b9119fd2
    • Harsh Jain's avatar
      crypto: authencesn - Avoid twice completion call in decrypt path · d196d2fd
      Harsh Jain authored
      commit a7773363624b034ab198c738661253d20a8055c2 upstream.
      
      Authencesn template in decrypt path unconditionally calls aead_request_complete
      after ahash_verify which leads to following kernel panic in after decryption.
      
      [  338.539800] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
      [  338.548372] PGD 0 P4D 0
      [  338.551157] Oops: 0000 [#1] SMP PTI
      [  338.554919] CPU: 0 PID: 0 Comm: swapper/0 Kdump: loaded Tainted: G        W I       4.19.7+ #13
      [  338.564431] Hardware name: Supermicro X8ST3/X8ST3, BIOS 2.0        07/29/10
      [  338.572212] RIP: 0010:esp_input_done2+0x350/0x410 [esp4]
      [  338.578030] Code: ff 0f b6 68 10 48 8b 83 c8 00 00 00 e9 8e fe ff ff 8b 04 25 04 00 00 00 83 e8 01 48 98 48 8b 3c c5 10 00 00 00 e9 f7 fd ff ff <8b> 04 25 04 00 00 00 83 e8 01 48 98 4c 8b 24 c5 10 00 00 00 e9 3b
      [  338.598547] RSP: 0018:ffff911c97803c00 EFLAGS: 00010246
      [  338.604268] RAX: 0000000000000002 RBX: ffff911c4469ee00 RCX: 0000000000000000
      [  338.612090] RDX: 0000000000000000 RSI: 0000000000000130 RDI: ffff911b87c20400
      [  338.619874] RBP: 0000000000000000 R08: ffff911b87c20498 R09: 000000000000000a
      [  338.627610] R10: 0000000000000001 R11: 0000000000000004 R12: 0000000000000000
      [  338.635402] R13: ffff911c89590000 R14: ffff911c91730000 R15: 0000000000000000
      [  338.643234] FS:  0000000000000000(0000) GS:ffff911c97800000(0000) knlGS:0000000000000000
      [  338.652047] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  338.658299] CR2: 0000000000000004 CR3: 00000001ec20a000 CR4: 00000000000006f0
      [  338.666382] Call Trace:
      [  338.669051]  <IRQ>
      [  338.671254]  esp_input_done+0x12/0x20 [esp4]
      [  338.675922]  chcr_handle_resp+0x3b5/0x790 [chcr]
      [  338.680949]  cpl_fw6_pld_handler+0x37/0x60 [chcr]
      [  338.686080]  chcr_uld_rx_handler+0x22/0x50 [chcr]
      [  338.691233]  uldrx_handler+0x8c/0xc0 [cxgb4]
      [  338.695923]  process_responses+0x2f0/0x5d0 [cxgb4]
      [  338.701177]  ? bitmap_find_next_zero_area_off+0x3a/0x90
      [  338.706882]  ? matrix_alloc_area.constprop.7+0x60/0x90
      [  338.712517]  ? apic_update_irq_cfg+0x82/0xf0
      [  338.717177]  napi_rx_handler+0x14/0xe0 [cxgb4]
      [  338.722015]  net_rx_action+0x2aa/0x3e0
      [  338.726136]  __do_softirq+0xcb/0x280
      [  338.730054]  irq_exit+0xde/0xf0
      [  338.733504]  do_IRQ+0x54/0xd0
      [  338.736745]  common_interrupt+0xf/0xf
      
      Fixes: 104880a6 ("crypto: authencesn - Convert to new AEAD...")
      Signed-off-by: 's avatarHarsh Jain <harsh@chelsio.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: 's avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d196d2fd
  8. 01 Dec, 2018 1 commit
  9. 21 Nov, 2018 1 commit
  10. 13 Nov, 2018 2 commits
  11. 04 Oct, 2018 1 commit
    • Stafford Horne's avatar
      crypto: skcipher - Fix -Wstringop-truncation warnings · 29db2772
      Stafford Horne authored
      [ Upstream commit cefd769fd0192c84d638f66da202459ed8ad63ba ]
      
      As of GCC 9.0.0 the build is reporting warnings like:
      
          crypto/ablkcipher.c: In function ‘crypto_ablkcipher_report’:
          crypto/ablkcipher.c:374:2: warning: ‘strncpy’ specified bound 64 equals destination size [-Wstringop-truncation]
            strncpy(rblkcipher.geniv, alg->cra_ablkcipher.geniv ?: "<default>",
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
             sizeof(rblkcipher.geniv));
             ~~~~~~~~~~~~~~~~~~~~~~~~~
      
      This means the strnycpy might create a non null terminated string.  Fix this by
      explicitly performing '\0' termination.
      
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Eric Biggers <ebiggers3@gmail.com>
      Cc: Nick Desaulniers <nick.desaulniers@gmail.com>
      Signed-off-by: 's avatarStafford Horne <shorne@gmail.com>
      Signed-off-by: 's avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: 's avatarSasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      29db2772
  12. 26 Sep, 2018 1 commit
  13. 19 Sep, 2018 1 commit
  14. 09 Sep, 2018 1 commit
  15. 17 Aug, 2018 6 commits
    • Eric Biggers's avatar
      crypto: skcipher - fix crash flushing dcache in error path · 7e179bff
      Eric Biggers authored
      commit 8088d3dd upstream.
      
      scatterwalk_done() is only meant to be called after a nonzero number of
      bytes have been processed, since scatterwalk_pagedone() will flush the
      dcache of the *previous* page.  But in the error case of
      skcipher_walk_done(), e.g. if the input wasn't an integer number of
      blocks, scatterwalk_done() was actually called after advancing 0 bytes.
      This caused a crash ("BUG: unable to handle kernel paging request")
      during '!PageSlab(page)' on architectures like arm and arm64 that define
      ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE, provided that the input was
      page-aligned as in that case walk->offset == 0.
      
      Fix it by reorganizing skcipher_walk_done() to skip the
      scatterwalk_advance() and scatterwalk_done() if an error has occurred.
      
      This bug was found by syzkaller fuzzing.
      
      Reproducer, assuming ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE:
      
      	#include <linux/if_alg.h>
      	#include <sys/socket.h>
      	#include <unistd.h>
      
      	int main()
      	{
      		struct sockaddr_alg addr = {
      			.salg_type = "skcipher",
      			.salg_name = "cbc(aes-generic)",
      		};
      		char buffer[4096] __attribute__((aligned(4096))) = { 0 };
      		int fd;
      
      		fd = socket(AF_ALG, SOCK_SEQPACKET, 0);
      		bind(fd, (void *)&addr, sizeof(addr));
      		setsockopt(fd, SOL_ALG, ALG_SET_KEY, buffer, 16);
      		fd = accept(fd, NULL, NULL);
      		write(fd, buffer, 15);
      		read(fd, buffer, 15);
      	}
      Reported-by: 's avatarLiu Chao <liuchao741@huawei.com>
      Fixes: b286d8b1 ("crypto: skcipher - Add skcipher walk interface")
      Cc: <stable@vger.kernel.org> # v4.10+
      Signed-off-by: 's avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: 's avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7e179bff
    • Eric Biggers's avatar
      crypto: skcipher - fix aligning block size in skcipher_copy_iv() · 0f2981ee
      Eric Biggers authored
      commit 0567fc9e upstream.
      
      The ALIGN() macro needs to be passed the alignment, not the alignmask
      (which is the alignment minus 1).
      
      Fixes: b286d8b1 ("crypto: skcipher - Add skcipher walk interface")
      Cc: <stable@vger.kernel.org> # v4.10+
      Signed-off-by: 's avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: 's avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0f2981ee
    • Eric Biggers's avatar
      crypto: ablkcipher - fix crash flushing dcache in error path · 68432fd1
      Eric Biggers authored
      commit 318abdfb upstream.
      
      Like the skcipher_walk and blkcipher_walk cases:
      
      scatterwalk_done() is only meant to be called after a nonzero number of
      bytes have been processed, since scatterwalk_pagedone() will flush the
      dcache of the *previous* page.  But in the error case of
      ablkcipher_walk_done(), e.g. if the input wasn't an integer number of
      blocks, scatterwalk_done() was actually called after advancing 0 bytes.
      This caused a crash ("BUG: unable to handle kernel paging request")
      during '!PageSlab(page)' on architectures like arm and arm64 that define
      ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE, provided that the input was
      page-aligned as in that case walk->offset == 0.
      
      Fix it by reorganizing ablkcipher_walk_done() to skip the
      scatterwalk_advance() and scatterwalk_done() if an error has occurred.
      Reported-by: 's avatarLiu Chao <liuchao741@huawei.com>
      Fixes: bf06099d ("crypto: skcipher - Add ablkcipher_walk interfaces")
      Cc: <stable@vger.kernel.org> # v2.6.35+
      Signed-off-by: 's avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: 's avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      68432fd1
    • Eric Biggers's avatar
      crypto: blkcipher - fix crash flushing dcache in error path · 2cde72d9
      Eric Biggers authored
      commit 0868def3 upstream.
      
      Like the skcipher_walk case:
      
      scatterwalk_done() is only meant to be called after a nonzero number of
      bytes have been processed, since scatterwalk_pagedone() will flush the
      dcache of the *previous* page.  But in the error case of
      blkcipher_walk_done(), e.g. if the input wasn't an integer number of
      blocks, scatterwalk_done() was actually called after advancing 0 bytes.
      This caused a crash ("BUG: unable to handle kernel paging request")
      during '!PageSlab(page)' on architectures like arm and arm64 that define
      ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE, provided that the input was
      page-aligned as in that case walk->offset == 0.
      
      Fix it by reorganizing blkcipher_walk_done() to skip the
      scatterwalk_advance() and scatterwalk_done() if an error has occurred.
      
      This bug was found by syzkaller fuzzing.
      
      Reproducer, assuming ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE:
      
      	#include <linux/if_alg.h>
      	#include <sys/socket.h>
      	#include <unistd.h>
      
      	int main()
      	{
      		struct sockaddr_alg addr = {
      			.salg_type = "skcipher",
      			.salg_name = "ecb(aes-generic)",
      		};
      		char buffer[4096] __attribute__((aligned(4096))) = { 0 };
      		int fd;
      
      		fd = socket(AF_ALG, SOCK_SEQPACKET, 0);
      		bind(fd, (void *)&addr, sizeof(addr));
      		setsockopt(fd, SOL_ALG, ALG_SET_KEY, buffer, 16);
      		fd = accept(fd, NULL, NULL);
      		write(fd, buffer, 15);
      		read(fd, buffer, 15);
      	}
      Reported-by: 's avatarLiu Chao <liuchao741@huawei.com>
      Fixes: 5cde0af2 ("[CRYPTO] cipher: Added block cipher type")
      Cc: <stable@vger.kernel.org> # v2.6.19+
      Signed-off-by: 's avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: 's avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2cde72d9
    • Eric Biggers's avatar
      crypto: vmac - separate tfm and request context · e7aefb13
      Eric Biggers authored
      commit bb296481 upstream.
      
      syzbot reported a crash in vmac_final() when multiple threads
      concurrently use the same "vmac(aes)" transform through AF_ALG.  The bug
      is pretty fundamental: the VMAC template doesn't separate per-request
      state from per-tfm (per-key) state like the other hash algorithms do,
      but rather stores it all in the tfm context.  That's wrong.
      
      Also, vmac_final() incorrectly zeroes most of the state including the
      derived keys and cached pseudorandom pad.  Therefore, only the first
      VMAC invocation with a given key calculates the correct digest.
      
      Fix these bugs by splitting the per-tfm state from the per-request state
      and using the proper init/update/final sequencing for requests.
      
      Reproducer for the crash:
      
          #include <linux/if_alg.h>
          #include <sys/socket.h>
          #include <unistd.h>
      
          int main()
          {
                  int fd;
                  struct sockaddr_alg addr = {
                          .salg_type = "hash",
                          .salg_name = "vmac(aes)",
                  };
                  char buf[256] = { 0 };
      
                  fd = socket(AF_ALG, SOCK_SEQPACKET, 0);
                  bind(fd, (void *)&addr, sizeof(addr));
                  setsockopt(fd, SOL_ALG, ALG_SET_KEY, buf, 16);
                  fork();
                  fd = accept(fd, NULL, NULL);
                  for (;;)
                          write(fd, buf, 256);
          }
      
      The immediate cause of the crash is that vmac_ctx_t.partial_size exceeds
      VMAC_NHBYTES, causing vmac_final() to memset() a negative length.
      
      Reported-by: syzbot+264bca3a6e8d645550d3@syzkaller.appspotmail.com
      Fixes: f1939f7c ("crypto: vmac - New hash algorithm for intel_txt support")
      Cc: <stable@vger.kernel.org> # v2.6.32+
      Signed-off-by: 's avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: 's avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e7aefb13
    • Eric Biggers's avatar
      crypto: vmac - require a block cipher with 128-bit block size · ef70d145
      Eric Biggers authored
      commit 73bf20ef upstream.
      
      The VMAC template assumes the block cipher has a 128-bit block size, but
      it failed to check for that.  Thus it was possible to instantiate it
      using a 64-bit block size cipher, e.g. "vmac(cast5)", causing
      uninitialized memory to be used.
      
      Add the needed check when instantiating the template.
      
      Fixes: f1939f7c ("crypto: vmac - New hash algorithm for intel_txt support")
      Cc: <stable@vger.kernel.org> # v2.6.32+
      Signed-off-by: 's avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: 's avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ef70d145
  16. 03 Aug, 2018 2 commits
  17. 22 Jul, 2018 1 commit
  18. 17 Jul, 2018 1 commit
    • Eric Biggers's avatar
      crypto: x86/salsa20 - remove x86 salsa20 implementations · 19f39eff
      Eric Biggers authored
      commit b7b73cd5 upstream.
      
      The x86 assembly implementations of Salsa20 use the frame base pointer
      register (%ebp or %rbp), which breaks frame pointer convention and
      breaks stack traces when unwinding from an interrupt in the crypto code.
      Recent (v4.10+) kernels will warn about this, e.g.
      
      WARNING: kernel stack regs at 00000000a8291e69 in syzkaller047086:4677 has bad 'bp' value 000000001077994c
      [...]
      
      But after looking into it, I believe there's very little reason to still
      retain the x86 Salsa20 code.  First, these are *not* vectorized
      (SSE2/SSSE3/AVX2) implementations, which would be needed to get anywhere
      close to the best Salsa20 performance on any remotely modern x86
      processor; they're just regular x86 assembly.  Second, it's still
      unclear that anyone is actually using the kernel's Salsa20 at all,
      especially given that now ChaCha20 is supported too, and with much more
      efficient SSSE3 and AVX2 implementations.  Finally, in benchmarks I did
      on both Intel and AMD processors with both gcc 8.1.0 and gcc 4.9.4, the
      x86_64 salsa20-asm is actually slightly *slower* than salsa20-generic
      (~3% slower on Skylake, ~10% slower on Zen), while the i686 salsa20-asm
      is only slightly faster than salsa20-generic (~15% faster on Skylake,
      ~20% faster on Zen).  The gcc version made little difference.
      
      So, the x86_64 salsa20-asm is pretty clearly useless.  That leaves just
      the i686 salsa20-asm, which based on my tests provides a 15-20% speed
      boost.  But that's without updating the code to not use %ebp.  And given
      the maintenance cost, the small speed difference vs. salsa20-generic,
      the fact that few people still use i686 kernels, the doubt that anyone
      is even using the kernel's Salsa20 at all, and the fact that a SSE2
      implementation would almost certainly be much faster on any remotely
      modern x86 processor yet no one has cared enough to add one yet, I don't
      think it's worthwhile to keep.
      
      Thus, just remove both the x86_64 and i686 salsa20-asm implementations.
      
      Reported-by: syzbot+ffa3a158337bbc01ff09@syzkaller.appspotmail.com
      Signed-off-by: 's avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: 's avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      19f39eff
  19. 03 Jul, 2018 1 commit
    • Maciej S. Szmigiero's avatar
      X.509: unpack RSA signatureValue field from BIT STRING · af20e4ec
      Maciej S. Szmigiero authored
      commit b65c32ec upstream.
      
      The signatureValue field of a X.509 certificate is encoded as a BIT STRING.
      For RSA signatures this BIT STRING is of so-called primitive subtype, which
      contains a u8 prefix indicating a count of unused bits in the encoding.
      
      We have to strip this prefix from signature data, just as we already do for
      key data in x509_extract_key_data() function.
      
      This wasn't noticed earlier because this prefix byte is zero for RSA key
      sizes divisible by 8. Since BIT STRING is a big-endian encoding adding zero
      prefixes has no bearing on its value.
      
      The signature length, however was incorrect, which is a problem for RSA
      implementations that need it to be exactly correct (like AMD CCP).
      Signed-off-by: 's avatarMaciej S. Szmigiero <mail@maciej.szmigiero.name>
      Fixes: c26fd69f ("X.509: Add a crypto key parser for binary (DER) X.509 certificates")
      Cc: stable@vger.kernel.org
      Signed-off-by: 's avatarJames Morris <james.morris@microsoft.com>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      af20e4ec
  20. 30 May, 2018 1 commit
    • Eric Biggers's avatar
      PKCS#7: fix direct verification of SignerInfo signature · e47c1bf9
      Eric Biggers authored
      [ Upstream commit 6459ae38 ]
      
      If none of the certificates in a SignerInfo's certificate chain match a
      trusted key, nor is the last certificate signed by a trusted key, then
      pkcs7_validate_trust_one() tries to check whether the SignerInfo's
      signature was made directly by a trusted key.  But, it actually fails to
      set the 'sig' variable correctly, so it actually verifies the last
      signature seen.  That will only be the SignerInfo's signature if the
      certificate chain is empty; otherwise it will actually be the last
      certificate's signature.
      
      This is not by itself a security problem, since verifying any of the
      certificates in the chain should be sufficient to verify the SignerInfo.
      Still, it's not working as intended so it should be fixed.
      
      Fix it by setting 'sig' correctly for the direct verification case.
      
      Fixes: 757932e6 ("PKCS#7: Handle PKCS#7 messages that contain no X.509 certs")
      Signed-off-by: 's avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: 's avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: 's avatarSasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e47c1bf9
  21. 16 May, 2018 1 commit
  22. 01 May, 2018 1 commit
  23. 12 Apr, 2018 1 commit
    • Arnd Bergmann's avatar
      crypto: aes-generic - build with -Os on gcc-7+ · 7cae67e3
      Arnd Bergmann authored
      
      [ Upstream commit 148b974d ]
      
      While testing other changes, I discovered that gcc-7.2.1 produces badly
      optimized code for aes_encrypt/aes_decrypt. This is especially true when
      CONFIG_UBSAN_SANITIZE_ALL is enabled, where it leads to extremely
      large stack usage that in turn might cause kernel stack overflows:
      
      crypto/aes_generic.c: In function 'aes_encrypt':
      crypto/aes_generic.c:1371:1: warning: the frame size of 4880 bytes is larger than 2048 bytes [-Wframe-larger-than=]
      crypto/aes_generic.c: In function 'aes_decrypt':
      crypto/aes_generic.c:1441:1: warning: the frame size of 4864 bytes is larger than 2048 bytes [-Wframe-larger-than=]
      
      I verified that this problem exists on all architectures that are
      supported by gcc-7.2, though arm64 in particular is less affected than
      the others. I also found that gcc-7.1 and gcc-8 do not show the extreme
      stack usage but still produce worse code than earlier versions for this
      file, apparently because of optimization passes that generally provide
      a substantial improvement in object code quality but understandably fail
      to find any shortcuts in the AES algorithm.
      
      Possible workarounds include
      
      a) disabling -ftree-pre and -ftree-sra optimizations, this was an earlier
         patch I tried, which reliably fixed the stack usage, but caused a
         serious performance regression in some versions, as later testing
         found.
      
      b) disabling UBSAN on this file or all ciphers, as suggested by Ard
         Biesheuvel. This would lead to massively better crypto performance in
         UBSAN-enabled kernels and avoid the stack usage, but there is a concern
         over whether we should exclude arbitrary files from UBSAN at all.
      
      c) Forcing the optimization level in a different way. Similar to a),
         but rather than deselecting specific optimization stages,
         this now uses "gcc -Os" for this file, regardless of the
         CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE/SIZE option. This is a reliable
         workaround for the stack consumption on all architecture, and I've
         retested the performance results now on x86, cycles/byte (lower is
         better) for cbc(aes-generic) with 256 bit keys:
      
      			-O2     -Os
      	gcc-6.3.1	14.9	15.1
      	gcc-7.0.1	14.7	15.3
      	gcc-7.1.1	15.3	14.7
      	gcc-7.2.1	16.8	15.9
      	gcc-8.0.0	15.5	15.6
      
      This implements the option c) by enabling forcing -Os on all compiler
      versions starting with gcc-7.1. As a workaround for PR83356, it would
      only be needed for gcc-7.2+ with UBSAN enabled, but since it also shows
      better performance on gcc-7.1 without UBSAN, it seems appropriate to
      use the faster version here as well.
      
      Side note: during testing, I also played with the AES code in libressl,
      which had a similar performance regression from gcc-6 to gcc-7.2,
      but was three times slower overall. It might be interesting to
      investigate that further and possibly port the Linux implementation
      into that.
      
      Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
      Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83651
      Cc: Richard Biener <rguenther@suse.de>
      Cc: Jakub Jelinek <jakub@gcc.gnu.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: 's avatarArnd Bergmann <arnd@arndb.de>
      Acked-by: 's avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: 's avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: 's avatarSasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7cae67e3