crypto: talitos - chain in buffered data for ahash on SEC1
SEC1 doesn't support S/G in descriptors, so for hash operations the
CPU has to build a buffer containing the buffered block and the
incoming data. This generates a lot of memory copies, which account
for more than 50% of the CPU time of an md5sum operation according
to 'perf record'.
However, unlike SEC2+, SEC1 offers the possibility to chain
descriptors. It is therefore possible to build a first descriptor
pointing to the buffered data and a second descriptor pointing to
the incoming data, hence avoiding the memory copy to a single
buffer.
With this patch, an md5sum on a 90 Mbytes file takes approximately
3 seconds. Without the patch it takes 6 seconds.
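Below is a minimal sketch of the idea, with hypothetical structure and
field names rather than the real talitos descriptor layout: the first
descriptor covers the previously buffered block, its next-descriptor
pointer links to a second descriptor covering the incoming data, and
the hardware walks the chain instead of the CPU concatenating buffers.

#include <linux/types.h>

/* Illustrative only: 'struct hw_desc' and its fields are made up. */
struct hw_desc {
        dma_addr_t data;        /* DMA address of the data to hash */
        unsigned int len;       /* length of that data */
        dma_addr_t next_desc;   /* 0 = last descriptor in the chain */
};

static void chain_hash_descs(struct hw_desc *first, dma_addr_t buf_dma,
                             unsigned int buf_len, struct hw_desc *second,
                             dma_addr_t second_dma, dma_addr_t new_dma,
                             unsigned int new_len)
{
        /* First descriptor: the buffered block kept from the last update. */
        first->data = buf_dma;
        first->len = buf_len;
        first->next_desc = second_dma;  /* chain instead of memcpy() */

        /* Second descriptor: the incoming data, used in place. */
        second->data = new_dma;
        second->len = new_len;
        second->next_desc = 0;          /* end of chain */
}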
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crypto: talitos - don't check the number of channels at each interrupt
The number of channels is known from the beginning, so there is no
need to test it every time.
This patch defines two additional done functions handling only channel 0.
Then the probe registers the correct one based on the number of channels.
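A hedged sketch of the approach, with made-up names (the real driver
registers tasklets and has its own private structure layout): the probe
path picks the completion handler once, based on the channel count, so
the interrupt path no longer re-checks the number of channels.

/* Illustrative only: handler and structure names are hypothetical. */
struct dev_priv {
        int num_channels;
        void (*done)(unsigned long data);       /* completion handler */
};

static void done_ch0_only(unsigned long data)
{
        /* handle completions for channel 0 only */
}

static void done_all_channels(unsigned long data)
{
        /* loop over all channels and handle their completions */
}

static void pick_done_handler(struct dev_priv *priv)
{
        /* Decided once at probe time, not at each interrupt. */
        if (priv->num_channels == 1)
                priv->done = done_ch0_only;
        else
                priv->done = done_all_channels;
}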
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crypto: talitos - zeroize the descriptor with memset()
This patch zeroizes the descriptor at allocation time using memset().
This has two advantages:
- It reduces the number of places where data has to be set to 0
- It avoids reading memory and loading the cache with data that
will be entirely replaced.
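For illustration, a minimal sketch of the pattern (the descriptor type
and helper are hypothetical):

#include <linux/string.h>
#include <linux/types.h>

struct my_desc {
        u32 hdr;
        u32 ptr[7];
};

static void init_desc(struct my_desc *desc)
{
        /* One write-only clear instead of zeroing fields one by one. */
        memset(desc, 0, sizeof(*desc));
        /* ... then fill in only the fields that are actually used ... */
}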
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This is due to SHA224 not being supported by the hardware. Although
for hashes we are able to initialize the hash context in software,
this is not possible for AEAD. Therefore SHA224 AEAD has to be
deactivated.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crypto: talitos - fix setkey to check key weakness
The crypto manager tests report the following failures:
[ 3.061081] alg: skcipher: setkey failed on test 5 for ecb-des-talitos: flags=100
[ 3.069342] alg: skcipher-ddst: setkey failed on test 5 for ecb-des-talitos: flags=100
[ 3.077754] alg: skcipher-ddst: setkey failed on test 5 for ecb-des-talitos: flags=100
This is because setkey is expected to detect and reject weak keys.
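A hedged sketch of the check pattern other drivers of that era used
(whether talitos uses exactly these helpers is an assumption):
des_ekey() returns 0 for weak keys, and the tfm flags decide whether
weak keys must be rejected.

#include <crypto/des.h>
#include <linux/crypto.h>
#include <linux/errno.h>

static int sketch_des_setkey(struct crypto_ablkcipher *cipher,
                             const u8 *key, unsigned int keylen)
{
        u32 tmp[DES_EXPKEY_WORDS];

        /* Reject weak DES keys when the caller asked for that check. */
        if (keylen == DES_KEY_SIZE && !des_ekey(tmp, key) &&
            (crypto_ablkcipher_get_flags(cipher) & CRYPTO_TFM_REQ_WEAK_KEY)) {
                crypto_ablkcipher_set_flags(cipher, CRYPTO_TFM_RES_WEAK_KEY);
                return -EINVAL;
        }

        /* ... program the key into the device context as before ... */
        return 0;
}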
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
If the crypto4xx device is continuously loaded by dm-crypt and
ipsec work, it starts to work intermittently after a few (20 to 30)
seconds, hurting throughput and latency.
This patch contains various stability improvements in order
to fix this issue. So far, the hardware has survived more
than a day without suffering any stalls under the continuous
load.
Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crypto4xx_core.c:179:6: warning: symbol 'crypto4xx_free_state_record'
was not declared. Should it be static?
crypto4xx_core.c:331:5: warning: symbol 'crypto4xx_get_n_gd'
was not declared. Should it be static?
crypto4xx_core.c:652:6: warning: symbol 'crypto4xx_return_pd'
was not declared. Should it be static?
crypto4xx_return_pd() is not used by anything. Therefore it is removed.
Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch overhauls and fixes code related to crypto4xx_build_pd()
* crypto4xx_build_pd() did not handle chained source scatterlists.
This is fixed by replacing the buggy indexed access of &src[idx]
with sg_next() in the gather array setup loop (see the sketch
after this list).
* The redundant is_hash, direction, save_iv and pd_ctl members
in the crypto4xx_ctx struct have been removed.
- is_hash can be derived from the crypto_async_request parameter.
- direction is already part of the security association's
bf.dir bitfield.
- save_iv is unused.
- pd_ctl always had the host_ready bit enabled anyway.
(the hash_final case is rather pointless, since the ahash
code has been deactivated).
* make crypto4xx_build_pd()'s caller responsible for converting
the IV to the LE32 format.
* change crypto4xx_ahash_update() and crypto4xx_ahash_digest() to
initialize a temporary destination scatterlist. This allows the
removal of an ugly cast of req->result (which is a pointer to an
u8-array) to a scatterlist pointer.
* change crypto4xx_build_pd() return type to int. After all
it returns -EINPROGRESS/-EBUSY.
* fix crypto4xx_build_pd() thread-unsafe sa handling.
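A minimal sketch of the first point above, with a hypothetical gather
descriptor type: walking the source with sg_next() also covers chained
scatterlists, while indexing &src[idx] silently runs past the end of
the first chunk of a chained list.

#include <linux/kernel.h>
#include <linux/scatterlist.h>

/* Illustrative gather entry; the real descriptor layout differs. */
struct gather_desc {
        dma_addr_t addr;
        u32 len;
};

static void fill_gather_array(struct gather_desc *gd,
                              struct scatterlist *src, u32 nbytes)
{
        struct scatterlist *sg = src;
        unsigned int i = 0;

        while (nbytes && sg) {
                u32 len = min_t(u32, nbytes, sg_dma_len(sg));

                gd[i].addr = sg_dma_address(sg);
                gd[i].len = len;
                nbytes -= len;
                i++;
                sg = sg_next(sg);       /* correct for chained lists too */
        }
}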
Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crypto: crypto4xx - use the correct LE32 format for IV and key defs
The hardware expects that the keys, IVs (and inner/outer hashes)
are in the le32 format.
This patch changes all hardware interface declarations to use
the correct LE32 data format for each field.
In order to pass __CHECK_ENDIAN__ checks, crypto4xx_memcpy_le
has to be honest about the endianness of its parameters.
The function was split and moved to the common crypto4xx_core.h
header. This allows the compiler to generate better code if the
size/len is a constant (various *_IV_LEN).
Please note that the hardware isn't consistent with the endianness
of the save_digest field in the state record struct, though.
The hashes produced by GHASH and CBC (for CCM) will be in LE32,
whereas md5 and sha{1,256,...} do not need any conversion.
Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Previously, if the crypto4xx driver had used up all available
security contexts, it would simply refuse new requests with -EAGAIN;
CRYPTO_TFM_REQ_MAY_BACKLOG was ignored.
In the case of dm-crypt.c's crypt_convert() function, this caused
the following errors to manifest if the system was pushed hard
enough:
To fix this mess, the crypto4xx driver needs to notify the user to
slow down. This can be achieved by returning -EBUSY on requests once
the crypto hardware is falling behind.
Note: -EBUSY has two different meanings. If the flag
CRYPTO_TFM_REQ_MAY_BACKLOG is set, it implies that the request was
successfully queued by the crypto driver. To achieve this, the
implementation introduces a threshold check and adds logic to the
completion routines in much the same way as AMD's Cryptographic
Coprocessor (CCP) driver does.
Note2: Tests showed that dm-crypt starved ipsec traffic.
Under load, ipsec links dropped to 0 Kbits/s. This is because
dm-crypt's callback would instantly queue the next request.
In order to not starve ipsec, the driver reserves a small
portion of the available crypto contexts for this purpose.
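A hedged sketch of the enqueue-side decision (the threshold value and
the bookkeeping of free contexts are assumptions): below the threshold
a request is accepted normally, above it the request is still queued
but -EBUSY tells a backlog-aware caller to throttle, and callers
without CRYPTO_TFM_REQ_MAY_BACKLOG are refused outright.

#include <linux/crypto.h>
#include <linux/errno.h>

#define BACKLOG_THRESHOLD       16      /* hypothetical reserve */

/* Sketch: pick the return code for a new request based on how many
 * hardware contexts are still free; the actual submit path is elided.
 */
static int pick_return_code(struct crypto_async_request *req,
                            int free_contexts)
{
        if (free_contexts <= 0)
                return -EAGAIN;                 /* completely full */

        if (free_contexts < BACKLOG_THRESHOLD) {
                if (!(req->flags & CRYPTO_TFM_REQ_MAY_BACKLOG))
                        return -EAGAIN;         /* caller cannot back off */
                /* The request is still queued to the hardware; -EBUSY
                 * only tells the caller to wait for the completion
                 * callback before sending more work.
                 */
                return -EBUSY;
        }

        return -EINPROGRESS;
}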
Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
I used aes-cbc as a template for ofb, but sadly I forgot to update
the set_key method to crypto4xx_setkey_aes_ofb().
This was caught by the testmgr:
alg: skcipher: Test 1 failed (invalid result) on encr. for ofb-aes-ppc4xx
00000000: 76 49 ab ac 81 19 b2 46 ce e9 8e 9b 12 e9 19 7d
00000010: 50 86 cb 9b 50 72 19 ee 95 db 11 3a 91 76 78 b2
00000020: 73 be d6 b8 e3 c1 74 3b 71 16 e6 9e 22 22 95 16
00000030: 3f f1 ca a1 68 1f ac 09 12 0e ca 30 75 86 e1 a7
With the correct set_key method, the aes-ofb cipher passes the test.
name : ofb(aes)
driver : ofb-aes-ppc4xx
module : crypto4xx
priority : 300
refcnt : 1
selftest : passed
internal : no
type : ablkcipher
async : yes
blocksize : 16
min keysize : 16
max keysize : 32
ivsize : 16
geniv : <default>
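For context, a hedged sketch of the kind of one-line fix involved; the
alg template below is abbreviated, and the encrypt/decrypt callback
names are assumptions about the driver's interface:

#include <crypto/aes.h>
#include <linux/crypto.h>

int crypto4xx_setkey_aes_ofb(struct crypto_ablkcipher *cipher,
                             const u8 *key, unsigned int keylen);
int crypto4xx_encrypt(struct ablkcipher_request *req);
int crypto4xx_decrypt(struct ablkcipher_request *req);

static struct crypto_alg ofb_aes_alg = {
        .cra_name        = "ofb(aes)",
        .cra_driver_name = "ofb-aes-ppc4xx",
        .cra_blocksize   = AES_BLOCK_SIZE,
        .cra_u = {
                .ablkcipher = {
                        .min_keysize = AES_MIN_KEY_SIZE,
                        .max_keysize = AES_MAX_KEY_SIZE,
                        .ivsize      = AES_BLOCK_SIZE,
                        /* was the CBC setkey, copied from the template;
                         * it must point at the OFB variant:
                         */
                        .setkey      = crypto4xx_setkey_aes_ofb,
                        .encrypt     = crypto4xx_encrypt,
                        .decrypt     = crypto4xx_decrypt,
                },
        },
};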
Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Stephan Mueller [Tue, 3 Oct 2017 02:19:59 +0000 (04:19 +0200)]
crypto: keywrap - simplify code
The code is simplified by using two __be64 values for the operation
instead of using two arrays of u8. This makes it possible to get rid
of the memory alignment code. In addition, the crypto_xor() call can
be replaced with a
native XOR operation. Finally, the definition of the variables is
re-arranged such that the data structures come before simple variables
to potentially reduce memory space.
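A minimal sketch of the data-layout change (names are illustrative):
keeping the semiblocks as __be64 lets the XOR with the step counter be
a plain 64-bit operation instead of crypto_xor() over u8 arrays.

#include <linux/types.h>
#include <asm/byteorder.h>

/* Illustrative KW block: the A semiblock plus one R semiblock, both in
 * big-endian form as the algorithm defines them.
 */
struct kw_block {
        __be64 A;
        __be64 R;
};

static inline void kw_xor_counter(struct kw_block *blk, u64 t)
{
        /* Native 64-bit XOR replaces crypto_xor() on a byte array. */
        blk->A ^= cpu_to_be64(t);
}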
Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The usage of of_device_get_match_data() reduces the code size a bit.
Furthermore, it prevents an improbable dereference when
of_match_device() returns NULL.
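A before/after sketch of the pattern (the probe function and error
handling are illustrative):

#include <linux/errno.h>
#include <linux/of.h>
#include <linux/of_device.h>
#include <linux/platform_device.h>

/* Before: two steps, and of_match_device() must be checked for NULL. */
static int my_probe_old(struct platform_device *pdev)
{
        const struct of_device_id *match;

        match = of_match_device(pdev->dev.driver->of_match_table, &pdev->dev);
        if (!match)
                return -EINVAL;
        /* ... use match->data ... */
        return 0;
}

/* After: a single call that simply returns NULL when there is no match. */
static int my_probe_new(struct platform_device *pdev)
{
        const void *data = of_device_get_match_data(&pdev->dev);

        if (!data)
                return -EINVAL;
        /* ... use data ... */
        return 0;
}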
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The usage of of_device_get_match_data() reduces the code size a bit.
Furthermore, it prevents an improbable dereference when
of_match_device() returns NULL.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The usage of of_device_get_match_data() reduces the code size a bit.
Furthermore, it prevents an improbable dereference when
of_match_device() returns NULL.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
padata: ensure padata_do_serial() runs on the correct CPU
If the algorithm we're parallelizing is asynchronous we might change
CPUs between padata_do_parallel() and padata_do_serial(). However, we
don't expect this to happen as we need to enqueue the padata object into
the per-cpu reorder queue we took it from, i.e. the same-cpu's parallel
queue.
Ensure we're not switching CPUs for a given padata object by tracking
the CPU within the padata object. If the serial callback gets called on
the wrong CPU, defer invoking padata_reorder() via a kernel worker on
the CPU we're expected to run on.
Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
padata: ensure the reorder timer callback runs on the correct CPU
The reorder timer function runs on the CPU where the timer interrupt was
handled which is not necessarily one of the CPUs of the 'pcpu' CPU mask
set.
Ensure the padata_reorder() callback runs on the correct CPU, which is
one in the 'pcpu' CPU mask set and, preferably, the next expected one.
Do so by comparing the current CPU with the expected target CPU. If they
match, call padata_reorder() right away. If they differ, schedule a work
item on the target CPU that does the padata_reorder() call for us.
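A hedged sketch of the dispatch decision (the surrounding padata
structures are simplified away; the work item's handler is assumed to
call the real reorder routine):

#include <linux/smp.h>
#include <linux/workqueue.h>

/* Run the reorder step on 'target_cpu', one of the pcpu mask CPUs.
 * Called from timer/softirq context, so the current CPU cannot change
 * underneath the comparison.
 */
static void dispatch_reorder(int target_cpu, struct work_struct *reorder_work,
                             void (*reorder)(void))
{
        if (smp_processor_id() == target_cpu)
                reorder();                      /* already on the right CPU */
        else
                queue_work_on(target_cpu, system_wq, reorder_work);
}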
Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The parallel queue per-cpu data structure gets initialized only for CPUs
in the 'pcpu' CPU mask set. This is not sufficient, as the reorder timer
may run on a different CPU and might wrongly decide it's the target CPU
for the next reorder item: its per-cpu memory only got memset(0), so its
cpu_index of 0 looks like the first CPU in cpumask.pcpu, i.e. cpu_index 0.
Make the '__this_cpu_read(pd->pqueue->cpu_index) == next_queue->cpu_index'
compare in padata_get_next() fail in this case by initializing the
cpu_index member of all per-cpu parallel queues. Use -1 for unused ones.
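A minimal sketch of that initialization (the per-cpu queue type is
abbreviated to the one field of interest):

#include <linux/cpumask.h>
#include <linux/percpu.h>

struct pqueue {
        int cpu_index;
};

static void init_cpu_indices(struct pqueue __percpu *pqueues,
                             const struct cpumask *pcpu_mask)
{
        int cpu, index = 0;

        for_each_possible_cpu(cpu) {
                struct pqueue *pq = per_cpu_ptr(pqueues, cpu);

                if (cpumask_test_cpu(cpu, pcpu_mask))
                        pq->cpu_index = index++;        /* in the pcpu set */
                else
                        pq->cpu_index = -1;             /* can never match */
        }
}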
Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In 32-bit mode, x86 instructions can hold full 32-bit pointers.
Therefore, the code that copies the current address to the %ecx register
and uses %ecx-relative addressing is useless; we can just use absolute
addressing.
The processors have a stack of return addresses for branch prediction. If
we use a call instruction and pop the return address, it desynchronizes
the return stack and causes branch prediction misses.
This patch also moves the data to the .rodata section.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Himanshu Jha [Sat, 26 Aug 2017 21:15:29 +0000 (02:45 +0530)]
crypto: n2 - remove null check before kfree
kfree on NULL pointer is a no-op and therefore checking it is redundant.
Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Arvind Yadav [Fri, 25 Aug 2017 18:23:59 +0000 (23:53 +0530)]
crypto: padlock-sha - constify x86_cpu_id
x86_cpu_id tables are not supposed to change at runtime.
MODULE_DEVICE_TABLE and x86_match_cpu() work with const x86_cpu_id, so
mark the non-const x86_cpu_id structs as const.
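An illustrative table after the change (X86_FEATURE_PHE, the PadLock
hash engine bit, is an assumption about what padlock-sha keys on):

#include <asm/cpu_device_id.h>
#include <linux/module.h>

/* 'const' added: the table is only read by x86_match_cpu() and the
 * module device table machinery.
 */
static const struct x86_cpu_id padlock_sha_ids[] = {
        X86_FEATURE_MATCH(X86_FEATURE_PHE),
        {}
};
MODULE_DEVICE_TABLE(x86cpu, padlock_sha_ids);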
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Arvind Yadav [Fri, 25 Aug 2017 18:23:42 +0000 (23:53 +0530)]
crypto: padlock-aes - constify x86_cpu_id
x86_cpu_id tables are not supposed to change at runtime.
MODULE_DEVICE_TABLE and x86_match_cpu() work with const x86_cpu_id, so
mark the non-const x86_cpu_id structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch improves the readability of various functions by replacing
various void * pointer declarations with their respective struct *
types. This makes it possible to use the eye-friendly array-indexing
syntax.
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch refactors the crypto4xx_copy_pkt_to_dst() to use
scatterwalk_map_and_copy() to copy the processed data between
the crypto engine's scatter ring buffer and the destination
specified by the ablkcipher_request.
This also makes the crypto4xx_fill_one_page() function redundant.
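The helper's contract, for reference (the buffer and offset names are
illustrative): one call maps the scatterlist pages as needed and copies
nbytes at the given offset, in the direction selected by the last
argument.

#include <crypto/scatterwalk.h>

/* Copy processed data from a linear ring-buffer slot into the request's
 * destination scatterlist, starting at 'offset'.
 */
static void copy_to_dst(void *ring_buf, struct scatterlist *dst,
                        unsigned int offset, unsigned int nbytes)
{
        /* last argument: 0 copies sg -> buf, 1 copies buf -> sg */
        scatterwalk_map_and_copy(ring_buf, dst, offset, nbytes, 1);
}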
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crypto: crypto4xx - remove extern statement before function declaration
All function declarations are "extern" by default, there is no need to
specify it explicitly.
C99 states in 6.2.2.5:
"If the declaration of an identifier for a function has no
storage-class specifier, its linkage is determined exactly
as if it were declared with the storage-class specifier
extern."
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crypto: crypto4xx - set CRYPTO_ALG_KERN_DRIVER_ONLY flag
The security offload function is performed by a cryptographic
engine core attached to the 128-bit PLB (processor local bus)
with builtin DMA and interrupt controllers. This, I think,
satisfies the requirement for the CRYPTO_ALG_KERN_DRIVER_ONLY
flag.
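For illustration, how the flag typically appears in an alg declaration
(the entry below is abbreviated and hypothetical):

#include <crypto/aes.h>
#include <linux/crypto.h>

static struct crypto_alg example_alg = {
        .cra_name        = "cbc(aes)",
        .cra_driver_name = "cbc-aes-ppc4xx",
        .cra_blocksize   = AES_BLOCK_SIZE,
        .cra_flags       = CRYPTO_ALG_TYPE_ABLKCIPHER |
                           CRYPTO_ALG_ASYNC |
                           CRYPTO_ALG_KERN_DRIVER_ONLY,
};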
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
alg entries are only added to the list after the registration was
successful. If the registration failed, the entry was never added to
the list in the first place.
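A hedged sketch of the registration loop implied here (entry and list
names are hypothetical):

#include <linux/crypto.h>
#include <linux/list.h>

struct my_alg_entry {
        struct crypto_alg alg;
        struct list_head entry;
};

/* Only successfully registered algs end up on 'alg_list', so the unwind
 * path never has to guess which entries are live.
 */
static int register_algs(struct my_alg_entry *entries, int count,
                         struct list_head *alg_list)
{
        int i, ret;

        for (i = 0; i < count; i++) {
                ret = crypto_register_alg(&entries[i].alg);
                if (ret)
                        return ret;     /* never on the list, nothing to undo */
                list_add_tail(&entries[i].entry, alg_list);
        }
        return 0;
}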
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Colin Ian King [Sun, 20 Aug 2017 21:34:38 +0000 (22:34 +0100)]
crypto: aesni - make arrays aesni_simd_skciphers and aesni_simd_skciphers2 static
Arrays aesni_simd_skciphers and aesni_simd_skciphers2 are local to the
source and do not need to be in global scope, so make them static.
Cleans up sparse warnings:
symbol 'aesni_simd_skciphers' was not declared. Should it be static?
symbol 'aesni_simd_skciphers2' was not declared. Should it be static?
Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Arvind Yadav [Thu, 17 Aug 2017 17:36:23 +0000 (23:06 +0530)]
hwrng: pseries - constify vio_device_id
vio_device_id tables are not supposed to change at runtime. All
functions working with vio_device_id provided by <asm/vio.h> work with
const vio_device_id, so mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Arvind Yadav [Thu, 17 Aug 2017 13:14:11 +0000 (18:44 +0530)]
crypto: nx-842 - constify vio_device_id
vio_device_id tables are not supposed to change at runtime. All
functions working with vio_device_id provided by <asm/vio.h> work with
const vio_device_id, so mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Arvind Yadav [Thu, 17 Aug 2017 13:14:10 +0000 (18:44 +0530)]
crypto: nx - constify vio_device_id
vio_device_id tables are not supposed to change at runtime. All
functions working with vio_device_id provided by <asm/vio.h> work with
const vio_device_id, so mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>