On Thu, Jan 27, 2011 at 3:03 AM, Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Wed, Jan 26, 2011 at 11:20:22AM +0330, Hamid Nassiby wrote:
> >
> > Do you mean that different IP packets fit into one single block
> > cipher tfm? Would you please explain in more detail?
>
> We allocate one tfm per SA.  So as long as ordering is guaranteed
> per tfm then it's guaranteed per SA, which is all that's needed.
>
> Cheers,
> --
> Email: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Dear all,

Referring to my previous posts about our driver for a hardware AES
accelerator (to be used to accelerate the IPsec block cipher
operations), I would like to ask about a possibly algorithmic problem
in our solution.

As I said earlier, our driver is modeled on geode_aes, so assume that
we have defined our supported algorithm as:

static struct crypto_alg shams_cbc_alg = {
	.cra_name		= "cbc(aes)",
	.cra_driver_name	= "cbc-aes-mine",
	.cra_priority		= 400,
	.cra_flags		= CRYPTO_ALG_TYPE_BLKCIPHER |
				  CRYPTO_ALG_NEED_FALLBACK,
	.cra_init		= fallback_init_blk,
	.cra_exit		= fallback_exit_blk,
	.cra_blocksize		= AES_MIN_BLOCK_SIZE,
	.cra_ctxsize		= sizeof(struct my_aes_op),
	.cra_alignmask		= 0,
	.cra_type		= &crypto_blkcipher_type,
	.cra_module		= THIS_MODULE,
	.cra_list		= LIST_HEAD_INIT(shams_cbc_alg.cra_list),
	.cra_u			= {
		.blkcipher = {
			.min_keysize	= AES_MIN_KEY_SIZE,
			.max_keysize	= AES_MAX_KEY_SIZE,
			.setkey		= my_setkey_blk,
			.encrypt	= my_cbc_encrypt,
			.decrypt	= my_cbc_decrypt,
			.ivsize		= AES_IV_LENGTH,
		}
	}
};

And our encrypt function, my_cbc_encrypt, looks like:

static int my_cbc_encrypt(struct blkcipher_desc *desc,
			  struct scatterlist *dst, struct scatterlist *src,
			  unsigned int nbytes)
{
	struct my_aes_op *op = crypto_blkcipher_ctx(desc->tfm);
	struct blkcipher_walk walk;
	int err, ret;
	unsigned long flag1, c2flag;
	u32 my_req_id;

	if (unlikely(op->keylen != AES_KEYSIZE_128))
		return fallback_blk_enc(desc, dst, src, nbytes);

	/*
	 * Our request id is sent to the device and then retrieved with
	 * the response, so we can tell the device's responses apart.
	 */
	spin_lock_irqsave(&reqlock, c2flag);
	my_req_id = (global_reqid++) % 63000;
	spin_unlock_irqrestore(&reqlock, c2flag);

	blkcipher_walk_init(&walk, dst, src, nbytes);
	err = blkcipher_walk_virt(desc, &walk);
	op->iv = walk.iv;

	while ((nbytes = walk.nbytes)) {
		op->src = walk.src.virt.addr;
		op->dst = walk.dst.virt.addr;
		op->mode = AES_MODE_CBC;
		op->len = nbytes /*- (nbytes % AES_MIN_BLOCK_SIZE)*/;
		op->dir = AES_DIR_ENCRYPT;

		/* Critical PSEUDO code */
		spin_lock_irqsave(&lock1, flag1);
		write_to_device(op, 0, my_req_id);
		spin_unlock_irqrestore(&lock1, flag1);

		spin_lock_irqsave(&lock1, flag1);
		ret = read_from_device(op, 0, my_req_id);
		spin_unlock_irqrestore(&lock1, flag1);
		/* End of Critical PSEUDO code */

		nbytes -= ret;
		err = blkcipher_walk_done(desc, &walk, nbytes);
	}

	return err;
}
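As a side note, I realize the request-id allocation itself probably
does not need its own spinlock. A minimal sketch, assuming an atomic
counter is acceptable here (next_req_id() is a made-up helper, and the
63000 wrap is kept from the code above):

/* Needs <linux/atomic.h>.  atomic_inc_return() hands every caller a
 * distinct value, so reqlock and global_reqid can go away. */
static atomic_t next_id = ATOMIC_INIT(0);

static inline u32 next_req_id(void)
{
	return ((u32)atomic_inc_return(&next_id)) % 63000;
}

my_cbc_encrypt() would then simply do my_req_id = next_req_id();. But
that is a detail; the real problem follows.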
As I mentioned earlier, we have multiple AES engines in our hardware,
so to utilize the hardware as much as possible we would like to be
able to give multiple requests to the device and collect each response
as soon as it becomes ready. Now look at the section of my_cbc_encrypt
commented as "Critical PSEUDO code". This section hands requests to
the device and reads back the responses (and it is the bottleneck).

If we protect the write_to_device and read_from_device calls with one
pair of lock/unlock, as in:

	/* Critical PSEUDO code */
	spin_lock_irqsave(&lock1, flag1);
	write_to_device(op, 0, my_req_id);
	ret = read_from_device(op, 0, my_req_id);
	spin_unlock_irqrestore(&lock1, flag1);
	/* End of Critical PSEUDO code */

then we have no problem: the system works and IPsec en/decrypts
through our hardware. But ONLY one AES engine of our hardware is
utilized. The Good (the system works), the Bad (only one engine is
utilized) and the Ugly (the throughput is not awesome). So we must
change the section to:

	/* Critical PSEUDO code */
	spin_lock_irqsave(&lock1, flag1);
	write_to_device(op, 0, my_req_id);
	spin_unlock_irqrestore(&lock1, flag1);

	spin_lock_irqsave(&glock, t2flag);
	ret = read_from_device(op, 0, my_req_id);
	spin_unlock_irqrestore(&glock, t2flag);
	/* End of Critical PSEUDO code */

and preferably to:

	/* Critical PSEUDO code */
	/* distinct locks for write_to_device and read_from_device */
	spin_lock_irqsave(&lock1, flag1);
	write_to_device(op, 0, my_req_id);
	spin_unlock_irqrestore(&lock1, flag1);

	spin_lock_irqsave(&lock2, flag2);
	ret = read_from_device(op, 0, my_req_id);
	spin_unlock_irqrestore(&lock2, flag2);
	/* End of Critical PSEUDO code */

so that a second request can be written to the device while an earlier
one is still being processed. Here it seems we should have no problem,
but as soon as one TCP flow starts, the system hangs.

I would appreciate your guidance on this problem.

Thanks in advance,
Hamid
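P.S. In case it clarifies what I am after, below is a very rough
sketch of the direction I suspect we need. All of pending_slot,
submit_and_collect(), device_has_response() and read_response_id() are
names I made up, not our real register interface, and it assumes the
device reports the id of each finished request alongside its result
and that at most MAX_PENDING requests are ever in flight. The idea is
that whoever pulls the next finished response off the device matches
it by id and parks it for its real owner if it is not its own. The
waiter busy-polls rather than sleeps, since for IPsec this path can
run in softirq context.

#define MAX_PENDING	64

struct pending_slot {
	u32			id;
	bool			done;
	int			ret;
	struct my_aes_op	*op;
};

static DEFINE_SPINLOCK(pend_lock);
static struct pending_slot pending[MAX_PENDING];

static int submit_and_collect(struct my_aes_op *op, u32 id)
{
	struct pending_slot *slot = &pending[id % MAX_PENDING];
	unsigned long flags;

	/* Register ourselves, then submit; do NOT wait under the lock,
	 * so other CPUs can feed the remaining engines meanwhile. */
	spin_lock_irqsave(&pend_lock, flags);
	slot->id = id;
	slot->op = op;
	slot->done = false;
	write_to_device(op, 0, id);
	spin_unlock_irqrestore(&pend_lock, flags);

	/* Busy-poll: collect responses as they finish.  A response that
	 * belongs to another waiter is parked in its owner's slot. */
	for (;;) {
		spin_lock_irqsave(&pend_lock, flags);
		if (slot->done) {
			spin_unlock_irqrestore(&pend_lock, flags);
			return slot->ret;
		}
		if (device_has_response()) {		/* hypothetical */
			u32 rid = read_response_id();	/* hypothetical */
			struct pending_slot *owner =
				&pending[rid % MAX_PENDING];

			owner->ret = read_from_device(owner->op, 0, rid);
			owner->done = true;	/* ours or someone else's */
		}
		spin_unlock_irqrestore(&pend_lock, flags);
		cpu_relax();
	}
}

my_cbc_encrypt() would then call submit_and_collect(op, my_req_id) in
place of the locked write/read pair, and every engine could be kept
busy as long as there are requests in flight. Does this direction make
sense, or is the asynchronous ablkcipher interface the proper way to
achieve it?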