To me it seems obvious that if the hardware provides a true RNG, its output should be used to feed random(4). This solves a genuine problem, and even if calls to the hardware are expensive, the overall overhead will be low because random(4) does not need large amounts of input. I'm much less certain that hardware acceleration is worthwhile for ciphers and hashes, except where the CPU itself includes instructions to speed them up.
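
To make the "small input, low overhead" point concrete, here is a rough user-space sketch of feeding a hardware RNG into random(4). It assumes a Linux system where the hardware RNG is exposed as /dev/hwrng and where writing to /dev/random mixes the data into the pool without crediting any entropy; those paths, and the helper names `feed_pool` and `feed_from_devices`, are illustrative assumptions, not anything a particular driver or kernel is claimed to do internally.

```python
def feed_pool(source, sink, nbytes=64, chunk=16):
    """Copy up to nbytes from source to sink in small chunks.

    source and sink are binary file-like objects. Returns the number of
    bytes actually copied, which is less than nbytes if the source runs dry.
    """
    copied = 0
    while copied < nbytes:
        # A raw device read may return fewer bytes than requested.
        data = source.read(min(chunk, nbytes - copied))
        if not data:
            break
        sink.write(data)
        copied += len(data)
    return copied


def feed_from_devices(src_path="/dev/hwrng", dst_path="/dev/random", nbytes=64):
    """Hypothetical wrapper: the device paths are assumptions (Linux)."""
    # Unbuffered reads, so we never pull more from the hardware than we use.
    with open(src_path, "rb", buffering=0) as src, open(dst_path, "wb") as dst:
        return feed_pool(src, dst, nbytes)
```

Calling `feed_from_devices()` once in a while (it typically needs suitable privileges to open the devices) moves only a few dozen bytes per call, so even a slow hardware source adds little total cost.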