On Mon, 2006-12-18 at 23:36 +0100, Erik Mouw wrote:
> On Mon, Dec 18, 2006 at 05:12:53PM -0500, Ming Zhang wrote:
> > See this piece of code:
> >
> > http://lxr.linux.no/source/drivers/scsi/libata-scsi.c?v=2.6.18#L1049
> >
> > lba |= ((u64)scsicmd[2]) << 24;
> > lba |= ((u64)scsicmd[3]) << 16;
> > lba |= ((u64)scsicmd[4]) << 8;
> > lba |= ((u64)scsicmd[5]);
> >
> > It can also be written as
> >
> > lba = be32_to_cpu(*(u32 *)(&scsicmd[2]));
> >
> > I wrote a simple program to test this and found the second one is
> > several times faster than the first one.
>
> I guess the total time spent doing the conversion is orders of
> magnitude smaller than the time needed to get a data block from the
> drive. Don't bother optimising things that don't need to be optimised.

Thanks. I do not want to optimize it, since I ran it 1,000,000,000 times
and it was only 8 sec faster. ;) I just got used to writing it the 2nd
way and wondered why people write it the 1st way if the 2nd way is OK.

> > Is there any pitfall with the 2nd way?
>
> AFAICS be32_to_cpu() works on natively aligned 32-bit integers. Yes, it
> works on x86 because x86 can do unaligned memory accesses. I don't think
> your second method will work on a RISC architecture (ARM, MIPS, Alpha,
> etc).
>

Oh, yes, I forgot this. So the 1st way is the correct way while the 2nd
is _not_. Thanks again.

>
> Erik
>

--
Kernelnewbies: Help each other learn about the Linux kernel.
Archive:       http://mail.nl.linux.org/kernelnewbies/
FAQ:           http://kernelnewbies.org/faq/
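
To make the alignment point from the thread concrete, here is a minimal userspace sketch in C. It is not the libata code: the names read_be32_shift() and read_be32_memcpy() are hypothetical, and the manual byte swap is only a stand-in for be32_to_cpu(), which is not available outside the kernel. The first helper is the byte-by-byte form from the quoted snippet. The second copies the four bytes into an aligned temporary with memcpy() before converting, so it never dereferences a possibly misaligned u32 pointer the way the cast does.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build a big-endian 32-bit value one byte at a time.
 * Only byte loads are used, so it is safe for any alignment and gives the
 * same result on little-endian and big-endian hosts. */
static inline uint32_t read_be32_shift(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}

/* Hypothetical helper: copy the bytes into an aligned temporary, then
 * convert from big-endian to host order. */
static inline uint32_t read_be32_memcpy(const uint8_t *p)
{
    uint32_t raw;
    const uint16_t probe = 1;

    /* memcpy() replaces the *(u32 *)p cast; no unaligned dereference. */
    memcpy(&raw, p, sizeof raw);

    /* Userspace stand-in for be32_to_cpu(): swap only on little-endian. */
    if (*(const uint8_t *)&probe == 1) {
        raw = (raw >> 24) |
              ((raw >> 8) & 0x0000ff00u) |
              ((raw << 8) & 0x00ff0000u) |
              (raw << 24);
    }
    return raw;
}
```

With a command block whose bytes 2..5 are 0x12 0x34 0x56 0x78, both helpers return 0x12345678 when called with &scsicmd[2], whatever the alignment of that address. Kernel code would normally use the kernel's own unaligned-access helpers rather than functions like these.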