Re: lots of small files in a folder on Linux centos

Marc Deop <damnshock@xxxxxxxxx> · Mon, 25 Jul 2011 12:38:47 +0200

On Sunday 24 July 2011 10:13:30 R P Herrold wrote:
> #!/bin/sh
> #
> CANDIDATES="pix00001.jpg pix00002.jpg pix00003.jpg"
> for i in `echo "${CANDIDATES}"`; do
>          HASH=`echo "$i" | md5sum - | awk {'print $1'}`
>          echo "$i        ${HASH}"
> done

I know it has absolutely nothing to do with databases or files in folders, but as we are talking about optimizing:

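#!/bin/bash
# Hash each candidate name without the echo/awk pipeline.
CANDIDATES=(pix00001.jpg pix00002.jpg pix00003.jpg)
for i in "${CANDIDATES[@]}"; do
    MD5SUM=$(md5sum <(echo "$i"))
    # Keep only the hash; md5sum appends the input name after two spaces.
    echo "$i     ${MD5SUM%% *}"
done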

It's more than twice as fast as the previous sh script.

[willing-to-learn mode; feel free to ignore this]

Anyway, about the hashes and directories and so on... I assume we'd need a hash table in our application, right?

Would we proceed as follows (correct me if I'm wrong please)?

1- md5sum the file we need
2- take the first letter of the hash
3- go into the directory named after that letter
4- look for our file there

Is this right? I understand this would improve the searching of files when there's a lot of them.
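
To make the question concrete, the steps above could be sketched roughly as follows. This is only my guess at the scheme, not a tested implementation: it assumes the file *name* is hashed (as in the script quoted above) rather than the file contents, and bucket_for, store_file and find_file are hypothetical helper names I made up for illustration.

```bash
#!/bin/bash
# Sketch only: the helper names below are hypothetical.

# Print the one-character bucket for a file name: the first hex digit
# of the MD5 of the name (the name is hashed, not the file contents).
bucket_for() {
    local hash
    hash=$(printf '%s\n' "$1" | md5sum)
    printf '%.1s\n' "$hash"
}

# Move FILE into ROOT/<bucket>/, creating the bucket directory if needed.
store_file() {
    local root="$1" file="$2" name bucket
    name="${file##*/}"                 # strip any leading directories
    bucket=$(bucket_for "$name")
    mkdir -p -- "$root/$bucket"
    mv -- "$file" "$root/$bucket/$name"
}

# Print the full path of NAME under ROOT if it exists there;
# return non-zero otherwise.
find_file() {
    local root="$1" name="$2" path
    path="$root/$(bucket_for "$name")/$name"
    [ -e "$path" ] && printf '%s\n' "$path"
}
```

With this, a lookup never scans the whole collection: find_file recomputes the bucket from the name and tests a single path. One hex digit gives only 16 buckets, so with a very large number of files one would presumably take two or more characters of the hash instead.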

Thanks to anyone who replies, and sorry for the off-topic post.

Regards,

Marc Deop
_______________________________________________
CentOS mailing list
CentOS@xxxxxxxxxx
http://lists.centos.org/mailman/listinfo/centos