Re: How to find a needle in a haystack?

> On Tue, 2010-05-18 at 16:49 -0400, aragonx@xxxxxxxxxx wrote:
>> Hello all,
>>
>> I need some ideas.
>>
>> I have a backup server that contains 10 ext3 file systems each with 12
>> million files scattered randomly over 4000 directories.  The files
>> average size is 1MB.
>
> So each filesystem is about 12*10^6 * 1MB = 12*10^12 or 12 terabytes?

Each filesystem is 2.5TB, so the average file size must be much smaller
than 1MB.  At last count, one of the filesystems contained 20 million files.
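
As a rough check on those numbers, here is the arithmetic as a small
sketch.  It assumes decimal units (1TB = 10^12 bytes) and a completely
full filesystem, neither of which I have verified, so the results are
upper bounds rather than measurements.  The function name is just for
illustration.

```python
# Rough average file size implied by the figures in this thread.
# Assumption: decimal units (1 TB = 10**12 bytes) and a full filesystem,
# so these are upper bounds, not measured values.
TB = 10**12


def average_file_size_kb(fs_bytes, file_count):
    """Return the mean file size in kilobytes (10**3 bytes)."""
    return fs_bytes / file_count / 1000


fs_size = 2.5 * TB
print(average_file_size_kb(fs_size, 12_000_000))  # 12 million files
print(average_file_size_kb(fs_size, 20_000_000))  # 20 million files
```

That works out to somewhere around 200KB per file at 12 million files,
and 125KB at 20 million, well under the 1MB I first quoted.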

> You don't say what the file contents are like, e.g. text, structured
> data, unstructured binary, etc, nor do you say how you match the file
> you want (e.g. is it equivalent to a text substring, a regular
> expression, or what?). Knowing what the contents look like would help to
> evaluate if it's worth e.g. generating a hash for subsections of the
> file when it's being stored. Alternatively, it could conceivably make
> sense to search for strings in the raw disk and work backwards to
> calculate what files they belong to, who knows?

The files contain unstructured binary data.  When I do a search, I have
_most_ of the file name, enough to identify the file uniquely.
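
Since the match is on part of the file name only, one possible approach
is to walk each filesystem once, record the names in a small database,
and search that instead of the directories.  What follows is only a
sketch of that idea, not something I have running: the table layout and
the function names build_index and find_by_fragment are made up for
illustration, and it assumes the file names are valid UTF-8.

```python
import os
import sqlite3


def build_index(root, db_path):
    """Walk `root` once and record every file's name and directory."""
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS files (name TEXT, dir TEXT)")
    con.execute("DELETE FROM files")  # rebuild from scratch on each run
    for dirpath, _dirnames, filenames in os.walk(root):
        con.executemany(
            "INSERT INTO files (name, dir) VALUES (?, ?)",
            ((name, dirpath) for name in filenames),
        )
    con.commit()
    return con


def find_by_fragment(con, fragment):
    """Return full paths whose file name contains `fragment`."""
    # instr() is a case-sensitive substring test, so no wildcard
    # escaping is needed for characters such as % or _ in the fragment.
    rows = con.execute(
        "SELECT dir, name FROM files WHERE instr(name, ?) > 0",
        (fragment,),
    )
    return [os.path.join(d, n) for d, n in rows]
```

A substring query like this still scans every row, but it reads one
table of names rather than touching 4000 directories on disk.  The index
is only as fresh as the last walk, so it would need to be rebuilt or
updated after each backup run.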

I hope that helps.

---
Will Y.



-- 
users mailing list
users@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe or change subscription options:
https://admin.fedoraproject.org/mailman/listinfo/users
Guidelines: http://fedoraproject.org/wiki/Mailing_list_guidelines