I can see what you're saying, it could be useful for finding differences in different versions of the same binary but from what I can see Joxean's app is meant to group files of the same 'type,' not provide 'diff' capabilities. -Travis On Tue, Jan 5, 2010 at 9:51 AM, Dan Kaminsky <dan@xxxxxxxxxxx> wrote: > I looked into a fair amount of this sort of normalization back when I was > playing with dotplots. The idea was to upgrade from simple Levenshtein > string comparison (with no knowledge of variable length x86 instructions, > pointers that shift from compile to compile, etc) to something with at least > some domain specific knowledge. What I found, somewhat surprisingly, was > that dumb string comparison was more than enough. In fact, when I compared > pre-patch and post-patch builds, it was easy to directly see when content > was added, removed, shifted in location, etc. Joxean's going to have much > the same result -- as basic as his similarity metric is, he'll get the broad > strokes just fine. > > Ultimately the best approach is to build a graph of how functions interact > and measure graph isomorphism, but of course Halvar figured that out years > ago :) > > On Tue, Jan 5, 2010 at 3:41 PM, T Biehn <tbiehn@xxxxxxxxx> wrote: >> >> Hmm, >> Wouldn't it be more useful to the sec community to have a algorithm >> that abstracts at the -interpreted- content level? That is when >> analyzing binaries I wouldn't think that this would classify two with >> near identical functionality together, even though it is removing a >> significant chunk of information during the hash pass. >> >> I would largely assume that your algorithm, as is, works best on >> uncompressed bitmaps. Is there something I'm missing? >> >> -Travis >> >> On Sun, Jan 3, 2010 at 6:37 AM, Joxean Koret <joxeankoret@xxxxxxxx> wrote: >> > Hi all, >> > >> > I'm happy to announce the very first public release of the open source >> > project DeepToad, a tool for computing fuzzy hashes from files. >> > >> > DeepToad can generate signatures, clusterize files and/or directories >> > and compare them. It's inspired in the very good tool ssdeep [1] and, in >> > fact, both projects are very similar. >> > >> > The complete project is written in pure python and is distributed under >> > the LGPL license [2]. >> > >> > Links: >> > Project's Web Page http://code.google.com/p/deeptoad/ >> > Download Web Page http://code.google.com/p/deeptoad/downloads/list >> > Wiki http://code.google.com/p/deeptoad/w/list >> > >> > References: >> > [1] http://ssdeep.sourceforge.net/ >> > [2] http://www.gnu.org/licenses/lgpl.html >> > >> > Regards && Happy new year! >> > Joxean Koret >> > >> > >> > _______________________________________________ >> > Full-Disclosure - We believe in it. >> > Charter: http://lists.grok.org.uk/full-disclosure-charter.html >> > Hosted and sponsored by Secunia - http://secunia.com/ >> > >> >> >> >> -- >> FD1D E574 6CAB 2FAF 2921 F22E B8B7 9D0D 99FF A73C >> http://pgp.mit.edu:11371/pks/lookup?search=tbiehn&op=index&fingerprint=on >> http://pastebin.com/f6fd606da >> >> _______________________________________________ >> Full-Disclosure - We believe in it. >> Charter: http://lists.grok.org.uk/full-disclosure-charter.html >> Hosted and sponsored by Secunia - http://secunia.com/ > > -- FD1D E574 6CAB 2FAF 2921 F22E B8B7 9D0D 99FF A73C http://pgp.mit.edu:11371/pks/lookup?search=tbiehn&op=index&fingerprint=on http://pastebin.com/f6fd606da