Re: Question: How range-diff lapjv algorithm work

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On Thu, 9 Mar 2023, ZheNing Hu wrote:

> Martin Ågren <martin.agren@xxxxxxxxx> 于2023年3月8日周三 15:47写道:
>
> > On Wed, 8 Mar 2023 at 07:50, ZheNing Hu <adlternative@xxxxxxxxx> wrote:
> > >
> > > 3. How does LAPJV work here? [2]
> > >
> > > [1]: https://github.com/git/git/blob/725f57037d81e24eacfda6e59a19c60c0b4c8062/range-diff.c#L310
> > > [2]: https://github.com/git/git/blob/725f57037d81e24eacfda6e59a19c60c0b4c8062/linear-assignment.c#L15
> >
> > This appears to be based on work by Jonker and Volgenant. Maybe
> > searching for those names online could find something. Maybe not
> > git-specific, but even if it's just the general algorithm as such, it
> > might be possible to find different explanations of the algorithm.
> >
> > I haven't really studied this algorithm, but since it's faster than the
> > Hungarian algorithm, I could imagine that either
> >
> >   * it's super-useful to first understand the Hungarian algorithm, then
> >     understand why and how the Jonker-Volgenant algorithm does better,
> >     or,
> >
> >   * it's completely useless to first understand the Hungarian algorithm,
> >     since they're so different.
> >
> > :-)
> >
>
> Ah, I had a look at the Hungarian algorithm earlier, because it is the most
> typical algorithm in linear assignment problem, it can still be understood.
> I didn't read that paper by Jonker and Volgenant, but I should try to read
> it later.

Oh wow, what a blast from the past.

As I had briefly mentioned in
https://lore.kernel.org/git/nycvar.QRO.7.76.6.1805032227520.77@xxxxxxxxxxxxxxxxx/,
the code I contributed in
https://lore.kernel.org/git/3f51970cbc44bfe34133c48c0844ed3723e83808.1525361419.git.johannes.schindelin@xxxxxx/
is a port from some Java code I had written earlier in my life which was
based on the Pascal code provided in the cited paper.

My recollection has faded somewhat in the past 9 years. But maybe the
following helps, or provides at least some entertainment.

My first version of a "linear assignment problem" solver _did_ implement
the Hungarian algorithm, and I cannot even find it anymore, it must have
been in 2010 or 2011. I only remember that the O(n^4) runtime was
prohibitively expensive: I wanted to automatically track cells in
microscope recordings as they migrated over a plate of agar. I _seem_ to
remember that it was kind of okay for up to a couple dozen, but we wanted
to track hundreds, even thousands of cells, for extended amounts of time.
We _could_ do it, kind of, on a ridiculously expensive machine with 128GB
(I _think_), but naturally time on that machine was limited.

So I set out to implement the Munkres & Kuhn algorithm (see
https://github.com/fiji/fiji/blob/7d683d2ad46d22ec2e0bdffe1aa02072ab62e359/src-plugins/TrackMate_/fiji/plugin/trackmate/tracking/hungarian/MunkresKuhnAlgorithm.java)
because it promised O(n^3) complexity, and it did allow us to track the
amount of cells we were studying at the time.

It took a _week_ to wrap my head around that paper, even if being somewhat
familiar with the Hungarian algorithm probably helped. You will see
extensive notes in the initial code comment that I wrote for my future
self if I ever needed to understand the algorithm _again_. Reading over
those notes after more than a decade, I realize that while they are
helpful for _me_, they probably help others only in conjunction with some
quality time studying the paper.

However, later on our tests revealed that The Munkres & Kuhn algorithm,
while cubic in complexity, was still much slower than the Jonker &
Volgenant algorithm. This algorithm is also cubic in complexity, but with
a lower constant factor, and I could implement much quicker than the
Munkres & Kuhn algorithm because I essentially could cheat a bit by
studying the the paper while taking frequent peeks at the code. That also
explains why I did not take extensive notes this time and only added a
terse code comment, and the commit message is not much better, either:
https://github.com/trackmate-sc/TrackMate/commit/e306a02f76a6b6e383c0d32efaba565aca01a773

My tests must have been quite convincing back then because I already
switched to Jonker & Volgenant on the same day that I finished the
implementation:
https://github.com/trackmate-sc/TrackMate/commit/ad5012827417a40673901a37af44904a2de73095

_Years_ later, I wanted to run `git tbdiff` and couldn't because it
uses not only Python but `numpy`. And `numpy` is largely native code, not
Python, and I had the choice of either getting `numpy` to build on
Windows, or reimplement `tbdiff` in pure C as a new Git built-in, I chose
the latter (because it seemed much easier). And since `tbdiff` used the
`hungarian` algorithm (https://pypi.org/project/hungarian/) that
implements Munkres' algorithm (which I assumed was a slight variation of
Munkres & Kuhn, with the same performance characteristics), I undusted my
Java code and ported it. When I say "undusted", I lie, because it is still
in use today. But it basically went unchanged for years because it is bug
free ;-)

So now that you know about the provenance of the code we're discussing, I
have to admit that I can contribute only very little to understand it,
other than providing a couple of pointers:

https://en.wikipedia.org/wiki/Hungarian_algorithm has a pretty good, if
quite mathematical description of the algorithm. (And now that I read it,
I realize that I probably inadvertently attributed Tomizawa's improvements
to Munkres & Kuhn.)

It is amazingly hard to find good explanations on the 'net that aren't
using hard-code mathematical symbols, but I think I found one that
illustrates how the algorithm works:
https://medium.com/@rajneeshtiwari_22870/linear-assignment-problem-in-metric-learning-for-computer-vision-eba7d637c5d4

Hopefully that helps.

Ciao,
Johannes

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux