Re: Parsing images

On Fri, 2006-05-12 at 08:37, tedd wrote:
> On Thu, May 11, 2006 1:03 pm, Robert Cummings wrote:
> >  Edge detection, noise suppression, and data analysis don't quite
> >  equate to recognition. Also 30 years of OCR still requires that the
> >  sample be good quality and conform to fairly detectable patterns. If
> >  this is so trivial, I await the release of your captcha parser. The
> >  spammers would probably pay you millions for it. Where exactly is
> >  this bleeding edge, and where can I read more about it? I think
> >  you're quite wholeheartedly being naive about the complexity of
> >  visual recognition. Prove me wrong.
> 
> If you had millions, I would prove you wrong. But, in the meantime I 
> have other stuff to do.
>
> The original poster asked the question -- can it be done. And of 
> course, the answer is Yes.

Yes, "it can be done". No, it's not trivial in the general case.

> Visual recognition in its entirety is not what we are talking about 
> here. Instead, we are talking about a very specific and limited 
> problem of how a program can detect known characters in noise -- that 
> -- are also detectable by humans.

Visual recognition is what we are talking about when you say it's
trivial and can be done easily. Because visual recognition is the
necessary branch of analysis required to be able to analyse any captcha
and come up with the answer.

> Granted, the more complex the image, the more difficult for a program 
> to decipher it, but a CAPTCHA has to be, by definition, detectable by 
> a visually unimpaired human.

And you're making what point here? That you can write code to match the
capabilities of a visually unimpaired human's visual processing and
recognition? Wow!!

> Image analysis, enhancement, alteration, and such are better 
> performed by computers than by humans -- that's the reason we 
> developed the software in the first place.

WRONG! A small subset of these are better performed by computers. And in
almost every case, the computer's results are checked by humans
afterwards. Most computers just flag things as interesting, then a human
goes and makes the executive decision.

> The step between analysis and detection is simply meeting an 
> acceptable error threshold.

Maybe so, and I guess if you consider a .00000001 success rate OK, then
writing a captcha breaker might be trivial since you can probably just
generate a random string.
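
To put a rough number on that, here is a minimal sketch in Python. It
assumes a hypothetical captcha whose answer is drawn uniformly from a
case-insensitive alphabet of 26 letters and 10 digits; the alphabet, the
lengths, and the function names are assumptions made for illustration,
not any real captcha's specification.

```python
import random
import string

# Assumed alphabet: lowercase letters plus digits (36 symbols).
ALPHABET = string.ascii_lowercase + string.digits


def random_guess(length, rng=random):
    """Return a uniformly random string of the given length."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def guess_success_rate(length, alphabet_size=len(ALPHABET)):
    """Chance that one uniform random guess equals a fixed answer."""
    return 1.0 / alphabet_size ** length


# Show how quickly blind guessing degrades as the answer gets longer.
for n in (4, 5, 6):
    print(n, random_guess(n), guess_success_rate(n))
```

Under those assumptions a five-character answer gives 1 in 36^5 =
60,466,176, or about 1.65e-8 per guess, so a success rate of that order
is what blind guessing already delivers, with no image analysis at all.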

> However, the reasons why CAPTCHA's still work is that the time 
> required to do the analysis could be better spent elsewhere by 
> spammers.

When spammers do come around to generically detecting simple captchas
you can be sure the more complex captchas will increase in frequency.

> As for me being naive -- well... either one of us could be -- but 
> that's probably what I get for working in signal analysis (seismic 
> data) since 1975.

Someone once posted the following to the list; I feel it appropriate to
quote it at this time:

    "Locus ab auctoritate est infirmissimus"

> As for the bleeding edge, that's obvious, just look to medical 
> imaging. They wish that the detection of their problems were as 
> simple as CAPTCHA.

Just because they may wish they were detecting something simpler such as
captcha (maybe, since I could render my captcha to look like medical
imagery and so it would be just as difficult to detect), doesn't mean
captcha is trivial. I'm not currently aware of too much medical imaging
processing that occurs without human intervention. Much of it requires
that a human view the results and make an informed decision based on the
computer's analysis. I think part of your naivety is thinking that the
captcha you see right now is as hard as it gets. You are very mistaken.

Cheers,
Rob.
-- 
.------------------------------------------------------------.
| InterJinn Application Framework - http://www.interjinn.com |
:------------------------------------------------------------:
| An application and templating framework for PHP. Boasting  |
| a powerful, scalable system for accessing system services  |
| such as forms, properties, sessions, and caches. InterJinn |
| also provides an extremely flexible architecture for       |
| creating re-usable components quickly and easily.          |
`------------------------------------------------------------'

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

