Re: thoughts and a possible solution on homograph attacks

"Dmitry Yu. Bolkhovityanov" <D.Yu.Bolkhovityanov@xxxxxxxxxx> · Tue, 8 Mar 2005 11:00:29 +0600

On Mon, 7 Mar 2005, Michael Roitzsch wrote:

> Hi security community,
> 
> this is my first publication I post on Bugtraq, so please be patient with me.
> 
> Since the recent problems with IDN, I wanted to clear up my thoughts on 
> homograph attacks, so I sorted everything in an article which also contains 
> what I believe to be an easy and general solution.

Quote from your .pdf:

> I propose to present the user with a dialog showing the text to be
> validated and an input field, into which the user has to type in the given
> text again. The user is told, if both texts match precisely and what this
> means: If the typed text's internal representation matches the given text
> bit-by-bit, trust can be established. If it does not match, the user is
> told to re-check for typing errors and not to establish trust.

	What you propose is the same as entering the password for each
site you visit.  Yes, this IS a solution, but it is TOO DISTURBING for
users.  Web surfers usually do hundreds (or thousands?) clicks per day,
and at least dozens of them are cross-site.  And forcing them to type
domain's name each time is just not the way to go.

	Domain names AREN'T passwords, they exist to be memorable.

	Remember: users are lazy, and >90% home installs of Windows have
autologin enabled -- no usernames, no passwords.  If the users are SO
lazy, they would definitely object to entering a long domain names by
their fingers.

	However, there CAN be a solution for a tiny real-world subset of
"homograph attacks" problem -- the web browsers interface.  My idea
is the following:

	Domain names are usually written as text strings of "default
	interface colors".  But the browser can highlight non-ASCII
	glyphs by some different background, so that even a
	security-unconscious user would pay attention.

	For example, if regular "URL text" colors are black-on-white, the
browser can highlight greek letters (U+0380-U+03FF) with light-blue
background, cyrillics (U+0400-U+04FF) -- with red, and all other non-ASCII
(or non-ISO8859-1) characters -- with yellow.

	Such three-color highlight seems to be enough, since most
looking-identical-to-latin glyphs are in greek and cyrillc alphabets, and
the "catch-all" yellow will satisfy all other cases.

P.S. My native language is russian, so the alphabet is cyrillic.  Since
     cyrillic has ~30% letters looking identical to latin (but often
     pronounced differently), and having different Unicode positions, it
     was obvious years ago that IDN was very poorly thought.  It is a big
     mistake from both security and marketing points of view.

     And this problem of homograpgh attacks in a general form can have no
     solution at all, just because of this problem's nature.  There are
     cases in a real life when a russian-speaking (to be correct, a
     cyrillic-based-language-speaking) person can't determine which
     language some word is spelled in.  For example, ask some
     russian-speaker how would he or she read "nona" (that's a real name
     of a hotel in Bulgaria, which causes constant fun for russian
     tourists).

     Just my two cents...

	_________________________________________
	  Dmitry Yu. Bolkhovityanov
	  The Budker Institute of Nuclear Physics
	  Novosibirsk, Russia