Dave Singer scripsit:

> Yes, I picked off an easy example for which the 'matching' section of
> the draft didn't seem adequate. This really is a tar-pit, of course.

Indeed it is, which is why the draft provides only one simple algorithm
(described as "the most common implementation", which it is) and
explicitly allows for cleverer techniques for those who want them.

> I assume that they are mutually intelligible.

Among speakers of good will, yes.

> The whole question of what is a language, a variant or dialect of a
> language, or a suitable substitute for a language, would benefit some
> thought in any tagging scheme, though I agree the problem is not
> generally soluble.

See the editor's draft of ISO 639-3 at http://tinyurl.com/6kky2 . This
is a PDF file about 4 MB in size, so I excerpt the relevant text here
(clause 4.2.1, pp. 3-4):

# There is no one definition of "language" that is agreed upon by all and
# appropriate for all purposes. As a result, there can be disagreement,
# even among speakers or linguistic experts, as to whether two varieties
# represent dialects of a single language or two distinct languages. For
# this part of ISO 639, judgments regarding when two varieties are
# considered to be the same or different languages are based on a number
# of factors, including linguistic similarity, intelligibility, a common
# literature, the views of speakers concerning the relationship between
# language and identity, and other factors. The following basic criteria
# are followed:
#
# Two related varieties are normally considered varieties of the same
# language if speakers of each variety have inherent understanding
# of the other variety (that is, can understand based on knowledge of
# their own variety without needing to learn the other variety) at a
# functional level.
#
# Where spoken intelligibility between varieties is marginal, the
# existence of a common literature or of a common ethnolinguistic
# identity with a central variety that both understand can be strong
# indicators that they should nevertheless be considered varieties of
# the same language.
#
# Where there is enough intelligibility between varieties to
# enable communication, the existence of well-established distinct
# ethnolinguistic identities can be a strong indicator that they should
# nevertheless be considered to be different languages.
#
# Some of the distinctions made on this basis may not be considered
# appropriate by some users or for certain applications. These basic
# criteria are thought to best fit the intended range of applications,
# however.

-- 
John Cowan
http://www.reutershealth.com
http://www.ccil.org/~cowan
jcowan@xxxxxxxxxxxxxxxxx

First known example of political correctness:
"After Nurhachi had united all the other Jurchen tribes under the
leadership of the Manchus, his successor Abahai (1592-1643) issued an
order that the name Jurchen should be banned, and from then on, they
were all to be called Manchus."
        --S. Robert Ramsey, The Languages of China

_______________________________________________
Ietf@xxxxxxxx
https://www1.ietf.org/mailman/listinfo/ietf