Re: list major languages first in anaconda

Martin Kolman <mkolman@xxxxxxxxxx> · Thu, 22 Oct 2020 17:18:54 +0200

On Thu, 2020-10-22 at 08:41 +0000, Zbigniew Jędrzejewski-Szmek wrote:
> On Thu, Oct 22, 2020 at 06:28:38AM -0000, Sundeep Anand wrote:
> > Hi,
> > 
> > To help users choose their native language anaconda tries to evaluate priority languages based on geolocation and
> > place them first in the list. Proposal[1] is to broaden this scope by appending major/commonly spoken languages as
> > well. This may cater to the use case where a major/common language speaker relocated to a different territory.
> > Determining the list of major/common languages is tricky; however, as a starting point we may look at
> > gnome-control-center[2].
> 
> I strongly support this. The 10 or so most popular languages cover
> maybe 60% of the world population, so this optimizes language selection
> for a large fraction of our users, without really making anything worse for
> other users, who just have to go through the search list as before.
> 
> As to the list of languages: I think we should go by the total number
> of speakers of a given language, though taking into account popularity of Fedora
> and OSS in a given group too. (For example, French is only spoken by 77 mln as
> the first language according to Wikipedia, but we have many French contributors
> and users, proportionally more than the 1% of world population that that 77 mln is.
> So I think it is important to include French in this list, even though
> it's not that popular in the world.)
> 
> Also, I think we should go by the *total* number of speakers, not just the speakers
> for whom the language is the *first* language. My thinking (and I would love
> to hear from people who are in this situation) is that in many parts of the world
> people know multiple languages and are likely to select the interface in
> a second language, if that second language for example is of European origin
> and uses the Latin alphabet or is otherwise better supported by the software.
> 
> I think we should not put regional dialects (**) of a language on the list,
> and always stick to the most popular dialect. A speaker of a given regional
> variant might *prefer* it, but they will not have any trouble understanding
> the most popular variant. This saves us a spot, which we can fill in with
> another language that is significantly different than those already on the list.
> This increases the chance that someone who is using Fedora will see at least
> one language (and alphabet) which they know enough to operate the installer
> interface. (So for example, en_AU, en_GB, en_HK, even though they are somewhat
> popular, would not be included since en_US is.)
> 
> Finally, a caveat that if our localization in a given language is very
> bad, we should not advertise it, even if otherwise we'd want to include it.
> We should instead set a medium-term goal to improve fonts/translations/localization
> in that language first.
> 
> Going by https://en.wikipedia.org/wiki/List_of_languages_by_total_number_of_speakers
> we have:
> 1  English     1268 mln   <-- on the list already, twice
> 2  Mandarin    1120       <-- on the list already
> 3  Hindi        637.3
> 4  Spanish      537.9     <-- on the list already
> 5  French       276.6     <-- on the list already
> 6  Standard Ar  274.0     <-- on the list already (*)
> 7  Bengali      265.2
> 8  Russian      258.0     <-- on the list already
> 9  Portuguese   252.2
> 10 Indonesian   199.0
> 11 Urdu         170.6
> 12 German       131.6     <-- on the list already
> 13 Japanese     126.4     <-- on the list already
> 
> (*) ar_EG is on the list. Is it close enough to other Arabic languages?
> 
> My conclusion would be to drop en_GB, add one Hindi, Bengali,
> Portuguese, Indonesian and Urdu variant each (with the caveat about
> sufficient support described above). This covers another 1.5bn people
> and gives us significantly better coverage in Asia and South America.
> 
> Japanese is important because it's a significantly distinct language
> with special fonts and conventions, and I think many speakers would
> not be comfortable in anything else. OTOH, German is meh, because in
> my experience all Germans understand English well enough to use it in
> the UI. If we had to drop one more language, I'd drop German.
Just a note about possible implementation of the suggested improvements:
Anaconda does not contain the language/timezone/keyboard listings and mappings itself;
it uses the langtable library, which provides this data via its API:

https://github.com/mike-fabian/langtable

So if more improvements are desired (both in data & code), they should be integrated
into langtable. Anaconda (and other users of langtable) will then just pick them up automatically.
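
As a rough illustration of the ordering discussed in this thread, here is a minimal sketch in plain Python. It is not
langtable's or Anaconda's actual code: the function name prioritize_languages and the example language lists are
hypothetical, and in practice the data would come from langtable rather than being hard-coded.

```python
def prioritize_languages(geo_languages, common_languages, all_languages):
    """Reorder all_languages: geolocation matches first, then common
    languages, then everything else in its original order.

    All three arguments are plain lists of language codes (hypothetical
    inputs for illustration; real data would come from langtable).
    """
    available = set(all_languages)
    ordered = []
    seen = set()
    for lang in (*geo_languages, *common_languages, *all_languages):
        # Skip languages the installer does not offer and avoid duplicates.
        if lang in available and lang not in seen:
            seen.add(lang)
            ordered.append(lang)
    return ordered


# Hypothetical example data, not taken from langtable.
ALL_LANGUAGES = ["cs", "de", "en", "es", "fr", "hi", "ja", "pt", "zh"]
COMMON_LANGUAGES = ["en", "zh", "hi", "es", "fr"]

result = prioritize_languages(["cs"], COMMON_LANGUAGES, ALL_LANGUAGES)
print(result)
```

Languages that appear in both the geolocation list and the common list are shown only once, at the earlier position,
so appending the common languages should not make the list any worse for users who are already served by geolocation.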

In the distant past Anaconda used to host and process the language/timezone/keyboard mappings itself,
which was not just hard to maintain but also made the data inaccessible to other projects interested in using it. For
that reason the langtable project was created, and the data and the code for accessing it moved there. Even
more significantly, it is maintained by Mike Fabian, who knows all about these things (thanks yet again, Mike :) ).

> 
> Zbyszek
> 
> (**) a variety of a language that is a characteristic of a particular
> group of the language's speakers (Wikipedia)
> _______________________________________________
> devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
> To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
> Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx