On Thu, 2020-10-22 at 08:41 +0000, Zbigniew Jędrzejewski-Szmek wrote:
> On Thu, Oct 22, 2020 at 06:28:38AM -0000, Sundeep Anand wrote:
> > Hi,
> >
> > To help users choose their native language, anaconda tries to evaluate
> > priority languages based on geolocation and place them first in the list.
> > Proposal[1] is to broaden this scope by appending major/commonly spoken
> > languages as well. This may cater to the use case where a speaker of a
> > major/common language has relocated to a different territory.
> > Determining the list of major/common languages is tricky; however, as a
> > starting point we may look at gnome-control-center[2].
>
> I strongly support this. The 10 or so most popular languages cover
> maybe 60% of the world population, so this optimizes language selection
> for a large fraction of our users, without really making anything worse for
> other users, who just have to go through the search list as before.
>
> As to the list of languages: I think we should go by the total number
> of speakers of a given language, though taking into account popularity of Fedora
> and OSS in a given group too. (For example, French is only spoken by 77 mln as
> the first language according to Wikipedia, but we have many French contributors
> and users, proportionally more than the 1% of world population that that 77 mln is.
> So I think it is important to include French in this list, even though
> it's not that popular in the world.)
>
> Also, I think we should go by the *total* number of speakers, not just the speakers
> for whom the language is the *first* language. My thinking (and I would love
> to hear from people who are in this situation) is that in many parts of the world
> people know multiple languages and are likely to select the interface in
> a second language, if that second language for example is of European origin
> and uses the Latin alphabet or is otherwise better supported by the software.
>
> I think we should not put regional dialects (**) of a language on the list,
> and always stick to the most popular dialect. A speaker of a given regional
> variant might *prefer* it, but they will not have any trouble understanding
> the most popular variant. This saves us a spot, which we can fill in with
> another language that is significantly different than those already on the list.
> This increases the chance that someone who is using Fedora will see at least
> one language (and alphabet) which they know well enough to operate the installer
> interface. (So for example, en_AU, en_GB, en_HK, even though they are somewhat
> popular, would not be included since en_US is.)
>
> Finally, a caveat: if our localization in a given language is very
> bad, we should not advertise it, even if otherwise we'd want to include it.
> We should instead set a medium-term goal to improve fonts/translations/localization
> in that language first.
>
> Going by https://en.wikipedia.org/wiki/List_of_languages_by_total_number_of_speakers
> we have:
>  1 English      1268 mln  <-- on the list already, twice
>  2 Mandarin     1120      <-- on the list already
>  3 Hindi         637.3
>  4 Spanish       537.9    <-- on the list already
>  5 French        276.6    <-- on the list already
>  6 Standard Ar   274.0    <-- on the list already (*)
>  7 Bengali       265.2
>  8 Russian       258.0    <-- on the list already
>  9 Portuguese    252.2
> 10 Indonesian    199.0
> 11 Urdu          170.6
> 12 German        131.6    <-- on the list already
> 13 Japanese      126.4    <-- on the list already
>
> (*) ar_EG is on the list. Is it close enough to other Arabic languages?
>
> My conclusion would be to drop en_GB, add one Hindi, Bengali,
> Portuguese, Indonesian and Urdu variant each (with the caveat about
> sufficient support described above). This covers another 1.5 bln people
> and gives us significantly better coverage in Asia and South America.
>
> Japanese is important because it's a significantly distinct language
> with special fonts and conventions, and I think many speakers would
> not be comfortable in anything else. OTOH, German is meh, because in
> my experience all Germans understand English well enough to use it in
> the UI. If we had to drop one more language, I'd drop German.

Just a note about a possible implementation of the suggested improvements:
Anaconda does not contain the language/timezone/keyboard listings and
mappings itself. It uses the langtable library, which provides this data
via its API:

https://github.com/mike-fabian/langtable

So if more improvements are desired (both in data and code), they should be
integrated into langtable; Anaconda (and other users of langtable) will
then just pick them up automatically.

Long ago, Anaconda used to host and process the language/timezone/keyboard
mappings itself, which was not only hard to maintain but also kept the data
out of reach of other projects interested in using it. For that reason the
langtable project was created, and the data and the code for accessing it
moved there. Even more significantly, it is maintained by Mike Fabian, who
knows all about these things (thanks yet again, Mike :) ).
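
To make the proposal a bit more concrete, here is a minimal sketch of the
ordering logic being discussed: geolocation-based languages first, then a
fixed list of common languages appended, with no duplicates. This is not
langtable or Anaconda code. The function names and the concrete locale
variants picked for Hindi, Bengali, Portuguese, Indonesian and Urdu are my
own assumptions for illustration only; in practice the data would live in
langtable. The second helper just sums the speaker totals quoted above for
the five proposed additions.

```python
# Illustrative only: this list and the chosen locale variants are assumptions,
# not data taken from langtable or Anaconda.
COMMON_LANGUAGES = [
    "en_US", "zh_CN", "hi_IN", "es_ES", "fr_FR", "ar_EG", "bn_BD",
    "ru_RU", "pt_BR", "id_ID", "ur_PK", "de_DE", "ja_JP",
]

# Total speakers in millions, as quoted in this thread from Wikipedia,
# for the five languages proposed as additions.
ADDED_SPEAKERS_MLN = {
    "Hindi": 637.3,
    "Bengali": 265.2,
    "Portuguese": 252.2,
    "Indonesian": 199.0,
    "Urdu": 170.6,
}


def prioritized_languages(geo_languages, common_languages=COMMON_LANGUAGES):
    """Return geolocation languages first, then common ones, without duplicates."""
    seen = set()
    result = []
    for locale in list(geo_languages) + list(common_languages):
        if locale not in seen:
            seen.add(locale)
            result.append(locale)
    return result


def added_coverage_mln(speakers=ADDED_SPEAKERS_MLN):
    """Sum the speaker totals (in millions) of the proposed additions."""
    return round(sum(speakers.values()), 1)


if __name__ == "__main__":
    # Hypothetical geolocation result for a user located in Switzerland.
    print(prioritized_languages(["de_CH", "fr_CH", "it_CH"]))
    print(added_coverage_mln(), "mln")
```

A language that is already suggested by geolocation keeps its leading
position and is not repeated further down, so users in the "expected"
territory see no change, while relocated speakers of a common language
find it near the top.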
>
> Zbyszek
>
> (**) a variety of a language that is a characteristic of a particular
> group of the language's speakers (Wikipedia)
> _______________________________________________
> devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
> To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
> Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx