[RFD] How to analyze future results of "Git User's Survey 2010" - correlations

This email is intended to start a discussion about how to analyze
the results of "Git User's Survey 2010" once it finishes
  https://git.wiki.kernel.org/index.php/GitSurvey2010

Analyzing results of individual questions is fairly straightforward, so
let's talk mainly about the more difficult issue: correlations between
answers.


Information about survey completion
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

0. Completion Rate Graph / Date of response
   
Unfortunately the last question, "28. How did you hear about this Git
User's Survey?", doesn't provide a way to give details about where one
found out about the survey, i.e. which website, which IRC channel,
the address of a mailing list or Google Group, or the name of a Usenet
newsgroup.

To find which announcements brought the most responders, we can try to
correlate the date of response with the time each announcement was
posted (where it is available).  Of course that would underplay the
impact of later announcements, because people who noticed the survey
there might have noticed it earlier somewhere else.

The export of survey data with individual responses includes the date
and time of each response (in the timezone of the account; the timezone
is not specified in the data, unfortunately).  In addition to a
histogram with day-wide bins (like the "Weekly" and "Monthly" plots in
"Completion Rate Graph" on the Analyze page for this survey:
http://tinyurl.com/GitSurvey2010Analyze) we can also display a running
"daily average" line plot, where the value at a given point is the
number of responses within +/- 12 hours of that time (24 hours, i.e.
a day, centered on the given time and date).
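
The running "daily average" described above could be computed roughly
as follows.  This is only a sketch: it assumes the response timestamps
have already been parsed from the export into datetime objects, and the
function name running_daily_counts and its parameters are made up for
illustration (they are not part of any Survs.com export format).

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta


def running_daily_counts(timestamps, sample_points,
                         half_window=timedelta(hours=12)):
    """For each sample point, count responses within +/- half_window.

    Both ends of the window are inclusive.  `timestamps` and
    `sample_points` are datetime objects in the same (unspecified)
    timezone.
    """
    ordered = sorted(timestamps)
    counts = []
    for at in sample_points:
        lo = bisect_left(ordered, at - half_window)
        hi = bisect_right(ordered, at + half_window)
        counts.append(hi - lo)
    return counts
```

Sampling at, say, one point per hour over the survey period gives the
values for the line plot; the day-wide histogram is the same data
sampled at noon of each day (up to the treatment of window edges).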

It is a pity that Survs.com doesn't provide (and probably also doesn't
gather) a detailed record of all views, rather than only of completed /
finished surveys; views appear only as the "Viewed" number in the
"Survey Completion Statistics" box.


Another thing worth trying is to create a histogram of _time_ of
response, and perhaps try to correlate it with country of residence
(and range of timezones therein).
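
A histogram of the time of response per country could be built along
these lines.  It is a sketch under the assumption that each response
has been reduced to a (country, datetime) pair; the function name is
hypothetical.  Note that the hours are in the timezone of the account,
which is not specified in the data, so the histograms are shifted by an
unknown but constant offset.

```python
from collections import Counter, defaultdict
from datetime import datetime


def hour_histograms_by_country(responses):
    """Map each country to a Counter of response counts per hour (0-23).

    `responses` is an iterable of (country, datetime) pairs.
    """
    histograms = defaultdict(Counter)
    for country, when in responses:
        histograms[country][when.hour] += 1
    return histograms
```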


About you
^^^^^^^^^
1. What country do you live in (country of residence)?

If IP addresses (or at least parts of them) were available in the
export data (they are shown in the individual responses tab, available
on the Analyze page to members of the 'git' account on Survs.com with
sufficient permissions), then we could correlate country of residence
with GeoIP (country of the ISP used to fill in the survey).

We could try to use GeoTools (used by LogToMap) or StatPlanet Map
Maker, or a similar solution/tool, to generate a map colored by the
number of responders from each country.


2. How old are you (in years)?

Not much to correlate with, I think, though we can try to compare with
demographics from other surveys, or with world demographics (if we can
find such data).


Getting started with Git
^^^^^^^^^^^^^^^^^^^^^^^^
3. Have you found Git easy to learn?
4. Have you found Git easy to use?

Those two compared can show us whether Git is difficult, or just has
a steep learning curve.

I also wonder what the correlation looks like and what the correlation
coefficient is for answers to those two questions.
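
For the correlation between those two questions, something like the
following might do.  It is a minimal sketch which assumes the answers
have been coded as numbers (for example 1 = "very easy" up to
5 = "very hard"; that coding is my assumption, not something the export
provides).  It computes the plain Pearson coefficient; since the
answers are ordinal, a rank correlation may well be a better choice,
and the contingency table is useful to look at in any case.

```python
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant answers")
    return cov / sqrt(var_x * var_y)


def cross_tab(xs, ys):
    """Contingency table: how many responders gave each answer pair."""
    return Counter(zip(xs, ys))
```

Only responders who answered both questions should be passed in, so
that the two sequences stay paired.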


5. Which Git version(s) are you using?

We can compare results against git versions distributed with major
distributions (perhaps limiting the view to those responders who use
binary packages, or whose admin installs git), though we have limited
resolution here.


6. Rate your own proficiency with Git (1-5):

Using this data we can check if novices and gurus use different tools,
use different features, want different features, etc.


How you use Git
^^^^^^^^^^^^^^^
7. I use Git for (check all that apply):

- work projects / unpaid projects
- proprietary projects / OSS development / private (unpublished)

We can try to use this data to check for example whether people use
different hosting services for OSS development (using public
repositories) and for proprietary development (private repositories on
hosting services or company internal).

- large (>1 MB) binary files
- often changing binary files

We can correlate this with people using git-bigfiles fork, and with
people wanting better support for large binary files in Git.


8. How do/did you obtain Git (install and/or upgrade)?
9. On which operating system(s) do you use Git?

We can correlate those two, and correlate them with git version used.
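
Since both questions allow more than one answer, correlating them
amounts to counting how often each pair of answers occurs together.
A rough sketch, assuming each response has been reduced to a pair of
sets (answers to question 8, answers to question 9); the function name
and this data layout are hypothetical:

```python
from collections import Counter
from itertools import product


def co_occurrence(responses):
    """Count how often each (install method, OS) pair was checked together.

    `responses` is an iterable of (obtained, systems) pairs, each an
    iterable of the answers one responder checked.
    """
    counts = Counter()
    for obtained, systems in responses:
        counts.update(product(obtained, systems))
    return counts
```

A responder who checked several answers to both questions contributes
to several cells, so the cells do not sum to the number of responders.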


10. What Git interfaces, implementations and frontends do you use?
11. How often do you use following kinds of Git tools? 
12. What Git GUIs (graphical user interfaces) do you use?

Do answers to those questions depend on the level of proficiency with
Git?  Are people who use a GUI more likely to find git hard to learn or
hard to use (or do they find it less difficult)?

17. Which of the following features would you like to see
    implemented in git?
18. Describe what features would you like to have in Git, 
    if they are not present on the list above (in previous question)

We can check here whether people who marked 'other, specify below'
actually provided an extended answer, or forgot to fill it in.
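
That check could look roughly like this.  The sketch assumes each
response is a dict with the checked answers to question 17 under the
key "q17" and the free-form text of question 18 under "q18"; both key
names are invented for illustration and would have to be adapted to
the actual export.

```python
def other_without_details(responses, other_label="other, specify below"):
    """Return (number who marked 'other', number of those with empty text)."""
    marked = [r for r in responses if other_label in r["q17"]]
    missing = [r for r in marked if not r["q18"].strip()]
    return len(marked), len(missing)
```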


24. Have you tried to get help regarding Git from other people?
25. If yes, did you get these problems resolved quickly and to your
    liking?

Those two would be merged into a single question next year, if there
is a Git User's Survey 2011.


26. What channel(s) did you use to request help?
27. Which communication channel(s) do you use?
    Do you read the mailing list, or watch IRC channel?

Those questions are about different things, but answers are probably
correlated to some extent.

-- 
Jakub Narebski