[Outreachy][proposal]: Finish adding a 'os-version' capability to Git protocol v2


 



Hello Git Community,

I hope this mail finds you well.

This is my proposal for the project "Finish adding a 'os-version'
capability to Git protocol v2" for the Outreachy internship program 2024.

I appreciate any feedback on this proposal.

---------<8----------<8----------<8----------<8----------<8----------<8

Personal Information
================

Full Name: Usman Akinyemi
Email: usmanakinyemi202@xxxxxxxxx
Personal Blog: https://uniqueusman.hashnode.dev/
Personal Website: https://uniqueusman.tech
GitHub: https://github.com/Unique-Usman
Degree: Bachelor of Technology (B.Tech) in Computer Science and
Artificial Intelligence


About Me
========

I am Usman Akinyemi. I often refer to myself as a nomadic
computer programmer, as I love being able to work on interesting
projects without being restricted to a physical location. I love the
Linux operating system, and most people already associate my name with
Linux because I advocate it to everyone every day. I learnt programming
in several places: in college, in the ALX Software Engineering program,
and through personal study. I have decent experience contributing to
open-source projects. I have contributed to systemd, the CPython
documentation, the Canonical website, pep8speaks, and the Open Science
Initiative for Perfusion Imaging (OSIPI). As a product of the
community, I value community development highly, and I have always
tried to contribute to the community in my own way. For example, I
organized a month-long program (a structured webinar series) aimed at
exposing young Nigerians to opportunities in tech. The program focused
on topics such as open-source contributions, securing internships,
career development, freelancing, data science, and an introduction to
GitHub and Linux (YouTube recording:
https://www.youtube.com/watch?v=OrAThr-84t8&list=PLBW_HlYT-kP1tUqbzavMAq-ZZQRhMkZH2&ab_channel=UsmanAkinyemi
). I have also volunteered for different communities, one of which is
DesignIT, where we train Nigerian youth in technology.


Past Experience with Git
===================
I have been a Git user for about three years now. I mainly use Git for
personal projects, group projects, and open-source contributions. I have
also had the opportunity to introduce people to Git and teach them how
to use it. I am really excited to be here in the Git community.
During the contribution stage, I became more familiar with the
community and, with its help, with how to send patches to Git. I have
also learnt a couple of things, one of which is Git contributor best
practices.


Contributions to the Git Community
===========================

I joined the Git community after I got selected for Outreachy
Contribution Phase and I have been able to send some patches to the
Git codebase with the help of the Git community while also learning.
Below is the list of my contributions:

MICROPROJECT
------------------------
- Link: https://public-inbox.org/git/pull.1805.git.git.1728192814.gitgitgadget@xxxxxxxxx/T/#u
- Merge Commit: 6487b2b
- Status: merged into next/jch

+ [PATCH v7 1/2] t3404: avoid losing exit status with focus on `git
show` and `git cat-file`
- Description: In the Git t3404 test script, I improved error
detection by restructuring command chains to ensure accurate exit
status handling, preventing missed errors from piped commands.
The exit code of the preceding command in a pipe is disregarded. So if
that preceding command is a Git command that fails, the test would not
fail. Instead, by saving the output of that Git command to a file, and
removing the pipe, we make sure the test will fail if that Git
command fails. This particular patch focuses on all `git show` and
some instances of `git cat-file`.

+ [PATCH v7 2/2] t3404: replace test with test_line_count()
- Description: Refactor t3404 to replace instances of `test` with
`test_line_count()` for checking line counts. This improves
readability and aligns with Git's current test practices.

- Remarks: Through this process, I deepened my understanding of shell
scripting and command chaining, focusing on how exit statuses affect
testing accuracy. My mentors suggested keeping commands readable and
consistent with Git's scripting standards, emphasizing simplicity and
future maintainability. The result is a more robust, reliable test
script that better aligns with Git’s best practices, improving overall
test suite integrity. Also, through this patch, I was able to
understand the workflow involved in submitting a patch to Git, which is
quite different from many other projects I have worked on. This
was a really interesting learning experience.
I also learnt about the importance of following Git’s best practices
and how to submit patches with multiple commits.

Leftoverbits
-----------------
- Link: https://public-inbox.org/git/pull.1810.v3.git.git.1729574624.gitgitgadget@xxxxxxxxx/T/#t
- Merge Commit: cfd82c9
- Status: merged into next/jch

After completing the microproject, I wanted to gain a deeper
understanding of Git’s codebase and workflow. I began looking through
leftoverbits to work on and found a suitable one. Through this, I
learned how to add tests for my code additions, which helped me
understand the process of integrating and validating changes in the
codebase.

- General Description: In this series of patches, I replaced `atoi()`
with `strtoul_ui()` and `strtol_i()` across the daemon, merge, and
IMAP components to address inadequate error handling and input
validation. `atoi()` cannot report errors: it silently returns 0 for
input that does not start with a number (such as letters), and its
behavior is undefined when the value is out of range. Either case can
result in incorrect program behavior. Now, invalid inputs trigger clear
error messages, ensuring safer parsing and preventing malformed
responses. I updated tests for the daemon and merge components to
verify these improvements, while the IMAP changes didn't include tests
since none existed for `git-imap-send`. Overall, this update
strengthens input validation and code reliability.
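
To illustrate the difference, here is a minimal sketch of strict
integer parsing in the spirit of `strtol_i()`. It is not Git's actual
code: `parse_int_strict` is a hypothetical name used only for this
example, and the exact checks are my assumption of what "strict" should
mean here.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not Git's implementation): parse `s` as an int.
 * Returns 0 and stores the value in `*result` on success. Returns -1 on
 * empty input, trailing characters, or a value that does not fit in an
 * int, leaving `*result` untouched.
 */
static int parse_int_strict(const char *s, int base, int *result)
{
	char *end;
	long value;

	assert(s != NULL && result != NULL);

	errno = 0;
	value = strtol(s, &end, base);
	if (errno)                              /* e.g. out of range for long */
		return -1;
	if (end == s || *end != '\0')           /* no digits, or trailing text */
		return -1;
	if (value < INT_MIN || value > INT_MAX) /* does not fit in an int */
		return -1;

	*result = (int)value;
	return 0;
}
```

With a helper like this, "42" parses to 42, while "12abc", "abc", and
an empty string are all rejected. `atoi()` would return 12, 0, and 0
for those inputs without any indication of an error.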

+ [PATCH v6 1/3] daemon: replace atoi() with strtoul_ui() and strtol_i()
Replace atoi() with strtoul_ui() for --timeout and --init-timeout
(non-negative integers) and with strtol_i() for --max-connections
(signed integers). This improves error handling and input validation
by detecting invalid values and providing clear error messages.

+ [PATCH v6 2/3] merge: replace atoi() with strtol_i() for marker size
validation
Replace atoi() with strtol_i() for parsing conflict-marker-size to
improve error handling. Invalid values, such as those containing
letters now trigger a clear error message.

+ [PATCH v6 3/3] imap: replace atoi() with strtol_i() for UIDVALIDITY
and UIDNEXT
Replace unsafe uses of atoi() with strtol_i() to improve error handling
when parsing UIDVALIDITY, UIDNEXT, and APPENDUID in IMAP commands.
Invalid values, such as those with letters, now trigger error messages
and prevent malformed status responses.

- Remarks: In this patch series, I learnt a lot of things. I learnt the
importance of splitting large changes into smaller ones for easier
review. I also learnt how to submit multiple unrelated patches
by creating a new branch from origin/master for each. Also, I
learnt how to write better commit messages. And lastly, I learnt the
importance of asking for help and integrating suggestions from the
community.

I also had the opportunity to review a patch and to answer questions
from a fellow Outreachy applicant:
https://public-inbox.org/git/CAEqABkKvbpo-8-gDpFtfNcpmiC8A5mJMkcDXfhcdNrpwMvBsDA@xxxxxxxxxxxxxx/T/#u
https://public-inbox.org/git/CAPSxiM8SjJwb6x2bhCd4xsYLiNk+KhWYna7-rZhdNGpYNV1tLg@xxxxxxxxxxxxxx/


Past experience with other communities
===============================

Systemd
------------

- I developed a new unit test framework with assertion macros, which
enhanced debugging by providing detailed error reports with file
names, line numbers, and expression values upon failure, improving
issue identification and resolution.
- I updated approximately 22 existing unit test files by modifying 403
lines of code to incorporate the new assertion macros, resulting in
improved logging details and enhanced overall test coverage and
debugging efficiency.
- I implemented the --json option for the bootctl status command and
updated the integration tests, enabling machine-readable JSON output
for comprehensive bootloader status information.

PR Link:
https://github.com/systemd/systemd/pull/31873
https://github.com/systemd/systemd/pull/31853
https://github.com/systemd/systemd/pull/31819
https://github.com/systemd/systemd/pull/31700
https://github.com/systemd/systemd/pull/31678
https://github.com/systemd/systemd/pull/31669
https://github.com/systemd/systemd/pull/31666
https://github.com/systemd/systemd/pull/32035

Python Official Documentation
-----------------------------------------
- I have contributed to improving Python's official documentation,
enhancing my Python knowledge, technical writing, and collaboration
skills in open-source.

PR Link:
https://github.com/python/cpython/pull/109696
https://github.com/python/cpython/pull/111574
https://github.com/python/docs-community/pull/96
https://github.com/python/cpython/pull/113209
https://github.com/python/docs-community/pull/97

OSIPI (Open Science Initiative for Perfusion Imaging)
-----------------------------------------------------
- Added a command-line interface to the existing 4D IVIM phantoms
generator, with detailed documentation for usage.
- Created a Python script for efficient reading and writing of NIfTI
images, improving data processing workflows.
- Dockerized the TF2.4_IVIM-MRI_CodeCollection project and implemented
a GitHub Action for automated Docker image building and testing,
ensuring consistent deployment and a streamlined CI/CD pipeline.

PR Link:
https://github.com/OSIPI/TF2.4_IVIM-MRI_CodeCollection/commit/92a80d61cfca322da49d126dcd598996fca92668
https://github.com/OSIPI/TF2.4_IVIM-MRI_CodeCollection/commit/163a187c5eaad33b55a0af0487bd9e19ca520828
https://github.com/OSIPI/TF2.4_IVIM-MRI_CodeCollection/commit/56ee7173c91c7a1dd64412d3884a3167c7514665
https://github.com/OSIPI/TF2.4_IVIM-MRI_CodeCollection/commit/e6a47211410ed57705e2ec32adb397ac6663d061
https://github.com/OSIPI/TF2.4_IVIM-MRI_CodeCollection/pull/60
https://github.com/OSIPI/TF2.4_IVIM-MRI_CodeCollection/pull/74

Canonical Docs Website
---------------------------------
I developed a solution to integrate GitHub contributor information
into Sphinx documentation templates using the GitHub API, enhancing
documentation with contributor insights.

PR Link:
https://github.com/canonical/sphinx-docs-starter-pack/pull/203#issuecomment-2018521984

LLVM
--------
I fixed a bug in Clang's Extract API for Objective-C JSON generation
and optimized the test suite within the LLVM Compiler Infrastructure.
https://github.com/Unique-Usman/llvm-project/commit/32b53cf9d0c8c0e01ce5b0e7d5c717202a98cdf5


Experience As a user of OpenSource Software
====================================

As an avid user of open-source software, my experience has been
primarily with Linux distributions, particularly Arch, which serves as
my primary operating system. My past usage of Ubuntu has also
contributed to my understanding of different Linux environments.
In addition, I have extensively utilized various free software such as
MySQL for database management, LibreOffice for office productivity,
GCC and G++ for C and C++ programming, Python for scripting and
application development, React for web development, Clang for C/C++
compilation, Git for version control and many others. These
experiences have not only enriched my software knowledge but have also
deepened my understanding of the principles and benefits of
open-source development.


---------------------------- Project Overview ----------------------------
In June 2024, a patch series was submitted to the Git mailing list
aimed at adding a new 'os-version' capability to the Git protocol v2.
This capability is designed to allow Git clients and servers to
exchange information about the Operating System (OS) they are using,
which can aid in diagnosing issues and collecting statistical data.
Following the patch submission, discussions arose regarding the
necessary improvements and issues, particularly with Windows
compatibility.
The objective of this internship is to address these outstanding
issues, implement the required improvements, and ensure the successful
integration of the 'os-version' capability into the Git protocol.

------------------- Internship Objectives and Plans -------------------
The goal of this internship is to finalize the implementation of the
'os-version' capability in Git protocol v2, as proposed in the patch
series sent to the Git mailing list in June 2024. This enhancement
will allow Git clients and servers to advertise their operating
systems (OS), aiding in diagnostics and data collection.

Detailed Tasks and Steps
====================

Review the Current Patch Series
--------------------------------------------
1. Examine the Patch: Thoroughly analyze the existing patch series
submitted to the Git mailing list. Understand its design and
functionality, focusing on:
   -  How the OS information is gathered and transmitted.
   -  Current configurations and their implications on data transmission.
2. Feedback Analysis: Collect feedback from the Git mailing list
discussion regarding the patch. Identify key concerns, especially
related to:
    - Privacy issues.
    - Default behavior expectations.
    - Cross-platform compatibility.
3. Consider User-Agent Integration: Investigate the suggestion to
integrate the 'os-version' data into the existing user-agent string
rather than creating a new capability. Evaluate:
    - The implications of combining this data with the user-agent.
    - How this approach might address concerns about telemetry and user privacy.

Implement Default Behavior for 'os-version'
----------------------------------------------------------
1. Modify Default Configuration: Adjust the implementation so that by
default, only the OS name (e.g., "Linux" or "Windows") is sent during
communications.
2. Impact Assessment: Evaluate how this change impacts existing users
and any potential performance implications.

Introduce a Configuration Variable
---------------------------------------------
1. Define Configuration Options
    - Disable Option: Allow users to disable the 'os-version'
capability entirely via configuration.
    - Verbose Option: Enable a verbose mode that sends detailed OS
information (e.g., the output of the uname -srvm command).

2. Documentation: Improve the documentation outlining how to enable,
disable, and configure the 'os-version' capability. Include examples
for:
    - Basic usage (default OS name).
    - Detailed usage (full OS version information).
3. Implementation: Code the configuration settings and ensure they are
recognized by the Git system.
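
The three behaviors above (disabled, OS name only, verbose) can be
sketched as follows. This is only an illustration under my own
assumptions: the `os_info_mode` enum and `format_os_info()` are
hypothetical names, not part of the existing patch series, and the name
of the actual configuration variable is still to be decided with the
community. The sketch relies on the POSIX `uname()` call, so Windows
would need a different way to obtain the same information.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

/* Hypothetical modes mirroring the options described above. */
enum os_info_mode {
	OS_INFO_OFF,     /* send nothing */
	OS_INFO_NAME,    /* default: OS name only, e.g. "Linux" */
	OS_INFO_VERBOSE  /* same fields as `uname -srvm` */
};

/*
 * Write the OS description for `mode` into `buf` (of size `len`).
 * Returns the number of characters written (0 when disabled), or -1
 * if uname() fails or the buffer is too small.
 */
static int format_os_info(enum os_info_mode mode, char *buf, size_t len)
{
	struct utsname u;
	int n;

	assert(buf != NULL);
	if (len == 0)
		return -1;
	buf[0] = '\0';

	if (mode == OS_INFO_OFF)
		return 0;
	if (uname(&u) < 0)
		return -1;

	if (mode == OS_INFO_NAME)
		n = snprintf(buf, len, "%s", u.sysname);
	else
		n = snprintf(buf, len, "%s %s %s %s",
			     u.sysname, u.release, u.version, u.machine);

	if (n < 0 || (size_t)n >= len) /* error or truncated output */
		return -1;
	return n;
}
```

Keeping the mode decision in one small function would let the default
stay minimal while the verbose output remains opt-in.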

Fix Cross-Platform Tests
---------------------------------

1. Identify Issues and Add Tests for Changes/Additions: Investigate
existing test failures, particularly those occurring on Windows.
    - Review the test logs and identify the root causes of failures.
    - Analyze differences in OS behaviors and how they affect the tests.
    - Add cross-platform tests to validate the functionality on Linux,
Windows, and macOS environments.
2. Implement Fixes:
    - Modify tests to ensure they run correctly on Windows,
addressing any compatibility issues with the test framework or Git
commands.
    - Ensure all tests reflect the changes made to the OS reporting
capabilities.

Testing and Validation
------------------------------
Ensure comprehensive test coverage, including default behavior,
configuration options, and edge cases. Integrate the tests into the Git
CI pipeline for automatic execution, and share results with the
community for feedback on robustness and additional scenarios.

Documentation Updates
---------------------------------

1. User Documentation: Update the Git documentation to include:
    - Instructions on how to configure the feature, with practical examples.
    - Best practices regarding data privacy when using the capability.
2. Developer Documentation: Include comments in the code for
maintainability and understanding of how the 'os-version' capability
works internally.

Prepare for Merging
----------------------------
1. Final Review: Conduct a thorough review of all code, tests, and
documentation. Ensure everything aligns with Git’s contribution
standards.
2. Engagement with Community: Present the finalized patch to the Git
mailing list, addressing any additional concerns raised during the
discussions.
3. Merge Process: Coordinate with the maintainers for merging the
patch into the main branch, ensuring all feedback has been
incorporated.


-------------------------------- Timeline --------------------------------

Community Feedback and Finalization
=============================
Dates: November 26 - December 8
Engage with the Git community to gather input, especially on privacy
concerns and minimal data sharing. Determine the default behavior
(sharing only the OS name) and finalize whether to use the existing
"user-agent" or a separate identifier (os-version) in the protocol.

Minimal Default Implementation
========================
Dates: December 9 - December 20
Implement the core feature to share only the OS name by default,
keeping data minimal as per feedback.

Configurable Options for OS Version
============================

Dates: December 21 - December 30
Develop settings to allow users to disable OS data sharing or choose
verbose mode (e.g., uname -srvm output).

Cross-Platform Testing (Focus on Windows)
==================================

Dates: December 31 - January 13
Conduct robust testing across platforms, addressing prior Windows
compatibility issues.

Beta Testing and Community Feedback
==============================
Dates: January 14 - January 27
Release for beta testing, integrate feedback, and refine functionality
based on real-world use.

Documentation
============
Dates: January 28 - February 10
Document feature usage, configuration options, and setup instructions
for smooth adoption.

Final Review and Merge
===================
Dates: February 11 - March 6
The final review phase will include presenting the completed work to
the Git community for a thorough final assessment. Any remaining
concerns or suggestions will be addressed before the patch is prepared
for merge. This stage will allow for further feedback, particularly
from stakeholders and maintainers who raised the initial questions,
ensuring the solution is acceptable to the broad Git community. Once
consensus is achieved, the patch will be merged into the Git mainline
codebase, concluding the project.


Availability
========
I will be available to work for the required minimum of 30 hours per
week during the internship period and will be happy to extend this if
required.

Blogging
=======
I also plan to write a blog post every two weeks to track my
progress, give updates about what I am currently working on, and serve
as documentation for future contributors.

Post Outreachy Internship
====================
One of my dreams is to be an active member of an open-source community
which I can proudly support and contribute to. Continuing my
contributions after the internship is a big part of making that dream
a reality. I’m committed to contributing to Git long-term, helping to
improve the project and supporting new contributors along the way.

Appreciation
==========
I really appreciate the support and guidance I got from the Git
community. I also appreciate all the effort from the Outreachy mentors.
Thanks for your time.

Thank you.
Usman Akinyemi.




