[GSOC] [PROPOSAL v2] Draft of proposal for "Unify ref-filter formats with other pretty formats"

I have changed my proposal according to the comments by Hariom Verma.

Improvements over v1:
1. Studied the related work more closely and learned a lot from it.
2. More details about the timeline.
3. More details about my plan.
4. Some minor changes in the other sections.

I am open to further guidance. Thanks for the suggestions.


* Unify ref-filter formats with other pretty formats

* Personal Information

Full name: Zhang Yi

E-mail: 18994118902@xxxxxxx
Tel: (+86)18994118902

Education: Wuhan University of Technology (China)
Major: Computer engineering 
Year: First-year postgraduate student

Github: https://github.com/zhanyi22333

* Synopsis

** Motivation

Git has several different implementations for formatting command output, which
causes confusion and hinders improvements to code quality.

To unify the implementations that format output for different commands, we
want to move the pretty formats onto the ref-filter formatting logic. In the
present situation, this means I need to add more ref-filter atoms so that they
can replace their pretty counterparts.

** Previous Work

  - `git for-each-ref`, `git branch` and `git tag` formats into the
ref-filter formats:

done by Karthik Nayak (GSoC 2015)

  - `git cat-file` formats and the ref-filter formats:

started by Olga Telezhnaya (Outreachy 2017-2018),
continued by ZheNing Hu (GSoC 2021),
    whose many patches are summarized in his final blog post [1],
but still not finished due to tricky performance issues

  - ref-filter formats and pretty formats:

started by Hariom Verma (GSoC 2020),
    whose many patches are summarized in his final blog post [2],
continued a bit by Jaydeep Das (GSoC 2022),
    Patch: gpg-interface: add function for converting trust level to string [3]
and continued by Nsengiyumva Wilberforce (Outreachy 2022-2023), whose work on
the "signature" atoms should be mostly finished when GSoC starts.
    Patch: ref-filter: add new atom "signature" atom [4]

PS: There seem to be no summary articles for Karthik Nayak's and Olga
Telezhnaya's work.

** What is left

Since the work on the "signature" atoms will be finished by Nsengiyumva
Wilberforce, there may be some other atoms left to unify between the ref-filter
formats and the pretty formats, but I still need to check.

If there is no work left for the ref-filter formats and pretty formats, then
there may be another command whose formatting implementation differs from
ref-filter.

** Steps

As I see it, there are four logical steps:
1. Check and find a pretty atom which has no substitute in ref-filter.
   This step decides the overall direction of the subsequent work.
   Christian Couder informed me that I can do things like the following:
   - making sure that all the atoms in the pretty formats have similar
   atoms implemented in the ref-filter formats
   - find a way to convert any string containing pretty format atoms to
   a string containing only ref-filter format atoms
   - find a way to plug-in the ref-filter code into the pretty code, so
   that callers of the pretty code would not need to be changed much.
2. Add reasonable test scripts and possibly documentation in advance.
   In my opinion, drafting test scripts and documentation in advance helps me
   understand the behavior that I need to implement. I learned this way of
   working from a book, and I have run into problems caused by misunderstanding
   the required behavior, which led to a lot of rework.
3. Change code.
   Inspired by Hariom Verma's proposal, I can start by looking at what actually
   needs to be replaced, for example by studying the PRETTY FORMATS section in
   'man git-log' and checking which verbs in the ref-filter
   ('man git-for-each-ref') achieve the same thing. Then I can research how
   one format is implemented in 'pretty.c', and see how a similar thing using
   the ref-filter is implemented in 'ref-filter.c'.
4. Recheck documents and run test scripts.
   This is a necessary step to verify the behavior of the code.
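
The second idea in step 1, converting a string of pretty format atoms into a
string of ref-filter atoms, could be sketched as below. This is only an
illustration in Python, not Git code: the table `PRETTY_TO_REF_FILTER` and the
function `convert_pretty_format` are hypothetical names, and the placeholder
pairs are an approximate reading of 'man git-log' and 'man git-for-each-ref'.
Even for these pairs, the exact output of the two machineries may differ and
would have to be confirmed by tests.

```python
# Approximate correspondence between a few pretty placeholders (git-log)
# and ref-filter atoms (git-for-each-ref). This table is an assumption made
# for illustration; it is not taken from Git's source code.
PRETTY_TO_REF_FILTER = {
    "H": "%(objectname)",
    "h": "%(objectname:short)",
    "T": "%(tree)",
    "P": "%(parent)",
    "an": "%(authorname)",
    "ae": "%(authoremail:trim)",
    "cn": "%(committername)",
    "ce": "%(committeremail:trim)",
    "s": "%(contents:subject)",
    "b": "%(contents:body)",
    "n": "%0a",  # ref-filter writes a newline as a hex escape
    "%": "%%",   # a literal percent sign is spelled the same way
}


def convert_pretty_format(fmt):
    """Rewrite a pretty format string using ref-filter atoms only."""
    out = []
    i = 0
    while i < len(fmt):
        if fmt[i] != "%":
            out.append(fmt[i])
            i += 1
            continue
        # Try the longer placeholder name first, e.g. "an" before "a".
        for length in (2, 1):
            name = fmt[i + 1:i + 1 + length]
            if len(name) == length and name in PRETTY_TO_REF_FILTER:
                out.append(PRETTY_TO_REF_FILTER[name])
                i += 1 + length
                break
        else:
            # No known substitute: this is an atom that still needs work.
            raise ValueError("no ref-filter equivalent known at %r" % fmt[i:i + 3])
    return "".join(out)
```

A placeholder that raises `ValueError` here is a candidate for the "pretty atom
which has no substitute in ref-filter" that step 1 is looking for.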


* Benefits to Community

I'm willing to stay around after the project. By that time, I will be in my
second year, with no classes, and so far my tutor has been open to my request
to get involved in an open source project. Considering both my own motivation
and these circumstances, I think it is very likely that I will stay around.

In particular, I would like to become a co-mentor if I am capable of it. There
may be some difficulties, but what I have learned from my limited experience is
that you should not turn down something positive just because of difficulties
you imagine. A brand-new job may be hard, but it can show me what is possible
in the world and change the way I think.

What's more, I tried to persuade a schoolmate, who I think is rather obsessed
with technology, to take part in an open source community, both for his own
growth and for companionship. I failed, because he thinks it is hard. It is
always hard to change other people's deep-rooted ideas with words, but I think
actions speak louder than words. Maybe after the project, I can change the
minds of people around me about joining an open source community. This may
bring no visible benefit to the Git community, but it should benefit the open
source community as a whole.

* Microproject

t9700: modernize test scripts [5]

The microproject patches have been merged. The merge info is as below:

commit 8760a2b3c63478e8766b7ff45d798bd1be47f52d
Merge: a2d2b5229e 509d3f5103
Author: Junio C Hamano <gitster@xxxxxxxxx>
Date:   Tue Feb 28 16:38:47 2023 -0800

    Merge branch 'zy/t9700-style'

    Test style fixes.

    * zy/t9700-style:
      t9700: modernize test scripts

* Plan

** Timeline and deliverables

The official GSoC coding period runs from 05-29 to 08-28, which is 13 weeks.
The period from 06-05 to 06-30 is near the end of the semester, when I have
many classes, so I expect to be less productive then. Time would be a bit
tight if I only followed the official timeline, so it seems necessary to do
some work in advance.
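
The week numbers used in the timeline can be checked with a little date
arithmetic. The sketch below assumes that all dates are in 2023 (the proposal
itself only gives month and day) and treats the end date as exclusive;
`gsoc_week` is a hypothetical helper name.

```python
from datetime import date

# Assumption: the month-day pairs in this proposal refer to the year 2023.
CODING_START = date(2023, 5, 29)
CODING_END = date(2023, 8, 28)

# Length of the coding period in whole weeks.
total_weeks = (CODING_END - CODING_START).days // 7


def gsoc_week(day):
    """Return the 1-based coding week that contains `day`."""
    if not CODING_START <= day < CODING_END:
        raise ValueError("date is outside the coding period")
    return (day - CODING_START).days // 7 + 1
```

For example, `gsoc_week(date(2023, 6, 5))` and `gsoc_week(date(2023, 6, 30))`
give the first and last week of the inactive period.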

1. Preparatory work
 Period:
  04-01 ~ 05-28
  about 8 weeks
 Tasks:
  1. Decide which parts need work and which have priority.
  2. Read Hariom's blogs.
  3. Try to understand the formatting logic behind pretty and ref-filter.
  (Maybe try gdb?)
  4. Try to make some trial changes.

2. Write draft of documents and test scripts.
 Period:
  05-29 ~ 06-02
  week 1
 Tasks:
  Based on the preparatory work, write drafts of doc and test.
 Deliverables:
  Drafts of documents and test scripts

3. Inactive Period
 Period:
  06-05 ~ 06-30
  week 2~5
  4 weeks
 Tasks:
  1. Build the foundation for the later work, such as the atoms.
  2. The result should pass some specific tests.
 Deliverables:
  A new atom

4. Active code period 1
 Period:
  07-03 ~ 07-07
  week 6
 Tasks:
  1. Add a new argument and grab functions for the atoms
  2. The code needs to pass the tests and match the documents
 Deliverables:
  A new argument and its grab function

5. Midterm evaluation
 Period:
  07-10 ~ 07-14
  week 7
 Tasks:
  1. Submit the midterm evaluation
  2. Possibly continue the work left over from the previous week
 Deliverables:
  midterm evaluation

6. Active code period 2
 Period:
  07-17 ~ 08-04
  week 8~10
  3 weeks
 Tasks:
  1. Add 2~3 new arguments
  2. The code also needs to pass the tests and match the documents.
  3. Drafts of documents and test scripts should be updated.
 Deliverables:
  1. New arguments
  2. Documents
  3. test scripts

7. Finishing touches
 Period:
  08-07 ~ 08-26
  week 11~13
  3 weeks
 Tasks:
  1. There will probably be some bugs to fix or work left over.
  2. This period also serves as a buffer for unexpected events.
  3. Submit final work product and final mentor evaluation.
 Deliverables:
  1. final work product
  2. final mentor evaluation


* Lessons from related work
** From Hariom Verma's blog
Reading through Hariom Verma's blog posts, I found many useful things.

*** Debugging

An extremely informative (step-by-step) debugging guide by Christian [6].

*** 11 questions for understanding someone's work. [7]

1. What was the goal of each patch?
2. Which approach did she take to achieve the goal?
3. What were the goals of the patch series?
4. Which approach did she take to achieve the goals?
5. What was the goal of her previous patch series?
6. What was the general direction her patch series were going?
7. Why did she take that direction?
8. Are there ways to continue in the same direction?
9. Are there ways to achieve similar goals?
10. How were her goals similar to and different from the goals in my proposal?
11. Is it possible to use the same approach?

*** Other

There are many details about his work progress. I can refer to them when I am in
similar situations.

** From ZheNing Hu's blog

*** Time analysis

Use performance testing tools to analyze the time-consuming steps of
`git cat-file --batch`.

Using Google's `gperftools`:
1. Add the link parameter `-lprofiler` in `config.mak`: `CFLAGS += -lprofiler`.
2. `make`.
3. Use `CPUPROFILE=/tmp/prof.out /<path>/git cat-file --batch-check
--batch-all-objects`
to run git and generate `prof.out`, which contains the results of the
performance analysis.
4. Use `pprof --text /<path>/git /tmp/prof.out` to display the result
in the terminal.
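
Steps 3 and 4 could be wrapped in a small script so that each profiling run is
repeatable. The sketch below is in Python and uses only the environment
variable and command lines from the steps above; `profile_commands` and
`profile_cat_file` are hypothetical helper names, the git path is a placeholder
for the binary built with `-lprofiler`, and `pprof` is assumed to be on PATH.

```python
import os
import subprocess


def profile_commands(git_path, profile_out="/tmp/prof.out"):
    """Build the environment and the two command lines from steps 3 and 4."""
    # CPUPROFILE tells the linked-in profiler where to write its samples.
    env = dict(os.environ, CPUPROFILE=profile_out)
    run_cmd = [git_path, "cat-file", "--batch-check", "--batch-all-objects"]
    report_cmd = ["pprof", "--text", git_path, profile_out]
    return env, run_cmd, report_cmd


def profile_cat_file(git_path, profile_out="/tmp/prof.out"):
    """Run the profiled git once and return pprof's text report."""
    env, run_cmd, report_cmd = profile_commands(git_path, profile_out)
    # The object listing itself is not needed, only the profile file.
    subprocess.run(run_cmd, env=env, stdout=subprocess.DEVNULL, check=True)
    result = subprocess.run(report_cmd, capture_output=True, text=True,
                            check=True)
    return result.stdout
```

`profile_cat_file` has to be called from inside a repository, because
`--batch-all-objects` walks the object database of the current repository.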

*** About GitHub CI

See "GitHub-Travis CI hints" in Documentation/SubmittingPatches.

*** Other

He also describes his debugging and optimization process in detail. It is worth
digging into when I need it.

This proposal draft benefits greatly from the work of my predecessors. Thanks.

* Biographical information

It is always funny to recall that I first learned about Linux in a simulated
hacker game in my freshman year in college. After that, I tried to teach myself
Linux and started to learn about open source projects. After overcoming many
difficulties, I finally gained some basic knowledge of Linux. As a side effect,
I became more enthusiastic about programming, and better at it, than my
schoolmates. But then a period of stagnation came: I wrote meaningless projects
for school assignments and repeated myself without making progress. On the
bright side, I got to know excellent open source software during that time,
such as Vim, Emacs, Visual Studio Code, Qt, VLC and, of course, Git. Near the
end of my junior year, I read an article about learning by contributing to an
open source project, written by a geek in the Emacs community. At almost the
same time, I learned about GSoC and wanted to take part with Git. But it was
close to the date I had planned to start preparing for the postgraduate
qualifying examination, so I postponed my GSoC plans. Luckily, I passed the
examination. After I got used to life as a postgraduate student, I felt the
motivation to make progress again, so I tried to contribute to Git. I have now
finished a microproject, which may seem trivial, but it really gave me a deeper
understanding of open source and free software and more motivation to
contribute. I hope I can stay here for a long time before getting involved with
other interesting projects, since quality is more important than quantity.
I know it may seem a bit stubborn to believe that contributing will lead to
progress, a belief that is also shaped by my attitude to learning. But without
action, I cannot verify that belief. So I will try to contribute for at least
one year. After that, I hope I will have a better understanding.

Sorry if the text above is messy. In short, I will try to contribute to Git
for at least one year.

* Closing remarks

It seems that blogging will help a lot with later work. I think it is worth
rebuilding my blog site on GitHub.

Thanks to Christian Couder and Hariom Verma for their help.


[1] https://public-inbox.org/git/CAOLTT8SxHuH2EbiSwQX6pyJJs5KyVuKx6ZOPxpzWLH+Tbz5F+A@xxxxxxxxxxxxxx/
[2] https://harry-hov.github.io/blogs/posts/the-final-report
[3] https://public-inbox.org/git/pull.1281.git.1657202265048.gitgitgadget@xxxxxxxxx/
[4] https://public-inbox.org/git/pull.1452.git.1672102523902.gitgitgadget@xxxxxxxxx/#t
[5] https://lore.kernel.org/git/20230222040745.1511205-1-18994118902@xxxxxxx/
[6] https://public-inbox.org/git/CAP8UFD3Bd4Af1XZ00VyuHnQs=MFrdUufKeePO1tyedWoReRjwQ@xxxxxxxxxxxxxx/
[7] https://harry-hov.github.io/blogs/posts/week1-the-ten-questions



