Hello, guys.

I posted my GSoC application yesterday. Sorry for not discussing it with you on the list first. I have been a bit busy these past weeks but, to be honest, not with this project, so I could probably have written the same proposal two weeks ago and then revised it depending on the feedback. Anyway, the application is now written and sent to Google, so there is nothing wrong with also sending it to the list, right?

Last year, when I participated in GSoC for the first time, I started working on my project very early, actually before the application deadline. This was probably the best decision I made during that Summer of Code. As this project seems to be at least as complex as last year's, I'd like to start early again, if possible. So if you like the application, I'd probably take the risk of wasting two weeks of work in case it is not accepted in the end.

=========================================================

Subject: A C code "linker" based on sparse, required to build an advanced static analyser.

Abstract:

Right now, sparse is mostly limited to type checking and to flagging questionable/dubious C constructs. Yet static code analysis can be used to perform a much wider range of checks. To build an advanced checker, a way to resolve inter-module references is needed. Although my tool for creating the symbol database will work on the source code and not on compiled object files, I'd call it a C linker, by analogy with the tasks performed by a binary linker. As time allows, I'd also like to share my thoughts on actually creating an advanced checker.

Description:

Currently, I'm thinking about a static analyser based on abstract interpretation, mainly to be used to check for buffer overruns and NULL-pointer dereferences. Yet my ideas are not yet ready to form a proposal for a Summer of Code project, and writing a working implementation would probably also be too big for one.
Since at least one person besides me is working on an advanced checker and needs a C code linker, I thought it would be a good thing to actually write one. Even if my ideas for the overall project are not yet fully shaped, the linker is a separate, self-sufficient piece that can already be written.

So, as I understand it, the linker should be run by the build system as a replacement for $LD and should create a database of all the symbols it meets, with references to their locations in the .c files. (Well, we'd probably have to copy the .c files to .o so that we can actually replace $LD in the build system.)

Another way might be to convert the source code to some generic intermediate representation and actually link it into one big "object". While I am considering this approach, I currently think it might complicate further development of the analyser, as people are not very good at reading and understanding intermediate code, which would be needed at the debugging stage.

Besides collecting the symbols from the sources, the linker should also support linker scripts, to actually allocate the symbols. This is needed to reduce the number of false positives from constructs that traverse linking tables, which are usually used for module initialisation.

Since the result of running sparse on a .c file is already a symbol list (with all the details about each symbol's body), it should be relatively easy to create a database containing the needed information. So I hope there will be some time left to actually do the more interesting work.

As a minimal result of the project, I see a fully working linker, as described above. If some of the sparse developers see it working in a different way, I'll probably implement both modes.

=========================================================

Also, Greg KH already asked a question in the webapp, so I'll copy it here.

=========================================================

04/08/08 05:04 Greg Kroah Hartman

I like this proposal.
What kind of background do you have that suggests that you will be able to achieve this kind of goal within the timeframe of the GSoC process?

04/08/08 06:04 Alexey Zaytsev

You mean the advanced analyser or the linker?

If the linker, uh, well. I actually consider it an easy task and hope to complete it within a month at most. I have a good understanding of how a linker works. I have written quite a few ld scripts and have no problems understanding what is written in the "Linkers and Loaders" book. From the technical side, the project is also not very hard. The main sparse(file) function already returns a list of symbols exported from the parsed file, so you basically have to run sparse() on all the input files (assuming the .c files were copied to .o) and collect all the resulting symbols. One not-completely-trivial thing might be the linker script parsing, but that does not seem too hard to me either. So, if anything, I fear this project is too simple for the Summer of Code. But I hope the mentors will trust that after completing the linker I will continue working on my advanced analyser, or join a collaborative effort, should one arise.

If you mean the creation of the advanced analyser (whatever this could mean ;), I don't claim it as a definite goal for this summer. I never got a strong theoretical computer science/mathematical education, partly because of my own irresponsibility, so I'm probably not the right person for this task. But I've got some ideas that I hope to write down some day. They seem simple enough to me, and right now I'm looking through the papers published on the subject, mainly to make sure I'm not reinventing the wheel. So I hope to plan a better project in a month or two, and it is entirely possible that you will see some practical results. I just don't make it a goal for this Summer of Code, so as not to disappoint anyone.
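
To make the collection step concrete, here is a minimal sketch of it, written in Python rather than C only to keep it short. It does not use the sparse API: the Symbol record and the link() function are hypothetical stand-ins for whatever sparse(file) really returns, and treating any second definition as a duplicate is a simplifying assumption (static, weak and tentative definitions are ignored).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    """Hypothetical stand-in for one entry of a per-file symbol list."""
    name: str
    file: str
    line: int
    defined: bool  # True for a definition, False for an extern reference


def link(per_file_symbols):
    """Merge per-file symbol lists into one cross-module database.

    Returns (definitions, unresolved, duplicates):
      definitions: name -> the first defining Symbol seen
      unresolved:  name -> references that no input file defines
      duplicates:  name -> later definitions of an already defined name
    """
    definitions = {}
    references = defaultdict(list)
    duplicates = defaultdict(list)

    for symbols in per_file_symbols:
        for sym in symbols:
            if not sym.defined:
                references[sym.name].append(sym)
            elif sym.name in definitions:
                duplicates[sym.name].append(sym)
            else:
                definitions[sym.name] = sym

    # A reference is resolved if any input file defines the name,
    # regardless of the order in which the files were processed.
    unresolved = {name: refs for name, refs in references.items()
                  if name not in definitions}
    return definitions, unresolved, dict(duplicates)


# Two made-up translation units: a.c calls helper(), b.c defines it.
a_c = [Symbol("main", "a.c", 3, True), Symbol("helper", "a.c", 1, False)]
b_c = [Symbol("helper", "b.c", 5, True), Symbol("missing", "b.c", 2, False)]

defs, unresolved, dups = link([a_c, b_c])
```

After this runs, defs maps "helper" to its definition in b.c, and unresolved contains only "missing". A real implementation would also have to honour linkage and the placement rules of the linker script, which this sketch leaves out entirely.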

=========================================================

P.S. Please copy any essential questions to the webapp, or ask people with the proper access rights to do so. I hope the on-list discussion will not replace the formal process, but accelerate it.