Training students with open-source software

Following my recent experience with the POSSE workshop, I decided to put all of that into practice as soon as possible. My setting was an operating systems lab, which consisted of a 2-hour class per week. The course has two parts: first we cover basic Linux principles and commands, followed by some shell scripting. The second part is about concurrent programming. In such a short course, students are introduced to basic threading and synchronization. Although it touches only basic concepts, I recognize that the second part is harder than the first one. Concurrent programming is hard in itself. Exposing second-year students to concurrent programming is even harder. Because of that, I decided to leverage open-source only in the first part of the course.

Since the first part was about Linux and shell scripting, I motivated students to work with funcoeszz, a long-running, fairly active, Brazilian open-source project. More interesting for our purposes is the fact that this project is mostly written in shell script. Another important decision here was to focus on a Brazilian open-source project. Before students joined the project, I thought that they might benefit from having a community of developers that speak the same language. It turns out that students benefited a lot from sharing the same language. Still regarding the selection of the open-source project, I didn’t pay much attention to whether funcoeszz had an active community or not. Again, it turns out that one of the project maintainers (@itamarnet) is very active. He answered most (all?) of the questions that students raised in a timely manner (e.g., many questions were answered on the same day!). A low language barrier and an active community were key (if unplanned) factors.

Students worked in groups. They were told to follow several “steps for contributing”: (1) create a blog post about the given open-source project, (2) report bugs, (3) improve documentation, and (4) implement a change to the source code. In my opinion, steps 1-3 were required because, if students start by implementing changes, they might face problems that demotivate them from contributing. It turns out that most of the students skipped steps 1-3 and went directly to implementing source code changes. I also asked students to find and fix an issue. This did not happen either. Instead, students proposed the changes they thought would be useful. Students had 3 weeks to perform these tasks. Ideally, in the first week students would get acquainted with open-source and the pull-request model. In the second week they would try funcoeszz locally and look for bugs. In the final week, they would learn the code and propose changes. In practice, none of this happened. Students left the assignment to the very last minute.

Students had no GitHub or git background. I was not able to teach them git basics because (1) the course was too short and (2) it was not the main goal of the course, so they had to catch up by themselves. It turns out that all students used the GitHub web interface to propose changes to the source code. Even with no pull-request background, a total of 9 pull-requests were made. All of them aimed at introducing new features; no one was interested in fixing bugs. This might make sense. To fix a bug, students might need to understand more about the source code than when introducing a feature.

Regarding the contributions, I was kinda impressed to see that many of them were non-trivial. For instance, one student proposed a JSON parser, although an incomplete one. The project maintainer liked the contribution, but politely said that its incompleteness made it unlikely to be accepted. He ended up encouraging the student to keep working on the pull-request, “I hope you take this project forward” (which unfortunately did not happen). Another interesting example is a pull-request that introduces a feature that, given a person id, finds the state the person comes from. In Brazil, there is a unified id system, with 11 digits. The pull-request takes the 9th digit, looks it up in a list of states associated with each digit, and returns the states found. This contribution went through a thorough code review. After making suggestions, the project maintainer said that the contribution would be merged if the suggestions were implemented. In this case, the student did implement the suggestions, but the pull-request has not yet been accepted (the project maintainer seems to be busy at the moment). After that, the student told me about her positive perception of this process. In particular, she highlighted that she learnt a lot from the code review.
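To make the id lookup more concrete, here is a minimal sketch of the idea. It is not the student's code: the actual pull-request was written in shell script, and I use Python here only for illustration. The function name `states_for_id` and the table `STATES_BY_DIGIT` are hypothetical, and the digit-to-state table is my assumption, based on the commonly documented regional convention for the 11-digit id. The sketch also does not validate the check digits.

```python
# Assumed mapping from the 9th digit of the 11-digit id to the states
# of the region where the id was issued (not taken from the pull-request).
STATES_BY_DIGIT = {
    "0": ["RS"],
    "1": ["DF", "GO", "MT", "MS", "TO"],
    "2": ["AC", "AM", "AP", "PA", "RO", "RR"],
    "3": ["CE", "MA", "PI"],
    "4": ["AL", "PB", "PE", "RN"],
    "5": ["BA", "SE"],
    "6": ["MG"],
    "7": ["ES", "RJ"],
    "8": ["SP"],
    "9": ["PR", "SC"],
}


def states_for_id(person_id):
    """Return the candidate states for an 11-digit id (hypothetical helper)."""
    # Drop punctuation such as dots and dashes, keeping only ASCII digits.
    digits = [c for c in person_id if c in "0123456789"]
    if len(digits) != 11:
        raise ValueError("expected an id with exactly 11 digits")
    # The 9th digit sits at index 8 because indexing starts at zero.
    return STATES_BY_DIGIT[digits[8]]
```

For example, `states_for_id("123.456.789-09")` returns `["PR", "SC"]`. Note that a single digit may correspond to several states, so the lookup can only narrow the origin down to a region.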

Students also opened 9 issues. All of them reported bugs; no one proposed new features or asked questions. Students might not be aware that issues serve different purposes. Unfortunately, most of the issues had shallow descriptions. For instance, in issue 429, the student said that “the zztop function is not working on my side”. The student did not provide any details about the environment she was using. Fortunately, our project maintainer politely asked additional questions about the problem that the student was facing. It is worth noting that this problem has three sides. First, students had no idea how to report a bug. Second, this particular project did not have explicit guidelines for reporting bugs. Third, I myself thought I did not need to teach students how to report a bug, since it is fairly easy. Although there is no universal right or wrong, I recognize that all of us could have done a better job.

I acknowledge that introducing open-source software to such a course made students shift their focus to activities unrelated to operating systems, such as creating blog posts or learning GitHub. However, I think students learned several important lessons that are beyond the scope of the course. I also acknowledge that professors need to find a good balance between the goal of the course and the open-source work in order not to lose focus. In this particular instance, I think there were more benefits than drawbacks. I plan to run it again next semester. Let’s see how it goes.

My takeaways (not necessarily what I did, but what I learnt):

  • Find active open-source projects;
  • Make sure that there is at least one maintainer who can provide help;
  • Look for maintainers who speak the same language as the students;
  • Introduce yourself to the maintainer, and explain your goals;
  • Teach students the basics of contributing (if possible);
  • Put students to work on small problems first;
  • Ask for partial results; do not wait until the last minute;
  • Have fun.

PS: If you are a researcher and this topic interests you, I just published a paper on it. Happy reading.
