UEDashboard: New Research Prototype to Detect Unusual Events in SVN Repositories

We are very pleased that we had a paper accepted at the FSE 2015 Tool Demo Track :-)

Our paper presents UEDashboard [demo] [source], a tool which automatically detects unusual events in a commit history based on metrics and smells, and surfaces them in an event feed. This tool was developed as part of my Bachelor's thesis project. It currently supports SVN repositories, and a version for Git is in the works.

The goal of UEDashboard is to increase awareness in development teams. Being aware of unusual events can help prevent errors, and it can alert developers and managers to events that may require justification or that can affect the work of others, especially when the events relate to significant changes to the project.

Our first step was to establish what is unusual in a given context, considering team size, work dynamics, product size, development model, etc. UEDashboard takes context into account by collecting commit-related metrics, such as size, complexity, and time since last commit, and by considering the mean value for these metrics per developer as the "usual" value. "Unusual" values are detected if they differ from the mean value by more than two standard deviations.
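To make the two-standard-deviation rule concrete, here is a minimal Python sketch. It is not UEDashboard's actual code: the function names, the data layout, the use of the sample standard deviation, and the example numbers are all assumptions made for illustration.

```python
from statistics import mean, stdev


def build_baselines(history):
    """Compute a (mean, standard deviation) pair per developer.

    `history` is a list of (developer, metric_value) tuples, e.g. the number
    of lines added in each past commit. This layout is hypothetical.
    """
    values_by_dev = {}
    for developer, value in history:
        values_by_dev.setdefault(developer, []).append(value)
    # A sample standard deviation needs at least two data points, so
    # developers with a single commit get no baseline.
    return {
        developer: (mean(values), stdev(values))
        for developer, values in values_by_dev.items()
        if len(values) >= 2
    }


def is_unusual(value, baseline, threshold=2.0):
    """Return True if `value` is more than `threshold` standard deviations
    away from the developer's mean (in either direction)."""
    mu, sigma = baseline
    return abs(value - mu) > threshold * sigma


# Made-up history: lines of code added per commit.
history = [
    ("alice", 10), ("alice", 12), ("alice", 8), ("alice", 11), ("alice", 9),
    ("bob", 100), ("bob", 300), ("bob", 200),
    ("carol", 40),
]
baselines = build_baselines(history)

print(is_unusual(50, baselines["alice"]))  # True
print(is_unusual(50, baselines["bob"]))    # False
```

The same 50-line commit is flagged for one developer and not for the other, which is the point of computing the "usual" value per developer rather than for the whole team.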

We evaluated our approach by showing usual and unusual commits detected by UEDashboard in the commit history of Superintendência de Informática (SINFO) - a Brazilian software company inside UFRN - to six of the company’s developers. We applied UEDashboard to the 215 commits made by SINFO’s development team in March 2015, and we used additional data from the commits between August 2014 and February 2015 for the calculation of historic means and standard deviations. The number of occurrences of each event type detected in our analysis is listed below:

Event                                        Occurrences
Long time between commits                    13
Long time between commits to a file          0
Large number of files added                  10
Large number of files changed                17
Large number of files deleted                3
Large number of lines of code added          8
Large number of lines of code changed        21
Large number of lines of code deleted        14
Large number of methods added                19
Large number of methods changed              20
High cyclomatic complexity                   10
Low cyclomatic complexity                    3
Files changed by many different developers   43
Many modifications in a single file          65

In semi-structured interviews, we presented each participant with four of their own commits and the corresponding issues from the company’s issue tracking system: two commits UEDashboard had classified as unusual and two commits the tool had not classified as unusual. The participants rated each corresponding task on a five-point Likert scale in terms of how (i) difficult, (ii) critical, and (iii) typical it was. They found unusual commits to be significantly more difficult than usual commits.

In addition, participants highlighted the usefulness of the approach: "It makes sense, it would be useful for the team to understand how the code is evolving, who is changing what, what classes are more modified than others. It could also help enhance planning and could potentially increase code quality". Developers emphasized that unusual events may be useful for justification: "Perhaps if a developer takes very long in one task the events can be used as justification as to why it took a long time and how complex or big the change was."

Participants also mentioned that the information provided by UEDashboard can be used as a discussion starter or a meeting agenda: "It would be useful to be aware of unusual events from other developers. Since my tasks are related to bug fixing, the more information I have about past commits, the better. If I notice a strange modification or many modifications I can promptly talk to the developer about it." Unusual events of new members on the team might be particularly interesting: "I also find it very useful to know how long it has been since the last commit from a developer. As a manager, it is a way to look closer to what newcomers are doing, and if it takes a long time for a newcomer to make a commit, it can indicate that they are accumulating a lot of changes before a commit or that the task is too complex for them. The information would be useful in the meetings, since I could question and talk to them about their tasks without being too passive and waiting for them to tell me something." Another developer added: "It is good for everyone to know what the others are doing. We have meetings twice a week, but we don’t go into details and we don’t have the information the tool provides."

The first step of our future work lies in expanding the detection of unusual events to other development artifacts beyond commits, in particular issues and pull requests. This will also enable us to draw on connections between these artifacts, e.g., a high number of added files in a commit related to a new use case might be expected, but the same event would be unusual if it occurred in a commit related to a bug fix.

Please feel free to send us feedback and suggestions, and if you want to collaborate, just drop us an email (larissaleite@lcc.ufrn.br) :-) You can find UEDashboard online at http://larissaleite.pythonanywhere.com. UEDashboard is open source, so feel free to contribute at https://github.com/coopera/UEDashboard.