Dr. Romain Robbes

I am an Assistant Professor in the Computer Science Department of the University of Chile.
My research focuses on software engineering, software maintenance and evolution.

What's New

03/10
  • This website
  • paper accepted at TOOLS 2010
  • Reviewer, IEEE TSE
  • PC member, ICSM 2010, ERA track
  • PC member, ICPC 2010, Tool Demos
02/10
  • paper accepted at MSR 2010 (1)
  • paper accepted at MSR 2010 (2)
  • paper accepted at Web2SE 2010
  • Nominated for the GI Dissertationspreis 2009
01/10
  • paper accepted in the J. ASE
  • Started a new job at U. Chile
12/09
  • Reviewer for WASDeTT 2010
  • paper accepted at ICSE 2010
11/09
  • Reviewer for SEAA 2010, EDISON track
10/09
  • Best Paper Award at WCRE 2009

Contact

Address:
Romain Robbes
DCC, University of Chile
Blanco Encalada 2120, Off. 308
837-0459 Santiago
Chile
  • Phone: +56 2 978 49 74
  • Fax: +56 2 689 55 31
  • Email:
  • Skype: romain.robbes

Research

I work in Software Engineering, Software Maintenance and Evolution, Reverse Engineering, and Mining Software Repositories.
Change-based software evolution
Mining Software Repositories is a very active research area, in which researchers exploit the historical artifacts that are produced during the development of a system in order to assist its future development.

My Ph.D. work stems from the observation that the software repositories used in MSR are ad hoc. They were designed to solve problems other than the ones MSR addresses and, as such, have issues that MSR researchers need to work around.
In my dissertation, entitled "Of Change and Software", I introduced a model of software evolution placing change at the center of the evolution instead of the more common concept of the version. This model is known as Change-Based Software Evolution and its effectiveness was validated in several forward and reverse engineering applications. The following publications show examples of this work:
J.ASE 2010
MSR 2010
ICPC 2007
Ph.D. Thesis
Lile Hattori is working on a version of this model adapted to concurrent and collaborative software development.
In the meantime ...
Since Change-based software evolution is still a prototype, I also mine existing software repositories. In this area, I work with my former research group at USI in Lugano, Michele Lanza's REVEAL.

We work on bug prediction (with Marco D'Ambros), the usage of e-mails in software evolution (with Alberto Bacchelli), software ecosystems (with Mircea Lungu), visualizing software as cities (with Richard Wettel), and improving IDEs (with Fernando Olivero).

Teaching

I have taught and assisted the following courses over the years.

At the Università della Svizzera Italiana:


At the Université de Caen:


I was also an invited lecturer at the PL 2009 summer school.

Software Atelier 1
First-year bachelor course where students learnt the basic tools they needed for other courses: shell programming, version control with Subversion, LaTeX, HTML and CSS. I was assisted by Marco Primi and Simone Rollini.
Software Architecture
Master-level course about software architecture, architecture modelling languages, and UML. I assisted Prof. Cesare Pautasso.
Programming Fundamentals 1 & 2
First-year bachelor courses where students learnt the basics of programming, algorithmics, and object-oriented programming, using Scheme, Smalltalk and Java. I assisted Profs. Amy Murphy and Michele Lanza.
You can read more about this course in our ICSE 2008 publication.
Courses in France
I participated in two courses at the University of Caen, in France, for first-year and third-year bachelor students, assisting Profs. Serge Stinckwich and Anne Nicole.

Articles

Here is a list of all my publications, classified by type. You can also check DBLP, although it is not always up to date.

Where possible, I provide PDFs of the papers. I don't provide presentation slides, since good slides should be useless without the presenter.
Selected Publications:
  • ICSE 2010
  • J.ASE 2010
  • MSR 2010
  • ICPC 2007
Journals
  • Improving code completion with program history.
    R. Robbes, M. Lanza - J.ASE 2010
  • The small project observatory: Visualizing software ecosystems.
    M. Lungu, M. Lanza, T. Girba, R. Robbes - SCP 2009
  • A Change-Based Approach to Software Evolution.
    R. Robbes, M. Lanza - ENTCS 2007
Conferences
  • Visualizing Dynamic Metrics with Profiling Blueprints.
    A. Bergel, R. Robbes, W. Binder - TOOLS 2010
  • Replaying IDE Interactions to Evaluate and Improve Change Prediction Approaches.
    R. Robbes, D. Pollet, M. Lanza - MSR 2010
  • An extensive comparison of bug prediction approaches.
    M. D'Ambros, M. Lanza, R. Robbes - MSR 2010
  • Linking emails and source code artifacts.
    A. Bacchelli, M. Lanza, R. Robbes - ICSE 2010
  • Benchmarking Lightweight Techniques to Link E-Mails and Source Code.
    A. Bacchelli, M. D'Ambros, M. Lanza, R. Robbes - WCRE 2009
  • On the Relationship Between Change Coupling and Software Defects.
    M. D'Ambros, M. Lanza, R. Robbes - WCRE 2009
  • Promises and Perils of Porting Software Visualization Tools to the Web.
    M. D'Ambros, M. Lungu, M. Lanza, R. Robbes - WSE 2009
  • How Program History Can Improve Code Completion.
    R. Robbes, M. Lanza - ASE 2008
  • Example-based Program Transformation.
    R. Robbes, M. Lanza - MODELS 2008
  • Characterizing and Understanding Development Sessions.
    R. Robbes, M. Lanza - ICPC 2007
  • An Approach to Software Evolution Based on Semantic Changes.
    R. Robbes, M. Lanza, M. Lungu - FASE 2007
  • Microprints: A Pixel-Based, Semantically Rich Visualization of Methods.
    R. Robbes, S. Ducasse, M. Lanza - ESUG 2005
  • An Aspect-Based Multi-Agent System.
    R. Robbes, N. Bouraqadi, S. Stinckwich - ESUG 2004
Short conference papers and tool demos
  • Supporting Task-Oriented Navigation in IDEs with Configurable HeatMaps.
    D. Rothlisberger, O. Nierstrasz, S. Ducasse, D. Pollet, R. Robbes - ICPC 2009
  • A Teamwork-Based Approach to Programming Fundamentals with Scheme, Smalltalk and Java.
    M. Lanza, A. Murphy, R. Robbes, M. Lungu, P. Bonzini - ICSE 2008
  • SpyWare: A Change-Aware Development Toolset.
    R. Robbes, M. Lanza - ICSE 2008
  • Logical Coupling Based on Fine-grained Change Information.
    R. Robbes, D. Pollet, M. Lanza - WCRE 2008
Workshops
  • Commit 2.0.
    M. D'Ambros, M. Lanza, R. Robbes - Web2SE 2010
  • Lumiere : An Infrastructure for Producing 3D Applications in Smalltalk.
    F. Olivero, M. Lanza, R. Robbes - Famoosr 2009
  • Lumiere : a Novel Framework for Rendering 3D graphics in Smalltalk.
    F. Olivero, M. Lanza, R. Robbes - ISWT 2009
  • On the Evaluation of Recommender Systems with Recorded Interactions.
    R. Robbes - SUITE 2009
  • The "Extract Refactoring" Refactoring.
    R. Robbes, M. Lanza - WRT 2007
  • Mining a Change-Based Software Repository.
    R. Robbes - MSR 2007
  • Change-Based Software Evolution.
    R. Robbes, M. Lanza - EVOL 2006
  • Versioning Systems for Evolution Research.
    R. Robbes, M. Lanza - IWPSE 2005
  • Multi-level Method Understanding with Microprints.
    S. Ducasse, M. Lanza, R. Robbes - Vissoft 2005
Various
  • Supporting Task-Oriented Navigation in IDEs with Configurable HeatMaps.
    D. Rothlisberger, O. Nierstrasz, S. Ducasse, D. Pollet, R. Robbes - Tech Report, University of Bern, 2009
  • A Benchmark for Change Prediction.
    R. Robbes, M. Lanza, D. Pollet - Tech Report, University of Lugano, 2008
  • Towards Change-aware Development Tools.
    R. Robbes, M. Lanza - Tech Report, University of Lugano, 2008
  • Modelling Change-based Software Evolution.
    R. Robbes - ECOOP Doctoral Symposium 2007
  • Explicitly Modeling Software Change.
    R. Robbes - LASER 2006
  • Un modèle multi-agent unifiant les notions de groupe et d'aspect.
    R. Robbes, N. Bouraqadi, S. Stinckwich - JFSMA 2004
Theses
  • Of Change and Software.
    R. Robbes - Ph.D. Thesis, University of Lugano, 2008
  • Mise en Oeuvre de la Programmation par Aspects dans le Cadre des Systèmes Multi-agents.
    R. Robbes - Master's Thesis, University of Caen, 2003

Service

I am or have been involved in the organisation of the following events. I am also a member of ACM SIGSOFT.
Steering committees
Program and Track chair
Journal reviewer
  • IEEE Transactions on Software Engineering
  • IEEE Software
Conference and workshop program committees
  • ICPC 2010, Tool Demo Track
  • ICSM 2010, Early Research Achievements Track
  • WASDeTT 2010
  • SEAA 2010, EDISON Track
  • VISSOFT 2009
  • ICPC 2009, Tool Demo Track
  • SUITE 2009
  • WCRE 2008, Tool Demo Track
  • ASE 2008, Tool Demo Track
  • ECOOP 2008, Doctoral Symposium

During the evolution of a software system, a large amount of information, which is not always directly related to the source code, is produced. Several researchers have provided evidence that the contents of mailing lists represent a valuable source of information: through e-mails, developers discuss design decisions, ideas, known problems and bugs, etc., which are otherwise not to be found in the system. A technical challenge in this context is how to establish the missing link between free-form e-mails and the system artifacts they refer to. Although the range of approaches is vast, establishing their accuracy remains a problem, as there is no benchmark against which to compare their performance. To overcome this issue, we manually inspected a statistically significant number of e-mails pertaining to the ArgoUML system. Based on this benchmark, we present a variety of lightweight techniques to assign e-mails to software artifacts and measure their effectiveness in terms of precision and recall.
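
As a rough illustration of what a lightweight linking technique and its evaluation could look like, the sketch below matches class names in an e-mail body with a regular expression and scores the result against a hand-labelled set using precision and recall. The class names, the e-mail text, and the `link_email` and `precision_recall` helpers are hypothetical and are not taken from the paper.

```python
import re

def link_email(body, class_names):
    """Return the set of class names mentioned as whole words in an e-mail body."""
    found = set()
    for name in class_names:
        # \b keeps 'Project' from matching inside 'ProjectBrowser'
        if re.search(r"\b" + re.escape(name) + r"\b", body):
            found.add(name)
    return found

def precision_recall(predicted, expected):
    """Precision and recall of predicted links against manually labelled ones."""
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall

# Hypothetical data, for illustration only
classes = ["ProjectBrowser", "Project", "FigNode"]
email = "The ProjectBrowser crashes when a FigNode is deleted."
labelled = {"ProjectBrowser", "FigNode", "Project"}

links = link_email(email, classes)
p, r = precision_recall(links, labelled)
```

In this toy case every predicted link is correct, but one labelled link is missed, so recall is lower than precision.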

@inproceedings{BDLR-WCRE2009, 
	author    = {Alberto Bacchelli and Marco D'Ambros and Michele Lanza and Romain Robbes},
	title     = {Benchmarking Lightweight Techniques to Link E-Mails and Source Code},
	booktitle = {WCRE 2009: Proceedings of the 16th IEEE Working Conference on Reverse Engineering},
	year      = {2009},
	pages     = {205--214},
}		


E-mails concerning the development issues of a system constitute an important source of information about high-level design decisions, low-level implementation concerns, and the social structure of developers.

Establishing links between e-mails and the software artifacts they discuss is a non-trivial problem, due to the inherently informal nature of human communication. Different approaches can be brought into play to tackle this traceability issue, but the question of how they can be evaluated remains unaddressed, as there is no recognized benchmark against which they can be compared.

In this article we present such a benchmark, which we created through the manual inspection of a statistically significant number of e-mails pertaining to six unrelated software systems. We then use our benchmark to measure the effectiveness of a number of approaches, ranging from lightweight approaches based on regular expressions to full-fledged information retrieval approaches.

@inproceedings{BLR-ICSE2010,
	author    = {Alberto Bacchelli and Michele Lanza and Romain Robbes},
	title     = {Linking E-Mails and Source Code Artifacts},
	booktitle = {ICSE 2010: Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering},
	year      = {2010},
	pages     = {to appear}
}		


While traditional approaches to code profiling help locate performance bottlenecks, they offer only limited support for removing these bottlenecks. The main reason is the lack of visual and detailed runtime information to identify and eliminate computation redundancy.

We provide two profiling blueprints which help identify and remove performance bottlenecks. The structural distribution blueprint graphically represents the CPU consumption share for each method and class of an application. The behavioral distribution blueprint depicts the distribution of CPU consumption along method invocations, and hints at method candidates for caching optimizations. These two blueprints helped us to significantly optimize Mondrian, an open source visualization engine. Our implementation is freely available for the Pharo development environment and has been evaluated in a number of different scenarios.

@inproceedings{BRB-TOOLS2010,
	author = {Alexandre Bergel and Romain Robbes and Walter Binder},
	title = {Visualizing Dynamic Metrics with Profiling Blueprints},
	booktitle = {TOOLS 2010: Proceedings of the 48th International Conference on Objects, Models, Components, Patterns},
	year = {2010},
	pages = {to appear}
}


Software systems are hard to understand due to the complexity and the sheer size of the data to be analyzed. Software visualization tools are a great help as they can sum up large quantities of data in dense, meaningful pictures. Traditionally such tools come in the form of desktop applications. Modern web frameworks are about to change this status quo, as building software visualization tools as web applications can help in making them available to a larger audience in a collaborative setting. Such a migration comes with a number of promises, perils and technical implications that have to be taken into account before starting any migration process.

In this paper we share our experiences in porting two such tools to the web and discuss the promises and perils that go hand in hand with such an endeavour.

@inproceedings{DLLR-WSE2009, 
	author    = {Marco D'Ambros and Mircea Lungu and Michele Lanza and Romain Robbes},
	title     = {Promises and Perils of Porting Software Visualization Tools to the Web},
	booktitle = {WSE 2009: Proceedings of the 11th IEEE International Symposium on Web Systems Evolution},
	year      = {2009},
	pages     = {109--118},
}		


Reliably predicting software defects is one of software engineering’s holy grails. Researchers have devised and implemented a plethora of bug prediction approaches varying in terms of accuracy, complexity and the input data they require. However, the absence of an established benchmark makes it hard, if not impossible, to compare approaches.

We present a benchmark for defect prediction, in the form of a publicly available data set consisting of several software systems, and provide an extensive comparison of the explanative and predictive power of well-known bug prediction approaches, together with novel approaches we devised.

Based on the results, we discuss the performance and stability of the approaches with respect to our benchmark and deduce a number of insights on bug prediction models.

@inproceedings{DLR-MSR2010, 
	author    = {Marco D'Ambros and Michele Lanza and Romain Robbes},
	title     = {An Extensive Comparison of Bug Prediction Approaches},
	booktitle = {MSR 2010: Proceedings of the 7th IEEE Working Conference on Mining Software Repositories},
	year      = {2010},
	pages     = {to appear}
}		


Change coupling is the implicit relationship between two or more software artifacts that have been observed to frequently change together during the evolution of a software system. Researchers have studied this dependency and have observed that it points to design issues such as architectural decay. It is still unknown whether change coupling correlates with a tangible effect of design issues, i.e., software defects. In this paper we analyze the relationship between change coupling and software defects on three large software systems. We investigate whether change coupling correlates with defects, and whether the performance of bug prediction models based on software metrics can be improved with change coupling information.
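
To make the notion of change coupling concrete, here is a minimal sketch that counts how often each pair of files changed in the same commit. The commit data and the `change_coupling` helper are hypothetical; the measures actually used in the paper may differ.

```python
from collections import Counter
from itertools import combinations

def change_coupling(commits):
    """Count how often each pair of files changed in the same commit."""
    pair_counts = Counter()
    for files in commits:
        # sort so that each pair is always stored in the same order
        for a, b in combinations(sorted(set(files)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts

# Hypothetical commit history: each entry lists the files touched by one commit
commits = [
    ["Parser.java", "Lexer.java"],
    ["Parser.java", "Lexer.java", "Ast.java"],
    ["Ast.java"],
]
coupling = change_coupling(commits)
```

Pairs with a high count are candidates for the kind of coupling whose relationship with defects is studied here.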

@inproceedings{DLR-WCRE2009, 
	author    = {Marco D'Ambros and Michele Lanza and Romain Robbes},
	title={On the Relationship Between Change Coupling and Software Defects},
	booktitle = {WCRE 2009: Proceedings of the 16th IEEE Working Conference on Reverse Engineering},
	year      = {2009},
	pages     = {135--144},
}		


Commit comments written by developers when they submit their changes to a versioning system are useful for a number of tasks: developers write commit comments to document changes and to communicate with the rest of the development team, while researchers mine commit-related data contained in software repositories to support software evolution and reverse engineering activities. However, the support provided by IDEs is restricted in this respect, as they limit users to plain text when documenting their changes.

We present Commit 2.0, an IDE enhancement to enrich commit comments using software visualization. Commit 2.0 generates visualizations of the performed changes at different granularity levels, and lets the user annotate them.

@inproceedings{DLR-WEB2SE2010, 
	author    = {Marco D'Ambros and Michele Lanza and Romain Robbes},
	title={Commit 2.0},
	booktitle = {Web2SE 2010: Proceedings of the 1st Workshop on Web 2.0 for Software Engineering},
	year      = {2010},
	pages     = {to appear},
}		


Software evolution research has focused mostly on analyzing the evolution of single software systems. However, it is rarely the case that a project exists as standalone, independent of others. Rather, projects exist in parallel within larger contexts in companies, research groups or even the open-source communities. We call these contexts software ecosystems. In this paper we present The Small Project Observatory, a prototype tool which aims to support the analysis of project ecosystems through interactive visualization and exploration. We present a case study of exploring an ecosystem using our tool, describe the architecture of the tool, and distill the lessons learned during the tool-building experience.

@article{LLGR-SCP2010, 
	Author    = {Mircea Lungu and Michele Lanza and Tudor G\^irba and Romain Robbes},
	Title     = {The {Small Project Observatory}: Visualizing Software Ecosystems},
	journal   = {Science of Computer Programming},
	year      = {2010},
	volume    = {75},
	number    = {4},
	pages     = {264--275},
}


Software changes. Any long-lived software system has maintenance costs dominating its initial development costs as it is adapted to new or changing requirements. Systems on which such continuous changes are performed inevitably decay, making maintenance harder. This problem is not new: The software evolution research community has been tackling it for more than two decades. However, most approaches have been targeting specific maintenance activities using an ad-hoc model of software evolution.

Instead of only addressing individual maintenance activities, we propose to take a step back and address the software evolution problem at its root by treating change as a first-class entity. We apply the strategy of reification, used with success in other branches of software engineering, to the changes software systems experience. Our thesis is that a reified change-based representation of software enables better evolution support for both reverse and forward engineering activities. To this aim, we present our approach, Change-based Software Evolution, in which first-class changes to programs are recorded as they happen.

We implemented our approach and recorded the evolution of several systems. We validated our thesis by providing support for several maintenance activities. We found that:

* Change-based Software Evolution eases the reverse engineering and program comprehension of systems by providing access to historical information that is lost by other approaches. The fine-grained change information we record, when summarized in evolutionary measurements, also gives more accurate insights about a system’s evolution.

* Change-based Software Evolution facilitates the evolution of systems by integrating program transformations, their definition, comprehension and possible evolution in the overall evolution of the system. Further, our approach is a source of fine-grained data useful to both evaluate and improve the performance of recommender systems that guide developers as they change a software system.

These results support our view that software evolution is a continuous process that alternates forward and reverse engineering activities and requires the support of a model of software evolution integrating these activities in a harmonious whole.

@phdthesis{R-USI2008, 
	author    = {Romain Robbes},
	title     = {Of Change and Software},
	school = {University of Lugano},
	month = {December},
	year = {2008},
}


Code completion is a widely used productivity tool. It takes away the burden of remembering and typing the exact names of methods or classes: As a developer starts typing a name, it provides a progressively refined list of candidates matching the name. However, the candidate list always comes in alphabetic order, i.e., the environment is only second-guessing the name based on pattern matching. Finding the correct candidate can be cumbersome or slower than typing the full name.

We present an approach to improve code completion with program history. We define a benchmark measuring the accuracy and usefulness of a code completion engine. Further, we use the change history data to also improve the results offered by code completion tools. Finally, we propose an alternative interface for completion tools.

@inproceedings{RL-ASE2008, 
	author    = {Romain Robbes and Michele Lanza},
	title     = {How Program History Can Improve Code Completion},
	booktitle = {ASE 2008: Proceedings of the 23rd ACM/IEEE International Conference on Automated Software Engineering},
	year      = {2008},
	pages     = {317-326}
}		


Software evolution research is limited by the amount of information available to researchers: Current version control tools do not store all the information generated by developers. They do not record every intermediate version of the system issued, but only snapshots taken when a developer commits source code into the repository. Additionally, most software evolution analysis tools are not a part of the day-to-day programming activities, because analysis tools are resource intensive and not integrated in development environments. We propose to model development information as change operations that we retrieve directly from the programming environment the developers are using, while they are effecting changes to the system. This accurate and incremental information opens new ways for both developers and researchers to explore and evolve complex systems.
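
The idea of modelling development information as change operations can be sketched as follows. This is an illustrative assumption about how such a model might look, not the actual implementation: the `Change` record, its fields, and the `replay` function are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Change:
    timestamp: int
    kind: str         # "add", "modify" or "remove"
    entity: str       # e.g. the name of a method
    source: str = ""  # new source text, unused for "remove"

def replay(changes, until=None):
    """Rebuild program state (entity -> source) by applying changes in time order."""
    state = {}
    for change in sorted(changes, key=lambda c: c.timestamp):
        if until is not None and change.timestamp > until:
            break
        if change.kind == "remove":
            state.pop(change.entity, None)
        else:
            state[change.entity] = change.source
    return state

# Hypothetical recorded history
history = [
    Change(1, "add", "Stack>>push", "v1"),
    Change(2, "add", "Stack>>pop", "v1"),
    Change(3, "modify", "Stack>>push", "v2"),
    Change(4, "remove", "Stack>>pop"),
]
final_state = replay(history)
state_at_2 = replay(history, until=2)
```

Because every intermediate change is kept, any past state can be rebuilt by replaying the history up to a given timestamp, which a snapshot-based repository cannot offer between two commits.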

@article{RL-ENTCS2010, 
	author    = {Romain Robbes and Michele Lanza},
	title     = {A Change-Based Approach to Software Evolution},
	journal   = {Electr. Notes Theor. Comput. Sci.},
	volume    = {166},
	year      = {2007},
	pages     = {93-109},
}		


The understanding of development sessions, the phases during which a developer actively modifies a software system, is a valuable asset for program comprehension, since the sessions directly impact the current state and future evolution of a software system. Such information is usually lost by state-of-the-art versioning systems, because of the checkin/checkout model they rely on: a developer must explicitly commit his changes to the repository. Since this happens in arbitrary and sometimes long intervals, recovering the changes between two commits is difficult and inaccurate, and recovering the order of the changes is impossible.

We have implemented an evolution monitoring prototype which records every semantic change performed on a system, and is able to completely reconstruct development sessions. In this paper we use this fine-grained information to understand and characterize the development sessions as they were carried out on two object-oriented systems.
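
As a minimal sketch of how development sessions might be recovered from recorded changes, the code below groups change timestamps into sessions whenever the gap between two consecutive changes exceeds a threshold. The threshold value and the `split_sessions` helper are assumptions made for illustration; the paper's actual session definition may differ.

```python
def split_sessions(timestamps, max_gap):
    """Group timestamps into sessions; a gap larger than max_gap starts a new session."""
    sessions = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= max_gap:
            sessions[-1].append(t)
        else:
            sessions.append([t])
    return sessions

# Hypothetical change timestamps, in minutes
changes = [0, 5, 12, 200, 210, 900]
sessions = split_sessions(changes, max_gap=60)
```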

@inproceedings{RL-ICPC2007, 
	author    = {Romain Robbes and Michele Lanza},
	title     = {Characterizing and Understanding Development Sessions},
	booktitle = {ICPC 2007: Proceedings of the 15th International Conference on Program Comprehension},
	year      = {2007},
	pages     = {155-166},
}


Code completion is a widely used productivity tool. It takes away the burden of remembering and typing the exact names of methods or classes: As a developer starts typing a name, it provides a progressively refined list of candidates matching the name. However, the candidate list usually comes in alphabetic order, i.e., the environment is only second-guessing the name based on pattern matching, relying on human intervention to pick the correct one. Finding the correct candidate can thus be cumbersome or slower than typing the full name.

We present an approach to improve code completion based on recorded program histories. We define a benchmarking procedure measuring the accuracy of a code completion engine and apply it to several completion algorithms on a dataset consisting of the history of several systems. Further, we use the change history data to improve the results offered by code completion tools. Finally, we propose an alternative interface for completion tools that we released to developers and evaluated.
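
The difference between alphabetical ordering and a history-aware ordering can be illustrated with a small sketch. Ranking candidates by how recently they were changed is an assumption chosen for simplicity, not necessarily one of the algorithms evaluated in the article; the `complete` function and the sample data are hypothetical.

```python
def complete(prefix, candidates, last_changed):
    """Return matching candidates in alphabetical order and in recency order."""
    matches = [c for c in candidates if c.startswith(prefix)]
    alphabetical = sorted(matches)
    # most recently changed first; never-changed names last, ties broken alphabetically
    by_recency = sorted(matches, key=lambda c: (-last_changed.get(c, -1), c))
    return alphabetical, by_recency

# Hypothetical method names and the timestamp of their last recorded change
candidates = ["addAll", "addFirst", "addLast", "remove"]
last_changed = {"addLast": 42, "addAll": 7}

alphabetical, by_recency = complete("add", candidates, last_changed)
```

With the recency ordering, the method the developer touched most recently appears first, whereas the alphabetical list places it last.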

@article{RL-JASE2010, 
	author    = {Romain Robbes and Michele Lanza},
	title     = {Improving Code Completion with Program History},
	journal   = {Autom. Softw. Eng.},
	year      = {2010},
	volume    = {in press},
	number    = {in press},
	pages     = {in press},
}		


Software changes. During their life cycle, software systems experience a wide spectrum of changes, from minor modifications to major architectural shifts. Small-scale changes are usually performed with text editing and refactorings, while large-scale transformations require dedicated program transformation languages. For medium-scale transformations, both approaches have disadvantages. Manual modifications may require a myriad of similar yet not identical edits, leading to errors and omissions, while program transformation languages have a steep learning curve, and thus only pay off for large-scale transformations.

We present a system supporting example-based program transformation. To define a transformation, a programmer performs an example change manually, feeds it into our system, and generalizes it to other application contexts. With time, a developer can build a palette of reusable medium-sized code transformations. We provide a detailed description of our approach and illustrate it with examples.

@inproceedings{RL-MODELS2008,
	author    = {Romain Robbes and Michele Lanza},
	title     = {Example-Based Program Transformation},
	booktitle = {MoDELS 2008: Proceedings of the 11th ACM/IEEE International Conference on Model Driven Engineering Languages and Systems},
	year      = {2008},
	pages     = {174-188},
}


Change prediction helps developers by recommending program entities that will have to be changed alongside the entities currently being changed. To evaluate their accuracy, current change prediction approaches use data from versioning systems such as CVS or SVN. These data sources provide a coarse-grained view of the development history that flattens the sequence of changes in a single commit. They are thus not a valid basis for evaluation in the case of development-style prediction, where the order of the predictions has to match the order of the changes a developer makes.

We propose a benchmark for the evaluation of change prediction approaches based on fine-grained change data recorded from IDE usage. Moreover, the change prediction approaches themselves can use the more accurate data to fine-tune their prediction. We present an evaluation procedure and use it on several change prediction approaches, both novel and from the literature, and report on the results.
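
A replay-based evaluation of this kind might be sketched as follows: the recorded changes are replayed in order, and before each change a predictor is asked which entities will change next. The `evaluate` procedure, the `recently_changed` baseline predictor, and the sample sequence are hypothetical and only illustrate the principle.

```python
def evaluate(sequence, predict, k=3):
    """Replay changes in order; return how often the next entity is in the top-k predictions."""
    hits = 0
    trials = 0
    for i in range(1, len(sequence)):
        history = sequence[:i]
        predictions = predict(history)[:k]
        trials += 1
        if sequence[i] in predictions:
            hits += 1
    return hits / trials if trials else 0.0

def recently_changed(history):
    """Baseline predictor: previously changed entities, most recent first, excluding the current one."""
    seen = []
    for entity in reversed(history[:-1]):
        if entity not in seen and entity != history[-1]:
            seen.append(entity)
    return seen

# Hypothetical fine-grained sequence of changed entities, in the order they were edited
sequence = ["A", "B", "A", "C", "B"]
accuracy = evaluate(sequence, recently_changed, k=3)
```

Unlike an evaluation based on commits, this procedure depends on the order of the individual changes, which is only available when they are recorded from the IDE.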

@inproceedings{RPL-MSR2010, 
	author    = {Romain Robbes and Damien Pollet and Michele Lanza},
	title     = {Replaying IDE Interactions to Evaluate and Improve Change Prediction Approaches},
	booktitle = {MSR 2010: Proceedings of the 7th IEEE Working Conference on Mining Software Repositories},
	year      = {2010},
	pages     = {to appear}
}		
