Research Interests

Debugging and Testing

One major challenge in automating debugging is that the correct version of a program usually exists only in the programmer's mind. To tackle this problem, our ICSE'17 paper describes a feedback-based debugging approach, which takes programmers' feedback as a partial code specification and interactively recommends suspicious steps on the buggy execution trace. Building on the trace collection technique from this work, we built a regression fault localization technique that automatically compares a correct trace with a buggy trace to generate an explanation for a regression bug. The source code is available at https://github.com/llmhyy/microbat and https://github.com/llmhyy/tregression. While working on the debugging problem, we observed a limitation of dynamic slicing, a traditional approach to locating the program statements relevant to a given statement: dynamic slicing reaches a dead end when a bug is caused by missing code, or by code that should have executed but did not. To address this, our ASE'18 paper proposes a data-driven approach that enhances dynamic slicing by building a neural network to predict the location where code, or the execution of code, is missing.
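The limitation of dynamic slicing described above can be seen in a toy example. The sketch below is not taken from Microbat or from the ASE'18 work; the `Step` record, the `dynamic_slice` function, and the five-line program shown in the comments are hypothetical names invented for illustration. It computes a backward dynamic slice over a recorded trace by following data and control dependencies, and shows that a faulty condition never enters the slice when the bug causes a branch to be skipped.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Step:
    """One executed statement in a recorded trace (hypothetical model)."""
    order: int                                # position in the trace
    line: int                                 # source line that was executed
    defines: FrozenSet[str] = frozenset()     # variables written by this step
    uses: FrozenSet[str] = frozenset()        # variables read by this step
    control_parent: Optional[int] = None      # order of the controlling branch step


def dynamic_slice(trace: List[Step], criterion_order: int) -> List[int]:
    """Return the source lines in the backward dynamic slice of one step."""
    by_order = {step.order: step for step in trace}
    worklist = [criterion_order]
    visited = set()
    while worklist:
        order = worklist.pop()
        if order in visited:
            continue
        visited.add(order)
        step = by_order[order]
        # Data dependency: the latest earlier step that wrote each variable read here.
        for var in step.uses:
            writers = [s.order for s in trace if s.order < order and var in s.defines]
            if writers:
                worklist.append(max(writers))
        # Control dependency: the branch step that decided this step would run.
        if step.control_parent is not None:
            worklist.append(step.control_parent)
    return sorted({by_order[o].line for o in visited})


# Assumed buggy program (the condition on line 3 should have been `x > 3`):
#   1: x = 5
#   2: y = 0
#   3: if x > 10:
#   4:     y = x * 2      # never executed in this run
#   5: print(y)           # prints 0, the programmer expected 10
trace = [
    Step(order=1, line=1, defines=frozenset({"x"})),
    Step(order=2, line=2, defines=frozenset({"y"})),
    Step(order=3, line=3, uses=frozenset({"x"})),
    Step(order=4, line=5, uses=frozenset({"y"})),
]

slice_lines = dynamic_slice(trace, criterion_order=4)
print(slice_lines)
```

In this run the slice of the wrong output contains only lines 2 and 5. The faulty condition on line 3 is absent because no executed step depends on it, which is the dead end that motivates predicting where an execution is missing.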

Code Recommendation

We propose an approach to recommending where and how to modify a piece of pasted code (see our FSE'15 paper). The rationale is that we regard all duplicated code as the result of historical copy-and-paste operations, and the differences among the duplicates as the historical modifications made to the pasted code. The patterns in these clone differences can therefore be learned to suggest how to modify newly pasted code. The tools are published at https://github.com/llmhyy/ccdeamon. In addition, we proposed an approach to identifying recurring designs in software systems; the identified recurring designs can be automatically extracted into design templates for code generation (see our ASE'17 paper). We observe that programmers sometimes copy and paste several related pieces of code together. A key observation is that they are copying a design rather than several individual pieces of code, so similar designs recur many times in a software system. Recurring designs indicate undocumented project-specific conventions or knowledge. I therefore developed a technique to automatically identify recurring designs in a system. The identified recurring designs are then extracted into code templates, manifested as UML class diagrams, to generate code skeletons as well as code bodies. A video of the tool is available at http://linyun.info/micode/index.html and the source code is available at https://github.com/llmhyy/MICoDe.

Code Clone

Code cloning is a phenomenon in which duplicated code is spread throughout a system; it is highly prevalent in industrial software systems. Many software engineering researchers have devoted themselves to this topic in terms of clone detection, evolution, removal, and management. Unlike existing approaches, my research on code clones focuses on how to utilize clones to support practical software development and maintenance. To this end, my research starts with clone difference analysis (see our ICSE'14 paper). Based on how pieces of duplicated code differ from one another, we extract undocumented rules of programming convention to help programmers avoid potential faults (see our ICSME'14 paper).
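As a rough illustration of what a clone difference is, the sketch below tokenizes a few clone instances and reports the token positions at which they disagree. It is a deliberately simplified stand-in, not the analysis from the ICSE'14 paper: the `tokenize` and `clone_differences` functions and the sample clones are hypothetical, and the sketch assumes that all instances have the same number of tokens.

```python
import re
from typing import Dict, List


def tokenize(code: str) -> List[str]:
    """Split code into identifiers, integer literals, and single symbols."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def clone_differences(instances: List[str]) -> Dict[int, List[str]]:
    """Map each differing token position to the tokens found there.

    Simplifying assumption: every clone instance has the same token count,
    so tokens can be compared position by position without alignment.
    """
    token_lists = [tokenize(code) for code in instances]
    length = len(token_lists[0])
    if any(len(tokens) != length for tokens in token_lists):
        raise ValueError("this sketch only handles clones with equal token counts")
    differences = {}
    for position, column in enumerate(zip(*token_lists)):
        if len(set(column)) > 1:
            differences[position] = list(column)
    return differences


# Hypothetical clone set: three copies of the same statement shape.
clones = [
    "int width = getWidth();",
    "int height = getHeight();",
    "int depth = getDepth();",
]

for position, tokens in clone_differences(clones).items():
    print(position, tokens)
```

Here the instances differ only in the variable name and the getter name, and the two vary together. A co-variation of this kind is the sort of regularity from which a naming convention could be inferred.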

Refactoring

In the same line of software maintenance work, I also developed a technique to keep code consistent with its design. As software evolves, the code usually deviates from its original design. To correct such deviation, programmers usually need to refactor code manually, which is a time-consuming and effort-intensive task. I therefore developed a technique called Refactoring Navigator to automate restoring the consistency between a design and its implementation. Refactoring Navigator (see our FSE'16 paper) takes a user-specified design as the target and the code implementation as the source, and automatically generates a refactoring solution consisting of a sequence of refactoring steps that lead from the source to the target. Because programmers have their own preferences on how to refactor their code, the technique allows them to accept or reject individual refactoring steps as feedback. Based on this feedback, Refactoring Navigator generates a new sequence of refactoring steps. Refactoring Navigator has been applied to both student code assignments and an industrial system to show its effectiveness. Its source code is available at https://github.com/llmhyy/Refactoring-Navigator.
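The accept-or-reject loop described above can be sketched in a few lines. This is only a toy model under strong assumptions, not the algorithm used by Refactoring Navigator: a design is reduced to a mapping from class names to method names, the only refactoring considered is moving a method, and the `plan_moves` function and the sample classes are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# A refactoring step: (kind, method, from_class, to_class).
Step = Tuple[str, str, str, str]


def plan_moves(
    source: Dict[str, Set[str]],
    target: Dict[str, Set[str]],
    rejected: FrozenSet[Step] = frozenset(),
) -> List[Step]:
    """Propose move-method steps that lead from the source design to the target.

    Steps that the user has rejected are left out of the new proposal.
    """
    current_owner = {
        method: cls for cls, methods in source.items() for method in methods
    }
    steps = []
    for cls in sorted(target):
        for method in sorted(target[cls]):
            owner = current_owner.get(method)
            if owner is not None and owner != cls:
                step = ("move_method", method, owner, cls)
                if step not in rejected:
                    steps.append(step)
    return steps


# Hypothetical source implementation and target design.
source = {"Order": {"total", "print_invoice", "send_email"}, "Printer": set(), "Mailer": set()}
target = {"Order": {"total"}, "Printer": {"print_invoice"}, "Mailer": {"send_email"}}

first_plan = plan_moves(source, target)
print(first_plan)

# The user rejects the first proposed step; the plan is regenerated without it.
feedback = frozenset({first_plan[0]})
second_plan = plan_moves(source, target, rejected=feedback)
print(second_plan)
```

A real planner would also need to search over many refactoring types and re-optimize the remaining steps after each rejection; the sketch only shows how feedback can constrain the next proposal.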