Gopinath: Pygram: Learning input grammars for Python programs TU Darmstadt, Germany (software group retreat), 2017

AUTOGRAM is a method for inferring human-readable input grammars by observing program behavior on test runs. The resulting grammars can be used for fuzzing, secondary or parallel validation of new inputs, and identifying the essence of a program in a language-agnostic manner. I will present my current work on AUTOGRAM and discuss my research on taking AUTOGRAM forward.

Gopinath: Who tests our tests: An overview of mutation analysis, its caveats, and pitfalls McGill University, Canada, 2017

A key concern in software engineering is determining the reliability of our software artifacts. Software engineers traditionally rely on extensive testing to ensure that programs are fault-free. However, tests are also usually written by human beings, and hence are vulnerable to the same problems as software applications. An inadequate test suite, or one that contains errors, can provide a false sense of security to the programmer or the consumer. Hence a major question in software testing is how to evaluate the quality of test suites.

Following the maxim “you can’t find a bug in what you don’t cover”, code coverage is often used as a measure of test suite quality. However, code coverage cannot evaluate the quality of our oracles, and our oracles are often insufficient or wrong. Code coverage on its own is therefore insufficient, and mutation testing, a stronger fault-based technique, is often recommended. In this talk, I will present the basics of mutation testing and look at why it is effective. Next, I will examine the limitations of mutation analysis and how these can be mitigated. I will also examine how insights from mutation testing may be used in areas such as evaluating type annotations, program repair, and program adaptation.
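As a minimal illustration of the gap between coverage and oracle quality, the sketch below is an assumption-laden toy and not material from the talk: the function `absolute`, the hand-written mutant, and the two suites are all hypothetical names. Both suites execute every branch of the program, but only the one with a real oracle detects (kills) the mutant.

```python
def absolute(x):
    """Hypothetical program under test: return |x|."""
    if x < 0:
        return -x
    return x


def absolute_mutant(x):
    """Hand-written mutant: the negation in the x < 0 branch is dropped."""
    if x < 0:
        return x
    return x


def weak_suite(f):
    """Executes both branches of f but checks nothing about the results."""
    f(-3)
    f(4)
    return True


def strong_suite(f):
    """Same inputs as weak_suite, but with an oracle on the outputs."""
    return f(-3) == 3 and f(4) == 4


def kills(suite, mutant):
    """A suite kills a mutant if it passes on the original and fails on the mutant."""
    return suite(absolute) and not suite(mutant)


if __name__ == "__main__":
    print("weak suite kills mutant:  ", kills(weak_suite, absolute_mutant))
    print("strong suite kills mutant:", kills(strong_suite, absolute_mutant))
```

Both suites would report the same branch coverage for `absolute`, yet only `strong_suite` distinguishes the original from the mutant. A real mutation tool generates many such mutants automatically; the single hand-written one here only stands in for that process.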

Gopinath: Code Coverage is a Strong Predictor of Test suite Effectiveness in the Real World GTAC, 2016

This talk is about the effectiveness of coverage as a technique for evaluating the quality of a test suite. We show the utility of coverage by considering its correlation with mutation score, and also show that coverage is a significant defence against bugs. Further, we critique the effectiveness of mutation score as a criterion for test suite quality.