2012
Authors
Machado, N; Romano, P; Rodrigues, L;
Publication
2012 42ND ANNUAL IEEE/IFIP INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS (DSN)
Abstract
This paper presents CoopREP, a system that supports fault replication of concurrent programs through cooperative recording and partial log combination. CoopREP employs partial recording to reduce the amount of information that a given program instance must store to support deterministic replay. This substantially reduces the overhead imposed by code instrumentation, but raises the problem of finding a combination of logs capable of replaying the fault. CoopREP tackles this issue by introducing several innovative statistical analysis techniques that guide the search for partial logs to be combined and used during the replay phase. CoopREP has been evaluated using both standard benchmarks for multi-threaded applications and a real-world application. The results show that CoopREP can successfully replay concurrency bugs involving tens of thousands of memory accesses, reducing logging overhead with respect to state-of-the-art non-cooperative logging schemes by up to 50 times in computationally intensive applications.
2016
Authors
Machado, N; Quinta, D; Lucia, B; Rodrigues, L;
Publication
ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY
Abstract
We present Symbiosis: a concurrency debugging technique based on novel differential schedule projections (DSPs). A DSP shows the small set of memory operations and dataflows responsible for a failure, as well as a reordering of those elements that avoids the failure. To build a DSP, Symbiosis first generates a full, failing, multithreaded schedule via thread path profiling and symbolic constraint solving. Symbiosis then selectively reorders events in the failing schedule to produce a nonfailing, alternate schedule. A DSP reports the ordering and dataflow differences between the failing and nonfailing schedules. Our evaluation on buggy real-world software and benchmarks shows that, in practical time, Symbiosis generates DSPs that both isolate the small fraction of event orders and dataflows responsible for the failure and report which event reorderings prevent the failure. In our experiments, DSPs contain 90% fewer events and 96% fewer dataflows than the full failure-inducing schedules. We also conducted a user study showing that, by allowing developers to focus on only a few events, DSPs reduce the amount of time required to understand the bug's root cause and find a valid fix.