Publicacoes - INESC TEC

Publicações

Publicações por José Nuno Macedo

2019

Get Your Spreadsheets Under (Version) Control

Autores
Macedo, JN; Moreira, R; Cunha, J; Saraiva, J;

Publicação
Proceedings of the XXII Iberoamerican Conference on Software Engineering, CIbSE 2019, La Habana, Cuba, April 22-26, 2019.

Abstract
Spreadsheets play a pivotal role in many organizations. They serve to store and manipulate data or forecasting, and they are often used to help in the decision process, thus directly impacting the success, or not, of organizations. As the research community already realized, spreadsheets tend to have the same problems “professional” software contain. Thus, in the past decade many software engineering techniques have been successfully proposed to aid spreadsheet developers and users. However, one of the most used mechanisms to manage software projects is still lacking in spreadsheets: a version control system. A version control system allows for collaborative development, while also allowing individual developers to explore different alternatives without compromising the main project. In this paper we present a version control system, named SheetGit, oriented for end-user programmers. It allows to graphically visualize the history of versions (including branches), to switch between different versions just by pointing and clicking, and to visualize the differences between any two versions in an animated way. To validate our approach/tool we performed an empirical evaluation which shows evidence that SheetGit can aid users when compared to other tools.

FecharLer Abstract

2020

InDubio: A Combinator Library to Disambiguate Ambiguous Grammars

Autores
Macedo, JN; Saraiva, J;

Publicação
COMPUTATIONAL SCIENCE AND ITS APPLICATIONS, ICCSA 2020, PART IV

Abstract
To infer an abstract model from source code is one of the main tasks of most software quality analysis methods. Such abstract model is called Abstract Syntax Tree and the inference task is called parsing. A parser is usually generated from a grammar specification of a (programming) language and it converts source code of that language into said abstract tree representation. Then, several techniques traverse this tree to assess the quality of the code (for example by computing source code metrics), or by building new data structures (e.g, flow graphs) to perform further analysis (such as, code cloning, dead code, etc). Parsing is a well established technique. In recent years, however, modern languages are inherently ambiguous which can only be fully handled by ambiguous grammars. In this setting disambiguation rules, which are usually included as part of the grammar specification of the ambiguous language, need to be defined. This approach has a severe limitation: disambiguation rules are not first class citizens. Parser generators offer a small set of rules that can not be extended or changed. Thus, grammar writers are not able to manipulate nor define a new specific rule that the language he is considering requires. In this paper we present a tool, name InDubio, that consists of an extensible combinator library of disambiguation filters together with a generalized parser generator for ambiguous grammars. InDubio defines a set of basic disambiguation rules as abstract syntax tree filters that can be combined into more powerful rules. Moreover, the filters are independent of the parser generator and parsing technology, and consequently, they can be easily extended and manipulated. This paper presents InDubio in detail and also presents our first experimental results.

FecharLer Abstract

2022

Zipping Strategies and Attribute Grammars

Autores
Macedo, JN; Viera, M; Saraiva, J;

Publicação
Functional and Logic Programming - 16th International Symposium, FLOPS 2022, Kyoto, Japan, May 10-12, 2022, Proceedings

Abstract
Strategic term rewriting and attribute grammars are two powerful programming techniques widely used in language engineering. The former relies on strategies (recursion schemes) to apply term rewrite rules in defining transformations, while the latter is suitable for expressing context-dependent language processing algorithms. Each of these techniques, however, is usually implemented by its own powerful and large processor system. As a result, it makes such systems harder to extend and to combine. We present the embedding of both strategic tree rewriting and attribute grammars in a zipper-based, purely functional setting. The embedding of the two techniques in the same setting has several advantages: First, we easily combine/zip attribute grammars and strategies, thus providing language engineers the best of the two worlds. Second, the combined embedding is easier to maintain and extend since it is written in a concise and uniform setting. We show the expressive power of our library in optimizing Haskell let expressions, expressing several Haskell refactorings and solving several language processing tasks for an Oberon-0 compiler. © 2022, Springer Nature Switzerland AG.

FecharLer Abstract

2023

Efficient Embedding of Strategic Attribute Grammars via Memoization

Autores
Macedo, JN; Rodrigues, E; Viera, M; Saraiva, J;

Publicação
Proceedings of the 2023 ACM SIGPLAN International Workshop on Partial Evaluation and Program Manipulation, PEPM 2023, Boston, MA, USA, January 16-17, 2023

Abstract
Strategic term re-writing and attribute grammars are two powerful programming techniques widely used in language engineering. The former relies on strategies to apply term re-write rules in defining large-scale language transformations, while the latter is suitable to express context-dependent language processing algorithms. These two techniques can be expressed and combined via a powerful navigation abstraction: generic zippers. This results in a concise zipper-based embedding offering the expressiveness of both techniques. Such elegant embedding has a severe limitation since it recomputes attribute values. This paper presents a proper and efficient embedding of both techniques. First, attribute values are memoized in the zipper data structure, thus avoiding their re-computation. Moreover, strategic zipper based functions are adapted to access such memoized values. We have implemented our memoized embedding as the Ztrategic library and we benchmarked it against the state-of-the-art Strafunski and Kiama libraries. Our first results show that we are competitive against those two well established libraries. © 2023 ACM.

FecharLer Abstract

2023

Beyond Code Generation: The Need for Type-Aware Language Models

Autores
Ribeiro, F; Macedo, JN; Tsushima, K;

Publicação
2023 IEEE/ACM INTERNATIONAL WORKSHOP ON AUTOMATED PROGRAM REPAIR, APR

Abstract
Type systems and type inference systems can be used to help text and code generation models like GPT-3 produce more accurate and appropriate results. These systems provide information about the types of variables, functions, and other elements in a program or codebase, which can be used to guide the generation of new code or text. For example, a code generation model that is aware of the types of variables and functions being used in a program can generate code that is more likely to be syntactically correct and semantically meaningful. We argue for the specialization of language models such as GPT-3 for automatic program repair tasks, incorporating type information in the model's learning process. A trained language model is expected to perform better by understanding the nuances of type systems and using them for program repair, instead of just relying on the general structure of programs.

FecharLer Abstract

2023

GPT-3-Powered Type Error Debugging: Investigating the Use of Large Language Models for Code Repair

Autores
Ribeiro, F; de Macedo, JNC; Tsushima, K; Abreu, R; Saraiva, J;

Publicação
PROCEEDINGS OF THE 16TH ACM SIGPLAN INTERNATIONAL CONFERENCE ON SOFTWARE LANGUAGE ENGINEERING, SLE 2023

Abstract
Type systems are responsible for assigning types to terms in programs. That way, they enforce the actions that can be taken and can, consequently, detect type errors during compilation. However, while they are able to flag the existence of an error, they often fail to pinpoint its cause or provide a helpful error message. Thus, without adequate support, debugging this kind of errors can take a considerable amount of effort. Recently, neural network models have been developed that are able to understand programming languages and perform several downstream tasks. We argue that type error debugging can be enhanced by taking advantage of this deeper understanding of the language's structure. In this paper, we present a technique that leverages GPT-3's capabilities to automatically fix type errors in OCaml programs. We perform multiple source code analysis tasks to produce useful prompts that are then provided to GPT-3 to generate potential patches. Our publicly available tool, Mentat, supports multiple modes and was validated on an existing public dataset with thousands of OCaml programs. We automatically validate successful repairs by using Quickcheck to verify which generated patches produce the same output as the user-intended fixed version, achieving a 39% repair rate. In a comparative study, Mentat outperformed two other techniques in automatically fixing ill-typed OCaml programs.

FecharLer Abstract