Will GPT-engineer Speed Up Your Refactoring? An In-Depth Guide
Improved efficiency, better software performance, and lower maintenance costs are just some of the many benefits of refactoring. Yet the process itself is tedious and time-consuming, which makes it a tempting candidate for handing off to AI-powered tools like GPT-engineer.
The real question is: can implementing the GPT-engineer help you streamline the process?
This guide will help you find the answer.
Key takeaways:
- Our team tested GPT-engineer's potential to speed up refactoring.
- We evaluated its performance in common tasks such as renaming variables, methods, and classes, and extracting methods, interfaces, and superclasses.
- GPT-engineer handles writing new code much better than it does refactoring and understanding existing code.
What is code refactoring?
Code refactoring is a process of improving existing code without changing its external behavior.
By restructuring the code, you can refine the design, structure, or implementation of the software while preserving its original functionality. This technique helps improve the nonfunctional attributes of the software, such as its readability, maintainability, and complexity.
Why you should consider refactoring your code
Now, if the functionality stays the same, you may wonder: why should I refactor the code? There are many advantages to refactoring, including:
- Better work efficiency: Clearer code equals reduced time and effort needed for onboarding new developers, as they can quickly understand the entire codebase.
- Lower maintenance costs: Clarification of the code leads to easier identification and resolution of bugs or any other issues.
- Increased productivity: With less time spent on deciphering the code, developers can also have more time for implementing new features.
- Enhanced agility: Easily readable code allows for quicker modifications and iterations, which is crucial in responding to changes in business requirements.
- Improved quality of the product: Refactoring the code can also enhance the software's performance, leading to higher customer satisfaction.
Most common refactoring activities
The process of refactoring can consist of several different actions. Below, we've listed some of the most frequent activities:
- Renaming a variable: Giving a new name to an existing code variable
- Renaming a method: Changing the name of a function or a procedure in the code
- Renaming a class: Altering the name of a class in the code
- Moving a method: Transferring a function to a different location within the codebase
- Moving a class: Relocating a class to a different part of the codebase
- Extracting a method: Creating a new function from a part of an existing one
- Inlining a method: Incorporating the functionality of a function or a procedure directly into its calling code
- Extracting a class: Creating a new class from a part of an existing one
- Extracting an interface: Defining a new interface based on common behaviors or methods of existing classes
- Extracting a superclass: Creating a new parent class to hold common attributes or methods shared by multiple existing classes
- Pulling up a method: Moving a method from a subclass to its superclass to promote code reusability
- Pushing down a method: Moving a method from a superclass to one or more of its subclasses to better align functionality with subclass responsibilities
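To make one of these activities concrete, here is a minimal sketch of extracting a method. The OrderLine type and all function names are hypothetical and are not taken from the application tested later in this guide. The point is that the output stays identical before and after the change.

```typescript
// Hypothetical example: none of these names come from the tested application.
interface OrderLine {
  unitPrice: number;
  quantity: number;
}

// Before: the total is computed inline inside the summary function.
function summarizeBefore(lines: OrderLine[]): string {
  let total = 0;
  for (const line of lines) {
    total += line.unitPrice * line.quantity;
  }
  return `Items: ${lines.length}, total: ${total}`;
}

// After: the calculation is extracted into its own, descriptively named function.
function calculateTotal(lines: OrderLine[]): number {
  return lines.reduce((sum, line) => sum + line.unitPrice * line.quantity, 0);
}

function summarizeAfter(lines: OrderLine[]): string {
  return `Items: ${lines.length}, total: ${calculateTotal(lines)}`;
}
```

Both summary functions return the same string for the same input, which is exactly the property a test suite should confirm after any refactoring.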
What is the GPT-engineer?
In short, the GPT-engineer is an AI-powered app builder that translates a project description into a ready-to-use codebase. The solution is based on GPT models and can convert natural language into code, execute it, and implement improvements in existing projects.
The premise of the GPT-engineer platform is that users no longer have to write code from scratch. Instead, programmers can describe the project, and in return, the GPT-engineer will generate the entire codebase.
Yet, the GPT-engineer is not just a tool for the code generation process. It also acts as software developers' AI coding assistant, helping them with their daily tasks.
How does the GPT-engineer work?
The GPT-engineer is a tool that leverages artificial intelligence and machine learning to assist you with coding. Its workflow is simple and consists of three steps:
1. Defining the prompt
The first step in using the GPT-engineer is creating a prompt, which includes any specifications or requirements related to your project, allowing the AI tool to generate code.
2. Generating code and/or codebase
After that, the GPT-engineer analyzes your request and generates code snippets, functions, or the entire codebase based on the prompted tasks. The tool may also ask follow-up questions, which you will need to answer.
3. Improving the generated code
Now, software developers need to adapt the new code. Although the result serves as a solid starting point, finalizing the code to meet all requirements remains a human task.
Installing the GPT-engineer: Step-by-step
For stable release
Step 1: Install gpt-engineer
To do this, run python -m pip install gpt-engineer
Step 2: Set API Key
Enter your API key. There are several ways to do this, all of which are explained in the official documentation.
Step 3: Run the GPT-engineer
Now you're ready to run the program. Create a file called prompt in your project directory and write your instructions in it. Then run gpte <project_dir> to generate new code, or gpte <project_dir> -i to improve existing code.
For development
Step 1: Clone the GPT-engineer GitHub repository
Clone the GPT-engineer GitHub repository.
Step 2: Set API Key
Enter your API key. There are several ways to do this, all of which are explained in the official documentation.
Step 3: Set up the GPT-engineer
Following that, you'll need to navigate to the cloned directory using the 'cd' command. Then, install all necessary dependencies and activate the virtual environment with the following commands:
poetry install
poetry shell
Step 4: Run the GPT-engineer
Now you're ready to run the program. Create a file called prompt in your project directory and write your instructions in it. Then run gpte <project_dir> to generate new code, or gpte <project_dir> -i to improve existing code.
Using the GPT-engineer for refactoring: A case study
Having learned the benefits of the tool, we decided to test the GPT-engineer's capabilities ourselves to determine whether the solution is as good as it promises to be.
The results? They were quite surprising, but let's start from the beginning.
The GPT-engineer in refactoring: Chosen methodology of study
To measure how effective the GPT-engineer is at refactoring, we performed the most common refactoring activities twice: manually and with the help of the tool.
We ensured that the code being refactored had 100% test coverage, so that we could confirm that the changes did not disrupt the application logic.
The GPT-engineer in refactoring: Tested use cases
Our team set out to test the GPT-engineer in the following use cases: renaming a variable, renaming a method, renaming a class, extracting a method, extracting an interface, and extracting a superclass.
The GPT-engineer in refactoring: Application under study
Project description: A simple CRUD application for products, where each product has two properties – a name and a quantity
Technologies in use: Node.js, TypeScript, Express, and Joi
Testing: The app had 100% test coverage, achieved exclusively via integration testing, so we would know whether a refactoring changed the business logic. All tests were written using Jest and Supertest.
Test results:
PASS src/controllers/productsController.test.ts (5.197 s)
Products API
POST /api/products
✓ should create a new product (204 ms)
✓ should return 400 if product parameters are invalid (8 ms)
GET /api/products
✓ should list all products (9 ms)
✓ should return 400 if pagination parameters are invalid (6 ms)
GET /api/products/:id
✓ should get a product by id (7 ms)
✓ should return 404 if product not found (6 ms)
✓ should return 400 if product id is invalid (6 ms)
PUT /api/products/:id
✓ should update a product by id (8 ms)
✓ should return 404 if product not found (7 ms)
✓ should return 400 if product parameters are invalid (6 ms)
✓ should return 400 if product id is invalid (6 ms)
DELETE /api/products/:id
✓ should delete a product by id (6 ms)
✓ should return 404 if product not found (4 ms)
✓ should return 400 if product id is invalid (15 ms)
------------------------|---------|----------|---------|---------|-------------------
File | % Stmts | % Branch | % Funcs | % Lines | Uncovered Line #s
------------------------|---------|----------|---------|---------|-------------------
All files | 100 | 100 | 100 | 100 |
src | 100 | 100 | 100 | 100 |
server.ts | 100 | 100 | 100 | 100 |
src/controllers | 100 | 100 | 100 | 100 |
productsController.ts | 100 | 100 | 100 | 100 |
src/entity | 100 | 100 | 100 | 100 |
ProductEntity.ts | 100 | 100 | 100 | 100 |
src/errors | 100 | 100 | 100 | 100 |
ApplicationError.ts | 100 | 100 | 100 | 100 |
src/repositories | 100 | 100 | 100 | 100 |
productsRepository.ts | 100 | 100 | 100 | 100 |
src/routes | 100 | 100 | 100 | 100 |
productsRouter.ts | 100 | 100 | 100 | 100 |
src/services | 100 | 100 | 100 | 100 |
productsService.ts | 100 | 100 | 100 | 100 |
src/validators | 100 | 100 | 100 | 100 |
productsValidator.ts | 100 | 100 | 100 | 100 |
------------------------|---------|----------|---------|---------|-------------------
Test Suites: 1 passed, 1 total
Tests: 14 passed, 14 total
Snapshots: 0 total
Time: 5.684 s, estimated 7 s
Application architecture: A standard layered structure, presented in the diagram below
Application architecture
Please note that the repository is small and consists of only 350 lines of code, including an integration test file that is 114 lines long.
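As a rough illustration of the layered structure, here is a minimal sketch of how a controller, a service, and a repository could depend on one another. The class names mirror the file names in the coverage report, but the bodies are assumptions. The real application uses Express and UUIDs, while this sketch uses plain method calls and a counter so that it stays self-contained.

```typescript
// Sketch only: names mirror the coverage report, bodies are assumptions.
class ProductEntity {
  constructor(
    public id: string,
    public name: string,
    public quantity: number
  ) {}
}

// Repository layer: owns storage (an in-memory array in this sketch).
class ProductsRepository {
  private products: ProductEntity[] = [];

  async create(product: ProductEntity): Promise<ProductEntity> {
    this.products.push(product);
    return product;
  }

  async findById(id: string): Promise<ProductEntity | null> {
    return this.products.find((p) => p.id === id) ?? null;
  }
}

// Service layer: business logic, depends only on the repository.
class ProductsService {
  private nextId = 1; // stand-in for real ID generation

  constructor(private productsRepository: ProductsRepository) {}

  async createProduct(input: {
    name: string;
    quantity: number;
  }): Promise<ProductEntity> {
    const id = String(this.nextId++);
    const product = new ProductEntity(id, input.name, input.quantity);
    return this.productsRepository.create(product);
  }

  async getProductById(id: string): Promise<ProductEntity | null> {
    return this.productsRepository.findById(id);
  }
}

// Controller layer: maps the service result to an HTTP-like status code.
class ProductsController {
  constructor(private productsService: ProductsService) {}

  async getProductById(
    id: string
  ): Promise<{ status: number; data: ProductEntity | null }> {
    const product = await this.productsService.getProductById(id);
    return product
      ? { status: 200, data: product }
      : { status: 404, data: null };
  }
}
```

Because each layer only talks to the one below it, a rename in one layer has to be propagated through every caller above it, which is what makes cross-file refactoring a meaningful test for an AI tool.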
Endpoints:
| HTTP Method | Path | Description |
|---|---|---|
| POST | /api/products | creates a product |
| GET | /api/products | returns an array of products, supports pagination |
| GET | /api/products/:id | returns a specific product by ID, throws a 404 error if the given ID does not exist |
| PUT | /api/products/:id | updates the whole product by ID, throws a 404 error if the given ID does not exist |
| DELETE | /api/products/:id | removes a product by ID, throws a 404 error if the given ID does not exist |
The GPT-engineer in refactoring: Examining the results
So, how did the GPT-engineer tool perform? Unfortunately, in most cases, proposed changes failed to pass the tests.
Renaming operations
Renaming variables
We started with renaming the variables. Since current IDEs already automate renaming a single variable across many files, we decided to try something a bit harder.
We created a scenario in which we no longer store products but currently available products, so every product reference in the entire codebase should become availableProduct.
Here's the prompt that we used and the results of the test.
| Parameters | Outcomes |
|---|---|
| Prompt | Rename each product variable to availableProduct. For example, when a product is returned from a repository, use availableProduct rather than product. Do this in all scenarios in all files. |
| Result | Tests don’t pass after changes. |
| Cost | It took 5 minutes and cost $2.80 to execute the prompt and make changes through GPT-engineer. |
Our first attempt ended in failure. Either the prompt was not effective enough or the task was too challenging for the current version of the tool.
In most cases, it didn’t update variable names, and when it did, the code ended up with multiple syntax errors. Given how small the tested codebase was, the execution also seemed slow and expensive.
Below, you’ll find a few output examples.
In this part of the code, GPT-engineer used availableProduct in the response body but did not rename the declaration of product, resulting in an invalid reference.
createProduct = async (req: Request, res: Response) => {
  const product = await this.productsService.createProduct(req.body);
  res.status(201).json({
    httpStatusCode: 201,
    data: availableProduct,
    message: "Product created successfully",
  });
};
Here, the function declaration has been duplicated, which is a very common occurrence. Often, we needed to fix the missing curly brackets, which can be quite time-consuming.
async update(product: ProductEntity): Promise<availableProduct | null> {
async update(product: ProductEntity): Promise<ProductEntity | null> {
  const index = this.products.findIndex(
    (product) => product.id === product.id
  );
  // ...
}
In this rare instance, the modified code was correct. However, the approach was unusual: rather than renaming the variable in line two, the tool reassigned it in line three.
getProductById = async (req: Request, res: Response) => {
  const product = await this.productsService.getProductById(req.params.id);
  const availableProduct = product;
  res.status(200).json({
    httpStatusCode: 200,
    data: availableProduct,
    message: "Product returned successfully",
  });
};
In summary, the AI tool renamed variables very rarely and inconsistently, breaking the code in almost every instance. Overall, less than 5% of the proposed changes were acceptable.
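For comparison, a consistent rename of the createProduct handler might look like the sketch below. The RequestLike, ResponseLike, AvailableProduct, and ProductsServiceLike types are minimal stand-ins introduced only so the snippet compiles on its own. They are assumptions, not the Express types or the application's real service.

```typescript
// Minimal stand-ins for the Express and service types (assumptions for this sketch).
interface RequestLike {
  body: { name: string; quantity: number };
}

interface ResponseLike {
  status(code: number): ResponseLike;
  json(payload: unknown): void;
}

interface AvailableProduct {
  id: string;
  name: string;
  quantity: number;
}

interface ProductsServiceLike {
  createProduct(input: {
    name: string;
    quantity: number;
  }): Promise<AvailableProduct>;
}

class ProductsController {
  constructor(private productsService: ProductsServiceLike) {}

  createProduct = async (req: RequestLike, res: ResponseLike) => {
    // The declaration and its usage are renamed together,
    // so the reference in the response body stays valid.
    const availableProduct = await this.productsService.createProduct(req.body);
    res.status(201).json({
      httpStatusCode: 201,
      data: availableProduct,
      message: "Product created successfully",
    });
  };
}
```

The change itself is small. What matters is that every declaration and every usage move together, which is the step the tool missed in most files.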
Renaming the method
After that, we set out to try something simpler. This time, we asked the AI tool to rename all methods from "camel case" to "snake case".
| Parameters | Outcomes |
|---|---|
| Prompt | Please change method names from camel case to snake case. |
| Result | Tests don’t pass after changes. |
| Cost | It took 4 minutes and cost $0.21 to execute the prompt and make changes through GPT-engineer. |
Occasionally, the GPT-engineer literally added snake_case to the method names.
async create_snake_case(product: ProductEntity): Promise<ProductEntity> {
  this.products.push(product);
  return product;
}
In other places, such as the one below, it unnecessarily renamed the constructor.
export class ProductEntity {
  constructor_snake_case(
    public id: string,
    public name: string,
    public quantity: number
  ) {}
}
In many cases, the tool correctly renamed methods in declarations but did not update the names when calling the same methods. It ended up renaming findById to find_by_id, but still referenced findById in other files.
We tried to apply more descriptive prompts to achieve this goal, but it didn't make much of a difference. We also noticed some other unexpected results, such as adding No change comments throughout the entire codebase.
export class ProductEntity {
  constructor(
    public id: string, // No change
    public name: string, // No change
    public quantity: number // No change
  ) {}
}
All in all, using the GPT-engineer for this action was not very helpful as its proposals often introduced obvious syntax errors and were very inconsistent.
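The name mapping we asked for is mechanical, as the following minimal sketch suggests. Both toSnakeCase and renameMethod are hypothetical helpers written for this illustration, and they cover only the naming rule. Applying the new names consistently to every declaration and call site is the part that still needs an IDE, a codemod, or careful manual work.

```typescript
// Hypothetical helper: converts a camelCase identifier to snake_case.
function toSnakeCase(name: string): string {
  // Insert an underscore between a lowercase letter or digit and the
  // uppercase letter that follows it, then lowercase the whole name.
  return name.replace(/([a-z0-9])([A-Z])/g, "$1_$2").toLowerCase();
}

// Hypothetical helper: the constructor keeps its name because the
// language requires that exact spelling.
function renameMethod(name: string): string {
  return name === "constructor" ? name : toSnakeCase(name);
}

// Example: preview the new names for a few methods.
const methodNames = ["findById", "createProduct", "constructor"];
const renamed = methodNames.map(renameMethod);
```

Under these assumptions, findById maps to find_by_id and createProduct to create_product, while constructor is left untouched.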
Renaming the class
We moved on to another test. We wanted every class in our code to end with the word "Class", and here's how it went.
| Parameters | Outcomes |
|---|---|
| Prompt | Please add "Class" word at the end of each class. For instance ProductsController rename to ProductsControllerClass, ProductEntity to ProductEntityClass. Do it with each class name. |
| Result | Tests don’t pass after changes. |
| Cost | It took 55 seconds and cost $0.16 to execute the prompt and make changes through GPT-engineer. |
The execution of the prompt was fast and inexpensive, but the result was not satisfactory. In this case, the mistakes were similar to those in the previous examples: a lack of consistency and invalid syntax.
Extracting a method
This time, we attempted to use the GPT-engineer to refactor the code by extracting the UUID generation into a separate method.
We aimed to transform the following:
async createProduct(input: {
  name: string;
  quantity: number;
}): Promise<ProductEntity> {
  const product = new ProductEntity(randomUUID(), input.name, input.quantity);
  return this.productsRepository.create(product);
}
Into this:
generateIdentifier(): string {
  return randomUUID();
}

async createProduct(input: {
  name: string;
  quantity: number;
}): Promise<ProductEntity> {
  const product = new ProductEntity(
    this.generateIdentifier(),
    input.name,
    input.quantity
  );
  return this.productsRepository.create(product);
}
What were the outcomes?
| Parameters | Outcomes |
|---|---|
| Prompt | Please extract from the create method in productService generation of randomUUID. Extract this to a method called generateIdentifier and use it in the create method. generateIdentifier should not receive any parameters and should return the uuid generated by randomUUID function. |
| Result | Tests don’t pass after changes. |
| Cost | It took 5 seconds and cost $0.04 to execute the prompt and make changes through GPT-engineer. |
The result was close but not functional. The tool successfully extracted the method, but it also added a repository call to generateIdentifier after the return statement, which was unnecessary and broke the code.
async createProduct(input: {
  name: string;
  quantity: number;
}): Promise<ProductEntity> {
  const identifier = this.generateIdentifier();
  const product = new ProductEntity(identifier, input.name, input.quantity);
  return this.productsRepository.create(product);
}

private generateIdentifier(): string {
  return randomUUID();
  return this.productsRepository.create(product);
}
Changing the code manually in one place is simpler than typing a prompt and correcting the result later. Our team had GitHub Copilot integrated with their IDE, and, in this case, typing generateIdentifier() mainly involves accepting the suggested lines by pressing the Tab key a few times.
Extracting the interface
Moving on, we asked the AI tool to extract the interface from ProductEntity.ts. Our goal was to create an interface like this:
export interface ProductInterface {
  id: string;
  name: string;
  quantity: number;
}
Then, ProductEntity.ts should implement the newly created interface:
export class ProductEntity implements ProductInterface {
  constructor(
    public id: string,
    public name: string,
    public quantity: number
  ) {}
}
Surprisingly, we had our first success.
| Parameters | Outcomes |
|---|---|
| Prompt | Create interface for ProductEntity class. Make ProductEntity implement it. Put interface next to ProductEntity.ts file and name it ProductInterface.ts. |
| Result | Tests pass after changes. |
| Cost | It took 10 seconds and cost $0.04 to execute the prompt and make changes through GPT-engineer. |
The GPT-engineer AI tool correctly created the interface.
However, it did not make ProductEntity implement it. Instead, it only removed the last line of the ProductEntity file.
export interface IProduct {
  id: string;
  name: string;
  quantity: number;
}
// This interface defines the structure of a product entity.
The overall outcome was close, but making this change by hand would be much simpler and faster.
Extracting the superclass
Lastly, we wanted to check how the AI tool would handle extracting the common functionality of ProductEntity and PersonEntity into another class and then sharing it through inheritance.
The common elements between those two classes were: id, name, and the returnIdAndName method. What we expected was that these would be moved to the superclass.
export class ProductEntity {
  constructor(
    public id: string,
    public name: string,
    public quantity: number
  ) {}
  returnIdAndName(): string {
    return `ID: ${this.id}, Name: ${this.name}`;
  }
}

export class PersonEntity {
  constructor(
    public id: string,
    public name: string,
  ) {}
  returnIdAndName(): string {
    return `ID: ${this.id}, Name: ${this.name}`;
  }
}
Here's what we ended up with.
| Parameters | Outcomes |
|---|---|
| Prompt | Extract common features of PersonEntity.ts class and ProductEntity class to another class. Name this class RecordEntity. PersonEntity and ProductEntity should extend RecordEntity and inherit shared features. Please remove from PersonEntity and ProductEntity features extracted to RecordEntity. |
| Result | Tests don’t pass after changes. |
| Cost | It took 7 seconds and cost $0.10 to execute the prompt and make changes through GPT-engineer. |
RecordEntity looked good. The AI accurately identified the duplicate functionality that needed to be extracted.
export abstract class RecordEntity {
  constructor(public id: string, public name: string) {}
  returnIdAndName(): string {
    return `ID: ${this.id}, Name: ${this.name}`;
  }
}
PersonEntity had some extra curly brackets, but, apart from that, was correct.
import { RecordEntity } from './RecordEntity';
export class PersonEntity extends RecordEntity {
}
}
However, ProductEntity remained unchanged, which was not the expected result.
export class ProductEntity {
  constructor(
    public id: string,
    public name: string,
    public quantity: number
  ) {}
  returnIdAndName(): string {
    return `ID: ${this.id}, Name: ${this.name}`;
  }
}
The outcome of using the GPT-engineer in this scenario was somewhat correct but did require some adjustments.
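For reference, the result we were aiming for could look roughly like the sketch below. It keeps everything in one snippet and omits the export and import statements so that it is self-contained. In the project, each class would live in its own file. The ProductEntity constructor shown here is our assumption of a reasonable shape, not output from the tool.

```typescript
// Shared state and behavior live in the superclass.
abstract class RecordEntity {
  constructor(public id: string, public name: string) {}

  returnIdAndName(): string {
    return `ID: ${this.id}, Name: ${this.name}`;
  }
}

// PersonEntity adds nothing of its own, so it inherits the constructor as is.
class PersonEntity extends RecordEntity {}

// ProductEntity keeps only what is specific to products and
// forwards the shared fields to the superclass constructor.
class ProductEntity extends RecordEntity {
  constructor(id: string, name: string, public quantity: number) {
    super(id, name);
  }
}

// Usage: both subclasses expose the inherited method.
const person = new PersonEntity("7", "Ada");
const product = new ProductEntity("1", "Widget", 3);
const labels = [person.returnIdAndName(), product.returnIdAndName()];
```

With this shape, returnIdAndName is defined once, and ProductEntity no longer duplicates the id and name declarations.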
The GPT-engineer in refactoring: Our conclusion
All things considered, the use of the GPT-engineer failed to speed up the chosen refactoring activities.
In cases where the tool had the opportunity to shine, such as when changes had to be made to multiple files across the project, it failed and created messy, inconsistent code.
When it came to small and local changes, selecting the files that needed to be changed, creating the prompt, and correcting the inaccurate result took more time than writing the code by hand or using alternatives such as GitHub Copilot.
Limitations and things to consider while implementing the GPT-engineer
If you're looking to make the refactoring process more cost- and time-effective, implementing the GPT-engineer might not be the best choice just yet. At the moment, the solution still requires a lot of manual corrections and doesn't produce satisfactory results.
The user still needs to check and adapt the code that the system generates. The tool doesn't function as an independent refactoring AI agent, and it is actually better at building a foundation for software development than at refactoring.
Nevertheless, once its refactoring potential is fulfilled, it could become a very interesting addition to a development team. For the time being, it is still a great tool for tasks such as setting up project structures or generating boilerplate code.