Visualizing GPT-Engineer Output with Mermaid

Updated Feb 17, 2025 • 15 min read

Automated test generation can produce complex outputs that are often hard to interpret. Using Mermaid, a JavaScript tool for creating diagrams and charts, helps simplify these results and identify inefficiencies.

In this guide, we’ll demonstrate how to visualize test outputs generated by GPT-Engineer using Mermaid. First, we used GPT-Engineer to create the test outputs, then leveraged GPT (ChatGPT) to transform those outputs into Mermaid diagrams. By combining GPT's processing capabilities with Mermaid’s visualization features, you can turn complex data into clear, actionable diagrams, streamlining the review and optimization process.

We’ll cover how to create flowcharts, sequence diagrams, and class diagrams to better document and improve GPT-Engineer outputs.

What is Mermaid?

Mermaid is a JavaScript tool that creates diagrams and charts from text. It allows developers and testers to easily convert written descriptions into visual formats, such as flowcharts, sequence diagrams, and class diagrams. With its simplicity and flexibility, Mermaid is ideal for documenting processes, code structures, and test case flows. By turning complex data into clear visuals, it enhances communication and understanding within development and testing teams.
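To make the "diagrams from text" idea concrete, here is a minimal sketch that turns a list of step descriptions into Mermaid flowchart text. The function name stepsToFlowchart and the node IDs (S0, S1, ...) are hypothetical choices for illustration, not part of Mermaid itself; only the emitted text follows Mermaid's flowchart syntax.

```javascript
// Hypothetical helper: turn an ordered list of step descriptions
// into Mermaid flowchart text with one node per step.
function stepsToFlowchart(steps) {
  const lines = ["flowchart TD"];
  steps.forEach((step, i) => {
    // A double quote would end the label early, so swap it for a single quote.
    const label = String(step).replace(/"/g, "'");
    lines.push(`  S${i}["${label}"]`);
    // Link each step to the one before it.
    if (i > 0) lines.push(`  S${i - 1} --> S${i}`);
  });
  return lines.join("\n");
}

console.log(stepsToFlowchart(["Fill username", "Fill password", "Submit"]));
```

The printed text can be pasted into any Mermaid renderer to get a top-down chain of three boxes.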

Why visualize output with Mermaid?

Visualizing test outputs, especially from GPT-Engineer, can significantly enhance your workflow. Clear diagrams make it easier to spot potential issues, identify gaps, and understand test flows. Mermaid simplifies this by allowing you to create readable diagrams that improve both documentation and refactoring. These visuals help testers better grasp test case sequences, component relationships, and areas needing optimization.

Key questions for visualizing GPT-Engineer outputs with Mermaid

This research explores how to effectively visualize GPT-Engineer test outputs using both ChatGPT and Mermaid. Specifically, we aim to answer the following key questions:

How can we prompt GPT to create Mermaid diagrams?
We’ll examine how to craft clear prompts that guide GPT to generate accurate and useful diagrams for testers.
Can we create a reusable template for GPT outputs?
We’ll explore the possibility of developing a standard template to consistently convert GPT outputs into Mermaid diagrams.
Which Mermaid diagrams are most effective?
We’ll determine which diagram types—such as flowcharts and sequence diagrams—are most helpful for visualizing GPT-Engineer outputs.

Using GPT for automated test visualizations in Mermaid

Here’s a step-by-step guide to getting started with Mermaid for visualizing your GPT-Engineer outputs for automated tests.

Steps for crafting a good prompt

To generate useful diagrams, crafting precise prompts is essential. When working with ChatGPT, follow these best practices:

Be specific: Clearly specify the type of diagram you want, whether it’s a flowchart, sequence diagram, or class diagram.
Focus on structure: Define the logical steps the diagram should represent, such as test cases or page object classes.
Provide clear examples: Detailed examples help GPT generate more accurate diagrams.

During the research, I provided several GPT-Engineer-generated outputs from previous projects to GPT. In most cases, GPT successfully generated promising diagram structures with just a basic prompt. For example:

Create mermaid graph based on
  it('should login with valid credentials and complete a purchase', () => {
    cy.fixture('users').then((users) => {
      const { validUser } = users;
      loginPage.fillUsername(validUser.username);
      loginPage.fillPassword(validUser.password);
      loginPage.submit();
      inventoryPage.addItemToCart('Sauce Labs Backpack');
      inventoryPage.goToCart();
      cartPage.checkout();
      checkoutPage.fillFirstName('John');
      checkoutPage.fillLastName('Doe');
      checkoutPage.fillPostalCode('12345');
      checkoutPage.continue();
      checkoutPage.finish();
    });
  });
  it('should fail login with invalid credentials', () => {
    cy.fixture('users').then((users) => {
      const { invalidUser } = users;
      loginPage.fillUsername(invalidUser.username);
      loginPage.fillPassword(invalidUser.password);
      loginPage.submit();
      cy.get('.error-message-container').should('be.visible');
    });
  });

The result, once visualized as a graph, looked like this:

GPT-generated diagram for a basic prompt

GPT generated a graph that clearly visualized the steps in each test case, making it easier to spot any missing elements. However, using the same basic prompt sometimes produced inconsistent results, such as switching between flowcharts and sequence diagrams. This highlights the need for more specific instructions to achieve consistent, repeatable outputs. Below are examples of prompts that delivered more accurate results.

Graphs

Create mermaid graph based on
describe('Saucedemo UI Tests', () => {
  const loginPage = new LoginPage();
  const inventoryPage = new InventoryPage();
  const cartPage = new CartPage();
  const checkoutPage = new CheckoutPage();

  beforeEach(() => {
    loginPage.visit();
  });

  it('should login with valid credentials and complete a purchase', () => {
    cy.fixture('users').then((users) => {
      const { validUser } = users;
      loginPage.fillUsername(validUser.username);
      loginPage.fillPassword(validUser.password);
      loginPage.submit();

      inventoryPage.addItemToCart('Sauce Labs Backpack');
      inventoryPage.goToCart();

      cartPage.checkout();

      checkoutPage.fillFirstName('John');
      checkoutPage.fillLastName('Doe');
      checkoutPage.fillPostalCode('12345');
      checkoutPage.continue();
      checkoutPage.finish();
    });
  });

  it('should fail login with invalid credentials', () => {
    cy.fixture('users').then((users) => {
      const { invalidUser } = users;
      loginPage.fillUsername(invalidUser.username);
      loginPage.fillPassword(invalidUser.password);
      loginPage.submit();

      cy.get('.error-message-container').should('be.visible');
    });
  });
});

Use the graph structure to demonstrate the steps that need to be taken in each test case. The graph should follow the subgraph structure below:
1 Test suite
2 Test case
3 Steps for each test case

The result was a fairly accurate graph structure:

More accurate GPT-generated graph for more complex prompts

This time, the graph was more accurate, though a bit cluttered, making some test case descriptions difficult to read. GPT successfully managed the beforeEach hook and placed it correctly, but more complex hooks led to less predictable results.
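For comparison, the nested structure requested in the prompt (test suite, then test cases, then steps) can be sketched directly in code. This is not GPT's output, only an illustration of the target shape. The function suiteToMermaid, the input object layout, and the generated IDs are all hypothetical, and the step labels are shortened.

```javascript
// Hypothetical helper: render a suite as nested Mermaid subgraphs
// (suite > optional beforeEach hook > test cases > chained steps).
function suiteToMermaid(suite) {
  const lines = ["graph TD"];
  let nextId = 0;
  // Quote labels so parentheses and dots are safe inside them.
  const quote = (text) => `"${String(text).replace(/"/g, "'")}"`;

  // Emit one subgraph whose nodes are chained in order.
  const addGroup = (title, steps) => {
    lines.push(`    subgraph g${nextId++}[${quote(title)}]`);
    let previous = null;
    for (const step of steps) {
      const nodeId = `n${nextId++}`;
      lines.push(`      ${nodeId}[${quote(step)}]`);
      if (previous) lines.push(`      ${previous} --> ${nodeId}`);
      previous = nodeId;
    }
    lines.push("    end");
  };

  lines.push(`  subgraph suite[${quote(suite.name)}]`);
  if (suite.beforeEach && suite.beforeEach.length > 0) {
    addGroup("Before each test", suite.beforeEach);
  }
  for (const test of suite.tests) addGroup(test.name, test.steps);
  lines.push("  end");
  return lines.join("\n");
}

const suiteText = suiteToMermaid({
  name: "Saucedemo UI Tests",
  beforeEach: ["loginPage.visit()"],
  tests: [
    {
      name: "should fail login with invalid credentials",
      steps: ["fillUsername", "fillPassword", "submit", "error is visible"],
    },
  ],
});
console.log(suiteText);
```

Giving the hook its own subgraph mirrors the idea of listing it as a separate level in the prompt, so it cannot be silently merged into a test case.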

For test cases with more intricate hooks, the same approach proved less successful:

Create mermaid graph based on
describe('Inventory Page Tests', () => {
  const loginPage = new LoginPage();
  const inventoryPage = new InventoryPage();
  beforeEach(() => {
    loginPage.visit();
    loginPage.fillUsername('standard_user');
    loginPage.fillPassword('secret_sauce');
    loginPage.submit();
    cy.url().should('include', '/inventory.html');
  });
  it('should add an item to the cart', () => {
    inventoryPage.addItemToCart('Sauce Labs Backpack');
    inventoryPage.getCartBadge().should('contain', '1');
  });
  it('should remove an item from the cart', () => {
    inventoryPage.addItemToCart('Sauce Labs Backpack');
    inventoryPage.removeItemFromCart('Sauce Labs Backpack');
    inventoryPage.getCartBadge().should('not.exist');
  });
});

Use the graph structure to demonstrate the steps that need to be taken in each test case. The graph should follow the subgraph structure below:
1 Test suite
2 Test case
3 Steps for each test case

The generated graph was far from perfect:

GPT-generated diagram for test cases with more intricate hooks

To ensure GPT knows where to place the hook, explicitly list it in the graph structure. While this doesn’t guarantee perfect results—GPT may still produce unpredictable output from the same prompt—it should work in most cases.

Create mermaid graph based on
describe('Inventory Page Tests', () => {
  const loginPage = new LoginPage();
  const inventoryPage = new InventoryPage();
  beforeEach(() => {
    loginPage.visit();
    loginPage.fillUsername('standard_user');
    loginPage.fillPassword('secret_sauce');
    loginPage.submit();
    cy.url().should('include', '/inventory.html');
  });
  it('should add an item to the cart', () => {
    inventoryPage.addItemToCart('Sauce Labs Backpack');
    inventoryPage.getCartBadge().should('contain', '1');
  });
  it('should remove an item from the cart', () => {
    inventoryPage.addItemToCart('Sauce Labs Backpack');
    inventoryPage.removeItemFromCart('Sauce Labs Backpack');
    inventoryPage.getCartBadge().should('not.exist');
  });
});

Use the graph structure to demonstrate the steps that need to be taken in each test case. The graph should follow the subgraph structure below:
1 Test suite
2 Before each test
3 Test cases
4 Steps for each test case

Flowchart

A similar, and perhaps easier-to-follow, way to present the logic of your tests is through flowcharts:

Create a mermaid flowchart based on
  it('should login with valid credentials and complete a purchase', () => {
    cy.fixture('users').then((users) => {
      const { validUser } = users;
      loginPage.fillUsername(validUser.username);
      loginPage.fillPassword(validUser.password);
      loginPage.submit();
      inventoryPage.addItemToCart('Sauce Labs Backpack');
      inventoryPage.goToCart();
      cartPage.checkout();
      checkoutPage.fillFirstName('John');
      checkoutPage.fillLastName('Doe');
      checkoutPage.fillPostalCode('12345');
      checkoutPage.continue();
      checkoutPage.finish();
    });
  });
  it('should fail login with invalid credentials', () => {
    cy.fixture('users').then((users) => {
      const { invalidUser } = users;
      loginPage.fillUsername(invalidUser.username);
      loginPage.fillPassword(invalidUser.password);
      loginPage.submit();
      cy.get('.error-message-container').should('be.visible');
    });
  });

The flowchart should demonstrate the sequence of steps to be taken in each test case.

What about more complex tests, such as those using hooks? As always, being as precise as possible is key in these cases:

Create a mermaid flowchart based on
describe('Inventory Page Tests', () => {
  const loginPage = new LoginPage();
  const inventoryPage = new InventoryPage();
  beforeEach(() => {
    loginPage.visit();
    loginPage.fillUsername('standard_user');
    loginPage.fillPassword('secret_sauce');
    loginPage.submit();
    cy.url().should('include', '/inventory.html');
  });
  it('should add an item to the cart', () => {
    inventoryPage.addItemToCart('Sauce Labs Backpack');
    inventoryPage.getCartBadge().should('contain', '1');
  });
  it('should remove an item from the cart', () => {
    inventoryPage.addItemToCart('Sauce Labs Backpack');
    inventoryPage.removeItemFromCart('Sauce Labs Backpack');
    inventoryPage.getCartBadge().should('not.exist');
  });
});

The flowchart should demonstrate the sequence of steps to be taken in each test case. It should use the following subgraph structure:
1. Steps from beforeEach hook
2. Steps from a test case

The same structure should be used to represent all test cases above.

The resulting diagram reflects the desired structure:

A slightly different prompt, such as "The flowchart should demonstrate the sequence of steps in each test case, with the steps from the beforeEach hook logically connected to those in each test case", can sometimes produce a more confusing chart.

It may represent the flow of your tests, but it’s fragmented and much harder to decipher. Unless this is your intended structure, I’d recommend sticking with the previous approach.

Sequence diagram

Another way to represent the logic of your tests is through sequence diagrams, which illustrate the sequence of interactions between entities in a given process:

Create a mermaid sequence diagram based on
describe('Inventory Page Tests', () => {
  const loginPage = new LoginPage();
  const inventoryPage = new InventoryPage();
  beforeEach(() => {
    loginPage.visit();
    loginPage.fillUsername('standard_user');
    loginPage.fillPassword('secret_sauce');
    loginPage.submit();
    cy.url().should('include', '/inventory.html');
  });
  it('should add an item to the cart', () => {
    inventoryPage.addItemToCart('Sauce Labs Backpack');
    inventoryPage.getCartBadge().should('contain', '1');
  });
  it('should remove an item from the cart', () => {
    inventoryPage.addItemToCart('Sauce Labs Backpack');
    inventoryPage.removeItemFromCart('Sauce Labs Backpack');
    inventoryPage.getCartBadge().should('not.exist');
  });
});

The diagram should represent the logical sequence of steps that need to be taken in each test case. The beforeEach hook contains steps to be taken before each test case. Please use the following structure:
1. beforeEach hook
2. Test case
3. beforeEach hook
4. Test case

The resulting diagram effectively represents the sequences, showing which entities interact at each stage—for example, the user interacts with the inventory page (addItemToCart), and later the inventory page confirms that the item was added to the cart (getCartBadge):

If you want to show that each hook appears before each test case, it’s best to specify the diagram's structure in steps, as shown in the previous prompt. Otherwise, you might end up with a single beforeEach sequence block at the top of the diagram—unless that’s the intended outcome.
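The repeated "hook, test case, hook, test case" layout can be illustrated with a small sketch that emits Mermaid sequence diagram text and replays the beforeEach calls ahead of every test. The function testsToSequenceDiagram, the "Test" actor, and the call-object layout are hypothetical, and the action labels are abbreviated from the test code above.

```javascript
// Hypothetical helper: render tests as a Mermaid sequence diagram,
// repeating the beforeEach calls ahead of every test case.
function testsToSequenceDiagram(actor, hookCalls, tests) {
  const lines = ["sequenceDiagram", `  participant ${actor}`];

  // Declare each page object once, in order of first use.
  const allCalls = [...hookCalls, ...tests.flatMap((test) => test.calls)];
  for (const target of new Set(allCalls.map((call) => call.target))) {
    lines.push(`  participant ${target}`);
  }

  const emit = (calls) => {
    for (const call of calls) {
      lines.push(`  ${actor}->>${call.target}: ${call.action}`);
    }
  };

  for (const test of tests) {
    lines.push(`  Note over ${actor}: beforeEach`);
    emit(hookCalls);
    lines.push(`  Note over ${actor}: ${test.name}`);
    emit(test.calls);
  }
  return lines.join("\n");
}

const sequenceText = testsToSequenceDiagram(
  "Test",
  [
    { target: "LoginPage", action: "visit()" },
    { target: "LoginPage", action: "submit()" },
  ],
  [
    {
      name: "should add an item to the cart",
      calls: [{ target: "InventoryPage", action: "addItemToCart()" }],
    },
    {
      name: "should remove an item from the cart",
      calls: [
        { target: "InventoryPage", action: "addItemToCart()" },
        { target: "InventoryPage", action: "removeItemFromCart()" },
      ],
    },
  ]
);
console.log(sequenceText);
```

Moving the two hook lines out of the loop would give the other layout described above, with a single beforeEach block at the top.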

Class diagram

In addition to visualizing the flow or sequence of your tests, you may want to focus on the classes generated within them. For this, you can use a class diagram:

Create a mermaid class diagram based on
describe('Inventory Page Tests', () => {
  const loginPage = new LoginPage();
  const inventoryPage = new InventoryPage();
  beforeEach(() => {
    loginPage.visit();
    loginPage.fillUsername('standard_user');
    loginPage.fillPassword('secret_sauce');
    loginPage.submit();
    cy.url().should('include', '/inventory.html');
  });
  it('should add an item to the cart', () => {
    inventoryPage.addItemToCart('Sauce Labs Backpack');
    inventoryPage.getCartBadge().should('contain', '1');
  });
  it('should remove an item from the cart', () => {
    inventoryPage.addItemToCart('Sauce Labs Backpack');
    inventoryPage.removeItemFromCart('Sauce Labs Backpack');
    inventoryPage.getCartBadge().should('not.exist');
  });
});

The diagram should represent the page object classes used in the test cases and commands associated with them. The diagram should contain three entities:
1. InventoryPageTests
2. LoginPage
3. InventoryPage

The resulting diagram illustrates the classes used in the tests, along with the commands or actions associated with each:
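A class diagram of this shape can also be sketched in code. The example below is not GPT's output; it builds Mermaid class diagram text from a map of page objects to the methods the tests call on them. The function toClassDiagram and the "uses" relation label are hypothetical choices for illustration.

```javascript
// Hypothetical helper: render page objects and their methods as a
// Mermaid class diagram, each linked from the test class that uses them.
function toClassDiagram(testClass, pageObjects) {
  const lines = ["classDiagram", `  class ${testClass}`];
  for (const [className, methods] of Object.entries(pageObjects)) {
    lines.push(`  class ${className} {`);
    for (const method of methods) lines.push(`    +${method}()`);
    lines.push("  }");
    lines.push(`  ${testClass} --> ${className} : uses`);
  }
  return lines.join("\n");
}

const classText = toClassDiagram("InventoryPageTests", {
  LoginPage: ["visit", "fillUsername", "fillPassword", "submit"],
  InventoryPage: ["addItemToCart", "removeItemFromCart", "getCartBadge"],
});
console.log(classText);
```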

Challenges

While GPT can often generate useful Mermaid diagrams, there are some challenges to be aware of:

Syntax Errors: Occasionally, the generated diagrams contain syntax that can’t be used in Mermaid without manual fixes. These errors can vary from minor to complex, even with previously successful prompts.
Inconsistent Output: GPT sometimes produces different types of diagrams from similar prompts, requiring precise instructions to ensure consistency.
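One way to catch the inconsistent-output problem early is to check the first keyword of the returned diagram before rendering it. The sketch below is a hypothetical helper, not part of Mermaid. It covers only the diagram types discussed in this guide and treats graph and flowchart as equivalent, since both keywords produce flowcharts in Mermaid.

```javascript
// Diagram keywords covered in this guide (an assumption; Mermaid has more).
const KNOWN_TYPES = ["flowchart", "graph", "sequenceDiagram", "classDiagram"];

// Return the diagram keyword on the first meaningful line, or null.
function detectDiagramType(mermaidText) {
  const firstLine = mermaidText
    .split("\n")
    .map((line) => line.trim())
    // Skip blank lines and Mermaid comments, which start with %%.
    .find((line) => line !== "" && !line.startsWith("%%"));
  if (!firstLine) return null;
  const keyword = firstLine.split(/\s+/)[0];
  return KNOWN_TYPES.includes(keyword) ? keyword : null;
}

// True when the diagram is of the type the prompt asked for.
function matchesExpectedType(mermaidText, expected) {
  const normalize = (type) => (type === "graph" ? "flowchart" : type);
  const actual = detectDiagramType(mermaidText);
  return actual !== null && normalize(actual) === normalize(expected);
}

console.log(matchesExpectedType("sequenceDiagram\n  A->>B: hi", "flowchart"));
```

A failed check is a cheap signal to re-run the prompt with a more explicit diagram type.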

Another recurring issue was correctly generated diagram structures that included extra spaces in a few lines, causing the following error:

Fortunately, this is a minor issue that can be easily fixed with small adjustments in the code. However, keep in mind that GPT's output is often slightly imperfect and may require some refactoring, even if you're not familiar with Mermaid.
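The source does not say where the extra spaces appeared. Assuming they were trailing spaces at the ends of lines, a small cleanup pass like the hypothetical tidyMermaid below could be run before rendering. Indentation is left untouched, and the sketch does not address any other kind of syntax error.

```javascript
// Hypothetical cleanup pass: normalize line endings and strip trailing
// whitespace from every line. Leading indentation is preserved.
function tidyMermaid(mermaidText) {
  return mermaidText
    .replace(/\r\n/g, "\n")
    .split("\n")
    .map((line) => line.trimEnd())
    .join("\n");
}

console.log(JSON.stringify(tidyMermaid("graph TD  \n  A --> B \t\n")));
```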

Conclusions after the research

After conducting this research, it’s clear that while GPT is not a perfect AI tool for generating Mermaid diagrams, it is capable of producing useful results when guided by precise prompts.

Crafting effective prompts

To get reliable outputs, precision is key. A good prompt should:

Refer specifically to the type of diagram (e.g., flowchart, sequence diagram).
Clearly describe the diagram’s purpose.
Outline the steps, structure, and entities needed to avoid unpredictable results.

Using templates for transformation

While there isn’t a universal template for every scenario, a general prompt structure can help streamline the process. For example:

The [GRAPH TYPE] should illustrate the [INTENDED PURPOSE]. It should follow this structure:

[STEP 1]
[STEP 2]

This approach can be adapted for various purposes, such as visualizing test logic, representing classes, or detailing the structure of a test suite.
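The template can be filled in programmatically so that every prompt follows the same shape. The sketch below assumes a hypothetical buildPrompt function and field names (graphType, purpose, structure, source); the wording follows the template and the example prompts used earlier in this guide.

```javascript
// Hypothetical helper: fill the general prompt template with a diagram
// type, a purpose, an ordered structure, and the test code to visualize.
function buildPrompt({ graphType, purpose, structure, source }) {
  const numbered = structure.map((item, i) => `${i + 1}. ${item}`).join("\n");
  return [
    `Create a mermaid ${graphType} based on`,
    source,
    "",
    `The ${graphType} should illustrate the ${purpose}. It should follow this structure:`,
    numbered,
  ].join("\n");
}

const prompt = buildPrompt({
  graphType: "flowchart",
  purpose: "sequence of steps to be taken in each test case",
  structure: ["Steps from beforeEach hook", "Steps from a test case"],
  source: "describe('Inventory Page Tests', () => { /* test code here */ });",
});
console.log(prompt);
```

Keeping the structure as an array makes it easy to add or drop a level, such as the beforeEach hook, without rewriting the whole prompt.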

Most useful Mermaid diagrams

In this research, flowcharts and general graphs proved to be the most useful for visualizing test logic and sequences. These diagrams are straightforward and easy to follow. However, class and sequence diagrams can also provide valuable insights depending on the specific needs of your project.

Final thoughts

While GPT-generated diagrams may require minor corrections, using GPT in combination with Mermaid is a time-saving and efficient method for visualizing test outputs. With well-crafted prompts, this approach can streamline the process of documenting and optimizing test workflows.