ChatGPT for Bespoke Test Data Generation

I’m seeing a lot of brilliant posts, which I’ve learned a tonne from, all about ChatGPT and what it can do to help us Quality Engineers. I’ve also posted about how contentious it is to even acknowledge that you are using ChatGPT – my earlier musings on this topic seem to have been borne out in my experience so far.

However, the one thing I haven’t yet heard anyone go into is using ChatGPT for test data generation.

TLDR: Watch this video to see how I did this

There are a few advantages to this:-

  • Realistic data sets – We know the data behind ChatGPT isn’t the most up to date, but it is at least based on a heck of a lot of data. So maybe you want to know what the most popular products are for your company, or what the highest-grossing films were in Morocco in 2010, or what the most well-known technical trailblazers’ names are. If it’s something where you care about what others think, as opposed to having a linear set of something, e.g. 1-200, then being able to tap into those data sets could help you be more realistic
  • Bespoke Test Data – there are already brilliant libraries you can use (such as faker.js) which auto-generate test data for you, but what if you need something more specific? One example could be a field that requires first names, where you only want to use female first names. That kind of sub-selection isn’t always something you can do out of the box.
  • Something fun for Demos – want to spice up a customer demonstration, or an end-of-sprint show and tell? Plug in unique test data and ask it for something wild!

I spent an evening a few weeks back solving this puzzle.

I created:-

  • an open-source, free, publicly available workspace in Postman
  • Using Postman Flows (the low-code workflow builder feature), you simply modify a query that you feed in to ChatGPT using the template provided, and Flows formats the response that comes back to allow you to create an array from the comma-separated test data
  • This test data is then immediately plugged into an API request – showing the end-to-end process of test data generation and looping through a request for each and every bit of test data ChatGPT provides – so you get to select how many times you want this to run by asking for that number of items in your query.
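
If you would rather see the shape of that flow in code than in Flows, here is a minimal Python sketch of the same three steps: build the query, turn the comma-separated reply into an array, and produce one request body per item. It is not the Postman workspace itself. The function names, the prompt wording and the `firstName` field are all made up for illustration, and the reply is a hard-coded stand-in rather than a real call to the API.

```python
import json


def build_prompt(description: str, count: int) -> str:
    """Ask for exactly `count` items as one comma-separated line."""
    return (
        f"Give me {count} {description}. "
        "Reply with a single comma-separated line and nothing else."
    )


def parse_items(reply: str) -> list[str]:
    """Split a comma-separated reply into a clean list of values."""
    # Treat line breaks as separators too, then trim spaces, quotes and full stops.
    parts = reply.replace("\n", ",").split(",")
    items = (part.strip().strip(".\"'") for part in parts)
    return [item for item in items if item]


def build_requests(items: list[str], field: str) -> list[str]:
    """Build one JSON request body per generated value."""
    return [json.dumps({field: item}) for item in items]


# Hypothetical stand-in for the text ChatGPT sends back; a real run would
# send build_prompt(...) to the API and use the returned text instead.
sample_reply = "Ada, Grace, Margaret,\nHedy, Katherine."

names = parse_items(sample_reply)
for body in build_requests(names, "firstName"):
    print(body)  # in the real flow, each body would be sent as its own request
```

The number of requests falls out of the number of items you ask for, which is the same trick the flow uses. Note that trimming full stops is a blunt instrument: it would also eat the dot from a value like "St.", so adjust the clean-up to suit your data.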

The results are in this video:-

YouTube Video Walking Through OpenAI Test Data Generator

Pros and Cons

  • ChatGPT will not be free forever, so this may have a limited shelf life unless you’re willing to pay for access to the API
  • Asking for large datasets may use your free tokens pretty quickly
  • If you ask the same query, ChatGPT may well come back with the same answer. So it’s really important you know what to ask it if you need randomised data every time the query runs
  • You don’t know how accurate the data is – so be careful what you are asking it for and how much you rely on it as a source of truth
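
One possible way around the repeated-answer problem, and the token cost, is to ask for a bigger pool once and then pick from it at random yourself on each run. This is a rough Python sketch of that idea rather than anything built into the workspace; the `pick_random` helper and the sample pool are hypothetical.

```python
import random
from typing import Optional


def pick_random(pool: list[str], count: int, seed: Optional[int] = None) -> list[str]:
    """Deduplicate the pool, then pick `count` distinct values from it."""
    unique = list(dict.fromkeys(pool))  # drops repeats, keeps first-seen order
    if count > len(unique):
        raise ValueError(f"asked for {count} values but only {len(unique)} are unique")
    # A fixed seed makes the pick repeatable; leave it as None for a fresh pick each run.
    return random.Random(seed).sample(unique, count)


# Hypothetical pool, standing in for one larger reply from ChatGPT.
pool = ["Ada", "Grace", "Margaret", "Hedy", "Katherine", "Grace", "Ada"]

print(pick_random(pool, 3))
```

Passing a seed is handy when a test fails and you want to replay it with exactly the same data.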

I hope this helps you in some way, as this is quite a novel way to use ChatGPT from what I’ve read (although I’m sure someone will create a more user-friendly tool that does something similar soon, if they haven’t already). As with all of my posts, videos and community work, I didn’t get a penny for creating this or putting it out there, so if you do find it useful please remember to say thank you and quote your sources. It gives me the impetus to keep going!

T’ra for now 😜
