ChatGPT is Useless for Real Test Automation
Avoid yet another hype

Coding Jag (a software testing newsletter) #120 featured three articles on ChatGPT:
- ChatGPT for testers
- Using ChatGPT for Test Automation
- ChatGPT for iMessage, Slack, user feedback, and beyond
I am aware of this because my daughter’s article, Playwright vs Selenium WebDriver Syntax Comparison by Example, is also featured in Coding Jag #120.
Some of you might have heard of “ChatGPT”, and of AI in test automation in general. I will give my conclusion first: “ChatGPT is totally useless for Real Test Automation”. I will illustrate this with ChatGPT examples.
A Simple User Login Test
A user login test is regarded as the “Hello World” of test automation. Logically, we would expect AI Testing to score perfectly on this simplest of test scenarios, for which training material is abundant. Let’s see how well ChatGPT performs.
1. My Request:

2. ChatGPT’s Answer (part 1: test design in steps)

3. ChatGPT’s Answer (part 2: test script)

My Assessment
I will focus on the test script part (the test design is not worth discussing). Note that I only asked for a “user login automated test”, without mentioning any framework or scripting language.
Pros:
1. The test script uses the Selenium WebDriver framework.
A good choice. Real automated testers take this for granted. However, considering how many fake automated testers promote Protractor (deprecated), TestCafe (hardly seen now), Cypress (Cypress.io, the company behind Cypress, is dying) and Playwright, at least ChatGPT did not make a fundamental error: it stuck with W3C’s WebDriver standard.
2. The test script is in Python, a scripting language.
Automated Test Scripts Shall be in the Syntax of a Scripting Language, such as Ruby or Python. We have seen many fake automated testers use C#, Java and JavaScript (JS is not a pure scripting language). Not long ago, I saw a survey showing that Java, a compiled language, is the most used language in the Selenium community. I will show another piece of evidence shortly, related to ChatGPT.
ChatGPT uses Python over Ruby for an obvious reason: Python's dominance in AI.
Cons:
1. It is not a real test case.
A real automated test case shall live in a test framework, such as JUnit for Java or RSpec for Ruby. pytest is a popular test framework for Python, but ChatGPT does not use it. Its assertions are done with the bare language keyword "assert", and the script has no test-case structure around them.
What if I want to add an alternative test scenario, “User login failed” (a very common one, almost a must) later?
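To make the point concrete, here is a minimal sketch of how the two scenarios could sit side by side in a test framework. It uses Python's built-in unittest rather than pytest, purely to avoid an extra dependency. The field names "username" and "password" and the demo credentials are assumptions not stated in this article; only the "commit" button and the "Sign off" link come from the page itself. Treat it as an illustration of structure, not a verified script.

```python
import unittest

try:
    from selenium import webdriver
    from selenium.webdriver.common.by import By
except ImportError:  # lets the file be loaded where Selenium is absent
    webdriver = By = None

SITE_URL = "https://travel.agileway.net"
# Assumed demo credentials and field names; the article does not state them.
USERNAME, PASSWORD = "agileway", "testwise"


class UserLoginTest(unittest.TestCase):
    def setUp(self):
        if webdriver is None:
            self.skipTest("Selenium is not installed")
        self.driver = webdriver.Chrome()
        self.addCleanup(self.driver.quit)  # always close the browser
        self.driver.get(SITE_URL)

    def sign_in(self, username, password):
        # Shared steps, reused by every login scenario.
        self.driver.find_element(By.NAME, "username").send_keys(username)
        self.driver.find_element(By.NAME, "password").send_keys(password)
        # The submit button on the page: name='commit', value='Sign in'.
        self.driver.find_element(By.NAME, "commit").click()

    def test_login_ok(self):
        self.sign_in(USERNAME, PASSWORD)
        # A signed-in user sees a "Sign off" link.
        self.assertTrue(self.driver.find_elements(By.LINK_TEXT, "Sign off"))

    def test_login_failed(self):
        self.sign_in(USERNAME, "wrong-password")
        # No "Sign off" link means the user is still signed out.
        self.assertFalse(self.driver.find_elements(By.LINK_TEXT, "Sign off"))
```

Saved as a file, this would be run with `python -m unittest`. Adding the “User login failed” scenario costs one short method because the framework supplies setup, teardown and reporting for each test case.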
2. The default browser, Firefox, is wrong.
I don’t need to show the stats to state that Chrome is the dominant browser. It does not take high intelligence to figure that out.
3. The test script is invalid!
driver.find_element(By.ID, "login").click()
There is no element with the ID "login" on the login page. The button there is a “Sign in” button:
<input type='submit' name='commit' value='Sign in'>
driver.find_element(By.LINK_TEXT, 'Logout').click()
There is no “Logout” link; the link is “Sign off”. The script also wrongly assumes there is a “Dashboard”.
In other words, ChatGPT does not analyze https://travel.agileway.net and then perform some AI magic to generate dedicated, working test scripts. To me, it is totally useless. I am saying this as a real test automation engineer who has been developing thousands of requested (non-generic) automated test scripts for over 17 years.
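The mismatch is easy to confirm mechanically. The sketch below uses only Python's standard library and checks a locator against the one piece of markup quoted above. `AttrCollector` and `has_element` are hypothetical helper names, and a real check would be run against the full page source rather than a single tag.

```python
from html.parser import HTMLParser

# The only markup quoted in this article for the sign-in page.
LOGIN_PAGE_SNIPPET = "<input type='submit' name='commit' value='Sign in'>"


class AttrCollector(HTMLParser):
    """Collects the attributes of every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.elements = []  # list of (tag, {attribute: value})

    def handle_starttag(self, tag, attrs):
        self.elements.append((tag, dict(attrs)))


def has_element(html, attr, value):
    """True if any element in html has the given attribute value."""
    parser = AttrCollector()
    parser.feed(html)
    parser.close()
    return any(a.get(attr) == value for _, a in parser.elements)


print(has_element(LOGIN_PAGE_SNIPPET, "id", "login"))     # False
print(has_element(LOGIN_PAGE_SNIPPET, "name", "commit"))  # True
```

The generated `By.ID, "login"` locator finds nothing, while a locator built from the actual markup (`name='commit'`) does. A tool that never looks at the page cannot know this.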
“The reason that you and your team are hired is to develop a custom software solution. A generic AI bot won’t help you” — Zhimin Zhan
How about a more complex test scenario?
The above is a user login test, a test automation script in its simplest form. Yet ChatGPT got it wrong!
Let me pretend to be a naive ‘automated tester’ who wishes AI Testing could do the work for me: write an automated test case for a business scenario at work.

The test script ChatGPT generated:

Of course, it is all wrong.
- There is no “/login” page; it should be “/sign-in”.
- There is no “#login” button; it should be “#login-btn”.
- The booking part has nothing to do with WhenWise.
Also note the syntax error in the last step, which should be "driver.close()".
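A syntax error like this can be caught without opening a browser at all. Here is a small sketch that parses a script with Python's standard `ast` module. `syntax_error_of` is a hypothetical helper, and the broken sample is made up for illustration, since the exact faulty line from ChatGPT's output is not reproduced here.

```python
import ast


def syntax_error_of(script):
    """Return a short description of the first syntax error, or None."""
    try:
        ast.parse(script)  # parses only; nothing is executed
    except SyntaxError as err:
        return f"line {err.lineno}: {err.msg}"
    return None


correct_step = "driver.close()\n"
broken_step = "driver.close(\n"  # made-up example of a malformed last step

print(syntax_error_of(correct_step))  # None
print(syntax_error_of(broken_step) is not None)  # True
```

Passing this check only means the script parses. It says nothing about whether the URLs and locators match the real application.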

Note that this is still quite a simple and common scenario. In reality, business scenarios at work are much more complex, such as submitting an insurance claim, which may involve hundreds of steps.
--
In the next article on this topic, I will explain why the common Arguments Supporting ChatGPT in End-to-End Test Automation are Wrong.
The original article was published on my Medium Blog, 2023-01-01
About the Creator
Zhimin Zhan
Test automation & CT coach, author, speaker and award-winning software developer.
A top writer on Test Automation, with 150+ articles featured in leading software testing newsletters.


