ChatGPT has been found to produce incorrect answers for over half of the software engineering questions posed, according to fresh research.

The Purdue University study saw researchers analyze ChatGPT’s answers to 517 Stack Overflow questions, with the aim of assessing the “correctness, consistency, comprehensiveness, and conciseness” of the answers presented by the generative AI tool.

Researchers reported that 52% of answers to programming-related queries were inaccurate, while more than three-quarters (77%) were deemed “verbose”.

A key talking point from the study centered on how users interpret the answers ChatGPT presents, as well as on the perceived legitimacy of the chatbot’s output.

Researchers said ChatGPT’s answers are “still preferred [to Stack Overflow] 39.34% of the time due to their comprehensiveness and well-articulated language style”, resulting in users taking answers at face value.

“When a participant failed to correctly identify the incorrect answer, we asked them what could be the contributing factors,” researchers said. “Seven out of 12 participants mentioned the logical and insightful explanations, and comprehensive and easy-to-read solutions generated by ChatGPT made them believe it to be correct.”

Of the “preferred answers” users identified in response to software queries, more than three-quarters (77%) were found to be wrong.

Researchers said users were only able to identify errors in ChatGPT’s answers when those errors were glaringly obvious. However, in instances where the error was “not readily verifiable”, users frequently failed to identify incorrect answers or “underestimate[d] the degree” of error in the answer itself.

Surprisingly, the study also found that even when answers contained obvious errors, two out of 12 participants still marked them as correct and said they “preferred that answer”.

Researchers said the perceived legitimacy of answers presented by ChatGPT should be a cause for concern among users, and that “communication correctness” should be a key focus for the creators of such tools.

ChatGPT does give users ample warning that the answers provided may not be entirely accurate, stating the chatbot “may produce inaccurate information about people, places, or facts”.

But the study suggested “such a generic warning is insufficient” and recommended that answers be complemented with a disclaimer outlining the “level of incorrectness and uncertainty”.

“Previous studies show that LLM knows when it is lying, but does LLM know when it is speculating? And how can we communicate the level of speculation?” the study pondered. “Therefore, it’s imperative to investigate how to communicate the level of incorrectness of the answers.”

The use of generative AI tools in software development and programming has gathered significant pace in recent years, most notably with the launch of GitHub’s Copilot services.

Earlier this year, the firm announced the general availability of the AI-based coding assistant for business customers. The model is designed specifically to bolster code safety and has been hailed by developers as a vital tool in supporting their daily operations.

A survey published by GitHub in June revealed that a majority (92%) of developers now use an AI coding tool at work, with 70% saying they see “significant benefits” to using generative AI tools in workplace settings.

Source: https://www.itpro.com/technology/artificial-intelligence/chatgpt-gives-wrong-answers-to-programming-questions-more-than-50-of-the-time