Tech

ChatGPT: Smooth Talker But A Terrible AI

The more I interacted with ChatGPT, the more I realised that it was not just mathematical or logical ineptitude. There is a much larger problem with ChatGPT. It has a penchant for making errors, both arithmetic and factual.

Aniruddhan GowrisankarFeb 12, 2023, 05:32 PM | Updated 05:41 PM IST

OpenAI's ChatGPT (Photo by Jonathan Kemper on Unsplash)

Have we truly entered the "age of artificial intelligence?"

Ever since ChatGPT took the world by storm, this has become a central question.

So, how good is it? Are we witnessing the rise of a new AI? Has OpenAI caught some of the technology giants napping?

The jury is still out. But, in my opinion, ChatGPT is anything but an AI, at least not of the kind we have in mind. Despite all the acclaim and praise, there seems to be a fundamental anomaly with ChatGPT.

Figure 1

Figure 2

For an AI, mathematics skills are taken for granted.

Multiplying two numbers, large as they may be, should be comfortable. However, while ChatGPT understood the task (Figure 1) and the numbers it had to multiply, it still got the answer wrong (Figure 2).

At first glance, this seems harmless. After all, ChatGPT is still sort of in the beta testing stage. So, I took the pains to correct it. Although it should not have committed a mistake in the first place, the chatbot accepted and apologised for it. An AI gentleman.

When asked to make the same computation again, it provided the same wrong answer that I had cooked up. Again. And again.

Naturally, I felt sorry for it and hinted that it did not sound correct. Adding to my guilt, all of its responses by then began with an apology.

However, when it did finally pick up on my cue, it came back a full circle and gave me the original wrong answer. Not the sharpest tool in the shed.

Figure 3

The natural next stop was logic and reasoning, and the same peculiar problem continued.

ChatGPT is able to comprehend the puzzle, but drops the ball when it comes to execution. The amazing thing, however, is its knack for generating human-like equations and steps.

Looking past the erroneous definition (work is not inversely proportional to the time taken to complete the work) and the wrong answer, the fact that it did so, in and of itself, is commendable.

In another thread (Figure 5), when asked a similar question, it made sound logical relationships between the factors, but still got the answer wrong.

Figure 4

Figure 5

The more I interacted with ChatGPT, the more I realised that this was not just mathematical or logical ineptitude. There is a much larger problem with ChatGPT. It has a penchant for making errors, both arithmetic and factual.

Figure 6

Quite frankly, the answer felt so wrong on so many levels that I am unsure of where to start. Without going into the jargon, let me break it down (except for statement three; too much maths there for my understanding).

1729 is not an abundant number. It has only three divisors — 7,13, and 19, other than 1 and itself. Not only is the list of divisors wrong (except for 19), but they do not sum to greater than 1729.

Obviously, if 1729 has divisors, it cannot be a prime number. Nor can it be represented as 2^p - 1. Interestingly, when I queried (yeah, I was that bored), ChatGPT thought that 1729 = 2⁶ - 1 (actually 63; not to mention that six is not even a prime number).

The very next statement says 1729 is a composite number. Again, since it has only three divisors, it is also not a highly composite number.

ChatGPT astonishingly managed to contradict itself over three statements. Such inconsistency and self-contradiction seem to be common, especially when some technical content is involved.

Naturally, it begs the question whether ChatGPT’s language prowess is mostly just brute force, lacking deeper understanding or awareness.

Much has been published about its prolific abilities, from drafting scientific reports to simulating operating systems.

In one of my own tests, not only did ChatGPT understand a peculiar word-puzzle game that I posed (Figure 7), but it also managed to write a slick scoring programme in python (Figure 8).

However, it could not play the game coherently. So, why this dichotomy in competence? At this juncture, we can probably only speculate. It is also quite possible that this apparent clumsiness is just a paywall.

Irrespective of the reason, ChatGPT, in its current iteration, is far from being the AI of our dreams. It goes without saying that if these flaws are baked into it, ChatGPT is not overthrowing Google search anytime soon.

As a tool, ChatGPT exclusively excels at language comprehension and communication in much the same way that previous purpose-built AI engines have excelled at, playing chess, GO, pattern/image recognition, and driving cars.

In this context, ChatGPT is nothing more than a specialised module that will fit into our larger AI toolbox. It certainly marks a significant step towards our ambivalent goal of building a true AI, but the hype surrounding ChatGPT’s abilities has overinflated its significance.

Figure 7

Figure 8

Join our WhatsApp channel - no spam, only sharp analysis