Is GPT-4 Really an Improved Version? | Technical Chamber

According to OpenAI, the new model performs better in a series of tests designed to measure the intelligence and knowledge of humans and machines. It also makes fewer blunders and can respond to both images and text.

However, GPT-4 suffers from the same problems that have plagued ChatGPT and led some AI experts to be skeptical of its usefulness, including a tendency to fabricate incorrect information, display problematic social biases, and misbehave or adopt disturbing personas when given an "adversarial" prompt.

"Although they've come a long way, it's clearly not to be trusted," says Oren Etzioni, professor emeritus at the University of Washington and founding CEO of the Allen Institute for AI. "It will be a long time before you want any GPT to run your nuclear plant."

OpenAI provided several demos and benchmark data to illustrate the capabilities of GPT-4. The new model not only passes the Uniform Bar Examination, which is used to license lawyers in many US states, but also scores in the top 10% of human test takers.

It also scores better than GPT-3 on other exams designed to measure knowledge and reasoning in subjects such as biology, art history, and calculus. And it achieves better results than any other AI language model on the benchmarks used to track the progress of this type of algorithm. "In a way, it's more of the same," Etzioni says. "But it's more of the same in an absolutely mind-blowing series of advances."

GPT-4 is also capable of tricks already seen in GPT-3 and ChatGPT, such as summarizing text fragments and suggesting edits to them. It can also do things that its predecessors couldn't, like serving as a Socratic tutor that guides students to the correct answers, or commenting on the content of photos. For example, if you provide a photo of ingredients on a kitchen counter, GPT-4 can suggest a suitable recipe. If presented with a graph, it can explain the conclusions that can be drawn from it.

"It definitely seems to have acquired some capabilities," says Vincent Conitzer, a Carnegie Mellon University (CMU) professor specializing in artificial intelligence who has begun experimenting with the new language model. But he notes that it still makes mistakes, such as suggesting nonsensical directions or proposing bogus mathematical proofs.
