A new artificial intelligence program can keep up with human beings on a number of professional and academic tests, according to its creator, which said the model scored among the top 10% of test-takers on a simulated bar exam.
Technology firm OpenAI unveiled its latest language model on Tuesday, dubbed GPT-4, saying the program is “more reliable, creative, and able to handle much more nuanced instructions” than its predecessor, GPT-3.5.
“GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that, while less capable than humans in many real-world scenarios, exhibits human-level performance on various professional and academic benchmarks,” it said. “For example, it passes a simulated bar exam with a score around the top 10% of test takers; in contrast, GPT-3.5’s score was around the bottom 10%.”
The AI model also performed at the 93rd percentile on an SAT reading exam and at the 89th percentile on an SAT math test, the company added.
The GPT software has been embedded in a number of other apps, including the language-learning program Duolingo, which is aiming to create conversational bots, and automated tutors built for the online education company Khan Academy.
The model’s previous iteration, GPT-3.5, gained popularity in the form of the ChatGPT chatbot program, which is capable of holding complex, human-like conversations with users.
According to OpenAI, GPT-4 is its “most advanced system yet,” and, unlike GPT-3.5, is able to process image prompts in addition to text. However, despite its improved capabilities, the company warned that the new model is “not fully reliable” and still suffers from some glitches, including what it calls “hallucinations,” in which the AI simply fabricates information or generates erroneous answers.
OpenAI partnered with Microsoft earlier this year, receiving a multibillion-dollar investment from the tech giant to further develop its AI models. The new GPT-4 program will play a major role in Microsoft’s Bing chatbot, which was unveiled on a limited basis earlier this year, and the company is expected to announce integration into other consumer products sometime in the coming weeks, according to the Financial Times.