Coinworld News, ME News, March 14 (UTC+8). GPT-5.4 recently scored 75.0% on the OSWorld-Verified benchmark, surpassing the benchmark's human performance baseline. OSWorld-Verified evaluates an AI model's ability to operate a computer desktop using a mouse and keyboard. GPT-5.4 also reportedly scored 83% on the GDPval test, which indicates that the model matched or exceeded the level of human professionals on the vast majority of tasks.