Researchers used top generative AI models to grade hundreds of undergraduate essays and found that the AI matched the human-awarded degree classification only around half the time, with AI often failing ...
Top AI systems show a bias towards rewarding overly complex prose styles and match human examiners on grade bands only around ...