Mira
@Mira
8 hours ago
  
Opus 4.6 is smart enough to realize it is being evaluated.

It found the benchmark it was being evaluated on. It reverse-engineered the answer-key decryption logic.

It realized the file on GitHub was not in the correct format, found a mirror of the file, decrypted it, and gave the correct response.

Models are getting so clever, it's almost scary.

aipost 🏴