A quick second look at the data from that “length of tasks AI can do is doubling” paper

By jabbyai

I pulled the dataset from the paper and broke out task time by whether a model actually succeeded at completing the task or not, and here’s what’s happening:

The length of tasks that models actually complete increases slightly over the last year or so, while the length of tasks they fail to complete rises.
The apparent reason for this is that models are generally completing more tasks over time, but not the longest ones.
The exponential trend you’re seeing is probably a result of fitting a logistic regression for each model: the shape of each curve is sensitive to the trends noted above, which affects the task times they back-calculate from the estimated 50% success rate.
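
To make the last point concrete, here is a minimal sketch of that kind of back-calculation. It is not the paper's code: the log2(minutes) predictor, the plain Newton's-method fit, the function names, and the two toy "models" are all assumptions made for illustration. It fits P(success) = sigmoid(a + b*log2(minutes)) per model and solves for the task length where the fitted success rate is 50%.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(xs, ys, iters=50, tol=1e-10):
    """Fit P(success) = sigmoid(a + b*x) by Newton's method (illustrative only)."""
    a = b = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(a + b * x)
            w = p * (1.0 - p)
            g0 += y - p            # gradient w.r.t. intercept
            g1 += (y - p) * x      # gradient w.r.t. slope
            h00 += w               # entries of X^T W X
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det < 1e-12:            # (near-)separable data: stop instead of dividing by ~0
            break
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            break
    return a, b

def horizon_minutes(a, b):
    # Solve a + b*log2(t) = 0 for t; assumes b != 0 (success rate varies with length).
    return 2.0 ** (-a / b)

def horizon_from_counts(counts):
    # counts: {task_minutes: (successes, attempts)} -- hypothetical input format.
    xs, ys = [], []
    for minutes, (successes, attempts) in counts.items():
        xs += [math.log2(minutes)] * attempts
        ys += [1] * successes + [0] * (attempts - successes)
    return horizon_minutes(*fit_logistic(xs, ys))

# Made-up data: the "newer" model completes more short tasks,
# but neither model completes any of the 64-minute tasks.
older = {1: (6, 8), 4: (3, 8), 16: (1, 8), 64: (0, 8)}
newer = {1: (8, 8), 4: (6, 8), 16: (2, 8), 64: (0, 8)}

print("older 50% horizon (min):", round(horizon_from_counts(older), 2))
print("newer 50% horizon (min):", round(horizon_from_counts(newer), 2))
```

In this toy data the fitted 50% horizon comes out longer for the newer model even though neither one completes the longest tasks, so the estimate moves with the shape of the whole curve and not only with the longest task actually finished.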

Thought this was worth sharing. I’ve dug into this quite a bit more, but don’t have time to write it all out tonight. Happy to answer questions if anybody has them.

Edit: the forecasts here are just a first pass with ARIMA. I’m working on a more thorough explanatory model with other variables from the dataset (compute costs, task type, and the like), but that’ll take time to finish.

https://preview.redd.it/0f2fornljwwe1.png?width=1188&format=png&auto=webp&s=e17d4688365957418036407a3a53d1601d508510

submitted by /u/Murky-Motor9856