

A quick second look at the data from that "length of tasks AI can do is doubling" paper

I pulled the dataset from the paper and broke out task time by whether a model actually succeeded at completing the task or not, and here’s what’s happening:

  • The length of task models actually complete increases slightly in the last year or so, while the length of task models fail to complete.
  • The apparent reason for this is that models are generally completing more tasks over time, but not the longest ones.
  • The exponential trend you’re seeing is probably a result of fitting a separate logistic regression for each model. The shape of each fitted curve is sensitive to the trends noted above, which in turn affects the task times they back-calculate from the estimated 50% success rate.
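
To make the last point concrete, here is a rough sketch of how a 50% task time gets back-calculated from a per-model logistic fit, next to the plain median task length split by outcome. Everything in it is an assumption for illustration: the `runs` data is made up (it is not from the paper's dataset), the fit is a bare-bones Newton's method on log2 of task minutes, and names like `fit_logistic` and `horizon_minutes` are mine, not the paper's.

```python
import math
import statistics

# Made-up (task_minutes, succeeded) pairs for one hypothetical model.
runs = [
    (1, 1), (1, 1), (2, 1), (2, 1), (4, 1), (4, 0), (8, 1), (8, 1),
    (15, 1), (15, 0), (30, 0), (30, 1), (60, 0), (60, 0),
    (120, 1), (120, 0), (240, 0), (240, 0), (480, 0), (480, 0),
]

# Break out task time by outcome.
median_success = statistics.median(m for m, ok in runs if ok == 1)
median_fail = statistics.median(m for m, ok in runs if ok == 0)


def fit_logistic(xs, ys, iters=50):
    """Fit P(success) = sigmoid(b0 + b1 * x) with Newton's method."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p            # gradient of the log-likelihood
            g1 += (y - p) * x
            h00 += w               # negative Hessian entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det < 1e-12:
            break
        # Newton step: solve the 2x2 system directly.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


xs = [math.log2(m) for m, _ in runs]
ys = [ok for _, ok in runs]
b0, b1 = fit_logistic(xs, ys)

# The fitted curve crosses 50% where b0 + b1 * x = 0, i.e. x = -b0 / b1.
horizon_minutes = 2 ** (-b0 / b1)

print("median minutes, completed tasks:", median_success)
print("median minutes, failed tasks:", median_fail)
print("back-calculated 50% task time (minutes):", round(horizon_minutes, 1))
```

The 50% task time here is a property of the fitted curve, not of the tasks a model actually finished, so it can move whenever the mix of successes and failures at the long end changes the curve's shape.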

Thought this was worth sharing. I’ve dug into this quite a bit more, but don’t have time to write it all out tonight. Happy to answer questions if anybody has them.

Edit: the forecasts here are just a first pass with ARIMA. I’m working on a more thorough explanatory model with other variables from the dataset (compute costs, task type, and the like), but that’ll take time to finish.

https://preview.redd.it/0f2fornljwwe1.png?width=1188&format=png&auto=webp&s=e17d4688365957418036407a3a53d1601d508510

submitted by /u/Murky-Motor9856