Loading
![]() |
I pulled the dataset from the paper and looked at broke out task time by if a model actually succeeded at completing or not, and here’s what’s happening:
Thought this was worth sharing. I’ve dug into this quite a bit more, but don’t have time write it all out tonight. Happy to answer questions if anybody has them. Edit: the forecasts here are just a first pass with ARIMA. I’m working on a more throughout explanatory model with other variables from the dataset (compute costs, task type, and the like) but that’ll take time to finish. submitted by /u/Murky-Motor9856 |