The last few weeks were a bit crazy with all the new generation of models, and this makes it a bit easier to compare them. I was particularly surprised at how badly R1 performed, and a
I've heard a lot of people, both in and outside of this sub, say that LLMs can't reason outside their training data, which is completely untrue. Here's my proof for why I believe that: MIT study shows language models defy
This paper introduces a test-time optimization method called R2-T2 that improves routing in mixture-of-experts (MoE) models without requiring retraining. The core idea is using gradient descent during inference to optimize how inputs get routed to different experts, particularly for multimodal