Over for Google and OpenAI


Deleted member 131140

𝕯𝖝𝕯 π–ˆπ–—π–Šπ–œ . Alonso
Joined
Mar 1, 2025
Posts
19,563
Reputation
48,751

Apparently, the researchers say that reasoning models drop to 0 accuracy as the logical reasoning tests go up in complexity.

The questions in the established benchmarks might already have answers baked into the models' training sets, so those benchmarks were unreliable for assessing a model's accuracy.

@Jason Voorhees career extended by 20 years 💀💀

Link to paper
 
@Snicket
 
@Chimera
 
chatgpt couldn't understand my 1 sentence prompt today, can confirm
 
chatgpt couldn't understand my 1 sentence prompt today, can confirm
It's for purely logical reasoning tasks though.

Specifically, highly complex tasks.
 
exactly, AI is extremely gay and over-hyped to the point where it cringes you
 
Can't be bothered with this tech stuff.

Just going outside and doing anything mogs.

 
Lol, Apple has been left behind in the AI department, that's why they desperately try to downplay the importance of AI...
 
Lol, Apple has been left behind in the AI department, that's why they desperately try to downplay the importance of AI...

The problems in the paper were not very tough though. The river problem was simple enough for a human to solve.
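
For reference, the textbook version of a river crossing puzzle is a small deterministic search that a plain program handles easily. A minimal sketch in Python, assuming the classic wolf, goat and cabbage variant (the thread does not say which variant the paper uses, and the names `is_safe` and `solve` are made up for illustration):

```python
from collections import deque

# Assumed puzzle: classic wolf / goat / cabbage, farmer rows one item at a time.
ITEMS = frozenset({"wolf", "goat", "cabbage"})

def is_safe(bank):
    """A bank left without the farmer is safe if nothing there gets eaten."""
    return not ({"wolf", "goat"} <= bank or {"goat", "cabbage"} <= bank)

def solve():
    """Breadth-first search; returns the shortest list of cargo per crossing."""
    start = (ITEMS, True)          # (items on left bank, farmer on left?)
    goal = (frozenset(), False)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (left, farmer_left), path = queue.popleft()
        if (left, farmer_left) == goal:
            return path
        here = left if farmer_left else ITEMS - left
        for cargo in [None, *sorted(here)]:
            moved = frozenset() if cargo is None else frozenset({cargo})
            new_left = left - moved if farmer_left else left | moved
            # The farmer ends up on the other side, so check the bank he left.
            unattended = new_left if farmer_left else ITEMS - new_left
            state = (new_left, not farmer_left)
            if is_safe(unattended) and state not in seen:
                seen.add(state)
                queue.append((state, path + [cargo or "nothing"]))
    return None

if __name__ == "__main__":
    print(solve())
```

Because the search is breadth-first, the first solution it returns is a shortest one, which for this variant takes 7 crossings.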
 
I haven't been following the story very closely.
What's the bottleneck of AI in personal phone usage?
Surely some kind of GPT based model could do an adequate job? But obviously it's more complex than that.
 
this is why grok mogs
 
Obviously lol. For anyone who thought AI was going to be anything more than a smarter Google search, it's over @lifeless
 
I haven't been following the story very closely.
What's the bottleneck of AI in personal phone usage?
Surely some kind of GPT based model could do an adequate job? But obviously it's more complex than that.
It's not about personal usage, to be specific.

The reasoning models just fail when given deterministic algorithmic problems.

It should be doable for them since it's within the context window, but they just "give up" before attempting as more variables are stacked on.
 
Because AI is stupid, it can't think.
 
Chatgpt is a genuine retard tbh
 
Chatgpt is a genuine retard tbh
Still has a good knowledge base though. 2030 will either be a complete bust that starts an AI winter or a complete boom.
 
Tell that to the soldiers who get their faces blasted off by machine learning drones
 
Tell that to the soldiers who get their faces blasted off by machine learning drones
You don't need high-level reasoning for that.

A basic coordinate system and tracer technology can do that.
 
Still has a good knowledge base though. 2030 will either be a complete bust that starts an AI winter or a complete boom.
Yea, it has a good bit of knowledge to draw upon.

Good data analytics. But asking for its own take on philosophical ponderings is just shooting yourself in the foot
 
I haven't been following the story very closely.
What's the bottleneck of AI in personal phone usage?
Surely some kind of GPT based model could do an adequate job? But obviously it's more complex than that.
They can't run the top LLMs locally. You would need a very powerful computer and lots of RAM for that, so mostly they are run in the cloud on special servers.
 
They can't run the top LLMs locally. You would need a very powerful computer and lots of RAM for that, so mostly they are run in the cloud on special servers.
Good point. Hadn't considered this.
Why can't phones have cloud-based AI instead of on-device?
 
It's not about personal usage, to be specific.

The reasoning models just fail when given deterministic algorithmic problems.

It should be doable for them since it's within the context window, but they just "give up" before attempting as more variables are stacked on.
Interesting.

Is Apple limited by trying to develop on-device AI versus using something server-based like ChatGPT or Gemini instead?
 
Good point. Hadn't considered this.
Why can't phones have cloud-based AI instead of on-device?
It still takes a few seconds to get a response from ChatGPT. Cloud performance can fluctuate a lot depending on how many users are online and using the service, the Internet speed, etc. A phone would simply lag too much if you had to wait seconds for a response to your various requests.
 
It still takes a few seconds to get a response from ChatGPT. Cloud performance can fluctuate a lot depending on how many users are online and using the service, the Internet speed, etc. A phone would simply lag too much if you had to wait seconds for a response to your various requests.
Yeah, makes sense. And with over 1 billion iPhone users worldwide, the cloud infrastructure costs would be astronomical.
 
