Why my company is stuck rn

Jason Voorhees

๐•ธ๐–Š๐–—๐–ˆ๐–Š๐–“๐–†๐–—๐–ž ๐•ฎ๐–”๐–—๐–• โ€ข ๐Ÿ๐ŸŽ๐Ÿ๐Ÿ’๐Ÿฅ‡
Joined: May 15, 2020 • Posts: 85,587 • Reputation: 254,843
I'll keep it as simple as possible so anyone can understand. You see, all computers store things as 0s and 1s. Everything is long sequences of 0s and 1s, but the problem is that all systems since the dawn of computers have had rounding errors. 0.1 + 0.2 =/= 0.3, because decimal numbers cannot reliably be represented in 0s and 1s. You get something like 0.30000000000000004 or something equally cursed.

It's a fundamental consequence of how binary floating point (IEEE 754) works. For 50+ years we have mostly ignored this because who tf cares, but now with AI in the picture you can't simply ignore it. Millions of matrix multiplications per second, millisecond inferences, perfect consistency across training runs: even the tiniest errors get magnified into something catastrophic across 175 billion parameters.

Generally this isn't a huge problem. Neural networks don't need mathematical perfection, in fact gradient descent actually loves a bit of noise for generalization. But the problem here is that we are dealing with algorithms. There are ofc workarounds like quantization, and tensor cores in GPUs that accumulate in FP32, but none of them cater to our needs, because we require deterministic, bit-exact reproducibility.
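To make the 0.1 + 0.2 thing concrete, here's what plain Python (which uses IEEE 754 doubles) actually gives you:

```python
# 0.1 and 0.2 have no exact binary representation, so the sum picks up
# a rounding error in the last bit.
total = 0.1 + 0.2
print(total)          # 0.30000000000000004
print(total == 0.3)   # False

# decimal.Decimal reveals the binary value 0.1 is actually stored as:
from decimal import Decimal
print(Decimal(0.1))   # 0.1000000000000000055511151231257827021181583404541015625
```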
 
Reactions: User28823, topology, vernier and 25 others
@Glorious King @topology @SharpOrange @LXR @User28823
 
Reactions: topology, LXR, Glorious King and 6 others
Always some bullshit with this old nigga
 
Reactions: iflookscouldk1ll, browncurrycel, Aox Ofwar and 6 others
@mcmentalonthemic @jeoyw9192 @Jatt @childishkillah
 
Reactions: mcmentalonthemic, childishkillah, unstable and 1 other person
 
Reactions: browncurrycel, Aox Ofwar, Chadeep and 2 others
sounds complicated

dropped comp science
 
Reactions: SharpOrange, Chadeep, tansel and 1 other person
@imontheloose @dhusc @unstable @Jager
 
Reactions: Jager and unstable
I hate working with computers ๐Ÿฅ€
 
Reactions: Sprinkles, SharpOrange and Jason Voorhees
what do you think about non transistor processors for ai?
i think they are all hype.
 
Reactions: SharpOrange and Jason Voorhees
Will read later im playing video games
 
Reactions: Aox Ofwar and Jason Voorhees
what do you think about non transistor processors for ai?
i think they are all hype.
That's the hardware side of things that I'm not sure about tbh
 
Reactions: SharpOrange and unstable
@Incelforeever
 
which languages would you advise someone to learn for the best job opportunities?
 
Reactions: Jason Voorhees and SharpOrange
Reactions: unstable and Jason Voorhees
Reactions: unstable and Jason Voorhees
Reactions: AverageCurryEnjoyer and unstable
dnr
 
Reactions: Jason Voorhees
Hire me bhai
 
Reactions: Jason Voorhees
idk anything about computers so ignore any following idiocy
this is floating point precision errors correct?
this is just unfixable isnt it?
is quantum computing a fad? if not would it fix this? same with ternary computers, although unrealistic to implement
 
Reactions: Jason Voorhees
so, since your company works with AI, this issue is what's got it stuck rn?

Wish you luck and hope that you are a crucial member of your company so you don't get fired
 
Reactions: Jason Voorhees
Always some bullshit with this old nigga
If you have nothing meaningful to say (including questions) stfu.
Thing is, for instance, are you considering decentralized AI networks as well? Practically ud have multiple nodes that might require verification of a certain computation. With what you mention, if you don't have bit-exact outcomes then the nodes could end up disagreeing on the valid model state despite having the same inputs/outputs.

Another point is the issue of debugging at a much larger scale. When a model training run costs a shit ton of money, like in the millions, the bugs here are annoying asf to deal with. I'm talking finding the precise ms where a gradient exploded.
 
Reactions: Jason Voorhees
I didn't know this, what kind of AI are you working with? :feelswat:
 
Reactions: Jason Voorhees
idk anything about computers so ignore any following idiocy
this is floating point precision errors correct?
this is just unfixable isnt it?
is quantum computing a fad? if not would it fix this? same with ternary computers, although unrealistic to implement
Quantum computing is not a fad. It's revolutionary and progressing steadily for certain hard problems, but it won't fix this for today's AI training. Quantum bits have their own massive noise/error issues. It's a separate rabbit hole entirely and still mostly lab experiments that are insanely hard to scale reliably
 
so, since your company works with AI, this issue is what's got it stuck rn?

Wish you luck and hope that you are a crucial member of your company so you don't get fired
No one is getting fired brah. This issue isn't something specific to our company, it's just an endgame problem for AI researchers. Not something actively harming us, but it could be a huge bottleneck later.
 
Reactions: Sayori
If you have nothing meaningful to say (including questions) stfu.

Thing is, for instance, are you considering decentralized AI networks as well? Practically ud have multiple nodes that might require verification of a certain computation. With what you mention, if you don't have bit-exact outcomes then the nodes could end up disagreeing on the valid model state despite having the same inputs/outputs.

Another point is the issue of debugging at a much larger scale. When a model training run costs a shit ton of money, like in the millions, the bugs here are annoying asf to deal with. I'm talking finding the precise ms where a gradient exploded.
This is actually called the determinism gap in AI.

In decentralized AI, if two nodes run the exact same input but use different GPU architectures, or sometimes even different driver versions, their floating point rounding will diverge. This is why bit-exact reproducibility is the holy grail for researchers rn: without it you can't easily verify whether a node is cheating or just experiencing standard drift, and you can't reliably trace when or why something like a gradient explosion happened. It turns all these million dollar AI tuning runs into a game of blind trial and error.
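The divergence described above comes straight from the fact that floating point addition isn't associative, so any change in reduction order (different GPU, different kernel schedule) can flip the low bits. A minimal stdlib-only sketch of two "nodes" disagreeing on a state hash — the hashing scheme here is made up for illustration, not a real protocol:

```python
import hashlib
import struct

def state_digest(value: float) -> str:
    """Hash the exact 64-bit pattern of a float -- a stand-in for
    hashing a model checkpoint. (Illustrative only.)"""
    return hashlib.sha256(struct.pack('<d', value)).hexdigest()[:12]

# Same three numbers, two reduction orders: the results differ in the
# last bit because (a + b) + c != a + (b + c) for floats.
node_a = (0.1 + 0.2) + 0.3   # 0.6000000000000001
node_b = 0.1 + (0.2 + 0.3)   # 0.6

print(node_a == node_b)                            # False
print(state_digest(node_a), state_digest(node_b))  # two different digests
```

The two sums agree to ~15 decimal digits, yet any bit-exact verification scheme flags them as different states.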
 
Reactions: jeoyw9192
Quantum computing is not a fad. It's revolutionary and progressing steadily for certain hard problems, but it won't fix this for today's AI training. Quantum bits have their own massive noise/error issues. It's a separate rabbit hole entirely and still mostly lab experiments that are insanely hard to scale reliably
what about ternary computers? would floating point precision errors still be an issue? hell, what about analogue computers
 
Reactions: Jason Voorhees
This is actually called the determinism gap in AI.

In decentralized AI, if two nodes run the exact same input but use different GPU architectures, or sometimes even different driver versions, their floating point rounding will diverge. This is why bit-exact reproducibility is the holy grail for researchers rn: without it you can't easily verify whether a node is cheating or just experiencing standard drift, and you can't reliably trace when or why something like a gradient explosion happened. It turns all these million dollar AI tuning runs into a game of blind trial and error.

Very interesting I'll take a look; but yeah ur def right Abt it being a game of blind trial and error ๐Ÿ˜‚ can't even fathom dealing with that shit
I know :owo: why u bulleh meh
Wasn't trying to it's just annoying when comments like urs are made on such threads, u can always save time and choose not to send a message :lul:
 
Reactions: Jason Voorhees and NinjaRG9
healthcare or law? Probably already read through this, but I remember Thinking Machines talked about deterministic inference on a single server
 
Reactions: Jason Voorhees
healthcare or law? Probably already read through this but I remember thinking machines talked about deterministic inference on a single server
I've read this before. Thinking Machines' approach shows that while a single GPU operation can be made deterministic, scaling that to production is a nightmare.
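For anyone curious why "deterministic on one server" doesn't just generalize: the bits depend on how a reduction is chunked across parallel workers. A toy stdlib-only sketch (nothing from the actual Thinking Machines code) where the same four numbers sum to different answers under two chunkings:

```python
def chunked_sum(xs, chunk_size):
    """Sum via per-chunk partials, the way a parallel kernel splits a
    reduction. The answer depends on the chunking, not just the data."""
    partials = [sum(xs[i:i + chunk_size]) for i in range(0, len(xs), chunk_size)]
    return sum(partials)

# 1.0 is below the rounding granularity at 1e16 (the ulp there is 2.0),
# so whether the small terms meet each other before the big ones decides
# whether they survive at all.
xs = [1.0, 1e16, -1e16, 1.0]
print(chunked_sum(xs, 4))   # 1.0 -- one chunk, plain left-to-right sum
print(chunked_sum(xs, 2))   # 0.0 -- each 1.0 is absorbed by a 1e16 term
```

Pinning one reduction order per op is exactly what makes a kernel deterministic, and exactly what's hard to keep fixed across batch sizes, GPUs, and schedulers.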
 
@Gomez
 
Reactions: Gomez
the real question is

how do you find time to do this and research more on topics like this?

istg i cant find time to read my own code base brah :feelswah:

@Swarthy Knight
 
Reactions: Swarthy Knight and Jason Voorhees
the real question is

how do you find time to do this and research more on topics like this?

istg i cant find time to read my own code base brah :feelswah:

@Swarthy Knight
Subscribe to YouTube channels and tech newsletters and learn. I literally read about these things at breakfast or listen to podcasts while running
 
Reactions: jeoyw9192, Swarthy Knight and Glorious King
Subscribe to YouTube channels and tech newsletters and learn. I literally read about these things at breakfast or listen to podcasts while running
drop their names in dms pls

my feed is filled with japan slop and jdm shi
 
Reactions: Swarthy Knight and Jason Voorhees
drop their names in dms pls

my feed is filled with japan slop and jdm shi
Will dm you when I'm done shitposting, I'm too lazy to go click share a dozen times now.
 
Reactions: Swarthy Knight and Glorious King
I've read this before. Thinking Machines' approach shows that while a single GPU operation can be made deterministic, scaling that to production is a nightmare.
you sound AI but yeah
 
Reactions: Jason Voorhees
Reactions: Swarthy Knight
Time to prove your college tag, invent a new paradigm
 
Reactions: Jason Voorhees
Do you work for a massive company? Didn't know your company specialized in AI.
 
Reactions: Jason Voorhees
Subscribe to YouTube channels and tech newsletters and learn. I literally read about these things at breakfast or listen to podcasts while running
DM me too should be helpful
 
Reactions: Jason Voorhees
