The insiders at OpenAI (everyone), Microsoft (the CTO, etc.), and Anthropic (the CEO) have all been saying that they see no immediate end to the scaling laws and that models are still improving rapidly.
Wow, you did this very well. I actually almost downvoted you before I read the whole comment.
You have channeled their energy to near perfection 👌🏽
You forgot to compare AI to cryptocurrency.
No one is saying to immediately discard anything that people who work at AI companies say, only to have a healthy dose of skepticism and not immediately believe everything unconditionally (which many people in this sub are guilty of).
The optimists should be charitable when interpreting the sceptics' comments, but the sceptics should also be charitable and not simply assume that people in this sub "immediately believe everything unconditionally", no?
A healthy dose of skepticism is good, but I've definitely seen highly upvoted posts where the dose is well beyond healthy.
That’s why you should look at what researchers say, not CEOs or hype men. 2,278 AI researchers were surveyed in 2023 and estimated a 50% chance of AI being superior to humans in ALL possible tasks by 2047 and a 75% chance by 2085. This includes all physical tasks. In 2022, the year they gave for that was 2060, and many of their predictions have already come true ahead of schedule, like AI being capable of answering queries using the web, transcribing speech, translating, and reading text aloud, which they thought would only happen after 2025. So it seems like they tend to underestimate progress.
So basically 50/50 shot at ASI with embodiment by 2047 according to them. Flawed AGI without embodiment must be even closer.
I don't know how useful such surveys can be, since AI research is sensitive to uncommon knowledge and uncommon sense.
There could be researchers in that pool who know things that most of the others do not which gives them a sooner estimate or later estimate, and they would be drowned out. And a lot of these answers are likely to not be arrived at independently, but influenced by other researchers and the opinions of industry insiders.
Still more reliable than random people on Metaculus. The pessimists and the optimists would theoretically get cancelled out though. Look up the central limit theorem
This. Raising the possibility they might have motives beyond simply the betterment of all mankind doesn’t make you, yourself, opposed to the betterment of all mankind. It just shows you’ve been paying attention over the last several decades and have read your history.
Technology has brought us wonderful things throughout the years. Technology has also brought us manmade horrors beyond our comprehension, not to mention a generous helping of barely functional junk. It’s not stupid to be wary. It’s not stupid to apply skepticism and scrutiny to the words of public figures who have a vested interest in the financial success of their products, and in the continued flow of venture capital to their field. This doesn’t mean everything they say is wrong, it just means everything they say will be informed by their biases.
Silicon Valley is great at making advertisements look like manifestos, product launches sound like revolutions, and CEOs seem like scrappy visionaries. You don’t come off as wise when you mock people for not instantly leaping on a one-way train to Glorious Tomorrowland. If that’s a one-way train, then you bet I’m asking the conductor some hard questions. “So, about this Glorious Tomorrowland. Tell me more about it. What’s the accommodation like? Oh, it’s cutting edge, you say? Hmm. Perhaps if you could elaborate on that…”
And no, I’m not afraid of getting left behind. The big yellow ‘offer ends soon’ banner with the ticking clock won’t work this time. There’s a sale on every other month, and by then they’ll be singing out ‘all aboard!’ for Glorious Tomorrowland 2.0, which will of course be twice, nay, ten times as cutting edge as Glorious Tomorrowland 1.0.
My question is: has anyone that's actively working in the AI industry said anything to indicate otherwise?
You would think if there was any evidence of such, surely somebody would call out these "bluffs."
But I've yet to see it.
LeCun. Though he isn't exactly saying that scaling is dying, just that it won't be sufficient.
But he is not actually that involved anymore; at least, I vaguely remember him mentioning that when they released the last Llama models.
He is more of a consultant
No he just said he's working on next generation architectures not on LLMs. He wasn't involved in their last project but is trying to build the next big thing.
I work at one of the companies building this tech.
The issues I see are:
Lack of power (the most important issue).
Environmental issues (this might be an issue to worry about).
Lack of chips, aka datacenters (a cash issue).
Lack of good data (a solution is in the works).
Will it be beneficial? Yes and no
Everyone is betting on the emergent properties of neural networks.
For example:
A single molecule of H2O doesn't have wetness or surface tension, but a group of H2O molecules gains those properties. In AI's case, though, we don't know how a model gains emergent properties or which emergent properties it will gain.
But I guess there might be folks who understand this black box. I just experiment and integrate as fast as possible.
The guy straight up says that as long as there is any improvement at all, it's considered a success. Which is a massive leap from the 'things are improving exponentially' crowd. So yeah, skepticism is prudent.
The "things are improving exponentially" crowd doesn't understand the meaning of the word exponentially. My belief is that exponential increases in computing power are required for linear increases in capability - that doesn't mean scaling is pointless though, it just means scaling is really hard.
Yes, new improvements get exponentially harder to reach. But people are conflating that with "compute use grows exponentially every year"
Exponential growth in computing is what makes it likely to happen, but the progress itself will not be exponential.
Where did he say that?
Curious why you believe otherwise. Plenty of historical examples showing you shouldn't believe what people claim about their own industries if you can't independently verify it. Theranos. Enron. Etc.
This is true, but the main difference is that Theranos never actually had a product that anyone was able to use themselves. In the case of Anthropic and others, people can actually see and use the product and see the functionality. They can see with their own eyes that it surpasses the capabilities that were available a year ago, verifying claims of continual progress that were made at that time. This is especially true in the case of image and video generation.
What a terrible comparison. It's not just 1 company saying this stuff. It's every entity in the industry
To be fair to the other side, we have seen industry-wide collusion before. Look at the 2008 recession, caused by the financial sector's systemic ignorance of the risks of off-exchange derivatives. Or the soybean scandal of the 60s, where (again) the financial sector ignored the risks of inflating the valuation of a volatile market like soybean sales, which collapsed once Russian exports were halted and one of the largest soybean oil companies was called out for fraud. Or Long-Term Capital Management, a highly leveraged hedge fund worth billions and tied to most of (once again) the financial sector, which did not have the risk management in place to withstand a downturn in the market, causing its leverage to go kaput in only a few months and losing billions.
Not saying that that kind of dishonesty is happening here, but when it comes to for-profit companies, it's always best to take what they say with a grain of salt.
1) How much of that was "loose" collusion?
2) The incentives for fraud are way different in finance than in technology (or really any other industry, which is probably why your list contained nothing but finance).
Theranos and Enron were scams from the beginning. AI is no scam. And yes, it's improving; otherwise, why would they bother paying people to train the models?
I don’t think anyone says, or even implies, that simply because they sell products, everything they say is false. That’s hyperbole so extreme that it ceases to be relevant at all.
People say that they are skeptical of what these insiders say because they might have incentive to lie.
That’s nothing like the claim you’ve just made up.
I love how there's this comment saying that people aren't saying this, and then you just scroll down to people saying it, lmao
Show me a single comment that says that everything they say is 100% false.
"Even if you believe them those companies have so many billions invested they could not afford to say otherwise."
It may not say everything is false, but it strongly argues that this would be their answer in 100% of situations, which leads to the same result: 0% of claims believed.
You are technically right though, which is the best kind.
and that incentive is to sell products
The way some people react to any and every comment definitely implies pretty much that, or something ridiculously close to it.
I don't see any comments exemplifying that. Can you find even one?
I bet I could find 1000. Are you being serious rn?
ok so it should be easy to show me 1 from this thread.
here's one: "Elon seems to be able to keep the hype up indefinitely even if he fails to deliver. Isn’t this the entire game at this point?"
And that's just with a quick look on THIS specific thread. You HAVE to be either new to r/singularity or completely disingenuous if you're gonna sit here, pretending everyone has been completely rational about the level of doubt they're putting on company leaders' or employees' every word.
here's one: "Elon seems to be able to keep the hype up indefinitely even if he fails to deliver. Isn’t this the entire game at this point?"
That...... Is not.... The same thing as saying "because someone is selling a product, 100% of what they say is a lie".... Like, at all.
It suggests that people who sell products may not always be trustworthy, but that is a very different statement.
I honestly don't know how you could confuse the two.
if you're gonna sit here, pretending everyone has been completely rational about the level of doubt they're putting on companies leaders'
On no planet did I even remotely do this. I just said that the original comment was hyperbole and that people are skeptical. You've built this strawman up in your head and it has nothing at all to do with my position.
I don't know why you even argue about the literal interpretation of that hyperbole
My original comment says:
That’s hyperbole so extreme that it ceases to be relevant at all.
So.. I don't know how to explain it more... I am saying the hyperbole is so ridiculous it's not an argument anymore. It's not a slight exaggeration or anything, it's just insanity.
are you that dense?
Why are people on this site so childish and insulting? It's genuinely 50% of interactions I have on /r/singularity where, if someone disagrees, they are apparently compelled to insult someone's intelligence. It's like they can't even write a disagreeing comment without calling someone dumb in some way. Can you? Like just fucking chill dude.
That’s not an example. That comment is referring to Elon Musk.
The commenter points out that he, specifically, has on multiple occasions come up short on what he promised to deliver, only to make further unrealistic promises and have them be fairly well received by the market. They then go on to say ‘isn’t that the entire game at this point?’.
While there are definitely other factors that go into the cycles of silicon valley venture capital, it is hard to deny that overpromising (hyping up) a product, accruing the eager investment that flows in, then quietly underdelivering later down the line, is pretty core to the playbook. It’s an observable pattern. The comment makes no claim that a representative of a company cannot state the truth because of their relationship to their business. It doesn’t even accuse Elon Musk of intentionally lying.
Elon may well have believed in 2019 that he’d be able to launch fleets of autonomous Tesla-brand robotaxis into the world in 2020. In 2013 he might have felt utterly certain the Hyperloop would revolutionise public transit. Ten years ago he said his astronauts would have landed on Mars by now, and that could have come from the heart. Maybe internally, at SpaceX, that’s what all the numbers were pointing to. 2018 was supposedly the year private citizens would be able to book trips to Mars, and soon after that Tesla users were meant to be able to summon their cars to them from anywhere connected by land, and at some point he was going to start a candy company, and autonomous driving? That’ll definitely be next year.
All these claims could have been things he fully intended to follow through on, only to run into unforeseen roadblocks. Understandable. It happens.
But his public image as a genius inventor and visionary entrepreneur with multiple grand schemes in the works at any given time has helped him build buzz around his companies and secure investments. So those promises are ‘part of the game’.
There you go. Not an example. Do feel free to draw from the overflowing pool of alternative examples. Pick any at random, there’s like a thousand of them after all.
All of this moronic essay and you missed the point that the insinuation is aimed not just at Elon but at EVERYONE. "Isn't this the entire game after all." ENTIRE GAME. So no, I don't need to pull from the 1,000 other examples, which yes, do exist. But feel free to write that same essay for all the people involved in the entire game.
Who said that? Such comments are common around Reddit subs, but that doesn't mean there is one in every single thread. Depends on the topic.
That’s why you should look at what researchers say, not CEOs or hype men. 2,278 AI researchers were surveyed in 2023 and estimated a 50% chance of AI being superior to humans in ALL possible tasks by 2047 and a 75% chance by 2085. This includes all physical tasks.
In 2022, the year they gave for that was 2060, and many of their predictions have already come true ahead of schedule, like AI being capable of answering queries using the web, transcribing speech, translating, and reading text aloud, which they thought would only happen after 2025. So it seems like they tend to underestimate progress.
So basically 50/50 shot at ASI with embodiment by 2047 according to them. Flawed AGI without embodiment must be even closer.
Yeah, I quote that ESPAI survey all the time. It's notable, though, that the variance in the predictions is so massive.
I think AGI will be here before 2040 for sure, but that doesn't really have much to do with the specifics of this post which are about scaling ... I think we will get to AGI because of breakthroughs that don't require huge scale tbh.
We can already see 8B models performing better than the original GPT-4, so that's probably true.
Yes and it makes them GREEDY and EVIL
You remain my favorite person on this sub.
Did you know OpenAI’s CTO has literally said their internal models aren’t much better than what’s released and had steadily been pushing expectations for the next model back into next year?
What other driving force keeps them honest? Honesty doesn’t pay the bills.
God forbid the talking encyclopedia Britannica takes over the world
Not to mention those companies are controlled by LITERAL CEOs.
CEOs are BAD!
The thing I still wonder about is whether we are just doing "imitation learning" or whether models actually develop new emergent skills. My current concern is that maybe the models are only capable of the intelligence they learned from their documents. So the intelligence cap would be at the intelligence ceiling of the data set. Then even when the model gets bigger, it wouldn't get smarter than the best data sets.
This paper that came out recently indicates that there's a possibility that the cap might actually be higher than the individual experts within the data set.
It demonstrates that within a specific domain (chess, in this case), a large data set from less skilled experts can be used to train a model that performs as well as a much more skilled expert.
While it's hardly conclusive, it's a neat example. It isn't generating any novel abstract reasoning, as the paper's authors point out in the discussion, but it still exceeds the data it was trained on.
It's a domain where there is a win/loss dichotomy, so it's easy to apply a value function. Not sure that's going to work for anything which doesn't have such an objective value function
Also, chess as we understand it is a game where you avoid making mistakes. I haven't read the paper, but I can tell you right now that if you had a basic player at 1000 Elo who just never made significant mistakes (i.e. there are 100 players checking their work), that is a 1500 Elo player. The game might not be suited for this type of analysis.
I wish you'd been in the thread I was discussing this in a couple days ago. That was essentially the hypothesis the paper was testing. Whether or not you could use that kind of noisy data to make an expert that outperformed the training data. Using a transformer model to emulate 'Wisdom of the crowd'. The paper is worth a read if you're a chess player and into machine learning.
But how does it know what’s a mistake and what isn’t?
By looking at game state before and after to determine if a move increased standing (or at least kept it equal) or if it decreased standing
No. That's not how the transformers in this paper worked.
The most likely winning move for a given board state (a PGN string of the game) is output as the next token. The model would have latent representations of all the board states in the training data, and unseen board states would get token probabilities based on similarity to the latent representations, allowing generalization.
They're extremely naive, and were playing 'blind' in that they were not given any other information than the PGN string during inference, and nothing but the PGN string and winner/loser in the game during training.
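A rough sketch of that next-move sampling (my own toy code under those assumptions, not the paper's implementation; `model_logits` is a hypothetical stand-in for the trained transformer):

```python
# Toy sketch of next-move prediction over a PGN prefix; model_logits is a placeholder.
import math
import random

def model_logits(pgn_prefix: str, candidate_moves: list[str]) -> list[float]:
    # Hypothetical stand-in: a real model scores each candidate continuation of the PGN string.
    return [random.gauss(0.0, 1.0) for _ in candidate_moves]

def sample_move(pgn_prefix: str, candidate_moves: list[str], temperature: float = 0.1) -> str:
    logits = model_logits(pgn_prefix, candidate_moves)
    # Low temperature sharpens the distribution, which is the "denoising" effect:
    # rare, blunder-like continuations get almost no probability mass.
    weights = [math.exp(l / temperature) for l in logits]
    total = sum(weights)
    return random.choices(candidate_moves, weights=[w / total for w in weights], k=1)[0]

print(sample_move("1. e4 e5 2. Nf3", ["Nc6", "Nf6", "d6"]))
```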
Agreed, which is why I said it isn't conclusive. I hope their further work tries something similar with a generalist transformer language model. Though I can't think of any ways to source or curate a data set for such an experiment off the top of my head.
I don't think I'd use this paper in that document where it is. The paper's authors explicitly refute that it supports that kind of idea:
Broader Impact.
The possibility of “superintelligent” AGI has recently fueled many speculative hopes and fears. It is therefore possible that our work will be cited by concerned communities as evidence of a threat, but we would highlight that the denoising effect addressed in this paper does not offer any evidence for a model being able to produce novel solutions that a human expert would be incapable of devising. In particular, we do not present evidence that low temperature sampling leads to novel abstract reasoning, but rather denoising of errors.
The reason I like this work is the mathematical theorem followed by the experiment that can be easily reproduced. The prediction, result, and the data aren't subjective, even if the methodology might have some flaws. And it isn't a single case study, but an experiment. This kind of reproducibility is good science.
It’s not novel reasoning obviously. Plenty of people can get Elos above 1500. The point is that it’s able to generalize well enough to improve over training data and recognize which moves are correct or not. That’s why the section is called “AI Is Not A Stochastic Parrot” and not “AGI Is Here”
The heading above the section in that document that holds the link to the paper is: "AI Is Not A Stochastic Parrot/AI Is Original/AI Can Reason". There was no reason to truncate off the majority of the title.
At best it shows the first of those three, but the term 'stochastic parrot' is practically meaningless anyway. Meaning is inherent in a transformer model, it's really apparent when you use a tool like LM Debugger to visualize them.
You hit the nail on the head here. Without a feedback loop abstract planning and reasoning models can’t be validated and improved upon.
But if you have enough models trained around expert abstract reasoning, you could use them to score the validity or quality of related ideas, i.e. what makes one hypothesis or business idea better or more viable than another.
You can look at image recognition models, and find that they are often unaffected by a lower quality of training data.
Also, it is a domain that is all about memorization.
I mean surely a bunch of armchair medical hobbyists can train a model better than a single doctor right?
Even among doctors there's a variety of skill levels when they practice medicine. If a model trained by a large number of doctors is reliably better than any one of those doctors, isn't that a good thing?
They already have
AI Outperforms Radiologists in Detecting Prostate Cancer on MRI: https://humanprogress.org/ai-outperforms-radiologists-in-detecting-prostate-cancer-on-mri-scans/
Med-Gemini: https://arxiv.org/abs/2404.18416
We evaluate Med-Gemini on 14 medical benchmarks, establishing new state-of-the-art (SoTA) performance on 10 of them, and surpass the GPT-4 model family on every benchmark where a direct comparison is viable, often by a wide margin. On the popular MedQA (USMLE) benchmark, our best-performing Med-Gemini model achieves SoTA performance of 91.1% accuracy, using a novel uncertainty-guided search strategy. On 7 multimodal benchmarks including NEJM Image Challenges and MMMU (health & medicine), Med-Gemini improves over GPT-4V by an average relative margin of 44.5%. We demonstrate the effectiveness of Med-Gemini's long-context capabilities through SoTA performance on a needle-in-a-haystack retrieval task from long de-identified health records and medical video question answering, surpassing prior bespoke methods using only in-context learning. Finally, Med-Gemini's performance suggests real-world utility by surpassing human experts on tasks such as medical text summarization, alongside demonstrations of promising potential for multimodal medical dialogue, medical research and education.
Double-blind study with Patient Actors and Doctors, who didn't know if they were communicating with a human, or an AI. Best performers were AI: https://m.youtube.com/watch?v=jQwwLEZ2Hz8
Human doctors + AI did worse, than AI by itself. The mere involvement of a human reduced the accuracy of the diagnosis. AI was consistently rated to have better bedside manner than human doctors.
‘I will never go back’: Ontario family doctor says new AI notetaking saved her job: https://globalnews.ca/news/10463535/ontario-family-doctor-artificial-intelligence-notes
Google's medical AI destroys GPT's benchmark and outperforms doctors
Med-Gemini's outputs are preferred to drafts from clinicians for common and time-consuming real-world tasks such as simplifying or summarising long medical notes, or drafting referral letters: https://x.com/alan_karthi/status/1785117444383588823
Medical Text Written By Artificial Intelligence Outperforms Doctors: https://www.forbes.com/sites/williamhaseltine/2023/12/15/medical-text-written-by-artificial-intelligence-outperforms-doctors/
AI can make healthcare better and safer: https://www.reddit.com/r/singularity/comments/1brojzm/ais_will_make_health_care_safer_and_better/?utm_source=share&utm_medium=mweb3x&utm_name=mweb3xcss&utm_term=1&utm_content=share_button
CheXzero significantly outperformed humans, especially on uncommon conditions. Huge implications for improving diagnosis of neglected "long tail" diseases: https://x.com/pranavrajpurkar/status/1797292562333454597
Humans near chance level (50-55% accuracy) on rarest conditions, while CheXzero maintains 64-68% accuracy.
China's first (simulated) AI hospital town debuts: https://www.globaltimes.cn/page/202405/1313235.shtml
Remarkably, AI doctors can treat 10,000 [simulated] patients in just a few days. It would take human doctors at least two years to treat that many patients. Furthermore, evolved doctor agents achieved an impressive 93.06 percent accuracy rate on the MedQA dataset (US Medical Licensing Exam questions) covering major respiratory diseases. They simulate the entire process of diagnosing and treating patients, including consultation, examination, diagnosis, treatment and follow-up.
Google's medical AI destroys GPT's benchmark and outperforms doctors: https://newatlas.com/technology/google-med-gemini-ai/
Generative AI will be designing new drugs all on its own in the near future
Researchers find that GPT-4 performs as well as or better than doctors on medical tests, especially in psychiatry. https://www.news-medical.net/news/20231002/GPT-4-beats-human-doctors-in-medical-soft-skills.aspx
ChatGPT outperforms physicians in providing high-quality, empathetic answers to patient questions: https://today.ucsd.edu/story/study-finds-chatgpt-outperforms-physicians-in-high-quality-empathetic-answers-to-patient-questions?darkschemeovr=1
AI is better than doctors at detecting breast cancer: https://www.bing.com/videos/search?q=ai+better+than+doctors+using+ai&mid=6017EF2744FCD442BA926017EF2744FCD442BA92&view=detail&FORM=VIRE&PC=EMMX04
AI just as good at diagnosing illness as humans: https://www.medicalnewstoday.com/articles/326460
AI can replace doctors: https://www.aamc.org/news/will-artificial-intelligence-replace-doctors?darkschemeovr=1
Geoffrey Hinton says AI doctors who have seen 100 million patients will be much better than human doctors and able to diagnose rare conditions more accurately: https://x.com/tsarnick/status/1797169362799091934
I think this isn't as valid as one might think because humans at chess have obvious weaknesses that machines do not have - most notably relatively poor consistency.
If you look at it from a purely elo based perspective this is irrelevant but if you look at it qualitatively it does matter.
Think about it like this:
A player rated 2400 will have pretty good chess understanding but will still occasionally blunder.
An engine rated 2400 will not have as good an understanding of chess, but it will never make blunders that are only a few moves deep.
In Elo terms this is the same result. But in practice it is different. Pretty much like how humans on the road can get tired or distracted: artificial intelligence never falls asleep, so it can get some situations stupidly wrong and still come out ahead statistically as the safer driver.
This matters a lot because we will very much want to use AI in difficult singular situations that would otherwise be solved with expert opinions.
A chess system that plays at 2400 elo is not preferable to a master at that level if you're going to look at one situation very deeply. Its strong suits won't be able to overcome its weaknesses in that situation, because consistently mediocre in that case isn't better than singularly great.
Then how did chess AI beat grandmasters?
Eventually by playing itself and learning from an insanely large ever growing data set.
So by generating synthetic data pretty much.
In this case because legal moves and win conditions are well defined that worked pretty well.
Then apply that to LLMs. You think it’s possible?
To some degree, yes. Absolutely.
But what we're getting into now is synthetic data generation and/or computational intelligence.
That's not what the paper was about. If you let any kind of machine learning architecture generate its own chess games to learn from you can no longer really claim it was restricted to a 1000 elo data set.
The idea here seems to be: what level of chess can you learn from just the games in the data set?
The first chess engines that beat GMs were not neural networks and were not trained using ML. They used a human designed evaluation function and searched across many possible future sequences of moves.
It’s trivial to design a 2400 chess engine nowadays as we can search so deep using modern hardware that even mediocre eval functions will lead to that strength.
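For illustration, here's the bare-bones shape of that recipe (a toy sketch using the python-chess library, assumed installed via `pip install chess`; the material-only evaluation and fixed depth are placeholders, nowhere near 2400 strength):

```python
# Toy "eval function + tree search" sketch of the classical engine recipe described above.
import chess

PIECE_VALUES = {chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
                chess.ROOK: 5, chess.QUEEN: 9, chess.KING: 0}

def material(board: chess.Board) -> int:
    # Material balance from the perspective of the side to move.
    score = 0
    for piece in board.piece_map().values():
        value = PIECE_VALUES[piece.piece_type]
        score += value if piece.color == board.turn else -value
    return score

def negamax(board: chess.Board, depth: int) -> int:
    # Search every legal continuation to a fixed depth, scoring leaves with the eval function.
    if depth == 0 or board.is_game_over():
        return material(board)
    best = -10_000
    for move in board.legal_moves:
        board.push(move)
        best = max(best, -negamax(board, depth - 1))
        board.pop()
    return best

def best_move(board: chess.Board, depth: int = 3) -> chess.Move:
    scored = []
    for move in board.legal_moves:
        board.push(move)
        scored.append((-negamax(board, depth - 1), move))
        board.pop()
    return max(scored, key=lambda pair: pair[0])[1]

print(best_move(chess.Board()))  # prints some legal opening move
```

A real engine adds alpha-beta pruning, positional evaluation, and much deeper search, but the structure is the same: hand-written eval plus brute-force lookahead, no learning involved.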
If you look through the methodology of their experiment, this transformer is more than capable of blunders. It is not even restricted to valid moves, forfeiting if it takes more than 5 tries to make a valid move. They specifically need to tweak the sampling to improve consistency in order to elicit the 'transcendence' they were looking for, but it is still randomly choosing from a selection of moves.
Calling the transformers in this paper chess engines is extremely generous, and even when training them on a data set of 1500-Elo games from lichess.org, they couldn't get one to exceed 1550 Glicko-2 against Stockfish.
They were pretty much transformer models that would predict the next move in a PGN string, where the moves of winners in the data set were preferred.
Yes, I realize there are big qualitative differences between different machine/software architectures.
Obviously, any AI like that which doesn't systematically prune a search tree will also be more capable of human-like errors.
But I think there's still a qualitative difference. Humans can be tired, stressed, distracted, emotionally unbalanced.
The computer may have a program that's error prone but it will use that program to its full extent every time.
The difference is humans can focus, take their time, meditate, and so on to bring out the best of themselves. Doesn't always work but in singular cases you can get the best out of humans.
Computers perform how they perform. You can give a true chess engine more time to look deeper into the tree but I don't think that works with neural nets.
Either way, achieving 1550 Elo by analyzing 1500-level games isn't impressive.
If you eliminated clear blunders it'd be 1800 level chess.
Truly big blunders are also, by their nature, rare and often single-handedly lose the game. So you could argue that blunders have an outsized effect on human Elo and an undersized effect on deep learning sample data.
You're missing the point of the paper, they weren't trying to make the most skilled chess transformer. I can think of a half dozen ways off the top of my head to improve it. They were simply trying to prove you could use data from less skilled experts to make a more skilled model.
The big result was 1500 Glicko-2 against Stockfish from 1000-Elo lichess data, but there were diminishing returns as they increased the maximum Elo cutoff, leading to the numbers I posted earlier.
I'm not arguing that they were trying to make the most skilled chess transformer.
I'm arguing that the methodology contains an inherent flaw that partially invalidates the conclusion - which is that you can improve on the quality of the data with these models - that they can outperform the training data in general.
The reason that conclusion is flawed is because (I think) there's an inherent improvement going from humans to a machine, specifically in chess but perhaps in many avenues.
Humans almost invariably play 'above their Elo', but then they tend to blunder and fuck up.
Blunders are a small fraction of the moves, but they disproportionately impact Elo, which is only concerned with win/loss percentages.
If you had chess engines (of any design) rated 1000 Elo (that don't purposefully blunder) playing on lichess and you sampled those computer games for your training, I doubt you'd get to 1500 Elo as easily as with human games rated 1000 Elo.
Again it's like with cars.
If you sampled mediocre drivers and then put their driving skill into a machine that drives just as badly but is at least consistently mediocre, it will outperform the humans who occasionally fall asleep behind the wheel.
Similarly as a chess player I'm not surprised you can learn 1500 level chess from observing 1000 level players. Just filter out the biggest blunders. And I think part of that filtering happens through the machine learning method by default.
Blunders are like black sheep, they're very bad but rare in the dataset. Machine learning aggregates lots of data and the black sheep get averaged out. You get hordes of grey sheep and playing with grey sheep exclusively you end up playing better chess than the guys getting fucked over by black sheep.
It is interesting though that the improvement they measured levels off at higher elo scores. My thesis would suggest you always get a decent boost by moving a human chess data set into a machine player.
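A quick toy simulation of that "averaging out the black sheep" intuition (my own made-up numbers, nothing from the paper): if blunders are rare and roughly independent, a move chosen by pooling many 1000-level games almost never inherits one.

```python
# Toy illustration of rare blunders being washed out by aggregation (illustrative numbers only).
import random

BLUNDER_RATE = 0.05   # assume a 1000-elo player blunders on 5% of moves
POOL = 25             # number of pooled games/players "voting" on each move

def pooled_move_is_blunder() -> bool:
    # Treat the pooled move as a blunder only if a majority of the sampled players blundered,
    # which is vanishingly unlikely when blunders are rare and independent.
    blunders = sum(random.random() < BLUNDER_RATE for _ in range(POOL))
    return blunders > POOL // 2

trials = 100_000
rate = sum(pooled_move_is_blunder() for _ in range(trials)) / trials
print(f"single player blunder rate: {BLUNDER_RATE:.3f}, pooled blunder rate: {rate:.6f}")
```

Of course the real mechanism in the paper is low-temperature sampling over a learned distribution, not literal majority voting; this is just the crude version of the same statistical effect.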
With the effort you spent arguing with me, you should've just read the paper.
Redditors are incapable of reading past the headline but will write a dissertation on how they’re smarter than the researchers
I don't understand why you think it is personal or that I care about being right primarily.
You didn't even read my first comment well enough to discern my argument, and then said I was arguing something I wasn't. So I just rephrased it.
I meant to imply none of what you wrote here. I used the word arguing because you used the word 'arguing', and wrote you were arguing with me since you were replying to my comment.
I read your comments, both before and after you edited them. I even checked what exactly you edited with unddit to try and get a better understanding.
It was clear to me that you hadn't read the paper, and I figured instead of doing all that writing and editing that you should've just read it.
I don’t think it is limited by the training data. The models can find undiscovered connected dots within the data.
If the connected dots are within the data, how are you inferring that it isn't limited by the training data?
The connected dots are new information that didn't exist in the training data.
That wasn't what the authors of the paper concluded. They wrote a theorem that they were simply denoising the data, and then proved it experimentally. What leads you to believe they're wrong?
AlphaGo started by learning from humans, then it beat humans.
I thought it was trained from scratch.
That's AlphaZero, a later version.
Supervised learning + reinforcement learning
Interestingly, Claude can apparently play Go just based on feel. It knows the basics of strategy and can apply that to the game without doing any tree search or anything at all. I’ve only gotten a few moves in before hitting my daily limit, but it’s making me want to dump my ChatGPT subscription and get a Claude subscription to really put it through its paces and see what it can do.
I don't think it can do it well though
Yeah, I’ve only gotten about 10 moves in on a 9x9 board. There’s logic to what it’s doing, but it’s going to take a bit more to really assess what sort of level it’s at.
I tried something similar with GPT-4, and Claude is definitely superior. If I bite the bullet and get a subscription to Claude I might do a full comparison of the two. A game on a 19x19 board would also be more telling.
You hit the nail on the head here. Without a feedback loop abstract planning and reasoning models can’t be validated and improved upon.
But if you have various models trained around expert abstract reasoning, you could use them to score the validity or quality of related ideas, i.e. what makes one hypothesis or business idea better or more viable than another.
Without a feedback loop abstract planning and reasoning models can’t be validated and improved upon
Yes, and that necessarily implies interaction with the physical world, which can be expensive, limited in scaling, or too slow. The real world does not readily share its secrets. If you factor in the cost of getting learning data from the environment, the whole exponential-progress theory falls flat. It is exponentially harder to solve new problems by learning from the environment.
Robotics will play a pivotal role. Now AI can feel, see, and taste, and all in more dimensions than us. It will be able to see more colors. In effect, it will be able to bring in new data from its environment and ponder over it, more than we ever could.
Robotics are great, but you know what scales faster? Regular AI-chat-assistance. LLMs already serve hundreds of millions of people, who are part of the actual world, and send feedback from the actual world to the AI model. It goes both ways, humans also get updated by their interaction with AI, and in turn change the world. Then changes percolate back in the next training set through human publications, while the model also keeps the chat logs as reference for direct learning.
So the intelligence cap would be at the intelligence ceiling of the data set
What's the Carnot-efficiency equivalent for an AI training on the data set?
The ceiling might be lower than 100%.
It’s interesting that we currently have something that is nowhere near as smart as my dog, but is able to assemble incredibly convincing streams of language.
AI can't do what dogs do.
Yes it can
Robot integrated with Huawei's Multimodal LLM PanGU to understand natural language commands, plan tasks, and execute with bimanual coordination: https://x.com/TheHumanoidHub/status/1806033905147077045
University of Tokyo study uses GPT-4 to generate humanoid robot motions from simple text prompts, like "take a selfie with your phone." LLMs have a robust internal representation of how words and phrases correspond to physical movements. https://tnoinkwms.github.io/ALTER-LLM/
ChatGPT trains robot dog to walk on Swiss ball | This demonstrates that AIs like GPT-4 can train robots to perform complex, real-world tasks much more effectively than we humans can: https://newatlas.com/technology/chatgpt-robot-yoga-ball/
Lol. This is absolutely pathetic compared to real animals.
These links are curated examples but they're absolutely limited in the real world.
For example, the old Atlas robot was capable of doing backflips but was incapable of sitting in a chair.
AlphaGo is a good example of an AI system that wasn’t limited by a data set and used reinforcement learning to become better than any human.
Francois Chollet and friends made the ARC (Abstraction & Reasoning Corpus) benchmark to deal with this particular issue as best they could. Honestly, it’s probably the most interesting benchmark I know of currently, and it seems like a better test than most of the current ones I know about. Not to say it can’t suffer from things like contamination, but it seems far better in general.
Here’s the link to the benchmark. https://lab42.global/arc/
True knowledge is connecting the dots of knowledge around us. AI can turn the truth we simultaneously seek and fear into undeniable proof. We seek fear out, out of a hypocritical/illogical bias, and most aren’t ready to actually face the true knowledge of existence or the universe.
AI doesn’t have that instinct to pick and choose its repression data points. It will connect the logical truths it cannot deny exist, whether we like it or not. Most aren’t prepared for that, and that’s why AI scares people.
The bridge between the analytical, the metaphysical, and the spiritual will be constructed from the once laughable notion that it is not all one and the same. The scientists will be the ignorant ones for not being open to the connective tissue, and the god-fearing religious extremists who deny the science of the creation laid before them will become the blasphemous.
I don't think anyone thinks they're gonna stop improving the tech. It is pretty new. People are questioning whether we're gonna have the compounding run to AGI in 5-10 years that people keep talking about.
"We have no immediate plans to lay off any employees"
Lays off 20% of employees one month later.
"We don't see any evidence today" doesn't mean anything. Don't fall for empty jargon. It could end up not being true tomorrow.
If anything, it’s notable that they feel the need to say this explicitly (and publicly).
Yeah? Because they have investors riding on their backs?
When every model needs exponentially more hardware to train and run, and the intelligence gains are only linear, not even double what the last model was, well, that's still hope and improvement, but it's not an infinite horizon. Can we sustain infinite exponential increases in hardware?
If Moore's law is dead, then no.
My guess is that even when we max out LLMs the models will be so useful that it will increase the productivity of the world dramatically and research will go even faster due to that and we will find new options.
Recursive growth? In my Singularity subreddit? No way.
Moore's law doesn't hold in its most direct sense, but maximum performance does effectively double every 2-2.5 years.
Moore's law's death has been greatly exaggerated; it's more difficult and more expensive, but there is a long way to go, and the raw cost of the wafers will be only a fraction of the cost of these supercomputers. A few years ago, TSMC showed a slide laying out a path towards greater density through new technologies out to at least 2035, and by now they are probably into the 2040s. Gate-All-Around is only now starting to be deployed, and that has been talked about for what feels like forever; there is a lot more to go.
But for AI training on large clusters, Moore's law isn't even the biggest factor. Only in about the last year have specialized AI training supercomputers started to be built; until then, these companies got access to supercomputers meant for other tasks. A crude analogy: it's as if someone came up with the idea of Formula 1 racing in a world that had only ever built trucks. There is a lot more to tap there.
During Computex, Nvidia talked about their new NVSwitch, which will connect all the chips in a rack, not just the ones on a blade. Basically, until now the size of a model was limited to 8x80GB of VRAM; now the switch will allow models to use 72x192GB of VRAM (B200).
Announced in March 2024, GB200 NVL72 connects 36 Grace Neoverse V2 72-core CPUs and 72 Blackwell GPUs in a rack-scale design. The GB200 NVL72 is a liquid-cooled, rack-scale solution that boasts a 72-GPU NVLink domain acting as a single massive GPU. Nvidia's DGX GB200 offers 13.5 TB of HBM3e shared memory with linear scalability for giant AI models, which is less raw shared memory than its predecessor, the DGX GH200.
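For reference, the memory arithmetic implied by those figures (just multiplying the numbers quoted above):

$$8 \times 80\,\text{GB} = 640\,\text{GB} \qquad\text{vs.}\qquad 72 \times 192\,\text{GB} = 13{,}824\,\text{GB} \approx 13.5\,\text{TB},$$

i.e. more than a 20x jump in the memory visible to a single NVLink domain.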
The models that will start to get trained in about a year and a half will make GPT-5 blush.
And speaking of training, Nvidia has been developing new switches to allow for much larger scaling and the ability to construct dramatically larger supercomputers. Right now we are at the low tens of thousands of GPUs, but in a year or so we'll have 100K-GPU clusters, then hundreds of thousands the year after that, and beyond that the possible sizes will be so large that the bottleneck will be who can construct the largest power plants. And it's not just Nvidia; most of the other big players formed a consortium to develop similar switches.
And this is purely my speculation, but I don't think Nvidia is going to stop at rack scale; they'll push it much farther until Eos-sized models are possible. Petabyte-sized models by the end of the decade.
I think right now we are in such a dramatic growth era that it's nearly pointless to speculate about the limits of models in the near future. I think by 2030 it will level off, and from then on we'll have a much better picture of how much models will improve every year or two, because it will be dictated by Moore's Law after that. But maybe by then models will have accelerated development to an unimaginable speed. If only there was a word for that.
Agreed, and thanks for the details.
As for power limitations, hopefully we'll crack fusion shortly before or after hitting those power limitations.
You can also increase efficiency, which many studies have done
Linear increases in capability are extremely powerful right now. It has been a surprise that it has scaled to this point, and it is uncertain whether it will keep scaling.
However, if it does keep scaling, the combination of improved hardware, software, data, capital, and plain experience is enough to get us the whole way. Scaling holding means a new order-of-magnitude training run will happen in the coming years, even if it costs significantly more than the previous one.
New architectures and more specialized hardware are also likely to come, but that too depends on the scaling laws holding.
The most promising thing is that we're already where we are, which is like 85% of where we want to be with how intelligent these systems are.
It should take between 1 and 3 more generational updates to achieve the AGI we're looking for, where the system can operate at a PhD level of intelligence, and that's extremely promising.
So in that sense we're doing quite well.
One funny thing to think about is that it is video games that got us here. Without gaming, we may have never developed the advanced GPUs that have made AI possible as it exists today.
Can we sustain infinite exponential increases in hardware?
Capitalism says yes.
Exactly this.
Tech growth has been exponential for the last 50 years but true benefit is not exponential.
the law of accelerating returns.
Speaking of which, Kurzweil's new book is out.
the singularity is indeed nearer than ever
It’s almost as if it will be even nearer tomorrow. And nearer than that the day after tomorrow. I wonder why :D
Fuckin' glorious, I know what I'm doing this weekend
Amazing how many people on this sub still don't get this or are just stubbornly in denial.
When does this translate into something concrete? Because so far, all I see are products that were supposed to be released and have been severely delayed by several months for Voice, and even several years for GPT-5. When we look at the recent releases like 4o and Claude, there's certainly been an improvement in benchmarks, but not the explosive advancements that CEOs and those advocating for a 6-month AI 'pause' were announcing with grave seriousness.
Correct me if I'm wrong, but what delays for GPT-5? I have never heard a hard release date, just a bunch of speculation.
They are huge leaps in the intelligence vs efficiency curve. That’s one reason to think that we still have a long way to go.
Looking at humanity: Einstein's brain isn't that different, physically speaking, from an average Joe's. Homo erectus had a vastly different brain from an Australopithecus, yet they didn't have as profound of a behavioral gap in intelligence as Mary Wollstonecraft does to a high school stoner.
Once you get to a certain point on the intelligence curve, minor tweaks enable disproportionate gains. The question is, where are we on the curve, and can we get to the 'Jack London-level genius' part before our wider infrastructure bottlenecks even slight gains in efficiency? Physically and evolutionarily speaking, the gap between a genius like Sun Tzu or Charles Darwin or Alexander the Great and a randomly selected person isn't that big. But physically and evolutionarily speaking, the gap between Homo erectus and Australopithecus is immense. And it was by no means guaranteed. Australopithecus needed a number of lucky breaks to have the resources to grow its brain further, lucky breaks we shouldn't expect to happen again if we look at similar species in alternate timelines or on other planets.
So, where is AI right now? Hard to say. It does feel like it's a timeline of mere years, considering that the most complex parts of cognition, i.e. language and symbolic reasoning, are very well mastered by LLMs even if they currently struggle with tasks that are simple even for humble intelligences such as, say, tree shrews and their own relationship with long-term memory.
Probably because of hardware delays. Nvidia needs to cook faster
Nvidia is not the bottleneck on producing hardware. TSMC, and the company that provides their machines, ASML, are the bottlenecks on the volume of cards created. As are the memory manufacturers like Micron, Samsung, and SK Hynix for HBM and other kinds of VRAM. There's a variety of board partners that can assemble the actual cards for them, but not enough supply of GPU dies and VRAM chips to increase the volume of production.
Exactly. If the underlying technology isn't plateauing, the releases certainly are.
The people saying it is levelling off are just people on reddit that are impatient that a new model wasn't released in the past two weeks.
Models take time, and because we don't have the next big leap yet, people get impatient. Most people only started caring about this AI stuff less than 18 months ago, and we have seen quite big improvements already in that time, which isn't really even enough time for a big model to be made. That is why we've mostly seen smaller improvements or GPT-4-level copycat models arriving in that window.
The real test for this will be what GPT-5 is able to do.
I'm curious where that will lead us. What would an LLM that's hundreds of times smarter than ChatGPT do?
Place a green sphere on a red box next to a blue cone, I guess.
Cool, but I don't think I've noticed them getting any better. They've definitely gotten cheaper to run
"We don't get to choose..."
Sure you do. It's just that all the scientists working on this issue have delegated their choice to a boss somewhere.
Like Oppenheimer. Someone else decided that it was important to develop nuclear weapons capacity, based on considerations decided by people other than Oppenheimer.
AI scientists have delegated their moral autonomy to the same group of people who hired Oppenheimer.
The "We have to get it before some other cadre of moral authoritarians get it" bunch.
Yah shure...! What could go wrong with giving ruthless people even more control than they already have?
Every country is investing and building. It's better to be first than to say "I'm not doing it" and let a less trustworthy power do it.
Same logic used on Oppenheimer... same logic I just described in my previous comment.
"Other people do it, too!" Every person has moral autonomy... authoritarians seek to co-opt that autonomy through the application of law, creed or brute force. Blaming others for also delegating moral autonomy is not a strong argument against moral authority.
The thing that baffles me is the apparent affection people develop for "their" moral authoritarians. As if our ruthless leaders are benign and all other groups of ruthless leaders are malignant.
Seems more direct to just realize that all ruthless leaders are...
We are like primitive animals, unable to restrain ourselves or show a modicum of sense. First the Manhattan Project, now an AI race.
I hope we at least get a cool movie out of this, with Cillian Murphy please, after the digital (or real) fallout settles.
As it turns out, the AI researcher from that one chart, does in fact know more about AI than the internet randos bashing the chart.
The problem isn't that they couldn't get there eventually... the problem is the timeframe. If we get AI JUST powerful enough to automate most jobs, but NOT powerful enough to become ASI, then we are royally boned.
Capitalism is deeply entrenched in everything; the divide between the haves and have-nots will only be further exacerbated, mass unemployment will follow, but there will be just enough AI for robot soldiers and police bots... we are headed straight into a cyberpunk dystopia.
!remindme 1 year
Even if you believe them those companies have so many billions invested they could not afford to say otherwise.
Hype is only beneficial in the short term, so there isn't as much incentive to lie about capabilities and progress as people in this sub claim.
Some things will still stick around long after the hype, though, like regulatory capture.
Really? Elon seems to be able to keep the hype up indefinitely even if he fails to deliver. Isn’t this the entire game at this point?
You can actually buy a Tesla, have high speed internet via Starlink and rockets are reusable.
But hey, the Hyperloop didn't happen, self-driving isn't perfect, and we haven't landed on Mars yet, so it's all just empty broken promises.
This is true, but if they were seeing things level off internally, they'd be a bit quieter than they are, and we'd hear chatter from, e.g., lower-level employees or the SF rumor mill.
It’s just to stir up hype so chumps buy the product. If this sub understood anything about AI, they would realize that.
Yeah, since no one has released a model an OOM bigger than GPT-4 (which is more than 1T params), all of these "we see no end to scaling" reports seem like empty talk. But I like how he at least pretends to partly wish for models to plateau, so everyone can get on with their lives haha
I don’t think that a lot of people, including me, understand the weight of that statement.
The USAF recently tested a dogfight with a jet using AI, and it won against the other pilot. That's definitely no small feat, and I think it helps solidify that the AI we have is more than just a parrot.
Does it? Piloting a jet is a mechanical process, if you had asked me before I knew about that experiment whether AI could do it I would say yeah, of course. It doesn't require any crazy abstract thinking, just really fast reflexes and the ability to take in and process a lot of sensor data quickly. Perfect for AI.
Not to mention the fact that I'd bet the computers on F-16s have been collecting reams of flight data for at least a decade, probably all meticulously stored away somewhere. Every training dogfight for a decade on the airframe is a hell of a good set of training data.
I don't even think you'd have to use real data. Simulators have been capable for decades (hell, probably even more capable than actual flight data recording gear at this point). Could train on rudimentary non-graphical mathematical simulations alone. With proper datacenter computing power? pfft lol. Could probably run billions of sortie simulations in days. Have them fight each other even.
Whoa. It's like WOPR, but for fighter jets.
ASIC AI accelerators aren't even out yet…
Google is on their seventh generation of ASIC AI accelerators.
Their sixth generation was a 5x improvement over the fifth.
Google was able to do Gemini entirely without needing a thing from Nvidia.
Is our ability to produce enough compute scaling up fast enough though? Isn't that going to be the limiting factor, at least in the near term?
Of course, one should expect this to be immediately ignored by professional naysayer contrarians.
Maybe some of them should build a custom AI to solve the inevitable problem of energy.
I mean, you can keep trying to scale all you like, but if the scaling itself doesn't limit you, then energy consumption eventually will.
Unless, of course, they're already shipping parts to the sun for the upcoming and as of yet unannounced Dyson Sphere project!
Yeah, there is no sign of the scaling laws falling off, but to be fair we haven't seen significant scaling since GPT-4 released in March 2023. Well, that was the public release of GPT-4; the model itself finished pretraining around August 2022.
If the approach isn't plateauing, the releases certainly are. More efficiency but minor improvements in capability now
We're still in the infancy of a lot of this development. Additionally, the infrastructure is still being built out. Progress might look sparser, but overall things are still happening, just not with the explosion we saw once OpenAI released ChatGPT.
I feel that for a long time we won't have to worry about the models ceasing to learn more or scale; it's the energy consumption of training these behemoth models that will be the ultimate bottleneck. If we want to take advantage of all that scaling, we REALLY need to figure out better architectures for these systems, or at least a way to make training more efficient by orders of magnitude every few years.
There are many promising new neural networks and AI hardware in the pipeline. Transformers won't continue scaling forever but there will be better and more energy efficient AI models that will replace transformers.
Even if the models improve less in accuracy and performance as compute and training increase, nothing stops the whole process from shrinking more and more. As of now the volume of training data is already huge and has a lot of room to improve in quality. The way models like the ones we're discussing are trained allows for great advancements in efficiency, quality, and accuracy even if the "capabilities" part plateaus.
What
My thoughts exactly lol. This whole field feels impossible to accurately follow these days, it’s like everyone is saying something different.
I think they're literally asking what the fuck the person is saying
There are still incremental gains to be had in the domain of process, deployment, hardware, efficiency and unhobbling that have nothing to do with model capabilities, that together could yield extreme improvements.
Is that clearer?
I think they are saying that while training datasets are huge... they are quite rough and crappy. At least for all these general-population applications.
Real AI pros train the AI on extremely specific and high-quality data. Noise-free data, if you will.
You've got LLMs responding to questions with "factual" answers based on The Onion articles. Satire in text form on the Internet is noise... it clouds accurate data sets with "bad data".
Another fun example is the actual birthdates of celebrities and shit... so much bad data out there that the LLM has no hope.
Soon it will be trained on synthetic data, and then the entire AI will fit into 10TB of space.
Well, considering they are heavily interested in keeping the hype train going it's hard to fully believe them 🤣🤣
Buy more stonks
I’m more interested in what the scientists and researchers are saying.
Hard to believe
Leveling off isn’t the issue though. It’s obvious that more compute and data = increase in capabilities. That is what deep learning is all about. What needs to happen if LLM proponents are correct is exponential gains as well as emergent capabilities.
Neither of which seem to be happening. The perfect example of this is that these models are not getting any better at planning as they are scaled up.
Yes, but did you also know they SELL PRODUCTS?! This immediately makes anything they say 100% false. I am very smart. Uh… hype, cult, etc.