Reddit removed post
Why can't AI accurately explain grammar?
Discussion
They are probabilistic engines, so they will give you an answer that is probably correct based on a statistical sample of the data they were trained on, but they cannot evaluate whether the answer is actually correct; that is why they are so bad at math. As for English grammar, they are better at it than I am even though I am a native English speaker, but with other languages it depends entirely on how much of that language was included in the training data.
LLMs need a huge number of examples in any given language before they are able to speak it fluently, let alone explain the rules of its grammar with coherent examples.
That is the technical reason. Now for the TL;DR reason: it's because most of the internet is in English, and most models are trained on what is available on the internet.
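The probabilistic-engine point can be shown with a deliberately tiny sketch. This is a toy bigram counter, not a real LLM, and the corpus is made up: the "model" only knows which word most often follows which, and has no way to check whether its pick is actually correct.

```python
from collections import Counter, defaultdict

# Toy stand-in for next-token prediction: count which word follows
# which in a tiny corpus, then "predict" the most frequent follower.
corpus = "the cat sat on the mat the cat ate the fish".split()

followers = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    followers[prev][nxt] += 1

def predict_next(word):
    # Pick the statistically most likely next word -- there is no
    # notion of whether the continuation is true, only that it is common.
    counts = followers[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # "cat" -- seen twice after "the", so it wins
```

A real model works over billions of parameters and subword tokens rather than raw counts, but the core move is the same: output whatever is statistically likely, not whatever is verified.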
Moderator removed comment
They are just giving you words. They do not know what they are talking about.
Yep, it's the trick of the light a lot of people seem to be getting stuck on. It's an artifice of intelligence: it doesn't know what you're asking it, even though it can appear to.
Pretty solid observation that was shared with me: Appearances are suggestive, not informative.
For two main reasons:
- If they haven't seen something before they essentially "guess" what should be there based on extrapolation. This is why the first version of ChatGPT would hallucinate references to articles that did not exist.
- If they have seen something before, they are more likely to repeat that sequence of words even if it is not grammatical or logical. This is why the first version of the LLaMA model (by Meta) would spit out tons of emojis even though you never asked for them, because it was trained on Facebook posts, which are full of emojis.
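The first bullet can be made concrete with a toy model. The "article titles" below are invented for illustration: chaining locally plausible word pairs produces a fluent-looking title that exists in none of the sources, which is the same failure mode as a hallucinated reference.

```python
from collections import Counter, defaultdict

# A bigram model "trained" on three (made-up) real titles.
titles = [
    "deep learning for image recognition",
    "image recognition in medical diagnosis",
    "medical diagnosis with neural networks",
]

followers = defaultdict(Counter)
for title in titles:
    words = title.split()
    for prev, nxt in zip(words, words[1:]):
        followers[prev][nxt] += 1

def generate(start, max_len=8):
    # Greedily chain the most frequent follower of the last word.
    out = [start]
    while followers[out[-1]] and len(out) < max_len:
        out.append(followers[out[-1]].most_common(1)[0][0])
    return " ".join(out)

made_up = generate("deep")
print(made_up)            # a fluent-looking title...
print(made_up in titles)  # ...that appears in no source: False
```

Every local step is well supported by the training data; it is only the stitched-together whole that is fabricated.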
Shouldn't they be able to evaluate their examples based on the grammar rules they quoted?
In a limited sense they can, but not while they are generating the output, because the words coming out of the model are not held within the model's working memory. To get a model to simulate the process of evaluating something, you need to ask follow-up questions so that the information (from the conversation history) is fed back in as an input to the model, where its attention mechanism can actually see it.
So, in a sense, you can get an LLM model to "evaluate" things if you give it enough context up front, but the model isn't actually doing any reasoning or evaluation; the context is just priming the model to generate a probabilistic output that more closely aligns with what you are looking for.
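A minimal sketch of that feedback loop, with `fake_model` as a hypothetical stand-in for a real LLM API call: each turn is stateless, so the only way the model can "evaluate" its earlier words is by having the whole transcript concatenated back into the next prompt.

```python
# `fake_model` is a stand-in for a real LLM call; it just reports how
# much context it was handed, since that is the point being illustrated.
def fake_model(prompt: str) -> str:
    return f"(model saw {len(prompt)} characters of context)"

history = []

def chat(user_message: str) -> str:
    history.append(f"User: {user_message}")
    # The full transcript -- including the model's own earlier words --
    # is fed back in as input; that is the model's only "memory".
    prompt = "\n".join(history)
    reply = fake_model(prompt)
    history.append(f"Assistant: {reply}")
    return reply

chat("Explain the past perfect.")
chat("Now check the example you just gave.")  # "checking" = re-reading history
```

This is roughly how chat frontends work on top of stateless completion APIs: the "conversation" lives in the client, not in the model.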
You have to understand these models do not know any grammar rules.
I mean that literally.
They are statistical engines merely computing probabilities. They ingest a query and then compute the output token by token, based upon the best predicted probability. (It's a lot more complex than that, but that's essentially what's going on.)
It only appears to be language because they were trained on language.
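The claim above can be sketched numerically. The vocabulary and scores below are invented for illustration: at each step, raw scores (logits) over the vocabulary are turned into probabilities with a softmax, and a token is picked from them. No grammar rule is consulted anywhere.

```python
import math

# Hypothetical next-token scores; a real model produces one score per
# entry of a vocabulary tens of thousands of tokens large.
vocab = ["cat", "sat", "mat", "grammar"]
logits = [2.0, 1.0, 0.5, -1.0]

def softmax(xs):
    # Exponentiate and normalize so the scores sum to 1 -- a probability
    # distribution over possible next tokens.
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)
best = vocab[probs.index(max(probs))]
print(best)  # "cat" -- the highest-scoring token
```

Greedy selection is shown here for simplicity; real systems usually sample from the distribution, which is why outputs vary between runs.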
Moderator removed comment
No, it doesn't. In simple terms, the letters with the best statistical values are the ones provided. Because of the complexity of the model, it is likely that the resulting chain of letters will be a good answer.
Those models cannot explain their reasoning because they don't reason in any way we can understand or conceptualize.
I don't know. Grammarly fails me so many times, too.
Because there is very little data about grammar in the training data. Sure, grammar is present in the structure of sentences and of language itself, and the models learn it too, but they don't actually see grammar being talked about as much as they would need to in order to talk about grammar themselves.
So the AI will speak with correct grammar but can't talk to you about it, just like a native speaker who never formally studied grammar will be able to follow the rules but not to explain them.
I don't think that's it. They know about grammar just as much as they know about any other subject. I think the issue lies in a combination of the lack of multilingual discussions in the training data and the difficulty you would have in adding sufficient examples to cover every language scenario without driving it insane.
But in the far future it may be possible. Like... next Thursday maybe.
Grammar is more complicated. See, these language models can't do traditional language intelligence for you. They can't play many word games, like word ladders. They can't do frequency analysis of a text. Given a text with errors, they can rewrite it, but they won't be able to just fix the errors in place. They generate text through the attention patterns they have learned and cannot follow very precise directions on large texts reliably or deterministically. You have to show them what you want done, and then they can maybe comply.
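For contrast, the frequency analysis mentioned above is a purely deterministic counting task. A few lines of code give the exact same answer on every run, which is precisely the reliability an LLM's token-by-token sampling can only approximate:

```python
from collections import Counter

def letter_frequencies(text: str) -> Counter:
    # Count alphabetic characters only, case-insensitively.
    return Counter(ch for ch in text.lower() if ch.isalpha())

freqs = letter_frequencies("Grammar is a system of rules.")
print(freqs.most_common(3))  # exact counts, identical every run
```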
User deleted comment
It is a tiny amount for many of those languages. Some of it is even ambiguous for humans to understand. And remember, these LLMs don't have understanding, just a heuristic over tokens, which they use to generate.
They had to specifically increase the proportion of textbooks and similar material in the training data for other topics. It seems plausible that they just didn't do so for English grammar in particular.
AI does most things less reliably than humans, and humans (even native speakers) can't explain grammar either.
Speaking your native language is a breeze; sometimes you do it without even being aware of it. But explaining it amounts to doing scientific research. Usually people just repeat how they heard it explained, and that's not how it really works. So AI does the same, but worse.
I can’t explain grammar, either….
User: Explain grammar
ChatGPT: Grammar is essentially the set of rules that govern how words are structured and arranged to convey meaning in a language. It encompasses various elements such as syntax, morphology, semantics, and phonology.
Syntax: This aspect deals with the arrangement of words to form meaningful sentences. It includes rules regarding word order, sentence structure, and the formation of phrases and clauses.
Morphology: Morphology focuses on the structure of words themselves, including how they are formed and modified. This includes prefixes, suffixes, roots, and inflections that alter the meaning or function of a word.
Semantics: Semantics is concerned with the meaning of words, phrases, and sentences. It involves understanding the relationship between words and their referents or meanings within a specific context.
Phonology: Phonology deals with the sounds of language. It includes the study of phonemes (distinctive sound units), phonetics (the physical properties of speech sounds), and phonological patterns and rules.
Grammar is crucial for effective communication because it provides the framework for organizing thoughts and ideas into coherent messages. It allows speakers and writers to convey meaning accurately and be understood by others. While grammar rules may vary between languages, they serve as the foundation for linguistic expression across all cultures.
User: Are you sure that's correct?
ChatGPT: Yes, I'm confident that the explanation provided accurately captures the essence of grammar. However, if you have specific questions or aspects of grammar you'd like to explore further or if there's something specific you're unsure about, feel free to ask!
User: Explain grammar
ChatGPT: Certainly! Grammar refers to the system of rules and principles that governs the structure, formation, and usage of language. It encompasses various elements such as syntax, morphology, semantics, and phonology. Here's a breakdown of each:
Syntax: Syntax deals with the arrangement of words to form grammatically correct sentences. It includes rules for word order, sentence structure (e.g., subject-verb-object), and the formation of phrases and clauses.
Morphology: Morphology focuses on the structure of words and how they are formed. It involves the study of morphemes, which are the smallest units of meaning in a language. Morphology examines how morphemes combine to create words and how they are modified to indicate grammatical relationships (e.g., adding suffixes for tense or plurality).
Semantics: Semantics concerns the meaning of words, phrases, and sentences. It explores how words and their combinations convey meaning within specific contexts. Semantics also examines relationships between words, such as synonyms, antonyms, and the connotations and denotations of words.
Phonology: Phonology is the study of the sounds of language. It involves analyzing the sound patterns and systems of a language, including phonemes (distinctive sound units), allophones (variant forms of phonemes), and phonological rules governing sound combinations and pronunciation.
Understanding and following the rules of grammar is essential for effective communication. It helps speakers and writers convey their thoughts and ideas clearly and accurately, facilitating comprehension and preventing misunderstandings. While specific grammar rules may vary between languages and dialects, the underlying principles of grammar provide a framework for linguistic expression across cultures and communities.
User: Are you 100% sure what you said is correct?
ChatGPT: I'm highly confident in the accuracy of the explanation provided. However, if there are specific aspects of grammar you're unsure about or if you have any questions or concerns, please feel free to let me know. I'm here to help clarify any doubts or provide further information as needed.
I'm not sure I believe this. From what I've seen, language questions are what LLMs are best at, and I suspect it's user error on your part.
Can you give a few examples you’ve tried that didn’t work?
Grammar is to language as taxonomy is to biochemistry
LLMs have huge amounts of data on how people actually communicate; how well that matches the rules people have decided exist after observing the language may vary.
Mmm, I am wondering if there's a specific AI that's better at grammar than others?!
Because they don't understand anything; it's just a very, VERY sophisticated parrot.
They have no idea what they're talking about, what they're looking at, or what they've been trained on. You put data in; based on a whole mess of compressed data, you get some data out.
Grammar requires understanding and the connecting of ideas. You have to understand what someone is trying to say, who they are trying to convey it to, the context they are trying to convey it in, the purpose of it, etc. And LLMs have no ideas or understanding of anything.
Also, English is pretty rough when you're trying to mechanically and pragmatically decompose and reconstruct it. All of the rules are made up, and sometimes none of them matter. Words can have entirely different meanings. I mean, did you read it, or have you read it? It's pretty ridiculous. A lot of the time, when people say something, they don't really mean it; they mean something else entirely, like sarcasm or hyperbole.
Never mind that there are different versions of English in terms of dialects.
TL;DR: AI models have less awareness of what they themselves or anyone else is saying than any person in existence.
Nuance. AI does not get nuance.
Because AI does not know anything at all. It's all word salad made using probability models.
Because NLP is not only grammar; it is also understanding context, syntax, and intention. (I assume you aren't referring to generative models.)
Because grammar is made-up rules.
AI can explain grammar to a certain extent, but it may not always be perfectly accurate for a few reasons:
- Complexity of Language: Language is incredibly nuanced and complex. There are many rules, exceptions, idioms, and variations in grammar across different languages and even within the same language. AI systems, while advanced, may struggle to capture and explain all of these intricacies accurately.
- Context: Understanding grammar often requires understanding the context in which language is used. AI models may have difficulty accurately interpreting context, leading to explanations that may not fully capture the intended meaning or usage.
- Ambiguity: Language is inherently ambiguous at times, with sentences that can be interpreted in multiple ways depending on context or subtle linguistic cues. AI systems may struggle with disambiguation and may provide explanations that are not always clear or accurate.
- Evolution of Language: Language is constantly evolving, with new words, expressions, and grammar rules emerging over time. AI models may not always be up-to-date with the latest linguistic developments and may not accurately reflect current usage.
They may not understand grammar at all. They "learn" language the way a baby does, instinctively. They speak a language by instinct, like a first language rather than a second one. Explaining grammar is different from simply speaking it. The knowledge of how grammar is explained does not come from ordinary language material; it requires language-teaching material, which may not be that abundant in AI training data.
AI models struggle with understanding grammar due to their reliance on statistical patterns in large datasets rather than a deep comprehension of linguistic rules. They generate responses based on probabilities, which can lead to inconsistencies and errors, especially in complex or nuanced grammatical contexts.
Lol, cuz we trained it on internet data, like Reddit, FB, etc.
I'd try fine-tuning on language, grammar, and literature textbook datasets for the target language.
When you look at custom models in this space, standouts to me are companies like Writer AI and Grammarly, who absolutely have special sauce when it comes to this area.
It's just not there yet; it needs better-quality data to feed on.
But we can have a good conversational flow with AI. This suggests that grammar is not a hard rule in NLP. In other words, human speech usually doesn't follow grammar so much as feeling.
Discussions about grammar might be really rare in the training data.