(Un)natural Language Processing
Natural language processing (NLP) attempts to understand, interpret, and generate human language. In my observation, it does fine at "generate," okay (I guess) at "interpret," but does not really "understand." NLP is based on statistics. It tries to guess what's next based on what word (or words) most often follow what you've typed. That is, most often in the training data. If the training data is really generic, then you might get one response, but if it's contextualized then you might get a different one.
Here's an example: If I were to write "I had several bats in my garage, which..." the next few words could be "I would bring to baseball practice." or "I had to get removed by a specialist." The difference is context. A model trained on sports writing would make the former assumption. A model trained on nature and wild animal encounters might make the latter. As a reader, you'd know which the writer meant because you'd already have that context.
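The "guess what's next from the training data" idea can be sketched with a toy bigram model. This is just a counting exercise, not a real language model, and the two one-line "corpora" below are invented stand-ins for sports writing and nature writing:

```python
from collections import Counter, defaultdict

# Two tiny made-up "training corpora" standing in for much larger ones.
sports_corpus = "the bats went to practice and the bats went to the game"
nature_corpus = "the bats were removed and the bats were relocated"

def build_bigram_model(text):
    # Count, for each word, which words follow it and how often.
    words = text.lower().split()
    model = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        model[prev][nxt] += 1
    return model

def predict_next(model, word):
    # Return the most frequent follower seen in training, or None.
    followers = model.get(word.lower())
    return followers.most_common(1)[0][0] if followers else None

sports_model = build_bigram_model(sports_corpus)
nature_model = build_bigram_model(nature_corpus)

# Same prompt word, different training data, different continuation.
print(predict_next(sports_model, "bats"))  # went
print(predict_next(nature_model, "bats"))  # were
```

No understanding anywhere in there: the model only knows which word most often followed "bats" in whatever text it happened to see.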
Another issue is with pronouns. That is, GenAI is terrible at them! Take these two pairs of sentences:

"I tried to put the suitcase in the box. It was too big."

"I tried to put the suitcase in the box. It was too small."
The word "it" in the second sentence obviously refers to the suitcase in the first pair and the box in the second pair. NLP algorithms have a REALLY hard time figuring that out, becuase they don't actually understand the meanings behind the words. To know which is "it" in the second sentence, one also needs to know what it means to fit in something and (in the case of the first pair) that the thing into which the other thing is going to fit cannot be too small.
This is why the output of most GenAI programs seems...unnatural. It's not grammatically incorrect, just unnatural. So, if you want to use GenAI to help with your writing, go ahead! Just make sure that you take the time to edit it so that it's not obvious.