The capabilities of AI to create content have exploded in the past year, but writing remains incredibly personal. When AI is used to help people communicate, respecting the original intent of a message is of paramount importance, but recent innovation, especially in the field of generative AI, has outpaced existing approaches to delivering responsible writing assistance.
When researchers and industry professionals think about safety and fairness in the context of AI writing systems, they usually focus on identifying toxic language, such as derogatory terms or profanity, and preventing it from appearing to users. This is an essential step to make models safer and ensure they don’t produce the worst content. But in itself this is not enough to make a model safe. What if a model produces content that is completely harmless in itself, but becomes offensive in certain contexts? A saying like “Look on the bright side” can be positive in the context of a minor inconvenience, yet outrageously offensive in the context of war.
As AI developers, it is not enough for us to block toxic language and claim that our models are safe. To truly deliver responsible AI products, we need to understand how our models work, what their shortcomings are, and in what context they can be used – and we need to put controls in place to prevent harmful interactions between our AI systems and our users.
Knar Hovakimian
Responsible AI team, Grammarly.
The problem and why it matters
According to a Forrester survey, 70 percent of people use generative AI for most or all of their writing and editing at work. With this increase in the use of generative AI tools, more content than ever before is regularly processed by AI, machine learning (ML), and large language models (LLMs).
And we know that AI makes mistakes. When an AI model makes a suggestion that changes the meaning of a sentence, it is usually an innocent error that the writer can simply dismiss. This becomes more complicated as the technology advances and as developers rely more on LLMs. For example, if an LLM is prone to political bias, it may not be responsible to allow it to generate political reporting. If it is prone to misinformation and hallucinations, allowing it to generate medical advice and diagnoses can be dangerous and unethical. The stakes are now much higher, because innocent mistakes are no longer the only possible outcome.
A way forward
The industry must develop new safety tactics to keep up with the capabilities (and shortcomings) of the latest AI models.
I’ve previously mentioned a few circumstances where blocking toxic language isn’t enough to prevent dangerous interactions between AI systems and our users in today’s ecosystem. If we take the time to investigate how our models work, what their weaknesses are, and in what context they will be used, we can provide responsible support in these examples and more:
A generative AI writing tool can prepare a summary of a medical diagnosis. However, given the risk of inserting misleading or out-of-context information, we can prevent the LLM from returning inaccurate information by using the appropriate ML model as a guardrail.
Political opinions are nuanced, and an AI product’s suggestion or output can easily misrepresent a point because the model doesn’t understand the intent or context. Again, a carefully constructed model can prevent an LLM from engaging with certain political topics in cases where there is a risk of misinformation or bias.
If you’re writing a condolence letter to a colleague, a model can prevent an AI writing assistant from making a tone-deaf suggestion to sound more positive.
An example of a mechanism that can produce these types of results is Seismograph, the first model of its kind that can be stacked on top of large language models and proprietary machine learning models to reduce the likelihood of problematic outputs. Just as a seismograph measures earthquake waves, Seismograph detects and measures how sensitive a text is so that models know how to interact with it, minimizing the negative impact on customers.
Seismograph is just one example of how a hybrid approach to building – where LLMs, ML and AI models work together – creates more reliable AI products. By reducing the chance of AI delivering unfavorable content without proper context, the industry can provide AI communication assistance from a place of empathy and responsibility.
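To make the stacking idea concrete, here is a minimal sketch in Python of a sensitivity check sitting in front of a suggestion model. It is an illustration under stated assumptions, not how Seismograph itself works: the keyword lists, the scoring rule, the threshold, and the names `score_sensitivity`, `guarded_suggest`, and `make_more_positive` are all hypothetical, and a real guardrail would use a trained classifier rather than keyword matching.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical cue lists for illustration only; a production guardrail
# would rely on a trained sensitivity model, not keywords.
SENSITIVE_CUES = {
    "grief": ("condolence", "passed away", "funeral"),
    "medical": ("diagnosis", "chemotherapy", "prognosis"),
    "political": ("election", "ballot", "immigration policy"),
}


@dataclass
class Sensitivity:
    score: float              # 0.0 (benign) to 1.0 (highly sensitive)
    topics: Tuple[str, ...]   # sensitive topics detected in the text


def score_sensitivity(text: str) -> Sensitivity:
    """Toy stand-in for a sensitivity model: each matched topic adds 0.5."""
    lowered = text.lower()
    topics = tuple(sorted(
        topic for topic, cues in SENSITIVE_CUES.items()
        if any(cue in lowered for cue in cues)
    ))
    return Sensitivity(score=min(1.0, 0.5 * len(topics)), topics=topics)


def guarded_suggest(
    text: str,
    suggest: Callable[[str], str],
    threshold: float = 0.5,
) -> Optional[str]:
    """Run `suggest` only when the text scores below the threshold."""
    if score_sensitivity(text).score >= threshold:
        return None  # withhold the suggestion in a sensitive context
    return suggest(text)


def make_more_positive(text: str) -> str:
    """Stand-in for an LLM-backed 'sound more positive' rewrite."""
    return text + " Look on the bright side!"


if __name__ == "__main__":
    print(guarded_suggest("The printer is jammed again.", make_more_positive))
    print(guarded_suggest("I am so sorry your father passed away.", make_more_positive))
```

The design choice worth noting is that the check runs on the user's text before any suggestion is generated, so the downstream model never gets the chance to offer a tone-deaf rewrite of a condolence note.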
The future of responsible AI
When AI communication tools were mainly limited to the basic mechanics of writing, the potential damage caused by a writing suggestion was minimal, regardless of context. Today we rely on AI to take on more complex writing tasks where context matters. AI providers therefore have a greater responsibility to ensure that their technology does not have unintended consequences.
Product builders can follow these three principles to hold themselves accountable:
1. Test for vulnerabilities in your product: Red teaming, bias and fairness assessments, and other pressure testing can reveal vulnerabilities before they have a significant impact on customers.
2. Identify industry-wide solutions that make building responsible AI easier and more accessible: Developments in responsible approaches help us all improve the quality of our products and strengthen consumer trust in AI technology.
3. Integrate responsible AI teams into product development: This work can fail if no one is explicitly responsible for model safety. Companies should prioritize responsible AI teams and empower them to play a central role in building new features and maintaining existing ones.
These principles can guide the industry’s work in developing publicly available models such as Seismograph. By doing this, we show that the industry can stay ahead of risk and provide people with more complex suggestions and generated results, without causing harm.
This article was produced as part of Ny BreakingPro’s Expert Insights channel, where we profile the best and brightest minds in today’s technology industry. The views expressed here are those of the author and are not necessarily those of Ny BreakingPro or Future plc. If you are interested in contributing, you can read more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro