Four AI image generators were each asked to illustrate the topic with this prompt (with no further details on image content, style, etc.): ‘Picture of a toxic discussion on an internet platform with an indignant user at his desk, view from the side’. From left to right are the – rather stereotypical – results of Stable Diffusion XL, DALL-E 3, Flux Schnell and Midjourney.

Social Media

More polite with AI

Online discussions are often a hotbed of controversy. Anonymity and heated emotions quickly lead to toxic comments that stifle discourse. In a recent study, a team of researchers at L3S shows how artificial intelligence (AI) and natural language processing (NLP) can help reformulate such arguments – in a way that preserves the original point but removes the inappropriate language. But how can this balancing act be achieved?

Mitigation through reformulation

Until now, platforms have relied on their users or automated detection tools to flag inappropriate content, which is then reviewed by moderators and removed if necessary. However, this approach is not only time-consuming and expensive but also stressful for moderators. The solution may lie in a new technology: artificial intelligence that can automatically reformulate inappropriate arguments. “Our goal is not simply to delete, but to salvage the discourse by defusing arguments while preserving their message,” says Timon Ziegenbein, lead author of the study.
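For illustration, the flagging status quo can be approximated in a few lines with an off-the-shelf toxicity classifier. This is a minimal sketch, not part of the study; the model choice ("unitary/toxic-bert") and the threshold are illustrative assumptions.

```python
# Minimal sketch of today's moderation pipeline: flag comments for human
# review with a publicly available toxicity classifier.
# The model name and the 0.8 threshold are illustrative choices.
from transformers import pipeline

flagger = pipeline("text-classification", model="unitary/toxic-bert")

def needs_review(comment: str, threshold: float = 0.8) -> bool:
    """Return True if the comment should be queued for a moderator."""
    top = flagger(comment)[0]  # top label with its confidence score
    return top["label"] == "toxic" and top["score"] >= threshold
```

Every flagged comment still ends up in front of a human moderator – exactly the bottleneck that the reformulation approach aims to relieve.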

How the AI learns

At the heart of the approach is a large language model (LLM) trained with reinforcement learning. The model is rewarded both for increasing the appropriateness of an argument and for retaining its original content. The AI thus ‘learns’ to reformulate inappropriate arguments so that they are more respectful and factual. “Unlike simple style transfer tasks, which only change the tone of a text, rewriting inappropriate arguments also requires content changes at the document level, not just at the sentence level,” says Ziegenbein. This makes it possible to add or remove content to increase objectivity and politeness. One example from the study shows the effectiveness: an aggressive and emotionally charged argument was reformulated into a calmer, more thoughtful version – without distorting the core of the statement.
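The ‘machine feedback’ in the paper’s title refers to the fact that the reward signal comes from automatic models rather than human raters. The following is a minimal sketch of such a reward, assuming a placeholder appropriateness classifier and an off-the-shelf sentence-similarity model; the model names and the simple weighted sum are illustrative assumptions, not the authors’ actual setup.

```python
# Sketch of a reward that balances two goals: make the argument more
# appropriate, but stay semantically close to the original.
from sentence_transformers import SentenceTransformer, util
from transformers import pipeline

# Placeholder: any classifier that scores how appropriate a text is (0..1).
appropriateness = pipeline("text-classification",
                           model="example-org/appropriateness-classifier")
similarity = SentenceTransformer("all-MiniLM-L6-v2")

def reward(original: str, rewrite: str, alpha: float = 0.5) -> float:
    """Higher reward for rewrites that are appropriate AND preserve content."""
    approp_score = appropriateness(rewrite)[0]["score"]
    emb = similarity.encode([original, rewrite], convert_to_tensor=True)
    content_score = util.cos_sim(emb[0], emb[1]).item()
    return alpha * approp_score + (1 - alpha) * content_score
```

During training, the LLM generates candidate rewrites, receives this kind of reward, and updates its parameters with a policy-gradient method (for example PPO) so that higher-scoring reformulations become more likely.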

But how well does it work in practice? To find out, the researchers compared the AI-generated paraphrases with those written by humans. The results were promising: “Our model achieves a significant improvement in appropriateness while remaining surprisingly close to the original statement,” says Ziegenbein.

Limitations and ethical issues

Despite the successes, AI also has its limitations, especially when it comes to very short or highly inappropriate arguments. “In such cases, it might make more sense to remove the entire text rather than rewrite it,” the authors write. Such technologies also raise ethical questions: Can a platform simply rephrase content without the author’s consent? The researchers see a need for further research in this area.

However, the results of the study show that AI could be a valuable tool for making online discussions more civil and reducing the burden on moderators. The next step is to further refine the technology and develop ethical guidelines for the use of such tools. “Our approach is a first step,” say the researchers, “but there is still a lot of work to be done to use the technology safely and effectively in practice.” One thing is certain: The approach of mitigating toxic content through paraphrasing has potential – for both platforms and users.

Timon Ziegenbein, Gabriella Skitalinskaya, Alireza Bayat Makou, Henning Wachsmuth: LLM-based Rewriting of Inappropriate Argumentation using Reinforcement Learning from Machine Feedback. In: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024), pp. 4455–4476. aclanthology.org/2024.acl-long.244.pdf

Contact

Timon Ziegenbein is a research assistant and PhD student in the L3S project ‘OASiS: Objective Argument Summarisation for Search’.

Prof. Dr. Henning Wachsmuth

L3S member Henning Wachsmuth is head of the Natural Language Processing group at the Institute for Artificial Intelligence at Leibniz Universität Hannover.