Abstract
Natural language generation has achieved remarkable performance on various tasks, including dialogue generation, summarization, and translation. Nevertheless, the repetition problem exists in nearly all of these generation tasks. Several methods based on token-level probabilities have been proposed to address it. However, in dialogue generation the continuation, i.e., the next utterance, must cohere strongly with the dialogue history, so directly applying a repetition penalty may harm that coherence. Additionally, the response produced by a generative model has a high chance of repeating the dialogue history. To address these problems, we propose a novel repetition penalty approach for Chinese dialogue generation, combining an intra-response n-gram repetition penalty (IRNRP) and an inter-conversation repetition penalty (ICRP). Experiments on large-scale Chinese dialogue datasets demonstrate the effectiveness of the proposed approach. (Our code will be released at https://git.openi.org.cn/PCL-Platform.Intelligence/PanGu-Dialog)
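To make the two penalty directions concrete, the sketch below shows the general idea of applying an intra-response n-gram penalty and an inter-conversation (history) penalty to next-token logits during decoding. This is an illustrative assumption, not the paper's exact formulation; the function name, penalty values, and subtraction-based penalization are all hypothetical.

```python
# Illustrative sketch (NOT the paper's exact method) of two repetition
# penalties applied to next-token logits during decoding:
#  - intra-response n-gram penalty: down-weight any token that would
#    complete an n-gram already present in the partial response;
#  - inter-conversation penalty: down-weight tokens that appear in the
#    dialogue history, so the response does not simply echo it.
from typing import List


def penalize_logits(logits: List[float],
                    response_ids: List[int],
                    history_ids: List[int],
                    n: int = 3,
                    intra_penalty: float = 2.0,
                    inter_penalty: float = 1.2) -> List[float]:
    out = list(logits)
    # Intra-response: if the last (n-1) generated tokens match an earlier
    # (n-1)-gram, penalize the token that previously followed that prefix,
    # preventing the same n-gram from being completed again.
    if len(response_ids) >= n - 1:
        prefix = tuple(response_ids[-(n - 1):])
        for i in range(len(response_ids) - (n - 1)):
            if tuple(response_ids[i:i + n - 1]) == prefix:
                tok = response_ids[i + n - 1]
                out[tok] -= intra_penalty
    # Inter-conversation: apply a milder penalty to every token seen in
    # the dialogue history.
    for tok in set(history_ids):
        out[tok] -= inter_penalty
    return out
```

A typical decoding loop would call this once per step, on the raw logits, before sampling; using a smaller penalty for history tokens than for in-response repeats reflects the paper's observation that some overlap with the history is needed for coherence.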
| Original language | English |
|---|---|
| Pages (from-to) | 15941-15948 |
| Number of pages | 8 |
| Journal | Neural Computing and Applications |
| Volume | 37 |
| Issue number | 20 |
| DOIs | |
| State | Published - Jul 2025 |
Bibliographical note
Publisher Copyright: © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2025.
Keywords
- Coherence
- Dialogue
- Generation
- Penalty
- Repetition