Who Wrote When? Author Diarization in Social Media Discussions
Abstract
We propose a novel framework for author diarization, i.e., attributing comments in online discussions to individual authors. Our approach combines pre-trained neural representations of writing style with author-conditional encoder-decoder diarization, and refines the resulting alignment using a Conditional Random Field with Viterbi decoding. We also introduce two new large-scale German-language datasets, one for authorship verification and one for author diarization. We evaluate our diarization framework on these datasets and discuss the strengths and limitations of the approach.
Keywords
NLP; Deep Learning; Author Diarization; Social Media
Cite as
Boenninghoff, B., Hosseini, H., Nickel, R. M., & Kolossa, D. (2024). Who Wrote When? Author Diarization in Social Media Discussions. In Al-Onaizan, Y., Bansal, M., & Chen, Y.-N. (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2024 (pp. 15721–15734). Miami, Florida, USA: Self-published.
Publication type
Research article in edited proceedings (conference)
Peer-reviewed
Yes
Publication status
Published
Year
2024
Conference
Empirical Methods in Natural Language Processing (EMNLP)
Conference location
Miami, Florida
Book title
Findings of the Association for Computational Linguistics: EMNLP 2024
Editors
Al-Onaizan, Yaser; Bansal, Mohit; Chen, Yun-Nung
First page
15721
Last page
15734
Publisher
Self-published
Place
Miami, Florida, USA
Language
English