Jing Yang

MAR 2052,

Marchstrasse 23

Berlin, Germany

I am a postdoctoral researcher in the XplaiNLP group at TU Berlin, which is led by Dr. Vera Schmitt, and I am supervised by Prof. Sebastian Möller. I am also a guest researcher at the German Research Center for Artificial Intelligence (DFKI). My current research project, FakeXplain, is a BIFOLD agility project on generating natural language explanations for AI-based disinformation detection.

I completed my Bachelor’s degree in Information and Computing Science at the Hubei University of Technology in China, followed by a Master’s degree in Computer Science at Hunan University, China. My Master’s dissertation dealt with identifying 3D-printed objects and printers using digital forensics and machine learning. After obtaining my Master’s degree in 2019, I pursued a PhD at the RECOD.ai lab at the University of Campinas in Brazil, under the supervision of Prof. Anderson Rocha. From 2022 to 2023, I did a research internship at the Ubiquitous Knowledge Processing (UKP) lab at TU Darmstadt. My PhD thesis focused on improving fact-checking efficiency and explainability with few-shot learning and large language models.

My research interests are:

Natural language explanation generation
Synthetic text evaluation
Human preference learning on text generation
Applications and social impacts of large language models

news

Mar 25, 2026	Very happy to be attending EACL 2026 and presenting our paper “Persona Prompting as a Lens on LLM Social Reasoning”.
Jan 12, 2026	Our analysis of NLG evaluation trends, based on structured data extracted from NLP papers with LLMs, is now available on arXiv: “Order in the Evaluation Court: A Critical Analysis of NLG Evaluation Trends”.
Jan 06, 2026	Our paper “Persona Prompting as a Lens on LLM Social Reasoning” has been accepted to EACL 2026! Many thanks to our co-authors: Moritz Hechtbauer, Elisabeth Khalilov, Evelyn Luise Brinkmann, Vera Schmitt, Nils Feldhus. Looking forward to seeing you in Rabat, Morocco!
Jul 16, 2025	Three co-authored papers were accepted recently! (1) “Comparing LLMs and BERT-based Classifiers for Resource-Sensitive Claim Verification in Social Media”, by Max Upravitelev, Nicolau Duran-Silva, Christian Woerle, Giuseppe Guarino, Salar Mohtaj, Jing Yang, Veronika Solopova and Vera Schmitt. Scholarly Document Processing workshop at ACL 2025. (2) “Exploring Semantic Filtering Heuristics For Efficient Claim Verification”, by Max Upravitelev, Premtim Sahitaj, Arthur Hilbert, Veronika Solopova, Jing Yang, Nils Feldhus, Tatiana Anikina, Simon Ostermann and Vera Schmitt. FEVER workshop at ACL 2025. (3) “dfkinit2b at CheckThat! 2025: Leveraging LLMs and Ensemble of Methods for Multilingual Claim Normalization”, by Tatiana Anikina, Van Vykopal, Sebastian Kula, Ravi Kiran Chikkala, Natalia Skachkova, Jing Yang, Veronika Solopova, Vera Schmitt, and Simon Ostermann. CLEF 2025: Conference and Labs of the Evaluation Forum.
Apr 03, 2025	Our TACL paper is now published open access by MIT Press: “Self-Rationalization in the Wild: A Large-scale Out-of-Distribution Evaluation on NLI-related tasks”. Feel free to check it out!

selected publications

ICASSP
Explainable Fact-checking through Question Answering

Jing Yang, Didier Vega-Oliveros, Taís Seibt, and Anderson Rocha

In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022

Misleading or false information has been creating chaos in some places around the world. To mitigate this issue, many researchers have proposed automated fact-checking methods to fight the spread of fake news. However, most methods cannot explain the reasoning behind their decisions, failing to build trust between machines and humans using such technology. Trust is essential for fact-checking to be applied in the real world. Here, we address fact-checking explainability through question answering. In particular, we propose generating questions and answers from claims and answering the same questions from evidence. We also propose an answer comparison model with an attention mechanism attached to each question. Leveraging question answering as a proxy, we break down automated fact-checking into several steps — this separation aids models’ explainability as it allows for more detailed analysis of their decision-making processes. Experimental results show that the proposed model can achieve state-of-the-art performance while providing reasonable explainable capabilities.
@inproceedings{yang2021explainable,
  title = {Explainable Fact-checking through Question Answering},
  author = {Yang, Jing and Vega-Oliveros, Didier and Seibt, Ta{\'\i}s and Rocha, Anderson},
  booktitle = {IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year = {2022},
  organization = {IEEE},
}

WIFS
Scalable Fact-checking with Human-in-the-Loop

Jing Yang, Didier Vega-Oliveros, Tais Seibt, and Anderson Rocha

In 2021 IEEE International Workshop on Information Forensics and Security (WIFS), 2021

Researchers have been investigating automated solutions for fact-checking on various fronts. However, current approaches often overlook the fact that the volume of information released every day is escalating and that much of it overlaps. Intending to accelerate fact-checking, we bridge this gap by proposing a new pipeline: grouping similar messages and summarizing them into aggregated claims. Specifically, we first clean a set of social media posts (e.g., tweets) and build a graph of all posts based on their semantics; then, we apply two clustering methods to group the messages for further claim summarization. We evaluate the summaries both quantitatively with ROUGE scores and qualitatively with human evaluation. We also generate a graph of summaries to verify that there is no significant overlap among them. Our pipeline reduced 28,818 original messages to 700 summary claims, showing the potential to speed up the fact-checking process by organizing and selecting representative claims from massive disorganized and redundant messages.
@inproceedings{yang2021scalable,
  title = {Scalable Fact-checking with Human-in-the-Loop},
  author = {Yang, Jing and Vega-Oliveros, Didier and Seibt, Tais and Rocha, Anderson},
  booktitle = {2021 IEEE International Workshop on Information Forensics and Security (WIFS)},
  year = {2021},
  doi = {10.1109/WIFS53200.2021.9648388},
  organization = {IEEE},
}

WIFS

Take It Easy: Label-Adaptive Self-Rationalization for Fact Verification and Explanation Generation

Jing Yang and Anderson Rocha

In 2024 IEEE International Workshop on Information Forensics and Security (WIFS), 2024

@inproceedings{yang2024take,
  title = {Take It Easy: Label-Adaptive Self-Rationalization for Fact Verification and Explanation Generation},
  author = {Yang, Jing and Rocha, Anderson},
  booktitle = {2024 IEEE International Workshop on Information Forensics and Security (WIFS)},
  pages = {1--6},
  year = {2024},
  organization = {IEEE},
}

TACL

Self-Rationalization in the Wild: A Large-scale Out-of-Distribution Evaluation on NLI-related tasks

Jing Yang, Max Glockner, Anderson Rocha, and Iryna Gurevych

Transactions of the Association for Computational Linguistics, 2025

@article{yang2025self,
  title = {Self-Rationalization in the Wild: A Large-scale Out-of-Distribution Evaluation on NLI-related tasks},
  author = {Yang, Jing and Glockner, Max and Rocha, Anderson and Gurevych, Iryna},
  journal = {Transactions of the Association for Computational Linguistics},
  volume = {13},
  pages = {314--342},
  year = {2025},
}

EACL
Persona Prompting as a Lens on LLM Social Reasoning

Jing Yang, Moritz Hechtbauer, Elisabeth Khalilov, Evelyn Luise Brinkmann, Vera Schmitt, and Nils Feldhus

In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), Mar 2026

For socially sensitive tasks like hate speech detection, the quality of explanations from Large Language Models (LLMs) is crucial for factors like user trust and model alignment. While persona prompting (PP) is increasingly used as a way to steer models toward user-specific generation, its effect on model rationales remains underexplored. We investigate how LLM-generated rationales vary when conditioned on different simulated demographic personas. Using datasets annotated with word-level rationales, we measure agreement with human annotations from different demographic groups, and assess the impact of PP on model bias and human alignment. Our evaluation across three LLMs reveals three key findings: (1) PP improves classification on the most subjective task (hate speech) but degrades rationale quality. (2) Simulated personas fail to align with their real-world demographic counterparts, and high inter-persona agreement shows models are resistant to significant steering. (3) Models exhibit consistent demographic biases and a strong tendency to over-flag content as harmful, regardless of PP. Our findings reveal a critical trade-off: while PP can improve classification in socially sensitive tasks, it often comes at the cost of rationale quality and fails to mitigate underlying biases, urging caution in its application.
@inproceedings{yang-etal-2026-persona,
  title = {Persona Prompting as a Lens on {LLM} Social Reasoning},
  author = {Yang, Jing and Hechtbauer, Moritz and Khalilov, Elisabeth and Brinkmann, Evelyn Luise and Schmitt, Vera and Feldhus, Nils},
  editor = {Demberg, Vera and Inui, Kentaro and Marquez, Llu{\'i}s},
  booktitle = {Proceedings of the 19th Conference of the {E}uropean Chapter of the {A}ssociation for {C}omputational {L}inguistics (Volume 1: Long Papers)},
  month = mar,
  year = {2026},
  address = {Rabat, Morocco},
  publisher = {Association for Computational Linguistics},
  url = {https://aclanthology.org/2026.eacl-long.52/},
  doi = {10.18653/v1/2026.eacl-long.52},
  pages = {1152--1170},
}