Hi there! I am a Ph.D. student in Computer Science at the University of Campinas, Brazil, where I have been a member of the RECOD.ai lab since August 2019. My advisor is Prof. Anderson Rocha.
My research focuses on improving methods for automated fact-checking. I am especially interested in making fact-checking more explainable through question answering. Thematically, I am interested in fact-checking, question answering, information retrieval, and explainable AI.
selected publications
ICASSP
Explainable Fact-checking through Question Answering
Yang, Jing, Vega-Oliveros, Didier, Seibt, Taís, and Rocha, Anderson
In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
Misleading or false information has been creating chaos in some places around the world. To mitigate this issue, many researchers have proposed automated fact-checking methods to fight the spread of fake news. However, most methods cannot explain the reasoning behind their decisions, failing to build trust between machines and humans using such technology. Trust is essential for fact-checking to be applied in the real world. Here, we address fact-checking explainability through question answering. In particular, we propose generating questions and answers from claims and answering the same questions from evidence. We also propose an answer comparison model with an attention mechanism attached to each question. Leveraging question answering as a proxy, we break down automated fact-checking into several steps; this separation aids models' explainability as it allows for more detailed analysis of their decision-making processes. Experimental results show that the proposed model can achieve state-of-the-art performance while providing reasonable explainable capabilities.
WIFS
Scalable Fact-checking with Human-in-the-Loop
Yang, Jing, Vega-Oliveros, Didier, Seibt, Tais, and Rocha, Anderson
In IEEE International Workshop on Information Forensics and Security (WIFS) 2021
Researchers have been investigating automated solutions for fact-checking on various fronts. However, current approaches often overlook the fact that the volume of information released every day is escalating, and that much of it overlaps. Intending to accelerate fact-checking, we bridge this gap by proposing a new pipeline: grouping similar messages and summarizing them into aggregated claims. Specifically, we first clean a set of social media posts (e.g., tweets) and build a graph of all posts based on their semantics; then, we apply two clustering methods to group the messages for further claim summarization. We evaluate the summaries both quantitatively with ROUGE scores and qualitatively with human evaluation. We also generate a graph of summaries to verify that there is no significant overlap among them. The pipeline reduced 28,818 original messages to 700 summary claims, showing the potential to speed up the fact-checking process by organizing and selecting representative claims from massive disorganized and redundant messages.