KDIR 2025 Abstracts


Full Papers
Paper Nr: 38
Title:

Generating Aerial Flood Prediction Imagery with Generative Adversarial Networks

Authors:

Natasha Randall, Gernot Heisenberg and Juan Luis Ramirez Duval

Abstract: Floods are one of the most dangerous, impactful natural disasters, and flood forecasting is a critical component of effective pre-flooding preparedness. In this paper a data-driven approach to flood forecasting is presented, which provides photorealistic predictions that are less computationally expensive to generate than traditional physically-based models. A ‘PairedAttention’ generative adversarial network (GAN) was developed that combines attention and content mask subnetworks, and was trained on paired sets of pre- and post-flooding aerial satellite images aligned with topographical data. The PairedAttention GAN achieved 88% accuracy and an F1 score of 0.8 for flood prediction on three USA flood events, and an ablation study determined that the digital elevation model was the most significant factor in improving the GAN’s performance. Although the model is a successful proof-of-concept for the effectiveness of a data-driven GAN to generate photorealistic, accurate aerial flood prediction imagery, it nevertheless struggled with generalisation, indicating an important avenue for future research.

Paper Nr: 46
Title:

Sheet Metal Forming Springback Prediction Using Image Geometrics (SPIG): A Novel Approach Using Heatmaps and Convolutional Neural Network

Authors:

Du Chen, Mariluz Penalva Oscoz, Yang Hai, Martin Rebe Ander, Frans Coenen and Anh Nguyen

Abstract: We propose the Springback Prediction Using Image Geometrics (SPIG) approach to predict springback errors in Single Point Incremental Forming (SPIF). We achieved highly accurate predictions by converting local geometric information into heatmaps and employing a ResNet-based method. Augmenting the dataset twenty-four-fold through various transformations, our ResNet model significantly outperformed LSTM, SVM, and GRU alternatives in terms of the MSE and RMSE values obtained. The best performance resulted in an R² value of 0.9688, a 4.95% improvement over alternative methods. The research demonstrates the potential of ResNet models in predicting springback errors, offering advancements over alternative methods. Future work will focus on further optimisation, advanced data augmentation, and applying the method to other forming processes. Our code and models are available at https://github.com/DarrenChen0923/SPIF.

Paper Nr: 56
Title:

Towards Transparent AI in Medical Imaging: Fracture Detection in Hand Radiographs with Grad-CAM Insights

Authors:

Mustafa Juzer Fatehi, Siddharath Malavalli Nagesh, Mandalam Akshit Rao, Stellin John George, J. Angel Arul Jothi and Elakkiya Rajasekar

Abstract: Timely and accurate detection of bone fractures in hand radiographs, particularly in fingers and wrists, remains a critical challenge in clinical diagnostics due to anatomical complexity and subtle fracture patterns. This study presents an explainable AI framework for automatic fracture detection using a single-shot detection framework, the YOLOv5 Medium (YOLOv5m) model, optimized through targeted preprocessing and interpretability techniques. A dedicated preprocessing pipeline is used to enhance fracture visibility and reduce irrelevant noise. This includes key steps like histogram equalization, Gaussian filtering, Laplacian filtering, and intensity normalization. To foster clinical trust and transparency, we integrate Gradient-weighted Class Activation Mapping (Grad-CAM) to visualize regions of interest influencing the model’s predictions. Trained on a curated dataset of over 9,000 annotated X-ray images, YOLOv5m achieved outstanding performance, with a mean Average Precision (mAP@50) of 95.87% and an inference speed of 690 ms, making it suitable for real-time diagnostic support. This work demonstrates the potential of AI-assisted systems not only to improve fracture diagnosis but also to bridge the trust gap in clinical deployment through transparent decision-making support.

Paper Nr: 61
Title:

Redefining Prerequisites Through Text Embeddings: Identifying Practical Course Dependencies

Authors:

Şükrü Kaan Tetik, Emirhan Toprak, Senem Kumova Metin and Hande Aka Uymaz

Abstract: This study proposes a framework to support undergraduate students in course selection by identifying implicit prerequisites and predicting performance in elective courses. Unlike traditional prerequisite rules that rely solely on curriculum design, our approach integrates students’ academic history and course-level semantic information. We define two core tasks: (T1) identifying practical prerequisites that significantly impact success in a target course, and (T2) predicting student success in elective courses based on academic profiles. For T1, we analyze prior course performance and learning outcomes using SHAP (SHapley Additive exPlanations) to determine the most influential courses. For T2, we build student representations using course descriptions and learning outcomes, then apply embedding models (Sentence-BERT, Doc2Vec, Universal Sentence Encoder) combined with classification algorithms to predict course success. Experiments demonstrate that embedding-based models, especially those using Sentence-BERT, can effectively predict course outcomes. The results suggest that incorporating semantic representations enhances curriculum design, course advisement, and prerequisite refinement.
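A minimal sketch of the T1 idea as the abstract describes it: SHAP values from a model trained on prior-course grades rank courses by their influence on success in a target course. The course names, model choice, and synthetic data below are illustrative placeholders, not the authors' setup.

```python
# Illustrative sketch (not the authors' code): rank prior courses by their
# SHAP-estimated influence on success in a target course.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical data: rows are students, columns are grades in prior courses,
# y indicates success (1) or failure (0) in the target course.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.uniform(0, 4, size=(200, 3)),
                 columns=["CS101", "MATH201", "STAT150"])
y = (0.8 * X["CS101"] + 0.2 * rng.normal(size=200) > 1.6).astype(int)

model = GradientBoostingClassifier().fit(X, y)
shap_values = shap.TreeExplainer(model).shap_values(X)

# Mean absolute SHAP value per course ~ its practical-prerequisite strength.
importance = np.abs(shap_values).mean(axis=0)
for course, score in sorted(zip(X.columns, importance), key=lambda t: -t[1]):
    print(f"{course}: {score:.3f}")
```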

Paper Nr: 63
Title:

A Two-Layer Deep Learning Approach for R&D Partner Recommendation in the Self-Driving Vehicle Industry

Authors:

Juite Wang and Ying-Pei Kao

Abstract: This study presents a two-stage deep learning framework for recommending strategic R&D partners in the self-driving vehicle (SDV) industry. Leveraging 165,775 U.S. patent applications from 2015 to 2023, we constructed a co-patent network and extracted node, edge, and topological features to represent organizational attributes, collaboration intensity, and network structure. These features were integrated using a hybrid Graph Neural Network (GNN) and Deep Neural Network (DNN) architecture to predict future collaborations. The model achieved high predictive performance (accuracy = 96.65%, precision = 70.83%, recall = 66.92%, F1 = 68.82%, and AUPRC = 78.93%) and demonstrated its ability to identify both established and emerging partners. Community detection revealed influential clusters anchored by firms like Toyota and Hyundai. Case analyses showed that the model can recommend both historical and emerging R&D partners. Compared to prior work, this study contributes a scalable, data-driven approach that incorporates deep structural and semantic signals to improve partner selection accuracy. The framework advances patent analytics by linking network-based learning with partner recommendations, offering practical implications for R&D planning in complex technology-based industries.

Paper Nr: 64
Title:

Unsupervised Thematic Context Discovery for Explainable AI in Fact Verification: Advancing the CARAG Framework

Authors:

Manju Vallayil, Parma Nand, Wei Qi Yan and Héctor Allende-Cid

Abstract: This paper introduces CARAG-u, an unsupervised extension of the Context-Aware Retrieval Augmented Generation (CARAG) framework, designed to advance explainability in Automated Fact Verification (AFV) architectures. Unlike its predecessor, CARAG-u eliminates reliance on predefined thematic annotations and claim-evidence pair labels by dynamically deriving thematic clusters and evidence pools from unstructured datasets. This innovation enables CARAG-u to balance local and global perspectives in evidence retrieval and explanation generation. We benchmark CARAG-u against Retrieval Augmented Generation (RAG) and compare it with CARAG, highlighting its unsupervised adaptability while maintaining competitive performance. Evaluations on the FactVer dataset demonstrate CARAG-u’s ability to generate thematically coherent and context-sensitive post-hoc explanations, advancing Explainable AI in AFV. The implementation of CARAG-u, including all dependencies, is publicly available to ensure reproducibility and support further research.

Paper Nr: 68
Title:

A Convexity-Dependent Two-Phase Training Algorithm for Deep Neural Networks

Authors:

Tomas Hrycej, Bernhard Bermeitinger, Massimo Pavone, Götz-Henrik Wiegand and Siegfried Handschuh

Abstract: The key task of machine learning is to minimize the loss function that measures the model fit to the training data. The numerical methods to do this efficiently depend on the properties of the loss function. The most decisive among these properties is the convexity or non-convexity of the loss function. The fact that the loss function can have, and frequently has, non-convex regions has led to a widespread commitment to non-convex methods such as Adam. However, a local minimum implies that, in some environment around it, the function is convex. In this environment, second-order minimizing methods such as the Conjugate Gradient (CG) offer guaranteed superlinear convergence. We propose a novel framework grounded in the hypothesis that loss functions in real-world tasks swap from initial non-convexity to convexity towards the optimum, a property we leverage to design an innovative two-phase optimization algorithm. The presented algorithm detects the swap point by observing how the gradient norm depends on the loss. In the non-convex and convex regions, the non-convex (Adam) and convex (CG) algorithms are used, respectively. Computational experiments confirm the hypothesis that this simple convexity structure is frequent enough to be practically exploited to substantially improve convergence and accuracy.
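The abstract describes the swap detection only at a high level; the sketch below shows one plausible reading in PyTorch. L-BFGS stands in for CG (PyTorch ships no CG trainer), and the stabilization test is a placeholder heuristic, not the paper's actual criterion.

```python
# Illustrative two-phase sketch (assumptions, not the paper's exact rule):
# phase 1 uses Adam; once the gradient-norm-vs-loss relationship suggests a
# locally convex region, phase 2 switches to a second-order-style optimizer.
import torch

model = torch.nn.Sequential(torch.nn.Linear(10, 32), torch.nn.Tanh(),
                            torch.nn.Linear(32, 1))
loss_fn = torch.nn.MSELoss()
X, y = torch.randn(256, 10), torch.randn(256, 1)

def grad_norm(model):
    return torch.sqrt(sum((p.grad ** 2).sum() for p in model.parameters()))

adam = torch.optim.Adam(model.parameters(), lr=1e-3)
history = []
for step in range(2000):
    adam.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    history.append((loss.item(), grad_norm(model).item()))
    adam.step()
    # Heuristic swap test: near a minimum of a locally quadratic loss the
    # squared gradient norm shrinks roughly in proportion to the loss, so
    # the ratio stabilizes. The paper's actual detector may differ.
    if step > 100:
        ratios = [g * g / max(l, 1e-12) for l, g in history[-50:]]
        if max(ratios) / max(min(ratios), 1e-12) < 1.5:
            break

# Phase 2: a (quasi-)second-order method for the convex region.
lbfgs = torch.optim.LBFGS(model.parameters(), max_iter=200)
def closure():
    lbfgs.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    return loss
lbfgs.step(closure)
```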

Paper Nr: 69
Title:

Enhanced LLM Text Classification Method with Embedded Semantic Feature Encoding

Authors:

Meng Wang, Jing Xie, Yang Li, Zhixiong Zhang and Hanyu Li

Abstract: Accurate identification of semantic features in scientific texts is crucial for enhancing text classification performance. This paper presents a large language model text classification method with embedded semantic feature encoding, which enhances the model's understanding of textual semantics through a dual semantic feature encoding mechanism. The method employs a dynamic window-based local-global feature extraction strategy to capture topical semantic features and utilizes hierarchical structural aggregation mechanisms to extract organizational semantic information from texts. To fully leverage the extracted semantic features, we design a feature replacement encoding strategy that embeds topical semantic features and structural semantic features into the [CLS] and [SEP] positions of large language models, respectively, achieving deep fusion between semantic features and internal model representations, thereby improving the accuracy and robustness of text classification. Experimental results demonstrate that the proposed semantic feature encoding enhancement method achieves significant performance improvements. On the DBPedia dataset, the semantically encoded SciBERT model achieves an F1-score of 91.07%, representing a 5.26% improvement over the original encoding approach. In the scientific literature value sentence identification task, Qwen3-14B combined with semantic feature encoding and QLoRA fine-tuning achieves an F1-score of 94.19%, showing a 14.64% improvement over the baseline model. Compared to traditional feature concatenation or simple fusion approaches, our feature replacement encoding strategy leverages semantic features at critical positions, significantly enhancing both classification precision and recall. Ablation experiments further validate the synergistic effects of topical semantic features and structural semantic features, confirming the effectiveness of the dual semantic feature encoding mechanism. The research findings highlight the advantages of semantic feature encoding in text classification tasks, providing an effective technical solution for intelligent analysis of scientific texts.
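A hedged sketch of what such feature replacement could look like with Hugging Face Transformers: precomputed topical and structural vectors overwrite the [CLS] and [SEP] input embeddings before encoding. The feature vectors below are random placeholders standing in for the paper's extractors.

```python
# Minimal sketch (an interpretation, not the authors' released code) of
# feature-replacement encoding at the [CLS] and [SEP] positions.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

enc = tok("We measure the value of scientific literature.", return_tensors="pt")
embeds = model.get_input_embeddings()(enc["input_ids"]).clone()  # (1, seq, hidden)

hidden = embeds.size(-1)
# Placeholder feature vectors; the paper derives these from its dynamic-window
# topical extractor and hierarchical structural aggregator.
topic_feat, struct_feat = torch.randn(1, hidden), torch.randn(1, hidden)

sep_pos = enc["input_ids"][0].tolist().index(tok.sep_token_id)
embeds[:, 0, :] = topic_feat         # overwrite the [CLS] slot
embeds[:, sep_pos, :] = struct_feat  # overwrite the [SEP] slot

out = model(inputs_embeds=embeds, attention_mask=enc["attention_mask"])
cls_repr = out.last_hidden_state[:, 0]  # fed to the classification head
```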

Paper Nr: 111
Title:

Hierarchical Patch Compression for ColPali: Efficient Multi-Vector Document Retrieval with Dynamic Pruning and Quantization

Authors:

Bach Duong and Pham Nhat Minh

Abstract: Multi-vector document retrieval systems, such as ColPali, excel in fine-grained matching for complex queries but incur significant storage and computational costs due to their reliance on high-dimensional patch embeddings and late-interaction scoring. To address these challenges, we propose HPC-ColPali, a Hierarchical Patch Compression framework that enhances the efficiency of ColPali while preserving its retrieval accuracy. Our approach integrates three innovative techniques: (1) K-Means quantization, which compresses patch embeddings into 1-byte centroid indices, achieving 32× storage reduction; (2) attention-guided dynamic pruning, utilizing Vision-Language Model attention weights to retain only the top-p% most salient patches, reducing late-interaction computation by 60% with less than 2% nDCG@10 loss; and (3) optional binary encoding of centroid indices into b-bit strings (b = ⌈log₂ K⌉), enabling rapid Hamming distance-based similarity search for resource-constrained environments. In domains like legal and financial analysis, where documents contain visual elements (e.g., charts in SEC filings), multi-vector models like ColPali enable precise retrieval but scale poorly. This work introduces hierarchical compression, novel in combining VLM attention pruning with quantization, reducing costs by 30–50% while preserving accuracy, as validated on ViDoRe. Evaluated on the ViDoRe and SEC-Filings datasets, HPC-ColPali achieves 30–50% lower query latency under HNSW indexing while maintaining high retrieval precision. When integrated into a Retrieval-Augmented Generation pipeline for legal summarization, it reduces hallucination rates by 30% and halves end-to-end latency. These advancements establish HPC-ColPali as a scalable and efficient solution for multi-vector document retrieval across diverse applications. Code is available at https://github.com/DngBack/HPC-ColPali.
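A simplified sketch of the quantization and binary-encoding steps as described, using scikit-learn; the embedding dimension, K, and the scoring path are illustrative assumptions rather than the released implementation.

```python
# Sketch of the compression idea (our reading of the abstract, simplified):
# K-Means turns each patch embedding into a 1-byte centroid index, and the
# indices can be compared as b-bit strings via Hamming distance.
import numpy as np
from sklearn.cluster import KMeans

patches = np.random.randn(10000, 128).astype(np.float32)  # patch embeddings
K = 256                                                    # fits in one byte
km = KMeans(n_clusters=K, n_init=4, random_state=0).fit(patches)
codes = km.predict(patches).astype(np.uint8)   # 1 byte per patch

# Reconstruction for late-interaction scoring: look the centroid back up.
approx = km.cluster_centers_[codes]

# Optional binary route: compare b-bit codes (b = ceil(log2 K) = 8) with
# XOR + popcount instead of float dot products.
hamming = bin(int(codes[0]) ^ int(codes[1])).count("1")
print(codes.nbytes, patches.nbytes, hamming)
```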

Paper Nr: 118
Title:

Retrieval-Augmented Generation in Industry: An Interview Study on Use Cases, Requirements, Challenges, and Evaluation

Authors:

Lorenz Brehme, Benedikt Dornauer, Thomas Ströhle, Maximilian Ehrhart and Ruth Breu

Abstract: Retrieval-Augmented Generation (RAG) is a well-established and rapidly evolving field within AI that enhances the outputs of large language models by integrating relevant information retrieved from external knowledge sources. While industry adoption of RAG is now beginning, there is a significant lack of research on its practical application in industrial contexts. To address this gap, we conducted a semi-structured interview study with 13 industry practitioners to explore the current state of RAG adoption in real-world settings. Our study investigates how companies apply RAG in practice, providing (1) an overview of industry use cases, (2) a consolidated list of system requirements, (3) key challenges and lessons learned from practical experiences, and (4) an analysis of current industry evaluation methods. Our main findings show that current RAG applications are mostly limited to domain-specific QA tasks, with systems still in prototype stages; industry requirements focus primarily on data protection, security, and quality, while issues such as ethics, bias, and scalability receive less attention; data preprocessing remains a key challenge, and system evaluation is predominantly conducted by humans rather than automated methods.

Paper Nr: 143
Title:

FIRESPARQL: A LLM-Based Framework for SPARQL Query Generation over Scholarly Knowledge Graphs

Authors:

Xueli Pan, Victor de Boer and Jacco van Ossenbruggen

Abstract: Question answering over Scholarly Knowledge Graphs (SKGs) remains a challenging task due to the complexity of scholarly content and the intricate structure of these graphs. Large Language Model (LLM) approaches could be used to translate natural language questions (NLQs) into SPARQL queries; however, these LLM-based approaches struggle with SPARQL query generation due to limited exposure to SKG-specific content and the underlying schema. We identified two main types of errors in the LLM-generated SPARQL queries: (i) structural inconsistencies, such as missing or redundant triples in the queries, and (ii) semantic inaccuracies, where incorrect entities or properties are shown in the queries despite a correct query structure. To address these issues, we propose FIRESPARQL, a modular framework that supports fine-tuned LLMs as a core component, with optional context provided via retrieval-augmented generation (RAG) and a SPARQL query correction layer. We evaluate the framework on the SciQA Benchmark using various configurations (zero-shot, zero-shot with RAG, one-shot, fine-tuning, and fine-tuning with RAG) and compare the performance with baseline and state-of-the-art approaches. We measure query accuracy using BLEU and ROUGE metrics, and execution result accuracy using relaxed exact match (RelaxedEM), with respect to the gold standards containing the NLQs, SPARQL queries, and the results of the queries. Experimental results demonstrate that fine-tuning achieves the highest overall performance, reaching 0.90 ROUGE-L for query accuracy and 0.85 RelaxedEM for result accuracy on the test set.

Paper Nr: 160
Title:

Embeddings Might Be all You Need: Domain-Specific Sentence Encoders for Latin American E-Commerce Questions

Authors:

Rodrigo Caus, Victor Sotelo, Victor Hochgreb and Julio Cesar dos Reis

Abstract: In Latin American e-commerce, customer inquiries often exhibit unique linguistic patterns that require specialized handling for accurate responses. Traditional sentence encoders may struggle with these regional nuances, leading to less effective answers. This study investigates the application of fine-tuned transformer models to generate domain-specific sentence embeddings, focusing on Portuguese and Spanish retrieval tasks. Our findings demonstrate that these specialized embeddings significantly outperform general-purpose pre-trained models and traditional techniques, such as BM25, thereby eliminating the need for additional re-ranking steps in retrieval processes. We further investigate the impact of multi-objective training within Matryoshka Representation Learning, demonstrating its effectiveness in maintaining retrieval performance across various embedding dimensions. Our approach offers a scalable and efficient solution for multilingual retrieval in e-commerce, reducing computational costs while ensuring high accuracy.
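For illustration, the Matryoshka property amounts to truncating a sentence embedding to a prefix of its dimensions at inference time. The sketch below uses a generic multilingual encoder as a stand-in (the authors' fine-tuned models are not reproduced here), so absolute scores will differ; the retention of ranking quality under truncation is what Matryoshka-style training buys.

```python
# Illustrative sketch (not the authors' model): score a Portuguese query
# against candidate answers at full and truncated embedding dimensions.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")  # 384-dim
q = model.encode(["qual o prazo de entrega?"])
docs = model.encode(["o prazo de entrega é de 5 dias",
                     "aceitamos cartão de crédito"])

def cos_scores(q, d, dim):
    qd = q[:, :dim] / np.linalg.norm(q[:, :dim], axis=1, keepdims=True)
    dd = d[:, :dim] / np.linalg.norm(d[:, :dim], axis=1, keepdims=True)
    return qd @ dd.T

for dim in (384, 128, 64):   # full vs truncated Matryoshka dimensions
    print(dim, cos_scores(q, docs, dim))
```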

Paper Nr: 165
Title:

Suggesting Product Prices in Automotive E-Commerce: A Study Assessing Regression Models and Explicability

Authors:

André Gomes Regino, Gilson Yuuji Shimizu, Fernando Rezende Zagatti, Filipe Loyola Lopes, Rodrigo Bonacin, Julio Cesar dos Reis and Cristina Dutra de Aguiar

Abstract: E-commerce pricing may involve complex processes, including various factors such as cost, perceived value, and market demand. Exploring machine learning (ML) for informing pricing in the automotive sector presents significant open research challenges that require innovative solutions. This investigation examines a real-world Brazilian e-commerce dataset to train, test, and compare several state-of-the-art regression models to understand their applicability. Our study originally includes how SHapley Additive exPlanations (SHAP) help to interpret the most influential features for price prediction. Results indicate that LightGBM and XGBoost performed best, combining high predictive accuracy with computational efficiency, and reveal features such as product weight, stock levels, and physical dimensions as the most influential on final pricing. These outcomes pave the way for novel data-driven pricing strategies in Brazilian automotive e-commerce.

Paper Nr: 171
Title:

LLMs and Knowledge Discovery in Low-Resource Language Parliamentary Corpora: The PQ Dashboard Case Study

Authors:

Joel Azzopardi

Abstract: Parliamentary Questions (PQs) are a critical mechanism for democratic oversight and accountability. However, their comprehensive analysis can be hindered by limitations such as single-language availability (especially when the language is a low-resource language such as Maltese) and a lack of structured thematic organisation or interlinking. This paper introduces the PQ Dashboard, a web-based platform developed to enhance the accessibility and analytical utility of Maltese Parliamentary Questions. The system employs AI and open Large Language Models (LLMs) to automate PQ collection, translate content into English, classify it according to the COFOG-99 taxonomy, extract key terms, and identify interconnections. The interactive dashboard provides users – including the public, journalists, and academic researchers – with functionalities to navigate PQs by category or keyword, visualise thematic distributions, and analyse trends in MPs’ activity and ministerial responses. This enhanced data accessibility aims to facilitate deeper insights into parliamentary discourse, policy development, and governmental accountability. The PQ Dashboard demonstrates a practical application of AI-driven solutions for transforming unstructured public data into a more accessible and analysable format, thereby contributing to increased transparency and informed public engagement.

Short Papers
Paper Nr: 13
Title:

Experimental Study of Algorithms for Transforming Decision Rule Systems into Decision Trees

Authors:

Kerven Durdymyradov and Mikhail Moshkov

Abstract: The examination of the relationships between decision trees and systems of decision rules represents a significant area of research within computer science. While methods for converting decision trees into systems of decision rules are well-established and straightforward, the inverse transformation problem presents considerable challenges. Our previous work has demonstrated that the complexity of constructing complete decision trees can be superpolynomial in many cases. In our book, we proposed three polynomial-time algorithms that do not construct the entire decision tree but instead outline the computation path within this tree for a specified input. Additionally, we introduced a dynamic programming algorithm that calculates the minimum depth of a decision tree corresponding to a given decision rule system. In the present paper, we describe these algorithms and the theoretical results obtained in the book. The primary objective of this paper is to experimentally compare the performance of the three algorithms and evaluate their outcomes against the optimal results generated by the dynamic programming algorithm.

Paper Nr: 25
Title:

Real-Time Arabic Sign Language Recognition Using YOLOv5

Authors:

Zainab Abualhassan, Haidar Ramadhan, Mohammed Faisal Naji and Hajar Alsulaili

Abstract: Sign language is a vital means of communication for the deaf and hard-of-hearing community, yet automatic recognition still faces many challenges. While several sign languages have seen major advances in recognition systems, Arabic sign language (ArSL) remains underdeveloped and requires much more research. Object detection models like YOLOv5 (You Only Look Once, Version 5) have revolutionized computer vision with their high speed, accuracy, and ability to process data in real time. This paper introduces a recognition system leveraging YOLOv5, a leading object detection model, to classify the 28 letters of the Arabic alphabet. The model was trained on a comprehensive dataset containing thousands of images representing each letter, achieving strong classification results with certain classes reaching 100% accuracy. To assess the model’s performance, evaluation metrics such as precision, recall, and mean Average Precision (mAP) were employed, demonstrating its practicality for real-world applications. Results indicate that YOLOv5’s architecture, with its efficient feature extraction and real-time processing, reliably handles the complex hand gesture variations in Arabic sign language. Its capability to distinguish subtle differences in hand positions makes it a valuable tool for educational applications, accessibility solutions for the deaf and hard-of-hearing, and future advancements in sign language translation systems. This study contributes a robust Arabic sign language recognition model, addressing an essential need for improved accessibility and communication for Arabic-speaking users.

Paper Nr: 44
Title:

On the Asymmetrical Nature of Entity Matching Using Pre-Trained Transformers

Authors:

Andrei Olar

Abstract: Entity resolution (ER) is a foundational task in data integration and knowledge discovery, aimed at identifying which pieces of information refer to the same real-world entity. While ER pipelines traditionally rely on matcher symmetry (if a matches b, then b matches a), this assumption is challenged by modern matchers based on pre-trained transformers, which are inherently sensitive to input order. In this paper, we investigate the asymmetric behavior of transformer-based matchers with respect to input order and its implications for end-to-end (E2E) ER. We introduce a strong asymmetric matcher that outperforms prior baselines, demonstrate how to integrate such matchers into E2E pipelines via directed reference graphs, and evaluate clustering performance across multiple benchmark datasets. Our results reveal that asymmetry is not only measurable but also materially impacts clustering quality, highlighting the need to revisit core assumptions in ER system design.

Paper Nr: 47
Title:

Mapping Weaponised Victimhood: A Machine Learning Approach

Authors:

Samantha Butcher and Beatriz De La Iglesia

Abstract: Political discourse frequently leverages group identity and moral alignment, with weaponised victimhood (WV) standing out as a powerful rhetorical strategy. Dominant actors employ WV to frame themselves or their allies as victims, thereby justifying exclusionary or retaliatory political actions. Despite advancements in Natural Language Processing (NLP), existing computational approaches struggle to capture such subtle rhetorical framing at scale, especially when alignment is implied rather than explicitly stated. This paper introduces a dual-task framework designed to address this gap by linking Named Entity Recognition (NER) with a nuanced rhetorical positioning classification (positive, negative, or neutral; POSIT). By treating rhetorical alignment as a structured classification task tied to entity references, our approach moves beyond sentiment-based heuristics to yield a more interpretable and fine-grained analysis of political discourse. We train and compare transformer-based models (BERT, DistilBERT, RoBERTa) across Single-Task, Multi-Task, and Task-Conditioned Multi-Task Learning architectures. Our findings demonstrate that NER consistently outperformed rhetorical positioning, achieving higher F1-scores and distinct loss dynamics. While single-task learning showed wide loss disparities (e.g., BERT NER 0.45 vs POSIT 0.99), multi-task setups fostered more balanced learning, with losses converging across tasks. Multi-token rhetorical spans proved challenging but showed modest F1 gains in integrated setups. Neutral positioning remained the weakest category, though targeted improvements were observed. Models displayed greater sensitivity to polarised language (e.g., RoBERTa TC-MTL reaching 0.55 F1 on negative spans). Ultimately, entity-level F1 scores converged (NER: 0.60–0.61; POSIT: 0.50–0.52), suggesting increasingly generalisable learning and reinforcing multi-task modelling as a promising approach for decoding complex rhetorical strategies in real-world political language.

Paper Nr: 49
Title:

Assessing Grade Levels of Texts via Local Search over Fine-Tuned LLMs

Authors:

Changfeng Yu and Jie Wang

Abstract: The leading method for determining the grade level of a written work involves training an SVC model on hundreds of linguistic features (LFs) and a predicted grade generated by a fine-tuned large language model (FT-LLM). When applied to a diverse dataset of materials for grades 3 through 12 spanning 33 genres, however, this approach yields a poor accuracy of less than 51%. To address this issue, we devise a novel local-search algorithm called LS-LLM that is independent of LFs. LS-LLM employs different FT-LLMs to identify a genre, predict a genre-aware grade, and compare the readability of the text against a randomly selected set of annotated works from the same genre and grade level. We demonstrate that LS-LLM significantly improves accuracy, exceeding 65%, and achieves over 92% accuracy within a one-grade error margin, making it viable for certain practical applications. To further validate its robustness, we show that LS-LLM also enhances the performance of the leading method on the WeeBit dataset used in prior research.

Paper Nr: 53
Title:

Predicting Stock Price Movement with LLM-Enhanced Tweet Emotion Analysis

Authors:

An Vuong and Susan Gauch

Abstract: Accurately predicting short-term stock price movement remains a challenging task due to the market’s inherent volatility and sensitivity to investor sentiment. This paper discusses a deep learning framework that integrates emotion features extracted from tweet data with historical stock price information to forecast significant price changes on the following day. We utilize Meta’s Llama 3.1-8B-Instruct model to preprocess tweet data, thereby enhancing the quality of emotion features derived from three emotion analysis approaches: a transformer-based DistilRoBERTa classifier from the Hugging Face library and two lexicon-based methods using National Research Council Canada (NRC) resources. These features are combined with previous-day stock price data to train a Long Short-Term Memory (LSTM) model. Experimental results on TSLA, AAPL, and AMZN stocks show that all three emotion analysis methods improve the average accuracy for predicting significant price movements, compared to the baseline model using only historical stock prices, which yields an accuracy of 13.5%. The DistilRoBERTa-based stock prediction model achieves the best performance, with accuracy rising from 23.6% to 38.5% when using LLaMA-enhanced emotion analysis. These results demonstrate that using large language models to preprocess tweet content enhances the effectiveness of emotion analysis, which in turn improves the accuracy of predicting significant stock price movements.
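A rough sketch of the described fusion, with hypothetical feature counts (e.g., seven emotion classes and five price features): per-day emotion scores are concatenated with price features and fed to an LSTM classifier. Shapes and names are illustrative, not the paper's configuration.

```python
# Illustrative fusion of per-day emotion scores with price features,
# followed by an LSTM that predicts the next day's movement class.
import torch
import torch.nn as nn

class EmotionPriceLSTM(nn.Module):
    def __init__(self, n_emotions=7, n_price=5, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_emotions + n_price, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 3)  # up / down / no significant move

    def forward(self, emotions, prices):
        x = torch.cat([emotions, prices], dim=-1)  # (batch, days, features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])               # predict the next day

model = EmotionPriceLSTM()
emotions = torch.rand(8, 10, 7)   # e.g., DistilRoBERTa emotion probabilities
prices = torch.rand(8, 10, 5)     # normalized daily price features
print(model(emotions, prices).shape)  # torch.Size([8, 3])
```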

Paper Nr: 67
Title:

Peer-to-Peer Federated Learning with Trusted Data Sharing for Non-IID Mitigation

Authors:

Mahran Jazi and Irad Ben-Gal

Abstract: Collaboration between edge devices without a central server defines the foundation of Peer-to-Peer Federated Learning (P2P FL), a decentralized approach to machine learning that preserves user privacy. However, P2P FL faces significant challenges when data distributions across clients are non-independent and identically distributed (non-IID), which can severely degrade learning performance. In this work, we propose an enhancement to P2P FL through direct data sharing between trusted peers, such as friends, colleagues, or collaborators, where each client shares a small, controlled portion of its local dataset with a selected set of neighbors. While this data-sharing mechanism enhances consistency in learning and improves model performance across the decentralized network, it introduces a trade-off between privacy and performance, as limited data sharing may increase privacy risks. To mitigate these risks, our approach assumes a trusted peer-to-peer network and avoids reliance on any central authority. We evaluate our approach using standard datasets (MNIST, CIFAR-10, and CIFAR-100) and models, including logistic regression, multilayer perceptron, convolutional neural networks (CNNs), and DenseNet-121. The results demonstrate that even modest amounts of peer data sharing significantly improve performance in non-IID settings, offering a simple yet effective strategy to address the challenges of decentralized learning in P2P FL systems.

Paper Nr: 72
Title:

AutoVU-KG: Automated Validation and Updates for Knowledge Graphs with Web-Search-Augmented LLMs

Authors:

Amel Gader and Alsayed Algergawy

Abstract: Knowledge Graphs (KGs) offer a powerful framework for representing and managing structured information in many applications. However, when it comes to frequently changing facts, KGs often lag behind real-world updates. Large Language Models (LLMs) hold promise for enriching and updating KGs, but their capabilities are limited by static training cutoffs and a tendency to hallucinate or produce outdated information. To address these concerns, we introduce AutoVU-KG: Automated Validation and Updates for Knowledge Graphs with Web-Search-Augmented LLMs. Our approach comprises a classification module that identifies facts likely to change and therefore needing updates; an LLM-driven validation and update pipeline, enhanced with real-time web retrieval to ground assertions in current external sources; and an entity matching and alignment component that ensures updates maintain internal consistency within the KG. Evaluation on subsets of Wikidata demonstrates that the proposed approach achieves high accuracy and significantly outperforms vanilla LLMs. Additionally, it reduces the number of outdated facts by up to 60% on one of the datasets. The source code is available at https://github.com/amal-gader/autovu-kg.

Paper Nr: 76
Title:

Explaining the Judges’ Decisions Criteria

Authors:

Aerty Santos, Gabriel Silveira de Queirós Campos, Cristine Griffo, Eliana Zandonade and Elias de Oliveira

Abstract: The identification of named entities in free text is a foundational research area for building intelligent systems in text and document mining. These textual elements allow us to evaluate the reasoning expressed by document authors. In a judicial decision, for example, by identifying time-related entities, an intelligent system can assess and verify whether a sentence issued by a justice agent falls within socially agreed-upon statistical parameters. In this study, 769 judicial decisions from the São Paulo court were evaluated. Our experiments compared, for instance, the sentences with the most extreme time values against those with the lowest, to infer the expressions that justified and explained their values. The results revealed differences in sentence severity among robbery, drug trafficking, and theft, as well as in how judges cluster based on their sentencing behavior. The study also highlights anomalies in sentencing and links them to specific textual justifications, demonstrating how judges’ decisions can reflect both legal criteria and subjective biases. [...] In a lawsuit the first to speak seems right, until someone comes forward and cross-examines. (Proverbs 18:17)

Paper Nr: 80
Title:

CGNTM: Unsupervised Causal Topic Modeling with LLMs and Nonlinear Causal GNNs

Authors:

Peixuan Men, Longchao Wang, Aihua Li and Xiaoli Tang

Abstract: We propose CGNTM, a fully unsupervised causal topic model that integrates large language models (LLMs) with neural causal inference. Unlike conventional and supervised topic models, CGNTM learns both hierarchical topics and their directed causal relations directly from raw text, without requiring labeled data. The framework leverages LLM-based prompt extraction to identify salient keywords and candidate causal pairs, which are refined through differentiable Directed Acyclic Graph (DAG) learning and modeled via a nonlinear structural causal model (SCM). A directionally masked graph neural network (GNN) propagates information strictly along causal edges, while a Wasserstein Generative Adversarial Network (GAN) enforces semantic consistency under counterfactual interventions via BERT-based regularization. This combination enables the model to not only discover coherent and diverse topics but also uncover interpretable causal relationships among them. The architecture supports hierarchical topic organization by clustering fine-grained terms into broader themes and modeling cross-level dependencies through dual-layer message passing. Experimental results demonstrate that CGNTM outperforms state-of-the-art models in topic quality and causal interpretability. Ablation studies confirm the essential role of each component (LLM-guided extraction, nonlinear SCM, directional GNN propagation, and adversarial training) in contributing to both causal accuracy and topic coherence. The proposed framework opens new directions for unsupervised causal discovery in text, offering transformative potential in domains where understanding why certain topics co-occur is as crucial as identifying what they are.

Paper Nr: 84
Title:

Weakly Supervised Graph Neural Networks for Scalable 3D Phase Segmentation in Molecular Dynamics Simulations

Authors:

Abin Shakya and Bijaya B. Karki

Abstract: Accurate phase identification in large-scale molecular dynamics simulation remains a significant challenge due to ambiguous boundaries between compositionally distinct regions and the lack of ground truth labels. While unsupervised methods can perform phase segmentation for small systems through structure-aware segmentation pipelines, their computational cost becomes prohibitive for large-scale analysis. We present a weakly-supervised machine learning pipeline that trains Graph Neural Networks (GNNs) to enable scalable phase segmentation in 3D atomistic systems. Using a physically grounded unsupervised method, we generate weak labels for small FeMgSiON systems that exhibit Fe-rich (metallic) and Fe-poor (silicate) phase separation. These labels guide GNNs to learn physically meaningful representations of atomic neighborhoods. Once trained, the GNNs act as an efficient parametric model, enabling direct segmentation of arbitrarily large atomistic systems, eliminating the computational overhead of the initial unsupervised pipeline. By learning from thousands of weakly labeled snapshots, the model discerns latent structural patterns, enhancing both prediction accuracy and generalization to unseen data. This methodology enables efficient, accurate, and physically consistent phase segmentation in large-scale molecular dynamics, unlocking new possibilities for scalable analysis in material simulations.

Paper Nr: 85
Title:

MultiFlags: A Probabilistic Framework for Article-Based Size Advice in Fashion E-Commerce

Authors:

Matthias Späth, Andrea Nestler, Henry Böddeker, Leonidas Lefakis, Yevgeniy Puzikov, Rodrigo Weffer, Nour Karessli, Nadja Klein and Reza Shirvany

Abstract: Accurately modeling the size behavior of fashion articles at scale is a critical task for fashion e-commerce. However, it has proven to be highly challenging due to inconsistent sizing systems across countries, inconsistent garment design processes, and brand-specific sizing specifications. Widespread methods in the field focus primarily on giving customers rudimentary size recommendations (e.g., we recommend you size S) based on the customers’ purchase behavior and/or their size and fit preferences. These approaches fail to take into account the size and fit behavior of the article, for example its design cut, shape, material, etc. (or at best treat it with simplistic ad hoc assumptions), and in turn do not effectively reduce the high volume of online article returns due to size and fit. In this work, we propose a theoretically-motivated probabilistic framework, MultiFlags, which can significantly reduce size-related returns in fashion e-commerce thanks to modeling multiple aspects of an article’s size and fit behavior. We also highlight how this framework enables a principled approach to article-based size advice, while leveraging data from multiple modalities. The results validate the competitiveness of the proposed framework against the state-of-the-art in several size advice scenarios that are critical for fashion e-commerce. The framework is deployed in production in a large e-commerce site, serving millions of customers and driving significant results.

Paper Nr: 97
Title:

Contrastive Learning for Conversational Emotion Recognition Using Knowledge Enhancement of Large Language Models

Authors:

Andrew L. Mackey, B. Israel Cuevas and Susan Gauch

Abstract: Emotion recognition in conversation (ERC) is the task of classifying the emotion of each utterance in a conversation while learning the underlying latent representations. However, the representations for utterances are challenging to produce effectively given semantic and contextual information in the conversation. Large Language Models (LLMs) have demonstrated performance in various forms of emotion classification, including in zero-shot and few-shot settings, but their usage may be curtailed in some settings, particularly in limited resource environments. In this work, we propose a contrastive learning framework for the ERC task that leverages emotional anchors with semantic information encoded from an LLM to facilitate the learning of representations using a lightweight pretrained language model (PLM). Experimental results on benchmark ERC datasets demonstrate the effectiveness of our approach compared to baseline models while simultaneously reducing the inference cost of LLMs.
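One plausible reading of the anchor-based objective, sketched with placeholder tensors: utterance embeddings from the PLM are pulled toward fixed LLM-derived emotion anchors via an InfoNCE-style loss. The anchor construction, temperature, and dimensions are assumptions, not the paper's specification.

```python
# Schematic contrastive objective (our simplification): each utterance
# embedding is attracted to the anchor of its gold emotion and repelled
# from the other anchors. Anchor vectors here are random placeholders
# standing in for LLM-encoded emotion descriptions.
import torch
import torch.nn.functional as F

n_emotions, dim, tau = 6, 256, 0.07
anchors = F.normalize(torch.randn(n_emotions, dim), dim=-1)

def anchor_contrastive_loss(utt_emb, labels):
    utt = F.normalize(utt_emb, dim=-1)
    logits = utt @ anchors.T / tau          # similarity to every anchor
    return F.cross_entropy(logits, labels)  # pull toward the true emotion

utt_emb = torch.randn(32, dim, requires_grad=True)  # PLM outputs (placeholder)
labels = torch.randint(0, n_emotions, (32,))
loss = anchor_contrastive_loss(utt_emb, labels)
loss.backward()
print(loss.item())
```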

Paper Nr: 98
Title:

From Observations to Causations: A GNN-Based Probabilistic Prediction Framework for Causal Discovery

Authors:

Rezaur Rashid and Gabriel Terejanu

Abstract: Causal discovery from observational data is challenging, especially with large datasets and complex relationships. Traditional methods often struggle with scalability and capturing global structural information. To overcome these limitations, we introduce a novel graph neural network (GNN)-based probabilistic framework that learns a probability distribution over the entire space of causal graphs, unlike methods that output a single deterministic graph. Our framework leverages a GNN that encodes both node and edge attributes into a unified graph representation, enabling the model to learn complex causal structures directly from data. The GNN model is trained on a diverse set of synthetic datasets augmented with statistical and information-theoretic measures, such as mutual information and conditional entropy, capturing both local and global data properties. We frame causal discovery as a supervised learning problem, directly predicting the entire graph structure. Our approach demonstrates superior performance, outperforming both traditional and recent non-GNN-based methods, as well as a GNN-based approach, in terms of accuracy and scalability on synthetic and real-world datasets without further training. This probabilistic framework significantly improves causal structure learning, with broad implications for decision-making and scientific discovery across various fields.

Paper Nr: 100
Title:

Exploring Customer Service Agent Preferences for Conversational and Keyword-Based Information Retrieval

Authors:

Nektarios Machner, Yaren Mändle and Florian Matthes

Abstract: Effective knowledge discovery and information retrieval drive organizational innovation and competitive advantage. To support this, organizations have long used knowledge management systems that historically have relied on keyword-based search. The rise of artificial intelligence (AI), most notably large language models (LLMs), has enabled conversational search (CS) interfaces that understand natural-language queries, synthesize information from multiple sources, and generate answers. This study investigates the factors that influence customer service agents’ preferences for conversational search versus traditional keyword-based search within an internal knowledge management system. Set in a large European insurance company, we employ a mixed-methods empirical approach, integrating semi-structured interviews (n = 13), a structured survey (n = 17), and log-file analysis of 508 real-world queries. Our research explores which factors drive agents’ choice between the two search approaches, and examines the practical strengths and limitations of each approach. Our findings reveal that agents choose keyword search when they are confident of where to look and conversational search when they need natural-language guidance, with trust and time constraints further tipping the balance. This complementarity suggests that hybrid interfaces, blending ease of use, reliable results, and flexible query handling, best support agents’ workflows.

Paper Nr: 114
Title:

Beyond Parameter Counts: Benchmarking Similar-Sized Large Language Models for Next-Item Recommendation

Authors:

Kavach Dheer, Peter Corcoran and Josephine Griffith

Abstract: Large language models (LLMs) are rapidly being integrated into recommender systems. New LLMs are released frequently, offering numerous architectures that share identical parameter sizes within their class, giving practitioners many options to choose from. While existing benchmarks evaluate LLM-powered recommender systems on various tasks, none have examined how same-sized LLMs perform under identical experimental conditions as a recommender system. Additionally, these benchmarks do not verify whether the evaluation datasets were part of the LLMs’ pre-training data. This research evaluates five open-source 7–8B parameter models (Gemma, Deepseek, Qwen, Llama-3.1, and Mistral) using a fixed A-LLMRec architecture for next-item prediction using the Amazon Luxury-Beauty Dataset. We measure top-1 accuracy (Hit@1) and evaluate dataset leakage through reference-model membership-inference attacks to ensure no model gains advantages from pre-training exposure. Although all models show negligible dataset leakage rates (<0.2%), Hit@1 varies dramatically across 20 percentage points, from 44% for Gemma to 64% for Mistral, despite identical parameter counts and evaluation conditions. These findings demonstrate that selecting among the most appropriate LLMs is a crucial design decision in LLM-based recommender systems.

Paper Nr: 120
Title:

Composer Classification Using a Note Difference Graph

Authors:

Raymond Conlin and Colm O'Riordan

Abstract: This paper presents a representation for a symbolically encoded work of music that highlights the relative differences between related notes. Our experiments show that when a Graph Neural Network (GNN) is trained to classify classical composers using this note difference graph, it outperforms a network trained with the representation described by Szeto and Wong. The note difference graph employed in this work is derived from the representation of Szeto and Wong. Each node in the note difference graph corresponds to two connected notes in a piece and contains the information relating to the differences between them. Nodes in the note difference graph are joined by an edge if they share any notes in common. The described representation provides an improvement in classification accuracy and a reduction in bias when using imbalanced datasets. Given the improved classification accuracy achieved by the neural network with our representation, we believe that highlighting the relationship between notes provides the network with the opportunity of identifying the salient features more readily.
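A small sketch of the note difference graph as the abstract describes it, on toy data: each node is a connected note pair carrying the differences between its notes, and nodes are joined when their pairs share a note. The pitch/onset attributes are our illustrative choice of difference information.

```python
# Toy construction of a note difference graph (our reading of the abstract).
import networkx as nx

notes = [(60, 0.0), (64, 0.5), (67, 1.0), (72, 1.5)]   # (MIDI pitch, onset)
pairs = [(0, 1), (1, 2), (2, 3)]                       # connected note pairs

G = nx.Graph()
for i, (a, b) in enumerate(pairs):
    G.add_node(i,
               pitch_diff=notes[b][0] - notes[a][0],   # semitones
               onset_diff=notes[b][1] - notes[a][1])   # beats

# Join two pair-nodes whenever the underlying pairs share a note.
for i, (a1, b1) in enumerate(pairs):
    for j, (a2, b2) in enumerate(pairs):
        if i < j and {a1, b1} & {a2, b2}:
            G.add_edge(i, j)

print(G.nodes(data=True), G.edges())
```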

Paper Nr: 123
Title:

From Detection to Diagnosis: A Layered Hybrid Framework for Anomaly Characterization in Maritime Sensor Streams

Authors:

Nadeem Iftikhar, Cosmin-Stefan Raita, Aziz Kadem, David Buncek, Matthew Haze Trinh, Yi-Chen Lin, Anders Vestergaard and Gianna Belle

Abstract: Effective knowledge discovery from industrial sensor data depends on a deep understanding of data quality issues. In the maritime domain, sensor streams often suffer from a diverse set of problems, from simple signal freezes to complex, context-dependent behavioral shifts. Merely detecting these events as a monolithic “anomaly” class provides limited actionable insight. This paper argues for a shift from anomaly detection to anomaly characterization. We propose a novel, layered hybrid framework that systematically identifies and classifies data issues into distinct types. Our pipeline effectively combines the reliability of statistical methods with the advanced pattern-finding ability of machine/deep learning. Each layer acts as a specialized filter that identifies a specific type of anomaly and cleans the data for the next, more advanced analysis. We demonstrate on real-world vessel data that this layered characterization not only achieves high detection accuracy but, more importantly, transforms raw detection flags into actionable knowledge for operational decision-making.

Paper Nr: 129
Title:

Uncertainty Estimation and Calibration of a Few-Shot Transfer Learning Model for Lettuce Phenotyping

Authors:

Rusith Chamara Hathurusinghe Dewage, Habib Ullah, Muhammad Salman Siddiqui, Rakibul Islam and Fadi-Al Machot

Abstract: Computer vision-assisted automatic plant phenotyping in controlled environment agriculture (CEA) remains a significant challenge due to the scarcity of labeled data from growing conditions. In this work, we investigate few-shot transfer learning for estimating the maximum width of lettuce from cropped and segmented images exhibiting non-uniform spatial distribution. The dataset presents additional complexity as images are captured using a wide-angle, off-center camera. We systematically investigate backbone architectures (ResNet, EfficientNet, MobileNet, DenseNet, and Vision Transformer) and perform various data augmentation strategies and regression head designs to identify optimal configurations under few-shot conditions. To enhance predictive reliability, we employ post-hoc uncertainty estimation using Monte Carlo (MC) dropout and conformal prediction, and further evaluate model calibration to analyze alignment between predicted uncertainties and empirical errors. Our best model, based on Vision Transformer Huge with 14×14 patch size (ViT-H/14), achieved a root mean square error (RMSE) of 14.34 mm on the test set. For uncertainty estimation, MC dropout achieved a miscalibration area of 0.19, an average prediction interval width of 27.89 mm, and an empirical coverage of 73% at the nominal 90% confidence level. Our results highlight the importance of backbone selection, augmentation, and head architecture on model generalization and reliability. This study offers practical guidelines for developing robust, uncertainty-aware few-shot models for plant phenotyping, enabling more trustworthy deployment in CEA applications.
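The MC dropout step is standard and can be sketched generically (this is not the paper's ViT-H/14 model): dropout stays active at inference, and the spread of repeated predictions serves as the uncertainty estimate for the width regression.

```python
# Generic MC-dropout sketch for a regression head; the feature dimension
# and architecture are placeholders, not the paper's configuration.
import torch

model = torch.nn.Sequential(torch.nn.Linear(512, 128), torch.nn.ReLU(),
                            torch.nn.Dropout(0.2), torch.nn.Linear(128, 1))

def mc_dropout_predict(model, x, n_samples=50):
    model.train()                       # keeps Dropout stochastic
    with torch.no_grad():
        preds = torch.stack([model(x) for _ in range(n_samples)])
    return preds.mean(0), preds.std(0)  # point estimate and uncertainty

features = torch.randn(4, 512)          # e.g., backbone features per image
mean_mm, std_mm = mc_dropout_predict(model, features)
print(mean_mm.squeeze(), std_mm.squeeze())
```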

Paper Nr: 132
Title:

Identifying Innovation Frontiers Based on Prediction of Citation Network Links Between Papers and Patents

Authors:

Li Yongjie, Zhu Jian, Wang Longchao and Tang Xiaoli

Abstract: This paper proposes a novel method for identifying innovation frontiers based on link prediction in a heterogeneous citation network integrating academic papers and technological patents. By constructing a unified citation graph and applying the Graph Sample and Aggregate model, we perform node embedding learning and link prediction to uncover potential knowledge flow pathways. Combining graph embedding with clustering analysis, we identify frontier knowledge clusters characterized by high interdisciplinarity, novelty, and knowledge mobility. Preliminary experiments demonstrate that the proposed method outperforms existing graph neural network models in both link prediction and clustering tasks, effectively revealing emerging innovation frontiers at the intersection of scientific and technological knowledge.

Paper Nr: 135
Title:

The WikiWooW Dataset: Harnessing Semantic Similarity and Clickstream-Data for Serendipitous Hyperlinked-Paths Mining in Wikipedia

Authors:

Cosimo Palma and Bence Molnár

Abstract: This paper introduces WikiWooW, a dataset generator designed for distilling a formal model of Wikipedia entity-pairs serendipity. The task, foundational to mining serendipitous hyperlinked paths, builds upon cognitive theory and exploits serendipity sub-components: graph centrality, popularity, clickstream, corpus-based and knowledge-based similarity. Two proof-of-concept experiments were conducted, based on two different datasets. The first one uses a single Wikipedia entity linked through the DBpedia dbo:wikiPageWikiLink property to 413 other entities. These pairs are searched in Wikimedia clickstream data and scored for interestingness according to a principled mathematical model, which is validated against Amazon Mechanical Turk and author annotations. The second dataset contains 146 random Wikipedia entity-pairs annotated by 10 postgraduate students following detailed guidelines. Average serendipity scores are then correlated with dataset features using the original model and four alternatives. The proposed dataset generator aims to support Serendipity Mining for Computational Creativity, particularly Knowledge-based Automatic Story Generation, where serendipity matters more than similarity-based interestingness metrics. First results, despite their limitations, confirm the principles initially deduced for modelling serendipity, showing that serendipity can be effectively modeled through comprehensive parameter optimization.

Paper Nr: 140
Title:

Bias-Mitigating News Search with BiasRank

Authors:

Tim Menzner and Jochen L. Leidner

Abstract: As geopolitical adversaries as well as internal commercial and political actors target democracies with disinformation campaigns, it is increasingly necessary to filter out biased reporting. Some automatic success has recently been achieved in this task. For further progress, web search engines need to implement news bias resistance mechanisms for ranking news stories. To this end, we present BiasRank, a new approach that demotes articles exhibiting news media bias by combining a large neural language model for news bias classification with a heuristic re-ranker. Our experiments, based on artificially polluting a (mostly neutral) standard news corpus with news stories biased to varying extents, inspired by earlier work on answer injection, demonstrate the effectiveness of the approach. Our evaluation shows that the method radically reduces news bias at a negligible cost in terms of relevance. In turn, we also provide new metrics for the evaluation of similar systems that aim to balance two variables (like relevance and bias in our case). Additionally, we release our test collection in a public Git repository to support further research on de-biasing news search.
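A hedged sketch of the demotion idea: a bias classifier's probability penalizes otherwise-relevant results. The linear penalty and weight below are our illustrative choices, not necessarily BiasRank's exact re-ranking formula.

```python
# Illustrative re-ranker: demote results in proportion to predicted bias.
from dataclasses import dataclass

@dataclass
class Result:
    title: str
    relevance: float   # from the base ranker
    bias_prob: float   # from the neural bias classifier

def bias_rank(results, lam=0.5):
    # Higher combined score ranks first; lam trades relevance against bias.
    return sorted(results,
                  key=lambda r: r.relevance - lam * r.bias_prob,
                  reverse=True)

hits = [Result("neutral report", 0.82, 0.05),
        Result("slanted op-ed", 0.90, 0.85)]
for r in bias_rank(hits):
    print(r.title)   # the neutral report now outranks the slanted op-ed
```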

Paper Nr: 154
Title:

Towards Early Detection of Mild Cognitive Impairment: Predictive Analytics Using the Oculo-Cognitive Addition Test (OCAT)

Authors:

Gaurav N. Pradhan, Sarah E. Kingsbury, Michael J. Cevette, Jan Stepanek and Richard J. Caselli

Abstract: Mild cognitive impairment (MCI) is often challenging to diagnose. The Oculo-Cognitive Addition Test (OCAT) is a rapid, objective tool that measures eye movement and time-based features during mental addition tasks in under one minute. This study aims to develop predictive machine learning algorithms for early detection of those at greater risk for mild cognitive impairment, helping warrant further testing. OCAT testing with integrated eye tracking was completed by 250 patients. Time-related and eye movement features were extracted from raw gaze data. Feature selection was performed using machine learning methods, including random forest and univariate decision trees, to identify predictors of Dementia Rating Scale (DRS) outcomes. Supervised models, logistic regression (LR) and K-nearest neighbors (KNN), were trained to classify MCI. Class imbalance was addressed using the Synthetic Minority Over-sampling Technique. LR models achieved the highest performance using the combined time and eye movement features, with an accuracy of 0.97, a recall of 0.91, and an area under the precision-recall curve (AUPRC) of 0.95. This study demonstrates that machine learning models trained on OCAT-derived features can reliably predict DRS outcomes (PASS/FAIL), offering a promising approach for early identification of MCI.
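The modeling setup maps directly onto standard libraries; a sketch with stand-in features follows (the real OCAT features come from gaze data, and the synthetic labels here only mimic the class imbalance).

```python
# Sketch of the SMOTE + logistic-regression setup with placeholder data.
import numpy as np
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(250, 12))                   # time + eye-movement features
y = (rng.uniform(size=250) < 0.15).astype(int)   # imbalanced FAIL class

# The imblearn Pipeline applies SMOTE only to training folds, avoiding
# leakage of synthetic samples into evaluation.
clf = Pipeline([("smote", SMOTE(random_state=0)),
                ("lr", LogisticRegression(max_iter=1000))])
scores = cross_val_score(clf, X, y, cv=5, scoring="average_precision")
print(scores.mean())   # AUPRC-style score under cross-validation
```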

Paper Nr: 158
Title:

Integrating Large Language Models into Automated Machine Learning: A Human-Centric Approach

Authors:

Néstor Miguel-Morante, Iván Rivero, Diego García-Prieto, Rafael Duque, Camilo Palazuelos and Abraham Casas

Abstract: The growing complexity and volume of data in modern applications have amplified the need for efficient and accessible machine learning (ML) solutions. Automated Machine Learning (AutoML) addresses this challenge by automating key stages of the ML pipeline, such as data preprocessing, model selection and hyperparameter tuning. However, AutoML systems often remain limited in their ability to interpret user intent or adapt flexibly to domain-specific requirements. Recent advances in Large Language Models (LLMs), such as GPT-based models, offer a novel opportunity to enhance AutoML through natural language understanding and generation capabilities. This paper proposes a software system that integrates LLMs into AutoML workflows, enabling users to interact with ML pipelines through natural language prompts. The system leverages LLMs to translate textual descriptions into code, suggest model configurations and interpret ML tasks in a human-centric manner. Experimental evaluation across diverse public datasets demonstrates the system’s ability to streamline model development while maintaining high performance and reproducibility. By bridging the gap between domain expertise and technical implementation, this integration fosters more intuitive, scalable and democratized ML development. The results highlight the potential of LLMs to transform AutoML into a truly interactive and accessible tool for a broader range of users.
Download
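
The interaction pattern (natural-language request in, machine-readable pipeline specification out) might look as follows. `ask_llm` is a hypothetical stand-in, stubbed here so the sketch runs offline; a real system would call a GPT-style endpoint and validate its output.

```python
# Minimal sketch of the prompt-to-pipeline pattern the abstract describes.
# ask_llm is a hypothetical stub, not the paper's actual interface.
import json

def ask_llm(prompt: str) -> str:
    # Stubbed response so the sketch runs offline; a real system would call
    # an LLM here and validate the JSON it returns before executing anything.
    return json.dumps({"preprocess": ["impute", "standardize"],
                       "model": "RandomForestClassifier",
                       "tuning": {"n_estimators": [100, 300]}})

request = "Classify churn from this CSV; missing values are common."
spec = json.loads(ask_llm(f"Return a JSON ML pipeline spec for: {request}"))
print(spec["model"], spec["tuning"])
```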

Paper Nr: 166
Title:

BrandNERD: An Extensive Brand Dataset and Analysis Pipeline for Named Entity Resolution

Authors:

Nicholas Caporusso, Alina Campan, Ayush Bhandari, Stephen Kroeger and Sarita Gautam

Abstract: Named entity resolution (NER) comprises several steps to address multifaceted challenges, including canonicalization, aggregation, and validation. Nonetheless, NER research is hindered by the scarcity of realistic, labeled corpora that capture the spelling noise and brand proliferation found in data from multiple sources, from e-commerce to social media. In this paper, we introduce the Brand Name Entity Resolution Dataset (BrandNERD), an extensive dataset of real-world brand names extracted from an existing high-traffic retail marketplace. BrandNERD consists of multiple datasets along the entity resolution pipeline: raw surface forms, unique canonical entities, similarity clusters, validated brands, and a lookup table reconciling multiple canonical forms with a list of validated preferred brand labels. In addition to the BrandNERD dataset, our contribution includes an analysis of the adequacy of various text similarity measures for the brand NER task at hand, the processing algorithms used in each step of the resolution process, and user interfaces and data visualization tools for manual reviews, resulting in a modular, fully reproducible, and extensible pipeline that reflects the complete NER workflow. BrandNERD, which is released as a public repository, contains the dataset and processing pipeline for over 390,000 raw brand names. The repository is continuously updated with new data and improved NER algorithms, making it a living resource for research in marketing and machine learning, and for enabling more complex downstream tasks such as entity disambiguation and brand sentiment analysis.
Download
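
A single canonicalization step of such a pipeline might be sketched as greedy similarity clustering over noisy surface forms. The normalization, the similarity measure (difflib's ratio), and the threshold are assumptions; the paper compares several similarity measures.

```python
# Illustrative canonicalization step: group surface forms whose normalized
# similarity exceeds a threshold. Measure and threshold are assumptions.
from difflib import SequenceMatcher

def normalize(s: str) -> str:
    return "".join(ch for ch in s.lower() if ch.isalnum())

def cluster(names: list[str], threshold: float = 0.7) -> list[list[str]]:
    clusters: list[list[str]] = []
    for name in names:
        for c in clusters:  # compare against each cluster's first member
            if SequenceMatcher(None, normalize(name),
                               normalize(c[0])).ratio() >= threshold:
                c.append(name)
                break
        else:
            clusters.append([name])
    return clusters

# [['Adidas', 'adidas'], ['ADIDAS Originals'], ['Nike', 'N1KE']]
print(cluster(["Adidas", "adidas", "ADIDAS Originals", "Nike", "N1KE"]))
```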

Paper Nr: 173
Title:

From Free Text to Upper Gastrointestinal Cancer Diagnosis: Fine-Tuning Language Models on Endoscopy and Histology Narratives

Authors:

Kazhan Misri, Leo Alexandre and Beatriz De La Iglesia

Abstract: Clinical free text reports from endoscopy and histology are a valuable yet underexploited source of information for supporting upper gastrointestinal (GI) cancer diagnosis. Our initial learning task was to classify procedures as cancer-positive or cancer-negative based on downstream registry-confirmed diagnoses. For this, we developed a patient-level dataset of 63,040 endoscopy reports linked with histology data and cancer registry outcomes, allowing supervised learning on real-world clinical data. We fine-tuned two transformer-based models, general-purpose BERT and domain-specific BioClinicalBERT, and evaluated methods to address severe class imbalance, including random minority upsampling and class weighting. BioClinicalBERT combined with upsampling achieved the best recall (sensitivity) of 85% and reduced false negatives compared to BERT’s recall of 78%. Calibration analysis indicated that predicted probabilities were broadly reliable. We also applied SHapley Additive exPlanations (SHAP) to interpret model decisions by highlighting influential clinical terms, fostering transparency and trust. Our findings demonstrate the potential of scalable, interpretable natural language processing models to extract clinically meaningful insights from unstructured narratives, providing a foundation for future retrospective review of cancer diagnosis and clinical decision support tools.
Download
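
The class-weighting variant evaluated here can be pictured as inverse-frequency weights passed to the loss function. The sketch below shows only that step, with the transformer model and data wiring omitted; the class split is an invented stand-in.

```python
# Sketch of class weighting for imbalanced fine-tuning: inverse-frequency
# weights fed to cross-entropy. Model/data wiring omitted; split is invented.
from collections import Counter
import torch

labels = [0] * 950 + [1] * 50            # stand-in for a 95/5 class split
counts = Counter(labels)
n = len(labels)
weights = torch.tensor([n / (2 * counts[c]) for c in (0, 1)],
                       dtype=torch.float)  # rare class weighted up
loss_fn = torch.nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(8, 2, requires_grad=True)   # e.g. a BERT classifier head
batch_labels = torch.randint(0, 2, (8,))
print(weights, loss_fn(logits, batch_labels))
```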

Paper Nr: 175
Title:

RFG Framework: Retrieval-Feedback-Grounded Multi-Query Expansion

Authors:

Ronaldinho Vega Centeno Olivera, Allan M. de Souza and Julio Cesar dos Reis

Abstract: Information Retrieval (IR) systems face challenges such as query ambiguity and lexical mismatch, which reduce the effectiveness of dense retrieval models, whose ability to generalize to new domains or tasks is often limited. This study proposes a novel query expansion framework, named RFG, which integrates the capabilities of Large Language Models (LLMs) into an architecture that combines Retrieval-Augmented Generation (RAG) with Pseudo-Relevance Feedback (PRF). Our solution is based on using an initial document retrieval as a grounding context for the LLMs, a process that mitigates the generation of unsubstantiated information (“hallucinations”) by guiding the creation of a diverse set of pseudo-queries. Following an evaluation across a broad spectrum of retrieval models, including unsupervised and supervised dense models, our experimental results demonstrate that RFG consistently outperforms baseline methods, such as HyDE and Query2doc. In contrast to previous findings that suggest a negative correlation between retriever performance and query expansion benefits, this study reveals that our approach not only benefits models with lower initial effectiveness but also improves the results of more robust retrievers. This positions the generation of multiple, contextualized queries as a versatile and highly effective expansion strategy.
Download
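
The retrieve, ground, expand, fuse loop might be skeletonized as follows. `search` and `generate_queries` are stand-ins for the dense retriever and the LLM, and the simple score-sum fusion is an assumption, not necessarily the paper's rule.

```python
# High-level sketch of a retrieval-feedback-grounded expansion loop.
# search() and generate_queries() are stubs, not the paper's components.
from collections import defaultdict

def search(query: str) -> dict[str, float]:        # doc_id -> score (stub)
    return {f"doc_{hash((query, i)) % 100}": 1.0 / (i + 1) for i in range(3)}

def generate_queries(query: str, context: list[str], k: int = 3) -> list[str]:
    # A real system would prompt an LLM with the retrieved context here,
    # producing diverse pseudo-queries grounded in actual documents.
    return [f"{query} (variant {i})" for i in range(k)]

def rfg_search(query: str) -> list[str]:
    grounding = list(search(query))                 # pseudo-relevance feedback
    fused: dict[str, float] = defaultdict(float)
    for q in [query] + generate_queries(query, grounding):
        for doc, score in search(q).items():
            fused[doc] += score                     # naive score-sum fusion
    return sorted(fused, key=fused.get, reverse=True)

print(rfg_search("lexical mismatch in dense retrieval")[:5])
```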

Paper Nr: 177
Title:

Prompt Injection Attacks on Large Language Models: Multi-Model Security Analysis with Categorized Attack Types

Authors:

Selin Şaşal and Özgü Can

Abstract: Large Language Models (LLMs) are widely used in information processing, language interaction, and decision support. The command-based structure of these systems creates security vulnerabilities that can be exploited through attacks designed to bypass security measures and generate malicious content. This study presents a comparative analysis of three LLMs (GPT-4o, Claude 4 Sonnet, and Gemini 2.5 Flash) based on four fundamental security metrics: compliance, filter bypass, sensitive information leakage, and security risk level. The study used an attack dataset containing unethical, harmful, and manipulation-oriented prompts. According to the results, the Claude model demonstrated the most robust security posture by providing secure responses with high consistency. Gemini was the most vulnerable due to filtering failures and information leakage. GPT-4o showed average performance, behaving securely in most scenarios but exhibiting inconsistency in the face of indirect attacks. The findings reveal that LLM security is influenced not only by content-level factors but also by structural factors such as model architectural design, training data scope, and filtering strategies. Therefore, it is critical to regularly test models against attacks and establish transparent, explainable, and ethics-based security principles.
Download
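
A comparative study of this shape implies an evaluation harness along these lines. `query_model` and `judge` are stubs, and the metric names merely paraphrase the four metrics above; real runs would call each provider's API and a safety grader.

```python
# Sketch of the implied evaluation loop: replay a categorized attack set
# against several models and tally per-metric outcomes. All stubs.
from collections import defaultdict

ATTACKS = [("role-play jailbreak", "indirect"),
           ("extract system prompt", "leakage")]
MODELS = ["gpt-4o", "claude-4-sonnet", "gemini-2.5-flash"]

def query_model(model: str, prompt: str) -> str:
    return "refused"                      # stub response, offline

def judge(response: str) -> dict[str, bool]:
    return {"complied": response != "refused",
            "filter_bypassed": False,
            "leaked_info": False}

tally: dict[str, dict[str, int]] = defaultdict(lambda: defaultdict(int))
for model in MODELS:
    for prompt, _category in ATTACKS:
        verdict = judge(query_model(model, prompt))
        for metric, hit in verdict.items():
            tally[model][metric] += int(hit)
print(dict(tally["claude-4-sonnet"]))
```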

Paper Nr: 36
Title:

A Hybrid Approach for Mining the Organizational Structure from University Websites

Authors:

Arman Arzani, Theodor Josef Vogl, Marcus Handte and Pedro José Marrón

Abstract: To support innovation coaches in scouting activities such as discovering expertise and trends inside a university and finding potential innovators, we designed INSE, an innovation search engine which automates the data gathering and analysis processes. The primary goal of INSE is to provide comprehensive system support across all stages of innovation scouting, reducing the need for manual data collection and aggregation. To provide innovation coaches with the necessary information on individuals, INSE must first establish the structure of the organization. This includes identifying the associated staff and researchers in order to assess their academic activities. While this could in theory be done manually, this task is error-prone and virtually impossible to do for large organizations. In this paper, we propose a generic organization mining approach that combines a rule-based algorithm, LLMs, and a fine-tuned sequence-to-sequence classifier on university websites, independent of web technologies, content management systems or website layout. We implement the approach and evaluate the results against four different universities, namely Duisburg-Essen, Münster, Dortmund, and Wuppertal. The evaluation indicates that our approach is generic and enables the identification of university aggregator pages with an F1 score above 85%, and landing pages of entities with F1 scores of 100% for faculties and above 78% for institutes and chairs.
Download
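
The rule-based component might, in its simplest form, flag aggregator pages from URL and link-density cues as below. The cue list and threshold are assumptions; the paper combines such rules with LLMs and a sequence-to-sequence classifier.

```python
# Toy version of a rule-based page classifier: flag likely "aggregator"
# pages (lists of chairs/institutes) by URL and link-density cues.
AGGREGATOR_CUES = ("fakultaet", "faculty", "institute", "chairs", "lehrstuehle")

def looks_like_aggregator(url: str, anchor_texts: list[str]) -> bool:
    url_hit = any(cue in url.lower() for cue in AGGREGATOR_CUES)
    dense_links = len(anchor_texts) > 20        # aggregators are link-heavy
    return url_hit and dense_links

print(looks_like_aggregator(
    "https://www.uni-due.de/fakultaeten", [f"Chair {i}" for i in range(30)]))
```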

Paper Nr: 42
Title:

Triples, Chains, and Trees: A Hybrid Knowledge Representation Framework for Multi-Modal Scientific Content

Authors:

Maodi Hu, Donghuan Song, Xiumin Liu, Xi Sun and Zhixiong Zhang

Abstract: The exponential growth of scientific literature necessitates advanced automated methods for knowledge extraction and reasoning. Scientific papers, however, present a significant challenge because of their multi-modal nature, which incorporates text, tables, figures, and formulas. Existing approaches often focus on individual modalities, particularly text, or employ uniform representations that fail to capture the rich structural information inherent in diverse content forms like tables or formulas. This paper introduces a novel knowledge representation and reasoning framework specifically designed to handle the heterogeneity of scientific publications. Our core innovation lies in employing distinct yet interconnected representation schemes tailored to different content types: knowledge triples for unstructured text and non-chart images, knowledge chains for structured tabular and chart data, and knowledge trees for mathematical formulas. Crucially, these representations are unified by a common knowledge object typology. This shared ontology enables seamless integration and cross-modal linkage of knowledge extracted from disparate parts of a paper. Furthermore, we formalize reasoning tasks upon this hybrid representation structure. Specifically, we model Knowledge Graph completion, leveraging insights from Chains and Trees to enhance triple prediction, and data prediction within knowledge chains, utilizing contextual information from Triples and structural constraints from Trees. This integrated framework provides a more comprehensive and nuanced understanding of scientific papers, paving the way for more powerful downstream applications like automated scientific discovery and hypothesis generation.
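
The three representation schemes and their shared typology can be pictured as plain data structures. All field names here are illustrative assumptions rather than the paper's schema.

```python
# Sketch of the hybrid representation: triples, chains, and trees unified by
# one knowledge-object typology. Field names are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class KnowledgeObject:            # shared typology across all modalities
    label: str
    obj_type: str                 # e.g. "material", "metric", "method"

@dataclass
class Triple:                     # unstructured text, non-chart images
    subject: KnowledgeObject
    predicate: str
    obj: KnowledgeObject

@dataclass
class Chain:                      # tables and chart data: an ordered series
    variable: KnowledgeObject
    values: list[float]

@dataclass
class Tree:                       # formulas: an operator over sub-expressions
    op: str
    children: list["Tree | KnowledgeObject"] = field(default_factory=list)

t = Triple(KnowledgeObject("BERT", "method"), "achieves",
           KnowledgeObject("F1=0.91", "metric"))
print(t)
```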

Paper Nr: 59
Title:

Objective-Oriented Transformer for Abstractive Document Summarization

Authors:

Parma Nand, CangeGe Zhang and Manju Vallayil

Abstract: Transformer language models have proven highly effective across a variety of language understanding and generation tasks. Adaptation of these models for the specific purpose of text summarization has not been explored as much. In this work, we present the adaptation of a pre-trained Transformer model for the specific task of text summarization. A common way to train a language model is to randomly mask tokens from text and train the model to predict the masked words. The learner does this by paying attention to other neighbouring words in order to predict the masked words. Instead of training a single learner to learn random words, we trained three separate learners to focus only on specific types of words and generate separate summaries from multiple summary viewpoints. Then we used these focused learners to generate composite summaries corresponding to the type of words on which they were trained. We hypothesize that if we combine these different summaries, then it should result in a richer, more accurate summary covering multiple perspectives. We used already trained masked language models, BERT and RoBERTa, to extend the pretraining on the composite tasks of predicting just the nouns, the verbs or the rest of the words, as three separate pretraining objectives. We then trained the composite models for the downstream task of corresponding composite summarization. The evaluation combined the three composite summaries and was carried out on two benchmark data sets, Food Review and CNN/Daily Mail. The proposed composite pre-trained model and the composite summary generation algorithm produced a higher precision score based on ROUGE-1 and ROUGE-3 but a slightly lower score on ROUGE-2 compared to the state-of-the-art. The results showed that generating multiple summaries from different perspectives and then merging them has the potential to produce a richer and better summary compared to a one-shot strategy.
Download
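
The composite pretraining objective, masking only one word class instead of random tokens, reduces to a selective masking step. The toy below hard-codes POS tags; a real pipeline would use a tagger and the BERT/RoBERTa tokenizer.

```python
# Toy illustration of the composite objective: mask only one word class
# (here nouns or verbs) instead of random tokens. Tags are hard-coded.
MASK = "[MASK]"

def mask_by_pos(tokens: list[str], tags: list[str], target: str) -> list[str]:
    return [MASK if tag == target else tok for tok, tag in zip(tokens, tags)]

tokens = ["The", "chef", "seasoned", "the", "soup", "carefully"]
tags   = ["DET", "NOUN", "VERB",     "DET", "NOUN", "ADV"]

print(mask_by_pos(tokens, tags, "NOUN"))   # noun-focused learner's input
print(mask_by_pos(tokens, tags, "VERB"))   # verb-focused learner's input
```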

Paper Nr: 81
Title:

WPCM: A Multi-Label Patent Classification Method Based on Weakly Supervised Learning

Authors:

Dechao Wang, Yongjie Li, Jian Zhu and Xiaoli Tang

Abstract: Current research on automatic patent classification predominantly focuses on reclassification within existing patent classification systems. This study aims to enhance the classification performance of automatic patent classification tasks in scenarios lacking annotated data, broaden the application scope of patent classification, and establish a foundation for mapping patents to real-world scenarios or subject-specific classification systems. To achieve this, we propose a weakly supervised multi-label patent classification method. This approach captures semantic similarity features both within patent documents and between patents and hierarchical classification labels through a two-stage process involving contrastive learning and comprehensive classification, enabling the automatic classification of unlabeled patents. Experimental results on a medical patent dataset demonstrate the efficacy of the proposed method. The model achieves Precision scores of 0.8237, 0.5743, and 0.4467 at the subclass, main group, and subgroup levels, respectively. Comparative and ablation experiments further validate the effectiveness of each component module within the method.
Download

Paper Nr: 82
Title:

Research on Frontier Discovery of Technological Innovation Based on Knowledge Flow

Authors:

Dechao Wang, Yongjie Li, Jian Zhu and Xiaoli Tang

Abstract: Amidst intensifying technological competition, technological innovation fundamentally arises from the flow of scientific knowledge into technical knowledge. Precisely characterizing the features and connotation of this knowledge flow is therefore crucial for identifying the frontiers of technological innovation. Adopting a semantic flow perspective, this study developed a simulation framework to model semantic flow between documents, progressing from key knowledge elements to the full-text level. Leveraging deep learning models, it then re-identified knowledge flow relationships between documents. The concept of the knowledge meme was introduced to quantify the propagation dynamics (intensity and scope) of knowledge units across scientific and technical knowledge systems. Subsequently, a knowledge flow network connecting patents and academic papers in the lung cancer domain was constructed. Building upon this network, the substantive content of the knowledge flows was measured. This research achieved the identification and reconstruction of knowledge flow relationships between scientific and technical documents. Furthermore, by analyzing the content and communication patterns of computable knowledge units, it elucidated the frontiers of technological innovation. This approach holds significant implications for understanding science-technology linkages and identifying emerging technological innovation frontiers.
Download

Paper Nr: 93
Title:

JURISMIND: Context-Driven Retrieval for Accurate and Relevant Legal Question-Answering in Patent Filings

Authors:

Pandey Shourya Prasad, Vidhish Trivedi, Madhav Rao and Srijoni Sen

Abstract: Large Language Models (LLMs) have demonstrated strong performance in domain-specific conversational forums, but they often suffer from hallucinations, producing factually incorrect or contextually irrelevant responses. This issue is particularly critical in the legal domain, where accuracy is paramount. Existing solutions such as fine-tuning and static retrieval methods struggle to handle the complexities of legal language and often fail to provide sufficient contextual grounding. To address this, we propose JURISMIND, a context-driven retrieval-augmented generation (RAG) pipeline designed for the legal domain, with a focus on patent filing. Our approach retrieves relevant legal texts, case law, and statutes based on the input query. This retrieved context is combined with a base prompt and the user query, guiding the language model to respond using the provided legal context. This method significantly reduces hallucinations and improves the contextual accuracy of responses. Preliminary evaluation indicates that 56.32% of responses are in strong agreement and 27.59% in fair agreement with ground truth, totaling 83.91% alignment. Furthermore, JURISMIND achieves a BERTScore of 0.91, outperforming the 0.838 BERTScore of a pretrained LLaMA-based model. The code and dataset are publicly released to support adoption and further research in the developer community.
Download
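
The BERTScore comparison reported above can be reproduced in form (not in data) with the bert-score package. The candidate and reference strings here are invented examples, and the call downloads a scoring model on first use.

```python
# Form-only sketch of a BERTScore evaluation; the texts are invented.
from bert_score import score

candidates = ["A provisional patent application lasts twelve months."]
references = ["A provisional application for a patent expires after 12 months."]

P, R, F1 = score(candidates, references, lang="en", verbose=False)
print(f"BERTScore F1: {F1.mean().item():.3f}")
```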

Paper Nr: 108
Title:

JARGES: Detecting and Decoding Jargon for Enterprise Search

Authors:

Colin Daly and Lucy Hederman

Abstract: Newcomers to an organisation often struggle with unfamiliar internal vocabulary, which can affect their ability to retrieve relevant information. Enterprise Search (ES) systems frequently underperform when queries contain jargon or terminology that is specific to the organisation. This paper introduces 'JARGES', a novel feature for detecting and decoding jargon for ES. It is designed to enhance a ranking model combining Learning to Rank (LTR) and transformer-based synonym expansion. The ranking model is evaluated using the ENTRP-SRCH dataset. Our experiments showed, however, that the JARGES feature yielded no significant improvement over the baseline (nDCG@10 = 0.964, Δ = 0.001, p > 0.05). This null result is likely due to the dataset's lack of jargon-rich pairs. This highlights the need for larger ES datasets derived from click-through data or other implicit feedback to detect subtle ranking signals.
Download
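
The evaluation comparison reported above, nDCG@10 for a baseline ranker versus one with an extra feature, can be sketched with scikit-learn. The relevance labels and scores below are synthetic and chosen only to illustrate a negligible delta.

```python
# Sketch of an nDCG@10 comparison between two rankers; data is synthetic.
import numpy as np
from sklearn.metrics import ndcg_score

true_relevance  = np.asarray([[3, 2, 3, 0, 1, 2, 0, 0, 1, 0]])
baseline_scores = np.asarray([[9.1, 8.7, 8.2, 5.0, 4.8, 4.5, 3.0, 2.2, 2.0, 1.0]])
jargon_scores   = np.asarray([[9.0, 8.8, 8.3, 5.1, 4.7, 4.6, 2.9, 2.1, 2.2, 0.9]])

b = ndcg_score(true_relevance, baseline_scores, k=10)
j = ndcg_score(true_relevance, jargon_scores, k=10)
print(f"baseline {b:.3f}  +feature {j:.3f}  delta {j - b:+.3f}")
```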

Paper Nr: 134
Title:

Towards Interpretable Fairness-Aware Online Machine Learning Through Local Surrogate Models

Authors:

Farnaz Sadeghi, Capri Paquet and Herna Viktor

Abstract: State-of-the-art online machine learning algorithms create highly accurate models that are increasingly used to make decisions in finance, health and cybersecurity. Often, these algorithms are based on vast amounts of data and construct complex black-box models. With this in mind, relying on data and models without fully understanding them poses a significant risk. This is especially important in the fairness-aware learning domain, where we seek to assess fairness, transparency and trustworthiness in decision-making contexts that affect individuals and groups. Local surrogate models can be used to explain the reasoning behind individual predictions made by a black-box algorithm by estimating the local decision boundary within an area of interest. To this end, this paper introduces a new local surrogate model approach in the context of a fairness-aware online machine learning setting, where the Multi-Sensitive Queue-based Online Fair Learning (MQ-OFL) black-box algorithm is used. We demonstrate the value of the method through a case study that predicts a person's income based on characteristics such as ethnic origin, biological sex, age, and level of education. In addition, we provide reflections on the interpretability and fairness of the model from both technical and sociocultural perspectives, considering anthropological insights as essential to understanding how algorithmic decisions affect real-world communities.
Download
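
A local surrogate of the kind described can be built LIME-style: perturb around one instance, query the black box, weight samples by proximity, and fit an interpretable linear model. The black box below is a stand-in, not MQ-OFL itself.

```python
# Minimal LIME-style local surrogate around one instance. The black box is
# a stand-in function, not the MQ-OFL algorithm from the paper.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

def black_box(X: np.ndarray) -> np.ndarray:        # stand-in model
    return 1 / (1 + np.exp(-(2 * X[:, 0] - X[:, 1])))

x0 = np.array([0.5, -0.2, 0.1])                    # instance to explain
Z = x0 + rng.normal(scale=0.3, size=(500, 3))      # local neighbourhood
dist = np.linalg.norm(Z - x0, axis=1)
w = np.exp(-(dist ** 2) / 0.5)                     # proximity kernel

surrogate = Ridge(alpha=1.0)                       # interpretable local model
surrogate.fit(Z, black_box(Z), sample_weight=w)
print("local feature effects:", surrogate.coef_.round(3))
```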

Paper Nr: 139
Title:

Comparing Chain and Tree-Based Reasoning for Explainable Knowledge Discovery in Contract Analytics Using Large Language Models

Authors:

Antony Seabra, Claudio Cavalcante and Sergio Lifschitz

Abstract: This paper presents a comparative analysis of two structured reasoning strategies, Chain-of-Thought (CoT) and Tree-of-Thought (ToT), for explainable knowledge discovery with Large Language Models (LLMs). Grounded in real-world IT contract management scenarios, we apply both techniques to a diverse set of competency questions that require advanced reasoning over structured and unstructured data. CoT guides the model through sequential, linear reasoning steps, whereas ToT enables the exploration of multiple reasoning paths before selecting a final response. We evaluate the generated insights using three key criteria: clarity, usefulness, and confidence in justifications, with particular attention to their effectiveness in supporting decision-making. The results indicate that ToT produces richer and more comprehensive rationales in complex scenarios, while CoT offers faster and more direct responses in narrowly defined tasks. Our findings highlight the complementary strengths of each approach and contribute to the design of adaptive, self-rationalizing AI agents capable of delivering explainable and actionable recommendations in contract analysis contexts.
Download
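
The structural difference between the two strategies can be shown schematically. `llm` is a stubbed stand-in, the prompt templates are illustrative, and the length-based path selection is a placeholder for the scoring or voting a real ToT would use.

```python
# Schematic contrast of CoT vs. ToT prompting; llm() is a stub.
def llm(prompt: str) -> str:
    return f"<model answer to: {prompt[:40]}...>"

def chain_of_thought(question: str) -> str:
    # One linear pass: reason step by step, then answer.
    return llm(f"{question}\nLet's reason step by step, then answer.")

def tree_of_thought(question: str, branches: int = 3) -> str:
    # Explore several candidate reasoning paths, then pick one.
    paths = [llm(f"{question}\nReasoning path {i + 1}:") for i in range(branches)]
    return max(paths, key=len)   # placeholder: a real ToT scores/votes paths

q = "Which contract clauses conflict with the SLA's 99.9% uptime guarantee?"
print(chain_of_thought(q))
print(tree_of_thought(q))
```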

Paper Nr: 144
Title:

Decision Rule-Based Learning of Terrorist Threats

Authors:

Nida Meddouri, Loïc Salmon, David Beserra and Elloh Adja

Abstract: Artificial Intelligence (AI) offers powerful tools for analyzing criminal data and predicting security threats. This paper focuses on the interpretable prediction of terrorist threats in France using official crime datasets from 2012 to 2021. We propose a preprocessing methodology to aggregate and label spatio-temporal crime data at the departmental level, addressing challenges such as data imbalance and structural heterogeneity. To ensure explainability, we adopt symbolic learning approaches based on decision rule generators implemented in WEKA, including MODLEM, NNge, and MOEFC. We evaluate these models through nine experiments simulating real-world prediction scenarios, using metrics such as misclassification rate, Recall, Kappa statistic, AUC-ROC, and AUPR. Results show that rule-based models achieve stable performance across periods, with Recall averaging 96% and AUPR close to 0.93, despite severe class imbalance. Among the tested methods, NNge and MOEFC provide the best trade-off between interpretability and predictive accuracy. These findings highlight the potential of interpretable rule-based models for supporting counter-terrorism strategies.
Download
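
The paper's models are WEKA rule learners (MODLEM, NNge, MOEFC); as a stand-in, the sketch below shows the same idea, human-readable decision rules under class imbalance, using scikit-learn's tree export on synthetic data.

```python
# Stand-in for the WEKA rule learners: a shallow, class-weighted tree whose
# rules print in readable if-then form. Data is synthetic, not crime records.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=300, n_features=4, weights=[0.9],
                           random_state=1)          # severe class imbalance
tree = DecisionTreeClassifier(max_depth=2, class_weight="balanced",
                              random_state=1).fit(X, y)
print(export_text(tree, feature_names=[f"crime_feat_{i}" for i in range(4)]))
```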

Paper Nr: 162
Title:

Identifying Research Hotspots and Research Gaps in Specific Research Area Based on Fine-Grained Information Extraction via Large Language Models

Authors:

Yuling Sun, Xuening Cui, Aning Qin and Jiatang Luo

Abstract: This paper constructs a fine-grained scientific data indicator framework using LLMs to conduct knowledge mining in a specific field of natural science and technology, with empirical analysis carried out in the domain of carbon dioxide conversion and utilization technology. Firstly, based on the characteristics of the technical field, we systematically established four key scientific data dimensions: products, technologies, materials, and performance. Subsequently, six key scientific data indicators were selected to characterize these dimensions. Finally, the extracted scientific data were employed to analyse research hotspots and gaps in the field. This approach effectively addresses the inherent limitations of traditional technology topic analysis, such as overly coarse metric granularity and the lack of quantitative features. Moreover, since these scientific data dimensions and indicators are generalizable to natural science and technology fields aimed at product development, the proposed methodology demonstrates broad applicability.
Download

Paper Nr: 172
Title:

Are all Genders Equal in the Eyes of Algorithms? Analysing Search and Retrieval Algorithms for Algorithmic Gender Fairness

Authors:

Stefanie Urchs, Veronika Thurner, Matthias Aßenmacher, Ludwig Bothmann, Christian Heumann and Stephanie Thiemichen

Abstract: Algorithmic systems such as search engines and information retrieval platforms significantly influence academic visibility and the dissemination of knowledge. Despite assumptions of neutrality, these systems can reproduce or reinforce societal biases, including those related to gender. This paper introduces and applies a bias-preserving definition of algorithmic gender fairness, which assesses whether algorithmic outputs reflect real-world gender distributions without introducing or amplifying disparities. Using a heterogeneous dataset of academic profiles from German universities and universities of applied sciences, we analyse gender differences in metadata completeness, publication retrieval in academic databases, and visibility in Google search results. While we observe no overt algorithmic discrimination, our findings reveal subtle but consistent imbalances: male professors are associated with a greater number of search results and more aligned publication records, while female professors display higher variability in digital visibility. These patterns reflect the interplay between platform algorithms, institutional curation, and individual self-presentation. Our study highlights the need for fairness evaluations that account for both technical performance and representational equality in digital systems.
Download