Abstracts Track 2025


Nr: 185
Title:

HORC: An Harmonized Ontology for Regulatory Compliance Assessment

Authors:

Antoine Sacré, Jean-Noël Colin, Benoit Hosselet and Dara Tith

Abstract: Automatic regulatory compliance assessment systems are essential to integrate compliance early and continuously into software development. They help reduce costs and risks, improve automation, and allow teams to monitor and manage compliance independently. These systems must rely on a complete, machine-readable knowledge base. Existing techniques can extract key elements such as obligations, rules, actors, actions, and conditions of applicability from unstructured legal texts, standards, and contracts. Ontology-Based Information Extraction (OBIE) systems process natural language text through a mechanism guided by ontologies to instantiate a predefined model. To support such systems, regulatory ontologies already exist, such as NISO SSOS for standards, and LegalRuleML, the European Legislation Identifier (ELI), FOLaw, or LRI-Core for legal texts. However, no existing cross-domain ontology of regulation is specifically designed to support compliance assessment. Our previously published paper on OBIE for regulatory compliance relied on a lightweight cross-domain ontology of regulations, aimed at exploring the feasibility and usability of a simplified OBIE process for legal practitioners. Because of its simplicity, though, the ontology lacked formal structure and was not built on a solid ontological foundation, which limited its applicability in industrial settings. In this work, we developed and aligned two regulatory ontologies, both aimed at enabling compliance assessment: one for ISO international standards, and one for EU binding acts. The ISO and EU ontologies were designed using, respectively, the ISO/IEC Directives Part 2 document and the Joint Practical Guide of the European Parliament, the Council, and the Commission. For both, we followed the Ontology Development 101 methodology by Noy and McGuinness (2001) to define classes, hierarchies, and properties.
We then aligned the two into a single harmonized ontology, called HORC (Harmonized Ontology for Regulatory Compliance). The harmonization followed the ISO 860 methodology on terminology alignment and resolved both structural and semantic mismatches between the two regulatory document families. The result is a unified ontology, compatible with both ISO standards and EU binding acts, aimed at supporting compliance assessment. We tested the ontology by instantiating it with articles from both ISO standards and EU binding acts, and used it to assess compliance in a small scenario. The results showed that the model successfully represents both document types in a machine-readable format. However, two limitations emerged. First, not all applicable rules could be easily identified. For EU binding acts in particular, determining applicability often requires interpreting conditions from the rule content or linking across multiple provisions. Second, while the harmonized ontology allows the same rule, article, or definition content to be shared across normative sources, it does not prevent duplication or contradictions. These limitations reflect structural differences between regulatory sources. ISO standards are designed with verifiability in mind: requirements must be demonstrable and testable. EU acts, by contrast, are not drafted for direct verifiability, and legal interpretation remains necessary for building a knowledge base suitable for automatic compliance assessment. HORC: https://github.com/AntoineSacre/HORC Previously published paper: https://ceur-ws.org/Vol-3882/km4law-1.pdf.
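As a rough illustration of the kind of assessment described above, the following minimal sketch separates a rule's condition of applicability from its verifiable requirement. The class name `Rule`, its fields, the rule identifiers, and the system attributes are all hypothetical and are not taken from the HORC ontology; a real knowledge base would be expressed in an ontology language rather than in Python objects.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical, simplified stand-in for a rule individual in a knowledge base.
# Names and fields are illustrative assumptions, NOT HORC classes or properties.
@dataclass(frozen=True)
class Rule:
    rule_id: str                                        # toy identifier
    source: str                                         # "ISO" or "EU" in this toy example
    applies_to: Callable[[Dict[str, object]], bool]     # condition of applicability
    is_satisfied: Callable[[Dict[str, object]], bool]   # verifiable requirement


def assess(rules, system):
    """Map each rule id to 'not applicable', 'compliant', or 'non-compliant'."""
    report = {}
    for rule in rules:
        if not rule.applies_to(system):
            report[rule.rule_id] = "not applicable"
        elif rule.is_satisfied(system):
            report[rule.rule_id] = "compliant"
        else:
            report[rule.rule_id] = "non-compliant"
    return report


# Two made-up rules: one always applicable, one with an applicability condition.
rules = [
    Rule("ISO-toy-1", "ISO",
         applies_to=lambda s: True,
         is_satisfied=lambda s: bool(s.get("has_access_log"))),
    Rule("EU-toy-1", "EU",
         applies_to=lambda s: bool(s.get("processes_personal_data")),
         is_satisfied=lambda s: bool(s.get("has_retention_policy"))),
]

# Made-up description of the system under assessment.
system = {
    "has_access_log": True,
    "processes_personal_data": True,
    "has_retention_policy": False,
}
report = assess(rules, system)
```

The first limitation reported above shows up here as the difficulty of writing `applies_to` at all: in this sketch it is a hand-written predicate, whereas for EU binding acts it may require legal interpretation across several provisions.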

Nr: 156
Title:

Analysis of the Causal Effects of Air Pollutants on Outpatient Visits to Atopic Dermatitis in Young Childhood: An Approach Through Instrumental Variable Analysis

Authors:

Seong Pyo Kim, Su Hwan Kim, Jae-In Song, Zio Kim, Dongjae Shin, Yeju Park, Yujin Park, Hyung-Jin Yoon, Heung-Woo Park, Jin Youp Kim and Hyung-Jin Yoon

Abstract: Introduction: Atopic dermatitis (AD), a chronic skin disease causing inflammation and itch, is a common condition affecting approximately 20% of children worldwide. Beyond affecting physical health, AD negatively influences various aspects of daily life, including academic achievement and sleep quality. Given that recent studies have identified air pollutants as one of the major exacerbating factors of AD, elucidating the causality between air pollution and AD in children is crucial for improving young children’s quality of life. However, as randomized controlled trials to explore this are not feasible for ethical and practical reasons, we aim to investigate the causal effects of air pollutants on AD in children using observational data. Methods: The number of daily outpatient visits for AD in children in Seoul, the capital of South Korea, from January 2014 to December 2017, as recorded in the National Health Insurance Service database, was used as the outcome variable. Air pollutants, namely PM (particulate matter) 2.5 and PM 10, along with nitrogen dioxide, sulfur dioxide, carbon monoxide, and ozone, were converted into air quality index (AQI) values to serve as the exposure. Instrumental Variable (IV) analysis is a powerful quasi-experimental design for inferring causal effects in the presence of unmeasured confounding and endogeneity. It relies on a variable that is associated with the exposure but affects the outcome only through its influence on the exposure. In this study, we employed Thermal Inversion (TI) as an IV, which satisfies the core assumptions of IV. TI is a meteorological phenomenon in which a warm air layer becomes trapped between cooler layers above and below, resulting in temperatures increasing with altitude.
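The IV logic described above can be sketched on simulated data. The example below uses the simplest single-instrument linear IV estimator (the ratio of covariances, which coincides with two-stage least squares when there is one instrument and one exposure); it is not the two-stage Generalized Method of Moments model used in the study. All variable names, coefficients, and the data-generating process are assumptions made for illustration only.

```python
import random


def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Population covariance of two equal-length sequences."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def ols_slope(x, y):
    """Naive regression slope of y on x; biased under unmeasured confounding."""
    return cov(x, y) / cov(x, x)


def iv_slope(z, x, y):
    """Single-instrument IV estimate: cov(z, y) / cov(z, x)."""
    return cov(z, y) / cov(z, x)


# Simulated data (purely illustrative, not study data):
# z = instrument (e.g. an inversion-day indicator), u = unmeasured confounder,
# x = exposure, y = outcome. z affects y only through x.
rng = random.Random(0)
TRUE_EFFECT = 0.5
z, x, y = [], [], []
for _ in range(20000):
    zi = 1.0 if rng.random() < 0.3 else 0.0
    ui = rng.gauss(0.0, 1.0)
    xi = 2.0 * zi + ui + rng.gauss(0.0, 1.0)
    yi = TRUE_EFFECT * xi + 3.0 * ui + rng.gauss(0.0, 1.0)
    z.append(zi)
    x.append(xi)
    y.append(yi)

naive_estimate = ols_slope(x, y)   # pulled away from TRUE_EFFECT by u
iv_estimate = iv_slope(z, x, y)    # close to TRUE_EFFECT if z is a valid instrument
```

In this simulation the naive slope absorbs the confounder's effect, while the IV estimate stays near the true coefficient because the instrument is independent of the confounder by construction. With real data that independence is an assumption that cannot be tested directly.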
Using TI as an IV, we estimated the causal effects of AQI on outpatient visits for AD in children by quantifying relative risks associated with lagged and moving average AQI exposures, utilizing a two-stage Generalized Method of Moments estimation model. Results: A total of 1,394,739 AD outpatient visits for children between the ages of 0 and 9 years were analyzed. The effects of an interquartile range (IQR) increase in AQI on AD outpatient visits for children were significantly positive from lag 0 (visit date) to lag 7 (7 days before the visit date), except at lag 1, with the strongest effect at lag 0 (relative risk (RR): 1.129, 95% confidence interval (CI): 1.084-1.177, p < .0001). Furthermore, the effects of an IQR increase in the moving average (MA) of AQI, spanning from MA0-1 (the average of lag 0 and lag 1) to MA0-7 (the average of lag 0 through lag 7), grew as the averaging window lengthened, from MA0-1 (RR: 1.104, 95% CI: 1.061-1.150, p < .0001) to MA0-7 (RR: 1.207, 95% CI: 1.150-1.267, p < .0001). Conclusions: This study demonstrates the substantial influence of increases in AQI on outpatient visits for AD, particularly in children. By using TI as an IV, the study reduces the influence of potential confounding factors and strengthens the hypothesis of a causal association between air quality and AD outpatient visits. Multiple lag days and MA windows were analyzed, and the results suggest that cumulative or extended exposure to elevated AQI values may gradually raise the likelihood of AD-related medical visits by children. These findings highlight the importance, from a public health perspective, of taking preemptive steps to regulate air pollution, particularly during conditions favorable to TI.
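The lagged and moving-average exposures, and the conversion of a model coefficient into a relative risk per IQR increase, can be sketched as follows. This assumes a log-linear outcome model, in which RR = exp(beta × increment); the abstract does not state the functional form, so that is an assumption. The function names, the AQI values, and the IQR of 30 are made up for illustration.

```python
import math


def lagged(series, lag):
    """Exposure `lag` days earlier for each day; None where history is missing."""
    return [series[t - lag] if t >= lag else None for t in range(len(series))]


def moving_average(series, max_lag):
    """MA0-k: mean of lag 0 through lag k for each day; None until k days exist."""
    out = []
    for t in range(len(series)):
        if t < max_lag:
            out.append(None)
        else:
            window = series[t - max_lag:t + 1]
            out.append(sum(window) / len(window))
    return out


def rr_per_increment(beta, increment):
    """Relative risk for an `increment` rise in exposure, assuming a log-linear model."""
    return math.exp(beta * increment)


# Made-up daily AQI values (not study data).
aqi = [50.0, 60.0, 80.0, 70.0, 90.0]
lag1 = lagged(aqi, 1)           # [None, 50.0, 60.0, 80.0, 70.0]
ma01 = moving_average(aqi, 1)   # [None, 55.0, 70.0, 75.0, 80.0]

# Hypothetical IQR of 30 AQI units: the per-unit coefficient that would
# correspond to an RR of 1.129 per IQR increase under the log-linear assumption.
HYPOTHETICAL_IQR = 30.0
beta = math.log(1.129) / HYPOTHETICAL_IQR
rr = rr_per_increment(beta, HYPOTHETICAL_IQR)
```

Scaling by the IQR makes relative risks comparable across exposure definitions (single lags versus moving averages) whose raw units have different spreads.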

Nr: 174
Title:

From Unstructured Text to Actionable Knowledge: A Framework Integrating RAG, LLM, and Process Mining for Automated Process Discovery

Authors:

Szabina Fodor

Abstract: The rapid growth of accessible unstructured data—driven by digitization and evolving regulatory demands—poses significant challenges for effective information management. Users struggle with information overload, reduced coherence across documents, and content that quickly becomes outdated or invalid. Large Language Models (LLMs) have emerged as powerful tools for processing unstructured text, and their potential has been widely recognized in the scientific community. However, the output of LLMs can lack transparency and reliability due to the untraceable nature of their responses and their inability to reflect up-to-date knowledge, which is particularly problematic in fast-changing domains. To address these limitations, Retrieval-Augmented Generation (RAG) has gained traction as a hybrid approach that combines information retrieval from trusted sources with LLM-based text generation. RAG enhances the trustworthiness, accuracy, and controllability of outputs, enabling more robust and dynamic knowledge extraction. Building upon this, our research explores a novel application of RAG-based content extraction as a foundation for automated process discovery. By integrating process mining—a rapidly advancing field focused on deriving process models from data—we propose a comprehensive, end-to-end framework that transforms unstructured text into executable process diagrams. While prior attempts have used rule-based NLP methods for extracting process logic, they lack scalability and adaptability. In contrast, our approach leverages the capabilities of modern LLMs and retrieval pipelines to generate flexible, explainable, and maintainable process models. The framework is implemented in Python and designed with modularity in mind, allowing for future expansion and reuse. As a proof of concept, we evaluate the generated process models independently and within the context of a higher education teaching workflow. 
This research contributes a novel, automated solution for converting unstructured knowledge into structured, actionable insights—bridging the domains of information retrieval, generative AI, and process mining.
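A very small sketch of the two ends of such a pipeline is given below: a retrieval step that selects the passage most relevant to a query, and a process discovery step that builds a directly-follows graph from activity sequences. This is not the framework described in the abstract. The token-overlap retrieval stands in for a real retriever, the LLM extraction step is replaced by hand-written activity sequences, and all function names and example texts are assumptions.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def retrieve(query, passages, k=1):
    """Rank passages by how many distinct query tokens they share (toy retriever)."""
    query_tokens = set(tokenize(query))
    ranked = sorted(
        passages,
        key=lambda p: len(query_tokens & set(tokenize(p))),
        reverse=True,
    )
    return ranked[:k]


def directly_follows(traces):
    """Count how often activity a is immediately followed by activity b."""
    dfg = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            dfg[(a, b)] += 1
    return dfg


# Made-up source passages.
passages = [
    "Students submit the assignment, then the teacher grades it.",
    "The cafeteria opens at noon.",
]
top = retrieve("How is an assignment graded?", passages)

# Placeholder for the LLM extraction step: in a real pipeline these ordered
# activity sequences would be extracted from the retrieved text.
traces = [
    ["submit", "grade", "publish"],
    ["submit", "grade", "revise", "grade", "publish"],
]
dfg = directly_follows(traces)
```

The directly-follows counts are a common starting point for process discovery algorithms. Turning them into an executable process diagram requires further steps that this sketch does not cover.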