Ingeniería Informática
Permanent URI for this community
Browse
Browsing Ingeniería Informática by Author "Awiti, Judith"
Now showing 1 - 2 of 2
Results Per Page
Sort Options
artículo de publicación periódica.listelement.badge Design and implementation of ETL processes using BPMN and relational algebra(2020-06-13) Awiti, Judith; Vaisman, Alejandro Ariel; Zimányi, Esteban"Extraction, transformation, and loading (ETL) processes are used to extract data from internal and external sources of an organization, transform these data, and load them into a data warehouse. The Business Process Modeling and Notation (BPMN) has been proposed for expressing ETL processes at a conceptual level. A different approach is studied in this paper, where relational algebra (RA), extended with update operations, is used for specifying ETL processes. In this approach, data tasks in an ETL workflow can be automatically translated into SQL queries to be executed over a DBMS. To illustrate this study, the paper addresses the problem of updating Slowly Changing Dimensions (SCDs) with dependencies, that is, the case when updating a SCD table impacts on associated SCD tables. Tackling this problem requires extending the classic RA with update operations. The paper also shows the implementation of a portion of the TPC-DI benchmark that results from both approaches. Thus, the paper presents three implementations: (a) An SQL implementation based on the extended RA-based specification of an ETL process expressed in BPMN4ETL; and (b) Two implementations of workflows that follow from BPMN4ETL, one that uses the Pentaho DI tool, and another one that uses Talend Open Studio for DI. Experiments over these implementations of the TPC-DI benchmark for different scale factors were carried out, and are described and discussed in the paper, showing that the extended RA approach results in more efficient processes than the ones produced by implementing the BPMN4ETL specification over the mentioned ETL tools. The reasons for this result are also discussed."ponencia en congreso.listelement.badge From conceptual to logical ETL design using BPMN and relational algebra(2019) Awiti, Judith; Vaisman, Alejandro Ariel; Zimányi, Esteban"Extraction, transformation, and loading (ETL) processes are used to extract data from internal and external sources of an organization, transform these data, and load them into a data warehouse. The Business Process Modeling Notation (BPMN) has been proposed for expressing ETL processes at a conceptual level. This paper extends relational algebra (RA) with update operations for specifying ETL processes at a logical level. In this approach, data tasks can be automatically translated into SQL queries to be executed over a DBMS. An extension of RA is presented, as well as a translation mechanism from BPMN to the RA specification. Throughout the paper, the TPC-DI benchmark is used for comparing both approaches. Experiments show the efficiency of the resulting ETL flow with respect to the Pentaho Data Integration tool."