Presentaciones a Congresos
Browsing Presentaciones a Congresos by Subject "ANALISIS DE DATOS"
Conference paper
Data quality in a big data context (2018)
Arolfo, Franco A.; Vaisman, Alejandro Ariel
"In each of the phases of a Big Data analysis process, data quality (DQ) plays a key role. Given the particular characteristics of the data at hand, the traditional DQ methods used for relational databases, based on quality dimensions and metrics, must be adapted and extended, in order to capture the new characteristics that Big Data introduces. This paper dives into this problem, re-defining the DQ dimensions and metrics for a Big Data scenario, where data may arrive, for example, as unstructured documents in real time. This general scenario is instantiated to study the concrete case of Twitter feeds. Further, the paper also describes the implementation of a system that acquires tweets in real time, and computes the quality of each tweet, applying the quality metrics that are defined formally in the paper. The implementation includes a web user interface that allows filtering the tweets, for example by keywords, and visualizing the quality of a data stream in many different ways. Experiments are performed and their results discussed."

Conference paper
Modelling and querying star and snowflake warehouses using graph databases (2019)
Vaisman, Alejandro Ariel; Besteiro, María Florencia; Valverde Melito, Maximiliano Javier
"In current “Big Data” scenarios, graph databases are increasingly being used. Online Analytical Processing (OLAP) operations can expand the possibilities of graph analysis beyond the traditional graph-based computation. This paper studies graph databases as an alternative to implement star and snowflake schemas, the typical choices for data warehouse design. For this, the MusicBrainz database is used. A data warehouse for this database is designed, and implemented over a Postgres relational database. This warehouse is also represented as a graph, and implemented over the Neo4j graph database. A collection of typical OLAP queries is used to compare both implementations. The results reported here show that in ten out of thirteen queries tested, the graph implementation outperforms the relational one, in ratios that go from 1.3 to 26 times faster, and performs similarly to the relational implementation in the three remaining cases."

Conference paper
Performing OLAP over graph data: query language, implementation, and a case study (2017-08)
Gómez, Leticia Irene; Kuijpers, Bart; Vaisman, Alejandro Ariel
"In current Big Data scenarios, traditional data warehousing and Online Analytical Processing (OLAP) operations on cubes are clearly not sufficient to address the current data analysis requirements. Nevertheless, OLAP operations and models can expand the possibilities of graph analysis beyond the traditional graph-based computation. In spite of this, there is not much work on the problem of taking OLAP analysis to the graph data model. In previous work we proposed a multidimensional (MD) data model for graph analysis, that considers not only the basic graph data, but background information in the form of dimension hierarchies as well. The graphs in our model are node- and edge-labelled directed multi-hypergraphs, called graphoids, defined at several different levels of granularity. In this paper we show how we implemented this proposal over the widely used Neo4j graph database, discuss implementation issues, and present a detailed case study to show how OLAP operations can be used on graphs."

Conference paper
Temporal SOLAP: query language, implementation, and a use case (2012)
Bisceglia, Pablo; Gómez, Leticia Irene; Vaisman, Alejandro Ariel
"The integration of Geographic Information Systems (GIS) and On-Line Analytical Processing (OLAP), denoted SOLAP, is aimed at exploring and analyzing spatial data. In real-world SOLAP applications, spatial and non-spatial data are subject to changes. In this paper we present a temporal query language for SOLAP, called TPiet-QL, supporting so-called discrete changes (for example, in land use or cadastral applications there are situations where parcels are merged or split). TPiet-QL allows expressing integrated GIS-OLAP queries in a scenario where spatial objects change across time. We also present a prototype implementation, and show how this application is used in a real-world scenario: the analysis of protected areas in Uruguay."