Browsing Conference Presentations by Subject "ANALISIS DE DATOS"
Showing 1 - 6 of 6
- Conference paper: Data quality in a big data context (2018). Arolfo, Franco A.; Vaisman, Alejandro Ariel. "In each of the phases of a Big Data analysis process, data quality (DQ) plays a key role. Given the particular characteristics of the data at hand, the traditional DQ methods used for relational databases, based on quality dimensions and metrics, must be adapted and extended, in order to capture the new characteristics that Big Data introduces. This paper dives into this problem, re-defining the DQ dimensions and metrics for a Big Data scenario, where data may arrive, for example, as unstructured documents in real time. This general scenario is instantiated to study the concrete case of Twitter feeds. Further, the paper also describes the implementation of a system that acquires tweets in real time, and computes the quality of each tweet, applying the quality metrics that are defined formally in the paper. The implementation includes a web user interface that allows filtering the tweets, for example by keywords, and visualizing the quality of a data stream in many different ways. Experiments are performed and their results discussed."
- Conference paper: Development of an optimized LEB filter and its application to INS/GPS test data (1993). Antonini, Claudio Daniel. "An optimized linear-ellipsoidal-bounded (LEB) filter has been developed and applied to data obtained from a ground test using a combined INS/GPS configuration. In this cascaded configuration, the filter receives eight outputs from the INS (accelerations, velocity, angles, altitude) and six outputs from the GPS (velocities and positions). The GPS measurements have included the effect of selective availability (SA), of varying or unknown spectrum, which, although likely to be estimated and compensated with some modelling techniques (at the expense of including extra state variables), could also be dealt with using the approach indicated in this article with much less effort. The SA effect is modelled as an unknown-but-bounded (UBB) noise process. Comparisons with an extended Kalman filter (KF) show that KF innovations are not white and the LEB filter innovations are one order of magnitude smaller than those produced by the KF. A simple second order example is developed to show the behavior of the LEB filter when compared to a KF."
- Conference paper: Modelling and querying star and snowflake warehouses using graph databases (2019). Vaisman, Alejandro Ariel; Besteiro, María Florencia; Valverde Melito, Maximiliano Javier. "In current “Big Data” scenarios, graph databases are increasingly being used. Online Analytical Processing (OLAP) operations can expand the possibilities of graph analysis beyond the traditional graph-based computation. This paper studies graph databases as an alternative to implement star and snowflake schemas, the typical choices for data warehouse design. For this, the MusicBrainz database is used. A data warehouse for this database is designed, and implemented over a Postgres relational database. This warehouse is also represented as a graph, and implemented over the Neo4j graph database. A collection of typical OLAP queries is used to compare both implementations. The results reported here show that in ten out of thirteen queries tested, the graph implementation outperforms the relational one, in ratios that go from 1.3 to 26 times faster, and performs similarly to the relational implementation in the three remaining cases."
- Conference paper: On some goodness-of-fit tests and their connection to graphical methods with uncensored and censored data (2019). Castro-Kuriss, Claudia; Huerta, Mauricio; Leiva, Víctor; Tapia, Alejandra. "In this work, we present goodness-of-fit tests related to the Kolmogorov-Smirnov and Michael statistics and connect them to graphical methods with uncensored and censored data. The Anderson-Darling test is often empirically more powerful than the Kolmogorov-Smirnov test. However, the former one cannot be related to graphical tools by means of probability plots, as the Kolmogorov-Smirnov test can. The Michael test is, in some cases, more powerful than the Anderson-Darling and Kolmogorov-Smirnov tests and can also be related to probability plots. We consider the Kolmogorov-Smirnov and Michael tests for detecting whether any distribution is suitable or not to model censored or uncensored data. We conduct numerical studies to show the performance of these tests and the corresponding graphical tools. Some comments related to big data and lifetime analysis, under the context of this study, are provided in the conclusions of this work."
- Conference paper: Performing OLAP over graph data: query language, implementation, and a case study (2017-08). Gómez, Leticia Irene; Kuijpers, Bart; Vaisman, Alejandro Ariel. "In current Big Data scenarios, traditional data warehousing and Online Analytical Processing (OLAP) operations on cubes are clearly not sufficient to address the current data analysis requirements. Nevertheless, OLAP operations and models can expand the possibilities of graph analysis beyond the traditional graph-based computation. In spite of this, there is not much work on the problem of taking OLAP analysis to the graph data model. In previous work we proposed a multidimensional (MD) data model for graph analysis that considers not only the basic graph data, but background information in the form of dimension hierarchies as well. The graphs in our model are node- and edge-labelled directed multi-hypergraphs, called graphoids, defined at several different levels of granularity. In this paper we show how we implemented this proposal over the widely used Neo4j graph database, discuss implementation issues, and present a detailed case study to show how OLAP operations can be used on graphs."
- Conference paper: Temporal SOLAP: query language, implementation, and a use case (2012). Bisceglia, Pablo; Gómez, Leticia Irene; Vaisman, Alejandro Ariel. "The integration of Geographic Information Systems (GIS) and On-Line Analytical Processing (OLAP), denoted SOLAP, is aimed at exploring and analyzing spatial data. In real-world SOLAP applications, spatial and non-spatial data are subject to changes. In this paper we present a temporal query language for SOLAP, called TPiet-QL, supporting so-called discrete changes (for example, in land use or cadastral applications there are situations where parcels are merged or split). TPiet-QL allows expressing integrated GIS-OLAP queries in a scenario where spatial objects change across time. We also present a prototype implementation, and show how this application is used in a real-world scenario: the analysis of protected areas in Uruguay."