2023 | Information Systems Frontiers | Citations: 0
Authors: Forresi, Chiara; Francia, Matteo; Gallinucci, Enrico; Golfarelli, Matteo
Abstract: Multistores are data management systems that enable query processing across different and heterogeneous databases; besides the distribution of data, complexity factors like schema heterogeneity and data replication must be resolved through integration and data fusion activities. Our multistore solution relies on a dataspace to provide the user with an integrated view of the available data and enables the formulation and execution of GPSJ queries. In this paper, we propose a technique to optimize the execution of GPSJ queries by formulating and evaluating different execution plans on the multistore. In particular, we outline different strategies to carry out joins and data fusion by relying on different schema representations; then, a self-learning black-box cost model is used to estimate execution times and select the most efficient plan. The experiments assess the effectiveness of the cost model in choosing the best execution plan for the given queries and exploit multiple multistore benchmarks to investigate the factors that influence the performance of different plans.
Semantic filters:
Redis
Topics:
database system MongoDB PostgreSQL document database relational database
Methods:
computational algorithm experiment case study regression analysis method longitudinal research
Application Massive Data Processing Platform for Smart Manufacturing Based on Optimization of Data Storage
2023 | ACM Transactions on Management Information Systems | Citations: 0
Authors: Ren, Bin; Chen, Yuquiang; Wang, Fujie
Abstract: The aim of smart manufacturing is to reduce the manpower requirements of the production line by applying big-data technology to the manufacturing industry. Smart manufacturing is also called Industry 4.0, and the platform for processing huge amounts of data plays an indispensable role in it. The massive data processing platform is like the brain of the entire factory: it receives all data from production line sensors via edge computing, processes and analyzes it, and finally makes feedback decisions. With the innovation of production technology, the data that the platform needs to process has become diverse and complex, and its volume has grown increasingly large. In addition, many precision manufacturing industries have begun to enter the field of Industry 4.0. Beyond the accuracy and availability of data processing, there is an emphasis on its real-time nature: after the sensor receives the data, the platform must provide feedback within a short period of time. This article proposes a massive data processing platform based on the Lambda architecture, in which stream processing and batch processing coexist to meet the real-time feedback needs of high-precision manufacturing. To verify the effectiveness of the optimization, the evaluation is based on real data from the manufacturing industry, from which a large amount of test data is generated to confirm the optimization of image storage. The results show that the platform optimizes the storage of the image data generated by the Automated Optical Inspection technology used in manufacturing today and optimizes queries over the stored data. It also reduces the heavy memory consumption, as expected, and shortens Hive query times.
Abstract: Real-time data collection and analytics is a desirable but challenging feature to provide in data-intensive software systems. Providing highly concurrent and efficient real-time analytics on streaming data at interactive speeds requires a well-designed software architecture that makes use of a carefully selected set of software frameworks. In this paper, we report on the design and implementation of the Incremental Data Collection & Analytics Platform (IDCAP). The IDCAP provides incremental data collection and indexing in real-time of social media data; support for real-time analytics at interactive speeds; highly concurrent batch data processing supported by a novel data model; and a front-end web client that allows an analyst to manage IDCAP resources, to monitor incoming data in real-time, and to provide an interface that allows incremental queries to be performed on top of large Twitter datasets.
Semantic filters:
Redis
Topics:
Twitter Cassandra mobile application Redis Apache Solr
Methods:
design artifact literature study data modeling