Applying Optimized Algorithms and Technology for Interconnecting Big Data Resources in Government Institutions

Authors

DOI:

https://doi.org/10.3991/ijoe.v19i08.39661

Keywords:

data quality assessment, Levenshtein distance (LV) algorithm, data quality improvement

Abstract


The quality of data in core electronic registers has steadily declined. A growing number of databases has been created to provide electronic services for public administration, and the lack of data harmonization and interoperability between these databases has led to numerous errors and inconsistencies in the data.
Evaluating and improving data quality by matching and linking records across multiple data sources is exceedingly difficult: the sources hold very large volumes of data, use different data architectures, and share no unique field through which they can be interconnected.
Various algorithms have been developed to address these issues. Our focus is on algorithms that can handle large amounts of data, such as the Levenshtein distance (LV) algorithm and the Damerau-Levenshtein distance (DL) algorithm.
To analyze and evaluate the effectiveness of these algorithms and the quality of the data they are applied to, and to make improvements to the algorithms, in this paper we conduct experiments on large data sets with more than 1 million records.
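
For readers unfamiliar with the two distance measures named in the abstract, the following is a minimal textbook sketch in Python. It is not the optimized implementation evaluated in the paper, whose details the abstract does not give. The function names `levenshtein`, `osa_distance`, and `similarity` are illustrative assumptions, and the second function implements the restricted Damerau-Levenshtein variant (optimal string alignment), which counts a swap of two adjacent characters as one edit but never edits the same substring twice.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein distance (optimal string alignment).

    Same as Levenshtein, plus transposition of two adjacent characters
    as a single edit.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # adjacent transposition
    return d[-1][-1]


def similarity(a: str, b: str, distance=levenshtein) -> float:
    """Normalize a distance to a score in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - distance(a, b) / longest
```

In a record-linkage setting without a shared unique field, a score such as `similarity(name_1, name_2, osa_distance)` could be compared against a chosen threshold to decide whether two entries probably refer to the same person. Both functions take time proportional to the product of the two string lengths, which is one reason plain pairwise comparison becomes costly on registers with millions of records.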

Published

2023-06-27

How to Cite

Hamzaj, G., Mazrekaj, A., & Shabani, I. (2023). Applying Optimized Algorithms and Technology for Interconnecting Big Data Resources in Government Institutions. International Journal of Online and Biomedical Engineering (iJOE), 19(08), pp. 4–18. https://doi.org/10.3991/ijoe.v19i08.39661

Section

Papers