Applying Optimized Algorithms and Technology for Interconnecting Big Data Resources in Government Institutions
DOI:
https://doi.org/10.3991/ijoe.v19i08.39661
Keywords:
data quality assessment, Levenshtein distance (LV) algorithm, data quality improvement
Abstract
The quality of the data in core electronic registers has steadily decreased. The growing number of databases created to provide electronic services for public administration, together with the lack of data harmonization and interoperability between these databases, has led to numerous errors and inconsistencies in the data.
Evaluating and improving data quality by matching and linking records from multiple data sources becomes exceedingly difficult because these sources hold a very large volume of data, use different data architectures, and share no unique field through which they can be interconnected.
Different algorithms have been developed to address these issues. Our focus is on algorithms that handle large amounts of data, such as the Levenshtein distance (LV) algorithm and the Damerau-Levenshtein distance (DL) algorithm.
To analyze and evaluate the effectiveness of these algorithms for assessing data quality, and to make improvements to the algorithms themselves, in this paper we conduct experiments on large data sets with more than 1 million records.
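For readers unfamiliar with the two distances named in the abstract, the following is a minimal Python sketch of their textbook forms. It is not the authors' optimized implementation, which the abstract does not describe. The function names `levenshtein` and `osa_distance` are hypothetical, and the second function assumes the restricted "optimal string alignment" variant of Damerau-Levenshtein; the abstract does not say which variant the paper uses.

```python
def levenshtein(a: str, b: str) -> int:
    """Textbook Levenshtein distance: insertions, deletions, substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the rolling row as short as possible
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def osa_distance(a: str, b: str) -> int:
    """Damerau-Levenshtein distance, optimal string alignment variant
    (assumed here): also counts a swap of two adjacent characters as one edit."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]
```

When two registers share no unique key, a small distance between name or address fields can serve as evidence that two records refer to the same entity. The transposition rule matters for typing errors: `levenshtein("ca", "ac")` counts two substitutions, while `osa_distance("ca", "ac")` counts a single swap.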
License
Copyright (c) 2023 Genc Hamzaj, Artan Mazrekaj, Isak Shabani
This work is licensed under a Creative Commons Attribution 4.0 International License.
The submitting author warrants that the submission is original and that she/he is the author of the submission together with the named co-authors; to the extent the submission incorporates text passages, figures, data or other material from the work of others, the submitting author has obtained any necessary permission.
Articles in this journal are published under the Creative Commons Attribution Licence (CC-BY). This provides more legal certainty about what readers can do with published articles, and thus wider dissemination and archiving, which in turn makes publishing with this journal more valuable for you, the authors.
By submitting an article, the author grants to this journal the non-exclusive right to publish it. The author retains the copyright and the publishing rights for the article without any restrictions.