A Practical Analysis of Mobile Data Collection Apps

Data collection has become an activity inherent in the emergence of any organization. The digital age has enabled the development of mobile data collection apps, which are becoming increasingly common around the world. But faced with the growing number of apps on offer, Data Managers often struggle to choose the solution that best suits their case. This study meets this need by providing clear, precise and verified information on each of the selected solutions. It presents, analyzes and compares four mobile data collection solutions. To achieve an effective comparison, we first collected and selected papers on each of the solutions, and then installed and tested each of them by executing a data collection process, all the way from form creation to the visualization of collected data. The comparison presented in this paper is based on technical aspects but also on other important aspects, to help users make a good decision.
Keywords: Mobile data collection; Comparison; ODK; ODK-X; KoBoToolbox; Pendragon Forms.


Introduction
Data collection and management have become activities inherent in the emergence of any organization [1][2][3]. Thanks to the data collected from users, an organization can considerably improve its services, launch a new product or generate statistics.
"Knowing, before planning, before doing." This famous quote from the 19th-century French philosopher Auguste Comte is even truer today, with the COVID-19 pandemic, than it was two centuries ago. In this context, the World Health Organization (WHO) says about COVID-19: "epidemiological exposure data and biological samples can be systematically collected and shared rapidly in a format that can be easily aggregated, tabulated and analyzed across many different settings globally for timely estimates of 2019-nCoV infection severity and transmissibility, as well as to inform public health responses and policy decisions". Accordingly, WHO has provided a protocol for the investigation of the first few cases and their contacts for COVID-19.
Paper forms were once the preferred means of collecting data in field surveys, but the advent of digital technology and the marketing of the smartphone in 2007 [4][5] led to the abandonment of this method, which has become archaic. Mobile data collection has become one of the easiest and most widely used techniques among organizations around the world.
But the ever-increasing number of tools makes it difficult for Data Managers (or any person or organization interested in mobile data collection) to choose the right one. In order to provide clear, precise and verified information allowing them to choose the tools best suited to their context and needs, this paper presents, evaluates and compares four software solutions for mobile data collection:
• Pendragon Forms
• KoBoToolbox
• Open Data Kit (ODK)
• Open Data Kit X (ODK-X)
As our research [5][6] is focused on mobile data collection and the way it improves the daily lives of populations, the applications studied were chosen mainly because of their wide adoption, the abundance of research papers available, and for three of them, because of their humanitarian nature.
This paper is organized as follows: Section 2 presents the method followed to analyze and compare the four solutions and then describes the evolution, structure and operation of each; Section 3 compares the four solutions on non-technical criteria; Section 4 performs a second comparison, this time on technical criteria; and Section 5 concludes the study with recommendations for choosing a solution, as well as a proposal to improve mobile data collection software.

Method and Materials
The comparison in this paper follows a well-structured research methodology. This section cites and explains the steps that make up this methodology. It also presents the tools and versions of the software that were used.

Method
This paper makes a double contribution: on the one hand, the result of research on the four software solutions, and on the other hand, the result of testing each of them. This requires a rigorous approach in four stages. The approach made it possible to compare the four solutions taking into account not only the technical aspects (difficulty or ease of installation of the software suites, creation of forms, synchronization of data, etc.) but also other aspects such as the documentation of the suite, the cases already carried out with the suite, the community of users, the open source or commercial nature of the suite, etc. The four stages of the adopted method are as follows:
• Collection and Selection of Papers: For each software suite, we collected documents that allow us to better understand it and identify its functionalities. Papers were collected first on the official websites [7][8][9], then three scientific databases were consulted: Google Scholar, Springer and Science Direct. The scientific databases allowed us to collect published research papers, and the official websites allowed us to obtain the official documentation, when available. The selection took into account the contribution of each paper to understanding the architecture or the use of the software, but also its contribution to tracing the evolutionary history of the software.
• Reading documents and extracting data: The documents selected in the previous step were carefully studied to extract the elements necessary for a proper understanding of the software.
• Installation of software suite tools: In order to test each suite, its mobile and desktop tools were installed. This process helped determine the technical complexity of setting up the platforms, thus giving a clear idea of the technical level required to use the suite.
• Execution of the data workflow on the software: For each suite, we created and deployed a data collection form on a mobile application. Then, we carried out a simple data collection, submitted the collected data to the server and finally visualized this data with the tool provided by the software suite. This step allowed us to determine the complexity of use of each software suite, a determining factor for organizations in the choice of software to use.
Fig. 1 summarizes the method detailed above.

Fig. 1. Research Methodology
This method led to two comparisons. The first, based on general features and detailed in Section 3, is the result of the first two steps of the method. The second, based on technical aspects and detailed in Section 4, is the result of the last two steps.

Pendragon Forms
Pendragon Forms is one of the trademarks of Pendragon Software Corporation [10]. Pendragon Forms is a form designer and mobile data collection software [7]. The solution mainly offers a desktop form design tool and a mobile data collection application. That said, the architecture of Pendragon Forms has evolved significantly over time, adapting to the evolution of technology.
In 2007, version 5.1 of Pendragon Forms was designed [11]. In this version, data collection was done on a laptop or on the Palm Treo (a smartphone of that time). Then, in 2010, version 6 appeared [12]. In this version, data collection was done via a website that could be used both on a mobile device and on a computer. It was in 2014, with version 7 [13], that Pendragon Forms introduced a mobile data collection application adapted to new-generation smartphones.
This work used Pendragon Forms version 8. In this version, the software solution offers four functionalities / services:
• Mobile Form Design: For creating basic mobile data collection forms with ordinary logic and workflow. No programming skills are required.
• Custom Development: The Pendragon Forms service that facilitates the design of personalized and complex mobile forms [14].
• Mobile Database Synchronization: Collected data is stored by default in the Pendragon Forms cloud database and can be downloaded in Excel format.
• Integrating Mobile Application: The option to integrate Pendragon Forms with existing mobile applications [15].

KoBoToolbox
KoBoToolbox is open-source software used to collect and analyze data by field sampling (e.g. needs assessment), using mobile devices such as mobile phones or tablets. It is especially suitable for harsh environments, in particular humanitarian fields and emergency situations [8,16]. KoBoToolbox was developed by members of the Harvard Humanitarian Initiative [8,17].
Despite our extensive searches, our discussions with the KoBoToolbox support team and our messages on the forum, the weak documentation of KoBoToolbox did not allow us to trace the evolutionary history of this software solution.
To date, KoBoToolbox offers three main functionalities fulfilled by three tools:
• Building Forms: The software offers a form builder that makes it possible to reuse existing questionnaires and blocks of questionnaires; to create forms whose fields can be filled and validated in any order; and to import / export XLS forms.
• Collecting Data: This functionality is provided by the mobile application KoBoCollect or by the web form Enketo, which can be used in any browser. KoBoCollect and Enketo allow data collection with or without an internet connection (online and offline).
• Analyzing and Managing Data: The solution offers a web tool that allows us to create summary reports with graphs and tables; view the collected data on a map; group the collected data in a report or on a map; and export data in Excel, CSV, KML, ZIP (for media) and SPSS formats. This functionality is provided jointly by the web tool and the KoBoToolbox server.
For this study, we installed and used KoBoCollect v1.23.3k. For the design of the forms, we used the online version of the form designer of KoBoToolbox.

Open Data Kit (ODK)
Open Data Kit (ODK) is a suite of open source tools allowing organizations to collect and manage their data [18][19].
A broader description of the architecture of ODK and of its functioning was published in [5], and the history of its evolution and use in [6].
To date, ODK offers an architecture that divides the tools into three sets:
• Desktop Clients: These are the tools to install on a laptop or desktop computer. ODK Build is the ODK graphical tool for creating forms. ODK XLSForm is intended for creating forms more complex than those of Build and for converting Excel files into the XForm format supported by ODK tools. ODK Briefcase allows importing and exporting forms on ODK servers [18,5].
• Mobile Client: This is ODK Collect, which is compatible only with Android devices and uses forms created with ODK Build to collect data [20,5].
• Servers: ODK offers two servers. ODK Aggregate allows storage, analysis and visualization of collected data [21]. ODK Central is an alternative server that supports newer technologies such as a REST API [22,5].
For this study, we installed and used ODK Collect v1.24.1 and ODK Aggregate v1.7.3. We used the online web designer of ODK to create forms.

Open Data Kit X (ODK-X)
ODK-X, formerly known as ODK 2.0, is the new ODK toolkit [23][24]. Like ODK, ODK-X makes it possible to collect, store and analyze data. Unlike its predecessor, ODK-X allows creating forms for studies with very complex workflows and requires quite advanced technical skills for its use.
A broader description of the architecture of ODK-X and of its functioning was published in [5], and the history of its creation, evolution and use in [6].
Like ODK, ODK-X offers an architecture that divides tools into three sets:
• Desktop Clients: ODK-X offers two desktop tools. ODK Application Designer is the equivalent of ODK Build; it allows creating the forms that will be used by ODK-X mobile applications. ODK Suitcase is the equivalent of ODK Briefcase; it allows importing and exporting forms on ODK-X servers [5,25].
• Mobile Clients: Unlike ODK, ODK-X offers not one mobile tool but three. ODK Survey and ODK Tables [26] are mobile applications used to collect data. ODK Services is installed as a prerequisite for the use of the other two [5,25].
• Servers: ODK-X has one server, ODK Sync Endpoint. In addition, any ODK Aggregate v1.x.x server can be configured to support ODK-X tools. It should be noted that this configuration option was removed in ODK Aggregate v2.x.x.
For this study, we used and when necessary installed ODK-X Services v2.

General Features
This section deals with the non-technical but essential aspects when choosing the solution to be used for mobile data collection. Table 1 gives the result of this assessment for the four solutions.
The criteria used are:
• Free and Open Source: This must be one of the first, if not the first, pieces of information to have before choosing software. It means that the software can be used, modified (accessible source code) and shared without paying, because its design is public.
• Online Documentation: Knowing whether the software has extensive documentation on the internet indicates in advance whether it will be easy to understand. The best documented software is often the best understood.
• Reactivity of the support team: Even when software has good online documentation, technicians or engineers still need to contact the support team for technical assistance. It is therefore better to know in advance whether this team reacts quickly.
• Frequency of software updates: This indicates whether current bugs in the software will be fixed and/or new features added.
• Users Community: This indicates whether the software has already been used by many people around the world.
The first piece of information in Table 1 (Free and Open Source) is essential. It shows that if users want to extend the software solution by adding a specific functionality, and/or if they do not have a large budget for the project, Pendragon Forms may not be the right solution, and one of the other three solutions will have to be considered.
Of the four solutions, ODK has the best documentation. This is due to the fact that ODK has a large community of users, especially in developing countries, but also because it is open source and older than the other two open source software suites (ODK-X and KoBoToolbox). Several engineers and researchers have written papers explaining an improvement, an extension or simply the use of ODK for a specific case.
The ODK documentation includes:
• A well-detailed website.
• Official documentation in PDF format of about 600 (six hundred) pages.
• Numerous research papers to be found in scientific databases.
• Numerous videos on YouTube that show the procedure to install or use the ODK tools.
On the other hand, although ODK-X has as much official documentation as ODK, it does not have as many research papers and YouTube videos as its predecessor. This is explained by the fact that ODK is older than ODK-X (designed in February 2013 [24]). KoBoToolbox has a website with very little information, few research papers and YouTube videos, and no official documentation. Pendragon Forms has a website with enough information to understand the suite and official documentation in PDF format, but almost no research papers or YouTube videos.
In order to test the reactivity of the support teams, we registered on the KoBoToolbox and ODK/ODK-X forums. Pendragon Forms does not have a forum yet, as its support team confirmed by email, so we communicated with that team by email.
We found that a message posted in the ODK/ODK-X forum was answered in less than 24 hours, as well as an email sent to the Pendragon Forms support team. However, a message posted in the KoBoToolbox forum or emailed to the support team could take up to a week to receive a response.

Technical aspects
This axis of analysis deals with aspects related to the installation and use of the tools of the selected solutions as well as those related to the creation and customization of data collection forms or applications. The result of this assessment is summarized in Table 2.
The criteria we selected for this evaluation are:
• Designing of simple custom mobile forms: This criterion assesses how easy it is to create simple but customized forms.
• Designing of complex custom mobile forms: This criterion assesses how easy it is to create complex and customized forms.
• Interoperability with other technologies/tools: This criterion evaluates the possibility of using a solution's tools with tools that do not belong to the solution.
• Programming Languages: This criterion indicates the programming languages used to develop the platform tools. These languages can be used to extend these tools.
The KoBoToolbox, Pendragon Forms and ODK form designers make it easy to create simple forms. However, these designers are very limited when it comes to creating complex forms. Pendragon Forms fills this gap by providing custom development services to support those who want to create complex forms with Pendragon Forms [14].
ODK-X, in contrast, does not offer a graphical form designer. Its form designer allows the creation of mobile collection applications (forms) using JavaScript and HTML. This makes ODK-X more suitable for creating complex forms than simple forms.
Due to their open source nature, the tools of the three open source solutions (ODK, ODK-X and KoBoToolbox) can be used with external tools. Pendragon Forms, although it is a commercial solution, also allows users to use its tools with tools that do not belong to it [15]. It should be noted that ODK tools are not compatible with ODK-X tools, except for the ODK Aggregate server, which can be configured to support ODK-X tools. This configuration of ODK Aggregate is only possible on v1.x.x.
Until v5.1, Pendragon Forms offered a development kit that allowed developers to add new features to the platform using the C and C# programming languages [11]. But the documentation of the versions after v5.1 and the official platform website make no mention of it. We contacted the support team by email, and they confirmed that Pendragon Forms no longer offers this service.

Results and discussion
This section deals with the results obtained after testing the selected software suites, from the creation of the form to the visualization of the collected data. The purpose of the form that we created was to conduct an opinion poll on vaccination in order to find out how vaccines are perceived in a population. Such a study could make it possible to know if it is necessary to carry out an awareness and information campaign before proceeding with a vaccination campaign in a given territory.
To test the four software solutions, we created a form with five fields:
• Full Name: A 'text' field allowing the user to fill in his/her full name.
• Age: A 'number' field, taking only numbers, to fill in the age of the participant.
• Gender: A 'radio' field giving the possibility of making only one choice between the two proposed options ('Female' and 'Male').
• Have you ever been vaccinated? A 'radio' field giving the possibility of making a single choice between the two proposed options ('Yes' and 'No').
• Do you trust vaccination? Why? A 'text' field to receive the opinion of the respondent about vaccinations.
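As an illustration, the five-field test form can be written down as XLSForm-style rows, the spreadsheet form format mentioned above for ODK and KoBoToolbox. The sketch below is a minimal example under stated assumptions: the field and list names (`full_name`, `trust_reason`, `yes_no`, etc.) are hypothetical, and the 'number' and 'radio' fields are mapped to the XLSForm types `integer` and `select_one`. It is not the exact form used in this study.

```python
import csv
import io

# "survey" sheet: one row per question (type, name, label).
# The names are hypothetical; only the labels come from the text.
SURVEY = [
    {"type": "text", "name": "full_name", "label": "Full Name"},
    {"type": "integer", "name": "age", "label": "Age"},
    {"type": "select_one gender", "name": "gender", "label": "Gender"},
    {"type": "select_one yes_no", "name": "vaccinated",
     "label": "Have you ever been vaccinated?"},
    {"type": "text", "name": "trust_reason",
     "label": "Do you trust vaccination? Why?"},
]

# "choices" sheet: the options offered by each single-choice question.
CHOICES = [
    {"list_name": "gender", "name": "female", "label": "Female"},
    {"list_name": "gender", "name": "male", "label": "Male"},
    {"list_name": "yes_no", "name": "yes", "label": "Yes"},
    {"list_name": "yes_no", "name": "no", "label": "No"},
]


def to_csv(rows):
    """Serialize a list of row dictionaries as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def missing_choice_lists(survey, choices):
    """Return choice lists used by select_one questions but never defined."""
    defined = {row["list_name"] for row in choices}
    referenced = {row["type"].split()[1] for row in survey
                  if row["type"].startswith("select_one ")}
    return referenced - defined


if __name__ == "__main__":
    print(to_csv(SURVEY))
    print(to_csv(CHOICES))
```

Each list would be pasted into the corresponding sheet of a spreadsheet; `missing_choice_lists` is only a small consistency check added for this sketch.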

KoBoToolbox and ODK
What these two solutions have in common is that they offer a graphical tool for designing forms but do not allow for the creation of complex forms.
KoBoToolbox offers an online tool (website) that includes both the form designer and the server. KoBoToolbox is project-based. In order to create a form, you first need to set up a project. The created project can be shared publicly or, using email addresses, given to specific users who will have access rights to the project. Fig. 2 shows the KoBoToolbox form designer with the project that has been created for this study.

Fig. 2. KoBoToolbox - Form Designer
On the mobile application (KoBoCollect), in order to access the forms, you must specify the server URL as well as the username and password. Fig. 3 shows a form completely filled in with KoBoCollect. After data collection and submission to the server, the data is visualized on the website, which is hosted on the same server as the form designer. Fig. 4 shows the data visualization on the KoBoToolbox server.
ODK works in the same way as KoBoToolbox, as described above. The difference is that ODK separates the form designer (ODK Build) from the server (ODK Aggregate). This means that getting started with ODK requires more technical skills than with KoBoToolbox. The ODK documentation provides detailed step-by-step tutorials to minimize the technical knowledge required for server configuration [21].

Pendragon Forms and ODK-X
What these two platforms have in common is that they offer the ability to create forms with non-linear and more complex workflows, among many other features. Pendragon Forms offers all of this with the assistance of a team dedicated to supporting customers, while ODK-X provides specific high-performance tools, including its form designer (ODK-X Application Designer), illustrated in Fig. 8. For example, ODK-X offers the possibility of reading data from a structured source (e.g. CSV files) and using the results as response options or to drive the logic of a form. Another example is the launch of sub-forms that store data in different tables [17,27].
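The idea of driving response options from a structured source can be sketched in a few lines. The example below is written in Python purely for illustration and is not ODK-X code (ODK-X forms are built with JavaScript and HTML); the CSV content, the column names `region` and `district`, and the function `options_from_csv` are all hypothetical.

```python
import csv
import io

# Hypothetical structured source; in practice this would be a CSV file
# shipped with the form.
DISTRICTS_CSV = """region,district
North,Alpha
North,Beta
South,Gamma
"""


def options_from_csv(csv_text, region):
    """Return the districts listed for one region, in file order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["district"] for row in reader if row["region"] == region]


if __name__ == "__main__":
    # The answer to an earlier question (the region) filters the options
    # offered by a later question (the district).
    print(options_from_csv(DISTRICTS_CSV, "North"))
```

The same lookup could feed form logic instead of options, for instance by skipping the district question when the returned list is empty.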

Fig. 8. ODK-X Application Designer
At this level, the difference between the two lies in the fact that ODK-X requires sufficient technical skills, especially in software programming, whereas Pendragon Forms requires very little on the client side (company or organization using the software solution).
Another point in common is that both solutions have good online documentation, complemented by a very reactive support team for Pendragon Forms and a very active forum for ODK-X. This helps to mitigate the technical skills required by ODK-X.
In addition, both solutions offer the possibility to integrate their tools into an existing software ecosystem. So, the question of compatibility with existing equipment in the company does not arise.
As for the tools offered and how they work, Pendragon Forms is very similar to KoBoToolbox: it offers a single mobile data collection application, which can be downloaded from the Google Play Store, and a website containing both the graphical form designer shown in Fig. 9 and the server shown in Fig. 10.
ODK-X is quite similar to ODK, with the difference that it offers three mobile applications. One of them, ODK-X Services, must be installed as a mandatory prerequisite for the other two, ODK-X Tables and ODK-X Survey, which can optionally be used for data collection. For this study, we chose to use ODK-X Tables. It should be noted that the ODK-X mobile applications are not available in the Google Play Store. To install them, you need to download the .apk file corresponding to each mobile application from the project's GitHub and then install it on the mobile device. The installation procedure is explained step by step in the ODK documentation [28].

Conclusion and Future Work
This paper presents, evaluates and compares four software solutions for data collection by mobile devices. In order to achieve an effective comparison, we chose to collect and select documentation on each of these solutions and to test each of them by running a data collection process from form creation to data visualization.
This study leads to the following recommendations:
• For companies, organizations or individuals wanting to collect data for a study with a relatively ordinary workflow, ODK and KoBoToolbox are most likely the best choices. The choice between the two suites will be determined by the autonomy and technical skills of the team leading the study. If this team has enough technical skills, especially in server installation and configuration, it is best to opt for ODK, which has a large community of users and a very active forum where help is readily available. If, on the other hand, the team does not have enough technical skills but has a strong capacity to understand the software (autonomy), it is better to opt for KoBoToolbox, whose tools are very intuitive and which does not require the team to install and configure the server.
• For companies, organizations or individuals who want to collect data for a study with a complex and highly personalized workflow, ODK-X and Pendragon Forms are the most suitable. In this case, the choice between the two suites will be based on the financial capacity of the organization as well as the technical skills available. If the organization has sufficient financial capacity and few technical skills in-house, Pendragon Forms, which is a commercial solution, is the best option. On the other hand, if the organization does not have sufficient financial means, ODK-X is the best choice, provided that there are people with programming skills on the team.
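The recommendations above can be summarized as a small decision helper. This is only a sketch: the function `recommend` and its three boolean inputs are hypothetical simplifications of the criteria discussed (workflow complexity, technical skills, budget), and cases the recommendations do not cover return an empty list rather than a guess.

```python
def recommend(complex_workflow, technical_skills, budget):
    """Return the candidate solutions suggested by the recommendations.

    complex_workflow: the study needs a complex, highly personalized workflow.
    technical_skills: the team has server or programming skills.
    budget: the organization can pay for a commercial solution.
    """
    if not complex_workflow:
        # Ordinary workflow: ODK if the team can install and configure a
        # server, otherwise KoBoToolbox. Budget plays no role here.
        return ["ODK"] if technical_skills else ["KoBoToolbox"]

    candidates = []
    if budget:
        # Commercial solution backed by a support team.
        candidates.append("Pendragon Forms")
    if technical_skills:
        # Free and open source, but requires programming skills.
        candidates.append("ODK-X")
    # An empty list means the recommendations do not cover this case
    # (complex workflow, no budget and no programming skills).
    return candidates


if __name__ == "__main__":
    print(recommend(complex_workflow=True, technical_skills=False, budget=True))
```

Returning a list rather than a single name keeps both options visible when an organization has both the budget and the skills, a case in which the final choice depends on context.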
There is therefore no software solution that is better in every respect and under every circumstance. Each case must be considered according to the needs and context of the users.
Furthermore, this study also confirmed one of the observations made by Markus Steinberg in [27]. This observation is that the analysis of collected data is not integrated within mobile data collection platforms. A possible improvement would be to integrate analysis functionalities.
This study focuses on presenting the architectures of the selected software and on comparing these tools on technical and non-technical aspects. In one of our previous works [6], we carried out a literature review of ODK, ODK-X and their extensions, which covers many cases, types of cases and the extent to which these tools were used for various surveys.
Our future work, which is currently underway, will focus on the functional analysis of these tools. We will first present the data analysis that can currently be done with these tools and then we will present the new predictive analysis module that we want to add to the mobile data collection platforms.