Global Research on Big Data in Relation with Artificial Intelligence (A Bibliometric Study: 2008-2019)

The purpose of this paper is to analyze and explore research studies on Big Data in relation to the Artificial Intelligence domain that were published in peer-reviewed journals and indexed in the Web of Science Core Collection for the period 2008-2019. The publication data for our analysis of “Big Data in relation with Artificial Intelligence” were derived from the Web of Science (WoS) Core Collection database (indexes included: SCI Expanded and SSCI). Bibliometric analysis methods are applied in order to find the relations between the two domains and to investigate the level of scientific development in the research area. Our research therefore concentrates on and highlights the current issues discussed and studied by scholars around the globe. This paper would be useful for researchers, as it shows the publication trends of research on big data in relation to artificial intelligence in highly reputable SCI-Exp and SSCI journals (ranked by WoS).


Introduction
Big Data (BD) and Artificial Intelligence (AI) are of high interest nowadays, and their applications are spreading into all our everyday activities. AI is used to facilitate capturing and structuring big data, and then to analyze big data for key insights [1]. This research work aims to analyze and explore research studies on Big Data in relation to the Artificial Intelligence domain, published in peer-reviewed journals and indexed in the Web of Science Core Collection for the period 2008-2019. To achieve this goal, bibliometric analysis methods are applied and six research questions are defined. Our research sample consists of 107 publications found in the WoS database for the research period. As per the publication profile, 57.94% of the publications are peer-reviewed journal articles, and 89.72% of the publications are in English.
Even though the study period is 2008-2019, we noticed that publications on our research topic started in 2014 and that the largest share of the publications (49.53%) appeared in 2018.
The main subject areas of the publications are Engineering (Electrical and Electronic), Telecommunications, and Computer Science, together comprising 39.25%. Nevertheless, we notice strong interest in publishing in medicine and health-related subjects such as clinical neurology, general medicine, radiology, endocrinology, oncology, and surgery.
The most productive and active institutions publishing on Big Data and AI related subjects are North American universities such as Harvard, Stanford, U. of Michigan, U. of Toronto, and MIT, followed by Chinese, Australian, and European universities. Accordingly, the leading countries are the USA (41.12%), followed by the People's Republic of China (16.82%), Canada (11.22%), Germany (10.28%), France (7.48%), and Australia and England, which have the same publication record (6.54%). The average number of citations per publication is 6.78, with the highest increase in 2018.

Literature Review: Big Data and Artificial Intelligence
Researchers in various academic disciplines have recently studied Big Data. BD is mostly about capturing data from different structured and unstructured platforms in order to analyze them and support better decision making. O'Leary [1] studied intelligent systems and examined some of the basic concerns and uses of AI in relation to BD. Lei et al. [2] proposed a two-stage learning method for intelligent diagnosis of machines, which reduces the need for human labor and makes it easier for intelligent fault diagnosis to handle big data.
Considering the huge amount and heterogeneity of data being produced nowadays, efficient retrieval and management of data is becoming a key issue. Gani et al. [3] investigated and examined existing indexing techniques for big data. They proposed a taxonomy that categorizes indexing techniques by method: non-artificial intelligence, artificial intelligence, and collaborative artificial intelligence indexing methods. In addition, they analyzed the significance and performance of the different procedures, as well as the limitations of each technique. Another study, "AI Intelligence in Medicine and Cardiac Imaging", concentrates on harnessing BD and advanced computing practices for diagnosis and treatment [4]. The increased volume of patient information available in electronic medical records supports medical alerts in the diagnosis and treatment process. In that research, advanced computer systems using parallel computing were examined as a driver of revolutionary changes in practice. Similar research was conducted on diabetic patients and diabetes systems in conjunction with 5G networks [5], and on medical BD analytics and the IoT along with wearable technologies [6].
In medical oncology research, a study on prostate radiotherapy showed that BD delivers better understanding by integrating heterogeneous [7] data types, combining patient-specific clinical parameters with potential risk factors [8]. Another study in medical science concentrates on deep learning using BD toward a mobile system [9]. The study reveals that, in combination with neuromorphic hardware, this approach can provide a real-time, patient-specific warning system with reliable life-long performance.
Deep learning, as part of AI, is used to advance medical science, particularly drug discovery and development [10]. That study reviews the mainstream architectures for applying deep learning, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and deep auto-encoder networks (DAENs), for supervised and unsupervised learning. Griffen et al. similarly studied BD and AI in medicinal chemistry with the aim of augmenting the chemist [11]. They concentrated on secure sharing of knowledge, which as intellectual property is a critical task for the medicinal chemist, using appropriate BD and AI technologies and relevant software.
Ozdemir and Hekim [12] studied the concepts of a potential next stage, Industry 5.0, by reviewing BD, AI, IoT, robotics, broadband Wi-Fi, and various other elements. BD and related integrated systems would be the key players. They propose an "Industry 5.0 that can democratize knowledge coproduction from Big Data, building on the new concept of symmetrical innovation". In another study, on supply chain risk management, BD was studied as a means to mitigate risk in operations, supported by different case studies [13].
The relation between Machine Learning and BD has been studied in terms of analytics and complex models [14], and huge numbers of parameters along with massive datasets were reviewed in [15] and [16]. Kanevsky et al. [17] studied BD and Machine Learning in plastic surgery in terms of innovations in medical science. The data are quantifiable and structured by nature, and a huge amount of data is recorded for every single operation. They studied how to manage the complexity of biomedical data with BD solutions in association with the best clinical outcomes. The results revealed that introducing the modern plastic surgeon to machine learning and the computational interpretation of large data sets, as a subfield of AI, can help address various problems in plastic surgery.
The role of BD and Machine Learning [18] was studied in diagnostic decision support in radiology [19]. The aim was to support imaging specialists in improving the accuracy and efficiency of diagnosis and treatment with modern equipment and methods. The outcomes highlight the growth of this field for radiology, diagnostic decision support, and the efficiency of rule-based expert systems along with the cognitive assistants of the modern era.
The usage and analysis of BD were studied through the SP theory of intelligence with regard to knowledge management and processes [20]. Another study, conducted in materials science and engineering, reviewed the computational materials science of the NOMAD Center of Excellence. That research examined how the culture and initiatives opened new avenues for mining science using BD [21].
Another study, on moving from BD to knowledge in AI 2.0, reviews the challenges and opportunities arising from emerging theoretical and technological advances of AI in BD settings [22]. The findings show that data-driven machine learning (a part of AI) should work with human knowledge so that implicit intuitions can be turned into explicit intelligence. The proposed AI 2.0 thus transforms big data into structured knowledge in order to enable better decision-making for society. A literature review on trusted autonomy reveals opportunities for researchers and practitioners to work on topics that can create a leap forward in advancing the field of trusted autonomy [23]. Such autonomous systems (robotics and software agents) can then be used in an individual's daily life and integrated with humans seamlessly, naturally, and efficiently.

Objectives and scope of the study
This research study explores recent developments in the Big Data domain related to Artificial Intelligence in scholarly academic publications. It is unique in exploring the development of the Big Data subject in highly ranked SCI-Exp and SSCI journals in the field. In this study, the data were collected from journal articles on "Big Data" from the Web of Science Core Collection of the Thomson Reuters database (SCI Expanded and SSCI). A similar data collection strategy was used by [24] for "air pollution" and by [25] for "open access journals", and similar search criteria were used to construct the bibliometric research framework. The Web of Science Core Collection has a prestigious reputation for containing scholarly information that covers the research area with citations, accuracy, and recent developments in the field.

Data collection
In this study, a bibliometric method was used to explore the level of scientific knowledge on big data in relation to the artificial intelligence domain. Publication data were retrieved from the ISI Web of Science (WoS) Core Collection database with content-specific query keywords. The search steps were as follows:
Step 1 (initial search query): TOPIC: ("big data") AND TOPIC: ("artificial intelligence"); Timespan=2008-2019; Indexes=SCI-EXPANDED, SSCI. Results: 454 publications found (in all document types).
Step 2 (final search query): TITLE: ("big data") AND TOPIC: ("artificial intelligence"); Timespan=2008-2019; Indexes=SCI-EXPANDED, SSCI. Results: 107 publications found (in all document types).
After Step 1, the data set was reviewed and unrelated papers were excluded from further analysis. Step 2 yielded the final data set of 107 academic publications.
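The difference between the two search steps can be illustrated with a minimal sketch. It is not the WoS search engine: the Record class, its fields, and the sample records are hypothetical, and the TOPIC field is approximated as the title, abstract, and keywords taken together.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical, simplified stand-in for one exported WoS record."""
    title: str
    abstract: str
    keywords: str
    year: int


def in_topic(record: Record, phrase: str) -> bool:
    # Approximates a TOPIC search: look in title, abstract, and keywords.
    haystack = " ".join([record.title, record.abstract, record.keywords])
    return phrase.lower() in haystack.lower()


def in_title(record: Record, phrase: str) -> bool:
    # Approximates a TITLE search: look in the title only.
    return phrase.lower() in record.title.lower()


def step1(records, start=2008, end=2019):
    # Step 1: both phrases anywhere in the topic fields.
    return [r for r in records
            if start <= r.year <= end
            and in_topic(r, "big data")
            and in_topic(r, "artificial intelligence")]


def step2(records, start=2008, end=2019):
    # Step 2: "big data" must be in the title; AI may be anywhere in the topic.
    return [r for r in records
            if start <= r.year <= end
            and in_title(r, "big data")
            and in_topic(r, "artificial intelligence")]


# Invented sample records, for illustration only.
sample = [
    Record("Big data and intelligent systems",
           "We discuss artificial intelligence.", "", 2014),
    Record("Deep learning for imaging",
           "Uses big data and artificial intelligence.", "", 2018),
    Record("Big data indexing",
           "A survey of indexing techniques.", "databases", 2016),
    Record("Big data meets artificial intelligence", "", "", 2005),
]

step1_hits = step1(sample)
step2_hits = step2(sample)
print(len(step1_hits), len(step2_hits))
```

Because a title match is also a topic match, every Step 2 hit in this sketch is a Step 1 hit as well, which is consistent with the reduction from 454 to 107 records reported above.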

Research questions
To understand the patterns of the research stream on "big data" in relation to "artificial intelligence", the design of the research should be clearly structured based on the research objectives and scope. A bibliometric study enables us to gather all related publication data and analyze them with proper methodologies to answer the research questions (RQ) addressed in the sections below.
The search results show that a total of 107 publications were found in the WoS database for the research period. The main sources are peer-reviewed articles (62 records), reviews (20 records), and editorial materials (19 records); the rest are conference proceedings, book chapters, book reviews, etc. These publications were downloaded into MS Excel spreadsheet software for further analysis. Descriptive information about the publications is given in Table 1 for document types and in Table 2 for document languages.
Web of Science Core Collection subject category analysis shows that the literature related to "Big Data" was categorized under the top 18 major subject areas in Table 3.
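The percentage shares quoted for document types follow directly from the record counts. A minimal sketch of that arithmetic is given here; the "Other" entry is an assumption, computed as the remainder under the simplification that each record carries exactly one document type.

```python
TOTAL = 107  # publications in the final data set

# Counts reported in the text.
doc_types = {"Article": 62, "Review": 20, "Editorial Material": 19}

# Assumption: everything else (proceedings, book chapters, book reviews, etc.)
# is the remainder, i.e. no record is counted under two document types.
doc_types["Other"] = TOTAL - sum(doc_types.values())


def share(count: int, total: int) -> float:
    """Percentage share rounded to two decimals."""
    return round(100 * count / total, 2)


shares = {name: share(count, TOTAL) for name, count in doc_types.items()}
for name, pct in shares.items():
    print(f"{name}: {doc_types[name]} records ({pct}%)")
```

The share computed for articles matches the 57.94% quoted in the Introduction.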

4.2 Publication outputs by years
RQ1: How many articles on big data were published per year between 2008 and 2019?
Table 4 shows the publication years during the literature review period. The average number of journal articles published per year over this period (2008-2019) was 15.29 records. The number of publications in 2016 and 2017 increased significantly compared with 2013 to 2015. The highest number of publications was in 2018, with 53 records. A similar trend can be expected for 2019: based on the first half of the year, the number of publications would be the same as or higher than in the previous year.
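The yearly figures can be cross-checked with a few lines of arithmetic. The sketch below uses only counts stated in this paper (107 records in total; 11 each in 2016 and 2017; 53 in 2018). That the reported average of 15.29 corresponds to dividing by seven years, rather than by the twelve years of the nominal period, is an inference from the numbers and not something the text states.

```python
TOTAL = 107

# Yearly counts stated in the paper; the remaining years are not itemized here.
reported = {2016: 11, 2017: 11, 2018: 53}

# Assumption: the reported mean is taken over 7 years with publications,
# since 107 / 7 reproduces the quoted 15.29.
mean_over_7_years = round(TOTAL / 7, 2)

# For comparison: the mean over all 12 years of the 2008-2019 period.
mean_over_12_years = round(TOTAL / 12, 2)

# Share of the peak year 2018.
share_2018 = round(100 * reported[2018] / TOTAL, 2)

# Records left for the years that are not itemized in the text.
remaining = TOTAL - sum(reported.values())

print(mean_over_7_years, mean_over_12_years, share_2018, remaining)
```

Under this reading, 2018 alone accounts for roughly half of the data set, and the 2016 and 2017 counts both lie below the seven-year mean.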

RQ2: In which journals were the articles about big data published most frequently?
Table 5 shows the publication frequency of the journals in the literature on big data and artificial intelligence research. Although no specific journal stands out to a significant extent in terms of activity, the top contributing journals, namely IEEE Access, Journal of the American College of Radiology, University of Toronto Law Journal, Circulation Research, and IEEE Communications Magazine, most frequently published studies on big data in relation to artificial intelligence among the data retrieved from the WoS database. Articles are not concentrated in one or a few journals; rather, the journals published similar numbers of papers. In Table 5, 9 journals are categorized as having the highest publication counts in the field (more than 2 articles ranked in the table). However, 81 out of 107 publication sources have only one paper each.

4.4 Most productive authors/co-authors
RQ3: Who are the most productive authors/co-authors that published in those periods?
According to the author findings, 100 authors contributed to the 107 publications as author or co-author. The majority of the authors or co-authors appear in the publications with only one paper. Figure 1 shows that 16 authors/co-authors published 2 papers during the period 2008 to 2019, compared to the others, who published only one paper.

Fig. 1. Most productive authors by publication volume
Although there was no significant difference in publication records per author, the researchers listed above published slightly more than the others, who published only one paper each. Publication productivity was therefore similar across authors.

4.5 The most productive institutions
RQ4: What are the most productive institutions?
According to the findings, Figure 2 shows that the most productive and active institutions (Organizations Enhanced) on Big Data and Artificial Intelligence subjects are mostly North American universities such as Harvard, Stanford, U. of Michigan, U. of Toronto, UBC, and MIT, followed by Chinese, Australian, and European universities.

4.6 The most productive country regions
RQ5: What are the most productive countries-regions?
Figure 3 shows the most productive countries that contributed to the BD and AI fields. The USA tops the list with 44 papers (41.12%), followed by the People's R. China with 18 papers (16.82%), Canada with 12 papers

Conclusion
The objective of this work is to present a bibliometric analysis of the scientific papers published in journals indexed and ranked in the Web of Science Core Collection for the period 2008 to 2019. The study is structured to explore and identify the main research domains of big data in relation to artificial intelligence streams in scholarly publications. Research questions were therefore formed to understand the publication patterns according to the database results.
According to the results, the number of papers published on the research topic in 2018 to 2019 was higher than the average (15.29 articles per year). There were 11 records in each of 2016 and 2017 and 53 records in 2018, which is the highest count of any year in the period. A similar trend could be seen in 2019 (the year was only half complete at the time of the study).
The research outcome clearly states that publications on Big Data in relation to Artificial Intelligence are increasing. Owing to its importance, the topic has become a subject of wide interest across different academic fields. Engineering, Computer Science, Medical Science, and even Law [26] are the most popular subject categories according to recent WoS publication results. This paper is relevant for researchers, academics, and practitioners in different disciplines, as well as for those who work in and contribute to the fields of information studies, engineering, management and business, and interdisciplinary studies.

Limitations and future research directions
Like other scholarly research papers, this paper has some limitations. First, the study focused on TITLE, TOPIC, and keyword searches in the Web of Science Core Collection of the Thomson Reuters database (SCI Expanded and SSCI) for publications on big data related to the artificial intelligence domain during the period 2008-2019. The AHCI index was therefore excluded from the research criteria. In the initial search steps, all published papers related to "big data" were retrieved. Second, most of the widely used bibliometric analyses were applied, but content analysis, co-author citation mapping, and network analysis were excluded as outside the scope of the study. Further studies should focus more on network and citation mapping analysis. It would be interesting to use other software, such as Pajek, BibExcel, Publish or Perish, etc., to display and present the results of network and citation data and to show connections among authors in the field. Although the study gives a clear picture of the current situation regarding publication patterns, the results are expected to change over time as new studies and papers are published in related subject areas.
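As a starting point for the co-author network analysis suggested as future work, a weighted co-authorship edge list could be derived from the author lists of the retrieved records. The following is a minimal sketch under stated assumptions: the function name and the placeholder author lists are hypothetical, and the study itself did not perform this analysis.

```python
from collections import Counter
from itertools import combinations


def coauthor_edges(papers):
    """Count how often each pair of authors appears on the same paper.

    `papers` is a list of author-name lists, one list per publication.
    Returns a Counter mapping (author_a, author_b) to the number of joint papers.
    """
    edges = Counter()
    for authors in papers:
        # Sort and de-duplicate so each pair has one canonical orientation.
        for a, b in combinations(sorted(set(authors)), 2):
            edges[(a, b)] += 1
    return edges


# Placeholder author lists, invented for illustration only.
papers = [
    ["Author A", "Author B", "Author C"],
    ["Author A", "Author B"],
    ["Author D"],  # single-author paper: contributes no edge
]

edges = coauthor_edges(papers)
for (a, b), weight in edges.most_common():
    print(a, b, weight)
```

The resulting pairs and weights form an undirected weighted graph that network tools of the kind mentioned above could then visualize.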