A Machine Learning Way to Classify Autism Spectrum Disorder

In recent times, autism spectrum disorder (ASD) has been gaining momentum faster than ever before. Detecting autism traits through screening tests is very expensive and time-consuming. Screening is a challenging task, and classification must be conducted with great care. Machine learning (ML) can perform well on this classification problem. Most researchers have used ML to distinguish patients from typical controls, and support vector machines (SVM) are the most widely used method. Although several studies have applied various techniques, none has reached a definitive conclusion about predicting autism traits across different age groups. Accordingly, this paper aims to find the best method for ASD classification among SVM, K-nearest neighbor (KNN), random forest (RF), naïve Bayes (NB), stochastic gradient descent (SGD), adaptive boosting (AdaBoost), and CN2 rule induction, using four ASD datasets taken from the UCI ML repository. The classification accuracy (CA) obtained in our experiments is as follows: on the adult dataset SGD gives 99.7%, on the adolescent dataset RF gives 97.2%, on the child dataset SGD gives 99.6%, and on the toddler dataset AdaBoost gives 99.8%. Autism spectrum quotients (AQs) varied among several scenarios for toddlers, adults, adolescents, and children that include positive predictive value for the scaling purpose. AQ questions referred to topics about attention to detail, attention


Introduction
The autism spectrum disorder (ASD) screening process differs according to age. Two global classification systems are used for ASD diagnosis: the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), which is published by the American Psychiatric Association and treats the condition as a single diagnosis by removing subgroups, and the International Classification of Diseases (ICD-11), created by the World Health Organization (WHO). According to the DSM, autism and intellectual disability occur concurrently. By contrast, the ICD provides a detailed guide to distinguishing autism with and without an intellectual disability; it also considers historical data on the loss of previously acquired skills in the diagnostic process. The most difficult aspect of diagnosing ASD is that no single pathognomonic feature exists; all symptoms revolve around changes in an individual's behavioral profile, which varies according to age and severity.
In the prevailing system, classification is carried out using datasets of cases collected from a diverse group. The data depend on the autism diagnostic observation schedule (ADOS) and the autism diagnostic interview (ADI), both of which are conducted in a clinical setting. ADOS sessions are 30-45 minutes long, and the examiner records the responses provided. ADI refers to interviews, held in the clinic, of individuals over 18 with suspected autism together with their parents or caregivers. The interview is performed in five phases using a questionnaire that probes areas related to communication, social development, play, restricted behavior, and general skills. The individual's responses are evaluated by using scoring algorithms, and three main domains are assessed: language and communication, social interaction, and restricted and repetitive behavior. Cumulative scores exceeding the corresponding cut-off values indicate a positive result that must be followed up immediately with a proper diagnosis. Determining the most prominent features in a massive dataset is challenging and requires careful analysis. Data processing also presents a potential hurdle, namely the handling of missing attribute values. The rest of the machine learning process relies mostly on the quality of the information taken into consideration. Automation based on the diagnostic perspective must be fine-tuned.
ASD is a neurodevelopmental disorder that can occur in adults, adolescents, children, and toddlers. Leo Kanner refers to autism as a prototypical condition with a spectrum of presentations and phenotypes that become more subtle in terms of behavioral features when a change in the environment occurs. It is characterized by deficits in communication and reciprocal social interaction, accompanied by repetitive, restricted, and stereotyped patterns of interests and activities. These issues typically emerge during infancy and are likely to vary in intensity at different ages. The heterogeneity of the affected individuals and their genetic complexity have helped researchers identify the causes of ASD. Diagnosis of ASD is a lengthy process and varies from individual to individual. Symptoms also change across one's lifespan. ASD can be difficult to detect in young children, and parents usually raise concerns only after persistently monitoring their children, which delays early diagnosis.
Our contributions in this work are as follows:
• We designed a machine learning process that classifies whether or not a patient is affected by autism based on various attributes.
• For a new record, the syndrome can be predicted and treatment started at the earliest stage, preventing the condition from worsening.
• The health-related sector requires accurate and precise estimation at the earliest possible stage.
This article is organized as follows: related work is presented first, followed by the proposed method, described via a workflow. Results and a discussion are then provided, and the conclusions are summarized.

Literature Review
ASD denotes a neurodevelopmental condition characterized by limitations in social interaction, communication, and behavior, and it is becoming increasingly common [1]. The causes of ASD are generally linked to genetic and neural factors; however, the condition is primarily diagnosed by using non-genetic factors related to behavior, such as social interaction, play, imagination, repetitive behaviors, and communication, among others [2]. Current estimates indicate that approximately 1.5% of the population is on the spectrum, and many people on the spectrum are believed to remain undetected [3]. Accordingly, the need for rapid diagnostic services is growing along with awareness of ASD [4]. Wall et al. applied data mining techniques, in particular the alternating decision tree algorithm (ADTree), to reduce the number of items in the ADOS-Revised test. This work was intended to speed up ASD diagnosis so that those affected, including family members, could access the necessary services. To accomplish this goal, the authors removed instances of non-ASD cases and then investigated the classification models produced by the ADTree algorithm on an imbalanced dataset. The WEKA software was subsequently used to evaluate the classification accuracy obtained with the ADTree method. After examining the outcomes of the ADTree algorithm, the authors found that, among the 29 items included in the ADOS-Revised test, only 8 appear in the classification model; thus, the group concluded that the 29 items could be represented by these 8 items alone. The features included in ASD diagnostic tools need to be reconsidered so that a smaller item set can be used while maintaining the sensitivity and validity of the test [5,6].
ML-based ASD prediction requires careful examination, particularly when dealing with diagnostic procedures that are used in clinical settings. Limiting the ADOS-Revised test to eight items may produce misleading results because the activities must be administered by a clinician before classification [6,8]. Duda et al. [7] conducted an empirical study comparing several intelligent methods for differentiating between ASD and attention deficit hyperactivity disorder (ADHD). Six methods were compared on a dataset with 65 items obtained from the Simons Simplex Collection version 15.41. The data were gathered by using a parent-administered diagnostic questionnaire called the Social Responsiveness Scale. The authors carried out a preprocessing stage to (1) discard instances that had at least four missing values, (2) balance the dataset by under-sampling, and (3) reduce data dimensionality by using feature selection strategies. Chu et al. [9] explored several approaches to separating ADHD and obstructive sleep apnea (OSA) by using the data of 217 children who had been diagnosed with ADHD, OSA, or a combination of ADHD and OSA according to the criteria of the Diagnostic and Statistical Manual of Mental Disorders (fourth edition; DSM-IV). The data were gathered by using a range of diagnostic tools, and three ML techniques were used to derive classifiers that could help clinicians and doctors improve diagnostic criteria. The reported results showed that 17 features differ significantly among the three groups of pervasive developmental disorders (PDDs), especially in the Child Behavior Checklist (CBCL). Moreover, the decision tree generated classifiers faster than the neural network and the CHAID algorithm.
Wolfers et al. [10] examined issues related to PDDs, including small sample sizes, external validity, and ML methodological difficulties, without a specific focus on ASD. Lopez Marcano [11] examined the suitability of various methods, for example, neural networks and RF, for reducing the time required for ASD diagnosis. Maenner et al. [12] evaluated the RF method on a dataset obtained from the Georgia Autism and Developmental Disabilities Monitoring Network, using phrases and words taken from children's developmental evaluations. The dataset comprised 5,396 evaluations for 1,162 children, 601 of whom were on the spectrum. The RF classifiers were assessed on an independent test dataset containing 9,811 evaluations of 1,450 children. The results revealed that RF achieves approximately 89% predictive value and 84% sensitivity. Thabtah analyzed the limitations of studies that adopted ML for ASD classification [13][14][15]. The goals of [36], pursued through a literature survey, are: a) to outline the emotional development and education of people on the spectrum; b) to present the findings of existing studies; c) to present and raise key concerns about the emotional intelligence of children on the autism spectrum; and d) to raise questions about the development of educational techniques aimed at improving the emotional development of people on the autism spectrum and, subsequently, the development of their social feelings and maternal aptitudes. [37] gives a brief and representative description of the role that artificial intelligence currently plays in the assessment of autism. [38] reviews in detail the research carried out between 2010 and 2020, examining the effect of robots on autistic children through their interaction, use of art, programming, and so on.
Contribution: A measure of behavioral inflexibility (BI) for children with developmental disabilities was built through a multistep procedure with the help of parents.
Limitation: Studies need to analyze the convergent and discriminant validity of the Behavioral Inflexibility Scale (BIS) using a multi-method approach.

2020 [32]
Contribution: An unsupervised online learning model was built for ASD classification.
Limitation: Models must be trained on the dataset, rather than simply employing a pre-trained model.

2020 [33]
Contribution: Using the Stockholm Youth Cohort, the authors analyzed anxiety disorders among autistic adults (n = 4,049) with and without intellectual disability against a population control (n = 217,645).
Limitation: More investigation is necessary to determine the causes of anxiety among individuals with ASD. Future research is expected to improve the understanding of the phenomenology of anxiety disorders and to enhance methods for assessing and treating anxiety.

2019 [34]
Contribution: Gaussian mixture models and hierarchical clustering were applied to identify behavioral phenotypes of ASD and to assess treatment responses across the learned phenotypes.
Limitation: The study lacks data from standardized assessments.

2018 [35]
Contribution: Recent studies on autism were examined. This work not only articulated the previously mentioned issues but also suggested ways to improve AI use in ASD in terms of conceptualization, execution, and data.
Limitation: No implementation work was shown.

Workflow
The data used in our work were obtained from the UCI repository. They were collected with a mobile application (hereinafter referred to as the app) developed to administer four ASD screening methods, namely, the autism-spectrum quotient (AQ) tests for adults, adolescents, children, and toddlers. The datasets available in the UCI repository contain clean data without missing values. Each dataset has approximately 21 features and one class attribute labeled autism or non-autism. The features, including age, sex, jaundice at birth, ASD in any family member, the answers to questions A1-A10, and the ASD score from the app, were used to classify each case as autism or non-autism. Principal components were extracted by using the principal component analysis (PCA) algorithm, and the classifiers were then applied to the reduced dataset. We retained five eigenvectors from the given data and co-occurrence matrices and then fed the result to the different classifier algorithms with 10-fold cross-validation. The data were classified by using the different algorithms, and the best algorithm for early diagnosis of ASD was identified by using precision, recall, F1 score, and accuracy values. Figure 1 illustrates the proposed workflow.
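The 10-fold evaluation mentioned above can be illustrated with a small plain-Python sketch. It is not the authors' implementation: the function names `k_fold_indices` and `train_test_splits`, the fixed seed, and the sample count of 100 are assumptions made only for illustration. Each sample lands in exactly one test fold, and the remaining nine folds form the training set for that round.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Split range(n_samples) into k shuffled test folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, starting at position i.
    return [indices[i::k] for i in range(k)]


def train_test_splits(n_samples, k=10, seed=0):
    """Yield one (train_indices, test_indices) pair per fold."""
    folds = k_fold_indices(n_samples, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical dataset size, chosen only so the folds divide evenly.
splits = list(train_test_splits(n_samples=100, k=10))
```

A classifier would be fitted on each `train` list and scored on the matching `test` list, and the ten scores would then be averaged.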

Data collection & description
The four datasets cover adolescents, adults, children, and toddlers. The datasets include the following attributes: age, sex, jaundice at birth, ASD in any family member, residence, previous app use, screening, language, and class. The screening test was conducted among age groups of 4-11 years, 12-16 years, and 17 years and older. When the user completes the test (questions A1-A10), a screen appears so that the user can review and modify his/her responses. As part of quality assurance, the app lets users verify the entered data before they are saved. The value "0" or "1" is recorded based on the response given by the participant. The attributes and their data types are listed in Table 1.

Attribute extraction
The attributes were extracted by using PCA. Certain rules are associated with attribute extraction, as discussed below. The main idea of PCA is to reduce the dimensionality of the variables in the input data. PCA creates principal components through an orthogonal transformation that converts a set of possibly correlated variables into a set of linearly uncorrelated variables. In our workflow, we used five principal components (PC1-PC5) derived from the input data after preprocessing. These vectors were used as feature extraction variables for the rule-based algorithm described in our previous work [16].
PCA involves the following steps: normalization of the data, computation of the covariance matrix, computation of the eigenvalues and eigenvectors, selection of the principal components, and creation of the attribute vector.
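The steps listed above can be written compactly as follows. The notation is introduced here for illustration and is not taken from the paper: x_{ij} is the value of feature j for sample i, μ_j and σ_j are the mean and standard deviation of feature j, Z is the normalized n × p data matrix, and k = 5 is the number of retained components stated in the text.

```latex
\begin{aligned}
z_{ij} &= \frac{x_{ij} - \mu_j}{\sigma_j} && \text{(normalization)} \\
C &= \frac{1}{n-1}\, Z^{\top} Z && \text{(covariance matrix)} \\
C\, v_m &= \lambda_m v_m, \quad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p && \text{(eigenvalues and eigenvectors)} \\
W &= [\, v_1 \; v_2 \; \cdots \; v_k \,], \quad k = 5 && \text{(choice of principal components)} \\
T &= Z\, W && \text{(attribute vectors)}
\end{aligned}
```

Each row of T holds the five component scores (PC1-PC5) of one sample.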

Classification Algorithms
According to the workflow for ASD diagnosis and prediction, the dataset is first framed, after which feature selection is conducted. The severity of autism is calculated by applying machine learning classification algorithms. After a review of their characteristics, the following supervised classification algorithms are applied.

SVM
The goal of SVM is to find a hyperplane in the N-dimensional space, where N is the number of features, that separates the data points of the two classes. Many possible hyperplanes could be chosen to separate the two classes. We aim to find the plane with the greatest margin, i.e., the maximum distance between the data points of the two classes. Maximizing the margin allows future data points to be classified with more confidence. Hyperplanes are decision boundaries that help classify data points. The data points on each side of the hyperplane can be assigned to separate classes. The dimension of the hyperplane depends on the number of features. If the number of input features is 2, for example, the hyperplane is simply a line. If the number of input features is 3, the hyperplane is a two-dimensional plane. The hyperplane is difficult to imagine when the number of features exceeds 3. Support vectors are the data points closest to the hyperplane, and they influence its position and orientation. Support vectors are used to maximize the margin of the classifier. Removing the support vectors would change the hyperplane's location. These concepts were used to build our SVM [17][18].
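For linearly separable data, the maximum-margin idea described above corresponds to the standard hard-margin formulation sketched below. The paper does not state which SVM variant or kernel was used, so this form is an assumption chosen for illustration; w is the normal vector of the hyperplane, b is its offset, and the labels y_i take the values -1 and +1.

```latex
\begin{aligned}
&\text{hyperplane: } w^{\top} x + b = 0, \qquad \text{margin width: } \frac{2}{\lVert w \rVert} \\
&\min_{w,\,b} \; \tfrac{1}{2} \lVert w \rVert^{2}
\quad \text{subject to} \quad y_i \,(w^{\top} x_i + b) \ge 1, \quad i = 1, \dots, n
\end{aligned}
```

The support vectors are the points for which the constraint holds with equality, which is why removing them moves the hyperplane.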

K-NN
KNN is a data mining algorithm used for classification. The steps involved in the method are as follows:
1) Obtain the unclassified data.
2) Compute the distance (Euclidean, Manhattan, Minkowski, or weighted) from the new data point to all the already classified data points.
3) Choose the value of k.
4) Examine the classes of the k points at the smallest distances, counting the occurrences of each class.
5) Select the class that occurs most often as the correct one.
6) Classify the new data point with the class obtained in (5).
The distance between two points can be calculated by using several formulas [19][20].
The Euclidean distance between two points a and b with n coordinates is

$$d(a, b) = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2} \tag{1}$$
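A minimal plain-Python sketch of these steps is given below. It assumes Euclidean distance and a simple majority vote; the function names and the toy four-answer records are invented for illustration and do not come from the UCI datasets.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority class among the k training points nearest to query."""
    # Distance from the query to every already classified point.
    distances = sorted(
        (euclidean(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # The most frequent class among the k nearest neighbours wins.
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy binary answers (hypothetical, for illustration only).
train_points = [(1, 1, 1, 0), (1, 1, 0, 1), (1, 0, 1, 1),
                (0, 0, 0, 1), (0, 1, 0, 0), (0, 0, 0, 0)]
train_labels = ["ASD", "ASD", "ASD", "non-ASD", "non-ASD", "non-ASD"]

prediction = knn_predict(train_points, train_labels, query=(1, 1, 1, 1), k=3)
```

In this toy example the three nearest neighbours of the query all carry the same label, so the vote is unanimous.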

RF
A decision tree is an intuitive model that serves as the building block of an RF. The model is interpretable because it classifies much as people do: it asks a sequence of questions about the available data until a decision is reached (in an ideal world). The technical details of a decision tree lie in how the questions about the data are formed. In the Classification and Regression Trees (CART) algorithm, the decision tree is constructed by determining the questions (called node splits) that, when answered, lead to the largest reduction in Gini impurity. In other words, the decision tree tries to create nodes containing a high proportion of samples (data points) from a single class by finding values in the attributes that split the data cleanly into classes [21][22]. For a node in which p_i is the proportion of samples belonging to class i among J classes, the Gini impurity is

$$G = 1 - \sum_{i=1}^{J} p_i^2 \tag{2}$$
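The split criterion can be illustrated with a short plain-Python sketch. It computes the Gini impurity of a set of labels (one minus the sum of squared class proportions) and the impurity reduction achieved by a threshold split on one numeric feature. The function names, the toy labels, and the toy scores are assumptions made for illustration only.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gain(labels, feature_values, threshold):
    """Reduction in Gini impurity from splitting on feature <= threshold."""
    left = [y for x, y in zip(feature_values, labels) if x <= threshold]
    right = [y for x, y in zip(feature_values, labels) if x > threshold]
    n = len(labels)
    # Impurity of the two child nodes, weighted by their share of the samples.
    weighted = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - weighted


# Toy data (hypothetical): a screening score and a binary class label.
labels = [1, 1, 1, 0, 0, 0]
scores = [9, 8, 7, 3, 2, 1]

gain_clean = split_gain(labels, scores, threshold=5)  # separates the classes
gain_poor = split_gain(labels, scores, threshold=8)   # isolates a single sample
```

A CART-style tree would evaluate many candidate thresholds in this way and keep the one with the largest gain, here the threshold that separates the two classes completely.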

NB
A classifier is a model used to distinguish between objects based on certain characteristics. The NB classifier is a probabilistic prediction model based on Bayes' theorem. NB is used to find the probability of A occurring given that B has occurred [23]. Here B is the evidence, and A is the hypothesis. The predictors/features are assumed to be independent, i.e., one feature has no impact on another. The main kinds of NB classifiers are multinomial, Bernoulli, and Gaussian.
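In symbols, with A as the hypothesis and B as the evidence, Bayes' theorem and the resulting independence-based decision rule can be written as follows. The notation y for a class and x_1, ..., x_n for the feature values is introduced here for illustration.

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)},
\qquad
P(y \mid x_1, \dots, x_n) \;\propto\; P(y) \prod_{i=1}^{n} P(x_i \mid y)
```

The predicted class is the y that maximizes the right-hand expression; the product form follows directly from the independence assumption on the features.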

AdaBoost (AB)
AB is a machine learning meta-algorithm formulated by Yoav Freund and Robert Schapire, who received the Gödel Prize for their work in 2003. It can be used in combination with several other learning algorithms to enhance their performance. The outputs of the other learning algorithms (the weak learners) are combined into a weighted sum that represents the final output of the boosted classifier. Although AdaBoost is sensitive to noisy data and outliers, in several situations it is less vulnerable to overfitting than most other learning algorithms. AdaBoost is frequently described as the best out-of-the-box classifier. At every stage of the AB algorithm, the sample weights are updated so that later learners concentrate on the samples that were previously misclassified [24][25].
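The weighted sum described above can be made concrete with the standard discrete AdaBoost update, shown here as a sketch. It assumes labels y_i in {-1, +1} and weak learners h_t that output -1 or +1; the paper does not specify which variant or weak learner was used, and the symbols are introduced here for illustration.

```latex
\begin{aligned}
\varepsilon_t &= \sum_{i=1}^{n} w_i^{(t)} \, \mathbb{1}\!\left[h_t(x_i) \ne y_i\right],
\qquad
\alpha_t = \tfrac{1}{2} \ln\!\frac{1 - \varepsilon_t}{\varepsilon_t} \\
w_i^{(t+1)} &= \frac{w_i^{(t)} \exp\!\left(-\alpha_t\, y_i\, h_t(x_i)\right)}{Z_t} \\
H(x) &= \operatorname{sign}\!\left(\sum_{t=1}^{T} \alpha_t\, h_t(x)\right)
\end{aligned}
```

Here ε_t is the weighted error of the t-th weak learner, α_t is its vote in the final sum, and Z_t is the normalizer that keeps the sample weights summing to one.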

SGD
The word stochastic refers to a system or process linked to random probability. Thus, in SGD, a few samples are randomly chosen for each iteration instead of the whole dataset. In gradient descent, the term batch denotes the total number of samples from the dataset used to calculate the gradient in each iteration. In typical gradient descent optimization, the batch is the whole dataset. Using the whole dataset helps reach the minimum of the error surface with less noise, but it becomes computationally very costly when the dataset is large. SGD solves this problem because only one sample is used per iteration. The samples are randomly shuffled and selected to carry out the computation [26].
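The single-sample update can be sketched in plain Python as follows. The paper does not state which loss function its SGD classifier minimizes, so logistic loss is assumed here; the learning rate, epoch count, seed, and toy points are likewise assumptions chosen only for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sgd_logistic(X, y, lr=0.1, epochs=50, seed=0):
    """Fit a logistic-regression classifier with one-sample SGD updates."""
    rng = random.Random(seed)
    weights = [0.0] * len(X[0])
    bias = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in a fresh random order
        for i in order:
            # Gradient of the log-loss computed from ONE sample only.
            z = sum(w * x for w, x in zip(weights, X[i])) + bias
            error = sigmoid(z) - y[i]
            weights = [w - lr * error * x for w, x in zip(weights, X[i])]
            bias -= lr * error
    return weights, bias


def predict(weights, bias, x):
    """Predict class 1 when the linear score is non-negative."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score >= 0 else 0


# Toy, linearly separable points (hypothetical).
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (3.0, 3.0), (3.0, 4.0), (4.0, 3.0)]
y = [0, 0, 0, 1, 1, 1]

weights, bias = sgd_logistic(X, y)
predictions = [predict(weights, bias, x) for x in X]
```

Each inner-loop step touches a single sample, so the cost per update does not grow with the size of the dataset, which is the advantage described above.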

CN2 rule induction
CN2 rule induction is a classification algorithm that works on rules, each consisting of a condition followed by a predicted class [27], on different datasets.

Results and Discussion

Performance metrics
The performance of the entire system architecture was calculated based on F1 scores, precision, recall, and accuracy [28][29][30].
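The four metrics named above can be computed from the counts of true and false positives and negatives, as in the following plain-Python sketch. It scores a single positive class; the paper does not say whether its reported values are per-class or averaged over classes, so that choice, the function names, and the toy labels are assumptions made for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, false negatives, true negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Return precision, recall, F1 score, and accuracy for one positive class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    accuracy = (tp + tn) / len(y_true)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Toy labels (hypothetical): 1 = autism, 0 = non-autism.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
```

With one false positive and one false negative among ten samples, accuracy is higher than precision and recall here, which shows why the metrics are reported separately.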

F1 Score
The F1 score is the weighted average (harmonic mean) of precision and recall, so it takes both false positives and false negatives into account. Although it is not as easy to interpret as accuracy, the F1 score is usually more useful than accuracy. Accuracy works best if false positives and false negatives have similar costs. If the costs of false positives and false negatives are very different, precision and recall may be more informative. In the adult dataset, the F1 score for AB is 0.993, which is higher than the F1 scores obtained from the other methods. In the adolescent dataset, the F1 score of RF is 0.972, which is higher than the F1 scores obtained from the other methods. In the child dataset, the F1 score of SGD is 0.996, which is higher than the F1 scores obtained from the other methods. In the toddler dataset, the precision rate of SGD is 0.996, which is higher than the F1 scores obtained from the other methods.

Precision
Precision is the ratio of correctly predicted positive observations to the total number of predicted positive observations. In the adult dataset, the precision rate of SGD is 0.997, which is higher than the precision scores obtained from the other methods. In the adolescent dataset, the precision rate of RF is 0.972, which is higher than the precision scores obtained from the other methods. In the child dataset, the precision rate of SGD is 0.996, which is higher than the precision scores obtained from the other methods. In the toddler dataset, the precision rate of SGD is 0.996, which is higher than the precision scores obtained from the other methods.

Recall
Recall is the ratio of correctly predicted positive observations to all observations in the actual positive class. In the adult dataset, the recall of SGD is 0.997, which is higher than the recall scores obtained from the other methods. In the adolescent dataset, the recall of RF is 0.972, which is higher than the recall scores obtained from the other methods. In the child dataset, the recall of SGD is 0.996, which is higher than the recall scores obtained from the other methods. In the toddler dataset, the recall of AdaBoost is 0.997, which is higher than the recall scores obtained from the other methods.
Deploying the autism dataset on various machine learning algorithms provides insights into the type of algorithm yielding optimal results. Figures 2-5 provide a comparative analysis of these methods.

Classifier accuracy
In the adult dataset, the highest accuracy (99.7%) was obtained from SGD. In the adolescent dataset, the highest accuracy (97.2%) was obtained from RF. In the child and toddler datasets, the highest accuracies were obtained from SGD and RF.
Figure 2 describes the performance values obtained for SVM, KNN, RF, NB, AB, SGD, and the CN2 rule inducer. The F1 score, precision, and recall of each algorithm were obtained for the child ASD dataset, and the RF algorithm yielded the highest value of 0.98. Figure 4 describes the performance values obtained for SVM, KNN, RF, NB, AdaBoost, SGD, and the CN2 rule inducer. The F1 score, precision, and recall of each algorithm were obtained for the toddler ASD dataset, and the RF algorithm yielded the highest value of 0.99. Figure 5 describes the performance values obtained for SVM, KNN, RF, NB, AB, SGD, and the CN2 rule inducer. The F1 score, precision, and recall of each algorithm were obtained for the child ASD dataset, and the SGD algorithm yielded the highest value of 0.99. Figure 6 describes the performance values obtained for SVM, KNN, RF, NB, AB, SGD, and the CN2 rule inducer and shows the cumulative accuracy chart.
The best algorithm for the adult dataset is SGD, which yields an accuracy of 99.3%. The best algorithm for the adolescent dataset is RF, which has an accuracy of 97.2%. The best algorithm for the child dataset is SGD, which yields an accuracy of 99.6%. Finally, the best algorithms for the toddler dataset are RF and SGD, both of which yield 99.7% accuracy.
[Figure panels: Accuracy for Child ASD Data; Accuracy for Toddler ASD Data]
Researchers worldwide have developed screening and diagnosis methods to detect ASD and assist in its medical diagnosis. In particular, the development of machine learning algorithms provides great support for the medical field. The stakeholders of these projects include patients, the caretakers who can provide the best insight about the patients, medical practitioners, psychologists, and specialists in behavioral science and neuroscience.
In this work, we used several classification algorithms to make the best prediction. Various supervised classification algorithms were applied to the datasets to train the models. The performance was evaluated based on accuracy, precision, recall, and F1 score.
The work has shown promising results, and the system can be further trained with deep learning procedures to improve the early detection of ASD.

He has great international exposure in academia, research, administration, and academic quality accreditation. He worked with ILMA University and King Faisal University (KFU) for a decade. He has 20 years of teaching and administrative experience. Besides his scientific research activities, he has an intensive background in academic quality accreditation in higher education; he worked for a decade on academic accreditation and earned ABET accreditation twice for three programs at CCSIT, King Faisal University, Saudi Arabia. He also worked for the National Commission for Academic Accreditation and Assessment (NCAAA), now the Education Evaluation Commission Higher Education Sector (EECHES), Saudi Arabia, on institutional-level accreditation. He also worked for the National Computing Education Accreditation Council (NCEAC). Dr Noor Zaman was recently recognized as a top 1% reviewer globally by WoS/ISI (Publons) for the year 2019. He has edited/authored more than 13 research books with internationally reputed publishers, earned several research grants, and has a great number of indexed research articles to his credit. He has supervised several postgraduate students, including master's and PhD students. Dr Noor Zaman Jhanjhi is an Associate Editor of IEEE ACCESS, a moderator of IEEE TechRxiv, a keynote speaker for several IEEE international conferences globally, an external examiner/evaluator of PhD and master's theses for several universities, a guest editor of several reputed journals, a member of the editorial boards of several research journals, and an active TPC member of reputed conferences around the globe. noorzaman.jhanjhi@taylors.edu.my

Article submitted 2020-10-31. Resubmitted 2020-11-27. Final acceptance 2020-11-28. Final version published as submitted by the authors.