Design and Implementation of a Multimedia-based Technology Solution to Assist Children with Intellectual Disability to Learn

Abstract. This paper presents an innovative technological solution that uses multimedia to improve the understanding and cognitive functions of children with intellectual disability. New algorithms that mine story-based scripts, rank their sentences, extract keywords and match them with multimedia were proposed in our previous work. They are integrated in the proposed system and assessed on 50 stories for children. The results show that the children's concentration and reflection improve considerably when multimedia is used. The system is fully dynamic and supports personalized learning. Instructors can focus on the main ideas of the story, which are extracted automatically, and then use images and clips to explain them. The words and images are stored in a corpus which is enriched continuously. Every new story taught adds new information to the corpus and to every child's dataset. Parents can review the tutorials used in school with their children at home. The system contributes to the reintegration of children with intellectual disability into society and helps break their marginalization and isolation.


Introduction
The number of children with intellectual disability (ID) is increasing in all countries. The rate is around 15% on average in industrialized countries. Although no accurate statistics are published for developing countries, the rate there may be higher due to many economic and environmental factors. The recent UNESCO report on disability [9] showed that ninety per cent of children with disabilities in developing countries do not attend school. The Brazilian census reported a 14.5 percent disability rate, while the rate is around 12.3 percent in Turkey and 10.1 percent in Nicaragua. Around 3.8 million people in Spain have some kind of physical or intellectual disability [2].
Children with disabilities are victims of violence and marginalization. A survey in India with 70 million persons with disability showed that only 100,000 are employed and that 25 per cent of girls with ID had been raped. Early intervention to teach these children is highly recommended so that they can become autonomous and educated, and can rely on themselves to find a job in the future. Many initiatives have been proposed to improve the learning skills of children with ID. Disability should not prevent this population from learning and competing with their typically developing peers. The State of Qatar has opened several centers and schools to accommodate these children and offer them the right to education and care. The Shafallah center [8] is one of these modern centers. It has several schools and units that receive children from early childhood to the age of 21 years and help them integrate into society. However, the capacity of this center is limited and it cannot accommodate all the children with disability in the country. In addition, every child requires personalized materials that suit her/his special needs and challenges. Instructors cannot design tutorials for every child independently. They prepare a general lesson that very few children can understand in the first session. The instructors then repeat the same lesson several times using different concepts to fill in the understanding gaps of the children who could not follow it. One lesson may take several weeks to be taught and comprehended. This slows down the learning process considerably, and the proposed curriculum cannot be completed as expected. The instructors face a major difficulty in teaching these children because no appropriate content is available.
The curriculum gives the outline of the topics that should be covered based on the children's IQ (a key indicator of the intellectual challenges of every child), and every instructor has to develop the content and present it to the children in the classroom. As memorization and concentration are major problems affecting all children with ID, three instructors and assistants have to be together in the same classroom to assist a group of eight children. This requires the center to recruit a large number of special education instructors and monitors, while the number of specialized instructors in the country is very limited. In addition, parents cannot assist their children at home because they do not know the lessons taught in the classroom and do not have a real-time connection with the instructors to follow the progress of their children. All they can do is ask the instructors about the status of their children over the phone or during parents' meetings.
In our previous work [3] we proposed an assistive learning system based on multimedia and intelligent algorithms that understands story-based scripts and translates them automatically into multimedia elements. Multimedia is known to have a great impact on improving learning [4]. The system has four main components, which are assessed in this work.
The remainder of the paper is organized as follows: section 2 briefly discusses the first prototype of the system; section 3 presents the results of assessing the system's components; sections 4 and 5 conclude the paper and discuss future work.

The System
As detailed in our previous work [3], the system consists of four main components. (a) The first component is a corpus that groups stories for children and words with their weights, which indicate the importance of each word in the domain of discourse (e.g., animals, food, and trees). In addition, every word is associated with a set of representative images and clips. This corpus grows with every new story added to it, and the words' weights are updated accordingly. (b) The second component is formed of several algorithms that rank the sentences of the story and select the most representative ones. Those sentences present the main ideas of the story, and the instructors can use them to explain the script in an effective manner. (c) The third component translates the sentences into multimedia (i.e., images and clips), which are retrieved mainly from the corpus. A matching algorithm was developed to match words with images. The instructor can also send new queries to the Google search engine using its API features [7] to fetch complementary multimedia content whenever needed, validate it, and present it immediately to the children in the classroom. The learning session is therefore dynamic and the instructor does not lack resources. As the Google search engine returns many images, a filtering algorithm was also proposed. It removes offensive and irrelevant images and thus reduces the number of images fetched from Google. Every child also has her/his own dataset, which contains the words she/he learnt in school. Parents can use the system at home and assist their children in the learning process. School time is very limited, and these children need intensive care and assistance to understand the meaning of words and concepts. The system supports personalized learning and involves parents and guardians in the learning process. Figure 1 gives an overview of the proposed system.
Every child will be given an iPad to see the images corresponding to the words being explained by the instructor. Every tutorial is followed by a set of exercises of different difficulty levels. Instructors can then follow the progress of the children based on the solved exercises. Every child has her/his own account to log in and solve the exercises.

Experiments and Evaluations
This section discusses the evaluation of the main components of the proposed system. These components are: (1) the creation of the domain-based corpus; (2) the sentence ranking and selection; (3) the pattern recognition; and (4) the image retrieval and filtering algorithms. Each component is evaluated individually.
The dataset: We have selected 50 stories for children from three domains: animals, food, and trees. These stories have been taken from the following websites:
• http://www.kidsworldfun.com/shortstories
• http://www.shortstories.net/for-children
• http://kidsfront.com/stories-for-kids
Three different approaches have been used to evaluate the system's components. The first approach uses data analysis to show the accuracy of the algorithms used. The second approach is based on expert users' observations and uses the recall and precision formulae. The last approach compares the technique used here with others known in the literature.

Evaluation of automatic domain-term creation
The key to this approach is the design and implementation of a corpus that can host many stories and categorize them automatically based on their domains. The proposed technique can analyze novel stories, update the corpus automatically with new terms, and calculate the weights that show their importance in the domain of discourse. The weights of the existing terms are also updated when a new story is added. The corpus is therefore enriched with every story, so that the domain is covered properly and the most representative terms can be identified effectively and modeled with images and clips. This approach makes it possible to build a library of words annotated with their weights and domains, which can be used to analyze and evaluate new stories. Our system identified around 1421 terms in different domains from the 50 selected stories. Table 1 shows the terms found in every domain. During this experiment, 26 terms were found to be common to the three domains (animal, food, and tree), albeit with different weights, as shown in Table 2. Many words have different weights in every domain. For instance, the word 'farmer' has higher weights in the animal and food domains but a lower weight in the tree domain. This weight is calculated from the frequency of the word 'farmer' in the selected stories of each domain. As mentioned before, a new story can change the words' weights in the corpus. Our algorithm is based on a bubble sort approach where important words in the domain have higher weights. We can then select the most representative sentences in the story script by using the weights of their constituent words. This innovative approach can be used in different natural language processing applications, including text summarization and searching [5] [6]. Even though the algorithm requires some time to organize the corpus and update the words' weights, it helps considerably in ranking and selecting the sentences of the story.
The corpus can also give information about the number of words that cover a specific domain. Table 3 gives a summary of the distribution of terms across the three domains and the common words that can be used in these domains.
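The weighting and selection idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of [3]: the weight of a term is assumed to be its relative frequency within the stories of one domain, a sentence is scored by the mean weight of its words, and Python's built-in sort stands in for the bubble sort mentioned in the text. The function names `tokenize`, `domain_weights` and `rank_sentences` and the toy stories are hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def domain_weights(stories):
    """Relative frequency of each term across one domain's stories (assumed weighting)."""
    counts = Counter(word for story in stories for word in tokenize(story))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

def rank_sentences(story, weights):
    """Score each sentence by the mean weight of its words, highest score first."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", story) if s.strip()]
    scored = []
    for sentence in sentences:
        words = tokenize(sentence)
        score = sum(weights.get(w, 0.0) for w in words) / len(words) if words else 0.0
        scored.append((score, sentence))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

# Toy "animal" domain; adding a story and recomputing changes every weight.
animal_stories = [
    "The farmer feeds the hens. The hens lay eggs.",
    "The farmer has a cow. The cow gives milk.",
]
weights = domain_weights(animal_stories)
ranking = rank_sentences("The hens lay eggs. A boy sings.", weights)
```

In this sketch stop words such as "the" are not removed, so they dominate the weights; a practical version would filter them before counting.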

Evaluation of sentence selection and ranking
An experiment was conducted with two expert users to identify the most important sentences in ten different stories. They read the stories and extracted the sentences. The results of the two users were compared with those of the method presented in this paper. The results for both the restricted and the flexible modes are shown in Table 4. In the restricted mode, the precision of our system is 74%, compared to 88% in the flexible mode.
Discussion: Based on the above results, the domain-based approach developed here can select the most important set of sentences that represent the main ideas of the story. This method can be considered an enhancement to other sentence ranking methods, and can improve them if combined with them, especially for domain-oriented documents. Our algorithm can also be used with search engines to improve the retrieval and ranking of documents. We can develop a large corpus grouping several domains that can be used in searching and retrieval techniques. We can also develop a dedicated corpus for a specific domain such as business or education and use it with search engines to retrieve the most pertinent documents in these domains, ranked by their importance. The analysis was conducted by two expert users who evaluated the ten stories. Each user selected the important sentences in each story, and these selections were compared with the system output. The evaluation was split into two modes:
• Restricted mode: It requires both users and the system to identify the same sentences and rank them.
• Flexible mode: It requires either user 1 or user 2 to select the same sentence as the system.
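The two evaluation modes can be illustrated with a short sketch. It assumes that sentences are identified by their index in the story, that precision is the fraction of system-selected sentences confirmed by the experts, and that the ranking requirement of the restricted mode is ignored for simplicity. The function `precision` and the sample selections are hypothetical and do not reproduce the data behind Table 4.

```python
def precision(system, user1, user2, mode):
    """Fraction of system-selected sentences that the experts also selected."""
    system, user1, user2 = set(system), set(user1), set(user2)
    if mode == "restricted":
        agreed = user1 & user2   # both experts must have chosen the sentence
    elif mode == "flexible":
        agreed = user1 | user2   # one expert choosing the sentence is enough
    else:
        raise ValueError(f"unknown mode: {mode}")
    if not system:
        return 0.0
    return len(system & agreed) / len(system)

# Hypothetical sentence indices picked for one story.
system_pick = [1, 3, 4, 7]
expert1_pick = [1, 3, 5]
expert2_pick = [1, 4, 6]
restricted = precision(system_pick, expert1_pick, expert2_pick, "restricted")
flexible = precision(system_pick, expert1_pick, expert2_pick, "flexible")
```

Because the flexible mode accepts a sentence confirmed by either expert, its precision can never be lower than that of the restricted mode, which is consistent with the 74% and 88% figures reported above.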

Evaluation of pattern recognition
The optimal method of evaluating our approach is a comparison with the system proposed by Alchemyapi [1]. In this experiment, the two systems were provided with the same story. The results obtained from both systems are shown in Tables 5 and 6. Our system identifies additional relationships such as 'Hens ruin it', while the Alchemyapi system instead identified a false relationship, 'Harry ruins it'. In addition, the system presented here discovers the relationship 'Hens lay eggs', which the Alchemyapi system could not extract. This assessment is based on one story. The two systems could be compared on all 50 stories to show the differences between them; however, this is not currently the main purpose of this work.
"Once in a city there were two neighbors, Harry and Danny. Harry would always remain tense as the hens of his neighbor daily used to enter his garden and ruin it. One day Harry came to know that Danny was away. So, he placed some eggs in the garden. When Danny returned home and saw Harry picking up the eggs, he asked him, "Harry, you have no hens. Then where did you get these hens from?" Harry innocently replied, "some hens come in my garden and lay eggs. I am lucky to get free eggs daily." Since that day, the hens of Danny never entered the garden of Harry".
The system processes sentences with simple structure, where stop words are eliminated first and a stemmer is then used to recognize the word roots. These words are searched first in the corpus, and their corresponding multimedia elements are presented to the instructor, who selects what is appropriate to present to the children. Whenever the words are not found in the corpus, the system sends a query to a search engine such as Google or Yahoo using its API features. Since most search engines return a large set of results that are irrelevant and sometimes offensive for children, a filtering algorithm is proposed to remove this type of image.
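The lookup pipeline just described (stop-word removal, stemming, corpus lookup, search-engine fallback) can be sketched as below. Everything in it is an assumption made for illustration: the stop-word list is a tiny sample, `naive_stem` is a crude suffix stripper standing in for a real stemmer, the corpus is a plain dictionary, and `fake_search` is a placeholder that performs no real API call.

```python
import re

# Tiny illustrative stop-word list; a real system would use a full list.
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "to", "his", "some", "my"}

def naive_stem(word):
    """Crude suffix stripper standing in for a real stemmer (assumption)."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def keywords(sentence):
    """Remove stop words, then reduce the remaining words to their roots."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return [naive_stem(w) for w in words if w not in STOP_WORDS]

def find_media(sentence, corpus, web_search):
    """Map each keyword to media from the corpus, or from the fallback search."""
    result = {}
    for word in keywords(sentence):
        media = corpus.get(word)
        result[word] = media if media else web_search(word)
    return result

# Hypothetical corpus entries and a stub replacing the search-engine query.
corpus = {"hen": ["hen.jpg"], "egg": ["egg.jpg", "egg.mp4"]}
fake_search = lambda word: [f"web:{word}"]
media = find_media("Some hens lay eggs in my garden", corpus, fake_search)
```

In this sketch the fallback results would still have to pass through the filtering step and be validated by the instructor before being shown to the children.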
The evaluation of the image processing component is based on two factors: (1) a 'before and after' comparison using our approach; and (2) the accuracy of the obtained results. Table 7 shows the results of evaluating the image retrieval and filtering algorithm with 25 different queries. The algorithm achieved 77% precision on average across all queries.
Discussion: In this study, the results of image searches using the Google search engine were compared with the method described here. Based on the results of text-based image retrieval using surrounding image annotations, it can be concluded that the image search is successfully distributed. The developed text-based image retrieval system can improve the efficiency of image searching, and can consequently help instructors find appropriate images rapidly.
There are two major limitations to this method that need to be addressed in future work:
• The method depends on the surrounding information attached to the images (e.g. tags). However, some attached annotations do not represent the real content of the image.
• To improve results, the method must compare its query with synonyms, since some results may be retrieved using a synonymous annotation, such as the word 'kid' for the word 'child'.
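Annotation-based filtering and the synonym limitation noted above can be illustrated with a small sketch. It assumes each retrieved image is a record with a URL and a list of tags, and that offensive content is recognized through a list of blocked tags; the function `filter_images`, the record layout and the sample data are hypothetical and are not the filtering algorithm evaluated in Table 7.

```python
def filter_images(images, query, blocked_terms, synonyms=None):
    """Keep images tagged with the query (or a synonym) and with no blocked tag."""
    query = query.lower()
    accepted = {query} | {s.lower() for s in (synonyms or {}).get(query, [])}
    kept = []
    for image in images:
        tags = {t.lower() for t in image["tags"]}
        if tags & blocked_terms:      # offensive annotation: drop the image
            continue
        if tags & accepted:           # annotation matches the query or a synonym
            kept.append(image["url"])
    return kept

# Hypothetical search results with their surrounding annotations.
images = [
    {"url": "a.jpg", "tags": ["child", "school"]},
    {"url": "b.jpg", "tags": ["kid", "park"]},
    {"url": "c.jpg", "tags": ["child", "violence"]},
    {"url": "d.jpg", "tags": ["car"]},
]
blocked = {"violence"}
plain = filter_images(images, "child", blocked)
expanded = filter_images(images, "child", blocked, synonyms={"child": ["kid"]})
```

Without the synonym table the image tagged 'kid' is lost, which is the second limitation listed above. The first limitation remains in both cases, since the filter trusts the tags and never inspects the image itself.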

Conclusion
We have proposed an assistive learning system with new algorithms that mine the script of stories, select the most representative sentences and translate them into multimedia. A corpus is provided that groups words, their weights, and the corresponding images. The system is dynamic: the instructors can request additional multimedia elements through the Google search engine whenever needed. Three domains were used with 50 stories of simple structure, and the common words were determined together with their importance in every domain. The instructors found the system very useful in helping them prepare their tutorials and present them to the children in an enjoyable manner. They could keep the children more focused and more engaged for a longer time.

Future Work
The following tasks need to be completed in future work:
1. Testing: Testing sessions with instructors and students using the system should be conducted at regular intervals. We attended some testing sessions in the Shafallah center; however, more sessions should be held and the results analyzed to determine which components need to be improved.
2. Automation of domain-oriented creation: Methods to improve the efficiency of story classification are required. This could be achieved by adding many stories from different domains to the corpus and evaluating the subsequent quality of the system. In the current iteration, we have added 50 stories on the animal, tree and food domains, but a larger number is certainly needed to cover the domains properly.
3. Pattern recognition using chunking sequence: Different algorithms are currently used for extracting hidden information from story scripts. These are mainly based on predefined rules such as X <action> Y. Adding new rules to these algorithms may improve the automatic pattern recognition. Lexical annotation software such as MADAMIRA can be used to tag the words as verbs, nouns and adverbs, which helps to identify the actions and the actors in the story.
4. Term weight calculation: To improve the calculation of term weights, we propose to use WordNet to provide synonyms for the terms. Such an approach can help to create a generic corpus that groups the words, their synonyms and their weights in different domains.
5. Understanding complex stories: Our proposed system currently deals with stories that have a simple structure or simple sentences. Dealing with articles or stories with complex structure is one of the areas that needs to be studied in detail in the future.
6. Image copyright: The system currently sends queries to the Google search engine, which returns a set of different images. The copyright issues related to the use of these images must be considered before a wider roll-out.
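As an illustration of the rule form X <action> Y mentioned in item 3, the sketch below extracts a triple whenever a known action word sits between two other content words. It is deliberately naive and rests on assumptions: the action lexicon `ACTIONS` and the stop-word list are hypothetical hand-written samples, no part-of-speech tagging is performed, and only directly adjacent subject, action and object are handled. It is not the pattern recognition algorithm evaluated in this paper.

```python
import re

ACTIONS = {"lay", "ruin", "enter"}                       # hypothetical action lexicon
STOP_WORDS = {"the", "a", "an", "some", "his", "my", "in", "and"}

def extract_triples(sentence):
    """Return (X, action, Y) for each action word found between two content words."""
    words = [w for w in re.findall(r"[a-z]+", sentence.lower())
             if w not in STOP_WORDS]
    triples = []
    for i, word in enumerate(words):
        if word in ACTIONS and 0 < i < len(words) - 1:
            triples.append((words[i - 1], word, words[i + 1]))
    return triples

triples = extract_triples("Some hens lay eggs")
```

A tagger of the kind proposed in item 3 would replace the fixed lexicon, and additional rules would be needed for sentences in which the actor and the action are separated by other words.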