A Distributed Distribution and Scheduling Algorithm of Educational Resources Based on Vector Space Model

To explore the distributed dispatching and scheduling algorithm of educational resources, a user and resource description model is built on the vector space model (VSM). A user interest representation method based on the background and tense vector space model (BTVSM) is proposed, and a distributed architecture based on a scheduling server is used to provide personalized distribution service to resource users. The task scheduling model and algorithm are also analyzed: task scheduling optimization based on the ant colony algorithm is used in the scheduling server, and a simulation experiment is designed to verify the effectiveness of the distributed dispatching and task scheduling algorithm for educational resources. The experimental results support the validity of the VSM-based distributed dispatching and scheduling algorithm in a specific system, where it effectively solves the allocation problem between the resource user group and the distributed dispatching center.


Introduction
With the continuous accumulation of new resources and the increasing number of resource users, there is an obvious problem in the construction of educational resources: new resources cannot be updated to the user side of resources in time and the resource users are in urgent need of new resources. How to realize effective resource distribution has become an urgent problem in the current educational resources construction, which is the bottleneck of restricting the development of educational resources and the further development of educational information. In view of the problems existing in the distribution of existing resources, the distributed delivery and scheduling problem of educational resources is studied, focusing on the research of its delivery and scheduling algorithms. In the distribution of educational resources, the emphasis is put on the personalized needs of the resources users. In the process of personalized distribution of resources, the user interest representation method based on the background and tense vector space model (BTVSM) is put forward. In addition, the task scheduling optimization based on ant colony algorithm is adopted in the scheduling server, which effectively solves the allocation problem between the resource user group and the distributed distribution center.

2
Literature Review
Chen et al. (2014) indicated that users could submit tasks on the host for processing, and that the demand for load allocation arose in such an environment. Because tasks arrive randomly and their CPU service times are also random, several computers are very likely to be idle or lightly loaded while others are heavily loaded, which reduces performance. In practical applications, there is always a server or system waiting on the other server for idle tasks [1]. Liu (2017) took educational resources distribution as the research object, selected 96 universities in the United States as the actual cases, and analyzed how to achieve a more equitable distribution of educational resources [2]. Gang (2013) focused on the study of free resources based on the Internet. The research showed that this kind of resource service was provided by only a few sites, and that the variety and quantity of resources were limited, often consisting of courseware and journal publications [3].
Because of differences in economic and social development, the degree of educational informatization differs greatly at home and abroad. This in turn leads to great differences in the construction of educational resources and in resource services. In foreign countries, the information technology level of school teachers is very high, and educational resources are generally developed by the teachers themselves, with only a small number of teaching aids. Chakraborty (2013) analyzed the work of the load allocation algorithm and pointed out that it distributed and redistributed tasks among all participating nodes, thus maximizing the overall performance of the whole system. He focused on the details of the load allocation algorithm and its applicability in various load sensors [4]. Bouguerra et al. (2014) studied the fault-tolerant scheduling of parallel systems with non-memoryless failure distributions and focused on the analysis of error handling that may occur during resource scheduling [5]. Purdy et al. (2015) studied the scheduling and allocation of medical education resources, mainly to improve the energy consumption efficiency of Core processors. Compared with the latest techniques, the proposed method had significantly lower energy consumption at both low and high parallel processing speeds [6]. Wan et al. (2014) studied the user-centered resource service and pointed out that most services used a portal as the basic form of presentation, where a friendly interface and convenient access to resources could provide timely services such as information search and solutions to common problems. Multiple portals could correspond to one resource database to provide different functional interfaces [7]. Roumasset and Wada (2011) studied online ordering resources, which were mainly transferred by mail or express delivery. They also pointed out that the content of educational resources included not only K12 resources for basic education but also a variety of professional resources. The resources were mainly displayed in the form of literature, and there were a large number of curriculum plans and programs, while audio and video resources were relatively few [8].
To sum up, the distribution of educational resources requires not only a larger operating environment but also a fair distribution mechanism. Load allocation and optimized scheduling of educational resources can maximize the overall performance of the system, improve energy efficiency, and diversify the variety of resources, thus providing users with timely service and convenient access to resources. Therefore, the vector space model is introduced to focus on the personalized needs of resource users and to solve the allocation problem between the resource user group and the distributed distribution center.
The first part introduces the research background and explains the importance of Background and Tense Vector Space Model (BTVSM) for the representation of user interest. The second part summarizes various researches on resource scheduling and allocation. The third part constructs the user interest model based on BTVSM, analyzes the recommendation process of personalized resources, and introduces the recommendation method of personalized services, task scheduling model and algorithm, as well as the experimental environment. The fourth part takes a user as an example to select personalized resources for Chinese subjects in grade 4 of primary school and carries out personalized service experiment and task scheduling service experiment. The fifth part gives the conclusion.
This study confirms the effectiveness of education resource distributed distribution and scheduling algorithm based on VSM in the application of specific systems, and it can effectively solve the distribution problem between the resource user group and the distributed distribution center. The specific example is studied, so the results have certain validity. However, the study is only focused on one area, and the findings may not be applicable to other areas. Therefore, it is necessary to conduct further research in other fields.

3
Method and Technology

User personalized resource recommendation
The premise of personalized resource recommendation is that the user and resource description should be characterized by the modeling, and then the personalized service can be realized by the matching calculation. As a result, user modeling and resource modeling are very important contents, which will be focused on.
First of all, it is necessary to build user interest model based on BTVSM. For a user's interest in a particular field, the interest and preference of the user shows the characteristics of stability and variability, so the user interest model should have strong adaptability and robustness. In the description of the user interest model, the following two factors should be considered: first, the user will have different understandings for different background knowledge expressed in a key word; second, the interests of users will change over time.
In order to reflect the user's interest more faithfully, the user interest model is represented by the BTVSM. In the model, the subject and the study section are introduced as the background restriction, and an interest weighting function based on temporal change is introduced into the vector space to calculate the attenuation and update of the user's interest weight. At some point t, the user interest model is expressed by Formula (1).
In Formula (1), S represents a collection of disciplines (such as mathematics, language, and biology); G indicates a collection of learning segments (such as primary school first year, second year, and third year); K is the user interest keyword vector space; kn is the nth keyword describing interest; each keyword kn has an associated time set that records every time it was submitted, in which tnm is the time of the mth submission of kn; and each keyword kn has a weighting function of time.
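As a reading aid, the definitions above suggest one plausible way to write the model. This is only a sketch: the symbols $U(t)$, $T_n$, and $f_n(t)$ are assumed notation introduced here for illustration, not taken from the source.

```latex
U(t) = (S,\; G,\; K), \qquad
K = \{\, (k_1, T_1, f_1(t)),\ (k_2, T_2, f_2(t)),\ \dots,\ (k_n, T_n, f_n(t)) \,\}, \qquad
T_n = \{\, t_{n1}, t_{n2}, \dots, t_{nm} \,\}
```

Here $S$ and $G$ act as the background restriction, $T_n$ collects the submission times of keyword $k_n$, and $f_n(t)$ is its time-dependent interest weight.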
For the attenuation and update of user interest, the time window mechanism is used to calculate, that is, if the keyword is submitted, the weight increases in a certain time window Δt; otherwise the weight will be attenuated.
Suppose that:
• For each time window Δt, each time the keyword kn is submitted, the weight of interest is increased by a units;
• For each time window Δt, if the keyword kn is not submitted, the weight is attenuated by b units.
Then, at some moment t, the interest weight of the keyword kn is the result of accumulating these increases and attenuations over the elapsed time windows.
The user interest model is described by a tree structure, as shown in Figure 1.
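The time window rule can be illustrated with a minimal Python sketch. It assumes that windows are counted from a start time `t0`, that only fully elapsed windows without a submission cause attenuation, and that the weight is not clamped at zero; the function name and these details are assumptions made for illustration.

```python
import math


def interest_weight(submit_times, t, dt, a, b, t0=0.0):
    """Interest weight of one keyword at time t under the time-window rule.

    Windows are [t0 + i*dt, t0 + (i+1)*dt). Every submission up to time t
    adds `a`; every fully elapsed window with no submission subtracts `b`.
    """
    n_windows = int(math.floor((t - t0) / dt))  # fully elapsed windows
    counts = [0] * n_windows                    # submissions per elapsed window
    in_current = 0                              # submissions in the running window
    for s in submit_times:
        if s < t0 or s > t:
            continue                            # outside the observed interval
        i = int((s - t0) // dt)
        if i < n_windows:
            counts[i] += 1
        else:
            in_current += 1
    gained = a * (sum(counts) + in_current)
    lost = b * sum(1 for c in counts if c == 0)
    return gained - lost


# Three submissions over four unit windows; windows 1 and 2 are empty.
weight = interest_weight([0.5, 0.7, 3.2], t=4, dt=1, a=1.0, b=0.5)
```

In this example the keyword gains 3a from three submissions and loses 2b for the two empty windows.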
iJET - Vol. 14, No. 4, 2019
Next, the recommendation process of personalized resources is analyzed. Personalized service covers two aspects: first, selecting personalized initial resources for new users based on their basic information; second, recommending updated resources to existing users according to the interests they have shown on the information platform.
The flow chart of new user personalized initial resource recommendation is shown in Figure 2. The service flow of old user personalized update resource recommendation is shown in Figure 3.

Recommendation method of personalized service
When selecting the personalized initial resources for users, the information filtering technology is used to make personalized resource selection for users. In order to improve the performance of personalized recommendation service, the system combines content-based recommendation and collaborative recommendation based on information filtering technology to implement mixed recommendation.
The first is a rule-based recommendation method.
In order to accurately express the resource delivery rules based on user basic information, the weighted uncertainty representation with confidence is used. Its elements are as follows.
Ei is a prerequisite of the rule. It can be a simple condition or a combination of multiple simple conditions connected by AND.
H is the conclusion, which can be a single conclusion or a combination of conclusions linked by AND.
CF(H, E) is the credibility of the rule, also called the certainty factor or the rule strength. Credibility is a quantitative representation of the degree of belief in something, and its initial value is determined by domain experts.
λ is a threshold that limits the applicability of the corresponding rule: the rule is applied only when the confidence level CF(Ei) of the prerequisite Ei reaches or exceeds it, that is, CF(Ei) ≥ λ.
ωi (i = 1, 2, ..., n) is a weighting factor whose value is given by domain experts.
In order to obtain the characteristics of users' resource choices, experienced experts first determine, from the users' basic information and the distribution of the resources, the factors that influence user satisfaction with the distributed resources and the importance of each factor. Then the historical records in the information base are used as a training example set, and the characteristics of users' resource choices are obtained by data mining. These characteristics are processed through a series of steps, such as rule transformation, conflict resolution, synthesis, and updating.
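To make the roles of ωi, λ, and CF(H, E) concrete, the following Python sketch applies one weighted rule. It assumes the common textbook form of weighted uncertain reasoning, in which the combined premise confidence is the weighted sum of the CF(Ei) and the conclusion confidence is CF(H, E) times that sum. The source does not spell out these combination formulas, so they are assumptions here, as is the function name `apply_rule`.

```python
def apply_rule(premises, cf_rule, threshold):
    """Apply one weighted rule: IF E1(w1) AND ... AND En(wn) THEN H.

    premises  : list of (weight, cf) pairs, one per prerequisite Ei
    cf_rule   : CF(H, E), the credibility of the rule itself
    threshold : lambda, the minimum premise confidence for the rule to fire

    Returns CF(H) when the rule fires, otherwise None.
    """
    # Assumed combination: weighted sum of the premise confidences.
    cf_premise = sum(w * cf for w, cf in premises)
    if cf_premise < threshold:
        return None  # the rule is not applicable
    # Assumed propagation: conclusion confidence scales with the rule credibility.
    return cf_rule * cf_premise


# Two premises with equal weights; the combined confidence is about 0.7.
fired = apply_rule([(0.5, 0.8), (0.5, 0.6)], cf_rule=0.9, threshold=0.6)
blocked = apply_rule([(0.5, 0.8), (0.5, 0.6)], cf_rule=0.9, threshold=0.8)
```

With a threshold of 0.6 the rule fires, while raising the threshold to 0.8 blocks it.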
In the selection of distribution rules, a target-oriented reverse reasoning is adopted, and fuzzy matching and fuzzy reasoning are adopted to achieve the selection of distribution rules. According to the basic information of the user, fuzzy matching is used to select the distribution rules for the users. If there are no matching rules, fuzzy reasoning theory can be used to produce new distribution rules to supplement the distribution rule base and enrich the distribution rule base continuously.
Then, the recommendation method based on content filtering recommends resources by comparing resource models with user models. The following example illustrates how content filtering recommends resources of interest to a user.
Input: user interest tree UI, resource set R, threshold n in the TOPn principle.
Output: user interest resource set UIR.
Method:
/* Initialize the distance list L and the user interest resource set UIR */
For each interest branch UIj of the user interest tree do
    Select from the resource set R the resource subset R' that has the same segment and subject as the current user interest UIj
    For each resource Ri' in the resource subset R' do
        Calculate the distance between the keyword vector of the current user interest UIj and the keyword vector of the current resource Ri', and store it in the distance list L of UIj
    End For
    Select the top n most highly correlated resources from the distance list L according to the TOPn principle and store them in the user interest resource set UIR
End For
Return the user interest resource set UIR
From the above process, it can be seen that the key step is the similarity calculation between resources and user interests.
For vector space models, Euclidean distance, cosine distance and inner product are commonly used.
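For any two vectors $X = (x_1, x_2, \dots, x_n)$ and $Y = (y_1, y_2, \dots, y_n)$, the Euclidean distance is:
$$d(X, Y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$
The cosine distance (the cosine of the angle between the two vectors) is:
$$\cos(X, Y) = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}}$$
The inner product is:
$$X \cdot Y = \sum_{i=1}^{n} x_i y_i$$
For the cosine distance and the inner product, the greater the value between the user interest keyword vector and the resource description keyword vector, the greater their similarity. For the Euclidean distance the relation is reversed: the smaller the distance, the greater the similarity.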
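The content filtering procedure can be sketched in Python as follows. The sketch uses the cosine measure as the similarity, which is one of the three options listed and an assumption about which one the system uses. The dictionary keys (`subject`, `grade`, `vector`, `id`) and function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two keyword weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(interest_tree, resources, n):
    """Return the ids of the top-n resources for each interest branch."""
    uir = []  # user interest resource set, kept in ranking order
    for branch in interest_tree:
        # Keep only resources with the same subject and learning segment.
        subset = [r for r in resources
                  if r["subject"] == branch["subject"]
                  and r["grade"] == branch["grade"]]
        # Rank by similarity to the branch keyword vector, most similar first.
        ranked = sorted(subset,
                        key=lambda r: cosine(branch["vector"], r["vector"]),
                        reverse=True)
        for r in ranked[:n]:
            if r["id"] not in uir:
                uir.append(r["id"])
    return uir


tree = [{"subject": "Chinese", "grade": 4, "vector": [1.0, 0.0]}]
pool = [
    {"id": 1, "subject": "Chinese", "grade": 4, "vector": [1.0, 0.0]},
    {"id": 2, "subject": "Chinese", "grade": 4, "vector": [1.0, 1.0]},
    {"id": 3, "subject": "Chinese", "grade": 4, "vector": [0.0, 1.0]},
    {"id": 4, "subject": "Math", "grade": 4, "vector": [1.0, 0.0]},
]
top_two = recommend(tree, pool, n=2)
```

Resource 4 is excluded by the background restriction even though its keyword vector matches exactly, which is the effect the subject and segment filter is meant to have.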
In addition, the determination of the threshold n in the input information is also very important, since a value that is too large or too small affects the performance of the system. If n is too small, the recall rate will be reduced, and if n is too large, the accuracy rate will be reduced. Therefore, the choice of n is a question worthy of further study. In this system, the value of n is tied to the total number of recommended resources and is set to 3/5 of that total.
Finally, there is the recommendation method based on collaborative filtering. Unlike content-based filtering, it compares user models with one another rather than comparing a resource model with a user model. The key problem is to establish user interest groups, which can be done by clustering users with a mature clustering algorithm.
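A minimal Python sketch of group-based collaborative recommendation follows. It assumes that the interest groups have already been formed by some clustering step, which is out of scope here, and that each user already has a content-based interest list. It then proposes the resources that other group members are interested in but that the user's own list lacks. All names and the data layout are hypothetical.

```python
def collaborative_candidates(user, groups, content_lists):
    """Potential interest resources for `user` drawn from the user's group.

    groups        : dict mapping group id -> list of user ids
    content_lists : dict mapping user id -> content-based resource id list
    """
    # Find the interest group that contains the user.
    members = next(m for m in groups.values() if user in m)
    own = set(content_lists[user])
    seen, candidates = set(), []
    for member in members:
        if member == user:
            continue
        for resource in content_lists[member]:
            # Keep resources the user has not already been recommended.
            if resource not in own and resource not in seen:
                seen.add(resource)
                candidates.append(resource)
    return candidates


groups = {4: ["u1", "u2", "u3"]}
content_lists = {"u1": [1, 2], "u2": [2, 3], "u3": [3, 4]}
potential = collaborative_candidates("u1", groups, content_lists)
```

Because every candidate comes from outside the user's own list, this step can only add recommendations, which is why it tends to raise recall.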

Task scheduling model and algorithm
In the process of educational resources distribution, the essence of task scheduling is to deal with the matching between the user and the distribution center. The aim is to balance the system load of whole education resource distribution center. First, the task scheduling model should be built.
Combined with the actual situation of distributed distribution system of educational resources, the load balancing of distribution centers in the system is implemented by agent-based software. That is to say, a scheduling server is added between the distribution center and the user group, and the system realizes load balancing according to a certain task scheduling algorithm. Task scheduling services in the distributed distribution system of educational resources are composed of task receivers, task schedulers, task transponders, task monitors, as well as original task queues, optimized task queues, task logs and information bases, as shown in Figure 4. From a macro point of view, the scheduling server balances the tasks of each resource distribution center, trying to avoid excessive requests to a single center, so as to balance the system load. On the microscopic level, it is possible to get a reasonable allocation scheme between distribution centers and resource users through optimization algorithm, so that the system performance is optimal. The early warning mechanism is set up for the prevention of abnormal problems. During the system operation, the health condition of the system is monitored in real time. When the abnormal situation occurs, the emergency processing program is automatically triggered to protect the normal operation of the system.
In the scheduling server, the task scheduler is the key component of the whole system. In the process of its operation and scheduling, the following two key problems should be dealt with: the first is the load status evaluation of the distribution center server; and the second is the optimal scheduling algorithm that matches between the user and the distribution center.
For the load evaluation of the distribution center server, the system uses static evaluation method to evaluate the load of the distribution center in the initial stage. And it is evaluated by the statistical evaluation method after the system is running for a period of time. This can more effectively reflect the load capacity of the distribution center in the actual work.
In this system, different servers can respond to different users, and the allocation between the two is a typical combinatorial optimization problem. The ant colony algorithm is an optimization technique based on population operation. It can implicitly search multiple solutions in the solution space and can use the similarity between different solutions to improve the efficiency of the concurrent search. The ant colony algorithm is also versatile and robust, which makes it well suited to combinatorial optimization problems. Therefore, it is used to solve the optimal allocation between users and distribution centers. The optimal scheduling algorithm, based on the MAX-MIN ant colony algorithm, is as follows.
The key of the ant colony algorithm is to transform the practical problem into an ant colony network. Here, the process of assigning distribution centers is divided into stages, and each stage assigns a distribution center to provide the downloading service.
A number of ants are set up. Each ant transfers from a distribution center to a user, and the transfer probability is given by Formula (5). In Formula (5), allowedk indicates the set of users that ant k can still access; in each cycle, the users that have already been accessed are deleted from this list. τij(t) is the pheromone content on the path between the center and the user at time t, ηij is the heuristic degree of that path, the parameter α is the weight of the residual pheromone, and the parameter β is the weight of the heuristic information. At the initial time, the pheromones of all paths are equal and set to C (C is a constant), and ηij is determined by a heuristic algorithm. The heuristic used is the reciprocal of the download time at the distribution center for completing the user task.
In the pheromone update equation, Formula (7), p is the coefficient of volatilization, a constant in the range of 0 to 1; Q is a constant; and the update is driven by the global better solution.
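For orientation, the standard Ant System and MAX-MIN forms of these three relations are given below. They are the usual textbook versions and are offered as an assumption about the shape of Formulas (5) to (7), which may differ in detail. The index convention ($i$ for the center, $j$ for the user), the download time $t_{ij}$, the cost $f(s^{\mathrm{best}})$ of the global better solution, and the bounds $\tau_{\min}, \tau_{\max}$ are notation introduced here; $\rho$ stands for the volatilization coefficient written as p in the text.

```latex
p_{ij}^{k}(t) =
\begin{cases}
\dfrac{[\tau_{ij}(t)]^{\alpha}\,[\eta_{ij}]^{\beta}}
      {\sum_{s \in \mathrm{allowed}_k} [\tau_{is}(t)]^{\alpha}\,[\eta_{is}]^{\beta}},
  & j \in \mathrm{allowed}_k \\[2ex]
0, & \text{otherwise}
\end{cases}
\qquad
\eta_{ij} = \frac{1}{t_{ij}}

\tau_{ij}(t+1) = (1-\rho)\,\tau_{ij}(t) + \Delta\tau_{ij}^{\mathrm{best}},
\qquad
\Delta\tau_{ij}^{\mathrm{best}} =
\begin{cases}
Q / f(s^{\mathrm{best}}), & (i,j) \in s^{\mathrm{best}} \\
0, & \text{otherwise}
\end{cases}
\qquad
\tau_{\min} \le \tau_{ij} \le \tau_{\max}
```

The clamping of $\tau_{ij}$ to $[\tau_{\min}, \tau_{\max}]$ is what distinguishes the MAX-MIN variant: it keeps any single path from dominating the search too early.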
The specific steps of the algorithm are as follows:
Step 1: initialize α, β, p, Q, and C.
Step 2: oc ← 0 (oc is the external cycle count).
Step 3: nc ← 0 (nc is the built-in cycle count).
Step 4: let out the ants at the distribution centers (j = 1, 2, …, m); each ant chooses the next user according to the transfer probability. Constraints: ants in the same center cannot be transferred to the same user, and the number of ants at a user cannot exceed the upper limit on the number of distribution centers that the user can connect to.
Step 5: modify the allowedk list, calculate the objective values of this assignment, compare them with the better results so far, and put the result in the result set; nc ← nc + 1.
Step 6: if nc > the set number of built-in cycles, use the better results to update the pheromone intensity according to the update equation and set oc ← oc + 1; otherwise, reset the allowedk list and go to Step 4.
Step 7: if oc > the set number of external cycles, output the current optimal solution; otherwise, go to Step 3.
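The steps above can be sketched in Python under several simplifying assumptions. The objective is taken to be the total download time; each ant builds a solution by letting users pick a center in random order, rather than by walking ants out of the centers; and the pheromone is reinforced only along the best assignment found so far and clamped to [tau_min, tau_max]. The function name, parameter defaults, and the objective are assumptions for illustration, not the authors' implementation.

```python
import random


def mmas_assign(times, capacity, n_ants=10, n_iter=200, alpha=1.0, beta=2.0,
                rho=0.1, q=100.0, tau_min=0.01, tau_max=1.0, seed=0):
    """Assign every user to one center with a MAX-MIN style ant colony search.

    times[c][u] is the download time of user u from center c, and capacity[c]
    is the connection limit of center c. Returns (assignment, total_time),
    where assignment[u] is the index of the center serving user u.
    """
    n_c, n_u = len(times), len(times[0])
    if sum(capacity) < n_u:
        raise ValueError("not enough connections for all users")
    rng = random.Random(seed)
    tau = [[tau_max] * n_u for _ in range(n_c)]  # equal initial pheromone
    eta = [[1.0 / times[c][u] for u in range(n_u)] for c in range(n_c)]
    best, best_cost = None, float("inf")
    for _ in range(n_iter):                      # external cycles
        for _ in range(n_ants):                  # built-in cycles
            load, assign = [0] * n_c, [0] * n_u
            users = list(range(n_u))
            rng.shuffle(users)
            for u in users:
                # Only centers with free connections remain allowed.
                open_c = [c for c in range(n_c) if load[c] < capacity[c]]
                weights = [tau[c][u] ** alpha * eta[c][u] ** beta
                           for c in open_c]
                c = rng.choices(open_c, weights=weights)[0]
                assign[u] = c
                load[c] += 1
            cost = sum(times[assign[u]][u] for u in range(n_u))
            if cost < best_cost:
                best, best_cost = assign, cost
        # Evaporate everywhere, reinforce the best assignment, then clamp.
        for c in range(n_c):
            for u in range(n_u):
                t = (1.0 - rho) * tau[c][u]
                if best[u] == c:
                    t += q / best_cost
                tau[c][u] = min(tau_max, max(tau_min, t))
    return best, best_cost


toy_times = [[1, 10, 10], [10, 1, 1]]
plan, total = mmas_assign(toy_times, capacity=[1, 2])
```

On this toy instance the only assignment with total time 3 sends user 0 to center 0 and users 1 and 2 to center 1, which the search finds easily.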

Experimental environment
In order to test the performance of the designed distributed distribution and scheduling algorithm for educational resources, both the personalized service and the task scheduling between users and distribution centers are simulated. Both are implemented in the scheduling server, which in actual operation exposes access interfaces to the resource users, the distribution centers, and the headquarters resource center as Web Services and has no visual interface. To observe the process and results intuitively, a visual Windows program is used to carry out the simulation experiment. Taking users of the Chinese language subject in grade four of primary school as the simulation object, three experiments are carried out: personalized resource selection based on content filtering, personalized resource selection based on collaborative filtering, and task scheduling between users and distribution centers.
The task scheduling experiment takes 3 distribution centers providing upgrade services for 20 users as an example to test the optimal scheduling results of the system. The setup is as follows:
• The three servers are denoted C1, C2, and C3.
• The maximum number of connections is 5 for server C1, 8 for server C2, and 7 for server C3.
• The 20 users are denoted U1, U2, ..., U20.
• Each of the 20 users connects to at most 1 server, all users download the same resource packet with a file size of 50000K, and the download speed between each user and each server is generated randomly within 100K/s~900K/s.

Results

Personalized service experiment
The personalized resource selection for the Chinese language subject in grade four of primary school is conducted as follows.
Step 1: choose the resources whose subject is Chinese language in grade four from the resources to be upgraded.
Through user u1's direct feedback, it is known that 3 of the resources recommended by the system (5, 6, 13) are ones that u1 is not very interested in, and that 4 of the resources not recommended (11, 24, 27, 31) are ones that u1 is interested in. The recall rate, accuracy rate, and similarity are then computed according to the personalized evaluation method.
Following the above experimental process, 10 experiments were carried out; the resulting personalized service indicators are shown in Table 1. The experimental data show that the personalized resource recommendation service based on content filtering in this system has a relatively high accuracy rate, while its recall rate is slightly lower. This is consistent with the earlier analysis. Therefore, users' potential interest resources need to be supplemented through collaborative filtering.
The experimental results of personalized resource selection based on collaborative filtering are as follows. Taking u1 as the example, collaborative resources are recommended. User u1 belongs to the fourth interest group; content-based filtering resource recommendation is carried out for that group to form a group interest resource list. The resources in this list that are not among u1's own content-based interest resources are then recommended to u1 as potential interest resources.
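The recall and accuracy rates for a feedback round like the one for u1 can be computed as in the sketch below. It assumes the usual definitions (recall is hits divided by all resources of interest, accuracy is hits divided by all recommended resources). The set of correctly recommended resources, `hits`, is hypothetical, since it is not listed in the text; only the three uninteresting recommendations and the four missed resources come from the experiment.

```python
def recall_and_accuracy(recommended, interested):
    """Return (recall, accuracy) for one recommendation round."""
    recommended, interested = set(recommended), set(interested)
    hits = recommended & interested
    recall = len(hits) / len(interested) if interested else 0.0
    accuracy = len(hits) / len(recommended) if recommended else 0.0
    return recall, accuracy


# Hypothetical resources that were both recommended and of interest.
hits = [1, 2, 3, 4, 7, 8, 9, 10, 12, 14, 15, 16]
# From the feedback of u1: recommended but not of interest.
not_interesting = [5, 6, 13]
# From the feedback of u1: of interest but not recommended.
missed = [11, 24, 27, 31]

recall, accuracy = recall_and_accuracy(hits + not_interesting, hits + missed)
```

With these illustrative numbers, 12 of 16 interesting resources are recovered and 12 of 15 recommendations are relevant, so missed resources lower recall and uninteresting recommendations lower accuracy.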
From the above analysis, it can be seen that although collaborative filtering can improve the recall rate of personalized services, recommending too many potential interest resources can reduce the accuracy and similarity.

Task scheduling service experiment
In order to demonstrate the optimization performance of task scheduling based on the ant colony algorithm, simulation experiments on the allocation between the distribution centers and the resource users are carried out with the fastest response method, the weighted rotation method, and the ant colony algorithm. The constants of the ant colony algorithm are initialized, and the number of iterations is 200.
The allocation scheme of the fastest response method is as follows:
• Users served by server C1: U4, U7, U10, U11, and U12.
The allocation scheme of the weighted rotation method is as follows:
• Users served by server C1: U6, U9, U12, U15, and U18.
The allocation scheme based on the ant colony algorithm is as follows:
• Users served by server C1: U4, U10, U12, U14, and U17.
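As a point of reference for the baselines, the sketch below shows one common reading of a fastest response method: users are handled in order, and each is sent to the center with the shortest download time that still has a free connection. The text does not define the baseline, so this greedy rule is an assumption. The instance mirrors the experimental setup (3 centers with limits 5, 8, and 7; 20 users; a 50000K file; speeds drawn from 100K/s to 900K/s), but the random speeds are freshly generated and will not reproduce the allocations listed above.

```python
import random


def fastest_response(times, capacity):
    """Greedy baseline: each user takes the fastest center with a free slot.

    times[c][u] is the download time of user u from center c. Returns a list
    where entry u is the index of the center serving user u.
    """
    n_c, n_u = len(times), len(times[0])
    load = [0] * n_c
    assign = []
    for u in range(n_u):
        open_c = [c for c in range(n_c) if load[c] < capacity[c]]
        c = min(open_c, key=lambda center: times[center][u])
        assign.append(c)
        load[c] += 1
    return assign


rng = random.Random(0)                      # fixed seed for repeatability
capacity = [5, 8, 7]                        # limits of C1, C2, C3
file_size = 50000                           # K
speeds = [[rng.uniform(100, 900) for _ in range(20)] for _ in range(3)]
times = [[file_size / s for s in row] for row in speeds]

assign = fastest_response(times, capacity)
total_time = sum(times[c][u] for u, c in enumerate(assign))
```

Because the greedy rule never revisits an earlier choice, later users may be forced onto slow centers once the fast ones are full, which is the weakness a global search such as the ant colony algorithm is meant to address.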
The detailed results of the contrastive experiment are shown in Table 2.
The histogram comparing the algorithm results is shown in Figure 5. The objective function value of the ant colony algorithm over 200 iterations is plotted in Figure 6.
It can be seen from the figure that the algorithm finds a better solution, 1383.5 seconds, after the 106th iteration and shows good convergence.