Application of Data Mining in Library-Based Personalized Learning

Abstract: This paper describes how to mine library data with the DBSCAN algorithm in order to help teachers and students find the books they want in a large library collection. First, a model for applying the DBSCAN algorithm to library data mining is proposed, followed by a version of the algorithm improved to meet these demands. Finally, an experiment is presented to validate the algorithm. The results show that book price and inventory level have less impact on the resulting clusters than book classification and borrowing frequency. Library procurement staff should therefore make purchasing and subscription decisions based on the results of the cluster analysis, so as to improve the hierarchy and distribution of library resources, make the collection more scientific and reasonable, and stimulate readers' interest in borrowing.


Introduction
As a primary information source for university teachers and students, the library holds a large collection of books covering a wide range of disciplines. A steady stream of new books is purchased every year, so the collection keeps growing. As a result, it is difficult for teachers and students to find what they want among many thousands of books. Moreover, the books a user borrows do not always reflect that user's own interests and hobbies. A borrower sometimes checks out books on behalf of other students, so the results of a recommender may not be what the teachers and students themselves really want. A different approach is therefore needed to help them choose. Consequently, accurately and efficiently optimizing the structure of the book stock is of great importance for the learning and research of teachers and students [1].
This paper applies a clustering algorithm to help librarians obtain classification data on the borrowing frequency and the preferred types of books among readers [2], and then to recommend appropriate resources to readers according to their professional backgrounds, interests, hobbies, and other information.
iJET, Vol. 12, No. 12, 2017

2 Optimization design for DBSCAN algorithm
DBSCAN (Density-Based Spatial Clustering of Applications with Noise), proposed by Martin Ester et al. in 1996, is a typical density-based clustering algorithm. Unlike partitional and hierarchical clustering, it defines a cluster as a maximal set of density-connected points and partitions regions of sufficiently high density into clusters. It can also find clusters of arbitrary shape in a noisy spatial database.
The basic time complexity of the DBSCAN algorithm is O(N * t), where t is the time needed to find the points in the Eps-neighborhood of a point. In the worst case the time complexity is O(N * N); in the better case, for example when a spatial index speeds up the neighborhood search, it is O(N * log N). The most prominent advantage of this algorithm is its density-based classification. Compared with other algorithms, it handles noise better and can process clusters of arbitrary size and shape; however, it performs poorly when the density of the clusters varies greatly. For multidimensional data, a major challenge is how to define density so that it properly reflects the inherent relationships in the data.
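To make the complexity discussion concrete, the following is a minimal sketch of standard DBSCAN in Python, not the implementation used in this paper. The names `region_query`, `dbscan`, and `min_pts` are illustrative assumptions. The neighborhood search is a brute-force scan, so each query costs O(N) and the whole run costs O(N * N); replacing `region_query` with a spatial index lookup is what brings the cost toward O(N * log N).

```python
from math import dist

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Brute-force Eps-neighborhood of points[i]; costs O(N) per call."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, NOISE otherwise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may still become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


# Two small dense groups and one isolated point (made-up coordinates).
sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(sample, eps=1.5, min_pts=3))  # [0, 0, 0, 1, 1, 1, -1]
```

The isolated point is labeled as noise because its Eps-neighborhood contains only itself.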

Application model of DBSCAN in the library
For example, to recommend a suitable book to a borrower, that book first has to be mined from the data. The application model for this purpose is introduced here. Other data mining analyses can be carried out using similar models.
1. Perform word segmentation on the user information to obtain a set of user vector labels; repeat this step for the books in the library (using information such as the publisher, author, and blurb) to obtain a set of book vector labels. These segmentations are mainly used to partition and classify the categories for more detailed analysis.
2. Cluster the set of book vector labels with the DBSCAN algorithm to obtain the individual book clusters and the cluster center of each cluster.
3. Extract the book-related vectors from all borrowing records of an individual user as the user set.
4. Use the book cluster centers as the initial cluster centers of the user set, and then cluster with the DBSCAN algorithm to obtain the user book clusters and the user cluster center of each cluster.
5. Use the user cluster centers as the cluster centers for the set of book vector labels to form the aggregation clusters; the book vectors in an aggregation cluster are used as the candidate objects of the (user) cluster center.
6. Carry out a correlation analysis on the candidate books obtained for recommendation, together with further information such as the user's professional background, interests, and hobbies.
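As a rough illustration of step 1 and of the kind of distance a density-based clustering of label sets could use, the sketch below turns text into a set of labels and compares label sets with the Jaccard distance. This is an assumption for illustration only: the paper does not specify the segmentation method or the distance measure, and the functions `labels_of` and `jaccard_distance`, the book identifiers, and the titles are all hypothetical.

```python
import re


def labels_of(text):
    """Simplified stand-in for word segmentation: lowercase word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; 0.0 means identical label sets."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


# Hypothetical book records (identifiers and titles are made up).
book_labels = {
    "B1": labels_of("Python programming for beginners"),
    "B2": labels_of("Computer network protocols"),
    "B3": labels_of("Advanced Python network programming guide"),
}

# Hypothetical user profile text, segmented the same way as the books.
user_labels = labels_of("python programming")

# Rank candidate books by closeness to the user's labels.
ranked = sorted(book_labels,
                key=lambda b: jaccard_distance(user_labels, book_labels[b]))
print(ranked)  # ['B1', 'B3', 'B2']
```

A real system would replace `labels_of` with a proper word segmenter for the catalogue language and would feed the pairwise distances into the clustering step rather than ranking directly.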

Optimization design
Because the Eps parameter has a strong impact on the efficiency and precision of the DBSCAN algorithm, this paper proposes an improved strategy: two Eps values, called Eps1 and Eps2, are used to run the algorithm, where Eps1 < Eps2. The idea is outlined in the algorithm flow shown in Fig. 1.

Evaluation criteria
As shown in Fig. 2-5, TP312 programming ranks first in borrowing frequency and has been rising since 2012. TP393 computer network also shows an upward trend, especially from 2015 to 2016, and even surpassed TP312 programming for a time. For H319 English language instruction, which leans toward the liberal arts, the borrowing frequency decreases year on year and stays at roughly 2,000 borrowings per year. I247 contemporary works [...]
[...] samples are analyzed to extract users' interests and hobbies, professional backgrounds, and other profile information, so as to recommend targeted books to them. This paper uses cluster analysis, alongside other analysis methods such as association analysis and classification analysis, for data mining. Data mining can clearly do far more than the book recommendation described in this paper, and there is still much room for further applications to be explored.

Fig. 1. Improved DBSCAN algorithm flow
1. Build a cluster for an unprocessed point object P using Eps1.
2. If P qualifies as a core point under Eps1, proceed to the other unprocessed point objects.
3. If P does not qualify as a core point under Eps1, apply Eps2 (Eps1 < Eps2) to P.
4. If P qualifies as a core point under Eps2, proceed to the other unprocessed point objects.
5. If P still does not qualify as a core point under Eps2, mark P as a boundary point.
6. Move on to the other point objects and repeat Steps 1-5.
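The flow in Fig. 1 can be sketched in Python as follows. This is one possible reading, not the authors' implementation. Two details are assumptions because the steps do not state them: a cluster is expanded with whichever radius made its seed point a core point, and a point marked as boundary can later be absorbed into a cluster that reaches it. The names `two_eps_dbscan`, `neighbors_within`, and `min_pts` are illustrative.

```python
from math import dist

BOUNDARY = -1  # step 5: point is a core point under neither Eps1 nor Eps2


def neighbors_within(points, i, eps):
    """Indices of all points within distance eps of points[i]."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def two_eps_dbscan(points, eps1, eps2, min_pts):
    """Two-radius DBSCAN variant: try eps1 first, fall back to eps2."""
    if not eps1 < eps2:
        raise ValueError("expected eps1 < eps2")
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        # Steps 1-4: pick the first radius under which point i is a core point.
        eps = next((e for e in (eps1, eps2)
                    if len(neighbors_within(points, i, e)) >= min_pts), None)
        if eps is None:
            labels[i] = BOUNDARY  # step 5
            continue
        cluster += 1
        labels[i] = cluster
        # Assumption: expand the cluster with the radius chosen for its seed.
        queue = neighbors_within(points, i, eps)
        while queue:
            j = queue.pop()
            if labels[j] == BOUNDARY:
                labels[j] = cluster  # assumption: absorb earlier boundary points
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors_within(points, j, eps)
            if len(reachable) >= min_pts:
                queue.extend(reachable)
    return labels


# A dense group, a sparse group, and an outlier (made-up coordinates).
dense = [(0, 0), (0, 1), (1, 0), (1, 1)]
sparse = [(20, 20), (20, 23), (23, 20), (23, 23)]
data = dense + sparse + [(100, 100)]
print(two_eps_dbscan(data, eps1=1.5, eps2=4.5, min_pts=3))
# [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

With a single radius of 1.5 the sparse group would not form a cluster at all; the fallback to Eps2 is what recovers it, which illustrates why two radii can help when cluster densities differ.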
This experiment uses a large set of desensitized data from 2012 to 2016 from a polytechnic university as the study sample, from which four categories of books are selected: TP312 programming, TP393 computer network, I247 contemporary works, and H319 English language instruction. In Fig. 2-5, the X-axis represents the year and the Y-axis represents the borrowing frequency. The dots in each figure show the borrowing frequency for each year between 2012 and 2016.