A New Algorithm to Detect and Evaluate Learning Communities in Social Networks: Facebook Groups

This article presents a new method for evaluating learners by communities in Facebook groups, based on their interactions. The objective of our study is to set up a community learning structure according to the learners' levels. In this context, we propose a new algorithm to detect and evaluate learning communities. Our algorithm consists of two phases. The first phase evaluates learners by measuring their degree of 'Safely'. The second phase detects communities. These two phases are repeated until the best community structure is found. Finally, we test the performance of our proposed approach on five Facebook groups. Our algorithm gives good results compared to other community detection algorithms.
Keywords: Community detection, evaluation, centrality, social network, safely, learning communities.


Introduction
Social networks are spaces for discussing and sharing information; they have become channels of knowledge, communication, and interaction between their users. Today, social networks are used by many people across several disciplines, and they play a particularly important role in education. Typically, they bring learners from different places together in real time and facilitate the interaction between them. This interaction creates a positive and active environment for online learning, which makes it possible to follow students' activity and to evaluate them in real time. Students use a variety of social networks, such as Facebook, Pinterest, Twitter, Instagram, and Snapchat. According to the Diplomeo survey on the digital practices of students in France, Facebook is the social network most used by students: 93% of 17- to 27-year-olds are registered on Facebook and 82% of students are registered on Snapchat, whereas only 64% have an account on Instagram and 53% on Twitter.
Social networks offer several measures that we can use to follow users' actions:
• The number of mentions
• Hashtags
Learners' traces in social networks can be studied and analyzed in order to evaluate them. In a simulation study that we performed on two types of interaction, 'comment' and 'like', we found that interaction via comments is important: it makes it possible to see exactly what learners think and what their difficulties are. On the other hand, interaction via likes shows that some learners only received the information and could not interact with their colleagues [1].
Social Network Analysis (SNA) is used to model and describe the relationships between users. In this context, social network modeling is based on graph theory. Each network can be represented and viewed as a graph that contains nodes (users) and links (the relationships between users).
These relationships can be structural (used by, colleague of), factual (communicate with, interact with), or declarative (like, comment, subscriber) [2]. A graph contains several kinds of knowledge that we can use to assess learners in a social network.
The notions most commonly used in Social Network Analysis (SNA) are centrality and the link density of the network [3]. Centrality (eigenvector centrality, PageRank, betweenness centrality, etc.) is the more subtle notion; it seeks to highlight the most important users in the network. Link density represents the number of internal links between users in a community. The creation of virtual communities is an important property of social networks. McMillan and Chavis proposed a community model consisting of four main elements: membership, influence, needs fulfillment, and emotional connection. These elements can be directly applied to create online communities in an educational context [4].
Community detection sheds an interesting light on the network structure. A good community structure gives a microscopic view of complex systems. The notion of community differs from one context to another. For example, web page communities include pages that deal with the same subject [5]. Similarly, linguistic communities contain people who communicate with the same linguistic tools [6]. Learning communities bring together learners who share the same level of learning and common interests in social networks [7]. Bielaczyc and Collins defined the learning community as follows: "a community is a social unit where a learning culture manifests itself in which all are involved in a collective effort of understanding" [8].
The main problem of community detection is to form groups in such a way that users within each group are strongly connected. Today, there are several algorithms to detect communities in social networks. These algorithms seek to optimize a quality function called modularity (Q), which measures the density of links inside communities [9].
This article discusses two important aspects of research: the evaluation of online learners and community detection. The main idea of our work is to propose a method that facilitates the online assessment of learners. Instead of evaluating learners one by one, we propose to evaluate them by communities. We therefore provide a pedagogical community representation of the learners in a network. In this sense, we present our algorithm, which detects and evaluates communities in social networks. Most community detection solutions focus on the position of the initial nodes of communities. In this approach, we define a new measurement called "Safely centrality", which identifies safe learners in the network. Communities are then formed around safe learners by looking for their neighbors. The main contributions of this research paper are:
• We propose a new method to identify and evaluate learning communities in social networks, with which teachers can easily assess their learners
• We define a new measurement of centrality called "Safely centrality" to evaluate learning communities in social networks
• We evaluate the performance of our method on five Facebook groups using the modularity. The experimental results demonstrate the effectiveness of our algorithm in identifying at-risk learners in social networks
The rest of the paper is organized as follows. In the second section, we present some existing approaches related to our work. Section 3 describes in detail the different steps of our algorithm. The results of our experiments are presented in section 4. Finally, the conclusion and future directions of our research are presented in section 5.

Related Works
In this section, we present some existing works related to our proposed method. First, we cite some studies that demonstrate the educational potential of using social networks. Then, we present some existing approaches to detect communities in social networks.

The educational uses of social networks
Nowadays, social networks are an important part of our daily life. They offer a simple and convenient way to learn online. Jeon et al. demonstrated that college students can use Facebook as a helpful venue for information seeking. In this study, the authors use a Facebook app called "College Connect" that helps students to identify useful resources by visualizing their personal social ties with friends who have the same interests [10]. In addition, a study was conducted on a Facebook group devoted to chemistry; in the students' view, this group is a practical way to improve their skills and to motivate themselves to learn online [11].
In this sense, Seidel presented a descriptive study of the evolution of a Facebook group named "Breast Imaging Radio Logist" for radiologists interested in breast imaging. The purpose of this study is to analyze the affiliations of this Facebook group. In this context, radiologists find it useful to use Facebook groups as a forum to exchange information [12]. In Malaysia, because of the language problems facing students, a study examines how learners make up for their inadequate linguistic repertoire and improve their online discussions by using communication strategies in Facebook groups [13]. In addition, Inderawati et al. proposed an innovative approach to evaluate 48 students of Sriwijaya University in a Facebook group for English writing courses. This method is based on the quality of the students' comments. The authors check the reliability of comments using two kinds of rating-scale rubrics containing scoring systems. They propose a system for assessing learners that consists of four scores: D (bad), C (average), B (good), and A (very good) [14]. Another study, conducted at Mzuzu University in Malawi, integrated Twitter and blogs into two undergraduate courses at the library information department. This study showed that students used these technologies correctly to share the course materials and to communicate actively and instantly among themselves and with their teachers [15]. Anggraeny examined the students' point of view on the use of Instagram in teaching and learning processes. The importance of this study is that it helps teachers to communicate with their learners and to better understand their barriers [16].

Community detection in social networks
Community detection is a widely studied problem in the field of Social Network Analysis. Several methods have been proposed to detect communities in social networks. Blondel et al. proposed a fast and easy method, called the "Louvain method", for detecting communities in large networks based on the optimization of the modularity [17]. Modularity (Q) is a quality function introduced by Newman et al. It makes it possible to evaluate the quality of the community structure obtained: it takes a value between -1 and 1 and measures the density of edges within communities compared with the edges connecting communities to each other [18].
In addition, Raghavan et al. designed a simple method based on node labels to detect communities in a graph. Initially, each node is given a unique label. At each iteration, each node takes the label shared by the majority of its neighbors. If there is no single majority label, one of the majority labels is chosen randomly. In this way, the dominant labels propagate through the graph. The algorithm stops when each node has the majority label of its neighbors. Communities are defined as sets of nodes with identical labels [19].
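The label propagation steps described above can be sketched in a few lines of Python. This is a minimal illustration and not the implementation of [19]: the function name, the `seed` and `max_sweeps` parameters, and the example graph are assumptions, and a node keeps its current label when that label is already one of the majority labels, a common convention that guarantees the loop can stop.

```python
import random
from collections import Counter


def label_propagation(adjacency, seed=0, max_sweeps=100):
    """Asynchronous label propagation on an undirected graph.

    adjacency maps each node to the set of its neighbours.
    Returns a set of frozensets, one per detected community.
    """
    rng = random.Random(seed)
    labels = {node: node for node in adjacency}  # unique initial labels
    nodes = list(adjacency)

    for _ in range(max_sweeps):
        changed = False
        rng.shuffle(nodes)  # visit nodes in a random order
        for node in nodes:
            counts = Counter(labels[nb] for nb in adjacency[node])
            if not counts:
                continue  # an isolated node keeps its own label
            top = max(counts.values())
            best = sorted(lab for lab, c in counts.items() if c == top)
            # Assumption: keep the current label if it is already a majority label.
            if labels[node] not in best:
                labels[node] = rng.choice(best)  # ties are broken at random
                changed = True
        if not changed:
            break  # every node carries a majority label of its neighbours

    groups = {}
    for node, lab in labels.items():
        groups.setdefault(lab, set()).add(node)
    return {frozenset(members) for members in groups.values()}


# Hypothetical example: two triangles with no link between them.
two_triangles = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1},
    3: {4, 5}, 4: {3, 5}, 5: {3, 4},
}
communities = label_propagation(two_triangles)
```

On this small graph each triangle ends up with a single label, so two communities are returned whatever the random tie-breaking.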
In this context, some community detection algorithms use centrality measures. For example, Ahajjami et al. proposed a new scalable approach for community detection in social networks based on leadership. This study is divided into two steps: the first step selects the network leaders using the eigenvector centrality measure, and the second step detects the communities using the similarity of nodes [20]. In another approach, the authors of [21] suggested a new community representation of a network. They defined two measures of centrality, "leading degree" and "following degree", to measure the representation of a node and its relations in a graph. A community is made up of a leader and its followers. In a graph, leading nodes have a higher degree of leading, whereas the other nodes have a low degree of leading and a higher degree of following in relation to leading nodes.

Proposed Algorithm

General approach
A learning community is made up of learners, young people or adults, who interact with each other in order to develop their personal and collective knowledge [22]. Our approach is used to detect and evaluate learning communities in social networks. Figure 1 summarizes the different steps of our approach. In this article, we propose a new algorithm called Evaluation and Detection Community Algorithm (EDCA). EDCA is built in two phases:
• Learners' evaluation, to detect safe learners in the network
• Building communities by detecting neighbors

Notations and definitions
Let G(V, E, WV, WE) be an undirected, weighted graph, where:
• V = {Ui} is the set of nodes (learners)
• E = {Aij} is the set of arcs that represent the interactions between learners
• WE = {mij} is the set of arc weights, which indicate the total number of interactions between two learners
• WV = {Di} is the set of node weights; Di is the degree of node i, that is to say, the number of its incoming and outgoing interactions
• Ωi = {Ui, Status} is the set of communities detected and evaluated in the social network, in which 'Status' can be safe or at-risk
• Safe: safe learners are the active learners in the network
• At-risk: at-risk learners are students who have problems interacting with their colleagues
• Safe community: contains the most active users in the network
• At-risk community: contains at-risk learners who have difficulty interacting with each other
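As an illustration of this notation, the following Python sketch builds the link weights WE and the node weights WV from a list of interactions. The function name, the interaction log, and the choice to ignore a learner's interactions with their own posts are assumptions made for the example; Di is read here as the total number of interactions in which learner i takes part.

```python
from collections import defaultdict


def build_graph(interactions):
    """Build link weights W_E and node weights W_V from (learner, learner) pairs."""
    link_weights = defaultdict(int)  # m_ij, keyed by an unordered pair of learners
    for a, b in interactions:
        if a == b:
            continue  # assumption: self-interactions are ignored
        link_weights[frozenset((a, b))] += 1

    node_weights = defaultdict(int)  # D_i, sum of the weights of the links of learner i
    for pair, weight in link_weights.items():
        for learner in pair:
            node_weights[learner] += weight
    return dict(link_weights), dict(node_weights)


# Hypothetical log: each pair is one comment or like between two learners.
log = [("U1", "U2"), ("U2", "U1"), ("U1", "U3"), ("U2", "U3"), ("U1", "U2")]
W_E, W_V = build_graph(log)
```

Because the graph is undirected, an interaction from U1 to U2 and one from U2 to U1 add to the same weight.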

Safely centrality
Safe learners are the principal learners of the network. They can easily interact and exchange information with each other. Detecting safe learners in a social network is the main challenge faced by researchers. Social Learning Analytics (SLA) provides several measures; the most important centrality measures are betweenness centrality, closeness centrality, and degree centrality, which we used in our previous work [23]. These measures make it possible to measure the representation of a learner in a network. In our approach, we define a new measure of centrality called "Safely centrality", given by equation (1). This measure detects safe learners in a social network.

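[Equation (1), Safely centrality: a summation (∑) involving N and the distances d(i, j); the full expression is not recoverable from the source text.]
Here N is the number of learners in the network and d(i, j) is the distance between learners i and j.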
To judge whether a community is safe or at-risk, its degree of safely is compared with a threshold (S) that varies from one community to another.
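The thresholding step can be illustrated with a distance-based score. The sketch below is not equation (1): as a stand-in it uses classical closeness centrality, (N - 1) divided by the sum of the distances d(i, j), which relies on the same quantities N and d(i, j). The function names, the example graph, and the threshold value are assumptions.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Hop distances d(source, j) to every node reachable from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adjacency[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def closeness_score(adjacency, node):
    """Stand-in score: (N - 1) / sum of d(node, j). Not the paper's equation (1)."""
    n = len(adjacency)
    total = sum(bfs_distances(adjacency, node).values())
    return (n - 1) / total if total > 0 else 0.0


def classify(adjacency, threshold):
    """Mark a learner 'safe' if the score exceeds the threshold S, else 'at-risk'."""
    return {
        node: "safe" if closeness_score(adjacency, node) > threshold else "at-risk"
        for node in adjacency
    }


# Hypothetical star-shaped group: learner "a" interacts with everyone else.
star = {"a": {"b", "c", "d"}, "b": {"a"}, "c": {"a"}, "d": {"a"}}
status = classify(star, threshold=0.8)
```

In this example the central learner scores 3/3 = 1.0 and each peripheral learner scores 3/5 = 0.6, so only the central learner is marked safe. Unreachable nodes are simply left out of the sum, which is a simplification.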

Community detection
In a network, it is easy to observe the interaction between nodes, but it is difficult to see its community structure. Therefore, we are introducing a new community representation to better reflect the community learning structure of a network.
A community is usually formed from an initial node. In our approach, we choose as the initial nodes of communities the nodes that have a higher degree of 'Safely centrality' than the other nodes. These nodes are called safe nodes. We then look for their neighbors according to the following equation:

where the equation involves three nodes of the network: one safe node and two at-risk nodes. The detection of the neighbors of the initial nodes is done by equation (3) according to these properties:
Property 1: Two nodes are part of the same community if the weight of the link between them is higher than the weights of the other links in the network.
Property 2: If an at-risk node does not have a relationship with safe nodes, then we place this node in a separate community.
Property 3: If a safe node does not have a relationship with at-risk nodes, then we place this node in a separate community.
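One possible reading of the three properties is sketched below in Python. It is an interpretation rather than the authors' equation (3): Property 1 is read as "an at-risk node joins the safe neighbor to which it has the heaviest link", and the function name and example data are hypothetical. The sketch assumes a graph without self-loops.

```python
def assign_communities(link_weights, status):
    """Group at-risk nodes around safe nodes, following Properties 1 to 3.

    link_weights maps frozenset({u, v}) to the interaction count m_uv.
    status maps each node to "safe" or "at-risk".
    Returns a dict mapping each node to the node that seeds its community.
    """
    community = {}
    for node, state in status.items():
        if state == "safe":
            # Each safe node seeds a community; it stays alone if nobody joins (Property 3).
            community[node] = node
    for node, state in status.items():
        if state != "at-risk":
            continue
        # Collect the links that connect this at-risk node to a safe node.
        candidates = []
        for pair, weight in link_weights.items():
            if node in pair:
                (other,) = pair - {node}
                if status[other] == "safe":
                    candidates.append((weight, other))
        if candidates:
            # Assumed reading of Property 1: follow the heaviest link.
            community[node] = max(candidates)[1]
        else:
            # Property 2: no safe neighbour, so the node forms a separate community.
            community[node] = node
    return community


# Hypothetical example with two safe learners and three at-risk learners.
status = {"s1": "safe", "s2": "safe", "r1": "at-risk", "r2": "at-risk", "r3": "at-risk"}
weights = {
    frozenset(("s1", "r1")): 5,
    frozenset(("s2", "r1")): 2,
    frozenset(("r2", "r3")): 4,
}
result = assign_communities(weights, status)
```

Here r1 joins s1 because its link to s1 is heavier than its link to s2, while r2 and r3, which have no safe neighbor, each form a separate community.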

Renew the graph
At each iteration, EDCA uses a new graph whose vertices (V') are the communities discovered during the previous iteration. The weight of the link between two of these new vertices is the sum of the weights of the links that existed between the nodes of the two corresponding communities. The links that existed between vertices of the same community become loops on this community in the new graph. In the new graph G'(V', E', W'E, W'V), N' is the number of nodes, with N' < N; {A'ij} is the set of links between the new nodes of the network; and W'E is the set of link weights between these nodes.
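The aggregation described above can be sketched as follows. The data layout (link weights keyed by unordered pairs, loops stored under a one-element key) and the function name are assumptions made for the illustration.

```python
from collections import defaultdict


def renew_graph(link_weights, community):
    """Collapse each community into one vertex and sum the link weights.

    link_weights maps frozenset({u, v}) to a weight.
    community maps each node to its community identifier.
    A key with a single element, frozenset({c}), is a loop on community c.
    """
    new_weights = defaultdict(int)
    for pair, weight in link_weights.items():
        ends = frozenset(community[node] for node in pair)
        new_weights[ends] += weight
    return dict(new_weights)


# Hypothetical example: nodes a and b form community A, nodes c and d form community B.
old_weights = {
    frozenset(("a", "b")): 3,
    frozenset(("c", "d")): 2,
    frozenset(("a", "c")): 1,
    frozenset(("b", "d")): 4,
}
membership = {"a": "A", "b": "A", "c": "B", "d": "B"}
new_weights = renew_graph(old_weights, membership)
```

The two links between the communities (weights 1 and 4) merge into a single link of weight 5, and each internal link becomes a loop on its community.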

Evaluation and detection community algorithm (EDCA)
As shown in Table 1, the proposed algorithm EDCA is divided into two phases.
Phase 1: Evaluate community. The initial partition places each node in a separate community, so it is composed of N communities. Then, for each community, we calculate the "safely centrality" measure. If this measure is higher than the threshold (S), the community is considered safe; otherwise, it is at-risk.
Phase 2: Community detection. For each safe node, we detect its neighbors by equation (3) to create communities, and we calculate the modularity of this partition. If the modularity differs from its previous value, we renew the graph by equation (4) and apply the first and second phases of the algorithm again on the new graph.
In each iteration of EDCA, we calculate the modularity. If the modularity is the same in two successive iterations, the algorithm stops and returns the community structure with the highest modularity.
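The stopping rule can be illustrated with a small driver loop. In this sketch, `one_iteration` is a hypothetical callable that stands for one pass of the two phases and returns the partition found, its modularity, and the renewed graph; the `max_iterations` cap and the scripted values are also assumptions.

```python
import math


def run_until_stable(graph, one_iteration, max_iterations=50):
    """Repeat the two phases until the modularity is unchanged in two successive iterations.

    one_iteration(graph) must return (partition, modularity, next_graph).
    Returns the partition with the highest modularity seen, and that modularity.
    """
    best_partition, best_q = None, float("-inf")
    previous_q = None
    for _ in range(max_iterations):
        partition, q, graph = one_iteration(graph)
        if q > best_q:
            best_partition, best_q = partition, q
        if previous_q is not None and math.isclose(q, previous_q):
            break  # fixed modularity in two successive iterations
        previous_q = q
    return best_partition, best_q


# Hypothetical scripted run: the modularity rises, dips, rises again, then stays fixed.
scripted = iter([("P1", 0.20), ("P2", 0.35), ("P3", 0.30),
                 ("P4", 0.41), ("P5", 0.41), ("P6", 0.90)])


def fake_iteration(graph):
    partition, q = next(scripted)
    return partition, q, graph


best, best_q = run_until_stable(None, fake_iteration)
```

The loop stops at the fifth iteration, where the modularity repeats, and returns the first partition that reached the highest value.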

Experimental Study
In this section, we present the experimental results obtained in our study. To measure the performance of our algorithm EDCA, we compare it with three community detection algorithms: Edge betweenness centrality [24], Label propagation [19], and Leading eigenvector [25]. Before discussing our results, we describe the dataset to which we apply these algorithms and the quality metric used in this research.

Datasets description
Our experimental study was designed to fit the real environment of online social networks. Our eventual aim is to generate our own dataset of learners' interactions in Facebook groups. Since the purpose of this article is to evaluate the performance of our algorithm, we use a dataset that contains users' interactions in Facebook groups, and we consider each user to be a learner. The data were collected from Cheltenham's Facebook groups; discussions within these groups consist of exchanges about the major issues of users. Five open groups were selected, which are described in the following:

Performance metrics
Quality indicators answer the question: What is the right community structure for a network? They are generally based on the local properties of communities. One such quality function, called modularity, was introduced by Newman et al. [9]. This function makes it possible to evaluate the quality of the detected community structure. Modularity measures the density of links inside communities.

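$$Q = \frac{1}{2m} \sum_{i,j} \left[ m_{ij} - \frac{k_i k_j}{2m} \right] \delta(\Omega_i, \Omega_j)$$
where $k_i = \sum_j m_{ij}$ is the sum of the weights of the links attached to vertex i, $\Omega_i$ is the community to which vertex i is assigned, $\delta(u, v)$ is the Kronecker delta, which is equal to 1 if u = v and 0 otherwise, and $m = \frac{1}{2} \sum_{i,j} m_{ij}$.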
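For illustration, the modularity of a given partition can be computed as in the following Python sketch. It assumes the standard weighted form of Newman's modularity, grouped by community, on an undirected graph without self-loops; the function name and the example graph are hypothetical.

```python
def modularity(link_weights, community):
    """Weighted modularity Q of a partition (undirected graph, no self-loops).

    link_weights maps frozenset({u, v}) to the weight m_uv.
    community maps each node to its community identifier.
    """
    two_m = 2 * sum(link_weights.values())
    if two_m == 0:
        return 0.0
    strength = {}  # k_i: sum of the weights of the links attached to node i
    internal = {}  # per community: sum of m_ij over ordered pairs inside it
    for pair, w in link_weights.items():
        u, v = tuple(pair)
        strength[u] = strength.get(u, 0) + w
        strength[v] = strength.get(v, 0) + w
        if community[u] == community[v]:
            internal[community[u]] = internal.get(community[u], 0) + 2 * w
    total = {}  # per community: sum of k_i over its nodes
    for node, k in strength.items():
        total[community[node]] = total.get(community[node], 0) + k
    return sum(internal.get(c, 0) / two_m - (total[c] / two_m) ** 2 for c in total)


# Hypothetical example: two triangles with no link between them.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]
weights = {frozenset(e): 1 for e in edges}
split = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
merged = {node: "A" for node in range(6)}
q_split = modularity(weights, split)
q_merged = modularity(weights, merged)
```

Placing each triangle in its own community gives Q = 0.5, whereas putting all six nodes in a single community gives Q = 0, which matches the intuition that modularity rewards dense groups that are weakly linked to each other.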

Discussion and evaluation
We implemented our algorithm EDCA with the R language. "igraph" and "cluster" are two libraries that we used to interact with the network.
Community detection and analysis is an important methodology for understanding the organization of various networks. In general, community detection algorithms are based on a characteristic or a piece of information such as labels, leadership, or shortest paths. In our algorithm, we used the "safely centrality" measure to detect the safe nodes in the network, which are the initial nodes of communities. Figure 3 shows the community structure detected by EDCA for the five Facebook groups. Red clusters represent at-risk communities, and green clusters represent safe communities. The results show the performance of our algorithm in detecting and evaluating learning communities. More concretely, the community structure obtained by EDCA allowed us to easily identify the most active users and the least interactive ones in a group, especially learners who face barriers to learning.
In Figure 4, the x-axis represents the number of iterations of our algorithm, and the y-axis shows the value of the modularity. The modularity varies with the number of iterations. Fig. 4.a and Fig. 4.b illustrate the improvement of the modularity. In the other cases, the modularity first rises slightly, then drops, and then increases again until it reaches its maximum, after which it keeps a fixed value (see Fig. 4.c, Fig. 4.d, and Fig. 4.e).
As mentioned above, we also compared the performance of our algorithm with other community detection algorithms. The objective of this comparison is to assess the internal connectivity of communities using the modularity measure. Figure 5 illustrates the modularity obtained for each algorithm. We note that the modularity values obtained by EDCA and Leading eigenvector are close, which implies that both algorithms give a similar partition. We also see that the modularity of EDCA is higher than that of the other algorithms, Edge betweenness and Label propagation. In all five Facebook groups, EDCA produces the highest modularity value compared to the other algorithms. These results suggest that our proposed method is more flexible than other methods.

Conclusion
Nowadays, learners and teachers use social networks as a learning environment to facilitate the interaction between them. This study shows that the use of social networks as an informal learning activity allows learners to learn together without constraints of time and place. In this article, we have proposed a new algorithm for detecting and evaluating learning communities. Our approach begins with the identification of the safe nodes, which is based on the "safely centrality" measurement. These nodes are the initial nodes of communities. For each safe node, we look for its neighbors to build communities. The experimental results illustrate the performance of our proposed algorithm and provide evidence that the community structure obtained by EDCA is more flexible compared to other algorithms. These results provide an opportunity to use this algorithm in other areas, such as e-commerce, e-mailing, and science citations, so that we can analyze and evaluate groups of people. As a perspective, we aim to collect our own dataset from Facebook groups to implement our algorithm; we also aim to optimize our algorithm to minimize the execution time.