Advanced Profile Similarity to Enhance Semantic Web Services Matching

In this paper, we present a fine-grained method for matching services based on a hybrid similarity measure. We propose a novel encoding of service descriptions that supports the match between a request and an advertisement, makes the publishing and searching of Web services more efficient, and reduces the number of comparisons required. Using this similarity between profile concepts, we develop a precise method for matching the profiles of Web services and users. Searching in the UDDI registry is done via an algorithm that extracts the search concepts and retrieves the top-k services, thereby further reducing the search engine's response time. The approach is illustrated through experiments on both real and synthetic data to demonstrate its consistency and effectiveness.


I. INTRODUCTION
The recent evolution of the Internet, driven by SWS (Semantic Web Services) technology, has extended the role of the Web from a support for information interaction to a middleware for B2B (Business to Business) interactions. Service orientation is a promising paradigm for offering and consuming functionalities within and across organizations on the Web. Indeed, semantic Web services allow a homogeneous use of heterogeneous software components deployed in large networks, in particular the Internet.
The discovery of SWS precedes their use. It is the process of finding the most suitable set of Web services for a user request. It is essentially based on a syntactic search of the WSDL (Web Services Description Language) descriptions (Inputs, Outputs, Preconditions and Effects parameters) using UDDI (Universal Description, Discovery and Integration) registries.
However, with the exponential growth of available services, the diversity of users and the conditions under which they access Web services, finding the relevant SWS for particular users is becoming a challenging task. With Web 2.0 applications, particularly e-business and e-commerce applications, Web service discovery is becoming much more important in a Web context. Service computing tries to solve questions based on user profiles from a contextual, informational view of the Web, where users have several characteristics such as the client terminal, the client's preferences, the client's location, etc. All these parameters form a particular context of use called the profile.
In addition, the publication methods available in UDDI do not contain a formal model describing the profile of service users. Therefore, service discovery cannot be achieved efficiently without considering their profiles. Indeed, a user who queries a Web service discovery system expects services tailored to his/her context and profile. Currently, the discovery problem can be formalized as a decision problem on the selection of services from a set of alternative services that provide the same functionality but differ in profile parameters. How to describe the information of SWS and the matching process of SWS constitutes one of the main focuses of research in service computing. It is thus necessary to propose a model which provides relevant results adapted to the user profile. This model allows the representation of information that characterizes both the user and the service. The proposed search method is based on a sophisticated similarity measure, which estimates the degree of correspondence between the desired profile and the provided profile.
A user has many and diverse preferences according to the context, his/her domain of interest, etc. The question is how to take advantage of such user information to improve the discovery process and better satisfy the user request. Suppose a person is looking for a hotel in Italy: current search engines provide a list (which can be huge) of all the hotels, but the question remains which one to choose. To solve this problem, we leverage the profile information in a Web services discovery system during the search and selection stages, selecting the services that correspond as closely as possible to the user request. This selection is based on similarity measures, which estimate the matching degree between the desired profile and the provided profile parameters.
Our main contributions are summarized as follows: (i) a novel model for the specification of the profile, as well as a setting for the description of Web services, is proposed, allowing an efficient match between a service request and a service advertisement; (ii) a Web service discovery architecture including the profile dimension is described; (iii) a hybrid similarity measure for the management of profiles and the rank-ordering of the retrieved services is investigated. In addition, a top-k services building algorithm is provided.
In the remainder of this paper, we first provide, in Section II, a critical overview of existing work related to the SWS discovery issue. Building upon this work, an approach towards efficient SWS discovery and selection is discussed in Section III. Section IV provides an experimental study to show the relevance and suitability of our proposal, whereas the conclusion and future work are presented in Section V.

iJES - Volume 1, Issue 1, August 2013

II. RELATED WORK AND ANALYSIS
Users and software entities should be able to discover, invoke, compose, and monitor Web resources offering particular services and having particular properties. They should also be able to do so with a high degree of automation if needed. As the number of services on the Web constantly increases, the efficiency and scalability of service discovery techniques become a critical issue. Several approaches have been proposed for enhancing SWS discovery. Some of these approaches are syntax based [1], while others are semantic based [4] [6]. More recent approaches, reviewed below, are distinguished by the fact that they rely on QoS parameters [9], context information [10], etc.
The predominant problem is the restriction posed by keyword matching, which does not allow the retrieval of SWS with similar functionality. The works in [4,6] have focused mainly on providing means to describe the functionality of a SWS and on allowing a very expressive language for querying services. This development is increasingly significant since it seems able to overcome some insufficiencies of the syntactic approaches and to tackle some of the inadequacies of the UDDI registry. However, none of these papers discusses in depth how a publisher/requester should provide data related to her/his context. Each of the two entities, user and Web service, has its own context [10]. The service context can group the service localization (geographical restriction), the implementation cost, the quality of service (QoS) parameters, etc. The user context can be formed of her/his localization, her/his preferences, etc.
The study in [15] focuses on QoS in discovery systems. The service consumer searches the UDDI registry for a specific service through a discovery agent, which helps to find the best-quality service among the available services that satisfy the QoS constraints and preferences of requesters. Context-awareness as proposed in [8] performs the necessary changes in the service behavior and/or the data handled in order to adapt the service to the context of each user. Rong et al. [14] suggest, with an example, that context should be domain oriented or problem oriented in Web services discovery systems. They divide context into two categories, explicit and implicit, with personal-profile-oriented context, usage-history-oriented context, process-oriented context and other context. Chukmol et al. [7] propose that personal opinion on service functionality, quality or invocation cost should also be considered, through a collaborative tagging-based environment for Web services discovery. The study in [5] proposes a novel approach to enhance Web services discovery based on, among others, QoS, customers' preferences and past experiences. The work in [19] presents an alternative approach for supporting users in Web services discovery by implementing the implicit culture approach, which recommends Web services to developers based on the history of decisions made by other developers with similar needs.
Let us note that SWS properties include several parameters, both functional (IOPE, functionalities, etc.) and non-functional (QoS, the property that identifies the technical standards or protocols for implementing services, and categorization). However, the majority of the suggested approaches focus only on some parameters such as QoS, localization, etc. The information should have a more user-centric presentation in the discovery system. Multiple Web services with the same kind of functionality may be available in different contexts, and the best service among them should be selected. This can be done using profile parameters.
Moreover, few works have taken into account multiple qualitative and quantitative parameters to help users find the best service during the discovery process. It is also worth noting that existing research in SWS discovery uses either information retrieval techniques or semantic-based methods for locating Web services. One of the big problems of Web search is the definition of a correspondence function between the representation of the proposed service and the user request. In order to solve this problem, we propose a new approach to improve automatic SWS discovery. In particular, we provide a common descriptive form for both service and user profiles and introduce a novel similarity measure to match the user profile with the profile of the available service.

III. A PROFILE SIMILARITY-BASED SEMANTIC WEB SERVICES DISCOVERY APPROACH
The service discovery engine should provide rich and machine-processable abstractions that make it possible to describe service properties and profiles as well as the specifications of user needs. Furthermore, discovery requires a matching process that compares the advertisements with the requests to verify whether they describe matching profiles. This is essential to enable the development of reasoning mechanisms to handle the discovery process. In this section, we describe the service, the user and the profile formalism, and the proposed architecture. Finally, we show how a new profile-based similarity measure can be used to enhance the discovery process.

A. Formalisation
The discovery process requires a formalism that can be used to encode Web service profiles for advertisements and for requests. Before describing the concepts of our model, we first give some formal definitions.

DEFINITION 1. A semantic Web service is defined as a quintuple {N_s, D_s, P_s, O_s, PP_s}, where:
• N_s is the name of the service.
• D_s is the functional description of the service.
• P_s is a set of parameters describing the service.
• O_s is the set of operations of the service.
• PP_s is the set of concepts composing the profile of the provided service.
Each operation is associated with an input list I = {i_1 … i_n} and an output list O = {o_1 … o_n}. Services are also described by preconditions and effects. The "IO" (Input, Output) model for service operations is used in our model as a first step towards SWS discovery.

The components of a request (Definition 2) are:
• N_r is the name of the request.
• I_r is a set of input parameters of the request.
• O_r is a set of output parameters of the request.
• PR_s is the set of concepts composing the profile of the required service.
• The last component is the interest score defined by the user.

DEFINITION 3. Each parameter i in the list of inputs and each parameter o in the list of outputs is defined as a quadruple {N_T, V_T, T, CO}, where:
• N_T is the name of the attribute.
• V_T is the value of the attribute.
• T is the data type of the attribute value.
• CO is the ontology concept connected to the attribute to improve its quality.

The profile concepts are used to limit the set of returned services to those that match the profile values specific to the user. The information in the profile can be static (personal data...), evolutionary (preferences…) or temporary (localization...). These pieces of information must be captured to establish the correspondence between demands and offers of services, on both the syntactic and the semantic level. This improves the relevance of the answers during a discovery session. The proposed model contains several dimensions (Figure 1) that cover most of the information characterizing a profile. This general structure takes the form of a tree that includes a set of hierarchically organized concepts. The structure thus defined is flexible in the sense that different features can be spread through the tree structure of the proposed description. It allows modeling both the profile of the user soliciting the service and the profile of the semantic Web service offered.
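The three definitions above map naturally onto plain record types. The following is a minimal sketch only: the class and field names are hypothetical, and representing a profile as a mapping from concept names to user weights is an assumption taken from the form of the user profile Pu in the illustrative example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Parameter:
    """Definition 3: an input or output parameter {N_T, V_T, T, CO}."""
    name: str        # N_T, name of the attribute
    value: object    # V_T, value of the attribute
    data_type: str   # T, data type of the value
    concept: str     # CO, ontology concept attached to the attribute


@dataclass
class Operation:
    """An operation with its input list I and output list O."""
    inputs: List[Parameter]
    outputs: List[Parameter]


@dataclass
class SemanticWebService:
    """Definition 1: {N_s, D_s, P_s, O_s, PP_s}."""
    name: str
    description: str
    parameters: List[Parameter]
    operations: List[Operation]
    profile: List[str]            # PP_s, concepts of the provided profile


@dataclass
class Request:
    """Definition 2: name, inputs, outputs, required profile, interest score."""
    name: str
    inputs: List[Parameter]
    outputs: List[Parameter]
    profile: Dict[str, float]     # PR_s, assumed here to map concept -> weight
    interest_score: float


# Hypothetical request; the concept name "Hotel.Stars" is illustrative only.
request = Request(
    name="Hotel",
    inputs=[Parameter("NumberStars", 5, "int", "Hotel.Stars")],
    outputs=[],
    profile={"Food": 0.2, "Smoke": 0.2, "Parking": 0.6},
    interest_score=0.5,
)
```

Keeping the profile separate from the functional parameters mirrors the two-stage search described in the paper: inputs and outputs drive the simple search, and the profile is only consulted afterwards.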

B. An Illustrative Example
To illustrate our proposal, consider a tourist who likes to travel to Italy during vacations. (S)he tries to book a {five stars} hotel that has a {restaurant}, knowing that (s)he is a {smoker} and has a {car}. His/her query can be defined as follows: Qu = {Hotel, NumberStars, Region}.
1) Step 1. Simple search. Att = {Name Of Service = Hotel ∧ Inputs = (NumberStars, Region)} can be used for a simple search of services. For instance, the results returned by this method are the hotels given in Table I, which correspond to the desired number of stars and are the nearest to the user's current location.
2) Step 2. Profile-based advanced search. The results returned by the simple search do not really satisfy the user's needs (in particular, the needs related to parking and a smoking zone). To improve the results, one needs to use the elements of the service's profile (Hotel.Facilities) described in Table II, and the user's profile, which can be defined as follows: User Profile: Pu = {Food(0.2), Smoke(0.2), Parking(0.6)}.
Result. Both hotels are five-star hotels and close to the user. Although the price of the first hotel is better than that of the second, the second hotel is more suitable to the user's profile.

C. Weighted Characteristics
It is possible that some characteristics are more important than others. Our similarity function uses a different weight for each characteristic. The determination of the weights obviously plays an important role in the final result. Several methods have been proposed: objective methods [16] [17], for which the requestor is not solicited, and methods in which the values of the weights result directly from the requestor's opinions.
A simple method is used in this paper, which assumes that the weights are given by the user. In our example, a number between 0 and 1 is added to each attribute of the user profile to express the relative importance of this attribute with respect to the others. Thus the value 0.2 on the first two attributes means that these conditions are of the same importance. The last attribute expresses the fact that the user has a stronger preference for hotels with parking.
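As a small illustration of user-supplied weights, the sketch below turns raw importance ratings into weights between 0 and 1. The normalization step and the function name `normalize_weights` are assumptions made for illustration; the paper only states that the user provides the weights directly.

```python
def normalize_weights(ratings):
    """Scale non-negative importance ratings so that they sum to 1.

    ASSUMPTION: normalization is not described in the paper, which takes
    the weights as given by the user.
    """
    total = sum(ratings.values())
    if total <= 0:
        raise ValueError("at least one rating must be positive")
    return {concept: rating / total for concept, rating in ratings.items()}


# Hypothetical ratings: parking matters three times as much as food or smoking.
weights = normalize_weights({"Food": 1, "Smoke": 1, "Parking": 3})
```

With these ratings the result reproduces the profile of the example, Pu = {Food(0.2), Smoke(0.2), Parking(0.6)}.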

D. Towards a new discovery architecture
Our approach is supported by the architecture depicted in Figure 2. It is capable of integrating the profile into the process of semantic Web services discovery. It is composed of the basic elements of SOA (Service Oriented Architecture) [3]: the service provider, the service requester and the UDDI service registry, to which we add an "Interoperability Module" and a "Discovery and Selection Engine".
1) Interoperability module: In order to achieve interoperability, our architecture is endowed with a module which consists of the following:
a) Administrator of the profile: its tasks are initialization, modification, search and consultation of the profile.
c) A profile database: to record the profile information.
2) Discovery and Selection Engine: The result of the query can be sent directly to the customer through a simple search as in SOA, or via an advanced search through the following components:
a) Service/Request profile filtering module: The result of the request (5) is intercepted by this module, which generates XML files including the profile descriptions of the preselected services as well as those of the request, from files stored respectively in the service profile base and the user profile base (6), (7) (see Figure 2).
b) Similarity measure treatment module: It compares the different profile parameters of the Web services and the users (8) using the proposed similarity measure, and sends the best services to the customer (9).

E. A novel similarity measure
In this section, we first present some of the best-known semantic similarity measures used to compute the similarity between two concepts, and then we introduce our similarity measure.
1) Overview of similarity measures
Matching is a key operation in many applications such as ontology engineering, information integration (schema, catalogue, data, etc.), Web service composition, Web query answering and peer-to-peer information sharing [17]. The quality of matching results depends highly on the similarity measure used to estimate the resemblance between concepts of the matched entities. Several similarity measures have been proposed in various application fields, such as [12] (intrusion detection) and [13] (textual data).
a) Path-based measures
• Shortest path length metric: Rada et al. [20] defined the distance between concepts in an is-a semantic net as the length of the shortest path separating them. The authors gave an example of a formula that transforms distance into similarity.
• Ferrara, Montanelli, and Raccas metric: Castano et al. [21] propose a similarity measure based on weighted paths, in which a weight is associated with each path between the two concepts and with each relationship in that path.
b) Depth-relative measures
As path-based measures have many drawbacks, many authors have presented metrics that capture the semantic similarity between concepts more efficiently.
• Measure of Wu and Palmer [22]
The length of a path between two concepts in a hierarchy is an intuitive measure of similarity, and it is useful and easy to implement. The similarity of two concepts is defined by how closely they are related in the hierarchy:
\[ Sim_{WP}(C_1, C_2) = \frac{2 \cdot depth(C)}{depth_C(C_1) + depth_C(C_2)} \]
where C is the most specific common ancestor of C_1 and C_2, depth(C) is the length of the path between C and the root of the hierarchy, and depth_C(C_i) is the number of arcs between C_i and the root through C.
c) Measure based on the interpretation of concepts
D'Amato et al. [23] proposed to measure the semantic similarity between two concepts described in logic. Their measure relies on an interpretation function and on the cardinality of a set. This measure is interesting because it verifies semantic properties such as: the similarity between two equivalent concepts is equal to 1, and the similarity between two concepts is not null.
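The shortest-path distance of Rada et al. and the Wu and Palmer measure can both be computed from a child-to-parent map of an is-a hierarchy. The sketch below assumes a tree-shaped hierarchy with a single root at depth 0 and uses the standard Wu and Palmer ratio, twice the depth of the most specific common ancestor divided by the sum of the depths of the two concepts through that ancestor. The taxonomy and all names are hypothetical.

```python
def path_to_root(concept, parent):
    """Return the list of concepts from `concept` up to the root, inclusive."""
    path = [concept]
    while concept in parent:
        concept = parent[concept]
        path.append(concept)
    return path


def common_ancestor(path1, path2):
    """Most specific concept shared by two root paths of the same tree."""
    on_path1 = set(path1)
    return next(c for c in path2 if c in on_path1)


def shortest_path_length(c1, c2, parent):
    """Rada-style distance: number of is-a arcs separating c1 and c2."""
    path1, path2 = path_to_root(c1, parent), path_to_root(c2, parent)
    ancestor = common_ancestor(path1, path2)
    return path1.index(ancestor) + path2.index(ancestor)


def wu_palmer(c1, c2, parent):
    """Wu and Palmer similarity, with the root at depth 0."""
    path1, path2 = path_to_root(c1, parent), path_to_root(c2, parent)
    ancestor = common_ancestor(path1, path2)
    depth_ancestor = len(path_to_root(ancestor, parent)) - 1
    depth1 = path1.index(ancestor) + depth_ancestor
    depth2 = path2.index(ancestor) + depth_ancestor
    if depth1 + depth2 == 0:   # both concepts are the root itself
        return 1.0
    return 2 * depth_ancestor / (depth1 + depth2)


# Hypothetical is-a hierarchy, child -> parent.
taxonomy = {
    "Accommodation": "Thing",
    "Hotel": "Accommodation",
    "Hostel": "Accommodation",
    "Restaurant": "Thing",
}
```

With the root at depth 0, two concepts whose only common ancestor is the root get a similarity of 0; variants that count nodes instead of arcs avoid this and yield slightly different values.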

2) Our similarity measure
A basic similarity measure is useful to define the resemblance rate between objects (profiles) composed of a set of attributes. In general, there are two different approaches for expressing these attributes: a numeric one (such as price) and a categorical one (such as colour). In our work, we propose to improve the similarity measure QSim defined in [11] and inspired by Jaccard [2], which operates only on numerical properties. We combine syntactic similarity with semantic similarity for each qualitative and quantitative attribute of the request and service profiles to calculate the overall score, in order to recommend to the user the best service that matches her/his profile. To evaluate the similarity between two profiles, we give a set of definitions:
• Let P be a set of object profiles (users, documents, Web services, etc.).
• Let n be a set of services. Each service is composed of a set of concepts.
• Let x, y be profiles belonging to P for a given request and service respectively, and let (w_1, ..., w_n) be a set of weights associated with each concept of x and y, each weight being a number between 0 and 1 (Section III.C).
• a is the set of common characteristics of x (requested by the client) and y (suggested by the service);
• b is the set of characteristics existing in x and not existing in y (requested by the client but not suggested by the service);
• c is the set of characteristics existing in y and not existing in x (suggested by the service but not requested by the client).
The Jaccard similarity measure is defined as follows:
\[ Jaccard(x, y) = \frac{|a|}{|a| + |b| + |c|} \qquad (1) \]
The quantitative similarity measure QSim (2) is defined in [11] in terms of SimA, the atomic similarity between each characteristic of x and y. SimA takes its values in IR+ and may also be not defined (3).
Our similarity measure satisfies the properties of QSim defined in [11]. It is formalized as follows:
• Assume that Q_i = (A_1 ∧ ... ∧ A_n ∧ P_u), with P_u = (C_1, ..., C_m), is a conjunctive query over functional attributes (A_1 ∧ ... ∧ A_n) and user profile attributes (C_1, ..., C_m).
• Assume that D is a dataset of Web services with qualitative and quantitative concepts C = (C_1, ..., C_m), and that D(C_i) represents the domain of values of concept C_i in the dataset D. Each service is composed of a set of functional and profile concepts as described in Section III.A.
• Let S_i be the set of services returned after submitting the query using functional attributes, and Rs_i the relevant services that match the user profile for the given query.
• Let Px, Py be profiles belonging to P for a given request and Web service respectively, and let (w_1, w_2, ..., w_n) be a set of weights associated with each characteristic of Px.
• We define a threshold in order to present only the services whose rate of similarity with the user profile is higher than this threshold.
In our work we focused on the calculation of the similarity between the user profile and the profiles of the services returned by the simple search (see the example above). With these influential variables, we arrived at the definition of the similarity measure of profile characteristics, in which SimA is the atomic similarity between each characteristic of x and y. It is defined over a universe U; if x_i and y_i are qualitative values, SimA is calculated (in our example) using the Jaccard coefficient (see formula (1)).
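To make the idea of a hybrid measure concrete, here is a minimal sketch that combines a Jaccard coefficient for qualitative values with a numeric closeness for quantitative values, weighted by the user's weights. It is not the authors' exact formula, which is not reproduced in this text. The numeric closeness, the division by the total requested weight, and every name in the code are assumptions. The one property taken from the paper is that characteristics offered by the service but not requested by the user (set c) do not lower the score.

```python
def jaccard(x, y):
    """Jaccard coefficient between two collections of qualitative values."""
    x, y = set(x), set(y)
    union = x | y
    return len(x & y) / len(union) if union else 1.0


def atomic_similarity(requested, offered):
    """Similarity of one requested characteristic to the offered one."""
    if offered is None:
        return 0.0  # requested but not offered (set b)
    numeric = (int, float)
    if isinstance(requested, numeric) and isinstance(offered, numeric):
        # ASSUMPTION: relative closeness for quantitative values.
        largest = max(abs(requested), abs(offered))
        if largest == 0:
            return 1.0
        return max(0.0, 1.0 - abs(requested - offered) / largest)
    return jaccard(requested, offered)  # qualitative values


def profile_similarity(request, weights, service):
    """Weighted mean of atomic similarities over the requested characteristics.

    Characteristics that only the service has (set c) are ignored.
    """
    total_weight = sum(weights[c] for c in request)
    if total_weight == 0:
        return 0.0
    score = sum(
        weights[c] * atomic_similarity(value, service.get(c))
        for c, value in request.items()
    )
    return score / total_weight


# Hypothetical profiles, not taken from Table II.
request = {"Parking": {"private"}, "Food": {"restaurant"}, "Price": 100}
weights = {"Parking": 0.6, "Food": 0.2, "Price": 0.2}
service = {
    "Parking": {"private", "valet"},
    "Food": {"restaurant"},
    "Price": 80,
    "Sport": {"gym"},
}
score = profile_similarity(request, weights, service)
```

Here the atomic similarities are 0.5 for Parking, 1 for Food and 0.8 for Price, so the weighted score is about 0.66, and adding further unrequested facilities to the service leaves it unchanged.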

F. Algorithm
The proposed algorithm (Algorithm 1) tries to match a query with each advertisement in the repository. It takes as input a list of query concepts from the user's profile and a list of the Web service's profile concepts, and calculates the atomic similarity SimA for each pair of concepts (of service and request). If the match is not a Fail, it computes the overall score SimPro of the service and appends the advertisement to the result set. Finally, the result set is sorted by score and returned to the client.
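The matching loop and the final ranking can be sketched as follows. This is an illustration under stated assumptions, not Algorithm 1 itself: the scoring function is a simple weighted coverage standing in for the paper's similarity measure, a "Fail" is taken to mean a score that does not exceed the threshold, and the function and hotel names are hypothetical.

```python
import heapq


def weighted_coverage(weights, offered):
    """Stand-in score: total weight of the requested concepts that are offered."""
    return sum(w for concept, w in weights.items() if concept in offered)


def top_k_services(weights, services, k, threshold=0.0):
    """Score every advertisement, drop failures, return the k best matches."""
    result_set = []
    for name, offered in services.items():
        score = weighted_coverage(weights, offered)
        if score > threshold:          # ASSUMPTION: otherwise the match fails
            result_set.append((score, name))
    # nlargest returns the pairs sorted from highest to lowest score.
    return heapq.nlargest(k, result_set)


user_weights = {"Food": 0.2, "Smoke": 0.2, "Parking": 0.6}
# Hypothetical advertisements, not the hotels of Table II.
services = {
    "hotel_a": {"Food", "Smoke"},
    "hotel_b": {"Food", "Parking"},
    "hotel_c": {"Sport"},
}
ranking = top_k_services(user_weights, services, k=2, threshold=0.1)
```

Using `heapq.nlargest` avoids sorting the whole result set when k is small compared with the number of advertisements, which matters for repositories of the size used in the experiments.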

A. Experimental setup
For testing the proposed approach and models, a preliminary experiment has been set up. The data used in our test are sampled from the hotels base (CSV) downloaded from http://api.hotelsbase.org/. Hotelsbase is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License. Among the file contents is the facilities keys file, which is used to describe each hotel Web service. These facilities involve, in their profile part, a set of elements named category. We ran a set of experiments for different parameter values:
• We consider about 100,000 hotel Web services with all properties and facilities. Properties can be: starsMax, etc. The facilities-key table is segmented into twenty-nine categories: Room service, Food, Kids, Sport, Animals, Entertainment, Shopping, Parking, Health, Confort, NewsandTV, Touring, etc. These categories play the role of the service profiles.
• We retain a hotel in the result list if its matching score is greater than a fixed, user-defined threshold in [0,1].

Algorithm 1 (continued):
End For
End For
End For
Step 2: Assign each service an overall similarity value.
Step 3: Return the score ranking SC.

B. Experimental results
The objective of the tests is to show the interest of the user and service profiles in the discovery of the best services, and to compare the results obtained with the two similarity measures (SimPro and QSim). Figure 3, Figure 4 and Figure 5 illustrate the similarity degrees obtained between the three user requests and the various services, using SimPro and QSim.
According to the curves depicted in Figures 3, 4 and 5, the degree of similarity obtained with QSim is always weak (in [0, 0.4]) whenever the number c is large, even though the service satisfies the user's need well. This is due principally to the fact that QSim includes the variable c in the denominator (a + b + c). Mathematically, whenever c increases, the score decreases, whereas what interests us most in the calculation of the similarity is the set of common characteristics (a) of Px and Py and the set of those requested by the client (b). For instance, for User 1 with Service 1, QSim = 0.01; this low score is due to the value c = 96, whereas a = 2 and b = 1. For the same pair, SimPro = 0.59, and query 1 is satisfied with two attributes. Turning to the results obtained by our similarity measure SimPro, the degree of similarity is always high if the set of features (a) suggested by the service and requested by the client is large, no matter the number of attributes provided by the service and not requested by the client. For instance, services 1, 2, 3, 4 and 5 correspond to 100 % of the profile of user 2, because the characteristics desired by this user are available in the five hotels.
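The effect of c can be seen in isolation with the unweighted Jaccard form, using the counts reported for User 1 and Service 1 (a = 2, b = 1, c = 96). This is only a rough illustration: the ratios below are not the reported QSim and SimPro values (0.01 and 0.59), which also depend on the weights and on the atomic similarities.

```latex
\[
\frac{a}{a+b+c} = \frac{2}{2+1+96} = \frac{2}{99} \approx 0.02,
\qquad
\frac{a}{a+b} = \frac{2}{2+1} = \frac{2}{3} \approx 0.67
\]
```

Dropping c from the denominator raises the ratio by more than an order of magnitude here, which is the behaviour the comparison between the two measures relies on.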
The graphs show that QSim is unable to find some of the relevant services that are directly related to the query concepts. SimPro uses the common information between the user profile and the service profile to match services based on both quantitative and qualitative similarity, rather than on quantitative similarity alone, and thus gives better answers than QSim and the Jaccard measure.
In short, we think that the influence of a value (and thus of a feature) depends on the number of features for which it was mentioned by the user and published by the provider. The performance of a Web service may depend on the atomic similarity, weighted for each characteristic of the request, and on a quantity that is in fact the average frequency of all mentioned values.

V. CONCLUSION AND OUTLOOK
In this work, we proposed an approach based on profile similarity to enhance the search and recommendation of SWS. In our approach, we applied a new similarity measure to calculate the degree of similarity between the provided service profiles and the desired service profile, for better results. The main strength of the proposed approach is that it makes it possible to systematically measure the similarity between different attributes (either qualitative or quantitative) of the user profile and a set of attributes of the service profiles. As for future work, we first plan to conduct extensive and thorough experiments to study the effectiveness of the proposed formula for measuring similarity, to analyze the quality of SWS from a user's point of view, and to consider the use of an ontology for the proposed profile model. Another interesting line of research is to model the parameters of the user profile using gradual concepts (such as "cheap cost", "fast service", etc.), which can be represented by fuzzy sets [24], and to use the diversity criterion (returning a set of SWS that are as diverse as possible while preserving satisfaction) in order to improve the quality of the answers.

Figure 1. Profile model of the service and the user.

DEFINITION 2. A request is defined as a quintuple whose components are N_r, I_r, O_r, PR_s and the interest score defined by the user.

Figure 2. Proposed discovery architecture.

b) Inter-ontology similarity module: It ensures the semantic annotation of each parameter of the service and request profiles by means of an associated ontology O.
Some approaches use the notion of ontology to determine the search context and the user's interests [18] [19].

TABLE I. ATTRIBUTES OF THE FIRST AND THE SECOND HOTELS

TABLE II. PROFILES OF THE FIRST AND THE SECOND HOTELS RESPECTIVELY

Algorithm 1:
For Px_1 to Px_k do {For each profile j ∈ request}
For Py_1 to Py_k do {For each profile k