The Perception of Emojis for Analyzing App Feedback

When analyzing textual user feedback, the challenge today is that automation yields unsatisfactory results due to the limitations of natural language understanding. However, manual evaluation is not feasible for products with a large amount of user feedback, as this is neither efficient nor effective. Internet texts are full of emojis. Emojis tell how a user feels about certain aspects of a product. We conducted a survey on the perception of emojis and used the results from 107 participants. The core result is that people perceive emojis in a very homogenous way regarding their sentiments and the emotions they represent. This means that emojis give us a tool for analyzing the users' perspectives on a product by looking at the emojis contained in their feedback.


Introduction
Mobile apps have gained increasing importance in daily use. Because of short time to market, quality has often become subordinated to features. Agile software development processes and the shift towards short iterative lifecycles make it necessary to get quick and frequent feedback from customers about their perception of released apps. This feedback will lead to knowledge about the actual user needs, which is the prerequisite for the development of high-quality software despite time pressure. Quick and frequent changes based on feedback enable high user acceptance. It is not just relevant to be the first one on the market, but also to offer high quality. To achieve this quality, companies need to invest in quality assurance approaches in order to remain competitive.
To support companies in this challenge, we developed the approach Opti4Apps [1]. This approach considers user feedback in a continuous way to provide key suggestions to a manager of an app. It automatically captures textual user feedback, e.g., user reviews from app stores. In a subsequent step, the feedback is analyzed based on data mining methods in order to reveal insights about the users' perception. Overall, this approach makes it possible to rapidly process feedback, which is required for lean development to achieve high software quality according to the users' needs. However, our approach does not yet reveal the concrete underlying emotions of the users, which would provide a more detailed understanding.
The lack of emotions leads to the problem that a manager role can only identify hot topics of a certain app based on the processed textual user feedback. Using sentiment analysis for analyzing texts would mitigate this problem only slightly. However, this kind of analysis itself is highly challenging [2]. Moreover, there is the challenge of handling different languages. The idea of this contribution is to include emojis within the textual analysis of our approach since emojis are used to express emotion and clarify the text. This requires reliable information about the perception of emojis. Furthermore, it still needs to be explored how to integrate such emoji analysis into the text mining process. Overall, we intend to consider emojis to analyze user feedback automatically. Therefore, we are investigating the following research questions:
• Are emojis perceived homogenously among people? (RQ1)
• How are emojis linked to perceived emotions? (RQ2)
• How could this link be used for the textual analysis of app feedback provided by heterogeneous data sources? (RQ3)
This article is structured as follows: Section II explains the foundations and related work regarding emotions and emojis. Section III presents our text mining emoji model, followed by its evaluation in Section IV. Finally, Section V concludes this article and points out future research directions.
The Unicode Standard for emojis presents guidelines about how to design emojis [8]. The goal of the standard is for emojis offered by different keyboard providers to look alike. Still, the design of emojis differs so much among providers that users may attribute different meanings to them [9]. An example of this is the emoji 'Relieved Face', for which the designs by Apple and by emojidex differ noticeably [10].
An emoticon is a textual representation of an emotion or gesture. Punctuation marks or symbols are mostly used to form them [6]. Emoticons are, for instance, =) or ;). Some messaging programs automatically transform character sequences such as :-) into a pictograph such as ☺. This makes it hard to draw a clear line between emoticons and emojis. This could be one reason why the two terms are often used interchangeably, even in research (e.g., [9], [11]) or the Unicode standard [6]. Therefore, we do not draw a clear line between emoticons and emojis. We will use the word emoji for emojis as well as emoticons except for topics where a distinction is relevant.

Usage
Emojis are used for non-verbal communication. Emojis showing facial expressions can be used to express concrete emotions or at least sentiment. "Sentiment is the underlying feeling, attitude, evaluation or emotion associated with an opinion" [11]. Hogenboom et al. [13] distinguish the usage of emojis to express sentiment. According to them, emojis are used to simply express sentiment, to intensify the sentiment indicated by the text, or to clarify the sentiment of the text (e.g., when using irony or sarcasm). According to [14], emojis are also used for maintaining a conversational connection, permitting play, and creating shared and secret uniqueness.

Emoji categorization
Our research is driven by the question whether emojis could be used for analyzing online reviews. In such an analysis, the emojis would be used to analyze the reviewer's emotions. Therefore, emojis need to be linked to emotions. This means we must know whether various people attribute the same emotion to an emoji. Otherwise, the interpretation of the reviewer's emotions would not be reliable. There are studies about analyzing the sentiment or emotions of a text using the included emojis. Within these studies, emojis have been mapped to emotions or sentiment. However, in several studies the researchers conducted the mapping on their own (e.g., [13], [15]). This means that it is not clear whether users interpret emojis the same way as the researchers do. Another weakness of some studies is that the emojis were only categorized by sentiment. This means that the only classification categories were 'positive' and 'negative' (e.g., [16]) and sometimes 'neutral' in addition [17]. There are studies in which emojis were mapped to emotions; however, the emotions considered in the categorization scheme do not cover the full range of basic emotions. In the study by [15], only anger, disgust, joy, and sadness were used as categories, whereas categories such as 'surprise' are missing. The paper by Wang et al. [18] does not provide a description of where the categorization came from and who mapped the emojis to these categories. We observed that none of the approaches in existence today can cover more than a few emojis while still maintaining a clear connection to human perception.

Elicitation of the Text Mining Emoji Model and Study Design
In this paper, we present our approach for mapping emojis to emotions. As we have seen, no such mapping exists yet that considers the actual usage of emojis from the users' perspective. This mapping allows us to make well-grounded decisions about each emoji found in feedback texts. We present a categorization scheme that considers emotions that are relevant for review analysis. We asked more than one hundred users to perform the classification. In the following, we will explain our approach for defining an emotion model that is suitable for identifying emotions through emojis in order to apply text mining to textual feedback. The process we followed for creating our model and the study can be seen in Figure 1. We will give an overview of emotion models that exist in the field of psychology and how we made use of them to create an appropriate model of emotion that can be used by people who are not experts in psychology to categorize the emotions connected with certain emojis (3.1). This needs to be a simple model with categories that are easy to understand. Furthermore, we will give an overview of how we selected the emojis we used for our survey (3.2). We grouped the collected emojis and selected a representative for each group (3.3). Based on the representatives, we created our survey and executed it (3.4).

Building an emotion model for emojis
There are several approaches for classifying and ordering emotions. We looked at the emotion models presented by Ekman [19] and others. Regarding the models that are available in psychology, we noticed that they vary in the way they classify emotions and in their level of detail. When we looked at the wide range of approaches available for emotion classification, it turned out that none of the models perfectly fits our needs for categorizing emojis in text analysis. Since emojis are simplified representations of faces, persons, or objects, fine-grained emotion models do not appear to be applicable. Our goal was to use an emotion model that is easy to understand for users from various domains and does not require any psychological background. Therefore, we decided not to use any of these models.
We also investigated other sources that use a classification of emotions. We looked at Facebook reactions, for example [23]. Facebook offers the following options to react to a Facebook post: Like, Love, Haha, Wow, Sad, and Angry. Another source of inspiration is solutions for text analysis. One of these products is the IBM Watson Natural Language Understanding Service [24]. This service is able to perform sentiment analysis; for some languages, it can also classify texts by emotions [26]. It uses the following dimensions of emotion: sadness, joy, fear, disgust, and anger. Even though Watson can perform sentiment and emotional analyses, it does not provide a link between them.
For these reasons, we created our own classification for emotions, which is inspired by the emotion models found in research as well as in practice and considers the particularities of emojis. In the following, we will explain our model (see Fig. 2). We started with the categories used in sentiment analysis [25]. Here, natural language is classified as positive or negative. To get a link between sentiment analysis and emotional analysis, we assigned a negative, neutral, or positive sentiment to each emotion. Therefore, sentiments act as major categories and the emotions as subcategories. We selected emotions that were easy for the survey participants to understand and to distinguish, and that are relevant for online product reviews. The complete model is depicted in Fig. 2. Our model allows combining traditional sentiment analysis results with more detailed emotional results. Within the sentiment Positive, we distinguish between the emotions Happy, Excited, and Funny. Excited is similar to Happy but expresses a higher degree of intensity. Emojis are also used to express that something is meant to be funny, or to describe funny things. Those should fit into the Funny emotion. The Neutral sentiment contains elements that truly express a neutral emotion (Really neutral) and emojis whose sentiment is Context dependent. For example, surprise can be positive or negative, depending on the specific situation. We refined the Negative sentiment into the emotions Angry, Scared, Sad, and Bored.

Fig. 2. Emotion Model for Emojis
Following this model, we are able to map our model to the basic emotions mentioned above. The only exception is disgust, as we believe this emotion might not be found in our context of product reviews to a significant extent. The goal of this investigation is to build a model that can serve to identify the attitude of a user for or against a product, or a feature or behavior of a product.
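The two-level structure of the model can be written down as a small lookup table. The following Python sketch is only an illustration of that structure; the names EMOTION_MODEL and sentiment_of are our own assumptions and are not part of Opti4Apps.

```python
# Illustrative encoding of the emotion model: sentiments are the major
# categories, emotions the subcategories. The identifiers are hypothetical.
EMOTION_MODEL = {
    "Positive": ["Happy", "Excited", "Funny"],
    "Neutral": ["Really neutral", "Context dependent"],
    "Negative": ["Angry", "Scared", "Sad", "Bored"],
}


def sentiment_of(emotion: str) -> str:
    """Return the sentiment under which the given emotion is filed."""
    for sentiment, emotions in EMOTION_MODEL.items():
        if emotion in emotions:
            return sentiment
    raise KeyError(f"unknown emotion: {emotion!r}")


if __name__ == "__main__":
    # Every emotion carries exactly one sentiment, which links an emotion
    # analysis back to a conventional sentiment analysis.
    for emotion in ("Excited", "Context dependent", "Bored"):
        print(emotion, "->", sentiment_of(emotion))
```

Because each emotion belongs to exactly one sentiment, an emotion-level result can always be rolled up to a sentiment-level result, but not the other way round.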

Collecting relevant emojis for user feedback analysis with emotions
After our emotion model was completed, we began collecting emojis and emoticons that may be relevant for analyzing emotions included in online reviews. In this section, we distinguish between emojis and emoticons and explain their collection separately.
We considered only emojis that can be used widely. In addition to considering the Unicode standard for maintaining emojis as well as emoticons, this required checking the operating system support for the emojis derived from the standard. At the time we were conceptualizing this survey, the most recent versions of iOS, Android, Windows, and macOS had already adopted most parts of emoji Unicode 4.0 [26]. Therefore, we started by looking at version 4.0 of this standard. In version 4.0, this standard contains 2384 emojis, including variations for skin color or gender. Since this number is too large to be categorized in a survey, we narrowed down the number of emojis considered. In the first step, we excluded all emojis that do not serve the purpose of expressing emotions. Thus, we excluded emojis representing objects or activities. These emojis just replace words, but generally do not express emotions. The same is true for the several hundred flags that are part of the standard. However, there are some objects that may be used to express emotions, which were therefore not excluded. This selection was performed and evaluated by three researchers. Many of the emojis that indicate human emotions are available in different variants. Usually, a gender-"neutral" person version is available, as well as a female and male representation. In addition to that, emojis are also available in different skin colors. The 'person' version of these emojis sometimes has a male and sometimes a female representation. Table 1 shows some variants of the raising hand emoji. From the emoticons, we selected those that are commonly used by Western cultures, namely those based on ASCII characters, since our goal is to analyze texts written by customers from these cultures. This implies that we excluded kaomoji, a Japanese emoticon style, and other Asian emoticons.
In addition to the standards, we validated the availability of emojis on operating systems by checking emojipedia, which provides information about the availability and visualization of emojis across different platforms and versions [28]. All in all, we collected 612 emojis together with a name and a short description.

Group concept for emojis
Considering our list of emojis, it became obvious that many emojis had the same or a similar meaning. This and their large quantity prompted us to think about a possible reduction of the list in order to allow it to be categorized by one person in a session of about ten minutes. We started arranging similar emojis together by looking for variations or alternative representations. For instance, the emoticons ':)' and ':-)' both represent happy faces and are therefore equal. We grouped emojis by their equality. For each group, we selected one emoji as the representative of the group. Due to this equality, we can transfer the corresponding emotion to all other emojis within the group. Table 1 shows an excerpt of the group "Person raising hand" with different designs from various providers. For emojis in different skin colors, the yellow-skin emoji and, in the case of a person emoji, the 'person' version was chosen as the representative, as the choice of color and gender is not important for identifying the associated emotion. In other cases, the representative should be a well-known emoji in order to get a high recognition rate and the most precise categorization possible. For example, the textual group 'Smiley or happy faces' has the representative ':-)'. The choice of the representatives as well as the grouping was made by one person and validated afterwards by two other researchers. The goal was to always use the neutral coloring as well as the neutral gender as the representative. If the emojis within the group did not have such a candidate, we tried to pick the one we considered the most prominent one. This process led to 99 groups for 612 emojis.
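The transfer of a classification from a representative to the other members of its group can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions: the class EmojiGroup, the function propagate, the group members listed, and the label given to ':-)' are hypothetical and only serve to show the mechanism.

```python
from dataclasses import dataclass


@dataclass
class EmojiGroup:
    name: str
    representative: str
    members: list  # all emojis of the group, including the representative


def propagate(groups, representative_labels):
    """Copy the label of each representative to every member of its group.

    Groups whose representative has no label (no sufficient agreement in
    the survey) are skipped, so their members stay unclassified.
    """
    labels = {}
    for group in groups:
        label = representative_labels.get(group.representative)
        if label is None:
            continue
        for member in group.members:
            labels[member] = label
    return labels


# Hypothetical excerpt of the grouping; the member lists are illustrative.
groups = [
    EmojiGroup("Smiley or happy faces", ":-)", [":-)", ":)"]),
    EmojiGroup("Person raising hand", "\U0001F64B",
               ["\U0001F64B", "\U0001F64B\U0001F3FB", "\U0001F64B\U0001F3FD"]),
]

# Assumed survey outcome for one representative (not a reported result).
representative_labels = {":-)": ("Positive", "Happy")}

labels = propagate(groups, representative_labels)
```

In this example, ':)' inherits the label of ':-)', while the raising hand variants remain unclassified because their representative has no label in the assumed input.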

Design and execution
After we finished the collection of the emojis, we applied our emotion model to the collected emojis in such a way that the emojis got a sentiment and an emotion. To get a categorization that is as precise as possible and commonly accepted, we designed a survey where the participants were asked to categorize a given list of emojis according to our emotion model. We used two different types of questionnaires, one including just the emojis and another one also including the name of the emojis and the descriptions. We gave the questionnaires to people of both genders, different ages, and various levels of familiarity with the usage of emojis. We invited people to participate whom we knew in person, people we met after public talks, and people we contacted via mailing lists on the Internet.
In a pre-study, 11 people categorized a total of 99 emoji representatives, which we presented together with their names and descriptions. When evaluating the results, we realized that some emojis were consistently categorized in terms of sentiment and emotion by all the participants. We expected that these emojis would be categorized in the same way by other participants. We therefore decided to remove these 18 emojis (see Figure 3) for further participants to shorten the survey to an acceptable length. We used 96 responses from the shorter survey for our analysis.

Survey Results and Analysis
In the following, we will describe the results of our survey, report how the analysis was performed, and discuss the results. In addition, we will explain the identified threats to the validity of our study.

Results
We received a total of 111 responses. We removed four questionnaires due to insufficient data quality. In one questionnaire, only the beginning was filled out; one questionnaire was not filled out with emotions at all. Furthermore, two people answered the questionnaire identically and even used the same email address for their answers. Besides that, the two questionnaires did not contain the necessary statistical data describing the participant. It is likely that someone mistakenly sent us the wrong file.
We thus analyzed a total of 107 responses. Of these 107 participants, 53 were female and 54 male. Their ages ranged from 17 to 75 years. The mean age was 31.41. The distribution in terms of digital generations [29] can be found in Figure 6. Most of our participants belong to the generation of Millennials (n=69), which is also one of the major groups of app users. The second largest group was Generation Z (n=16), followed by Generation X (n=11). With ten people from the Baby Boomer generation and one participant from the Builder generation, we also had people who had not been in contact with computers, mobile devices, and emojis when they were young. Most of our participants had a Western European background. Some of the participants were not originally from Western Europe, but were at least living there at the time we executed the survey. 101 participants indicated medium familiarity (n=26) or familiarity (n=75) with emojis. There were just six participants who classified themselves as being unfamiliar with emojis.
We have made an aggregated version of our results (including the results per digital generation) available for download from http://opti4apps.iese.de/emoji/studyoverview.xlsx. The results contain the numbers for all participants as well as the results classified according to gender, digital generation, and familiarity with emojis.
Among the 107 responses analyzed, the number of votes for the sentiment of an emoji varied between 68 and 106, with a mean of 97 and a median of 100. Considering the distribution of emoji categorizations into sentiments, the participants agreed with each other with a mean of 71 votes (67% of all participants). The median value was 73. In our analysis, we selected the sentiment for an emoji if at least half of the responses gave the same result. We were able to categorize all representatives except seven; the other 92 had an agreement of at least 50%. 66 of them had an agreement of 70% or more. An overview of how the participants agreed on each emoji according to sentiment can be found in Fig. 5. Extending this result from the emoji representatives to all emojis being considered, we were able to classify 591 out of 612 emojis. For many emojis, the classification indicates a distinct tendency towards one of the sentiments, as in the case of the unamused face, where 102 of 107 participants chose negative as sentiment, two chose neutral, and three did not categorize the emoji. However, there are some emojis we could not categorize clearly, e.g., "pile of poo": 42 participants rated it with a negative sentiment, 31 chose positive, 28 chose neutral, and 6 did not rate it. Another finding was that most emojis that we were able to categorize according to sentiment also had a clear meaning in terms of emotion. As shown in Figure 6, which depicts only those emojis that we were able to categorize according to sentiment, 78 emojis had an agreement on the emotion level of at least 50%. This means that fewer emojis could be classified according to emotion compared to their classification according to sentiment.
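The selection rule can be illustrated with the vote counts reported in this section. The following Python sketch is not the original analysis script: the function name classify is our own, and we assume that the agreement share is computed over all 107 participants, including those who did not vote.

```python
N_PARTICIPANTS = 107  # responses analyzed in the survey
THRESHOLD = 0.5       # "at least half of the responses"


def classify(votes, n_participants=N_PARTICIPANTS, threshold=THRESHOLD):
    """Return (label, share) for the category with the most votes.

    The label is None if the share of the leading category stays below
    the threshold. Assumption: the share is relative to all participants.
    """
    label, count = max(votes.items(), key=lambda item: item[1])
    share = count / n_participants
    return (label if share >= threshold else None), share


# Vote counts as reported in the text.
unamused_sentiment = {"Negative": 102, "Neutral": 2, "Positive": 0}
pile_of_poo_sentiment = {"Negative": 42, "Positive": 31, "Neutral": 28}
unamused_emotion = {"Bored": 38, "Angry": 27, "Sad": 27, "Scared": 3}

if __name__ == "__main__":
    for name, votes in [
        ("unamused face (sentiment)", unamused_sentiment),
        ("pile of poo (sentiment)", pile_of_poo_sentiment),
        ("unamused face (emotion)", unamused_emotion),
    ]:
        label, share = classify(votes)
        print(f"{name}: {label} ({share:.0%})")
```

Under these assumptions, the unamused face is classified as Negative on the sentiment level, whereas "pile of poo" remains without a sentiment and the unamused face remains without an emotion, since no single category reaches half of the participants.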
However, by considering not the representatives but rather the underlying emojis of the group, we were able to classify 512 out of 612 emojis. Nevertheless, for some emojis that have a clear sentiment, the concrete emotion was not clear. This happened in cases where a lot of participants rated the sentiment but not the emotion, or where the distribution of emotions was not very distinct. For example, consider the unamused face classified as a negative sentiment. The ratings of the emotions are: 38 for bored, 27 for angry, 27 for sad, 3 for scared, and 7 did not choose an emotion. Indeed, the majority of the participants agreed that this emoji has a negative meaning, but the opinions about the exact emotion conveyed by this emoji differ. Figure 6 shows the agreement of the participants in terms of emotions. We also performed an analysis regarding how the different groups (gender, social generation, and familiarity) rated the emojis. To visualize and compare the ratings of emojis between different groups, we used spider charts. The length of the radii corresponds to the percentage of representatives classified with the sentiment or emotion. Figure 7 shows the distribution of sentiment and emotions in terms of gender. The chart shows that both genders voted mostly in a similar way. In terms of emotion, the female participants were able to classify more emojis than the males. In terms of sentiment, two emojis were classified differently by the male and female participants: the males classified one as Positive and the other as Negative, whereas the females classified both as Neutral. Looking at the emotion dimension, we observed different classifications in nine emojis, all of which we could classify with less than 70% in terms of emotion. Except for one emoji, which was classified as Happy by males and Funny by females, all the different classifications are in the sentiment Neutral between the emotions Really neutral and Context dependent.
We also observed that the female participants had a clearer perception in terms of emotions than the male participants regarding these nine emojis. On average, the classification by females was 7% more distinct than that by males. We also analyzed the results based on familiarity. The corresponding chart can be seen in Figure 8. In general, we did not observe any huge differences between those participants who considered themselves familiar with emojis and those who assessed their familiarity as medium. For people who reported being unfamiliar with emojis, we observed that their results deviated more from the other two groups, especially in terms of emotion, resulting in a larger share of emojis that could not be classified in this group. If we consider the number of emojis that could be classified in terms of emotion, it is apparent that people classify emojis more distinctively the more they are familiar with them. The group of familiar participants was able to classify 75 emojis, those with medium familiarity managed to classify 62, and those unfamiliar with emojis could classify only 53 emojis.
When analyzing the different social generations, Generation X, Millennials, and Generation Z are mostly similar, as illustrated in Figure 9. Among the participants from the Baby Boomer generation, we more often see a larger deviation from the average classification, but this group included a large number of people who were unfamiliar with emojis (4 out of 10). It can also be seen that the chart for the Baby Boomers is a smaller version of those of the other groups, with lower values on all emotion axes. The number of emojis that could be classified with respect to emotion is nearly equal for all generation groups except the Baby Boomers. Generation Z could classify 81 emojis, Millennials and Generation X 82, while the Baby Boomers were able to classify only 58 of the 99 emojis. This can also be attributed to the high number of unfamiliar participants in this group. We did not analyze the Builders generation as we only had one participant from this group.

Discussion
The data obtained from our survey shows that most emojis that were part of our study could be mapped very well to a sentiment. Even though we required an agreement of at least 50%, there were only seven emoji representatives (equaling 21 emojis) that could not be categorized in terms of sentiment. Raising the required agreement to a minimum of 70% still allowed us to categorize 67% of the emoji representatives or 79% of all emojis. The categorization in terms of sentiment seems to be very clear to most participants. Considering the overall result in terms of sentiment, it turned out that most of the contradictory answers were between neutral and negative, or between neutral and positive. Larger contradictions between positive and negative occurred very rarely, only in the case of two emojis. This highlights the clear perception of the participants. However, there are also emojis that were categorized ambiguously in terms of sentiment. The number of participants giving a vote differed among the emojis. There are emojis for which many participants indicated that they were unable to categorize them or for which the participants categorized the sentiment but not the emotion. A large number of participants being unable to categorize an emoji is an indicator that the emoji is not a good representative of its group or that this group cannot be classified in terms of emotion or sentiment at all. As described in Section 3.3, the groups contain similar emojis but only one representative was selected for classification purposes.
In terms of emotion, the results are not that distinct. Nevertheless, we were able to classify 78 emoji representatives in terms of emotion. These 78 representatives represent a total of 512 emojis according to our grouping. This means that 83% of all emojis could be classified. If we leave out those that could not be classified by sentiment either, this value increases to 87%.
The results of our survey in terms of sentiment and in terms of emotion show that people have a homogenous perception of emojis. This clearly indicates that emojis can be analyzed well in terms of sentiment and emotion. We also analyzed whether the perception is dependent on age, gender, or familiarity. It turns out that males and females perceive emojis in a pretty similar way. We observed that older people are less familiar with emojis. The generations X, Millennials, and Z share their view on emojis to a large extent, but participants who are older and less familiar with emojis perceived some of the emojis differently. This might also be related to the fact that some of them just guessed them, as some of these participants even classified all of the emojis presented. We conclude that many of the emojis provide a clear visualization, allowing sentiment and emotion to be understood intuitively. This allows the conclusion that the emojis that were classified correctly by the Baby Boomers are more intuitive to understand.
The results enable us to capture and analyze emotions through emojis found within texts. This makes it possible to quickly analyze larger texts in terms of emotions. Furthermore, this can basically be done for any textual data source as the use of emojis is widespread. This makes them more universal compared to deriving a sentiment value, e.g., based on star ratings found in app stores. In addition, we can combine feedback items from different origins in one analysis to get a complete picture. In addition, the finding that the perception of emojis is not related to gender is very valuable. There is no need to find out whether a text was written by a man or by a woman to be able to analyze the emojis contained in it.
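How such a mapping could be applied to feedback texts is sketched below in Python. This is a simplified illustration, not the Opti4Apps implementation: the dictionary EMOJI_LABELS, the function emoji_profile, and the sample feedback texts are assumptions. Only the Negative sentiment of the unamused face (without a clear emotion) is taken from the survey results; the labels for ':-)' and ':)' are hypothetical.

```python
from collections import Counter

# Emoji -> (sentiment, emotion). An emotion of None means that only the
# sentiment is used because no emotion reached sufficient agreement.
EMOJI_LABELS = {
    "\U0001F612": ("Negative", None),  # unamused face (from the survey)
    ":-)": ("Positive", "Happy"),      # hypothetical label
    ":)": ("Positive", "Happy"),       # hypothetical label
}


def emoji_profile(text, labels=EMOJI_LABELS):
    """Count sentiments and emotions of all known emojis in a text."""
    sentiments, emotions = Counter(), Counter()
    remaining = text
    # Match longer tokens first so that a short emoticon cannot consume
    # part of a longer one.
    for token in sorted(labels, key=len, reverse=True):
        occurrences = remaining.count(token)
        if occurrences == 0:
            continue
        sentiment, emotion = labels[token]
        sentiments[sentiment] += occurrences
        if emotion is not None:
            emotions[emotion] += occurrences
        remaining = remaining.replace(token, " ")
    return sentiments, emotions


# Feedback items from different (hypothetical) sources are analyzed in
# the same way and can simply be summed up.
feedback = [
    "Love the new design :-)",
    "Sync keeps failing \U0001F612\U0001F612",
]
total_sentiments, total_emotions = Counter(), Counter()
for item in feedback:
    s, e = emoji_profile(item)
    total_sentiments += s
    total_emotions += e
```

Since the lookup depends only on the emojis and not on the surrounding words, the same procedure could be applied to app store reviews, social media posts, or in-app feedback regardless of the language of the text.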
Regarding those emojis where there was less than 50% agreement among the participants, there might be several reasons for that. In the case of two of these emojis, people use them in very different ways. Sometimes they are meant to be funny, but sometimes they are also meant to indicate something bad. Especially for one of them, the participants were unsure about the sentiment. Each of the sentiments got about one third of the votes. Understanding this emoji requires further investigation. In the case of the emoji meaning "collision", the largest group (49% of the votes) voted for a negative sentiment, but a number of participants believed that this emoji might be neutral and especially context-dependent, meaning that depending on the context in which it is used, it can sometimes also be understood as a positive emoji. Currently we do not have an explanation for this result except that the visual representation might not be so close to an actual explosion, meaning people might also use it for different purposes. We got a similar result for *_* and :$. *_* has a 48% share of positive sentiment and a very close share of neutral votes. :$ received 44% neutral sentiment and 38% negative sentiment. Because the emojis that we have been unable to categorize to date have similar results for two sentiments, we assume that no shared view currently exists for them. Therefore, they should not be used to analyze the content of textual feedback.
In the case of those emojis where we were unable to find at least 50% agreement on the emotion level, it turned out that often the emotions "sad" and "bored", or "happy" and "excited", got quite similar shares of votes, but neither of them achieved a majority. This indicates that those emojis might be used in multiple ways. An emotional value should not be derived for them, as two different emotions have high shares of votes that are close to each other. Therefore, we can only analyze them in terms of their sentiment value, as this seemed to be clear to the participants.

Threats to validity
In the following, we will list potential issues threatening the validity of our survey results.
First, the study might not represent a fully global view on how emojis are perceived, as all participants had lived in Western Europe for some time at least, which might have influenced their view of emojis. We are not aware of any similar study involving the cultural background of people when it comes to perceiving emojis. Nevertheless, recent news articles share the viewpoint that emojis are becoming a global lingua franca, e.g., [30], [31], and [32]. This might also be related to their perception in terms of emotions.
As we grouped equal emojis together, we only evaluated the representatives of the groups within the survey. The grouping was validated by three researchers, but not within a survey. Especially for emojis that are not just a skin-tone or gender variant of the representative, the distribution of the perceived emotions might be slightly different compared to the representative.
Moreover, we solely used the emoji design of Windows 10. People's perception might be influenced by the emoji design, as some designs differ so much that users might attribute different meanings to them [9].

Conclusion and Future Work
Emojis are an important part of Internet-based texts and quite a trending topic also in public media. Our initial assumption was that each emoji has a dedicated meaning or purpose. Therefore, we investigated whether emojis are perceived homogenously by people (RQ1). We created an emotional feedback model and performed a survey to find out which sentiment and underlying emotions are related to which emoji. The result we found is that the perception of many emojis was homogenous among the participants. For most emojis, people had a clear perception in terms of sentiment as well as in terms of emotion (RQ2). This allows us to start evaluating the emotions underlying texts (RQ3). In the future, this can be done in many ways. As there is usually only a single star rating for an entire review, using emojis to classify the review in terms of emotion and sentiment could lead to a more precise automated analysis of the user's attitude concerning a product. This could also be used for areas where no rating scale is present, like in social media feedback or for in-app feedback functionalities. To make better use of heterogeneous feedback sources, it will be important to investigate the relationship between emojis and the star ratings given in app stores. Furthermore, it will be interesting to correlate regular text-based sentiment analysis and the use of emojis.
As we were not able to classify all emojis in our study in terms of sentiment and emotion, it will be worth investigating which emojis are used to what extent in app reviews to determine the size of the actual gap between emojis that can be characterized and those that cannot be characterized yet.