The Effect of Gaze and Product Salience on Digital Visual Engagement: An Experimental Research

In digital communication, non-verbal cues play an important part in building perception and image, which may in turn lead to certain behaviors. Advertising can incorporate both verbal and non-verbal communication in a variety of ways, such as a model's facial expressions and visual cues. This study examines the effect of an advertising model's gaze and product salience on digital engagement in social media. Two hundred participants were recruited for a 2x2 experimental design. Findings showed that direct gaze of the model in the ad and high product salience increased digital visual engagement. Furthermore, digital visual engagement positively influenced attitude and purchase intention. This study offers empirical evidence that there is an interaction between gaze and product salience on digital visual engagement. Further studies might replicate this experiment with other products, visual content, online platforms, and factors based on visual social semiotics theory. This study offers managers knowledge on making visual content that will produce more engagement, especially in social media ads, which can influence the attitude and purchase intention of their audience.


INTRODUCTION
Social media has a huge impact in the modern world, as it has changed business, economic, political, and cultural models throughout the world (Chiou, Knewtson, & Nofsinger, 2019). Social media is defined as any website or application that allows users to engage in social networking activities such as creating, sharing, or interacting (Nisar, Prabhakar, & Strakova, 2019). Facebook, YouTube, WhatsApp, WeChat, Instagram, Pinterest, Snapchat, LinkedIn, and Twitter are some of the most popular social media platforms in the world.
Social media is used not only for interaction between users but also, as Hamid et al. (2016) state, as a source of various kinds of information used every day. In addition to personal use, social media is also used for campaigns in US presidential elections (Lee & Xenos, 2019), news channels (Kumar et al., 2018), and education (Stathopoulou, Siamagka, & Christodoulides, 2019). Social media is now the most important digital media channel at 79%, compared with search engines at 73% (Solomon, 2018).
Social media content is divided into three components: audio, visual, and textual (Poria et al., 2016). All three components are found on Instagram, especially the visual component. Instagram is currently among the most popular photo-sharing social media platforms (Sheldon & Newman, 2019). Its number of active users reached 1 billion in June 2018, up from 800 million in September 2017 (Statista, 2019). Aside from being the most popular photo-sharing platform, Instagram is also the social media platform most commonly used by marketers to engage users through visual content (Erkan, 2015).
In practice, companies use visual content that combines models and products as a promotional tool and as a way to present themselves as market experts in these products (Shareef et al., 2019). Visual content is used to build relationships with the audience because it is easier to respond to, easier to understand, and more effective (Pressgrove, Janoske, & Haught, 2018). Visual content also increases engagement on social media by 46% (Manic, 2015). Other studies have shown that visual content is more dominant than textual content on a website (López-Sánchez, Arrieta, & Corchado, 2019).
In the context of Instagram, previous research on visual content has found an interaction effect between visual complexity and endorsement (Kusumasondjaja & Tjiptono, 2019). Other studies have shown an interaction between the textual and visual components: combining a textual component in the form of positive emoticons with the visual component increases engagement (Jaakonmäki, Müller, & Vom Brocke, 2017).
Both research examples combine visual content with external factors, that is, the model in the image and the textual component in the form of emoticons, but research on the elements in the image is still fairly minimal.
Research on elements in pictures by Valentini et al. (2018) examined the interaction effect between gaze and product salience in Instagram content. Gaze refers to the gaze direction of the model, which is divided into two main types: direct gaze and indirect gaze (Senju & Hasegawa, 2005). Direct gaze is a condition where the subject in the picture looks directly at the audience, while indirect gaze is a condition where the subject in the picture does not look directly at the audience. Product salience is the proportion of an image occupied by the product: high product salience usually means the product is in the foreground and takes up a greater proportion, and low product salience is the opposite (Jia & Han, 2013).
The interaction between gaze and product salience enhances digital visual engagement on Instagram in the form of likes, shares, comments, and follows (Valentini et al., 2018). In this case, digital visual engagement is similar to social media engagement because the digital engagement occurs in social media. In the world of advertising, social media engagement has a strong relationship with purchase intention (Kilger & Romer, 2007) and a positive influence on attitude (Barger, Peltier, & Schultz, 2016). Attitude is a positive or negative response to an object (Lee et al., 2015). Purchase intention is the intention of future consumer purchases (Martins et al., 2019). This research uses gaze, product salience, digital visual engagement, attitude, and purchase intention as its key terms in the following discussion.

Gaze and Product Salience
The concepts of gaze and product salience are taken from visual social semiotic theory, which studies the relationship between images and the audience through the elements of an image (Kress & van Leeuwen, 1996). Harrison (2003) illustrated this with the example of the US Supreme Court, explaining that visual social semiotics studies how the elements in a picture can influence the delivery of meaning to the audience.
To work as a whole system in creating communication between viewers and images, visual social semiotics is divided into three metafunctions: representational, interactive, and compositional (Valentini et al., 2018). The representational metafunction concerns the narrative and concepts of the picture and provides information about what is represented in it. For example, a picture in which a baby is holding hands with his mother has a very strong narrative about togetherness, love, and similar themes.
The interactive metafunction concerns the relationship between the audience and elements of the picture through contact between the actor and the audience, the distance, and the point of view. In the previous example, the interactive metafunction can be illustrated by the direct gaze of the baby or mother with a happy look that invites the audience to share in a happy atmosphere. Gaze is a type of contact in the interactive metafunction. Direct contact made through direct gaze can produce social interaction (Valentini et al., 2018). This is supported by Pfeiffer et al. (2013), who reported that direct gaze can create social interaction, a finding that has also been tested using visual content. Arapakis et al. (2014) found that gaze had a positive relationship with user engagement. Senju and Hasegawa (2005) found that direct gaze sends a signal and captures the attention of the audience, making it hard for them to disengage. Direct gaze also increases the time viewers take to disengage from the image. Valentini et al. (2018) previously divided the interactive metafunction into two types: demand contact and offer contact. Demand contact occurs when the model in the picture looks directly at the audience, whereas offer contact occurs when the model is not looking directly at the audience. In offer contact, the model allows the audience to evaluate the image.
The compositional metafunction integrates the representational and interactive metafunctions through elements of information value, salience, framing, and modality. Returning to the example used above, the framing presented in the picture is seen from the perspective of the father of the baby, where the baby and mother are eating ice cream from a brand. This is where the representational and interactive metafunctions are connected and find their true meaning. One element of the compositional metafunction is salience, which also relates to the layout of the image and the composition of the visual content. Salience can be interpreted as 'top of mind' or 'brought to mind' (Romaniuk et al., 2004). Jia and Han (2013) said that products in the foreground have a higher level of salience (high product salience), and products in the background have a lower level of salience (low product salience).

Digital Visual Engagement
Digital engagement can be defined as individual behavior towards content, an organization, or a brand in an online environment (Eigenraam et al., 2018). Khan (2017) classified digital engagement by level of activity into participation (active) and consumption (passive). Participation covers behaviors such as liking, sharing, commenting, and uploading, while consumption covers only viewing and reading. Kim and Yang (2017) classified digital engagement into three levels of involvement: consuming, contributing, and creating. Consuming is the lowest level and involves behaviors such as viewing, following, and reading. At the second level, contributing is the interaction between users and content, which includes participating in content through likes and comments. Creating is the highest level, which involves producing and publishing content.
Previous research has shown that well-known brands such as Rolex, Nike, and Coca-Cola are successful in using digital content to increase brand awareness, engagement, purchase intention, trust, and loyalty (Hollebeek & Macky, 2019). Visual digital content is effective because it forms the relationship between the brand, content, and audience (Hollebeek & Macky, 2019). Research on social media shows that brands share visual content with the aim of building brand relationships (Gómez, Lopez, & Molina, 2019). Based on these findings, this study defines digital visual engagement as individual behavior towards visual content in an online environment, especially in social media.

Attitude
Attitude is defined as an overall evaluation of an object (Eagly & Chaiken, 1993). Attitude is evaluative because it reflects a positive or negative response to an object (Lee et al., 2015). In the context of social media, especially Instagram, when content has meaning, one's attitude will tend to be positive (Borges-Tiago, Tiago, & Chaiken, 2019). Kim, Kim, and Wachter (2013) said that engagement can form attitudes that will affect overall judgment. The findings of Le Roux, Irene, and Tania (2016) showed that there is a correlation between attitude, purchase intention, and engagement.

Purchase Intention
Purchase intention can be defined as the likelihood that consumers will plan or are willing to buy certain products or services in the future (Martins et al., 2019; Rahim et al., 2016). In the context of social media, the customer's decision to buy a product depends largely on the value of the product and on recommendations by other consumers on social media (Dehghani & Tumer, 2015). Balakrishnan, Dahnil, and Yi (2014) said that social media is an effective marketing medium for increasing purchase intention. Previous research has shown that more and more companies are using Instagram to increase purchase intention (Sokolova & Kevi, 2019). Instagram features that allow content sharing in the form of photos have become part of social media advertising that can increase purchase intention (Alalwan, 2018). Today, many companies use advertising to attract customers' attention. This adds to brand image and brand equity during or after the buying cycle. Dehghani and Tumer (2015) also found that customers' desire to buy goods from a brand increased after seeing many likes and shares, which signaled that the brand had a good reputation.

Hypothesis Development
To find out how an image can attract, involve, and engage the audience, this research analyzes Instagram content based on two metafunctions: interactive and compositional. The representational metafunction was not included in the study because it does not provide information about how the audience can relate to the picture (Valentini et al., 2018). Each metafunction is represented by one factor: gaze for the interactive metafunction and product salience for the compositional metafunction.
According to Valentini et al. (2018), the more the model looks at the audience (direct gaze), the greater the engagement between the model and the audience, and the opposite is true for indirect gaze. This is supported by Ewing, Rhodes, and Pellicano (2010), who said that direct gaze engages more than indirect gaze. This has been tested directly and visually. From this, it can be seen that there is a relationship between the gaze of the model and digital visual engagement with visual content.
Besides gaze, other factors can influence how the audience engages with the image. The audience looks at the elements in a picture. Visual salience concerns strategic choices in the compositional metafunction that attract the audience's attention through the elements in the picture. According to McCay-Peet, Lalmas, and Navalpakkam (2012), research on visual salience shows that visual content with high product salience facilitates information reception and increases engagement compared with low product salience content. Valentini et al. (2018) also showed that gaze and product salience interact in their effect on the audience. Based on this research, the first hypothesis tests the interaction predicted by visual social semiotics theory between the interactive and compositional metafunctions, where gaze interacts with product salience. H1: There is an interaction between gaze and product salience on digital visual engagement.
When there is direct gaze and high product salience, digital visual engagement is expected to increase. Visual content is more easily accepted and provides more relevant information than non-visual content, which affects one's attitude and motivation (Johnston & Davis, 2019). Berry et al. (2006) found that the visual element of an advertisement is very important in building a positive attitude towards the ad, which later increases purchase intention. Engagement formed from information on the internet and social media increases positive attitude towards the information (Lewis & Sznitman, 2019). Therefore, the second hypothesis of this study is: H2: Digital visual engagement has a positive effect on attitude.
Research on the effect of engagement on purchase intention in online communities in China shows that the higher a person's engagement with social media, the greater the purchase intention (Prentice et al., 2019). Other research on social media marketing shows that social media engagement in the form of feedback increases purchase intention (Balakrishnan, Dahnil, & Yi, 2014). Because social media engagement and digital visual engagement are similar in building relationships with audiences, digital visual engagement is expected to have a similar effect on purchase intention. The third hypothesis of this study is: H3: Digital visual engagement has a positive effect on purchase intention.

Design
This research used a causal research design with an experimental method. The experimental method was used to examine the causal relationship between the independent variables and the dependent variables (Malhotra, Nunan, & Birks, 2017). In addition to the experimental method, hypotheses 2 and 3 were tested using simple regression.
In the context of this study, gaze and product salience were the independent variables, while digital visual engagement, attitude, and purchase intention were the dependent variables. Visual content was manipulated in a controlled environment in terms of gaze and product salience. This study measured the effects of manipulation of visual content on digital visual engagement, attitude, and purchase intention.

Participants
The participant population comprised social media users from all professions. The sample was taken from Instagram users, as Instagram is one of the most popular photo-sharing social media platforms (Sheldon & Newman, 2019). Participants were 18 to 34 years old, because most Instagram users fall within that age range (Statista, 2019). Because four experimental conditions had to be tested, this study used 200 participants, with 50 participants in each experimental condition.
This research used random convenience sampling. The sampling was done online using the Amazon Mechanical Turk platform, which linked to QuestionPro.com. A small compensation was given to each respondent. Responses were screened before being included in the sample. To be included, respondents had to pass the manipulation check, be 18 to 34 years old, and be active Instagram users, as determined from their questionnaire answers. Completion time was also a criterion: a minimum amount of time had to be spent on the questionnaire for a response to be included in this study.
A total of 280 respondents in four groups (70 per group) completed the survey. After completing the survey, each respondent received a unique code that had to be entered into Amazon Mechanical Turk as proof of their work. Each respondent had a different code. Of the 280 respondents, 200 (50 per group) were retained after screening. Screening was done by checking the code entered and then the completion time. Responses whose durations did not meet the requirement were removed from the results. IP addresses were also checked to make sure that no respondent filled out the survey more than once. Respondents who did not pass the screening were rejected on Amazon Mechanical Turk. The collected data were entered into SPSS for analysis.
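The screening steps described above can be sketched in code. This is only an illustration under stated assumptions: the field names, the `MIN_DURATION_SEC` threshold, and the rule of keeping the first 50 valid responses per group are hypothetical, since the study reports the criteria but not these details.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Response:
    # All field names are hypothetical; the study does not publish its data schema.
    respondent_id: str
    group: str                # one of the four experimental conditions
    entered_code: str         # code typed into Amazon Mechanical Turk
    issued_code: str          # unique code issued at the end of the survey
    duration_sec: float       # time spent on the questionnaire
    ip_address: str
    age: int
    active_instagram_user: bool
    passed_manipulation_check: bool


MIN_DURATION_SEC = 120.0  # assumed value; the minimum time is not reported
PER_GROUP_QUOTA = 50      # from the text: 50 respondents retained per group


def screen(responses):
    """Return retained responses per group, applying the checks in the order described."""
    seen_ips = set()
    kept = defaultdict(list)
    for r in responses:
        # Keep only the first submission from each IP address.
        if r.ip_address in seen_ips:
            continue
        seen_ips.add(r.ip_address)
        # The entered code must match the code issued by the survey.
        if r.entered_code != r.issued_code:
            continue
        # Responses completed too quickly are removed.
        if r.duration_sec < MIN_DURATION_SEC:
            continue
        # Eligibility criteria: age range, active Instagram use, manipulation check.
        if not (18 <= r.age <= 34 and r.active_instagram_user
                and r.passed_manipulation_check):
            continue
        # Assumed rule: stop adding to a group once its quota is filled.
        if len(kept[r.group]) < PER_GROUP_QUOTA:
            kept[r.group].append(r)
    return dict(kept)
```

Calling `screen` on the full list of 280 responses would, under these assumptions, return at most 50 responses for each of the four conditions.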

Procedure
When respondents entered Amazon Mechanical Turk, they were given a special link to open the prepared survey. Respondents were directed to the main page, where they had to answer a number of questions that checked whether they met the criteria. Respondents who did not meet one of the criteria could not enter the actual survey. Respondents were randomized based on their choice of certain variables. The survey was divided into four conditions (direct gaze x high product salience; indirect gaze x high product salience; direct gaze x low product salience; indirect gaze x low product salience), each presented with a different stimulus image.
To test H1, we used a 2-way ANOVA, whereas H2 and H3 were tested using linear regression. Hypothesis 1 was tested using a 2x2 factorial design between gaze and product salience. This study tested the hypotheses under four different experimental conditions (direct gaze x high product salience; indirect gaze x high product salience; direct gaze x low product salience; indirect gaze x low product salience). The four conditions of the experiment were manipulated based on the pictures given to each respondent, where respondents would get different images. Each respondent was asked a question with a 7-point Likert scale assessment.
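For readers who want to see what the two-way ANOVA computes, the following is a minimal sketch for a balanced between-subjects design, written with the standard library only. It is not the SPSS procedure used in the study; the function name and the dictionary layout are illustrative assumptions. With four cells of 50 respondents each, the error degrees of freedom are 4 x (50 - 1) = 196, which matches the F(1,196) values reported in the results.

```python
from statistics import mean


def two_way_anova(cells):
    """F ratios for a balanced two-factor between-subjects design.

    `cells` maps (gaze_level, salience_level) -> list of scores.
    Equal cell sizes and a fully crossed design are assumed.
    """
    gaze_levels = sorted({g for g, _ in cells})
    sal_levels = sorted({s for _, s in cells})
    n = len(next(iter(cells.values())))
    if any(len(scores) != n for scores in cells.values()):
        raise ValueError("this sketch assumes equal cell sizes")

    cell_mean = {key: mean(scores) for key, scores in cells.items()}
    grand = mean(cell_mean.values())  # equals the overall mean when cells are balanced
    gaze_mean = {g: mean(cell_mean[(g, s)] for s in sal_levels) for g in gaze_levels}
    sal_mean = {s: mean(cell_mean[(g, s)] for g in gaze_levels) for s in sal_levels}

    # Sums of squares for the two main effects, the interaction, and the error term.
    ss_gaze = n * len(sal_levels) * sum((gaze_mean[g] - grand) ** 2 for g in gaze_levels)
    ss_sal = n * len(gaze_levels) * sum((sal_mean[s] - grand) ** 2 for s in sal_levels)
    ss_inter = n * sum(
        (cell_mean[(g, s)] - gaze_mean[g] - sal_mean[s] + grand) ** 2
        for g in gaze_levels
        for s in sal_levels
    )
    ss_error = sum(
        (x - cell_mean[key]) ** 2 for key, scores in cells.items() for x in scores
    )

    df_gaze = len(gaze_levels) - 1
    df_sal = len(sal_levels) - 1
    df_error = len(cells) * (n - 1)
    ms_error = ss_error / df_error
    return {
        "F_gaze": (ss_gaze / df_gaze) / ms_error,
        "F_salience": (ss_sal / df_sal) / ms_error,
        "F_interaction": (ss_inter / (df_gaze * df_sal)) / ms_error,
        "df_error": df_error,
    }
```

The sketch returns F ratios only; converting them to p-values requires the F distribution, which a statistics package such as SPSS provides.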
Hypotheses 2 and 3 were tested using linear regression. Before the linear regressions, however, ANOVA was used to examine the effect of the interaction between gaze and product salience on purchase intention and attitude. An overall regression was then carried out to obtain the beta, t-value, and significance value. Next, the regression was run separately for each of the four scenarios to see how the scenarios differed for attitude and purchase intention. The per-scenario regression results were then compared using the differences (Δ) between them.
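The regression quantities named above can be illustrated with a short sketch. It assumes a single predictor, in which case the standardized beta equals Pearson's r and the t-value follows from beta and the sample size as t = β√(n − 2) / √(1 − β²). The function names are illustrative, and reading Δ as the difference between two conditions' t-values is an assumption; that reading reproduces the Δ = 1.231 reported for attitude under high product salience (11.713 − 10.482).

```python
from math import sqrt
from statistics import mean


def simple_regression(x, y):
    """Slope, standardized beta, and t-value for a one-predictor regression."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    beta = sxy / sqrt(sxx * syy)  # equals Pearson's r with a single predictor
    return {"slope": slope, "beta": beta, "t": t_from_beta(beta, n), "df": n - 2}


def t_from_beta(beta, n):
    """t-value implied by a standardized beta and sample size (requires |beta| < 1)."""
    return beta * sqrt(n - 2) / sqrt(1 - beta ** 2)


def delta(t_a, t_b):
    """Assumed definition of the comparison: absolute difference between two t-values."""
    return abs(t_a - t_b)
```

As a consistency check, `t_from_beta(0.835, 200)` gives about 21.35, close to the t = 21.336 reported for the overall attitude regression; the small gap is what one would expect if the published beta is rounded to three decimals.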

RESULTS
The first analysis examined the interaction between gaze and product salience on digital visual engagement. Referring to Appendix F, there was a significant interaction between gaze and product salience (F(1,196) = 7.649, p < 0.05). When the means were compared across conditions, it was found that under direct gaze, high product salience increased digital visual engagement compared with low product salience (Mhigh = 5.48 (0.99) vs Mlow = 4.39 (1.51), p < 0.05). This also applies to indirect gaze, where high product salience increased digital visual engagement compared with low product salience (Mhigh = 4.38 (1.42) vs Mlow = 4.28 (1.06), p < 0.05), as shown in Figure 1.
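An interaction in a 2x2 design means that the effect of one factor differs across the levels of the other. A small worked example using the four cell means reported above makes this concrete; the function and variable names are illustrative only.

```python
# Cell means for digital visual engagement, as reported in the text.
cell_means = {
    ("direct", "high"): 5.48,
    ("direct", "low"): 4.39,
    ("indirect", "high"): 4.38,
    ("indirect", "low"): 4.28,
}


def simple_effect_of_salience(means, gaze):
    """Difference between high and low product salience at one gaze level."""
    return means[(gaze, "high")] - means[(gaze, "low")]


def interaction_contrast(means):
    """How much the salience effect under direct gaze exceeds that under indirect gaze."""
    return (simple_effect_of_salience(means, "direct")
            - simple_effect_of_salience(means, "indirect"))
```

With these means, the salience effect is 1.09 scale points under direct gaze and 0.10 under indirect gaze, so the interaction contrast is 0.99: product salience matters much more when the model looks at the viewer.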

Figure 1. Mean Digital Visual Engagement
From these results, it can be seen that direct gaze increases digital visual engagement, especially when the product is salient rather than in the background (low product salience). The main effects of both factors were significant (Fgaze(1,196) = 11.34, p < 0.05; Fsalience(1,196) = 11.04, p < 0.05). This can also be seen in the means, which differed significantly for gaze (MDirectGaze = 4.94 (1.38) vs MIndirectGaze = 4.34 (1.25), p < 0.05) and for product salience (MHighSalience = 4.94 (1.34) vs MLowSalience = 4.34 (1.31), p < 0.05). These results show that there was an interaction between gaze and product salience on digital visual engagement.
The next analysis examined the relationship between digital visual engagement and attitude. It began by looking at the effect of the interaction between gaze and product salience on attitude. The ANOVA showed a significant interaction effect (F(1,196) = 9.680, p < 0.05). Referring to Table 1 and Figure 2, under direct gaze, attitude was more positive with high product salience than with low product salience (Mhigh = 5.63 (1.09) vs Mlow = 4.63 (1.45), p < 0.05). Under indirect gaze, low product salience formed a more positive attitude than high product salience (Mhigh = 4.83 (1.64) vs Mlow = 5.03 (1.17), p < 0.05).

Figure 2. Mean Attitude
An overall regression was carried out next. Table 2 shows that digital visual engagement is positively and significantly related to attitude (β = 0.835; t = 21.336, p < 0.001). When digital visual engagement increases, attitude also increases. The analysis was then carried out in more depth by running a separate regression for each of the four experimental conditions. The results showed that in all experimental conditions digital visual engagement has a positive and significant relationship with attitude. Comparing the regressions, the difference between direct gaze and indirect gaze is quite large under low product salience but small under high product salience (Δ = 1.231).
Direct gaze x high salience: β = 0.834; t = 10.482, p < 0.001
Direct gaze x low salience: β = 0.877; t = 12.663, p < 0.001
Indirect gaze x high salience: β = 0.861; t = 11.713, p < 0.001
Indirect gaze x low salience: β = 0.718; t = 7.155, p < 0.001
The final analysis of this study examined the relationship between digital visual engagement and purchase intention. It began by looking at the effect of the interaction between gaze and product salience on purchase intention. There was a significant interaction effect (F(1,196) = 6.415, p < 0.05). Under direct gaze, purchase intention was higher with high product salience than with low product salience (Mhigh = 5.34 (1.24) vs Mlow = 4.47 (1.62), p < 0.05), as shown in Figure 3.

Figure 3. Mean Purchase Intention
An overall regression was carried out next. The results show that digital visual engagement is positively and significantly related to purchase intention (β = 0.842; t = 21.921, p < 0.001) (Table 3). When digital visual engagement increases, purchase intention also increases. The analysis was then carried out in more depth by running a separate regression for each of the four experimental conditions. All experimental conditions show a positive and significant relationship between digital visual engagement and purchase intention, but its strength depends on the interaction between gaze and product salience. Comparing the regressions, the difference between direct gaze and indirect gaze is large under low product salience (Δ = 4.984) but smaller under high product salience (Δ = 1.968).
The manipulation check consisted of three parts: picture quality, gaze, and salience. The picture quality indicator did not show a significant difference between the four experimental conditions. The manipulation check on gaze showed that the tested images matched the criteria of each scenario. Referring to Table 4, in the direct gaze conditions with high and low product salience, respondents perceived that the model in the picture looked directly at them (Mhigh = 5.11 (0.94) vs Mlow = 5.04 (0.91)). In the indirect gaze conditions with high and low product salience, respondents perceived that the model in the picture did not look directly at them (Mhigh = 2.84 (0.86) vs Mlow = 2.99 (1.02)).
The manipulation check on salience showed that the tested images matched the criteria of each scenario. In the high product salience conditions with direct gaze and indirect gaze, respondents perceived that the product in the picture was in the foreground and attracted their attention (Mdirect = 5.21 (0.96) vs Mindirect = 5.22 (0.85)). In the low product salience conditions with direct gaze and indirect gaze, respondents perceived that the product in the picture was in the background and did not attract their attention (Mdirect = 3.86 (1.39) vs Mindirect = 3.88 (0.81)).

DISCUSSION
The results above show that there is an interaction between gaze and product salience on digital visual engagement. This research shows that direct gaze can improve digital visual engagement. This result is supported by Pfeiffer et al. (2013), who reported that direct gaze can create social interaction or increase engagement with the audience. As a result, audiences who receive direct gaze are more likely to share pictures with friends, follow the brand, and reply to the post than those who receive indirect gaze. Psychological research shows that direct gaze increases the attention time on faces in images and delays disengagement (Senju & Hasegawa, 2005).
In the context of product salience, products placed in the foreground (high product salience) generate greater engagement from the audience than those placed in the background (low product salience). Research on visual composition in Instagram shows that the audience is more interested when the product in the image is more prominent (high product salience), and the audience is more concerned with the product than the model (Ramos-Serrano & Martínez-García, 2016). This is confirmed by research by Chu et al. (2017), who stated that images with dominant products have the highest percentage of likes and comments.
In addition to the interaction between gaze and product salience, there is a main effect of gaze, that is, a significant difference between direct gaze and indirect gaze. Ewing, Rhodes, and Pellicano (2010) stated that direct gaze is preferred and has a higher level of attractiveness than indirect gaze, which leads to higher engagement. There is also a main effect of product salience, because high product salience and low product salience differ significantly. Research by McCay-Peet, Lalmas, and Navalpakkam (2012) showed that visual content with high product salience facilitates information reception and increases engagement compared with low product salience content. Managers can use this finding to create effective visual content by combining direct gaze and high product salience for the products they market, in order to create a high level of engagement with the advertising audience.
In accordance with the second hypothesis, there is a positive relationship between digital visual engagement and attitude. The condition of direct gaze and high product salience has the greatest influence on attitude. These results are consistent with the findings of Lewis and Sznitman (2019), where a positive attitude is formed when there is engagement with content on social media. This finding is also related to research by Kim, Kim, and Wachter (2013), where engagement can form attitudes that affect overall judgment. Aside from attitude, digital visual engagement also has a positive effect on purchase intention.
The support for the third hypothesis is in line with the findings of Prentice et al. (2019), who showed that the higher one's engagement, the greater the purchase intention. The condition in which there is direct gaze and the product is in the foreground (direct gaze x high product salience) has the greatest impact on purchase intention. This result is similar to that of Balakrishnan, Dahnil, and Yi (2014), who found that social media engagement in the form of feedback increases purchase intention. Dehghani and Tumer (2015) found that purchase intention is influenced by engagement from current and previous users, as seen in likes and comments. The more likes and comments a brand receives, the better its reputation.
Companies improve their brand image and brand equity through the buying cycle by using advertisements to engage customers. In the context of social media, this finding is in line with Alalwan (2018), who said that the Instagram feature that allows companies to share content in the form of photos has become part of social media advertising that can increase purchase intention. Given the positive relationship of engagement with attitude and purchase intention, businesses can apply this finding to their marketing techniques to increase engagement so that the attitudes and purchase intentions of potential buyers improve, which can increase sales.

LIMITATIONS AND FUTURE RESEARCH
This research has some limitations. First, the products used in this research were unisex jackets. Different effects may be found if the products used are gender-specific. Past research, such as Valentini et al. (2018), used products in male-specific categories. This provides an opportunity for future research to examine whether gender-specific products moderate the effect. Second, a hypothetical brand was used in this research to avoid bias. While bias avoidance should be maintained in research, it is also important to see whether brand names change the effect. This provides a future research opportunity to examine whether brand has a direct main effect or, perhaps, a moderating effect.
In addition, the visual content used in future research does not have to be still images. Moving images such as animation and videos can also be explored to see whether they produce different consumer responses. Further, as prices were not included in this study, it would be interesting to examine whether there is an interaction effect of gaze, product salience, and price on digital visual engagement.
Within the theory of metafunctions in visual social semiotics, researchers can explore the effects of the three main metafunctions (representational, interactive, and compositional) on visual engagement. For instance, future research could examine the interactions between image angle, framing, color, and other elements.