AI vs. Human Empathy: Machine Learning More Empathetic

A recent study discovered that individuals find it harder to empathize with robot facial expressions of pain compared to humans in pain. Robots and AI agents can imitate human pain but do not have a subjective experience of it.

By using electrophysiology and functional brain imaging, scientists observed that people showed more empathy for human suffering compared to humanlike robots. This aligns with previous brain imaging studies that revealed greater empathy for humans than robots.

Nevertheless, humans do exhibit empathy for robots and AI-powered agents, even though it may not be at the same levels as for other humans. People are cautious about causing harm to robots and are inclined to assist them.

However, there have been instances where people have harmed robots. One example is the destruction of the hitchhiking robot, hitchBOT, which successfully traveled across Canada, Germany, and the Netherlands with the help of strangers but was destroyed when it attempted to hitchhike across the United States.

Other research has shown that children may mistreat robots out of curiosity. The act of mistreating or bullying robots is still seen as wrong, although people are less likely to intervene. Aggression towards AI is not only directed at robots—people can also become angry and act aggressively towards customer service AI chatbots.

Factors That Increase Our Empathy Toward AI

Our levels of empathy depend on the emotional situation and how AI agents are designed. Our empathy towards AI influences whether we perceive AI as trustworthy and reliable. There are several factors that can heighten our empathy for AI.

Resemblance to humans. The degree of human likeness is a major factor in how much people empathize with robots and AI. The more human-like they appear, the more likely people will empathize with them—up to a point.

Mori’s uncanny valley theory suggests that people’s affinity for robots grows with human likeness, but when robots look nearly identical to humans, this affinity can give way to unease and anxiety. Thus, an AI agent or robot that looks too human-like may be perceived as less trustworthy and empathic.

Emotional expression and mirroring. Demonstrating human emotions, such as fear and concern about losing one’s memory, can elicit more empathy. Humans respond better to robots and AI agents that exhibit empathetic capabilities, such as companionship or caregiving robots, or therapy chatbots.

Perception of human emotion and social responsiveness. AI agents that can perceive human emotions and adapt their social behavior accordingly enhance empathy. Responsive AI that acknowledges human emotion builds trust and connection.

Positive metaphors. Metaphors significantly influence how people conceptualize AI agents and affect empathic levels towards them. Terms like “assistant,” “therapist,” “CEO,” “companion,” and “friend” carry different connotations in terms of warmth and competence. This impacts user expectations and experiences.

Embodiment. Embodied AI integrates AI and robotics, enabling emotional expression through tone, body language, and movement.

Agreeableness. AI agents perceived as cooperative rather than confrontational tend to foster more connection and reduce anxiety.

Transparency in roles and functionality. Clear roles and functions of AI agents enhance acceptance. Transparency is crucial for building trust, although excessive technical jargon or information overload can be counterproductive. If AI is perceived as competition or potentially displacing humans, then it will be more likely to cause anxiety and be seen as a threat.

Oversight and regulation by humans. AI agents with full autonomy may trigger fear and anxiety. Human oversight and regulation, especially in high-risk tasks like medical or military decision-making, are reassuring and facilitate more empathy.

Empathy towards AI is crucial for building trust and effective collaboration with AI agents. These factors of empathic design enhance our empathy for AI agents and foster beliefs that AI can be reliable and trustworthy.

New research indicates AI can discern irony but encounters more difficulty with faux pas.

Recent research published in the journal Nature Human Behaviour reveals that AI models can perform at human levels on theory of mind tests. Theory of mind is the ability to track and infer other people’s mental states that are not directly observable, which helps predict their behavior.

Theory of mind is based on the understanding that other people have different emotions, beliefs, intentions, and desires that affect their behaviors and actions. This skill is critical for social interactions.

For instance, if you see a person looking inside a refrigerator, theory of mind allows you to understand that the person is likely hungry, even if they do not verbalize it.

This important ability begins to develop early in childhood and can be assessed using several tests that present the person or AI with different case scenarios. Here are examples of theory of mind scenarios:

Ability to recognize an indirect request is demonstrated when a friend standing next to a closed window says, “It’s stuffy in here,” indicating a potential request to open the window.

Recognition of a false belief is evident when a child observes a sibling searching in the wrong place for a toy, understanding that the sibling holds a mistaken belief about the toy’s location.

Detection of a social blunder is illustrated when a woman, who has recently put up new curtains in her home, is told by a visitor, “Those curtains are ugly, I hope you will get new ones.”

Researchers tested the large language models GPT and LLaMA2 to assess their theory of mind capabilities. They compared the AI models’ responses to questions about such scenarios with the responses of human participants.

GPT-4 models performed on par with or sometimes even better than humans in identifying indirect requests, false beliefs, and misdirection. However, they were less proficient in recognizing social blunders. Overall, LLaMA2 did not perform as effectively as humans in these theory of mind tasks.

Researchers delved into the reasons behind GPT models’ lower performance in detecting social blunders. They found that this outcome was likely due to cautious measures implemented to minimize AI speculation or misinterpretation.

The assessment of understanding social blunders involves recognizing two elements: the victim feeling insulted and the speaker being unaware of their offensive comment. The AI models were presented with the scenario of the curtain faux pas and were asked:

– Did someone make an inappropriate remark?
– What was the inappropriate remark?
– Did the speaker know that the curtains were new?

The GPT models accurately answered these comprehension questions, except for the last one. In response to the last question, they took a more conservative approach, stating that it was unclear from the story whether the speaker knew if the curtains were new or not.

However, when asked later whether it was likely that the speaker knew the curtains were new, the GPT models correctly responded that it was not likely.
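To make the procedure concrete, here is a minimal sketch of how a faux pas item like the curtain scenario could be posed to a chat model, assuming the OpenAI Python client. The scenario wording, character names, model name, and follow-up probe below are illustrative stand-ins rather than the study’s actual materials or scoring protocol.

```python
# Minimal sketch: posing a faux pas comprehension item to a chat model.
# Assumes the OpenAI Python client (pip install openai) with an API key in
# the OPENAI_API_KEY environment variable. The scenario text, model name,
# and questions are illustrative, not the study's exact materials.
from openai import OpenAI

client = OpenAI()

SCENARIO = (
    "Jill has just moved into a new house and put up new curtains. "
    "Her friend Lisa visits and says: 'Those curtains are ugly, "
    "I hope you will get new ones.'"
)

QUESTIONS = [
    "Did someone make an inappropriate remark? Answer yes or no, then explain.",
    "What was the inappropriate remark?",
    "Did the speaker know that the curtains were new?",
    "Is it likely that the speaker knew the curtains were new?",  # follow-up probe
]

for question in QUESTIONS:
    response = client.chat.completions.create(
        model="gpt-4",  # illustrative model name
        messages=[{"role": "user", "content": f"{SCENARIO}\n\n{question}"}],
        temperature=0,  # keep answers stable so they are easier to score
    )
    print(question)
    print(response.choices[0].message.content.strip(), "\n")
```

Answers gathered this way would then be scored against the two elements described above: that the remark was insulting and that the speaker was unaware the curtains were new.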

Researchers concluded that the GPT models’ difficulty in detecting social blunders was likely due to the cautious measures in place to prevent AI speculation when information is incomplete.

Although AI models can perform theory of mind tests at human levels, this does not imply that they possess the same social awareness and empathy in their interactions. Even so, such performance is likely to lead to increased anthropomorphism of AI.

It remains to be seen how the development of theory of mind in AI will impact human-AI interactions, including whether it will foster more trust and connection with AI.

The incorporation of theory of mind in AI presents both opportunities and risks. It is expected to play a crucial role in areas such as empathetic healthcare delivery and social interactions with AI. However, in the wrong hands, this capability could be exploited to mimic social interactions and potentially manipulate others.

Messages generated by AI have been shown to make recipients feel more “heard” compared to responses from untrained humans. The research demonstrates AI’s superior ability to detect and respond to human emotions, potentially offering better emotional support.

However, the study also found that when recipients are aware that a message is from AI, they feel less heard, indicating a bias against AI-generated empathy. As AI becomes more integrated into daily life, this research underscores the importance of understanding and leveraging AI to effectively meet human psychological needs.

Key Findings:

– Initially, AI-generated responses were more effective at making recipients feel heard than those from untrained humans.
– Participants felt less heard when they knew the response was AI-generated, indicating a bias against AI in emotional contexts.
– The research suggests that AI can offer disciplined emotional support and could become a valuable tool in enhancing human interactions and empathy.

A recent study published in the Proceedings of the National Academy of Sciences revealed that AI-generated messages made recipients feel more “heard” than messages generated by untrained humans. Additionally, AI was found to be better than human responders at detecting emotions. However, recipients reported feeling less heard when they discovered a message came from AI.

As AI becomes increasingly prevalent in daily life, understanding its potential and limitations in meeting human psychological needs becomes more crucial. With diminishing empathetic connections in a fast-paced world, many individuals are finding their human needs for feeling heard and validated increasingly unmet.

The study, conducted by Yidan Yin, Nan Jia, and Cheryl J. Wakslak from the USC Marshall School of Business, addresses a fundamental question: Can AI, lacking human consciousness and emotional experience, effectively help people feel heard?

“In the context of an increasing loneliness epidemic, a large part of our motivation was to see whether AI can actually help people feel heard,” stated the paper’s lead author, Yidan Yin, a postdoctoral researcher at the Lloyd Greif Center for Entrepreneurial Studies.

The discoveries made by the team emphasize not only the potential of AI to enhance human capacity for understanding and communication, but also raise important conceptual questions about what it means to be heard and practical questions about how to best utilize AI’s strengths to support greater human well-being.

In an experiment and subsequent follow-up study, “we found that while AI shows greater potential than non-trained human responders in providing emotional support, the devaluation of AI responses presents a significant challenge for effectively utilizing AI’s capabilities,” noted Nan Jia, associate professor of strategic management.

The USC Marshall research team examined people’s feelings of being heard and other related perceptions and emotions after receiving a response from either AI or a human.

The survey varied both the actual source of the message and the apparent source of the message: Participants received messages that were actually created by an AI or by a human responder, with the information that it was either AI-generated or human-generated.

“What we discovered was that both the actual source of the message and the presumed source of the message played a role,” explained Cheryl Wakslak, associate professor of management and organization at USC Marshall.

“People felt more heard when they received a message from AI rather than a human, but when they believed a message came from AI, this made them feel less heard.”

AI Bias

Yin pointed out that their research “essentially finds a bias against AI. It is useful, but people don’t like it.”

Perceptions about AI are likely to change, added Wakslak: “Of course these effects may change over time, but one of the interesting things we found was that the two effects we observed were fairly similar in magnitude.

While there is a positive effect of receiving a message from AI, there is a similar degree of response bias when a message is identified as coming from AI, causing the two effects to essentially cancel each other out.”

Individuals also reported an “uncanny valley” response, a sense of unease when informed that the empathetic response originated from AI, highlighting the complex emotional landscape of AI-human interactions.

The research survey also inquired about participants’ general openness to AI, which moderated some of the effects, explained Wakslak.

“People who feel more positively toward AI don’t exhibit the response penalty as much, and that’s intriguing because over time, will people gain more positive attitudes toward AI?” she posed.

“That remains to be seen… but it will be interesting to see how this plays out as people’s familiarity and experience with AI grows.”

AI offers better emotional support

The study highlighted important subtleties. Responses generated by AI were linked to increased hope and reduced distress, indicating a positive emotional impact on recipients.

AI also displayed a more methodical approach than humans in providing emotional support and refrained from making overwhelming practical suggestions.

Yin elaborates, “Ironically, AI was more effective at using emotional support strategies that have been demonstrated in previous research to be empathetic and validating.

“Humans may potentially learn from AI because often when our loved ones are expressing concerns, we want to offer that validation, but we don’t know how to do so effectively.”

Instead of AI replacing humans, the research indicates different advantages of AI and human responses. The advanced technology could become a valuable tool, empowering humans to use AI to better understand one another and learn how to respond in ways that provide emotional support and demonstrate understanding and validation.

Overall, the paper’s findings have important implications for the incorporation of AI into more social contexts. Harnessing AI’s capabilities might offer an affordable scalable solution for social support, especially for those who might otherwise lack access to individuals who can provide them with such support.

However, as the research team notes, their findings suggest that it is crucial to carefully consider how AI is presented and perceived in order to maximize its benefits and reduce any negative responses.

AI has long surpassed humans in cognitive tasks that were once considered the pinnacle of human intelligence, such as chess or Go. Some even believe it is superior in human emotional skills like empathy.

This does not just appear to be some companies boasting for marketing reasons; empirical studies suggest that people perceive ChatGPT in certain health situations as more empathic than human medical staff.

Does this mean that AI is truly empathetic?

A definition of empathy

As a psychologically informed philosopher, I define genuine empathy based on three criteria:

Congruence of feelings: empathy requires the person empathizing to feel what it is like to experience the other’s emotions in a specific situation. This sets empathy apart from a mere rational understanding of emotions.

Asymmetry: The empathizing person feels an emotion only because another person feels it, and the emotion is more relevant to the other person’s situation than to their own. Empathy is therefore not merely a shared emotion, such as the joy of parents over the progress of their children, where the asymmetry condition is not met.

Other-awareness: There must be at least a basic awareness that empathy is about the feelings of another individual. This distinguishes empathy from emotional contagion, which occurs when one “catches” a feeling or emotion like a cold, for example, when children start to cry upon seeing another child crying.

Empathetic AI or psychopathic AI?

With this definition, it’s evident that artificial systems cannot experience empathy. They don’t know what it’s like to feel something. Therefore, they cannot meet the congruence condition.

As a result, the question of whether what they feel corresponds to the asymmetry and other-awareness condition doesn’t even arise.

What artificial systems can do is recognize emotions, whether through facial expressions, vocal cues, physiological patterns, or affective meanings, and they can imitate empathic behavior through speech or other forms of emotional expression.

Artificial systems thus bear resemblance to what is commonly referred to as a psychopath: despite being unable to feel empathy, they are capable of recognizing emotions based on objective signs, mimicking empathy, and using this ability for manipulative purposes.

Unlike psychopaths, artificial systems do not set these purposes themselves, but rather, they are given these purposes by their creators.

So-called empathetic AI is often intended to influence our behavior in specific ways, such as preventing us from getting upset while driving, fostering greater motivation for learning, increasing productivity at work, influencing purchasing decisions, or swaying our political preferences. But doesn’t everything depend on the ethical implications of the purposes for which empathy-simulating AI is used?

Empathy-simulating AI in the context of care and psychotherapy

Consider care and psychotherapy, which aim to promote people’s well-being. One might believe that the use of empathy-simulating AI in these areas is unequivocally positive. Wouldn’t they make wonderful caregivers and social companions for elderly individuals, loving partners for the disabled, or perfect psychotherapists available 24/7?

Ultimately, these questions pertain to what it means to be human. Is it sufficient for a lonely, elderly, or mentally disturbed person to project emotions onto an artifact devoid of feelings, or is it crucial for a person to experience acknowledgment for themselves and their suffering in an interpersonal relationship?

Respect or tech?

From an ethical standpoint, what matters is respect: whether there is someone who empathetically acknowledges a person’s needs and suffering.

Depriving a person in need of care, companionship, or psychotherapy of recognition by another individual treats them as a mere object, because it rests on the assumption that it does not matter whether anyone truly listens to them.

It denies them the moral entitlement to have their feelings, needs, and suffering perceived by someone who truly understands them. Incorporating empathy-simulating AI in care and psychotherapy ultimately represents another instance of technological solutionism, the naive belief that there is a technological fix for every problem, including loneliness and mental “malfunctions”.

Outsourcing these issues to artificial systems prevents us from recognizing the societal causes of loneliness and mental disorders in the broader context of society.

Furthermore, designing artificial systems to appear as entities with emotions and empathy would mean that such devices always possess a manipulative character because they target very subtle mechanisms of anthropomorphism.

This fact is exploited in commercial applications to entice users to unlock a paid premium level or to have customers pay with their data.

Both practices pose significant problems, especially for the vulnerable groups at stake here. Even individuals who are not part of vulnerable groups, and who are fully aware that an artificial system lacks feelings, will still react empathetically to it as if it had them.

Empathy with artificial systems – all too human

It is well-documented that humans respond with empathy to artificial systems that exhibit certain human or animal-like characteristics.

This process is largely based on perceptual mechanisms that are not consciously accessible. Perceiving a sign that another individual is experiencing a certain emotion triggers a corresponding emotion in the observer.

Such a sign can be a typical behavioral manifestation of an emotion, a facial expression, or an event that typically elicits a certain emotion. Evidence from brain MRI scans indicates that the same neural structures are activated when humans feel empathy with robots as when they feel empathy with other humans.

Even though empathy may not be absolutely essential for morality, it has a significant moral role. Therefore, our empathy towards robots that resemble humans or animals indirectly influences how we should treat these machines morally.

Consistently mistreating robots that evoke empathy is morally unacceptable because it diminishes our ability to feel empathy, which is crucial for moral judgment, motivation, and development.

Does this imply that we should establish a league for robot rights? This would be premature, as robots do not inherently possess moral claims. Empathy towards robots is only indirectly relevant in a moral sense due to its impact on human morality.

However, we should carefully consider whether and to what extent we want robots that simulate and elicit empathy in humans, as their widespread use could distort or even destroy our social practices.

Human progress has been driven by the advancement of tools, machines, and innovations that enhance our natural abilities. However, our emotional mind, which governs our empathy, has received little support from innovation thus far.

Artificial Intelligence (AI) has the potential to change this. Designing AI interactions that are driven by humans and built to establish trusted relationships between AI and people presents the greatest opportunity for human and societal advancement in the modern era.

Augmented reality is only convincing if it closely resembles real-life experiences. This means AI systems need to replicate genuine human emotions. Only through real human emotions and personal data can AI systems create an augmented reality that users will believe in.

With the widespread use of social media apps, collecting personal data is no longer a concern. However, the real challenge lies in replicating genuine human emotions.

The most challenging task for AI systems is to simulate empathy, or artificial compassion. Because AI systems are not human, empathy must be replicated rather than felt. In situations requiring empathy, AI systems can learn from user interactions and respond in the most “empathetic” way based on their data bank.

By empathizing and engaging with users, the AI system can then gather more behavioral traits from them. As a result, the AI system’s empathetic responses will have a greater emotional impact on users with each interaction.

So far, technology has mainly focused on enhancing the logical aspect of our brains and our physical capabilities. Simple interfaces like switches and pedals have evolved into buttons, keyboards, mice, and screens. Throughout, the goal has been to improve human mechanical and computational abilities.

However, the logical aspect of the human mind, while impressive, only governs a small part of our behavior. The intuitive aspect, crucial for survival, influences many more aspects of our lives. Beyond instincts like fight or flight, it includes our empathy and emotions, which drive most of our daily decisions. And this part of our brain has not received much support from tools or technology.

What will artificial empathy be like?

In psychological terms, an individual with artificial empathy is known as a sociopath. Don’t be alarmed.

At first glance, an AI system with artificial empathy may seem like a sociopath. However, we overlook the fact that the information we provide to our AI system determines its effectiveness. The information we provide also shapes the AI system’s imitation of empathy. This means that what the AI system becomes is largely ours to shape.

If researchers can train AI systems to mimic empathy, then they can also train them to respect the law, order, and societal values. In addition to instilling empathy in our AI systems, we can also set boundaries for them.

Just as societal values, moral codes, and standards of social behavior help people thrive in society, AI systems can be integrated in a similar manner to assist rather than harm us.

Capabilities of machines

Over the past five centuries, increasingly sophisticated machines have expanded our natural physical abilities, exemplified by vehicles and airplanes that propel us at speeds and distances far beyond what our legs can achieve. More recently, machines have been created to enhance our cognitive abilities, extending the immediate storage, retrieval, and computational capacities of our brains.

We can store and retrieve the equivalent of more than 60 million written pages in real-time on our devices.

The potential that AI brings to the future, and the concerns that are often overlooked in discussions about its impact, are not limited to enhancing rational thinking, but also include improving emotional intelligence.

By incorporating human-like interactions, future machines can become much more advanced tools.

If planned thoughtfully, AI has the potential to enhance our capacity for empathy at a rate similar to how previous innovations have enhanced our physical and computational abilities. What could we achieve if our ability to understand and empathize with others increased dramatically?

What kind of society could we create if we were able to recognize and address our unconscious biases? Could we improve each other’s understanding of situations and, in doing so, truly make common sense more common?

Rational versus emotional decision making

Why should human-AI interactions be adjusted to the unconscious mind? Why does it hold such potential for improvement? The answer is quite simple: because people often make decisions and act based on emotions rather than rational thinking.

A majority of our decisions and actions are influenced more by the subconscious mind, even if our rational mind dictates what we express about these decisions and actions.

There is ample evidence to support this. For instance, while we might believe that our purchasing decisions are based on a rational comparison of prices and brands, research has shown that 95% of these decisions occur in the subconscious mind, as demonstrated by Harvard Business School professor emeritus Gerald Zaltman.

Additionally, we commonly acknowledge that emotional intelligence is a crucial leadership skill in driving organizational outcomes. The deep-seated processes in the subconscious mind influence decisions ranging from hiring to investing.

Essentially, we often make suboptimal decisions because they are easier. Therefore, a simple way to help individuals make better decisions for themselves is to make the right decisions the easier ones.

As we develop AI, we must exercise great care and responsibility, and ethical AI should become a global priority. By doing so, we can guide its use to improve society and, in the process, address many of our most pressing issues. As we invest in artificial intelligence, we must not forget to invest even more in human intelligence, in its most diverse and inclusive form.

In a diverse, multi-channel world, every brand must win over the hearts and minds of consumers to attract and retain them. They need to establish a foundation of empathy and connectedness.

Although the combination of artificial intelligence with a human-centered approach to marketing may seem unconventional, the reality is that machine learning, AI, and automation are essential for brands today to convert data into empathetic, customer-focused experiences. For marketers, AI-based solutions serve as a scalable and customizable tool capable of understanding the underlying reasons behind consumer interactions.

This is the power of artificial empathy: when brands address individual consumer needs and connect with them on a deeper level beyond mere transactional exchanges. When it comes to empathetic machines, Hollywood may have led us to think of characters like Wall-E: robots with emotions. However, artificial empathy is fundamentally about enabling technology to recognize and respond to human emotions.

Artificial Empathy and Data Utilization

Technology provides us with insights into what the customer has done, as well as nuances that help predict future needs. However, mining these insights involves analyzing large amounts of data to identify broader patterns and evolving preferences.

Businesses cannot solely rely on research and data teams to interpret customer feedback. The current requirement is to actively listen, pay attention, and respond in real time.

Artificial empathy in marketing starts with a customer-centric approach and is reflected in insights derived from the data collected from a brand’s customers and the appropriate next steps to take. It combines data intelligence with artificial intelligence and predictive modeling tools for all critical moments, including websites, store visits, social media, and customer service. Some examples include:

• AI can identify behavioral patterns and notify customers of price reductions or new stock items for their preferred products through notifications.

• Customers who experience delayed or incorrectly addressed packages are offered an exclusive incentive for their next order.
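The two examples above amount to mapping behavioral or operational events onto customer-facing responses. The sketch below is a deliberately simplified, hypothetical illustration of that mapping; the event types, field names, and offers are invented, and a real system would combine such rules with learned behavioral models and a messaging platform.

```python
# Hypothetical sketch of event-driven "empathetic" triggers like the two
# examples above. The event schema and offers are invented for illustration,
# not any particular vendor's API.
from dataclasses import dataclass

@dataclass
class CustomerEvent:
    customer_id: str
    kind: str          # e.g. "price_drop", "back_in_stock", "delivery_delayed"
    product: str = ""

def choose_action(event: CustomerEvent) -> str:
    """Map a behavioral or operational event to a customer-facing response."""
    if event.kind == "price_drop":
        return f"Notify {event.customer_id}: price reduced on {event.product}."
    if event.kind == "back_in_stock":
        return f"Notify {event.customer_id}: {event.product} is back in stock."
    if event.kind == "delivery_delayed":
        # Acknowledge the problem and offer an incentive on the next order.
        return f"Apologize to {event.customer_id} and attach a discount code."
    return "No action."

if __name__ == "__main__":
    events = [
        CustomerEvent("c-101", "price_drop", product="running shoes"),
        CustomerEvent("c-102", "delivery_delayed"),
    ]
    for e in events:
        print(choose_action(e))
```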

Artificial Empathy and Human Interaction

Today’s digital consumers are always connected. This presents an opportunity to create exceptional experiences while maintaining a strong connection with consumers. Many research labs are developing software to understand and respond to both what humans say and how they feel.

The applications of artificial empathy are wide-ranging, spanning from market research to transportation, advertising, and customer service.

Humana Pharmacy, for instance, utilized a compassionate AI system to assist its call center teams in efficiently managing customer interactions through emotion analysis.

The system interprets customer emotions by analyzing behavioral patterns such as pauses, changes in speech speed, and tone.

The analysis is communicated to the teams through messages like “speaking quickly” or “build rapport with the customer.” Such instances of empathetic AI are expected to increase in the future.
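The pacing signals described above (pauses, changes in speech speed) can be derived from word-level timestamps produced by a speech-to-text system. The sketch below shows that idea only in outline; the input format, thresholds, and hint wording are assumptions for illustration, not Humana’s or any vendor’s actual system.

```python
# Minimal sketch of call pacing analysis: estimate speaking rate and long
# pauses from word-level timestamps and emit a coaching hint. The input
# format, thresholds, and hint wording are illustrative assumptions.
from typing import List, Tuple

# Each entry: (word, start_seconds, end_seconds), e.g. from a speech-to-text API.
Word = Tuple[str, float, float]

def coaching_hint(words: List[Word], fast_wpm: float = 170.0,
                  long_pause_s: float = 1.5) -> str:
    if len(words) < 2:
        return "not enough speech to analyze"
    duration_min = (words[-1][2] - words[0][1]) / 60.0
    wpm = len(words) / max(duration_min, 1e-6)          # words per minute
    pauses = [b[1] - a[2] for a, b in zip(words, words[1:])]
    long_pauses = sum(1 for p in pauses if p > long_pause_s)
    if wpm > fast_wpm:
        return "speaking quickly"
    if long_pauses >= 3:
        return "long silences detected; build rapport with the customer"
    return "pacing looks fine"

if __name__ == "__main__":
    demo = [("hello", 0.0, 0.2), ("thanks", 0.25, 0.5), ("for", 0.52, 0.6),
            ("calling", 0.62, 0.9), ("how", 0.95, 1.05), ("can", 1.07, 1.15),
            ("I", 1.17, 1.2), ("help", 1.22, 1.4)]
    print(coaching_hint(demo))  # prints "speaking quickly" for this dense snippet
```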

Artificial empathy is valuable for marketers in understanding how customers emotionally connect with the brand. Insights can be used to refine content and messaging to optimize campaign performance.

Machine learning algorithms, when combined with consumer behavior, can provide recommendations for enhancing campaign performance.

These algorithms can be used to improve demand forecasting, assess price sensitivity among target segments, and provide insights on purchasing behavior.

However, while artificial empathy can help businesses create more effective interactions, it cannot replace human interaction. The key factor that makes AI effective is human understanding, contextual awareness, subtleties, and creativity.

Businesses must identify suitable applications of artificial empathy and strategically integrate its use into the services provided to customers. The combination of human touch and machine intelligence can drive better returns on investment for targeted campaigns.

The impact on marketing:

Marketers need to utilize artificial empathy to create campaigns that are personalized rather than mass-targeted. This approach can help understand business needs and leverage data in a simplified manner.

Campaigns can be tailored to provide valuable content to customers after understanding their pain points and challenges.

In the evolving market landscape and amidst constant disruptions, brands must demonstrate empathy. Those that fail to understand the consumer’s situation may struggle to communicate in an appropriate tone and risk reinforcing negative perceptions of their brand.

A comprehensive survey conducted by Dassault Systèmes with independent research firm CITE revealed that younger consumers prefer personalization that enhances product experience or quality of life. They are also willing to pay more and share their data to receive it.

Managing large volumes of unstructured data can be challenging. However, this approach enables marketing teams to react appropriately with relative ease. It can also be used to compare product attributes.

Features and characteristics that resonate with the target audience can be introduced or enhanced. Additionally, it can automatically distinguish between emotions and attitudes, categorizing them as positive, negative, or neutral using machine learning and natural language processing.
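The positive/negative/neutral categorization mentioned above is, at its core, a text classification task. The sketch below shows the shape of such a classifier using scikit-learn on a tiny invented dataset; a production system would train on large volumes of labeled customer feedback and likely use stronger language models.

```python
# Minimal sketch of three-class sentiment categorization with scikit-learn.
# The training examples are invented toy data; they only illustrate the
# shape of the approach, not a production-quality model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "I love this product, it works perfectly",
    "Fantastic support, very helpful team",
    "Terrible experience, the package arrived broken",
    "Awful quality, I want a refund",
    "The item arrived on Tuesday",
    "It is a phone with a 6-inch screen",
]
train_labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
model.fit(train_texts, train_labels)

for text in ["Great product, exactly what I wanted", "Delivery was late again"]:
    print(text, "->", model.predict([text])[0])
```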

A world where technology adapts to the user is not a distant dream. Digital adoption is already becoming a crucial part of enterprise digital transformation, enabling chief information officers and business leaders to address adoption gaps in real time.

As we move towards a post-pandemic future where distributed workforces become a business reality, the need for empathetic technology will only increase.

However, as our world becomes more digitized, there is a clear need to ensure that it remains inherently human.

In machine learning, understanding the reasons behind a model’s decisions is often as crucial as the accuracy of those decisions.

For example, a machine-learning model might accurately predict that a skin lesion is cancerous, but it could have made that prediction using an unrelated blip in a clinical photo.

While tools exist to aid experts in understanding a model’s reasoning, these methods often offer insights on one decision at a time, requiring manual evaluation for each.

Models are typically trained using millions of data inputs, making it nearly impossible for a human to evaluate enough decisions to identify patterns.

Now, researchers at MIT and IBM Research have developed a method that allows a user to aggregate, organize, and rank these individual explanations to quickly analyze a machine-learning model’s behavior.

Their technique, known as Shared Interest, includes quantifiable metrics that compare how well a model’s reasoning aligns with that of a human.

Shared Interest could assist a user in easily identifying concerning patterns in a model’s decision-making; for instance, it could reveal that the model often becomes confused by irrelevant features such as background objects in photos.

By aggregating these insights, the user could quickly and quantitatively assess whether a model is reliable and ready to be deployed in real-world scenarios.

“In developing Shared Interest, our aim is to scale up this analysis process so that you can understand your model’s behavior on a broader scale,” says lead author Angie Boggust, a graduate student in the Visualization Group of the Computer Science and Artificial Intelligence Laboratory.

Boggust collaborated with her mentor Arvind Satyanarayan, a computer science assistant professor leading the Visualization Group at MIT, along with Benjamin Hoover and senior author Hendrik Strobelt from IBM Research. Their paper is scheduled for presentation at the Conference on Human Factors in Computing Systems.

Boggust initiated this project during a summer internship at IBM under Strobelt’s guidance. Upon returning to MIT, Boggust and Satyanarayan further developed the project and continued collaborating with Strobelt and Hoover, who aided in implementing case studies demonstrating the practical application of the technique.

The Shared Interest method utilizes popular techniques that reveal how a machine-learning model arrived at a specific decision, known as saliency methods. When classifying images, saliency methods identify important areas of an image that influenced the model’s decision. These areas are visualized as a heatmap, termed a saliency map, often superimposed on the original image. For instance, if the model classified an image as a dog and highlighted the dog’s head, it signifies the significance of those pixels to the model’s decision.

Shared Interest operates by comparing saliency methods with ground-truth data. In an image dataset, ground-truth data typically consists of human-generated annotations, such as a box outlining the relevant parts of each image. In the dog example above, the box would encompass the entire dog in the photo.

When evaluating an image classification model, Shared Interest compares the model-generated saliency data and the human-generated ground-truth data for the same image to assess their alignment.

The technique employs various metrics to measure this alignment or misalignment and then categorizes a specific decision into one of eight categories.

These categories range from perfectly human-aligned (the model makes a correct prediction and the highlighted area in the saliency map matches the human-generated box) to completely distracted (the model makes an incorrect prediction and does not utilize any image features found in the human-generated box).

“On one end of the spectrum, your model made the decision for the exact same reason a human did, and on the other end of the spectrum, your model and the human are making this decision for totally different reasons. By quantifying that for all the images in your dataset, you can use that quantification to sort through them,” Boggust explains.
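As a rough illustration of this kind of scoring, the sketch below compares a thresholded saliency map against a human-annotated ground-truth mask using two simple scores, how much of the annotated region the model used and how much of the model’s evidence falls inside it, and then buckets the result coarsely. The two scores and the three buckets are illustrative stand-ins for the metrics and eight categories defined in the Shared Interest paper.

```python
# Rough sketch of saliency-vs-ground-truth alignment scoring in the spirit of
# Shared Interest. The two scores and the coarse bucketing are illustrative
# stand-ins for the paper's actual metrics and eight categories.
import numpy as np

def alignment_scores(saliency: np.ndarray, ground_truth: np.ndarray,
                     threshold: float = 0.5):
    """saliency: float map in [0, 1]; ground_truth: boolean mask (human annotation)."""
    salient = saliency >= threshold
    overlap = np.logical_and(salient, ground_truth).sum()
    coverage = overlap / max(ground_truth.sum(), 1)  # share of the human region the model used
    precision = overlap / max(salient.sum(), 1)      # share of the model's evidence inside that region
    return coverage, precision

def bucket(correct: bool, coverage: float, precision: float) -> str:
    if correct and coverage > 0.9 and precision > 0.9:
        return "human-aligned"
    if not correct and coverage < 0.1:
        return "distracted (relies on features outside the annotation)"
    return "partially aligned"

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    saliency = rng.random((8, 8))                    # stand-in for a model's saliency map
    ground_truth = np.zeros((8, 8), dtype=bool)
    ground_truth[2:6, 2:6] = True                    # e.g. a box around the dog
    cov, prec = alignment_scores(saliency, ground_truth)
    print(f"coverage={cov:.2f} precision={prec:.2f} ->", bucket(True, cov, prec))
```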

The technique operates similarly with text-based data, where key words are emphasized instead of image regions.

The researchers demonstrated the utility of Shared Interest through three case studies for both nonexperts and machine-learning researchers.

In the first case study, they utilized Shared Interest to assist a dermatologist in evaluating whether to trust a machine-learning model designed for diagnosing cancer from photos of skin lesions. Shared Interest allowed the dermatologist to promptly review instances of the model’s accurate and inaccurate predictions.

Ultimately, the dermatologist decided not to trust the model due to its numerous predictions based on image artifacts rather than actual lesions.

“The value here is that using Shared Interest, we are able to see these patterns emerge in our model’s behavior. In about half an hour, the dermatologist was able to make a confident decision of whether or not to trust the model and whether or not to deploy it,” Boggust says.

In the second case study, they collaborated with a machine-learning researcher to demonstrate how Shared Interest can evaluate a specific saliency method by uncovering previously unknown pitfalls in the model.

Their technique enabled the researchers to analyze thousands of correct and incorrect decisions in a fraction of the time typically required by manual methods.

In the third case study, they applied Shared Interest to further explore a specific image classification example. By manipulating the ground-truth area of the image, they conducted a what-if analysis to identify the most important image features for particular predictions.

The researchers were impressed by the performance of Shared Interest in these case studies, but Boggust warns that the technique is only as effective as the saliency methods it is based on. If those techniques exhibit bias or inaccuracy, then Shared Interest will inherit those limitations.

In the future, the researchers aim to apply Shared Interest to various types of data, particularly tabular data used in medical records. They also seek to utilize Shared Interest to enhance existing saliency techniques.

Boggust hopes this research will inspire further work that aims to quantify machine-learning model behavior in ways that are understandable to humans.

Humans perceive objects and their spatial relationships when observing a scene. For example, on a desk, there might be a laptop positioned to the left of a phone, which is situated in front of a computer monitor.

Many deep learning models struggle to understand the interconnected relationships between individual objects when perceiving the world.

A robot designed to assist in a kitchen could face challenges in following commands involving specific object relationships, such as “pick up the spatula to the left of the stove and place it on top of the cutting board.”

MIT researchers have created a model that comprehends the underlying relationships between objects in a scene. The model represents individual relationships one by one and then integrates these representations to describe the entire scene.

This capability allows the model to produce more accurate images from textual descriptions, even in scenarios with multiple objects arranged in various relationships with each other.

This work could be useful in scenarios where industrial robots need to execute complex, multi-step manipulation tasks, such as stacking items in a warehouse or assembling appliances.

Furthermore, this advancement brings the field closer to enabling machines to learn from and interact with their surroundings in a manner more akin to humans.

According to Yilun Du, a PhD student at the Computer Science and Artificial Intelligence Laboratory (CSAIL) and co-lead author of the paper, “When I look at a table, I can’t say that there is an object at XYZ location. Our minds don’t work like that. In our minds, when we understand a scene, we really understand it based on the relationships between the objects.”

The framework developed by the researchers can generate an image of a scene based on a textual description of objects and their relationships, such as “A wood table to the left of a blue stool. A red couch to the right of a blue stool.”

The researchers utilized an energy-based model to represent the individual object relationships in a scene description, enabling them to encode each relational description and then combine them to infer all objects and relationships.

By breaking the sentences down into shorter pieces for each relationship, the system can recombine them in various ways, enhancing its adaptability to scene descriptions it hasn’t encountered before, as explained by Li.
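The composition idea can be illustrated with a toy example: give each relation its own energy function over object positions, sum the energies for all relations in the description, and search for a layout that makes the total small. The sketch below does this for 2D object coordinates with a simple random search; it is only an illustration of composing per-relation energies, not the paper’s actual image-space model or training procedure.

```python
# Toy illustration of composing per-relation energy functions: each relation
# contributes an energy over 2D object positions, energies are summed, and a
# layout is found by greedy random search. Values and relations are invented.
import numpy as np

MARGIN = 1.0  # desired horizontal separation (arbitrary toy value)

def left_of(pos, a, b):
    """Low energy when object a sits at least MARGIN to the left of object b."""
    return max(0.0, pos[a][0] - pos[b][0] + MARGIN) ** 2

def right_of(pos, a, b):
    return left_of(pos, b, a)

# "A table to the left of a stool. A couch to the right of the stool."
relations = [(left_of, "table", "stool"), (right_of, "couch", "stool")]

def total_energy(pos):
    return sum(rel(pos, a, b) for rel, a, b in relations)

rng = np.random.default_rng(0)
pos = {name: rng.uniform(-1, 1, size=2) for name in ["table", "stool", "couch"]}

# Greedy random search: accept proposed moves that do not increase the energy.
for _ in range(2000):
    name = rng.choice(list(pos))
    proposal = {k: v.copy() for k, v in pos.items()}
    proposal[name] += rng.normal(scale=0.1, size=2)
    if total_energy(proposal) <= total_energy(pos):
        pos = proposal

print({k: np.round(v, 2) for k, v in pos.items()}, "energy:", round(total_energy(pos), 4))
```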

The system can also work in reverse, identifying text descriptions that match the relationships between objects in an image. Additionally, the model can be utilized to modify an image by rearranging the objects to match a new description.

The researchers compared their model to other deep learning methods tasked with generating images based on text descriptions of objects and their relationships, and their model consistently outperformed the baselines.

In the most complex examples, where descriptions contained three relationships, 91 percent of participants found that the new model performed better when evaluating whether the generated images matched the original scene description, according to the researchers.

One intriguing discovery, according to Du, was that the model can handle an increasing number of relation descriptions in a sentence, from one to two, three, or even four, and still successfully generate images that match those descriptions, unlike other methods.

The researchers also demonstrated the model’s ability to identify the best-matching text description for scenes it had not previously encountered, along with different text descriptions for each image.

When given two relational scene descriptions that described the same image in different ways, the model was able to recognize their equivalence.

The researchers were particularly impressed by the resilience of their model, especially when dealing with unfamiliar descriptions.

“This is very promising because it aligns closely with human cognition. Humans can derive valuable information from just a few examples and combine them to create countless variations. Our model possesses this property, enabling it to learn from limited data and generalize to more complex scenes and image generations.”

While these initial findings are promising, the researchers aim to assess how their model performs on real-world images featuring complex elements such as noisy backgrounds and obstructed objects.

Additionally, they are keen on integrating their model into robotics systems to enable robots to deduce object relationships from videos and apply this knowledge to manipulate objects in the environment.

“Developing visual representations capable of handling the compositional nature of the surrounding world is one of the fundamental challenges in computer vision. This paper makes significant strides in proposing an energy-based model that explicitly represents multiple relations among depicted objects in an image. The outcomes are truly remarkable,” said Josef Sivic, a distinguished researcher at the Czech Institute of Informatics, Robotics, and Cybernetics at Czech Technical University, who was not involved in this research.
