{"id":22739,"date":"2026-05-12T11:02:08","date_gmt":"2026-05-12T11:02:08","guid":{"rendered":"https:\/\/ideainthebox.com\/index.php\/2026\/05\/12\/beyond-verification-what-responsible-ai-really-demands-of-human-experts\/"},"modified":"2026-05-12T11:02:08","modified_gmt":"2026-05-12T11:02:08","slug":"beyond-verification-what-responsible-ai-really-demands-of-human-experts","status":"publish","type":"post","link":"https:\/\/ideainthebox.com\/index.php\/2026\/05\/12\/beyond-verification-what-responsible-ai-really-demands-of-human-experts\/","title":{"rendered":"Beyond Verification \u2014 What Responsible AI Really Demands of Human Experts"},"content":{"rendered":"<div>\n<figure class=\"article-inline\">\n<img class=\"lazyload\" decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==\" data-orig-src=\"https:\/\/sloanreview.mit.edu\/wp-content\/uploads\/2026\/04\/BCG-RAI_2026_ExpertPanel01-1290x860-2.jpg\" alt=\"\"><br \/>\n<\/figure>\n<p>For the fifth year in a row, <cite>MIT Sloan Management Review<\/cite> and Boston Consulting Group (BCG) have assembled an international panel of AI experts that includes academics and practitioners to help us understand how responsible artificial intelligence is being implemented across organizations worldwide. In our first post this year, we explored how organizations should think about AI\u2019s impact on the workforce, with our experts stressing that responsible AI means looking beyond the safety of AI systems to address real-world consequences for workers and economic stability. <\/p>\n<p>This time, we asked our panel to react to the following provocation: <em>Responsible AI efforts fail if they don\u2019t cultivate human experts who can verify AI solutions<\/em>. On the surface, there is broad consensus, with a clear majority (84%) of our panelists agreeing or strongly agreeing with the statement. 
But a deeper dive reveals that panelists define <em>verification<\/em> far more expansively than the provocation implies. Rather than treating it as a narrow, output-by-output check, they describe verification as the work of applying human judgment across an AI system\u2019s life cycle: interpreting context, designing tests, auditing workflows, setting thresholds, weighing when AI should not be relied on at all, and carrying the accountability that machines cannot. Understood this way, verification is not a final checkpoint but the connective tissue of responsible AI, encompassing the design, oversight, and accountability that organizations need to scale alongside the systems themselves. Below, we share panelist insights and offer our practical recommendations for organizations seeking to cultivate the human expertise their responsible AI governance efforts depend on.<\/p>\n<div class=\"callout-highlight callout-highlight--transparent\">\n<aside class=\"l-content-wrap\">\n<article>\n<h4>If human experts cannot verify AI solutions, RAI efforts have failed.<\/h4>\n<p class=\"caption mb30\">Eighty-four percent of panelists agree or strongly agree that RAI efforts have failed if they do not cultivate human experts who can verify AI solutions.<\/p>\n<p><img class=\"lazyload\" decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==\" data-orig-src=\"https:\/\/sloanreview.mit.edu\/wp-content\/uploads\/2026\/04\/RAI2026-HumanExperts-Article2.png\" alt=\"Bar Chart: Strongly disagree: 6%; Disagree: 6%; Neither agree nor disagree: 3%; Agree: 42%; Strongly agree: 42%\"><\/p>\n<p class=\"attribution\">Source: Panel of 31 experts in artificial intelligence strategy.<\/p>\n<\/article>\n<\/aside>\n<\/div>\n<p><strong>Humans provide the context for verifying AI outputs.<\/strong> ForHumanity founder Ryan Carrier backed the consensus that responsible AI efforts must cultivate human expertise to verify AI outputs because, as he puts it, 
\u201ccontext matters.\u201d Similarly, T\u00dcV AI.Lab CEO Franziska Weindauer notes, \u201cAI solutions operate within complex real-world contexts, and human experts are essential to interpret results, detect failures, and ensure that systems function as intended.\u201d As GovLab chief research and development officer Stefaan Verhulst explains, \u201cMany of the most significant risks of AI are societal rather than technical, such as misalignment with public values, harmful impacts on vulnerable groups, or inappropriate deployment contexts.\u201d Those risks, many experts contend, are precisely the ones hardest to address with a wholly technical solution. <\/p>\n<p>For some, context is irreducibly human and cannot be captured in machine-readable form alone. As OdiseIA president Idoia Salazar explains, \u201cNot everything is translated into data, such as context in a specific situation.\u201d Yasodara Cordova, a distinguished member of the Co-Develop fund\u2019s investments committee, agrees that responsible AI requires \u201ccontextual sensitivity\u201d \u2014 a quality that, in her view, \u201ccannot be automated.\u201d Jai Ganesh, Ph.D., vice president of technology, connected services, engineering, at Wipro Ltd., adds, \u201cSituational awareness is another area of concern for AI systems where an output that is correct may be culturally insensitive or legally problematic in a specific country or situation.\u201d Automation Anywhere\u2019s Yan Chow similarly observes that \u201chumans identify sociopolitical nuances and shifts that data cannot capture.\u201d For these reasons, National University of Singapore provost Simon Chesterman concludes that \u201chowever sophisticated the model or elaborate the governance framework, someone must still be capable of asking whether a system is reliable, lawful, and appropriate in context\u201d \u2014 a responsibility that, in his view, requires human expertise. 
<\/p>\n<p>If context cannot be fully captured by machines, the practical consequences are significant. Carrier argues that \u201cdomain experts are necessary to provide feedback and risk assessments that result in well-tailored controls, treatments, and mitigations designed to tackle the specific and unique risks presented by context-dependent AI deployment and usage.\u201d Salazar goes further, contending that \u201cno matter how advanced a tool is, it cannot be the one to guarantee that its outputs are fair, safe, or appropriate to the context.\u201d For Ganesh, the risks are heightened with \u201cedge cases, rare scenarios, and new contexts where AI systems tend to break down,\u201d and he believes \u201ccatching these failures requires human judgment and deep domain expertise.\u201d Chow agrees that human expertise is critical for building \u201cexpert-validated guardrails for the edge cases where AI is most fragile.\u201d Moreover, he argues that \u201cresponsible AI frameworks collapse into compliance theater without human experts because AI cannot perceive dynamic context.\u201d<\/p>\n<p><strong>Losing human expertise poses an existential threat to organizations.<\/strong> The concern is not only that AI systems will fail without human expertise to verify outcomes but that organizations may lose human expert capacity over time. 
Cordova argues that \u201corganizations that delegate verification only to AI erode the institutional capacity to audit it as expertise atrophies and junior staff never develop independence.\u201d Likewise, consultant Linda Leopold cautions, \u201cIf we always let AI do the work for us, we gradually lose the expertise needed to oversee it,\u201d and \u201cwe need to keep human judgment sharp enough to challenge it.\u201d EnBW chief data officer Rainer Hoffmann says, \u201cResponsible AI efforts fail not because humans cannot verify every AI decision but because organizations lack the expertise to govern how AI systems should be evaluated, monitored, and deployed responsibly.\u201d <\/p>\n<p>The business stakes, through this lens, are fundamentally human. As Australian National University\u2019s Belona Sonna contends, \u201cThe core objective of responsible AI is not only to design systems that align with ethical principles but also to ensure that humans remain capable of intervening when misalignment occurs.\u201d Put differently, Salazar says that responsible AI \u201cneeds people who are prepared not to delegate to machines what remains a fundamentally human responsibility.\u201d Without this capacity, the question of whether responsible AI requires human verification of AI outputs becomes moot \u2014 because no one is left with the expertise to do it.<\/p>\n<p><strong>Human verification alone does not scale.<\/strong> Despite broad support for the importance of cultivating human expertise, many experts cite concerns about the scale and scope of human verification. Wharton School professor Kartik Hosanagar explains: \u201cThere are many settings where it\u2019s helpful to have human verification. 
But there are many others where human verification is infeasible because of the scale of verification needed.\u201d Hoffmann agrees that for \u201capplications that process large volumes of data or detect patterns beyond human capability, output-by-output human verification is neither feasible nor meaningful.\u201d For some experts, requiring human verification to scale in this way would undermine the entire value proposition of using AI in the first place. As \u00d6yk\u00fc I\u015fik puts it, \u201cthe core value of AI lies in its speed and scale,\u201d such that \u201crequiring human verification for every output would effectively neutralize these efficiency gains.\u201d<\/p>\n<p>The solution, for these experts, is not to abandon human judgment but to deploy it more strategically. Philip Dawson, head of AI policy at Armilla AI, believes that \u201cas AI systems grow in complexity and deployment velocity, human-only verification becomes a structural bottleneck\u201d and requires a different approach. 
Citing cybersecurity as an example, I\u015fik contends that a system needs the ability to identify when human intervention is needed \u201cwhile relying on automated decision-making for the bulk of the workload to avoid massive operational bottlenecks\u201d and argues that \u201cthe most successful responsible AI efforts treat human expertise and automated tools as a combined system.\u201d Alyssa Lefaivre \u0160kopac, director of trust and safety at Alberta Machine Intelligence Institute, advocates for a \u201cdefense-in-depth approach that spans everything from front-line users who can meaningfully question an output to the professionals building the assurance ecosystem around these systems.\u201d Dawson similarly contends that \u201cthe field must invest in automated evaluation frameworks and agentic assurance pipelines that extend, not replace, human judgment at scale.\u201d<\/p>\n<p><strong>Oversight and accountability remain paramount.<\/strong> In addition to relying on a combination of human and machine verification, our experts believe that oversight and accountability remain paramount to any responsible AI strategy. Chesterman argues that \u201cverification should not be understood too narrowly.\u201d He adds, \u201cIn some settings, human experts will directly validate outputs; in others, they will design tests, audit workflows, set thresholds for acceptable use, or decide when AI should not be relied upon at all.\u201d In other words, as Chow puts it, \u201cHuman expertise is a design-time necessity, not just a run-time check.\u201d Former DBS Bank chief analytics officer Sameer Gupta agrees that \u201cgovernance and oversight should be embedded into every stage of an AI solution\u2019s design and deployment rather than treated as a final checkpoint on the outputs alone.\u201d<\/p>\n<p>Many experts argue that human verification of AI outputs is essential not as an end but as a core part of meaningful oversight and accountability over AI systems. 
IAG chief AI scientist Ben Dias explains that as \u201ca technological construct \u2026 AI systems lack the agency to be held legally or ethically accountable for the consequences of their actions.\u201d For this reason, Dias says, \u201cevery AI solution needs an accountable human who is responsible for ensuring that the system\u2019s outputs are properly understood and verified.\u201d ADP\u2019s chief product officer Naomi Lariviere agrees, saying, \u201cAI systems can generate recommendations and automate decisions, but they can\u2019t carry accountability.\u201d Mike Linksvayer, vice president of developer policy at GitHub, argues that \u201cas systems become more agentic, the limiting factor is no longer the ability to check individual outputs but the ability to exercise informed judgment over goals, constraints, escalation paths, and responsibility.\u201d<\/p>\n<h3>Recommendations<\/h3>\n<p>If the limiting factor is the ability to exercise informed judgment, not just check AI outputs, then organizations need to invest in that judgment deliberately. We offer the following recommendations for organizations looking to cultivate human expertise that scales with their AI ambitions:<\/p>\n<p><strong>1. Verify designs, not just outputs.<\/strong> A narrow view of human verification that only addresses system outputs is insufficient. Human verification, in the broader sense of human oversight, should be embedded at every stage of an AI solution\u2019s design and deployment, not treated as a final checkpoint. This means human experts setting thresholds, designing tests, auditing workflows, and deciding when AI should not be relied on, not just reviewing individual outputs after the fact.<\/p>\n<p><strong>2. 
Don\u2019t rely on human verification alone.<\/strong> Because human verification of every AI output doesn\u2019t scale, organizations committed to responsible oversight should invest in a variety of approaches that use automated tools to extend or augment human judgment. Human verification should be emphasized where human judgment is essential, including edge cases, high-stakes decisions, and novel contexts, while automated tools can handle the remaining volume of tasks. The goal is a combined system that extends human judgment at scale rather than either replacing or being bottlenecked by it.<\/p>\n<p><strong>3. Invest in human expertise.<\/strong> Organizations should invest in human expertise to verify the outputs of AI systems and provide ongoing oversight over how systems are designed and whether they are working as intended. In fact, as technical capabilities grow, the need for human expertise only increases. If junior staff never develop independent judgment and senior employees\u2019 expertise atrophies because they are not part of this process, the organization risks losing its ability to govern AI systems. This may mean maintaining human involvement in processes or tasks that build expertise and judgment, even when they could be automated with AI. In these cases, the efficiency gains that are forgone should be viewed as strategic investments in the future.<\/p>\n<p><strong>4. Verify what is learned, not just what is produced.<\/strong> Organizations tend to focus verification on whether an AI system\u2019s outputs are correct, but they also need to scrutinize the lessons they draw from AI deployments and outcomes. When teams interpret pilot results, measure performance gains, or decide what worked and what didn\u2019t, those conclusions become the foundation for future investments, scaling decisions, and organizational narratives about AI\u2019s value. 
If those lessons are flawed (the wrong metrics were tracked, edge cases were ignored, or success was declared prematurely), organizations risk perpetuating bad assumptions at increasing scale. Human experts should be involved not only in verifying what AI systems produce but in critically evaluating what the organization believes it has learned from deploying them.<\/p>\n<p><strong>5. Treat verification as a strategic imperative, not just a responsibility practice.<\/strong> According to a global executive survey conducted in 2025 by <cite>MIT Sloan Management Review<\/cite> and BCG, 86% of top management teams consider AI to be a significant part of their strategic priorities. When AI is central to how an organization competes, grows, and makes decisions, the quality of human oversight directly affects strategic outcomes, not just ethical ones. Flawed outputs, unchecked deployments, and poorly drawn lessons don\u2019t just create responsibility risks; they lead to misallocated resources, failed initiatives, eroded competitive position, and lost customer trust. The preceding recommendations \u2014 verifying designs, combining human and automated oversight, investing in expertise, and scrutinizing what is learned \u2014 are not merely aspirational additions to a responsible AI program. 
They are preconditions for effective strategic management.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>For the fifth year in a row, MIT Sloan Management  [&#8230;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"content-type":"","footnotes":""},"categories":[194],"tags":[],"class_list":["post-22739","post","type-post","status-publish","format-standard","hentry","category-graphic-design"],"acf":[],"_links":{"self":[{"href":"https:\/\/ideainthebox.com\/index.php\/wp-json\/wp\/v2\/posts\/22739","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ideainthebox.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ideainthebox.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ideainthebox.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/ideainthebox.com\/index.php\/wp-json\/wp\/v2\/comments?post=22739"}],"version-history":[{"count":0,"href":"https:\/\/ideainthebox.com\/index.php\/wp-json\/wp\/v2\/posts\/22739\/revisions"}],"wp:attachment":[{"href":"https:\/\/ideainthebox.com\/index.php\/wp-json\/wp\/v2\/media?parent=22739"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ideainthebox.com\/index.php\/wp-json\/wp\/v2\/categories?post=22739"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ideainthebox.com\/index.php\/wp-json\/wp\/v2\/tags?post=22739"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}