AI Is Good at Picking Qualified Suppliers. It Still Struggles to Pick the Best One.

Companies are rapidly integrating generative AI into procurement and sourcing decisions. The promise is obvious. AI can read thousands of pages of supplier proposals faster than any human team, summarize technical requirements in seconds, and create the appearance of consistency and objectivity in evaluation. But there is an important distinction managers are starting to overlook.

The same AI system that performs extremely well at identifying whether a supplier meets minimum requirements may perform much less reliably when judging which supplier is truly better.

Recent research published in the Journal of Business Logistics examined this issue by comparing how large language models evaluated supplier bids against evaluations completed by experienced procurement professionals. The study analyzed 123 supplier proposals tied to 31 public procurement projects conducted by the State of Ohio between 2023 and 2024. The projects involved complex IT services contracts, many containing large, text-heavy bid packages requiring evaluative judgment rather than simple arithmetic comparisons.

The researchers tested three reasoning-oriented AI models: OpenAI o3, Grok-3-Mini, and DeepSeek R1. They then compared the models' evaluations against the scores assigned by human procurement evaluators. The findings revealed a surprisingly clear pattern.

AI performed well when evaluating compliance signals. These are signals tied to baseline qualifications and technical requirements. Does the supplier meet the required certifications? Do they satisfy the mandatory specifications? Did they include the required documentation? Are the implementation requirements addressed?

On these types of tasks, the AI models showed relatively high agreement with human evaluators and relatively stable scoring behavior across repeated evaluations. But the results changed once proposals shifted from compliance to differentiation.

When suppliers attempted to distinguish themselves through strategic capabilities, innovation claims, implementation approaches, past experience, or value-added propositions, AI scoring became far more volatile. The same proposal could receive meaningfully different evaluations across repeated AI runs even when the prompt and underlying content remained unchanged.

That volatility matters.

In procurement, the most important decisions often occur after baseline qualification has already been established. Most serious bidders can satisfy the minimum requirements. Competitive advantage comes from identifying which supplier will create superior long-term value, adapt better during uncertainty, collaborate more effectively, or reduce implementation risk in ways that are difficult to fully codify.

Those judgments require interpretation, contextual reasoning, and tradeoff assessment. Humans are imperfect at this too, but experienced procurement professionals rely on domain expertise and pattern recognition developed over years of evaluating suppliers and managing outcomes.

Large language models work differently. They generate probabilistic outputs based on statistical relationships in language rather than genuine understanding of supplier quality or operational fit. That distinction becomes especially important in ambiguous or strategically nuanced evaluations.

Many executives currently frame AI adoption as a replacement question: “Can AI evaluate suppliers as well as humans?” That is the wrong question. A better question is: “Which parts of supplier evaluation are structured enough for AI to handle reliably, and which parts still require human judgment?”

The answer emerging from the research suggests a hybrid approach.

In the first stage, AI can handle qualification screening. It can rapidly process proposals, verify compliance requirements, identify missing information, summarize technical content, and flag inconsistencies. This reduces administrative burden and allows procurement professionals to focus their attention where it matters most.

In the second stage, humans should take the lead in evaluating differentiation. This is where procurement teams assess strategic fit, implementation realism, innovation potential, relationship quality, operational flexibility, and long-term value creation. These judgments often hinge on subtle contextual cues that AI systems do not evaluate consistently.

One of the most interesting findings from the study is that AI volatility itself may become a useful management signal. When repeated AI evaluations produce highly inconsistent scores, managers should interpret that inconsistency as a warning sign rather than a nuisance. In many cases, volatility may indicate that the proposal contains ambiguous, subjective, or strategically complex content requiring deeper human review.
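For teams that want to try this idea, the volatility signal can be approximated with very little code. The sketch below is illustrative only and is not the method used in the study: it assumes each proposal has been scored several times by the same model on a 0 to 100 scale, measures the spread of those scores with a standard deviation, and flags any proposal whose spread exceeds a cutoff. The function name, the scores, and the threshold of 5 points are all hypothetical and would need to be calibrated against your own evaluations.

```python
from statistics import mean, pstdev


def triage_by_volatility(runs_by_proposal, threshold=5.0):
    """Flag proposals whose repeated AI scores disagree with each other.

    runs_by_proposal: dict mapping a proposal ID to a list of scores from
        repeated runs of the same model on the same prompt and content.
    threshold: hypothetical cutoff (in score points) above which the
        spread is treated as a cue for deeper human review.
    """
    report = {}
    for proposal_id, scores in runs_by_proposal.items():
        if len(scores) < 2:
            raise ValueError(f"{proposal_id}: need at least two runs to measure spread")
        spread = pstdev(scores)  # population standard deviation across runs
        report[proposal_id] = {
            "mean_score": mean(scores),
            "spread": spread,
            "needs_human_review": spread > threshold,
        }
    return report


# Hypothetical scores for illustration; these are not data from the study.
example_runs = {
    "supplier_a": [82, 84, 83, 81, 85],  # runs agree closely
    "supplier_b": [60, 88, 71, 93, 55],  # runs disagree widely
}

report = triage_by_volatility(example_runs)
for proposal_id, row in report.items():
    status = "human review" if row["needs_human_review"] else "stable"
    print(f"{proposal_id}: mean={row['mean_score']:.1f} spread={row['spread']:.1f} -> {status}")
```

In this made-up example, supplier_a's scores cluster tightly and pass through, while supplier_b's scores swing widely and are routed to a human evaluator. The average score alone would hide that difference, which is why the spread is reported alongside it.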

In other words, AI uncertainty may serve as a diagnostic tool for identifying where human expertise is most valuable. This has implications beyond procurement.

Many organizations are currently deploying generative AI into judgment-heavy workflows involving hiring, performance evaluations, contract review, lending decisions, and strategic analysis. In many of these contexts, AI may excel at standardized screening tasks while struggling with contextual differentiation and nuanced tradeoffs. Managers should resist the temptation to confuse speed with understanding.

The real opportunity is not eliminating humans from decision processes. It is reallocating human attention more effectively.

The organizations that benefit most from generative AI will likely be those that understand where automation creates leverage and where human expertise still creates advantage.

Supplier selection sits directly at that intersection.

Based on research published in the Journal of Business Logistics:

Finnegan A. McKinley, Anne E. Dohmen, and Vincent E. Castillo, “Do Humans and GAI See Eye to Eye? Implications of LLM Scoring Volatility in Supplier Evaluations,” Journal of Business Logistics, 2026, 47. https://doi.org/10.1111/jbl.70072.
